Upd:SuperMUC: Returned to user operation

LRZ aktuell publish at lrz.de
Fri Mar 15 09:14:27 CET 2013


 Changes to this message: Returned to user operation
Status update (March 15, 9:00) The system was returned to user
 operation yesterday evening at 21:30. We intend to publish a separate
 announcement on the details of the observed problems soon.
 
 Status update (March 13, 17:00) Unfortunately the data corruption
 problem has recurred. We have again suspended user operation.
 
 Status update (March 13, 16:15) IBM, in collaboration with DDN and
 Mellanox, has isolated and repaired two technical issues. One of these
 issues was located in the I/O subsystem and caused data corruption
 that GPFS detected but could not repair automatically, which made the
 file system unavailable. The data corruption is believed to have been
 limited to a single file, which was identified and removed from the
 GPFS file system. After successful internal test runs, the system has
 been returned to regular user operation.
 -----------------------------------------------------------------------
 Dear users of SuperMUC,
 
 Due to a problem with the GPFS services, access to the file systems WORK
 and SCRATCH is presently disrupted. IBM is attending to the problem,
 and we'll keep you up to date on the status via this document.
 
 Apologies for any delays in job processing.


 This information is also available on our web server
 http://www.lrz-muenchen.de/services/compute/supermuc/aktuell/ali4543/

 Reinhold Bader


