Upd:SuperMUC: Returned to user operation
LRZ aktuell
publish at lrz.de
Fri Mar 15 09:14:27 CET 2013
Changes to this message: Returned to user operation
Status update (March 15, 9:00) The system was returned to user
operation yesterday evening at 21:30. We intend to publish a separate
announcement on the details of the observed problems soon.
Status update (March 13, 17:00) Unfortunately the data corruption
problem has recurred. We have again suspended user operation.
Status update (March 13, 16:15) IBM, in collaboration with DDN and
Mellanox, has isolated and repaired two technical issues. One of these
issues was located in the I/O subsystem and caused data corruption
that GPFS detected but could not repair automatically, which made the
file system unavailable. The data corruption is believed to have been
limited to a single file, which was identified and removed from the
GPFS file system. After successful internal test runs, the system was
returned to regular user operation.
-----------------------------------------------------------------------
Dear users of SuperMUC,
Due to a problem with the GPFS services, access to the file systems
WORK and SCRATCH is presently disrupted. IBM is attending to the problem,
and we'll keep you up to date on the status via this document.
Apologies for any delays in job processing.
This information is also available on our web server
http://www.lrz-muenchen.de/services/compute/supermuc/aktuell/ali4543/
Reinhold Bader