Upd: SuperMUC: Progress in resolving I/O problems
Changes to this message:

Progress Update (Dec 20, 8:50)

The two main issues (and a couple of minor problems) that caused the I/O problems have been isolated and, hopefully, fixed; we intend to repeat the acceptance benchmark as soon as possible. Once we consider the fixes fully verified, we intend to return to regular user operation. Please watch this document for further updates.

-----------------------------------------------------------------------

Update: Please note that user access to the system will be blocked on Friday, Dec 14, at 18:00. LoadLeveler will be stopped on the morning of Dec 15.

-----------------------------------------------------------------------

Dear users of the SuperMUC petaflop system,

Strong performance variations and I/O hangs, which have also been observed by regular production programs on the system, caused the GPFS acceptance step performed on December 11 to fail. For this reason, LRZ has decided to turn over the complete system to IBM for an open-ended analysis phase beginning on Saturday, December 15, at 12:00.

LRZ is of the opinion that there is no alternative to this procedure if the cause of the observed problems is to be isolated and removed, and more stable user operation obtained in the long term.

We will inform you via this document once a date for returning to user operation has been fixed.

This information is also available on our web server:
http://www.lrz-muenchen.de/services/compute/supermuc/aktuell/ali4488/

Reinhold Bader