Update: Unscheduled maintenance on SuperMUC

LRZ aktuell publish at lrz.de
Mon Oct 1 17:13:56 CEST 2012


 Changes to this message: Update
3rd Update (Oct 1, 17:15)
 
 The cause of the GPFS failure on Saturday, Sep 29 has been identified
 as a software problem triggered by data migration to additionally
 installed disk space. Unfortunately, further investigation is still
 needed before a return to user operation is possible. We will provide
 an update via this URL once we know more.
 -----------------------------------------------------------------------
 Dear users of SuperMUC,
 
 the reason for the problems with the GPFS filesystem on SuperMUC seems
 to be one or more defective InfiniBand hardware components, which broke
 as a consequence of the power failure on Sunday, September 23rd.
 Unfortunately, the particular hardware components have not yet been
 identified and we cannot reliably estimate how long the search will
 take.
 
 IBM is still working on the problem.
 
 We apologize for the inconvenience caused by this failure. We will
 inform you through an update to this announcement as soon as the
 problem is fixed.
 


 This information is also available on our web server
 http://www.lrz-muenchen.de/services/compute/supermuc/aktuell/ali4424/

 Markus Mueller


