Update: Problems with InfiniBand and GPFS mostly resolved

LRZ aktuell publish at lrz.de
Sat Sep 29 09:55:16 CEST 2012


 Changes to this message: Situation has stabilized
Update
 
 The situation has stabilized for now, so job processing should work
 normally again. Some jobs that were already on the machine on Friday
 afternoon (28 September) may still be affected; please resubmit them.
 -----------------------------------------------------------------------
 Dear users of SuperMUC,
 
 the reason for the problems with the GPFS filesystem on SuperMUC seems
 to be one or more defective InfiniBand hardware components, which broke
 as a consequence of the power failure on Sunday, September 23rd.
 Unfortunately, the particular hardware components could not yet be
 identified and we cannot reliably estimate how long the search will
 take.
 
 IBM is still working on the problem.
 
 We apologize for the inconvenience caused by this failure. We will
 inform you through an update to this announcement as soon as the
 problem is fixed.
 


 This information is also available on our web server
 http://www.lrz-muenchen.de/services/compute/supermuc/aktuell/ali4424/

 Markus Mueller


