Upd:SuperMUC: scheduled maintenance Oct. 29 - 31

LRZ aktuell publish at lrz.de
Tue Oct 30 16:45:18 CET 2012


 Changes to this message: Return to user operation delayed
Update (Oct 30 16:45):
 
 Following a strong recommendation from Mellanox to install a firmware
 update on the interconnect switches, we have decided to postpone the
 return to user operation by one day (see the new target date below)
 rather than schedule a separate maintenance. The update is expected
 to further improve signal quality, which we hope will reduce the
 intermittent occurrence of shaky nodes that have caused job failures.
 
 -----------------------------------------------------------------------
 Between October 29, 8:00 and October 31, 18:00, the system will be
 unavailable for user operation. During this maintenance, configuration
 changes and software updates will be installed that should improve
 the stability of both Intel and IBM MPI operation.
 
 The following items are of particular interest:
 
   * IBM PE 1.2 PTF9. This update fixes a memory leak in the IBM MPI
     library as well as some problems with MPI-IO. Please remove any
     MP_MPILIB=pempi and/or MP_SHARED_MEMORY=no workaround settings from
     your LoadLeveler scripts, recompile, relink and rerun your
     application and check whether the problems are solved. Please
     provide feedback if this is not the case.
   * An update for the GPFS file system (although it is not expected
     that this update will solve all outstanding I/O problems).
   * Intel compilers: the default release moves from 12.1.5 to 12.1.7;
     13.0 Update is available as a non-default release.
   * Intel MKL update (from 10.3u9 to 10.3u13)
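
 To help locate the workaround settings mentioned under IBM PE 1.2 PTF9,
 the following is a minimal sketch of a check for a job script. The
 function name find_workarounds and the pattern are illustrative
 assumptions, not an LRZ tool; the pattern only covers sh-style
 assignments (NAME=value), and it also flags commented-out lines so that
 they can be reviewed by hand.

```python
import re

# The two workaround settings named in the announcement. The regex is an
# assumption about how they appear in a script (sh-style NAME=value).
WORKAROUND = re.compile(r"\b(MP_MPILIB\s*=\s*pempi|MP_SHARED_MEMORY\s*=\s*no)\b")


def find_workarounds(script_text):
    """Return (line_number, stripped_line) pairs that contain a workaround setting."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        if WORKAROUND.search(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python check_workarounds.py job1.cmd job2.cmd ...
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, line in find_workarounds(handle.read()):
                print(f"{path}:{number}: {line}")
```

 Any reported line is a candidate for removal before recompiling,
 relinking and rerunning the application.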
 


 This information is also available on our web server
 http://www.lrz-muenchen.de/services/compute/supermuc/aktuell/ali4438/

 Reinhold Bader


