Upd:SuperMUC: scheduled maintenance Oct. 29 - 31
LRZ aktuell
publish at lrz.de
Tue Oct 30 16:45:18 CET 2012
Changes to this message: Return to user operation delayed
Update (Oct 30 16:45):
Based on a strong recommendation from Mellanox to install a firmware
update on the interconnect switches, we have decided to postpone user
operation by one day (see the new target date below) rather than
schedule a new maintenance. This update should further improve signal
quality and thereby, we hope, reduce the intermittent occurrence of
shaky nodes that have caused job failures.
-----------------------------------------------------------------------
Between October 29, 8:00 and October 31, 18:00, the system will be
unavailable for user operation. During this maintenance, configuration
changes and software updates will be installed that should improve the
stability of both Intel and IBM MPI operation.
The following items are of particular interest:
* IBM PE 1.2 PTF9. This update fixes a memory leak in the IBM MPI
  library as well as some problems with MPI-IO. Please remove any
  MP_MPILIB=pempi and/or MP_SHARED_MEMORY=no workaround settings from
  your LoadLeveler scripts, recompile, relink and rerun your
  application and check whether the problems are solved. Please
  provide feedback if this is not the case.
* An update for the GPFS file system (although it is not expected
  that this update will solve all outstanding I/O problems).
* Intel compilers (from 12.1.5 to 12.1.7 for the default release, and
  13.0 Update is available as a non-default release)
* Intel MKL update (from 10.3u9 to 10.3u13)
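
For the IBM PE item above, the workaround settings first have to be
located in existing job scripts. The following is only a minimal sketch
in Python, not an LRZ-provided tool: the function names and the "*.ll"
file pattern are assumptions made for illustration, so adjust the
pattern to however your LoadLeveler scripts are actually named. It
reports matching lines and does not modify any file.

```python
import re
from pathlib import Path

# The two workaround settings named in the announcement.
# Whitespace around "=" is tolerated; "MP_SHARED_MEMORY=yes" does not match.
WORKAROUND = re.compile(r"\b(MP_MPILIB\s*=\s*pempi|MP_SHARED_MEMORY\s*=\s*no)\b")


def find_workarounds(text):
    """Return (line_number, line) pairs that still contain a workaround setting."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), 1)
        if WORKAROUND.search(line)
    ]


def scan_scripts(directory, pattern="*.ll"):
    """Scan job scripts in a directory; the "*.ll" pattern is an assumption."""
    report = {}
    for path in sorted(Path(directory).glob(pattern)):
        hits = find_workarounds(path.read_text(errors="replace"))
        if hits:
            report[str(path)] = hits
    return report


if __name__ == "__main__":
    # Print every remaining workaround line as script:line_number: content.
    for script, hits in scan_scripts(".").items():
        for number, line in hits:
            print(f"{script}:{number}: {line}")
```

After removing the reported lines by hand, recompile, relink and rerun
as described in the IBM PE item.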
This information is also available on our web server:
http://www.lrz-muenchen.de/services/compute/supermuc/aktuell/ali4438/
Reinhold Bader