Upd: SuperMUC: scheduled maintenance Jan 28 - 31

LRZ aktuell publish at lrz.de
Mon Feb 4 21:43:28 CET 2013


 Changes to this message: Returned to user operation
 Update (Feb 4, 2013, 21:40): Although it is later than the afternoon we
 had announced, we are now returning the system to user operation and
 keeping our fingers crossed that the updates deliver the promised
 stability improvements.
 
 Update (Feb 4, 2013, 14:40)
 
 Dear users of the SuperMUC system,
 
 IBM and LRZ have worked hard over the weekend to resolve all system
 stability issues caused by some of the maintenance activities last
 week. We now believe that almost all problems reported by users of
 the system in recent weeks have been resolved.
 
 During our intensive testing and debugging over the weekend, we were
 also able to pinpoint one previously unknown problem with the
 processing of LoadLeveler job STDIN and STDOUT files in GPFS. We
 therefore recommend that you avoid using GPFS for these I/O streams and
 write them to your HOME directory instead. Please also place LoadLeveler
 output and error files inside your HOME directory.
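
 As an illustration of the recommendation above, a job command file
 might direct its output and error files to HOME roughly as follows.
 This is only a sketch: the job name "myjob" and the file names are
 hypothetical, $(home) and $(jobid) are assumed to expand to your home
 directory and the job ID as LoadLeveler variables usually do, and any
 class, node or wall-clock keywords your job needs are omitted here.

```shell
#!/bin/bash
# Hypothetical LoadLeveler job command file (sketch only).
# The "#@" lines are LoadLeveler directives; bash treats them as comments.
#
#@ job_name = myjob
# Keep the job's output and error files under HOME rather than on GPFS.
#@ output = $(home)/myjob.$(jobid).out
#@ error = $(home)/myjob.$(jobid).err
# If the job reads STDIN from a file, an "input" keyword would likewise
# point to a file under HOME.
#@ queue

# Placeholder for the actual job commands.
echo "job started"
```

 Resource-related keywords from your existing job command files should
 not need to change; only the locations of these files do.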
 
 We expect the system to go online again this afternoon. We regret any
 inconvenience the extended maintenance activities may have caused, and
 hope for your understanding.
 
 -----------------------------------------------------------------------
 Dear users of the SuperMUC petascale system at LRZ,
 
 Due to combined hardware and software maintenance, SuperMUC will be
 unavailable for user operation between January 28, 8:00 and the
 afternoon of January 31. Jobs still running at the beginning of the
 maintenance will be cancelled.
 
 The following changes are introduced by this maintenance:
 
   * Access to the PRACE network will become available to PRACE users.
   * LoadLeveler mail notification will be activated.
   * Licence servers will be accessible from the compute nodes.
   * The InfiniBand software stack will be updated to a newer release.
   * Some presently missing packages will become available on the
     compute nodes (e.g., you can remove the workaround for a missing
     libnuma.so).
   * Intel MPI: The newest 4.1 bug fix release is now used by default.
   * Intel MKL: The 11.0 release is now used by default. The old
     mkl/10.3 module is still available, however, if you run into
     trouble with the new version.
 


 This information is also available on our web server
 http://www.lrz-muenchen.de/services/compute/supermuc/aktuell/ali4502/

 Reinhold Bader


