Upd: Linux Cluster: Returning to user operation

LRZ aktuell publish at lrz.de
Tue Mar 1 15:47:40 CET 2011

 Changes to this message: Slowly returning to user operation
 As of March 1, 15:45, a large part of the cluster systems has been
 returned to regular user operation. There are some exceptions:
   * due to hardware problems, the Altix 4700 will remain unavailable
     for a few more days
   * some other (mostly serial) nodes have also suffered hardware
     failures due to the power cutoff. These will be returned to user
     operation as repairs are completed.
 As indicated in ALI 3942, an interruption of electrical power is
 scheduled for the weekend Feb 26-27. On the Linux cluster, additional
 maintenance measures will also be required. Therefore, the cluster will
 be unavailable for user operation between
         Friday, February 25, 16:00 and Tuesday, March 1, 18:00         
 All running batch jobs will be removed from the systems; we will leave
 the queues open until Friday 15:00, but recommend setting a user hold on
 any queued jobs that will not start execution in time to complete by
 Friday afternoon.
 Note: A job can be put into user hold by issuing the command
 qalter -h u <job id>
 Once the cluster has been returned to operation, you need to explicitly
 remove the hold again with
 qrls -h u <job id>
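
 If several queued jobs are affected, the two commands above can be
 wrapped in a small loop. The following is only a sketch: the function
 names hold_jobs and release_jobs are made up for illustration, and the
 job IDs must be supplied by hand as arguments (nothing is read from the
 queue automatically).

```shell
#!/bin/sh
# Sketch only: hold_jobs and release_jobs are hypothetical helper names.
# Each one calls the command from the announcement once per job ID.

# Put every job ID given as an argument into user hold.
hold_jobs() {
    for job_id in "$@"; do
        qalter -h u "$job_id"
    done
}

# Remove the user hold from every job ID given as an argument.
release_jobs() {
    for job_id in "$@"; do
        qrls -h u "$job_id"
    done
}
```

 For example, "hold_jobs 4711 4712" before the shutdown and
 "release_jobs 4711 4712" afterwards, where 4711 and 4712 stand in for
 your own job IDs.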

 This information is also available on our web server

 Reinhold Bader
