Linux Cluster: Upgrade of the parallel computing capacity
LRZ aktuell
publish at lrz.de
Tue Jul 19 16:34:54 CEST 2011
LRZ intends to take two new systems into user operation in its Linux
Cluster infrastructure within the next few weeks; these new systems are
focused on efficient execution of large-scale parallel programs.
The first system is an InfiniBand-connected cluster of 16-way
shared-memory systems with more than 2500 cores, delivered by MEGWare;
the second consists of SGI Ultraviolet (UV) large-scale shared-memory
systems with more than 2000 cores in total.
We expect the InfiniBand cluster to become available for regular user
operation first, followed by the UV system somewhat later. However, the
existing UV will be retired from user operation soon because the space
it presently occupies will be needed for the new system. Furthermore,
the InfiniBand cluster will need to be moved into the new LRZ
infrastructure later this summer, so there will be a prolonged
interruption of operation some weeks after it first enters user
operation.
Users are advised that all new systems, which are targeted at running
large parallel jobs, will use SLURM as the batch scheduler. Existing SGE
scripts will not work on the new clusters: the control section must be
rewritten, and the program startup procedure in the script section must
also be modified. Documentation for the new batch system will be
available via the Linux Cluster page
(www.lrz.de/services/compute/linux-cluster) once user operation starts.
The usage instructions for startup of MPI programs will also be updated.
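To give a rough idea of what rewriting the control section involves, the
following Python sketch translates a few common SGE directives into
SLURM equivalents. It is an illustration only and not the LRZ migration
procedure: the function names and the mapping table are hypothetical,
only generic SGE and SLURM options are covered, and site-specific
settings (parallel environment names, cluster or partition selection)
are not known here and are flagged for manual translation. The program
startup lines in the script section are passed through unchanged and
still need to be adapted by hand.

```python
# Hypothetical mapping of SGE flags to SLURM long options. Only flags
# whose single argument carries over unchanged are listed here.
SIMPLE_FLAGS = {
    "-N": "--job-name",
    "-o": "--output",
    "-e": "--error",
    "-M": "--mail-user",
}


def translate_directive(line):
    """Translate one '#$ ...' SGE directive into a '#SBATCH ...' line.

    Lines that are not SGE directives are returned unchanged.
    Directives not covered by this sketch become a TODO comment.
    """
    stripped = line.strip()
    if not stripped.startswith("#$"):
        return line
    parts = stripped[2:].split()
    if not parts:
        return line
    flag, args = parts[0], parts[1:]

    if flag in SIMPLE_FLAGS and len(args) == 1:
        return f"#SBATCH {SIMPLE_FLAGS[flag]}={args[0]}"

    if flag == "-pe" and len(args) == 2 and args[1].isdigit():
        # SGE form: '-pe <environment> <slots>'. Only the slot count is
        # carried over; the environment name has no direct counterpart.
        return f"#SBATCH --ntasks={args[1]}"

    if flag == "-l" and len(args) == 1 and args[0].startswith("h_rt="):
        limit = args[0][len("h_rt="):]
        # Assumption: the limit is given as HH:MM:SS. A plain number
        # means seconds in SGE but minutes in SLURM, so skip that case.
        if ":" in limit:
            return f"#SBATCH --time={limit}"

    return f"# TODO translate manually: {stripped}"


def translate_script(text):
    """Apply translate_directive to every line of a job script."""
    return "\n".join(translate_directive(l) for l in text.splitlines())
```

For example, translate_directive("#$ -pe mpi 64") yields
"#SBATCH --ntasks=64", while a directive such as "#$ -cwd" is left as a
TODO comment for manual review.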
We hope that this capacity upgrade will reduce the rather long waiting
times currently observed for parallel jobs. We intend to keep you
informed about the status of these new systems via updates to this
document.
This information is also available on our web server at:
http://www.lrz-muenchen.de/services/compute/aktuell/ali4065/
Reinhold Bader