Update: Preparing for the HLRB II successor SuperMUC
LRZ aktuell
publish at lrz.de
Wed Apr 6 10:29:13 CEST 2011
Changes to this message: Update
Dear users of HLRB II at LRZ,
After more than five years of operation, the currently deployed Altix
4700 system is approaching the end of its lifetime. LRZ intends to
retire it from user operation toward the end of October 2011.
Therefore, we hereby inform you of the steps LRZ will take to move its
supercomputing operations to the next-generation infrastructure, which
will be delivered by IBM and will be based on standard x86-64
processors with an InfiniBand-based fat-tree interconnect. LRZ intends
to move production to the new system incrementally throughout the
summer and autumn of 2011.
* Early in August 2011, a migration system with 205 nodes, each
  containing 40 Westmere-EX cores and 160 GBytes of shared memory,
  will be made available to selected HLRB II users to enable porting
  and tuning of programs. Toward the end of August 2011, the system
  will then be made generally available to all users.
* Migration of the LRZ software stack will start in July 2011 and may
  take until September to complete. The development environment will
  be based on Intel's compilers (version 12.0) and an MPI
  implementation delivered by IBM. Pure OpenMP programs can only be
  run with up to 40 threads, i.e. within a single node (see the
  compile-and-run sketch after this list).
* With a peak performance of 78 TFlop/s, the above-mentioned system
  will be able to take over the job load from the Altix 4700 while
  requiring only about 25% of the electrical energy of HLRB II.
* IBM's LoadLeveler will be deployed as the batch queuing system; you
  will therefore need to adapt your existing PBS job scripts (an
  example script is sketched after this list).
* During the intermediate operation phase (from August 2011 to summer
  2012), the I/O bandwidth to the scratch and project file systems
  will be lower than what is presently available on the Altix, and
  programs making use of parallel I/O via MPI-IO or HDF5 will require
  special configuration.
* In late summer or early autumn 2011, LRZ, in collaboration with
  Intel and IBM, is planning a training workshop on handling the
  system and its development environment, to which all users of the
  HLRB II will be invited.
* In summer 2012, the much larger SuperMUC system, with more than
  6250 nodes (each with 16 Sandy Bridge-EP cores and 32 GBytes of
  memory), will become available for user operation. SuperMUC will
  have a peak performance of 3 PetaFlop/s (3,000 TFlop/s), roughly 50
  times that of the present system. To fully exploit the new SIMD
  (AVX) units on the Sandy Bridge CPUs, further tuning measures may
  be appropriate, but programs compiled for the Westmere-based
  migration system should run without problems (see the recompilation
  sketch after this list).
* The system will also provide GPFS ("General Parallel File System")
  based disk storage with a total capacity of 10,000 TBytes and an
  aggregate bandwidth of 200 GBytes/s, which will fully support
  MPI-IO and HDF5-based parallel I/O (a minimal MPI-IO example
  follows after this list). Furthermore, SuperMUC will have
  provisions in place to run jobs in an energy-efficient manner,
  e.g. by running processors at a lower frequency when the program's
  performance does not deteriorate.
* Some months after SuperMUC enters user operation, the nodes of the
  migration system will be integrated into the main system, forming a
  "fat node" island for programs that require a large shared memory.
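
To make some of the points above more concrete, a few illustrative
sketches follow. First, compiling and running in the new environment
might look roughly as shown below; the module names and the compiler
wrapper are placeholders, and the authoritative names will appear in
the documentation of the new software stack:

    # Hypothetical module names for the new software stack
    module load intel/12.0   # Intel compilers
    module load mpi.ibm      # IBM's MPI implementation

    # Compile with the Intel compiler (the mpicc wrapper name is a
    # placeholder)
    mpicc -O3 -openmp -o myprog myprog.c

    # A pure OpenMP run is limited to the 40 cores of a single node
    export OMP_NUM_THREADS=40
    ./myprog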
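
Second, a sketch of what a LoadLeveler job script may look like, as a
rough counterpart to a PBS script. The class name, node counts, and
executable are placeholders; the exact set of keywords supported at
LRZ will be specified in the new system's documentation:

    #!/bin/bash
    # LoadLeveler uses "# @" directives instead of "#PBS" ones
    # @ job_name = migration_test
    # @ job_type = parallel
    # @ class = test                 # placeholder class name
    # @ node = 4                     # number of 40-core nodes
    # @ total_tasks = 160            # MPI tasks over all nodes
    # @ wall_clock_limit = 01:00:00
    # @ output = $(job_name).$(jobid).out
    # @ error  = $(job_name).$(jobid).err
    # @ queue

    poe ./myprog                     # IBM MPI launcher

Jobs are then submitted with llsubmit instead of qsub, and llq takes
the place of qstat.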
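
Third, regarding the tuning mentioned for the Sandy Bridge nodes: with
the Intel 12.0 compilers, AVX code generation can be requested
explicitly, while a build for the Westmere-based migration system
would target SSE4.2. The flags below are standard Intel compiler
options; the wrapper name is again a placeholder:

    # Build for the Westmere-EX migration system (SSE4.2)
    mpicc -O3 -xSSE4.2 -o myprog myprog.c

    # Rebuild for SuperMUC's Sandy Bridge-EP nodes (AVX)
    mpicc -O3 -xAVX -o myprog myprog.c

Since Sandy Bridge also executes SSE4.2 code, binaries built for the
migration system are expected to keep running on SuperMUC, as noted
above; only peak SIMD performance requires the AVX rebuild.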
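
Finally, a minimal example of the MPI-IO usage mentioned above: each
MPI task writes its rank into one shared file using a collective
call. The file name is a placeholder; note that during the
intermediate operation phase such parallel I/O will still require the
special configuration mentioned above, whereas on SuperMUC's GPFS
file systems it will be fully supported:

    #include <mpi.h>

    int main(int argc, char **argv) {
        int rank;
        MPI_File fh;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* All tasks open one shared file ("ranks.dat" is just a
           placeholder name) */
        MPI_File_open(MPI_COMM_WORLD, "ranks.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY,
                      MPI_INFO_NULL, &fh);

        /* Each task collectively writes its rank at a
           rank-dependent offset */
        MPI_File_write_at_all(fh, (MPI_Offset)rank * sizeof(int),
                              &rank, 1, MPI_INT, MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        MPI_Finalize();
        return 0;
    }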
We expect that, due to security considerations, some policies on system
usage (especially how to access the system) may change. We will keep
you informed about these changes through the documentation of the new
system once details have been determined.
We wish all of you the best possible success in your scientific
simulation endeavors on our new Petaflop-class successor to HLRB II.
If you have further questions, please direct them to us via the LRZ
Servicedesk (https://servicedesk.lrz.de/?service=
Hochleistungsrechnen%20und%20Grid-Supercomputing%20(HLRB)&lang=en).
Some additional information is also available via http://www.lrz.de/
services/compute/supermuc/01_supermuc_desc/.
This information is also available on our web server
http://www.lrz-muenchen.de/services/compute/hlrb/aktuell/ali3984/
Reinhold Bader