Update: Linux Cluster: Serial SLURM processing returned to operation
LRZ aktuell
publish at lrz.de
Fri Jun 21 15:26:48 CEST 2013
Changes to this message: Update
Update (June 21, 15:20):
SLURM processing is enabled again. All queued and running jobs had
to be deleted, so you will need to resubmit your jobs.
However, the root cause has only been partially fixed, by limiting the
number of submitted jobs to a maximum of 1000. The following additional
restrictions on SLURM usage have been added to the user documentation:
* Please avoid using more than 10 characters in the job name field.
* Please do not submit scripts with long argument lists (for example,
lists generated by xargs). Instead, generate the argument list for
program startup inside the submitted script.
Both of the above can lead to overflow of SLURM-internal buffers,
amounting to a denial-of-service situation. We intend to put a
mechanism in place to enforce the above limits, but this will take some
time to implement.
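Until such a mechanism exists, the two restrictions could be checked on
the user side before a job is submitted. The following is a minimal
sketch only, not an LRZ-provided tool: the function name
check_submission and the argument-count threshold of 16 are assumptions
made for this example. Only the 10-character job name limit is taken
from this message.

```python
MAX_JOB_NAME_LEN = 10   # limit stated in this announcement
MAX_SCRIPT_ARGS = 16    # hypothetical threshold, NOT an official LRZ value


def check_submission(job_name, script_args):
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    # Restriction 1: keep the job name field short.
    if len(job_name) > MAX_JOB_NAME_LEN:
        problems.append(
            f"job name {job_name!r} has {len(job_name)} characters "
            f"(limit {MAX_JOB_NAME_LEN})"
        )
    # Restriction 2: do not pass long argument lists at submission time.
    if len(script_args) > MAX_SCRIPT_ARGS:
        problems.append(
            f"{len(script_args)} arguments passed at submission "
            f"(assumed limit {MAX_SCRIPT_ARGS}); "
            "build the list inside the submitted script instead"
        )
    return problems


if __name__ == "__main__":
    # Example: a long job name plus an xargs-style argument list.
    for problem in check_submission("parameter_sweep_run", ["in.dat"] * 500):
        print("WARNING:", problem)
```

A wrapper around the submission command could refuse to submit whenever
the returned list is not empty.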
This information is also available on our web server
http://www.lrz-muenchen.de/services/compute/aktuell/ali4605/
Reinhold Bader