Why is my MPI job failing due to the locked memory (memlock) limit being too low?
By default, Slurm propagates all of your resource limits at the time of job submission to the spawned tasks. This behavior can be disabled for specific limits using the PropagateResourceLimitsExcept option in the slurm.conf file. For example, PropagateResourceLimitsExcept=MEMLOCK prevents a user's locked memory limit on a login node from being propagated to the compute nodes running their parallel job.

If a user's resource limit is not propagated, the limit in effect for the slurmd daemon is used for the spawned job. A simple way to control this is to ensure that user root has a sufficiently large locked memory limit on the compute nodes (see "man limits.conf") and that slurmd takes full advantage of that limit, e.g. by adding "ulimit -l unlimited" to the /etc/init.d/slurm script used to start slurmd.
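The pieces above can be sketched as the following configuration fragments. This is an illustrative example, not a drop-in configuration: file paths (such as the /etc/init.d/slurm init script) and limit values vary by distribution and site.

```shell
# slurm.conf: do not propagate the submitting user's MEMLOCK limit;
# spawned tasks will inherit slurmd's limit instead.
PropagateResourceLimitsExcept=MEMLOCK

# /etc/security/limits.conf on the compute nodes: give root an
# unlimited locked-memory limit (see "man limits.conf").
root    soft    memlock    unlimited
root    hard    memlock    unlimited

# /etc/init.d/slurm (or your site's slurmd startup script), before
# slurmd is launched, so the daemon runs with the raised limit:
ulimit -l unlimited
```

After restarting slurmd, you can confirm the limit a job actually sees with something like `srun bash -c 'ulimit -l'`, which should report "unlimited".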