slurm-dummies
Differences
This shows you the differences between two versions of the page.
slurm-dummies [2020/03/02 18:19] – admin | slurm-dummies [2020/05/07 10:25] – admin | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== S.L.U.R.M. ====== | ====== S.L.U.R.M. ====== | ||
- | ====== | + | ====== |
===== for Dummies ===== | ===== for Dummies ===== | ||
- | **1st thing you need to know:** Using a SLURM script is like typing the commands in a shell. Therefore you must include in the script all the commands that you would use in the shell before/ | + | **1st things |
+ | Using a SLURM script is like typing the commands in a shell. Therefore you must include in the script all the commands that you would use in the shell before/ | ||
\\ | \\ | ||
Every instruction line for the queue manager starts with #SBATCH, so\\ | Every instruction line for the queue manager starts with #SBATCH, so\\ | ||
Line 11: | Line 12: | ||
# SBATCH...... : this is a comment\\ | # SBATCH...... : this is a comment\\ | ||
- | The mandatory directives that you must **always** include in the scripts are: | + | ====The mandatory==== |
+ | directives that you must **always** include in the scripts are: | ||
- Your email address: the official EPFL address or another, but valid (worldwide), | - Your email address: the official EPFL address or another, but valid (worldwide), | ||
- How much time your job must run (if the job runs over this limit the cluster manager will kill it). The minimum is 1 minute and there' | - How much time your job must run (if the job runs over this limit the cluster manager will kill it). The minimum is 1 minute and there' | ||
- How much memory (RAM) your job will use. Please remember that if your job uses more memory than the limit you put here, the cluster manager will kill the job. The minimum is 512 Mbyte; currently (as of Feb 2020) the maximum is 64 Gbyte. | - How much memory (RAM) your job will use. Please remember that if your job uses more memory than the limit you put here, the cluster manager will kill the job. The minimum is 512 Mbyte; currently (as of Feb 2020) the maximum is 64 Gbyte. | ||
+ | - How many nodes (computers) you're going to use with your script. | ||
- How many cores/CPUs must be reserved for your job. If you don't include this parameter, only one core/CPU will be assigned to your job and you can run only a single-threaded job. | - How many cores/CPUs must be reserved for your job. If you don't include this parameter, only one core/CPU will be assigned to your job and you can run only a single-threaded job. | ||
+ | - **the name of the queue/ | ||
+ | |||
+ | ==== partitions (a.k.a. queues) ==== | ||
+ | If you have used other cluster management systems, you will already know the term '' | ||
The beginning of your script will be: | The beginning of your script will be: | ||
Line 23: | Line 30: | ||
# how much time this process must run (hours: | # how much time this process must run (hours: | ||
#SBATCH --time=04: | #SBATCH --time=04: | ||
- | # how much memory it needs ? 1 GB (1024MB) for the example | + | # how much memory it needs ? 1 GB (1024MB) for the example. Different units can be specified using the suffix [K|M|G|T] |
#SBATCH --mem=1G | #SBATCH --mem=1G | ||
</ | </ | ||
Line 30: | Line 37: | ||
#Number of cores needed by the application (8 in this example) | #Number of cores needed by the application (8 in this example) | ||
#SBATCH --cpus-per-task=8 | #SBATCH --cpus-per-task=8 | ||
- | #and of course the number of nodes (computers) your program is supposed to use (usually | + | #and of course the number of nodes (physical |
#SBATCH --nodes=1 | #SBATCH --nodes=1 | ||
</ | </ | ||
Line 40: | Line 47: | ||
< | < | ||
# this line instructs SLURM to send a mail when the job starts and finishes | # this line instructs SLURM to send a mail when the job starts and finishes | ||
- | #SBATCH --mail-type=begin | + | #SBATCH --mail-type=begin,end |
- | #SBATCH --mail-type=end | + | |
</ | </ | ||
Line 60: | Line 66: | ||
</ | </ | ||
- | Another mandatory parameter is the queue you want to utilize: at the moment we have only the queue '' | + | Another mandatory parameter is the queue (called partition in SLURM terminology) |
< | < | ||
# queue to be used | # queue to be used | ||
Line 84: | Line 90: | ||
echo " | echo " | ||
- | # it's better to use the command srun to launch the executable command (just prefix srun to your normal command line), so SLURM can better manage the scheduling of the jobs | + | </ |
+ | It's better to use the command srun to launch the executable command (just prefix srun to your normal command line), so SLURM can better manage the scheduling of the jobs | ||
+ | < | ||
srun ./name of the program and parameters you want to launch | srun ./name of the program and parameters you want to launch | ||
Line 115: | Line 124: | ||
</ | </ | ||
- | Now you just need to tell the cluster system that you want to run this job, but how do you do that? Pretty simple: you use the command | + | Now you just need to tell the cluster system that you want to run this job, but how do you do that? Pretty simple: you use the command |
< | < | ||
Line 122: | Line 131: | ||
After all this work, you just need to relax and wait until you receive the email messages from the queuing manager telling you about success or failure of your submissions. | After all this work, you just need to relax and wait until you receive the email messages from the queuing manager telling you about success or failure of your submissions. | ||
- | If you browse the documentation we have on [[slurm|Batch Queuing System]] you'll find examples on how to use Matlab or Mathematica and some explanations about the directives and the commands available for the queuing system. | + | If you browse the documentation we have on [[1slurm|Batch Queuing System]] you'll find examples on how to use Matlab or Mathematica and some explanations about the directives and the commands available for the queuing system. |
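To see the directive rules from this page in one place, here is a minimal sketch that writes a small job script and then counts which lines the queue manager would treat as directives. It is only an illustration: the file name ''job.sh'', the e-mail address, the 10-minute time limit and the partition name ''PARTITION_NAME'' are hypothetical placeholders, not values taken from this cluster. Replace them with your own.

```shell
#!/bin/sh
# Sketch: write a minimal job script, then check which lines are real directives.
# job.sh, the e-mail address, the time limit and PARTITION_NAME are placeholders.

cat > job.sh <<'EOF'
#!/bin/bash
#SBATCH --mail-user=your.name@example.org
#SBATCH --mail-type=begin,end
#SBATCH --time=00:10:00
#SBATCH --mem=1G
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8
#SBATCH --partition=PARTITION_NAME
# SBATCH --mem=2G   <- space after '#': this is only a comment

echo "running on $(hostname)"
EOF

# Active directives: lines that start exactly with "#SBATCH"
active=$(grep -c '^#SBATCH' job.sh)
# Commented-out directives: "#", at least one blank, then "SBATCH"
disabled=$(grep -c '^#[[:space:]]\{1,\}SBATCH' job.sh)

echo "active directives: $active"
echo "commented-out directives: $disabled"
```

The check itself needs only ''grep'', so it can be run on any machine before submitting. On a system where SLURM is installed, a file like this is normally handed to the queue manager with ''sbatch job.sh''.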
slurm-dummies.txt · Last modified: 2023/10/09 15:17 by admin