slurm-dummies
Differences
This shows you the differences between two versions of the page.
slurm-dummies [2020/03/02 18:26] – admin | slurm-dummies [2020/05/09 16:56] – admin | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== S.L.U.R.M. ====== | ====== S.L.U.R.M. ====== | ||
- | ====== | + | ====== |
===== for Dummies ===== | ===== for Dummies ===== | ||
- | **1st thing you need to know:** \\ | + | **1st things |
Using a Slurm script is like typing the commands yourself in a shell. Therefore you must include in the script all the commands that you would use in the shell before/ | Using a Slurm script is like typing the commands yourself in a shell. Therefore you must include in the script all the commands that you would use in the shell before/ | ||
\\ | \\ | ||
Line 12: | Line 12: | ||
# SBATCH...... : this is a comment\\ | # SBATCH...... : this is a comment\\ | ||
- | The mandatory directives that you must **always** include in the scripts are: | + | ====The mandatory==== |
+ | directives that you must **always** include in the scripts are: | ||
- Your email address: the official EPFL address or another one, but valid (worldwide), | - Your email address: the official EPFL address or another one, but valid (worldwide), | ||
- How long your job may run (if the job runs over this limit, the cluster manager will kill it). The minimum is 1 minute and there' | - How long your job may run (if the job runs over this limit, the cluster manager will kill it). The minimum is 1 minute and there' | ||
Line 18: | Line 19: | ||
- How many nodes (computers) you're going to use with your script. | - How many nodes (computers) you're going to use with your script. | ||
- How many cores/CPUs must be reserved for your job. If you don't include this parameter, only one core/CPU will be assigned to your job and you will only be able to run a single-threaded job. | - How many cores/CPUs must be reserved for your job. If you don't include this parameter, only one core/CPU will be assigned to your job and you will only be able to run a single-threaded job. | ||
- | - the name of the queue you want to use: currently only '' | + | - **the name of the queue/ |
+ | |||
+ | ==== partitions (a.k.a. queues) ==== | ||
+ | If you have used other cluster management systems, you will already know the term '' | ||
The beginning of your script will be: | The beginning of your script will be: | ||
Line 26: | Line 30: | ||
# how much time this process must run (hours: | # how much time this process must run (hours: | ||
#SBATCH --time=04: | #SBATCH --time=04: | ||
- | # how much memory it needs ? 1 GB (1024MB) for the example | + | # how much memory it needs ? 1 GB (1024MB) for the example. Different units can be specified using the suffix [K|M|G|T] |
#SBATCH --mem=1G | #SBATCH --mem=1G | ||
</ | </ | ||
Line 33: | Line 37: | ||
#Number of cores needed by the application (8 in this example) | #Number of cores needed by the application (8 in this example) | ||
#SBATCH --cpus-per-task=8 | #SBATCH --cpus-per-task=8 | ||
- | #and of course the number of nodes (computers) your program is supposed to use (usually | + | #and of course the number of nodes (physical |
#SBATCH --nodes=1 | #SBATCH --nodes=1 | ||
</ | </ | ||
Line 42: | Line 46: | ||
< | < | ||
- | # this line instructs | + | # this line instructs | ||
- | #SBATCH --mail-type=begin | + | #SBATCH --mail-type=begin,end |
- | #SBATCH --mail-type=end | + | |
</ | </ | ||
Line 105: | Line 108: | ||
#SBATCH --mail-user=dummy.epfl@epfl.ch | #SBATCH --mail-user=dummy.epfl@epfl.ch | ||
#SBATCH --time=04: | #SBATCH --time=04: | ||
- | #SBATCH --mem=1024mb | + | #SBATCH --mem=1024M |
#SBATCH --cpus-per-task=8 | #SBATCH --cpus-per-task=8 | ||
#SBATCH --mail-type=begin | #SBATCH --mail-type=begin | ||
Line 121: | Line 124: | ||
</ | </ | ||
- | Now you just need to tell the cluster system that you want to run this job, but how do you do that? Pretty simple: you use the command | + | Now you just need to tell the cluster system that you want to run this job, but how do you do that? Pretty simple: you use the command | ||
< | < | ||
Line 128: | Line 131: | ||
After all this work, you just need to relax and wait until you receive the email messages from the queuing manager telling you about success or failure of your submissions. | After all this work, you just need to relax and wait until you receive the email messages from the queuing manager telling you about success or failure of your submissions. | ||
- | If you browse the documentation we have on [[slurm|Batch Queuing System]] you'll find examples on how to use Matlab or Mathematica and some explanation about the directives and the commands available for the queuing system. | + | If you browse the documentation we have on [[1slurm|Batch Queuing System]] you'll find examples on how to use Matlab or Mathematica and some explanation about the directives and the commands available for the queuing system. | ||
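To tie the directives together, here is a minimal sketch of a complete job script and its submission. It is only an illustration under a few assumptions: the file name `dummy_job.sh` is hypothetical, the `--time` value is an arbitrary example rather than the one used on this page, and no partition directive is included because the partition name depends on your cluster. Outside a Slurm job the `#SBATCH` lines are plain shell comments, so the script also runs under bash for a quick check.

```shell
#!/bin/bash
# Write a small job script to disk (dummy_job.sh is a hypothetical name).
cat > dummy_job.sh <<'EOF'
#!/bin/bash
# where to send the notifications, and for which events
#SBATCH --mail-user=dummy.epfl@epfl.ch
#SBATCH --mail-type=begin,end
# run time limit as hours:minutes:seconds (illustrative value)
#SBATCH --time=00:10:00
# memory, nodes and cores reserved for the job
#SBATCH --mem=1G
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8

# The actual work starts here. Slurm sets SLURM_CPUS_PER_TASK when
# --cpus-per-task is given; fall back to 1 when run outside Slurm.
echo "Running on ${HOSTNAME:-unknown} with ${SLURM_CPUS_PER_TASK:-1} core(s)"
EOF

# Submit the script only if sbatch is available on this machine.
if command -v sbatch >/dev/null 2>&1; then
    sbatch dummy_job.sh
else
    echo "sbatch not found; job script written to dummy_job.sh"
fi
```

Running `bash dummy_job.sh` by hand before submitting is a cheap way to catch typos in the commands themselves, although it does not validate the `#SBATCH` directives.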
slurm-dummies.txt · Last modified: 2023/10/09 15:17 by admin