Our cluster runs the [[https://slurm.schedmd.com|S.L.U.R.M.]] workload manager for managing batch jobs. It is **preferable** to use this system for running long batch jobs, as interactive calculations are less reliable and require more manual work.
  
The queuing system gives you access to computers owned by LCM, LTHC, LTHI, LINX and the IC Faculty; sharing the computational resources among as many groups as possible results in a more efficient use of the resources (including electric power), and you can take advantage of many more machines for your urgent calculations and get results faster.
On the other hand, since the machines you are using are not always owned by your group, try to be as fair as possible and respect the needs of other users.
  
We have configured the system with almost no restrictions on access and capabilities, because the queuing system can make more efficient use of the cluster if it does not have to satisfy too many constraints. We currently impose only the following constraints (see the example after this list):
  - number of CPUs/cores: you must indicate the correct number of cores you're going to use;
  - amount of RAM (Megabytes/Gigabytes) your jobs need;
  - execution time: if your job is not completed by the indicated time, it will be automatically terminated;
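In practice these constraints are expressed through the options described in the Mini User Guide below; as a purely illustrative example (the script name and the values are invented), they can also be given directly on the command line:
<code>
$ sbatch --cpus-per-task=4 --mem=8G --time=12:00:00 my_job.slurm
</code>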
  
Here we provide just a quick and dirty guide to the most basic commands and tasks that you're going to use in your day-to-day activities; you can find better and more complete guides on how to use S.L.U.R.M. control commands on the internet, e.g.:
  - [[https://slurm.schedmd.com/|Slurm Documentation]]
  - [[https://scitas-data.epfl.ch/confluence/display/DOC/FAQ#FAQ-BatchSystemQuestions|SCITAS Documentation]]
  - [[https://slurm.schedmd.com/quickstart.html|Quick Start]]
  
  
==== partitions (a.k.a. queues) ====
If you have used other types of cluster management, you will already know the term "queue" to identify the type of computers (nodes) or programs (jobs) you want to use. In S.L.U.R.M. notation, ''queues'' are called **partitions**. The two terms are used to indicate the same entity, even if they are not quite the same.
  
===== Mini User Guide =====
  
The most commonly used commands are:
  - ''squeue'' for checking the status of the partitions or of your running jobs
  - ''sbatch'' or ''srun'' for submitting your jobs
  - ''scancel'' for cancelling a submitted or running job
  - ''sinfo'' to discover the availability of nodes and partitions
  
  * ''sinfo'' shows the list of partitions and nodes and their availability: here you can see that the default partition of the cluster is called "slurm-cluster" (the * indicates the default), the time limit imposed on the partitions, the nodes that are associated with them, and their activity status.
<code>
$ sinfo
</code>
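The output reflects the current cluster configuration; as a purely illustrative mock-up (the node names, counts and time limits below are invented), the listing looks roughly like this:
<code>
PARTITION      AVAIL  TIMELIMIT  NODES  STATE NODELIST
slurm-cluster*    up 2-00:00:00      4   idle node[01-04]
slurm-cluster*    up 2-00:00:00      2  alloc node[05-06]
</code>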
The ''squeue'' command provides the ID of the jobs, the PARTITION used to run the jobs (hence the nodes where these jobs will run), the NAME assigned to the jobs, the name of the USER that submitted the jobs, the STATUS of the job (R = Run, PD = Waiting), the execution TIME and the nodes where the jobs are actually running (or the reason why they wait in the queue).
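As a purely illustrative example (the job IDs, names, users and nodes are invented), the listing looks roughly like this:
<code>
$ squeue
 JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
   547 slurm-clu  simul-a    alice  R    2:15:33      1 node02
   548 slurm-clu  render1      bob PD       0:00      1 (Resources)
</code>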
  
  * ''sbatch'' is used to submit and run jobs on the cluster. Jobs are nothing more than short scripts that contain some directives describing the specific requirements of the programs that need to be executed. The output of the program will be written by default to two files called xxx.out and xxx.err, respectively for standard output (any message that would be printed on the screen) and standard error (any error message that would be printed on the screen). ''xxx'' stands for the job ID. You can change the output file names by setting the directives ''--output='' for standard output and ''--error='' for standard error (see the example below).
Once a job is submitted (and accepted by the cluster), you'll receive the ID assigned to the job:
<code>
$ sbatch sheepit.slurm
Submitted batch job 552
</code>
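As an illustration of the ''--output=''/''--error='' directives mentioned above, the defaults can also be overridden directly on the ''sbatch'' command line (the file names here are invented); ''%j'' is expanded by SLURM to the job ID:
<code>
$ sbatch --output=sheepit-%j.out --error=sheepit-%j.err sheepit.slurm
</code>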
  * ''srun'' is used to launch your program inside the cluster: it can be used to obtain an interactive session on a node, or to launch parallel programs. When used inside sbatch/slurm scripts, the cluster always knows what resources are allocated.
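For example, the following call requests a short interactive shell on a compute node (the resource values are arbitrary and only illustrative):
<code>
$ srun --time=1:00:00 --mem=4G --cpus-per-task=2 --pty bash
</code>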
  
  * ''scancel'' is used to remove your job from the queue or to kill your program when it's already running on the cluster:
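For example, to cancel a job by the ID reported at submission time (552 in the ''sbatch'' example above):
<code>
$ scancel 552
</code>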
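A job script is just a text file containing ''#SBATCH'' directives followed by the commands to run. The following is only an illustrative sketch (the job name, file names, resource values and the ''my_program'' executable are invented); the individual directives are explained below:
<code>
#!/bin/bash
# resources requested from the cluster (illustrative values)
#SBATCH --job-name=example
#SBATCH --cpus-per-task=8
#SBATCH --mem=16G
#SBATCH --time=4:00:00
#SBATCH --output=example-%j.out
#SBATCH --error=example-%j.err
#SBATCH --partition=slurm-cluster

# launch the actual program through srun, so the cluster
# knows exactly what resources are being used
srun ./my_program
</code>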
  
<note>
At the beginning of the file you can read the line ''#!/bin/bash'', which is not strictly necessary. It is common practice to mark slurm scripts as bash scripts so that they can also be executed outside of the cluster; in that case the '#SBATCH' lines are interpreted as comments.
</note>
Inside a script, all the lines that start with the '#' character are comments, but the lines that start with the '#SBATCH' string are directives for the queuing system.
  * ''#SBATCH --cpus-per-task=8'' informs the cluster that the program will need/use 8 cores to run
  * ''#SBATCH --time=4:00:00'' the job will run for at most 4 hours
  * ''#SBATCH --mem=16G'' the job will require 16GB of RAM to be executed. Different units can be specified using the suffixes [K|M|G|T]
  * ''#SBATCH --mail-*'' are parameters to indicate the email address that will receive the email messages from the cluster and when these messages are to be sent (when the job starts and when it ends)
  * ''#SBATCH --error'', ''#SBATCH --output'' these two directives indicate where to write the messages from the program(s) you execute
  * ''#SBATCH --partition'' indicates the partition that must be used to run the program
  * ''#SBATCH --gres=gpu:1'' this parameter informs the cluster that the program must be run only on nodes that provide the "gpu" resource and that **1** of these resources is needed.
  * some nodes expose additional //features// that can be used to select the hardware your job needs (see the sketch after this list):
    * ''tensorflow'' nodes that can run tensorflow 2.x
    * ''xeon26'', ''xeon41'', ''xeon56'' nodes that have different versions of Intel Xeon CPUs
    * ''epyc7302'' nodes that provide AMD Epyc CPUs
    * ''gpu'' nodes that provide GPU capabilities
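Assuming these labels are exposed as node //features// (this is a sketch, not the exact cluster configuration), a job can be restricted to a particular class of nodes with the standard ''--constraint'' directive:
<code>
# run only on nodes that provide AMD Epyc CPUs
#SBATCH --constraint=epyc7302
</code>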
  
  
<note important>
It is **mandatory** to specify at least the estimated run time of the job and the memory needed, so the scheduler can optimize the nodes/cores/memory usage and the overall cluster throughput. If your job exceeds the limits you set, it will be automatically killed by the cluster manager.
  
Please keep in mind that longer jobs are less likely to enter the queue when the cluster load is high. Therefore, don't be lazy and do not always ask for //infinite// run time because your job will remain stuck in the queue.
</note>

To cancel all of your running jobs at once, you can combine ''squeue'' and ''scancel'':
<code>
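# list the IDs of all your RUNNING jobs and cancel them one by one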
  
squeue -u $(whoami) -h -t RUNNING | awk '{print $1};' | while read a ; do scancel ${a} ; done
</code>
  
  