Queueing System - Torque/Maui
Our cluster runs the Torque resource manager (a PBS variant) and the Maui scheduler to manage batch jobs. Using this system is mandatory for long batch jobs: interactive calculations will be tolerated less and less.
The queueing system gives you access to computers owned by ALGO, ARNI, LCM, LICOS, LTHC, and LTHI, for a total of approximately 140 CPUs (cores). We (the sysadmins) believe that sharing the computational resources among as many groups as possible will result in more efficient use of the resources and of electric power. A larger cluster not only has a better average throughput, but is also better suited to respond to peak requests.
As a user you can take advantage of many more machines for your urgent calculations and get results faster. On the other hand, since the machines you are using are not always owned by your group, try to be as fair as possible and respect the needs of other users: if you notice that the cluster is overloaded (qstat -q), do not submit too many jobs and leave some space for the others.
We have configured the system almost without access restriction because the queueing system can make a more efficient use of the cluster if it does not have to satisfy too many constraints. Please don't force us to introduce limitations such as, for example, a maximum number of jobs per user.
Mini User Guide
The 3 most used commands are:
qstat: for checking the status of the queues or of your running jobs
qsub: for submitting your jobs
qdel: for deleting a running or waiting job
qstat
qstat -q shows the status of the queues. In the following example there are 5 queues (long, short, batch, algo, and default, which is an alias for short). There are 100 jobs running in the long queue and one job queued in algo. In the short queue (the default if you don't specify how long your job is supposed to run), a job can run for at most one hour.
[root@licossrv4 server_priv]# qstat -q

server: licossrv4.epfl.ch

Queue            Memory CPU Time Walltime Node  Run Que Lm  State
---------------- ------ -------- -------- ----  --- --- --  -----
long               --      --       --      --  100   0 --   E R
default            --      --       --      --    0   0 --   E R
short              --   01:00:00    --      --    0   0 --   E R
batch              --      --       --      --    0   0 --   E R
algo               --   24:00:00    --      --    0   1 --   E R
                                               ----- -----
                                                 100     1
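Before submitting a large batch, it can help to total the Run and Que columns. The following is a minimal sketch that assumes the column layout of the qstat -q output shown above; queue_load is a hypothetical helper name, not part of Torque.

```shell
#!/bin/sh
# queue_load (hypothetical helper): read `qstat -q` output on stdin and
# print the total number of running and queued jobs as "RUN QUE".
# Assumes the layout shown above: 10 whitespace-separated fields per queue
# line, with Run in field 6 and Que in field 7.
queue_load() {
    awk 'NF == 10 && $6 ~ /^[0-9]+$/ && $7 ~ /^[0-9]+$/ { run += $6; que += $7 }
         END { printf "%d %d\n", run, que }'
}

# Only query the server when qstat is actually installed.
if command -v qstat >/dev/null 2>&1; then
    qstat -q | queue_load
fi
```

For the example output above, the helper would print the two totals from the last line, 100 running and 1 queued.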
qstat -a gives more information about the jobs in the queue. The job status is indicated in the S column: R=running, Q=queued, etc. Alternatively, you can use qstat -n1, which also shows the name of the machine where each job is running:
[root@licossrv4 server_priv]# qstat -a

licossrv4.epfl.ch:
                                                                   Req'd  Req'd   Elap
Job ID               Username Queue    Jobname    SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
146.licossrv4.epfl.c damir    batch    STDIN        3980     1  --     --    -- R    --
147.licossrv4.epfl.c damir    batch    STDIN        3998     1  --     --    -- R    --
148.licossrv4.epfl.c damir    batch    STDIN       24367     1  --     --    -- R    --
149.licossrv4.epfl.c damir    batch    STDIN       24390     1  --     --    -- R    --
150.licossrv4.epfl.c damir    batch    STDIN          --     1  --     --    -- Q    --
151.licossrv4.epfl.c damir    batch    STDIN          --     1  --     --    -- Q    --
152.licossrv4.epfl.c damir    batch    STDIN          --     1  --     --    -- Q    --
153.licossrv4.epfl.c damir    batch    STDIN          --     1  --     --    -- Q    --
154.licossrv4.epfl.c damir    batch    STDIN          --     1  --     --    -- Q    --
155.licossrv4.epfl.c cangiani batch    STDIN       15006     1  --     --    -- R    --
156.licossrv4.epfl.c cangiani batch    STDIN       15028     1  --     --    -- R    --
157.licossrv4.epfl.c cangiani batch    STDIN       11036     1  --     --    -- R    --
158.licossrv4.epfl.c cangiani batch    STDIN       11045     1  --     --    -- R    --
159.licossrv4.epfl.c cangiani batch    STDIN       11080     1  --     --    -- R    --
160.licossrv4.epfl.c cangiani batch    STDIN       11097     1  --     --    -- R    --
161.licossrv4.epfl.c cangiani batch    STDIN       30704     1  --     --    -- R    --
162.licossrv4.epfl.c cangiani batch    STDIN       30715     1  --     --    -- R    --
163.licossrv4.epfl.c cangiani batch    STDIN       30733     1  --     --    -- R    --
164.licossrv4.epfl.c cangiani batch    STDIN       30756     1  --     --    -- R    --

[root@licossrv4 server_priv]# qstat -n1

licossrv4.epfl.ch:
                                                                   Req'd  Req'd   Elap
Job ID               Username Queue    Jobname    SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
165.licossrv4.epfl.c damir    batch    STDIN        4522     1  --     --    -- R 00:01   lthipc1/0
166.licossrv4.epfl.c damir    batch    STDIN        4549     1  --     --    -- R 00:01   lthipc1/1
167.licossrv4.epfl.c damir    batch    STDIN       24672     1  --     --    -- R 00:01   node02/0
168.licossrv4.epfl.c damir    batch    STDIN       24701     1  --     --    -- R 00:01   node02/1
169.licossrv4.epfl.c damir    batch    STDIN          --     1  --     --    -- Q    --   --
170.licossrv4.epfl.c damir    batch    STDIN          --     1  --     --    -- Q    --   --
171.licossrv4.epfl.c damir    batch    STDIN          --     1  --     --    -- Q    --   --
172.licossrv4.epfl.c damir    batch    STDIN          --     1  --     --    -- Q    --   --
173.licossrv4.epfl.c damir    batch    STDIN          --     1  --     --    -- Q    --   --
174.licossrv4.epfl.c cangiani batch    STDIN       15202     1  --     --    -- R    --   node03/0
175.licossrv4.epfl.c cangiani batch    STDIN       15225     1  --     --    -- R    --   node03/1
176.licossrv4.epfl.c cangiani batch    STDIN       11477     1  --     --    -- R    --   lthcserv7/0
177.licossrv4.epfl.c cangiani batch    STDIN       11494     1  --     --    -- R    --   lthcserv7/1
178.licossrv4.epfl.c cangiani batch    STDIN       11501     1  --     --    -- R    --   lthcserv7/2
179.licossrv4.epfl.c cangiani batch    STDIN       11508     1  --     --    -- R    --   lthcserv7/3
180.licossrv4.epfl.c cangiani batch    STDIN       30886     1  --     --    -- R    --   lcmpc1/0
181.licossrv4.epfl.c cangiani batch    STDIN       30910     1  --     --    -- R    --   lcmpc1/1
182.licossrv4.epfl.c cangiani batch    STDIN       30931     1  --     --    -- R    --   lcmpc1/2
183.licossrv4.epfl.c cangiani batch    STDIN       30952     1  --     --    -- R    --   lcmpc1/3
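When the job list is long, a per-user summary is easier to read than the raw listing. The sketch below assumes the qstat -a column layout shown above; jobs_by_user is a hypothetical helper name, not a Torque command.

```shell
#!/bin/sh
# jobs_by_user (hypothetical helper): read `qstat -a` output on stdin and
# print one "USER STATE COUNT" line per user and job state.
# Assumes the layout shown above: job lines start with a numeric job id,
# the user name is field 2 and the state (S column) is field 10.
jobs_by_user() {
    awk '$1 ~ /^[0-9]+\./ { count[$2 " " $10]++ }
         END { for (key in count) print key, count[key] }' | LC_ALL=C sort
}

# Only query the server when qstat is actually installed.
if command -v qstat >/dev/null 2>&1; then
    qstat -a | jobs_by_user
fi
```

Header and separator lines are skipped because their first field does not start with a job number.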
qsub
qsub is used to submit jobs. A job is nothing more than a short script that launches the program to be executed. The simplest job looks something like the following:
$ echo "cd myProject/bin ; ./mynumbercruncher" | qsub
188.licossrv4.epfl.ch
This changes to the directory myProject/bin, located in my home directory, and executes the program called mynumbercruncher. By default, the output of the program is written to two files called STDIN.oXXX and STDIN.eXXX, for standard output and standard error respectively. XXX stands for the job id (188 in the above example). You can change the output file names with the options -o filename for standard output and -e filename for standard error, or use -j oe to merge standard error into standard output.
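To make the naming rule concrete, here is a small sketch that derives the default file names from the identifier printed by qsub. Both function names are hypothetical helpers, and mynumbercruncher.log is an arbitrary file name chosen for illustration.

```shell
#!/bin/sh
# job_number (hypothetical helper): strip the server name from a job
# identifier such as 188.licossrv4.epfl.ch, leaving 188.
job_number() {
    printf '%s\n' "${1%%.*}"
}

# default_output_files (hypothetical helper): print the default stdout and
# stderr file names for a job, given its name and its identifier.
default_output_files() {
    printf '%s.o%s %s.e%s\n' "$1" "$(job_number "$2")" "$1" "$(job_number "$2")"
}

default_output_files STDIN 188.licossrv4.epfl.ch

# To collect both streams in a single file of your choosing instead:
#   echo "cd myProject/bin ; ./mynumbercruncher" | qsub -j oe -o mynumbercruncher.log
```

For the job submitted in the example above, this prints STDIN.o188 STDIN.e188.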
Scripts
It is convenient to write the job script in a file, not only because the script can then be reused, but also because qsub options can be set directly inside the script, as in the following example:
$ cat myScript.sh
# lines starting with #PBS are directives for qsub
#PBS -j oe
#PBS -o myScript.out
#PBS -l nodes=1:bit64
cd bin
./bogo
Inside a script, all lines that start with the '#' character are comments, but lines that start with the string '#PBS' are directives for the queueing system. The example above instructs the queueing system to:
#PBS -j oe: put all output messages (messages from the program and execution errors) in a single file.
#PBS -o myScript.out: save all the output generated by the program in a file named myScript.out.
#PBS -l nodes=1:bit64: the program needs at least one node with a 64-bit CPU.
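A slightly fuller script might look like the sketch below. Several things here are assumptions not stated in this guide: the -N option (which names the job), the PBS_O_WORKDIR and PBS_JOBID environment variables that Torque sets for a running job, and the names demo and job_banner, which are purely illustrative.

```shell
#!/bin/sh
#PBS -N demo
#PBS -j oe
#PBS -o demo.out
#PBS -l nodes=1:bit64

# job_banner (hypothetical helper): one line identifying the job in the log.
job_banner() {
    printf 'job %s started in %s\n' "${PBS_JOBID:-not-under-pbs}" "$PWD"
}

# Move to the directory qsub was invoked from (assumed to be exported by
# Torque as PBS_O_WORKDIR). Falls back to the current directory when the
# script is run outside the queueing system.
cd "${PBS_O_WORKDIR:-.}" || exit 1
job_banner

# Launch the actual program here, for example:
#   ./bogo
```

Because of the fallbacks, the script can be tried directly from a shell before it is submitted with qsub.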
Many options are available for the qsub command. The most important are the following:
-q queue_name: forces the job to run on a specific queue. At present the queue is selected automatically according to the resources you request for the job. We might add more conditions or queues if we see that they are needed.
-l resource_list: defines the resources required by the job and sets a limit on the amount of each resource that can be consumed. For example, a job that needs a lot of memory is dispatched only to a compute node that can offer that amount of memory. The main resources that can be requested are: cput for CPU time (example: -l cput=8:00), pmem for physical memory (example: -l pmem=4gb), and nodes for giving a list of nodes (hostnames or properties) to consider.
The properties available on the various nodes can be listed with the pbsnodes -a command.
For the moment we have defined only the following properties:
bit64: on 64-bit machines.
bit32: on 32-bit machines (mainly needed because of matlab).
matlab: for nodes that can launch matlab simulations.
f10: for nodes with Linux Fedora 10 installed.
Example: qsub -l nodes=1:bit64 (the string 1: is mandatory and means: I need at least one node with property bit64). To specify more than one property, separate the properties with a colon ":". A job that requires a 64-bit CPU and matlab should be submitted using qsub -l nodes=1:bit64:matlab <name of the pbs script>.
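The property names usable in a nodes= request can be collected from the pbsnodes -a listing. The sketch below assumes that each node reports its properties on a line of the form "properties = bit64,matlab"; list_properties is a hypothetical helper name.

```shell
#!/bin/sh
# list_properties (hypothetical helper): read `pbsnodes -a` output on stdin
# and print every distinct node property, one per line.
# Assumes each node reports a line of the form
#   properties = bit64,matlab
list_properties() {
    awk -F' = ' '$1 ~ /^[ \t]*properties$/ {
                     n = split($2, props, ",")
                     for (i = 1; i <= n; i++) print props[i]
                 }' | LC_ALL=C sort -u
}

# Only query the server when pbsnodes is actually installed.
if command -v pbsnodes >/dev/null 2>&1; then
    pbsnodes -a | list_properties
fi
```

Any name it prints can then be appended to the request, as in nodes=1:bit64:matlab.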
<note important> It is very important to specify at least the estimated run time of the job so that the scheduler can optimize machine usage and the overall cluster throughput.
By default, if no time limit is specified, the job is sent to the short queue and killed after one hour.
Please keep in mind that longer jobs are less likely to enter the queue when the cluster load is high. Therefore, don't be lazy and always ask for infinite run time, because your job will remain stuck in the queue. Submitting tons of very short jobs is also not as smart as it might seem, because the start-up and shut-down overheads are intentionally quite long. </note>
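One way to turn a run-time estimate into a resource request is sketched below. It assumes the limit is written as hh:mm:ss; minutes_to_hms is a hypothetical helper and myScript.sh stands for your own job script.

```shell
#!/bin/sh
# minutes_to_hms (hypothetical helper): convert an estimate in whole minutes
# (a plain decimal number without leading zeros) to hh:mm:ss.
minutes_to_hms() {
    printf '%02d:%02d:00\n' "$(( $1 / 60 ))" "$(( $1 % 60 ))"
}

# Example: a job expected to need about 90 minutes of CPU time.
# The command is only printed here, not executed.
limit=$(minutes_to_hms 90)
echo "qsub -l cput=$limit myScript.sh"
```

This prints qsub -l cput=01:30:00 myScript.sh. Leaving some margin above the real estimate avoids having the job killed just before it finishes.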
-a date_time: declares the time after which the job is eligible for execution.
The date_time argument is in the form [[[[CC]YY]MM]DD]hhmm[.SS], where CC is the first two digits of the year (the century), YY is the last two digits of the year, MM is the two-digit month, DD is the day of the month, hh is the hour, mm is the minute, and the optional SS is the seconds.
If the month, MM, is not specified, it defaults to the current month if the specified day, DD, is in the future. Otherwise, the month is set to next month. Likewise, if the day, DD, is not specified, it defaults to today if the time hhmm is in the future. Otherwise, the day is set to tomorrow. For example, if you submit a job at 11:15am with -a 1110, the job will be eligible to run at 11:10am tomorrow.
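The defaulting rule for a bare hhmm value can be expressed in a few lines. This is only a sketch of the rule as described above, not of how qsub implements it; eligible_day is a hypothetical helper, and treating a time equal to the current time as "not in the future" is an assumption.

```shell
#!/bin/sh
# eligible_day (hypothetical helper): given the current time ($1) and the
# value passed to -a ($2), both as 4-digit hhmm, report whether the job
# becomes eligible today or tomorrow.
# A time equal to the current time is treated as not in the future
# (assumption).
eligible_day() {
    if [ "$2" -gt "$1" ]; then
        echo today
    else
        echo tomorrow
    fi
}

eligible_day 1115 1110   # the example from the text: submitted 11:15, -a 1110
eligible_day 1115 1200   # submitted 11:15, -a 1200
```

The first call prints tomorrow and the second prints today, matching the example in the text.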
Here you can find some useful PBS scripts that can be used as a starting point:
Script | Execute with |
---|---|
Script example that sends mail messages when the program starts/ends running | qsub [qsub options] base.pbs |
Basic script example | qsub [qsub options] nomail.pbs |
Script example for running matlab computations | qsub -l nodes=1:matlab [qsub options] matlab.pbs |
Script example for running Mathematica computations | qsub [qsub options] mathematica.pbs |
Script example for Windows programs (executed under wine) | qsub [qsub options] wine.pbs |
See the man page for more details.
qdel
When you submit a job, the system returns a number that serves as a reference to the job. To delete the job, just run the qdel command followed by the number of the job you want to delete.
damir@lthipc1:~$ qdel 236
You can also indicate more than one job number:
damir@lthipc1:~$ qdel 236 237 241
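When many jobs need to be removed at once, the list of numbers can be taken from qstat -a. The sketch below assumes the qstat -a column layout shown earlier on this page and that the user name in that listing matches the output of id -un; job_ids_for is a hypothetical helper.

```shell
#!/bin/sh
# job_ids_for (hypothetical helper): read `qstat -a` output on stdin and
# print the numeric ids of the jobs owned by user $1 that are in state $2.
# Assumes the qstat -a layout: user in field 2, state (S column) in field 10.
job_ids_for() {
    awk -v user="$1" -v state="$2" \
        '$1 ~ /^[0-9]+\./ && $2 == user && $10 == state { sub(/\..*/, "", $1); print $1 }'
}

# Delete all of your own queued (Q) jobs, if there are any.
if command -v qstat >/dev/null 2>&1 && command -v qdel >/dev/null 2>&1; then
    ids=$(qstat -a | job_ids_for "$(id -un)" Q)
    if [ -n "$ids" ]; then
        # $ids is left unquoted on purpose so each id becomes one argument.
        qdel $ids
    fi
fi
```

Running the pipeline without the qdel step first shows exactly which jobs would be removed.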
BUG
There is a bug in PBS that sometimes appears when the server tries to stop a running job but the node where the job is running does not respond (e.g. because it has crashed). When this happens, the server starts sending you a lot of identical mail messages telling you that it had to kill your job because it exceeded the time limit. If you start to receive the same message over and over about the same job ID, please contact a sysadmin. Thanks.