sge
Differences
This shows you the differences between two versions of the page.
sge [2010/07/02 08:31] – cangiani | sge [2011/01/26 09:10] – damir | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== | + | ====== |
Our cluster runs [[http:// | Our cluster runs [[http:// | ||
- | | + | |
- | The queueing | + | The queuing |
- | As a user you can take advantage of many more machines for your urgent calculations and get results faster. On the other hand, since the machines you are using are not always owned by your group, try to be as fair as possible and respect the needs of other users: if you notice that the cluster is overloaded ('' | + | As a user you can take advantage of many more machines for your urgent calculations and get results faster. On the other hand, since the machines you are using are not always owned by your group, try to be as fair as possible and respect the needs of other users: if you notice that the cluster is overloaded (using the commands |
- | We have configured the system almost without access restriction because the queueing | + | We have configured the system almost without access restriction because the queuing |
+ | < | ||
+ | Experience with actual usage has forced us to introduce some limitations: | ||
+ | - The maximum number of jobs per user is between 130 and 150, depending on the other resources you request. This limit can be adjusted depending on the load of the cluster; please ask your sysadmin for such changes. | ||
+ | - It's mandatory to specify how much memory your job will need. | ||
+ | - It's mandatory to specify how much time your job will need to complete. | ||
+ | - Jobs that need to run for more than 120 hours have lower priority than other jobs. | ||
+ | </ | ||
===== Mini User Guide ===== | ===== Mini User Guide ===== | ||
Line 21: | Line 28: | ||
[root@licossrv4 server_priv]# | [root@licossrv4 server_priv]# | ||
- | server: | + | server: |
Queue Memory CPU Time Walltime Node Run Que Lm State | Queue Memory CPU Time Walltime Node Run Que Lm State | ||
Line 120: | Line 127: | ||
* '' | * '' | ||
* '' | * '' | ||
- | * '' | + | * '' |
* '' | * '' | ||
* '' | * '' | ||
The properties available on the various nodes can be listed with the '' | The properties available on the various nodes can be listed with the '' | ||
- | For the moment we have defined | + | For the moment we have defined |
* '' | * '' | ||
* '' | * '' | ||
Line 130: | Line 137: | ||
* '' | * '' | ||
* '' | * '' | ||
+ | * '' | ||
* '' | * '' | ||
- | Example **qsub -l nodes=1: | + | * '' |
+ | Example **qsub -l nodes=1: | ||
<note important> | <note important> | ||
- | It is very **important | + | It is **mandatory** |
By default, if no time limit is specified, the job is sent to the '' | By default, if no time limit is specified, the job is sent to the '' | ||
Line 168: | Line 177: | ||
- Compile two versions of your code (32 and 64 bit); | - Compile two versions of your code (32 and 64 bit); | ||
- | - name the two executables | + | - name the two executable |
- in your pbs script use '' | - in your pbs script use '' | ||
Line 193: | Line 202: | ||
There is a bug in pbs that sometimes appears when the server tries to stop a running job but the node where the job is running does not respond (e.g. because it crashed). When this happens, the server starts sending you a lot of identical mail messages telling you that it had to kill your job because it exceeded the time limit. If you start to receive the same message over and over about the same JOB ID, please contact a sysadmin. Thanks. | There is a bug in pbs that sometimes appears when the server tries to stop a running job but the node where the job is running does not respond (e.g. because it crashed). When this happens, the server starts sending you a lot of identical mail messages telling you that it had to kill your job because it exceeded the time limit. If you start to receive the same message over and over about the same JOB ID, please contact a sysadmin. Thanks. | ||
+ | |||
===== Tips and Tricks ===== | ===== Tips and Tricks ===== | ||
=== Delete all queued jobs === | === Delete all queued jobs === |
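
The body of this tip is not shown in the diff. As a minimal sketch only, and assuming the cluster provides the standard PBS/Torque `qselect` and `qdel` commands alongside the `qsub` used elsewhere on this page, deleting all of your own queued jobs could look like the following. The helper name `qdel_commands` is hypothetical and is not part of PBS; it only prints the commands so they can be reviewed before anything is deleted.

```shell
#!/bin/sh
# Sketch under assumptions: a PBS/Torque installation with qselect and qdel.
# qdel_commands is a hypothetical helper, not a PBS command.

# Read job IDs from stdin (one per line) and print one "qdel <jobid>"
# command per ID. Blank lines are skipped. Nothing is deleted here.
qdel_commands() {
    while IFS= read -r jobid; do
        [ -n "$jobid" ] && printf 'qdel %s\n' "$jobid"
    done
    return 0
}

# Dry run: list the commands for your jobs that are still queued (state Q).
#   qselect -u "$USER" -s Q | qdel_commands
#
# Real run: once the list looks right, pipe it to the shell.
#   qselect -u "$USER" -s Q | qdel_commands | sh
```

Running the dry run first is a deliberate choice: `qdel` cannot be undone, and selecting on state `Q` leaves jobs that are already running untouched.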
sge.txt · Last modified: 2015/11/16 10:18 by 127.0.0.1