Using the PBS / Torque queueing environment

From Centre for Bioinformatics and Computational Biology

Revision as of 11:00, 11 April 2018

The main commands for interacting with the Torque environment are:

  • qstat View queued jobs.
  • qsub Submit a job to the scheduler.
  • qdel Delete one of your jobs from the queue.
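
For example, a typical round trip with these commands looks as follows (the job ID shown is illustrative):

qsub example.job     # submit the job; the scheduler prints a job ID such as 12345
qstat -u USERNAME    # list your own queued and running jobs
qdel 12345           # remove job 12345 from the queue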


Job script parameters

Parameters for any job submission are specified as #PBS comments in the job script file or as options to the qsub command. The essential options for the cluster include:

-l select=10:ncpus=24:mpiprocs=24


sets the size of the job in numbers of nodes and processors:

  • select=N number of nodes needed
  • ncpus=N number of cores per node
  • mpiprocs=N number of MPI ranks (processes) per node
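
Used as a #PBS directive in a job script, the request above asks for 10 nodes with 24 cores each, i.e. 10 x 24 = 240 cores in total, and starts 24 MPI ranks on each node, giving 240 MPI ranks overall:

#PBS -l select=10:ncpus=24:mpiprocs=24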

-l walltime=4:00:00
 
sets the total expected wall clock time in hours:minutes:seconds. Note the wall clock limits for each queue.
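
For example, a job expected to need at most 30 minutes would request

#PBS -l walltime=0:30:00

while walltime=48:00:00 requests two full days, provided the chosen queue allows it.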
 
The job size and wall clock time must be within the limits imposed on the queue used:
 
-q normal

specifies the queue to which the job is submitted.
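
The available queues and their limits can usually be inspected from the login node, for example with (the output format differs between Torque and PBSPro versions):

qstat -q          # summary of all queues and their limits
qstat -Qf normal  # full details for the normal queue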

Example job scripts

A program using 14 cores on a single node:

#!/bin/bash
# request 1 node with 14 cores
#PBS -l nodes=1:ppn=14
# expected maximum run time of 8 hours
#PBS -l walltime=8:00:00
# submit to the normal queue
#PBS -q normal
# email when the job aborts (a) or ends (e)
#PBS -m ae
#PBS -M your.email@address

module load bowtie2-2.3.4.1
# run bowtie2 (add the index, read files and other arguments for your analysis)
bowtie2


Assuming the above job script is saved as the text file example.job, the command to submit it to the PBSPro scheduler is:

qsub example.job

No additional parameters are needed for the qsub command, since all of the PBS parameters are specified within the job script file.
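
If you prefer, the same options can instead be supplied on the qsub command line, where they typically override the corresponding #PBS lines in the script. For example (the values shown are illustrative):

qsub -q normal -l walltime=2:00:00 -M your.email@address example.job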

IMPORTANT

Note that in the above job script example the working directory is on the Lustre file system. Do not use your home directory for the working directory of your job. Use the directory allocated to you on the fast Lustre parallel file system:

/mnt/lustre/users/USERNAME/

where USERNAME is replaced by your user name on the CHPC cluster.

Always provide the full absolute path to your Lustre sub-directories. Do not rely on a symbolic link from your home directory.
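
In a job script this typically looks like the following, where my_analysis is a placeholder for whatever sub-directory you have created under your Lustre space:

# change to the job's working directory using the full absolute Lustre path
cd /mnt/lustre/users/USERNAME/my_analysis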

Hybrid MPI/OpenMP

For example, to request an MPI job on one node with 12 cores per MPI rank, so that each MPI process can launch 12 OpenMP threads, change the -l parameters:

#PBS -l select=1:ncpus=24:mpiprocs=2:nodetype=haswell_reg

There are two MPI ranks, so the job is launched with mpirun -n 2 … (see the sketch below).
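
A minimal sketch of such a hybrid job script is given below; the project name, queue, working directory and program name are placeholders that you must replace with your own:

#!/bin/bash
#PBS -l select=1:ncpus=24:mpiprocs=2:nodetype=haswell_reg
#PBS -P PRJT1234
#PBS -q normal
#PBS -l walltime=4:00:00
#PBS -m ae
#PBS -M your.email@address

# 24 cores shared between 2 MPI ranks leaves 12 OpenMP threads per rank
export OMP_NUM_THREADS=12

cd /mnt/lustre/users/USERNAME/hybrid_run
mpirun -n 2 ./my_hybrid_program > hybrid.out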

Example interactive job request

To request an interactive session on a single node, the full command for qsub is:

qsub -I -P PROJ0101 -q smp -l select=1:ncpus=24:mpiprocs=24:nodetype=haswell_reg

Note:

  • -I selects an interactive job
  • you still must specify your project
  • the queue must be smp, serial or test
  • interactive jobs only get one node: select=1
  • for the smp queue you can request several cores: ncpus=24
  • you can run MPI code: indicate how many ranks you want with mpiprocs=

If you find your interactive session timing out too soon then add -l walltime=4:0:0 to the above command line to request the maximum 4 hours.
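
That is, the full command becomes:

qsub -I -P PROJ0101 -q smp -l select=1:ncpus=24:mpiprocs=24:nodetype=haswell_reg -l walltime=4:0:0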


Much of this content has been adapted from the Quick Start Guide at CHPC.