Torque version 3.0.2 is installed on the delta cluster. The delta cluster consists of seven compute nodes, each with 16 cores. Torque is configured to let users run compute jobs within a single node and across nodes.
How to use Torque
Users can log on to the delta cluster and use the qsub command to submit their jobs.
For C shell users:
Add the following lines to your .cshrc file:
set path=(/opt/torque-3.0.2/bin $path)
set path=(/opt/torque-3.0.2/sbin $path)
Then run the command source .cshrc
For bash shell users:
Add the following lines to your .bashrc file:
export PATH=/opt/torque-3.0.2/bin:$PATH
export PATH=/opt/torque-3.0.2/sbin:$PATH
Then run the command source .bashrc
Important note: First-time users in particular should check that passwordless ssh login to the other compute nodes works. Run the command csh /admin/pass_delta from the head node before running multi-node jobs.
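One way to confirm that passwordless login works is to attempt a non-interactive ssh to each node. The function below is only a sketch: the function name check_passwordless is made up for illustration, and the node names in the usage line are placeholders, since the actual compute node hostnames are not listed here.

```shell
# Try a non-interactive login to each node given as an argument.
# BatchMode=yes makes ssh fail instead of prompting for a password,
# so a failure means a password is still required or the node is unreachable.
check_passwordless() {
    for node in "$@"; do
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$node" true 2>/dev/null; then
            echo "$node: ok"
        else
            echo "$node: password still required or node unreachable"
        fi
    done
}

# Usage (placeholder node names, replace with the real compute nodes):
#   check_passwordless node1 node2
```

If any node does not report ok, run csh /admin/pass_delta from the head node again before submitting a multi-node job.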
The queue configuration on the delta cluster:
qp16: Use this queue for jobs that need 16 processors. The maximum walltime for this queue is 24 hours. To have a job routed to this queue, put the following line in your script file:
#PBS -l nodes=1:ppn=16:typical
batch: This is the default queue in which all jobs are placed when they are submitted. Its purpose is to route each job to the appropriate queue based on the parameters specified in the job script.
Note: Users cannot submit their jobs directly to a particular queue. All jobs are routed through batch.
How to submit jobs
Torque is configured on delta to allow users to submit compute-intensive parallel jobs using MVAPICH2. In Torque, jobs are submitted through queues.
To submit a job using Torque:
A sample script file can look like this:
#!/bin/bash
#PBS -N jobname
#PBS -l nodes=x:ppn=16:typical
#PBS -l walltime=24:00:00
#PBS -e /path_of_executable/error.log

cd /path_of_executable
NPROCS=`wc -l < $PBS_NODEFILE`
HOSTS=`cat $PBS_NODEFILE | uniq | tr '\n' "," | sed 's|,$||'`
mpirun -np $NPROCS --host $HOSTS /name_of_executable

Here x is the number of nodes: 1 for 16 CPUs, 2 for 32 CPUs and 4 for 64 CPUs. The node property after ppn=16 is either typical or debug. The script must run under a Bourne-type shell such as bash, since the NPROCS= and HOSTS= assignments are not valid csh syntax.
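To see what the NPROCS and HOSTS lines of the sample script compute, they can be tried outside a job against a stand-in node file. This is a sketch only: Torque normally creates the file named by $PBS_NODEFILE with one line per allocated core, and the node names node1 and node2 below are invented for illustration.

```shell
# Stand-in for the node file Torque creates: one line per allocated core.
# node1 and node2 are placeholder hostnames.
PBS_NODEFILE=$(mktemp)
printf 'node1\nnode1\nnode2\nnode2\n' > "$PBS_NODEFILE"

# Total number of cores granted to the job (number of lines in the file).
NPROCS=$(wc -l < "$PBS_NODEFILE")

# Comma-separated list of distinct hosts, with the trailing comma removed.
HOSTS=$(uniq "$PBS_NODEFILE" | tr '\n' ',' | sed 's|,$||')

echo "NPROCS=$NPROCS"
echo "HOSTS=$HOSTS"

rm -f "$PBS_NODEFILE"
```

With this four-line file, NPROCS counts four processes and HOSTS lists the two distinct nodes, which is the form mpirun receives through -np and --host in the sample script.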
Sample Job Scripts:
Note: Local scratch space, /localscratch/<loginid>, is available for use while a job is running. Users must access this space through job scripts only. Files older than 10 days in this area will be deleted. Please do not install any software in this area.
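A job script might use the local scratch area along the following lines. This is a hedged sketch, not a site-provided recipe: the helper name make_job_scratch and the per-job subdirectory layout are assumptions, and results.out is a placeholder file name. PBS_JOBID and PBS_O_WORKDIR are environment variables that Torque sets inside a running job.

```shell
# Create a private working directory for one job under the scratch root
# and print its path. The root defaults to /localscratch; the per-job
# subdirectory is an assumed layout, not a site requirement.
make_job_scratch() {
    root=${1:-/localscratch}
    dir="$root/${USER:-$(id -un)}/${PBS_JOBID:-manual}"
    mkdir -p "$dir" && echo "$dir"
}

# Inside a job script (paths and file names are placeholders):
#   WORK=$(make_job_scratch)
#   cd "$WORK" && /path_of_executable/name_of_executable
#   cp results.out "$PBS_O_WORKDIR"/ && rm -rf "$WORK"
```

Copying results back and removing the directory at the end of the job avoids relying on the 10-day cleanup.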
Commonly used Torque commands
1. To check the status of the job
This gives details of the job, such as the job number and the queue through which it was submitted.
2. To remove a job from the queue
3. To know about the available queues
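The three tasks above correspond to standard Torque client commands: qstat for job status, qdel for removing a job, and qstat -q for listing queues. The sketch below assumes those are the commands intended here. The job ID 1234.delta is hypothetical, and job_state is an illustrative helper, not part of Torque.

```shell
# Hypothetical job ID; qsub prints the real one when the job is submitted.
JOBID="1234.delta"

# Print only the state letter (for example Q = queued, R = running,
# C = completed) from the full status listing of one job.
job_state() {
    qstat -f "$1" | awk -F' = ' '/job_state/ { print $2 }'
}

# Only run the commands where the Torque client tools are on the PATH.
if command -v qstat >/dev/null 2>&1; then
    qstat -f "$JOBID"      # 1. full details of one job
    job_state "$JOBID"     #    just its state letter
    qstat -q               # 3. available queues and their limits
    # qdel "$JOBID"        # 2. remove the job (commented out on purpose)
fi
```

Running qstat with no arguments lists all jobs currently known to the server, which is a quick way to find a job ID.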
Report problems to :