Computational Cluster Programs

How to Get an Interactive Session through GE

This page describes how to use the qrsh command to obtain an interactive session on the Hoffman2 Cluster. For general information about qrsh, see the Sun Grid Engine User's Guide. The following descriptions are specific to Hoffman2.

You should use qrsh if you will run your program interactively. You do not need to use the qrsh command to run commercial programs like MATLAB, which already have qrsh built into their startup scripts.

Basic concepts

In the qrsh command, resource requests are specified by the parameters associated with the "-l" directive. Some commonly used parameters for the "-l" directive are:

mem (or h_data)
Requested memory size per CPU. The maximum memory size is 1 gigabyte unless you are a member of a research group shared cluster which has purchased nodes with more memory. The default memory size depends on the queue in which your session starts. For campus users, the default memory size is 1GB.

time (or h_rt)
Wall-clock time limit of the interactive qrsh session. The interactive qrsh session will be closed without warning once it reaches this limit. The maximum time limit is 24 hours unless you are a member of a research group shared cluster and you also use the highp parameter on the "-l" directive. The default time limit depends on the queue in which your session starts. For campus users, the default time limit is 24 hours.

i (or interactive)
Request use of the interactive queues. Each user is limited to 8 CPUs and 24 hours. The default memory size depends on the queue in which your session starts. For campus users, the default memory size is 1GB. The default time limit is 24 hours.

To request multiple CPUs, use the "-pe" directive:

-pe pe_name n

where pe_name is one of the available parallel environments and the integer n is the total number of CPUs requested.

If pe_name is shared or shared_msa, n must be no larger than 8 and all allocated CPUs will be on the same compute node. If pe_name is dc_idre or dc_msa then n may be any number up to your group CPU limit and the allocated CPUs will possibly be distributed across multiple compute nodes.

Use the shared or shared_msa PE if you want to run a large memory or shared memory program, for example, a serial program which needs more than 1GB of memory, or a multi-threaded OpenMP program. Use the dc_idre or dc_msa PE if your program uses MPI distributed memory. Use one of the nthreads or nthreads_msa PEs if your program is both multi-threaded and uses distributed memory.
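The rules above can be sketched as a small pre-flight check. This is only an illustration: check_pe_request is a hypothetical helper, not a Hoffman2 tool, and it covers only the four PEs named explicitly above. It does not know your group CPU limit.

```shell
# Hypothetical helper (not a Hoffman2 tool): sanity-check a "-pe pe_name n"
# request against the rules described above.
check_pe_request() {
  pe=$1
  n=$2
  case "$pe" in
    shared|shared_msa)
      # Single-node PEs: n must be no larger than 8.
      if [ "$n" -gt 8 ]; then
        echo "$pe: n must be no larger than 8 (got $n)" >&2
        return 1
      fi
      echo "$pe $n: all CPUs on one compute node"
      ;;
    dc_idre|dc_msa)
      # Distributed PEs: CPUs may be spread over several nodes.
      echo "$pe $n: CPUs may span multiple compute nodes"
      ;;
    *)
      echo "$pe: not covered by this sketch" >&2
      return 2
      ;;
  esac
}

check_pe_request shared 4
check_pe_request dc_idre 12
```

The function only prints a description of the request; it never contacts GE, so it can be tried on any machine.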

qrsh Examples

Note: The parameters associated with the "-l" directive are separated by commas, without any white space in between.
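If you assemble the "-l" argument in a script, one way to avoid stray white space is to join the individual requests with commas. The sketch below assumes a hypothetical helper named join_resources; it prints the resulting command rather than running it.

```shell
# Hypothetical helper (not a Hoffman2 tool): join resource requests into one
# "-l" argument, separated by commas with no white space.
join_resources() (
  IFS=,
  printf '%s\n' "$*"
)

# Print the command instead of running it, so this also works off the cluster.
echo "qrsh -l $(join_resources i mem=1G time=2:00:00)"
```

The function body runs in a subshell, so changing IFS there does not affect the rest of your script.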

  • Request a single CPU for 24 hours. The default memory size depends on the queue in which your session starts. For campus users, the default memory size is 1GB. For MSA Data Center users, the default memory size may be 1GB or 4GB; use the "-l" directive with the mem or h_data parameter to request more than 1GB.

    qrsh

  • Request a single CPU for 2 hours from the interactive queues.

    qrsh -l i,mem=1G,time=2:00:00

  • Request an entire 8-CPU node for 4 hours (total 8*1G=8GB memory) from the Math Science Data Center interactive queues.

    qrsh -l i,mem=1G,time=4:00:00 -pe shared_msa 8

  • Request 4 CPUs for 3 hours from a single 4-CPU or 8-CPU node from the IDRE Data Center.

    qrsh -l mem=1G,time=3:00:00 -pe shared 4

  • Request 12 CPUs, 1GB of memory per CPU, for 2 hours. The 12 CPUs are distributed arbitrarily across multiple compute nodes. Note that you must add -now n because parallel environments that may allocate processors from more than one node are not available with the default -now y option.

    qrsh -l mem=1G,time=2:00:00 -pe dc* 12 -now n
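The pattern in these examples, including the rule that any PE other than shared or shared_msa needs -now n, can be sketched as a command builder. build_qrsh is a hypothetical name used only for illustration; it prints a command line and does not submit anything.

```shell
# Hypothetical helper (not a Hoffman2 tool): print a qrsh command line for a
# multi-CPU request, appending -now n when the PE may span several nodes.
build_qrsh() {
  # $1 = comma-separated "-l" resource list, $2 = PE name, $3 = number of CPUs
  cmd="qrsh -l $1 -pe $2 $3"
  case "$2" in
    shared|shared_msa) ;;        # single-node PEs work with the default -now y
    *) cmd="$cmd -now n" ;;      # any other PE needs -now n
  esac
  echo "$cmd"
}

build_qrsh mem=1G,time=3:00:00 shared 4
build_qrsh mem=1G,time=2:00:00 dc_idre 12
```

The second call uses dc_idre rather than the dc* wildcard so that the printed command can be pasted into any shell without quoting.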

A qrsh job is scheduled along with all other jobs managed by GE. As with any other GE job, the shorter the time (time resource) and the fewer CPUs (-pe directive) you request, the better your chance of getting a session. Immediate startup is guaranteed for sessions on the interactive queues that request a single CPU.

How to interpret error messages

Occasionally, you may encounter one of the following messages:

error: no suitable queues

or,

qrsh: No match.

or,

Your "qrsh" request could not be scheduled, try again later.

If you see the "no suitable queues" message and you are requesting the interactive queues, be sure you have not requested more than 24 hours. The message may also mean that the parameters you specified are incompatible with one another, so your qrsh session can never start; for example, you requested shared_msa but your userid is not authorized to run on the Math Science Data Center nodes. If you are using a parallel environment other than shared or shared_msa, be sure you have added -now n to your command. Finally, the message may mean that the number of CPUs you requested is not presently available.

If you see the "qrsh: No match." message, you are probably using the tcsh shell and it is objecting to a wildcard, like the asterisk in the parallel environment dc*. Either quote the pattern so that the shell does not try to expand it (for example, -pe "dc*" 12), or use the dc_idre or dc_msa PE, or one of the nthreads or nthreads_msa PEs, instead.

If your session could not be scheduled, first try your qrsh command again in case it was a momentary problem with the qmaster.

If your session still cannot be scheduled and you are requesting no more than 24 hours and 8 cores, add the interactive parameter to your "-l" directive to request the interactive queues.
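Whether a request fits within the interactive limits (no more than 24 hours and 8 CPUs) can be checked before you add the interactive parameter. The sketch below uses two hypothetical helpers, to_seconds and fits_interactive, and assumes the time limit is written as H:MM:SS as in the examples on this page.

```shell
# Hypothetical helpers (not Hoffman2 tools).

# Convert a time limit written as H:MM:SS to seconds.
to_seconds() {
  echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Succeed if the request is within 24 hours (86400 seconds) and 8 CPUs.
fits_interactive() {
  # $1 = time limit as H:MM:SS, $2 = number of CPUs
  [ "$(to_seconds "$1")" -le 86400 ] && [ "$2" -le 8 ]
}

if fits_interactive 2:00:00 4; then
  echo "within the interactive limits; the i parameter can be added to -l"
else
  echo "exceeds the interactive limits"
fi
```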

If your session still cannot be scheduled, try lowering the value for time, the number of CPUs requested, or both. Or try appending -now n to the qrsh command, for example:

qrsh -l i,mem=1G,time=2:00:00 -pe shared 4 -now n

With the -now n option, qrsh will keep trying indefinitely until the requested resources are allocated successfully. This may take a while; GE re-evaluates your request every few minutes. With the -now n option, all the directives, parameters, and parameter values available with the qsub command are also available with qrsh.

How to run an MPI program with qrsh

The following instructions are specific to OpenMPI, the default MPI library on the Hoffman2 Cluster. They may not apply to other MPI implementations.

There are 2 main steps to run an MPI program in a qrsh session. You need to do step #1 only once per qrsh session. You can repeatedly execute step #2 within the same qrsh session. The executable MPI program is named "foo" in the following example.

  1. Set up the environment. In the qrsh session at the shell prompt, enter one of the following commands:
    • If you are in bash shell:
      source /u/local/bin/set_qrsh_env.sh
    • If you are in csh/tcsh shell:
      source /u/local/bin/set_qrsh_env.csh
  2. Launch your MPI program. Assume your MPI program is named "foo" and is located in the current directory. Run the program using all allocated CPUs with the command:

    mpiexec -n $NSLOTS ./foo

    You could replace $NSLOTS with an integer which is less than the number of CPUs you requested on your qrsh command. For example:

    mpiexec -n 4 ./foo

    If your program was linked dynamically, you may need to pass the LD_LIBRARY_PATH environment variable to mpiexec:

    mpiexec -n $NSLOTS -x LD_LIBRARY_PATH ./foo

The command to see more options of mpiexec is:

mpiexec -help

You do not have to create a hostfile and pass it to mpiexec with its -machinefile or -hostfile option because mpiexec automatically retrieves that information from GE.
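When a script chooses the process count, it is worth making sure the count never exceeds the allocated slots. The following sketch assumes $NSLOTS is available in the session after step 1; pick_np is a hypothetical helper, and the fallback value of 8 exists only so the example runs outside a qrsh session.

```shell
# NSLOTS comes from the qrsh session; the fallback of 8 is an assumption made
# only so that this sketch runs elsewhere.
NSLOTS=${NSLOTS:-8}

# Hypothetical helper: use the desired process count, capped at NSLOTS.
pick_np() {
  if [ "$1" -le "$NSLOTS" ]; then
    echo "$1"
  else
    echo "$NSLOTS"
  fi
}

# Print the command rather than launching it.
echo "mpiexec -n $(pick_np 4) ./foo"
```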

Additional tools

Additional scripts are available that may help you run other parallel distributed memory software. You can enter these commands at the compute node's shell prompt.

get_pe_hostfile
Returns the contents of the GE pe_hostfile file for the current qrsh session.

If you have used the -pe directive to request multiple processors on multiple nodes, you will probably need to tell your program the names of those nodes and how many processors have been allocated on each node. This information is unique to your current qrsh session.

To create an MPI-style hostfile named hfile in the current directory:

get_pe_hostfile | awk '{print $1 " slots=" $2}' > hfile

The GE pe_hostfile is located at:

$SGE_ROOT/$SGE_CELL/spool/node/active_jobs/ge_jobid.1/pe_hostfile
or,
$SGE_ROOT/$SGE_CELL/spool/node/active_jobs/ge_jobid.ge_taskid/pe_hostfile
where node and ge_jobid are the hostname and GE $JOB_ID, respectively, of the current qrsh session. ge_taskid is the task number ($SGE_TASK_ID) of an array job.
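To see what the awk conversion does, it can be applied to sample data. The sketch below assumes the usual GE pe_hostfile layout of one line per node (hostname, number of slots, queue instance, processor range); the node and queue names are made up, and to_hostfile and total_slots are hypothetical wrappers.

```shell
# Made-up pe_hostfile contents: hostname, slots, queue instance, processor range.
sample='n1 8 example.q@n1 UNDEFINED
n2 4 example.q@n2 UNDEFINED'

# Same awk program as in the hostfile command, reading from standard input.
to_hostfile() {
  awk '{print $1 " slots=" $2}'
}

# Sum the second column to get the total number of allocated slots.
total_slots() {
  awk '{total += $2} END {print total}'
}

printf '%s\n' "$sample" | to_hostfile
printf '%s\n' "$sample" | total_slots
```

In a real session you would pipe the output of get_pe_hostfile into these functions instead of the sample text.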


get_sge_jobid
Returns the value of GE JOB_ID for the current qrsh session.

get_sge_env
Returns the contents of the GE environment file for the current qrsh session. Used by the set_qrsh_env scripts.

GE-specific environment variables are defined in the file:

$SGE_ROOT/$SGE_CELL/spool/node/active_jobs/ge_jobid.1/environment
or,
$SGE_ROOT/$SGE_CELL/spool/node/active_jobs/ge_jobid.ge_taskid/environment
where node and ge_jobid are the hostname and GE $JOB_ID, respectively, of the current qrsh session. ge_taskid is the task number ($SGE_TASK_ID) of an array job.
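The path pattern shared by the pe_hostfile and environment files can be sketched as a function. active_job_file is a hypothetical helper; the node name, job ID, and the placeholder values for SGE_ROOT and SGE_CELL are invented so the example runs outside the cluster.

```shell
# Hypothetical helper (not a Hoffman2 tool): build the path of a file in the
# active_jobs spool directory of a session.
active_job_file() {
  # $1 = node, $2 = GE job ID, $3 = file name, $4 = task ID (defaults to 1)
  printf '%s/%s/spool/%s/active_jobs/%s.%s/%s\n' \
    "$SGE_ROOT" "$SGE_CELL" "$1" "$2" "${4:-1}" "$3"
}

# Placeholder values, used only if the variables are not already set.
SGE_ROOT=${SGE_ROOT:-/path/to/sge}
SGE_CELL=${SGE_CELL:-default}

active_job_file n1 123456 environment
active_job_file n1 123456 pe_hostfile 3
```

The get_sge_env and get_pe_hostfile commands already return the contents of these files, so a helper like this is only useful when a program needs the path itself.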

If you need assistance using qrsh, please contact the ATS High Performance Computing consultants at atshpc@ucla.edu. The consultants respond to email sent to this address during normal business hours.

March 2010