Computational Cluster Programs

Hoffman2 GPU Queue

Authorization to use Hoffman2 GPU nodes

To use a Hoffman2 node that has GPUs (graphics processing units), you need to add your account to the gpu group, with permission from the account's sponsor. To do this, point your browser at:

Application for a login id on an ATS-Hosted Cluster

and click Update your profile. It will present a page where you can click the check-box for gpu and then click the Update button. You will receive email from the Grid Identity Manager after your account has been added to the gpu group.

How to access GPU nodes

To use a node that has a GPU, you need to request it from the job scheduler. Nodes have either two GPUs (Tesla T10 nodes) or three GPUs (Tesla M2070 nodes). To begin an interactive session, at the shell prompt, enter:

qrsh -l gpu

The above qrsh command reserves an entire GPU node with its 2 or 3 GPUs. The maximum amount of memory (h_data or mem) that you can request is 24G on the Tesla T10 nodes, or 48G on the Tesla M2070 nodes. An interactive session started with the above qrsh command expires after 2 hours unless you specify a different time limit. The maximum amount of time for a session is 9 hours.

  • To specify a different time limit for your session, use the h_rt or time parameter. Example for requesting 9 hours:
    qrsh -l gpu,h_rt=9:00:00
  • To reserve two nodes, at a login node shell prompt enter:
    qrsh -l gpu -pe dc_gpu 2
  • To see which node(s) were reserved, at a g-node shell prompt enter:
    get_pe_hostfile
  • To see if the gpu nodes are up and/or in use, at any shell prompt enter:
    qhost_gpu_nodes
  • To see the specifics for a particular gpu node, at a g-node shell prompt enter:
    gpu-device-query.sh
  • To get a quick session for compiling or testing your code (this does not give you exclusive use of the gpu node), enter:
    qrsh -l i,gpu
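The session options above can be collected into a small shell function. This is only a sketch: gpu_session_cmd is a hypothetical helper, not a tool installed on Hoffman2, and it assumes that the h_rt and -pe dc_gpu options may be given together on one qrsh command line. It prints the command instead of running it, and rejects time limits outside the 9 hour maximum described above.

```shell
# Hypothetical helper (not a Hoffman2 tool): print a qrsh command for an
# interactive GPU session. It only prints the command; it does not run it.
# Usage: gpu_session_cmd [HOURS] [NODES]
gpu_session_cmd() {
    hours=${1:-2}   # 2 hours matches the expiry of a plain "qrsh -l gpu"
    nodes=${2:-1}   # one node unless more are requested

    # Accept whole numbers only.
    case $hours in
        *[!0-9]*) echo "hours must be a whole number" >&2; return 1 ;;
    esac
    case $nodes in
        *[!0-9]*) echo "nodes must be a whole number" >&2; return 1 ;;
    esac

    # The maximum session length is 9 hours.
    if [ "$hours" -lt 1 ] || [ "$hours" -gt 9 ]; then
        echo "hours must be between 1 and 9" >&2
        return 1
    fi
    if [ "$nodes" -lt 1 ]; then
        echo "nodes must be at least 1" >&2
        return 1
    fi

    cmd="qrsh -l gpu,h_rt=${hours}:00:00"
    # Assumption: -pe dc_gpu can be combined with h_rt in one request.
    if [ "$nodes" -gt 1 ]; then
        cmd="$cmd -pe dc_gpu $nodes"
    fi
    printf '%s\n' "$cmd"
}
```

For example, gpu_session_cmd 9 prints the same 9 hour request shown in the list above. Once the printed command looks right, enter it at a login node shell prompt.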

How to Specify GPU Types

There are multiple GPU types available in the cluster. Each type of GPU has a different compute capability, memory size and clock speed, among other things. If your GPU program requires a specific GPU type to run, you need to specify it explicitly. If you do not specify a GPU type, SGE may pick any available GPU for your job. You may need to compile your code on a machine that has the required type of GPU. Currently, the following GPU types are available:

GPU type      Compute Capability   Number of Cores   Global Memory Size   SGE option
Tesla T10     1.3                  240               4.3 GB               -l T10
Tesla M2070   2.0                  448               5.6 GB               -l M2070

The SGE options in the table above can be combined with other SGE options, for example:

qrsh -l gpu,M2070,h_rt=3:00:00
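As a further illustration of combining options, the sketch below pairs a GPU type with a memory request and checks the request against the per-type limits stated earlier (24G on Tesla T10 nodes, 48G on Tesla M2070 nodes). gpu_type_cmd is a hypothetical helper, not part of SGE or Hoffman2, and the sketch assumes that h_data can be listed alongside the GPU type in the same -l request. It prints the command without running it.

```shell
# Hypothetical helper (not part of SGE or Hoffman2): print a qrsh command
# that selects a GPU type and requests memory, refusing requests above the
# limit for that node type. It only prints the command.
# Usage: gpu_type_cmd TYPE MEM_GB      (TYPE is T10 or M2070)
gpu_type_cmd() {
    type_opt=$1
    mem_gb=$2

    case $type_opt in
        T10)   max_gb=24 ;;   # Tesla T10 nodes: at most 24G
        M2070) max_gb=48 ;;   # Tesla M2070 nodes: at most 48G
        *)
            echo "unknown GPU type: $type_opt (expected T10 or M2070)" >&2
            return 1
            ;;
    esac

    # Memory must be a whole number of gigabytes.
    case $mem_gb in
        ''|*[!0-9]*) echo "memory must be a whole number of GB" >&2; return 1 ;;
    esac
    if [ "$mem_gb" -lt 1 ] || [ "$mem_gb" -gt "$max_gb" ]; then
        echo "memory for $type_opt must be between 1 and ${max_gb} GB" >&2
        return 1
    fi

    # Assumption: h_data may be combined with the GPU type in one -l list.
    printf 'qrsh -l gpu,%s,h_data=%sG\n' "$type_opt" "$mem_gb"
}
```

For example, gpu_type_cmd M2070 48 prints a request for a Tesla M2070 node with the maximum memory, while gpu_type_cmd T10 48 is rejected because it exceeds the T10 limit.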

CUDA

CUDA is installed in /u/local/cuda/ on the Hoffman2 Cluster. There are several versions available. The most recent as of May 2011 is 4.0. You can refer to the current production version with /u/local/cuda/current/. To install CUDA in your home directory, please see the instructions in the /u/local/cuda/README_ATS file. To install the NVIDIA GPU Computing Software Development Kit in your home directory, please see the instructions in the /u/local/cuda/README_SDK_ATS file.
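When compiling for a specific GPU type, the compute capability from the table of GPU types corresponds to an nvcc -arch value (sm_13 for 1.3, sm_20 for 2.0). The sketch below shows one way to build such a compile command. It assumes the usual CUDA layout, with nvcc in a bin/ subdirectory of /u/local/cuda/current/; check the README_ATS file for the layout actually used. The function names and the source file name saxpy.cu are hypothetical.

```shell
# Assumption: nvcc is at bin/nvcc under the CUDA install directory.
CUDA_HOME=/u/local/cuda/current

# Map an SGE GPU option to the nvcc -arch value for its compute capability.
nvcc_arch_for() {
    case $1 in
        T10)   echo sm_13 ;;   # Tesla T10, compute capability 1.3
        M2070) echo sm_20 ;;   # Tesla M2070, compute capability 2.0
        *)     echo "unknown GPU type: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) a compile command for one CUDA source file.
# Usage: nvcc_cmd_for TYPE FILE.cu
nvcc_cmd_for() {
    arch=$(nvcc_arch_for "$1") || return 1
    # The output name is the source name with its .cu suffix removed.
    printf '%s/bin/nvcc -arch=%s -o %s %s\n' \
        "$CUDA_HOME" "$arch" "${2%.cu}" "$2"
}
```

For example, nvcc_cmd_for M2070 saxpy.cu prints a command that compiles saxpy.cu for the Tesla M2070. Run the printed command on a node of the matching GPU type.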

November 2011