Computational Cluster Programs

How to Run MPI

MPI Versions available on ATS-Hosted Clusters

ATS-hosted clusters use the following implementations of MPI: OpenMPI, from the OpenMPI project, and MPI from Myricom, for Myrinet.

Network       MPI 1      MPI 2
InfiniBand               OpenMPI
Myrinet       Myricom

Cluster    Networks               MPI Implementations   Compilers   Defaults
Hoffman2   InfiniBand, Ethernet   OpenMPI               Intel       OpenMPI and Intel
Cardio     Myrinet                Myricom               Intel       Myricom and Intel

Selecting an MPI Version, Interconnect, and Compiler other than the Default

The default combination for a cluster has been selected to give you the best performance with the hardware and software available on that cluster. Set your path as described in this section only if you want to use an MPI version/compiler combination other than the default.

To use an MPI/compiler combination other than the default, add the appropriate path from the table below to the front of your PATH environment variable. For the bash shell:

export PATH=pathToAdd:$PATH

For the tcsh shell:

setenv PATH pathToAdd:${PATH}

You can select only an MPI/compiler combination that is available on the cluster you are using. See the table above.

MPI Implementation   Compiler   Add this path to the front of your PATH
Open MPI             Intel      /u/local/mpi/openmpi/current/bin
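
As a minimal sketch of the path change described above (bash or another POSIX-style shell), the following prepends the Open MPI directory from the table and then reports which mpicc the shell would pick up. The variable name MPI_BIN is only illustrative, and the check that skips the prepend when the directory is already first is an added convenience, not a site requirement.

```shell
# Directory from the table above (Open MPI built with the Intel compiler).
MPI_BIN=/u/local/mpi/openmpi/current/bin

# Prepend it unless it is already the first PATH entry, so re-sourcing is harmless.
case "$PATH" in
    "$MPI_BIN":*) ;;
    *) PATH="$MPI_BIN:$PATH"; export PATH ;;
esac

# Show which mpicc the shell would now use, if any.
command -v mpicc || echo "mpicc not found on PATH"
```

On a cluster where that directory exists, the last line should print a path beneath it; anywhere else it only prints the fallback message.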

Compiling and Linking MPI Programs

The following commands compile and link MPI programs. They are the same regardless of the MPI version, interconnect, and compiler you are using. If you have not modified your path as described above, you get the default MPI version, interconnect, and compiler for the cluster you are using. Otherwise, what you get is determined by how you have set your path.

Language     Command Used to Compile
Fortran 77   mpif77
Fortran 90   mpif90
C            mpicc
C++          mpiCC or mpic++ or mpicxx

Examples:

mpif90 -o myprog myprog.f90
Compiles myprog.f90 and creates the myprog executable.

mpicc -c myprog.c
Compiles myprog.c and creates the myprog.o object file.

mpiCC -o myprog myprog.o
Links the C++ object file myprog.o with the appropriate MPI libraries and creates the myprog executable.
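
As a rough illustration of the separate compile and link steps above, the following sketch writes a minimal MPI program in C and builds it with mpicc. The file name hello_mpi.c is hypothetical, and the guard around mpicc is there only so that the script fails gracefully on a machine where no MPI wrapper is on the PATH.

```shell
# Write a minimal MPI "hello" program (hypothetical file name).
cat > hello_mpi.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
    printf("rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

# Compile to an object file, then link, only if the wrapper is available.
if command -v mpicc >/dev/null 2>&1; then
    mpicc -c hello_mpi.c && mpicc -o hello_mpi hello_mpi.o
else
    echo "mpicc not found; check your PATH" >&2
fi
```

The resulting hello_mpi executable can then be tested or submitted as described in the sections that follow.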

Running an MPI Program as a Batch Job

There are three ways to submit an MPI batch job to the Hoffman2 Cluster. From easiest to hardest, they are:

  • from the UCLA Grid Portal
  • via the queue script mpi.q
  • by generating an SGE command file for the job and using the SGE commands to submit it.

Instructions are given in Running a Batch Job on an ATS-Hosted Cluster.

Running an MPI Program Interactively for Testing

NOTE: Not all clusters have interactive nodes. You can debug an MPI program interactively only on clusters that do. Alternatively, you can request one or more interactive nodes via qrsh (refer to: How to Get an Interactive Session through SGE).

While you can DEBUG or TEST a parallel program for a short time on the interactive nodes of an ATS-hosted cluster, you cannot do a full run there. After testing, a parallel program must be submitted to the SGE batch queuing system to run on the compute nodes.

To test a parallel program, ssh to one of the interactive nodes, or start an interactive session via qrsh (refer to: How to Get an Interactive Session through SGE), and follow the instructions for the version of MPI you are using.

MPI 2 (using Open MPI)

Create a host file with one line for each process to be run, excluding the local host where you are currently logged on. For example:

  • If you have obtained an interactive session on n different nodes via qrsh (refer to: How to Get an Interactive Session through SGE) on Hoffman2, you can create a host file named hostfile with the following command:
    /u/local/bin/get_pe_hostfile | awk '{print $1" slots="$2}' > hostfile
  • Then issue the following command to run the program:

    mpiexec -machinefile hostfile -n 8 pathToExecutable < input > output
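
To make the awk step above concrete, here is a small sketch that applies the same transformation to a made-up host list. It assumes that get_pe_hostfile prints lines in SGE's usual "host slots queue processor-range" layout; the node names, the queue name, and the file name pe_hostfile.sample are all hypothetical.

```shell
# Hypothetical host list in the assumed "host slots queue processor-range" layout.
printf '%s\n' \
    'n101 4 pod.q@n101 UNDEFINED' \
    'n102 4 pod.q@n102 UNDEFINED' > pe_hostfile.sample

# Same awk transformation as above: keep the host name and its slot count.
awk '{print $1" slots="$2}' pe_hostfile.sample > hostfile

# Sum the slot counts to find the total number of slots granted.
nprocs=$(awk '{n += $2} END {print n}' pe_hostfile.sample)

cat hostfile
echo "total slots: $nprocs"
```

With this sample input the host file contains two lines, "n101 slots=4" and "n102 slots=4", and the total is 8, which matches the -n 8 used in the mpiexec example.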

    MPI 1

    Create a host file identically to the way you would for MPI 2. Then issue the command:

    mpirun -machinefile hostfile -np 8 pathToExecutable < input > output

    Replace hostfile with the name of your hostfile, pathToExecutable with the name of your executable (relative or full path), and input and output with the names of your standard input and output files.

    After your program completes, you MUST enter:

    cleanupmpinodes
    to kill any orphaned MPI processes that may have been left behind. This is especially important if any of your processes terminated abnormally or did not run to completion.
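
One way to avoid forgetting the cleanup step is to wrap the run in a small shell function. This is only a sketch: the function name run_then_cleanup is hypothetical, and the check for cleanupmpinodes is there so that the function still behaves sensibly on a machine where that site-specific command is not installed.

```shell
# Run the given command, then always attempt cleanup, and return the command's exit status.
run_then_cleanup() {
    "$@"
    rtc_status=$?
    if command -v cleanupmpinodes >/dev/null 2>&1; then
        cleanupmpinodes
    else
        echo "cleanupmpinodes not found; run it manually on the cluster" >&2
    fi
    return "$rtc_status"
}

# Example use on a cluster (left commented out here):
# run_then_cleanup mpirun -machinefile hostfile -np 8 ./myprog
```

Because the function returns the exit status of the MPI run rather than that of the cleanup, a failed run is still visible to the caller. Redirections placed on the function call also apply to the cleanup command, so check the output file if cleanupmpinodes prints anything.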

    Using InfiniBand vs. Ethernet (OpenMPI)

    (The following commands work for OpenMPI only.)

    Hoffman2 compute nodes have both InfiniBand and Ethernet interfaces. For MPI codes that run across multiple nodes, InfiniBand is recommended for its higher bandwidth and lower latency. Under normal conditions, OpenMPI automatically chooses InfiniBand for communication when you use the mpirun (or mpiexec) command.

    If you want to use the Ethernet network instead, add the "--mca btl ^openib" flag to the mpirun command, e.g.

    mpirun --mca btl ^openib -n 8 -machinefile hf foo
    where "hf" is the machinefile and "foo" is the executable.

    If you want to make sure that Ethernet is excluded, use the "--mca btl ^tcp" flag.
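
The two flags above can be kept straight with a small helper like the one sketched below. The function name btl_flags and the transport labels "ethernet" and "infiniband" are hypothetical; only the flag strings come from the text above. The sketch prints the resulting command line instead of running it, so it can be inspected first.

```shell
# Map a transport label to the OpenMPI flags quoted above.
btl_flags() {
    case "$1" in
        ethernet)   echo '--mca btl ^openib' ;;   # exclude InfiniBand, so Ethernet is used
        infiniband) echo '--mca btl ^tcp' ;;      # exclude TCP, so Ethernet is not used
        *)          echo '' ;;                    # no flag: let OpenMPI choose
    esac
}

# Print the command line that would be run over Ethernet.
echo mpirun $(btl_flags ethernet) -n 8 -machinefile hf foo
```

This prints "mpirun --mca btl ^openib -n 8 -machinefile hf foo", the same command as in the example above.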