Directions for using Modules on the NMSU CS Bigdat cluster
written by Jonathan Cook, May 29, 2014, joncook@nmsu.edu

Modules are a very convenient mechanism for configuring your environment both to develop and to run HPC applications on clusters. They can be used to add packages to your environment for any purpose, but we mostly use them to select which compiler and MPI versions to use.

The "module" command is the fundamental interface to Modules. It takes a subcommand; the most common you will use are:

   Command                  Description
   module help              list help info
   module avail             list all available modules in the system
   module list              list modules currently loaded in your environment
   module load <module>     load the specified module into your environment
   module unload <module>   unload the specified module
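
As a rough illustration of how these subcommands fit together, here is a sketch of a typical session. The module name is one that appears in the "module avail" listing for our system; the check for whether "module" exists is only there so the snippet fails politely in a shell where Modules has not been set up (an assumption on my part that this can happen, e.g., in a non-login shell).

```shell
# Sketch of a typical session. The module name is taken from the
# "module avail" listing; substitute whichever module you need.
mpi_module="mvapich2-2.0/intel"

if type module >/dev/null 2>&1; then
    module avail                   # what is installed on the system
    module load "$mpi_module"      # add that compiler/MPI setup to your environment
    module list                    # confirm it is now loaded
    module unload "$mpi_module"    # remove it again when you want to switch
else
    # Assumption: "module" is missing because Modules was not initialized here.
    echo "module command not found in this shell" >&2
fi
```

Loading and unloading only affect the current shell, so a new login or a batch job has to load its modules again.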

The "module avail" command shows the following modules on our system:

   ---------------- /usr/share/Modules/modulefiles ------------------------
   dot         module-git  module-info modules     null        use.own

   ---------------- /etc/modulefiles --------------------------------------
   compat-openmpi-psm-x86_64 compat-openmpi-x86_64

   ---------------- /act/modulefiles --------------------------------------
   gcc-4.7.2              mvapich/intel          openmpi-1.6/gcc-4.7.2
   mpich/gcc              mvapich2-2.0/gcc       openmpi-1.6/intel
   mpich/gcc-4.7.2        mvapich2-2.0/gcc-4.7.2 openmpi-1.7/gcc
   mpich/intel            mvapich2-2.0/intel     openmpi-1.7/gcc-4.7.2
   mvapich/gcc            open64                 openmpi-1.7/intel
   mvapich/gcc-4.7.2      openmpi-1.6/gcc

I have no idea what the /usr/share and /etc modules do (you might be able to guess from their names). The /act modules are the compiler and MPI version configurations; a quick description of these follows.

1. gcc-4.7.2 is a more recent version of Gnu C/C++/Fortran than the one that comes in the standard CentOS distribution. Without this module, gcc/cc is v4.4.7. I would recommend using this newer version if you are going to use Gnu for your development. NOTE: YOU MUST load the module gcc-4.7.2 before loading any of the /gcc-4.7.2 modules, if you are going to use gcc with MPI; you need this at runtime, too.

2. Each MPI library (MPICH, MVAPICH, OpenMPI) has three separate modules depending on the compiler you choose: the default gcc (4.4.7), the updated gcc (4.7.2), and the Intel compilers. Additionally, there are two versions each of MVAPICH (mvapich and mvapich2-2.0) and OpenMPI (1.6 and 1.7).

3. Confused yet? In general, I would recommend using the latest versions and dropping back to older ones only if you are having problems. I would recommend NOT using MPICH at all -- it will not use Infiniband, only the much slower Ethernet interconnect. The Intel compilers are generally considered to produce faster code than Gnu, so mvapich2-2.0/intel or openmpi-1.7/intel are good choices. But the Gnu compilers are good standbys, so if you know and like them, I would recommend either mvapich2-2.0/gcc-4.7.2 or openmpi-1.7/gcc-4.7.2.

4. NOTE that the MPI modules give you the MPI compiler wrappers: mpicc, mpic++, mpif90, etc., along with the basic compiler names (gcc/g++/gfortran for Gnu, icc/icpc/ifort for Intel). So even if you are not using MPI, you can load, e.g., mvapich2-2.0/intel and then use the plain Intel compilers.

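5. All compilers should support OpenMP directives, but you have to turn them on with a compiler flag: -fopenmp for the Gnu compilers, and -openmp (or -qopenmp in newer releases) for the Intel compilers.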

6. The "open64" module currently does nothing. If you are interested in having the Open64 compiler installed, contact me (see open64.net for more info).
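
Notes 1 and 4 above can be sketched as two small shell helpers. This is only a sketch under assumptions: the function names load_gcc472_mpi and build_mpi_program, and the file name hello_mpi.c used below, are made up for illustration and are not part of Modules or of the cluster setup. The module names come from the "module avail" listing.

```shell
# Hypothetical helper (not part of Modules): load gcc-4.7.2 first, then an
# MPI module built with it, in the order note 1 requires.
load_gcc472_mpi() {
    case "$1" in
        */gcc-4.7.2) ;;   # only accept the .../gcc-4.7.2 variants
        *) echo "expected a .../gcc-4.7.2 module, got: $1" >&2; return 1 ;;
    esac
    module load gcc-4.7.2 && module load "$1"
}

# Hypothetical helper: compile an MPI C source file with the mpicc wrapper
# that the loaded MPI module provides (note 4).
build_mpi_program() {
    if ! command -v mpicc >/dev/null 2>&1; then
        echo "mpicc not found; load an MPI module first" >&2
        return 1
    fi
    # foo.c is compiled to an executable named foo
    mpicc -O2 -o "${1%.c}" "$1"
}
```

With these defined, "load_gcc472_mpi openmpi-1.7/gcc-4.7.2" followed by "build_mpi_program hello_mpi.c" would set up the environment and build an executable named hello_mpi. Because the modules are needed at runtime too, the same load line would have to be repeated in whatever shell or job script launches the program.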