Software available on the CIBR Clusters

Most tools available on a standard Linux installation are also available on the clusters.

/usr/local is shared among all of the cluster nodes, so most additional software is installed within that folder. You can find many of the available tools simply by listing /usr/local/bin.
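
For example (a minimal sketch; the exact contents of /usr/local/bin will vary by cluster, and <toolname> is just a placeholder):

ls /usr/local/bin | less
ls /usr/local/bin | grep -i <toolname>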

It is possible to use Matlab on the cluster, one node at a time. There is a special process for setting this up, which takes some effort. If you need to make use of this capability, please contact Larry Mayran (lmayran@bcm.edu).

Here is a list of some other tools we have currently installed for users on specific clusters:

Software    Cluster          Info
EMAN2.1     Sphere, Prism    Installed in /usr/local/EMAN2. Updated at irregular intervals. See below for details.


Software-specific instructions

EMAN2.1

The version of EMAN2.1 installed in /usr/local/EMAN2 is optimized for the cluster hardware, giving it a 10-20% performance boost over the binaries you might download from the website, and it has already been configured to work with MPI on the cluster. To use it, set the following environment variables in your .bashrc file:

export EMAN2DIR=/usr/local/EMAN2
export LD_LIBRARY_PATH=$EMAN2DIR/lib:/usr/local/lib
export PYTHONPATH=$EMAN2DIR/lib:$EMAN2DIR/bin
export PATH=$EMAN2DIR/bin:$EMAN2DIR/examples:/usr/local/bin:$PATH
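
After reloading your shell (or sourcing .bashrc), you can check that the cluster installation is the one being found; e2version.py is assumed to be present in the EMAN2 bin directory:

source ~/.bashrc
which e2refine_easy.py    # should point to /usr/local/EMAN2/bin
e2version.py              # reports the installed EMAN2 version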

To run jobs using MPI on the cluster, run e2refine_easy.py or another program with the --parallel option. For example, if you requested 96 cores from the batch system, you would specify --parallel=mpi:96:/scratch/<username>. You should NOT issue the mpirun command yourself; the individual programs do this for you based on the --parallel option.
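
As a rough sketch, a refinement launched on 96 requested cores might look like the following; the input, model, symmetry, and resolution values here are placeholders, not recommended settings:

e2refine_easy.py --input=sets/myparticles.lst --model=initial_model.hdf \
    --targetres=8 --sym=c1 --parallel=mpi:96:/scratch/<username>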

Additionally, for programs that accept it, you should specify the --threads=<N> option. Here N is NOT the number of MPI tasks, but the number of threads to use on each node. On Sphere, 30 is a good number, assuming you have used the 'bynode' queue. On Prism, 24 is a good number. While Prism nodes have only 16 physical cores, they gain a modest (10-20%) performance boost from launching about 50% more threads than physical cores.
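
Putting these pieces together, here is a minimal sketch of what a job script such as myslurmjob.sh might look like for a 96-core run on Prism (6 nodes x 16 physical cores). The SBATCH values, file names, and refinement options are assumptions and will need adjusting for your project and the cluster's actual queue configuration:

#!/bin/bash
#SBATCH --job-name=eman2-refine       # placeholder job name
#SBATCH --nodes=6                     # 6 Prism nodes x 16 physical cores = 96 cores
#SBATCH --ntasks-per-node=16
#SBATCH --time=48:00:00               # placeholder time limit

# 96 MPI tasks via --parallel, plus 24 threads per node per the Prism guidance above
e2refine_easy.py --input=sets/myparticles.lst --model=initial_model.hdf \
    --targetres=8 --sym=c1 \
    --parallel=mpi:96:/scratch/<username> --threads=24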

Remote GUI display from the cluster is not generally feasible, so we suggest running large compute jobs on the cluster and doing any GUI work on your local machine. The easiest way to handle this is to rsync the full project directory between machines. For example, consider this session with a local workstation and the Sphere cluster:

local> pwd
/home/stevel/myproject
local> rsync -avr ../myproject sphere:data
.. rsync output not shown
local> ssh sphere
sphere> cd data/myproject
sphere> vi myslurmjob.sh
.. edit the job file
sphere> sbatch myslurmjob.sh
.. wait for the job to complete
.. exit the sphere session
local> pwd
/home/stevel/myproject
local> rsync -avr sphere:data/myproject ..
.. rsync output not shown
local> e2projectmanager.py
.. examine the refinement results