Torus Policy

Updated December 3, 2013

Overview

Torus is a medium-scale high-performance Linux cluster purchased through a collaborative arrangement between CIBR and the Ludtke, Chiu, and Barth labs. Each group that contributed financially to the purchase is entitled to a proportional share of the cluster's overall compute capacity. In theory, the cluster can provide up to 5,000,000 CPU-hr of computation annually; in practice, it is impossible to keep such a cluster fully loaded all of the time. Typical clusters run at roughly 60-80% of capacity, and allocations reflect this: at 70% utilization, for example, the cluster delivers about 3,500,000 usable CPU-hr per year.

CIBR faculty can acquire time allocations on this cluster simply by requesting them via email to sludtke@bcm.edu. Initial allocations of 10,000 CPU-hr require no formal application, only a request. The request must come from the PI rather than from students or postdocs, since allocations are made on a per-faculty basis and the professor must decide how to divide the time among members of his/her lab. Faculty must be members of CIBR to receive free time on the cluster (membership is free). Once the initial allocation is exhausted, larger allocations may be possible, depending on usage levels and the number of requests in that quarter.

Hardware & Software

Unlike our previous collaborative clusters, Torus is equipped with a QDR InfiniBand interconnect, capable of very high-bandwidth transfers between nodes.
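Multi-node MPI jobs are the main beneficiaries of this fabric. As a quick, hedged check that a node's InfiniBand link is up (this assumes the standard infiniband-diags tools are installed on the nodes, which may not be the case on Torus):

    ibstat | grep -E 'State|Rate'    # expect "State: Active" and "Rate: 40" for a QDR 4x link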

Detailed Information on Using the Cluster

The cluster is administered in a fairly laissez-faire fashion. Generally speaking, all users, paid and free alike, share the same queuing system. While there are specific queues for high- and low-priority as well as long-running jobs, there is no mechanism to preempt a running job: once a job is started, it will run (occupying its assigned resources) until complete. The priority system affects only the order in which jobs are started. Please see the detailed usage documentation for more details. Under no circumstances should compute jobs be executed on the head node, or directly on any compute node without going through the queuing system. It is permissible to run short I/O-intensive jobs on the head node (pre-processing data and the like), since the head node has more efficient storage access.
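As a minimal sketch of what "going through the queuing system" looks like, here is a typical batch submission script. It assumes a PBS/Torque-style scheduler; the actual scheduler on Torus, the queue names, and sensible resource limits may differ, so consult the detailed usage documentation before submitting.

    #!/bin/bash
    # example.pbs - illustrative batch script; the queue name and
    # resource requests below are hypothetical, not Torus's actual setup
    #PBS -N my_job             # job name
    #PBS -q low                # target queue (e.g. low priority)
    #PBS -l nodes=2:ppn=8      # request 2 nodes, 8 cores per node
    #PBS -l walltime=24:00:00  # maximum run time
    cd $PBS_O_WORKDIR          # run from the directory of submission
    mpirun -np 16 ./my_program # 16 MPI ranks across the 2 nodes

Submit with "qsub example.pbs" and check status with "qstat". Because running jobs are never preempted, request only the walltime and core count a job actually needs; an oversized request occupies shared resources until the job finishes.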

It is the user's responsibility to follow all cluster policies. While we try to be understanding of mistakes, we would much rather answer a question than spend two days fixing an accidental problem. Users who intentionally abuse policy may have their accounts temporarily or permanently suspended; in such situations, the user's PI will be consulted.

For Assistance

SysOp: Dwight Noel (dwight.noel@bcm.edu)
