cibrclusters

Differences

This shows you the differences between two versions of the page.

Previous revision: cibrclusters [2025/07/05 21:10] – steveludtke
Current revision: cibrclusters [2025/07/05 21:32] – steveludtke

Line 1:
- [[CIBRClusters|Return to the CIBR Cluster …]]
- ===== Introduction =====
- ==== Glossary & Basic Concepts ====
+ ====== CIBR Clusters ======
+ **WARNING:** The existing CIBR clusters are now >10 years old, and we no longer know how long the room housing them will remain operational due to BCM cuts. If you have material on the CIBR Clusters, make sure you keep a copy elsewhere.
+ ----
- First, a little terminology. Here is the hierarchy of computing resources on a modern cluster (you may find some disagreement about some of these specific definitions):
-   * //Server// - …
-   * //Node// - …
-   * //Processor// - …
-   * //Core// - …
-   * //Thread// - A fairly recent concept. Each //core// can do several non-overlapping things at the same time. Sometimes some of these resources are idle due to conflicts between sequential operations. //Threads// allow you to recover some of this wasted capacity by running 2 jobs on each //core//. On current-generation processors, using this //thread// system can recoup as much as 10-20% of otherwise wasted capacity. However, using it is tricky. You will also often see vendors (like Amazon) advertising //threads// as if they were //cores//. This is patently false.
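To see the core/thread distinction above on an actual machine, here is a minimal sketch using the standard Linux tools ''lscpu'' and ''nproc''; nothing in it is specific to the CIBR clusters:

<code bash>
# Physical cores = sockets x cores-per-socket; nproc reports hardware threads.
sockets=$(lscpu | awk -F: '/^Socket\(s\)/ {print $2+0}')
cores_per_socket=$(lscpu | awk -F: '/^Core\(s\) per socket/ {print $2+0}')
echo "physical cores:   $(( sockets * cores_per_socket ))"
echo "hardware threads: $(nproc)"
</code>

If the second number is twice the first, the node exposes 2 //threads// per //core//; remember that this does not mean 2x the compute capacity.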
- Non-CPU terms:
-   * //RAM// - …
-   * //Swap// - A concept used to avoid memory exhaustion on computers. When the //RAM// becomes full, the computer may temporarily store some information on the hard drive in specially allocated //swap space//. Note, however, that a typical hard drive can read/write information at ~150 MB/sec, while //RAM// can read/write information at ~75,000 MB/sec. Forcing a modern computer to work from swap can therefore slow memory access by a factor of ~500.
-   * //Infiniband// - …
-   * //Gigabit// - Normal network used between nodes (and in the computers in your lab); can transfer ~100 MB/sec. The connection into the cluster from the outside world is a Gigabit connection.
-   * //RAID// - A type of disk storage combining multiple hard drives into a single larger 'virtual' drive, providing more capacity, speed, and/or redundancy than a single drive.
-   * //RAID5// - With N hard drives, provides read performance improvement of up to N times, can tolerate 1 drive failing at once, and provides N-1 drives worth of space.
-   * //RAID6// - Same as RAID5, but can tolerate 2 drives failing at once, and provides N-2 drives worth of space.
-   * //NFS// - Networked FileSystem. Makes a hard drive or folder on one machine available to other machines over the network.
+ **NOTE:** The CIBR cluster Co-op strives to maximize CPU/cost, and BCM no longer provides any staff support for cluster operations. Dr. Ludtke's group currently handles administration on a volunteer basis.
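To make the RAID5/RAID6 trade-off above concrete, a back-of-the-envelope calculation with made-up numbers (an 8-drive array of 4 TB disks; both figures are purely illustrative):

<code bash>
N=8; DRIVE_GB=4000   # hypothetical: 8 drives of 4 TB each
echo "RAID5 usable: $(( (N-1) * DRIVE_GB )) GB, survives 1 failed drive"
echo "RAID6 usable: $(( (N-2) * DRIVE_GB )) GB, survives 2 failed drives"
</code>

The extra parity drive in RAID6 costs one drive's worth of capacity here, in exchange for surviving a second failure while the first failed drive is rebuilding.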
- === CPU resources ===
- If we consider the Prism cluster, for example, it contains (as of the time of writing) 11 servers, each of which has 4 nodes, each of which has 2 processors, each of which has 8 cores. 11*4*2*8 = 704 cores. Technically this could be interpreted as 1408 //threads//, but whereas 2 cores provide 2x performance, 2 //threads// on a single core provide only ~1.1-1.2x.
+ ==== Overview ====
+ [[https://…]]
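Spelling out the arithmetic in the Prism example (the counts come straight from the paragraph above):

<code bash>
cores=$(( 11 * 4 * 2 * 8 ))       # 11 servers x 4 nodes x 2 processors x 8 cores
echo "physical cores: $cores"               # 704
echo "threads:        $(( cores * 2 ))"     # 1408, the number vendors like to quote
# At 10-20% recovered capacity, those threads behave more like
# ~770-850 core-equivalents, not 1408.
</code>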
- === Network and Storage ===
- Our clusters are typically configured with QDR infiniband interconnects, …
+ The clusters …
+ ||**Name**||**Cores**||**Storage**||**Purchased**||**Free CIBR Access**||**Paid Access**||**Notes**||
+ ||[[CIBRClusters:…]]||…||…||…||…||…||…||
+ ||[[CIBRClusters:…]]||…||…||…||…||…||…||
+ ||[[CIBRClusters:…]]||…||…||…||…||…||…||
- === Memory (RAM) ===
- We currently configure our clusters with 4 GB of RAM per core …
+ ==== Requesting Access ====
+ Free cluster accounts must be requested directly by a PI (tenure-track faculty) who must be an established member of CIBR, and such accounts will charge against that PI's allocation. Contact lmayran@bcm.edu (Larry Mayran) about CIBR membership.
- ==== Configuring SSH so you can log in to the nodes ====
- This can be critical to making jobs run correctly. You should be able to ssh into any of the nodes on the cluster without being asked for a password.
+ To request an account, email sludtke@bcm.edu with:
+   * Full name
+   * Requested username
+   * Position (student, postdoc, etc.)
+   * Cell phone number (for emergencies and initial account setup)
+   * An email address which must be checked at least daily
- <code>
- cd
- mkdir .ssh     # if it doesn't already exist
- cd .ssh
- ssh-keygen
- <use default filename>
- <use blank password, just hit enter>
- cat id_rsa.pub >> authorized_keys
- </code>
+ ==== Cluster Usage ====
+ Each cluster has independent user accounts, as different faculty contributed different amounts to each. Please click on one of the links above for detailed policies and procedures for each cluster.
- You will need a valid known_hosts file. On each of the clusters …
- <code>
- cp /…
- </code>
+ Individual users are expected to be familiar with the use of Linux clusters …
- After this, make sure the permissions on the .ssh directory are restrictive enough, or ssh will refuse to use your keys:
- <code>
- chmod og-w $HOME
- chmod 700 $HOME/.ssh
- chmod 600 $HOME/.ssh/*
- </code>
+ All of the clusters run some version of CentOS Linux. **We do not provide any commercial software**, but most standard open-source tools are installed, …
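Returning to the passwordless-SSH setup above: once the key and permissions are in place, it is worth confirming that login works non-interactively, since that is exactly what batch jobs experience. A minimal check, with ''node01'' standing in for whatever the node names actually are on your cluster:

<code bash>
# Should print the node's hostname without any password prompt.
# BatchMode=yes makes ssh fail rather than prompt, just as a job would.
ssh -o BatchMode=yes node01 hostname
</code>

If this prompts for a password or fails, recheck the key, the known_hosts file, and the permissions described above.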
- ==== Using Resources ====
- Running jobs on a cluster efficiently requires thinking about several factors:
-   * Data size - Reading and writing large amounts of data is slow, so estimate how much your task will need to move.
-   * CPU - How much computation will your task require, assuming reading the data took no time at all?
-   * Memory requirements - How much data needs to be in the computer's RAM at one time?
-   * Parallelism - Can your task take advantage of the shared memory on a single node? If your job supports threaded parallelism, all of the cores on a node can share a single copy of the data in RAM.
- Take the data size and divide by 100 MB/sec. How does this amount of time compare to the CPU time estimate you made? If the CPU time estimate is an order of magnitude or more larger than the number you got from the data, then your job may be well suited for a cluster. If your memory requirements are larger than 4 GB/core, and your job does not support threaded parallelism, the cluster may be a poor fit.
+ Details …
+ There is a CIBR Cluster Google Group/mailing list …
+ ==== In manuscripts, … ====
+ Please acknowledge use of the clusters with text such as:
+ "We gratefully acknowledge …"
+ ----
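Returning to the estimate described above, here is the same reasoning with made-up numbers (500 GB of data and ~50 hours of estimated CPU work; both figures are purely illustrative):

<code bash>
DATA_MB=$(( 500 * 1024 ))      # 500 GB of data, in MB
IO_SEC=$(( DATA_MB / 100 ))    # read time at ~100 MB/sec shared storage
CPU_SEC=$(( 50 * 3600 ))       # ~50 hours of estimated computation
echo "I/O time: $(( IO_SEC / 60 )) min"    # ~85 min
echo "CPU time: $(( CPU_SEC / 60 )) min"   # 3000 min
# CPU time is ~35x the I/O time, so this job is a reasonable cluster candidate.
</code>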
- Again, compute clusters excel at running very CPU-intensive tasks with low file I/O requirements. Tasks such as molecular dynamics or other types of simulation are good examples of this. A task at the other extreme, such as searching a genome for possible primer sequences, is probably not something that should even be attempted on a cluster. Most tasks fall somewhere between these extremes. For example, CryoEM single particle reconstruction does work with many GB of data, but the amount of processing required per byte of data is high enough that clusters can be used efficiently.
- If your project is very data intensive, it may be worth considering an in-lab workstation configured with a high-performance RAID array instead. Such a machine can be purchased for well under $10k, and (in 2014) could provide as much as ~1.5 GB/second read/write speeds. For rapidly processing large amounts of sequence data, machines like this can be much more time- and cost-efficient than any sort of cluster.
- If you have any questions about this for your specific project, please just email sludtke@bcm.edu.
+ CIBR hosted a presentation on using the clusters.
+ The slides can be downloaded here: [[http://blake.bcm.edu/…]]
+ Archived video of the presentation …
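To put the transfer rates quoted on this page side by side (~100 MB/sec gigabit, ~150 MB/sec single disk, ~1.5 GB/sec lab RAID, ~75 GB/sec RAM), here is the time each would need to move 1 TB; the loop is just illustrative arithmetic:

<code bash>
TB_MB=$(( 1024 * 1024 ))   # 1 TB expressed in MB
for entry in gigabit:100 single_disk:150 lab_RAID:1500 RAM:75000; do
    name=${entry%%:*}; rate=${entry##*:}
    echo "$name at $rate MB/s: $(( TB_MB / rate )) sec"
done
</code>

The ~3 hours for gigabit vs. ~12 minutes for a local RAID array is the whole argument for the in-lab workstation mentioned above.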