Introduction to Clusters and CIBR Philosophy
First, a little terminology. Here is the hierarchy of computing resources on a modern cluster (you may find some disagreement about some of these specific definitions):
Server - a single physical box mounted in a rack in the cluster
Node - An independently operating computer. One server often contains multiple independent computers, or nodes.
Processor or CPU - A physical microprocessor chip inserted into a socket on the node's motherboard. Most Linux cluster nodes have 2 processors, though there are exceptions.
Core - A semi-independent computing unit inside a processor. The cores within a single physical package do share some resources (like cache), but can largely be treated as independent. Modern processors typically have 2-12 cores.
Thread - A fairly recent concept. Each core contains several execution units that can work on non-overlapping operations at the same time, but some of these units sit idle when sequential operations conflict with one another. Hardware threads let you recover some of this wasted capacity by running 2 jobs on each core. On current-generation processors, using this thread system can recoup as much as 10-20% of otherwise wasted capacity. However, using it is tricky. You will also often see vendors (like Amazon) advertising threads as if they were cores. This is patently false.
RAM - The memory on a single node. This memory is shared among all of the cores on the node, and often not explicitly allocated to specific tasks.
Swap - A concept used to avoid memory exhaustion on computers. When the RAM becomes full, the computer may temporarily store some information on the hard drive in specially allocated swap space. Note, however, that a typical hard drive can read/write information at ~150 MB/sec. RAM can read/write information at ~75,000 MB/sec. Forcing a modern cluster to swap is a very bad thing, and the CIBR clusters have little or no swap space available.
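A quick way to see some of these quantities from inside a node is Python's standard library. One caveat worth demonstrating: os.cpu_count() reports logical CPUs, i.e. hardware threads, so on a hyperthreaded node it reports double the physical core count. The sketch below also reads RAM and swap totals from /proc/meminfo, which exists only on Linux (the helper function name is our own, not a standard API):

```python
import os

# os.cpu_count() reports *logical* CPUs, i.e. hardware threads.
# On a node with 2 processors x 8 cores x 2 threads/core it returns 32,
# even though only 16 physical cores are available.
logical_cpus = os.cpu_count()
print(f"Logical CPUs (threads) on this node: {logical_cpus}")

def meminfo_kb(field):
    """Return the value in kB of a /proc/meminfo field, e.g. 'MemTotal'."""
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith(field + ":"):
                return int(line.split()[1])
    return None

# /proc/meminfo is Linux-specific, so guard against other platforms.
if os.path.exists("/proc/meminfo"):
    print(f"RAM:  {meminfo_kb('MemTotal')} kB")
    print(f"Swap: {meminfo_kb('SwapTotal')} kB")  # may be 0 on cluster nodes
```

On a compute node you would expect a large MemTotal and, per the note above, a SwapTotal at or near zero.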
If we consider the Prism cluster, for example, it contains (as of the time of writing) 11 servers, each of which has 4 nodes, each of which has 2 processors, each of which has 8 cores. 11*4*2*8 = 704 cores. Technically this could be interpreted as 1408 threads, but whereas 2 cores provide 2x performance, 2 threads on a single core typically provides ~1.1-1.2x performance, and can only be used in very specific situations.
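The arithmetic above can be written out explicitly; the variable names are ours, and the counts come from the Prism description in the text:

```python
# Prism cluster hierarchy, as described above
servers, nodes_per_server, procs_per_node, cores_per_proc = 11, 4, 2, 8

cores = servers * nodes_per_server * procs_per_node * cores_per_proc
threads = cores * 2  # 2 hardware threads per core

# 2 real cores double throughput, but 2 threads sharing one core
# give only ~1.1-1.2x; 1.15 is the midpoint of that estimate.
effective_threaded = cores * 1.15

print(f"Physical cores: {cores}")            # 704
print(f"Advertised threads: {threads}")      # 1408
print(f"Realistic thread-mode throughput: ~{effective_threaded:.0f} core-equivalents")
```

This is why counting the cluster as 1408 "cores" would roughly double-count its real capacity.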