Tomogram Annotation/Segmentation

Availability: EMAN2.2+, or daily build after 2016-10

This is the tutorial for a convolutional neural network (CNN) based semi-automated cellular tomogram annotation protocol.

If you are using this in a publication, please cite:

A stable (thus not up-to-date) online version of the tutorial can also be found at Protocol exchange:



CNN based tomogram annotation makes use of the TensorFlow deep learning toolkit. At the time of this writing, TensorFlow effectively requires CUDA, the compute framework on NVIDIA GPUs. While future support for OpenCL (an open alternative to CUDA) seems likely, and while it is possible to use TensorFlow on a CPU, at present we cannot recommend running this tool on Mac/Windows. You may do so, but training the networks may be VERY slow.
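Before training, it can be worth checking whether an NVIDIA driver is visible on the machine at all. The sketch below is not part of EMAN2; it only assumes that the driver's standard `nvidia-smi` utility is on the PATH when the driver is installed. The function name is hypothetical, and a positive result does not prove that TensorFlow itself is configured for CUDA.

```python
import shutil


def nvidia_tool_on_path():
    """Return True if the `nvidia-smi` utility can be found on PATH.

    Assumption: a working NVIDIA driver installation normally ships this tool.
    Its presence is a hint, not proof, that CUDA is usable by TensorFlow.
    """
    return shutil.which("nvidia-smi") is not None


if nvidia_tool_on_path():
    print("nvidia-smi found: an NVIDIA driver appears to be installed.")
else:
    print("nvidia-smi not found: training would probably run on the CPU (slow).")
```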

To make use of accelerated TensorFlow, your machine must be properly configured for CUDA. This is outside the installation of EMAN2, and requires specific drivers, etc. on your machine. Some extra information on how to use CUDA and GPU can be found at

Setup workspace

If you used EMAN2 for your tomogram alignment/reconstruction (EMAN2/e2TomoSmall), simply 'cd' to the project folder and launch the EMAN2 project manager.

If you will be using tomograms produced by other software:

Make an empty folder and cd into that folder at the command-line. Run

Select the Tomo Workflow Mode.

Import or Preprocess

If your tomograms were produced using the EMAN2 workflow:

If your tomograms were produced by other software, and you wish to do the segmentation in EMAN2:

Import Tomogram


The network training process may seem complicated at first, but in reality it is extremely simple. You will be training a neural network to look at your tomogram slice by slice, and for each pixel it will decide whether it does or does not look like part of the feature you are trying to identify. This feature is termed the Feature of Interest (FOI). To accomplish this task, we must provide a few manual annotations to teach the network what it should be looking for. We must also provide some regions from the tomogram which do not contain the FOI as negative examples. That is, you are giving the network several examples of: "I am looking for THIS", and several more examples of: "It does NOT look like THIS".
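The slice-by-slice, per-pixel idea can be illustrated with a toy sketch. This is not the EMAN2 implementation: `toy_pixel_score` is a hypothetical stand-in for a trained network (it only rescales intensities), and the tomogram is synthetic random data. It simply shows a score being computed for every pixel of every Z-slice and thresholded into a mask.

```python
import numpy as np


def toy_pixel_score(slice_2d):
    """Stand-in for a trained CNN: map a 2-D slice to per-pixel scores in [0, 1].

    A real network would apply learned filters; here we only rescale intensity.
    """
    lo, hi = slice_2d.min(), slice_2d.max()
    if hi == lo:
        return np.zeros(slice_2d.shape, dtype=np.float32)
    return ((slice_2d - lo) / (hi - lo)).astype(np.float32)


def annotate_volume(volume, score_fn, threshold=0.5):
    """Apply score_fn to each Z-slice of a (Z, Y, X) volume and threshold it."""
    mask = np.zeros(volume.shape, dtype=np.uint8)
    for z in range(volume.shape[0]):
        # 1 where the pixel "looks like" the FOI, 0 elsewhere
        mask[z] = score_fn(volume[z]) > threshold
    return mask


rng = np.random.default_rng(0)
tomogram = rng.normal(size=(4, 64, 64)).astype(np.float32)  # synthetic data
mask = annotate_volume(tomogram, toy_pixel_score)
print(mask.shape, mask.dtype)
```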

For the network to perform this task well, it is critical that you be absolutely certain about the regions you select for training. The largest error people make when training the network is to include regions containing ambiguous density. If you select a tile containing a clear feature you wish to include, but the tile also contains some density you are unsure of, then you will be in a difficult situation. If you mark the ambiguous region as a particle, and it actually is something else, the trained network will not be very selective for the FOI. If you do not mark the ambiguous density, and it turns out to actually be the FOI, then you are trying to train the network to distinguish between the object you said was the FOI and the object you said was not (but was), which may cause a training failure, or a network which misses many FOIs. The correct solution is not to include any tiles containing ambiguous density! Only include training tiles that you are certain you can annotate accurately in 2-D.

There will be other cases where it may be difficult to distinguish between two features. For example, at low resolution, the strong line of an actin filament may look somewhat like the edge of a membrane on an organelle. If you train only a single network to recognize actin and then apply it to a cellular tomogram, you will likely find that many other features, such as membrane edges and possibly the edge of the holey carbon film, get misidentified as actin. While you may be able to train the actin network more carefully, or alter the threshold when segmenting, usually the better solution is to train a second network to recognize membranes and a third network to recognize the carbon film hole, then compete the 3 networks against one another. While a vesicle membrane may look somewhat like actin at low resolution, it should look more like a membrane than actin, so the result of the competitive networks will be a more accurate segmentation of all 3 features.
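The idea of competing networks against one another can be sketched as follows. This is only an illustration under assumptions, not the EMAN2 merging program: each network is assumed to output a score map in [0, 1], the function name `compete` and the 0.5 threshold are hypothetical, and each pixel is simply assigned to the highest-scoring feature.

```python
import numpy as np


def compete(score_maps, threshold=0.5):
    """Merge per-feature score maps into a single integer label map.

    score_maps: list of equally shaped arrays, one per network.
    Returns 0 where no network reaches `threshold`, otherwise
    1 + the index of the network with the highest score.
    """
    stacked = np.stack(score_maps, axis=0)
    winner = np.argmax(stacked, axis=0) + 1
    winner[stacked.max(axis=0) < threshold] = 0
    return winner.astype(np.uint8)


# Tiny made-up score maps for three networks
actin = np.array([[0.9, 0.6], [0.2, 0.1]])
membrane = np.array([[0.3, 0.8], [0.7, 0.1]])
carbon = np.array([[0.1, 0.1], [0.4, 0.3]])

labels = compete([actin, membrane, carbon])
print(labels)  # 1 = actin, 2 = membrane, 3 = carbon edge, 0 = none
```

In this toy example the top-right pixel scores 0.6 for actin but 0.8 for membrane, so it is labelled as membrane even though the actin network alone would have accepted it.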

We will begin the tutorial by training only a single network. You can then go back and follow the same process to train additional networks for other features, then finally at the end, use a program designed to compete these solutions against one another.

Select Training Examples

In this stage we will manually select a few 64x64 pixel tiles from several tomograms, some of which contain the feature of interest (FOI) you wish to train the neural network to identify. We will identify some 2-D tiles which contain the FOI and other tiles which do not contain the FOI at all. In the next step we will annotate the tiles containing the FOI.

Select Box training references. Press browse, and select the preproc version of any one imported tomogram. Leave boxsize as -1, and press Launch.

In a moment, three windows will appear: the tomogram view, the selected particles view, and a control panel (labeled Options). This program is almost identical to the 2-D particle picking interface used in single particle analysis, except that it allows you to move through the various Z-slices of the tomogram. The examples you select will be 2-D, drawn from arbitrary slices throughout the 3-D tomogram. Additionally, this program permits selecting multiple classes of tiles from the same tomogram. For example, you may have one class of tiles containing the FOI and another class which does not contain the FOI. It is critical that you use a consistent naming convention for these different classes of particles, since we would like to have training references from multiple tomograms used for the network training.
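To make the notion of 2-D tiles drawn from arbitrary Z-slices concrete, here is a minimal sketch using NumPy. It does not reproduce what the boxer program does internally; the function name, the (Z, Y, X) axis order, the class names and the coordinates are all assumptions chosen for illustration.

```python
import numpy as np

TILE = 64  # training tiles are 64x64 pixels


def extract_tile(volume, z, y, x, size=TILE):
    """Cut a size x size tile centred on (y, x) from Z-slice z of a (Z, Y, X) volume."""
    half = size // 2
    y0, x0 = y - half, x - half
    if y0 < 0 or x0 < 0 or y0 + size > volume.shape[1] or x0 + size > volume.shape[2]:
        raise ValueError("tile would extend past the edge of the slice")
    return volume[z, y0:y0 + size, x0:x0 + size].copy()


volume = np.zeros((8, 256, 256), dtype=np.float32)  # synthetic tomogram

# Two classes of tiles, each a list of hypothetical (z, y, x) centres
centres_by_class = {
    "good": [(2, 100, 120), (5, 64, 200)],
    "bad": [(3, 40, 40)],
}
tiles = {
    name: np.stack([extract_tile(volume, *c) for c in centres])
    for name, centres in centres_by_class.items()
}
print({name: stack.shape for name, stack in tiles.items()})
```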

Particle Output

Note: The box size for training tiles MUST be 64x64. If the feature of interest does not fit in a 64x64 tile, you will need to create a more down-sampled copy of your tomogram for the recognition to work well. You can do this manually. If you create a down-sampled version of a tomogram, use the exact same name and location as the original tomogram, but add a suffix to the name preceded by __ (double underscore), e.g. tomograms/tomo_12345.hdf might become tomograms/tomo_12345__shrink2.hdf
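As a rough illustration of down-sampling and of the double-underscore naming convention described above, consider the sketch below. It is not the EMAN2 tool: `bin_volume` and `shrunk_name` are hypothetical helpers, the binning is plain block averaging with NumPy, and reading or writing HDF files is left out.

```python
import os

import numpy as np


def bin_volume(volume, factor):
    """Down-sample a 3-D volume by averaging non-overlapping factor^3 blocks.

    Edges that do not divide evenly by `factor` are cropped.
    """
    z, y, x = (s - s % factor for s in volume.shape)
    v = volume[:z, :y, :x]
    v = v.reshape(z // factor, factor, y // factor, factor, x // factor, factor)
    return v.mean(axis=(1, 3, 5))


def shrunk_name(path, factor):
    """Same folder and base name as the original, plus a '__shrinkN' suffix."""
    root, ext = os.path.splitext(path)
    return f"{root}__shrink{factor}{ext}"


small = bin_volume(np.ones((4, 8, 8), dtype=np.float32), 2)
print(small.shape)
print(shrunk_name("tomograms/tomo_12345.hdf", 2))
```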

Manually Annotate Samples

The next step is to manually identify the FOI

Segment Particles

Build Training Set

This step simply combines the files we have created so far in a specific way to prepare for network training.

This will create a new file with _trainset appended to the name. It takes only a few seconds to run.
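Conceptually, a training set pairs each input tile with a target mask. The sketch below shows one plausible way to combine the pieces, assuming that positive tiles are paired with their manual annotations and negative tiles with all-zero masks. The function name and array layout are hypothetical, and the actual EMAN2 program may do more (or something different) than this.

```python
import numpy as np


def build_trainset(pos_tiles, pos_masks, neg_tiles):
    """Combine positive and negative examples into (inputs, targets).

    pos_tiles, pos_masks: arrays of shape (N, 64, 64), tiles and their annotations.
    neg_tiles: array of shape (M, 64, 64), tiles that do not contain the FOI.
    Assumption: the target for every negative tile is an empty (all-zero) mask.
    """
    if pos_tiles.shape != pos_masks.shape:
        raise ValueError("each positive tile needs a mask of the same shape")
    neg_masks = np.zeros_like(neg_tiles)
    inputs = np.concatenate([pos_tiles, neg_tiles], axis=0)
    targets = np.concatenate([pos_masks, neg_masks], axis=0)
    return inputs, targets


# Synthetic placeholders standing in for real tiles and annotations
pos = np.random.default_rng(1).normal(size=(3, 64, 64)).astype(np.float32)
ann = (pos > 1.0).astype(np.float32)
neg = np.random.default_rng(2).normal(size=(5, 64, 64)).astype(np.float32)

inputs, targets = build_trainset(pos, ann, neg)
print(inputs.shape, targets.shape)
```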

Train Neural Network

Training Results

Once you are satisfied with the result, go to the next step to segment the whole tomogram.

Apply to Tomograms

Finally, open the Apply the neural network panel. Choose the tomogram you used to generate the boxes in the tomograms box, choose the saved neural network file (not the "trainout_" file, which is only used for visualization), and set the output filename. You can change the number of threads to use by adjusting the thread option. Keep in mind that using more threads will consume more memory, because the tomogram slices are read in at the same time. For example, processing a 1k x 1k x 512 downsampled tomogram on 10 cores would use ~5 GB of RAM. Processing an unscaled 4k x 4k x 1k tomogram would increase RAM usage to ~24 GB. When this process finishes, you can open the output file in your favourite visualization software to view the segmentation.

Segmentation Result

To segment a different feature, just repeat the entire process for each feature of interest. Make sure to use different file names (e.g. _good2 and _bad2)! The trained network should generally work well on other tomograms of a similar specimen collected with similar microscope settings (clearly the A/pix value must be the same).

Merging multiple annotation results

Merging the results from multiple networks on a single tomogram can help resolve ambiguities, or correct regions which were apparently misassigned. For example, in the microtubule annotation shown above, the carbon edge is falsely recognized as a microtubule. An extra neural network can be trained to specifically recognize the carbon edge, and its result can be competed against the microtubule annotation. Merging multiple annotation results produces a multi-level mask, in which the integer value of each voxel identifies the type of feature the voxel contains. To merge multiple annotation results, simply run in the terminal:

Tips in selecting training samples

Good vs bad segments


Darius Jonasch, the first user of the tomogram segmentation protocol, provided much useful advice to make the workflow user-friendly. He also wrote a tutorial for the earlier version of the protocol, on which this tutorial is based.

EMAN2/Programs/tomoseg (last edited 2019-12-03 13:43:43 by SteveLudtke)