Extra functions for EMAN2 tomography

Focused refinement

Refine local regions of a large complex. Available after 05/23/2019. Still under development.

Focused refinement on multiple asymmetric units

When you have a complex with multiple asymmetric units, start from one unit and obtain its transform and mask as described in the previous section. Assume the complex has c5 symmetry and the first unit is at 32,32,0. Then run e2.py and type

You will get a list of transform dictionaries in the printout. Paste them into a text file and use it as the input for particle extraction.
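As a rough geometric illustration of what the symmetry expansion produces, the sketch below computes where the five C5-related copies of a unit at 32,32,0 would sit, assuming the symmetry axis is the z axis through the origin. It uses plain Python rather than the EMAN2 API, the function name cn_positions is hypothetical, and the rotation direction may differ from the convention EMAN2 uses.

```python
import math

def cn_positions(x, y, z, n):
    """Return the n positions related to (x, y, z) by Cn symmetry about the z axis."""
    positions = []
    for k in range(n):
        angle = 2.0 * math.pi * k / n  # rotation angle of the k-th symmetry copy
        c, s = math.cos(angle), math.sin(angle)
        # Rotate about z; the z coordinate is unchanged
        positions.append((c * x - s * y, s * x + c * y, z))
    return positions

# Example from the text: first unit at (32, 32, 0) under C5 symmetry
for k, (px, py, pz) in enumerate(cn_positions(32, 32, 0, 5)):
    print(f"unit {k}: x={px:.2f} y={py:.2f} z={pz:.2f}")
```

Every copy lies at the same distance from the symmetry axis, which is a quick sanity check on the transforms pasted into the text file.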

This also works when you have a complex with multiple identical components that do not follow a clear symmetry. Extract each unit individually and align the same reference to each unit. Put the alignment transforms in a text file for particle extraction.

Determine the handedness of a tomogram

In EMAN2 builds after 05/23/2019, the handedness of a tomogram can be determined using CTF information. The idea is that, at a non-zero tilt angle, one side of the specimen is closer to the focal plane than the other. Since this defocus gradient is already taken into account in the CTF estimation step, we simply run the estimation twice, once on the current hand and once on the inverted hand, and check which one gives a better fit.
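The comparison can be sketched in a few lines of Python. This is not the EMAN2 implementation: it assumes a simplified geometry in which the defocus changes linearly with the distance x from the tilt axis, by x*tan(tilt), and all function names and the sample measurements are made up for illustration.

```python
import math

def expected_defocus(x, tilt_deg, center_defocus, hand):
    """Defocus expected at offset x from the tilt axis (same length unit as defocus).

    hand is +1 or -1 and flips which side of the tilt axis is closer to focus.
    """
    return center_defocus + hand * x * math.tan(math.radians(tilt_deg))

def fit_error(measurements, tilt_deg, center_defocus, hand):
    """Sum of squared differences between measured and expected defocus."""
    return sum(
        (d - expected_defocus(x, tilt_deg, center_defocus, hand)) ** 2
        for x, d in measurements
    )

def pick_hand(measurements, tilt_deg, center_defocus):
    """Return the hand (+1 or -1) whose defocus gradient fits the data better."""
    errors = {
        hand: fit_error(measurements, tilt_deg, center_defocus, hand)
        for hand in (+1, -1)
    }
    return min(errors, key=errors.get)

# Made-up local defocus values: (offset from tilt axis, defocus), both in um
measurements = [(-0.5, 2.71), (0.0, 3.0), (0.5, 3.29)]
print(pick_hand(measurements, tilt_deg=30.0, center_defocus=3.0))
```

With the wrong hand the predicted gradient has the opposite sign, so the residual grows with both the tilt angle and the distance from the tilt axis. High-tilt images are therefore the most informative for this test.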

Automated particle selection

In EMAN2 builds after 02/01/2020, a new tool is available for CNN-guided automated particle selection from tomograms. The concept is similar to the tomogram segmentation protocol, but a number of changes have been made to improve the accuracy and throughput of the process. A new GUI has been made to simplify the training process. Note that this requires a CUDA-compatible GPU and a working TensorFlow setup. To use, run

Here label will be the label of the newly selected particles. This will bring up three windows: the main window with various options and a list of tomograms, and two windows (initially empty) for positive and negative samples. Clicking any tomogram in the list will bring up two more windows: the slice view of the tomogram and the list of particles under the given label. Here is a simple workflow.

  1. Select a few (>5) positive and negative samples. On the tomogram slice view, left-click to select positive samples, and Ctrl+left-click to select negative samples. Shift-click an image in the sample list to delete it. The particles should be well-centered in the positive samples, and there should not be particles in the center of negative samples.

  2. Click Train to start training; some output will be printed in the command line. Keep clicking Train (or use a larger Niter) until the loss stops decreasing (or whenever you want to stop).

  3. Click Apply to let the program select particles using the trained network.

  4. Go through the particle list and Ctrl+left-click any falsely recognized particle to add it to the list of negative samples. Left-clicking a particle adds it to the positive samples, but this is rarely necessary since those particles are already selected by the network. You can also go through the tomogram again and add a few particles that were missed by the network to the positive samples.
  5. Click Train again to re-train the network using the new training set, and click Apply to inspect its results.

  6. Repeat the process until the neural network's performance is satisfactory. You can also select other tomograms in the list to test the performance of the model and add more positive/negative samples to the training set.
  7. Go through all tomograms in the list and apply the network to select the particles. These particles can be viewed and modified in e2spt_boxer.py, and extracted through the particle extraction steps of the main workflow.


Description of items on the GUI:

Map particles to tomograms

There is a simple tool to map the averaged structure to the determined position and orientation of each particle in a tomogram. Available after EMAN2.3. In versions after 05/23/2019, the function has been moved to the Analysis and Visualization section in the GUI.

The program will then find all particles in the selected tomogram that are used in the refinement, map the averaged structure back, and produce a file called ptcls_in_tomo_xx_yy.hdf, where xx is the name of the tomogram and yy is the iteration number used. This is sometimes quite useful for objects in a cellular environment (for example, when membrane proteins are obviously upside down). Image rendered with Chimera.

Filament refinement

A specialized GUI for the selection of filament particles is available in version 2.31 or later. In the Evaluate Tomograms window, select a tomogram, hold Shift and click the Boxer button. You can also find this through Segmentation -> Manual segmentation -> Draw curve. This will bring up a 2D tomogram viewing window and a small control panel. The following tomogram is from Caltech ETDB.

In the tomogram window, press the up or down arrow (`/1 also works) to go through the slices. Use left-click to add a point on the filament, and Shift-click to delete a point. The program will build a curve that goes through all the points while minimizing the total length in 3D, so the order in which points are added to the curve is irrelevant. One can select the two ends of a filament and then add points in the middle to adjust the curvature. Ctrl-click to add a point on a new curve or select an existing curve.
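To illustrate why the click order does not matter, here is a minimal sketch that orders a set of 3D points so that the open path through all of them is as short as possible. It is a brute-force search meant only for a handful of points and is not necessarily how EMAN2 builds the curve; the function names are hypothetical.

```python
import itertools
import math

def path_length(points):
    """Total length of the polyline visiting the points in the given order."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def shortest_curve(points):
    """Order the points so the open path through all of them is shortest.

    Tries every ordering, so it is only practical for a few points.
    """
    return min(itertools.permutations(points), key=path_length)

# Two filament ends clicked first, then a point in the middle
clicked = [(0, 0, 0), (10, 0, 0), (5, 1, 2)]
print(shortest_curve(clicked))
```

The middle point ends up between the two ends even though it was added last, which matches the behavior described above.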

On the control panel, the Interpolate button will interpolate the points on all curves with a constant spacing. This only changes the visual appearance in the GUI and the particle count shown in the Evaluate Tomograms window; the number of 3D particles actually extracted from the tomograms is controlled later in the particle extraction step. The Save PDB button will save the curves as a PDB file, so they can be visualized together with the tomograms in Chimera. Due to the limitations of the PDB format, the curves are saved in pixel units, so you will need to change the voxel size of the corresponding tomogram to 1 so that it overlaps with the model.
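Constant-spacing interpolation along a curve can be sketched as follows. This is a simplified version that treats the curve as straight segments between the clicked points, which may differ from the interpolation used in the GUI; resample_curve is a hypothetical name.

```python
import math

def resample_curve(points, spacing):
    """Return points placed every `spacing` units of arc length along a polyline."""
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    if not points:
        return []
    resampled = [tuple(points[0])]
    travelled = 0.0    # arc length up to the start of the current segment
    next_at = spacing  # arc length at which the next sample is due
    for a, b in zip(points, points[1:]):
        seg = math.dist(a, b)
        # Emit every sample that falls inside this segment
        while seg > 0 and next_at <= travelled + seg:
            t = (next_at - travelled) / seg
            resampled.append(tuple(pa + t * (pb - pa) for pa, pb in zip(a, b)))
            next_at += spacing
        travelled += seg
    return resampled

# An L-shaped curve of total length 7, sampled every 2 pixels
print(resample_curve([(0, 0, 0), (3, 0, 0), (3, 4, 0)], 2.0))
```

The spacing is measured along the curve rather than along any single axis, so the samples stay evenly distributed around bends.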

When multiple types of filaments exist in the same dataset, they should be labeled separately. Use the small text box at the top of the control panel to switch between different types of filaments. The filament particles can be viewed from the Evaluate Tomograms window as curve_00, curve_01, etc. Make sure the indices of the curves are consistent throughout the dataset (i.e. when a type of filament is labeled as 01 in one tomogram, it should always be 01, even if the type 00 filament does not exist in a tomogram). After selecting the curves, to extract a certain type of filament particles from the tomogram, in the Extract particles step, set curves to the index of the filament class, and curves_overlap to the overlap between neighboring boxes (so the spacing between boxes depends on the box size). It is also recommended to name the extracted particles using the newlabel option.
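The relation between box size, overlap and box spacing can be sketched as below. This assumes curves_overlap is expressed as a fraction of the box size, which the text above does not state explicitly, and the helper names are hypothetical.

```python
def box_spacing(box_size, overlap):
    """Distance between neighboring box centers for a fractional overlap in [0, 1)."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    return box_size * (1.0 - overlap)

def count_boxes(curve_length, box_size, overlap):
    """Number of box centers that fit along a curve of the given length."""
    return int(curve_length // box_spacing(box_size, overlap)) + 1

# Assumed example: 64-pixel boxes with 50% overlap along a 320-pixel curve
print(box_spacing(64, 0.5), count_boxes(320, 64, 0.5))
```

Under this assumption, a larger overlap gives more particles per filament at the cost of more redundancy between neighboring boxes.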

If the 3D particles are extracted with the curve boxing tool, their directions along the curve are saved in the header, which can be used by downstream alignment. In the initial model generation, a command-line-only option --refine will build an initial model while keeping the filament orientation of the particles. The same option is also present in subtomogram refinement, where it constrains the orientation search around the filament direction.

Filling missing wedge in tomograms

In EMAN2 builds after 03/20/2020, there is a new deep-learning-based tool to fill in the missing wedge in raw tomograms with somewhat meaningful information. The idea is similar to a "style transform" that makes the features in the x-z 2D slice views similar to those in the x-y slice views. To use, run

No human input is needed, as the program builds the training sets by itself. You can train and apply to the same tomogram to improve performance, or load a trained network and apply it to many tomograms to save time. Note that the missing wedge filling here happens locally (you can specify the box size in the program, but the performance may decrease as the box size gets larger), so it does not deal with large-scale effects such as the artifacts from a high-contrast object, or an entire invisible flat membrane.

Here is a before/after comparison of the x-z slice view of a cellular tomogram.