EMAN2/e2tomo - EMAN Wiki

NOTE: This is an early version of our tomography workflow tutorial. Updates will be posted in the coming weeks describing recommended procedures for automated segmentation and subtomogram averaging.

Introduction to the tutorial

EMAN2 can be used at many different levels ranging from high-level task-based workflow, to command-line utilities, to writing code in Python or C++. In this tutorial, we will be focusing primarily on the task-focused high-level Project Manager interface. This interface will help you work step-by-step through established techniques such as single particle analysis and subtomogram averaging.

We will be using an 80S ribosome data set for this tutorial, obtained from EMPIAR (https://www.ebi.ac.uk/pdbe/emdb/empiar/entry/10064). The manuscript associated with this data deposition reported a final resolution of ~11 Å, but obtaining this result on a laptop would require significant time, and it is unlikely that such hardware would possess sufficient RAM for memory-intensive processes such as 3D subtomogram refinement. Here we will operate on binned tiltseries/tomograms, reducing the attainable resolution but allowing the complete subtomogram averaging pipeline to be performed on relatively minimal hardware.

We strongly recommend going through this tutorial using the provided data set. Once you understand how everything is supposed to work, then you can use your own data or download additional public data sets from sites like http://www.ebi.ac.uk/pdbe/emdb/empiar.

Before getting started, it's a good idea to get a feel for the relative speed of your computer (to set expectations). Run e2speedtest.py. This will give you a score telling you how fast a single processor is on your computer. If your machine has 4 cores, you multiply this number by 4 to get a relative performance value. Note, however, that some processors have a 'turbo' mode, and if you are using only 1 processor (which is what the test does), it will run faster than 1 core normally will. This can exaggerate the speedtest score by as much as 20-30%. My 2017 MacBook Pro (3.1 GHz Intel Core i7) scores ~1.3 (per core) on this test.
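
The arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function name and the turbo_inflation parameter are hypothetical and are not part of e2speedtest.py.

```python
def relative_performance(per_core_score, cores, turbo_inflation=0.0):
    """Estimate whole-machine performance from a single-core speedtest score.

    turbo_inflation is the assumed fraction by which 'turbo' mode exaggerates
    the single-core score (e.g. 0.25 for 25%). It is a hypothetical knob used
    only in this sketch.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if turbo_inflation < 0:
        raise ValueError("turbo_inflation must not be negative")
    # Remove the assumed turbo boost, then scale by the number of cores.
    sustained_score = per_core_score / (1.0 + turbo_inflation)
    return sustained_score * cores


# A 4-core machine scoring 1.3 per core, without and with a 25% turbo discount.
print(relative_performance(1.3, 4))
print(relative_performance(1.3, 4, turbo_inflation=0.25))
```

The discounted value is probably the more realistic expectation when all cores are busy at the same time.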

A note on project organization/management: What should you do with your raw data?

We begin by unzipping the tutorial file. The compressed directory includes two directories (CTFnoise and Distortion) and a tiltseries, cryo.st, that we will reconstruct using the EMAN2 tomography workflow. Open a terminal window/command prompt and move into the unzipped folder. On my computer, this looks like:

cd /home/jmbell/cryo

Typically, I create a directory called “rawdata” that houses everything I want to preserve but not process. In this case, I suggest making this directory (mkdir rawdata) and moving everything in the current directory into this folder, i.e.:

mkdir rawdata
mv ./* rawdata
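
If you prefer to script this step, the same organization can be sketched in Python using only the standard library. The function name stash_raw_data is hypothetical and not part of EMAN2. Unlike the plain mv above, the sketch skips the rawdata directory itself, so no attempt is made to move it into itself.

```python
import shutil
import tempfile
from pathlib import Path


def stash_raw_data(project_dir, raw_name="rawdata"):
    """Move every entry in project_dir into project_dir/raw_name.

    The target directory itself is skipped. Returns the sorted names of the
    entries that were moved.
    """
    project = Path(project_dir)
    raw = project / raw_name
    raw.mkdir(exist_ok=True)
    moved = []
    # sorted() takes a snapshot of the listing before anything is moved.
    for entry in sorted(project.iterdir()):
        if entry == raw:
            continue
        shutil.move(str(entry), str(raw / entry.name))
        moved.append(entry.name)
    return moved


# Demonstration in a throwaway directory; the names mirror the tutorial archive.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "cryo.st").write_bytes(b"")
    (Path(tmp) / "CTFnoise").mkdir()
    print(stash_raw_data(tmp))
```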

Next, from within the current folder, run “e2projectmanager.py”, which will bring up the EMAN2 project management GUI. By default, the Workflow Mode shown at the upper left of the window under the EMAN2 logo will be “SPR”. This mode provides tools for single particle analysis. We will be using the new “Tomo” mode, so click on the dropdown menu and select “Tomo”. When you do this, the workflow menu will show a new series of panels, including Raw Data, 3D Reconstruction, Segmentation, Subtomogram Averaging, and Analysis and Visualization.

On the right side of the project manager window, there are a series of buttons:

Browser: Open the EMAN2 file browser.
Help: Open the e2help.py GUI, which provides considerable detail about our C++ image processing utilities (aligners, reconstructors, processors, projectors, etc.).
Notebook (Log): Open the EMAN2 notebook, which shows a log of processes called and the times they were launched.

Task manager: Show the EMAN2 tasks/programs currently running on your computer.

The last three buttons are available only for a subset of our programs:

Wiki: Open additional documentation on the web.
Wizard: Open the project manager wizard to assist when filling in program parameters.
Expert mode: Show/hide additional options for a given program.

To keep things organized, particularly when working on multiple projects simultaneously, it is often useful to assign a unique project name to each tomography project, which will be displayed under the EMAN2 Project Manager title text. To do this, click on the project manager window and, either at the top of the window or the top of your screen (depending on which operating system you’re using), click “Project” -> “Edit project”. Here you can provide a project name, say “EMAN2 Tomography Tutorial”. For record keeping, we suggest also filling in the particle mass and the microscope Cs, voltage, and apix (Å/pixel) values. The Å/pixel value for this tutorial dataset is 2.62. The specified parameters will be used when possible throughout the refinement process, but more importantly, collaborators will be able to access these parameters when you share your project directory, reducing the chance of processing errors due to incorrect parameter specification.

2. Import data

Tomography projects have a variety of starting points. This tutorial begins from raw tiltseries; however, it’s also possible to start from raw movie frames or individual tilt images that must be pre-processed before they can be interpreted as a tiltseries.

While there are no specific requirements for the organization of your raw data in EMAN2, we recommend keeping a copy of your raw data on your machine and processing a separate copy. Following this recommendation, we will create a copy of the raw tiltseries from the “rawdata” directory via the EMAN2 workflow GUI.

Raw tiltseries

We define a raw tiltseries as a stack of tilt images in tilt angle order (usually negative to positive). To import raw tiltseries data, double click the “Raw Data” entry in the EMAN2 project manager workflow menu and select “Import tiltseries”. This will bring up a new display in the EMAN2 program interface where you can specify the path to the tiltseries, whether to invert contrast upon import, and whether to copy, move, or link the incoming data.

Click “Browse” on the upper right of the program interface window, select “cryo.hdf”, and click “OK”. Next type 7.7 into the Apix box. Finally, make sure “copy” is selected under the importation dropdown menu and click “Launch”.

Generally we recommend that users copy their data from a separate directory (such as “rawdata”) so that we always have a backup copy of the raw data on disk in the event that files are somehow corrupted. This directory can be placed within or exist outside an EMAN2 project. However, if you prefer to move an existing copy of your data into an EMAN2 project, you can specify the “move” importation method. Alternatively, it’s also possible to manipulate the data in place using the “link” option.

NOTE: It is essential that all imported files have the correct Å/pixel value in their header or that it is specified when running one of our import routines. If you are certain that the Å/pixel value in the header is correct, you can specify -1 in the apix box during tiltseries import. For more details on manipulating file header parameters, see the block below titled Inspecting and modifying image file header parameters.

Individual micrographs

If starting your tomography with individual tilt images, you can import these images directly into a tiltseries by using the “e2buildstacks.py” program. This is accessible from the GUI by selecting “Generate tiltseries” under the “Raw Data” workflow menu entry. Here you will specify the images in tilt angle order (negative to positive) and type the name you wish to assign this tiltseries. Before clicking “Launch”, be sure the “tilts” box is checked. The resulting tiltseries will be stored in the project “tiltseries” directory. If this directory does not already exist, the program will create it automatically.

DDD frames

If starting from raw DDD frames, you may or may not have an mdoc file containing relevant tilt angle and file name information used to combine aligned frames into a tiltseries. In cases where such a document is unavailable, one can align the frames individually using the e2ddd_external.py program, available through the GUI under “Raw Data” in the workflow menu. To use this program, simply select “Process DDD movies” and provide a list of the movies you want to align in tilt angle order.

If you have an mdoc file, the process is significantly easier. In the “Process DDD movies” program interface window, begin by clicking “Browse” next to the “mdoc” file box and select the relevant mdoc file for the movie data you wish to align and convert into a tiltseries. Next, click “Browse” in the “input” file input box and select either a list of movies in arbitrary order or a directory containing the movies to be aligned. When launched, the program “e2ddd_external.py” will organize, align, and save the tilt images in tilt angle order according to the contents of the specified mdoc file.

During alignment, either IMOD’s alignframes routine or MotionCor2 will be used. Note that for these programs to work, they will need to be installed and available in your PATH environment variable. Correct installation instructions for these programs/packages are available from their respective developers/distribution platforms.

Optionally, you may specify dark and gain reference movies in the corresponding file boxes, and we offer some basic options for the two alignment routines. For more advanced usage, we suggest that users run these programs from the command line. Including them in our GUI is solely for the sake of convenience.

The final output of e2ddd_external.py when run within the “Tomo” workflow is an unaligned tiltseries, which will be stored in the “tiltseries” project folder.

NOTE: When aligning DDD movie data using e2ddd_external.py via the command line or project manager interface, it is essential to verify that all imported files have the correct Å/pixel value in their header. For instructions on how to inspect/modify image file header parameters such as Å/pixel, see the block below.

Inspecting and modifying image file header parameters:

To inspect a file’s header parameters (e.g. apix_x, apix_y, etc.), you can either use the EMAN2 file browser or command line.

If you prefer a graphical interface, click on the folder icon from the project manager or run e2display.py via the command line. Next, navigate to the tiltseries directory and single click on an imported tiltseries. Next click the “info” button at the upper right to inspect the header parameters.

Alternatively, from the command line, it is possible to obtain header parameters by running the command “e2iminfo.py filename.ext -H”, which will print the contents of the header to the terminal window. In either case, you should examine the “apix_x”, “apix_y” header parameters and ensure that these values are consistent with the magnification used during data collection and binning applied before/during tiltseries importation.
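
As a quick sanity check on the header values, the expected pixel size after binning is simply the unbinned pixel size multiplied by the binning factor. The sketch below illustrates that check. The function names and the numbers in the example are hypothetical and do not describe the tutorial dataset.

```python
def expected_apix(unbinned_apix, bin_factor):
    """Pixel size (Å/pixel) expected after binning by an integer factor."""
    if bin_factor < 1:
        raise ValueError("bin_factor must be at least 1")
    return unbinned_apix * bin_factor


def apix_is_consistent(header_apix, unbinned_apix, bin_factor, rel_tol=0.01):
    """True when the header value is within rel_tol of the expected value."""
    target = expected_apix(unbinned_apix, bin_factor)
    return abs(header_apix - target) <= rel_tol * target


# Hypothetical example: data collected at 1.5 Å/pixel and binned by 4.
print(expected_apix(1.5, 4))
print(apix_is_consistent(6.02, 1.5, 4))  # header matches the binned value
print(apix_is_consistent(1.5, 1.5, 4))   # header still holds the unbinned value
```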

If these values are incorrect and need to be modified, this task can be accomplished using the command line program “e2procheader.py”. Specifically, to change the apix value to a preferred value, say 7.699, you would run the following command from within the tiltseries directory (on an already imported tiltseries):

e2procheader.py --input tiltseries.hdf --output tiltseries.hdf --stem apix --stemval 7.699

3. Reconstruct tiltseries

Once tiltseries data is imported into the EMAN2 project directory, we can proceed with tiltseries alignment and 3D reconstruction. This is a fully automated procedure in EMAN2 that begins with a coarse, cross-correlation alignment of the tiltseries followed by rounds of iterative refinement. Each iteration consists of generating a tomogram, picking high-contrast landmarks in 3D (rather than relying solely on gold fiducials), mapping landmark coordinates to 2D tilt images, refining the coordinates of those landmarks, and then refining the alignment parameters of each image in the tiltseries. We repeat this process with different levels of binning so that alignment focuses on low resolution features first. The final result is a reconstructed tomogram that is either 1024x1024x256 (default) or 2048x2048x512, depending on which options are specified. It’s worth noting that, unlike most tomogram reconstruction software, which uses SIRT or back-projection algorithms, EMAN2 performs reconstructions using direct Fourier methods, similar to how we perform reconstructions in single particle analysis.

To perform a tiltseries alignment and tomographic reconstruction using the project manager interface, begin by double clicking on the “3D Reconstruction” workflow menu item and select “Reconstruct tomograms.” Next, click “Browse” at the upper right of the program interface window and select all the tiltseries you wish to reconstruct. In this case, you should see the “cryo.hdf” tiltseries you imported in the last step. Single click on this file (“cryo.hdf”) and click “OK”, which will close the current window.

In EMAN2, it is not necessary to specify a rawtlt file. Instead, we assume that images in a specified tiltseries are in the correct (tilt angle) order with no missing images. If this is the case, users need only specify a tilt step (tltstep) and the index of the 0° tilt image (zeroid). For this dataset, the middle image is the 0° image (zeroid=-1), and the angle increment between tilts is 2° (tltstep=2), so the default parameters will work. However, in cases where your tilt step is different, it is essential that you specify an accurate value in the tltstep parameter box.
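
To make the role of tltstep and zeroid concrete, here is a minimal sketch of how nominal tilt angles follow from these two values. It is an illustration, not the code used by e2tomogram.py. In particular, treating a negative zeroid as "use the image at index n_images // 2" is an assumption of this sketch.

```python
def tilt_angles(n_images, tltstep=2.0, zeroid=-1):
    """Return the nominal tilt angle (degrees) of each image in a tiltseries.

    Assumes the stack is ordered by tilt angle with no missing images.
    A negative zeroid is interpreted here as "the middle image", taken to be
    index n_images // 2 (an assumption of this sketch).
    """
    if n_images < 1:
        raise ValueError("n_images must be at least 1")
    if zeroid < 0:
        zeroid = n_images // 2
    if zeroid >= n_images:
        raise ValueError("zeroid must be a valid image index")
    return [(i - zeroid) * tltstep for i in range(n_images)]


# Five images with the default 2 degree step and the middle image at 0 degrees.
print(tilt_angles(5))
# A one-sided series that starts at 0 degrees and uses a 3 degree step.
print(tilt_angles(4, tltstep=3.0, zeroid=0))
```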

While the default parameters should work well for this tutorial dataset, we recommend that you modify the number of threads at the bottom of the program interface window to correspond to the number of cores on your computer. Note that on a core i7 processor, I can run 6-8 threads without problems. On a core i3 or i5, I would not run more than 4. For more details about each of the options available for this program, you can hover over the parameter in the program interface window or run “e2tomogram.py --help” from the command line.

Once all parameters are set, click “Launch” to begin alignment and reconstruction. A complete reconstruction on a full 4k x 4k dataset usually takes ~8-12 minutes on 12 threads (depending on your hardware). In comparison, the “cryo.hdf” tiltseries is only 2k x 2k and should require only ~3.6 minutes to reconstruct on a high-end laptop.

While running, the program writes alignment information to ‘info/xx_info.json’, including:

ali_loss : average residue error for each tilt, in nm.
tlt_file : tilt series file input for the reconstruction.
tlt_params : transform parameters for each tilt. The 5 columns represent translation x, translation y, tilt axis rotation, tilt angle, and off-axis tilt angle.
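
For readers who want to inspect this metadata outside the GUI, the following sketch pairs each tilt's parameters with its loss value and reports the worst-aligned tilt. It assumes the file can be parsed as plain JSON with the three keys listed above holding ordinary lists. The example values are made up, and the column names in COLUMNS are labels chosen for this sketch.

```python
import json

# Labels for the 5 columns of tlt_params, following the order given above.
COLUMNS = ("tx", "ty", "tilt_axis_rotation", "tilt_angle", "off_axis_tilt")


def summarize_alignment(info):
    """Pair each tilt's parameters with its loss and find the worst tilt."""
    rows = []
    pairs = zip(info["tlt_params"], info["ali_loss"])
    for index, (params, loss) in enumerate(pairs):
        row = dict(zip(COLUMNS, params))
        row["index"] = index
        row["loss_nm"] = loss
        rows.append(row)
    worst = max(rows, key=lambda r: r["loss_nm"])
    return rows, worst


# Made-up values standing in for the contents of an info/xx_info.json file.
example_text = json.dumps({
    "tlt_file": "tiltseries/cryo.hdf",
    "ali_loss": [1.8, 0.9, 2.4],
    "tlt_params": [
        [3.0, -1.5, 85.0, -2.0, 0.1],
        [0.0, 0.0, 85.1, 0.0, 0.0],
        [-2.5, 1.0, 85.2, 2.0, -0.1],
    ],
})

rows, worst = summarize_alignment(json.loads(example_text))
print(worst["index"], worst["tilt_angle"], worst["loss_nm"])
```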

If run with the “notmp” box left empty, a new folder called ‘tomorecon_xx’ will be created, containing the following intermediate files:

ali_xx.hdf : aligned tilt series with the 3D transform stored in ‘xform.projection’ in the image headers.
landmark_xx.txt : 3D locations of the landmarks used.
loss_xx.txt : average residue error for each tilt, in nm.
ptclali_xx.hdf : per-particle landmark tracking results in 2D. In the image headers, ‘nid’ is the tilt series index, ‘pid’ is the index of the landmark, and ‘score’ is the (x,y) translation alignment of the particle.
samples_xx.hdf : top and side (x-z plane) views of each landmark in 3D. A good way to evaluate the refinement is to check how round the side views are.
tltparams_xx.txt : transform parameters for each tilt after each iteration.
tomo_xx.hdf : bin8 tomogram reconstruction after each iteration. Note that we shrink by minimum value instead of mean value when making the bin8 tilt series used for reconstruction and landmark search, so that small high-contrast landmarks are not averaged out. As a result, these tomograms will look a bit strange, since dark features often appear larger than expected (the final tomogram output still uses the mean-shrunk tilt series).
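
The difference between shrinking by minimum and shrinking by mean can be illustrated with a toy example in plain Python. This sketch is not the EMAN2 implementation. It assumes that landmarks are dark, i.e. have low pixel values.

```python
def shrink(image, factor, reducer):
    """Downsample a 2D list by applying reducer to each factor x factor tile."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            tile = [image[r + dr][c + dc]
                    for dr in range(factor)
                    for dc in range(factor)]
            out_row.append(reducer(tile))
        out.append(out_row)
    return out


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# A 4x4 background of 1.0 with a single dark pixel (a small landmark).
image = [[1.0] * 4 for _ in range(4)]
image[1][2] = 0.0

print(shrink(image, 2, min))   # the landmark keeps its full contrast
print(shrink(image, 2, mean))  # the landmark is diluted by its neighbours
```

With the minimum, the dark pixel fills its entire output tile, which is also why dark features look larger than expected in the intermediate tomograms.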

Once finished, bin4 versions of the tomograms are written to ‘tomograms/xxbin4.hdf’.

4. Evaluate reconstructions

Now that your tomogram(s) have been reconstructed, it’s a good idea to take a look at the results before performing subsequent processing and analysis. While the default reconstruction parameters have successfully reconstructed a wide variety of tomograms, it may be necessary to perform additional rounds of reconstruction to enhance contrast or reduce artifacts through improved tilt image alignment. To investigate the quality of your reconstructions, we recommend using the e2tomo_eval.py program, which is available from the GUI by first double clicking “Analysis and Visualization”, selecting “Evaluate tomograms”, and clicking “Launch.”

Once launched, the main window will appear. In the table on the left, you’ll find a list of each tomogram reconstructed in this project, the number of subtomogram boxes stored for each tomogram, and a “loss” value, which corresponds approximately to the alignment error in nanometers.

On the right is a blank image display that will show a central x-y slice of a reconstruction when one is selected by clicking a row in the table on the left. Below this image display are a series of buttons. The “Show2D” button will open a slice-wise display of the selected tomogram.

The “Boxer” button will open the e2spt_boxer22.py program used to box particles in 3D for later extraction. This is a cleaned-up version of the previous ‘e2spt_boxer’ that supports the current metadata functionality for boxing multiple types of particles. More details about this process will be provided in a later step focusing on subtomogram boxing in EMAN2.

The “Refresh” button should be pressed anytime project parameters change while this program is open. For example, if new particles are boxed, pressing “Refresh” will update the #box column values to include your selected/removed particles.

The “TiltParams” button will bring up a plot window to display alignment data for each tilt image. The display will show columns 0 and 1 of the tilt parameters matrix corresponding to the x and y-translation of each image in the tiltseries; however, there are a total of 5 columns. In order, these correspond to x-translation (tx), y-translation (ty), in-plane rotation (alpha), tilt about the y-axis (ytilt), and tilt about the x-axis (xtilt). To switch which columns are plotted along the X and Y axes, simply middle-click the plot and scroll through the “X Col” and “Y Col” boxes in the inspector window that appears.

The “PlotLoss” button will bring up a 2D plot window showing the values of the “loss” function during tiltseries alignment. Typically, the deeper the trough you observe, the better the tomogram reconstruction will be. While this plot is useful for debugging certain alignment problems, we do not recommend attempting to interpret this beyond choosing whether to repeat the reconstruction process using different parameters.

Finally, the “PlotCtf” button will bring up a 2D plot window showing the defocus for each tilt image by default. When the plot is middle-clicked, an inspector window will appear in which different columns can be selected.

5. Tomogram annotation/segmentation

If you wish to annotate the reconstructed tomograms, EMAN2 offers automated procedures to accomplish this. For details, see the tutorial at the following link: http://blake.bcm.edu/emanwiki/EMAN2/Programs/tomoseg.

Note that recent changes to the EMAN2 tomography workflow have replaced the “TomoSeg” dropdown menu item with “Tomo”. However, you can find the same segmentation tools described in the tutorial under the “Segmentation” Workflow Menu item after selecting the “Tomo” workflow in the EMAN2 project manager.

For more details about the automated EMAN2 segmentation protocol, see: Chen, M., Dai, W., Sun, S. Y., Jonasch, D., He, C. Y., Schmid, M. F., Chiu, W., and Ludtke, S. J. (2017), Convolutional Neural Networks for Automated Annotation of Cellular Cryo-electron Tomograms. Nature Methods. 14: 983-98

https://www.nature.com/protocolexchange/protocols/6171

While EMAN2 does not offer solely manual segmentation utilities, we do offer some semi-automated routines for drawing curves and contours. Currently, these are made available through two programs, namely e2tomo_drawcurve.py and e2tomo_drawcontour.py; however, we anticipate incorporating them into a single program in the future alongside other semi-automated annotation tools.

e2tomo_drawcurve.py is a simple GUI tool for manually tracing curves in a reconstructed tomogram. When you only have a few fibers to segment in a tomogram, this sort of semi-automated segmentation can be easier than fully automated methods. The program has a simple built-in traveling salesman problem (TSP) solver, so the user does not have to add points sequentially. Instead, one can anchor the two ends of a fiber and add points between them; the program builds the minimal path that visits all selected points, improving the overlap between the curve and the feature(s) of interest.
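
The idea of anchoring two end points and letting a solver order the points in between can be sketched with a brute-force search. This is only an illustration of the concept and is not the solver used by e2tomo_drawcurve.py. It tries every ordering, so it is practical only for a handful of points.

```python
from itertools import permutations
from math import dist


def path_length(points):
    """Total Euclidean length of the polyline through points, in order."""
    return sum(dist(a, b) for a, b in zip(points, points[1:]))


def shortest_anchored_path(start, end, between):
    """Order `between` so that start -> ... -> end is as short as possible."""
    best = None
    for order in permutations(between):
        candidate = [start, *order, end]
        if best is None or path_length(candidate) < path_length(best):
            best = candidate
    return best


# Points clicked out of order along a roughly straight fiber.
clicked = [(3.0, 0.2, 0.0), (1.0, 0.1, 0.0), (2.0, -0.1, 0.0)]
path = shortest_anchored_path((0.0, 0.0, 0.0), (4.0, 0.0, 0.0), clicked)
for point in path:
    print(point)
```

In this example the clicked points come back sorted along the fiber, regardless of the order in which they were added.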

To use this program, click on the terminus of one feature to add the first point. Next, hover above the opposite terminus and click again. To add a new contour, hold “ctrl” and click. To remove a point, hold “shift” and click. The program will save points as a text file or pdb file depending on the user’s preference. EMAN2 also offers separate scripts to interpolate the points, extract particles along the curve as subvolumes, and refine the position of points by alignment.

Similar to ‘e2tomo_drawcurve.py’, e2tomo_drawcontour.py allows users to annotate closed contours in a semi-automated manner. It also has a built-in TSP solver for building the minimum loop and uses a simple SNAKE algorithm for fitting the contour from the previous slice to the next slice by simply pressing the “shift” key on an adjacent slice. Currently, annotation output is generated as a point cloud text file, which can be converted into a density map and displayed in rendering programs such as Chimera.

7. Subtomogram boxing

EMAN2 provides users with a GUI for manually boxing subtomogram volumes from 3D tomographic reconstructions; however, we also provide tools for automated picking via reference and clipping boxes from annotations obtained using our automated segmentation workflow.

Note that regardless of how boxes are picked, their coordinates are stored in json files corresponding to each tomogram in a project, which are kept in the “info” directory that houses all project metadata. To view box parameters, we recommend opening a particular tomogram using e2spt_boxer22.py, which is accessible via the e2tomo_eval.py GUI or via the EMAN2 project manager GUI under “Manual boxing” within the “Subtomogram averaging” workflow menu item.

Manual Boxing

Particularly when analyzing in situ datasets, there are often more than one type of particle, and the same dataset can be used to study many things. Our latest boxer GUI is designed to handle multiple particle labels when boxing manually or with a reference or prior segmentation. This allows users to explore multiple protein targets within the same EMAN2 tomography project.

To manually box particles using the latest boxer program, double click on “Subtomogram averaging” in the workflow menu and select manual boxing. In the program interface window, click “Browse”, select “cryo.hdf”, and click “OK.” Next, click “Launch.” This will open the e2spt_boxer22.py widget. In cases where you have multiple tomograms to box manually, we recommend accessing the boxer from the e2tomo_eval.py program instead, as it is more convenient.

Three windows will appear when the boxer GUI is launched. The main window shows a large XY view of the specified tomogram. The left column shows the current YZ-slice and the lower image displays the current XZ-slice, which can be manipulated by dragging the slider on the far right. The current box size can be changed in the Box Size input box in the lower left corner. Additionally, multiple slices can be averaged using the integer scroll box. To average all slices, click the “MaxProj” button. Occasionally, it is helpful to filter the slices to exaggerate particle features. This can be done by dragging the Filt slider bar. Magnification is controlled using the Sca slider bar.

When you click on a particle in the tomogram, a box will appear. To move a box, click and hold, then drag it to a new location. To erase a box, hold shift and click on the box you wish to remove.

Boxed particles will appear in the “Particles List” window. They can be easily removed by holding “shift” and clicking particles in this window. This is particularly helpful when trying to remove contaminants after performing automated boxing.

The “Options” window has two main sections. The top bar turns on and off a particle box eraser. When this box is checked, your mouse clicks and drags will erase particles falling within the “Radius” in the Options window. Below this is a “Sets” tab. Here users can assign names to sets, create new sets, delete sets, and save sets.

Once you have boxed various features of interest, simply close the e2spt_boxer22.py program; all particles will remain in the project metadata for subsequent extraction.

Note: If you box particles in a reconstruction and perform a second reconstruction that alters the alignment parameters, particle boxes may no longer correspond to the tomogram. In such a case it is necessary to re-box all particles (recommended) or manually manipulate the box coordinates in the metadata (NOT recommended).

Reference-based Boxing

Often, rather than selecting particles by hand, it is more efficient to detect features of interest by cross-correlating a reference map with the 3D tomogram reconstruction to identify candidate particle coordinates. When dealing with purified samples, this approach is often faster than CNN-based boxing but can produce more false positives. To perform reference-based boxing from the EMAN2 project manager, click “Reference-based boxing” under the “Subtomogram averaging” workflow menu item. Next, click “Browse” next to the tomograms file box, select “cryo.hdf”, and click “OK.” Next, click “Browse” next to the reference file box, select the 3D reference/template map you wish to use for boxing, and click “OK.” If you wish to uniquely identify particles boxed with this reference and this parameter set, type a simple, unique identifier into the “label” box (such as “ribo1”) and click “Launch.”

Note that the name of boxed particles can be easily changed later via the e2spt_boxer22.py program. However, if particles are automatically boxed with non-optimal parameters and the automated procedure must be repeated, it is sometimes helpful to hold onto the original boxing results for comparison. By labeling each instance of reference-based picking uniquely, we maintain a record of how the boxer performed given different parameters. Once we’re happy with the reference-based picking results, we can rename the particle set accordingly (e.g. “ribo” instead of “riboN”).

If the default parameters do not provide satisfactory results, the following parameters can be manipulated:

delta : delta angle for generating rotated references.
dthr : minimum distance between particles.
vthr : n-sigma value threshold for particles from the output correlation. Default is 2.
nptcl : maximum number of particles.
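
To show how vthr, dthr, and nptcl interact, here is a hedged sketch of a generic peak-selection step. It is not the EMAN2 implementation. The function name, the candidate format, and the explicit map_mean/map_sigma arguments are assumptions made for illustration.

```python
from math import dist


def select_peaks(candidates, map_mean, map_sigma, vthr, dthr, nptcl):
    """Pick particle coordinates from (score, (x, y, z)) candidates.

    vthr  : keep only scores of at least map_mean + vthr * map_sigma
    dthr  : minimum allowed distance between two kept particles
    nptcl : maximum number of particles to keep
    """
    threshold = map_mean + vthr * map_sigma
    kept = []
    # Visit the strongest correlation peaks first.
    ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
    for score, xyz in ranked:
        if score < threshold or len(kept) >= nptcl:
            break
        # Reject peaks that crowd an already accepted particle.
        if all(dist(xyz, other) >= dthr for _, other in kept):
            kept.append((score, xyz))
    return kept


# Made-up correlation peaks: two close together, one isolated, one weak.
peaks = [
    (5.0, (10, 10, 10)),
    (4.5, (12, 10, 10)),
    (3.0, (40, 40, 10)),
    (1.0, (70, 70, 10)),
]
picked = select_peaks(peaks, map_mean=0.0, map_sigma=1.0,
                      vthr=2.0, dthr=8.0, nptcl=10)
print(picked)
```

In this toy case, the second peak is rejected for being too close to the first, and the last peak falls below the n-sigma threshold. Raising vthr or dthr trades false positives for missed particles.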

Segmentation-based Boxing

If you have segmented a tomogram in EMAN2, the segmentation can be used directly to produce boxes for subtomogram averaging. However, currently this operation is only supported from the command line.

First, to generate particle coordinates from the segmentation output, run:

extractptclfromseg.py <segmentation output> <input tomogram for segmentation> --thresh <intensity threshold in the segmentation output>

Note that the second argument has to be the tomogram you provided for the 'apply to tomogram' step in the Tomoseg workflow.

If you are segmenting continuous features (e.g. microtubules) and there are no individual particles, run:

extractptclfromseg.py <segmentation output> <input tomogram for segmentation> --thresh <intensity threshold in the segmentation output> --random <number of particles>

This will seed particle coordinates at random points where the intensity of the segmentation output is above the threshold value.
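
The random seeding described above can be sketched as follows. This is an illustration of the idea rather than the code in extractptclfromseg.py. The function name, the nested-list volume indexed as volume[z][y][x], and the seed argument are assumptions of this sketch.

```python
import random


def seed_random_particles(volume, thresh, count, seed=None):
    """Choose up to `count` random (x, y, z) voxels with value above thresh.

    volume is a nested list indexed as volume[z][y][x].
    """
    above = [
        (x, y, z)
        for z, plane in enumerate(volume)
        for y, row in enumerate(plane)
        for x, value in enumerate(row)
        if value > thresh
    ]
    rng = random.Random(seed)
    # Sample without replacement so that no voxel is picked twice.
    return rng.sample(above, min(count, len(above)))


# A tiny 2x2x2 "segmentation" with four voxels above 0.5.
volume = [
    [[0.1, 0.9], [0.8, 0.2]],
    [[0.95, 0.05], [0.3, 0.7]],
]
print(seed_random_particles(volume, thresh=0.5, count=3, seed=0))
```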

The program will write particle coordinates to the standard EMAN2 particle metadata corresponding to the input tomogram, the same as manual particle boxing. The extracted particles can therefore be viewed in the tomogram using:

e2spt_boxer22.py <input tomogram for segmentation>

You can manually add or remove particles in the GUI. Once you are satisfied, you can generate particles from the e2spt_boxer GUI. If you are confident in the automated segmentation and do not want to go through the spt_boxer step, or you want to extract particles from the raw unbinned tomogram, run:

extractptclfromseg.py <raw tomogram> <input tomogram for segmentation> --genptcls <output particle stack name> --boxsz <box size>

The first argument can be any binned or filtered version of the tomogram and the second argument has to be the same as the argument in the previous extractptclfromseg command.

If you have a binned particle stack from somewhere else (like e2spt_boxer), this program also allows you to extract the same particles from the unbinned raw tomogram using:

extractptclfromseg.py <raw tomogram> <input particle stack> --genptcls <output particle stack name> --boxsz <box size>

8. Measure CTF (determine per-particle defoci)

EMAN2 can perform CTF correction on an individual particle level using the program e2spt_tomoctf.py, which is vital to obtaining high resolution beyond the spatial frequency of the first CTF zero. The important thing to know is that we use low-tilt, high-signal information to constrain the defocus range searched when measuring the defocus of each particle locally within each tilt in a tiltseries. EMAN2 can also handle phase plate data, but that is beyond the scope of this tutorial.

To perform per-particle, per-tilt CTF correction, navigate to the “Subtomogram averaging” workflow menu item and select “CTF correction.” Click “Browse” and select 1 or more tiltseries. In this case, select “cryo.hdf”. Next, select the defocus range to search by inputting values in the “dfrange” box (minimum defocus, maximum defocus, defocus step). Specify the voltage of the microscope used as well as its Cs value and click “Launch.” CTF parameters will be stored in json metadata files contained within the project “info” directory.
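
As an illustration of what a (minimum, maximum, step) defocus range implies, the sketch below expands such a triple into the list of candidate defocus values that a search would test. The comma-separated string format and the function name are assumptions made for this sketch. Check the parameter tooltip in the program interface for the exact format and units.

```python
def defocus_candidates(dfrange):
    """Expand a "min,max,step" string into the defocus values to test."""
    low, high, step = (float(part) for part in dfrange.split(","))
    if step <= 0:
        raise ValueError("step must be positive")
    if high < low:
        raise ValueError("maximum defocus must not be below the minimum")
    # The small epsilon guards against floating-point round-off in the division.
    count = int((high - low) / step + 1e-9) + 1
    return [round(low + i * step, 6) for i in range(count)]


values = defocus_candidates("2.0,4.0,0.5")
print(values)
print(len(defocus_candidates("2,4,0.1")))
```

A finer step gives a more precise defocus estimate at the cost of more candidates to evaluate for each tilt.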

9. Extract subtomograms

Once particles have been boxed and (optionally) CTF corrected, it is time to extract them using the e2spt_extract.py program. To perform particle extraction via the project manager interface, click “Extract particles” in the workflow window, click “Browse” next to the tomograms file box, and select “cryo.hdf”. Specify the box size you wish to use via the “boxsz” parameter. Here I am using 32. If label is not specified, all labeled particle sets will be extracted. If you did not perform CTF correction, check the “noctf” box. Otherwise, we recommend performing Wiener filtering, so check the “wiener” box. Finally specify the number of threads you wish to use for this process. On my core i7 system, I am choosing 8. Click “Launch.”

While running, this program will generate bin4 particles from tomograms via e2spt_boxer.py. Additionally, unbinned 3D particles will be generated using e2spt_subtilt.py. This will take coordinates from bin4 particles and map them back to raw tilt series and generate 2D sub-tilts for each particle, and reconstruct 3D subtomograms. If CTF information exists in a tomogram’s metadata, the program will use that information to calculate the defocus of each particle, and flip the phase of 2D sub-tilt images before making 3D volumes.
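The coordinate mapping described above can be pictured with a minimal geometric sketch. It assumes an idealized single-axis tilt about the y axis, coordinates measured from the tomogram center, and a particular sign convention for the tilt angle; the real program works from the transforms determined during tilt series alignment, so treat this only as an illustration. Both function names are hypothetical.

```python
import math


def unbin(coord, bin_factor=4):
    """Scale a coordinate picked in a binned tomogram to unbinned pixels."""
    return tuple(c * bin_factor for c in coord)


def project_to_tilt(coord, tilt_deg):
    """Project a 3D point (x, y, z) onto the 2D tilt image at tilt_deg.

    Assumes rotation about the y axis only, with the origin at the
    tomogram center. The sign of the rotation is an assumption.
    """
    x, y, z = coord
    t = math.radians(tilt_deg)
    return (x * math.cos(t) + z * math.sin(t), y)


# Example: a particle picked at (10, -5, 2) in a bin4 tomogram.
center_unbinned = unbin((10, -5, 2))
subtilt_centers = [project_to_tilt(center_unbinned, a) for a in (-60, 0, 60)]
```

Each projected position is where a 2D sub-tilt box would be centered for that particle in the corresponding tilt image.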

Additional options for more advanced usage include:
padby : padding factor when extracting sub-tilt images. Default is 2. The program will also pad by an extra 1.5x when doing 3D reconstruction. It seems that padding is never enough.
maxtilt : maximum tilt angle to include in the reconstruction. This differs slightly from the same parameter in other programs, since it affects the raw 3D particle that goes into alignment. However, it still seems to have little effect in practice.

Output sub-tilt images (particles extracted from 2D tilt images) are written to ‘particles/’ and 3D subtomograms are placed in ‘particles3d/’, under the same name as the input, but without the ‘binX’ tag.

10. Build sets

When using particles from multiple tomograms, it is convenient to reference them as a single particle stack. To accomplish this, we create list files using e2spt_buildsets.py. From the GUI, we perform this process by clicking on the “Build sets” workflow menu item. Next, click “Browse” and select the particle stacks from each reconstructed tomogram. Note, however, that you should not include particle sets generated during segmentation. Once all files are selected, click “OK”, check “allparticles” in the program interface window, and click “Launch”. Almost instantaneously, this program will create a “sets” directory and generate a single list file for each particle type assigned a label during subtomogram boxing.
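To make the idea of "one list file per label" concrete, here is a small sketch that groups particle stacks from several tomograms by label. It assumes, purely for illustration, that each stack is named `<tomogram>__<label>.hdf`; that naming convention, the function name, and the example file names are assumptions and may not match what your project contains. It does not write real list files.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_label(particle_files, sep="__"):
    """Map each label to the particle stacks that carry it.

    Assumes file stems of the form '<tomogram><sep><label>' (hypothetical).
    Stacks without the separator are collected under 'unlabeled'.
    """
    groups = defaultdict(list)
    for name in particle_files:
        stem = PurePosixPath(name).stem
        _tomo, found, label = stem.rpartition(sep)
        if not found:
            label = "unlabeled"
        groups[label].append(name)
    # One would-be list file per label, each referencing all matching stacks.
    return {f"sets/{label}.lst": sorted(files) for label, files in groups.items()}


stacks = [
    "particles3d/tomoA__ribosome.hdf",
    "particles3d/tomoB__ribosome.hdf",
    "particles3d/tomoA__gold.hdf",
]
sets = group_by_label(stacks)
```

The point of the grouping is that downstream steps can refer to a single set per particle type rather than to each tomogram's stack individually.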

11. Generate initial model(s)

Reference-free initial modeling is critical for discovering unknown proteins in cells. EMAN2 utilizes a stochastic gradient descent (SGD) approach to perform reference-based and reference-free initial model generation for subtomogram averaging. The process starts by averaging particles in random orientations and gradually converges upon an initial model.

Convergence of PSII arrays on thylakoid membranes.

To perform initial modeling via the EMAN2 project manager GUI, click on “Generate initial model” in the workflow menu and click “Browse.” Next, select the set containing the relevant particles you wish to use to create an initial model (e.g. sets/particles_00.lst). If you wish to perform reference-based initial modeling, simply specify a reference using “ref”. Also, if you suspect some symmetry, it can be specified using the “sym” and “applysym” options shown in the program interface. Otherwise, we recommend using the default values initially, so click “Launch.”

Additional options are available for special cases:
filterto : filter maps to a certain resolution.
learnrate : increment of map per iteration. We have noticed that increasing this value for higher symmetry objects helps improve convergence.
batchsize : number of particles in each batch. Since the multithreading in the program is based on the batches, it will go through all particles faster if batchsize is larger. However, changing this may also impact convergence.
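The roles of “learnrate” and “batchsize” can be illustrated with a toy update loop. This is a deliberately simplified sketch, not EMAN2's implementation: maps are flat lists of numbers, the per-batch alignment of particles to the current model is omitted entirely, and “learnrate” is interpreted as the fraction of the way the model moves toward each batch average. All function names are hypothetical.

```python
def batch_average(particles):
    """Voxel-wise mean of a batch of equally sized 'maps' (flat lists)."""
    n = len(particles)
    return [sum(vals) / n for vals in zip(*particles)]


def sgd_update(model, batch_avg, learnrate):
    """Move the model a fraction 'learnrate' of the way to the batch average."""
    return [m + learnrate * (b - m) for m, b in zip(model, batch_avg)]


def run_sgd(model, particles, batchsize, learnrate):
    """Process particles batch by batch, updating the model after each batch.

    In a real run each batch would first be aligned to the current model;
    that step is left out of this sketch.
    """
    for start in range(0, len(particles), batchsize):
        batch = particles[start:start + batchsize]
        model = sgd_update(model, batch_average(batch), learnrate)
    return model
```

With a larger batchsize there are fewer updates per pass through the data, and with a larger learnrate each update moves the model further, which is one way to see why both settings can affect convergence.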

Initial models are saved in folders named ‘sptsgd_XX’ which contain 3 files. ‘Ref.hdf’ or ‘input_model.hdf’ is the initial model from random averaging or user input. ‘Output.hdf’ is the current output, which is updated after each batch (so it is always the latest model if one terminates the program before it finishes). ‘Tmpout.hdf’ is a stack of the outputs from each batch.

12. “Gold standard” 3D subtomogram refinement

Once an initial model is obtained, our “gold standard” subtomogram alignment and averaging routine can be used to produce an initial reconstruction. Specifically, after dividing our data into even and odd halves, as has become standard practice, we perform missing-wedge aware subtomogram alignment and averaging. Results are post-processed and filtered by the local or global even/odd FSC, and a final map is generated after each iteration. The final result is an FSC-filtered map.
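For readers unfamiliar with the even/odd FSC, the toy example below computes a Fourier shell correlation between two one-dimensional "half maps". It is only meant to show what is being measured: in 1D a "shell" is just the pair of frequencies +k and -k, whereas real maps use spherical shells in 3D. The function names are hypothetical and this is not how EMAN2 computes its FSC curves.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a real 1D signal."""
    n = len(signal)
    return [sum(signal[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
            for k in range(n)]


def fsc_1d(half_a, half_b):
    """Correlation between two equal-length signals for shells k = 1 .. n//2."""
    fa, fb = dft(half_a), dft(half_b)
    n = len(half_a)
    curve = []
    for k in range(1, n // 2 + 1):
        shell = {k, n - k}  # +k and -k; a single index at the Nyquist frequency
        num = sum((fa[i] * fb[i].conjugate()).real for i in shell)
        den = math.sqrt(sum(abs(fa[i]) ** 2 for i in shell)
                        * sum(abs(fb[i]) ** 2 for i in shell))
        curve.append(num / den if den > 0 else 0.0)
    return curve


def split_even_odd(particles):
    """Divide a particle list into the two independently processed halves."""
    return particles[0::2], particles[1::2]
```

Two identical half maps give a correlation of 1 in every shell; as the halves disagree at finer detail, the curve falls toward 0 at higher spatial frequencies, and that falloff is what drives the filtering described above.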

To perform this series of tasks, you can either run e2spt_refine.py from the command line or use the following steps to access this program from the EMAN2 project manager. Begin by navigating to “Subtomogram Averaging” in the EMAN2 workflow menu and click “3D refinement”. Next to the particles file box, click “Browse”, select the “Ribosome.lst” particle set, and click “OK”. Next to the reference file box, click “Browse”, select the initial model you generated previously, and click “OK”. In the “niter” box, type “4”; in the “sym” box, type “c1”; in the “mass” box, type “3200”; and in the “tarres” box, type “10”. In the “threads” box, specify the number of cores to use when running this process. Once you are done, click “Launch” to begin iterative 3D subtomogram alignment.

Internally, e2spt_refine.py will scale and clip the reference to the size of the particles and run the specified number of rounds of ‘e2spt_align.py’, ‘e2spt_average.py’, and ‘e2refine_postprocess.py’. Required options (entered as per the instructions above) include:
niter : number of iterations. Default is 5.
threads : number of threads to use. Threading is the only form of parallelism supported.
mass : mass of the particle, used for normalization in ‘e2refine_postprocess.py’.
tarres : target resolution used in ‘e2refine_postprocess.py’.

Additional options for more advanced usage include:
goldstandard : followed by a resolution number for phase randomization.
setsf : a structure factor text file, if you have one. Otherwise, no structure factor will be applied.
pkeep : fraction of particles to keep. The program will compute a ‘--simthr’ value for ‘e2spt_average.py’ that keeps this fraction in each iteration.
mask : how to mask after each iteration. It accepts a mask processor such as ‘mask.soft:outer_radius=-1’ or the file name of a mask.
maxtilt : maximum tilt angle for ‘e2spt_average.py’.
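If you prefer to script the refinement rather than use the project manager, a command line can be assembled from the same values entered in the GUI. The sketch below only builds and prints the command; it does not run anything. It assumes that each GUI field maps to a like-named `--option=value` flag, that the particle set is given as a positional argument, and that the reference is passed as `--reference`. None of these are confirmed by this tutorial, so check the program's own help output before relying on them. The reference path in the example is hypothetical.

```python
import shlex


def build_refine_command(particles, reference, niter=4, sym="c1",
                         mass=3200, tarres=10, threads=4):
    """Assemble an e2spt_refine.py command line as a list of arguments.

    Flag names mirror the GUI field names (an assumption, not verified).
    """
    return [
        "e2spt_refine.py",
        particles,
        f"--reference={reference}",  # assumed flag name for the reference box
        f"--niter={niter}",
        f"--sym={sym}",
        f"--mass={mass}",
        f"--tarres={tarres}",
        f"--threads={threads}",
    ]


# Hypothetical file names; substitute the set and initial model from your project.
cmd = build_refine_command("sets/Ribosome.lst", "sptsgd_00/output.hdf", threads=8)
print(shlex.join(cmd))
```

Keeping the command as a list makes it straightforward to hand to `subprocess.run` later, and `shlex.join` gives a copy-and-paste friendly string for your notes.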

13. Sub-tilt refinement

One of the trends leading the field of single particle tomography toward and even beyond subnanometer resolution is the use of per-particle, per-tilt methods. In EMAN2, we facilitate per-particle per-tilt CTF correction, per-particle per-tilt alignment, and bad-tilt exclusion within particles. By correcting for these distortions via per-particle per-tilt alignment methods, we obtain higher fidelity subtomograms that yield improved resolution when averaged with other refined subvolumes.

In the workflow menu under “Analysis and Visualization”, click “Sub-tilt refinement”. Specify the path to the “spt_XX” directory corresponding to the final 3D refinement. In the “iter” box, type 3 to run for 3 iterations. In the “threads” box, specify the number of cores to use when running this process. Check the “dopostp” box and click “Launch”.

e2spt_tiltrefine.py takes the results from an SPT alignment and uses them to refine the alignment of 2D particles in the sub-tilt images. The alignment is done in the gold-standard way, using ‘threed_xx_even.hdf’ and ‘threed_xx_odd.hdf’ as references. It combines the transforms from tilt series alignment and subtomogram alignment to compute the initial alignment of each sub-tilt, and only performs a ‘refine’ alignment from that starting point, so the result should not be far off. Since we have the correlation from the per sub-tilt alignment, we weight the sub-tilts based on the correlation and exclude the worst (currently 50% of sub-tilts) instead of simply excluding high angle tilts. In our experiments the correlation score and tilt angle seem to be highly correlated, but excluding images based on correlation gives better results than excluding them based on tilt angle.
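The selection step described above can be sketched as follows for a single particle. The example assumes each sub-tilt has a correlation score where higher means better and all scores are positive; EMAN2's actual score convention and weighting scheme may differ, and the function name is hypothetical.

```python
import math


def select_subtilts(scores, keep=0.5):
    """Keep the best-correlating fraction of sub-tilts and weight them.

    scores: list of (tilt_angle, correlation) pairs for one particle,
            with higher correlation meaning a better match (assumption).
    Returns (tilt_angle, weight) pairs whose weights sum to 1.
    """
    n_keep = max(1, math.ceil(len(scores) * keep))
    ranked = sorted(scores, key=lambda s: s[1], reverse=True)
    kept = ranked[:n_keep]
    total = sum(corr for _, corr in kept)
    return [(angle, corr / total) for angle, corr in kept]


# Hypothetical scores for one particle across four tilts.
example = [(0, 0.9), (20, 0.8), (40, 0.5), (60, 0.2)]
selected = select_subtilts(example, keep=0.5)
```

In this made-up example the high-angle tilts happen to score worst, so the outcome matches a tilt-angle cutoff; the advantage of ranking by correlation is that a poor low-angle image would also be dropped.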

e2spt_tiltrefine.py --path <existing spt_xx path> --iter <current iteration in spt_xx>

--path : an spt_xx path that has the output from ‘e2spt_align.py’, ‘e2spt_average.py’ and optionally ‘e2refine_postprocess.py’.
--padby : padding factor for reconstruction of subtomograms. Default is 2.
--keep : fraction of sub-tilts to keep. Default is 0.5.
--maxres : maximum resolution for comparison in the alignment. When running from ‘e2spt_refine.py’, it will use the 0.3 cutoff of the FSC.
--unmask : use the unmasked maps as reference.
--maxalt : exclude high tilt images.

Output files include:
threed_xx_ali.hdf : 3D map output.
threed_xx_ali_even/odd.hdf : even/odd sub-maps.
fsc_xx_ali.txt : FSC curve after refinement. The program will also rename the existing FSC of this iteration to fsc_xx_raw.txt, since e2refine_postprocess overwrites it.

Ribosome (EMPIAR-10064) maps and FSCs before and after 1 round of sub-tilt refinement.

14. Evaluate SPT refinements

We have implemented a program called e2spt_eval.py that allows users to assess all SPT refinements performed within a given project. To run this program from the project manager, double click “Analysis and visualization” in the Workflow menu and select “Evaluate SPT refinements.” Then click “Launch.”

The window that appears will show a large table on the left with rows corresponding to each SPT refinement performed. Clicking a row will show a 3D view of the map produced during the final iteration of the selected refinement. If you click “ShowBrowser,” a browser window will appear that changes to the currently selected refinement directory.

The “PlotParams” button will bring up a 2D plot where you can explore the per-particle alignment parameters for each particle used in a given refinement. Here you can also examine iteration to iteration values to explore properties such as convergence or parameter-dependent clustering of particle data (as would be seen in cases with strong preferred orientations, or possibly in the presence of significant ice contamination).

Finally, the “PlotFSCs” button will bring up a window showing the FSC calculated for each iteration during which post processing was performed. This is helpful when examining convergence as well as determining the final gold standard resolution of a given SPT refinement.
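Reading a resolution off an FSC curve amounts to finding the spatial frequency at which the curve first drops below a chosen threshold. The sketch below does this with linear interpolation between the two neighboring points. It takes the curve as two plain lists (spatial frequency in 1/Å and FSC value) rather than reading any particular file, and it uses 0.143, the threshold commonly quoted for gold standard refinements, as the default. The function name is hypothetical.

```python
def resolution_at_threshold(freqs, fsc, threshold=0.143):
    """Return the resolution (in Å) where the FSC first falls below threshold.

    freqs: spatial frequencies in 1/Å, in increasing order.
    fsc:   FSC values at those frequencies.
    Returns None if the curve never crosses the threshold.
    """
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            # Linear interpolation between the two bracketing points.
            frac = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            crossing = freqs[i - 1] + frac * (freqs[i] - freqs[i - 1])
            return 1.0 / crossing
    return None
```

Comparing this value from one iteration to the next is a simple numerical counterpart to inspecting the plotted curves for convergence.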

15. Addressing heterogeneity

15a. Multi reference refinement

In the workflow menu under “Analysis and Visualization”, click “Multi-reference refinement”. Next to the “particles” box, click “Browse”. Select the set containing your ribosome particles (“Ribosome.lst”) and click “OK”. Specify reference maps corresponding to the various states you hope to draw out of the data. If performing a focused classification of particles, specify a mask in the “mask” file box. Use “Browse” to search for this file. In the “threads” box, specify the number of cores to use when running this process. In the “tarres” box, specify the resolution target for multi-model refinement. In the “mass” box, type “3200”. Click “Launch”.

15b. Focused classification

15c. MSA/PCA split method