EMAN2 Tomography Workflow Tutorial
- This tutorial is best suited for EMAN2 built after 09/27/2018. Not everything described in the tutorial was functioning yet in the 2.22 release.
This tutorial uses data from EMPIAR: EMPIAR 10064 (only the 4 mixed CTEM tilt series)
- Time estimates for each step are for a well-configured tomography workstation with a high-speed disk, 64+ GB of RAM and 16+ cores.
- The pixel size in the file headers provided by EMPIAR is incorrect. Specify the correct Apix value (2.62) when importing the images.
Prepare input files (~2 minutes)
- Make a new empty folder for the project and 'cd' into that folder
- Make sure any EMAN2 commands you run are executed from within this folder (not any subfolder)
- You may use "Edit Project" from the Project menu to set default values for the project. While not required, it reduces later errors.
- Make sure the workflow mode is set to "TOMO" not "SPR"
- Raw Data -> Import tilt series
  - Select the files, and make sure importation is set to copy
  - In this step you should enter the correct A/pix for your data in the apix box. For EMPIAR 10064, this is 2.62. For your own data, you need to know this number. In later steps you should be able to use -1 (default) for apix.
  - If your tilt series is not a single stack file but many individual images, you will need to use Generate tiltseries to build an image stack. This is not necessary for the tutorial data.
  - Once the options are set, press Launch
- It is critical that the filenames for your data not contain any spaces (replace with underscore) or periods (other than the final period used for the file extension). "__" (double underscore) is also reserved for describing modified versions of the same file, and should not be used in your original files.
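The naming rules above can be checked before import. The following is a minimal sketch in plain Python, not part of EMAN2; the helper names `check_tiltseries_name` and `suggest_name` are hypothetical and only illustrate the three rules (no spaces, no extra periods, no double underscore).

```python
import os

def check_tiltseries_name(path):
    """Return a list of problems with a tilt-series file name (empty if none)."""
    name = os.path.basename(path)
    problems = []
    if " " in name:
        problems.append("contains spaces")
    # Only the final period (before the file extension) is allowed.
    if name.count(".") > 1:
        problems.append("contains periods other than the extension separator")
    # "__" is reserved for modified versions of the same file.
    if "__" in name:
        problems.append("contains a double underscore")
    return problems

def suggest_name(path):
    """Suggest a compliant name: spaces and extra periods become underscores."""
    name = os.path.basename(path)
    stem, ext = os.path.splitext(name)
    stem = stem.replace(" ", "_").replace(".", "_")
    # Collapse any double underscores created or already present.
    while "__" in stem:
        stem = stem.replace("__", "_")
    return stem + ext
```

For example, `suggest_name("my tomo.v2.mrc")` returns `"my_tomo_v2.mrc"`.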
Tiltseries Alignment and Tomogram Reconstruction (20 min)
Alignment of the tilt-series is performed iteratively in conjunction with tomogram reconstruction. Tomograms are not normally reconstructed at full resolution, generally limited to 1k x 1k or 2k x 2k, but the tilt-series are aligned at full resolution. For high resolution subtomogram averaging, the raw tilt-series data is used, based on coordinates from particle picking in the downsampled tomograms. On a typical workstation reconstruction takes about 4-5 minutes per tomogram.
For the tutorial tilt-series:
- 3D Reconstruction -> Reconstruct Tomograms
  - check alltiltseries
    - alternatively you can select one or more tilt series from the tiltseries folder
  - check correctrot
  - tltstep = 2
  - clipz = 64
  - If you wish to look at the intermediate aligned tilt-series and other files, uncheck notmp
    - This is not required for the remaining steps in the tutorial, but can be used to help understand how the tomogram alignment works. This requires significant additional disk space. You may consider doing this for only one tomogram.
    - In each tomorecon_XX folder
      - landmark_0X.txt has the location of the landmarks (which may be fiducials if present) in each iteration
      - samples_0X.hdf shows the top and side view of those landmarks
      - ptclali_0X.hdf has the trace of each landmark throughout the tilt series (they should stay at the center of image all the time if the alignment is good)
      - tomo_0X.hdf is the reconstruction after each iteration
  - Launch
When working with your own data:
- Either specify the correct tltstep if the tilt series is in order from one extreme to the other, or specify the name of a rawtlt file (as produced by SerialEM/IMOD).
- While the program can automatically compute the orientation of the tilt axis, it is better to fill in the correct value in tltax, since there is a handedness ambiguity in the tomogram if the axis is determined automatically.
- In most cases, the default npk should be fine. If fiducials are present, it is not necessary to adjust this number to match the number of fiducials. The program will use any high contrast areas it finds as potential landmarks.
- bytile should normally be selected, as it usually produces better quality reconstructions at higher speed. If 2k or larger tomograms are created, memory consumption may be high, and you should check the program output for the anticipated RAM usage.
  - The graphical interface only permits 1k or 2k reconstruction sizes. In our experience this is normally sufficient for segmentation or particle picking.
- When the sample is thin (purified protein, not cells), it is useful to check correctrot to automatically position tomograms flat in ice.
- With thin ice it can also be helpful to specify a clipz value to generate thinner tomograms (perhaps 64 or 96 for a 1k tomogram).
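The two ways of supplying tilt angles described above (a constant tltstep, or a rawtlt file) can be sketched as follows. This is an illustration in plain Python, not EMAN2's own code: the function names are hypothetical, and centering the angle range on zero degrees is an assumption made here for the constant-step case.

```python
def angles_from_step(n_tilts, tltstep):
    """Angles (degrees) for a series ordered from one extreme to the other.

    Assumes the range is centered on zero degrees.
    """
    half = (n_tilts - 1) / 2.0
    return [(i - half) * tltstep for i in range(n_tilts)]

def angles_from_rawtlt(text):
    """Parse rawtlt-style content: one tilt angle (degrees) per line."""
    return [float(line) for line in text.splitlines() if line.strip()]
```

For a hypothetical 61-image series with a 2 degree step, `angles_from_step(61, 2)` runs from -60.0 to 60.0. If the acquisition order is not monotonic, only the rawtlt route describes the angles correctly.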
CTF Estimation (10 min)
For the tutorial tilt-series:
- Subtomogram Averaging -> CTF Correction
  - check alltiltseries
  - Double check the voltage and cs
  - Launch
When working with your own data:
- The first two options, dfrange and psrange, indicate the defocus and phase shift ranges to search. They take the format “start, end, step”, so “2, 5, .1” will search defocus from 2 to 5 um with a step size of 0.1. Phase shift is given in degrees.
- For images taken with a Volta phase plate, we usually use a dfrange of “0.2,2,0.1” and a psrange of “60,120,2”.
Note that this program is only estimating CTF parameters, taking tilt into account. It is not performing any phase-flipping corrections on whole tomograms. CTF correction is performed later as a per-particle process. This process requires metadata determined during tilt-series alignment, so it cannot be used with tomograms reconstructed using other software packages.
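To make the “start, end, step” format used by dfrange and psrange concrete, here is a small sketch that expands such a string into the list of values it describes. It is not EMAN2's parser; the function name `parse_range` is hypothetical, and including the end point in the search is an assumption.

```python
import math

def parse_range(spec):
    """Expand a "start, end, step" string into a list of search values."""
    start, end, step = (float(tok) for tok in spec.split(","))
    if step <= 0 or end < start:
        raise ValueError("expected start <= end and step > 0")
    # Small epsilon so floating-point error does not drop the last value.
    n = int(math.floor((end - start) / step + 1e-9)) + 1
    return [round(start + i * step, 6) for i in range(n)]
```

With this reading, `parse_range("2, 5, .1")` yields 31 defocus values from 2.0 to 5.0 um, and `parse_range("60,120,2")` yields 31 phase shift values from 60.0 to 120.0 degrees.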
Tomogram annotation (optional)
- Since the tutorial data set is purified ribosomes, this step can be skipped for the tutorial data, and you can move on to template-based particle picking. For cells or other types of complex specimens, tomogram annotation can be used to produce locations of different types of objects.
This section is brief and is only an update to the more detailed tutorial: TomoSeg. Some directory structures and user interfaces have changed in the latest version to match the new tomogram workflow, as described here:
- Segmentation -> Preprocess tomogram
  - This step is not always necessary for tomograms reconstructed in EMAN2, but may slightly improve results.
- Segmentation -> Box Training References
  - This is a newer interface than previously used for this step. Select a few "Good" (regions containing the feature of interest) and "Bad" (regions not containing the feature of interest) boxes.
  - "~" and "1" on the keyboard can be used to move along the Z axis.
  - The new interface permits different types of features to be identified in a single session and in the same tomogram.
    - If the different features of interest have very different scales, it is better to keep the box size at 64 and rescale the tomogram instead. As long as the rescaling is done using EMAN2 utilities, the program will correctly keep track of the geometry relative to the original tomogram and tilt series.
  - If you are doing this with the tutorial data, you would only have 2 classes of particles, "ribo_good" and "ribo_bad".
  - When pressing Save, all visible particles (box checked next to the class name) will be saved
- The rest of the annotation process remains unchanged from the original tutorial, except that all trained neural networks and training results are now saved in the neuralnets folder, and all segmented maps are in the segmentations folder. You now only specify the label of the output file instead of the full file name.
- Segmentation -> Find particles from segmentation turns segmented maps into particle coordinates.
  - Input both the tomogram and its corresponding segmentation, and the particle coordinates will be written into the metadata file.
  - Slightly tweaking the threshold parameters may yield better results.
  - featurename will become the label of the particles generated. Those particles can be viewed in the particle picking step and processed in the following protocols.
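As a rough picture of how a segmentation map can be turned into particle coordinates, and why the threshold matters, the sketch below thresholds a small 3D map, groups neighbouring voxels, and reports one centroid per group. This is a generic connected-component approach written in plain Python for illustration; it is not the algorithm EMAN2 uses, and the function name and parameters are hypothetical.

```python
from collections import deque

def find_particles(seg, threshold=0.5, min_voxels=2):
    """Return (x, y, z) centroids of connected regions at or above threshold.

    seg is a nested list indexed as seg[z][y][x].
    """
    nz, ny, nx = len(seg), len(seg[0]), len(seg[0][0])
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    seen = set()
    centers = []
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if (x, y, z) in seen or seg[z][y][x] < threshold:
                    continue
                # Breadth-first flood fill over 6-connected neighbours.
                queue = deque([(x, y, z)])
                seen.add((x, y, z))
                voxels = []
                while queue:
                    cx, cy, cz = queue.popleft()
                    voxels.append((cx, cy, cz))
                    for dx, dy, dz in steps:
                        px, py, pz = cx + dx, cy + dy, cz + dz
                        if (0 <= px < nx and 0 <= py < ny and 0 <= pz < nz
                                and (px, py, pz) not in seen
                                and seg[pz][py][px] >= threshold):
                            seen.add((px, py, pz))
                            queue.append((px, py, pz))
                # Discard regions too small to be a real feature.
                if len(voxels) >= min_voxels:
                    n = float(len(voxels))
                    centers.append(tuple(sum(v[i] for v in voxels) / n for i in range(3)))
    return centers
```

Raising the threshold or `min_voxels` removes weak or tiny detections at the cost of missing faint particles, which is the trade-off behind tweaking the threshold parameters.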
Particle picking
- Subtomogram averaging -> Manual boxing
  - Go through slices along the z-axis using "~" and "1" on the keyboard
  - Hold down Shift when clicking to delete existing boxes.
  - Boxes are shown as circles, which vary in size depending on the Z distance from the center of the particle.
  - The interface supports different box types within a single tomogram. Each type has a label. Make sure the label is consistent if selecting the same feature in different tomograms.
  - The box size can be set in the bottom-left corner of the main window. For the tutorial, use 45 for ribosomes (the unbinned box size is 180).
- If you skipped the tomogram annotation step, pick a few particles here to generate an initial model first, then use the initial model as a reference for template matching.
  - Select 30-50 particles from a tomogram, then close the boxer window.
- If you have the particle coordinates from tomogram annotation above, you may still wish to do this step to delete any obviously bad particles.
- While you can save 3D particles from the GUI, there is no need to do that here. When you are satisfied with the result, simply close the window.
- You should have ~3000 particles from the 4 tomograms in the dataset.
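The relation between the box size picked in the downsampled tomogram (45) and the unbinned box size (180) is a simple scaling. The sketch below assumes a downsampling factor of 4, which is inferred from those two numbers rather than stated in the tutorial; the function names are hypothetical.

```python
def unbinned_box_size(binned_box, shrink):
    """Box size in unbinned tilt-series pixels for a box picked in a binned tomogram."""
    return int(round(binned_box * shrink))

def box_size_angstrom(box_pixels, apix):
    """Physical box edge length in Angstroms."""
    return box_pixels * apix
```

`unbinned_box_size(45, 4)` gives 180, and at the tutorial pixel size of 2.62 A/pix that box spans about 471.6 A.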
Particle extraction
In this step, the program will extract unbinned 2D particles from the tilt series, perform per-particle per-tilt CTF correction, then reconstruct individual 3D particles.
- Select Extract particles from the left panel, check alltomograms, and specify the label of the particles you want to extract. Make sure the label specified here corresponds to the label of the particles from the particle boxer.
- If the box size was correct when you selected particles in the GUI, you can leave boxsz_unbin as -1, and the program will keep that box size. Adjust the value if you want to change the box size of the extracted particles.
- If your particles are deeply buried in other densities, using a bigger padtwod may help, but doing so may significantly increase memory usage and slow down the process.
- With CTF information present, it generally does not hurt to check wiener, which filters the 2D particles by SSNR before reconstructing them in 3D.
- If you want to generate particles without CTF correction, check noctf.
- By default, the generated particles will have the same label as in the boxer. If you want to have multiple types of particles, for example with and without CTF correction, you can specify a different newlabel each time you launch the program.
- Specify a binning factor in shrink to produce downsampled particles if your memory, storage or CPU time is limited, but this may also limit the resolution you achieve at the end.
For the EMPIAR example, check [...] Then, go to [...]
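The resolution cost of shrink follows from the Nyquist limit: the finest resolution a map can represent is twice its pixel size. The sketch below is a generic calculation rather than anything EMAN2-specific, and the function names are hypothetical.

```python
import math

def nyquist_resolution(apix, shrink=1):
    """Finest resolution (Angstroms) representable at a given pixel size and binning."""
    return 2.0 * apix * shrink

def max_shrink_for(target_resolution, apix):
    """Largest integer shrink whose Nyquist limit still reaches the target.

    Never returns less than 1 (unbinned data).
    """
    return max(1, int(math.floor(target_resolution / (2.0 * apix))))
```

At the tutorial pixel size of 2.62 A/pix, unbinned particles have a Nyquist limit of 5.24 A, shrink 2 gives 10.48 A, and shrink 4 gives 20.96 A, which is coarser than the 13-15 A this dataset can reach. In practice it is usually wise to keep the Nyquist limit comfortably finer than the resolution you are aiming for.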
Initial model generation
To build an initial model from scratch, simply go to the [...] In this ribosome dataset with [...]
Template matching
If you generated all particles with tomogram annotation already, skip this step. If not, go to [...] After the program finishes, take a look at the particle coordinates from [...]
Subtomogram refinement
Click [...]
Subtilt refinement
Once the subtomogram refinement finishes, check the final map and FSC curves. In this dataset, you should be able to achieve a resolution of 13-15 Å. Now we can refine the orientation of each individual subtilt, i.e. the 2D particles from the raw tilt series that are reconstructed into the 3D particles, and push the resolution of the averaged map. Click [...] The default parameters should be generally fine for this dataset, though you may need to alter the [...]
Tomogram evaluation
This is a tool that helps you visualize your tomograms with their corresponding metadata, and launch other programs from it. It can be found via [...] On the left is a list of tomograms in the project. Clicking the header of each column will sort the table by that attribute. On the right, the image on the top shows the center slice of the tomogram. The [...]
Refinement evaluation
This tool helps visualize and compare results from multiple subtomogram refinement runs. Launch it from [...]