Differences

This shows you the differences between two versions of the page.

--- eman2:e2tomo_atpsyn [2026/06/03 19:16] – muyuanchen
+++ eman2:e2tomo_atpsyn [2026/06/11 16:19] (current) – muyuanchen
@@ Line 1: / Line 1: @@
 ====== EMAN2 tomography - ATP synthase in mitochondria (2026) ======
-This tutorial uses a public in situ CryoET dataset ([[https://www.ebi.ac.uk/empiar/EMPIAR-11830/ | EMPIAR-11830]]) of Chlamydomonas reinhardtii prepared using cryo-plasmaFIB milling. Here, we use 5 tilt series and target the structure and dynamics of ATP synthase inside mitochondria.
+This tutorial uses a public in situ CryoET dataset ([[https://www.ebi.ac.uk/empiar/EMPIAR-11830/ | EMPIAR-11830]]) of Chlamydomonas reinhardtii prepared using cryo-plasmaFIB milling. Here, we use [[https://drive.google.com/file/d/18llt4TLnAbDn5MajfAZA-zumwf4w4LGp/view?usp=sharing | 5 tilt series]] and target the structure and dynamics of ATP synthase inside mitochondria. In the end, from this small dataset, we will produce a monomer structure at (slightly) sub-nanometer resolution and characterize the rotary movement of the F1 head domain. Using a larger dataset, it is possible to reach ~5Å resolution and solve the full rotation of the central stalk as well.
 It is recommended to cross reference with previous tutorials of [[https://blake.bcm.edu/emanwiki/EMAN2/e2TomoSmall | ribosomes ]] and [[https://blake.bcm.edu/emanwiki/EMAN2/e2tomo_p22 | viruses ]].
@@ Line 13: / Line 13: @@
 Unzip the dataset, and you should have a folder called "tiltseries", with five hdf image stacks in it, inside the project folder. To view the tilt series, run **e2display.py**, locate the file in the browser, and click **Show2D**.
-{{http://blake.bcm.edu/dl/EMAN2/atpsyn_tilt_series.png|tiltseries|width="600"}}
+{{:eman2:atpsyn_tilt_series.jpg|tiltseries}}
 ===== Initial tomogram reconstruction =====
@@ Line 31: / Line 31: @@
 </code>
-The handedness of the tilt series should be correct.
+The handedness of the tilt series should be correct. Just use the reported --tltax for the reconstruction of all tomograms.
 ===== All tomogram reconstruction =====
-Reconstruct all tilt series using the same parameters and the tilt axis estimated by the handedness check. Note that in the full dataset, tilt series of different sessions in EMPIAR-11830 may have different tilt step and pixel size in their header. Run the **--alltiltseries** command with caution when processing large datasets. The 5 tilt series in this tutorial are from the same session and have similar conditions.
+Reconstruct all tilt series using the same parameters and the tilt axis estimated by the handedness check. Note that in the full dataset, tilt series from different sessions in EMPIAR-11830 may have different tilt steps and pixel sizes in their headers. Run the **--alltiltseries** command with caution when processing large datasets. The 5 tilt series in this tutorial are from the same session and have similar conditions. While this selection makes the processing simpler, it limits the final resolution because the 5 tilt series were collected at similar (and relatively high) defocus.
 <code>
@@ Line 56: / Line 56: @@
 </code>
-{{http://blake.bcm.edu/dl/EMAN2/atpsyn_eval_tomo.png|Evaluate tomogram|width="600"}}
+{{:eman2:atpsyn_eval_tomo.png|Evaluate tomogram|width="600"}}
 Here we pick a few particles manually. In this dataset, we just need ~70 particles to make a good initial model. Here we label them as **atpsyn_init**.
-{{http://blake.bcm.edu/dl/EMAN2/atpsyn_pick_ptcls.png| Pick particles |width="600"}}
+{{:eman2:atpsyn_pick_ptcls.png| Pick particles |width="600"}}
 ===== Initial model generation =====
@@ Line 76: / Line 76: @@
 </code>
-{{http://blake.bcm.edu/dl/EMAN2/atpsyn_init_model.png| initial model |width="600"}}
+{{:eman2:atpsyn_init_model.png| initial model |width="600"}}
 Note the structure should be c2 symmetrical. At this point, it is recommended to rotate the initial model to the symmetry axis to take advantage of the symmetry in later steps. Sometimes, this can be done automatically.
@@ Line 102: / Line 102: @@
 The template matching should work fine inside mitochondria but leave false positives outside. Since we only have 5 tomograms here, it is easy to take a look and manually clean the particles. Launch the manual boxer and use the Eraser tool to remove particles outside the mitochondria. Unchecking "Limit Side Boxes" will show all boxes along one axis in each view, so particles on the edge across all depths can be removed with one click.
-{{http://blake.bcm.edu/dl/EMAN2/atpsyn_temp_pick.png| Template based particle picking |width="600"}}
+{{:eman2:atpsyn_temp_pick.png| Template based particle picking |width="600"}}
+After the cleanup, I got ~5000 particles in total.
 For the full dataset, using the deep learning based particle picker can reduce the manual effort of the cleaning step. Because the ATP synthase has quite distinct top vs side views, training the model can take multiple iterations of training set refinement and is not efficient for a small dataset. Please refer to previous tutorials for details.
@@ Line 133: / Line 135: @@
 This should bring the resolution to about 12Å.
-{{http://blake.bcm.edu/dl/EMAN2/atpsyn_c2_refine.png| C2 refinement |width="600"}}
+{{:eman2:atpsyn_c2_refine.png| C2 refinement |width="600"}}
 ===== Refinement of ATP synthase monomers =====
@@ Line 145: / Line 147: @@
 The new list, spt_01/aliptcls3d_06_sym.lst, should have 2x particles. While not necessary, it is better to shift one of the two asymmetrical units to the center of the box, and roughly align it so the central stalk is along the z axis. This can be done in the FilterTool with the xform processor, and let's call the output spt_01/threed_06_xf.hdf. To also modify the alignment of the particles, first get the exact alignment between the two maps, then apply it to the particle list.
 <code>
-cp spt_01/aliptcls3d_06.lst spt_01/aliptcls3d_06_sym.lst
+e2proc3d.py spt_01/threed_06.hdf spt_01/threed_06_ali.hdf --align rotate_translate_3d_tree --alignref spt_01/threed_06_xf.hdf
-e2proclst.py spt_01/aliptcls3d_06_sym.lst --sym c2
+e2proclst.py spt_01/aliptcls3d_06_sym.lst --create spt_01/aliptcls3d_06_sym_xf.lst --applyxf spt_01/threed_06_ali.hdf
 </code>
-{{http://blake.bcm.edu/dl/EMAN2/atpsyn_euler_view.png| Euler angle comparison |width="600"}}
+To make sure the operations on particle lists are done properly, compare the Euler angles of the lists by clicking "Plot2D".
+{{:eman2:atpsyn_euler_view.png| Euler angle comparison |width="600"}}
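Besides the Euler angle plot, a quick count of the list entries can confirm that the symmetry expansion doubled the particle number. The sketch below is only a rough check: it assumes that comment and header lines in a .lst file start with "#" and that every other non-empty line is one particle. The helper names count_lst and check_doubled are made up for this example and are not part of EMAN2.

```shell
# Count particle entries in a .lst file.
# Assumption: header/comment lines start with '#', and every other
# non-empty line describes exactly one particle.
count_lst() {
    awk '!/^#/ && NF { n++ } END { print n + 0 }' "$1"
}

# Compare the original list ($1) with the symmetry-expanded list ($2).
# Succeeds only if the second list has exactly twice as many entries.
check_doubled() {
    n_orig=$(count_lst "$1")
    n_sym=$(count_lst "$2")
    if [ "$n_sym" -eq $((2 * n_orig)) ]; then
        echo "OK: $n_orig -> $n_sym"
    else
        echo "MISMATCH: $n_orig -> $n_sym"
        return 1
    fi
}

# Example (run inside the project folder):
#   check_doubled spt_01/aliptcls3d_06.lst spt_01/aliptcls3d_06_sym.lst
```

The same count_lst helper can be pointed at spt_01/aliptcls3d_06_sym_xf.lst, which should contain as many entries as the symmetry-expanded list.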
+Finally we can refine the monomer particles. Here we also need to make a customized mask for the monomer, keeping only one ATP synthase inside the mask. This is also done in FilterTool using mask.zeroedge3d followed by mask.auto3d.thresh. Call this mask mask_01.hdf.
+<code>
+e2spt_gathermeta.py --ptcls spt_01/aliptcls3d_06_sym_xf.lst --ali2d spt_01/aliptcls2d_06.lst
+e2spt_refine_new.py --path spt_02 --continuefrom 0.5 --localrefine --mask mask_01.hdf --setsf sf.txt --iters=p2,g,p,g,p --keep=0.95 --parallel=thread:64 --tophat localwiener
+</code>
+The resolution should reach 11Å by the end of this.
+===== Bad particle removal =====
+Before heterogeneity analysis, we need to get rid of the bad particles. They are automatically downweighted, and a small fraction of bad particles generally does not have a strong impact on the refinement, but the heterogeneity analysis might interpret the good vs bad particles as actual structural differences. Although we have manually removed the obvious bad particles outside mitochondria earlier, there are still some that are just bare cristae membranes. Here we remove them through simple classification.
+<code>
+e2spt_sgd_new.py spt_02/aliptcls3d_07.lst --res=50.0 --niter=100 --shrink=1 --parallel=thread:64 --ncls=2 --batch=12 --learnrate=0.2 --sym=c1 --classify --refine --skipali
+</code>
+This generates one correct structure of ATP synthase, and the other looks somewhat flat, likely a piece of misaligned membrane. However, e2spt_sgd_new.py does not actually assign particles to classes, so we need to run a full classification with all particles.
+<code>
+e2spt_refinemulti_new.py sptsgd_01/output_cls0.hdf sptsgd_01/output_cls1.hdf --ptcls spt_02/aliptcls3d_07.lst --niter 3 --maxres 20 --loadali3d --skipali --loadali2d spt_02/aliptcls2d_07.lst --parallel thread:64
+</code>
+{{:eman2:atpsyn_bad_ptcls.png| Bad particle removal | width="600"}}
+<code>
+e2spt_gathermeta.py --ptcls sptcls_00/aliptcls3d_02_00.lst --ali2d spt_02/aliptcls2d_06.lst
+e2spt_refine_new.py --path spt_03 --continuefrom 0.5 --localrefine --mask mask_01.hdf --setsf sf.txt --iters=p2,g,p,g,p --keep=0.95 --parallel=thread:64 --tophat localwiener
+</code>
+===== Gaussian mixture model (GMM) based refinement =====
+First convert the voxel map into GMM representation, and start a global refinement.
+<code>
+e2gmm_guess_n.py spt_03/threed_07.hdf --thr 4 --maxres 11 --evenodd --startn 4000 --jax
+e2gmm_spt_refine_iter.py spt_03/threed_07.hdf --initpts spt_03/threed_07_seg.pdb --startres 11 --maskpp mask_01.hdf
+</code>
+The resolution should get slightly better than 10Å at this point. Next, we can focus the refinement on the rotation of the F1 head. Make a mask using FilterTool that covers the F1 head only, and name it mask_f1.hdf. Here we first run one iteration of the deep learning based alignment to recover the large scale rotation, followed by 3 iterations of the direct alignment starting from the deep learning result.
+<code>
+e2gmm_spt_refine_iter.py gmm_00/threed_03.hdf --initpts spt_03/threed_07_seg.pdb --startres 15 --maskpp mask_01.hdf --mask mask_f1.hdf --align_mlp --niter 1
+e2gmm_spt_refine_iter.py gmm_01/threed_01.hdf --initpts spt_03/threed_07_seg.pdb --startres 10 --maskpp mask_01.hdf --mask mask_f1.hdf
+</code>
+This should improve the structural features at the F1 head domain, but the FSC resolution does not necessarily improve here. Because the even/odd half-sets are only aligned to the "neutral" structure of their own half-set and never see each other, there is a possibility that they converge to slightly different states, so the FSC resolution decreases even though the features in each half-set improve. This is less of a problem in datasets with more particles since the "neutral" state would be better defined, but here there are some uncertainties with only 5 tomograms.
+{{:eman2:atpsyn_cmp_focus_refine.png| Focus refinement comparison | width="600"}}
+To visualize the dynamics, run the following.
+<code>
+e2gmm_eval.py --pts gmm_01/mid_01_even.txt --pcaout gmm_01/pca_even.txt --ncls 4 --spt --ptclsin gmm_01/aliptcls2d_00_even.lst --ptclsout gmm_01/class_01_even.lst --mode regress --outsize 128 --parallel thread:64 --nptcl 800
+e2proc3d.py gmm_01/class_01_even.hdf gmm_01/class_01_even_lp.hdf --process filter.lowpass.gauss:cutoff_freq=0.067 --process normalize
+</code>
+This only shows the motion of the even half-set, and the same can be done for the odd half. Since the deep learning models for the two half-sets are trained independently, visualizing the motion in the combined dataset without breaking the "gold-standard" validation is impossible. Still, the rotary movement should already be visible even with the small dataset.
+{{:eman2:atpsyn_f1_motion.gif | F1 head motion | width="600"}}
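The cutoff_freq=0.067 in the low-pass step above corresponds to roughly 15Å, assuming that cutoff_freq of filter.lowpass.gauss is a spatial frequency in 1/Å. To try a different filter resolution, the conversion can be sketched as below; res_to_freq is a made-up helper name for this example, not an EMAN2 command.

```shell
# Convert a target low-pass resolution (in Angstrom) to a spatial frequency.
# Assumption: cutoff_freq is given in 1/Angstrom, so freq = 1 / resolution.
res_to_freq() {
    awk -v res="$1" 'BEGIN { printf "%.3f\n", 1 / res }'
}

res_to_freq 15   # prints 0.067, the value used in the command above
```

For example, a 20Å filter would use cutoff_freq=0.050 under the same assumption.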
+===== Map particles to tomograms =====
+As shown in previous tutorials, we can map particles back to the original tomograms to visualize their organization inside the tomogram.
+<code>
+e2spt_mapptclstotomo.py --path=spt_03 --iter=7 --tomo=tomograms/14042022_BrnoKrios_Arctis_grid5_Position_17__bin4.hdf --new
+</code>
+Then open the raw tomogram and the mapped back particles (spt_03/ptcls_in_tomo_14042022_BrnoKrios_Arctis_grid5_Position_17_07.hdf) in Chimera.
+{{:eman2:atpsyn_map_back.jpg | Particles in tomo | width="600"}}
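To map particles back for every tomogram instead of a single one, the command above can be wrapped in a loop. The following is a minimal dry-run sketch: it only prints one command per .hdf file, reusing the flags from the command above, and it assumes the folder holds only the tomograms you want to process and that file names contain no spaces. The function name map_all_tomos is made up for this example.

```shell
# Print (dry run) one e2spt_mapptclstotomo.py command per tomogram.
# $1 is the folder holding the tomograms, e.g. "tomograms".
# The flags are copied from the single-tomogram command; nothing is executed.
map_all_tomos() {
    for tomo in "$1"/*.hdf; do
        [ -e "$tomo" ] || continue   # skip when the folder has no .hdf files
        echo "e2spt_mapptclstotomo.py --path=spt_03 --iter=7 --tomo=$tomo --new"
    done
}

# Example (run inside the project folder):
#   map_all_tomos tomograms          # inspect the commands first
#   map_all_tomos tomograms | sh     # then execute them
```

Printing first makes it easy to check the file list before launching the jobs.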