Availability: EMAN2 daily build after 2016-10
The tomogram segmentation programs require Theano, which is not distributed with EMAN2. To use the protocol, you need to build EMAN2 from source and install Theano manually.
First, create an empty directory and change into it on the command line. Then run e2projectmanager.py from that directory. A GUI window will appear, but it is still a good idea to keep the command line window open so you can view the messages printed by the programs.
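The setup step can also be scripted. The following is a minimal Python sketch and is not part of EMAN2: the helper names prepare_project_dir and launch_project_manager are made up for illustration, and the only assumption about EMAN2 is that e2projectmanager.py is on your PATH once it is installed.

```python
import shutil
import subprocess
from pathlib import Path


def prepare_project_dir(path):
    """Create the project directory if needed and insist that it is empty."""
    project = Path(path)
    project.mkdir(parents=True, exist_ok=True)
    if any(project.iterdir()):
        raise ValueError(f"{project} is not empty; start from a fresh directory")
    return project


def launch_project_manager(project):
    """Start e2projectmanager.py inside the project directory, if it is on PATH.

    Returns the running process, or None when the program cannot be found.
    """
    exe = shutil.which("e2projectmanager.py")
    if exe is None:
        print("e2projectmanager.py not found on PATH; is EMAN2 set up in this shell?")
        return None
    # The child process inherits this terminal, so its messages stay visible here.
    return subprocess.Popen([exe], cwd=str(project))
```

For example, launch_project_manager(prepare_project_dir("tomoseg_project")) would create a directory with the hypothetical name tomoseg_project and open the GUI from it.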
Use the Workflow Mode drop-down menu to navigate to the TomoSeg panel.
Click "Import Tomogram Files" in the left panel (1). In the panel that appears on the right, click "Browse" next to import_files (2), select the tomogram you would like to segment in the browser window, and click "Ok". If you want to bin the tomogram before processing, enter the shrinking factor in the text box next to "shrink" (3). Make sure that the "import_tomos" and "tomoseg_auto" boxes are checked. Finally, click "Launch" (4) and wait for the preprocessing to finish.
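To get a feel for what a given "shrink" value does, the arithmetic can be sketched in Python. This assumes the shrinking factor is an integer applied equally along all three axes, with sizes rounded down; the function names and the example tomogram size are hypothetical.

```python
def binned_shape(shape, shrink):
    """Approximate (x, y, z) size of a tomogram after binning by an integer factor."""
    if shrink < 1:
        raise ValueError("shrink must be >= 1")
    return tuple(n // shrink for n in shape)


def voxel_fraction(shrink):
    """Fraction of the original voxel count that remains after binning."""
    return 1.0 / shrink ** 3


# Hypothetical 4096 x 4096 x 1000 tomogram binned by a factor of 4
print(binned_shape((4096, 4096, 1000), 4))  # (1024, 1024, 250)
print(voxel_fraction(4))                    # 0.015625
```

Under these assumptions, a shrinking factor of 4 leaves 1/64 of the voxels, which is why binning speeds up the later steps considerably.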
Select Positive Samples
Open "Box training references". Press browse, and select your imported tomogram. Leave “boxsize” at -1, and press Launch.
In a moment, three windows will appear on your screen, which will be familiar if you’ve boxed particles before. The only difference from ordinary particle boxing is that here you box in 2D on slices of a 3D image.
On the window named “e2boxer”, make sure your box size is 64. None of the other options need to be changed.
On the window containing your tomogram, you can begin selecting boxes. Go up and down in the tomogram using the arrow keys, select and drag boxes using the left mouse button, and delete boxes using Shift + left mouse button. As you select boxes, they will appear in the (Particles) window.
Select around 10 boxes containing your structure. If your structure appears differently throughout the cell (e.g. microtubules), be sure to include a variety of views in the boxes.
When selecting boxes, ensure that your structure is clear in the (Particles) window. You will have to manually segment these boxes, so if you can’t see your structure, your segmentation will be more difficult, and your final segmentation will suffer as a result. It is better to have a few boxes that you can segment well than many boxes that you segment poorly.
After getting an appropriate number of boxes, press “Write output” in the e2boxer window.
- Select your boxes in the “Raw Data” window.
- Write the suffix of the particles in the “Output Suffix” text box.
- In “Normalize Images”, select “None”.
- Press “OK”.
Manually Annotate Samples
In order to train the program to recognize your structure, you have to segment the boxes that you selected.
Navigate to the “Segment training references” interface in the EMAN2 window. For “Particles”, browse and select the _ptcls file you just generated.
Leave “Output” blank and keep “segment” checked, and press Launch.
Two windows will appear, one small and one large. The smaller will contain your boxes, which you can navigate through with your arrow keys or zoom in and out of with your scroll wheel. The larger will open on the “Draw” tab. Using your cursor, draw on the structures in your boxes. You can go back to the boxing window and check the surroundings of each region to segment it more accurately.
Segment all of your boxes. If you need to change the size of the pen, change both “Pen Size” and “Pen Size2” to a larger or smaller number. Try not to select too much of the space outside of your structure, so definitely shrink the pen size if it is too big.
When you are finished, simply close the windows. The segmentation file is saved automatically as "*_seg.hdf", using the same base name as your particle file.
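When scripting around this step, the segmentation file name can be derived from the particle file name. The Python sketch below assumes that "_seg" is inserted directly before the ".hdf" extension, as the "*_seg.hdf" pattern suggests; the function name and the example file name are hypothetical.

```python
from pathlib import Path


def segmentation_path(particle_file):
    """Return the expected segmentation file for a particle file.

    Assumption: "_seg" is appended to the base name and the extension is kept.
    """
    p = Path(particle_file)
    return p.with_name(p.stem + "_seg" + p.suffix)


# Hypothetical particle file written from the boxer
print(segmentation_path("tomo_ptcls.hdf").name)  # tomo_ptcls_seg.hdf
```

Checking segmentation_path(...).exists() before building the training set is a quick way to confirm that the annotation was actually saved.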
Select Negative Samples
Go back to the boxing windows and press the “Clear” button in the “e2boxer” window. This deletes your previous selections.
Now, in the tomogram window, select boxes that DON’T contain your particle. You can select as many of these as you like (normally ~100). Try to get a wide variety of other cellular structures, empty space, gold fiducials and high-contrast carbon.
After picking the negative samples, write the particle output the same way you did for the positive samples. Make sure to set a different suffix in the "Output Suffix" box.
Build Training Set
- Find the “Build training set” option in EMAN2.
- In “particles_raw”, select your _ptcls file.
- In “particles_label”, select your _ptcls_seg file.
- In “boxes_negative”, select your _bad file.
Leave “trainset_output” blank. “Ncopy” controls the number of particles in your training set. The default of 10 is fine, unless you want to do a faster run at the expense of accuracy.
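As a rough guide to how “Ncopy” affects the size of the training set, the following Python sketch assumes that every input box, positive or negative, is copied Ncopy times. That reading of “Ncopy” is an assumption about the program's behavior, and the function name is hypothetical.

```python
def training_set_size(n_positive, n_negative, ncopy=10):
    """Estimate the number of training samples, assuming each box is copied ncopy times."""
    if min(n_positive, n_negative, ncopy) < 0:
        raise ValueError("counts must be non-negative")
    return (n_positive + n_negative) * ncopy


# About 10 positive and 100 negative boxes, as suggested in this tutorial
print(training_set_size(10, 100))      # 1100 with the default Ncopy of 10
print(training_set_size(10, 100, 5))   # 550 with Ncopy lowered to 5
```

Under this assumption, halving Ncopy halves the amount of data the network sees per iteration, which is where the trade-off between speed and accuracy comes from.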
Press Launch. The program will print “Done” in your Terminal when it has finished. The training set is saved under the same name as the positive particles, with a "_trainset" suffix.
Train Neural Network
Open up “Train the neural network” in EMAN2. In “trainset”, browse and choose your _trainset file.
The defaults for everything else in this window are sufficient to produce good results. To significantly shorten the training (potentially at the cost of quality), reduce the number of iterations. Enter the filename for the trained neural network output in the "netout" text box, and leave the "from_trained" box empty if this is the first training run.
Press Launch. The program will quickly print a few numbers at the beginning. These are for monitoring the training process, and something is wrong if the values are extremely large. It will then notify you as it completes each iteration. When it has finished, it writes the trained neural network to the specified netout file and samples of the training result to a file called "trainout_" followed by the netout file name.
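If you copy the numbers printed during training into a script, a simple check can flag a run that is going wrong. The Python sketch below is not part of EMAN2: the function name, the threshold of 1e6, and the sample values are arbitrary placeholders, since what counts as too large depends on your data.

```python
import math


def looks_diverged(values, threshold=1e6):
    """Return True if any monitored value is NaN, infinite, or above the threshold."""
    return any(not math.isfinite(v) or abs(v) > threshold for v in values)


# Made-up sequences of monitoring values
print(looks_diverged([0.93, 0.71, 0.55]))            # False
print(looks_diverged([0.93, 4.2e9, float("nan")]))   # True
```

If the check fires, it is usually worth revisiting the training set before spending time on a full run.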
Apply to Tomograms
Darius Jonasch, the first user of the tomogram segmentation protocol, provided much useful advice on making the workflow user-friendly. He also wrote a tutorial for an earlier version of the protocol, on which this tutorial is based.