The "Project" concept in EMAN2 and the internal database
When doing a single particle reconstruction, a common problem is how to document the long series of operations that goes into a typical reconstruction. Eventually, when a high resolution structure is ready for publication, it must also be submitted to a database. EMAN1 had a very simple mechanism for tracking operations: every EMAN command was logged in the local directory, so in theory someone could trace the exact sequence of operations that led to the current state. However, this mechanism broke down as soon as someone used other commands to move files around, or overwrote files with other files, because those steps never appeared in the log.
In EMAN2, the individual programs can generally still be run in standalone mode from the shell, but everything is designed to work together within a 'project' infrastructure. You generally begin a project in 'e2desktop.py' by selecting a task such as 'Single particle reconstruction'. Once you answer the initial questions for this task, you have started making a project. From this point on, most data will be stored in internal databases managed by EMAN2. This way, a complete record of the data processing is available both for an accurate internal accounting of how a project was processed and for easy upload to centralized databases such as the EBI or PDB.
The basic directory structure:
Single Particle Reconstruction
- raw_data - contains original micrographs/ccd frames
- particles - boxed out particles, and preprocessed (phase flipped, wiener filtered, etc.) particle images
- initial_model_# - Refinements used to determine a good initial model
- refine_#
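
To make the layout above concrete, here is a minimal Python sketch that creates an empty directory skeleton with the same names. It is only an illustration: EMAN2 creates and manages these directories itself as a project progresses, and the helper name make_project_skeleton, its parameters, and the reading of '#' as a simple run number are all assumptions made for this example, not part of EMAN2.

```python
from pathlib import Path
import tempfile


def make_project_skeleton(root, n_initial_models=1, n_refinements=1):
    """Create an empty directory skeleton mirroring the layout listed above.

    Hypothetical helper for illustration only; it is not an EMAN2 function.
    Returns the list of directory names that were created under root.
    """
    root = Path(root)
    names = ["raw_data", "particles"]
    # Assumption: the '#' in 'initial_model_#' and 'refine_#' stands for a
    # run number. The exact numbering format EMAN2 uses is not specified here.
    names += [f"initial_model_{i}" for i in range(1, n_initial_models + 1)]
    names += [f"refine_{i}" for i in range(1, n_refinements + 1)]
    for name in names:
        # exist_ok=True makes the call safe to repeat on an existing tree.
        (root / name).mkdir(parents=True, exist_ok=True)
    return names


if __name__ == "__main__":
    # Build the skeleton in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        for name in make_project_skeleton(tmp, n_refinements=2):
            print(name)
```

The sketch only reproduces the directory names. It does not create the internal databases that EMAN2 stores inside a project.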