Differences between revisions 2 and 3
Revision 2 as of 2010-01-22 13:34:54
Size: 3963
Editor: SteveLudtke
Comment:
Revision 3 as of 2010-01-22 13:40:02
Size: 3981
Editor: SteveLudtke
Comment:

WARNING !

The embedded database used in EMAN2 (which stores most of your image data and information about your projects in a set of EMAN2DB directories) has a number of very important limitations and restrictions. Failure to be aware of these restrictions could result in data loss and wasted time. If you are accustomed to working with normal image files and moving them around manually, be aware that you cannot do this with the database system EMAN2 uses.

  • Do NOT move files around within an EMAN2DB directory. These are not normal image files that you can access or transport between machines. They are the internal files generated by an embedded database system. Don't mess with them!

  • Exception to the above statement: If you need to remove files from an EMAN2DB directory (because they take up too much space and aren't needed, etc.), you can do so, but ONLY after first running 'e2bdb.py -c' on that machine.

  • Beware of network-mounted filesystems. If your home directory is on a network volume rather than the local machine, you need to be very cautious. This can be done safely, but only with care. The EMAN2 database is safe for running multiple programs on a single machine. It is NOT safe for simultaneous access by multiple machines. For example, if you run an EMAN2 program accessing a particular database on one machine, and simultaneously access the database on another machine via NFS, you may get very unpredictable results, and if you write to the database from both machines, you could cause corruption.

  • If you wish to switch running jobs from one machine to another, you must run 'e2bdb.py -c' on the first machine, and ensure that EMAN2 programs are closed there before opening programs on the other machine.

  • If you DO get a message saying there is a database error and corruption may have resulted: first try running 'e2bdb.py -c'. 90% of the time that will fix the problem. If that doesn't work, then you may have to resort to removing the cache directory in /tmp. This may be a risky operation which could result in data loss, and is only a last resort.
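
The rule in the exception above (close the cache first, and only then remove files) can be sketched as a small wrapper. This is a minimal sketch, not part of EMAN2: the function names are hypothetical, and it assumes that 'e2bdb.py' is on your PATH and that it exits with a nonzero status when it fails, which this page does not confirm.

```python
import subprocess
from pathlib import Path


def close_cache(runner=subprocess.run):
    """Run 'e2bdb.py -c' and report whether it exited cleanly.

    Assumption: a zero exit status means the cache was closed.
    """
    result = runner(["e2bdb.py", "-c"])
    return result.returncode == 0


def remove_db_files(paths, runner=subprocess.run):
    """Delete the given files, but only after the cache has been closed."""
    if not close_cache(runner):
        # Refuse to touch anything if the cache could not be closed.
        raise RuntimeError("e2bdb.py -c failed; not touching the EMAN2DB files")
    removed = []
    for p in map(Path, paths):
        p.unlink()
        removed.append(p)
    return removed
```

The runner argument exists only so the wrapper can be exercised without EMAN2 installed. In normal use you would call remove_db_files() with a list of paths, with no EMAN2 programs running on that machine.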

Brief technical explanation

Details on the database are discussed in Eman2DataStorage.

EMAN2 uses an embedded database to store information about a project, as well as much of the actual image data. This choice was made for a number of reasons, including performance, flexibility, and the need to handle projects with thousands of micrographs and hundreds of thousands of particles. However, it comes with a few limitations. Like most databases, it uses a memory and disk cache to give faster access to information and to coordinate access to the data from multiple programs (on the same machine). This cache consists of a set of files stored in /tmp (which must be physically attached to the local machine). If you try to access the same database from two different machines at the same time via a shared network filesystem, each machine establishes an independent cache in its own /tmp, and both think they have exclusive access to the files. The machines can then easily disagree about the contents of a file, which can cause database corruption. The 'e2bdb.py -c' program will safely close the cache on one machine, so the database can be reliably accessed from another machine. It is also possible in some cases to open the databases read-only from multiple machines at once, with no cache. However, this is a special case used in some situations on clusters, and not a general rule.
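
Since the risk depends on whether a project directory lives on a network filesystem, it can help to check before starting work. The following is a rough sketch for Linux only, not something EMAN2 provides: the function names are hypothetical, the list of network filesystem types is an assumption and is not exhaustive, and mount points containing spaces (which the mount table escapes) are not handled.

```python
import os

# Assumption: a small, non-exhaustive set of network filesystem type names.
NETWORK_FS = {"nfs", "nfs4", "cifs", "smb3", "fuse.sshfs"}


def parse_mounts(mounts_text):
    """Parse mount-table text into (mount_point, fs_type) pairs."""
    mounts = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            mounts.append((fields[1], fields[2]))
    return mounts


def fs_type_for(path, mounts):
    """Return the filesystem type of the longest mount point containing path."""
    path = os.path.abspath(path)
    best = ("", None)
    for mount_point, fs_type in mounts:
        prefix = mount_point.rstrip("/") + "/"
        # ">=" lets a later mount on the same mount point take precedence.
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best[0]):
            best = (mount_point, fs_type)
    return best[1]


def is_network_path(path, mounts):
    """True if path appears to sit on one of the listed network filesystems."""
    return fs_type_for(path, mounts) in NETWORK_FS
```

On a Linux machine the mount table can be read from /proc/mounts, for example parse_mounts(open("/proc/mounts").read()). A True result for your project directory means the multi-machine cautions above apply. A False result for /tmp is consistent with the requirement that the cache be local, but it is not proof.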

The files in the EMAN2DB directories are not normal flat image files, but are actually proprietary database files. Moving them around or otherwise messing with them will confuse the database. Just as you wouldn't create a MySQL database and then move its database files around willy-nilly, you shouldn't mess with files in the EMAN2DB directory, particularly if there is an active cache.

EMAN2/DatabaseWarning (last edited 2014-04-22 14:49:36 by SteveLudtke)