This page is obsolete

Using EMAN2 with shared filesystems (like NFS)

In many large groups, computers are configured so that users can log in to multiple machines, with their home directory cross-mounted from wherever it lives. The same is true on Linux clusters, where the user's home directory is typically cross-mounted on all of the compute nodes. This causes complications for any image processing that involves more than one computer.

The problem

NFS (and other shared filesystems) does not generally guarantee that two different computers will see identical versions of the same file at all times. That is, if you write to a file called x.txt on workstationA and then read x.txt on workstationB, workstationB may not see the entire file until some undefined amount of time after the write finishes. Usually this delay is small, but it does exist. The problem can be observed on any image processing system, not just EMAN1/2. Say you are appending images to the end of a stack from 3 different computers at the same time. MOST of the images will be appended correctly, but there is a small but nonzero risk that image data ends up out of phase, causing odd shifts and jumbling of the images. For systems like embedded databases (the BDB system used in EMAN2) there is ZERO tolerance for unreliable storage, that is, for writing information to a file and then not reading back exactly what was written.
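The lost-append case described above can be illustrated with a toy model. This is only a sketch: it does not touch NFS at all, and the `StaleAppender` class, the 4-byte records and the in-memory "file" are hypothetical stand-ins chosen for illustration. Real NFS behavior depends on client caching and mount options. Each writer appends at the file size it last observed, so a writer with a stale view overwrites data that another writer has already stored.

```python
RECORD_SIZE = 4  # bytes per toy "image" (assumption for this sketch)


class StaleAppender:
    """Toy writer that appends at the file size it last observed.

    Hypothetical model of a client with a stale view of a shared file;
    it is not how an NFS client is actually implemented.
    """

    def __init__(self, shared: bytearray):
        self.shared = shared
        self.cached_size = len(shared)  # snapshot that may go stale

    def append(self, record: bytes) -> None:
        offset = self.cached_size  # believed end of file
        end = offset + len(record)
        if len(self.shared) < end:
            # Grow the "file" so the record fits.
            self.shared.extend(b"\x00" * (end - len(self.shared)))
        self.shared[offset:end] = record
        self.cached_size = end  # the writer only tracks its own writes


def simulate() -> bytes:
    stack = bytearray()          # the shared image stack, initially empty
    a = StaleAppender(stack)     # "workstationA"
    b = StaleAppender(stack)     # "workstationB", same initial snapshot
    a.append(b"AAAA")            # lands at offset 0
    b.append(b"BBBB")            # b still believes the file is empty
    a.append(b"aaaa")            # a believes the file holds one record
    return bytes(stack)


if __name__ == "__main__":
    print(simulate())
```

Three records are written, but the resulting stack holds only two of them, because the second writer's record replaced the first. With records of unequal size, the same mechanism would leave later images shifted relative to where readers expect them.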

The solution

EMAN1's solution

In EMAN1, the user was expected to be careful and ensure that they did not write to the same file at the same time from multiple nodes. For situations like parallel computing on clusters, this was handled through a fileserver built into the 'runpar' command. That is, when runpar was used to spawn jobs, it also took care of all reading and writing of files. Whenever a node needed image data, instead of reading it over NFS, it would request the data from runpar, running on the first node in the job. If this mechanism was intentionally bypassed in EMAN1, you would often see corrupted images as a result.
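The general idea of routing all file access through one process can be sketched as follows. This is not runpar's actual implementation or protocol, which this page does not describe. It is a minimal single-writer sketch in which threads stand in for cluster nodes, and the names `writer_loop`, `worker`, `run_demo` and the 8-byte record size are all assumptions made for illustration.

```python
import os
import queue
import tempfile
import threading

RECORD_SIZE = 8  # bytes per toy "image" (assumption for this sketch)


def writer_loop(path, inbox):
    """Sole owner of the file: appends records in the order they arrive."""
    with open(path, "ab") as f:
        while True:
            record = inbox.get()
            if record is None:  # sentinel: no more records will arrive
                break
            f.write(record)


def worker(worker_id, count, inbox):
    """Produces records and hands them to the writer; never opens the file."""
    for _ in range(count):
        inbox.put(bytes([worker_id]) * RECORD_SIZE)


def run_demo(n_workers=3, per_worker=5):
    inbox = queue.Queue()
    fd, path = tempfile.mkstemp(suffix=".stack")
    os.close(fd)

    writer = threading.Thread(target=writer_loop, args=(path, inbox))
    writer.start()

    workers = [
        threading.Thread(target=worker, args=(w + 1, per_worker, inbox))
        for w in range(n_workers)
    ]
    for t in workers:
        t.start()
    for t in workers:
        t.join()

    inbox.put(None)  # all workers are done, tell the writer to stop
    writer.join()

    with open(path, "rb") as f:
        data = f.read()
    os.remove(path)
    return data


if __name__ == "__main__":
    stack = run_demo()
    print(len(stack) // RECORD_SIZE, "records written")
```

Because only one thread ever writes, records cannot interleave or overwrite each other, whatever order the workers finish in. The order of records in the file is still nondeterministic, so a real system would need to tag each record with its intended position.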

EMAN2's solution

In EMAN2, in addition to the image data itself, we also have to consider the database cache, which is used to give BDB access much better performance and to improve reliability. Here are the rules, which are also partly described in EMAN2/DatabaseWarning:

Eman2NFS (last edited 2013-08-11 18:08:27 by SteveLudtke)