Using BDB with NFS mounted filesystems

You should, of course, read this database warning as a first step.

Quick advice

More details

The BDB databases used in EMAN2 can be safely used from any single computer, even over an NFS filesystem. Problems arise when you access BDB files, and in particular WRITE to them, from multiple computers over a shared filesystem like NFS. Note that this sort of thing can also cause problems with regular flat files in Imagic, Spider, etc.; in those cases you often just don't notice the corruption. BDB, by contrast, yells at you when anything bad happens.

So, the whole point of NFS is that you want to use it from multiple computers, right? Of course this is possible; you just HAVE to remember to run 'e2bdb.py -c' on the first machine before switching to the second. Running 'e2bdb.py -c' is NEVER harmful (unless you have active EMAN2 jobs running, in which case it should warn you), so if you can't remember whether you already ran it, just run it again.
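If you switch machines often, the habit above can be wrapped in a small helper. The following is only a sketch: the function name clear_bdb_cache and its two parameters are made up for illustration and are not part of EMAN2. The only thing it takes from this page is the command 'e2bdb.py -c', and it assumes e2bdb.py is on your PATH and returns a zero exit status on success.

```python
import shutil
import subprocess


def clear_bdb_cache(run=subprocess.run, which=shutil.which):
    """Run 'e2bdb.py -c' on the local machine.

    Returns True if the command was found and exited with status 0,
    False otherwise. 'run' and 'which' are injectable only so the
    helper can be exercised without EMAN2 installed (hypothetical
    parameters, not an EMAN2 API).
    """
    if which("e2bdb.py") is None:
        # EMAN2 does not appear to be on PATH on this machine.
        return False
    result = run(["e2bdb.py", "-c"])
    return result.returncode == 0
```

Call clear_bdb_cache() on the machine you are about to leave, once your EMAN2 jobs there have finished, and only move to the second machine if it returns True.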

Why?

BDB has a 'cache' which is stored in /tmp on the local machine. That is, when you read from a BDB file, it keeps a copy of the information you recently accessed in the cache, and checks there before looking in the file again. This can be a tremendous speedup during processing. It also lets you run multiple EMAN2 commands on the SAME computer safely, which permits things like threaded parallelism. With most software, if you have 2 different programs on the same computer simultaneously writing to the same file, you will get corruption in the file. With BDB, this won't happen. You can still get corruption when you do this from 2 different machines, but if you do it from a single machine, everything will be fine.

Some confusing things can also happen when you are only reading from BDB files. Say you have two computers called 'one' and 'two'. Say you're running a big refinement job on 'one', and want to log in to 'two' to check the results while it's running. Even though what you're doing on 'two' doesn't change the database, and thus won't result in corruption, you may see some strange behavior. Say the job has been running for an hour, you log into 'two' and check it, and everything is fine. Then 3 hours later you log in to 'two' again, but it seems like nothing has progressed. What's going on? When you logged in the first time, 'two' cached a copy of the BDB files at that point in time. It has no way of knowing the files have since been modified on another computer, so when you log in again later, you still see the cached version of the files from before. However, if you run 'e2bdb.py -c' on 'two', it will remove the cache, and you will see the updated files.
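The 'one' and 'two' scenario can be reproduced with a toy model. This is NOT how BDB is implemented; it is a deliberately simplified sketch in which each host keeps a private dictionary as its cache and writes go straight through to the shared file. The class names SharedFile and Host, and the key "iteration", are invented for the illustration.

```python
class SharedFile:
    """Stands in for a BDB file living on the NFS server (toy model)."""

    def __init__(self):
        self.records = {}


class Host:
    """One computer with its own local cache, like the one kept in /tmp."""

    def __init__(self, name, shared):
        self.name = name
        self.shared = shared
        self.cache = {}

    def read(self, key):
        # Check the local cache first; go to the shared file only on a miss.
        if key not in self.cache:
            self.cache[key] = self.shared.records.get(key)
        return self.cache[key]

    def write(self, key, value):
        # Assumption of this toy model: writes go straight to the shared file.
        self.cache[key] = value
        self.shared.records[key] = value

    def clear_cache(self):
        # Plays the role of running 'e2bdb.py -c' on this host.
        self.cache.clear()


shared = SharedFile()
one = Host("one", shared)
two = Host("two", shared)

one.write("iteration", 1)            # the job on 'one' has been running an hour
first_look = two.read("iteration")   # 'two' reads the file and caches the value

one.write("iteration", 4)            # three hours later the job has moved on
stale_look = two.read("iteration")   # 'two' answers from its own cache

two.clear_cache()
fresh_look = two.read("iteration")   # cache is gone, so 'two' rereads the file

print(first_look, stale_look, fresh_look)
```

In this model 'two' never learns that 'one' changed the file, so its second read returns the old value until its cache is cleared. That is the same symptom described above.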

EMAN2/FAQ/NfsBdb (last edited 2011-06-09 14:54:31 by SteveLudtke)