EMEN2 Maintenance and Backups

There are several ways to back up your EMEN2 environment.

At the simplest level, you can copy all the files directly to another location. This is a "cold backup." It is the simplest and most reliable backup mechanism, but it requires that you stop all database writes for the duration of the process to ensure the integrity of the archive. Alternatively, you can perform a "hot backup," which can be run at any time, even while the database is active. A hot backup copies database log files to an existing cold backup to bring it up to date with the current state. I recommend running a cold backup about once a week and a hot backup daily.

Cold Backup

To create a cold backup, shut down any open database processes, and use the EMEN2 backup utility with the "--cold" argument.

backup.py --cold

This will run a database checkpoint, and create a cold backup in the path specified by config:BACKUP. To prevent overwriting an existing cold backup, it will not run if the target directory exists. You should move or remove an existing cold backup first.
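
Because the cold backup will not run while the target directory exists, it can help to move the old backup aside first. The following is a minimal sketch of that step, not part of EMEN2: the function name move_existing_backup and the timestamp suffix are assumptions made for illustration, and the directory you pass should be whatever your BACKUP setting points to.

```python
import os
import time


def move_existing_backup(backup_dir, suffix=None):
    """Rename an existing backup directory out of the way.

    Hypothetical helper, not part of EMEN2. Returns the new path,
    or None if backup_dir does not exist.
    """
    backup_dir = backup_dir.rstrip(os.sep)
    if not os.path.isdir(backup_dir):
        return None
    if suffix is None:
        # Default suffix is the current local time, e.g. 20240131-235959.
        suffix = time.strftime("%Y%m%d-%H%M%S")
    target = "%s.%s" % (backup_dir, suffix)
    if os.path.exists(target):
        # Never clobber an older backup that already uses this name.
        raise OSError("refusing to overwrite %s" % target)
    os.rename(backup_dir, target)
    return target
```

After calling this on the backup directory, "backup.py --cold" finds no target directory and can proceed. The renamed copy stays next to the original location until you delete or relocate it.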

You can update this cold backup by running a hot backup.

Example:

# python ./cmdlineutils/backup.py --cold
                ... snip: startup ...
Opening Database Environment: /home/emen2/db/
Cold Backup: Checkpoint
Cold Backup: Copying data: /home/emen2/db/data -> /home/emen2/db_backup/data
Cold Backup: Copying config: /home/emen2/db/config.yml -> /home/emen2/db_backup/config.yml
Cold Backup: Copying config: /home/emen2/db/DB_CONFIG -> /home/emen2/db_backup/DB_CONFIG
Cold Backup: Copying log: /home/emen2/db/log/log.0000000311 -> /home/emen2/db_backup/log/log.0000000311

Hot Backup

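To create a hot backup, use the EMEN2 backup utility with the "--hot" argument. A hot backup can be run at any time, even while the database is active. As the example output shows, it also runs a log archive.

backup.py --hot

Example:

# python ./cmdlineutils/backup.py --hot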
                ... snip: startup ...
Opening Database Environment: /home/emen2/db/
Hot Backup: Log archive
Log Archive: Checkpoint
Log Archive: /home/emen2/db/log/log.0000000303 -> /home/emen2/db_archive/log.0000000303
Log Archive: /home/emen2/db/log/log.0000000304 -> /home/emen2/db_archive/log.0000000304
Log Archive: /home/emen2/db/log/log.0000000305 -> /home/emen2/db_archive/log.0000000305
Log Archive: /home/emen2/db/log/log.0000000306 -> /home/emen2/db_archive/log.0000000306
Log Archive: /home/emen2/db/log/log.0000000307 -> /home/emen2/db_archive/log.0000000307
Log Archive: /home/emen2/db/log/log.0000000308 -> /home/emen2/db_archive/log.0000000308
Log Archive: /home/emen2/db/log/log.0000000309 -> /home/emen2/db_archive/log.0000000309
Log Archive: /home/emen2/db/log/log.0000000310 -> /home/emen2/db_archive/log.0000000310
Hot Backup: Copying log: /home/emen2/db/log/log.0000000303 -> /home/emen2/db_hotbackup/log/log.0000000303
Hot Backup: Copying log: /home/emen2/db/log/log.0000000304 -> /home/emen2/db_hotbackup/log/log.0000000304
Hot Backup: Copying log: /home/emen2/db/log/log.0000000305 -> /home/emen2/db_hotbackup/log/log.0000000305
Hot Backup: Copying log: /home/emen2/db/log/log.0000000306 -> /home/emen2/db_hotbackup/log/log.0000000306
Hot Backup: Copying log: /home/emen2/db/log/log.0000000307 -> /home/emen2/db_hotbackup/log/log.0000000307
Hot Backup: Copying log: /home/emen2/db/log/log.0000000308 -> /home/emen2/db_hotbackup/log/log.0000000308
Hot Backup: Copying log: /home/emen2/db/log/log.0000000309 -> /home/emen2/db_hotbackup/log/log.0000000309
Hot Backup: Copying log: /home/emen2/db/log/log.0000000310 -> /home/emen2/db_hotbackup/log/log.0000000310
Hot Backup: Copying log: /home/emen2/db/log/log.0000000311 -> /home/emen2/db_hotbackup/log/log.0000000311
Log Archive: Checkpoint
Log Archive: Removing /home/emen2/db/log/log.0000000303
Log Archive: Removing /home/emen2/db/log/log.0000000304
Log Archive: Removing /home/emen2/db/log/log.0000000305
Log Archive: Removing /home/emen2/db/log/log.0000000306
Log Archive: Removing /home/emen2/db/log/log.0000000307
Log Archive: Removing /home/emen2/db/log/log.0000000308
Log Archive: Removing /home/emen2/db/log/log.0000000309
Log Archive: Removing /home/emen2/db/log/log.0000000310

Log Archive

EMEN2 uses Berkeley DB as the underlying database technology. To provide guarantees about transaction atomicity and durability, Berkeley DB writes changes to a log file on stable storage before a transaction is marked as committed. Database files are not updated until this step is complete. In the event of a crash or hardware failure, the database files can be checked against the log files to correct any errors or missing data.
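
The log-before-data ordering described above can be illustrated with a toy key/value store. This is only a sketch of the general write-ahead logging idea, not Berkeley DB's implementation or file format: the ToyWAL class, its JSON-lines log, and the in-memory dictionary standing in for the database files are all assumptions made for illustration.

```python
import json
import os


class ToyWAL(object):
    """Toy store illustrating write-ahead logging (not Berkeley DB)."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}  # stands in for the database files
        self.recover()

    def commit(self, changes):
        # 1. Append the change to the log and force it to stable storage.
        with open(self.log_path, "a") as f:
            f.write(json.dumps(changes) + "\n")
            f.flush()
            os.fsync(f.fileno())
        # 2. Only after the log write succeeds, update the "database".
        self.data.update(changes)

    def recover(self):
        # Rebuild the state by replaying every complete log record in order.
        self.data = {}
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path) as f:
            for line in f:
                try:
                    self.data.update(json.loads(line))
                except ValueError:
                    # A torn final record means that commit never completed.
                    break
```

If the process dies after step 1 but before step 2, a new ToyWAL opened on the same log file still recovers the committed change, which is the property that makes the archived log files valuable for rebuilding a database.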

The log files are stored in $DB_HOME/log as log.XXXXXXXXXX, where XXXXXXXXXX is a zero-padded sequential integer starting from 1 (for example, log.0000000303). With default settings, the files are 8 MB each. As one log file is finished, the next log file in the sequence is created and used as the active log file.
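
To inspect the log directory, the file names can be parsed for their sequence numbers. The sketch below assumes the ten-digit naming seen in the example output (such as log.0000000311). The helper names are hypothetical, and treating the highest-numbered file as the active log is a simplification: which logs are actually safe to archive is decided by the database after a checkpoint, not by this script.

```python
import os
import re

# Matches names like log.0000000311 (ten digits, as in the example output).
LOG_NAME = re.compile(r"^log\.(\d{10})$")


def list_logs(log_dir):
    """Return [(sequence_number, filename), ...] sorted by sequence number."""
    found = []
    for name in os.listdir(log_dir):
        match = LOG_NAME.match(name)
        if match:
            found.append((int(match.group(1)), name))
    return sorted(found)


def older_logs(log_dir):
    """Every log except the highest-numbered one, assumed to be active."""
    return [name for _, name in list_logs(log_dir)[:-1]]
```

Multiplying the number of entries returned by list_logs by the log file size gives a rough estimate of the disk space the log directory is using.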

Log files that are no longer open or in use may be archived after a "checkpoint" is made. This frees up disk space in the DB_HOME environment, and lets the administrator move the logs to long-term archival storage.

The EMEN2 backup utility will archive finished logs when run with the "--archive" flag. Finished log files will be moved to the location specified by the "ARCHIVE" configuration setting.

backup.py --archive

Example:

# python ./cmdlineutils/backup.py --archive
        ... snip: startup ...
Opening Database Environment: /home/emen2/db/
Log Archive: Checkpoint
Log Archive: /home/emen2/db/log/log.0000000303 -> /home/emen2/log_archive/log.0000000303
Log Archive: /home/emen2/db/log/log.0000000304 -> /home/emen2/log_archive/log.0000000304
Log Archive: /home/emen2/db/log/log.0000000305 -> /home/emen2/log_archive/log.0000000305

The log archive directory should be part of your normal backup procedures. In a "worst case scenario" failure, these log files will be necessary to rebuild the database.

You may want to run this periodically as a cron job, combined with an action to copy the log archive directory to a different system.
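
The copy step could look something like the following sketch. It is not part of EMEN2: the function name copy_new_logs is hypothetical, and it assumes the destination is a locally mounted path (for example, a network share) on the other system. It copies only files not already present at the destination, and writes each one under a temporary name first so that an interrupted run does not leave a truncated log behind under its final name.

```python
import os
import shutil


def copy_new_logs(archive_dir, dest_dir):
    """Copy files from archive_dir that are missing from dest_dir.

    Hypothetical helper. Returns the sorted list of copied file names.
    """
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    copied = []
    for name in sorted(os.listdir(archive_dir)):
        src = os.path.join(archive_dir, name)
        dst = os.path.join(dest_dir, name)
        if os.path.isfile(src) and not os.path.exists(dst):
            # Copy under a temporary name, then rename into place.
            tmp = dst + ".part"
            shutil.copy2(src, tmp)
            os.rename(tmp, dst)
            copied.append(name)
    return copied
```

A cron job could run "backup.py --archive" and then call this function with the ARCHIVE directory as the source. Running it repeatedly is harmless, since files already present at the destination are skipped.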

In addition to log archiving, you will want to maintain hot and cold backups.