Differences between revisions 19 and 20
Revision 19 as of 2010-03-30 20:22:38
Size: 6535
Editor: root
Comment:
Revision 20 as of 2010-03-30 20:48:22
Size: 7326
Editor: root
Comment:

EMEN2 Maintenance and Backups

An EMEN2 installation consists of several components:

The EMEN2 database environment requires special backup procedures because it is an active database environment. The other directories are ordinary files on disk, however, and can be handled with standard backup tools (e.g. rsync).

There are several ways to back up your EMEN2 database environment. The simplest is to copy all the files directly to another location. This is a "cold backup." It is the simplest and most reliable backup mechanism, but it requires that you stop all database writes for the duration of the copy to ensure the integrity of the archive. Alternatively, you can perform a "hot backup," which can run even while the database is active. A hot backup brings an existing cold backup up to date. I recommend running cold backups once a week, with hot backups daily.
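The weekly/daily schedule above can be sketched as a small helper that a scheduled job might use to pick the backup.py option for a given day. This is only an illustration: the function name and the choice of Sunday for the cold backup are assumptions, and a cold backup still requires stopping database writes before it runs.

```python
import datetime


def backup_option(day, cold_weekday=6):
    """Return the backup.py option for the given date (hypothetical helper).

    cold_weekday follows date.weekday(): 0 is Monday, 6 is Sunday.
    Sunday is an arbitrary choice for the weekly cold backup.
    """
    return "--cold" if day.weekday() == cold_weekday else "--hot"


sunday = backup_option(datetime.date(2010, 3, 28))
tuesday = backup_option(datetime.date(2010, 3, 30))
```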

Notes on the EMEN2 database environment

An EMEN2 database environment contains three types of files: database files, log files, and region files.

Database files contain the key/value pairs that make up all the records in the database; additional database files hold the indexes. Database files are stored in $DB_HOME/data and its subdirectories. Log files contain data from all committed transactions and are stored in $DB_HOME/log as log.XXXXXXXXXX, where the suffix is a zero-padded sequence number starting from 1 (for example, log.0000000311). Region files hold the shared state of a running Berkeley DB environment and are not copied by the backup procedures described here.

To provide guarantees about transaction atomicity and durability, changes are first written to log files on stable storage before a transaction is marked as committed. The database files are not updated until this has been completed. In the event of a crash or hardware failure, the database files can be checked against the log files to correct any errors or missing data.
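The log-first ordering described above can be illustrated with a toy key/value store. This is a minimal sketch of the general write-ahead logging idea, not EMEN2 or Berkeley DB code; the class name, file names, and JSON format are all invented for the example.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key/value store with a write-ahead log (illustration only)."""

    def __init__(self, home):
        self.log_path = os.path.join(home, "log.txt")
        self.data_path = os.path.join(home, "data.json")
        self.data = {}
        if os.path.exists(self.data_path):
            with open(self.data_path) as f:
                self.data = json.load(f)
        self.recover()

    def commit(self, key, value):
        # Step 1: append the change to the log and force it to stable storage.
        with open(self.log_path, "a") as log:
            log.write(json.dumps([key, value]) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # Step 2: the change is now durable, so the transaction counts as committed.
        self.data[key] = value

    def checkpoint(self):
        # Bring the data file up to date with everything committed so far.
        with open(self.data_path, "w") as f:
            json.dump(self.data, f)

    def recover(self):
        # Replay the log over whatever the data file contained.
        if os.path.exists(self.log_path):
            with open(self.log_path) as log:
                for line in log:
                    key, value = json.loads(line)
                    self.data[key] = value


with tempfile.TemporaryDirectory() as home:
    store = TinyStore(home)
    store.commit("name", "EMEN2")
    # Simulate a crash before any checkpoint: only the log reached disk.
    data_file_written = os.path.exists(store.data_path)
    recovered = TinyStore(home).data
```

Because the log is written first, reopening the store after the simulated crash rebuilds the committed record even though the data file was never written.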

Because a cold backup copies the database files, the database must be stopped so they are not changed while the backup is in progress. Once a cold backup is made, it can be updated with a hot backup. A hot backup only copies new log files, which are append-only, and does not require the database files to be stable during the backup.
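Since log files are append-only, bringing a backup's log directory up to date amounts to copying the files it lacks and refreshing the newest one it already holds. The sketch below shows that idea under stated assumptions: copy_new_logs is a hypothetical helper, not the db.hotbackup method, and the directory layout only mirrors the example paths on this page.

```python
import os
import shutil
import tempfile


def copy_new_logs(log_dir, backup_log_dir):
    """Copy log files the backup lacks, plus the newest one it already holds."""
    os.makedirs(backup_log_dir, exist_ok=True)
    have = {n for n in os.listdir(backup_log_dir) if n.startswith("log.")}
    # Zero-padded names sort in numeric order, so max() is the newest log.
    newest_in_backup = max(have) if have else None
    copied = []
    for name in sorted(os.listdir(log_dir)):
        if not name.startswith("log."):
            continue
        # The newest log in the backup may have grown since it was copied.
        if name not in have or name == newest_in_backup:
            shutil.copy2(os.path.join(log_dir, name),
                         os.path.join(backup_log_dir, name))
            copied.append(name)
    return copied


with tempfile.TemporaryDirectory() as root:
    log_dir = os.path.join(root, "db", "log")
    backup_log_dir = os.path.join(root, "db_backup", "log")
    os.makedirs(log_dir)
    os.makedirs(backup_log_dir)
    for number in (309, 310, 311):
        with open(os.path.join(log_dir, "log.%010d" % number), "w") as f:
            f.write("records")
    # Pretend an earlier backup already holds logs 309 and 310.
    for number in (309, 310):
        shutil.copy2(os.path.join(log_dir, "log.%010d" % number), backup_log_dir)
    copied = copy_new_logs(log_dir, backup_log_dir)
    backed_up = sorted(os.listdir(backup_log_dir))
```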

backup.py

EMEN2's database core includes some methods to help manage database environment backups. These methods (db.archivelogs, db.coldbackup, db.hotbackup) can be called from the cmdlineutils/backup.py script.

Usage: backup.py [options]

Options:
  --help                Print help message
  -h HOME, --home=HOME  DB_HOME
  -c CONFIGFILE, --configfile=CONFIGFILE
  --archive             archive log files
  --cold                cold backup
  --hot                 hot backup
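For scheduled jobs it can be convenient to build the backup.py command line in one place. The following is a hedged sketch that uses only the options listed in the usage text above; the function name backup_command and the default script path are assumptions for illustration.

```python
import sys

MODES = ("archive", "cold", "hot")


def backup_command(mode, home=None, script="./cmdlineutils/backup.py"):
    """Build the argument list for one backup.py run (hypothetical helper)."""
    if mode not in MODES:
        raise ValueError("mode must be one of %s" % (MODES,))
    command = [sys.executable, script, "--" + mode]
    if home is not None:
        # --home=HOME is listed in the usage text as the DB_HOME option.
        command.append("--home=" + home)
    return command


command = backup_command("hot", home="/home/emen2/db")
```

The resulting list can be passed to subprocess.check_call(command) from the EMEN2 source directory.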

Cold Backup

To create a cold backup, shut down any open database processes (see emen2control.py), and use the EMEN2 backup utility with the "--cold" option.

backup.py --cold

This runs a database checkpoint and creates a cold backup in the path specified by BACKUPPATH. The database files, the highest-numbered log file, and the configuration files are copied.

To prevent overwriting an existing cold backup, the script will not run if the target directory exists. You should rename or remove the existing cold backup first.
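Renaming the previous cold backup can be scripted so the target directory is free before backup.py --cold runs. This is a minimal sketch, assuming the backup lives in a single directory; set_aside_backup and the timestamp suffix are inventions for the example and are not part of backup.py.

```python
import os
import tempfile
import time


def set_aside_backup(backup_path):
    """Rename an existing backup directory; return the new path, or None."""
    if not os.path.exists(backup_path):
        return None
    # Label the old backup with its last-modified time.
    stamp = time.strftime("%Y%m%d-%H%M%S",
                          time.localtime(os.path.getmtime(backup_path)))
    target = "%s.%s" % (backup_path.rstrip(os.sep), stamp)
    if os.path.exists(target):
        raise FileExistsError(target)
    os.rename(backup_path, target)
    return target


with tempfile.TemporaryDirectory() as root:
    backup_path = os.path.join(root, "db_backup")
    os.makedirs(os.path.join(backup_path, "data"))
    moved_to = set_aside_backup(backup_path)
    target_is_free = not os.path.exists(backup_path)
    old_backup_kept = os.path.isdir(os.path.join(moved_to, "data"))
    nothing_to_move = set_aside_backup(backup_path)
```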

Example:

# python ./cmdlineutils/backup.py --cold
                ... snip: startup ...
Opening Database Environment: /home/emen2/db/
Cold Backup: Checkpoint
Cold Backup: Copying data: /home/emen2/db/data -> /home/emen2/db_backup/data
Cold Backup: Copying config: /home/emen2/db/config.yml -> /home/emen2/db_backup/config.yml
Cold Backup: Copying config: /home/emen2/db/DB_CONFIG -> /home/emen2/db_backup/DB_CONFIG
Cold Backup: Copying log: /home/emen2/db/log/log.0000000311 -> /home/emen2/db_backup/log/log.0000000311

Once you have created a cold backup, it can be updated by running a hot backup.

Hot Backup

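A hot backup copies the log files written since the cold backup was made into that existing cold backup and uses them to bring it up to date with the current state of the main database environment. As the example output shows, it also archives the older log files to a separate log archive directory and then removes them from the main environment, leaving the highest-numbered log file in place.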

backup.py --hot

Example:

# python ./cmdlineutils/backup.py --hot
                ... snip: startup ...
Opening Database Environment: /home/emen2/db/
Hot Backup: Log archive
Log Archive: Checkpoint
Log Archive: /home/emen2/db/log/log.0000000303 -> /home/emen2/log_archive/log.0000000303
Log Archive: /home/emen2/db/log/log.0000000304 -> /home/emen2/log_archive/log.0000000304
Log Archive: /home/emen2/db/log/log.0000000305 -> /home/emen2/log_archive/log.0000000305
Log Archive: /home/emen2/db/log/log.0000000306 -> /home/emen2/log_archive/log.0000000306
Log Archive: /home/emen2/db/log/log.0000000307 -> /home/emen2/log_archive/log.0000000307
Log Archive: /home/emen2/db/log/log.0000000308 -> /home/emen2/log_archive/log.0000000308
Log Archive: /home/emen2/db/log/log.0000000309 -> /home/emen2/log_archive/log.0000000309
Log Archive: /home/emen2/db/log/log.0000000310 -> /home/emen2/log_archive/log.0000000310
Hot Backup: Copying log: /home/emen2/db/log/log.0000000303 -> /home/emen2/db_backup/log/log.0000000303
Hot Backup: Copying log: /home/emen2/db/log/log.0000000304 -> /home/emen2/db_backup/log/log.0000000304
Hot Backup: Copying log: /home/emen2/db/log/log.0000000305 -> /home/emen2/db_backup/log/log.0000000305
Hot Backup: Copying log: /home/emen2/db/log/log.0000000306 -> /home/emen2/db_backup/log/log.0000000306
Hot Backup: Copying log: /home/emen2/db/log/log.0000000307 -> /home/emen2/db_backup/log/log.0000000307
Hot Backup: Copying log: /home/emen2/db/log/log.0000000308 -> /home/emen2/db_backup/log/log.0000000308
Hot Backup: Copying log: /home/emen2/db/log/log.0000000309 -> /home/emen2/db_backup/log/log.0000000309
Hot Backup: Copying log: /home/emen2/db/log/log.0000000310 -> /home/emen2/db_backup/log/log.0000000310
Hot Backup: Copying log: /home/emen2/db/log/log.0000000311 -> /home/emen2/db_backup/log/log.0000000311
Log Archive: Checkpoint
Log Archive: Removing /home/emen2/db/log/log.0000000303
Log Archive: Removing /home/emen2/db/log/log.0000000304
Log Archive: Removing /home/emen2/db/log/log.0000000305
Log Archive: Removing /home/emen2/db/log/log.0000000306
Log Archive: Removing /home/emen2/db/log/log.0000000307
Log Archive: Removing /home/emen2/db/log/log.0000000308
Log Archive: Removing /home/emen2/db/log/log.0000000309
Log Archive: Removing /home/emen2/db/log/log.0000000310

Log Archive

Log archival is generally run as part of the hot backup process, but can also be invoked manually with the --archive option.

# python ./cmdlineutils/backup.py --archive
        ... snip: startup ...
Opening Database Environment: /home/emen2/db/
Log Archive: Checkpoint
Log Archive: /home/emen2/db/log/log.0000000303 -> /home/emen2/log_archive/log.0000000303
Log Archive: /home/emen2/db/log/log.0000000304 -> /home/emen2/log_archive/log.0000000304
Log Archive: /home/emen2/db/log/log.0000000305 -> /home/emen2/log_archive/log.0000000305

EMEN2/BackupsOld (last edited 2010-09-22 07:18:44 by root)