EMEN2 Maintenance and Backups

An EMEN2 installation consists of several components:

EMEN2 Database Environment

There are several ways to back up your EMEN2 database environment.

In the simplest case, you can make a normal copy of the database environment. This is a "cold backup". It is the easiest and most reliable mechanism, but you must stop all database writes for the duration of the copy to ensure the integrity of the archive (see "Backup: Long Answer"). If your uptime requirements are not stringent, cold backups once a night or once a week using normal shell tools (see "Backup: Simple Answer" below) may be all you need.

If you want to perform very frequent backups, or do not want to stop the database environment, you can perform a "hot backup," which can be performed even while the database is active.

File Storage Area, Logs, etc.

The other directories contain ordinary files on disk and can be handled with standard shell tools.

Backup: Simple Answer

This will probably be sufficient for most users. Shut down the web server (see EMEN2/emen2control.py) and perform a normal cold backup of everything by rsync'ing to a remote backup server.

Example (default config, everything in /home/emen2):

[emen2@ncmidb ~/emen2]# python ./emen2control.py --shutdown
[emen2@ncmidb ~/emen2]# python ./emen2control.py --recover
[emen2@ncmidb ~]# cd /home/emen2/
[emen2@ncmidb ~]# rsync -vr db db_backup log_archive applog emen2files emen2tiles emen2 emen2@remotebackup:~/emen2backup/
        ....

Backup: Long Answer

An EMEN2 database environment contains three types of files: database files, log files, and region files.

Database files hold the key/value pairs that make up all the records in the database; additional database files hold the indexes. Database files are stored in $DB_HOME/data and its subdirectories. Log files contain data from all committed transactions, and are stored in $DB_HOME/log as log.XX, where XX are consecutive integers starting from 1.
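The log file naming scheme makes it easy to order log files or find the newest one. The following is a minimal sketch, not part of EMEN2; the helper names log_number and highest_log are hypothetical, and it assumes file names of the form log.<digits>, as in the example output elsewhere on this page.

```python
import re
from pathlib import PurePosixPath

# Matches names such as "log.0000000311"; anything else is not a log file.
LOG_NAME = re.compile(r"^log\.(\d+)$")


def log_number(path):
    """Return the sequence number encoded in a log file name, or None."""
    match = LOG_NAME.match(PurePosixPath(path).name)
    return int(match.group(1)) if match else None


def highest_log(paths):
    """Return the path of the highest-numbered log file, or None if there is none."""
    numbered = [(log_number(p), p) for p in paths]
    numbered = [item for item in numbered if item[0] is not None]
    return max(numbered)[1] if numbered else None
```

Comparing the parsed integers rather than the raw strings keeps the ordering correct even if the names are not zero-padded.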

To provide guarantees about transaction atomicity and durability, changes are first written to log files on stable storage before a transaction is marked as committed. The database files are not updated until this has been completed. In the event of a crash or hardware failure, the database files can be checked against the log files to correct any errors or missing data.
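The idea of logging first and updating the database files later can be illustrated with a toy in-memory model. This is only a sketch of the general technique (write-ahead logging); the ToyStore class and recover function are hypothetical and do not reflect how EMEN2 or its database engine is actually implemented.

```python
class ToyStore:
    """In-memory illustration of write-ahead logging (not EMEN2 code)."""

    def __init__(self):
        self.log = []      # stands in for the append-only log files
        self.data = {}     # stands in for the database files
        self.applied = 0   # number of log entries already applied to data

    def commit(self, changes):
        # The transaction counts as committed once it is in the log,
        # even though the database files have not been touched yet.
        self.log.append(dict(changes))

    def checkpoint(self):
        # Apply any logged changes that the database files are missing.
        for changes in self.log[self.applied:]:
            self.data.update(changes)
        self.applied = len(self.log)


def recover(log, stale_data):
    """Rebuild the current state by replaying every logged transaction in order."""
    data = dict(stale_data)
    for changes in log:
        data.update(changes)
    return data
```

Because each log entry here is a plain overwrite, replaying entries that were already applied is harmless, which is why recovery can start from possibly stale database files.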

Because a cold backup copies the database files, the database must be stopped so they are not changed while the backup is in progress. Once a cold backup is made, it can be updated with a hot backup. A hot backup only copies new log files, which are append-only, and does not require the database files to be stable during the backup.

backup.py

EMEN2's database core includes some methods to help manage database environment backups. These methods (db.archivelogs, db.coldbackup, db.hotbackup) can be called from the cmdlineutils/backup.py script.

Usage: backup.py [options]

Options:
  --help                Print help message
  -h HOME, --home=HOME  DB_HOME
  -c CONFIGFILE, --configfile=CONFIGFILE
  --archive             archive log files
  --cold                cold backup
  --hot                 hot backup
  --force               Force overwrite of existing backup

Cold Backup

To create a cold backup, shut down any open database processes (see emen2control.py).

Once all processes are stopped, you can either copy or tar the DB_HOME environment, or use the EMEN2 backup utility with the "--cold" option:

[emen2@ncmidb ~]# python cmdlineutils/backup.py --cold

This will run a database checkpoint, and create a cold backup in the path specified by BACKUPPATH. The database files, highest numbered log file, and configuration files will be copied.

To prevent overwriting an existing cold backup, the script will not run if the target directory exists. You can rename/remove the existing cold backup first, or specify the "--force" option to backup.py.
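The overwrite guard described above can be sketched as follows. This is an illustration only, not the code in backup.py: the function name prepare_backup_dir is hypothetical, and it assumes that forcing simply reuses the existing directory so that copied files replace older ones.

```python
from pathlib import Path


def prepare_backup_dir(target, force=False):
    """Create the backup directory, refusing to reuse an existing one unless forced."""
    target = Path(target)
    if target.exists() and not force:
        raise FileExistsError(
            f"{target} already exists; rename or remove it, or pass force=True"
        )
    # With force=True an existing directory is reused as-is (assumption).
    target.mkdir(parents=True, exist_ok=True)
    return target
```

Failing early, before any file is copied, means an existing backup is never left half overwritten by accident.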

Example:

[emen2@ncmidb ~]# python cmdlineutils/backup.py --cold
                ... snip: startup ...
Opening Database Environment: /home/emen2/db/
Cold Backup: Checkpoint
Cold Backup: Copying data: /home/emen2/db/data -> /home/emen2/db_backup/data
Cold Backup: Copying config: /home/emen2/db/config.yml -> /home/emen2/db_backup/config.yml
Cold Backup: Copying config: /home/emen2/db/DB_CONFIG -> /home/emen2/db_backup/DB_CONFIG
Cold Backup: Copying log: /home/emen2/db/log/log.0000000311 -> /home/emen2/db_backup/log/log.0000000311

Once you have created a cold backup, it can be updated by running a hot backup.

It is safe to copy hot/cold backups because they are not active database environments.

Hot Backup

A hot backup copies the log files written since the last backup into an existing cold backup and uses them to bring it up to date with the current state of the main database environment.

backup.py --hot

Example:

[emen2@ncmidb ~]# python cmdlineutils/backup.py --hot
                ... snip: startup ...
Opening Database Environment: /home/emen2/db/
Hot Backup: Log archive
Log Archive: Checkpoint
Log Archive: /home/emen2/db/log/log.0000000303 -> /home/emen2/log_archive/log.0000000303
Log Archive: /home/emen2/db/log/log.0000000304 -> /home/emen2/log_archive/log.0000000304
Log Archive: /home/emen2/db/log/log.0000000305 -> /home/emen2/log_archive/log.0000000305
Hot Backup: Copying log: /home/emen2/db/log/log.0000000303 -> /home/emen2/db_backup/log/log.0000000303
Hot Backup: Copying log: /home/emen2/db/log/log.0000000304 -> /home/emen2/db_backup/log/log.0000000304
Hot Backup: Copying log: /home/emen2/db/log/log.0000000305 -> /home/emen2/db_backup/log/log.0000000305
Hot Backup: Copying log: /home/emen2/db/log/log.0000000306 -> /home/emen2/db_backup/log/log.0000000306
Log Archive: Checkpoint
Log Archive: Removing /home/emen2/db/log/log.0000000303
Log Archive: Removing /home/emen2/db/log/log.0000000304
Log Archive: Removing /home/emen2/db/log/log.0000000305
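The pattern in the output above (every live log file is copied to the backup, while all but the highest-numbered one are archived and later removed) can be sketched as a small planning function. This is inferred from the sample output only; the function name plan_hot_backup is hypothetical, and the real selection logic in backup.py may differ.

```python
def plan_hot_backup(log_names):
    """Split live log file names into (to_copy, to_archive) lists.

    Assumes names of the form "log.<digits>".
    """
    ordered = sorted(log_names, key=lambda name: int(name.rsplit(".", 1)[1]))
    to_copy = ordered            # every live log file goes to the backup
    to_archive = ordered[:-1]    # assumption: the highest-numbered log is still in use
    return to_copy, to_archive


to_copy, to_archive = plan_hot_backup(
    ["log.0000000305", "log.0000000303", "log.0000000306", "log.0000000304"]
)
print("copy:", to_copy)
print("archive:", to_archive)
```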

Log Archive

Log archiving is normally done automatically as part of the hot backup process, but it can be invoked manually if necessary (e.g. when the DB_HOME partition is running out of disk space).

[emen2@ncmidb ~]# python cmdlineutils/backup.py --archive
        ... snip: startup ...
Opening Database Environment: /home/emen2/db/
Log Archive: Checkpoint
Log Archive: /home/emen2/db/log/log.0000000303 -> /home/emen2/log_archive/log.0000000303
Log Archive: /home/emen2/db/log/log.0000000304 -> /home/emen2/log_archive/log.0000000304
Log Archive: /home/emen2/db/log/log.0000000305 -> /home/emen2/log_archive/log.0000000305

Recovery

To prepare a cold/hot backup environment for use, run db_recover with the "-c" and "-h" flags. You should then copy the environment to the location specified by $DB_HOME.

Example:

[emen2@ncmidb ~]# db_recover -c -h db_backup
[emen2@ncmidb ~]# mv db db.crashed
[emen2@ncmidb ~]# cp -vr db_backup db
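If you want to script the recovery steps, the same three commands can be assembled and previewed before anything is executed. This is a minimal sketch: the helper names recovery_commands and run_recovery are hypothetical, and the directory names are the ones from the example above.

```python
import shlex
import subprocess


def recovery_commands(backup_dir="db_backup", db_home="db"):
    """Return the recovery steps from the example as argument lists."""
    return [
        ["db_recover", "-c", "-h", backup_dir],   # recover the backup environment
        ["mv", db_home, db_home + ".crashed"],    # keep the damaged environment
        ["cp", "-vr", backup_dir, db_home],       # put the recovered copy in place
    ]


def run_recovery(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Dry run by default: prints the commands without touching anything.
run_recovery(recovery_commands())
```

Using check=True stops the sequence at the first failing step, so the live environment is not moved aside if db_recover fails.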

Additional Help

If you have any questions about how best to back up your EMEN2 environment, or how to recover from a crash, please contact Ian Rees.

EMEN2/BackupsOld (last edited 2010-09-22 07:18:44 by root)