• Versioned files are files submitted by Perforce users. Versioned files are stored in directory trees called depots.
• Database files store metadata, including changelists, opened files, client specs, branch specs, and other data concerning the history and present state of the versioned files.
Database files appear as db.* files in the top level of the server root directory. Each db.* file contains a single, binary-encoded database table.
Disk space shortages, hardware failures, and system crashes can corrupt any of the Perforce server's files. That's why the entire Perforce root directory structure (your versioned files and your database) should be backed up regularly.
The files that constitute the Perforce database, on the other hand, are not guaranteed to be in a state of transactional integrity if archived by a conventional backup program. Restoring the db.* files from regular system backups can result in an inconsistent database. The only way to guarantee the integrity of the database after it's been damaged is to reconstruct the db.* files from Perforce checkpoint and journal files:
• A checkpoint is a snapshot or copy of the database at a particular moment in time.
• A journal is a log of updates to the database since the last snapshot was taken.
The checkpoint file is often much smaller than the original database, and it can be made smaller still by compressing it. The journal file, on the other hand, can grow quite large; it is truncated whenever a checkpoint is made, and the older journal is renamed. The older journal files can then be backed up offline, freeing up more space locally.
Because the information stored in the Perforce database is as irreplaceable as your versioned files, checkpointing and journaling are an integral part of administering a Perforce server, and should be part of your regular backup cycle.
A checkpoint is a file that contains all information necessary to re-create the metadata in the Perforce database. When you create a checkpoint, the Perforce database is locked, enabling you to take an internally consistent snapshot of that database.
Versioned files are backed up separately from checkpoints. This means that a checkpoint does not contain the contents of versioned files, and as such, you cannot restore any versioned files from a checkpoint. You can, however, restore all changelists, labels, jobs, and so on, from a checkpoint.
To guarantee database integrity upon restoration, the checkpoint must be as old as, or older than, the versioned files in the depot. This means that the database should be checkpointed, and the checkpoint generation must be complete, before the backup of the versioned files starts.
Checkpoints are not created automatically; someone or something must run the checkpoint command on the Perforce server machine. To create a checkpoint, invoke the p4d program with the -jc (journal-create) flag:
p4d -jc
You can create a checkpoint while the Perforce server (p4d) is running. The checkpoint is created in your server root directory (P4ROOT).
To make the checkpoint, p4d locks the database and then dumps its contents to a file named checkpoint.n in the P4ROOT directory, where n is a sequence number. Before unlocking the database, p4d also copies (on UNIX where the journal is uncompressed, renames) the journal file to a file named journal.n-1 in the P4ROOT directory (regardless of the directory in which the current journal is stored), and then truncates the current journal. This guarantees that the last checkpoint (checkpoint.n) combined with the current journal (journal) always reflects the full contents of the database at the time the checkpoint was created.
The sequence numbers reflect the roll-forward nature of the journal; to restore databases to older checkpoints, match the sequence numbers. That is, you can restore the database reflected by checkpoint.6 by restoring the database stored in checkpoint.5 and rolling forward the changes recorded in journal.5. In most cases, you're only interested in restoring the current database, which is reflected by the highest-numbered checkpoint.n rolled forward with the changes in the current journal.
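To make the numbering concrete, here is a minimal shell sketch that prints which checkpoint and rotated journal together reproduce the database captured by a given checkpoint. The function name files_to_reach is hypothetical, and the sketch assumes the default file names checkpoint.n and journal.n. It only prints names and does not touch a server.

```shell
#!/bin/sh
# Hypothetical helper: print the checkpoint and rotated journal that,
# replayed together, reproduce the database captured by checkpoint.N.
# Assumes the default names checkpoint.n and journal.n.
files_to_reach() {
    n=$1
    # checkpoint.N is rebuilt from checkpoint.(N-1), so N must be at least 2.
    if [ "$n" -lt 2 ]; then
        echo "sequence number must be at least 2" >&2
        return 1
    fi
    prev=$((n - 1))
    printf 'checkpoint.%s journal.%s\n' "$prev" "$prev"
}

files_to_reach 6   # prints: checkpoint.5 journal.5
```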
If you specify a prefix on the command line when creating the checkpoint (p4d -jc prefix), your checkpoint and journal files are named prefix.ckp.n and prefix.jnl.n respectively, where prefix is as specified on the command line and n is a sequence number. If no prefix is specified, the default filenames checkpoint.n and journal.n are used. You can store checkpoints and journals in the directory of your choice by specifying the directory as part of the prefix. (Rotated journals are stored in the P4ROOT directory, regardless of the directory in which the current journal is stored.)
Running p4 admin checkpoint is equivalent to p4d -jc. You must be a Perforce superuser to use p4 admin.
You can set up an automated program to create your checkpoints on a regular schedule. Be sure to always check the program's output to ensure that checkpoint creation was started. After successful creation, a checkpoint file can be compressed, archived, or moved onto another disk. At that time or shortly thereafter, back up the versioned files stored in the depot subdirectories.
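As one possible shape for such an automated job, the sketch below runs the checkpoint and reports whether it succeeded. The function name run_checkpoint and the log file argument are hypothetical, and the sketch assumes that p4d signals failure with a non-zero exit status; reviewing the captured output is still advisable.

```shell
#!/bin/sh
# Sketch of a scheduled checkpoint job. Assumptions (not from this guide):
# the function name, the log file argument, and that p4d exits non-zero
# when the checkpoint fails.
run_checkpoint() {
    root=$1   # server root (P4ROOT)
    log=$2    # hypothetical log file that receives p4d's output
    if p4d -r "$root" -jc >"$log" 2>&1; then
        echo "checkpoint finished; output saved in $log"
    else
        echo "checkpoint FAILED; output saved in $log" >&2
        return 1
    fi
}
```

A scheduler entry could then call run_checkpoint with the server root and a log path of your choice, and alert an administrator whenever the function returns a non-zero status.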
To restore from a backup, the checkpoint must be at least as old as the files in the depots; that is, the versioned files can be newer than the checkpoint, but not the other way around. As you might expect, the shorter this time gap, the better.
If the checkpoint command itself fails, contact Perforce technical support immediately. Checkpoint failure is usually a symptom of a resource problem (disk space, permissions, and so on) that can put your database at risk if not handled correctly.
The journal is the running transaction log that keeps track of all database modifications since the last checkpoint. It's the bridge between two checkpoints.
If you have Monday's checkpoint and the journal that was collected from then until Wednesday, those two files (Monday's checkpoint plus the accumulated journal) contain the same information as a checkpoint made Wednesday. If a disk crash were to cause corruption in your Perforce database on Wednesday at noon, for instance, you could still restore the database even though Wednesday's checkpoint hadn't yet been made.
To restore your database, you only need to keep the most recent journal file accessible, but it doesn't hurt to archive old journals with old checkpoints, should you ever need to restore to an older checkpoint.
For Windows installations, if you used the installer (perforce.exe) to install a Perforce server or service, journaling is turned on for you.
If you installed Perforce without the installer (for an example of when you might do this, see "Multiple Perforce services under Windows"), you do not have to create an empty file named journal in order to enable journaling under a manual installation on Windows.
If P4JOURNAL is left unset (and no location is specified on the command line), the default location for the journal is $P4ROOT/journal.
Be sure to create a new checkpoint with p4d -jc (and -J journalfile if required) immediately after enabling journaling. Once journaling is enabled, you'll need to make regular checkpoints to control the size of the journal file. An extremely large current journal is a sign that a checkpoint is needed.
Every checkpoint after your first checkpoint starts a new journal file and renames the old one. The old journal is renamed to journal.n, where n is a sequence number, and a new journal file is created.
By default, the journal is written to the file journal in the server root directory (P4ROOT). Because there is no sure protection against disk crashes, the journal file and the Perforce server root should be located on different filesystems, ideally on different physical drives. The name and location of the journal can be changed by specifying the name of the journal file in the environment variable P4JOURNAL or by providing the -J filename flag to p4d.
Whether you use P4JOURNAL or the -J journalfile option to p4d, the journal filename can be provided either as an absolute path, or as a path relative to the server root.
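For example, if the server was started with -J /usr/local/perforce/journalfile, either supply the same -J /usr/local/perforce/journalfile option when you create a checkpoint with p4d -jc, or set P4JOURNAL to /usr/local/perforce/journalfile and use p4d -jc on its own.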
If your P4JOURNAL environment variable (or command-line specification) doesn't match the setting used when you started the Perforce server, the checkpoint is still created, but the journal is neither saved nor truncated. This is highly undesirable!
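One way to avoid such a mismatch is to define the journal location once and derive both the server start-up command and the checkpoint command from it. The sketch below only prints the two command lines. The names JOURNAL, start_cmd, and checkpoint_cmd and the server root /p4/root are hypothetical, and a real start-up command would normally carry further options (such as a port) that are omitted here.

```shell
#!/bin/sh
# Sketch: keep the journal path in one variable so that server start-up and
# checkpointing always agree. All names here are hypothetical; the functions
# print the commands rather than running them.
JOURNAL=/usr/local/perforce/journalfile

start_cmd() {
    # $1 is the server root (P4ROOT)
    printf 'p4d -r %s -J %s\n' "$1" "$JOURNAL"
}

checkpoint_cmd() {
    # Same root and same journal path as start_cmd, plus -jc
    printf 'p4d -r %s -J %s -jc\n' "$1" "$JOURNAL"
}

start_cmd /p4/root
checkpoint_cmd /p4/root
```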
Your checkpoint and journal files are used to reconstruct the Perforce database files only. Your versioned files are stored in directories under the Perforce server root, and must be backed up separately.
Versioned files are stored in subdirectories beneath your server root. Text files are stored in RCS format, with filenames of the form filename,v. There is generally one RCS-format (,v) file per text file. Binary files are stored in full in their own directories named filename,d. Depending on the Perforce file type selected by the user storing the file, there can be one or more archived binary files in each filename,d directory. If more than one file resides in a filename,d directory, each file in the directory refers to a different revision of the binary file, and is named 1.n, where n is the revision number.
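The naming convention makes it easy to take a rough inventory of a depot directory before sizing a backup. The following sketch counts RCS-format text archives (,v files) and binary archive directories (,d directories). The function name count_archives and the directory passed to it are hypothetical.

```shell
#!/bin/sh
# Sketch: count text archives (*,v files) and binary archive directories
# (*,d) beneath a depot directory. The function name is hypothetical.
count_archives() {
    depot_dir=$1
    # $(( ... )) normalizes any padding that wc -l may add to its output.
    text=$(($(find "$depot_dir" -type f -name '*,v' | wc -l)))
    binary=$(($(find "$depot_dir" -type d -name '*,d' | wc -l)))
    printf 'text archives: %d\nbinary archive dirs: %d\n' "$text" "$binary"
}
```

Calling count_archives with the path of a depot subdirectory under the server root prints the two counts on separate lines.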
Perforce also supports the AppleSingle file format for Macintosh. These files are stored on the server in full and compressed, just like other binary files. They are stored in the Mac's AppleSingle file format; if need be, the files can be copied directly from the server root, uncompressed, and used as-is on a Macintosh.
Because Perforce uses compression in the depot file tree, do not assume compressibility of the data when sizing backup media. Both text and binary files are either compressed by the Perforce server (denoted by the .gz suffix) before storage, or they are stored uncompressed. At most installations, if any binary files in the depot subdirectories are being stored uncompressed, they were probably incompressible to begin with. (For example, many image, music, and video file formats are incompressible.)
In order to ensure that the versioned files reflect all the information in the database after a post-crash restoration, the db.* files must be restored from a checkpoint that is at least as old as (or older than) your versioned files. For this reason, create the checkpoint before backing up the versioned files in the depot directory or directories.
Although your versioned files can be newer than the data stored in your checkpoint, it is in your best interest to keep this difference to a minimum; in general, you'll want your backup script to back up your versioned files immediately after successfully completing a checkpoint.
You might want to use the -q (quiet) option with p4 verify. If called with the -q option, p4 verify produces output only when errors are detected.
The p4 verify command recomputes the MD5 signatures of all of your archived files and compares them with the signatures recorded when the files were first stored. It also checks that all files known to Perforce exist in the depot subdirectories.
By running p4 verify before the backup, you ensure that you create and store checksums and file length metadata for any files new to the depot since your last backup, and that this information is stored as part of the backup you're about to make.
Regular use of p4 verify is good practice not only because it enables you to spot any server corruption before a backup, but also because it gives you the ability, following a crash, to determine whether or not the files restored from your backups are in good condition.
For large installations, p4 verify might take some time to run. Furthermore, the database is locked when p4 verify is running, which prevents most other Perforce commands from being used. Administrators of large sites might choose to perform p4 verify on a weekly basis, rather than a nightly basis.
Because p4d locks the entire database when making the checkpoint, you do not generally have to stop your Perforce server during any part of the backup procedure.
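The ordering described in this section (verify, then checkpoint, then back up the versioned files) can be sketched as a single shell function. The function name nightly_backup, the use of tar, and the three arguments are assumptions made for illustration. The sketch treats any output from p4 verify -q as a problem, because the quiet option produces output only when errors are detected, and it assumes that p4d exits non-zero if the checkpoint fails.

```shell
#!/bin/sh
# Sketch of a backup run in the recommended order. Assumptions: the function
# name, the tar-based archive step, a working p4 client environment, and a
# non-zero exit status from p4d on failure.
nightly_backup() {
    root=$1    # server root (P4ROOT)
    depot=$2   # depot directory name, relative to the server root
    dest=$3    # hypothetical directory that receives the archive

    # 1. Verify the versioned files; -q prints only when errors are found.
    problems=$(p4 verify -q //... 2>&1)
    if [ -n "$problems" ]; then
        echo "p4 verify reported problems:" >&2
        echo "$problems" >&2
        return 1
    fi

    # 2. Create the checkpoint and wait for it to complete.
    p4d -r "$root" -jc || return 1

    # 3. Only then archive the versioned files (not the db.* files).
    tar -cf "$dest/depot.tar" -C "$root" "$depot" || return 1
}
```

A real backup job would also copy the new checkpoint and the rotated journal to the backup media; that step is left out to keep the sketch short.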
You never need to back up the db.* files. Your latest checkpoint and journal contain all the information necessary to re-create them. More significantly, a database restored from db.* files is not guaranteed to be in a state of transactional integrity. A database restored from a checkpoint is.
If the database files become corrupted or lost either because of disk errors or because of a hardware failure such as a disk crash, the database can be re-created with your stored checkpoint and journal.
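There are many ways in which systems can fail. Although this guide cannot address all failure scenarios, it can at least provide a general guideline for recovery from the two most common situations, specifically:
• corruption of your Perforce database only, without damage to your versioned files
• corruption of both your database and your versioned files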
If only your database has been corrupted (that is, your db.* files were on a drive that crashed, but you were using symbolic links to store your versioned files on a separate physical drive), you need only re-create your database.
mv your_root_dir/db.* /tmp
There can be no db.* files in the $P4ROOT directory when you start recovery from a checkpoint. Although the old db.* files are never used during recovery, it's good practice not to delete them until you're certain your restoration was successful.
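A slightly more defensive variant of the mv command moves the old database files into a dedicated holding directory instead of /tmp, so they are less likely to be overwritten or cleaned up before the restoration has been confirmed. The function name set_aside_db_files and the holding directory are hypothetical, and the sketch assumes the server has already been stopped.

```shell
#!/bin/sh
# Sketch: move every db.* file out of the server root into a holding
# directory. Names are hypothetical; assumes the server is not running.
set_aside_db_files() {
    root=$1      # server root (P4ROOT)
    holding=$2   # hypothetical directory that keeps the old db.* files
    mkdir -p "$holding" || return 1
    for f in "$root"/db.*; do
        # If no db.* files exist, the pattern stays unexpanded; skip it.
        [ -e "$f" ] || continue
        mv "$f" "$holding"/ || return 1
    done
}
```

Checkpoint and journal files in the server root are left untouched, because only names beginning with db. match the pattern.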
3. Invoke p4d with the -jr (journal-restore) flag, specifying your most recent checkpoint and current journal. If you explicitly specify the server root ($P4ROOT), the -r $P4ROOT argument must precede the -jr flag:
p4d -r $P4ROOT -jr checkpoint_file journal_file
If you're using the -z (compress) option to compress your checkpoints upon creation, you'll have to restore the uncompressed journal file separately from the compressed checkpoint:
p4d -r $P4ROOT -z -jr checkpoint_file.gz
p4d -r $P4ROOT -jr journal_file
You must explicitly specify the .gz extension yourself when using the -z flag, and ensure that the -r $P4ROOT argument precedes the -jr flag.
The database recovered from your most recent checkpoint, after you've applied the accumulated changes stored in the current journal file, is up to date as of the time of failure.
If both your database and your versioned files were corrupted, you need to restore both the database and your versioned files, and you'll need to ensure that the versioned files are no older than the restored database.
The journal contains a record of changes to the metadata and versioned files that occurred between the last backup and the crash. Because you'll be restoring a set of versioned files from a backup taken before that crash, the checkpoint alone contains the metadata useful for the recovery, and the information in the journal is of limited or no use.
mv your_root_dir/db.* /tmp
The corrupt db.* files aren't actually used in the restoration process, but it's safe practice not to delete them until you're certain your restoration was successful.
3. Invoke p4d with the -jr (journal-restore) flag, specifying only your most recent checkpoint:
p4d -r $P4ROOT -jr checkpoint_file
After recovery, your depot directories might not contain the newest versioned files. That is, files submitted after the last system backup but before the disk crash might have been lost on the server.
• In a case where only your versioned files (but not the database, which might have resided on a separate disk and been unaffected by the crash) were lost, you might also be able to make a separate copy of your database and apply your journal to it in order to examine recent changelists to track down which files were submitted between the last backup and the disk crash.
After any restoration, it's wise to run p4 verify to ensure that the versioned files are at least as new as the database:
p4 verify -q //...
This command verifies the integrity of the versioned files. The -q (quiet) option tells the command to produce output only on error conditions. Ideally, this command should produce no output.
If any versioned files are reported as MISSING by the p4 verify command, you'll know that there is information in the database concerning files that didn't get restored. The usual cause is that you restored from a checkpoint and journal made after the backup of your versioned files (that is, that your backup of the versioned files was older than the database).
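For a quick summary after a restoration, the output of p4 verify can be piped through a small filter that counts the files reported as MISSING. The function name count_missing is hypothetical, and because the exact layout of each output line is not shown in this guide, the sketch matches only on the word MISSING.

```shell
#!/bin/sh
# Sketch: count lines flagged MISSING in p4 verify output read from
# standard input. The function name is hypothetical.
count_missing() {
    # grep -c prints the count but exits non-zero when the count is 0,
    # so the exit status is deliberately ignored.
    grep -c 'MISSING' || true
}
```

It could be used as p4 verify -q //... | count_missing; a result of 0 means that no file was reported as MISSING.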
If (as recommended) you've been using p4 verify as part of your backup routine, you can run p4 verify on the server after restoration to reassure yourself that your restoration was successful.
Copyright 1997-2009 Perforce Software.