Replication is the duplication of server data from one Perforce Server to another Perforce Server, ideally in real time. Replication can reduce server load (for example, by directing reporting tasks at the replicated data) and support disaster recovery by minimizing downtime and data loss.
Perforce's p4 replicate utility replicates server metadata (that is, the information contained in the db.* files) unidirectionally from one server to another (the replica server). To achieve a fully functional replica server, you can configure the replica server to read the master server's versioned files (the ,v files that contain the deltas produced when new versions are submitted), or use third-party or native OS functionality to mirror the versioned content to the replica server's root.
You can use p4 replicate to maintain a replica server as an up-to-date warm standby system, to be used if the master server fails. Such replica servers require that both server metadata and versioned files be replicated. The p4 replicate command replicates metadata only; to mirror the master server's versioned files, use tools such as rsync.
The p4 replicate command maintains transactional integrity while reading transactions out of an originating server and copying them into a replica server's database.
You run p4 replicate on the machine that will host the replica Perforce Server.
p4 replicate fetches journal records from the master Perforce Server, and provides these records to a subprocess that restores the records into a database.
Finally, you run a p4d for the replica Perforce Server that points to the database being restored. Users may connect to the replica server as they would connect to the master server.
• p4 replicate does not read compressed journals. The master server must therefore not compress rotated journals until the replica server has fetched all journal records from the older journals.
• p4 replicate replicates only metadata. If your application requires replication of (or access to) archive files, those files must be accessible to the replica Perforce Server; use a network-mounted file system, or utilities such as rsync or SAN replication.
• Users must not submit changes to the replica server. To prevent submission of changes, set permissions on the replica archive files to read-only (relative to the userid that owns the replica p4d process). Alternatively, install trigger scripts on both the master and the replica server, written so that they always succeed on the master server and always fail on the replica server.
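The read-only restriction can be applied with ordinary file permissions; a minimal sketch, using illustrative paths rather than Perforce defaults:

```shell
# Make a replica's archive files read-only for the userid that owns the
# replica p4d process (paths and filenames are illustrative).
mkdir -p /tmp/replica_demo/depot
echo "head revision data" > /tmp/replica_demo/depot/file.c,v
# In practice you would apply this recursively (chmod -R a-w) to the whole
# archive tree; here a single ,v file stands in for the archive.
chmod 0444 /tmp/replica_demo/depot/file.c,v
ls -l /tmp/replica_demo/depot/file.c,v
```

The file remains readable by the replica p4d, but any attempt to write to it by a non-privileged process fails.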
Replica servers are supplied with metadata from a master server by means of the p4 replicate command, which polls the master server for new journal entries and either prints them on standard output or pipes them to a subprocess supplied on the command line. In most cases, the command supplied to p4 replicate is a variation of p4d -jr, which reads the journal records from the master server into the set of database files used by the replica server. The replica server itself is started as usual, for example:
p4d -r replicaroot -p replicaport
A typical invocation of p4 replicate looks like this:
p4 replicate -s statefile -i interval -J prefix -k command
where command is the subprocess (typically a variation of p4d -jr) that receives the journal records. The flags are as follows:
Use -s to specify the state file, a one-line text file that determines where subsequent invocations of p4 replicate start reading data. The format is journalno/byteoffset, where journalno is the number of the most recent journal and byteoffset is the number of bytes to skip before reading. If you do not specify a byte offset, p4 replicate starts reading at the beginning of the specified journal file. If no statefile exists, replication commences with the first available journal, and a statefile is created for you.
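The statefile is simple enough to inspect by hand. The following sketch (the statefile path is illustrative) shows how the journalno/byteoffset fields split apart:

```shell
# Read a p4 replicate state file of the form "journalno/byteoffset";
# the byte offset is optional (statefile path is illustrative).
echo "12/4096" > /tmp/statefile_demo
state=$(cat /tmp/statefile_demo)
journal=${state%%/*}    # text before the first "/": journal number
offset=${state#*/}      # text after the first "/": byte offset
if [ "$offset" = "$state" ]; then
    offset=0            # no "/" present: read from the start of the journal
fi
echo "journal=$journal offset=$offset"
```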
Use -i to specify the polling interval in seconds. The default is two seconds. To disable polling (that is, to check once for updated journal entries and then exit), specify an interval of 0.
Use -J prefix to specify a filename prefix for the journal, such as that used with p4d -jc prefix.
By default, p4 replicate shuts down the pipe to the command subprocess between polling intervals; in most cases, you will want to use -k to keep the pipe open.
If you use -k, you must also use the -jrc option (see p4 replicate commands) for consistency checking. Conversely, if you are not using -jrc, do not use -k to keep the connection open.
By default, p4 replicate continues to poll the master server until it is stopped by the user. Use -x to have p4 replicate exit when journal rotation is detected. This option is typically used in offline checkpointing configurations.
For a complete list of the flags usable with p4 replicate, see the Command Reference entry for p4 replicate.
A typical command supplied to p4 replicate looks like this:
p4d -r replicaroot -f -b 1 -jrc -
This invocation of p4d uses three flags that are new as of Release 2009.2:
The -jrc flag instructs p4d to check for consistency when reading journal records. Batches of journal records are read in increasing size until p4d processes a marker indicating that all transactions are complete. After the marker is processed, the affected database tables are locked, the changes are applied, and the tables are unlocked. Because the server always remains in a state of transactional integrity, other users can use the replica server while the journal transactions from the master server are being applied.
The -b flag controls bunching: journal records are normally gathered in batches (the default is 5000 records per update), sorted, and stripped of duplicates before the database is updated. Replication requires serial processing of the journal records, so a bunch size of 1 is specified and each line is read individually.
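The difference between the default bunching behavior and a bunch size of 1 can be illustrated with ordinary shell tools (the record names are made up; real journal records are Perforce metadata lines):

```shell
# Default bunching: a batch of records is sorted and deduplicated before
# the database update, so the original arrival order is not preserved.
printf 'rec_b\nrec_a\nrec_a\n' > /tmp/bunch_demo
sort -u /tmp/bunch_demo     # bunched: sorted, duplicates removed
# With a bunch size of 1, each record is applied individually, in arrival
# order, which is the serial processing that replication requires.
cat /tmp/bunch_demo
```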
The -f flag, supplied to p4d in conjunction with the -jr flag, forces p4d to ignore failures to delete records. This flag is required for certain replication configurations because, depending on your use case, some tables on the replica server will differ from those on the master server.
The basic replication configuration consists of a master server and a replica server. The replica server has a replica of the master server's metadata, but none of the versioned files. This configuration is useful for offloading server-intensive report generation, and for performing offline checkpoints.
p4 replicate keeps track of its state (the most recent checkpoint sequence number read, and a byte offset for subsequent runs) in a statefile.
The replica server can be stopped, and checkpoints can be performed against it, just as they would be with the master server. The advantage is that the checkpointing process can now take place without any downtime on the master server.
Users connect to replica servers by setting P4PORT as they would with any other server.
Commands that require access to versioned file data (p4 sync, for example) fail, because this basic configuration replicates only the metadata and does not have access to the versioned files.
Because the depot table (db.depot) is part of the replicated metadata, the replica server must access the versioned files through paths that are identical to those on the master server. If the master server's depot table has archives specified with full paths (that is, if local depots on the master server were specified with absolute paths, rather than paths relative to the master server's root directory), the same paths must exist on the replica server. If the master server's local depots were specified with relative paths, the versioned files can be located anywhere on the replica server by using links from the replica server's root directory.
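For example, when the master's depots use relative paths, the replica's versioned files can live on a separate volume and be exposed through a link under the replica root; a sketch with illustrative paths:

```shell
# Keep the actual archive files on a large volume, and expose them to the
# replica p4d through a symbolic link under the replica root
# (all paths are illustrative, not Perforce defaults).
mkdir -p /tmp/bigdisk_demo/depot /tmp/p4replica_demo
ln -sfn /tmp/bigdisk_demo/depot /tmp/p4replica_demo/depot
ls -ld /tmp/p4replica_demo/depot
```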
To support warm standby servers, a replica server requires an up-to-date copy of both the master server's metadata and its versioned files. To replicate the master server's metadata, use p4 replicate. To replicate the versioned files from the master server, use native operating system or third-party utilities. Ensure that the replicated versioned files remain synchronized with the metadata (check the polling interval used by p4 replicate against the rate at which transactions are applied to the master server's db.* files). The %changeroot% trigger variable may be of use; it contains the top-level directory of files affected by a changelist.
Disaster recovery and failover strategies are complex and tend to be site-specific. Perforce Consultants are available to assist organizations in the planning and deployment of disaster recovery and failover strategies.
Copyright 1997-2009 Perforce Software.