Replication is the duplication of server data from one Perforce Server to another Perforce Server, ideally in real time. Replication can reduce server load (for example, by directing reporting tasks at the replicated data) and support disaster recovery by minimizing downtime and data loss.
Perforce's p4 replicate utility replicates server metadata (that is, the information contained in the db.* files) unidirectionally from one server to another (the replica server). To achieve a fully functional replica server, you can configure the replica server to read the master server's versioned files (the ,v files that contain the deltas produced when new versions are submitted), or use third-party or native OS functionality to mirror the versioned content to the replica server's root.
You can use p4 replicate to maintain a replica server as an up-to-date warm standby system, to be used if the master server fails. Such replica servers require that both server metadata and versioned files be replicated. The p4 replicate command replicates metadata only; to mirror the master server's versioned files, use tools such as rsync.
The p4 replicate command maintains transactional integrity while reading transactions out of an originating server and copying them into a replica server's database.
You run p4 replicate on the machine that will host the replica Perforce Server.
p4 replicate fetches journal records from the master Perforce Server, and provides these records to a subprocess that restores the records into a database.
Finally, you run a p4d for the replica Perforce Server that points to the database being restored. Users may connect to the replica server as they would connect to the master server.
• p4 replicate does not read compressed journals. The master server must therefore not compress rotated journals until the replica server has fetched all journal records from the older journals.
• p4 replicate replicates only metadata. If your application requires replication of (or access to) archive files, those files must be accessible to the replica Perforce Server; use a network-mounted file system, or utilities such as rsync or SAN replication.
• Users must not submit changes to the replica server. To prevent submission of changes, set permissions on the replica archive files to read-only (relative to the userid that owns the replica p4d process). Alternatively, install trigger scripts on both the master and the replica server, written so that they always succeed on the master server and always fail on the replica server.
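The read-only restriction can be applied with ordinary file permissions; a minimal sketch, using illustrative paths rather than Perforce defaults:

```shell
# Make a replica's archive files read-only for the userid that owns the
# replica p4d process (paths and filenames are illustrative).
mkdir -p /tmp/replica_demo/depot
echo "head revision data" > /tmp/replica_demo/depot/file.c,v
# In practice you would apply this recursively (chmod -R a-w) to the whole
# archive tree; here a single ,v file stands in for the archive.
chmod 0444 /tmp/replica_demo/depot/file.c,v
ls -l /tmp/replica_demo/depot/file.c,v
```

The file remains readable by the replica p4d, but any attempt to write to it by a non-privileged process fails.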
Replica servers are supplied with metadata from a master server by means of the p4 replicate command, which polls the master server for new journal entries and either prints them on standard output or pipes them to a subprocess supplied on the command line. In most cases, the command supplied to p4 replicate is a variation of p4d -jr, which reads the journal records from the master server into the set of database files used by the replica server. The replica server itself is started as usual, for example:
p4d -r replicaroot -p replicaport
A typical invocation of p4 replicate looks like this:
p4 replicate -s statefile -i interval -J prefix -k command
where command is the subprocess (typically a variation of p4d -jr) that receives the journal records. The flags are as follows:
Use -s to specify the state file, a one-line text file that determines where subsequent invocations of p4 replicate start reading data. The format is journalno/byteoffset, where journalno is the number of the most recent journal and byteoffset is the number of bytes to skip before reading. If you do not specify a byte offset, p4 replicate starts reading at the beginning of the specified journal file. If no statefile exists, replication commences with the first available journal, and a statefile is created for you.
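The statefile is simple enough to inspect by hand. The following sketch (the statefile path is illustrative) shows how the journalno/byteoffset fields split apart:

```shell
# Read a p4 replicate state file of the form "journalno/byteoffset";
# the byte offset is optional (statefile path is illustrative).
echo "12/4096" > /tmp/statefile_demo
state=$(cat /tmp/statefile_demo)
journal=${state%%/*}    # text before the first "/": journal number
offset=${state#*/}      # text after the first "/": byte offset
if [ "$offset" = "$state" ]; then
    offset=0            # no "/" present: read from the start of the journal
fi
echo "journal=$journal offset=$offset"
```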
Use -i to specify the polling interval in seconds. The default is two seconds. To disable polling (that is, to check once for updated journal entries and then exit), specify an interval of 0.
Use -J prefix to specify a filename prefix for the journal, such as that used with p4d -jc prefix.
By default, p4 replicate shuts down the pipe to the command subprocess between polling intervals; in most cases, you will want to use -k to keep the pipe open.
If you use -k, you must also use the -jrc option (see p4 replicate commands) for consistency checking. Conversely, if you are not using -jrc, do not use -k to keep the connection open.
By default, p4 replicate continues to poll the master server until it is stopped by the user. Use -x to have p4 replicate exit when journal rotation is detected. This option is typically used in offline checkpointing configurations.
For a complete list of the flags usable with p4 replicate, see the Command Reference entry for p4 replicate.
A typical command supplied to p4 replicate looks like this:
p4d -r replicaroot -f -b 1 -jrc -
This invocation of p4d uses three flags that are new as of Release 2009.2:
The -jrc flag instructs p4d to check for consistency when reading journal records. Batches of journal records are read in increasing size until p4d processes a marker indicating that all transactions are complete. After the marker is processed, the affected database tables are locked, the changes are applied, and the tables are unlocked. Because the server always remains in a state of transactional integrity, other users can use the replica server while the journal transactions from the master server are being applied.
The -b flag controls bunching: journal records are normally gathered in batches (the default is 5000 records per update), sorted, and stripped of duplicates before the database is updated. Replication requires serial processing of the journal records, so a bunch size of 1 is specified and each line is read individually.
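The difference between the default bunching behavior and a bunch size of 1 can be illustrated with ordinary shell tools (the record names are made up; real journal records are Perforce metadata lines):

```shell
# Default bunching: a batch of records is sorted and deduplicated before
# the database update, so the original arrival order is not preserved.
printf 'rec_b\nrec_a\nrec_a\n' > /tmp/bunch_demo
sort -u /tmp/bunch_demo     # bunched: sorted, duplicates removed
# With a bunch size of 1, each record is applied individually, in arrival
# order, which is the serial processing that replication requires.
cat /tmp/bunch_demo
```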
The -f flag, supplied to p4d in conjunction with the -jr flag, forces p4d to ignore failures to delete records. This flag is required for certain replication configurations because, depending on your use case, some tables on the replica server will differ from those on the master server.
The basic replication configuration consists of a master server and a replica server. The replica server has a replica of the master server's metadata, but none of the versioned files. This configuration is useful for offloading server-intensive report generation, and for performing offline checkpoints.
p4 replicate keeps track of its state (the most recent checkpoint sequence number read, and a byte offset for subsequent runs) in a statefile.
The replica server can be stopped, and checkpoints can be performed against it, just as they would be with the master server. The advantage is that the checkpointing process can now take place without any downtime on the master server.
Users connect to replica servers by setting P4PORT as they would with any other server.
Commands that require access to versioned file data (p4 sync, for example) fail, because this basic configuration replicates only the metadata and does not have access to the versioned files.
Because the depot table (db.depot) is part of the replicated metadata, the replica server must access the versioned files through paths that are identical to those on the master server. If the master server's depot table has archives specified with full paths (that is, if local depots on the master server were specified with absolute paths, rather than paths relative to the master server's root directory), the same paths must exist on the replica server. If the master server's local depots were specified with relative paths, the versioned files can be located anywhere on the replica server by using links from the replica server's root directory.
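For example, when the master's depots use relative paths, the replica's versioned files can live on a separate volume and be exposed through a link under the replica root; a sketch with illustrative paths:

```shell
# Keep the actual archive files on a large volume, and expose them to the
# replica p4d through a symbolic link under the replica root
# (all paths are illustrative, not Perforce defaults).
mkdir -p /tmp/bigdisk_demo/depot /tmp/p4replica_demo
ln -sfn /tmp/bigdisk_demo/depot /tmp/p4replica_demo/depot
ls -ld /tmp/p4replica_demo/depot
```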
To support warm standby servers, a replica server requires an up-to-date copy of both the master server's metadata and its versioned files. To replicate the master server's metadata, use p4 replicate. To replicate the versioned files from the master server, use native operating system or third-party utilities. Ensure that the replicated versioned files remain synchronized with the metadata (check the polling interval used by p4 replicate against the rate at which transactions are applied to the master server's db.* files). The %changeroot% trigger variable may be of use; it contains the top-level directory of files affected by a changelist.
Disaster recovery and failover strategies are complex and tend to be site-specific. Perforce Consultants are available to assist organizations in the planning and deployment of disaster recovery and failover strategies.
Copyright 1997-2009 Perforce Software.