Backup and recovery planning

A commit server can use the same backup and high availability / disaster recovery (HA/DR) strategy as a master server. Edge servers contain unique information and should have a backup and an HA/DR plan. Whether an edge server outage is as urgent as a master server outage depends on your requirements. An edge server might have an HA/DR plan with a less ambitious Recovery Point Objective (RPO) and Recovery Time Objective (RTO) than the commit server.

If a commit server must be rebuilt from backups, each edge server must be rolled back to a backup taken before the commit server’s backup.

Alternatively, if your commit server has no local users, the commit server can be rebuilt from a fully replicated edge server. In this scenario, the edge server’s data is a superset of the commit server’s data.

Backing up and recovering an edge server is similar to backing up and restoring an offline replica server:

  1. On the edge server, schedule a checkpoint to be taken the next time journal rotation is detected on the commit server. For example:

    $ p4 -p myedgehost:myedgeport admin checkpoint

    The p4 pull command performs the checkpoint at the next rotation of the journal on the commit server. A stateCKP file is written to the P4ROOT directory of the edge server, recording the scheduling of the checkpoint.

  2. Rotate the journal on the commit server:

    $ p4 -p mycommithost:mycommitport admin journal
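
The two steps above could be wrapped in a small script. The following is a minimal sketch, not a supported tool: the run helper, the backup_edge function, and the DRY_RUN variable are hypothetical names introduced for illustration, and the host:port values are the placeholders used in the steps. With DRY_RUN=1 (the default in this sketch) each command is printed instead of executed, so the sequence can be reviewed first.

```shell
#!/bin/sh
# Placeholder addresses taken from the steps above; replace with your own.
EDGE_PORT="myedgehost:myedgeport"
COMMIT_PORT="mycommithost:mycommitport"

# Hypothetical switch: DRY_RUN=1 prints commands, DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

# Print or execute a command, depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

backup_edge() {
    # Step 1: schedule the checkpoint on the edge server.
    run p4 -p "$EDGE_PORT" admin checkpoint || return 1
    # Step 2: rotate the journal on the commit server; the scheduled
    # edge checkpoint is taken when this rotation is detected.
    run p4 -p "$COMMIT_PORT" admin journal
}

backup_edge
```

Ordering matters here: the sketch stops if scheduling the checkpoint fails, so the commit server journal is not rotated without a checkpoint pending on the edge server.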

As long as the edge server’s replication state file is included in the backup, the edge server can be restored and resume service. If the edge server was offline for a long period of time, it might need to catch up on the activity on the commit server.

As part of a failover plan for a commit server, make sure that the edge servers are redirected to use the new commit server.

Note

For commit servers with no local users, edge servers could take significantly longer to checkpoint than the commit server. You might want to use a different checkpoint schedule for edge servers than for commit servers. If you use several edge servers with one commit server, stagger the edge server checkpoints so that they do not all occur at once and stall the system. Journal rotations for edge servers can be scheduled at the same time as journal rotations for commit servers.
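
One possible way to stagger checkpoints, assuming the commit server journal is rotated once per day, is to schedule a checkpoint on only one edge server before each rotation and to cycle through the edge servers in turn. The sketch below shows only the selection logic. The pick_edge function and the edge1:1666 style addresses are hypothetical and are not part of the p4 command set.

```shell
#!/bin/sh
# pick_edge DAY_NUMBER EDGE...
# Print the edge server whose turn it is on the given day number,
# cycling through the list in order (round-robin).
pick_edge() {
    day=$1
    shift
    # Nothing to choose from if no edge servers were listed.
    [ "$#" -gt 0 ] || return 1
    target=$(( day % $# ))
    i=0
    for edge in "$@"; do
        if [ "$i" -eq "$target" ]; then
            echo "$edge"
            return 0
        fi
        i=$(( i + 1 ))
    done
}

# Example: day number 4 with three hypothetical edge servers
# selects the second one (4 modulo 3 is 1, counting from 0).
pick_edge 4 edge1:1666 edge2:1666 edge3:1666
```

In a daily job, the day number could be derived from the system clock, and the selected address passed to p4 -p with the admin checkpoint command shown earlier, before the commit server journal is rotated. Whether one checkpoint per edge server every few days is frequent enough depends on your RPO.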