Scripting efficiently

The Helix Server Command-Line Client, p4, supports the scripting of any command that can be run interactively. Helix Server can process commands far faster than users can issue them, so in an all-interactive environment, response time is excellent. However, p4 commands issued by scripts (triggers or command wrappers, for example) can cause performance problems if you haven’t paid attention to their efficiency. This is not because p4 commands are inherently inefficient, but because the way you invoke p4 as an interactive user isn’t necessarily suitable for repeated iterations.

This section points out some common efficiency problems and solutions.

Iterating through files

Each Helix Server command issued causes a connection thread to be created and a p4d subprocess to be started. Reducing the number of Helix Server commands your script runs might make it more efficient if the command is lockless. Depending on how shared locks are used, however, it might be more efficient to have several commands operate on smaller sets of files than to have one command operate on a large set of files.

To minimize the number of commands, try this approach:

for i in $(p4 diff2 path1/... path2/...)
do
    [process diff output]
done

Instead of an inefficient approach like:

for i in $(p4 files path1/...)
do
    p4 diff2 path1/$i path2/$i
    [process diff output]
done
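
As a rough sketch of the single-command approach, the function below reads the output of one p4 diff2 invocation from standard input and tallies the file header lines by their trailing status word. The name summarize_diff2 is hypothetical, and the sketch assumes that each header has the form "==== left - right ==== status"; adjust the patterns if the output on your server differs.

```shell
#!/bin/sh
# Hypothetical helper: tally 'p4 diff2' header lines read from standard input.
# Assumes headers look like: ==== <left> - <right> ==== <status>
summarize_diff2() {
    content=0 identical=0 other=0
    while IFS= read -r line
    do
        case "$line" in
            "==== "*" ==== content")   content=$((content + 1)) ;;
            "==== "*" ==== identical") identical=$((identical + 1)) ;;
            "==== "*)                  other=$((other + 1)) ;;
        esac
    done
    echo "content=$content identical=$identical other=$other"
}

# Typical use, one server command for the whole tree:
#   p4 diff2 path1/... path2/... | summarize_diff2
```

Lines that do not start with "==== " (the diff text itself) are ignored, so the same loop works whether or not the diff bodies are present in the output.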

Using list input files

Any Helix Server command that accepts a list of files as a command-line argument can also read the same argument list from a file. Scripts can make use of the list input file feature by building up a list of files first, and then passing the list file to p4 -x.

For example, your script might look something like this:

for component in header1 header2 header3
do
    p4 edit ${component}.h
done

A more efficient alternative would be:

for component in header1 header2 header3
do
    echo ${component}.h >> LISTFILE
done
p4 -x LISTFILE edit

The -x file flag instructs p4 to read arguments, one per line, from the named file. If the file is specified as - (a dash), the standard input is read.

By default, the server processes arguments from -x file in batches of 128 arguments at a time; use the -b batchsize flag to pass arguments in a different batch size.
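
Because -x accepts - for standard input, a script can also skip the intermediate list file. The following is a minimal sketch of that variant; list_headers and edit_headers are hypothetical names, and the sketch assumes the component names contain no newlines.

```shell
#!/bin/sh
# Hypothetical helper: print one file name per line for each component given.
list_headers() {
    for component in "$@"
    do
        printf '%s.h\n' "$component"
    done
}

# Hypothetical wrapper: open all listed headers for edit with one p4 command.
# '-x -' makes p4 read its file arguments from standard input.
edit_headers() {
    list_headers "$@" | p4 -x - edit
}

# Typical use:
#   edit_headers header1 header2 header3
```

Piping avoids leaving a stale LISTFILE behind between runs, at the cost of not having the list on disk to inspect afterward.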

Using branch views

Branch views can be used with p4 integrate or p4 diff2 to reduce the number of Helix Server command invocations. For example, you might have a script that runs:

$ p4 diff2 pathA/src/...   pathB/src/...
$ p4 diff2 pathA/tests/... pathB/tests/...
$ p4 diff2 pathA/doc/...   pathB/doc/...

You can make it more efficient by creating a branch view that looks like this:

Branch:        pathA-pathB
View:
        pathA/src/...      pathB/src/...
        pathA/tests/...    pathB/tests/...
        pathA/doc/...      pathB/doc/...

…and replacing the three commands with one:

$ p4 diff2 -b pathA-pathB
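
A script can also create the branch view itself rather than relying on someone to define it by hand. The sketch below prints a branch specification that could be piped to p4 branch -i, which reads a specification from standard input. The function name branch_spec is hypothetical, and the paths are the placeholder paths used above, not real depot paths.

```shell
#!/bin/sh
# Hypothetical helper: print a branch spec mapping each subdirectory of one
# tree to the same subdirectory of another.
# Usage: branch_spec NAME FROM TO SUBDIR...
branch_spec() {
    name=$1 from=$2 to=$3
    shift 3
    printf 'Branch:\t%s\n' "$name"
    printf 'View:\n'
    for dir in "$@"
    do
        printf '\t%s/%s/... %s/%s/...\n' "$from" "$dir" "$to" "$dir"
    done
}

# Typical use: save the spec, then diff all three trees with one command.
#   branch_spec pathA-pathB pathA pathB src tests doc | p4 branch -i
#   p4 diff2 -b pathA-pathB
```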

Limiting label references

Repeated references to large labels can be particularly costly. Commands that refer to files using labels as revisions will scan the whole label once for each file argument. To reduce the load on the Helix Core Server, your script should fetch the list of labeled files from the server once, and then scan that output for the files it needs.

For example, this:

$ p4 files path/...@label | egrep "path/f1.h|path/f2.h|path/f3.h"

imposes a lighter load on the Helix Core Server than either this:

$ p4 files path/f1.h@label path/f2.h@label path/f3.h@label

or this:

$ p4 files path/f1.h@label
$ p4 files path/f2.h@label
$ p4 files path/f3.h@label
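
When the list of wanted files is long or built at run time, the client-side scan can be done with exact path matching instead of a regular expression. The following is a sketch only: filter_labeled is a hypothetical name, and it assumes that each line of p4 files output begins with the depot path followed by #revision and a space, and that the paths contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: keep only the 'p4 files' output lines for wanted files.
# Reads 'p4 files' output on standard input; arguments are exact depot paths.
# Assumes each output line starts with "<depot path>#<revision> ".
filter_labeled() {
    awk -v wanted="$*" '
        BEGIN {
            # Build a lookup table of the wanted depot paths.
            n = split(wanted, w, " ")
            for (i = 1; i <= n; i++) keep[w[i]] = 1
        }
        {
            # Strip "#rev - action change N (type)" to recover the path.
            p = $0
            sub(/#[0-9]+ .*$/, "", p)
            if (p in keep) print
        }
    '
}

# Typical use, one label scan on the server:
#   p4 files path/...@label | filter_labeled //depot/path/f1.h //depot/path/f2.h
```

Exact matching avoids the accidental hits that an unescaped dot in an egrep pattern can produce.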

The "temporary client workspace" trick described below can also reduce the number of times you have to refer to files by label.

On large sites, consider unloading infrequently-referenced or obsolete labels from the database. See Unloading infrequently-used metadata.

Using a temporary client workspace

Most Helix Server commands can process all the files in the current workspace view with a single command-line argument. By making use of a temporary client workspace with a view that contains only the files on which you want to work, you might be able to reduce the number of commands you have to run, or to reduce the number of file arguments you need to give each command.

For instance, suppose your script runs these commands:

$ p4 sync pathA/src/...@label
$ p4 sync pathB/tests/...@label
$ p4 sync pathC/doc/...@label

You can combine the command invocations and reduce the three label scans to one by using a client workspace specification that looks like this:

Client:        XY-temp
View:
        pathA/src/...      //XY-temp/pathA/src/...
        pathB/tests/...    //XY-temp/pathB/tests/...
        pathC/doc/...      //XY-temp/pathC/doc/...

Using this workspace specification, you can then run:

$ p4 -c XY-temp sync @label
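
For a script, the whole life cycle of the temporary workspace can be wrapped in one function. The sketch below is an assumption-laden outline, not a definitive implementation: temp_client_spec and sync_label_once are hypothetical names, the view uses the placeholder paths from the example above, and the specification includes a Root field, which a real workspace needs. It relies on p4 client -i to read the specification from standard input and p4 client -d to delete the workspace afterward.

```shell
#!/bin/sh
# Hypothetical helper: print a client spec whose view holds only the given paths.
# Usage: temp_client_spec NAME ROOT PATH...
temp_client_spec() {
    name=$1 root=$2
    shift 2
    printf 'Client:\t%s\n' "$name"
    printf 'Root:\t%s\n' "$root"
    printf 'View:\n'
    for dir in "$@"
    do
        printf '\t%s/... //%s/%s/...\n' "$dir" "$name" "$dir"
    done
}

# Hypothetical wrapper: create the workspace, sync it to a label, delete it.
# Usage: sync_label_once LABEL NAME ROOT PATH...
sync_label_once() {
    label=$1
    shift
    temp_client_spec "$@" | p4 client -i || return 1
    p4 -c "$1" sync "@$label"
    status=$?
    p4 client -d "$1"    # remove the workspace even if the sync failed
    return $status
}

# Typical use:
#   sync_label_once label XY-temp /tmp/XY-temp pathA/src pathB/tests pathC/doc
```

Deleting the workspace removes only its server-side record; any files synced under the root directory stay on disk until the script removes them.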