Perforce introduced scripting APIs for Python, Ruby, Perl and PHP five years ago, and many customers have used these APIs to create powerful scripts and applications. Simplicity is an important reason why the scripting APIs are popular among Perforce administrators and other toolsmiths. Writing in a scripting language is usually quicker and easier than writing in C++ or Java. The scripting APIs also have a very simple way to get data back from Perforce: call the run method with just about any Perforce command, and you get back a list of key-value structures (dictionaries or hashtables).
But the simplicity of this approach has two disadvantages:
- If a Perforce command returns a large number of results, the script will use a corresponding amount of memory.
- Care is required to deal with errors, warnings, and other information from the server.
Both problems are addressed in the 2011.1 release of the scripting APIs through the introduction of output handlers.
Output handlers let you use a callback technique when getting data back from a Perforce command. Instead of getting the entire result set at once and then doing something with it, you can act on a single record at a time.
That solves both of the disadvantages noted above. If you run a Perforce command that returns 1,000 records, you can operate on each record as it is returned, rather than getting all 1,000 back at once (and holding them in memory). Any errors or warnings are returned as the command is executing rather than when the command finishes, giving you more flexibility to deal with problems related to a single result record.
A simple example is the best way to look at output handlers. In this Python script, I want to print out the depot location of each file in the current directory and all subdirectories (in other words, run p4 files ...). The script uses the run method with and without callbacks for illustration.
Without the callback, the call to p4 files returns the entire list of results at once, and we then process that list in a for loop. With the callback, the output handler’s outputStat method is called for a single record (file) at a time.
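A minimal sketch of such a script follows, assuming P4Python is installed and `p4` is an already connected P4 instance. The names `DepotFilePrinter`, `list_without_handler` and `list_with_handler` are illustrative only. The small fallback class exists purely so the sketch can be run without P4Python, and its numeric constants are an assumption.

```python
try:
    from P4 import OutputHandler  # real base class when P4Python is installed
except ImportError:
    class OutputHandler:  # stand-in (assumption) so the sketch runs without P4Python
        REPORT, HANDLED, CANCEL = 0, 1, 2


class DepotFilePrinter(OutputHandler):
    """Prints the depot path of each file record as it arrives."""

    def outputStat(self, stat):
        print(stat["depotFile"])
        # Handled here, so the record is kept out of run()'s result list.
        return OutputHandler.HANDLED


def list_without_handler(p4):
    # The complete result list is built first, then processed in a loop.
    for record in p4.run("files", "..."):
        print(record["depotFile"])


def list_with_handler(p4):
    # outputStat is called once per file while the command is running.
    previous = p4.handler
    p4.handler = DepotFilePrinter()
    try:
        p4.run("files", "...")
    finally:
        p4.handler = previous  # put back whatever was set before
```

Both functions print the same depot paths. The difference is that the second one never holds more than one record at a time.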
The gory details
Now let’s indulge in some technical details.
Output Handler Interface
To use a handler, you need to subclass P4.OutputHandler, which has the following signature (here in P4Python):
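The interface can be sketched as a subclass that overrides the five output methods named in this article and passes every record through unchanged. The class name `PassThroughHandler` and the parameter names are illustrative. The fallback class is a stand-in so the sketch runs without P4Python, and its numeric constants are an assumption; rely on the symbolic names.

```python
try:
    from P4 import OutputHandler  # real base class when P4Python is installed
except ImportError:
    class OutputHandler:  # stand-in (assumption) so the sketch runs without P4Python
        REPORT, HANDLED, CANCEL = 0, 1, 2


class PassThroughHandler(OutputHandler):
    """Overrides every output method and reports each record unchanged."""

    def outputStat(self, stat):        # tagged output, one dictionary per record
        return OutputHandler.REPORT

    def outputInfo(self, info):        # non-tagged or unstructured output
        return OutputHandler.REPORT

    def outputText(self, text):        # text data, such as a text file's content
        return OutputHandler.REPORT

    def outputBinary(self, data):      # binary data, such as a binary file's content
        return OutputHandler.REPORT

    def outputMessage(self, message):  # errors, warnings and other server messages
        return OutputHandler.REPORT
```

In practice you would override only the methods you care about and change their return value where you consume the record yourself.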
As you run a Perforce command while using an output handler, one of the output methods will be called for each record in the result set. The method selected depends on the type of data the server is giving you:
- outputStat is called for tagged output, and is the method used for most Perforce command output.
- outputBinary is called for binary data, such as the content of a binary file being printed.
- outputText is called for text data, such as the content of a text file being printed.
- outputInfo is called for tabular data, like non-tagged output or other unstructured messages.
- outputMessage is called for errors, warnings, and other server messages. When using an output handler, errors and warnings go through this method, not via exception or other error handling. Processing will continue for subsequent records unless you return one of the CANCEL options.
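As an illustration of the last point, here is a hedged sketch of a handler that collects server messages in outputMessage instead of relying on exceptions. The class name `MessageCollector` and the `stop_on_message` option are hypothetical. The sketch stores `str(message)` so that it does not depend on the exact type of object the API passes in, and the fallback class is only a stand-in for running the sketch without P4Python.

```python
try:
    from P4 import OutputHandler  # real base class when P4Python is installed
except ImportError:
    class OutputHandler:  # stand-in (assumption) so the sketch runs without P4Python
        REPORT, HANDLED, CANCEL = 0, 1, 2


class MessageCollector(OutputHandler):
    """Keeps errors and warnings for later inspection."""

    def __init__(self, stop_on_message=False):
        super().__init__()
        self.messages = []
        self.stop_on_message = stop_on_message

    def outputMessage(self, message):
        # str() keeps the sketch independent of the message object's type.
        self.messages.append(str(message))
        if self.stop_on_message:
            return OutputHandler.CANCEL   # ask the server to stop sending data
        return OutputHandler.HANDLED      # carry on with the next record
```

After the command finishes, the script can look at `messages` and decide what to do about each problem, rather than aborting on the first one.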
The argument provided to each method is a piece of data from the server. For example, the argument provided to outputStat is a dictionary.
After you handle the data provided by the server, you need to return a code indicating whether you have handled the record (OutputHandler.HANDLED) or want it passed through (OutputHandler.REPORT). A record you don’t handle is passed back as part of the output of the run command. You can also use one of the CANCEL options to indicate that you want to stop the command. (Do not expect the CANCEL to be instantaneous, though: because Perforce messages are buffered, expect to receive a few more calls to your handler before the flood of data stops.)
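The three kinds of return code can be sketched in one handler. The class name `LiveFilesOnly` and the `limit` parameter are hypothetical, and the sketch assumes the records carry an `action` field, as the tagged output of p4 files does. The fallback class is a stand-in for running the sketch without P4Python; its numeric constants are an assumption.

```python
try:
    from P4 import OutputHandler  # real base class when P4Python is installed
except ImportError:
    class OutputHandler:  # stand-in (assumption) so the sketch runs without P4Python
        REPORT, HANDLED, CANCEL = 0, 1, 2


class LiveFilesOnly(OutputHandler):
    """Drops deleted revisions and asks the server to stop after `limit` records."""

    def __init__(self, limit):
        super().__init__()
        self.limit = limit
        self.seen = 0

    def outputStat(self, stat):
        self.seen += 1
        if self.seen > self.limit:
            # Buffered records may still arrive, so this branch can run
            # several times before the data actually stops.
            return OutputHandler.CANCEL
        if stat.get("action") == "delete":
            return OutputHandler.HANDLED  # consumed here, not added to the result
        return OutputHandler.REPORT       # passed back in run()'s result
```

Because the handler keeps returning CANCEL for every late record, it stays correct no matter how many buffered calls follow the first cancel request.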
Turning on the handler
Enable the handler by setting an attribute (in P4Python and P4Ruby) or by calling a method (in P4Perl and P4PHP):
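Only the P4Python form is sketched here. It assumes `p4` is a connected P4 instance; `FileCounter` and `count_files` are illustrative names, and the fallback class is a stand-in so the sketch runs without P4Python.

```python
try:
    from P4 import OutputHandler  # real base class when P4Python is installed
except ImportError:
    class OutputHandler:  # stand-in (assumption) so the sketch runs without P4Python
        REPORT, HANDLED, CANCEL = 0, 1, 2


class FileCounter(OutputHandler):
    """Counts records without keeping any of them in memory."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def outputStat(self, stat):
        self.count += 1
        return OutputHandler.HANDLED


def count_files(p4, path):
    counter = FileCounter()
    # The attribute stays set for later commands on this connection
    # until it is replaced or cleared.
    p4.handler = counter
    p4.run("files", path)
    return counter.count
```

Since the handler remains in place after the call, any later command on the same connection is also routed through it, which is what the two alternatives described next avoid.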
In P4Python, there are two additional ways to use the handler. First, you can use a with block.
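The behavior of such a with block can be sketched with a hand-rolled context manager. The helper `temporary_handler` is hypothetical and is not P4Python's own API; it only sets the handler attribute and restores it afterwards. `DepotFilePrinter` and `print_depot_files` are illustrative names, and the fallback class is a stand-in so the sketch runs without P4Python.

```python
from contextlib import contextmanager

try:
    from P4 import OutputHandler  # real base class when P4Python is installed
except ImportError:
    class OutputHandler:  # stand-in (assumption) so the sketch runs without P4Python
        REPORT, HANDLED, CANCEL = 0, 1, 2


class DepotFilePrinter(OutputHandler):
    def outputStat(self, stat):
        print(stat["depotFile"])
        return OutputHandler.HANDLED


@contextmanager
def temporary_handler(p4, handler):
    """Hypothetical helper: install `handler` for the duration of a with block."""
    previous = p4.handler
    p4.handler = handler
    try:
        yield handler
    finally:
        p4.handler = previous  # restored even if the block raises


def print_depot_files(p4, path):
    with temporary_handler(p4, DepotFilePrinter()):
        p4.run("files", path)
    # Here p4.handler is back to its earlier value.
```

P4Python's built-in context manager plays the same role; I am assuming it is exposed as `p4.using_handler(handler)`, so check the documentation for your version before relying on that name.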
Or, you can specify a handler for a single call.
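A sketch of the single-call form, assuming that run accepts the handler as a `handler` keyword argument; treat that keyword as an assumption to verify against your P4Python version. `FileCounter` and `count_files_once` are illustrative names, and the fallback class is a stand-in so the sketch runs without P4Python.

```python
try:
    from P4 import OutputHandler  # real base class when P4Python is installed
except ImportError:
    class OutputHandler:  # stand-in (assumption) so the sketch runs without P4Python
        REPORT, HANDLED, CANCEL = 0, 1, 2


class FileCounter(OutputHandler):
    """Counts records without keeping any of them in memory."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def outputStat(self, stat):
        self.count += 1
        return OutputHandler.HANDLED


def count_files_once(p4, path):
    counter = FileCounter()
    # Assumption: handler= applies to this one call only.
    p4.run("files", path, handler=counter)
    return counter.count
```

This form suits one-off queries, because there is nothing to clean up afterwards.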
In both cases, P4.handler is reset to its original value after the block or the call completes.
- Using output handlers is optional. If you are not dealing with large datasets and don’t need fine-grained error handling, you probably won’t need them.
- While inside an OutputHandler method, you cannot call another command against the Perforce server on the same connection. If you need to run a command inside the handler, you must set up a second connection to the server to run that command. That second connection should be set up in advance outside the scope of the handler, or performance will suffer.
- P4Python provides the class P4.ReportHandler, instances of which simply print out each object prefixed with ‘info:’, ‘stat:’, ‘error:’, and so on.
- When using a handler, the APIs do not raise exceptions for errors and warnings. Instead, the method outputMessage is called to flag any problems. Processing does not stop; you should return CANCEL from this method if you want the server to stop sending you data.
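The second-connection advice in the list above can be sketched as a handler that receives an already connected second P4 instance when it is created. The class name `FstatLookup` and the choice of fstat as the follow-up command are illustrative, and the fallback class is a stand-in so the sketch runs without P4Python.

```python
try:
    from P4 import OutputHandler  # real base class when P4Python is installed
except ImportError:
    class OutputHandler:  # stand-in (assumption) so the sketch runs without P4Python
        REPORT, HANDLED, CANCEL = 0, 1, 2


class FstatLookup(OutputHandler):
    """Runs a follow-up command per file on a separate, already open connection."""

    def __init__(self, second_connection):
        super().__init__()
        # Connected once, outside the handler, so no per-record connection cost.
        self.second = second_connection
        self.details = {}

    def outputStat(self, stat):
        path = stat["depotFile"]
        # Runs on the second connection, never on the one driving this handler.
        self.details[path] = self.second.run("fstat", path)
        return OutputHandler.HANDLED
```

In a real script both connections would be created and connected before the first command runs, and the handler would be enabled only on the first one.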
Thank you to Randy DeFauw for his help in revising this article.
For more scripting tips by Sven, visit the p4 blog.
Sven Erik Knop joined Perforce in January 2007 as the first senior consultant for Perforce in Europe. His role encompasses training and consulting services, as well as working alongside the support team in the UK. He presented his paper, Replication, at the 2011 Perforce User Conference in San Francisco, CA. In his spare time he likes to develop tools like P4Python.