ClearCase to Perforce Conversion Guide

This guide consists of two parts:

  • a description of the conceptual differences between ClearCase and Perforce
  • a guide for transferring information from ClearCase to Perforce

Conceptual Differences

Atomic changes

SUMMARY:

  • Perforce supports atomic change transactions; ClearCase doesn't.

Perforce allows the grouping of a number of add, delete, edit and branch operations as one atomic change. The files the user is working on are shown with p4 opened. When the user runs p4 submit, these files (or some user-determined subset) are submitted as one change.

Thus, if a feature or bugfix is to be added, either all of it will appear or none of it will. Other users can never see an inconsistent state.

ClearCase views versus Perforce clients

SUMMARY:

  • ClearCase modifies filesystem semantics; Perforce doesn't.
  • ClearCase requires network availability to access files; Perforce doesn't.
  • Perforce requires disk space devoted to each client; ClearCase doesn't.

ClearCase users specify "views" which define which files and which versions of files are visible along specific paths. This is termed "transparency", since the user can access files with normal pathnames (although filesystem operations such as mv and rm are not transparent, and there is administrative overhead required to set up VOBs and filesystem mounts). Views are dynamic; new data added to the central repository is visible immediately.

Dynamic views require the interception of all file system calls in order to compute what the user should see. This slows down all file system accesses, even those unrelated to ClearCase, and imposes an added load on the network. It also requires constant network availability. (There is an additional ClearCase product to allow disconnected operation.)

Perforce users typically each have their own separate "client workspace" which holds all files of interest to them. The files which appear on the client are determined by the "client view" which the user specifies. The default client view maps all depot files to files with the same name in the client workspace. Specifying a mapping such as

 //depot/projecta/...  //localclient/projecta/...

will result in only files in "projecta" appearing on the client "localclient". There is no artificial limit on the number of mapping lines a view can have. Mappings also allow translation of names between the depot and the client workspace. For example,

 //depot/a/b/c/internationalization/...  //localclient/i18n/...

maps a smaller section of the depot the user is interested in and names it something manageable.
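
The effect of a single mapping line can be sketched as a plain string substitution. This is only an illustration, not how Perforce itself evaluates views: the function name map_depot_to_client and the file fr/msgs.txt are made up for the example, and only one mapping line with a trailing "..." on each side is handled.

```shell
#!/bin/sh
# Hypothetical helper (not part of Perforce): translate a depot path to its
# client path under ONE view mapping line that ends in "..." on both sides.
# Real client views may have many lines and other wildcards.
map_depot_to_client() {
    depot_side=${1%...}      # e.g. //depot/a/b/c/internationalization/
    client_side=${2%...}     # e.g. //localclient/i18n/
    path=$3
    case $path in
        "$depot_side"*)      # path lies under the depot side of the mapping
            printf '%s\n' "$client_side${path#"$depot_side"}" ;;
        *)
            return 1 ;;      # not covered by this mapping line
    esac
}

# fr/msgs.txt is an invented file name, used only to show the translation.
map_depot_to_client '//depot/a/b/c/internationalization/...' \
                    '//localclient/i18n/...' \
                    '//depot/a/b/c/internationalization/fr/msgs.txt'
```

A path outside the mapped area makes the function return a non-zero status, which mirrors the fact that unmapped depot files simply do not appear on the client.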

Perforce clients are not modified except by explicit user action. Changes to depot files which are mapped to a client view will not be updated in that client view until the user runs p4 sync on that client.

Users specify which versions of files they want using the p4 sync command. By default p4 sync will synchronize the client workspace to the latest revision of files in the depot (including deleting those whose latest version is "delete"). Users can specify specific files and specific revisions of those files, however. For example:

Command                   Result
p4 sync @42               synchronize client to state at change 42
p4 sync @awesome          synchronize client to list of files and revisions in label "awesome"
p4 sync foo#2 bar@15      update foo to reflect contents of revision 2, update bar to reflect contents of bar as of change number 15
p4 sync junk/...#none     remove all files in junk directory on client without affecting the depot

Note that files which are opened on the client (about to be added, deleted, edited or merged into) are never overwritten by p4 sync.

Perforce keeps track of the files and revisions on each client. All client workspaces must be disjoint, otherwise Perforce's records will be inaccurate. The way to share information is through the depot, not by "sharing" local client files.
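
When laying out client workspaces for a team, a quick textual check can catch two client roots that overlap. The following is a minimal sketch under stated assumptions: roots_overlap is a made-up name, both paths are taken to be absolute and already canonical (no symbolic links, no ".." components), and the example directories are invented.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if two client root
# directories overlap, i.e. they are the same directory or one lies inside
# the other. Purely textual; assumes absolute, canonical paths.
roots_overlap() {
    a=${1%/}/                # normalise to exactly one trailing slash
    b=${2%/}/
    case $a in "$b"*) return 0 ;; esac   # a is b, or a is inside b
    case $b in "$a"*) return 0 ;; esac   # b is inside a
    return 1
}

# Invented example roots.
if roots_overlap /home/jane/work /home/jane/work/projecta; then
    echo "client roots overlap"
fi
```

The trailing slash matters: without it, /home/jane/work and /home/jane/work2 would be reported as overlapping even though they are separate directories.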

Special naming conventions

SUMMARY:

  • ClearCase uses "extended pathnames" to access specific revisions.
  • Perforce allows the specification of files using either "client syntax" or "depot syntax".
  • Both systems disallow some filenames.

ClearCase provides an "extended namespace" to allow viewing of files other than those defined by the current view. The symbol used to denote extended names (by default @@) is not allowed at the end of filenames.

In Perforce, files may be specified in either the "client" syntax or the "depot" syntax. The former is the name used to refer to the file on the local operating system (e.g. /usr/jane/foo or ../foo or c:\dick\foo); the latter is the canonical name of the file stored in the depot (e.g. //depot/foo). Note that the client view determines the mapping between the two namespaces. Any Perforce command which takes a file argument can take either client or depot syntax.

Perforce also defines the special wildcards * (matches any string within a single directory level, so it never matches a "/") and ... (matches any string, including "/"). The latter is most often used to specify whole directory trees. For example, p4 files //depot/main/... will list all files whose names start with //depot/main/ and hence are in the //depot/main directory tree. The command p4 files //.../blast.c will list all files whose names end with "/blast.c" and hence will show which directories contain a file named "blast.c". The command p4 files //...blast.c will list those files too, but also files such as "//depot/main/whatablast.c".
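
The difference between the two wildcards can be tried out without a server by translating a pattern into a regular expression and filtering a list of depot paths. This is a sketch for experimentation only, not Perforce's own matcher: p4_pattern_to_regex and p4_match are invented names, and the translation relies on "@" never occurring in a depot path so that it can serve as a temporary placeholder.

```shell
#!/bin/sh
# Hypothetical helpers (not Perforce code): emulate the two wildcards with
# an extended regular expression.
#   ...  becomes  .*      (any string, "/" included)
#   *    becomes  [^/]*   (any string without "/")
p4_pattern_to_regex() {
    printf '%s\n' "$1" | sed \
        -e 's/\.\.\./@/g' \
        -e 's/[].[^$+?(){}|]/\\&/g' \
        -e 's|\*|[^/]*|g' \
        -e 's/@/.*/g'
    # 1: park "..." as "@"    2: escape regex metacharacters
    # 3: translate "*"        4: translate the parked "..."
}

# Read depot paths on stdin, print those matching the pattern in $1.
p4_match() {
    grep -E "^$(p4_pattern_to_regex "$1")\$"
}

# The list below is invented for the example.
printf '%s\n' //depot/main/blast.c //depot/main/whatablast.c \
              //depot/v3/src/blast.c |
    p4_match '//.../blast.c'
```

With the sample list, //.../blast.c selects the two files actually named blast.c, whereas //...blast.c would also select whatablast.c.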

Perforce users can use the p4 print command to quickly view the contents of a file without having to map the file to the client workspace. For example, p4 print -q //depot/foo#2 will print revision 2 of the file "//depot/foo" without a header. p4 print //depot/projectb/... will print all files in "//depot/projectb" with a header line for each file.

Wildcards and the symbols used for specifying revisions (#) or change numbers or labels (@) are not allowed in filenames in Perforce.
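
Before copying files out of ClearCase, it may be worth scanning the tree for names that use these reserved symbols. A minimal sketch, assuming a POSIX find and grep; find_reserved_names is a made-up name, and the set of symbols is taken from the paragraph above rather than from any Perforce release note.

```shell
#!/bin/sh
# Hypothetical pre-flight check: print every path under directory $1 whose
# name contains @, #, * or "...", the symbols described above as reserved.
# Prints nothing (and returns non-zero) when no such name exists.
find_reserved_names() {
    find "$1" -print | grep -E '[@#*]|\.\.\.'
}

# Usage, from the top of the copied tree:
#   find_reserved_names .
```

Any path it prints would need to be renamed before being added to Perforce.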

Versioned directories versus separate files

SUMMARY:

  • ClearCase "versions" directories; Perforce doesn't.
  • Both systems support the ability to see files in their correct locations at any point in time.

In ClearCase, adding a file requires checking out the directory first. Files are moved with the cleartool mv command. The version of the directory selected by the view determines where the file will exist. The movement of a file is not immediately obvious from user output (for example, cleartool lshistory shows the file as existing in its current location for all revisions, even if it was moved).

In Perforce, files are uniquely specified by a path in depot syntax. Files cannot be moved; rather, they are copied and the original is deleted: run p4 integrate from to, then p4 delete from, and at some point p4 submit. The p4 sync command will ensure that files which have been deleted in the depot are deleted on the client. File histories in Perforce clearly show which versions of a file have been integrated into another file. An example may make this clear:

# echo 'hi' >hi

# p4 add hi
//depot/a/hi#1 - opened for add

# p4 submit

Change 1 created with 1 open file(s).
Submitting change 1.
Locking 1 files ...
add //depot/a/hi#1
Change 1 submitted.

# p4 edit hi
//depot/a/hi#1 - opened for edit

# echo 'Hi!' >hi

# p4 submit

Change 2 created with 1 open file(s).
Submitting change 2.
Locking 1 files ...
edit //depot/a/hi#2
Change 2 submitted.

# p4 integ hi ../b/hi
//depot/b/hi#1 - branch/sync from //depot/a/hi#1,#2

# p4 delete hi
//depot/a/hi#2 - opened for delete

# p4 submit
Change 3 created with 2 open file(s).
Submitting change 3.
Locking 2 files ...
delete //depot/a/hi#3
branch //depot/b/hi#1
Change 3 submitted.

# p4 changes
Change 3 on 1998/02/15 by fish@shark 'moved hi from a to b '
Change 2 on 1998/02/15 by fish@shark 'More emphatic greeting! '
Change 1 on 1998/02/15 by fish@shark 'Greetings '

# p4 describe 3
Change 3 by fish@shark on 1998/02/15 12:09:23

 moved hi from a to b

Affected files ...

... //depot/a/hi#3 delete
... //depot/b/hi#1 branch

Differences ...

# p4 filelog //depot/b/hi
//depot/b/hi
... #1 change 3 branch on 1998/02/15 by fish@shark 'moved hi from a to b '
... ... branch from //depot/a/hi#1,#2

# p4 filelog //depot/a/hi
//depot/a/hi
... #3 change 3 delete on 1998/02/15 by fish@shark 'moved hi from a to b '
... #2 change 2 edit on 1998/02/15 by fish@shark 'More emphatic greeting! '
... ... branch into //depot/b/hi#1
... #1 change 1 add on 1998/02/15 by fish@shark 'Greetings '

# p4 sync @2
//depot/a/hi#2 - added as /home/fish/a/hi
//depot/b/hi#1 - deleted as /home/fish/b/hi

# p4 sync
//depot/a/hi#2 - deleted as /home/fish/a/hi
//depot/b/hi#1 - added as /home/fish/b/hi

Branching

SUMMARY:

  • ClearCase associates a revision tree with every file and directory; Perforce associates a simple linear list of revisions with every file.
  • ClearCase uses merge "hyperlinks" to store merge information; Perforce stores "integration records".
  • ClearCase creates branches for you in a piecemeal fashion; in Perforce the branching is done by the user once, up front.

Branches in ClearCase follow the traditional model of having a tree of versions for every file. As well, ClearCase "versions" directories, so there is a tree of versions for every directory. Implicit creation of branches is encouraged through view configuration specifications. VOB-extended pathnames include version information which specifies the revision of each element along the path. For example,

/vobs/ dbtools/ .@@/ main/ spm_bldmdl/ 1/ dbcore/ main/ spm_bldmdl/ 8/ factory/ main/ 2/ DB/ main/ 1/ ctbasic/ main/ CHECKEDOUT.41495/ mmframei.cpp@@/ main/ 1

(The example has spaces after slashes so that it doesn't force your browser to display obnoxiously wide paragraphs!)

Careful examination of this path reveals that the file is really on the "spm_bldmdl" branch. Files which exist along paths which have not been branched may be on this branch as well, though: the config spec is usually "if there's a branch, follow it, otherwise give the latest version on main".

ClearCase advocates a model in which the main line is used for releases, not for development. Development requires creating a branch, then merging code back into the main line.

ClearCase uses merge "hyperlinks" to denote which versions have been merged between branches.

Perforce uses a simple integer to denote a version of a file. Branching is done by simply copying the file to a file with another name. Perforce does this quickly and efficiently, by doing a "lazy copy"; the copy doesn't actually exist in the depot initially, rather the metadata stores the relationship between the copied file and its source.

The naming convention is up to the user, but one typical convention is to have the first pathname component after "//depot" denote the branch name. For example, if the main line of development is in //depot/main and it is time to create a branch to release version 3, the user would simply run p4 integ //depot/main/... //depot/v3/... and then p4 submit. Every file in the directory //depot/main will then have a corresponding file in //depot/v3. Users working on bug fixes for the release would simply look at files in //depot/v3; those continuing main development would work on files in //depot/main. Bugfixes may be integrated from v3 back into main, and new development may be integrated from main into v3.

Note that the model Perforce advocates is different from the one ClearCase advocates. Perforce suggests calling your main line of development "main" and creating branches (copies) for each release. There is no need to use labels.

The output of p4 filelog in the example given above for moving a file illustrates the "integration records" which are kept. For more information on branching see the papers on Inter-File Branching and Software Life-Cycle Modelling.

Labels

SUMMARY:

  • Use of labels is important in ClearCase; in Perforce labels are not usually needed.
  • Both systems support labels.

Both systems support the notion of an object which identifies a specific set of files and revisions. In Perforce, this is stored as a list of (file#revision) entries, and is updated with the p4 labelsync command. Note that labels are often used to simply denote a snapshot in time for a particular easily-specified set of files; this is inherently easy to do in Perforce without using a label, due to Perforce's use of atomic change transactions and file naming syntax. For example, p4 sync //depot/projecta/...@42 retrieves the state of all the files in //depot/projecta as of change 42.

Symbolic links

SUMMARY:

  • Perforce does not support symbolic links on Windows/NT; ClearCase does.
  • Perforce maps depot files to client files as a 1:1 mapping.

ClearCase users may encounter problems moving to Perforce because of the different way Perforce handles symbolic links. Since ClearCase intercepts file system calls it can implement symbolic links with UNIX semantics on Windows/NT. Perforce uses the native operating system, so it cannot do this.

It is also common for ClearCase users to use symbolic links to point to common code. These symbolic links can be transferred to Perforce successfully if they are added as type "symlink", and the client mapping does not render the symlink pointing to an incorrect location.
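
One way to pick out the symbolic links in a copied tree and open them for add with an explicit file type is sketched below. It is an illustration under assumptions: add_symlinks is a made-up name, the tree is assumed to sit inside the client workspace, and P4PORT and P4CLIENT are assumed to be set already.

```shell
#!/bin/sh
# Hypothetical helper: open every symbolic link under directory $1 for add
# as Perforce file type "symlink".
add_symlinks() {
    # find prints one link per line; "p4 -x -" reads its file arguments
    # from standard input, and "add -t symlink" sets the file type.
    find "$1" -type l -print | p4 -x - add -t symlink
}

# Usage, from the client root:
#   add_symlinks .
#   p4 submit
```

Regular files are untouched by this function, so they can be added separately in the usual way.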

If it is necessary for the files to appear whole rather than as symbolic links, the best solution is to use Perforce's branching mechanism. This is not transparent, but in most cases common files should be "released" to the rest of the company anyway.

For example, if all files in //depot/main/common are supposed to appear in //depot/main/projecta and //depot/main/projectb, then when a change is made to a common file the following commands should be run:

p4 integ //depot/main/common/... //depot/main/projecta/...
p4 integ //depot/main/common/... //depot/main/projectb/...
p4 resolve -am
p4 submit

It is recommended that this be put in a script.
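
Such a script might look like the sketch below. The function name propagate_common is made up, error handling is left out, and p4 submit will still prompt for a change description as usual; the commands themselves are the ones listed above.

```shell
#!/bin/sh
# Hypothetical wrapper around the commands above: integrate the shared
# directory ($1) into each project directory that carries a copy of it
# (remaining arguments), then resolve and submit everything as one change.
propagate_common() {
    common=$1
    shift
    for project in "$@"; do
        p4 integ "$common/..." "$project/..."
    done
    p4 resolve -am       # accept automatic merges where no conflict exists
    p4 submit
}

# Usage with the directories from the example above:
#   propagate_common //depot/main/common \
#                    //depot/main/projecta //depot/main/projectb
```

Taking the project list as arguments means a new project only needs to be added to the call, not to the body of the script.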

Multiple VOBs, "ClearCase Multisite" product, "ClearMake" and wink-in

SUMMARY:

  • Many ClearCase concepts do not apply.

All of these concepts are irrelevant to Perforce. Everything in Perforce is typically stored in one depot handled by one server (although Perforce does support read-only "remote depots"). Builds are a separate issue; in Perforce it is simply a matter of doing the appropriate p4 sync command and running your favourite build tool. Perforce hosts a freely available build tool called Jam which has advantages over traditional make utilities, but Jam is independent of Perforce.

Transferring information from ClearCase to Perforce

Difficulty with complete conversion utility

Preservation of existing data would be simple if there were a conversion utility which transferred all data in a consistent manner, taking into account the different concepts used by ClearCase and Perforce. Unfortunately, ClearCase's complex versioning and branching of directories and its poor event record output make such a converter difficult.

Furthermore, it seems likely that such a utility would be of little practical value because of the speed at which it would run. A prototype converter which did not handle directory versioning ran at the blistering rate of 8 revisions per minute, which works out to about one to two weeks to convert a mid-sized site. The main limitation seems to be the time it takes to extract revisions from ClearCase.
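
To put the figure in perspective: 8 revisions per minute is 11,520 revisions per day of round-the-clock running, so one to two weeks corresponds to roughly 80,000 to 160,000 revisions. The arithmetic can be repeated for your own site with the sketch below; conversion_days is a made-up name and the count of 100000 revisions is an invented input, not a measurement.

```shell
#!/bin/sh
# Hypothetical estimator: whole days needed to convert $1 revisions at
# $2 revisions per minute, running 24 hours a day (rounded up).
conversion_days() {
    per_day=$(( $2 * 60 * 24 ))
    echo $(( ($1 + per_day - 1) / per_day ))
}

conversion_days 100000 8     # prints 9
```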

The simple snapshot approach

The other extreme to preserving everything is to simply start using Perforce with a snapshot of the existing data. Here are the steps required:

  1. Decide on the desired organization of files, including branching issues. Recall that branches are simply different files in Perforce; hence branch names are part of the file namespace. Typically, files should be branched as a group at the highest level in the tree which is in common. Typically, as well, there is a "main" development line which is branched into all others. Thus, you could have //depot/main contain all files, with each branch x being under //depot/x. Or you could have //depot/projecta/main branched into //depot/projecta/v3.2 and //depot/projecta/v3.3, plus //depot/projectb/main branched into //depot/projectb/v2.0, and so on.
  2. Decide on a location for the Perforce server: choose a disk partition with lots of space, and preferably run the server on a machine which is not running ClearCase (for performance reasons). Then, most likely as root, run p4d -r /usr/perforce. If you want to choose a port other than the default, specify it with -p.
  3. Set your login script to set P4PORT to the machine:port# the server is listening on, and set P4CLIENT to a reasonable client name (e.g. your user name).
  4. Decide on a location for your client, say /home/me/work, make that your current working directory and run p4 client, accepting the default.
  5. Copy all desired files from ClearCase to /home/me/work, doing whatever reorganization is desired. If you are copying the "main line" of development, for example, you would likely do
    mkdir main
    cleartool setview (as appropriate)
    cp -R /vobs main
    
  6. Add all of these files to Perforce with find . -type f -print | p4 -x - add followed by p4 submit. Now the Perforce depot contains all the files, submitted as one change.
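
Steps 5 and 6 above can be gathered into one function, sketched here under stated assumptions: snapshot_to_perforce is a made-up name, the server from step 2 is running, P4PORT and P4CLIENT are set, the client from step 4 exists, and the ClearCase view has already been set.

```shell
#!/bin/sh
# Hypothetical wrapper for steps 5 and 6: copy a tree into the client
# workspace, open every regular file for add, and submit.
#   $1  client root            (e.g. /home/me/work)
#   $2  tree to copy           (e.g. /vobs, as seen through the view)
#   $3  development line name  (e.g. main)
# The body runs in a subshell, so the caller's directory is unchanged.
snapshot_to_perforce() (
    mkdir -p "$1/$3" &&
    cp -R "$2" "$1/$3" &&
    cd "$1" &&
    find . -type f -print | p4 -x - add &&
    p4 submit
)

# Usage:
#   snapshot_to_perforce /home/me/work /vobs main
```

As with the manual steps, cp -R places the copied tree underneath the line directory, so any reorganization of the files should happen before the add.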

Note that the simple snapshot approach does not result in any merge information; future integrations will at first be a two-way merge. This problem can be minimized by merging as many branches as possible in ClearCase before taking the snapshot.

The middle ground

The first merge after the "simple snapshot approach" will be difficult. To avoid this problem, one could essentially do multiple snapshots so that the merge relationship is correct (well, mostly correct). Here is the procedure:

  1. Determine the oldest common ancestor of all branches you will be importing. Perform the simple snapshot approach as above, putting the code in //depot/main.
  2. For each branch x do p4 integ //depot/main/... //depot/x/...
  3. Run p4 submit. Now the Perforce depot will contain a copy of the snapshot for all branches. Perforce now knows that each branch was copied from the particular version that exists in //depot/main.
  4. Delete all files on the client with rm -rf. This is *not* something you would do normally when using Perforce!
  5. Copy the latest version of each file from ClearCase to the appropriate location.
  6. Follow the instructions in Perforce tech note 2 to construct a changelist which will add, delete, and edit the appropriate files, then run p4 submit.
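
The decision of which files to add, delete, and edit in the last step comes down to comparing two file lists. The sketch below is not the procedure from the tech note; it is one possible way to work out the three sets, assuming two plain-text lists with one path per line, both sorted the same way (for example with LC_ALL=C sort). plan_changes and the list file names are made up.

```shell
#!/bin/sh
# Hypothetical planner: compare the list of files Perforce already has for
# a branch ($1) with the list just copied out of ClearCase ($2), and print
# the action each file needs. Both lists must be sorted identically.
plan_changes() {
    old=$1
    new=$2
    comm -13 "$old" "$new" | sed 's/^/add /'      # only in the new list
    comm -23 "$old" "$new" | sed 's/^/delete /'   # only in the old list
    comm -12 "$old" "$new" | sed 's/^/edit /'     # present in both lists
}

# Usage (old.list and new.list are invented file names):
#   plan_changes old.list new.list
```

Files present in both lists are all marked for edit, whether or not their contents changed; narrowing that down is left to the procedure you follow for the changelist.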

A simpler alternative to the above is to replace the first step with just getting the latest version of the main line. This will result in merge information which is definitely wrong, but it is preferable to the simple snapshot approach which would require a two-way merge.

Another alternative is to abandon the notion of keeping integration records and simply transfer snapshots based on time (every day or every week) or labels. The procedure for each snapshot would be:

  1. Set the ClearCase view according to the criteria (label or date).
  2. Create a file called "filelist" which lists all files, and add them to Perforce with p4 -x filelist add.
  3. Run p4 submit with a change description saying "snapshot as of (date/label)".
  4. If it's a label, run p4 label and p4 labelsync to create the same label in Perforce.
  5. Either run p4 -x filelist delete to mark all files as deleted and then run p4 submit, or modify the "add files to Perforce" step so that it adds each file if it doesn't exist and edits it if it does.

This approach eliminates the filename mapping problem, and may solve the performance problem because it transfers only some of the history. Anyone wishing to pursue this approach may wish to contact us; we will be glad to assist.

The phased approach

Due to the different branching model suggested by ClearCase and the need to become accustomed to a different system, it may be best to take a phased approach to conversion. The best approach would seem to be:

  1. Take a simple snapshot of the main line only. Due to ClearCase's model this will probably currently be a branch called "version n" where n is the next version to be released.
  2. Switch the main line developers over to using Perforce. When it is time to release "version n", create a branch for version n.
  3. Code maintainers working on version n will now switch to Perforce. The process continues until all supported releases are in Perforce.

The advantage of this approach is that the Perforce history is very clean, the transition is more manageable, and experience with Perforce can be passed on from developer to developer.

The disadvantage, of course, is that the interim solution involves two CM systems, which may introduce short term complications.

Other conversion issues

Multiple names for one directory

ClearCase installations will typically have a myriad of NFS mounts and symbolic links. This can result in the same directory having multiple names. Perforce maps files between client syntax and depot syntax based on the current working directory. To ensure that Perforce uses the canonical name for the directory, p4 can be aliased to p4 -d `pwd`.
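
A shell function can stand in for the alias, as in the following sketch. Two things here are assumptions rather than part of the recipe above: the wrapper name p4here is made up, and pwd -P is used because in many current shells a bare pwd reports the logical path, symbolic links included. Whether the physical path is the name your client root uses should be checked on your own site.

```shell
#!/bin/sh
# Hypothetical wrapper: always hand p4 the physical working directory.
# "p4 -d dir" tells p4 to treat dir as the current directory;
# "pwd -P" prints the directory with symbolic links resolved.
p4here() {
    p4 -d "$(pwd -P)" "$@"
}

# Usage:
#   p4here files ...
```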

Performance problems

ClearCase slows down all filesystem accesses. Thus it is best for the Perforce server to be run on a machine which does not have ClearCase installed. Users who will never need to access ClearCase should have ClearCase de-installed from their machines.