Requirements_for_Simple_SVN_Synch

ClearCase to Subversion bridge
Handling Branch Merged Directory Changes
  Reviewing the Subversion and ClearCase Change Models
    The ClearCase Directory Model
    How Various ClearCase Directory Changes Work
    The Subversion Directory Model
    Interpreting the SVN Log
Design
  Quality Level
  Wish List
  ClearCase to SVN
  Caveats, Assumptions, and Release Notes
    Caveats
  SVN to ClearCase: initial migration
    How to do a directory-only merge
  SVN to ClearCase: synchronization updates
  ClearCase to SVN
Requirements for SVN/ClearCase Synchronization
Handling Branch Merged Directory Changes
This section recaps much of the discussion among Mike Pilato, Bob Jenkins, and Bill Rassieur
on how to handle merged directory changes when importing from Subversion into ClearCase.
The algorithm developed so far by Bill is able to parse and interpret the Subversion log for
directory changes made directly on the sync branch (the source for changes to be migrated into
ClearCase). The current algorithm, as of rev 1.76 of the script, fails in a number of cases when
the log of the sync branch contains directory changes that were merged from a branch that is a
child of the sync branch. (Note: the term child branch means that the branch originated in
Subversion as a copy of the sync branch.)
Reviewing the Subversion and ClearCase Change Models
This section reviews a few of the key aspects of how Subversion and ClearCase model changes
to directories and files. You will see the key difference is that a ClearCase version of a directory
lists the versioned files and directories it contains, whereas Subversion also records the specific
versions of each of the elements the directory contains. We will also discuss the implications of
this difference.
The ClearCase Directory Model
In this section you can read a very brief description of what data ClearCase stores when it
versions directories. Then for each operation like Adding a new element, Deleting, or Moving an
element, we say what the directory changes look like under this model.
In ClearCase, files and directories are both treated as versioned entities. The general term for a
versioned entity is element. Every element in ClearCase, whether file or directory, has an
associated version tree. The version tree indicates the entire change history of that element. The
version tree consists of branches with versions (along with a representation of the contents of
each version).
[Editor’s note: be nice to insert a diagram of a version tree at this point.]
Each element has an OID (object id) that uniquely identifies it among elements stored in the
ClearCase repository.
Note that elements conspicuously lack a name! That is not one of their intrinsic attributes.
How, then, do files (and directories) get names?
To understand that, we need to consider what ClearCase directory elements actually contain as
their contents: a list of names each associated with an OID pointing to an element in the
repository.
A key distinction with Subversion is that ClearCase directory versions contain pointers to
elements (i.e., whole version trees), and NOT to specific versions of each of the elements the
directory version contains.
ClearCase uses another mechanism, known as a Configuration Specification, a set of rules, to
determine which version of an element a user workspace expresses. For more information, see
any of the Rational reference books on ClearCase.
A ClearCase directory version can also associate a symbolic link with a name. This, in effect,
makes the ClearCase file-system model look very much like the Unix file system, with the added
concept that a given file (or directory) can be versioned. As you can see, both hard links
(which are just OID values) and symlinks can be found in a given version of a directory.
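The model described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how ClearCase stores anything internally: the Element class, the OID strings, the resolve function, and the latest_on_main rule are all made-up names, and the rule is a toy stand-in for a config spec line such as 'element * /main/LATEST'.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A versioned entity. It has an OID and a version tree, but no name."""
    oid: str
    kind: str                                      # "file" or "directory"
    versions: dict = field(default_factory=dict)   # version id -> contents


# The repository indexes elements by OID (the OID strings are made up).
repo = {
    "oid-1": Element("oid-1", "file",
                     {"/main/1": "hello", "/main/2": "hello, world"}),
}

# One version of a directory: names mapped to element OIDs, not to versions.
src_dir_version = {"greeting.txt": "oid-1"}


def resolve(dir_version, name, select_version):
    """Look up a name, then let a rule choose which version is visible.

    `select_version` stands in for a config spec: it receives the element
    and returns the id of the version the view should show.
    """
    element = repo[dir_version[name]]
    return element.versions[select_version(element)]


def latest_on_main(element):
    """Toy rule: pick the highest-numbered version on /main."""
    return max(element.versions, key=lambda v: int(v.rsplit("/", 1)[1]))


print(resolve(src_dir_version, "greeting.txt", latest_on_main))
```

The point of the sketch is that the directory version never mentions /main/1 or /main/2; choosing between them is entirely the job of the rule passed in.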
How Various ClearCase Directory Changes Work
The last section talked about what data ClearCase stores, especially with regard to versions of
Directory elements. This section builds on that by talking about what various change operations
do in terms of this model.
This section is written under the simplifying assumption that we are concerned only with a single
VOB database. Consideration of the ‘relocate’ command is not treated here as it would
complicate the discussion quite a bit and is out of scope for the current Bridge project.
Adding files and directories: To Add a file or directory under a given parent directory, the
parent is checked out, the new element created, and the parent checked in. The command that
creates the new element, starting its version tree, also inserts a name and OID pointer into the
checked out version of the parent directory. Once the parent directory is checked in, a diff with
its predecessor version clearly shows the addition of the new name. This process works the same
whether the new element being added is a file or a directory.
Deleting files or directories: To make a file go away, or not appear any more under a directory,
that directory is checked out, and then an ‘rmname’ command executed against the name that is
no longer required. This removes the name from the list of the checked out version of the
directory. When the parent directory is checked back in, the change is recorded in the repository.
Moving or Renaming elements: The source parent directory and destination parent directory
are checked out. A ‘move’ command is invoked. This results in a name removal in the source
parent directory and a name insertion in the destination directory. The entire history of the
moved element is retained at the new location. This is because the OID references in the
before and after directory entries point to the same element, hence the same version tree.
Copying elements: ClearCase has no built-in command to perform copies per se. There are
commands that provide for manually inserting hard links (names plus OID references) into
directory versions at will. This can be used to implement copy functionality. Scheible Rassieur
always advises clients to avoid using this feature of ClearCase as it results in “instant change
propagation”. The notion of instant change is nearly always anathema to best practices of
Change Management, in which the desire is to manage and control change.
As mentioned in an earlier section, ClearCase also supports symbolic links. These might be
similarly employed to provide copy semantics if one wished to do so. Scheible Rassieur does
recommend limited use of symbolic links, although not as a means of implementing copy
semantics, but for other purposes which we will not enter into here.
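As a rough illustration of the add, delete, and move operations described in this section, the sketch below treats a directory version as a plain mapping of names to OIDs. The function names add_name, rmname, move, and diff, and the OID string, are hypothetical; checkout and checkin are reduced to "build a new mapping", and the move helper assumes the source and destination are different directories.

```python
def add_name(dir_version, name, oid):
    """Add: the checked-out parent gains a name pointing at an element."""
    if name in dir_version:
        raise ValueError(f"{name!r} already exists")
    return {**dir_version, name: oid}


def rmname(dir_version, name):
    """Delete: the name goes away; the element and its history remain."""
    return {n: o for n, o in dir_version.items() if n != name}


def move(src_version, dst_version, old_name, new_name):
    """Move: remove the name from the source directory and insert the
    same OID under the new name in the destination directory."""
    oid = src_version[old_name]
    return rmname(src_version, old_name), add_name(dst_version, new_name, oid)


def diff(before, after):
    """Compare two versions of one directory by name."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    return added, removed


src_v1 = add_name({}, "util.c", "oid-42")     # parent checked out, name added
src_v2, lib_v2 = move(src_v1, {}, "util.c", "helpers.c")

print(diff(src_v1, src_v2))    # names added/removed in the source directory
print(lib_v2["helpers.c"])     # same OID as before, so same version tree
```

Because the destination entry carries the same OID, nothing about the element's version tree changes during a move; only two directory versions do.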
The Subversion Directory Model
In this section you can read a brief description of what data Subversion is storing as it versions
directories. A description of what happens for copies, moves, adds, and deletes is then given in
terms of this model, with an eye towards comparing with what ClearCase is doing.
Note: this section as it currently stands was drafted by Bill Rassieur, whose knowledge of
Subversion is certainly surpassed by that of Mike Pilato or Robert Jenkins.
The Subversion directory model is substantially the same as the ClearCase model in that
directories are indeed versioned objects. This gives Subversion a great deal of power to track
directory changes, a power that a number of other popular versioning tools lack. The contrast
between how Subversion and ClearCase treat directory objects is that in Subversion the
references to children that a directory stores actually point to versions of the children, not
to the children as a whole. To repeat a statement made earlier in this document:
A key distinction with Subversion is that ClearCase directory versions contain pointers to
elements (i.e., whole version trees), and NOT to specific versions of each of the elements the
directory version contains.
One consequence of this structure is that whenever a file somewhere in the repository tree is
checked in, each of its ancestor directories, all the way back to the root, needs to be ‘bumped
up’ a version. One can infer from this that the root directory in a Subversion repository must
have the maximum number of internal revisions of any versioned object in the system. In fact, it
may be the same as the number of commits in the system [Mike is that right, or approximately
right?]
Adds: When a new file is added to the system, a new versioned object is created for it. This
further results in a new internal revision of the parent and all other ancestor directories of the
new file.
Deletes: When a file is deleted, its old history is retained (just like in ClearCase). A new
internal revision is created for the parent directory and any other ancestor directories of the
deleted file.
Copies: A copy works like the Add of a new object, with one exception. The file at the location
that is the destination of the copy has an internal pointer, a predecessor pointer, linking back
to the source of the copy. In this way, Subversion is able to represent the history of the
versioned object.
Note that a copy operation is essential to how Subversion treats branching. In Subversion, unlike
ClearCase, the file-system and branching are intertwined: to copy is to branch. In ClearCase the
history of a versioned object is associated with that object. Where that object is located in the
ClearCase representation of the file-system is entirely unrelated: it is stored in the history of
directory objects.
Moves and Renames: These operations (which are really the same thing) are implemented as a
Copy from source to destination followed by a Delete of the source.
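Since a move shows up as a copy plus a delete, a reader of the change list has to pair the two back up. The sketch below shows one way this might be done for the path changes of a single revision. PathChange and classify are hypothetical names, the input is assumed to be already extracted from the log, and the sketch does not handle harder cases such as one source copied to several destinations or changes spread across revisions.

```python
from typing import NamedTuple, Optional


class PathChange(NamedTuple):
    action: str                      # "A" (add) or "D" (delete)
    path: str
    copyfrom: Optional[str] = None   # source path when the add is a copy


def classify(changes):
    """Split one revision's path changes into adds, deletes, copies, moves."""
    deleted = {c.path for c in changes if c.action == "D"}
    adds, copies, moves = [], [], []
    moved_sources = set()
    for c in changes:
        if c.action != "A":
            continue
        if c.copyfrom is None:
            adds.append(c.path)                   # brand new object
        elif c.copyfrom in deleted:
            moves.append((c.copyfrom, c.path))    # copy + delete of the source
            moved_sources.add(c.copyfrom)
        else:
            copies.append((c.copyfrom, c.path))   # plain copy (or a branch)
    deletes = sorted(deleted - moved_sources)
    return adds, deletes, copies, moves


example = [
    PathChange("A", "/trunk/src/new.c"),
    PathChange("A", "/trunk/lib/util.c", "/trunk/src/util.c"),
    PathChange("D", "/trunk/src/util.c"),
    PathChange("D", "/trunk/src/old.c"),
]
adds, deletes, copies, moves = classify(example)
print(moves)
```

A copy whose source survives is left as a copy; on the ClearCase side that case has no direct equivalent, so it would most likely have to be played back as an add.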
Interpreting the SVN Log
To work effectively, the ClearCase to SVN source Bridge must correctly interpret the SVN log so
that it can derive and roll up the directory changes that occur along a sync branch between two
given revisions. We are currently having problems in this area. This section highlights some of
the theory of operation relative to these problems.
[Editor’s note… add more text to this section]
Design
Quality Level
The quality level of this design has been requested as “down and dirty”; something that arguably
works, but may have caveats that a refined, finished product would not have.
Wish List
Following is a list of potential features for future consideration:

• address all the caveats shown below
• provide nice error messages to users, and in no case have a Python stack traceback (if it can
  at all be helped)
• support Linux/Unix environment
• support snapshot views
• the shutil copy and test stuff could say, hey, you've forgotten to update? Log it as an error.
  Also, at the end, we can recommend NOT checking in anything if an error has occurred.
• as an added feature, if there are no errors, then check in and label everything.
• as an added feature, have clearsvn check for non-versioned (aka private) elements before
  running, as well as checking the configuration of both spaces. (CC if possible.. maybe not).
• be nice to use a .ini file to store parameters, as well as hold the record of updates
• have a log file, and a directory where update logs are maintained.
ClearCase to SVN
Pre-req: the user has set up
-- an SVN client space and two ClearCase views, a BEFORE and an AFTER.
-- the SVN workspace should exactly match the BEFORE view configuration
(we need to also have a way for the initial import into SVN - I'll get to
that)
-- The AFTER can represent some point later.
NOTE: It behooves the user to have some type of sane configuration rule for
their views, like, use UCM baselines, or base ClearCase labels, or
timestamps. If not, they risk losing the ability to effectively synchronize.
When clearsvn is invoked for CC->SVN mode it does the following:
1. Gets command line args. (duh)
2. Does a tree walk comparing the BEFORE view and the SVN workspace.
-- any differences are noted as errors.
3. Does a tree walk of the BEFORE and AFTER views to get adds, deletes, and
modifies.
4. For each ADD or DELETE, looks up the object id's and resolves any matches
as moves.
5. Performs the ADD, DELETE, and MOVE operations in SVN, reporting any errors
(remember, you can do stuff in ClearCase that you can't play back in SVN...)
6. Performs the modify updates on changed files, reporting any errors (not
that we expect any).
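Steps 3 and 4 above might look roughly like the following. This is a sketch under several assumptions: only files are tracked (directories themselves are not), the excluded directory names are taken from elsewhere in this document, and the OID lookup is passed in as a pair of callables (oid_before, oid_after) because how clearsvn actually obtains object IDs from ClearCase is not shown here. All function names are hypothetical.

```python
import filecmp
import os


def list_files(root, exclude=(".svn", "lost+found")):
    """Return the set of relative file paths under root, skipping excluded dirs."""
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into excluded directories.
        dirnames[:] = [d for d in dirnames if d not in exclude]
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def compare_trees(before, after):
    """Step 3: classify files as added, deleted, or modified."""
    old, new = list_files(before), list_files(after)
    adds = sorted(new - old)
    deletes = sorted(old - new)
    modifies = sorted(
        p for p in old & new
        if not filecmp.cmp(os.path.join(before, p), os.path.join(after, p),
                           shallow=False)
    )
    return adds, deletes, modifies


def resolve_moves(adds, deletes, oid_before, oid_after):
    """Step 4: an add and a delete that share an OID are really one move."""
    by_oid = {oid_before(path): path for path in deletes}
    moves, real_adds = [], []
    for path in adds:
        source = by_oid.pop(oid_after(path), None)
        if source is None:
            real_adds.append(path)
        else:
            moves.append((source, path))
    return real_adds, sorted(by_oid.values()), moves
```

Comparing file contents with shallow=False is slower than comparing timestamps, but it avoids false "modified" reports when the two views differ only in modification times.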
Wish List Items:
-- At step 2, you could also imagine having clearsvn check for checkouts
and/or view private files, of which, there should be none.
Initial setup.
This should be pretty easy, actually. We just provide a mode where there is
no BEFORE directory, and clearsvn:
1. Gets command line args.
2. Does a tree walk comparing the AFTER view (the only view) and the SVN
workspace.
3. Adds any missing elements. Deletes any extra elements. Updates any
changes.
In other words, in this mode, clearsvn is effective (but dangerous), like a
table saw: it will force the svn workspace to look EXACTLY like the AFTER
view.
Again, I'm not recommending we do any auto-checkin. In my experience (you can
accuse me of being all thumbs if you want :-), these kinds of updates are
fraught with peril (even with great tools to help - it's the table saw thing):
the last thing you want is to have to undo a thousand files worth of change
when your script checked it all in for you w/o review first.
However, it is not technically difficult to add that enhancement. I'm not
trying to dig my feet in. It can be done if you really want.
Caveats, Assumptions, and Release Notes
Caveats

• The -x option cannot take an empty value: you always have to specify at least one level
  of directory in SVN that is going to be truncated off the front of all paths processed from
  SVN into ClearCase. As an example, you have to do ‘-x /trunk’ or ‘-x /branches’
  (referring to what you originally populated your client workspace with, which is probably
  going to be /trunk). This precludes updating ClearCase with all paths under the repository
  root of SVN.
Some of this section is redundant. To be cleaned up later.
• if a co or other CC operation fails, we just abort; we don't do things intelligently. The user
  must unco and correct the underlying issue.
-- could be a lot more helpful here
• not parsing the svn log very intelligently:
-- if there are comments masquerading as Add and Delete operations, we get fooled
-- not parsing for the change revision, which would be very cool to do, so a repeated add/del of
the same file is not necessarily handled correctly.
• user keeps track of svn revision changes themselves
-- would be great if clearsvn.py maintained some information somewhere
• user does initial setup themselves, i.e., svn->CC is a clearfsimport
• after running update, user validates it and manually does CC checkin, or does svn commit
• it's up to the user to refrain from:
- making any changes in a receiving workspace. If they do, such changes will be interpreted as
requiring update from the sending side. (although this would be detectable, especially in the
SVN->CC case... see next item)
- having any non-versioned files in either a sending or a receiving workspace.
• clearsvn will not attempt to identify and skip non-versioned files; such files will cause
  errors. The user will have to go back and clean up any such elements.
• clearsvn is not attempting to reconcile the SVN Modify changes with what we find in the
  treewalk. We *could* do this and it would help address the case where the user accidentally
  made changes on the CC receiving workspace where they shouldn't have.
There are two modes and two major cases that have to be considered:
• Modes: SVN to ClearCase updates, and ClearCase to SVN updates.
• Cases: a first-time update (a migration), and a synchronization.
There is another case, which occurs when an update is aborted part way through. The same type
of issue, though for different reasons, occurs if the assumption of a clean destination is broken.
The assumption is that any branch in SVN or in ClearCase that is the target or destination for
an update receives changes solely through update operations. The branch never receives changes
for any other reason.
Depending on what time allows, we may or may not provide checks for non-versioned files
appearing in either the source or target trees. Such files are called view private files in
ClearCase terminology. The simplest approach is to assume (read: require) that the user never
has any non-versioned files in either tree. It may be prudent, and cost effective, simply to
check for any such files and sternly warn the user if any are found when doing a sync operation.
Another assumption we shall make is that synch updates are atomic operations. This means that
if an update operation is aborted for some reason before it has completed in its entirety, there
is a way for the user to revert the partial changes made to the target. Furthermore, we assume
the user always takes care of this, so that the synch script can be designed relying on the
notion that it has a revision number or some other means to specify the state of the source and
target branches each time it runs.
This means we are not providing a general way for the synch script to decipher and determine
the current state of a target branch and make the right set of changes to bring it into line with a
source. We are providing a specific way of doing this in which the target branch is assumed
clean and up to a certain point, and that we can easily establish that same state on the source
branch.
The script is designed under the assumption that the user has done all the work to set up:
• a properly configured ClearCase dynamic view.
• a properly configured SVN client workspace.
that are ready and usable for the synchronization script.
SVN to ClearCase: initial migration
In this case, the sync script can invoke “clearfsimport” as its central action. The assumption is
generally that the target view in ClearCase will appear empty, and that there is an SVN client
workspace that the script should use as a source basis. The revision number specifies the source
configuration. The source revision number gets recorded in a text file specially maintained in the
ClearCase target view.
How to do a directory-only merge
Here’s the overview of how to set up a UCM view in ClearCase to do the directory-only merge
into the receiving branch.
1) Configure a view (not the receiving view, but may be one configured just like it – Merge
Manager won’t permit view-to-same-view merging).
2) (optional) Update that view’s config spec to select only directory types (you can omit this
step if you don’t know how to do that 100% successfully)
3) Set up the trees in ClearCase you want to merge, and run the find merge.
4) If you did step 2, skip to step 6.
5) In the graphical list, sort by type, select all files, and delete them from the list.
6) Select all candidates and merge them in. Note their merge type should all say ‘trivial’.
7) Check in the merged directory elements.
SVN to ClearCase: synchronization updates
In this case, the sync script expects to find a special text file in the target view area that
records the revision level of the SVN source tree as of the last update. The script makes sure
the SVN source area is updated, and then parses the SVN log for the changes that have occurred
since the last update. The overall (cumulative) effect of the adds, deletes, and moves is
determined and translated into mkelem, rmname, and mv commands for cleartool. These commands
are issued.
After that, the source and destination trees are walked. In each directory, contents are
compared. At this point, there should be no directory-level differences: all file and directory
names should match up. If they do not, the differences are reported as errors. The files are
then compared. If any are different, the source version is copied over a writeable (checked out)
copy of the destination file.
As part of the update, the special text file in the ClearCase view is updated. This file is kept
under revision control and checked in after all updates have been made during each
synchronization. Finally, everything is checked in.
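Two pieces of the update described above lend themselves to a short sketch: the revision marker file and the translation of rolled-up changes into cleartool commands. The marker file name, the function names, and the ordering of the commands are assumptions made for illustration. The sketch only builds argument lists and does not run cleartool; checking out the parent directories (and the marker file itself) is left out, and directory adds, which need different handling from file adds, are not covered.

```python
from pathlib import Path

MARKER = ".svn_sync_revision"   # hypothetical name for the "special text file"


def read_last_revision(view_root):
    """Return the SVN revision recorded by the previous update."""
    return int((Path(view_root) / MARKER).read_text().strip())


def write_last_revision(view_root, revision):
    """Record the SVN revision this update brought the view up to."""
    (Path(view_root) / MARKER).write_text(f"{revision}\n")


def cleartool_commands(adds, deletes, moves):
    """Translate rolled-up directory changes into cleartool argument lists.

    -nc means "no comment"; a real script would probably pass the SVN
    revision or log message as the comment instead.
    """
    commands = [["cleartool", "mv", "-nc", src, dst] for src, dst in moves]
    commands += [["cleartool", "rmname", "-nc", path] for path in deletes]
    commands += [["cleartool", "mkelem", "-nc", path] for path in adds]
    return commands


for command in cleartool_commands(["new.c"], ["old.c"], [("a.c", "lib/a.c")]):
    print(" ".join(command))
```

Building the commands as lists rather than strings keeps paths with spaces intact if they are later handed to subprocess.run.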
Note that certain directories shall be excluded from synchronization updates. One case is that no
.svn directory from an SVN source tree will ever be moved into ClearCase.
During operation of the script, a log file is kept that tracks every command and every step that
the sync script takes. This log file is generated with a timestamp as part of its name so as to make
it easy to correlate a given log file with the event when it was run.
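A timestamped log file of this kind takes only a few lines with the standard logging module. The function name, the "clearsvn" prefix, and the file name pattern below are assumptions, not the script's actual conventions.

```python
import logging
import os
import time


def open_sync_log(log_dir, prefix="clearsvn"):
    """Create a logger that writes to <log_dir>/<prefix>-YYYYmmdd-HHMMSS.log."""
    os.makedirs(log_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    path = os.path.join(log_dir, f"{prefix}-{stamp}.log")
    logger = logging.getLogger(f"{prefix}.{stamp}")
    logger.setLevel(logging.INFO)
    if not logger.handlers:   # avoid duplicate lines if called twice in a second
        handler = logging.FileHandler(path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger, path
```

A caller would then log each command just before issuing it, for example logger.info("cleartool mv -nc a.c lib/a.c"), so that an aborted run leaves a record of how far it got.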
ClearCase to SVN
To be determined. This is contemplated to work much like the SVN to ClearCase operations
discussed above.
Requirements for SVN/ClearCase Synchronization
Summary –
In general, the requirement is to be able to reproduce the snapshot state of the source
version management system in the target version management system. That entails adds,
modifications, deletes, renames and moves of files and directories. Synchronization must be
supported in both directions: from Subversion to ClearCase and vice versa.
SVN -> ClearCase Synchronization –
1. Update the SVN working copy (focused on the development branch)
2. Need to identify the files and directories that have moved (or been renamed) via the ‘svn log’
command (e.g., I was able to get the new location/name and old location/name via the
following command line, but this would need to filter by either date or revision to get just
the changes since the last synchronization)
svn log --verbose|grep from|sed 's/ A //'|sed 's/(from //'|sed 's/)//'
3. Need to execute the ClearCase command (cleartool mv) to move the identified files and
directories in the ClearCase view (focused on a branch for the SVN work)
4. Need to export SVN (svn export)
5. Need to execute clearfsimport to update ClearCase with new files, deleted files and
modified files (per the current clearcvs.py with deleted files handled) from the SVN
export
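For step 2, one alternative to grepping the text log is to ask Subversion for XML (svn log --xml --verbose, with -r to restrict the revision range) and read the copy-from information from the path attributes. The sketch below parses such output from a string. The function name and the sample data are made up, and the sketch only looks at action "A" entries. One advantage of this route is that text inside a log message cannot be mistaken for a path change.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the shape of `svn log --xml --verbose` output.
SAMPLE = """<log>
<logentry revision="12">
<paths>
<path action="A" copyfrom-path="/trunk/src/util.c" copyfrom-rev="11">/trunk/lib/util.c</path>
<path action="D">/trunk/src/util.c</path>
</paths>
<msg>   A /trunk/fake.c (from /trunk/real.c:3)</msg>
</logentry>
</log>"""


def copies_from_log(xml_text, since_revision=0):
    """Return (revision, source, destination) for each copied path
    in revisions newer than since_revision."""
    found = []
    for entry in ET.fromstring(xml_text).iter("logentry"):
        revision = int(entry.get("revision"))
        if revision <= since_revision:
            continue
        for path in entry.iter("path"):
            source = path.get("copyfrom-path")
            if path.get("action") == "A" and source is not None:
                found.append((revision, source, path.text))
    return sorted(found)


print(copies_from_log(SAMPLE))
```

The log message in the sample deliberately imitates a changed-path line; because it lives in the msg element rather than a path element, it is ignored.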
ClearCase -> SVN Synchronization –
1. Triggers need to be created for the following ClearCase commands: mv, rmname, rmelem
and rmfolder. The triggers would need to record the from and to paths along with the
operation to be performed (i.e., move or remove).
2. Need to be able to execute moves (svn move) and deletes in the SVN working copy
(focused on a branch for the ClearCase work) working from a list of files and directories
that have moved in ClearCase.
3. Copy the contents of the ClearCase view (focused on the internal development branch)
into the SVN working copy (being sure to eliminate the lost+found directory)
4. Identify the files and directories that need to be added (e.g., I was able to do this via the
following command line).
svn status |awk '/\?/'| awk 'BEGIN { FS= "?" } { print $2 }' > ..\addfiles
5. Execute a SVN add on the identified files and directories (e.g., I was able to do this via
the following command).
svn add --targets ..\addfiles
6. Execute an SVN commit which will commit the adds, moves and modifies together.
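Steps 4 and 5 could also be done inside the script rather than with awk. The sketch below takes the text printed by svn status and writes a targets file for svn add --targets. The function names and the sample output are illustrative. Unlike the awk filter, it only treats a line as unversioned when the '?' is in the first column, and it skips lost+found in case it was copied over in step 3.

```python
def unversioned_paths(status_output, skip=("lost+found",)):
    """Return paths that `svn status` flags with '?' in the first column."""
    paths = []
    for line in status_output.splitlines():
        if not line.startswith("?"):
            continue
        path = line[1:].strip()
        # Look at the first path component, whichever separator is used.
        first = path.replace("\\", "/").split("/", 1)[0]
        if first not in skip:
            paths.append(path)
    return paths


def write_targets(paths, targets_file):
    """Write one path per line, the format `svn add --targets` reads."""
    with open(targets_file, "w", encoding="utf-8") as handle:
        handle.writelines(f"{p}\n" for p in paths)


# Made-up status output for illustration.
status = (
    "?       new.c\n"
    "M       old.c\n"
    "?       docs\\guide.txt\n"
    "?       lost+found\n"
    "A       q?.c\n"
)
print(unversioned_paths(status))
```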
The key here is to automate the steps for the synchronization. We can document how to establish
the views and working copies leaving it to the user to execute properly. The SVN to ClearCase
synchronization is the most critical one for the customer so it is needed first. It would be nice to
allow the commit message to be passed to the script.