This blog entry is a result of the recent webcast focused on the topic of “Subversion Best Practices – Branching Successfully.” I promised to use our Subversion blog to respond to questions that the audience asked, but that I didn’t have time to answer. I hope to answer about half of those questions in this entry and will do the other half next week. If yours isn’t in this posting, then wait until next week. And feel free to ask more questions via our discussion forums.
What do you offer in the way of Subversion training?
I’ll speak to this briefly and point you to the appropriate web page if you want further information. CollabNet offers training for administrators, users, and configuration managers. We have both half-day and full-day classes for administrators that can be taught onsite or remotely. We have both half-day and full-day classes for users that can be taught onsite, remotely, or even via web-based training (we also do train-the-trainer sessions onsite). We have a full-day class for configuration managers, focused on best practices, that we can deliver onsite. Occasionally we offer other open enrollment classes online, such as client-specific user classes or advanced topics.
With the training and consulting, is it Subversion server or product specific, or more generic as far as the server and client applications are concerned, or a combination?
Most of our administration training is done for a specific organization, which makes it server, product (standalone Subversion, CollabNet Subversion Edge, or CollabNet TeamForge), and environment specific to a point, driven by what that organization has implemented or plans to implement. Our user training does not depend on the server, but some aspects vary based on the product involved. We try to address the primary client that an organization plans to use in our demonstrations and in any labs (labs come with the day-long class). We often demonstrate in multiple clients for customers who have identified several standard clients. We have offered open enrollment classes for specific clients as well.
Server consulting is product and environment specific as defined by the customer. Migration and process (Applied Workshop) consulting are impacted a little by products and that is taken into consideration.
Do you have a cheat sheet or reference card for Subversion commands?
Yes, we keep an up-to-date version on the CollabNet Subversion Community site.
In the Stable Trunk model, what is the purpose of the merge to trunk and branching from trunk if you don’t develop or release on trunk?
You could release from the trunk, assuming that you have only a single version in production at a time, but that’s up to you. The purpose of the trunk is to provide a central point of truth to which all releases relate. It also gives us a condensed history of at least all the main releases which, in time, becomes pretty much the history we’d want to peruse. That central point lets us respond to changes better as well (e.g., the injection of a new release between two already defined ones). It also makes it much easier to follow what has happened via any visualization available today or offered in the future.
What are the main differences between the Agile Branch (Release) Approach and the Feature Branch Approach?
Feature branches, used with either the unstable or stable trunk model, are about isolating the work on a single feature. The isolation addresses one of three risks: other changes being made elsewhere might distract from this specific feature; committing in-process changes would impact others who receive them in their work environments through the update operation; or the feature might not make the current release and might fall into a later one. What the question about developer branches is looking to handle is covered by feature branches, which are just a nuance of either model.
The Agile Release model assumes nothing about what any upcoming release might contain until that release is basically about ready to start through the formal promotion model process. Feature branches here are not selectively done due to risk posed by or to a specific piece of work, but rather are the default for all units of work. They are spawned from the last release and get subsequent releases merged into them until they are merged into a release themselves.
We use developer branches so that we can commit changes without impacting others. I didn’t see it in your list of nuances so what is your perspective on these as best practices?
I didn’t cover every possible type of nuance, and developer branches certainly fall into that category. From my perspective, they are not a best practice and I generally recommend against their usage. There is definitely the occasional need to commit changes that you don’t want to commit to the general development branch. You don’t want to go too long without putting changes into the repository, for a number of reasons. That said, those changes should be related to a specific piece of work that you’ve been assigned to implement, and thus I would implement that as a feature branch. Developer branches don’t make it clear what is to be found on that branch or what it is being used for at any point in time. They often become messy sandboxes that are an administrator’s worst nightmare if a developer is away or leaves the company. There is a definite need to commit to the repository and isolate that from other people’s work, but I strongly recommend the use of feature branches for that purpose, as they have clear content and an understood lifespan.
Are you saying not to have separate test and production branches as a best practice, but to use a common branch and apply other techniques to identify development, test or production work (i.e., tags)?
Yes. Development is the only environment where work should really be done, so it is the only one that maps to the idea of a branch. That mapping logically applies to the trunk in the unstable model and to a release branch in the stable model. Test and production are specific revisions that we want to identify as being approved for or deployed into those environments. We look for a specific revision that doesn’t change on us for these environments. That maps to the idea of a tag. That also makes it easy for us to identify what went where, when, and for how long. We can also easily determine what changed between one stage and another or between one release and another. We want to know these things historically, and we want others to understand these events and revisions, which tags support and branches do not.
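As a rough illustration of tagging a fixed revision for an environment and then comparing two stages, here is a minimal Python sketch that only builds the svn command lines. The repository URL, the tags layout, and the function names are assumptions made for the example, not something taken from the webcast.

```python
from typing import List

# Hypothetical repository root, used only for illustration.
REPO_ROOT = "https://svn.example.com/repos/project"


def tag_command(source_path: str, revision: int, tag_name: str, message: str) -> List[str]:
    """Build the `svn copy` call that tags one fixed revision of a branch."""
    source = f"{REPO_ROOT}/{source_path}@{revision}"  # peg the exact revision
    target = f"{REPO_ROOT}/tags/{tag_name}"
    return ["svn", "copy", source, target, "-m", message]


def stage_diff_command(old_tag: str, new_tag: str) -> List[str]:
    """Build the `svn diff` call that summarizes what changed between two tags."""
    return [
        "svn", "diff", "--summarize",
        f"{REPO_ROOT}/tags/{old_tag}",
        f"{REPO_ROOT}/tags/{new_tag}",
    ]
```

Either list can be handed to subprocess.run(...) on a machine with the svn client installed. Because the tag is a server-side copy of one revision, it never changes underneath the test or production environment.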
Are there walk-throughs for these actions somewhere?
The actions themselves are basic Subversion operations that can vary based on the client you choose to use. Some clients may provide information approaching walk-throughs. Obviously training classes cover the operations, and our Subversion for Configuration Managers class helps people walk through the best practices that we discussed.
How have you seen administrators restrict access to creating branches to prevent branching chaos? Obvious challenge is that it is done via the copy command.
If they really want to restrict the creation of branches, then the only method is a pre-commit hook script that parses the transaction to see whether a copy targets the branches folder and whether the user is on a list of approved branch creators. As you note, the fact that copy is a general command adds some challenge, but assuming that branches are always created in the branches folder mitigates that to some extent (the remaining challenge is identifying that a specific write operation is a copy).
I don’t find that most administrators try to enforce such restrictions. Most treat it as a process issue where people are told what they are expected to do and not do. They then monitor what gets created in the branches folder so they can address violations of the process with the individuals themselves. Tool-enforced governance covers up who is not adhering to defined processes, at some cost to everyone else. Why not fix the issue with the people who are failing to follow policy (e.g., get them to comply or have them leave)?
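A minimal Python sketch of such a pre-commit check follows. It assumes the hook reads the output of `svnlook changed --copy-info`, where an added path with copy history shows "A" in the first column and "+" in the third; the approved-user list, the branches/ layout, and all function names are hypothetical.

```python
import subprocess
import sys
from typing import Iterable, List

APPROVED_BRANCH_CREATORS = {"alice", "bob"}  # hypothetical user names
BRANCHES_ROOT = "branches/"                  # assumed repository layout


def copied_paths(changed_lines: Iterable[str]) -> List[str]:
    """Return the paths that were added with history (i.e., copied)."""
    copies = []
    for line in changed_lines:
        # Column 1 is the status, column 3 is the copy flag, the path starts at column 5.
        if len(line) > 4 and line[0] == "A" and line[2] == "+":
            copies.append(line[4:].strip())
    return copies


def new_branches(copies: Iterable[str], root: str = BRANCHES_ROOT) -> List[str]:
    """Keep only copies that land directly under the branches folder."""
    found = []
    for path in copies:
        if path.startswith(root):
            name = path[len(root):].strip("/")
            if name and "/" not in name:
                found.append(path)
    return found


def check(author: str, changed_lines: Iterable[str]) -> List[str]:
    """Return error messages; an empty list means the commit is allowed."""
    created = new_branches(copied_paths(changed_lines))
    if created and author not in APPROVED_BRANCH_CREATORS:
        return [f"{author} may not create branch {path}" for path in created]
    return []


def svnlook(subcommand: str, repos: str, txn: str, *options: str) -> str:
    """Run svnlook against the pending transaction and return its output."""
    cmd = ["svnlook", subcommand, *options, "-t", txn, repos]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def main(argv: List[str]) -> int:
    repos, txn = argv[1], argv[2]
    author = svnlook("author", repos, txn).strip()
    changed = svnlook("changed", repos, txn, "--copy-info").splitlines()
    errors = check(author, changed)
    for message in errors:
        print(message, file=sys.stderr)  # stderr is relayed to the committing client
    return 1 if errors else 0            # a non-zero exit rejects the commit


# Subversion invokes the hook with exactly two arguments: REPOS and TXN.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv))
```

The sketch deliberately treats only direct children of the branches folder as new branches, so copying or moving a file inside an existing branch is not blocked.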
Are properties related to branching strategies?
No. The only property that really comes into play is svn:mergeinfo, which Subversion uses to store the paths and revisions that have been merged to a path. The best practice is to merge at the top of a branch whenever possible, so branching strategies really don’t impact that property either. Obviously if you merge branches that have properties on paths, then the merge will also address those properties. Keep in mind that conflicts can occur that need to be resolved.
Is there a good resource (or other presentation) on the more specific practices of merging and resolving conflicts? I come to Subversion from ClearCase and find that I’m in desperate need of merge/conflict best practices (i.e., it still seems MUCH easier with ClearCase)?
You may find materials with regard to a specific client and the online book addresses it to some extent. Obviously a training class could also help and I’d suggest our web-based training module, Enterprise Features.
There is a definite difference between how ClearCase defines and approaches merging and how Subversion does it. The idea of ClearCase’s “most common recent ancestor” isn’t really mapped into Subversion due to how merging is done. That makes the transition a bit more challenging since you get used to one approach that isn’t duplicated in the new tool. Having sold and supported both, I think that’s the biggest thing to adjust to. Outside of that, the tree conflicts are the other aspect that brings complexity and we hope to make that easier moving forward.
Is there a way to run a dry-run merge report without access to a working copy (e.g., via URLs)?
No. Today there is really no practical feature defined as a dry-run. Clients that offer such functionality are doing a normal merge and throwing away the results rather than placing them in a working copy. These clients still require the working copy to do even that. So there really is only a merge operation and since it needs to apply the diff (of two trees in your repository) to a third tree that it expects might have conflicts, it requires a working copy. There has been talk of implementing such functionality (though I’m not sure it was planned for having no working copy or just not impacting the working copy), but I am not aware of anyone working on it currently.
You suggested that it is possible to pull back a merge using a rollback. But what if the merged code had been modified/evolved on the release branch for some time before we tried to roll back the merge? Is that possible? If so, then I’m assuming it would likely require manual unmerging that could be error prone?
Yes, it is possible to roll back a change even after the branch has had numerous other changes applied to it since the change you want to roll back. It is also possible that the rollback will itself result in a conflict that requires manual resolution, but I don’t see that as overly error prone, in general anyway.
A rollback of either a merge or a commit on a path is implemented via a reverse merge. You identify the revision(s) that you want rolled out of the branch and supply that information to the client of your choice. Basically, Subversion looks at the delta between the revision to roll back and the revision prior to it to determine what operations to do in reverse (e.g., if the revision added a file, then it will delete that file; if the revision deleted a line, then it will add that line; etc.). Some clients refer to the operation as a revert, but ultimately it is executed as a reverse merge. If the rollback is of a merge, then not only will the changes be reversed, but the svn:mergeinfo property will no longer show that merge, and thus what was unmerged is a candidate to be merged again in the future if desired.
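With the command-line client, a reverse merge is expressed by negating the revision given to `svn merge -c`, or by giving a descending range to `-r`. The Python sketch below only assembles those command lines; the branch URL and function names are placeholders chosen for the example.

```python
from typing import List


def rollback_revision(revision: int, branch_url: str, working_copy: str = ".") -> List[str]:
    """Build the reverse merge that undoes a single committed revision."""
    if revision < 1:
        raise ValueError("revision must be a positive number")
    # `-c -N` asks for the change made in revision N, applied in reverse.
    return ["svn", "merge", "-c", f"-{revision}", branch_url, working_copy]


def rollback_range(first: int, last: int, branch_url: str, working_copy: str = ".") -> List[str]:
    """Build the reverse merge that undoes revisions first..last inclusive."""
    if not 1 <= first <= last:
        raise ValueError("expected 1 <= first <= last")
    # A descending range (last down to first - 1) reverses the whole span.
    return ["svn", "merge", "-r", f"{last}:{first - 1}", branch_url, working_copy]
```

For example, rollback_revision(303, "^/branches/release-1.0") yields the equivalent of `svn merge -c -303 ^/branches/release-1.0 .`. The command changes only the working copy, so any conflicts can be resolved and the result reviewed before the rollback is committed.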
What is a tree conflict?
A tree conflict is basically what happens when a path has been deleted by one user, modified by another user, and then the two changes are brought together. This conflict isn’t about file content or a property value, but about the structure of the repository. It can involve one or many files and folders. Some of these conflicts have straightforward resolutions (e.g., merging a moved file with changes to its contents should result in the moved file having the changed contents) and some are situational and so require human intervention (e.g., a file is deleted by one user and modified by another). In Subversion, such scenarios arise with updates, switches, and merges. Since a move or rename is actually a two-operation macro involving a delete of the original object, it is often the cause of a tree conflict, but it is not the only cause, so making the move/rename operation atomic won’t resolve all tree conflicts.
Subversion 1.6 added automatic tree conflict detection and blocking. Future releases will address use cases where automated resolution is universally agreed upon, but today some of that can be handled by an open source utility, truMerge, that can be used instead of the merge operation provided with Subversion.
What types of notifications does Subversion provide so that management knows when tasks are being worked, checked in, etc.?
It has been my experience that much of that needs to come not from the version control tool, but from the issue management tool. The user is assigned an issue (e.g., defect, enhancement, feature, etc.) and they indicate in the issue management system that they are working on it. When they complete the work, they change the state of that issue to indicate that as well. Issue management tools are expecting management inquiries so they are built to report this information back. Integration between Subversion and the issue management system could also be in place so that it is easy to see what was changed to address a specific issue.
Within Subversion itself, there is no concept of a task that could be reported upon. Locking could be used so that the files being worked on can be identified, but that puts other limits in place that keep this use of locking from being a general best practice. It is common to have commits (check-ins) logged to a mailing list, and there are general post-commit hooks available that you can implement to accomplish that. Feature branches can also be used as a means of tracking what work is in process, assuming they are created for each unit of defined work and are deleted when that work is integrated/merged into the general development branch. That would make it possible to see what was in process by seeing which feature branches were active.
Does Subversion support linking of a file into multiple folders similar to Visual SourceSafe (VSS)?
Not being much of an expert on VSS, I won’t answer this question definitively, but will rather describe what Subversion can do when you want to link a file or folder into other structures.
Subversion has a property, svn:externals, that allows you to define a revision (optional) and a path that you want mapped into a working copy at the time of checkout, under a name that you define in that mapping. This approach supports the idea of shared components, linking of role-specific repository data (e.g., QA, DB, etc.), or third-party libraries. A directory can be mapped in from within the same repository or from another repository. A file can be mapped in from within the same repository only.
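To make the mapping concrete, here is a small Python sketch that formats one externals definition in the syntax accepted since Subversion 1.5 (optional revision, then source, then local name) and builds the matching `svn propset` call. The shared-component path and the local names are invented for the example.

```python
from typing import List, Optional


def externals_line(source: str, local_name: str, revision: Optional[int] = None) -> str:
    """Format one svn:externals definition: [-rREV] SOURCE LOCAL_NAME."""
    prefix = f"-r{revision} " if revision is not None else ""
    return f"{prefix}{source} {local_name}"


def propset_command(definitions: List[str], directory: str = ".") -> List[str]:
    """Build the `svn propset` call that stores the definitions on a directory."""
    # Several definitions are stored as separate lines of one property value.
    return ["svn", "propset", "svn:externals", "\n".join(definitions), directory]


# Hypothetical shared component, pinned to revision 148 and mapped in as "logging".
pinned = externals_line("^/shared/logging", "logging", revision=148)
# The same component without a pinned revision, which follows the latest revision.
floating = externals_line("^/shared/logging", "logging-latest")
```

Pinning a revision keeps the mapped content from changing without notice, while leaving it off brings in whatever is newest at each update. The property change itself still has to be committed before other working copies pick it up.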