I have intended for a long, long time now to write a blog post about one of my favorite soapbox topics in configuration management: tagging. The principle can’t be hard to understand, given that we apply tags on Flickr, hashtags on Twitter, and tags on Facebook. In each of those tools, the idea is to apply a piece of metadata to some type of content. That’s what we’re talking about with Subversion too, the key being that the metadata is applied to a revision of the folders and files that you’ve put under version control. The purpose isn’t much different from tagging in those other tools. You want to draw attention to that revision by giving it a meaning that makes that set of folders and files important to others, or important over the passage of time.
At best, most projects apply a tag to the revision they approve for deployment into production. That means that the casual user, or the new employee months or years later, can see ongoing development through the individual commits and associated changesets, but it appears as though one day the project team was developing as usual and the next they suddenly felt the code was worthy of being put into production. That’s not usually what happens (though one can never say never in some organizations). Instead, the project team has a logical and often multi-step process through which a revision of code moves before it is deemed production worthy. Without tags, that information is lost, and when someone asks about the process and where it might be improved, they are left harvesting human recollections, which are inherently flawed. A negative event is much like a criticism in that it carries more weight than its opposite. I’ve heard it said that it takes seven compliments to overcome one criticism. I don’t know if that’s the right ratio, but it obviously isn’t one to one. If we want to be able to talk accurately about what has gone on historically, we need to have it recorded in a way that keeps it correct and unchangeable.
The value of knowing what a particular engineer did on a particular day at a particular point in time decreases quickly over time. The value of knowing what happened in our process of moving developed code to production code is likely to become more valuable with the passing of time. Oftentimes we just haven’t yet been asked the questions that require us to have that information available, so we can’t appreciate its value. Many of us have also found that tagging, labeling, or whatever a previous tool did to track this data became a drag on performance and scalability over time. We might have appreciated some value in having this data tracked, but the trade-off in performance and scalability didn’t make it worth the effort.
Subversion handles tagging in exactly the same manner as branching, though obviously the result is used quite differently. Creating a tag is consistently quick and low-overhead no matter how few or how many folders and files are involved. In most cases, we’re talking about capturing two small pieces of data to define our tag: a pointer to the folder at the top of the structure we want tagged, and a global revision number indicating the point in time we want captured. The time and space required don’t change whether the structure contains five folders and a hundred files or 500 folders and 100,000 files. That means there is no downside to using this functionality to capture the various stages of our “promotion” process, but there is certainly a lot of potential upside. Think about how you would answer the following questions accurately without this data:
- How many times was there a handoff from development to the release engineer before a successful test candidate could be built?
- How many times did we go through phase M before we were successful and able to move on to phase N?
- How much time did we spend in phase M?
- Where do we spend the most time in our “promotion” model on average?
- How much time does it take for us to get a release out once we’ve started down the “promotion” process?
- What files are doing the most thrashing through the “promotion” process?
- Who’s consistently responsible for the code issues that cause us to spend time fixing bugs and producing a new test candidate?
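
As a rough illustration of how tag data could answer a couple of these questions, here is a minimal Python sketch. It assumes the promotion tags have already been read out of the repository into simple records, and that tag names follow a hypothetical `<release>-<phase>-<attempt>` convention. The class, the function names, and the sample data are all made up for illustration; none of them are part of Subversion.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class PromotionTag:
    """One promotion tag. All field names here are hypothetical."""
    name: str          # assumed convention: "<release>-<phase>-<attempt>"
    source_path: str   # folder at the top of the tagged structure
    revision: int      # global revision number the tag captures
    created: datetime  # when the tag itself was created

    @property
    def release(self) -> str:
        return self.name.rsplit("-", 2)[0]

    @property
    def phase(self) -> str:
        return self.name.rsplit("-", 2)[1]


def attempts_per_phase(tags, release) -> Counter:
    """Count how many tags each phase needed for one release."""
    return Counter(t.phase for t in tags if t.release == release)


def time_in_phase(tags, release, phase, next_phase) -> timedelta:
    """Time from the first tag of `phase` to the first tag of `next_phase`."""
    def first(p):
        return min(t.created for t in tags
                   if t.release == release and t.phase == p)
    return first(next_phase) - first(phase)


# Made-up sample data: two build attempts before QA, then production.
tags = [
    PromotionTag("1.2.0-build-1", "/trunk", 1040, datetime(2024, 3, 1, 9, 0)),
    PromotionTag("1.2.0-build-2", "/trunk", 1052, datetime(2024, 3, 2, 14, 0)),
    PromotionTag("1.2.0-qa-1", "/trunk", 1052, datetime(2024, 3, 3, 9, 0)),
    PromotionTag("1.2.0-prod-1", "/trunk", 1052, datetime(2024, 3, 6, 9, 0)),
]

print(attempts_per_phase(tags, "1.2.0"))
print(time_in_phase(tags, "1.2.0", "build", "qa"))
print(time_in_phase(tags, "1.2.0", "qa", "prod"))
```

Each record holds only a path, a revision number, and a timestamp, which mirrors how little a tag needs to capture. Questions about which files thrash or who introduced the fixes would need the changesets between tagged revisions as well, which this sketch does not attempt.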
For many organizations, applying these extra tags is a new part of their processes, but a little time spent understanding the low cost and high value should make adding them an obvious choice. Even if your tags are more complex than the simple ones discussed above, you would be hard pressed to make a case that the potential value doesn’t significantly outweigh any cost. In short, tagging is a #nobrainer.