Multiple Subversion repositories?

On Wednesday CM Crossroads and CollabNet hosted a webinar: Subversion in the Enterprise, presented by C. Michael Pilato and Bob Jenkins from CollabNet plus Terrence Cordes, SCM manager at Reuters. Terry gave some great insights into deploying Subversion across global teams; I’ll post a link to a recording of this webinar here soon. Because the presenters only got to a few of the questions that were asked, we will answer some of the remaining ones in this blog over the next few weeks. Here is the first one, asked by a couple of people:

Do you recommend multiple repositories for distributed teams due to WAN performance?

Subversion itself does not support synchronized repositories that are concurrently used for read and write; it uses one central server. It was designed with WAN use in mind, however, and low bandwidth consumption is built into the system.

Subversion’s working copy model means that the developer works on his or her code without needing to be connected to the server. You only need a connection with the server in a few cases, for instance for a commit or when updating your local working copy with changes from the repository. When data is sent back and forth, only differences are sent.

Mike Pilato actually touched on this during the webinar. If you make a small change to a large file and commit that change, only the change is sent across the network, not the entire file. This minimizes bandwidth requirements. Subversion needs to send entire files across the wire only the first time a developer checks out from the repository.
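To get a feel for the difference in size, here is a minimal Python sketch that compares a whole file with a textual diff of a one-line change. It is only an illustration: Subversion uses its own binary delta format rather than the unified diff produced by Python's difflib, and the 2,000-line file and the helper name delta_size are made up for this example.

```python
import difflib


def delta_size(old_text: str, new_text: str) -> int:
    """Return the size in bytes of a unified diff between two texts."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile="old",
        tofile="new",
        n=0,  # no context lines: keep only the changed lines
    )
    return len("".join(diff).encode("utf-8"))


# A hypothetical "large" file: 2,000 lines of text.
old = "".join(f"line {i}\n" for i in range(2000))
# A small change to a single line in the middle of the file.
new = old.replace("line 1000\n", "line 1000 (edited)\n")

full_size = len(new.encode("utf-8"))
changed_size = delta_size(old, new)
print(f"full file: {full_size} bytes, delta: {changed_size} bytes")
```

The delta here is a few dozen bytes against a file of several kilobytes, which is the effect described above, although the exact numbers for a real commit will differ.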

The conclusion is that WAN performance is not an issue when considering Subversion, assuming your network is reliable.

Subversion does have some support for multiple repositories. Version 1.4 introduced svnsync, a utility that lets you replicate a repository into any number of read-only copies. There are several usage models for this, with backups being the most common.

But there are other uses as well. For example, at EclipseCon I met some people from the Philippines who were asking about using multiple repositories to get around network downtime (we’ve all heard about the recent big internet outage in Asia). Their development team is in Los Angeles but build and test happens in the Philippines. This company can set up a main repository with read-write access in the US and use svnsync to make a remote copy for the build and test team. Should the international network go down, they can access the local read-only copy to make a build.
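As a rough sketch of that setup, the Python snippet below prints the two svnsync commands involved: one to initialize the mirror and one to copy revisions into it. Both URLs are hypothetical placeholders, and the helper name svnsync_commands is made up for this example. It assumes the mirror repository has already been created with svnadmin create and has a pre-revprop-change hook that allows revision property changes, which svnsync requires. The commands are only printed, not executed.

```python
import shlex


def svnsync_commands(mirror_url: str, master_url: str) -> list:
    """Build the svnsync command lines for a read-only mirror."""
    return [
        # One-time step: register master_url as the source of the mirror.
        ["svnsync", "initialize", mirror_url, master_url],
        # Repeatable step: copy any new revisions from the master.
        ["svnsync", "synchronize", mirror_url],
    ]


# Hypothetical URLs: a master in the US and a local mirror for build and test.
commands = svnsync_commands(
    "file:///srv/svn/mirror",
    "https://svn.example.com/repos/main",
)

for argv in commands:
    print(shlex.join(argv))
```

The synchronize step can be run on a schedule or from a post-commit hook on the master, so that the mirror stays close to current and is still usable when the link goes down.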

You can find out more about svnsync by typing “svnsync help” at the command prompt or by checking the online version of Version Control with Subversion. The authors are updating this book for release 1.4 and have a chapter on svnsync. (I cannot give you the exact URL of the svnsync chapter because it changes with the daily builds.)

If you want to use Subversion and really need multiple read-write repositories, there is a solution: svk (its primary author is Chia-liang Kao). svk is a decentralized version control system built on top of Subversion. It supports things like repository mirroring and disconnected operation. Some people will prefer this, but before you choose a distributed repository solution, make sure you really need it. It does have advantages for developers who often work disconnected from the network: as with Subversion they can work offline, but they can also commit to a local repository. However, it can come at the cost of higher administration overhead, higher bandwidth requirements and more server infrastructure. Subversion’s centralized model is easier to deploy and maintain and, if you don’t need a distributed model, will have a lower cost of ownership.

Posted in Subversion
4 comments on “Multiple Subversion repositories?”
  1. Johnathan Gifford says:

    It should also be noted that the next general release of Subversion (1.5) will include the capability to redirect a commit made against a read-only repository on a remote server to the master repository on the central server. This will greatly benefit those who have developers working globally at different offices, who can then have their own repository server for looking at history and such without consuming more bandwidth on their WAN.
    This approach doesn’t solve the outage issues when the connection is disrupted between the master and remote servers. However, it does allow the developers to continue work even though updates by all developers globally are not getting replicated to that read-only repository on the remote server.
    Another advantage of having the remote read-only repositories is that, in the event of a disaster recovery situation at the location where the master server is, there could be plenty of live backups remotely. It is also easy for an administrator to turn a read-only repository into a master repository so work can continue.

  2. jeffhung says:

    Last year, we had to send a team to work in a factory that had no Internet access. The team members needed to cooperate with each other, and their work needed to be synchronized with our main office, too. In this situation, SVK didn’t help, since a mirrored SVK working copy cannot exchange changesets with other team members.
    Therefore, I developed a sequence of manual[1] operations using Subversion and svk mirror (SVK::Mirror might be enough, but I didn’t try). I call these operations SubStation, and they are still _manual_ operations that cannot be automated. SubStation can replicate the main repository, with only some minor differences isolated in separate branches. Using SubStation, our main office and the on-site team could both work with two disconnected repositories and synchronize with each other when the team got Internet access every night.
    The author of SVK, clkao, heard about our problems. After some discussions, he developed Pushmi[1]. Pushmi can do bidirectional synchronization and makes the “slave” repositories writable as well.
    [1] Pushmi: http://search.cpan.org/dist/Pushmi/

  3. jeffhung says:

    Sorry, the first [1] should be deleted.

