Overview of the basics of versioning
Versioning is a crucial part of the development process, for several reasons that are best illustrated by what happens without it:
- Want to see a great piece of code that was refactored out, deleted or otherwise lost six months ago? Hmm…
- If you have multiple versions of a product used in live environments (by customers or yourself), how can you fix a bug in an older version without forcing an upgrade to the most recent one?
- Want to experiment with a new architecture, build system, unit test library or general-purpose doomahickey without risking your current setup? Tough.
- Is your code full of commented-out chunks that are kept around just in case? That’s poor man’s versioning!
- And a whole bunch of other reasons that I’ll probably remember as soon as I hit the publish button.
The most basic, and naive, form of versioning is simply to take a copy of the codebase every so often and place it somewhere safe. This is better than nothing, but it would require logic to be convoluted at Olympic level to persuade anyone it’s a reasonable approach; it smacks of the wrong kind of laziness: zero effort up front, and sustained effort for minimal gain on the back end.
Skipping over the part where a company decides the best solution is to develop an in-house versioning system, we can move directly to tried-and-tested version control systems (VCSs). Many exist, both free and commercial, and whichever you choose will generally make your life easier.
In this particular example, I will use Subversion (http://subversion.tigris.org), a.k.a. svn, a popular, well-established VCS that enjoys support from the command line, various continuous integration servers and IDEs.
Versioning with Subversion
From the user’s point of view, Subversion works on a system of checkouts, commits and updates. Some quick definitions:
- repository – a collection of versioned files within the svn server
- checkout – the process by which the versioned files, plus some svn-specific information, are copied from the svn server to a working copy (e.g. on your local computer, or on a continuous integration server)
- commit – an atomic operation where all the files added, removed or updated in the working copy are propagated back to the svn server.
- update – synchronizes the files of the working copy with those of the svn server. This is used to pull the results of commits other than your own into your working copy. Your own commits, by definition, are already in your working copy because that’s where you made the changes.
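To make these definitions concrete, here is a minimal sketch in Python of how checkout, commit and update relate to one another. It is a toy model, not how Subversion is implemented: the class names Repository and WorkingCopy are made up for illustration, every revision is stored as a full snapshot, and the sketch insists on a fully up-to-date working copy before a commit, which is stricter than the real thing.

```python
class Repository:
    """Toy stand-in for the svn server: every revision is a full snapshot."""

    def __init__(self):
        self.revisions = [{}]  # revision 0 is the empty repository

    @property
    def head(self):
        # Number of the most recent revision
        return len(self.revisions) - 1


class WorkingCopy:
    """Toy working copy: the files plus the revision they were taken from."""

    def __init__(self, repo):
        # checkout: copy the latest files and remember their revision
        self.repo = repo
        self.revision = repo.head
        self.files = dict(repo.revisions[repo.head])

    def commit(self):
        # Simplification (assumption of this sketch): the whole working copy
        # must be up to date before anything can be committed.
        if self.revision != self.repo.head:
            raise RuntimeError("working copy is out of date, update first")
        # All changes land together as one new revision.
        self.repo.revisions.append(dict(self.files))
        self.revision = self.repo.head
        return self.revision

    def update(self):
        # Simplification: assumes there are no uncommitted local changes.
        self.revision = self.repo.head
        self.files = dict(self.repo.revisions[self.repo.head])


repo = Repository()
alice = WorkingCopy(repo)  # two committers check out the same repository
bob = WorkingCopy(repo)

alice.files["README.txt"] = "Hello"
alice_revision = alice.commit()

bob_files_before_update = dict(bob.files)  # Bob has not seen the commit yet
bob.update()                               # now he has
```

Alice's commit is already in her working copy because that is where she made the change; Bob only sees it once he updates.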
A typical project lifecycle in an imaginary world would be
1. Create project in subversion
2. Create or import the initial files
3..n-1. Commit and update files as you and other committers make changes to the repository
n. End of project
Anyone who has ever got to n, break out the champagne and take a five-minute break – you’ve either shipped the definitive (not to mention bug-free) version of the project, or you’ve just been downsized. Nonetheless, this illustrates versioning with a single branch with no tags. Each commit can be traced to a specific committer, and the development history of the project is kept for both posterity and code audits.
Branches? Tags?
As mentioned above, versioning can be done in a linear fashion – one single line, with commits falling exclusively on that line. In svn terminology, this line is called the trunk.
Figure: Versioning using the trunk only
A project that has multiple versions, however, needs to consider branching. Branching, just as the name suggests, represents a major offshoot of the trunk. In fact, it’s an exact copy of the trunk at the point the branching took place. We now have, in effect, two parallel lines of development which are completely independent but have a common root.
The branch may represent, for example, the point at which version 1.1 was released. The trunk would be branched into, say, MyApp-1.1, built, tested and shipped. At roughly the same time as this, the person sitting opposite you just committed a major change that won’t be available until version 1.2 – this commit does not affect the 1.1 code, and so ongoing developments in the trunk do not corrupt bug-fixing endeavours in the 1.1 branch.
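The idea that a branch is an exact copy of the trunk at the moment of branching can be sketched in a few lines of Python. This is only an illustration of the effect: the branch function and the file names are hypothetical, and in Subversion itself a branch is made with svn copy rather than anything like this.

```python
import copy

# Each line of development maps file names to file contents.
lines = {"trunk": {"app.py": "print('version 1.1')"}}


def branch(lines, source, name):
    """Create a new line of development as an exact copy of the source."""
    if name in lines:
        raise ValueError(f"{name} already exists")
    lines[name] = copy.deepcopy(lines[source])


# Version 1.1 is released: branch the trunk.
branch(lines, "trunk", "MyApp-1.1")

# The person sitting opposite you commits a major change for 1.2 to the trunk.
lines["trunk"]["app.py"] = "print('work in progress for 1.2')"
lines["trunk"]["feature.py"] = "# major change for 1.2"

# The branch still holds exactly what was released as 1.1.
released_files = lines["MyApp-1.1"]
```

After the trunk commit, released_files still contains only the 1.1 version of app.py, which is what makes bug-fixing on the branch safe.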
A more realistic development cycle would be
trunk.1 – Create project in subversion
trunk.2..trunk.n-1 – Commit and update files as you and other committers make changes to the repository
trunk.n – Branch code as new customer comes on board and makes various requests
trunk.n+1 – Continue working on new features, requirements, etc.; no effect is felt by the branch. Any bug discovered which also exists in the branch may have its fix ported into the branch.
branch.n.1 – Bug fixes; no effect is felt by the trunk. Any bug discovered which also exists in the trunk may have its fix ported into the trunk.
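Porting a fix between the trunk and a branch boils down to asking "does this line of development contain the same bug?" and, if so, applying the same change there. The Python sketch below shows that decision in its simplest form. The port_fix helper, the file names and the bug itself are all invented for illustration; a real VCS works with recorded changes and merges rather than text replacement.

```python
def port_fix(files, path, buggy, fixed):
    """Apply a fix to one line of development, but only if the bug is there."""
    content = files.get(path)
    if content is None or buggy not in content:
        return False  # nothing to port: the bug does not exist here
    files[path] = content.replace(buggy, fixed)
    return True


BUGGY = "return price * quantity + discount"
FIXED = "return price * quantity - discount"

# The 1.1 branch and the trunk share the bug; the trunk has also moved on.
branch_files = {
    "billing.py": "def total(price, quantity, discount):\n    " + BUGGY + "\n",
}
trunk_files = {
    "billing.py": "def total(price, quantity, discount):\n    " + BUGGY + "\n",
    "reports.py": "# new feature, trunk only\n",
}
# A hypothetical older branch that predates the discount code entirely.
old_branch_files = {
    "billing.py": "def total(price, quantity):\n    return price * quantity\n",
}

fixed_in_branch = port_fix(branch_files, "billing.py", BUGGY, FIXED)
ported_to_trunk = port_fix(trunk_files, "billing.py", BUGGY, FIXED)
ported_to_old = port_fix(old_branch_files, "billing.py", BUGGY, FIXED)
```

The fix found on the branch is ported to the trunk, the trunk-only file is left alone, and the older branch is untouched because it never had the bug.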
Figure: A repository with branches and tags
When certain points are reached, for example when all the requirements have been met, tags come into play. A tag is a snapshot of the branch at a given point, and is useful for recalling the state of the project at that moment. Tags should be used, at the very least, for each release; they may also be used, for example, for internal tracking.
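A tag can be pictured as a named copy that nobody changes afterwards. The Python sketch below models that with a read-only view over a copy of the branch. The tag function and the tag name are hypothetical, and the read-only wrapper is a choice made for this sketch to keep the snapshot honest, not a description of how Subversion stores tags.

```python
import copy
from types import MappingProxyType

tags = {}


def tag(files, name):
    """Record a read-only snapshot of the given files under a name."""
    if name in tags:
        raise ValueError(f"tag {name} already exists")
    tags[name] = MappingProxyType(copy.deepcopy(files))


branch_files = {"app.py": "print('version 1.1')"}
tag(branch_files, "MyApp-1.1.0")  # snapshot taken at release time

# Work continues on the branch after the release.
branch_files["app.py"] = "print('version 1.1 with bug fix')"

# The snapshot cannot be altered through the tag.
try:
    tags["MyApp-1.1.0"]["app.py"] = "tampered"
    tag_was_modified = True
except TypeError:
    tag_was_modified = False
```

Looking up tags["MyApp-1.1.0"] later recalls exactly what was shipped, whatever has happened to the branch since.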
So you can only have one branch?
So far, the examples have centered around the trunk and a single branch. This is fine for illustrative purposes, but remember that not only are multiple branches possible, it’s also possible to branch from branches. In fact, you can think of the trunk as a branch with the only difference being it doesn’t have a parent.
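The "trunk is just a branch without a parent" view maps neatly onto a small data structure. In the Python sketch below, the Line class and the branch names are invented for illustration; each line of development simply remembers where it was branched from.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Line:
    """A line of development; the trunk is the one with no parent."""
    name: str
    parent: Optional["Line"] = None

    def ancestry(self):
        """Names from this line back to the trunk."""
        names = []
        line = self
        while line is not None:
            names.append(line.name)
            line = line.parent
        return names


trunk = Line("trunk")
release = Line("MyApp-1.1", parent=trunk)
customer = Line("MyApp-1.1-customer", parent=release)  # a branch of a branch

customer_ancestry = customer.ancestry()
```

Here customer_ancestry walks from the customer branch through MyApp-1.1 back to the trunk, where the chain stops because the trunk has no parent.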
Why version when the binaries tell the whole truth and nothing but the truth?
Pragmatism and laziness are, at the end of the day, two of the greatest attributes any developer can have. They lead to automation, efficient processes and a tendency to get things right in the minimum amount of time. Taking this into account, surely adding version control to your project is an unnecessary overhead? After all, the binaries held by the customer not only fit their requirements but also contain exactly the bug that requires fixing! The binaries can be requested from the customer, decompiled, patched, recompiled and redistributed. Easy as pie, sweet as a pea, etc, etc.
If you’re going to take this route, you have to ask the customer for their binaries each and every time you have to fix a bug for them. There is, of course, the possibility that you’ll keep a copy of each binary you send to them to avoid this unprofessional and embarrassing request… but if you’re going to do that, why not just version the source code in the first place?
Reference: A rough overview of the basics of versioning from our JCG partner Steve Chaloner at the Objectify blog.