When I began using an SCM to manage my hosts’ /etc, CVS was the best-known free SCM. It worked, but it was far from a perfect fit. In the meantime, we’ve seen the growth of a diverse SCM biosphere.
There has not only been a growth in quantity, but the new stars bring new qualities, too.
Distributed vs. Centralized
While CVS and Subversion are centralized by design, most new SCMs can be used in a distributed environment. In CVS and Subversion there is a clear distinction between the repository and working copies. Every change in a working copy has to be submitted to the central repository server, from where others are able to pull it. In CVS this dependency was so strong that almost every command needed a connection, so working offline was nearly impossible. While this centralization is good for coordinating a few people working on a common project, it doesn’t fit when many groups are working on dedicated branches. A distributed setup also needs much less infrastructure: no central repository is needed, and everything can be done using only the working directory.
When managing /etc, the distributed nature of today’s SCMs comes in very handy. It allows you to just initialize /etc and start tracking changes. No central repository setup is needed. It’s also very easy to replicate such changes between hosts, as you can pull changes from the /etc on one host to the /etc on another. One thing you must be aware of is that every tracked file is also represented in the SCM’s repository directory (called .git, _darcs etc.). You need to protect this directory (chmod 700, since a directory needs the execute bit to be entered) in order not to leak the content of otherwise protected files!
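The setup described above can be sketched with Git as follows. This is only an illustration under a few assumptions: it works on a temporary directory standing in for /etc so that it is safe to run as a normal user, and the file content, user name and e-mail address are made up.

```shell
#!/bin/sh
set -eu

# Hypothetical stand-in for /etc, so the sketch does not touch the real one.
etc_dir=$(mktemp -d)
echo "127.0.0.1 localhost" > "$etc_dir/hosts"

cd "$etc_dir"
git init -q .

# Lock down the repository directory right away: it will hold a copy of
# every tracked file. A directory needs the execute bit to be entered,
# hence 700 rather than 600.
chmod 700 .git

git add hosts
git -c user.name="etc admin" -c user.email="root@localhost" \
    commit -q -m "Initial import"

# Show the resulting permissions and the history so far.
ls -ld .git
git log --oneline
```

On a real host you would run the same commands as root inside /etc itself, and decide first which files are worth tracking.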
Branching and Merging
In CVS, branching was only for the pros. It was hard to do, and merging was a pain. SVN improved this somewhat, but only when Darcs and others removed the distinction between a branch and a working directory did branching become an everyday thing. In Darcs, Git etc., working directories are full-featured repositories. They contain the whole history. It’s because of this feature that every working directory can pull in changes from any other. It also allows you to branch by simply copying a working directory.
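A minimal sketch of branching by copying, again with Git and under the same caveats: host-a and host-b are hypothetical directory names inside a temporary directory, and the file content and committer identity are invented for the example.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Made-up identity so the commits work on a machine without Git configuration.
export GIT_AUTHOR_NAME="etc admin" GIT_AUTHOR_EMAIL="root@localhost"
export GIT_COMMITTER_NAME="etc admin" GIT_COMMITTER_EMAIL="root@localhost"

# A working directory with one tracked file.
git init -q host-a
echo "nameserver 192.0.2.1" > host-a/resolv.conf
git -C host-a add resolv.conf
git -C host-a commit -q -m "Add resolver"

# Branch by copying: the copy carries the whole history with it.
cp -Rp host-a host-b

# Work continues independently in the copy.
echo "search example.org" >> host-b/resolv.conf
git -C host-b commit -q -am "Add search domain"

# Merge back: the original pulls directly from the copy.
git -C host-a pull -q --ff-only ../host-b HEAD

git -C host-a log --oneline
```

In practice `git clone` does the same job as the plain copy and also records where the branch came from.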
While branching and merging aren’t needed when simply putting /etc under revision control, they give the whole exercise a new dimension. You can not only track changes host by host, but also group changes into branches that enable some kind of service or feature. More on this in a later post…
Changesets vs. Revisions
Most SCMs use some kind of revision number to track the history of changes. Every revision describes the state of the project at a specific point in time. Each committed change increments the revision number. There is a clear timeline. This system seems intuitive and simple. But it doesn’t fit well into a distributed design, where different branches can move forward independently and be merged back later. While it is possible to handle this case, as shown by Subversion, a radically different approach is implemented in e.g. Darcs: a revision is not identified by a number, but simply by the set of changes it contains. This allows you to cherry-pick some changes while ignoring others when pulling in changes. You may want to learn more about the underlying theory of patches.
In software development most changes are here to stay. They come one after the other and contribute small features or bugfixes to a single application. You mostly aren’t interested in splitting the project into many, and you don’t need to be able to reapply changes in a different order. The changes belong together and depend on previous ones.
The changes applied to /etc are of a different nature. They may well depend on each other but can often be grouped into features or services. These changesets (features’n’services) don’t need to be applied in any particular order. You’d like to just choose a changeset and apply it to a given host.
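Choosing a single change and applying it to another host can be approximated in Git with cherry-pick. Note that this is only a rough stand-in for the Darcs model: Git still identifies commits by snapshots and copies the picked change as a new commit. The directory names, file contents and committer identity below are all made up for the illustration.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Made-up identity so the commits work on a machine without Git configuration.
export GIT_AUTHOR_NAME="etc admin" GIT_AUTHOR_EMAIL="root@localhost"
export GIT_COMMITTER_NAME="etc admin" GIT_COMMITTER_EMAIL="root@localhost"

# Both hypothetical hosts start from the same base configuration.
git init -q host-a
echo "# base" > host-a/base.conf
git -C host-a add base.conf
git -C host-a commit -q -m "Base configuration"
cp -Rp host-a host-b

# Two independent changes on host-a, each enabling one service.
echo "server 192.0.2.10" > host-a/ntp.conf
git -C host-a add ntp.conf
git -C host-a commit -q -m "Enable NTP client"

echo "*.* @192.0.2.20" > host-a/syslog.conf
git -C host-a add syslog.conf
git -C host-a commit -q -m "Forward syslog"

# host-b wants the syslog change only: fetch host-a's history,
# then apply just the newest commit and skip the NTP one.
git -C host-b fetch -q ../host-a HEAD
git -C host-b cherry-pick FETCH_HEAD

ls host-b
```

Afterwards host-b contains base.conf and syslog.conf but no ntp.conf, which is the "choose a changeset and apply it" workflow in miniature.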