Copyright © 1999 Luke Mewburn. All rights reserved.
This paper describes techniques for using revision control and documentation to install and maintain third-party source code. The automated building and distribution of software is outside the scope of this paper.
One of a system administrator's tasks is managing the installation and maintenance of third-party software, which is often compiled from source code.
Over the last ten years I have been in situations where I had to maintain third-party software that had insufficient documentation to rebuild or upgrade it if necessary. In many situations /usr/local/src consists of a haphazard collection of tar files and directories, possibly with local modifications, and generally only the original system administrator has any idea how the product was installed.
I started using CVS [1] over five years ago for the management of my own source, and since that time I have also been using it for managing the third-party source that I was responsible for.
This paper provides an introduction to CVS, and recommendations (based on my experiences) on how to use it in conjunction with appropriate documentation to improve the installation and maintenance of third-party software.
This paper was first presented at the SAGE-AU '99 conference in July 1999. I've updated it sporadically since then.
CVS (Concurrent Versions System) is a version control system. Generally, version control allows you to track changes to files, and to see who made each change, when it was made, and what was changed.
System administrators who are familiar with RCS (Revision Control System) software should find using CVS fairly easy since CVS was based on RCS.
Changes between revisions can be displayed in diff style.
CVS was primarily aimed at groups of programmers developing medium to large software projects. Most of the large open-source software efforts which have developers distributed globally use CVS, including: NetBSD, FreeBSD, Linux, XFree86, Apache, and Mozilla.
CVS has also proven useful for a wide range of other tasks, including the subject of this paper.
CVS uses a central repository, which is a hierarchical directory structure where all of the files and the associated version history are stored.
Developers do not work directly on the files in the repository; instead they use private checked-out copies of the sections of the repository they are interested in.
As the repository is a directory-based tree, there is no reason that it cannot be used for a variety of different purposes, including third-party software, in-house projects, host configuration files, etc.
The repository can be accessed via a pathname (either a local or NFS filesystem), or via a variety of remote protocols including rsh or ssh, and CVS's internal pserver protocol.
It is important to ensure that the repository is secure, especially if the source to security tools is to be kept in it.
CVS uses a few environment variables. Some of the more commonly used ones are:
$CVSROOT
    The location of the repository. Setting it avoids having to supply the -d repository option every time you use a CVS command.
Set these in your shell's startup file for convenience.
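As a minimal sketch of such a startup-file fragment for Bourne-style shells: the repository path /src/cvsroot is the one that appears in the examples in this paper, and the choice of ssh for $CVS_RSH (the variable CVS consults for the rsh-style remote program) is an assumption about the local site. csh users would use setenv instead.

```shell
# Fragment for ~/.profile (Bourne-style shells).
# /src/cvsroot is an example path; substitute your own repository.
CVSROOT=/src/cvsroot
# Use ssh rather than rsh for rsh-style remote repository access
# (a site choice, assumed here for illustration).
CVS_RSH=ssh
export CVSROOT CVS_RSH
```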
Choose a machine with suitable reliable disk capacity for the repository. The CVS info documentation [4] suggests allocating three times the size of the code stored in it. Our current CVS repository is 1250 MB in size, and there are over 220 packages in there (including large packages such as X11R6 and egcs).
To initialise the repository, on the server run:
% cvs -d /path/to/repository init
Alternatively, set $CVSROOT appropriately and run
% cvs init
Some tweaking of the CVS configuration files may be required to customise them for the local environment. The CVSROOT module at the top level of the repository contains these files. CVSROOT/loginfo controls where the commit messages are to be sent (e.g., appended to a local file, emailed to interested users, etc.). Refer to the CVS documentation for more information on configuring these files.
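All of the CVS operations are performed with the cvs command. cvs takes a word argument to describe the operation, with optional arguments to vary the operation.
The basic commands are:
cvs checkout module
    Check out a copy of module from the repository into a subdirectory of the current directory called module.
cvs release [-d] module
    Indicate that module is no longer in use (after checks are made to see if any files are modified). If -d is given, delete the directory as well. This should be used to indicate that a checked-out copy of the module is no longer required.
cvs update
    Bring the working copy up to date with the repository, and report which files have been modified locally.
cvs add
    Schedule a new file for addition to the repository (it is added at the next cvs commit).
cvs remove
    Schedule a file for removal from the repository (it is removed at the next cvs commit).
cvs commit
    Commit local changes to the repository.
cvs diff
    Display the differences between the working copy and the repository (or between two revisions) in diff style output.
cvs log
    Display the revision history and log messages of files.
cvs import
    Import a vendor source tree into the repository on a vendor branch.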
As well as the examples below, [3] provides a sample session.
The CVS info documentation [4] and the CVS Reference Manual [5] provide further information on these commands.
The use of CVS in itself is not a panacea for the problem of managing third-party source code. The use of accurate documentation (which some system administrators fear and loathe, probably because they do not know how to utilise it correctly) is mandatory. CVS is just a tool; it does not define policies.
The following documentation should be maintained:
Note that CVS and documentation should be used in conjunction with effective communication between the developers and system administrators, not as a replacement for communication!
For each separate software package imported into CVS, the following should be documented:
I maintain a file called 3RDPARTY, which is kept in a CVS module called docs. It consists of separate entries per package. An example entry is:
Package: python
Version: 1.5.2
Current: 1.5.2
Maintainer:
URL: http://www.python.org/
Archive Site: ftp://ftp.python.org/pub/python/src/py152.tgz
Mailing List:
CVS location: lang/python
Vendor tag: PYTHON_DIST
Release tag: python-1-5-2
Responsible: lukem
Compiler: /opt/SUNWspro/bin/cc
Environment: SunOS wombat.cs.rmit.edu.au 5.6 Generic_105181-12 sun4u sparc SUNW,Ultra-2
Date: Fri Apr 16 16:18:37 EST 1999
Depends upon: blt 8.0-unoff
Depends upon: readline 4.0
Depends upon: tcl 8.0.5
Depends upon: tk 8.0.5
Depends upon: x11r6.3
Depends upon: zlib 1.1.3
Notes:
    * created Modules/Setup.local to contain modules we use, based on commented-out entries in Modules/Setup.in that we want.
    * configured, compiled and installed with
        env CC="cc -mt" ./configure --with-threads
        make
        make test
        make install
Currently this is just a flat text file, maintained by hand.
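Because the file is flat text with one "Field: value" pair per line, simple tools can query it. The following is only a sketch, not part of my own setup: the function name outdated is hypothetical, and it assumes that each entry starts with a Package: line, that Version: precedes Current:, and that the values contain no whitespace.

```shell
# outdated FILE
# List packages in a 3RDPARTY-style file whose installed "Version:"
# differs from the latest known "Current:" version.
# (Hypothetical helper; assumes the field order shown in the example entry.)
outdated() {
    awk '
        /^Package:/ { pkg = $2; ver = ""; next }   # start of a new entry
        /^Version:/ { ver = $2; next }             # version that is installed
        /^Current:/ { if (pkg != "" && $2 != ver) print pkg, ver, "->", $2 }
    ' "$1"
}
```

Running outdated 3RDPARTY would then print one line per package for which a newer version is known but not yet installed.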
Hints:
Keep the Current: field up to date even if the upgrade is not performed at that time. Amongst other things, it helps you find software to upgrade on those rainy days when you have got some time to kill... :-)
% ./configure
% make
% make install
Possible future enhancements include:
Whether your site installs programs in /usr/local, /opt/local, or /opt/gnu, there should be sufficient general documentation on where third-party software is to be installed so that installation is consistent, especially when there are multiple system administrators responsible for software installation.
It is also sensible to document why the policies were contravened if it was necessary to do so (as it is in some cases).
To import a new package, we require a fresh copy of the source. Suggested locations to look for software are described below in Finding third-party source.
In this example we will use the Internet Software Consortium's DHCP server - dhcp 2.0b1 pl18.
Run cvs import to import the code:
% cvs import -m 'ISC dhcp 2.0b1pl18' net/dhcp ISC dhcp-2-0-b1-pl18
N net/dhcp/CHANGES
N net/dhcp/Makefile.conf
...
N net/dhcp/server/dhcpd.leases.cat5
No conflicts created by this import
If -m '...' is not supplied, an editor will be invoked and you will be prompted for a commit message.
The "N" at the start of each line is the status character; "N" means new file. Refer to the CVS info documentation [4] node "import output" for the meaning of other status characters.
The package is now in the repository.
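CVS tag names must start with a letter and may contain only letters, digits, "-" and "_", which is why the release tags in this paper use hyphens instead of dots. As a rough sketch, a release tag can be derived mechanically from the package name and version. The function name release_tag is hypothetical, and it only substitutes invalid characters; it does not insert the extra hyphens I added by hand in dhcp-2-0-b1-pl18.

```shell
# release_tag NAME VERSION
# Build a CVS release tag by joining NAME and VERSION with a hyphen and
# replacing every character that is not valid in a tag with a hyphen.
# (Hypothetical helper; NAME is assumed to start with a letter.)
release_tag() {
    printf '%s-%s\n' "$1" "$2" | sed 's/[^A-Za-z0-9_-]/-/g'
}
```

For example, release_tag python 1.5.2 yields python-1-5-2, the release tag recorded in the 3RDPARTY entry for python.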
Depending upon how the CVSROOT/loginfo file has been configured, you might receive an email log message similar to:
Update of /src/cvsroot/net/dhcp
In directory wombat.cs.rmit.edu.au:/tmp/dhcp-2.0b1pl18
Log Message:
ISC dhcp 2.0b1pl18
Status:
Vendor Tag: ISC
Release Tags: dhcp-2-0-b1-pl18
N net/dhcp/CHANGES
N net/dhcp/Makefile.conf
...
N net/dhcp/server/dhcpd.leases.cat5
No conflicts created by this import
% cd ~/some-scratch-directory
% cvs checkout net/dhcp
% cd net/dhcp
For this example, I needed to change some of the paths in Makefile.conf and includes/site.h to reflect our site's policy on where we install configuration and run-time files.
After modifying the files, we can see which files changed from the distribution by running:
% cvs update
M Makefile.conf
M includes/site.h
The actual changes can be shown with cvs diff, which will display the diff output. I prefer cvs diff -up, which outputs a unified diff ("-u"), with the C function the change is in preceding each block ("-p").
Before compilation we can commit our changes back to the repository using:
% cvs commit -m 'update for local paths'
Checking in Makefile.conf;
/src/cvsroot/net/dhcp/Makefile.conf,v <-- Makefile.conf
new revision: 1.2; previous revision: 1.1
done
Checking in includes/site.h;
/src/cvsroot/net/dhcp/includes/site.h,v <-- site.h
new revision: 1.2; previous revision: 1.1
done
Configure the program with:
% ./configure sunos5-cc
We can see what ./configure changed by running cvs update again:
% cvs update
? Makefile
? client/Makefile
? common/Makefile
? relay/Makefile
? server/Makefile
The "?" means that CVS does not know about the file.
% make
% make install
To make use of the cvs history functionality (q.v.), indicate to CVS that you do not need the working copy anymore:
% cvs release -d net/dhcp
? Makefile
? client/Makefile
? common/Makefile
? relay/Makefile
? server/Makefile
You have [0] altered files in this repository.
Are you sure you want to release (and delete) directory `net/dhcp': y
It is rare for third-party software to remain unchanged forever; updates become available, sometimes on a regular basis.
An important part of managing third-party software is upgrading to a newer version in a sane way.
The process to import an updated version of a package is similar to importing the original package. However, because changes in the newer version may conflict with any local changes that you may have made to the product, there are a couple of steps that are different.
In this example, we'll be upgrading dhcp 2.0b1 from pl18 to pl27.
% cvs import -m 'ISC dhcp 2.0b1pl27' net/dhcp ISC dhcp-2-0-b1-pl27
U net/dhcp/CHANGES
C net/dhcp/Makefile.conf
...
U net/dhcp/server/dhcpd.leases.cat5
1 conflicts created by this import.
Use the following command to help the merge:
cvs checkout -jISC:yesterday -jISC net/dhcp
As you can see, there was a conflict on the import (Makefile.conf; the file with the status character of "C"). This is CVS's way of indicating that those files have been modified from the previous release of the vendor code.
The recommended command to "help the merge", cvs checkout -jISC:yesterday -jISC net/dhcp, should NOT be performed, for the following reasons:
It ends up creating more work than necessary.
Instead of that checkout command, you should use a checkout of the form:
cvs checkout -j old-release-tag -j new-release-tag module
where old-release-tag is the previous release tag, and new-release-tag is the one just imported.
For this example the cvs command is:
% cvs checkout -j dhcp-2-0-b1-pl18 -j dhcp-2-0-b1-pl27 net/dhcp
U net/dhcp/CHANGES
U net/dhcp/Makefile.conf
RCS file: /src/cvsroot/net/dhcp/Makefile.conf,v
retrieving revision 1.1.1.1
retrieving revision 1.1.1.2
Merging differences between 1.1.1.1 and 1.1.1.2 into Makefile.conf
...
U net/dhcp/server/dhcpd.leases.cat5
The advantages of using checkout -j old-release-tag -j new-release-tag include:
Remember: always use cvs checkout -j old-release-tag -j new-release-tag when checking out code for the first time after you have updated a vendor release. It saves a lot of time.
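To make the pairing of the two commands explicit, here is a small dry-run sketch that only prints the import and merge-checkout commands for an upgrade, so they can be reviewed before being run by hand. The function name upgrade_cmds and its argument order are hypothetical. The import is meant to be run from the top of the newly unpacked source, and the checkout from a scratch directory.

```shell
# upgrade_cmds MODULE VENDOR_TAG OLD_RELEASE_TAG NEW_RELEASE_TAG MESSAGE
# Print (do not run) the commands used to bring in a new vendor release.
# (Hypothetical helper for illustration.)
upgrade_cmds() {
    # Import the new release onto the vendor branch.
    printf "cvs import -m '%s' %s %s %s\n" "$5" "$1" "$2" "$4"
    # Check out, merging the vendor's changes between the two releases.
    printf 'cvs checkout -j %s -j %s %s\n' "$3" "$4" "$1"
}
```

For the upgrade in this example, upgrade_cmds net/dhcp ISC dhcp-2-0-b1-pl18 dhcp-2-0-b1-pl27 'ISC dhcp 2.0b1pl27' prints the same import and checkout commands shown above.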
cvs update will show any files modified, added, or removed by the vendor between releases.
% cvs update
M Makefile.conf
Note that even though includes/site.h was locally modified in the previous release, it does not show up as modified here. This is because it was not actually modified by the vendor between the two vendor releases.
cvs diff will display any changes between the last release and this code; changes either from local updates or from the new release. The diff output will also display our changes to includes/site.h.
It is probably more useful to use cvs diff -r VENDOR to display changes between the most recent vendor release and this code:
% cvs diff -r ISC
Index: Makefile.conf
=======================================================
RCS file: /src/cvsroot/net/dhcp/Makefile.conf,v
retrieving revision 1.1.1.2
diff -r1.1.1.2 Makefile.conf
133a134,141
> ## RMITCS overrides
> #BINDIR = /usr/local/sbin
> #CLIENTBINDIR = /usr/local/sbin
> #ADMMANDIR = /usr/local/man/cat1m
> #FFMANDIR = /usr/local/man/cat4
> #VARRUN = /var/run
> #VARDB = /var/run/dhcp
> #SCRIPT=none
Index: includes/site.h
=======================================================
RCS file: /src/cvsroot/net/dhcp/includes/site.h,v
retrieving revision 1.1.1.1
retrieving revision 1.2
diff -r1.1.1.1 -r1.2
39a40
> #define _PATH_DHCPD_PID "/var/run/dhcpd.pid"
45a47
> #define _PATH_DHCPD_DB "/var/run/dhcpd.leases"
50a53
> #define _PATH_DHCPD_CONF "/usr/local/etc/dhcpd.conf"
Peruse the output of the diffs and determine if things look reasonable. In this case I am fairly happy with the changes.
Refer to the CVS info documentation [4] node "Conflicts example" for more information on resolving conflicts.
In Tips and tricks below I will discuss techniques for minimising conflicts when you make local changes.
% cvs commit -m 'merge new version'
Checking in Makefile.conf;
/src/cvsroot/net/dhcp/Makefile.conf,v <-- Makefile.conf
new revision: 1.3; previous revision: 1.2
done
% ./configure sunos5-cc
% make
% make install
CVS has other operations which are useful to know. Whilst these may be more relevant to a software developer, they are still useful for our purposes:
cvs annotate
    Display each line of a file prefixed with the revision, author, and date of the change that last modified it:
% cvs annotate Makefile.conf
...
1.1 (lukem 25-Mar-99): #VARDB = /etc
1.1 (lukem 25-Mar-99): #SCRIPT=solaris
1.2 (lukem 26-Mar-99): ## RMITCS overrides
1.2 (lukem 26-Mar-99): #BINDIR = /usr/local/sbin
...
cvs tag symbolic_tag
    Apply symbolic_tag to the nearest repository revisions of the files checked out in the current directory. This is useful to checkpoint software for future reference and/or comparison.
cvs rtag symbolic_tag module
    Similar to cvs tag, except that it performs the operation directly on the repository copy of module (i.e., module does not have to be checked out).
cvs history
    CVS keeps a history of certain operations, including checkout, commit, release, rtag, and update. cvs history displays this history. By default, it shows the modules you currently have checked out. If cvs release is used after you have finished with each checked-out module, then cvs history is a useful indicator of what is still checked out. Some people do not bother releasing modules, so the history may not be that useful.
CVS is a powerful tool and it has useful functionality that is often only hinted at in the documentation. This section covers some of that functionality, and other tips and tricks that I felt might be relevant.
In my opinion, this is one of the most useful underdocumented operations in CVS, especially for the purpose of maintaining third-party packages (or anything that uses vendor branches).
As shown in the example in Updating an existing package, this operation checks out a copy of module and merges any changes made by the vendor between release tags old-vendor-release and new-vendor-release. This includes files that were added or removed between releases.
It may be useful to tag a package at a known working state after successful installation or prior to the import of a new version. This can simplify determining the local changes made to the previous vendor release.
The sequence of operations would be something like:
% cvs rtag dhcp-local net/dhcp
% cvs import -m 'dhcp 2.0b1pl28' net/dhcp ISC dhcp-2-0-b1-pl28
% cvs checkout -j dhcp-2-0-b1-pl27 -j dhcp-2-0-b1-pl28 net/dhcp
% cvs diff -r dhcp-2-0-b1-pl27 -r dhcp-local > diffs.last
% cvs diff -r ISC > diffs.now
Compare diffs.last and diffs.now to see if the changes look similar.
% cvs rtag dhcp-local net/dhcp
It is possible to modify code to minimise conflicts when the vendor changes the same sections of the code. Rather than changing a line, provide an override line that has the same effect.
For example, if a Makefile contains:
CC=gcc
CFLAGS=-Wall -Werror -g
and cc -O is the preferred compiler, change this to:
CC=gcc
CFLAGS=-Wall -Werror -g
CC=cc
CFLAGS=-O
Similarly, if a header file contains:
/* #define _PATH_DHCPD_DB "/etc/dhcpd.leases" */
change this to:
/* #define _PATH_DHCPD_DB "/etc/dhcpd.leases" */
#define _PATH_DHCPD_DB "/var/run/dhcpd.leases"
It may be necessary to add #undef item before a local #define item value, in case item was defined elsewhere.
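One way to apply the override approach consistently is to append the local settings as a marked block at the end of the vendor file, leaving every vendor line untouched. This works for makefiles because a later assignment to the same variable replaces the earlier one. The sketch below is illustrative only: the function name add_overrides and the marker comment are hypothetical, and the settings are the ones from the Makefile example above.

```shell
# add_overrides FILE
# Append a marked block of local settings to a makefile fragment,
# leaving the vendor's lines untouched. Safe to run more than once.
# (Hypothetical helper; marker text and settings are examples.)
add_overrides() {
    # Do nothing if the override block is already present.
    grep -q '^## local overrides$' "$1" && return 0
    cat >> "$1" <<'EOF'
## local overrides
CC=cc
CFLAGS=-O
EOF
}
```

Because the local lines are pure additions, cvs diff against the vendor tag shows them as a single appended block, much like the RMITCS overrides in the Makefile.conf diff earlier.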
Certain packages are distributed with binary files that must be stored unmodified in the repository and be checked out unmodified upon a check-out.
The simplest way to support a package with some binary files is:
% cvs import -m '...' module vendor release
% cvs import -kb -m '...' module vendor release
(the "-kb" is the important part).
CVS ignores symbolic links upon import (the files have a status of "L").
A workaround is to add a local script to module which is run manually after checkout to regenerate the symlinks. I call this fixlinks, and generate it with:
% find . -type l -print | perl -e 'while (<>) { chomp ; \
print "ln -s ", readlink($_), " $_\n"; } ' > fixlinks
An example fixlinks (from our net/netatalk package) is:
ln -s ../sys/netatalk ./include/netatalk
ln -s ../codepage.h ./etc/afpd/nls/codepage.h
ln -s kpatch-4.2 ./sys/ultrix/kpatch-4.4
ln -s kpatch-4.2 ./sys/ultrix/kpatch-4.3
ln -s ../../ultrix/sys/cdefs.h ./sys/solaris/sys/cdefs.h
Upon checkout, regenerate the links with:
% sh ./fixlinks
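If perl is not available, the same file can be generated with plain shell. This is a sketch rather than the script I use: the function name gen_fixlinks is hypothetical, it assumes the system provides a readlink command, and it assumes that neither link names nor link targets contain whitespace or quote characters.

```shell
# gen_fixlinks
# Print an "ln -s target link" line for every symbolic link below the
# current directory, in sorted order.
# (Hypothetical helper; assumes a readlink command is available.)
gen_fixlinks() {
    find . -type l -print | sort | while IFS= read -r link; do
        printf 'ln -s %s %s\n' "$(readlink "$link")" "$link"
    done
}
```

Run gen_fixlinks > fixlinks at the top of the unpacked source before importing, so that fixlinks is imported along with the package.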
The CVSROOT/modules file provides a mapping from a module alias to a directory or directories. I find it useful to have shortcuts to the packages in the repository. For example, this allows me to run cvs checkout dhcp instead of cvs checkout net/dhcp.
I wrote a script a few years ago called genmodules [7] which parses CVSROOT/commitlog and updates CVSROOT/modules as necessary.
If you are working with multiple repositories (e.g., a local repository and that of an open source software project), it may help to have shell aliases which do the right thing.
For example, in my .cshrc I have:
alias ncvs 'env CVSROOT=cvs.netbsd.org:/cvsroot cvs'
alias cscvs 'env CVSROOT=wombat:/src/cvsroot cvs'
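For users of Bourne-style shells, roughly equivalent shortcuts can be written as shell functions. This is a sketch of one way to do it, reusing the two repository locations from my .cshrc aliases above; "$@" passes all arguments through to cvs unchanged.

```shell
# Bourne-shell counterparts of the csh aliases: run cvs with CVSROOT
# set for that one invocation only.
ncvs()  { env CVSROOT=cvs.netbsd.org:/cvsroot cvs "$@"; }
cscvs() { env CVSROOT=wombat:/src/cvsroot cvs "$@"; }
```

With these defined in the shell's startup file, cscvs update -dP runs cvs update -dP against the local repository regardless of the current value of $CVSROOT.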
If you are always invoking a CVS command with the same set of options, you can simplify your typing by adding a relevant entry in $HOME/.cvsrc. For example, I have a line of the form:
update -dP
which means that cvs update ... is run as cvs update -dP ... ("-d": build directories, "-P": prune empty directories).
It is highly recommended to have a sensible organisation for the third-party packages in the repository. For example, we have the following directories; it should be trivial to infer the contents of the directories:
% ls $CVSROOT
CVSROOT    devel     lang  rmit      www
ai         docs      mail  security  x11
archivers  file      net   text
audio      graphics  news  utils
An element of managing third-party source is finding the source in the first place :-) A few good places to look include:
I believe that CVS is well suited to the task of managing third-party software. I have been using it in this role for over five years for three different employers. For over three years I have also been using CVS as one of the many distributed developers of the NetBSD project [8].
Before I started my current position at RMIT Computer Science nearly two years ago, /usr/local/src was a two gigabyte directory with a haphazard structure (I hesitate to call it "organisation"). In many cases there were multiple copies of the same product, without any obvious indication of which was the currently installed version (in the case of the Columbia Appletalk Package there were six different versions of the source code). As we rebuilt systems (including rebuilding /usr/local from scratch on new machines), we kept all new products in the CVS repository.
Hand in hand with CVS is the maintenance of relevant documentation. Without documentation a CVS tree is almost as bad as the proverbial unorganised /usr/local/src. NetBSD provided the inspiration for the 3RDPARTY file, although I have expanded upon it since then.
Thanks to Matt Green for teaching me the checkout -j old -j new module trick (amongst others); in my opinion it is one of the most useful commands to know when maintaining third-party source.
Giles Lean's assistance by reviewing the paper and providing a wealth of feedback was much appreciated.
[1] Cyclic CVS site, http://www.cyclic.com/cvs/info.html
[2] CVS Overview, http://www.cyclic.com/cyclic-pages/overview.html
[3] Introduction to CVS, http://www.cyclic.com/cvs/doc-blandy-text.html
[4] The CVS info documentation. This should be installed as part of the CVS installation.
[5] CVS Reference Manual, http://www.loria.fr/~molli/cvs/doc/cvs_toc.html
[6] SSH secure shell, http://www.ssh.fi/
[7] genmodules, http://www.mewburn.net/luke/src/genmodules
[8] The NetBSD Project, http://www.NetBSD.org/