Thomas E. Dickey
$Date: 2004/02/14 01:36:43 $
lint, which was a Good Thing.
I adopted the practice of running lint
before starting to debug, because it often found the bug that I
had just noticed.
That was before ANSI C, of course. Actually, ANSI C was standardized before compilers for it were widely available, so (like other people) I adopted it piecemeal: using prototypes in header files, adding varargs (before converting to stdarg), and converting some functions to ANSI form. Around 1990, having spent about a year developing a system written in Ada, I stopped writing code which did not take advantage of ANSI C's better type-checking.
In the mid-1990s I converted the larger programs I was working on to ANSI C (tin, vile, ncurses). My development focus had switched from SunOS 4 with a K&R compiler to Linux or Solaris with gcc or Sun's compiler. Unlike Sun's compiler, gcc could be told to give lots of warning messages which were useful for finding non-ANSI code. Actually some other compilers are better for this purpose, but they cost money, and generally run on only one or two platforms. (I do use them when they're available.)
XFree86 is larger than the other programs (about 1.3 million lines when I
started, and before some of the contrib programs were added). Initially I
started looking at ANSIfying X when I got tired of filtering compiler warnings
in my day job's legacy code. It seemed possible that I could get XFree86 to
change their code, and that could be leveraged into getting X Consortium to
adopt the changes. I made an initial set of changes in the server code, to
test this, but as it happened, that was the final year for the X Consortium. I
put that plan aside. Later, I converted xterm to ANSI C when it
was clear that the former custodians (and their successors) were not going to
maintain it any longer.
Shortly after, the XFree86 core group changed the compiler warnings used for building to stricter ones which would show problems in the code, as well as non-ANSI stuff. The resulting 8MB logfile gave me some motivation to reduce its size. It was too large to see the pattern, so I wrote a simple utility to filter the logfile and make a list of the files which produced the most warnings.
Here is some sample output from
a build log on Red Hat 6.2.
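The utility itself is not reproduced here. As a rough sketch of the idea only, the following C code counts warnings per source file and lists the worst offenders first. It assumes that warnings look like "file:line: warning: text" (the usual gcc form); the function name tally_warnings and the table limits are hypothetical, not taken from the original tool.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_FILES 4096          /* arbitrary limits for this sketch */
#define MAX_NAME  512

struct tally {
    char name[MAX_NAME];
    unsigned long count;
};

static struct tally table[MAX_FILES];
static size_t used;

size_t tally_warnings(FILE *in, FILE *out);

/* Add one warning to the entry for `name`, creating the entry if needed. */
static void count_warning(const char *name)
{
    size_t i;

    for (i = 0; i < used; i++) {
        if (strcmp(table[i].name, name) == 0) {
            table[i].count++;
            return;
        }
    }
    if (used < MAX_FILES) {
        strncpy(table[used].name, name, MAX_NAME - 1);
        table[used].name[MAX_NAME - 1] = '\0';
        table[used].count = 1;
        used++;
    }
}

/* Order entries by descending count, then by name. */
static int by_count(const void *a, const void *b)
{
    const struct tally *x = a;
    const struct tally *y = b;

    if (x->count != y->count)
        return (x->count < y->count) ? 1 : -1;
    return strcmp(x->name, y->name);
}

/* Read a build log from `in` and write "count file" lines to `out`.
 * Returns the number of distinct files that produced warnings. */
size_t tally_warnings(FILE *in, FILE *out)
{
    char line[4096];
    size_t i;

    used = 0;
    while (fgets(line, sizeof line, in) != NULL) {
        char *colon = strchr(line, ':');

        /* Assumed format: "file:line: warning: text" */
        if (colon == NULL || colon == line
            || strstr(colon, ": warning:") == NULL)
            continue;
        *colon = '\0';          /* keep only the file name */
        count_warning(line);
    }
    qsort(table, used, sizeof table[0], by_count);
    for (i = 0; i < used; i++)
        fprintf(out, "%6lu %s\n", table[i].count, table[i].name);
    return used;
}
```

Calling tally_warnings(stdin, stdout) from a small main() turns this into a filter that can be fed a build log. A linear search is slow for a very large tree, but it keeps the sketch short.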
ANSIfying code reduces maintenance effort, and allows me to work on much larger systems than with K&R code. It also extends the lifetime of existing programs.
Maintainers of libraries which must interface with existing applications should be careful to not alter the nature of the interface. In particular, these are problem areas:
char and short
parameters are treated as if they are first assigned to an int
variable - that is, they take up the same amount of space as an integer.
This is called argument promotion.
Given a prototype, the compiler is free (depends on its design) to
use less space on the stack for those parameters.
If different callers to a function disagree on those sizes,
the program will malfunction.
gcc) will (silently)
change the parameter sizes.
Introducing const into a function's definition changes
the functions with which it can be compiled, even if the compiler does
not change the stack alignment of the parameter list.
Converting from varargs.h to stdarg.h
is claimed by some people to introduce possible incompatibilities,
but I have not found any practical cases which are not due to
changes in argument promotion or using const.
Still, it is something to consider.
cproto and protoize.
The former attempts to preserve argument promotion (but fails).
The latter does not even try.
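Argument promotion, as described in the list above, still applies wherever no prototype covers a parameter, including the variable part of a variadic call. The following is a minimal sketch using stdarg.h; the function name sum_promoted is hypothetical. The caller passes char and short values, but the function has to fetch them as int because that is how they were passed.

```c
#include <stdarg.h>

int sum_promoted(int count, ...);

/* Add up `count` variadic arguments.  Callers may pass char or short
 * values, but the default argument promotions widen them to int before
 * the call, so they must be fetched as int. */
int sum_promoted(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);   /* never va_arg(ap, char) */
    va_end(ap);
    return total;
}
```

A call such as sum_promoted(2, c, s), with c a char and s a short, works because both values arrive as int. A prototype that names char or short explicitly removes that guarantee for the parameters it covers, which is why mixing the two styles for one function is risky.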
The tools that I do use are determined by my goal: convert as much of the code as possible without introducing functional changes. For XFree86 libraries, the goal is stricter: no changes to parameter alignment. In turn, the choice of tools determines the process.
I use the compiler to find the places to change and to ensure that there is minimal impact on the interface. Compiling with gcc without the -g (debug) option produces object files which will differ if an editing change modifies parameter alignment. At the same time, most editing changes that do not modify logic or alignment will not change the object code.
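The comparison step can be done with any byte-for-byte file comparison, such as cmp. Purely as an illustration, and not as a copy of the scripts mentioned in this article, here is a small C function that does the same check. The name same_contents is hypothetical, and it assumes that the object file from before the edit was saved under a different name.

```c
#include <stdio.h>

int same_contents(const char *path_a, const char *path_b);

/* Return 1 if both files can be read and hold identical bytes,
 * 0 if they differ, and -1 if either file cannot be opened. */
int same_contents(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int result = 1;

    if (a == NULL || b == NULL) {
        result = -1;
    } else {
        int ca, cb;

        do {
            ca = getc(a);
            cb = getc(b);
            if (ca != cb)       /* covers a length mismatch as well */
                result = 0;
        } while (result == 1 && ca != EOF);
    }
    if (a != NULL)
        fclose(a);
    if (b != NULL)
        fclose(b);
    return result;
}
```

A result of 0 after an edit that was meant to be purely syntactic is the signal to look at the change again or undo it.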
Shell scripts are useful for automating the compiler checks: Regress and remake. A good text editor is needed to carry out the process of following the compiler warnings, doing recompiles and occasionally undoing a set of changes.
That gives me a starting point. I look at why the warnings come about, which is usually because functions are not prototyped, and decide which file I should try to resolve.
Bear in mind that changes to code which is not ifdef'd for the current platform will not be testable by edit/compile/compare, e.g., the Regress script. A further limitation is that some types may happen to be the same on the current platform, e.g., int/long on a 32-bit machine.
That said, with reasonable care you can convert most of a program to use prototypes without risk of altering the interfaces as used in the K&R original. Make the gcc warnings find the missing prototypes. Gcc will not find all of them; it lacks a warning corresponding to the one in the ANSI compiler on IRIX, which flags functions that are defined with K&R syntax even though they may have a prototype, e.g.,
int foo(void);
int foo() { return 1; }
But the IRIX compiler is not useful for this type of development, because
it embeds line-number information into the object files even when debugging
is disabled.
Hence, deleting a blank line will result in a different object file.
So we use gcc.
Gcc's useful warnings include:
-Wall -Wstrict-prototypes -Wmissing-prototypes -Wshadow -Wconversion
Because of gcc's blind spot (it does not flag functions defined with K&R syntax), I modify the header files last:
static) functions to ANSI form.
extern declarations to header files (choosing the
right one can be a problem, since they are associated with type
definitions).
Some troublesome language features to watch out for:
sizeof(foo). If the length
happens to be negative, then C will treat it as larger than the
size, giving an unexpected result.
const is nice, but do it later, and do not modify the
documented interfaces. Otherwise existing programs simply will not
compile with several compilers.
#define ARGS a1,a2,a3,a4,a5,a6,a7,a8,a9
int foo(ARGS) long ARGS; { ... }
int bar() { foo(1,2,3); }
rather than even use <varargs.h>.
int foo(char a);
int foo(a)
char a;
{ }
The Apollo C compiler did the wrong thing with this: it compiled the prototype (and calls against it) with a stack-alignment for the char parameter, but the function definition with a stack-alignment for the char parameter promoted to an integer. While other contemporary C compilers may have internally done the same thing, on Apollo this malfunctioned because the machine's big-endian byte ordering (the Motorola 68000 series) put the parameter where the function could not get the data. On other architectures such as Intel, this does not happen because of the way the bytes are ordered. Even on SunOS (SPARC), where the bytes are ordered as on the Apollo, the data stored on the stack is apparently aligned to 4-byte integers.
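The sizeof pitfall mentioned earlier in this list is easy to reproduce. In the sketch below (function names are hypothetical) the first function compares a signed length directly against a size_t, as code converted from K&R often does; the second handles the negative case before comparing. It assumes the usual situation where size_t is at least as wide as int.

```c
#include <stddef.h>

int is_below_naive(int length, size_t size);
int is_below_signed(int length, size_t size);

/* Looks right, but `length` is converted to size_t for the comparison,
 * so a negative length becomes a very large unsigned value. */
int is_below_naive(int length, size_t size)
{
    return length < size;
}

/* Treat a negative length as smaller than any size, then compare the
 * remaining (non-negative) values as unsigned. */
int is_below_signed(int length, size_t size)
{
    return length < 0 || (size_t) length < size;
}
```

With a length of -1 and a size of 16, is_below_naive() reports false while is_below_signed() reports true. Gcc can point out comparisons like the first one when -Wsign-compare is enabled.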