CHAPTER 8
The C preprocessor
Reference: Kelly & Pohl, Chapter 8
Additional reference
Peter A. Darnell, Philip E. Margolis, Software Engineering in C, Springer-Verlag,
New York.
The compiling process -- overview
% cc -c foo.c
foo.c -- C program
    |  cpp -- C preprocessor: handles #-directives; removes comments
foo.E
    |  ccom -- C compiler: compiles the program
    |  C optimizer (optional)
foo.s
    |  as -- assembler
foo.o
Copyright (c) 1999 by Robert C. Carden IV, Ph.D.
3/9/2016
Compiling process -- translation phases (ANSI)
1. Physical source file characters are mapped to the source character set
(including new-line characters and end-of-file indicators) if necessary.
Trigraph sequences are replaced by corresponding single-character internal
representations.
2. Each instance of a new-line character and an immediately preceding backslash
character (\) is deleted, splicing physical source lines to form logical source
lines.
3. The source file is decomposed into preprocessing tokens and sequences of
white-space characters (including comments). A source file shall not end in a
partial preprocessing token or comment. Each comment is replaced by one
space character. New-line characters are retained.
4. Preprocessing directives are executed and macro invocations are expanded. A
#include preprocessing directive causes the named header or source file to
be processed from phase 1 through phase 4 recursively.
5. Each source character set member and escape sequence in character constants
and string literals is converted to a member of the execution character set.
6. Adjacent character string literal tokens are concatenated and adjacent wide
string literal tokens are concatenated.
7. White-space characters separating tokens are no longer significant. Each
preprocessing token is converted into a token. The resulting tokens are
syntactically and semantically analyzed and translated.
8. All external object and function references are resolved. Library components
are linked to satisfy external references to functions and objects not defined in
the current translation. All such translation output is collected into a program
image which contains information needed for execution in its execution
environment.
Compiling process (2)
printf("Eh??/
???/n");
Rule 1 -- Replace trigraphs:
??/ --> \
printf("Eh\
?\n");
Rule 2 -- splice lines
printf("Eh?\n");
Rules 3 and 4 have no effect
Rule 5 -- convert \n to newline char
Rule 6 -- no effect
Rule 7 -- convert into tokens
printf      identifier
(
"Eh?\n"     string literal
)
;
Compile token stream
Definition of a preprocessing directive

Any source line in a source file whose first non-blank character is # is called
a preprocessing directive (ANSI C allows white space before the #; many older
compilers require the # to be in column 1)
File inclusion

Any source line of the form
#include "filename"
or
#include <filename>
is replaced by the contents of the file filename

If filename is quoted (first form), searching typically begins in the directory
where the source program is located
If it is not found there, or if the name is enclosed in chevrons (< and >),
searching follows an implementation-defined rule to find the file
If the compiler cannot find the file, the compilation process stops (error)
An included file may itself contain #include lines



Sample C source file
foo.h
#define PI 3.14159
foo.c
#include "foo.h"
void foo()
{
    printf("PI=%g\n", PI);
}
After file inclusion
#define PI 3.14159
void foo()
{
    printf("PI=%g\n", PI);
}
File inclusion (2)
Including files within an include file
foo.h
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define PI 3.14159
foo.c
#include "foo.h"
void foo()
{
    printf("PI=%g\n", PI);
}
After including file foo.h -- still more inclusion needed
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define PI 3.14159
void foo()
{
    printf("PI=%g\n", PI);
}



The line containing the #include "foo.h" directive is replaced with the
contents of that file
From there, the preprocessor reads through the file again, including any
additional required includes
In this example, it must also include the three standard library headers
File inclusion under Unix






When one does a #include <filename.h>, the compiler will look for the
file filename.h where it expects system include files to be located
Under Unix, this is usually in the directory /usr/include
Some compilers will look in the current working directory (.) first followed by
the system include directory
With gcc (GNU's C compiler), it looks first in
/.../gnu/lib/gcc-include
followed by
/usr/include
The GNU compiler does not look in the current working directory (.) by default
for system include files
When one does a #include "ace/filename.h", the compiler will look
for the file filename.h in an “ace” subdirectory of each directory it searches
Modifying the default compiler search paths





Most C compilers support the -I option
This allows users to modify the search path for system include files and for
user include files
The syntax of the -I option is as follows: -I <directory>
There may be more than one -I option on the cc command line
These directories are prepended onto the default search path
Example search path modifications
% cc -I/user/john/include foo.c
    #include "foo.h" is searched for as
        ./foo.h
        /user/john/include/foo.h
    #include <foo.h> is searched for as
        /user/john/include/foo.h
        /usr/include/foo.h
% cc -I. -I../include foo.c
    #include "foo.h" is searched for as
        ./foo.h
        ../include/foo.h
    #include <foo.h> is searched for as
        ./foo.h
        ../include/foo.h
        /usr/include/foo.h
File inclusion -- comments




One uses #includes to group common #define statements, extern
declarations, and other shared data between files
Among other things, they are used to declare common functions
They are also used to access definitions for library functions from headers such
as <stdio.h>
- on most systems, such files exist somewhere
  - on a Unix system, they would exist in the directory /usr/include
  - on a personal computer, they would probably exist in an INCLUDE
    directory somewhere within the compiler's environment
- however, strictly speaking, these need not be files
  - this is indeed the case with the Unisys A-Series C compiler
  - the compiler generates the code directly
  - the #include serves more as a compiler directive
  - the net effect is the same
If you use a standard library function, always include its header
Macro substitution

A macro definition has the form
#define name replacement-text
Subsequent occurrences of the token name will be replaced by the
replacement-text
example
A file with the lines:
#define PI 3.14159
void foo()
{
    printf("PI = %g\n", PI);
}
is transformed into the following file after the #defines are processed
void foo()
{
    printf("PI = %g\n", 3.14159);
}
Notice that the PI literal (token) was changed to the replacement text, but not
within the string token
Macro substitution -- illustration
#define PI 3.14159
void foo()
{
    printf("PI = %g\n", PI);
}
Phase 3 defines tokens; the printf statement becomes
printf          identifier
(               left-parenthesis
"PI = %g\n"     string literal
,               comma
PI              identifier
)               right-parenthesis
;               semicolon
Phase 4 replaces preprocessor token PI with 3.14159
void foo()
{
    printf("PI = %g\n", 3.14159);
}
PI identifier (name) replaced by replacement text 3.14159
String literal NOT touched
Macro substitution (2)



The replacement-text is all text to the end of the line
- If the line ends with a backslash (\), then the macro is continued onto the
  next line
- The scope of the macro runs from its #define to the end of the source file
  being compiled (or to an #undef of the same name)
A definition may use any definition that is visible upon its invocation
Macros, however, cannot be recursively defined
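#define PI P+I
#define P 3
#define I .14159
void foo()
{
    printf("PI = %g\n", PI );
}
PI is replaced first...
void foo()
{
    printf("PI = %g\n", P+I );
}
...then the result is rescanned, and P and I are replaced
void foo()
{
    printf("PI = %g\n", 3+.14159 );
}
The preprocessor stops here: it performs no arithmetic
The compiler later evaluates the constant expression 3+.14159 to 3.14159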
Macro substitution (3)


Macros are expanded by literally substituting the macro identifier with the
desired replacement-text
This can lead to problems if the user is not careful
example
#define FOO 100
#define FOOFOO 100*100
void foo (void)
{
    float x, y, z, t, v;
    x = FOO;
    y = FOOFOO;
    z = 100 / FOOFOO;
    t = 100 / y;
    v = 1 / FOO;
    if (z == t) {
        printf("z is equal to t\n");
    } else {
        printf("z is not equal to t\n");
    }
}
What gets assigned to x, y, z, t, and v?
What is the output of function foo?
Potential bug -- ending a macro definition with a semicolon

A common mistake is to place a semicolon at the end of a macro definition
#define SIZE 10;


The problem is that the semicolon becomes part of the replacement string
Thus, the statement
x = SIZE;
expands to
x = 10;;

While that will compile and probably work fine, the following example will not
compile
int y = SIZE, z;

A more pernicious example is where we write
#define GOOD_CONDITION (var == 1);
...
while GOOD_CONDITION
foo();

This expands to
while (var == 1);
foo();
which is definitely not what we wanted
Potential bug -- using = to define a macro



Another common mistake people make when defining macros is to include the
= sign as if one were initializing the variable
Algol programmers should make a special note here
Thus, instead of writing
#define MAX 100
one mistakenly writes



#define MAX = 100
// const int MAX = 100;
The replacement text for MAX becomes "= 100"
This can lead to some extremely obscure bugs
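For instance, suppose we write
for (i = 0; i < MAX; i++) { ... }
this would expand to
for (i = 0; i <= 100; i++) { ... }
Similarly, if we write
for (j = MAX; j > 0; j--) { ... }
this would expand to
for (j == 100; j > 0; j--) { ... }
This last example changes the assignment expression into a relational
expression, thus leaving j uninitialized
With an older preprocessor that pastes raw text (for example when the source
reads i<MAX with no spaces), both examples can compile, but one would expect
very unpredictable results
An ANSI preprocessor substitutes tokens, so the < and the = stay separate and
the compiler reports a syntax error; the message, however, points at the for
statement rather than at the faulty #define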
Parameterized macros


Macros in C may be parameterized
The general syntax for a parameterized macro is as follows
#define identifier(identifier, ..., identifier) replacement-text
Now, along with the macro identifier, actual arguments are expected when
using the macro
Whenever a formal parameter is encountered in the replacement text, it is
replaced by the corresponding argument, just as if the parameter were a macro itself
example
#define SQ(x) ((x)*(x))
void foo (void)
{
    int a = 8, b, c, d;
    b = SQ (a);
    c = SQ (a + b);
    d = SQ (SQ (a));
}
gets expanded to...
void foo (void)
{
    int a = 8, b, c, d;
    b = ((a)*(a));
    c = ((a + b)*(a + b));
    d = ((((a)*(a)))*(((a)*(a))));
}
Our heavy usage of parentheses protects against the macro being expanded inside
an expression in a way that yields an unanticipated order of evaluation
Parameterized macros (2)

Consider rewriting the previous example as follows...
#define SQ(x) x*x
void foo()
{
    int a = 8, b, c, d;
    b = SQ (a);
    c = SQ (a + b);
    d = SQ (SQ (a));
}
This would get expanded to the following...
void foo (void)
{
    int a = 8, b, c, d;
    b = a*a;
    c = a + b*a + b;
    d = a*a*a*a;
}
The assignment to c is wrong because of operator precedence: it computes
a + (b*a) + b rather than (a + b)*(a + b)
Now suppose we define SQ as follows:
#define SQ(x) (x)*(x)
Then, the expression
4 / SQ (2)
expands to
4 / (2)*(2)
which evaluates to (4 / 2) * 2 = 4 rather than the intended 4 / 4 = 1
Parameterized macros (3)

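In a macro definition, spaces are not allowed between the macro name and the
left parenthesis
#define SQ (x) ((x)*(x))
This defines a macro SQ with no parameters whose replacement text is
(x) ((x)*(x))
Calling this macro:
y = SQ (7);
expands to
y = (x) ((x)*(x)) (7);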


Macros are often written to replace function calls by inline code
The following is a macro to find the minimum of two values
#define min(x,y) (((x)<(y))?(x):(y))

After this definition, an expression such as
m = min (u,v);
gets expanded to
m = (((u)<(v))?(u):(v));






Notice that when we define a parameterized macro, we did not specify the type
of its operands
This is because macros do not care about the type of operands
They simply expand; the resulting expression must be compatible
Thus, our macro min may be used to find the minimum of integers or floats or
pointers, or anything for which < is defined
We may also use it as part of another macro definition
If we wish to find the minimum of four values, we can write
#define min4(a,b,c,d) min(min(a,b),min(c,d))
Parameterized macros (4)



Macros are not function calls
The arguments to a macro are evaluated each time they are referenced within
the replacement text
The following source code illustrates a potential pitfall
#define max(x,y) ((x)>(y)?(x):(y))
void foo (void)
{
    int i=1, j=2, x;
    x = max (i++, j++);
}
After the macros have been expanded, this becomes
void foo (void)
{
    int i=1, j=2, x;
    x = ((i++)>(j++)?(i++):(j++));
}






This has the potentially undesirable effect of incrementing (in this case) i once
and j twice
Because the code is substituted inline, the macro argument is evaluated each
time it is encountered
This is similar to an Algol call-by-name parameter
Because of this discrepancy, it is important to know when a particular operation
is a macro as opposed to being an actual function
Documentation will often specify that a given routine may either be
implemented as a macro or function
In this case, one should assume that it is a macro
No type checking is done for macro arguments



One advantage and possible disadvantage of a macro is that no type checking is
done on the macro arguments
This follows from the fact that the arguments are expanded inline
Consider the following example
#define DOUBLE_IT(x) ((x)+(x))

This may seem equivalent to the following function
int double_it (int x) { return x+x; }


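However, calling double_it(2.5) is not the same as calling
DOUBLE_IT(2.5): with the prototype in scope, the function converts its argument
to the int 2 and returns 4, whereas the macro yields 5.0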
One could rewrite this to be
double double_it(double x) { return x+x; }


But this would be much less efficient when calling double_it(2) because
we would be using floating point arithmetic to do integer addition
Using a macro allows us to utilize the most efficient implementation depending
on the context of the operation
ANSI feature -- Using a macro name in its own definition



Most older C compilers do not allow this feature
ANSI realizes that macros cannot be recursive and thus defines what it means
for a macro to reference itself
Consider the following example
#include <math.h>
#define sqrt(x) (((x) < 0) ? sqrt(-(x)) : sqrt(x))



An older compiler would fail because it would try to expand sqrt within the
body of sqrt itself
The ANSI standard specifies that if a macro name appears within its own
definition, it will not be expanded again
Thus, the invocation of
y = sqrt(5);
would expand to
y = (((5) < 0) ? sqrt(-(5)) : sqrt(5));

In this example, the sqrt() function would be called with 5 as its argument
The macros in <stdio.h> and <ctype.h>

The C standard library includes two macros defined in <stdio.h>
- The first macro is used to get a character from standard input
- The second macro is used to write a character to standard output
#define getchar() getc (stdin)
#define putchar(c) putc ((c), stdout)
Using the macro instead of calling the actual functions directly is just as
efficient because the code is substituted in-line
The only time lost (so to say) is the compile time, i.e. it takes a little longer to
compile the file
The header <ctype.h> contains macros to do character tests
These macros should be assumed to take an argument of type int and return a
value of type int
The following table describes these macros
macro          nonzero (true) is returned if...
isalpha(c)     c is a letter
isupper(c)     c is an uppercase letter
islower(c)     c is a lowercase letter
isdigit(c)     c is a digit
isalnum(c)     c is a letter or digit
isxdigit(c)    c is a hexadecimal digit
isspace(c)     c is a white space character
ispunct(c)     c is a punctuation character
isprint(c)     c is a printable character
isgraph(c)     c is printable, but not a space
iscntrl(c)     c is a control character
isascii(c)     c is an ASCII code
The macros in <stdio.h> and <ctype.h> (2)



In some cases, the standard will specify that a given routine may either be a
function or macro
The following table gives functions from <ctype.h> which convert
characters to other formats
The standard specifies that these may either be functions or macros
function or macro    effect
toupper(c)           changes c from lowercase to uppercase
tolower(c)           changes c from uppercase to lowercase
toascii(c)           changes c to ASCII code
In older versions of C, toupper(c) and tolower(c) work only if c is
lowercase or uppercase respectively
The problem in these older versions is that the transformation would be applied
to c even if it were not a lowercase or uppercase letter respectively
To be safe, one should test to see if the letter being converted needs to be
converted before converting it
The following macros do just that...
#define lowercase(c) (isupper(c)? tolower(c):(c))
#define uppercase(c) (islower(c)? toupper(c):(c))
Undefining macros and not calling them

A macro may be deleted by undef-ing it
#ifdef toupper
#undef toupper
#endif /* toupper */




This removes any previous definitions of the macro toupper
It is not erroneous to apply #undef to an unknown identifier
Putting parentheses around an identifier will prevent the parameterized macro
from being invoked
However, putting spaces between the macro identifier (upon invocation) and
the left parenthesis will still cause the macro to be called
#include <ctype.h>
int (tolower)(int c)
{
    return (tolower (c));
}
extern char buffer[];
extern int bufsize;
void foo (void)
{
    int i, c;
    for (i = 0; i < bufsize; i++) {
        c = buffer[i];
        buffer[i] = toascii (c);
    }
    for (i = 0; i < bufsize; i++) {
        buffer[i] = (tolower)(buffer[i]);
    }
}
In the first loop, the macro toascii is invoked even though there is a space
between the macro identifier and its argument list
- this follows from the discussion on page 230 of K&R, 2nd edition
- the discussion in A Book on C is wrong on this point
In the second loop, the function tolower is invoked because the name is
enclosed in parentheses
- if no such function exists, then the program will not link
Advantages of using macros as compared to functions
1. Macros are usually faster than functions since they avoid the function call
overhead
2. The number of macro arguments is checked to match the definition.
   - this is also done for functions that use the ANSI prototyping syntax
   - on older C systems, users may have to use traditional C syntax for functions
   - thus, on older systems, prototypes may not be available
3. No type restriction is placed on arguments so that one macro may serve for
several data types
Disadvantages of using macros in place of functions
1. Macro arguments are re-evaluated at each mention in the macro body
   - this can lead to unexpected behavior if an argument contains side effects
2. Function bodies are compiled once so that multiple calls to the same function
can share the same code without repeating it each time. Macros, on the other
hand, are expanded each time they appear in the program.
   - a program with many large macros may be longer than a program that uses
     functions in place of macros
   - if the macro is included in a header, then if one changes the macro, all files
     including the header must be recompiled to realize the change in the macro;
     if it were a function, only the file implementing the function need be
     recompiled
3. Though macros check the number of arguments, they don't check the argument
types. ANSI function prototypes check both the number of arguments and the
argument types.
4. It is more difficult to debug programs that contain macros because the source
code goes through an additional layer of translation, making the object code
even further removed from the source code.
Conditional compilation



Various constructs exist to control compilation in C
The preprocessor has such directives for conditional compilation
Each conditional begins with a preprocessing directive of the form
#if constant_integer_expression
#ifdef identifier
#ifndef identifier
Each of these provides for the conditional compilation of the code that follows,
up to the matching preprocessing directive
#endif or #else or #elif
The following properties must be true for the intervening code to be compiled
directive                            requirement for code to be compiled
#if constant_integer_expression      constant_integer_expression must be true (nonzero)
#ifdef identifier                    the named identifier must be defined as a macro
#ifndef identifier                   the named identifier must not be defined as a macro
The constant integral expression may not contain a sizeof operator or a cast
- this is because the expression is evaluated by the preprocessor, not the C compiler
It may use the special defined preprocessing operator
Users should note that the defined operator may not exist in older versions
of C
Conditional compilation (2)
#if
    Evaluates a constant integer expression (which may
    not include sizeof, casts, or enum constants)
    If this expression is non-zero, subsequent lines until
    an #endif, #elif, or #else are included
    Otherwise, those lines are converted into whitespace.
    Expression defined(name) in a #if is 1 if name
    has been defined, 0 otherwise
#ifdef name
    Equivalent to #if defined(name)
#ifndef name
    Equivalent to #if !defined(name)
#else
    May be nested within a #if, #ifdef, or #ifndef construct
    Code between it and the #endif is compiled if and
    only if the preceding code was not.
#elif
    Similar to #else but much like the else-if
    construct in C
Application -- prevent multiple inclusion of header files

In C, a lot of problems can occur if a header file is included more than once
within a given source file
bar.h
#include <stdio.h>
#include "foo.h"
...



foo.c
#include <stdio.h>
#include "foo.h"
#include "bar.h"
...
In this example, foo.c includes files foo.h and <stdio.h> twice
This is not necessarily a bad thing as long as the headers are written in such a
way so as to allow for multiple inclusion
We can add several lines to each of our header files...
foo.h
#ifndef FOO_H
#define FOO_H
#include <stdio.h>
...
#endif /* FOO_H */
bar.h
#if !defined (BAR_H)
#define BAR_H
#include <stdio.h>
#include "foo.h"
...
#endif /* BAR_H */

File foo.c includes foo.h followed by bar.h
- when foo.h is being compiled as the result of the first (direct) inclusion
  - FOO_H is not defined
  - the code surrounded by the #ifndef is compiled
  - FOO_H is immediately defined
- when bar.h is being compiled
  - BAR_H is not defined
  - the code surrounded by the #if !defined is compiled
  - BAR_H is immediately defined
- when foo.h is being compiled as the result of being included by bar.h
  - FOO_H is defined
  - the code surrounded by the #ifndef is not compiled
Application -- isolate machine dependent code




Sometimes it is necessary to do things which are inherently machine dependent
Suppose, for example, we have an application which needs to know the page
size of the operating system
In general, one would expect to call an operating system function to return the
page size
However, if no such function exists, it is sufficient to make an educated guess
#if defined (unix)
/* use unix system call getpagesize(2) */
extern int getpagesize (void);
#endif /* unix */
long pagesize (void)
{
#if defined (unix) /* predefined macro */
    return getpagesize ();
#else
# if defined (__ASERIES__)
#  if defined (LONGLIMIT)
    return LONGLIMIT * 6;
#  else
    /* a pretty good guess */
    return 0x1000 * 6;
#  endif /* LONGLIMIT */
# else
    /* heck, I don't know :: good enough */
    return 1024;
# endif /* __ASERIES__ */
#endif /* unix */
}
Another approach to isolating machine dependent code



Another common approach is to assume that the function getpagesize()
exists and that if it doesn't on some system, then it will be implemented
Typically, there will be a file getpagesize.c which implements it
This is good for cases where a program has already been written to use
getpagesize() and we just want to provide it in case a given system
doesn't have such a function
getpagesize.c
#include "allocate.h"
#ifdef NOGETPAGESIZE
int getpagesize (void)
{
#ifdef __ASERIES__
# ifdef LONGLIMIT
    /* LARGE MEMORY MODEL */
    return LONGLIMIT * 6;
# else
    /* an EWAG at an A-Series page size */
    return 0x1000 * 6;
# endif /* LONGLIMIT */
#else
    /* HECK, I DON'T KNOW */
    return 1024;
#endif /* __ASERIES__ */
}
#endif /* NOGETPAGESIZE */
Macros may reference other macros which have not yet been defined
system.h
 1  #define SYSV 100
 2  #define BSD 200
 3  #define MSDOS 300
 4
 5  #ifdef SYSTEM
 6  # if SYSTEM == SYSV
 7  # define SYSTEM_H "sysv.h"
 8  # elif SYSTEM == BSD
 9  # define SYSTEM_H "bsd.h"
10  # elif SYSTEM == MSDOS
11  # define SYSTEM_H "msdos.h"
12  # else
13  # error unknown SYSTEM specified
14  # endif
15  #else
16  # error no SYSTEM specified
17  #endif /* SYSTEM */
18
19  #include SYSTEM_H
foo.c
 1  #ifndef SYSTEM
 2  # if defined (USG)
 3  # define SYSTEM SYSV
 4  # elif defined (unix)
 5  # define SYSTEM BSD
 6  # elif defined (dos)
 7  # define SYSTEM MSDOS
 8  # endif /* USG */
 9  #endif /* SYSTEM */
10  #include "system.h"
% cc foo.c
% cc -DSYSTEM=SYSV foo.c
% cc -DSYSTEM=ASERIES foo.c
Other preprocessing directives
Error generation (ANSI)

The following preprocessor directive is available under ANSI C
#error (token-sequence)?

This causes the compiler to write a diagnostic message that includes the token
sequence
example -- from the previous page (system.h, lines 5 through 17)
 5  #ifdef SYSTEM
 6  # if SYSTEM == SYSV
 7  # define SYSTEM_H "sysv.h"
 8  # elif SYSTEM == BSD
 9  # define SYSTEM_H "bsd.h"
10  # elif SYSTEM == MSDOS
11  # define SYSTEM_H "msdos.h"
12  # else
13  # error unknown SYSTEM specified
14  # endif /* SYSTEM == SYSV */
15  #else
16  # error no SYSTEM specified
17  #endif /* SYSTEM */
% gcc foo.c
In file included from foo.c:10:
system.h:16: #error no SYSTEM specified
% gcc -DSYSTEM=SYSV foo.c
% gcc -DSYSTEM=ASERIES foo.c
In file included from foo.c:10:
system.h:13: #error unknown SYSTEM specified
Other preprocessing directives (2) -- line control

The #line directive may be used to change which file and line number the
compiler thinks it is currently processing
#line constant "filename"
#line constant

This causes the compiler to believe, for error diagnostics, that the line number
(and filename) of the next line is constant (and filename) respectively
parse.y
 99  program
100      : command_list
101      ;
102  command_list
103      : command_list command
104      | command
105      ;
106  command
107      : HALT    { /* forget semicolon */
108                  printf ("Adios!\n")
109                  exit (0);
110                }
Yacc (Bison) then generates a corresponding C program file...
parse.c
654  case 4:
655  #line 107 "parse.y"
656  { /*forget semicolon*/
657      printf("Adios!\n")
658      exit(0);
659  ;
660  break;}

Thus, when we compile parse.c, the error diagnostics will point us to the
appropriate line in parse.y
Other preprocessing directives (3) -- the #pragma directive (ANSI)

The following preprocessor directive is available under ANSI C
#pragma (token-sequence)?




This performs implementation-specific tasks
Each compiler is free to support special names that have implementation-defined
behavior when preceded by a #pragma
For instance, a compiler might support the names NO_SIDE_EFFECTS and
END_NO_SIDE_EFFECTS, which inform the compiler whether it needs to
worry about the side effects for a certain block of statements
Consider the following code fragment
#pragma NO_SIDE_EFFECTS
a = fn (x, 2);
*p = 2;
#pragma END_NO_SIDE_EFFECTS
In this example, the pragma is used to help the compiler generate more efficient
code by telling it that it can do *p = 2 before the call to fn because the call
to fn will not produce any side effects, and likewise the change in *p will not
affect fn
Note that pragmas are compiler dependent and cannot be expected to be
portable
An unknown pragma should at worst generate a warning

Other preprocessing directives (4)
The null directive
A line with simply a # on it has no effect, e.g.
#
Predefined macro names (ANSI)
The ANSI standard defines five macro names that are built into the
preprocessor
Each name begins and ends with two underscore characters
You may not redefine or #undef these macros
Older compilers may support some but probably not all of these macros
macro        expanded value
__LINE__     Expands to the source file line number on which it is
             invoked (int) -- available on most older compilers
__FILE__     Expands to the name of the file in which it is invoked
             (char []) -- available on most older compilers
__TIME__     Expands to the time of program compilation
             (char []) -- ANSI
__DATE__     Expands to the date of program compilation
             (char []) -- ANSI
__STDC__     Expands to the constant 1 if the compiler conforms to the
             ANSI standard
Using predefined macros



The __LINE__ and __FILE__ macros are valuable for diagnosing programs
We can implement a macro that compares two expressions and if they are not
equal, will print out a diagnostic
We first implement the function which prints out the diagnostic
void fail (int a, int b, char p[], int line)
{
    printf ("Check failed in file %s at line %d: ",
            p, line);
    printf ("received %d, expected %d\n",
            a, b);
}

Then, in a common header, we could define
#define CHECK(a, b) \
    if ((a) != (b)) \
        fail (a, b, __FILE__, __LINE__)

Then, anywhere within the program we could check to see if, for instance, a
variable x equals 0 by including the following diagnostic:
CHECK (x, 0);
Using predefined macros (2)


Similarly, we can use the __DATE__ and __TIME__ macros for recording the
time and date that the file was last compiled
The following procedure will print out the date and time when that file was last
compiled
void print_version (void)
{
printf ("This file last compiled on ");
printf ("%s at %s\n", __DATE__, __TIME__);
}





The __STDC__ macro, if it expands to 1, signifies that the compiler conforms
to the ANSI standard
If it expands to any other value, or if it is not defined, one should assume that
the compiler does not conform to the ANSI standard
In general, the existence of __STDC__ implies that the compiler can handle
function prototypes
However, when __STDC__ is defined with a value other than 1, certain ANSI
headers, such as <stdlib.h> and <stdarg.h>, may not exist
Those headers should exist if __STDC__ expands to 1
#ifdef __STDC__
/* compiler understands prototypes */
# if __STDC__ == 1
/* system is ANSI compliant */
# else
/* system is almost ANSI compliant */
# endif /* __STDC__ == 1 */
#else
/* traditional C compiler */
#endif /* __STDC__ */
“Stringification” -- ANSI feature




One of the limitations of the preprocessor described in the first edition of K&R
is that there is no way to treat a series of characters as both a string and an
expression
However, with an ANSI conforming compiler, one can produce this behavior
by using the preprocessor operator #
This forces the preprocessor to surround the next replacement argument with
double quotes
The preprocessor operator # may be applied only to formal parameters of
macros
#define message_for(a, b) \
    printf (#a " and " #b \
            ": please report to the front desk\n")
main()
{
    message_for (Wilma, Fred Flintstone);
}
The output of the program is
Wilma and Fred Flintstone: please report to the front desk
The statement message_for (Wilma, Fred Flintstone); expands to
printf ("Wilma" " and " "Fred Flintstone"
        ": please report to the front desk\n");
Then, adjacent string literals are concatenated into one big string literal
The assert macro from <assert.h>



A useful macro available on almost all C installations, old and new, is the
assert macro.
This macro is defined in the header <assert.h>.
It allows a user to test whether an expression is true:
- if it is true, nothing happens
- if it is false, an error diagnostic is printed and the program is
terminated
- conceptually, it says to assert that this condition is true
foo.c
#include <stdio.h>
#include <assert.h>

double compute_mean (double a[], int n)
{
    double total;
    int i;

    assert ("Bad size" && n > 0);    /* line 9 of foo.c */
    for (i = 0, total = 0; i < n; i++)
        total += a[i];

    return total / n;
}
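If compute_mean() were called with n equal to 0 or negative, the assertion
would fail and the program would terminate with a run-time diagnostic such as
the following.
The string literal "Bad size" is always nonzero, so it does not affect the test;
it serves only to label the message under ANSI C.
Traditional C
Failed assertion at line 9 of `foo.c'
ANSI C
Failed assertion `"Bad size" && n > 0' at line 9 of `foo.c'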
The assert macro from <assert.h> (2)



The assert macro is enabled by default.
Users may turn it off by defining the macro NDEBUG.
This may be done explicitly in the source, before including the <assert.h> header:
#define NDEBUG

It may also be done by defining it on the cc command line, e.g.:
% cc -DNDEBUG -c foo.c

In fact, any macro may be defined on the cc command line using the -D
command line option:
-D<macro>
-D<macro>=<replacement-text>

When we define a macro on the compiler command line, it gets defined before
any source code is read.

If NDEBUG is defined, then the assert macros do nothing; the tested
expression is not even evaluated.
This is useful if one is writing production code.
The assert macros will be enabled during the test and debug phase, but can
then be disabled when the code goes into production.


The assert macro from <assert.h> (3)
Implementation in traditional C
assert.h -- traditional C implementation
#ifdef NDEBUG
#define assert(expr)
#else
#define __assert {\
fprintf(stderr, "Failed assertion at ");\
fprintf(stderr, "line %d of `%s'\n",\
__LINE__, __FILE__);\
abort();\
}
#define assert(expr)\
if ((expr) == 0) __assert
#endif /*NDEBUG*/
The assert macro from <assert.h> (4)
Implementation in ANSI C
assert.h -- ANSI C implementation
#ifdef NDEBUG
#define assert(expr)
#else
# ifndef __STDC__
# define __assert {\
fprintf(stderr, "Failed assertion at ");\
fprintf(stderr, "line %d of `%s'\n",\
__LINE__, __FILE__);\
abort();\
}
# define assert(expr)\
if ((expr) == 0) __assert
# else
# define __assert(expression_string) {\
fprintf(stderr, "Failed assertion `%s'",\
expression_string);\
fprintf(stderr, " at line %d of `%s'\n",\
__LINE__, __FILE__);\
abort();\
}
# define assert(expr)\
if ((expr) == 0) __assert(#expr)
# endif
#endif /*NDEBUG*/
Token pasting -- the ## operator (ANSI)
The ANSI standard defines a new preprocessor operator ## that pastes two
lexical tokens together to form a single token.
Consider the following example:
#define mask_0  0x1
#define mask_1  0x2
#define mask_2  0x4
#define mask_3  0x8
#define shift_0 0
#define shift_1 1
#define shift_2 2
#define shift_3 3
#define BIT_nn(x,n)   ((x) & mask_ ## n)
#define SET_nn(x,n)   ((x) | 0x1 << shift_ ## n)
#define RESET_nn(x,n) ((x) & ~ mask_ ## n)
void foo (unsigned int x)
{
x = SET_nn (x, 1);
x = RESET_nn (x, 2);
if (BIT_nn (x, 3)) printf ("foo\n");
}
The function foo() expands as follows:
void foo (unsigned int x)
{
x = ((x) | 0x1 << shift_1);
x = ((x) & ~ mask_2);
if (((x) & mask_3)) printf("foo\n");
}
The preprocessor then expands this by expanding the resultant macros:
void foo(unsigned int x)
{
x = ((x) | 0x1 << 1);
x = ((x) & ~ 0x4);
if (((x) & 0x8)) printf("foo\n");
}
Token pasting -- the ## operator (2)

Consider the following macro:
#define FILENAME(extension) test_ ## extension

The code fragment
void foo (void)
{
FILE *FILENAME(1), *FILENAME(2);
FILENAME(1) = fopen ("test.1", "r");
FILENAME(2) = fopen( "test.2", "w");
}
expands to
void foo (void)
{
FILE *test_1, *test_2;
test_1 = fopen ("test.1", "r");
test_2 = fopen ("test.2", "w");
}

Notice that the paste operator produces new lexical tokens, here the
identifiers test_1 and test_2.
Token pasting -- the ## operator (ANSI) -- (3)

A more interesting example is one which includes a particular version of some
header file
#ifndef VERSION
#define VERSION 3
#endif /* VERSION */
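#define str(s)      #s
#define xstr(s)     str(s)
#define paste(a, b) a ## b
#define FILENAME(n) paste(db, n)
#include xstr (FILENAME(VERSION).h)
An argument that is an operand of # or ## is not macro-expanded before the
operator is applied, so each operator is wrapped in a helper macro (named xstr
and paste here) that lets VERSION expand first.
The argument of xstr is fully expanded before it is substituted:
FILENAME(VERSION).h
becomes
paste(db, 3).h
and then
db3.h
so the #include statement expands to
#include str (db3.h)
which finally expands to
#include "db3.h"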
We could then determine which file we include on the cc command line:
% cc -DVERSION=4 foo.c
This would force "db4.h" to be included instead of "db3.h"
Conditional function prototypes



If one writes code that must be compiled under both an ANSI and a traditional C
compiler, then one must use the traditional C approach for function definitions.
However, it is possible to declare the functions either with or without a
prototype, depending on the type of compiler.
Consider the following macro:
#ifndef P
# if __STDC__
# define P(s) s
# else
# define P(s) ()
# endif /* __STDC__ */
#endif /* P */

One can then declare a function:
int foo P((int c, double d, char *s));
and implement it
int foo (c, d, s)
char c;
double d;
char *s;
{
...
}

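The function declaration expands to
int foo ();
under a traditional compiler, or to
int foo (int c, double d, char *s);
if __STDC__ is defined to be nonzero.

Note that the macro parameter s always binds to the whole parenthesized list
(int c, double d, char *s), which is why the declaration is written with
double parentheses. The macro expands to that list if __STDC__ is defined to be
nonzero, and to () otherwise.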
Final remarks


Most C compilers predefine additional macros to help developers
tailor their applications to the host environment.
For instance, UNIX systems define the macro unix.
Apollo systems define the macro apollo.
Sun systems typically define the macro sun.

Thus, one often isolates machine-dependent code:

#ifdef sun
/* do sun stuff */
#endif /* sun */
#ifdef apollo
/* do apollo stuff */
#endif /* apollo */
More recently, the Unisys A-Series compiler defines __ASERIES__.
Such macros are usually listed in the compiler documentation.

For example, the GNU C compiler gcc defines the following macros on an Apollo DN3000:
apollo
aegis
unix
mc68020
__apollo__
__aegis__
__unix__