This chapter describes the m4 macro preprocessor, a front-end filter that lets you define macros by placing m4 macro definitions at the beginning of your source files. You can use the m4 preprocessor with either program source files or document source files.
Macros ease your programming or writing tasks by allowing you to substitute a simple word or two for a great amount of material. Macro calls in a source file have the following form:
name[(arg1[,arg2])]
For example, suppose you have a C program in which you want to print the same message at several points. You could code a series of printf statements like the following:
printf("\nThese %d files are in %s:\n",cnt,dir);
If you later decide to change the wording as your program evolves, you have to edit each instance of the message. Defining a macro like the following will save you a great deal of work:
define(filmsg,`printf("\nThese %d files are in %s:\n",$1,$2)')
Then, everywhere you want to output this message, you use the macro this way:
filmsg(cnt,dir);
With this implementation, you only need to edit the message in one place.
A macro definition consists of a symbolic name (called a token) and the character string that is to replace it. A token is any string of alphanumeric characters (letters, numbers, and underscores) beginning with a letter or an underscore and delimited by nonalphanumeric characters (punctuation or white space). For example, N12 and N are both tokens, but A+B is not a token. When you process your file through m4, each occurrence of a recognized macro is replaced by its definition.
In addition to replacing symbolic names with text, m4 also can perform the following operations:
Arithmetic calculation
File manipulation
Conditional macro expansion
String and substring functions
System command execution
The m4 program reads each token in the file and determines if the token is a macro name. Macro names that are embedded in other tokens are not recognized; for example, m4 does not interpret N12 as containing an occurrence of the token N. If the token is a macro name, m4 replaces it with its defining text and pushes the resulting string back onto the input to be rescanned.
Macro expansion is thus recursive; macro definitions can include nested occurrences of other macros to any depth of nesting. You can call macros with arguments, in which case the arguments are collected and substituted into the right places in the defining text before the defining text is rescanned.
The m4 preprocessor is a standard UNIX filter. It accepts input from standard input or from a list of input files and writes its output to standard output. The following lines illustrate correct m4 usage:
% grep -v '#include' file1 file2 | m4 > outfile
% m4 file1 file2 | cc
The m4 program processes each argument in order. If there are no arguments, or if an argument is a minus sign (-), m4 reads standard input as its input file.
5.2 Defining Macros
You create a macro definition with the define command, one of more than 30 built-in macros provided by m4 (see Table 5-1). For example:
define(N,100)
The open parenthesis must follow the word define with no intervening space. Given this macro definition, the token N will be replaced by 100 wherever it appears in the file being processed.
The defining text can be any text, except that if the text contains parentheses, the number of open (left) parentheses must match the number of close (right) parentheses unless you protect an unmatched parenthesis by quoting it. See Section 5.2.1 for an explanation of quoting.
Built-in and user-defined macros work the same way except that some of the built-in macros change the state of the process. Refer to Section 5.3 for a list of the built-in macros.
You can define macros in terms of other macros. For example:
define(N,100)
define(M,N)
This example defines both M and N to be 100. If you later change the definition of N and assign it a new value, M retains the value of 100, not the new value you give N. The value of M does not track that of N because the m4 preprocessor expands macro names into their defining text as soon as possible. The overall result, as far as M is concerned, is the same as using the following input in the first place:
define(M,100)
If you want the value of M to track the value of N, you can reverse the order of the definitions, as follows:
define(M,N)
define(N,100)
Now M is defined to be the string N. When the value of M is requested later, the M is replaced by N, which is then rescanned and replaced by whatever value N has at that time.
Macro definitions made with the define command do not delete characters following the close parenthesis. For example:
Now is the time for all good persons.
define(N,100)
Testing N definition.
This example produces the following result:
Now is the time for all good persons.

Testing 100 definition.
The blank line results from the presence of a newline character at the end of the line containing the define macro. The built-in dnl macro deletes all characters that follow it, up to and including the next newline character. Use this macro to delete empty lines. For example:
Now is the time for all good persons.
define(N,100)dnl
Testing N definition.
This example produces the following result:
Now is the time for all good persons.
Testing 100 definition.
5.2.1 Using the Quote Characters
To delay the expansion of a define macro's arguments, enclose them in a matched pair of quote characters. The default quote characters are left and right single quotation marks (` and '), but you can use the built-in changequote macro to specify different characters. (See Section 5.3.) Any text surrounded by quote characters is not expanded immediately, but the quote characters are removed. The value of a quoted string is the string with the quote characters removed. Consider the following example:
define(N,100)
define(M,`N')
The quote characters around the N are removed as the argument is being collected. The result of using quote characters is to define M as the string N, not 100. This example makes the value of M track that of N, and it is thus another way to accomplish the effect of the following definitions, shown in Section 5.2:
define(M,N)
define(N,100)
The general rule is that m4 always strips off one level of quote characters whenever it evaluates something. This is true even outside macros. For example, to make the word define appear in the output, enter the word in quote characters, as follows:
`define' = 1
Because of the way m4 handles quoted strings, you must be careful about nesting macros. For example:
define(dog,canine)
define(cat,animal chased by `dog')
define(mouse,animal chased by cat)
When the definition of cat is processed, dog is not expanded to canine immediately because it is quoted. But when mouse is processed, the definition of cat (animal chased by dog) is used; this time, dog is not quoted, and the definition of mouse becomes animal chased by animal chased by canine.
If the previous example is included in a file named infile, m4 produces the following output (the three empty lines come from the newline characters that follow the define macros):
% cat infile
define(dog,canine)
define(cat,animal chased by `dog')
define(mouse,animal chased by cat)
dog
cat
mouse
% m4 infile



canine
animal chased by canine
animal chased by animal chased by canine
When you redefine an existing macro, you must quote the first argument (the macro name), as follows:
define(N,100)
.
.
.
define(`N',200)
Without the quote characters, the second define macro sees N, recognizes it, and substitutes its value, producing the following result:
define(100,200)
The m4 program ignores this statement because it can define only names, not numbers.
5.2.2 Macro Arguments
The simplest form of macro processing is replacing one string with another (fixed) string as illustrated in the previous sections. However, macros can also have arguments, so that you can use a given macro in different places with different results. To indicate where an argument is to be used within the replacement text for a macro (the second argument of its definition), use the symbol $n to indicate the nth argument. For example, the symbol $1 refers to the first argument of a macro. When the macro is used, m4 replaces the symbol with the value of the indicated argument. For example:
define(bump,$1=$1+1)
.
.
.
bump(x);
In this example, m4 will replace the bump(x) statement with x=x+1.
A macro can have as many arguments as needed. However, you can access only nine arguments by using the $n symbols ($1 through $9). To access arguments past the ninth argument, use the shift macro, which drops the first argument and reassigns the remaining arguments to the $n symbols (second argument to $1, third to $2, and so on). Using the shift macro more than once allows access to all arguments used with the macro.
The symbol $0 returns the name of the macro. Arguments that are not supplied are replaced by null strings, so that you can define a macro that concatenates its arguments as follows:
define(cat,$1$2$3$4$5$6$7$8$9)
.
.
.
cat(x,y,z)
This example replaces the cat(x,y,z) statement with xyz. Arguments $4 through $9 in this example are null because corresponding arguments were not provided.
When scanning a macro, the m4 program discards leading unquoted blanks, tabs, or newline characters in arguments, but keeps all other white space. For example:
define(a, `$1 $2$3')
.
.
.
a(b, c, d)
This example expands the a macro to be b cd.
In the define macro, however, newline characters are meaningful. For example:
define(a,$1
$2$3)
.
.
.
a(b,c,d)
This latter example expands the a macro as follows:
b
cd
Macro arguments are separated by commas. Use parentheses to enclose arguments containing commas, so that the commas are not misinterpreted as ending the arguments containing them. For example, the following statement has only two arguments:
define(a, (b,c))
The first argument is a, and the second is (b,c). To use a single parenthesis in an argument, enclose it in quote characters:
define(a,b`)'c)
In this example, b)c is the second argument.
5.3 Using Other m4 Macros
The m4 program provides a set of macros that are already defined (built-in macros). Table 5-1 lists all of these macros and describes them briefly. The following sections further explain many of the macros and how to use them:
Macro | Description |
changecom(l, r) | Changes the left and right comment characters to the characters represented by l and r. The two characters must be different. |
changequote(l, r) | Changes the left and right quote characters to the characters represented by l and r. The two characters must be different. |
decr(n) | Returns the value of n-1. |
define(name, replacement) | Defines a new macro, named name, with a value of replacement. |
defn(name) | Returns the quoted definition of name. |
divert(n) | Changes the output stream to the temporary file number n. |
divnum | Returns the number of the currently active temporary file. |
dnl | Deletes text up to a newline character. |
dumpdef(`name'[, `name'...]) | Prints the names and current definitions of the named macros. |
errprint(str) | Prints str to the standard error file. |
eval(expr) | Evaluates expr as a 32-bit arithmetic expression. |
ifdef(`name', arg1, arg2) | If macro name is defined, returns arg1; otherwise, returns arg2. |
ifelse(str1, str2, arg1, arg2) | Compares the strings str1 and str2. If they match, ifelse returns the value of arg1; otherwise, it returns the value of arg2. |
include(file), sinclude(file) | Returns the contents of file. The sinclude macro does not report an error if it cannot access the file. |
incr(n) | Returns the value of n+1. |
index(str1, str2) | Returns the character position in string str1 where str2 starts, or -1 if str1 does not contain str2. |
len(str), dlen(str) | Returns the number of characters in str. The dlen macro operates on strings containing 2-byte representations of international characters. |
m4exit(code) | Exits m4 with a return code of code. |
m4wrap(name) | Runs macro name before exiting, after completing all other processing. |
maketemp(strXXXXXstr) | Creates a unique file name by replacing the literal string XXXXX in the argument string with the current process ID. |
popdef(name) | Replaces the current definition of name with the previous definition, saved with the pushdef macro. |
pushdef(name, replacement) | Saves the current definition of name and then defines name to be replacement in the same way as define. |
shift(param_list) | Shifts the parameter list leftward one position, destroying the original first element of the list. |
substr(string, pos, len) | Returns the substring of string that begins at character position pos and is len characters long. |
syscmd(command) | Executes the specified system command with no return value. |
sysval | Gets the return code from the last use of the syscmd macro. |
traceoff(macro_list) | Turns off trace for any macro in the list. If macro_list is null, turns off all tracing. |
traceon(name) | Turns on trace for the named macro. If name is null, turns trace on for all macros. |
translit(string, set1, set2) | Replaces any characters from set1 that appear in string with the corresponding characters from set2. |
undefine(`name') | Removes the definition of the named macro. |
undivert(n, n[, n...]) | Appends the contents of the indicated temporary files to the current temporary file. |
5.3.1 Changing the Comment Characters
To include comments in your m4 programs, delimit the comment lines with the comment characters. The default left comment character is the number sign (#); the default right comment character is the newline character. If these characters are not convenient, use the built-in changecom macro. For example:
changecom({,})
This example makes the left and right braces the new comment characters. To restore the original comment characters, use changecom as follows:
changecom(#, )
Using changecom with no arguments disables commenting.
5.3.2 Changing the Quote Characters
The default quote characters are the left and right single quotation marks (` and '). If these characters are not convenient, change the quote characters with the built-in changequote macro. For example:
changequote([,])
This example makes the left and right brackets the new quote characters. To restore the original quote characters, use changequote without arguments, as follows:
changequote
5.3.3 Removing a Macro Definition
The undefine macro removes macro definitions. For example:
undefine(`N')
This example removes the definition of N. You must quote the name of the macro to be undefined. You can use undefine to remove built-in macros, but once you remove a built-in macro, you cannot recover that macro for later use.
5.3.4 Checking for a Defined Macro
The built-in ifdef macro determines if a macro is currently defined. The ifdef macro accepts three arguments. If the first argument is defined, the value of ifdef is the second argument. If the first argument is not defined, the value of ifdef is the third argument. If there is no third argument, the value of ifdef is null.
5.3.5 Using Integer Arithmetic
The m4 program provides the following built-in functions for doing arithmetic on integers only:
incr | Increments its numeric argument by 1 |
decr | Decrements its numeric argument by 1 |
eval | Evaluates an arithmetic expression |
For example, you can create a variable N1 such that its value will always be one greater than N, as follows:
define(N,100)
define(N1,`incr(N)')
The eval function can evaluate expressions containing the following operators (listed in decreasing order of precedence):
unary + (plus), unary - (minus)
** or ^ (exponentiation)
*, /, % (modulo)
+, -
==, !=, <, <=, >, >=
! (NOT)
& or && (logical AND)
| or || (logical OR)
Use parentheses to group operations where needed. All operands of an expression must be numeric. The numeric value of a true relation such as 1>0 is 1, and false is 0 (zero). The precision in eval is 32 bits.
For example, to define M as 2==N+1, use eval as follows:
define(N,3)
define(M,`eval(2==N+1)')
Use quote characters around the text that defines a macro, unless the text is simple and contains no instances of macro names.
5.3.6 Manipulating Files
To merge a new file in the input, use the built-in include macro as follows:
include(myfile)
This example inserts the contents of myfile in place of the include command. As the included file is read, m4 scans it for macros as if it were part of the primary input.
With the include macro, a fatal error occurs if the named file cannot be accessed. To avoid an error, use the alternative form, sinclude (silent include). The sinclude macro continues without error if the named file cannot be accessed.
5.3.7 Redirecting Output
You can redirect the output of m4 to temporary files during processing, and the collected material can be output upon command. The m4 program can maintain up to nine temporary files, numbered 1 through 9. To redirect output, use the divert macro as in the following example:
divert(4)
When this command is encountered, m4 begins writing its output to the end of temporary file 4. The m4 program discards the output if you redirect the output to a temporary file other than 1 through 9; you can use this feature to make m4 omit a portion of the input file. Use divert(0) or divert with no argument to return the output to the standard output stream.
At the end of its processing, m4 writes all redirected output to the standard output stream, reading from the temporary files in numeric order and then destroying the temporary files.
To retrieve the information from all temporary files in numeric order at any time before processing is completed, use the built-in undivert macro with no arguments. To retrieve selected temporary files in a specified order, use undivert with arguments. When using undivert, m4 discards the temporary files that are recovered and does not search the recovered information for macros. The value of undivert is not the diverted text.
The built-in divnum macro returns the number of the currently active temporary file. If you do not change the output file with the divert macro, m4 puts all output in temporary file 0 (zero).
5.3.8 Using System Programs in a Program
You can run any program in the operating system from an m4 program by using the built-in syscmd macro. If the system command returns information, that information is the value of the syscmd macro; otherwise, the macro's value is null. For example:
syscmd(date)
Use the built-in maketemp macro to make a unique file name from an m4 program. If the literal string XXXXX is present in the macro's argument, m4 replaces the XXXXX with the process ID of the current process. For example:
maketemp(myfileXXXXX)
If the current process ID is 23498, this example returns myfile23498. You can use this string to name a temporary file.
5.3.10 Using Conditional Expressions
The built-in ifelse macro performs conditional testing. The simplest form is the following:
ifelse(a,b,c,d)
This example compares the two strings a and b. If they are identical, ifelse returns string c. If they are not identical, it returns string d. For example, you can define a macro called compare to compare two strings and return yes if they are the same or no if they are different, as follows:
define(compare, `ifelse($1,$2,yes,no)')
The quote characters prevent the evaluation of ifelse from occurring too early. If the fourth argument is missing, it is treated as empty.
The ifelse macro can have any number of arguments, and it therefore provides a limited form of multiple path decision capability. For example:
ifelse(a,b,c,d,e,f,g)
This statement is logically the same as the following fragment:
if(a == b)
    x = c;
else if(d == e)
    x = f;
else
    x = g;
return(x);
If the final argument is omitted, the result is null.
5.3.11 Manipulating Strings
The built-in len macro returns the byte length of the string that makes up its argument. For example, len(abcdef) is 6, and len((a,b)) is 5.
The built-in dlen macro returns the length of the displayable characters in a string. In certain international usages, 2-byte codes are displayed as one character. Thus, if the string contains any 2-byte international character codes, the result of dlen will differ from the result of len.
The built-in substr macro returns the substring (beginning at the character position specified by the second argument) from a specified string (first argument). The third argument specifies the length in bytes of the returned substring. For example:
substr(Krazy Kat,6,5)
This example returns "Kat", which is the 3-character substring beginning at character position 6 of the string "Krazy Kat". The first character in the string is at position 0 (zero). If the third argument is omitted or if the string is not long enough to satisfy the third argument, as in this example, the rest of the string is returned.
The built-in index macro returns the byte position, or index, in a string (first argument) where a substring (second argument) begins. If the substring is not present, index returns -1. As with substr, the origin for strings is 0 (zero). For example:
index(Krazy Kat,Kat)
This example returns 6.
The built-in translit macro performs one-for-one character substitution, or transliteration. The first argument is a string to be processed. The second and third arguments are lists of characters. Each instance of a character from the second argument that is found in the string is replaced by the corresponding character from the third argument. For example:
translit(the quick brown fox jumps over the lazy dog,aeiou,AEIOU)
This example returns the following:
thE qUIck brOwn fOx jUmps OvEr thE lAzy dOg
If the third argument is shorter than the second argument, characters from the second argument that are not in the third argument are deleted. If the third argument is missing, all characters present in the second argument are deleted.
Note
The substr, index, and translit macros do not differentiate between 1- and 2-byte displayable characters and can return unexpected results in some international usages.
The built-in errprint macro writes its arguments to the standard error file. For example:
errprint(`error')
The built-in dumpdef macro dumps the current names and definitions of items named as arguments. Names must be quoted. If you supply no arguments, dumpdef prints all current names and definitions. The dumpdef macro writes to the standard error file.