A External Data Representation: Technical Notes

This appendix contains technical notes on the External Data Representation (XDR) standard and on the set of library routines that enable C programmers to describe arbitrary data structures in a machine-independent way. For a formal specification of the XDR standard, see RFC 1014, External Data Representation: Protocol Specification.

XDR is the backbone of the Open Network Computing Remote Procedure Call (ONC RPC) package, because data for remote procedure calls is transmitted using the XDR standard. Use the XDR library routines to transmit data that is accessed (read or written) by more than one type of machine. For a complete specification of the system External Data Representation routines, see the xdr(3) reference page.

This appendix also contains a short tutorial overview of the XDR library routines, a guide to accessing currently available XDR streams, and information on defining new streams and data types.

XDR was designed to work across different languages, operating systems, and machine architectures. Most users (particularly RPC users) only need the information on number filters (Section A.1.3.1), floating-point filters (Section A.1.3.2), and enumeration filters (Section A.1.3.3). Programmers who want to implement RPC and XDR on new machines should read the rest of the appendix.

Note

You can use rpcgen to write XDR routines regardless of whether RPC calls are being made.

C programs that need XDR routines must include the file <rpc/rpc.h>, which contains all necessary interfaces to the XDR system. The C library libc.a contains all the XDR routines, so you can compile as usual.

A.1 Usefulness of XDR

Consider the following two programs, writer.c and reader.c:

#include <stdio.h>
 
main()			/* writer.c */
{
	long i;
 
	for (i = 0; i < 8; i++) {
		if (fwrite((char *)&i, sizeof(i), 1, stdout) != 1) {
			fprintf(stderr, "failed!\n");
			exit(1);
		}
	}
	exit(0);
}
 
#include <stdio.h>
 
main()			/* reader.c */
{
	long i, j;
 
	for (j = 0; j < 8; j++) {
		if (fread((char *)&i, sizeof (i), 1, stdin) != 1) {
			fprintf(stderr, "failed!\n");
			exit(1);
		}
		printf("%ld ", i);
	}
	printf("\n");
	exit(0);
}

The two programs appear to be portable because:

They pass lint checking.

They work the same when executed on two different hardware architectures, a Sun and a MIPS.

Piping the output of the writer.c program to the reader.c program gives identical results on a MIPS or a Sun, as shown:

sun% writer | reader
0 1 2 3 4 5 6 7
sun%
 
mips% writer | reader
0 1 2 3 4 5 6 7
mips%

With local area networks and Berkeley UNIX 4.2BSD came the concept of network pipes, in which a process produces data on one machine, and a second process on another machine uses this data. You can construct a network pipe with writer.c and reader.c. Here, the first process (on a Sun) produces data used by a second process (on a MIPS):

sun% writer | rsh mips reader
0 16777216 33554432 50331648 67108864 83886080 100663296
117440512
sun%

You get identical results by executing writer.c on the MIPS and reader.c on the Sun. These results occur because the byte ordering of long integers differs between the MIPS and the Sun, although the word size is the same. Note that 16777216 is equal to 2²⁴; when four bytes are reversed, the 1 is in the 24th bit.
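The arithmetic behind these numbers can be checked with a short sketch. The helper reverse_bytes32 is a hypothetical name introduced here for illustration; it is not part of the XDR library, and the code uses ANSI prototypes and <stdint.h> rather than the K&R style of the surrounding examples. It reverses the four bytes of a 32-bit value, which is effectively what happens when a long written on one machine is read on a machine with the opposite byte order:

```c
#include <stdint.h>

/*
 * Reverse the four bytes of a 32-bit value.
 * Hypothetical helper for illustration; not an XDR routine.
 */
uint32_t
reverse_bytes32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) |
	       ((v & 0x0000ff00u) << 8) |
	       ((v & 0x00ff0000u) >> 8) |
	       ((v & 0xff000000u) >> 24);
}
```

Applied to the values 1 through 7, the helper yields 16777216, 33554432, and so on up to 117440512, which matches the output of the network pipe shown above.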

Whenever data is shared by two or more machine types, there is a need for portable data. You can make programs data-portable by replacing the read and write calls with calls to an XDR library routine xdr_long, which is a filter that recognizes the standard representation of a long integer in its external form. Here are the revised versions of writer.c (Example A-1) and reader.c (Example A-2):

Example A-1: Revised Version of writer.c

#include <stdio.h>
#include <rpc/rpc.h>	/* XDR is a sub-library of RPC */
 
main()		/* writer.c */
{
	XDR xdrs;
	long i;
 
	xdrstdio_create(&xdrs, stdout, XDR_ENCODE);
	for (i = 0; i < 8; i++) {
		if (!xdr_long(&xdrs, &i)) {
			fprintf(stderr, "failed!\n");
			exit(1);
		}
	}
	exit(0);
}

Example A-2: Revised Version of reader.c

#include <stdio.h>
#include <rpc/rpc.h>	/* XDR is a sub-library of RPC */
 
main()		/* reader.c */
{
	XDR xdrs;
	long i, j;
 
	xdrstdio_create(&xdrs, stdin, XDR_DECODE);
	for (j = 0; j < 8; j++) {
		if (!xdr_long(&xdrs, &i)) {
			fprintf(stderr, "failed!\n");
			exit(1);
		}
		printf("%ld ", i);
	}
	printf("\n");
	exit(0);
}

The new programs were executed on a MIPS, a Sun, and from a Sun to a MIPS; the results are as follows:

sun% writer | reader
0 1 2 3 4 5 6 7
sun%
 
mips% writer | reader
0 1 2 3 4 5 6 7
mips%
 
sun% writer | rsh mips reader
0 1 2 3 4 5 6 7
sun%

Note

Arbitrary data structures create portability problems, particularly with alignment and pointers:

Alignment on word boundaries may cause the size of a structure to vary on different machines.

A pointer has no meaning outside the machine where it is defined.
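The alignment problem can be observed directly. The following sketch uses a hypothetical structure, struct sample, and a hypothetical helper, sample_padding; neither belongs to the XDR library. The amount of padding it reports depends on the compiler and the machine, which is exactly why writing the raw bytes of a structure is not portable:

```c
#include <stddef.h>

/* Hypothetical record: one char followed by one long. */
struct sample {
	char tag;
	long value;
};

/*
 * Number of padding bytes the compiler placed between tag and value.
 * The result is machine-dependent.
 */
size_t
sample_padding(void)
{
	return offsetof(struct sample, value) - sizeof(char);
}
```

On many machines sizeof(struct sample) is larger than sizeof(char) + sizeof(long); the difference is padding that has no meaning to a reader on another architecture.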

A.1.1 A Canonical Standard

The XDR approach to standardizing data representations is canonical, because XDR defines a single byte order (big-endian), a single floating-point representation (IEEE), and so on. A program running on any machine can use XDR to create portable data by translating its local representation to the XDR standard. Similarly, any such program can read portable data by translating the XDR standard representation to the local equivalent.

The single standard treats separately those programs that create or send portable data and those that use or receive the data. A new machine or language has no effect upon existing portable data creators and users. Any new machine simply uses the canonical standards of XDR; the local representations of other machines are irrelevant. To existing programs on other machines, the local representations of the new machine are also irrelevant. There are strong precedents for the canonical approach of XDR. For example, TCP/IP, UDP/IP, XNS, Ethernet, and all protocols below layer five of the ISO model, are canonical protocols. The advantage of any canonical approach is simplicity; in the case of XDR, a single set of conversion routines is written once.

The canonical approach does have one disadvantage of little practical importance. Suppose two little-endian machines transfer integers according to the XDR standard. The sending machine converts the integers from little-endian byte order to XDR (big-endian) byte order, and the receiving machine does the reverse. Because both machines observe the same byte order, the conversions were really unnecessary. Fortunately, the time spent converting to and from a canonical representation is insignificant, especially in networking applications. Most of the time required to prepare a data structure for transfer is not spent in conversion but in traversing the elements of the data structure.
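The conversion that a canonical byte order implies is small. The following is a minimal sketch, assuming a 32-bit integer and using the hypothetical helper names put_be32 and get_be32; it is not the library's implementation of any filter. Because the code works on individual bytes, the same source produces the same external form on big-endian and little-endian machines:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v in buf[0..3], most significant byte first (big-endian). */
void
put_be32(unsigned char *buf, uint32_t v)
{
	assert(buf != NULL);
	buf[0] = (unsigned char)(v >> 24);
	buf[1] = (unsigned char)(v >> 16);
	buf[2] = (unsigned char)(v >> 8);
	buf[3] = (unsigned char)v;
}

/* Rebuild the local value from four big-endian bytes. */
uint32_t
get_be32(const unsigned char *buf)
{
	assert(buf != NULL);
	return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
	       ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}
```

On a little-endian machine both calls reorder bytes; on a big-endian machine they amount to a copy. In either case the cost is a handful of shifts per integer.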

A.1.2 The XDR Library

The XDR library enables you to write and read arbitrary C constructs consistently. This makes it useful even when the data is not shared among machines on a network. The XDR library can do this because it has filter routines for strings (null-terminated arrays of bytes), structures, unions, and arrays. Using more primitive routines, you can write your own specific XDR routines to describe arbitrary data structures, including elements of arrays, arms of unions, or objects pointed at from other structures. The structures themselves may contain arrays of arbitrary elements, or pointers to other structures.

The previous writer.c and reader.c routines manipulate data by using standard I/O routines, so xdrstdio_create was used. The parameters to XDR stream creation routines vary according to their function. For example, xdrstdio_create takes the following parameters:

A pointer to an XDR structure that it initializes

A pointer to a FILE that the input or output acts upon

The operation -- either XDR_ENCODE for serializing in writer.c or XDR_DECODE for deserializing in reader.c

It is not necessary for RPC users to create XDR streams; the RPC system itself can create these streams and pass them to the users. There is a family of XDR stream creation routines in which each member treats the stream of bits differently.

The xdr_long primitive is characteristic of most XDR library primitives and all client XDR routines in two ways:

The routine returns FALSE (0) if it fails, and TRUE (1) if it succeeds.

For each data type xxx, there is an associated XDR routine of the form:
```
xdr_xxx(xdrs, xp)
	XDR *xdrs;
	xxx *xp;
{
}
```

In this case, xxx is long, and the corresponding XDR routine is a primitive, xdr_long. The client could also define an arbitrary structure xxx in which case the client would also supply the routine xdr_xxx, describing each field by calling XDR routines of the appropriate type. In all cases, the first parameter, xdrs, is treated as an opaque handle and passed to the primitive routines.

XDR routines are direction-independent; that is, the same routines are called to serialize or deserialize data. This feature is important for portable data. Calling the same routine for either operation practically guarantees that serialized data can also be deserialized. Thus, one routine is used by both producer and consumer of networked data.

Direction independence is achieved by always passing the address of an object rather than the object itself; the object is modified only during deserialization. If needed, you can obtain the direction of the XDR operation. See Section A.1.5 for details.
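To make the idea of a direction-independent filter concrete, here is a toy sketch. Everything in it (struct toy_stream, enum toy_op, and toy_long) is hypothetical and is not the XDR library's code or data layout; it only assumes a 4-byte big-endian external form for a long. The point is that one routine, given the address of the object, can serve both the producer and the consumer:

```c
#include <stddef.h>
#include <stdint.h>

enum toy_op { TOY_ENCODE, TOY_DECODE };

/* Hypothetical stand-in for an XDR handle: a direction plus a buffer. */
struct toy_stream {
	enum toy_op op;
	unsigned char *buf;
	size_t len;		/* total bytes in buf */
	size_t pos;		/* next byte to read or write */
};

/* One filter for both directions; returns 1 on success, 0 on failure. */
int
toy_long(struct toy_stream *s, long *lp)
{
	unsigned char *p;
	uint32_t u;

	if (s->len - s->pos < 4)
		return 0;		/* no room left in the buffer */
	p = s->buf + s->pos;
	if (s->op == TOY_ENCODE) {
		if (*lp < INT32_MIN || *lp > INT32_MAX)
			return 0;	/* value does not fit in 32 bits */
		u = (uint32_t)*lp;
		p[0] = (unsigned char)(u >> 24);
		p[1] = (unsigned char)(u >> 16);
		p[2] = (unsigned char)(u >> 8);
		p[3] = (unsigned char)u;
	} else {
		u = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
		    ((uint32_t)p[2] << 8) | (uint32_t)p[3];
		/* Rebuild the signed value without relying on wraparound. */
		*lp = (u <= INT32_MAX) ? (long)u : -(long)(UINT32_MAX - u) - 1;
	}
	s->pos += 4;
	return 1;
}
```

The object at lp is read when encoding and written when decoding, which is why the filter takes an address in both cases.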

For a more complicated example, assume that a person's gross assets and liabilities are to be exchanged among processes, and each is a separate data type:

struct gnumbers {
	long g_assets;
	long g_liabilities;
};

The corresponding XDR routine describing this structure would be as follows:

bool_t  		/* TRUE is success, FALSE is failure */
xdr_gnumbers(xdrs, gp)
	XDR *xdrs;
	struct gnumbers *gp;
{
	if (xdr_long(xdrs, &gp->g_assets) &&
	    xdr_long(xdrs, &gp->g_liabilities))
		return(TRUE);
	return(FALSE);
}

In this example, the parameter, xdrs, is never inspected or modified; it is only passed to subcomponent routines. The program must inspect the return value of each XDR routine call and stop immediately and return FALSE upon subroutine failure.

This example also shows that the type bool_t is declared as an integer whose only values are TRUE (1) and FALSE (0). The following definitions apply:

#define bool_t	int
#define TRUE	1
#define FALSE	0

With these conventions, you can rewrite xdr_gnumbers as follows:

xdr_gnumbers(xdrs, gp)
	XDR *xdrs;
	struct gnumbers *gp;
{
	return(xdr_long(xdrs, &gp->g_assets) &&
		xdr_long(xdrs, &gp->g_liabilities));
}

Either coding style can be used.

A.1.3 XDR Library Primitives

The following sections describe the XDR primitives (basic and constructed data types) and XDR utilities. The include file <rpc/xdr.h>, (automatically included by <rpc/rpc.h>) defines the interface to these primitives and utilities.

A.1.3.1 Number Filters

The XDR library provides primitives that translate between numbers and their corresponding external representations. The primitives cover the set of numbers in:

[signed, unsigned] * [char, short, int, long, hyper]

Specifically, the primitives are:

bool_t xdr_char(xdrs, cp)
	XDR *xdrs;
	char *cp;
 
bool_t xdr_u_char(xdrs, ucp)
	XDR *xdrs;
	unsigned char *ucp;
 
bool_t xdr_hyper(xdrs, hp)
	XDR *xdrs;
	longlong_t *hp;
 
bool_t xdr_u_hyper(xdrs, uhp)
	XDR *xdrs;
	u_longlong_t *uhp;
 
bool_t xdr_int(xdrs, ip)
	XDR *xdrs;
	int *ip;
 
bool_t xdr_u_int(xdrs, up)
	XDR *xdrs;
	unsigned *up;
 
bool_t xdr_long(xdrs, lip)
	XDR *xdrs;
	long *lip;
 
bool_t xdr_u_long(xdrs, lup)
	XDR *xdrs;
	u_long *lup;
 
bool_t xdr_longlong_t(xdrs, hp)
	XDR *xdrs;
	longlong_t *hp;
 
bool_t xdr_u_longlong_t(xdrs, uhp)
	XDR *xdrs;
	u_longlong_t *uhp;
 
bool_t xdr_short(xdrs, sip)
	XDR *xdrs;
	short *sip;
 
bool_t xdr_u_short(xdrs, sup)
	XDR *xdrs;
	u_short *sup;

The first parameter, xdrs, is an XDR stream handle. The second parameter is the address of the number that provides data to the stream or receives data from it. All routines return TRUE if they complete successfully, and FALSE otherwise.

For more information on number filters, see the xdr(3) reference page.

A.1.3.2 Floating Point Filters

The XDR library also provides primitive routines for floating point types in C:

bool_t xdr_float(xdrs, fp)
	XDR *xdrs;
	float *fp;
 
bool_t xdr_double(xdrs, dp)
	XDR *xdrs;
	double *dp;

The first parameter, xdrs, is an XDR stream handle. The second parameter is the address of the floating point number that provides data to the stream or receives data from it. Both routines return TRUE if they complete successfully, and FALSE otherwise.

Note

Because the numbers are represented in IEEE floating point, routines may fail when decoding a valid IEEE representation into a machine-specific representation, or vice versa.

A.1.3.3 Enumeration Filters

The XDR library provides a primitive for generic enumerations; it assumes that a C enum has the same representation inside the machine as a C integer. The Boolean type is an important instance of the enum. The external representation of a Boolean is always TRUE (1) or FALSE (0) as shown here:

#define bool_t	int
#define FALSE	0
#define TRUE	1
 
#define enum_t int
 
bool_t xdr_enum(xdrs, ep)
	XDR *xdrs;
	enum_t *ep;
 
bool_t xdr_bool(xdrs, bp)
	XDR *xdrs;
	bool_t *bp;

The second parameters ep and bp are addresses of the associated type that provides data to, or receives data from, the stream xdrs.

A.1.3.4 Possibility of No Data

Occasionally, an XDR routine must be supplied to the RPC system, even when no data is passed or required. The following routine does this:

bool_t xdr_void();  /* always returns TRUE */

A.1.3.5 Constructed Data Type Filters

Constructed or compound data type primitives require more parameters and perform more complicated functions than the primitives previously discussed. The following sections include primitives for strings, arrays, unions, and pointers to structures.

Constructed data type primitives may use memory management. In many cases, memory is allocated when deserializing data with XDR_DECODE. XDR enables memory deallocation through the XDR_FREE operation. The three XDR directional operations are XDR_ENCODE, XDR_DECODE, and XDR_FREE.

A.1.3.5.1 Strings

In C, a string is defined as a sequence of bytes terminated by a NULL byte, which is not considered when calculating string length. When a string is passed or manipulated, there must be a pointer to it. Therefore, the XDR library defines a string to be a char *, not a sequence of characters. The external and internal representations of a string are different. Externally, strings are represented as sequences of ASCII characters; internally, with character pointers. The xdr_string routine converts between the two, as shown:

bool_t xdr_string(xdrs, sp, maxlength)
	XDR *xdrs;
	char **sp;
	u_int maxlength;

The first parameter, xdrs, is the XDR stream handle; the second, sp, is a pointer to a string (type char **). The third parameter, maxlength, specifies the maximum number of bytes allowed during encoding or decoding; its value is usually specified by a protocol. For example, a protocol may specify that a file name cannot be longer than 255 characters. Keep maxlength small because overflow conditions may occur if xdr_string has to call malloc for space. The routine returns FALSE if the number of characters exceeds maxlength; otherwise, it returns TRUE.

The behavior of xdr_string is similar to that of other routines in this section. For the direction, XDR_ENCODE, the parameter sp points to a string of a certain length; if the string does not exceed maxlength, the bytes are serialized.

The effect of deserializing a string is subtle. First, the length of the incoming string is determined; it must not exceed maxlength. Next, sp is dereferenced; if the value is NULL, then a string of the appropriate length is allocated and *sp is set to this string. If the original value of *sp is non-NULL, then XDR assumes that a target area (which can hold strings no longer than maxlength) has been allocated. In either case, the string is decoded into the target area, and the routine appends a NULL character to it.

In the XDR_FREE operation, the string is obtained by dereferencing sp. If the string is not NULL, it is freed and *sp is set to NULL. In this operation, xdr_string ignores the maxlength parameter.
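The decoding steps described earlier in this section can be sketched as follows. The routine toy_decode_string is hypothetical and is not the library's xdr_string. It assumes the external form is a 4-byte big-endian length followed by the bytes of the string, which is how RFC 1014 lays out a string; trailing padding is ignored here. It also assumes that a caller-supplied buffer holds at least maxlength + 1 bytes:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Toy decoder, not xdr_string. Returns 1 on success, 0 on failure.
 * wire/wire_len describe the encoded input; sp and maxlength play the
 * same roles as the xdr_string parameters of those names.
 */
int
toy_decode_string(const unsigned char *wire, size_t wire_len,
    char **sp, unsigned int maxlength)
{
	uint32_t n;

	if (wire_len < 4)
		return 0;
	/* Step 1: determine the incoming length and check it. */
	n = ((uint32_t)wire[0] << 24) | ((uint32_t)wire[1] << 16) |
	    ((uint32_t)wire[2] << 8) | (uint32_t)wire[3];
	if (n > maxlength || n > wire_len - 4)
		return 0;		/* too long, or truncated input */
	/* Step 2: allocate a target area only if the caller gave none. */
	if (*sp == NULL) {
		*sp = malloc((size_t)n + 1);
		if (*sp == NULL)
			return 0;
	}
	/* Step 3: copy the bytes and append the terminator. */
	memcpy(*sp, wire + 4, n);
	(*sp)[n] = '\0';
	return 1;
}
```

A caller that passes a pointer whose value is NULL owns the allocated string afterward and must release it; with the real library that is the job of the XDR_FREE operation.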

A.1.3.5.2 Byte Arrays

Often, variable-length arrays of bytes are preferable to strings. Byte arrays differ from strings in the following three ways:

The length of the array (the byte count) is explicitly located in an unsigned integer.

The byte sequence is not terminated by a NULL character.

The external and internal byte representation is the same.

The primitive xdr_bytes converts between the internal and external representations of byte arrays:

bool_t xdr_bytes(xdrs, bpp, lp, maxlength)
	XDR *xdrs;
	char **bpp;
	u_int *lp;
	u_int maxlength;

The usage of the first, second, and fourth parameters is identical to that of the same parameters of xdr_string. The length of the byte area is obtained by dereferencing lp when serializing; *lp is set to the byte length when deserializing.

A.1.3.5.3 Arrays

The XDR library provides a primitive for handling arrays of arbitrary elements. The xdr_bytes routine treats a subset of generic arrays, in which the size of array elements is known to be 1, and the external description of each element is built-in. The generic array primitive, xdr_array, requires parameters identical to those of xdr_bytes in addition to two more: the size of array elements, and an XDR routine to handle each of the elements.

This routine encodes or decodes each array element:

bool_t
xdr_array(xdrs, ap, lp, maxlength, elementsiz, xdr_element)
	XDR *xdrs;
	char **ap;
	u_int *lp;
	u_int maxlength;
	u_int elementsiz;
	bool_t (*xdr_element)();

The parameter ap is the address of the pointer to the array. If *ap is NULL when the array is being deserialized, XDR allocates an array of the appropriate size and sets *ap to that array. The element count of the array is obtained from *lp when the array is serialized; *lp is set to the array length when the array is deserialized. The parameter maxlength is the maximum allowable number of array elements; elementsiz is the byte size of each array element. (You can use the C operator sizeof to obtain this value.) The xdr_element routine is called to serialize, deserialize, or free each element of the array.
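The way a generic array routine can walk elements of unknown type, using only an element size and an element routine, is sketched below. The names toy_array, toy_elem_proc, and toy_sum_int are hypothetical; this is not the implementation of xdr_array, and the ctx parameter merely stands in for the stream handle. The sketch shows the bounds check against maxlength and the stride of elementsiz bytes between calls:

```c
#include <stddef.h>

/* Element callback: returns 1 on success, 0 on failure. */
typedef int (*toy_elem_proc)(void *ctx, void *elem);

/*
 * Call proc once per element, stepping through base in units of
 * elementsiz bytes. Fails if count exceeds maxlength or if any
 * element routine fails.
 */
int
toy_array(void *ctx, char *base, unsigned int count,
    unsigned int maxlength, unsigned int elementsiz, toy_elem_proc proc)
{
	unsigned int i;

	if (count > maxlength)
		return 0;
	for (i = 0; i < count; i++) {
		if (!proc(ctx, base + (size_t)i * elementsiz))
			return 0;
	}
	return 1;
}

/* Sample element routine: adds one int element to the long at ctx. */
int
toy_sum_int(void *ctx, void *elem)
{
	*(long *)ctx += *(int *)elem;
	return 1;
}
```

Because the element routine receives only a context and an element address, any routine with that shape can be plugged in, including one that itself walks a nested array.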

Consider the following three examples, which show the recursiveness of the XDR library routines already discussed.

A user on a networked machine can be identified in three ways:

The machine name, such as krypton (see the gethostname(2) reference page)

The user's UID (see the geteuid(2) reference page)

The group numbers to which the user belongs (see the getgroups(2) reference page)

A structure with this information and its associated XDR routine could be coded like this:

struct netuser {
	char	*nu_machinename;
	int 	nu_uid;
	u_int	nu_glen;
	int 	*nu_gids;
};
#define NLEN 255	/* machine names < 256 chars */
#define NGRPS 20	/* user can't be in > 20 groups */
bool_t
xdr_netuser(xdrs, nup)
	XDR *xdrs;
	struct netuser *nup;
{
	return(xdr_string(xdrs, &nup->nu_machinename, NLEN) &&
		xdr_int(xdrs, &nup->nu_uid) &&
		xdr_array(xdrs, &nup->nu_gids, &nup->nu_glen,
		NGRPS, sizeof (int), xdr_int));
}

A party of network users could be implemented as an array of netuser structures. The declaration and its associated XDR routines are as follows:

struct party {
	u_int p_len;
	struct netuser *p_nusers;
};
#define PLEN 500 /* max number of users in a party */
 
bool_t
xdr_party(xdrs, pp)
	XDR *xdrs;
	struct party *pp;
{
	return(xdr_array(xdrs, &pp->p_nusers, &pp->p_len, PLEN,
	    sizeof (struct netuser), xdr_netuser));
}

The parameters to main, argc and argv, can be combined into a structure, and an array of these structures can make up a history of commands. The declarations and XDR routines might look like:

struct cmd {
	u_int c_argc;
	char **c_argv;
};
#define ALEN 1000	/* args cannot be > 1000 chars */
#define NARGC 100	/* commands cannot have > 100 args */
 
struct history {
	u_int h_len;
	struct cmd *h_cmds;
};
#define NCMDS 75  /* history is no more than 75 commands */
 
bool_t
xdr_wrapstring(xdrs, sp)
	XDR *xdrs;
	char **sp;
{
	return(xdr_string(xdrs, sp, ALEN));
}
 
bool_t
xdr_cmd(xdrs, cp)
	XDR *xdrs;
	struct cmd *cp;
{
	return(xdr_array(xdrs, &cp->c_argv, &cp->c_argc, NARGC,
	    sizeof (char *), xdr_wrapstring));
}
 
bool_t
xdr_history(xdrs, hp)
	XDR *xdrs;
	struct history *hp;
{
	return(xdr_array(xdrs, &hp->h_cmds, &hp->h_len, NCMDS,
	    sizeof (struct cmd), xdr_cmd));
}

The routine xdr_wrapstring is needed to package the xdr_string routine, because the implementation of xdr_array only passes two parameters to the array element description routine; xdr_wrapstring supplies the third parameter to xdr_string.

A.1.3.5.4 Opaque Data

Some protocols pass handles from a server to a client. The client later passes back the handles, without first inspecting them; that is, handles are opaque. The xdr_opaque primitive describes fixed-size, opaque bytes:

bool_t xdr_opaque(xdrs, p, len)
	XDR *xdrs;
	char *p;
	u_int len;

The parameter p is the location of the bytes; len is the number of bytes in the opaque object. By definition, the data within the opaque object is not machine-portable.

A.1.3.5.5 Arrays of Fixed Size

The XDR library provides a primitive, xdr_vector, for fixed-length arrays:

#define NLEN 255	/* machine names must be < 256 chars */
#define NGRPS 20	/* user belongs to exactly 20 groups */
 
struct netuser {
	char *nu_machinename;
	int nu_uid;
	int nu_gids[NGRPS];
};
 
bool_t
xdr_netuser(xdrs, nup)
	XDR *xdrs;
	struct netuser *nup;
{
	if (!xdr_string(xdrs, &nup->nu_machinename, NLEN))
		return(FALSE);
	if (!xdr_int(xdrs, &nup->nu_uid))
		return(FALSE);
	if (!xdr_vector(xdrs, nup->nu_gids, NGRPS, sizeof(int),
	    xdr_int))
		return(FALSE);
	return(TRUE);
}

A.1.3.5.6 Discriminated Unions

The XDR library supports discriminated unions. A discriminated union is a C union and an enum_t value that selects an arm of the union:

struct xdr_discrim {
	enum_t value;
	bool_t (*proc)();
};
 
bool_t xdr_union(xdrs, dscmp, unp, arms, defaultarm)
	XDR *xdrs;
	enum_t *dscmp;
	char *unp;
	struct xdr_discrim *arms;
	bool_t (*defaultarm)();  /* may equal NULL */

In this example, the routine translates the discriminant of the union at *dscmp. The discriminant is always an enum_t. Next, the union at *unp is translated. The parameter arms is a pointer to an array of xdr_discrim structures. Each structure contains an ordered pair of [value,proc].

If the union's discriminant is equal to the associated value, then the proc is called to translate the union. The end of the xdr_discrim structure array is denoted by a routine of value NULL. If the discriminant is not in the arms array, then the defaultarm procedure is called if it is non-null; otherwise the routine returns FALSE.
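The arm selection just described amounts to a linear search that stops at the terminating entry. The following sketch uses hypothetical names (struct toy_discrim, toy_select_arm, and two placeholder arm routines) and does not reproduce the library's xdr_union; it only returns the routine that would be called for a given discriminant:

```c
#include <stddef.h>

typedef int (*toy_proc)(void *);

/* Hypothetical counterpart of an xdr_discrim [value, proc] pair. */
struct toy_discrim {
	int value;
	toy_proc proc;
};

/* Placeholder arm routines, used only to give the table some entries. */
int toy_arm_int(void *p) { (void)p; return 1; }
int toy_arm_string(void *p) { (void)p; return 1; }

/* Unsorted and sparse on purpose; the NULL proc marks the end. */
const struct toy_discrim toy_arms[] = {
	{ 3, toy_arm_string },
	{ 1, toy_arm_int },
	{ 0, NULL }
};

/*
 * Return the routine for discriminant d. If no arm matches, return
 * defaultarm, which may be NULL; the caller treats NULL as failure.
 */
toy_proc
toy_select_arm(int d, const struct toy_discrim *arms, toy_proc defaultarm)
{
	for (; arms->proc != NULL; arms++) {
		if (arms->value == d)
			return arms->proc;
	}
	return defaultarm;
}
```

The value stored in the terminating entry is never compared, so it can be anything.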

The following example shows how to serialize or deserialize a discriminated union. Suppose that the type of a union is an integer, character pointer (a string), or a gnumbers structure. Also, assume the union and its current type are declared in a structure, as follows:

enum utype { INTEGER=1, STRING=2, GNUMBERS=3 };
 
struct u_tag {
	enum utype utype;	/* the union's discriminant */
	union {
		int ival;
		char *pval;
		struct gnumbers gn;
	} uval;
};

The following constructs and XDR procedure serialize or deserialize the discriminated union:

struct xdr_discrim u_tag_arms[4] = {
	{ INTEGER, xdr_int },
	{ GNUMBERS, xdr_gnumbers },
	{ STRING, xdr_wrapstring },
	{ __dontcare__, NULL }
	/* always terminate arms with a NULL xdr_proc */
};
 
bool_t
xdr_u_tag(xdrs, utp)
	XDR *xdrs;
	struct u_tag *utp;
{
	return(xdr_union(xdrs, &utp->utype, &utp->uval,
		u_tag_arms, NULL));
}

The routine xdr_gnumbers was presented in Section A.1.2, and xdr_wrapstring was presented in the third example in Section A.1.3.5.3. The default arm parameter to xdr_union (the last parameter) is NULL in this example. Therefore, the value of the union's discriminant can only be a value listed in the u_tag_arms array. This example also shows that the elements of the arms array do not need to be sorted.

The values of the discriminant may be sparse, though in this example they are not. It is always good practice to explicitly assign integer values to each element of the discriminant's type. This documents the external representation of the discriminant and guarantees that different C compilers provide identical discriminant values.

A.1.3.5.7 Pointers

In C it is often convenient to place pointers to other structures within a structure. The xdr_reference primitive makes it easy to serialize, deserialize, and free these referenced structures. Its synopsis is shown here:

bool_t xdr_reference(xdrs, pp, ssize, proc)
	XDR *xdrs;
	char **pp;
	u_int ssize;
	bool_t (*proc)();

Parameter pp is the address of the pointer to the structure, ssize is the size in bytes of the structure (use the C operator sizeof to obtain this value), and proc is the XDR routine that describes the structure. When decoding data, storage is allocated if *pp is NULL.

There is no need for a primitive xdr_struct to describe a structure within a structure, because pointers are always sufficient.

Note

The xdr_reference and xdr_array primitives are not interchangeable external representations of data.

The following example describes a structure (and its corresponding XDR routine) that contains an item of data and a pointer to a gnumbers structure that has more information about that item of data. Suppose there is a structure containing a person's name and a pointer to a gnumbers structure containing the person's gross assets and liabilities. This structure has the following construct:

struct pgn {
	char *name;
	struct gnumbers *gnp;
};

This structure has the following corresponding XDR routine:

bool_t
xdr_pgn(xdrs, pp)
	XDR *xdrs;
	struct pgn *pp;
{
	if (xdr_string(xdrs, &pp->name, NLEN) &&
	  xdr_reference(xdrs, &pp->gnp,
	  sizeof(struct gnumbers), xdr_gnumbers))
		return(TRUE);
	return(FALSE);
}

In many applications, C programmers attach double meaning to the values of a pointer. Typically the value NULL means data is not necessary, but some application-specific interpretation applies. In essence, the C programmer is encoding a discriminated union efficiently by overloading the interpretation of the value of a pointer.

For example, in the previous structure, a NULL pointer value for gnp could indicate that the person's assets and liabilities are unknown; that is, the pointer value encodes two things: whether or not the data is known, and if it is known, where it is located in memory. Linked lists are an extreme example of the use of application-specific pointer interpretation.

The primitive xdr_reference cannot attach any special meaning to a NULL-value pointer during serialization. That is, passing an address of a pointer whose value is NULL to xdr_reference when serializing data will most likely cause a memory fault and a core dump.

The xdr_pointer primitive correctly handles NULL pointers. For more information about its use, see Section A.2.
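One way to make a pointer that may be NULL serializable is to treat it as the two-armed discriminated union it really is: write a Boolean that says whether data follows, and write the data only when it does. The sketch below assumes that encoding and uses hypothetical names (toy_encode_optional and a local put_be32 helper); it is an illustration of the idea, not the implementation of xdr_pointer, for which Section A.2 remains the reference. It also assumes the two long values fit in 32 bits:

```c
#include <stddef.h>
#include <stdint.h>

struct gnumbers {
	long g_assets;
	long g_liabilities;
};

/* Store v in b[0..3], most significant byte first. */
static void
put_be32(unsigned char *b, uint32_t v)
{
	b[0] = (unsigned char)(v >> 24);
	b[1] = (unsigned char)(v >> 16);
	b[2] = (unsigned char)(v >> 8);
	b[3] = (unsigned char)v;
}

/*
 * Encode a presence flag, then the structure if gnp is not NULL.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
size_t
toy_encode_optional(unsigned char *buf, size_t len,
    const struct gnumbers *gnp)
{
	size_t need = (gnp == NULL) ? 4 : 12;

	if (len < need)
		return 0;
	put_be32(buf, gnp != NULL);	/* 1 = data follows, 0 = none */
	if (gnp != NULL) {
		/* Assumes both values fit in 32 bits. */
		put_be32(buf + 4, (uint32_t)gnp->g_assets);
		put_be32(buf + 8, (uint32_t)gnp->g_liabilities);
	}
	return need;
}
```

A decoder reads the flag first and therefore never has to dereference anything for the absent case, which avoids the memory fault described above.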

A.1.4 Non-filter Primitives

The non-filter primitives that follow are for manipulating XDR streams:

u_int xdr_getpos(xdrs)
	XDR *xdrs;
 
bool_t xdr_setpos(xdrs, pos)
	XDR *xdrs;
	u_int pos;
 
void xdr_destroy(xdrs)
	XDR *xdrs;

The routine xdr_getpos returns an unsigned integer that describes the current position in the data stream.

Note

In some XDR streams, the returned value of xdr_getpos is meaningless; the routine returns a -1 in this case (though -1 should be a legitimate value).

The routine xdr_setpos sets a stream position to pos. However, in some XDR streams, setting a position is impossible; in such cases, xdr_setpos returns FALSE. This routine also fails if the requested position is out-of-bounds. The definition of bounds varies according to the stream.

The xdr_destroy primitive destroys the XDR stream. Usage of the stream after calling this routine is undefined.

A.1.5 XDR Operation Directions

Though not recommended, you may want to optimize XDR routines by using the direction of the operation (XDR_ENCODE, XDR_DECODE, or XDR_FREE). The value xdrs->x_op contains the direction of the XDR operation. An example in Section A.2 shows the usefulness of the xdrs->x_op field.

A.1.6 XDR Stream Access

An XDR stream is obtained by calling the appropriate creation routine, which takes arguments for the specific properties of the stream. Streams currently exist for serialization or deserialization of data to or from standard I/O FILE streams, TCP/IP connections and files, and memory.

A.1.6.1 Standard I/O Streams

XDR streams can be interfaced to standard I/O using the xdrstdio_create routine as follows:

#include <stdio.h>
#include <rpc/rpc.h>	/* XDR streams part of RPC */
void
xdrstdio_create(xdrs, fp, x_op)
	XDR *xdrs;
	FILE *fp;
	enum xdr_op x_op;

The routine xdrstdio_create initializes an XDR stream pointed to by xdrs. The XDR stream interfaces to the standard I/O library. Parameter fp is an open file, and x_op is an XDR direction.

A.1.6.2 Memory Streams

A memory stream enables the streaming of data into or out of a specified area of memory:

#include <rpc/rpc.h>
 
void
xdrmem_create(xdrs, addr, len, x_op)
	XDR *xdrs;
	char *addr;
	u_int len;
	enum xdr_op x_op;

The routine xdrmem_create initializes an XDR stream in local memory that is pointed to by parameter addr; parameter len is the length in bytes of the memory. The parameters xdrs and x_op are identical to the corresponding parameters of xdrstdio_create. Currently, the UDP/IP implementation of ONC RPC uses xdrmem_create. Complete call or result messages are built in memory before calling the sendto system routine.

A.1.6.3 Record (TCP/IP) Streams

A record stream is an XDR stream built on top of a record marking standard, which is, in turn, built on top of a file or a Berkeley UNIX 4.2BSD connection interface, as shown:

#include <rpc/rpc.h>	/* xdr streams part of rpc */
 
xdrrec_create(xdrs,
  sendsize, recvsize, iohandle, readproc, writeproc)
	XDR *xdrs;
	u_int sendsize, recvsize;
	char *iohandle;
	int (*readproc)(), (*writeproc)();

The routine xdrrec_create provides an XDR stream interface that allows for a bidirectional, arbitrarily long sequence of records. The contents of the records are meant to be data in XDR form. The stream's primary use is for interfacing RPC to TCP connections. However, it can be used to stream data into or out of ordinary files.

The parameter xdrs is similar to the corresponding parameter described in Section A.1.6.2. The stream does its own data buffering, similar to that of standard I/O. The parameters sendsize and recvsize determine the size in bytes of the output and input buffers, respectively; if their values are zero, defaults are used. When a buffer needs to be filled or flushed, the routine readproc or writeproc is called, respectively. The usage of these routines is similar to the system calls read and write. However, the first parameter to each routine is the opaque parameter iohandle. The other two parameters (buf and nbytes) and the results (byte count) are identical to those of the system routines. If xxx is readproc or writeproc, then it has the following form:

/*
 * returns the actual number of bytes transferred;
 * -1 is an error
 */

int
xxx(iohandle, buf, nbytes)
	char *iohandle;
	char *buf;
	int nbytes;

The XDR stream enables you to delimit records in the byte stream. This is discussed in Section A.2. The following primitives are specific to record streams:

bool_t
xdrrec_endofrecord(xdrs, flushnow)
	XDR *xdrs;
	bool_t flushnow;
 
bool_t
xdrrec_skiprecord(xdrs)
	XDR *xdrs;
 
bool_t
xdrrec_eof(xdrs)
	XDR *xdrs;

The routine xdrrec_endofrecord causes the current outgoing data to be marked as a record. If the parameter flushnow is TRUE, then the stream's writeproc will be called; otherwise, writeproc will be called when the output buffer has been filled.

The routine xdrrec_skiprecord causes an input stream's position to be moved past the current record boundary and onto the beginning of the next record in the stream. If there is no more data in the stream's input buffer, then the routine xdrrec_eof returns TRUE. This does not mean that there is no more data in the underlying file descriptor.
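
For orientation, the ONC RPC record marking standard delimits records by placing a 4-byte header in front of each record fragment: the high-order bit is set on the last fragment of a record, and the remaining 31 bits give the fragment length in bytes. The following sketch only illustrates that header layout. The names rm_pack, rm_unpack, and RM_LAST_FRAG are hypothetical, and the library's own buffering and fragment handling are not shown.

```c
#include <assert.h>
#include <stdint.h>

#define RM_LAST_FRAG 0x80000000u	/* high-order bit: last fragment */

/* Build the 4-byte, big-endian header for a fragment of len bytes. */
void
rm_pack(unsigned char hdr[4], uint32_t len, int last)
{
	uint32_t word;

	assert(len < RM_LAST_FRAG);	/* length must fit in 31 bits */
	word = len | (last ? RM_LAST_FRAG : 0u);
	hdr[0] = (unsigned char)(word >> 24);
	hdr[1] = (unsigned char)(word >> 16);
	hdr[2] = (unsigned char)(word >> 8);
	hdr[3] = (unsigned char)word;
}

/* Split a header into its length (returned) and last-fragment flag. */
uint32_t
rm_unpack(const unsigned char hdr[4], int *last)
{
	uint32_t word = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
	    ((uint32_t)hdr[2] << 8) | (uint32_t)hdr[3];

	*last = (word & RM_LAST_FRAG) != 0;
	return (word & ~RM_LAST_FRAG);
}
```

In these terms, skipping a record amounts to discarding fragments until one with the last-fragment bit set has been passed.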

A.1.7 XDR Stream Implementation

This section provides the abstract data types needed to implement new instances of XDR streams.

The following structure defines the interface to an XDR stream:

enum xdr_op { XDR_ENCODE=0, XDR_DECODE=1, XDR_FREE=2 };
 
typedef struct {
	enum xdr_op x_op;	/* operation; fast added param */
	struct xdr_ops {
		bool_t  (*x_getlong)();  /* get long from stream */
		bool_t  (*x_putlong)();  /* put long to stream */
		bool_t  (*x_getbytes)(); /* get bytes from stream */
		bool_t  (*x_putbytes)(); /* put bytes to stream */
		u_int   (*x_getpostn)(); /* return stream offset */
		bool_t  (*x_setpostn)(); /* reposition offset */
		caddr_t (*x_inline)();   /* ptr to buffered data */
		VOID    (*x_destroy)();  /* free private area */
	} *x_ops;
	caddr_t	x_public;	/* users' data */
	caddr_t	x_private;	/* pointer to private data */
	caddr_t	x_base;		/* private for position info */
	int		x_handy;	/* extra private word */
} XDR;

The x_op field is the current operation being performed on the stream. This field is important to the XDR primitives, but is not expected to affect the implementation of a stream. The fields x_private, x_base, and x_handy pertain to a particular stream implementation. The field x_public is for the XDR client and must not be used by the XDR stream implementations or the XDR primitives. The x_getpostn, x_setpostn, and x_destroy macros access the corresponding operations.

The operation x_inline takes two parameters: an XDR * and an unsigned integer, which is a byte count. The routine returns a pointer to a piece of the stream's internal buffer. The caller can then use the buffer segment for any purpose. From the stream's point of view, the bytes in the buffer segment have been consumed or put. The routine may return NULL if it cannot return a buffer segment of the requested size. (The x_inline routine is for maximizing efficient use of machine cycles. The resulting buffer is not data-portable, so using this feature is not recommended.)
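
The behavior described for x_inline can be sketched for a memory-backed stream as follows. This is a simplified illustration, not the library's implementation: the structure mem_stream and the routine mem_inline are hypothetical, and the routine takes the private state directly instead of an XDR * and returns char * instead of caddr_t.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical private state of a memory-backed stream. */
struct mem_stream {
	char        *next;	/* current position in the buffer */
	unsigned int handy;	/* bytes remaining in the buffer */
};

/*
 * Hand out len bytes of the internal buffer, or NULL if fewer remain.
 * On success the stream treats those bytes as consumed or put.
 */
char *
mem_inline(struct mem_stream *ms, unsigned int len)
{
	char *segment;

	assert(ms != NULL);
	if (len > ms->handy)
		return (NULL);	/* cannot supply a segment of this size */
	segment = ms->next;
	ms->next += len;
	ms->handy -= len;
	return (segment);
}
```

Because the caller reads or writes the returned segment directly, it is the caller's job to keep the bytes in external representation.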

The operations x_getbytes and x_putbytes get and put sequences of bytes from or to the underlying stream; they return TRUE if successful, and FALSE otherwise. The routines have identical parameters (replace xxx):

bool_t
xxxbytes(xdrs, buf, bytecount)
	XDR *xdrs;
	char *buf;
	u_int bytecount;

The x_getlong and x_putlong routines get and put long numbers from and to the data stream. These routines must translate the numbers between the machine representation and the (standard) external representation. The operating system primitives htonl and ntohl help to do this. The higher-level XDR implementation assumes that signed and unsigned long integers contain the same number of bits, and that nonnegative integers have the same bit representations as unsigned integers. The routines return TRUE if they succeed, and FALSE otherwise. They have identical parameters:

bool_t
xxxlong(xdrs, lp)
	XDR *xdrs;
	long *lp;

Implementors of new XDR streams must make an XDR structure (with new operation routines) available to clients, using some kind of creation routine.

A.2 Advanced Topics

This section describes advanced techniques for passing data structures, such as linked lists (of arbitrary length). The examples in this section are written using both the XDR C library routines and the XDR data description language.

An earlier example presented a C data structure and its associated XDR routines for an individual's gross assets and liabilities. The example is duplicated here:

struct gnumbers {
	long g_assets;
	long g_liabilities;
};
bool_t
xdr_gnumbers(xdrs, gp)
	XDR *xdrs;
	struct gnumbers *gp;
{
	if (xdr_long(xdrs, &(gp->g_assets)))
		return(xdr_long(xdrs, &(gp->g_liabilities)));
	return(FALSE);
}

If you want to implement a linked list of such information, you could construct the following data structure:

struct gnumbers_node {
	struct gnumbers gn_numbers;
	struct gnumbers_node *gn_next;
};
 
typedef struct gnumbers_node *gnumbers_list;

The head of the linked list can be thought of as the data object; that is, the head is not merely a convenient shorthand for a structure. Similarly, the gn_next field indicates whether the object has terminated. Unfortunately, if the object continues, the gn_next field is also the address of where it continues. The link addresses carry no useful information when the object is serialized.

The XDR data description of this linked list is given by the following recursive declaration:

struct gnumbers {
	int g_assets;
	int g_liabilities;
};
 
struct gnumbers_node {
	gnumbers gn_numbers;
	gnumbers_node *gn_next;
};

Here, the declaration of gn_next uses the XDR optional-data syntax, which is encoded as a Boolean followed by the data it guards. The Boolean indicates whether there is more data following it. If the Boolean is FALSE, then it is the last data field of the structure; if TRUE, then it is followed by a gnumbers structure and (recursively) by the rest of the gnumbers_list. Note that the C declaration has no Boolean explicitly declared in it (though the gn_next field implicitly carries the information), while the XDR data description carries no machine address (the * marks optional data, not a pointer value). From the XDR description above, you can determine how to write the XDR routines for a gnumbers_list. That is, the xdr_pointer primitive would implement the XDR union. Unfortunately, because of the recursion, using XDR on a list with the following routines causes the C stack to grow linearly with respect to the number of nodes in the list:

bool_t
xdr_gnumbers_node(xdrs, gn)
	XDR *xdrs;
	struct gnumbers_node *gn;
{
	return(xdr_gnumbers(xdrs, &gn->gn_numbers) &&
		xdr_gnumbers_list(xdrs, &gn->gn_next));
}
 
bool_t
xdr_gnumbers_list(xdrs, gnp)
	XDR *xdrs;
	gnumbers_list *gnp;
{
	return(xdr_pointer(xdrs, gnp,
		sizeof(struct gnumbers_node),
		xdr_gnumbers_node));
}

The following routine combines these two mutually recursive routines into a single, non-recursive one:

bool_t
xdr_gnumbers_list(xdrs, gnp)
	XDR *xdrs;
	gnumbers_list *gnp;
{
	bool_t more_data;
	gnumbers_list *nextp;
 
	for (;;) {
		more_data = (*gnp != NULL);
		if (!xdr_bool(xdrs, &more_data)) {
			return(FALSE);
		}
		if (! more_data) {
			break;
		}
		if (xdrs->x_op == XDR_FREE) {
			nextp = &(*gnp)->gn_next;
		}
		if (!xdr_reference(xdrs, gnp,
			sizeof(struct gnumbers_node), xdr_gnumbers)) {
			return(FALSE);
		}
		gnp = (xdrs->x_op == XDR_FREE) ?
			nextp : &(*gnp)->gn_next;
	}
	*gnp = NULL;
	return(TRUE);
}

The first task is to find out whether there is more data or not, so that Boolean information can be serialized. Notice that this is unnecessary in the XDR_DECODE case, because the value of more_data is not known until it is deserialized in the next statement, which uses XDR on the more_data field of the XDR union. If there is no more data, this last pointer is set to NULL to indicate the list end, and a TRUE is returned to indicate completion. Setting the pointer to NULL is only important in the XDR_DECODE case, since it is already NULL in the XDR_ENCODE and XDR_FREE cases.

Next, if the direction is XDR_FREE, the value of nextp is set to indicate the location of the next pointer in the list. This must be done now because gnp has to be dereferenced to find the location of the next item in the list, and after the next statement the storage pointed to by gnp is deallocated and is no longer valid. This cannot be done for all directions because, in the XDR_DECODE direction, the value of gnp is not set until the next statement.

Next, XDR operates on the data in the node through the primitive xdr_reference, which is like xdr_pointer (which was used before). However, xdr_reference does not send over the Boolean indicating whether there is more data; it is used instead of xdr_pointer because the routine has already used XDR on this information. Notice that the XDR routine passed is not for the same type as an element in the list. The routine passed is xdr_gnumbers, which uses XDR on a gnumbers structure; however, each element in the list is of type gnumbers_node. The routine xdr_gnumbers_node is not passed because it is recursive; instead, xdr_gnumbers is passed, which uses XDR on all of the non-recursive part. Note that this works only if the gn_numbers field is the first item in each element, so that the addresses are identical when passed to xdr_reference.

Next, gnp is updated to point to the next item in the list. If the direction is XDR_FREE, it is set to the previously saved value; otherwise, gnp is dereferenced to get the proper value. Although more difficult to understand than the recursive version, the non-recursive routine is much less likely to overflow the C stack. It also runs more efficiently because much of the procedure call overhead has been removed. Most lists are small, though (in the hundreds of items or fewer), and the recursive version should be sufficient for them.
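
To make the serialized form concrete, the following sketch encodes and decodes the same list iteratively without the XDR library, writing each Boolean and each long as one 4-byte big-endian word. It is an illustration under stated assumptions, not the library's method: the routines put_word, get_word, encode_list, and decode_list are hypothetical, and decode_list assumes well-formed input.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct gnumbers { long g_assets; long g_liabilities; };
struct gnumbers_node {
	struct gnumbers gn_numbers;
	struct gnumbers_node *gn_next;
};

/* Hypothetical helpers: store and load one 4-byte big-endian word. */
static void
put_word(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}

static uint32_t
get_word(const unsigned char *p)
{
	return (((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
}

/* Encode the list; returns bytes written, or 0 if buf is too small. */
size_t
encode_list(const struct gnumbers_node *n, unsigned char *buf, size_t len)
{
	size_t off = 0;

	assert(buf != NULL);
	for (; n != NULL; n = n->gn_next) {
		if (len - off < 16)	/* 12 for this node, 4 for the end */
			return (0);
		put_word(buf + off, 1);	/* TRUE: a node follows */
		put_word(buf + off + 4, (uint32_t)n->gn_numbers.g_assets);
		put_word(buf + off + 8, (uint32_t)n->gn_numbers.g_liabilities);
		off += 12;
	}
	if (len - off < 4)
		return (0);
	put_word(buf + off, 0);		/* FALSE: end of list */
	return (off + 4);
}

/* Decode a list produced by encode_list, allocating one node per entry. */
struct gnumbers_node *
decode_list(const unsigned char *buf)
{
	struct gnumbers_node *head = NULL;
	struct gnumbers_node **linkp = &head;	/* plays the role of gnp */

	assert(buf != NULL);
	while (get_word(buf) != 0) {
		struct gnumbers_node *n = malloc(sizeof(*n));

		if (n == NULL)
			break;		/* out of memory: return what we have */
		n->gn_numbers.g_assets = (long)(int32_t)get_word(buf + 4);
		n->gn_numbers.g_liabilities = (long)(int32_t)get_word(buf + 8);
		n->gn_next = NULL;
		*linkp = n;		/* link the new node in */
		linkp = &n->gn_next;	/* advance to the next link */
		buf += 12;
	}
	return (head);
}
```

A two-node list therefore occupies seven words: TRUE, assets, liabilities, TRUE, assets, liabilities, FALSE. No link address appears in the serialized form.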