This appendix lists
and summarizes worldwide portability interfaces (WPI) that are defined by
Version 5 of the X/Open CAE specification for system interfaces and headers
(XSH).
All these interfaces support the wide-character data type.
Tables in
this appendix also list ISO C equivalents that are not recommended, if there
are any, for each WPI interface.
The reference pages (manpages) provide detailed
information for each interface.
Refer to
standards(5)
for information about
compiling a program in the appropriate definition environment for XSH Version
5.
Programs call the following function to use the appropriate locale (language, territory, and codeset) at run time:
| WPI Function | Description |
setlocale( ) |
Establishes localization data at run time. |
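A minimal sketch of the usual start-up call follows. The helper name init_locale is hypothetical, and the fallback to the "C" locale is an assumption about what a program might want, not a requirement of the WPI.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: bind the program to the locale named by the
 * environment. Returns the name of the locale that is in effect
 * after the call. */
const char *init_locale(void)
{
    /* An empty string asks setlocale() to consult the locale
     * environment variables (LC_ALL, LC_*, LANG). */
    const char *name = setlocale(LC_ALL, "");

    if (name == NULL) {
        /* The requested locale is not installed; fall back to the
         * portable "C" locale (an illustrative choice). */
        name = setlocale(LC_ALL, "C");
    }
    return name;
}
```

A program would typically call init_locale() once, before any other WPI function whose behavior depends on the locale.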
The following character classification
functions classify wide-character values according to the codeset defined
in the locale category
LC_CTYPE.
| WPI Function | Equivalent in ISO C | Description |
iswalnum( ) |
isalnum( ) |
Tests if a character is alphanumeric. |
iswalpha( ) |
isalpha( ) |
Tests if a character is alphabetic. |
iswcntrl( ) |
iscntrl( ) |
Tests if a character is a control character. |
iswdigit( ) |
isdigit( ) |
Tests if a character is a decimal digit in the portable character set. |
iswgraph( ) |
isgraph( ) |
Tests if a character is a graphic character. |
iswlower( ) |
islower( ) |
Tests if a character is lowercase. |
iswprint( ) |
isprint( ) |
Tests if a character is a printing character. |
iswpunct( ) |
ispunct( ) |
Tests if a character is a punctuation mark. |
iswspace( ) |
isspace( ) |
Tests if a character determines white space in displayed text. |
iswupper( ) |
isupper( ) |
Tests if a character is uppercase. |
iswxdigit( ) |
isxdigit( ) |
Tests if a character is a hexadecimal digit in the portable character set. |
In addition to the functions for each character classification, the WPI includes two functions that provide a common interface to all classification categories:
wctype( )
Returns a value that corresponds to a character classification
iswctype( )
Tests if a wide character has a certain property
The 11 WPI functions listed in the preceding table can be replaced by
calls to the
wctype( )
and
iswctype( )
functions as shown in the following table:
| Call Using Classification Function | Equivalent Call Using wctype( ) and iswctype( ) |
iswalnum(wc) |
iswctype(wc, wctype("alnum")) |
iswalpha(wc) |
iswctype(wc, wctype("alpha")) |
iswcntrl(wc) |
iswctype(wc, wctype("cntrl")) |
iswdigit(wc) |
iswctype(wc, wctype("digit")) |
iswgraph(wc) |
iswctype(wc, wctype("graph")) |
iswlower(wc) |
iswctype(wc, wctype("lower")) |
iswprint(wc) |
iswctype(wc, wctype("print")) |
iswpunct(wc) |
iswctype(wc, wctype("punct")) |
iswspace(wc) |
iswctype(wc, wctype("space")) |
iswupper(wc) |
iswctype(wc, wctype("upper")) |
iswxdigit(wc) |
iswctype(wc, wctype("xdigit")) |
In this table, the quoted literals in the
call to
wctype
are the character classes defined in the
X/Open UNIX standard for Western European and many Eastern European languages;
however, a locale can define other character classes.
The Unicode standard defines character classes that
do not have class-specific functions, and a locale for an Asian language might
define additional character classes to distinguish ideographic from phonetic
characters.
You must use the
wctype()
and
iswctype()
functions to test if a character belongs to a class when no class-specific
function exists for the test.
See
locale(4)
for details about character classes
and testing equivalence between classes defined in the XSH and the Unicode
standards.
Note
The calls in the second column of the preceding table illustrate only functional equivalence to the calls shown in the first column of the table. In most programming applications,
iswctype() needs to execute multiple times for each execution of wctype(). In such cases, you would code calls in the second column of the table as follows to achieve performance equivalence to corresponding calls in the first column:
wctype_t property_handle;
wint_t wc;
int yes_or_no;
.
.
.
property_handle=wctype("alnum");
.
.
.
while (...) {
.
.
.
yes_or_no=iswctype(wc, property_handle);
.
.
.
}
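The pattern in the preceding note can be fleshed out into a complete function. The following is a sketch only; the function name count_in_class and its return convention are assumptions made for illustration.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: count the wide characters in s that belong
 * to the character class named by class_name. Returns 0 if the
 * current locale does not define that class. */
size_t count_in_class(const wchar_t *s, const char *class_name)
{
    /* Look up the class once, outside the loop. */
    wctype_t property_handle = wctype(class_name);
    size_t count = 0;

    if (property_handle == 0)
        return 0;               /* unknown class in this locale */

    for (; *s != L'\0'; s++) {
        /* Test each character against the cached handle. */
        if (iswctype((wint_t)*s, property_handle))
            count++;
    }
    return count;
}
```

Because the class is passed by name, the same function also works for classes that have no class-specific function, provided the locale defines them.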
The following case conversion functions let
you switch the case of a wide character according to the codeset defined in
the locale category
LC_CTYPE:
| WPI Function | Equivalent in ISO C | Description |
towlower( ) |
tolower( ) |
Converts a character to lowercase. |
towupper( ) |
toupper( ) |
Converts a character to uppercase. |
The WPI also includes the following functions to map and convert a wide character according to properties defined in the current locale:
wctrans( )
Returns a value that corresponds to a conversion property defined in the current locale
towctrans( )
Converts a wide character according to a property defined in the current locale
Currently, the only properties defined in Tru64 UNIX locales are
toupper
and
tolower.
The following example of
using
wctrans( )
and
towctrans( )
performs the same conversion as
towupper( ):
wint_t from_wc, to_wc;
wctrans_t conv_handle;
.
.
.
conv_handle=wctrans("toupper");
.
.
.
while (...) {
.
.
.
to_wc=towctrans(from_wc,conv_handle);
.
.
.
}
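For reference, the fragment above might be completed as follows. This is a sketch under the assumption that the string is converted in place; the function name upcase_wide is hypothetical.

```c
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: convert s to uppercase in place.
 * Returns 0 on success, or -1 if the current locale does not
 * define the "toupper" property. */
int upcase_wide(wchar_t *s)
{
    /* Obtain the conversion handle once. */
    wctrans_t conv_handle = wctrans("toupper");

    if (conv_handle == 0)
        return -1;

    for (; *s != L'\0'; s++) {
        /* Characters without an uppercase form are returned unchanged. */
        *s = (wchar_t)towctrans((wint_t)*s, conv_handle);
    }
    return 0;
}
```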
The following WPI function compares wide-character strings
according to rules specified in the locale defined for the
LC_COLLATE
category:
| WPI Function | Equivalent in ISO C | Description |
wcscoll( ) |
strcoll( ) |
Collates character strings. |
You can also use the
wcsxfrm( )
and
wcscmp( ) functions, summarized in
Section A.11,
to transform and then compare wide-character strings.
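As an illustration, wcscoll( ) can serve as the comparison step of a sort. The sketch below assumes an array of pointers to wide-character strings; the names compare_wide and sort_wide_strings are hypothetical.

```c
#include <stdlib.h>
#include <wchar.h>

/* Comparison callback for qsort(): each argument points to an
 * array element, which is itself a pointer to a wide string. */
static int compare_wide(const void *a, const void *b)
{
    const wchar_t *const *lhs = a;
    const wchar_t *const *rhs = b;

    /* Order follows the LC_COLLATE category of the current locale. */
    return wcscoll(*lhs, *rhs);
}

/* Hypothetical helper: sort an array of wide-string pointers
 * according to the collation rules of the current locale. */
void sort_wide_strings(const wchar_t **strings, size_t count)
{
    qsort(strings, count, sizeof strings[0], compare_wide);
}
```

When the same strings are compared many times, transforming each one once with wcsxfrm( ) and comparing the results with wcscmp( ) may be cheaper than repeated wcscoll( ) calls.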
The following WPI functions allow programs to retrieve, according to locale setting, data that is language specific or country specific:
| WPI Function | Description |
nl_langinfo( ) |
A general-purpose function that retrieves language and cultural data according to the locale setting. |
strfmon( ) |
Formats a monetary value according to the locale setting. |
localeconv( ) |
Returns information used to format numeric values according to the locale setting. |
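The following sketch shows two of these functions in use. The wrapper names current_codeset and decimal_point are hypothetical; CODESET is the standard nl_langinfo( ) item for the codeset name.

```c
#include <langinfo.h>
#include <locale.h>

/* Hypothetical helper: name of the codeset used by the current
 * locale. The returned string may be overwritten by later calls. */
const char *current_codeset(void)
{
    return nl_langinfo(CODESET);
}

/* Hypothetical helper: the decimal-point string that the current
 * locale uses for nonmonetary numeric values. */
const char *decimal_point(void)
{
    const struct lconv *conventions = localeconv();

    return conventions->decimal_point;
}
```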
The
ctime( )
and
asctime( )
functions do not have the flexibility
needed for language independence.
The WPI therefore includes the following
interfaces to format date and time strings according to information provided
by the locale:
| WPI Function | Description |
strftime( ) |
Formats a date and time string based on the specified format string and according to the locale setting. |
wcsftime( ) |
Formats a date and time string based on a specified format string and according to the locale setting, then returns the result in a wide-character array. |
strptime( ) |
Converts a character string to a time value
according to a specified format string; reverses the operation performed by
strftime( ). |
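A short sketch of strftime( ) follows. The function name format_date and the particular format string are assumptions chosen for illustration; the weekday and month names in the output come from the LC_TIME category of the current locale.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: write a long-form date for *when into buf.
 * Returns the number of bytes written, not counting the
 * terminating null, or 0 if the result does not fit. */
size_t format_date(char *buf, size_t size, const struct tm *when)
{
    /* %A = full weekday name, %d = day of the month,
     * %B = full month name,   %Y = year with century. */
    return strftime(buf, size, "%A %d %B %Y", when);
}
```

An application that must also let the locale choose the order of the fields could use a conversion such as %x instead of a fixed sequence.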
The WPI extends definitions of the following ISO C functions to support internationalization requirements. The WPI extensions are described after the table that lists the functions.
| WPI/ISO C Function | Description |
fprintf( ) |
Prints formatted output to a file by using
a
vararg
parameter list. |
fwprintf( ) |
Prints formatted wide characters to the specified
output stream by using a
vararg
parameter list. |
printf( ) |
Prints formatted output to the standard output
stream by using a
vararg
parameter list. |
sprintf( ) |
Formats one or more values and writes the
output to a character string by using a
vararg
parameter
list. |
swprintf( ) |
Prints formatted wide characters to the specified
address by using a
vararg
parameter list. |
vfprintf( ) |
Prints formatted output to a file by using
a
stdarg
parameter list. |
vfwprintf( ) |
Prints formatted wide characters to the specified
output stream by using a
stdarg
parameter list. |
vprintf( ) |
Prints formatted output to the standard output
stream by using a
stdarg
parameter list. |
vsprintf( ) |
Formats a
stdarg
parameter
list and writes the output to a character string. |
vswprintf( ) |
Prints formatted wide characters to the specified
address by using a
stdarg
parameter list. |
vwprintf( ) |
Prints formatted wide characters to the standard
output by using a
stdarg
parameter list. |
wprintf( ) |
Prints formatted wide characters to the standard
output by using a
vararg
parameter list. |
fscanf( ) |
Converts formatted input from a file. |
fwscanf( ) |
Converts formatted wide characters from the specified input stream. |
scanf( ) |
Converts formatted input from the standard input stream. |
sscanf( ) |
Converts formatted data from a character string. |
swscanf( ) |
Converts formatted wide characters from the specified address. |
wscanf( ) |
Converts formatted wide characters from the standard input. |
The WPI extensions to the preceding functions include:
%digit$
conversion specifier, which allows variation in the ordinal position
of the argument being printed; such variation is frequently necessary when
text is translated into different languages.
Use of the decimal-point character as specified by the locale.
This feature affects
e,
E,
f,
g, and
G
conversions.
Use of the thousands-grouping character specified by the locale.
The
C
and
S
conversion
characters, which let you convert wide characters and wide-character strings,
respectively.
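To illustrate the %digit$ extension, the sketch below formats the same two arguments with whatever format string it is given, so that a translated format can reorder them. The function name format_file_error is hypothetical, and in a real application the format string would normally come from a message catalog.

```c
#include <stdio.h>

/* Hypothetical helper: format a file name and a line number by
 * using a caller-supplied format string. The format can refer to
 * the file name as %1$s and to the line number as %2$d, in either
 * order. Returns the value returned by snprintf(). */
int format_file_error(char *buf, size_t size, const char *fmt,
                      const char *file, int line)
{
    /* The argument order is fixed here; only fmt varies. */
    return snprintf(buf, size, fmt, file, line);
}
```

For example, the formats "%1$s:%2$d" and "line %2$d of %1$s" both work with the same call, which is what allows translators to change word order without changes to the code.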
The WPI adds the following functions to convert wide-character strings to various numeric formats:
| WPI Function | Equivalent in ISO C | Description |
wcstod( ) |
strtod( ) |
Converts the initial portion of a wide-character string to a double-precision floating-point number. |
wcstol( ) |
strtol( ) |
Converts the initial portion of a wide-character string to a long integer number. |
wcstoul( ) |
strtoul( ) |
Converts the initial portion of a wide-character string to an unsigned long integer number. |
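The following sketch shows one way to call wcstol( ) with error checking. The function name parse_wide_long and the decision to reject trailing characters are assumptions made for this example.

```c
#include <errno.h>
#include <wchar.h>

/* Hypothetical helper: convert the whole of text to a long.
 * Returns 0 and stores the result in *value on success.
 * Returns -1 if text contains no number, has trailing characters,
 * or holds a value that is out of range for a long. */
int parse_wide_long(const wchar_t *text, long *value)
{
    wchar_t *end;
    long result;

    errno = 0;                          /* so ERANGE can be detected */
    result = wcstol(text, &end, 10);    /* base 10 */

    if (end == text || *end != L'\0' || errno == ERANGE)
        return -1;

    *value = result;
    return 0;
}
```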
To allow an application to get data from or write data to external files (as multibyte data) and process it internally (as wide-character data), the WPI defines various functions to convert between multibyte data and wide-character data.
| WPI Function | Description |
btowc( ) |
Converts a single byte from multibyte-character format to wide-character format. |
mblen( ) |
Determines the number of bytes in a
character according to the locale setting.
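You should modify all string manipulation
statements that assume the size of a character is always 1 byte so that they call
this function.
For example, the following statement updates a pointer to the next character but is correct only when each character occupies 1 byte:
cp++;
The following
example incorporates the mblen( ) function to update the pointer correctly for multibyte characters:
cp += mblen(cp, MB_CUR_MAX);
|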
mbrlen() |
Performs the same operation as
mblen()
but can be restarted for use with locales that include shift-state
encoding.
[Footnote 4]
|
mbrtowc() |
Performs the same operation as
mbtowc()
but can be restarted for use with locales that include shift-state
encoding.
[Footnote 4] |
mbsrtowcs() |
Performs the same operation as
mbstowcs()
but can be restarted for use with locales that include
shift-state encoding.
[Footnote 4] |
mbstowcs( ) |
Converts a multibyte-character string to a wide-character string. |
mbtowc( ) |
Converts a multibyte character to a wide character. |
wcstombs( ) |
Converts a wide-character string to a multibyte-character string. |
wcrtomb() |
Performs the same operation as
wctomb()
but can be restarted for use with locales that include shift-state
encoding.
[Footnote 4] |
wcsrtombs() |
Performs the same operation as
wcstombs()
but can be restarted for use with locales that include
shift-state encoding.
[Footnote 4] |
wctob( ) |
Converts a wide character to a single byte in multibyte-character format, if possible. |
wctomb( ) |
Converts a wide character to a multibyte character. |
Note
You do not always need to explicitly handle the conversion to and from file code (multibyte data). Functions for printing and scanning text (discussed in Section A.7) include the
%S and %C format specifiers that automatically handle multibyte to wide-character conversion. The WPI alternatives for older ISO C input/output functions (see Section A.10) also perform multibyte/wide-character conversions automatically.
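When explicit conversion is needed, a restartable function such as mbrtowc( ) can step through a multibyte string one character at a time. The following is a sketch only; the function name count_characters and its error convention are assumptions.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: count the characters (not bytes) in the
 * multibyte string s, according to the current locale.
 * Returns (size_t)-1 if s contains an invalid or incomplete
 * multibyte sequence. */
size_t count_characters(const char *s)
{
    mbstate_t state;
    size_t remaining = strlen(s);
    size_t count = 0;

    memset(&state, 0, sizeof state);    /* initial conversion state */

    while (remaining > 0) {
        /* A null first argument discards the converted character;
         * only the byte length is needed here. */
        size_t len = mbrtowc(NULL, s, remaining, &state);

        if (len == (size_t)-1 || len == (size_t)-2)
            return (size_t)-1;
        if (len == 0)
            break;                      /* null character reached */

        s += len;
        remaining -= len;
        count++;
    }
    return count;
}
```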
The WPI functions listed in the following table automatically convert between file code (usually multibyte encoding) and process code (wide-character encoding) for text input and output operations:
| WPI Function | Equivalent in ISO C | Description |
fgetwc( ) |
fgetc( ) |
Gets a character from the input stream and converts it to a wide character. |
fgetws( ) |
fgets( ) |
Gets a character string from the input stream and converts it to a wide-character string. |
fputwc( ) |
fputc( ) |
Converts a wide character to a multibyte character and writes the result to an output stream. |
fputws( ) |
fputs( ) |
Converts a wide-character string to a multibyte character string and writes the result to an output stream. |
fwide() |
None | Sets stream orientation to byte or wide character. This function is not useful within current locale environments. [Footnote 5] |
getwc( ) |
getc( ) |
Gets a character from the input stream, which is passed to the function as an argument, and converts it to a wide character. |
getwchar( ) |
getchar( ) |
Gets a character from the standard input stream and converts it to a wide character. |
| None | gets( ) |
Use
fgetws( ). |
mbsinit() |
None | Determines, for locales that use shift-state encoding, whether a multibyte string is in the initial conversion state. [Footnote 5] |
putwc( ) |
putc( ) |
Converts a wide character to a multibyte character and writes the result to an output stream, which is passed to the function as an argument. |
putwchar( ) |
putchar( ) |
Converts a wide character to a multibyte character and writes the result to the standard output stream. |
| None | puts( ) |
Use
fputws( ). |
ungetwc( ) |
ungetc( ) |
Pushes a wide character back onto the input stream. |
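As a sketch of how these functions fit together, the following routine copies one stream to another by using process code internally. The function name copy_wide is hypothetical.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: copy all characters from in to out.
 * fgetwc() converts file code to wide characters on input, and
 * fputwc() converts back on output. Returns the number of wide
 * characters copied, or -1 if a read or write error occurs. */
long copy_wide(FILE *in, FILE *out)
{
    long copied = 0;
    wint_t wc;

    while ((wc = fgetwc(in)) != WEOF) {
        if (fputwc((wchar_t)wc, out) == WEOF)
            return -1;
        copied++;
    }
    /* WEOF is returned both at end of file and on error. */
    return ferror(in) ? -1 : copied;
}
```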
The WPI defines alternatives and additions to ISO C string-handling functions to support manipulation of wide-character strings. The WPI functions support both single-byte and multibyte characters.
| WPI Function | Equivalent in ISO C | Description |
wcscat( ) |
strcat( ) |
Appends a copy of a string to the end of another string. |
wcsncat( ) |
strncat( ) |
Similar to
wcscat( )
except that the number of characters to be appended is limited by the
n
parameter. |
String Searching:
| WPI Function | Equivalent in ISO C | Description |
wcschr( ) |
strchr( ) |
Locates the first occurrence of a wide character in a wide-character string. |
wcsrchr( ) |
strrchr( ) |
Locates the last occurrence of a wide character in a wide-character string. |
wcspbrk( ) |
strpbrk( ) |
Locates the first occurrence of any wide characters from one wide-character string in another wide-character string. |
wcsstr( ) |
strstr( ) |
Finds a wide-character substring.
Note that
the
wcsstr()
function also supersedes the
wcswcs()
function included in versions of the XSH specification earlier
than Issue 5. |
wcscspn( ) |
strcspn( ) |
Returns the number of initial elements of one wide-character string that are all wide characters not included in the second wide-character string. |
wcsspn( ) |
strspn( ) |
Returns the number of initial elements of one wide-character string that are all characters included in the second wide-character string. |
| WPI Function | Equivalent in ISO C | Description |
wcscpy( ) |
strcpy( ) |
Copies a wide-character string. |
wcsncpy( ) |
strncpy( ) |
Similar to
wcscpy( )
except that the number of wide characters to be copied is limited by the
n
parameter. |
| WPI Function | Equivalent in ISO C | Description |
wcscmp( ) |
strcmp( ) |
Compares two wide-character strings. |
wcsncmp( ) |
strncmp( ) |
Similar to
wcscmp( )
except that the number of wide characters to be compared is limited by the
n
parameter. |
| WPI Function | Equivalent in ISO C | Description |
wcslen( ) |
strlen( ) |
Determines the number of wide characters in a wide-character string. |
| WPI Function | Equivalent in ISO C | Description |
wcstok( ) |
strtok( ) |
Decomposes a wide-character string into a series of tokens, each delimited by a wide character from another wide-character string. |
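The following sketch uses the three-argument form of wcstok( ) that is defined by ISO C and XSH Version 5. The function name count_tokens is hypothetical. Note that wcstok( ) modifies the string it scans.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: count the tokens in text, where tokens are
 * separated by any of the wide characters in delims. The contents
 * of text are modified by the call. */
size_t count_tokens(wchar_t *text, const wchar_t *delims)
{
    wchar_t *state;     /* position saved between calls */
    wchar_t *token;
    size_t count = 0;

    for (token = wcstok(text, delims, &state);
         token != NULL;
         token = wcstok(NULL, delims, &state)) {
        count++;
    }
    return count;
}
```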
Printing Position Determination:
| WPI Function | Equivalent in ISO C | Description |
wcswidth( ) |
None | Determines the number of printing positions required for a number of wide characters in a wide-character string. |
wcwidth( ) |
None | Determines the number of printing positions required for a wide character. |
Performing Memory Operations on Wide-Character Strings:
| WPI Function | Equivalent in ISO C | Description |
wmemcpy( ) |
memcpy( ) |
Copies wide characters from one buffer to another. |
wmemchr( ) |
memchr( ) |
Searches a buffer for the specified wide character. |
wmemcmp( ) |
memcmp( ) |
Compares the specified number of wide characters in two buffers. |
wmemmove( ) |
memmove( ) |
Copies wide characters from one buffer to another in a nondestructive manner. |
wmemset( ) |
memset( ) |
Copies the specified wide character into the specified number of locations in a destination buffer. |
The WPI provides codeset conversion capabilities through a set
of functions for program use or the
iconv
command for interactive
use.
You specify for these interfaces the source and target codesets and the
name of a language text file to be converted.
The codesets define a conversion
stream through which the language text is passed.
The following table summarizes the three functions you use for
codeset
conversion.
These functions reside in the
libiconv.a
library.
| WPI Function | Equivalent in ISO C | Description |
iconv_open( ) |
None | Initializes a conversion stream by identifying the source and the target codesets. |
iconv_close( ) |
None | Closes the conversion stream. |
iconv( ) |
None | Converts an input string encoded in the source codeset to an output string encoded in the target codeset. |
Refer to
Section 6.13
for a description of the
iconv
command and the types of conversions that are supported.
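The three functions are typically used together, as in the following sketch. The function name convert_codeset and its buffer handling are assumptions made for illustration, and the codeset names that iconv_open( ) accepts depend on the converters installed on the system.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert the null-terminated string input
 * from codeset from to codeset to. The result is written to
 * output and null-terminated. Returns the number of bytes written,
 * not counting the terminator, or (size_t)-1 on any failure. */
size_t convert_codeset(const char *from, const char *to,
                       const char *input,
                       char *output, size_t output_size)
{
    iconv_t cd;
    char *in = (char *)input;   /* iconv() takes a non-const pointer */
    char *out = output;
    size_t in_left = strlen(input);
    size_t out_left;
    size_t rc;

    if (output_size == 0)
        return (size_t)-1;
    out_left = output_size - 1; /* reserve room for the terminator */

    /* Note the argument order: target codeset first, then source. */
    cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    rc = iconv(cd, &in, &in_left, &out, &out_left);
    if (rc != (size_t)-1) {
        /* Flush any pending shift sequence for stateful codesets. */
        rc = iconv(cd, NULL, NULL, &out, &out_left);
    }
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;

    *out = '\0';
    return (size_t)(out - output);
}
```

A production version would also handle input that is larger than the output buffer by calling iconv( ) repeatedly; this sketch simply reports failure in that case.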