
Code Set

A code set is a mapping of the members of a character set to specific numeric code values. Different code sets use different numeric code values to represent the same character. In general, operating systems use string names to refer to the code sets that the system supports. It is common for different operating systems to use different string names to refer to the same code set.

Distributed applications that run in a network of heterogeneous operating systems need to identify the character sets and code sets that client and server machines use, so that no data is lost when they communicate. DCE RPC supports transparent, automatic conversion for characters that are members of the DCE Portable Character Set (DCE PCS) and are encoded in the ASCII or U.S. EBCDIC code sets: when such characters are passed over the network between client and server, the RPC runtime converts them if necessary.

DCE RPC applications that need to transfer international character data (character data that is outside the DCE PCS character set and the ASCII and U.S. EBCDIC encodings) can use special IDL constructs and a set of DCE RPC routines to pass this data between client and server applications with minimal or no loss. An example is an application that uses European, Chinese, or Japanese characters mapped to EUC, Big5, or SJIS encodings. Together, the IDL constructs and the DCE RPC routines provide a method of automatic code set conversion for applications that transfer international character data in heterogeneous code set environments.

DCE provides a mechanism to uniquely identify a code set: the code set registry. The code set registry assigns a unique identifier to each character set and code set. Because these identifiers are consistent across a network of heterogeneous operating systems, clients and servers can use them to identify code sets without relying on operating system-specific string names.
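The idea of resolving platform-specific names to one shared identifier can be sketched as follows. This is a minimal illustration and not the DCE implementation: the table, the type sketch_rgy_entry_t, the function sketch_name_to_id( ), the platform names, and the identifier values are all assumptions made for the example, and uint32_t stands in for the DCE unsigned32 type. Real identifiers come from the code set registry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical registry entry: one identifier and the string names that two
 * made-up platforms use for the same code set. All values are placeholders. */
typedef struct {
    uint32_t    id;         /* stand-in for a registered code set value */
    const char *name_os_a;  /* name on hypothetical platform A */
    const char *name_os_b;  /* name on hypothetical platform B */
} sketch_rgy_entry_t;

static const sketch_rgy_entry_t sketch_registry[] = {
    { 0x00010001u, "ISO8859-1", "iso88591" },
    { 0x00030010u, "eucJP",     "japanese-euc" },
};

/* Looks up a platform-specific name. Returns 1 and stores the identifier in
 * *id when the name is known on either platform, otherwise returns 0. */
int sketch_name_to_id(const char *name, uint32_t *id)
{
    size_t i;
    size_t n = sizeof sketch_registry / sizeof sketch_registry[0];

    assert(name != NULL && id != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(name, sketch_registry[i].name_os_a) == 0 ||
            strcmp(name, sketch_registry[i].name_os_b) == 0) {
            *id = sketch_registry[i].id;
            return 1;
        }
    }
    return 0;
}
```

With this table, a client that knows the code set as "ISO8859-1" and a server that knows it as "iso88591" both arrive at the same numeric identifier, which is the property the registry is meant to provide.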

The code set data structure contains a 32-bit hexadecimal value (c_set) that uniquely identifies the code set, followed by a 16-bit decimal value (c_max_bytes) that indicates the maximum number of bytes the code set uses to encode one character.

The value for c_set is one of the registered values in the code set registry.

The following routines require a code set value:

· cs_byte_from_netcs( )

· cs_byte_local_size( )

· cs_byte_net_size( )

· cs_byte_to_netcs( )

· dce_cs_loc_to_rgy( )

· dce_cs_rgy_to_loc( )

· rpc_cs_get_tags( )

· rpc_cs_binding_set_tags( )

· rpc_rgy_get_max_bytes( )

· wchar_t_from_netcs( )

· wchar_t_local_size( )

· wchar_t_net_size( )

· wchar_t_to_netcs( )

In these routines, the code set value has the data type unsigned32.

The RPC stub buffer sizing routines *_net_size( ) and *_local_size( ) use the value of c_max_bytes to calculate the size of a buffer for code set conversion.

The C language representation of a code set structure is as follows:

typedef struct {
    long c_set;          /* registered code set identifier */
    short c_max_bytes;   /* maximum number of bytes per character */
} rpc_cs_c_set_t;
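The role of c_max_bytes in buffer sizing can be illustrated with a worst-case calculation. This is a sketch under assumptions, not the logic of the actual *_net_size( ) or *_local_size( ) routines: the helper sketch_max_buffer_size( ) is hypothetical, and it simply multiplies the character count by c_max_bytes without checking for overflow.

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as the code set structure shown in this section. */
typedef struct {
    long  c_set;
    short c_max_bytes;
} rpc_cs_c_set_t;

/* Hypothetical helper: worst-case number of bytes needed to hold n_chars
 * characters encoded in the code set described by cs. */
size_t sketch_max_buffer_size(const rpc_cs_c_set_t *cs, size_t n_chars)
{
    assert(cs != NULL && cs->c_max_bytes > 0);
    return n_chars * (size_t)cs->c_max_bytes;
}
```

For a code set whose c_max_bytes is 3, a string of 10 characters would need at most 30 bytes under this estimate. The c_set value used with the helper does not affect the result.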

The code set data structure is a member of the code sets array.
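As a rough illustration of how code set structures might be held in an array and searched by identifier, consider the sketch below. The container type sketch_cs_array_t and the function sketch_find_code_set( ) are hypothetical and simplified; they are not the layout of the DCE code sets array, and the identifier values used with them are placeholders.

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as the code set structure shown in this section. */
typedef struct {
    long  c_set;
    short c_max_bytes;
} rpc_cs_c_set_t;

/* Hypothetical, simplified container: a count plus a pointer to entries. */
typedef struct {
    size_t                count;
    const rpc_cs_c_set_t *codesets;
} sketch_cs_array_t;

/* Returns a pointer to the entry whose c_set matches, or NULL if the
 * array does not contain that code set. */
const rpc_cs_c_set_t *sketch_find_code_set(const sketch_cs_array_t *arr,
                                           long c_set)
{
    size_t i;

    assert(arr != NULL);
    for (i = 0; i < arr->count; i++) {
        if (arr->codesets[i].c_set == c_set) {
            return &arr->codesets[i];
        }
    }
    return NULL;
}
```

A caller could use such a lookup to check whether a peer supports a given code set and, if it does, to read the c_max_bytes value needed for buffer sizing.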