dechanyu(5)
NAME
dechanyu - A character encoding system (codeset) for Traditional Chinese
DESCRIPTION
The DEC Hanyu (dechanyu) codeset consists of the following sets of
characters:
· ASCII
· The first and second character planes of CNS 11643-1986
· Digital Taiwan Supplemental Character Set (DTSCS)
· User-defined characters
DEC Hanyu uses a combination of single-byte data, 2-byte data, and 4-byte
data to represent ASCII characters, symbols, or ideographic characters.
ASCII characters
All ASCII characters are represented in the form of single-byte, 7-bit data
in DEC Hanyu; that is, the most significant bit (MSB) of a byte that
represents an ASCII character is always off. Refer to ascii(5) for more
information about the ASCII character set.
CNS 11643-1986 Characters (Planes 1 and 2)
Each plane of the CNS 11643-1986 character set is divided into 94 rows and
each of these rows has 94 columns. The characters defined in plane 1 and
plane 2 of CNS 11643-1986 are as follows:
______________________________________________________________________
Character Plane   Character Type                    Number of Characters
______________________________________________________________________
1                 Special characters                651
                  Control characters                33
                  Frequently used characters        5401
2                 Less frequently used characters   7650
______________________________________________________________________
Note that the first two planes of the CNS 11643-1986 character set are the
same as those specified for the revised CNS 11643-1992 character set.
In DEC Hanyu, each CNS 11643-1986 character is represented by two bytes, in
conformance with the CNS 11643-1986 standard. The MSB of the first byte is
always on. The MSB of the second byte is on for the first character plane
and off for the second character plane.
The first byte of CNS 11643-1986 encoding determines the row number of the
character, while the second byte determines its column number. Code ranges
for the two character planes are as follows:
Plane 1   A1A1 to FEFE
Plane 2   A121 to FE7E
The following formulas determine the value of a CNS 11643-1986 character in
relation to its row and column numbers.
· For a CNS 11643-1986 Plane 1 character:
1st byte = A0(hex) + Row number
2nd byte = A0(hex) + Column number
· For a CNS 11643-1986 Plane 2 character:
1st byte = A0(hex) + Row number
2nd byte = 20(hex) + Column number
For example, if a character is positioned at the first column of the 36th
row on CNS 11643 plane 1, its value is C4A1, which is calculated as
follows:
1st byte = A0(hex) + 36 = C4(hex)
2nd byte = A0(hex) + 01 = A1(hex)
Similarly, if a character is positioned at the first column of the 36th row
on CNS 11643 plane 2, its value is C421, which is calculated as follows:
1st byte = A0(hex) + 36 = C4(hex)
2nd byte = 20(hex) + 01 = 21(hex)
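The formulas above can be sketched in Python as follows. This is an illustrative sketch only; the function name encode_cns is hypothetical and is not part of any operating system interface. Row and column numbers are decimal values in the range 1 to 94.

```python
def encode_cns(plane: int, row: int, column: int) -> bytes:
    """Return the 2-byte DEC Hanyu value for a CNS 11643-1986 position.

    encode_cns is a hypothetical helper used only for illustration.
    """
    if plane not in (1, 2):
        raise ValueError("plane must be 1 or 2")
    if not (1 <= row <= 94 and 1 <= column <= 94):
        raise ValueError("row and column must be in the range 1 to 94")
    # 1st byte = A0(hex) + row number, for both planes
    first = 0xA0 + row
    # 2nd byte = A0(hex) + column for plane 1, 20(hex) + column for plane 2
    second = (0xA0 if plane == 1 else 0x20) + column
    return bytes([first, second])


print(encode_cns(1, 36, 1).hex().upper())  # C4A1
print(encode_cns(2, 36, 1).hex().upper())  # C421
```

The two calls reproduce the worked examples for row 36, column 1 on plane 1 and plane 2.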
DTSCS Characters
Currently, only the EDPC (Electronic Data Processing Centre) Recommended
Character Set, which defines a total of 6319 characters (rows 1 to 68), is
included in the Digital Taiwan Supplemental Character Set (DTSCS). In the
revised CNS 11643-1992 standard, the 6319 characters in the EDPC
Recommended Character Set are assigned to the third and fourth character
planes as follows:
________________________________________________________
EDPC Characters Character Plane Number of Characters
________________________________________________________
Part I Plane 3 6148
Part II Plane 4 171
________________________________________________________
The characters defined in Plane 3 and Plane 4 of CNS 11643-1992 are as
follows:
_________________________________________________________________________
Character Plane   Character Type                         Number of Characters
_________________________________________________________________________
3                 Rarely-used characters (EDPC Part I)   6148
4                 Used for residency system, ISO 2nd     7298
                  edition DIS 10646 Han characters
                  EDPC Part II Characters                171
_________________________________________________________________________
In DEC Hanyu, each DTSCS character is represented by a 4-byte value. The
first two bytes are the leading value, specifically C2CB, which is used as
a designator sequence for the DTSCS character set. The MSB of both the third
and fourth bytes is on for the EDPC Recommended Character Set.
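The byte-level rules described so far (single-byte ASCII, 2-byte CNS 11643-1986 values, and 4-byte DTSCS values led by C2CB) can be sketched as a simple splitter. This is a minimal sketch under stated assumptions: the name split_units and the unit labels are hypothetical, any occurrence of C2CB at a unit boundary is treated as the DTSCS designator, and the code does not check that each byte falls within the documented code ranges.

```python
DTSCS_LEADER = b"\xc2\xcb"  # designator sequence for DTSCS characters


def split_units(data: bytes) -> list[tuple[str, bytes]]:
    """Split DEC Hanyu data into (kind, raw_bytes) units.

    split_units is a hypothetical helper used only for illustration.
    """
    units = []
    i = 0
    while i < len(data):
        if data[i] < 0x80:
            kind, size = "ascii", 1   # MSB off: single-byte ASCII
        elif data[i:i + 2] == DTSCS_LEADER:
            kind, size = "dtscs", 4   # C2CB followed by two more bytes
        else:
            kind, size = "cns", 2     # MSB on: 2-byte CNS 11643-1986 value
        unit = data[i:i + size]
        if len(unit) < size:
            raise ValueError(f"truncated {kind} unit at offset {i}")
        if kind == "cns":
            # MSB of the second byte: on for plane 1, off for plane 2
            kind = "cns-plane1" if unit[1] & 0x80 else "cns-plane2"
        elif kind == "dtscs" and not (unit[2] & 0x80 and unit[3] & 0x80):
            raise ValueError(f"unexpected DTSCS trailing bytes at offset {i}")
        units.append((kind, unit))
        i += size
    return units


# The trailing bytes A1A1 after C2CB are placeholder values for illustration.
sample = b"A" + bytes.fromhex("C4A1 C421 C2CBA1A1")
for kind, raw in split_units(sample):
    print(kind, raw.hex().upper())
```

Because user-defined characters share the 2-byte encoding, this sketch reports them as plane 1 or plane 2 units.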
User-Defined Characters
In addition to the two Chinese character sets described in preceding
sections, DEC Hanyu provides an area of 3587 positions for user-defined
characters (UDC). The positions for UDC are those positions that are unused
(but not reserved) code points on the first and second character planes of
CNS 11643-1986.
The encoding for UDC is exactly the same as that for CNS 11643-1986 except
that the two sets of characters occupy different regions. Code ranges for
UDC are as follows:
______________________________________________
Character Plane Number of UDC Code Range
______________________________________________
1 145 FDCC to FEFE
1 2256 AAA1 to C1FE
2 1186 F245 to FE7E
______________________________________________
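As a cross-check of the table above, the following sketch tests whether a 2-byte value falls in a UDC region and counts the positions in each region. The names UDC_RANGES, udc_plane, and count_positions are hypothetical. The sketch assumes that each range covers only valid second-byte values for its plane (A1 to FE for plane 1, 21 to 7E for plane 2).

```python
# (plane, first code, last code) taken from the UDC code range table
UDC_RANGES = [
    (1, 0xFDCC, 0xFEFE),
    (1, 0xAAA1, 0xC1FE),
    (2, 0xF245, 0xFE7E),
]


def column_bounds(plane: int) -> tuple[int, int]:
    """Valid second-byte values: A1-FE for plane 1, 21-7E for plane 2."""
    return (0xA1, 0xFE) if plane == 1 else (0x21, 0x7E)


def udc_plane(code: bytes):
    """Return 1 or 2 if the 2-byte value is a UDC position, else None."""
    if len(code) != 2:
        raise ValueError("expected a 2-byte value")
    first, second = code
    value = (first << 8) | second
    for plane, start, end in UDC_RANGES:
        lo, hi = column_bounds(plane)
        if lo <= second <= hi and start <= value <= end:
            return plane
    return None


def count_positions(plane: int, start: int, end: int) -> int:
    """Count the code points in a range, skipping invalid second bytes."""
    lo, hi = column_bounds(plane)
    return sum(
        1
        for first in range(start >> 8, (end >> 8) + 1)
        for second in range(lo, hi + 1)
        if start <= ((first << 8) | second) <= end
    )


print([count_positions(*r) for r in UDC_RANGES])     # [145, 2256, 1186]
print(sum(count_positions(*r) for r in UDC_RANGES))  # 3587
```

The per-range counts match the table, and their sum matches the 3587 positions stated for the UDC area.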
Codeset Conversion
The following codeset converter pairs are available for converting
Traditional Chinese characters between dechanyu and other encoding formats.
Refer to iconv_intro(5) for an introduction to codeset conversion. For more
information about the other codeset for which dechanyu is the input or
output, see the reference page specified in the list item.
· big5_dechanyu, dechanyu_big5
Converting from and to the Big-5 codeset: big5(5).
Note that Big-5 encoding is equivalent to the Microsoft code-page
format used on PCs for Traditional Chinese. See code_page(5) for
information about PC code pages.
· dechanzi_dechanyu, dechanyu_dechanzi
Converting from and to the DEC Hanzi codeset: dechanzi(5).
· eucTW_dechanyu, dechanyu_eucTW
Converting from and to Taiwanese Extended UNIX Code: eucTW(5).
· telecode_dechanyu, dechanyu_telecode
Converting from and to the Telecode codeset: telecode(5).
· UTF-16_dechanyu, dechanyu_UTF-16
Converting from and to UTF-16 format: Unicode(5).
· UCS-4_dechanyu, dechanyu_UCS-4
Converting from and to UCS-4 format: Unicode(5).
· UTF-8_dechanyu, dechanyu_UTF-8
Converting from and to UTF-8 format: Unicode(5).
Fonts for DEC Hanyu Characters
The operating system provides both screen and printer fonts for DEC Hanyu
characters.
The following DECwindows Motif fonts are grouped according to character set
and family; they reflect various sizes and typefaces for 75dpi and 100dpi
display devices:
CNS 11643-1986 fonts (Hei family):
-adecw-hei-medium-r-normal--16-160-75-75-m-160-dec.cns11643.1986-2
-adecw-hei-medium-r-normal--24-240-75-75-m-240-dec.cns11643.1986-2
-adecw-hei-medium-r-normal--16-160-100-100-m-160-dec.cns11643.1986-2
-adecw-hei-medium-r-normal--24-240-100-100-m-240-dec.cns11643.1986-2
CNS 11643-1986 fonts (Screen family):
-adecw-screen-medium-r-normal--18-180-75-75-m-160-dec.cns11643.1986-2
-adecw-screen-medium-r-normal--24-240-75-75-m-240-dec.cns11643.1986-2
-adecw-screen-medium-r-normal--18-180-100-100-m-160-dec.cns11643.1986-2
-adecw-screen-medium-r-normal--24-240-100-100-m-240-dec.cns11643.1986-2
-adecw-screen-medium-r-normal--18-180-100-100-m-160-dec.cns11643.1986-UDC
-adecw-screen-medium-r-normal--24-240-100-100-m-240-dec.cns11643.1986-UDC
CNS 11643-1986 fonts (Sung family):
-adecw-sung-medium-r-normal--24-240-75-75-m-240-dec.cns11643.1986-2
-adecw-sung-medium-r-normal--32-320-75-75-m-320-dec.cns11643.1986-2
-adecw-sung-medium-r-normal--24-240-100-100-m-240-dec.cns11643.1986-2
-adecw-sung-medium-r-normal--32-320-100-100-m-320-dec.cns11643.1986-2
DTSCS fonts (Hei family):
-adecw-hei-medium-r-normal--16-160-75-75-m-160-dec.dtscs.1990-2
-adecw-hei-medium-r-normal--24-240-75-75-m-240-dec.dtscs.1990-2
-adecw-hei-medium-r-normal--16-160-100-100-m-160-dec.dtscs.1990-2
-adecw-hei-medium-r-normal--24-240-100-100-m-240-dec.dtscs.1990-2
DTSCS fonts (Screen family):
-adecw-screen-medium-r-normal--18-180-75-75-m-160-dec.dtscs.1990-2
-adecw-screen-medium-r-normal--24-240-75-75-m-240-dec.dtscs.1990-2
-adecw-screen-medium-r-normal--18-180-100-100-m-160-dec.dtscs.1990-2
-adecw-screen-medium-r-normal--24-240-100-100-m-240-dec.dtscs.1990-2
DTSCS fonts (Sung family):
-adecw-sung-medium-r-normal--24-240-75-75-m-240-dec.dtscs.1990-2
-adecw-sung-medium-r-normal--32-320-75-75-m-320-dec.dtscs.1990-2
-adecw-sung-medium-r-normal--24-240-100-100-m-240-dec.dtscs.1990-2
-adecw-sung-medium-r-normal--32-320-100-100-m-320-dec.dtscs.1990-2
The operating system provides the following PostScript printer fonts for
CNS 11643-1986 characters:
· Hei-Light-CNS11643
· Sung-Light-CNS11643
These PostScript fonts support only the Traditional Chinese characters in
planes 1 and 2 of the CNS 11643 character set. The Traditional Chinese
characters in the DTSCS character set are not supported by printer fonts.
The restriction also applies to the eucTW codeset, which also includes
DTSCS characters and is supported by the same fonts as dechanyu.
For general information on printing Asian language text, refer to
i18n_printing(5).
SEE ALSO
Commands: locale(1)
Others: ascii(5), big5(5), Chinese(5), code_page(5), dechanzi(5), eucTW(5),
GBK(5), i18n_intro(5), i18n_printing(5), iconv_intro(5), l10n_intro(5),
sbig5(5), telecode(5)