dechanzi(5)
NAME
dechanzi - A character encoding system (codeset) for Simplified Chinese
DESCRIPTION
The DEC Hanzi (dechanzi) codeset consists of the following character sets:
· ASCII
· GB2312-80
· Extended GB
DEC Hanzi uses a 2-byte data representation for symbols and ideographic
characters that are defined in GB2312-80.
ASCII Characters
All ASCII characters are represented in the form of single-byte, 7-bit data
in the DEC Hanzi codeset; that is, the most significant bit (MSB) of the
byte that represents an ASCII character is always off (0). For more
information on ASCII characters, refer to ascii(5).
GB2312-80 Characters
The code table for GB2312-80 characters is divided into 94 rows (Qu),
numbered from 1 to 94. Each row has 94 columns (Wei), also numbered from 1
to 94. The code table defines a total of 7445 characters, of which 6763 are
Chinese characters. The characters are grouped as follows:
· Graphic symbols
There are 682 graphic symbols, which occupy rows 1 to 9 in the code
table.
· Frequently used (Level 1) characters
There are 3755 frequently used characters, which occupy rows 16 to 55
in the code table.
· Less frequently used (Level 2) characters
There are 3008 less frequently used characters, which occupy rows 56 to
87 in the code table.
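The row ranges above can be expressed as a small lookup. The following Python sketch is illustrative only: the function name gb2312_group and the labels it returns are assumptions made for this example and are not part of any dechanzi interface. Rows that fall outside the three listed ranges are reported as unassigned.

```python
def gb2312_group(row):
    """Return the GB2312-80 character group for a row (Qu) number.

    The ranges follow the grouping listed above; the label strings
    are hypothetical names chosen for this sketch.
    """
    if not 1 <= row <= 94:
        raise ValueError("row must be in the range 1 to 94")
    if 1 <= row <= 9:
        return "graphic symbols"
    if 16 <= row <= 55:
        return "level 1"
    if 56 <= row <= 87:
        return "level 2"
    # Rows 10 to 15 and 88 to 94 are not listed in any group.
    return "unassigned"


# The three group sizes quoted above add up to the stated total.
GROUP_SIZES = {"graphic symbols": 682, "level 1": 3755, "level 2": 3008}
TOTAL_CHARACTERS = sum(GROUP_SIZES.values())
```

For example, gb2312_group(16) returns "level 1", and TOTAL_CHARACTERS evaluates to 7445.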
To differentiate GB2312-80 character codes from ASCII and Extended GB
character codes, the most significant bit (MSB) is set on in both the first
byte and the second byte. The following formulas show how to calculate
the value for a GB2312-80 character from its row and column numbers:
1st byte = A0(hex) + Row number
2nd byte = A0(hex) + Column number
For example, if a GB2312-80 character is in the first column of the 16th
row, the character's value is B0A1, which is calculated as follows:
1st byte = A0(hex) + 16 = B0(hex)
2nd byte = A0(hex) + 01 = A1(hex)
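The formulas above translate directly into code. The following Python sketch is a minimal illustration; the function names gb2312_encode and gb2312_decode are assumptions made for this example and do not correspond to any system library routine.

```python
GB2312_OFFSET = 0xA0  # added to both the row and the column number


def gb2312_encode(row, col):
    """Return the 2-byte DEC Hanzi code for a GB2312-80 row and column."""
    if not (1 <= row <= 94 and 1 <= col <= 94):
        raise ValueError("row and column must be in the range 1 to 94")
    return bytes([GB2312_OFFSET + row, GB2312_OFFSET + col])


def gb2312_decode(code):
    """Return the (row, column) pair for a 2-byte GB2312-80 code."""
    if len(code) != 2 or code[0] < 0xA1 or code[1] < 0xA1:
        raise ValueError("not a 2-byte code with both MSBs set on")
    return code[0] - GB2312_OFFSET, code[1] - GB2312_OFFSET
```

With these definitions, gb2312_encode(16, 1).hex().upper() yields "B0A1", which matches the worked example, and gb2312_decode(bytes.fromhex("B0A1")) returns (16, 1).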
Extended GB Characters
The Extended GB code table is similar to the GB2312-80 code table and is
divided into 94 rows and 94 columns (8836 code points). Unlike the
GB2312-80 code table, the Extended GB code table provides code points for
user-defined characters (UDC). The 8836 code points in this table are
divided into two areas:
· User-defined area
This area spans rows 1 to 87 and provides 8178 code points.
· User-defined (reserved) area
This area spans rows 88 to 94 and provides 658 code points. This area
is where users can define special and long-lasting user-defined
characters.
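The code point counts for the two areas follow from the row ranges, since every row has 94 columns. The following Python sketch only checks that arithmetic; the names used are assumptions made for this example.

```python
COLUMNS_PER_ROW = 94


def area_code_points(first_row, last_row):
    """Return the number of code points in an inclusive range of rows."""
    return (last_row - first_row + 1) * COLUMNS_PER_ROW


user_defined = area_code_points(1, 87)   # 87 rows * 94 columns
reserved = area_code_points(88, 94)      # 7 rows * 94 columns
whole_table = area_code_points(1, 94)    # 94 rows * 94 columns
```

Here user_defined is 8178, reserved is 658, and their sum equals whole_table, which is 8836.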
To differentiate Extended GB codes from ASCII codes and GB2312-80 codes,
the most significant bit (MSB) of the first byte is set on while that of
the second byte is set off. The following formulas show how the code value
of an Extended GB character is calculated from its row and column numbers:
1st byte = A0(hex) + Row number
2nd byte = 20(hex) + Column number
For example, if a character is positioned at the first column of the 16th
row on the Extended GB code plane, the character's value is B021, which is
calculated as follows:
1st byte = A0(hex) + 16 = B0(hex)
2nd byte = 20(hex) + 01 = 21(hex)
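The MSB rules described for ASCII, GB2312-80, and Extended GB codes are enough to split a DEC Hanzi byte string into characters. The following Python sketch is illustrative only: the names extended_gb_encode and classify_dechanzi are assumptions made for this example, and the classifier checks only the MSB of each byte, not whether the row and column are within the range 1 to 94.

```python
def extended_gb_encode(row, col):
    """Return the 2-byte DEC Hanzi code for an Extended GB row and column."""
    if not (1 <= row <= 94 and 1 <= col <= 94):
        raise ValueError("row and column must be in the range 1 to 94")
    return bytes([0xA0 + row, 0x20 + col])


def classify_dechanzi(data):
    """Split a byte string into (kind, raw_bytes) pairs using the MSB rules.

    First byte MSB off               -> single-byte ASCII
    First byte MSB on, second MSB on -> GB2312-80
    First byte MSB on, second MSB off -> Extended GB
    """
    result = []
    i = 0
    while i < len(data):
        if data[i] < 0x80:
            result.append(("ascii", data[i:i + 1]))
            i += 1
            continue
        if i + 1 >= len(data):
            raise ValueError("truncated 2-byte character at offset %d" % i)
        kind = "gb2312-80" if data[i + 1] & 0x80 else "extended-gb"
        result.append((kind, data[i:i + 2]))
        i += 2
    return result
```

For example, classify_dechanzi(b"A\xb0\xa1\xb0\x21") separates the input into an ASCII "A", the GB2312-80 code B0A1, and the Extended GB code B021. The byte 21(hex) in the last pair is consumed as a second byte and is not treated as an ASCII character, which is why the first byte must be examined before the second.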
Codeset Conversion
The following codeset converter pairs are available for converting
Simplified Chinese characters between dechanzi and other encoding formats.
Refer to iconv_intro(5) for an introduction to codeset conversion. For more
information about the other codeset for which dechanzi is the input or
output, see the reference page specified in the list item.
· big5_dechanzi, dechanzi_big5
Converting from and to the Big-5 codeset: big5(5)
· dechanyu_dechanzi, dechanzi_dechanyu
Converting from and to the DEC Hanyu codeset: dechanyu(5)
· eucTW_dechanzi, dechanzi_eucTW
Converting from and to Taiwanese Extended UNIX Code: eucTW(5)
· UTF-16_dechanzi, dechanzi_UTF-16
Converting from and to UTF-16 format: Unicode(5)
· UCS-4_dechanzi, dechanzi_UCS-4
Converting from and to UCS-4 format: Unicode(5)
· UTF-8_dechanzi, dechanzi_UTF-8
Converting from and to UTF-8 format: Unicode(5)
DEC Hanzi encoding is identical to the Microsoft code-page format (cp936)
used for Simplified Chinese characters on PC systems. However, DEC Hanzi
supports fewer characters than the code page supports. Therefore, using
converters with dechanzi in the converter name to convert between cp936 and
other formats can result in some data loss. Refer to code_page(5) for more
information about PC code pages.
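The data loss described above can be illustrated outside the operating system's converters. The following Python sketch does not use the dechanzi converters; it assumes that Python's built-in gb2312 codec is an acceptable stand-in for the ASCII and GB2312-80 portion of DEC Hanzi, and it uses Python's cp936 codec for the PC code page. The function name cp936_to_gb2312_subset and the choice of "?" as the substitute byte are assumptions made for this example.

```python
def cp936_to_gb2312_subset(data):
    """Re-encode cp936 bytes using only ASCII and GB2312-80 characters.

    Returns the converted bytes and a list of the characters that
    could not be represented; each of those is replaced by b"?".
    """
    converted = bytearray()
    lost = []
    for ch in data.decode("cp936"):
        try:
            converted += ch.encode("gb2312")
        except UnicodeEncodeError:
            # Character exists in cp936 but not in the GB2312-80 subset.
            lost.append(ch)
            converted += b"?"
    return bytes(converted), lost


# U+554A is in GB2312-80 (code B0A1); U+570B is in cp936 only.
sample = "\u554a\u570b".encode("cp936")
converted, lost = cp936_to_gb2312_subset(sample)
```

In this run, converted keeps the bytes B0A1 for the first character and holds a "?" in place of the second, which is reported in lost.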
DEC Hanzi Fonts
The operating system provides both screen and printer fonts for DEC Hanzi
characters. The operating system also provides bit map fonts in addition to
the TrueType fonts described in this section. For a complete description of
DEC Hanzi fonts, see the document, Technical Reference for Using Chinese
Features.
The following set of Simplified Chinese TrueType fonts is installed as the
operating system default fonts for DEC Hanzi:
FangSong
-css_dongwen-fangsong-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
-css_dongwen-fangsong-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
-css_dongwen-fangsong-medium-r-normal--0-0-0-0-c-0-iso8859-1
HeiTi
-css_dongwen-heiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
-css_dongwen-heiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
-css_dongwen-heiti-medium-r-normal--0-0-0-0-c-0-iso8859-1
KaiTi
-css_dongwen-kaiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
-css_dongwen-kaiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
-css_dongwen-kaiti-medium-r-normal--0-0-0-0-c-0-iso8859-1
SongTi
-css_dongwen-songti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
-css_dongwen-songti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
-css_dongwen-songti-medium-r-normal--0-0-0-0-c-0-iso8859-1
The following set of Simplified Chinese TrueType fonts is available as an
installation option:
FangSong
-huatian-fangsong-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
-huatian-fangsong-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
-huatian-fangsong-medium-r-normal--0-0-0-0-m-0-iso8859-1
HeiTi
-huatian-heiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
-huatian-heiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
-huatian-heiti-medium-r-normal--0-0-0-0-m-0-iso8859-1
KaiTi
-huatian-kaiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
-huatian-kaiti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
-huatian-kaiti-medium-r-normal--0-0-0-0-m-0-iso8859-1
SongTi
-huatian-songti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-0
-huatian-songti-medium-r-normal--0-0-0-0-c-0-gb2312.1980-1
-huatian-songti-medium-r-normal--0-0-0-0-m-0-iso8859-1
With either the default or optional font sets installed, the SongTi fonts
are the default screen fonts for the DEC Hanzi codeset.
The operating system provides the following PostScript printer fonts for
DEC Hanzi characters:
· Hei-GB2312-80
· XiSong-GB2312-80
For general information on printing Asian language text, refer to
i18n_printing(5).
SEE ALSO
Commands: locale(1)
Others: ascii(5), big5(5), Chinese(5), code_page(5), dechanyu(5), eucTW(5),
GB18030(5), GBK(5), i18n_intro(5), i18n_printing(5), iconv_intro(5),
l10n_intro(5), sbig5(5), telecode(5), Unicode(5)