big5(5)
NAME
big5 - A character encoding system (codeset) for Traditional Chinese
DESCRIPTION
The big5 codeset is one of several codesets that support the Traditional
Chinese language. This codeset includes the following character sets:
· ASCII
· Big-5
The big5 codeset uses a combination of single-byte data and two-byte data
to represent ASCII characters, symbols, and Chinese ideographic characters.
ASCII Characters
All ASCII characters are represented in the form of single-byte, 7-bit data
in the big5 codeset; that is, the most significant bit (MSB) of a byte that
represents an ASCII character is always off (0). For more information, see
ascii(5).
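The MSB rule makes it possible to tell single-byte and two-byte data apart
while scanning a big5 byte string. The following Python fragment is a minimal
sketch of that idea, not part of the operating system: the function name
split_big5 is hypothetical, and the sketch relies on the rule stated under
"Code Values for Big-5 Characters" that the first byte of a two-byte
character always has its MSB on. It does not check that the second byte is
in a valid range.

```python
def split_big5(data):
    """Split a big5 byte string into one-byte and two-byte character codes.

    Hypothetical helper for illustration only.
    """
    chars = []
    i = 0
    while i < len(data):
        if data[i] & 0x80 == 0:
            # MSB off: a single-byte ASCII character.
            chars.append(data[i:i + 1])
            i += 1
        else:
            # MSB on: first byte of a two-byte Big-5 character.
            if i + 1 >= len(data):
                raise ValueError("truncated two-byte character at offset %d" % i)
            chars.append(data[i:i + 2])
            i += 2
    return chars
```

Note that the second byte of a two-byte character may have its MSB off, so
a byte with the MSB off is an ASCII character only when it is not consumed
as the second byte of the preceding pair.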
Big-5 Character Groups
The Big-5 character set defines the following character groups:
· Special symbols (408)
· Level 1 characters (5401)
· Level 2 characters (7652)
· Level 1 user-defined space (785)
· Level 2 user-defined space (2983)
· Level 3 user-defined space (2041)
Code Values for Big-5 Characters
Each Big-5 character is represented by a two-byte code that complies with
the Big-5 standard. The MSB of the first byte is always on, while that of
the second byte can be on or off. Code ranges (in hexadecimal) for characters
in the different character groups are as follows:
· Special symbols: A140 to A3BF
· Level 1 characters: A440 to C67E
· Level 2 characters: C940 to F9D5
· Level 1 user-defined space: FA40 to FEFE
· Level 2 user-defined space: 8E40 to A0FE
· Level 3 user-defined space: 8140 to 8DFE
Across this code space, the valid range for the first byte is 81 to FE,
and the valid ranges for the second byte are 40 to 7E and A1 to FE.
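The code ranges and byte rules above can be expressed as a small lookup. The
following Python fragment is a sketch for illustration only; the names
GROUPS, is_valid_second_byte, classify, and count_codes are hypothetical.
It also shows that counting the valid two-byte codes in each range
reproduces the group sizes listed under "Big-5 Character Groups".

```python
# (group name, first code, last code), taken from the list of code ranges.
GROUPS = [
    ("Special symbols", 0xA140, 0xA3BF),
    ("Level 1 characters", 0xA440, 0xC67E),
    ("Level 2 characters", 0xC940, 0xF9D5),
    ("Level 1 user-defined space", 0xFA40, 0xFEFE),
    ("Level 2 user-defined space", 0x8E40, 0xA0FE),
    ("Level 3 user-defined space", 0x8140, 0x8DFE),
]


def is_valid_second_byte(b):
    # Second byte must be 40 to 7E or A1 to FE (157 values per first byte).
    return 0x40 <= b <= 0x7E or 0xA1 <= b <= 0xFE


def classify(code):
    """Return the group name for a two-byte code, or None if it has none."""
    first, second = code >> 8, code & 0xFF
    if not (0x81 <= first <= 0xFE and is_valid_second_byte(second)):
        return None  # not a valid two-byte pattern
    for name, low, high in GROUPS:
        if low <= code <= high:
            return name
    return None  # valid byte pattern, but outside the listed groups


def count_codes(low, high):
    """Count the codes in low..high whose second byte is valid."""
    return sum(1 for code in range(low, high + 1)
               if is_valid_second_byte(code & 0xFF))


for name, low, high in GROUPS:
    print("%-28s %04X to %04X  %5d" % (name, low, high, count_codes(low, high)))
```

The counts printed for the six groups are 408, 5401, 7652, 785, 2983, and
2041, which match the sizes given for the character groups.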
Codeset Conversion
The following codeset converter pairs are available for converting
Traditional Chinese characters between big5 and other encoding formats.
Refer to iconv_intro(5) for an introduction to codeset conversion. For more
information about the codeset on the other side of each conversion, see the
reference page named in the list item.
· dechanyu_big5, big5_dechanyu
Converting from and to DEC Hanyu: dechanyu(5)
· dechanzi_big5, big5_dechanzi
Converting from and to DEC Hanzi: dechanzi(5)
· eucTW_big5, big5_eucTW
Converting from and to Taiwanese Extended UNIX Code: eucTW(5)
· sbig5_big5, big5_sbig5
Converting from and to Shift Big-5: sbig5(5)
· telecode_big5, big5_telecode
Converting from and to Telecode: telecode(5)
· UCS-2_big5, big5_UCS-2
Converting from and to UCS-2: Unicode(5)
· UCS-4_big5, big5_UCS-4
Converting from and to UCS-4: Unicode(5)
· UTF-8_big5, big5_UTF-8
Converting from and to UTF-8: Unicode(5)
Note
The big5 encoding format is identical to the encoding format used in
PC code pages that support Traditional Chinese. Therefore, you can use
codeset converters that convert between big5 and UCS-2, UCS-4, or
UTF-8 to convert Traditional Chinese data between PC code-page and
Unicode encoding formats. Refer to code_page(5) for a discussion of
how the operating system supports PC code pages.
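As an illustration of what a big5 to UTF-8 conversion does, the following
Python fragment is a minimal sketch that uses Python's built-in "big5"
codec. This codec is an assumption made for the example; it is not the
big5_UTF-8 converter described in this reference page, and its handling of
codes in the user-defined spaces may differ from that of the operating
system converters. The function name big5_to_utf8 is hypothetical.

```python
def big5_to_utf8(data):
    """Convert big5-encoded bytes to UTF-8 bytes (illustration only)."""
    # Decode to Unicode text first, then re-encode as UTF-8.
    # Bytes that the codec cannot map raise UnicodeDecodeError.
    return data.decode("big5").encode("utf-8")


# Two Traditional Chinese characters (U+4E2D, U+6587) followed by ASCII text.
text = "\u4e2d\u6587 Big-5"
big5_bytes = text.encode("big5")
utf8_bytes = big5_to_utf8(big5_bytes)

# Each Chinese character takes 2 bytes in big5 and 3 bytes in UTF-8;
# each ASCII character takes 1 byte in both.
print(len(big5_bytes), len(utf8_bytes))
```

The ASCII characters pass through unchanged because both encodings
represent them as the same single-byte values.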
Fonts for Big-5 Characters
The operating system supports Big-5 code by internally converting
characters to DEC Hanyu. Therefore, DEC Hanyu fonts are used for Big-5
characters. Both display and printer fonts are provided for DEC Hanyu and
these are listed in the dechanyu(5) reference page.
For general information about printer support for and codeset conversion of
Asian text, refer to i18n_printing(5).
SEE ALSO
Commands: locale(1)
Others: ascii(5), Chinese(5), code_page(5), dechanyu(5), dechanzi(5),
eucTW(5), GBK(5), i18n_intro(5), i18n_printing(5), iconv_intro(5),
l10n_intro(5), sbig5(5), telecode(5), Unicode(5)