6 Using Internationalized Software

This chapter explains how setup tasks and software features vary among language environments other than English. The chapter is aimed at programmers who are familiar with Tru64 UNIX in an English-language environment and who need to work with other languages, particularly those that use multibyte characters, to run and test their applications.

6.1 Working in a Multilanguage Environment: Introduction

To enable input and display in any language other than English, you must always set the locale in which your process runs. Depending on the language, you may need to perform additional tasks, for example, to:

Select keyboard type

Define search paths for specialized data and executable files that are language specific

Set terminal code, application code, and other characteristics of the terminal driver to be appropriate for the codeset or codesets where a language's characters are defined

Load the fonts required to display the characters in a particular language

Enable one or more of the data input and editing methods used to define and enter characters, words, and phrases

Apply printer-control characters, filters, and fonts that are appropriate for local-language printers

This chapter discusses these topics as they apply to particular languages or groups of languages. The chapter also describes some command and desktop environment features that English-language speakers do not normally use and that allow you to display, enter, print, and mail text in languages other than English. For complete information about using internationalization features of applications that run in the Common Desktop Environment (CDE), see the CDE Companion.

Language-specific user guides provide additional information about customization and use of software provided for a particular language. The following user guides are available only in HTML format:

Technical Reference for Using Chinese Features

Technical Reference for Using Japanese Features

Technical Reference for Using Korean Features

Technical Reference for Using Thai Features

Non-English characters are embedded in the text of the user guides for Chinese, Japanese, and Korean. To view these characters with your web browser, the appropriate language support subsets must be installed on your system and your locale must be set to one that includes the local language characters used in the book.

Tru64 UNIX documentation also provides introductory reference pages on the topics of internationalization (i18n_intro(5)) and localization (l10n_intro(5)), along with reference pages for all supported languages and codesets.

6.2 Setting Locale and Language

System software that supports different language environments may provide translated message files, application resource files, help files, or some combination of these. If translations are available for message files, you can vary the language of software messages and other text by selecting a locale.

For system software, you set locale by defining the LANG environment variable. For example:


% setenv LANG en_US.ISO8859-1

Refer to the discussion of internationalization in the System Administration book and in the Command and Shell User's Guide for more detailed information on using locales and defining the associated variables for system and user setup. You can also refer to the i18n_intro(5) reference page for a discussion of locale variables such as LANG. If these locale variables are not defined, internationalized applications assume the POSIX (C) locale, which supports only English. For names of locales that are available with the operating system, see l10n_intro(5).
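The fallback to the POSIX (C) locale can be sketched as a small shell function. This is a minimal illustration that assumes the usual POSIX precedence (LC_ALL first, then the category variable, then LANG); the function name effective_locale is hypothetical, and the sketch uses Bourne/Korn shell syntax rather than the C shell syntax of the other examples in this chapter.

```shell
# Report the locale that applies to one locale category, using the
# usual precedence: LC_ALL, then the category variable, then LANG,
# and finally the POSIX (C) locale when nothing is defined.
effective_locale() {
    category=$1                              # for example, LC_COLLATE
    eval "category_value=\${$category:-}"    # read the variable named by $1
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "$category_value" ]; then
        printf '%s\n' "$category_value"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf '%s\n' "C"
    fi
}

# Show the locale currently in effect for collation in this process.
effective_locale LC_COLLATE
```

The function only inspects environment variables; it does not check whether the named locale is actually installed on the system.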

Note

Locales often have multiple variants. These variants have the same name as the base locale but include a file name suffix that begins with the at sign (@). Locale variants that support codesets not native to UNIX, such as UCS-4 and cp850, can be assigned to LANG or LC_ALL. However, locale variants that differ from the base locale in only one locale category should be assigned only to the appropriate locale category. For example, a locale variant designed to support a specific collation sequence, such as @radical, would be assigned to LC_COLLATE. A locale variant designed to support the euro monetary sign (@euro) would be assigned only to LC_MONETARY. Use the base locale name, not these variants, in assignments to the LANG environment variable. Furthermore, in cases where a base locale name is not being assigned to all locale categories, avoid using the LC_ALL environment variable, whose assigned value overrides settings for both LANG and the environment variables for specific locale categories.
Many locale-specific files reside in directories whose names are constructed from the language, territory, and codeset portions of a locale name. Commands and other system applications insert the setting of the LANG variable into search paths that contain %L as one of the directory nodes. This makes it possible for software programs to find the correct set of files, such as fonts, resource files, user-defined character files, and translated reference pages, that should be used with the current locale. An @ suffix related to collation, if included in an assignment to the LANG variable, may result in applications being unable to find certain locale-specific files.
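
The two points in this note, using the base locale name for LANG and substituting it for %L in search paths, can be illustrated with a short sketch in Bourne/Korn shell syntax. The function names and the path template are hypothetical; the sketch imitates the substitution only and is not how system commands implement it.

```shell
# Strip an @ variant suffix so that only the base locale name remains.
base_locale() {
    printf '%s\n' "${1%%@*}"
}

# Replace every %L node in a search-path template with a locale name.
expand_locale_path() {
    template=$1
    locale_name=$2
    printf '%s\n' "$template" | sed "s|%L|$locale_name|g"
}

# The template below is a made-up path, used only for illustration.
locale_name=$(base_locale "zh_TW.eucTW@radical")
expand_locale_path "/usr/lib/example/%L/app.cat" "$locale_name"
```

With the variant suffix left in place, the expanded path would name a directory such as zh_TW.eucTW@radical, which is the kind of lookup failure the note warns about.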

For graphical applications, you need to select a language to take advantage of text translations and local-language features available with Common Desktop Environment (CDE) and other kinds of Motif applications. For Asian languages, the correct language selection is particularly important because it enables:

Support for the appropriate input method in these applications

Entry of file names and other parameters that use ideographic characters

Cursor positioning on correct character and word boundaries

Line wrapping at correct word boundaries

See the CDE Companion for general information about setting language in CDE.

CDE assumes that all applications run during a session operate in the language that was set at the start of the session. On Tru64 UNIX systems, you can work around this restriction and run an application in a different locale by taking the following steps:

In a dtterm window, set the LANG or LC_ALL environment variable to the locale in which you want to run the new application. For example:
```
% setenv LANG ko_KR.deckorean
```

If the setting is for a Japanese, Chinese, or Korean locale, use the system command line to start the appropriate input method server before invoking the application. For example:
```
% /usr/bin/X11/dxhangulim &
```
See Section 6.4 for information about Asian input method servers.

In the same window, use the system command line to invoke the application you want to run in the new locale. For example:
```
% /usr/dt/bin/dtterm &
```

If you need to change your keyboard setting to work in the new locale, do so before starting to work in the new application's window. See Section 6.3 for information about setting keyboard type.

6.3 Selecting Keyboard Type

To enter English text, a standard keyboard provides a sufficient number of keys (combined with shift states) to enter all uppercase and lowercase letters, numerals, and punctuation marks. For many other languages, the default keyboard does not provide enough keys and shift states to enter all characters.

Terminal users must use a localized keyboard or, if their keyboard includes a Compose key, use Compose-key sequences to enter non-English characters from single-byte codesets. Some terminals also provide software emulation of a number of keyboard layouts for languages that are based on single-byte codesets. The user guide for each terminal explains how you can use its keyboard to enter non-English characters. Entry of multibyte characters in Asian languages requires special terminal hardware.

Workstation users can set the keyboard type to match any language that has a standard keyboard type, provided that the appropriate support files are installed on the system. You need to set keyboard type for Western and Eastern European languages, Japanese, Thai, and Hebrew. Keyboard setting is not required for Chinese and Korean languages.

In CDE, use Keyboard Options (one of the Desktop Applications) to change your keyboard type. Refer to the CDE Companion for more information about changing keyboard type. From the system command line, this application is invoked by using the dxkeyboard command.

Unlike the language setting, the keyboard setting is a global attribute that applies to all windows. Therefore, if you are working in windows that were created with different language settings, you may need to change the keyboard setting as you move from one window to another. Keep in mind that a setting made by using CDE applications does not change the setting that applies when you log on to the system. The keyboard setting when you log on to the system is always the system-default keyboard. See keyboard(5) for information about changing the system-default keyboard.

6.3.1 Determining Keyboard Layout

If you change your keyboard from the one whose characters are printed on the hardware keys, you need to know how characters are mapped to keys and whether any characters must be entered by using a mode-switch key or mode-switch key sequence. For some languages, such as Czech, up to four different characters can be mapped to the same key. In such cases, you use the key defined as the mode switch to toggle among different sets of characters mapped to the same key. Note that mode switching is a character entry mechanism that is different from Compose sequences. A particular keyboard setting may support Compose sequences (which require one key to be defined as a multikey), mode switching (which requires at least one key to be defined as a mode-switch key), both, or neither of these input mechanisms.

You can access a keyboard layout for your current keyboard setting by using a command similar to the following to create a PostScript file that you can print:

% /usr/bin/X11/xkbprint -label symbols -o mykeyboard.ps :0

Refer to xkbprint(1X) for more information about the xkbprint command.

6.4 Determining Input Method

For some languages, such as Japanese, Chinese, and Korean, you use an input method to enter characters, phrases, or both. An input method lets you input a character by taking multiple editing actions on entry data. The data entered at intermediate stages of character entry is called the preediting string. The X Input Method specification defines four user input styles:

On-the-spot
Data being edited is displayed directly in the application window. Application data is moved to allow the preediting string to display at the point of character insertion.

Over-the-spot
The preediting string is displayed in a window that is positioned over the point of insertion.

Off-the-spot
The preediting string is displayed in a window that is within the application window but not over the point of insertion. Often, the window for the preediting string appears at the bottom of the application window. In this case, the preediting window may occlude the last line of text in the application window. You can resize the application window to make this last line visible.

Root-window
The preediting string is displayed in a child window of the application RootWindow.

For some of the input styles selected in an application, the preediting and status windows are not redrawn correctly if the application window is occluded by other windows. To correct this problem, click on or refocus on the application window.

Input methods for different locales typically support more than one user input style but not all of them. If you work in languages that are supported by an input method, you can specify styles in priority order through the VendorShell resource XmNpreeditType. By default, this resource is defined to be:

OnTheSpot,OverTheSpot,OffTheSpot,Root

The preceding value means that the on-the-spot input style is used if the input method supports it; otherwise, the over-the-spot style is used if the input method supports it, and so forth.
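
The priority rule can be sketched as follows in Bourne/Korn shell syntax. This is an illustration of the selection logic only, not the Motif implementation; the function name pick_input_style and the space-separated format of the supported-style list are assumptions made for the example.

```shell
# Print the first style in a comma-separated preference list that also
# appears in a space-separated list of styles that an input server
# supports. Names are compared without regard to case. Prints "none"
# and returns 1 when the two lists have no style in common.
pick_input_style() {
    preferred=$(printf '%s' "$1" | tr 'A-Z' 'a-z' | tr ',' ' ')
    supported=" $(printf '%s' "$2" | tr 'A-Z' 'a-z') "
    for style in $preferred; do
        case $supported in
            *" $style "*) printf '%s\n' "$style"; return 0 ;;
        esac
    done
    printf '%s\n' "none"
    return 1
}

# Default preference order against a server that offers only the
# off-the-spot and root-window styles.
pick_input_style "OnTheSpot,OverTheSpot,OffTheSpot,Root" "OffTheSpot Root"
```

In this example the first two preferences are skipped because the server does not offer them, so the function prints offthespot.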

There are several ways to supply the XmNpreeditType resource value to an application:

In CDE, use the Input Methods application. See the CDE Companion for information on using this application.

In an application-specific resource file.

On the command line that invokes an application.
For example:
```
% app-name -xrm '*preeditType: offthespot,onthespot' &
```

Input styles are supported by specialized input method servers. An input method server runs as an independent process and communicates with an application to handle input operations. An input method server does not have to be running on the same system as the application but must be running and made accessible to the application before the application starts. Following are the input method servers available in the operating system, along with the input styles that each server supports:

dxhangulim, the Korean input server, which supports all four input styles (on-the-spot, over-the-spot, off-the-spot, and root-window)

dxhanyuim, the Traditional Chinese input server, which supports the off-the-spot and root-window input styles

dxhanziim, the Simplified Chinese input server, which supports the off-the-spot and root-window input styles

dxjim, the Japanese input server, which supports the on-the-spot, over-the-spot, and root-window input styles

Each of these servers has a corresponding reference page.

The applications that you run may support more, fewer, or none of the input styles supported by a particular input server. The preedit option "None" applies when an input server rejects all input styles supported by the application.

In CDE, the appropriate input server starts automatically when you select the session language. However, see Section 6.15.4 for restrictions that may require you to start an input server manually.

6.5 Determining the Input Mode Switch State

The keyboard layout for an Asian language provides keys for only a small number of characters. For Asian languages, you also use an input methodology (incorporating control-key sequences, keypad-key sequences, or options in a windows application) to convert one or more characters that you can input directly from the keyboard to other kinds of characters. Section 6.4 and the language-specific technical reference guides discuss input methods for Asian languages.

If your keyboard has a mode-switch LED (light emitting diode), it is turned on or off, depending on whether you last toggled the special input mode on or off.

If you are using a workstation and your language is set to an Asian language, you can show the mode-switch LED on the screen by invoking the Keyboard Indicator application with the -map option, as follows:

% /usr/bin/X11/kb_indicator -map &

The -map option starts a Motif application that emulates a mode-switch LED. The application window contains one button, which is displayed as on or off, corresponding to the input mode state. You can click on this button to toggle in and out of input mode. The window is insensitive if input mode switching is not supported for your current language setting.

You can have only one Keyboard Indicator application running during your session. To stop the application, press Ctrl-c in the window from which you started the application or enter the following kill command with the application's process ID:

% kill -INT process_id

If Keyboard Indicator is stopped by any other means, you must enter the following command before restarting the application:


% /usr/bin/X11/kb_indicator -clear

The preceding command erases the server status for the application so that it can be restarted cleanly.

If your language is set to Hebrew, the Keyboard Manager application (/usr/bin/X11/decwkm) provides the same function as the Keyboard Indicator window provides for Asian languages.

6.6 Defining the Search Path for Specialized Components

European languages are supported by data and executable files installed at system default locations. Asian-language support for some commands and programming libraries requires files that are subordinate to the /usr/i18n directory. These files supplement or replace files in system default locations. When you install one or more of the Asian language subsets, the installation procedure makes the following adjustments to variable settings on a systemwide basis:

I18NPATH
The I18NPATH variable defines the location of files that provide Asian-language support and that are not in system default locations. This variable is set to:
/usr/i18n
Your system administrator can choose to install files for Asian-language support at a location different from /usr/i18n; however, there must be a link to the other location in the /usr/i18n directory.

PATH
The PATH variable points to the location of commands and is set to:
$I18NPATH/usr/bin:$PATH

The /etc/i18n_profile file includes the preceding variable assignments on a systemwide basis for Bourne and Korn shell users. For C shell users, the installation process includes the /etc/i18n_login file in the /etc/csh.login file to correctly set search paths for Hebrew and Asian languages. Unless specifically noted in descriptions of particular commands or utilities, individual users do not need to change process-specific search paths to find localized binaries and utilities.
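
For reference, the effect of these assignments can be sketched in Bourne/Korn shell syntax. This is not the content of /etc/i18n_profile; the helper name prepend_path is hypothetical, and the check that avoids adding the directory twice is an addition made for the example.

```shell
# Prepend a directory to a colon-separated search path unless it is
# already present, and print the resulting path.
prepend_path() {
    dir=$1
    current=$2
    case ":$current:" in
        *":$dir:"*) printf '%s\n' "$current" ;;       # already listed
        "::")       printf '%s\n' "$dir" ;;           # path was empty
        *)          printf '%s\n' "$dir:$current" ;;
    esac
}

# Equivalent in effect to PATH=$I18NPATH/usr/bin:$PATH.
I18NPATH=/usr/i18n
PATH=$(prepend_path "$I18NPATH/usr/bin" "$PATH")
export I18NPATH PATH
```

Because the directory is placed first, commands under $I18NPATH/usr/bin are found before commands of the same name in system default locations.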

6.7 Using Terminal Interface Features for Asian Languages

The Tru64 UNIX Asian terminal driver (atty) and Thai terminal driver (ttty) support input and output of English and other language characters over asynchronous terminal lines. When one or both of these drivers are installed, you can set terminal line characteristics to be appropriate for the language you are using. The driver's local-language capabilities are supported in the following terminal configurations:

Terminal connected directly to the host machine via a serial line

Terminal connected through LAT to the host system

Terminal connected through TCP/IP to the host system

Refer to atty(7) and ttty(7) for more information about these terminal drivers.

The stty command can enable support for multibyte codesets and special character manipulation capabilities, such as the following:

Automatic codeset conversion between terminal and application

Line editing of multibyte characters

Japanese input method (Kana-Kanji conversion)

User-defined character (UDC) databases and on-demand loading (ODL) of associated fonts

Chinese phrase input method

This section provides general information about using the stty command to enable features added to the terminal subsystem for Asian languages.

The stty utility sets or reports on terminal input/output characteristics of the device that is the utility's standard input. Table 6-1 shows the stty options that set line discipline for Asian languages.

Table 6-1: The stty Command Options for Controlling Terminal Line Discipline

stty Option	Description
`adec`	Sets the terminal line discipline to handle multibyte data and the processing environment appropriate for simplified Chinese (Hanzi), traditional Chinese (Hanyu), and Korean codesets. This option is supported for both the STREAMS and BSD terminal drivers.
`jdec`	Sets the terminal line discipline to handle multibyte data and the processing environment appropriate for Japanese codesets. This option sets terminal code to `dec` and application code to `eucJP`. The `jdec` option is supported for both the STREAMS and BSD terminal drivers.
`tdec`	Sets the terminal line discipline to handle Thai characters and the processing environment appropriate for the Thai codeset. This option is supported for only the BSD terminal driver.
`dec`	Sets the terminal line discipline back to the default, or standard, `tty` line discipline and clears characteristics that preceding `stty` commands may have set for application and terminal code. This option is supported for both the STREAMS and BSD terminal drivers.

Note

Do not set the terminal line discipline to jdec or adec from a console set up for kernel debugging (running the KDEBUG driver). Doing so may cause the console to hang.

The stty command requires an appropriate locale setting to be in effect before changing the terminal line discipline to support that locale. For example, to set your terminal line discipline to handle Korean, enter:

% setenv LANG ko_KR.deckorean
% stty adec

To set your terminal line discipline back to the tty default, enter:


% stty dec

Note

When your terminal line discipline is not set to the tty default and you want to switch to another nondefault option (to switch from jdec to adec, for example), first enter the stty dec command to clear any application or terminal characteristics that may not be appropriate for the new setting. The following example shows how to switch a terminal line discipline from its current setting of adec to jdec:
% stty dec
% stty jdec

The stty command entered with the -a option or all argument displays all settings for the current terminal line discipline:

% stty adec
% stty all
atty disc;speed 9600 baud; 24 rows; 80 columns
erase = ^?; werase = ^W; kill = ^U; intr = ^C; quit = ^\; susp = ^Z
dsusp = ^Y; eof = ^D; eol <undef>; eol2 <undef>; stop = ^S; start = ^Q
lnext = ^V; discard = ^O; reprint = ^R; status <undef>; time = 0
min = 1
-parenb -parodd cs8 -cstopb hupcl cread -clocal
-ignbrk brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr icrnl -iuclc
ixon -ixany -ixoff imaxbel
isig icanon -xcase echo echoe echok -echonl -noflsh -mdmbuf -nohang
-tostop echoctl -echoprt echoke -altwerase iexten -nokerninfo
opost -olcuc onlcr -ocrnl -onocr -onlret -ofill -ofdel tabs -onoeot
-odl lru size=256
-sim key= class=
tcode=dec acode=deckanji

6.7.1 Converting Between Application and Terminal Codesets

Many terminals support only one codeset, which is a problem when you work on one terminal and need to run applications in locales (particularly Asian locales) that are based on a variety of codesets. Therefore, the atty driver provides a mechanism for converting between the codeset that an application uses and the codeset that a terminal supports. You control codeset conversion by using options on the stty command line.

Note that the adec, jdec, and dec options of the stty command set terminal code and application code appropriately for Compaq terminals and workstations. You need to explicitly use the tcode option, for example, if you are logging in from a Japanese terminal that does not support the same codeset as Compaq terminals and workstations.

Table 6-2 specifies stty options that explicitly set terminal and application code.

Table 6-2: The stty Options to Explicitly Set Application and Terminal Code

stty Option	Description
`acode codeset`	Sets application code to `codeset`
`tcode codeset`	Sets terminal code to `codeset`
`code codeset`	Sets both terminal code and application code to `codeset`

The following command lets you run an application that uses DEC Kanji on a terminal that supports only Shift JIS (a codeset prevalent in the Japanese personal computer market):

% stty acode deckanji tcode sjis

The technical reference guides for the Asian language features provide additional details about supported application codesets and terminal codesets.
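
To see what a conversion between two Japanese codesets does to the bytes of a character, you can experiment with the standard iconv utility. The sketch below is an analogy only: it does not use the atty driver, the wrapper name convert_codeset is hypothetical, and the iconv encoding names EUC-JP and SHIFT_JIS are assumed to be available on the system and stand in for the application code and terminal code.

```shell
# Convert text on standard input from one codeset to another.
# EUC-JP and SHIFT_JIS are iconv encoding names, not stty keywords.
convert_codeset() {
    from=$1
    to=$2
    iconv -f "$from" -t "$to"
}

# Hiragana "a" is the byte pair A4 A2 in EUC-JP. Display the bytes
# that represent the same character after conversion to Shift JIS.
printf '\244\242' | convert_codeset EUC-JP SHIFT_JIS | od -An -tx1
```

The same character is encoded as the byte pair 82 A0 in Shift JIS, which is why text written in one codeset is unreadable on a terminal that expects the other unless a conversion step sits between them.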

6.7.2 Command Line Editing That Supports Multibyte Characters

This section discusses how you enable and use command-line editing when Asian-language support is installed on your system.

When the terminal line discipline and terminal codeset characteristics are set appropriately for multibyte codesets, the atty driver handles command-line editing appropriately for languages supported by those codesets. For example, when you enter the control sequence to delete a character (assuming you have defined the control sequence), the entire character is deleted, regardless of how many bytes it occupies. The character being erased can be either a single-byte English character or a multibyte Asian character when both occur on the same command line.

Word deletion is also supported, even when words combine single-byte and multibyte characters. The atty driver accepts single-byte space characters, two-byte space characters (if applicable to the terminal code setting), or tab characters as word delimiters.

The erase and werase options of the stty command line let you define the control sequence for character and word deletion. For example:


% stty erase [Ctrl-h]
% stty werase [Ctrl-j]

This example specifies that Ctrl-h deletes the character preceding the cursor and Ctrl-j deletes the word preceding the cursor.

History mode is a mode of command-line editing that allows you to recall and optionally modify a command entered previously. The history mode implementation discussed here is one that is customized for Asian-language input and supported only for the BSD terminal driver. Table 6-3 specifies the stty options that enable or disable history mode editing.

Table 6-3: The stty Options to Enable/Disable History Mode

stty Option	Description
`history key`	Sets the toggle key for the history mechanism and enables it.
`-history`	Disables the history mechanism.

The atty driver can maintain a history of up to 32 commands, each with a maximum length of 127 characters. Table 6-4 describes the commands you can use to edit command lines after entering the history key.

Table 6-4: Command Line Editing in History Mode

Command/Key	Description
Ctrl-a	Move to the beginning of the line
Ctrl-d	Delete the character under the cursor
Ctrl-e	Move to the end of the line
Up-arrow	Recall the previous command line in the history list
Down-arrow	Recall the next command in the history list
Left-arrow	Move the cursor left by one character
Right-arrow	Move the cursor right by one character
`erase_sequence`	Delete the character preceding the cursor
`werase_sequence`	Delete the word preceding the cursor

In the preceding table, erase_sequence and werase_sequence indicate the control sequences defined by the stty options erase and werase, respectively.

When editing a command line in history mode, you insert characters as follows:

Press the arrow keys to move the cursor to the position immediately to the right of the point where you want to insert characters.

Enter the characters you want to insert.

If you enter the control characters that represent "kill," "interrupt," or "suspend," the tty driver breaks out of history mode and cancels the command line being edited.

6.7.3 Kana-Kanji Conversion: Customization of Japanese Input Options

In the Japanese language, a particular language element, such as a vowel, can be represented by more than one character. These characters can have both phonetic and ideographic variants; furthermore, the phonetic character variants can print in either two-column or single-column width. The different classes of characters, listed in the following table, require different input schemes:

Character Class	Description
Kanji	Ideographic
Hiragana	Phonetic
Katakana	Phonetic. Katakana characters exist in full-width (two-column) and half-width (single-column) formats. The single-column format of Katakana is referred to as Hankaku.

During a single session, a Japanese user can work with Kanji, Hiragana, and Katakana characters in various combinations. The user therefore must be able to customize terminal input mode to suit the character being entered. When the input device is a JIS terminal rather than a workstation, the user must adjust line discipline and terminal code settings in the software to match hardware capabilities (for example, whether the terminal uses 7-bit or 8-bit encoding).

The tty driver supports a mechanism known as Kana-Kanji conversion. This term refers to the conversion between phonetic and ideographic character encoding and the support for keyboard entry sequences that make Japanese character selection more efficient for the user. You use the stty command to enable or disable the Kana-Kanji conversion method and other aspects of Japanese input support. The stty options that support Japanese input are described in Table 6-5 and, unless noted otherwise, are used in conjunction with the jdec option. For example, the following command sets the terminal line discipline to support Japanese character encoding and also enables Kana-Kanji conversion:

% stty jdec ikk

Table 6-5: The stty Options to Enable and Customize Japanese Input

stty Option	Description
`clause mode`	Sets the character attribute for marking a clause that results from Kana-Kanji conversion. The `mode` argument can be `bold`, `underline`, `reverse`, or `none`.
`esc.alw`	Changes the terminal state to "shift out" whenever a newline character is output. This option applies only when the `tcode` (terminal code) `stty` option is set to `jis7` or `jis8`.
`-esc.alw`	Does not change the current terminal state when a newline character is output. This option applies only when the `tcode` (terminal code) option is set to `jis7` or `jis8`.
`henkan mode`	Sets the character attribute for marking a Henkan, or conversion, region that results from Kana-Kanji conversion. The `mode` argument can be `bold`, `underline`, `reverse`, or `none`.
`ikk`	Enables the Japanese input method and spawns the Kana-Kanji conversion daemon, `kkcd`, if it does not already exist. With the BSD terminal driver in cbreak mode, you must use the `jx` option before using the `ikk` option to enable the input method. With the STREAMS terminal driver, you must use the `jinkey` option before using the `ikk` option. By default, key map information is taken from (in highest to lowest priority order): (1) the file specified for the `kkseq` option of the `stty` command; (2) the file defined for the `JSYKKSEQ` environment variable; (3) the `$HOME/.jsykkseq` file. System default key map files for the Japanese input method reside in the `/usr/i18n/skel/ja_JP` directory. Dictionaries used with the Japanese input method are taken from (in highest to lowest priority order): (1) the files defined for the `JSYTANGO`, `JSYKOJIN`, and `JSYLEARN` environment variables; (2) the `/usr/i18n/jsy/jsytango.dic`, `$HOME/jsykojin.dic`, and `$HOME/jsylearn.dic` dictionary files.
`-ikk`	Disables the Japanese input method and kills the `kkcd` daemon.
`jinkey sequence`	Defines the escape sequence to activate the extended Japanese input method used with the STREAMS terminal driver. The parameter for this option can be more than one character.
`imode mode`	Sets the mode for handling 8-bit code or Hankaku (single-column) Kana code when the terminal line discipline is set to `dec`. The `mode` argument can be one of the following keywords: `kanji`, where the 8-bit code is treated as encoding for Kanji; `hiragana`, where the 8-bit code is converted to 2-column Hiragana format; `katakana`, where the 8-bit code is converted to 2-column Katakana format; `hankaku`, where the 8-bit code is handled in Hankaku (1-column) Katakana format.
`jx character`	Sets the toggle character for entering the extended, or `cbreak`, Kana-Kanji conversion mode used with the BSD terminal driver. Users need to enter `cbreak` mode when working in utilities, such as `dbx`, that do not support the full range of Japanese input options.
`-jx`	Undefines the toggle character for entering the extended Kana-Kanji conversion mode.
`kin esc_sequence`	Sets the JIS Kanji "shift in" escape sequence for the JIS terminal.
`kkmap`	Displays the current key map for Kana-Kanji conversion. The display is a traversal tree with a maximum of 15 characters for each key sequence.
`kkseq file`	Sets the Kana-Kanji conversion key map file for the terminal (see also the table entry for the `ikk` option).
`knj.bsl`	Uses only one backspace to erase one Kanji character.
`-knj.bsl`	Uses two backspaces to erase one Kanji character.
`knj.sp`	Uses one 2-byte (zenkaku) space to blank out one Kanji character.
`-knj.sp`	Uses two ASCII spaces to blank out one Kanji character.
`kout esc_sequence`	Sets the JIS Kanji "shift out" escape sequence for the JIS terminal.
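
The priority order that the `ikk` entry in Table 6-5 gives for key map files can be sketched as a shell function in Bourne/Korn shell syntax. The function name resolve_keymap is hypothetical, and the sketch only selects a file name; it does not check that the file exists, and it does not reproduce what the driver does with the system default key map files.

```shell
# Print the key map file that takes priority: an explicit kkseq file
# first, then the file named by JSYKKSEQ, then $HOME/.jsykkseq.
resolve_keymap() {
    kkseq_file=${1:-}            # file given to the kkseq option, if any
    if [ -n "$kkseq_file" ]; then
        printf '%s\n' "$kkseq_file"
    elif [ -n "${JSYKKSEQ:-}" ]; then
        printf '%s\n' "$JSYKKSEQ"
    else
        printf '%s\n' "$HOME/.jsykkseq"
    fi
}

# With no kkseq file given, the result depends on JSYKKSEQ and HOME.
resolve_keymap
```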

6.8 Supporting User-Defined Characters and Phrase Input

The national character sets for Japan, Taiwan, and China do not include some of the characters that can appear in Asian place and personal names. Such characters are defined by users and reside in site-specific databases. These databases are called user-defined character (UDC) or character-attribute databases. When users define ideographic characters, they must also define font glyphs, collating files, and other support files for the characters. Appendix B provides details on how you set up and use UDC databases.

In Korea, Taiwan, and China, users can input a complete phrase by typing a keyword, abbreviation, or acronym. This capability is supported by a phrase database and an input mechanism. Appendix C provides details on how you set up and use a Chinese phrase database.

The /var/i18n/conf/cp_dirs configuration file allows software services or hardware to locate the databases that support UDC and phrase input.

Example 6-1 shows the default entries in the cp_dirs file. You can edit these entries to change the default locations.

Example 6-1: Default cp_dirs File

#
# Attribute directory configuration file
#
#                       System location         User location
#                       ===============         =============
udc     -               /var/i18n/udc           ~/.udc
odl     -               /var/i18n/odl           ~/.odl
sim     -               /var/i18n/sim           ~/.sim
cdb     /usr/i18n/.cdb  /var/i18n/cdb           ~/.cdb
iks     -               /var/i18n/iks           ~/.iks
pre     -               /var/i18n/fonts         ~/.fonts
bdf     -               /var/i18n/fonts         ~/.fonts
pcf     -               /var/i18n/fonts         ~/.fonts

Each line in the cp_dirs file represents one entry and has the following format:

[service_name standard_path system_path user_path ]

The service_name can be one of the following:

bdf (for font files in BDF format)

cdb (for collating value databases used with the asort command)

iks (for input key sequence files)

odl (for databases of fonts and input key sequences that the SoftODL service uses)

pcf (for font files in PCF format)
These files, depending on their font resolution, reside in either the 75dpi or 100dpi subdirectory.

pre (for font files in preload format created by the cgen utility)
These are raw font files used to preload multibyte-character terminals.

sim (for phrase databases)

udc (for UDC databases)

The cp_dirs file can contain only one entry for each service named. Remaining fields in the entry line consist of the following:

standard_path specifies the location of the collating values database for the standard character sets (applies only to the cdb entry)

system_path specifies the location of systemwide databases

user_path specifies the location of users' private databases

The preceding locations are specified as one of the following:

An absolute pathname, starting with a slash (/)

A pathname, starting with tilde slash (~/), that is relative to a user's home directory

A minus sign or hyphen (-) to indicate that the entry is not used
For example, you can specify - to be user_path for all services related to user-defined characters if you want these characters supported only through systemwide databases.

Comment lines in the cp_dirs file begin with the number sign (#).
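The entry format described above can be illustrated with a short Python sketch. This is not part of the operating system software; the function names `parse_cp_dirs` and `resolve`, and the home directory used in the usage note, are hypothetical and serve only to show how the four fields, the hyphen placeholder, and the tilde-slash prefix fit together.

```python
import posixpath

# Two entries copied from Example 6-1, plus a comment line.
SAMPLE = """\
# Attribute directory configuration file
udc     -               /var/i18n/udc           ~/.udc
cdb     /usr/i18n/.cdb  /var/i18n/cdb           ~/.cdb
"""

def parse_cp_dirs(text):
    """Return {service_name: (standard_path, system_path, user_path)}.

    A hyphen (-) is stored as None to mark a location that is not used.
    """
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split()
        if len(fields) != 4:
            raise ValueError(f"expected 4 fields in entry: {raw!r}")
        service = fields[0]
        if service in entries:
            # The file can contain only one entry for each service.
            raise ValueError(f"duplicate entry for service {service!r}")
        entries[service] = tuple(None if f == "-" else f for f in fields[1:])
    return entries

def resolve(path, home):
    """Expand a leading ~/ against a home directory; keep other values as is."""
    if path is not None and path.startswith("~/"):
        return posixpath.join(home, path[2:])
    return path

entries = parse_cp_dirs(SAMPLE)
```

With the sample text, `entries["cdb"]` holds all three locations, while `entries["udc"][0]` is `None` because only the cdb entry uses standard_path. Calling `resolve(entries["udc"][2], "/usr/users/kim")` expands the private UDC location for a hypothetical user.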

6.9 Using Printer Interface Features That Support Local Languages

When you install Tru64 UNIX and include language variant subsets, your printing subsystem is enhanced with the following features:

Two generic internationalized print filters, pcfof and wwpsof, that work with Compaq and third-party printers

A set of print filters that support escape sequences used by local-language printers

Entries in the /etc/printcap file to support printer code conversion and on-demand loading of font files

An lprsetup command that lets you add entries for local-language printers to the /etc/printcap file

lp, lpr, lpc, lpq, lprm, and lpstat commands that support additional options for printing and printer control

Support for on-demand loading in the lpd printer daemon

PostScript outline fonts that can be used by the wwpsof filter and other software

Software, such as the pfsetup, ffd, and wwlpspr commands, that supports the DEClaser 1152, the DEClaser 5100, and Printserver 17 products. These products are no longer offered for sale but are still being used by customers. See i18n_printing(5) for more information.

The following sections discuss all but the last of the listed features.

6.9.1 Generic Internationalized Print Filters

The pcfof and wwpsof print filters enable use of Compaq printers, particularly those for which no other printer-specific solution is described in this chapter. You also need to use these filters if your printer is from another vendor. Both of these filters rely on a printer customization file (.pcf file) to supply certain device-specific information. Operating system software includes a basic set of .pcf files. System administrators can add more .pcf files to describe the capabilities of additional printers used at your site.

6.9.1.1 pcfof Print Filter

The pcfof filter handles both PostScript printers and text printers, such as the HP PCL printer. For PostScript files, the filter requires that the appropriate local language PostScript fonts be available on the printer. This restriction limits the filter's usefulness on many Compaq printers, particularly for printing PostScript files that require Japanese fonts. This filter can be set up to do codeset conversion when the printer locale differs from the one required for a text file print job. The filter also has .pcf files that are appropriate to use for a number of third-party text printers. Refer to pcfof(8) and the System Administration manual for details on using this print filter.

6.9.1.2 wwpsof Print Filter

The wwpsof filter is used only with PostScript printers. The main advantage of this filter is that it does not require PostScript fonts to be printer resident because the filter can embed the required fonts in the print job. The PostScript fonts can be either outline fonts installed on the system or bitmap fonts made available to the filter through an X font server. The filter prints multilanguage text files by first converting each character in the text file to a matching character in a UNIX codeset for which fonts are available and then converting the file to PostScript. The filter can also print PostScript files that have been generated by a CDE application. Refer to wwpsof(8) and the System Administration manual for details on using this print filter.

6.9.2 Print Filters for Specific Local Language Printers

A print filter processes text data for a particular model of printer. The filter handles the device dependencies of the printer and performs device accounting functions. When each print job is complete, the print filter writes an accounting record to the file specified by the af field of the printer's entry in the /etc/printcap file.

The print filters for local-language text printers can handle text files that contain ASCII and local-language characters, or output files created by the nroff command. When processing nroff output, the filter removes multibyte characters that extend beyond the page boundary and translates nroff control sequences for underlining, superscripting, and subscripting to control sequences appropriate for the printer. However, the filter does not support multiple nroff control sequences on the same character.

The PostScript print filters can print PostScript files in addition to text and nroff output files.

A local-language print filter can be the specified filter in both the of and if fields in the /etc/printcap file. For general information on /etc/printcap entries, refer to the System Administration manual and to printcap(4). Supplementary information is provided in i18n_printing(5). A reference page for a specific language (for example, Japanese(5)) lists the names of print filters that support printing characters in that language.

The following print filters process text data for Asian languages:

Language	Filter	Printer
Japanese	`la84of`	LA84-J
Japanese	`la86of`	LA86-J
Japanese	`la90of`	LA90-J
Japanese	`la280of`	LA280-J
Japanese	`la380of`	LA380-J
Japanese	`ln03jaof`	LN03-J
Japanese	`ln05jaof`	LN05-J
Simplified Chinese	`la88cof`	LA88-C
Simplified Chinese	`la380cbof`	LA380-CB
Korean	`la380kof`	LA380-K
Korean	`dl510kaof`	DL510-KA
Traditional Chinese	`cp382dof`	CP382-D
Thai	`thailpof`	EP1050+

The following print filters process PostScript and text data for Asian languages and for some of the languages supported by locales using the ISO8859-2, ISO8859-5, ISO8859-7, and ISO8859-9 codesets:

Language	Filter	Printer
Japanese	`ln82rof`	LN82R
Czech, Traditional Chinese, Simplified Chinese, Hungarian, Greek, Korean, Polish, Russian, Slovak, Slovene, and Turkish	`dl1152wrof`	DEClaser 1152
Thai	`dl1152trof`, `dl1152ttmrof`	DEClaser 1152
Czech, Traditional Chinese, Simplified Chinese, Hungarian, Greek, Korean, Polish, Russian, Slovak, Slovene, and Turkish	`dl5100wrof`	DEClaser 5100
Thai	`dl5100trof`, `dl5100ttmrof`	DEClaser 5100

See the reference page for a specific language (for example, Japanese(5)) to find the names of print filters that support printing characters in that language. See i18n_printing(5) for information about the DEClaser 1152 and DEClaser 5100 printers.

6.9.3 Support for Local Language Printers in /etc/printcap

The /etc/printcap file describes characteristics of each printer on the system. Printer characteristics are specified by symbol/value pairs, where each symbol is a 2-character mnemonic. Each time a user submits a print job, the lpd printer daemon and printer spooling system use information in the /etc/printcap file to determine how that job is handled.

Table 6-6 lists and describes /etc/printcap symbols that are specific to support for local-language printers. Refer to printcap(4) for descriptions of other symbols used in the /etc/printcap file. Refer to Section 6.9.4 for an example of using the lprsetup command to add several of these symbols to the /etc/printcap file for a local-language printer.

Table 6-6: Symbols in /etc/printcap File for Local Language Printers

Symbol	Type	Default	Description
`ya`	`str`	None	Double-quoted list of keyword value assignments. This assignment list specifies most of the printer options related to country-specific support. The option keywords, which are explained following this table, include `flocale`, `font`, `line`, `odldb`, `odlstyle`, `onehalf`, `plocale`, `spcom`, `tacdata`, and `tm`.
`yd`	`str`	None	Secondary tty line or channel for font faulting. Specify this entry for the DEClaser 1152 printer to support the font-faulting mechanism. The font-faulting mechanism, which is enabled by the `alpc` and `ffserver` commands, allows the printer to use fonts that are installed but not downloaded. Font faulting is required to support Chinese, Korean, and some other fonts. The font-faulting daemon (`ffd`) uses the secondary tty line to send font information to the printer.
`yj`	`str`	`NULL`	If `on` (the default) is specified as a value, restarts the filter specified for the `of` symbol for every print job. You need to define this symbol only for printers that are not country-specific and only if non-ASCII characters need to be printed on the flag page of printed output.
`yp`	`str`	`NULL`	Printer ID that conforms to the WoToTo Standard (for Thai printers).
`ys`	`num`	`NULL`	Size of the SoftODL character cache. The `ys` entry is applied to text print filters. It must be present and its value must be greater than zero to enable on-demand loading of font files. These font files are the ODL support files created by the `cgen` utility for user-defined characters. The location of the SoftODL support files is identified by the path for systemwide ODL files in the database location configuration file `/usr/var/i18n/conf/cp_dirs`. ODL files for private UDC databases are not downloaded to printers. For optimal performance, the cache value specified for the `ys` field should match the printer cache size. To find out the cache size for a particular printer, refer to the printer's manual.
`yt`	`str`	`fifo`	The SoftODL character replacement method. The `yt` entry applies to text print filters. The value for this entry can be either `fifo` (first-in-first-out) or `lru` (least recently used). You can type either uppercase or lowercase letters for these values. To find out which value is appropriate for a particular printer, refer to the printer's manual.
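To show how the symbols in Table 6-6 sit alongside other symbol/value pairs, the following Python sketch splits a printcap-style entry into its fields. It assumes the conventional printcap notation (colon-separated fields, `=` for string values, `#` for numeric values, a bare symbol for a Boolean) and a single logical line with no colons inside values. The entry itself is hypothetical: the device path, filter path, and cache size are placeholders, not recommended settings, and the function names are illustrative.

```python
# Hypothetical entry; the paths and the ys value are placeholders.
ENTRY = ('ln05|lp1:lp=/dev/tty01:of=/usr/lbin/ln05jaof:'
         'ya="plocale=ja_JP.sdeckanji":ys#64:yt=fifo:')

def parse_printcap_entry(entry):
    """Split one printcap-style entry into (printer names, {symbol: value})."""
    names, *fields = entry.strip().split(":")
    caps = {}
    for field in fields:
        field = field.strip()
        if not field:
            continue  # empty field left by a trailing colon
        symbol, rest = field[:2], field[2:]  # symbols are 2-character mnemonics
        if rest.startswith("="):
            caps[symbol] = rest[1:].strip('"')  # string value, quotes removed
        elif rest.startswith("#"):
            caps[symbol] = int(rest[1:])        # numeric value
        else:
            caps[symbol] = True                 # Boolean symbol
    return names.split("|"), caps

def odl_enabled(caps):
    """On-demand loading needs ys to be present and greater than zero."""
    return caps.get("ys", 0) > 0

names, caps = parse_printcap_entry(ENTRY)
```

Here `caps["ya"]` carries the quoted keyword list, `caps["ys"]` is numeric, and `odl_enabled(caps)` reflects the rule stated for the `ys` symbol.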

The ya symbol is defined for printing languages whose characters are not included in the Latin-1 character set. The value assigned to the ya symbol is a quoted string that can include one or more of the following keywords:

flocale=locale_name
Specifies the locale for interpretation of file text. The print filter uses this locale to validate characters in the text. For an Asian language that is supported by more than one codeset, a difference between the flocale and plocale values determines whether codeset conversion is done before the file is printed. If flocale is not specified, the filter interprets the file in the current locale.

font=font_name
Specifies the name of the outline font for printing PostScript files. This font must be appropriate for the specified plocale value.

line=number_of_lines
Specifies the number of lines per page. When used in combination with the -w flag of the lpr command, the line number can control the font size and orientation of printed output.

odldb=odl_database_path
Specifies the pathname of the SoftODL database. By default, the printer uses the systemwide database as specified in the cp_dirs file.

odlstyle=style-NxN
Specifies the SoftODL font style and size to use, for example normal-24x24. If odlstyle is not specified, the default style and size set for the systemwide database is used.

onehalf
For the Thai language, specifies that characters be printed on one and a half lines, rather than three lines, to produce more compressed and natural looking output. The onehalf option is valid only for the thailpof print filter.

plocale=locale_name
Specifies the printer locale. Some printers, such as the LA380-CB printer, are country-specific and have built-in fonts that are encoded in a particular codeset. For these printers, the codeset part of locale_name should match the codeset of the built-in fonts. Other printers, such as the DEClaser 5100, are generic and suitable for printing files in a variety of languages. For these printers, the codeset part of locale_name should match the codeset of the font needed to print files in a particular language (or set of languages). Remember that to use the same generic printer for printing files in different languages, you must define a separate print queue and spool directory for each language (codeset) in which print jobs will be submitted.

spcom
Enables space-compensation mode for languages, such as Thai, that contain nonspacing characters. These characters can combine with other characters for display and therefore do not occupy space. Many of the existing tools that align text do not handle nonspacing characters correctly. If you want to print the Thai output that these tools generate, you should specify the spcom option to ensure proper text alignment in the printed file. This option is valid only when used with a Thai print filter or the th_TH.TACTIS plocale value.

tacdata=tac_data_path
Specifies the location of the character code tables used with the thailpof print filter. By default, tac_data_path is /usr/lbin/tac_data.

tm
Enables text morphing for printing Thai characters. Text morphing replaces some characters with others to produce better printed output. Refer to Thai(5) for information on text morphing.
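The relationship between the flocale and plocale keywords can be sketched in Python as follows. The sketch assumes that locale names follow the language_territory.codeset@variant pattern and that only the codeset part matters when deciding whether conversion is needed. That is a simplification for illustration, not a description of how the print filters are implemented, and the function names are hypothetical.

```python
def codeset_of(locale_name):
    """Return the codeset part of a locale name, or None if it has none."""
    base = locale_name.split("@", 1)[0]   # drop any @variant suffix
    _, dot, codeset = base.partition(".")
    return codeset if dot else None

def needs_conversion(flocale, plocale):
    """Assume conversion is needed when the two codeset parts differ."""
    return codeset_of(flocale) != codeset_of(plocale)

# A file in Japanese EUC sent to a printer whose locale uses sdeckanji.
print(needs_conversion("ja_JP.eucJP", "ja_JP.sdeckanji"))
```

Under this assumption, a file locale of ja_JP.eucJP and a printer locale of ja_JP.sdeckanji differ in codeset, so conversion would be needed; two locale names that differ only in an @variant suffix would not trigger it.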

6.9.4 Enhancements to Printer Configuration Software

The CDE Printer Configuration application is the desktop application that helps you add, delete, or change the characteristics of the printers on your system. The lprsetup utility is an alternative way to do these operations if your system is not running windows software. In both cases, the software performs necessary tasks, such as creating the printer spooling directory, linking the appropriate filter to the printer, and writing the entry for the printer in the /etc/printcap file. See lprsetup.dat(4) for information about mapping the product names of supported printers to their system identifiers. Refer to the System Administration manual for detailed information and examples for printer setup.

Example 6-2 shows how you use the lprsetup command to set up a local-language printer, in this case ln05ja.

Example 6-2: Setting Up a Local Language Printer with lprsetup

# /usr/sbin/lprsetup   [1]
Printer Setup Program
 
Command < add modify delete exit view quit help >: add
 
Adding printer entry, type '?' for help.
 
Enter printer name to add [0] : ln05    [2]
 
For more information on the specific printer types Enter
`printer?'
 
Enter the FULL name of one of the following printer
types:
 
cp382d    dl1152w  dl5100w     dl510ka  ep1050+   fx1050
fx80      hp4mplus hp4mplus_a4 hpsimx   hpsimx_a4 hp680c
hp680c_a4 hpIII    hpIIIP      hpIIP    hpIV      ibmpro
la280     la30     la30n_a4    la30w    la30w_a4  la324
la380     la380cb  la380k      la400    la424     la50
la600     la70     la75        la84     la86      la88
la88c     la90     lf01r       lg02     lg04plus  lg06
lg08      lg12     lg12plus    lg31     lg104plus lg108plus
lj250     ln03     ln03ja      ln03r    ln03s     ln05
ln05ja    ln05r    ln06        ln06r    ln07      ln07r
ln08      ln08r    ln09        ln10ja   ln14      ln17
ln17_a4   ln17p    ln17ps_a4   ln82r    nec290    ps_level1
ps_level2 remote   wwpsof      xf       unknown
generic_ansi generic_ansi_a4 generic_text generic_text_a4
or press RETURN for [unknown] : ln05ja    [3]

.
.
.
Enter the name of the printcap symbol you wish to modify.
Other valid entries are:
 
        'q' to quit (no more changes)
        'p' to print the symbols you have specified so far.
        'l' to list all of the possible symbols and defaults.
 
The names of the printcap symbols are:
 
 af  br  cf  ct  df  dn  du  fc  ff  fo  fs  gf  ic  if  lf  lo
 lp  mc  mx  nc  nf  of  op  os  pl  pp  ps  pw  px  py  rf  rm
 rp  rs  rw  sb  sc  sd  sf  sh  st  tf  tr  ts  uv  vf  xc  xf
 xs  ya  yd  yj  yp  ys  yt  Da  Dl  It  Lf  Lu  Ml  Nu  Or  Ot
 Ps  Sd  Si  Ss  Ul  Xf
 
Enter symbol name: ya    [4]
 
Enter a new value for symbol 'ya'? ["plocale=ja_JP.sdeckanji"]
 
Do you want to enable ODL? [n] y    [5]
 
Enter symbol name: yt    [6]
 
Enter a new value for symbol 'yt'? [fifo]
 
Enter symbol name: q    [7]

.
.
.

[1] Invokes the lprsetup program.

[2] Selects a name for the printer (see Table 6-7).

[3] Selects the printer type.

[4] Specifies the printer locale.

[5] Enables on-demand loading (ODL) of printer fonts for user-defined characters. An affirmative response also sets the cache size that the SoftODL service uses. This value, by default the appropriate cache size for the printer, is stored as the value of the ys symbol in the /etc/printcap file.

[6] Specifies the character replacement method that the SoftODL service uses.

[7] Quits the program to indicate no more changes are needed to the /etc/printcap file.

Table 6-7 lists Asian languages and the associated printer choices as displayed by the lprsetup script.

Table 6-7: Local Language Printers Supported by the lprsetup Command

Language	Printer
Japanese (text only)	`la84j`, `la86j`, `la90j`, `la280j`, `la380j`, `ln03ja`, `ln05ja`
Japanese (PostScript)	`ln82r`
Traditional Chinese (text only)	`cp382d`
Simplified Chinese (text only)	`la88c`, `la380cb`
Korean (text only)	`la380k`, `dl510ka`
Czech, Traditional Chinese, Simplified Chinese, Hungarian, Greek, Korean, Polish, Russian, Slovak, Slovene, and Turkish (PostScript)	`dl1152w`, `dl5100w`, `wwpsof`, `lps17` ^{[Footnote 3]}
Thai (text only)	`ep1050+`
Thai (PostScript)	`dl1152t`, `dl1152ttm`, `dl5100t`, `dl5100ttm`

6.9.5 Print Commands and the Printer Daemon

The lp, lpc, lpd, lpq, lpr, lprm, and lpstat commands handle the features added to the print subsystem for Asian and other languages not in the Latin-1 group. For example, the lpr command includes the -A option and additional values for the -O option to give users access to such features. See lpr(1) for details about local-language options and values.

6.9.6 Choosing PostScript Fonts for Different Locales

The fonts for the Chinese and Korean languages do not fit in the memory of most PostScript printers. Fonts for the Thai language and some European languages do fit in memory, but are large enough that they do not fit in printer memory along with fonts for other languages. For PostScript printers that are currently available and for which fonts supporting certain languages are not printer-resident, the wwpsof print filter (see Section 6.9.1.2) provides a solution. In this case, you may need to specify in a printer's configuration file the names of the PostScript fonts you want to use for different languages. Tru64 UNIX also provides a mechanism for selectively downloading fonts to certain older PostScript printer products as described in i18n_printing(5). In this case, you have to choose among fonts to be downloaded to the printer.

The following list associates languages and codesets with the appropriate set of PostScript fonts:

Hungarian, Czech, Slovak, Slovene (*.ISO8859-2)

Arial-Bold-ISOLatin2
Arial-BoldItalic-ISOLatin2
Arial-Italic-ISOLatin2
Arial-ISOLatin2
ArialNarrow-Bold-ISOLatin2
ArialNarrow-BoldItalic-ISOLatin2
ArialNarrow-Italic-ISOLatin2
ArialNarrow-ISOLatin2
BookAntiqua-Bold-ISOLatin2
BookAntiqua-BoldItalic-ISOLatin2
BookAntiqua-Italic-ISOLatin2
BookAntiqua-ISOLatin2
BookmanOldStyle-Bold-ISOLatin2
BookmanOldStyle-BoldItalic-ISOLatin2
BookmanOldStyle-Italic-ISOLatin2
BookmanOldStyle-ISOLatin2
CenturyGothic-Bold-ISOLatin2
CenturyGothic-BoldItalic-ISOLatin2
CenturyGothic-Italic-ISOLatin2
CenturyGothic-ISOLatin2
CenturySchoolbook-Bold-ISOLatin2
CenturySchoolbook-BoldItalic-ISOLatin2
CenturySchoolbook-Italic-ISOLatin2
CenturySchoolbook-ISOLatin2
Courier-Bold-ISOLatin2
Courier-BoldItalic-ISOLatin2
Courier-Italic-ISOLatin2
Courier-ISOLatin2
MonotypeCorsiva-ISOLatin2
TimesNewRoman-Bold-ISOLatin2
TimesNewRoman-BoldItalic-ISOLatin2
TimesNewRoman-Italic-ISOLatin2
TimesNewRoman-ISOLatin2

Russian (*.ISO8859-5)

Arial-Bold-ISOLatinCyrillic
Arial-BoldInclined-ISOLatinCyrillic
Arial-Inclined-ISOLatinCyrillic
Arial-ISOLatinCyrillic
Courier-Bold-ISOLatinCyrillic
Courier-BoldInclined-ISOLatinCyrillic
Courier-Inclined-ISOLatinCyrillic
Courier-ISOLatinCyrillic
Nimrod-Bold-ISOLatinCyrillic
Nimrod-BoldInclined-ISOLatinCyrillic
Nimrod-Inclined-ISOLatinCyrillic
Nimrod-ISOLatinCyrillic
Plantin-Bold-ISOLatinCyrillic
Plantin-BoldInclined-ISOLatinCyrillic
Plantin-Inclined-ISOLatinCyrillic
Plantin-ISOLatinCyrillic
TimesNewRoman-Bold-ISOLatinCyrillic
TimesNewRoman-BoldInclined-ISOLatinCyrillic
TimesNewRoman-Inclined-ISOLatinCyrillic
TimesNewRoman-ISOLatinCyrillic

Greek (*.ISO8859-7)

Arial-Bold-ISOLatinGreek
Arial-BoldInclined-ISOLatinGreek
Arial-Inclined-ISOLatinGreek
Arial-ISOLatinGreek
Courier-Bold-ISOLatinGreek
Courier-BoldInclined-ISOLatinGreek
Courier-Inclined-ISOLatinGreek
Courier-ISOLatinGreek
TimesNewRoman-Bold-ISOLatinGreek
TimesNewRoman-BoldInclined-ISOLatinGreek
TimesNewRoman-Inclined-ISOLatinGreek
TimesNewRoman-ISOLatinGreek

Hebrew (*.ISO8859-8)

David-Bold-ISOLatinHebrew
David-BoldOblique-ISOLatinHebrew
David-ISOLatinHebrew
David-Oblique-ISOLatinHebrew
FrankRuhl-Bold-ISOLatinHebrew
FrankRuhl-BoldOblique-ISOLatinHebrew
FrankRuhl-ISOLatinHebrew
FrankRuhl-Oblique-ISOLatinHebrew
Miriam-Bold-ISOLatinHebrew
Miriam-BoldOblique-ISOLatinHebrew
Miriam-ISOLatinHebrew
Miriam-Oblique-ISOLatinHebrew
MiriamFixed-Bold-ISOLatinHebrew
MiriamFixed-BoldOblique-ISOLatinHebrew
MiriamFixed-ISOLatinHebrew
MiriamFixed-Oblique-ISOLatinHebrew
NarkissTam-Bold-ISOLatinHebrew
NarkissTam-BoldOblique-ISOLatinHebrew
NarkissTam-ISOLatinHebrew
NarkissTam-Oblique-ISOLatinHebrew

Turkish (*.ISO8859-9)

Arial-Bold-ISOLatin5
Arial-BoldItalic-ISOLatin5
Arial-Italic-ISOLatin5
Arial-ISOLatin5
ArialNarrow-Bold-ISOLatin5
ArialNarrow-BoldItalic-ISOLatin5
ArialNarrow-Italic-ISOLatin5
ArialNarrow-ISOLatin5
BookAntiqua-Bold-ISOLatin5
BookAntiqua-BoldItalic-ISOLatin5
BookAntiqua-Italic-ISOLatin5
BookAntiqua-ISOLatin5
BookmanOldStyle-Bold-ISOLatin5
BookmanOldStyle-BoldItalic-ISOLatin5
BookmanOldStyle-Italic-ISOLatin5
BookmanOldStyle-ISOLatin5
CenturyGothic-Bold-ISOLatin5
CenturyGothic-BoldItalic-ISOLatin5
CenturyGothic-Italic-ISOLatin5
CenturyGothic-ISOLatin5
CenturySchoolbook-Bold-ISOLatin5
CenturySchoolbook-BoldItalic-ISOLatin5
CenturySchoolbook-Italic-ISOLatin5
CenturySchoolbook-ISOLatin5
Courier-Bold-ISOLatin5
Courier-BoldItalic-ISOLatin5
Courier-Italic-ISOLatin5
Courier-ISOLatin5
MonotypeCorsiva-ISOLatin5
TimesNewRoman-Bold-ISOLatin5
TimesNewRoman-BoldItalic-ISOLatin5
TimesNewRoman-Italic-ISOLatin5
TimesNewRoman-ISOLatin5

Traditional Chinese (*.dechanyu)
```
Sung-Light-CNS11643
Hei-Light-CNS11643
```

Simplified Chinese (*.dechanzi)
```
XiSong-GB2312-80
Hei-GB2312-80
```

Korean (*.deckorean)
```
Munjo
```

Japanese (*.deckanji)
None (uses printer built-in fonts)

Thai (*.TACTIS)

AngsanaUPC-Bold
AngsanaUPC-BoldItalic
AngsanaUPC-Italic
AngsanaUPC-Light
CordiaUPC-Bold
CordiaUPC-BoldItalic
CordiaUPC-Italic
CordiaUPC-Light
EucrosiaUPC-Bold
EucrosiaUPC-BoldItalic
EucrosiaUPC-Italic
EucrosiaUPC-Light
FreesiaUPC-Bold
FreesiaUPC-BoldItalic
FreesiaUPC-Italic
FreesiaUPC-Light
IrisUPC-Bold
IrisUPC-BoldItalic
IrisUPC-Italic
IrisUPC-Light
JasmineUPC-Bold
JasmineUPC-BoldItalic
JasmineUPC-Italic
JasmineUPC-Light
KodchiangUPC-Bold
KodchiangUPC-BoldItalic
KodchiangUPC-Italic
KodchiangUPC-Light
LilyUPC-Bold
LilyUPC-BoldItalic
LilyUPC-Italic
LilyUPC-Light
WaterlilyUPC-Bold
WaterlilyUPC-BoldItalic
WaterlilyUPC-Italic
WaterlilyUPC-Light
YuccaUPC-Bold
YuccaUPC-BoldItalic
YuccaUPC-Italic
YuccaUPC-Light

6.10 Using Mail in a Multilanguage Environment

Tru64 UNIX provides enhanced versions of the following commands and utilities to handle languages based on multibyte-character codesets:

sendmail

mailx

MH (mail handler)

The following sections discuss enhancements to these components, along with a discussion of codeset conversion done by the comsat server. Refer to sendmail(8), mailx(1), mh(1), and comsat(8) for more complete software descriptions.

6.10.1 The sendmail Utility

The sendmail utility, which is a back end to several user commands, is configured by default to support 8-bit data. The configuration that supports 8-bit data is required for multibyte character support. Refer to sendmail(8) for restrictions that apply to the 8-bit configuration.

6.10.2 The mailx Command and MH Commands

The mailx command and all applicable commands in the MH system support the conversion of mail messages between the mail interchange codeset (used to transfer messages to some hosts) and a user's application codeset. For example, if the mail interchange codeset is ISO-2022-JP and the application codeset is eucJP, the mailx or MH command converts incoming messages to the Japanese EUC codeset before displaying them.

To prevent data loss, when incoming messages are stored in mail folders, the messages are encoded in the codeset in which they are received. Codeset conversion takes place when users extract or display the messages.

To communicate mail interchange code information to other systems, outgoing messages include two additional header lines like the following:

Mime-Version: 1.0
 
Content-Type: TEXT/PLAIN; charset=ISO-2022-JP

The charset field in the preceding example specifies the mail interchange codeset, in this case, ISO-2022-JP. This codeset is an ISO 7-bit state-dependent codeset for Japanese characters. Codesets other than those that are part of the ISO standard are identified by the prefix X- in the codeset name. For example, when DEC Hanyu is the codeset used for mail interchange, the following header lines are included in outgoing mail messages:

Mime-Version: 1.0
 
Content-Type: TEXT/PLAIN; charset=X-dechanyu
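The header convention shown above can be reproduced and read back with the standard Python email package. This is only a sketch of the header format; it is not the code that mailx or the MH commands use, and the function names and the `iso_standard` flag are hypothetical.

```python
from email import message_from_string

def interchange_headers(codeset, iso_standard):
    """Build the two extra header lines for an outgoing message.

    Codesets outside the ISO standard get the X- prefix.
    """
    label = codeset if iso_standard else "X-" + codeset
    return f"Mime-Version: 1.0\nContent-Type: TEXT/PLAIN; charset={label}\n"

def charset_of(raw_message):
    """Return the charset parameter (lowercased by the library) or None."""
    return message_from_string(raw_message).get_content_charset()

raw = interchange_headers("dechanyu", iso_standard=False) + "\nmessage body\n"
print(charset_of(raw))
```

Note that `get_content_charset()` returns the charset name in lowercase, and returns `None` when a message carries no Content-Type header, which corresponds to the case where the additional header lines are absent.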

The mailx command and MH commands use the following values (listed in order of highest to lowest priority) to determine or set the mail interchange and application codesets for a particular message:

The mail interchange codeset applied to incoming messages is determined from:
1. The charset field in the mail header, if additional header lines are present in the message
2. The codeset specified as the systemwide mail interchange default in the /usr/lib/mail-codesets file
  If you create this file, it contains a single entry, which is the name of a locale.
If neither of the preceding values is available, codeset conversion does not occur.

The mail interchange codeset applied to outgoing messages is determined from:
1. The setting of the EXCODE environment variable
2. The setting of the excode component as defined in the $HOME/.mailrc file (for mailx users) or the $HOME/.mh_profile file (for users of MH commands)
3. The content of the /usr/lib/mail-codesets file
If a codeset is not determined for outgoing mail interchange, the mail is sent with no codeset identifier.

The application codeset is determined from:
1. The setting of the LANG environment variable
2. The value of the lang component in the $HOME/.mailrc file (for the mailx command) or the $HOME/.mh_profile file (for MH commands)
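The priority rules above amount to taking the first value that is set from an ordered list of sources. The following Python sketch expresses that idea. The function and parameter names are hypothetical, the values are treated as opaque strings, and reading the environment, the startup files, or the /usr/lib/mail-codesets file is left to the caller.

```python
def first_set(*candidates):
    """Return the first non-empty candidate, or None if none is set."""
    for value in candidates:
        if value:
            return value
    return None

def incoming_interchange(header_charset, system_default):
    # 1. charset field in the mail header, 2. systemwide default.
    # None means that codeset conversion does not occur.
    return first_set(header_charset, system_default)

def outgoing_interchange(excode_env, excode_rc, system_default):
    # 1. EXCODE environment variable, 2. excode component in the
    # user's startup file, 3. systemwide default.
    # None means that the mail is sent with no codeset identifier.
    return first_set(excode_env, excode_rc, system_default)

def application_codeset(lang_env, lang_rc):
    # 1. LANG environment variable, 2. lang component in the startup file.
    return first_set(lang_env, lang_rc)

print(outgoing_interchange(None, "ISO-2022-JP", None))
```

For example, when EXCODE is not set but the excode component is defined in the user's startup file, the second source supplies the outgoing interchange codeset.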

6.10.3 The comsat Server

The comsat server, which notifies users of incoming mail messages, always attempts to convert incoming mail messages from the mail interchange codeset to the user's application codeset. The comsat server uses the following values (in order of highest to lowest priority) to determine the codesets that apply to a message:

For the mail interchange codeset:
1. The charset field, if included in the mail message header
2. The codeset specified as the systemwide mail interchange default in the /usr/lib/mail-codesets file
If neither of the preceding values is available, codeset conversion does not occur.

For the application codeset:
1. The application codeset defined for the tty driver of the user's system
2. The codeset name in the $HOME/.codeset_device_name file, where device_name is the name of the terminal device for the current session

6.11 Applying Sort Orders to Non-English Characters

The sort command sorts characters according to the collation sequence defined for the current locale. A particular locale can apply only one set of collation rules to the associated character set. Multiple locale names do exist, however, for the same combination of language, territory, and character set. Most often, these variations exist to offer users the choice of more than one collating sequence.

When more than one locale is available for a given combination of language, territory, and codeset, some of the locale names include a suffix with the format @variant. To avoid problems with pathnames constructed using the %L specifier, you should assign a locale name with a suffix that is category specific only to the appropriate locale category variable (or variables). In the following example, the locale assigned to LC_COLLATE differs from the locale assigned to LANG only with respect to collating sequence:

% setenv LANG zh_TW.eucTW
% setenv LC_COLLATE zh_TW.eucTW@radical

Supporting different collation orders through one or more locales is adequate for most languages. However, collation orders for Asian languages require additional support for the following reasons:

Asian languages include user-defined characters, which are not specified in a locale. These characters can be defined with a collation weight. In this case, the collation weight needs to be applied when the user-defined characters are encountered in the strings being sorted.

Ideographic characters can be sorted on more than one dimension (radical, stroke, phonetic, and internal code). Some users need to combine these dimensions during sort operations. In one operation the user may need to sort characters first by radical and then according to the number of strokes. For another operation, the user may need to put characters first in phonetic order, then according to the number of strokes, and so on. Sorting by combinations of dimensions requires breadth-first sorting, rather than the depth-first sorting implemented through locales.

For the preceding reasons, the asort command was developed and is available when you install language variant subsets that support Asian languages. The asort command uses, by default, the collating order defined for the LC_COLLATE variable and supports all the options supported by the sort command. In addition, the asort command includes the following options:

-C
This option indicates that the sort operation should use special system sort tables, along with sort tables produced by the cgen utility to support user-defined characters. This option overrides the sort sequence defined in the locale specified by the LC_COLLATE variable.

-v
This option, which you can use only with the -C option, implements breadth-first sorting.

Refer to asort(1) for more information about using this command.
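To illustrate what sorting on a combination of dimensions means, the following Python sketch orders five characters first by radical and then by stroke count, and then the other way around. It does not reproduce the asort sort tables or its breadth-first algorithm; the tuple layout and function names are assumptions made for this example. The radical numbers are Kangxi radical numbers.

```python
# (character, Kangxi radical number, total stroke count)
CHARS = [
    ("森", 75, 12),
    ("休", 9, 6),
    ("木", 75, 4),
    ("人", 9, 2),
    ("林", 75, 8),
]


def by_radical_then_strokes(entries):
    """Order by radical first; break ties by stroke count."""
    ordered = sorted(entries, key=lambda e: (e[1], e[2]))
    return "".join(ch for ch, _, _ in ordered)


def by_strokes_then_radical(entries):
    """Order by stroke count first; break ties by radical."""
    ordered = sorted(entries, key=lambda e: (e[2], e[1]))
    return "".join(ch for ch, _, _ in ordered)


if __name__ == "__main__":
    print(by_radical_then_strokes(CHARS))  # 人休木林森
    print(by_strokes_then_radical(CHARS))  # 人木休林森
```

The two results differ even for this small set, which is why a single collating sequence fixed in a locale cannot serve both kinds of request.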

6.12 Processing Reference Pages in Languages Other Than English

Programmers who supply software applications for UNIX systems frequently supply online reference pages (manpages) to document the application and its components. UNIX text-processing commands and utilities must be able to process translated versions of these reference pages for applications sold to the international market. Enhanced versions of the nroff, tbl, and man commands are included in Tru64 UNIX to support this requirement.

6.12.1 The nroff Command

The nroff command includes the following capabilities to support locales:

Formats reference page source files written in any language whose locale is installed on the system

Supports characters of any supported languages in the string arguments of macros and requests

Supports character mapping of characters for any supported language through the .tr request in reference page source files

Allows you to set the escape character (\), command control character (.), and nobreak control character (') to local language, as well as ASCII, characters

Maps each 2-byte space character, which is defined in most codesets for Asian languages, to two ASCII spaces in output

When formatting reference pages that contain ideographic characters, the nroff command treats each character as a single word. A string of ideographic characters, including 2-byte letters and punctuation characters, can be wrapped to the next line subject to the following constraints:

The last character on the text line cannot be defined as a no-last character by either the standard or private list of no-last characters.

The first character on the text line cannot be defined as a no-first character by either the standard or private list of no-first characters.

The standard no-first, no-last character lists are defined in nroff catalog files. For lists of these characters, refer to the language-specific technical reference guides that are available on the documentation CD-ROM.

The no-first and no-last constraints exist to prevent nroff from placing a punctuation mark or right parenthesis at the beginning of a text line or placing a left parenthesis at the end of a text line. You can turn the standard constraints on and off in source files with the .ki and .ko commands, respectively.
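The constraints can be sketched as a simple greedy line breaker. The following Python example is an illustration only, not the nroff implementation: it counts every character as one column, the default no-first and no-last lists are sample values rather than the lists in the nroff catalog files, and wrap_ideographic is a hypothetical name.

```python
def wrap_ideographic(text, width, no_first="、。）", no_last="（"):
    """Greedy wrap in which every character is treated as a separate word.

    A break is legal only if the line does not end with a no-last
    character and the next line does not start with a no-first character.
    """
    lines = []
    start = 0
    while len(text) - start > width:
        cut = start + width
        # Move the break left until it is legal or no room remains.
        while cut > start + 1 and (
            text[cut - 1] in no_last or text[cut] in no_first
        ):
            cut -= 1
        if text[cut - 1] in no_last or text[cut] in no_first:
            cut = start + width  # no legal break found; break at the margin
        lines.append(text[start:cut])
        start = cut
    lines.append(text[start:])
    return lines


if __name__ == "__main__":
    # The full stop is pulled onto the same line as the character before it.
    print(wrap_ideographic("ABCD。EFG", 4))
    # The left parenthesis is pushed to the start of the next line.
    print(wrap_ideographic("AB（CD", 3))
```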

You can also define a private set of no-first and no-last characters with the following command:

.kl 'no-first-list'no-last-list'

The parameters no-first-list and no-last-list are strings of characters you should include in the no-first and no-last categories. You cancel a private no-first and no-last list by entering a .kl command with null strings as the parameters. For example:

.kl '''

Note

The characters specified in the .kl command override, rather than supplement, the characters in the standard set of no-first and no-last characters. Therefore, you cannot use the standard set of no-first and no-last characters together with a private set.
Using the command .kl ''' restores use of the standard set of no-first and no-last characters for the current locale.

The nroff command can format text so that it is justified or not justified to the right margin. When text is justified to the right margin, nroff inserts spaces between words in the line. Ideographic characters, although treated as words in most stages of the formatting process, differ in terms of whether they can be delimited by spaces. The characters that can be preceded by a space (can-space-before), followed by a space (can-space-after), or both are listed in the language-specific user guides that are available on line when you install language variant subsets of Tru64 UNIX. When right-justifying text, the nroff command inserts spaces only at the following places:

Where 1-byte or 2-byte spaces already occur

Between English and ideographic characters

Before characters defined as can-space-before

After characters defined as can-space-after

In other cases, no space is inserted between consecutive ideographic characters. Therefore, if a text line contains only ideographic characters, it may not be justified to the right margin.
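The four rules above can be sketched as a function that lists the positions in a line where padding may be inserted. This Python example is a simplified illustration, not the nroff algorithm: it treats ASCII characters as "English" and everything else as ideographic, and the names stretch_points, can_space_before, and can_space_after are chosen for this example.

```python
SPACES = " \u3000"  # 1-byte space and 2-byte (ideographic) space


def is_english(ch):
    """Simplifying assumption: ASCII stands in for English text."""
    return ord(ch) < 128


def stretch_points(line, can_space_before="", can_space_after=""):
    """Return each index i where padding may go between line[i-1] and line[i]."""
    points = []
    for i in range(1, len(line)):
        left, right = line[i - 1], line[i]
        if left in SPACES:
            points.append(i)  # an existing space can be widened
        elif right in SPACES:
            continue  # handled when the space becomes the left character
        elif is_english(left) != is_english(right):
            points.append(i)  # boundary between English and ideographic text
        elif right in can_space_before or left in can_space_after:
            points.append(i)
    return points


if __name__ == "__main__":
    print(stretch_points("ab漢字cd"))   # two boundaries
    print(stretch_points("漢字漢字"))    # empty: nothing to stretch
```

An empty result corresponds to the case described above, in which a line made up only of ideographic characters cannot be stretched to the right margin.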

6.12.2 The tbl Command

The tbl command preprocesses table formatting commands within blocks delimited by the .TS and .TE macros. The tbl command handles multibyte characters that can occur in text of languages other than English.

The tbl command is frequently used along with the neqn equation formatting preprocessor to filter input passed to the nroff command. In such cases, specify tbl first to minimize the volume of data passed through the pipes. For example:


% cd /usr/share/ja_JP.deckanji/man/man1
% tbl od.1 | neqn | nroff -Tlpr -man -h | \
lpr -Pmyprinter

When printing Asian language text, you must use printer hardware that supports the language.

6.12.3 The man Command

The man command can handle multibyte characters in reference page files. By default, the man command automatically searches for reference pages in the /usr/share/locale_name/man directory before searching the /usr/share/man and /usr/local/man directories. Therefore, if the LANG environment variable is set to an installed locale and if reference page translations are available for that locale, the man command automatically displays reference pages in the appropriate language.

In addition, the man command automatically applies codeset conversion (assuming the availability of appropriate converters) when reference page translations for a particular language are encoded in a codeset that does not match the codeset of the user's locale. Refer to man(1) for information about redefining the man command search path and for more details about codeset conversion.
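The default search order described in this section can be sketched as follows. The Python function below is illustrative only; man_search_dirs is a hypothetical name, and skipping the locale directory for the C and POSIX locales is an assumption, not documented behavior.

```python
def man_search_dirs(lang=None):
    """Return the default reference page directories in search order."""
    dirs = []
    # Assumption: no locale-specific directory for unset, C, or POSIX.
    if lang and lang not in ("C", "POSIX"):
        dirs.append("/usr/share/%s/man" % lang)
    dirs.extend(["/usr/share/man", "/usr/local/man"])
    return dirs


if __name__ == "__main__":
    for directory in man_search_dirs("ja_JP.deckanji"):
        print(directory)
```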

6.13 Converting Data Files from One Codeset to Another

Each locale is based on a specific codeset. Therefore, when an application uses a file whose data is coded in one codeset and runs in a locale based on another codeset, character interpretation may be meaningless. For example, assume that a fictional language includes a character named "quo", which is encoded as \031 in one codeset and \042 in another codeset. If the "quo" character is stored in a data file as \031, the application that reads data from that file should be running in the locale based on the same codeset. Otherwise, \031 identifies a character other than "quo".

Users, the applications they run, or both may need to set the process environment to a particular locale and use a data file created with a codeset different from the one on which the locale is based. The data file in question might be appropriate for a given language and in a codeset different from the user's locale for one of the following reasons:

The data file might have been created on another vendor's system by using a locale based on a vendor-specific codeset. For example, the integration of PCs into the enterprise computing environment increases the likelihood that UNIX users need to process files for which the data encoding is in MS-DOS code page format.

The locale could be one of several UNIX locales that support the same Asian language, such as Japanese. Asian languages are typically supported by a variety of locales, each based on a different codeset.

The data file could be in Unicode (UCS-2), UCS-4, or UTF-8 format. If characters in this file are to be printed or displayed on the screen, they might need to be converted to encodings for which fonts are available on a Tru64 UNIX system.

You can convert a data file from one codeset to another by using the iconv command or the iconv_open, iconv, and iconv_close functions. For example, the following command reads data in the accounts_local file, which is encoded in the SJIS codeset; converts the data to the eucJP codeset; and appends the results to the accounts_central file:

% iconv -f SJIS -t eucJP accounts_local \
>> accounts_central

Many commands and utilities, such as the man command and internationalized print filters, use the iconv functions and associated converters to perform codeset conversion on the user's behalf.
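For comparison, the same SJIS-to-eucJP conversion can be sketched in Python with the standard codecs, which is convenient for checking small samples on a system without the Tru64 UNIX converters. This is an illustration, not a replacement for iconv: the mapping from the codeset names used here to Python codec names is an assumption, the Python codecs may differ from the system converters for vendor-specific or user-defined characters, and the function names are hypothetical.

```python
# Assumed mapping from the codeset names in the text to Python codec names.
PYTHON_CODEC = {"SJIS": "shift_jis", "eucJP": "euc_jp"}


def convert_codeset(data, from_codeset, to_codeset):
    """Decode bytes from one codeset and re-encode them in another."""
    text = data.decode(PYTHON_CODEC[from_codeset])
    return text.encode(PYTHON_CODEC[to_codeset])


def append_converted(src_path, dst_path, from_codeset, to_codeset):
    """Convert a whole file and append the result, like '>>' in the shell."""
    with open(src_path, "rb") as src:
        converted = convert_codeset(src.read(), from_codeset, to_codeset)
    with open(dst_path, "ab") as dst:
        dst.write(converted)


if __name__ == "__main__":
    sample = "日本語".encode("shift_jis")
    print(sample.hex())                                     # SJIS bytes
    print(convert_codeset(sample, "SJIS", "eucJP").hex())   # eucJP bytes
```

The two byte sequences printed for the same text differ, which is the situation described at the start of this section: bytes written under one codeset do not identify the same characters under another.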

The iconv command and associated functions can use either an algorithmic converter or a table converter to convert data. Algorithmic converters, if installed on your system, reside in the /usr/lib/nls/loc/iconv directory; this directory is the one searched first for a converter. This directory also contains an alias file (iconv.alias) that maps different name strings for the same converter to the converter as named on the system. Table converters, if installed on your system, reside in the /usr/lib/nls/loc/iconvTable directory. The value of the LOCPATH variable, if defined, overrides the command's default search path.

The iconv command assumes that a converter name adheres to the following format:

from-codeset_to-codeset

For the preceding example, the iconv command would search for and use the /usr/lib/nls/loc/iconv/SJIS_eucJP converter.
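The naming rule and default directories can be sketched as a small path builder. This Python example only assembles candidate pathnames; it does not consult the iconv.alias file or the LOCPATH variable, it assumes that table converters use the same file name as algorithmic converters, and converter_candidates is a hypothetical helper.

```python
import posixpath

# Default directories from the text, in the order they are searched.
ICONV_DIRS = ("/usr/lib/nls/loc/iconv", "/usr/lib/nls/loc/iconvTable")


def converter_name(from_codeset, to_codeset):
    """Build a converter name in the from-codeset_to-codeset format."""
    return "%s_%s" % (from_codeset, to_codeset)


def converter_candidates(from_codeset, to_codeset, dirs=ICONV_DIRS):
    """List the pathnames where a converter would be looked for."""
    name = converter_name(from_codeset, to_codeset)
    return [posixpath.join(directory, name) for directory in dirs]


if __name__ == "__main__":
    for path in converter_candidates("SJIS", "eucJP"):
        print(path)
```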

Table 6-8 specifies the codeset conversions that Tru64 UNIX supports for English data. The user guides for the language variant subsets include tables with codeset conversions supported for Asian languages.

For detailed information about the iconv command, refer to iconv(1) and iconv_intro(5). For information on functions that programs can use to perform codeset conversion, refer to iconv_open(3), iconv(3), and iconv_close(3). You can find a list of all the codeset converters available for a particular language in the reference page for that language.

Table 6-8: Supported Codeset Conversions for English

Codeset	ASCII-GR	ISO8859-1	ISO8859-1-GL	ISO8859-1-GR
ASCII-GR	-	Yes	No	No
ISO8859-1	Yes	-	Yes	Yes
ISO8859-1-GL	No	Yes	-	No
ISO8859-1-GR	No	Yes	No	-

6.14 Miscellaneous Information for Base System Commands

The following list includes information about features and restrictions that apply when using traditional UNIX commands in local-language environments:

file
The file command has been enhanced to recognize files encoded in Unicode or ISO 10646 (UCS-2 or UCS-4) format. For other kinds of text files, the command recognizes when the character encoding is valid for the codeset of the current locale. The file command also has a jfile alias. When you use this alias, the command recognizes the most commonly used encodings for Japanese (DEC Kanji, Japanese EUC, Shift JIS, and 7-bit JIS) regardless of the current locale setting. For more information, see file(1).

rlogin
When using the rlogin command to log on to a Tru64 UNIX system from an ULTRIX system, be sure to specify the -8 flag to pass 8-bit data without stripping. Otherwise, you will have problems entering non-ASCII characters from your terminal.
If you view a large data file while logged on to the remote system, use a pager command, such as pg, rather than the Hold Screen key. The -8 option sets the terminal mode of the original host to RAW, which disables flow control. Therefore, if data is sent to the terminal at a rate faster than the terminal can handle, some data is lost when you use the Hold Screen key.
This rlogin restriction applies not only when logging in from an ULTRIX system, but when logging in from any UNIX system whose software does not fully support 8-bit data format.

Emacs editor
The operating system includes the multilingual Emacs software from the Free Software Foundation. Before using this editor, you must add the /usr/i18n/mule/bin directory to your process-specific search path. You can then invoke this editor by using the mule command.

vi and more
The vi and more commands discard text that follows an invalid multibyte character. If you encounter this problem, it is likely that your locale setting is not correct for the text being viewed or edited. In this case, reset your locale to one that matches the text and invoke the command again.
When used with Thai characters, vi may wrap lines before the right boundary of the screen. This happens because Thai text includes nonspacing characters, which contribute to the character count but not to the display width. The editor wraps lines based on character count. For example, vi may wrap a line after entry of 80 characters, even though these characters do not occupy 80 columns on the screen.

Using local-language user names and file names
It is a limitation of UNIX file systems that you cannot use a multibyte character whose second or subsequent byte is an ASCII slash (/) in names of files, users, or other objects. This limitation means that user-defined characters in the DEC Hanzi and DEC Kanji codesets and certain characters (CNS Plane 2 characters) in the DEC Hanyu codeset cannot be used in these names.
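The line-wrapping behavior described above for vi with Thai text comes down to the difference between the number of characters in a line and the number of columns the line occupies. The following Python sketch illustrates that difference; it is not how vi is implemented. It assumes that nonspacing characters can be recognized by the Unicode general category Mn, and display_columns is a hypothetical helper name.

```python
import unicodedata


def display_columns(text):
    """Count columns, treating nonspacing marks (category Mn) as zero width."""
    return sum(1 for ch in text if unicodedata.category(ch) != "Mn")


if __name__ == "__main__":
    # One Thai consonant carrying a vowel sign and a tone mark above it.
    syllable = "\u0e17\u0e35\u0e48"
    line = syllable * 30
    print(len(line), display_columns(line))  # character count vs. columns
```

An editor that wraps on the character count treats this line as longer than 80 characters even though it occupies only 30 columns.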

6.15 Using Language Support Enhancements for Motif Applications

In the Motif environments, such as CDE, you use versions of fonts, codesets, servers, and applications that support features discussed in earlier sections of this chapter. This section provides more detail on using features that help support Asian languages. Topics include:

Tuning the cache and unit size of the X Display Server for languages with ideographic characters

Using font renderers for multibyte PostScript fonts

Customizing a window for local languages

6.15.1 Tuning the X Server for Ideographic Languages

Asian languages have large ideographic character sets, so not all the characters needed for display are loaded into memory at the same time. Instead, only as many characters as will fit in the memory cache are simultaneously loaded. When characters needed for display are not currently cached in memory, the least recently used font glyphs are removed from the cache to make room. The font-cache mechanism allows you to display ideographic text in multiple typefaces, font sizes, and font styles without increasing the amount of memory that systems must have to support ideographic languages.

The X Server font-cache mechanism allows you to change the number of cache units and the size of these units to best accommodate the character sets used in displays. You will probably need to change the default values set for cache parameters to achieve the best performance from your system if it will display Asian-language text. Consider the following criteria when deciding on the optimal values for font caching:

The number of ideographic languages that you want to display
If you intend to work with several ideographic languages during the same CDE session, you need larger values for acceptable performance.

The number of fonts that will be used simultaneously
Variation in font number and size depends partly on the kinds of applications you run. A desktop publishing application typically requires more fonts than other types of applications whereas a software development tool requires fewer.

The number of frequently used characters in the languages you want to display
In Asian languages, only a subset of characters is used frequently. The size of this subset varies from one language to another. For example, approximately 20,000 standard characters are supported for Taiwan but only 5,000 of those characters are used frequently. Estimates of the number of frequently used characters for other Asian countries are as follows: People's Republic of China (3000), Korea (2000), and Japan (2000). Font-cache parameters are tuned to accommodate the subset of frequently used characters.

To change the cache size (which is the number of cache units) and the size of each cache unit, you must modify the X Server configuration file /usr/lib/X11/xdm/Xservers. This file contains a line, similar to the following one, that starts the X Server:

:0 local /usr/bin/X11/X

You can modify this line to add definitions for cache size and unit size. For example:

:0 local /usr/bin/X11/X -cs cache_size -cu unit_size

Table 6-9 describes the options that tune the font-cache mechanism.

Table 6-9: X Server Options for Tuning the Font-Cache Mechanism

Option	Description

-cs cache_size
Defines the number of cache units.

The minimum (and also default) value for this parameter is 1024. If you specify a cache size smaller than 1024, font caching is disabled. For one ideographic language, the recommended value is the lowest multiple of 1024 that accommodates the number of frequently used characters in that language.

If a workstation displays multiple ideographic languages simultaneously, you must add together the values required for each language to get the minimum cache size. Specify an even larger value if you intend to run applications, such as desktop publishing software, that require multiple font styles and sizes for each ideographic character.

-cu unit_size
Defines the size of each cache unit.

The minimum value for unit size is 31 bytes and the default value is 128 bytes. If you specify a value smaller than 31 bytes, the value has no effect. If a particular font requires more memory space than 128 bytes, the font-cache mechanism automatically allocates one or more additional units to store its glyphs.

Note

Font caching applies only to uncompressed fonts in pcf format. Font caching is not applied to any compressed fonts or to fonts in bdf format. Because font caching cannot be used with compressed fonts, the 2-byte fonts for Asian languages are not installed in compressed format.
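The cache size recommendation in Table 6-9 can be worked through with a short calculation. The following Python sketch rounds each language's count of frequently used characters up to a multiple of 1024 and adds the results, using the estimates given earlier in this section. The function names are hypothetical, and the extra allowance for applications that use many font styles and sizes is not modeled.

```python
def cache_units_for(frequent_chars, step=1024):
    """Lowest multiple of 1024 that holds the given number of characters."""
    rounded = -(-frequent_chars // step) * step  # ceiling division
    return max(step, rounded)  # 1024 is the minimum and default value


def recommended_cache_size(frequent_chars_by_language):
    """Add the per-language values for languages displayed simultaneously."""
    return sum(cache_units_for(n) for n in frequent_chars_by_language.values())


def xservers_line(cache_size, unit_size=128):
    """Format an Xservers entry that passes the -cs and -cu options."""
    return ":0 local /usr/bin/X11/X -cs %d -cu %d" % (cache_size, unit_size)


if __name__ == "__main__":
    size = recommended_cache_size({"Simplified Chinese": 3000, "Korean": 2000})
    print(size)                 # 3072 + 2048 = 5120
    print(xservers_line(size))
```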

You can calculate cache unit size with the following formula:

unit_size =
((floor(ceil((double)WIDTH / 8.0) /4.0)) + 1.0) * 4.0 * (double)HEIGHT

Consider the following calculation for a typical font size of 24x24:

unit_size in bytes
= ((floor(ceil((double) 24 / 8.0) / 4.0)) + 1.0) * 4.0 * (double) 24
= 96

For 34x34 fonts, the unit size calculation would yield 272 bytes.

Given that 96 bytes are needed to cache a 24x24 font glyph and 272 bytes are needed to cache a 34x34 font glyph, the default unit size of 128 has the following implications:

For 24x24 fonts, each character needs one cache unit. If cache size is set to 4096, the cache can accommodate 4096 characters.

For 34x34 fonts, each character needs three cache units. If cache size is set to 4096, the cache can accommodate 1365 characters.

Small fonts (whose characters require a single, 128-byte unit) are used to display ideographic characters. Therefore, you usually have to change only the cache size to achieve acceptable performance in text displays of languages with ideographic characters.
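The unit size formula and the capacity figures above can be checked with a few lines of Python. This sketch transcribes the formula as given in this section; the function names are hypothetical, and the capacity calculation assumes that every cached glyph has the same size.

```python
import math


def glyph_bytes(width, height):
    """Bytes needed to cache one glyph, following the formula in the text."""
    row_bytes = (math.floor(math.ceil(width / 8.0) / 4.0) + 1) * 4
    return row_bytes * height


def units_per_glyph(width, height, unit_size=128):
    """Number of cache units one glyph occupies."""
    return math.ceil(glyph_bytes(width, height) / unit_size)


def glyph_capacity(cache_size, width, height, unit_size=128):
    """Number of glyphs of one size that fit in the cache."""
    return cache_size // units_per_glyph(width, height, unit_size)


if __name__ == "__main__":
    for size in (24, 34):
        print(size, glyph_bytes(size, size),
              units_per_glyph(size, size),
              glyph_capacity(4096, size, size))
```

For 24x24 and 34x34 glyphs the script reports 96 and 272 bytes, one and three units, and capacities of 4096 and 1365 characters, matching the figures in the text.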

6.15.2 Using Font Renderers for Multibyte PostScript Fonts

The operating system includes font renderers that allow any X application to use the PostScript fonts available for the Chinese and Korean languages. The system administrator can set up font renderers for the following kinds of fonts for use through the X Server or the font server:

Double-byte PostScript outline fonts

UDC fonts

By installing the IOSWWXFR** subset, you automatically enable font rendering for the PostScript outline fonts.

6.15.2.1 Setting Up the Font Renderer for Double-Byte PostScript Fonts

You can set up the font renderer for Chinese and Korean PostScript fonts for use either through the X Server or the font server by editing the appropriate configuration file:

For the X Server, the font renderer is automatically added at installation time to the font_renderers list in the X Server's configuration file.

For a font server, you must manually add the following entry to the renderers list in the font server's configuration file:
```
renderers = other_renderer, other_renderer,...
     libfr_DECpscf.so;DECpscfRegisterFontFileFunctions
```
In addition, you must specify the paths for the PostScript font files in the catalogue list in the same configuration file. Double-byte PostScript fonts for the Asian languages are available in the following directories:
```
/usr/i18n/lib/X11/fonts/KoreanPS
/usr/i18n/lib/X11/fonts/SChinesePS
/usr/i18n/lib/X11/fonts/TChinesePS
```
Each font in these directories has the following components:
- A Type1 font header with the .pfa2 file name extension
  This header file is the only file that must be listed in the fonts.dir file in the font directory.
- A data file with the .csdata file name extension
- A binary metrics file with the .xafm file name extension

The renderer for Asian double-byte PostScript fonts uses its own configuration file that specifies the following information:

Cache size (number of cache units)

Cache unit size

File handler (names associated with font-rendering software)

Default character (character that is printed in place of any character for which there is no glyph)

The default pathname for this configuration file is /var/X11/renderer/DECpscf_config; however, you can change this path by setting the DECPSCF_CONFIG_PATH environment variable.

6.15.2.2 Setting Up the Font Renderer for UDC Fonts

The UDC font renderer accesses the UDC database directly to obtain font glyphs. Therefore, X applications that use this renderer do not need to use .pcf files generated by the cgen utility.

You can set up the UDC font renderer for use either through the X Server or the font server as follows:

For the X Server, the font renderer is automatically added at installation time to the font_renderers list in the X Server's configuration file.

For a font server, you must manually add the following entry to the renderers list in the font server's configuration file:
```
renderers = other_renderer, other_renderer,...
     libfr_UDC.so;UDCRegisterFontFileFunctions
```
In addition, you must specify the path to the UDC database in the catalogue list of the same configuration file. This path should be set to the top directory for the UDC database. For example, /var/i18n/udc is the correct path for a systemwide UDC database if the database was set up in the default directory.
To process UDC characters in a particular language, the font renderer also requires entries in the fonts.dir file in the appropriate PostScript font directory from the following list:
```
/usr/i18n/lib/X11/fonts/SChinesePS
/usr/i18n/lib/X11/fonts/TChinesePS
```
Edit the fonts.dir file to specify virtual file names in the format locale_name.udc followed by the corresponding XLFD names registered for the codesets. Table 6-10 shows the XLFD entry that corresponds to different Asian codesets.

Table 6-10: XLFD Registry Names for UDC Characters

Codeset	XLFD Registry Name
dechanyu, eucTW	DEC.CNS11643.1986-UDC
big5	BIG5-UDC
dechanzi	GB2312.1980-UDC
deckanji, sdeckanji, eucJP	JISX.UDC-1

The following example entry is appropriate for the fonts.dir file in the /usr/i18n/lib/X11/fonts/TChinesePS directory:
```
2
zh_TW.dechanyu.udc -system-decwin-normal-r--24-240-75-75-m-24-DEC.CNS11643.1986-UDC
zh_TW.big5.udc -system-decwin-normal-r--24-240-75-75-m-24-BIG5-UDC
```
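The layout of these entries can be sketched as a small generator: a count line followed by one line per virtual file name, with the registry name taken from Table 6-10. The Python below is illustrative only. It reuses the XLFD prefix from the example entry as an assumption for every codeset, fonts_dir_text is a hypothetical name, and it produces UDC entries only, so any other entries that a real fonts.dir file contains would also have to be included in the count.

```python
# Registry names from Table 6-10, keyed by codeset.
UDC_REGISTRY = {
    "dechanyu": "DEC.CNS11643.1986-UDC",
    "eucTW": "DEC.CNS11643.1986-UDC",
    "big5": "BIG5-UDC",
    "dechanzi": "GB2312.1980-UDC",
    "deckanji": "JISX.UDC-1",
    "sdeckanji": "JISX.UDC-1",
    "eucJP": "JISX.UDC-1",
}

# Assumption: the XLFD fields of the example entry apply to every codeset.
XLFD_PREFIX = "-system-decwin-normal-r--24-240-75-75-m-24-"


def fonts_dir_text(locale_names):
    """Return fonts.dir text with one UDC entry per locale name."""
    lines = [str(len(locale_names))]  # first line is the number of entries
    for name in locale_names:
        codeset = name.split(".", 1)[1]
        xlfd = XLFD_PREFIX + UDC_REGISTRY[codeset]
        lines.append("%s.udc %s" % (name, xlfd))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(fonts_dir_text(["zh_TW.dechanyu", "zh_TW.big5"]), end="")
```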

6.15.3 Setting Fonts for Display of Local Languages

The system on which you install language variant subsets is automatically updated with fonts required to display text in the supported languages.

In CDE, applications access local language fonts through a font alias mechanism. The /usr/dt/config/xfonts/i18n/{75,100}dpi/fonts.alias files, rather than the files installed in the /usr/dt/config/xfonts/locale-name/ areas, are the most critical in determining which fonts an application uses. This arrangement supports both a consistent session language and the ability to run an individual application in a language different from the session language.

6.15.3.1 Accessing Local-Language Fonts for Remote Displays

The system where local-language subsets are installed may function as a client in a client-server display environment. In this case, the local-language fonts must also be available to the window managers for all the server systems where native language text is displayed. You must either install fonts for other locales on the individual systems used for remote login to the system where language variant subsets are installed, or make the fonts known to those systems through a font server. Table 6-11, Table 6-12, Table 6-13, Table 6-14, Table 6-15, Table 6-16, and Table 6-17 describe the fonts used to display text in various local languages. See ISO8859-15(5) for a list of available bitmap fonts for the ISO 8859-15 (Latin-9) codeset, which is not used in any locales but may be needed for displaying the euro character. You can use the /usr/bin/X11/xlsfonts command to determine which fonts are currently installed on a system.

Table 6-11: Bitmap Fonts for Asian Locales

Language	Typeface	Style	Sizes	75dpi	100dpi
Japanese	Gothic (ISO Latin-1)	Normal	8, 10, 12, 14, 18, 24	x	x
	Gothic (Kanji)	Normal	8, 10, 12, 14, 18, 24	x	x
	Gothic (Roman Kana)	Normal	8, 10, 12, 14, 18, 24	x	x
	kmenu (ISO Latin-1)	Normal	12	x	x
	kmenu (Roman Kana)	Normal	12	x	x
	Mincho (ISO Latin-1)	Normal	8, 10, 12, 14, 18, 24	x	x
	Mincho (Kanji)	Normal	8, 10, 12, 14, 18, 24	x	x
	Mincho (Roman Kana)	Normal	8, 10, 12, 14, 18, 24	x	x
	Screen (DECsuppl)	Normal	14, 18, 24	x
	Screen (DECtech)	Normal	14, 18, 24	x
	Screen (ISO Latin-1)	Normal	14, 18, 24	x
	Screen (Kanji00)	Normal	10, 14, 16, 18, 24	x
	Screen (Kanji11)	Normal	10, 14, 18, 24	x
	Screen (Roman Kana)	Normal	10, 14, 18, 24	x
Korean	Gotic	Normal	16, 24	x
	Myungcho	Normal	16, 24, 32	x
	Screen	Normal	18, 24	x
	KS Roman	Normal	18, 24	x
Simplified Chinese	FangSongTi	Normal	24, 34	x
	HeiTi	Normal	16, 24, 34	x
	KaiTi	Normal	24, 34	x
	Screen	Normal	18, 24	x
	SongTi	Normal	16, 24, 34	x
Traditional Chinese	Hei (CNS11643)	Normal	16, 24	x
	Hei (DTSCS)	Normal	16, 24	x
	Screen (CNS11643)	Normal	18, 24	x
	Screen (DTSCS)	Normal	18, 24	x
	Sung (CNS11643)	Normal	24, 32	x
	Sung (DTSCS)	Normal	24, 32	x
Thai	Screen	Normal	14, 18, 24	x
Asia (Misc.)	Screen (DEC Ctrl)	Normal	14, 18, 24	x
	Screen (DRCS)	Normal	18, 24	x

Table 6-12: Bitmap Fonts for *.ISO8859-2 Locales

Language	Typeface	Style	Sizes	75dpi	100dpi
Czech, Hungarian, Polish, Slovak, Slovene	Arial	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Arial Narrow	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Book Antiqua	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Bookman Old Style	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Century Gothic	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Century Schoolbook	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Courier	Normal	8, 10, 12, 14, 18, 24, 36	x	x
		Italic	8, 10, 12, 14, 18, 24, 36	x	x
		Bold	8, 10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	8, 10, 12, 14, 18, 24, 36	x	x
	Monotype Corsiva	Normal	10, 12, 14, 18, 24, 36	x	x
	Times New Roman	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Terminal	Normal	14, 18	x	x
		Double-Width	14, 18	x	x
		Double-Width, Double-Height	28, 36	x	x
		Narrow	14, 18	x	x
		Double-Width, Narrow	14, 18	x	x
		Double-Width, Double-Height, Narrow	28, 36	x	x
		Bold	14, 18	x	x
		Double-Width, Bold	14, 18	x	x
		Double-Width, Double-Height, Bold	28, 36	x	x
		Narrow, Bold	14, 18	x	x
		Double-Width, Narrow, Bold	14, 18	x	x
		Double-Width, Double-Height, Narrow, Bold	28, 36	x	x

Table 6-13: Bitmap Fonts for *.ISO8859-4 Locales

Language	Typeface	Style	Sizes	75dpi	100dpi
Lithuanian	Arial	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Arial Narrow	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Book Antiqua	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Bookman Old Style	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Century Gothic	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Century Schoolbook	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Courier	Normal	8, 10, 12, 14, 18, 24, 36	x	x
		Italic	8, 10, 12, 14, 18, 24, 36	x	x
		Bold	8, 10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	8, 10, 12, 14, 18, 24, 36	x	x
	Monotype Corsiva	Normal	10, 12, 14, 18, 24, 36	x	x
	Times New Roman	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Terminal	Normal	14, 18	x	x
		Double-Width	14, 18	x	x
		Double-Width, Double-Height	28, 36	x	x
		Narrow	14, 18	x	x
		Double-Width, Narrow	14, 18	x	x
		Double-Width, Double-Height, Narrow	28, 36	x	x
		Bold	14, 18	x	x
		Double-Width, Bold	14, 18	x	x
		Double-Width, Double-Height, Bold	28, 36	x	x
		Narrow, Bold	14, 18	x	x
		Double-Width, Narrow, Bold	14, 18	x	x
		Double-Width, Double-Height, Narrow, Bold	28, 36	x	x

Table 6-14: Bitmap Fonts for *.ISO8859-5 Locales

Language	Typeface	Style	Sizes	75dpi	100dpi
Russian	Arial	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Courier	Normal	8, 10, 12, 14, 18, 24, 36	x	x
		Italic	8, 10, 12, 14, 18, 24, 36	x	x
		Bold	8, 10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	8, 10, 12, 14, 18, 24, 36	x	x
	Nimrod	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Plantin	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Times New Roman	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Terminal	Normal	14, 18	x	x
		Double-Width	14, 18	x	x
		Double-Width, Double-Height	28, 36	x	x
		Narrow	14, 18	x	x
		Double-Width, Narrow	14, 18	x	x
		Double-Width, Double-Height, Narrow	28, 36	x	x
		Bold	14, 18	x	x
		Double-Width, Bold	14, 18	x	x
		Double-Width, Double-Height, Bold	28, 36	x	x
		Narrow, Bold	14, 18	x	x
		Double-Width, Narrow, Bold	14, 18	x	x
		Double-Width, Double-Height, Narrow, Bold	28, 36	x	x

Table 6-15: Bitmap Fonts for *.ISO8859-7 Locales

Language	Typeface	Style	Sizes	75dpi	100dpi
Greek	Arial	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Courier	Normal	8, 10, 12, 14, 18, 24, 36	x	x
		Italic	8, 10, 12, 14, 18, 24, 36	x	x
		Bold	8, 10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	8, 10, 12, 14, 18, 24, 36	x	x
	Times New Roman	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Terminal	Normal	14, 18	x	x
		Double-Width	14, 18	x	x
		Double-Width, Double-Height	28, 36	x	x
		Narrow	14, 18	x	x
		Double-Width, Narrow	14, 18	x	x
		Double-Width, Double-Height, Narrow	28, 36	x	x
		Bold	14, 18	x	x
		Double-Width, Bold	14, 18	x	x
		Double-Width, Double-Height, Bold	28, 36	x	x
		Narrow, Bold	14, 18	x	x
		Double-Width, Narrow, Bold	14, 18	x	x
		Double-Width, Double-Height, Narrow, Bold	28, 36	x	x

Table 6-16: Bitmap Fonts for *.ISO8859-8 Locales

Language	Typeface	Style	Sizes	75dpi	100dpi
Hebrew	David	Normal	8, 10, 12, 14, 18, 24	x	x
		Italic	8, 10, 12, 14, 18, 24	x	x
		Bold	8, 10, 12, 14, 18, 24	x	x
		Bold-Italic	8, 10, 12, 14, 18, 24	x	x
	Frankruhl	Normal	8, 10, 12, 14, 18, 24	x	x
		Italic	8, 10, 12, 14, 18, 24	x	x
		Bold	8, 10, 12, 14, 18, 24	x	x
		Bold-Italic	8, 10, 12, 14, 18, 24	x	x
	Gam	Normal	8, 10, 12, 14, 18, 24	x	x
		Italic	8, 10, 12, 14, 18, 24	x	x
		Bold	8, 10, 12, 14, 18, 24	x	x
		Bold-Italic	8, 10, 12, 14, 18, 24	x	x
	menu	Normal	10, 12	x	x
	Miriam	Normal	8, 10, 12, 14, 18, 24	x	x
		Italic	8, 10, 12, 14, 18, 24	x	x
		Bold	8, 10, 12, 14, 18, 24	x	x
		Bold-Italic	8, 10, 12, 14, 18, 24	x	x
	Miriam Fixed	Normal	8, 10, 12, 14, 18, 24	x	x
		Italic	8, 10, 12, 14, 18, 24	x	x
		Bold	8, 10, 12, 14, 18, 24	x	x
		Bold-Italic	8, 10, 12, 14, 18, 24	x	x
	Narkiss Tam	Normal	8, 10, 12, 14, 18, 24	x	x
		Italic	8, 10, 12, 14, 18, 24	x	x
		Bold	8, 10, 12, 14, 18, 24	x	x
		Bold-Italic	8, 10, 12, 14, 18, 24	x	x
	Terminal	Normal	14, 18	x	x
		Double-Width	14, 18	x	x
		Double-Width, Double-Height	28, 36	x	x
		Narrow	14, 18	x	x
		Double-Width, Narrow	14, 18	x	x
		Double-Width, Double-Height, Narrow	28, 36	x	x
		Bold	14, 18	x	x
		Double-Width, Bold	14, 18	x	x
		Double-Width, Double-Height, Bold	28, 36	x	x
		Narrow, Bold	14, 18	x	x
		Double-Width, Narrow, Bold	14, 18	x	x
		Double-Width, Double-Height, Narrow, Bold	28, 36	x	x

Table 6-17: Bitmap Fonts for *.ISO8859-9 Locales

Language	Typeface	Style	Sizes	75dpi	100dpi
Turkish	Arial	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Arial Narrow	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Book Antiqua	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Bookman Old Style	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Century Gothic	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Century Schoolbook	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Courier	Normal	8, 10, 12, 14, 18, 24, 36	x	x
		Italic	8, 10, 12, 14, 18, 24, 36	x	x
		Bold	8, 10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	8, 10, 12, 14, 18, 24, 36	x	x
	Monotype Corsiva	Normal	10, 12, 14, 18, 24, 36	x	x
	Times New Roman	Normal	10, 12, 14, 18, 24, 36	x	x
		Italic	10, 12, 14, 18, 24, 36	x	x
		Bold	10, 12, 14, 18, 24, 36	x	x
		Bold-Italic	10, 12, 14, 18, 24, 36	x	x
	Terminal	Normal	14, 18	x	x
		Double-Width	14, 18	x	x
		Double-Width, Double-Height	28, 36	x	x
		Narrow	14, 18	x	x
		Double-Width, Narrow	14, 18	x	x
		Double-Width, Double-Height, Narrow	28, 36	x	x
		Bold	14, 18	x	x
		Double-Width, Bold	14, 18	x	x
		Double-Width, Double-Height, Bold	28, 36	x	x
		Narrow, Bold	14, 18	x	x
		Double-Width, Narrow, Bold	14, 18	x	x
		Double-Width, Double-Height, Narrow, Bold	28, 36	x	x

6.15.4 Customizing a Terminal Emulation Window for Asian Languages

The following features and restrictions apply to terminal windows that you create when an Asian language is specified for the language setting:

Depending on the language setting, additional menu items, push buttons, toggle switches, and text entry fields may be available to you for customizing terminal window features.

When you start a terminal window from the CDE Personal Applications menu, terminal emulation always follows the language selected for your session. When you start a terminal window from another terminal window in which the LANG or LC_ALL variable is set to the locale for a different language, the new window uses that language. Setting the locale in the parent window changes the language only of child windows started from it, not of the parent window itself.

For a language supported by an input method server, make sure that the input method server is connected to the terminal window where you enter characters in that language. Otherwise, you cannot use the input method for character entry. The connection between a terminal window and an input method server does not exist if:
- The terminal window was started before the input method server started
  At the start of a CDE session, the input method server starts automatically when the session language is selected. For example, if Chinese is your session language, the input method server for Chinese is automatically attached to terminal windows by default. However, if Chinese is your session language and you want to create a window to work in Korean, you must manually start the input method server for Korean (in addition to setting a Korean locale) before invoking the new terminal window.
- The input method server was terminated
  If the connection between a terminal window and the input method server is broken, restart the input method server and then create another terminal window. You can use the input method in the new window.
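
The parent and child language behavior described in this section follows from ordinary environment inheritance, which the following sketch illustrates. It is not a definitive procedure: the locale name ko_KR.eucKR is an assumption (use locale -a to list the locales installed on your system), and a child shell that reports its LANG value stands in for the terminal emulator that you would start in a real session. As a variation on setting the variable in the parent window, the sketch sets LANG for the child command only.

```shell
#!/bin/sh
# Sketch only: the locale name and the stand-in child command are assumptions.

# Language setting of the current (parent) terminal window.
LANG=C
export LANG

# Assumed Korean locale name; check `locale -a` for the names on your system.
CHILD_LOCALE=ko_KR.eucKR

# Prefixing a command with LANG=... sets the variable for that command only.
# In a real session the command would be a terminal emulator; here a child
# shell simply reports the LANG value that it inherited.
child_lang=$(LANG=$CHILD_LOCALE sh -c 'printf "%s" "$LANG"')

echo "parent LANG: $LANG"
echo "child LANG:  $child_lang"
```

The child reports the Korean locale while the LANG value of the parent shell is unchanged. Setting the variable this way does not start an input method server. For a language other than the session language, the input method server must still be started before the new terminal window is created.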