tr(1)
NAME
tr - Translates characters
SYNOPSIS
tr [-Acs] string1 string2
tr -ds [-Ac] string1 string2
tr -d [-Ac] string1
tr -s [-Ac] string1
The tr command copies characters from the standard input to the standard
output with substitution or deletion of selected characters.
STANDARDS
Interfaces documented on this reference page conform to industry standards
as follows:
tr: XCU5.0
Refer to the standards(5) reference page for more information about
industry standards and associated tags.
OPTIONS
-A [Tru64 UNIX] Translates on a byte-by-byte basis. When you specify
this option, tr does not support extended characters.
-c Complements (inverts) the set of characters in string1. The complement
is the set of all characters in the current character set, as defined by
the current setting of LC_CTYPE, except for those actually specified in
the string1 argument. These characters are placed in the array in
ascending collation sequence, as defined by the current setting of
LC_COLLATE.
-d Deletes all occurrences of input characters or collating elements found
in the array specified in string1.
-s Replaces each input sequence of two or more repetitions of a character
specified in string1 with a single instance of the corresponding
character in string2 (or of the character itself when string2 is
omitted).
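The effect of the -d, -s, and -c options can be seen in the following
minimal sketch. The sample input strings and variable names are
illustrative only, and the POSIX locale is forced so that the results do
not depend on the caller's environment.

```shell
# Force the POSIX locale so the results are reproducible.
LC_ALL=C
export LC_ALL

# -d: delete every character listed in string1.
deleted=$(printf 'a1b2c3\n' | tr -d '123')

# -s: squeeze each run of a character listed in string1 to one instance.
squeezed=$(printf 'aaabbbccc\n' | tr -s 'ab')

# -c with -d: delete everything except the characters in string1
# (the trailing newline is deleted as well).
kept=$(printf 'a1b2c3\n' | tr -cd '123')

printf '%s\n%s\n%s\n' "$deleted" "$squeezed" "$kept"
```

Under these assumptions, the three lines printed are abc, abccc, and 123.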
OPERANDS
string1
string2
Translation control strings as explained in the DESCRIPTION section.
DESCRIPTION
Input characters from string1 are replaced with the corresponding
characters in string2. The tr command cannot handle an ASCII NUL (\000) in
string1 or string2; it always deletes NUL from the input.
[Tru64 UNIX] The trbsd command is a BSD compatible version of tr.
The following constructs can be used to specify characters or single-
character collating elements. If any of these constructs result in
multicharacter collating elements, tr excludes those elements from the
resulting array without issuing a diagnostic.
A character
Represents itself when not described by one of the other conventions in
this list.
\octal-sequence. . .
Represents a character by using its octal value. An octal sequence
consists of a backslash followed by the longest sequence of one-, two-,
or three-octal-digit characters (01234567). The sequence causes the
character whose encoding is represented by the one-, two-, or three-
digit octal value to be placed in the string.
\\, \a, \b, \f, \n, \r, \t, \v
Represent standard backslash-escape sequences. No results are defined
by the Single UNIX Specification for specifying characters after a
backslash other than the ones listed here. In portable applications, a
backslash should be followed only by an octal sequence, another
backslash, or the lowercase letter a, b, f, n, r, t, or v.
[Tru64 UNIX] On UNIX systems, you can enclose string operands in
quotation marks or specify a backslash before some characters, such as
* (an asterisk), to remove the special meaning of those characters to
the shell.
c1-c2
Represents a range of collating elements between the specified range
endpoints, inclusive, as defined by the current locale setting of the
LC_COLLATE category. The starting element, c1, must precede the ending
element, c2, in the current collation order. The characters or
collating elements in the range are placed in the associated string in
ascending collation sequence. Note that the collation sequence for
ASCII characters, such as letters in the English alphabet, may vary
among locales. In the POSIX locale, for example, a-z produces a string
with all English lowercase letters in English alphabetical order.
However, when LC_COLLATE is set to a different locale, English
lowercase letters may be subject to a different collation order.
Therefore, a-z may produce a different result for locales other than
the POSIX locale.
[c*number]
Stands for number repetitions of the character c. The number is
considered to be in decimal unless the first digit of number is 0; then
it is considered to be in octal. This construct is valid only in string2.
[=equiv=]
Represents all characters or collating elements belonging to the
equivalence class specified by equiv, as defined by the LC_COLLATE
locale category. An equivalence class expression can be used in string1,
or in string2 only when the -d and -s options are specified together.
(For more information, see the locale(4) reference page.)
[:class:]
Represents all characters belonging to the defined character class, as
defined by the current setting of the LC_CTYPE locale category. The
following character class names are accepted when specified in string1:
alnum cntrl lower space
alpha digit print upper
blank graph punct xdigit
If the current locale defines additional keywords (by including
additional charclass definitions in the LC_CTYPE category), the tr
command also recognizes those keywords as class values.
When the -d and -s options are specified together, any of the character
class names are accepted in string2; otherwise, only the character class
names lower or upper are accepted in string2, and then only if the
corresponding class (upper or lower, respectively) is specified in the
same relative position in string1. Such a specification is interpreted
as a request for case conversion.
When [:lower:] appears in string1 and [:upper:] appears in string2, the
arrays contain the characters from the toupper mapping in the LC_CTYPE
category of the current locale. When [:upper:] appears in string1 and
[:lower:] appears in string2, the arrays contain the characters from
the tolower mapping in the LC_CTYPE category of the current locale.
The first character from each mapping pair is in the array for string1
and the second character from each mapping pair is in the array for
string2 in the same relative position.
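The constructs described above can be tried one at a time, as in the
following sketch. It assumes the POSIX locale and an ASCII code set (so
that \072 is the colon character); the input strings and variable names
are illustrative only.

```shell
LC_ALL=C
export LC_ALL

# Octal sequence and backslash escape: turn each ':' (octal 072 in ASCII)
# into a newline.
split=$(printf 'usr/bin:bin:sbin\n' | tr '\072' '\n')

# Range: a-c and A-C each expand to three characters in the POSIX locale.
ranged=$(printf 'abc-xyz\n' | tr 'a-c' 'A-C')

# Repetition: [x*3] in string2 stands for xxx.
repeated=$(printf 'abc\n' | tr 'abc' '[x*3]')

# Character classes: lower in string1 paired with upper in string2
# requests case conversion.
upper=$(printf 'hello, world\n' | tr '[:lower:]' '[:upper:]')

printf '%s\n%s\n%s\n%s\n' "$split" "$ranged" "$repeated" "$upper"
```

The string operands are enclosed in single quotation marks so that the
shell passes the backslashes and brackets to tr unchanged.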
[Tru64 UNIX] When string2 is shorter than string1, a difference results
between historical System V and BSD systems. A BSD system pads string2
with the last character found in string2. Thus, it is possible to do the
following:
tr 0123456789 d
[Tru64 UNIX] The preceding command translates all digits to the letter d.
A portable application cannot rely on the BSD behavior; it would have to
code the example in the following way:
tr 0123456789 '[d*]'
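As a minimal sketch of the portable form, the following applies it to a
made-up input line. The input text and the variable name are assumptions
for illustration.

```shell
LC_ALL=C
export LC_ALL

# [d*] repeats d as often as needed to match the length of string1,
# so every digit maps to d without relying on BSD-style padding.
masked=$(printf 'call 555-0100\n' | tr '0123456789' '[d*]')

printf '%s\n' "$masked"
```

With this input, the line printed is: call ddd-dddd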
[Tru64 UNIX] If a given character appears more than once in string1, the
character in string2 corresponding to its last appearance in string1 will
be used in the translation.
If the -c and -d options are both specified, all characters except those
specified by string1 are deleted. The contents of string2 are ignored,
unless -s is also specified. Note, however, that the same string cannot be
used for both the -d and the -s options; when both options are specified,
both string1 (used for deletion) and string2 (used for squeezing) are
required.
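The two option combinations described above might be used as in the
following sketch. The input strings and variable names are illustrative
only, and the POSIX locale is assumed.

```shell
LC_ALL=C
export LC_ALL

# -c and -d together: delete every character that is NOT a digit,
# which keeps only the digits of the input.
digits=$(printf 'tel: +1 (555) 010-0199\n' | tr -cd '[:digit:]')

# -d and -s together: string1 (',;') names the characters to delete,
# string2 (' ') names the characters whose runs are squeezed afterward.
cleaned=$(printf 'a;;b,,  c\n' | tr -ds ',;' ' ')

printf '%s\n%s\n' "$digits" "$cleaned"
```

In the second command, deletion happens first, so the two spaces that
remain are then squeezed to one.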
If the -d option is not specified, each input character or collating
element found in the array specified by string1 is replaced by the
character or collating element in the same relative position in the array
specified by string2.
When the -s option is specified and string2 contains a character class,
the argument's array contains all of the characters in that character
class. For example:
tr -s '[:space:]'
In a case conversion, however, the string2 array contains only those
characters defined as the second characters in each of the toupper or
tolower character pairs, as appropriate. For example:
tr -s '[:upper:]' '[:lower:]'
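The following sketch runs both of the preceding commands on made-up
input so that the squeezing can be observed. The input strings and
variable names are assumptions, and the POSIX locale is forced.

```shell
LC_ALL=C
export LC_ALL

# Single operand: each run of the same white-space character is reduced
# to one instance (three spaces to one space, two tabs to one tab).
spaces=$(printf 'a   b\t\tc\n' | tr -s '[:space:]')

# Case conversion with -s: uppercase letters are first translated to
# lowercase, then runs of the same lowercase letter are squeezed.
folded=$(printf 'AAbbCC dd\n' | tr -s '[:upper:]' '[:lower:]')

printf '%s\n%s\n' "$spaces" "$folded"
```

Note that -s squeezes only runs of the same character; a space followed
by a tab is left as two characters.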
System V Compatibility
[Tru64 UNIX] The root of the directory tree that contains the commands
modified for SVID 2 compliance is specified in the file /etc/svid2_path.
You can use /etc/svid2_profile as the basis for, or to include in, your
.profile. The file /etc/svid2_profile reads /etc/svid2_path and sets the
first entries in the PATH environment variable so that the modified SVID 2
commands are found first.
[Tru64 UNIX] In the SVID 2 compliant version of the tr command, only
characters in the octal range of 1 to 377 are complemented when you specify
the -c option. This behavior results because specifying the -c option
implicitly turns on the -A option.
NOTES
1. [Tru64 UNIX] Specifying the -A option improves ASCII performance.
2. Despite similarities in appearance, the string arguments used by tr
are not regular expressions.
3. The tr command correctly processes NUL characters in its input
stream. NUL characters can be stripped using the following command:
tr -d '\000'
4. If string1 or string2 is the empty string, results are undefined and
unpredictable.
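The command in note 3 can be checked by counting bytes before and after
the deletion, as in this sketch. The three-byte sample input and the
variable names are illustrative; wc is used only to count bytes.

```shell
LC_ALL=C
export LC_ALL

# The sample input is three bytes: 'a', NUL, 'b'.
# The trailing tr -d ' ' removes any padding that wc adds to its count.
before=$(printf 'a\000b' | wc -c | tr -d ' ')

# After deleting NUL, two bytes should remain.
after=$(printf 'a\000b' | tr -d '\000' | wc -c | tr -d ' ')

printf 'bytes before: %s, after: %s\n' "$before" "$after"
```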
EXIT STATUS
The following exit values are returned:
0 Successful completion.
>0 An error occurred.
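A script might test these exit values as in the following sketch. It
assumes that invoking tr with no operands is reported as an error; the
variable names are illustrative.

```shell
LC_ALL=C
export LC_ALL

# A valid invocation: the status of the pipeline is that of tr,
# its last command.
printf 'hello\n' | tr 'a-z' 'A-Z' >/dev/null
ok_status=$?

# An invalid invocation (no operands), assumed here to be a usage error.
# The || form records the status without aborting a script run with -e.
bad_status=0
tr </dev/null >/dev/null 2>&1 || bad_status=$?

printf 'valid: %s, invalid: %s\n' "$ok_status" "$bad_status"
```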
EXAMPLES
1. To translate braces into parentheses, enter:
tr '{}' '()' <textfile >newfile
This translates each { (left brace) to ( (left parenthesis) and each }
(right brace) to ) (right parenthesis). All other characters remain
unchanged.
2. In the POSIX locale, to translate lowercase ASCII characters to
uppercase, you can enter:
tr 'a-z' 'A-Z' <textfile >newfile
This command assumes that English letters are collated in English
alphabetical order, which may not be true for locales other than the
POSIX locale. The following command is recommended for case conversion
for all locales:
tr '[:lower:]' '[:upper:]' <textfile >newfile
3. The two strings can be of different lengths:
tr '0-9' '#' <textfile >newfile
This translates each 0 into a # (number sign) but leaves the digits 1
to 9 unchanged; if the two strings are not the same length, the extra
characters in the longer one are ignored.
4. To translate each digit to a # (number sign), enter:
tr '0-9' '[#*]' <textfile >newfile
The * (asterisk) tells tr to repeat the # (number sign) enough times
to make the second string as long as the first one.
5. To translate each string of digits to a single # (number sign), enter:
tr -s '0-9' '[#*]' <textfile >newfile
6. In the POSIX locale, to translate all ASCII characters that are not
specified in string1, enter:
tr -c '[ -~]' '[A-_]' <textfile >newfile
This translates each nonprinting ASCII character to the next following
corresponding control key letter (\001 translates to B, \002 to C, and
so on). ASCII DEL (\177), the character that follows ~ (tilde),
translates to a ] (right bracket). This command assumes that ASCII
characters are collated in a certain order, which may not be true for
locales other than the POSIX locale.
7. To create a list of all words in file1 one per line in file2, where a
word is taken to be a maximal string of letters, enter:
tr -cs '[:alpha:]' '[\n*]' < file1 > file2
8. To use an equivalence class to replace accented variants of the base
character e in file1 with an unaccented e and write the result to
file2, enter:
tr '[=e=]' '[e*]' < file1 > file2
Equivalence classes are locale dependent. Some locales may not include
equivalence classes to associate base letters and their accented
variants.
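Example 7 can be tried without creating files by piping sample text
into tr, as in the following sketch. The sample sentence and the
variable name are assumptions for illustration, and the POSIX locale is
forced.

```shell
LC_ALL=C
export LC_ALL

# -c complements [:alpha:], so every non-letter maps to a newline;
# -s then squeezes each run of newlines to one, leaving one word per line.
words=$(printf 'The quick, brown fox.\nThe end!\n' |
    tr -cs '[:alpha:]' '[\n*]')

printf '%s\n' "$words"
```

With this input, six lines are printed: The, quick, brown, fox, The, and
end.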
ENVIRONMENT VARIABLES
The following environment variables affect the execution of tr:
LANG
Provides a default value for the internationalization variables that
are unset or null. If LANG is unset or null, the corresponding value
from the default locale is used. If any of the internationalization
variables contain an invalid setting, the utility behaves as if none of
the variables had been defined.
LC_ALL
If set to a non-empty string value, overrides the values of all the
other internationalization variables.
LC_COLLATE
Determines the locale for the behavior of range expressions and
equivalence classes.
LC_CTYPE
Determines the locale for the interpretation of sequences of bytes of
text data as characters (for example, single-byte as opposed to
multibyte characters in arguments) and the behavior of character
classes.
LC_MESSAGES
Determines the locale for the format and contents of diagnostic
messages written to standard error.
NLSPATH
Determines the location of message catalogues for the processing of
LC_MESSAGES.
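Because LC_ALL overrides the other internationalization variables, a
script can pin the behavior of a range expression to the POSIX locale
for a single invocation, as in this sketch. The input string and the
variable name are illustrative.

```shell
# LC_ALL=C applies only to this one tr invocation; the rest of the
# script keeps whatever locale the caller has set.
upper=$(printf 'abc xyz\n' | LC_ALL=C tr 'a-z' 'A-Z')

printf '%s\n' "$upper"
```

Setting the variable on the command line in this way avoids changing
the locale for other commands in the same script.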
SEE ALSO
Commands: ed(1), ksh(1), sed(1), Bourne shell sh(1b), POSIX shell sh(1p),
trbsd(1)
Files: ascii(5)
Standards: standards(5)