This chapter provides an introduction to files, file systems, and text editors. A file is a collection of data stored together in the computer. Typical files contain memos, reports, correspondence, programs, or other data. A file system is the useful arrangement of files into directories.
A text editor is a program that lets you create new files and modify existing ones.
After completing this chapter, you will be able to:
Create files with the vi text editor. These files will be useful for working through the examples later in this book.
Understand the file system components and concepts. This knowledge can help you design a file system that is appropriate for the type of information you use and the way you work.
2.1 Overview of Text Editors
An editor is a program that lets you create and change files containing text, programs, or other data. An editor does not provide the formatting and printing features of a word processor or publishing software.
With a text editor, you can:
Create, read, and write files
Display and search for data
Add, replace, and remove data
Move and copy data
Run operating system commands
Your editing takes place in an edit buffer that you can save or discard.
The vi and ed text editing programs are available on the operating system. Each editor has its own methods of displaying text as well as its own set of subcommands and rules. For information about vi, read Section 2.2 and Appendix A. For information about ed, see Appendix B. Your system may contain additional editors; see your system administrator for details.
2.2 Creating Sample Files with the vi Text Editor
This section shows how to create three files with the vi text editor. The goal of this section is to have you create, using a minimal set of commands, files that can be used for working through the examples later in this book. For more information about vi, see Appendix A and the vi(1) reference page.
Note
If you are familiar with a different editing program, you can use that program to create the three sample files described in this section. If you already have created three files with an editing program, you can use those files by substituting their names for the file names used in the examples.
When following the steps that are used to create the sample files, enter only the text that is shown in boldface characters. System prompts and output are shown in a different typeface, like this.
To create three sample files, follow these steps:
Start the vi program by typing vi and the name of a new file at the shell prompt. Press the Return key.
$ vi file1 [Return]
This is a new file, so the system responds by putting your cursor at the top of a screen:
~
~
~
~
~
~
"file1" [New file]
Notice the lines on your screen that begin with a tilde ( ~ ). The tildes mark lines that contain no text. Because you have not entered any text, all lines begin with a tilde.
Type the lowercase letter i to specify that you want to insert text into the new file. The system does not display the i that you enter.
Enter the following sample text, pressing the Return key after each line. To correct mistakes before moving to the next line, press the Delete key or the Backspace key to move backward over the mistake. Retype the text correctly.
You start the vi program by entering [Return]
the vi command optionally followed by the name [Return]
of a new or existing file. [Escape]
~
~
~
~
~
~
"file1" [New file]
Press the Escape key to indicate that you have finished your current work. Type a colon (:) to enter the Last Line mode.
Note
Depending upon how your terminal or workstation is set up, the Escape key may be programmed to perform a different function. It is possible that one of the function keys on your keyboard may have been set up to perform the escape function. This function is often assigned to the [F11] key. See your system administrator if your Escape key does not operate properly.
The colon is displayed as a prompt at the bottom of the screen as follows:
You start the vi program by entering
the vi command optionally followed by the name
of a new or existing file.
~
~
~
~
~
~
:
Enter a lowercase letter w next. Entering the letter w indicates to the system that you want to write, or save, a copy of the new file in your current directory (see Chapter 4 for an explanation of your current directory).
Your screen will look like this:
You start the vi program by entering
the vi command optionally followed by the name
of a new or existing file.
~
~
~
~
~
~
"file1" [New file] 3 lines, 111 characters
The system displays the name of the new file as well as the number of lines and characters it contains.
The system is still in the vi text editor, so you can create two more sample files. The process is the same as the one you used to create file1, but the text you enter will be different.
Type a colon (:). The colon is displayed as a prompt at the bottom of the screen. To create your second sample file, enter vi file2. The system responds with a screen that looks like this:
~
~
~
~
~
~
~
"file2" No such file or directory
The message file2 No such file or directory indicates that file2 is a new file.
Indicate that you want to insert text into the new file by typing the lowercase letter i.
Enter the following sample text:
If you have created a new file, you will find[Return]
that it is easy to add text.[Escape]
Type a colon (:) and enter the lowercase letter w to write, or save, the file in your current directory.
Your screen will look like this:
If you have created a new file, you will find
that it is easy to add text.
~
~
~
~
~
~
~
"file2" [New file] 2 lines, 75 characters
Follow the instructions in step 5 to create the third file. However, name the file file3, and enter the following sample text:
You will find that vi is a useful[Return]
editor that has many features.[Escape]
Type a colon (:) and enter the wq command. The wq command writes the file, quits (that is, exits) the editor, and returns you to the shell prompt.
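As the earlier Note says, you can create the sample files with a program other than vi. The following is a minimal sketch of one non-interactive way to do so, using the standard printf and wc commands. The scratch directory and the variable names are assumptions made for illustration, and mktemp -d, although widely available, may not exist on every system.

```shell
#!/bin/sh
# Sketch: create the three sample files without an interactive editor.
# A scratch directory (an assumption) avoids overwriting existing files.
sample_dir=$(mktemp -d) || exit 1
cd "$sample_dir" || exit 1

# printf writes each argument on its own line, like pressing Return in vi.
printf '%s\n' \
    'You start the vi program by entering' \
    'the vi command optionally followed by the name' \
    'of a new or existing file.' > file1

printf '%s\n' \
    'If you have created a new file, you will find' \
    'that it is easy to add text.' > file2

printf '%s\n' \
    'You will find that vi is a useful' \
    'editor that has many features.' > file3

# wc -l counts lines; wc -c counts bytes, which equals the number of
# characters for this plain ASCII text. tr removes any padding spaces.
file1_lines=$(wc -l < file1 | tr -d ' ')
file1_chars=$(wc -c < file1 | tr -d ' ')
file2_lines=$(wc -l < file2 | tr -d ' ')
file2_chars=$(wc -c < file2 | tr -d ' ')

echo "file1: $file1_lines lines, $file1_chars characters"
echo "file2: $file2_lines lines, $file2_chars characters"
```

The counts printed for file1 and file2 should agree with the line and character counts that vi reports when it writes those files.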
2.3 Understanding Files, Directories, and Pathnames
A file is a collection of data stored in a computer. A file stored in a computer is like a document stored in a filing cabinet because you can retrieve it, open it, process it, close it, and store it as a unit. Every computer file has a file name that both users and the system use to refer to the file.
A file system is the arrangement of files into a useful pattern. Any time you organize information, you create something like a computer file system. For example, the structure of a manual file system (file cabinets, file drawers, file folders, and documents) resembles the structure of a computer file system. (The software that manages the file storage is also known as the file system, but that usage of the term does not occur in this chapter. On some systems, this software is also called the file manager.)
Once you have organized your file system (manual or computer), you can find a particular piece of information quickly because you understand the structure of the system. To understand the file system, you should first become familiar with the following three concepts:
Files and file names
Directories and subdirectories
Tree structures and pathnames
2.3.1 Files and File Names
A file can contain the text of a document, a computer program, records for a general ledger, the numerical or statistical output of a computer program, or other data.
Do not use the following characters in file names. The slash separates the elements of a pathname, and the other characters have special meaning to the shell:
Slash ( / )
Backslash ( \ )
Ampersand ( & )
Left- and right-angle brackets (< and >)
Question mark ( ? )
Dollar sign ( $ )
Left bracket ( [ )
Asterisk ( * )
Tilde ( ~ )
Vertical bar or pipe symbol ( | )
You may use a period or dot ( . ) in the middle of a file name, but never at the beginning of the file name unless you want the file to be hidden when doing a simple listing of files. For information about characters with special meanings to your shell, refer to Section 8.2.2 and Section 8.3.2. For information about listing hidden files, see Section 3.1.3.
Note
Unlike some operating systems, this operating system distinguishes between uppercase and lowercase letters in file names (that is, it is case sensitive). For example, the following three file names represent three distinct files:
filea, Filea, and FILEA.
Use file names that reflect the actual contents of your files. For example, a file name such as memo.advt might indicate that the file contains a memo about advertising. On the other hand, file names such as filea, fileb, or filec tell you nothing about the contents of those files.
It is also a good idea to use a consistent pattern to name related files. For example, suppose you have an advertising report that is divided into chapters, with each chapter contained in a separate file. You might name these files in the following way:
chap1.advt
chap2.advt
chap3.advt
Note
Many programs that you invoke use the portion of the file name following the dot ( . ), called the extension, as an indicator of the file's purpose.
The maximum length of a file name depends upon the file system used on your operating system. For example, your file system may allow a maximum file name length of 255 characters (the default), or it may allow a maximum file name length of only 14 characters. Because the maximum length limits how meaningful your file names can be, see your system administrator for details.
2.3.2 Directories and Subdirectories
You can organize your files into groups and subgroups that resemble the cabinets, drawers, and folders in a manual file system. These groups are called directories, and the subgroups are called subdirectories. A well-organized system of directories and subdirectories lets you retrieve and manipulate the data in your files quickly.
Directories differ from files in two significant ways:
Directories are organizational tools; files are storage places for data.
Directories contain the names of files, other directories, or both.
When you first log in, the system automatically places you in your login directory. This directory is also called your home directory. The system also sets your HOME environment variable to the full pathname of this directory. This directory was created for you when your computer account was established. However, a file system in which all files are arranged under your login directory is not necessarily the most efficient method to organize your files.
As you work with the system, you may want to set up additional directories and subdirectories so you can organize your files into useful groups. For example, assume that you work for the Sales department and are responsible for four lines of automobiles. You may want to create a subdirectory under your login directory for each automobile line. Each subdirectory can contain all memos, reports, and sales figures applicable for the automobile model.
Once your files are arranged into a directory structure that you find useful, you can move easily between directories. See Chapter 4 for information about creating directories and moving between them.
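The automobile example above might be laid out as in the following sketch. It uses the standard mkdir command; the directory names (sedan, coupe, wagon, truck) and the file names are hypothetical, and a scratch directory stands in for your login directory so that nothing in your real login directory is changed.

```shell
#!/bin/sh
# Hypothetical layout: one subdirectory per automobile line.
# base_dir stands in for your login directory (an assumption).
base_dir=$(mktemp -d) || exit 1
cd "$base_dir" || exit 1

# Create one subdirectory for each of the four automobile lines.
mkdir sedan coupe wagon truck

# Keep the memos and sales figures for one line together.
: > sedan/memo.advt
: > sedan/sales.q1

# Count and list the subdirectories that were created.
dir_count=$(ls | wc -l | tr -d ' ')
echo "Created $dir_count subdirectories:"
ls
```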
2.3.3 Displaying the Name of Your Current (Working) Directory (pwd)
The directory in which you are working at any given time is your current, or working, directory.
Whenever you are uncertain about the directory in which you are working or where that directory exists in the file system, enter the pwd (print working directory) command as follows:
$ pwd
The system displays the name of your current directory in the format:
/usr/msg
This information indicates that you are currently working in a directory named msg that is located under the usr directory. The /usr/msg notation is known as the pathname of your working directory. See Section 2.3.4 for information about pathnames.
See the pwd(1) reference page for further information on the pwd command.
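The following sketch shows pwd reporting the directory that a script is working in. Because /usr/msg may not exist on your system, the example creates a directory named msg in a scratch location instead; that location and the variable names are assumptions.

```shell
#!/bin/sh
# Create a scratch directory containing a subdirectory named msg.
work_dir=$(mktemp -d) || exit 1
mkdir "$work_dir/msg" || exit 1

# Make msg the current (working) directory, then ask where we are.
cd "$work_dir/msg" || exit 1
current_dir=$(pwd)

# The last element of the pathname is the directory you are working in.
echo "Working directory: $current_dir"
echo "Login directory (from HOME): $HOME"
```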
2.3.4 The Tree-Structure File System and Pathnames
The files and directories in the file system are arranged hierarchically in a structure that resembles an upside-down tree with the roots at the top and the branches at the bottom. This arrangement is called a tree structure. You can find more detailed information about the directory structure in the hier(5) reference page.
Figure 2-1 shows a typical file system arranged in a tree structure. The names of directories are printed in bold, and the names of files are printed in italics.
Figure 2-1: A Typical File System
At the top of the file system shown in Figure 2-1 (that is, at the root of the inverted tree structure) is a directory called the root directory. The symbol that represents this first major division of the file system is a slash ( / ).
At the next level down from the root of the file system are eight directories, each with its own system of subdirectories and files. Figure 2-1, however, shows only the subdirectories under the directory named user. These are the login directories for the users of this system.
The third level down the tree structure contains the login directories for two of the system's users, smith and chang. It is in these directories that smith and chang begin their work after logging in.
The fourth level of the figure shows three directories under the chang login directory: plans, report, and payroll.
The fifth level of the tree structure contains both files and subdirectories. The plans directory contains four files, one for each quarter. The report directory contains three files comprising the three parts of a report. Also on the fifth level are two subdirectories, regular and contract, which further organize the information in the payroll directory.
A higher-level directory is frequently called a parent directory. For example, in Figure 2-1, the directories plans, report, and payroll all have chang as their parent directory.
A pathname specifies the location of a directory or a file within the file system. For example, when you want to change from working on File A in Directory X to File B in Directory Y, you enter the pathname to File B. The operating system then uses this pathname to search through the file system until it locates File B.
A pathname consists of a sequence of directory names separated by slashes ( / ) that ends with a directory name or a file name. The first element in a pathname specifies where the system is to begin searching, and the final element specifies the target of the search. The following pathname is based on Figure 2-1:
/user/chang/report/part3
The first slash ( / ) represents the root directory and indicates the starting place for the search. The remainder of the pathname indicates that the search is to go to the user directory, then to the chang directory, next to the report directory, and finally to the part3 file.
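To see the two roles that the elements of a pathname play, you can split the example pathname with the standard dirname and basename commands. This is only an illustrative sketch: both commands operate on the text of the pathname, so the file does not need to exist, and the variable names are assumptions.

```shell
#!/bin/sh
pathname=/user/chang/report/part3

# dirname keeps everything up to the final element: the directories
# that the search passes through.
search_path=$(dirname "$pathname")

# basename keeps only the final element: the target of the search.
target=$(basename "$pathname")

echo "$search_path"   # prints /user/chang/report
echo "$target"        # prints part3
```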
Whether you are changing your current directory, sending data to a file, or copying or moving a file from one place in your file system to another, you use pathnames to indicate the objects you want to manipulate.
A pathname that starts with a slash ( / ) (the symbol representing the root directory) is called a full pathname or an absolute pathname. You can also think of a full pathname as the complete name of a file or a directory. Regardless of where you are working in the file system, you can always find a file or a directory by specifying its full pathname.
The file system also lets you use relative pathnames. Relative pathnames do not begin with the / that represents the root directory because they are relative to the current directory.
You can specify a relative pathname in one of several ways:
As the name of a file in the current directory.
As a pathname that begins with the name of a directory one level below your current directory.
As a pathname that begins with .. (dot dot, the relative pathname for the parent directory).
As a pathname that begins with . (dot, which refers to the current directory). This relative pathname notation is useful when you want to run your own version of an operating system command in the current directory (for example, ./ls).
Every directory contains at least two entries: .. (dot dot) and . (dot).
In Figure 2-2, for example, if your current directory is chang, the relative pathname for the file 1Q in the contract directory is payroll/contract/1Q. By comparing this relative pathname with the full pathname for the same file, /user/chang/payroll/contract/1Q, you can see that using relative pathnames means less typing and more convenience.
Figure 2-2: Relative and Full Pathnames
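The following sketch recreates part of the tree from the figure under a scratch directory and reads the same file through a relative pathname, a pathname that begins with dot, a pathname that begins with dot dot, and a full pathname. The scratch directory (which plays the role of the root directory), the contents of the file, and the variable names are assumptions made for illustration.

```shell
#!/bin/sh
# tree_root plays the role of / in the figure (an assumption).
tree_root=$(mktemp -d) || exit 1
mkdir -p "$tree_root/user/chang/payroll/contract" || exit 1
echo 'first quarter' > "$tree_root/user/chang/payroll/contract/1Q"

# Make chang the current directory, as in the example.
cd "$tree_root/user/chang" || exit 1

# Relative pathname: starts with a directory below the current one.
via_relative=$(cat payroll/contract/1Q)

# The same pathname, written with a leading dot for the current directory.
via_dot=$(cat ./payroll/contract/1Q)

# Full pathname: works regardless of the current directory.
via_full=$(cat "$tree_root/user/chang/payroll/contract/1Q")

# From inside contract, dot dot climbs to payroll and back down again.
cd payroll/contract || exit 1
via_dotdot=$(cat ../contract/1Q)

echo "$via_relative"
```

All four variables hold the same text because each pathname names the same file.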
In the C shell and the Korn or POSIX shell, you may also use a tilde ( ~ ) at the beginning of relative pathnames. The tilde character used alone specifies a user's login (home) directory. The tilde character followed by a user name specifies the login (home) directory of another user on the same system.
For example, to specify your own login directory, use the tilde alone. To specify the login directory of user chang, specify ~chang.
For more information on using relative pathnames, see Chapter 4.
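As a brief sketch of the tilde notation, the following lines let the shell expand a tilde into the login directory. The example assumes a shell that supports tilde expansion (such as the Korn or POSIX shell), and the subdirectory name plans is hypothetical; the directory does not need to exist for the expansion to take place.

```shell
#!/bin/sh
# The tilde alone expands to your own login (home) directory.
login_dir=~

# A tilde followed by a slash names a path under the login directory.
# "plans" is a hypothetical subdirectory name.
plans_dir=~/plans

echo "$login_dir"
echo "$plans_dir"
```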
Note
If there are other users on your system, you may or may not be able to get to their files and directories, depending upon the permissions set for them. For more information about file and directory permissions, see Chapter 5. In addition, your system may contain enhanced security features that may affect access to files and directories. If so, see your system administrator for details.
2.4 Specifying Files with Pattern Matching
Commands often take file names as arguments. To use several different file names as arguments to a command, you can type out the full name of each file, as the following example shows:
$ ls file1 file2 file3
However, if the file names have a common pattern (in this example, the file prefix), the shell can match that pattern, generate a list of those names, and automatically pass them to the command as arguments.
The asterisk (*), sometimes referred to as a wildcard, matches any string of characters.
In the following example, the ls command lists the name of every file in the current directory that begins with the file prefix:
$ ls file*
The file* pattern matches any file name that begins with file and ends with any other character string. The shell passes every file name that matches this pattern as an argument to the ls command.
Thus, you do not have to enter (or even remember) the full name of each file in order to use it as an argument. Both commands (ls with all file names typed out and ls file*) do the same thing: they pass the names of all files in the directory that have the file prefix as arguments to the ls command.
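The following sketch compares the two forms of the command in a scratch directory that contains the three sample file names plus one unrelated file. The scratch directory, the empty placeholder files, and the variable names are assumptions made for illustration.

```shell
#!/bin/sh
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1

# Create empty placeholder files; memo.advt lacks the file prefix.
: > file1
: > file2
: > file3
: > memo.advt

# Every name typed out in full.
typed_out=$(ls file1 file2 file3)

# The shell expands file* to the matching names before ls runs.
by_pattern=$(ls file*)

# echo shows the argument list that the shell generates from the pattern.
expanded=$(echo file*)

echo "$by_pattern"
```

Both ls commands list the same three names, and memo.advt is not listed because it does not match the pattern.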
There is one exception to the general rules for pattern matching. When the first character of a file name is a period, you must match the period explicitly. For example, ls * displays the names of all files in the current directory that do not begin with a period. The command ls -a displays all file names, those that begin with a period and all others.
This restriction prevents the shell from automatically matching the relative directory names. These are . (dot, standing for the current directory) and .. (dot dot, standing for the parent directory). For more information on relative directory names, see Chapter 4.
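The exception for names that begin with a period can be observed with the following sketch. The file names .profile_demo and visible are hypothetical, and the scratch directory is an assumption.

```shell
#!/bin/sh
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1

# One hidden file (leading period) and one ordinary file.
: > .profile_demo
: > visible

# The asterisk alone does not match the leading period.
star_matches=$(echo *)      # visible

# Matching the period explicitly finds the hidden file.
dot_matches=$(echo .p*)     # .profile_demo

# ls -a lists every name, including . and .. and hidden files.
all_names=$(ls -a)

echo "$star_matches"
echo "$dot_matches"
echo "$all_names"
```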
If a pattern does not match any file names, the shell displays a message informing you that no match has been found.
In addition to the asterisk (*), operating system shells provide other ways to match character patterns. Table 2-1 summarizes the pattern-matching characters.
Table 2-1: Pattern-matching Characters
Character | Action |
* | Matches any string, including the null string. |
? | Matches any single character. |
[...] | Matches any one of the enclosed characters. |
[.-.] | Matches any character that falls within the specified range, as defined by the current locale. For more information on locale, see Appendix C. |
[!...] | Matches any single character except one of those enclosed. This pattern matching is available only in the Bourne, Korn, and POSIX shells. |
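The following sketch exercises each pattern-matching character from Table 2-1 against a small set of hypothetical file names. The scratch directory, the file names, and the variable names are assumptions; the locale is set to C so that the range and the order of the results are predictable.

```shell
#!/bin/sh
# Fix the locale so that ranges and sort order are predictable.
LC_ALL=C
export LC_ALL

demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1

# Hypothetical file names used only for this demonstration.
: > file1
: > file2
: > file3
: > file10
: > fileA

any_string=$(echo file*)      # file1 file10 file2 file3 fileA
single_char=$(echo file?)     # file1 file2 file3 fileA
one_of=$(echo file[13])       # file1 file3
in_range=$(echo file[1-2])    # file1 file2
# The [!...] form is not available in the C shell.
not_in=$(echo file[!1-2])     # file3 fileA

echo "$any_string"
echo "$single_char"
echo "$one_of"
echo "$in_range"
echo "$not_in"
```

Note that file10 matches file* but not file?, because the question mark stands for exactly one character.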
An internationalized operating system provides the additional pattern-matching features described in Table 2-2.
Table 2-2: Internationalized Pattern-matching Characters
Character | Action |
[[:class:]] | A character class name enclosed in bracket-colon delimiters matches any of the set of characters in the named class. |
[[=char=]] | A character enclosed in bracket-equal delimiters matches any equivalence class character. An equivalence class is a set of collating elements that all sort to the same primary location. It is generally designed to deal with primary-secondary sorting; that is, for languages such as French that define groups of characters as sorting to the same primary location, and then having a tie-breaking, secondary sort. |
For more information on internationalized pattern-matching characters, see the grep(1) reference page. For more information on internationalization features, see Appendix C.
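As a sketch of the bracket-colon notation from Table 2-2, the following lines match file names by the class of their last character. The class names digit, upper, and lower are standard POSIX classes; whether your shell supports them in file name patterns is an assumption to verify on your system. The file names and the scratch directory are hypothetical.

```shell
#!/bin/sh
# Use the C locale so that class membership is predictable.
LC_ALL=C
export LC_ALL

demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1

# Hypothetical names ending in a digit, an uppercase letter,
# and a lowercase letter.
: > file1
: > reportA
: > notesb

ends_digit=$(echo *[[:digit:]])   # file1
ends_upper=$(echo *[[:upper:]])   # reportA
ends_lower=$(echo *[[:lower:]])   # notesb

echo "$ends_digit"
echo "$ends_upper"
echo "$ends_lower"
```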