2 Overview of Files and Directories

This chapter provides an introduction to files, file systems, and text editors. A file is a collection of data stored together in the computer. Typical files contain memos, reports, correspondence, programs, or other data. A file system is the useful arrangement of files into directories.

A text editor is a program that lets you create new files and modify existing ones.

After completing this chapter, you will be able to:

Create files with the vi text editor. These files will be useful for working through the examples later in this book.

Understand the file system components and concepts.

This knowledge can help you design a file system that is appropriate for the type of information you use and the way you work.

2.1 Overview of Text Editors

An editor is a program that lets you create and change files containing text, programs, or other data. An editor does not provide the formatting and printing features of a word processor or publishing software.

With a text editor, you can:

Create, read, and write files

Display and search for data

Add, replace, and remove data

Move and copy data

Run operating system commands

Your editing takes place in an edit buffer that you can save or discard.

The vi and ed text editing programs are available on the operating system. Each editor has its own methods of displaying text as well as its own set of subcommands and rules.

For information about vi, read Section 2.2 and Appendix A. For information about ed, see Appendix B.

Your system may contain additional editors; see your system administrator for details.
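
One way to see whether a particular editor is installed is to ask the shell whether it can find the command. The following is a minimal sketch, assuming a POSIX-style shell in which the command -v built-in is available. The editor_status function name is hypothetical and is used only for illustration.

```shell
# editor_status is a hypothetical helper, not an operating system command.
# It reports whether the named program can be found on the search path.
editor_status() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: available"
    else
        echo "$1: not found"
    fi
}

for editor in vi ed; do
    editor_status "$editor"
done
```

An editor that is reported as not found may still be installed in a directory that is not on your search path, so treat the result as a hint rather than a definitive answer.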

2.2 Creating Sample Files with the vi Text Editor

This section shows how to create three files with the vi text editor.

The goal of this section is to have you create, using a minimal set of commands, files that can be used for working through the examples later in this book. For more information about vi, see Appendix A and the vi(1) reference page.

Note

If you are familiar with a different editing program, you can use that program to create the three sample files described in this section. If you already have created three files with an editing program, you can use those files by substituting their names for the file names used in the examples.

When following the steps that are used to create the sample files, enter only the text that is shown in boldface characters. System prompts and output are shown in a different typeface, like this.

To create three sample files, follow these steps:

Start the vi program by typing vi and the name of a new file at the shell prompt. Press the Return key.
```
$ vi file1 [Return]
```
This is a new file, so the system responds by putting your cursor at the top of the screen:
```
~
~
~
~
~
~
"file1" [New file]
```
Notice the blank lines on your screen that begin with a tilde ( ~ ). These tildes indicate the lines that contain no text. Because you have not entered any text, all lines begin with a tilde.

Type the lowercase letter i to specify that you want to insert text into the new file. The system does not display the i that you enter.

Enter the following sample text, pressing the Return key after each line. To correct a mistake before moving to the next line, press the Delete key or the Backspace key to move backward over the mistake, and then retype the text correctly.
```
You start the vi program by entering [Return]
the vi command optionally followed by the name [Return]
of a new or existing file. [Escape]
~
~
~
~
~
~
"file1" [New file]
```

Press the Escape key to indicate that you have finished your current work. Type a colon (:) to enter the Last Line mode.

Note

Depending upon how your terminal or workstation is set up, the Escape key may be programmed to perform a different function. It is possible that one of the function keys on your keyboard may have been set up to perform the escape function. This function is often assigned to the [F11] key. See your system administrator if your Escape key does not operate properly.

The colon is displayed as a prompt at the bottom of the screen as follows:
```
You start the vi program by entering
the vi command optionally followed by the name
of a new or existing file.
~
~
~
~
~
~
:
```

Enter a lowercase letter w next. The letter w tells the system to write, or save, a copy of the new file in your current directory (see Chapter 4 for an explanation of your current directory).
Your screen will look like this:
```
You start the vi program by entering
the vi command optionally followed by the name
of a new or existing file.
~
~
~
~
~
~
"file1" [New file] 3 lines, 111 characters
```
The system displays the name of the new file as well as the number of lines and characters it contains.
The system is still in the vi text editor so you can create two more sample files. The process is the same as the one you used to create file1, but the text you enter will be different.

Type a colon (:). The colon is displayed as a prompt at the bottom of the screen. To create your second sample file, enter vi file2. The system responds with a screen that looks like this:
```
~
~
~
~
~
~
~
"file2" No such file or directory
```
The message file2 No such file or directory indicates that file2 is a new file.

Indicate that you want to insert text into the new file by typing the lowercase letter i. Enter the following sample text:
```
If you have created a new file, you will find[Return]
that it is easy to add text.[Escape]
```

Type a colon (:) and enter the lowercase letter w to write, or save, the file in your current directory.

Your screen will look like this:
```
If you have created a new file, you will find
that it is easy to add text.
~
~
~
~
~
~
~
"file2" [New file] 2 lines, 75 characters
```

Create the third file in the same way that you created file2. However, name the file file3, and enter the following sample text:
```
You will find that vi is a useful[Return]
editor that has many features.[Escape]
```

Type a colon (:) and enter the wq command.
The wq command writes the file, quits (that is, exits) the editor, and returns you to the shell prompt.
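
As the note earlier in this section says, you can create the sample files with any program. The following sketch, which assumes a POSIX-style shell and the mktemp, printf, and wc commands, writes the same three files without an editor and then counts their lines and characters. The scratch directory is an assumption made so that the example does not overwrite files you already have.

```shell
# Work in a scratch directory so that existing files are not overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Each printf argument becomes one line of the file.
printf '%s\n' \
    'You start the vi program by entering' \
    'the vi command optionally followed by the name' \
    'of a new or existing file.' > file1

printf '%s\n' \
    'If you have created a new file, you will find' \
    'that it is easy to add text.' > file2

printf '%s\n' \
    'You will find that vi is a useful' \
    'editor that has many features.' > file3

# Report line and character counts. For file1 and file2 these should
# agree with the counts that vi displayed when you saved the files.
wc -l -c file1 file2 file3
```

If the counts for file1 and file2 differ from 3 lines, 111 characters and 2 lines, 75 characters, check the text for typing mistakes.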

2.3 Understanding Files, Directories, and Pathnames

A file is a collection of data stored in a computer. A file stored in a computer is like a document stored in a filing cabinet because you can retrieve it, open it, process it, close it, and store it as a unit. Every computer file has a file name that both users and the system use to refer to the file.

A file system is the arrangement of files into a useful pattern. Any time you organize information, you create something like a computer file system. For example, the structure of a manual file system (file cabinets, file drawers, file folders, and documents) resembles the structure of a computer file system. (The software that manages the file storage is also known as the file system, but that usage of the term does not occur in this chapter. On some systems, this software is also called the file manager.)

Once you have organized your file system (manual or computer), you can find a particular piece of information quickly because you understand the structure of the system. To understand the file system, you should first become familiar with the following three concepts:

Files and file names

Directories and subdirectories

Tree structures and pathnames

2.3.1 Files and File Names

A file can contain the text of a document, a computer program, records for a general ledger, the numerical or statistical output of a computer program, or other data.

Avoid the following characters in file names because they have special meaning to the shell:

Slash ( / )

Backslash ( \ )

Ampersand ( & )

Left- and right-angle brackets (< and >)

Question mark ( ? )

Dollar sign ( $ )

Left bracket ( [ )

Asterisk ( * )

Tilde ( ~ )

Vertical bar or pipe symbol ( | )

You may use a period or dot ( . ) in the middle of a file name, but never at the beginning of the file name unless you want the file to be hidden when doing a simple listing of files. For information about characters with special meanings to your shell, refer to Section 8.2.2 and Section 8.3.2. For information about listing hidden files, see Section 3.1.3.
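
A short check can flag a proposed name that contains one of the characters listed above or that begins with a dot. This is only a sketch for a POSIX-style shell. The check_name function name is hypothetical; it is not an operating system command.

```shell
# check_name is a hypothetical helper. It prints a one-word verdict
# for a proposed file name.
check_name() {
    case $1 in
        */* | *'\'* | *'&'* | *'<'* | *'>'* | *'?'* | *'$'* | *'['* | *'*'* | *'~'* | *'|'*)
            echo "special" ;;   # contains a character with special meaning
        .*)
            echo "hidden" ;;    # begins with a dot, so a simple listing omits it
        *)
            echo "plain" ;;
    esac
}

check_name memo.advt      # prints: plain
check_name .profile       # prints: hidden
check_name 'sales*1999'   # prints: special
```

The quotation marks around the last name keep the shell from interpreting the asterisk before the function sees it.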

Note

Unlike some operating systems, this operating system distinguishes between uppercase and lowercase letters in file names (that is, it is case sensitive). For example, the following three file names represent three distinct files: filea, Filea, and FILEA.

Use file names that reflect the actual contents of your files. For example, a file name such as memo.advt might indicate that the file contains a memo about advertising. On the other hand, file names such as filea, fileb, or filec tell you nothing about the contents of those files.

It is also a good idea to use a consistent pattern to name related files. For example, suppose you have an advertising report that is divided into chapters, with each chapter contained in a separate file. You might name these files in the following way:

chap1.advt

chap2.advt

chap3.advt

Note

Many programs that you invoke use the portion of the file name following the dot ( . ), called the extension, as an indicator of the file's purpose.
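
The part of a name that follows the last dot can be separated from the rest of the name with the shell's parameter expansion. The following sketch assumes a POSIX-style shell, such as the Korn or POSIX shell. The variable names are illustrative.

```shell
name=chap1.advt

# Remove the longest leading match of "*." to leave the extension.
extension=${name##*.}
# Remove the shortest trailing match of ".*" to leave the rest of the name.
stem=${name%.*}

echo "$stem"        # prints: chap1
echo "$extension"   # prints: advt
```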

The maximum length of a file name depends upon the file system used on your operating system. For example, your file system may allow a maximum file name length of 255 characters (the default), or it may allow a maximum file name length of only 14 characters. Because knowing the maximum file name length is important to providing files with meaningful file names, see your system administrator for details.
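
On systems that provide the POSIX getconf utility, you can also ask for the limit directly. The sketch below assumes that getconf is installed. The value it reports applies to the file system that holds the directory you name, which here is the current directory.

```shell
# Ask for the maximum file name length in the file system that
# holds the current directory.
name_max=$(getconf NAME_MAX .)
echo "Maximum file name length here: $name_max"
```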

2.3.2 Directories and Subdirectories

You can organize your files into groups and subgroups that resemble the cabinets, drawers, and folders in a manual file system. These groups are called directories, and the subgroups are called subdirectories. A well-organized system of directories and subdirectories lets you retrieve and manipulate the data in your files quickly.

Directories differ from files in two significant ways:

Directories are organizational tools; files are storage places for data.

Directories contain the names of files, other directories, or both.

When you first log in, the system automatically places you in your login directory. This directory is also called your home directory. The system also sets your HOME environment variable to the full pathname of this directory. This directory was created for you when your computer account was established. However, a file system in which all files are arranged under your login directory is not necessarily the most efficient method to organize your files.

As you work with the system, you may want to set up additional directories and subdirectories so you can organize your files into useful groups. For example, assume that you work for the Sales department and are responsible for four lines of automobiles. You may want to create a subdirectory under your login directory for each automobile line. Each subdirectory can contain all memos, reports, and sales figures applicable for the automobile model.

Once your files are arranged into a directory structure that you find useful, you can move easily between directories. See Chapter 4 for information about creating directories and moving between them.
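
For the Sales department example, the directory layout might be created as follows. This is a sketch only: the four automobile line names are invented for illustration, and a scratch directory created with mktemp stands in for your login directory so that nothing in your real login directory is changed.

```shell
# A scratch directory stands in for the login directory in this sketch.
login_dir=$(mktemp -d)
cd "$login_dir" || exit 1

# One subdirectory for each (hypothetical) automobile line.
mkdir sedan coupe wagon truck

# Move into one subdirectory, then back to the login directory.
cd sedan || exit 1
cd "$login_dir" || exit 1

# List the new subdirectories.
ls
```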

2.3.3 Displaying the Name of Your Current (Working) Directory (pwd)

The directory in which you are working at any given time is your current, or working, directory.

Whenever you are uncertain about the directory in which you are working or where that directory exists in the file system, enter the pwd (print working directory) command as follows:


```
$ pwd
```

The system displays the name of your current directory in the format:

```
/usr/msg
```

This information indicates that you are currently working in a directory named msg that is located under the usr directory.

The /usr/msg notation is known as the pathname of your working directory. See Section 2.3.4 for information about pathnames. See the pwd(1) reference page for further information on the pwd command.
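
The two parts of this pathname can be separated with the standard dirname and basename utilities. The sketch below works on the example string /usr/msg itself, so it does not require that directory to exist on your system.

```shell
path=/usr/msg

parent=$(dirname "$path")    # the directory that contains msg
name=$(basename "$path")     # the last element of the pathname

echo "$name is located under $parent"   # prints: msg is located under /usr
```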

2.3.4 The Tree-Structure File System and Pathnames

The files and directories in the file system are arranged hierarchically in a structure that resembles an upside-down tree with the roots at the top and the branches at the bottom. This arrangement is called a tree structure. You can find more detailed information about the directory structure in the hier(5) reference page.

Figure 2-1 shows a typical file system arranged in a tree structure. The names of directories are printed in bold, and the names of files are printed in italics.

Figure 2-1: A Typical File System

At the top of the file system shown in Figure 2-1 (that is, at the root of the inverted tree structure) is a directory called the root directory. The symbol that represents this first major division of the file system is a slash ( / ).

At the next level down from the root of the file system are eight directories, each with its own system of subdirectories and files. Figure 2-1, however, shows only the subdirectories under the directory named user. These are the login directories for the users of this system.

The third level down the tree structure contains the login directories for two of the system's users, smith and chang. It is in these directories that smith and chang begin their work after logging in.

The fourth level of the figure shows three directories under the chang login directory: plans, report, and payroll.

The fifth level of the tree structure contains both files and subdirectories. The plans directory contains four files, one for each quarter. The report directory contains three files comprising the three parts of a report. Also on the fifth level are two subdirectories, regular and contract, which further organize the information in the payroll directory.

A higher level directory is frequently called a parent directory. For example, in Figure 2-1, the directories plans, report, and payroll all have chang as their parent directory.

A pathname specifies the location of a directory or a file within the file system. For example, when you want to change from working on File A in Directory X to File B in Directory Y, you enter the pathname to File B. The operating system then uses this pathname to search through the file system until it locates File B.

A pathname consists of a sequence of directory names separated by slashes ( / ) that ends with a directory name or a file name. The first element in a pathname specifies where the system is to begin searching, and the final element specifies the target of the search. The following pathname is based on Figure 2-1:

/user/chang/report/part3

The first slash ( / ) represents the root directory and indicates the starting place for the search. The remainder of the pathname indicates that the search is to go to the user directory, then to the chang directory, next to the report directory, and finally to the part3 file.
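
The search described above can be traced one element at a time. The following sketch, written for a POSIX-style shell, only takes the pathname string apart; it does not look at any real files, and the variable names are illustrative.

```shell
path=/user/chang/report/part3

# Start at the root directory, then take one element at a time.
route="/"
rest=${path#/}
while [ -n "$rest" ]; do
    element=${rest%%/*}          # text up to the next slash
    route="$route -> $element"
    case $rest in
        */*) rest=${rest#*/} ;;  # drop the element just visited
        *)   rest= ;;            # that was the final element
    esac
done

echo "$route"   # prints: / -> user -> chang -> report -> part3
```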

Whether you are changing your current directory, sending data to a file, or copying or moving a file from one place in your file system to another, you use pathnames to indicate the objects you want to manipulate.

A pathname that starts with a slash ( / ) (the symbol representing the root directory) is called a full pathname or an absolute pathname. You can also think of a full pathname as the complete name of a file or a directory. Regardless of where you are working in the file system, you can always find a file or a directory by specifying its full pathname.

The file system also lets you use relative pathnames. Relative pathnames do not begin with the / that represents the root directory because they are relative to the current directory.

You can specify a relative pathname in one of several ways:

As the name of a file in the current directory.

As a pathname that begins with the name of a directory one level below your current directory.

As a pathname that begins with .. (dot dot, the relative pathname for the parent directory).

As a pathname that begins with . (dot, which refers to the current directory). This relative pathname notation is useful when you want to run your own version of an operating system command in the current directory (for example, ./ls).

Every directory contains at least two entries: .. (dot dot), and . (dot).

In Figure 2-2, for example, if your current directory is chang, the relative pathname for the file 1Q in the contract directory is payroll/contract/1Q. By comparing this relative pathname with the full pathname for the same file, /user/chang/payroll/contract/1Q, you can see that using relative pathnames means less typing and more convenience.

Figure 2-2: Relative and Full Pathnames
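
You can try these pathname forms on a small copy of the chang directory tree. In the sketch below, a scratch directory created with mktemp stands in for the root directory, so the full pathname begins with the name of that scratch directory rather than with a single slash. The file contents are invented for illustration.

```shell
# A scratch directory stands in for the root directory ( / ) in this sketch.
root=$(mktemp -d)
mkdir -p "$root/user/chang/payroll/contract"
echo "first quarter" > "$root/user/chang/payroll/contract/1Q"

# Make chang the current directory.
cd "$root/user/chang" || exit 1

# The same file named three ways: by a relative pathname, by a relative
# pathname that begins with . (dot), and by its full pathname.
relative=$(cat payroll/contract/1Q)
dotted=$(cat ./payroll/contract/1Q)
full=$(cat "$root/user/chang/payroll/contract/1Q")

# .. (dot dot) names the parent directory, which here is user.
cd .. || exit 1
parent_name=$(basename "$(pwd)")
echo "$parent_name"   # prints: user
```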

In the C shell and the Korn or POSIX shell, you may also use a tilde ( ~ ) at the beginning of relative pathnames. The tilde character used alone specifies your own login (home) directory. The tilde character followed by a user name specifies the login (home) directory of that user on the same system.

For example, to specify your own login directory, use the tilde alone. To specify the login directory of user chang, specify ~chang.
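
The following sketch, for the Korn or POSIX shell, shows that the tilde used alone stands for the same directory that the HOME environment variable names. It assumes that HOME is set, as it normally is after you log in. The variable name is illustrative.

```shell
# The tilde alone expands to the pathname of your login directory.
my_login_dir=~

echo "$my_login_dir"
echo "$HOME"          # normally the same pathname as the line above
```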

For more information on using relative pathnames, see Chapter 4.

Note

If there are other users on your system, you may or may not be able to get to their files and directories, depending upon the permissions set for them. For more information about file and directory permissions, see Chapter 5. In addition, your system may contain enhanced security features that may affect access to files and directories. If so, see your system administrator for details.

2.4 Specifying Files with Pattern Matching

Commands often take file names as arguments. To use several different file names as arguments to a command, you can type out the full name of each file, as the following example shows:


```
$ ls file1 file2 file3
```

However, if the file names have a common pattern (in this example, the file prefix), the shell can match that pattern, generate a list of those names, and automatically pass them to the command as arguments.

The asterisk (*), sometimes referred to as a wildcard, matches any string of characters. In the following example, the ls command lists the name of every file in the current directory whose name begins with the file prefix:


```
$ ls file*
```

The pattern file* matches any file name that begins with file and ends with any other character string, including the empty string. The shell passes every file name that matches this pattern as an argument to the ls command.

Thus, you do not have to enter (or even remember) the full name of each file in order to use it as an argument. When file1, file2, and file3 are the only files in the directory that have the file prefix, both commands (ls with all file names typed out and ls file*) do the same thing: they pass the names of all files with the file prefix as arguments to the ls command.
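
The equivalence of the two commands can be checked in a scratch directory. The sketch below assumes a POSIX-style shell and the mktemp and touch commands. The extra file named notes is invented to show that the pattern leaves other names out.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch file1 file2 file3 notes

typed_out=$(ls file1 file2 file3)
by_pattern=$(ls file*)

# The echo command shows the argument list that the shell builds
# from the pattern before any command runs.
echo file*    # prints: file1 file2 file3
```

Because the shell expands the pattern, the same technique works with any command that accepts file names as arguments, not only with ls.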

There is one exception to the general rules for pattern matching. When the first character of a file name is a period, you must match the period explicitly. For example, ls * displays the names of all files in the current directory that do not begin with a period. The command ls -a displays all file names, those that begin with a period and all others.

This restriction prevents the shell from automatically matching the relative directory names. These are . (dot, standing for the current directory) and .. (dot dot, standing for the parent directory). For more information on relative directory names, see Chapter 4.
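
The following sketch shows the exception in a scratch directory that holds one hidden file and one ordinary file. It assumes a POSIX-style shell and the mktemp and touch commands. Both file names are invented for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch .hidden visible

echo *      # prints: visible
echo .h*    # prints: .hidden  (the period is matched explicitly)
ls -a       # lists ., .., .hidden, and visible
```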

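If a pattern does not match any file names, the result depends on the shell. The C shell displays a message informing you that no match has been found. The Bourne, Korn, and POSIX shells pass the pattern unchanged to the command, which usually reports that it cannot find a file with that name.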

In addition to the asterisk (*), operating system shells provide other ways to match character patterns. Table 2-1 summarizes all pattern-matching characters and provides examples.

Table 2-1: Pattern-matching Characters

Character	Action
*	Matches any string, including the null string. For example, `th*` matches `th`, `theodore`, and `theresa`.
?	Matches any single character. For example, `304?b` matches `304Tb`, `3045b`, `304Bb`, or any other string that begins with `304`, ends with `b`, and has one character in between.
[...]	Matches any one of the enclosed characters. For example, `[AGX]*` matches all file names in the current directory that begin with `A`, `G`, or `X`.
[.-.]	Matches any character that falls within the specified range, as defined by the current locale. For more information on locale, see Appendix C. For example, `[T-W]*` matches all file names in the current directory that begin with `T`, `U`, `V`, or `W`.
[!...]	Matches any single character except one of those enclosed. For example, `[!abyz]*` matches all file names in the current directory that begin with any character except `a`, `b`, `y`, or `z`. This pattern matching is available only in the Bourne, Korn, and POSIX shells.
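
The patterns in Table 2-1 can be tried in a scratch directory. The sketch below assumes the Bourne, Korn, or POSIX shell and sets the C locale so that the range and the order of the results are predictable. The file names are taken from the table examples or invented to give each pattern something to match.

```shell
# Use the C locale so that ranges and sort order are predictable.
LC_ALL=C
export LC_ALL

workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch th theodore theresa 3045b 304Tb Alpha Gamma Xray Tango Victor apple zebra

echo th*         # prints: th theodore theresa
echo 304?b       # prints: 3045b 304Tb
echo [AGX]*      # prints: Alpha Gamma Xray
echo [T-W]*      # prints: Tango Victor
echo [!abyz]*    # prints every name except apple and zebra
```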

An internationalized operating system provides the additional pattern-matching features described in Table 2-2.

Table 2-2: Internationalized Pattern-matching Characters

Character	Action
[[:class:]]	A character class name enclosed in bracket-colon delimiters matches any one of the characters in the named class. The supported classes are `alpha`, `upper`, `lower`, `digit`, `alnum`, `xdigit`, `space`, `print`, `punct`, `graph`, and `cntrl`. For example, the `alpha` character class name specifies that you want to match any alphabetic character (uppercase and lowercase) as defined by the current locale. If you are running an American-based locale, `alpha` matches any character in the alphabet (A-Z, a-z).
[[=char=]]	A character enclosed in bracket-equal delimiters matches any character in the same equivalence class. An equivalence class is a set of collating elements that all sort to the same primary location. It supports primary-secondary sorting in languages such as French, which define groups of characters that sort to the same primary location and then apply a secondary, tie-breaking sort.

For more information on internationalized pattern-matching characters, see the grep(1) reference page. For more information on internationalization features, see Appendix C.
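
The character class form can be tried in the same way as the simpler patterns. The sketch below assumes a shell that supports character classes in file name patterns, such as the Korn or POSIX shell, and sets the C locale so that the classes contain only the unaccented letters and digits. The file names are invented for illustration. The equivalence class form is not shown because its results depend on the locale definitions installed on your system.

```shell
# Use the C locale so that class membership is predictable.
LC_ALL=C
export LC_ALL

workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 1999sales Report notes

echo [[:digit:]]*   # prints: 1999sales
echo [[:upper:]]*   # prints: Report
echo [[:alpha:]]*   # prints: Report notes
```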