The Logical Storage Manager (LSM) software is an optional, integrated, host-based disk storage management application that allows you to manage storage devices without disrupting users or applications that access data on those devices.
LSM uses Redundant Arrays of Independent Disks (RAID) technology to enable you to configure storage devices into a virtual pool of storage from which you create LSM volumes. You configure new file systems, databases, and applications, or encapsulate existing ones, to use an LSM volume instead of a disk partition.
The benefits of using an LSM volume instead of a disk partition include:
Data loss protection
You can configure LSM to protect against data loss by configuring LSM volumes in one of the following ways:
To store and maintain multiple copies (mirrors) of data on different storage devices. If a storage device fails, LSM continues operating using mirror data.
To store data and parity information on different storage devices. If a storage device fails, LSM uses the data on the remaining storage devices and the parity information to reconstruct the missing data on the failed storage device.
In either case, data remains available without disrupting users or applications, shutting down the system, or backing up and restoring data.
You can configure LSM to encapsulate the boot disk partitions into LSM volumes and then create mirrors of those volumes. By doing so, you create copies of the boot disk partitions from which the system can boot if the original boot disk fails. See the example following this list.
Maximized disk usage
You can configure LSM to seamlessly join together storage devices to appear as a single storage device to users and applications.
Performance improvements
You can configure LSM to separate data into units of equal size, then read or write the data units on two or more storage devices. LSM simultaneously reads or writes the data units if the storage devices are on different SCSI buses.
Data availability
You can configure LSM in a TruCluster environment. TruCluster software makes AlphaServer systems appear as a single system on the network. The AlphaServer systems running the TruCluster software become members of the cluster and share resources and data storage. This sharing allows an application, such as LSM, to continue uninterrupted if the cluster member on which it was running fails.
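For example, encapsulating the boot disk and mirroring it onto a second disk typically involves the volencap, volreconfig, and volrootmir commands. The following is a minimal sketch only; the disk names are placeholders, and the exact options and procedure are described in later chapters and in the volencap(8), volreconfig(8), and volrootmir(8) reference pages:
# volencap dsk0
# volreconfig
# volrootmir -a dsk3
The volencap command encapsulates the existing boot disk partitions into LSM volumes, volreconfig applies the change (a reboot is typically required), and volrootmir mirrors the boot disk volumes onto another disk of sufficient size.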
This chapter introduces LSM features, concepts, and terminology.
The volintro(8) reference page also provides information on LSM terms and commands.
1.1 LSM Object Hierarchy
LSM uses the following hierarchy of objects to organize storage:
LSM disk--An object that represents a storage device that is initialized exclusively for use by LSM
Disk group--An object that represents a collection of LSM disks and subdisks for use by an LSM volume
Subdisk--An object that represents a contiguous set of blocks on an LSM disk that LSM uses to write volume data
Plex--An object that represents a subdisk or collection of subdisks to which LSM writes a copy of the volume data or log information
Volume--An object that represents a hierarchy of LSM objects, including LSM disks, subdisks, and plexes in a disk group. Applications and file systems make read and write requests to the LSM volume.
The following sections describe LSM objects in more detail.
1.1.1 LSM Disk
An LSM disk is a storage device supported by Tru64 UNIX, such as a disk, disk partition, or hardware RAID set, that you configure exclusively for use by LSM. LSM views the storage in the same way as the Tru64 UNIX operating system software views it. For example, if the operating system software considers a RAID set to be a single storage device, so does LSM.
For more information on supported storage devices, see the Tru64 UNIX Software Product Description (SPD) web site at the following URL:
http://www.tru64unix.compaq.com/docs/spds.html
Note
LSM does not recognize or support disk clones (hardware disk copies of LSM disks).
Figure 1-1 shows a typical hardware configuration that LSM supports.
Figure 1-1: Typical LSM Hardware Configuration
A storage device becomes an LSM disk when you initialize it for use by LSM. There are three types of LSM disks:
A sliced disk, which initializes an entire disk for LSM use. This type of initialization organizes the storage into two regions on separate partitions--a large public region used for storing data and a private region for storing LSM internal metadata, such as LSM configuration information. The default size of the private region is 4096 blocks. Figure 1-2 shows a sliced disk.
Figure 1-2: LSM Sliced Disk
A simple disk, which initializes a disk partition. This type of initialization organizes the storage into two regions on the same partition--a large public region used for storing data and a private region for storing LSM internal metadata, such as LSM configuration information. The default size of the private region is 4096 blocks. Figure 1-3 shows a simple disk.
Figure 1-3: LSM Simple Disk
Whenever possible, initialize the entire disk as a sliced disk instead of configuring individual disk partitions as simple disks. This ensures that the disk's storage is used efficiently and avoids using space for multiple private regions on the same disk.
A nopriv disk, which initializes a disk or disk partition that contains data you want to encapsulate. This type of initialization creates only a public region for the data and no private region. Figure 1-4 shows a nopriv disk.
Figure 1-4: LSM Nopriv Disk
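Initializing a disk for LSM use is typically done with the voldisksetup command or the interactive voldiskadd command. The following is a minimal sketch; the disk names are placeholders, and the available options are described in the voldisksetup(8) and voldiskadd(8) reference pages:
# voldisksetup -i dsk3
# voldiskadd dsk4
The first command initializes dsk3 as a sliced disk with default public and private regions; the second command starts an interactive session that prompts for the disk group and other attributes.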
1.1.2 Disk Group
A disk group is an object that represents a grouping of LSM disks. LSM disks in a disk group share a common configuration database that identifies all the LSM objects in the disk group. LSM automatically creates and maintains copies of the configuration database in the private region of multiple LSM sliced or simple disks in each disk group.
LSM distributes these copies across all controllers for redundancy. If LSM disks in a disk group are located on the same controller, LSM distributes the copies across several disks. LSM automatically records changes to the LSM configuration and, if necessary, changes the number and location of copies of the configuration database for a disk group.
You cannot have a disk group of only LSM nopriv disks, because an LSM nopriv disk does not have a private region to store copies of the configuration database.
The LSM software creates a default disk group called rootdg. The configuration database for rootdg contains information for itself and all other disk groups that you create.
An LSM volume can use disks only within the same disk group. You can create all of your volumes in the rootdg disk group, or you can create other disk groups. For example, if you dedicate disks to store financial data, you can create and assign those disks to a disk group called finance.
When you add an LSM disk to a disk group, LSM assigns it a disk media name. By default, the disk media name is the same as the disk access name, which the operating system software assigns to a storage device. For example, the disk media name and disk access name might be dsk1.
You do not have to use the default disk media name. You can assign a disk media name of up to 31 alphanumeric characters that cannot include spaces or the forward slash (/). For example, you could assign a disk media name of finance_data_disk.
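For example, you might create a disk group called finance and add initialized disks to it with the voldg command, assigning your own disk media names. This is a hedged sketch; the disk and group names are placeholders, and the exact syntax is described in the voldg(8) reference page:
# voldg init finance finance_data_disk=dsk3
# voldg -g finance adddisk finance_data_disk2=dsk4
The first command creates the finance disk group using dsk3 under the media name finance_data_disk; the second adds dsk4 to the group under another media name.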
LSM associates the disk media name with the operating system's disk access name. The disk media name provides insulation from operating system naming conventions. This allows LSM to find the device should you move it to a new location (for example, connect a disk to a different controller). However, LSM nopriv disks require more planning to move them to a different controller or a different system. See Section 5.2.7 for more information on moving a disk group containing nopriv disks to another system.
1.1.3 Subdisk
A subdisk is an object that represents a contiguous set of blocks in an LSM disk's public region that LSM uses to store data.
By default, LSM assigns a subdisk name using the LSM disk media name followed by a dash (-) and an ascending two-digit number beginning with 01. For example, dsk1-01 is the subdisk name on an LSM disk with a disk media name of dsk1.
You do not have to use the default subdisk name. You can assign a subdisk name of up to 31 alphanumeric characters that cannot include spaces or the forward slash (/). For example, you could assign a subdisk name of finance_disk01.
A subdisk can be:
The entire public region. Figure 1-5 shows that the entire public region of an LSM disk was configured as a subdisk called dsk1-01:
Figure 1-5: Single Subdisk Using a Public Region
A portion of the public region. Figure 1-6 shows a public region of an LSM disk that was configured as two subdisks called dsk2-01 and dsk2-02:
Figure 1-6: Multiple Subdisks Using a Public Region
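You can display the subdisks (and other LSM objects) that LSM has created with the volprint command. The following is a minimal sketch; see the volprint(8) reference page for the full set of options:
# volprint -st
# volprint -g finance -st
The -s option selects subdisk records and the -t option prints them in a one-line tabular format; the -g option limits the output to a single disk group.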
1.1.4 Plex
A plex is an object that represents a subdisk or collection of subdisks in the same disk group to which LSM writes a copy of volume data or log information. There are three types of plexes:
Data plex
A data plex contains volume data. There are three types of data plexes; the type you choose depends on how you want LSM to store volume data on subdisks:
In a concatenated data plex, LSM writes volume data in a linear manner. When the space in one subdisk has been written to, the remaining data goes to the next sequential subdisk in the plex. Section 1.1.4.1 explains this plex type in more detail.
In a striped data plex, LSM separates data into equal-sized units and writes the data units to each disk in the plex. This spreads the read-write operations evenly across the disks. Section 1.1.4.2 explains this plex type in more detail.
In a RAID 5 data plex, LSM calculates a parity value for the data being written, then separates the data into equal-sized units and intersperses the data units and parity on each column in the plex. Section 1.1.4.3 explains this plex type in more detail.
Log plex
A log plex contains information about activity in a volume. In the event of a failure, LSM recovers only those areas of the volume identified in the log plex as being dirty (written to) at the time of the failure. There are two types of log plexes:
In a dirty region logging (DRL) plex, LSM logs the regions that change in a mirrored concatenated or striped data plex.
In a RAID 5 log plex, LSM logs blocks being changed in a RAID 5 data plex and stores a temporary copy of the data and parity being written.
Data and log plex (for compatibility with Version 4.0)
By default, LSM assigns a plex name using the volume name followed by a dash (-) and an ascending two-digit number beginning with 01. For example, volume1-01 is the name of a plex for a volume called volume1.
You do not have to use the default plex name. You can assign a plex name of up to 31 alphanumeric characters that cannot include spaces or the forward slash (/). For example, you could assign a plex name of finance_plex01.
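For example, you can change a default plex name to a more descriptive one with the voledit command. This is a hedged sketch; the object names are placeholders, and the exact syntax is described in the voledit(8) reference page:
# voledit -g rootdg rename volume1-01 finance_plex01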
1.1.4.1 Concatenated Data Plex
In a concatenated data plex, LSM creates a contiguous address space on the subdisks and sequentially writes volume data in a linear manner. If LSM reaches the end of a subdisk while writing data, it continues to write data to the next subdisk as shown in Figure 1-7.
Figure 1-7: Concatenated Data Plex
A single subdisk failure in a volume with one concatenated data plex will result in LSM volume failure. To prevent this type of failure, you can create multiple plexes (mirrors) on different disks. LSM continuously maintains the data in the mirrors. If a plex becomes unavailable because of a disk failure, the volume continues operating using another plex.
Using disks on different SCSI buses for mirror plexes speeds read requests, because data can be simultaneously read from multiple plexes.
By default, LSM creates a DRL plex when you create a mirrored volume. A DRL plex divides the data plexes into a set of consecutive regions and tracks regions that change due to I/O writes. When the system restarts after a failure, only the changed regions of the volume are recovered.
If you do not use a DRL plex and the system restarts after a failure, LSM must copy and resynchronize all the data to each plex to restore the plex consistency. Although this process occurs in the background and the volume is still available, it can be a lengthy procedure and can result in unnecessarily recovering data, thereby degrading system performance.
You can create up to 32 plexes, which can be any combination of data or DRL plexes.
Figure 1-8 shows a volume with mirrored concatenated data plexes.
Figure 1-8: Volume with Mirrored Concatenated Data Plexes
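Creating a mirrored volume is typically done with the volassist command. The following is a minimal sketch, assuming a hypothetical volume name and size; the attributes and defaults are described in the volassist(8) reference page:
# volassist make datavol 2g nmirror=2
# volassist mirror datavol
# volassist addlog datavol
The first command creates a volume with two concatenated data plexes. Alternatively, the second command adds a mirror plex to an existing volume, and the third adds a DRL log plex if one is not already present.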
1.1.4.2 Striped Data Plex
In a striped data plex, LSM separates the data into units of equal size (64 KB by default) and writes the data units alternately on two or more columns of subdisks, creating a stripe of data across the columns. LSM can simultaneously write the data units if there are two or more units and the subdisks are on different SCSI buses.
Figure 1-9 shows how a write request of 384 KB of data is separated into six 64 KB units and written to three columns as two complete stripes.
Figure 1-9: Writing Data to a Striped Plex
If a write request does not complete a stripe, then the first data unit of the next write request starts in the next column. For example, Figure 1-10 shows how 320 KB of data is separated into five 64 KB units and written to three columns. The first data unit of the next write request will start in the third column.
Figure 1-10: Incomplete Striped Data Plex
As in a concatenated data plex, a single disk failure in a volume with one striped data plex will result in volume failure. To prevent this type of failure, you can create multiple plexes (mirrors) on different disks. LSM continuously maintains the data in the mirrors. If a plex becomes unavailable because of a disk failure, the volume continues operating using another plex.
Using disks on different SCSI buses for mirror plexes speeds read requests, because data can be simultaneously read from multiple plexes.
By default, LSM creates a DRL plex when you create a mirrored volume. A DRL plex divides the data plexes into a set of consecutive regions and tracks regions that change due to I/O writes. When the system restarts after a failure, only the changed regions of the volume are recovered.
If you do not use a DRL plex and the system restarts after a failure, LSM must copy and resynchronize all the data to each plex to restore the plex consistency. Although this process occurs in the background and the volume is still available, it can be a lengthy procedure and can result in unnecessarily recovering data, thereby degrading system performance.
You can create up to 32 plexes, which can be any combination of data or DRL plexes.
Figure 1-11 shows a volume with mirrored striped data plexes.
Figure 1-11: Volume with Mirrored Striped Data Plexes
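Creating a striped, mirrored volume is also typically done with volassist. The following is a minimal sketch assuming hypothetical names and sizes; attribute names such as nstripe may vary, so see the volassist(8) reference page for the exact syntax:
# volassist make stripevol 6g layout=stripe nstripe=3 nmirror=2
This creates a volume whose data is striped across three columns, with two mirrored data plexes; as described above, LSM adds a DRL plex by default for a mirrored volume.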
1.1.4.3 RAID 5 Data Plex
In a RAID 5 data plex, LSM calculates a parity value for each stripe of data, then separates the stripe of data and parity into units of equal size (16 KB by default) and writes the data and parity units on three or more columns of subdisks, creating a stripe of data across the columns. LSM can simultaneously write the data units if there are three or more units and the disks are on different SCSI buses. If a disk in one column fails, LSM continues operating using the data and parity information in the remaining columns to reconstruct the missing data.
In a RAID 5 data plex, LSM writes both data and parity across columns, writing the parity in a different column for each stripe of data. The first parity unit is located in the last column. Each successive parity unit is located in the next column, left-shifted one column from the previous parity unit location. If there are more stripes than columns, the parity unit placement begins again in the last column.
Figure 1-12 shows how data and parity information are written in a RAID 5 data plex with three columns.
Figure 1-12: Data and Parity Placement in a Three-Column RAID 5 Data Plex
In Figure 1-12, the first stripe of data contains data units 1 and 2 and parity unit P0. The second stripe contains data units 3 and 4 and parity unit P1. The third stripe contains data units 5 and 6 and parity unit P2.
By default, creating a RAID 5 volume creates a RAID 5 log plex. A RAID 5 log plex keeps track of data and parity blocks being changed due to I/O writes. When the system restarts after a failure, the write operations that did not complete before the failure are restarted.
Note
You cannot mirror a RAID 5 data plex.
The TruCluster software does not support RAID 5 volumes and data plexes.
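Creating a RAID 5 volume is typically done with volassist as well. The following is a minimal sketch with hypothetical names; see the volassist(8) reference page for the exact attributes:
# volassist make r5vol 6g layout=raid5 nstripe=4
This creates a RAID 5 volume striped across four columns; by default, LSM also creates a RAID 5 log plex for the volume.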
1.1.5 Volume
A volume is an object that represents a hierarchy of plexes, subdisks, and LSM disks in a disk group. Applications and file systems make read and write requests to the LSM volume. The LSM volume depends on the underlying LSM objects to satisfy the request.
An LSM volume can use storage from only one disk group.
A Note About Terminology
Volumes that use mirror plexes, whether those plexes are concatenated or striped, are often called mirrored volumes. If the plex layout is important, the volume might be described as a concatenated and mirrored volume or a striped and mirrored volume. Volumes that use a RAID 5 plex are similarly called RAID 5 volumes. It is important to understand that these terms are a shorthand way of describing a volume that uses plexes of a certain type:
A mirrored volume has two or more striped or concatenated plexes that each contain exact copies of the data. The volume is the container for all the copies (plexes).
A RAID 5 volume, or a volume that uses a RAID 5 plex, can never be mirrored, because by definition and design, the volume contains only one data plex (and up to 31 log plexes). The data plex provides redundancy for the volume in the form of the parity value for each stripe of data. The log plex provides the fast-recovery mechanism for the volume by tracking the regions that change, along with a copy of the data and parity for a predefined number of writes.
As with most storage devices, an LSM volume has a block device interface and a character device interface.
A volume's block device interface is located in the /dev/vol/disk_group directory. A volume's character device interface is located in the /dev/rvol/disk_group directory. Because these interfaces support the standard UNIX open, close, read, write, and ioctl calls, databases, file systems, applications, and secondary swap use an LSM volume in the same manner as a disk partition, as shown in Figure 1-13.
Figure 1-13: Using LSM Volumes Like Disk Partitions
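For example, you can create and mount a UFS file system on an LSM volume exactly as you would on a disk partition. The following is a minimal sketch with hypothetical volume and mount point names:
# newfs /dev/rvol/rootdg/datavol
# mount /dev/vol/rootdg/datavol /data
The newfs command uses the volume's character device interface, and the mount command uses its block device interface.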
1.2 LSM Interfaces
You create, display, and manage LSM objects using any of the following interfaces:
A Java-based graphical user interface (GUI) called LSM Storage Administrator (lsmsa) that displays a hierarchical view of LSM objects and their relationships.
The Storage Administrator provides dialog boxes in which you enter information to create or manage LSM objects. Completing a dialog box can be the equivalent of entering several command-line commands. The Storage Administrator allows you to manage local or remote systems on which LSM is running. You need an LSM license to use the Storage Administrator. See Appendix A for more information on using the Storage Administrator.
A menu-based, interactive interface called voldiskadm. To perform a procedure, you choose an operation from the main menu and the voldiskadm interface prompts you for information. The voldiskadm interface provides default values when possible. You can press Return to use the default value or enter a new value, or enter ? at any time to view online help. See Appendix C and the voldiskadm(8) reference page for more information.
A bit-mapped GUI called Visual Administrator (dxlsm) that uses the Basic X Environment.
The Visual Administrator allows you to view and manage disks and volumes and perform limited file system administration. The Visual Administrator displays windows in which LSM objects are represented as icons. See Appendix D for more information on the Visual Administrator.
A command-line interpreter, whereby you enter LSM commands at the system prompt. The examples in this manual use LSM commands.
In most cases, you can use the LSM interfaces interchangeably.
That is, LSM objects created by one interface are manageable through and compatible with LSM objects created by other LSM interfaces. The command-line interpreter provides you with complete control and the finest granularity in creating and managing LSM objects. The other interfaces do not support all the operations available through the command line; see the relevant appendix for a description of the supported functions.
1.2.1 LSM Command-Line Interpreter
LSM provides a range of commands that allow you to display and manage LSM objects.
Table 1-1 lists the LSM commands and their functions.
For more information on a command, see the reference page corresponding to its name. For example, for more information on the volassist command, enter:
# man volassist
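As another example, the following commands are commonly used to display the LSM configuration. This is a brief sketch; the options shown are described in the voldisk(8) and volprint(8) reference pages:
# voldisk list
# volprint -ht
The voldisk list command shows the status of each disk under LSM control, and volprint -ht prints a hierarchical, one-line-per-record listing of the disk groups, volumes, plexes, and subdisks on the system.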