volintro(8)
NAME
volintro - Introduction to Logical Storage Manager (LSM) utilities
SYNOPSIS
/sbin/volassist, /sbin/vold, /sbin/voldctl, /sbin/voldg, /sbin/voldisk,
/sbin/voledit, /usr/sbin/volinfo, /sbin/voliod, /sbin/volmake,
/usr/sbin/volmend, /sbin/volplex, /usr/sbin/volprint, /sbin/volrecover,
/sbin/volsd, /usr/sbin/volstat, /usr/sbin/voltrace, /sbin/volume
DESCRIPTION
The Logical Storage Manager utilities provide a shell-level interface used
by the system administrator and higher-level applications and scripts to
query and manipulate objects that are managed through the Logical Storage
Manager (LSM).
GLOSSARY
Some of the terms and objects that are used with the Logical Storage
Manager are:
volume
A virtual disk device that looks to applications and file systems like
a regular disk partition device. Volumes present block and raw device
interfaces that are compatible in their use with disk partition
devices. However, a volume is a virtual device that can be mirrored,
spanned across disk drives, moved to use different storage, and striped
using administrative commands. The configuration of a volume can be
changed, using the Logical Storage Manager utilities, without causing
disruption to applications or file systems that are using the volume.
plex
A copy of a volume's logical data address space, also sometimes known
as a mirror. A volume can have up to eight plexes associated with it.
Each plex is, at least conceptually, a copy of the volume that is
maintained consistently in the presence of volume I/O and
reconfigurations. Plexes represent the primary means of configuring
storage for a volume. Plexes can have a striped or concatenated
organization (layout).
disk
Disks exist as two entities. One is the physical disk on which all
data is ultimately stored and which exhibits all the behaviors of the
underlying technology. The other is the Logical Storage Manager
presentation of disks which, while mapping one-to-one with the physical
disks, are just presentations of units from which allocations of
storage are made. As an example, a physical disk presents the image of
a device with a definable geometry with a definable number of
cylinders, heads, etc., whereas a Logical Storage Manager disk is simply
a unit of allocation with a name and a size.
subdisk
A region of storage allocated on a disk for use with a volume.
Subdisks are associated to volumes through plexes. One or more
subdisks are laid out to form plexes based on the plex layout: striped or
concatenated. Subdisks are defined relative to disk media records.
disk media record
A reference to a physical disk, or possibly a disk partition.
This record can be thought of as a physical disk identifier for the
disk or partition. Disk media records are configuration records that
provide a name (known as the disk media name or DM name) that an
administrator can use to reference a particular disk independent of its
location on the system's various disk controllers. Disk media records
reference particular physical disks through a disk ID, which is a
unique identifier that is assigned to a disk when it is initialized for
use with the Logical Storage Manager.
Operations are provided to set or remove the disk ID stored in a disk
media record. Such operations have the effect of removing or replacing
disks, with any associated subdisks being removed or replaced along
with the disk.
disk access record
A configuration record that defines a pathway to a disk. Disk access
records most often name a unit number. The list of all disk access
records stored in a system is used to find all disks attached to the
system.
Disk access records do not identify particular physical disks.
Through the use of disk IDs, the Logical Storage Manager allows disks
to be moved between controllers, or to different locations on a
controller. When a disk is moved, a different disk access record will
be used when accessing the disk, although the disk media record will
continue to track the actual physical disk.
On some systems, the Logical Storage Manager will build a list of disk
access records automatically, based on the list of all devices attached
to the system. On these systems, it is not necessary to define disk
access records explicitly. On other systems, disk access records must
be defined explicitly with the /sbin/voldisk define operation.
Specialty disks (such as RAM disks or floppy disks) are likely to
require explicit /sbin/voldisk define operations on all systems.
Disk access records are identified by their disk access names (also
known as DA names).
disk group
A group of disks that share a common configuration. A configuration
consists of a set of records describing objects including disks,
volumes, plexes, and subdisks that are associated with one particular
disk group. Each disk group has an administrator-assigned name that
can be used by the administrator to reference that disk group. Each
disk group has an internally defined unique disk group ID, which is
used to differentiate two disk groups with the same administrator-
assigned name.
Disk groups provide a method to partition the configuration database,
so that the database size is not too large and so that database
modifications do not affect too many drives. They also allow the
Logical Storage Manager to operate with groups of physical disk media
that can be moved between systems.
Disks and disk groups have a circular relationship: disk groups are
formed from disks, and disk group configurations are stored on disks.
All disks in a disk group are stamped with a disk group ID, which is a
unique identifier for naming disk groups. Some or all disks in a disk
group also store copies of the configuration of the disk group.
disk group configuration
A disk group configuration is a small database that contains all
volume, plex, subdisk, and disk media records. These configurations
are replicated onto some or all disks in the disk group, often with two
copies on each disk. Because these databases are stored within disk
groups, record associations cannot span disk groups. Thus, a subdisk
defined on a disk in one disk group cannot be associated with a volume
in another disk group.
root disk group
Each system requires one special disk group, named rootdg. This group
is generally the default for most utilities.
In addition to defining the regular disk group information, the
configuration for the root disk group contains local information that
is specific to a system and that is not intended to be movable
between systems.
private region
Disks used by the Logical Storage Manager contain two special regions:
a private region and a public region. Usually, each region is formed
from a complete partition of the disk; however, the private and public
regions can be allocated from the same partition.
The private region of a disk contains various on-disk structures that
are used by the Logical Storage Manager for various internal purposes.
Each private region begins with a disk header which identifies the disk
and its disk group. Private regions can also contain copies of a disk
group's configuration, and copies of the disk group's kernel log.
public region
The public region of a disk is the space reserved for allocating
subdisks. Subdisks are defined with offsets that are relative to the
beginning of the public region of a particular disk. Only one
contiguous region of disk can form the public region for a particular
disk.
kernel log
A log that is kept in the private region on the disk and is written by the
Logical Storage Manager kernel. The log contains records describing
the state of volumes in the disk group. This log provides a mechanism
for the kernel to persistently register state changes so that vold can
be guaranteed to detect the state changes even in the event of a system
failure.
disk header
A block that is stored in a private region of a disk and defines several
properties of the disk. The disk header defines the size of the
private region, the location and size of the public region, the unique
disk ID for the disk, the disk group ID and disk group name (if the
disk is currently associated with a disk group), and the host ID for a
host that has exclusive use of the disk.
disk ID
A 64 byte universally unique identifier that is assigned to a physical
disk when its private region is initialized with the /sbin/voldisk init
operation. The disk ID is stored in the disk media record so that the
physical disk can be related to the disk media record at system
startup.
disk group ID
A 64 byte universally unique identifier that is assigned to a disk
group when the disk group is created with /sbin/voldg init. This
identifier is in addition to the disk group name, which is assigned by
the administrator. The disk group ID is used to check for disk groups
that have the same administrator-assigned name but are actually
distinct.
host ID
A name, usually assigned by the administrator, that identifies a
particular host. Host IDs are used to assign ownership to particular
physical disks. When a disk is part of a disk group that is in active
use by a particular host, the disk is stamped with that host's host ID.
If another system attempts to access the disk, it will detect that the
disk has a non-matching host ID and will disallow access until the
first system discontinues use of the disk.
To allow for system failures that do not clear the host ID, the
/sbin/voldisk clearimport operation can be used to clear the host ID
stored on a disk.
If a disk is a member of a disk group and has a host ID that matches a
particular host, then that host will import the disk group as part of
system startup.
striped plex
A plex that scatters data evenly across each of its associated
subdisks. A plex has a characteristic number of stripe columns
(represented by the number of associated subdisks) and a characteristic
stripe width. The stripe width defines how data with a particular
address is allocated to one of the associated subdisks. Given a stripe
width of 128 blocks, and two stripe columns, the first group of 128
blocks would be allocated to the first subdisk, the second group of 128
blocks would be allocated to the second subdisk, the third group to the
first subdisk, again, and so on.
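The address mapping in the example above can be sketched in a few lines of Python. This is an illustrative model only, not Logical Storage Manager code; the function name stripe_location is hypothetical, and the sketch assumes that each column's data is packed contiguously on its subdisk.

```python
def stripe_location(block, stripe_width, columns):
    """Map a plex block address to (column, offset on that column's subdisk).

    Illustrative model only; assumes each column's stripe units are stored
    back to back on its subdisk.
    """
    stripe_unit = block // stripe_width   # which group of stripe_width blocks
    column = stripe_unit % columns        # groups rotate across the columns
    row = stripe_unit // columns          # complete rotations before this group
    offset = row * stripe_width + block % stripe_width
    return column, offset


# Stripe width of 128 blocks and two columns, as in the text above:
# block 0   -> (0, 0)    first group, first subdisk
# block 128 -> (1, 0)    second group, second subdisk
# block 256 -> (0, 128)  third group, first subdisk again
```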
concatenated plex
A plex whose subdisks are associated at specific offsets within the
address range of the plex, and extend in the plex address range for the
length of the subdisk. This layout allows regions of one or more disks
to create a plex, rather than a single big region.
volboot file
The volboot file is a special file (usually stored in /etc/vol/volboot)
that is used to bootstrap the root disk group and to define a system's
host ID. In addition to a host ID, the volboot file contains a list of
disk access records.
On system startup, this list of disks is scanned to find a disk that
is a member of the rootdg disk group and that is stamped with this
system's host ID. When such a disk is found, its configuration is read
and is used to get a more complete list of disk access records that are
used as a second-stage bootstrap of the root disk group, and to locate
all other disk groups.
plex consistency
If the plexes of a volume contain different data, then the plexes are
said to be inconsistent. This is only a problem if the Logical Storage
Manager is unaware of the inconsistencies, as the volume can return
differing results for consecutive reads. Plex inconsistency is a
serious compromise of data integrity. This inconsistency can be caused
by write operations that start around the time of a system failure, if
parts of the write complete on one plex but not the other. Plexes can
also be inconsistent after creation of a mirrored volume, if the plexes
are not first synchronized to contain the same data. An important part
of Logical Storage Manager operation is ensuring that consistent data
is returned to any application that reads a volume. This may require
that plex consistency of a volume be ``recovered'' by copying data
between plexes so that they have the same contents. Alternatively, the
volume can be put into a state such that reads from one plex are
automatically written back to the other plexes, thus making the data
consistent for that volume offset.
CONVENTIONS
A number of conventions are used throughout much of the Logical Storage
Manager to provide a degree of similarity between the various operations.
The following is a list of such conventions:
Utility syntax
Most utilities in the Logical Storage Manager provide more than one
operation, with operations grouped into utilities primarily by object type.
Utilities that provide multiple operations are typically invoked with the
following form:
"utility [ options ] keyword [ operands ]"
Here, utility is the name of the utility and keyword is a name that
identifies the specific operation to perform. Any options that are
introduced in the standard -letter form precede the operation keyword.
To aid in normal use, each utility provides an extended usage message that
lists all the options and operation keywords supported by the utility. The
extended usage message for a utility can be displayed using a keyword of
help. The extended usage messages are intended to serve as reminders, and
not as replacements for user documentation.
Standard length numbers
Many basic properties of objects that are managed by the Logical Storage
Manager require specification of lengths, either as a pure object length or
as an offset relative to some other object. The Logical Storage Manager
supports volume lengths up to 2,147,483,647 disk sectors (one terabyte on
most systems).
Typing such large numbers, or even much smaller numbers, can be annoying.
The Logical Storage Manager provides a uniform syntax for representing
such numbers which uses suffixes to provide convenient multipliers.
Numbers can be specified in decimal, octal or hexadecimal. Also, numbers
can be specified as a sum of several numbers, as a convenience to avoid
using a calculator.
A hexadecimal (base 16) number is introduced using a prefix of 0x. For
example, 0xfff is the same as decimal 4095. An octal (base 8) number is
introduced using a prefix of 0. For example, 0177777 is the same as
decimal 65535.
A number can be followed by a suffix character to indicate a multiplier for
the number. A length number with no suffix character represents a count of
standard disk sectors. The length of a standard disk sector can vary
between systems. It is commonly 512 bytes. On systems where disks can
have different sector sizes, one of the sector sizes will be chosen as the
``standard'' size. Supported suffix characters are:
b multiply the length by 512 bytes (blocks)
s multiply the length by the standard sector size (default)
k multiply the length by 1024 bytes (Kilobytes)
m multiply the length by 1,048,576 (1024K) bytes (Megabytes)
g multiply the length by 1,073,741,824 (1024M) bytes (Gigabytes)
Numbers are represented internally as an integer number of sectors. As a
result, if the standard disk sector size is larger than 512 bytes, numbers
can be specified that will need to be rounded to a sector. Rounding is
always done to the next lowest, not the nearest, multiple of the sector
size.
Because the letter b is a valid hexadecimal character, there is a special
case for the b suffix where a single blank character can separate a number
from the b suffix character. Use of a blank within a number, when invoking
commands from the shell, usually requires quoting the number. For example:
/sbin/volassist make vol01 "0x1000 b"
Numbers can be added or subtracted by separating two or more numbers by a
plus or minus sign, respectively. A plus sign is optional. As an example,
the largest allowed number that can be represented on a system with a 512
byte sector size can be entered as:
1023g+1023m+1023k+1
Note that 1024g-1 cannot be used, because the implementation cannot handle
the intermediate representation of 1024g (which is greater than the largest
number that can be represented) internally.
The Logical Storage Manager always reports length numbers as a simple count
of sectors, with no suffix character.
Case is not important in length numbers. Hexadecimal numbers and suffix
characters can be specified using any reasonable combination of upper- and
lower-case letters.
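The length syntax described above can be modeled with a short Python sketch. It is not the parser used by the utilities. The names parse_length and MAX_SECTORS are hypothetical, and two behaviors are assumptions made for illustration: each term is rounded down to whole sectors separately, and any single term larger than 2,147,483,647 sectors is rejected (which reproduces the 1024g-1 restriction noted above).

```python
import re

MAX_SECTORS = 2**31 - 1  # largest volume length given in the text above

# One term: hex (0x...) or digits (a leading 0 means octal), then an optional
# suffix. A single blank is accepted only in front of the b suffix.
_TERM = re.compile(r"(0x[0-9a-f]+|[0-9]+)( ?b|[skmg])?")
_BYTES = {"b": 512, "k": 1024, "m": 1024**2, "g": 1024**3}


def parse_length(text, sector_size=512):
    """Return a length in sectors for a string such as '1023g+1023m+1023k+1'."""
    s = text.strip().lower()          # case is not important
    pos, total, first = 0, 0, True
    while pos < len(s):
        sign = 1
        if s[pos] in "+-":
            sign = -1 if s[pos] == "-" else 1
            pos += 1
        elif not first:
            raise ValueError("expected + or - in %r" % text)
        m = _TERM.match(s, pos)
        if m is None:
            raise ValueError("bad length number %r" % text)
        digits = m.group(1)
        suffix = (m.group(2) or "s").strip()
        if digits.startswith("0x"):
            value = int(digits, 16)
        elif len(digits) > 1 and digits.startswith("0"):
            value = int(digits, 8)
        else:
            value = int(digits)
        if suffix == "s":
            sectors = value
        else:
            # Assumption: round each term down to a whole number of sectors.
            sectors = value * _BYTES[suffix] // sector_size
        if sectors > MAX_SECTORS:
            # Assumption: models the intermediate-representation limit.
            raise OverflowError("term too large: %r" % m.group(0))
        total += sign * sectors
        pos, first = m.end(), False
    if first:
        raise ValueError("empty length number")
    return total
```

With a 512-byte sector, parse_length("1023g+1023m+1023k+1") returns 2147483647, and parse_length("0x1000 b") returns 4096. Without the blank, "0x1000b" is read as the hexadecimal number 1000b, which is why the blank is allowed.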
Disk group selection
Most commands operate upon only one disk group per invocation. Each disk
group has a separate configuration from every other disk group and it is
possible for two disk groups to contain two objects that have the same
name. This can happen, in particular, if a disk group is moved from one
system to another. However, most utilities make no attempt to ensure that
names between disk groups are unique, so name collisions can occur anyway.
System administrators who endeavor to avoid name collisions should be able
to use most of the utilities without having to specify disk groups except
when creating objects. Administrators cannot use single-command
invocations that reference objects in more than one disk group, but disk
groups will be selected automatically, based on objects specified in the
command.
The standard rules that most commands use for selecting the disk group for
a command are as follows:
· Given a particular set of object names specified on the command, look
for the disk group of each object. If all objects are in the same
disk group, use that disk group. If any named object is not unique
between all disk groups, and if one of those object names is not
in the rootdg disk group, then fail.
· To force use of a particular disk group, use -g diskgroup to indicate
the group. Non-unique names do not cause errors when a disk group is
specified explicitly. The diskgroup specification is either a disk
group ID or a disk group name.
· A special case is provided for the rootdg disk group. Any set of
objects in the rootdg disk group can be specified without specifying
-g rootdg, even if the name is used in another disk group.
If a set of object names is given on the command line, and if some are
unique but some are not unique, then the command will still fail according
to the rules listed above. Just because a combination of objects could be
used to disambiguate the disk group does not mean that a utility will do
so.
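The selection rules above can be read as the following Python sketch. It is one interpretation of the rules, not the logic of any utility; the function name select_disk_group and the representation of configurations as a dictionary of name sets are hypothetical.

```python
def select_disk_group(names, groups, forced=None):
    """Pick the disk group for a command.

    names  -- object names given on the command line
    groups -- {disk group name: set of object names in that group}
    forced -- value of -g diskgroup, if one was given
    """
    if forced is not None:
        return forced                      # -g overrides non-unique names
    chosen = set()
    for name in names:
        owners = [dg for dg, objs in groups.items() if name in objs]
        if not owners:
            raise LookupError("no such object: %s" % name)
        if len(owners) > 1:
            # A non-unique name is accepted only if rootdg has it.
            if "rootdg" not in owners:
                raise LookupError("ambiguous object: %s" % name)
            owners = ["rootdg"]
        chosen.add(owners[0])
    if len(chosen) != 1:
        raise LookupError("objects are not all in one disk group")
    return chosen.pop()
```

Under this reading, a name that exists in both rootdg and another disk group resolves to rootdg, while a name shared by two non-root disk groups fails unless -g is given.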
RECORD TYPES
Disk group configurations contain six types of records: volume records,
plex records, subdisk records, disk media records, disk group records, and
disk access records. Disk access records are specific to the root disk
group and are stored in configurations only because there is no other
convenient place to store them; otherwise, they are logically separate from
all disk groups. Because they are specific and meaningful to the local
system only, the logical place for their storage is the rootdg since that
is the only disk group guaranteed to exist on the system.
Disk group records
Disk group records define several different types of names for a disk
group. The different types of names are:
real name
This is the name of the disk group, as the name is defined on disk.
This name is stored in the disk group configuration, and is also stored
in the disk headers of all disks in the disk group.
alias name
This is the standard name that the system uses when referencing the
disk group. References to the disk group name usually mean the alias
name. Volume and plex device directories are structured into
subdirectories based on the disk group alias name. Typically, the disk
group's alias name and real name are identical. A local alias can be
useful for gaining access to a disk group with a name that conflicts
with other disk groups in the system, or that conflicts with records in
the rootdg disk group.
disk group ID
A 64-byte identifier that represents the unique ID of the disk group.
All disk groups on all systems should have a different disk group ID,
even if they have the same real name. This identifier is stored in the
disk headers of all disks in the disk group. It is used to ensure that
the Logical Storage Manager does not confuse two disk groups which were
created with the same name.
Volume records
Volume records define the characteristics of particular volume devices.
The name of a volume record defines the node name used for files in the
/dev/vol and /dev/rvol directories. The block device for a particular
volume (which can be used as an argument to the mount command; see
mount(8)) has the path:
/dev/vol/groupname/volume
In this command path, groupname is the name assigned by the administrator
to the disk group containing the volume. The raw device for a volume,
typically used for application I/O and for issuing I/O control operations
(see ioctl(2)), has the path:
/dev/rvol/groupname/volume
For convenience, volumes assigned to the root disk group are accessible
under the rootdg subdirectories of /dev/vol and /dev/rvol, but are also
under /dev/vol/volume and /dev/rvol/volume.
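The path conventions above can be summarized by a small Python helper. The function name volume_device_paths is hypothetical; the sketch only builds path strings and does not check that the device nodes exist.

```python
import posixpath


def volume_device_paths(groupname, volume):
    """Return the block and raw device paths for a volume, per the text above."""
    paths = {
        "block": [posixpath.join("/dev/vol", groupname, volume)],
        "raw": [posixpath.join("/dev/rvol", groupname, volume)],
    }
    if groupname == "rootdg":
        # Volumes in the root disk group are also reachable without the
        # rootdg subdirectory.
        paths["block"].append(posixpath.join("/dev/vol", volume))
        paths["raw"].append(posixpath.join("/dev/rvol", volume))
    return paths
```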
Reads to a volume device are directed to one of the read-write or read-only
plexes associated with the volume. Writes to the volume are directed to
all of the enabled read-write and write-only plexes associated with the
volume.
During a write operation, two plexes of a volume may become out of sync
with each other, due to the fact that writes directed to two disks can
complete at different times. This is not normally a problem. However, if
the system were to crash or lose power during a write operation, the two
plexes could have different contents.
Most applications and file systems are not written with the presumption
that two separate reads of a device can return different contents without
an intervening write operation. Because plexes with different contents
could cause such a situation where two read operations of a block return
different contents, the Logical Storage Manager expends considerable effort
to ensure that this is avoided.
Volumes have the following fundamental attributes:
usage type
Each volume has a usage type, which defines a particular class of rules
for operating on the volume, typically based on the expected content of
the volume. Several utilities can apply extensions or limitations that
apply to volumes with a particular usage type. Four usage types are
included with the base release of the Logical Storage Manager: fsgen,
for use with volumes that contain file systems; gen, for use with
volumes that are used as swap devices or for other applications that do
not use file systems; and special root and swap usage types which are
specifically for use with the root file system volume and the primary
swap device.
length
Each volume has a length, which defines the limiting offset of read and
write operations. The length is assigned by the administrator, and may
or may not match the lengths of the associated plexes.
volume state
Each volume is either enabled, disabled, or detached. When enabled,
normal read and write operations are allowed on the volume, and any
file system residing on the volume can be mounted, or used in the usual
way.
When disabled, no access to the volume or any of its associated plexes
is allowed. When detached, the plex devices for the volume can be
accessed, and some ioctls can be used by utilities to operate on the
volume.
usage-type state
Usage types maintain a private state field related to the volume that
relates to operations that have been performed on the volume, or to
failure conditions that have been encountered. This state field
contains a string of up to 14 characters.
plexes
Each volume has between zero and eight associated plexes.
read policy
A configurable policy for switching between plexes for volume reads.
When a volume has more than one enabled associated plex, the Logical
Storage Manager can distribute reads between the plexes to distribute
the I/O load and thus increase total possible bandwidth of reads
through the volume.
The read policy can be set by the administrator. Possible policies are:
· round-robin
For every other read operation, switch to a different plex from the
previous read operation. Given three plexes, this will switch
between each of the three plexes, in order.
· "preferred plex"
This read policy specifies a particular named plex that is used to
satisfy read requests. In the event that a read request cannot be
satisfied by the preferred plex, this policy changes to round-robin.
· select
This read policy is the default policy, and adjusts to use an
appropriate read policy based on the set of plexes associated with
the volume. If exactly one enabled read-write striped plex is
associated with the volume, then that plex is chosen automatically as
the preferred plex; otherwise, the round-robin policy is used. If a
volume has one striped plex and one non-striped plex, preferring the
striped plex often yields better throughput.
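The way the select policy resolves to one of the other two policies can be sketched in Python as follows. The function name resolve_select_policy, the tuple representation of plexes, and the returned labels are hypothetical and serve only to illustrate the rule.

```python
def resolve_select_policy(plexes):
    """Resolve the select read policy.

    plexes -- list of (name, layout, enabled, io_mode) tuples for the plexes
              associated with the volume
    Returns ("prefer", plex_name) or ("round-robin", None).
    """
    striped = [
        name
        for name, layout, enabled, io_mode in plexes
        if enabled and io_mode == "read-write" and layout == "striped"
    ]
    if len(striped) == 1:
        # Exactly one enabled read-write striped plex: prefer it.
        return ("prefer", striped[0])
    return ("round-robin", None)
```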
start options
This is a string that is organized as a set of usage-type options to
apply when starting (enabling) a volume. See volume(8) for details.
log type
A policy to use for logging changes to the volume, which can be
assigned by the administrator. Policies that can be specified are:
· none
Do not perform any special actions when writing to the volume. Just
write the requested data to all read-write or write-only plexes.
· block-change-log
Write ranges of block numbers to disk when data is written to a
volume.
When the system is restarted after a crash, these ranges of block
numbers are used to limit the amount of data copying that is required
to recover plex consistency for the volume. The block number ranges
are logged to special logging subdisks associated with each of the
plexes associated with the volume.
Use of block-change-logging can greatly speed recovery of a volume,
but it also degrades performance of the volume under normal
operation.
read/write-back recover mode
This is a mode that applies to the volume, which is managed by
utilities as part of plex consistency recovery. When this mode is
enabled, each read operation will recover plex consistency for the
region covered by the read. Plex consistency is recovered by reading
data from blocks of one plex and writing that data to all other
writable plexes. This ensures that a future read operation covering
the same range of blocks will read the same data.
write-back-on-read-failure mode
This is a mode that applies to the volume, which can be enabled or
disabled by the administrator using voledit. If this mode is enabled,
then a read failure for a plex will cause data to be read from an
alternate plex and then written back to the plex that got the read
failure. This will usually fix the error. Only if the writeback fails
will the plex be detached for having an unrecoverable I/O failure.
writecopy mode
This is a mode that applies to the volume, which can be enabled or
disabled by the administrator using voledit. This mode takes effect
only if block-change-logging is in effect. When the operating system
hands off a write request to the volume driver, the operating system
may continue to change the memory that is being written to disk. The
Logical Storage Manager cannot detect that the memory is changing, so
it can inadvertently leave plexes with inconsistent contents. This is
not normally a problem, because the operating system ensures that any
such modified memory is rewritten to the volume before the volume is
closed (such as by a clean system shutdown).
However, if the system crashes, plexes may be inconsistent. Since the
block-change-logging feature prevents recovery of the entire volume, it
may not ensure that plexes are entirely consistent.
Turning on the writecopy mode (which is normally set by default)
causes the Logical Storage Manager to copy the data for a write request
to a new section of memory before writing it to disk. Because the
write is done from the copied memory, it cannot change and so the data
written to each plex is guaranteed to be the same if the write
completes.
exception policy
There are several modes that can be set on the volume, by utilities
according to the usage type of the volume. These modes affect
operation of a volume in the presence of I/O failures. Currently only
one of these policies, called GEN_DET_SPARSE, is ever used. This policy
tracks complete and incomplete plexes in a volume (an incomplete plex
does not have a backing subdisk for all blocks in the volume). If an
unrecoverable error occurs on an incomplete plex, the plex is detached
(disabled from receiving regular volume I/O requests). If an
unrecoverable error occurs on a complete plex, the plex is detached
unless it is the last complete plex. If the plex is the last complete
read-write plex, any incomplete plexes that overlap with the error will
be detached but the plex with the error will remain attached.
This default policy is chosen to ensure that an I/O that fails on one
plex will not, in the future, be directed to that plex again unless
that plex is the last complete plex remaining attached to the volume.
In that case, the policy ensures that the volume will return the error
consistently, even in the presence of incomplete plexes.
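The detach decisions made under GEN_DET_SPARSE can be modeled roughly as follows. This Python sketch is a simplification for illustration: the function name plexes_to_detach and the data layout are hypothetical, only attached plexes are passed in, and plex I/O modes are ignored.

```python
def plexes_to_detach(plexes, failed, err_start, err_len):
    """Return the names of plexes to detach after an unrecoverable error.

    plexes -- {name: {"complete": bool, "extents": [(start, length), ...]}}
              for the plexes currently attached to the volume
    failed -- name of the plex on which the error occurred
    """
    if not plexes[failed]["complete"]:
        return [failed]                    # incomplete plex: detach it
    others = [n for n, p in plexes.items() if p["complete"] and n != failed]
    if others:
        return [failed]                    # another complete plex remains
    # Last complete plex: keep it attached, and detach any incomplete plex
    # whose backed extents overlap the failing range.
    err_end = err_start + err_len
    return [
        n
        for n, p in plexes.items()
        if not p["complete"]
        and any(s < err_end and err_start < s + l for s, l in p["extents"])
    ]
```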
comment
An administrator-assigned string of up to 40 characters that can be set
and changed using the voledit utility. The Logical Storage Manager
does not interpret the comment field. The comment cannot contain
newline characters.
user, group, and mode
These attributes are the user, group, and file permission modes used for
the volume device nodes, and for the plex device nodes of associated
plexes. The user and group are normally root. The mode usually allows
read and write permission to the owner, and no access by other users.
Plex records
Plex records define the characteristics of a particular mirror of a volume.
A plex can be in either an associated state or a dissociated state.
In the dissociated state, the plex is not a part of a volume. A
dissociated plex cannot be accessed in any way. An associated plex can be
accessed through the volume and, in a limited fashion, through a plex
device. The name of the plex defines the node name used for files in the
/dev/plex directory. The device for a particular plex has the path:
/dev/plex/groupname/plex
In this command path, groupname is the name assigned by the administrator
to the disk group containing the plex. For convenience, plexes assigned to
the root disk group are accessible both under the rootdg subdirectory of
/dev/plex and also under /dev/plex/plex.
Plexes have the following fundamental attributes:
plex state
Each plex is either enabled, disabled, or detached. When enabled,
normal read and write operations from the volume can be directed to the
plex. When disabled, no I/O operations will be applied to the plex.
When detached, normal volume I/O will not be directed to the plex.
When detached, the plex device can be accessed for either read or write
access using the special plex device nodes. If a plex is enabled,
however, then the plex device can be read but not written.
I/O failures encountered during normal volume I/O may move the enabled
state for a plex directly from enabled to detached. See the
description of volume exception policies for more information.
I/O mode
Each plex is either in read-write, read-only, or write-only mode. This
mode affects read and write operations directed to the volume, if the
plex is enabled. For read-write and read-only modes, volume read
operations can be directed to the plex. For read-write and write-only
modes, volume write operations will be directed to the plex.
Plexes are normally in read-write mode. Write-only mode is used to
recover a plex that failed, and whose contents have thus become out-
of-date with respect to the volume. It is also used when attaching a
new plex to a volume. In write-only mode, writes to the volume will
update the plex, causing written regions to be up-to-date. Typically, a
set of special copy operations will be used to update the remainder of
the plex.
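The effect of plex state and I/O mode on volume I/O can be summarized with a short Python sketch. The function name io_targets and the tuple representation of plexes are hypothetical.

```python
def io_targets(plexes):
    """Return (read_candidates, write_targets) for a volume.

    plexes -- list of (name, state, io_mode) tuples, where state is
              "enabled", "disabled", or "detached"
    """
    readers = [
        name
        for name, state, io_mode in plexes
        if state == "enabled" and io_mode in ("read-write", "read-only")
    ]
    writers = [
        name
        for name, state, io_mode in plexes
        if state == "enabled" and io_mode in ("read-write", "write-only")
    ]
    return readers, writers
```

A plex in write-only mode therefore receives every volume write but is never chosen to satisfy a read.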
layout
The organization of associated subdisks with respect to the plex
address space. The layout is either striped or concatenated.
subdisks
Each plex has zero or more associated subdisks. Subdisks are
associated at offsets relative to the beginning of the plex address
space.
Subdisks for concatenated plexes might not cover the entire length of
the plex, leaving holes in the plex. A plex that is not as long as the
volume to which it is associated is considered to have a hole at the
end of the plex, up to the length of the volume. A plex with a hole is
considered incomplete, also sometimes called sparse.
log subdisk
Each plex can have at most one associated log subdisk. A log subdisk
is typically one block long and is used with the block-change-logging
feature to improve the time required to recover consistency of a volume
after a system failure.
length
The length of a plex is the offset of the last subdisk in the plex plus
the length of that subdisk. In other words, the length of the plex is
defined by the last block in the plex address space that is backed by a
subdisk. This value may or may not relate to the length of the volume,
depending on whether the plex is completely contiguously allocated.
contiguous length
The offset of the first block in the plex address space that is not
backed by a subdisk. If the plex has no holes, the contiguous length
matches the plex length. If the contiguous length is equal to or
greater than the length of the associated volume, the plex is
considered complete, otherwise it is sparse.
usage-type state
Volume usage types maintain a private state field related to the
operations that have been performed on the plex, or to failure
conditions that have been encountered. This state field contains a
string of up to 14 characters.
condition flags
Various condition flags are defined for the plex. They describe state
that is recognized automatically, rather than managed by the volume
usage type. Defined flags are:
NODAREC
No physical disk could be found corresponding to the disk ID in the
disk media record for one of the subdisks associated with the plex.
The plex cannot be used until the condition is fixed or the
affected subdisk is dissociated.
REMOVED
One of the disk media records was put into the removed state
through explicit administrative action. The plex cannot be used
until the disk is replaced or the affected subdisk is dissociated.
RECOVER
A disk for one of the disk media records was replaced or was
reattached too late to prevent the plex from becoming out-of-date
with respect to the volume. The plex requires complete recovery
from another plex in the volume to synchronize the plex with the
correct contents of the volume.
IOFAIL
The plex was detached as a result of an I/O failure detected during
normal volume I/O. The plex is out-of-date with respect to the
volume, and in need of complete recovery. However, this condition
also indicates a likelihood that one of the disks in the system
should be replaced.
volatile state
A plex is considered to have ``volatile'' contents if the disk for any
of the plex's subdisks is considered to be volatile. The contents of a
volatile disk are not presumed to survive a system reboot. The
contents of a volatile plex are always considered out-of-date after a
recovery and in need of complete recovery from another plex.
comment
An administrator-assigned string of up to 40 characters that can be set
and changed using the voledit utility. The Logical Storage Manager
does not interpret the comment field. The comment cannot contain
newline characters.
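The length, contiguous length, and I/O mode rules listed above can be sketched in Python as follows. This is an illustrative model only. It assumes a concatenated plex whose subdisks are given as (plex offset, length) pairs in blocks and do not overlap. All function names are hypothetical and do not correspond to any Logical Storage Manager interface.

```python
def plex_length(subdisks):
    """Offset of the last subdisk plus its length (0 with no subdisks)."""
    return max((off + length for off, length in subdisks), default=0)


def contiguous_length(subdisks):
    """Offset of the first plex block that is not backed by a subdisk."""
    end = 0
    for off, length in sorted(subdisks):
        if off > end:
            # A hole begins at 'end'; nothing beyond it counts.
            break
        end = max(end, off + length)
    return end


def is_complete(subdisks, volume_length):
    """A plex is complete if it is contiguous up to the volume length."""
    return contiguous_length(subdisks) >= volume_length


READ_MODES = {"read-write", "read-only"}
WRITE_MODES = {"read-write", "write-only"}


def accepts_volume_io(state, mode, op):
    """Whether normal volume I/O of kind op ('read' or 'write') can be
    directed to a plex with the given plex state and I/O mode."""
    if state != "enabled":
        # Disabled and detached plexes receive no normal volume I/O.
        return False
    return mode in (READ_MODES if op == "read" else WRITE_MODES)
```

With subdisks at (0, 100), (100, 50), and (200, 25), the plex length is 225 but the contiguous length is 150, so the plex is sparse for any volume longer than 150 blocks.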
Subdisk records
Subdisk records define a region of disk, allocated from a disk's public
region. Subdisks have very little state associated with them, other than
the configuration state that defines which region of disk the subdisk
occupies.
Subdisks cannot overlap each other, either in their associations with
plexes, or in their arrangement on disk public regions.
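As a sketch of the non-overlap rule for disk public regions, the following Python function reports subdisks whose disk regions collide. The record layout (subdisk name, disk media name, disk offset, length) and the function name are assumptions made for this example. The Logical Storage Manager enforces the rule itself, so a check like this is useful only when reasoning about a planned layout.

```python
from collections import defaultdict


def find_disk_overlaps(subdisks):
    """Report subdisks that overlap an earlier region on the same disk.

    subdisks: iterable of (name, disk_media_name, disk_offset, length).
    Returns a list of (disk_media_name, earlier_name, later_name).
    """
    by_disk = defaultdict(list)
    for name, disk, offset, length in subdisks:
        by_disk[disk].append((offset, offset + length, name))

    overlaps = []
    for disk, regions in by_disk.items():
        regions.sort()
        # Track the region that reaches furthest into the disk so far.
        reach_end, reach_name = regions[0][1], regions[0][2]
        for start, end, name in regions[1:]:
            if start < reach_end:
                overlaps.append((disk, reach_name, name))
            if end > reach_end:
                reach_end, reach_name = end, name
    return overlaps
```

Two subdisks that merely touch (one ends at the block where the next begins) are not reported.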
Subdisks have the following fundamental attributes:
disk media name
The name of the disk media record that the subdisk is defined on.
disk offset
The offset, from the beginning of the disk's public region, to the
start of the subdisk.
plex offset
For associated subdisks, this is the offset (from the beginning of the
plex) of the subdisk association. For subdisks associated with striped
plexes, the plex offset defines relative ordering of subdisks in the
plex, rather than actual offsets within the plex address space.
length
The length of the subdisk.
comment
An administrator-assigned string of up to 40 characters that can be set
and changed using the voledit utility. The Logical Storage Manager
does not interpret the comment field. The comment cannot contain
newline characters.
Disk media records
Disk media records define a specific disk within a disk group. The name of
a disk media record is assigned when a disk is first added to a disk group
(using the /sbin/voldg adddisk operation). Disk media records can be
assigned to specific physical disks by associating the media record with
the current disk access record for the physical disk.
Disk media records have the following fundamental attributes:
disk ID
A 64-byte unique identifier representing the physical disk to which the
media record is associated. This can be cleared to indicate that the
disk is considered in the removed state.
A removed disk has no current association with any physical disk.
disk access name
The disk access name that is currently used to access the physical disk
referenced by the disk ID. If the disk ID is defined, but no physical
disk with that ID could be found, the disk access name will be clear.
A disk where the physical disk could not be found is considered to be
in the NODAREC, or inaccessible, state. A disk can become inaccessible
either because the indicated disk is not currently attached to the
system, or because I/O failures on the physical disk prevented the
Logical Storage Manager from identifying or using the physical disk.
A disk media record that has an active association with a physical disk
(both the disk ID and the disk access name attributes are defined),
inherits several properties from the underlying physical disk. These
attributes are taken from the disk header, which is stored in the
private region of the disk.
These inherited attributes are:
public length
The length of the region of the physical disk that is available for
subdisk allocations.
private length
The length of the region of the physical disk that is reserved for
storing private Logical Storage Manager information.
atomic I/O size
This is the fundamental I/O size for the disk, in bytes, also known as
the sector size. All I/Os destined for this disk must be multiples of
this size. Currently, the Logical Storage Manager requires that all
disks have the same sector size. On most systems, this size is 512
bytes.
Disk access records
Disk access records define an address, or access path, that can be used to
access a disk. The list of all disk access records defines the list of all
disk addresses that the Logical Storage Manager can use to locate physical
disks. Disk access records do not define specific physical disks, since
physical disks can be moved on a system. When a physical disk is moved, a
different disk access record may be necessary to locate it.
Disk access records are stored in the rootdg disk group configuration.
Unlike all other record types, the names of disk access records can
conflict with the names of other records. For example, a specialty disk
(such as a RAM disk) can use the same name for both the disk access record
and the disk media record that points to it. It is typically advisable to
use different names for the access and media records, to avoid additional
confusion if disks are moved.
Disk access records can be defined explicitly. Some (sometimes all) disk
access records may be configured automatically by the Logical Storage
Manager, based on available information in the operating system. Such
automatically-configured disks are not stored persistently in the on-disk
root disk group configuration, but are instead regenerated every time the
Logical Storage Manager starts up.
Disk access records have the following fundamental attributes:
disk access name
The name of the disk access record is typically a disk address of some
kind. Disk names are usually of the form ranp or rznp, where ra is the
device mnemonic for MSCP disks, rz is the device mnemonic for SCSI
disks, n is the unit number of the device, and p is the partition
identifier (in the range a to h).
type
Each disk access record has a type, which identifies certain key
characteristics of the Logical Storage Manager's interaction with the
disk. Currently available types are: sliced, simple, and nopriv. See
voldisk(8) for more information on disk types. Typically, most or all
of the disks will be of type sliced. It may be desirable to create
specialty disks (such as RAM disks) with type nopriv.
If the physical disk represented by the disk access record is currently
associated with a disk media record, then the following fields are defined:
disk group name
The name of the disk group containing the disk media record.
disk media name
The name of the media record that points to the physical disk.
Additional attributes can be added, arbitrarily, by disk types. See
voldisk(8) for a list of additional attributes defined by the standard disk
types.
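The usual disk access name forms described above (ranp and rznp) can be recognized with a short Python sketch. The function name and the returned dictionary keys are hypothetical. Because the text says only that names are usually of this form, a result of None means that a name does not follow the usual pattern, not that it is invalid.

```python
import re

# Device mnemonic, unit number, partition identifier (a to h).
_DA_NAME = re.compile(r"(ra|rz)(\d+)([a-h])")
_MNEMONICS = {"ra": "MSCP", "rz": "SCSI"}


def parse_disk_access_name(name):
    """Split a name of the form ranp or rznp into its parts, else None."""
    match = _DA_NAME.fullmatch(name)
    if match is None:
        return None
    mnemonic, unit, partition = match.groups()
    return {
        "bus": _MNEMONICS[mnemonic],
        "unit": int(unit),
        "partition": partition,
    }
```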
VOLUME USAGE TYPES
The usage type of a volume represents a class of rules for operating on a
volume. Each usage type is defined by a set of executables under the
directory /etc/vol/type/usage_type, where usage_type is the name given to
the usage type. The required executables are: volinfo, volmake, volmend,
volplex, volsd, and volume. These executables are invoked by the Logical
Storage Manager administrative utilities with the same names.
The executables under /etc/vol/type should not, normally, be executed
directly.
Four usage types are provided with the Logical Storage Manager: gen, fsgen,
root, and swap. It is likely that new usage types will be added in future
releases. It is also possible for third-party products to install usage
types.
The usage types currently provided with the Logical Storage Manager store
state information in the volume and plex usage-type state fields. The state
fields defined for volumes are:
EMPTY
The volume is not yet initialized. This is the initial state for
volumes created by volmake.
CLEAN
The volume has been stopped and the contents for all plexes are
consistent.
ACTIVE
The volume has been started and is running normally, or was running
normally when the system was stopped. If the system crashes in this
state, then the volume may require plex consistency recovery.
NEEDSYNC
The volume requires recovery. This is typically set after a system
failure to indicate that the plexes in the volume may be inconsistent,
so that they require recovery [see the resync operation in volume(8)].
SYNC
Plex consistency recovery is currently being done on the volume.
/sbin/volume resync sets this state when it starts to recover plex
consistency on a volume that was in the NEEDSYNC state.
The state fields defined for plexes are:
EMPTY
The plex is not yet initialized. This state is set when the volume
state is also EMPTY.
CLEAN
The plex was running normally when the volume was stopped.
The plex will be enabled without requiring recovery when the volume is
started.
ACTIVE
The plex is running normally on a started volume. The plex condition
flags (NODAREC, REMOVED, RECOVER, and IOFAIL) may apply if the system
is rebooted and the volume restarted.
STALE
The plex was detached, either by /sbin/volplex det or by an I/O
failure. /sbin/volume start will change the state for a plex to STALE
if any of the plex condition flags are set. STALE plexes will be
reattached automatically, when starting a volume, by calling
/sbin/volplex att.
OFFLINE
The plex was disabled by the /usr/sbin/volmend off operation. See
volmend(8) for more information.
SNAPATT
This is a snapshot plex that is being attached by the /sbin/volassist
snapstart operation. When the attach is complete, the state for the
plex will be changed to SNAPDONE. If the system fails before the
attach completes, the plex and all of its subdisks will be removed.
SNAPDONE
This is a snapshot plex created by /sbin/volassist snapstart that is
fully attached. A plex in this state can be turned into a snapshot
volume with /sbin/volassist snapshot. See volassist(8) for more
information. If the system fails before the attach completes, the plex
and all of its subdisks will be removed.
SNAPTMP
This is a snapshot plex being attached by the /sbin/volplex snapstart
operation. When the attach is complete, the state for the plex will be
changed to SNAPDIS. If the system fails before the attach completes,
the plex will be dissociated from the volume.
SNAPDIS
This is a snapshot plex created by /sbin/volplex snapstart that is
fully attached. A plex in this state can be turned into a snapshot
volume with /sbin/volplex snapshot. See volplex(8) for more
information. If the system fails before the attach completes, the plex
will be dissociated from the volume.
TEMP
This is a plex that is being associated and attached to a volume with
/sbin/volplex att. If the system fails before the attach completes, the
plex will be dissociated from the volume.
TEMPRM
This is a plex that is being associated and attached to a volume with
/sbin/volplex att. If the system fails before the attach completes, the
plex will be dissociated from the volume and removed.
Any subdisks in the plex will be kept.
TEMPRMSD
This is a plex that is being associated and attached to a volume with
/sbin/volplex att. If the system fails before the attach completes,
the plex and its subdisks will be dissociated from the volume and
removed.
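The failure behavior stated for the snapshot and temporary plex states above can be collected into a small lookup table, sketched here in Python. The table and function names are hypothetical. The entries only restate the descriptions in this section and do not query any Logical Storage Manager configuration.

```python
# state: (plex_removed, subdisks_removed) if the system fails before the
# attach completes. In every case the plex leaves the volume.
FAILURE_CLEANUP = {
    "SNAPATT": (True, True),
    "SNAPDONE": (True, True),
    "SNAPTMP": (False, False),
    "SNAPDIS": (False, False),
    "TEMP": (False, False),
    "TEMPRM": (True, False),
    "TEMPRMSD": (True, True),
}

# State that a plex moves to when its attach completes, where stated.
NEXT_STATE = {"SNAPATT": "SNAPDONE", "SNAPTMP": "SNAPDIS"}


def describe_failure_cleanup(state):
    """Summarize what happens to a plex in the given state on failure.

    Returns None for states with no cleanup rule in this section.
    """
    fate = FAILURE_CLEANUP.get(state)
    if fate is None:
        return None
    plex_removed, subdisks_removed = fate
    if not plex_removed:
        return "plex dissociated"
    if subdisks_removed:
        return "plex and subdisks removed"
    return "plex removed, subdisks kept"
```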
EXIT CODES
The majority of the Logical Storage Manager utilities use a common set of
exit codes, which can be used by shell scripts or other types of programs
to react to specific problems detected by the utilities. For C
programmers, these exit status codes are defined in the include file
volclient.h. The number and macro name for each distinct exit code are
described below. Shell script writers must directly compare against the
numbers specified.
(0) VEX_OK
The utility is not reporting any error through the exit code.
(1) VEX_USAGE
Some command line arguments to the utility were invalid.
(2) VEX_SYNTAX
A syntax error occurred in a command or description, or a specified
record name is too long or contains invalid characters. This code is
returned only by utilities that implement a command or description
language.
This code may also be returned for errors in search patterns.
(3) VEX_NOVOLD
The volume daemon does not appear to be running.
(4) VEX_IPC
An unexpected error was encountered while communicating with the volume
daemon.
(5) VEX_OSERR
An unexpected error was returned by a system call or by the C library.
This can also indicate that the utility ran out of memory.
(6) VEX_LOST
The status for a commit was lost because the volume daemon was killed
and restarted during the commit of a transaction, but after restart the
volume daemon did not know whether the commit succeeded or failed.
(7) VEX_UTILERR
The utility encountered an error that it should not have encountered.
This generally implies a condition that the utility should have tested
for but did not, or a condition that results from the volume daemon
returning a value that did not make sense.
(8) VEX_TIMEOUT
The time required to complete a transaction exceeded 60 seconds,
causing the transaction locks to be lost. As most utilities will
reattempt the transaction at least once if a timeout occurs, this
usually implies that a transaction timed out two or more times.
(9) VEX_NODG
No disk group could be identified for an operation. This results
either from naming a disk group that does not exist, or from supplying
names on a command line that are in different disk groups or in
multiple disk groups.
(10) VEX_CHANGED
A change made to the database by another process caused the utility to
stop. This code is also returned by a usage-type-dependent utility if
it is given a record that is associated with a different usage-type. If
this situation occurs when the usage-type-dependent utility is called
from a switchout utility, then the database was changed after the
switchout utility determined the proper usage-type to invoke.
(11) VEX_NOENT
A requested subdisk, plex, or volume record was not found in the
configuration database. This may also mean that a record was an
inappropriate type.
(12) VEX_EXIST
A name used to create a new configuration record matches the name of an
existing record.
(13) VEX_BUSY
A subdisk, plex, or volume is locked against concurrent access.
This code is used for inter-transaction locks associated with usage-
type utilities. The code is also used for the dissociated plex or
subdisk lock convention, which writes a non-blank string to the
tutil[0] field in a plex or subdisk structure to indicate that the
record is being used.
(14) VEX_NOUSETYPE
No usage-type could be determined for a utility that requires a usage
type.
(15) VEX_BADUSETYPE
An unknown or invalid usage-type was specified.
(16) VEX_ASSOC
A plex or subdisk is associated, but the operation requires a
dissociated record.
(17) VEX_DISASSOC
A plex or subdisk is dissociated, but the operation requires an
associated record. This code can also be used to indicate that a
subdisk or plex is not associated with a specific plex or volume.
(18) VEX_LAST
A plex or subdisk was not dissociated because it was the last record
associated with a volume or plex.
(19) VEX_TOOMANY
Association of a plex or subdisk would surpass the maximum number that
can be associated to a volume or plex.
(20) VEX_INVAL
A specified operation is invalid within the parameters specified. For
example, this code is returned when an attempt is made to split a
subdisk on a striped plex, or to use a split size that is greater than
the size of the plex.
(21) VEX_IOER
An I/O error was encountered that caused the utility to abort an
operation.
(22) VEX_NOPLEX
A volume involved in an operation did not have any associated plexes,
although at least one was required.
(23) VEX_NOSUBDISK
A plex involved in an operation did not have any associated subdisks,
although at least one was required.
(24) VEX_UNSTARTABLE
A volume could not be started by the /sbin/volume start operation,
because the configuration of the volume and its plexes prevented the
operation.
(25) VEX_STARTED
A specified volume was already started.
(26) VEX_UNSTARTED
A specified volume was not started. For example, this code is returned
by the /sbin/volume stop operation if the operation is given a volume
that is not started.
(27) VEX_DETACHED
A volume or plex involved in an operation is in the detached state,
thus preventing a successful operation.
(28) VEX_DISABLED
A volume or plex involved in an operation is in the disabled state,
thus preventing a successful operation.
(29) VEX_ENABLED
A volume or plex involved in an operation is in the enabled state, thus
preventing a successful operation.
(30) VEX_UNKNOWN
An unknown error was encountered. This code may be used, for example,
when the volume daemon returns an unrecognized error number.
(31) VEX_OPEN
An operation failed because a volume or plex device was open or
mounted, or because a subdisk was associated with an open or mounted
volume or plex.
Exit codes greater than 32 are reserved for use by usage-types. Codes
greater than 64 can be reserved for use by specific utilities.
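Because scripts must compare against the numeric values, it can help to keep the table above in one place. The following Python sketch maps each documented number to its macro name and classifies the reserved ranges. The names VEX_NAMES and describe_exit_code are hypothetical. The macro names are copied from the list above rather than read from volclient.h.

```python
_NAMES = (
    "VEX_OK", "VEX_USAGE", "VEX_SYNTAX", "VEX_NOVOLD", "VEX_IPC",
    "VEX_OSERR", "VEX_LOST", "VEX_UTILERR", "VEX_TIMEOUT", "VEX_NODG",
    "VEX_CHANGED", "VEX_NOENT", "VEX_EXIST", "VEX_BUSY", "VEX_NOUSETYPE",
    "VEX_BADUSETYPE", "VEX_ASSOC", "VEX_DISASSOC", "VEX_LAST",
    "VEX_TOOMANY", "VEX_INVAL", "VEX_IOER", "VEX_NOPLEX", "VEX_NOSUBDISK",
    "VEX_UNSTARTABLE", "VEX_STARTED", "VEX_UNSTARTED", "VEX_DETACHED",
    "VEX_DISABLED", "VEX_ENABLED", "VEX_UNKNOWN", "VEX_OPEN",
)

# Exit codes 0 through 31, in the order listed above.
VEX_NAMES = dict(enumerate(_NAMES))


def describe_exit_code(code):
    """Return the macro name for a common exit code, or its range."""
    if code in VEX_NAMES:
        return VEX_NAMES[code]
    # Test the higher range first, since it also satisfies "> 32".
    if code > 64:
        return "reserved for specific utilities"
    if code > 32:
        return "reserved for usage types"
    return "undocumented"
```

A caller that runs a utility with subprocess.run can pass the returncode attribute of the result to describe_exit_code when logging a failure.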
SEE ALSO
volassist(8), vold(8), voldctl(8), voldg(8), voldisk(8), voldiskadm(8),
voledit(8), volinfo(8), voliod(8), volmake(8), volmend(8), volnotify(8),
volplex(8), volprint(8), volrecover(8), volsd(8), volstat(8), voltrace(8),
volume(8), volwatch(8), plexrec(4), sdrec(4), volrec(4), volmake(4)