LSM-related preventative maintenance procedures enable you to restore your LSM configuration if a disk or system fails. Preventative maintenance procedures include using the redundancy features built into LSM, backing up your system regularly, backing up copies of critical data needed to reconstruct your LSM configuration, and monitoring the LSM software.
This chapter describes LSM-related preventative maintenance procedures
that you should perform.
7.1 The LSM Hot-Sparing Feature
The LSM software supports hot-sparing that enables you to configure a system to automatically react to failures on mirrored or RAID5 LSM objects.
With hot-sparing enabled, LSM detects failures on LSM objects and relocates the affected subdisks to designated spare disks or free disk space within the disk group. LSM then reconstructs the LSM objects that existed before the failure and makes them redundant and accessible.
When a partial disk failure occurs (that is, a failure affecting only some subdisks on a disk), redundant data on the failed portion of the disk is relocated and the existing volumes comprised of the unaffected portions of the disk remain accessible.
By default, hot-sparing is disabled. To enable hot-sparing, enter the following command:
# volwatch -s
Note
Only one volwatch daemon can run on a system or cluster node at any time.
The volwatch daemon monitors for failures involving LSM disks, plexes, or RAID5 subdisks. When such a failure occurs, the volwatch daemon:
Detects LSM events resulting from the failure of an LSM disk, plex, or RAID5 subdisk.
Sends electronic mail to the root account (and other specified accounts) with notification about the failure and identifies the affected LSM objects.
If hot-spare support is enabled, determines which subdisks to relocate, finds space for those subdisks in the disk group, relocates the subdisks, and notifies the root account (and other specified accounts) of these actions and their success or failure.
Note
Hot-sparing is only performed for redundant (mirrored or RAID5) subdisks on a failed disk. Non-redundant subdisks on a failed disk are not relocated, but you are notified of the failure.
Hot-sparing does not guarantee the same layout of data or the same performance after relocation. You may want to make some configuration changes after hot-sparing occurs.
When an exception event occurs, the volwatch command uses mailx(1) to send mail to:
The root account
The user accounts specified when you use the rcmgr command to set the VOLWATCH_USERS variable in the /etc/rc.config.common file. See the rcmgr(8) reference page for more information on the rcmgr command.
The user account that you specify on the command line with the volwatch command.
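For example, to enable hot-sparing and have failure notifications sent to an additional account as well as root, you might enter a command like the following. This is a hedged sketch: the way volwatch accepts extra recipients as command-line arguments is an assumption, and the address is hypothetical; see volwatch(8) for the supported syntax.
# volwatch -s storageadmin@example.com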
There is a 15 second delay before the failure is analyzed and the message is sent. This delay allows a group of related events to be collected and reported in a single mail message. The following is a sample mail notification when a failure is detected:
Failures have been detected by the Logical Storage Manager:

failed disks:
 medianame ...

failed plexes:
 plexname ...

failed log plexes:
 plexname ...

failing disks:
 medianame ...

failed subdisks:
 subdiskname ...

The Logical Storage Manager will attempt to find spare disks,
relocate failed subdisks and then recover the data in the failed plexes.
The following list describes the sections of the mail message:
The medianame list under failed disks specifies disks that appear to have completely failed.
The medianame list under failing disks indicates a partial disk failure or a disk that is in the process of failing. When a disk has failed completely, the same medianame list appears under both failed disks: and failing disks:.
The plexname list under failed plexes shows plexes that are detached due to I/O failures experienced while attempting to do I/O to the subdisks they contain.
The plexname list under failed log plexes indicates RAID5 or dirty region log (DRL) plexes that have experienced failures.
The subdiskname list under failed subdisks specifies subdisks in RAID5 volumes that are detached due to I/O errors.
If a disk fails completely, the mail message lists the disk that has failed and all plexes that use the disk. For example:
To: root
Subject: Logical Storage Manager failures on mobius.lsm.com

Failures have been detected by the Logical Storage Manager:

failed disks:
 disk02

failed plexes:
 home-02
 src-02
 mkting-01

failing disks:
 disk02

This message shows that disk02 was detached by a failure, and that plexes home-02, src-02, and mkting-01 were also detached (probably due to a disk failure).
If a plex or disk is detached by a failure, the mail message lists the failed objects. If a partial disk failure occurs, the mail identifies the failed plexes. For example, if a disk containing mirrored volumes experiences a failure, a mail message similar to the following is sent:
To: root
Subject: Logical Storage Manager failures on mobius.lsm.com

Failures have been detected by the Logical Storage Manager:

failed plexes:
 home-02
 src-02
To determine which disks are causing the failures in this message, enter:
# volstat -sff home-02 src-02
This produces output such as the following:
                    FAILED
TYP NAME        READS   WRITES
sd  disk01-04       0        0
sd  disk01-06       0        0
sd  disk02-03       1        0
sd  disk02-04       1        0
This display indicates that the failures are on disk02 and that subdisks disk02-03 and disk02-04 are affected.
Hot-sparing automatically relocates the affected subdisks and initiates necessary recovery procedures. However, if relocation is not possible or the hot-sparing feature is disabled, you must investigate the problem and recover the plexes. For example, sometimes these errors are caused by cabling failures. Check the cables connecting your disks to your system. If you find any obvious problems, correct them and recover the plexes with the following command:
# volrecover -b volhome volsrc
This command starts a recovery of the failed plexes in the background
(the command returns before the operation is done).
If an error message appears,
or if the plexes become detached again, you must replace the disk.
7.1.2 Initializing Spare Disks
To use hot-sparing, you should configure one or more disks as spares, which identifies them as available sites for relocating failed subdisks. The LSM software does not use disks that are identified as spares for normal allocations unless you explicitly specify otherwise. This ensures that there is a pool of spare disk space available for relocating failed subdisks.
Spare disk space is the first space used to relocate failed subdisks. However, if no spare disk space is available or if the available spare disk space is not suitable or sufficient, free disk space is used.
You must initialize a spare disk and place it in a disk group as a spare before it can be used for replacement purposes. If no disks are designated as spares when a failure occurs, LSM automatically uses any available free disk space in the disk group in which the failure occurs. If there is not enough spare disk space, a combination of spare disk space and free disk space is used.
When hot-sparing selects a disk for relocation, it preserves the redundancy characteristics of the LSM object to which the relocated subdisk belongs. For example, hot-sparing ensures that subdisks from a failed plex are not relocated to a disk containing a mirror of the failed plex. If redundancy cannot be preserved using available spare disks and/or free disk space, hot-sparing does not take place. If relocation is not possible, mail is sent indicating that no action was taken.
When hot-sparing takes place, the failed subdisk is removed from the configuration database and LSM takes precautions to ensure that the disk space used by the failed subdisk is not recycled as free disk space.
Follow these guidelines when choosing disks to configure as spares:
The hot-spare feature works best if you specify at least one spare disk in each disk group containing mirrored or RAID5 volumes.
If a given disk group spans multiple controllers and has more than one spare disk, set up the spare disks on different controllers (in case one of the controllers fails).
For hot-sparing to succeed for a mirrored volume, the disk group must have at least one disk that does not already contain one of the volume's mirrors. This disk must be a spare disk with some available space or a regular disk with some free space.
For hot-sparing to succeed for a mirrored and striped volume, the disk group must have at least one disk that does not already contain one of the volume's mirrors or another subdisk in the striped plex. This disk should either be a spare disk with some available space or a regular disk with some free space.
For hot-sparing to succeed for a RAID5 volume, the disk group must have at least one disk that does not already contain the volume's RAID5 plex or one of its log plexes. This disk should either be a spare disk with some available space or a regular disk with some free space.
If a mirrored volume has a DRL log subdisk as part of its data plex (for example, volprint does not list the plex length as LOGONLY), that plex cannot be relocated. Therefore, place log subdisks in plexes that contain no data (log plexes); see the sketch following these guidelines. By default, the volassist command creates log plexes.
Hot-sparing is capable of creating a new mirror of the root disk if the root disk is mirrored and it fails. The rootdg disk group should contain an empty spare disk that satisfies the restrictions for mirroring the root disk.
Although it is possible to build LSM objects on spare disks, it is preferable to use spare disks for hot-sparing only.
When relocating subdisks off a failed disk, LSM attempts to use a spare disk large enough to hold all data from the failed disk.
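For example, the following is a minimal sketch of adding a dedicated log plex to a mirrored volume so that DRL log subdisks do not prevent relocation of the data plexes. It assumes the volassist addlog operation described in volassist(8) and a hypothetical volume named volhome:
# volassist addlog volhome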
To initialize a disk as a spare that has no associated subdisks, use the voldiskadd command and enter y at the following prompt:
Add disk as a spare disk for newdg? [y,n,q,?] (default: n) y
To initialize an existing LSM disk as a spare disk, enter:
# voledit set spare=on medianame
For example, to initialize a disk called test03 as a spare disk, enter:
# voledit set spare=on test03
To remove a disk as a spare, enter:
# voledit set spare=off medianame
For example, to make a disk called test03 available for normal use, enter:
# voledit set spare=off test03
7.2 Replacement Procedure
In the event of a disk failure, mail is sent, and if volwatch was configured to run with hot-sparing support (the -s option), volwatch attempts to relocate any subdisks that appear to have failed. This involves finding an appropriate spare disk or free disk space in the same disk group as the failed subdisk.
To determine which disk to use from among the eligible spare disks, volwatch tries to use the disk that is closest to the failed disk. The value of closeness depends on the controller, target, and disk number of the failed disk. For example, a disk on the same controller as the failed disk is closer than a disk on a different controller; a disk under the same target as the failed disk is closer than one under a different target.
If no spare or free disk space is found, the following mail message is sent explaining the disposition of volumes on the failed disk:
Relocation was not successful for subdisks on disk dm_name in
volume v_name in disk group dg_name.
No replacement was made and the disk is still unusable.

The following volumes have storage on medianame:

volumename ...

These volumes are still usable, but the redundancy of those volumes is reduced.
Any RAID5 volumes with storage on the failed disk may become unusable
in the face of further failures.
If non-RAID5 volumes are made unusable due to the failure of the disk, the following is included in the mail message:
The following volumes:

volumename ...

have data on medianame but have no other usable mirrors on other disks.
These volumes are now unusable and the data on them is unavailable.
These volumes must have their data restored.
If RAID5 volumes are made unavailable due to the disk failure, the following message is included in the mail message:
The following RAID5 volumes:

volumename ...

have storage on medianame and have experienced other failures.
These RAID5 volumes are now unusable and data on them is unavailable.
These RAID5 volumes must have their data restored.
If spare disk space is found, LSM attempts to set up a subdisk on the spare disk and use it to replace the failed subdisk. If this is successful, the volrecover command runs in the background to recover the data in volumes on the failed disk.
If the relocation fails, the following mail message is sent:
Relocation was not successful for subdisks on disk dm_name in
volume v_name in disk group dg_name.
No replacement was made and the disk is still unusable.

error message
If any volumes are rendered unusable due to the failure, the following is included in the mail message:
The following volumes:

volumename ...

have data on dm_name but have no other usable mirrors on other disks.
These volumes are now unusable and the data on them is unavailable.
These volumes must have their data restored.
If the relocation procedure completes successfully and recovery is under way, the following mail message is sent:
Volume v_name Subdisk sd_name relocated to newsd_name, but not yet recovered.
Once recovery has completed, a message is sent relaying the outcome of the recovery procedure. If the recovery was successful, the following is included in the mail message:
Recovery complete for volume v_name in disk group dg_name.
If the recovery was not successful, the following is included in the mail message:
Failure recovering v_name in disk group dg_name.
7.2.1 Moving Relocated Subdisks
When hot-sparing occurs, subdisks are relocated to spare disks or available free disk space within the disk group. The new subdisk locations may not provide the same performance or data layout that existed before hot-sparing took place. After a hot-spare procedure is completed, you may want to move the relocated subdisks to improve performance, to keep the spare disk space free for future hot-spare needs, or to restore the configuration to its previous state.
Note the characteristics of the subdisk before relocation. This information is available from the mail messages sent to root. For example, look for a mail message similar to the following:
To: root
Subject: Logical Storage Manager failures on host teal

Attempting to relocate subdisk disk02-03 from plex home-02.
Dev_offset 0 length 1164 dm_name disk02 da_name dsk2.
The available plex home-01 will be used to recover the data.
Note the new location of the relocated subdisk. For example, look for a mail message similar to the following:
To: root
Subject: Attempting LSM relocation on host teal

Volume home Subdisk disk02-03 relocated to disk05-01,
but not yet recovered.
Fix or replace the disk that experienced the failure using the procedures described in Section 8.1.
Move the relocated subdisk to the desired location.
Note
RAID5 volumes are not redundant while you move a subdisk.
For example, to move the relocated subdisk disk05-01 back to the original disk disk02, enter:
# volassist move volhome !disk05 disk02
7.3 Save the LSM Configuration
It is recommended that you use the volsave command to create a copy of the current LSM configuration on a regular basis. You can use the volrestore command to recreate the configuration if the disk group's configuration is lost.
The volsave command only saves information about the LSM configuration; it does not save:
Data in LSM volumes. To ensure that you can recover the saved configuration, you must back up the volume on which it resides.
Configuration information for volumes associated with the boot disk. After the rootdg disk group is restored, you must reencapsulate the boot disk partitions as described in Chapter 4.
The volsave command saves information about an LSM configuration in a set of files called a description set, which is stored by default in a time-stamped directory in /usr/var/lsm/db.
When you run volsave, a description set is created, which consists of the following files:
allvol.DF
A volmake description file for all volumes, plexes, and subdisks in a disk group. The volsave command creates a separate subdirectory and description file for each disk group on the system.
voldisk.list
A description of the disks. This file is the output of the voldisk list command.
volboot
The contents of the /etc/vol/volboot file.
header
A header file for the description set, containing a checksum, a magic number, the date of the file's creation, and the version of the volsave command.
To create a backup copy of the current LSM configuration using the default backup directory and verify the LSM configuration information in the default directory, enter:
# volsave
Output similar to the following is displayed:
LSM configuration being saved to /usr/var/lsm/db/LSM.19991226203620.skylark
volsave does not save configuration for volumes used for root, swap, /usr or /var.
LSM configuration for following system disks not saved:
dsk8a dsk8b
LSM Configuration saved successfully.
To verify the save, enter:
# cd /usr/var/lsm/db/LSM.19991226203620.skylark
# ls
Output similar to the following is displayed:
dg1.d     header    volboot
dg2.d     rootdg.d  voldisk.list
In this example, the volsave command created the following files and directories:
A time-stamped subdirectory, LSM.19991226203620.skylark, containing the header, volboot, and voldisk.list description files
A diskgroup.d subdirectory for each of the system's three disk groups, dg1, dg2, and rootdg.
An allvol.DF file in each of the diskgroup.d subdirectories. This file is a volmake description file for all volumes, plexes, and subdisks in that disk group.
To save the LSM configuration in a time-stamped subdirectory of a directory other than the default /usr/var/lsm/db, enter:
# volsave -d /usr/var/config/LSM.%date
Output similar to the following is displayed:
LSM configuration being saved to /usr/var/config/LSM.19991226203658
.
.
.
LSM Configuration saved successfully.
To save an LSM configuration to a specific directory called /usr/var/LSM.config1, enter:
# volsave -d /usr/var/LSM.config1
Output similar to the following is displayed:
LSM configuration being saved to /usr/var/LSM.config1
.
.
.
LSM Configuration saved successfully.
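To recreate a lost configuration from a saved description set, use the volrestore command. The following is a hedged sketch, assuming volrestore accepts a -d option naming the directory that volsave wrote; see volrestore(8) for the exact syntax and options:
# volrestore -d /usr/var/LSM.config1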
Backing up a volume requires a mirrored plex that is at least large enough to store the complete contents of the volume. Using a smaller plex results in an incomplete copy.
Note
Use the AdvFS backup utilities, instead of the LSM methods described here, to back up volumes used with AdvFS.
The methods described in this section do not apply to RAID5 volumes.
7.4.1 Backing Up A Non-Mirrored Volume
Follow these steps to back up a non-mirrored volume:
Ensure there is enough free disk space to create another plex for the volume that you want to back up. You can determine this by comparing the output of the voldg free command for the disk group and the volprint -vt command for the volume.
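For example, assuming the volume to be backed up is vol01 in the rootdg disk group (the names used in the following steps), you might compare the free space in the disk group with the length of the volume by entering:
# voldg free
# volprint -vt vol01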
Create a snapshot plex by entering the following command:
# volassist snapstart volume_name
For example, to create a snapshot plex for the volume called vol01, enter:
# volassist snapstart vol01
This creates a write-only backup plex that is attached to and synchronized with the specified volume.
When the snapstart operation is complete, the plex state changes to SNAPDONE. You can then complete the snapshot operation.
Select a convenient time and inform users of the upcoming snapshot. Warn them to save files and refrain from using the system briefly during that time.
When you are ready to create the snapshot, make sure there is no activity on the volume. For UFS volumes, it is recommended that you unmount the file system briefly to ensure that the snapshot data on disk is consistent and complete (all cached data has been flushed to the disk).
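For example, assuming vol01 holds a UFS file system mounted on a hypothetical mount point /data01, you might briefly unmount it before taking the snapshot:
# umount /data01
After the snapshot in the next step completes, remount the file system (for example, # mount /dev/vol/rootdg/vol01 /data01) and resume normal use.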
Create a snapshot volume that reflects the original volume by entering the following command:
# volassist snapshot volume_name temp_volume_name
For example, to create a temporary volume called vol01-temp for a volume called vol01, enter:
# volassist snapshot vol01 vol01-temp
This operation detaches the finished snapshot (which becomes a normal plex), creates a new normal volume, and attaches the snapshot plex to it. The snapshot then becomes a normal, functioning plex with a state of ACTIVE. At this point, you can mount and resume normal use of the volume.
Check the temporary volume's contents. For example, to check the UFS file system on a volume called vol01-temp, enter:
# fsck -p /dev/rvol/rootdg/vol01-temp
Perform the backup by entering the following command:
# dump 0 /dev/rvol/disk-group/temp_volume_name
For example, to back up a volume called vol01-temp in the rootdg disk group, enter:
# dump 0 /dev/rvol/rootdg/vol01-temp
After the backup is completed, remove the temporary volume by entering the following commands:
# volume stop temp_volume_name
# voledit -r rm temp_volume_name
For example, to remove a temporary volume called vol01-temp, enter:
# volume stop vol01-temp
# voledit -r rm vol01-temp
7.4.2 Backing Up A Mirrored Volume
If a volume is mirrored, you can back up the volume by temporarily dissociating one of the plexes from the volume. This method eliminates the need for extra disk space for the purpose of backup only.
Warning
If the volume has only two plexes, redundancy is not available during the time of the backup.
Follow these steps to back up a mirrored volume:
Stop all activity on the volume. For UFS volumes, it is recommended that you unmount the file system briefly to ensure that the data on disk is consistent and complete (all cached data has been flushed to the disk).
Dissociate one of the volume's plexes by entering the following command:
# volplex dis plex_name
For example, to dissociate a plex called vol01-02, enter:
# volplex dis vol01-02
This operation usually takes only a few seconds. It leaves the plex called vol01-02 available as an image of the volume frozen at the time of dissociation.
Mount and resume normal use of the volume.
Create a temporary volume by entering the following commands:
# volmake -Ufsgen vol temp_volume_name plex=plex_name
# volume start temp_volume_name
For example, to create a temporary volume called vol01-temp using a plex called vol01-02, enter:
# volmake -Ufsgen vol vol01-temp plex=vol01-02
# volume start vol01-temp
Check the temporary volume by entering the following command:
# fsck -p /dev/rvol/rootdg/temp_volume_name
Perform the backup using the temporary volume by entering the following command:
# dump 0 /dev/rvol/rootdg/temp_volume_name
After the backup is completed, remove the temporary volume and reattach the plex by entering the following commands:
# volplex dis plex_name
# voledit -r rm temp_volume_name
# volplex att volume_name plex_name &
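For example, to remove the temporary volume vol01-temp used above and reattach the plex vol01-02 to the original volume vol01, you might enter:
# volplex dis vol01-02
# voledit -r rm vol01-temp
# volplex att vol01 vol01-02 &
The reattach is run in the background because the plex contents are resynchronized from the volume, which can take some time.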
7.5 Collect LSM Performance Data
LSM provides two types of performance information -- I/O statistics and I/O traces:
I/O statistics are collected using the volstat command, which provides the most commonly used information.
I/O traces are collected using the voltrace command, which provides advanced and in-depth information.
Note
In a cluster environment, volstat and voltrace report statistics relative to the node on which the commands are executed. These commands do not provide statistics for all the nodes within a cluster.
7.5.1 Gathering I/O Statistics
The volstat command provides access to information for activity on volumes, plexes, subdisks, and disks used with the LSM software. You can use the volstat command to report I/O statistics for LSM objects, either accumulated since the system was booted or for specified time intervals. Statistics for a specific LSM object or for all objects can be displayed. If a disk group is specified, statistics are displayed only for objects in that disk group; otherwise, statistics for the default disk group (rootdg) are displayed.
The amount of information displayed depends on the options you specify with the volstat command. You can also reset the statistics information to zero, which is useful for measuring the impact of a particular operation. For information on available options, see the volstat(8) reference page.
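For example, to measure the I/O caused by a single operation, you might reset the counters, run the operation, and then display fresh statistics. This is a hedged sketch: the -r option for resetting statistics is an assumption, so check volstat(8) for the exact option before relying on it.
# volstat -r
# volstat -vpsd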
7.5.1.1 Statistics Recorded by LSM
The LSM software records the following three I/O statistics:
A count of read and write operations.
The number of read and write blocks.
The average operation time. This time reflects the total time it took to complete an I/O operation, including the time spent waiting in a disk's queue on a busy device.
The LSM software records these statistics for logical I/O for each volume. The statistics are recorded for the following types of operations: reads, writes, atomic copies, verified reads, verified writes, plex reads, and plex writes.
For example, one write to a two-plex volume results in at least five operations -- one for each plex, one for each subdisk, and one for the volume. Similarly, one read that spans two subdisks shows at least four reads -- one read for each subdisk, one for the plex, and one for the volume.
The LSM software also maintains other statistical data, such as read and write failures for each mirror and corrected read and write failures for each volume, alongside the read and write failures that are recorded.
To display statistical data for volumes, enter:
# volstat
Output similar to the following is displayed:
OPERATIONS BLOCKS AVG TIME(ms)
TYP NAME READ WRITE READ WRITE READ WRITE
vol v1 3 72 40 62541 8.9 56.5
vol v2 0 37 0 592 0.0 10.5
To display statistical data for all LSM objects, enter:
# volstat -vpsd
Output similar to the following is displayed:
                 OPERATIONS           BLOCKS        AVG TIME(ms)
TYP NAME        READ   WRITE      READ     WRITE     READ  WRITE
dm  dsk6           3      82        40     62561      8.9   51.2
dm  dsk7           0     725         0    176464      0.0   16.3
dm  dsk9         688      37    175872       592      3.9    9.2
dm  dsk10      29962       0   7670016         0      4.0    0.0
dm  dsk12          0   29962         0   7670016      0.0   17.8
vol v1             3      72        40     62541      8.9   56.5
pl  v1-01          3      72        40     62541      8.9   56.5
sd  dsk6-01        3      72        40     62541      8.9   56.5
vol v2             0      37         0       592      0.0   10.5
pl  v2-01          0      37         0       592      0.0    8.0
sd  dsk7-01        0      37         0       592      0.0    8.0
sd  dsk12-01       0       0         0         0      0.0    0.0
pl  v2-02          0      37         0       592      0.0    9.2
sd  dsk9-01        0      37         0       592      0.0    9.2
sd  dsk10-01       0       0         0         0      0.0    0.0
pl  v2-03          0       6         0        12      0.0   13.3
sd  dsk6-02        0       6         0        12      0.0   13.3
To display statistics on volumes in the rootdg disk group in one-second intervals, enter:
# volstat -i 1
Output similar to the following is displayed:
OPERATIONS BLOCKS AVG TIME(ms)
TYP NAME READ WRITE READ WRITE READ WRITE
Mon Jun 8 15:11:16 1998
vol v1 3 72 40 62541 8.9 56.5
vol v2 14015 37 14015 592 0.3 10.5
Mon Jun 8 15:11:17 1998
vol v1 0 0 0 0 0.0 0.0
vol v2 2606 0 2606 0 0.3 0.0
To display error statistics on a volume called testvol, enter:
# volstat -f cf testvol
Output similar to the following is displayed:
CORRECTED FAILED
TYP NAME READS WRITES READS WRITES
vol testvol 1 0 0 0
Additional volume statistics are available for RAID5 configurations. See the volstat(8) reference page for details.
7.6 Monitor LSM Events and Configuration Changes
You use the volwatch and volnotify commands to monitor LSM events and configuration changes.
The volwatch shell script sends mail to the root login (default) and other user accounts that you specify when certain LSM configuration events occur, such as a plex detach caused by a disk failure.
To specify another mail recipient or multiple mail recipients, use the rcmgr command to set the rc.config.common variable VOLWATCH_USERS. The LSM volwatch script uses VOLWATCH_USERS whenever the system is booted or LSM is restarted.
To specify a user named user1@mail.com as a mail recipient, enter:
# rcmgr -c set VOLWATCH_USERS root@dec.com user1@mail.com
LSM events are sent to the EVM event management system using the volnotify command. The volnotify command integrates with EVM by default and runs automatically when LSM starts.
The following LSM events are reported by the volnotify command within EVM:
-i Display disk group import, deport, and disable events
-f Display plex, volume, and disk detach events
-d Display disk change events
-c Display disk group change events
While the LSM volnotify events reported within EVM are configured through the rc.config.common variable LSM_EVM_OPTS, the LSM_EVM_OPTS settings should not normally be changed because certain system software depends on these values for operation.
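To examine the current setting without modifying it, you might display the variable with the rcmgr get operation (a hedged sketch; see rcmgr(8) for details):
# rcmgr get LSM_EVM_OPTS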
Note
For cluster environments, the volnotify command only reports LSM events that occur locally on that node. Therefore, to receive LSM events that occur anywhere within the cluster, do not disable the volnotify integration within EVM.
Subscribers can display LSM events through the LSM volnotify EVM template called lsm.volnotify.evt. This EVM template is used to display LSM events in cluster and non-cluster environments.
To display LSM events from the EVM log, enter:
# evmget -f "[name *.volnotify]" | evmshow -t "@timestamp @@"
To display LSM events in real time, enter:
# evmwatch -f "[name *.volnotify]" | evmshow -t "@timestamp @@"
For more information, see the volnotify(8), volwatch(8), and EVM(5) reference pages.
7.7 Monitor Volume States
You can use the volprint command to display volume information. The volprint command displays state information that indicates a variety of normal and exceptional conditions.
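For example, to display the state and kernel state of a hypothetical volume named volhome, you might enter:
# volprint -vt volhome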
7.7.1 Volume States
Volume states indicate whether the volume is initialized, whether it has been written to, and whether the volume is accessible. Table 7-1 describes the volume states.
Table 7-1: Volume States
State | Means |
EMPTY | The volume contents are not initialized. The kernel state is always DISABLED when the volume is EMPTY. |
CLEAN | The volume is not started (kernel state is DISABLED) and its plexes are synchronized. |
ACTIVE | The volume was started (kernel state is currently ENABLED) or was in use (kernel state was ENABLED) when the machine was rebooted. If the volume is currently ENABLED, the state of its plexes at any moment is not certain (since the volume is in use). If the volume is currently DISABLED, this means that the plexes cannot be guaranteed to be consistent, but are made consistent when the volume is started. |
SYNC | The volume is either in read-writeback recovery mode (kernel state is currently ENABLED) or was in read-writeback mode when the machine was rebooted (kernel state is DISABLED). With read-writeback recovery, plex consistency is recovered by reading data from blocks of one plex and writing the data to all other writable plexes. If the volume is ENABLED, this means that the plexes are being resynchronized via the read-writeback recovery. If the volume is DISABLED, it means that the plexes were being resynchronized via read-writeback when the machine rebooted and therefore, still need to be synchronized. |
NEEDSYNC | The volume requires a resynchronization operation the next time it starts. |
The interpretation of these states during volume startup is modified
by the persistent state log for the volume (for example, the DIRTY/CLEAN flag).
If the clean flag is set, this means that an ACTIVE volume was not written
to by any processes or was not even open at the time of the reboot; therefore,
it is considered CLEAN.
The clean flag is always set when the volume is marked
CLEAN.
7.7.2 RAID5 Volume States
RAID5 volumes have their own set of volume states as described in
Table 7-2.
Table 7-2: RAID5 Volume States
State | Means |
EMPTY | The volume contents are not initialized. The kernel state is always DISABLED when the volume is EMPTY. |
CLEAN | The volume is not started (kernel state is DISABLED) and its parity is good. The RAID-5 plex stripes are consistent. |
ACTIVE | The volume was started (kernel state is currently ENABLED) or was in use (kernel state was ENABLED) when the system was rebooted. If the volume is currently ENABLED, the state of its RAID-5 plex at any moment is not certain (since the volume is in use). If the volume is currently DISABLED, this means that the parity synchronization is not guaranteed. |
SYNC | The volume is either undergoing a parity resynchronization (kernel state is currently ENABLED) or was having its parity resynchronized when the machine was rebooted (kernel state is DISABLED). |
NEEDSYNC | The volume requires a parity resynchronization operation the next time it is started. |
REPLAY | The volume is in a transient state as part of a log replay. A log replay occurs when it becomes necessary to use logged parity and data. |
The volume kernel state indicates the accessibility of the volume.
The
volume kernel state allows a volume to have an offline (DISABLED), maintenance
(DETACHED), or online (ENABLED) mode of operation.
Table 7-3
describes the volume kernel states.
Table 7-3: Volume Kernel States
State | Means |
ENABLED | Read and write operations can be performed. |
DISABLED | The volume is not accessed. |
DETACHED | Read and write operations cannot be performed, but plex device operations and ioctl functions are accepted. |
You can use the volprint command to display plex information. The volprint command displays state information that indicates a variety of normal and exceptional conditions.
7.8.1 Plex States
Each data plex associated with a volume has its state set to one of the values listed in Table 7-4.
LSM utilities use plex states to:
Monitor operations on plexes
Track whether a plex was in active use at the time of a system failure
Indicate whether volume contents have been initialized to a known state
Determine if a plex contains a valid copy (mirror) of the volume contents
Although the LSM utilities automatically maintain a plex's state, you may need to manually change the plex state. For example, if a disk begins to fail, you can temporarily disable a plex located on the disk.
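For example, to take a plex offline while you investigate a failing disk, you might use the volmend off command (a minimal sketch; the plex name vol01-02 is hypothetical):
# volmend off vol01-02
See volmend(8) for how to return the plex to service.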
Table 7-4: LSM Plex States
State | Means |
EMPTY | The plex is not initialized. This state is set when the volume state is also EMPTY. |
CLEAN | The plex was running normally when the volume was stopped. The plex was enabled without requiring recovery when the volume is started. |
ACTIVE | The plex is running normally on a started volume. The plex condition flags (NODAREC, REMOVED, RECOVER, and IOFAIL) may apply if the system is rebooted and the volume restarted. |
STALE | The plex was detached, either by the volplex det command or by an I/O failure. The volume start command changes the state for a plex to STALE if any of the plex condition flags are set. STALE plexes are reattached automatically by calling volplex att when a volume starts. |
OFFLINE | The plex was disabled by the volmend off command. See volmend(8) for more information. |
SNAPATT | This is a snapshot plex that is attached by the volassist snapstart command. When the attach is complete, the state for the plex is changed to SNAPDONE. If the system fails before the attach completes, the plex and all of its subdisks are removed. |
SNAPDONE | This is a snapshot plex created by the volassist snapstart command that is fully attached. You can turn a plex in this state into a snapshot volume with the volassist snapshot command. If the system fails before the attach completes, the plex and all of its subdisks are removed. See volassist(8) for more information. |
SNAPTMP | This is a snapshot plex that was attached by the volplex snapstart command. When the attach is complete, the state for the plex changes to SNAPDIS. If the system fails before the attach completes, the plex is dissociated from the volume. |
SNAPDIS | This is a snapshot plex created by the volplex snapstart command that is fully attached. You can turn a plex in this state into a snapshot volume with the volplex snapshot command. If the system fails before the attach completes, the plex is dissociated from the volume. See volplex(8) for more information. |
TEMP | This is a plex that is associated and attached to a volume with the volplex att command. If the system fails before the attach completes, the plex is dissociated from the volume. |
TEMPRM | This is a plex that is being associated and attached to a volume with the volplex att command. If the system fails before the attach completes, the plex is dissociated from the volume and removed. Any subdisks in the plex are kept. |
TEMPRMSD | This is a plex that is associated and attached to a volume with the volplex att command. If the system fails before the attach completes, the plex and its subdisks are dissociated from the volume and removed. |
During normal LSM operation, plexes automatically cycle through a series of states. At system startup, volumes are automatically started and the volume start operation makes all CLEAN plexes ACTIVE. If all goes well until shutdown, the volume-stopping operation marks all ACTIVE plexes CLEAN and the cycle continues.
Deviations from this cycle indicate abnormalities that the LSM software attempts to normalize, for example:
If a crash occurs between startup and shutdown, the volume-starting operation may find a mirrored volume has no CLEAN plexes, only ACTIVE plexes. The mirrored volume is first placed in the NEEDSYNC state, then into SYNC state once resynchronization starts. After the plexes are resynchronized, the volume is placed into the ACTIVE state.
If an I/O error occurs between startup and shutdown that causes a plex to become disabled, the volume-stopping operation marks that plex as STALE. Any STALE plexes require recovery. When the system restarts, data is copied from an ACTIVE plex to each STALE plex, making the STALE plex ACTIVE.
The plex kernel state indicates the accessibility of the plex. The plex kernel state is monitored in the volume driver and allows a plex to have an offline (DISABLED), maintenance (DETACHED), or online (ENABLED) mode of operation. No user intervention is required to set these states; they are maintained internally. On a system that is operating properly, all plexes are set to ENABLED.
Table 7-5
describes the plex kernel states.
Table 7-5: Plex Kernel States
State | Means |
ENABLED | A write request to the volume will be reflected to the plex, if the plex is set to ENABLED for write mode. A read request from the volume is satisfied from the plex if the plex is set to ENABLED. |
DISABLED | The plex is not accessed. |
DETACHED | A write to the volume is not reflected to the plex. A read request from the volume will never be satisfied from the plex device. Plex operations and ioctl functions are accepted. |
The vold and voliod daemons must be running for the LSM software to work properly. These daemons are normally started automatically when the system boots.
To determine the state of the volume daemon, enter:
# voldctl mode
The following table shows messages that may be displayed and possible actions to take if vold is disabled or not running.
Message from voldctl mode | Status of vold | How to change |
mode: enabled | Running and enabled | -- |
mode: disabled | Running, but disabled | voldctl enable |
mode: not-running | Not running | vold |
See the vold(8) reference page for more information on the vold daemon.
The volume extended I/O daemon (voliod) allows for some extended I/O operations without blocking calling processes. The correct number of voliod daemons is automatically started when LSM is started, and there are typically several voliod daemons running at all times. It is recommended that you run at least one voliod daemon for each processor on the system.
Follow these steps to check and/or change the voliod daemons:
Display the current voliod state by entering the following command:
# voliod
This is the only method for checking voliod, because the voliod processes are kernel threads and are not visible in the output of the ps command.
Output similar to the following may display:
0 volume I/O daemons running
If no voliod daemons are running, or if you want to change the number of daemons, enter the following command:
# voliod set n
where n is the number of I/O daemons. Set the number of LSM I/O daemons to two or to the number of central processing units (CPUs) on the system, whichever is greater. For example, on a single-CPU system, enter:
# voliod set 2
On a four-CPU system, enter:
# voliod set 4
Verify the change by entering the following command:
# voliod
Output similar to the following should display:
2 volume I/O daemons running
See the voliod(8) reference page for more information on the voliod daemon.
7.10 Trace LSM I/O Operations
You use the voltrace command to trace volume operations. Using the voltrace command, you can set I/O tracing masks against a group of volumes or against the system as a whole. You can then use the voltrace command to display ongoing I/O operations relative to the masks.
The trace records for each physical I/O show a volume and buffer-pointer combination that enables you to track each operation even though the traces may be interspersed with other operations. Like the I/O statistics for a volume, the I/O trace statistics include records for each physical I/O done, and a logical record that summarizes all physical records. For additional information, see the voltrace(8) reference page.
To trace volumes, enter:
# voltrace -l
Output similar to the following is displayed:
926159 598519126 START read vdev v2 dg rootdg dev 40,6 block 895389 len 1 concurrency 1 pid 3943
926159 598519127 END read vdev v2 dg rootdg op 926159 block 895389 len 1 time 1
926160 598519128 START read vdev v2 dg rootdg dev 40,6 block 895390 len 1 concurrency 1 pid 3943
926160 598519128 END read vdev v2 dg rootdg op 926160 block 895390 len 1 time 0