This chapter examines problems that, while universal for file systems, may have
unique solutions for AdvFS.
See
System Configuration and Tuning
for related information
about diagnosing performance problems.
5.1 Disk File Structure Incompatibility
If you install your Version 5 operating system as an update to your Version
4 system (not a full installation), your
/root,
/usr,
and
/var
files will retain a DVN of 3 (see
Section 1.4.3.1).
By default, domains created on Version 5.0 and later have a new format that is incompatible with earlier versions (see Section 1.4.3). The newer operating system recognizes the older disk structure, but the older does not recognize the newer. To access a fileset with the new format (a DVN of 4) from an older operating system, NFS mount the fileset from a Version 5 system or upgrade your operating system to Version 5. There is the potential for problems when files created on one operating system are moved to another.
If you try to mount a fileset belonging to a domain with a DVN of 4 when you are running a version of the operating system earlier than Version 5.0, you will get an error message.
There is no tool that upgrades all domains with a DVN of 3 to domains with DVN
of 4.
You must upgrade each domain (see
Section 1.4.3.2).
5.1.1 Utility Incompatibility
Because of the new on-disk file formats, some AdvFS-specific utilities from earlier releases have the potential to corrupt domains created using the new on-disk formats. None of the statically linked AdvFS-specific utilities from earlier operating system versions will run on Version 5.0 and later. These utilities are usually from operating system versions prior to Version 4.0. In addition, the following dynamically linked utilities from earlier releases of Tru64 UNIX do not run on Version 5.0 and later:
advfsstat
balance
chvol
defragment
rmvol
showfdmn
verify
5.1.2 Avoiding Metadata Incompatibility
If a system crashes or goes down unexpectedly, for example due to loss of power, after reboot AdvFS will perform recovery when the filesets that were mounted at the time of the crash are remounted. This recovery keeps the AdvFS metadata consistent and makes use of the AdvFS transaction log file.
Different versions of the operating system use different AdvFS log record types. Therefore, it is important that AdvFS recovery be done on the same version of the operating system as was running at the time of the crash. For example, if your system was running Version 5.1 when it crashed, do not reboot using Version 3.2G because the log records may be formatted differently from those saved by the Version 5.1 system.
To reboot without error using a different version of the operating system, cleanly
unmount all filesets before rebooting.
Note that if the system failed due to a system
panic or an AdvFS domain panic, it is best to reboot using the original version of
the operating system and then run the
verify
command to ensure
that the domain is not corrupted.
If it is not, it is then safe to reboot using a
different version of the operating system.
If running the
verify
command indicates that the domain has been corrupted, see
Section 5.4.6.
5.2 Memory Mapping, Direct I/O and Data Logging Incompatibility
Memory mapping, atomic-write data logging and direct I/O are mutually exclusive.
If a file is open in one of these modes, attempting to open the same file in one of
the conflicting modes will fail.
For more information see
Section 4.1.4
and
Section 4.1.5
and the
mmap(2)
reference page.
5.3 Handling Poor Performance
The performance of a disk depends upon the I/O demands upon it. If your domain is structured so that heavy access is focused on one volume, it is likely that system performance will degrade. Once you have determined the load balance, there are a number of ways to equalize the activity and increase throughput. See System Configuration and Tuning, command reference pages, and Chapter 4 for more complete information.
To discover the causes of poor performance, first check system activity (see Section 4.2). There are a number of ways to improve performance:
Upgrade domains
DVN4 domains are indexed when a directory grows beyond a page, that is, about 200 files (see Section 1.4.3.2). Directories with more than 5000 files show the most benefit.
Eliminate disk access incompatibility
If you have initiated direct I/O (which turns off caching) to read and write data to a file, any application that accesses the same file will also have direct I/O. This may prove inefficient (see Section 4.1.5).
Defragment domains
As files grow, contiguous space on disk often is not available to accommodate
new data, so files become fragmented.
File fragmentation can reduce system performance
because more I/O is required to read or write a file.
Use the AdvFS GUI (see
Chapter 6) or run the
defragment
utility from the
command line (see
Section 4.3.1).
If you have AdvFS Utilities, you can also:
Balance a multivolume domain
System performance improves if you distribute files evenly over all your volumes.
Files that are distributed unevenly can degrade system performance.
Use the
balance
command to redistribute the files (see
Section 4.3.2).
When a volume is added to a domain with the
addvol
command,
all the files of the domain remain on the previously existing volume and the new one
is empty.
To even the file distribution, use the AdvFS GUI (see
Chapter 6)
or run the
balance
utility from the command line.
Stripe individual files
AdvFS allows you to stripe individual files across multiple volumes (see Section 4.3.4). Use AdvFS striping only on directly attached storage that is not otherwise striped. Combining AdvFS striping with other striping may degrade performance.
Migrate individual files
You can use the
migrate
utility to move a heavily accessed
file or selected pages of a file to another volume in the domain.
You can move the
file to a specific volume or you can let the system choose (see
Section 4.3.3).
Change AdvFS resources
You can change your file system size in the following ways:
Increase the size of a domain by adding a volume with the
addvol
command (see
Section 1.4.6).
For optimum performance,
each volume you add should consist of the entire disk (typically, partition
c).
Do not add a volume containing any data you want to keep.
When you run
the
addvol
command, data on the added disk is destroyed.
Shrink a domain by removing a volume with the
rmvol
command
(see
Section 1.4.7).
Striped file segments will be moved to a volume
that does not contain a stripe.
If this is not possible, the system requests confirmation
before doubling up on stripes (see
Section 4.3.4).
You can interrupt the
rmvol
process with
[Ctrl/c]
without damaging your domain.
Files already removed from
the volume will remain in their new location.
Files that had not been moved at the
time of the interrupt will remain in their original location.
If the volume that has had the files removed does not allow new file allocations
after an aborted
rmvol
operation, use the
chvol
command with the
-A
option to reactivate the volume.
Change the size of a domain by changing volumes. Add a new one, move your files to it, then remove the old (see Section 4.3.3).
Back up your data regularly and frequently and watch for signs of impending
disk failure.
Removing files from a problem disk before it fails can prevent a lot
of trouble.
See the Event Management information in
System Administration
for more information.
5.4.1 Checking Free Space and Disk Usage
You can look at the way space is allocated on a disk by file, fileset, or domain.
The AdvFS GUI (see
Chapter 6) displays a hierarchical view of disk
objects and the space they use.
Table 5-1
shows command-line
commands that examine disk space usage.
Table 5-1: Disk Space Usage Information Commands
| Command | Description |
| du | Displays information about block allocation for files; use the -a option to display information for individual files. |
| df | Displays disk space usage by fileset; available space for a fileset is limited by the fileset quota if it is set. |
| showfdmn | Displays the attributes and block usage for each volume in an active domain; for multivolume domains, additional volume information is displayed. |
| showfile | Displays block usage and volume information for a file or for the contents of a directory. |
| showfsets | Displays information about the filesets in a domain; use to display fileset quota limits. |
| vdf | Displays used and available disk space for a fileset or a domain. |
See the reference pages for the commands for more complete information.
Under certain
conditions, the disk usage information for AdvFS may become corrupt.
To correct this,
change the entry in the
/etc/fstab
file to enable the
quotacheck
command to run.
The
quotacheck
command only
checks filesets that have the
userquota
and
groupquota
options specified.
For example, for the fileset
usr_domain#usr:
usr_domain#usr /usr advfs rw,userquota,groupquota 0 2
Then run the
quotacheck
command for the fileset:
# quotacheck usr_domain#usr
This should correct the disk usage information.
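Because the quotacheck command skips filesets whose /etc/fstab entry lacks the userquota and groupquota options, it can help to confirm the entry before running it. The following is a minimal sketch in POSIX shell and is not part of AdvFS. The function name has_quota_opts is hypothetical, and the sketch assumes the usual whitespace-separated fstab layout with the mount options in the fourth field.

```shell
# has_quota_opts LINE
# Hypothetical helper: succeeds (exit status 0) only if the fourth field of
# an fstab-style line lists both the userquota and groupquota options.
has_quota_opts() {
    printf '%s\n' "$1" | awk '{
        # Wrap the option list in commas so each option can be matched whole.
        opts = "," $4 ","
        exit !(index(opts, ",userquota,") && index(opts, ",groupquota,"))
    }'
}
```

For example, has_quota_opts 'usr_domain#usr /usr advfs rw,userquota,groupquota 0 2' succeeds, while the same line with only rw in the options field fails.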
5.4.2 Reusing AdvFS Volumes
All volumes (disks, disk partitions, LSM volumes, etc.) are labeled either
unused
or with the file system for which they were last used.
You can only
add a volume labeled
unused
to your domain (see
Section 1.3).
If the volume you wish to add is part of an existing domain (the
/etc/fdmns
directory entry exists), the easiest way to return the volume label to
unused status is to remove the volume with the
rmvol
command or
to remove the domain with the
rmfdmn
command (which labels all
volumes that were in the domain unused).
For example, if your volume is
/dev/disk/dsk5c, your original
domain is
old_domain, and the domain you want to add the volume
to is
new_domain, mount all the filesets in
old_domain
then enter:
# rmvol /dev/disk/dsk5c old_domain
# addvol /dev/disk/dsk5c new_domain
If the volume you want to add is not part of an existing domain but is giving
you a warning message because it is labeled, reset the disk label.
If you answer
yes
to the prompt on the
addvol
or
mkfdmn
command, the disk label will be reset.
You will lose all information that
was on the volume that you are adding.
5.4.3 Dumping to Block 0
To dump to a partition that starts at block 0 of a disk, you must first clear
the disk label.
If you do not, the output of the vdump command may appear to contain valid savesets, but when the vrestore command attempts to interpret the disk label as part of the saveset, it will return an error (see Section 3.1.5).
5.4.4 Disk Space Usage Limits
If your system has been running without any limits on resource usage, you can add quotas to limit the amount of disk space your users can access. AdvFS quotas provide a layer of control beyond that available with UFS.
User and group quotas limit the amount of space a user or group can allocate for a fileset. Fileset quotas restrain a fileset from grabbing all of the available space in a domain.
You can set two types of quota limits: hard limits that cannot be exceeded and soft limits that can be exceeded for a period of time called the grace period. You can turn quota enforcement on and off. See Chapter 2 for complete information.
If you are working in an editor and realize that the information you need to
save will put you over your quota limit, do not abort the editor or write the file
because data may be lost.
Instead, remove files to make room for the edited file prior
to writing it.
You can also write the file to another fileset, such as
tmp, remove files from the fileset whose quota you exceeded, and then move
the file back to that fileset.
AdvFS will impose quota limits in the rare case that you are 8 kilobytes below
the user, group, or fileset quota and are attempting to use some or all of the space
you have left.
This is because AdvFS allocates storage in units of 8 kilobytes.
If
adding 8 kilobytes to a file would exceed the quota limit, then that file cannot be
extended.
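The allocation rule above is simple arithmetic, sketched here in POSIX shell. This is an illustration of the rule as stated in this section, not AdvFS code: the function name can_extend and the variable ALLOC_UNIT_KB are hypothetical, and usage and limit are assumed to be given in kilobytes.

```shell
# AdvFS allocates storage in units of 8 kilobytes (per this section).
ALLOC_UNIT_KB=8

# can_extend USED_KB LIMIT_KB
# Hypothetical helper: succeeds if one more allocation unit still fits
# within the quota limit, and fails if it would exceed the limit.
can_extend() {
    used_kb=$1
    limit_kb=$2
    [ $((used_kb + ALLOC_UNIT_KB)) -le "$limit_kb" ]
}
```

With a 100-kilobyte limit, can_extend 90 100 succeeds, but can_extend 96 100 fails even though 4 kilobytes of quota remain, because the next 8-kilobyte unit would exceed the limit.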
5.4.5 Verifying File System Consistency
To ensure that metadata is consistent, run the
verify
command
to verify the file system structure.
This utility checks disk structures such as the
bitfile metadata table (BMT), the storage bitmaps, the tag directory, and the
frag
file for each fileset.
It verifies that the directory structure is
correct, that all directory entries reference a valid file, and that all files have
a directory entry.
You must be the root user
to run this command.
It is a good idea to run the
verify
command:
When problems are evident (corruptions, domain panic, lost data, I/O errors).
Before an update installation.
If your files have not been accessed in three to six months or longer
and you plan to run utilities such as
balance,
defragment,
migrate,
quotacheck,
repquota,
rmfset,
rmvol, or
vdump
that access every file in a domain.
Use the SysMan "Repair an AdvFS Domain" or, from the command line, enter:
verify domain_name
The
verify
command mounts filesets in special directories
as it proceeds.
If the command is unable to mount a fileset due to the failure of
a domain, as a last resort run the command with the
-F
option.
This option mounts the fileset using the
-d
option of the
mount
command, which means that AdvFS initializes the transaction log for
the domain without recovery.
As no domain recovery will occur for previously incomplete
operations, this could cause data corruption.
Under some circumstances the
verify
command may fail to unmount
the filesets.
If this occurs, you must unmount the affected filesets manually.
On machines with many millions of files, sufficient swap must be allocated
for the
verify
utility to run to completion.
If the amount of memory required by verify exceeds the value of the proc/max_per_proc_data_size kernel variable, the utility will not complete.
To overcome this problem,
allocate up to 10% of the domain size in swap for running the
verify
command.
The following example verifies the
domainx
domain, which
contains the filesets
setx
and
sety:
# verify domainx
+++Domain verification+++
Domain Id 2f03b70a.000f1db0
Checking disks ...
Checking storage allocated on disk /dev/disk/dsk10g
Checking storage allocated on disk /dev/disk/dsk10a
Checking mcell list ...
Checking mcell position field ...
Checking tag directories ...
+++ Fileset verification +++
+++ Fileset setx +++
Checking frag file headers ...
Checking frag file type lists ...
Scanning directories and files ... 1100
Scanning tags ... 1100
Searching for lost files ... 1100
+++ Fileset sety +++
Checking frag file headers ...
Checking frag file type lists ...
Scanning directories and files ... 5100
Scanning tags ... 5100
Searching for lost files ... 5100
In this example, the
verify
command finds no problems with
the domain.
5.4.6 Salvaging File Data from a Damaged Domain
How you recover file data from a damaged domain depends on the severity of the damage. Pick the simplest recovery path for the information you have.
Run the
verify
utility to try to repair the domain
(see
Section 5.4.5
and
verify(8)).
The
verify
utility can
only fix a limited set of problems.
Recreate the domain from your most recent backup.
If your backup is not recent enough, use your most recent backup
with the
salvage
utility to obtain more current copies of files.
The amount of data
you are able to recover will depend upon the damage to your domain.
You must be root
user to run the
salvage
utility.
See
salvage(8)
for more information.
Use the SysMan "Recover Files from an AdvFS Domain" or, from the command line, enter:
salvage domain_name
Running the
salvage
utility does not guarantee that you will
recover all of your domain.
You may be missing files, directories, file names, or
parts of files.
The utility generates a log file that contains the status of files
that were recovered.
Use the
-l
option to list in the log file
the status of all files that are encountered.
The
salvage
utility places recovered files in directories
named after the filesets.
There is a
lost+found
directory for each
fileset that contains files for which no parent directory can be found.
You can specify
the path name of the directory that is to contain the fileset directories.
If you
do not specify a directory, the utility writes recovered filesets under the current
working directory.
You cannot mount the directories in which the files are recovered.
You must move the recovered files to new filesets.
The best way to recover your domain is to use your daily backup tapes.
If files
have changed since the last backup, you can use the tapes along with the
salvage
utility as follows:
Create a new domain and filesets to hold the recovered information. Mount the filesets.
Restore from your backup tape(s) to the new domain.
Run the
salvage
utility with the
-d
option set to recover files that have changed since the backup.
If you have no backups,
you can run the
salvage
utility without the
-d
option to recover all the files in the domain.
The fastest salvage process is to recover file information to another location on disk. The following example recovers data to disk:
# /sbin/advfs/salvage -d 199812071330 corrupt3_domain
salvage: Domain to be recovered 'corrupt3_domain'
salvage: Volume(s) to be used '/dev/disk/dsk12a' '/dev/disk/dsk12g' '/dev/disk/dsk12h'
salvage: Files will be restored to '.'
salvage: Logfile will be placed in './salvage.log'
salvage: Starting search of all filesets: 09-Mar-2000 11:53:40
salvage: Starting search of all volumes: 09-Mar-2000 11:55:41
salvage: Loading file names for all filesets: 09-Mar-2000 11:56:42
salvage: Starting recovery of all filesets: 09-Mar-2000 11:57:02
If not enough room is available on disk for the recovered information, you can
recover data to tape and then write it back on to your original disk location.
However,
since this process destroys the original damaged data on disk, once you have created
a new domain, there is no way to rerun the
salvage
command if problems
arise.
Run the
salvage
command with the
-d
option set and use the
-F
and
-f
options to
specify tar format and tape drive.
If you have no backups, you can run the
salvage
utility without the
-d
option to recover all
the files in the domain.
Remove the corrupt domain.
Create a new domain and filesets to hold the recovered information. Mount the filesets.
Restore from your backup tape(s) to the new domain.
Extract the
tar
archive from the tape that the
salvage
utility created (see
tar(1)) to the new filesets.
Caution
Writing over the corrupt data on the disk is an irreversible process. If there is an error, you can no longer recover any more data from the corrupt domain. Therefore, look at the
salvage log file or the files on the tar tape to make sure you have gotten all the files you need. If a significant number of files have not been recovered, you can use the salvage command with the -S option described below.
The following example recovers data to tape and restores the data to a newly created domain:
# /sbin/advfs/salvage -F tar -d 9810280930 corrupt_domain
salvage: Domain to be recovered 'corrupt_domain'
salvage: Volume(s) to be used '/dev/disk/dsk8c' '/dev/disk/dsk5c'
salvage: Files will archived to '/dev/tape/tape0_d1' in TAR format
salvage: Logfile will be placed in './salvage.log'
salvage: Starting search of all filesets: 09-Mar-2000 10:28:13
salvage: Starting search of all volumes: 09-Mar-2000 10:31:41
# rmfdmn corrupt_domain
# mkfdmn /dev/disk/dsk5c good_domain
# addvol /dev/disk/dsk8c good_domain
# mkfset good_domain fset1
# mkfset good_domain fset2
# mount good_domain#fset1 /fset1
# mount good_domain#fset2 /fset2
Then restore filesets from tape(s) created by the
salvage
command.
# cd /fset1
# tar -xpf /dev/tape/tape0_d1 fset1
# cd /fset2
# tar -xpf /dev/tape/tape0_d1 fset2
If you have run the
salvage
utility and have been unable
to recover a large number of files, run
salvage
with the
-S
option set.
This process is very slow because the utility reads every
disk block at least once.
Caution
The
salvage utility with the -S option set opens and reads block devices directly. This could present a security problem. It may be possible to recover data from older, deleted AdvFS domains while attempting to recover data from current AdvFS domains.
Note that if you have chosen recovery to tape and have already created a new
domain on the disks containing the corrupted domain, you cannot use the
-S
option because your original information has been lost.
Note
If you have accidentally used the
mkfdmn command on a good domain, running the salvage utility with the -S option set is the only way to recover files.
For example:
# salvage -S corrupt3_domain
salvage: Domain to be recovered 'corrupt3_domain'
salvage: Volume(s) to be used '/dev/disk/dsk2a' '/dev/disk/dsk2g' '/dev/disk/dsk2h'
salvage: Files will be restored to '.'
salvage: Logfile will be placed in './salvage.log'
salvage: Starting sequential search of all volumes: 08-May-2000 14:45:39
salvage: Loading file names for all filesets: 08-May-2000 15:00:38
salvage: Starting recovery of all filesets: 08-May-2000 15:00:40
5.4.7 "Can't Clear a Bit Twice" Error Message
If you receive a "Cannot clear a bit twice" error message, your domain is damaged. To repair it:
1. Set the AdvfsFixUpSBM kernel variable to allow access to the damaged domain. This flag is off by default.
2. Mount and back up the filesets in the damaged domain.
3. Turn AdvfsFixUpSBM off.
4. Unmount the filesets in the domain. Run the verify utility with the -f option. If there are errors, continue through steps 5 and 6.
5. Recreate the domain and filesets.
6. Restore from the backup.
To turn AdvfsFixUpSBM on:
# dbx -k /vmunix /dev/mem
dbx> assign AdvfsFixUpSBM = 1
dbx> quit
To turn AdvfsFixUpSBM off:
# dbx -k /vmunix /dev/mem
dbx> assign AdvfsFixUpSBM = 0
dbx> quit
Note
The AdvfsFixUpSBM variable is global. Turn it off so that the error message is again available for all domains.
5.4.8 Recovering from a Domain Panic
When a metadata write error occurs, or if corruption is detected in a single AdvFS domain, the system initiates a domain panic (rather than a system panic) on the domain. This isolates the failed domain and allows a system to continue to serve all other domains. After a domain panic AdvFS no longer issues I/O requests to the disk controller for the affected domain. Although the domain cannot be accessed, the filesets in the domain can be unmounted.
When a domain panic occurs, an
EVM
event is logged (see
EVM(5)) and the
following message is printed to the system log and the console:
AdvFS Domain Panic; Domain name Id domain_Id
For example:
AdvFS Domain Panic; Domain staffb_domain Id 2dad7c28.0000dfbb
An AdvFS domain panic has occurred due to either a
metadata write error or an internal inconsistency.
This domain is being rendered inaccessible.
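When scanning saved log text for panics, the first line of the message above is enough to identify the affected domain. The following is a minimal sketch in POSIX shell, assuming the log text is supplied on standard input and that the message keeps the format shown in this section. The function name panic_domains is hypothetical, and the sketch makes no assumption about where your system log is stored.

```shell
# panic_domains
# Hypothetical helper: reads log text on standard input and prints
# "domain_name domain_Id" for each line that contains the documented
# "AdvFS Domain Panic; Domain name Id domain_Id" message.
panic_domains() {
    sed -n 's/.*AdvFS Domain Panic; Domain \([^ ]*\) Id \([^ ]*\).*/\1 \2/p'
}
```

Fed the example message above, the function prints the domain name staffb_domain followed by its domain ID. Lines that do not match the format produce no output.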
By default, a domain panic on an active domain will cause a live dump to be
created and placed in the
/var/adm/crash
directory.
Please file
a problem report with your software support organization and include the dump file
and a copy of the running kernel.
To recover from a domain panic, perform the following steps:
Run the
mount
command with the
-t
option and identify all mounted filesets in the affected domain.
Unmount all these filesets.
Examine the
/etc/fdmns
directory to obtain a list
of the AdvFS volumes in the domain that panicked.
Run the
savemeta
command (see
savemeta(8)) to collect information
about the metadata files for each volume in the domain for Compaq support personnel.
These saved files will be written in the directory specified and contain information
that technical support needs.
If the problem is a hardware problem, fix it before continuing.
Run the
verify
utility on the domain (see
Section 5.4.5).
If there are no errors, mount all the filesets you unmounted and resume normal operations.
If the
verify
command was able to run but showed
errors, mount the filesets, do a backup, and recreate the domain.
Note that the backup
may be incomplete and that earlier backup resources may be needed.
If the failure prevents complete recovery, recreate the domain with
the
mkfdmn
command and restore the domain's data from backup.
If
this does not provide enough information, you may need to run the
salvage
utility (see
Section 5.4.6).
For example:
# mount -t advfs
staffb_dmn#staff3_fs on /usr/staff3 type advfs (rw)
staffb_dmn#staff4_fs on /usr/staff4 type advfs (rw)
# umount /usr/staff3
# umount /usr/staff4
# ls -l /etc/fdmns/staffb_dmn
lrwxr-xr-x 1 root system 10 Aug 25 16:46 dsk35c->/dev/disk/dsk3c
lrwxr-xr-x 1 root system 10 Aug 25 16:50 dsk36c->/dev/disk/dsk6c
lrwxr-xr-x 1 root system 10 Aug 25 17:00 dsk37c->/dev/disk/dsk1c
# /sbin/advfs/savemeta staffb_dmn /tmp/saved_dmn
# verify staffb_dmn
You do not need to reboot after a domain panic.
If you have recurring domain panics, it may be helpful to adjust the AdvfsDomainPanicLevel
attribute (see
Section 4.3.7) in order to facilitate debugging.
5.4.9 Recovering from Filesets Mounted Read-Only
When a fileset is mounted, AdvFS verifies that all
volumes in a domain can be accessed.
The size recorded in the domain's metadata for
each volume must match the size of the volume.
If the sizes match, the mount proceeds.
If a volume is smaller than the recorded size, AdvFS attempts to read the last block
marked in use for the fileset.
If this block can be read, the mount will succeed,
but the fileset will be marked as read-only.
If the last in-use block for any volume
in the domain cannot be read, the mount will fail.
See
mount(8)
for more information.
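The size check described above can be summarized as a small decision function. This is a sketch of the rules as stated in this section for a single volume, not the AdvFS implementation. The function name mount_outcome and its yes/no third argument are hypothetical, and because this section does not describe a volume larger than its recorded size, the sketch reports that case as unspecified.

```shell
# mount_outcome RECORDED_SIZE ACTUAL_SIZE LAST_BLOCK_READABLE
# Hypothetical helper: prints the documented outcome for one volume.
#   RECORDED_SIZE        size stored in the domain metadata
#   ACTUAL_SIZE          size of the volume as found at mount time
#   LAST_BLOCK_READABLE  "yes" if the last in-use block can be read
mount_outcome() {
    recorded=$1
    actual=$2
    last_block_readable=$3
    if [ "$actual" -eq "$recorded" ]; then
        echo proceed            # sizes match: the mount proceeds
    elif [ "$actual" -lt "$recorded" ]; then
        if [ "$last_block_readable" = yes ]; then
            echo read-only      # mount succeeds, fileset marked read-only
        else
            echo fail           # last in-use block unreadable: mount fails
        fi
    else
        echo unspecified        # larger volume: not covered by this section
    fi
}
```

For a multivolume domain the rule applies to every volume: a single volume whose last in-use block cannot be read causes the mount to fail.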
If a fileset is mounted read-only, check the labels of the flagged volumes in the error message. There are two common errors:
A disk is mislabeled on a RAID array.
An LSM volume upon which an AdvFS domain resides has been shrunk from its original size (see Section 1.7).
If you have AdvFS Utilities and if the domain consists of multiple volumes and has enough free space to remove the offending volume, you do not need to remove your filesets. However, it is a good idea to back them up before proceeding:
Remove the volume from the domain using the
rmvol
command.
(This will automatically migrate the data to the remaining volumes.)
Correct the disk label of the volume with the
disklabel
command.
Add the corrected volume back to the domain with the
addvol
command.
Run the
balance
command to distribute the data
across the new volumes.
For example, if
/dev/disk/dsk2c
(on a device here called <disk>) within the
data5
domain is mislabeled, you can migrate your files on that volume (automatic with the
rmvol
command), then move them back when you have restored the volume:
# rmvol /dev/disk/dsk2c data5
# disklabel -z dsk2
# disklabel -rw dsk2 <disk>
# addvol /dev/disk/dsk2c data5
# balance data5
If you do not have AdvFS Utilities or if there is not enough free space in the domain to transfer the data from the offending volume:
Back up all filesets in the domain.
Remove the domain with the
rmfdmn
command.
Correct the disk label of the volume with the
disklabel
command.
Make the new domain.
If you have AdvFS Utilities and if the original domain was multivolume,
add the corrected volume back to the domain with the
addvol
command.
Restore the filesets from the backup.
For example, if
/dev/disk/dsk1c
(on a device here called <disk>)
containing the
data3
domain is mislabeled:
# vdump -0f -u /data3
# rmfdmn data3
# disklabel -z dsk1 <disk>
# disklabel -w dsk1 <disk>
# mkfdmn data3
If you are recreating a multivolume domain, include the necessary
addvol
commands to add the additional volumes.
For example, to add
/dev/disk/dsk5c
to the domain:
# addvol /dev/disk/dsk5c data3
# mkfset data3 data3fset
# mount data3#data3fset /data3
# vrestore -xf - /data3
5.5 Restoring an AdvFS File System
Use the
vrestore
command to restore your AdvFS
files that have been backed up with the
vdump
command.
5.5.1 Restoring the /etc/fdmns Directory
AdvFS must have a current
/etc/fdmns
directory in order to
mount filesets (see
Section 1.4.2).
A missing or damaged
/etc/fdmns
directory prevents access to a domain, but the data within the domain remains
intact.
You can restore the
/etc/fdmns
directory from backup or
you can recreate it.
If you have a current backup copy of the directory, it is preferable to restore
the
/etc/fdmns
directory from backup.
Any standard backup facility
(vdump,
tar, or
cpio) can
back up the
/etc/fdmns
directory.
To restore the directory, use
the recovery procedure that is compatible with your backup process.
You can reconstruct the
/etc/fdmns
directory manually or
with the
advscan
command.
The procedure for reconstructing the
/etc/fdmns
directory is similar for both single-volume and multivolume domains.
You can construct the directory for a missing domain, missing links, or the whole
directory.
If you choose to reconstruct the directory manually, you must know the name
of each domain and its associated volumes.
5.5.1.1 Reconstructing the /etc/fdmns Directory Manually
If you accidentally lose all or part of your
/etc/fdmns
directory,
and you know which domains and links are missing, you can reconstruct it manually.
The following example reconstructs the
/etc/fdmns
directory
and two domains where the domains exist and their names are known.
Each contains a
single volume (or special device).
Note that the order of creating the links in these
examples does not matter.
The domains are:
domain1
on
/dev/disk/dsk1c
domain2
on
/dev/disk/dsk2c
To reconstruct the two single-volume domains, enter:
# mkdir /etc/fdmns
# mkdir /etc/fdmns/domain1
# cd /etc/fdmns/domain1
# ln -s /dev/disk/dsk1c dsk1c
# mkdir /etc/fdmns/domain2
# cd /etc/fdmns/domain2
# ln -s /dev/disk/dsk2c dsk2c
The following example reconstructs one multivolume domain.
The
domain1
domain contains the following three volumes:
/dev/disk/dsk1c
/dev/disk/dsk2c
/dev/disk/dsk3c
To reconstruct the multivolume domain, enter the following:
# mkdir /etc/fdmns
# mkdir /etc/fdmns/domain1
# cd /etc/fdmns/domain1
# ln -s /dev/disk/dsk1c dsk1c
# ln -s /dev/disk/dsk2c dsk2c
# ln -s /dev/disk/dsk3c dsk3c
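The manual steps above follow one pattern: a directory per domain containing one symbolic link per volume, each link named after the device. The following is a minimal POSIX shell sketch of that pattern, not a supplied AdvFS tool. The function name link_domain is hypothetical, and the directory root is passed as a parameter so the function can be tried in a scratch directory before it is pointed at /etc/fdmns. It uses mkdir -p rather than separate mkdir commands so that an existing root directory is not an error.

```shell
# link_domain ROOT DOMAIN DEVICE...
# Hypothetical helper: creates ROOT/DOMAIN and, inside it, one symbolic
# link per DEVICE, named after the device's base name (for example,
# dsk1c -> /dev/disk/dsk1c), mirroring the manual steps above.
link_domain() {
    root=$1
    domain=$2
    shift 2
    mkdir -p "$root/$domain" || return 1
    for dev in "$@"; do
        ln -s "$dev" "$root/$domain/$(basename "$dev")" || return 1
    done
}
```

Under these assumptions, the multivolume example corresponds to link_domain /etc/fdmns domain1 /dev/disk/dsk1c /dev/disk/dsk2c /dev/disk/dsk3c. You must still know the domain name and its volumes, exactly as for the manual procedure.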
5.5.1.2 Reconstructing the /etc/fdmns Directory Using advscan
You can use the
advscan
command to determine which partitions
on a disk or Logical Storage Manager (LSM) disk group are part of an AdvFS domain.
Then you can use the command to rebuild all or part of your
/etc/fdmns
directory.
This command is useful:
When disks have moved to a new system, device numbers have changed, or you have lost track of a domain location.
For repair, if you delete the
/etc/fdmns
directory,
delete a domain from the
/etc/fdmns
directory, or delete links
from a domain's subdirectory in the
/etc/fdmns
directory.
The
advscan
command can:
Determine if a partition is an AdvFS partition.
List partitions in the order they are found on disk.
Read the disk label to determine which partitions are in the domain and if any are overlapping.
Scan all disks found in any
/etc/fdmns
domain.
Recreate missing domain directories. The domain name is created from the device name.
Fix the domain count and links for a domain.
For each domain there are three numbers that must match for the AdvFS file system to operate properly:
The number of physical partitions found by the
advscan
command that have the same domain ID
The domain volume count (the number stored in the AdvFS metadata that specifies how many partitions the domain has)
The number of
/etc/fdmns
links to the partitions,
because each partition must be represented by a link
See
advscan(8)
for more information.
Inconsistencies can occur in these numbers in a number of ways and for a number
of reasons.
In general, the
advscan
command treats the domain volume
count as more reliable than the number of partitions or
/etc/fdmns
links.
The following tables list anomalies, possible causes, and corrective actions
that
advscan
can take.
In the table, the letter N represents the
value that is expected to be consistent for the number of partitions, domain volume
count, and number of links.
Table 5-2 shows possible cause and corrective action if the expected value, N, for the number of partitions and for the domain volume count does not equal the number of links in /etc/fdmns/<dmn>.
Table 5-2: Fileset Anomalies and Corrections
| Number of Links in /etc/fdmns/<dmn> | Possible Cause | Corrective Action |
| <N | addvol terminated early or a link in /etc/fdmns/<dmn> was manually removed. | If the domain is activated before running advscan with the -f option and the cause of the mismatch was an interrupted addvol, the situation will be corrected automatically. Otherwise, advscan will add the partition to the /etc/fdmns/<dmn> directory. |
| >N | rmvol terminated early or a link in /etc/fdmns/<dmn> was manually added. | If the domain is activated and the cause of the mismatch was an interrupted rmvol, the situation will be corrected automatically. Otherwise, if the cause was a manually added link in /etc/fdmns/<dmn>, systematically try removing different links in the /etc/fdmns/<dmn> directory and try activating the domain. The number of links to remove is the number of links in the /etc/fdmns/<dmn> directory minus the domain volume count displayed by advscan. |
Table 5-3
shows possible cause and corrective action
if the expected value, N, for the number of partitions and for the number of links
in
/etc/fdmns/<dmn>
do not equal the domain volume count:
Table 5-3: Fileset Anomalies and Corrections
| Domain Volume Count | Possible Cause | Corrective Action |
| <N | Cause unknown. | Cannot correct; run salvage to recover as much data as possible from the domain. |
| >N | addvol terminated early and partition being added is missing or has been reused. | Cannot correct; run salvage to recover as much data as possible from the remaining volumes in the domain. |
Table 5-4
shows possible cause and corrective action
if the expected value, N, for the domain volume count and for the number of links
in
/etc/fdmns/<dmn>
do not equal the number of partitions:
Table 5-4: Fileset Anomalies and Corrections
| Number of Partitions | Possible Cause | Corrective Action |
| <N | Partition missing. | Cannot correct; run salvage to recover as much data as possible from the remaining volumes in the domain. |
| >N | addvol terminated early. | None; domain will mount with N volumes; rerun addvol. |
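The three tables above amount to a small decision procedure. The following is a minimal sketch of that procedure, not part of advscan itself: the function name classify_counts and its output labels are hypothetical, and the three counts are assumed to have been read from advscan output by hand.

```shell
# classify_counts PARTITIONS VOLUME_COUNT LINKS
# Hypothetical helper: reports which row of Tables 5-2 to 5-4 applies.
# N is the value shared by the two counts that agree.
classify_counts() {
    parts=$1 vols=$2 links=$3
    if [ "$parts" -eq "$vols" ] && [ "$vols" -eq "$links" ]; then
        echo "consistent"
    elif [ "$parts" -eq "$vols" ]; then
        # Table 5-2: the number of links is the odd one out.
        if [ "$links" -lt "$parts" ]; then
            echo "Table 5-2: links < N"      # activate the domain or run advscan -f
        else
            echo "Table 5-2: links > N"      # activate the domain or remove extra links
        fi
    elif [ "$parts" -eq "$links" ]; then
        # Table 5-3: the domain volume count is the odd one out; run salvage.
        if [ "$vols" -lt "$parts" ]; then
            echo "Table 5-3: volume count < N"
        else
            echo "Table 5-3: volume count > N"
        fi
    elif [ "$vols" -eq "$links" ]; then
        # Table 5-4: the number of partitions found is the odd one out.
        if [ "$parts" -lt "$vols" ]; then
            echo "Table 5-4: partitions < N" # run salvage
        else
            echo "Table 5-4: partitions > N" # rerun addvol
        fi
    else
        echo "no two counts agree"           # not covered by the tables
    fi
}
```

For example, classify_counts 1 1 0 reports the Table 5-2 case in which a domain directory or link is missing from /etc/fdmns.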
To locate AdvFS partitions, enter the
advscan
command:
advscan
[options]
disks
In the following example there are no missing domains.
The
advscan
command scans devices
dsk0
and
dsk5
for AdvFS partitions and finds nothing amiss.
There are two partitions found (dsk0c
and
dsk5c), the domain volume count reports two,
and there are two links entered in the
/etc/fdmns
directory.
# advscan dsk0 dsk5
Scanning disks dsk0 dsk5
Found domains:
usr_domain
Domain Id 2e09be37.0002eb40
Created Thu Feb 24 09:54:15 2000
Domain volumes 2
/etc/fdmns links 2
Actual partitions found:
dsk0c
dsk5c
In the following example, directories that define the domains that include
dsk6
were removed from the
/etc/fdmns
directory.
This
means that the number of
/etc/fdmns
links, the number of partitions,
and the domain volume counts are no longer equal.
The
advscan
command scans device
dsk6
and recreates the missing domains as follows:
A partition is found containing an AdvFS domain.
The domain volume
count reports one, but there is no domain directory in the
/etc/fdmns
directory that contains this partition.
Another partition is found containing a different AdvFS domain. The domain volume count is also one. There is no domain directory that contains this partition.
No other AdvFS partitions are found. The domain volume counts and the number of partitions found match for the two discovered domains.
The
advscan
command creates directories for the
two domains in the
/etc/fdmns
directory.
The
advscan
command creates symbolic links for
the devices in the
/etc/fdmns
domain directories.
The command and output are as follows:
# advscan -r dsk6
Scanning disks dsk6
Found domains:
*unknown*
Domain Id 2f2421ba.0008c1c0
Created Thu Jan 20 13:38:02 2000
Domain volumes 1
/etc/fdmns links 0
Actual partitions found:
dsk6a*
*unknown*
Domain Id 2f535f8c.000b6860
Created Fri Feb 25 09:38:20 2000
Domain volumes 1
/etc/fdmns links 0
Actual partitions found:
dsk6b*
Creating /etc/fdmns/domain_dsk6a/
linking dsk6a
Creating /etc/fdmns/domain_dsk6b/
linking dsk6b
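When many disks are scanned, it can help to compare the three counts mechanically. The sketch below is an illustration only: the function name check_advscan_output is hypothetical, and it assumes the output layout shown in the examples above (one field per line, a domain name on the line before each Domain Id line). Real advscan output may be laid out differently, so check it against your system before relying on it.

```shell
# check_advscan_output: reads advscan output on stdin and prints, per domain,
#   DOMAIN_ID NAME VOLUMES LINKS PARTITIONS OK|MISMATCH
# Assumes the line-per-field layout shown in the examples in this section.
check_advscan_output() {
    awk '
    function report() {
        if (id != "") {
            ok = (vols + 0 == links + 0 && links + 0 == parts + 0)
            print id, name, vols, links, parts, (ok ? "OK" : "MISMATCH")
        }
    }
    { sub(/^[ \t]+/, ""); sub(/[ \t]+$/, "") }    # trim surrounding blanks
    $0 == "" { next }
    /^Domain Id / {
        # The line before "Domain Id" is the domain name; if it was counted
        # as a partition of the previous domain, undo that.
        if (counted_prev) parts--
        report()
        id = $3; name = prev
        vols = links = parts = 0
        in_parts = counted_prev = 0
        next
    }
    /^Domain volumes /          { vols = $3; next }
    /^\/etc\/fdmns links /      { links = $3; next }
    /^Actual partitions found:/ { in_parts = 1; counted_prev = 0; next }
    /^(Creating|linking) /      { in_parts = 0; counted_prev = 0; next }
    {
        counted_prev = 0
        if (in_parts) { parts++; counted_prev = 1 }
        prev = $0
    }
    END { report() }
    '
}
```

A typical use would be advscan dsk0 dsk5 | check_advscan_output. Domains reported as MISMATCH are the ones to look up in Tables 5-2 through 5-4.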
5.5.2 Recovering from Volume Failure
Some problems show up in AdvFS because of hardware errors. For example, if a write to the file system fails due to a hardware fault, it might show up as metadata corruption. Hardware problems cannot be repaired by your file system. If you start seeing unexplained errors from a file system, do the following:
As root user, examine the
/var/adm/messages
file
for AdvFS I/O error messages.
For example:
Sep 28 15:39:16 systemname vmunix: AdvFS I/O error:
Sep 28 15:39:16 systemname vmunix: Domain#Fileset:test1#test1
Sep 28 15:39:16 systemname vmunix: Mounted on: /test1
Sep 28 15:39:17 systemname vmunix: Volume: /dev/rz11c
Sep 28 15:39:17 systemname vmunix: Tag: 0x00000006.8001
Sep 28 15:39:17 systemname vmunix: Page: 76926
Sep 28 15:39:17 systemname vmunix: Block: 5164080
Sep 28 15:39:17 systemname vmunix: Block count: 256
Sep 28 15:39:17 systemname vmunix: Type of operation: Read
Sep 28 15:39:17 systemname vmunix: Error: 5
Sep 28 15:39:17 systemname vmunix: To obtain the name of
Sep 28 15:39:17 systemname vmunix: the file on which the
Sep 28 15:39:17 systemname vmunix: error occurred, type the
Sep 28 15:39:17 systemname vmunix: command
Sep 28 15:39:17 systemname vmunix: /sbin/advfs/tag2name
Sep 28 15:39:17 systemname vmunix: /test1/.tags/6
This error message describes the domain, fileset, and volume on which the error
occurred.
It also describes how to find out what file was affected by the I/O error.
If you do not find any AdvFS I/O error messages but are still seeing unexplained
behavior on the file system, unmount the domain as soon as possible and run the
verify
utility to check the consistency of the domain's metadata.
Check for device driver error messages for the volume described in
the AdvFS I/O error message.
If you do not find any error messages, unmount the domain
as soon as possible and run the
verify
utility to check the integrity
of the domain's metadata.
If you do find device driver I/O error messages that correspond
to the AdvFS I/O error messages, then the file system is being affected by problems
with the underlying hardware.
Try to remove the faulty volume using the
rmvol
utility (see
Section 1.4.7).
If this succeeds, the file system problems
should not recur.
If
rmvol
fails due to more I/O errors, it will
be necessary to recreate the domain.
If you have a recent backup, recreate the domain and restore it from
backup.
If you have no backup or it is too old, use the
salvage
utility (see
Section 5.4.6) to extract the contents of the
corrupted domain.
Remove the faulty domain using the
rmfdmn
command.
Recreate the domain using the
mkfdmn
command.
Remember that if you are recreating your domain under Version 5.0 and later, your
domains will have a DVN of 4 by default (see
Section 1.4.3).
Add volumes
as needed if you have the AdvFS Advanced Utilities package installed.
Do not include
the faulty volume in the new domain.
Restore the contents of the recreated domain using the information obtained in step 4.
Remount the filesets in the domain.
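The first step of this procedure, finding which volume and file an I/O error refers to, can be partly automated. The following is a rough sketch that assumes the message layout shown in the example above; the function name advfs_io_errors is hypothetical, and the label words it prints are chosen here for illustration.

```shell
# advfs_io_errors: reads syslog text on stdin and prints, for each AdvFS
# I/O error message, the volume and the .tags path named in the message.
# Assumes the message layout shown in the example in this section.
advfs_io_errors() {
    awk '
    /Volume:/     { print "volume", $NF }    # e.g. /dev/rz11c
    /\/\.tags\//  { print "tagfile", $NF }   # e.g. /test1/.tags/6
    '
}
```

It might be run as advfs_io_errors < /var/adm/messages. As the error message itself instructs, pass each tagfile path to /sbin/advfs/tag2name to obtain the name of the affected file, and check the device driver messages for each volume listed.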
5.5.3 Recovering from Failure of the root Domain
A catastrophic failure of the disk containing your AdvFS root domain requires that you recreate your root domain in order to have a domain to boot from. Before you recreate your domain, it is a good idea to satisfy yourself that the failure is not due to hardware problems. Check the console, look for cable or power problems, etc.
If you have files in the root domain that were not backed up, run the
salvage
utility with the
-d
option to obtain more recent
information from your domain.
Make sure
that regularly scheduled jobs are disabled.
Then boot from your installation CD-ROM.
To recover from the failure of the root domain:
Run the
salvage
utility if necessary and save the
files at another location.
Boot your system as stand-alone.
Transfer to single-user mode.
Examine the devices available.
Label the disk you have chosen.
Create the root domain and fileset. Note that if you have changed the root domain name or fileset name, use the new name.
Mount the newly created root domain and restore from backup.
If necessary, move any files recovered from the
salvage
process into the root domain.
If necessary, move your
/usr
file to this disk.
The following example assumes that you are booting from the CD-ROM device DKA500,
which is the installation Stand Alone System (SAS).
The tape drive is
/dev/tape/tape0.
The root is being restored to device
/dev/disk/dsk1,
which is a device here called <disk>.
The example boots in single-user mode, creates a new root domain, and restores its
contents from backup.
>>> b DKA500
3) UNIX Shell
# ls /dev/disk
# ls /dev/tape/tape0
# disklabel -rw -t advfs /dev/rdisk/dsk1a <disk>
# mkfdmn -r /dev/disk/dsk1a root_domain
# mkfset root_domain root
# mount root_domain#root /mnt
# cd /mnt
# vrestore -x -D .
# mkfdmn /dev/disk/dsk1a usr_domain
# mkfset usr_domain usr
# mount usr_domain#usr /usr
# mount root_domain#root /mnt
# cd /usr
You can now boot your restored root domain.
5.5.4 Restoring a Multivolume usr Domain
To restore a multivolume
/usr
file system, the
usr_domain
domain must first be reconstructed with all of its volumes before
you restore the files.
However, creating a multivolume domain requires the
addvol
utility, and the
addvol
command will not run unless
the License Management Facility (LMF) database, which resides in the
/var/adm/lmf
directory, is available.
See
lmf(8)
for information.
On some systems the
/var
directory, where the LMF database
resides, and the
/usr
directory are both located in the
usr
fileset.
So the directory containing the license database must be recovered
from the
usr
fileset before the
addvol
command
can be accessed.
On some systems the
/var
directory is in a separate
fileset.
If this is the case, the
addvol
command can be recovered
first and then can be used to add the volumes.
The following example restores a multivolume domain where the
/var
directory and the
/usr
directory are both in the
usr
fileset in the
usr_domain
domain consisting of the
dsk1g,
dsk2c, and
dsk3c
volumes.
The
procedure assumes that the root file system has already been restored.
Mount the root fileset as read/write:
# mount -u /
Remove the links for the old
usr_domain
and create
a new
usr_domain
using the initial volume:
# rm -rf /etc/fdmns/usr_domain
# mkfdmn /dev/disk/dsk1g usr_domain
Create and mount the
/usr
and
/var
filesets:
# mkfset usr_domain usr
# mount -t advfs usr_domain#usr /usr
Create a soft link in
/usr
because that is where
the
lmf
command looks for its database:
# ln -s /var /usr/var
Insert the
/usr
backup tape:
# cd /usr
# vrestore -vi
(/) add sbin/addvol
(/) add sbin/lmf
(/) add var/adm/lmf
(/) extract
(/) quit
Reset the license database:
# /usr/sbin/lmf reset
Add the extra volumes to
usr_domain:
# /usr/sbin/addvol /dev/disk/dsk2c usr_domain
# /usr/sbin/addvol /dev/disk/dsk3c usr_domain
Do a full restore of the
/usr
backup:
# cd /usr
# vrestore -xv
The following example restores a multivolume domain where the
/usr
and
/var
directories are in separate filesets in the
same multivolume domain,
usr_domain, consisting of
dsk1g,
dsk2c, and
dsk3c.
This means that
you must mount both the
/var
and the
/usr
backup
tapes.
The procedure assumes that the root file system has already been restored.
Mount the root fileset as read/write:
# mount -u /
Remove the links for the old
usr_domain
and create
a new
usr_domain
using the initial volume:
# rm -rf /etc/fdmns/usr_domain
# mkfdmn /dev/disk/dsk1g usr_domain
Create and mount the
/usr
and
/var
filesets:
# mkfset usr_domain usr
# mkfset usr_domain var
# mount -t advfs usr_domain#usr /usr
# mount -t advfs usr_domain#var /var
Insert the
/var
backup tape and restore from it:
# cd /var
# vrestore -vi
(/) add adm/lmf
(/) extract
(/) quit
Insert the
/usr
backup tape:
# cd /usr
# vrestore -vi
(/) add sbin/addvol
(/) add sbin/lmf
(/) extract
(/) quit
Reset the license database:
# /usr/sbin/lmf reset
Add the extra volumes to
usr_domain:
# /usr/sbin/addvol /dev/disk/dsk2c usr_domain
# /usr/sbin/addvol /dev/disk/dsk3c usr_domain
Do a full restore of the
/usr
backup:
# cd /usr
# vrestore -xv
Insert the
/var
backup tape and do a full restore
of the
/var
backup:
# cd /var
# vrestore -xv
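Both procedures above share the same skeleton: recreate the domain on its initial volume, recover addvol and the license database, then add the remaining volumes. As a hedged illustration, the sketch below prints that skeleton for a given domain without executing anything. The function name print_rebuild_plan is hypothetical; the commands it prints are the ones used in the examples above.

```shell
# print_rebuild_plan DOMAIN FIRST_VOLUME [EXTRA_VOLUME...]
# Hypothetical dry-run helper: prints the commands for review, runs nothing.
print_rebuild_plan() {
    if [ "$#" -lt 2 ]; then
        echo "usage: print_rebuild_plan domain first_volume [extra_volume...]" >&2
        return 1
    fi
    domain=$1 first=$2
    shift 2
    echo "rm -rf /etc/fdmns/$domain"
    echo "mkfdmn $first $domain"
    # addvol needs the LMF database, so a partial restore comes first.
    echo "# restore addvol and the LMF database, then run: /usr/sbin/lmf reset"
    for vol in "$@"; do
        echo "/usr/sbin/addvol $vol $domain"
    done
}
```

For the domain used in this section, print_rebuild_plan usr_domain /dev/disk/dsk1g /dev/disk/dsk2c /dev/disk/dsk3c lists one mkfdmn command followed by two addvol commands. Creating and mounting the filesets and the restores themselves are left out because they differ between the two cases.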
5.6 Recovering from a System Crash
As each domain is mounted after a crash, it automatically runs recovery code
that checks through the transaction log to ensure that any file system operations
that were occurring when the system crashed are either completed or backed out.
This
ensures that AdvFS metadata is in a consistent state after a crash.
5.6.1 Saving Copies of System Metadata
If you believe that a domain is corrupted or otherwise causing problems, run
the
savemeta
command to save a copy of the domain's metadata for
examination by Compaq support personnel.
You must be root user to run this command
(see
savemeta(8)).
5.6.2 Physically Moving an AdvFS Disk
If a machine has failed, it is possible to move disks containing AdvFS domains
to another computer running AdvFS.
Connect the disk(s) to the new machine and modify
the
/etc/fdmns
directory so the new system will recognize the transferred
volume(s).
You must be root user to complete this process.
You cannot move domains that have a DVN of 4 to systems running a Version 4 operating system. Doing so will generate an error message (see Section 5.1). You can move domains with a DVN of 3 to a machine running Version 5. The newer operating system will recognize the domains created earlier.
Caution
Do not use either the
addvol command or the mkfdmn command to add the volumes to the new machine. Doing so will delete all data on the disk you are moving. See Section 5.4.6 if you have already done so.
If you do not know what partitions your domains were on, you can add the disks
on the new machine and run the
advscan
command, which may be able
to recreate this information.
You can also look at the disk label on the disk to see
which partitions in the past have been made into AdvFS partitions.
This will not tell
you which partitions belong to which domains.
For example, if the motherboard of your machine fails, you need to move the
disks to another system.
You may need to reassign the disk SCSI IDs to avoid conflicts.
(See your disk manufacturer instructions for more information.) For this example,
assume the IDs are assigned to disks 6 and 8.
Assume also that the system has a domain,
testing_domain, on two disks,
dsk3
and
dsk4.
This domain contains two filesets:
sample1_fset
and
sample2_fset.
These filesets are mounted on
/data/sample1
and
/data/sample2.
Assume you know that the domain that you are moving had partitions
dsk3c,
dsk4a,
dsk4b, and
dsk4g.
The moving process would take the following steps:
Shut down the working machine to which you are moving the disks.
Connect the disks from the bad machine to the good one.
Reboot. You do not need to reboot to single-user mode; multiuser mode works because you can complete the following steps while the system is running.
Figure out the device nodes created for the new disks:
# /sbin/hwmgr -show scsi -full
The output is a detailed list of information about all the disks
on your machine.
The DEVICE FILE column shows the name that the system uses to refer
to each disk.
Determine the listing for the disk you just added, for example,
dsk6.
Use this name to set up symbolic links in step 5 below.
Modify your
/etc/fdmns
directory to include the
information from the transferred domains:
# mkdir -p /etc/fdmns/testing_domain
# cd /etc/fdmns/testing_domain
# ln -s /dev/disk/dsk6c dsk6c
# ln -s /dev/disk/dsk8a dsk8a
# ln -s /dev/disk/dsk8b dsk8b
# ln -s /dev/disk/dsk8g dsk8g
# mkdir /data/sample1
# mkdir /data/sample2
Edit the
/etc/fstab
file to add the fileset mount-point
information:
testing_domain#sample1_fset /data/sample1 advfs rw 1 0
testing_domain#sample2_fset /data/sample2 advfs rw 1 0
Mount the volumes:
# mount /data/sample1
# mount /data/sample2
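The link and fstab steps above are repetitive when a domain has many partitions. The following sketch wraps them in two small functions. The names link_domain and fstab_line are hypothetical, and the directory is passed as a parameter so that the functions can be tried in a scratch directory before being pointed at /etc/fdmns. The fstab fields follow the example in this section.

```shell
# link_domain FDMNS_DIR DOMAIN PARTITION...
# Creates FDMNS_DIR/DOMAIN and one symbolic link per partition, each
# pointing at /dev/disk/PARTITION as in the example above.
link_domain() {
    fdmns=$1 domain=$2
    shift 2
    mkdir -p "$fdmns/$domain" || return 1
    for part in "$@"; do
        ln -s "/dev/disk/$part" "$fdmns/$domain/$part" || return 1
    done
}

# fstab_line DOMAIN FILESET MOUNTPOINT
# Prints an /etc/fstab entry in the form used in this section.
fstab_line() {
    printf '%s#%s %s advfs rw 1 0\n' "$1" "$2" "$3"
}
```

For the example domain, link_domain /etc/fdmns testing_domain dsk6c dsk8a dsk8b dsk8g creates the same links as the commands in the step above, and fstab_line testing_domain sample1_fset /data/sample1 prints the first fstab entry.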
Note that if you run the
mkfdmn
command or the
addvol
command on partition
dsk6c,
dsk8a,
dsk8b, or
dsk8g, or an overlapping partition, you will
destroy the data on the disk.
See
Section 5.4.6
if you have
accidentally done so.
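One way to reduce the risk described above is to check whether a partition is already linked into a domain before running mkfdmn or addvol on it. The sketch below is a partial safeguard only: the function name partition_in_domain is hypothetical, and it looks only for a link of the same name, so it does not detect overlapping partitions.

```shell
# partition_in_domain FDMNS_DIR PARTITION
# Prints the name of every domain directory under FDMNS_DIR that contains
# a link named PARTITION. Returns 0 if at least one is found, 1 otherwise.
# Does not detect overlapping partitions.
partition_in_domain() {
    fdmns=$1 part=$2 status=1
    for link in "$fdmns"/*/"$part"; do
        # An unmatched glob stays literal and fails both tests.
        if [ -L "$link" ] || [ -e "$link" ]; then
            basename "$(dirname "$link")"
            status=0
        fi
    done
    return $status
}
```

For example, if partition_in_domain /etc/fdmns dsk8a prints testing_domain, the partition already belongs to that domain and must not be passed to mkfdmn or addvol.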
If a system crashes, AdvFS will perform recovery at reboot. Filesets that were mounted at the time of the crash will be recovered when they are remounted. This recovery keeps the AdvFS metadata consistent and makes use of the AdvFS transaction log.
Since different versions of the operating system use different transaction log structures, it is important that you recover your filesets on the version of the operating system that was running at the time of the crash. If you do not, you risk corrupting the domain metadata and/or panicking the domain.
If the system crash has occurred because you have set the
AdvfsDomainPanicLevel
attribute (see
Section 4.3.6) to promote a domain panic to
a system panic, it is also a good idea to run the
verify
command
on the panicked domain to ensure that it is not damaged.
If your filesets were unmounted
at the time of the crash, or if you have remounted them successfully and have run
the
verify
command (if needed), you can mount the filesets on a
different version of the operating system, if appropriate.