2    Planning a High-Performance and High-Availability Configuration

A high-performance configuration is one that will rapidly respond to the demands of a normal workload, and also maintain an adequate level of performance if the workload increases. A high-availability configuration provides protection against single points of failure.

This chapter describes how to perform the following tasks:

  -  Identify a resource model for your workload (Section 2.1)
  -  Identify performance and availability goals (Section 2.2)
  -  Choose high-performance system hardware (Section 2.3)
  -  Choose high-performance disk storage hardware (Section 2.4)
  -  Choose how to manage disks (Section 2.5)
  -  Choose a high-availability configuration (Section 2.6)

2.1    Identifying a Resource Model for Your Workload

Before you can configure or tune a system, you must identify a resource model for your workload. That is, you must determine if your applications are memory-intensive or CPU-intensive, and how they perform disk and network I/O. This information will help you to choose the configuration and tuning recommendations that are appropriate for your workload.

For example, if the resource model for a database server indicates that the workload consists of large sequential data transfers, an appropriate configuration is one that provides high bandwidth. If a system performs many disk write operations, a mirrored disk configuration may not be an appropriate configuration.

Use Table 2-1 to help you determine the resource model for your workload and identify a possible configuration solution for each model.

Table 2-1:  Resource Models and Possible Configuration Solutions

Resource Model                                   Configuration Solution
CPU-intensive                                    Multiprocessing system, fast CPUs, or RAID array
Memory-intensive                                 VLM system or large onboard CPU cache
Requires large amount of disk storage            System with a large I/O capacity, LSM, or RAID array
Requires low disk latency                        Solid-state disks, fast disks, RAID array, or Fibre Channel
Requires high throughput                         High-performance adapters, striping, RAID 5, or dynamic parity RAID
Requires high bandwidth                          High-performance adapters, wide devices, RAID 3, or dynamic parity RAID
Performs many large sequential data transfers    High-performance disks, wide devices, striping, RAID 3, RAID 5, or dynamic parity RAID
Performs many small data transfers               RAID 5
Issues predominantly read transfers              Disk mirroring, RAID 5, or striping
Issues predominantly write transfers             Prestoserve or write-back cache
Performs many network operations                 Multiple network adapters, NetRAIN, or high-performance adapters
Application must be highly available             Cluster
Data must be highly available                    Mirroring (especially across different buses), RAID 3, RAID 5, or dynamic parity RAID
Network I/O-intensive                            Multiple network adapters or NetRAIN

2.2    Identifying Performance and Availability Goals

Before you choose a configuration, you must determine the level of performance and availability that you need. In addition, you must account for cost factors and plan for future workload expansion.

When choosing a system and disk storage configuration, be sure to evaluate the configuration choices in terms of the following criteria:

  -  Performance
  -  Availability
  -  Cost
  -  Future workload expansion

After you determine the goals for your environment, you can choose the system and disk storage configuration that will address these goals.

2.3    Choosing High-Performance System Hardware

Different systems provide different configuration and performance features. A primary consideration for choosing a system is its CPU and memory capabilities. Some systems support multiple CPUs, fast CPU speeds, and very-large memory (VLM) configurations.

Because very-large database (VLDB) systems and cluster systems usually require many external I/O buses, another consideration is the number of I/O bus slots in the system.

Be sure that the system is adequately scalable; scalability determines whether you can increase system performance by adding resources, such as CPU and memory boards. If applicable, choose a system that supports RAID controllers, high-performance network adapters, and cluster products.

Table 2-2 describes some hardware options that can be used in a high-performance system and the performance benefit for each option.

Table 2-2:  High-Performance System Hardware Options

Hardware Option                                          Performance Benefit
Multiple CPUs (Section 2.3.1)                            Improves processing time
Fast CPU speed (Section 2.3.1)                           Improves processing time
Onboard CPU cache (Section 2.3.1)                        Improves processing time
Very-large memory (Section 2.3.2)                        Improves processing time and decreases disk I/O latency
Large I/O capacity (Section 2.3.3)                       Allows you to connect many I/O adapters and controllers for disk storage and network connections
High-performance disk storage support (Section 2.3.4)    Improves overall system performance and availability
High-performance network support (Section 2.3.5)         Increases network access and network performance

For detailed information about hardware performance features, see the Compaq Systems & Options Catalog. For information about operating system hardware support, see the Tru64 UNIX Version 4.0F Software Product Description.

The following sections describe these hardware options in detail.

2.3.1    CPU Configuration

To choose a CPU configuration that will meet your needs, you must determine your requirements for the following:

  -  Number of CPUs (multiprocessing)
  -  CPU speed
  -  Onboard CPU cache

2.3.2    Memory and Swap Space Configuration

You must determine the total amount of memory and swap space that you need to handle your workload. Insufficient memory resources and swap space will cause performance problems. In addition, your memory bank configuration will affect performance.

To configure memory and swap space, perform the following tasks:

  1. Determine how much physical memory your configuration requires and choose a system that provides the necessary memory and has enough backplane slots for memory boards (Section 2.3.2.1).

  2. Choose a swap space allocation mode (Section 2.3.2.2).

  3. Determine how much swap space you need (Section 2.3.2.3).

  4. Configure swap disks in order to efficiently distribute the disk I/O (Section 6.2).

The following sections describe these tasks.

2.3.2.1    Determining Your Physical Memory Requirements

You must have enough system memory to provide an acceptable level of user and application performance. The amount of memory installed in your system must be at least as much as the sum of the following:

In addition, each network connection to your server requires the following memory resources:

These memory resources total 1 KB for each connection endpoint (not including the socket buffer space), so you need about 10 MB of memory to accommodate 10,000 connections. A system can service millions of TCP connections, provided that it has enough memory resources. However, when memory is low, the server rejects new connection requests until enough existing connections are freed. Use the netstat -m command to display the memory currently being used by the network subsystem.
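The sizing rule of thumb above can be sketched as a short calculation (illustrative only; connection_memory_mb is a hypothetical helper name, and actual usage varies with socket buffer sizes):

```python
def connection_memory_mb(endpoints, per_endpoint_kb=1):
    # Roughly 1 KB of memory per connection endpoint, not counting
    # socket buffer space (the rule of thumb stated in the text).
    return endpoints * per_endpoint_kb / 1000.0

# 10,000 connections need about 10 MB for the endpoints alone.
print(connection_memory_mb(10_000))  # → 10.0
```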

To ensure that your server can handle high peak loads, configure 10 times the memory that is needed on a busy day. For optimal performance and scalability, configure more than the minimum amount of memory.

2.3.2.2    Choosing a Swap Space Allocation Mode

There are two modes that you can use to allocate swap space. The modes differ in how the virtual memory subsystem reserves swap space for anonymous memory (modifiable virtual address space). Anonymous memory is memory that is not backed by a file, but is backed by swap space (for example, stack space, heap space, and memory allocated by the malloc function).

There is no performance benefit attached to either mode; however, deferred mode is recommended for large-memory systems. The swap space allocation modes are as follows:

  -  Immediate mode, which reserves swap space for anonymous memory at the time the memory is allocated. This is the default mode.

  -  Deferred mode, which defers the reservation of swap space until the virtual memory subsystem must write a modified page to swap space.

See the System Administration manual for more information on swap space allocation methods.

2.3.2.3    Determining Swap Space Requirements

Swap space is used to hold the recently accessed modified pages from processes and from the UBC. In addition, if a crash dump occurs, the operating system writes all or part of physical memory to swap space.

It is important to configure a sufficient amount of swap space and to distribute swap space across multiple disks. An insufficient amount of swap space can severely degrade performance and prevent processes from running or completing. A minimum of 128 MB of swap space is recommended.

The optimal amount of swap space for your configuration depends on the following factors:

  -  The swap space allocation mode (immediate or deferred)
  -  The total amount of anonymous memory required by your processes
  -  Whether crash dumps must be written to swap space

To calculate the amount of swap space required by your configuration, first identify the total amount of anonymous memory (modifiable virtual address space) required by all of your processes. If you are using immediate mode, the optimal amount of swap space will be this value plus 10 percent. If you are using deferred mode, the optimal amount of swap space will be half the total anonymous memory requirements.
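The calculation above can be written out as a sketch (required_swap_mb is a hypothetical helper name; the 128 MB floor is the recommended minimum stated earlier in this section):

```python
def required_swap_mb(total_anonymous_mb, mode="deferred"):
    if mode == "immediate":
        # Immediate mode: total anonymous memory plus 10 percent.
        swap = total_anonymous_mb + total_anonymous_mb // 10
    elif mode == "deferred":
        # Deferred mode: half the total anonymous memory.
        swap = total_anonymous_mb // 2
    else:
        raise ValueError("mode must be 'immediate' or 'deferred'")
    return max(swap, 128)  # never configure less than the 128 MB minimum

print(required_swap_mb(2000, "immediate"))  # → 2200
print(required_swap_mb(2000, "deferred"))   # → 1000
print(required_swap_mb(100, "deferred"))    # → 128 (minimum applies)
```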

You can configure swap space when you first install the operating system, or you can add swap space at a later date. See Section 6.2 for information about adding swap space after installation and configuring swap space for high performance.

2.3.3    I/O Capacity

Systems provide support for different numbers of storage shelves and I/O buses to which you can connect external storage devices and network adapters. Some enterprise systems provide up to 132 PCI slots for external storage. These systems are often used in VLDB systems and cluster configurations.

You must ensure that the system you choose has sufficient I/O buses and slots available for your disk storage and network configuration.

2.3.4    High-Performance Disk Storage Support

Systems support different local disk storage configurations, including multiple storage shelves, large disk capacity, and UltraSCSI devices. In addition, some systems support high-performance PCI buses, which are required for hardware RAID subsystems and clusters.

You must ensure that the system you choose supports the disk storage configuration that you need. See Section 2.4 and Section 2.5 for more information on disk storage configurations.

2.3.5    High-Performance Network Support

Systems support various networks and network adapters that provide different performance features. For example, an Asynchronous Transfer Mode (ATM) high-performance network is ideal for applications that need the high speed and the low latency (switched, full duplex network infrastructure) that ATM networks provide.

In addition, you can configure multiple network adapters or use NetRAIN to increase network access and provide high network availability.

2.4    Choosing High-Performance Disk Storage Hardware

The disk storage subsystem is used for both data storage and for swap space. Therefore, an incorrectly configured or tuned disk subsystem can degrade both disk I/O and virtual memory performance. Using your resource model, as described in Section 2.1, choose the disk storage hardware that will meet your performance needs.

Table 2-3 describes some hardware options that can be used in a high-performance disk storage configuration and the performance benefit for each option.

Table 2-3:  High-Performance Disk Storage Hardware Options

Hardware Option                                       Performance Benefit
Fast disks (Section 2.4.1)                            Improves disk access time and sequential data transfer performance
Solid-state disks (Section 2.4.2)                     Provides very low disk access time
Wide devices (Section 2.4.3)                          Provides high bandwidth and improves performance for large data transfers
High-performance host bus adapters (Section 2.4.4)    Increases bandwidth and throughput, and supports wide data paths and fast bus speeds
DMA host bus adapters (Section 2.4.5)                 Relieves CPU of data transfer overhead
RAID controllers (Section 2.4.6 and Section 8.4)      Decreases CPU overhead; increases the number of disks that can be connected to an I/O bus; provides RAID functionality; and optionally provides write-back caches
Fibre Channel (Section 2.4.7)                         Provides high access speeds and other high-performance features
Prestoserve (Section 2.4.8)                           Improves synchronous write performance

For detailed information about hardware performance features, see the Compaq Systems & Options Catalog. For information about operating system hardware support, see the Tru64 UNIX Version 4.0F Software Product Description.

The following sections describe some of these high-performance disk storage hardware options in detail.

2.4.1    Fast Disks

Disks that spin with a high rate of revolutions per minute (RPM) have a low disk access time (latency). High-RPM disks are especially beneficial to the performance of sequential data transfers.

High-performance disks (7200 RPM) can improve performance for many transaction processing applications (TPAs). UltraSCSI disks (10,000 RPM) are ideal for demanding applications, including network file servers and Internet servers, that require high bandwidth and high throughput.

2.4.2    Solid-State Disks

Solid-state disks provide outstanding performance in comparison to magnetic disks, but at a higher cost. By eliminating the seek and rotational latencies that are inherent in magnetic disks, solid-state disks can provide very high disk I/O performance. Disk access time is under 100 microseconds, which allows you to access critical data more than 100 times faster than with magnetic disks.

Available in both wide (16-bit) and narrow (8-bit) versions, solid-state disks are ideal for response-time critical applications with high data transfer rates, such as online transaction processing (OLTP), and applications that require high bandwidth, such as video applications.

Solid-state disks complement hardware RAID configurations by eliminating bottlenecks caused by random workloads and small data sets. Solid-state disks also provide data reliability through a nonvolatile data-retention system.

For the best performance, use solid-state disks for your most frequently accessed data to reduce the I/O wait time and CPU idle time. In addition, connect the disks to a dedicated bus and use a high-performance host bus adapter.

2.4.3    Devices with Wide Data Paths

Disks, host bus adapters, SCSI controllers, and storage expansion units support wide data paths, which provide nearly twice the bandwidth of narrow data paths. Wide devices can greatly improve I/O performance for large data transfers.

Disks with wide (16-bit) data paths provide twice the bandwidth of disks with narrow (8-bit) data paths. To obtain the performance benefit of wide disks, all the disks on a SCSI bus must be wide. If you use both wide and narrow disks on the same SCSI bus, the bus performance will be constrained by the narrow disks.
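A toy model makes the constraint concrete: transfers run at the rate the target device supports, so a mixed bus is limited by its narrowest member. The transfer rates here are purely illustrative, not actual SCSI ratings:

```python
def bus_transfer_rate(device_widths):
    # Sustained bus performance is capped by the narrowest device on
    # the bus.  The MB/s figures are illustrative placeholders.
    rate_mb_s = {8: 10, 16: 20}
    return min(rate_mb_s[w] for w in device_widths)

print(bus_transfer_rate([16, 16, 16]))  # → 20 (all wide)
print(bus_transfer_rate([16, 16, 8]))   # → 10 (one narrow disk caps the bus)
```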

2.4.4    High-Performance Host Bus Adapters

Host bus adapters and interconnects provide different performance features at various costs. For example, FWD (fast, wide, and differential) SCSI bus adapters provide high bandwidth and high throughput connections to disk devices. Other adapters support UltraSCSI.

In addition, some host bus adapters provide dual-port (dual-channel) support, which allows you to connect two buses to one I/O bus slot.

Bus speed (the rate of data transfers) depends on the host bus adapter. Different adapters support bus speeds ranging from 5 million bytes per second (5 MHz slow SCSI) to 40 million bytes per second (20 MHz wide UltraSCSI).

You must use high-performance host bus adapters, such as the KZPSA adapter, to connect systems to high-performance RAID array controllers.

2.4.5    DMA Host Bus Adapters

Some host bus adapters support direct memory access (DMA), which enables an adapter to bypass the CPU and go directly to memory to access and transfer data. For example, the KZPAA is a DMA adapter that provides a low-cost connection to SCSI disk devices.

2.4.6    RAID Controllers

RAID controllers are used in hardware RAID subsystems, which greatly expand the number of disks connected to a single I/O bus, relieve the CPU of the disk I/O overhead, and provide RAID functionality and other high-performance and high-availability features.

There are various types of RAID controllers, which provide different features. High-performance RAID array controllers support dynamic parity RAID and battery-backed write-back caches. Backplane RAID storage controllers provide a low-cost RAID solution.

See Section 8.4 for more information about hardware RAID subsystems.

2.4.7    Fibre Channel

Fibre Channel is a high-performance serial interconnect that provides network storage capabilities. Fibre Channel supports multiple protocols, including SCSI, Intelligent Peripheral Interface (IPI), TCP/IP, and High-Performance Parallel Interface (HIPPI).

Fibre Channel is based on a network of intelligent switches. Link speeds are available up to 100 MB/sec full duplex. Although Fibre Channel is more expensive than parallel SCSI, Fibre Channel Arbitrated Loop (FC-AL) decreases costs by eliminating the Fibre Channel fabric and using connected nodes in a loop topology with simplex links. In addition, an FC-AL loop can connect to a Fibre Channel fabric.

2.4.8    Prestoserve

Prestoserve uses a nonvolatile, battery-backed memory cache to improve synchronous write performance. Prestoserve temporarily caches file system writes that otherwise would have to be written to disk. This capability improves performance for systems that perform large numbers of synchronous writes.

To optimize Prestoserve cache use, you may want to enable Prestoserve only on the most frequently used file systems. Prestoserve can greatly improve performance for NFS servers.

You cannot use Prestoserve in a cluster or for non-file system I/O.

2.5    Choosing How to Manage Disks

Disk configurations vary in capacity, performance features, and degree of availability. Use your workload resource model, as described in Section 2.1, to identify the disk configuration that will meet your performance needs.

Table 2-4 describes some disk storage management options and the performance benefit and availability impact for each option.

Table 2-4:  High-Performance Disk Storage Configuration Solutions

Configuration Option                             Performance Benefit
Shared pool of storage (LSM) (Section 2.5.1)     Facilitates management of large amounts of storage
Disk striping (RAID 0) (Section 2.5.2)           Distributes disk I/O and improves throughput, but decreases availability
RAID 3 (Section 2.5.3)                           Improves bandwidth and provides availability
RAID 5 (Section 2.5.3)                           Improves throughput and provides availability
Dynamic parity RAID (Section 2.5.3)              Improves overall disk I/O performance and provides availability
Disk mirroring (RAID 1) (Section 2.6.2)          Improves read performance and provides high availability, but decreases write performance

In addition to choosing the correct disk storage configuration, you must follow the configuration guidelines described in Chapter 8.

The following sections describe some of these high-performance storage configurations in detail. See Section 2.6 for information about high-availability configurations.

2.5.1    Using a Shared Pool of Storage for Flexible Management

There are two methods that you can use to manage the physical disks in your environment. The traditional method of managing disks and files is to divide each disk into logical areas called disk partitions, and then create a file system on a partition or use a partition for raw I/O.

Each disk type has a default partition scheme. The disktab database file lists the default disk partition sizes. The size of a partition determines the amount of data it can hold. It can be time-consuming to modify the size of a partition. You must back up any data in the partition, change the size by using the disklabel command, and then restore the data to the resized partition.

An alternative to managing disks with static disk partitions is to use the Logical Storage Manager (LSM) to set up a shared pool of storage that consists of multiple disks. You can create virtual disks (LSM volumes) from this pool of storage, according to your performance and capacity needs, and then place file systems on the volumes or use them for raw I/O.

LSM provides you with flexible and easy management for large storage configurations. Because there is no direct correlation between a virtual disk and a physical disk, file system or raw I/O can span disks as needed. In addition, you can easily add disks to and remove disks from the pool, balance the load, and perform other storage management tasks. LSM also provides you with high-performance and high-availability RAID functionality, hot spare support, and load balancing.
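The idea of drawing volumes from a shared pool, rather than from fixed partitions, can be sketched as follows. This is a hypothetical model for illustration, not LSM's actual interface:

```python
class StoragePool:
    """Toy model of pooled storage: a volume takes space from
    whichever disks have room, so it need not correspond to any
    single physical disk."""

    def __init__(self, disk_sizes_mb):
        self.free = list(disk_sizes_mb)  # free space remaining per disk

    def create_volume(self, size_mb):
        extents = []  # (disk index, MB allocated from that disk)
        for i, avail in enumerate(self.free):
            if size_mb == 0:
                break
            take = min(avail, size_mb)
            if take:
                self.free[i] -= take
                extents.append((i, take))
                size_mb -= take
        if size_mb:
            raise ValueError("not enough free space in the pool")
        return extents

# A 150 MB volume drawn from three 100 MB disks spans two of them.
pool = StoragePool([100, 100, 100])
print(pool.create_volume(150))  # → [(0, 100), (1, 50)]
```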

See Section 8.3 for information about LSM configurations.

2.5.2    Striping Disks to Distribute I/O

Disk striping (RAID 0) distributes disk I/O and can improve throughput. The striped data is divided into blocks (sometimes called chunks or stripes) and distributed across multiple disks in an array, which enables multiple devices to handle I/O operations concurrently.

Disk striping requires LSM or a hardware RAID subsystem.

The performance benefit of striping depends on the size of the stripe and how your users and applications perform disk I/O. For example, if an application performs multiple simultaneous I/O operations, you can specify a stripe size that will enable each disk in the array to handle a separate I/O operation. If an application performs large sequential data transfers, you can specify a stripe size that will distribute a large I/O evenly across the disks.
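The round-robin layout behind these stripe-size choices can be sketched as follows (a common RAID 0 mapping; LSM and RAID controllers may differ in detail):

```python
def stripe_member(offset_kb, stripe_size_kb, ndisks):
    # Data is laid out in stripe-sized chunks, assigned to the disks
    # of the array in round-robin order.
    return (offset_kb // stripe_size_kb) % ndisks

# A 256 KB sequential transfer over a 64 KB stripe on 4 disks
# touches each disk exactly once, spreading the work evenly:
print([stripe_member(off, 64, 4) for off in range(0, 256, 64)])  # → [0, 1, 2, 3]

# Four concurrent small I/Os at scattered offsets land on different
# disks, so each can be serviced in parallel:
print([stripe_member(off, 64, 4) for off in (0, 70, 140, 200)])  # → [0, 1, 2, 3]
```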

For volumes that receive only one I/O at a time, you may not want to use striping if access time is the most important factor. In addition, striping may degrade the performance of small data transfers, because of the latencies of the disks and the overhead associated with dividing a small amount of data.

Striping decreases data availability because one disk failure makes the entire disk array unavailable. To make striped disks highly available, you can combine RAID 0 with RAID 1 to mirror the striped disks.

See Chapter 8 for more information about LSM and hardware RAID subsystems.

2.5.3    Using Parity RAID to Improve Disk Performance

Hardware RAID subsystems support parity RAID for high performance and high availability. Tru64 UNIX supports three types of parity RAID, each with different performance and availability benefits:

  -  RAID 3, which improves bandwidth and is well suited to large sequential data transfers
  -  RAID 5, which improves throughput and is well suited to many small data transfers
  -  Dynamic parity RAID, which improves overall disk I/O performance

See Section 8.4 for more information about hardware RAID subsystems.
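These parity RAID levels store an XOR parity block for each stripe of data; when a disk fails, the missing block is recomputed from the surviving blocks and the parity. A minimal sketch of that reconstruction:

```python
from functools import reduce

def parity(blocks):
    # Parity is the bytewise XOR of all data blocks in a stripe.
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def reconstruct(surviving_blocks, parity_block):
    # XOR of the surviving blocks and the parity yields the lost block.
    return parity(surviving_blocks + [parity_block])

stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]  # three data disks
p = parity(stripe)

# Lose the middle block, then rebuild it from the rest plus parity.
rebuilt = reconstruct([stripe[0], stripe[2]], p)
print(rebuilt == stripe[1])  # → True
```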

2.6    Choosing a High-Availability Configuration

You can set up a configuration that provides the level of availability that you need. For example, you can make only disk data highly available or you can set up a cluster configuration with no single point of failure, as shown in Figure 1-4.

Table 2-5 lists each possible point of failure, the configuration solution that will provide high availability, and any performance benefits and tradeoffs.

Table 2-5:  High-Availability Configurations

Single system
    Solution: Latest hardware, firmware, and operating system releases
    Tradeoffs: Provides the latest hardware and software enhancements, but may require down time during an upgrade

    Solution: Cluster with at least two systems (Section 2.6.1)
    Tradeoffs: Improves overall performance by spreading the workload across member systems, but increases costs and management complexity

Multiple systems
    Solution: Cluster with more than two members (Section 2.6.1)
    Tradeoffs: Improves overall performance by spreading the workload across member systems, but increases costs and management complexity

Cluster interconnect
    Solution: Second cluster interconnect (Section 2.6.1)
    Tradeoffs: Increases costs

Disk
    Solution: Mirrored disks (Section 2.6.2)
    Tradeoffs: Improves read performance, but increases costs and decreases write performance

    Solution: Parity RAID (Section 2.6.2)
    Tradeoffs: Improves disk I/O performance, but increases management complexity and decreases performance under heavy write loads and in failure mode

Host bus adapter or bus
    Solution: Mirrored data across disks on different buses (Section 2.6.2)
    Tradeoffs: Improves read performance, but increases costs and decreases write performance

Network connection
    Solution: Multiple network connections or NetRAIN (Section 2.6.3)
    Tradeoffs: Improves network access and possibly performance, but increases costs

System cabinet power supply
    Solution: Redundant power supplies (Section 2.6.4)
    Tradeoffs: Increases costs

Storage unit power supply
    Solution: Redundant power supplies, or mirrored disks across cabinets with independent power supplies (Section 2.6.4 and Section 2.6.2)
    Tradeoffs: Increases costs

Total power supply
    Solution: Battery-backed uninterruptible power system (UPS) (Section 2.6.4)
    Tradeoffs: Increases costs

The following sections describe some of the previous high-availability configurations in detail.

2.6.1    Using a Cluster for System Availability

If users and applications depend on the availability of a single system for CPU, memory, data, and network resources, they will experience down time if a system crashes or an application fails. To make systems and applications highly available, you must use the TruCluster products to set up a cluster.

A cluster is a loosely coupled group of servers configured as member systems and connected to highly available shared disk storage and common networks. Software applications are installed on every member system, but only one system runs an application at one time.

A cluster utilizes a failover mechanism to protect against failures. If a member system fails, all cluster-configured applications running on that system will fail over to a viable member system; that is, the new system will start the applications and make them available to users. Use more than two member systems in a cluster to protect against multiple system failures.

Cluster products include TruCluster Available Server Software and TruCluster Production Server Software, which supports a high-performance cluster interconnect that enables fast and reliable communications between members. To protect against interconnect failure, use redundant cluster interconnects.

You can use only specific systems, host bus adapters, RAID controllers, and disks with the cluster products. In addition, member systems must have enough I/O bus slots for adapters, controllers, and interconnects.

See a specific cluster product's Software Product Description for detailed information about the product.

2.6.2    Using RAID for Disk Data Availability

RAID technology provides you with high data availability, in addition to high performance. RAID 1 (disk mirroring) provides high data availability by maintaining identical copies of data on different disks in an array. If the original disk fails, the copy is still available to users and applications. To protect data against a host bus adapter or bus failure, mirror the data across disks located on different buses.

Mirroring disks can improve read performance because data can be read from two different locations. However, it decreases disk write performance, because data must be written to two different locations.
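A toy latency model makes this tradeoff concrete (the per-disk service times are illustrative, and real read behavior depends on the implementation's read policy):

```python
def mirrored_io_time_ms(disk_times_ms, op):
    if op == "read":
        # A read needs only one copy, so it can be serviced by the
        # faster (or less busy) member of the mirror.
        return min(disk_times_ms)
    if op == "write":
        # A write must reach every copy, so the slowest disk paces it.
        return max(disk_times_ms)
    raise ValueError("op must be 'read' or 'write'")

print(mirrored_io_time_ms([8, 12], "read"))   # → 8
print(mirrored_io_time_ms([8, 12], "write"))  # → 12
```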

Disk mirroring requires LSM or a hardware RAID subsystem.

Hardware RAID subsystems also provide high data availability and high performance by using parity RAID, in which data is spread across disks and parity information is used to reconstruct data if a failure occurs. Tru64 UNIX supports three types of parity RAID (RAID 3, RAID 5, and dynamic parity RAID), and each provides different performance and availability benefits. See Section 2.5.3 for more information about parity RAID.

See Chapter 8 for more information about disk storage configurations.

2.6.3    Using Redundant Networks

Network connections may fail because of a failed network interface or a problem in the network itself. You can make the network connection highly available by using redundant network connections. If one connection becomes unavailable, you can still use the other connection for network access. Whether you can use multiple networks depends on the application, network configuration, and network protocol.

You can also use NetRAIN (redundant array of independent network adapters) to configure multiple interfaces on the same LAN segment into a single interface, and to provide failover support for network adapters and network connections. One interface is always active while the other interfaces remain idle. If the active interface fails, an idle interface is brought online in less than 10 seconds.

NetRAIN supports only Ethernet and FDDI.

See nr(7) for more information about NetRAIN. See the Network Administration manual for information about network configuration. See Chapter 10 for information about improving network performance.

2.6.4    Using Redundant Power Sources

To protect against a cabinet power supply failure, use redundant power supplies from different power sources. For disk storage units, you can mirror disks across cabinets with independent power supplies.

In addition, use an uninterruptible power system (UPS) to protect against a total power failure (for example, the power in a building fails). A UPS depends on a viable battery source and monitoring software.