
4    Configuring and Tuning Memory

This chapter describes how the DIGITAL UNIX operating system uses the physical memory installed in the system. This chapter also describes how to configure and tune virtual memory, swap space, and buffer caches. Many of the tuning tasks described in this chapter require you to modify system attributes. See Section 2.11 for more information.
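For example, a typical sequence for examining and changing an attribute (a sketch that assumes the sysconfig interface described in Section 2.11; the ubc-maxpercent attribute and the value 90 are used only as an illustration) is as follows. The first command displays the current value; the second changes the value on the running system:

# sysconfig -q vm ubc-maxpercent
# sysconfig -r vm ubc-maxpercent=90

To make a change persist across reboots, or to set an attribute that can be changed only at boot time, add a stanza such as the following to /etc/sysconfigtab:

vm:
    ubc-maxpercent = 90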



4.1    Understanding Memory Management

The total amount of physical memory is determined by the capacity of the memory boards installed in your system. The system distributes this memory in 8-KB units called pages.

The system distributes pages of physical memory among three areas:

Figure 4-1 shows how physical memory is used.

Figure 4-1:  Physical Memory Usage

The virtual memory subsystem and the UBC compete for the physical pages that are not wired. Pages are allocated to processes and to the UBC, as needed. When the demand for memory increases, the oldest (least-recently used) pages are reclaimed from the virtual memory subsystem and the UBC and reused. Various attributes control the amount of memory available to the virtual memory subsystem and the UBC and the rate of page reclamation. Wired pages are not reclaimed.

System performance depends on the total amount of physical memory and also the distribution of memory resources. DIGITAL UNIX allows you to control the allocation of memory (other than static wired memory) by modifying the values of system attributes. Tuning memory usually involves the following tasks:

You can also configure your swap space for optimal performance. However, to determine how to obtain the best performance, you must understand your workload characteristics, as described in Chapter 1.



4.2    Understanding Memory Hardware

When programs are executed, the system moves data and instructions among various caches, physical memory, and disk swap space. Accessing the data and instructions occurs at different speeds, depending on the location. Table 4-1 describes the various hardware resources (in the order of fastest to slowest access time).

Table 4-1:  Memory Management Hardware Resources

Resource | Description
CPU caches | Various caches reside in the CPU chip and vary in size up to a maximum of 64 KB (depending on the type of processor). These caches include the translation lookaside buffer, the high-speed internal virtual-to-physical translation cache, the high-speed internal instruction cache, and the high-speed internal data cache.
Secondary cache | The secondary direct-mapped physical data cache is external to the CPU, but usually resides on the main processor board. Block sizes for the secondary cache vary from 32 bytes to 256 bytes (depending on the type of processor). The size of the secondary cache ranges from 128 KB to 8 MB.
Tertiary cache | The tertiary cache is not available on all Alpha CPUs; otherwise, it is identical to the secondary cache.
Physical memory | The actual amount of physical memory varies.
Swap space | Swap space consists of one or more disks or disk partitions (block special devices).

The hardware logic and the PAL code control much of the movement of addresses and data among the CPU cache, the secondary and tertiary caches, and physical memory. This movement is transparent to the operating system. Figure 4-2 shows an overview of how instructions and data are moved among various hardware components during program execution.

Figure 4-2:  Moving Instructions and Data Through the Memory Hardware

Movement between caches and physical memory is significantly faster than movement between disk and physical memory, because of the relatively slow speed of disk I/O. Therefore, avoid paging and swapping operations, and ensure that applications use the caches whenever possible. Figure 4-3 shows the amount of time that it takes to access data and instructions from various hardware locations.

Figure 4-3:  Time Consumed to Access Storage Locations

For more information on the CPU, secondary cache, and tertiary cache, see the Alpha Architecture Reference Manual.



4.3    Understanding Virtual Memory

The virtual memory subsystem performs the following functions:

The following sections describe these functions in detail.



4.3.1    Allocating Virtual Address Space to Processes

For each process, the fork system call performs the following tasks:

Because memory is limited, a process' entire virtual address space cannot be in physical memory at one time. However, a process can execute when only a portion of its virtual address space (its working set) is mapped to physical memory.

For each process, the virtual memory subsystem allocates a large amount of virtual address space, but only part of this space is actually used. Of this space, 4 TB is allocated for user space; user space is generally private and maps to nonshared physical pages. An additional 4 TB of virtual address space is used for kernel space, which usually maps to shared physical pages. The remaining space is not used for any purpose.

In addition, user space is sparsely populated with valid pages. Only valid pages are able to map to physical pages. The vm-maxvas attribute specifies the maximum amount of valid virtual address space for a process (that is, the sum of all the valid pages). The default is 1 GB (131072 pages).

Figure 4-4 shows the use of process virtual address space.

Figure 4-4:  Virtual Address Space Usage



4.3.2    Translating Virtual Addresses to Physical Addresses

When a virtual page is touched or accessed, the virtual memory subsystem must locate the physical page and then translate the virtual address into a physical address. Each process has a page table, which is an array containing an entry for each current virtual-to-physical address translation. Page table entries have a direct relation to virtual pages (that is, virtual page 1 corresponds to page table entry 1) and contain a pointer to the physical page and protection information.

Figure 4-5 shows the translation of a virtual address into a physical address.

Figure 4-5:  Virtual-to-Physical Address Translation

A process' resident set is the complete set of all the virtual addresses that have been mapped to physical addresses (that is, all the pages that have been accessed during process execution). Resident set pages may be shared among multiple processes. A process' working set is the set of virtual addresses that are currently mapped to physical addresses. The working set is a subset of the resident set and represents a snapshot of the process' resident set.



4.3.3    Page Faulting

When a nonfile-backed virtual address is requested, the virtual memory subsystem locates the physical page and makes it available to the process. This process occurs at different speeds, depending on the location of the page (see Figure 4-3).

If a requested address is currently being used (active), it will have an entry in the page table. In this case, the PAL code loads the physical address into the translation lookaside buffer, which then passes the address to the CPU.

If a requested address is not active in the page table, the PAL lookup code issues a page fault, which instructs the virtual memory subsystem to locate the page and make the virtual-to-physical address translation in the page table.

If a requested virtual address is being accessed for the first time, the virtual memory subsystem performs the following tasks:

  1. Allocates an available page of physical memory.

  2. Fills the page with zeros.

  3. Enters the virtual-to-physical address translation in the page table.

This is called a zero-filled-on-demand page fault.

If a requested virtual address has already been accessed, it will be in one of the following locations:

If a process needs to modify a read-only virtual page, the virtual memory subsystem allocates an available page of physical memory, copies the read-only page into the new page, and enters the translation in the page table. This is called a copy-on-write page fault.

To improve process execution time and decrease the number of page faults, the virtual memory subsystem attempts to anticipate which pages the task will need next. Using an algorithm that checks which pages were most recently used, the number of available pages, and other factors, the subsystem maps additional pages, along with the page that contains the requested address.

The virtual memory subsystem also uses page coloring to reduce execution time. If possible, the subsystem attempts to map a process' entire resident set into the secondary cache. If the entire task (both text and data) executes within the cache, addresses do not have to be fetched from physical memory.

The private-cache-percent attribute specifies the percentage of the cache that is reserved for anonymous (nonshared) memory. The default is to reserve 50 percent of the cache for anonymous memory and 50 percent for file-backed memory (shared). To cache more anonymous memory, increase the value of the private-cache-percent attribute. This attribute is primarily used for benchmarking.



4.3.4    Managing and Tracking Pages

The virtual memory subsystem allocates physical pages to processes and the UBC, as needed. Because physical memory is limited, these pages must be periodically reclaimed so that they can be reused.

The virtual memory subsystem uses page lists to track the location and age of all the physical memory pages. At any one time, each physical page can be found on one of the following lists:

Use the vmstat command or dbx to determine the number of pages that are on the page lists. Remember that pages on the active list (the act field in the vmstat output) include both inactive and UBC LRU pages.
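For example, to watch the page lists while the system is under load, run the vmstat command at a short interval; the free field shows the pages on the free list, and the act field shows the active pages (which, as noted above, include the inactive and UBC LRU pages):

# vmstat 3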

As physical pages are allocated to processes and the UBC, the free list becomes depleted, and pages must be reclaimed in order to replenish the list. To reclaim pages, the virtual memory subsystem does the following:

See Section 4.3.5, Section 4.3.6, Section 4.3.8, and Section 4.3.9 for more information about prewriting pages, paging, and swapping.



4.3.5    Prewriting Modified Pages

The virtual memory subsystem attempts to prevent a memory shortage by prewriting modified pages to swap space.

When the virtual memory subsystem anticipates that the pages on the free list will soon be depleted, it prewrites to swap space the oldest modified (dirty) inactive pages. The value of the vm-page-prewrite-target attribute determines the number of pages that the subsystem will prewrite and keep clean. The default value is 256 pages.

In addition, when the number of modified UBC LRU pages exceeds the value of the vm-ubcdirtypercent attribute, the virtual memory subsystem prewrites to swap space the oldest modified UBC LRU pages. The default value of the vm-ubcdirtypercent attribute is 10 percent of the total UBC LRU pages.

To minimize the impact of sync (steady state flushes) when prewriting UBC pages, the ubc-maxdirtywrites attribute specifies the maximum number of disk writes that the kernel can perform each second. The default value is 5.

See Section 4.7.13 for more information about prewriting dirty pages.



4.3.6    Using Attributes to Control Paging and Swapping

When the demand for memory depletes the free list, paging begins. The virtual memory subsystem takes the oldest inactive and UBC LRU pages, moves the contents of the modified pages to swap space, and puts the clean pages on the free list, where they can be reused.

If the free page list cannot be replenished by reclaiming individual pages, swapping begins. Swapping temporarily suspends processes and moves entire resident sets to swap space, which frees large amounts of physical memory.

The point at which paging and swapping start and stop depends on the values of some virtual memory subsystem attributes. Figure 4-6 shows the default values of these attributes.

Figure 4-6:  Paging and Swapping Attributes - Default Values

Detailed descriptions of the attributes are as follows:

See Section 4.3.8 and Section 4.3.9 for information about paging and swapping operations.



4.3.7    Using Attributes to Control UBC Memory Allocation

Because the UBC shares with the virtual memory subsystem the physical pages that are not wired by the kernel, the allocation of memory to the UBC can affect file system performance and paging and swapping activity. The UBC is dynamic and consumes varying amounts of memory in order to respond to changing file system demands.

Figure 4-7 shows how memory is allocated to the UBC.

Figure 4-7:  UBC Memory Allocation

The following attributes control the amount of memory available to the UBC:



4.3.8    Paging Operation

When the memory demand is high and the number of pages on the free page list reaches the value of the vm-page-free-target attribute, the virtual memory subsystem uses paging to replenish the free page list. The page reclamation code controls paging and swapping. The page-out daemon and task swapper daemon are extensions of the page reclamation code. See Section 4.3.6 for more information about the attributes that control paging and swapping.

The page reclamation code activates the page-stealer daemon, which first reclaims the pages that the UBC has borrowed from the virtual memory subsystem, until the size of the UBC reaches the borrowing threshold (the default is 20 percent). If the reclaimed pages are dirty (modified), their contents must be written to disk before the pages can be moved to the free page list. Freeing borrowed UBC pages is a fast way to reclaim pages, because UBC pages are usually unmodified. See Section 4.3.7 for more information about UBC borrowed pages.

If freeing UBC borrowed memory does not sufficiently replenish the free list, a pageout occurs. The page-stealer daemon reclaims the oldest inactive and UBC LRU pages.

Paging becomes increasingly aggressive if the number of free pages continues to decrease. If the number of pages on the free page list falls below the value of the vm-page-free-min attribute (the default is 20 pages), a page must be reclaimed for each page allocated. To prevent deadlocks, if the number of pages on the free page list falls below the value of the vm-page-free-reserved attribute (the default is 10 pages), only privileged tasks can get memory until the free page list is replenished.

Paging stops when the number of pages on the free list reaches the value of the vm-page-free-target attribute.

If paging individual pages does not replenish the free list, swapping is used to free a large amount of memory. See Section 4.3.9 for more information.

Figure 4-8 shows the movement of pages during paging operations.

Figure 4-8:  Paging Operation



4.3.9    Swapping Operation

If there is a high demand for memory, the virtual memory subsystem may be unable to replenish the free list by reclaiming pages. Swapping reduces the demand for physical memory by suspending processes, which dramatically increases the number of pages on the free list. To swap out a process, the task swapper suspends the process, writes its resident set to swap space, and moves the clean pages to the free list.

Idle task swapping begins when the number of pages on the free list falls below the value of the vm-page-free-swap attribute for a period of time (the default is 74 pages). The task swapper suspends all tasks that have been idle for 30 seconds or more.

If the number of pages on the free list falls below the value of the vm-page-free-optimal attribute (the default is 74 pages) for more than five seconds, hard swapping begins. The task swapper suspends, one at a time, the tasks with the lowest priority and the largest resident set size.

Swapping stops when the number of pages on the free list reaches the value of the vm-page-free-hardswap attribute (the default is 1280).

A swapin occurs when the number of pages on the free list reaches the value of the vm-page-free-optimal attribute for a period of time. The task's working set is paged in from swap space, and the task can then resume execution. The value of the vm-inswappedmin attribute specifies the minimum amount of time, in seconds, that a task must remain in the inswapped state before it can be outswapped. The default value is 1 second.

Swapping has a serious impact on system performance. You can modify the attributes described in Section 4.3.6 to control when swapping starts and stops.

Increasing the rate of swapping (swapping earlier during page reclamation) increases throughput: as more processes are swapped out, fewer processes compete for memory at any one time, so the processes that remain resident complete more work. Although increasing the rate of swapping moves long-sleeping threads out of memory and frees memory, it degrades interactive response time, because an outswapped process experiences a long latency when it is needed again.

If you decrease the rate of swapping (swap later during page reclamation), you will improve interactive response time, but at the cost of throughput.



4.3.10    Using Swap Buffers

To facilitate the movement of data between memory and disk, the virtual memory subsystem uses synchronous and asynchronous swap buffers. The virtual memory subsystem uses these two types of buffers to immediately satisfy a page-in request without having to wait for the completion of a page-out request, which is a relatively slow process.

Synchronous swap buffers are used for page-in page faults and for swap outs. Asynchronous swap buffers are used for asynchronous pageouts and for prewriting modified pages. See Section 4.7.15 and Section 4.7.16 for tuning information.



4.4    Understanding the Unified Buffer Cache

The DIGITAL UNIX operating system uses the Unified Buffer Cache (UBC) as a layer between the operating system and disk. The UBC holds actual file data, which includes reads and writes from conventional file activity and page faults from mapped file sections, and AdvFS metadata. The cache can improve I/O performance by decreasing the number of disk I/O operations.

The UBC shares with the virtual memory subsystem the physical pages that are not wired by the kernel. The maximum and minimum percentages of memory that the UBC can utilize are specified by the ubc-maxpercent attribute (the default is 100 percent) and the ubc-minpercent attribute (the default is 10 percent). In addition, the ubc-borrowpercent attribute specifies the percentage of memory allocated to the UBC above which the memory is only borrowed from the virtual memory subsystem. The default is 20 percent of physical memory. See Section 4.3.7 for more information.

The UBC is dynamic and consumes varying amounts of memory in order to respond to changing file system demands. For example, if file system activity is heavy, pages will be allocated to the UBC up to the value of the ubc-maxpercent attribute. In contrast, heavy process activity, such as large increases in the working sets for large executables, will cause the virtual memory subsystem to reclaim UBC borrowed pages. Figure 4-7 shows the allocation of physical memory to the UBC.

The UBC uses a hashed list to quickly locate the physical pages that it is holding. A hash table contains file and offset information that is used to speed lookup operations.

The UBC also uses a buffer to facilitate the movement of data between memory and disk. The vm-ubcbuffers attribute specifies the maximum file system device I/O queue depth for writes (that is, the number of UBC I/O requests that can be outstanding). See Section 4.7.17 for tuning information.



4.5    Understanding the Metadata Buffer Cache

The metadata buffer cache is part of kernel wired memory and is used to cache only UFS and CDFS metadata, which includes file header information, superblocks, inodes, indirect blocks, directory blocks, and cylinder group summaries. The DIGITAL UNIX operating system uses the metadata buffer cache as a layer between the operating system and disk. The cache can improve I/O performance by decreasing disk I/O operations.

The metadata buffer cache is configured at boot time and uses bcopy routines to move data in and out of memory. The size of the metadata buffer cache is specified by the value of the bufcache attribute. See Section 4.9 for tuning information.



4.6    Configuring Memory and Swap Space

The following sections describe how to configure memory and swap space, which includes the following tasks:



4.6.1    Determining Your Physical Memory Requirements

This section describes how to determine your system's memory requirements. The amount of memory installed in your system must be able to provide an acceptable level of user and application performance.

To determine your system's memory requirements, you must gather the following information:

See Section 4.6.2 for information about swap space requirements.



4.6.2    Configuring Swap Space

Your system's performance depends on the swap space configuration. DIGITAL recommends a minimum of 128 MB for swap space.

To calculate the swap space required by your system and workload, compare the total modifiable virtual address space (anonymous memory) required by your processes with the total amount of physical memory. Modifiable virtual address space holds data elements and structures that are modified during process execution, such as heap space, stack space, and data space.

To calculate swap space requirements if you are using immediate mode, total the anonymous memory requirements for all processes and then add 10 percent of that value. If you are using deferred mode, total the anonymous memory requirements for all processes and then divide by two.
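For example, suppose the processes on a system require a total of 2 GB of anonymous memory (a hypothetical figure):

Immediate mode:  2 GB + (10% of 2 GB) = 2.2 GB of swap space
Deferred mode:   2 GB / 2             = 1.0 GB of swap space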

Application messages, such as the following, usually indicate that not enough swap space is configured into the system or that a process limit has been reached:

"lack of paging space"
"swap space below 10 percent free"

Use multiple disks for swap space. The page reclamation code uses a form of disk striping (known as swap space interleaving) so that pages can be written to the multiple disks. To optimize swap space, ensure that all your swap disks are configured when you boot the system, instead of adding swap space while the system is running.

Use the swapon -s command to display your swap space configuration. The first line displayed is the total allocated swap space. Use the iostat command to display disk usage.
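For example:

# swapon -s
# iostat 5

As noted above, the first line of the swapon -s output is the total allocated swap space; the iostat output shows whether the swap I/O is spread evenly across the disks.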

The following list describes how to configure swap space for high performance:

See Chapter 5 for more information about configuring and tuning swap disks for high performance and availability.



4.6.3    Choosing a Swap Space Allocation Mode

There are two methods that you can use to allocate swap space. The methods differ in the point in time at which the virtual memory subsystem reserves swap space for a process. There is no performance benefit attached to either method; however, deferred mode is recommended for very-large memory/very-large database (VLM/VLDB) systems. The swap allocation methods are as follows:

See the System Administration manual for more information on swap space allocation methods.



4.7    Tuning Virtual Memory

The virtual memory subsystem is a primary source of performance problems. Performance may degrade if the virtual memory subsystem cannot keep up with the demand for memory and excessive paging and swapping occurs. A memory bottleneck may cause a disk I/O bottleneck, because excessive paging and swapping decreases performance and indicates that the natural working set size has exceeded the available memory. The virtual memory subsystem runs at a high priority when servicing page faults, which blocks the execution of other processes.

If you have excessive page-in and page-out activity from a swap partition, the system may have a high physical memory commitment ratio. Excessive paging also can increase the miss rate for the secondary cache, and may be indicated by the following output:

The tuning recommendations that will provide the best performance benefit involve the following two areas:

Table 4-2 describes the primary tuning guidelines and lists the performance benefits and tradeoffs of each.

Table 4-2:  Primary Virtual Memory Tuning Guidelines

Action | Performance Benefit | Tradeoff
Reduce the number of processes running at the same time (Section 4.7.1) | Reduces demand for memory | None
Reduce the static size of the kernel (Section 4.7.2) | Reduces demand for memory | None
Increase the available address space (Section 4.7.3) | Improves performance for memory-intensive processes | Slightly increases the demand for memory
Increase the available system resources (Section 4.7.4) | Improves performance for memory-intensive processes | Increases wired memory
Increase the maximum number of memory-mapped files that are available to a process (Section 4.7.5) | Increases file mapping and improves performance for memory-intensive processes, such as Internet servers | Consumes memory
Increase the maximum number of virtual pages within a process' address space that can have individual protection attributes (Section 4.7.6) | Improves performance for memory-intensive processes and for Internet servers that maintain large tables or resident images | Consumes memory
Increase the size of a System V message and queue (Section 4.7.7) | Improves performance for memory-intensive processes | Consumes memory
Increase the maximum size of a single System V shared memory region (Section 4.7.8) | Improves performance for memory-intensive processes | Consumes memory
Increase the minimum size of a System V shared memory segment (Section 4.7.9) | Improves performance for VLM and VLDB systems | Consumes memory
Reduce process memory requirements (Section 4.7.10) | Reduces demand for memory | None
Reduce the amount of physical memory available to the UBC (Section 4.7.11) | Provides more memory resources to processes | May degrade file system performance
Increase the rate of swapping (Section 4.7.12) | Frees memory and increases throughput | Decreases interactive response performance
Decrease the rate of swapping (Section 4.7.12) | Improves interactive response performance | Decreases throughput
Increase the rate of dirty page prewriting (Section 4.7.13) | Prevents drastic performance degradation when memory is exhausted | Decreases peak workload performance
Decrease the rate of dirty page prewriting (Section 4.7.13) | Improves peak workload performance | May cause drastic performance degradation when memory is exhausted

If the previous tasks do not sufficiently improve performance, there are advanced tuning tasks that you can perform. The advanced tuning tasks include the following:

Table 4-3 describes the advanced tuning guidelines and lists the performance benefits and tradeoffs of each.

Table 4-3:  Advanced Virtual Memory Tuning Guidelines

Action | Performance Benefit | Tradeoff
Increase the size of the page-in and page-out clusters (Section 4.7.14) | Improves peak workload performance | Decreases total system workload performance
Decrease the size of the page-in and page-out clusters (Section 4.7.14) | Improves total system workload performance | Decreases peak workload performance
Increase the swap device I/O queue depth for pageins and swapouts (Section 4.7.15) | Increases overall system throughput | Consumes memory
Decrease the swap device I/O queue depth for pageins and swapouts (Section 4.7.15) | Improves the interactive response time and frees memory | Decreases system throughput
Increase the swap device I/O queue depth for pageouts (Section 4.7.16) | Frees memory and increases throughput | Decreases interactive response performance
Decrease the swap device I/O queue depth for pageouts (Section 4.7.16) | Improves interactive response time | Consumes memory
Increase the UBC write device queue depth (Section 4.7.17) | Increases overall file system throughput and frees memory | Decreases interactive response performance
Decrease the UBC write device queue depth (Section 4.7.17) | Improves interactive response time | Consumes memory
Increase the amount of UBC memory used to cache a large file (Section 4.7.18) | Improves large file performance | May allow a large file to consume all the pages on the free list
Decrease the amount of UBC memory used to cache a large file (Section 4.7.18) | Prevents a large file from consuming all the pages on the free list | May degrade large file performance
Increase the paging threshold (Section 4.7.19) | Maintains performance when free memory is exhausted | May waste memory
Enable aggressive swapping (Section 4.7.20) | Improves system throughput | Degrades interactive response performance
Decrease the size of the metadata buffer cache (Section 4.7.21) | Provides more memory resources to processes on large systems | May degrade UFS performance
Decrease the size of the namei cache (Section 4.7.22) | Decreases demand for memory | May slow lookup operations and degrade file system performance
Decrease the amount of memory allocated to the AdvFS cache (Section 4.7.23) | Provides more memory resources to processes | May degrade AdvFS performance
Reserve physical memory for shared memory (Section 4.7.24) | Improves shared memory detach time | Decreases the memory available to the virtual memory subsystem and the UBC

The following sections describe these guidelines in detail.



4.7.1    Reducing the Number of Processes Running Simultaneously

You can improve performance and reduce the demand for memory by running fewer applications simultaneously. Use the at or the batch command to run applications during off-peak hours.



4.7.2    Reducing the Static Size of the Kernel

You can reduce the static size of the kernel by deconfiguring any unnecessary subsystems. Use the setld command to display the installed subsets and to delete subsets.

Use the sysconfig command to display the configured subsystems and to delete subsystems.
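For example, the following sequence lists what is installed and configured and then removes an unneeded subset and subsystem (the names subset_name and subsystem_name are placeholders; substitute the names reported on your system):

# setld -i
# setld -d subset_name
# sysconfig -s
# sysconfig -u subsystem_name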



4.7.3    Increasing the Available Address Space

If your applications are memory-intensive, you may want to increase the available address space. Increasing the address space will cause only a small increase in the demand for memory. However, you may not want to increase the address space if your applications use many forked processes.

The following attributes determine the available address space for processes:

You can use the setrlimit function to control the consumption of system resources by a parent process and its child processes. See setrlimit(2) for information.



4.7.4    Increasing the Available System Resources

If your applications are memory-intensive, you may want to increase the system resources that are available to processes. Be careful when increasing the system resources, because this will increase the amount of wired memory in the system.

The following attributes affect system resources:

You can use the setrlimit function to control the consumption of system resources by a parent process and its child processes. See setrlimit(2) for information.



4.7.5    Increasing the Number of Memory-Mapped Files

The vm-mapentries attribute specifies the maximum number of memory-mapped files in a user address space. Each map entry describes one unique, disjoint portion of a virtual address space. The default value is 200.

You may want to increase the value of the vm-mapentries attribute for VLM systems. Because Web servers map files into memory, for busy systems running multithreaded Web server software, you may want to increase the value to 20000. This will increase the limit on file mapping. This attribute affects all processes, and increasing its value will increase the demand for memory.
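For example, a possible /etc/sysconfigtab entry for a busy Web server, using the value suggested above, is:

vm:
    vm-mapentries = 20000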



4.7.6    Increasing the Number of Pages With Individual Protections

The vm-vpagemax attribute specifies the maximum number of virtual pages within a process' address space that can be given individual protection attributes. These protection attributes differ from the protection attributes associated with the other pages in the address space.

Changing the protection attributes of a single page within a virtual memory region causes all pages within that region to be treated as though they had individual protection attributes. For example, each thread of a multithreaded task has a user stack in the stack region for the process in which it runs. Because multithreaded tasks have guard pages (that is, pages that do not have read/write access) inserted between the user stacks for the threads, all pages in the stack region for the process are treated as though they have individual protection attributes.

The default value of the vm-vpagemax attribute is determined by dividing the value of the vm-maxvas attribute (the address space size in bytes) by 8192. If a stack region for a multithreaded task exceeds 16 KB pages, you may want to increase the value of the vm-vpagemax attribute. For example, if the value of the vm-maxvas attribute is 1 GB (the default), set the value of vm-vpagemax to 131072 pages (1073741824/8192=131072). This value improves the efficiency of Web servers that maintain large tables or resident images.

You may want to increase the value of the vm-vpagemax attribute for VLM systems. However, this attribute affects all processes, and increasing its value will increase the demand for memory.



4.7.7    Increasing the Size of a System V Message and Queue

If your applications are memory-intensive or you have a VLM system, you may want to increase the value of the msg-max attribute. This attribute specifies the maximum size of a single System V message. However, increasing the value of this attribute will increase the demand for memory. The default value is 8192 bytes (1 page).

In addition, you may want to increase the value of the msg-tql attribute. This attribute specifies the maximum number of messages that can be queued to a single System V message queue at one time. However, increasing the value of this attribute will increase the demand for memory. The default value is 40.
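For example, to double both defaults you might use a stanza such as the following (the values are illustrations only, and the ipc subsystem name is an assumption based on the usual grouping of System V attributes):

ipc:
    msg-max = 16384
    msg-tql = 80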



4.7.8    Increasing the Size of a System V Shared Memory Region

If your applications are memory-intensive or you have a VLM system, you may want to increase the value of the shm-max attribute. This attribute specifies the maximum size of a single System V shared memory region. However, increasing the value of this attribute will increase the demand for memory. The default value is 4194304 bytes (512 pages).

In addition, you may want to increase the value of the shm-seg attribute. This attribute specifies the maximum number of System V shared memory regions that can be attached to a single process at any point in time. However, increasing the value of this attribute will increase the demand for memory. The default value is 32.



4.7.9    Increasing the Minimum Size of a System V Shared Memory Segment

If your applications are memory-intensive, you may want to increase the value of the ssm-threshold attribute. Page table sharing occurs when the size of a System V shared memory segment reaches the value specified by this attribute. However, increasing the value of this attribute will increase the demand for memory.



4.7.10    Reducing Application Memory Requirements

You may want to reduce your applications' use of memory to free memory for other purposes. Follow these coding considerations to reduce your applications' use of memory:

See the Programmer's Guide for more information on process memory allocation.



4.7.11    Reducing the Memory Available to the UBC

You may be able to improve performance by reducing the maximum percentage of memory available for the UBC. If you decrease the maximum size of the UBC, you increase the amount of memory available to the virtual memory subsystem, which may reduce the paging and swapping rate. However, reducing the memory allocated to the UBC may adversely affect I/O performance because the UBC will hold less file system data, which results in more disk I/O operations. Therefore, do not significantly decrease the maximum size of the UBC.

The maximum amount of memory that can be allocated to the UBC is specified by the ubc-maxpercent attribute. The default is 100 percent. The minimum amount of memory that can be allocated to the UBC is specified by the ubc-minpercent attribute. The default is 10 percent. If you have an Internet server, use these default values.

If the page-out rate is high and you are not using the file system heavily, decreasing the value of the ubc-maxpercent attribute may reduce the rate of paging and swapping. Start with the default value of 100 percent and decrease the value in increments of 10. If the values of the ubc-maxpercent and ubc-minpercent attributes are close together, you may seriously degrade I/O performance or cause the system to page excessively.

Use the vmstat command to determine whether the system is paging excessively. Using dbx, periodically examine the vpf_pgiowrites and vpf_ubcalloc fields of the vm_perfsum kernel structure. If pageouts (vpf_pgiowrites) greatly exceed UBC allocations (vpf_ubcalloc), decreasing the value of the ubc-maxpercent attribute may reduce the page-out rate.

You also may be able to prevent paging by increasing the percentage of memory that the UBC borrows from the virtual memory subsystem. To do this, decrease the value of the ubc-borrowpercent attribute. Decreasing the value of the ubc-borrowpercent attribute allows less memory to remain in the UBC when page reclamation begins. This can reduce the UBC effectiveness, but may improve the system response time when a low-memory condition occurs. The value of the ubc-borrowpercent attribute can range from 0 to 100. The default value is 20 percent.
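For example, to try the first 10 percent reduction in the maximum UBC size and a smaller borrowing threshold on a running system (illustrative values; reevaluate the paging behavior with vmstat after each change):

# sysconfig -r vm ubc-maxpercent=90
# sysconfig -r vm ubc-borrowpercent=10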



4.7.12    Changing the Rate of Swapping

Swapping has a drastic impact on system performance. You can modify attributes to control when swapping begins and ends. Increasing the rate of swapping (swapping earlier during page reclamation) moves long-sleeping threads out of memory, frees memory, and increases throughput: as more processes are swapped out, fewer processes compete for memory at any one time, so the processes that remain resident complete more work. However, when an outswapped process is needed again, it experiences a long latency, so increasing the rate of swapping degrades interactive response time.

In contrast, if you decrease the rate of swapping (swap later during page reclamation), you will improve interactive response time, but at the cost of throughput.

To increase the rate of swapping, increase the value of the vm-page-free-optimal attribute (the default is 74 pages). Increase the value only by 2 pages at a time. Do not specify a value that is more than the value of the vm-page-free-target attribute (the default is 128).

To decrease the rate of swapping, decrease the value of the vm-page-free-optimal attribute by 2 pages at a time. Do not specify a value that is less than the value of the vm-page-free-min attribute (the default is 20).
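For example, to swap slightly earlier than the default (a single 2-page step, staying well below the vm-page-free-target value):

# sysconfig -r vm vm-page-free-optimal=76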



4.7.13    Controlling Dirty Page Prewriting

The virtual memory subsystem attempts to prevent a memory shortage by prewriting modified pages to swap space. When the virtual memory subsystem anticipates that the pages on the free list will soon be depleted, it prewrites to swap space the oldest modified (dirty) pages on the inactive list. To reclaim a page that has been prewritten, the virtual memory subsystem only needs to validate the page.

Increasing the rate of dirty page prewriting will reduce peak workload performance, but it will prevent a drastic performance degradation when memory is exhausted. Decreasing the rate will improve peak workload performance, but it will cause a drastic performance degradation when memory is exhausted.

You can control the rate of dirty page prewriting by modifying the values of the vm-page-prewrite-target attribute and the vm-ubcdirtypercent attribute.

The vm-page-prewrite-target attribute specifies the number of virtual memory pages that the subsystem will prewrite and keep clean. The default value is 256 pages. To increase the rate of virtual memory dirty page prewriting, increase the value of the vm-page-prewrite-target attribute from the default value (256) by increments of 64 pages.

The vm-ubcdirtypercent attribute specifies the percentage of UBC LRU pages that can be modified before the virtual memory subsystem prewrites the dirty UBC LRU pages. The default value is 10 percent of the total UBC LRU pages (that is, 10 percent of the UBC LRU pages must be dirty before the UBC LRU pages are prewritten). To increase the rate of UBC LRU dirty page prewriting, decrease the value of the vm-ubcdirtypercent attribute by increments of 1 percent.

In addition, you may want to minimize the impact of I/O spikes caused by the sync function when prewriting UBC LRU dirty pages. The value of the ubc-maxdirtywrites attribute specifies the maximum number of disk writes that the kernel can perform each second. The default value of the ubc-maxdirtywrites attribute is 5 I/O operations per second.

To minimize the impact of sync (steady state flushes) when prewriting dirty UBC LRU pages, increase the value of the ubc-maxdirtywrites attribute.
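For example, the following run-time changes increase the prewrite rates by the increments recommended above and double the sync write limit (illustrative values; all three attributes are assumed to be in the vm subsystem):

# sysconfig -r vm vm-page-prewrite-target=320
# sysconfig -r vm vm-ubcdirtypercent=9
# sysconfig -r vm ubc-maxdirtywrites=10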



4.7.14    Modifying the Size of the Page-In and Page-Out Clusters

The virtual memory subsystem reads in and writes out additional pages in an attempt to anticipate pages that it will need.

The vm-max-rdpgio-kluster attribute specifies the maximum size of an anonymous page-in cluster. The default value is 16 KB (2 pages). If you increase the value of this attribute, the system will spend less time page faulting because more pages will be in memory. This will increase the peak workload performance, but will consume more memory and decrease the total system workload performance.

Decreasing the value of the vm-max-rdpgio-kluster attribute will conserve memory and increase the total system workload performance, but will increase paging and decrease the peak workload performance.

The vm-max-wrpgio-kluster attribute specifies the maximum size of an anonymous page-out cluster. The default value is 32 KB (4 pages). Increasing the value of this attribute improves the peak workload performance and conserves memory, but causes more pageins and decreases the total system workload performance.

Decreasing the value of the vm-max-wrpgio-kluster attribute improves the total system workload performance and decreases the number of pageins, but decreases the peak workload performance and consumes more memory.



4.7.15    Modifying the Swap I/O Queue Depth for Pageins and Swapouts

Synchronous swap buffers are used for page-in page faults and for swapouts. The vm-syncswapbuffers attribute specifies the maximum swap device I/O queue depth for pageins and swapouts.

You can modify the value of the vm-syncswapbuffers attribute. The value should be equal to the approximate number of simultaneously running processes that the system can easily handle. The default is 128.

Increasing the swap device I/O queue depth increases overall system throughput, but consumes memory.

Decreasing the swap device I/O queue depth decreases memory demands and improves interactive response time, but decreases overall system throughput.



4.7.16    Modifying the Swap I/O Queue Depth for Pageouts

Asynchronous swap buffers are used for asynchronous pageouts and for prewriting modified pages. The vm-asyncswapbuffers attribute controls the maximum depth of the swap device I/O queue for pageouts.

The value of the vm-asyncswapbuffers attribute should be the approximate number of I/O transfers that a swap device can handle at one time. The default value is 4.

Increasing the queue depth will free memory and increase the overall system throughput.

Decreasing the queue depth will use more memory, but will improve the interactive response time.

If you are using LSM, you may want to increase the page-out rate. Be careful if you increase the value of the vm-asyncswapbuffers attribute, because this will cause page-in requests to lag asynchronous page-out requests.



4.7.17    Modifying the UBC Write Device Queue Depth

The UBC uses a buffer to facilitate the movement of data between memory and disk. The vm-ubcbuffers attribute specifies the maximum file system device I/O queue depth for writes. The default value is 256.

Increasing the UBC write device queue depth frees memory and increases the overall file system throughput.

Decreasing the UBC write device queue depth increases memory demands, but improves the interactive response time.



4.7.18    Controlling Large File Caching

If a large file completely fills the UBC, it may take all of the pages on the free page list, which may cause the system to page excessively. The vm-ubcseqpercent attribute specifies the maximum amount of memory allocated to the UBC that can be used to cache a file. The default value is 10 percent of memory allocated to the UBC.

The vm-ubcseqstartpercent attribute specifies the size of the UBC as a percentage of physical memory, at which time the virtual memory subsystem starts stealing the UBC LRU pages for a file to satisfy the demand for pages. The default is 50 percent of physical memory.

Increasing the value of the vm-ubcseqpercent attribute will improve the performance of a large single file, but decrease the remaining amount of memory.

Decreasing the value of the vm-ubcseqpercent attribute will increase the available memory, but will degrade the performance of a large single file.

To force the system to reuse the pages in the UBC instead of taking pages from the free list, perform the following tasks:

For example, using the default values, the UBC would have to be larger than 50 percent of all memory and a file would have to be larger than 10 percent of the UBC (that is, the file size would have to be at least 5 percent of all memory) in order for the system to reuse the pages in the UBC.

On large-memory systems that are doing a lot of file system operations, you may want to lower the vm-ubcseqstartpercent value to 30 percent. Do not specify a lower value unless you decrease the size of the UBC. In this case, do not change the value of the vm-ubcseqpercent attribute.
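For example, a possible /etc/sysconfigtab entry for a large-memory file server, using the value suggested above and leaving vm-ubcseqpercent at its default, is:

vm:
    vm-ubcseqstartpercent = 30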



4.7.19    Increasing the Paging Threshold

The vm-page-free-target attribute specifies the minimum number of pages on the free list before paging starts. The default value is 128 pages.

Increasing the value of the vm-page-free-target attribute will increase the paging activity but may improve performance when free memory is exhausted. If you increase the value, start at the default value (128 pages or 1 MB) and then double the value. Do not specify a value above 1024 pages (8 MB). A high value can waste memory.
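For example, to double the threshold from its default:

# sysconfig -r vm vm-page-free-target=256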

Do not decrease the value of the vm-page-free-target attribute unless you have a lot of memory or you experience a serious performance degradation when free memory is exhausted.



4.7.20    Enabling Aggressive Task Swapping

You can enable the vm-aggressive attribute (set the value to 1) to allow the virtual memory subsystem to aggressively swap out processes when memory is needed. This improves system throughput, but degrades the interactive response performance.

By default, the vm-aggressive attribute is disabled (set to 0), which results in less aggressive swapping. In this case, processes are swapped in at a faster rate than if aggressive swapping is enabled.



4.7.21    Decreasing the Size of the Metadata Buffer Cache

The metadata buffer cache contains recently accessed UFS and CDFS metadata. On large-memory systems with a high cache hit rate, you may want to decrease the size of the metadata buffer cache. This will increase the amount of memory that is available to the virtual memory subsystem. However, decreasing the size of the cache may degrade UFS performance.

The bufcache attribute specifies the percentage of physical memory that the kernel wires for the metadata buffer cache. The default size of the metadata buffer cache is 3 percent of physical memory. You can decrease the value of the bufcache attribute to a minimum of 1 percent.

For systems that use only AdvFS, set the value of the bufcache attribute to 1 percent.
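For example, a possible /etc/sysconfigtab entry for an AdvFS-only system is the following (the vfs subsystem name is an assumption; confirm the subsystem that owns the bufcache attribute on your system with sysconfig -q):

vfs:
    bufcache = 1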



4.7.22    Decreasing the Size of the namei Cache

The namei cache is used by all file systems to map file pathnames to inodes. Use dbx to monitor the cache by examining the nchstats structure.

To free memory resources, decrease the number of elements in the namei cache by decreasing the value of the name-cache-size attribute. The default values are 2*nvnode*11/10 (for 32-MB or larger systems) and 150 (for 24-MB systems). The maximum value is 2*max-vnodes*11/10.



4.7.23    Decreasing the Size of the AdvFS Buffer Cache

To free memory resources, you may want to decrease the percentage of physical memory allocated to the AdvFS buffer cache.

The AdvfsCacheMaxPercent attribute determines the maximum amount of physical memory that can be used for the AdvFS buffer cache. The default is 7 percent of memory. However, decreasing the size of the AdvFS buffer cache may adversely affect AdvFS I/O performance.



4.7.24    Reserving Physical Memory for Shared Memory

Granularity hints allow you to reserve a portion of dynamically wired physical memory at boot time for shared memory. Granularity hints allow the translation lookaside buffer to map more than a single page and enable shared page table entry functionality, which reduces the number of translation lookaside buffer misses.

On typical database servers, using granularity hints provides a 2 to 4 percent run-time performance gain and reduces the shared memory detach time. In most cases, use the Segmented Shared Memory (SSM) functionality (the default) instead of the granularity hints functionality.

To enable granularity hints, you must specify a value for the gh-chunks attribute. To make granularity hints more effective, modify applications to ensure that both the shared memory segment starting address and size are aligned on an 8-MB boundary.

Section 4.7.24.1 and Section 4.7.24.2 describe how to enable granularity hints.



4.7.24.1    Tuning the Kernel to Use Granularity Hints

To use granularity hints, you must specify the number of 4-MB chunks of physical memory to reserve for shared memory at boot time. This memory cannot be used for any other purpose and cannot be returned to the system or reclaimed.

To reserve memory for shared memory, specify a nonzero value for the gh-chunks attribute. For example, if you want to reserve 4 GB of memory, specify 1024 for the value of gh-chunks (1024 * 4 MB = 4 GB). If you specify a value of 512, you will reserve 2 GB of memory.
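For example, to reserve 2 GB, you could add a boot-time entry such as the following to /etc/sysconfigtab (this attribute cannot be changed with sysconfig -r on a running system, because the memory is reserved at boot time):

vm:
    gh-chunks = 512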

The value you specify for the gh-chunks attribute depends on your database application. Do not reserve an excessive amount of memory, because reserving memory decreases the memory available to the virtual memory subsystem and the UBC.

You can determine if you have reserved the appropriate amount of memory. For example, you can initially specify 512 for the value of the gh-chunks attribute. Then, invoke the following sequence of dbx commands while running the application that allocates shared memory:

# dbx -k /vmunix /dev/mem
 
(dbx) px &gh_free_counts
0xfffffc0000681748
(dbx) 0xfffffc0000681748/4X
fffffc0000681748:  0000000000000402 0000000000000004
fffffc0000681758:  0000000000000000 0000000000000002
(dbx)

The output shows the following:

To save memory, you can reduce the value of the gh-chunks attribute until only one or two 512-page chunks are free while the application that uses shared memory is running.

The following attributes also affect granularity hints:

In addition, messages indicating unaligned size and unaligned attach address requests are displayed on the system console. The unaligned attach messages are limited to one for each shared memory segment.



4.7.24.2    Modifying Applications to Use Granularity Hints

You can make granularity hints more effective by making both the shared memory segment starting address and size aligned on an 8-MB boundary.

To share Level 3 page table entries, the shared memory segment attach address (specified by the shmat function) and the shared memory segment size (specified by the shmget function) must be aligned on an 8-MB boundary. This means that the lowest 23 bits of both the address and the size must be zero.

The attach address and the shared memory segment size are specified by the application. In addition, System V shared memory semantics allow a maximum shared memory segment size of 2 GB minus 1 byte. Applications that need shared memory segments larger than 2 GB can construct these regions by using multiple segments. In this case, the total shared memory size specified by the user to the application must be 8-MB aligned. In addition, the value of the shm-max attribute, which specifies the maximum size of a System V shared memory segment, must be 8-MB aligned.

If the total shared memory size specified to the application is greater than 2 GB, you can specify a value of 2139095040 (or 0x7f800000) for the value of the shm-max attribute. This is the maximum value (2 GB minus 8 MB) that you can specify for the shm-max attribute and still share page table entries.
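For example, a possible /etc/sysconfigtab entry that sets this value is the following (the ipc subsystem name is an assumption, as with the other System V attributes mentioned in this chapter):

ipc:
    shm-max = 2139095040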

Use the following dbx command sequence to determine if page table entries are being shared:

# dbx -k /vmunix /dev/mem
 
(dbx) p *(vm_granhint_stats *)&gh_stats_store
	struct {
	    total_mappers = 21
	    shared_mappers = 21
	    unshared_mappers = 0
	    total_unmappers = 21
	    shared_unmappers = 21
	    unshared_unmappers = 0
	    unaligned_mappers = 0
	    access_violations = 0
	    unaligned_size_requests = 0
	    unaligned_attachers = 0
	    wired_bypass = 0
	    wired_returns = 0
	} 
	(dbx)

For the best performance, the shared_mappers kernel variable should be equal to the number of shared memory segments, and the unshared_mappers, unaligned_attachers, and unaligned_size_requests variables should be 0 (zero).

Because of how shared memory is divided into shared memory segments, there may be some unshared segments. This occurs when the starting address or the size is not aligned on an 8-MB boundary, and this condition may be unavoidable in some cases. In many cases, the value of total_unmappers will be greater than the value of total_mappers.

Shared memory locking uses a hashed array of locks instead of a single lock. The size of the hashed array can be changed by modifying the value of the vm-page-lock-count attribute. The default value is 64.


[Contents] [Prev. Chapter] [Prev. Section] [Next Section] [Next Chapter] [Index] [Help]

4.8    Tuning the UBC

The UBC and the virtual memory subsystem compete for the physical memory that is not wired by the kernel. You may be able to improve file system performance by tuning the UBC. However, increasing the amount of memory available to the UBC will affect the virtual memory subsystem and may increase the rate of paging and swapping.

The amount of memory allocated to the UBC is determined by the ubc-maxpercent, ubc-minpercent, and ubc-borrowpercent attributes. You may be able to improve performance by modifying the value of these attributes, which are described in Section 4.4.
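
The following sketch shows how these attributes might appear in /etc/sysconfigtab (the values shown are the defaults given in this chapter, not tuning recommendations; see Section 2.11 for the supported ways to modify system attributes):

vm:
        ubc-minpercent = 10
        ubc-maxpercent = 100
        ubc-borrowpercent = 20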

The following output may indicate that the size of the UBC is too small for your configuration:

The UBC is flushed by the update daemon. You can monitor UBC usage, including the lookup hit ratio, by using dbx -k to examine the vm_perfsum and ufs_getapage_stats structures. See Chapter 2 for information about monitoring the UBC.

You can improve UBC performance by following the guidelines described in Table 4-4. You can also improve file system performance by following the guidelines described in Chapter 5.

Table 4-4:  Guidelines for Tuning the UBC

Action Performance Benefit Tradeoff
Increase the memory allocated to the UBC (Section 4.8.1) Improves file system performance May cause excessive paging and swapping
Decrease the amount of memory borrowed by the UBC (Section 4.8.2) Improves file system performance Decreases the memory available for processes and may decrease system response time
Increase the minimum size of the UBC (Section 4.8.3) Improves file system performance Decreases the memory available for processes
Modify the application to use mmap (Section 4.8.4) Decreases memory requirements None
Increase the UBC write device queue depth (Section 4.7.17) Increases overall file system throughput and frees memory Decreases interactive response performance
Decrease the UBC write device queue depth (Section 4.7.17) Improves interactive response time Consumes memory

The following sections describe these guidelines in detail.


[Contents] [Prev. Chapter] [Prev. Section] [Next Section] [Next Chapter] [Index] [Help]

4.8.1    Increasing the Maximum Size of the UBC

If an insufficient amount of memory is allocated to the UBC, I/O performance may be degraded. Allocating more memory to the UBC increases the chance that data will be found in the cache, which prevents the system from having to copy the data from disk and may improve I/O performance. However, allocating more memory to the UBC may cause excessive paging and swapping.

To increase the maximum amount of memory allocated to the UBC, you can increase the value of the ubc-maxpercent attribute. The default value is 100 percent. However, the performance of an application that generates a lot of random I/O will not be improved by enlarging the UBC because the next access location for random I/O cannot be predetermined. See Section 4.3.7 for information about UBC memory allocation.


[Contents] [Prev. Chapter] [Prev. Section] [Next Section] [Next Chapter] [Index] [Help]

4.8.2    Decreasing the Amount of Borrowed Memory

If vmstat output shows excessive paging but few or no page-outs, you may want to increase the value of the ubc-borrowpercent attribute. This situation can occur on low-memory systems (for example, systems with 24 MB of memory), because they reclaim UBC pages more aggressively than systems with more memory.

The UBC borrows all physical memory above the value of the ubc-borrowpercent attribute and up to the value of the ubc-maxpercent attribute. Increasing the value of the ubc-borrowpercent attribute allows more memory to remain in the UBC when page reclamation begins. This can increase the UBC cache effectiveness, but may degrade system response time when a low-memory condition occurs. The value of the ubc-borrowpercent attribute can range from 0 to 100. The default value is 20 percent. See Section 4.3.7 for information about UBC memory allocation.


[Contents] [Prev. Chapter] [Prev. Section] [Next Section] [Next Chapter] [Index] [Help]

4.8.3    Increasing the Minimum Size of the UBC

Increasing the value of the ubc-minpercent attribute guarantees the UBC a larger minimum percentage of memory and prevents large programs from depleting the UBC. For I/O servers, you may want to raise the value of the ubc-minpercent attribute to ensure that memory is available for the UBC. The default value is 10 percent.

To ensure that the value of the ubc-minpercent attribute is appropriate, use the vmstat command to examine the page-out rate.
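
For example, a quick check might look like the following (the interval is arbitrary; the pout field in the pages section of the vmstat output reports page-out activity):

# vmstat 5

Consistently high or steadily climbing pout values while the workload runs indicate excessive paging.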

If the values of the ubc-maxpercent and ubc-minpercent attributes are close together, you may degrade I/O performance or cause the system to page excessively. See Section 4.3.7 for information about UBC memory allocation.


[Contents] [Prev. Chapter] [Prev. Section] [Next Section] [Next Chapter] [Index] [Help]

4.8.4    Using mmap in Your Applications

You may want to use the mmap function instead of the read or write function in your applications. The read and write system calls require a page of buffer memory and a page of UBC memory, but mmap requires only one page of memory.
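
As a sketch (the file name is illustrative), an application might map a file instead of reading it into a private buffer:

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    struct stat sb;
    char *data;
    long lines = 0;
    off_t i;
    int fd;

    fd = open("/tmp/datafile", O_RDONLY);    /* illustrative file name */
    if (fd == -1) {
        perror("open");
        return 1;
    }
    if (fstat(fd, &sb) == -1) {
        perror("fstat");
        return 1;
    }

    /* Map the file instead of reading it into a private buffer; the
       process accesses the cached file pages directly, so only one
       copy of the data is in memory. */
    data = mmap(NULL, sb.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (data == (char *)MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    for (i = 0; i < sb.st_size; i++)
        if (data[i] == '\n')
            lines++;
    printf("%ld lines\n", lines);

    munmap(data, sb.st_size);
    close(fd);
    return 0;
}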


[Contents] [Prev. Chapter] [Prev. Section] [Next Section] [Next Chapter] [Index] [Help]

4.9    Tuning the Metadata Buffer Cache

A portion of physical memory is wired for use by the metadata buffer cache, which is the traditional BSD buffer cache. The file system code that deals with UFS metadata, which includes directories, indirect blocks, and inodes, uses this cache.

You may be able to improve UFS performance by following the guidelines described in Table 4-5.

Table 4-5:  Guidelines for Tuning the Metadata Buffer Cache

Action Performance Benefit Tradeoff
Increase the memory allocated to the metadata buffer cache (Section 4.9.1) Improves UFS performance Reduces the memory available to the virtual memory subsystem and the UBC
Increase the size of the hash chain table (Section 4.9.2) Improves lookup speed Consumes memory

The following sections describe these guidelines in detail.


[Contents] [Prev. Chapter] [Prev. Section] [Next Section] [Next Chapter] [Index] [Help]

4.9.1    Increasing the Size of the Metadata Buffer Cache

The bufcache attribute specifies the size of the kernel's metadata buffer cache as a percentage of physical memory. The default is 3 percent.

You may want to increase the size of the metadata buffer cache if you have a high cache miss rate (low hit rate). In general, you do not need to increase the cache size, and you should never increase the value of the bufcache attribute to more than 10 percent.

To determine whether to increase the size of the metadata buffer cache, use dbx to examine the bio_stats structure. The miss rate (block misses divided by the sum of the block misses and block hits) should not be more than 3 percent.
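
For example (the counts are illustrative), if the structure reports 300 block misses and 9,700 block hits, the miss rate is 300 / (300 + 9,700) = 3 percent, which is at the recommended limit; a noticeably higher ratio suggests increasing the value of the bufcache attribute.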

Allocating additional memory to the metadata buffer cache reduces the amount of memory available to the virtual memory subsystem and the UBC.


[Contents] [Prev. Chapter] [Prev. Section] [Next Chapter] [Index] [Help]

4.9.2    Increasing the Size of the Hash Chain Table

The hash chain table for the metadata buffer cache stores the heads of the hashed buffer queues. Increasing the size of the hash chain table spreads out the buffers and may reduce linear searches, which improves lookup speeds.

The buffer-hash-size attribute specifies the size of the hash chain table for the metadata buffer cache. The default hash chain table size is 512 slots.

You can modify the value of the buffer-hash-size attribute so that each hash chain has three or four buffers. To determine a value for the buffer-hash-size attribute, use dbx to examine the value of nbuf, divide that value by 3 or 4, and round the result up to the next power of 2. For example, if nbuf has a value of 360, dividing 360 by 3 gives 120; based on this calculation, specify 128 (2 to the power of 7) as the value of the buffer-hash-size attribute.
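
A minimal sketch of that check (the nbuf value shown is taken from the preceding example):

# dbx -k /vmunix /dev/mem
 
(dbx) print nbuf
360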


[Contents] [Prev. Chapter] [Prev. Section] [Next Chapter] [Index] [Help]