This chapter describes how to manage Tru64 UNIX network subsystem performance. The following sections describe how to:
Monitor the network subsystem (Section 10.1)
Tune the network subsystem (Section 10.2)
10.1 Gathering Network Information
Table 10-1 describes the commands you can use to obtain information about network operations.
Table 10-1: Network Monitoring Tools
Name | Use | Description |
netstat | Displays network statistics (Section 10.1.1) | Displays a list of active sockets for each protocol, information about network routes, and cumulative statistics for network interfaces, including the number of incoming and outgoing packets and packet collisions. Also displays information about memory used for network operations. |
traceroute | Displays the packet route to a network host | Tracks the route network packets follow from gateway to gateway. See traceroute(8) for more information. |
ping | Determines if a system can be reached on the network | Sends an Internet Control Message Protocol (ICMP) echo request to a host to determine if a host is running and reachable, and to determine if an IP router is reachable. Enables you to isolate network problems, such as direct and indirect routing problems. See ping(8) for more information. |
sobacklog_hiwat attribute | Reports the maximum number of pending requests to any server socket (Section 10.1.2) | Allows you to display the maximum number of pending requests to any server socket in the system. |
sobacklog_drops attribute | Reports the number of backlog drops that exceed a socket backlog limit (Section 10.1.2) | Allows you to display the number of times the system dropped a received SYN packet because the number of queued SYN_RCVD connections for the socket equaled the socket backlog limit. |
somaxconn_drops attribute | Reports the number of drops that exceed the value of the somaxconn attribute (Section 10.1.2) | Allows you to display the number of times the system dropped a received SYN packet because the number of queued SYN_RCVD connections for the socket equaled the upper limit on the backlog length (the somaxconn attribute). |
tcpdump | Monitors network interface packets | Monitors and displays packet headers on a network interface. You can specify the interface on which to listen, the direction of the packet transfer, or the type of protocol traffic to display. Your kernel must be configured with the packetfilter option. See tcpdump(8) for more information. |
The following sections describe some of these commands in detail.
10.1.1 Monitoring Network Statistics by Using the netstat Command
To check network statistics, use the
netstat
command.
Some problems to look for are:
If the
netstat -i
command shows excessive
amounts of input errors (Ierrs
), output errors (Oerrs
), or collisions (Coll
), this may indicate
a network problem; for example, cables are not connected properly or the Ethernet
is saturated.
Use the
netstat -is
command to check for
network device driver errors.
Use the
netstat -m
command to determine
if the network is using an excessive amount of memory in proportion to the
total amount of memory installed in the system.
If the
netstat -m
command shows several requests
for memory delayed or denied, this means that either physical memory was temporarily
depleted or the kernel
malloc
free lists were empty.
Each socket results in a network connection.
If the system
allocates an excessive number of sockets, use the
netstat -an
command to determine the state of your existing network connections.
An example of the
netstat -an
command is as follows:
# /usr/sbin/netstat -an | grep tcp | awk '{print $6}' | sort | uniq -c
      1 CLOSE_WAIT
     58 ESTABLISHED
      2 FIN_WAIT_1
      3 FIN_WAIT_2
     17 LISTEN
      1 SYN_RCVD
  15749 TIME_WAIT
For Internet servers, the majority of connections usually are in a
TIME_WAIT
state.
In the previous example, there are almost 16,000
sockets being used, which requires 16 MB of memory.
Use the
netstat -p ip
command to check
for bad checksums, length problems, excessive redirects, and packets lost
because of resource problems.
Use the
netstat -p tcp
command to check
for retransmissions, out of order packets, and bad checksums.
Use the
netstat -p udp
command to check
for bad checksums and full sockets.
Use the
netstat -rs
command to obtain routing
statistics.
Most of the information provided by
netstat
is used
to diagnose network hardware or software failures, not to identify tuning
opportunities.
See the
Network Administration
manual for more information on how
to diagnose failures.
The following output produced by the
netstat -i
command
shows input and output errors:
# /usr/sbin/netstat -i
Name  Mtu   Network  Address    Ipkts   Ierrs  Opkts  Oerrs  Coll
ln0   1500  DLI      none       133194      2  23632      4  4881
ln0   1500  <Link>              133194      2  23632      4  4881
ln0   1500  red-net  node1      133194      2  23632      4  4881
sl0*  296   <Link>                   0      0      0      0     0
sl1*  296   <Link>                   0      0      0      0     0
lo0   1536  <Link>                 580      0    580      0     0
lo0   1536  loop     localhost     580      0    580      0     0
Use the following netstat command to determine the causes of the input errors (Ierrs) and output errors (Oerrs) shown in the preceding example:
# /usr/sbin/netstat -is
ln0 Ethernet counters at Fri Jan 14 16:57:36 1998

           4112 seconds since last zeroed
       30307093 bytes received
        3722308 bytes sent
         133245 data blocks received
          23643 data blocks sent
       14956647 multicast bytes received
         102675 multicast blocks received
          18066 multicast bytes sent
            309 multicast blocks sent
           3446 blocks sent, initially deferred
           1130 blocks sent, single collision
           1876 blocks sent, multiple collisions
              4 send failures, reasons include:
                        Excessive collisions
              0 collision detect check failure
              2 receive failures, reasons include:
                        Block check error
                        Framing Error
              0 unrecognized frame destination
              0 data overruns
              0 system buffer unavailable
              0 user buffer unavailable
The following netstat -s command displays statistics for each protocol:
# /usr/sbin/netstat -s
ip:
        67673 total packets received
        0 bad header checksums
        0 with size smaller than minimum
        0 with data size < data length
        0 with header length < data size
        0 with data length < header length
        8616 fragments received
        0 fragments dropped (dup or out of space)
        5 fragments dropped after timeout
        0 packets forwarded
        8 packets not forwardable
        0 redirects sent
icmp:
        27 calls to icmp_error
        0 errors not generated because old message was icmp
        Output histogram:
                echo reply: 8
                destination unreachable: 27
        0 messages with bad code fields
        0 messages < minimum length
        0 bad checksums
        0 messages with bad length
        Input histogram:
                echo reply: 1
                destination unreachable: 4
                echo: 8
        8 message responses generated
igmp:
        365 messages received
        0 messages received with too few bytes
        0 messages received with bad checksum
        365 membership queries received
        0 membership queries received with invalid field(s)
        0 membership reports received
        0 membership reports received with invalid field(s)
        0 membership reports received for groups to which we belong
        0 membership reports sent
tcp:
        11219 packets sent
                7265 data packets (139886 bytes)
                4 data packets (15 bytes) retransmitted
                3353 ack-only packets (2842 delayed)
                0 URG only packets
                14 window probe packets
                526 window update packets
                57 control packets
        12158 packets received
                7206 acks (for 139930 bytes)
                32 duplicate acks
                0 acks for unsent data
                8815 packets (1612505 bytes) received in-sequence
                432 completely duplicate packets (435 bytes)
                0 packets with some dup. data (0 bytes duped)
                14 out-of-order packets (0 bytes)
                1 packet (0 bytes) of data after window
                0 window probes
                1 window update packet
                5 packets received after close
                0 discarded for bad checksums
                0 discarded for bad header offset fields
                0 discarded because packet too short
        19 connection requests
        25 connection accepts
        44 connections established (including accepts)
        47 connections closed (including 0 drops)
        3 embryonic connections dropped
        7217 segments updated rtt (of 7222 attempts)
        4 retransmit timeouts
        0 connections dropped by rexmit timeout
        0 persist timeouts
        0 keepalive timeouts
        0 keepalive probes sent
        0 connections dropped by keepalive
udp:
        12003 packets sent
        48193 packets received
        0 incomplete headers
        0 bad data length fields
        0 bad checksums
        0 full sockets
        12943 for no port (12916 broadcasts, 0 multicasts)
See netstat(1) for more information about the output produced by the various command options.
10.1.2 Checking Socket Listen Queue Statistics by Using the sysconfig Command
You can determine whether you need to increase the
socket listen queue limit by using the
sysconfig -q socket
command to display the values of the following attributes:
sobacklog_hiwat
Allows you to monitor the maximum number of pending requests to any server socket in the system. The initial value is zero.
sobacklog_drops
Allows you to monitor the number of times the system dropped a received
SYN packet because the number of queued
SYN_RCVD
connections
for a socket equaled the socket backlog limit.
The initial value is zero.
somaxconn_drops
Allows you to monitor the number of times the system dropped a received
SYN packet because the number of queued
SYN_RCVD
connections
for the socket equaled the upper limit on the backlog length (somaxconn
attribute).
The initial value is zero.
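For example, the following command queries all three counters at once (the zero values shown here are illustrative, from a system with no backlog activity):
# sysconfig -q socket sobacklog_hiwat sobacklog_drops somaxconn_drops
socket:
sobacklog_hiwat = 0
sobacklog_drops = 0
somaxconn_drops = 0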
It is recommended that the value of the
sominconn
attribute equal the value of the
somaxconn
attribute.
If
so, the value of
somaxconn_drops
will have the same value
as
sobacklog_drops
.
However, if the value of the
sominconn
attribute
is 0 (the default), and if one or more server applications use an inadequate
value for the backlog argument to its
listen
system call,
the value of
sobacklog_drops
may increase at a rate that
is faster than the rate at which the
somaxconn_drops
counter
increases.
If this occurs, you may want to increase the value of the
sominconn
attribute.
See
Section 10.2.3
for information on tuning socket listen
queue limits.
10.2 Tuning the Network Subsystem
Most resources used by the network subsystem are allocated and adjusted dynamically; however, there are some tuning guidelines that you can use to improve performance, particularly with systems that are Internet servers, including Web, proxy, firewall, and gateway servers.
Network performance is affected when the supply of resources is unable to keep up with the demand for resources. The following two conditions can cause this to occur:
A problem with one or more hardware or software network components
A workload (network traffic) that consistently exceeds the capacity of the available resources, although everything appears to be operating correctly
Neither of these problems is a network tuning issue. In the case of a problem on the network, you must isolate and eliminate the problem. In the case of high network traffic (for example, the hit rate on a Web server has reached its maximum value while the system is 100 percent busy), you must redesign the network and redistribute the load, reduce the number of network clients, or increase the number of systems handling the network load. See the Network Programmer's Guide and the Network Administration manual for information on how to resolve network problems.
Table 10-2
lists network subsystem tuning guidelines
and performance benefits as well as tradeoffs.
Table 10-2: Network Tuning Guidelines
Guideline | Performance Benefit | Tradeoff |
Increase the size of the hash table that the kernel uses to look up TCP control blocks (Section 10.2.1) | Improves the TCP control block lookup rate and increases the raw connection rate | Slightly increases the amount of wired memory |
Increase the number of TCP hash tables (Section 10.2.2) | Reduces hash table lock contention for SMP systems | Slightly increases the amount of wired memory |
Increase the limits for partial TCP connections on the socket listen queue (Section 10.2.3) | Improves throughput and response time on systems that handle a large number of connections | Consumes memory when pending connections are retained in the queue |
Increase the number of outgoing connection ports (Section 10.2.4) | Allows more simultaneous outgoing connections | None |
Modify the range of outgoing connection ports (Section 10.2.5) | Allows you to use ports from a specific range | None |
Disable the use of a PMTU (Section 10.2.6) | Improves the efficiency of servers that handle remote traffic from many clients | May reduce server efficiency for LAN traffic |
Increase the number of IP input queues (Section 10.2.7) | Reduces IP input queue lock contention for SMP systems | None |
Enable mbuf cluster compression (Section 10.2.8) | Improves efficiency of network memory allocation | None |
Enable TCP keepalive functionality (Section 10.2.9) | Enables inactive socket connections to time out | None |
Increase the size of the kernel interface alias table (Section 10.2.10) | Improves the IP address lookup rate for systems that serve many domain names | Slightly increases the amount of wired memory |
Make partial TCP connections time out more quickly (Section 10.2.11) | Prevents clients from overfilling the socket listen queue | A short time limit may cause viable connections to break prematurely |
Make the TCP connection context time out more quickly at the end of the connection (Section 10.2.12) | Frees connection resources sooner | Reducing the timeout limit increases the potential for data corruption; use caution if you apply this guideline |
Reduce the TCP retransmission rate (Section 10.2.13) | Prevents premature retransmissions and decreases congestion | A long retransmit time is not appropriate for all configurations |
Enable the immediate acknowledgment of TCP data (Section 10.2.14) | Can improve network performance for some connections | May adversely affect network bandwidth |
Increase the TCP maximum segment size (Section 10.2.15) | Allows sending more data per packet | May result in fragmentation at the router boundary |
Increase the size of the transmit and receive socket buffers (Section 10.2.16) | Buffers more TCP packets per socket | May decrease available memory when the buffer space is being used |
Increase the size of the transmit and receive buffers for a UDP socket (Section 10.2.17) | Helps to prevent dropping UDP packets | May decrease available memory when the buffer space is being used |
Allocate sufficient memory to the UBC (Section 9.2.4, Section 9.2.5, and Section 9.2.6) | Improves disk I/O performance | May decrease the physical memory available to processes |
Increase the size of the ARP table (Section 10.2.18) | May improve network performance on a system that is simultaneously connected to many nodes on the same LAN | Consumes memory resources |
Increase the maximum size of a socket buffer (Section 10.2.19) | Allows large socket buffer sizes | Consumes memory resources |
Prevent dropped input packets (Section 10.2.20) | Allows high network loads | None |
The following sections describe these tuning guidelines in detail.
10.2.1 Improving the Lookup Rate for TCP Control Blocks
You can modify the size of the
hash table that the kernel uses to look up Transmission Control Protocol (TCP)
control blocks.
The
inet
subsystem attribute
tcbhashsize
specifies the number of hash buckets in the kernel TCP
connection table (the number of buckets in the
inpcb
hash
table).
Performance Benefit and Tradeoff
The kernel must look up the connection block for every TCP packet it receives, so increasing the size of the table can speed the search and improve performance. This results in a small increase in wired memory.
You can modify the
tcbhashsize
attribute without
rebooting the system.
When to Tune
Increase the number of hash buckets in the kernel TCP connection table if you have an Internet server.
Recommended Values
The default value of the
tcbhashsize
attribute is
512.
For Internet servers, set the
tcbhashsize
attribute
to 16384.
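For example, the following command applies the recommended value at run time:
# sysconfig -r inet tcbhashsize=16384
To make the change persist across reboots, also record it in /etc/sysconfigtab, as described in Section 3.6.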
See
Section 3.6
for information about modifying
kernel subsystem attributes.
10.2.2 Increasing the Number of TCP Hash Tables
Because the kernel
must look up the connection block for every Transmission Control Protocol
(TCP) packet it receives, a bottleneck may occur at the TCP hash table in
SMP systems.
Increasing the number of tables distributes the load and may
improve performance.
The
inet
subsystem attribute
tcbhashnum
specifies the number of TCP hash tables.
Performance Benefit and Tradeoff
For SMP systems, you may be able to reduce hash table lock contention by increasing the number of hash tables that the kernel uses to look up TCP control blocks. This will slightly increase wired memory.
You cannot modify the
tcbhashnum
attribute without
rebooting the system.
When to Tune
Increase the number of TCP hash tables if you have an SMP system that is an Internet server.
Recommended Values
The minimum and default values of the
tcbhashnum
attribute are 1; the maximum value is 64.
For busy Internet server SMP systems,
you can increase the value of the
tcbhashnum
attribute
to 16.
If you increase this attribute, you should also increase the size of
the hash table.
See
Section 10.2.1
for information.
Compaq recommends that you make the value of the
tcbhashnum
attribute the same as the value of the
inet
subsystem attribute
ipqs
.
See
Section 10.2.7
for information.
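Because the tcbhashnum attribute cannot be changed at run time, record the new value in /etc/sysconfigtab and reboot. A sketch of the stanza, pairing tcbhashnum with ipqs as recommended above:
inet:
        tcbhashnum = 16
        ipqs = 16
Use the sysconfigdb command to merge the stanza, rather than editing /etc/sysconfigtab directly.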
See
Section 3.6
for information about modifying
kernel attributes.
10.2.3 Tuning the TCP Socket Listen Queue Limits
You may be able to improve performance by increasing
the limits for the socket listen queue (only for TCP).
The
socket
subsystem attribute
somaxconn
specifies the
maximum number of pending TCP connections (the socket listen queue limit)
for each server socket.
If the listen queue connection limit is too small,
incoming connect requests may be dropped.
Note that pending TCP connections
can be caused by lost packets in the Internet or denial of service attacks.
The
socket
subsystem attribute
sominconn
specifies the minimum number of pending TCP connections (backlog)
for each server socket.
The attribute controls how many SYN packets can be
handled simultaneously before additional requests are discarded.
The value
of the
sominconn
attribute overrides the application-specific
backlog value, which may be set too low for some server software.
Performance Benefit and Tradeoff
To improve throughput and response time with fewer drops, you can increase
the value of the
somaxconn
attribute.
If you want to improve performance without recompiling an application
or if you have an Internet server, increase the value of the
sominconn
attribute.
Increasing the value of this attribute can also prevent
a client from saturating a socket listen queue with erroneous TCP SYN packets.
You can modify the
somaxconn
and
sominconn
attributes without rebooting the system.
However, sockets that
are already open will continue to use the previous socket limits until the
applications are restarted.
When to Tune
Increase the socket listen queue limits if you have an Internet server or a busy system that has many pending connections and is running applications generating a large number of connections.
Monitor the
sobacklog_hiwat
,
sobacklog_drops
, and
somaxconn_drops
attributes to determine
if socket queues are overflowing.
If so, you may need to increase the socket
listen queue limits.
See
Section 10.1.2
for information.
Recommended Values
The default value of the
somaxconn
attribute is 1024.
For Internet servers, set the value of the
somaxconn
attribute
to the maximum value of 65535.
The default value of the
sominconn
attribute is zero.
To improve performance without recompiling an application and for Internet
servers, set the value of the
sominconn
attribute to the
maximum value of 65535.
If a client is saturating a socket listen queue with erroneous TCP SYN
packets, effectively blocking other users from the queue, increase the value
of the
sominconn
attribute to 65535.
If the system continues
to drop incoming SYN packets, you can decrease the value of the
inet
subsystem attribute
tcp_keepinit
to 30 (15
seconds).
The value of the
sominconn
attribute should be the
same as the value of the
somaxconn
attribute.
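For example, the following command applies both recommended values at run time (existing sockets keep the old limits until their applications are restarted):
# sysconfig -r socket somaxconn=65535 sominconn=65535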
See
Section 3.6
for information about modifying
kernel subsystem attributes.
10.2.4 Increasing the Number of Outgoing Connection Ports
When a TCP or UDP application creates an outgoing connection,
the kernel dynamically allocates a nonreserved port number for each connection.
The kernel selects the port number from a range of values between the value
of the
inet
subsystem attribute
ipport_userreserved_min
and the value of the
ipport_userreserved
attribute.
If you use the default attribute values, the number of simultaneous outgoing
connections is limited to 3976.
Performance Benefit
Increasing the number of ports provides more ports for TCP and UDP applications.
You can modify the
ipport_userreserved
attribute
without rebooting the system.
When to Tune
If your system requires many outgoing ports, you may want to increase
the value of the
ipport_userreserved
attribute.
Recommended Values
The default value of the
ipport_userreserved
attribute
is 5000, which means that the default number of ports is 3976 (5000 minus
1024).
If your system is a proxy server (for example, a Squid caching server
or a firewall system) with a load of more than 4000 simultaneous connections,
increase the value of the
ipport_userreserved
attribute
to the maximum value of 65000.
It is not recommended that you reduce the value of the
ipport_userreserved
attribute to a value that is less than 5000 or increase it to a
value that is higher than 65000.
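For example, the following command raises the limit to the recommended maximum for a busy proxy server, yielding 63976 usable ports (65000 minus 1024):
# sysconfig -r inet ipport_userreserved=65000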
You can also modify the range of outgoing connection ports. See Section 10.2.5 for information.
See
Section 3.6
for information about modifying
kernel subsystem attributes.
10.2.5 Modifying the Range of Outgoing Connection Ports
When
a TCP or UDP application creates an outgoing connection, the kernel dynamically
allocates a nonreserved port number for each connection.
The kernel selects
the port number from a range of values between the value of the
inet
subsystem attribute
ipport_userreserved_min
and the value of the
ipport_userreserved
attribute.
Using
the default values for these attributes, the range of outgoing ports starts
at 1024 and stops at 5000.
Performance Benefit and Tradeoff
Modifying the range of outgoing connections provides TCP and UDP applications with a specific range of ports.
You can modify the
ipport_userreserved_min
and
ipport_userreserved
attributes without rebooting the system.
When to Tune
If your system requires outgoing ports from a particular range, you
can modify the values of the
ipport_userreserved_min
and
ipport_userreserved
attributes.
Recommended Values
The default value of the
ipport_userreserved_min
attribute is 1024.
The default value of the
ipport_userreserved
is 5000.
The maximum value of both attributes is 65000.
Do not reduce the
ipport_userreserved
attribute
to a value that is less than 5000, or reduce the
ipport_userreserved_min
attribute to a value that is less than 1024.
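For example, the following sketch shifts outgoing connections to ports 10000 through 65000 (the values are illustrative; choose a range that suits your site):
# sysconfig -r inet ipport_userreserved_min=10000 ipport_userreserved=65000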
See
Section 3.6
for information about modifying
kernel subsystem attributes.
10.2.6 Disabling Use of a PMTU
Packets transmitted between servers are fragmented into units
of a specific size in order to ease transmission of the data over routers
and small-packet networks, such as Ethernet networks.
When the
inet
subsystem attribute
pmtu_enabled
is enabled
(set to 1, which is the default behavior), the system determines the largest
common path maximum transmission unit (PMTU) value between servers and uses
it as the unit size.
The system also creates a routing table entry for each
client network that attempts to connect to the server.
Performance Benefit and Tradeoff
If a server handles traffic among many remote clients, disabling the use of a PMTU can decrease the size of the kernel routing table, which improves server efficiency. However, on a server that handles local traffic and some remote traffic, disabling the use of a PMTU can degrade bandwidth.
You can modify the
pmtu_enabled
attribute without
rebooting the system.
When to Tune
Disable use of a PMTU if you have a server that handles traffic among many remote clients, or if you have an Internet server that has poor performance and the routing table increases to more than 1000 entries.
Recommended Values
Set the value of the pmtu_enabled attribute to 0 to disable the use of the PMTU protocol.
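For example, the following command disables PMTU discovery at run time:
# sysconfig -r inet pmtu_enabled=0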
See
Section 3.6
for information about modifying
kernel subsystem attributes.
10.2.7 Increasing the Number of IP Input Queues
The
inet
subsystem
attribute
ipqs
specifies the number of IP input queues.
Performance Benefit and Tradeoff
Increasing the number of IP input queues can reduce lock contention at the queue by increasing the number of queues and distributing the load.
You cannot modify the
ipqs
attribute without rebooting
the system.
When to Tune
Increase the number of IP input queues if you have an SMP system that is an Internet server.
Recommended Values
For SMP systems that are Internet servers, increase the value of the
ipqs
attribute to 16.
The maximum value is 64.
It is recommended that you make the value of the
ipqs
attribute the same as the value of the
inet
subsystem attribute
tcbhashnum
.
See
Section 10.2.2
for information.
See
Section 3.6
for information about modifying
kernel subsystem attributes.
10.2.8 Enabling mbuf Cluster Compression
The
socket
subsystem attribute
sbcompress_threshold
controls whether
mbuf
clusters are compressed at the socket
layer.
By default,
mbuf
clusters are not compressed (sbcompress_threshold
is set to 0).
Performance Benefit
Compressing
mbuf
clusters can prevent proxy servers
from consuming all the available
mbuf
clusters.
You can modify the
sbcompress_threshold
attribute
without rebooting the system.
When to Tune
You may want to enable
mbuf
cluster compression if
you have a proxy server.
These systems are more likely to consume all the
available
mbuf
clusters if they are using FDDI instead
of Ethernet.
To determine the memory that is being used for
mbuf
clusters, use the
netstat -m
command.
The following example
is from a firewall server with 128 MB memory that does not have
mbuf
cluster compression enabled:
# netstat -m
 2521 Kbytes for small data mbufs (peak usage 9462 Kbytes)
78262 Kbytes for mbuf clusters (peak usage 97924 Kbytes)
 8730 Kbytes for sockets (peak usage 14120 Kbytes)
 9202 Kbytes for protocol control blocks (peak usage 14551 Kbytes)
    2 Kbytes for routing table (peak usage 2 Kbytes)
    2 Kbytes for socket names (peak usage 4 Kbytes)
    4 Kbytes for packet headers (peak usage 32 Kbytes)
39773 requests for mbufs denied
    0 calls to protocol drain routines
98727 Kbytes allocated to network
The previous example shows that 39773 requests for memory were denied.
This indicates a problem because this value should be zero.
The example also
shows that 78 MB of memory has been assigned to
mbuf
clusters,
and that 98 MB of memory is being consumed by the network subsystem.
Recommended Values
To enable
mbuf
cluster compression, modify the default
value of the
socket
subsystem attribute
sbcompress_threshold
.
Packets will be copied into the existing
mbuf
clusters if the packet size is less than this value.
For proxy servers, specify
a value of 600.
If you increase the value of the
sbcompress_threshold
attribute to 600, the memory allocated to the network subsystem immediately
decreases to 18 MB, because compression at the kernel socket buffer interface
results in a more efficient use of memory.
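For example, the following command enables compression on a proxy server by applying the recommended threshold:
# sysconfig -r socket sbcompress_threshold=600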
10.2.9 Enabling TCP Keepalive Functionality
Keepalive functionality enables the periodic transmission of messages on a connected socket in order to keep connections active. Sockets that do not exit cleanly are cleaned up when the keepalive interval expires. If keepalive is not enabled, those sockets will continue to exist until you reboot the system.
Applications enable keepalive for sockets by setting the
setsockopt
function's
SO_KEEPALIVE
option.
To override
programs that do not set keepalive on their own, or if you do not have access
to the application sources, use the
inet
subsystem attribute
tcp_keepalive_default
to enable keepalive functionality.
Performance Benefit
Keepalive functionality cleans up sockets that do not exit cleanly when the keepalive interval expires.
You can modify the
tcp_keepalive_default
attribute
without rebooting the system.
However, sockets that already exist will continue
to use old behavior, until the applications are restarted.
When to Tune
Enable keepalive if you require this functionality, and you do not have access to the source code.
Recommended Values
To override programs that do not set keepalive on their own, or if you
do not have access to application source code, set the
inet
subsystem attribute
tcp_keepalive_default
to 1 in order
to enable keepalive for all sockets.
If you enable keepalive, you can also configure the following TCP options for each socket:
The
inet
subsystem attribute
tcp_keepidle
specifies the amount of
idle time before sending a keepalive probe (specified in 0.5 second units).
The default interval is 2 hours.
The
inet
subsystem attribute
tcp_keepintvl
specifies
the amount of time (in 0.5 second units) between the retransmission of keepalive
probes.
The default interval is 75 seconds.
The
inet
subsystem attribute
tcp_keepcnt
specifies
the maximum number of keepalive probes that are sent before the connection
is dropped.
The default is 8 probes.
The
inet
subsystem attribute
tcp_keepinit
specifies the maximum
amount of time before an initial connection attempt times out in 0.5 second
units.
The default is 75 seconds.
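For example, the following sketch enables keepalive for all sockets and, as an illustrative adjustment, shortens the idle interval to 1 hour (7200 units of 0.5 seconds):
# sysconfig -r inet tcp_keepalive_default=1
# sysconfig -r inet tcp_keepidle=7200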
See
Section 3.6
for information about modifying
kernel subsystem attributes.
10.2.10 Improving the Lookup Rate for IP Addresses
The
inet
subsystem attribute
inifaddr_hsize
specifies the number
of hash buckets in the kernel interface alias table (in_ifaddr
).
If a system is used as a server for many different server domain names, each of which is bound to a unique IP address, the code that matches arriving packets to the right server address uses the hash table to speed lookup operations for the IP addresses.
Performance Benefit and Tradeoff
Increasing the number of hash buckets in the table can improve performance on systems that use large numbers of aliases.
You can modify the
inifaddr_hsize
attribute without
rebooting the system.
When to Tune
Increase the number of hash buckets in the kernel interface alias table if your system uses large numbers of aliases.
Recommended Values
The default value of the
inet
subsystem attribute
inifaddr_hsize
is 32; the maximum value is 512.
For the best performance, the value of the inifaddr_hsize attribute is always rounded down to the nearest power of 2. If you are using more than 500 interface IP aliases, specify the maximum value of 512. If you are using fewer than 250 aliases, use the default value of 32.
See
Section 3.6
for information about modifying
kernel subsystem attributes.
10.2.11 Decreasing the TCP Partial-Connection Timeout Limit
The
inet
subsystem attribute
tcp_keepinit
specifies
the amount of time that a partially established TCP connection remains on
the socket listen queue before it times out.
Partial connections consume listen
queue slots and fill the queue with connections in the
SYN_RCVD
state.
Performance Benefit and Tradeoff
You can make partial connections time out sooner by decreasing the value
of the
tcp_keepinit
attribute.
You can modify the
tcp_keepinit
attribute without
rebooting the system.
When to Tune
You do not need to modify the TCP partial-connection timeout limit,
unless the value of the
somaxconn_drops
attribute often
increases.
If this occurs, you may want to decrease the value of the
tcp_keepinit
attribute.
Recommended Values
The value of the
tcp_keepinit
attribute is in units
of 0.5 seconds.
The default value is 150 units (75 seconds).
If the value
of the
sominconn
attribute is 65535, use the default value
of the
tcp_keepinit
attribute.
Do not set the value of the
tcp_keepinit
attribute
too low, because you may prematurely break connections associated with clients
on network paths that are slow or network paths that lose many packets.
Do
not set the value to less than 20 units (10 seconds).
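For example, if the somaxconn_drops counter increases often, the following command shortens the partial-connection timeout to 15 seconds (30 units):
# sysconfig -r inet tcp_keepinit=30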
See
Section 3.6
for information about modifying
kernel subsystem attributes.
10.2.12 Decreasing the TCP Connection Context Timeout Limit
The
TCP protocol includes a concept known as the Maximum Segment Lifetime (MSL).
When a TCP connection enters the
TIME_WAIT
state, it must
remain in this state for twice the value of the MSL, or else undetected data
errors on future connections can occur.
The
inet
subsystem
attribute
tcp_msl
determines the maximum lifetime of a
TCP segment and the timeout value for the
TIME_WAIT
state.
Performance Benefit and Tradeoff
You can decrease the value of the
tcp_msl
attribute
to make the TCP connection context time out more quickly at the end of a connection.
However, this will increase the chance of data corruption.
You can modify the
tcp_msl
attribute without rebooting
the system.
When to Tune
Usually, you do not have to modify the TCP connection context timeout limit.
Recommended Values
The value of the
tcp_msl
attribute is set in units
of 0.5 seconds.
The default value is 60 units (30 seconds), which means that
the TCP connection remains in
TIME_WAIT
state for 60 seconds
(or twice the value of the MSL).
In some situations, the default timeout value
for the
TIME_WAIT
state (60 seconds) is too large, so reducing
the value of the
tcp_msl
attribute frees connection resources
sooner than the default behavior.
Do not reduce the value of the
tcp_msl
attribute
unless you fully understand the design and behavior of your network and the
TCP protocol.
It is strongly recommended that you use the default value; otherwise,
there is the potential for data corruption.
See
Section 3.6
for information about modifying
kernel subsystem attributes.
10.2.13 Decreasing the TCP Retransmission Rate
The
inet
subsystem attribute
tcp_rexmit_interval_min
specifies the minimum amount of time before the first TCP retransmission.
Performance Benefit and Tradeoff
You can increase the value of the
tcp_rexmit_interval_min
attribute to slow the rate of TCP retransmissions, which decreases congestion
and improves performance.
You can modify the
tcp_rexmit_interval_min
attribute
without rebooting the system.
When to Tune
Not every connection needs a long retransmission time. Usually, the default value is adequate. However, for some wide area networks (WANs), the default retransmission interval may be too small, causing premature retransmission timeouts. This may lead to duplicate transmission of packets and the erroneous invocation of the TCP congestion-control algorithms.
To check for retransmissions,
use the
netstat -p tcp
command and examine the output for
data packets retransmitted.
Recommended Values
The
tcp_rexmit_interval_min
attribute is specified
in units of 0.5 seconds.
The default value is 2 units (1 second).
Do not specify a value that is less than 1 unit. Do not change the attribute unless you fully understand TCP algorithms.
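For example, on a WAN that shows premature retransmission timeouts, the following sketch doubles the minimum interval to 2 seconds (4 units; an illustrative value):
# sysconfig -r inet tcp_rexmit_interval_min=4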
See
Section 3.6
for information about modifying
kernel subsystem attributes.
10.2.14 Disabling Delaying the Acknowledgment of TCP Data
By default,
the system delays acknowledging TCP data.
The
inet
subsystem
attribute
tcpnodelack
determines whether the system delays
acknowledging TCP data.
Performance Benefit and Tradeoff
Disabling the delayed acknowledgment of TCP data may improve performance. However, this may adversely affect network bandwidth.
You can modify the
tcpnodelack
attribute without
rebooting the system.
When to Tune
Usually, the default value of the
tcpnodelack
attribute
is adequate.
However, for some connections (for example, loopback), the delay
can degrade performance.
Use the
tcpdump
command to check
for excessive delays.
Recommended Values
The default value of the tcpnodelack attribute is zero.
To
disable the TCP acknowledgment delay, set the value of the
tcpnodelack
attribute to one.
See
Section 3.6
for information about modifying
kernel subsystem attributes.
10.2.15 Increasing the Maximum TCP Segment Size
The
inet
subsystem attribute
tcp_mssdflt
specifies the
TCP maximum segment size.
Performance Benefit and Tradeoff
Increasing the maximum TCP segment size allows sending more data per packet, but may cause fragmentation at the router boundary.
You can modify the
tcp_mssdflt
attribute without
rebooting the system.
When to Tune
Usually, you do not need to modify the maximum TCP segment size.
Recommended Values
The default value of the
tcp_mssdflt
attribute is
536.
You can increase the value to 1460.
See
Section 3.6
for information about modifying
kernel subsystem attributes.
10.2.16 Increasing the Transmit and Receive Buffers for a TCP Socket
The
inet
subsystem attribute
tcp_sendspace
specifies
the default transmit buffer size for a TCP socket.
The
tcp_recvspace
attribute specifies the default receive buffer size for a TCP socket.
Performance Benefit and Tradeoff
Increasing the transmit and receive socket buffers allows you to buffer more TCP packets per socket. However, increasing the values uses more memory when the buffers are being used by an application (sending or receiving data).
You can modify the
tcp_sendspace
and
tcp_recvspace
attributes without rebooting the system.
When to Tune
You may want to increase the transmit and receive socket buffers if you have a busy system with sufficient memory (for example, more than 1 GB of physical memory). Before you apply this modification, you may want to increase the maximum size of a socket buffer, as described in Section 10.2.19.
Recommended Values
The default values of the
tcp_sendspace
and
tcp_recvspace
attributes are 32 KB (32768 bytes).
You can increase
the value of these attributes to 60 KB.
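For example, the following command sets both buffers to 60 KB (61440 bytes) at run time:
# sysconfig -r inet tcp_sendspace=61440 tcp_recvspace=61440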
You may want to increase the maximum size of a socket buffer before you increase the transmit and receive buffers. See Section 10.2.19 for information.
See
Section 3.6
for information about modifying
kernel subsystem attributes.
10.2.17 Increasing the Transmit and Receive Buffers for a UDP Socket
The
inet
subsystem attribute
udp_sendspace
specifies the default transmit buffer size for an
Internet User Datagram Protocol (UDP) socket.
The
inet
subsystem attribute
udp_recvspace
specifies the default
receive buffer size for a UDP socket.
Performance Benefit and Tradeoff
Increasing the UDP transmit and receive socket buffers allows you to buffer more UDP packets per socket. However, increasing the values uses more memory when the buffers are being used by an application (sending or receiving data).
Note
UDP attributes do not affect Network File System (NFS) performance.
You can modify the
udp_sendspace
and
udp_recvspace
attributes without rebooting the system.
However, you must restart
applications to use the new UDP socket buffer values.
When to Tune
Use the
netstat -p udp
command
to check for full sockets.
If the output shows many
full sockets
, increase the value of the
udp_recvspace
attribute.
Recommended Values
The default value of the udp_sendspace attribute is 9 KB (9216 bytes). The default value of the udp_recvspace attribute is 40 KB (42240 bytes).
You can increase the values of these attributes to 64 KB.
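For example, the following sketch raises both UDP buffers to 64 KB (65536 bytes); restart the affected applications so they pick up the new values:
# sysconfig -r inet udp_sendspace=65536 udp_recvspace=65536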
See
Section 3.6
for information about modifying
kernel subsystem attributes.
10.2.18 Increasing the Size of the ARP Table
The
net
subsystem attribute
arptab_nb
specifies the
number of hash buckets in the address resolution protocol (ARP) table (that
is, the table's width).
The
net
subsystem attribute
arptab_depth
specifies the number of entries in each hash bucket
in the ARP table.
Performance Benefit and Tradeoff
Increasing the size of the ARP table may improve performance. Wide ARP tables can decrease the chance that a search will be needed to match an address to an ARP entry. Deep ARP tables can hold a large number of entries. However, increasing the size of the ARP table will increase the memory used by the table.
Increasing the size of the ARP table will not affect performance unless
the system is simultaneously connected to many nodes on the same LAN.
See the Kernel Debugging manual and kdbx(8) for more information.
You can modify the
arptab_nb
and
arptab_depth
attributes without rebooting the system.
When to Tune
Display the ARP table by using the
arp -a
command
or the
kdbx arp
debugger extension.
Increase the value
of the
arptab_nb
and
arptab_depth
attributes
if the ARP table contains more than 400 entries.
Recommended Values
You can increase the width of the ARP table by increasing the value of the arptab_nb attribute. The default value is 37.
The default value is 37.
The maximum value is 1024.
You can increase the depth of the ARP table by increasing the value of the arptab_depth attribute. The default value is 16.
The default value is 16.
The maximum value is 256.
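For example, the following sketch widens and deepens the table at run time (the values are illustrative and within the documented ranges):
# sysconfig -r net arptab_nb=128 arptab_depth=32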
See
Section 3.6
for information about modifying
kernel subsystem attributes.
10.2.19 Increasing the Maximum Size of a Socket Buffer
The
socket
subsystem attribute
sb_max
specifies the maximum size of a socket buffer.
Performance Benefit and Tradeoff
Increasing the maximum size of a socket buffer may improve performance if your applications can benefit from a large buffer size.
You can modify the
sb_max
attribute without rebooting
the system.
When to Tune
If you require a large socket buffer, increase the maximum socket buffer size.
Recommended Values
The default value of the
sb_max
attribute is 128
KB.
Increase this value before you increase the size of the transmit and receive
socket buffers (see
Section 10.2.16).
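For example, the following command raises the ceiling to 1 MB (1048576 bytes; an illustrative value) before larger per-socket buffers are configured:
# sysconfig -r socket sb_max=1048576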
See
Section 3.6
for information about modifying
kernel subsystem attributes.
10.2.20 Preventing Dropped Input Packets
If the IP input queue overflows under a heavy network load, input packets may be dropped.
The inet subsystem attribute ipqmaxlen specifies the maximum length (in packets) of the IP input queue (ipintrq) before input packets are dropped.
The
ifqmaxlen
attribute specifies the number of output packets that can be queued to a network
adapter before packets are dropped.
Performance Benefit and Tradeoff
Increasing the IP input queue can prevent packets from being dropped.
You can modify the
ipqmaxlen
and
ifqmaxlen
attributes without rebooting the system.
When to Tune
If your system drops packets, you may
want to increase the values of the
ipqmaxlen
and
ifqmaxlen
attributes.
To check for input dropped packets, examine
the
ipintrq
kernel structure by using
dbx
.
If the
ifq_drops
field is not 0, the system is dropping
input packets.
For example:
# dbx -k /vmunix
(dbx) print ipintrq
struct {
    ifq_head = (nil)
    ifq_tail = (nil)
    ifq_len = 0
    ifq_maxlen = 512
    ifq_drops = 128
    .
    .
    .
}
Use the
netstat -id
command to
monitor dropped output packets.
Examine the output for a nonzero value in
the
Drop
column for an interface.
The following example
shows 579 dropped output packets on the
tu1
network interface:
# netstat -id
Name  Mtu   Network  Address            Ipkts    Ierrs  Opkts   Oerrs  Coll  Drop
fta0  4352  link     08:00:2b:b1:26:59    41586      0   39450      0     0     0
fta0  4352  DLI      none                 41586      0   39450      0     0     0
fta0  4352  10       fratbert             41586      0   39450      0     0     0
tu1   1500  link     00:00:f8:23:11:c8  2135983      0  163454     13  3376   579
tu1   1500  DLI      none               2135983      0  163454     13  3376   579
tu1   1500  red-net  ratbert            2135983      0  163454     13  3376   579
  .
  .
  .
In addition, you can use the netstat -p ip command and check for a nonzero number in the lost packets due to resource problems field or the no memory or interface queue was full field.
For example:
# netstat -p ip
ip:
        259201001 total packets received
        0 bad header checksums
        0 with size smaller than minimum
        0 with data size < data length
        0 with header length < data size
        0 with data length < header length
        25794050 fragments received
        0 fragments dropped (duplicate or out of space)
        802 fragments dropped after timeout
        0 packets forwarded
        67381376 packets not forwardable
        67381376 link-level broadcasts
        0 packets denied access
        0 redirects sent
        0 packets with unknown or unsupported protocol
        170988694 packets consumed here
        160039654 total packets generated here
        0 lost packets due to resource problems
        4964271 total packets reassembled ok
        2678389 output packets fragmented ok
        14229303 output fragments created
        0 packets with special flags set
Recommended Values
The default and minimum values for the
ipqmaxlen
and
ifqmaxlen
attributes are 1024; the maximum values are
65535.
For most configurations, the default values are adequate. Increase the values only if the system drops packets.
If your system drops packets, increase the values of the
ipqmaxlen
and
ifqmaxlen
attributes until you no longer
drop packets.
For example, you can increase the default values to 2000.
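A sketch of that change at run time, assuming both attributes are tuned through the inet subsystem as the discussion above implies:
# sysconfig -r inet ipqmaxlen=2000 ifqmaxlen=2000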
See Section 3.6 for information about modifying kernel subsystem attributes.