This chapter describes how to manage Tru64 UNIX network subsystem performance. The following sections describe these tasks:
Monitor the network subsystem (Section 10.1)
Tune the network subsystem (Section 10.2)
Table 10-1 describes the commands you can use to obtain information about network operations.
| Name | Use | Description |
| netstat | Displays network statistics (Section 10.1.1) | Displays a list of active sockets for each protocol, information about network routes, and cumulative statistics for network interfaces, including the number of incoming and outgoing packets and packet collisions. Also displays information about memory used for network operations. |
| traceroute | Displays the packet route to a network host | Tracks the route network packets follow from gateway to gateway. |
| ping | Determines if a system can be reached on the network | Sends an Internet Control Message Protocol (ICMP) echo request to a host to determine if the host is running and reachable, and to determine if an IP router is reachable. Enables you to isolate network problems, such as direct and indirect routing problems. |
| sobacklog_hiwat attribute | Reports the maximum number of pending requests to any server socket (Section 10.1.2) | Allows you to display the maximum number of pending requests to any server socket in the system. |
| sobacklog_drops attribute | Reports the number of backlog drops that exceed a socket backlog limit (Section 10.1.2) | Allows you to display the number of times the system dropped a received SYN packet because the number of queued SYN_RCVD connections for a socket equaled the socket backlog limit. |
| somaxconn_drops attribute | Reports the number of drops that exceed the value of the somaxconn attribute (Section 10.1.2) | Allows you to display the number of times the system dropped a received SYN packet because the number of queued SYN_RCVD connections for the socket equaled the upper limit on the backlog length (the somaxconn attribute). |
| tcpdump | Monitors network interface packets | Monitors and displays packet headers on a network interface. You can specify the interface on which to listen, the direction of the packet transfer, or the type of protocol traffic to display. Your kernel must be configured with the packetfilter option. |
The following sections describe some of these commands in detail.
To check network statistics, use the netstat command. Some problems to look for are as follows:
If the netstat -i command shows excessive amounts of input errors (Ierrs), output errors (Oerrs), or collisions (Coll), this may indicate a network problem; for example, cables are not connected properly or the Ethernet is saturated. Use the netstat -is command to check for network device driver errors.
Use the netstat -m command to determine if the network is using an excessive amount of memory in proportion to the total amount of memory installed in the system. If the netstat -m command shows several requests for memory delayed or denied, your system was temporarily short of physical memory.
Each socket results in a network connection. If the system allocates an excessive number of sockets, use the netstat -an command to determine the state of your existing network connections. An example of the netstat -an command is as follows:
```
# /usr/sbin/netstat -an | grep tcp | awk '{print $6}' | sort | uniq -c
    1 CLOSE_WAIT
   58 ESTABLISHED
    2 FIN_WAIT_1
    3 FIN_WAIT_2
   17 LISTEN
    1 SYN_RCVD
15749 TIME_WAIT
```
For Internet servers, the majority of connections usually are in the TIME_WAIT state. Note that in this example almost 16,000 sockets are in use, which requires 16 MB of memory.
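The state summary above can be reproduced anywhere by feeding canned netstat -an lines through the same pipeline; the following is a sketch with hypothetical addresses:

```shell
# Canned netstat -an lines (hypothetical addresses); the sixth field is the
# TCP state, just as in real netstat -an output.
netstat_an='tcp        0      0  10.1.1.1.80   10.1.1.2.1025  TIME_WAIT
tcp        0      0  10.1.1.1.80   10.1.1.3.1026  TIME_WAIT
tcp        0      0  10.1.1.1.80   10.1.1.4.1027  ESTABLISHED
tcp        0      0  *.80          *.*            LISTEN'
# Same pipeline as the example above: extract the state column, then count.
echo "$netstat_an" | grep tcp | awk '{print $6}' | sort | uniq -c
```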
Use the netstat -p ip command to check for bad checksums, length problems, excessive redirects, and packets lost because of resource problems.

Use the netstat -p tcp command to check for retransmissions, out-of-order packets, and bad checksums.

Use the netstat -p udp command to look for bad checksums and full sockets.

Use the netstat -rs command to obtain routing statistics.

Most of the information provided by netstat is used to diagnose network hardware or software failures, not to analyze tuning opportunities. See the Network Administration manual for more information on how to diagnose failures.
The following example shows the output produced by the netstat -i command:
```
# /usr/sbin/netstat -i
Name  Mtu   Network  Address    Ipkts   Ierrs  Opkts  Oerrs  Coll
ln0   1500  DLI      none       133194  2      23632  4      4881
ln0   1500  <Link>              133194  2      23632  4      4881
ln0   1500  red-net  node1      133194  2      23632  4      4881
sl0*  296   <Link>              0       0      0      0      0
sl1*  296   <Link>              0       0      0      0      0
lo0   1536  <Link>              580     0      580    0      0
lo0   1536  loop     localhost  580     0      580    0      0
```
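As a rough illustration (not a command from this manual), the Coll and Opkts columns can be turned into a collision percentage; the ln0 figures from the example are supplied inline:

```shell
# ln0 counters taken from the netstat -i example above.
opkts=23632   # output packets
coll=4881     # collisions
# Collision rate as a percentage of output packets, one decimal place.
pct=$(awk -v o="$opkts" -v c="$coll" 'BEGIN { printf "%.1f", c * 100 / o }')
echo "ln0 collision rate: ${pct}%"   # roughly 20 percent, a sign of saturation
```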
Use the following netstat command to determine the causes of the input errors (Ierrs) and output errors (Oerrs) shown in the preceding example:
```
# /usr/sbin/netstat -is
ln0 Ethernet counters at Fri Jan 14 16:57:36 1998

      4112 seconds since last zeroed
  30307093 bytes received
   3722308 bytes sent
    133245 data blocks received
     23643 data blocks sent
  14956647 multicast bytes received
    102675 multicast blocks received
     18066 multicast bytes sent
       309 multicast blocks sent
      3446 blocks sent, initially deferred
      1130 blocks sent, single collision
      1876 blocks sent, multiple collisions
         4 send failures, reasons include:
             Excessive collisions
         0 collision detect check failure
         2 receive failures, reasons include:
             Block check error
             Framing Error
         0 unrecognized frame destination
         0 data overruns
         0 system buffer unavailable
         0 user buffer unavailable
```
The netstat -s command displays the following statistics for each protocol:
```
# /usr/sbin/netstat -s
ip:
        67673 total packets received
        0 bad header checksums
        0 with size smaller than minimum
        0 with data size < data length
        0 with header length < data size
        0 with data length < header length
        8616 fragments received
        0 fragments dropped (dup or out of space)
        5 fragments dropped after timeout
        0 packets forwarded
        8 packets not forwardable
        0 redirects sent
icmp:
        27 calls to icmp_error
        0 errors not generated because old message was icmp
        Output histogram:
                echo reply: 8
                destination unreachable: 27
        0 messages with bad code fields
        0 messages < minimum length
        0 bad checksums
        0 messages with bad length
        Input histogram:
                echo reply: 1
                destination unreachable: 4
                echo: 8
        8 message responses generated
igmp:
        365 messages received
        0 messages received with too few bytes
        0 messages received with bad checksum
        365 membership queries received
        0 membership queries received with invalid field(s)
        0 membership reports received
        0 membership reports received with invalid field(s)
        0 membership reports received for groups to which we belong
        0 membership reports sent
tcp:
        11219 packets sent
                7265 data packets (139886 bytes)
                4 data packets (15 bytes) retransmitted
                3353 ack-only packets (2842 delayed)
                0 URG only packets
                14 window probe packets
                526 window update packets
                57 control packets
        12158 packets received
                7206 acks (for 139930 bytes)
                32 duplicate acks
                0 acks for unsent data
                8815 packets (1612505 bytes) received in-sequence
                432 completely duplicate packets (435 bytes)
                0 packets with some dup. data (0 bytes duped)
                14 out-of-order packets (0 bytes)
                1 packet (0 bytes) of data after window
                0 window probes
                1 window update packet
                5 packets received after close
                0 discarded for bad checksums
                0 discarded for bad header offset fields
                0 discarded because packet too short
        19 connection requests
        25 connection accepts
        44 connections established (including accepts)
        47 connections closed (including 0 drops)
        3 embryonic connections dropped
        7217 segments updated rtt (of 7222 attempts)
        4 retransmit timeouts
        0 connections dropped by rexmit timeout
        0 persist timeouts
        0 keepalive timeouts
        0 keepalive probes sent
        0 connections dropped by keepalive
udp:
        12003 packets sent
        48193 packets received
        0 incomplete headers
        0 bad data length fields
        0 bad checksums
        0 full sockets
        12943 for no port (12916 broadcasts, 0 multicasts)
```
See netstat(1) for information about the output produced by the various command options.
You can determine whether you need to increase the socket listen queue limit by using the sysconfig -q socket command to display the values of the following attributes:
sobacklog_hiwat
Allows you to monitor the maximum number of pending requests to any server socket in the system. The initial value is zero.
sobacklog_drops
Allows you to monitor the number of times the system dropped a received SYN packet because the number of queued SYN_RCVD connections for a socket equaled the socket backlog limit. The initial value is zero.
somaxconn_drops
Allows you to monitor the number of times the system dropped a received SYN packet because the number of queued SYN_RCVD connections for the socket equaled the upper limit on the backlog length (the somaxconn attribute). The initial value is zero.
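Because sysconfig exists only on Tru64 UNIX, the sketch below cans its typical attribute = value output (with hypothetical counter values) and shows one way to flag nonzero drop counters:

```shell
# Canned output in the "attribute = value" form printed by
# sysconfig -q socket; the counter values are hypothetical.
attrs='sobacklog_hiwat = 12
sobacklog_drops = 7
somaxconn_drops = 0'
# A nonzero drop counter suggests the listen queue limits need attention.
echo "$attrs" | awk -F' = ' '/_drops/ && $2 > 0 { print $1 " = " $2 " (nonzero)" }'
```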
It is recommended that the value of the sominconn attribute equal the value of the somaxconn attribute. If so, the value of somaxconn_drops will be the same as the value of sobacklog_drops.

However, if the value of the sominconn attribute is 0 (the default), and one or more server applications use an inadequate value for the backlog argument to the listen system call, the value of sobacklog_drops may increase at a rate that is faster than the rate at which the somaxconn_drops counter increases. If this occurs, you may want to increase the value of the sominconn attribute.

See Section 10.2.3 for information on tuning socket listen queue limits.
Most resources used by the network subsystem are allocated and adjusted dynamically; however, there are some tuning recommendations that you can use to improve performance, particularly with systems that are Internet servers.
Network performance is affected when the supply of resources is unable to keep up with the demand for resources. The following two conditions can cause this congestion to occur:
A problem with one or more components of the network (hardware or software)
A workload (network traffic) that consistently exceeds the capacity of the available resources even though everything is operating correctly
Neither of these problems is a network tuning issue. In the case of a problem on the network, you must isolate and eliminate the problem. In the case of high network traffic (for example, the hit rate on a Web server has reached its maximum value while the system is 100 percent busy), you must redesign the network and redistribute the load, reduce the number of network clients, or increase the number of systems handling the network load. See the Network Programmer's Guide and the Network Administration manual for information on how to resolve network problems.
Table 10-2 lists network subsystem tuning guidelines and performance benefits as well as tradeoffs.
| Action | Performance Benefit | Tradeoff |
| Increase the size of the hash table that the kernel uses to look up TCP control blocks (Section 10.2.1) | Improves the TCP control block lookup rate and increases the raw connection rate | Slightly increases the amount of wired memory |
| Increase the number of TCP hash tables (Section 10.2.2) | Reduces head lock contention for SMP systems | Slightly increases the amount of wired memory |
| Increase the limits for partial TCP connections on the socket listen queue (Section 10.2.3) | Improves throughput and response time on systems that handle a large number of connections | Consumes memory when pending connections are retained in the queue |
| Increase the number of outgoing connection ports (Section 10.2.4) | Allows more simultaneous outgoing connections | None |
| Modify the range of outgoing connection ports (Section 10.2.5) | Allows you to use ports from a specific range | None |
| Enable TCP keepalive functionality (Section 10.2.6) | Enables inactive socket connections to time out | None |
| Increase the size of the kernel interface alias table (Section 10.2.7) | Improves the IP address lookup rate for systems that serve many domain names | Slightly increases the amount of wired memory |
| Make partial TCP connections time out more quickly (Section 10.2.8) | Prevents clients from overfilling the socket listen queue | A short time limit may cause viable connections to break prematurely |
| Make the TCP connection context time out more quickly at the end of the connection (Section 10.2.9) | Frees connection resources sooner | Reducing the timeout limit increases the potential for data corruption, so this guideline should be applied with caution |
| Reduce the TCP retransmission rate (Section 10.2.10) | Prevents premature retransmissions and decreases congestion | A long retransmit time is not appropriate for all configurations |
| Enable the immediate acknowledgement of TCP data (Section 10.2.11) | Can improve network performance for some connections | May adversely affect network bandwidth |
| Increase the TCP maximum segment size (Section 10.2.12) | Allows sending more data per packet | May result in fragmentation at router boundary |
| Increase the size of the transmit and receive socket buffers (Section 10.2.13) | Buffers more TCP packets per socket | May decrease available memory when the buffer space is being used |
| Increase the size of the transmit and receive buffers for a UDP socket (Section 10.2.14) | Helps to prevent dropping UDP packets | May decrease available memory when the buffer space is being used |
| Allocate sufficient memory to the UBC (Section 10.2.15) | Improves disk I/O performance | May decrease the physical memory available to the virtual memory subsystem |
| Disable the use of a PMTU (Section 10.2.16) | Improves the efficiency of servers that handle remote traffic from many clients | May reduce server efficiency for LAN traffic |
| Increase the size of the ARP table (Section 10.2.17) | May improve network performance on a system that is simultaneously connected to many nodes on the same LAN | Consumes memory resources |
| Increase the maximum size of a socket buffer (Section 10.2.18) | Allows large socket buffer sizes | Consumes memory resources |
| Increase the number of IP input queues (Section 10.2.19) | Reduces IP input queue lock contention for SMP systems | None |
| Prevent dropped input packets (Section 10.2.20) | Allows high network loads | None |
| Enable mbuf cluster compression (Section 10.2.21) | Improves efficiency of network memory allocation | None |
| Modify the NetRAIN retry limit (Section 10.2.22) | Controls the time to detect an interface failure | Aggressive monitoring can increase CPU usage |
| Modify the NetRAIN monitoring timer (Section 10.2.23) | Controls the time to detect an interface failure | Aggressive monitoring can increase CPU usage |
The following sections describe these tuning guidelines in detail.
You can modify the size of the hash table that the kernel uses to look up Transmission Control Protocol (TCP) control blocks. The inet subsystem attribute tcbhashsize specifies the number of hash buckets in the kernel TCP connection table (the number of buckets in the inpcb hash table). The default value is 512.

The kernel must look up the connection block for every TCP packet it receives, so increasing the size of the table can speed the search and improve performance. For Internet, Web, proxy, firewall, and gateway servers, set the tcbhashsize attribute to 16384.
See Section 4.4 for information about modifying kernel subsystem attributes.
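Assuming the change should persist across reboots, the setting can be recorded in /etc/sysconfigtab; the stanza below is a sketch:

```
inet:
    tcbhashsize = 16384
```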
If you have an SMP system, you may be able to reduce head lock contention by increasing the number of hash tables that the kernel uses to look up Transmission Control Protocol (TCP) control blocks.
Because the kernel must look up the connection block for every TCP packet it receives, a bottleneck may occur at the TCP hash table in SMP systems. Increasing the number of tables distributes the load and may improve performance.
The inet subsystem attribute tcbhashnum specifies the number of TCP hash tables. The minimum and default values are 1; the maximum value is 64. For busy Internet server SMP systems, you can increase the value of the tcbhashnum attribute to 16.

It is recommended that you make the value of the tcbhashnum attribute the same as the value of the inet subsystem attribute ipqs. See Section 10.2.19 for information.
See Section 4.4 for information about modifying kernel subsystem attributes.
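As a sketch, the two attributes that the text recommends keeping equal could be set together in a single /etc/sysconfigtab stanza:

```
inet:
    tcbhashnum = 16
    ipqs = 16
```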
You may be able to improve performance by increasing the limits for the socket listen queue (only for TCP). The socket subsystem attribute somaxconn specifies the maximum number of pending TCP connections (the socket listen queue limit) for each server socket. If the listen queue connection limit is too small, incoming connect requests may be dropped. Note that pending TCP connections can be caused by lost packets in the Internet or by denial-of-service attacks. The default value of the somaxconn attribute is 1024; the maximum value is 65535.

To improve throughput and response time with fewer drops, you can increase the value of the somaxconn attribute. A busy system running applications that generate a large number of connections may have many pending connections. For Internet, Web, proxy, firewall, and gateway servers, set the value of the somaxconn attribute to the maximum value of 65535.
The socket subsystem attribute sominconn specifies the minimum number of pending TCP connections (backlog) for each server socket. This attribute controls how many SYN packets can be handled simultaneously before additional requests are discarded. The default value is zero.

The value of the sominconn attribute overrides the application-specific backlog value, which may be set too low for some server software. To improve performance without recompiling an application, and for Internet, Web, proxy, firewall, and gateway servers, set the value of the sominconn attribute to the maximum value of 65535. The value of the sominconn attribute should be the same as the value of the somaxconn attribute.
Network performance can degrade if a client saturates a socket listen queue with erroneous TCP SYN packets, effectively blocking other users from the queue. To eliminate this problem, increase the value of the sominconn attribute to 65535. If the system continues to drop incoming SYN packets, you can decrease the value of the inet subsystem attribute tcp_keepinit to 30 (15 seconds).

See Section 10.1.2 for information about monitoring the sobacklog_hiwat, sobacklog_drops, and somaxconn_drops attributes. If the values show that the queues are overflowing, you may need to increase the socket listen queue limit.
See Section 4.4 for information about modifying kernel subsystem attributes.
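A sketch of an /etc/sysconfigtab stanza applying the listen queue settings recommended above:

```
socket:
    somaxconn = 65535
    sominconn = 65535
```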
When a TCP or UDP application creates an outgoing connection, the kernel dynamically allocates a nonreserved port number for each connection. The kernel selects the port number from a range of values between the value of the inet subsystem attribute ipport_userreserved_min and the value of the ipport_userreserved attribute. Using the default attribute values, the number of simultaneous outgoing connections is limited to 3976 (5000 minus 1024).

If your system requires many outgoing ports, you may want to increase the value of the ipport_userreserved attribute. If your system is a proxy server (for example, a Squid caching server or a firewall system) with a load of more than 4000 simultaneous connections, increase the value of the ipport_userreserved attribute to the maximum value of 65000. Do not reduce the value of the ipport_userreserved attribute to a value less than 5000 or increase it to a value higher than 65000.
You can also modify the range of outgoing connection ports. See Section 10.2.5 for information.
See Section 4.4 for information about modifying kernel subsystem attributes.
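The arithmetic behind these limits is just the size of the dynamic port range; a sketch:

```shell
# Simultaneous outgoing connections = ipport_userreserved minus
# ipport_userreserved_min.
echo $(( 5000 - 1024 ))    # default range 1024-5000: 3976 ports
echo $(( 65000 - 1024 ))   # with ipport_userreserved raised to 65000
```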
When a TCP or UDP application creates an outgoing connection, the kernel dynamically allocates a nonreserved port number for each connection.
The kernel selects the port number from a range of values between the value of the inet subsystem attribute ipport_userreserved_min and the value of the ipport_userreserved attribute. Using the default values for these attributes, the range of outgoing ports starts at 1024 and stops at 5000.

If your system requires outgoing ports from a particular range, you can modify the values of the ipport_userreserved_min and ipport_userreserved attributes. The maximum value of both attributes is 65000. Do not reduce the ipport_userreserved attribute to a value that is less than 5000 or reduce the ipport_userreserved_min attribute to a value that is less than 1024.
See Section 4.4 for information about modifying kernel subsystem attributes.
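As an illustration, a hypothetical /etc/sysconfigtab stanza restricting outgoing ports to a specific range (the 10000 lower bound is an arbitrary example, not a recommendation from this manual):

```
inet:
    ipport_userreserved_min = 10000
    ipport_userreserved = 65000
```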
Keepalive functionality enables the periodic transmission of messages on a connected socket in order to keep connections active. If you enable keepalive, sockets that do not exit cleanly are cleaned up when the keepalive interval expires. If keepalive is not enabled, those sockets will continue to exist until you reboot the system.
Applications enable keepalive for sockets by setting the setsockopt function's SO_KEEPALIVE option. To override programs that do not set keepalive on their own, or if you do not have access to the application sources, set the inet subsystem attribute tcp_keepalive_default to 1 to enable keepalive for all sockets.
If you enable keepalive, you can also configure the following TCP options for each socket:
The inet subsystem attribute tcp_keepidle specifies the amount of idle time before sending a keepalive probe (specified in 0.5-second units). The default interval is 2 hours.

The inet subsystem attribute tcp_keepintvl specifies the amount of time between retransmissions of keepalive probes (specified in 0.5-second units). The default interval is 75 seconds.

The inet subsystem attribute tcp_keepcnt specifies the maximum number of keepalive probes that are sent before the connection is dropped. The default is 8 probes.

The inet subsystem attribute tcp_keepinit specifies the maximum amount of time before an initial connection attempt times out (specified in 0.5-second units). The default is 75 seconds.
See Section 4.4 for information about modifying kernel subsystem attributes.
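Taken together, the defaults above imply how long a dead peer can go undetected; a sketch of the arithmetic, with all attributes in 0.5-second units:

```shell
tcp_keepidle=14400   # 2 hours of idle time before the first probe
tcp_keepintvl=150    # 75 seconds between probes
tcp_keepcnt=8        # probes sent before the connection is dropped
total_units=$(( tcp_keepidle + tcp_keepcnt * tcp_keepintvl ))
echo "connection dropped after about $(( total_units / 2 )) seconds"  # 7800
```

That is, roughly 2 hours and 10 minutes with the defaults.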
The inet subsystem attribute inifaddr_hsize specifies the number of hash buckets in the kernel interface alias table (in_ifaddr). The default value is 32; the maximum value is 512.
If a system is used as a server for many different server domain names, each of which are bound to a unique IP address, the code that matches arriving packets to the right server address uses the hash table to speed lookup operations for the IP addresses. Increasing the number of hash buckets in the table can improve performance on systems that use large numbers of aliases.
For the best performance, the value of the inifaddr_hsize attribute is always rounded down to the nearest power of 2. If you are using more than 500 interface IP aliases, specify the maximum value of 512. If you are using fewer than 250 aliases, use the default value of 32.
See Section 4.4 for information about modifying kernel subsystem attributes.
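The round-down behavior can be illustrated with a small hypothetical helper (not part of the system):

```shell
# Largest power of 2 that does not exceed the configured value, which is
# how the text says inifaddr_hsize is effectively applied.
round_down_pow2() {
  n=$1; p=1
  while [ $(( p * 2 )) -le "$n" ]; do p=$(( p * 2 )); done
  echo "$p"
}
round_down_pow2 300   # a setting of 300 behaves as 256 buckets
round_down_pow2 512   # 512 is already a power of 2
```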
The inet subsystem attribute tcp_keepinit specifies the amount of time that a partially established TCP connection remains on the socket listen queue before it times out. The value of the attribute is in 0.5-second units; the default value is 150 units (75 seconds).

Partial connections consume listen queue slots and fill the queue with connections in the SYN_RCVD state. You can make partial connections time out sooner by decreasing the value of the tcp_keepinit attribute. However, do not set the value too low, because you may prematurely break connections associated with clients on network paths that are slow or that lose many packets. Do not set the value to less than 20 units (10 seconds). If you have a 32000-socket queue limit, the default (75 seconds) is usually adequate.
Network performance can degrade if a client overfills a socket listen queue with TCP SYN packets, effectively blocking other users from the queue. To eliminate this problem, increase the value of the sominconn attribute to the maximum of 65535. If the system continues to drop SYN packets, decrease the value of the tcp_keepinit attribute to 30 (15 seconds).
See Section 4.4 for information about modifying kernel subsystem attributes.
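A sketch of the corresponding /etc/sysconfigtab entry for the 30-unit (15-second) setting mentioned above:

```
inet:
    tcp_keepinit = 30
```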
You can make the TCP connection context time out more quickly at the end of a connection. However, this will increase the chance of data corruption.
The TCP protocol includes a concept known as the Maximum Segment Lifetime (MSL). When a TCP connection enters the TIME_WAIT state, it must remain in this state for twice the value of the MSL; otherwise, undetected data errors on future connections can occur.

The inet subsystem attribute tcp_msl determines the maximum lifetime of a TCP segment and the timeout value for the TIME_WAIT state. The value of the attribute is set in 0.5-second units. The default value is 60 units (30 seconds), which means that the TCP connection remains in the TIME_WAIT state for 60 seconds (twice the value of the MSL).

In some situations, the default timeout value for the TIME_WAIT state (60 seconds) is too large, so reducing the value of the tcp_msl attribute frees connection resources sooner than the default behavior. Do not reduce the value of the tcp_msl attribute unless you fully understand the design and behavior of your network and the TCP protocol. It is strongly recommended that you use the default value; otherwise, there is the potential for data corruption.
See Section 4.4 for information about modifying kernel subsystem attributes.
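Because tcp_msl is expressed in 0.5-second units and TIME_WAIT lasts twice the MSL, the wait in seconds is numerically equal to the attribute value; a sketch:

```shell
tcp_msl=60                       # default: 60 units = 30-second MSL
msl_seconds=$(( tcp_msl / 2 ))   # convert 0.5-second units to seconds
echo "TIME_WAIT lasts $(( 2 * msl_seconds )) seconds"   # 60 seconds
```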
The inet subsystem attribute tcp_rexmit_interval_min specifies the minimum amount of time before the first TCP retransmission. For some wide area networks (WANs), the default value may be too small, causing premature retransmission timeouts. This may lead to duplicate transmission of packets and the erroneous invocation of the TCP congestion-control algorithms.

The tcp_rexmit_interval_min attribute is specified in 0.5-second units. The default value is 2 units (1 second). You can increase the value of the attribute to slow the rate of TCP retransmissions, which decreases congestion and improves performance. However, not every connection needs a long retransmission time; usually, the default value is adequate. Do not specify a value that is less than 1 unit, and do not change the attribute unless you fully understand TCP algorithms.
See Section 4.4 for information about modifying kernel subsystem attributes.
The value of the inet subsystem attribute tcpnodelack determines whether the system delays acknowledging TCP data. The default is 0, which delays the acknowledgment of TCP data. Usually, the default is adequate; however, for some connections (for example, loopback), the delay can degrade performance. You may be able to improve network performance by setting the value of the tcpnodelack attribute to 1, which disables the acknowledgment delay. However, this may adversely affect network bandwidth. Use the tcpdump command to check for excessive delays.
See Section 4.4 for information about modifying kernel subsystem attributes.
The inet subsystem attribute tcp_mssdflt specifies the TCP maximum segment size; the default value is 536. You can increase the value to 1460. This allows more data to be sent per packet, but may cause fragmentation at the router boundary.
See Section 4.4 for information about modifying kernel subsystem attributes.
The inet subsystem attribute tcp_sendspace specifies the default transmit buffer size for a TCP socket, and the tcp_recvspace attribute specifies the default receive buffer size for a TCP socket. The default value of both attributes is 32 KB. You can increase the value of these attributes to 60 KB, which allows you to buffer more TCP packets per socket. However, increasing the values uses more memory when the buffers are being used by an application (sending or receiving data).
You may want to increase the maximum size of a socket buffer before you increase the transmit and receive buffers. See Section 10.2.18 for information.
See Section 4.4 for information about modifying kernel subsystem attributes.
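A sketch of a stanza raising both buffer sizes to 60 KB (61440, assuming the attributes take byte values); the default 128 KB maximum socket buffer size already accommodates buffers of this size:

```
inet:
    tcp_sendspace = 61440
    tcp_recvspace = 61440
```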
The inet subsystem attribute udp_sendspace specifies the default transmit buffer size for an Internet User Datagram Protocol (UDP) socket; the default value is 9 KB. The inet subsystem attribute udp_recvspace specifies the default receive buffer size for a UDP socket; the default value is 40 KB. You can increase the values of these attributes to 64 KB. However, increasing the values uses more memory when the buffers are being used by an application (sending or receiving data). These attributes do not affect the Network File System (NFS).
See Section 4.4 for information about modifying kernel subsystem attributes.
You must ensure that you have sufficient memory allocated to the Unified Buffer Cache (UBC). Servers that perform a great deal of file I/O (for example, Web and proxy servers) make extensive use of both the UBC and the virtual memory subsystem. In most cases, use the default value of 100 percent for the vm subsystem attribute ubc-maxpercent, which specifies the maximum amount of physical memory that can be allocated to the UBC. If necessary, you can decrease the value of the attribute in increments of 10 percent.
See Section 9.2.3 for more information about tuning the UBC.
See Section 4.4 for information about modifying kernel subsystem attributes.
Packets transmitted between servers are fragmented into units of a specific size to ease transmission of the data over routers and small-packet networks, such as Ethernet networks. When the inet subsystem attribute pmtu_enabled is enabled (set to 1, the default), the system determines the largest common path maximum transmission unit (PMTU) value between servers and uses it as the unit size. The system also creates a routing table entry for each client network that attempts to connect to the server.
On a server that handles local traffic and some remote traffic, enabling the use of a PMTU can improve bandwidth. However, if a server handles traffic among many remote clients, enabling the use of a PMTU can cause an excessive increase in the size of the kernel routing tables, which can reduce server efficiency.
If an Internet, Web, proxy, firewall, or gateway server has poor performance and the routing table grows to more than 1000 entries, set the value of the pmtu_enabled attribute to 0 to disable the use of the PMTU protocol.
See Section 4.4 for information about modifying kernel subsystem attributes.
The net subsystem attribute arptab_nb specifies the number of hash buckets in the Address Resolution Protocol (ARP) table (that is, the table's width). The default value is 37. You can modify the value of the arptab_nb attribute if the ARP table is thrashing; in addition, you may be able to improve performance by modifying the attribute. You can view the ARP table by using the arp -a command or the kdbx arp debugger extension. See the Kernel Debugging manual and kdbx(8) for more information.

You can increase the width of the ARP table by increasing the value of the arptab_nb attribute. In general, wide ARP tables decrease the chance that a search will be needed to match an address to an ARP entry. However, changing the attribute value will not affect performance unless the system is simultaneously connected to many nodes on the same LAN, and increasing the value of the arptab_nb attribute will increase the memory used by the ARP table.
See Section 4.4 for information about modifying kernel subsystem attributes.
If you require a large socket buffer, increase the maximum socket buffer size. To do this, increase the value of the socket subsystem attribute sb_max before you increase the size of the transmit and receive socket buffers (see Section 10.2.13). The default maximum socket buffer size is 128 KB.
See Section 4.4 for information about modifying kernel subsystem attributes.
For SMP systems, you may be able to reduce lock contention at the IP input queue by increasing the number of queues and distributing the load. The inet subsystem attribute ipqs specifies the number of IP input queues. The default and minimum value is 1; the maximum value is 64. For busy Internet server SMP systems, you may want to increase the value of the ipqs attribute to 16.

It is recommended that you make the value of the ipqs attribute the same as the value of the inet subsystem attribute tcbhashnum. See Section 10.2.2 for information.
See Section 4.4 for information about modifying kernel subsystem attributes.
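A sketch of a sysconfigtab stanza that sets both attributes to the same value, as recommended above (assuming the standard stanza syntax; the change takes effect at the next boot):

```
inet:
    ipqs = 16
    tcbhashnum = 16
```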
If the IP input
queue overflows under a heavy network load, input packets may be dropped.
To check for dropped packets, examine the
ipintrq
kernel structure by using
dbx.
For example:
# dbx -k /vmunix
(dbx) print ipintrq
struct {
    ifq_head = (nil)
    ifq_tail = (nil)
    ifq_len = 0
    ifq_maxlen = 512
    ifq_drops = 0
        .
        .
        .
If the
ifq_drops
field is not 0, increase the
value of the
inet
subsystem attribute
ipqmaxlen.
For example, you may want to increase the value to 2000.
The default and minimum
value is 512; the maximum value is 65535.
See Section 4.4 for information about modifying kernel subsystem attributes.
The
ipqmaxlen
attribute cannot be tuned at run time.
You can immediately determine the impact of the kernel modification
by using
dbx
to increase the value of the
ipintrq.ifq_maxlen
kernel variable.
For example:
# dbx -k /vmunix
(dbx) patch ipintrq.ifq_maxlen = 2000
See
Section 4.4.6
for information about using
dbx.
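Because the
ipqmaxlen
attribute cannot be tuned at run time, the permanent change goes into /etc/sysconfigtab. A sketch, assuming the standard stanza syntax:

```
# Takes effect at the next boot.
inet:
    ipqmaxlen = 2000
```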
The
socket
subsystem attribute
sbcompress_threshold
controls whether
mbuf
clusters are compressed.
By default,
mbuf
clusters are not compressed
(sbcompress_threshold
is set to 0), which can
cause proxy servers
to consume all the available
mbuf
clusters.
This situation
is more likely to occur if you are using FDDI instead of Ethernet.
To enable
mbuf
cluster compression, modify the default
value of the
socket
subsystem attribute
sbcompress_threshold.
Packets will be copied into the existing
mbuf
clusters
if the packet size is less than this value.
For proxy servers,
specify a value of 600.
To determine the memory that is being used for
mbuf
clusters, use the
netstat -m
command.
The
following example is from a
firewall server with 128 MB memory that does not have
mbuf
cluster compression enabled:
# netstat -m
2521 Kbytes for small data mbufs (peak usage 9462 Kbytes)
78262 Kbytes for mbuf clusters (peak usage 97924 Kbytes)
8730 Kbytes for sockets (peak usage 14120 Kbytes)
9202 Kbytes for protocol control blocks (peak usage 14551 Kbytes)
2 Kbytes for routing table (peak usage 2 Kbytes)
2 Kbytes for socket names (peak usage 4 Kbytes)
4 Kbytes for packet headers (peak usage 32 Kbytes)
39773 requests for mbufs denied
0 calls to protocol drain routines
98727 Kbytes allocated to network
The previous example shows that 39773 requests for mbufs were
denied.
This indicates a problem, because the
value should be 0.
The example also shows that 78 MB
of memory has been assigned to
mbuf
clusters,
and that 98 MB of memory is being consumed by the network subsystem.
If you increase the value of the
sbcompress_threshold
attribute to 600, the
memory allocated to
the network subsystem immediately decreases to 18 MB, because compression
at the kernel socket buffer interface results in a more efficient
use of memory.
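As a sketch, assuming the attribute can be modified at run time on your version:

```
# Copy packets smaller than 600 bytes into existing mbuf clusters
# (the value recommended above for proxy servers).
sysconfig -r socket sbcompress_threshold=600
# Re-examine network memory use after the change.
netstat -m
```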
The
netrain
subsystem attribute
nr_max_retries
specifies how many failed tests must occur before
a NetRAIN interface is determined to have failed and a backup interface
is brought on line.
The default value is 4.
Decreasing the default value will cause NetRAIN to be more aggressive about declaring an interface to be failed and forcing an interface failover, but will increase CPU usage. Increasing the default value will cause NetRAIN to be more tolerant of temporary failures, but may result in long failover times.
For ATM LAN Emulation (LANE) interfaces, set the value of the
nr_max_retries
attribute to 5.
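For example, assuming run-time modification is supported on your version:

```
# Require 5 failed tests before declaring a LANE interface down.
sysconfig -r netrain nr_max_retries=5
```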
The
netrain
subsystem attribute
netrain_timeout
specifies the number of clock ticks between runs of the kernel
thread that monitors the health of the network interfaces.
All other
NetRAIN timers are based on this
frequency.
The default value is 1000 ticks (1 second).
Decreasing the value of the
netrain_timeout
attribute
causes NetRAIN to monitor network interfaces more aggressively so that it can
quickly detect a failed interface, but it will increase CPU usage.
Increasing the value will cause NetRAIN to be more tolerant of
temporary failures, but may result in long failover times.
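As an illustrative sketch (the value of 500 ticks is an example, not a recommended setting), assuming run-time modification is supported:

```
# Halve the monitoring interval to detect failed interfaces faster,
# at the cost of additional CPU time.
sysconfig -r netrain netrain_timeout=500
```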