This appendix describes performance aspects of the Transmission Control Protocol (TCP). It discusses how programs can influence TCP throughput by using socket options to control the window size used by TCP.
TCP throughput depends on the transfer rate, which is the rate at which the network can accept packets, and the round-trip time, which is the delay between the time a TCP segment is sent and the time an acknowledgment arrives for that segment. The product of these two factors determines the amount of data that must be buffered (the window) before an acknowledgment is received to obtain maximum throughput on a TCP connection.
If the transfer rate, the round-trip time, or both are high, the default window size used by TCP may be insufficient to keep the pipe fully loaded. Under these circumstances, TCP throughput can be limited because the sender must stall until acknowledgments for prior data are received.
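The relationship between transfer rate, round-trip time, and window size can be made concrete with a small calculation. The following is a minimal sketch, not part of the original appendix; the function name and the three example paths are hypothetical and chosen only for illustration. The 32768-byte figure is the default buffer size cited in this appendix.

```python
def required_window_bytes(rate_bits_per_sec: float, rtt_seconds: float) -> int:
    """Bytes that must be in flight to keep the path full (rate x round-trip time)."""
    return round(rate_bits_per_sec * rtt_seconds / 8)


DEFAULT_WINDOW = 32768  # default TCP socket buffer size cited in this appendix

# Hypothetical paths, chosen only for illustration.
paths = {
    "10 Mb/s LAN, 2 ms RTT": (10e6, 0.002),
    "100 Mb/s FDDI ring, 2 ms RTT": (100e6, 0.002),
    "45 Mb/s long-haul link, 70 ms RTT": (45e6, 0.070),
}

for name, (rate, rtt) in paths.items():
    need = required_window_bytes(rate, rtt)
    verdict = "sufficient" if need <= DEFAULT_WINDOW else "too small"
    print(f"{name}: needs {need} bytes; a {DEFAULT_WINDOW}-byte window is {verdict}")
```

Under these assumed numbers, only the long-haul path needs a window larger than the default, which is the case the rest of this appendix addresses.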
The receive socket buffer size determines the maximum receive window for a TCP connection. The transfer rate from a sender can also be limited by the send socket buffer size. DEC OSF/1 currently uses a default value of 32768 bytes for TCP send and receive buffers.
An application can override the default TCP send and receive socket buffer sizes by calling the setsockopt system call with the SO_SNDBUF and SO_RCVBUF options before establishing the connection. The largest size that can be specified with the SO_SNDBUF and SO_RCVBUF options is limited by the kernel variable sb_max. See Section C.3.1 for information about increasing this value.
For maximum throughput, Digital recommends that the send and receive socket buffers on both ends of the connection be of equal size.
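The following sketch shows one way an application might request equal send and receive buffer sizes before connecting. It uses Python's standard socket module, which wraps the setsockopt system call; the helper name make_tcp_socket and the 65536-byte request are assumptions made for illustration, not values taken from this appendix.

```python
import socket


def make_tcp_socket(buf_bytes: int) -> socket.socket:
    """Create a TCP socket and request equal send and receive buffer sizes.

    The options are set before connect() or listen() is called, so the
    requested receive buffer is in place when the connection is established.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buf_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buf_bytes)
    return sock


sock = make_tcp_socket(65536)
# The kernel may adjust the request (rounding or capping it, depending on
# the system), so read the values back rather than assuming them.
print("SO_SNDBUF:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
print("SO_RCVBUF:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
# A call such as sock.connect((host, port)) would follow here.
sock.close()
```

Reading the values back with getsockopt is a cheap way to see what the system actually granted, since the reported size can differ from the requested size.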
When writing programs that use the setsockopt system call to change a TCP socket buffer size (SO_SNDBUF, SO_RCVBUF), note that the actual socket buffer size used for a TCP connection can be larger than the specified value. This situation occurs when the specified socket buffer size is not a multiple of the TCP Maximum Segment Size (MSS) to be used for the connection.
TCP determines the actual size by rounding the specified size up to the next multiple of the negotiated MSS. For local network connections, the MSS is generally determined by the network interface type and its maximum transmission unit (MTU).
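The rounding described above amounts to a ceiling division. The sketch below illustrates the arithmetic only; the function name is hypothetical, and the MSS values are assumptions (the interface MTU minus 40 bytes of IP and TCP headers with no options), not figures taken from this appendix.

```python
def effective_buffer_size(requested: int, mss: int) -> int:
    """Round a requested socket buffer size up to the next multiple of the MSS."""
    if requested <= 0 or mss <= 0:
        raise ValueError("requested size and MSS must be positive")
    return -(-requested // mss) * mss  # ceiling division, then scale back up


# Assumed MSS values: MTU minus 40 header bytes, for Ethernet (MTU 1500)
# and FDDI (MTU 4352).
for label, mss in (("Ethernet", 1460), ("FDDI", 4312)):
    print(f"{label}: 32768 requested -> {effective_buffer_size(32768, mss)} used")
```

With these assumed MSS values, a 32768-byte request becomes 33580 bytes on Ethernet and 34496 bytes on FDDI. A request that is already a multiple of the MSS is left unchanged.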
DEC OSF/1 implements the TCP window scale option, as defined in RFC 1323: TCP Extensions for High Performance. The TCP window scale option, which allows larger windows to be used, was designed to increase throughput of TCP over high bandwidth, long delay networks. This option may also increase throughput of TCP in local FDDI networks.
The window field in the TCP header is 16 bits. Therefore, the largest window that can be used without the window scale option is (2**16)-1 bytes (65535 bytes, just under 64KB). When the window scale option is used between cooperating systems, windows of up to 65535 * 2**14 bytes (just under 2**30 bytes, or 1GB) are allowed. The option, transmitted between TCP peers at the time a connection is established, defines a scale factor that is applied to the window size value in each TCP header to obtain the actual window size.
The maximum receive window, and therefore the scale factor offered by TCP during connection establishment, is determined by the maximum receive socket buffer space.
If the receive socket buffer size is greater than 65535 bytes, during connection establishment, TCP will specify the Window Scale option with a scale factor based on the size of the receive socket buffer. Both systems involved in the TCP connection must send the Window Scale option in their SYN segments for window scaling to occur in either direction on the connection. As stated previously, Digital recommends that, for maximum throughput, send and receive buffers on both ends of the connection be of equal size.
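The sketch below illustrates how a scale factor could be derived from the receive buffer size and then applied to the 16-bit window field. It is modeled on the loop commonly found in BSD-derived TCP implementations; the exact rule DEC OSF/1 uses is not given in this appendix, so treat the selection logic and the function names as assumptions. The limits of 65535 and 14 come from the 16-bit window field and RFC 1323.

```python
TCP_MAXWIN = 65535       # largest value of the 16-bit window field
TCP_MAX_WINSHIFT = 14    # largest shift count permitted by RFC 1323


def window_shift(recv_buf_bytes: int) -> int:
    """Smallest shift count that lets the window field cover the receive buffer."""
    shift = 0
    while shift < TCP_MAX_WINSHIFT and (TCP_MAXWIN << shift) < recv_buf_bytes:
        shift += 1
    return shift


def actual_window(header_window: int, shift: int) -> int:
    """Apply the negotiated scale factor to the window value from a TCP header."""
    return header_window << shift


for size in (32768, 65535, 65536, 131072, 153600):
    print(f"receive buffer {size}: shift count {window_shift(size)}")

print("largest scaled window:", actual_window(TCP_MAXWIN, TCP_MAX_WINSHIFT))
```

Under this rule, buffers of 65535 bytes or less need no scaling, and the 128KB and 150KB sizes discussed in this appendix both yield a shift count of 2.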
The sb_max kernel variable limits the amount of socket buffer space that can be allocated for each send and receive buffer. The current default is 128KB, but you can increase it if necessary.
For local FDDI connections, the current value is sufficient. For long delay, high bandwidth paths, values greater than 128KB may be required.
To change the sb_max kernel variable, use the dbx -k command as root. The following example shows how to increase the sb_max variable in the kernel disk image, as well as the kernel currently in memory, to 150KB:
# dbx -k /vmunix
dbx version 9.0.1
Type 'help' for help.
stopped at [thread_block:1305 +0x114,0xfffffc000033961c] \
        Source not available
(dbx) assign sb_max = 153600
153600
(dbx) patch sb_max = 153600
153600
(dbx) quit
See dbx(1) for a description of the dbx assign and patch commands.