This chapter discusses the following topics:
A general overview of cluster aliases (Section 6.1)
Cluster alias subsystem components (Section 6.2)
The default cluster alias (Section 6.3)
The number of aliases per cluster (Section 6.4)
The location of alias IP addresses (Section 6.5)
Routing for alias IP addresses (Section 6.6)
in_single and in_multi services (Section 6.7)
Alias attributes (Section 6.8)
Service port attributes (Section 6.9)
vMAC support (Section 6.10)
NFS and cluster alias (Section 6.11)
RPC services and cluster alias (Section 6.12)
ifconfig aliases and cluster aliases (Section 6.13)
6.1 Cluster Alias Overview
A cluster alias is an IP address that makes some or all of the systems in a cluster look like a single system to Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) applications. Figure 6-1 shows how a network client views the systems in a cluster with and without a cluster alias.
Figure 6-1: Client's View of a Cluster With and Without Cluster Alias
Cluster aliases free clients from having to connect to specific cluster members for services. Just as clients can request a variety of services from a single host, clients can request a variety of services from a cluster alias. For example, you can telnet or rlogin to a cluster alias as you do to a single host.
A cluster can have more than one cluster alias. One alias, the default cluster alias, is created during cluster installation, and all members can receive packets that are addressed to this alias. You can create additional aliases as needed.
You can think of a cluster alias as a distributed, virtual, clusterwide network interface. In that sense, a cluster alias is conceptually similar to an ifconfig alias, where a single physical network interface responds to more than one IP address.
Each system in a cluster explicitly joins the aliases to which it wants to belong. After a system joins an alias, it is a member of that alias. Using the analogy that a cluster alias is similar to an address on a virtual network interface, joining an alias is similar to issuing an ifconfig up command for that alias interface. The member can now receive packets addressed to the alias.
Clients send TCP connection requests or UDP messages to the IP address representing an alias. The cluster transparently routes the request or message to a cluster node that is a current member of that alias. The hop within the cluster uses the cluster interconnect, not network routing.
If a member of an alias is unavailable, the cluster stops sending
packets to that member and routes packets to active members of that
alias.
As long as one member of an alias is active, the alias is
available.
6.2 Cluster Alias Subsystem Components
The cluster alias subsystem has the following main components:
The kernel portion of the cluster alias subsystem, clua, which is a configurable kernel subsystem loaded at boot time.
A user-level daemon, aliasd. The kernel communicates with this daemon to manage routing for cluster aliases. The alias daemon transparently handles the routing configuration for cluster aliases, automatically adding any needed host routes (and network routes for alias addresses on virtual subnets) for cluster aliases to that member's /etc/gated.conf.membern file. The daemon starts gated using this file as gated's configuration file rather than the member's /cluster/members/{memb}/etc/gated.conf file. The cluamgr command provides options that can modify the daemon's behavior. Each cluster member runs aliasd.
Note
The aliasd daemon supports only the Routing Information Protocol (RIP).
An administrative interface that provides both a command-line and a graphical user interface (GUI) to manage aliases and alias attributes. The command-line interface is the cluamgr command. The GUI is accessed from the SysMan Menu.
A member-specific alias configuration file, /etc/clu_alias.config, which contains the cluamgr commands that configure aliases, including the default cluster alias, for that member.
A clusterwide application configuration file, /etc/clua_services, which assigns alias-related attributes to ports used by services. The /etc/clua_services file is the cluster alias extension of the /etc/services file. The clua_services file extends the services syntax to assign alias-related attributes to ports.
A clusterwide file, /etc/exports.aliases, which contains the names of non-default cluster aliases that NFS clients can use. By default, only NFS requests directed to the default cluster alias are accepted by the cluster. This file lets you use additional aliases as NFS server names. This is useful, for example, when not all members of a cluster are directly connected to the storage that contains exported file systems. In this case, you can create an alias that encompasses only those members directly connected to the storage, and then tell users on NFS client systems to use that cluster alias when requesting NFS services from the cluster.
An application programming interface (API), libclua.
Figure 6-2
provides a functional
overview of the cluster alias subsystem components.
Figure 6-2: Cluster Alias Functional Overview
6.3 The Default Cluster Alias
There is one special alias, called the default cluster alias. During installation, the cluster is given a name, which is stored in /etc/sysconfigtab as the value of the cluster_name attribute. The installation procedure adds an entry to /etc/hosts, which associates this cluster name with a user-specified default cluster alias IP address. For example, for a cluster named deli whose alias IP address is 16.140.112.209, the installation procedure adds the following entry to /etc/hosts:
16.140.112.209 deli.zk3.dec.com deli
Each cluster member is a member of the default cluster alias. The command that makes a cluster member a member of the default cluster alias is in each member's /etc/clu_alias.config file. All cluster members automatically join the default cluster alias at boot time.
Figure 6-3
shows a three-node cluster
with two cluster aliases.
All members belong to alias A, the default
cluster alias, but only two members belong to alias B.
Figure 6-3: Cluster Using Two Aliases
Several standard Internet services, such as telnet and login, use the IP address of the default cluster alias as the source address for outgoing packets.
Cluster alias IP addresses, including that of the default cluster alias, must be on a network accessible to cluster clients; that is, clients must be able to route to this subnet. For this reason, cluster alias IP addresses cannot be on the cluster interconnect (the subnet used by the cluster for internal communication).
6.4 The Number of Aliases per Cluster
For many clusters, the default cluster alias provides sufficient access for cluster clients. Whether or not a cluster will benefit from having additional aliases depends on the symmetry (storage and network) of the cluster, and whether you want all members to handle client requests for all services. Additional aliases are useful in the following situations:
In a heterogeneous cluster where some devices or applications are best served through a subset of cluster members.
If you want to restrict services to a subset of cluster members in order to reduce the internal forwarding of requests and packets.
In a cluster where not all members are directly connected to the storage containing file systems exported by the cluster. In this case, using an alias that encompasses just those systems that are directly connected to this storage reduces traffic across the cluster interconnect.
After cluster installation, you can define as many aliases as are needed for a cluster. The default value for the clua subsystem max_aliasid attribute is 8; the maximum value is 102,400. In practice, this upper limit is probably constrained by available memory, but the useful range should meet all practical needs.
One suggestion is to use the default alias for a while, and then decide whether your site can benefit from additional aliases. In many cases, the default alias is sufficient; the Cluster Administration manual describes a situation where a site uses two aliases for load balancing.
6.5 The Location of Alias IP Addresses
A cluster alias address can be in one of two types of subnets:
A subnet to which one or more cluster systems are connected with physical network interfaces.
Using a common subnet for cluster aliases works well when the cluster is connected to only a single local area network, and that network is managed as a single IP address domain.
Cluster alias routing in a common subnet is based on proxy Address Resolution Protocol (ARP) support. For each alias, one cluster member acts as the proxy ARP master for that alias.
A cluster alias resides in a virtual subnet if its address is in a subnet that is not associated with any physical interfaces.
A virtual subnet is made visible to the physical network by gated, the gateway routing daemon. If the cluamgr virtual option is assigned to an alias address, a cluster member advertises a host route and a network route to the alias. Multiple clusters on the same LAN can use the same virtual subnet.
Caution
A virtual subnet must not have any real systems in it.
The choice of subnet type depends mainly on whether the existing subnet to which the cluster is connected (that is, the common subnet) has enough addresses available for cluster aliases. If addresses are not available on an existing subnet, consider creating a virtual subnet. A lesser consideration is that if a cluster is connected to multiple subnets, configuring a virtual subnet has the advantage of being uniformly reachable from all of the connected subnets. However, this advantage is more a matter of style than substance. It does not make much practical difference which type of subnet you use for cluster alias addresses; do whatever makes the most sense at your site.
Regardless of the type of subnet, it must be configured so that packets from clients can be routed to alias addresses. Services that use cluster aliases will not be accessible to clients if those alias addresses are on a virtual or a common subnet that clients cannot reach.
A cluster alias address should not be a broadcast address or a multicast address, nor should it reside in the subnet used by the cluster interconnect. Although you can assign a cluster alias an IP address that resides in one of the private address spaces defined in RFC 1918, you must use the cluamgr -r resvok command in order for the alias subsystem to advertise a route to the alias address. (See cluamgr(8) for information on using the resvok flag and how to add an entry to /etc/rc.config.common to make the route advertising persist beyond reboots.)
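The address restrictions above can be expressed as a small pre-flight check. The following is a minimal sketch, not part of the cluster alias software: the function name check_alias_address and the interconnect subnet passed to it are hypothetical, and only the limited broadcast address is tested because a subnet-directed broadcast address cannot be recognized without the subnet mask.

```python
import ipaddress

# The three private address blocks defined in RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def check_alias_address(address, interconnect_subnet):
    """Return a list of findings for a candidate cluster alias address.

    interconnect_subnet is the cluster interconnect subnet in CIDR form.
    It is supplied by the caller (hypothetical values in the examples).
    """
    addr = ipaddress.ip_address(address)
    findings = []
    if addr.is_multicast:
        findings.append("multicast address")
    # Only the limited broadcast address is detected here.
    if addr == ipaddress.ip_address("255.255.255.255"):
        findings.append("broadcast address")
    if addr in ipaddress.ip_network(interconnect_subnet):
        findings.append("inside the cluster interconnect subnet")
    if any(addr in net for net in RFC1918_NETWORKS):
        findings.append("RFC 1918 address: needs cluamgr -r resvok")
    return findings
```

An empty list means that none of the restrictions described above applies; an RFC 1918 finding is a reminder rather than an error.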
6.6 Routing for Alias IP Addresses
This section discusses how routes to aliases are advertised and how packets addressed to aliases are taken off the wire:
Advertising routes to aliases (Section 6.6.1)
Routing for aliases on common subnets (Section 6.6.2)
Routing for aliases on virtual subnets (Section 6.6.3)
Summary of routing for aliases on common and virtual subnets (Section 6.6.4)
Accepting and redirecting packets and connection requests addressed to an alias (Section 6.6.5)
Routing example (Section 6.6.6)
The terms used in these sections are defined in the glossary. If you are not familiar with a term, read its glossary definition before continuing.
6.6.1 Advertising Routes to Aliases
An alias router is a cluster member that makes a cluster alias address known to the network and receives incoming packets for that alias. By default, all cluster members are configured as alias routers for the default cluster alias at boot time. Any cluster member can be configured to advertise a host or a network route to any alias.
Note
By default, cluster members route only for cluster aliases; they are not configured as general purpose routers. Whether or not a site decides to configure one or more cluster members to route for non-alias traffic is the responsibility of the network administrators at that site.
A cluster member does not have to join an alias in order to route for that alias. In the following example, a cluster member specifies alias1, and specifies and joins alias2. The member will route packets addressed to either alias, but will only receive requests/packets addressed to alias2:
/usr/sbin/cluamgr -a alias=alias1
/usr/sbin/cluamgr -a alias=alias2,join
You can put these commands in a member's /etc/clu_alias.config file to ensure that the commands are run at boot time.
6.6.2 Routing for Aliases on Common Subnets
For each alias, all cluster members that are aware of the alias (have either specified or joined the alias) use gated to advertise a host route to that alias. The aliasd daemon automatically configures /etc/gated.conf.membern to advertise a host route at boot time based on the information in /etc/clu_alias.config.
Only one alias member at a time responds to Address Resolution Protocol (ARP) requests for a given cluster alias. This member is the proxy ARP master for the alias. If this system fails, another is elected to take over the role of proxy ARP master.
Note
In routing tables, host routes take precedence over the interface routes used for proxy ARP. Proxy ARP applies only to alias addresses configured on a common subnet (a physical network).
If multiple cluster alias addresses are defined, you can use the rpri alias attribute to balance the incoming load somewhat by giving different nodes the highest routing priority for different alias addresses. With a single alias and ARP-based routing, only one member acts as the alias router at a time. (Section 6.8 describes the router priority, rpri, attribute.)
However, all cluster members that are aware of an alias use gated to set up a host route to each cluster alias on each of their network interfaces. Because host routes take precedence over the interface routes used with ARP, any client in the same subnet as the cluster that is running a route daemon sees the host routes. Depending on factors such as the order in which nodes boot, different clients may see a different cluster node's host route first. Therefore, clients might use different cluster nodes as their route to the cluster alias. (This is not guaranteed to be uniformly distributed. If the clients boot before the cluster, all clients will see and use the host route through the first cluster node that advertises one.) Client nodes that do not run a route daemon, such as routed or gated, find the cluster alias using ARP, and are routed through the proxy ARP master.
6.6.3 Routing for Aliases on Virtual Subnets
The alias daemon, aliasd, creates a /etc/gated.conf.membern file for each cluster member. The alias configuration process modifies this configuration file to advertise each alias address in a virtual subnet as a host route. No manual modification is required; each member's alias daemon automatically modifies that member's /etc/gated.conf.membern file to advertise a route to each cluster alias host address through each network interface on that member.
If the cluamgr virtual option is assigned to an alias address, the cluster member also advertises a network route to the virtual subnet containing the alias address. To ensure that the virtual subnet's location is known to the network, make sure that at least one member, and preferably all members, specifies the cluamgr command virtual=t option for at least one alias in each virtual subnet.
As with common subnet route advertising, the routing load may be balanced across multiple cluster nodes, depending on which route advertisements the clients see in what order.
6.6.4 Summary of Routing for Aliases on Common and Virtual Subnets
Table 6-1
summarizes the types of
routes that are advertised for cluster aliases on common and virtual
subnets.
Table 6-1: Summary of Routing for Aliases on Common and Virtual Subnets
Subnet Type | Address Domain | Proxy ARP | Host Route via gated | Network Route via gated |
Common | A subnet to which a cluster is connected. | Yes [Footnote 1] | Yes [Footnote 2] | No |
Virtual | A subnet with no physical connections that appears to exist 'behind' the cluster. | No | Yes [Footnote 2] | Yes [Footnote 3] |
6.6.5 Accepting and Redirecting Packets and Connection Requests Addressed to an Alias
Normal routing ensures that a packet addressed to a cluster alias arrives at exactly one cluster node. (One packet is not handled by multiple cluster members.) That node determines the cluster member to receive and process the packet based on the following:
Which members of the alias are available (the alias subsystem monitors calls to
bind()
and
listen()
)
The port number
The type of packet
A weighted round-robin algorithm
The following table describes how packets are redirected within a cluster:
Packet Type | Action |
New TCP/IP connection | Look at the packet and make a list of eligible members for the target port. Use the weighted round-robin algorithm to select a member from the list of active listening members. Forward the packet to the selected member. |
Existing TCP/IP connection | Determine which alias member owns this connection. Forward the packet to the member. |
UDP | Look at the packet and make a list of eligible members for the target port. Use the weighted round-robin algorithm to select a member from the list of active listening members. Forward the packet to the selected member. See Section 6.11 for information on how NFS over UDP is handled by the cluster alias subsystem. |
ICMP (some ICMP packets must be handled in cluster-alias context) | Look at the packet and determine whether to handle it or forward it to another member. If needed, forward the packet to that member. |
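The redirection rules in the table can be modeled in a few lines. This is a toy sketch under stated assumptions, not the kernel's implementation: the class name AliasDispatcher and the packet kinds "tcp_new", "tcp_existing", and "udp" are hypothetical, every listener is given equal weight (plain round-robin instead of the weighted algorithm), and ICMP handling is left out.

```python
from itertools import cycle


class AliasDispatcher:
    """Toy model of how the receiving node chooses a member for a packet."""

    def __init__(self, listeners):
        # listeners maps a port number to the members listening on it.
        self._rotation = {port: cycle(members) for port, members in listeners.items()}
        # Remembers which member owns each established TCP connection.
        self._owner = {}

    def dispatch(self, kind, client, port):
        key = (client, port)
        if kind == "tcp_new":
            # New connection: pick the next listening member and record it.
            self._owner[key] = next(self._rotation[port])
            return self._owner[key]
        if kind == "tcp_existing":
            # Existing connection: always forward to the owning member.
            return self._owner[key]
        if kind == "udp":
            # Each UDP message is selected independently.
            return next(self._rotation[port])
        raise ValueError(f"unsupported packet kind: {kind}")
```

The point of the sketch is the asymmetry between the rows: a TCP connection stays with the member that accepted it, while each UDP message goes through selection again.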
6.6.6 Routing Example
Figure 6-4 shows a cluster with interfaces on three networks, two public common networks and one private virtual network. The default cluster alias IP address is on the virtual subnet.
Figure 6-4: Alias Routing Example
If the correct cluamgr commands are used to configure the alias, the gated daemon on each node will advertise on all connected networks:
A host route to that address for the benefit of local nodes.
A network route to the virtual network to ensure that nodes beyond these networks can locate the virtual subnet.
The following examples show the cluamgr commands run on hosts A and B to advertise routes to the cluster alias on the virtual subnet:
To have gated advertise a host route for the alias:
# cluamgr -a alias=16.140.240.153
To have gated advertise a host route and a network route (16.140.240.0) for the alias:
# cluamgr -a alias=16.140.240.153,virtual=t
To have gated advertise a host route and a network route (16.140.240.0) for the alias, and to receive packets and connection requests addressed to the alias:
# cluamgr -a alias=16.140.240.153,virtual=t,join
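The relationship between the alias address and the routes named in these examples can be checked with a short calculation. This is an illustrative sketch only: the function advertised_routes is hypothetical, and the 24-bit prefix length is an assumption chosen because it yields the network number 16.140.240.0 given in the text; the actual subnet mask is not stated.

```python
import ipaddress


def advertised_routes(alias_address, prefix_length, virtual=False):
    """Return the routes a member would advertise for one alias.

    prefix_length is supplied by the caller and is an assumption;
    the text gives only the resulting network number.
    """
    iface = ipaddress.ip_interface(f"{alias_address}/{prefix_length}")
    # A host route to the alias is advertised in every case.
    routes = [("host", str(iface.ip))]
    if virtual:
        # With virtual=t, a network route to the virtual subnet is added.
        routes.append(("network", str(iface.network.network_address)))
    return routes
```

Calling advertised_routes("16.140.240.153", 24, virtual=True) returns the host route plus the network route 16.140.240.0, matching the second and third commands; the join keyword affects packet delivery, not route advertisement, so it does not appear in the sketch.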
6.7 in_single and in_multi Services
Service ports that are accessed through a cluster alias are defined as either in_single or in_multi. These service port attributes determine the routing of network requests to applications, not whether an application can run on more than one member at the same time. From the point of view of the cluster alias subsystem:
When a service's port is designated as in_single, only one alias member receives connection requests or packets for that service. If that member becomes unavailable, the cluster alias subsystem selects another eligible member of that alias to receive all connection requests or packets.
When a service's port is designated as in_multi, the alias subsystem distributes connection requests and packets among all eligible members of the alias.
By default, the cluster alias subsystem treats all services as in_single. For the cluster alias subsystem to treat a service's port as in_multi, the port must be registered as in_multi, either in /etc/clua_services or through a call to clua_registerservice(). See Section 6.9 for more information on service port attributes.
A service whose port is designated as in_multi can take advantage of cluster aliasing to distribute incoming TCP connection requests and UDP packets among members of the alias. The alias subsystem provides load balancing through a weighted round-robin algorithm that distributes requests/packets among alias members. If one member of an alias cannot respond to client requests, the cluster alias software transparently distributes requests/packets among the remaining alias members.
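The difference between the two designations comes down to how many alias members are eligible to receive traffic for a port at any moment. The following minimal sketch illustrates that idea; the function eligible_receivers is hypothetical, and choosing the earliest remaining member in bind order after a failure is an assumption, because the text does not say which member takes over.

```python
def eligible_receivers(port_type, members_in_bind_order, available):
    """Return the alias members that may receive traffic for a port.

    members_in_bind_order lists the members running the service in the
    order they bound the port; available is the set of members that are up.
    """
    candidates = [m for m in members_in_bind_order if m in available]
    if port_type == "in_multi":
        # Requests are distributed among every eligible member.
        return candidates
    if port_type == "in_single":
        # Only one member receives requests (assumption: earliest binder).
        return candidates[:1]
    raise ValueError(f"unknown port type: {port_type}")
```

For example, with members a, b, and c all available, an in_single port is served by a alone; if a becomes unavailable, b takes over, whereas an in_multi port would be served by b and c together.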
Note
Cluster alias and CAA are separate subsystems with complementary but different functions. CAA is an application-control tool; cluster alias is a routing tool. CAA decides where an application will run; cluster alias decides how to get there. You cannot use CAA to control routing within the cluster; you cannot use cluster aliases to control where an application is running in the cluster. The Cluster Administration manual provides more information on the differences between cluster alias and CAA.
The following two figures show how the alias subsystem distributes client requests for in_single and in_multi services. For the in_single service (Figure 6-5), all requests are sent to the alias member currently running the service. For the in_multi service (Figure 6-6), requests are distributed among all alias members.
Figure 6-5: in_single Service Accessed Through Default Cluster Alias
Figure 6-6: in_multi Service Accessed Through Default Cluster Alias
6.8 Alias Attributes
Alias attributes are member-specific. Each cluster member has its own view of an alias. For example, one cluster member can route for an alias but not be a member of that alias, but another cluster member can both route for that alias and be an end recipient for requests or messages addressed to that alias. Similarly, alias attributes are also alias-specific. A cluster member can join two aliases and assign a different selection weight to each alias, thus ensuring that the member system receives a higher proportion of connections addressed to the alias with the higher selection weight.
Aliases and their attributes are managed through the cluamgr command and the SysMan Menu. The SysMan Menu calls cluamgr as needed.
The following attributes control the routing and distribution of connection requests and packets among members of an alias. The descriptions are paraphrased from those in cluamgr(8), which describes these and other alias attributes.
The router priority (rpri) controls the proxy ARP router selection for an alias on a common subnet. For each alias in a common subnet, the cluster member with the highest router priority for that alias responds to ARP requests for that alias. Note that this option does not control which members broadcast host or network routes for aliases.
When a cluster has more than one cluster alias, you can use router priority to spread the proxy ARP response overhead for aliases among cluster members. (This option is irrelevant for an alias whose address is in a virtual subnet.)
The selection priority (selp) identifies subsets of members of an alias for the assignment of new connection requests. The selection priority establishes a hierarchy within the members of an alias. Connection requests are distributed among those members sharing the highest selection priority value. If an alias has three members, two with selp=10 and one with selp=5, no connection requests or messages are given to the selp=5 member as long as either of the selp=10 members is available.
You can use selection priority values to set up a failover order for members of a particular cluster alias.
Selection weight provides a simple, static method for controlling which members of an alias get the most connections. The selection weight (selw) indicates the number of connections (on average) that this member is given before connections are given to the next alias member with the same selp value. (The selp value determines the order in which members are eligible to receive requests or messages; the selw value determines how many requests or messages a member gets after it is eligible.)
If node A is larger than node B and can handle 50 percent more connections, then assign, for instance, selw=3 to an alias on node A, and selw=2 to an alias on node B.
Selection weight applies only to applications that are registered as in_multi services. (All traffic for an in_single service must go to the cluster member running that service.)
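The interaction of selp and selw described above can be sketched as a two-step selection: keep only the available members that share the highest selp, then give each of them selw turns per round. This is a simplified model, not the alias subsystem's code; the function build_schedule is hypothetical, and visiting members in name order within a round is an assumption.

```python
def build_schedule(members, available):
    """Return one round of the weighted rotation for an alias.

    members maps a member name to a (selp, selw) pair; available is the
    set of members that are currently up.
    """
    up = {name: attrs for name, attrs in members.items() if name in available}
    if not up:
        return []
    # Only members sharing the highest selection priority are eligible.
    top = max(selp for selp, _ in up.values())
    schedule = []
    for name in sorted(up):
        selp, selw = up[name]
        if selp == top:
            # A member receives selw connections before the next member.
            schedule.extend([name] * selw)
    return schedule
```

With members A (selp=10, selw=3), B (selp=10, selw=2), and C (selp=5, selw=1), one round hands three connections to A and two to B; C receives connections only when neither A nor B is available.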
Selection weight and router priority address two different load balancing issues; selection weight balances application overhead within a cluster, and router priority balances proxy ARP response overhead within a cluster.
In general, the default router priority provides acceptable performance. The selection weight is probably more useful when balancing application loads within a heterogeneous cluster consisting of both large and small systems.
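Router priority can be pictured as a per-alias election among the members that are up. The sketch below is illustrative only: the function proxy_arp_master is hypothetical, and breaking ties by member name is an assumption, since the text does not describe how equal priorities are resolved.

```python
def proxy_arp_master(rpri_by_member, available):
    """Return the member that answers ARP requests for one alias.

    rpri_by_member maps a member name to its router priority for the
    alias; available is the set of members that are currently up.
    """
    candidates = [m for m in rpri_by_member if m in available]
    if not candidates:
        return None
    # Highest router priority wins; ties fall back to the member name.
    return max(candidates, key=lambda m: (rpri_by_member[m], m))
```

Giving a different member the highest rpri for each alias on a common subnet spreads the proxy ARP role across the cluster, which is the load-balancing use described above.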
6.9 Service Port Attributes
The /etc/clua_services file is a shared file that is read by all cluster members. The file is similar in concept and syntax to the /etc/services file. The clua_services file provides a method for associating alias-related attributes with the port numbers used by services. (When application source code is available, the clua_registerservice() function serves the same purpose.) Any service with a fixed port assignment can have an entry in /etc/clua_services.
With the exception of the out_alias attribute, these attributes apply to services accessed through any cluster alias. The out_alias attribute, which applies only to connections originating from the cluster, is specific to the default cluster alias.
You can associate the following attributes with a service's port:
in_single
A service that, from the cluster alias point of view, runs on only one cluster member at a time, but can fail over to another instance of the service on another member if the active service goes away. (Active, in this context, relates only to messages addressed to the cluster alias. All instances of a service are always active for their node's local IP addresses unless the in_nolocal attribute is also set.) As each service binds to the application's port, the first is flagged as active for the alias, and the others are flagged as inactive. If the active service fails, one of the inactive service daemons is marked as active.
Any port that is not explicitly listed in the clua_services file as in_multi, or registered as in_multi through a call to the clua_registerservice() function, is treated as in_single.
in_multi
Indicates a service that can run concurrently on two or more cluster members. For a service using UDP, each packet might go to a different alias member. For a service using TCP, each connection is bound to a single alias member, but different connections to the service from the same client might be established on different alias members.
An in_multi service must be explicitly registered, either in the /etc/clua_services file or through the clua_registerservice() function.
in_noalias
Indicates that the port does not honor connection requests to alias addresses.
in_nolocal
Indicates that the port does not honor connection requests to nonalias addresses. For TCP, the port does not accept connections; for UDP, the port drops messages.
out_alias
Indicates that the default cluster alias is used as the source address whenever this port is used as a destination. Normally, outbound connections (or UDP messages) use the local IP address of the cluster member on which the client is running. It is often beneficial to use the cluster alias address as the source address for outbound traffic from the cluster (for example, to simplify authentication).
The out_alias attribute applies only when the connection (assuming TCP, not UDP) is originated from the cluster; that is, the cluster is the client. If a process running on a cluster member initiates an outbound connection, and the destination port (the port representing that half of the connection that is not in the cluster) is flagged in the cluster's /etc/clua_services file as out_alias, the connection uses the default alias as its source address.
The same logic holds true when the outbound traffic is a UDP send, because each send can be viewed as a microconnection.
static
Indicates that the port cannot be assigned as a dynamic port. This option is assigned to ports between 512 and 1024 that are used by well-known, multi-instance network services that are always started at boot time.
The in_multi, in_single, and in_noalias attributes are mutually exclusive. The in_nolocal and in_noalias attributes are mutually exclusive.
See clua_services(4) and clua_registerservice(3) for more information about the use of these attributes.
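The two mutual-exclusion rules stated above lend themselves to a quick consistency check before an attribute list is written to a configuration file. This is a hedged sketch: the function conflicting_attributes and the idea of a separate checking step are assumptions, and the code does not parse the /etc/clua_services file itself.

```python
# Groups of attributes that the text describes as mutually exclusive.
MUTUALLY_EXCLUSIVE = [
    {"in_multi", "in_single", "in_noalias"},
    {"in_nolocal", "in_noalias"},
]


def conflicting_attributes(attributes):
    """Return the sets of attributes that may not be combined."""
    given = set(attributes)
    conflicts = []
    for group in MUTUALLY_EXCLUSIVE:
        clash = sorted(given & group)
        # More than one attribute from the same group is a conflict.
        if len(clash) > 1:
            conflicts.append(clash)
    return conflicts
```

An empty result means that the combination respects both rules; for example, in_multi with out_alias passes, while in_multi with in_single does not.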
6.10 vMAC Support
When a cluster alias IP address is configured in a common subnet, one cluster member in that subnet acts as the alias's proxy ARP master, responding to local ARP requests addressed to the alias. If another member of the alias takes over the proxy ARP master role, the new master broadcasts a gratuitous ARP packet to inform other systems about the new hardware media access control (MAC) address associated with the alias's IP address. The other local systems then update their ARP tables to reflect this new cluster-alias-to-MAC association.
However, this broadcast packet is a problem for systems that do not understand gratuitous ARP packets. These systems do not become aware of changes in the cluster alias-to-MAC association, and continue to send alias traffic to the stale MAC address until the normal timeout interval for their ARP tables has elapsed. A solution is to provide a virtual hardware address (vMAC address) for each cluster alias.
A virtual MAC address is a unique hardware address that can be automatically created for each alias IP address. An alias vMAC address follows the cluster alias proxy ARP master from node to node as needed. Regardless of which cluster member is serving as the proxy ARP master for an alias, the alias's vMAC address does not change.
The Cluster Administration manual describes how to enable vMAC support for a cluster alias.
6.11 NFS and Cluster Aliases
When a cluster is configured as an NFS server, NFS client requests must be directed either to the default cluster alias or to an alias listed in /etc/exports.aliases. NFS mount requests directed at individual cluster members are rejected.
As shipped, the default cluster alias is the only alias that NFS clients can use. However, you can make additional cluster aliases available for use by NFS clients by putting the alias names in the exports.aliases file. This feature is useful when some members of a cluster are not directly connected to the storage containing exported file systems. In this case, creating an alias with only directly connected systems as alias members can reduce the number of internal hops required to service an NFS request.
The remainder of this section discusses how NFS and the cluster alias subsystem interact for TCP and UDP NFS traffic. (We recommend that, whenever possible, you use UDP as the network transport.) In the following scenarios, assume that the client's first interaction with the cluster is to mount a file system exported by the cluster, and that all members are connected both to the network and to storage.
Getting packets off the network (Section 6.11.1)
Mount requests (Section 6.11.2)
NFS over TCP (Section 6.11.3)
NFS over UDP (Section 6.11.4)
6.11.1 Getting Packets Off the Network
NFS requests using TCP or UDP are addressed to the default cluster alias or to an alias whose name is in /etc/exports.aliases.
By default, a cluster member advertises a host route to each alias that it has specified or joined. Because a client tends to cache the first host route to an alias that it sees, as long as that route is available the client will send all packets for an alias to the same cluster member.
All packets pass through this member on the way in, but not necessarily
on the way out.
The cluster member that puts the response packet on
the wire inserts the cluster alias address as the source address, so
the client is satisfied: send a packet to an alias, receive a packet
from an alias.
6.11.2 Mount Requests
The mount daemon,
mountd
, is a multi-instance
service that handles incoming UDP and TCP mount requests.
The cluster
alias subsystem decides which
mountd
instance
services a mount request.
The node that handles the mount request has no
relationship to the node that will eventually service the incoming NFS
packets, although by chance they may be the same node.
6.11.3 NFS over TCP
TCP connection requests are assigned to a member based on the alias round-robin algorithm and alias selection weights. (Because there is no file system information in the connection request, the cluster cannot route the request to the member that is currently the CFS server for the file system.) All subsequent TCP NFS packets for that file system from that client are handled by the same member that was assigned the connection, regardless of file-system relocations.
Use
Figure 6-7, and the callouts that
follow the figure, to trace the path of an NFS TCP connection:
Figure 6-7: NFS over TCP
Because clients cache routes, the NFS request most likely goes through the same member as the mount request.
Because a TCP connection has already been established by the initial request, the member that takes the packet off the wire automatically tunnels it to the NFS server member that is handling the connection. (There is no CFS server lookup on the node that takes the packet off the wire.)
Note
The following steps are based on the assumption that the NFS server member is not the CFS server for the file system.
The NFS server sends a CFS request across the interconnect to the member that is the CFS server.
The CFS server member handles the I/O to storage.
The CFS server member returns the results across the interconnect to the NFS server member.
The NFS server member replies to the client (using the alias address as the source address).
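The connection assignment described in this section can be sketched as a toy model. This is an illustration only, not the cluster alias implementation: the class name, the member names, and the reading of a selection weight as "the number of consecutive connections a member receives before the next member is chosen" are assumptions made for the example.

```python
from itertools import cycle


class AliasTcpRouter:
    """Toy model of TCP connection assignment for a cluster alias.

    Hypothetical sketch: names and the weight interpretation are assumptions.
    """

    def __init__(self, selection_weights):
        # Repeat each member once per unit of weight, then cycle through
        # the expanded list to get a weighted round-robin order.
        order = [member
                 for member, weight in selection_weights.items()
                 for _ in range(weight)]
        self._next_member = cycle(order)
        # Established connections: connection id -> member handling it.
        self._connections = {}

    def route(self, connection_id):
        """Return the member that handles packets for this connection."""
        if connection_id not in self._connections:
            # New connection request: no file system information is
            # available, so the choice ignores the CFS server location.
            self._connections[connection_id] = next(self._next_member)
        # Existing connection: always the member assigned at connect time,
        # even if the file system has since relocated.
        return self._connections[connection_id]


router = AliasTcpRouter({"member1": 2, "member2": 1})
assigned = [router.route(conn) for conn in ("c1", "c2", "c3", "c4")]
```

In this model, looking up `c1` again after later connections are assigned still returns the member chosen for `c1`, which mirrors the statement that all subsequent TCP NFS packets are handled by the member that was assigned the connection.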
6.11.4 NFS over UDP
UDP NFS packets are redirected to the cluster member that is serving the file system. All UDP NFS traffic for that file system is handled on that member. If another cluster member becomes the CFS server for the file system, UDP packets are tunneled to the new server. UDP packets always follow the CFS server for the file system.
Note
Some clients, for example PCs, broadcast UDP requests when trying to find an NFS server. The cluster responds to these requests by returning the IP address of the default cluster alias. This ensures that later NFS client requests are sent to the default cluster alias.
Use
Figure 6-8, and the callouts that
follow the figure, to trace the path of an NFS request over UDP:
Figure 6-8: NFS over UDP
Because clients cache routes, the NFS request most likely goes through the same member as the mount request.
Because an NFS UDP packet contains all the data needed for the NFS request, the cluster alias software on the member that takes the packet off the wire can determine which file system contains the file. It then performs a CFS callout to determine which cluster member is the CFS server for that file system.
Note
The remaining steps are based on the assumption that the CFS server is a member of the cluster alias to which the packet was addressed.
The packet is tunneled to the NFS server that is also the CFS server for the file system. CFS can service the request locally. (This is why UDP can provide better performance than TCP: the UDP packet makes just one trip across the interconnect. Because the NFS server is also the CFS server, no extra trips are needed to handle the CFS I/O.)
The NFS server/CFS server member handles the I/O to storage.
The NFS server/CFS server member replies directly to the client (using the alias address as the source address).
Note
However, if the CFS server for the file system is not a member of the alias, the receiving node round-robins the packet in the same manner that it handles TCP connection requests. In this case, UDP performance will be worse than TCP performance because, with TCP, all incoming packets from a client are tunneled to the node servicing the connection. As a result, all I/O to the same file from the same client is handled by the same node. However, with UDP, if the CFS server is not a member of the alias being used, each I/O request for a given file will end up being handled by a different cluster member. In this case, the CFS clients cannot cache data; they will be constantly invalidating each other's caches and writing through the CFS server node.
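The UDP routing decision described in this section, including the fallback when the CFS server is not a member of the alias, can be sketched as follows. This is a simplified model under stated assumptions, not the actual cluster alias code: the function name, the member names, and the use of a plain dictionary to stand in for the CFS callout are all hypothetical.

```python
from itertools import cycle


def make_udp_router(alias_members, cfs_server_of):
    """Build a routing function for UDP NFS packets sent to one alias.

    alias_members: list of members that have joined the alias.
    cfs_server_of: dict mapping file system -> current CFS server member
                   (a stand-in for the CFS callout; update it to model a
                   file-system relocation).
    """
    fallback = cycle(alias_members)

    def route(file_system):
        server = cfs_server_of[file_system]
        if server in alias_members:
            # Normal case: the packet follows the CFS server, so the
            # request is serviced locally on that member.
            return server
        # CFS server is not an alias member: each packet is round-robined
        # among the alias members, so successive requests for the same
        # file can land on different members.
        return next(fallback)

    return route


cfs_servers = {"/export/data": "member2"}
route = make_udp_router(["member1", "member2"], cfs_servers)
```

Changing `cfs_servers["/export/data"]` to another alias member models a relocation: later packets follow the new server. Setting it to a member outside the alias shows the degraded case, where consecutive requests alternate between alias members.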
6.12 RPC Services and Cluster Aliases
RPC services can call either the
clusvc_getcommport()
function or the
clusvc_getresvcommport()
function to bind to a
port.
(Use
clusvc_getresvcommport()
when binding to
a reserved (privileged) port, that is, a port number in the range 0-1023.)
Both functions call
clua_registerservice()
to
automatically set the
CLUASRV_MULTI
(in_multi
) attribute on the port.
Use the
clusvc_getcommport()
and
clusvc_getresvcommport()
functions in the following
circumstances:
The RPC service does not use a well-known port (services that use
well-known ports can have
in_multi
entries in
/etc/clua_services
).
Multiple instances of the RPC service will run in a cluster.
Requests for the RPC service will be directed to a cluster alias, which will provide load balancing among the instances of the service.
These two functions make it possible to run an RPC application
accessed via a cluster alias on multiple cluster members.
In
addition to ensuring that each instance of an RPC application uses the
same common port, the functions also inform the
portmap
daemon that the application is a
multi-instance, alias application.
If you do not use one of these functions to bind to the port, you can
still run multiple instances of the application, but only one instance
will receive requests directed to a cluster alias.
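The practical difference between binding with and without these functions can be illustrated with a small model. It does not call the real functions; it only contrasts a port that carries the in_multi attribute with one that does not. The function name, the instance names, and the simple round-robin distribution are assumptions made for the example.

```python
from collections import Counter
from itertools import cycle


def dispatch(num_requests, instances, in_multi):
    """Count how many alias-directed requests each instance receives.

    Hypothetical model: with in_multi set, requests are spread across all
    instances; without it, only one instance receives them.
    """
    if in_multi:
        targets = cycle(instances)
    else:
        # Without the attribute, a single instance gets every request.
        targets = cycle(instances[:1])
    return Counter(next(targets) for _ in range(num_requests))


instances = ["member1", "member2", "member3"]
with_multi = dispatch(6, instances, in_multi=True)
without_multi = dispatch(6, instances, in_multi=False)
```

In this sketch, `with_multi` shows the requests shared among the three instances, while `without_multi` shows all of them arriving at one instance even though three are running.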
6.13 ifconfig Aliases and Cluster Aliases
Before TruCluster Server Version 5.0, TruCluster products used
the
asemgr
command to control application
failover.
The
asemgr
command ran the
ifconfig
command to create IP aliases as
needed.
Because the cluster alias subsystem creates and manages
aliases on a clusterwide basis, there is no longer any need to
explicitly establish and remove IP aliases with
ifconfig
when an application fails over.
Cluster alias addresses are not designed around a one-IP-address-per-service model. A cluster alias is an address for the cluster as a whole (or for whichever subset of members joins that particular alias), and it is designed primarily for applications that can run multiple copies concurrently (multi-instance services). When single-instance services are necessary, they are best configured with CAA so that failover of the service is easier to manage.
If you are familiar with ASE services, you can continue to define
application-specific interface alias addresses in CAA scripts using
ifconfig alias
.
These aliases are independent of
cluster aliases and do not create any conflicts.
The advantage of using the default cluster alias is that you do not need to migrate an application's address when moving the application within the cluster, because all applications are using the same address (the default cluster alias) and the cluster alias code will always find where the application is running in the cluster. Furthermore, if an application can run multi-instance (concurrently on multiple nodes for enhanced scaling), all instances can be transparently accessed using the same cluster alias without the client knowing multiple nodes are involved.
One advantage of using an ASE-style per-service interface alias (defined in the scripts and migrated with the service by the script) is that traffic is always routed directly to the node running the service (with the cluster alias, traffic often takes one hop within the cluster). Whether this hop outweighs defining an address for each service and migrating it manually depends on the application's throughput needs.