The Memory Channel Application Programming Interface (API) implements highly efficient memory sharing between Memory Channel API cluster members, with automatic error handling, locking, and UNIX-style protections. This chapter contains information to help you develop applications based on the Memory Channel API library. It explains the differences between Memory Channel address space and traditional shared memory, and describes how programming using Memory Channel as a transport differs from programming using shared memory as a transport.
This chapter also contains examples that show how to use the Memory Channel API library functions in programs. You will find these code files in the /usr/examples/cluster/ directory. Each file contains compilation instructions.
The chapter discusses the following topics:
Initializing the Memory Channel API library (Section 10.1)
Understanding the Memory Channel multirail model (Section 10.2)
Tuning your Memory Channel configuration (Section 10.3)
Troubleshooting (Section 10.4)
Initializing the Memory Channel API library for a user program (Section 10.5)
Accessing Memory Channel address space (Section 10.6)
Using clusterwide locks (Section 10.7)
Using cluster signals (Section 10.8)
Accessing cluster information (Section 10.9)
Comparing shared memory and message passing models (Section 10.10)
Answering frequently asked questions from programmers who use the Memory Channel API to develop programs for TruCluster Server systems (Section 10.11)
10.1 Initializing the Memory Channel API Library
To run applications that are based on the Memory Channel API library, the library must be initialized on each host in the Memory Channel API cluster. The imc_init command initializes the Memory Channel API library and allows applications to use the API. Initialization of the Memory Channel API library occurs either by automatic execution of the imc_init command at system boot time, or when the system administrator invokes the command from the command line after the system boots.
Initialization of the Memory Channel API library at system boot time is controlled by the IMC_AUTO_INIT variable in the /etc/rc.config file. If the value of this variable is set to 1, the imc_init command is invoked at system boot time. When the Memory Channel API library is initialized at boot time, the values of the -a maxalloc and -r maxrecv flags are set to the values that are specified by the IMC_MAX_ALLOC and IMC_MAX_RECV variables in the /etc/rc.config file. The default value for both the maxalloc parameter and the maxrecv parameter is 10 MB.
If the IMC_AUTO_INIT variable is set to zero (0), the Memory Channel API library is not initialized at system boot time; the system administrator must invoke the imc_init command to initialize the library. The parameter values in the /etc/rc.config file are not used when the imc_init command is invoked manually.
The imc_init command initializes the Memory Channel API library the first time it is invoked, whether that happens at system boot time or after the system has booted. The value of the -a maxalloc flag must be the same on all hosts in the Memory Channel API cluster. If different values are specified, the maximum value that is specified for any host determines the clusterwide value that applies to all hosts.
After the Memory Channel API library has been initialized on the current host, the system administrator can invoke the imc_init command again to reconfigure the values of the maxalloc and maxrecv resource limits, without forcing a reboot. The system administrator can increase or decrease either limit, but the new limits cannot be lower than the current usage of the resources. Reconfiguring the cluster from the command line does not read or modify the values that are specified in the /etc/rc.config file. The system administrator can use the rcmgr(8) command to modify the parameters and have them take effect when the system reboots.
You must have root privileges to execute the imc_init command.
10.2 The Memory Channel Multirail Model
The Memory Channel multirail model supports the concept of physical rails and logical rails. A physical rail is defined as a Memory Channel hub with its cables and Memory Channel adapters, and the Memory Channel driver for the adapters on each node. A logical rail is made up of one or two physical rails.
A cluster can have one or more logical rails up to a maximum of four. Logical rails can be configured in the following styles:
Single-rail (Section 10.2.1)
Failover pair (Section 10.2.2)
10.2.1 Single-Rail Style
If a cluster is configured in the single-rail style, there is a one-to-one relationship between physical rails and logical rails. This configuration has no failover properties; if the physical rail fails, the logical rail fails.
A benefit of the single-rail configuration is that applications can access the aggregate address space of all logical rails and utilize their aggregate bandwidth for maximum performance.
Figure 10-1 shows a single-rail Memory Channel configuration with three logical rails, each of which is also a physical rail.
Figure 10-1: Single-Rail Memory Channel Configuration
10.2.2 Failover Pair Style
If a cluster is configured in the failover pair style, a logical rail consists of two physical rails: one active and one inactive. If the active physical rail fails, a failover to the inactive physical rail takes place, allowing the logical rail to remain active. This failover is transparent to the user.
The failover pair style can only exist in a Memory Channel configuration consisting of two physical rails.
The failover pair configuration provides availability in the event of a physical rail failure, because the second physical rail is redundant. However, only the address space and bandwidth of a single physical rail are available at any given time.
Figure 10-2 shows a multirail Memory Channel configuration in the failover pair style. The illustrated configuration has one logical rail, which is made up of two physical rails.
Figure 10-2: Failover Pair Memory Channel Configuration
10.2.3 Configuring the Memory Channel Multirail Model
When you implement the Memory Channel multirail model, all nodes in a cluster must be configured with an equal number of physical rails, which are configured into an equal number of logical rails, each with the same failover style.
The system configuration parameter rm_rail_style, in the /etc/sysconfigtab file, sets the multirail style. The rm_rail_style parameter can be set to one of the following values:
Zero (0) for a single-rail style
1 for a failover pair style
The default value of the rm_rail_style parameter is 1.
The rm_rail_style parameter must have the same value for all nodes in a cluster, or configuration errors may occur.
To change the value of the rm_rail_style parameter to zero (0) for a single-rail style, add or modify the following stanza for the rm subsystem in the /etc/sysconfigtab file:
rm:
        rm_rail_style = 0
Note
We recommend that you use sysconfigdb(8) to modify or to add stanzas in the /etc/sysconfigtab file.
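For example, a stanza can be merged in with sysconfigdb(8) rather than by editing the file directly. The following sketch assumes the -m (merge) and -f (stanza file) flags of sysconfigdb; the temporary file name is arbitrary, and you should verify the flags against the sysconfigdb(8) reference page:

```shell
# Write the stanza to a temporary file (the file name is arbitrary)
cat > /tmp/rm.stanza <<'EOF'
rm:
        rm_rail_style = 0
EOF

# Merge the stanza into /etc/sysconfigtab for the rm subsystem
sysconfigdb -m -f /tmp/rm.stanza rm
```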
If you change the rm_rail_style parameter, you must halt the entire cluster and then reboot each member system.
Note
A cluster will fail if any logical rail fails. See Section 10.4.3 for more information.
Error handling for the Memory Channel multirail model is implemented for specified logical rails. See Section 10.6.6 for a description of Memory Channel API library error-management functions and code examples.
Note
The Memory Channel multirail model does not facilitate any type of cluster reconfiguration, such as the addition of hubs or Memory Channel adapters. For such reconfiguration, you must first shut down the cluster completely.
10.3 Tuning Your Memory Channel Configuration
The imc_init command initializes the Memory Channel API library with certain resource defaults. Depending on your application, you may require more resources than the defaults allow. In some cases, you can change certain Memory Channel parameters and virtual memory resource parameters to overcome these limitations. The following sections describe these parameters and explain how to change them.
10.3.1 Extending Memory Channel Address Space
The amount of total Memory Channel address space that is available to the Memory Channel API library is specified using the maxalloc parameter of the imc_init command. The maximum amount of Memory Channel address space that can be attached for receive on a host is specified using the maxrecv parameter of the imc_init command. The default limit in each case is 10 MB. (Section 10.1 describes how to initialize the Memory Channel API library using the imc_init command.)
You can use the rcmgr(8) command to change the values that are used during an automatic initialization by setting the IMC_MAX_ALLOC and IMC_MAX_RECV variables. For example, you can set the variables to allow a total of 80 MB of Memory Channel address space to be made available to the Memory Channel API library clusterwide, and to allow 60 MB of Memory Channel address space to be attached for receive on the current host, as follows:
rcmgr set IMC_MAX_ALLOC 80
rcmgr set IMC_MAX_RECV 60
If you use the rcmgr(8) command to set new limits, they take effect when the system reboots.
You can use the Memory Channel API library initialization command, imc_init, to change both the total amount of Memory Channel address space available and the maximum amount that can be attached for receive, after the Memory Channel API library has been initialized. For example, to allow a total of 80 MB of Memory Channel address space to be made available clusterwide, and to allow 60 MB of Memory Channel address space to be attached for receive on the current host, use the following command:
imc_init -a 80 -r 60
If you use the imc_init command to set new limits, they are lost when the system reboots, and the values of the IMC_MAX_ALLOC and IMC_MAX_RECV variables are used as limits.
10.3.2 Increasing Wired Memory
Every page of Memory Channel address space that is attached for receive must be backed by a page of physical memory on your system. This memory is nonpageable; that is, it is wired memory. The amount of wired memory on a host cannot be increased without limit; the system configuration parameter vm_syswiredpercent imposes a ceiling. You can change the vm_syswiredpercent parameter in the /etc/sysconfigtab file.
For example, to set the vm_syswiredpercent parameter to 80, the vm stanza in the /etc/sysconfigtab file must contain the following entry:
vm:
        vm_syswiredpercent = 80
If you change the vm_syswiredpercent parameter, you must reboot the system.
Note
The default amount of wired memory is sufficient for most operations. We recommend that you exercise caution in changing this limit.
10.4 Troubleshooting
The following sections describe error conditions that you may encounter when using the Memory Channel API library functions, and suggest solutions.
10.4.1 IMC_NOTINIT Return Code
The IMC_NOTINIT status is returned when the imc_init command has not been run, or when the imc_init command has failed to run correctly. The imc_init command must be run on each host in the Memory Channel API cluster before you can use the Memory Channel API library functions. (Section 10.1 describes how to initialize the Memory Channel API library using the imc_init command.)
If the imc_init command does not run successfully, see Section 10.4.2 for suggested solutions.
10.4.2 Memory Channel API Library Initialization Failure
The Memory Channel API library may fail to initialize on a host. If this happens, an error message is displayed on the console and is written to the messages log file in the /usr/var/adm directory. Use the following list of error messages and solutions to eliminate the error:
Memory Channel is not initialized for user access
This error message indicates that the current host has not been initialized to use the Memory Channel API.
To solve this problem, ensure that all Memory Channel cables are correctly attached to the Memory Channel adapters on this host. See Section 10.4.3 for more information on fatal errors that are caused by problems with the physical Memory Channel configuration or interconnect.
Memory Channel API - insufficient wired memory
This error message indicates that the value of the IMC_MAX_RECV variable in the /etc/rc.config file, or the value of the -r option to the imc_init command, is greater than the wired memory limit that is specified by the configuration parameter vm_syswiredpercent.
To solve this problem, invoke the imc_init command with a smaller value for the maxrecv parameter, or increase the system wired memory limit as described in Section 10.3.2.
10.4.3 Fatal Memory Channel Errors
Sometimes the Memory Channel API fails to initialize because of problems
with the physical Memory Channel configuration or interconnect.
Error
messages that are displayed on the console in these circumstances do not
mention the Memory Channel API.
The following sections describe some of the more
common reasons for such failures.
10.4.3.1 Logical Rail Failure
If any logical rail fails, a system panic occurs on one or more hosts in the cluster, and the following error message is displayed on the console:
panic (cpu 0): rm_delete_context: fatal MC error
To solve this problem, ensure that the hub is powered up and that all cables are connected properly; then halt the entire cluster and reboot each member system.
10.4.3.2 Logical Rail Initialization Failure
If the logical rail configuration for a logical rail on this node does not match that of a logical rail on other cluster members, a system panic occurs on one or more hosts in the cluster, and error messages of the following form are displayed on the console:
rm_slave_init rail configuration does not match cluster expectations for logical rail 0
logical rail 0 has failed initialization
rm_delete_context: lcsr = 0x2a80078, mcerr = 0x20001, mcport = 0x72400001
panic (cpu 0): rm_delete_context: fatal MC error
This error can occur if the configuration parameter rm_rail_style is not identical on every node.
To solve this problem, follow these steps:
Halt the system.
Boot /genvmunix.
Modify the /etc/sysconfigtab file as described in Section 10.2.3.
Reboot the kernel with Memory Channel API cluster support (/vmunix).
The
IMC_MCFULL
status is returned
if there is not enough Memory Channel address space to perform an
operation.
The amount of total Memory Channel address space that is available to the Memory Channel API
library is specified by using the
maxalloc
parameter of the
imc_init
command, as described in
Section 10.4.2.
You can use the rcmgr(8) command or the Memory Channel API library initialization command, imc_init, to increase the amount of Memory Channel address space that is available to the library clusterwide. See Section 10.3.1 for more details.
10.4.5 IMC_RXFULL Return Code
The IMC_RXFULL status is returned by the imc_asattach function if receive mapping space is exhausted when an attempt is made to attach a region for receive.
Note
The default amount of receive space on the current host is 10 MB.
The maximum amount of Memory Channel address space that can be attached for receive on a host is specified using the maxrecv parameter of the imc_init command, as described in Section 10.1.
You can use the rcmgr(8) command or the Memory Channel API library initialization command, imc_init, to extend the maximum amount of Memory Channel address space that can be attached for receive on the host. See Section 10.3.1 for more details.
10.4.6 IMC_WIRED_LIMIT Return Code
The IMC_WIRED_LIMIT return value indicates that an attempt has been made to exceed the maximum quantity of wired memory. The system configuration parameter vm_syswiredpercent specifies the wired memory limit; see Section 10.3.2 for information on changing this limit.
10.4.7 IMC_MAPENTRIES Return Code
The IMC_MAPENTRIES return value indicates that the maximum number of virtual memory map entries has been exceeded for the current process.
10.4.8 IMC_NOMEM Return Code
The IMC_NOMEM return status indicates a malloc function failure while performing a Memory Channel API function call. This happens if process virtual memory has been exceeded, and can be remedied by using the usual techniques for extending process virtual memory limits; that is, by using the limit and unlimit commands for the C shell, and the ulimit command for the Bourne shell and the Korn shell.
10.4.9 IMC_NORESOURCES Return Code
The IMC_NORESOURCES return value indicates that there are insufficient Memory Channel data structures available to perform the required operation. The number of available Memory Channel data structures is fixed and cannot be increased by changing a parameter. To solve this problem, amend the application to use fewer regions or locks.
10.5 Initializing the Memory Channel API Library for a User Program
The imc_api_init function initializes the Memory Channel API library in a user program. Call the imc_api_init function in a process before any other Memory Channel API functions are called. If a process forks, the child process must call the imc_api_init function before calling any other API functions; otherwise, the behavior is undefined.
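The forking rule can be sketched as follows. This fragment is illustrative rather than taken from the manual; it assumes only the imc_api_init(NULL) call shown in Example 10-1 later in this chapter, and omits the real Memory Channel work:

```c
#include <sys/types.h>
#include <sys/imc.h>    /* Memory Channel API declarations */
#include <unistd.h>
#include <stdlib.h>

int
main(void)
{
    pid_t child;

    /* The parent initializes the library before any other API call. */
    if (imc_api_init(NULL) < 0)
        exit(1);

    child = fork();
    if (child == 0) {
        /* The child must reinitialize the library before calling any
         * other Memory Channel API function; otherwise the behavior
         * is undefined. */
        if (imc_api_init(NULL) < 0)
            exit(1);
        /* ... child's Memory Channel work goes here ... */
        exit(0);
    }
    /* ... parent's Memory Channel work goes here ... */
    return 0;
}
```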
10.6 Accessing Memory Channel Address Space
The Memory Channel interconnect provides a form of memory sharing between Memory Channel API cluster members. The Memory Channel API library is used to set up the memory sharing, allowing processes on different members of the cluster to exchange data using direct read and write operations to addresses in their virtual address space. When the memory sharing has been set up by the Memory Channel API library, these direct read and write operations take place at hardware speeds without involving the operating system or the Memory Channel API library software functions.
When a system is configured with Memory Channel, part of the physical address space of the system is assigned to the Memory Channel address space. The size of the Memory Channel address space is specified by the imc_init command.
A process accesses this Memory Channel address space by using the Memory Channel API to map a region of Memory Channel address space to its own virtual address space.
Applications that want to access the Memory Channel address space on different cluster members can allocate part of the address space for a particular purpose by calling the imc_asalloc function. The key parameter associates a clusterwide key with the region. Other processes that allocate the same region also specify this key; this allows processes to coordinate access to the region.
To use an allocated region of Memory Channel address space, a process maps the region into its own virtual address space, using the imc_asattach function or the imc_asattach_ptp function. When a process attaches to a Memory Channel region, an area of virtual address space that is the same size as the Memory Channel region is added to the process virtual address space. When attaching the region, the process indicates whether the region is mapped to receive or transmit data, as follows:
Transmit -- Indicates that the region is to be used to transmit data on Memory Channel. When a process writes to addresses in this virtual address region, the data is transmitted over the Memory Channel interconnect to the other members of the Memory Channel API cluster.
To map a region for transmit, specify the value IMC_TRANSMIT for the dir parameter to the imc_asattach function.
Receive -- Indicates that the region is to be used to receive data from Memory Channel. In this case, the address space that is mapped into the process virtual address space is backed by a region of physical memory on the system. When data is transmitted on Memory Channel, it is written into the physical memory of any hosts that have mapped the region for receive, so that processes on that system read from the same area of physical memory. The process does not receive any data that is transmitted before the region is mapped.
To map a region for receive, specify the value IMC_RECEIVE as the dir parameter for the imc_asattach function.
A process can attach to a Memory Channel region in broadcast mode, point-to-point mode, or loopback mode. These methods of attach are described in Section 10.6.1.
Memory sharing using the Memory Channel interconnect is similar to conventional shared memory in that, after it is established, simple accesses to virtual address space allow two different processes to share data. However, there are two differences between these memory-sharing mechanisms that you must allow for, as follows:
When conventional shared memory is created, it is assigned a virtual address. In C programming terms, there is a pointer to the memory. This single pointer can be used both to read and write data to the shared memory. However, a Memory Channel region can have two different virtual addresses assigned to it: a transmit virtual address and a receive virtual address. In C programming terms, there are two different pointers to manage; one pointer can only be used for write operations, the other pointer is used for read operations.
In conventional shared memory, write operations are made directly to memory and are immediately visible to other processes that are reading from the same memory. However, when a write operation is made to a Memory Channel region, the write operation is not made directly to memory but to the I/O system and the Memory Channel hardware. This means that there is a delay before the data appears in memory on the receiving system. This is described in more detail in Section 10.6.5.
10.6.1 Attaching to Memory Channel Address Space
A process can attach to Memory Channel address space in the following three ways, each described in its own section:
Broadcast attach (Section 10.6.1.1)
Point-to-point attach (Section 10.6.1.2)
Loopback attach (Section 10.6.1.3)
This section also explains initial coherency, reading and writing Memory Channel regions, latency-related coherency, and error management, and includes some code examples.
10.6.1.1 Broadcast Attach
When one process maps a region for transmit and other processes map the same region for receive, the data that the transmitting process writes to the region is transmitted on Memory Channel to the receive memory of the other processes. Figure 10-3 shows how the address spaces are mapped in a three-host Memory Channel implementation.
Figure 10-3: Broadcast Address Space Mapping
With the address spaces mapped as shown in Figure 10-3, note the following:
Process A allocates a region of Memory Channel address space. Process A then maps the allocated region to its virtual address space when it attaches the region for transmit using the imc_asattach function.
Process B and Process C both allocate the same region of Memory Channel address space as Process A. However, unlike Process A, Process B and Process C both attach the region to receive data.
When data is written to the virtual address space of Process A, the data is transmitted on Memory Channel.
When the data from Process A appears on Memory Channel, it is written to the physical memory on Hosts B and C that backs the virtual address spaces of Processes B and C that were allocated to receive the data.
10.6.1.2 Point-to-Point Attach
An allocated region of Memory Channel address space can be attached for transmit in point-to-point mode to the virtual address space of a process on another node. This is done by calling the imc_asattach_ptp function with a specified host as a parameter. Writes to the region are then sent only to the host that is specified in the parameter, and not to all hosts in the cluster.
Regions that are attached using the imc_asattach_ptp function are always attached in transmit mode, and are write-only.
Figure 10-4 shows point-to-point address space mapping in a two-host Memory Channel implementation.
Figure 10-4: Point-to-Point Address Space Mapping
With the address spaces mapped as shown in Figure 10-4, note the following:
Process 1 allocates a region of Memory Channel address space. It then maps the allocated region to its virtual address space when it attaches the region point-to-point to Host B using the imc_asattach_ptp function.
Process 2 allocates the region and then attaches it for receive using the imc_asattach function.
When data is written to the virtual address space of Process 1, the data is transmitted on Memory Channel.
When the data from Process 1 appears on Memory Channel, it is written to the physical memory that backs the virtual address space of Process 2 on Host B.
10.6.1.3 Loopback Attach
A region can be attached for both transmit and receive by processes on a host. Data that is written by the host is written to other hosts that have attached the region for receive. However, by default, data that is written by the host is not also written to the receive memory on that host; it is written only to other hosts.
If you want a host to see data that it writes, you must specify the IMC_LOOPBACK flag to the imc_asattach function when attaching the region for transmit. The loopback attribute of a region is set up on a per-host basis, and is determined by the value of the flag parameter to the first transmit attach on that host.
If you specify the value IMC_LOOPBACK for the flag parameter, two Memory Channel transactions occur for every write: one to write the data and one to loop the data back.
Because of the nature of point-to-point attach mode, looped-back writes are not permitted.
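A transmit attach with loopback might look as follows. This is an illustrative sketch, not text from the manual: the key (300), region size, and logical rail are arbitrary, and the placement of IMC_LOOPBACK in the flag parameter follows the argument order of the imc_asattach call in Example 10-1 later in this chapter. Status checking is omitted for brevity:

```c
#include <sys/types.h>
#include <sys/imc.h>

int
main(void)
{
    imc_asid_t id;
    caddr_t tx_ptr = 0, rx_ptr = 0;

    if (imc_api_init(NULL) < 0)
        return 1;

    /* Arbitrary key (300), 8 KB region, user read/write, rail 0. */
    imc_asalloc(300, 8192, IMC_URW, 0, &id, 0);

    /* Attach for transmit; IMC_LOOPBACK in the flag parameter makes
     * writes from this host visible in its own receive memory too. */
    imc_asattach(id, IMC_TRANSMIT, IMC_SHARED, IMC_LOOPBACK, &tx_ptr);

    /* Attach the same region for receive on this host. */
    imc_asattach(id, IMC_RECEIVE, IMC_SHARED, 0, &rx_ptr);

    /* ... writes through tx_ptr now also appear through rx_ptr ... */

    imc_asdetach(id);
    imc_asdealloc(id);
    return 0;
}
```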
Figure 10-5 shows a configuration in which a region of Memory Channel address space is attached both for transmit with loopback and for receive.
Figure 10-5: Loopback Address Space Mapping
10.6.2 Initial Coherency
When a Memory Channel region is attached for receive, the initial contents are undefined. This situation can arise because a process that has mapped the same Memory Channel region for transmit might update the contents of the region before other processes map the region for receive. This is referred to as the initial coherency problem. You can overcome this in two ways:
Write the application in a way that ensures that all processes attach the region for receive before any processes write to the region.
At allocation time, specify that the region is coherent by specifying the IMC_COHERENT flag when you allocate the region using the imc_asalloc function. This ensures that all processes will see every update to the region, regardless of when the processes attach the region.
Coherent regions use the loopback feature. This means that two Memory Channel transactions occur for every write, one to write the data and one to loop the data back; because of this, coherent regions have less available bandwidth than noncoherent regions.
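Allocating a coherent region can be sketched in the same style. Again this is illustrative, not text from the manual: the key (400), size, and rail are arbitrary, and IMC_COHERENT is passed in the flag position of the imc_asalloc argument order shown in Example 10-1 later in this chapter:

```c
#include <sys/types.h>
#include <sys/imc.h>

int
main(void)
{
    imc_asid_t id;
    caddr_t rx_ptr = 0;

    if (imc_api_init(NULL) < 0)
        return 1;

    /* IMC_COHERENT in the flag parameter: every attacher sees every
     * update, regardless of when it attaches, at the cost of the
     * extra loopback transaction on each write. */
    imc_asalloc(400, 8192, IMC_URW, IMC_COHERENT, &id, 0);

    imc_asattach(id, IMC_RECEIVE, IMC_SHARED, 0, &rx_ptr);

    /* ... use the region ... */

    imc_asdetach(id);
    imc_asdealloc(id);
    return 0;
}
```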
10.6.3 Reading and Writing Memory Channel Regions
Processes that attach a region of Memory Channel address space can only write to a transmit pointer, and can only read from a receive pointer. Any attempt to read a transmit pointer will result in a segmentation violation.
Apart from explicit read operations on Memory Channel transmit pointers, segmentation violations will also result from operations that cause the compiler to generate read-modify-write cycles; for example:
Postincrement and postdecrement operations.
Preincrement and predecrement operations.
Assignment to simple data types that are not an integral multiple of four bytes.
Use of the bcopy(3) library function, where the length parameter is not an integral multiple of eight bytes, or where the source or destination arguments are not 8-byte aligned.
Assignment to structures that are not quadword-aligned (that is, the value returned by the sizeof operator is not an integral multiple of eight). This refers only to unit assignment of the whole structure; for example, mystruct1 = mystruct2.
Example 10-1 shows how to initialize, allocate, and attach to a region of Memory Channel address space, and also shows two of the differences between Memory Channel address space and traditional shared memory:
Initial coherency, as described in Section 10.6.2
Asymmetry of receive and transmit regions, as described in Section 10.6.3
The sample program shown in Example 10-1 executes in master or slave mode, as specified by a command-line parameter. In master mode, the program writes its own process identifier (PID) to a data structure in the global Memory Channel address space. In slave mode, the program polls a data structure in the Memory Channel address space to determine the PID of the master process.
Note
Make sure that your programs are flexible in their use of keys to prevent problems resulting from key clashes. We recommend that you use meaningful, application-specific keys.
Example 10-1: Accessing Regions of Memory Channel Address Space
/* /usr/examples/cluster/mc_ex1.c */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <c_asm.h>
#include <sys/types.h>
#include <sys/imc.h>

#define VALID 756

int
main (int argc, char *argv[])
{
    imc_asid_t glob_id;
    typedef struct {
        pid_t pid;
        volatile int valid;                                       [1]
    } clust_pid;
    clust_pid *global_record;
    caddr_t add_rx_ptr = 0, add_tx_ptr = 0;
    int status;
    int master;
    int logical_rail = 0;

    /* check for correct number of arguments */
    if (argc != 2) {
        printf("usage: mcpid 0|1\n");
        exit(-1);
    }

    /* test if process is master or slave */
    master = atoi(argv[1]);                                       [2]

    /* initialize Memory Channel API library */
    status = imc_api_init(NULL);                                  [3]
    if (status < 0) {
        imc_perror("imc_api_init::", status);                     [4]
        exit(-2);
    }

    imc_asalloc(123, 8192, IMC_URW, 0, &glob_id, logical_rail);   [5]

    if (master) {
        imc_asattach(glob_id, IMC_TRANSMIT, IMC_SHARED, 0,
                     &add_tx_ptr);                                [6]
        global_record = (clust_pid *)add_tx_ptr;                  [7]
        global_record->pid = getpid();
        mb();                                                     [8]
        global_record->valid = VALID;
        mb();
    } else {            /* secondary process */
        imc_asattach(glob_id, IMC_RECEIVE, IMC_SHARED, 0,
                     &add_rx_ptr);                                [9]
        global_record = (clust_pid *)add_rx_ptr;
        while (global_record->valid != VALID)
            ;           /* continue polling */                    [10]
        printf("pid of master process is %d\n", global_record->pid);
    }

    imc_asdetach(glob_id);
    imc_asdealloc(glob_id);                                       [11]
}
[1] The valid flag is declared as volatile to prevent the compiler from performing any optimizations that might prevent the code from reading the updated PID value from memory. [Return to example]
[2] The first argument on the command line indicates whether the process is a master (argument equal to 1) or a slave process (argument not equal to 1). [Return to example]
[3] The imc_api_init function initializes the Memory Channel API library. Call it before calling any of the other Memory Channel API library functions. [Return to example]
[4] All Memory Channel API library functions return a zero (0) status if successful. The imc_perror function decodes error status values. For brevity, this example ignores the status from all functions other than the imc_api_init function. [Return to example]
[5] The imc_asalloc function allocates a region of Memory Channel address space with the following characteristics:
key=123 -- This value identifies the region of Memory Channel address space. Other applications that attach this region will use the same key value.
size=8192 -- The size of the region is 8192 bytes.
perm=IMC_URW -- The access permission on the region is user read and write.
id=glob_id -- The imc_asalloc function returns this value, which uniquely identifies the allocated region. The program uses this value in subsequent calls to other Memory Channel functions.
logical_rail=0 -- The region is allocated using Memory Channel logical rail zero (0).
[6] The master process attaches the region for transmit by calling the imc_asattach function and specifying the glob_id identifier, which was returned by the call to the imc_asalloc function. The imc_asattach function returns add_tx_ptr, a pointer to the address of the region in the process virtual address space. The IMC_SHARED value signifies that the region is shareable, so other processes on this host can also attach the region. [Return to example]
The program points the global record structure at the region of virtual memory
in the process virtual address space that is backed by the Memory Channel region,
and writes the process ID in the
pid
field of the
global record.
Note that the master process has attached the region for
transmit; therefore, it can only write data in the field.
An attempt to
read the field will result in a segmentation violation; for example:
pid_t x = global_record->pid;   /* faults: region is attached write-only */
The program uses memory barrier instructions to ensure that the
pid
field is forced out of the Alpha CPU write
buffer before the
VALID
flag is set.
[Return to example]
The slave process attaches the region for receive by calling the
imc_asattach
function and specifying the
glob_id
identifier, which was returned by the call to
the
imc_asalloc
function.
The
imc_asattach
function returns
add_rx_ptr
, a pointer to the address of the region in
the process virtual address space.
On mapping, the contents of the
region may not be consistent on all processes that map the region.
Therefore,
start the slave process before the master to ensure that all writes by the
master process appear in the virtual address space of the slave process.
[Return to example]
The slave process overlays the region with the global record structure
and polls the valid flag.
The earlier declaration of the flag as
volatile ensures that the flag is immune to compiler
optimizations, which might result in the field being stored in a
register.
This ensures that the loop will load a new value from memory at each
iteration and will eventually detect the transition to
VALID
.
[Return to example]
At termination, the master
and slave processes explicitly detach and deallocate the region by
calling the
imc_asdetach
function and the
imc_asdealloc
function.
In the case of abnormal
termination, the allocated regions are automatically freed when the
processes exit.
[Return to example]
As described in Section 10.6.2, the initial coherency problem can be overcome by retransmitting the data after all mappings of the same region for receive have been completed, or by specifying at allocation time that the region is coherent. However, when a process writes to a transmit pointer, several microseconds can elapse before the update is reflected in the physical memory that corresponds to the receive pointer. If the process reads the receive pointer during that interval, the data it reads might be incorrect. This is known as the latency-related coherency problem.
Latency problems do not arise in conventional shared memory systems. Memory and cache control ensure that store and load instructions are synchronized with data transfers.
Example 10-2
shows two versions of a program that
decrements a global process count and detects the count reaching zero
(0).
The first program uses System V shared memory and interprocess
communication.
The second uses the Memory Channel API library.
Example 10-2: System V IPC and Memory Channel Code Comparison
/* /usr/examples/cluster/mc_ex2.c */

/****************************************
 ********* System V IPC example  ********
 ****************************************/

#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

main()
{
    typedef struct {
        int proc_count;
        int remainder[2047];
    } global_page;
    global_page *mypage;
    int shmid;

    shmid = shmget(123, 8192, IPC_CREAT | SHM_R | SHM_W);
    (caddr_t)mypage = shmat(shmid, 0, 0);   /* attach the global region */

    mypage->proc_count++;                   /* increment process count */

    /* body of program goes here */
    .
    .
    .

    /* clean up */
    mypage->proc_count--;                   /* decrement process count */
    if (mypage->proc_count == 0)
        printf("The last process is exiting\n");
    .
    .
    .
}

/****************************************
 ******* Memory Channel example  ********
 ****************************************/

#include <stdio.h>
#include <sys/types.h>
#include <sys/imc.h>

main()
{
    typedef struct {
        int proc_count;
        int remainder[2047];
    } global_page;
    global_page *mypage_rx, *mypage_tx;   [1]
    imc_asid_t glob_id;
    int logical_rail = 0;
    int temp;

    imc_api_init(NULL);
    imc_asalloc(123, 8192, IMC_URW | IMC_GRW, 0, &glob_id, logical_rail);   [2]
    imc_asattach(glob_id, IMC_TRANSMIT, IMC_SHARED, IMC_LOOPBACK,
                 &(caddr_t)mypage_tx);   [3]
    imc_asattach(glob_id, IMC_RECEIVE, IMC_SHARED, 0,
                 &(caddr_t)mypage_rx);   [4]

    /* increment process count */
    mypage_tx->proc_count = mypage_rx->proc_count + 1;   [5]

    /* body of program goes here */
.
.
.
    /* clean up */
    /* decrement process count */
    temp = mypage_rx->proc_count - 1;   [6]
    mypage_tx->proc_count = temp;

    /* wait for Memory Channel update to occur */
    while (mypage_rx->proc_count != temp)
        ;

    if (mypage_rx->proc_count == 0)
        printf("The last process is exiting\n");
.
.
.
}
The process must be able to read the data that it writes to the Memory Channel global address space. Therefore, it declares two addresses, one for transmit and one for receive. [Return to example]
The
imc_asalloc
function allocates a region of Memory Channel
address space.
The characteristics of the region are as follows:
key=123
-- This value identifies the region of
Memory Channel address space.
Other applications that attach this region will
use the same key value.
size=8192
-- The size of the region is 8192 bytes.
perm=IMC_URW | IMC_GRW
-- The region is allocated
with user and group read and write permission.
id=glob_id
-- The
imc_asalloc
function returns this value, which uniquely identifies the allocated
region.
The program uses this value in subsequent calls to other
Memory Channel API library functions.
logical_rail=0
-- The region is allocated using
Memory Channel logical rail zero (0).
This call to the
imc_asattach
function attaches the
region for transmit at the address that is pointed to by the
mypage_tx
variable.
The value of the
flag
parameter is set to
IMC_LOOPBACK
,
so that any time the process writes data to the region, the data is
looped back to the receive memory.
[Return to example]
This call to the
imc_asattach
function
attaches the region for receive at the address that is pointed to by the
mypage_rx
variable.
[Return to example]
The program increments the global process count by adding 1 to the value read through the receive pointer and assigning the result through the transmit pointer. When the program writes through the transmit pointer, it does not wait to ensure that the write instruction completes. [Return to example]
After the body of the program completes, the program decrements
the process count and tests that the decremented value was
transmitted to the other hosts in the cluster.
To ensure that it
examines the decremented count (rather than some transient value), the
program stores the decremented count in a local variable,
temp
.
It writes the decremented count to the
transmit region, and then waits for the value in the receive
region to match the value in
temp
.
When the
match occurs, the program knows that the decremented process count
has been written to the Memory Channel address space.
[Return to example]
In this example, the use of the local variable ensures that the program
compares the value in the receive memory with the value that was
transmitted.
An attempt to use the value in the receive memory before ensuring
that the value had been updated may result in erroneous data being read.
10.6.6 Error Management
In a shared memory system, the process of reading and writing to memory is assumed to be error-free. In a Memory Channel system, the error rate is of the order of three errors per year. This is much lower than the error rates of standard networks and I/O subsystems.
The Memory Channel hardware reports detected errors to the Memory Channel software. The Memory Channel hardware provides two guarantees that make it possible to develop applications that can cope with errors:
It does not write corrupt data to host systems.
It delivers data to the host systems in the sequence in which the data is written to the Memory Channel hardware.
These guarantees simplify the process of developing reliable and efficient messaging systems.
The Memory Channel API library provides the following functions to help applications implement error management:
imc_ckerrcnt_mr
-- The
imc_ckerrcnt_mr
function looks for the existence of errors
on a specified logical rail on Memory Channel hosts.
This allows
transmitting processes to learn whether or not errors occur when they send
messages.
imc_rderrcnt_mr
-- The
imc_rderrcnt_mr
function reads the clusterwide
error count for the specified logical rail and returns the value to the
calling program.
This allows receiving processes to learn the error
status of messages that they receive.
The operating system maintains a count of the number of errors that occur on the cluster. The system increments the value whenever it detects a Memory Channel hardware error in the cluster, and when a host joins or leaves the cluster.
The task of detecting and processing an error takes a small, but finite,
amount of time.
This means that the count that is returned by the
imc_rderrcnt_mr
function might not be up-to-date
with respect to an error that has just occurred on another host in the
cluster.
On the local host, the count is always up-to-date.
Use the
imc_rderrcnt_mr
function to implement a
simple and effective error-detection mechanism by reading the error
count before transmitting a message, and including the count in the
message.
The receiving process compares the error count in the message body
with the local value that is determined after the message arrives.
The local
value is guaranteed to be up-to-date, so if this value is the same as the
transmitted value, then it is certain that no intervening errors occurred.
Example 10-3
shows this technique.
Example 10-3: Error Detection Using the imc_rderrcnt_mr Function
/* /usr/examples/cluster/mc_ex3.c */

/*****************************************
 ********* Transmitting Process **********
 *****************************************/

#include <sys/imc.h>
#include <c_asm.h>

main()
{
    typedef struct {
        volatile int msg_arrived;
        int send_count;
        int remainder[2046];
    } global_page;
    global_page *mypage_rx, *mypage_tx;
    imc_asid_t glob_id;
    int i;
    volatile int err_count;

    imc_api_init(NULL);
    imc_asalloc(1234, 8192, IMC_URW, 0, &glob_id, 0);
    imc_asattach(glob_id, IMC_TRANSMIT, IMC_SHARED, IMC_LOOPBACK,
                 &(caddr_t)mypage_tx);
    imc_asattach(glob_id, IMC_RECEIVE, IMC_SHARED, 0,
                 &(caddr_t)mypage_rx);

    /* save the error count */
    while ((err_count = imc_rderrcnt_mr(0)) < 0)
        ;
    mypage_tx->send_count = err_count;

    /* store message data */
    for (i = 0; i < 2046; i++)
        mypage_tx->remainder[i] = i;

    /* now mark as valid */
    mb();
    do {
        mypage_tx->msg_arrived = 1;
    } while (mypage_rx->msg_arrived != 1);   /* ensure no error on valid flag */
}

/*****************************************
 *********** Receiving Process ***********
 *****************************************/

#include <sys/imc.h>

main()
{
    typedef struct {
        volatile int msg_arrived;
        int send_count;
        int remainder[2046];
    } global_page;
    global_page *mypage_rx, *mypage_tx;
    imc_asid_t glob_id;
    int i;
    volatile int err_count;

    imc_api_init(NULL);
    imc_asalloc(1234, 8192, IMC_URW, 0, &glob_id, 0);
    imc_asattach(glob_id, IMC_RECEIVE, IMC_SHARED, 0,
                 &(caddr_t)mypage_rx);

    /* wait for message arrival */
    while (mypage_rx->msg_arrived == 0)
        ;

    /* get this system's error count */
    while ((err_count = imc_rderrcnt_mr(0)) < 0)
        ;

    if (err_count == mypage_rx->send_count) {
        /* no error, process the body */
        .....
    } else {
        /* do error processing */
        ......
    }
}
As shown in
Example 10-3, the
imc_rderrcnt_mr
function can be safely used to
detect errors at the receiving end of a message.
However, it cannot be
guaranteed to detect errors at the transmitting end.
This is because there is
a small, but finite, possibility that the transmitting process will read
the error count before the transmitting host has been notified of an error
occurring on the receiving host.
In
Example 10-3, the
program must rely on a higher-level protocol informing the transmitting host
of the error.
The
imc_ckerrcnt_mr
function provides guaranteed
error detection for a specified logical rail.
This function takes a
user-supplied local error count and a logical rail number as parameters, and
returns an error in the following circumstances:
An outstanding error is detected on the specified logical rail
Error processing is in progress
The error count is higher than the supplied parameter
If the function returns successfully, no errors have been detected
between when the local error count was stored and the
imc_ckerrcnt_mr
function was called.
The
imc_ckerrcnt_mr
function reads the Memory Channel adapter
hardware error status for the specified logical rail; this is a hardware
operation that takes several microseconds.
Therefore, the
imc_ckerrcnt_mr
function takes longer to execute
than the
imc_rderrcnt_mr
function, which reads only
a memory location.
Example 10-4
shows an amended version of the send
sequence shown in
Example 10-3.
In
Example 10-4, the transmitting process performs error
detection.
Example 10-4: Error Detection Using the imc_ckerrcnt_mr Function
/* /usr/examples/cluster/mc_ex4.c */

/**********************************************/
/* Transmitting Process With Error Detection  */
/**********************************************/

#include <c_asm.h>
#define mb() asm("mb")
#include <sys/imc.h>

main()
{
    typedef struct {
        volatile int msg_arrived;
        int send_count;
        int remainder[2046];
    } global_page;
    global_page *mypage_rx, *mypage_tx;
    imc_asid_t glob_id;
    int i, status;
    volatile int err_count;

    imc_api_init(NULL);
    imc_asalloc(1234, 8192, IMC_URW, 0, &glob_id, 0);
    imc_asattach(glob_id, IMC_TRANSMIT, IMC_SHARED, IMC_LOOPBACK,
                 &(caddr_t)mypage_tx);
    imc_asattach(glob_id, IMC_RECEIVE, IMC_SHARED, 0,
                 &(caddr_t)mypage_rx);

    /* save the error count */
    while ((err_count = imc_rderrcnt_mr(0)) < 0)
        ;

    do {
        mypage_tx->send_count = err_count;

        /* store message data */
        for (i = 0; i < 2046; i++)
            mypage_tx->remainder[i] = i;

        /* now mark as valid */
        mb();
        mypage_tx->msg_arrived = 1;

        /* if error occurs, retransmit */
    } while ((status = imc_ckerrcnt_mr(&err_count, 0)) != IMC_SUCCESS);
}
10.7 Clusterwide Locks
In a Memory Channel system, the processes communicate by reading and writing regions of the Memory Channel address space. The preceding sections contain sample programs that show arbitrary reading and writing of regions. In practice, however, a locking mechanism is sometimes needed to provide controlled access to regions and to other clusterwide resources. The Memory Channel API library provides a set of lock functions that enable applications to implement access control on resources.
The Memory Channel API library implements locks by using mapped pages
of the global Memory Channel address space.
For efficiency reasons, locks are
allocated in sets rather than individually.
The
imc_lkalloc
function allows you to allocate a lock
set.
For example, if you want to use 20 locks, it is more efficient to
create one set with 20 locks than five sets with four locks each, and so on.
To facilitate the initial coordination of distributed applications, the
imc_lkalloc
function allows a process to atomically
(that is, in a single operation) allocate the lock set and acquire the
first lock in the set.
This feature allows the process to determine
whether or not it is the first process to allocate the lock set.
If it is, the
process is guaranteed access to the lock and can safely initialize the
resource.
Instead of allocating the lock set and acquiring the first lock
atomically, a process can call the
imc_lkalloc
function and then the
imc_lkacquire
function.
In that
case, however, there is a risk that another process might acquire the lock
between the two function calls, and the first process will not be guaranteed
access to the lock.
Example 10-5
shows a program in which the first process to
lock a region of Memory Channel address space initializes the region, and the
processes that subsequently access the region simply update the process
count.
Example 10-5: Locking Memory Channel Regions
/* /usr/examples/cluster/mc_ex5.c */

#include <sys/types.h>
#include <sys/imc.h>

main ( )
{
    imc_asid_t glob_id;
    imc_lkid_t lock_id;
    int locks = 4;
    int status;
    typedef struct {
        int proc_count;
        int pattern[2047];
    } clust_rec;
    clust_rec *global_record_tx, *global_record_rx;   [1]
    caddr_t add_rx_ptr = 0, add_tx_ptr = 0;
    int j;

    status = imc_api_init(NULL);
    imc_asalloc(123, 8192, IMC_URW, 0, &glob_id, 0);
    imc_asattach(glob_id, IMC_TRANSMIT, IMC_SHARED, IMC_LOOPBACK, &add_tx_ptr);
    imc_asattach(glob_id, IMC_RECEIVE, IMC_SHARED, 0, &add_rx_ptr);

    global_record_tx = (clust_rec *)add_tx_ptr;   [2]
    global_record_rx = (clust_rec *)add_rx_ptr;

    status = imc_lkalloc(456, &locks, IMC_LKU, IMC_CREATOR, &lock_id);   [3]
    if (status == IMC_SUCCESS) {
        /* This is the first process.  Initialize the global region */
        global_record_tx->proc_count = 0;   [4]
        for (j = 0; j < 2047; j++)
            global_record_tx->pattern[j] = j;

        /* release the lock */
        imc_lkrelease(lock_id, 0);   [5]
    }
    /* This is a secondary process */
    else if (status == IMC_EXISTS) {
        imc_lkalloc(456, &locks, IMC_LKU, 0, &lock_id);   [6]
        imc_lkacquire(lock_id, 0, 0, IMC_LOCKWAIT);   [7]
        /* wait for access to region */
        global_record_tx->proc_count = global_record_rx->proc_count + 1;   [8]

        /* release the lock */
        imc_lkrelease(lock_id, 0);
    }

    /* body of program goes here */
.
.
.
    /* clean up */
    imc_lkdealloc(lock_id);   [9]
    imc_asdetach(glob_id);
    imc_asdealloc(glob_id);
}
The process, in order to read the data that it writes to the Memory Channel global address space, maps the region for transmit and for receive. See Example 10-2 for a detailed description of this procedure. [Return to example]
The program overlays the transmit and receive pointers with the global record structure. [Return to example]
The process tries to create a lock set that contains four locks and a
key
value of
456
.
The call to the
imc_lkalloc
function also specifies the
IMC_CREATOR
flag.
Therefore, if the lock set is not
already allocated, the function will automatically acquire lock zero
(0).
If the lock set already exists, the
imc_lkalloc
function fails to allocate the lock set and returns the value
IMC_EXISTS
.
[Return to example]
The process that creates the lock set (and consequently holds lock zero (0)) initializes the global region. [Return to example]
When the process finishes initializing the region, it calls the
imc_lkrelease
function to release the lock.
[Return to example]
Secondary processes that execute after the region
has been initialized, having failed in the first call to the
imc_lkalloc
function, now call the function again
without the
IMC_CREATOR
flag.
Because the value of
the
key
parameter is the same
(456
), this call allocates the same lock set.
[Return to example]
The secondary process calls the
imc_lkacquire
function to acquire lock zero (0) from the lock set.
[Return to example]
The secondary process updates the process count and writes it to the transmit region. [Return to example]
At the end of the program, the processes release all Memory Channel resources. [Return to example]
When a process acquires a lock, other processes executing on the cluster cannot acquire that lock.
Waiting for locks to become free entails busy spinning and has a
significant effect on performance.
Therefore, in the interest of overall system
performance, make your applications acquire locks only as they are
needed and release them promptly.
10.8 Cluster Signals
The Memory Channel API library provides the
imc_kill
function to allow processes to send signals to specified processes
executing on a remote host in a cluster.
This function is similar to the
UNIX
kill
(2)
function.
When the
kill
function is used in a cluster, the signal is sent
to all processes whose process group number is equal to the absolute
value of the PID, even if that process is on another cluster member.
The
PID is guaranteed to be unique across the cluster.
The main differences between the
imc_kill
function
and the
kill
function are that the
imc_kill
function does not allow the sending
of signals to process groups, nor does it support the sending of
signals to multiple processes.
10.9 Cluster Information
The following sections discuss how to use the Memory Channel API functions to
access cluster information, and how to access status information from
the command line.
10.9.1 Using Memory Channel API Functions to Access Memory Channel API Cluster Information
The Memory Channel API library provides the
imc_getclusterinfo
function, which allows
processes to get information about the hosts in a Memory Channel API
cluster.
The function returns one or more of the following:
A count of the number of hosts in the cluster, and the name of each host.
The number of logical rails in the cluster.
The active Memory Channel logical rails bitmask, with a bit set for each active logical rail.
The function does not return information about a host unless the Memory Channel API library is initialized on the host.
The Memory Channel API library provides the
imc_wait_cluster_event
function to block a calling
thread until a specified cluster event occurs.
The following Memory Channel
API cluster events are valid:
A host joins or leaves the cluster.
The logical rail configuration of the cluster changes.
The
imc_wait_cluster_event
function examines the
current representation of the Memory Channel API cluster configuration item
that is being monitored, and returns the new Memory Channel API cluster
configuration.
Example 10-6
shows how you can use the
imc_getclusterinfo
function with the
imc_wait_cluster_event
function to request the
names of the members of the Memory Channel API cluster and the active Memory Channel
logical rails bitmask, and then wait for an event change on either.
Example 10-6: Requesting Memory Channel API Cluster Information; Waiting for Memory Channel API Cluster Events
/* /usr/examples/cluster/mc_ex6.c */

#include <sys/imc.h>

main ( )
{
    imc_railinfo mask;
    imc_hostinfo hostinfo;
    int status;
    imc_infoType items[3];
    imc_eventType events[3];

    items[0] = IMC_GET_ACTIVERAILS;
    items[1] = IMC_GET_HOSTS;
    items[2] = 0;

    events[0] = IMC_CC_EVENT_RAIL;
    events[1] = IMC_CC_EVENT_HOST;
    events[2] = 0;

    imc_api_init(NULL);

    status = imc_getclusterinfo(items, 2, mask, sizeof(imc_railinfo),
                                &hostinfo, sizeof(imc_hostinfo));
    if (status != IMC_SUCCESS)
        imc_perror("imc_getclusterinfo:", status);

    status = imc_wait_cluster_event(events, 2, 0, mask, sizeof(imc_railinfo),
                                    &hostinfo, sizeof(imc_hostinfo));
    if ((status != IMC_HOST_CHANGE) && (status != IMC_RAIL_CHANGE))
        imc_perror("imc_wait_cluster_event didn't complete:", status);
} /* main */
10.9.2 Accessing Memory Channel Status Information from the Command Line
The Memory Channel API library provides the
imcs
command
to report on Memory Channel status.
The
imcs
command
writes information to the standard output about currently active Memory Channel
facilities.
The output is displayed as a list of regions or lock sets,
and includes the following information:
The type of subsystem that created the region or lock set (possible values are IMC or PVM)
An identifier for the Memory Channel region
An application-specific key that refers to the Memory Channel region or lock set
The size, in bytes, of the region
The access mode of the region or lock set
The username of the owner of the region or lock set
The group of the owner of the region or lock set
The Memory Channel logical rail that is used for the region
A flag specifying the coherency of the region
The number of locks that are available in the lock set
The total number of regions that are allocated
Memory Channel API overhead
Memory Channel rail usage
10.10 Comparison of Shared Memory and Message Passing Models
There are two models that you can use to develop applications that are based on the Memory Channel API library:
Shared memory
Message passing
At first, the shared memory approach might seem more suited to the Memory Channel
features.
However, developers who use this model must deal with the
latency, coherency, and error-detection problems that are described in this
chapter.
In some cases, it might be more appropriate to develop a simple
message-passing library that hides these problems from applications.
The data
transfer functions in such a library can be implemented completely in user
space.
Therefore, they can operate as efficiently as implementations
based on the shared memory model.
10.11 Frequently Asked Questions
This section contains answers to questions that are asked by programmers who
use the Memory Channel API to develop programs for TruCluster systems.
10.11.1 IMC_NOMAPPER Return Code
Question:
An
attempt was made to do an attach to a coherent region using the
imc_asattach
function.
The function returned the value
IMC_NOMAPPER
.
What does this mean?
Answer: This return value
indicates that the
imc_mapper
process is missing from a
system in your Memory Channel API cluster.
The
imc_mapper
process is automatically started in
the following cases:
On system initialization, when the configuration variable
IMC_AUTO_INIT
has a value of 1.
(See
Section 10.1
for more information about the
IMC_AUTO_INIT
variable.)
When you execute the
imc_init
command for the first
time.
To solve this problem, reboot the system from which the
imc_mapper
process is missing.
This error may occur if you shut down a system to single-user mode from
init
level 3, and then return the system to
multi-user mode without doing a complete reboot.
If you want to reboot a system
that runs TruCluster Server software, you must do a full reboot of that system.
10.11.2 Efficient Data Copy
Question: How can data be copied to a Memory Channel transmit region in order to obtain maximum Memory Channel bandwidth?
Answer: The Memory Channel API
imc_bcopy
function provides an efficient way of copying
aligned or unaligned data to Memory Channel.
The
imc_bcopy
function
has been optimized to make maximum use of the buffering capability of a
Compaq Alpha CPU.
You can also use the
imc_bcopy
function to copy data
efficiently between two buffers in standard memory.
10.11.3 Memory Channel Bandwidth Availability
Question: Is maximum Memory Channel bandwidth available when using coherent Memory Channel regions?
Answer: No.
Coherent regions use
the loopback feature to ensure local coherency.
Therefore, every write
data cycle has a corresponding cycle to loop the data back; this halves
the available bandwidth.
See
Section 10.6.1.3
for more
information about the loopback feature.
10.11.4 Memory Channel API Cluster Configuration Change
Question: How can a program determine whether a Memory Channel API cluster configuration change has occurred?
Answer: You can use the new
imc_wait_cluster_event
function to
monitor hosts that are joining or leaving the Memory Channel API cluster, or to monitor
changes in the state of the active logical rails.
You can write a program that
calls the
imc_wait_cluster_event
function in a separate
thread; this blocks the caller until a state change occurs.
10.11.5 Bus Error Message
Question: When a program tries to set a value in an attached transmit region, it crashes with the following message:
Bus error (core dumped)
Why does this happen?
Answer: The data type of the
value may be smaller than 32 bits (in C, an
int
is a
32-bit data item, and a
short
is a 16-bit
data item).
The Compaq Alpha processor, like other RISC processors,
reads and writes data in 64-bit units or 32-bit units.
When you
assign a value to a data item that is smaller than 32 bits, the compiler
generates code that loads a 32-bit unit, changes the bytes that are to
be modified, and then stores the entire 32-bit unit.
If such a data
item is in a Memory Channel region that is attached for transmit, the assignment causes a
read operation to occur in the attached area.
Because transmit areas are
write-only, a bus error is reported.
You can prevent this problem by ensuring that all accesses are done on 32-bit data items. See Section 10.6.3 for more information.