This appendix describes how to write a Ladebug remote debugger server for an Alpha target (operating system or hardware platform). It describes the functionality required of the server and explains how this functionality is implemented in the Digital UNIX server and in the server included in the debugger monitors for the Alpha server evaluation boards. It also includes information about writing new Ladebug remote debugger servers.
The main reason for using a remote debugger is to debug software on a system that cannot run a fully functional debugger locally. Examples of when you might use a remote debugger are:
Remote debuggers are also useful simply for debugging software on systems that are remote from where you are working, although in this case there are often other alternatives (for example, logging into the system across a network). Whether it is better to use remote debugging or one of these alternatives will often depend on the precise characteristics of the network and the debugger used.
In most cases, the alternatives to using a remote debugger are either to put debugging code, such as print statements, in the software being debugged or to develop a simple local debugger for the target system.
Adding debugging code to the software increases the complexity of the software, which can introduce additional bugs, and it is often difficult to determine in advance what information will be needed to debug the software. A further problem is that the debugging code can itself change the behavior of the software, so it normally has to be removed before the software is released.
Developing a local debugger can itself be a major task. Since such a debugger is normally a one-off development, you cannot normally justify including support for high level features (such as source level debugging). Even if you could justify providing such facilities locally, the target system will often not have the resources (memory, for example) required to run a high level debugger.
All remote debuggers consist of two parts:
These communicate through a remote debugger protocol that runs over some communication mechanism such as a serial line or the Internet.
The client provides the user interface to the debugger and most of the intelligence of the debugger. For example, the client will normally do all translation between addresses and variable or function names.
The server makes available low level functions that allow the client to examine and control software running on the target system. For example, the server provides functions to read and modify the target system's memory. The client requests these functions, and receives responses, using the remote debugger protocol.
In general, a server is much simpler than a client and will require only minimal functionality from the target system on which it is running. This allows servers to be implemented for environments in which the functionality of a full workstation operating system is not available.
Most target systems for remote debugging are of these two types:
ptrace() function) to implement its debugger functions. The Digital UNIX server described in Section B.6.1 is an example of a server for such a target.
Ladebug is a debugger running on Digital UNIX systems. It supports a wide range of languages including Ada, C, C++, COBOL, and Fortran. Besides providing local debugging on Digital UNIX systems, it supports remote debugging through the Ladebug remote debugger protocol. The same text and windows based interfaces are available for use with both local and remote debugging and almost all the commands that are available for local debugging are also available for remote debugging.
Ladebug servers must be able to:
For Ladebug to debug a program there must be symbolic information for the program available to Ladebug on the host in a form that it understands. At present, the only form of symbolic information that Ladebug understands for programs running on Alpha processors is extended COFF (ECOFF) for Digital UNIX. The program must follow the register usage, function calling, and other conventions expected of programs that have this form of symbolic information. For example, a program for which the symbolic information is ECOFF must use Digital UNIX register usage and function calling conventions.
The Ladebug Remote Debugger Protocol is a request/response protocol running over UDP. The debugger client (Ladebug) initiates every transaction by sending a request to the server. On receiving the request, the server acts upon it and sends a response. The server never sends any messages except in response to requests received from the client. The requests that the client can send are listed in Table B-1.
Request | Action |
---|---|
Load Process | Loads a new process for debugging. |
Connect to Process | Connects to an existing process. |
Connect to Process Insist | Connects to an existing process even if other debugger sessions are already connected to it. |
Probe Process | Checks the state of the process being debugged. |
Disconnect from Process | Disconnects, ending a debugger session. |
Kill Process | Kills the process and then disconnects. |
Stop Process | Stops a running process. |
Continue Process | Continues running a stopped process. |
Step | Executes one instruction in the process being debugged. |
Set Breakpoint | Sets a breakpoint at an address. |
Clear Breakpoint | Clears a breakpoint at an address. |
Get Next Breakpoint | Gets the "next" breakpoint that is known to the server. Breakpoints are returned in an arbitrary order but no breakpoint will be returned more than once in a single scan of the list. |
Get Registers | Gets the contents of all the registers. |
Set Registers | Sets the contents of all the registers. |
Read | Reads memory. |
Write | Writes to memory. |
Section B.11 contains a full description of the protocol.
The protocol provides three alternative requests for starting a debugger session:
All servers should implement either Load Process or Connect to Process but a server need not implement both of them. A server that implements Connect to Process can choose any of the following:
To allow a single target machine to run multiple remote debugger sessions at the same time, clients always send Connect and Load requests to a fixed, privileged UDP port on the target. The server is expected to allocate a new unprivileged UDP port before replying. The new port is used on the target as the source and destination of all messages for the remainder of the debugger session.
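On a target with BSD-style sockets, one way to allocate the per-session port is to bind a new UDP socket to port 0 and ask the system which port it chose. The following is a minimal sketch under that assumption; the function name allocate_session_socket() is hypothetical and is not taken from the example servers.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a UDP socket bound to a port chosen by the operating system.
 * Returns the socket descriptor, or -1 on failure.  On success *port_out
 * holds the allocated port in host byte order.  (Hypothetical helper.) */
int allocate_session_socket(unsigned short *port_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd;

    assert(port_out != NULL);

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(0);   /* 0 asks the system to pick a free port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }

    *port_out = ntohs(addr.sin_port);
    return fd;
}
```

The server would then send its reply from this socket, so that the client learns the session port from the source address of the response.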
To allow servers to be run on systems on which security is an issue (for example, typical Digital UNIX systems), the Connect and Load requests contain the client and server login names. The server can use these login names, together with the name of the host system, to check that the remote user is authorized to run programs on the target system, as is done when a user runs programs through rsh.
There are two different requests that end a remote debugger session:
Although there are explicit Ladebug commands that call up each of these requests, Ladebug normally kills processes that it has loaded and disconnects from processes to which it has connected. As such, a server that implements the Load Process function should at least implement the Kill Process function, and a server that implements the Connect to Process function should implement the Disconnect from Process function. The following are optional:
Two example Ladebug servers are available. The source code of these example servers is available from Digital Equipment Corporation for unrestricted reuse on Alpha based platforms (see the copyright notice in the source code for details).
The Digital UNIX server is designed to allow the debugging of user processes running on remote Digital UNIX systems. The version described in this section supports loading new processes using the Load Process request but does not support the Connect to Process or Connect to Process Insist requests.
The server consists of a server daemon and user servers:
The remote debugger daemon must be run as a root process. It would normally be started at system start-up. The user server loads the debuggee using Digital UNIX's fork() and exec() functions and uses Digital UNIX's ptrace() interface to implement the low level debugger facilities required. The load server and daemon both use Digital UNIX UDP sockets to communicate with the client.
The evaluation board server is included in the evaluation board debug monitor provided with the Alpha evaluation boards (EB64, EB64+, EB66, etc.). It is designed to provide source level debugging of operating system kernels being ported to these boards, and of programs running on these boards without an operating system. The complete monitor (including the remote debugger server) is designed to be easily ported to other Alpha based hardware.
The server only supports starting debugger sessions through the Connect to Process or Connect to Process Insist requests. This server does not support the Load Process request. Since the monitor is not a multiprocessing system, the server ignores the process ID in the Connect requests. It also ignores the login names in Connect requests.
The user is expected to load the test program using the monitor load facilities (LOAD, NETLOAD, etc.) before starting the debugger server. The server interprets all addresses it receives as physical addresses. The server then performs all debugger functions by directly reading and writing memory.
To set breakpoints, the debugger patches a PAL call into the code being debugged. To avoid conflict with the use of the breakpoint PAL call by operating system kernels, this is not the standard breakpoint PAL call (the BPT PAL call) but a special PAL call (DBGSTOP). DBGSTOP exhibits identical behavior but has its own system entry address. It is implemented in the debugger version of the evaluation boards' PAL code. When this PAL call is executed, it results in a call back to the monitor, at which point the state of the debuggee is saved and the server is reentered.
The monitor's ethernet software allows the server to register to receive packets addressed to a particular UDP port and to send packets on any UDP port. The server depends on interrupts to receive packets while the debuggee is running. Upon receiving any interrupt, the monitor polls the ethernet driver for messages. The ethernet software passes any appropriate messages to the server.
A consequence of using interrupts to receive messages is that some care is needed when debugging programs that do their own interrupt handling. To allow such programs to be debugged, the Evaluation Board user library contains a function that polls the ethernet. This function would normally be called by the application every time it receives an interrupt.
Each server consists of:
Once the protocol handler has dealt with a message and built a reply it passes this reply back to the communicator. The communicator then sends this reply to the client. The communicator is target dependent. It makes use of the target's UDP functions to read and write messages. In the Digital UNIX server, the communicator also contains the server's main program and the code of the daemon. As such, it is responsible for creating the user servers.
ptrace() on Digital UNIX)
The simplest way of creating a server for a new target is to base it upon one of the example servers. Normally, if you are developing the server to be part of a monitor program, you should base it upon the evaluation board server.
If, however, you are developing it as an operating system utility you should probably base it upon the Digital UNIX server. You should try to make as few changes as possible to the example servers, since you are likely to have no satisfactory way of debugging software (and hence the servers themselves) until you have successfully ported them to your target system.
This section describes
The communicator contains the main function of the debugger server; the interface to this function is target dependent. It also contains some functions that the other components of the server can call. These functions are as follows:
The Digital UNIX communicator is implemented in the C source file server_main.c. This file contains the daemon's entry point (main()), the main function of the user servers (user_server_main()), and the interface functions previously described.
When the daemon is started, main() creates a socket and binds it to the Ladebug remote debugger connect port. It then reads packets from this port, ignoring any packets that are not load requests. When it receives a load request, it checks that the client user is allowed to run remote debugger sessions on this machine under the requested server user name. It does this by calling the Digital UNIX function ruserok().
If the request is valid, the daemon creates a child process (by forking). The parent process then simply continues round the packet reading loop. The child process:
user_server_main() with the client address and the load request packet as arguments. When user_server_main() returns, the child process exits.
If, for any reason, the daemon is unable to start the user server, it then sends a load request response to the client containing an error code.
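The fork-per-session structure described above can be sketched as follows. This is an illustration only: the callback type user_server_fn, the helpers start_user_server() and reap_user_server(), and the stand-in demo_user_server() are all hypothetical, and the real user_server_main() signature may differ. The sketch assumes a POSIX system with fork() and waitpid().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical shape of the user server entry point. */
typedef int (*user_server_fn)(const void *client_addr,
                              const unsigned char *packet, size_t len);

/* Start a user server in a child process.  In the parent this returns the
 * child's process ID, or -1 if the fork failed, in which case the daemon
 * would send an error response to the client. */
pid_t start_user_server(user_server_fn server, const void *client_addr,
                        const unsigned char *packet, size_t len)
{
    pid_t pid;

    assert(server != NULL && packet != NULL);

    pid = fork();
    if (pid == 0) {
        /* Child: run the session, then exit without ever returning to
         * the daemon's packet reading loop. */
        _exit(server(client_addr, packet, len));
    }
    return pid;
}

/* Demonstration helper: wait for a user server and return its exit
 * status, or -1.  A real daemon would not block here. */
int reap_user_server(pid_t pid)
{
    int status;

    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Trivial stand-in for user_server_main(): reports the first packet byte. */
int demo_user_server(const void *client_addr,
                     const unsigned char *packet, size_t len)
{
    (void)client_addr;
    return len > 0 ? packet[0] : 0;
}
```

Because the child is created with fork(), it inherits the load request packet and client address directly from the daemon's memory, which is why no other transfer mechanism is needed on such systems.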
The user server function user_server_main() starts by creating a UDP socket for communicating with the client. It then:
ProcessPacket()) and sends the response to the client.
One complication in the code of the communicator is that the daemon has to be able to handle the receipt of duplicate load messages. The client sends such duplicate load messages when the server's load response message is lost or does not reach the client within the client's time-out time.
To handle such duplicate load messages, a pipe is created between the daemon and each user server. When the daemon receives a duplicate load message, it uses this pipe to pass it on to the appropriate user server. The user server treats this message like any other duplicate message.
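A minimal sketch of the daemon side of this mechanism follows. It assumes that the daemon recognizes a duplicate by the client's address and port, which the text above does not state; the session table and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_SESSIONS 8   /* arbitrary limit chosen for this sketch */

/* One entry per running user server (hypothetical layout). */
struct session {
    int in_use;
    unsigned long client_ip;      /* client address, used as the lookup key */
    unsigned short client_port;
    int pipe_write_fd;            /* daemon's end of the pipe to the user server */
};

static struct session sessions[MAX_SESSIONS];

/* Record a new session.  Returns 0 on success or -1 if the table is full. */
int session_add(unsigned long ip, unsigned short port, int pipe_write_fd)
{
    size_t i;

    for (i = 0; i < MAX_SESSIONS; i++) {
        if (!sessions[i].in_use) {
            sessions[i].in_use = 1;
            sessions[i].client_ip = ip;
            sessions[i].client_port = port;
            sessions[i].pipe_write_fd = pipe_write_fd;
            return 0;
        }
    }
    return -1;
}

/* If the load request comes from a client that already has a user server,
 * pass the packet down that server's pipe.  Returns 1 if forwarded, 0 if
 * the client is new, and -1 if the write failed. */
int forward_if_duplicate(unsigned long ip, unsigned short port,
                         const unsigned char *packet, size_t len)
{
    size_t i;

    assert(packet != NULL);

    for (i = 0; i < MAX_SESSIONS; i++) {
        if (sessions[i].in_use && sessions[i].client_ip == ip &&
            sessions[i].client_port == port) {
            ssize_t n = write(sessions[i].pipe_write_fd, packet, len);
            return (n == (ssize_t)len) ? 1 : -1;
        }
    }
    return 0;
}
```

The user server would read from the other end of the pipe alongside its UDP socket and feed whatever arrives into its normal duplicate handling.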
The evaluation boards' Ethernet driver software passes received frames to other parts of the monitor's software through call-back functions. A component of the monitor that wishes to receive frames on a UDP port calls a registration function provided by the Ethernet driver software.
The registration functions take as an argument the address of the call-back function to be called when such frames are received. When a component registers a call-back function, it can do either of the following:
udp_register_well_known_port().
udp_create_port().
Any component of the monitor can then poll the ethernet at any time by calling ethernet_process_one_packet(). If the ethernet hardware has received a packet for any registered UDP port then the driver will call the appropriate call-back function. The call-back function called may be in a completely different component of the monitor from that which called ethernet_process_one_packet(). Once any call-back function has completed its processing, ethernet_process_one_packet() will return with a result indicating whether any packets were processed.
All packets passed to the ethernet driver must be built in fixed sized buffers provided by the driver, so that the ethernet driver never has to copy any data. The ethernet driver allocates these buffers at addresses from which the ethernet devices can send data and to which they can receive data.
To avoid the need for complex allocation algorithms, and complex error handling if buffer allocation fails, any component of the monitor can allocate a number of buffers at start-up. To maintain this buffer count the ethernet send functions always return to the caller a buffer to replace the buffer containing the packet to be sent.
On completion, the call-back functions used to receive frames must always return an ethernet buffer to the ethernet drivers. This can be, but need not be, the buffer that contained the received frame.
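One way a receive call-back can satisfy both buffer rules is to build its reply in the buffer that held the received frame, send that buffer, and return the replacement buffer it gets back from the send function. The sketch below illustrates this with a mock driver; mock_ethernet_send(), debugger_frame_received(), the buffer size, and the placeholder reply logic are all assumptions made for the example and do not reflect the monitor's real interfaces.

```c
#include <assert.h>
#include <stddef.h>

#define FRAME_SIZE 1536   /* hypothetical fixed buffer size */

/* Stand-in for the driver's buffers: one holds the received frame and
 * one is spare. */
static unsigned char pool[2][FRAME_SIZE];
static unsigned char *spare = pool[1];
static unsigned char *last_sent;

/* Mock send function: takes ownership of 'frame' and returns a
 * replacement buffer so that the caller's buffer count is unchanged. */
unsigned char *mock_ethernet_send(unsigned char *frame, size_t len)
{
    unsigned char *replacement = spare;

    (void)len;
    last_sent = frame;
    spare = frame;        /* the sent buffer becomes free once transmitted */
    return replacement;
}

/* Receive call-back: turn the request into a reply in place, send it,
 * and hand the replacement buffer back to the driver. */
unsigned char *debugger_frame_received(unsigned char *frame, size_t len)
{
    assert(frame != NULL && len > 0);

    frame[0] ^= 0x80;     /* placeholder for real request processing */
    return mock_ethernet_send(frame, len);
}
```

With this pattern the call-back never needs buffers of its own, because every frame it sends is paid for by the frame it has just received.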
The debugger server does not need its own pool of buffers, since it only sends a frame immediately following the receipt of a frame. As such it handles received frames in the following steps:
The server's communicator is largely implemented in C source file server_read_loop.c. This contains the following code:
enable_ladbx_msg(). This enables the receipt of connection messages from the client by registering the connection port. For compatibility with older versions of Ladebug it also registers a second connection port with an unprivileged port number.
read_packets(). The server calls this function whenever it wishes to poll the ethernet for received packets. It simply calls ethernet_process_one_packet() until there are no more packets to process.
data_received(). This is a wrapper for read_packets() that is used when the reason for polling the ethernet is that there has been an interrupt. It disables interrupts before calling read_packets() and restores the interrupt state once read_packets() returns.
app_poll(). Applications that have their own interrupt handlers (and therefore disable the monitor's interrupt handler) call this function to poll the ethernet for debugger frames. It stores its return point as the debuggee's program counter and then calls data_received(). The program counter is set because, if the server receives a stop request, it needs to know where to put a breakpoint to stop the debuggee.
ladebug_server_init_module(). This function places a pointer to app_poll() at a standard address in memory so that an independently linked application can call it.
ladbx_poll() that calls app_poll() through the pointer at this address. Applications that do their own interrupt handling should call ladbx_poll() frequently (for example, every time they receive an interrupt) to ensure that the debugger server receives all debugger protocol packets without excessive delay.
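The polling loop and the call through a published pointer can be illustrated with a simplified, hosted sketch. Here a static variable stands in for the standard memory address, mock_process_one_packet() stands in for the driver's polling function, and the saving of the program counter and the interrupt masking are omitted; only the names read_packets() and ladbx_poll() come from the text, and their bodies are assumptions.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*poll_fn)(void);

/* Stand-in for the standard memory location shared by the monitor and an
 * independently linked application. */
static poll_fn poll_vector;

static int pending_packets;   /* mock receive queue depth */
static int packets_handled;

/* Mock of the driver's polling function: returns nonzero if it processed
 * a packet and zero if none was waiting. */
static int mock_process_one_packet(void)
{
    if (pending_packets == 0)
        return 0;
    pending_packets--;
    packets_handled++;
    return 1;
}

/* Drain the driver: keep polling until no packets remain. */
static void read_packets(void)
{
    while (mock_process_one_packet())
        ;
}

/* Monitor side: publish the polling entry point (hypothetical helper). */
void init_poll_vector(void)
{
    poll_vector = read_packets;
}

/* Application side: poll through the published pointer. */
void ladbx_poll(void)
{
    assert(poll_vector != NULL);
    poll_vector();
}
```

An application's interrupt handler would call ladbx_poll() on each interrupt, which costs only one failed poll when no debugger frame is waiting.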
In addition, the file kutil.s contains the source of the monitor's interrupt function. The monitor only enables interrupts when an application is running. When the monitor receives any interrupt, it saves the debuggee's state and polls the Ethernet for received frames. Since the monitor will normally receive regular 1ms timer interrupts this will ensure that it receives all the client's debugger frames.
It should be possible to port the Digital UNIX communicator to most other versions of UNIX and UNIX derivatives with few changes. For operating systems that are not derived from UNIX, the mechanism for starting user servers may have to be significantly modified.
In particular, many operating systems have no exact equivalent of fork() and instead start a new process by running a new executable file. On such a system, the communicator will have to be split into two separate executable files (one for the daemon and the other for the server). Also in such systems the new process typically does not have access to data set up by its parent before it was created, so some other mechanism will have to be used to transfer the first packet, and other data, to the user process.
The mechanism for setting the user identifier of the user server will vary widely between operating systems. Although daemon is a UNIX term, almost all operating systems have some mechanism for installing and running privileged background processes.
The socket mechanism used to read data from the network is quite widely available. If this mechanism is not available, then any other mechanism that allows the communicator to wait for the receipt of UDP packets on particular ports can be used.
If the operating system does not provide any such mechanism (for example, a real time kernel that does not include networking support), then one option is to port part or all of the networking code in the Evaluation Board Monitor to this environment. In this case, it may be easier to base your communicator upon that in the Evaluation Board Server rather than upon that in the Digital UNIX Server.
For embedded servers, few (if any) changes should be needed to the Evaluation Board's communicator. However, for many such systems you will need to rewrite the network device drivers. These are contained in the ethernet code of the Evaluation Board Monitor.
The protocol handler's main interface function is ProcessPacket(). The only argument to this function is a pointer to the packet that it is to process. As a part of its processing of the packet ProcessPacket() converts the request packet passed to it into a response packet. The caller must ensure that the buffer pointed to by the argument is large enough to contain any possible response packet.
The function DumpPacket() can also be called by the communicator. This dumps the contents of packets passed to it, if the protocol handler is compiled with tracing enabled.
The code for the protocol handler is identical for the two servers. It should not need to change for other server implementations. The source code is in the files packet-handling.c and packet-util.c; packet-handling.c contains the function ProcessPacket(). When this function receives a packet, it:
This normally consists of extracting some arguments from the packet and passing them to the appropriate debugger kernel function. Where a request changes the state of the connection to the client (for example the kill command disconnects from the client), it calls the appropriate communicator interface function to inform the communicator that this has happened. In some cases, it also retrieves data from kernel functions (for example, the contents of the registers) and copies them into the packet.
It also sets the packet's return value. Before it returns the response packet to the communicator, it makes a copy of it so that it can be resent if the next packet duplicates the request packet.
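The idea of keeping a copy of the last response for retransmission can be sketched as below. The sketch assumes that a duplicate is detected by comparing the whole request with the previous one; the real protocol handler may use a different test (for example, a sequence number). All names, the buffer limit, and the placeholder execute_request() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PACKET 1500   /* hypothetical upper bound on packet size */

static unsigned char saved_request[MAX_PACKET];
static unsigned char saved_response[MAX_PACKET];
static size_t saved_request_len;
static size_t saved_response_len;
static int requests_executed;

/* Placeholder for real request processing: converts the request into a
 * response in place and returns the response length. */
static size_t execute_request(unsigned char *packet, size_t len)
{
    requests_executed++;
    packet[0] |= 0x80;    /* invented convention marking a response */
    return len;
}

/* Process a request in place.  If it repeats the previous request, replay
 * the saved response instead of executing the request a second time. */
size_t process_with_replay(unsigned char *packet, size_t len)
{
    assert(packet != NULL && len > 0 && len <= MAX_PACKET);

    if (len == saved_request_len &&
        memcmp(packet, saved_request, len) == 0) {
        memcpy(packet, saved_response, saved_response_len);
        return saved_response_len;
    }

    memcpy(saved_request, packet, len);
    saved_request_len = len;
    saved_response_len = execute_request(packet, len);
    memcpy(saved_response, packet, saved_response_len);
    return saved_response_len;
}
```

Replaying the stored response matters for requests such as Step or Continue Process, where executing a retransmitted request twice would change the debuggee's state.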
packet-util.c contains utility functions for reading and writing the fields of packets and for dumping the contents of a packet. To avoid any possible alignment problems the utility functions read and write packet fields a byte at a time.
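Byte-at-a-time access of this kind might look like the following sketch. The function names are hypothetical, and the byte order (least significant byte first) is an assumption made for the example; consult the protocol description for the actual field layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit field stored least significant byte first.  Working one
 * byte at a time means 'offset' need not be aligned. */
uint32_t get_field32(const unsigned char *packet, size_t offset)
{
    assert(packet != NULL);

    return (uint32_t)packet[offset]
         | (uint32_t)packet[offset + 1] << 8
         | (uint32_t)packet[offset + 2] << 16
         | (uint32_t)packet[offset + 3] << 24;
}

/* Write a 32-bit field least significant byte first. */
void put_field32(unsigned char *packet, size_t offset, uint32_t value)
{
    assert(packet != NULL);

    packet[offset]     = (unsigned char)(value & 0xff);
    packet[offset + 1] = (unsigned char)((value >> 8) & 0xff);
    packet[offset + 2] = (unsigned char)((value >> 16) & 0xff);
    packet[offset + 3] = (unsigned char)((value >> 24) & 0xff);
}
```

Casting the packet pointer to a wider integer type instead would risk unaligned access traps on Alpha whenever a field does not fall on a natural boundary.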
This section describes:
The debugger kernels provide the following interface functions to the protocol handler:
Loads a new process. The arguments are:
The result is TRUE if successful or FALSE if the load fails. If the load is successful, the process becomes the new debuggee and stops at its entry point.
In the Digital UNIX server, the debugger kernel is implemented using the ptrace() function. This is a Digital UNIX function that allows a parent process to examine and control its children. Since ptrace() can only be used to debug child processes, the Digital UNIX debugger kernel only supports the loading of new processes and not connection to existing processes.
Since this means that the debuggee always runs as a child of the server, and the server is killed when the client disconnects, the Digital UNIX server does not support disconnecting from a debuggee without killing it.
When kload() loads a new debuggee, it does so using the Digital UNIX functions fork() and exec(). kload() creates the new process using the fork() function. This new child process:
ptrace() call to allow its parent to control it using ptrace() calls. This also sets up a breakpoint on executing new images.
exec(). When it reaches the debuggee's entry point, it will stop and its parent will receive a SIGCHLD signal.
The parent process meanwhile waits for a signal from the child. When it receives a signal, it checks that the debuggee has stopped at a breakpoint (rather than, for example, having exited). If it has, the kernel checks whether the debuggee uses shared libraries.
If the debuggee does use shared libraries the server tells it to continue (through a ptrace() call) and waits for it to stop once more.
ptrace() until the child process stops for the second time.
Once the debuggee has been started the server stores the state of the debuggee in the variable child_state.
The Digital UNIX kernel inserts breakpoints by using ptrace() to write a breakpoint PAL call to the address of the breakpoint. On Alpha Digital UNIX ptrace() always reads and writes 8-byte (2 instruction) quantities, so the kernel has to insert and remove breakpoints through a read, modify, write sequence.
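The read, modify, write sequence can be sketched as follows. To keep the example self-contained, a small array stands in for the debuggee's memory and read_quad() and write_quad() stand in for the 8-byte ptrace() transfers; these helpers and insert_breakpoint() are hypothetical names. The sketch assumes the usual little-endian Alpha layout, in which the instruction at the lower address occupies the low 32 bits of the quadword, and uses 0x00000080 as the encoding of the BPT PAL call.

```c
#include <assert.h>
#include <stdint.h>

#define BPT_INSTRUCTION 0x00000080u   /* Alpha "call_pal bpt" encoding */

/* Mock debuggee memory standing in for 8-byte ptrace() reads and writes. */
static uint64_t mock_memory[4];

static uint64_t read_quad(uint64_t addr)
{
    return mock_memory[addr / 8];
}

static void write_quad(uint64_t addr, uint64_t value)
{
    mock_memory[addr / 8] = value;
}

/* Replace the instruction at 'addr' (4-byte aligned) with a breakpoint
 * and return the original instruction so that it can be saved. */
uint32_t insert_breakpoint(uint64_t addr)
{
    uint64_t quad_addr = addr & ~(uint64_t)7;
    unsigned shift = (addr & 4) ? 32 : 0;   /* which half of the quadword */
    uint64_t quad;
    uint32_t original;

    assert((addr & 3) == 0);

    quad = read_quad(quad_addr);                     /* read   */
    original = (uint32_t)(quad >> shift);
    quad &= ~((uint64_t)0xffffffffu << shift);       /* modify */
    quad |= (uint64_t)BPT_INSTRUCTION << shift;
    write_quad(quad_addr, quad);                     /* write  */
    return original;
}
```

Removing a breakpoint is the same sequence with the saved instruction written back in place of the breakpoint encoding.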
The kernel implements the stop request (function kstop()) by sending a SIGINT signal to the debuggee. This will stop the debuggee unless it has disabled receiving SIGINT signals. kkill() uses the same technique to stop the debuggee before killing it.
When kgo() is called, it first checks whether there is a breakpoint at the current program counter. If there is a breakpoint there, then the kernel executes the original instruction at this location using the internal function kstepoverbreak(). This function temporarily puts the instruction back into the code and then uses ptrace() with the PT_STEP function code to execute this instruction. Once it has executed the instruction it restores the breakpoint.
When kstepoverbreak() returns kgo() calls the internal function kasyncwait() to set up kstopped() as a signal handler for the SIGCHLD signal. It then calls ptrace() with the PT_CONTINUE function code. This tells the debuggee to continue from its current program counter.
When a running debuggee stops for any reason, the server will receive a SIGCHLD signal. This will cause kstopped() to be called. kstopped() checks why the debuggee has stopped and sets the child_state appropriately. If the debuggee has stopped as a result of a breakpoint PAL call the program counter will point to the instruction after the breakpoint. Under these circumstances, kstopped() will move the program counter back one instruction to point at the breakpointed address.
When the kernel reads memory, it uses the breakpoint table functions to check whether there is a breakpoint on either of the longwords it is reading. If there is such a breakpoint, it reads the data for that longword from the breakpoint table rather than from the debuggee's memory. Similarly, when the kernel is asked to write to memory, it will update breakpoint table entries if necessary and will not overwrite breakpoint PAL calls.
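The way reads are hidden behind the breakpoint table can be sketched as below. The table layout and the names find_breakpoint() and mask_breakpoints() are hypothetical; the sketch takes a quadword already read from the debuggee and substitutes the saved instruction for each longword that currently holds a breakpoint, assuming the little-endian layout in which the lower-addressed longword is the low half.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BREAKPOINTS 16   /* arbitrary limit chosen for this sketch */

/* Hypothetical breakpoint table entry. */
struct breakpoint {
    int in_use;
    uint64_t addr;                /* address of the patched instruction */
    uint32_t saved_instruction;   /* instruction the breakpoint replaced */
};

static struct breakpoint table[MAX_BREAKPOINTS];

static const struct breakpoint *find_breakpoint(uint64_t addr)
{
    size_t i;

    for (i = 0; i < MAX_BREAKPOINTS; i++)
        if (table[i].in_use && table[i].addr == addr)
            return &table[i];
    return NULL;
}

/* Given the raw quadword read from the debuggee at 'quad_addr' (8-byte
 * aligned), return the value the client should see. */
uint64_t mask_breakpoints(uint64_t quad_addr, uint64_t raw)
{
    unsigned half;

    assert((quad_addr & 7) == 0);

    for (half = 0; half < 2; half++) {
        const struct breakpoint *bp = find_breakpoint(quad_addr + 4 * half);

        if (bp != NULL) {
            unsigned shift = 32 * half;

            raw &= ~((uint64_t)0xffffffffu << shift);
            raw |= (uint64_t)bp->saved_instruction << shift;
        }
    }
    return raw;
}
```

The client therefore always sees the program as it was written, and can disassemble or checksum code without noticing the server's breakpoints.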
The mapping of register to register number used by ptrace() is the same as that used by the kernel interface. This means that kregister() and ksetreg() translate very directly into ptrace() calls.
The evaluation board server's debugger kernel is implemented by directly reading from and writing to memory. It runs in the same environment as the debuggee, with the same mapping of virtual to physical addresses.
As such, there is no distinction between its memory and the debuggee's memory. This means that it can satisfy requests to read or write the debuggee's memory by simply reading or writing its own virtual memory.
The evaluation board kernel implements breakpoints through a special additional PAL call, DBGSTOP. The monitor's PAL code provides this additional PAL call. It is functionally identical to the standard Digital UNIX breakpoint PAL call except that its system entry address can be set up independently by passing a different function code value to wrest(). This allows the monitor to set breakpoints even in applications (for example, operating systems) that do their own breakpoint handling using the standard breakpoint PAL call.
The evaluation boards' debugger kernel is implemented in the C source file kernel.c and the assembler source file kutil.s. The functions in these files are also used to implement the low level debugger commands provided by the monitor on its dumb terminal interface. kernel.c contains the main body of the debugger kernel, including all the interface functions previously listed. kutil.s contains the system entry points for interrupts, traps and breakpoints; and functions that provide a C interface to various PAL calls.
kernel.c contains three functions that are used to initialize the debugger kernel:
kstart() is called when the system starts up. It simply initializes some of kernel.c's static variables and ensures that interrupts are disabled.
kinitpalentry() is called before the monitor runs any application. It reinitializes the PAL system entry points by calling the assembler function kutilinitbreaks(). Once a debuggee has been started, and until it completes, monitor code will only be executed when it is called directly or indirectly from one of these system entry points. kutilinitbreaks() defines the system entry point for interrupts to point to the monitor's interrupt function, and the system entry point for DBGSTOP to point to the monitor's low level breakpoint function. All other system entry points are defined to point to the monitor's trap function.
kenableserver() is called to switch to remote debug mode. It is called when the user issues the ladbx command at the monitor's dumb terminal. At this point the user should have already loaded and started the debuggee using the monitor's dumb terminal commands, and it should be stopped at a breakpoint. It sets remote debugger mode, calls enable_ladbx_msg() to enable the receipt of debugger messages by the server and then waits for such messages by calling kwaitforcontinue().
Once the debuggee has been started the state of the debuggee is always in the variable child_state. kpoll() simply reads this variable.
The kernel sets a breakpoint by saving the original instruction in the breakpoint table and writing a DBGSTOP instruction to the location at which the breakpoint is required. To simplify other memory access in the debugger monitor, the kernel does not write DBGSTOP instructions into memory until just before the program is allowed to run, and replaces them with the original instructions as soon as the program stops.
The function kinstall_breakpoints() writes DBGSTOP instructions for all current breakpoints to memory, and the restore_breakpoint_instrs() internal function restores the original instructions at these locations whenever the debuggee stops. Because any modification to the debuggee's memory can alter its instruction stream, the kernel follows all writes to the debuggee's memory with instruction barrier PAL calls.
When the debuggee hits a breakpoint (i.e., executes a DBGSTOP PAL call), the PAL code calls the monitor's assembler breakpoint function (dbgentry()), which:
kreenter(), which:
Unless the debuggee has just single stepped or processed a stop request this function will be katbpt().
katbpt() steps the saved program counter back one instruction so that it:
kwaitforcontinue(), which:
read_packets() or user_main().
kwaitforcontinue() will stay in this loop until some other kernel function clears the stopped flag.
The handling of exceptions is similar to the handling of breakpoints. All the unused system entry points are initially set up to point to dbgtrap. This sets a flag (in a register that the PAL code has already saved) to indicate that the server was reentered as a result of an exception and then jumps to dbgentry2. This is an alternative entry point to the function dbgentry(). dbgentry() saves the processor's registers, as previously described, but then, instead of calling kreenter(), calls ktrap().
ktrap() removes any temporary breakpoints, then sets child_state appropriately and then calls kwaitforcontinue().
A command can be received as a result of:
When a command is received, either the command processor or the protocol handler calls the appropriate kernel function. The functions that can be called are the previously listed interface functions. Table B-2 explains the behavior.
Kernel Function | Action |
---|---|
kload(), kload_implemented() | Always return FALSE |
kconnect_implemented() | Returns TRUE |
kconnect() | Always returns TRUE, does nothing else |
kkill_possible() | Always returns FALSE |
kkill() | Does nothing |
kdisconnect_possible() | Always returns TRUE |
kdisconnect() | Does nothing |
kpid() | Always returns 0 |
kgo() | Checks whether the debuggee is stopped at a breakpoint. If it is, it uses ksetstepbreak() to set the temporary breakpoints so that the debuggee stops again after executing one instruction. It also sets the breakpoint continuation function to be ksteppedoverbreak(). If the debuggee is not at a breakpoint, kgo() places breakpoint instructions (DBGSTOP PAL calls) at all the breakpoints and sets the breakpoint continuation function to be katbpt(). Then, whether or not the debuggee was at a breakpoint, it clears the stopped flag so that the debuggee will continue the next time kwaitforcontinue() checks the flag. |
kstop() | Checks whether the debuggee is still running or stopped. If the debuggee is stopped, kstop() does nothing. If the debuggee is running, it places a temporary breakpoint at the current instruction. |
kaddressok() | Returns TRUE if the address is quadword aligned and FALSE otherwise. |
kcexamine() | Reads the requested location. It does not need to check the breakpoint table because it is only called when the debuggee is stopped. |
kcdeposit() | Writes to the requested location. |
kstep() | Uses ksetstepbreak() to set up temporary breakpoints everywhere the program counter can be after executing the next instruction. This requires a maximum of two temporary breakpoints since ksetstepbreak() can work out the destination of a jump instruction by reading the instruction's argument register. It also sets the breakpoint continuation function to be katbpt() and clears the stopped flag. |
kpc() | Returns the saved program counter. |
ksetpc() | Modifies the saved program counter. |
kregister() | Returns the value of the appropriate entry in the saved register array. |
ksetreg() | Sets the value of the appropriate entry in the saved register array. |
kbreak() | Calls bptinsert(). The kernel does not write to the debuggee's memory until the debuggee is about to be run or resumed. |
kremovebreak() | Calls bptremove(). |
kpoll() | Returns the value of child_state. |
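The kstep() entry says that at most two temporary breakpoints are needed. The following is a minimal sketch of why, assuming the standard Alpha instruction encoding (branch-format opcodes 0x30 to 0x3F with a signed 21-bit displacement, and jump-format opcode 0x1A with the target in register Rb). The function name step_targets() and its signature are hypothetical and are not taken from the server source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ALPHA_INST_SIZE 4u

/* Hypothetical helper, not the server's ksetstepbreak(). Fills targets[]
 * with every address the program counter can hold after the instruction
 * `inst` at `pc` has executed, and returns how many there are (1 or 2).
 * regs[] holds the 32 saved integer registers. */
int step_targets(uint64_t pc, uint32_t inst, const uint64_t regs[32],
                 uint64_t targets[2])
{
    uint32_t opcode = inst >> 26;
    uint64_t next = pc + ALPHA_INST_SIZE;

    assert(regs != NULL && targets != NULL);

    if (opcode == 0x1A) {
        /* Jump format (JMP, JSR, RET): the target is register Rb
         * (bits 20..16) with the low two bits cleared. */
        targets[0] = regs[(inst >> 16) & 0x1F] & ~(uint64_t)3;
        return 1;
    }
    if (opcode >= 0x30) {
        /* Branch format: signed 21-bit displacement, counted in
         * instructions, relative to the updated program counter. */
        int64_t disp = (int64_t)(inst & 0x1FFFFF);
        uint64_t taken;

        if (disp & 0x100000)
            disp -= 0x200000;
        taken = next + (uint64_t)(disp * 4);

        if (opcode == 0x30 || opcode == 0x34) {
            targets[0] = taken;          /* BR and BSR are always taken */
            return 1;
        }
        targets[0] = next;               /* conditional branch not taken */
        if (taken == next)
            return 1;
        targets[1] = taken;              /* conditional branch taken */
        return 2;
    }
    targets[0] = next;                   /* any other instruction */
    return 1;
}
```

Only a conditional branch yields two candidate addresses; every other instruction has a single successor, which is consistent with the two-breakpoint limit described in the table.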
When kwaitforcontinue() sees that the stopped flag is
clear, it returns (through a number of intermediate functions) to
bptentry(). This restores the processor registers and
then calls the PAL RTI function to return to the debuggee.
If the debuggee was continuing from a (permanent) breakpoint as a
result of a kgo() call, it will hit a new (temporary) breakpoint
after executing one instruction. The state will be saved as it
would be with a permanent breakpoint, but the breakpoint continuation
function called will be ksteppedoverbreak(). This
backs up the program counter one instruction, places DBGSTOP PAL calls
at all the breakpoints in the breakpoint table, and then once again
returns to bptentry() to resume the debuggee.
The debuggee will now run until it is stopped by hitting a further permanent breakpoint or by an exception, or by a stop command.
The assembler source file kutil.s contains the function
dbgint(). This is the monitor's system entry point
for interrupts. On receiving any interrupt, the monitor saves the
previous state and calls data_received() to tell the
communicator that the Ethernet device may have received data and
that it should poll the Ethernet driver.
The one complication in dbgint() is that if the
server receives a Stop Request packet, then the debugger kernel
will need to know the debuggee's current program counter. This is
not necessarily the program counter saved by the PAL code because
the interrupt routine can itself be interrupted (and therefore be
called recursively).
The global variable containing the saved program counter is checked
for a nonzero value. A value of zero is used to indicate that it is
not in use. If it is already set, it is not reset but
data_received() is called.
If the program counter has not already been saved in this global
variable, the value that was saved on the stack by the PAL code is
examined. If it is within dbgint(), then this is a
recursive call to dbgint() with the second interrupt
having happened before the first call to dbgint()
saved the program counter. In these circumstances, there is no need
to call data_received() since it will be called by
the outer call to dbgint(). Otherwise, this value is
saved as the debuggee's program counter and data_received()
is called.
This procedure requires that the code that saves the value of the
program counter should be in the function dbgint() and
not within another function called by dbgint().
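The decision made in dbgint() can be sketched in C as follows. This is only an illustration of the logic described above: the real routine is written in assembler, and the variable names, the address range, and the function note_interrupt_pc() are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Saved debuggee program counter; zero means "not in use". */
static uint64_t saved_debuggee_pc;

/* Assumed address range occupied by the interrupt entry routine. */
static uint64_t dbgint_start = 0x20000;
static uint64_t dbgint_end   = 0x20100;

/* Decides what to do with the program counter that the PAL code saved.
 * Returns 1 if the caller should go on to call data_received(), or 0 if
 * an outer invocation of the interrupt routine will do so. */
int note_interrupt_pc(uint64_t pal_saved_pc)
{
    assert(dbgint_start < dbgint_end);

    if (saved_debuggee_pc != 0)
        return 1;   /* already captured: keep it, just poll for data */

    if (pal_saved_pc >= dbgint_start && pal_saved_pc < dbgint_end)
        return 0;   /* nested interrupt before the outer call saved the PC */

    saved_debuggee_pc = pal_saved_pc;   /* this is the debuggee's PC */
    return 1;
}
```

The range test only works if the instruction that stores the program counter lies between dbgint_start and dbgint_end, which is why the saving code has to live in dbgint() itself.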
The kernel also contains a function, knullipl(), that
clears an interrupt. On the EB64 version of the kernel, it simply
writes two commands to the 82C59 (the interrupt controller used
on the EB64). This function will clearly have to be rewritten for
target systems that use different interrupt controllers.
Few, if any, changes should be needed to port the Digital UNIX debugger kernel to other Digital UNIX-like operating systems. The operating system functions used in the Digital UNIX debugger kernel seem to be available in all Digital UNIX dialects.
A problem that may arise is that in some Digital UNIX dialects, when the debuggee stops at
a breakpoint, the program counter may point to the actual
breakpoint instruction rather than the instruction following
the breakpoint. For some of the newer dialects of Digital UNIX, a server with greater functionality (in
particular the additional ability to connect to existing processes)
could be implemented by rewriting the debugger kernel to use the
/proc debugger interface.
Porting the server to other operating systems will involve replacing
the ptrace(), fork(), and exec() calls
with the equivalent calls (if they exist) for the target operating
system. Assuming it is possible to read and write a subprocess's
memory and registers, this should not be difficult.
On operating systems where this is not possible you may have to link some low level debugger functions into the debuggee and communicate between these functions and the kernel through shared memory. On such operating systems, there is also a need for a mechanism for detecting that the debuggee has stopped at a breakpoint. How this is done will vary widely between operating systems.
Few changes are likely to be needed to port the Evaluation Board Server to other embedded systems that use the Digital UNIX PAL code interface. The changes that will often be needed are as follows:
knullipl() will have to be rewritten.
If some other PAL code interface is used, then this will probably require changes in how breakpoints are set and how the server's entry points are called when the debuggee hits a breakpoint or receives an interrupt. It may also alter how much of the debuggee's state is saved by the PAL code before the server's entry points are called from the PAL code, and the value of the program counter that is passed to the server on reaching a breakpoint.
The most common problems that have arisen when modifying the code of the debugger kernels are as follows:
The breakpoint table handler provides the following interface functions:
*savedinst. If it fails to remove the breakpoint, it returns a negative error code as its result.
*address and the saved instruction in *savedinst. If it fails to find the breakpoint, it returns a negative error code as its result.
*savedinst. If it fails to find the breakpoint, it returns a negative error code as its result.
The code for the breakpoint table handler is identical for the
two servers. It should not need to change for other server
implementations. The source code is in bptable.c. The
table is implemented as three arrays of 100 entries each. The breakpoint
number of a breakpoint is used as an index into these arrays. The
three arrays are:
New entries are inserted in the first available entry and entries are found by a linear search.
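A table of this shape might look like the sketch below. It is not the code in bptable.c: the contents of the three arrays (an in-use flag, an address, and a saved instruction) are an assumption, and the function names, signatures, and error values are hypothetical. It follows the behavior described above: insertion takes the first free entry, lookup is a linear search, and failures are reported as negative codes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BPT_MAX          100
#define BPT_ERR_FULL     (-1)   /* hypothetical error codes */
#define BPT_ERR_NOTFOUND (-2)

/* Three parallel arrays indexed by breakpoint number (assumed layout). */
static int      bpt_used[BPT_MAX];
static uint64_t bpt_addr[BPT_MAX];
static uint32_t bpt_inst[BPT_MAX];

/* Inserts into the first free entry; returns the breakpoint number. */
int sketch_bpt_insert(uint64_t address, uint32_t savedinst)
{
    int i;
    for (i = 0; i < BPT_MAX; i++) {
        if (!bpt_used[i]) {
            bpt_used[i] = 1;
            bpt_addr[i] = address;
            bpt_inst[i] = savedinst;
            return i;
        }
    }
    return BPT_ERR_FULL;
}

/* Linear search by address; stores the saved instruction in *savedinst. */
int sketch_bpt_find(uint64_t address, uint32_t *savedinst)
{
    int i;
    assert(savedinst != NULL);
    for (i = 0; i < BPT_MAX; i++) {
        if (bpt_used[i] && bpt_addr[i] == address) {
            *savedinst = bpt_inst[i];
            return i;
        }
    }
    return BPT_ERR_NOTFOUND;
}

/* Frees the entry for address, returning its saved instruction. */
int sketch_bpt_remove(uint64_t address, uint32_t *savedinst)
{
    int i = sketch_bpt_find(address, savedinst);
    if (i >= 0)
        bpt_used[i] = 0;
    return i;
}
```

With only 100 entries, a linear search is cheap, and reusing the first free slot keeps breakpoint numbers small. The sketch does not reject a second breakpoint at an address that is already in the table.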
The Ladebug Remote Debugger Protocol is a request/response protocol running over UDP. The debugger client (Ladebug) initiates all transactions by sending a request to the server. On receiving the request, the server acts upon it and sends a response.
If the client does not receive a response within a time-out, it repeats the request (with an indication that the message is a duplicate). The time-out will vary between a tenth of a second and 10 seconds depending on how long it took to get responses to previous requests.
If the client still has not received a response after a number of attempts with increasing retry time-outs, it assumes that the communication path to the server has failed. The server never sends any messages except in response to messages received from the client.
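The protocol description gives only the bounds of the time-out (a tenth of a second to 10 seconds) and says that it depends on earlier response times and increases on retries. The sketch below shows one policy that satisfies those statements. The specific rule (twice the last response time, doubled on each retry) is an assumption made for illustration and is not Ladebug's actual algorithm; the function name is hypothetical.

```c
#include <assert.h>

#define TIMEOUT_MIN_MS 100u     /* a tenth of a second */
#define TIMEOUT_MAX_MS 10000u   /* 10 seconds */

/* Assumed policy: start from twice the time the previous response took,
 * double once per retransmission already made, and clamp the result to
 * the range the protocol description allows. */
unsigned retry_timeout_ms(unsigned last_response_ms, unsigned attempt)
{
    unsigned long long t = 2ull * last_response_ms;

    if (t < TIMEOUT_MIN_MS)
        t = TIMEOUT_MIN_MS;
    while (attempt-- > 0 && t < TIMEOUT_MAX_MS)
        t *= 2;
    if (t > TIMEOUT_MAX_MS)
        t = TIMEOUT_MAX_MS;

    assert(t >= TIMEOUT_MIN_MS && t <= TIMEOUT_MAX_MS);
    return (unsigned)t;
}
```

For example, after a 300 ms response the first wait would be 600 ms and the first retry 1200 ms, with later retries capped at 10 seconds.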
Section B.11.1.1 through Section B.12 describe more about the Ladebug remote debugger protocol:
This section describes the Ladebug Remote Debugger Protocol messages and the format of each message.
Table B-3 shows header names, byte numbers, format, and contents of the message headers. Section B.11.1.2 shows the possible values of the messages.
Name | Byte Number | Format | Content |
---|---|---|---|
Protocol Version | 0 | Integer | Should be 2 |
Retransmit Count | 1 | Integer | In requests, 0 the first time a packet is transmitted; each retransmission of the packet increments it by one. In responses, the retransmit count of the request. |
Command code | 2 to 3 | Integer in network order, most significant byte first[1] | Identifies the type of request or response. |
Sequence Number | 4 to 7 | Integer in network order | Identifies the message. |
Process ID | 8 to 11 | Integer in target machine order, least significant byte first for Alpha targets | Identifies the process being debugged. The value is not defined in load request messages. |
Return value | 12 to 15 | Integer in network order | Ignored in requests. In replies, tells the client whether the requested action was successful and, if not, why not. |

[1] The protocol sends multibyte integer fields whose meaning is independent of the target architecture in conventional network order (i.e., most significant byte first). Examples of such fields are the command code or byte counts. Multibyte integer fields that can only be interpreted with knowledge of the target architecture, such as addresses or register values, are sent in target machine order. For Alpha targets this means that such fields are sent least significant byte first.
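The mixed byte ordering of the header is easy to get wrong, so the following sketch packs and unpacks the 16-byte header one byte at a time, which makes it independent of the byte order of the machine it runs on. The structure and function names are hypothetical, and the process ID is written least significant byte first on the assumption that the target is an Alpha.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LDB_HEADER_SIZE 16

struct ldb_header {
    uint8_t  version;      /* byte 0, should be 2 */
    uint8_t  retransmit;   /* byte 1 */
    uint16_t command;      /* bytes 2..3, network order */
    uint32_t sequence;     /* bytes 4..7, network order */
    uint32_t pid;          /* bytes 8..11, target order (Alpha: LSB first) */
    uint32_t retval;       /* bytes 12..15, network order */
};

static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24); p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);  p[3] = (uint8_t)v;
}

static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

void ldb_header_encode(const struct ldb_header *h, uint8_t out[LDB_HEADER_SIZE])
{
    assert(h != NULL && out != NULL);
    out[0] = h->version;
    out[1] = h->retransmit;
    out[2] = (uint8_t)(h->command >> 8);
    out[3] = (uint8_t)h->command;
    put_be32(out + 4, h->sequence);
    out[8]  = (uint8_t)h->pid;            /* least significant byte first */
    out[9]  = (uint8_t)(h->pid >> 8);
    out[10] = (uint8_t)(h->pid >> 16);
    out[11] = (uint8_t)(h->pid >> 24);
    put_be32(out + 12, h->retval);
}

void ldb_header_decode(const uint8_t in[LDB_HEADER_SIZE], struct ldb_header *h)
{
    assert(h != NULL && in != NULL);
    h->version    = in[0];
    h->retransmit = in[1];
    h->command    = (uint16_t)((in[2] << 8) | in[3]);
    h->sequence   = get_be32(in + 4);
    h->pid        = (uint32_t)in[8] | ((uint32_t)in[9] << 8) |
                    ((uint32_t)in[10] << 16) | ((uint32_t)in[11] << 24);
    h->retval     = get_be32(in + 12);
}
```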
Table B-4 explains the values returned by messages.
Value | Message | Explanation |
---|---|---|
0 | OK | Request succeeded |
1 | Bad process ID | The process ID of the message is not that of the debuggee, or, in the case of Connect to Process, the server could not connect to that process. |
2 | No resources | The server did not have the resources to carry out the request. |
3 | Not connected | The server is not connected to a debuggee. The request requires that it should be. |
4 | Not stopped | The debuggee is running. The request can only be carried out with the debuggee stopped. |
5 | Bad address | The address given in the request is bad. The precise meaning of this varies between the different types of responses that can give this return value. |
6 | Not implemented | The server does not implement this request. |
7 | Bad load name | See Section B.11.1.4 |
8 | Already connected | The server is already debugging the requested debuggee. |
9 | Cannot disconnect from process | See Section B.11.1.8 |
10 | Cannot kill process | See Section B.11.1.10 |
11 | Cannot step | See Section B.11.1.12 |
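In a client or server written in C, the return values in Table B-4 could be represented as an enumeration with a lookup function for diagnostics. The identifier names below are hypothetical; only the numeric values and message strings come from the table.

```c
#include <assert.h>
#include <stddef.h>

enum ldb_status {
    LDB_OK = 0,
    LDB_BAD_PROCESS_ID,
    LDB_NO_RESOURCES,
    LDB_NOT_CONNECTED,
    LDB_NOT_STOPPED,
    LDB_BAD_ADDRESS,
    LDB_NOT_IMPLEMENTED,
    LDB_BAD_LOAD_NAME,
    LDB_ALREADY_CONNECTED,
    LDB_CANNOT_DISCONNECT,
    LDB_CANNOT_KILL,
    LDB_CANNOT_STEP,
    LDB_STATUS_COUNT        /* number of defined values, not a status */
};

/* Returns the message text for a return value, or "Unknown" if the
 * value is outside the range defined in Table B-4. */
const char *ldb_status_name(int status)
{
    static const char *const names[] = {
        "OK", "Bad process ID", "No resources", "Not connected",
        "Not stopped", "Bad address", "Not implemented", "Bad load name",
        "Already connected", "Cannot disconnect from process",
        "Cannot kill process", "Cannot step"
    };

    /* The table and the enumeration must stay in step. */
    assert(sizeof names / sizeof names[0] == LDB_STATUS_COUNT);

    if (status < 0 || status >= LDB_STATUS_COUNT)
        return "Unknown";
    return names[status];
}
```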
The load process request is a request to the server to load a new process and to start a new debugger session. Its fields are transmitted in the order shown, with no unused bytes between them. Section B.11.1.4 describes the possible responses to a load process request.
Table B-5 shows the fields of the load process request:
Name | Length | Format | Contents |
---|---|---|---|
Header | 16 bytes | See Table B-3 | See Table B-3 |
Client User Name | Variable | Null terminated character string | Name of the user of the client on the host. This can be used by the server to check that the client is allowed to load the requested process. |
Server User Name | Variable | Null terminated character string | User name of user to run the process on the target. This will be ignored by some servers. |
Program Name | Variable | Null terminated character string | Name of program to be loaded. The form and interpretation of this name will vary between servers. |
Number of arguments | 1 byte | Integer | Count of program argument fields |
Arguments | Variable | Variable number of null terminated strings. The number of arguments field gives the number of strings. | Arguments to be passed to the loaded process. May be ignored by some servers. |
Standard input | Variable | Null terminated string | File name of a file to which standard input is to be redirected. An empty string (just a 0 byte) indicates no redirection: otherwise the interpretation of the file name is server dependent. |
Standard output | Variable | Null terminated character string | Name of file to which standard output is to be redirected. This file name is interpreted in the same way as the standard input file name. |
Standard error | Variable | Null terminated character string | Name of file to which standard error is to be redirected. This file name is interpreted in the same way as the standard input file name. |
The command code for a Load Process request is 1. The server should ignore the PID received in a Load Process request.
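Because the fields of a Load Process request are variable length and packed with no padding, the body has to be assembled sequentially. The sketch below builds the part of the request that follows the 16-byte header. The function names and signature are hypothetical, and whether the argument list should include the program name itself is not stated here, so the sketch simply writes whatever strings it is given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Appends n bytes at *off; returns 0 on success, -1 if they do not fit. */
static int put_bytes(uint8_t *buf, size_t cap, size_t *off,
                     const void *src, size_t n)
{
    if (n > cap - *off)
        return -1;
    memcpy(buf + *off, src, n);
    *off += n;
    return 0;
}

/* Appends a string including its terminating null byte. */
static int put_str(uint8_t *buf, size_t cap, size_t *off, const char *s)
{
    return put_bytes(buf, cap, off, s, strlen(s) + 1);
}

/* Writes the fields that follow the header, in the order of Table B-5.
 * Returns the number of bytes written, or 0 if the buffer is too small
 * or there are more arguments than the 1-byte count can express. */
size_t ldb_build_load_body(uint8_t *buf, size_t cap,
                           const char *client_user, const char *server_user,
                           const char *program,
                           const char *const args[], unsigned nargs,
                           const char *in_file, const char *out_file,
                           const char *err_file)
{
    size_t off = 0;
    uint8_t count;
    unsigned i;

    assert(buf != NULL);
    if (nargs > 255)
        return 0;
    count = (uint8_t)nargs;

    if (put_str(buf, cap, &off, client_user) ||
        put_str(buf, cap, &off, server_user) ||
        put_str(buf, cap, &off, program) ||
        put_bytes(buf, cap, &off, &count, 1))
        return 0;
    for (i = 0; i < nargs; i++)
        if (put_str(buf, cap, &off, args[i]))
            return 0;
    if (put_str(buf, cap, &off, in_file) ||     /* "" means no redirection */
        put_str(buf, cap, &off, out_file) ||
        put_str(buf, cap, &off, err_file))
        return 0;
    return off;
}
```

Passing an empty string for a redirection file produces the single 0 byte that the table describes as "no redirection".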
The fields of a Load Process response are:
The command code of a load response is 0x8001.
A connect request message requests that the server should start a debugger session by connecting to an existing process on the host. The fields of a connect request are:
The command code of a connect request is 2.
The format of a connect response is identical to that of a load response. Possible failure reasons are:
The command code of a connect response is 0x8002.
A connect insist request message requests that the server should take over debugging a process to which there may already be a server connected. The formats of connect insist requests and responses differ from those of connect requests and responses only in the command codes. A Connect to Process Insist request has a command code of 3 and its response has a command code of 0x8003. A server should only return the Already Connected failure reason if it could not terminate the old debugger session.
The Probe Process request asks the server what the state of the debuggee is. It contains no fields other than the standard header. Its command code is 0x81.
The Probe Process response returns the state of the debuggee. Following the standard header (at byte 16) it contains a 1-byte integer field giving the state of the debuggee. Possible values are:
Possible failure reasons are:
The command code of Probe Process response is 0x8081.
The Disconnect from Process request asks the server to disconnect from both the current debuggee and the client. It will only succeed if the server can disconnect from the debuggee without killing it, or if the debuggee is already dead.
The effect on breakpoints of disconnecting from a process may vary between servers. In particular, the protocol does not define whether disconnecting from a stopped process will allow it to run on, or what happens if the process reaches a breakpoint after the server has disconnected from it.
The request contains no fields other than the message header. Its command code is 0x82.
The Disconnect from Process response contains no fields other than the message header. Possible failure reasons are:
If the server cannot disconnect from the debuggee it will remain connected to the client and to the debuggee. The command code of Disconnect from Process response is 0x8082.
The Stop Process request asks the server to stop a running debuggee as soon as possible. It contains no fields other than the message header. Its command code is 0x83.
The Stop Process response contains no fields other than the message header. Possible failure reasons are:
The command code of Stop Process response is 0x8083.
The Kill Process request asks the server to kill the current debuggee and disconnect from the client. It will only succeed if the server can kill the debuggee, or if the debuggee is already dead. The request contains no fields other than the message header. Its command code is 0x84.
The Kill Process response contains no fields other than the message header. Possible failure reasons are:
If the server cannot kill the debuggee it will remain connected to the client and to the debuggee. The command code of Kill Process response is 0x8084.
The Continue Process request asks the server to make the debuggee run on until it hits a breakpoint, terminates, is stopped by the server acting on a Stop Process request, or stops for some other reason (e.g., executing a trap or exception). It contains no fields other than the message header. Its command code is 0xA1.
The Continue Process response contains no fields other than the message header. Possible failure reasons are:
If the debuggee has terminated, the request will succeed, but its effect is undefined. The command code of a Continue Process response is 0x80A1.
The Step request asks the server to make the debuggee execute one instruction. It contains no fields other than the message header. Its command code is 0xA2.
The Step response contains no fields other than the message header. Possible failure reasons are:
If the debuggee has terminated, the request may succeed, but its effect is undefined. The command code of a Step response is 0x80A2.
The Set Breakpoint request asks the server to set a breakpoint in the code of the debuggee. A server is not required to be able to set a breakpoint on any particular address, but to be useful it must be able to set breakpoints on a significant portion of the instructions of the debuggee.
The effect of setting a breakpoint on anything other than an instruction of the debuggee is not defined. Furthermore, the effect of setting a breakpoint on an instruction that the debuggee modifies or reads as data is not defined.
The fields of a Set Breakpoint request are:
The command code of a Set Breakpoint request is 0xA3.
The Set Breakpoint response contains no fields other than the message header. Possible failure reasons are:
The command code of a Set Breakpoint response is 0x80A3.
The Clear Breakpoint request asks the server to remove a breakpoint from the debuggee. Its fields are:
The command code of a Clear Breakpoint request is 0xA4.
The Clear Breakpoint response contains no fields other than the message header. Possible failure reasons are:
The command code of a Clear Breakpoint response is 0x80A4.
Using Get Next Breakpoint requests the client can get a complete list of the breakpoints known to the server that affect the current debuggee. In some servers this will include breakpoints set in the debuggee by previous remote debugger sessions or through an alternative interface. For example, in the evaluation board server it includes breakpoints set by previous debugger sessions and those set through the local debugger interface.
To get a complete list of breakpoints the client should start by sending a Get Next Breakpoint with a breakpoint address of zero. It should then send further Get Next Breakpoint requests each containing the address returned by the previous Get Next Breakpoint response. A server that receives this sequence of requests with no other requests intervening must return each breakpoint it knows about precisely once. The protocol does not define the order in which the server will return the breakpoints it knows about.
The fields of a Get Next Breakpoint request are:
The command code of a Get Next Breakpoint request is 0xA5.
The fields of a Get Next Breakpoint response are:
Possible failure reasons are:
The command code of a Get Next Breakpoint response is 0x80A5.
The Get Registers request asks the server to send the client the contents of all the debuggee's registers and pseudo registers. It contains no fields other than the message header. Its command code is 0xA6.
The fields of the Get Registers response are:
Possible failure reasons are:
The command code of a Get Registers response is 0x80A6.
The Set Registers request asks the server to set all the debuggee's registers and pseudo registers, including the debuggee's program counter. The request succeeds even if the server is unable to change the values of some of the registers. Its fields are:
The command code of a Set Registers request is 0xA7.
A Set Registers response contains no fields other than the message header. Possible failure reasons are:
The command code of a Set Registers response is 0x80A7.
A Read request asks the server to read a portion of the debuggee's memory. Its fields are, in order:
Its command code is 0xA8.
The fields of a Read response are, in order:
Possible failure reasons are:
The command code of a Read response is 0x80A8.
A Write request asks the server to overwrite a portion of the debuggee's memory. Its fields are, in order:
Its command code is 0xA9.
A Write response contains no fields other than the message header. Possible failure reasons are:
The command code of a Write response is 0x80A9.
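The request command codes given in this section can be collected into one enumeration. The response codes listed alongside them are the request codes with the 0x8000 bit set, so the sketch below derives a response code from its request rather than listing both; that this pattern holds for every message is an assumption drawn from the listed values. The identifier names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum ldb_request {
    LDB_LOAD                = 0x01,
    LDB_CONNECT             = 0x02,
    LDB_CONNECT_INSIST      = 0x03,
    LDB_PROBE               = 0x81,
    LDB_DISCONNECT          = 0x82,
    LDB_STOP                = 0x83,
    LDB_KILL                = 0x84,
    LDB_CONTINUE            = 0xA1,
    LDB_STEP                = 0xA2,
    LDB_SET_BREAKPOINT      = 0xA3,
    LDB_CLEAR_BREAKPOINT    = 0xA4,
    LDB_GET_NEXT_BREAKPOINT = 0xA5,
    LDB_GET_REGISTERS       = 0xA6,
    LDB_SET_REGISTERS       = 0xA7,
    LDB_READ                = 0xA8,
    LDB_WRITE               = 0xA9
};

#define LDB_RESPONSE_BIT 0x8000u

/* Returns the command code of the response to the given request. */
uint16_t ldb_response_code(uint16_t request)
{
    assert((request & LDB_RESPONSE_BIT) == 0);   /* must be a request */
    return (uint16_t)(request | LDB_RESPONSE_BIT);
}

/* Returns nonzero if the command code identifies a response. */
int ldb_is_response(uint16_t code)
{
    return (code & LDB_RESPONSE_BIT) != 0;
}
```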
A server can be modeled as a single control thread plus a debuggee thread for each debuggee. Each thread is uniquely identified by its UDP port number. Once a client has sent a message to a thread, it cannot send further messages to that thread (other than copies of the original message) until it receives a response.
A server (control or debuggee) thread can only send responses to the requests it receives. Server threads are expected to respond promptly to all requests they receive. A server thread must never send more than one response to each message it receives. If it receives a duplicate request, it must send a copy of its original response, with an updated retransmission count, without acting a second time on the request.
The only messages a client can send to a control thread are Connect to Process, Connect to Process Insist, and Load Process requests. A positive response to any of these requests identifies a new debuggee thread that the client should use for debugging the new debuggee.
A debuggee thread can be in either running or stopped state. Initially a debuggee thread is in running state. In running state it will accept the following requests:
The client must not send any other requests to a debuggee thread when it is in running state. A debuggee thread will enter stopped state from running state whenever the debuggee stops. The client can discover the state of a debuggee thread by sending it a probe request. Any debuggee state other than running indicates that the debuggee thread is in stopped state. When it is in stopped state, the client can send the debuggee thread any request except Connect to Process, Connect to Process Insist, or Load Process.
A debuggee thread will return from stopped state to running state when it receives a Continue or Step request that it can act upon.
A debuggee thread will exit immediately after sending a positive Disconnect from Process or Kill Process response. It can also exit at any other time, either through some external cause, or as a result of some other client taking over the debuggee using a Connect to Process Insist request. Once a debuggee thread has exited the server will either ignore requests sent to it or send responses with not connected failure codes.
The standard packet header contains two fields that are used to recover from packet loss:
The sequence number is used to distinguish between different messages. The retransmission count is used to distinguish between copies of the same message. A client should give every message it sends to a particular server thread a different sequence number.
To avoid confusion between different clients started by the same user on the same host, it should also attempt to give load and connect requests sequence numbers that will not be used by other clients. One way to do this would be to base the sequence number upon the time at which the message is sent.
The first time the client sends a message, it gives it a retransmission count of 0. If it does not receive a response to a request within a reasonable time, it increments the retransmission count and repeats the message. If, after a number of attempts, it has still not received a response with the same sequence number and retransmission count as the last message it sent, it assumes that the server thread has exited or that the communications link has failed.
On receiving a duplicate packet, the server should copy the new retransmission count into its original response and send this updated response to the client.
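The server side of this recovery scheme can be sketched as follows, assuming header-only messages and a server thread that remembers just its most recent response. The structure, the function names, and the placeholder act_on_request() are hypothetical; a real server would carry out the command where the placeholder merely counts the call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_SIZE 16

struct debuggee_thread {
    int      have_last;                 /* nonzero once a response is stored */
    uint8_t  last_response[HDR_SIZE];   /* copy of the most recent response */
    unsigned times_acted;               /* how many requests were acted upon */
};

/* Sequence number: header bytes 4..7, most significant byte first. */
static uint32_t seq_of(const uint8_t *msg)
{
    return ((uint32_t)msg[4] << 24) | ((uint32_t)msg[5] << 16) |
           ((uint32_t)msg[6] << 8)  |  (uint32_t)msg[7];
}

/* Placeholder for real command processing: builds a successful,
 * header-only response to the request. */
static void act_on_request(struct debuggee_thread *t,
                           const uint8_t *req, uint8_t *resp)
{
    t->times_acted++;
    memcpy(resp, req, HDR_SIZE);
    resp[2] |= 0x80;                    /* response code: top bit set */
    memset(resp + 12, 0, 4);            /* return value 0 (OK) */
}

void handle_request(struct debuggee_thread *t,
                    const uint8_t req[HDR_SIZE], uint8_t resp[HDR_SIZE])
{
    assert(t != NULL && req != NULL && resp != NULL);

    if (t->have_last && seq_of(req) == seq_of(t->last_response)) {
        /* Duplicate: resend the stored response with the new
         * retransmission count, without acting again. */
        memcpy(resp, t->last_response, HDR_SIZE);
        resp[1] = req[1];
        return;
    }
    act_on_request(t, req, resp);
    memcpy(t->last_response, resp, HDR_SIZE);
    t->have_last = 1;
}
```

Remembering a single response is enough here because a client may not send a new message to a thread until the previous one has been answered.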
The server cannot normally detect communication failure and will wait indefinitely for messages from a client.
The Ladebug Remote Debugger Protocol uses UDP running over an IP network layer as its transport. The client can use any UDP socket as its source but will always send load and connect requests to UDP socket 410. If the request is successful, the response will have as its source socket the socket allocated to the new debuggee thread. The client will send all messages for this debuggee thread to this socket.
The server should always send responses to the source socket of the associated request. The source socket for any message sent by the server should be either 410 (for responses to rejected load and connect messages) or the socket allocated to the associated debuggee thread (for all other messages).