

## USING THE IDT79R3051<sup>TM</sup> WITH THE HP16500 LOGIC ANALYZER

#### by Andrew Ng

## INTRODUCTION

The IDT79R3051<sup>™</sup> RISController<sup>™</sup> is a highly integrated, high-performance CPU, compatible with the MIPS™ R3000™ instruction set, that minimizes system cost and power consumption across a wide variety of embedded applications. The R3051 includes 4kB to 8kB of instruction cache, 2kB of data cache, 4-deep read and write buffers, on-chip DMA arbitration, a simple external bus interface, and the core R3000A execution engine, all in a single-chip 84-pin package. However, in today's marketplace, the technical features of a microprocessor are not enough to guarantee a successful product. A new CPU such as the R3051 must also have a large base of software applications and, just as importantly, adequate hardware and software development and debug tools. The R3051 family already has both, because of its R3000A instruction set compatibility and its widespread market acceptance. The use of just one of these tools, the IDT7RS364 Disassembler for the HP16500 Logic Analyzer, is explained here.

# THE IDT7RS364 DISASSEMBLER AND THE HP16500 LOGIC ANALYZER

The IDT7RS364 Disassembler for the HP16500 Logic Analyzer is a useful tool meant to ease the task of debugging software run on R3000-based Target System Boards. Logic analyzers are inexpensive, general purpose debug tools which do not have the power of in-circuit emulators to actively control and simulate target system CPU and memory behavior. However, logic analyzers do provide a useful subset of in-circuit emulator debug capabilities by allowing an engineer to observe and analyze the digital circuit behavior of the target system.

The IDT7RS364 Disassembler is a software package that, when loaded into the HP16500, pre-processes and formats the state trace listings of the Logic Analyzer. As shown in Figure 1, the HP16500 alone captures the CPU's executed machine opcodes in hex or binary, in a typical Logic Analyzer State Trace Listing format. The user can set multilevel trace traps to capture the area of interest. As shown in Figure 2, with the addition of the IDT7RS364 Disassembler, the hex machine opcodes are automatically decoded and displayed as R3000 assembly-level mnemonics. This greatly improves the readability and usefulness of the Logic Analyzer's state trace listing.

| State/Timing E   Listing 1   Invasm   Print   Run     Markers   Off |          |          |        |             |  |
|---------------------------------------------------------------------|----------|----------|--------|-------------|--|
| Label>                                                              | ADDR     | DATA     | STAT   | Time        |  |
| Base>                                                               | Hex      | Hex      | Hex    | Absolute    |  |
| -6                                                                  | 1FC00000 | 0BF00088 | 0010   | 0 s         |  |
| -5                                                                  | 1FC00004 | 00000000 | 0010   | 760 ns      |  |
| -4                                                                  | 1FC00220 | 3C020010 | 0010   | 1.52 us     |  |
| -3                                                                  | 1FC00224 | 40826000 | 0010   | 2.24 us     |  |
| -2                                                                  | 1FC00228 | 40806800 | 0010   | 3.00 us     |  |
| -1                                                                  | 1FC0022C | 3C02A000 | 0010   | 3.76 us     |  |
| 0                                                                   | 1FC00230 | 3C08AAAA | 0010   | 4.52 us     |  |
| 1                                                                   | 1FC00234 | 35085555 | 0010   | 5.24 us     |  |
| 2                                                                   | 1FC00238 | AC480000 | 0010   | 6.00 us     |  |
| 3                                                                   | 1FC0023C | AC400004 | 0010   | 6.76 us     |  |
| 4                                                                   | 00000000 | AAAA5555 | 0000   | 7.40 us     |  |
| 5                                                                   | 1FC00240 | 8C490000 | 0010   | 7.88 us     |  |
| 6                                                                   | 00000004 | 00000000 | 0000   | 8.52 us     |  |
| 7                                                                   | 1FC00244 | 00000000 | 0010   | 9.00 us     |  |
| 8                                                                   | 00000000 | AAAA5555 | 0010   | 9.64 us     |  |
| 9                                                                   | 1FC00248 | 11280003 | 0010   | 10.32 us    |  |
|                                                                     |          |          |        | 2883 drw 01 |  |

Figure 1. R3051 Address/Data Trace List on a Logic Analyzer

| State/Ti |          | asm Print             | Run    |             |
|----------|----------|-----------------------|--------|-------------|
| Off      |          |                       |        |             |
| Label>   | ADDR     | R3000 Mnemonic        | STAT   | Time        |
| Base>    | Hex      | hex                   | Hex    | Absolute    |
| -6       | 1FC00000 | J 0x1FC00220          | 0010   | 0 s         |
| -5       | 1FC00004 | NOP                   | 0010   | 760 ns      |
| -4       | 1FC00220 | LUI v0,0x0010         | 0010   | 1.52 us     |
| -3       | 1FC00224 | MTC0 v0,\$12          | 0010   | 2.24 us     |
| -2       | 1FC00228 | MTC0 zero,\$13        | 0010   | 3.00 us     |
| -1       | 1FC0022C | LUI v0,0xA000         | 0010   | 3.76 us     |
| 0        | 1FC00230 | LUI t0,0xAAAA         | 0010   | 4.52 us     |
| 1        | 1FC00234 | ORI t0,t0,0x5555      | 0010   | 5.24 us     |
| 2        | 1FC00238 | SW t0,0x0000(v0)      | 0010   | 6.00 us     |
| 3        | 1FC0023C | SW zero,0x0004(v0)    | 0010   | 6.76 us     |
| 4        | 00000000 | STORE DATA 0xAAAA5555 | 0000   | 7.40 us     |
| 5        | 1FC00240 | LW t1,0x0000(v0)      | 0010   | 7.88 us     |
| 6        | 00000004 | STORE DATA 0x00000000 | 0000   | 8.52 us     |
| 7        | 1FC00244 | NOP                   | 0010   | 9.00 us     |
| 8        | 00000000 | load data 0xaaaa5555  | 0010   | 9.64 us     |
| 9        | 1FC00248 | B 0x1FC00258          | 0010   | 10.32 us    |
|          |          |                       |        | 2883 drw 02 |

Figure 2. R3051 Instruction Disassembly on the HP16500 Logic Analyzer
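The step from the DATA column of Figure 1 to the mnemonic column of Figure 2 is ordinary R3000 opcode field decoding. The following is a minimal sketch of that idea, not the IDT7RS364 implementation: the function name `disassemble` and the register table `REG` are hypothetical, only the handful of opcodes visible in the figures are handled, and load/store offsets are printed as unsigned hex to match the figure.

```python
# Conventional MIPS integer register names, indexed by register number.
REG = ["zero", "at", "v0", "v1", "a0", "a1", "a2", "a3",
       "t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7",
       "s0", "s1", "s2", "s3", "s4", "s5", "s6", "s7",
       "t8", "t9", "k0", "k1", "gp", "sp", "fp", "ra"]


def disassemble(addr: int, word: int) -> str:
    """Decode one 32-bit opcode fetched from addr (illustrative subset only)."""
    if word == 0:
        return "NOP"
    op = (word >> 26) & 0x3F   # bits 31:26, major opcode
    rs = (word >> 21) & 0x1F   # bits 25:21, source/base register
    rt = (word >> 16) & 0x1F   # bits 20:16, target register
    imm = word & 0xFFFF        # bits 15:0, immediate/offset
    if op == 0x02:
        # J: 26-bit word index combined with the upper 4 bits of PC+4.
        target = ((addr + 4) & 0xF0000000) | ((word & 0x03FFFFFF) << 2)
        return f"J 0x{target:08X}"
    if op == 0x0F:
        return f"LUI {REG[rt]},0x{imm:04X}"
    if op == 0x0D:
        return f"ORI {REG[rt]},{REG[rs]},0x{imm:04X}"
    if op == 0x23:
        return f"LW {REG[rt]},0x{imm:04X}({REG[rs]})"
    if op == 0x2B:
        return f"SW {REG[rt]},0x{imm:04X}({REG[rs]})"
    # Anything outside this small subset is left as raw hex.
    return f".word 0x{word:08X}"
```

Feeding it the address/data pairs from Figure 1, for example `disassemble(0x1FC00230, 0x3C08AAAA)`, reproduces the corresponding Figure 2 entries for the opcodes it covers.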



Figure 3. Typical R3051 System

### Connecting the R3051 to the HP16500 Pod Sets

Before the Disassembler can be used, the correct connections between the R3051 and the HP16500 must be made. The Disassembler requires five 16-channel probe pod sets. The Disassembler expects that the Pod Probe connections follow its interface protocol so that the pre-processing can correctly interpret the address, data, and status lines. The Disassembler typically uses 32 Address lines, 32 Data lines, a Read line, and a Write line.

In the typical R3051 system as shown in Figure 3, the R3051's  $\overline{Rd}$  output is used as the read line and the R3051's  $\overline{Wr}$  output is used as the write line. The Disassembler uses the read and write signals as clocks to strobe the address and data into the Logic Analyzer. Since the top speed of state traces on the HP16500 is 35 MHz and the fastest possible memory cycle is 2 clocks, the read and write strobes arrive at no more than half the CPU clock rate. The Disassembler can therefore easily support 40 MHz R3051 CPUs and has a theoretical limit of 70 MHz.
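The speed limit quoted above is simple arithmetic on the two figures given in the text. A small sketch, with hypothetical constant and function names:

```python
STATE_MODE_MAX_MHZ = 35.0        # HP16500 state-trace limit quoted in the text
MIN_CLOCKS_PER_MEMORY_CYCLE = 2  # fastest possible R3051 memory cycle


def max_cpu_mhz(state_max_mhz: float = STATE_MODE_MAX_MHZ,
                clocks_per_cycle: int = MIN_CLOCKS_PER_MEMORY_CYCLE) -> float:
    """Highest CPU clock whose back-to-back memory cycles stay within the state-trace limit."""
    return state_max_mhz * clocks_per_cycle


def worst_case_strobe_mhz(cpu_mhz: float,
                          clocks_per_cycle: int = MIN_CLOCKS_PER_MEMORY_CYCLE) -> float:
    """Fastest rate at which Rd/Wr strobes can arrive for a given CPU clock."""
    return cpu_mhz / clocks_per_cycle
```

With these numbers a 40 MHz CPU strobes the analyzer at no more than 20 MHz, well inside the 35 MHz limit, and the limit itself is reached at 70 MHz.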

The Address lines can be gathered from the Address Latch outputs and Addr(3:2). Not all 32 address lines need to be attached: if desired, the user can format the address label so that the MSB channel probes do not show up in the state trace listing. In such a case, the user can use the extra channel probes for other purposes.

In general, Data lines can be gathered from the A/D bus. Some systems, with only one set of Data Transceivers, can gather the data from the memory side of the Data Transceivers in order to reduce A/D bus loading. The R3051 connections to the five HP16500 Channel Probe Pod sets are listed in Table 1.

The Disassembler has three status lines, Write, AccTyp(2) and AccTyp(0). The R3051's  $\overline{Wr}$  output can be used as the write line so that the Disassembler can distinguish between a read and a write cycle. AccTyp(2) and AccTyp(0) are optional connections for cached code and in general should be grounded or at least left unconnected. The optional use of AccTyp(2) and AccTyp(0) will be explained in more detail in the Cached Code/Data section. The 16-channel status pod has 13 unused channels that can be used to display other signals, e.g., the Byte Enables.

| Pod 5<br>chan | Pod 5<br>sig | Pod 4<br>chan | Pod 4<br>sig | Pod 3<br>chan | Pod 3<br>sig | Pod 2<br>chan | Pod 2<br>sig | Pod 1<br>chan | Pod 1<br>sig |
|---------------|--------------|---------------|--------------|---------------|--------------|---------------|--------------|---------------|--------------|
| 15            | X            | 15            | A/D(31)      | 15            | A/D(15)      | 15            | A(31)        | 15            | A(15)        |
| 14            | X            | 14            | A/D(30)      | 14            | A/D(14)      | 14            | A(30)        | 14            | A(14)        |
| 13            | X            | 13            | A/D(29)      | 13            | A/D(13)      | 13            | A(29)        | 13            | A(13)        |
| 12            | Gnd          | 12            | A/D(28)      | 12            | A/D(12)      | 12            | A(28)        | 12            | A(12)        |
| 11            | X            | 11            | A/D(27)      | 11            | A/D(11)      | 11            | A(27)        | 11            | A(11)        |
| 10            | Note 2       | 10            | A/D(26)      | 10            | A/D(10)      | 10            | A(26)        | 10            | A(10)        |
| 9             | X            | 9             | A/D(25)      | 9             | A/D(9)       | 9             | A(25)        | 9             | A(9)         |
| 8             | X            | 8             | A/D(24)      | 8             | A/D(8)       | 8             | A(24)        | 8             | A(8)         |
| 7             | X            | 7             | A/D(23)      | 7             | A/D(7)       | 7             | A(23)        | 7             | A(7)         |
| 6             | X            | 6             | A/D(22)      | 6             | A/D(6)       | 6             | A(22)        | 6             | A(6)         |
| 5             | X            | 5             | A/D(21)      | 5             | A/D(5)       | 5             | A(21)        | 5             | A(5)         |
| 4             | Wr           | 4             | A/D(20)      | 4             | A/D(4)       | 4             | A(20)        | 4             | A(4)         |
| 3             | X            | 3             | A/D(19)      | 3             | A/D(3)       | 3             | A(19)        | 3             | Addr(3)      |
| 2             | X            | 2             | A/D(18)      | 2             | A/D(2)       | 2             | A(18)        | 2             | Addr(2)      |
| 1             | X            | 1             | A/D(17)      | 1             | A/D(1)       | 1             | A(17)        | 1             | Gnd          |
| 0             | X            | 0             | A/D(16)      | 0             | A/D(0)       | 0             | A(16)        | 0             | Gnd          |
| NClk          |              | MClk          | Rd           | LClk          |              | KClk          |              | JClk          | Wr           |

2883 tbl 01

#### Table 1. R3051 Default Pod Connections on the HP16500 Logic Analyzer

#### NOTES:

1. Master Clock Format: J↑ + M↑

2. POD5(12) is AccTyp(2) and POD5(10) is AccTyp(0). If AccTyp(2) is grounded then AccTyp(0) is not used by the Disassembler and can be used for other purposes. See text for further explanation.

3. A(31:4) are connected to the Address Latch outputs. The rest of the signals are connected to R3051 outputs. X's denote unused probes that can be assigned by the user.

To a limited extent, the default ordering of the channel probe connections can be changed by the user. Within each of the address, data, and status bus labels, the bits must still run from MSB to LSB in the same order as the Pod Numbers and Channel Numbers. An example of reformatting the Pod interface is shown in Table 2 and Figure 4. The example also demonstrates the use of the HP16500's demultiplexed clock feature. When using the demultiplexed clock, the address and data lines can use the same probes. This allows both the address and data to be taken from the multiplexed A/D(31:0) bus. The address is slave-clocked with ALE and the data is master-clocked with  $\overline{Wr}$  or  $\overline{Rd}$ . When using two clocks, only the 8 LSB probes on each pod can be used, since the channels are internally multiplexed by the HP16500. Demultiplexed clocking is limited by a 50 nsec master-to-slave clock recovery time, which restricts its use to 25 MHz CPU systems.

The HP16500 allows an extensive number of multi-level traps and triggers so that the code trace for the area of interest can be found. Care should be taken when setting up trigger conditions. Sometimes when in the trace/trigger menu, the Disassembler format in the data field trigger condition can conceal a trap condition. Changing the Disassembler format temporarily to hex format while in the trigger menu can prevent such confusion.

| Pod 5<br>chan | Pod 5<br>sig | Pod 4<br>chan | Pod 4<br>sig | Pod 3<br>chan | Pod 3<br>sig | Pod 2<br>chan | Pod 2<br>sig | Pod 1<br>chan | Pod 1<br>sig |
|---------------|--------------|---------------|--------------|---------------|--------------|---------------|--------------|---------------|--------------|
| 15            |              | 15            |              | 15            |              | 15            |              | 15            | X            |
| 14            |              | 14            |              | 14            |              | 14            |              | 14            | X            |
| 13            |              | 13            |              | 13            |              | 13            |              | 13            | X            |
| 12            |              | 12            |              | 12            |              | 12            |              | 12            | Gnd          |
| 11            |              | 11            |              | 11            |              | 11            |              | 11            | X            |
| 10            |              | 10            |              | 10            |              | 10            |              | 10            | Note 3       |
| 9             |              | 9             |              | 9             |              | 9             |              | 9             | X            |
| 8             |              | 8             |              | 8             |              | 8             |              | 8             | X            |
| 7             | A/D(31)      | 7             | A/D(23)      | 7             | A/D(15)      | 7             | A/D(7)       | 7             | X            |
| 6             | A/D(30)      | 6             | A/D(22)      | 6             | A/D(14)      | 6             | A/D(6)       | 6             | X            |
| 5             | A/D(29)      | 5             | A/D(21)      | 5             | A/D(13)      | 5             | A/D(5)       | 5             | X            |
| 4             | A/D(28)      | 4             | A/D(20)      | 4             | A/D(12)      | 4             | A/D(4)       | 4             | Wr           |
| 3             | A/D(27)      | 3             | A/D(19)      | 3             | A/D(11)      | 3             | A/D(3)       | 3             | Addr(3)      |
| 2             | A/D(26)      | 2             | A/D(18)      | 2             | A/D(10)      | 2             | A/D(2)       | 2             | Addr(2)      |
| 1             | A/D(25)      | 1             | A/D(17)      | 1             | A/D(9)       | 1             | A/D(1)       | 1             | Gnd          |
| 0             | A/D(24)      | 0             | A/D(16)      | 0             | A/D(8)       | 0             | A/D(0)       | 0             | Gnd          |
| NClk          |              | MClk          | Rd           | LClk          |              | KClk          | ALE          | JClk          | Wr           |

#### Table 2. Example of Reformatted Pod Connections

#### NOTES:

1. Master Clock Format: J↑+M↑

2. Slave Clock Format: K↓

3. POD5(12) is AccTyp(2) and POD5(10) is AccTyp(0). If AccTyp(2) is grounded then AccTyp(0) is not used by the Disassembler and can be used for other purposes. See text for further explanation.

4. On Master/Slave Pods, only the 8 LSB probes are actually connected. E.g., A/D(23:16) is connected to Pod4(7:0).

5. X's denote unused probes that can be assigned by the user.

2883 drw 04

| State/ | Timing Form            | nat                        |                         |         |                                 |
|--------|------------------------|----------------------------|-------------------------|---------|---------------------------------|
|        | Master Clock<br>J↑+ M↑ | Slave Clock K $\downarrow$ |                         |         |                                 |
| Pods   |                        |                            | Pod 3<br>Master   Slave |         | Pod 1<br>Clock                  |
| Label  | 7 07 0                 | Master   Slave<br>7 07 0   | 7 07 0                  | 7 07 0  | 7 07 0                          |
| ADDR   | * * * * * * * *        |                            | •••••                   |         | * * * *                         |
| DATA   | *******                | *******                    | *******                 | ******* |                                 |
| STAT   |                        |                            |                         |         | * * * * * * * * * * * * * * * * |

Figure 4. Example of Reformatted Pod Format
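With the Table 2 connections, each 32-bit A/D sample is spread across the 8 LSB channels of Pods 5 through 2, most significant byte first. The sketch below shows how such a sample could be reassembled offline; the function name `assemble_ad_word` is hypothetical, and it assumes each pod value is delivered as an integer whose low 8 bits are channels 7:0.

```python
def assemble_ad_word(pod5: int, pod4: int, pod3: int, pod2: int) -> int:
    """Combine the 8 LSB channels of Pods 5..2 into A/D(31:0), MSB pod first."""
    word = 0
    for pod in (pod5, pod4, pod3, pod2):
        # Channels 15:8 are not connected on master/slave pods, so mask them off.
        word = (word << 8) | (pod & 0xFF)
    return word
```

Because the address and data share the same probes, the same function would apply both to the slave-clocked sample (the address, strobed by ALE) and to the master-clocked sample (the data, strobed by Rd or Wr).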

## When Running with Cached Code/Data

All Logic Analyzers and Disassemblers can only capture external CPU memory accesses. Since the R3051 is capable of running code and accessing data in its internal caches, such accesses are not seen by the external memory system. Thus in order for the Disassembler to accurately reflect the complete instruction/data flow, the R3051 must be run uncached.

As the target system becomes more and more functional, it becomes necessary to begin running cached code and data. Running cached code/data will affect the Disassembler's accuracy in the following ways:

Cached Instructions -

- 1. Instruction fetch i-cache hits are not seen.
- 2. Only the last word of a cachable 4-word burst instruction i-cache miss will be seen.

Cached Data Loads -

- 1. Data load d-cache hits are not seen.
- 2. Only the last word of a cachable 4-word data block refill d-cache miss will be seen.
- 3. If the load instruction was an i-cache hit (not seen), then the associated data fetch, if seen, will be listed as an instruction. The data fetch is assumed to be the second (due to pipelining) read cycle after the load instruction.

Cached Data Stores -

- 1. Data stores are handled correctly, since the R3051 maintains a write-through cache policy which ALWAYS updates main memory as well as the d-cache.
- 2. Because the R3051 has a 4-word deep write buffer, a data store may or may not occur on the second (due to pipelining) memory cycle following its instruction fetch. Multiple stores are always handled in the proper FIFO order, but each store may be interspersed with later instruction fetches.
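Because stores leave the write buffer in FIFO order, a captured trace can be post-processed to match each store instruction with its write cycle even when instruction fetches are interleaved. The following is a rough sketch under several assumptions: the trace is a list of `(state, addr, data, stat)` tuples, bit 4 of STAT is the active-low Wr line (as in the Table 1 default connections and the STAT column of Figure 1), the code runs uncached so every store instruction fetch is visible, and only word stores (SW) are recognized. The names `pair_stores` and `WR_N_MASK` are hypothetical.

```python
from collections import deque

WR_N_MASK = 0x0010   # STAT bit for Pod 5 channel 4 (Wr, active low)
SW_OPCODE = 0x2B     # R3000 major opcode for SW


def pair_stores(trace):
    """Return (store_instruction_state, write_cycle_state) pairs in FIFO order."""
    pending = deque()   # states of SW fetches whose write cycle has not been seen yet
    pairs = []
    for state, addr, data, stat in trace:
        if (stat & WR_N_MASK) == 0:
            # Wr is low: this entry is a write cycle; match it to the oldest pending store.
            if pending:
                pairs.append((pending.popleft(), state))
        elif (data >> 26) & 0x3F == SW_OPCODE:
            # Read cycle whose data decodes as SW: treat it as a store instruction fetch.
            # Load data that happens to look like SW would be misclassified here.
            pending.append(state)
    return pairs
```

Applied to states 2 through 8 of Figure 1, this pairs the store at state 2 with the write cycle at state 4 and the store at state 3 with the write cycle at state 6, even though the LW fetch at state 5 falls between them.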

Other than running the software uncached, the following less intrusive methods may be used to help interpret cached code/data:

1. Use the R3051's testability mode to invoke the Force I-Cache Miss Mode. This will put all instruction fetches onto the external main memory interface so that the logic analyzer can see all of them. However, forced i-cache misses may or may not be 4-word burst reads.

In general, 4-word burst reads can be displayed properly if a more complex read strobe is formatted:

| Clock   | Strobe condition                             |
|---------|----------------------------------------------|
| J clock | $\overline{Ack}$ == LOW                      |
| M clock | $\overline{RdCEn}$ == LOW                    |
| N clock | $\overline{SysClk}$ == positive edge triggered |

The HP16500 OR's level conditions together, OR's edge conditions together and AND's level conditions with edge conditions. Thus the above strobe clocks the state when:

 $(\overline{\text{SysClk}} == \uparrow) \text{ AND } [(\overline{\text{Ack}} == 0) \text{ OR } (\overline{\text{RdCEn}} == 0)]$ 

This example clock setup is only applicable to systems that either bring Ack low at the same time RdCEn is low on 4-word burst reads, or do not bring Ack low on 4-word burst reads at all. Also, a 1/2 clock margin on the memory read access time is necessary in this example. Thus, depending on the particular system design, variants of RdCEn, Ack, and SysClk can be combined or temporarily modified to create a 4-word read strobe and a write strobe.

2. Latch the R3051's Diag(1:0) outputs with ALE. On external main memory reads, if LatchedDiag(1) == 1 then the fetch is cachable and can be used as an indication that the state trace entry should be interpreted judiciously. When LatchedDiag(1) == 1, LatchedDiag(0) == 1 indicates a cachable instruction fetch and LatchedDiag(0) == 0 indicates a cachable data load.

LatchedDiag(1:0) are the R3051's equivalents of the R3000's AccTyp(2) and AccTyp(0). As such they can be connected to the Disassembler's AccTyp(2) and AccTyp(0) probes. This allows the Disassembler to differentiate between cached instructions and data so that they can be displayed properly. However, AccTyp(2) and Diag(1) are undefined for writes, e.g., when the write buffer is full or on partial word stores. So if the AccTyp(2) probe is used, in order for the Disassembler to interpret write cycles correctly, LatchedDiag(1) needs to be AND'ed with  $\overline{Wr}$  as shown in Figure 5, so that it is always low during write cycles.

3. Use the Reset Mode Vector to set the R3051 to use single word data refills instead of 4-word data block refills. This will allow all 4 words of a data load d-cache miss to be seen.
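The strobe condition of method 1 and the Diag decoding of method 2 are both small pieces of boolean logic, and can be modeled in a few lines when checking a setup on paper. This is a sketch only: the function names are hypothetical, all signals are passed as 0/1 levels with `_n` marking active-low lines, and a read with LatchedDiag(1) == 0 is assumed to be non-cachable, which the text implies but does not state outright.

```python
def strobe_fires(sysclk_rising: bool, ack_n: int, rdcen_n: int) -> bool:
    """Edge condition ANDed with the OR of the two level conditions (J, M, N clocks)."""
    return sysclk_rising and (ack_n == 0 or rdcen_n == 0)


def acctyp2_probe(latched_diag1: int, wr_n: int) -> int:
    """LatchedDiag(1) gated with active-low Wr, so the probe reads 0 during write cycles."""
    return latched_diag1 & wr_n


def classify_read(latched_diag1: int, latched_diag0: int) -> str:
    """Interpret LatchedDiag(1:0) for an external main memory read."""
    if latched_diag1 == 0:
        return "non-cachable read"   # assumption: Diag(1) low means not cachable
    if latched_diag0 == 1:
        return "cachable instruction fetch"
    return "cachable data load"
```

During a write cycle `wr_n` is 0, so `acctyp2_probe` returns 0 regardless of the undefined Diag(1) value, which is the behavior the gating is meant to guarantee.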



2883 drw 05



| State/Timing E   Listing 1   Invasm   Print   Run     Markers   Off   Invasm   Invasm   Invasm   Invasm |          |      |      |        |        |             |
|---------------------------------------------------------------------------------------------------------|----------|------|------|--------|--------|-------------|
| Label>                                                                                                  | DATA     | ADDR | CLKN | BAWRRA | ALE    | WRNRDN      |
| Base>                                                                                                   | Hex      | Hex  | Hex  | Binary | Binary | Binary      |
| 274                                                                                                     | 8C490000 | 4    | 1    | 111110 | 0      | 11          |
| 275                                                                                                     | 8C490000 | 0    | 0    | 111110 | 0      | 11          |
| 276                                                                                                     | 00000000 | 4    | 1    | 110111 | 1      | 01          |
| 277                                                                                                     | 00000000 | 4    | 0    | 110110 | 0      | 01          |
| 278                                                                                                     | 00000000 | 4    | 1    | 110110 | 0      | 01          |
| 279                                                                                                     | 00000000 | 4    | 0    | 110110 | 0      | 01          |
| 280                                                                                                     | 00000000 | 4    | 1    | 110110 | 0      | 01          |
| 281                                                                                                     | 00000000 | 4    | 0    | 110110 | 0      | 01          |
| 282                                                                                                     | 00000000 | 4    | 1    | 110110 | 0      | 01          |
| 283                                                                                                     | 00000000 | 4    | 0    | 110110 | 0      | 01          |
| 284                                                                                                     | 00000000 | 4    | 1    | 110110 | 0      | 01          |
| 285                                                                                                     | 00000000 | 4    | 0    | 100110 | 0      | 01          |
| 286                                                                                                     | 00000000 | 4    | 1    | 100110 | 0      | 01          |
| 287                                                                                                     | 00000000 | 4    | 0    | 111110 | 0      | 11          |
| 288                                                                                                     | 1FC00240 | 4    | 1    | 111101 | 1      | 10          |
| 289                                                                                                     | 1FC00240 | 4    | 0    | 111100 | 0      | 10          |
|                                                                                                         |          |      |      |        |        | 2883 drw 06 |

Figure 6. R3051 State Trace Listing using Clk2xIn



Figure 7. Choosing a Clock Edge

## Using State Trace Listings and Timing Waveforms

The IDT7RS364 Disassembler is a good tool for easing the use of a Logic Analyzer when debugging a target system. Sometimes, however, even lower-level detail is needed to examine the clock-by-clock behavior of particular bus cycles. The HP16500 provides this in its State Analyzer mode by sampling with the CPU's system clock, as shown in Figure 6. Because the State Analyzer mode has a maximum speed of 35 MHz, certain restrictions apply. Because the R3051 uses both edges of its  $\overline{SysClk}$  output to generate control lines, it is preferable to use Clk2xIn or to clock on both edges of either  $\overline{SysClk}$  or its buffered/inverted version SysClk. On the HP16500, high speed clocks should always use the ground shield on the probe to reference the input properly, so that the probe does not sense signal overdrive. The edge of the reference clock should be chosen carefully so that it clocks just before ALE de-asserts, as shown in Figure 7. This allows the address to be seen along with the data on the multiplexed A/D bus, so that dedicated address line probes are not required. When choosing a clock, keep in mind that the HP16500 has a 10 nsec setup time and a 1 nsec hold time relative to the clock. In addition, the HP16500's Time Tagging feature, if used, is limited to 16.67 MHz.

Systems running with a Clk2xIn over 35 MHz (17.5 MHz CPU) can either clock the State Analyzer mode less frequently or use the Timing Analyzer mode. When clocking less frequently, care must be taken to choose a clock edge that adequately strobes ALE during its high period so that the address can be determined. Because the R3051 only has a 1/2 clock intercycle memory latency, Rd and Wr and other control lines may not be seen to de-assert between memory cycles when clocked at the SysClk frequency.

The HP16500 Logic Analyzer's Timing mode displays signals in waveform format as shown in Figure 8 and is capable of internally generating a 100 MHz (10 nsec) sample clock. To maintain all the functional timing relationships relative to Clk2xIn, the Timing mode allows asynchronous sampling up to 50 MHz CPU speed. The disadvantage of using the Timing mode is that the values of buses are hard to decipher when shown in waveform format. If necessary, the HP16500 can be set up in its mixed mode display to show both state and timing modes on the same screen.

| State/Timing      | B E Waveform 1 Print Run  |
|-------------------|---------------------------|
| Accumulate<br>Off | Sample period = 10.000 ns |
| s/Div<br>200 ns   | Delay<br>3.060 us Off     |
| CLKN              |                           |
| ALE               |                           |
| WRNRDN 0          |                           |
| ACKS 0            |                           |
| WRNRDN 1          |                           |
| ACKS 1            |                           |
| BAWRRA 5          |                           |
| A_D all           |                           |
|                   | 2883 drw 08               |

Figure 8. R3051 Timing Mode Waveform
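The frequency limits scattered through this section can be collected into one small check when planning a capture. The sketch below is only a restatement of the numbers in the text, with hypothetical names; it assumes Clk2xIn runs at twice the CPU clock (as the 35 MHz / 17.5 MHz pairing implies) and that the Time Tagging limit applies to the state clock rate.

```python
STATE_MODE_MAX_MHZ = 35.0    # State Analyzer mode limit
TIME_TAG_MAX_MHZ = 16.67     # Time Tagging limit
TIMING_MODE_MAX_CPU_MHZ = 50.0   # CPU speed limit quoted for 100 MHz timing-mode sampling


def capture_options(cpu_mhz: float) -> dict:
    """Report which capture setups stay within the quoted HP16500 limits."""
    clk2x_mhz = 2.0 * cpu_mhz   # assumption: Clk2xIn is twice the CPU clock
    return {
        "clk2x_mhz": clk2x_mhz,
        "state_mode_on_clk2xin": clk2x_mhz <= STATE_MODE_MAX_MHZ,
        "time_tagging_on_clk2xin": clk2x_mhz <= TIME_TAG_MAX_MHZ,
        "timing_mode": cpu_mhz <= TIMING_MODE_MAX_CPU_MHZ,
    }
```

For a 25 MHz CPU, for instance, the result indicates that state mode cannot be clocked directly from Clk2xIn, so either a slower state clock or the Timing mode would be needed.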

## SUMMARY

The use of the HP16500 and the IDT7RS364 Disassembler is but one example of the availability and compatibility of R3000 tools and software that can be used on the R3051. The Disassembler formats logic analyzer state traces into assembly level mnemonics to allow easier user interpretation. Similarly, other R3000 software, compilers, as well as other development tools such as the IDT7RS901 IDT/sim ROMable Kernel/Boot Monitor can also be used on R3051 systems with little or no modification.