3    Main Instruction Set

The assembler's instruction set consists of a main instruction set and a floating-point instruction set. This chapter describes the main instruction set; Chapter 4 describes the floating-point instruction set. For details on the instruction set beyond the scope of this manual, see the Alpha Architecture Reference Manual.

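The assembler's main instruction set contains several classes of instructions, including load and store instructions (Section 3.1), arithmetic instructions (Section 3.2), and logical and shift instructions (Section 3.3).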

Tables in this chapter show the format of each instruction in the main instruction set. The tables list the instruction names and the forms of operands that can be used with each instruction. The specifiers used in the tables to identify operands have the following meanings:

Operand Specifier Description
address A symbolic expression whose effective value is used as an address.
b_reg Base register. An integer register containing a base address to which is added an offset (or displacement) value to produce an effective address.
d_reg Destination register. An integer register that receives a value as a result of an operation.
d_reg/s_reg One integer register that is used as both a destination register and a source register.
label A label that identifies a location in a program.
no_operands No operands are specified.
offset An immediate value that is added to the contents of a base register to calculate an effective address.
palcode A value that determines the operation performed by a PALcode instruction.
s_reg, s_reg1, s_reg2 Source registers whose contents are to be used in an operation.
val_expr An expression whose value is used as an absolute value.
val_immed An immediate value that is to be used in an operation.
jhint An address operand that provides a hint of where a jmp or jsr instruction will transfer control.
rhint An immediate operand that provides software with a hint about how a ret or jsr_coroutine instruction is used.

3.1    Load and Store Instructions

Load and store instructions load immediate values and move data between memory and general registers. This section describes the general-purpose load and store instructions supported by the assembler.

Table 3-1 lists the mnemonics and operands for instructions that perform load and store operations. The table is divided into groups of instructions. The operands specified within a particular group apply to all of the instructions contained in that group.

Table 3-1:  Load and Store Formats

Instruction                         Mnemonic            Operands

Load Address                        lda [Footnote 1]    d_reg, address
Load Byte                           ldb
Load Byte Unsigned                  ldbu
Load Word                           ldw
Load Word Unsigned                  ldwu
Load Sign Extended Longword         ldl [Footnote 1]
Load Sign Extended Longword Locked  ldl_l [Footnote 1]
Load Quadword                       ldq [Footnote 1]
Load Quadword Locked                ldq_l [Footnote 1]
Load Quadword Unaligned             ldq_u [Footnote 1]
Unaligned Load Word                 uldw
Unaligned Load Word Unsigned        uldwu
Unaligned Load Longword             uldl
Unaligned Load Quadword             uldq

Load Address High                   ldah [Footnote 1]   d_reg, offset(b_reg)
Load Global Pointer                 ldgp

Load Immediate Longword             ldil                d_reg, val_expr
Load Immediate Quadword             ldiq

Store Byte                          stb                 s_reg, address
Store Word                          stw
Store Longword                      stl [Footnote 1]
Store Longword Conditional          stl_c [Footnote 1]
Store Quadword                      stq [Footnote 1]
Store Quadword Conditional          stq_c [Footnote 1]
Store Quadword Unaligned            stq_u [Footnote 1]
Unaligned Store Word                ustw
Unaligned Store Longword            ustl
Unaligned Store Quadword            ustq

Section 3.1.1 describes the operations performed by load instructions and Section 3.1.2 describes the operations performed by store instructions.

3.1.1    Load Instruction Descriptions

Load instructions move values (addresses, values of expressions, or contents of memory locations) into registers. For all load instructions, the effective address is the 64-bit two's-complement sum of the contents of the base register and the sign-extended 16-bit offset.

Instructions whose address operands contain symbolic labels imply a base register, which the assembler determines. Some assembler load instructions can produce multiple machine-code instructions (see Section C.4).
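
The effective-address calculation described above can be illustrated outside the assembler. The following Python sketch is a model only, not assembler input. The helper names sign_extend and effective_address are hypothetical, and the offset is assumed to be a 16-bit displacement field.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def effective_address(register_value, offset):
    """64-bit two's-complement sum of a register value and a 16-bit offset."""
    return (register_value + sign_extend(offset, 16)) & MASK64


# A positive offset moves forward; an offset with bit 15 set moves backward.
print(hex(effective_address(0x1000, 0x0010)))  # 0x1010
print(hex(effective_address(0x1000, 0xFFF0)))  # 0xff0
```

Because the sum is taken modulo 2**64, a negative offset applied to a register value of zero wraps to the top of the address space in this model.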

Note

Load instructions can generate many code sequences for which the linker must fix the address by resolving external data items.

Table 3-2 describes the operations performed by load instructions.

Table 3-2:  Load Instruction Descriptions

Instruction Description

Load Address (lda)

Loads the destination register with the effective address of the specified data item.

Load Byte (ldb)

Loads the least significant byte of the destination register with the contents of the byte specified by the effective address. Because the loaded byte is a signed value, its sign bit is replicated to fill the other bytes in the destination register. (The assembler uses temporary registers AT and t9 for this instruction.)

Load Byte Unsigned (ldbu)

Loads the least significant byte of the destination register with the contents of the byte specified by the effective address. Because the loaded byte is an unsigned value, the other bytes of the destination register are cleared to zeros. (The assembler uses temporary registers AT and t9 for this instruction -- unless the setting of the .arch directive or the -arch flag on the cc or as command line causes the assembler to generate a single machine instruction in response to the ldbu instruction.)

Load Word (ldw)

Loads the two least significant bytes of the destination register with the contents of the word specified by the effective address. Because the loaded word is a signed value, its sign bit is replicated to fill the other bytes in the destination register.

If the effective address is not evenly divisible by two, a data-alignment exception may be signaled. (The assembler uses temporary registers AT and t9 for this instruction.)

Load Word Unsigned (ldwu)

Loads the two least significant bytes of the destination register with the contents of the word specified by the effective address. Because the loaded word is an unsigned value, the other bytes of the destination register are cleared to zeros.

If the effective address is not evenly divisible by two, a data-alignment exception may be signaled. (The assembler uses temporary registers AT and t9 for this instruction -- unless the setting of the .arch directive or the -arch flag on the cc or as command line causes the assembler to generate a single machine instruction in response to the ldwu instruction.)

Load Sign Extended Longword (ldl)

Loads the four least significant bytes of the destination register with the contents of the longword specified by the effective address. Because the loaded longword is a signed value, its sign bit is replicated to fill the other bytes in the destination register.

If the effective address is not evenly divisible by four, a data-alignment exception is signaled.

Load Sign Extended Longword Locked (ldl_l)

Loads the four least significant bytes of the destination register with the contents of the longword specified by the effective address. Because the loaded longword is a signed value, its sign bit is replicated to fill the other bytes in the destination register.

If the effective address is not evenly divisible by four, a data-alignment exception is signaled.

If an ldl_l instruction executes without generating an exception, the processor records the target physical address in a per-processor locked-physical-address register and sets the per-processor lock flag.

If the per-processor lock flag is still set when a stl_c instruction is executed, the store occurs; otherwise, it does not occur.

Load Quadword (ldq)

Loads the destination register with the contents of the quadword specified by the effective address. All bytes of the register are replaced with the contents of the loaded quadword.

If the effective address is not evenly divisible by eight, a data-alignment exception is signaled.

If a literal relocation type is specified in the ldq instruction, one machine instruction is generated and the symbol and offset are stored in the .lita section. Other relocation types generate a sequence of instructions and the symbol and offset are stored in that sequence.

Load Quadword Locked (ldq_l)

Loads the destination register with the contents of the quadword specified by the effective address. All bytes of the register are replaced with the contents of the loaded quadword.

If the effective address is not evenly divisible by eight, a data-alignment exception is signaled.

If an ldq_l instruction executes without generating an exception, the processor records the target physical address in a per-processor locked-physical-address register and sets the per-processor lock flag.

If the per-processor lock flag is still set when a stq_c instruction is executed, the store occurs; otherwise, it does not occur.

Load Quadword Unaligned (ldq_u)

Loads the destination register with the contents of the quadword specified by the effective address (with the three low-order bits cleared). The address does not have to be aligned on an 8-byte boundary; it can be any byte address.

Unaligned Load Word (uldw)

Loads the two least significant bytes of the destination register with the word at the specified address. The address does not have to be aligned on a 2-byte boundary; it can be any byte address. Because the loaded word is a signed value, its sign bit is replicated to fill the other bytes in the destination register. (The assembler uses temporary registers AT, t9, and t10 for this instruction.)

Unaligned Load Word Unsigned (uldwu)

Loads the two least significant bytes of the destination register with the word at the specified address. The address does not have to be aligned on a 2-byte boundary; it can be any byte address. Because the loaded word is an unsigned value, the other bytes of the destination register are cleared to zeros. (The assembler uses temporary registers AT, t9, and t10 for this instruction.)

Unaligned Load Longword (uldl)

Loads the four least significant bytes of the destination register with the longword at the specified address. The address does not have to be aligned on a 4-byte boundary; it can be any byte address in memory. (The assembler uses temporary registers AT, t9, and t10 for this instruction.)

Unaligned Load Quadword (uldq)

Loads the destination register with the quadword at the specified address. The address does not have to be aligned on an 8-byte boundary; it can be any byte address in memory. (The assembler uses temporary registers AT, t9, and t10 for this instruction.)

Load Address High (ldah)

Loads the destination register with the effective address of the specified data item. In computing the effective address, the signed constant offset is multiplied by 65536 before it is added to the contents of the base register. The signed constant must be in the range -32768 to 32767.

Load Global Pointer (ldgp)

Loads the destination register with the global pointer value for the procedure. The sum of the base register and the sign-extended offset specifies the address of the ldgp instruction.

Load Immediate Longword (ldil)

Loads the destination register with the value of an expression that can be computed at assembly time. The value is converted to canonical longword form before being stored in the destination register; bit 31 is replicated in bits 32 through 63 of the destination register. (See Appendix B for additional information on canonical forms.)

Load Immediate Quadword (ldiq)

Loads the destination register with the value of an expression that can be computed at assembly time.
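
The sign-extension and zero-extension rules in Table 3-2 can be modeled in a few lines of Python. This is an illustrative sketch, not a description of the code the assembler generates. It assumes that memory is a little-endian byte string, it does not model alignment exceptions or temporary registers, and the Python function names are hypothetical stand-ins for the instructions they are named after.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def load(memory, address, size, signed):
    """Return the 64-bit register contents after loading `size` bytes."""
    raw = int.from_bytes(memory[address:address + size], "little")
    if signed:
        return sign_extend(raw, size * 8) & MASK64  # replicate the sign bit
    return raw                                      # other bytes stay zero


def ldb(memory, address):
    return load(memory, address, 1, signed=True)


def ldbu(memory, address):
    return load(memory, address, 1, signed=False)


def ldl(memory, address):
    return load(memory, address, 4, signed=True)


def ldq_u(memory, address):
    """Load the quadword at the address with its three low-order bits cleared."""
    return load(memory, address & ~7, 8, signed=False)


def ldah(base, offset):
    """Add the 16-bit signed offset, multiplied by 65536, to the base value."""
    return (base + sign_extend(offset, 16) * 65536) & MASK64


def ldil(value):
    """Canonical longword form: bit 31 is replicated in bits 32 through 63."""
    return sign_extend(value, 32) & MASK64


memory = bytes([0x80, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07])
print(hex(ldb(memory, 0)))       # 0xffffffffffffff80
print(hex(ldbu(memory, 0)))      # 0x80
print(hex(ldq_u(memory, 5)))     # 0x706050403020180
print(hex(ldil(0x80000000)))     # 0xffffffff80000000
```

The same byte at address 0 produces two different register values, depending only on whether the instruction treats it as signed.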

3.1.2    Store Instruction Descriptions

For all store instructions, the effective address is the 64-bit two's-complement sum of the contents of the base register and the sign-extended 16-bit offset.

Instructions whose address operands contain symbolic labels imply a base register, which the assembler determines. Some assembler store instructions can produce multiple machine-code instructions (see Section C.4).

Table 3-3 describes the operations performed by store instructions.

Table 3-3:  Store Instruction Descriptions

Instruction Description

Store Byte (stb)

Stores the least significant byte of the source register in the memory location specified by the effective address. (The assembler uses temporary registers AT, t9, and t10 for this instruction -- unless the setting of the .arch directive or the -arch flag on the cc or as command line causes the assembler to generate a single machine instruction in response to the stb instruction.)

Store Word (stw)

Stores the two least significant bytes of the source register in the memory location specified by the effective address.

If the effective address is not evenly divisible by two, a data-alignment exception may be signaled. (The assembler uses temporary registers AT, t9, and t10 for this instruction -- unless the setting of the .arch directive or the -arch flag on the cc or as command line causes the assembler to generate a single machine instruction in response to the stw instruction.)

Store Longword (stl)

Stores the four least significant bytes of the source register in the memory location specified by the effective address.

If the effective address is not evenly divisible by four, a data-alignment exception is signaled.

Store Longword Conditional (stl_c)

Stores the four least significant bytes of the source register in the memory location specified by the effective address, if the lock flag is set. The value of the lock flag is returned in the source register, and the lock flag is then set to zero.

If the effective address is not evenly divisible by four, a data-alignment exception is signaled.

Store Quadword (stq)

Stores the contents of the source register in the memory location specified by the effective address.

If the effective address is not evenly divisible by eight, a data-alignment exception is signaled.

Store Quadword Conditional (stq_c)

Stores the contents of the source register in the memory location specified by the effective address, if the lock flag is set. The value of the lock flag is returned in the source register, and the lock flag is then set to zero.

If the effective address is not evenly divisible by eight, a data-alignment exception is signaled.

Store Quadword Unaligned (stq_u)

Stores the contents of the source register in the memory location specified by the effective address (with the three low-order bits cleared).

Unaligned Store Word (ustw)

Stores the two least significant bytes of the source register in the memory location specified by the effective address. The address does not have to be aligned on a 2-byte boundary; it can be any byte address. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for this instruction.)

Unaligned Store Longword (ustl)

Stores the four least significant bytes of the source register in the memory location specified by the effective address. The address does not have to be aligned on a 4-byte boundary; it can be any byte address. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for this instruction.)

Unaligned Store Quadword (ustq)

Stores the contents of the source register in the memory location specified by the effective address. The address does not have to be aligned on an 8-byte boundary; it can be any byte address. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for this instruction.)
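
The interaction between the locked loads in Table 3-2 and the conditional stores in Table 3-3 is easier to follow as a retry loop. The following Python sketch models a single processor under stated assumptions: the Processor class and its method names are hypothetical, memory is a dictionary of quadwords, and clear_lock is an invented hook that stands in for any event that clears the lock flag between the load and the store.

```python
MASK64 = (1 << 64) - 1


class Processor:
    """Toy model of the per-processor lock flag used by ldq_l and stq_c."""

    def __init__(self, memory):
        self.memory = memory          # dict: quadword address -> value
        self.lock_flag = False
        self.locked_address = None

    def ldq_l(self, address):
        if address % 8 != 0:
            raise ValueError("data-alignment exception")
        self.locked_address = address  # record the locked address
        self.lock_flag = True          # set the lock flag
        return self.memory.get(address, 0)

    def stq_c(self, value, address):
        """Store only if the lock flag is set; return the flag, then clear it."""
        if address % 8 != 0:
            raise ValueError("data-alignment exception")
        succeeded = self.lock_flag
        if succeeded:
            self.memory[address] = value & MASK64
        self.lock_flag = False
        return 1 if succeeded else 0   # value returned in the source register

    def clear_lock(self):
        """Hypothetical hook: something cleared the lock before the store."""
        self.lock_flag = False


def atomic_increment(cpu, address, disturb_first_try=False):
    """Retry the load-locked/store-conditional pair until the store occurs."""
    attempts = 0
    while True:
        attempts += 1
        value = cpu.ldq_l(address)
        if disturb_first_try and attempts == 1:
            cpu.clear_lock()
        if cpu.stq_c(value + 1, address):
            return attempts


cpu = Processor({0x100: 41})
print(atomic_increment(cpu, 0x100), cpu.memory[0x100])  # 1 42
```

When the lock flag is cleared before the conditional store, the store does not occur, the returned value is zero, and the loop repeats the sequence from the locked load.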

3.2    Arithmetic Instructions

Arithmetic instructions perform arithmetic operations on values in registers. (Floating-point arithmetic instructions are described in Section 4.3.)

Table 3-4 lists the mnemonics and operands for instructions that perform arithmetic operations. The table is divided into groups of instructions. The operands specified within a particular group apply to all of the instructions contained in that group.

Table 3-4:  Arithmetic Instruction Formats

Instruction                           Mnemonic  Operands

Clear                                 clr       d_reg

Absolute Value Longword               absl      s_reg, d_reg or d_reg/s_reg or val_immed, d_reg
Absolute Value Quadword               absq
Negate Longword (without overflow)    negl
Negate Longword (with overflow)       neglv
Negate Quadword (without overflow)    negq
Negate Quadword (with overflow)       negqv
Sign-Extension Byte                   sextb
Sign-Extension Longword               sextl
Sign-Extension Word                   sextw

Add Longword (without overflow)       addl      s_reg1, s_reg2, d_reg or d_reg/s_reg1, s_reg2 or s_reg1, val_immed, d_reg or d_reg/s_reg1, val_immed
Add Longword (with overflow)          addlv
Add Quadword (without overflow)       addq
Add Quadword (with overflow)          addqv
Scaled Longword Add by 4              s4addl
Scaled Quadword Add by 4              s4addq
Scaled Longword Add by 8              s8addl
Scaled Quadword Add by 8              s8addq
Multiply Longword (without overflow)  mull
Multiply Longword (with overflow)     mullv
Multiply Quadword (without overflow)  mulq
Multiply Quadword (with overflow)     mulqv
Subtract Longword (without overflow)  subl
Subtract Longword (with overflow)     sublv
Subtract Quadword (without overflow)  subq
Subtract Quadword (with overflow)     subqv
Scaled Longword Subtract by 4         s4subl
Scaled Quadword Subtract by 4         s4subq
Scaled Longword Subtract by 8         s8subl
Scaled Quadword Subtract by 8         s8subq
Unsigned Quadword Multiply High       umulh
Divide Longword                       divl
Divide Longword Unsigned              divlu
Divide Quadword                       divq
Divide Quadword Unsigned              divqu
Longword Remainder                    reml
Longword Remainder Unsigned           remlu
Quadword Remainder                    remq
Quadword Remainder Unsigned           remqu

Table 3-5 describes the operations performed by arithmetic instructions.

Table 3-5:  Arithmetic Instruction Descriptions

Instruction Description

Clear (clr)

Sets the contents of the destination register to zero.

Absolute Value Longword (absl)

Computes the absolute value of the contents of the source register and places the result in the destination register. If the value in the source register is -2147483648, an overflow exception is signaled.

Absolute Value Quadword (absq)

Computes the absolute value of the contents of the source register and places the result in the destination register. If the value in the source register is -9223372036854775808, an overflow exception is signaled.

Negate Longword (without overflow) (negl)

Negates the integer contents of the four least significant bytes in the source register and places the result in the destination register. An overflow occurs if the value in the source register is -2147483648, but the overflow exception is not signaled.

Negate Longword (with overflow) (neglv)

Negates the integer contents of the four least significant bytes in the source register and places the result in the destination register. If the value in the source register is -2147483648, an overflow exception is signaled.

Negate Quadword (without overflow) (negq)

Negates the integer contents of the source register and places the result in the destination register. An overflow occurs if the value in the source register is -9223372036854775808, but the overflow exception is not signaled.

Negate Quadword (with overflow) (negqv)

Negates the integer contents of the source register and places the result in the destination register. An overflow exception is signaled if the value in the source register is -9223372036854775808.

Sign-Extension Byte (sextb)

Moves the least significant byte of the source register into the least significant byte of the destination register. Because the moved byte is a signed value, its sign bit is replicated to fill the other bytes in the destination register.

Sign-Extension Word (sextw)

Moves the two least significant bytes of the source register into the two least significant bytes of the destination register. Because the moved word is a signed value, its sign bit is replicated to fill the other bytes in the destination register.

Sign-Extension Longword (sextl)

Moves the four least significant bytes of the source register into the four least significant bytes of the destination register. Because the moved longword is a signed value, its sign bit is replicated to fill the other bytes in the destination register.

Add Longword (without overflow) (addl)

Computes the sum of two signed 32-bit values. This instruction adds the contents of s_reg1 to the contents of s_reg2 or the immediate value and then places the result in the destination register. Overflow exceptions never occur.

Add Longword (with overflow) (addlv)

Computes the sum of two signed 32-bit values. This instruction adds the contents of s_reg1 to the contents of s_reg2 or the immediate value and then places the result in the destination register. If the result cannot be represented as a signed 32-bit number, an overflow exception is signaled.

Add Quadword (without overflow) (addq)

Computes the sum of two signed 64-bit values. This instruction adds the contents of s_reg1 to the contents of s_reg2 or the immediate value and then places the result in the destination register. Overflow exceptions never occur.

Add Quadword (with overflow) (addqv)

Computes the sum of two signed 64-bit values. This instruction adds the contents of s_reg1 to the contents of s_reg2 or the immediate value and then places the result in the destination register. If the result cannot be represented as a signed 64-bit number, an overflow exception is signaled.

Scaled Longword Add by 4 (s4addl)

Computes the sum of two signed 32-bit values. This instruction scales (multiplies) the contents of s_reg1 by four and then adds the contents of s_reg2 or the immediate value. The result is stored in the destination register. Overflow exceptions never occur.

Scaled Quadword Add by 4 (s4addq)

Computes the sum of two signed 64-bit values. This instruction scales (multiplies) the contents of s_reg1 by four and then adds the contents of s_reg2 or the immediate value. The result is stored in the destination register. Overflow exceptions never occur.

Scaled Longword Add by 8 (s8addl)

Computes the sum of two signed 32-bit values. This instruction scales (multiplies) the contents of s_reg1 by eight and then adds the contents of s_reg2 or the immediate value. The result is stored in the destination register. Overflow exceptions never occur.

Scaled Quadword Add by 8 (s8addq)

Computes the sum of two signed 64-bit values. This instruction scales (multiplies) the contents of s_reg1 by eight and then adds the contents of s_reg2 or the immediate value. The result is stored in the destination register. Overflow exceptions never occur.

Multiply Longword (without overflow) (mull)

Computes the product of two signed 32-bit values. This instruction places the 32-bit product of s_reg1 and either s_reg2 or the immediate value in the destination register. Overflows are not reported.

Multiply Longword (with overflow) (mullv)

Computes the product of two signed 32-bit values. This instruction places the 32-bit product of s_reg1 and either s_reg2 or the immediate value in the destination register. If an overflow occurs, an overflow exception is signaled.

Multiply Quadword (without overflow) (mulq)

Computes the product of two signed 64-bit values. This instruction places the 64-bit product of s_reg1 and either s_reg2 or the immediate value in the destination register. Overflows are not reported.

Multiply Quadword (with overflow) (mulqv)

Computes the product of two signed 64-bit values. This instruction places the 64-bit product of s_reg1 and either s_reg2 or the immediate value in the destination register. If an overflow occurs, an overflow exception is signaled.

Subtract Longword (without overflow) (subl)

Computes the difference of two signed 32-bit values. This instruction subtracts either the contents of s_reg2 or an immediate value from the contents of s_reg1 and then places the result in the destination register. Overflow exceptions never occur.

Subtract Longword (with overflow) (sublv)

Computes the difference of two signed 32-bit values. This instruction subtracts either the contents of s_reg2 or an immediate value from the contents of s_reg1 and then places the result in the destination register. If the true result's sign differs from the destination register's sign, an overflow exception is signaled.

Subtract Quadword (without overflow) (subq)

Computes the difference of two signed 64-bit values. This instruction subtracts the contents of s_reg2 or an immediate value from the contents of s_reg1 and then places the result in the destination register. Overflow exceptions never occur.

Subtract Quadword (with overflow) (subqv)

Computes the difference of two signed 64-bit values. This instruction subtracts the contents of s_reg2 or an immediate value from the contents of s_reg1 and then places the result in the destination register. If the true result's sign differs from the destination register's sign, an overflow exception is signaled.

Scaled Longword Subtract by 4 (s4subl)

Computes the difference of two signed 32-bit values. This instruction subtracts the contents of s_reg2 or the immediate value from the scaled (by 4) contents of s_reg1. The result is stored in the destination register. Overflow exceptions never occur.

Scaled Quadword Subtract by 4 (s4subq)

Computes the difference of two signed 64-bit values. This instruction subtracts the contents of s_reg2 or the immediate value from the scaled (by 4) contents of s_reg1. The result is stored in the destination register. Overflow exceptions never occur.

Scaled Longword Subtract by 8 (s8subl)

Computes the difference of two signed 32-bit values. This instruction subtracts the contents of s_reg2 or the immediate value from the scaled (by 8) contents of s_reg1. The result is stored in the destination register. Overflow exceptions never occur.

Scaled Quadword Subtract by 8 (s8subq)

Computes the difference of two signed 64-bit values. This instruction subtracts the contents of s_reg2 or the immediate value from the scaled (by 8) contents of s_reg1. The result is stored in the destination register. Overflow exceptions never occur.

Unsigned Quadword Multiply High (umulh)

Computes the product of two unsigned 64-bit values. This instruction multiplies the contents of s_reg1 by the contents of s_reg2 or the immediate value and then places the high-order 64 bits of the 128-bit product in the destination register.

Divide Longword (divl)

Computes the quotient of two signed 32-bit values. This instruction divides the contents of s_reg1 by the contents of s_reg2 or the immediate value and then places the quotient in the destination register.

The divl instruction rounds toward zero. If the divisor is zero, an error is signaled. Overflow is signaled when dividing -2147483648 by -1. A call_pal PAL_gentrap instruction may be issued for divide-by-zero and overflow exceptions.

Divide Longword Unsigned (divlu)

Computes the quotient of two unsigned 32-bit values. This instruction divides the contents of s_reg1 by the contents of s_reg2 or the immediate value and then places the quotient in the destination register.

If the divisor is zero, an exception is signaled and a call_pal PAL_gentrap instruction may be issued. Overflow exceptions never occur. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for the divlu instruction.)

Divide Quadword (divq)

Computes the quotient of two signed 64-bit values. This instruction divides the contents of s_reg1 by the contents of s_reg2 or the immediate value and then places the quotient in the destination register.

The divq instruction rounds toward zero. If the divisor is zero, an error is signaled. Overflow is signaled when dividing -9223372036854775808 by -1. A call_pal PAL_gentrap instruction may be issued for divide-by-zero and overflow exceptions. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for the divq instruction.)

Divide Quadword Unsigned (divqu)

Computes the quotient of two unsigned 64-bit values. This instruction divides the contents of s_reg1 by the contents of s_reg2 or the immediate value and then places the quotient in the destination register.

If the divisor is zero, an exception is signaled and a call_pal PAL_gentrap instruction may be issued. Overflow exceptions never occur. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for the divqu instruction.)

Longword Remainder (reml)

Computes the remainder of the division of two signed 32-bit values. The remainder reml(i,j) is defined as i-(j*divl(i,j)), where j!=0. This instruction divides the contents of s_reg1 by the contents of s_reg2 or by the immediate value and then places the remainder in the destination register.

The division underlying the reml instruction rounds toward zero; for example, divl(5,-3)=-1 and reml(5,-3)=2.

For divide-by-zero, an error is signaled and a call_pal PAL_gentrap instruction may be issued. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for the reml instruction.)

Longword Remainder Unsigned (remlu)

Computes the remainder of the division of two unsigned 32-bit values. The remainder remlu(i,j) is defined as i-(j*divlu(i,j)), where j!=0. This instruction divides the contents of s_reg1 by the contents of s_reg2 or the immediate value and then places the remainder in the destination register.

For divide-by-zero, an error is signaled and a call_pal PAL_gentrap instruction may be issued. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for the remlu instruction.)

Quadword Remainder (remq)

Computes the remainder of the division of two signed 64-bit values. The remainder remq(i,j) is defined as i-(j*divq(i,j)) where j!=0. This instruction divides the contents of s_reg1 by the contents of s_reg2 or the immediate value and then places the remainder in the destination register.

The division underlying the remq instruction rounds toward zero; for example, divq(5,-3)=-1 and remq(5,-3)=2.

For divide-by-zero, an error is signaled and a call_pal PAL_gentrap instruction may be issued. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for the remq instruction.)

Quadword Remainder Unsigned (remqu)

Computes the remainder of the division of two unsigned 64-bit values. The remainder remqu(i,j) is defined as i-(j*divqu(i,j)) where j!=0. This instruction divides the contents of s_reg1 by the contents of s_reg2 or the immediate value and then places the remainder in the destination register.

For divide-by-zero, an error is signaled and a call_pal PAL_gentrap instruction may be issued. (The assembler uses temporary registers AT, t9, t10, t11, and t12 for the remqu instruction.)
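
Several of the rules in Table 3-5 (wraparound without an exception, overflow detection, scaling, the high half of a product, and division that rounds toward zero) can be checked with a small Python model. This sketch is illustrative only: the function names mirror the mnemonics but are ordinary Python functions, results are returned as signed Python integers, and Python exceptions stand in for the exceptions that the instructions signal.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def addl(a, b):
    """32-bit sum that wraps silently; overflow exceptions never occur."""
    return sign_extend(a + b, 32)


def addlv(a, b):
    """32-bit sum that signals overflow when the result does not fit."""
    total = sign_extend(a, 32) + sign_extend(b, 32)
    if not -2**31 <= total <= 2**31 - 1:
        raise OverflowError("overflow exception")
    return total


def s4addq(a, b):
    """Scale the first operand by four, then add the second (64-bit wrap)."""
    return sign_extend(4 * a + b, 64)


def umulh(a, b):
    """High-order 64 bits of the 128-bit unsigned product."""
    return ((a & MASK64) * (b & MASK64)) >> 64


def divq(i, j):
    """Signed 64-bit quotient, rounded toward zero."""
    if j == 0:
        raise ZeroDivisionError("divide-by-zero exception")
    if i == -2**63 and j == -1:
        raise OverflowError("overflow exception")
    magnitude = abs(i) // abs(j)
    return magnitude if (i < 0) == (j < 0) else -magnitude


def remq(i, j):
    """Remainder defined as i-(j*divq(i,j))."""
    return i - j * divq(i, j)


print(addl(2**31 - 1, 1))         # -2147483648
print(s4addq(3, 5))               # 17
print(umulh(2**63, 4))            # 2
print(divq(5, -3), remq(5, -3))   # -1 2
```

Note that Python's own // operator rounds toward negative infinity, which is why divq works on magnitudes and restores the sign afterward.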

3.3    Logical and Shift Instructions

Logical and shift instructions perform logical operations and shifts on values in registers.

Table 3-6 lists the mnemonics and operands for instructions that perform logical and shift operations. The table is divided into groups of instructions. The operands specified within a particular group apply to all of the instructions contained in that group.

Table 3-6:  Logical and Shift Instruction Formats

Instruction                                Mnemonic  Operands

Logical Complement -- NOT                  not       s_reg, d_reg or d_reg/s_reg or val_immed, d_reg

Logical Product -- AND                     and       s_reg1, s_reg2, d_reg or d_reg/s_reg1, s_reg2 or s_reg1, val_immed, d_reg or d_reg/s_reg1, val_immed
Logical Sum -- OR                          bis
Logical Sum -- OR                          or
Logical Difference -- XOR                  xor
Logical Product with Complement -- ANDNOT  bic
Logical Product with Complement -- ANDNOT  andnot
Logical Sum with Complement -- ORNOT       ornot
Logical Equivalence -- XORNOT              eqv
Logical Equivalence -- XORNOT              xornot
Shift Left Logical                         sll
Shift Right Logical                        srl
Shift Right Arithmetic                     sra

Table 3-7 describes the operations performed by logical and shift instructions.

Table 3-7:  Logical and Shift Instruction Descriptions

Instruction Description

Logical Complement -- NOT (not)

Computes the logical NOT of a value. This instruction performs a complement operation on the contents of s_reg or the immediate value and places the result in the destination register.

Logical Product -- AND (and)

Computes the logical AND of two values. This instruction performs an AND operation between the contents of s_reg1 and either the contents of s_reg2 or the immediate value and then places the result in the destination register.

Logical Sum -- OR (bis)

Computes the logical OR of two values. This instruction performs an OR operation between the contents of s_reg1 and either the contents of s_reg2 or the immediate value and then places the result in the destination register.

Logical Sum -- OR (or)

Synonym for bis.

Logical Difference -- XOR (xor)

Computes the logical XOR of two values. This instruction performs an XOR operation between the contents of s_reg1 and either the contents of s_reg2 or the immediate value and then places the result in the destination register.

Logical Product with Complement -- ANDNOT (bic)

Computes the logical AND of one value and the complement of a second value. This instruction performs an AND operation between the contents of s_reg1 and the one's complement of either the contents of s_reg2 or the immediate value and then places the result in the destination register.

Logical Product with Complement -- ANDNOT (andnot)

Synonym for bic.

Logical Sum with Complement -- ORNOT (ornot)

Computes the logical OR of one value and the complement of a second value. This instruction performs an OR operation between the contents of s_reg1 and the one's complement of either the contents of s_reg2 or the immediate value and then places the result in the destination register.

Logical Equivalence -- XORNOT (eqv)

Computes the logical XOR of one value and the complement of a second value. This instruction performs an XOR operation between the contents of s_reg1 and the one's complement of either the contents of s_reg2 or the immediate value and then places the result in the destination register.

Logical Equivalence -- XORNOT (xornot)

Synonym for eqv.

Shift Left Logical (sll)

Shifts the contents of a register left (toward the sign bit) and inserts zeros in the vacated bit positions. Register s_reg1 contains the value to be shifted, and either the contents of s_reg2 or the immediate value specifies the shift count. If s_reg2 or the immediate value is greater than 63 or less than zero, s_reg1 shifts by the result of the following AND operation: s_reg2 AND 63.

Shift Right Logical (srl)

Shifts the contents of a register to the right (toward the least significant bit) and inserts zeros in the vacated bit positions. Register s_reg1 contains the value to be shifted, and either the contents of s_reg2 or the immediate value specifies the shift count. If s_reg2 or the immediate value is greater than 63 or less than zero, s_reg1 shifts by the result of the following AND operation: s_reg2 AND 63.

Shift Right Arithmetic (sra)

Shifts the contents of a register to the right (toward the least significant bit) and inserts the sign bit in the vacated bit positions. Register s_reg1 contains the value to be shifted, and either the contents of s_reg2 or the immediate value specifies the shift count. If s_reg2 or the immediate value is greater than 63 or less than zero, s_reg1 shifts by the result of the following AND operation: s_reg2 AND 63.
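The complemented logical operations and the shift-count rule described in Table 3-7 can be modeled in a few lines. The following Python sketch is an illustration under stated assumptions, not a definitive implementation: registers are modeled as unsigned 64-bit integers, MASK64 is a helper introduced here, and the function names simply reuse the mnemonics.

```python
MASK64 = (1 << 64) - 1  # helper (not from the manual): keeps values in 64 bits


def bic(a, b):
    """ANDNOT: a AND (one's complement of b)."""
    return a & ~b & MASK64


def ornot(a, b):
    """ORNOT: a OR (one's complement of b)."""
    return (a | ~b) & MASK64


def eqv(a, b):
    """XORNOT: a XOR (one's complement of b)."""
    return (a ^ ~b) & MASK64


def sll(value, count):
    """Shift left; only (count AND 63) is used as the shift count."""
    return (value << (count & 63)) & MASK64


def srl(value, count):
    """Shift right, filling vacated positions with zeros."""
    return (value & MASK64) >> (count & 63)


def sra(value, count):
    """Shift right, filling vacated positions with the sign bit."""
    value &= MASK64
    if value >> 63:          # sign bit set: reinterpret as a negative number
        value -= 1 << 64
    return (value >> (count & 63)) & MASK64
```

In this model a shift count of 64 behaves like a count of 0, because 64 AND 63 is 0.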

3.4    Relational Instructions

Relational instructions compare values in registers.

Table 3-8 lists the mnemonics and operands for instructions that perform relational operations. Each of the instructions listed in the table can take an operand in any of the forms shown.

Table 3-8:  Relational Instruction Formats

Instruction                                   Mnemonic  Operands
Compare Signed Quadword Equal                 cmpeq     s_reg1, s_reg2, d_reg or d_reg/s_reg1, s_reg2 or s_reg1, val_immed, d_reg or d_reg/s_reg1, val_immed
Compare Signed Quadword Less Than             cmplt
Compare Signed Quadword Less Than or Equal    cmple
Compare Unsigned Quadword Less Than           cmpult
Compare Unsigned Quadword Less Than or Equal  cmpule

Table 3-9 describes the operations performed by relational instructions.

Table 3-9:  Relational Instruction Descriptions

Instruction Description

Compare Signed Quadword Equal (cmpeq)

Compares two 64-bit values. If the value in s_reg1 equals the value in s_reg2 or the immediate value, this instruction sets the destination register to one; otherwise, it sets the destination register to zero.

Compare Signed Quadword Less Than (cmplt)

Compares two signed 64-bit values. If the value in s_reg1 is less than the value in s_reg2 or the immediate value, this instruction sets the destination register to one; otherwise, it sets the destination register to zero.

Compare Signed Quadword Less Than or Equal (cmple)

Compares two signed 64-bit values. If the value in s_reg1 is less than or equal to the value in s_reg2 or the immediate value, this instruction sets the destination register to one; otherwise, it sets the destination register to zero.

Compare Unsigned Quadword Less Than (cmpult)

Compares two unsigned 64-bit values. If the value in s_reg1 is less than either the value in s_reg2 or the immediate value, this instruction sets the destination register to one; otherwise, it sets the destination register to zero.

Compare Unsigned Quadword Less Than or Equal (cmpule)

Compares two unsigned 64-bit values. If the value in s_reg1 is less than or equal to either the value in s_reg2 or the immediate value, this instruction sets the destination register to one; otherwise, it sets the destination register to zero.
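The difference between the signed and unsigned comparisons in Table 3-9 shows up only when the high-order bit of an operand is set. The following Python sketch models that behavior; it assumes registers are 64-bit values, and the helpers MASK64 and to_signed are introduced here for illustration and are not part of the assembler.

```python
MASK64 = (1 << 64) - 1  # helper (not from the manual): keeps values in 64 bits


def to_signed(value):
    """Reinterpret a 64-bit pattern as a two's-complement signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def cmpeq(a, b):
    return int((a & MASK64) == (b & MASK64))


def cmplt(a, b):
    return int(to_signed(a) < to_signed(b))


def cmple(a, b):
    return int(to_signed(a) <= to_signed(b))


def cmpult(a, b):
    return int((a & MASK64) < (b & MASK64))


def cmpule(a, b):
    return int((a & MASK64) <= (b & MASK64))
```

With all 64 bits set in the first operand and zero in the second, cmplt yields one (the pattern is -1 when signed) while cmpult yields zero (the pattern is the largest unsigned value).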

3.5    Move Instructions

Move instructions move data between registers.

Table 3-10 lists the mnemonics and operands for instructions that perform move operations. The table is divided into groups of instructions. The operands specified within a particular group apply to all of the instructions contained in that group.

Table 3-10:  Move Instruction Formats

Instruction                            Mnemonic  Operands
Move                                   mov       s_reg, d_reg or val_immed, d_reg
Move if Equal to Zero                  cmoveq    s_reg1, s_reg2, d_reg or d_reg/s_reg1, s_reg2 or s_reg1, val_immed, d_reg or d_reg/s_reg1, val_immed
Move if Not Equal to Zero              cmovne
Move if Less Than Zero                 cmovlt
Move if Less Than or Equal to Zero     cmovle
Move if Greater Than Zero              cmovgt
Move if Greater Than or Equal to Zero  cmovge
Move if Low Bit Clear                  cmovlbc
Move if Low Bit Set                    cmovlbs

Table 3-11 describes the operations performed by move instructions.

Table 3-11:  Move Instruction Descriptions

Instruction Description

Move (mov)

Moves the contents of the source register or the immediate value to the destination register.

Move if Equal to Zero (cmoveq)

Moves the contents of s_reg2 or the immediate value to the destination register if the contents of s_reg1 is equal to zero.

Move if Not Equal to Zero (cmovne)

Moves the contents of s_reg2 or the immediate value to the destination register if the contents of s_reg1 is not equal to zero.

Move if Less Than Zero (cmovlt)

Moves the contents of s_reg2 or the immediate value to the destination register if the contents of s_reg1 is less than zero.

Move if Less Than or Equal to Zero (cmovle)

Moves the contents of s_reg2 or the immediate value to the destination register if the contents of s_reg1 is less than or equal to zero.

Move if Greater Than Zero (cmovgt)

Moves the contents of s_reg2 or the immediate value to the destination register if the contents of s_reg1 is greater than zero.

Move if Greater Than or Equal to Zero (cmovge)

Moves the contents of s_reg2 or the immediate value to the destination register if the contents of s_reg1 is greater than or equal to zero.

Move if Low Bit Clear (cmovlbc)

Moves the contents of s_reg2 or the immediate value to the destination register if the low-order bit of s_reg1 is equal to zero.

Move if Low Bit Set (cmovlbs)

Moves the contents of s_reg2 or the immediate value to the destination register if the low-order bit of s_reg1 is not equal to zero.
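The conditional move instructions in Table 3-11 all follow one pattern: test s_reg1, then copy s_reg2 (or the immediate value) if the test succeeds. The following Python sketch models that pattern. It rests on two assumptions that the table does not state explicitly: the destination register keeps its previous value when the test fails, and the less-than and greater-than tests treat s_reg1 as a signed 64-bit value. The names CONDITIONS, cmov, and d_reg_old are hypothetical.

```python
MASK64 = (1 << 64) - 1  # helper (not from the manual): keeps values in 64 bits


def to_signed(value):
    """Reinterpret a 64-bit pattern as a two's-complement signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


# One test per mnemonic, applied to the signed value of s_reg1.
CONDITIONS = {
    "cmoveq":  lambda v: v == 0,
    "cmovne":  lambda v: v != 0,
    "cmovlt":  lambda v: v < 0,
    "cmovle":  lambda v: v <= 0,
    "cmovgt":  lambda v: v > 0,
    "cmovge":  lambda v: v >= 0,
    "cmovlbc": lambda v: (v & 1) == 0,
    "cmovlbs": lambda v: (v & 1) != 0,
}


def cmov(mnemonic, s_reg1, s_reg2, d_reg_old):
    """Return the new destination value for a conditional move."""
    if CONDITIONS[mnemonic](to_signed(s_reg1)):
        return s_reg2 & MASK64
    return d_reg_old & MASK64   # assumption: destination is left unchanged
```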

3.6    Control Instructions

Control instructions change the control flow of an assembly program. They affect the sequence in which instructions are executed by transferring control from one location in a program to another.

Table 3-12 lists the mnemonics and operands for instructions that perform control operations. The table is divided into groups of instructions. The operands specified within a particular group apply to all of the instructions contained in that group.

Table 3-12:  Control Instruction Formats

Instruction                              Mnemonic                    Operands
Branch if Equal to Zero                  beq                         s_reg, label
Branch if Not Equal to Zero              bne
Branch if Less Than Zero                 blt
Branch if Less Than or Equal to Zero     ble
Branch if Greater Than Zero              bgt
Branch if Greater Than or Equal to Zero  bge
Branch if Low Bit is Clear               blbc
Branch if Low Bit is Set                 blbs
Branch                                   br                          d_reg, label or label
Branch to Subroutine                     bsr
Jump                                     jmp [Footnote 2]            d_reg, (s_reg), jhint or d_reg, (s_reg) or (s_reg), jhint or (s_reg) or d_reg, address or address
Jump to Subroutine                       jsr [Footnote 2]
Return from Subroutine                   ret                         d_reg, (s_reg), rhint or d_reg, (s_reg) or d_reg, rhint or d_reg or (s_reg), rhint or (s_reg) or rhint or no_operands
Jump to Subroutine Return                jsr_coroutine [Footnote 2]

Table 3-13 describes the operations performed by control instructions. For all branch instructions described in the table, the branch destinations must be defined in the source being assembled, not in an external source file.

Table 3-13:  Control Instruction Descriptions

Instruction Description

Branch if Equal to Zero (beq)

Branches to the specified label if the contents of the source register is equal to zero.

Branch if Not Equal to Zero (bne)

Branches to the specified label if the contents of the source register is not equal to zero.

Branch if Less Than Zero (blt)

Branches to the specified label if the contents of the source register is less than zero. The comparison treats the source register as a signed 64-bit value.

Branch if Less Than or Equal to Zero (ble)

Branches to the specified label if the contents of the source register is less than or equal to zero. The comparison treats the source register as a signed 64-bit value.

Branch if Greater Than Zero (bgt)

Branches to the specified label if the contents of the source register is greater than zero. The comparison treats the source register as a signed 64-bit value.

Branch if Greater Than or Equal to Zero (bge)

Branches to the specified label if the contents of the source register is greater than or equal to zero. The comparison treats the source register as a signed 64-bit value.

Branch if Low Bit is Clear (blbc)

Branches to the specified label if the low-order bit of the source register is equal to zero.

Branch if Low Bit is Set (blbs)

Branches to the specified label if the low-order bit of the source register is not equal to zero.

Branch (br)

Branches unconditionally to the specified label. If a destination register is specified, the address of the instruction following the br instruction is stored in that register.

Branch to Subroutine (bsr)

Branches unconditionally to the specified label and stores the return address in the destination register. If a destination register is not specified, register $26 (ra) is used.

Jump (jmp)

Unconditionally jumps to a specified location. A symbolic address or the source register specifies the target location. If a destination register is specified, the address of the instruction following the jmp instruction is stored in the specified register.

Jump to Subroutine (jsr)

Unconditionally jumps to a specified location and stores the return address in the destination register. If a destination register is not specified, register $26 (ra) is used. A symbolic address or the source register specifies the target location. The instruction jsr procname transfers to procname and saves the return address in register $26.

Return from Subroutine (ret)

Unconditionally returns from a subroutine. If a destination register is specified, the address of the instruction following the ret instruction is stored in the specified register. The source register contains the return address. If the source register is not specified, register $26 (ra) is used. If a hint is not specified, a hint value of one is used.

Jump to Subroutine Return (jsr_coroutine)

Unconditionally returns from a subroutine and stores the return address in the destination register. If a destination register is not specified, register $26 (ra) is used. The source register contains the target address. If the source register is not specified, register $26 (ra) is used.

All jump instructions (jmp, jsr, ret, and jsr_coroutine) perform identical operations. They differ only in hints to possible branch-prediction logic. See the Alpha Architecture Reference Manual for information about branch-prediction logic.
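The effect of the branch instructions on control flow can be modeled as a function from the current instruction address to the next one. The following Python sketch is a simplified illustration, not a description of the hardware: it assumes fixed-width 4-byte instructions (so the "instruction following" a branch is at pc + 4), and the names next_pc, bsr, BRANCH_TESTS, and the regs dictionary are hypothetical.

```python
MASK64 = (1 << 64) - 1   # helper (not from the manual)
INSTRUCTION_BYTES = 4    # assumption: fixed-width 32-bit instructions
RA = 26                  # default return-address register, $26 (ra)


def to_signed(value):
    """Reinterpret a 64-bit pattern as a two's-complement signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


# One test per conditional branch, applied to the signed source register.
BRANCH_TESTS = {
    "beq":  lambda v: v == 0,
    "bne":  lambda v: v != 0,
    "blt":  lambda v: v < 0,
    "ble":  lambda v: v <= 0,
    "bgt":  lambda v: v > 0,
    "bge":  lambda v: v >= 0,
    "blbc": lambda v: (v & 1) == 0,
    "blbs": lambda v: (v & 1) != 0,
}


def next_pc(mnemonic, s_reg, pc, target):
    """Address of the next instruction after a conditional branch at pc."""
    taken = BRANCH_TESTS[mnemonic](to_signed(s_reg))
    return target if taken else pc + INSTRUCTION_BYTES


def bsr(pc, target, regs, d_reg=RA):
    """Store the return address in d_reg (default $26), then branch."""
    regs[d_reg] = pc + INSTRUCTION_BYTES
    return target
```

In this model, a bsr at address 0x1000 leaves 0x1004 in register 26 unless another destination register is given.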

3.7    Byte-Manipulation Instructions

Byte-manipulation instructions perform byte operations on values in registers.

Table 3-14 lists the mnemonics and operands for instructions that perform byte-manipulation operations. Each of the instructions listed in the table can take an operand in any of the forms shown.

Table 3-14:  Byte-Manipulation Instruction Formats

Instruction            Mnemonic  Operands
Compare Byte           cmpbge    s_reg1, s_reg2, d_reg or d_reg/s_reg1, s_reg2 or s_reg1, val_immed, d_reg or d_reg/s_reg1, val_immed
Extract Byte Low       extbl
Extract Word Low       extwl
Extract Longword Low   extll
Extract Quadword Low   extql
Extract Word High      extwh
Extract Longword High  extlh
Extract Quadword High  extqh
Insert Byte Low        insbl
Insert Word Low        inswl
Insert Longword Low    insll
Insert Quadword Low    insql
Insert Word High       inswh
Insert Longword High   inslh
Insert Quadword High   insqh
Mask Byte Low          mskbl
Mask Word Low          mskwl
Mask Longword Low      mskll
Mask Quadword Low      mskql
Mask Word High         mskwh
Mask Longword High     msklh
Mask Quadword High     mskqh
Zero Bytes             zap
Zero Bytes NOT         zapnot

Table 3-15 describes the operations performed by byte-manipulation instructions.

Table 3-15:  Byte-Manipulation Instruction Descriptions

Instruction Description

Compare Byte (cmpbge)

Performs eight parallel unsigned byte comparisons between corresponding bytes of register s_reg1 and s_reg2 or the immediate value. A bit is set in the destination register if a byte in s_reg1 is greater than or equal to the corresponding byte in s_reg2 or the immediate value.

The results of the comparisons are stored in the eight low-order bits of the destination register; bit 0 of the destination register corresponds to byte 0 and so forth. The 56 high-order bits of the destination register are cleared.

Extract Byte Low (extbl)

Shifts the register s_reg1 right by 0-7 bytes, inserts zeros into the vacated bit positions, and then extracts the low-order byte into the destination register. The seven high-order bytes of the destination register are cleared to zeros. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.

Extract Word Low (extwl)

Shifts the register s_reg1 right by 0-7 bytes, inserts zeros into the vacated bit positions, and then extracts the two low-order bytes and stores them in the destination register. The six high-order bytes of the destination register are cleared to zeros. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.

Extract Longword Low (extll)

Shifts the register s_reg1 right by 0-7 bytes, inserts zeros into the vacated bit positions, and then extracts the four low-order bytes and stores them in the destination register. The four high-order bytes of the destination register are cleared to zeros. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.

Extract Quadword Low (extql)

Shifts the register s_reg1 right by 0-7 bytes, inserts zeros into the vacated bit positions, and then extracts all eight bytes and stores them in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.

Extract Word High (extwh)

Shifts the register s_reg1 left by 0-7 bytes, inserts zeros into the vacated bit positions, and then extracts the two low-order bytes and stores them in the destination register. The six high-order bytes of the destination register are cleared to zeros. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.

Extract Longword High (extlh)

Shifts the register s_reg1 left by 0-7 bytes, inserts zeros into the vacated bit positions, and then extracts the four low-order bytes and stores them in the destination register. The four high-order bytes of the destination register are cleared to zeros. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.

Extract Quadword High (extqh)

Shifts the register s_reg1 left by 0-7 bytes, inserts zeros into the vacated bit positions, and then extracts all eight bytes and stores them in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.

Insert Byte Low (insbl)

Shifts the register s_reg1 left by 0-7 bytes, inserts the byte into a field of zeros, and then places the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.

Insert Word Low (inswl)

Shifts the register s_reg1 left by 0-7 bytes, inserts the word into a field of zeros, and then places the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.

Insert Longword Low (insll)

Shifts the register s_reg1 left by 0-7 bytes, inserts the longword into a field of zeros, and then places the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.

Insert Quadword Low (insql)

Shifts the register s_reg1 left by 0-7 bytes, inserts the quadword into a field of zeros, and then places the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.

Insert Word High (inswh)

Shifts the register s_reg1 right by 0-7 bytes, inserts the word into a field of zeros, and then places the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.

Insert Longword High (inslh)

Shifts the register s_reg1 right by 0-7 bytes, inserts the longword into a field of zeros, and then places the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.

Insert Quadword High (insqh)

Shifts the register s_reg1 right by 0-7 bytes, inserts the quadword into a field of zeros, and then places the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the shift count.

Mask Byte Low (mskbl)

Sets a byte in register s_reg1 to zero and stores the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the offset of the byte.

Mask Word Low (mskwl)

Sets a word in register s_reg1 to zero and stores the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the offset of the word.

Mask Longword Low (mskll)

Sets a longword in register s_reg1 to zero and stores the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the offset of the longword.

Mask Quadword Low (mskql)

Sets a quadword in register s_reg1 to zero and stores the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the offset of the quadword.

Mask Word High (mskwh)

Sets a word in register s_reg1 to zero and stores the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the offset of the word.

Mask Longword High (msklh)

Sets a longword in register s_reg1 to zero and stores the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the offset of the longword.

Mask Quadword High (mskqh)

Sets a quadword in register s_reg1 to zero and stores the result in the destination register. Bits 0-2 of register s_reg2 or the immediate value specify the offset of the quadword.

Zero Bytes (zap)

Sets selected bytes of register s_reg1 to zero and places the result in the destination register. Bits 0-7 of register s_reg2 or an immediate value specify the bytes to be cleared to zeros. Each bit corresponds to one byte in register s_reg1; for example, bit 0 corresponds to byte 0. A bit with a value of one indicates its corresponding byte should be cleared to zeros.

Zero Bytes NOT (zapnot)

Sets selected bytes of register s_reg1 to zero and places the result in the destination register. Bits 0-7 of register s_reg2 or an immediate value specify the bytes to be cleared to zeros. Each bit corresponds to one byte in register s_reg1; for example, bit 0 corresponds to byte 0. A bit with a value of zero indicates its corresponding byte should be cleared to zeros.
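Several of the byte-manipulation instructions in Table 3-15 can be modeled compactly once a byte selector is expanded into a 64-bit mask. The following Python sketch covers cmpbge, zap, zapnot, and the "low" forms of the extract, insert, and mask instructions; the "high" forms are deliberately left out. It is an illustration under stated assumptions: registers are unsigned 64-bit integers, and the names byte_mask, ext_low, ins_low, msk_low, and the size parameter (1, 2, 4, or 8 bytes for the byte, word, longword, and quadword variants) are hypothetical.

```python
MASK64 = (1 << 64) - 1  # helper (not from the manual): keeps values in 64 bits


def byte_mask(bits):
    """Expand an 8-bit selector: bit i set means byte i of the mask is 0xFF."""
    mask = 0
    for i in range(8):
        if (bits >> i) & 1:
            mask |= 0xFF << (8 * i)
    return mask


def cmpbge(a, b):
    """Eight unsigned byte comparisons; bit i is set if byte i of a >= b."""
    result = 0
    for i in range(8):
        if ((a >> (8 * i)) & 0xFF) >= ((b >> (8 * i)) & 0xFF):
            result |= 1 << i
    return result


def zap(a, b):
    """Clear each byte of a whose selector bit in b is one."""
    return a & ~byte_mask(b & 0xFF) & MASK64


def zapnot(a, b):
    """Clear each byte of a whose selector bit in b is zero."""
    return a & byte_mask(b & 0xFF) & MASK64


def ext_low(a, b, size):
    """extbl/extwl/extll/extql: shift right by (b AND 7) bytes, keep size bytes."""
    field = (1 << (8 * size)) - 1
    return ((a & MASK64) >> (8 * (b & 7))) & field


def ins_low(a, b, size):
    """insbl/inswl/insll/insql: place the low size bytes at byte offset (b AND 7)."""
    field = (1 << (8 * size)) - 1
    return ((a & field) << (8 * (b & 7))) & MASK64


def msk_low(a, b, size):
    """mskbl/mskwl/mskll/mskql: clear size bytes starting at byte offset (b AND 7)."""
    field = (1 << (8 * size)) - 1
    return a & ~(field << (8 * (b & 7))) & MASK64
```

Note that zapnot with a selector of 0x0F keeps the low-order longword and clears the rest, which makes it a convenient zero-extension in this model.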

3.8    Special-Purpose Instructions

Special-purpose instructions perform miscellaneous tasks.

Table 3-16 lists the mnemonics and operands for instructions that perform special operations. The table is divided into groups of instructions. The operands specified within a particular group apply to all of the instructions contained in that group.

Table 3-16:  Special-Purpose Instruction Formats

Instruction                           Mnemonic  Operands
Call Privileged Architecture Library  call_pal  palcode
Architecture Mask                     amask     s_reg, d_reg or val_immed, d_reg
Prefetch Data                         fetch     offset(b_reg)
Prefetch Data, Modify Intent          fetch_m
Read Process Cycle Counter            rpcc      d_reg
Implementation Version                implver
No Operation                          nop       no_operands
Universal No Operation                unop
Trap Barrier                          trapb
Exception Barrier                     excb
Memory Barrier                        mb
Write Memory Barrier                  wmb

Table 3-17 describes the operations performed by special-purpose instructions.

Table 3-17:  Special-Purpose Instruction Descriptions

Instruction Description

Call Privileged Architecture Library (call_pal)

Unconditionally transfers control to the exception handler. The palcode operand is interpreted by software conventions.

Architecture Mask (amask)

The contents of s_reg or the immediate value represent a mask of the architectural extensions being requested. Bits that correspond to architectural extensions that are present are cleared, and the result is placed in the destination register.

Prefetch Data (fetch)

Indicates that the 512-byte block of data specified by the effective address should be moved to a faster-access part of the memory hierarchy.

Prefetch Data, Modify Intent (fetch_m)

Indicates that the 512-byte block of data specified by the effective address should be moved to a faster-access part of the memory hierarchy. In addition, this instruction is a hint that part or all of the data may be modified.

Read Process Cycle Counter (rpcc)

Returns the contents of the process cycle counter in the destination register.

Implementation Version (implver)

Places a small integer in the destination register. This integer specifies the major implementation version of the processor on which the instruction is executed; the information can be used to make code-scheduling or tuning decisions. The integer can have the value 0, 1, or 2: 0 indicates an EV4, EV45, LCA, or LCA-45 Alpha chip (that is, 21064, 21064A, 21066, 21068, or 21066A); 1 indicates an EV5 Alpha chip (21164); and 2 indicates an EV6 Alpha chip (21264).

No Operation (nop)

Has no effect on the machine state.

Universal No Operation (unop)

Has no effect on the machine state.

Trap Barrier (trapb)

Guarantees that all previous arithmetic instructions are completed, without incurring any arithmetic traps, before any instructions after the trapb instruction are issued.

Exception Barrier (excb)

Guarantees that all previous instructions complete any exception-related behavior or rounding-mode behavior before any instructions after the excb instruction are issued.

Memory Barrier (mb)

Used to serialize access to memory. See the Alpha Architecture Reference Manual for additional information on memory barriers.

Write Memory Barrier (wmb)

Guarantees that all previous store instructions access memory before any store instructions issued after the wmb instruction.
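The amask and implver descriptions in Table 3-17 lend themselves to a short model. In the following Python sketch, amask clears the requested bits that correspond to extensions that are present, so a zero bit in the result means "requested and available". The sketch is illustrative only: the bit positions used in the example are placeholders rather than real extension assignments, and the names amask, present, and IMPLVER_NAMES are hypothetical. The implver table simply restates the values listed above.

```python
MASK64 = (1 << 64) - 1  # helper (not from the manual): keeps values in 64 bits


def amask(requested, present):
    """Clear each requested bit whose architectural extension is present."""
    return requested & ~present & MASK64


# Values returned by implver, as listed in the description above.
IMPLVER_NAMES = {
    0: "EV4, EV45, LCA, or LCA-45",
    1: "EV5 (21164)",
    2: "EV6 (21264)",
}


def has_extensions(requested, present):
    """True if every requested extension is present (amask result is zero)."""
    return amask(requested, present) == 0
```

For instance, requesting placeholder bits 0b111 on a processor model that implements 0b101 leaves 0b010 set, which indicates that the extension for bit 1 is missing.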