Butterfly CPU Core |
2010 Robert Finch |
Documentation Notes
This document refers to a sixteen bit quantity as a character and a thirty-two bit quantity as a word.
The design objectives help place the processor in its proper field of operation. This processor is geared towards implementation in an FPGA. The design is intended to have a small footprint, in terms of both memory required and FPGA resources consumed, and has been designed with the resources of a typical FPGA in mind. The following criteria were used as the basis for the design.
A small memory space, potentially less than 65kB but possibly several megabytes of memory. |
A shared data / code memory space, most likely using external ROM and RAM resources which are most likely limited in operating frequency. |
A narrow bus interface to that memory, sixteen bits expected but possibly only eight bits. |
Targeted specifically at implementation in a small FPGA, but scalable to a larger project |
Using minimal resources with maximum functionality providing a level of functionality suitable for most projects. This includes support for a real operating system, externally generated hardware interrupts, and debugging. |
An easy target for porting existing high level language compilers and assemblers to. |
This design meets the above objectives in the following ways. Two versions of the processor are available (produced from the same source code). One is a strictly sixteen bit design with a sixteen bit wide datapath and an address space limited to 65kc (characters) of memory; the other is a thirty-two bit version with a larger address space. The availability of both sixteen and thirty-two bit versions allows the size of the core to be reduced significantly when only a sixteen bit cpu is required. A sixteen bit instruction size was chosen to minimize the memory footprint required for programs. The instruction set was designed to be scalable (object code similar) from the sixteen bit version to the thirty-two bit version. By using a sixteen bit instruction format, the code size benefits of a sixteen bit processor can be obtained with a thirty-two bit processor.
Because this design need not be especially fast, a simple two-stage pipelined processor was chosen. This provides more than adequate performance for the type of memory system this design is expected to interface to. It is also a non-Harvard architecture: the memory space is shared between code and data to help conserve resources. A reasonably large general purpose register set is available, making the design reasonably compatible with many existing compilers and assemblers. Where needed, additional specialized instructions have been added to the processor to support a sophisticated operating system and interrupt management.
Module Port Interface
module Butterfly(clk, rst, nmi, irq, br, bg, go,
rdy, wr, wr0, wr1, ird, wr_nxt, wr0_nxt, wr1_nxt, addr, addr_nxt, din, dout,
dout_nxt);
Signal | Description |
clk | input connected to system clock |
rst | reset input - resets the processor |
nmi | non-maskable interrupt input - triggers nmi processing |
irq | maskable interrupt input - triggers irq processing |
br | output - bus request - indicates that the processor needs to request the system bus |
bg | input - bus grant - indicates that the system bus is granted to the processor |
go | input - causes execution to continue if a stop opcode was encountered |
rdy | input - external memory is ready |
wr | output - write - indicates a write to memory is taking place |
wr0 | output - write even address - indicates a write to an even memory address is taking place |
wr1 | output - write odd address - indicates that a write to an odd memory address is taking place |
ird | output - instruction read - indicates that an instruction is to be read from the bus |
wr_nxt | output - indicates that the next cycle will be a write cycle |
wr0_nxt | output - indicates that the next cycle will be a write cycle to an even address |
wr1_nxt | output - indicates that the next cycle will be a write cycle to an odd address |
addr | output bus - contains the memory address to be accessed |
addr_nxt | output bus - contains the memory address to be accessed in the next cycle |
din | input bus - data or instruction input |
dout | output bus - data to be written to memory |
dout_nxt | output bus - data to be written to memory in the next cycle |
Size / Performance
421 slices / 750 LUTs / 40MHz
Register Set
General Registers
Code | Register Name | Description | Compiler Use |
0000 | r0 or 0 | this register is always zero (unchangeable) | |
0001 | r1 | general purpose register | first argument register / first return value register |
0010 | r2 | general purpose register | second argument register / second return value register |
0011 | r3 | general purpose register | third argument register |
0100 | r4 | general purpose register | fourth argument register |
0101 | r5 | general purpose register | scratch register |
0110 | r6 | general purpose register | scratch register |
0111 | r7 | general purpose register | scratch register |
1000 | r8 | general purpose register | scratch register |
1001 | r9 | general purpose register | register variable |
1010 | r10 | general purpose register | register variable |
1011 | r11 | general purpose register | register variable |
1100 | r12 | general purpose register | register variable |
1101 | GP / r13 | general purpose register (reserved for use as the global pointer) | |
1110 | SP / r14 | general purpose register (reserved for use as the SP by assembler) | |
1111 | LR / r15 | (Link Register) This register is automatically updated by the CALL instructions. It is normally used as the subroutine link register and should be used for this purpose where possible. | |
Special Purpose Registers
Code | Register Name | Description |
0000 | SR | Status Register |
0001 | ILR | Interrupt Link Register |
0002 | {this register is reserved} | |
0003 | VER | Processor Version - major, minor, revision |
These registers may be accessed using the TRS and TSR commands.
Interrupt Link Register
This register contains the value of the program counter at which an interrupt occurred. It is automatically updated by the interrupt instruction and is used by the interrupt return instruction to restore the original program counter value.
Status Register
The primary need for the manipulation of status register bits is to manage the interrupt state of the processor. Because interrupt management is required, this processor provides specific instructions (EI, DI, RI) for manipulation of the interrupt state without affecting the remaining flags stored in the status register.
There is little reason to manipulate the status register other than to save it to or restore it from the stack frame, so the details of its internal layout are subject to change without notice. The status register is sixteen bits (a single character) and split into two halves. The lower portion contains the working copy of the flags and the upper portion contains a backup copy of the flags made when an interrupt occurs. Occurrence of an interrupt automatically copies the working flags to the backup copy, and sets the interrupt mask in the working copy. Execution of an interrupt return instruction (RTI) automatically copies the backup flags back into the working flags.
The status register may be accessed using the TRS and TSR commands.
SR (working version) - format of the backup version contained in bits 8 to 15 is identical.
I | {resv} | {resv} | {resv} | N | V | C | Z |
7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
Flag | Description |
Z | zero - set if the result of an operation is zero |
C | carry - set if there is a carry (or borrow) from an add or subtract instruction; or the bit shifted out from a shift instruction |
V | overflow - set if there is a signed overflow on an add or subtract instruction |
N | negative - set if the result of an operation is negative |
I | interrupt mask - if set to one masks (disables interrupts) |
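The backup and restore behaviour of the status register can be sketched as follows. This is a minimal model, not the core's actual logic: the function names are hypothetical, the bit positions are taken from the SR layout table above, and leaving the backup half unchanged on return is an assumption.

```python
# Bit positions from the SR layout table (working copy, bits 0 to 7).
Z_BIT, C_BIT, V_BIT, N_BIT, I_BIT = 0, 1, 2, 3, 7


def enter_interrupt(sr: int) -> int:
    """Copy the working flags to the backup half, then set the interrupt mask."""
    working = sr & 0x00FF
    return (working << 8) | working | (1 << I_BIT)


def return_from_interrupt(sr: int) -> int:
    """Copy the backup flags into the working half, as RTI is described to do.

    Keeping the backup half as-is is an assumption of this sketch.
    """
    backup = (sr >> 8) & 0x00FF
    return (sr & 0xFF00) | backup
```

For example, entering an interrupt with V and Z set (`0x0005`) yields `0x0585`: the backup half holds the old flags and the working half has the I bit set as well.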
Version Register
This register contains the major, minor, and revision numbers of the processor. This register is read only.
31..24 | 23..16 | 15..8 | 7..0 |
Major | Minor | Revision | {Reserved} |
1 | 0 | 1 | 0 |
Addition / Subtraction / Comparison
Basic addition (ADD), subtraction (SUB) and comparison (CMP) operations are supported. The 'add' instruction has a three operand immediate form which is also used as the immediate form for subtract and compare (A - B = A + (-B)). An additional instruction NEG is provided which is an alternate form of the SUBR instruction. Add, subtract and compare instructions set the carry and overflow flags in the status register on unsigned and signed overflow respectively. Also the negative and zero flags are set based on the result of the operation. In order to support extended precision operations add with carry (ADC) and subtract with carry (SBC) instructions are provided.
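As an illustration of the flag behaviour described above, the following sketch models a sixteen bit add and subtract. The function names are hypothetical, and the carry flag is modelled as a borrow on subtraction, following the flag description in this document; the real core may differ in detail.

```python
MASK16 = 0xFFFF
SIGN16 = 0x8000


def add16(a: int, b: int, carry_in: int = 0):
    """Return (result, flags) for a 16 bit add; carry_in models ADC."""
    a &= MASK16
    b &= MASK16
    total = a + b + carry_in
    result = total & MASK16
    flags = {
        "z": result == 0,
        "c": total > MASK16,  # unsigned overflow
        # signed overflow: operands share a sign that differs from the result
        "v": bool(~(a ^ b) & (a ^ result) & SIGN16),
        "n": bool(result & SIGN16),
    }
    return result, flags


def sub16(a: int, b: int):
    """Return (result, flags) for a 16 bit subtract; c is set on borrow."""
    a &= MASK16
    b &= MASK16
    diff = a - b
    result = diff & MASK16
    flags = {
        "z": result == 0,
        "c": diff < 0,  # borrow
        # signed overflow: operands differ in sign and the result sign differs from a
        "v": bool((a ^ b) & (a ^ result) & SIGN16),
        "n": bool(result & SIGN16),
    }
    return result, flags
```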
The processor supports a standard set of logical operations including and (AND), or (inclusive)(OR) and exclusive or (XOR). An additional derived operation is NOT which is an alternate form of the XOR instruction. The negative and zero flags are set based on the result of the operation.
The cpu supports a full complement of shift instructions. Left shifts are supported via the SHL instruction. Right shifts are supported with arithmetic (ASR) and logical (SHR) shifts. Rotates are supported via the rotate left ROL and rotate right ROR instructions. The negative and zero flags are set based on the result of the operation. The carry flag is set to the value of the bit shifted out.
Program Flow Control
Branches
Branches are the most frequent form of program flow control operation and are usually performed in a conditional manner. The processor allows a nine bit branch displacement, which covers nearly all branch cases. A standard set of branch conditions is provided, outlined in the table below. Additionally a branch to subroutine (program counter relative call) instruction is provided. The branch to subroutine instruction automatically stores the return address in the link register (r15).
Branch conditions are based on the following flags that are maintained in the status register: z, v, c, and n.
Flags
Flag | Operation |
z | set when result is zero |
v | set on signed overflow of result |
n | set if result is negative |
c | set if carry (on add) or borrow (on sub) |
Branch Conditions
Code | Mnemonic | Description | Conditional Test |
0 | BLT | Branch if Less Than | n ^ v |
1 | BGE | Branch if Greater than or Equal | !(n ^ v) |
2 | BLE | Branch if Less than or Equal | (n^v) | z |
3 | BGT | Branch if Greater Than | !((n^v) | z) |
4 | BLTU | Branch if Less Than (Unsigned) | c |
5 | BGEU | Branch if Greater than or Equal (Unsigned) | !c |
6 | BLEU | Branch if Less than or Equal (Unsigned) | c | z |
7 | BGTU | Branch if Greater Than (Unsigned) | !(c | z) |
8 | BEQ | Branch if EQual | z |
9 | BNE | Branch if Not Equal | !z |
A | BMI | Branch if MInus | n |
B | BPL | Branch if PLus | !n |
C | {reserved} | reserved | |
D | {reserved} | reserved | |
E | BRA | BRanch Always | 1 |
F | BSR | Branch to SubRoutine (relative call) | 1 |
Jumps
Jumps (JMP) are not frequently used in program code so there is minimal support for this operation. Jumps are implemented as a specific case of the jump-and-link (JAL) instruction which is normally used for subroutine calls. By specifying r0 as the register to save the program counter in, a jump operation can be performed because the save of the program counter normally associated with the JAL instruction is nullified.
Subroutine Calls
Subroutine calls may be performed via the CALL, JAL, and BSR instructions. One of the problems with a fixed format sixteen bit instruction encoding is that subroutine call operations are complicated. There are simply not enough bits available in the opcode to directly support a subroutine address. Hence the provision of several subroutine call instructions to suit different needs.
JAL
The JAL instruction is a simple, powerful instruction common in many newer architectures that provides for most common program flow transfer operations that are not covered by branches. It is a generic unconditional program flow control instruction; as such it is very versatile, but in many circumstances it results in bloated code. This single instruction can perform regular jump operations, subroutine call operations using absolute or register indirect addresses, and a return from subroutine operation. The default form of the JAL instruction where the return address is stored in the link register (r15) is referred to as CALL.
BSR
BSR is a simple adaptation of the regular conditional branch instruction. While it has limited usefulness, most of the hardware required to implement the instruction is already present in the processor. It provides the ability to call subroutines using a program counter relative address, allowing position independent code. The distance this call can branch is severely limited.
CALL
The CALL instruction supports jumping to a subroutine within the current 64k program bank. The upper sixteen bits of the program counter remain stable while the lower sixteen bits are loaded with a constant formed from the lowest twelve bits of the opcode shifted left by four bits. The target address must therefore be at a multiple of sixteen bytes. The impetus for this instruction is that it is much shorter than the absolute address subroutine call formed using the JAL instruction. On average eight bytes of memory may end up being wasted in order to align subroutines on a sixteen byte memory boundary. However, every time this shortened form of subroutine call is used it saves two characters (four bytes) of memory and two cpu cycles over using the JAL instruction. A subroutine need only be called from four different places before this mechanism starts conserving memory even given the alignment requirements. With clever programming, fewer than eight bytes may be wasted for alignment (for example: the memory could be used to store constants).
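The target address calculation and the memory trade-off described above can be sketched as follows. The function names are hypothetical, and the worst-case padding of 14 bytes is an assumption derived from instructions being character (two byte) aligned; the opcode's top nibble is simply masked off because this section does not state it.

```python
def call_target(pc: int, opcode: int) -> int:
    """Keep the upper 16 bits of pc; load the lower 16 from opcode bits 11..0 << 4."""
    return (pc & 0xFFFF0000) | ((opcode & 0x0FFF) << 4)


def net_bytes_saved(call_sites: int, padding_bytes: int = 8,
                    saved_per_call: int = 4) -> int:
    """Bytes saved by short calls minus the bytes spent aligning the subroutine."""
    return call_sites * saved_per_call - padding_bytes
```

With the average padding of eight bytes the short form breaks even at two call sites. With the assumed worst-case padding of 14 bytes, four call sites are needed before memory is saved, which matches the figure quoted above.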
Interrupts
This processor supports interrupts in the form of a system call. When a hardware interrupt occurs the appropriate system call instruction is forced into the processor's internal instruction pipeline (this is accomplished by hardware internal to the processor). There are several pre-defined system calls to specific addresses in order to support hardware interrupts as listed in the table below. A system call jumps to the interrupt subroutine whose address is encoded directly in the system call instruction. The RTI instruction is used to return from a system call. The interrupt subroutine begins at the address specified in the system call. The pre-defined system call addresses are spaced every four characters in order to allow an extended jump instruction to the actual routine to be performed. If desired other program code or data may be placed in these locations. Addresses above FFFF_FF80 should not be used as these are reserved for future use.
Occurrence of an interrupt results in the status register being copied to the back up version, the maskable interrupt mask being set to disable further interrupts, and the interrupt return address being copied to ILR, so that these may be restored by the interrupt return operation.
Opcode character "0000" (the break instruction) was purposely chosen so that an interrupt is triggered in the event that the processor attempts to execute null code.
System Call Address | Vector Number | Vector Type / Cause | Instruction Mnemonic | Comment |
FFFF_FFF8 | 15 | reset | RST | will be called automatically during hardware reset |
FFFF_FFF0 | 14 | non-maskable interrupt | NMI | will be automatically called during hardware interrupt |
FFFF_FFE8 | 13 | maskable interrupt | IRQ | will be automatically called during hardware interrupt |
FFFF_FFE0 | 12 | debug interrupt | DBG | reserved for use with hardware debugging |
FFFF_FFD8 | 11 | trace | TRC | single step |
| 2-10 | {reserved} | | |
FFFF_FF88 | 1 | software interrupt | SYS | system call - meant to be used to call OS system functions |
FFFF_FF80 | 0 | break | BRK | called automatically when a null character is found in the instruction stream |
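The addresses in the table follow a regular pattern: each vector sits four characters (eight bytes) above the previous one, starting at FFFF_FF80. A minimal sketch of that relationship, with hypothetical names:

```python
VECTOR_BASE = 0xFFFF_FF80   # address of vector 0 (break)
VECTOR_SPACING = 8          # four characters of two bytes each


def vector_address(vector_number: int) -> int:
    """Return the system call address for a vector number in the range 0..15."""
    if not 0 <= vector_number <= 15:
        raise ValueError("vector number must be in the range 0..15")
    return VECTOR_BASE + vector_number * VECTOR_SPACING
```

For example, `vector_address(13)` gives `0xFFFFFFE8`, the maskable interrupt entry.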
Memory is byte addressable; character and word accesses must be aligned on character address boundaries. Instructions must be character aligned.
This is a load / store architecture; the only operations accessing memory are load and store operations. The processor supports byte (8 bit), character (16 bit) and word (32 bit) loads and stores of data to memory (LB, SB, LC, LW, SC, SW). Byte and character values loaded are sign extended to the width of the implementation when loaded.
Ideally the processor should be able to support three operand instructions (one destination, two sources) including cases where all three operands are registers in order to allow a compiler to allocate registers most efficiently. This is commensurate with current compiler technology. However, from a practicality standpoint it is simpler and smaller (and thus faster) to use an instruction set that supports only two register operands at once within an FPGA. This does not degrade the performance of the processor to a significant degree as fewer than 1/4 of the instructions executed actually require three register operands, and of those not all will use three different registers. Supporting a register file with three independent ports would consume over twice as many resources for the register file as supporting a register file with two ports in the typical FPGA.
This is a sixteen register design. Limiting the register set to sixteen registers allows a compact instruction format and is consistent with the efficient use of resources within an FPGA.
The opcode format used here is a fixed sixteen bit format with one or two optional additional characters (16 or 32 bits) containing an extended immediate value. The mechanism used to encode immediate values in the instruction stream is somewhat convoluted. It stems from the desire to encode thirty-two bit immediate constants in the instruction stream without using an intermediate register or extra instructions, without lengthening the processor pipeline, without causing unnecessary stalls, and without needlessly wasting program space. The constant prefix instruction supplies twelve (bits four through fifteen) or twenty-eight additional bits for the constant when the four bit constant value encoded within an instruction is not sufficient. With the use of the constant prefix instruction both sixteen bit (sign extended) and thirty-two bit immediates are supported. The constant prefix and the following one or two characters execute in an interlocked fashion so that no interrupt can occur between them.
The four bit constant field allows encoding a constant between -4 and +11 directly in the opcode. Constants outside of this range must make use of the constant prefix instruction. This range was chosen as an optimum use of the four available bits in the opcode. The vast majority of program constants are positive; displacements used in addressing are almost always positive. Skewing the constant range to allow more positive values allows a larger number of small structure offsets to be encoded without resorting to using a constant prefix instruction.
Four bit Constant Encoding
Bit Pattern | Constant Encoded |
0 | 0 |
1 | 1 |
2 | 2 |
3 | 3 |
4 | 4 |
5 | 5 |
6 | 6 |
7 | 7 |
8 | 8 |
9 | 9 |
A | 10 |
B | 11 |
C | -4 |
D | -3 |
E | -2 |
F | -1 |
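The mapping in the table can be expressed compactly: patterns 0 through B encode themselves, and patterns C through F encode -4 through -1. The helper names below are hypothetical.

```python
def decode_const4(bits: int) -> int:
    """Decode the skewed four bit constant field into a value from -4 to +11."""
    bits &= 0xF
    return bits - 16 if bits >= 0xC else bits


def encode_const4(value: int):
    """Return the four bit pattern for value, or None if a constant prefix is needed."""
    if not -4 <= value <= 11:
        return None
    return value & 0xF
```

An assembler could use `encode_const4` to decide whether a constant prefix instruction has to be emitted: a `None` result means the constant does not fit in the opcode.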
Basic Opcode Formats
op4 | op12 | misc | misc / reserved | |||
op4 | rd4 | rs4 | op4 | rr | register-register | |
op4 | rd4 | op4 | imm4 | rc | register-constant | |
op4 | const12 | cp | constant prefix | |||
op3 | disp1 | cond4 | disp8 | br | branch | |
op4 | rd4 | rs4 | const4 | rrc | register-register-constant |
More Detailed Opcode Formats
op4 | op12 | inst. | description | |||||
0 | op12 | misc | misc. | brk, nop, end, stp, cli, sei, rei, sys | ||||
1 | rd4 | rs4 | imm4 | ADD | add immediate | add, sub, cmp | ||
2 | rd4 | rs4 | op4 | rr | register-register operate | add, sub, cmp, and, or, xor, shl, shr, rol, ror, asr | ||
3 | rd4 | op4 | imm4 | ri | register-immediate operate | and, or, xor, subr, cmpr, shl, shr, rol, ror, asr, tsr, trs | ||
4 | const12 | CP | constant prefix | cpr | signal constant bits 4 to 15 | |||
5 | const12 | CP | constant prefix | cpr | signal constant bits 4 to 31 | |||
6 | {res} | reserved opcodes | ||||||
7 | {res} | reserved opcodes | ||||||
Program Flow Control | ||||||||
8 | rd4 | rs4 | disp4 | JAL | jump and link | jal, jmp, call, ret, rti | ||
9 | ||||||||
A/B | disp1 | cond4 | disp8 | Bcc | conditional branch | beq, bne, blt, ble, bmi, bra, bge, bgt, bpl, bsr | ||
Memory Operations | ||||||||
C | rd4 | rs4 | disp4 | SB | store byte | sb | ||
D | rd4 | rs4 | disp3 | op1 | SC / SW | store character or word | sc sw | |
E | rd4 | rs4 | disp4 | LB | load byte | lb | ||
F | rd4 | rs4 | disp3 | op1 | LC / LW | load character or word | lc lw |
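As a rough illustration of how the top nibble selects the instruction group, the sketch below splits an opcode character into its four nibbles and looks the first one up. The group names are transcribed from the table above; the function names are hypothetical, and opcode 9, which the table leaves blank, is reported as "not listed".

```python
# Major opcode groups keyed by bits 15..12, transcribed from the table above.
GROUPS = {
    0x0: "misc",
    0x1: "add immediate",
    0x2: "register-register operate",
    0x3: "register-immediate operate",
    0x4: "constant prefix (bits 4 to 15)",
    0x5: "constant prefix (bits 4 to 31)",
    0x6: "reserved",
    0x7: "reserved",
    0x8: "jump and link",
    0xA: "conditional branch",
    0xB: "conditional branch",
    0xC: "store byte",
    0xD: "store character or word",
    0xE: "load byte",
    0xF: "load character or word",
}


def split_fields(opcode: int):
    """Split a 16 bit opcode into its four nibbles, most significant first."""
    opcode &= 0xFFFF
    return (opcode >> 12) & 0xF, (opcode >> 8) & 0xF, (opcode >> 4) & 0xF, opcode & 0xF


def classify(opcode: int) -> str:
    """Name the instruction group selected by the top nibble."""
    return GROUPS.get(split_fields(opcode)[0], "not listed")
```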
Memory Layout
Normal instruction
op12 | const3..0 |
16 bit sign extended constant format
44 | const15..4 |
op12 | const3..0 |
32 bit constant format
54 | const15..4 |
const31..16 |
op12 | const3..0 |
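The layouts above suggest how an immediate is assembled from a constant prefix and the following instruction. The sketch below is one reading of those layouts, with hypothetical names. It assumes that when a prefix is present the low four bits of the instruction are used verbatim as const3..0, and that bit 12 of the prefix is the size bit as described for the CP instruction.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def prefixed_constant(prefix: int, low4: int, ext=None) -> int:
    """Build the immediate for the instruction that follows a constant prefix.

    prefix: the constant prefix character (top nibble 4 or 5)
    low4:   const3..0 taken from the following instruction
    ext:    the extra character holding const31..16 (32 bit form only)
    """
    value = ((prefix & 0x0FFF) << 4) | (low4 & 0xF)
    if (prefix >> 12) & 1:  # size bit set: 32 bit constant
        if ext is None:
            raise ValueError("32 bit form needs the const31..16 character")
        value |= (ext & 0xFFFF) << 16
        return sign_extend(value, 32)
    return sign_extend(value, 16)  # 16 bit form is sign extended
```

For example, a prefix of `0x4123` followed by an instruction whose constant field is `4` yields the constant `0x1234`.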
Category | Instructions |
Arithmetic / Logical operations | ADC, ADD, AND, ASR, CMP, NEG, NOT, OR, ROL, ROR, SBC, SHL, SHR, SUB, SUBR, XOR, SXB, SXC, ZXB, ZXC |
Subroutine Calls / Jumps | BSR, CALL, JAL, JMP, RET |
Memory Operations | LB, LC, LEA, LW, SB, SC, SW |
Miscellaneous | CP, NOP, TRS, TSR, STOP |
Special | END |
Branches | BEQ, BGE, BGEU, BGT, BGTU, BLE, BLEU, BLT, BLTU, BMI, BNE, BPL, BRA, BSR |
Interrupt Management | BRK, DI, EI, IRQ, NMI, RI, RESET, RTI, SYS |
ADC Rd,Rs | |
ADC Rd,#n |
Synopsis
Arithmetic 'add' with carry register with register or immediate.
Detail
Rd = Rd + Rs + c
desc: | op | Rd | Rs | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 2 | R3..0 | R3..0 | 1 |
Rd = Rd + #imm + c
desc: | op | Rd | op | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 3 | R3..0 | 1 | n3..0 |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | X | X | X |
ADD Rd,Rs | |
ADD Rd,Rs,#n |
Synopsis
Arithmetic 'add' register with register or immediate.
Detail
Rd = Rd + Rs
desc: | op | Rd | Rs | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 2 | R3..0 | R3..0 | 0 |
Rd = Rs + #imm
desc: | op | Rd | Rs | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 1 | R3..0 | R3..0 | n3..0 |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | X | X | X |
AND Rd,Rs |
AND Rd,#n |
Synopsis
Logically 'and' register with register or immediate.
Detail
Rd = Rd & Rs
desc: | op | Rd | Rs | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 2 | R3..0 | R3..0 | 5 |
Rd = Rd & #imm
desc: | op | Rd | op | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 3 | R3..0 | 5 | n3..0 |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | - | - | X |
ASR Rd,#1 |
Synopsis
Arithmetically shift register right by one bit.
Detail
Rd = Rd >> 1; c = Rd[0]
The sign bit of the register is preserved during the shift.
desc: | op | rd | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 3 | R3..0 | B | 1 |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | - | X | X |
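The operation can be modelled as below. This is a sketch with a hypothetical function name; the register width defaults to sixteen bits and would be thirty-two for the wider version of the core.

```python
def asr1(value: int, width: int = 16):
    """Arithmetic shift right by one bit; returns (result, carry)."""
    mask = (1 << width) - 1
    value &= mask
    carry = value & 1                  # bit shifted out goes to the carry flag
    sign = value & (1 << (width - 1))  # sign bit is preserved
    return (value >> 1) | sign, carry
```

For example, `asr1(0x8001)` returns `(0xC000, 1)`: the sign bit is kept and the bit shifted out lands in the carry.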
Bcc label
Synopsis
Branch to label if condition is true
Detail
if cond then
pc = pc + disp
desc: | op | disp | cond. | disp. |
size: | 3 | 1 | 4 | 8 |
bits: | 15 13 | 12 | 11 8 | 7 0 |
bit pattern: | 101b | d8 | c3..0 | d7..0 |
Branch conditions are based on the following flags that are maintained in the status register: c, z, v, and n.
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | - | - | - | - |
Flags
Flag | Operation |
z | set when result is zero |
v | set on signed overflow of result |
n | set if result is negative |
c | set if carry (on add) or borrow (on sub) |
Branch Conditions
Code | Mnemonic | Description | Conditional Test |
0 | BLT | Branch if Less Than | n ^ v |
1 | BGE | Branch if Greater than or Equal | !(n ^ v) |
2 | BLE | Branch if Less than or Equal | (n^v) | z |
3 | BGT | Branch if Greater Than | !((n^v) | z) |
4 | BLTU | Branch if Less Than (Unsigned) | c |
5 | BGEU | Branch if Greater than or Equal (Unsigned) | !c |
6 | BLEU | Branch if Less than or Equal (Unsigned) | c | z |
7 | BGTU | Branch if Greater Than (Unsigned) | !(c | z) |
8 | BEQ | Branch if EQual | z |
9 | BNE | Branch if Not Equal | !z |
A | BMI | Branch if MInus | n |
B | BPL | Branch if PLus | !n |
C | {reserved} | reserved | |
D | {reserved} | reserved | |
E | BRA | BRanch Always | 1 |
F | BSR | Branch to SubRoutine (relative call) | 1 |
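The condition tests in the table above, together with the nine bit displacement field, can be sketched as follows. The function names are hypothetical. The displacement is returned as a raw signed value; whether the hardware scales it (for example by the character size) before adding it to the program counter is not modelled here.

```python
def branch_taken(cond: int, n: bool, v: bool, c: bool, z: bool) -> bool:
    """Evaluate a four bit condition code against the status flags."""
    tests = {
        0x0: n ^ v,                # BLT
        0x1: not (n ^ v),          # BGE
        0x2: (n ^ v) or z,         # BLE
        0x3: not ((n ^ v) or z),   # BGT
        0x4: c,                    # BLTU
        0x5: not c,                # BGEU
        0x6: c or z,               # BLEU
        0x7: not (c or z),         # BGTU
        0x8: z,                    # BEQ
        0x9: not z,                # BNE
        0xA: n,                    # BMI
        0xB: not n,                # BPL
        0xE: True,                 # BRA
        0xF: True,                 # BSR
    }
    if cond not in tests:
        raise ValueError("reserved condition code")
    return bool(tests[cond])


def branch_displacement(opcode: int) -> int:
    """Extract d8 (bit 12) and d7..0 (bits 7..0) and sign extend from nine bits."""
    disp9 = (((opcode >> 12) & 1) << 8) | (opcode & 0xFF)
    return disp9 - 0x200 if disp9 & 0x100 else disp9
```

After comparing 1 with 2, a borrow occurs, so c is set and z is clear: `branch_taken(0x4, n=True, v=False, c=True, z=False)` is true (BLTU) while condition `0x7` (BGTU) is false.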
BRK |
Synopsis
Run break routine.
Detail
ILR = pc; pc = FFFF_FF80; flags.backup = flags; flags.im = 1
desc: | opcode |
size: | 16 |
bit pattern: | 0000 |
This instruction causes the processor to perform a software initiated 'break' interrupt routine. It is purposely defined as a zero character so the processor will execute the break routine in the event that code flows into a region of memory that has been nulled out. This helps promote reliable system operation.
Flags Affected
I | N | V | C | Z | |||
1 | - | - | - | - | - | - | - |
CALL d[Rs] | |
CALL d9[pc] |
Synopsis
Call subroutine
Detail
r15 = pc; pc = disp + Rs;
Call subroutine using register indirect with displacement mode. This is an alternate form of the JAL instruction.
desc: | op | Rd | Rs | disp. |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 8 | F | R3..0 | n3..0 |
r15 = pc; pc = pc + disp;
Call subroutine using program counter relative form. The nine bit address is sign extended to 32 bits before being used. This allows subroutine calls to memory within the first 256 or last 256 characters relative to the current program address.
desc: | op | disp | cond | disp |
size: | 3 | 1 | 4 | 8 |
bits: | 15 13 | 12 | 11 8 | 7 0 |
bit pattern: | 101b | d8 | F | d7..0 |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | - | - | - | - |
CMP Rd,Rs |
CMP Rs,#n |
Synopsis
Compare register with register or immediate.
Detail
flags = Rd - Rs
desc: | op | rd | rs | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 2 | R3..0 | R3..0 | D |
flags = Rs - #imm
This is really the add instruction with the immediate constant automatically negated by the assembler. Because it is really an add instruction, the immediate constant is limited to the range from the negative of the largest encodable integer to the negative of the smallest encodable integer.
desc: | op | Rd | Rs | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 1 | 0 | R3..0 | n3..0 |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | X | X | X |
CP #imm |
Synopsis
Constant prefix
desc: | op | size | imm |
size: | 3 | 1 | 12 |
bits: | 15 13 | 12 | 11 0 |
bit pattern: | 010b | sz | n11..0 |
The upper bits of the immediate constant are set for the next instruction, overriding sign extension of the immediate. The constant prefix instruction indicates the presence of an additional twelve or twenty-eight constant bits in the instruction stream. If the size bit is set to one, then the next instruction character contains bits 16 to 31 of the constant for the following instruction. If the size bit is zero, then the constant prefix only includes bits 4 through 15 which are sign extended to produce the constant for the next instruction. Interrupts are prevented from occurring between this instruction and the next instruction. Normally this instruction is automatically inserted by the assembler wherever an extended constant value is required.
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | - | - | - | - |
DI |
Synopsis
Disable interrupts.
Detail
flags.im = 1
desc: | opcode |
size: | 16 |
bit pattern: | 0011 |
This instruction disables maskable hardware interrupts by setting the interrupt mask in the status register. It also disables the IRQ instruction.
Flags Affected
I | N | V | C | Z | |||
1 | - | - | - | - | - | - | - |
EI |
Synopsis
Enable interrupts.
Detail
flags.im = 0
desc: | op |
size: | 16 |
bit pattern: | 0010 |
This instruction enables maskable hardware interrupts by clearing the interrupt mask in the status register.
Flags Affected
I | N | V | C | Z | |||
0 | - | - | - | - | - | - | - |
END |
Synopsis
No operation.
Detail
desc: | op |
size: | 16 |
bit pattern: | 0022 |
This instruction is provided for use in a software emulator of the processor. It indicates the end of a sequence of instructions to emulate. The processor will treat this instruction as a NOP instruction.
Flags Affected
I | CE | N | V | C | Z | ||
- | - | - | - | - | - | - | - |
IRQ |
Synopsis
Run irq routine.
Detail
ILR = pc; pc = FFFF_FFE8; flags.backup = flags; flags.im = 1
desc: | op |
size: | 16 |
bit pattern: | 000D |
This instruction causes the processor to perform a software initiated maskable interrupt routine. It has the same effect as an external hardware maskable interrupt. If the interrupt mask in the status register is set, then this instruction will be ignored.
Flags Affected
I | N | V | C | Z | |||
1 | - | - | - | - | - | - | - |
JAL Rd,d[Rs] |
Synopsis
Jump and link to subroutine
Detail
Rd = pc; pc = disp + Rs;
Jump to subroutine using register indirect with displacement mode. The current value of the program counter is stored in the destination register. Normally this instruction will be used with a constant prefix.
desc: | op | Rd | Rs | disp. |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 8 | R3..0 | R3..0 | n3..0 |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | - | - | - | - |
JMP d[Rs] |
Synopsis
Jump to target
Detail
pc = disp + Rs;
Jump to target code using register indirect with displacement mode. This is an alternate form of the JAL instruction.
desc: | op | Rd | Rs | disp. |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 8 | 0 | R3..0 | n3..0 |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | - | - | - | - |
LB Rd,d[Rs] |
Synopsis
Load register byte from memory
Detail
Rd = memory [Rs + disp]
desc: | op | rd | rs | disp |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | E | R3..0 | R3..0 | d3..0 |
The byte loaded from memory is sign extended to the register width.
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | - | - | X |
LC Rd,d[Rs] |
Synopsis
Load register character from memory
Detail
Rd = memory [Rs + disp]
desc: | op | rd | rs | disp | op |
size: | 4 | 4 | 4 | 3 | 1 |
bits: | 15 12 | 11 8 | 7 4 | 3 1 | 0 |
bit pattern: | F | R3..0 | R3..0 | d3..1 | 0 |
The character loaded from memory is sign extended to the register width. Loads must be character aligned.
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | - | - | X |
LEA Rd,d[Rs] |
Synopsis
Load effective address.
Detail
Rd = Rs + disp
desc: | op | Rd | Rs | disp. |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 1 | R3..0 | R3..0 | d3..0 |
The effective address is loaded into the target register. This instruction is an alternate form of the ADD instruction.
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | X | X | X |
LW Rd,d[Rs] |
Synopsis
Load register word from memory
Detail
Rd = memory [Rs + disp]
desc: | op | rd | rs | disp | op |
size: | 4 | 4 | 4 | 3 | 1 |
bits: | 15 12 | 11 8 | 7 4 | 3 1 | 0 |
bit pattern: | F | R3..0 | R3..0 | d3..1 | 1 |
Note the format of the displacement field. D0 is forced to zero because word accesses must be character aligned.
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | - | - | X |
NEG Rd |
Synopsis
Take twos complement of register.
Detail
Rd = -Rd
desc: | op | rd | op | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 3 | R3..0 | 2 | 0 |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | X | X | X |
Synopsis
Run nmi routine.
Detail
ILR = pc; pc = FFFF_FFF0; flags.backup = flags; flags.im = 1
desc: | op |
size: | 16 |
bit pattern: | 000E |
This instruction causes the processor to perform a software initiated non-maskable interrupt routine. It has the same effect as an external hardware non-maskable interrupt.
Flags Affected
I | N | V | C | Z | |||
1 | - | - | - | - | - | - | - |
NOP |
Synopsis
No operation
Detail
desc: | op |
size: | 16 |
bit pattern: | 0020 |
This instruction acts merely as a placeholder. It performs no operation and has no effect on the processor. Many processors lack an explicit NOP, so different instructions end up being used for this purpose within the same processor, often instructions with side effects on the processor status flags. Providing an explicit NOP gives programs some consistency, and this NOP affects no flags.
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | - | - | - | - |
NOT Rd |
Synopsis
Take one's complement of register.
Detail
Rd = ~Rd
desc: | op | rd | op | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 3 | R3..0 | 4 | F |
This is really an alternate form of the XOR instruction.
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | - | - | X |
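The NOT encoding is the XOR immediate encoding with the immediate field set to F. This only yields a one's complement if the four-bit immediate is sign extended to all ones, which is an assumption of the sketch below rather than something stated in this section. The helper names and the fixed 32-bit width are likewise hypothetical.

```python
WIDTH = 32                    # assumption: thirty-two bit version of the core
MASK = (1 << WIDTH) - 1


def sign_extend4(n):
    """Interpret a 4-bit field as a two's complement value (-8..7)."""
    n &= 0xF
    return n - 16 if n & 0x8 else n


def xor_imm(rd, imm4):
    """Model XOR Rd,#n with a sign-extended 4-bit immediate (assumed)."""
    return (rd ^ sign_extend4(imm4)) & MASK


def not_reg(rd):
    """Model NOT Rd as a plain one's complement."""
    return ~rd & MASK


# With the immediate field set to F the two operations agree.
assert xor_imm(0x1234, 0xF) == not_reg(0x1234)
```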
OR Rd,Rs |
OR Rd,#n |
Synopsis
Logically inclusively 'or' register with register or immediate.
Detail
Rd = Rd | Rs
desc: | op | rd | rs | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 2 | R3..0 | R3..0 | 6 |
Rd = Rd | #imm
desc: | op | rd | op | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 3 | R3..0 | 6 | n3..0 |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | - | - | X |
RI |
Synopsis
Restore interrupt flag from backup copy.
Detail
flags.im = flags.backup.im
desc: | op |
size: | 16 |
bit pattern: | 0022 |
This instruction restores the previous interrupt flag state from the backup copy of the status register. This allows restoring the interrupt state that was present when the backup of the status register was made. This is useful in operating system code where interrupts must be disabled to perform certain system functions (like updating system lists) and then restored after performing the operation.
Flags Affected
I | N | V | C | Z | |||
X | - | - | - | - | - | - | - |
Synopsis
Run reset routine.
Detail
ILR = pc; pc = FFFF_FFF8; flags.backup = flags; flags.im = 1
desc: | op |
size: | 16 |
bit pattern: | 000F |
This instruction causes the processor to perform a software initiated reset routine.
Flags Affected
I | CE | N | V | C | Z | ||
1 | - | - | - | - | - | - | - |
ROL Rd,#1 |
Synopsis
Rotate register left by one bit.
Detail
Rd = Rd << 1; c = Rd.msb
desc: | op | rd | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 3 | R3..0 | 9 | 1 |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | - | X | X |
ROR Rd,#1 |
Synopsis
Rotate register right by one bit.
Detail
Rd = Rd >> 1; c = Rd.lsb
desc: | op | rd | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 3 | R3..0 | C | 1 |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | - | X | X |
RET |
Synopsis
Return from subroutine.
Detail
pc = r15
desc: | op | rd | rs | disp. |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 4 | 0 | F | 0 |
This instruction returns to the calling routine by loading the program counter with the contents of the link register. This is an alternate form of the jump-and-link (JAL) instruction. It is possible to return to a point in the program after the subroutine call by specifying a positive displacement instead of zero. This allows constant parameters to a subroutine call to be placed directly in code immediately after the calling instruction.
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | - | - | - | - |
RTI |
Synopsis
Return from interrupt subroutine.
Detail
pc = ILR; flags = backup flags
desc: | op | imm |
size: | 12 | 4 |
bits: | 15 4 | 3 0 |
bit pattern: | 004 | n3..0 |
This instruction returns from an interrupt routine by restoring the flag register and jumping back to the code that was interrupted (whose address is stored in ILR). A displacement may be added to the return address to allow a return to code beyond the original calling point (this allows inline parameter passing). The displacement should be set to zero for hardware triggered interrupt routines.
Flags Affected
I | N | V | C | Z | |||
X | - | - | - | X | X | X | X |
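The interrupt entry sequence given in the Detail lines of the software interrupt instructions, and the matching return performed by RTI, can be modelled in a few lines. This is a minimal Python sketch and not the core's implementation: the `Cpu` and `Flags` classes are hypothetical, the N, V, C and Z flags are packed into one integer for brevity, and the RTI displacement is assumed to be added to ILR unscaled.

```python
from dataclasses import dataclass, field, replace

NMI_VECTOR = 0xFFFF_FFF0  # vector address from the NMI Detail line


@dataclass(frozen=True)
class Flags:
    im: bool = False  # interrupt mask
    nvcz: int = 0     # N, V, C, Z packed arbitrarily for this sketch


@dataclass
class Cpu:
    pc: int = 0
    ilr: int = 0
    flags: Flags = field(default_factory=Flags)
    backup: Flags = field(default_factory=Flags)


def enter_interrupt(cpu, vector):
    """ILR = pc; pc = vector; flags.backup = flags; flags.im = 1"""
    cpu.ilr = cpu.pc
    cpu.pc = vector
    cpu.backup = cpu.flags
    cpu.flags = replace(cpu.flags, im=True)


def rti(cpu, disp=0):
    """pc = ILR (plus an optional displacement); flags = backup flags"""
    cpu.pc = cpu.ilr + disp  # assumption: displacement is not scaled
    cpu.flags = cpu.backup


cpu = Cpu(pc=0x100, flags=Flags(im=False, nvcz=0b0101))
enter_interrupt(cpu, NMI_VECTOR)
rti(cpu)
assert cpu.pc == 0x100 and cpu.flags == Flags(im=False, nvcz=0b0101)
```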
SB Rd,d[Rs] |
Synopsis
Store byte from register to memory
Detail
memory [Rs + disp] = Rd
desc: | op | rd | rs | disp |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | C | R3..0 | R3..0 | d3..0 |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | - | - | - | - |
SBC Rd,Rs |
SBC Rd,#n |
Synopsis
Arithmetic 'subtract' with carry register or immediate from register.
Detail
Rd = Rd - Rs - c
desc: | op | Rd | Rs | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 2 | R3..0 | R3..0 | 3 |
Rd = Rd - #imm - c
desc: | op | Rd | op | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 3 | R3..0 | 3 | n3..0 |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | X | X | X |
SC Rd,d[Rs] |
Synopsis
Store character from register to memory
Detail
memory [Rs + disp] = Rd
desc: | op | rd | rs | disp | op |
size: | 4 | 4 | 4 | 3 | 1 |
bits: | 15 12 | 11 8 | 7 4 | 3 1 | 0 |
bit pattern: | D | R3..0 | R3..0 | d3..1 | 0 |
The least significant sixteen bits of the register are stored to memory. The address must be character aligned.
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | - | - | - | - |
SHL Rd,#1 |
Synopsis
Arithmetically shift register left by one bit.
Detail
Rd = Rd << 1; c = Rd.msb
desc: | op | rd | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 3 | R3..0 | 8 | 1 |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | - | X | X |
SHR Rd,#1 |
Synopsis
Logically shift register right by one bit.
Detail
Rd = Rd >> 1; c = Rd.lsb
desc: | op | rd | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 3 | R3..0 | A | 1 |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | - | X | X |
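The single-bit shifts and their carry outputs can be sketched as follows. This is a minimal Python model, assuming the carry takes the bit shifted out of the original register value; the function names `shl1` and `shr1` and the `width` parameter are hypothetical and only serve to cover both the sixteen and thirty-two bit versions.

```python
def shl1(value, width=32):
    """Shift left by one bit; return (result, carry) with carry = old msb."""
    mask = (1 << width) - 1
    value &= mask
    carry = (value >> (width - 1)) & 1
    return (value << 1) & mask, carry


def shr1(value, width=32):
    """Logical shift right by one bit; return (result, carry) with carry = old lsb."""
    mask = (1 << width) - 1
    value &= mask
    carry = value & 1
    return value >> 1, carry


print(shl1(0x8000_0001))      # msb is shifted out into the carry
print(shr1(0x3))              # lsb is shifted out into the carry
print(shl1(0x8001, width=16)) # sixteen bit version
```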
STOP |
Synopsis
Stop the processor from executing instructions and wait for the external hardware 'go' signal or a non-maskable interrupt.
Detail
desc: | op |
size: | 16 |
bits: | 15 0 |
bit pattern: | 0021 |
This instruction can be used to synchronize the processor to an external event.
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | - | - | - | - |
SUBR Rd,#n |
Synopsis
Subtract register from immediate.
Detail
Rd = n - Rd
Note: this instruction is usually the opposite of what's needed. See SUB.
desc: | op | rd | op | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 3 | R3..0 | 2 | n3..0 |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | X | X | X |
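The bit patterns show that NEG uses the same encoding as SUBR with an immediate field of zero, so NEG can be read as 0 - Rd. A minimal Python sketch of both operations follows; the function names and the `width` parameter are hypothetical, and flag results are not modelled.

```python
def subr(rd, n, width=32):
    """Model SUBR Rd,#n: Rd = n - Rd, wrapped to the register width."""
    mask = (1 << width) - 1
    return (n - rd) & mask


def neg(rd, width=32):
    """Model NEG Rd as SUBR Rd,#0."""
    return subr(rd, 0, width)


print(subr(3, 10))            # 10 - 3
print(hex(neg(1)))            # two's complement of 1, thirty-two bit
print(hex(neg(5, width=16)))  # two's complement of 5, sixteen bit
```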
SUB Rd,Rs |
SUB Rd,Rs,#n |
Synopsis
Subtract immediate or register from register.
Detail
Rd = Rd - Rs
desc: | op | rd | rs | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 2 | R3..0 | R3..0 | 2 |
Rd = Rs - n
This is really the ADD instruction with the immediate constant automatically negated by the assembler. Because it is really an add instruction, the immediate constant is limited to the negated range of the ADD immediate: from the negative of the largest ADD immediate to the negative of the smallest.
desc: | op | rd | rs | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 1 | R3..0 | R3..0 | n3..0 |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | X | X | X |
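The way an assembler might turn the immediate form of SUB into an ADD can be sketched as follows. This is a hedged Python illustration, not the actual assembler: the function names are hypothetical, and the four-bit immediate is assumed to be a two's complement value in the range -8 to 7, which this section does not state.

```python
def encode_add_imm(rd, rs, imm):
    """Encode the ADD immediate form: opcode 1, Rd, Rs, 4-bit immediate."""
    if not (0 <= rd <= 15 and 0 <= rs <= 15):
        raise ValueError("register numbers must be 0..15")
    if not -8 <= imm <= 7:
        # Assumption: the immediate field is 4-bit two's complement.
        raise ValueError("immediate out of range")
    return (0x1 << 12) | (rd << 8) | (rs << 4) | (imm & 0xF)


def encode_sub_imm(rd, rs, n):
    """Encode SUB Rd,Rs,#n as an ADD of the negated constant."""
    return encode_add_imm(rd, rs, -n)


print(hex(encode_sub_imm(1, 2, 3)))  # ADD R1,R2,#-3
print(hex(encode_sub_imm(1, 2, 8)))  # ADD R1,R2,#-8
```

Under that assumption the constant accepted by SUB runs from -7 to 8, the mirror image of the ADD range.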
SW Rd,d[Rs] |
Synopsis
Store word from register to memory
Detail
memory [Rs + disp] = Rd
desc: | op | rd | rs | disp | op |
size: | 4 | 4 | 4 | 3 | 1 |
bits: | 15 12 | 11 8 | 7 4 | 3 1 | 0 |
bit pattern: | D | R3..0 | R3..0 | d3..1 | 1 |
Note the format of the displacement field. D0 is forced to zero because word accesses must be character aligned.
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | - | - | - | - |
Synopsis
Perform system call.
Detail
ILR = pc; pc = FFFF_FF88; flags.backup = flags; flags.im = 1
desc: | op |
size: | 16 |
bit pattern: | 0001 |
This instruction causes the processor to perform a system call. It is meant for operating system support.
Flags Affected
I | N | V | C | Z | |||
1 | - | - | - | - | - | - | - |
SXB Rd |
Synopsis
Sign extend byte.
Detail
Rd = {{24{Rd[7]}},Rd[7:0]}
desc: | op | rd | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 2 | R3..0 | 2 | E |
Sign extends a byte contained in a register by setting the upper 24 bits of the register to register bit 7.
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | - | - | X |
SXC Rd |
Synopsis
Sign extend character.
Detail
Rd = {{16{Rd[15]}},Rd[15:0]}
desc: | op | rd | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 2 | R3..0 | 3 | E |
Sign extends a character contained in a register by setting the upper 16 bits of the register to register bit 15.
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | - | - | X |
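Sign extension copies the top bit of the narrow value into every higher bit of the register. A minimal Python sketch of SXB and SXC follows, assuming the thirty-two bit register width by default; the helper names are hypothetical.

```python
def sign_extend(value, from_bits, width=32):
    """Sign extend the low `from_bits` bits of value to `width` bits."""
    mask = (1 << width) - 1
    value &= (1 << from_bits) - 1      # keep only the narrow field
    if value & (1 << (from_bits - 1)): # sign bit set: fill upper bits with ones
        value -= 1 << from_bits
    return value & mask


def sxb(rd, width=32):
    """Model SXB Rd: upper bits take the value of register bit 7."""
    return sign_extend(rd, 8, width)


def sxc(rd, width=32):
    """Model SXC Rd: upper bits take the value of register bit 15."""
    return sign_extend(rd, 16, width)


print(hex(sxb(0x80)))    # bit 7 set, upper 24 bits become ones
print(hex(sxb(0x7F)))    # bit 7 clear, upper 24 bits become zeros
print(hex(sxc(0x8000)))  # bit 15 set, upper 16 bits become ones
```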
TRS Rd,Spr |
Synopsis
Transfer register to special purpose register.
Detail
Special Register = Rd
desc: | op | rd | op | spr |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 3 | R3..0 | F | R3..0 |
Special Registers
Currently there are only five special registers defined, the remaining codes are reserved for future use. The processor version register is read only.
Code | Register Name | Description |
0000 | SR | Status Register |
0001 | ILR | Interrupt Link Register |
0002 | {this register is reserved} | |
0003 | VER | Processor Version - major, minor, revision |
Flags Affected (for TRS Rn,SR only)
I | N | V | C | Z | |||
- | - | - | - | X | X | X | X |
TSR Rd,Spr |
Synopsis
Transfer special purpose register to register.
Detail
Rd = Special Register
desc: | op | rd | op | spr |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 3 | R3..0 | E | R3..0 |
Special Registers
Currently there are only five special registers defined, the remaining codes are reserved for future use.
Code | Register Name | Description |
0000 | SR | Status Register |
0001 | ILR | Interrupt Link Register |
0002 | {this register is reserved} | |
0003 | VER | Processor Version - major, minor, revision |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | - | - | - | - |
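The TRS and TSR encodings differ only in the third field (F versus E), with the special register code in the low four bits. A minimal Python sketch of an encoder follows; the function names and the `SPR_CODES` table are hypothetical, and rejecting writes to VER is an assembler policy assumed here because the version register is described as read only.

```python
# Special register codes taken from the table above.
SPR_CODES = {"SR": 0, "ILR": 1, "VER": 3}


def _encode(rd, sub_op, spr):
    if not 0 <= rd <= 15:
        raise ValueError("register number must be 0..15")
    if spr not in SPR_CODES:
        raise ValueError("unknown or reserved special register")
    return (0x3 << 12) | (rd << 8) | (sub_op << 4) | SPR_CODES[spr]


def encode_trs(rd, spr):
    """Encode TRS Rd,Spr (special register = Rd)."""
    if spr == "VER":
        # Assumption: refuse to assemble a write to the read-only register.
        raise ValueError("VER is read only")
    return _encode(rd, 0xF, spr)


def encode_tsr(rd, spr):
    """Encode TSR Rd,Spr (Rd = special register)."""
    return _encode(rd, 0xE, spr)


print(hex(encode_trs(2, "SR")))   # TRS R2,SR
print(hex(encode_tsr(2, "ILR")))  # TSR R2,ILR
```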
XOR Rd,Rs |
XOR Rd,#n |
Synopsis
Logically exclusively 'or' register with register or immediate.
Detail
Rd = Rd ^ Rs
desc: | op | rd | rs | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 2 | R3..0 | R3..0 | 4 |
Rd = Rd ^ n
desc: | op | rd | op | imm |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 3 | R3..0 | 4 | n3..0 |
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | - | - | X |
ZXB Rd |
Synopsis
Zero extend byte.
Detail
Rd = {24'b0,Rd[7:0]}
desc: | op | rd | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 2 | R3..0 | 0 | E |
Zero extends a byte contained in a register by setting the upper 24 bits of the register to zero.
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | - | - | X |
ZXC Rd |
Synopsis
Zero extend character.
Detail
Rd = {16'b0,Rd[15:0]}
desc: | op | rd | op | op |
size: | 4 | 4 | 4 | 4 |
bits: | 15 12 | 11 8 | 7 4 | 3 0 |
bit pattern: | 2 | R3..0 | 1 | E |
Zero extends a character contained in a register by setting the upper 16 bits of the register to zero.
Flags Affected
I | N | V | C | Z | |||
- | - | - | - | X | - | - | X |