Simple CPU v1d2: FPGA

Figure 1 : simpleCPU_v1d2 block diagram, spot the differences

##################################
# SIMPLECPU V1D2 INSTRUCTION-SET #
##################################

INSTR   IR15 IR14 IR13 IR12 IR11 IR10 IR09 IR08 IR07 IR06 IR05 IR04 IR03 IR02 IR01 IR00  
MOVE    0    0    0    0    RD   RD   0    0    K    K    K    K    K    K    K    K
MOVEH   0    0    0    0    RD   RD   0    1    K    K    K    K    K    K    K    K
MOVEU   0    0    0    0    RD   RD   1    0    K    K    K    K    K    K    K    K

ADD     0    0    0    1    RD   RD   X    X    K    K    K    K    K    K    K    K
SUB     0    0    1    0    RD   RD   X    X    K    K    K    K    K    K    K    K

AND     0    0    1    1    RD   RD   0    0    K    K    K    K    K    K    K    K
OR      0    0    1    1    RD   RD   0    1    K    K    K    K    K    K    K    K
XOR     0    0    1    1    RD   RD   1    0    K    K    K    K    K    K    K    K

LOAD    0    1    0    0    A    A    A    A    A    A    A    A    A    A    A    A
STORE   0    1    0    1    A    A    A    A    A    A    A    A    A    A    A    A
ADDM    0    1    1    0    A    A    A    A    A    A    A    A    A    A    A    A
SUBM    0    1    1    1    A    A    A    A    A    A    A    A    A    A    A    A

JUMPU   1    0    0    0    A    A    A    A    A    A    A    A    A    A    A    A
JUMPZ   1    0    0    1    0    0    0    R    R    R    R    R    R    R    R    R  -- RELATIVE -256 to +255
JUMPNZ  1    0    0    1    0    0    1    R    R    R    R    R    R    R    R    R
JUMPC   1    0    0    1    0    1    0    R    R    R    R    R    R    R    R    R 
JUMPNC  1    0    0    1    0    1    1    R    R    R    R    R    R    R    R    R 
JUMPN   1    0    0    1    1    0    0    R    R    R    R    R    R    R    R    R
JUMPP   1    0    0    1    1    0    1    R    R    R    R    R    R    R    R    R 

CMP     1    0    1    0    RD   RD   0    0    K    K    K    K    K    K    K    K  
TEST    1    0    1    0    RD   RD   0    1    K    K    K    K    K    K    K    K  

XOP1    1    0    1    1    RD   RD   X    X    K    K    K    K    K    K    K    K  -- NOT IMPLEMENTED
XOP2    1    1    0    0    RD   RD   X    X    K    K    K    K    K    K    K    K  -- NOT IMPLEMENTED
XOP3    1    1    0    1    RD   RD   X    X    K    K    K    K    K    K    K    K  -- NOT IMPLEMENTED

CALL    1    1    1    0    A    A    A    A    A    A    A    A    A    A    A    A
RET     1    1    1    1    X    X    X    X    X    X    X    0    0    0    0    0

MOVE    1    1    1    1    RD   RD   RS   RS   X    X    X    0    0    0    0    1
LOAD    1    1    1    1    RD   RD   RS   RS   X    X    X    0    0    0    1    0  -- REG INDIRECT
STORE   1    1    1    1    RD   RD   RS   RS   X    X    X    0    0    0    1    1  -- REG INDIRECT  
 
SHL     1    1    1    1    RSD  RSD  X    X    X    X    X    0    0    1    0    0  
SHR     1    1    1    1    RSD  RSD  X    X    X    X    X    0    0    1    0    1  

ADD     1    1    1    1    RD   RD   RS   RS   X    X    X    0    0    1    1    0  -- NOT IMPLEMENTED
SUB     1    1    1    1    RD   RD   RS   RS   X    X    X    0    0    1    1    1  -- NOT IMPLEMENTED

AND     1    1    1    1    RD   RD   RS   RS   X    X    X    0    1    0    0    0  -- NOT IMPLEMENTED
OR      1    1    1    1    RD   RD   RS   RS   X    X    X    0    1    0    0    1  -- NOT IMPLEMENTED
XOR     1    1    1    1    RD   RD   RS   RS   X    X    X    0    1    0    1    0  -- NOT IMPLEMENTED

ASL     1    1    1    1    RSD  RSD  X    X    X    X    X    0    1    0    1    1  
ASR     1    1    1    1    RSD  RSD  X    X    X    X    X    0    1    1    0    0  

XOP4    1    1    1    1    RD   RD   RS   RS   X    X    X    0    1    1    0    1  -- NOT IMPLEMENTED
XOP5    1    1    1    1    RD   RD   RS   RS   X    X    X    0    1    1    1    0  -- NOT IMPLEMENTED
XOP6    1    1    1    1    RD   RD   RS   RS   X    X    X    0    1    1    1    1  -- NOT IMPLEMENTED
XOP7    1    1    1    1    RD   RD   RS   RS   X    X    X    1    0    0    0    0  -- NOT IMPLEMENTED
XOP8    1    1    1    1    RD   RD   RS   RS   X    X    X    1    0    0    0    1  -- NOT IMPLEMENTED
XOP9    1    1    1    1    RD   RD   RS   RS   X    X    X    1    0    0    1    0  -- NOT IMPLEMENTED

Another year, another processor :). When designing the simpleCPU processor the goal was not to build the fastest or the best processor, i.e. a replacement for an off-the-shelf processor that could be purchased for a fraction of the cost; rather, it was to build a processor that would be used for teaching. There are a lot of different approaches to teaching computer architectures and I'm sure many of my colleagues would disagree with the way I run the SYS1 module, but for me learning is something you do by doing, something that a student has to actively engage with, rather than passively listen to. Therefore, for me learning is something a student has to experience: simply using a "thing" is not enough, you need to build that "thing" and use it in a real-world context to understand why that "thing" looks the way it does, you need to become the "thing" (Link) :). Soooo the simpleCPU processor is a platform from which a student can experience what a processor is, something they can build upon and add features and functionality to, a platform where they can experience what works and what does not, and go beyond simply using a processor. Well, that is the thought :).

From figure 1 you should spot that there are no top level architectural changes from the simpleCPUv1d to the simpleCPUv1d2. I did consider adding more general purpose data registers and different data paths, but these are "nice to have" improvements, rather than "teaching" improvements. That is not to say there aren't significant changes to some of these hardware components, but these changes are driven by changes to the instruction-set, i.e. to add more instructions, to add more flexibility to the processor, to make it easier to use and easier to program.

The new and improved ISE project files for this processor can be downloaded here: (Link).

Table of Contents

Improvement 1 : immediate addressing mode
Improvement 2 : changes to data MUX and ALU
Improvement 3 : relative addressing mode
Improvement 4 : changes to the PC
Improvement 5 : SHR, SHL, ASL and ASR instructions
Improvement 6 : CMP and TEST instructions
Testing : simulating the new instructions

Improvement 1 : immediate addressing mode

Figure 2 : basic instruction format

A basic simpleCPU instruction is a 16bit fixed length instruction, with a 4bit opcode and a 12bit operand. To add more flexibility and functionality I could switch to a variable length instruction format, e.g. instructions could vary in size from 8bit to 64bit+ in length, depending on their requirements / operands etc. You can see an example of this type of micro-coded architecture in the simpleCPUv2 (Link). However, this approach significantly increases the complexity of the fetch-phase and control logic, making it more difficult for students to add new instructions. So I decided to stick with fixed length instructions. The downside of this is that you are limited to 16 basic instructions, which is enough to make a processor, i.e. the simpleCPUv1a (Link), but it's a little limited. That is, the ideal teaching processor should NOT do everything a student wants, i.e. some of those features are for the student to add, but equally it should not cause excessive frustration when programming like the minimalCPU :) (Link). Fortunately, there is space within the existing instructions to add additional bit-fields, to expand the functionality of some of the instructions. Consider the immediate addressing mode instruction format shown in figure 3.

Figure 3 : immediate addressing mode instruction format

# INSTR   IR15 IR14 IR13 IR12 IR11 IR10 IR09 IR08 IR07 IR06 IR05 IR04 IR03 IR02 IR01 IR00  

# MOVE    0    0    0    0    RD   RD   U    U    K    K    K    K    K    K    K    K
# ADD     0    0    0    1    RD   RD   U    U    K    K    K    K    K    K    K    K
# SUB     0    0    1    0    RD   RD   U    U    K    K    K    K    K    K    K    K
# AND     0    0    1    1    RD   RD   U    U    K    K    K    K    K    K    K    K

All immediate addressing mode instruction opcodes start with 00, and have a 2bit register select field and an 8bit immediate value. They also have two unused bits (UU), so these bits can be used to augment the functionality of these instructions. That is, where possible, without significant changes to the processor's hardware, we can use these bits to add new instructions. Note, I'm not looking to just add instructions here, rather to expand the functionality of the existing ones: any changes should not break existing instructions, and a new instruction must use the same addressing mode, or at a push be a variation upon it. I am looking to show how we can expand the processor's instruction-set, rather than rewrite it from scratch.

Figure 4 : simpleCPUv1d2 top level schematic

The top level schematic of the simpleCPUv1d2 is shown in figure 4, from this we can identify the possible data paths for the immediate addressing mode. This hardware is fixed, partly to maintain consistency with the previous version, but also to help ensure we meet timing closure, i.e. we don't change the critical path delay tooo much. Sooo the first improvement is to add new MOVE instructions:

INSTR   IR15 IR14 IR13 IR12 IR11 IR10 IR09 IR08 IR07 IR06 IR05 IR04 IR03 IR02 IR01 IR00  

MOVE    0    0    0    0    RD   RD   0    0    K    K    K    K    K    K    K    K
MOVEL   0    0    0    0    RD   RD   0    0    K    K    K    K    K    K    K    K
MOVEH   0    0    0    0    RD   RD   0    1    K    K    K    K    K    K    K    K
MOVEU   0    0    0    0    RD   RD   1    0    K    K    K    K    K    K    K    K

Example          :    move RA 0xFF, RA = 0xFFFF
Addressing mode  :    immediate
Opcode           :    0000 00
RTL              :    RX <- ( (K7)8 || KK )
Flags set        :    None
Description      :    write 8bit immediate value to low byte position and sign extend high byte

Note, movel is the same as the move instruction, just renamed to match moveh naming convention.

Example          :    moveh RA 0xFF, RA = 0xFF00 
Addressing mode  :    immediate
Opcode           :    0000 01
RTL              :    RX <- ( KK || 00 )
Flags set        :    None
Description      :    write 8bit immediate value to high byte position

Example          :    moveu RA 0xFF, RA = 0x00FF 
Addressing mode  :    immediate
Opcode           :    0000 10
RTL              :    RX <- ( 00 || KK )
Flags set        :    None
Description      :    write 8bit immediate value to low byte position and zero high byte

MOVE and MOVEL are the same instruction, a sign extended 8bit move. I included the MOVEL (move low byte) alternative name to complement the MOVEH (move high byte) naming convention. MOVEH writes the immediate to the high byte position and zeroes the low byte. MOVEU is an unsigned move: it writes the immediate into the low byte position and zeroes the high byte.
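To make the three RTL descriptions above concrete, here is a minimal behavioural sketch in Python. It is not the processor's hardware, just a software model of the value written to the destination register; the function names and the idea of passing the IR09..IR08 field as `mode` are my own assumptions for illustration.

```python
def sign_extend8(k):
    """Sign extend an 8bit value to 16 bits, i.e. (K7)8 || KK."""
    k &= 0xFF
    return (k | 0xFF00) if (k & 0x80) else k


def move_imm(mode, k):
    """Value written to RD by an immediate move; mode models bits IR09..IR08."""
    k &= 0xFF
    if mode == 0b00:          # MOVE / MOVEL : sign extended low byte
        return sign_extend8(k)
    if mode == 0b01:          # MOVEH        : KK || 00
        return k << 8
    if mode == 0b10:          # MOVEU        : 00 || KK
        return k
    raise ValueError("mode 11 is not assigned in the instruction-set table")


def encode_move(mode, rd, k):
    """Pack opcode 0000, the 2bit register field, mode bits and immediate."""
    return (0b0000 << 12) | ((rd & 0x3) << 10) | ((mode & 0x3) << 8) | (k & 0xFF)


if __name__ == "__main__":
    for name, mode in (("move", 0b00), ("moveh", 0b01), ("moveu", 0b10)):
        print(f"{name:5s} 0xFF -> 0x{move_imm(mode, 0xFF):04X}")
```

The three printed values match the examples given above for move, moveh and moveu with an immediate of 0xFF.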

For the next changes I am slightly pushing the limits of the rules I set: rather than implementing different types of AND instruction, I am going for a Boolean instruction group by adding support for bitwise OR and XOR:

INSTR   IR15 IR14 IR13 IR12 IR11 IR10 IR09 IR08 IR07 IR06 IR05 IR04 IR03 IR02 IR01 IR00  

AND     0    0    1    1    RD   RD   0    0    K    K    K    K    K    K    K    K
OR      0    0    1    1    RD   RD   0    1    K    K    K    K    K    K    K    K
XOR     0    0    1    1    RD   RD   1    0    K    K    K    K    K    K    K    K

Example            :    and RD 0x04
Addressing mode    :    immediate
Opcode             :    0011 00
RTL                :    RX <- RX & ( (0)8 || KK )
Flags set          :    Z,C,O,P,N
Description        :    bitwise AND of immediate 8bit values and low byte of register

Example            :    or RD 0x04
Addressing mode    :    immediate
Opcode             :    0011 01
RTL                :    RX <- RX | ( (0)8 || KK )
Flags set          :    None
Description        :    bitwise OR of immediate 8bit values and low byte of register

Example            :    xor RD 0x04 
Addressing mode    :    immediate
Opcode             :    0011 10
RTL                :    RX <- RX ^ ( (0)8 || KK )
Flags set          :    None
Description        :    bitwise XOR of immediate 8bit values and low byte of register
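The Boolean group can be modelled the same way. The sketch below follows the RTL given above, where the immediate is zero extended to 16 bits before the operation; `bool_imm` and its `mode` argument (bits IR09..IR08) are hypothetical names used only for this illustration.

```python
def bool_imm(mode, rx, k):
    """Result of AND / OR / XOR between a 16bit register and (0)8 || KK."""
    rx &= 0xFFFF
    operand = k & 0xFF        # zero extended immediate
    if mode == 0b00:          # AND
        return rx & operand
    if mode == 0b01:          # OR
        return rx | operand
    if mode == 0b10:          # XOR
        return rx ^ operand
    raise ValueError("mode 11 is not assigned in the instruction-set table")


if __name__ == "__main__":
    # moveh 0x12 followed by or 0x34 builds a full 16bit constant
    value = 0x12 << 8
    value = bool_imm(0b01, value, 0x34)
    print(f"0x{value:04X}")
```

One consequence worth noticing from the RTL: because the immediate is zero extended, the AND always clears the high byte of the register, whereas OR and XOR leave the high byte unchanged.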

At this time I have not implemented any changes to the ADD and SUB instructions, as this would need some changes to the ALU's adder/subtractor. However, possible improvements would be to implement unsigned addition and subtraction, and to add instructions that include the previous carry flag, allowing the addition of values greater than 16bits, i.e. numbers spread across multiple memory locations:

INSTR   IR15 IR14 IR13 IR12 IR11 IR10 IR09 IR08 IR07 IR06 IR05 IR04 IR03 IR02 IR01 IR00  

ADD     0    0    0    1    RD   RD   0    0    K    K    K    K    K    K    K    K
ADDU    0    0    0    1    RD   RD   0    1    K    K    K    K    K    K    K    K
ADDCY   0    0    0    1    RD   RD   1    0    K    K    K    K    K    K    K    K

SUB     0    0    1    0    RD   RD   0    0    K    K    K    K    K    K    K    K
SUBU    0    0    1    0    RD   RD   0    1    K    K    K    K    K    K    K    K
SUBCY   0    0    1    0    RD   RD   1    0    K    K    K    K    K    K    K    K

Example            :    add RB 2
Addressing mode    :    immediate
Opcode             :    0001 00
RTL                :    RX <- RX + ( (K7)8 || KK )
Flags set          :    Z,C,O,P,N
Description        :    signed addition 

Example            :    addu RB 2
Addressing mode    :    immediate
Opcode             :    0001 01
RTL                :    RX <- RX + ( 00 || KK )
Flags set          :    Z,C,O,P,N
Description        :    unsigned addition 

Example            :    addcy RB 2
Addressing mode    :    immediate
Opcode             :    0001 10
RTL                :    RX <- RX + ( (K7)8 || KK ) + C
Flags set          :    Z,C,O,P,N
Description        :    signed addition + carry flag

Example            :    sub RB 2
Addressing mode    :    immediate
Opcode             :    0010 00
RTL                :    RX <- RX - ( (K7)8 || KK )
Flags set          :    Z,C,O,P,N
Description        :    signed subtraction 

Example            :    subu RB 2
Addressing mode    :    immediate
Opcode             :    0010 01
RTL                :    RX <- RX - ( 00 || KK )
Flags set          :    Z,C,O,P,N
Description        :    unsigned subtraction

Example            :    subcy RB 2
Addressing mode    :    immediate
Opcode             :    0010 10
RTL                :    RX <- RX - ( (K7)8 || KK ) - C
Flags set          :    Z,C,O,P,N
Description        :    signed subtraction - carry flag 

Note, just to repeat, the above modified ADD and SUB instructions have not been implemented.
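To show why a "with carry" variant is useful, here is a general sketch of multi-word addition in Python. It is not a model of the proposed ADDCY instruction itself (which takes an 8bit immediate and is not implemented); it only illustrates the carry chaining idea using full 16bit words, and `add16` / `add32` are made-up helper names.

```python
def add16(a, b, carry_in=0):
    """16bit add returning (result, carry_out)."""
    total = (a & 0xFFFF) + (b & 0xFFFF) + (carry_in & 1)
    return total & 0xFFFF, total >> 16


def add32(a_lo, a_hi, b_lo, b_hi):
    """Add two 32bit values, each stored as a low and a high 16bit word."""
    lo, carry = add16(a_lo, b_lo)          # plain add on the low words
    hi, carry = add16(a_hi, b_hi, carry)   # add-with-carry on the high words
    return lo, hi, carry


if __name__ == "__main__":
    # 0x0001FFFF + 0x00000001 = 0x00020000
    lo, hi, carry = add32(0xFFFF, 0x0001, 0x0001, 0x0000)
    print(f"0x{hi:04X}{lo:04X} carry={carry}")
```

Without the carry input on the second addition the high word would come out as 0x0001, which is the error a carry-aware instruction is meant to avoid.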

Improvement 2 : changes to data MUX and ALU

Figure 5 : changes to data MUX

To allow the different MOVE instructions to function we need to update the DATA_MUX component shown in figure 5. This multiplexer always had an unused A-input branch that was included for future addressing mode expansion, which I confess I never got around to adding until now :). Sooo this hardware was removed and the hardware needed for the MOVEH instruction added. The top A-input logic generates sign extended and unsigned immediate values. The bottom A-input logic generates the MOVEH value, writing the immediate value to the high byte of the destination register and zeroing the low byte.

To implement the Boolean logic functions the bitwise_AND component within the ALU is replaced with the Boolean_logic component shown in figure 6. This component simply adds equivalent OR and XOR components in addition to the existing bitwise AND_16, combined with a MUX to select the desired function, as shown in figure 7. The AND, OR and XOR components are basically the same, i.e. a logic gate array; the OR_16 component is shown in figure 8.


Figure 6 : ALU


Figure 7 : Boolean_logic component


Figure 8 : OR logic

Improvement 3 : relative addressing mode

The simpleCPU has four JUMP instructions (listed below), one unconditional and three conditional jumps. These use the absolute/direct addressing mode i.e. they contain the 12bit branch target address.

Note, for the simpleCPU I slightly bend the definitions of absolute and direct addressing modes, so that I can differentiate between addresses used in LOAD/STORE instructions and those used in JUMP instructions. If the specified address points to a data value, in the simpleCPU I refer to this as an absolute addressing mode. If the specified address points to an instruction, in the simpleCPU I refer to this as a direct addressing mode. Both absolute and direct addressing modes use addresses, it's just that I think it's useful to highlight that these instructions are pointing to different "things". For people that don't like this, all I can say is it's my processor, so my rules :)

The downside of the implementation in the simpleCPUv1d is that these four JUMP instructions take 4 of the 16 base opcode values. This is a big chunk of opcodes, limiting the processor's instruction-set. Also, to simplify coding we need to increase the number of conditional JUMP instructions, however, there are no spare opcodes to do this :(.

INSTR   IR15 IR14 IR13 IR12 IR11 IR10 IR09 IR08 IR07 IR06 IR05 IR04 IR03 IR02 IR01 IR00  

JUMPU   1    0    0    0    A    A    A    A    A    A    A    A    A    A    A    A
JUMPZ   1    0    0    1    A    A    A    A    A    A    A    A    A    A    A    A
JUMPNZ  1    0    1    0    A    A    A    A    A    A    A    A    A    A    A    A
JUMPC   1    0    1    1    A    A    A    A    A    A    A    A    A    A    A    A

Sooo, to make some space we can take advantage of the observation that the branch target address distance of a conditional jump is usually small: we rarely need to branch across the full address range, typically we only need to skip forwards or backwards a few instructions, i.e. a small relative jump. Therefore, rather than specifying the branch target address as an absolute/direct address we can use a relative address, i.e. a +/- offset from the current PC value. Using this approach we now have bit-fields free to define a range of conditional jump instructions:

INSTR   IR15 IR14 IR13 IR12 IR11 IR10 IR09 IR08 IR07 IR06 IR05 IR04 IR03 IR02 IR01 IR00  

# JUMPU   1    0    0    0    A    A    A    A    A    A    A    A    A    A    A    A     Absolute addressing mode
# JUMPZ   1    0    0    1    0    0    0    R    R    R    R    R    R    R    R    R     Relative addressing mode
# JUMPNZ  1    0    0    1    0    0    1    R    R    R    R    R    R    R    R    R     Relative addressing mode 
# JUMPC   1    0    0    1    0    1    0    R    R    R    R    R    R    R    R    R     Relative addressing mode
# JUMPNC  1    0    0    1    0    1    1    R    R    R    R    R    R    R    R    R     Relative addressing mode
# JUMPN   1    0    0    1    1    0    0    R    R    R    R    R    R    R    R    R     Relative addressing mode 
# JUMPP   1    0    0    1    1    0    1    R    R    R    R    R    R    R    R    R     Relative addressing mode

Example            :    jump 200
Addressing mode    :    direct
Opcode             :    1000
RTL                :    PC <- AAA
Flags set          :    None
Description        :    unconditional jump, 12bit absolute address

Example            :    jumpz 201
Addressing mode    :    relative
Opcode             :    1001 000
RTL                :    IF Z=1 THEN PC <- PC + RRR ELSE PC <- PC + 1
Flags set          :    None
Description        :    jump if zero conditional jump, signed 9bit relative address, jump if last  
                        arithmetic / logical instruction generated a zero result.

Example            :    jumpnz 202
Addressing mode    :    relative
Opcode             :    1001 001
RTL                :    IF Z=0 THEN PC <- PC + RRR ELSE PC <- PC + 1
Flags set          :    None
Description        :    jump if not zero conditional jump, signed 9bit relative address, jump if last  
                        arithmetic / logical instruction did not generate a zero result.

Example            :    jumpc 203
Addressing mode    :    relative
Opcode             :    1001 010
RTL                :    IF C=1 THEN PC <- PC + RRR ELSE PC <- PC + 1
Flags set          :    None
Description        :    jump if carry conditional jump, signed 9bit relative address, jump if last  
                        arithmetic / logical instruction generated a carry.

Example            :    jumpnc 204
Addressing mode    :    relative
Opcode             :    1001 011
RTL                :    IF C=0 THEN PC <- PC + RRR ELSE PC <- PC + 1
Flags set          :    None
Description        :    jump if not carry conditional jump, signed 9bit relative address, jump if last  
                        arithmetic / logical instruction did not generate a carry.

Example            :    jumpn 205
Addressing mode    :    relative
Opcode             :    1001 100
RTL                :    IF N=1 THEN PC <- PC + RRR ELSE PC <- PC + 1
Flags set          :    None
Description        :    jump if negative conditional jump, signed 9bit relative address, jump if last  
                        arithmetic / logical instruction generated a negative result.

Example            :    jumpp 206
Addressing mode    :    relative
Opcode             :    1001 101
RTL                :    IF P=1 THEN PC <- PC + RRR ELSE PC <- PC + 1
Flags set          :    None
Description        :    jump if positive conditional jump, signed 9bit relative address, jump if last  
                        arithmetic / logical instruction generated a positive result.

Unconditional jumps use a 12bit absolute/direct address (AAA) allowing them to traverse the full address range. Conditional jumps use a signed 9bit relative address (RRR) i.e. PC-256 to PC+255.
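As a worked illustration of the relative encoding, the Python sketch below packs a conditional jump and then recovers the branch target. The condition codes are taken from the table above (IR11..IR09); the function names are my own, and I am assuming the offset is added to the address of the jump instruction itself, as the RTL "PC <- PC + RRR" suggests.

```python
# Condition field (IR11..IR09) for each relative jump, from the table above
COND = {"jumpz": 0b000, "jumpnz": 0b001, "jumpc": 0b010,
        "jumpnc": 0b011, "jumpn": 0b100, "jumpp": 0b101}


def sign_extend9(r):
    """Interpret the low 9 bits as a signed offset in the range -256..+255."""
    r &= 0x1FF
    return r - 0x200 if (r & 0x100) else r


def encode_cond_jump(name, offset):
    """Pack opcode 1001, the 3bit condition and the 9bit signed offset."""
    if not -256 <= offset <= 255:
        raise ValueError("offset does not fit in a signed 9bit field")
    return (0b1001 << 12) | (COND[name] << 9) | (offset & 0x1FF)


def branch_target(pc, instr):
    """Taken-branch target: PC plus the sign extended offset, 12bit wrap."""
    return (pc + sign_extend9(instr)) & 0xFFF


if __name__ == "__main__":
    instr = encode_cond_jump("jumpnz", -3)
    print(f"0x{instr:04X} -> target 0x{branch_target(0x010, instr):03X}")
```

The range check in `encode_cond_jump` is the kind of test an assembler would need: a loop body longer than 256 instructions can no longer be closed with a conditional jump alone and needs a JUMPU to cover the distance.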

Improvement 4 : changes to the PC

Figure 9 : original PC




Figure 10 : changes to the PC (top), 9-to-12bit sign extender logic (bottom)

To implement the relative addressing mode we need to update the program counter (PC) component, as shown in figures 9 and 10. As you can see I have added a second multiplexer on the B input of the top adder. This allows the control logic to add a sign extended 9bit immediate stored in the JUMP instruction to the PC, i.e. the PC can be updated within the range PC-256 to PC+255. These muxes are controlled by control signals CTL3 and CTL0:

 
CTL3  CTL0   MUX_A_OUT   MUX_B_OUT    CIN    ADDER_OUT
0     0        PC           0         1        PC+1
0     1        ABS          0         0        ABS
1     0        PC           REL       0        PC+REL
1     1        ABS          REL       0        NOT-USED
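The truth table above can be written as a small Python function. This is only a sketch of the adder and its two multiplexers as read from the table, not the schematic itself; `next_pc` and its argument names are hypothetical, and `rel` is assumed to be the offset already sign extended to 12 bits.

```python
def next_pc(ctl3, ctl0, pc, abs_addr, rel):
    """Model of the PC adder: MUX_A + MUX_B + CIN, truncated to 12 bits."""
    mux_a = abs_addr if ctl0 else pc      # CTL0 selects PC or absolute address
    mux_b = rel if ctl3 else 0            # CTL3 selects 0 or relative offset
    cin = 1 if (ctl3 == 0 and ctl0 == 0) else 0   # increment only for PC+1
    return (mux_a + mux_b + cin) & 0xFFF


if __name__ == "__main__":
    pc, abs_addr, rel = 0x010, 0x200, 0xFFD       # 0xFFD is -3 in 12 bits
    print(f"PC+1   : 0x{next_pc(0, 0, pc, abs_addr, rel):03X}")
    print(f"ABS    : 0x{next_pc(0, 1, pc, abs_addr, rel):03X}")
    print(f"PC+REL : 0x{next_pc(1, 0, pc, abs_addr, rel):03X}")
```

Because the adder is 12 bits wide, adding the two's complement offset 0xFFD and discarding the carry gives PC-3, which is why the 9bit field has to be sign extended rather than zero extended.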

The second update to the PC is to the CALL/RET stack. In the original PC the LIFO stack was implemented as a schematic, a register based stack with a depth of 4. However, after using this in a number of different applications I found a call depth of 4 a little limiting. Sooo rather than redrawing this component's schematic to increase its depth, I cheated and implemented it using the VHDL shown below, increasing the call depth to 8, which is a lot more usable, i.e. when using subroutines it was very easy to miss that you had exceeded the call depth of 4 and had overwritten the return addresses. As you can imagine, when this did occur you would observe very "interesting" behaviours. Note, moving to a call depth of 8 does not remove this issue, it just makes it less likely, sooo always keep an eye on what subroutines are calling other subroutines :).

library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.STD_LOGIC_UNSIGNED.all;
use IEEE.Numeric_Std.all;

entity stack is
port(
  clk  : in  std_logic;
  clr  : in  std_logic;
  pop  : in  std_logic;
  push : in  std_logic;
  din  : in  std_logic_vector(11 downto 0);
  dout : out std_logic_vector(11 downto 0));
end stack;

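architecture stack_arch of stack is

  -- 3bit stack pointer: index of the next free location, wraps after 8 pushes
  signal count : std_logic_vector(2 downto 0);
  -- registered copy of count - 1: index of the current top of stack
  signal count_dec : std_logic_vector(2 downto 0);

  -- 8 x 12bit return address store
  type ram_type is array (7 downto 0) of std_logic_vector(11 downto 0);
  signal RAM : ram_type;

begin

  -- stack pointer: asynchronous clear, increment on push, decrement on pop
  counter : process( clk, clr )
  begin
    if clr = '1'
    then
      count <= (others=>'0');
    elsif clk'event and clk='1'
    then
      if push = '1'
      then
        count <= count + 1;
      elsif pop = '1'
      then 
        count <= count - 1;
      end if;
    end if;
  end process;

  -- top of stack index, updated one clock cycle after count changes
  decrement : process( clk )
  begin
    if clk'event and clk='1'
    then
      count_dec <= count - 1;
    end if;
  end process; 

  -- on a push, write the return address into the next free location
  ram_wr : process( clk )
  begin
    if clk'event and clk='1'
    then
      if push = '1'
      then 
        RAM(to_integer(unsigned(count))) <= din;
      end if;
    end if;
  end process;
  
  -- asynchronous read of the current top of stack
  dout <= RAM(to_integer(unsigned(count_dec)));
  
end stack_arch;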
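To see what happens when the call depth is exceeded, here is a small behavioural model of the stack in Python. It copies the wrap-around of the 3bit pointer from the VHDL above but ignores clocking (including the one cycle delay on count_dec), so treat it as a sketch of the behaviour rather than a cycle accurate model; the class and function names are made up for this example.

```python
class Stack:
    """Behavioural model of the 8 deep, 12bit wide CALL/RET stack."""
    DEPTH = 8

    def __init__(self):
        self.ram = [0] * self.DEPTH
        self.count = 0                     # 3bit pointer, wraps silently

    def push(self, addr):
        self.ram[self.count] = addr & 0xFFF
        self.count = (self.count + 1) % self.DEPTH

    def pop(self):
        self.count = (self.count - 1) % self.DEPTH
        return self.ram[self.count]


def call_chain(depth):
    """Push `depth` return addresses, then pop them all again."""
    stack = Stack()
    pushed = [0x100 + i for i in range(depth)]
    for addr in pushed:
        stack.push(addr)
    popped = [stack.pop() for _ in range(depth)]
    return pushed, popped


if __name__ == "__main__":
    for depth in (8, 9):
        pushed, popped = call_chain(depth)
        ok = popped == pushed[::-1]
        print(f"depth {depth}: {'ok' if ok else 'return address lost'}")
```

With nine nested calls the ninth push wraps round and overwrites the first return address, so the outermost RET jumps back to the wrong place, which is the "interesting" behaviour described above.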

Improvement 5 : SHR, SHL, ASL and ASR instructions

Figure 11 : shift logic

Figure 12 : ALU

The logic to implement the shift left / right function is simply buffers i.e. wires, implementing a fixed shift. To fill the bubble, multiplexers are used to select between a fixed 0 for ASL and ASR, and the carry flag (C) for SHR and SHL, as shown in figure 11. Two output multiplexers are then used to select between the shift left and shift right buffer arrays. The ALU now also needs an extra MUX to select the carry out source i.e. from the shift logic, or the adder/subtractor, as shown in figure 12.

INSTR   IR15 IR14 IR13 IR12 IR11 IR10 IR09 IR08 IR07 IR06 IR05 IR04 IR03 IR02 IR01 IR00  

SHL     1    1    1    1    RSD  RSD  X    X    X    X    X    0    0    1    0    0  
SHR     1    1    1    1    RSD  RSD  X    X    X    X    X    0    0    1    0    1  

ASL     1    1    1    1    RSD  RSD  X    X    X    X    X    0    1    0    1    1  
ASR     1    1    1    1    RSD  RSD  X    X    X    X    X    0    1    1    0    0

Example            :    asl RB 
Addressing mode    :    register
Opcode             :    1111 01011
RTL                :    RX <- RX(14:0) || 0
Flags set          :    Z,C,O,P,N
Description        :    shift left, insert 0 into LSB

Example            :    asr RB 
Addressing mode    :    register
Opcode             :    1111 01100
RTL                :    RX <- 0 || RX(15:1)
Flags set          :    Z,C,O,P,N
Description        :    shift right, insert 0 into MSB

Example            :    shl RB 
Addressing mode    :    register
Opcode             :    1111 00100
RTL                :    RX <- RX(14:0) || C
Flags set          :    Z,C,O,P,N
Description        :    shift left, insert carry flag (C) into LSB

Example            :    shr RB 
Addressing mode    :    register
Opcode             :    1111 00101
RTL                :    RX <- C || RX(15:1)
Flags set          :    Z,C,O,P,N
Description        :    shift right, insert carry flag (C) into MSB 
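The four shifts can be summarised with a short Python model. The fill bit follows the descriptions above (0 for ASL/ASR, the carry flag for SHL/SHR). The carry out is an assumption on my part: I have taken it to be the bit shifted out of the register, which is the usual convention but is not spelled out in the text; `shift` is a hypothetical helper name.

```python
def shift(op, rx, c=0):
    """Return (result, carry_out) for a one bit shift of a 16bit register."""
    rx &= 0xFFFF
    c &= 1
    if op in ("asl", "shl"):
        fill = c if op == "shl" else 0            # bit inserted into the LSB
        return ((rx << 1) & 0xFFFF) | fill, rx >> 15
    if op in ("asr", "shr"):
        fill = c if op == "shr" else 0            # bit inserted into the MSB
        return (rx >> 1) | (fill << 15), rx & 1
    raise ValueError("unknown shift operation")


if __name__ == "__main__":
    # 32bit shift left of 0x00018000: ASL the low word, then SHL the high word
    lo, carry = shift("asl", 0x8000)
    hi, carry = shift("shl", 0x0001, carry)
    print(f"0x{hi:04X}{lo:04X}")
```

Under that assumption, pairing ASL with SHL (or ASR with SHR) lets a value wider than 16 bits be shifted one word at a time, with the carry flag passing the bit between words.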

Improvement 6 : CMP and TEST instructions

The compare (CMP) and TEST instructions are used to test the state of a register without changing its value. The CMP instruction performs a subtract operation, i.e. register - immediate, however the register is not updated with the result, only the status register is updated. Therefore, in a program this can be used with a conditional JUMP instruction to test the outcome of this subtraction, e.g. if the Z flag is set then the register equals the immediate value. The advantage of using the CMP instruction rather than a SUB instruction is that the original value is not lost, i.e. we do not have to add additional instructions to buffer this value so that it can be restored after the SUB instruction is performed. The TEST instruction is the same but performs a bitwise AND operation, allowing the programmer to test if specific bits are set within a register. Note, no new hardware needs to be added to the processor to implement these new instructions, they are simply the SUB and AND instructions with the register file's write-enable disabled.

INSTR   IR15 IR14 IR13 IR12 IR11 IR10 IR09 IR08 IR07 IR06 IR05 IR04 IR03 IR02 IR01 IR00  

CMP     1    0    1    0    RD   RD   0    0    K    K    K    K    K    K    K    K  
TEST    1    0    1    0    RD   RD   0    1    K    K    K    K    K    K    K    K  

Example            :    cmp RB 2
Addressing mode    :    immediate
Opcode             :    1010 00
RTL                :    RX - ( (K7)8 || KK ) 
Flags set          :    Z,C,O,P,N
Description        :    signed subtraction, register is not updated, flags are.

Example            :    test RB 2
Addressing mode    :    immediate
Opcode             :    1010 01
RTL                :    RX & ( (0)8 || KK ) 
Flags set          :    Z,C,O,P,N
Description        :    bitwise AND, register is not updated, flags are.

Testing : simulating the new instructions

Figure 13 : simulation waveform

As always it is easy to design new hardware, the hard part is getting it to work :). To test out these new instructions I wrote this test code. At this time it seems to be working correctly, fingers crossed :). Note, the NOT_IMPLEMENTED instructions are for the students to add during the assessment; I have tested the processor with my implementations, but these are not available for download :).

###############################
# SIMPLECPU V1d2 TEST PROGRAM #
###############################

###################
# INSTRUCTION-SET #
###################

# INSTR   IR15 IR14 IR13 IR12 IR11 IR10 IR09 IR08 IR07 IR06 IR05 IR04 IR03 IR02 IR01 IR00  
# MOVE    0    0    0    0    RD   RD   0    0    K    K    K    K    K    K    K    K
# MOVEL   0    0    0    0    RD   RD   0    0    K    K    K    K    K    K    K    K
# MOVEH   0    0    0    0    RD   RD   0    1    K    K    K    K    K    K    K    K
# MOVEU   0    0    0    0    RD   RD   1    0    K    K    K    K    K    K    K    K

# ADD     0    0    0    1    RD   RD   X    X    K    K    K    K    K    K    K    K
# SUB     0    0    1    0    RD   RD   X    X    K    K    K    K    K    K    K    K

# AND     0    0    1    1    RD   RD   0    0    K    K    K    K    K    K    K    K
# OR      0    0    1    1    RD   RD   0    1    K    K    K    K    K    K    K    K
# XOR     0    0    1    1    RD   RD   1    0    K    K    K    K    K    K    K    K

# LOAD    0    1    0    0    A    A    A    A    A    A    A    A    A    A    A    A
# STORE   0    1    0    1    A    A    A    A    A    A    A    A    A    A    A    A
# ADDM    0    1    1    0    A    A    A    A    A    A    A    A    A    A    A    A
# SUBM    0    1    1    1    A    A    A    A    A    A    A    A    A    A    A    A

# JUMPU   1    0    0    0    A    A    A    A    A    A    A    A    A    A    A    A
# JUMPZ   1    0    0    1    0    0    0    R    R    R    R    R    R    R    R    R  -- RELATIVE -256 to +255
# JUMPNZ  1    0    0    1    0    0    1    R    R    R    R    R    R    R    R    R
# JUMPC   1    0    0    1    0    1    0    R    R    R    R    R    R    R    R    R 
# JUMPNC  1    0    0    1    0    1    1    R    R    R    R    R    R    R    R    R 
# JUMPN   1    0    0    1    1    0    0    R    R    R    R    R    R    R    R    R 
# JUMPP   1    0    0    1    1    0    1    R    R    R    R    R    R    R    R    R 

# CMP     1    0    1    0    RD   RD   0    0    K    K    K    K    K    K    K    K 
# TEST    1    0    1    0    RD   RD   0    1    K    K    K    K    K    K    K    K 

# XOP1    1    0    1    1    RD   RD   X    X    K    K    K    K    K    K    K    K  -- NOT IMPLEMENTED  
# XOP2    1    1    0    0    RD   RD   X    X    K    K    K    K    K    K    K    K  -- NOT IMPLEMENTED  
# XOP3    1    1    0    1    RD   RD   X    X    K    K    K    K    K    K    K    K  -- NOT IMPLEMENTED  

# CALL    1    1    1    0    A    A    A    A    A    A    A    A    A    A    A    A
# RET     1    1    1    1    X    X    X    X    X    X    X    0    0    0    0    0

# MOVE    1    1    1    1    RD   RD   RS   RS   X    X    X    0    0    0    0    1
# LOAD    1    1    1    1    RD   RD   RS   RS   X    X    X    0    0    0    1    0  -- REG INDIRECT
# STORE   1    1    1    1    RD   RD   RS   RS   X    X    X    0    0    0    1    1  -- REG INDIRECT  
 
# SHL     1    1    1    1    RSD  RSD  X    X    X    X    X    0    0    1    0    0  
# SHR     1    1    1    1    RSD  RSD  X    X    X    X    X    0    0    1    0    1  

# ADD     1    1    1    1    RD   RD   RS   RS   X    X    X    0    0    1    1    0  -- NOT IMPLEMENTED
# SUB     1    1    1    1    RD   RD   RS   RS   X    X    X    0    0    1    1    1  -- NOT IMPLEMENTED

# AND     1    1    1    1    RD   RD   RS   RS   X    X    X    0    1    0    0    0  -- NOT IMPLEMENTED
# OR      1    1    1    1    RD   RD   RS   RS   X    X    X    0    1    0    0    1  -- NOT IMPLEMENTED
# XOR     1    1    1    1    RD   RD   RS   RS   X    X    X    0    1    0    1    0  -- NOT IMPLEMENTED

# ASL     1    1    1    1    RD   RD   RS   RS   X    X    X    0    1    0    1    1  
# ASR     1    1    1    1    RD   RD   RS   RS   X    X    X    0    1    1    0    0  

# XOP4    1    1    1    1    RD   RD   RS   RS   X    X    X    0    1    1    0    1  -- NOT IMPLEMENTED
# XOP5    1    1    1    1    RD   RD   RS   RS   X    X    X    0    1    1    1    0  -- NOT IMPLEMENTED
# XOP6    1    1    1    1    RD   RD   RS   RS   X    X    X    0    1    1    1    1  -- NOT IMPLEMENTED
# XOP7    1    1    1    1    RD   RD   RS   RS   X    X    X    1    0    0    0    0  -- NOT IMPLEMENTED
# XOP8    1    1    1    1    RD   RD   RS   RS   X    X    X    1    0    0    0    1  -- NOT IMPLEMENTED
# XOP9    1    1    1    1    RD   RD   RS   RS   X    X    X    1    0    0    1    0  -- NOT IMPLEMENTED

start:
  move RA 0x01        # RA = 0x0001
  move RB 0x01        # RB = 0x0001
  move RC 0x01        # RC = 0x0001
  move RD 0x01        # RD = 0x0001

  move RA 0x80        # RA = 0xFF80
  move RB 0x80        # RB = 0xFF80 
  move RC 0x80        # RC = 0xFF80
  move RD 0x80        # RD = 0xFF80

  movel RA 0x01       # RA = 0x0001
  movel RB 0x01       # RB = 0x0001
  movel RC 0x01       # RC = 0x0001
  movel RD 0x01       # RD = 0x0001

  movel RA 0x80       # RA = 0xFF80
  movel RB 0x80       # RB = 0xFF80
  movel RC 0x80       # RC = 0xFF80
  movel RD 0x80       # RD = 0xFF80

  moveh RA 0x01       # RA = 0x0100
  moveh RB 0x01       # RB = 0x0100
  moveh RC 0x01       # RC = 0x0100
  moveh RD 0x01       # RD = 0x0100

  moveh RA 0x80       # RA = 0x8000
  moveh RB 0x80       # RB = 0x8000
  moveh RC 0x80       # RC = 0x8000
  moveh RD 0x80       # RD = 0x8000

  moveu RA 0x01       # RA = 0x0001       
  moveu RB 0x01       # RB = 0x0001
  moveu RC 0x01       # RC = 0x0001
  moveu RD 0x01       # RD = 0x0001

  moveu RA 0x80       # RA = 0x0080
  moveu RB 0x80       # RB = 0x0080
  moveu RC 0x80       # RC = 0x0080
  moveu RD 0x80       # RD = 0x0080 

  add RA 0x01         # RA = 0x0081
  add RB 0x01         # RB = 0x0081
  add RC 0x01         # RC = 0x0081
  add RD 0x01         # RD = 0x0081

  sub RA 0x01         # RA = 0x0080
  sub RB 0x01         # RB = 0x0080
  sub RC 0x01         # RC = 0x0080
  sub RD 0x01         # RD = 0x0080

  move RA 0xFF        # RA = 0xFFFF
  move RB 0xFF        # RB = 0xFFFF
  move RC 0xFF        # RC = 0xFFFF
  move RD 0xFF        # RD = 0xFFFF

  and RA 0x55         # RA = 0x0055
  and RB 0x55         # RB = 0x0055
  and RC 0x55         # RC = 0x0055
  and RD 0x55         # RD = 0x0055

  or RA 0xAA          # RA = 0x00FF
  or RB 0xAA          # RB = 0x00FF
  or RC 0xAA          # RC = 0x00FF
  or RD 0xAA          # RD = 0x00FF

  xor RA 0xCC         # RA = 0x0033 
  xor RB 0xCC         # RB = 0x0033 
  xor RC 0xCC         # RC = 0x0033 
  xor RD 0xCC         # RD = 0x0033 

  addm RA one         # RA = 0x0034
  subm RA one         # RA = 0x0033
 
                      # store RA to memory data 0-9
  store RA data0      # RA = 0x0033
  addm RA one         # RA = 0x0034
  store RA data1
  addm RA one         # RA = 0x0035
  store RA data2
  addm RA one         # RA = 0x0036
  store RA data3
  addm RA one         # RA = 0x0037
  store RA data4
  addm RA one         # RA = 0x0038
  store RA data5
  addm RA one         # RA = 0x0039
  store RA data6
  addm RA one         # RA = 0x003A
  store RA data7
  addm RA one         # RA = 0x003B
  store RA data8
  addm RA one         # RA = 0x003C
  store RA data9
  addm RA one         # RA = 0x003D

                      # load memory data 0-9
  load RA data0       # RA = 0x0033
  load RA data1       # RA = 0x0034
  load RA data2       # RA = 0x0035
  load RA data3       # RA = 0x0036
  load RA data4       # RA = 0x0037
  load RA data5       # RA = 0x0038
  load RA data6       # RA = 0x0039
  load RA data7       # RA = 0x003A
  load RA data8       # RA = 0x003B
  load RA data9       # RA = 0x003C

  move RA 0xFF        # RA = 0xFFFF
  move RB 0xFF        # RB = 0xFFFF 
  move RC 0xFF        # RC = 0xFFFF
  move RD 0xFF        # RD = 0xFFFF 

  asl RA              # RA = 0xFFFE 1111 1111 1111 1110
  shl RA              # RA = 0xFFFD 1111 1111 1111 1101
  asl RB              # RB = 0xFFFE
  shl RB              # RB = 0xFFFD
  asl RC              # RC = 0xFFFE
  shl RC              # RC = 0xFFFD
  asl RD              # RD = 0xFFFE  
  shl RD              # RD = 0xFFFD
         
  asr RA              # RA = 0x7FFE 0111 1111 1111 1110
  shr RA              # RA = 0xBFFF 1011 1111 1111 1111
  asr RB              # RB = 0x7FFE 
  shr RB              # RB = 0xBFFF
  asr RC              # RC = 0x7FFE
  shr RC              # RC = 0xBFFF
  asr RD              # RD = 0x7FFE
  shr RD              # RD = 0xBFFF

  move RA 0x00        # RA = 0x0000
  move RB 0x00        # RB = 0x0000 
  move RC 0x00        # RC = 0x0000
  move RD 0x00        # RD = 0x0000

  move RD 48          # walk a 1 through RA, RB and RC
  move RA 1           # RD is loop counter
loop:
  asl RA
  shl RB
  shl RC
  sub RD 1
  jumpnz loop 
  
  move RB 0x00        # compare RA to 5
  move RA 5           # RB: 0x0000 = OK, 0xFFFF = ERROR
  cmp RA 5
  jumpz cmp_eq        # is RA = 5

  move RB 0xFF

cmp_eq:
  cmp RA 10           # is RA = 10
  jumpz cmp_neq
  jump next

cmp_neq:
  move RB 0xFF

next: 
  test RA 1           # is RA(0) 1?
  jumpnz test_eq

  move RB 0xFF

test_eq:
  test RA 2           # is RA(1) 1? (RA = 5, so it should be 0)
  jumpnz test_neq     # bit set is the error case
  jump next1

test_neq:
  move RB 0xFF

next1:
  call subA           # test nested subroutines, call depth 8
  move RA 16          # if working correctly the values 1 to 15 are
                      # written to RA in turn, followed by 16 here
trap:
  jump trap           # finish

subA:
  move RA 1           # subroutine A
  call subB
  move RA 15
  ret

subB:
  move RA 2           # subroutine B
  call subC
  move RA 14
  ret

subC:
  move RA 3           # subroutine C
  call subD
  move RA 13
  ret

subD:
  move RA 4           # subroutine D
  call subE
  move RA 12
  ret

subE:
  move RA 5           # subroutine E
  call subF
  move RA 11
  ret

subF:
  move RA 6           # subroutine F
  call subG
  move RA 10
  ret

subG:
  move RA 7           # subroutine G
  call subH
  move RA 9
  ret

subH:
  move RA 8           # subroutine H
  ret                 

one:
  .data 1

data0:
  .data 0
data1:
  .data 0
data2:
  .data 0
data3:
  .data 0
data4:
  .data 0
data5:
  .data 0
data6:
  .data 0
data7:
  .data 0
data8:
  .data 0
data9:
  .data 0
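
The register values in the comments of the listing above imply how each immediate form fills a 16-bit register. The sketch below is a small Python model of that behaviour, not the hardware itself. The function names are made up for illustration, and the rules are inferred only from the commented values: move/movel sign-extend the 8-bit constant, moveh puts it in the high byte and clears the low byte, moveu zero-extends it, and and/or/xor use a zero-extended constant.

```python
MASK16 = 0xFFFF  # registers are 16 bits wide


def move(k):
    """move / movel: 8-bit constant sign-extended to 16 bits (assumed)."""
    k &= 0xFF
    return (k | 0xFF00) if k & 0x80 else k


def moveh(k):
    """moveh: constant placed in the high byte, low byte cleared (assumed)."""
    return ((k & 0xFF) << 8) & MASK16


def moveu(k):
    """moveu: 8-bit constant zero-extended to 16 bits (assumed)."""
    return k & 0xFF


def and_imm(r, k):
    """and: register ANDed with the zero-extended constant (assumed)."""
    return r & (k & 0xFF)


def or_imm(r, k):
    """or: register ORed with the zero-extended constant (assumed)."""
    return (r | (k & 0xFF)) & MASK16


def xor_imm(r, k):
    """xor: register XORed with the zero-extended constant (assumed)."""
    return (r ^ (k & 0xFF)) & MASK16


if __name__ == "__main__":
    # Replay the move / and / or / xor part of the test program for RA.
    ra = move(0xFF)
    ra = and_imm(ra, 0x55)
    ra = or_imm(ra, 0xAA)
    ra = xor_imm(ra, 0xCC)
    print(hex(ra))
```

If the real CPU sign-extended the constant for or and xor, the commented results 0x00FF and 0x0033 would have a high byte of 0xFF instead, which is why the model zero-extends there.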

WORK IN PROGRESS
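
The shift results commented for RA in the test program (0xFFFF, then 0xFFFE, 0xFFFD, 0x7FFE, 0xBFFF) are consistent with asl/asr shifting in a 0 and shl/shr shifting in the carry left by the previous shift. That reading is an assumption drawn from the comments, not from the hardware description, and the Python names below are illustrative only. Under that assumption the asl RA / shl RB / shl RC loop behaves as one 48-bit shift register, which is what lets a single 1 walk through all three registers.

```python
MASK16 = 0xFFFF  # registers are 16 bits wide

# Each function returns (new_register_value, new_carry).


def asl(r, carry):
    """Shift left, 0 enters bit 0, bit 15 goes to carry (assumed)."""
    return (r << 1) & MASK16, (r >> 15) & 1


def shl(r, carry):
    """Shift left, old carry enters bit 0, bit 15 goes to carry (assumed)."""
    return ((r << 1) | carry) & MASK16, (r >> 15) & 1


def asr(r, carry):
    """Shift right, 0 enters bit 15, bit 0 goes to carry (assumed)."""
    return (r >> 1) & MASK16, r & 1


def shr(r, carry):
    """Shift right, old carry enters bit 15, bit 0 goes to carry (assumed)."""
    return ((r >> 1) | (carry << 15)) & MASK16, r & 1


def walk(steps):
    """Model the 'walk a 1' loop: RA = 1, RB = RC = 0, run `steps` passes."""
    ra, rb, rc, carry = 1, 0, 0, 0
    for _ in range(steps):
        ra, carry = asl(ra, carry)   # bit 15 of RA drops into carry
        rb, carry = shl(rb, carry)   # carry enters RB, RB bit 15 leaves
        rc, carry = shl(rc, carry)   # carry enters RC
    return ra, rb, rc


if __name__ == "__main__":
    for n in (15, 16, 32, 47, 48):
        print(n, [hex(v) for v in walk(n)])
```

In this model the sub RD 1 in the real loop does not matter to the walk even if it changes the carry, because asl starts each pass and ignores the incoming carry.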

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Contact email: mike@simplecpudesign.com
