Minimal CPU

Home

When is a simple cpu not simple enough? Answer: when you are learning to program in assembler :). When i started my journey into computers at school you had one choice: BBC Micro (Link) and Basic (Link). This computer is based on the 6502 (Link), a simple 8bit processor, allowing anyone with a basic understanding of electronics to understand this computer's hardware, shown in figure 1. These were also simpler times with regards to programming languages, typically starting with Basic, but soon progressing to assembler (Link) owing to the performance limitations of the interpreted Basic programming language. As a result students had a shorter path from software to hardware, so had a better low level understanding of how a computer worked.

Figure 1 : BBC computer schematic

Today the distance between software and hardware is huge, software components abstract away from the underlying hardware, separating the user from the processor. As a result students coming to computer science do not have that DIY, low level, electronics experience of old. Also, the programming style used by modern high level programming languages i.e. function / method focused, OO style, is a world removed from low level programming languages such as assembler, where a basic understanding of the underlying hardware is required. As a result students need to relearn how to program, how to breakdown a problem into a sequence of low level, simple instructions, rather than relying on built in functions and libraries. To aid is this transition from a high level language such as python / java, to the simple assembly language used on the SimpleCPU, i designed an assembly like language for a minimalCPU processor.

MinimalCPU v1
Instruction Set Simulator - v1
Instruction Set Simulator - v2

MinimalCPU v1

Figure 2 : SimpleCPU version 1

Traditional assembly languages come with the joys of having to know about the hardware, consider the SimpleCPU version 1 shown in figure 2. To understand this block diagram a student needs to understand its architectural building blocks: buses, multiplexers, registers, accumulator etc. This knowledge is needed to understand the function and terminology used by its assembly language instructions. Therefore, the minimal CPU needs to remove this requirement, whilst still using an assembler style syntax, it needs to abstract away from: Busses, Memory and Registers etc. This may seem like fundamental stuff, when considering computer architectures and you would be correct, however, the focus here is develop an architecture, an assembly language that is easy to understand, rather than being one that is easy to implement in hardware. An example program is shown below:

start:
  add A 1
  add B 2
test:
  and A B

The minimalCPU does not have the concept of an address, rather data is stored in variables, A to Z, so it has 26 data "storage" locations. Therefore, it also has no concept of memory i.e. data or instruction locations, rather its memory is the text file used to store its program, it is a list of instructions that will be sequentially executed, one after another. Positions within this program can be identified by labels i.e. text ending with a ":". In the above example there are two labels start and test. Therefore, from one point of view, you can consider this machine to be comparable to a type of data-flow architecture, as shown in figure 3. With each instruction setting up a data path, from our variables, through a functional block and then back into a variable.

Figure 3 : processing architecture

Instructions can be either one operand or two operand formats. An instruction has an OPCODE that defines its functions and then one or two OPERANDs that define the data that this function will process. To keep things simple the first version of the CPU supports three instructions, using two addressing modes i.e. methods of getting your data, "variables" and "constants" :

INSTRUCTION      DESCRIPTION                               EXAMPLE 

AND X Y          Perform the bitwise AND function on       X = X AND Y
                 two variables X and Y, storing the        AND 11011011 00001111
                 result in variable X.                     
                                                           X = 11011011
                                                           Y = 00001111
                                                               --------
                                                               00001011
                                                               --------
                
AND X 10         Perform the bitwise AND function on       X = X AND 10
                 variable X and constant 10, storing       AND 11011011 00001010
                 the result in variable X.                     
                                                           X = 11011011
                                                           Y = 00001010
                                                               --------
                                                               00001010
                                                               --------

ADD X Y          Perform the ADDITION function on          X = X ADD Y
                 two variables X and Y, storing the        ADD 11011011 00001111
                 result in variable X.                     
                                                           X = 11011011
                                                           Y = 00001111
                                                               --------
                                                               11101010
                                                               --------
                                                                 11111

ADD X 10         Perform the ADDITION function on          X = X ADD Y
                 variable X and constant 10, storing       ADD 11011011 00001010
                 the result in variable X.                     
                                                           X = 11011011
                                                           Y = 00001010
                                                               --------
                                                               11100101
                                                               --------
                                                                 11 1

JPZ loop         Jump to label "loop" is the last          start:      
                 calculation generated a zero result,          AND Z 0 
                 this example implementing a loop              JPZ start

The two operand syntax: OPCODE : OPERAND_0 : OPERAND_1. Operand_0 must be a variable as this is used to store the result. Operand_1 can be a variable or a constant. Both variables and constants are 8bit values. Note, i decided to break my rule regarding abstracting away from the hardware here as this allows more interesting quizzz questions later, but for now what this means is that variables and constants are limited to integers within the range 0 to 255. Early quizzz questions limit calculations to results within this range, so all is fine, however, if you do generate a result bigger than 255 the result will loop around back through 0 e.g. 255+1=0, 255+2=1, 255+3=2 ...

The one operand syntax: OPCODE : OPERAND_0, is used by the JUMP instruction, this is a label within the program e.g. in the previous example program if you wanted to jump back to the beginning of program you could execute the instruction : jpz start, where start is a label. Note, this jump is only performed if the pevious calculation produced a zero result i.e. there are no unconditional jumps, exactly the same as the classic first generation computer EDSAC (Link).

Using just these three instructions the first programming task for the students is to answer the questions below. The aim here is force students to consider how to combine these simple instructions, such that they can implement the functionality needed to solve these more complex programming tasks :). You can download a PDF copy of these questions here: (Link).

Q1) the memory locations X and Y are initialised with the values:

128 64 32 16 8 4 2 1
X = 14 = 0 0 0 0 1 1 1 0
Y = 35 = 0 0 1 0 0 0 1 1

Consider the program below, what is the final value stored in memory location X? Hint, remember
that the first variable i.e. X in this example, is overwritten with the result.

ADD X, Y
AND X, Y

Q2) write a program to initialise memory locations X and Y to the following values
X = 123 and Y = 0x123 i.e. load these values into memory.

Hint, you can pass an instruction a hard-coded value e.g. AND X, 67, rather than two variables
e.g. AND X, Y. Also, consider the two scenarios where the original value stored in the memory
location is zero and when it is a non zero value. A number starting with 0x indicates that the
following value is in hexadecimal i.e. base 16, rather than the default base 10.

Q3) write a program to perform the following calculation.

X = X * 5

Hint, what is multiplication? You will need to use multiple instructions to implement this function.

Q4) write a program to perform the following calculation. Your solution must use less than eight
instructions.

X = X * 16

Q5) write a program to perform the following calculation i.e. the multiplication of two unknown
variables.

X = X * Y

Hint, you will need to use the JUMP instruction. Jumps are made to labels within a program i.e.
a name (label) ending with a ":". Consider the example below, the JPZ instruction jumps to the
label START, if the result of the ADD instruction produces a zero result, otherwise the next
instruction in the program is executed i.e. this code gets stuck in an infinite loop.

START: START:
AND X 0 X=0
AND Y 0 Y=0
ADD X Y X=X+Y
JPZ START IF RESULT = 0 THEN START
ADD Z X Z=Z+X

Q6) write a program to perform the following calculation.

X = X OR Y

where the OR function performs a bitwise OR on the memory locations X and Y. Like the bitwise
AND instruction, the bitwise OR instruction will perform the logical OR function on each bit
position. Consider the exmaple below:

OR 11011011 00001111

X = 11011011
Y = 00001111
--------
11011111
--------

Your solution must be less than 16 instructions. You may assume X and Y have already been
loaded into memory. Hint, this one is tricky :)

Instruction Set Simulator - v1

Don't judge :), this is a cheap and cheerful, command line, instruction set simulator for the minimalCPU, written in everyone's favourite programming language python. The aim here was not to write the fastest simulator, or the most compacted, or to factor out each function, but rather to produce code that was simple to read / understand. Code that an inexperienced programmer could understand and improve. You can download a copy here: (Link). Note, still a work in progress so may have some "features", not fully tested this code in Windows, i only play game in Windows :).

#!/usr/bin/python3

import getopt
import sys
import os
import re
import time

if sys.platform =='win32':
  import msvcrt
else:
  import tty
  import termios

###################
# INSTRUCTION-SET #
###################

# a very simple instruction set simulator for the minimal cpu, 
# program file plain text, single step through program.

#############
# VERSION 1 #
#############

# ADD X Y       ADD X 0
# AND X Y       AND X 0
# JPZ 

#############
# VERSION 2 #
#############

# ADD X Y       ADD X 0
# AND X Y       AND X 0
# XOR X Y       XOR X 0
# JPZ 

#############
# VERSION 3 #
#############

# ADD X Y       ADD X 0
# AND X Y       AND X 0
# OR  X Y       OR  X 0 
# XOR X Y       XOR X 0
# JPZ 

#############
# FUNCTIONS #
#############

def get_key():
  if sys.platform == 'win32':
    return msvcrt.getch().decode('utf-8')
  else:
    fd = sys.stdin.fileno()
    old_settings = termios.tcgetattr(fd)
    try:
      tty.setraw(sys.stdin.fileno())
      ch = sys.stdin.read(1)
    finally:
      termios.tcsetattr(fd, termios.TCSADRAIN, old_settings)
    return ch

def pad(number):
  if number <10:
    return "00" + str(number)
  elif number >9 and number <100:
    return "0" + str(number)
  else:
    return str(number)

################
# MAIN PROGRAM #
################

def minimal_cpu_simulator(argv):

  if len(sys.argv) <= 1:
    print ("Usage: minimal_cpu_simulator.py -i ")
    print ("                                -v ") 
    print ("                                -b ") 
    print ("                                -d <0/1/2>") 
    return
  
  version = 1
  debug = False
  debug_level = 0
  breakpoint = False
  input_file_present = False
  run = False

  line_number = 0
  max_line_number = 0
  jump_line_number = 0
  breakpoint_line_number = 0
  zero_flag = 0

  s_config = 'i:v:d:b:'
  l_config = ['input', 'version', 'debug', 'breakpoint']

  instructions = ['and', 'add', 'jpz', 'xor', 'or']

  variables = { 'a':0, 'b':0, 'c':0, 'd':0, 'e':0, 'f':0, 'g':0, 'h':0, 'i':0, 'j':0,
                'k':0, 'l':0, 'm':0, 'n':0, 'o':0, 'p':0, 'q':0, 'r':0, 's':0, 't':0,
                'u':0, 'v':0, 'w':0, 'x':0, 'y':0, 'z':0 }

  source = []
  source_filename = ""
  code = []
  labels = {}  

  try:
    options, remainder = getopt.getopt(sys.argv[1:], s_config, l_config)
  except getopt.GetoptError as m:
    print("Error: invalid arguments -", m)
    sys.exit(1)

  # process arguments #
  # ----------------- #

  for opt, arg in options:
    if opt in ('-i', '--input'):
      if ".asm" in arg:
        source_filename = arg
      else:
        source_filename = arg + ".asm"

      if os.path.isfile(source_filename):
        input_file_present = True
    elif opt in ('-v', '--version'):
      version = int(arg)
    elif opt in ('-d', '--debug'):
      if int(arg) == 0:
        debug = False
      elif int(arg) == 1:
        debug = True
        debug_level = 1
      else:
        debug = True
        debug_level = 2
    elif opt in ('-b', '--breakpoint'):
      breakpoint = True
      breakpoint_line_number = arg

  if debug:	  
    print ("read input parameter : OK")
    if debug_level == 2:
      print( str(source_filename) + " " + str(version) + " " + str(debug) + "\n")

  # read source file #
  # ---------------- #

  if input_file_present:
    try:
      source_file = open(source_filename, "r")
      source = source_file.readlines()
    except IOError: 
      print("Error: could not open source file")
      sys.exit(1) 

  else:
    print("Error: could not find source file")
    sys.exit(1) 

  if debug:	  
    print ("read code : OK")
    if debug_level == 2:
      print ( source )
      print("")

  # seperate labels and instructions #
  # -------------------------------- #

  for line in source:	
    line = re.sub(r'\s+', ' ', line.replace('\n','').replace('\r','').lower())
    if line == '': 
      continue

    if ":" in line:
      key = re.sub(":$", '', line)
      if key in labels:
        print("Error: duplicate labels")
        sys.exit(1)
      else:
        labels[key] = line_number
    else:
      words = line.split(' ')
      if words[1] in instructions:	
        if len(words) == 4:
          code.append( (words[1], words[2], words[3]) )
        elif len(words) == 3:
          code.append( (words[1], words[2]) )
        else:
          print("Error: invalid instruction: " + str(words)) 
          sys.exit(1)
        line_number = line_number + 1
      else:
        print("Error: invalid instruction: " + str(words)) 
        sys.exit(1)

  if debug:	  
    print ("code processed : OK")
    print ("instructions : " + str(line_number))
    if debug_level == 2:
      print( code )
      print( labels )
    print("")

  # run program #
  # ----------- #

  max_line_number = line_number
  line_number = 0

  while True:
    instruction = code[line_number]

    opcode = instruction[0]
    operand_0 = instruction[1]

    absolute = False
    immediate = False
    jump = False

    # check breakpoint #
    # ---------------- #

    if breakpoint:
      if line_number == labels[breakpoint_line_number]:
        print("Breakpoint")
        print( memory )
        return

    # decode instruction #
    # ------------------ #

    if len(instruction) == 3:
      operand_1 = instruction[2]
      if operand_1.isalpha():
        absolute = True
      else:
        immediate = True

    # AND #
    # --- #

    if opcode == "and":
      if immediate: 
        result = (variables[operand_0] & 255) & (int(operand_1) & 255)
        variables[operand_0] = result

      else:
        result = (variables[operand_0] & 255) & (variables[operand_1] & 255)
        variables[operand_0] = result

    # ADD #
    # --- #

    elif opcode == "add":
      if immediate: 
        result = ((variables[operand_0] & 255) + (int(operand_1) & 255)) & 255
        variables[operand_0] = result

      else:
        result = ((variables[operand_0] & 255) + (variables[operand_1] & 255)) & 255
        variables[operand_0] = result

    # JPZ #
    # --- #

    elif opcode == "jpz":
      if zero_flag:
        jump_line_number = labels[operand_0]
        jump = True
      else:
        jump = False

    # XOR #
    # --- #

    elif opcode == "xor" and version >= 1:
      if immediate: 
        result = (variables[operand_0] & 255) ^ (int(operand_1) & 255)
        variables[operand_0] = result

      else:
        result = (variables[operand_0] & 255) ^ (variables[operand_1] & 255)
        variables[operand_0] = result

    # OR #
    # -- #

    elif opcode == "or" and version >= 2:
      if immediate: 
        result = (variables[operand_0] & 255) | (int(operand_1) & 255)
        variables[operand_0] = result

      else:
        result = (variables[operand_0] & 255) | (variables[operand_1] & 255)
        variables[operand_0] = result

    else:
      print("Error: invalid instruction: " + str(instruction)) 
      sys.exit(1)

    # update zero flag #
    # ---------------- #

    if result == 0:
      zero_flag = True
    else:
      zero_flag = False

    # update line counter #
    # ------------------- #

    if jump:
      print (pad(line_number) + " : " + opcode + " " + operand_0 + " : " + str(jump_line_number) + " = " + str(jump) )
      line_number = jump_line_number
    else:
      print (pad(line_number) + " : " + opcode + " " + operand_0 + " " + str(operand_1) + " : " + operand_0 + " = " + str(result) + ", zf = " + str(zero_flag))
      line_number = line_number + 1

      if line_number == max_line_number:
        break

    if not run: 
      print("    : r=run, s=step, v=variables, q=quit")
      while True:
        key = get_key()
        if key == 'q':
          return
        elif key == 'r':
          run = True
          break
        elif key == 's':
          break
        elif key == 'v':
          line = "    : "
          count = 0
          for v in variables:
            line = line + str(v) + ":" + str(variables[v]) + " "
            count = count + 1
            if count == 14 or count == 26:
              print(line)
              line = "    : "
    else:
      time.sleep(0.25)
   

if __name__ == '__main__':
  minimal_cpu_simulator(sys.argv)

To test this simulator i used the simple test program shown below. This program loads X=252 and Y=1. Then enters a while loop, incrementing X, until an overflow occurs, causing the program to finish. Note, the final add x 0 is a NOP i.e. a No Operation instruction. You can download a copy here: (Link).

start:
    and x 0
    and y 0
    add x 252
    add y 1
loop:
    add x y
    jpz exit
    add z 0
    jpz loop
exit:
    add x 0

This simulator is a command line program, therefore, to run this program you will need to open your favourite flavour of command line e.g. for Windows Anaconda (Link), or Linux your normal terminal, as shown in figure 4.

Figure 4 : command line

At the command line you can specify:

input file -i parameter : the assembly code that you want to test, a plain text file using the extension .asm i.e. its just a normal ASCII text file.
version of the minimal cpu -v parameter : at the moment there are 3 versions of the minimal cpu, discussed later, but basically an incremental expansion of the instruction set, adding new instructions to allow more complex programs, default is version 1, 2=version2, 3=version3.
breakpoint -b parameter : causes the simulator to stop on a label e.g. for the above code example if the parameter -b loop was specified, the simulator would stop after it had executed the ADD Y 1 instruction, before the ADD X Y instruction was executed. The simulator will then print the contents of the variables A - Z to the screen and finish.
debug -d parameter : prints to the console debugging information i.e. 0=none, 1=some, 2=all.

A screen shot of the above program's instruction trace is shown in figure 5. The first three digit number is the line number in the text file of the instruction being executed, starting at line 0. Why 0, well everything starts at 0 in computer science :). ZF=Zero flag, this is set to TRUE if the last calculation produced a zero result.

Figure 5 : instruction trace

The simulator executes the first instruction, the user can choose to single step through the program i.e. execute one instruction at a time, by pressing the S key, or run to completion by pressing the R key. After each instruction is executed the status of the updated variable and ZF are printed to the screen. You can also display the state of all variables by pressing the V key. To finish the program before it is completed press the Q key. Note, if you press the R key and you code get stuck in an infinite loop, CRTL-C is your friend :).

Instruction Set Simulator - v2

Some small improvements, updates. Added a -s commandline switch, this allows a used to initialise a variable before run time. The parameter format is VARIABLE:VALUE e.g. -s a:123 will initialise the variable 'A' to the value 123, an example is shown in figure 6 below.

Figure 6 : initialising variables

In this example the program file code.asm contains a single instruction ADD X Y. The variables X is set to 10 and Y is set to 30 using the -s switch. To see the final result another small change was made to the code, on completion of the program all variables are printed to the screen i.e. so we can see the result X = 30. You can download a copy here: (Link).

WORK IN PROGRESS

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Contact email: mike@simplecpudesign.com

Back

Minimal CPU

Table of Contents

MinimalCPU v1

Instruction Set Simulator - v1

Instruction Set Simulator - v2