Make your own Assembler simulator in JavaScript (Part 2)

In part 1 we created the CPU. In this second and final part, we will focus on the memory, the console output, the assembler and the UI.

The memory

We will use a simple array to represent the memory. Each slot contains a JavaScript numeric value. In theory, a slot could hold a value bigger than a byte, but our CPU code makes sure all values stay in the range 0 to 255.

The memory has three functions. load retrieves a byte from the given address, while store writes a given byte value to the specified address. Both throw an error if the given address is outside the valid address space. The third function, reset, sets all memory values back to zero. We will use it to initialize and reset the simulator.

var memory = Array(256);

function load(address) {
    if (address < 0 || address >= memory.length) {
        throw "Memory access violation. Address: " + address;
    }

    return memory[address];
};
        
function store(address, value) {
    if (address < 0 || address >= memory.length) {
        throw "Memory access violation. Address: " + address;
    }

    memory[address] = value;
};

function reset() {
    for(var i=0; i < memory.length; i++) {
        memory[i] = 0;
    }
};

It's as simple as that. The complete code can be found in memory.js.
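
As a quick check of the behaviour described above, here is a small usage sketch. It repeats the memory functions in compact form so that it runs on its own; the checkAddress helper is a hypothetical addition for brevity and is not taken from memory.js.

```javascript
// Compact copy of the memory functions so this example is self-contained.
var memory = new Array(256).fill(0);

// Hypothetical helper (not in memory.js): shared bounds check.
function checkAddress(address) {
    if (address < 0 || address >= memory.length) {
        throw "Memory access violation. Address: " + address;
    }
}

function load(address) {
    checkAddress(address);
    return memory[address];
}

function store(address, value) {
    checkAddress(address);
    memory[address] = value;
}

// Write a byte and read it back.
store(10, 72);
var loaded = load(10);

// Address 256 is one past the last valid address (255), so load throws.
var violation = null;
try {
    load(256);
} catch (e) {
    violation = e;
}
```

After running this, loaded holds the stored value and violation holds the error message thrown by load.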

The console output

The last part of our virtual computer is the console output. It can display 24 characters and maps the last 24 bytes of the memory (addresses 232 to 255) to the output. So for a program to write something to the output, all it needs to do is write the data into those last 24 bytes of memory. No extra code is needed on our side :).
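
The memory needs no extra code for this, but whatever displays the console still has to turn those bytes into characters. A minimal sketch of that step, assuming a 256-byte memory: renderConsole and CONSOLE_SIZE are hypothetical names, and showing zero bytes as spaces is a display choice made only for this example.

```javascript
var CONSOLE_SIZE = 24; // number of characters the console can display

// Hypothetical helper: converts the last CONSOLE_SIZE bytes into a string.
function renderConsole(memory) {
    var start = memory.length - CONSOLE_SIZE; // 232 for a 256-byte memory
    var text = "";

    for (var i = start; i < memory.length; i++) {
        // Assumption: an empty (zero) cell is shown as a space.
        text += memory[i] === 0 ? " " : String.fromCharCode(memory[i]);
    }

    return text;
}

// A program "prints" by storing character codes in the mapped area.
var memory = new Array(256).fill(0);
var message = "Hello";

for (var i = 0; i < message.length; i++) {
    memory[232 + i] = message.charCodeAt(i);
}

var output = renderConsole(memory);
```

Here output is always 24 characters long and starts with the message written by the program.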

The assembler

With the basic components of our virtual computer finished, one last piece is missing: how do we assemble code into CPU instructions?

To do this, we need to write our own assembler. The assembler goes through each line of the code, parses the operands and generates the CPU instructions.

To parse the code, we will use a regular expression. This is a deliberately simple approach, because a regex cannot fully parse code in general. A proper parser would generate an AST and continue from there. But for simplicity, and because Assembly code has a simple structure, we will use a regex.

The regex is defined below and assigns each component of an Assembler code line to a group.

// Matches: "label: INSTRUCTION (["')OPERAND1(]"'), (["')OPERAND2(]"')
// GROUPS:      1       2               3                    7
var regex = /^[\t ]*(?:([.A-Za-z]\w*)[:])?(?:[\t ]*([A-Za-z]{2,4})(?:[\t ]+(\[(\w+((\+|-)\d+)?)\]|\".+?\"|\'.+?\'|[.A-Za-z0-9]\w*)(?:[\t ]*[,][\t ]*(\[(\w+((\+|-)\d+)?)\]|\".+?\"|\'.+?\'|[.A-Za-z0-9]\w*))?)?)?/;

// Regex group indexes
var GROUP_LABEL = 1;
var GROUP_OPCODE = 2;
var GROUP_OPERAND1 = 3;
var GROUP_OPERAND2 = 7;
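
To see which group ends up where, the regex can be run against a sample line. The sketch below repeats the regex and group indexes from above so that it runs on its own. The sample lines and the HLT mnemonic are made up for illustration; the regex itself accepts any mnemonic of two to four letters.

```javascript
var regex = /^[\t ]*(?:([.A-Za-z]\w*)[:])?(?:[\t ]*([A-Za-z]{2,4})(?:[\t ]+(\[(\w+((\+|-)\d+)?)\]|\".+?\"|\'.+?\'|[.A-Za-z0-9]\w*)(?:[\t ]*[,][\t ]*(\[(\w+((\+|-)\d+)?)\]|\".+?\"|\'.+?\'|[.A-Za-z0-9]\w*))?)?)?/;

var GROUP_LABEL = 1;
var GROUP_OPCODE = 2;
var GROUP_OPERAND1 = 3;
var GROUP_OPERAND2 = 7;

// A full line: label, instruction and two operands.
var match = regex.exec("loop: ADD A, [100]");
var parts = {
    label: match[GROUP_LABEL],       // "loop"
    opcode: match[GROUP_OPCODE],     // "ADD"
    operand1: match[GROUP_OPERAND1], // "A"
    operand2: match[GROUP_OPERAND2]  // "[100]"
};

// Parts that are missing from a line leave their groups undefined.
var bare = regex.exec("  HLT");

// Every part of the regex is optional, so even an empty line produces
// a match object; its opcode group is simply undefined.
var empty = regex.exec("");
```

This is why the run function only has to check match[GROUP_OPCODE] to decide whether a line contains an instruction.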

The assembler has one main function called run. It takes the code as a parameter, parses it and generates all instructions. The result is returned as an array of instructions that can be loaded into the memory. As you may have guessed, the returned array is the executable program which we will later run inside the simulator.

In Assembly, each instruction needs to be on a separate line. This makes it easy for us to parse it. Running the assembler will:

  1. Split the code line by line. Each line is then processed separately.
  2. Make sure the line contains a valid instruction.
  3. If so, read the operands required by the instruction. Each instruction can have 0-2 operands. The readOperand function parses an operand to see if it is a register, a memory address or a constant value. It returns the type (register, regaddress, address, number) and the value.
  4. Determine the correct opcode depending on the instruction and the types of its operands. As described in the first part of the tutorial, each opcode serves only one specific instruction variant, so different variants have different opcodes.
  5. Add the final instruction, including any operands, to the code array.
  6. Repeat steps 2 to 5 until all lines are parsed, then return the code array.

function run(input) {
    var code = [];      // the assembled program: opcodes and operands
    var opCode;
    var lines = input.split('\n');
    
    for (var i = 0, l = lines.length; i < l; i++) {
        var match = regex.exec(lines[i]);
    
        if (match[GROUP_OPCODE]) {
            var instr = match[GROUP_OPCODE].toUpperCase();
            switch (instr) {
                case 'ADD':
                    var op1 = readOperand(match[GROUP_OPERAND1]);
                    var op2 = readOperand(match[GROUP_OPERAND2]);
                    
                    if (op1.type === "register" && op2.type === "register")
                        opCode = OpCodes.ADD_REG_TO_REG;
                    else if (op1.type === "register" && op2.type === "regaddress")
                        opCode = OpCodes.ADD_REGADDRESS_TO_REG;
                    else if (op1.type === "register" && op2.type === "address")
                        opCode = OpCodes.ADD_ADDRESS_TO_REG;
                    else if (op1.type === "register" && op2.type === "number")
                        opCode = OpCodes.ADD_NUMBER_TO_REG;
                    else
                        throw "ADD does not support these operands";
                    
                    code.push(opCode, op1.value, op2.value);
                    
                    break;
                case ...
                case ...
                default:
                    throw "Not a valid instruction: " + instr;
            }
        }
    }
    
    return code;
};

The complete assembler code, including error handling and the details of the readOperand function, is available in asm.js.
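
To give a rough idea of what readOperand does, here is a minimal sketch. It is not the code from asm.js. It assumes four registers named A to D, encoded as 0 to 3, and only handles plain decimal numbers; labels, strings and address offsets are left out. REGISTERS and parseNumber are hypothetical names.

```javascript
// Assumption: four general-purpose registers, encoded as 0-3.
var REGISTERS = { A: 0, B: 1, C: 2, D: 3 };

// Hypothetical helper: parses a decimal number that must fit into one byte.
function parseNumber(text) {
    if (!/^\d+$/.test(text)) {
        throw "Invalid number: " + text;
    }

    var value = parseInt(text, 10);
    if (value > 255) {
        throw "Number must fit into one byte: " + text;
    }

    return value;
}

function readOperand(text) {
    // Square brackets mean "the memory at ...", e.g. [A] or [100].
    var isAddress = /^\[.*\]$/.test(text);
    var inner = isAddress ? text.slice(1, -1) : text;
    var name = inner.toUpperCase();

    if (Object.prototype.hasOwnProperty.call(REGISTERS, name)) {
        return {
            type: isAddress ? "regaddress" : "register",
            value: REGISTERS[name]
        };
    }

    return {
        type: isAddress ? "address" : "number",
        value: parseNumber(inner)
    };
}

var examples = ["A", "[B]", "[100]", "42"].map(readOperand);
```

The four examples cover the four operand types that the ADD case in run distinguishes: register, regaddress, address and number.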

As a side note: because the assembler simulator was a weekend project, the code is not as nicely structured as it should be and differs a bit from the sample code above. So don't be surprised.

The UI

For the UI, I recommend using a JavaScript framework. It makes the work much easier. In the case of the simulator, I chose Angular.

We use a two-column layout to display the simulator. The left column contains the assembler code input field and the simulator run/stop buttons. The right column contains the CPU registers and flags, the memory, and the console output. Additionally, the memory component contains visual indicators for the stack, the opcodes and the current position of the IP.

UI Screenshot

The major UI elements are the run/stop, step and reset buttons. As their names imply, these buttons control the simulator. Pressing the run button calls the assembler to generate the instructions and load them into the memory, then starts a timer that executes one CPU cycle per interval. The stop button simply stops the CPU cycle timer. With the step button, a user can execute one CPU cycle at a time for debugging purposes. The last button resets the complete simulator by setting all CPU and memory values to 0.

HTML:

<button ng-click="runOrStopSimulator()">{{ simulatorTimer && 'Stop' || 'Run' }}</button>
<button ng-click="runSimulatorOneStep()">Step</button>
<button ng-click="resetSimulator()">Reset</button>

JS Code:

var simulatorTimer = undefined;
var simulatorClockSpeed = 300;

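function runOrStopSimulator() {
    // A running timer means the button currently reads "Stop".
    if (simulatorTimer !== undefined) {
        clearInterval(simulatorTimer);
        simulatorTimer = undefined;
        return;
    }

    if (!assembler.isAssembled) {
        assembler.run();
    }

    simulatorTimer = setInterval(runSimulatorOneStep, simulatorClockSpeed);
};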

function runSimulatorOneStep() {
    if (!assembler.isAssembled) {
        assembler.run();
    }
    
    cpu.step();
};

function resetSimulator() {
    if (simulatorTimer !== undefined) {
        clearInterval(simulatorTimer);
    }
    
    simulatorTimer = undefined;
    cpu.reset();
    memory.reset();
};

I would like to go into detail on one more part of the UI. The following HTML code is a simplified version of how the memory component is displayed. The beauty of Angular is that the memory can be bound directly to its UI representation, so all changes to the memory show up in the UI in real time. Additionally, instructions are displayed as links: clicking one makes the UI highlight the corresponding part of the code.

<div class="console">
  <div style="float:left;" class="console-character"
       ng-repeat="m in memory | startFrom: 232 track by $index">
    <span>{{ convertToChar(m) }}</span>
  </div>
</div>
<div class="memory"
     ng-repeat="m in memory track by $index">
  <div ng-class="getMemoryCellCss($index)" 
       ng-switch="isInstruction($index)">
    <small ng-switch-default>{{ m | number:displayHex }}</small>
    <a ng-switch-when="true" ng-click="jumpToLine($index)">
      <small>{{ m | number:displayHex }}</small>
    </a>
  </div>
</div>

The complete HTML can be found here: index.html.

I hope you enjoyed the quick introduction on how to make your own assembler simulator. If you would like to try the simulator online you can do so here: schweigi.github.io/assembler-simulator/.

Please feel free to contact me if you have any questions or suggestions. You can also use the comments section of this blog so other people can join the discussion too.