haker

Assembly

This article is a work in progress and will be expanded with more information.

Assembly Files

Assembly files are plain text files, usually with the .wopr extension.

Comments begin with ;, and extend to the end of the line. They can appear at the beginning of the line or after an instruction.

; this is a full line comment
        push 0 ; this is an in line comment

Empty lines are ignored.

Each line contains one instruction or preprocessor directive. The only exception is labels, which can appear either alone on a line or in front of an instruction.

Line endings are significant: a whole instruction must be on a single line. However, whitespace and indentation within a line are ignored.

Preprocessor

WOPR does not have a classic preprocessor. A classic preprocessor processes the source, performs substitution or expansion, and produces an expanded (preprocessed) source file which is then assembled. WOPR's preprocessor directives are instead parsed at the same time as assembly instructions, and are executed when appropriate.

Imports

The %import directive imports additional files, which lets you reuse code and build more complex programs. Files requested for import are queued and assembled one at a time. If the same file is imported more than once, either from the same file or from multiple files, it is assembled only once.

%import "module.wopr"

NOTE: %import does not include the file inline in the original source. Rather, it simply assembles the referenced file once the current file is fully assembled.
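
The queueing behaviour described above can be modelled in a few lines of Python. This is only a sketch, not WOPR's actual assembler: the function name assembly_order and the imports dictionary (mapping each file to the files it names in %import directives) are hypothetical, and first-in, first-out processing of the queue is an assumption.

```python
from collections import deque


def assembly_order(entry, imports):
    """Return the order in which files would be assembled.

    `imports` maps a file name to the list of files it imports
    (a hypothetical stand-in for parsing real %import directives).
    """
    queue = deque([entry])
    seen = {entry}              # files already queued; each is assembled once
    order = []
    while queue:
        current = queue.popleft()
        order.append(current)   # the current file is assembled in full first
        for name in imports.get(current, []):
            if name not in seen:        # repeated imports are ignored
                seen.add(name)
                queue.append(name)
    return order


imports = {
    "main.wopr": ["math.wopr", "io.wopr", "math.wopr"],
    "io.wopr": ["math.wopr"],
}
print(assembly_order("main.wopr", imports))
# ['main.wopr', 'math.wopr', 'io.wopr']
```

Even though math.wopr is requested three times, it appears once in the result, and nothing is assembled in the middle of another file.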

Programming

WOPR is a 16-bit stack-based CPU. All instructions therefore operate on 16-bit values, and all operands and results are retrieved from and stored on the stack. There are some exceptions to this rule, such as the push and sto instructions, and these will be explained separately. WOPR has two stacks: the evaluation stack (ES) and the return stack (RS).

Example: Subtraction

Let us try to calculate 5-2.

First we have to set up the evaluation stack (ES) with the operands. We do this by pushing the two values onto the stack.

        push 5
        push 2

This will make the stack look like this.

ES Value
8002 0002
8000 0005

Note that we always display the stack with the last element on the top. Thus we push values to the top of the stack and pop them from the top as well.

Then we subtract.

        sub

Finally, the stack will look like this, with our result 3 at the top of the stack.

ES Value
8000 0003

Put together, the complete program is:

section code 0x0000
        push 5
        push 2
        sub

Evaluation Stack

The evaluation stack stores operands and results for all instructions. All elements on the stack are 16 bits wide. The evaluation stack is initialized at 0xFFC0 when the CPU boots, and it grows down before each push and after each pop. ES can be changed in code if necessary.

Arithmetic Operations

Arithmetic instructions will take one or two operands from the evaluation stack, perform a calculation and store the result back on the ES.
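
To make the stack behaviour concrete, here is a minimal Python model of push, add and sub. It is a sketch rather than a description of the real CPU: the run function is a hypothetical name, only these three instructions are modelled, and wrapping results to 16 bits is an assumption, since overflow behaviour is not covered here.

```python
def run(program):
    """Evaluate a list of (mnemonic, operand) pairs on an evaluation stack."""
    es = []                                 # evaluation stack, top is es[-1]
    for op, arg in program:
        if op == "push":
            es.append(arg & 0xFFFF)         # elements are 16 bits wide
        elif op in ("add", "sub"):
            right = es.pop()                # the value pushed last
            left = es.pop()                 # the value pushed first
            result = left + right if op == "add" else left - right
            es.append(result & 0xFFFF)      # assumed: results wrap to 16 bits
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return es


print(run([("push", 5), ("push", 2), ("sub", None)]))   # [3]
```

The operand order matches the subtraction example: the value pushed first is the left-hand side, so push 5, push 2, sub leaves 3.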

See the CPU page for details about all instructions.

Labels

Labels are named references to a memory location which can be an instruction or data. Labels are used as names for functions, destinations for jumps and names of variables.

        ...
jmp_target:
        nop     ; some random instruction
        ; ...
func_label:
        ; ...
        ret
var1:   db 1    ; byte sized variable named var1

The label addresses are determined at link time. During assembly and linking the assembler calculates all addresses and replaces the references with actual addresses.

All labels must have unique names, across all files comprising a single program. This includes all files imported with %import. Therefore good naming is very important.

You can define as many labels as you wish. Sometimes labels are useful for documenting code. Labels do not have to be referenced, although an unreferenced label might indicate a problem in your code. However, if a label is referenced in a PUSH or PUSHB instruction, it must be defined. If it is not defined, the assembler will produce an error.

Nested Labels

As a convenience, WOPR supports nested labels. A nested label starts with a period (.), which must be followed by a normal identifier, e.g. .exit.

As a guiding principle, full labels should be thought of as symbols that your program wishes to export, while nested labels are internal symbols that are not meant to be shared.

Nested labels are actually just a shorthand for writing a longer label name. For example:

main:
.sub1:
.sub2:
func:
.sub1:
.sub2:

is equivalent to:

main:
main.sub1:
main.sub2:
func:
func.sub1:
func.sub2:
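
The expansion rule can be sketched in Python as follows. The function name expand_labels is hypothetical, and raising an error for a nested label that has no preceding full label is an assumption about how an assembler might react.

```python
def expand_labels(labels):
    """Expand nested labels (leading '.') using the most recent full label."""
    parent = None
    expanded = []
    for label in labels:
        if label.startswith("."):
            if parent is None:
                # assumed behaviour: a nested label needs a full label before it
                raise ValueError(f"nested label {label} has no parent label")
            expanded.append(parent + label)
        else:
            parent = label              # a full label starts a new scope
            expanded.append(label)
    return expanded


print(expand_labels(["main", ".sub1", ".sub2", "func", ".sub1", ".sub2"]))
# ['main', 'main.sub1', 'main.sub2', 'func', 'func.sub1', 'func.sub2']
```

Because the expanded names differ, .sub1 under main and .sub1 under func do not violate the rule that all labels must be unique.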

This is convenient for identifying small, commonly repeated pieces of code. For example, most functions will have an .exit label where the cleanup code resides, or a .loop label for creating loops.

; countdown - count down from a number to zero
; P0 - number from which to countdown
countdown:
.loop:  dup       ; duplicate control variable as this copy will be consumed by jz
        push .exit
        jz        ; exit when control variable reaches zero
        dec       ; decrease control variable
        push .loop
        jmp       ; loop
.exit:
        pop       ; drop loop control variable from ES
        ret

main:   push 10
        push countdown
        call      ; call countdown function passing 10 as P0
        halt

Flow Control - Jumps

WOPR has several instructions for making conditional and unconditional jumps.

Unconditional Jump

        jmp here
        push 10
here:
        push 5
        push 2
        sub

In the above example, jmp here executes a jump to the address marked with the label here and thus skips execution of push 10. Therefore the code above will calculate 5-2 only.

Conditional Jumps

Conditional jumps jump similarly to jmp but only if a condition is met.

JNZ takes one parameter from ES (let's call it P0) and jumps only if P0 is not zero (Jump if Not Zero).

        push 0
        jnz here1
        push 2
        jmp here2
here1:  push 4
here2:  push 6
        add

In the above example, the first push determines which instructions are executed. Since the value on ES is zero, jnz here1 does not jump and execution continues with the next instruction, push 2. Therefore the above code calculates 2+6.

However if we change the first line to push 1, then jnz condition would be met, and the execution would continue at the label here1 and the code would calculate 4+6.
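
Both outcomes can be checked with a small Python model of the example. This is an illustrative sketch only: run and example are hypothetical names, just the four instructions used above are modelled, and jump targets are written as operands to mirror the listing.

```python
def run(code):
    """Execute (label, mnemonic, operand) entries and return the final ES."""
    labels = {label: i for i, (label, _, _) in enumerate(code) if label}
    es = []
    pc = 0
    while pc < len(code):
        _, op, arg = code[pc]
        pc += 1                         # by default continue with the next line
        if op == "push":
            es.append(arg)
        elif op == "add":
            right = es.pop()
            left = es.pop()
            es.append((left + right) & 0xFFFF)
        elif op == "jmp":
            pc = labels[arg]            # unconditional jump
        elif op == "jnz":
            if es.pop() != 0:           # P0 is consumed either way
                pc = labels[arg]
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return es


def example(first):
    """The listing above, with the operand of the first push as a parameter."""
    return [
        (None, "push", first),
        (None, "jnz", "here1"),
        (None, "push", 2),
        (None, "jmp", "here2"),
        ("here1", "push", 4),
        ("here2", "push", 6),
        (None, "add", None),
    ]


print(run(example(0)))   # [8]   (2+6, jnz does not jump)
print(run(example(1)))   # [10]  (4+6, jnz jumps to here1)
```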

See the CPU page for a list of all other conditional instructions.

Functions and Calls

Functions are reusable portions of code. Some code can be reused using jumps. What makes calls and functions superior is that functions will return to wherever they were called from, thus making them reusable from any place in your program.

A function is any labeled piece of code that ends with a return instruction.

shortest:
; this is the shortest function which does nothing other than returning
        ret

The above is a trivial example, but it is useful for demonstrating the mechanics of calls and functions. To call this function, you must load the address of the function onto ES and then execute call.

        push shortest
        call
        ; after shortest function returns, execution will continue here

This is the most common way of calling functions. Notice that the function address is simply a value on the stack, and it can be calculated in more advanced use cases.

Calls use the return stack (RS). The call instruction pushes the program counter (PC) onto RS before jumping to the function address. When you want to return from the function, you simply execute ret, which pops the value from RS and jumps to it, effectively continuing execution immediately after the call instruction.

Parameters and Return Values

To make functions more useful, we often need to pass them parameters or receive results. Like most things in stack machines, parameters and results are passed on the stack.

This is similar to individual instructions. For example add takes two parameters, adds them and returns the sum on the stack. So add pops two values from ES and pushes one value onto ES. Functions work the same way, except the author decides how many parameters a function takes and how many it returns.

increase_by_10:
        ; this function accepts one parameter, increases it by 10 and returns
        ; it on the stack

        ; one parameter is already on the stack
        push 10
        add ; this will add the parameter passed to the function and the one we pushed
        ret ; return leaving the increased value on the stack

The function above takes a single parameter, adds 10 to it and returns a single return value. So, we call it like this:

        push 4 ; this is the parameter
        push increase_by_10
        call
        ; result (14) is now on top of ES
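
The interplay of the two stacks during this call can be traced with a small Python model. It is a sketch under stated assumptions, not the real CPU: run is a hypothetical name, addresses are simply indices into the instruction list, the main label and the final halt are added so that the model has a place to start and stop, and the PC is assumed to already point past call when it is saved.

```python
def run(code, entry):
    """Execute (label, mnemonic, operand) entries; return (ES, RS) at halt."""
    labels = {label: i for i, (label, _, _) in enumerate(code) if label}
    es, rs = [], []                     # evaluation stack and return stack
    pc = labels[entry]
    while True:
        _, op, arg = code[pc]
        pc += 1                         # pc now points past this instruction
        if op == "push":
            # pushing a label pushes the address that label stands for
            es.append(labels[arg] if isinstance(arg, str) else arg)
        elif op == "add":
            right = es.pop()
            left = es.pop()
            es.append((left + right) & 0xFFFF)
        elif op == "call":
            rs.append(pc)               # return address goes onto RS
            pc = es.pop()               # function address is taken from ES
        elif op == "ret":
            pc = rs.pop()               # resume right after the call
        elif op == "halt":
            return es, rs
        else:
            raise ValueError(f"unsupported instruction: {op}")


code = [
    ("increase_by_10", "push", 10),
    (None, "add", None),
    (None, "ret", None),
    ("main", "push", 4),                # the parameter
    (None, "push", "increase_by_10"),
    (None, "call", None),
    (None, "halt", None),               # added so the model stops here
]
print(run(code, "main"))                # ([14], [])
```

At the end ES holds only the result and RS is empty: the function consumed its one parameter, produced one result, and the return address was used up by ret.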

Functions can be as simple or as complex as you need them to be. There is also no limit on the number of parameters or return values.

IMPORTANT: The hardest thing about writing functions is ensuring that the function always consumes all of its parameters and always returns the proper number of results. When writing complex functions which have error conditions, for example, even if the function is unable to complete its job, it must still consume its parameters and push the proper number of results onto the stack. If this is not done properly, the calling program will lose track of what each value on the stack means, stack corruption will occur, and your program will most likely crash.

Sections

So far, all the code you have seen has been short and has used the default placement at address 0x0000. Placement is controlled with the section directive.

section 0x0000

This instructs the assembler to place all code that appears after this directive in the contiguous memory starting at 0x0000.

section 0x0000
        nop       ; nop is placed at address 0x0000
        push 2    ; is placed at 0x0001
        pushb 3   ; is placed at 0x0004 because push instruction takes 3 bytes,
                  ; 1 byte instruction followed by a word (2 bytes)
        add       ; is placed at 0x0006 because pushb (note b) takes 2 bytes,
                  ; 1 byte instruction followed by a byte (1 byte)
        halt      ; is placed at 0x0007 as all non push instructions take 1 byte
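
The address arithmetic in the comments above can be reproduced with a short Python sketch. The names SIZES and layout are hypothetical; the sizes themselves (3 bytes for push, 2 for pushb, 1 for everything else) are taken from the listing.

```python
SIZES = {"push": 3, "pushb": 2}     # every other instruction takes 1 byte


def layout(origin, mnemonics):
    """Assign consecutive addresses to instructions starting at `origin`."""
    address = origin
    placed = []
    for mnemonic in mnemonics:
        placed.append((address, mnemonic))
        address += SIZES.get(mnemonic, 1)   # advance by the instruction size
    return placed


for address, mnemonic in layout(0x0000, ["nop", "push", "pushb", "add", "halt"]):
    print(f"0x{address:04X}  {mnemonic}")
# 0x0000  nop
# 0x0001  push
# 0x0004  pushb
# 0x0006  add
# 0x0007  halt
```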

Simple programs usually use a single section starting at address 0x0000. However, sometimes it is useful or required to specify precisely where some code or data should reside.

For example, the WOPR interrupt table is always located at 0xFFC0.

; interrupt table
section 0xFFC0
int00:  dw 0
int01:  dw 0
int02:  dw 0
int03:  dw 0
int04:  dw 0
int05:  dw 0
int06:  dw 0
int07:  dw 0
int08:  dw 0
int09:  dw 0
int0A:  dw 0
int0B:  dw 0
int0C:  dw 0
int0D:  dw 0
int0E:  dw 0
int0F:  dw 0
int10:  dw 0
int11:  dw 0
int12:  dw 0
int13:  dw 0
int14:  dw 0
int15:  dw 0
int16:  dw 0
int17:  dw 0
int18:  dw 0
int19:  dw 0
int1A:  dw 0
int1B:  dw 0
int1C:  dw 0
int1D:  dw 0
int1E:  dw 0
int1F:  dw 0
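
Since each dw entry occupies one word (2 bytes), the address of any entry in the table above follows directly from its number. The Python sketch below shows the arithmetic; the constant and function names are hypothetical, and it assumes the entries are laid out contiguously in the order listed.

```python
TABLE_BASE = 0xFFC0     # start of the section, from the listing above
ENTRY_SIZE = 2          # each dw is one 16-bit word
ENTRY_COUNT = 32        # int00 through int1F


def vector_address(number):
    """Return the address of the table entry for interrupt `number`."""
    if not 0 <= number < ENTRY_COUNT:
        raise ValueError(f"interrupt number out of range: {number}")
    return TABLE_BASE + number * ENTRY_SIZE


print(f"0x{vector_address(0x00):04X}")   # 0xFFC0
print(f"0x{vector_address(0x1F):04X}")   # 0xFFFE
```

The 32 entries take 64 bytes in total, so the table exactly fills the range from 0xFFC0 to 0xFFFF, the top of the 16-bit address space.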