TCSS 372 - Program 3 Assignment

Big Picture

You will produce a two-pass assembler that takes a source .asm file and, if no errors are detected, produces a listing file (.lst) and an object load file (.ld). You will be able to load the load file and run it on your SimpComp simulator.

SimpComp Assembler Program

See the Assembly Language Guide for details of the SimpComp assembly language and format of .asm source files. Your program will be invoked from either a GUI or the command line window. Invocation takes the name of the source file without extension (.asm is assumed). The running program will print messages at the end of each pass providing progress information.

C:> Pass One complete - 3 errors detected. See programname.tmp for error listing.
or
C:> Pass One complete - 0 errors detected. Beginning Pass Two.
C:> Pass Two complete - 1 error detected. See Listing file for error messages. No load file.
or
C:> Pass Two complete - 0 errors detected. Listing and Load files completed.

The SimpComp Assembler will consist of the following components:

SCA class (application main)

This is the main application class. For example, the command java SCA hello runs the assembler on the source file hello.asm.

This program will create an internal opcode table (hint: you may want to read in a text file, or load a separate Opcode object that was previously serialized). See below for the Opcode class requirements. During the run, pass 1 will generate the symbol table object, which is then passed to pass 2 for use. See below for the symbol table format.

The program will call pass 1, passing in the opcode table. Pass 1 will return a boolean: True = success, False = failure due to errors. If pass 1 fails, the program terminates with the Pass One error message shown above. Pass 1 errors must be fixed by the programmer before pass 2 can run. If pass 1 succeeds, the application calls pass 2, passing in the symbol table.
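The control flow described above can be sketched roughly as follows. This is only an illustration: the PassOne and PassTwo interfaces, their method signatures, and the map types are hypothetical stand-ins, and the assumption that pass 2 also returns a boolean is not stated in this assignment.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; names and signatures are assumptions, not requirements.
class ScaFlowSketch {
    /** Hypothetical pass 1: fills in the symbol table, returns true on success. */
    interface PassOne {
        boolean run(String sourceFile, Map<String, String> opcodeTable,
                    Map<String, Integer> symbolTable);
    }

    /** Hypothetical pass 2: uses the symbol table, returns true on success. */
    interface PassTwo {
        boolean run(String baseName, Map<String, Integer> symbolTable);
    }

    /** Runs pass 1, and runs pass 2 only if pass 1 reported no errors. */
    static boolean assemble(String baseName, Map<String, String> opcodeTable,
                            PassOne passOne, PassTwo passTwo) {
        Map<String, Integer> symbolTable = new HashMap<>();
        // The .asm extension is assumed, as described for invocation.
        if (!passOne.run(baseName + ".asm", opcodeTable, symbolTable)) {
            return false; // pass 1 errors must be fixed before pass 2 can run
        }
        return passTwo.run(baseName, symbolTable);
    }
}
```

The point of the sketch is the early return: pass 2 is never reached when pass 1 reports failure.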

Both passes work essentially the same way. Each line of the input file is read and parsed for the different fields. Pass one will build the symbol table to record label/address associations. Pass two will complete address resolution and generate the object code. It will also generate a listing file for the programmer to use.

Pass 1

Pass one reads in the source file (.asm) line by line, parsing each line and establishing the address for associated labels. It also keeps track of the address increments based on the number of bytes required by each instruction type or directive. It generates the symbol table (internally) and an intermediate file (see below).

Pass one must verify that labels are correctly formatted, that they are associated with unique addresses (no duplicate labels within a single file), and that mnemonics for both opcodes and directives are correct. In order to update the address counter properly, memory-allocating directives (e.g. DW) will need to have their operands evaluated.
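A minimal sketch of the address counter and label bookkeeping described above is given here. It makes several assumptions: labels end with ':', fields are separated by whitespace, and the byte counts are supplied by the caller. Only ORG is handled among the directives; DW and the header line are left out because their rules are defined in the Assembly Language Guide. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; parsing rules here are simplifying assumptions.
class PassOneSketch {
    final Map<String, Integer> symbols = new HashMap<>();
    final List<String> errors = new ArrayList<>();
    private final Map<String, Integer> byteCounts;
    private int locationCounter = 0;

    PassOneSketch(Map<String, Integer> byteCounts) {
        this.byteCounts = byteCounts;
    }

    /** Processes one source line and returns the address assigned to it. */
    int processLine(String line) {
        int semi = line.indexOf(';');
        String text = (semi >= 0 ? line.substring(0, semi) : line).trim();
        if (text.isEmpty()) {
            return locationCounter; // blank or comment-only line
        }
        String[] fields = text.split("\\s+", 3);
        int next = 0;
        String label = null;
        if (fields[0].endsWith(":")) {
            label = fields[0].substring(0, fields[0].length() - 1);
            next = 1;
        }
        if (next >= fields.length) {
            errors.add("Opcode field - opcode missing");
            return locationCounter;
        }
        String mnemonic = fields[next];
        boolean isOrg = mnemonic.equals("ORG");
        if (isOrg) {
            if (next + 1 >= fields.length) {
                errors.add("Operand field - required operand missing");
                return locationCounter;
            }
            try {
                locationCounter = Integer.decode(fields[next + 1]); // accepts 0x1000
            } catch (NumberFormatException e) {
                errors.add("Operand field - operand format error");
                return locationCounter;
            }
        }
        int address = locationCounter;
        if (label != null) {
            if (symbols.containsKey(label)) {
                errors.add("Label field - duplicate label");
            } else {
                symbols.put(label, address);
            }
        }
        if (!isOrg) {
            Integer size = byteCounts.get(mnemonic);
            if (size == null) {
                errors.add("Opcode field - unrecognized opcode");
            } else {
                locationCounter += size; // advance by the instruction's byte count
            }
        }
        return address;
    }
}
```

With byte counts of 4 for LD and 2 for ADD (the sizes implied by the sample addresses in this handout), feeding the sample lines through processLine yields the addresses 1000, 1004, and 1008 in hex.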

Pass one will generate the symbol table (see below for format and requirements). It will check for typical pass one errors as well as format errors. A listing of error conditions:

File name - missing required label
File format - missing required ORG
Label field - improper format
Label field - duplicate label
Opcode field - unrecognized opcode
Opcode field - opcode missing
Operand field - operand format error
Operand field - required operand missing
File format - can't find end of segment
File format - end of file w/o END

Each error found will be printed in the temporary (intermediate) file on the line immediately below the line where the error was detected.

Intermediate file format

The intermediate file is generated by pass one and stored as filename.int. This file will be read as the source file by pass 2. The format of the file is similar to the .asm file, except that the line-end comments have been stripped and the address field is prepended to each line. Comment lines are retained for readability. A header consisting of a column title line and a separator line of '-' characters has been added.

ADDR:	Label           Mnemonic        Operands
----------------------------------------------------
	;
	; A sample program
	;
	; this is a comment line
	;
----	MAIN:                           MYPROGRAM
1000	                ORG             0x1000
1000	START:          LD              A, FIRSTOP
1004	                LD              B, SECNDOP
1008	                ADD             A, B
100A...

If an error is detected in any line an appropriate error message (above) is printed on the next line after the line just parsed. If more than one error is detected, each error gets its own line below the line parsed.
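As a rough illustration of the line layout shown above, the following sketch formats one intermediate-file line and places any error messages on the lines below it. The column widths (a four-digit hex address, a tab, then 16-character label and mnemonic columns) are read off the sample and are an assumption; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; column widths are inferred from the sample listing.
class IntermediateLineSketch {
    /** Formats one parsed source line followed by any error messages for it. */
    static List<String> format(int address, String label, String mnemonic,
                               String operands, List<String> errors) {
        List<String> out = new ArrayList<>();
        // %04X gives a 4-digit uppercase hex address; %-16s left-justifies a column.
        out.add(String.format("%04X\t%-16s%-16s%s", address, label, mnemonic, operands));
        out.addAll(errors); // each error gets its own line below the parsed line
        return out;
    }
}
```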

Pass 2

Pass two receives the symbol table created in pass 1. It also opens and uses the intermediate file as input. This pass's job is to verify operands and generate code. It should read each line of the intermediate file, parse the line for elements, and verify that the operands are correct and meaningful. The first element in the line is a substring representing (in hex) the address assigned to that instruction. Since pass 1 has already verified the symbols and the mnemonics, the only remaining job is to parse the operands to determine the register(s) and, if necessary, a memory address associated with a label (symbol). It then generates the opcode, register field, and immediate/address field as needed.

Disambiguation of the addressing mode for memory accesses (LD, ST, and JMP instructions) may be done in either pass 1 or pass 2. This means determining the addressing mode from the special symbol used to designate the mode. Disambiguation between register-based access modes (e.g. LD B, +C means base-relative using register C for the offset) and immediate offset (e.g. LD B, +OFFSET) is based on whether the symbol used is a reserved register symbol or a well-formed label.
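The register-versus-label decision for a '+' operand might look like the sketch below. The register set {A, B, C} contains only the names that appear in this handout, and the label pattern (a letter followed by letters, digits, or underscores) is a guess; the actual reserved symbols and label rules are defined in the Assembly Language Guide. All names in the code are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only; register names and label syntax are assumptions.
class OperandKindSketch {
    enum Kind { REGISTER_OFFSET, LABEL_OFFSET, MALFORMED }

    // Assumed reserved register symbols (only those seen in this handout).
    static final Set<String> REGISTERS = new HashSet<>(Arrays.asList("A", "B", "C"));

    /** Classifies an operand of the form +SYMBOL. */
    static Kind classify(String operand) {
        if (!operand.startsWith("+")) {
            return Kind.MALFORMED;
        }
        String symbol = operand.substring(1).trim();
        if (REGISTERS.contains(symbol)) {
            return Kind.REGISTER_OFFSET;   // e.g. +C
        }
        if (symbol.matches("[A-Za-z][A-Za-z0-9_]*")) {
            return Kind.LABEL_OFFSET;      // e.g. +OFFSET
        }
        return Kind.MALFORMED;
    }
}
```

Checking the register set first matters: a register name would also match the assumed label pattern, so reserved symbols must take priority.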

Pass two generates two files (if successful), the listing file filename.lst, and the load file, filename.ld. The list file format is as follows:

--------------- filename Listing ---------------------------------
ADDR:	Label           Mnemonic        Operands         Code
------------------------------------------------------------------
	;
	; A sample program
	;
	; this is a comment line
	;
----	MAIN:                           MYPROGRAM
1000	                ORG             0x1000
1000	START:          LD              A, FIRSTOP       F30010A0
1004	                LD              B, SECNDOP       F31010A2
1008	                ADD             A, B             0101
100A...

The code replaces the end-of-line comments in the original source file. The symbol table is printed out at the end of the listing file in the format given below. The format for the load file has already been given.

Pass two errors include:

Operand field - symbol not found
Operand field - improper operand format
Operand field - second operand needed
Operand field - operand(s) needed

Symbol Table

The symbol table object shall have the following format:

String: label
int:    address
char:   type // 'A' address, 'V' value, 'H' header - used with reserved labels

The symbol table is maintained as a hashmap internally but must be printed in sorted order to the intermediate file at the end of pass 1. Format for the printed symbol table is:

-------------------------------------------------
             Symbol Table
-------------------------------------------------
Symbol          Address           Type
-------------------------------------------------
ALOOP           11A4              A
BEFORE          13FF              V
MYPROG          0000              H
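One way to keep the table in a hash map while printing it in sorted order is sketched below. The row widths are read off the sample above (a 16-character symbol column and an 18-character address column) and are an assumption, as are the class and method names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; column widths are inferred from the sample table.
class SymbolTableSketch {
    /** One symbol's data: address plus type ('A', 'V', or 'H'). */
    static final class Symbol {
        final int address;
        final char type;

        Symbol(int address, char type) {
            this.address = address;
            this.type = type;
        }
    }

    private final Map<String, Symbol> table = new HashMap<>();

    /** Adds a symbol; returns false if the label is already defined. */
    boolean add(String label, int address, char type) {
        return table.putIfAbsent(label, new Symbol(address, type)) == null;
    }

    /** Returns the table rows sorted by label, formatted like the sample. */
    List<String> sortedRows() {
        List<String> rows = new ArrayList<>();
        // Copying into a TreeMap yields the labels in sorted (natural String) order.
        for (Map.Entry<String, Symbol> e : new TreeMap<>(table).entrySet()) {
            String hexAddress = String.format("%04X", e.getValue().address);
            rows.add(String.format("%-16s%-18s%c", e.getKey(), hexAddress, e.getValue().type));
        }
        return rows;
    }
}
```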

Opcode Table

The format for the opcode table will be:

String: opcode // mnemonic
String: hexcode
int:    byte_cnt
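A possible shape for the opcode table, loaded from text lines, is sketched here. The line format ("mnemonic hexcode byteCount", separated by whitespace) and the loader class are hypothetical; the three fields mirror the format above, with byte_cnt written as byteCnt.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; the text-file layout is an assumption.
class Opcode {
    final String opcode;   // mnemonic
    final String hexcode;
    final int byteCnt;     // byte_cnt in the format above

    Opcode(String opcode, String hexcode, int byteCnt) {
        this.opcode = opcode;
        this.hexcode = hexcode;
        this.byteCnt = byteCnt;
    }
}

class OpcodeTableSketch {
    /** Builds a mnemonic-to-Opcode map from lines like "LD F3 4". */
    static Map<String, Opcode> load(List<String> lines) {
        Map<String, Opcode> table = new HashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] f = trimmed.split("\\s+");
            if (f.length != 3) {
                throw new IllegalArgumentException("bad opcode line: " + line);
            }
            table.put(f[0], new Opcode(f[0], f[1], Integer.parseInt(f[2])));
        }
        return table;
    }
}
```

The lines could come from a text file, for example via Files.readAllLines on a file such as opcodes.txt (a made-up name). The hex codes in any such file must come from the Assembly Language Guide; the "F3" in the comment above is a placeholder only.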

Test Programs

You should be able to write SimpComp programs in the above source format and assemble them with your SCA. The final test is to load the generated load file into your simulator and run the program.

hello.ld

This program should consist of a main routine that calls subroutines to 1) get a person's name from keyboard input and store it in a buffer, using a null terminator (like C strings), and 2) write "Hello " followed by the input name on the video monitor. The program can then terminate.

calc.ld

This program should perform the four basic calculator functions on single-digit inputs. For division you may use repeated subtraction, and for multiplication you may use repeated addition. Remember that division needs two registers, one for the result and one for the remainder. Ideally the input would be of the standard form, e.g. 3 + 7 = (the = sign acts as the terminator). The output should be the correct value, up to two digits, printed on the monitor.
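The two algorithms mentioned above are shown here in Java purely to illustrate the logic; the actual program must be written in SimpComp assembly. The sketch assumes non-negative operands, which holds for single-digit inputs, and the method names are hypothetical.

```java
// Illustrative sketch only; shows the algorithms, not SimpComp code.
class CalcSketch {
    /** Multiplies by adding a to a running total b times. */
    static int multiply(int a, int b) {
        int product = 0;
        for (int i = 0; i < b; i++) {
            product += a;
        }
        return product;
    }

    /** Divides by repeated subtraction; returns {quotient, remainder}. */
    static int[] divide(int dividend, int divisor) {
        if (divisor == 0) {
            throw new ArithmeticException("division by zero");
        }
        int quotient = 0;
        int remainder = dividend;
        while (remainder >= divisor) {
            remainder -= divisor; // one subtraction per unit of quotient
            quotient++;
        }
        return new int[] { quotient, remainder };
    }
}
```

The two values returned by divide correspond to the two registers the division routine needs, one for the result and one for the remainder.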

What to turn in

I would like to see demonstrations of the test platform and all of the functions described above. We will do this in class as we did for Program 2. You will turn in all source code to me as a zip file via e-mail. Make sure the subject line reads: TCSS 372 - Program 3 turnin teamname. In addition to your source files, I would like Javadocs on all classes. Be sure that ALL files contain the team name and the names of all members of your team.

Due Date: March 13, 2008

We will schedule demonstrations during class time so everyone can get a glimpse of what every other team did.