OS-202-02 2020 Spring Lab 1—Linker Page 1
You are to implement a two-pass linker in C, C++, or Java and submit the source code on NYU Classes, which we compile
The target machine is word addressable and has a memory of 300 words, each consisting of 4 decimal digits. The first
(leftmost) digit is the opcode, which is unchanged by the linker. The remaining three digits (called the address field) form
• An immediate operand, which is unchanged.
• An absolute address, which is unchanged.
• A relative address, which is relocated.
• An external address, which is resolved.
Relocating relative addresses and resolving external references were discussed in class and are in the notes. The input
consists of a series of object modules, each of which contains three parts: definition list, use list, and program text.
The linker processes the input twice (that is why it is called two-pass). Pass one determines the base address for each
module and the absolute address for each external symbol, storing the later in the symbol table it produces. The first
module has base address zero; the base address for module I + 1 is equal to the base address of module I plus the length
of module I. The absolute address for a symbol S defined in module M is the base address of M plus the relative address
of S within M. Pass two uses the base addresses and the symbol table computed in pass one to generate the actual output
by relocating relative addresses and resolving external references.
The definition list is a count ND followed by ND pairs (S, R) where S is the symbol being defined and R is the relative
address to which the symbol refers. Pass one relocates R forming the absolute address A and stores the pair (S, A) in the
The use list is a count NU followed by the NU “pairs”. The first entry in the pair is an external symbol used in the
module. The second entry is a list of relative addresses in the module in which the symbol is used. The list is terminated
by a sentinel of -1. For example, a use list of “2 f 3 1 4 -1 xyg 0 -1” signifies that the symbol f is used in instructions 1, 3,
and 4, and the symbol xyg is used in instruction 0.
The program text consists of a count NT followed by NT 5-digit numbers. NT is the length of the module. The left four
digits of each number form the instruction as described above. The last (rightmost) digit specifies the address type: 1
signifies “immediate”, 2 “absolute”, 3 “relative”, and 4 “external”.
Other requirements: Error detection, arbitrary limits, et al.
Your program must check the input for the errors listed below. All error messages produced must be informative, e.g.,
“Error: The symbol ‘diagonal’ was used but not defined. It has been given the value 111”.
• If a symbol is multiply defined, print an error message and use the value given in the last definition.
• If a symbol is used but not defined, print an error message and use the value 111.
• If a symbol is defined but not used, print a warning message and continue.
• If an absolute address exceeds the size of the machine, print an error message and use the largest legal value.
• If multiple symbols are listed as used in the same instruction, print an error message and ignore all but the last usage
• If an address appearing in a definition exceeds the size of the module, print an error message and treat the address
given as the last word in the module.
You may need to set “arbitrary limits”, for example you may wish to limit the number of characters in a symbol to (say)
8. Any such limits should be clearly documented in the program and if the input fails to meet your limits, your program
must print an error message and continue if possible. Naturally, the limits must be large enough for all the inputs on the
web. Under no circumstances should your program reference an array out of bounds, etc.
Submit the source code for your lab, together with a README file (required) describing how to compile and run it. Your
program must read an input set from standard input, i.e., directly from the keyboard. It is an error for you to require the
input be in a file. You may develop your lab on any machine you wish, but must insure that it compiles and runs on the
NYU system assigned to the course.
OS-202-02 2020 Spring Lab 1—Linker Page 2
There are several sample input sets on the web. The first is shown below and the second is an re-formatted version of the
first. If you use the java Scanner or C’s scanf() inputs 1 and 2 should look the same to your program. Some of the input
sets contain errors that you are to detect as described above. We will run your lab on these (and other) input sets. The
expected output is also on NYU Classes. You need to match the format of the provided outputs.
1 xy 2
2 z 2 -1 xy 4 -1
5 10043 56781 20004 80023 70014
1 z 1 2 3 -1
6 80013 10004 10004 30004 10023 10102
1 z 1 -1
2 50013 40004
1 z 2
2 xy 2 -1 z 1 -1
3 80002 10014 20004
The following is output annotated for clarity and class discussion. Your output is not expected to be this fancy.
0: 10042 1004+0 = 1004
1: 56781 5678
2: xy: 20004 ->z 2015
3: 80023 8002+0 = 8002
4: 70014 ->xy 7002
0 80013 8001+5 = 8006
1 10004 ->z 1015
2 10004 ->z 1015
3 30004 ->z 3015
4 10023 1002+5 = 1007
5 10102 1010
0 50013 5001+11= 5012
1 40004 ->z 4015
0 80002 8000
1 10014 ->z 1015
2 z: 20004 ->xy 2002