This assignment is to develop an in-order, fully functional C++ CPU simulator.
ENSC 251: SOFTWARE DESIGN & ANALYSIS FOR ENGINEERS
Final Project: CPU Simulator
Objective: Students will develop an in-order, fully functional C++ CPU simulator that executes
appropriately binary-formatted programs. The CPU supports flexible hardware structures and program
parameters, may run in debug or execution mode, and provides performance statistics on executed
programs. The simulator will be coded and realized using various classes and structures in C++’s Standard
Template Library (STL).
1) Background Information
Object-oriented programmers are often required to code simulators that model a physical system. In
this project, we will simulate a simple in-order Central Processing
Unit (CPU) which will replicate very basic processor functionality in software. We will recreate various
parts of the CPU and observe its state, instruction flow, and data flow using a Debug Mode. This will allow us
to trace instructions as they flow through the CPU to ensure proper functionality, while obtaining an
appreciation for computer architecture. Once the simulator is completely coded and bug free, it may be
run in execution mode, reflecting a terminal environment where the program will be executed on a
There are numerous CPU simulators that have been created over the past decades. Many of the simulators
are designed either for 1) educational purposes, or 2) for research/commercial purposes.
In the context of education, CPU simulators are mainly used to study single and multi-processor systems,
where one may tune various parameters to view performance effects of design choices, and study CPU
structure optimization. In the context of research/commercial simulators, computer architects must first
simulate a physical system and observe potential performance gains before proceeding to create the
actual hardware system and fabricate the chip. If the performance gain is negligible, another route or design
must be considered. Since tuning or redesigning a simulator incurs minimal cost in
comparison to fabricating a processor chip (that may or may not work), simulators are widely used as a
gateway for determining performance potential before proceeding to the actual hardware design.
The CPU simulator we will be designing in this project will not be modeling a complete physical system: a
computing system comprises many layers beyond the scope of this project. Instead, we will be
simulating a very simple CPU for educational purposes; however, the processor will still be capable of
executing various programs and obtaining the respective performance statistics for varying parameters. You
may refer to this simulator and CPU design principles in your future engineering courses as well, such as
ENSC254 and ENSC350.
Overview of a simple CPU pipeline
An instruction which enters a CPU is processed incrementally in a series of steps. All instructions are
assigned a number, or ID, as they enter the CPU. Our CPU is an in-order processor, meaning that
instructions must be processed in the order in which they enter the CPU. Therefore, instructions are
identified using a numbering system so that they can be processed in order; the younger the instruction, the
greater the instruction’s ID value.
1) Instruction Fetch (IF) – the first step is to read an instruction of our program (fetched) from instruction
memory according to the address specified by a variable (or “register” in hardware terminology) called
the Program Counter (PC). The PC is incremented every time an instruction is fetched from the instruction
memory. Therefore, the value held by the PC dictates the location, or address, of the next instruction to
be read from the instruction memory. We will assume the first instruction of a program is located at PC
=0, where instruction_memory contains the first instruction that will be fetched and processed in the
A “fetch width” parameter may be specified as well, indicating the number of instructions that may be
obtained from instruction memory simultaneously at a given time. One unit of time is referred to as a
clock cycle in CPU terminology.
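The fetch step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `FetchUnit` name, the `fetch_cycle` function, and the choice of 32-bit words stored in a `std::vector` are hypothetical and are not prescribed by the project.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fetch unit; names and the 32-bit word type are assumptions.
struct FetchUnit {
    std::size_t pc = 0;           // Program Counter: address of the next instruction
    std::size_t fetch_width = 1;  // maximum instructions fetched per clock cycle

    // Models one clock cycle of fetching: reads up to fetch_width instructions,
    // incrementing the PC once per fetched instruction. Stops early when the
    // end of the program is reached.
    std::vector<std::uint32_t> fetch_cycle(
        const std::vector<std::uint32_t>& instruction_memory) {
        std::vector<std::uint32_t> fetched;
        while (fetched.size() < fetch_width && pc < instruction_memory.size()) {
            fetched.push_back(instruction_memory[pc]);
            ++pc;
        }
        return fetched;
    }
};
```

With a fetch width of 2 and a three-instruction program, the first call would return two instructions and leave the PC at 2, and the second call would return the remaining one.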
2) Instruction Decode (ID) – once an instruction is fetched, it is decoded according to the processor’s
Instruction Set Architecture (ISA). The ISA stipulates the CPU’s supported “instruction set”, i.e., the instruction
operations and formats recognized and used by the processor to interpret instructions. The decoding
process allows the CPU to extract information from the fetched instruction. Specifically, the decoder
extracts an instruction’s i) input data locations to be read from the register file (referred to as source
operands), ii) operation, and iii) the output destination/location where the result will be written to in the
Once the instruction’s information is extracted, it is placed in two separate CPU structures: 1) Instruction
Queue (IQ), and 2) Retire/Commit Buffer, more commonly referred to as the Reorder Buffer (ROB).
The IQ is a finite entry queue which buffers instructions until they are ready for execution. An instruction
is ready once all its source operands are marked as “valid”.
The ROB is a finite FIFO list which manages and ensures the safe eviction or “retirement” of instructions
from the pipeline after execution. This retirement process is completed during the Retirement / Commit
stage. As its name implies, the ROB has a much bigger role in complex CPUs; however, in the context of
our project, the ROB will only be used to guarantee that instructions are executed and released from
the pipeline in the order in which they were fetched.
Instruction fetch and decode will be performed as one step in our CPU simulator.
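One possible way to model the placement of a decoded instruction into the IQ and ROB is sketched below. The field names, the use of `std::deque`, and the policy of refusing the insert (stalling) when either structure is full are all assumptions made for illustration; the actual instruction format is defined by the ISA.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical decoded-instruction record (field names are assumptions).
struct DecodedInstruction {
    unsigned id;     // program-order ID; a younger instruction has a larger ID
    char op;         // operation, e.g. '+'
    int src1, src2;  // source operand register indices
    int dest;        // destination register index
};

// Hypothetical front end holding the two finite structures.
struct FrontEnd {
    std::size_t iq_capacity;
    std::size_t rob_capacity;
    std::deque<DecodedInstruction> iq;  // instructions waiting to issue
    std::deque<unsigned> rob;           // instruction IDs in fetch order

    bool has_room() const {
        return iq.size() < iq_capacity && rob.size() < rob_capacity;
    }

    // Places the instruction in both the IQ and the ROB.
    // Returns false (assumed stall) when either structure is full.
    bool insert(const DecodedInstruction& inst) {
        if (!has_room()) return false;
        iq.push_back(inst);
        rob.push_back(inst.id);
        return true;
    }
};
```

Keeping only IDs in the ROB is a simplification: here it is enough to record the fetch order so that instructions can later be released in that same order.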
3) Dispatch/Read/Execute/Writeback (Rd/EXE/WB) – The term issue, interchangeable with the term
dispatch, is the process of releasing an instruction from the IQ and proceeding to execution.
In this step, instructions in the IQ wait until all previous instructions have been issued. An instruction may
be issued once all of its source operands are ready to be read from the register file, and resources are
available for execution.
Accordingly, in this second pipeline stage, when an instruction is ready in the IQ:
i) the instruction is released from the IQ (dispatch),
ii) its input operands are read from the register file,
iii) the operation is executed using the source operand data,
iv) the result is written back to the register file at the instruction’s specified destination register, and
v) the destination register is broadcast to the IQ to inform younger instructions that the contents of the
register are ready to be read.
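The five substeps can be sketched as a single function that models one cycle of this stage. The sketch rests on several assumptions: an eight-entry register file, a per-register valid bit standing in for the broadcast, only '+' and '-' operations, and one issue per cycle. None of these are specified by the text above.

```cpp
#include <array>
#include <deque>

// Hypothetical instruction record (field names are assumptions).
struct Instruction { unsigned id; char op; int src1, src2, dest; };

struct Core {
    std::array<int, 8> regs{};    // register file (size is an assumption)
    std::array<bool, 8> valid{};  // valid bit per register, all false at start
    std::deque<Instruction> iq;   // oldest instruction at the front

    // Models one cycle of the combined stage.
    // Returns true if the oldest instruction was issued and completed.
    bool issue_execute_writeback() {
        if (iq.empty()) return false;
        const Instruction inst = iq.front();
        // In-order issue: if the oldest instruction is not ready, stall.
        if (!valid[inst.src1] || !valid[inst.src2]) return false;
        iq.pop_front();                                       // i) dispatch
        const int a = regs[inst.src1];                        // ii) read operands
        const int b = regs[inst.src2];
        const int result = (inst.op == '-') ? a - b : a + b;  // iii) execute
        regs[inst.dest] = result;                             // iv) write back
        valid[inst.dest] = true;                              // v) broadcast readiness
        return true;
    }
};
```

Because only the front of the queue is ever examined, a younger instruction can never issue ahead of an older one in this sketch.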
The CPU will implement all 5 of these substeps as one pipeline stage in the CPU simulator. More details
pertaining to each step of this multi-step process are provided below:
Dispatch (Issue): The instructions in the IQ are monitored for operand readiness. Since instructions are
dependent on one another, a “consumer” instruction cannot execute before a “producer” instruction has
finished executing. For instance:
Instruction 1: Z = A + B; (“produces” results of variable Z)
Instruction 2: C = Z + F; (“consumes” the contents of variable Z)
As the example illustrates, Instruction 2 cannot execute until the contents of Z are computed and written
to the variable Z, meaning that Instruction 1 must execute first. Instruction 2 may then proceed in
the next cycle to read the contents of the Z operand and execute its operation. Consequently, these two
instructions can neither dispatch nor execute in the same cycle due to this read-after-write (RAW)
dependency. We refer to this issue validation process as monitoring an instruction for “valid” operands.
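A RAW dependency between two instructions can be detected by comparing the producer's destination with the consumer's sources. The following is a minimal sketch; the `Instr` record and the `has_raw_dependency` helper are hypothetical names, and variables are represented by single characters purely to mirror the example above.

```cpp
// Hypothetical three-operand instruction: dest = src1 (op) src2.
struct Instr { char dest; char src1; char src2; };

// True when the consumer reads a variable that the producer writes,
// i.e. a read-after-write (RAW) dependency exists.
bool has_raw_dependency(const Instr& producer, const Instr& consumer) {
    return consumer.src1 == producer.dest || consumer.src2 == producer.dest;
}
```

For Instruction 1 (`Z = A + B`) and Instruction 2 (`C = Z + F`), the check reports a dependency from Instruction 1 to Instruction 2 but not in the reverse direction.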
Once a given instruction’s operands are valid, the instruction may be dispatched (or “issued”) for
execution. Since we are implementing an in-order CPU, this implies that the oldest instruction in the IQ
must be ready for execution, implementing a FIFO (First-In, First-Out) scheme; if the oldest instruction is
not ready, subsequent younger instructions cannot proceed to execute even if they have valid operand