This assignment consists mainly of theory questions related to MIPS assembly.

For Q1 and Q2, assume that for a given processor the CPI of arithmetic instructions is 1, the CPI of load/store instructions is 10, and the CPI of branch instructions is 3. Assume a program has the following instruction breakdown: 600 million arithmetic instructions, 200 million load/store instructions, and 100 million branch instructions.

Q1. Suppose that new, more powerful arithmetic instructions are added to the instruction set. On average, by using these more powerful arithmetic instructions, we can reduce the number of arithmetic instructions needed to execute a program to 80% of its original value, at the cost of increasing the clock cycle time to 1.1 times the original cycle time. Is this a good design choice? Why?
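
One way to check Q1 is to compare execution time (total cycles times cycle time) before and after the change. The following is a minimal sketch, assuming the load/store and branch counts and all CPIs stay the same; the names `counts`, `cpi`, and `total_cycles` are hypothetical and not part of the assignment.

```python
# Instruction counts and CPIs from the problem statement.
counts = {"arithmetic": 600_000_000, "load_store": 200_000_000, "branch": 100_000_000}
cpi = {"arithmetic": 1, "load_store": 10, "branch": 3}


def total_cycles(counts, cpi):
    """Sum of (instruction count x CPI) over all instruction classes."""
    return sum(counts[kind] * cpi[kind] for kind in counts)


old_cycles = total_cycles(counts, cpi)

# Proposed design: 80% as many arithmetic instructions (assumption: the
# other instruction counts are unchanged), with the cycle time scaled by 1.1.
new_counts = dict(counts, arithmetic=counts["arithmetic"] * 80 // 100)
new_cycles = total_cycles(new_counts, cpi)

# Execution time = cycles x cycle time; the old cycle time is normalized to 1.
time_ratio = (new_cycles * 1.1) / old_cycles
print(f"new time / old time = {time_ratio:.3f}")
```

A ratio above 1 means the modified design takes longer to run the program than the original.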

Q2. Suppose that we find a way to double the performance of arithmetic instructions. What is the overall speedup of our machine? What if we find a way to improve the performance of arithmetic instructions by 15 times?
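
Q2 can be explored with a small sketch. It assumes that "improving performance by a factor of k" means the cycles spent on arithmetic instructions are divided by k while the clock rate and the other instruction classes are unchanged; the helper `overall_speedup` is hypothetical.

```python
# Instruction counts and CPIs from the problem statement.
counts = {"arithmetic": 600_000_000, "load_store": 200_000_000, "branch": 100_000_000}
cpi = {"arithmetic": 1, "load_store": 10, "branch": 3}


def overall_speedup(arith_factor):
    """Overall speedup when only arithmetic cycles shrink by arith_factor."""
    old_cycles = sum(counts[kind] * cpi[kind] for kind in counts)
    arith_cycles = counts["arithmetic"] * cpi["arithmetic"]
    # Assumption: only the arithmetic portion of the cycle count is reduced.
    new_cycles = old_cycles - arith_cycles + arith_cycles / arith_factor
    return old_cycles / new_cycles


for factor in (2, 15):
    print(f"arithmetic x{factor}: overall speedup = {overall_speedup(factor):.3f}")
```

Because arithmetic instructions account for only 600 million of the 2.9 billion baseline cycles, even a very large improvement to them yields a modest overall speedup.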

For Q3 and Q4, assume that for a given program 70% of the executed instructions are arithmetic, 10% are load/store, and 20% are branch.

Q3. Given this instruction mix and the assumption that an arithmetic instruction requires 2 cycles, a load/store instruction takes 7 cycles, and a branch instruction takes 3 cycles, find the average CPI.
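
The average CPI in Q3 is the mix-weighted sum of the per-class cycle counts. A minimal sketch, with hypothetical names `mix` and `cycles`:

```python
# Instruction mix (fractions) and cycles per instruction from the problem statement.
mix = {"arithmetic": 0.70, "load_store": 0.10, "branch": 0.20}
cycles = {"arithmetic": 2, "load_store": 7, "branch": 3}

# Average CPI = sum over classes of (fraction of instructions x cycles per instruction).
average_cpi = sum(mix[kind] * cycles[kind] for kind in mix)
print(f"average CPI = {average_cpi:.2f}")
```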

Q4. For a speedup of 1.25, how many cycles, on average, may an arithmetic instruction take if load/store and branch instructions are not improved at all?
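
For Q4, one approach is to solve for the arithmetic cycle count that brings the average CPI down by the required factor. The sketch below assumes the instruction count and clock rate are unchanged, so the speedup equals old CPI divided by new CPI; all variable names are hypothetical.

```python
# Instruction mix and baseline cycles per instruction from the problem statement.
mix = {"arithmetic": 0.70, "load_store": 0.10, "branch": 0.20}
cycles = {"arithmetic": 2, "load_store": 7, "branch": 3}
target_speedup = 1.25

old_cpi = sum(mix[kind] * cycles[kind] for kind in mix)

# Assumption: same instruction count and clock rate, so new CPI = old CPI / speedup.
target_cpi = old_cpi / target_speedup

# CPI contribution of the classes that are not improved.
fixed_part = sum(mix[kind] * cycles[kind] for kind in mix if kind != "arithmetic")

# Solve mix["arithmetic"] * x + fixed_part = target_cpi for x.
arith_cycles = (target_cpi - fixed_part) / mix["arithmetic"]
print(f"arithmetic instructions may take about {arith_cycles:.2f} cycles on average")
```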

For Q5 and Q6, consider the following two processors:

P1 has a clock rate of 3 GHz, an average CPI of 0.7, and requires the execution of 1.0E9 instructions. P2 has a clock rate of 4 GHz, an average CPI of 0.9, and requires the execution of 5.0E9 instructions.

Q5. A common fallacy is to consider the computer with the highest clock rate as having the highest performance. Check whether this is true for P1 and P2. Explain your result.
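
Q5 can be checked with the CPU time relation, CPU time = instruction count x CPI / clock rate. A minimal sketch using the figures given for P1 and P2; the function name `cpu_time` is hypothetical.

```python
def cpu_time(instructions, cpi, clock_hz):
    """CPU time in seconds = instruction count x CPI / clock rate."""
    return instructions * cpi / clock_hz


# Figures from the problem statement.
p1_time = cpu_time(1.0e9, 0.7, 3e9)  # P1: 3 GHz, CPI 0.7, 1.0E9 instructions
p2_time = cpu_time(5.0e9, 0.9, 4e9)  # P2: 4 GHz, CPI 0.9, 5.0E9 instructions

print(f"P1: {p1_time:.3f} s, P2: {p2_time:.3f} s")
```

Comparing the two times against the two clock rates shows whether the higher clock rate corresponds to the shorter CPU time here.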

Q6. Another fallacy is to assume that the processor executing the larger number of instructions will need more CPU time. Considering that processor P2 is executing a sequence of 1.0E9 instructions and that the CPIs of processors P1 and P2 do not change, determine the number of instructions that P1 can execute in the same time that P2 needs to execute 1.0E9 instructions.
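
For Q6, one approach is to compute the time P2 needs for 1.0E9 instructions and then invert the CPU time relation for P1. A minimal sketch with hypothetical variable names, using the clock rates and CPIs given above:

```python
# Clock rates and CPIs from the problem statement.
P1_CLOCK_HZ, P1_CPI = 3e9, 0.7
P2_CLOCK_HZ, P2_CPI = 4e9, 0.9

# Time P2 needs to execute 1.0E9 instructions.
p2_time = 1.0e9 * P2_CPI / P2_CLOCK_HZ

# Instructions P1 completes in that time: time x clock rate / CPI.
p1_instructions = p2_time * P1_CLOCK_HZ / P1_CPI

print(f"P2 time: {p2_time:.3f} s")
print(f"P1 instructions in that time: {p1_instructions:.3e}")
```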