B.
3. Write the result back to register C.
Thus the expanded list of actions required to execute an arithmetic
instruction is as follows (substitute any other arithmetic instruction for add in the following list to see how it’s executed):
1. Fetch the next instruction from the address stored in the program counter and load that instruction into the instruction register. Increment the program counter.
2. Decode the instruction in the instruction register.
3. Execute the instruction in the instruction register. Because the instruction is not a branch instruction but an arithmetic instruction, send it to the arithmetic logic unit (ALU).
   a. Read the contents of registers A and B.
   b. Add the contents of A and B.
   c. Write the result back to register C.
At this point, I need to make a modification to the preceding list. For
reasons we’ll discuss in detail when we talk about the instruction window
in Chapter 5, most modern microprocessors treat sub-steps 3a and 3b as
a group, while they treat step 3c, the register write, separately. To reflect this conceptual and architectural division, this list should be modified to look as
follows:
1. Fetch the next instruction from the address stored in the program counter, and load that instruction into the instruction register. Increment the program counter.
2. Decode the instruction in the instruction register.
3. Execute the instruction in the instruction register. Because the instruction is not a branch instruction but an arithmetic instruction, send it to the ALU.
   a. Read the contents of registers A and B.
   b. Add the contents of A and B.
4. Write the result back to register C.
In a modern processor, these four steps are repeated over and over again
until the program is finished executing. These are, in fact, the four stages in
a classic RISC¹ pipeline. (I’ll define the term pipeline shortly; for now, just think of a pipeline as a series of stages that each instruction in the code
stream must pass through when the code stream is being executed.) Here
are the four stages in their abbreviated form, the form in which you’ll most
often see them:
1. Fetch
2. Decode
3. Execute
4. Write (or “write-back”)
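The four stages can be sketched as a small simulation. This is only an illustration under stated assumptions: the `ToyCPU` class, its tuple-based instruction format, and the `sub` operation are hypothetical stand-ins, not the DLW-1's actual instruction encoding. Each method corresponds to one stage, and `run` repeats the stages until the program ends.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCPU:
    """Hypothetical four-register machine (not the DLW-1's real encoding)."""
    program: list            # instructions as (op, src1, src2, dest) tuples
    pc: int = 0              # program counter
    ir: tuple = None         # instruction register
    regs: dict = field(default_factory=lambda: {"A": 0, "B": 0, "C": 0, "D": 0})

    def fetch(self):
        # Stage 1: load the instruction at the PC into the IR, then increment the PC.
        self.ir = self.program[self.pc]
        self.pc += 1

    def decode(self):
        # Stage 2: split the instruction into its operation and operands.
        op, src1, src2, dest = self.ir
        return op, src1, src2, dest

    def execute(self, op, src1, src2):
        # Stage 3: read the source registers and let the "ALU" compute.
        a, b = self.regs[src1], self.regs[src2]
        if op == "add":
            return a + b
        if op == "sub":
            return a - b
        raise ValueError(f"unsupported operation: {op}")

    def write(self, dest, result):
        # Stage 4: write the result back to the destination register.
        self.regs[dest] = result

    def run(self):
        # Repeat the four stages until the program runs out of instructions.
        while self.pc < len(self.program):
            self.fetch()
            op, src1, src2, dest = self.decode()
            result = self.execute(op, src1, src2)
            self.write(dest, result)
        return self.regs


# Assumed example program: C = A + B, then D = C - A.
cpu = ToyCPU(program=[("add", "A", "B", "C"), ("sub", "C", "A", "D")])
cpu.regs.update({"A": 2, "B": 3})
final_regs = cpu.run()
```

Note that the register read and the arithmetic sit together in `execute`, while the register write is its own method, mirroring the split between step 3 and step 4 in the list above.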
Each of these stages could be said to represent one phase in the lifecycle of an instruction. An instruction starts out in the fetch phase, moves to the decode phase, then to the execute phase, and finally to the write phase. As I mentioned in “The Clock” on page 29, each phase takes a fixed, but by no means equal, amount of time. In most of the example processors with which you’ll
be working in this chapter, all four phases take an equal amount of time;
this is not usually the case in real-world processors. In any case, if the DLW-1
takes exactly 1 nanosecond (ns) to complete each phase, then the DLW-1
can finish one instruction every 4 ns.
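The arithmetic behind that figure can be written out as a short sketch. It assumes, as the text does at this point, that the processor works on one instruction at a time, so an instruction's completion time is simply the sum of its phase times. The names `PHASE_NS`, `time_per_instruction_ns`, and `program_time_ns` are illustrative only.

```python
# Phase durations in nanoseconds, using the 1 ns figure from the text.
PHASE_NS = {"fetch": 1, "decode": 1, "execute": 1, "write": 1}


def time_per_instruction_ns(phase_ns):
    # With one instruction in flight at a time, an instruction takes
    # as long as all of its phases added together.
    return sum(phase_ns.values())


def program_time_ns(num_instructions, phase_ns):
    # Instructions run strictly one after another, so total time grows linearly.
    return num_instructions * time_per_instruction_ns(phase_ns)


per_instruction = time_per_instruction_ns(PHASE_NS)   # 4 ns
four_instructions = program_time_ns(4, PHASE_NS)      # 16 ns
```

Changing any entry in `PHASE_NS` shows how unequal phase times, the usual case in real-world processors, would alter the per-instruction total.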
¹ The term RISC is an acronym for Reduced Instruction Set Computing. I’ll cover this term in more detail in Chapter 5.
Basic Instruction Flow
One useful division that computer architects often employ when talking
about CPUs is that of front end versus back end. As you already know, when instructions are fetched from main memory, they must be decoded for
execution. This fetching and decoding takes place in the processor’s front
end.
You can see in Figure 3-1 that the front end roughly corresponds to the
control and I/O units in the previous chapter’s diagram of the DLW-1’s
programming model. The ALU and registers constitute the back end of the
DLW-1. Instructions make their way from the front end down through the
back end, where the work of number crunching gets done.
[Figure: block diagram of the DLW-1. Labels under Front End: Control Unit, Program Counter (PC), Instruction Register, Proc. Status Word (PSW), I/O Unit, Data Bus, Address Bus. Labels under Back End: Registers (A, B, C, D), ALU.]
Figure 3-1: Front end versus back end
We can now modify Figure 1-4 to show all four phases of