







## Outline of Today's Lecture

- ° Recap and Introduction
- ° Introduction to Hazards
- ° Forwarding
- ° 1 cycle Load Delay
- ° 1 cycle Branch Delay
- ° What makes pipelining hard
- ° Summary

361 hazards.5



































| Soft           | ware                          | Scheduling to Avoid Load Hazards |  |  |  |  |
|----------------|-------------------------------|----------------------------------|--|--|--|--|
| Try pro        | ducing                        | l fast code for                  |  |  |  |  |
|                | a = b + c;                    |                                  |  |  |  |  |
|                | d = e – f;                    |                                  |  |  |  |  |
| assum          | assuming a, b, c, d ,e, and f |                                  |  |  |  |  |
| in merr        | in memory.                    |                                  |  |  |  |  |
| Slow c         | ode:                          |                                  |  |  |  |  |
|                | LW                            | Rb,b                             |  |  |  |  |
|                | LW                            | Rc,c                             |  |  |  |  |
|                | ADD                           | Ra,Rb,Rc                         |  |  |  |  |
|                | SW                            | a,Ra                             |  |  |  |  |
|                | LW                            | Re,e                             |  |  |  |  |
|                | LW                            | Rf,f                             |  |  |  |  |
|                | SUB                           | Rd,Re,Rf                         |  |  |  |  |
|                | SW                            | d,Rd                             |  |  |  |  |
| 361 hazards.23 |                               |                                  |  |  |  |  |

| Try producing fast code for<br>a = b + c;<br>d = e – f;                                                                    |                                                        |                                                                                    |
|----------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------|------------------------------------------------------------------------------------|
| assuming a. b. c. d .e. and f                                                                                              | F                                                      |                                                                                    |
| in memory.<br>Slow code:<br>LW Rb,b<br>LW Rc,c<br>ADD Ra,Rb,Rc<br>SW a,Ra<br>LW Re,e<br>LW Rf,f<br>SUB Rd,Re,Rf<br>SW d,Rd | Fast code:<br>LW<br>LW<br>ADD<br>LW<br>SW<br>SUB<br>SW | Rb,b<br>Rc,c<br><u>Re,e</u><br>Ra,Rb,Rc<br>Rf,f<br><u>a,Ra</u><br>Rd,Re,Rf<br>d,Rd |













## Option 2: Define Branch as Delayed

- ° Worst case, SW inserts NOP into branch delay
- ° Where get instructions to fill branch delay slot?
  - Before branch instruction
  - From the target address: only valuable when branch
  - From fall through: only valuable when don't branch
- ° Compiler effectiveness for single branch delay slot:
  - Fills about 60% of branch delay slots
  - About 80% of instructions executed in branch delay slots useful in computation
  - about 50% (60% x 80%) of slots usefully filled

361 hazards.31

| When is pipelining hard?      |                                                                                                                         |  |  |  |  |
|-------------------------------|-------------------------------------------------------------------------------------------------------------------------|--|--|--|--|
| ° Interr<br>• H<br>• R<br>• W | upts: 5 instructions executing in 5 stage pipeline<br>ow to stop the pipeline?<br>estrart?<br>//o caused the interrupt? |  |  |  |  |
| Stage                         | Problem interrupts occurring                                                                                            |  |  |  |  |
| IF                            | Page fault on instruction fetch; misaligned memory<br>access; memory-protection violation                               |  |  |  |  |
| ID                            | Undefined or illegal opcode                                                                                             |  |  |  |  |
| EX                            | Arithmetic interrupt                                                                                                    |  |  |  |  |
| MEM                           | Page fault on data fetch; misaligned memory<br>access; memory-protection violation                                      |  |  |  |  |
|                               |                                                                                                                         |  |  |  |  |
|                               |                                                                                                                         |  |  |  |  |
| 361 hazards.32                |                                                                                                                         |  |  |  |  |



| Also, may pipelin<br>instructions with | e FP exec   | ution unit so that c<br>g full latency | an initiate new   |
|----------------------------------------|-------------|----------------------------------------|-------------------|
| P Instruction                          | Latency     | Initiation Rate                        | (MIPS R4000)      |
| Add, Subtract                          | 4           | 3                                      |                   |
| Aultiply                               | 8           | 4                                      |                   |
| Divide                                 | 36          | 35                                     |                   |
| Square root                            | 112         | 111                                    |                   |
| legate                                 | 2           | 1                                      |                   |
| Absolute value                         | 2           | 1                                      |                   |
| P compare                              | 3           | 2                                      |                   |
| Divide, Square Ro                      | oot take -1 | I0X to -30X longer t                   | han Add           |
| <ul> <li>Exceptions?</li> </ul>        |             |                                        |                   |
| Adds WAR an                            | d WAW h     | azards since pipeli                    | nes are no longer |

| Hazard Detection                                                                                                                       |
|----------------------------------------------------------------------------------------------------------------------------------------|
| Suppose instruction <i>i</i> is about to be issued and a predecessor instruction <i>j</i> is in the instruction pipeline.              |
| Rregs ( <i>i</i> ) = Registers read by instruction <i>i</i>                                                                            |
| Wregs ( i ) = Registers written by instruction i                                                                                       |
| ° A RAW hazard exists on register $\rho$ if $\exists \rho$ , $\rho \in \text{Rregs}(i) \cap \text{Wregs}(j)$                           |
| <ul> <li>Keep a record of pending writes (for inst's in the pipe) and compare<br/>with operand regs of current instruction.</li> </ul> |
| <ul> <li>When instruction issues, reserve its result register.</li> </ul>                                                              |
| - When on operation completes, remove its write reservation.                                                                           |
| ° A WAW hazard exists on register ρ if ∃ρ, ρ∈Wregs( <i>i</i> )∩ Wregs( <i>j</i> )                                                      |
| ° A WAR hazard exists on register ρ if ∃ρ, ρ∈Wregs( <i>i</i> )∩ Rregs( <i>j</i> )                                                      |
| 361 hazards.35                                                                                                                         |
|                                                                                                                                        |

| Avoiding Data Hazards by Design                                                                                                                                                          |  |  |  |  |  |  |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|--|--|--|--|--|
| Suppose instructions are executed in a pipelined fashion such that<br>Instructions are initiated in order.                                                                               |  |  |  |  |  |  |
| <sup>o</sup> WAW avoidance: if writes to a particular resource (e.g., reg) are<br>performed in the same stage for all instructions, then no WAW<br>hazards occur.                        |  |  |  |  |  |  |
| proof: writes are in the same time sequence as instructions.                                                                                                                             |  |  |  |  |  |  |
|                                                                                                                                                                                          |  |  |  |  |  |  |
|                                                                                                                                                                                          |  |  |  |  |  |  |
|                                                                                                                                                                                          |  |  |  |  |  |  |
| <sup>o</sup> WAR avoidance: if in all instructions reads of a resource occur at an<br>earlier stage than writes to that resource occur in any instruction,<br>then no WAR hazards occur. |  |  |  |  |  |  |
| proof: A successor instruction must issue later, hence it will perform<br>writes only after all reads for the current instruction.                                                       |  |  |  |  |  |  |
|                                                                                                                                                                                          |  |  |  |  |  |  |
| 361 hazards.36                                                                                                                                                                           |  |  |  |  |  |  |



- Avoid WAR hazards
- ° Memory access in stage 4
- Avoid all memory hazards
- ° Control hazards resolved by delayed branch (with fast path)
- ° RAW hazards resolved by bypass, except on load results
- which are resolved by fiat (delayed load).

Substantial pipelining with very little cost or complexity. Machine organization is (slightly) exposed! Relies very heavily on "hit assumption" of memory accesses in cache

361 hazards.37

