Definition
Data Hazard
A data hazard is a pipeline hazard in which an instruction needs a value that a previous instruction has not yet written back.
So the later instruction depends on a register or other result that is still being produced by the pipeline.
Handling
Real pipelined processors often combine these techniques:
- rearranging code reduces hazards statically;
- forwarding resolves many hazards dynamically;
- stalling is used when the other methods are not enough.
Inserting Nops
One way to avoid a data hazard is to insert explicit nop instructions between the instruction that produces a value and the instruction that consumes it.
This gives the pipeline enough time to write the value back before it is read.
The method is simple, but it wastes cycles.
Leaky abstraction
nop instructions depend on details of the hardware pipeline.
In particular, the required number of nop instructions depends on how many stages the pipelined processor has and when results become available. So the binary is no longer independent of the concrete hardware.
Example
    add s8, s4, s5
    nop
    nop
    sub s2, s8, s3
    or  s9, t6, s8
    and s7, s8, t2
The nop instructions create delay slots so that s8 is available before sub reads it.
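The number of nop instructions can be derived mechanically from the distance between producer and consumer. The Python sketch below assumes a five-stage pipeline whose register file is written in the first half of a cycle and read in the second half, so a consumer must sit at least three instruction slots behind its producer. The function name insert_nops, the tuple encoding and the min_distance parameter are illustrative assumptions, not part of any real toolchain.

```python
# Hypothetical encoding: (opcode, destination, source1, source2).
NOP = ("nop", None, None, None)

def insert_nops(program, min_distance=3):
    """Pad the program with nops so every reader trails its producer
    by at least min_distance instruction slots (assumed pipeline model)."""
    out = []
    last_write = {}  # register name -> index in `out` of its latest writer
    for op, rd, rs1, rs2 in program:
        needed = 0
        for src in (rs1, rs2):
            if src in last_write:
                distance = len(out) - last_write[src]
                needed = max(needed, min_distance - distance)
        out.extend([NOP] * needed)
        if rd is not None:
            last_write[rd] = len(out)  # index the instruction is about to get
        out.append((op, rd, rs1, rs2))
    return out

program = [
    ("add", "s8", "s4", "s5"),
    ("sub", "s2", "s8", "s3"),
    ("or",  "s9", "t6", "s8"),
    ("and", "s7", "s8", "t2"),
]
padded = insert_nops(program)
print([instr[0] for instr in padded])
# ['add', 'nop', 'nop', 'sub', 'or', 'and']
```

Under these assumptions the sketch reproduces the two nop instructions of the example. A different value of min_distance yields a different padded program, which is exactly the hardware dependence described under "Leaky abstraction".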
Rearranging Code
Rearranging Code at Compile Time
A compiler can move independent instructions between the producer and the consumer.
This fills the gap that would otherwise be occupied by nop instructions. The idea is the same as inserting delays, but useful work is done instead of idling.
Example
    add s8, s4, s5
    sub s2, s8, s3
    or  s9, t6, s8
    and s7, s8, t2
Rearranging the shown instructions does not help much, because the sub, or, and and instructions all depend on s8.
So this method only helps if some independent instruction is available that can be moved into the gap after add.
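Whether an instruction may be moved into the gap can be checked from its register dependences. This is a minimal Python sketch, assuming register-only instructions with no memory accesses or branches. The helper names conflicts and gap_fillers, the tuple encoding and the trailing xor instruction are hypothetical additions for illustration.

```python
# Hypothetical encoding: (opcode, destination, source1, source2).

def conflicts(earlier, later):
    """True if swapping the two instructions could change the result."""
    _, e_rd, e_rs1, e_rs2 = earlier
    _, l_rd, l_rs1, l_rs2 = later
    raw = e_rd is not None and e_rd in (l_rs1, l_rs2)  # later reads earlier's result
    war = l_rd is not None and l_rd in (e_rs1, e_rs2)  # later overwrites earlier's input
    waw = e_rd is not None and e_rd == l_rd            # both write the same register
    return raw or war or waw

def gap_fillers(program, producer):
    """Indices of later instructions that could be moved directly behind
    program[producer] without depending on it or on anything they cross."""
    rd = program[producer][1]
    result = []
    for j in range(producer + 2, len(program)):
        candidate = program[j]
        reads_result = rd in (candidate[2], candidate[3])
        crossed = program[producer + 1:j]  # instructions the candidate jumps over
        if not reads_result and not any(conflicts(k, candidate) for k in crossed):
            result.append(j)
    return result

program = [
    ("add", "s8", "s4", "s5"),
    ("sub", "s2", "s8", "s3"),
    ("or",  "s9", "t6", "s8"),
    ("and", "s7", "s8", "t2"),
]
print(gap_fillers(program, 0))                   # []
extended = program + [("xor", "t0", "t1", "t3")]  # hypothetical independent instruction
print(gap_fillers(extended, 0))                  # [4]
```

For the sequence above the list is empty, which matches the observation that nothing can be rearranged. With the hypothetical xor appended, one slot of the gap can be filled with useful work; any remaining slot would still need a nop or another technique.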
Forwarding Data
Forwarding Data at Run Time
Forwarding, or bypassing, sends a result directly from a later pipeline stage to an earlier stage that needs it.
So the dependent instruction does not have to wait until the value is written back to the register file.
This removes many data hazards, but only when the required value has already been computed somewhere in the pipeline.
Forwarding cases
- if the execute-stage source register matches the destination register of the memory stage, forward from the memory stage;
- otherwise, if it matches the destination register of the write-back stage, forward from the write-back stage;
- otherwise, use the value read from the register file.
For the first ALU input, this can be written as:
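A sketch of that selector, consistent with the three cases above. Only Rs1E and ForwardBE appear in this text; ForwardAE, RdM, RdW, RegWriteM and RegWriteW are assumed names in the same style, and the encodings 10, 01 and 00 are likewise an assumption. The two extra conditions (the earlier instruction really writes a register, and the source is not the hard-wired zero register) are common additions rather than something stated here.

```latex
\[
\mathit{ForwardAE} =
\begin{cases}
10 & \text{if } (\mathit{Rs1E} = \mathit{RdM}) \land \mathit{RegWriteM} \land (\mathit{Rs1E} \neq 0) \\
01 & \text{else if } (\mathit{Rs1E} = \mathit{RdW}) \land \mathit{RegWriteW} \land (\mathit{Rs1E} \neq 0) \\
00 & \text{otherwise}
\end{cases}
\]
```

Here 10 selects the value from the memory stage, 01 the value from the write-back stage, and 00 the value read from the register file.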
The equation for ForwardBE is analogous, with Rs1E replaced by Rs2E.
Example
    add s8, s4, s5
    sub s2, s8, s3
    or  s9, t6, s8
    and s7, s8, t2
The value produced by add for s8 can be forwarded directly to the later instructions instead of waiting until s8 is written back to the register file.
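To see where each reader of s8 gets its value, the forwarding cases can be applied cycle by cycle. The Python sketch below assumes a five-stage pipeline whose register file is written before it is read within a cycle; forward_select and its parameter names are hypothetical, and the stage contents in the comments follow from that assumed pipeline rather than from the text.

```python
def forward_select(rs, rd_mem, regwrite_mem, rd_wb, regwrite_wb):
    """Choose the source of one ALU operand for the instruction in execute."""
    if rs != "zero" and regwrite_mem and rs == rd_mem:
        return "memory stage"
    if rs != "zero" and regwrite_wb and rs == rd_wb:
        return "write-back stage"
    return "register file"

# sub s2, s8, s3 in execute; add (writes s8) is in the memory stage.
print(forward_select("s8", "s8", True, None, False))  # memory stage

# or s9, t6, s8 in execute; sub (writes s2) in memory, add in write-back.
print(forward_select("s8", "s2", True, "s8", True))   # write-back stage

# and s7, s8, t2 in execute; or (writes s9) in memory, sub in write-back.
# add has already written s8, so the value read in decode is correct.
print(forward_select("s8", "s9", True, "s2", True))   # register file
```

The memory stage is checked first because it holds the younger of the two in-flight results, so the most recent write to a register wins.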
Stalling the Processor
Stalling the Processor at Run Time
If the needed value is still not available, the processor can stall the pipeline.
In that case, early pipeline stages are held in place and one or more bubbles are inserted until the data hazard disappears.
This preserves correctness, but reduces throughput.
Example
    add s8, s4, s5
    sub s2, s8, s3
    or  s9, t6, s8
    and s7, s8, t2
The processor may detect that sub needs s8 too early and temporarily stop the advance of the pipeline. During that pause, a bubble is inserted until the value of s8 becomes available.
Load-use hazard
The classic case that still needs a stall is the load-use hazard.
If the instruction in the execute stage is a load and the instruction in the decode stage already wants to read the loaded register, forwarding alone is not enough, because the loaded value is not available yet.
A common detection condition is:
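One plausible form of that condition, using assumed signal names in the style of Rs1E: LoadE is 1 when the instruction in the execute stage is a load, RdE is its destination register, and Rs1D and Rs2D are the source registers of the instruction in the decode stage. None of these names, nor lwStall, appear in the text.

```latex
\[
\mathit{lwStall} = \mathit{LoadE} \land \big( (\mathit{Rs1D} = \mathit{RdE}) \lor (\mathit{Rs2D} = \mathit{RdE}) \big)
\]
```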
and then:
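A sketch of the resulting control signals, with assumed names: lwStall denotes the outcome of the load-use detection, StallF and StallD hold the fetch and decode stages in place, and FlushE clears the execute-stage pipeline register to create the bubble.

```latex
\[
\mathit{StallF} = \mathit{StallD} = \mathit{FlushE} = \mathit{lwStall}
\]
```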
So the front of the pipeline is stalled and a bubble is inserted behind the load.
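The detection can also be sketched in software to count how many bubbles a short program costs. This Python example assumes a five-stage pipeline with full forwarding, so that only a load immediately followed by a reader of its destination stalls. The names load_use_stall and count_bubbles, the tuple encoding and the three-instruction program are hypothetical.

```python
# Hypothetical encoding: (opcode, destination, source1, source2).
LOADS = {"lw", "lh", "lb", "lhu", "lbu"}

def load_use_stall(in_execute, in_decode):
    """True if the decode-stage instruction reads the register that a
    load in the execute stage has not fetched from memory yet."""
    op, rd, _, _ = in_execute
    _, _, rs1, rs2 = in_decode
    return op in LOADS and rd is not None and rd in (rs1, rs2)

def count_bubbles(program):
    """One bubble per adjacent load-use pair (assumes full forwarding)."""
    return sum(load_use_stall(a, b) for a, b in zip(program, program[1:]))

program = [
    ("lw",  "s7", "s5", None),  # lw  s7, 40(s5); the offset is not modelled
    ("and", "s8", "s7", "t3"),  # reads s7 one cycle too early
    ("or",  "t2", "s6", "s7"),  # far enough behind the load
]
print(load_use_stall(program[0], program[1]))  # True
print(count_bubbles(program))                  # 1
```

Only adjacent pairs need checking under these assumptions: once the bubble is inserted, every later reader is at least two slots behind the load and can be served by forwarding.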
Example
Register value not ready
Suppose one instruction computes a value for a register, and the following instruction immediately wants to read that same register.
If the first instruction has not yet reached write-back, the correct value is not yet available in the register file. This creates a data hazard.
A typical instruction sequence is:
    add s8, s4, s5
    sub s2, s8, s3
    or  s9, t6, s8
    and s7, s8, t2
Here the first instruction produces the value for s8, and the following instructions try to read s8 before that value has necessarily reached the register file.
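Which of the readers actually arrive too early depends on the pipeline. The Python sketch below lists the affected pairs under the assumption of a five-stage pipeline without forwarding, whose register file is written before it is read within a cycle, so that a reader three or more slots behind the writer is safe. The function name raw_hazards, the tuple encoding and the min_distance parameter are hypothetical.

```python
# Hypothetical encoding: (opcode, destination, source1, source2).

def raw_hazards(program, min_distance=3):
    """Return (producer index, consumer index, register) for every read
    that follows the latest write to that register too closely."""
    hazards = []
    last_write = {}  # register name -> index of its latest writer
    for i, (op, rd, rs1, rs2) in enumerate(program):
        for src in dict.fromkeys((rs1, rs2)):  # drop duplicates, keep order
            if src in last_write and i - last_write[src] < min_distance:
                hazards.append((last_write[src], i, src))
        if rd is not None:
            last_write[rd] = i
    return hazards

program = [
    ("add", "s8", "s4", "s5"),
    ("sub", "s2", "s8", "s3"),
    ("or",  "s9", "t6", "s8"),
    ("and", "s7", "s8", "t2"),
]
print(raw_hazards(program))
# [(0, 1, 's8'), (0, 2, 's8')]
```

Under these assumptions sub and or read s8 too early, while and is far enough behind add to read the updated register file.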