Definition
Simultaneous Multi-Threading
Simultaneous multi-threading (SMT) issues instructions from multiple threads in the same cycle.
It is built on top of a dynamically scheduled, out-of-order (OoO) processor, which already supplies the hardware mechanisms SMT needs: issue buffers, reservation stations, and a reorder buffer.
Mechanism
SMT requires additional per-thread hardware:
- separate program counters,
- separate register files,
- the ability for instructions from different threads to commit independently of one another.
The existing wakeup, selection, and issue logic operates across threads without modification. When one thread stalls, for example on a data cache miss, the other thread supplies instructions that keep the functional units busy.
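The split between duplicated per-thread state and shared issue logic can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of any real core: the names ThreadContext, stalled_until, and select are hypothetical, data dependences within a thread are ignored, and a stall is reduced to a single "ready again at cycle N" value.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadContext:
    """Per-thread state that SMT duplicates (hypothetical names)."""
    tid: int
    pc: int = 0                                            # separate program counter
    regs: list = field(default_factory=lambda: [0] * 32)   # separate register file
    stalled_until: int = 0                                 # cycle at which a pending miss resolves


def select(threads, cycle, issue_width):
    """Shared select logic: fill up to issue_width slots from any ready thread.

    Simplification: every instruction of a non-stalled thread counts as ready,
    so dependences inside a thread are not modelled.
    """
    issued = []
    progress = True
    while len(issued) < issue_width and progress:
        progress = False
        for t in threads:                  # round-robin over thread contexts
            if len(issued) == issue_width:
                break
            if cycle >= t.stalled_until:   # thread is not waiting on a miss
                issued.append((t.tid, t.pc))
                t.pc += 1
                progress = True
    return issued


threads = [ThreadContext(0), ThreadContext(1)]
threads[0].stalled_until = 2               # thread 0 waits on a miss until cycle 2
print(select(threads, 0, 2))               # both slots go to thread 1
print(select(threads, 2, 2))               # one slot per thread once the miss resolves
```

The select function never branches on which thread an instruction belongs to beyond checking readiness, which mirrors the point above: the issue logic is shared, and only the program counter, the registers, and the stall state are kept per thread.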
Example
Two threads execute the same loop (lw → add → sw → addi → blt). Thread 1 misses in the cache on the load in its second iteration.
- Single-threaded: 22 cycles (10 + 12 stalled).
- SMT dual-threaded: 15 cycles. Thread 2 occupies the ALU, LSU, and SU while Thread 1 waits.
Functional unit utilisation rises because the second thread fills pipeline slots the first thread cannot use.
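The effect on utilisation can be illustrated with a small cycle-counting sketch. It does not reproduce the 22-cycle and 15-cycle figures above, because it uses a much simpler pipeline model. The assumptions are: a 2-wide issue stage, each thread issues at most one instruction per cycle, and the miss penalty is a hypothetical 7 cycles. The function run and all variable names are illustrative.

```python
def run(threads, issue_width):
    """Count cycles until every thread has issued all of its instructions.

    Each thread is a list with one entry per instruction; the entry is the
    number of extra stall cycles that thread suffers after issuing it.
    """
    pcs = [0] * len(threads)
    ready_at = [0] * len(threads)
    cycle = 0
    while any(pc < len(t) for pc, t in zip(pcs, threads)):
        slots = issue_width
        for i, t in enumerate(threads):
            if slots and pcs[i] < len(t) and cycle >= ready_at[i]:
                ready_at[i] = cycle + 1 + t[pcs[i]]   # next issue after any stall
                pcs[i] += 1
                slots -= 1
        cycle += 1
    return cycle


LOOP = 5                                  # lw, add, sw, addi, blt
miss_thread = [0] * (2 * LOOP)
miss_thread[LOOP] = 7                     # assumed 7-cycle miss on the second lw
clean_thread = [0] * (2 * LOOP)

WIDTH = 2
serial_cycles = run([miss_thread], WIDTH) + run([clean_thread], WIDTH)
smt_cycles = run([miss_thread, clean_thread], WIDTH)
instructions = len(miss_thread) + len(clean_thread)

print(serial_cycles, smt_cycles)                       # 27 17
print(instructions / (serial_cycles * WIDTH))          # issue-slot utilisation, one thread at a time
print(instructions / (smt_cycles * WIDTH))             # issue-slot utilisation with SMT
```

In this model the same 20 instructions occupy 27 cycles when the threads run one after the other and 17 cycles when they share the issue stage, so the fraction of issue slots doing useful work rises accordingly.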
Relationship to Other MT Types
SMT exploits both instruction-level parallelism (ILP) and thread-level parallelism simultaneously. It achieves higher single-core throughput than coarse-grained or fine-grained multi-threading, at the cost of moderate additional hardware on top of a core that is already out-of-order.