Lukas' Notes

computer-architecture

Definition

Simultaneous Multi-Threading

Simultaneous multi-threading (SMT) issues instructions from multiple threads in the same cycle.

It builds on a dynamically scheduled, out-of-order (OoO) processor, which already supplies the hardware mechanisms SMT needs: issue buffers, reservation stations, and a reorder buffer.

Mechanism

SMT requires additional per-thread hardware:

The existing wakeup, selection, and issue logic operates across threads without modification. When one thread stalls, for example on a data cache miss, another thread supplies instructions that keep the functional units busy.
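
The stall-filling behaviour can be sketched with a toy issue model. This is an illustration under stated assumptions, not a description of any real core: the names Thread, run, ilp, and width are hypothetical, each thread can supply at most ilp ready instructions per cycle, and a stall is modelled as a fixed window of cycles in which the thread issues nothing.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    remaining: int          # instructions still to issue
    stall_start: int = -1   # first stalled cycle (hypothetical cache miss); -1 = no stall
    stall_len: int = 0      # number of cycles the thread cannot issue
    ilp: int = 1            # assumed max ready instructions per cycle from this thread

    def ready(self, cycle: int) -> bool:
        stalled = self.stall_start <= cycle < self.stall_start + self.stall_len
        return self.remaining > 0 and not stalled


def run(threads, width=2):
    """Issue up to `width` instructions per cycle, drawn from any ready thread.

    Returns (total_cycles, busy_issue_slots).
    """
    cycle = busy = 0
    while any(t.remaining > 0 for t in threads):
        slots = width
        for t in threads:  # selection ignores which thread an instruction belongs to
            if not t.ready(cycle):
                continue
            n = min(slots, t.ilp, t.remaining)
            t.remaining -= n
            slots -= n
        busy += width - slots
        cycle += 1
    return cycle, busy


# One thread alone, stalled during cycles 2-4.
single = run([Thread(6, stall_start=2, stall_len=3)])
# The same thread plus a second thread that never stalls.
dual = run([Thread(6, stall_start=2, stall_len=3), Thread(6)])
print(single, dual)  # (9, 6) (9, 12)
```

With these made-up parameters the lone thread fills 6 of 18 issue slots in 9 cycles. Adding the second thread completes twice the work in the same 9 cycles and fills 12 of 18 slots, because the second thread issues alongside the first and keeps issuing while the first is stalled.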

Example

Two threads execute the same loop (lw, add, sw, addi, blt). Thread 1 takes a cache miss on the load in its second iteration.

  • Single-threaded: 22 cycles (10 + 12 stalled).
  • SMT dual-threaded: 15 cycles. Thread 2 occupies the ALU, LSU, and SU while Thread 1 waits.

Functional unit utilisation rises because the second thread fills pipeline slots the first thread cannot use.

Relationship to Other MT Types

SMT exploits instruction-level parallelism (ILP) and thread-level parallelism simultaneously. It achieves higher single-core throughput than coarse-grained or fine-grained multi-threading, at the cost of moderate additional hardware on top of an existing OoO core.