Lukas' Notes

Definition

Hardware Multi-Threading

Hardware multi-threading allows multiple threads to share the functional units of a single processor without duplicating the entire core.

Each thread has private state — a program counter and a register file — but all threads share the same execution resources. The processor switches between threads rapidly to hide pipeline and memory latencies, exploiting thread-level parallelism (TLP) to increase throughput.
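To make the split between private and shared state concrete, here is a minimal sketch. The names `ThreadContext`, `SharedAlu`, and `step`, as well as the register count of 8, are hypothetical and chosen only for illustration; they do not correspond to any real processor.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ThreadContext:
    """Private per-thread state: a program counter and a register file."""
    pc: int = 0
    # 8 registers is an arbitrary, illustrative size.
    regs: List[int] = field(default_factory=lambda: [0] * 8)


class SharedAlu:
    """One functional unit that every hardware thread uses."""

    def __init__(self) -> None:
        self.ops_executed = 0

    def add(self, a: int, b: int) -> int:
        self.ops_executed += 1
        return a + b


def step(ctx: ThreadContext, alu: SharedAlu) -> None:
    """Toy instruction 'r0 = r0 + 1', executed on the shared ALU."""
    ctx.regs[0] = alu.add(ctx.regs[0], 1)
    ctx.pc += 1


alu = SharedAlu()
t0, t1 = ThreadContext(), ThreadContext()
step(t0, alu)
step(t0, alu)
step(t1, alu)
```

Each context advances its own `pc` and registers, while the single `SharedAlu` counts work from both threads.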

This is distinct from instruction-level parallelism: TLP draws independent work from separate threads rather than from a single instruction stream.

Types

Three approaches differ in when and how frequently the processor switches threads. [1]

Coarse-Grained

Definition

Coarse-Grained Hardware Multi-Threading

Coarse-grained hardware multi-threading switches between threads only on costly stalls, typically L2 or L3 cache misses.

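Because switches happen only on such stalls, a thread that runs without stalling executes continuously. On a switch, the pipeline must drain and then be refilled by the new thread, so each switch carries a start-up cost. This makes the scheme a poor fit for hiding short stalls.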
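The policy can be sketched as a toy cycle-by-cycle simulation. This is an illustration under stated assumptions, not a model of real hardware: the function name `coarse_grained_trace`, the `"op"`/`"miss"` instruction encoding, and the default values for `miss_latency` and `switch_penalty` are all hypothetical.

```python
def coarse_grained_trace(threads, miss_latency=6, switch_penalty=2):
    """Return, per cycle, the id of the issuing thread (None = bubble).

    threads: list of instruction lists; each instruction is "op" or "miss".
    A "miss" stands for a costly stall (assumed latency: miss_latency cycles).
    """
    n = len(threads)
    pcs = [0] * n        # private program counter per thread
    ready_at = [0] * n   # first cycle in which each thread may issue again
    current, cycle, trace = 0, 0, []

    def runnable(t):
        return pcs[t] < len(threads[t]) and ready_at[t] <= cycle

    while any(pcs[t] < len(threads[t]) for t in range(n)):
        if runnable(current):
            # No costly stall pending: the current thread keeps the pipeline.
            instr = threads[current][pcs[current]]
            pcs[current] += 1
            trace.append(current)
            cycle += 1
            if instr == "miss":
                ready_at[current] = cycle + miss_latency
            continue
        others = [t for t in range(n) if runnable(t)]
        if others:
            # Switch threads; the drain/refill cost shows up as bubbles.
            current = others[0]
            trace.extend([None] * switch_penalty)
            cycle += switch_penalty
        else:
            trace.append(None)  # every unfinished thread is stalled
            cycle += 1
    return trace


stalling = coarse_grained_trace([["op", "miss", "op"], ["op", "op"]])
smooth = coarse_grained_trace([["op"] * 4, ["op"] * 2])
```

In `smooth`, thread 0 never stalls and therefore issues all four of its instructions back to back before thread 1 gets the pipeline. In `stalling`, the switch happens only after thread 0 hits its miss, and the switch itself costs two bubble cycles.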

Fine-Grained

Definition

Fine-Grained Hardware Multi-Threading

Fine-grained hardware multi-threading switches between threads on every clock cycle, typically in round-robin order, skipping threads that are currently stalled.
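A minimal sketch of round-robin selection with stalled threads skipped follows. The function name `fine_grained_trace`, the `"op"`/`"miss"` encoding, and the default `miss_latency` are hypothetical choices for illustration only.

```python
def fine_grained_trace(threads, miss_latency=3):
    """Return, per cycle, the id of the issuing thread (None = no thread ready).

    threads: list of instruction lists; each instruction is "op" or "miss".
    After issuing a "miss", a thread is stalled for miss_latency cycles.
    """
    n = len(threads)
    pcs = [0] * n        # private program counter per thread
    ready_at = [0] * n   # first cycle in which each thread may issue again
    trace, cycle = [], 0
    last = n - 1         # so that thread 0 is considered first

    while any(pcs[t] < len(threads[t]) for t in range(n)):
        chosen = None
        # Round-robin: start after the last issuer, skip stalled/finished threads.
        for offset in range(1, n + 1):
            t = (last + offset) % n
            if pcs[t] < len(threads[t]) and ready_at[t] <= cycle:
                chosen = t
                break
        if chosen is not None:
            instr = threads[chosen][pcs[chosen]]
            pcs[chosen] += 1
            if instr == "miss":
                ready_at[chosen] = cycle + 1 + miss_latency
            last = chosen
        trace.append(chosen)
        cycle += 1
    return trace


trace = fine_grained_trace([["op", "op", "op"], ["miss", "op"], ["op", "op"]])
```

In this run the threads alternate every cycle. In cycle 4 it would be thread 1's turn, but it is still waiting on its miss, so the selector skips it and issues from thread 2 instead.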

Simultaneous Multi-Threading (SMT)

Definition

Simultaneous Multi-Threading

Simultaneous multi-threading (SMT) issues instructions from multiple threads in the same cycle.

It is built on top of a dynamically scheduled, out-of-order (OoO) processor, which already supplies the hardware mechanisms SMT needs: issue buffers, reservation stations, and a reorder buffer.
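The core idea, filling the issue slots of one cycle with ready instructions from several threads, can be sketched as follows. The function name `smt_issue`, the issue width of 4, and the round-robin slot-filling order are assumptions made for illustration; the notes do not specify an issue policy, and real SMT designs use their own.

```python
def smt_issue(ready, width=4):
    """Fill up to `width` issue slots for a single cycle.

    ready: dict mapping thread id -> list of instructions that are ready
           to issue this cycle (dependences already resolved).
    Returns the (thread_id, instruction) pairs issued in this cycle.
    """
    queues = {t: list(q) for t, q in ready.items()}  # copy; leave input intact
    issued = []
    while len(issued) < width and any(queues.values()):
        # Assumed policy: take one instruction per thread, in id order.
        for t in sorted(queues):
            if queues[t] and len(issued) < width:
                issued.append((t, queues[t].pop(0)))
    return issued


slots = smt_issue({0: ["add", "mul"], 1: ["load"], 2: []})
threads_in_cycle = {t for t, _ in slots}
```

Thread 0 alone could fill only two of the four slots in this cycle; with SMT, thread 1 contributes a third instruction in the same cycle, which neither coarse-grained nor fine-grained multi-threading allows.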

Footnotes

  1. 191.003 Computer Systems