Discount Factor

Definition

Discount Factor

The discount factor $γ \in [0, 1]$ weights future rewards in the return:
$G_{t} = k = 0 \sum \infty γ^{k} R (s_{t + k}, a_{t + k}) .$

$γ = 0$ : the agent is myopic, maximising only the immediate reward.

$γ \to 1$ : the agent is farsighted, valuing distant rewards almost as much as immediate ones.

$γ < 1$ also ensures $G_{t}$ remains finite for unbounded or continuing tasks, acting as a soft horizon.

Lukas' Notes

Discount Factor

Definition

Backlinks