Lukas' Notes

reinforcement-learning

Definition

Advantage Estimate

An advantage estimate $\hat{A}_t$ is an empirical approximation of the advantage function $A^\pi(s_t, a_t)$, computed from sampled trajectories.

Common estimators:

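  • TD residual: $\delta_t = r_t + \gamma V(s_{t+1}) - V(s_t)$.
  • GAE: $\hat{A}_t^{\mathrm{GAE}(\gamma,\lambda)} = \sum_{l=0}^{\infty} (\gamma\lambda)^l \, \delta_{t+l}$, where $\delta_t = r_t + \gamma V(s_{t+1}) - V(s_t)$ is the TD residual, interpolating between low-variance ($\lambda = 0$, TD) and low-bias ($\lambda = 1$).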
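
The two estimators above can be sketched in NumPy. This is a minimal illustration, not a reference implementation: the function names `td_residuals` and `gae` are hypothetical, the trajectory is assumed to be a single uninterrupted rollout, and `values` is assumed to carry one extra entry for the bootstrap value of the state after the last step. GAE is computed with the backward recursion $\hat{A}_t = \delta_t + \gamma\lambda \hat{A}_{t+1}$, which is equivalent to the (truncated) sum of discounted TD residuals.

```python
import numpy as np

def td_residuals(rewards, values, gamma):
    """TD residuals delta_t = r_t + gamma * V(s_{t+1}) - V(s_t).

    Assumption: `values` has len(rewards) + 1 entries; the last one is the
    bootstrap value of the state after the final step (0.0 if terminal).
    """
    rewards = np.asarray(rewards, dtype=float)
    values = np.asarray(values, dtype=float)
    return rewards + gamma * values[1:] - values[:-1]

def gae(rewards, values, gamma=0.99, lam=0.95):
    """GAE via the backward recursion A_t = delta_t + gamma * lam * A_{t+1}."""
    deltas = td_residuals(rewards, values, gamma)
    advantages = np.zeros_like(deltas)
    running = 0.0
    for t in reversed(range(len(deltas))):
        running = deltas[t] + gamma * lam * running
        advantages[t] = running
    return advantages

# Toy trajectory (made-up numbers, for illustration only).
rewards = [1.0, 1.0, 1.0]
values = [0.5, 0.5, 0.5, 0.0]  # last entry: bootstrap value, terminal here

print(gae(rewards, values, gamma=0.9, lam=0.0))  # equals the TD residuals
print(gae(rewards, values, gamma=0.9, lam=1.0))  # discounted return minus V(s_t)
```

With `lam=0.0` the result reduces to the one-step TD residual, and with `lam=1.0` it becomes the discounted return minus the value baseline, which matches the two ends of the interpolation described in the list.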

The estimate enters the policy gradient as the weight multiplying $\nabla_\theta \log \pi_\theta(a_t \mid s_t)$.
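
To make the weighting concrete, here is a small sketch of the sample estimate $\hat{g} = \frac{1}{T} \sum_t \hat{A}_t \, \nabla_\theta \log \pi_\theta(a_t \mid s_t)$. It assumes a deliberately simple setting that is not part of the note: a state-independent softmax policy whose parameters are the logits themselves, so the score function has the closed form `one_hot(a_t) - probs`. The names `softmax` and `policy_gradient_estimate` are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

def policy_gradient_estimate(logits, actions, advantages):
    """Mean over t of A_t * grad_logits log pi(a_t).

    Assumption: the policy is a softmax over `logits` and ignores the state,
    so grad_logits log pi(a) = one_hot(a) - softmax(logits).
    """
    probs = softmax(np.asarray(logits, dtype=float))
    actions = np.asarray(actions)
    advantages = np.asarray(advantages, dtype=float)
    one_hot = np.eye(len(probs))[actions]   # shape (T, num_actions)
    score = one_hot - probs                 # score function per time step
    return (advantages[:, None] * score).mean(axis=0)

# Made-up samples: action 0 had positive advantage, action 1 negative.
grad = policy_gradient_estimate(np.zeros(2), actions=[0, 1], advantages=[1.0, -1.0])
print(grad)  # ascent direction raises the logit of action 0
```

Actions with a positive advantage estimate have their log-probability pushed up, and actions with a negative estimate are pushed down, in proportion to the size of the estimate.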