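reinforcement-learning Definition Parameterised Stochastic Policy A policy π_θ(a | s) whose behaviour is determined by a parameter vector θ and which maps each state s to a probability distribution over actions a, rather than to a single fixed action.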
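
As an illustration only, one common way to realise such a policy is a softmax over linear action preferences. The sketch below assumes a discrete action set and a hand-written feature vector per state; the names `softmax_policy`, `sample_action`, `theta`, and `state_features`, and all numeric values, are hypothetical and not taken from the source.

```python
import math
import random


def softmax_policy(theta, state_features):
    """Return action probabilities pi_theta(a | s) for a linear-softmax policy.

    theta: list of per-action weight vectors (an assumed parameterisation).
    state_features: feature vector describing the current state.
    """
    # Action preferences: one dot product per action.
    prefs = [sum(w * x for w, x in zip(weights, state_features)) for weights in theta]
    # Subtract the largest preference before exponentiating, for numerical stability.
    largest = max(prefs)
    exps = [math.exp(p - largest) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(theta, state_features, rng=random):
    """Draw an action index at random according to pi_theta(. | s)."""
    probs = softmax_policy(theta, state_features)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Illustrative values: 3 actions, 2 state features.
theta = [[0.0, 0.0], [1.0, -1.0], [0.5, 0.5]]
state = [1.0, 2.0]

probs = softmax_policy(theta, state)                     # distribution over the 3 actions
action = sample_action(theta, state, random.Random(0))   # one sampled action index
```

Changing `theta` changes the distribution (the policy is parameterised), and repeated calls to `sample_action` in the same state can return different actions (the policy is stochastic). Other parameterisations, such as a Gaussian over continuous actions, fit the same definition.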