Definition
Parameterised Policy
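A parameterised policy $\pi_\theta$ is a stochastic policy whose action probabilities are computed by a differentiable function with parameters $\theta \in \mathbb{R}^d$:
$$\pi_\theta(a \mid s) = \Pr(A_t = a \mid S_t = s;\, \theta)$$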
Common parameterisations include a neural network that outputs logits over the action set $\mathcal{A}$, followed by a softmax (for discrete actions), or a network that outputs the mean and variance of a Gaussian (for continuous actions). The parameters are updated by gradient ascent on a policy gradient objective.
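To make the discrete case concrete, here is a minimal sketch in Python using only the standard library. It assumes a linear model in place of a neural network (the logit for each action is a dot product between a parameter row and the state features) and a REINFORCE-style update weighted by a return. The function names, the toy state, the learning rate and the return value are all illustrative assumptions rather than part of the definition.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def action_probs(theta, state):
    """pi_theta(. | s) for a linear-softmax policy: logit[a] = theta[a] . state."""
    logits = [sum(w * x for w, x in zip(row, state)) for row in theta]
    return softmax(logits)

def sample_action(theta, state, rng=random):
    """Draw an action index from pi_theta(. | s)."""
    probs = action_probs(theta, state)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def grad_log_prob(theta, state, action):
    """Gradient of log pi_theta(action | state) with respect to theta.

    For a linear-softmax policy, row a of the gradient is
    (1[a == action] - pi_theta(a | s)) * state.
    """
    probs = action_probs(theta, state)
    return [[((1.0 if a == action else 0.0) - probs[a]) * x for x in state]
            for a in range(len(theta))]

def ascent_step(theta, state, action, ret, lr=0.1):
    """One REINFORCE-style gradient ascent step, weighted by the return `ret`."""
    grad = grad_log_prob(theta, state, action)
    return [[w + lr * ret * g for w, g in zip(row, grad_row)]
            for row, grad_row in zip(theta, grad)]

# Hypothetical toy setup: 3 actions, 2 state features, parameters start at zero.
theta = [[0.0, 0.0] for _ in range(3)]
state = [1.0, 0.5]
before = action_probs(theta, state)   # uniform, since all logits are zero
theta = ascent_step(theta, state, action=2, ret=1.0)
after = action_probs(theta, state)    # probability of action 2 has increased
```

Because the return is positive, the step raises the probability of the action that was taken and lowers the others. A continuous-action policy would follow the same pattern, with the softmax replaced by a Gaussian density whose mean and variance are functions of the state.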