Definition
Markov Property
A stochastic process has the Markov property if the conditional probability distribution of future states depends only on the present state, not on the sequence of events that preceded it.
Formally, for a sequence of random variables $X_1, X_2, \ldots$ taking values in a state space $S$:

$$P(X_{n+1} = x \mid X_n = x_n,\, X_{n-1} = x_{n-1},\, \ldots,\, X_1 = x_1) = P(X_{n+1} = x \mid X_n = x_n)$$

for all $n \ge 1$ and all $x, x_1, \ldots, x_n \in S$. The process is said to be memoryless: given the present, the past provides no additional information about the future.
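The definition can be checked empirically: in a simulated chain, the conditional frequency of the next state given the current state should not change when we additionally condition on the previous state. The sketch below uses a hypothetical two-state chain with a made-up transition matrix, chosen purely for illustration.

```python
import random

random.seed(0)

# Hypothetical two-state chain; row i gives P(next = 0), P(next = 1)
# given current state i.
P = [[0.7, 0.3],
     [0.4, 0.6]]

def step(state):
    # The next state is sampled from P[state] alone -- no history used.
    return 0 if random.random() < P[state][0] else 1

# Simulate a long trajectory.
xs = [0]
for _ in range(200_000):
    xs.append(step(xs[-1]))

# Estimate P(X_{n+1} = 1 | X_n = 0, X_{n-1} = prev) for each prev.
def cond_freq(prev):
    num = sum(1 for i in range(1, len(xs) - 1)
              if xs[i - 1] == prev and xs[i] == 0 and xs[i + 1] == 1)
    den = sum(1 for i in range(1, len(xs) - 1)
              if xs[i - 1] == prev and xs[i] == 0)
    return num / den

f0, f1 = cond_freq(0), cond_freq(1)
print(f0, f1)  # both close to P[0][1] = 0.3, regardless of prev
```

Both estimates agree with the one-step transition probability, illustrating that the extra conditioning on $X_{n-1}$ carries no information.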
Examples
Random walk on a line
A particle that at each step moves left or right with equal probability, independent of its previous trajectory, satisfies the Markov property. Only its current position matters.
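A minimal sketch of such a walk: the update function receives only the current position, so by construction the process cannot depend on the trajectory that produced it.

```python
import random

random.seed(1)

def walk_step(position):
    # Move left or right with equal probability, independent of how
    # the walker reached `position`.
    return position + random.choice([-1, 1])

pos = 0
for _ in range(100):
    pos = walk_step(pos)
print(pos)
```

Note that after an even number of steps from the origin, the position is always even, a simple consequence of each step changing the position by exactly one.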
Board games
In many board games (e.g. Snakes and Ladders with dice), the next position depends only on the current square and the die roll, not on how the player arrived there.
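This can be sketched for a toy Snakes and Ladders board (the board layout and overshoot rule below are assumptions for illustration, not the official rules): the move function is a pure function of the current square and a fresh die roll.

```python
import random

random.seed(2)

# Hypothetical board: JUMPS maps a landing square to the foot of a
# ladder or the tail of a snake.
JUMPS = {3: 22, 17: 4, 28: 55, 54: 19}
LAST = 100

def move(square):
    # Next position depends only on the current square and the die
    # roll; the path that led to `square` is irrelevant.
    roll = random.randint(1, 6)
    target = square + roll
    if target > LAST:
        return square            # assumed rule: overshoot means stay put
    return JUMPS.get(target, target)

sq = 0
while sq != LAST:
    sq = move(sq)
print("reached square", sq)
```

The game state is fully summarized by the current square, which is exactly what makes the process Markov.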
Reinforcement learning
A Markov decision process assumes the Markov property: the agent’s next state and reward depend only on the current state and action, not on the full history.
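The assumption can be made concrete with a tiny hypothetical two-state MDP (the states, actions, probabilities, and rewards below are invented for illustration): the environment's dynamics are a table indexed by the current (state, action) pair only.

```python
import random

random.seed(3)

# Hypothetical MDP dynamics: (state, action) -> list of
# (probability, next_state, reward). Indexing by (state, action)
# alone encodes the Markov assumption.
MDP = {
    (0, "stay"): [(0.9, 0, 1.0), (0.1, 1, 0.0)],
    (0, "go"):   [(0.2, 0, 0.0), (0.8, 1, 2.0)],
    (1, "stay"): [(1.0, 1, 0.5)],
    (1, "go"):   [(0.5, 0, 0.0), (0.5, 1, 1.0)],
}

def step(state, action):
    # Sample (next_state, reward) from the distribution attached to
    # the current (state, action) pair; no history is consulted.
    r = random.random()
    acc = 0.0
    for p, s2, rew in MDP[(state, action)]:
        acc += p
        if r < acc:
            return s2, rew
    return s2, rew  # guard against floating-point rounding

s, total = 0, 0.0
for _ in range(10):
    s, rew = step(s, random.choice(["stay", "go"]))
    total += rew
print(s, total)
```

Because `step` takes no trajectory argument, any agent interacting with this environment only ever needs the current state to predict what happens next.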