Continuous State Space

Definition

A state space is called continuous if the variables that define the state can take on any value within a continuous range. In other words, the state variables are real-valued and are not restricted to a finite set of discrete values.

Search Approaches

Constraint Optimisation (Mathematical Programming)

This is a method for finding the best possible solution (optimising) according to a specific goal (the objective function $f$) while respecting certain rules or limitations (the constraints $C$).

Convex optimisation: This is the special case in which the set of solutions allowed by the constraints $C$ forms a convex region (geometrically, a shape where the line segment between any two points in the shape stays entirely within the shape) and the objective function $f$ is convex.
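Convexity matters because, in a convex problem, any local minimum is also a global minimum, so a simple downhill method cannot get stuck in a worse local optimum. The following is a minimal sketch of one such method, projected gradient descent, assuming a hypothetical quadratic objective and a box-shaped constraint region; neither comes from the notes above.

```python
def project_onto_box(point, lower, upper):
    """Clip each coordinate into [lower, upper]; a box is a convex set."""
    return [min(max(v, lower), upper) for v in point]


def projected_gradient_descent(grad, start, lower, upper, step=0.1, iters=200):
    """Minimise a convex function over a box: step downhill, then project back."""
    x = list(start)
    for _ in range(iters):
        g = grad(x)
        moved = [xi - step * gi for xi, gi in zip(x, g)]
        x = project_onto_box(moved, lower, upper)
    return x


# Hypothetical convex objective: f(x, y) = (x - 2)^2 + (y - 1)^2.
# Its unconstrained minimum (2, 1) lies outside the box [0, 1] x [0, 1].
def grad_f(p):
    return [2 * (p[0] - 2), 2 * (p[1] - 1)]


best = projected_gradient_descent(grad_f, start=[0.0, 0.0], lower=0.0, upper=1.0)
print(best)  # close to the corner (1, 1), the feasible point nearest (2, 1)
```

The projection step is what enforces the constraints here; for constraint sets that are harder to project onto, a dedicated solver is usually the better choice.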

Discretisation

This is the process of converting a continuous problem space (where variables can take any real value within a range) into a discrete one (where variables can only take specific, separate values).

Gridding: Impose a grid with a fixed spacing $δ$ onto the continuous space, where only points on the grid are considered valid states. This reduces the infinite possibilities to a finite set.

Sampling: Instead of a rigid grid, randomly sample potential “successor” states near the current state.

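Empirical Gradient Search: Apply steepest-ascent hill-climbing to the discretised version of the state space, evaluating $f$ at small increments and decrements of each variable instead of computing derivatives.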
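The three ideas above can be sketched with a single hill-climbing loop and two interchangeable successor functions. This is an illustrative sketch only: the function names, the objective, the grid spacing of 0.25 and the sample count of 20 are all assumptions chosen for the example.

```python
import random


def grid_successors(state, delta):
    """Gridding: neighbours reached by moving one coordinate by +delta or -delta."""
    neighbours = []
    for i in range(len(state)):
        for step in (-delta, delta):
            nxt = list(state)
            nxt[i] += step
            neighbours.append(tuple(nxt))
    return neighbours


def sampled_successors(state, radius, count, rng):
    """Sampling: `count` random states within `radius` of the current state per coordinate."""
    return [
        tuple(v + rng.uniform(-radius, radius) for v in state)
        for _ in range(count)
    ]


def hill_climb(f, start, successors, max_steps=1000):
    """Steepest ascent: move to the best successor until none improves f."""
    current = tuple(start)
    for _ in range(max_steps):
        best = max(successors(current), key=f)
        if f(best) <= f(current):
            break
        current = best
    return current


# Hypothetical objective with its maximum at (1, -2).
def f(p):
    return -((p[0] - 1) ** 2 + (p[1] + 2) ** 2)


# Empirical gradient search: steepest ascent over the grid neighbours.
on_grid = hill_climb(f, (0.0, 0.0), lambda s: grid_successors(s, 0.25))

# Same loop, but with randomly sampled successors instead of a rigid grid.
rng = random.Random(0)
by_sampling = hill_climb(f, (0.0, 0.0), lambda s: sampled_successors(s, 0.25, 20, rng))
```

With gridding, the result can only be as precise as the spacing allows; here the maximum happens to lie on the grid. With sampling, the search stops as soon as one batch of samples fails to improve on the current state, so the result is approximate.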

Gradient Methods

This is an analytical approach to finding an optimum of $f$. For example, for a state described by the six variables $(x_1, y_1, x_2, y_2, x_3, y_3)$, consider the gradient:

$$\nabla f = \left[ \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial y_1}, \frac{\partial f}{\partial x_2}, \frac{\partial f}{\partial y_2}, \frac{\partial f}{\partial x_3}, \frac{\partial f}{\partial y_3} \right]$$

which gives the direction and magnitude of the steepest local change of $f$. Ideally, we would find the optimum by solving, in closed form:

$$\nabla f(x) = 0$$

which is often not feasible. Instead, we compute the gradient locally and perform steepest-ascent hill-climbing (gradient ascent):

$$x \leftarrow x + \alpha \nabla f(x)$$

where $\alpha$ is a small step size.
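The gradient ascent update can be sketched in a few lines. The objective, the step size of 0.1 and the stopping tolerance below are assumptions chosen for illustration, not values from the notes.

```python
def gradient_ascent(grad, start, alpha=0.1, tol=1e-8, max_iters=10_000):
    """Repeat x <- x + alpha * grad(x) until the gradient is nearly zero."""
    x = list(start)
    for _ in range(max_iters):
        g = grad(x)
        # Stop once the gradient norm is below the tolerance.
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break
        x = [xi + alpha * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical objective f(x, y) = -(x - 1)^2 - (y + 2)^2, maximised at (1, -2).
def grad_f(p):
    return [-2 * (p[0] - 1), -2 * (p[1] + 2)]


x_star = gradient_ascent(grad_f, start=[0.0, 0.0])
print(x_star)  # approaches (1, -2)
```

The choice of step size is a trade-off: if it is too small, progress is slow; if it is too large, the update can overshoot the maximum.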

Newton-Raphson method: This is another, often effective, method. Iterate:

$$x \leftarrow x - H_f^{-1}(x) \nabla f(x)$$

where $\nabla f(x)$ is the gradient of $f$ and $H_f(x)$ is the Hessian matrix of second derivatives, with entries $H_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}$.
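As a rough illustration, the Newton-Raphson iteration for a function of two variables might look like the sketch below. The objective, the helper `solve_2x2` and the tolerance are hypothetical; the sketch solves the linear system $H_f(x) d = \nabla f(x)$ rather than inverting the Hessian explicitly, which gives the same update.

```python
import math


def solve_2x2(a, b):
    """Solve the 2x2 linear system a * d = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [
        (b[0] * a[1][1] - a[0][1] * b[1]) / det,
        (a[0][0] * b[1] - b[0] * a[1][0]) / det,
    ]


def newton_raphson(grad, hessian, start, tol=1e-10, max_iters=50):
    """Repeat x <- x - H(x)^{-1} grad(x) until the step is negligible."""
    x = list(start)
    for _ in range(max_iters):
        d = solve_2x2(hessian(x), grad(x))  # d = H^{-1} grad
        x = [xi - di for xi, di in zip(x, d)]
        if max(abs(di) for di in d) < tol:
            break
    return x


# Hypothetical objective f(x, y) = 2x - e^x - (y - 1)^2, maximised at (ln 2, 1).
def grad_f(p):
    return [2 - math.exp(p[0]), -2 * (p[1] - 1)]


def hessian_f(p):
    return [[-math.exp(p[0]), 0.0], [0.0, -2.0]]


x_star = newton_raphson(grad_f, hessian_f, start=[0.0, 0.0])
print(x_star)  # approaches (ln 2, 1)
```

Note that the iteration only looks for a point where the gradient is zero, so on a general function it can converge to a minimum or a saddle point as well as a maximum; the example objective is concave, so its only stationary point is the maximum.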

Lukas' Notes
