stormvogel.teaching.lovejoy¶

Lovejoy grid MDP: finite upper-bounding MDP for POMDPs.

Constructs the Lovejoy grid MDP for POMDPs where every observation class has at most three states.

Two-state class — the belief simplex is a line segment [0, 1]; a uniform grid of step 1/k covers it, and off-grid successors are replaced by linear interpolation over the two adjacent grid points.
Three-state class — the belief simplex is a triangle (2-simplex); a uniform triangulation at resolution k covers it with (k+1)(k+2)//2 grid points, and off-grid successors are replaced by barycentric interpolation over the three vertices of the enclosing sub-triangle.

In both cases the interpolated value is an over-approximation because V* is convex, so the result is a stormvogel MDP whose maximal reachability probability is an upper bound on the true POMDP value at the initial belief.

Functions¶

lovejoy_grid_mdp(→ stormvogel.model.model.Model)

Build the Lovejoy grid MDP for a POMDP with at most 2 states per observation.

Module Contents¶

stormvogel.teaching.lovejoy.lovejoy_grid_mdp(pomdp: stormvogel.model.model.Model, initial_belief: Mapping[State, Fraction | int], k: int) → stormvogel.model.model.Model¶

Build the Lovejoy grid MDP for a POMDP with at most 2 states per observation.

Every observation class must contain at most two POMDP states. For each such class the belief simplex is 1-dimensional; a uniform grid of k + 1 points {0, 1/k, 2/k, …, 1} covers it. Successor beliefs that fall between two grid points are replaced by their linear convex interpolation, which gives an over-approximation because V* is convex.

The returned MDP has exactly the reachable grid beliefs as states. Its maximal reachability probability (under label target) is an upper bound on V*(initial_belief).

Parameters:

pomdp – A POMDP where every observation class has at most 3 states.
initial_belief – Distribution over POMDP states summing to 1. Its support must lie within a single observation class and every probability must be an exact multiple of 1/k.
k – Grid resolution. Grid points for each 2-state observation class are at {0, 1/k, 2/k, …, 1}; for each 3-state class they form a uniform triangulation of the 2-simplex at step 1/k.

Returns:

A stormvogel MDP whose optimal reachability value upper-bounds V*(initial_belief).

Raises:

ValueError – If pomdp is not a POMDP, any observation class has more than 3 states, initial_belief does not sum to 1, its support spans more than one observation class, or any probability is not an exact multiple of 1/k.