{
"cells": [
{
"cell_type": "markdown",
"id": "efd3108f",
"metadata": {},
"source": [
"# Visual value iteration and Markov Chain evolution\n",
"This notebook provides an example of using the stormvogel API for naive value iteration and Markov Chain evolution, and a way to visualize these algorithms.
"
]
},
{
"cell_type": "markdown",
"id": "e74b5666",
"metadata": {},
"source": [
"## Value iteration\n",
"This algorithm takes a model and a target state. It then calculates the probability of reaching the target state from all other states iteratively. If a choice has to be made for an action, the action that gives the best result in the previous iteration is chosen. It also takes an additional parameter epsilon, which indicates the target accuracy. A lower value of epsilon gives a more accurate result with more iterations."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "3d577da8",
"metadata": {
"execution": {
"iopub.execute_input": "2026-03-26T10:42:00.831996Z",
"iopub.status.busy": "2026-03-26T10:42:00.831752Z",
"iopub.status.idle": "2026-03-26T10:42:01.061922Z",
"shell.execute_reply": "2026-03-26T10:42:01.061404Z"
}
},
"outputs": [],
"source": [
"from stormvogel import *\n",
"\n",
"\n",
"def naive_value_iteration(\n",
" model: Model, epsilon: float, target_state: State\n",
") -> list[list[float]]:\n",
" \"\"\"Run naive value iteration. The result is a 2D list where result[n][m] is the probability to be in state m at step n.\n",
"\n",
" Args:\n",
" model (stormvogel.model.Model): Target model.\n",
" steps (int): Amount of steps.\n",
" target_state (stormvogel.model.State): Target state of the model.\n",
"\n",
" Returns:\n",
" list[list[float]]: The result is a 2D list where result[n][m] is the value of state m at iteration n.\n",
" \"\"\"\n",
" if epsilon <= 0:\n",
" raise RuntimeError(\"The algorithm will not terminate if epsilon is zero.\")\n",
"\n",
" # Map state objects to their integer index to avoid slow lookup\n",
" state_to_index = {state: i for i, state in enumerate(model.states)}\n",
" target_idx = state_to_index[target_state]\n",
" values_matrix = [[0 for _ in model.states]]\n",
" values_matrix[0][target_idx] = 1\n",
"\n",
" terminate = False\n",
" while not terminate:\n",
" old_values = values_matrix[-1]\n",
" new_values = [0 for _ in model.states]\n",
" for sid, state in enumerate(model.states):\n",
" choices = state.choices\n",
" # Now we have to take a decision for an action.\n",
" action_values = {}\n",
" for action, branch in choices:\n",
" branch_value = sum(\n",
" [prob * old_values[state_to_index[s]] for (prob, s) in branch]\n",
" )\n",
" action_values[action] = branch_value\n",
" # We take the action with the highest value.\n",
" highest_value = max(action_values.values()) if action_values else 0\n",
" new_values[sid] = highest_value\n",
" values_matrix.append(new_values)\n",
" terminate = (\n",
" sum([abs(x - y) for (x, y) in zip(new_values, old_values)]) < epsilon\n",
" )\n",
" return values_matrix"
]
},
{
"cell_type": "markdown",
"id": "827e8f2b",
"metadata": {},
"source": [
"To demonstrate the workings of this algorithm, we use the lion model again."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "2f310df7",
"metadata": {
"execution": {
"iopub.execute_input": "2026-03-26T10:42:01.064033Z",
"iopub.status.busy": "2026-03-26T10:42:01.063775Z",
"iopub.status.idle": "2026-03-26T10:42:01.284563Z",
"shell.execute_reply": "2026-03-26T10:42:01.283969Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
"