{
"cells": [
{
"cell_type": "markdown",
"id": "0a7e6243",
"metadata": {},
"source": [
"# Visual value iteration and Markov Chain evolution\n",
"This notebook provides an example of using the stormvogel API for naive value iteration and Markov Chain evolution, and a way to visualize these algorithms.
"
]
},
{
"cell_type": "markdown",
"id": "60e04cbe",
"metadata": {},
"source": [
"## Value iteration\n",
"This algorithm takes a model and a target state. It then calculates the probability of reaching the target state from all other states iteratively. If a choice has to be made for an action, the action that gives the best result in the previous iteration is chosen. It also takes an additional parameter epsilon, which indicates the target accuracy. A lower value of epsilon gives a more accurate result with more iterations."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "98a0b860",
"metadata": {
"execution": {
"iopub.execute_input": "2026-03-26T10:47:35.686559Z",
"iopub.status.busy": "2026-03-26T10:47:35.686331Z",
"iopub.status.idle": "2026-03-26T10:47:35.882645Z",
"shell.execute_reply": "2026-03-26T10:47:35.882155Z"
}
},
"outputs": [],
"source": [
"from stormvogel import *\n",
"\n",
"\n",
"def naive_value_iteration(\n",
" model: Model, epsilon: float, target_state: State\n",
") -> list[list[float]]:\n",
" \"\"\"Run naive value iteration. The result is a 2D list where result[n][m] is the probability to be in state m at step n.\n",
"\n",
" Args:\n",
" model (stormvogel.model.Model): Target model.\n",
" steps (int): Amount of steps.\n",
" target_state (stormvogel.model.State): Target state of the model.\n",
"\n",
" Returns:\n",
" list[list[float]]: The result is a 2D list where result[n][m] is the value of state m at iteration n.\n",
" \"\"\"\n",
" if epsilon <= 0:\n",
" RuntimeError(\"The algorithm will not terminate if epsilon is zero.\")\n",
"\n",
" # Create a dynamic matrix (list of lists) to store the result.\n",
" values_matrix = [[0 for state in model.get_states()]]\n",
" values_matrix[0][target_state.id] = 1\n",
"\n",
" terminate = False\n",
" while not terminate:\n",
" old_values = values_matrix[len(values_matrix) - 1]\n",
" new_values = [None for state in model.get_states()]\n",
" for sid, state in model.states.items():\n",
" choice = model.get_choice(state)\n",
" # Now we have to take a decision for an action.\n",
" actions = choice.choice.keys()\n",
" action_values = {}\n",
" for action, branch in choice.choice.items():\n",
" branch_value = sum(\n",
" [prob * old_values[state.id] for (prob, state) in branch.branch]\n",
" )\n",
" action_values[action] = branch_value\n",
" # We take the action with the highest value.\n",
" highest_value = max(action_values.values())\n",
" new_values[sid] = highest_value\n",
" values_matrix.append(new_values)\n",
" terminate = (\n",
" sum([abs(x - y) for (x, y) in zip(new_values, old_values)]) < epsilon\n",
" )\n",
" return values_matrix"
]
},
{
"cell_type": "markdown",
"id": "1df5d0d0",
"metadata": {},
"source": [
"To demonstrate the workings of this algorithm, we use the lion model again."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "412dabda",
"metadata": {
"execution": {
"iopub.execute_input": "2026-03-26T10:47:35.884707Z",
"iopub.status.busy": "2026-03-26T10:47:35.884480Z",
"iopub.status.idle": "2026-03-26T10:47:36.088405Z",
"shell.execute_reply": "2026-03-26T10:47:36.087830Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
"