{ "cells": [ { "cell_type": "markdown", "id": "cf83ede0", "metadata": {}, "source": [ "# Building MDPs\n", "In Stormvogel, a **Markov Decision Process (MDP)** consists of:\n", "* states $S$,\n", "* actions $A$,\n", "* an initial state $s_0$,\n", "* a mapping from states to sets of *enabled actions*,\n", "* a successor distribution $P(s,a)$ for every state $s$ and every enabled action $a$, i.e., a set of transitions from $s$ to successor states $s'$, each annotated with the action $a$ and a probability,\n", "* state labels $L(s)$.\n", "\n", "\n", "Here we show how to construct a simple example MDP using the bird API and the model builder API.\n", "The idea is that you can choose to study (you will likely pass the exam but you have less free time) or not to study (you will have more free time but risk failing the exam)." ] }, { "cell_type": "markdown", "id": "2d540bc7", "metadata": {}, "source": [ "## The study dilemma\n", "This little MDP is supposed to help you decide whether you should study or not." ] }, { "cell_type": "markdown", "id": "c344b880", "metadata": {}, "source": [ "### Bird API\n", "For MDPs, you specify the available actions in `available_actions`. An action here is simply a string. You specify the successor distribution of each state-action pair in `delta`." ] }, { "cell_type": "code", "execution_count": 1, "id": "8a5f5bbb", "metadata": { "execution": { "iopub.execute_input": "2026-03-26T10:47:14.163371Z", "iopub.status.busy": "2026-03-26T10:47:14.163196Z", "iopub.status.idle": "2026-03-26T10:47:14.562249Z", "shell.execute_reply": "2026-03-26T10:47:14.561690Z" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", " \n", " Network\n", " \n", " \n", " \n", " \n", " \n", "
\n", " \n", " \n", " \n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from stormvogel import *\n", "\n", "\n", "def available_actions(s):\n", "    if s == \"init\":  # Either study or not\n", "        return [\"study\", \"don't study\"]\n", "    else:  # Otherwise, we have no choice (DTMC-like behavior)\n", "        return [\"\"]\n", "\n", "\n", "def delta(s, a):\n", "    if a == \"study\":\n", "        return [(9 / 10, \"pass test\"), (1 / 10, \"fail test\")]\n", "    elif a == \"don't study\":\n", "        return [(2 / 5, \"pass test\"), (3 / 5, \"fail test\")]\n", "    else:\n", "        return [(1, \"end\")]\n", "\n", "\n", "def labels(s):\n", "    return s\n", "\n", "\n", "# Rewards are returned as a dict keyed by reward model name. Use more than one key to define multiple reward models.\n", "def rewards(s: bird.State, a: bird.Action):\n", "    if s == \"pass test\":\n", "        return {\"R\": 100}\n", "    elif s == \"init\" and a == \"don't study\":\n", "        return {\"R\": 15}\n", "    else:\n", "        return {\"R\": 0}\n", "\n", "\n", "bird_study = bird.build_bird(\n", "    delta=delta,\n", "    init=\"init\",\n", "    available_actions=available_actions,\n", "    labels=labels,\n", "    modeltype=ModelType.MDP,\n", "    rewards=rewards,\n", ")\n", "vis = show(bird_study, layout=Layout(\"layouts/pinkgreen.json\"))" ] }, { "cell_type": "markdown", "id": "fd03bba4", "metadata": {}, "source": [ "### Model API" ] }, { "cell_type": "code", "execution_count": 2, "id": "6060fbd2", "metadata": { "execution": { "iopub.execute_input": "2026-03-26T10:47:14.573274Z", "iopub.status.busy": "2026-03-26T10:47:14.572960Z", "iopub.status.idle": "2026-03-26T10:47:14.604395Z", "shell.execute_reply": "2026-03-26T10:47:14.603846Z" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", " \n", " Network\n", " \n", " \n", " \n", " \n", " \n", "
\n", " \n", " \n", " \n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from stormvogel import *\n", "\n", "mdp = stormvogel.model.new_mdp()\n", "\n", "init = mdp.get_initial_state()\n", "study = mdp.action(\"study\")\n", "not_study = mdp.action(\"don't study\")\n", "\n", "pass_test = mdp.new_state(\"pass test\")\n", "fail_test = mdp.new_state(\"fail test\")\n", "end = mdp.new_state(\"end\")\n", "\n", "init.set_choice(\n", "    {\n", "        study: [(9 / 10, pass_test), (1 / 10, fail_test)],\n", "        not_study: [(4 / 10, pass_test), (6 / 10, fail_test)],\n", "    }\n", ")\n", "\n", "pass_test.set_choice([(1, end)])\n", "fail_test.set_choice([(1, end)])\n", "\n", "reward_model = mdp.new_reward_model(\"R\")\n", "reward_model.set_state_action_reward(pass_test, stormvogel.model.EmptyAction, 100)\n", "reward_model.set_state_action_reward(fail_test, stormvogel.model.EmptyAction, 0)\n", "reward_model.set_state_action_reward(init, not_study, 15)\n", "reward_model.set_state_action_reward(init, study, 0)\n", "reward_model.set_unset_rewards(0)\n", "\n", "vis2 = show(mdp, layout=Layout(\"layouts/pinkgreen.json\"))" ] }, { "cell_type": "markdown", "id": "02717240", "metadata": {}, "source": [ "## Grid model\n", "An MDP whose states are the cells of a 3x3 grid. In each cell, the chosen action determines the direction to walk." ] }, { "cell_type": "markdown", "id": "7d8d03d0", "metadata": {}, "source": [ "### Bird API" ] }, { "cell_type": "code", "execution_count": 3, "id": "b4a0a3b3", "metadata": { "execution": { "iopub.execute_input": "2026-03-26T10:47:14.612408Z", "iopub.status.busy": "2026-03-26T10:47:14.612208Z", "iopub.status.idle": "2026-03-26T10:47:14.643578Z", "shell.execute_reply": "2026-03-26T10:47:14.643009Z" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", " \n", " Network\n", " \n", " \n", " \n", " \n", " \n", "
\n", " \n", " \n", " \n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "N = 3\n", "\n", "ACTION_SEMANTICS = {\"l\": (-1, 0), \"r\": (1, 0), \"u\": (0, -1), \"d\": (0, 1)}\n", "\n", "\n", "def available_actions(s):\n", "    res = []\n", "    if s[0] > 0:\n", "        res.append(\"l\")\n", "    if s[0] < N - 1:\n", "        res.append(\"r\")\n", "    if s[1] > 0:\n", "        res.append(\"u\")\n", "    if s[1] < N - 1:\n", "        res.append(\"d\")\n", "    return res\n", "\n", "\n", "def pairwise_plus(t1, t2):\n", "    return (t1[0] + t2[0], t1[1] + t2[1])\n", "\n", "\n", "def delta(s, a):\n", "    return [(1, pairwise_plus(s, ACTION_SEMANTICS[a]))]\n", "\n", "\n", "def labels(s):\n", "    return [str(s)]\n", "\n", "\n", "m1 = bird.build_bird(\n", "    init=(0, 0), available_actions=available_actions, labels=labels, delta=delta\n", ")\n", "vis3 = show(m1)" ] }, { "cell_type": "markdown", "id": "17058487", "metadata": {}, "source": [ "### Model API" ] }, { "cell_type": "code", "execution_count": 4, "id": "8012cee4", "metadata": { "execution": { "iopub.execute_input": "2026-03-26T10:47:14.652360Z", "iopub.status.busy": "2026-03-26T10:47:14.652152Z", "iopub.status.idle": "2026-03-26T10:47:14.683757Z", "shell.execute_reply": "2026-03-26T10:47:14.683216Z" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", " \n", " Network\n", " \n", " \n", " \n", " \n", " \n", "
\n", " \n", " \n", " \n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "grid_model = stormvogel.model.new_mdp(create_initial_state=False)\n", "\n", "for x in range(N):\n", "    for y in range(N):\n", "        grid_model.new_state(f\"({x},{y})\")\n", "\n", "for x in range(N):\n", "    for y in range(N):\n", "        state_tile_label = str((x, y)).replace(\" \", \"\")\n", "        state = grid_model.get_states_with_label(state_tile_label)[0]\n", "        av = available_actions((x, y))\n", "        for a in av:\n", "            target_tile_label = str(pairwise_plus((x, y), ACTION_SEMANTICS[a])).replace(\n", "                \" \", \"\"\n", "            )\n", "            target_state = grid_model.get_states_with_label(target_tile_label)[0]\n", "            state.add_choice([(grid_model.action(a), target_state)])\n", "vis4 = show(grid_model)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.13" } }, "nbformat": 4, "nbformat_minor": 5 }