{ "cells": [ { "cell_type": "markdown", "id": "79b5e68a", "metadata": {}, "source": [ "# Simulator\n", "Once we have a transition system, we can run a simulation on it. Here we use an MDP that models a hungry lion. In each state, the lion must decide whether to 'rawr' or 'hunt' so that it avoids reaching the state 'dead'." ] }, { "cell_type": "code", "execution_count": 1, "id": "bcc8069b", "metadata": { "execution": { "iopub.execute_input": "2026-07-01T08:29:46.482296Z", "iopub.status.busy": "2026-07-01T08:29:46.481992Z", "iopub.status.idle": "2026-07-01T08:29:47.142480Z", "shell.execute_reply": "2026-07-01T08:29:47.141847Z" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", " \n", " Network\n", " \n", " \n", " \n", " \n", " \n", "
\n", " \n", " \n", " \n", " \n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from stormvogel import *\n", "import stormvogel\n", "\n", "lion = examples.create_lion_mdp()\n", "show(lion)" ] }, { "cell_type": "markdown", "id": "fca42faf", "metadata": {}, "source": [ "Now, let's run a simulation of the lion! If we do not provide a scheduling function, then the simulator just does a random walk, taking a random choice each time." ] }, { "cell_type": "code", "execution_count": 2, "id": "0589a8ed", "metadata": { "execution": { "iopub.execute_input": "2026-07-01T08:29:47.150756Z", "iopub.status.busy": "2026-07-01T08:29:47.150373Z", "iopub.status.idle": "2026-07-01T08:29:47.153887Z", "shell.execute_reply": "2026-07-01T08:29:47.153263Z" }, "lines_to_next_cell": 2 }, "outputs": [], "source": [ "path = simulate_path(lion, steps=5, seed=1234)" ] }, { "cell_type": "markdown", "id": "e43fc8d0", "metadata": { "lines_to_next_cell": 2 }, "source": [ "We could also provide a scheduling function to choose the actions ourselves. This is somewhat similar to the `bird` API." ] }, { "cell_type": "code", "execution_count": 3, "id": "adc1a38c", "metadata": { "execution": { "iopub.execute_input": "2026-07-01T08:29:47.155871Z", "iopub.status.busy": "2026-07-01T08:29:47.155702Z", "iopub.status.idle": "2026-07-01T08:29:47.158739Z", "shell.execute_reply": "2026-07-01T08:29:47.158267Z" }, "lines_to_next_cell": 0 }, "outputs": [], "source": [ "def scheduler(s: State) -> Action:\n", " return Action(\"rawr\")\n", "\n", "\n", "path2 = stormvogel.simulator.simulate_path(\n", " lion, steps=5, seed=1234, scheduler=scheduler\n", ")" ] }, { "cell_type": "markdown", "id": "8019b8d5", "metadata": {}, "source": [ "We can also use the scheduler to create a partial model. 
This model contains all the states that have been discovered by the simulation." ] }, { "cell_type": "code", "execution_count": 4, "id": "2c112ac3", "metadata": { "execution": { "iopub.execute_input": "2026-07-01T08:29:47.160498Z", "iopub.status.busy": "2026-07-01T08:29:47.160316Z", "iopub.status.idle": "2026-07-01T08:29:47.192119Z", "shell.execute_reply": "2026-07-01T08:29:47.191520Z" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", " \n", " Network\n", " \n", " \n", " \n", " \n", " \n", "
\n", " \n", " \n", " \n", " \n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "partial_model = stormvogel.simulator.simulate(\n", " lion, steps=5, scheduler=scheduler, seed=1234\n", ")\n", "show(partial_model)" ] }, { "cell_type": "markdown", "id": "cd3b05b7", "metadata": {}, "source": [ "## Gymnasium-Compliant Environment\n", "\n", "Stormvogel models can be wrapped as a [Gymnasium](https://gymnasium.farama.org/)\n", "environment via `ModelEnv`. This lets you use standard reinforcement-learning\n", "libraries directly on a stormvogel MDP or DTMC without any manual glue code.\n", "\n", "`ModelEnv` supports both **MDP** and **DTMC** models:\n", "* For an MDP the action space is `Discrete(n_actions)`, one index per named action.\n", "* For a DTMC there is no choice, so the action space is `Discrete(1)` — always pass `0`.\n", "\n", "The observation space has two modes, selected by `obs_type`:\n", "* `\"index\"` (default): a plain integer — the index of the current state.\n", "* `\"valuations\"`: a `Dict` space built from variables that have a declared domain\n", " (`IntDomain`, `BoolDomain`, or `CategoricalDomain`), one `Discrete` component\n", " per variable." 
] }, { "cell_type": "code", "execution_count": 5, "id": "cd6a3e45", "metadata": { "execution": { "iopub.execute_input": "2026-07-01T08:29:47.199946Z", "iopub.status.busy": "2026-07-01T08:29:47.199718Z", "iopub.status.idle": "2026-07-01T08:29:47.350001Z", "shell.execute_reply": "2026-07-01T08:29:47.349412Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Observation space: Discrete(5)\n", "Action space: Discrete(3)\n", "Actions: [Action('hunt >:D'), Action('rawr'), Action(None)]\n" ] } ], "source": [ "from stormvogel.gym_env import ModelEnv, ActionUnavailableError\n", "\n", "env = ModelEnv(lion)\n", "print(\"Observation space:\", env.observation_space)\n", "print(\"Action space: \", env.action_space)\n", "print(\"Actions: \", env._index_to_action)" ] }, { "cell_type": "markdown", "id": "63630939", "metadata": {}, "source": [ "The standard Gymnasium loop works as-is. `reset()` returns the initial\n", "observation and an info dict; `step(action)` returns the next observation,\n", "reward, terminated flag, truncated flag, and an info dict. The info dict\n", "always contains the raw `stormvogel.model.State` under the key `\"state\"`." 
] }, { "cell_type": "code", "execution_count": 6, "id": "971b5f76", "metadata": { "execution": { "iopub.execute_input": "2026-07-01T08:29:47.352081Z", "iopub.status.busy": "2026-07-01T08:29:47.351724Z", "iopub.status.idle": "2026-07-01T08:29:47.356613Z", "shell.execute_reply": "2026-07-01T08:29:47.355974Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Initial state index: 0 — state: State(id=e6b337ac-a0c5-40cf-a9cc-17efdfdd8530, labels=['init'])\n", "After 'hunt': obs = 0 | reward = 0.0 | terminated = False\n" ] } ], "source": [ "obs, info = env.reset(seed=42)\n", "print(\"Initial state index:\", obs, \"— state:\", info[\"state\"])\n", "\n", "hunt_idx = next(i for i, a in enumerate(env._index_to_action) if \"hunt\" in str(a))\n", "obs, reward, terminated, truncated, info = env.step(hunt_idx)\n", "print(\"After 'hunt': obs =\", obs, \"| reward =\", reward, \"| terminated =\", terminated)" ] }, { "cell_type": "markdown", "id": "671e03bb", "metadata": {}, "source": [ "Passing an action that is not available in the current state raises\n", "`ActionUnavailableError` rather than silently producing incorrect behaviour.\n", "\n", "### Variable-domain observations\n", "\n", "When all variables in the model carry a declared domain, `obs_type=\"valuations\"`\n", "gives a `Dict` observation whose keys are variable names and whose values are\n", "non-negative integers (domain encoding). This is more informative than a raw\n", "state index and compatible with structured RL policies." 
] }, { "cell_type": "code", "execution_count": 7, "id": "8157a094", "metadata": { "execution": { "iopub.execute_input": "2026-07-01T08:29:47.358399Z", "iopub.status.busy": "2026-07-01T08:29:47.358215Z", "iopub.status.idle": "2026-07-01T08:29:47.363851Z", "shell.execute_reply": "2026-07-01T08:29:47.363246Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Observation space: Dict('car_pos': Discrete(4), 'chosen_pos': Discrete(4), 'reveal_pos': Discrete(4))\n", "Initial obs (all variables at sentinel -1): {'car_pos': 0, 'chosen_pos': 0, 'reveal_pos': 0}\n" ] } ], "source": [ "from stormvogel.examples.monty_hall import create_monty_hall_mdp\n", "\n", "mh = create_monty_hall_mdp()\n", "mh_env = ModelEnv(mh, obs_type=\"valuations\")\n", "print(\"Observation space:\", mh_env.observation_space)\n", "obs, info = mh_env.reset()\n", "print(\"Initial obs (all variables at sentinel -1):\", obs)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.14" } }, "nbformat": 4, "nbformat_minor": 5 }