stormvogel.teaching.multiobjective¶
Goal unfolding and weighted multi-target reachability.
Functions¶
Build the goal-unfolding product of an MDP. |
|
Build a model for weighted multi-target reachability via goal unfolding. |
|
Compute an optimal policy for weighted multi-target reachability. |
|
|
Evaluate a policy's reachability probability for each individual target. |
|
Return states from which no state with strictly more bits set is reachable. |
|
Occupancy-measure LP for multiobjective reachability on a goal unfolding. |
Module Contents¶
- stormvogel.teaching.multiobjective.goal_unfolding(mdp: stormvogel.model.Model, goal_labels: list[str], return_state_bits: Literal[True]) tuple[stormvogel.model.Model, dict]¶
- stormvogel.teaching.multiobjective.goal_unfolding(mdp: stormvogel.model.Model, goal_labels: list[str], return_state_bits: Literal[False] = ...) stormvogel.model.Model
Build the goal-unfolding product of an MDP.
The product state (s, b) tracks which goals have been visited on the path to reach it. b[i] = True iff a state labelled goal_labels[i] has been visited (including s itself). Bits never flip back to False.
Transition semantics: (s, b) –α–> (t, delta(b, L(t))) where delta(b, L(t))[i] = b[i] OR (goal_labels[i] in L(t)).
Product state labels: L(s).
- Parameters:
return_state_bits – When True, return a pair (model, bits_map) where bits_map maps each unfolded State to its bit tuple.
- stormvogel.teaching.multiobjective.weighted_multi_target_reachability(mdp: stormvogel.model.Model, target_labels: list[str], weights: list[float]) stormvogel.model.Model¶
Build a model for weighted multi-target reachability via goal unfolding.
Each target T_i is identified by label target_labels[i] with weight weights[i]. The returned model has a single state reward model “weighted_reach” whose expected total reward equals sum_i weights[i] * P(reach T_i).
Uses goal_unfolding to track first visits and transition rewards (eliminated to state rewards via auxiliary entry states) to assign weights.
- Parameters:
mdp – The MDP.
target_labels – Labels identifying each target set T_i.
weights – Non-negative weight w_i for each T_i.
- Returns:
Transformed MDP with state reward model “weighted_reach”.
- stormvogel.teaching.multiobjective.compute_weighted_reachability_policy(mdp: stormvogel.model.Model, target_labels: list[str], weights: list[float]) stormvogel.result.Result¶
Compute an optimal policy for weighted multi-target reachability.
Reduces the problem to single-objective total reward maximization via
weighted_multi_target_reachability(), then calls stormpy to solveR{"weighted_reach"}max=? [C]and extract the scheduler.- Parameters:
mdp – The input MDP.
target_labels – Labels identifying each target set T_i.
weights – Non-negative weight w_i for each T_i.
- Returns:
Model-checking result on the transformed MDP, including the optimal scheduler in
result.scheduler.- Raises:
ImportError – If stormpy is not installed.
RuntimeError – If model checking returns no result.
- stormvogel.teaching.multiobjective.evaluate_policy_reachability(result: stormvogel.result.Result, target_labels: list[str]) list[float]¶
Evaluate a policy’s reachability probability for each individual target.
Induces a DTMC from the scheduler in result and runs one reachability query
P=? [F "label"]per entry in target_labels.- Parameters:
result – Output of
compute_weighted_reachability_policy();result.schedulermust be non-None.target_labels – Labels of the individual target sets T_i.
- Returns:
List of length
len(target_labels)where entry i isP_policy(reach T_i)at the initial state.- Raises:
ImportError – If stormpy is not installed.
RuntimeError – If the induced DTMC cannot be constructed or model checking fails for any label.
- stormvogel.teaching.multiobjective._compute_stuck_states(unfolded: stormvogel.model.Model, bits_map: dict, m: int) frozenset¶
Return states from which no state with strictly more bits set is reachable.
A state is stuck if the bit vector can never improve from it — either because all bits are already True, or because the state lies in a part of the model from which no new goal label can ever be reached.
- stormvogel.teaching.multiobjective.lp_dual_multireachprob(unfolded: stormvogel.model.Model, bits_map: dict, goal_labels: list[str], weights: list[float] | None = None, threshold: list[float] | None = None) stormvogel.teaching.lp.LP¶
Occupancy-measure LP for multiobjective reachability on a goal unfolding.
unfolded and bits_map must be the pair returned by
goal_unfolding(mdp, goal_labels, return_state_bits=True). Passing any other model raisesValueError.Variables
y_{s,a}represent expected visit counts for each free state-action pair. A state is free if progress (a new goal bit flipping) is still reachable from it; all other states are treated as absorbing and receive no flow variable.The reach expression for goal i counts only transitions
(s, a, t)wherebits_map[s][i]is False andgoal_labels[i] in t.labels, ensuring each first-visit is counted exactly once.- Parameters:
unfolded – Goal-unfolded MDP (output of
goal_unfolding()).bits_map – Mapping from each state in unfolded to its bit tuple (output of
goal_unfolding()withreturn_state_bits=True).goal_labels – Labels identifying each goal, in the same order used when calling
goal_unfolding().weights – Objective weights λ_k; objective is zero if None.
threshold – Optional lower-bound vector of length K. Adds
P(reach goal_i) ≥ threshold[i]constraints.
- Raises:
ValueError – If bits_map is empty or its tuple length does not match goal_labels.