stormvogel.teaching.multiobjective¶

Goal unfolding and weighted multi-target reachability.

Functions¶

`goal_unfolding`(…)	Build the goal-unfolding product of an MDP.
`weighted_multi_target_reachability`(...)	Build a model for weighted multi-target reachability via goal unfolding.
`compute_weighted_reachability_policy`(...)	Compute an optimal policy for weighted multi-target reachability.
`evaluate_policy_reachability`(→ list[float])	Evaluate a policy's reachability probability for each individual target.
`_compute_stuck_states`(→ frozenset)	Return states from which no state with strictly more bits set is reachable.
`lp_dual_multireachprob`(→ stormvogel.teaching.lp.LP)	Occupancy-measure LP for multiobjective reachability on a goal unfolding.

Module Contents¶

stormvogel.teaching.multiobjective.goal_unfolding(mdp: stormvogel.model.Model, goal_labels: list[str], return_state_bits: Literal[True]) → tuple[stormvogel.model.Model, dict]¶

stormvogel.teaching.multiobjective.goal_unfolding(mdp: stormvogel.model.Model, goal_labels: list[str], return_state_bits: Literal[False] = ...) → stormvogel.model.Model

Build the goal-unfolding product of an MDP.

The product state (s, b) tracks which goals have been visited on the path to reach it. b[i] = True iff a state labelled goal_labels[i] has been visited (including s itself). Bits never flip back to False.

Transition semantics: (s, b) –α–> (t, delta(b, L(t))) where delta(b, L(t))[i] = b[i] OR (goal_labels[i] in L(t)).

Product state labels: L(s).

Parameters:: return_state_bits – When True, return a pair (model, bits_map) where bits_map maps each unfolded State to its bit tuple.

stormvogel.teaching.multiobjective.weighted_multi_target_reachability(mdp: stormvogel.model.Model, target_labels: list[str], weights: list[float]) → stormvogel.model.Model¶

Build a model for weighted multi-target reachability via goal unfolding.

Each target T_i is identified by label target_labels[i] with weight weights[i]. The returned model has a single state reward model “weighted_reach” whose expected total reward equals sum_i weights[i] * P(reach T_i).

Uses goal_unfolding to track first visits and transition rewards (eliminated to state rewards via auxiliary entry states) to assign weights.

Parameters:

mdp – The MDP.
target_labels – Labels identifying each target set T_i.
weights – Non-negative weight w_i for each T_i.

Returns:

Transformed MDP with state reward model “weighted_reach”.

stormvogel.teaching.multiobjective.compute_weighted_reachability_policy(mdp: stormvogel.model.Model, target_labels: list[str], weights: list[float]) → stormvogel.result.Result¶

Compute an optimal policy for weighted multi-target reachability.

Reduces the problem to single-objective total reward maximization via weighted_multi_target_reachability(), then calls stormpy to solve R{"weighted_reach"}max=? [C] and extract the scheduler.

Parameters:

mdp – The input MDP.
target_labels – Labels identifying each target set T_i.
weights – Non-negative weight w_i for each T_i.

Returns:

Model-checking result on the transformed MDP, including the optimal scheduler in result.scheduler.

Raises:

ImportError – If stormpy is not installed.
RuntimeError – If model checking returns no result.

stormvogel.teaching.multiobjective.evaluate_policy_reachability(result: stormvogel.result.Result, target_labels: list[str]) → list[float]¶

Evaluate a policy’s reachability probability for each individual target.

Induces a DTMC from the scheduler in result and runs one reachability query P=? [F "label"] per entry in target_labels.

Parameters:

result – Output of compute_weighted_reachability_policy(); result.scheduler must be non-None.
target_labels – Labels of the individual target sets T_i.

Returns:

List of length len(target_labels) where entry i is P_policy(reach T_i) at the initial state.

Raises:

ImportError – If stormpy is not installed.
RuntimeError – If the induced DTMC cannot be constructed or model checking fails for any label.

stormvogel.teaching.multiobjective._compute_stuck_states(unfolded: stormvogel.model.Model, bits_map: dict, m: int) → frozenset¶

Return states from which no state with strictly more bits set is reachable.

A state is stuck if the bit vector can never improve from it — either because all bits are already True, or because the state lies in a part of the model from which no new goal label can ever be reached.

stormvogel.teaching.multiobjective.lp_dual_multireachprob(unfolded: stormvogel.model.Model, bits_map: dict, goal_labels: list[str], weights: list[float] | None = None, threshold: list[float] | None = None) → stormvogel.teaching.lp.LP¶

Occupancy-measure LP for multiobjective reachability on a goal unfolding.

unfolded and bits_map must be the pair returned by goal_unfolding(mdp, goal_labels, return_state_bits=True). Passing any other model raises ValueError.

Variables y_{s,a} represent expected visit counts for each free state-action pair. A state is free if progress (a new goal bit flipping) is still reachable from it; all other states are treated as absorbing and receive no flow variable.

The reach expression for goal i counts only transitions (s, a, t) where bits_map[s][i] is False and goal_labels[i] in t.labels, ensuring each first-visit is counted exactly once.

Parameters:

unfolded – Goal-unfolded MDP (output of goal_unfolding()).
bits_map – Mapping from each state in unfolded to its bit tuple (output of goal_unfolding() with return_state_bits=True).
goal_labels – Labels identifying each goal, in the same order used when calling goal_unfolding().
weights – Objective weights λ_k; objective is zero if None.
threshold – Optional lower-bound vector of length K. Adds P(reach goal_i) ≥ threshold[i] constraints.

Raises:

ValueError – If bits_map is empty or its tuple length does not match goal_labels.