gym_csle_stopping_game.util package
Submodules
gym_csle_stopping_game.util.stopping_game_util module
- class gym_csle_stopping_game.util.stopping_game_util.StoppingGameUtil[source]
Bases: object
Class with utility functions for the StoppingGame Environment
- static attacker_actions() numpy.ndarray[Any, numpy.dtype[numpy.int64]] [source]
Gets the action space of the attacker
- Returns
the action space of the attacker
- static b1() numpy.ndarray[Any, numpy.dtype[numpy.float64]] [source]
Gets the initial belief
- Returns
the initial belief
- static bayes_filter(s_prime: int, o: int, a1: int, b: numpy.ndarray[Any, numpy.dtype[numpy.float64]], pi2: numpy.ndarray[Any, numpy.dtype[Any]], l: int, config: gym_csle_stopping_game.dao.stopping_game_config.StoppingGameConfig) float [source]
A Bayesian filter to compute the belief of player 1 of being in s_prime when observing o after taking action a1 in belief b, given that the opponent follows strategy pi2
- Parameters
s_prime – the state to compute the belief of
o – the observation
a1 – the action of player 1
b – the current belief point
pi2 – the policy of player 2
l – the number of stops remaining
config – the game config
- Returns
b_prime(s_prime)
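A minimal sketch of a direct filter call, with the environment-specific pieces assumed: config is a StoppingGameConfig constructed elsewhere, and pi2 is a hypothetical strategy matrix indexed by state and attacker action (its shape and values are illustrative assumptions):

    import numpy as np
    from gym_csle_stopping_game.util.stopping_game_util import StoppingGameUtil

    # Hypothetical attacker strategy pi2[s][a2]; shape and values are
    # assumptions for illustration only.
    pi2 = np.array([[0.9, 0.1],
                    [0.5, 0.5],
                    [0.5, 0.5]])

    b = StoppingGameUtil.b1()  # start from the initial belief

    # Belief in state s_prime=1 after observing o=5 when the defender played
    # a1=0 with one stop remaining; `config` is assumed to be a
    # StoppingGameConfig built elsewhere.
    b_prime_1 = StoppingGameUtil.bayes_filter(
        s_prime=1, o=5, a1=0, b=b, pi2=pi2, l=1, config=config)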
- static defender_actions() numpy.ndarray[Any, numpy.dtype[numpy.int64]] [source]
Gets the action space of the defender
- Returns
the action space of the defender
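The getters above can be combined to inspect the basic spaces of the game; a short sketch:

    from gym_csle_stopping_game.util.stopping_game_util import StoppingGameUtil

    A1 = StoppingGameUtil.defender_actions()  # defender action indices
    A2 = StoppingGameUtil.attacker_actions()  # attacker action indices
    b1 = StoppingGameUtil.b1()                # initial belief over the state space
    print(A1, A2, b1)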
- static next_belief(o: int, a1: int, b: numpy.ndarray[Any, numpy.dtype[numpy.float64]], pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_stopping_game.dao.stopping_game_config.StoppingGameConfig, l: int, a2: int = 0, s: int = 0) numpy.ndarray[Any, numpy.dtype[numpy.float64]] [source]
Computes the next belief using a Bayesian filter
- Parameters
o – the latest observation
a1 – the latest action of player 1
b – the current belief
pi2 – the policy of player 2
config – the game config
l – stops remaining
a2 – the attacker action (for debugging, should be consistent with pi2)
s – the true state (for debugging)
- Returns
the new belief
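A hedged sketch of one belief-update step, under the same assumptions as the bayes_filter example above (config constructed elsewhere, hypothetical pi2):

    import numpy as np
    from gym_csle_stopping_game.util.stopping_game_util import StoppingGameUtil

    pi2 = np.array([[0.9, 0.1], [0.5, 0.5], [0.5, 0.5]])  # hypothetical strategy
    b = StoppingGameUtil.b1()

    # One update after observing o=5 when the defender continued (a1=0) with
    # one stop remaining; next_belief applies the Bayesian filter above to
    # every state to produce the full belief vector.
    b = StoppingGameUtil.next_belief(o=5, a1=0, b=b, pi2=pi2, config=config, l=1)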
- static observation_space(n)[source]
Returns the observation space of size n
- Parameters
n – the maximum observation
- Returns
the observation space
- static observation_tensor(n)[source]
Gets the observation tensor
- Parameters
n – the maximum observation
- Returns
a |A1|x|A2|x|S|x|O| tensor
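A small sketch relating the two observation utilities for the same n (the exact contents of the returned space follow the implementation):

    from gym_csle_stopping_game.util.stopping_game_util import StoppingGameUtil

    n = 10
    O = StoppingGameUtil.observation_space(n)   # observation space for maximum observation n
    Z = StoppingGameUtil.observation_tensor(n)  # |A1|x|A2|x|S|x|O| probability tensor
    # The last axis of Z ranges over the observation space O.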
- static pomdp_solver_file(config: gym_csle_stopping_game.dao.stopping_game_config.StoppingGameConfig, discount_factor: float, pi2: numpy.ndarray[Any, numpy.dtype[Any]]) str [source]
Gets the POMDP environment specification based on the format at http://www.pomdp.org/code/index.html, for the defender’s local problem against a static attacker
- Parameters
config – the POMDP config
discount_factor – the discount factor
pi2 – the attacker strategy
- Returns
the file content as a string
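A hedged sketch of exporting the specification to disk; config and pi2 are assumed as in the earlier examples, and the output file name is hypothetical:

    from gym_csle_stopping_game.util.stopping_game_util import StoppingGameUtil

    # Serialize the defender's local POMDP against the static attacker
    # strategy pi2 in the format from http://www.pomdp.org/code/index.html.
    pomdp_spec = StoppingGameUtil.pomdp_solver_file(
        config=config, discount_factor=0.99, pi2=pi2)
    with open("stopping_game_defender.pomdp", "w") as f:
        f.write(pomdp_spec)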
- static reward_tensor(R_SLA: int, R_INT: int, R_COST: int, L: int, R_ST: int) numpy.ndarray[Any, numpy.dtype[Any]] [source]
Gets the reward tensor
- Parameters
R_SLA – the R_SLA constant
R_INT – the R_INT constant
R_COST – the R_COST constant
L – the maximum number of stops of the defender
R_ST – the R_ST constant
- Returns
a |L|x|A1|x|A2|x|S| tensor
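A sketch with purely illustrative reward constants (the values below are not the environment's defaults):

    from gym_csle_stopping_game.util.stopping_game_util import StoppingGameUtil

    # Hypothetical constants: SLA reward, intrusion penalty, stopping cost,
    # number of stops, and stopping reward; values are for illustration only.
    R = StoppingGameUtil.reward_tensor(R_SLA=1, R_INT=-10, R_COST=-5, L=3, R_ST=20)
    # R has shape |L|x|A1|x|A2|x|S|, i.e. indexed by (l, a1, a2, s).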
- static sample_attacker_action(pi2: numpy.ndarray[Any, numpy.dtype[Any]], s: int) int [source]
Samples the attacker action
- Parameters
pi2 – the attacker policy
s – the game state
- Returns
a2, the sampled attacker action
- static sample_initial_state(b1: numpy.ndarray[Any, numpy.dtype[numpy.float64]]) int [source]
Samples the initial state
- Parameters
b1 – the initial belief
- Returns
the sampled initial state s1
- static sample_next_observation(Z: numpy.ndarray[Any, numpy.dtype[Any]], s_prime: int, O: numpy.ndarray[Any, numpy.dtype[numpy.int64]]) int [source]
Samples the next observation
- Parameters
Z – the observation tensor, which includes the observation probabilities
s_prime – the new state
O – the observation space
- Returns
the sampled observation o
- static sample_next_state(T: numpy.ndarray[Any, numpy.dtype[Any]], l: int, s: int, a1: int, a2: int, S: numpy.ndarray[Any, numpy.dtype[numpy.int64]]) int [source]
Samples the next state
- Parameters
T – the transition operator
l – the number of stops remaining
s – the current state
a1 – the defender action
a2 – the attacker action
S – the state space
- Returns
the sampled next state s'
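The four sampling utilities above compose into a simple simulation loop. A hedged sketch, assuming config is a StoppingGameConfig whose attributes T, Z, S, and O hold the transition tensor, observation tensor, state space, and observation space (the attribute names are an assumption), and reusing the hypothetical pi2 from the earlier examples:

    import numpy as np
    from gym_csle_stopping_game.util.stopping_game_util import StoppingGameUtil

    pi2 = np.array([[0.9, 0.1], [0.5, 0.5], [0.5, 0.5]])  # hypothetical strategy

    b = StoppingGameUtil.b1()
    s = StoppingGameUtil.sample_initial_state(b1=b)
    l = 1  # stops remaining

    for t in range(10):
        a1 = 0  # the defender continues; a real defender would follow its policy
        a2 = StoppingGameUtil.sample_attacker_action(pi2=pi2, s=s)
        s = StoppingGameUtil.sample_next_state(
            T=config.T, l=l, s=s, a1=a1, a2=a2, S=config.S)
        o = StoppingGameUtil.sample_next_observation(
            Z=config.Z, s_prime=s, O=config.O)
        b = StoppingGameUtil.next_belief(
            o=o, a1=a1, b=b, pi2=pi2, config=config, l=l, a2=a2, s=s)
        # A full simulation would also check for the terminal state here.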