gym_csle_apt_game.util package
Submodules
gym_csle_apt_game.util.apt_game_util module
- class gym_csle_apt_game.util.apt_game_util.AptGameUtil[source]
Bases:
object
Class with utility functions for the APT game environment
- static attacker_actions() numpy.ndarray[Any, numpy.dtype[numpy.int64]] [source]
Gets the action space of the attacker
- Returns
the action space of the attacker
- static b1(N: int) numpy.ndarray[Any, numpy.dtype[numpy.float64]] [source]
Gets the initial belief
- Parameters
N – the number of servers
- Returns
the initial belief
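A minimal sketch of one plausible form of this prior, assuming the game starts with no compromised servers so that the belief is a point mass on s = 0 over the N + 1 states 0, ..., N (an assumption; the actual prior returned by b1 may differ):

    import numpy as np

    def b1_sketch(N: int) -> np.ndarray:
        # Point mass on state 0 (no server compromised); states are 0, ..., N.
        b = np.zeros(N + 1)
        b[0] = 1.0
        return b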
- static bayes_filter(s_prime: int, o: int, a1: int, b: numpy.ndarray[Any, numpy.dtype[numpy.float64]], pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig) float [source]
A Bayesian filter that computes the belief of player 1 of being in s_prime when observing o after taking action a1 in belief b, given that the opponent follows strategy pi2
- Parameters
s_prime – the state to compute the belief of
o – the observation
a1 – the action of player 1
b – the current belief point
pi2 – the policy of player 2
config – the game configuration
- Returns
b_prime(s_prime)
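A minimal sketch of the standard POMDP update that bayes_filter computes, assuming hypothetical tensor layouts Z[s'][o] for the observation tensor and T[a1][a2][s][s'] for the transition tensor (the real layouts live in AptGameConfig):

    import numpy as np

    def bayes_filter_sketch(s_prime: int, o: int, a1: int, b: np.ndarray,
                            pi2: np.ndarray, T: np.ndarray, Z: np.ndarray) -> float:
        S = range(len(b))
        A2 = range(pi2.shape[1])
        # Unnormalized posterior mass on s_prime given observation o.
        num = Z[s_prime][o] * sum(b[s] * pi2[s][a2] * T[a1][a2][s][s_prime]
                                  for s in S for a2 in A2)
        # Normalizing constant: the same expression summed over all next states.
        den = sum(Z[sp][o] * b[s] * pi2[s][a2] * T[a1][a2][s][sp]
                  for sp in S for s in S for a2 in A2)
        return num / den if den > 0 else 0.0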
- static cost_function(s: int, a_1: int) float [source]
The cost function of the game
- Parameters
s – the state
a_1 – the defender action
- Returns
the immediate cost
- static cost_tensor(N: int) numpy.ndarray[Any, numpy.dtype[Any]] [source]
Gets the cost tensor of the game
- Parameters
N – the number of servers
- Returns
a |A1|x|S| tensor
- static defender_actions() numpy.ndarray[Any, numpy.dtype[numpy.int64]] [source]
Gets the action space of the defender
- Returns
the action space of the defender
- static expected_cost(C: List[List[float]], b: List[float], S: List[int], a1: int) float [source]
Gets the expected cost of defender action a1 in belief state b
- Parameters
C – the cost tensor
b – the belief state
S – the state space
a1 – the defender action
- Returns
the expected cost
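A minimal sketch of this computation, assuming C is indexed as C[a1][s], consistent with the |A1|x|S| tensor returned by cost_tensor:

    from typing import List

    def expected_cost_sketch(C: List[List[float]], b: List[float],
                             S: List[int], a1: int) -> float:
        # Belief-weighted average of the immediate cost of a1 over the states.
        return sum(b[s] * C[a1][s] for s in S)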
- static generate_os_posg_game_file(game_config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig) str [source]
Generates the POSG game file for HSVI
- Parameters
game_config – the game configuration
- Returns
a string with the contents of the config file
- static generate_rewards(game_config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig) List[str] [source]
Generates the reward rows of the POSG config file of HSVI
- Parameters
game_config – the game configuration
- Returns
list of reward rows
- static generate_transitions(game_config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig) List[str] [source]
Generates the transition rows of the POSG config file of HSVI
- Parameters
game_config – the game configuration
- Returns
list of transition rows
- static next_belief(o: int, a1: int, b: numpy.ndarray[Any, numpy.dtype[numpy.float64]], pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig, a2: int = 0, s: int = 0) numpy.ndarray[Any, numpy.dtype[numpy.float64]] [source]
Computes the next belief using a Bayesian filter
- Parameters
o – the latest observation
a1 – the latest action of player 1
b – the current belief
pi2 – the policy of player 2
config – the game config
a2 – the attacker action (for debugging, should be consistent with pi2)
s – the true state (for debugging)
- Returns
the new belief
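Conceptually, next_belief evaluates the filter above once per candidate state; a minimal sketch reusing the hypothetical bayes_filter_sketch defined earlier:

    import numpy as np

    def next_belief_sketch(o: int, a1: int, b: np.ndarray, pi2: np.ndarray,
                           T: np.ndarray, Z: np.ndarray) -> np.ndarray:
        # One Bayes-filter evaluation per candidate next state s'.
        b_prime = np.array([bayes_filter_sketch(sp, o, a1, b, pi2, T, Z)
                            for sp in range(len(b))])
        # The updated belief must remain a probability distribution.
        assert abs(float(b_prime.sum()) - 1.0) < 1e-6
        return b_prime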
- static observation_space(num_observations: int)[source]
Returns the observation space of size num_observations
- Parameters
num_observations – the number of observations
- Returns
the observation space
- static observation_tensor(num_observations, N: int) numpy.ndarray[Any, numpy.dtype[Any]] [source]
Gets the observation tensor of the game
- Parameters
num_observations – the number of observations
N – the number of servers
- Returns
a |S|x|O| observation tensor
- static sample_attacker_action(pi2: numpy.ndarray[Any, numpy.dtype[Any]], s: int) int [source]
Samples the attacker action
- Parameters
pi2 – the attacker strategy
s – the game state
- Returns
a2 (the attacker action)
- static sample_defender_action(alpha: float, b: List[float]) int [source]
Samples the defender action
- Parameters
alpha – the defender threshold
b – the belief state
- Returns
a1 (the defender action)
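One plausible reading of a threshold strategy over beliefs, purely as an illustrative guess (the actual rule used by sample_defender_action may differ): the defender takes the active action when the belief-weighted expected state crosses alpha.

    from typing import List

    def sample_defender_action_sketch(alpha: float, b: List[float]) -> int:
        # Expected state under the belief; in this game, state s counts
        # compromised servers, so this is an illustrative intrusion estimate.
        expected_s = sum(s * prob for s, prob in enumerate(b))
        return 1 if expected_s >= alpha else 0  # 1 = act, 0 = wait (assumed coding)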
- static sample_initial_state(b1: numpy.ndarray[Any, numpy.dtype[numpy.float64]]) int [source]
Samples the initial state
- Parameters
b1 – the initial belief
- Returns
s1
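A minimal sketch, assuming the state is drawn from the categorical distribution defined by b1:

    import numpy as np

    def sample_initial_state_sketch(b1: np.ndarray) -> int:
        # Draw s1 ~ b1 over the states 0, ..., len(b1) - 1.
        return int(np.random.choice(len(b1), p=b1))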
- static sample_next_observation(Z: numpy.ndarray[Any, numpy.dtype[Any]], s_prime: int, O: numpy.ndarray[Any, numpy.dtype[numpy.int64]]) int [source]
Samples the next observation
- Parameters
Z – the observation tensor
s_prime – the new state
O – the observation space
- Returns
o
- static sample_next_state(T: numpy.ndarray[Any, numpy.dtype[Any]], s: int, a1: int, a2: int, S: numpy.ndarray[Any, numpy.dtype[numpy.int64]]) int [source]
Samples the next state
- Parameters
T – the transition operator
s – the current state
a1 – the defender action
a2 – the attacker action
S – the state space
- Returns
s'
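A minimal sketch, assuming the row T[a1][a2][s] holds the distribution over next states (a hypothetical index order):

    import numpy as np

    def sample_next_state_sketch(T: np.ndarray, s: int, a1: int, a2: int,
                                 S: np.ndarray) -> int:
        # Draw s' from the transition distribution of (s, a1, a2).
        return int(np.random.choice(S, p=T[a1][a2][s]))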
- static state_space(N: int)[source]
Gets the state space
- Parameters
N – the number of servers
- Returns
the state space of the game
- static transition_function(N: int, p_a: float, s: int, s_prime: int, a_1: int, a_2: int) float [source]
The transition function of the game
- Parameters
N – the number of servers
p_a – the intrusion probability
s – the state
s_prime – the next state
a_1 – the defender action
a_2 – the attacker action
- Returns
f(s_prime | s, a_1, a_2)
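The full transition tensor can be assembled by evaluating this function over all index combinations; a minimal sketch using the static method itself, assuming binary action spaces for both players and the states 0, ..., N (assumptions, not confirmed by the docs):

    import numpy as np
    from gym_csle_apt_game.util.apt_game_util import AptGameUtil

    def build_transition_tensor(N: int, p_a: float) -> np.ndarray:
        # T[a1][a2][s][s'] = f(s' | s, a1, a2); binary actions assumed.
        T = np.zeros((2, 2, N + 1, N + 1))
        for a1 in range(2):
            for a2 in range(2):
                for s in range(N + 1):
                    for s_prime in range(N + 1):
                        T[a1][a2][s][s_prime] = AptGameUtil.transition_function(
                            N=N, p_a=p_a, s=s, s_prime=s_prime, a_1=a1, a_2=a2)
        return T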
gym_csle_apt_game.util.rollout_util module
- class gym_csle_apt_game.util.rollout_util.RolloutUtil[source]
Bases:
object
Class with utility functions for rollout
- static eval_attacker_base(alpha: float, pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig, horizon: int, s: Optional[int], b: numpy.ndarray[Any, numpy.dtype[Any]], id: int) float [source]
Function for evaluating a base threshold strategy of the attacker
- Parameters
alpha – the defender’s threshold
pi2 – the attacker’s base strategy
config – the game configuration
horizon – the horizon for the Monte-Carlo sampling
s – the state
b – the belief
id – the id of the parallel processor
- Returns
the average return
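A minimal sketch of the episode loop such an evaluation performs, assuming undiscounted costs, the utility methods documented above, and tensors T, Z, C with the index layouts assumed earlier (the belief update is elided because it needs the full game config):

    import numpy as np
    from gym_csle_apt_game.util.apt_game_util import AptGameUtil

    def eval_base_episode_sketch(alpha, pi2, T, Z, C, O, b1, horizon) -> float:
        # One Monte-Carlo episode under the defender threshold strategy
        # and the attacker base strategy pi2.
        s = AptGameUtil.sample_initial_state(b1=np.array(b1))
        total = 0.0
        for _ in range(horizon):
            a1 = AptGameUtil.sample_defender_action(alpha=alpha, b=list(b1))
            a2 = AptGameUtil.sample_attacker_action(pi2=pi2, s=s)
            total += C[a1][s]
            s = AptGameUtil.sample_next_state(T=T, s=s, a1=a1, a2=a2,
                                              S=np.arange(T.shape[2]))
            o = AptGameUtil.sample_next_observation(Z=Z, s_prime=s, O=O)
            # AptGameUtil.next_belief(o, a1, b, pi2, config) would update the
            # belief here; omitted since it requires an AptGameConfig.
        return total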
- static eval_attacker_base_parallel(alpha: float, pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig, num_samples: int, horizon: int, s: Optional[int], b: List[float]) float [source]
Starts a pool of parallel processors for evaluating a threshold base strategy of the attacker
- Parameters
alpha – the threshold of the defender
pi2 – the base strategy of the attacker
config – the game configuration
num_samples – the number of Monte-Carlo samples
horizon – the horizon of the Monte-Carlo sampling
s – the state
b – the belief
- Returns
the average cost-to-go of the base strategy
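A minimal sketch of the parallel pattern, assuming each worker evaluates one independent episode and the results are averaged (the real method's batching may differ):

    import multiprocessing as mp
    from typing import Callable, List, Tuple

    def eval_parallel_sketch(eval_fn: Callable[..., float],
                             args_list: List[Tuple]) -> float:
        # Fan independent Monte-Carlo evaluations out over a process pool
        # and average the returned costs-to-go.
        with mp.Pool() as pool:
            results = pool.starmap(eval_fn, args_list)
        return sum(results) / len(results)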
- static eval_defender_base(alpha: float, pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig, horizon: int, s: Optional[int], b: numpy.ndarray[Any, numpy.dtype[Any]], id: int) float [source]
Function for evaluating a base threshold strategy of the defender
- Parameters
alpha – the defender’s threshold
pi2 – the attacker’s strategy
config – the game configuration
horizon – the horizon for the Monte-Carlo sampling
s – the state
b – the belief
id – the id of the parallel processor
- Returns
the average return
- static eval_defender_base_parallel(alpha: float, pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig, num_samples: int, horizon: int, s: Union[None, int], b: List[float]) float [source]
Starts a pool of parallel processors for evaluating a threshold base strategy of the defender
- Parameters
alpha – the threshold
pi2 – the attacker strategy
config – the game configuration
num_samples – the number of Monte-Carlo samples
horizon – the horizon of the Monte-Carlo sampling
s – the state
b – the belief
- Returns
the average cost-to-go of the base strategy
- static exact_defender_rollout(alpha: float, pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig, num_samples: int, horizon: int, ell: int, b: List[float]) Tuple[int, float] [source]
Performs exact rollout of the defender against a fixed attacker strategy and with a threshold base strategy
- Parameters
alpha – the threshold of the base strategy
pi2 – the strategy of the attacker
config – the game configuration
num_samples – the number of Monte-Carlo samples
horizon – the horizon for the Monte-Carlo sampling
ell – the lookahead length
b – the belief state
- Returns
The rollout action and the corresponding Q-factor
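With lookahead ell = 1, rollout reduces to minimizing the one-step Q-factor: the expected immediate cost of a1 plus the evaluated cost-to-go of the base strategy from the resulting belief. A minimal sketch with a hypothetical base_cost_to_go callback:

    from typing import Callable, List, Tuple

    def one_step_rollout_sketch(A1: List[int], C, b, S,
                                base_cost_to_go: Callable[[int], float]
                                ) -> Tuple[int, float]:
        # Q(b, a1) = expected immediate cost + base-strategy cost-to-go.
        q = {a1: sum(b[s] * C[a1][s] for s in S) + base_cost_to_go(a1)
             for a1 in A1}
        a1_star = min(q, key=q.get)
        return a1_star, q[a1_star]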
- static monte_carlo_attacker_rollout(alpha: float, pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig, num_samples: int, horizon: int, ell: int, b: List[float], a1: Union[None, int] = None, s: Union[None, int] = None) Tuple[int, float] [source]
Monte-Carlo-based rollout of the attacker with a threshold base strategy
- Parameters
alpha – the threshold of the defender
pi2 – the base strategy of the attacker
config – the game configuration
num_samples – the number of Monte-Carlo samples
horizon – the horizon for the Monte-Carlo sampling
ell – the lookahead length
b – the belief state
a1 – the action of the defender
s – the state
- Returns
The rollout action and the corresponding Q-factor
- static monte_carlo_defender_rollout(alpha: float, pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig, num_samples: int, horizon: int, ell: int, b: List[float], a2: Union[None, int] = None, s: Union[None, int] = None) Tuple[int, float] [source]
Monte-Carlo-based rollout of the defender with a threshold base strategy
- Parameters
alpha – the threshold of the base strategy
pi2 – the attacker strategy
config – the game configuration
num_samples – the number of Monte-Carlo samples
horizon – the horizon for the Monte-Carlo sampling
ell – the lookahead length
b – the belief state
a2 – the action of the attacker
s – the state
- Returns
The rollout action and the corresponding Q-factor
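The same greedy pattern underlies the Monte-Carlo variants: estimate each Q-factor by averaging sampled episode costs, then act greedily. A minimal sketch with a hypothetical simulate_cost_to_go callback:

    import numpy as np
    from typing import Callable, List, Tuple

    def mc_rollout_sketch(A1: List[int], num_samples: int,
                          simulate_cost_to_go: Callable[[int], float]
                          ) -> Tuple[int, float]:
        # Estimate Q(b, a1) by Monte-Carlo averaging, then act greedily.
        q = {a1: float(np.mean([simulate_cost_to_go(a1)
                                for _ in range(num_samples)]))
             for a1 in A1}
        a1_star = min(q, key=q.get)
        return a1_star, q[a1_star]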