gym_csle_apt_game.util package

Submodules

gym_csle_apt_game.util.apt_game_util module

class gym_csle_apt_game.util.apt_game_util.AptGameUtil[source]

Bases: object

Class with utility functions for the APTGame Environment

static attacker_actions() numpy.ndarray[Any, numpy.dtype[numpy.int64]][source]

Gets the action space of the attacker

Returns

the action space of the attacker

static b1(N: int) numpy.ndarray[Any, numpy.dtype[numpy.float64]][source]

Gets the initial belief

Parameters

N – the number of servers

Returns

the initial belief

static bayes_filter(s_prime: int, o: int, a1: int, b: numpy.ndarray[Any, numpy.dtype[numpy.float64]], pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig) float[source]

A Bayesian filter to compute player 1's belief of being in state s_prime when observing o after taking action a1 in belief b, given that the opponent follows strategy pi2

Parameters
  • s_prime – the state to compute the belief of

  • o – the observation

  • a1 – the action of player 1

  • b – the current belief point

  • pi2 – the policy of player 2

  • config – the game configuration

Returns

b_prime(s_prime)
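
The update follows the standard POMDP Bayes filter. A minimal sketch of the computation, assuming (as an illustration, not guaranteed by the library) that T is indexed as T[a1][a2][s][s_prime], Z as Z[s_prime][o], and that pi2[s][a2] gives the probability that the attacker plays a2 in state s:

    def bayes_filter_sketch(s_prime, o, a1, b, pi2, T, Z, S, A2):
        # Unnormalized posterior: average the transition probability into s_prime
        # over the current belief and the attacker policy, then weight by the
        # likelihood of observing o in s_prime
        num = Z[s_prime][o] * sum(b[s] * pi2[s][a2] * T[a1][a2][s][s_prime]
                                  for s in S for a2 in A2)
        # Normalizing constant: total probability of observing o under b, a1, pi2
        denom = sum(Z[sp][o] * b[s] * pi2[s][a2] * T[a1][a2][s][sp]
                    for s in S for a2 in A2 for sp in S)
        return num / denom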

static cost_function(s: int, a_1: int) float[source]

The cost function of the game

Parameters
  • s – the state

  • a_1 – the defender action

Returns

the immediate cost

static cost_tensor(N: int) numpy.ndarray[Any, numpy.dtype[Any]][source]

Gets the cost tensor

Parameters

N – the number of servers

Returns

a |A1|x|S| tensor

static defender_actions() numpy.ndarray[Any, numpy.dtype[numpy.int64]][source]

Gets the action space of the defender

Returns

the action space of the defender

static expected_cost(C: List[List[float]], b: List[float], S: List[int], a1: int) float[source]

Gets the expected cost of defender action a1 in belief state b

Parameters
  • C – the cost tensor

  • b – the belief state

  • S – the state space

  • a1 – the defender action

Returns

the expected cost
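
This is the standard belief-weighted average of the immediate cost; a minimal sketch, assuming C is indexed as C[a1][s], consistent with the |A1|x|S| shape documented above:

    def expected_cost_sketch(C, b, S, a1):
        # E[C | b, a1] = sum_s b(s) * C[a1][s]
        return sum(b[s] * C[a1][s] for s in S)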

static generate_os_posg_game_file(game_config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig) str[source]

Generates the POSG game file for HSVI

Parameters

game_config – the game configuration

Returns

a string with the contents of the config file

static generate_rewards(game_config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig) List[str][source]

Generates the reward rows of the POSG config file of HSVI

Parameters

game_config – the game configuration

Returns

list of reward rows

static generate_transitions(game_config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig) List[str][source]

Generates the transition rows of the POSG config file of HSVI

Parameters

game_config – the game configuration

Returns

list of transition rows

static next_belief(o: int, a1: int, b: numpy.ndarray[Any, numpy.dtype[numpy.float64]], pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig, a2: int = 0, s: int = 0) numpy.ndarray[Any, numpy.dtype[numpy.float64]][source]

Computes the next belief using a Bayesian filter

Parameters
  • o – the latest observation

  • a1 – the latest action of player 1

  • b – the current belief

  • pi2 – the policy of player 2

  • config – the game config

  • a2 – the attacker action (for debugging, should be consistent with pi2)

  • s – the true state (for debugging)

Returns

the new belief
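
Conceptually, this applies the Bayes filter once per state to obtain the full updated belief; a sketch reusing bayes_filter_sketch from above (same indexing assumptions):

    def next_belief_sketch(o, a1, b, pi2, T, Z, S, A2):
        # One Bayes-filter evaluation per target state yields the full belief
        b_prime = [bayes_filter_sketch(sp, o, a1, b, pi2, T, Z, S, A2) for sp in S]
        # Sanity check: the updated belief is a probability distribution over S
        assert abs(sum(b_prime) - 1.0) < 1e-9
        return b_prime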

static observation_space(num_observations: int)[source]

Returns the observation space of size num_observations

Parameters

num_observations – the number of observations

Returns

the observation space

static observation_tensor(num_observations, N: int) numpy.ndarray[Any, numpy.dtype[Any]][source]

Gets the observation tensor of the game

Parameters
  • num_observations – the number of observations

  • N – the number of servers

Returns

a |S|x|O| observation tensor

static sample_attacker_action(pi2: numpy.ndarray[Any, numpy.dtype[Any]], s: int) int[source]

Samples the attacker action

Parameters
  • pi2 – the attacker strategy

  • s – the game state

Returns

a2 (the attacker action)

static sample_defender_action(alpha: float, b: List[float]) int[source]

Samples the defender action

Parameters
  • alpha – the defender threshold

  • b – the belief state

Returns

a1 (the defender action)
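
One plausible reading of the threshold rule, purely as an illustration (the actual decision rule is defined by the library): the defender takes the active action when the belief mass on compromised states exceeds alpha, assuming state 0 means no server is compromised:

    def sample_defender_action_sketch(alpha: float, b) -> int:
        # Hypothetical threshold rule: take the active defender action (1)
        # when the probability that at least one server is compromised,
        # i.e. the belief mass outside state 0, exceeds the threshold alpha
        return 1 if sum(b[1:]) > alpha else 0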

static sample_initial_state(b1: numpy.ndarray[Any, numpy.dtype[numpy.float64]]) int[source]

Samples the initial state

Parameters

b1 – the initial belief

Returns

s1
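
Sampling reduces to a draw from the categorical distribution b1; a minimal sketch (the belief values below are illustrative only):

    import numpy as np

    b1 = np.array([0.6, 0.3, 0.1])             # example initial belief over 3 states
    s1 = int(np.random.choice(len(b1), p=b1))  # s1 ~ b1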

static sample_next_observation(Z: numpy.ndarray[Any, numpy.dtype[Any]], s_prime: int, O: numpy.ndarray[Any, numpy.dtype[numpy.int64]]) int[source]

Samples the next observation

Parameters
  • Z – the observation tensor

  • s_prime – the new state

  • O – the observation space

Returns

o
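
A sketch of the sampling step, assuming Z is indexed as Z[s_prime][o], consistent with the |S|x|O| observation tensor documented above:

    import numpy as np

    def sample_next_observation_sketch(Z, s_prime, O):
        # Draw o from the observation distribution of the new state s_prime
        return int(np.random.choice(O, p=Z[s_prime]))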

static sample_next_state(T: numpy.ndarray[Any, numpy.dtype[Any]], s: int, a1: int, a2: int, S: numpy.ndarray[Any, numpy.dtype[numpy.int64]]) int[source]

Samples the next state

Parameters
  • T – the transition operator

  • s – the current state

  • a1 – the defender action

  • a2 – the attacker action

  • S – the state space

Returns

s'
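
A sketch of the sampling step, assuming T is indexed as T[a1][a2][s][s_prime], consistent with the tensor shape documented below:

    import numpy as np

    def sample_next_state_sketch(T, s, a1, a2, S):
        # Draw s' from the transition distribution conditioned on (s, a1, a2)
        return int(np.random.choice(S, p=T[a1][a2][s]))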

static state_space(N: int)[source]

Gets the state space

Parameters

N – the number of servers

Returns

the state space of the game

static transition_function(N: int, p_a: float, s: int, s_prime: int, a_1: int, a_2: int) float[source]

The transition function of the game

Parameters
  • N – the number of servers

  • p_a – the intrusion probability

  • s – the state

  • s_prime – the next state

  • a_1 – the defender action

  • a_2 – the attacker action

Returns

f(s_prime | s, a_1, a_2)

static transition_tensor(N: int, p_a: float) numpy.ndarray[Any, numpy.dtype[Any]][source]

Gets the transition tensor

Parameters
  • N – the number of servers

  • p_a – the intrusion probability

Returns

a |A1|x|A2|x|S|x|S| tensor
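
A plausible construction of the tensor from transition_function, assuming (for illustration only) that the states count the number of compromised servers, 0..N, and that both players have binary action spaces; the actual implementation may differ:

    import numpy as np
    from gym_csle_apt_game.util.apt_game_util import AptGameUtil

    N, p_a = 3, 0.2
    S = list(range(N + 1))   # assumed state space {0, ..., N}
    A1 = A2 = [0, 1]         # assumed binary action spaces
    T = np.array([[[[AptGameUtil.transition_function(N, p_a, s, s_prime, a1, a2)
                     for s_prime in S] for s in S] for a2 in A2] for a1 in A1])
    assert T.shape == (len(A1), len(A2), len(S), len(S))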

gym_csle_apt_game.util.rollout_util module

class gym_csle_apt_game.util.rollout_util.RolloutUtil[source]

Bases: object

Class with utility functions for rollout

static eval_attacker_base(alpha: float, pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig, horizon: int, s: Optional[int], b: numpy.ndarray[Any, numpy.dtype[Any]], id: int) float[source]

Function for evaluating a base threshold strategy of the attacker

Parameters
  • alpha – the defender’s threshold

  • pi2 – the attacker’s base strategy

  • config – the game configuration

  • horizon – the horizon for the Monte-Carlo sampling

  • id – the id of the parallel processor

  • s – the state

  • b – the belief

Returns

the average return

static eval_attacker_base_parallel(alpha: float, pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig, num_samples: int, horizon: int, s: Optional[int], b: List[float]) float[source]

Starts a pool of parallel processors for evaluating a threshold base strategy of the attacker

Parameters
  • alpha – the threshold of the defender

  • pi2 – the base strategy of the attacker

  • config – the game configuration

  • num_samples – the number of monte carlo samples

  • horizon – the horizon of the Monte-Carlo sampling

  • s – the state

  • b – the belief

Returns

the average cost-to-go of the base strategy
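
The general pattern behind the *_parallel functions is to run the corresponding base-evaluation function once per Monte-Carlo sample in a worker pool and average the returns; a hedged sketch of that pattern (eval_fn and its argument tuple are placeholders, not the library's API):

    from multiprocessing import Pool

    def eval_base_parallel_sketch(eval_fn, num_samples, args):
        # Run one evaluation per sample in parallel; the sample index is passed
        # as the trailing id argument, mirroring eval_attacker_base's signature
        with Pool() as pool:
            returns = pool.starmap(eval_fn, [args + (i,) for i in range(num_samples)])
        return sum(returns) / num_samples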

static eval_defender_base(alpha: float, pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig, horizon: int, s: Optional[int], b: numpy.ndarray[Any, numpy.dtype[Any]], id: int) float[source]

Function for evaluating a base threshold strategy of the defender

Parameters
  • alpha – the defender’s threshold

  • pi2 – the attacker’s strategy

  • config – the game configuration

  • horizon – the horizon for the Monte-Carlo sampling

  • id – the id of the parallel processor

  • s – the state

  • b – the belief

Returns

the average return

static eval_defender_base_parallel(alpha: float, pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig, num_samples: int, horizon: int, s: Union[None, int], b: List[float]) float[source]

Starts a pool of parallel processors for evaluating a threshold base strategy of the defender

Parameters
  • alpha – the threshold

  • pi2 – the attacker strategy

  • config – the game configuration

  • num_samples – the number of monte carlo samples

  • horizon – the horizon of the Monte-Carlo sampling

  • s – the state

  • b – the belief

Returns

the average cost-to-go of the base strategy

static exact_defender_rollout(alpha: float, pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig, num_samples: int, horizon: int, ell: int, b: List[float]) Tuple[int, float][source]

Performs exact rollout of the defender against a fixed attacker strategy and with a threshold base strategy

Parameters
  • alpha – the threshold base strategy

  • pi2 – the strategy of the attacker

  • config – the game configuration

  • num_samples – the number of Monte-Carlo samples

  • horizon – the horizon for the Monte-Carlo sampling

  • ell – the lookahead length

  • b – the belief state

Returns

The rollout action and the corresponding Q-factor

static monte_carlo_attacker_rollout(alpha: float, pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig, num_samples: int, horizon: int, ell: int, b: List[float], a1: Union[None, int] = None, s: Union[None, int] = None) Tuple[int, float][source]

Monte-Carlo-based rollout of the attacker with a threshold base strategy

Parameters
  • alpha – the threshold of the defender

  • pi2 – the base strategy of the attacker

  • config – the game configuration

  • num_samples – the number of monte-carlo samples

  • horizon – the horizon for monte-carlo sampling

  • ell – the lookahead length

  • b – the belief state

  • a1 – the action of the defender

  • s – the state

Returns

The rollout action and the corresponding Q-factor

static monte_carlo_defender_rollout(alpha: float, pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_apt_game.dao.apt_game_config.AptGameConfig, num_samples: int, horizon: int, ell: int, b: List[float], a2: Union[None, int] = None, s: Union[None, int] = None) Tuple[int, float][source]

Monte-Carlo-based rollout of the defender with a threshold base strategy

Parameters
  • alpha – the threshold of the base strategy

  • pi2 – the attacker strategy

  • config – the game configuration

  • num_samples – the number of monte-carlo samples

  • horizon – the horizon for monte-carlo sampling

  • ell – the lookahead length

  • b – the belief state

  • a2 – the action of the attacker

  • s – the state

Returns

The rollout action and the corresponding Q-factor
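
All of the rollout functions above share the same structure: estimate a Q-factor for each candidate first action by Monte-Carlo simulation under the base strategy, then return the greedy action together with its Q-factor. A generic sketch of that pattern (simulate_cost_to_go is a hypothetical callback, not part of the library):

    def rollout_sketch(candidate_actions, simulate_cost_to_go, num_samples):
        # For each candidate first action, average the sampled cost-to-go of
        # playing it and then following the base strategy; return the minimizer
        best_a, best_q = None, float("inf")
        for a in candidate_actions:
            q = sum(simulate_cost_to_go(a) for _ in range(num_samples)) / num_samples
            if q < best_q:
                best_a, best_q = a, q
        return best_a, best_q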

Module contents