gym_csle_stopping_game.util package

Submodules

gym_csle_stopping_game.util.stopping_game_util module

class gym_csle_stopping_game.util.stopping_game_util.StoppingGameUtil[source]

Bases: object

Class with utility functions for the StoppingGame Environment

static attacker_actions() numpy.ndarray[Any, numpy.dtype[numpy.int64]][source]

Gets the action space of the attacker

Returns

the action space of the attacker

static b1() numpy.ndarray[Any, numpy.dtype[numpy.float64]][source]

Gets the initial belief

Returns

the initial belief

static bayes_filter(s_prime: int, o: int, a1: int, b: numpy.ndarray[Any, numpy.dtype[numpy.float64]], pi2: numpy.ndarray[Any, numpy.dtype[Any]], l: int, config: gym_csle_stopping_game.dao.stopping_game_config.StoppingGameConfig) float[source]

A Bayesian filter that computes player 1's belief of being in s_prime when observing o after taking action a1 in belief b, given that the opponent follows strategy pi2

Parameters
  • s_prime – the state to compute the belief of

  • o – the observation

  • a1 – the action of player 1

  • b – the current belief point

  • pi2 – the policy of player 2

  • l – stops remaining

  • config – the game config

Returns

b_prime(s_prime)
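The update can be sketched with numpy as follows. The helper name, the toy tensor shapes, and passing T and Z explicitly (rather than through the config object) are assumptions for illustration, not the library's actual implementation:

```python
import numpy as np

def bayes_filter_sketch(s_prime, o, a1, b, pi2, T, Z, l):
    """Illustrative belief update: probability of s_prime given observation o
    after defender action a1 in belief b, with attacker strategy pi2.
    T is a |L|x|A1|x|A2|x|S|x|S| transition tensor and Z a |A1|x|A2|x|S|x|O|
    observation tensor (shapes per this module's docs)."""
    S, A2 = len(b), pi2.shape[1]
    # Unnormalized probability of reaching s_prime and observing o
    num = sum(b[s] * pi2[s, a2] * T[l, a1, a2, s, s_prime] * Z[a1, a2, s_prime, o]
              for s in range(S) for a2 in range(A2))
    # Normalizer: same quantity summed over all possible next states
    den = sum(b[s] * pi2[s, a2] * T[l, a1, a2, s, sp] * Z[a1, a2, sp, o]
              for s in range(S) for a2 in range(A2) for sp in range(S))
    return num / den if den > 0 else 0.0
```

Because the denominator sums the numerator over all next states, the components b_prime(s_prime) form a probability distribution over the state space.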

static defender_actions() numpy.ndarray[Any, numpy.dtype[numpy.int64]][source]

Gets the action space of the defender

Returns

the action space of the defender

static next_belief(o: int, a1: int, b: numpy.ndarray[Any, numpy.dtype[numpy.float64]], pi2: numpy.ndarray[Any, numpy.dtype[Any]], config: gym_csle_stopping_game.dao.stopping_game_config.StoppingGameConfig, l: int, a2: int = 0, s: int = 0) numpy.ndarray[Any, numpy.dtype[numpy.float64]][source]

Computes the next belief using a Bayesian filter

Parameters
  • o – the latest observation

  • a1 – the latest action of player 1

  • b – the current belief

  • pi2 – the policy of player 2

  • config – the game config

  • l – stops remaining

  • a2 – the attacker action (for debugging, should be consistent with pi2)

  • s – the true state (for debugging)

Returns

the new belief

static observation_space(n)[source]

Returns the observation space of size n

Parameters

n – the maximum observation

Returns

the observation space

static observation_tensor(n)[source]

Gets the observation tensor

Parameters

n – the maximum observation

Returns

a |A1|x|A2|x|S|x|O| tensor

static pomdp_solver_file(config: gym_csle_stopping_game.dao.stopping_game_config.StoppingGameConfig, discount_factor: float, pi2: numpy.ndarray[Any, numpy.dtype[Any]]) str[source]

Gets the POMDP environment specification based on the format at http://www.pomdp.org/code/index.html, for the defender’s local problem against a static attacker

Parameters
  • config – the POMDP config

  • discount_factor – the discount factor

  • pi2 – the attacker strategy

Returns

the file content as a string
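For orientation, the pomdp.org specification is a plain-text format: a header declaring the discount factor and the sizes of the state, action, and observation spaces, followed by `T: <action> : <start-state> : <end-state> <probability>`, `O: <action> : <end-state> : <observation> <probability>`, and `R: <action> : <start-state> : <end-state> : <observation> <reward>` entries. A toy fragment in that format (not this function's actual output) looks like:

```
discount: 0.95
values: reward
states: 3
actions: 2
observations: 2

T: 0 : 0 : 1 0.2
O: 0 : 1 : 0 0.8
R: 1 : 0 : * : * 10.0
```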

static reward_tensor(R_SLA: int, R_INT: int, R_COST: int, L: int, R_ST: int) numpy.ndarray[Any, numpy.dtype[Any]][source]

Gets the reward tensor

Parameters
  • R_SLA – the R_SLA constant

  • R_INT – the R_INT constant

  • R_COST – the R_COST constant

  • L – the maximum number of stop actions

  • R_ST – the R_ST constant

Returns

a |L|x|A1|x|A2|x|S| tensor

static sample_attacker_action(pi2: numpy.ndarray[Any, numpy.dtype[Any]], s: int) int[source]

Samples the attacker action

Parameters
  • pi2 – the attacker policy

  • s – the game state

Returns

the sampled attacker action a2

static sample_initial_state(b1: numpy.ndarray[Any, numpy.dtype[numpy.float64]]) int[source]

Samples the initial state

Parameters

b1 – the initial belief

Returns

s1
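Since b1 is a categorical distribution over states, sampling the initial state amounts to a weighted draw of a state index. A minimal sketch (the helper name is illustrative, not the library's implementation):

```python
import numpy as np

def sample_initial_state_sketch(b1, rng=None):
    # b1 is a belief vector: non-negative entries summing to one.
    rng = rng or np.random.default_rng()
    return int(rng.choice(len(b1), p=b1))
```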

static sample_next_observation(Z: numpy.ndarray[Any, numpy.dtype[Any]], s_prime: int, O: numpy.ndarray[Any, numpy.dtype[numpy.int64]]) int[source]

Samples the next observation

Parameters
  • Z – the observation tensor, which includes the observation probabilities

  • s_prime – the new state

  • O – the observation space

Returns

o

static sample_next_state(T: numpy.ndarray[Any, numpy.dtype[Any]], l: int, s: int, a1: int, a2: int, S: numpy.ndarray[Any, numpy.dtype[numpy.int64]]) int[source]

Samples the next state

Parameters
  • T – the transition operator

  • s – the currrent state
  • a1 – the defender action

  • a2 – the attacker action

  • S – the state space

  • l – the number of stops remaining

Returns

s’
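The next state is a draw from the row of the transition tensor selected by the current stops-remaining count, actions, and state. A sketch assuming a |L|x|A1|x|A2|x|S|x|S| tensor (the toy tensor below is illustrative, not the game's actual dynamics):

```python
import numpy as np

def sample_next_state_sketch(T, l, s, a1, a2, rng=None):
    rng = rng or np.random.default_rng()
    row = T[l, a1, a2, s]  # conditional distribution over next states s'
    return int(rng.choice(len(row), p=row))
```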

static state_space()[source]

Gets the state space

Returns

the state space of the game

static transition_tensor(L: int, p: float) numpy.ndarray[Any, numpy.dtype[Any]][source]

Gets the transition tensor

Parameters
  • L – the maximum number of stop actions

  • p – a probability parameter of the transition dynamics

Returns

a |L|x|A1|x|A2|x|S|x|S| tensor
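Whatever values L and p induce, each slice of the returned tensor indexed by (l, a1, a2, s) should be a probability distribution over next states. A quick sanity check (a sketch, not part of the library):

```python
import numpy as np

def is_row_stochastic(T, tol=1e-9):
    # Every last-axis row must be non-negative and sum to one.
    return bool(np.all(T >= 0) and np.allclose(T.sum(axis=-1), 1.0, atol=tol))
```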

Module contents