csle_agents.agents.lp_nf package

Submodules

csle_agents.agents.lp_nf.linear_programming_normal_form_game_agent module

class csle_agents.agents.lp_nf.linear_programming_normal_form_game_agent.LinearProgrammingNormalFormGameAgent(simulation_env_config: SimulationEnvConfig, experiment_config: ExperimentConfig, env: Optional[BaseEnv] = None, emulation_env_config: Union[None, EmulationEnvConfig] = None, training_job: Optional[TrainingJobConfig] = None, save_to_metastore: bool = True)[source]

Bases: BaseAgent

Linear programming agent for normal-form games

static compute_avg_metrics(metrics: Dict[str, List[Union[float, int]]]) Dict[str, Union[float, int]][source]

Computes the average metrics of a dict with aggregated metrics

Parameters

metrics – the dict with the aggregated metrics

Returns

the average metrics
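
A minimal sketch of the documented behaviour (the actual CSLE implementation may differ): every aggregated list of values is reduced to its arithmetic mean.

   from typing import Dict, List, Union

   def compute_avg_metrics(metrics: Dict[str, List[Union[float, int]]]) -> Dict[str, Union[float, int]]:
       # Reduce each list of collected values to its mean
       return {key: sum(values) / len(values) for key, values in metrics.items()}

   print(compute_avg_metrics({"average_reward": [1.0, 3.0]}))  # {'average_reward': 2.0}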

compute_equilibrium_strategies_in_matrix_game(A: ndarray[Any, dtype[Any]], A1: ndarray[Any, dtype[Any]], A2: ndarray[Any, dtype[Any]]) Tuple[ndarray[Any, dtype[Any]], ndarray[Any, dtype[Any]], float][source]

Computes equilibrium strategies in a matrix game

Parameters
  • A – the matrix game

  • A1 – the action set of player 1 (the maximizer)

  • A2 – the action set of player 2 (the minimizer)

Returns

the equilibrium strategy profile and the value
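
The following self-contained check (illustrative only, not CSLE code) shows what an equilibrium strategy profile and value look like for a concrete matrix game: in matching pennies both players randomize uniformly, the value is 0, and neither player can improve by deviating to a pure action.

   import numpy as np

   A = np.array([[1.0, -1.0], [-1.0, 1.0]])  # matching pennies, payoffs to the maximizer
   v1 = np.array([0.5, 0.5])                 # equilibrium strategy of player 1 (maximizer)
   v2 = np.array([0.5, 0.5])                 # equilibrium strategy of player 2 (minimizer)
   value = float(v1 @ A @ v2)

   # Best-response checks: no pure row beats the value and no pure column undercuts it
   assert all(float(A[i, :] @ v2) <= value + 1e-9 for i in range(A.shape[0]))
   assert all(float(v1 @ A[:, j]) >= value - 1e-9 for j in range(A.shape[1]))
   print(value)  # 0.0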

compute_matrix_game_value(A: ndarray[Any, dtype[Any]], A1: ndarray[Any, dtype[Any]], A2: ndarray[Any, dtype[Any]], maximizer: bool = True) Tuple[Any, ndarray[Any, dtype[Any]]][source]

Uses linear programming to compute the value of a matrix game; also computes the maximin or minimax strategy

Parameters
  • A – the matrix game

  • A1 – the action set of player 1

  • A2 – the action set of player 2

  • maximizer – boolean flag indicating whether to compute the maximin strategy (True) or the minimax strategy (False)

Returns

(val(A), maximin or minimax strategy)
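
The standard linear program for the maximin case can be sketched with scipy as follows; this is an illustration of the technique under the assumption that the row player maximizes the expected payoff A[i, j], not the CSLE implementation. The minimax strategy of player 2 is obtained analogously by solving the LP on the negated transpose of A, and by the minimax theorem both LPs have the same optimal value.

   import numpy as np
   from scipy.optimize import linprog

   def maximin_value(A: np.ndarray):
       m, n = A.shape
       # Variables: x_1..x_m (mixed row strategy) and v (game value); linprog minimizes, so use -v
       c = np.zeros(m + 1)
       c[-1] = -1.0
       # For every column j: v - sum_i x_i * A[i, j] <= 0
       A_ub = np.hstack([-A.T, np.ones((n, 1))])
       b_ub = np.zeros(n)
       # Probability simplex: sum_i x_i = 1
       A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])
       b_eq = np.array([1.0])
       bounds = [(0, None)] * m + [(None, None)]  # x >= 0, v unrestricted
       res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
       return res.x[-1], res.x[:m]

   val, strategy = maximin_value(np.array([[1.0, -1.0], [-1.0, 1.0]]))
   print(round(val, 3), np.round(strategy, 3))  # value ~ 0, strategy ~ [0.5 0.5]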

hparam_names() List[str][source]

Returns

a list with the hyperparameter names

linear_programming_normal_form(exp_result: ExperimentResult, seed: int, training_job: TrainingJobConfig, random_seeds: List[int]) ExperimentResult[source]

Runs the linear programming algorithm for normal-form games

Parameters
  • exp_result – the experiment result object to store the result

  • seed – the seed

  • training_job – the training job config

  • random_seeds – list of seeds

Returns

the updated experiment result and the trained policy

static round_vec(vec) List[float][source]

Rounds a vector to 3 decimals

Parameters

vec – the vector to round

Returns

the rounded vector
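
A one-line sketch of the documented behaviour, assuming the input is any iterable of numbers (the actual implementation may differ):

   def round_vec(vec):
       # Round every component to 3 decimals
       return [round(float(x), 3) for x in vec]

   print(round_vec([1 / 3, 1 / 7]))  # [0.333, 0.143]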

train() ExperimentExecution[source]

Performs the policy training for the given random seeds using linear programming

Returns

the training metrics and the trained policies

static update_metrics(metrics: Dict[str, List[Union[float, int]]], info: Dict[str, Union[float, int]]) Dict[str, List[Union[float, int]]][source]

Updates a dict with aggregated metrics using new information from the environment

Parameters
  • metrics – the dict with the aggregated metrics

  • info – the new information

Returns

the updated dict
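
A minimal sketch of the documented aggregation pattern (the CSLE implementation may differ): each value in info is appended to the corresponding list in metrics, creating the list when the key is new.

   from typing import Dict, List, Union

   def update_metrics(metrics: Dict[str, List[Union[float, int]]],
                      info: Dict[str, Union[float, int]]) -> Dict[str, List[Union[float, int]]]:
       # Append every new observation to the running list for its metric
       for key, value in info.items():
           metrics.setdefault(key, []).append(value)
       return metrics

   metrics = update_metrics({}, {"average_reward": 1.0})
   metrics = update_metrics(metrics, {"average_reward": 3.0})
   print(metrics)  # {'average_reward': [1.0, 3.0]}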

Module contents