csle_agents.agents.fp package

Submodules

csle_agents.agents.fp.fictitious_play_agent module

class csle_agents.agents.fp.fictitious_play_agent.FictitiousPlayAgent(simulation_env_config: SimulationEnvConfig, experiment_config: ExperimentConfig, env: Optional[BaseEnv] = None, emulation_env_config: Union[None, EmulationEnvConfig] = None, training_job: Optional[TrainingJobConfig] = None, save_to_metastore: bool = True)

Bases: BaseAgent

Fictitious Play Agent for Normal-form Games (Brown 1951)

best_response(p: List[float], A: ndarray[Any, dtype[Any]], maximize: bool = True) → Tuple[int, float]

Computes a best response against the opponent's strategy p

Parameters

p – the opponent's strategy vector
A – the payoff matrix
maximize – whether it is a maximizer player or minimizer player

Returns

the best response action and its payoff (value)
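The best-response computation described above can be illustrated with a minimal, standalone sketch. It is not the library's implementation: the function name best_response_sketch is hypothetical, the payoff matrix is a plain list of lists rather than an ndarray, and the sketch assumes that the rows of A index the responding player's actions while p is a distribution over the columns.

```python
from typing import Sequence, Tuple


def best_response_sketch(
    p: Sequence[float], A: Sequence[Sequence[float]], maximize: bool = True
) -> Tuple[int, float]:
    """Hypothetical helper: best pure action (row of A) against the mixed strategy p."""
    # Expected payoff of each row action when the opponent mixes over columns with p
    expected = [sum(a_ij * p_j for a_ij, p_j in zip(row, p)) for row in A]
    # A maximizer picks the largest expected payoff, a minimizer the smallest
    pick = max if maximize else min
    idx = pick(range(len(expected)), key=expected.__getitem__)
    return idx, expected[idx]
```

Ties are broken in favor of the lowest action index here; the actual method may break ties differently.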

static compute_avg_metrics(metrics: Dict[str, List[Union[float, int]]]) → Dict[str, Union[float, int]]

Computes the average metrics of a dict with aggregated metrics

Parameters: metrics – the dict with the aggregated metrics
Returns: the average metrics
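As an illustration of the averaging step, the following standalone sketch reduces each list of aggregated values to its arithmetic mean. The name avg_metrics_sketch is hypothetical, and skipping keys with empty lists is an assumption of this sketch, not documented behavior of the method.

```python
from typing import Dict, List, Union

Number = Union[float, int]


def avg_metrics_sketch(metrics: Dict[str, List[Number]]) -> Dict[str, float]:
    """Hypothetical helper: arithmetic mean of every non-empty metric list."""
    # Assumption: keys with no recorded values are dropped to avoid dividing by zero
    return {key: sum(values) / len(values) for key, values in metrics.items() if values}
```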

compute_empirical_strategy(counts) → List[float]

Computes the empirical strategy from a list of counts

Parameters: counts – the list of counts
Returns: the empirical strategy
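An empirical strategy is commonly obtained by normalizing the action counts so that they sum to one. The sketch below shows this under stated assumptions: the name empirical_strategy_sketch is hypothetical, and the uniform fallback for all-zero counts is a choice made for this example only.

```python
from typing import List, Sequence


def empirical_strategy_sketch(counts: Sequence[int]) -> List[float]:
    """Hypothetical helper: normalize action counts into a probability vector."""
    total = sum(counts)
    if total == 0:
        # Assumption: with no observations, fall back to the uniform strategy
        return [1.0 / len(counts)] * len(counts)
    return [count / total for count in counts]
```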

fictitious_play(exp_result: ExperimentResult, seed: int, training_job: TrainingJobConfig, random_seeds: List[int]) → ExperimentResult

Runs the fictitious play algorithm

Parameters

exp_result – the experiment result object to store the result
seed – the seed
training_job – the training job config
random_seeds – list of seeds

Returns

the updated experiment result and the trained policy
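For readers unfamiliar with the algorithm, the following is a minimal sketch of classical fictitious play for a two-player zero-sum matrix game, in which each player repeatedly best-responds to the opponent's empirical strategy. It is independent of the method documented above: it does not use ExperimentResult, seeds, or training jobs, and the name fictitious_play_sketch, the iterations parameter, and the initial action choice are all assumptions of this example. The row player maximizes A and the column player minimizes it.

```python
from typing import List, Sequence, Tuple


def fictitious_play_sketch(
    A: Sequence[Sequence[float]], iterations: int = 2000
) -> Tuple[List[float], List[float]]:
    """Hypothetical helper: simultaneous fictitious play on the zero-sum game A."""
    n_rows, n_cols = len(A), len(A[0])
    # Action counts; each player starts with one (arbitrary) play of action 0
    row_counts = [1] + [0] * (n_rows - 1)
    col_counts = [1] + [0] * (n_cols - 1)

    for _ in range(iterations):
        # Empirical strategies: normalized counts of past actions
        row_total, col_total = sum(row_counts), sum(col_counts)
        row_strategy = [c / row_total for c in row_counts]
        col_strategy = [c / col_total for c in col_counts]

        # Expected payoff of every pure action against the opponent's empirical strategy
        row_values = [
            sum(A[i][j] * col_strategy[j] for j in range(n_cols)) for i in range(n_rows)
        ]
        col_values = [
            sum(A[i][j] * row_strategy[i] for i in range(n_rows)) for j in range(n_cols)
        ]

        # Row player maximizes, column player minimizes; ties go to the lowest index
        br_row = max(range(n_rows), key=row_values.__getitem__)
        br_col = min(range(n_cols), key=col_values.__getitem__)
        row_counts[br_row] += 1
        col_counts[br_col] += 1

    row_total, col_total = sum(row_counts), sum(col_counts)
    return [c / row_total for c in row_counts], [c / col_total for c in col_counts]
```

For matching pennies, A = [[1, -1], [-1, 1]], the returned empirical strategies should approach the uniform mix over both actions as the number of iterations grows.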

hparam_names() → List[str]

Returns: a list with the hyperparameter names

static round_vec(vec) → List[float]

Rounds a vector to 3 decimals

Parameters: vec – the vector to round
Returns: the rounded vector
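Rounding to 3 decimals can be sketched as follows. The name round_vec_sketch is hypothetical, and the use of Python's built-in round is an assumption; the actual method may round differently.

```python
from typing import List, Sequence


def round_vec_sketch(vec: Sequence[float]) -> List[float]:
    """Hypothetical helper: round every entry of vec to 3 decimals."""
    return [round(float(x), 3) for x in vec]
```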

train() → ExperimentExecution

Performs the policy training for the given random seeds using fictitious play

Returns: the training metrics and the trained policies

static update_metrics(metrics: Dict[str, List[Union[float, int]]], info: Dict[str, Union[float, int]]) → Dict[str, List[Union[float, int]]]

Updates a dict with aggregated metrics using new information from the environment

Parameters

metrics – the dict with the aggregated metrics
info – the new information

Returns

the updated dict
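The aggregation described above can be sketched as appending each new value to the list stored under the same key. The name update_metrics_sketch is hypothetical, and creating a new list for previously unseen keys is an assumption of this example.

```python
from typing import Dict, List, Union

Number = Union[float, int]


def update_metrics_sketch(
    metrics: Dict[str, List[Number]], info: Dict[str, Number]
) -> Dict[str, List[Number]]:
    """Hypothetical helper: append every value in info to its list in metrics."""
    for key, value in info.items():
        # Assumption: unseen keys start a new list instead of raising an error
        metrics.setdefault(key, []).append(value)
    return metrics
```

Note that this sketch mutates and returns the same dict that was passed in.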
