csle_agents.agents.lp_cmdp package

Submodules

csle_agents.agents.lp_cmdp.linear_programming_cmdp_agent module

class csle_agents.agents.lp_cmdp.linear_programming_cmdp_agent.LinearProgrammingCMDPAgent(simulation_env_config: csle_common.dao.simulation_config.simulation_env_config.SimulationEnvConfig, experiment_config: csle_common.dao.training.experiment_config.ExperimentConfig, env: Optional[csle_common.dao.simulation_config.base_env.BaseEnv] = None, emulation_env_config: Union[None, csle_common.dao.emulation_config.emulation_env_config.EmulationEnvConfig] = None, training_job: Optional[csle_common.dao.jobs.training_job_config.TrainingJobConfig] = None, save_to_metastore: bool = True)[source]

Bases: csle_agents.agents.base.base_agent.BaseAgent

Linear programming agent for constrained Markov decision processes (CMDPs)

static compute_avg_metrics(metrics: Dict[str, List[Union[float, int]]]) Dict[str, Union[float, int]][source]

Computes the average metrics of a dict with aggregated metrics

Parameters

metrics – the dict with the aggregated metrics

Returns

the average metrics
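As a rough illustration of the aggregation described above, the following sketch averages every list in a metrics dict. It is not the library's implementation: the function name `compute_avg_metrics_sketch`, the use of the arithmetic mean, and the choice to skip empty lists are all assumptions made for this example.

```python
from typing import Dict, List, Union


def compute_avg_metrics_sketch(
    metrics: Dict[str, List[Union[float, int]]]
) -> Dict[str, Union[float, int]]:
    """Hypothetical helper: arithmetic mean of each aggregated metric."""
    avg_metrics: Dict[str, Union[float, int]] = {}
    for name, values in metrics.items():
        if len(values) == 0:
            # Assumption: metrics without any samples are left out
            continue
        avg_metrics[name] = sum(values) / len(values)
    return avg_metrics


avg = compute_avg_metrics_sketch({"cost": [1.0, 2.0, 3.0], "steps": [10, 20]})
```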

hparam_names() List[str][source]

Returns

a list with the hyperparameter names

linear_programming_cmdp(exp_result: csle_common.dao.training.experiment_result.ExperimentResult, seed: int, training_job: csle_common.dao.jobs.training_job_config.TrainingJobConfig, random_seeds: List[int]) csle_common.dao.training.experiment_result.ExperimentResult[source]

Runs the linear programming algorithm for CMDPs

Parameters
  • exp_result – the experiment result object to store the result

  • seed – the seed

  • training_job – the training job config

  • random_seeds – list of seeds

Returns

the updated experiment result and the trained policy

static lp(actions: numpy.ndarray[Any, numpy.dtype[Any]], states: numpy.ndarray[Any, numpy.dtype[Any]], cost_tensor: numpy.ndarray[Any, numpy.dtype[Any]], transition_tensor: numpy.ndarray[Any, numpy.dtype[Any]], constraint_cost_tensors: numpy.ndarray[Any, numpy.dtype[Any]], constraint_cost_thresholds: numpy.ndarray[Any, numpy.dtype[Any]]) Tuple[str, numpy.ndarray[Any, numpy.dtype[Any]], numpy.ndarray[Any, numpy.dtype[Any]], numpy.ndarray[Any, numpy.dtype[Any]], float][source]

Linear program for solving a CMDP (see Altman ‘99 for details)

Parameters
  • actions – the action space

  • states – the state space

  • cost_tensor – the cost tensor

  • transition_tensor – the transition tensor

  • constraint_cost_tensors – the constraint cost tensors

  • constraint_cost_thresholds – the constraint cost thresholds

Returns

the solution status, the optimal occupancy measure, the optimal strategy, the expected constraint cost, the objective value
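To make the occupancy-measure formulation concrete, here is a minimal sketch of a CMDP linear program. Everything in it is an assumption made for illustration rather than the behaviour of `lp` itself: it uses the average-cost criterion, SciPy's `linprog` as the solver, the tensor layouts `cost[s, a]`, `transition[s, a, s_next]` and `constraint_costs[k, s, a]`, and the hypothetical name `cmdp_lp_sketch`. The solver, optimality criterion, and tensor layout used by the library may differ.

```python
import numpy as np
from scipy.optimize import linprog


def cmdp_lp_sketch(cost, transition, constraint_costs, thresholds):
    """
    Hypothetical average-cost CMDP linear program over occupancy measures.

    Assumed shapes: cost (S, A), transition (S, A, S),
    constraint_costs (K, S, A), thresholds (K,).
    """
    S, A = cost.shape
    c = cost.reshape(S * A)  # objective: sum_{s,a} rho(s,a) * cost(s,a)

    # Equality constraints: stationarity for every state, plus normalization
    A_eq = np.zeros((S + 1, S * A))
    for s_next in range(S):
        for s in range(S):
            for a in range(A):
                # sum_a rho(s_next, a) - sum_{s,a} rho(s,a) P(s_next | s,a) = 0
                A_eq[s_next, s * A + a] = float(s == s_next) - transition[s, a, s_next]
    A_eq[S, :] = 1.0  # sum_{s,a} rho(s,a) = 1
    b_eq = np.zeros(S + 1)
    b_eq[S] = 1.0

    # Inequality constraints: expected constraint cost k <= threshold k
    A_ub = constraint_costs.reshape(len(thresholds), S * A)

    res = linprog(c, A_ub=A_ub, b_ub=thresholds, A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    if not res.success:
        return res.message, None, None, None, None

    rho = np.clip(res.x, 0.0, None).reshape(S, A)
    state_mass = rho.sum(axis=1, keepdims=True)
    # pi(a | s) = rho(s,a) / sum_a rho(s,a); uniform where a state has no mass
    strategy = np.where(state_mass > 1e-12,
                        rho / np.maximum(state_mass, 1e-12), 1.0 / A)
    expected_constraint_costs = A_ub @ res.x
    return res.message, rho, strategy, expected_constraint_costs, float(res.fun)


# Toy instance: action a always moves to state a; action 0 costs 1,
# and action 1 may be used at most half of the time.
transition = np.zeros((2, 2, 2))
transition[:, 0, 0] = 1.0
transition[:, 1, 1] = 1.0
cost = np.array([[1.0, 0.0], [1.0, 0.0]])
constraint_costs = np.array([[[0.0, 1.0], [0.0, 1.0]]])
thresholds = np.array([0.5])

status, rho, strategy, expected_constraint_costs, objective = cmdp_lp_sketch(
    cost, transition, constraint_costs, thresholds)
```

In this toy instance the unconstrained optimum would always pick action 1, so the constraint is active and the solver has to mix the two actions. The strategy is recovered from the occupancy measure by normalizing each state's row.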

static round_vec(vec) List[float][source]

Rounds each element of a vector to 3 decimal places

Parameters

vec – the vector to round

Returns

the rounded vector
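A minimal sketch of this kind of rounding, assuming the input is any iterable of numbers; the name `round_vec_sketch` is hypothetical and the library's own implementation may differ in detail:

```python
from typing import Iterable, List


def round_vec_sketch(vec: Iterable[float]) -> List[float]:
    """Hypothetical helper: round every element to 3 decimal places."""
    return [round(float(x), 3) for x in vec]


rounded = round_vec_sketch([0.12345, 1.23456, 2.0])
```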

train() csle_common.dao.training.experiment_execution.ExperimentExecution[source]

Performs the policy training for the given random seeds using linear programming

Returns

the training metrics and the trained policies

static update_metrics(metrics: Dict[str, List[Union[float, int]]], info: Dict[str, Union[float, int]]) Dict[str, List[Union[float, int]]][source]

Update a dict with aggregated metrics using new information from the environment

Parameters
  • metrics – the dict with the aggregated metrics

  • info – the new information

Returns

the updated dict
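The update step can be pictured as appending each new value to the list stored under the same key. The sketch below assumes that keys missing from the aggregate are created on first use; the name `update_metrics_sketch` and that behaviour are assumptions, not a description of the library's code.

```python
from typing import Dict, List, Union


def update_metrics_sketch(
    metrics: Dict[str, List[Union[float, int]]],
    info: Dict[str, Union[float, int]],
) -> Dict[str, List[Union[float, int]]]:
    """Hypothetical helper: append each value in info to its metric list."""
    for name, value in info.items():
        # Assumption: unseen metric names start a new list
        metrics.setdefault(name, []).append(value)
    return metrics


aggregated: Dict[str, List[Union[float, int]]] = {"cost": [1.0]}
aggregated = update_metrics_sketch(aggregated, {"cost": 2.0, "steps": 10})
```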

Module contents