Discrete Reinforcement Learning

Specialization of the Reinforcement Learning problem for discrete action domains.

Usage

e["Problem"]["Type"] = "ReinforcementLearning/Discrete"

Compatible Solvers

This problem can be solved using the following modules:

Variable-Specific Settings

These are settings required by this module that are added to each of the experiment’s variables when this module is selected.

Type
  • Usage: e["Variables"][index]["Type"] = string

  • Description: Indicates if the variable belongs to the state or action vector.

  • Options:

    • "State": The variable describes a state.

    • "Action": The variable describes an action.

Lower Bound
  • Usage: e["Variables"][index]["Lower Bound"] = float

  • Description: Lower bound for the variable’s value.

Upper Bound
  • Usage: e["Variables"][index]["Upper Bound"] = float

  • Description: Upper bound for the variable’s value.

Name
  • Usage: e["Variables"][index]["Name"] = string

  • Description: Defines the name of the variable.
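Taken together, the variable settings above might be configured as in the following sketch. In an actual run `e` would be a korali.Experiment() object; a plain nested dict stands in here so the example is self-contained, and all variable names and bounds are illustrative.

```python
# Sketch: declaring state and action variables for a discrete RL problem.
# A plain nested dict stands in for a korali.Experiment(); names are illustrative.
e = {"Variables": [{} for _ in range(5)]}

# Hypothetical 4-dimensional state (a cart-pole-like environment).
for i, name in enumerate(["Position", "Velocity", "Angle", "Angular Velocity"]):
    e["Variables"][i]["Name"] = name
    e["Variables"][i]["Type"] = "State"  # the default Type; set here for clarity

# One action variable.
e["Variables"][4]["Name"] = "Force"
e["Variables"][4]["Type"] = "Action"
e["Variables"][4]["Lower Bound"] = -10.0
e["Variables"][4]["Upper Bound"] = 10.0
```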

Configuration

These are settings required by this module.

Possible Actions
  • Usage: e["Problem"]["Possible Actions"] = List of Lists of float

  • Description: The set of all possible actions.
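Because the action domain is discrete, the full action set is enumerated up front: each inner list holds one value per action variable. A minimal sketch for a single action variable (the values are illustrative, and a plain dict stands in for the experiment object):

```python
# Sketch: a discrete action set with three possible actions for one
# action variable. Each inner list holds one value per action variable.
e = {"Problem": {}}
e["Problem"]["Possible Actions"] = [[-10.0], [0.0], [10.0]]
```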

Agents Per Environment
  • Usage: e["Problem"]["Agents Per Environment"] = unsigned integer

  • Description: Number of agents in a given environment. All agents share the same policy.

Environment Count
  • Usage: e["Problem"]["Environment Count"] = unsigned integer

  • Description: Number of concurrent environments to run.

Environment Function
  • Usage: e["Problem"]["Environment Function"] = Computational Model

  • Description: Function to initialize and run an episode in the environment.
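A minimal sketch of an environment function, assuming Korali's sample-based interface in which the function receives a sample object, sets s["State"], calls s.update() to obtain s["Action"], and reports s["Reward"] and s["Termination"]; verify the exact interface against your Korali version. The toy dynamics, episode length, and the stand-in sample class are purely illustrative.

```python
# Sketch of an environment function, assuming the sample-based interface
# (s["State"], s.update(), s["Action"], s["Reward"], s["Termination"]).
# The dynamics below are a toy 1-D system, not part of this module's spec.
def environment(s):
    state = [0.0]            # illustrative initial state
    s["State"] = state
    for step in range(10):   # fixed-length episode for illustration
        s.update()           # send the state, receive the next action
        action = s["Action"]
        state = [state[0] + action[0]]  # trivial dynamics
        s["State"] = state
        s["Reward"] = -abs(state[0])    # reward: stay near the origin
    s["Termination"] = "Terminal"

# Minimal stand-in for the framework-provided sample object, so that
# this sketch runs on its own; a real run would not define this.
class _FakeSample(dict):
    def update(self):
        self["Action"] = [1.0]  # pretend the policy always picks +1.0

sample = _FakeSample()
environment(sample)
```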

Actions Between Policy Updates
  • Usage: e["Problem"]["Actions Between Policy Updates"] = unsigned integer

  • Description: Number of actions to take before requesting a new policy.

Custom Settings
  • Usage: e["Problem"]["Custom Settings"] = knlohmann::json

  • Description: Any user-defined settings required by the environment.
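Custom settings are an arbitrary JSON object passed through for the environment's own use. A small sketch, with a plain dict standing in for the experiment object; the keys shown ("Gravity", "Max Steps") are hypothetical, not settings this module defines:

```python
# Sketch: arbitrary user-defined settings for the environment.
# The keys below are hypothetical examples, not part of this module's spec.
e = {"Problem": {}}
e["Problem"]["Custom Settings"] = {"Gravity": 9.81, "Max Steps": 500}
```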

Default Configuration

The following configuration will be assigned by default. Any settings defined by the user will override these defaults.

{
    "Actions Between Policy Updates": 0,
    "Agents Per Environment": 1,
    "Custom Settings": {},
    "Environment Count": 1
}

Variable Defaults

The following configuration will be assigned to each of the experiment's variables by default. Any settings defined by the user will override these defaults.

{
    "Lower Bound": -Infinity,
    "Type": "State",
    "Upper Bound": Infinity
}