Discrete Reinforcement Learning

Specialization of the Reinforcement Learning problem for discrete action domains.

Usage

e["Problem"]["Type"] = "ReinforcementLearning/Discrete"

Compatible Solvers

This problem can be solved using the following modules:

Variable-Specific Settings

These are settings required by this module that are added to each of the experiment’s variables when this module is selected.

Type
  • Usage: e["Variables"][index]["Type"] = string

  • Description: Indicates if the variable belongs to the state or action vector.

  • Options:

    • "State": The variable describes a state.

    • "Action": The variable describes an action.

Lower Bound
  • Usage: e["Variables"][index]["Lower Bound"] = float

  • Description: Lower bound for the variable’s value.

Upper Bound
  • Usage: e["Variables"][index]["Upper Bound"] = float

  • Description: Upper bound for the variable’s value.

Name
  • Usage: e["Variables"][index]["Name"] = string

  • Description: Defines the name of the variable.
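Taken together, the variable settings above might be configured as in the following sketch. In an actual run `e` would be a korali.Experiment() object; a plain nested dict stands in here so the example is self-contained, and all variable names and bounds are illustrative.

```python
# Sketch: declaring state and action variables for a discrete RL problem.
# A plain nested dict stands in for a korali.Experiment(); names are illustrative.
e = {"Variables": [{} for _ in range(5)]}

# Hypothetical 4-dimensional state (a cart-pole-like environment).
for i, name in enumerate(["Position", "Velocity", "Angle", "Angular Velocity"]):
    e["Variables"][i]["Name"] = name
    e["Variables"][i]["Type"] = "State"  # the default Type; set here for clarity

# One action variable.
e["Variables"][4]["Name"] = "Force"
e["Variables"][4]["Type"] = "Action"
e["Variables"][4]["Lower Bound"] = -10.0
e["Variables"][4]["Upper Bound"] = 10.0
```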

Configuration

These are settings required by this module.

Possible Actions
  • Usage: e["Problem"]["Possible Actions"] = List of Lists of float

  • Description: The set of all possible actions.
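Because the action domain is discrete, the full action set is enumerated up front: each inner list holds one value per action variable. A minimal sketch for a single action variable (the values are illustrative, and a plain dict stands in for the experiment object):

```python
# Sketch: a discrete action set with three possible actions for one
# action variable. Each inner list holds one value per action variable.
e = {"Problem": {}}
e["Problem"]["Possible Actions"] = [[-10.0], [0.0], [10.0]]
```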

Agents Per Environment
  • Usage: e["Problem"]["Agents Per Environment"] = unsigned integer

  • Description: Number of agents in a given environment. All agents share the same policy.

Environment Count
  • Usage: e["Problem"]["Environment Count"] = unsigned integer

  • Description: Number of concurrent environments to run.

Environment Function
  • Usage: e["Problem"]["Environment Function"] = Computational Model

  • Description: Function to initialize and run an episode in the environment.
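A minimal sketch of an environment function, assuming Korali's sample-based interface in which the function receives a sample object, sets s["State"], calls s.update() to obtain s["Action"], and reports s["Reward"] and s["Termination"]; verify the exact interface against your Korali version. The toy dynamics, episode length, and the stand-in sample class are purely illustrative.

```python
# Sketch of an environment function, assuming the sample-based interface
# (s["State"], s.update(), s["Action"], s["Reward"], s["Termination"]).
# The dynamics below are a toy 1-D system, not part of this module's spec.
def environment(s):
    state = [0.0]            # illustrative initial state
    s["State"] = state
    for step in range(10):   # fixed-length episode for illustration
        s.update()           # send the state, receive the next action
        action = s["Action"]
        state = [state[0] + action[0]]  # trivial dynamics
        s["State"] = state
        s["Reward"] = -abs(state[0])    # reward: stay near the origin
    s["Termination"] = "Terminal"

# Minimal stand-in for the framework-provided sample object, so that
# this sketch runs on its own; a real run would not define this.
class _FakeSample(dict):
    def update(self):
        self["Action"] = [1.0]  # pretend the policy always picks +1.0

sample = _FakeSample()
environment(sample)
```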

Actions Between Policy Updates
  • Usage: e["Problem"]["Actions Between Policy Updates"] = unsigned integer

  • Description: Number of actions to take before requesting a new policy.

Custom Settings
  • Usage: e["Problem"]["Custom Settings"] = knlohmann::json

  • Description: Any user-defined settings required by the environment.
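Custom settings are an arbitrary JSON object passed through for the environment's own use. A small sketch, with a plain dict standing in for the experiment object; the keys shown ("Gravity", "Max Steps") are hypothetical, not settings this module defines:

```python
# Sketch: arbitrary user-defined settings for the environment.
# The keys below are hypothetical examples, not part of this module's spec.
e = {"Problem": {}}
e["Problem"]["Custom Settings"] = {"Gravity": 9.81, "Max Steps": 500}
```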

Default Configuration

The following configuration will be assigned by default. Any settings defined by the user will override these defaults.

{
    "Actions Between Policy Updates": 0,
    "Agents Per Environment": 1,
    "Custom Settings": {},
    "Environment Count": 1
}

Variable Defaults

The following configuration will be assigned to each of the experiment's variables by default. Any settings defined by the user will override these defaults.

{
    "Lower Bound": -Infinity,
    "Type": "State",
    "Upper Bound": Infinity
}