# Deep Supervisor

Uses a combination of training and evaluation neural networks to solve a supervised learning problem. At each generation, it applies one or more optimization steps based on the loss function and the inputs/solutions received. The inputs and solutions may change between generations.

Inference is fully OpenMP-parallelizable, so different OpenMP threads can infer from the learned parameters simultaneously. Training, however, must be done sequentially.

## Usage

`e["Solver"]["Type"] = "Learner/DeepSupervisor"`
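A fuller configuration might look as follows. This is a sketch using the settings documented below, written as plain nested dictionaries that mirror Korali's `e["Solver"]` assignments; the specific hidden-layer specification and the `"OneDNN"` engine value are illustrative assumptions, not prescribed by this page.

```python
# Illustrative DeepSupervisor configuration, mirroring Korali's
# e["Solver"] assignments with plain nested dictionaries.
# (In a real Korali script, `e` is a korali.Experiment object.)
e = {"Solver": {"Neural Network": {}, "L2 Regularization": {},
                "Termination Criteria": {}}}

e["Solver"]["Type"] = "Learner/DeepSupervisor"
e["Solver"]["Loss Function"] = "Mean Squared Error"
e["Solver"]["Learning Rate"] = 1e-4
e["Solver"]["Steps Per Generation"] = 1
e["Solver"]["Neural Network"]["Engine"] = "OneDNN"   # example backend value
e["Solver"]["Neural Network"]["Optimizer"] = "Adam"
e["Solver"]["Neural Network"]["Hidden Layers"] = [
    # Hypothetical layer specs, shown only to indicate the JSON shape:
    {"Type": "Layer/Linear", "Output Channels": 32},
    {"Type": "Layer/Activation", "Function": "Elementwise/Tanh"},
]
e["Solver"]["Termination Criteria"]["Target Loss"] = 1e-4
```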

## Results

These are the results produced by this solver:

## Variable-Specific Settings

These are settings required by this module that are added to each of the experiment’s variables when this module is selected.

## Configuration

These are settings required by this module.

- Neural Network / Hidden Layers
    - **Usage**: `e["Solver"]["Neural Network"]["Hidden Layers"] = knlohmann::json`
    - **Description**: Sets the configuration of the hidden layers for the neural network.

- Neural Network / Output Activation
    - **Usage**: `e["Solver"]["Neural Network"]["Output Activation"] = knlohmann::json`
    - **Description**: Allows setting an additional activation for the output layer.

- Neural Network / Output Layer
    - **Usage**: `e["Solver"]["Neural Network"]["Output Layer"] = knlohmann::json`
    - **Description**: Sets any additional configuration (e.g., masks) for the output NN layer.

- Neural Network / Engine
    - **Usage**: `e["Solver"]["Neural Network"]["Engine"] = string`
    - **Description**: Specifies which Neural Network backend engine to use.

- Neural Network / Optimizer
    - **Usage**: `e["Solver"]["Neural Network"]["Optimizer"] = string`
    - **Description**: Determines which optimizer algorithm to use to apply the gradients on the neural network's hyperparameters.
    - **Options**:
        - "Adam": Uses the Adam algorithm.
        - "AdaBelief": Uses the AdaBelief algorithm.
        - "MADGRAD": Uses the MADGRAD algorithm.
        - "RMSProp": Uses the RMSProp algorithm.
        - "Adagrad": Uses the Adagrad algorithm.

- Hyperparameters
    - **Usage**: `e["Solver"]["Hyperparameters"] = List of float`
    - **Description**: Stores the training neural network hyperparameters (weights and biases).

- Loss Function
    - **Usage**: `e["Solver"]["Loss Function"] = string`
    - **Description**: Function to calculate the difference (loss) between the NN inference and the exact solution, and its gradients for optimization.
    - **Options**:
        - "Direct Gradient": The given solution represents the gradients of the loss with respect to the network output. Note that Korali uses the gradients to maximize the objective.
        - "Mean Squared Error": The loss is calculated as the negative mean of squared errors, one per input in the batch. Note that Korali maximizes the negative MSE.
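The sign convention for "Mean Squared Error" can be sketched as follows. This is an illustration of the convention described above (Korali maximizes the negative MSE), not Korali's internal implementation:

```python
# Illustrative negative-MSE loss: one squared error per batch entry,
# averaged and negated, so that maximizing the loss minimizes the error.
def negative_mse(inferred, solution):
    errors = [(y - t) ** 2 for y, t in zip(inferred, solution)]
    return -sum(errors) / len(errors)

loss = negative_mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])  # -> -4/3
```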

- Steps Per Generation
    - **Usage**: `e["Solver"]["Steps Per Generation"] = unsigned integer`
    - **Description**: Represents the number of optimization steps to run per generation.

- Learning Rate
    - **Usage**: `e["Solver"]["Learning Rate"] = float`
    - **Description**: Learning rate for the underlying gradient-based optimizer.

- L2 Regularization / Enabled
    - **Usage**: `e["Solver"]["L2 Regularization"]["Enabled"] = True/False`
    - **Description**: Determines whether L2 regularization is applied to the neural network.

- L2 Regularization / Importance
    - **Usage**: `e["Solver"]["L2 Regularization"]["Importance"] = float`
    - **Description**: Importance weight of the L2 regularization term.

- Output Weights Scaling
    - **Usage**: `e["Solver"]["Output Weights Scaling"] = float`
    - **Description**: Specifies the factor by which the weights of the last linear transformation of the NN are scaled. A value below 1.0 is useful for a more deterministic start.

## Termination Criteria

These are the customizable criteria that indicate whether the solver should continue or finish execution. Korali stops when at least one of these conditions is met. Each criterion is expressed in C++, since it is compiled and evaluated as-is in the engine.

- Target Loss
    - **Usage**: `e["Solver"]["Target Loss"] = float`
    - **Description**: Specifies the target loss; execution stops once the current loss falls to or below this value. Only active when set to a positive value.
    - **Criteria**: `(_k->_currentGeneration > 1) && (_targetLoss > 0.0) && (_currentLoss <= _targetLoss)`

- Max Model Evaluations
    - **Usage**: `e["Solver"]["Max Model Evaluations"] = unsigned integer`
    - **Description**: Specifies the maximum allowed evaluations of the computational model.
    - **Criteria**: `_maxModelEvaluations <= _modelEvaluationCount`

- Max Generations
    - **Usage**: `e["Solver"]["Max Generations"] = unsigned integer`
    - **Description**: Determines how many solver generations to run before stopping execution. Execution can be resumed at a later moment.
    - **Criteria**: `_k->_currentGeneration > _maxGenerations`
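The three criteria above combine with a logical OR. The stopping logic can be mirrored in Python as a sketch (illustrative only; the engine evaluates the C++ expressions directly):

```python
# Python mirror of the three C++ termination criteria listed above.
# Returns True as soon as any single criterion is satisfied.
def should_terminate(current_generation, current_loss, model_evaluation_count,
                     target_loss, max_model_evaluations, max_generations):
    target_reached = (current_generation > 1) and (target_loss > 0.0) \
        and (current_loss <= target_loss)
    evals_exhausted = max_model_evaluations <= model_evaluation_count
    gens_exhausted = current_generation > max_generations
    return target_reached or evals_exhausted or gens_exhausted
```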

## Default Configuration

The following configuration will be assigned by default. Any settings defined by the user override these defaults.

```json
{
  "Hyperparameters": [],
  "L2 Regularization": { "Enabled": false, "Importance": 0.0001 },
  "Model Evaluation Count": 0,
  "Neural Network": { "Output Activation": "Identity", "Output Layer": { } },
  "Output Weights Scaling": 1.0,
  "Steps Per Generation": 1,
  "Termination Criteria": {
    "Max Generations": 10000000000,
    "Max Model Evaluations": 1000000000,
    "Target Loss": -1.0
  },
  "Variable Count": 0
}
```
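The precedence rule (user settings override defaults, field by field) can be sketched as a recursive dictionary merge. This is an illustration of the stated behavior, not Korali's actual configuration mechanism:

```python
# Illustrative recursive merge: user values win; nested objects are
# merged field by field rather than replaced wholesale.
def merge_config(defaults, user):
    merged = dict(defaults)
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged

defaults = {"Steps Per Generation": 1,
            "L2 Regularization": {"Enabled": False, "Importance": 0.0001}}
user = {"Steps Per Generation": 5, "L2 Regularization": {"Enabled": True}}
config = merge_config(defaults, user)
# config keeps the default "Importance" while taking the user's values
```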

## Variable Defaults

The following configuration will be assigned to each of the experiment variables by default. Any settings defined by the user override these defaults.

```json
{ }
```