Deep Supervisor
Uses a pair of neural networks, one for training and one for evaluation, to solve a supervised learning problem. At each generation, it applies one or more optimization steps based on the loss function and the input/solution pairs received. The inputs and solutions may change between generations.
Inference is fully OpenMP-parallelizable, so different OpenMP threads can run inference on the learned parameters simultaneously. Training, however, must be performed sequentially.
Usage
e["Solver"]["Type"] = "Learner/DeepSupervisor"
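A fuller configuration might look like the sketch below. A plain nested dictionary stands in for a `korali.Experiment` object, and the hidden-layer keys and engine name are illustrative assumptions, not verified against the Korali API.

```python
# Sketch: configuring the DeepSupervisor solver. A defaultdict stands in for
# a korali.Experiment; the "Hidden Layers" entries and the "OneDNN" engine
# name are assumptions shown only to illustrate the setting shapes.
from collections import defaultdict

def nested_dict():
    return defaultdict(nested_dict)

e = nested_dict()
e["Solver"]["Type"] = "Learner/DeepSupervisor"
e["Solver"]["Loss Function"] = "Mean Squared Error"
e["Solver"]["Learning Rate"] = 0.001
e["Solver"]["Steps Per Generation"] = 1
e["Solver"]["Neural Network"]["Engine"] = "OneDNN"      # assumed backend name
e["Solver"]["Neural Network"]["Optimizer"] = "Adam"
e["Solver"]["Neural Network"]["Hidden Layers"] = [
    {"Type": "Layer/Linear", "Output Channels": 32},     # illustrative keys
    {"Type": "Layer/Activation", "Function": "Elementwise/Tanh"},
]
```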
Results
These are the results produced by this solver:
Variable-Specific Settings
These are settings required by this module that are added to each of the experiment’s variables when this module is selected.
Configuration
These are settings required by this module.
- Neural Network / Hidden Layers
Usage: e["Solver"]["Neural Network"]["Hidden Layers"] = knlohmann::json
Description: Sets the configuration of the hidden layers for the neural network.
- Neural Network / Output Activation
Usage: e["Solver"]["Neural Network"]["Output Activation"] = knlohmann::json
Description: Allows setting an additional activation for the output layer.
- Neural Network / Output Layer
Usage: e["Solver"]["Neural Network"]["Output Layer"] = knlohmann::json
Description: Sets any additional configuration (e.g., masks) for the output NN layer.
- Neural Network / Engine
Usage: e["Solver"]["Neural Network"]["Engine"] = string
Description: Specifies which Neural Network backend engine to use.
- Neural Network / Optimizer
Usage: e["Solver"]["Neural Network"]["Optimizer"] = string
Description: Determines which optimizer algorithm to use to apply the gradients to the neural network's hyperparameters.
Options:
"Adam": Uses the Adam algorithm.
"AdaBelief": Uses the AdaBelief algorithm.
"MADGRAD": Uses the MADGRAD algorithm.
"RMSProp": Uses the RMSProp algorithm.
"Adagrad": Uses the Adagrad algorithm.
- Hyperparameters
Usage: e["Solver"]["Hyperparameters"] = List of float
Description: Stores the training neural network hyperparameters (weights and biases).
- Loss Function
Usage: e["Solver"]["Loss Function"] = string
Description: Function used to calculate the difference (loss) between the NN inference and the exact solution, as well as its gradient for optimization.
Options:
"Direct Gradient": The given solution represents the gradient of the loss with respect to the network output. Note that Korali uses the gradients to maximize the objective.
"Mean Squared Error": The loss is calculated as the negative mean of squared errors, one per input in the batch. Note that Korali maximizes the negative MSE.
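The "Mean Squared Error" convention can be sketched in plain Python. Since Korali maximizes its objective, the loss is reported as the negative mean of squared errors; scalar outputs are used here for simplicity.

```python
# Sketch: negative MSE as described above, one squared error per input in the
# batch, negated so that maximizing the objective minimizes the error.
def negative_mse(inferred, solution):
    batch = list(zip(inferred, solution))
    return -sum((y - t) ** 2 for y, t in batch) / len(batch)

loss = negative_mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])  # ≈ -1.3333
```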
- Steps Per Generation
Usage: e["Solver"]["Steps Per Generation"] = unsigned integer
Description: Represents the number of optimization steps to run per generation.
- Learning Rate
Usage: e["Solver"]["Learning Rate"] = float
Description: Learning rate for the underlying gradient-based optimizer.
- L2 Regularization / Enabled
Usage: e["Solver"]["L2 Regularization"]["Enabled"] = True/False
Description: Specifies whether L2 regularization is applied to the neural network.
- L2 Regularization / Importance
Usage: e["Solver"]["L2 Regularization"]["Importance"] = float
Description: Importance (penalty) weight of the L2 regularization term.
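The role of the importance weight can be illustrated with the standard L2 penalty term. This is a sketch of the usual formulation, importance × Σw²; the exact placement of the term inside Korali's optimizers is not shown here.

```python
# Sketch: the standard L2 regularization penalty with an "Importance" weight,
# applied over the flat list of network hyperparameters (weights and biases).
def l2_penalty(hyperparameters, importance=0.0001):
    return importance * sum(w * w for w in hyperparameters)
```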
- Output Weights Scaling
Usage: e["Solver"]["Output Weights Scaling"] = float
Description: Specifies by how much the weights of the last linear transformation of the NN are scaled. A value < 1.0 is useful for a more deterministic start.
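The effect of this setting can be sketched as a one-time rescaling of the final layer's weight matrix after initialization. Plain lists stand in for the real network types.

```python
# Sketch: "Output Weights Scaling" multiplies the weights of the final linear
# layer by the given factor; a factor < 1.0 shrinks the initial outputs toward
# zero, giving a more deterministic start.
def scale_output_weights(last_layer_weights, scaling):
    return [[w * scaling for w in row] for row in last_layer_weights]

w = [[0.5, -0.2], [1.0, 0.8]]
w_scaled = scale_output_weights(w, 0.5)
```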
Termination Criteria
These are the customizable criteria that indicate whether the solver should continue or finish execution. Korali stops when at least one of these conditions is met. The criteria are expressed in C++, since they are compiled and evaluated directly by the engine.
- Target Loss
Usage: e["Solver"]["Target Loss"] = float
Description: Specifies the loss value at or below which execution stops. A non-positive value disables this criterion.
Criteria:
(_k->_currentGeneration > 1) && (_targetLoss > 0.0) && (_currentLoss <= _targetLoss)
- Max Model Evaluations
Usage: e["Solver"]["Max Model Evaluations"] = unsigned integer
Description: Specifies the maximum allowed evaluations of the computational model.
Criteria:
_maxModelEvaluations <= _modelEvaluationCount
- Max Generations
Usage: e["Solver"]["Max Generations"] = unsigned integer
Description: Determines how many solver generations to run before stopping execution. Execution can be resumed at a later moment.
Criteria:
_k->_currentGeneration > _maxGenerations
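The three C++ criteria above can be transcribed into a single Python check. Names mirror the member variables shown, with the leading underscores and `_k->` prefix dropped.

```python
# Sketch: the three termination criteria above, combined. The solver stops
# when at least one condition holds.
def should_terminate(current_generation, current_loss, target_loss,
                     model_evaluation_count, max_model_evaluations,
                     max_generations):
    # Target Loss: only active after generation 1 and for positive targets
    if current_generation > 1 and target_loss > 0.0 and current_loss <= target_loss:
        return True
    # Max Model Evaluations
    if max_model_evaluations <= model_evaluation_count:
        return True
    # Max Generations
    if current_generation > max_generations:
        return True
    return False
```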
Default Configuration
The following configuration will be assigned by default. Any settings defined by the user will override these defaults.
{
  "Hyperparameters": [],
  "L2 Regularization": { "Enabled": false, "Importance": 0.0001 },
  "Model Evaluation Count": 0,
  "Neural Network": { "Output Activation": "Identity", "Output Layer": { } },
  "Output Weights Scaling": 1.0,
  "Steps Per Generation": 1,
  "Termination Criteria": {
    "Max Generations": 10000000000,
    "Max Model Evaluations": 1000000000,
    "Target Loss": -1.0
  },
  "Variable Count": 0
}
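How user settings override these defaults can be sketched as a recursive dictionary merge; Korali performs an equivalent merge internally when the experiment is run, so the function below is illustrative only.

```python
# Sketch: user-supplied settings override defaults key by key, recursing into
# nested objects so that untouched default keys survive.
def merge(defaults, user):
    out = dict(defaults)
    for k, v in user.items():
        if isinstance(v, dict) and isinstance(out.get(k), dict):
            out[k] = merge(out[k], v)
        else:
            out[k] = v
    return out

defaults = {"Steps Per Generation": 1,
            "L2 Regularization": {"Enabled": False, "Importance": 0.0001}}
user = {"L2 Regularization": {"Enabled": True}}
cfg = merge(defaults, user)
```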
Variable Defaults
The following configuration will be assigned to each of the experiment's variables by default. Any settings defined by the user will override these defaults.
{ }