- JITAdadeltaOptimizer(builtins.object)
  - JITAdadeltaOptimizer
- JITAdamOptimizer(builtins.object)
  - JITAdamOptimizer
- JITSGDOptimizer(builtins.object)
  - JITSGDOptimizer
class JITAdadeltaOptimizer(JITAdadeltaOptimizer)
JITAdadeltaOptimizer(*args, **kwargs)
Adadelta optimizer class for training neural networks.
Formula:
E[g^2]_t = rho * E[g^2]_{t-1} + (1 - rho) * g^2
Delta_x = - (sqrt(E[delta_x^2]_{t-1} + epsilon) / sqrt(E[g^2]_t + epsilon)) * g
E[delta_x^2]_t = rho * E[delta_x^2]_{t-1} + (1 - rho) * Delta_x^2
Derived from: https://arxiv.org/abs/1212.5701
Args:
learning_rate (float, optional): The learning rate for the optimizer. Defaults to 1.0.
rho (float, optional): The decay rate. Defaults to 0.95.
epsilon (float, optional): A small value to prevent division by zero. Defaults to 1e-6.
reg_lambda (float, optional): The regularization parameter. Defaults to 0.0.
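A minimal NumPy sketch of the update rule above, for a single parameter array. It mirrors the documented E_g2 / E_delta_x2 running averages but is illustrative only and does not reproduce the jitclass internals; the L2 term driven by reg_lambda is an assumption about how regularization is applied.

```python
import numpy as np

def adadelta_step(w, grad, E_g2, E_delta_x2, rho=0.95, epsilon=1e-6, reg_lambda=0.0):
    """One Adadelta update for a single parameter array (illustrative sketch)."""
    grad = grad + reg_lambda * w                          # L2 regularization term (assumed form)
    E_g2 = rho * E_g2 + (1 - rho) * grad**2               # E[g^2]_t
    delta = -np.sqrt(E_delta_x2 + epsilon) / np.sqrt(E_g2 + epsilon) * grad  # Delta_x
    E_delta_x2 = rho * E_delta_x2 + (1 - rho) * delta**2  # E[delta_x^2]_t
    return w + delta, E_g2, E_delta_x2
```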
- Method resolution order:
  - JITAdadeltaOptimizer
  - JITAdadeltaOptimizer
  - builtins.object
Data and other attributes defined here:
- class_type = jitclass.JITAdadeltaOptimizer#1ed8a7c8610<learni...float64, 3d, C),E_delta_x2:array(float64, 3d, C)>
Methods inherited from JITAdadeltaOptimizer:
- __init__(self, learning_rate=1.0, rho=0.95, epsilon=1e-06, reg_lambda=0.0)
- Initializes the optimizer with the specified hyperparameters.
Args:
learning_rate: (float), optional - The learning rate for the optimizer (default is 1.0).
rho: (float), optional - The decay rate for the running averages (default is 0.95).
epsilon: (float), optional - A small value to prevent division by zero (default is 1e-6).
reg_lambda: (float), optional - The regularization parameter (default is 0.0).
Attributes:
E_g2: (np.ndarray) - Running average of squared gradients.
E_delta_x2: (np.ndarray) - Running average of squared parameter updates.
- initialize(self, layers)
- Initializes the running averages for each layer's weights.
Args:
layers: (list) - List of layers in the neural network.
Returns:
None
- update(self, layer, dW, db, index)
- Updates the weights and biases of a layer using the Adadelta optimization algorithm.
Args:
layer: (Layer) - The layer to update.
dW: (np.ndarray) - The gradient of the weights.
db: (np.ndarray) - The gradient of the biases.
index: (int) - The index of the layer.
Returns:
None
- update_layers(self, layers, dWs, dbs)
- Updates all layers' weights and biases using the Adadelta optimization algorithm.
Args:
layers: (list) - List of layers in the neural network.
dWs: (list of np.ndarray) - Gradients of the weights for each layer.
dbs: (list of np.ndarray) - Gradients of the biases for each layer.
Returns:
None
Data descriptors inherited from JITAdadeltaOptimizer:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
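A usage sketch based only on the initialize / update_layers signatures documented above. The helper name train_with_adadelta and the gradient_fn callable are hypothetical, and the layers, dWs, and dbs objects are assumed to be whatever the surrounding network code already produces.

```python
def train_with_adadelta(layers, gradient_fn, num_epochs, optimizer):
    """Drive the documented initialize()/update_layers() API (hypothetical helper).

    gradient_fn is an assumed callable returning (dWs, dbs) lists whose entries
    match each layer's weight and bias shapes.
    """
    optimizer.initialize(layers)                   # allocate E_g2 / E_delta_x2 for every layer
    for _ in range(num_epochs):
        dWs, dbs = gradient_fn(layers)             # gradients from the network's backward pass
        optimizer.update_layers(layers, dWs, dbs)  # apply the Adadelta rule to all layers
```

The same initialize / update_layers pattern applies to JITAdamOptimizer and JITSGDOptimizer documented below.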
class JITAdamOptimizer(JITAdamOptimizer)
JITAdamOptimizer(*args, **kwargs)
Adam optimizer class for training neural networks.
Formula: w = w - alpha * m_hat / (sqrt(v_hat) + epsilon) - lambda * w
Derived from: https://arxiv.org/abs/1412.6980
Args:
learning_rate (float, optional): The learning rate for the optimizer. Defaults to 0.001.
beta1 (float, optional): The exponential decay rate for the first moment estimates. Defaults to 0.9.
beta2 (float, optional): The exponential decay rate for the second moment estimates. Defaults to 0.999.
epsilon (float, optional): A small value to prevent division by zero. Defaults to 1e-8.
reg_lambda (float, optional): The regularization parameter. Defaults to 0.01.
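A minimal NumPy sketch of this update for a single parameter array, combining the weight formula above with the moment estimates from the referenced paper. It is illustrative only and does not reproduce the jitclass internals; passing the step count t for bias correction is an assumption about how the class tracks time.

```python
import numpy as np

def adam_step(w, grad, m, v, t, learning_rate=0.001, beta1=0.9, beta2=0.999,
              epsilon=1e-8, reg_lambda=0.01):
    """One Adam update for a single parameter array (illustrative sketch, t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad            # first moment estimate
    v = beta2 * v + (1 - beta2) * grad**2         # second moment estimate
    m_hat = m / (1 - beta1**t)                    # bias-corrected first moment
    v_hat = v / (1 - beta2**t)                    # bias-corrected second moment
    w = w - learning_rate * m_hat / (np.sqrt(v_hat) + epsilon) - reg_lambda * w
    return w, m, v
```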
- Method resolution order:
  - JITAdamOptimizer
  - JITAdamOptimizer
  - builtins.object
Data and other attributes defined here:
- class_type = jitclass.JITAdamOptimizer#1ed897c3e90<learning_r...t64, 2d, A),db:array(float64, 2d, A),index:int32>
Methods inherited from JITAdamOptimizer:
- __init__(self, learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-08, reg_lambda=0.01)
- Initializes the optimizer with the specified hyperparameters.
Args:
learning_rate: (float), optional - The learning rate for the optimizer (default is 0.001).
beta1: (float), optional - Exponential decay rate for the first moment estimates (default is 0.9).
beta2: (float), optional - Exponential decay rate for the second moment estimates (default is 0.999).
epsilon: (float), optional - A small value to prevent division by zero (default is 1e-8).
reg_lambda: (float), optional - Regularization parameter; larger values imply stronger regularization (default is 0.01).
- initialize(self, layers)
- Initializes the first and second moment estimates for each layer's weights.
Args:
layers: (list) - List of layers in the neural network.
Returns:
None
- update(self, layer, dW, db, index)
- Updates the weights and biases of a layer using the Adam optimization algorithm.
Args:
layer: (Layer) - The layer to update.
dW: (np.ndarray) - The gradient of the weights.
db: (np.ndarray) - The gradient of the biases.
index: (int) - The index of the layer.
Returns:
None
- update_layers(self, layers, dWs, dbs)
- Updates all layers' weights and biases using the Adam optimization algorithm.
Args:
layers: (list) - List of layers in the neural network.
dWs: (list of np.ndarray) - Gradients of the weights for each layer.
dbs: (list of np.ndarray) - Gradients of the biases for each layer.
Returns:
None
Data descriptors inherited from JITAdamOptimizer:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
class JITSGDOptimizer(JITSGDOptimizer)
JITSGDOptimizer(*args, **kwargs)
Stochastic Gradient Descent (SGD) optimizer class for training neural networks.
Formula: w = w - learning_rate * dW, b = b - learning_rate * db
Args:
learning_rate (float, optional): The learning rate for the optimizer. Defaults to 0.001.
momentum (float, optional): The momentum factor. Defaults to 0.0.
reg_lambda (float, optional): The regularization parameter. Defaults to 0.0.
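A minimal NumPy-style sketch of one SGD step for a single weight array. The plain rule matches the formula above; the momentum update (via the documented velocity attribute) and the L2 term follow the usual conventions and are assumptions about the exact form the class uses.

```python
def sgd_step(w, dW, velocity, learning_rate=0.001, momentum=0.0, reg_lambda=0.0):
    """One SGD update for a single weight array (illustrative sketch)."""
    dW = dW + reg_lambda * w                                  # L2 regularization term (assumed form)
    if momentum > 0.0:
        velocity = momentum * velocity - learning_rate * dW   # classical momentum (assumed form)
        w = w + velocity
    else:
        w = w - learning_rate * dW                            # plain rule from the formula above
    return w, velocity   # biases follow the same rule with db in place of dW
```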
- Method resolution order:
  - JITSGDOptimizer
  - JITSGDOptimizer
  - builtins.object
Data and other attributes defined here:
- class_type = jitclass.JITSGDOptimizer#1ed8a7be510<learning_ra...eg_lambda:float64,velocity:array(float64, 3d, C)>
Methods inherited from JITSGDOptimizer:
- __init__(self, learning_rate=0.001, momentum=0.0, reg_lambda=0.0)
- Initializes the optimizer with the specified hyperparameters.
Args:
learning_rate: (float), optional - The learning rate for the optimizer (default is 0.001).
momentum: (float), optional - The momentum factor for the optimizer (default is 0.0).
reg_lambda: (float), optional - The regularization parameter (default is 0.0).
Attributes:
learning_rate: (float) - The learning rate for the optimizer.
momentum: (float) - The momentum factor for the optimizer.
reg_lambda: (float) - The regularization parameter.
velocity: (np.ndarray) - The velocity used for momentum updates, initialized to zeros.
- initialize(self, layers)
- Initializes the velocity for each layer's weights.
Args:
layers: (list) - List of layers in the neural network.
Returns:
None
- update(self, layer, dW, db, index)
- Updates the weights and biases of a layer using the SGD optimization algorithm.
Args:
layer: (Layer) - The layer to update.
dW: (np.ndarray) - The gradient of the weights.
db: (np.ndarray) - The gradient of the biases.
index: (int) - The index of the layer.
Returns:
None
- update_layers(self, layers, dWs, dbs)
- Updates all layers' weights and biases using the SGD optimization algorithm.
Args:
layers: (list) - List of layers in the neural network.
dWs: (list of np.ndarray) - Gradients of the weights for each layer.
dbs: (list of np.ndarray) - Gradients of the biases for each layer.
Returns:
None
Data descriptors inherited from JITSGDOptimizer:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object