- builtins.object
  - CuPyAdadeltaOptimizer
  - CuPyAdamOptimizer
  - CuPySGDOptimizer
class CuPyAdadeltaOptimizer(builtins.object)
CuPyAdadeltaOptimizer(learning_rate=1.0, rho=0.95, epsilon=1e-06, reg_lambda=0.0)
Adadelta optimizer class for training neural networks.

Formula:
  E[g^2]_t = rho * E[g^2]_{t-1} + (1 - rho) * g^2
  Delta_x = - (sqrt(E[delta_x^2]_{t-1} + epsilon) / sqrt(E[g^2]_t + epsilon)) * g
  E[delta_x^2]_t = rho * E[delta_x^2]_{t-1} + (1 - rho) * Delta_x^2

Derived from: https://arxiv.org/abs/1212.5701

Args:
  learning_rate (float, optional): The learning rate for the optimizer. Defaults to 1.0.
  rho (float, optional): The decay rate. Defaults to 0.95.
  epsilon (float, optional): A small value to prevent division by zero. Defaults to 1e-6.
  reg_lambda (float, optional): The regularization parameter. Defaults to 0.0.
Methods defined here:

- __init__(self, learning_rate=1.0, rho=0.95, epsilon=1e-06, reg_lambda=0.0)
  Initializes the optimizer with the specified hyperparameters.
  Args:
    learning_rate: (float), optional - The learning rate for the optimizer (default is 1.0).
    rho: (float), optional - The decay rate for the moving average of squared gradients (default is 0.95).
    epsilon: (float), optional - A small constant to prevent division by zero (default is 1e-6).
    reg_lambda: (float), optional - The regularization parameter for weight decay (default is 0.0).
  Attributes:
    E_g2: (None or np.ndarray) - The moving average of squared gradients, initialized as None.
    E_delta_x2: (None or np.ndarray) - The moving average of squared parameter updates, initialized as None.
- initialize(self, layers)
  Initializes the optimizer's internal state for the given layers.
  Args:
    layers: (list) - A list of layers, each containing weights and biases.
- update_layers(self, layers, dWs, dbs)
  Updates the weights and biases of the layers using Adadelta optimization.
  Args:
    layers: (list) - A list of layers to update.
    dWs: (list) - Gradients of the weights for each layer.
    dbs: (list) - Gradients of the biases for each layer.

Data descriptors defined here:

- __dict__
  dictionary for instance variables
- __weakref__
  list of weak references to the object
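The per-layer update performed by `update_layers` follows the formula above. Below is a minimal sketch of that step for a single weight array, assuming CuPy arrays for the parameters, gradients, and running averages; the standalone `adadelta_step` helper and the way `reg_lambda` is applied (plain L2 weight decay) are illustrative assumptions, not part of the documented API.

```python
import cupy as cp

# Hypothetical helper mirroring the documented Adadelta formula for one
# parameter array. How reg_lambda enters the update is an assumption here
# (simple L2 weight decay scaled by the learning rate).
def adadelta_step(W, dW, E_g2, E_dx2,
                  learning_rate=1.0, rho=0.95, epsilon=1e-6, reg_lambda=0.0):
    E_g2 = rho * E_g2 + (1 - rho) * dW ** 2                              # E[g^2]_t
    delta = -(cp.sqrt(E_dx2 + epsilon) / cp.sqrt(E_g2 + epsilon)) * dW   # Delta_x
    E_dx2 = rho * E_dx2 + (1 - rho) * delta ** 2                         # E[delta_x^2]_t
    W = W + learning_rate * delta - learning_rate * reg_lambda * W
    return W, E_g2, E_dx2

# One update of a 3x2 weight matrix with zero-initialized state,
# mirroring what initialize() would set up for each layer.
W, dW = cp.random.randn(3, 2), cp.random.randn(3, 2)
E_g2, E_dx2 = cp.zeros_like(W), cp.zeros_like(W)
W, E_g2, E_dx2 = adadelta_step(W, dW, E_g2, E_dx2)
```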
class CuPyAdamOptimizer(builtins.object)
CuPyAdamOptimizer(learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-08, reg_lambda=0.01)
Adam optimizer class for training neural networks.

Formula:
  w = w - alpha * m_hat / (sqrt(v_hat) + epsilon) - lambda * w

Derived from: https://arxiv.org/abs/1412.6980

Args:
  learning_rate (float, optional): The learning rate for the optimizer. Defaults to 0.001.
  beta1 (float, optional): The exponential decay rate for the first moment estimates. Defaults to 0.9.
  beta2 (float, optional): The exponential decay rate for the second moment estimates. Defaults to 0.999.
  epsilon (float, optional): A small value to prevent division by zero. Defaults to 1e-8.
  reg_lambda (float, optional): The regularization parameter. Defaults to 0.01.
Methods defined here:

- __init__(self, learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-08, reg_lambda=0.01)
  Initializes the optimizer with the specified hyperparameters.
  Args:
    learning_rate: (float), optional - The step size for updating weights (default is 0.001).
    beta1: (float), optional - Exponential decay rate for the first moment estimates (default is 0.9).
    beta2: (float), optional - Exponential decay rate for the second moment estimates (default is 0.999).
    epsilon: (float), optional - A small constant to prevent division by zero (default is 1e-8).
    reg_lambda: (float), optional - Regularization parameter for weight decay (default is 0.01).
- initialize(self, layers)
  Initializes the optimizer's internal state for the given layers.
  Args:
    layers: (list) - A list of layers, each containing weights and biases.
- update_layers(self, layers, dWs, dbs)
  Updates the weights and biases of the layers using Adam optimization.
  Args:
    layers: (list) - A list of layers to update.
    dWs: (list) - Gradients of the weights for each layer.
    dbs: (list) - Gradients of the biases for each layer.

Data descriptors defined here:

- __dict__
  dictionary for instance variables
- __weakref__
  list of weak references to the object
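For reference, here is a minimal sketch of the documented Adam formula applied to a single weight array, assuming CuPy arrays. The `adam_step` helper, the per-array calling convention, and the 1-based timestep `t` used to form the bias-corrected estimates m_hat and v_hat are illustrative assumptions rather than the documented API.

```python
import cupy as cp

# Hypothetical helper mirroring the documented Adam formula for one parameter
# array. Bias correction via the timestep t is assumed from the m_hat / v_hat
# terms; it is not spelled out in the docstring.
def adam_step(W, dW, m, v, t,
              learning_rate=0.001, beta1=0.9, beta2=0.999,
              epsilon=1e-8, reg_lambda=0.01):
    m = beta1 * m + (1 - beta1) * dW          # first moment estimate
    v = beta2 * v + (1 - beta2) * dW ** 2     # second moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    # w = w - alpha * m_hat / (sqrt(v_hat) + epsilon) - lambda * w
    W = W - learning_rate * m_hat / (cp.sqrt(v_hat) + epsilon) - reg_lambda * W
    return W, m, v

# One step (t=1) with zero-initialized moment estimates.
W, dW = cp.random.randn(4, 3), cp.random.randn(4, 3)
m, v = cp.zeros_like(W), cp.zeros_like(W)
W, m, v = adam_step(W, dW, m, v, t=1)
```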
class CuPySGDOptimizer(builtins.object)
CuPySGDOptimizer(learning_rate=0.001, momentum=0.0, reg_lambda=0.0)
Stochastic Gradient Descent (SGD) optimizer class for training neural networks.

Formula:
  v = momentum * v - learning_rate * dW
  w = w + v - learning_rate * reg_lambda * w

Args:
  learning_rate (float, optional): The learning rate for the optimizer. Defaults to 0.001.
  momentum (float, optional): The momentum factor. Defaults to 0.0.
  reg_lambda (float, optional): The regularization parameter. Defaults to 0.0.
Methods defined here:

- __init__(self, learning_rate=0.001, momentum=0.0, reg_lambda=0.0)
  Initializes the optimizer with specified hyperparameters.
  Args:
    learning_rate: (float), optional - The step size for updating weights (default is 0.001).
    momentum: (float), optional - The momentum factor for accelerating gradient descent (default is 0.0).
    reg_lambda: (float), optional - The regularization strength to prevent overfitting (default is 0.0).
  Attributes:
    velocity: (None or np.ndarray) - The velocity term used for momentum-based updates (initialized as None).
- initialize(self, layers)
  Initializes the optimizer's velocity for the given layers.
  Args:
    layers: (list) - A list of layers, each containing weights and biases.
- update_layers(self, layers, dWs, dbs)
  Updates the weights and biases of the layers using SGD optimization.
  Args:
    layers: (list) - A list of layers to update.
    dWs: (list) - Gradients of the weights for each layer.
    dbs: (list) - Gradients of the biases for each layer.

Data descriptors defined here:

- __dict__
  dictionary for instance variables
- __weakref__
  list of weak references to the object
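The momentum update applied by `update_layers` corresponds directly to the two-line formula above. Below is a minimal sketch for a single weight array, assuming CuPy arrays; the `sgd_step` helper is illustrative, not part of the documented API.

```python
import cupy as cp

# Hypothetical helper mirroring the documented momentum-SGD formula for one
# parameter array.
def sgd_step(W, dW, velocity,
             learning_rate=0.001, momentum=0.0, reg_lambda=0.0):
    velocity = momentum * velocity - learning_rate * dW  # v = momentum * v - lr * dW
    W = W + velocity - learning_rate * reg_lambda * W    # w = w + v - lr * reg_lambda * w
    return W, velocity

# One step with momentum, starting from the zero velocity that initialize()
# would set up for each layer.
W, dW = cp.random.randn(5, 2), cp.random.randn(5, 2)
velocity = cp.zeros_like(W)
W, velocity = sgd_step(W, dW, velocity, momentum=0.9)
```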