- builtins.object
  - CuPyAdadeltaOptimizer
  - CuPyAdamOptimizer
  - CuPySGDOptimizer
class CuPyAdadeltaOptimizer(builtins.object)
CuPyAdadeltaOptimizer(learning_rate=1.0, rho=0.95, epsilon=1e-06, reg_lambda=0.0)
Adadelta optimizer class for training neural networks.

Formula:
  E[g^2]_t = rho * E[g^2]_{t-1} + (1 - rho) * g^2
  Delta_x = - (sqrt(E[delta_x^2]_{t-1} + epsilon) / sqrt(E[g^2]_t + epsilon)) * g
  E[delta_x^2]_t = rho * E[delta_x^2]_{t-1} + (1 - rho) * Delta_x^2

Derived from: https://arxiv.org/abs/1212.5701

Args:
  learning_rate (float, optional): The learning rate for the optimizer. Defaults to 1.0.
  rho (float, optional): The decay rate. Defaults to 0.95.
  epsilon (float, optional): A small value to prevent division by zero. Defaults to 1e-6.
  reg_lambda (float, optional): The regularization parameter. Defaults to 0.0.
Methods defined here:

- __init__(self, learning_rate=1.0, rho=0.95, epsilon=1e-06, reg_lambda=0.0)
  Initializes the optimizer with the specified hyperparameters.
  Args:
    learning_rate: (float), optional - The learning rate for the optimizer (default is 1.0).
    rho: (float), optional - The decay rate for the moving average of squared gradients (default is 0.95).
    epsilon: (float), optional - A small constant to prevent division by zero (default is 1e-6).
    reg_lambda: (float), optional - The regularization parameter for weight decay (default is 0.0).
  Attributes:
    E_g2: (None or np.ndarray) - The moving average of squared gradients, initialized as None.
    E_delta_x2: (None or np.ndarray) - The moving average of squared parameter updates, initialized as None.
- initialize(self, layers)
  Initializes the optimizer's internal state for the given layers.
  Args:
    layers: (list) - A list of layers, each containing weights and biases.
- update_layers(self, layers, dWs, dbs)
  Updates the weights and biases of the layers using Adadelta optimization.
  Args:
    layers: (list) - A list of layers to update.
    dWs: (list) - Gradients of the weights for each layer.
    dbs: (list) - Gradients of the biases for each layer.

Data descriptors defined here:

- __dict__
  dictionary for instance variables
- __weakref__
  list of weak references to the object
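The per-layer update performed by `update_layers` follows the formula above. Below is a minimal sketch of that step for a single weight array, assuming CuPy arrays for the parameters, gradients, and running averages; the standalone `adadelta_step` helper and the way `reg_lambda` is applied (plain L2 weight decay) are illustrative assumptions, not part of the documented API.

```python
import cupy as cp

# Hypothetical helper mirroring the documented Adadelta formula for one
# parameter array. How reg_lambda enters the update is an assumption here
# (simple L2 weight decay scaled by the learning rate).
def adadelta_step(W, dW, E_g2, E_dx2,
                  learning_rate=1.0, rho=0.95, epsilon=1e-6, reg_lambda=0.0):
    E_g2 = rho * E_g2 + (1 - rho) * dW ** 2                              # E[g^2]_t
    delta = -(cp.sqrt(E_dx2 + epsilon) / cp.sqrt(E_g2 + epsilon)) * dW   # Delta_x
    E_dx2 = rho * E_dx2 + (1 - rho) * delta ** 2                         # E[delta_x^2]_t
    W = W + learning_rate * delta - learning_rate * reg_lambda * W
    return W, E_g2, E_dx2

# One update of a 3x2 weight matrix with zero-initialized state,
# mirroring what initialize() would set up for each layer.
W, dW = cp.random.randn(3, 2), cp.random.randn(3, 2)
E_g2, E_dx2 = cp.zeros_like(W), cp.zeros_like(W)
W, E_g2, E_dx2 = adadelta_step(W, dW, E_g2, E_dx2)
```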
class CuPyAdamOptimizer(builtins.object)
CuPyAdamOptimizer(learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-08, reg_lambda=0.01)
Adam optimizer class for training neural networks.

Formula:
  w = w - alpha * m_hat / (sqrt(v_hat) + epsilon) - lambda * w

Derived from: https://arxiv.org/abs/1412.6980

Args:
  learning_rate (float, optional): The learning rate for the optimizer. Defaults to 0.001.
  beta1 (float, optional): The exponential decay rate for the first moment estimates. Defaults to 0.9.
  beta2 (float, optional): The exponential decay rate for the second moment estimates. Defaults to 0.999.
  epsilon (float, optional): A small value to prevent division by zero. Defaults to 1e-8.
  reg_lambda (float, optional): The regularization parameter. Defaults to 0.01.
Methods defined here:

- __init__(self, learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-08, reg_lambda=0.01)
  Initializes the optimizer with the specified hyperparameters.
  Args:
    learning_rate: (float), optional - The step size for updating weights (default is 0.001).
    beta1: (float), optional - Exponential decay rate for the first moment estimates (default is 0.9).
    beta2: (float), optional - Exponential decay rate for the second moment estimates (default is 0.999).
    epsilon: (float), optional - A small constant to prevent division by zero (default is 1e-8).
    reg_lambda: (float), optional - Regularization parameter for weight decay (default is 0.01).
- initialize(self, layers)
  Initializes the optimizer's internal state for the given layers.
  Args:
    layers: (list) - A list of layers, each containing weights and biases.
- update_layers(self, layers, dWs, dbs)
  Updates the weights and biases of the layers using Adam optimization.
  Args:
    layers: (list) - A list of layers to update.
    dWs: (list) - Gradients of the weights for each layer.
    dbs: (list) - Gradients of the biases for each layer.

Data descriptors defined here:

- __dict__
  dictionary for instance variables
- __weakref__
  list of weak references to the object
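For reference, here is a minimal sketch of the documented Adam formula applied to a single weight array, assuming CuPy arrays. The `adam_step` helper, the per-array calling convention, and the 1-based timestep `t` used to form the bias-corrected estimates m_hat and v_hat are illustrative assumptions rather than the documented API.

```python
import cupy as cp

# Hypothetical helper mirroring the documented Adam formula for one parameter
# array. Bias correction via the timestep t is assumed from the m_hat / v_hat
# terms; it is not spelled out in the docstring.
def adam_step(W, dW, m, v, t,
              learning_rate=0.001, beta1=0.9, beta2=0.999,
              epsilon=1e-8, reg_lambda=0.01):
    m = beta1 * m + (1 - beta1) * dW          # first moment estimate
    v = beta2 * v + (1 - beta2) * dW ** 2     # second moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    # w = w - alpha * m_hat / (sqrt(v_hat) + epsilon) - lambda * w
    W = W - learning_rate * m_hat / (cp.sqrt(v_hat) + epsilon) - reg_lambda * W
    return W, m, v

# One step (t=1) with zero-initialized moment estimates.
W, dW = cp.random.randn(4, 3), cp.random.randn(4, 3)
m, v = cp.zeros_like(W), cp.zeros_like(W)
W, m, v = adam_step(W, dW, m, v, t=1)
```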
class CuPySGDOptimizer(builtins.object)
CuPySGDOptimizer(learning_rate=0.001, momentum=0.0, reg_lambda=0.0)
Stochastic Gradient Descent (SGD) optimizer class for training neural networks.

Formula:
  v = momentum * v - learning_rate * dW
  w = w + v - learning_rate * reg_lambda * w

Args:
  learning_rate (float, optional): The learning rate for the optimizer. Defaults to 0.001.
  momentum (float, optional): The momentum factor. Defaults to 0.0.
  reg_lambda (float, optional): The regularization parameter. Defaults to 0.0.
Methods defined here:

- __init__(self, learning_rate=0.001, momentum=0.0, reg_lambda=0.0)
  Initializes the optimizer with specified hyperparameters.
  Args:
    learning_rate: (float), optional - The step size for updating weights (default is 0.001).
    momentum: (float), optional - The momentum factor for accelerating gradient descent (default is 0.0).
    reg_lambda: (float), optional - The regularization strength to prevent overfitting (default is 0.0).
  Attributes:
    velocity: (None or np.ndarray) - The velocity term used for momentum-based updates (initialized as None).
- initialize(self, layers)
  Initializes the optimizer's velocity for the given layers.
  Args:
    layers: (list) - A list of layers, each containing weights and biases.
- update_layers(self, layers, dWs, dbs)
  Updates the weights and biases of the layers using SGD optimization.
  Args:
    layers: (list) - A list of layers to update.
    dWs: (list) - Gradients of the weights for each layer.
    dbs: (list) - Gradients of the biases for each layer.

Data descriptors defined here:

- __dict__
  dictionary for instance variables
- __weakref__
  list of weak references to the object
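The momentum update applied by `update_layers` corresponds directly to the two-line formula above. Below is a minimal sketch for a single weight array, assuming CuPy arrays; the `sgd_step` helper is illustrative, not part of the documented API.

```python
import cupy as cp

# Hypothetical helper mirroring the documented momentum-SGD formula for one
# parameter array.
def sgd_step(W, dW, velocity,
             learning_rate=0.001, momentum=0.0, reg_lambda=0.0):
    velocity = momentum * velocity - learning_rate * dW  # v = momentum * v - lr * dW
    W = W + velocity - learning_rate * reg_lambda * W    # w = w + v - lr * reg_lambda * w
    return W, velocity

# One step with momentum, starting from the zero velocity that initialize()
# would set up for each layer.
W, dW = cp.random.randn(5, 2), cp.random.randn(5, 2)
velocity = cp.zeros_like(W)
W, velocity = sgd_step(W, dW, velocity, momentum=0.9)
```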