Python: module sega_learn.linear

sega_learn.linear_models.regressors

Modules

scipy.linalg

numpy

warnings

Classes

builtins.object

Bayesian
Lasso
OrdinaryLeastSquares
PassiveAggressiveRegressor
RANSAC
Ridge

class Bayesian(builtins.object)

Bayesian(max_iter=300, tol=0.001, alpha_1=1e-06, alpha_2=1e-06, lambda_1=1e-06, lambda_2=1e-06, fit_intercept=None)

Bayesian Regression model, fitted using Coordinate Descent.

Args:
    max_iter: (int), optional - The maximum number of iterations to perform (default is 300).
    tol: (float), optional - The convergence threshold. The algorithm stops when the coefficients change less than this threshold (default is 0.001).
    alpha_1: (float), optional - The shape parameter for the prior on the weights (default is 1e-06).
    alpha_2: (float), optional - The scale parameter for the prior on the weights (default is 1e-06).
    lambda_1: (float), optional - The shape parameter for the prior on the noise (default is 1e-06).
    lambda_2: (float), optional - The scale parameter for the prior on the noise (default is 1e-06).
    fit_intercept: (bool), optional - Whether to calculate the intercept for this model (default is None).

Attributes:
    intercept_: (float) - The intercept of the model.
    coef_: (np.ndarray) - Estimated coefficients for the linear regression problem. If `fit_intercept` is True, the first element is the intercept.
    n_iter_: (int) - The number of iterations performed.
    alpha_: (float) - The precision of the weights.
    lambda_: (float) - The precision of the noise.
    sigma_: (np.ndarray) - The posterior covariance of the weights.

Methods defined here:

__init__(self, max_iter=300, tol=0.001, alpha_1=1e-06, alpha_2=1e-06, lambda_1=1e-06, lambda_2=1e-06, fit_intercept=None): Implements Bayesian Regression using Coordinate Descent.

Bayesian regression regularizes the fit through priors on the weights and the noise, which helps to prevent overfitting. The resulting precisions are exposed as `alpha_` and `lambda_`.

Args:
    max_iter: (int) - The maximum number of iterations to perform (default is 300).
    tol: (float) - The convergence threshold. The algorithm stops when the coefficients change less than this threshold (default is 0.001).
    alpha_1: (float) - The shape parameter for the prior on the weights (default is 1e-06).
    alpha_2: (float) - The scale parameter for the prior on the weights (default is 1e-06).
    lambda_1: (float) - The shape parameter for the prior on the noise (default is 1e-06).
    lambda_2: (float) - The scale parameter for the prior on the noise (default is 1e-06).
    fit_intercept: (bool), optional - Whether to calculate the intercept for this model (default is None).

Attributes:
    intercept_: (float) - The intercept of the model.
    coef_: (np.ndarray) - Estimated coefficients for the linear regression problem. If `fit_intercept` is True, the first element is the intercept.
    n_iter_: (int) - The number of iterations performed.
    alpha_: (float) - The precision of the weights.
    lambda_: (float) - The precision of the noise.
    sigma_: (np.ndarray) - The posterior covariance of the weights.

__str__(self): Returns the string representation of the model.

fit(self, X, y): Fits the Bayesian Regression model to the training data.

Args:
    X: (np.ndarray) - Training feature data of shape (n_samples, n_features).
    y: (np.ndarray) - Training target data of shape (n_samples,).

Returns:
    self: (Bayesian) - The fitted Bayesian Regression model.
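
For intuition about `coef_`, `alpha_`, `lambda_`, and `sigma_`, the sketch below computes the Gaussian posterior over the weights when both precisions are held fixed. It follows this page's naming (`alpha` for the weight precision, `lam` for the noise precision) and uses the textbook closed form; it is not sega_learn's iterative `fit`. The helper name `posterior_for_fixed_precisions` and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def posterior_for_fixed_precisions(X, y, alpha, lam):
    """Posterior mean and covariance of the weights.

    Assumes a prior N(0, I / alpha) on the weights and Gaussian noise
    with precision lam. Hypothetical helper, not part of sega_learn.
    """
    n_features = X.shape[1]
    # Posterior precision = prior precision + data term.
    precision = alpha * np.eye(n_features) + lam * X.T @ X
    sigma = np.linalg.inv(precision)
    mean = lam * sigma @ X.T @ y
    return mean, sigma

# Synthetic data (assumption): two features, no intercept, noise std 0.1.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_w = np.array([1.5, -2.0])
y = X @ true_w + 0.1 * rng.normal(size=200)

# lam = 1 / 0.1**2 matches the noise level used above.
mean, sigma = posterior_for_fixed_precisions(X, y, alpha=1e-6, lam=100.0)
```

In a full Bayesian regression the two precisions are re-estimated between such posterior updates until the coefficients change by less than `tol`.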

get_formula(self): Computes the formula of the model.

Returns:
    formula: (str) - The formula of the model as a string.

predict(self, X): Predicts the target values using the Bayesian Regression model.

Args:
    X: (np.ndarray) - Feature data of shape (n_samples, n_features).

Returns:
    y_pred: (np.ndarray) - Predicted target values of shape (n_samples,).

tune(self, X, y, beta1=0.9, beta2=0.999, iter=1000): Tunes the hyperparameters alpha_1, alpha_2, lambda_1, and lambda_2 using the Adam optimizer.

Args:
    X: (np.ndarray) - Training feature data of shape (n_samples, n_features).
    y: (np.ndarray) - Training target data of shape (n_samples,).
    beta1: (float), optional - The exponential decay rate for the first moment estimates (default is 0.9).
    beta2: (float), optional - The exponential decay rate for the second moment estimates (default is 0.999).
    iter: (int), optional - The maximum number of iterations to perform (default is 1000).

Returns:
    best_alpha_1: (float) - The best value of alpha_1.
    best_alpha_2: (float) - The best value of alpha_2.
    best_lambda_1: (float) - The best value of lambda_1.
    best_lambda_2: (float) - The best value of lambda_2.
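
The roles of `beta1` and `beta2` are easiest to see in a single Adam update. The sketch below applies Adam to a one-dimensional quadratic rather than to the hyperparameters themselves; the function name `adam_minimize`, the learning rate `lr`, and `eps` are assumptions, and the loop count is named `n_iter` here only to avoid shadowing Python's built-in `iter`.

```python
import math

def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, n_iter=1000):
    """Minimize a scalar function given its gradient, using Adam.

    Hypothetical helper for illustration; not sega_learn's tune().
    """
    x, m, v = x0, 0.0, 0.0
    for t in range(1, n_iter + 1):
        g = grad(x)
        # Exponential moving averages of the gradient and its square.
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        # Bias correction for the zero initialization of m and v.
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
x_min = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
```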

Data descriptors defined here:

__dict__: dictionary for instance variables

__weakref__: list of weak references to the object

class Lasso(builtins.object)

Lasso(alpha=1.0, fit_intercept=True, max_iter=10000, tol=0.0001, compile_numba=False)

Lasso Regression model.

Lasso regression implements L1 regularization, which helps to prevent overfitting by adding a penalty term to the loss function.

Args:
    alpha: (float) - Regularization strength; must be a positive float (default is 1.0).
    fit_intercept: (bool), optional - Whether to calculate the intercept for this model (default is True).
    max_iter: (int), optional - Maximum number of iterations for the coordinate descent solver (default is 10000).
    tol: (float), optional - Tolerance for the optimization. The optimization stops when the change in the coefficients is less than this tolerance (default is 1e-4).
    compile_numba: (bool), optional - Whether to precompile the numba functions (default is False). If True, the numba fitting functions will be compiled before use.

Attributes:
    coef_: (np.ndarray) - Estimated coefficients for the linear regression problem. If `fit_intercept` is True, the first element is the intercept.
    intercept_: (float) - Independent term in the linear model. Set to 0.0 if `fit_intercept` is False.

Methods defined here:

__init__(self, alpha=1.0, fit_intercept=True, max_iter=10000, tol=0.0001, compile_numba=False): Initializes the Lasso Regression model.

Lasso regression implements L1 regularization, which helps to prevent overfitting by adding a penalty term to the loss function.

Args:
    alpha: (float) - Regularization strength; must be a positive float (default is 1.0).
    fit_intercept: (bool), optional - Whether to calculate the intercept for this model (default is True).
    max_iter: (int), optional - Maximum number of iterations for the coordinate descent solver (default is 10000).
    tol: (float), optional - Tolerance for the optimization. The optimization stops when the change in the coefficients is less than this tolerance (default is 1e-4).
    compile_numba: (bool), optional - Whether to precompile the numba functions (default is False). If True, the numba fitting functions will be compiled before use.

__str__(self): Returns the string representation of the model.

fit(self, X, y, numba=False): Fits the Lasso Regression model to the training data using coordinate descent.

Args:
    X: (np.ndarray) - Training feature data of shape (n_samples, n_features).
    y: (np.ndarray) - Training target data of shape (n_samples,).
    numba: (bool), optional - Whether to use numba for faster computation (default is False).

Returns:
    self: (Lasso) - The fitted Lasso Regression model.
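
Coordinate descent for the L1 penalty can be sketched in a few lines: each coefficient is updated in turn by soft-thresholding its correlation with the partial residual. The sketch below is a minimal version under stated assumptions: the objective is scaled as (1 / (2 n)) * ||y - X w||^2 + alpha * ||w||_1, no intercept is fitted, and the names `soft_threshold` and `lasso_coordinate_descent` are hypothetical. sega_learn's own solver may scale the objective differently.

```python
import numpy as np

def soft_threshold(rho, threshold):
    """Shrink rho toward zero by threshold; exactly zero inside the band."""
    return np.sign(rho) * max(abs(rho) - threshold, 0.0)

def lasso_coordinate_descent(X, y, alpha=1.0, max_iter=10000, tol=1e-4):
    """Cyclic coordinate descent for the Lasso (illustrative sketch)."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    col_sq = (X ** 2).sum(axis=0) / n_samples
    for _ in range(max_iter):
        w_old = w.copy()
        for j in range(n_features):
            if col_sq[j] == 0.0:
                continue  # an all-zero column cannot reduce the loss
            # Residual with feature j's current contribution removed.
            residual = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ residual / n_samples
            w[j] = soft_threshold(rho, alpha) / col_sq[j]
        # Stop when no coefficient moved by more than tol in a full sweep.
        if np.max(np.abs(w - w_old)) < tol:
            break
    return w

# Synthetic data (assumption): the middle feature is irrelevant.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, 0.0, -1.0]) + 0.05 * rng.normal(size=200)
w = lasso_coordinate_descent(X, y, alpha=0.1)
```

The soft-thresholding step is what lets the L1 penalty set coefficients of irrelevant features exactly to zero, while the remaining coefficients are shrunk toward zero by roughly `alpha`.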

get_formula(self): Computes the formula of the model.

Returns:
    formula: (str) - The formula of the model as a string.

predict(self, X): Predicts the target values using the Lasso Regression model.

Args:
    X: (np.ndarray) - Feature data of shape (n_samples, n_features).

Returns:
    y_pred: (np.ndarray) - Predicted target values of shape (n_samples,).

Data descriptors defined here:

__dict__: dictionary for instance variables

__weakref__: list of weak references to the object

class OrdinaryLeastSquares(builtins.object)

OrdinaryLeastSquares(fit_intercept=True) -> None

Ordinary Least Squares (OLS) linear regression model.

Attributes:
    coef_ : ndarray of shape (n_features,) or (n_features + 1,) - Estimated coefficients for the linear regression problem. If `fit_intercept` is True, the first element is the intercept.
    intercept_ : float - Independent term in the linear model. Set to 0.0 if `fit_intercept` is False.

Methods:
    fit(X, y): Fit the linear model to the data.
    predict(X): Predict using the linear model.
    get_formula(): Returns the formula of the model as a string.

Methods defined here:

__init__(self, fit_intercept=True) -> None: Initializes the OrdinaryLeastSquares object.

Args:
    fit_intercept: (bool) - Whether to calculate the intercept for this model (default is True).

__str__(self): Returns the string representation of the model.

fit(self, X, y): Fits the linear regression model to the training data.

Args:
    X: (np.ndarray) - Training feature data of shape (n_samples, n_features).
    y: (np.ndarray) - Training target data of shape (n_samples,).

Returns:
    self: (OrdinaryLeastSquares) - The fitted linear regression model.
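
The least-squares fit itself can be reproduced with plain NumPy. The sketch below prepends a column of ones so that the first coefficient acts as the intercept, mirroring the `coef_` layout described above. It uses `np.linalg.lstsq` on synthetic data and is an illustration of the technique, not a copy of the class's internals.

```python
import numpy as np

# Synthetic, noise-free data (assumption): y = 4 + 1 * x0 - 3 * x1.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 4.0 + X @ np.array([1.0, -3.0])

# Prepend a column of ones so the first coefficient is the intercept.
X_design = np.column_stack([np.ones(len(X)), X])

# Least-squares solution of X_design @ coef ~= y.
coef, *_ = np.linalg.lstsq(X_design, y, rcond=None)
intercept, weights = coef[0], coef[1:]

# Predictions are the design matrix times the coefficients.
y_pred = X_design @ coef
```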

get_formula(self): Returns the formula of the model as a string.

Returns:
    formula: (str) - The formula of the model.

predict(self, X): Predicts the target values using the linear model.

Args:
    X: (np.ndarray) - Feature data of shape (n_samples, n_features).

Returns:
    y_pred: (np.ndarray) - Predicted target values of shape (n_samples,).

Data descriptors defined here:

__dict__: dictionary for instance variables

__weakref__: list of weak references to the object

class PassiveAggressiveRegressor(builtins.object)

PassiveAggressiveRegressor(C=1.0, max_iter=1000, tol=0.001)

Passive Aggressive Regression model.

Args:
    C: (float) - Regularization parameter/step size (default is 1.0).
    max_iter: (int) - The maximum number of passes over the training data (default is 1000).
    tol: (float) - The stopping criterion (default is 1e-3).

Attributes:
    coef_: (np.ndarray) - Estimated coefficients for the regression problem.
    intercept_: (float) - Independent term in the linear model.
    steps_: (list of tuples), optional - The weights and intercept at each iteration if `save_steps` is True.

Methods defined here:

__init__(self, C=1.0, max_iter=1000, tol=0.001): Initializes the Passive Aggressive Regression model.

Args:
    C: (float) - Regularization parameter/step size (default is 1.0).
    max_iter: (int) - The maximum number of passes over the training data (default is 1000).
    tol: (float) - The stopping criterion (default is 1e-3).

Attributes:
    coef_: (np.ndarray) - Estimated coefficients for the regression problem.
    intercept_: (float) - Independent term in the linear model.

__str__(self): Returns the string representation of the model.

fit(self, X, y, save_steps=False, verbose=False): Fits the Passive Aggressive Regression model to the training data.

Args:
    X: (np.ndarray) - Training feature data of shape (n_samples, n_features).
    y: (np.ndarray) - Training target data of shape (n_samples,).
    save_steps: (bool), optional - Whether to save the weights and intercept at each iteration (default is False).
    verbose: (bool), optional - If True, prints progress during training (default is False).

Returns:
    None
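
The name comes from the per-sample update rule: the model stays passive when a prediction is already close enough, and otherwise moves aggressively, just far enough to correct that sample, with the step capped by `C`. The sketch below is one common variant (PA-I with an epsilon-insensitive loss), written under assumptions: the `epsilon` parameter, the function name `pa_regression_fit`, and the treatment of the intercept as a constant feature are not taken from sega_learn.

```python
import numpy as np

def pa_regression_fit(X, y, C=1.0, max_iter=1000, tol=1e-3, epsilon=0.0):
    """Passive-aggressive regression, PA-I variant (illustrative sketch)."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(max_iter):
        w_old = w.copy()
        for x_i, y_i in zip(X, y):
            error = y_i - (x_i @ w + b)
            loss = max(0.0, abs(error) - epsilon)
            if loss == 0.0:
                continue  # passive: prediction is within the epsilon tube
            # Aggressive: smallest step that fixes this sample, capped by C.
            # The +1.0 accounts for the constant feature carrying the intercept.
            tau = min(C, loss / (x_i @ x_i + 1.0))
            w += tau * np.sign(error) * x_i
            b += tau * np.sign(error)
        # Stop once a full pass changes no weight by more than tol.
        if np.max(np.abs(w - w_old)) < tol:
            break
    return w, b

# Synthetic, noise-free data (assumption): y = 0.5 + 2 * x0 - 1 * x1.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 0.5 + X @ np.array([2.0, -1.0])
w, b = pa_regression_fit(X, y, C=1.0, epsilon=0.01)
```

A smaller `C` makes each correction more conservative, which matters mainly when the targets are noisy.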

get_formula(self): Computes the formula of the model.

Returns:
    formula: (str) - The formula of the model.

predict(self, X): Predicts using the linear model, computed as the dot product of X and the coefficients.

predict_all_steps(self, X): Predicts using the weights saved at each iteration. Requires a model fitted with save_steps=True.

Data descriptors defined here:

__dict__: dictionary for instance variables

__weakref__: list of weak references to the object

class RANSAC(builtins.object)

RANSAC(n=10, k=100, t=0.05, d=10, model=None, auto_scale_t=False, scale_t_factor=2, auto_scale_n=False, scale_n_factor=2)

RANSAC (RANdom SAmple Consensus) algorithm for robust linear regression.

Args:
    n: (int), optional - Number of data points to estimate parameters (default is 10).
    k: (int), optional - Maximum iterations allowed (default is 100).
    t: (float), optional - Threshold value to determine if points are fit well, in terms of residuals (default is 0.05).
    d: (int), optional - Number of close data points required to assert model fits well (default is 10).
    model: (object), optional - The model to use for fitting. If None, uses Ordinary Least Squares (default is None).
    auto_scale_t: (bool), optional - Whether to automatically scale the threshold until a model is fit (default is False).
    scale_t_factor: (float), optional - Factor by which to scale the threshold until a model is fit (default is 2).
    auto_scale_n: (bool), optional - Whether to automatically scale the number of data points until a model is fit (default is False).
    scale_n_factor: (float), optional - Factor by which to scale the number of data points until a model is fit (default is 2).

Attributes:
    best_fit: (object) - The best model fit.
    best_error: (float) - The best error achieved by the model.
    best_n: (int) - The best number of data points used to fit the model.
    best_t: (float) - The best threshold value used to determine if points are fit well, in terms of residuals.
    best_model: (object) - The best model fit.

Methods defined here:

__init__(self, n=10, k=100, t=0.05, d=10, model=None, auto_scale_t=False, scale_t_factor=2, auto_scale_n=False, scale_n_factor=2): Initializes the RANSAC (RANdom SAmple Consensus) model for robust linear regression.

Args:
    n: (int), optional - Number of data points to estimate parameters (default is 10).
    k: (int), optional - Maximum iterations allowed (default is 100).
    t: (float), optional - Threshold value to determine if points are fit well, in terms of residuals (default is 0.05).
    d: (int), optional - Number of close data points required to assert model fits well (default is 10).
    model: (object), optional - The model to use for fitting. If None, uses Ordinary Least Squares (default is None).
    auto_scale_t: (bool), optional - Whether to automatically scale the threshold until a model is fit (default is False).
    scale_t_factor: (float), optional - Factor by which to scale the threshold until a model is fit (default is 2).
    auto_scale_n: (bool), optional - Whether to automatically scale the number of data points until a model is fit (default is False).
    scale_n_factor: (float), optional - Factor by which to scale the number of data points until a model is fit (default is 2).

Attributes:
    best_fit: (object) - The best model fit.
    best_error: (float) - The best error achieved by the model.
    best_n: (int) - The best number of data points used to fit the model.
    best_t: (float) - The best threshold value used to determine if points are fit well, in terms of residuals.
    best_model: (object) - The best model fit.

__str__(self): Returns the string representation of the model.

fit(self, X, y): Fits the RANSAC model to the training data.

Args:
    X: (np.ndarray) - Training feature data of shape (n_samples, n_features).
    y: (np.ndarray) - Training target data of shape (n_samples,).

Returns:
    None
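
How `n`, `k`, `t`, and `d` interact is clearest in the basic RANSAC loop: fit on `n` random points, count how many other points fall within `t` of that fit, and keep the candidate only if at least `d` do. The sketch below follows the standard algorithm with a least-squares base model; the function name `ransac_linear_fit`, the `seed` argument, and the mean-squared-error score are assumptions, and the auto-scaling options are left out.

```python
import numpy as np

def ransac_linear_fit(X, y, n=10, k=100, t=0.05, d=10, seed=0):
    """Basic RANSAC with a least-squares line/plane (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    X_design = np.column_stack([np.ones(len(X)), X])
    best_coef, best_error = None, np.inf
    for _ in range(k):
        # Fit a candidate model on n randomly chosen points.
        sample = rng.choice(len(X), size=n, replace=False)
        coef, *_ = np.linalg.lstsq(X_design[sample], y[sample], rcond=None)
        # Points outside the sample whose residual is below t.
        in_sample = np.zeros(len(X), dtype=bool)
        in_sample[sample] = True
        close = (np.abs(y - X_design @ coef) < t) & ~in_sample
        if close.sum() < d:
            continue  # too little support for this candidate
        # Refit on the sample plus its supporters and score the result.
        keep = close | in_sample
        coef, *_ = np.linalg.lstsq(X_design[keep], y[keep], rcond=None)
        error = np.mean((y[keep] - X_design[keep] @ coef) ** 2)
        if error < best_error:
            best_coef, best_error = coef, error
    return best_coef, best_error

# Synthetic data (assumption): y = 1 + 2x with 20% gross outliers.
rng = np.random.default_rng(42)
x = rng.uniform(-5.0, 5.0, size=100)
y = 1.0 + 2.0 * x + 0.01 * rng.normal(size=100)
y[:20] += rng.uniform(3.0, 10.0, size=20)
X = x.reshape(-1, 1)

best_coef, best_error = ransac_linear_fit(X, y, n=5, k=100, t=0.05, d=10)
```

If no candidate ever gathers `d` supporters, this sketch returns `None` for the coefficients; loosening `t` or changing `n` is the usual remedy, which is what the `auto_scale_t` and `auto_scale_n` options described above are for.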

get_formula(self): Computes the formula of the model if fit, else returns "No model fit available".

predict(self, X): Predicts the target values using the best fit model.

Args:
    X: (np.ndarray) - Feature data of shape (n_samples, n_features).

Returns:
    y_pred: (np.ndarray) - Predicted target values of shape (n_samples,).

Data descriptors defined here:

__dict__: dictionary for instance variables

__weakref__: list of weak references to the object

class Ridge(builtins.object)

Ridge(alpha=1.0, fit_intercept=True, max_iter=10000, tol=0.0001, compile_numba=False)

Fits the Ridge Regression model to the training data.

Ridge regression implements L2 regularization, which helps to prevent overfitting by adding a penalty term to the loss function.

Args:
    alpha: (float) - Regularization strength; must be a positive float (default is 1.0).
    fit_intercept: (bool), optional - Whether to calculate the intercept for this model (default is True).
    max_iter: (int), optional - Maximum number of iterations for the coordinate descent solver (default is 10000).
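    tol: (float), optional - Tolerance for the optimization. The optimization stops when the change in the coefficients is less than this tolerance (default is 1e-4).
    compile_numba: (bool), optional - Whether to precompile the numba functions (default is False). If True, the numba fitting functions will be compiled before use.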

Attributes:
    coef_: (np.ndarray) - Estimated coefficients for the linear regression problem. If `fit_intercept` is True, the first element is the intercept.
    intercept_: (float) - Independent term in the linear model. Set to 0.0 if `fit_intercept` is False.

Methods:
    fit(X, y): Fits the Ridge Regression model to the training data.
    predict(X): Predicts using the Ridge Regression model.
    get_formula(): Returns the formula of the model as a string.

Methods defined here:

__init__(self, alpha=1.0, fit_intercept=True, max_iter=10000, tol=0.0001, compile_numba=False): Initializes the Ridge Regression model.

Ridge regression implements L2 regularization, which helps to prevent overfitting by adding a penalty term to the loss function.

Args:
    alpha: (float) - Regularization strength; must be a positive float (default is 1.0).
    fit_intercept: (bool), optional - Whether to calculate the intercept for this model (default is True).
    max_iter: (int), optional - Maximum number of iterations for the coordinate descent solver (default is 10000).
    tol: (float), optional - Tolerance for the optimization. The optimization stops when the change in the coefficients is less than this tolerance (default is 1e-4).
    compile_numba: (bool), optional - Whether to precompile the numba functions (default is False). If True, the numba fitting functions will be compiled before use.

__str__(self): Returns the string representation of the model.

fit(self, X, y, numba=False): Fits the Ridge Regression model to the training data.

Args:
    X: (np.ndarray) - Training feature data of shape (n_samples, n_features).
    y: (np.ndarray) - Training target data of shape (n_samples,).
    numba: (bool), optional - Whether to use numba for faster computation (default is False).

Returns:
    self: (Ridge) - The fitted Ridge Regression model.
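
The effect of `alpha` can be checked against the closed-form ridge solution. The sketch below solves the normal equations directly instead of using coordinate descent, so it is a different technique from the solver named above; the function name `ridge_closed_form`, the unscaled objective, and the absence of an intercept are assumptions.

```python
import numpy as np

def ridge_closed_form(X, y, alpha=1.0):
    """Solve min_w ||y - X w||^2 + alpha * ||w||^2 via the normal equations."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

# Synthetic data (assumption): three features, no intercept.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, 2.0, 3.0]) + 0.1 * rng.normal(size=50)

# A tiny alpha is close to ordinary least squares; a large one shrinks the weights.
w_small = ridge_closed_form(X, y, alpha=1e-8)
w_large = ridge_closed_form(X, y, alpha=100.0)
```

Unlike the L1 penalty, the L2 penalty shrinks all coefficients smoothly and does not set any of them exactly to zero.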

get_formula(self): Computes the formula of the model.

Returns:
    formula: (str) - The formula of the model as a string.

predict(self, X): Predicts the target values using the Ridge Regression model.

Args:
    X: (np.ndarray) - Feature data of shape (n_samples, n_features).

Returns:
    y_pred: (np.ndarray) - Predicted target values of shape (n_samples,).

Data descriptors defined here:

__dict__: dictionary for instance variables

__weakref__: list of weak references to the object