- abc.ABC(builtins.object)
  - sega_learn.utils.animator.AnimationBase
    - sega_learn.utils.animator.ClassificationAnimation
    - sega_learn.utils.animator.ForcastingAnimation
    - sega_learn.utils.animator.RegressionAnimation
- builtins.object
  - sega_learn.utils.dataAugmentation.Augmenter
  - sega_learn.utils.dataAugmentation.RandomOverSampler
  - sega_learn.utils.dataAugmentation.RandomUnderSampler
  - sega_learn.utils.dataAugmentation.SMOTE
  - sega_learn.utils.dataPrep.DataPrep
  - sega_learn.utils.dataPreprocessing.Scaler
  - sega_learn.utils.decomposition.PCA
  - sega_learn.utils.decomposition.SVD
  - sega_learn.utils.metrics.Metrics
  - sega_learn.utils.modelSelection.GridSearchCV
  - sega_learn.utils.modelSelection.ModelSelectionUtility
  - sega_learn.utils.modelSelection.RandomSearchCV
  - sega_learn.utils.polynomialTransform.PolynomialTransform
  - sega_learn.utils.voting.ForecastRegressor
  - sega_learn.utils.voting.VotingClassifier
  - sega_learn.utils.voting.VotingRegressor
class AnimationBase(abc.ABC) |
|
AnimationBase(model, train_series, test_series, dynamic_parameter=None, static_parameters=None, keep_previous=None, **kwargs)
Base class for creating animations of machine learning models. |
|
- Method resolution order:
- AnimationBase
- abc.ABC
- builtins.object
Methods defined here:
- __init__(self, model, train_series, test_series, dynamic_parameter=None, static_parameters=None, keep_previous=None, **kwargs)
- Initialize the animation base class.
Args:
model: The forecasting model or any machine learning model.
train_series: Training time series data.
test_series: Testing time series data.
dynamic_parameter: The parameter to update dynamically (e.g., 'window', 'alpha', 'beta').
static_parameters: Static parameters for the model.
Should be a dictionary with parameter names as keys and their values.
keep_previous: Whether to keep all previous lines with reduced opacity.
**kwargs: Additional customization options (e.g., colors, line styles).
- animate(self, frames, interval=150, blit=True, repeat=False)
- Create the animation.
Args:
frames: Range of frames (e.g., window sizes).
interval: Delay between frames in milliseconds.
blit: Whether to use blitting for faster rendering.
repeat: Whether to repeat the animation.
- save(self, filename, writer='pillow', fps=5, dpi=100)
- Save the animation to a file.
Args:
filename: Path to save the animation.
writer: Writer to use (e.g., 'pillow' for GIF).
fps: Frames per second.
dpi: Dots per inch for the saved figure.
- setup_plot(self, title, xlabel, ylabel, legend_loc='upper left', grid=True, figsize=(12, 6))
- Set up the plot for the animation.
Args:
title: Title of the plot.
xlabel: Label for the x-axis.
ylabel: Label for the y-axis.
legend_loc: Location of the legend.
grid: Whether to show grid lines.
figsize: Size of the figure.
- show(self)
- Display the animation.
- update_model(self, frame)
- Abstract method to update the model for a given frame. Must be implemented by subclasses.
- update_plot(self, frame)
- Abstract method to update the plot for a given frame. Must be implemented by subclasses.
Data descriptors defined here:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
Data and other attributes defined here:
- __abstractmethods__ = frozenset({'update_model', 'update_plot'})
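The two abstract hooks listed above mean a subclass is only instantiable once it defines both `update_model` and `update_plot`. The following is a minimal sketch of that pattern using only `abc`; `FrameAnimation`, `WindowAnimation`, and `run` are hypothetical names for illustration and do not reproduce sega_learn's implementation or its Matplotlib plumbing.

```python
from abc import ABC, abstractmethod


class FrameAnimation(ABC):
    """Hypothetical stand-in for an animation base class (not sega_learn code)."""

    def run(self, frames):
        # Call both hooks once per frame and collect what the plot hook returns.
        return [self._step(frame) for frame in frames]

    def _step(self, frame):
        self.update_model(frame)
        return self.update_plot(frame)

    @abstractmethod
    def update_model(self, frame):
        """Reconfigure or refit the model for this frame."""

    @abstractmethod
    def update_plot(self, frame):
        """Produce whatever should be drawn for this frame."""


class WindowAnimation(FrameAnimation):
    """Toy subclass: the 'model' is just a moving-average window size."""

    def __init__(self, series):
        self.series = series
        self.window = None

    def update_model(self, frame):
        # The frame value plays the role of the dynamic parameter.
        self.window = frame

    def update_plot(self, frame):
        # Return the mean of the last `window` points instead of drawing a line.
        tail = self.series[-self.window:]
        return sum(tail) / len(tail)


abstract_names = FrameAnimation.__abstractmethods__
results = WindowAnimation([1.0, 2.0, 3.0, 4.0]).run(range(1, 4))
```

Instantiating `FrameAnimation` directly raises `TypeError`, which is the behavior the `__abstractmethods__` entry above implies for `AnimationBase`.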
|
class Augmenter(builtins.object) |
|
Augmenter(techniques, verbose=False)
General class for data augmentation techniques.
This class allows for the application of multiple augmentation techniques in sequence. |
|
Methods defined here:
- __init__(self, techniques, verbose=False)
- Initializes the Augmenter with a list of techniques and verbosity option.
- augment(self, X, y)
- Applies multiple augmentation techniques in sequence.
Args:
X: (np.ndarray) - Feature matrix.
y: (np.ndarray) - Target vector.
Returns:
tuple: (np.ndarray, np.ndarray) - Augmented feature matrix and target vector.
Data descriptors defined here:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
|
class ClassificationAnimation(AnimationBase) |
|
ClassificationAnimation(model, X, y, test_size=0.3, dynamic_parameter=None, static_parameters=None, keep_previous=False, scaler=None, pca_components=2, plot_step=0.02, **kwargs)
Class for creating animations of classification models. |
|
- Method resolution order:
- ClassificationAnimation
- AnimationBase
- abc.ABC
- builtins.object
Methods defined here:
- __init__(self, model, X, y, test_size=0.3, dynamic_parameter=None, static_parameters=None, keep_previous=False, scaler=None, pca_components=2, plot_step=0.02, **kwargs)
- Initialize the classification animation class.
Args:
model: The classification model.
X: Feature matrix (input data).
y: Target vector (output data).
test_size: Proportion of the dataset to include in the test split.
dynamic_parameter: The parameter to update dynamically (e.g., 'alpha', 'beta').
static_parameters: Additional static parameters for the model.
Should be a dictionary with parameter names as keys and their values.
keep_previous: Whether to keep all previous lines with reduced opacity.
scaler: Optional scaler for preprocessing the data.
pca_components: Number of components to use for PCA.
plot_step: Resolution of the decision boundary mesh.
**kwargs: Additional customization options (e.g., colors, line styles).
- setup_plot(self, title, xlabel, ylabel, legend_loc='upper left', grid=True, figsize=(12, 6))
- Set up the plot for classification animation.
- update_model(self, frame)
- Update the classification model for the current frame.
Args:
frame: The current frame (e.g., parameter value).
- update_plot(self, frame)
- Update the plot for the current frame.
Args:
frame: The current frame (e.g., parameter value).
Data and other attributes defined here:
- __abstractmethods__ = frozenset()
Methods inherited from AnimationBase:
- animate(self, frames, interval=150, blit=True, repeat=False)
- Create the animation.
Args:
frames: Range of frames (e.g., window sizes).
interval: Delay between frames in milliseconds.
blit: Whether to use blitting for faster rendering.
repeat: Whether to repeat the animation.
- save(self, filename, writer='pillow', fps=5, dpi=100)
- Save the animation to a file.
Args:
filename: Path to save the animation.
writer: Writer to use (e.g., 'pillow' for GIF).
fps: Frames per second.
dpi: Dots per inch for the saved figure.
- show(self)
- Display the animation.
Data descriptors inherited from AnimationBase:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
|
class DataPrep(builtins.object) |
|
A class for preparing data for machine learning models. |
|
Methods defined here:
- df_to_ndarray(df, y_col=0)
- Converts a DataFrame to a NumPy array.
Args:
df: (pandas.DataFrame) - The DataFrame to be converted.
y_col: (int), optional - The index of the label column (default is 0).
Returns:
X: (numpy.ndarray) - The feature columns as a NumPy array.
y: (numpy.ndarray) - The label column as a NumPy array.
- find_categorical_columns(data)
- Finds the indices of non-numerical columns in a DataFrame or numpy array.
Args:
data: (pandas.DataFrame or numpy.ndarray) - The data to be checked.
Returns:
categorical_cols: (list) - The list of indices of non-numerical columns.
- k_split(X, y, k=5)
- Splits the data into k folds for cross-validation.
Args:
X: (numpy.ndarray) - The feature columns.
y: (numpy.ndarray) - The label column.
k: (int), optional - The number of folds (default is 5).
Returns:
X_folds: (list) - A list of k folds of feature columns.
y_folds: (list) - A list of k folds of label columns.
- one_hot_encode(data, cols)
- One-hot encodes non-numerical columns in a DataFrame or numpy array.
Drops the original columns after encoding.
Args:
data: (pandas.DataFrame or numpy.ndarray) - The data to be encoded.
cols: (list) - The list of column indices to be encoded.
Returns:
data: (pandas.DataFrame or numpy.ndarray) - The data with one-hot encoded columns.
- prepare_data(csv_file, label_col_index, cols_to_encode=None, write_to_csv=True)
- Prepares the data by loading a CSV file, one-hot encoding non-numerical columns, and optionally writing the prepared data to a new CSV file.
Args:
csv_file: (str) - The path of the CSV file to load.
label_col_index: (int) - The index of the label column.
cols_to_encode: (list), optional - The list of column indices to one-hot encode (default is None).
write_to_csv: (bool), optional - Whether to write the prepared data to a new CSV file (default is True).
Returns:
df: (pandas.DataFrame) - The prepared DataFrame.
prepared_csv_file: (str) - The path of the prepared CSV file. If write_to_csv is False, returns "N/A".
- write_data(df, csv_file, print_path=False)
- Writes the DataFrame to a CSV file.
Args:
df: (pandas.DataFrame) - The DataFrame to be written.
csv_file: (str) - The path of the CSV file to write to.
print_path: (bool), optional - If True, prints the file path (default is False).
Data descriptors defined here:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
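As an illustration of what `k_split` describes, the sketch below splits features and labels into k contiguous folds. It works on plain lists, and it assumes that leftover samples go to the earliest folds; how sega_learn distributes a remainder, and whether it shuffles first, is not stated above. `k_split_sketch` is a hypothetical name.

```python
def k_split_sketch(X, y, k=5):
    """Split X and y into k contiguous folds; earlier folds absorb any remainder."""
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of samples")
    if not 1 <= k <= len(X):
        raise ValueError("k must be between 1 and the number of samples")

    base, extra = divmod(len(X), k)
    X_folds, y_folds = [], []
    start = 0
    for i in range(k):
        # The first `extra` folds each take one additional sample.
        stop = start + base + (1 if i < extra else 0)
        X_folds.append(X[start:stop])
        y_folds.append(y[start:stop])
        start = stop
    return X_folds, y_folds


X = [[i, i * 10] for i in range(7)]
y = [i % 2 for i in range(7)]
X_folds, y_folds = k_split_sketch(X, y, k=3)
fold_sizes = [len(fold) for fold in X_folds]
```

Each fold can then serve once as the validation set while the remaining folds are concatenated for training.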
|
class ForcastingAnimation(AnimationBase) |
|
ForcastingAnimation(model, train_series, test_series, forecast_steps, dynamic_parameter=None, static_parameters=None, keep_previous=False, max_previous=None, **kwargs)
Class for creating animations of forecasting models. |
|
- Method resolution order:
- ForcastingAnimation
- AnimationBase
- abc.ABC
- builtins.object
Methods defined here:
- __init__(self, model, train_series, test_series, forecast_steps, dynamic_parameter=None, static_parameters=None, keep_previous=False, max_previous=None, **kwargs)
- Initialize the forecasting animation class.
Args:
model: The forecasting model.
train_series: Training time series data.
test_series: Testing time series data.
forecast_steps: Number of steps to forecast.
dynamic_parameter: The parameter to update dynamically (e.g., 'window', 'alpha', 'beta').
static_parameters: Static parameters for the model.
Should be a dictionary with parameter names as keys and their values.
keep_previous: Whether to keep all previous lines with reduced opacity.
max_previous: Maximum number of previous lines to keep.
**kwargs: Additional customization options (e.g., colors, line styles).
- setup_plot(self, title, xlabel, ylabel, legend_loc='upper left', grid=True, figsize=(12, 6))
- Set up the plot for forecasting animation.
- update_model(self, frame)
- Update the model for the current frame.
Args:
frame: The current frame (e.g., parameter value).
- update_plot(self, frame)
- Update the plot for the current frame.
Args:
frame: The current frame (e.g., parameter value).
Data and other attributes defined here:
- __abstractmethods__ = frozenset()
Methods inherited from AnimationBase:
- animate(self, frames, interval=150, blit=True, repeat=False)
- Create the animation.
Args:
frames: Range of frames (e.g., window sizes).
interval: Delay between frames in milliseconds.
blit: Whether to use blitting for faster rendering.
repeat: Whether to repeat the animation.
- save(self, filename, writer='pillow', fps=5, dpi=100)
- Save the animation to a file.
Args:
filename: Path to save the animation.
writer: Writer to use (e.g., 'pillow' for GIF).
fps: Frames per second.
dpi: Dots per inch for the saved figure.
- show(self)
- Display the animation.
Data descriptors inherited from AnimationBase:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
|
class ForecastRegressor(builtins.object) |
|
ForecastRegressor(models, model_weights=None)
Implements a forecast voting regressor.
Takes a list of fitted models and their weights and returns a weighted average of the predictions. |
|
Methods defined here:
- __init__(self, models, model_weights=None)
- Initialize the ForecastRegressor object.
Args:
models: list of fitted models whose forecasts are combined
model_weights: list of weights for each model. Default is None.
- forecast(self, steps)
- Forecast the target variable using the fitted models.
Args:
steps: number of steps to forecast
Returns:
y_pred: predicted target variable
- get_params(self)
- Get the parameters of the ForecastRegressor object.
Returns:
params: dictionary of parameters
- show_models(self, formula=False)
- Print the models and their weights.
Data descriptors defined here:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
|
class GridSearchCV(builtins.object) |
|
GridSearchCV(model, param_grid, cv=5, metric='mse', direction='minimize')
Implements a grid search cross-validation for hyperparameter tuning. |
|
Methods defined here:
- __init__(self, model, param_grid, cv=5, metric='mse', direction='minimize')
- Initializes the GridSearchCV object.
Args:
model: The model object to be tuned.
param_grid: (list) - A list of dictionaries containing hyperparameters to be tuned.
cv: (int) - The number of folds for cross-validation. Default is 5.
metric: (str) - The metric to be used for evaluation. Default is 'mse'.
- Regression Metrics: 'mse', 'r2', 'mae', 'rmse', 'mape', 'mpe'
- Classification Metrics: 'accuracy', 'precision', 'recall', 'f1', 'log_loss'
direction: (str) - The direction to optimize the metric. Default is 'minimize'.
- fit(self, X, y, verbose=False)
- Fits the model to the data for all hyperparameter combinations.
Args:
X: (numpy.ndarray) - The feature columns.
y: (numpy.ndarray) - The label column.
verbose: (bool) - A flag to display the training progress. Default is False.
Returns:
model: The best model with the optimal hyperparameters.
Data descriptors defined here:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
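The `direction` argument decides whether a lower or a higher metric value wins, so it has to match the metric (for example 'minimize' for 'mse', 'maximize' for 'accuracy'). The sketch below shows only that selection step over an already expanded list of candidates. `select_best` and the two toy scoring functions are hypothetical; they stand in for a cross-validated score and are not part of sega_learn.

```python
def select_best(candidates, score_fn, direction="minimize"):
    """Return (best_params, best_score) among candidate parameter dictionaries."""
    if direction not in ("minimize", "maximize"):
        raise ValueError("direction must be 'minimize' or 'maximize'")
    pick = min if direction == "minimize" else max
    # Score every candidate, then pick by score only.
    scored = [(score_fn(params), params) for params in candidates]
    best_score, best_params = pick(scored, key=lambda pair: pair[0])
    return best_params, best_score


def toy_error(params):
    # Stand-in for an error metric: smallest at alpha == 0.1.
    return (params["alpha"] - 0.1) ** 2


def toy_accuracy(params):
    # Stand-in for a score metric: largest at alpha == 1.0.
    return 1.0 - abs(params["alpha"] - 1.0)


candidates = [{"alpha": 0.01}, {"alpha": 0.1}, {"alpha": 1.0}]
best_low, low_score = select_best(candidates, toy_error, direction="minimize")
best_high, high_score = select_best(candidates, toy_accuracy, direction="maximize")
```

Passing 'minimize' together with a score such as accuracy would select the worst candidate, which is why the two arguments should be chosen as a pair.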
|
class Metrics(builtins.object) |
|
Implements various regression and classification metrics. |
|
Class methods defined here:
- accuracy(y_true, y_pred)
- Calculates the accuracy score between the true and predicted values.
Args:
y_true: (np.ndarray) - The true values.
y_pred: (np.ndarray) - The predicted values.
Returns:
accuracy: (float) - The accuracy score.
- classification_report(y_true, y_pred)
- Generates a classification report for the true and predicted values.
Args:
y_true: (np.ndarray) - The true values.
y_pred: (np.ndarray) - The predicted values.
Returns:
report: (dict) - The classification report.
- confusion_matrix(y_true, y_pred)
- Calculates the confusion matrix between the true and predicted values.
Args:
y_true: (np.ndarray) - The true values.
y_pred: (np.ndarray) - The predicted values.
Returns:
cm: (np.ndarray) - The confusion matrix.
- f1_score(y_true, y_pred)
- Calculates the F1 score between the true and predicted values.
Args:
y_true: (np.ndarray) - The true values.
y_pred: (np.ndarray) - The predicted values.
Returns:
f1_score: (float) - The F1 score.
- log_loss(y_true, y_pred)
- Calculates the log loss between the true and predicted values.
Args:
y_true: (np.ndarray) - The true values.
y_pred: (np.ndarray) - The predicted probabilities.
Returns:
log_loss: (float) - The log loss.
- mean_absolute_error(y_true, y_pred)
- Calculates the mean absolute error between the true and predicted values.
Args:
y_true: (np.ndarray) - The true values.
y_pred: (np.ndarray) - The predicted values.
Returns:
mae: (float) - The mean absolute error.
- mean_absolute_percentage_error(y_true, y_pred)
- Calculates the mean absolute percentage error between the true and predicted values.
Args:
y_true: (np.ndarray) - The true values.
y_pred: (np.ndarray) - The predicted values.
Returns:
mape: (float) - The mean absolute percentage error as a decimal. Returns np.nan if y_true is all zeros.
- mean_percentage_error(y_true, y_pred)
- Calculates the mean percentage error between the true and predicted values.
Args:
y_true: (np.ndarray) - The true values.
y_pred: (np.ndarray) - The predicted values.
Returns:
mpe: (float) - The mean percentage error.
- mean_squared_error(y_true, y_pred)
- Calculates the mean squared error between the true and predicted values.
Args:
y_true: (np.ndarray) - The true values.
y_pred: (np.ndarray) - The predicted values.
Returns:
mse: (float) - The mean squared error.
- precision(y_true, y_pred)
- Calculates the precision score between the true and predicted values.
Args:
y_true: (np.ndarray) - The true values.
y_pred: (np.ndarray) - The predicted values.
Returns:
precision: (float) - The precision score.
- r_squared(y_true, y_pred)
- Calculates the R-squared score between the true and predicted values.
Args:
y_true: (np.ndarray) - The true values.
y_pred: (np.ndarray) - The predicted values.
Returns:
r_squared: (float) - The R-squared score.
- recall(y_true, y_pred)
- Calculates the recall score between the true and predicted values.
Args:
y_true: (np.ndarray) - The true values.
y_pred: (np.ndarray) - The predicted values.
Returns:
recall: (float) - The recall score.
- root_mean_squared_error(y_true, y_pred)
- Calculates the root mean squared error between the true and predicted values.
Args:
y_true: (np.ndarray) - The true values.
y_pred: (np.ndarray) - The predicted values.
Returns:
rmse: (float) - The root mean squared error.
- show_classification_report(y_true, y_pred)
- Generates and displays a classification report for the true and predicted values.
Args:
y_true: (np.ndarray) - The true values.
y_pred: (np.ndarray) - The predicted values.
Returns:
report: (dict) - The classification report.
- show_confusion_matrix(y_true, y_pred)
- Calculates and displays the confusion matrix between the true and predicted values.
Args:
y_true: (np.ndarray) - The true values.
y_pred: (np.ndarray) - The predicted values.
Returns:
cm: (np.ndarray) - The confusion matrix.
Data descriptors defined here:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
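For reference, three of the regression metrics above can be written out in a few lines. This is a dependency-free sketch on plain lists rather than the library's NumPy code, and the function names are hypothetical. The handling of zeros in the MAPE is an assumption: entries where the true value is zero are skipped, which is one way to arrive at the documented `nan` result when `y_true` is all zeros.

```python
import math


def mse_sketch(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared_sketch(y_true, y_pred):
    """R-squared: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape_sketch(y_true, y_pred):
    """Mean absolute percentage error as a decimal; nan if every true value is 0."""
    # Assumption: zero-valued targets are skipped to avoid division by zero.
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred) if t != 0]
    return sum(ratios) / len(ratios) if ratios else math.nan


y_true = [2.0, 4.0, 0.0, 8.0]
y_pred = [1.0, 5.0, 1.0, 8.0]
mse = mse_sketch(y_true, y_pred)
r2 = r_squared_sketch(y_true, y_pred)
mape = mape_sketch(y_true, y_pred)
```

Here the squared errors are 1, 1, 1, and 0, so the MSE is 0.75; the MAPE averages 0.5, 0.25, and 0 over the three non-zero targets.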
|
class ModelSelectionUtility(builtins.object) |
|
A utility class for hyperparameter tuning and cross-validation of machine learning models. |
|
Static methods defined here:
- cross_validate(model, X, y, params, cv=5, metric='mse', direction='minimize', verbose=False)
- Implements a custom cross-validation for hyperparameter tuning.
Args:
model: The model object to be tuned.
X: (numpy.ndarray) - The feature columns.
y: (numpy.ndarray) - The label column.
params: (dict) - The hyperparameters to be tuned.
cv: (int) - The number of folds for cross-validation. Default is 5.
metric: (str) - The metric to be used for evaluation. Default is 'mse'.
- Regression Metrics: 'mse', 'r2', 'mae', 'rmse', 'mape', 'mpe'
- Classification Metrics: 'accuracy', 'precision', 'recall', 'f1', 'log_loss'
direction: (str) - The direction to optimize the metric. Default is 'minimize'.
verbose: (bool) - A flag to display the training progress. Default is False.
Returns:
tuple: A tuple containing the scores (list) and the trained model.
- get_param_combinations(param_grid)
- Generates all possible combinations of hyperparameters.
Returns:
param_combinations (list): A list of dictionaries containing hyperparameter combinations.
Data descriptors defined here:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
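Generating every hyperparameter combination is a Cartesian product over the candidate values. The sketch below assumes `param_grid` is a list of dictionaries that each map a parameter name to a list of candidate values; the entry above does not spell out that shape, so treat it as an assumption. `param_combinations_sketch` is a hypothetical name.

```python
from itertools import product


def param_combinations_sketch(param_grid):
    """Expand a list of {name: [values]} dictionaries into single-valued dictionaries."""
    combinations = []
    for grid in param_grid:
        names = list(grid)
        # product() yields one tuple per combination, in the order of `names`.
        for values in product(*(grid[name] for name in names)):
            combinations.append(dict(zip(names, values)))
    return combinations


param_grid = [{"alpha": [0.1, 1.0], "fit_intercept": [True, False]}]
combos = param_combinations_sketch(param_grid)
```

The number of combinations is the product of the list lengths (2 × 2 = 4 here), which is why grid search cost grows quickly as parameters are added.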
|
class PCA(builtins.object) |
|
PCA(n_components)
Principal Component Analysis (PCA) implementation. |
|
Methods defined here:
- __init__(self, n_components)
- Initializes the PCA model.
Args:
n_components: (int) - Number of principal components to keep.
- fit(self, X)
- Fits the PCA model to the data.
Args:
X: (np.ndarray) - Input data of shape (n_samples, n_features).
Raises:
ValueError: If input data is not a 2D numpy array or if n_components exceeds the number of features.
- fit_transform(self, X)
- Fits the PCA model and applies dimensionality reduction on the input data.
Args:
X: (np.ndarray) - Input data of shape (n_samples, n_features).
Returns:
X_transformed: (np.ndarray) - Data transformed into the principal component space of shape (n_samples, n_components).
- get_components(self)
- Retrieves the principal components.
Returns:
components_: (np.ndarray) - Array of principal components of shape (n_features, n_components).
- get_explained_variance_ratio(self)
- Retrieves the explained variance ratio.
Returns:
explained_variance_ratio_: (np.ndarray) - Array of explained variance ratios for each principal component.
- inverse_transform(self, X_reduced)
- Reconstructs the original data from the reduced data.
Args:
X_reduced: (np.ndarray) - Reduced data of shape (n_samples, n_components).
Returns:
X_original: (np.ndarray) - Reconstructed data of shape (n_samples, n_features).
Raises:
ValueError: If input data is not a 2D numpy array.
- transform(self, X)
- Applies dimensionality reduction on the input data.
Args:
X: (np.ndarray) - Input data of shape (n_samples, n_features).
Returns:
X_transformed: (np.ndarray) - Data transformed into the principal component space of shape (n_samples, n_components).
Raises:
ValueError: If input data is not a 2D numpy array or if its dimensions do not match the fitted data.
Data descriptors defined here:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
|
class PolynomialTransform(builtins.object) |
|
PolynomialTransform(degree=2)
Implements Polynomial Feature Transformation.
Polynomial feature transformation creates new features by raising existing features to a power or creating interaction terms.
Args:
degree (int): The degree of the polynomial features (default is 2).
Attributes:
n_samples (int): The number of samples in the input data.
n_features (int): The number of features in the input data.
n_output_features (int): The number of output features after transformation.
combinations (list of tuples): The combinations of features for polynomial terms. |
|
Methods defined here:
- __init__(self, degree=2)
- Initialize the PolynomialTransform object.
- fit(self, X)
- Fit the model to the data.
Uses itertools.combinations_with_replacement to generate all possible combinations of the features of X up to the given degree.
- fit_transform(self, X)
- Fit to data, then transform it.
- transform(self, X)
- Transform the data into polynomial features by computing the product of the features for each combination of features.
Data descriptors defined here:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
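The fit and transform steps described above can be sketched as follows. The code assumes terms of every degree from 1 up to `degree` and no constant (bias) column; whether sega_learn includes a bias term or lower-degree terms is not stated here. The function names are hypothetical, and `math.prod` requires Python 3.8 or later.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_combinations(n_features, degree=2):
    """Index tuples for every monomial of degree 1..degree (assumed range)."""
    combos = []
    for d in range(1, degree + 1):
        combos.extend(combinations_with_replacement(range(n_features), d))
    return combos


def polynomial_transform_sketch(X, degree=2):
    """Multiply the selected features together for each index tuple."""
    combos = polynomial_combinations(len(X[0]), degree)
    return [[prod(row[i] for i in combo) for combo in combos] for row in X]


combos = polynomial_combinations(2, degree=2)
features = polynomial_transform_sketch([[2, 3]], degree=2)
```

For two features a and b at degree 2 the terms are a, b, a², ab, and b², so the row [2, 3] becomes [2, 3, 4, 6, 9].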
|
class RandomOverSampler(builtins.object) |
|
RandomOverSampler(random_state=None)
Randomly over-sample the minority class by duplicating examples.
This technique helps to balance the class distribution by randomly duplicating samples from the minority class.
It is a simple yet effective method to address class imbalance in datasets.
Algorithm Steps:
- Step 1: Identify the minority class and its samples.
- Step 2: Calculate the number of samples needed to balance the class distribution.
- Step 3: Randomly select samples from the minority class with replacement.
- Step 4: Duplicate the selected samples to create a balanced dataset. |
|
Methods defined here:
- __init__(self, random_state=None)
- Initializes the RandomOverSampler with an optional random state.
- fit_resample(self, X, y)
- Resamples the dataset to balance the class distribution by duplicating minority class samples.
Args:
X: (array-like) - Feature matrix.
y: (array-like) - Target vector.
Returns:
tuple: (np.ndarray, np.ndarray) - Resampled feature matrix and target vector.
Data descriptors defined here:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
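The four steps above can be sketched in plain Python. This version brings every smaller class up to the size of the largest one, which reduces to the described behavior in the two-class case; it uses the standard `random` module, so its draws will not match a NumPy-based implementation even with the same seed. `random_over_sample` is a hypothetical name.

```python
import random
from collections import Counter


def random_over_sample(X, y, random_state=None):
    """Duplicate randomly chosen rows of smaller classes until all classes match."""
    rng = random.Random(random_state)
    counts = Counter(y)
    target = max(counts.values())

    X_res, y_res = list(X), list(y)
    for label, count in counts.items():
        # Rows belonging to this class; sampled with replacement below.
        pool = [row for row, lbl in zip(X, y) if lbl == label]
        for _ in range(target - count):
            X_res.append(rng.choice(pool))
            y_res.append(label)
    return X_res, y_res


X = [[0], [1], [2], [3], [10]]
y = [0, 0, 0, 0, 1]
X_res, y_res = random_over_sample(X, y, random_state=0)
```

Because the added rows are exact copies, oversampling should be applied to the training split only; otherwise duplicates can leak into the evaluation data.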
|
class RandomSearchCV(builtins.object) |
|
RandomSearchCV(model, param_grid, iter=10, cv=5, metric='mse', direction='minimize')
Implements a random search cross-validation for hyperparameter tuning. |
|
Methods defined here:
- __init__(self, model, param_grid, iter=10, cv=5, metric='mse', direction='minimize')
- Initializes the RandomSearchCV object.
Args:
model: The model object to be tuned.
param_grid: (list) - A list of dictionaries containing hyperparameters to be tuned.
iter: (int) - The number of iterations for random search. Default is 10.
cv: (int) - The number of folds for cross-validation. Default is 5.
metric: (str) - The metric to be used for evaluation. Default is 'mse'.
- Regression Metrics: 'mse', 'r2', 'mae', 'rmse', 'mape', 'mpe'
- Classification Metrics: 'accuracy', 'precision', 'recall', 'f1', 'log_loss'
direction: (str) - The direction to optimize the metric. Default is 'minimize'.
- fit(self, X, y, verbose=False)
- Fits the model to the data for iter random hyperparameter combinations.
Args:
X: (numpy.ndarray) - The feature columns.
y: (numpy.ndarray) - The label column.
verbose: (bool) - A flag to display the training progress. Default is False.
Returns:
model: The best model with the optimal hyperparameters.
Data descriptors defined here:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
|
class RandomUnderSampler(builtins.object) |
|
RandomUnderSampler(random_state=None)
Randomly under-sample the majority class by removing examples.
This technique helps to balance the class distribution by randomly removing samples from the majority class.
It is a simple yet effective method to address class imbalance in datasets.
Algorithm Steps:
- Step 1: Identify the majority class and its samples.
- Step 2: Calculate the number of samples to remove to balance the class distribution.
- Step 3: Randomly select samples from the majority class without replacement.
- Step 4: Remove the selected samples to create a balanced dataset. |
|
Methods defined here:
- __init__(self, random_state=None)
- Initializes the RandomUnderSampler with an optional random state.
- fit_resample(self, X, y)
- Resamples the dataset to balance the class distribution by removing majority class samples.
Args:
X: (array-like) - Feature matrix.
y: (array-like) - Target vector.
Returns:
tuple: (np.ndarray, np.ndarray) - Resampled feature matrix and target vector.
Data descriptors defined here:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
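A dependency-free sketch of the steps above follows. Choosing which rows to keep without replacement is equivalent to choosing which rows to remove, so the code samples the survivors directly. Every class is reduced to the size of the smallest one, and the original row order is preserved, which is an assumption about presentation rather than documented behavior. `random_under_sample` is a hypothetical name.

```python
import random
from collections import Counter


def random_under_sample(X, y, random_state=None):
    """Keep a random subset of each class so all classes match the smallest one."""
    rng = random.Random(random_state)
    counts = Counter(y)
    target = min(counts.values())

    keep = []
    for label in counts:
        indices = [i for i, lbl in enumerate(y) if lbl == label]
        # Sample without replacement: each original row is kept at most once.
        keep.extend(rng.sample(indices, target))
    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]


X = [[i] for i in range(8)]
y = [0] * 6 + [1] * 2
X_res, y_res = random_under_sample(X, y, random_state=0)
```

Under-sampling discards data, so it is usually a better fit when the majority class is large enough that losing rows costs little.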
|
class RegressionAnimation(AnimationBase) |
|
RegressionAnimation(model, X, y, test_size=0.3, dynamic_parameter=None, static_parameters=None, keep_previous=False, max_previous=None, pca_components=1, **kwargs)
Class for creating animations of regression models. |
|
- Method resolution order:
- RegressionAnimation
- AnimationBase
- abc.ABC
- builtins.object
Methods defined here:
- __init__(self, model, X, y, test_size=0.3, dynamic_parameter=None, static_parameters=None, keep_previous=False, max_previous=None, pca_components=1, **kwargs)
- Initialize the regression animation class.
Args:
model: The regression model.
X: Feature matrix (input data).
y: Target vector (output data).
test_size: Proportion of the dataset to include in the test split.
dynamic_parameter: The parameter to update dynamically (e.g., 'alpha', 'beta').
static_parameters: Additional static parameters for the model.
Should be a dictionary with parameter names as keys and their values.
keep_previous: Whether to keep all previous lines with reduced opacity.
max_previous: Maximum number of previous lines to keep.
pca_components: Number of components to use for PCA.
**kwargs: Additional customization options (e.g., colors, line styles).
- setup_plot(self, title, xlabel, ylabel, legend_loc='upper left', grid=True, figsize=(12, 6))
- Set up the plot for regression animation.
- update_model(self, frame)
- Update the regression model for the current frame.
Args:
frame: The current frame (e.g., parameter value).
- update_plot(self, frame)
- Update the plot for the current frame.
Args:
frame: The current frame (e.g., parameter value).
Data and other attributes defined here:
- __abstractmethods__ = frozenset()
Methods inherited from AnimationBase:
- animate(self, frames, interval=150, blit=True, repeat=False)
- Create the animation.
Args:
frames: Range of frames (e.g., window sizes).
interval: Delay between frames in milliseconds.
blit: Whether to use blitting for faster rendering.
repeat: Whether to repeat the animation.
- save(self, filename, writer='pillow', fps=5, dpi=100)
- Save the animation to a file.
Args:
filename: Path to save the animation.
writer: Writer to use (e.g., 'pillow' for GIF).
fps: Frames per second.
dpi: Dots per inch for the saved figure.
- show(self)
- Display the animation.
Data descriptors inherited from AnimationBase:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
|
class SMOTE(builtins.object) |
|
SMOTE(random_state=None, k_neighbors=5)
Synthetic Minority Over-sampling Technique (SMOTE) for balancing class distribution.
SMOTE generates synthetic samples for the minority class by interpolating between existing samples.
This helps to create a more balanced dataset, which can improve the performance of machine learning models.
Algorithm Steps:
- Step 1: Identify the minority class and its samples.
- Step 2: For each sample in the minority class, find its k nearest neighbors (using Euclidean distance).
- Step 3: Randomly select one or more of these neighbors.
- Step 4: Create synthetic samples by interpolating between the original sample and the selected neighbors. |
|
Methods defined here:
- __init__(self, random_state=None, k_neighbors=5)
- Initializes the SMOTE with an optional random state and number of neighbors.
- fit_resample(self, X, y, force_equal=False)
- Resamples the dataset to balance the class distribution by generating synthetic samples.
Args:
X: (array-like) - Feature matrix.
y: (array-like) - Target vector.
force_equal: (bool), optional - If True, resample until classes are equal (default is False).
Returns:
tuple: (np.ndarray, np.ndarray) - Resampled feature matrix and target vector.
Data descriptors defined here:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
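The interpolation at the core of the steps above can be sketched as below. This is a simplification, not the library's code: it draws a random base sample for each synthetic point rather than iterating over every minority sample, it takes the minority rows directly instead of a full `X, y` pair, and it uses a brute-force neighbor search. `smote_samples` is a hypothetical name, and `math.dist` requires Python 3.8 or later.

```python
import math
import random


def smote_samples(minority, n_new, k_neighbors=5, random_state=None):
    """Create n_new synthetic points by interpolating toward nearest neighbors."""
    rng = random.Random(random_state)
    k = min(k_neighbors, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbors of the base sample by Euclidean distance.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbor = minority[rng.choice(others[:k])]
        # A random point on the segment between the base sample and the neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_samples(minority, 6, k_neighbors=2, random_state=0)
```

Every synthetic point lies on a line segment between two existing minority samples, so the new points stay inside the region the minority class already occupies.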
|
class SVD(builtins.object) |
|
SVD(n_components)
Singular Value Decomposition (SVD) implementation. |
|
Methods defined here:
- __init__(self, n_components)
- Initializes the SVD model.
Args:
n_components: (int) - Number of singular values and vectors to keep.
- fit(self, X)
- Fits the SVD model to the data.
Args:
X: (np.ndarray) - Input data of shape (n_samples, n_features).
Raises:
ValueError: If input data is not a 2D numpy array or if n_components exceeds the minimum dimension of the input data.
- fit_transform(self, X)
- Fits the SVD model and applies the transformation on the input data.
Args:
X: (np.ndarray) - Input data of shape (n_samples, n_features).
Returns:
X_transformed: (np.ndarray) - Data transformed into the singular value space of shape (n_samples, n_components).
- get_singular_values(self)
- Retrieves the singular values.
Returns:
S: (np.ndarray) - Array of singular values of shape (n_components,).
- get_singular_vectors(self)
- Retrieves the singular vectors.
Returns:
U: (np.ndarray) - Left singular vectors of shape (n_samples, n_components).
Vt: (np.ndarray) - Right singular vectors of shape (n_components, n_features).
- transform(self, X)
- Applies the SVD transformation on the input data.
Args:
X: (np.ndarray) - Input data of shape (n_samples, n_features).
Returns:
X_transformed: (np.ndarray) - Data transformed into the singular value space of shape (n_samples, n_components).
Raises:
ValueError: If input data is not a 2D numpy array.
Data descriptors defined here:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
|
class Scaler(builtins.object) |
|
Scaler(method='standard')
A class for scaling data by standardization and normalization. |
|
Methods defined here:
- __init__(self, method='standard')
- Initializes the scaler with the specified method.
Args:
method: (str) - The scaling method to use. Options are 'standard', 'minmax', or 'normalize'.
- fit(self, X)
- Fits the scaler to the data.
Args:
X: (numpy.ndarray) - The data to fit the scaler to.
- fit_transform(self, X)
- Fits the scaler to the data and then transforms it.
Args:
X: (numpy.ndarray) - The data to fit and transform.
Returns:
X_transformed: (numpy.ndarray) - The transformed data.
- inverse_transform(self, X)
- Inverse transforms the data using the fitted scaler.
Args:
X: (numpy.ndarray) - The data to inverse transform.
Returns:
X_inverse: (numpy.ndarray) - The inverse transformed data.
- transform(self, X)
- Transforms the data using the fitted scaler.
Args:
X: (numpy.ndarray) - The data to transform.
Returns:
X_transformed: (numpy.ndarray) - The transformed data.
Data descriptors defined here:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
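To make the fit, transform, and inverse_transform contract concrete, here is a small column-wise sketch of the 'standard' and 'minmax' methods on plain lists. It leaves out 'normalize', whose exact definition is not given above, and it assumes a population standard deviation and a fallback scale of 1.0 for constant columns. `ColumnScaler` is a hypothetical class, not sega_learn's `Scaler`.

```python
from statistics import fmean, pstdev


class ColumnScaler:
    """Column-wise scaling sketch supporting 'standard' and 'minmax'."""

    def __init__(self, method="standard"):
        if method not in ("standard", "minmax"):
            raise ValueError("method must be 'standard' or 'minmax'")
        self.method = method

    def fit(self, X):
        columns = list(zip(*X))
        if self.method == "standard":
            self.offset = [fmean(col) for col in columns]
            # Constant columns get a scale of 1.0 to avoid dividing by zero.
            self.scale = [pstdev(col) or 1.0 for col in columns]
        else:
            self.offset = [min(col) for col in columns]
            self.scale = [(max(col) - min(col)) or 1.0 for col in columns]
        return self

    def transform(self, X):
        return [[(v - o) / s for v, o, s in zip(row, self.offset, self.scale)]
                for row in X]

    def inverse_transform(self, X):
        return [[v * s + o for v, o, s in zip(row, self.offset, self.scale)]
                for row in X]


X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
minmax_scaled = ColumnScaler("minmax").fit(X).transform(X)
standard = ColumnScaler("standard").fit(X)
round_trip = standard.inverse_transform(standard.transform(X))
```

Fitting on the training data and reusing the stored offsets and scales for test data keeps both splits on the same scale.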
|
class VotingClassifier(builtins.object) |
|
VotingClassifier(estimators, weights=None)
Implements a hard voting classifier.
Aggregates predictions from multiple fitted classification models based on
majority vote (optionally weighted). |
|
Methods defined here:
- __init__(self, estimators, weights=None)
- Initialize the VotingClassifier object for hard voting.
Args:
estimators (list): A list of *fitted* classifier objects.
Each estimator must have a `predict` method.
weights (array-like of shape (n_estimators,), optional): Sequence of
weights (float or int) to weight the occurrences of predicted class
labels during voting. Uses uniform weights if None. Defaults to None.
- get_params(self, deep=True)
- Get parameters for this estimator.
Args:
deep (bool, optional): If True, will return the parameters for this
estimator and contained subobjects that are estimators. (Not fully implemented for deep=True yet).
Returns:
params (dict): Parameter names mapped to their values.
- predict(self, X)
- Predict class labels for X using hard voting.
Args:
X (array-like of shape (n_samples, n_features)): The input samples.
Returns:
maj (np.ndarray of shape (n_samples,)): Predicted class labels based on majority vote.
- show_models(self)
- Print the models and their weights.
Data descriptors defined here:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
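Weighted hard voting adds each estimator's weight to the label it predicted and returns the label with the largest total. The sketch below operates on precomputed predictions, one list per estimator, rather than on fitted estimator objects. Tie-breaking in favor of the label seen first is an assumption of this sketch. `hard_vote` is a hypothetical name.

```python
from collections import defaultdict


def hard_vote(predictions, weights=None):
    """Return the weighted majority label for each sample."""
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("need exactly one weight per estimator")

    majority = []
    # zip(*predictions) groups the votes that all estimators cast for one sample.
    for sample_votes in zip(*predictions):
        tally = defaultdict(float)
        for label, weight in zip(sample_votes, weights):
            tally[label] += weight
        # On a tie, max() keeps the label that entered the tally first.
        majority.append(max(tally, key=tally.get))
    return majority


predictions = [
    [0, 1, 1],  # estimator A
    [1, 1, 0],  # estimator B
    [1, 0, 0],  # estimator C
]
uniform = hard_vote(predictions)
weighted = hard_vote(predictions, weights=[3, 1, 1])
```

With uniform weights the result is [1, 1, 0]; giving estimator A a weight of 3 lets it overrule the other two on the first and third samples.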
|
class VotingRegressor(builtins.object) |
|
VotingRegressor(models, model_weights=None)
Implements a voting regressor.
Takes a list of fitted models and their weights and returns a weighted average of the predictions. |
|
Methods defined here:
- __init__(self, models, model_weights=None)
- Initialize the VotingRegressor object.
Args:
models: list of fitted models whose predictions are combined
model_weights: list of weights for each model. Default is None.
- get_params(self)
- Get the parameters of the VotingRegressor object.
Returns:
params: dictionary of parameters
- predict(self, X)
- Predict the target variable using the fitted models.
Args:
X: input features
Returns:
y_pred: predicted target variable
- show_models(self, formula=False)
- Print the models and their weights.
Data descriptors defined here:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object
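The weighted average described above can be sketched as follows. The stand-in models only need a `predict` method. Dividing by the sum of the weights is an assumption, since the entry above does not say whether weights are normalized. `LinearStub` and `voting_predict` are hypothetical names.

```python
class LinearStub:
    """Stand-in for a fitted model: predicts slope * first feature."""

    def __init__(self, slope):
        self.slope = slope

    def predict(self, X):
        return [self.slope * row[0] for row in X]


def voting_predict(models, X, model_weights=None):
    """Weighted average of each model's predictions, one value per sample."""
    if model_weights is None:
        model_weights = [1.0] * len(models)
    if len(model_weights) != len(models):
        raise ValueError("need exactly one weight per model")

    total = sum(model_weights)
    per_model = [model.predict(X) for model in models]
    # zip(*per_model) groups the predictions all models made for one sample.
    return [
        sum(w * p for w, p in zip(model_weights, sample_preds)) / total
        for sample_preds in zip(*per_model)
    ]


models = [LinearStub(1.0), LinearStub(3.0)]
X = [[1.0], [2.0]]
uniform = voting_predict(models, X)
weighted = voting_predict(models, X, model_weights=[3, 1])
```

The same averaging applies to `ForecastRegressor`, with each model's `forecast(steps)` output taking the place of `predict(X)`.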