- builtins.object
    - GradientBoostedRegressor

class GradientBoostedRegressor(builtins.object)

GradientBoostedRegressor(X=None, y=None, num_trees: int = 100, max_depth: int = 3, learning_rate: float = 0.1, min_samples_split: int = 2, random_seed: int = None)
A class to represent a Gradient Boosted Decision Tree Regressor.
Attributes:
num_trees (int): The number of decision trees in the ensemble.
max_depth (int): The maximum depth of each decision tree.
learning_rate (float): The learning rate for the gradient boosted model.
min_samples_split (int): The minimum number of samples required to split a node.
random_seed (int): The random seed for the random number generator.
Methods:
fit(X=None, y=None, sample_weight=None, verbose=0): Fits the gradient boosted decision tree regressor to the training data.
predict(X): Predicts the target values for the input features.
calculate_metrics(y_true, y_pred): Calculates the evaluation metrics.
get_stats(y_true, y_pred, verbose=False): Returns the evaluation metrics.
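A minimal usage sketch is shown below. The import path (`gradient_boosted_regressor`) and the synthetic data are assumptions for illustration; substitute the actual module path used in your project.

```python
import numpy as np
# Module path is an assumption; adjust to wherever GradientBoostedRegressor lives.
from gradient_boosted_regressor import GradientBoostedRegressor

# Synthetic regression data, for illustration only.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

model = GradientBoostedRegressor(
    num_trees=100, max_depth=3, learning_rate=0.1,
    min_samples_split=2, random_seed=42,
)
model.fit(X, y, verbose=0)
y_pred = model.predict(X)
stats = model.get_stats(y, y_pred, verbose=True)  # returns MSE, R^2, MAE, RMSE, MAPE
```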
Methods defined here:
- __init__(self, X=None, y=None, num_trees: int = 100, max_depth: int = 3, learning_rate: float = 0.1, min_samples_split: int = 2, random_seed: int = None)
- Initializes the Gradient Boosted Decision Tree Regressor.
Args:
X (np.ndarray), optional: Input feature data (default is None).
y (np.ndarray), optional: Target data (default is None).
num_trees (int): Number of boosting stages (trees).
max_depth (int): Maximum depth of each individual tree regressor.
learning_rate (float): Step size shrinkage to prevent overfitting.
min_samples_split (int): Minimum samples required to split a node.
random_seed (int): Seed for reproducibility (currently affects feature selection within trees).
- calculate_metrics(self, y_true, y_pred)
- Calculate common regression metrics.
Args:
y_true (array-like): True target values.
y_pred (array-like): Predicted target values.
Returns:
dict: A dictionary containing calculated metrics (MSE, R^2, MAE, RMSE, MAPE).
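The metrics follow their standard definitions. The NumPy sketch below shows how such a dictionary can be computed; it is a reference implementation of the formulas, not the class's internal code, and the epsilon guard in MAPE is an assumption.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Standard regression metrics; a reference sketch, not the class's internals."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "MSE": mse,
        "R^2": 1.0 - ss_res / ss_tot,
        "MAE": np.mean(np.abs(err)),
        "RMSE": np.sqrt(mse),
        # MAPE is undefined for zero targets; the epsilon guard is an assumption.
        "MAPE": np.mean(np.abs(err) / np.maximum(np.abs(y_true), 1e-12)) * 100,
    }
```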
- fit(self, X=None, y=None, sample_weight=None, verbose=0)
- Fits the gradient boosted decision tree regressor to the training data.
This method trains the ensemble of decision trees by iteratively fitting each tree to the residuals
of the previous iteration. The residuals are updated after each iteration by subtracting the predictions
made by the current tree from the target values.
Args:
X (array-like): Training input features of shape (n_samples, n_features).
y (array-like): Training target values of shape (n_samples,).
sample_weight (array-like): Sample weights for each instance (not used in this implementation).
verbose (int): Level of progress output. 0 for no output, 1 for basic progress messages, >1 for detailed output (e.g., residuals).
Returns:
self: The fitted GradientBoostedRegressor instance.
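A conceptual sketch of this residual-fitting loop is given below. It is not the class's actual implementation: it uses scikit-learn's DecisionTreeRegressor as a stand-in base learner and assumes squared-error residuals with a mean initial prediction.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gbr_sketch(X, y, num_trees=100, max_depth=3, learning_rate=0.1, min_samples_split=2):
    """Conceptual gradient-boosting loop (squared error), not the class's internals."""
    init_prediction = np.mean(y)        # start from the mean of the targets
    residuals = y - init_prediction     # initial residuals
    trees = []
    for _ in range(num_trees):
        tree = DecisionTreeRegressor(max_depth=max_depth,
                                     min_samples_split=min_samples_split)
        tree.fit(X, residuals)          # fit this stage to the current residuals
        residuals -= learning_rate * tree.predict(X)  # shrink residuals by the scaled predictions
        trees.append(tree)
    return init_prediction, trees
```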
- get_params(self)
- Get the parameters of the GradientBoostedRegressor.
- get_stats(self, y_true, y_pred, verbose=False)
- Calculate and optionally print evaluation metrics.
Args:
y_true (array-like): True target values.
y_pred (array-like): Predicted target values.
verbose (bool): Whether to print the calculated metrics.
Returns:
dict: A dictionary containing calculated metrics (MSE, R^2, MAE, RMSE, MAPE).
- predict(self, X)
- Predicts target values for input features X using the fitted GBR model.
Args:
X (array-like): Input features of shape (n_samples, n_features).
Returns:
np.ndarray: Predicted target values of shape (n_samples,).
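For intuition, the ensemble prediction is the initial estimate plus the learning-rate-scaled sum of the individual trees' predictions. The helper below pairs with the fit_gbr_sketch above and is likewise a conceptual sketch, not the class's internal code.

```python
import numpy as np

def predict_gbr_sketch(X, init_prediction, trees, learning_rate=0.1):
    """Aggregate the staged predictions (conceptual counterpart to fit_gbr_sketch)."""
    pred = np.full(X.shape[0], init_prediction, dtype=float)
    for tree in trees:
        pred += learning_rate * tree.predict(X)  # add each stage's scaled contribution
    return pred
```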
Data descriptors defined here:
- __dict__
- dictionary for instance variables
- __weakref__
- list of weak references to the object