ml_grid.model_classes.lightgbm_class

LightGBM Classifier Wrapper.

This module provides a scikit-learn compatible wrapper for the LightGBM classifier, handling feature name sanitization.

Classes

LightGBMClassifier

A scikit-learn compatible wrapper around lightgbm.LGBMClassifier that sanitizes feature names before fitting and prediction.

Module Contents

class ml_grid.model_classes.lightgbm_class.LightGBMClassifier(boosting_type: str = 'gbdt', num_leaves: int = 31, learning_rate: float = 0.05, n_estimators: int = 100, objective: str = 'binary', num_class: int | None = None, metric: str = 'logloss', feature_fraction: float = 0.9, early_stopping_rounds: int | None = None, verbosity: int = -1)

Bases: sklearn.base.BaseEstimator, sklearn.base.ClassifierMixin

Initializes the LightGBMClassifier wrapper.

Parameters:

boosting_type (str) – The type of boosting to use. Defaults to 'gbdt'.
num_leaves (int) – Maximum number of leaves in one tree. Defaults to 31.
learning_rate (float) – Boosting learning rate. Defaults to 0.05.
n_estimators (int) – Number of boosting rounds. Defaults to 100.
objective (str) – The learning objective. Defaults to 'binary'.
num_class (Optional[int]) – The number of classes for multiclass classification. Not needed for binary classification. Defaults to None.
metric (str) – The metric to be used for evaluation. Defaults to 'logloss'.
feature_fraction (float) – Fraction of features to be considered for each tree. Defaults to 0.9.
early_stopping_rounds (Optional[int]) – Activates early stopping when set. Defaults to None.
verbosity (int) – Controls the level of LightGBM’s verbosity. Defaults to -1.

boosting_type = 'gbdt'

num_leaves = 31

learning_rate = 0.05

n_estimators = 100

objective = 'binary'

num_class = None

metric = 'logloss'

feature_fraction = 0.9

early_stopping_rounds = None

model: lightgbm.LGBMClassifier | None = None

verbosity = -1

classes_: numpy.ndarray | None = None

fit(X: pandas.DataFrame, y: pandas.Series | numpy.ndarray) → LightGBMClassifier

Fits the LightGBM model.

This method sanitizes the feature names in X before fitting the underlying lgb.LGBMClassifier.

Parameters:

X (pd.DataFrame) – The training input samples.
y (Union[pd.Series, np.ndarray]) – The target values.

Returns:

The fitted estimator.

Return type:

LightGBMClassifier
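
The sanitization rule itself is not spelled out on this page. LightGBM is known to reject feature names that contain certain special characters, so one plausible approach is to replace anything outside letters, digits, and underscores, then resolve any collisions. The sketch below illustrates that idea on a plain list of column names; `sanitize_feature_names` is a hypothetical helper, and both the character set and the de-duplication scheme are assumptions rather than the wrapper's actual behavior.

```python
import re


def sanitize_feature_names(names):
    """Hypothetical helper (not part of ml_grid): make column names safe.

    Assumption: every character outside [A-Za-z0-9_] is replaced with '_',
    and names that collide after replacement get a numeric suffix.
    """
    used = set()
    cleaned = []
    for name in names:
        base = re.sub(r"[^A-Za-z0-9_]", "_", str(name))
        candidate, suffix = base, 0
        # Keep incrementing the suffix until the name is unique.
        while candidate in used:
            suffix += 1
            candidate = f"{base}_{suffix}"
        used.add(candidate)
        cleaned.append(candidate)
    return cleaned


raw_columns = ["age (years)", "bmi[kg/m2]", "a b", "a:b"]
print(sanitize_feature_names(raw_columns))
# → ['age__years_', 'bmi_kg_m2_', 'a_b', 'a_b_1']
```

Whatever rule is used, it has to be deterministic and applied identically in fit, predict, and predict_proba, otherwise the column names seen at prediction time will not match those used during training.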

predict(X: pandas.DataFrame) → numpy.ndarray

Predicts class labels for samples in X.

This method sanitizes the feature names in X to match those used during training.

Parameters:

X (pd.DataFrame) – The input samples to predict.

Raises:

ValueError – If the model has not been fitted yet.

Returns:

The predicted class labels.

Return type:

np.ndarray
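
The fitted-state check described above can be sketched with a small stand-in class. `FittedGuardSketch` is hypothetical and is not the real wrapper: it only mirrors the documented `model` attribute, which is None until fit is called, and it "trains" by remembering the most frequent label so that the example runs without LightGBM installed. The error message text is also an assumption.

```python
class FittedGuardSketch:
    """Hypothetical stand-in that shows only the not-fitted guard."""

    def __init__(self):
        # Mirrors the documented `model` attribute: None until fit() runs.
        self.model = None

    def fit(self, X, y):
        # A real implementation would train lightgbm.LGBMClassifier here.
        # This stand-in just stores the most frequent label in y.
        labels = list(y)
        self.model = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        if self.model is None:
            raise ValueError("Model has not been fitted yet. Call fit() first.")
        return [self.model for _ in X]


clf = FittedGuardSketch()
try:
    clf.predict([[1.0]])
except ValueError as exc:
    print(f"Caught: {exc}")

clf.fit([[1.0], [2.0], [3.0]], [0, 1, 1])
print(clf.predict([[4.0], [5.0]]))  # → [1, 1]
```

Callers that may receive an unfitted estimator can catch ValueError in the same way around predict, score, or predict_proba.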

score(X: pandas.DataFrame, y: pandas.Series | numpy.ndarray) → float

Returns the mean accuracy on the given test data and labels.

Parameters:

X (pd.DataFrame) – Test samples.
y (Union[pd.Series, np.ndarray]) – True labels for X.

Raises:

ValueError – If the model has not been fitted yet.

Returns:

Mean accuracy of self.predict(X) with respect to y.

Return type:

float
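
Mean accuracy is the fraction of samples whose predicted label equals the true label. The sketch below computes it on plain Python lists; `mean_accuracy` is a hypothetical helper used only for illustration, and the wrapper may well delegate to scikit-learn instead.

```python
def mean_accuracy(y_true, y_pred):
    """Hypothetical helper: fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length.")
    if len(y_true) == 0:
        raise ValueError("Cannot compute accuracy on an empty input.")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Three of the four predictions match the true labels.
print(mean_accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # → 0.75
```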

predict_proba(X: pandas.DataFrame) → numpy.ndarray

Predicts class probabilities for samples in X.

This method sanitizes the feature names in X to match those used during training.

Parameters:

X (pd.DataFrame) – The input samples to predict.

Raises:

ValueError – If the model has not been fitted yet.

Returns:

The predicted class probabilities.

Return type:

np.ndarray
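
To relate the probability output to class labels, the sketch below picks the most probable class for each row. It assumes the scikit-learn convention that column i of the probability array corresponds to classes_[i]. `labels_from_proba` is a hypothetical helper, and whether the wrapper's predict matches this argmax exactly is an assumption.

```python
def labels_from_proba(proba, classes):
    """Hypothetical helper: map each probability row to its most likely class.

    Assumes column i of `proba` corresponds to classes[i].
    """
    labels = []
    for row in proba:
        # Index of the largest probability in this row.
        best = max(range(len(row)), key=row.__getitem__)
        labels.append(classes[best])
    return labels


proba = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]]
classes = ["neg", "pos"]
print(labels_from_proba(proba, classes))  # → ['neg', 'pos', 'pos']
```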