ml_grid.model_classes.H2OStackedEnsembleClassifier
Attributes
Classes
H2OStackedEnsembleClassifier – Initializes the H2OStackedEnsembleClassifier.
Module Contents
- class ml_grid.model_classes.H2OStackedEnsembleClassifier.H2OStackedEnsembleClassifier(base_models: List[ml_grid.model_classes.H2OBaseClassifier.H2OBaseClassifier] = None, **kwargs)[source]
Bases:
ml_grid.model_classes.H2OBaseClassifier.H2OBaseClassifier
Initializes the H2OStackedEnsembleClassifier.
- Parameters:
base_models (List[H2OBaseClassifier], optional) – A list of unfitted H2O classifier wrapper instances that will be trained as base learners. Each must be configured with nfolds > 1 and keep_cross_validation_predictions=True so that its cross-validation predictions are available to the metalearner.
**kwargs – Keyword arguments passed to the H2OStackedEnsembleEstimator. Common arguments include metalearner_algorithm, seed, etc.
- set_params(**kwargs)[source]
Overrides set_params to correctly handle the base_models list.
This is critical for scikit-learn’s clone function to work correctly during cross-validation.
- Returns:
The instance with updated parameters.
- Return type:
H2OStackedEnsembleClassifier
- get_params(deep: bool = True) dict[source]
Overrides get_params to ensure base_models is included.
This allows scikit-learn’s clone to work correctly.
- Returns:
A dictionary of the estimator’s parameters.
- Return type:
dict
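The interaction between get_params, set_params and cloning can be illustrated with a minimal sketch. Everything below is hypothetical: StubBase, StubEnsemble and clone_like are stand-ins invented for this example, not the ml_grid classes, and clone_like is a simplified imitation of the clone contract (rebuild a new instance from the reported parameters), not scikit-learn's implementation.

```python
import copy


class StubBase:
    """Hypothetical stand-in for an H2O classifier wrapper."""

    def __init__(self, nfolds=5):
        self.nfolds = nfolds


class StubEnsemble:
    """Hypothetical stand-in for the stacked ensemble wrapper."""

    def __init__(self, base_models=None, **kwargs):
        self.base_models = base_models if base_models is not None else []
        self.extra_params = dict(kwargs)

    def get_params(self, deep=True):
        # Always report base_models so a clone can rebuild the ensemble.
        params = {"base_models": self.base_models}
        params.update(self.extra_params)
        return params

    def set_params(self, **kwargs):
        # Handle the list separately; treat the rest as estimator kwargs.
        if "base_models" in kwargs:
            self.base_models = kwargs.pop("base_models")
        self.extra_params.update(kwargs)
        return self


def clone_like(estimator):
    """Simplified imitation of cloning: construct a new object from get_params."""
    params = copy.deepcopy(estimator.get_params(deep=False))
    return type(estimator)(**params)


original = StubEnsemble(base_models=[StubBase(nfolds=3)], seed=42)
duplicate = clone_like(original)
```

If get_params omitted base_models, the rebuilt object in this sketch would start with an empty list, which is the failure mode the overrides are meant to prevent.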
- score(X: pandas.DataFrame, y: pandas.Series, sample_weight=None) float[source]
Returns the mean accuracy on the given test data and labels.
This method is required for scikit-learn compatibility, especially for use with tools like GridSearchCV when no scoring is specified.
- Parameters:
X (pd.DataFrame) – Test samples.
y (pd.Series) – True labels for X.
sample_weight – Sample weights (ignored, for API compatibility).
- Returns:
The mean accuracy of the model.
- Return type:
float
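As a rough illustration of what "mean accuracy" means here, the following sketch computes the fraction of predictions that match the true labels. The helper name mean_accuracy is an assumption made for this example; how the class obtains its predictions internally is not shown on this page.

```python
def mean_accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    y_true = list(y_true)
    y_pred = list(y_pred)
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty set of samples")
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


# Three of the four predictions match the labels.
accuracy = mean_accuracy([0, 1, 1, 0], [0, 1, 0, 0])
print(accuracy)  # 0.75
```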
- fit(X: pandas.DataFrame, y: pandas.Series, **kwargs) H2OStackedEnsembleClassifier[source]
Fits the H2O Stacked Ensemble model, making it compatible with scikit-learn’s CV tools.
This method encapsulates the entire two-stage fitting process:
1. It first fits each of the base models on the provided training data, ensuring they are trained with cross-validation to generate predictions for the metalearner.
2. It then collects the model IDs of the fitted base models.
3. Finally, it trains the metalearner (the stacked ensemble model) using these base models.
- Parameters:
X (pd.DataFrame) – The feature matrix.
y (pd.Series) – The target vector.
**kwargs – Additional keyword arguments (not used).
- Returns:
The fitted classifier instance.
- Return type:
H2OStackedEnsembleClassifier
- Raises:
ValueError – If base_models is empty or not provided.
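The control flow of fit described above can be sketched with plain Python stand-ins. StubBaseModel and StubStackedEnsemble are hypothetical classes written for this example and do not call H2O. The fallback of nfolds = 5 and the metalearner_ dictionary are assumptions; the real class may enforce cross-validation and train the metalearner differently.

```python
class StubBaseModel:
    """Hypothetical base learner; a real wrapper would train an H2O model."""

    def __init__(self, name, nfolds=0, keep_cross_validation_predictions=False):
        self.name = name
        self.nfolds = nfolds
        self.keep_cross_validation_predictions = keep_cross_validation_predictions
        self.model_id = None

    def fit(self, X, y):
        # Pretend to train and record an identifier for the fitted model.
        self.model_id = f"{self.name}_model"
        return self


class StubStackedEnsemble:
    """Hypothetical ensemble showing the order of the fitting steps."""

    def __init__(self, base_models=None, **kwargs):
        self.base_models = base_models
        self.kwargs = kwargs

    def fit(self, X, y, **kwargs):
        if not self.base_models:
            raise ValueError("base_models must be a non-empty list")
        # Step 1: fit every base model with cross-validation enabled.
        for model in self.base_models:
            if model.nfolds < 2:
                model.nfolds = 5  # assumed fallback, not taken from the source
            model.keep_cross_validation_predictions = True
            model.fit(X, y)
        # Step 2: collect the IDs of the fitted base models.
        self.base_model_ids_ = [m.model_id for m in self.base_models]
        # Step 3: a real implementation would train the metalearner here.
        self.metalearner_ = {"base_models": list(self.base_model_ids_), **self.kwargs}
        return self


X = [[0.1, 1.0], [0.4, 0.2], [0.9, 0.7], [0.3, 0.5]]
y = [0, 1, 1, 0]
ensemble = StubStackedEnsemble(
    base_models=[StubBaseModel("gbm"), StubBaseModel("glm", nfolds=3)],
    metalearner_algorithm="glm",
)
fitted = ensemble.fit(X, y)
print(fitted.base_model_ids_)  # ['gbm_model', 'glm_model']
```

Returning self from fit is what lets the fitted object be chained or handed back to scikit-learn's cross-validation tools.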