ml_grid.pipeline.hyperparameter_search

Classes

HyperparameterSearch

Runs a hyperparameter search (grid, randomized, or Bayesian) for a scikit-learn compatible estimator.

Module Contents

class ml_grid.pipeline.hyperparameter_search.HyperparameterSearch(algorithm: sklearn.base.BaseEstimator, parameter_space: Dict | List[Dict], method_name: str, global_params: Any, sub_sample_pct: int = 100, max_iter: int = 100, ml_grid_object: Any = None, cv: Any = None)[source]

Initializes the HyperparameterSearch class.

Parameters:

algorithm (BaseEstimator) – The scikit-learn compatible estimator instance.
parameter_space (Union[Dict, List[Dict]]) – The hyperparameter search space.
method_name (str) – The name of the algorithm.
global_params (Any) – The global parameters object.
sub_sample_pct (int, optional) – Percentage of the parameter space to sample for randomized search. Defaults to 100.
max_iter (int, optional) – The maximum number of iterations for randomized or Bayesian search. Defaults to 100.
ml_grid_object (Any, optional) – The main pipeline object containing data and other parameters. Defaults to None.
cv (Any, optional) – Cross-validation splitting strategy. Can be None, an int, or a CV splitter object. Defaults to None (no cross-validation).
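
The relationship between sub_sample_pct and max_iter can be illustrated with a small sketch. It is not taken from the ml_grid source: the helper names grid_size and sample_budget are hypothetical, and the rule itself (sample the given percentage of the grid, cap it at max_iter) is an assumption about how the two parameters might combine.

```python
import math
from typing import Dict, List, Union

ParamSpace = Union[Dict[str, list], List[Dict[str, list]]]


def grid_size(parameter_space: ParamSpace) -> int:
    """Count the distinct combinations in a dict or list-of-dicts space."""
    # A list of dicts is treated as a union of independent sub-grids.
    spaces = parameter_space if isinstance(parameter_space, list) else [parameter_space]
    return sum(math.prod(len(values) for values in space.values()) for space in spaces)


def sample_budget(parameter_space: ParamSpace,
                  sub_sample_pct: int = 100,
                  max_iter: int = 100) -> int:
    """Hypothetical rule: sample pct% of the grid, capped at max_iter, at least 1."""
    n_total = grid_size(parameter_space)
    # Integer ceiling of n_total * pct / 100, avoiding float rounding.
    n_sampled = (n_total * sub_sample_pct + 99) // 100
    return max(1, min(n_sampled, max_iter))


space = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
print(grid_size(space))                                       # 6 combinations
print(sample_budget(space, sub_sample_pct=50))                # half of the grid
print(sample_budget(space, sub_sample_pct=100, max_iter=4))   # capped by max_iter
```

Under this assumed rule, max_iter acts as a hard ceiling, so a large grid with sub_sample_pct=100 is still limited to max_iter sampled candidates.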

algorithm: sklearn.base.BaseEstimator[source]: The scikit-learn compatible estimator instance.

parameter_space: Dict | List[Dict][source]: The hyperparameter search space.

method_name: str[source]: The name of the algorithm.

global_params: ml_grid.util.global_params.global_parameters[source]: A reference to the global parameters singleton instance.

sub_sample_pct: int[source]: Percentage of the parameter space to sample for randomized search. Defaults to 100.

max_iter: int[source]: The maximum number of iterations for randomized or Bayesian search. Defaults to 100.

ml_grid_object: Any[source]: The main pipeline object containing data and other parameters.

cv = None[source]: Cross-validation splitting strategy. Can be None, an int, or a CV splitter object.

run_search(X_train: pandas.DataFrame, y_train: pandas.Series) → sklearn.base.BaseEstimator[source]

Executes the hyperparameter search.

This method selects the search strategy (Grid, Random, or Bayesian) based on global parameters and runs the search on the provided training data.

Parameters:

X_train (pd.DataFrame) – Training features. The index is expected to have been reset.
y_train (pd.Series) – Training labels. The index is expected to have been reset so that it aligns with X_train.

Returns:

The best estimator found during the search.

Return type:

BaseEstimator
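
The idea of choosing a search strategy and returning the best candidate can be sketched without any library. This is not the ml_grid implementation: run_search_sketch, expand_grid, and the strategy argument are hypothetical names, the scoring function is a toy stand-in for model evaluation, and the Bayesian branch is left out because it would need an external optimiser.

```python
import random
from itertools import product
from typing import Callable, Dict, List


def expand_grid(space: Dict[str, list]) -> List[dict]:
    """Enumerate every parameter combination in a dict-style search space."""
    keys = sorted(space)
    return [dict(zip(keys, combo)) for combo in product(*(space[k] for k in keys))]


def run_search_sketch(space: Dict[str, list],
                      score: Callable[[dict], float],
                      strategy: str = "grid",
                      n_iter: int = 10,
                      seed: int = 0) -> dict:
    """Return the highest-scoring parameter set under the chosen strategy."""
    candidates = expand_grid(space)
    if strategy == "random":
        # Randomized search: evaluate only a random subset of the grid.
        rng = random.Random(seed)
        candidates = rng.sample(candidates, min(n_iter, len(candidates)))
    elif strategy != "grid":
        raise ValueError(f"unsupported strategy in this sketch: {strategy!r}")
    return max(candidates, key=score)


# Toy objective with its maximum at x=2, y=1 (stands in for a CV score).
def toy_score(params: dict) -> float:
    return -((params["x"] - 2) ** 2) - (params["y"] - 1) ** 2


space = {"x": [0, 1, 2, 3], "y": [0, 1, 2]}
best = run_search_sketch(space, toy_score, strategy="grid")
print(best)
```

In the real class the strategy is read from the global parameters rather than passed as an argument, and the return value is a fitted estimator rather than a parameter dict.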