ml_grid.pipeline.main

Classes

run

Initializes the run class.

Module Contents

class ml_grid.pipeline.main.run(ml_grid_object: ml_grid.pipeline.data.pipe, local_param_dict: Dict[str, Any])

Initializes the run class.

This class takes the main data pipeline object and a dictionary of local parameters to set up and prepare for executing a series of hyperparameter searches across multiple machine learning models.

Parameters:
  • ml_grid_object (pipe) – The main data pipeline object, which contains the data (X_train, y_train, etc.) and a list of model classes to be evaluated.

  • local_param_dict (Dict[str, Any]) – A dictionary of parameters for the current experimental run, such as param_space_size.

global_params: ml_grid.util.global_params.global_parameters

A reference to the global parameters singleton instance.

verbose: int

The verbosity level for logging, inherited from global parameters.

error_raise: bool

A flag to control error handling. If True, exceptions will be raised.

ml_grid_object: ml_grid.pipeline.data.pipe

The main data pipeline object, containing data and model configurations.

sub_sample_param_space_pct: float

The percentage of the parameter space to sample in a randomized search.

parameter_space_size: str

The size of the parameter space for base learners (e.g., ‘medium’, ‘xsmall’).

model_class_list: List[Any]

A list of instantiated model class objects to be evaluated in this run.

pg_list: List[int]

A list containing the calculated size of the parameter grid for each model.

mean_parameter_space_val: float

The mean size of the parameter spaces across all models in the run.
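
As a rough illustration of how pg_list and mean_parameter_space_val relate, the sketch below counts the combinations in each parameter grid and averages the counts. It assumes each grid is a plain dict mapping parameter names to lists of candidate values; the grids and the grid_size helper are hypothetical and are not taken from ml_grid itself.

```python
from math import prod
from statistics import mean
from typing import Any, Dict, List


def grid_size(param_grid: Dict[str, List[Any]]) -> int:
    """Count the distinct combinations in a dict-of-lists parameter grid."""
    # The number of combinations is the product of the list lengths.
    return prod(len(values) for values in param_grid.values())


# Hypothetical parameter grids for two models (illustrative values only).
param_grids = [
    {"C": [0.1, 1.0, 10.0], "penalty": ["l1", "l2"]},        # 3 * 2 combinations
    {"n_estimators": [50, 100], "max_depth": [3, 5, 7, 9]},  # 2 * 4 combinations
]

# One grid size per model, then the mean across models.
pg_list = [grid_size(grid) for grid in param_grids]
mean_parameter_space_val = float(mean(pg_list))
```

For dict-of-lists grids this count should match what scikit-learn reports via len(ParameterGrid(grid)), but whether ml_grid computes it that way is not stated here.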

sub_sample_parameter_val: int

The calculated number of iterations for randomized search, based on sub_sample_param_space_pct.
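
The exact formula is not given in this reference. One plausible reading, sketched below, treats sub_sample_param_space_pct as a value on a 0 to 100 scale and applies it to the mean parameter space size, with a floor of one iteration. The function name, the scale, and the floor are all assumptions made for illustration.

```python
def sub_sample_iterations(mean_space_size: float, pct: float) -> int:
    """Assumed formula: pct percent of the mean grid size, never below 1."""
    # pct is assumed to be expressed on a 0-100 scale (hypothetical).
    return max(1, int(mean_space_size * pct / 100))


# 10 percent of a mean grid size of 200 combinations.
n_iter_large = sub_sample_iterations(200.0, 10.0)

# A very small grid still gets at least one iteration.
n_iter_small = sub_sample_iterations(7.0, 1.0)
```

The floor of one keeps a randomized search from being asked for zero iterations when the grid is small.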

arg_list: List[Tuple]

A list of argument tuples, one for each model, to be passed to the grid search function.

multiprocess: bool

A flag to enable or disable multiprocessing for running grid searches in parallel.

local_param_dict: Dict[str, Any]

A dictionary of parameters for the current experimental run.

model_error_list: List[List[Any]]

A list to store details of any errors encountered during model training.

highest_score: float

The highest score achieved across all successful model runs in the execute step.

execute() -> Tuple[List[List[Any]], float]

Executes the grid search for each model in the list.

This method iterates through the list of configured models and their parameter spaces, running a cross-validated grid search for each one. It captures any errors that occur during the process and returns a list of those errors along with the highest score achieved.

Returns:

A tuple containing:
  • A list of model errors, where each error is a list containing the algorithm instance, the exception, and the traceback.

  • The highest score achieved across all successful model runs.

Return type:

Tuple[List[List[Any]], float]
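
The error-collecting pattern described for execute() can be sketched roughly as follows. This is not the ml_grid implementation: run_all, score_fn, and fake_score are hypothetical stand-ins, the cross-validated grid search is replaced by a simple scoring callable, and starting highest_score at 0.0 is an assumption.

```python
import traceback
from typing import Any, Callable, List, Tuple


def run_all(
    models: List[Any],
    score_fn: Callable[[Any], float],
    error_raise: bool = False,
) -> Tuple[List[List[Any]], float]:
    """Score each model, recording failures instead of stopping the run."""
    model_error_list: List[List[Any]] = []
    highest_score = 0.0  # assumed starting value
    for model in models:
        try:
            score = score_fn(model)  # stand-in for the grid search step
            highest_score = max(highest_score, score)
        except Exception as exc:
            if error_raise:
                raise  # propagate immediately when error_raise is True
            # Each entry: [algorithm instance, exception, traceback text]
            model_error_list.append([model, exc, traceback.format_exc()])
    return model_error_list, highest_score


def fake_score(name: str) -> float:
    """Hypothetical scorer: fails for one model, returns fixed scores otherwise."""
    if name == "broken":
        raise ValueError("fit failed")
    return {"a": 0.7, "b": 0.9}[name]


errors, best = run_all(["a", "broken", "b"], fake_score)
```

With error_raise left False, the failing model is recorded and the loop continues, so one bad configuration does not discard the scores of the others.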