ml_grid.util.global_params
==========================

.. py:module:: ml_grid.util.global_params

.. autoapi-nested-parse::

   Global parameters for the ml_grid project.

   This module defines a singleton class `GlobalParameters` to hold configuration
   settings that are accessible throughout the application. It also includes a
   custom scoring function for ROC AUC that handles cases with a single class.


Attributes
----------

.. autoapisummary::

   ml_grid.util.global_params.global_parameters


Classes
-------

.. autoapisummary::

   ml_grid.util.global_params.GlobalParameters


Functions
---------

.. autoapisummary::

   ml_grid.util.global_params.custom_roc_auc_score


Module Contents
---------------

.. py:function:: custom_roc_auc_score(y_true: numpy.ndarray, y_pred: numpy.ndarray) -> float

   Calculates ROC AUC score, handling cases with only one class in y_true.

   If `y_true` contains fewer than two unique classes, ROC AUC is undefined.
   In such cases, this function returns np.nan.

   :param y_true: True binary labels.
   :type y_true: np.ndarray
   :param y_pred: Target scores.
   :type y_pred: np.ndarray

   :returns: The ROC AUC score, or np.nan if the score is undefined.
   :rtype: float
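
   The single-class guard can be illustrated without scikit-learn. The sketch below is not this module's implementation: `pairwise_auc` is a hypothetical, pure-Python stand-in that assumes 0/1 labels and computes ROC AUC as the fraction of (positive, negative) pairs that are ranked correctly. With only one class present there are no such pairs, which is why the score is undefined and NaN is returned.

   .. code-block:: python

      import math

      def pairwise_auc(y_true, y_score):
          """Hypothetical stand-in: ROC AUC via pairwise comparison, NaN if undefined."""
          positives = [s for t, s in zip(y_true, y_score) if t == 1]
          negatives = [s for t, s in zip(y_true, y_score) if t == 0]
          # Fewer than two classes: no (positive, negative) pairs exist.
          if not positives or not negatives:
              return math.nan
          # A correctly ranked pair counts 1, a tie counts 0.5.
          wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
          return wins / (len(positives) * len(negatives))

      single_class = pairwise_auc([1, 1, 1], [0.2, 0.7, 0.9])
      two_classes = pairwise_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])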


.. py:class:: GlobalParameters(debug_level: int = 0, knn_n_jobs: int = -1)

   Singleton container for the global parameters of the ml_grid project.

   The constructor sets the default values for all global parameters. The
   `_initialized` flag prevents re-initialization on subsequent calls.

   :param debug_level: The initial debug level. Defaults to 0.
   :type debug_level: int, optional
   :param knn_n_jobs: The number of jobs for KNN. Defaults to -1.
   :type knn_n_jobs: int, optional
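
   One common way to get this behaviour is to reuse a single instance in `__new__` and to guard `__init__` with the flag. The sketch below is an assumption about the pattern, not the actual source of `GlobalParameters`; the class name `SingletonParams` is hypothetical and only two of the parameters are shown.

   .. code-block:: python

      class SingletonParams:
          """Hypothetical illustration of a singleton with an init guard."""

          _instance = None

          def __new__(cls, *args, **kwargs):
              # Create the instance once, then always hand back the same object.
              if cls._instance is None:
                  cls._instance = super().__new__(cls)
              return cls._instance

          def __init__(self, debug_level=0, knn_n_jobs=-1):
              # __init__ runs on every call, so skip it after the first one.
              if getattr(self, "_initialized", False):
                  return
              self.debug_level = debug_level
              self.knn_n_jobs = knn_n_jobs
              self._initialized = True

      first = SingletonParams(debug_level=2)
      second = SingletonParams(debug_level=5)  # arguments ignored: already initialized

   Under this pattern the arguments of a second constructor call are silently ignored, so later changes have to go through an explicit update method.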


   .. py:attribute:: debug_level
      :type:  int

      The verbosity level for debugging. Not widely used. Defaults to 0.


   .. py:attribute:: knn_n_jobs
      :type:  int

      The number of parallel jobs to run for KNN algorithms. -1 means using all available processors. Defaults to -1.


   .. py:attribute:: verbose
      :type:  int

      Controls the verbosity of output during the pipeline run. Higher values produce more detailed logs. Defaults to 0.


   .. py:attribute:: rename_cols
      :type:  bool

      If True, renames DataFrame columns to remove special characters (e.g., '[', ']', '<') that can cause issues with some models such as XGBoost. Defaults to True.
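
      A minimal sketch of such a clean-up, operating on a plain list of column names. The helper name `clean_column_names`, the exact character set, and the choice to substitute an underscore are assumptions for illustration; the renaming that ml_grid performs may differ.

      .. code-block:: python

         import re

         def clean_column_names(columns):
             """Hypothetical helper: replace '[', ']' and '<' with underscores."""
             return [re.sub(r"[\[\]<]", "_", str(name)) for name in columns]

         cleaned = clean_column_names(["age[years]", "bmi<30", "sex"])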


   .. py:attribute:: error_raise
      :type:  bool

      If True, the pipeline will stop and raise an exception if an error occurs during model training or evaluation. If False, it will log the error and continue. Defaults to False.
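
      The two behaviours can be sketched as a small wrapper around one pipeline step. The function `run_step` and its `None` return value on failure are hypothetical; they only illustrate the raise-or-log switch described above.

      .. code-block:: python

         import logging

         logger = logging.getLogger(__name__)

         def run_step(step, error_raise=False):
             """Hypothetical wrapper: run a callable, then raise or log on failure."""
             try:
                 return step()
             except Exception:
                 if error_raise:
                     # Stop the pipeline by propagating the original exception.
                     raise
                 # Otherwise record the traceback and carry on.
                 logger.exception("Step failed; continuing with the next one")
                 return None

         def failing_step():
             raise ValueError("boom")

         result = run_step(failing_step)  # logged, pipeline continues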


   .. py:attribute:: random_grid_search
      :type:  bool

      If True and `bayessearch` is False, uses `RandomizedSearchCV` instead of `GridSearchCV`. Defaults to False.


   .. py:attribute:: bayessearch
      :type:  bool

      If True, uses `BayesSearchCV` from `scikit-optimize` for hyperparameter tuning, which can be more efficient than grid or random search. Defaults to True.
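
      Read together with `random_grid_search`, the two flags imply an order of precedence. The sketch below encodes that reading; `choose_search` is a hypothetical helper that returns the class name as a string and is not part of ml_grid.

      .. code-block:: python

         def choose_search(bayessearch=True, random_grid_search=False):
             """Hypothetical helper: name the search strategy the flags select."""
             if bayessearch:
                 # bayessearch takes priority over random_grid_search.
                 return "BayesSearchCV"
             if random_grid_search:
                 return "RandomizedSearchCV"
             return "GridSearchCV"

         default_choice = choose_search()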


   .. py:attribute:: sub_sample_param_space_pct
      :type:  float

      The fraction of the total parameter space to sample when using `RandomizedSearchCV`. For example, 0.1 means that 10% of the combinations are tried. Defaults to 0.0005 (0.05%).


   .. py:attribute:: grid_n_jobs
      :type:  int

      The number of jobs to run in parallel for hyperparameter search (`GridSearchCV`, `RandomizedSearchCV`, `BayesSearchCV`). -1 means using all available processors. Defaults to -1.


   .. py:attribute:: time_limit_param
      :type:  List[int]

      A parameter for future use, intended to set time limits on model fitting. Currently not implemented. Defaults to [3].


   .. py:attribute:: random_state_val
      :type:  int

      A seed value for random number generation to ensure reproducibility across runs. Defaults to 1234.


   .. py:attribute:: n_jobs_model_val
      :type:  int

      The number of parallel jobs for models that support it (e.g., RandomForest). -1 means using all available processors. Defaults to -1.


   .. py:attribute:: max_param_space_iter_value
      :type:  int

      A hard limit on the number of parameter combinations to evaluate in `RandomizedSearchCV` or `BayesSearchCV`. Prevents excessively long run times. Defaults to 10.
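
      How this limit interacts with `sub_sample_param_space_pct` is not stated here. One plausible reading, shown purely as an assumption, is that the sampled fraction of the parameter space is computed first and then capped by the hard limit. The helper `planned_iterations` and the floor of one iteration are hypothetical.

      .. code-block:: python

         import math

         def planned_iterations(n_combinations, fraction=0.0005, hard_limit=10):
             """Hypothetical helper: sampled share of the space, capped by a limit."""
             # Assumption: always try at least one combination.
             sampled = max(1, math.ceil(n_combinations * fraction))
             return min(sampled, hard_limit)

         small_space = planned_iterations(1_000)    # 0.05% of 1,000 rounds up to 1
         large_space = planned_iterations(100_000)  # roughly 50 sampled, capped at 10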


   .. py:attribute:: store_models
      :type:  bool

      Whether to save trained models to disk. Defaults to True.


   .. py:attribute:: metric_list
      :type:  Dict[str, Union[str, Callable]]

      A dictionary of scoring metrics to evaluate models during cross-validation. Keys are metric names and values are scikit-learn scorer strings or callable objects.


   .. py:method:: update_parameters(**kwargs: Any) -> None

      Updates global parameters at runtime.

      :param \*\*kwargs: Key-value pairs of parameters to update.
      :type \*\*kwargs: Any

      :raises AttributeError: If a key in kwargs is not a valid parameter.
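
      A minimal sketch of the documented contract: known keys are updated and an unknown key raises `AttributeError`. The class `Params` and its two attributes are hypothetical stand-ins, not the real `GlobalParameters` implementation.

      .. code-block:: python

         class Params:
             """Hypothetical stand-in with two of the documented attributes."""

             def __init__(self):
                 self.verbose = 0
                 self.error_raise = False

             def update_parameters(self, **kwargs):
                 for key, value in kwargs.items():
                     # Reject keys that do not name an existing parameter.
                     if not hasattr(self, key):
                         raise AttributeError(f"Unknown parameter: {key!r}")
                     setattr(self, key, value)

         params = Params()
         params.update_parameters(verbose=2, error_raise=True)

      Rejecting unknown keys catches misspelled parameter names early instead of silently creating a new attribute.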


.. py:data:: global_parameters