plot_hyperparameters

Hyperparameter analysis plotting module for machine learning results. Focuses on visualizing the impact of hyperparameters on model performance.

Classes

HyperparameterAnalysisPlotter

Analyzes and visualizes the impact of hyperparameters on model performance.

Module Contents

class plot_hyperparameters.HyperparameterAnalysisPlotter(data: pandas.DataFrame)[source]

Analyzes and visualizes the impact of hyperparameters on model performance.

This class extracts hyperparameter settings from model string representations in the results data, allowing for detailed analysis of how different hyperparameters affect a given performance metric.

data[source]
clean_data[source]
get_available_algorithms()[source]

Gets a list of available, parsable algorithms from the data.

Returns:

A sorted list of unique algorithm names.

Return type:

List[str]
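
A minimal usage sketch, assuming the results DataFrame holds model string representations and a metric column the plotter can parse; the column names "model" and "auc" and the example rows are illustrative assumptions, not requirements documented here:

>>> import pandas as pd
>>> from plot_hyperparameters import HyperparameterAnalysisPlotter
>>> # Hypothetical results frame: model reprs plus a performance metric.
>>> results = pd.DataFrame({
...     "model": [
...         "RandomForestClassifier(max_depth=5, n_estimators=100)",
...         "RandomForestClassifier(max_depth=10, n_estimators=200)",
...         "LogisticRegression(C=1.0)",
...     ],
...     "auc": [0.81, 0.84, 0.79],
... })
>>> plotter = HyperparameterAnalysisPlotter(results)
>>> plotter.get_available_algorithms()  # e.g. ['LogisticRegression', 'RandomForestClassifier']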

plot_performance_by_hyperparameter(algorithm_name: str, hyperparameters: List[str], metric: str = 'auc', figsize: Tuple[int, int] | None = None)[source]

Plots performance against a list of hyperparameters in a grid.

This method provides a visual analysis of how individual parameter values affect the model’s metric score. It creates a grid of subplots, where each subplot visualizes the relationship between a specific hyperparameter and the performance metric, automatically choosing a scatter plot for continuous parameters or a box plot for categorical or discrete parameters.

Parameters:
  • algorithm_name (str) – The name of the algorithm to analyze (e.g., ‘RandomForestClassifier’).

  • hyperparameters (List[str]) – A list of hyperparameter names to plot.

  • metric (str, optional) – The performance metric for the y-axis. Defaults to ‘auc’.

  • figsize (Optional[Tuple[int, int]], optional) – The overall figure size. If None, a default is calculated. Defaults to None.
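
Continuing the sketch above (the hyperparameter names are assumptions tied to the illustrative data, not guaranteed by the module):

>>> plotter.plot_performance_by_hyperparameter(
...     algorithm_name="RandomForestClassifier",
...     hyperparameters=["max_depth", "n_estimators"],
...     metric="auc",
...     figsize=(12, 5),
... )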

plot_hyperparameter_importance(algorithm_name: str, metric: str = 'auc', top_n_percent: int = 20, figsize: Tuple[int, int] | None = None)[source]

Plots hyperparameter distributions for top models vs. all models.

This method provides insight into which hyperparameter values are more prevalent in high-performing models compared to the overall distribution of values explored during the search.

Parameters:
  • algorithm_name (str) – The name of the algorithm to analyze.

  • metric (str, optional) – The metric used to define “top” models. Defaults to ‘auc’.

  • top_n_percent (int, optional) – The percentage of top models to compare against. Defaults to 20.

  • figsize (Optional[Tuple[int, int]], optional) – The figure size for the plot. Defaults to None.
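
An example continuing the sketch above, comparing the best-scoring 10% of models against the full search:

>>> plotter.plot_hyperparameter_importance(
...     algorithm_name="RandomForestClassifier",
...     metric="auc",
...     top_n_percent=10,  # define "top" as the best-scoring 10% of models
... )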

plot_hyperparameter_correlations(algorithm_name: str, metric: str = 'auc', method: str = 'pearson', figsize: Tuple[int, int] | None = None, show_correlation_stats: bool = True)[source]

Plots correlation between continuous hyperparameters and a performance metric.

This method creates scatter plots to visualize the relationship between each continuous hyperparameter and the target metric, including a regression line and correlation statistics.

Parameters:
  • algorithm_name (str) – The name of the algorithm to analyze.

  • metric (str, optional) – The performance metric. Defaults to ‘auc’.

  • method (str, optional) – The correlation method (‘pearson’ or ‘spearman’). Defaults to ‘pearson’.

  • figsize (Optional[Tuple[int, int]], optional) – The figure size. Defaults to None.

  • show_correlation_stats (bool, optional) – Whether to print a summary table of correlations. Defaults to True.
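
For example, continuing the sketch above with Spearman (rank) correlation:

>>> plotter.plot_hyperparameter_correlations(
...     algorithm_name="RandomForestClassifier",
...     metric="auc",
...     method="spearman",            # rank correlation for monotonic relationships
...     show_correlation_stats=True,  # also print the correlation summary table
... )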

plot_top_correlations(algorithm_name: str, metric: str = 'auc', method: str = 'pearson', top_n: int = 5, figsize: Tuple[int, int] = (15, 10))[source]

Plots only the top N hyperparameters most strongly correlated with the metric.

Parameters:
  • algorithm_name (str) – The name of the algorithm to analyze.

  • metric (str, optional) – The performance metric. Defaults to ‘auc’.

  • method (str, optional) – The correlation method (‘pearson’ or ‘spearman’). Defaults to ‘pearson’.

  • top_n (int, optional) – The number of top correlated hyperparameters to plot. Defaults to 5.

  • figsize (Tuple[int, int], optional) – The figure size. Defaults to (15, 10).
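
Continuing the sketch above, restricting the plot to the three hyperparameters most correlated with the metric:

>>> plotter.plot_top_correlations(
...     algorithm_name="RandomForestClassifier",
...     metric="auc",
...     method="pearson",
...     top_n=3,
... )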