plot_algorithms

Algorithm comparison plotting module for ML results analysis. Focuses on comparing algorithm performance, with optional stratification by outcome variable.

Attributes

MAX_OUTCOMES_FOR_STRATIFIED_PLOT

Classes

AlgorithmComparisonPlotter

Compares algorithm performance from experiment results using box plots, heatmaps, rankings, stability, and trade-off plots.

Module Contents

plot_algorithms.MAX_OUTCOMES_FOR_STRATIFIED_PLOT = 20[source]
class plot_algorithms.AlgorithmComparisonPlotter(data: pandas.DataFrame)[source]

Initializes the AlgorithmComparisonPlotter.

Parameters:

data (pd.DataFrame) – A DataFrame containing the experiment results.
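
Example:

A minimal usage sketch (not taken from the library's own documentation). The 'auc', 'run_time', and 'outcome_variable' column names match names used elsewhere on this page; the 'algorithm' column and the sample algorithm/outcome values are illustrative assumptions, and a real results frame would contain many more runs.

>>> import pandas as pd
>>> from plot_algorithms import AlgorithmComparisonPlotter
>>> results = pd.DataFrame({
...     'algorithm': ['rf'] * 4 + ['xgb'] * 4,          # assumed column name
...     'outcome_variable': ['mortality', 'mortality',
...                          'readmission', 'readmission'] * 2,
...     'auc': [0.81, 0.79, 0.74, 0.76, 0.85, 0.84, 0.77, 0.78],
...     'run_time': [12.3, 12.9, 11.8, 12.1, 45.1, 44.0, 43.2, 44.8],
... })
>>> plotter = AlgorithmComparisonPlotter(results)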

data[source]
clean_data[source]
plot_algorithm_boxplots(metric: str = 'auc', algorithms_to_plot: List[str] | None = None, stratify_by_outcome: bool = False, outcomes_to_plot: List[str] | None = None, figsize: Tuple[int, int] = (12, 6)) None[source]

Creates box plots comparing algorithm performance.

Parameters:
  • metric (str, optional) – The performance metric to compare. Defaults to ‘auc’.

  • algorithms_to_plot (Optional[List[str]], optional) – A list of specific algorithms to include. If None, all are used. Defaults to None.

  • stratify_by_outcome (bool, optional) – If True, creates separate plots for each outcome. Defaults to False.

  • outcomes_to_plot (Optional[List[str]], optional) – A list of specific outcomes to plot. If None, all are used. Defaults to None.

  • figsize (Tuple[int, int], optional) – The figure size. Defaults to (12, 6).

Raises:

ValueError – If the specified metric is not found in the data.
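
Example:

A minimal sketch, continuing from the constructor example above; the algorithm names are the illustrative ones used there.

>>> plotter.plot_algorithm_boxplots(metric='auc', stratify_by_outcome=True)
>>> plotter.plot_algorithm_boxplots(metric='auc', algorithms_to_plot=['rf', 'xgb'], figsize=(10, 5))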

plot_algorithm_performance_heatmap(metric: str = 'auc', algorithms_to_plot: List[str] | None = None, outcomes_to_plot: List[str] | None = None, aggregation: str = 'mean', figsize: Tuple[int, int] = (12, 8)) pandas.DataFrame[source]

Creates a heatmap showing algorithm performance across outcomes.

Parameters:
  • metric (str, optional) – The performance metric to visualize. Defaults to ‘auc’.

  • algorithms_to_plot (Optional[List[str]], optional) – A list of specific algorithms to include. Defaults to None.

  • outcomes_to_plot (Optional[List[str]], optional) – A list of specific outcomes to include. Defaults to None.

  • aggregation (str, optional) – How to aggregate multiple runs (‘mean’, ‘median’, ‘max’). Defaults to ‘mean’.

  • figsize (Tuple[int, int], optional) – The figure size. Defaults to (12, 8).

Raises:

ValueError – If the ‘outcome_variable’ column is missing or an invalid aggregation method is provided.

Returns:

The pivot table data used for the heatmap.

Return type:

pd.DataFrame
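
Example:

A minimal sketch, continuing from the constructor example above. The returned pivot table of aggregated scores can be inspected or persisted; the file name is illustrative.

>>> pivot = plotter.plot_algorithm_performance_heatmap(metric='auc', aggregation='median')
>>> pivot.to_csv('algorithm_auc_by_outcome.csv')  # e.g. persist the aggregated scores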

plot_algorithm_ranking(metric: str = 'auc', algorithms_to_plot: List[str] | None = None, stratify_by_outcome: bool = False, outcomes_to_plot: List[str] | None = None, top_n: int = 10, figsize: Tuple[int, int] = (10, 8)) None[source]

Plots a ranked bar chart of algorithm performance.

Parameters:
  • metric (str, optional) – The performance metric to rank by. Defaults to ‘auc’.

  • algorithms_to_plot (Optional[List[str]], optional) – A list of specific algorithms to include. Defaults to None.

  • stratify_by_outcome (bool, optional) – If True, creates separate plots for each outcome. Defaults to False.

  • outcomes_to_plot (Optional[List[str]], optional) – A list of specific outcomes to plot when stratified. Defaults to None.

  • top_n (int, optional) – The number of top algorithms to display. Defaults to 10.

  • figsize (Tuple[int, int], optional) – The figure size. Defaults to (10, 8).

Raises:

ValueError – If the specified metric is not found, or if stratify_by_outcome is True and the ‘outcome_variable’ column is missing.
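
Example:

A minimal sketch, continuing from the constructor example above; the outcome name 'mortality' is illustrative.

>>> plotter.plot_algorithm_ranking(metric='auc', top_n=5)
>>> plotter.plot_algorithm_ranking(metric='auc', stratify_by_outcome=True, outcomes_to_plot=['mortality'])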

plot_algorithm_stability(metric: str = 'auc', top_n: int = 15, figsize: Tuple[int, int] = (10, 8)) None[source]

Plots the stability (standard deviation) of algorithm performance.

A lower standard deviation indicates more stable and predictable performance across different runs and data subsets.

Parameters:
  • metric (str, optional) – The performance metric to evaluate stability on. Defaults to ‘auc’.

  • top_n (int, optional) – The number of algorithms to display, ranked by stability (lower is better). Defaults to 15.

  • figsize (Tuple[int, int], optional) – The figure size for the plot. Defaults to (10, 8).

Raises:

ValueError – If the specified metric is not found in the data.
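
Example:

A minimal sketch, continuing from the constructor example above; ranks the ten most stable algorithms by the standard deviation of their AUC.

>>> plotter.plot_algorithm_stability(metric='auc', top_n=10)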

plot_performance_tradeoff(metric_y: str = 'auc', metric_x: str = 'run_time', stratify_by_outcome: bool = False, top_n_algos: int | None = 10, figsize: Tuple[int, int] = (12, 8)) None[source]

Plots a performance trade-off scatter plot between two metrics.

This is useful for visualizing trade-offs such as performance vs. speed (e.g., AUC vs. run time).

Parameters:
  • metric_y (str, optional) – The metric for the y-axis (e.g., ‘auc’). Defaults to ‘auc’.

  • metric_x (str, optional) – The metric for the x-axis (e.g., ‘run_time’). Defaults to ‘run_time’.

  • stratify_by_outcome (bool, optional) – If True, creates a separate plot for each outcome. Defaults to False.

  • top_n_algos (Optional[int], optional) – If set, only shows the top N algorithms based on metric_y. Defaults to 10.

  • figsize (Tuple[int, int], optional) – The figure size for the plot. Defaults to (12, 8).

Raises:

ValueError – If one or both specified metrics are not found.
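
Example:

A minimal sketch, continuing from the constructor example above; plots AUC against run time and keeps only the five best algorithms by AUC.

>>> plotter.plot_performance_tradeoff(metric_y='auc', metric_x='run_time', top_n_algos=5)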

plot_pareto_front(metric_y: str = 'auc', metric_x: str = 'run_time', lower_is_better_x: bool = True, figsize: Tuple[int, int] = (12, 8)) None[source]

Plots a Pareto front for two competing metrics.

The Pareto front highlights the set of “optimal” algorithms where you cannot improve one metric without degrading the other.

Parameters:
  • metric_y (str, optional) – The primary performance metric (higher is better). Defaults to ‘auc’.

  • metric_x (str, optional) – The secondary metric, often a cost (e.g., ‘run_time’). Defaults to ‘run_time’.

  • lower_is_better_x (bool, optional) – Set to True if a lower value of metric_x is better. Defaults to True.

  • figsize (Tuple[int, int], optional) – The figure size for the plot. Defaults to (12, 8).
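
Example:

A minimal sketch, continuing from the constructor example above; highlights the algorithms that are not dominated when trading AUC against run time.

>>> plotter.plot_pareto_front(metric_y='auc', metric_x='run_time', lower_is_better_x=True)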

plot_statistical_significance_heatmap(metric: str = 'auc', outcome: str | None = None, figsize: Tuple[int, int] = (14, 12)) None[source]

Performs pairwise t-tests and visualizes p-values in a heatmap.

This helps determine if observed performance differences between algorithms are statistically significant.

Parameters:
  • metric (str, optional) – The performance metric to compare. Defaults to ‘auc’.

  • outcome (Optional[str], optional) – If specified, filters data for a single outcome. Otherwise, uses all data. Defaults to None.

  • figsize (Tuple[int, int], optional) – The figure size for the plot. Defaults to (14, 12).

Raises:

ValueError – If outcome is specified but the ‘outcome_variable’ column is missing from the data.
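
Example:

A minimal sketch, continuing from the constructor example above; the outcome name 'mortality' is illustrative. With only a few runs per algorithm (as in the tiny sample frame above), the pairwise t-tests have little statistical power.

>>> plotter.plot_statistical_significance_heatmap(metric='auc', outcome='mortality')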