summarize_results
Module for creating tabular summaries from ML results data.
Classes
ResultsSummarizer | Initializes the summarizer.
Module Contents
- class summarize_results.ResultsSummarizer(data: pandas.DataFrame)
Initializes the summarizer.
- Parameters:
data (pd.DataFrame) – Aggregated results DataFrame.
- Raises:
ValueError – If data is not a pandas DataFrame or is empty.
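The validation described above could look roughly like the following. This is a minimal sketch, not the module's actual source: the class name ResultsSummarizerSketch, the error message, and the self.data attribute are assumptions made for illustration.

```python
import pandas as pd


class ResultsSummarizerSketch:
    """Hypothetical stand-in for ResultsSummarizer, showing only input validation."""

    def __init__(self, data: pd.DataFrame):
        # Reject anything that is not a DataFrame, and reject empty DataFrames.
        if not isinstance(data, pd.DataFrame) or data.empty:
            raise ValueError("data must be a non-empty pandas DataFrame.")
        # Attribute name is an assumption; the real class may store it differently.
        self.data = data


# A non-empty DataFrame is accepted.
summarizer = ResultsSummarizerSketch(pd.DataFrame({"auc": [0.8]}))

# An empty DataFrame is rejected with ValueError.
try:
    ResultsSummarizerSketch(pd.DataFrame())
    empty_rejected = False
except ValueError:
    empty_rejected = True
```

Checking the type before checking emptiness means a non-DataFrame argument such as a list raises the same ValueError rather than an AttributeError.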
- get_best_model_per_outcome(metric: str = 'auc') → pandas.DataFrame
Finds the best model for each outcome and expands the feature list.
This method identifies the single best-performing model run for each outcome variable based on the specified metric. It then transforms the ‘decoded_features’ list into a set of boolean columns, where each new column represents a feature and its value indicates whether that feature was used in the best model run.
- Parameters:
metric (str, optional) – The performance metric to use for determining the “best” model. Defaults to ‘auc’.
- Returns:
A DataFrame containing the best model run for each outcome, with additional boolean columns for each feature.
- Return type:
pd.DataFrame
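To make the described behaviour concrete, the two steps (selecting the best run per outcome, then expanding the feature list into boolean columns) can be sketched with plain pandas. This is an illustrative re-implementation under stated assumptions, not the method's actual code: the function name best_model_per_outcome, the outcome column name outcome_variable, the sample data, and the rule that a higher metric value is better are all assumptions. Only the 'auc' default and the 'decoded_features' column come from the documentation above.

```python
import pandas as pd


def best_model_per_outcome(
    data: pd.DataFrame,
    metric: str = "auc",
    outcome_col: str = "outcome_variable",  # assumed column name
) -> pd.DataFrame:
    """Hypothetical sketch of the behaviour described for get_best_model_per_outcome."""
    # Step 1: within each outcome, find the row label with the highest metric value.
    best_idx = data.groupby(outcome_col)[metric].idxmax()
    best = data.loc[best_idx].reset_index(drop=True)

    # Step 2: collect every feature name used by any of the best runs.
    all_features = sorted({f for feats in best["decoded_features"] for f in feats})

    # Step 3: add one boolean column per feature, True if the best run used it.
    for feature in all_features:
        best[feature] = best["decoded_features"].apply(
            lambda feats, f=feature: f in feats
        )
    return best


# Made-up example data: two runs for one outcome, one run for another.
results = pd.DataFrame(
    {
        "outcome_variable": ["readmission", "readmission", "mortality"],
        "auc": [0.71, 0.78, 0.83],
        "decoded_features": [["age"], ["age", "bmi"], ["bmi", "sex"]],
    }
)
summary = best_model_per_outcome(results)
```

Here summary holds one row per outcome (the 0.78 run for readmission and the 0.83 run for mortality), plus boolean age, bmi, and sex columns. Features that appear only in runs that did not win get no column in this sketch; the real method may handle that case differently.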