# Interpreting Experiment Results
This guide helps you understand the various plots and outputs generated by the project. These visualizations are key to extracting insights from your genetic algorithm experiments.
---
## Overview
After running an experiment, the `GA_results_explorer` class parses the `final_grid_score_log.csv` file and generates a series of plots. You can do this by running the `notebooks/example_usage.ipynb` notebook, or automatically by passing the `--plot` flag to `main.py` on the command line.
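If you want to inspect the log outside the explorer class, it can be loaded directly with pandas. The sketch below is a minimal helper for ranking runs; the column names (`auc`, `run_id`) are illustrative assumptions, so check `df.columns` against your own log:

```python
import pandas as pd

def top_runs(log_path: str, metric: str = "auc", n: int = 5) -> pd.DataFrame:
    """Return the n best grid-search runs from the score log, ranked by metric."""
    df = pd.read_csv(log_path)
    if metric not in df.columns:
        raise KeyError(f"'{metric}' not in log; available columns: {list(df.columns)}")
    return df.sort_values(metric, ascending=False).head(n)
```

This is handy for pulling out the configurations behind the best scores before diving into the plots.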
Each plot provides a different perspective on the experiment’s outcome, from hyperparameter influence to feature importance.
### 1. Feature and Base Learner Importance
These plots help you understand which features (either original data columns or base learners) contributed most to the best-performing ensembles.
- `plot_base_learner_feature_importance`: Shows which base models (e.g., RandomForest, XGBoost) were most frequently included in high-performing ensembles. This helps identify the most robust and effective algorithms for your dataset.
- `plot_initial_feature_importance`: Displays the importance of the original input features (from your CSV) in the context of the best models. This is a form of feature selection, highlighting which data columns are most predictive.
- `plot_feature_stability`: Measures how consistently a feature (or base learner) appears in the top-performing ensembles. High stability suggests a feature is reliably important.
- `plot_feature_cooccurrence`: A heatmap showing which pairs of features or base learners tend to appear together in the same ensemble. This can reveal synergistic relationships between models or features.
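The co-occurrence heatmap boils down to a simple computation over a binary membership matrix (one row per top ensemble, one column per feature or base learner). This sketch shows that underlying calculation, not the class's actual implementation, and the learner names are made up:

```python
import pandas as pd

def cooccurrence(membership: pd.DataFrame) -> pd.DataFrame:
    """Count how often each pair of columns is active (1) in the same row.

    membership: binary DataFrame, 1 = that feature/learner is in that ensemble.
    """
    m = membership.astype(int)
    # Entry [i, j] = number of ensembles containing both i and j;
    # the diagonal gives each item's raw inclusion count.
    return m.T @ m

# Example: three top ensembles over three base learners.
top = pd.DataFrame(
    [[1, 1, 0],
     [1, 1, 1],
     [0, 1, 1]],
    columns=["rf", "xgb", "svm"],
)
```

High off-diagonal counts relative to the diagonal suggest two items tend to be selected together.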
### 2. Genetic Algorithm Performance
These plots visualize the performance of the genetic algorithm itself during the search process.
- `plot_all_convergence`: Shows the fitness (e.g., AUC) of the best individual in the population over generations. A steep upward curve that flattens out indicates that the GA has successfully converged to an optimal solution. This plot is generated for each grid search run.
- `plot_performance_vs_size`: A scatter plot that shows the relationship between ensemble size (number of base learners) and performance (AUC). This helps you determine if larger ensembles are actually better, or if there's a point of diminishing returns.
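A convergence curve of this kind is just the best-so-far fitness per generation. The sketch below assumes a per-individual log with `generation` and `fitness` columns; those names are illustrative, and the project's own log format may differ:

```python
import pandas as pd

def convergence_curve(log: pd.DataFrame) -> pd.Series:
    """Best-so-far fitness at each generation, for a single GA run."""
    best_per_gen = log.groupby("generation")["fitness"].max()
    # cummax makes the curve monotone: once a fitness is reached, it is kept.
    return best_per_gen.cummax()

# Illustrative per-individual log for one run.
log = pd.DataFrame({
    "generation": [0, 0, 1, 1, 2, 2],
    "fitness":    [0.60, 0.65, 0.72, 0.70, 0.71, 0.69],
})
curve = convergence_curve(log)
```

If the resulting curve is still rising at the final generation, the run likely needed more generations; a long flat tail suggests you could stop earlier.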
### 3. Hyperparameter Analysis
These plots analyze the impact of different hyperparameters (both for the GA and the base models) on the final model performance.
- `plot_parameter_distributions`: Creates boxplots showing the distribution of performance scores (AUC) for different values of each hyperparameter. This is one of the most important plots for understanding which settings work best. For example, you can see if a `population_size` of 100 consistently outperforms a size of 50.
- `plot_interaction_heatmap`: Visualizes the interaction between two different hyperparameters and their combined effect on performance. This is useful for spotting complex relationships, e.g., "a high learning rate only works well with a low number of estimators."
- `plot_performance_tradeoff`: A scatter plot that helps you analyze the trade-off between model performance (e.g., AUC) and a cost metric (e.g., `run_time`). This is crucial for finding models that are not only accurate but also computationally efficient.
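The data behind an interaction heatmap is a two-way table of mean scores. This sketch builds that table with a pandas pivot; the hyperparameter and metric names are illustrative, not the project's required schema:

```python
import pandas as pd

def interaction_table(log: pd.DataFrame, p1: str, p2: str,
                      metric: str = "auc") -> pd.DataFrame:
    """Mean metric for each combination of two hyperparameters.

    The result is exactly the grid you would pass to a heatmap.
    """
    return log.pivot_table(index=p1, columns=p2, values=metric, aggfunc="mean")

# Illustrative grid-search log.
log = pd.DataFrame({
    "learning_rate": [0.1, 0.1, 0.5, 0.5],
    "n_estimators":  [50, 200, 50, 200],
    "auc":           [0.78, 0.82, 0.80, 0.74],
})
grid = interaction_table(log, "learning_rate", "n_estimators")
```

An interaction shows up as rows whose best column differs: here, the higher learning rate only wins at the smaller estimator count.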
### 4. Ensemble Composition
These plots provide insights into the structure of the evolved ensembles.
- `plot_algorithm_distribution_in_ensembles`: A bar chart showing the frequency of each base learner type across all the final, best-performing ensembles. It tells you which algorithms the GA favored overall.
- `plot_ensemble_feature_diversity`: Analyzes the diversity of features used within the ensembles. Higher diversity can sometimes lead to more robust models.
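The distribution chart is a frequency count over the final ensembles' members. A minimal sketch of that count, with made-up learner names as input:

```python
from collections import Counter

def algorithm_distribution(ensembles):
    """Count how often each base-learner type appears across final ensembles.

    ensembles: iterable of lists of base-learner names.
    """
    counts = Counter()
    for members in ensembles:
        counts.update(members)
    return counts

# Illustrative final ensembles from three grid-search runs.
final = [["rf", "xgb"], ["rf", "svm", "xgb"], ["rf"]]
dist = algorithm_distribution(final)
```

Learners that the GA consistently retains will dominate these counts, which is exactly what the bar chart visualizes.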
### 5. Statistical Importance of Hyperparameters
- `plot_combined_anova_feature_importances`: Uses ANOVA (Analysis of Variance) to statistically rank the impact of all hyperparameters (from both the GA configuration and model settings) on the outcome variable (e.g., `auc`). This gives you a quantitative measure of which parameter choices matter most.
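The plot ranks all hyperparameters at once; the statistic underneath each ranking is a one-way ANOVA per parameter. This sketch applies SciPy's `f_oneway` to a single parameter (`population_size`) with illustrative AUC values, just to show how the F-statistic and p-value are obtained:

```python
from scipy.stats import f_oneway

# Does the choice of population_size explain variance in AUC?
# Each list holds the AUC scores of runs that used that setting (illustrative).
auc_pop_50  = [0.70, 0.72, 0.71]
auc_pop_100 = [0.80, 0.82, 0.81]

f_stat, p_value = f_oneway(auc_pop_50, auc_pop_100)
```

A large F-statistic with a small p-value means the parameter's setting shifts the score distribution more than within-group noise does, so it ranks as important.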
---
## How to Use These Insights
- Model Selection: Use the performance plots to identify the best-performing hyperparameter configurations.
- Feature Engineering: Use the feature importance plots to guide your feature selection and engineering efforts for future experiments.
- Algorithm Choice: Use the base learner importance and distribution plots to understand which types of models are best suited for your problem.
- Tuning the GA: Use the convergence plots to see if you need to run the GA for more (or fewer) generations.
By carefully examining these plots, you can move beyond a single “best score” and gain a deep, qualitative understanding of the solution space for your machine learning problem.