Usage Guide

This guide explains the primary ways to run experiments using the Ensemble Genetic Algorithm project.

Recommended Workflow: Command-Line with `config.yml`

The most straightforward and recommended way to run an experiment is from your terminal using the main.py script and a config.yml file. This approach keeps your configuration separate from the code and is ideal for most use cases.

Prepare Your Data: Ensure your input CSV meets the requirements outlined in the Data Preparation Guide.
Create a Configuration File: Copy the config.yml.example file in the project root to a new file named config.yml.
Edit config.yml: Open your config.yml and customize the experiment. At a minimum, you should set:
- global_params.input_csv_path: Path to your dataset.
- global_params.n_iter: The number of grid search iterations.
- global_params.model_list: The base learners to use.
- ga_params and grid_params to define your search space.
Here is a minimal example to get you started:
```
# In your new config.yml
global_params:
  input_csv_path: "path/to/your/data.csv"
  n_iter: 10 # Start with a small number of iterations
  model_list: ["logisticRegression", "randomForest", "XGBoost"]

ga_params:
  pop_params: [64] # Use a single population size to start
```
See the Configuration Guide for a full list of options.
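Before launching a run, it can help to confirm that the minimum `global_params` settings listed above are present. The following is a minimal sketch of such a pre-flight check, not part of the project itself: `missing_settings` and `REQUIRED_GLOBAL_KEYS` are hypothetical names, and the sketch assumes the YAML file has already been parsed into a plain dictionary (for example with PyYAML's `yaml.safe_load`). It checks only the three `global_params` keys, on the assumption that omitted `ga_params` and `grid_params` entries fall back to defaults, as the minimal example suggests.

```python
# Hypothetical pre-flight check; not part of the project's own code.
REQUIRED_GLOBAL_KEYS = ("input_csv_path", "n_iter", "model_list")


def missing_settings(config):
    """Return dotted names of required global_params keys absent from a parsed config."""
    global_params = config.get("global_params") or {}
    return [
        f"global_params.{key}"
        for key in REQUIRED_GLOBAL_KEYS
        if key not in global_params
    ]


# The minimal example above, written as the dictionary a YAML parser would produce.
example = {
    "global_params": {
        "input_csv_path": "path/to/your/data.csv",
        "n_iter": 10,
        "model_list": ["logisticRegression", "randomForest", "XGBoost"],
    },
    "ga_params": {"pop_params": [64]},
}

print(missing_settings(example))  # an empty list means nothing is missing
```

An empty result means the three settings are present; it does not validate their values or check that the CSV path exists.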
Activate Your Environment:
```
source ga_env/bin/activate
```
Run the Experiment:
- To run with the default config.yml:
```
python main.py
```
- To specify a different configuration file:
```
python main.py --config path/to/your/config.yml
```
- To automatically evaluate the best model and generate all analysis plots after the run:
```
python main.py --config path/to/your/config.yml --evaluate --plot
```
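If you launch runs from another Python script (for example a batch submission wrapper), the three invocations above can be assembled programmatically. This is a sketch under stated assumptions: `build_command` is a hypothetical helper, not something the project provides, and it relies only on the `--config`, `--evaluate`, and `--plot` flags shown above.

```python
import sys


def build_command(config_path=None, evaluate=False, plot=False):
    """Assemble the argument list for one of the main.py invocations above."""
    # sys.executable is the interpreter of the currently active environment.
    command = [sys.executable, "main.py"]
    if config_path is not None:
        command += ["--config", str(config_path)]
    if evaluate:
        command.append("--evaluate")
    if plot:
        command.append("--plot")
    return command


# Show the arguments without running anything.
print(build_command("path/to/your/config.yml", evaluate=True, plot=True)[1:])
```

To actually start the experiment, pass the returned list to `subprocess.run(..., check=True)` from the project root. Using `sys.executable` rather than a bare `python` keeps the run inside whichever environment the calling script was started from.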

The following diagram illustrates this workflow:

[Figure: main.py workflow diagram]

Alternative: Programmatic Usage with the Example Notebook

For development, debugging, or a more interactive walkthrough, you can use the example_usage.ipynb notebook. This notebook provides a script-based implementation of the same workflow orchestrated by main.py. See the Example Usage Notebook Guide for a detailed breakdown of its contents.

To execute the notebook from the command line (useful for HPC environments), use the following command from the root of the repository:

jupyter nbconvert --to notebook --execute notebooks/example_usage.ipynb --output notebooks/example_usage_executed.ipynb

This command will:

- Run the notebook example_usage.ipynb using the current Python environment.
- Save the executed version as example_usage_executed.ipynb in the same notebooks/ directory.
- Preserve interactive IPython functionality (e.g., display, widgets) during execution.

📌 Note: Make sure the ga_env (or .venv) environment is activated before running this command:

source ga_env/bin/activate # Or .venv/bin/activate if installed manually

This ensures all required dependencies are available for successful execution.
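When the notebook is executed from a job script, the activation check and the command above can be wrapped in a few lines of Python. The sketch below is illustrative only: `in_virtual_environment` and `nbconvert_command` are hypothetical helpers, and the environment check assumes a `venv`- or `virtualenv`-style environment (it will not detect a conda environment).

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment and differs from sys.base_prefix.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def nbconvert_command(notebook, output):
    """Assemble the nbconvert call shown above as an argument list."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute", notebook,
        "--output", output,
    ]


if not in_virtual_environment():
    print("Warning: no virtual environment appears to be active.")

print(nbconvert_command(
    "notebooks/example_usage.ipynb",
    "notebooks/example_usage_executed.ipynb",
))
```

The returned list can be passed to `subprocess.run(..., check=True)` from the root of the repository so that a failed notebook execution stops the job.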
