Ensemble Genetic Algorithm

Welcome to the documentation for ga-project!

ga-project is a Python library that uses genetic algorithms to build machine learning model ensembles. It automates the search for which base models to combine and how to weight them, with the goal of maximizing predictive performance. It is intended for data scientists and machine learning engineers who want to improve model accuracy through ensembling.
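To make the idea concrete, here is a minimal, standard-library sketch of a genetic algorithm that evolves ensemble weights for a fixed set of model predictions. It illustrates the general technique only and is not ga-project's implementation: the function names (`fitness`, `evolve_weights`), the uniform crossover, the Gaussian mutation, and the toy data are all assumptions made for this example.

```python
import random


def fitness(weights, predictions, targets):
    """Negative mean squared error of the weighted-average ensemble."""
    total = sum(weights) or 1.0  # avoid division by zero if all weights are 0
    err = 0.0
    for i, y in enumerate(targets):
        blended = sum(w * p[i] for w, p in zip(weights, predictions)) / total
        err += (blended - y) ** 2
    return -err / len(targets)


def evolve_weights(predictions, targets, population_size=30, n_generations=40,
                   mutation_rate=0.2, crossover_rate=0.8, seed=0):
    """Hypothetical toy GA: evolve one weight in [0, 1] per base model."""
    rng = random.Random(seed)
    n_models = len(predictions)
    population = [[rng.random() for _ in range(n_models)]
                  for _ in range(population_size)]

    def score(w):
        return fitness(w, predictions, targets)

    for _ in range(n_generations):
        ranked = sorted(population, key=score, reverse=True)
        parents = ranked[: population_size // 2]  # truncation selection
        children = [ranked[0][:]]  # elitism: keep the best individual unchanged
        while len(children) < population_size:
            a, b = rng.sample(parents, 2)
            if rng.random() < crossover_rate:
                # Uniform crossover: take each weight from either parent.
                child = [rng.choice(pair) for pair in zip(a, b)]
            else:
                child = a[:]
            # Gaussian mutation per weight, clamped back into [0, 1].
            child = [
                min(1.0, max(0.0, w + rng.gauss(0, 0.1)))
                if rng.random() < mutation_rate else w
                for w in child
            ]
            children.append(child)
        population = children
    return max(population, key=score)


# Toy data (illustrative only): model 0 is perfect, model 1 is inverted,
# model 2 always predicts 0.5.
targets = [0, 1, 1, 0]
predictions = [[0, 1, 1, 0], [1, 0, 0, 1], [0.5, 0.5, 0.5, 0.5]]
best = evolve_weights(predictions, targets)
print(best, fitness(best, predictions, targets))
```

Because the best individual is carried over unchanged each generation, the best fitness never decreases. A real ensemble search would also score candidates on held-out data rather than on the data used to fit the base models.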

Note

This project is under active development and the API may change.

Installation

You can install ga-project from PyPI using pip:

pip install ga-project

Example Workflow

The primary way to use this library is to configure and run an experiment pipeline, as demonstrated in notebooks/example_usage.ipynb. The pipeline automates grid search, model training, and evaluation.

Here is a simplified example of the core logic:

import ml_grid.pipeline.data  # explicitly load the submodule used in step 4
from ml_grid.pipeline import main_ga
from ml_grid.model_classes_ga import (
    logisticRegressionModelGenerator,
    randomForestModelGenerator,
    XGBoostModelGenerator,
)

# 1. Define experiment parameters
input_csv_path = "synthetic_data_for_testing.csv"
base_project_dir = "HFE_GA_experiments/my_first_run/"

# 2. Define the pool of base models for the Genetic Algorithm
model_list = [
    logisticRegressionModelGenerator,
    randomForestModelGenerator,
    XGBoostModelGenerator,
]

# 3. Set hyperparameters for this specific run
# In a full run, these values typically come from iterating over a search grid
hyperparameters = {
    'population_size': 50,
    'n_generations': 20,
    'mutation_rate': 0.2,
    'crossover_rate': 0.8,
}

# 4. Configure and run the experiment pipeline
ml_grid_object = ml_grid.pipeline.data.pipe(
    input_csv_path=input_csv_path,
    base_project_dir=base_project_dir,
    local_param_dict=hyperparameters,
    config_dict={"modelFuncList": model_list},
)

# 5. Execute the Genetic Algorithm
main_ga.run(ml_grid_object, local_param_dict=hyperparameters).execute()
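Step 3 above notes that a full run typically iterates over a grid of hyperparameters. One way to sketch that iteration with the standard library is shown below. The grid values and the helper name `iter_param_dicts` are assumptions for illustration; the library's own grid definition may differ.

```python
from itertools import product

# Hypothetical search grid: the keys mirror the example above,
# the candidate values are illustrative only.
param_grid = {
    "population_size": [25, 50],
    "n_generations": [10, 20],
    "mutation_rate": [0.1, 0.2],
    "crossover_rate": [0.8],
}


def iter_param_dicts(grid):
    """Yield one hyperparameter dict per combination in the grid."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


param_dicts = list(iter_param_dicts(param_grid))
print(len(param_dicts))  # 2 * 2 * 2 * 1 = 8 combinations
```

Each resulting dict has the same shape as `hyperparameters` in the example, so it could be passed as `local_param_dict` in steps 4 and 5, one experiment per combination.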

Getting Started

Developer Guide

Project Information

API Reference

ml_grid
- ml_grid package