Model API Reference

This document provides a comprehensive API reference for all model classes in the Ensemble Genetic Algorithm project.


See also: GA_Python_API.md for genetic algorithm evaluation methods and pipeline integration.


Base Learner Generator Interface

Abstract Interface

All base learner generators follow a consistent interface pattern:

Required Methods:

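class BaseLearnerGenerator:
    def hyperparameters(self, param_space):
        """Return dictionary of hyperparameter search spaces."""
        raise NotImplementedError  # implemented by each concrete generator

    def model(self, params):
        """Instantiate model with given parameters."""
        raise NotImplementedError  # implemented by each concrete generator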

Generation Function Signature:

def model_nameModelGenerator(
    ml_grid_object: Any,
    local_param_dict: Dict
) -> Tuple[float, ModelClass, List[str], int, float, np.ndarray]:
    """Generates, trains, and evaluates a model.
    
    Args:
        ml_grid_object: Contains X_train, y_train, X_test, y_test and config
        local_param_dict: Parameters for this specific run
        
    Returns:
        Tuple of (mccscore, model, feature_names, train_time, auc_score, y_pred)
    """

Return Values:

| Index | Type | Description |
|-------|------|-------------|
| 0 | float | Matthews Correlation Coefficient (MCC) |
| 1 | ModelClass | Trained model object |
| 2 | List[str] | List of feature names used for training |
| 3 | int | Model training time in seconds |
| 4 | float | ROC AUC score |
| 5 | np.ndarray | Model predictions on test set |
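
Positional access to a six-element tuple is easy to get wrong, so callers may prefer named fields. The sketch below wraps the tuple in a NamedTuple. `GeneratorResult` is a hypothetical helper, not part of the package, and the values are stand-ins (a plain list replaces the np.ndarray so the example runs without NumPy).

```python
from typing import Any, List, NamedTuple, Sequence


class GeneratorResult(NamedTuple):
    """Hypothetical named view of the six-element result tuple."""

    mcc: float
    model: Any
    feature_names: List[str]
    train_time: int
    auc: float
    y_pred: Sequence[int]


# Stand-in values; a real tuple would come from a model generator call.
raw = (0.42, object(), ["age", "bmi"], 3, 0.71, [0, 1, 1, 0])
result = GeneratorResult(*raw)

# Fields are reachable by name and still by index, as in the table above.
print(result.mcc, result.auc)
print(result[2] == result.feature_names)
```

Because a NamedTuple is still a tuple, code that unpacks the six values positionally keeps working unchanged.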


Classification Models

AdaBoost Classifier

Function: AdaBoostClassifierModelGenerator

Generates, trains, and evaluates an AdaBoost classifier.

Module: ml_grid.model_classes_ga.adaboostClassifier_model

Generation Signature:

ada_boost_generator = AdaBoostClassifierModelGenerator(
    ml_grid_object,
    local_param_dict
)

Parameters:

| Parameter | Type | Description |
|-----------|------|-------------|
| ml_grid_object | Any | Contains X_train, y_train, X_test, y_test and config |
| local_param_dict | Dict | Parameters for this specific run |

Returns: Same tuple format as base interface


Decision Tree Classifier

Function: DecisionTreeClassifierModelGenerator

Generates, trains, and evaluates a decision tree classifier.

Module: ml_grid.model_classes_ga.decisionTreeClassifier_model

Generation Signature:

dt_generator = DecisionTreeClassifierModelGenerator(
    ml_grid_object,
    local_param_dict
)

Dummy Model

Function: DummyModelGenerator

A baseline dummy model for comparison purposes.

Module: ml_grid.model_classes_ga.dummy_model

Generation Signature:

dummy_generator = DummyModelGenerator(
    ml_grid_object,
    local_param_dict
)
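
To illustrate what a comparison baseline provides, the sketch below builds a majority-class predictor in plain Python. The strategy actually used by DummyModelGenerator is not stated here, so this is only an assumed example of a baseline; the function names and data are made up.

```python
from collections import Counter
from typing import List, Sequence


def majority_class_baseline(y_train: Sequence[int], n_test: int) -> List[int]:
    """Predict the most frequent training label for every test row."""
    most_common_label, _ = Counter(y_train).most_common(1)[0]
    return [most_common_label] * n_test


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of positions where prediction and truth agree."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


# Made-up labels: class 0 is the majority in the training set.
y_train = [0, 0, 0, 1, 0, 1]
y_test = [0, 1, 0, 0]

baseline_pred = majority_class_baseline(y_train, len(y_test))
print(baseline_pred)
print(accuracy(y_test, baseline_pred))
```

A trained model is only informative if it beats a baseline of this kind on the same test split.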

Elastic Neural Network

Function: elasticNeuralNetworkModelGenerator

Generates an elastic neural network classifier.

Module: ml_grid.model_classes_ga.elasticNeuralNetwork_model

Generation Signature:

enn_generator = elasticNeuralNetworkModelGenerator(
    ml_grid_object,
    local_param_dict
)

Extra Trees Classifier

Function: extraTreesModelGenerator

Generates, trains, and evaluates an Extra Trees classifier.

Module: ml_grid.model_classes_ga.extra_trees_model

Generation Signature:

et_generator = extraTreesModelGenerator(
    ml_grid_object,
    local_param_dict
)

Gaussian NB

Function: GaussianNB_ModelGenerator

Generates, trains, and evaluates a Gaussian Naive Bayes classifier.

Module: ml_grid.model_classes_ga.gaussianNB_model

Generation Signature:

gnb_generator = GaussianNB_ModelGenerator(
    ml_grid_object,
    local_param_dict
)

Gradient Boosting Classifier

Function: GradientBoostingClassifier_ModelGenerator

Generates, trains, and evaluates a gradient boosting classifier.

Module: ml_grid.model_classes_ga.gradientBoostingClassifier_model

Generation Signature:

gb_generator = GradientBoostingClassifier_ModelGenerator(
    ml_grid_object,
    local_param_dict
)

K-Nearest Neighbors

Function: kNearestNeighborsModelGenerator

Generates, trains, and evaluates a k-nearest neighbors classifier.

Module: ml_grid.model_classes_ga.kNearestNeighbors_model

Generation Signature:

knn_generator = kNearestNeighborsModelGenerator(
    ml_grid_object,
    local_param_dict
)

Logistic Regression

Function: logisticRegressionModelGenerator

Generates, trains, and evaluates a logistic regression classifier.

Module: ml_grid.model_classes_ga.logistic_regression_model

Generation Signature:

lr_generator = logisticRegressionModelGenerator(
    ml_grid_object,
    local_param_dict
)

Implementation Notes:

  • Uses ANOVA-based feature selection

  • Random search for hyperparameters (C, max_iter, solver)

  • Returns MCC and AUC scores
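
The random search mentioned in the notes above can be sketched as drawing one candidate at a time from a search space. The ranges and option lists below are assumptions chosen for illustration, not the values used by the package; `sample_logreg_params` is a hypothetical helper.

```python
import random
from typing import Any, Dict


def sample_logreg_params(rng: random.Random) -> Dict[str, Any]:
    """Draw one random-search candidate for C, max_iter and solver."""
    # Log-uniform draw so that C covers several orders of magnitude
    # (assumed range: 1e-3 to 1e3).
    exponent = rng.uniform(-3.0, 3.0)
    return {
        "C": 10.0 ** exponent,
        "max_iter": rng.choice([100, 200, 500]),  # assumed options
        "solver": rng.choice(["lbfgs", "liblinear", "saga"]),  # assumed options
    }


# A seeded generator makes the sampled candidates reproducible.
rng = random.Random(0)
candidates = [sample_logreg_params(rng) for _ in range(5)]
for candidate in candidates:
    print(candidate)
```

Sampling the exponent rather than C itself avoids spending almost every draw on large regularization values.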


MLP Classifier

Function: MLPClassifier_ModelGenerator

Generates, trains, and evaluates a multi-layer perceptron classifier.

Module: ml_grid.model_classes_ga.mlpClassifier_model

Generation Signature:

mlp_generator = MLPClassifier_ModelGenerator(
    ml_grid_object,
    local_param_dict
)

Perceptron Model

Function: perceptronModelGenerator

Generates, trains, and evaluates a perceptron classifier.

Module: ml_grid.model_classes_ga.perceptron_model

Generation Signature:

perc_generator = perceptronModelGenerator(
    ml_grid_object,
    local_param_dict
)

Perceptron Dummy Model

Function: perceptron_dummy_model

A baseline perceptron model for comparison purposes.

Module: ml_grid.model_classes_ga.perceptron_dummy_model

Generation Signature:

perc_dummy_generator = perceptron_dummy_model(
    ml_grid_object,
    local_param_dict
)

PyTorch ANN Binary Classifier

Function: Pytorch_binary_class_ModelGenerator

Generates, trains, and evaluates a PyTorch artificial neural network for binary classification.

Module: ml_grid.model_classes_ga.pytorchANNBinaryClassifier_model

Generation Signature:

pytorch_generator = Pytorch_binary_class_ModelGenerator(
    ml_grid_object,
    local_param_dict
)

Quadratic Discriminant Analysis

Function: QuadraticDiscriminantAnalysis_ModelGenerator

Generates, trains, and evaluates a quadratic discriminant analysis classifier.

Module: ml_grid.model_classes_ga.quadraticDiscriminantAnalysis_model

Generation Signature:

qda_generator = QuadraticDiscriminantAnalysis_ModelGenerator(
    ml_grid_object,
    local_param_dict
)

Random Forest

Function: randomForestModelGenerator

Generates, trains, and evaluates a random forest classifier.

Module: ml_grid.model_classes_ga.randomForest_model

Generation Signature:

rf_generator = randomForestModelGenerator(
    ml_grid_object,
    local_param_dict
)

Implementation Notes:

  • Uses RandomForest-based feature selection (see Feature Selection)

  • Random search for hyperparameters (n_estimators, max_features, max_depth)

  • Returns MCC and AUC scores

See also: Evaluation Methods for GA-based model evaluation
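
Importance-based feature selection, as referenced in the notes above, amounts to ranking features by a score and keeping the best ones. A minimal sketch, assuming importances are already available as a name-to-score mapping; `top_k_features` and the scores are hypothetical and do not reflect the package's own selection code.

```python
from typing import Dict, List


def top_k_features(importances: Dict[str, float], k: int) -> List[str]:
    """Return the k feature names with the highest importance."""
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Made-up importances, standing in for what a fitted forest might report.
importances = {"age": 0.40, "bmi": 0.25, "hr": 0.05, "sbp": 0.30}
selected = top_k_features(importances, k=2)
print(selected)  # ['age', 'sbp']
```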


SVC (Support Vector Classifier)

Function: SVC_ModelGenerator

Generates, trains, and evaluates a Support Vector Classifier.

Module: ml_grid.model_classes_ga.svc_model

Generation Signature:

svc_generator = SVC_ModelGenerator(
    ml_grid_object,
    local_param_dict
)

XGBoost Classifier

Function: XGBoostModelGenerator

Generates, trains, and evaluates an XGBoost classifier.

Module: ml_grid.model_classes_ga.XGBoost_model

Generation Signature:

xgb_generator = XGBoostModelGenerator(
    ml_grid_object,
    local_param_dict
)

Model Generation Functions

Generic Pattern for All Models

All model generators follow this pattern:

import time
from typing import Any, Dict, List, Tuple

import numpy as np
from sklearn.metrics import matthews_corrcoef, roc_auc_score

# ModelClass, apply_feature_selection, sample_hyperparameters and
# create_model are placeholders for the model-specific pieces.

def model_nameModelGenerator(
    ml_grid_object: Any,
    local_param_dict: Dict
) -> Tuple[float, ModelClass, List[str], int, float, np.ndarray]:
    """Brief description of what this model does.

    The complete docstring includes:
    - Hyperparameter search strategy (random/grid)
    - Feature selection method used
    - Performance metrics calculated

    Args:
        ml_grid_object: Contains X_train, y_train, X_test, y_test and config
        local_param_dict: Parameters for this specific run

    Returns:
        Tuple of (mccscore, model, feature_names, train_time, auc_score, y_pred)
    """

    # Step 1: Apply feature selection
    X_train_selected, X_test_selected = apply_feature_selection(
        ml_grid_object,
        method="anova"
    )
    # Assumes the selected training data is a DataFrame
    feature_names = list(X_train_selected.columns)

    # Step 2: Sample hyperparameters from space
    params = sample_hyperparameters()

    # Step 3: Initialize and train model, timing the fit
    start = time.time()
    model = create_model(params)
    model.fit(X_train_selected, ml_grid_object.y_train)
    train_time = int(time.time() - start)

    # Step 4: Evaluate performance on the same selected features
    y_pred = model.predict(X_test_selected)
    mccscore = matthews_corrcoef(ml_grid_object.y_test, y_pred)
    auc_score = roc_auc_score(ml_grid_object.y_test, y_pred)

    # Step 5: Return results
    return (mccscore, model, feature_names, train_time, auc_score, y_pred)
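
For readers unfamiliar with the MCC computed in Step 4, the sketch below works it out from confusion counts for binary labels. It is an illustrative re-implementation in plain Python, not the code the generators use; `mcc_binary` is a hypothetical name, and returning 0.0 for a zero denominator is a common convention adopted here as an assumption.

```python
import math
from typing import Sequence


def mcc_binary(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Matthews Correlation Coefficient for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Degenerate case, e.g. every prediction is the same class.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Made-up labels: TP=2, TN=2, FP=1, FN=1, so MCC = (4 - 1) / 9.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1]
print(mcc_binary(y_true, y_pred))
```

MCC ranges from -1 to 1, with 0 corresponding to predictions no better than chance, which is why it is a useful companion to AUC on imbalanced data.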

Model Registry

The global_parameters class contains a MODEL_REGISTRY that maps string names to model generator classes:

| Model Name | Generator Class |
|------------|-----------------|
| "AdaBoostClassifier" | AdaBoostClassifierModelGenerator |
| "DecisionTreeClassifier" | DecisionTreeClassifierModelGenerator |
| "elasticNeuralNetwork" | elasticNeuralNetworkModelGenerator |
| "extraTrees" | extraTreesModelGenerator |
| "GaussianNB" | GaussianNB_ModelGenerator |
| "GradientBoostingClassifier" | GradientBoostingClassifier_ModelGenerator |
| "kNearestNeighbors" | kNearestNeighborsModelGenerator |
| "logisticRegression" | logisticRegressionModelGenerator |
| "MLPClassifier" | MLPClassifier_ModelGenerator |
| "perceptron" | perceptronModelGenerator |
| "Pytorch_binary_class" | Pytorch_binary_class_ModelGenerator |
| "QuadraticDiscriminantAnalysis" | QuadraticDiscriminantAnalysis_ModelGenerator |
| "randomForest" | randomForestModelGenerator |
| "SVC" | SVC_ModelGenerator |
| "XGBoost" | XGBoostModelGenerator |
| "DummyModel" | DummyModelGenerator |
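
A name-to-generator mapping like the registry above is typically used for dispatch by string key. The sketch below shows that pattern with stub generators so it runs on its own. The module-level `MODEL_REGISTRY` dict, `_stub_generator` and `run_model` are hypothetical stand-ins; how the registry is exposed on global_parameters may differ.

```python
from typing import Any, Callable, Dict, Tuple

ResultTuple = Tuple[float, Any, list, int, float, list]
Generator = Callable[[Any, Dict[str, Any]], ResultTuple]


def _stub_generator(name: str) -> Generator:
    """Build a fake generator that returns a fixed result tuple."""

    def generate(ml_grid_object: Any, local_param_dict: Dict[str, Any]) -> ResultTuple:
        # Same tuple layout as the real generators, with dummy values.
        return (0.0, name, [], 0, 0.5, [])

    return generate


# Stand-in registry with two of the documented model names.
MODEL_REGISTRY: Dict[str, Generator] = {
    "logisticRegression": _stub_generator("logisticRegression"),
    "randomForest": _stub_generator("randomForest"),
}


def run_model(model_name: str, ml_grid_object: Any, local_param_dict: Dict[str, Any]) -> ResultTuple:
    """Look up a generator by name and call it, with a clear error if missing."""
    try:
        generator = MODEL_REGISTRY[model_name]
    except KeyError:
        known = ", ".join(sorted(MODEL_REGISTRY))
        raise ValueError(f"Unknown model {model_name!r}; known: {known}") from None
    return generator(ml_grid_object, local_param_dict)


result = run_model("randomForest", ml_grid_object=None, local_param_dict={})
print(result[1])
```

Listing the known names in the error message makes typos in configuration files quick to diagnose, since registry keys are case-sensitive.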


Usage Example

from ml_grid.util.global_params import global_parameters
from ml_grid.model_classes_ga.logistic_regression_model import (
    logisticRegressionModelGenerator
)

# Setup configuration
global_params = global_parameters(config_path='config.yml')

# ml_grid_object must already exist at this point; it holds
# X_train, y_train, X_test, y_test and config

# Calling the generator function trains and evaluates the model
# and returns the result tuple directly
result = logisticRegressionModelGenerator(
    ml_grid_object=ml_grid_object,
    local_param_dict={'store_base_learners': True}
)
mcc, model, features, train_time, auc, predictions = result

print(f"AUC: {auc:.4f}")
print(f"MCC: {mcc:.4f}")
print(f"Training time: {train_time}s")

Version Information

This API documentation corresponds to v1.0+ of the ensemble_genetic_algorithm package.

Python Requirement: Python >=3.12

To check your installed version:

pip show ensemble_genetic_algorithm
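
The same check can be done from Python with the standard library, which is handy inside scripts or notebooks. A minimal sketch: `installed_version` and `python_is_supported` are hypothetical helper names, and the distribution name is assumed to match the one passed to pip above.

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def python_is_supported(minimum: Tuple[int, int] = (3, 12)) -> bool:
    """Check the running interpreter against the documented minimum."""
    return sys.version_info >= minimum


print(installed_version("ensemble_genetic_algorithm"))
print(python_is_supported())
```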