ml_grid.pipeline.test_data_pipeline

Unit tests for the ml_grid.pipeline.data.pipe class.

This test suite validates the core functionality of the data pipeline, ensuring that data is loaded, cleaned, transformed, and split correctly according to various configurations.

Classes

TestDataPipeline

Create an instance of the class that will use the named test method when executed.

Module Contents

class ml_grid.pipeline.test_data_pipeline.TestDataPipeline(methodName='runTest')

Bases: unittest.TestCase

Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.

setUp()

Set up a temporary environment for each test.

tearDown()

Clean up the temporary directory after each test.
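
The setUp/tearDown pair follows the standard unittest fixture pattern. A minimal sketch of that pattern is shown below; the class name, the attribute names, and the CSV contents are hypothetical and are not taken from the actual test suite.

```python
import shutil
import tempfile
import unittest
from pathlib import Path


class TempDirFixtureExample(unittest.TestCase):
    """Hypothetical test case illustrating a per-test temporary directory."""

    def setUp(self):
        # Create a fresh scratch directory before every test method.
        self.tmp_dir = Path(tempfile.mkdtemp())
        # Hypothetical input file; the real suite's fixtures are not shown here.
        self.csv_path = self.tmp_dir / "input.csv"
        self.csv_path.write_text("a,b,outcome\n1,2,0\n3,4,1\n")

    def tearDown(self):
        # Runs after each test, whether it passed or failed, and removes
        # the directory together with everything inside it.
        shutil.rmtree(self.tmp_dir, ignore_errors=True)

    def test_fixture_exists(self):
        self.assertTrue(self.csv_path.exists())
```

Because the directory is recreated for every test method, no test can observe files left behind by another.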

test_pipeline_initialization_successful()

Test that the pipeline initializes and runs without errors.

test_no_constant_columns_in_final_X_train()

Verify that the final X_train contains no constant columns.
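
One way such a check could be written with pandas is sketched below. The helper name constant_columns and the sample frame are assumptions made for illustration; the test's actual assertion is not shown in this reference.

```python
import pandas as pd


def constant_columns(df: pd.DataFrame) -> list:
    """Return names of columns holding at most one distinct value."""
    # nunique(dropna=False) counts NaN as a value, so an all-NaN column
    # is also reported as constant.
    counts = df.nunique(dropna=False)
    return counts[counts <= 1].index.tolist()


# Hypothetical training frame with one constant column.
X_train = pd.DataFrame({"age": [31, 45, 52], "flag": [1, 1, 1]})
print(constant_columns(X_train))                       # ['flag']
print(constant_columns(X_train.drop(columns="flag")))  # []
```

A test in this style would assert that the returned list is empty for the final X_train.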

test_data_quality_in_final_data()

Check that the final training data contains no NaN or infinite values.
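
A data-quality check of this kind can be sketched with numpy.isfinite, which is False for NaN and for both infinities. The helper name is_finite_frame and the sample frames are hypothetical.

```python
import numpy as np
import pandas as pd


def is_finite_frame(df: pd.DataFrame) -> bool:
    """True when every numeric cell is neither NaN nor +/-inf."""
    numeric = df.select_dtypes(include="number")
    # np.isfinite is False for NaN, inf and -inf alike.
    return bool(np.isfinite(numeric.to_numpy(dtype=float)).all())


# Hypothetical frames: one clean, one with a NaN and an infinity.
clean = pd.DataFrame({"x": [0.5, 1.5], "y": [2, 3]})
dirty = clean.assign(x=[np.nan, np.inf])
print(is_finite_frame(clean))  # True
print(is_finite_frame(dirty))  # False
```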

test_feature_importance_selection()

Test that feature importance selection correctly reduces column count.
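
The property under test, that selection leaves fewer columns than it started with, can be illustrated as follows. This is only a sketch: absolute correlation with the target is used here as a stand-in importance score, and the pipeline's actual importance method is not documented on this page. The function name and sample data are hypothetical.

```python
import pandas as pd


def select_top_features(X: pd.DataFrame, y: pd.Series, n_features: int) -> pd.DataFrame:
    """Keep the n_features columns with the highest absolute correlation to y."""
    # Stand-in importance score (an assumption, not the pipeline's method).
    scores = X.corrwith(y).abs().fillna(0.0)
    keep = set(scores.sort_values(ascending=False).head(n_features).index)
    # Preserve the original column order among the surviving columns.
    return X.loc[:, [c for c in X.columns if c in keep]]


# Hypothetical data: "a" and "b" track the target, "c" and "d" barely do.
X = pd.DataFrame({
    "a": [1, 2, 3, 4, 5, 6],
    "b": [6, 5, 4, 3, 2, 1],
    "c": [1, 0, 1, 0, 1, 0],
    "d": [2, 2, 3, 3, 2, 2],
})
y = pd.Series([0, 0, 0, 1, 1, 1])
selected = select_top_features(X, y, n_features=2)
print(list(selected.columns))  # ['a', 'b']
```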

test_embedding_application()

Test that embedding correctly reduces features to the target dimension.
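
The shape contract being tested can be sketched with a plain PCA projection built on numpy's SVD. PCA is an assumption chosen for illustration; this page does not say which embedding methods the pipeline supports. The function name pca_embed is hypothetical.

```python
import numpy as np


def pca_embed(X: np.ndarray, n_components: int) -> np.ndarray:
    """Project the rows of X onto the top n_components principal directions."""
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


rng = np.random.default_rng(0)
X = rng.normal(size=(20, 8))   # 20 samples, 8 features
embedded = pca_embed(X, n_components=3)
print(embedded.shape)  # (20, 3)
```

The row count is unchanged while the column count equals the requested target dimension, which is the property the test name describes.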

test_index_alignment()

Test that all final data splits have aligned indices.
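
Index alignment between a feature frame and its target can be checked with pandas' Index.equals, as in the sketch below. The helper name splits_aligned and the sample data are hypothetical.

```python
import pandas as pd


def splits_aligned(X: pd.DataFrame, y: pd.Series) -> bool:
    """True when X and y carry the same index labels in the same order."""
    return X.index.equals(y.index)


# Hypothetical split whose rows kept their original labels 10, 11, 12.
X_train = pd.DataFrame({"age": [31, 45, 52]}, index=[10, 11, 12])
y_train = pd.Series([0, 1, 0], index=[10, 11, 12])
print(splits_aligned(X_train, y_train))                         # True
# Resetting only one side silently breaks the row-to-label pairing.
print(splits_aligned(X_train, y_train.reset_index(drop=True)))  # False
```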

test_safety_net_activation()

Test that the safety net retains features when all are pruned.
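
The idea of a safety net can be sketched as a pruning step that refuses to return an empty feature list. The fallback rule used here (keep the first min_keep original columns) is an assumption; the pipeline's actual fallback is not described on this page, and the function name is hypothetical.

```python
def prune_with_safety_net(columns, to_drop, min_keep=1):
    """Drop `to_drop` from `columns`, but never return an empty list."""
    drop = set(to_drop)
    kept = [c for c in columns if c not in drop]
    if kept:
        return kept
    # Safety net (assumed behaviour): pruning removed everything, so fall
    # back to the first `min_keep` original columns.
    return list(columns[:min_keep])


cols = ["age", "bmi", "hr"]
print(prune_with_safety_net(cols, ["bmi"]))              # ['age', 'hr']
print(prune_with_safety_net(cols, cols))                 # ['age']
print(prune_with_safety_net(cols, cols, min_keep=2))     # ['age', 'bmi']
```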