plot_global_importance
======================

.. py:module:: plot_global_importance

.. autoapi-nested-parse::

   Global importance analysis plotting module for ML results analysis.

   This module trains a meta-model on the experimental parameters to determine
   which settings have the greatest impact on the target metric.


Classes
-------

.. autoapisummary::

   plot_global_importance.GlobalImportancePlotter


Module Contents
---------------

.. py:class:: GlobalImportancePlotter(data: pandas.DataFrame)

   Initialize the plotter.

   :param data: Results DataFrame. Must contain columns for experimental
                parameters and performance metrics.


   .. py:attribute:: data


   .. py:attribute:: clean_data


   .. py:attribute:: feature_categories
      :value: ['age', 'sex', 'bmi', 'ethnicity', 'bloods', 'diagnostic_order', 'drug_order', 'annotation_n',...


   .. py:attribute:: pipeline_categorical_params
      :value: ['resample', 'scale', 'param_space_size', 'percent_missing']


   .. py:attribute:: pipeline_continuous_params
      :value: ['nb_size', 'X_train_size', 'X_test_orig_size', 'X_test_size', 'n_fits', 't_fits', 'run_time']


   .. py:attribute:: algorithm_col
      :value: 'method_name'


   .. py:method:: plot_global_importance(metric: str = 'auc', top_n: int = 30, figsize: Tuple[int, int] = (12, 10)) -> None

      Trains a model to predict a metric from experimental parameters and plots importances.

      This method trains a RandomForestRegressor on the pipeline and algorithm
      parameters to predict a given performance metric, then plots the
      resulting feature importances.

      :param metric: The target metric to predict. Defaults to 'auc'.
      :type metric: str, optional
      :param top_n: The number of top important features to plot. Defaults to 30.
      :type top_n: int, optional
      :param figsize: The figure size for the plot. Defaults to (12, 10).
      :type figsize: Tuple[int, int], optional
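
The meta-model idea described for ``plot_global_importance`` can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the module's actual implementation: the helper name ``rank_parameter_importance``, the one-hot encoding step, and the synthetic results table are all hypothetical, and the preprocessing that ``GlobalImportancePlotter`` really applies is not documented here. Plotting is omitted; only the importance ranking is shown.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


def rank_parameter_importance(results, metric="auc", top_n=30, random_state=0):
    """Rank experimental parameters by how strongly they predict ``metric``.

    Hypothetical helper for illustration; not part of plot_global_importance.
    """
    # Assumption: every column except the target metric is a parameter.
    params = results.drop(columns=[metric])
    # One-hot encode categorical settings; numeric columns pass through.
    X = pd.get_dummies(params)
    y = results[metric]

    # Fit the meta-model that predicts the metric from the parameters.
    model = RandomForestRegressor(n_estimators=100, random_state=random_state)
    model.fit(X, y)

    # Impurity-based importances, one value per encoded column.
    importances = pd.Series(model.feature_importances_, index=X.columns)
    return importances.sort_values(ascending=False).head(top_n)


# Synthetic results table (made up for this sketch): 'auc' depends mainly
# on 'nb_size', so that column should rank first.
rng = np.random.default_rng(0)
n_runs = 200
results = pd.DataFrame({
    "method_name": rng.choice(["logistic", "forest"], size=n_runs),
    "resample": rng.choice(["none", "over", "under"], size=n_runs),
    "nb_size": rng.integers(10, 100, size=n_runs),
})
results["auc"] = (
    0.5 + 0.004 * results["nb_size"] + rng.normal(0, 0.01, size=n_runs)
)

top_features = rank_parameter_importance(results, metric="auc", top_n=5)
print(top_features)
```

With one-hot encoding, the importance of a categorical parameter is spread across its encoded columns, so a per-parameter view would need those columns summed back together.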