pat2vec.pat2vec_search.matcher

Functions

matcher(data_template_df, lab_results_df, ...)

Matches lab results to a template DataFrame based on the nearest date.

pat2vec.pat2vec_search.matcher.matcher(data_template_df, lab_results_df, source_patid_colname, source_date_colname, result_date_colname, result_testname, result_resultname, before, after)

Matches lab results to a template DataFrame based on the nearest date.

For each row in data_template_df, this function finds, for the same patient, the lab result in lab_results_df whose date is closest to the date in the template row. The search is restricted to a window extending before days before and after days after that date, and is repeated for each unique lab test name found in lab_results_df. The matched results are added to the template DataFrame as new columns, one per test.
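The matching described above can be sketched in plain pandas. This is an illustrative re-implementation under stated assumptions, not the pat2vec source: the function name nearest_lab_match and the example column names are hypothetical, the patient ID column is assumed to have the same name in both DataFrames, and both window edges are assumed to be inclusive.

```python
import pandas as pd


def nearest_lab_match(template_df, labs_df, patid_col, date_col,
                      lab_date_col, testname_col, result_col, before, after):
    """Hypothetical sketch of nearest-date matching; not the library's code."""
    out = template_df.copy()
    target_dates = pd.to_datetime(out[date_col])

    # Fresh integer index so that idxmin() below points at exactly one row.
    labs = labs_df.reset_index(drop=True)
    labs[lab_date_col] = pd.to_datetime(labs[lab_date_col])

    # One new output column per unique test name.
    for test_name in labs[testname_col].dropna().unique():
        test_rows = labs[labs[testname_col] == test_name]
        values = []
        for patid, target in zip(out[patid_col], target_dates):
            # Assumption: the patient ID column is named the same in both frames.
            cand = test_rows[test_rows[patid_col] == patid]
            # Assumption: both edges of the search window are inclusive.
            lo = target - pd.Timedelta(days=before)
            hi = target + pd.Timedelta(days=after)
            cand = cand[(cand[lab_date_col] >= lo) & (cand[lab_date_col] <= hi)]
            if cand.empty:
                values.append(None)  # no result inside the window
            else:
                gap = (cand[lab_date_col] - target).abs()
                values.append(cand.loc[gap.idxmin(), result_col])
        out[test_name] = values
    return out


# Made-up example data; all column names here are illustrative only.
template = pd.DataFrame({
    "patient_id": ["p1", "p2"],
    "index_date": ["2024-01-10", "2024-01-10"],
})
labs = pd.DataFrame({
    "patient_id": ["p1", "p1", "p1", "p2"],
    "sample_date": ["2024-01-08", "2024-01-11", "2024-01-09", "2023-12-01"],
    "test_name": ["HBA1C", "HBA1C", "CRP", "CRP"],
    "value": [48.0, 50.0, 5.0, 12.0],
})
matched = nearest_lab_match(template, labs, "patient_id", "index_date",
                            "sample_date", "test_name", "value",
                            before=7, after=7)
```

In this example, patient p1 has two HBA1C results inside the window and the one dated a day after the target date is chosen over the one two days before it. Patient p2 has no result inside the window, so both new columns are left empty for that row.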

Parameters:
  • data_template_df (DataFrame) – Template DataFrame with patient IDs and target dates.

  • lab_results_df (DataFrame) – DataFrame with lab results, including patient IDs, dates, test names, and results.

  • source_patid_colname (str) – Column name for patient IDs in the template DataFrame.

  • source_date_colname (str) – Column name for dates in the template DataFrame.

  • result_date_colname (str) – Column name for dates in the lab results DataFrame.

  • result_testname (str) – Column name for test names in the lab results DataFrame.

  • result_resultname (str) – Column name for test results in the lab results DataFrame.

  • before (int) – Number of days before the target date to include in the search window.

  • after (int) – Number of days after the target date to include in the search window.
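To make the before and after parameters concrete, the following small sketch computes a search window around one target date and flags which lab dates fall inside it. The documentation does not say whether the window edges are inclusive, so inclusive edges are an assumption here, and the dates are made up for illustration.

```python
import pandas as pd

target = pd.Timestamp("2024-01-10")
before, after = 3, 1  # days before and after the target date

window_start = target - pd.Timedelta(days=before)
window_end = target + pd.Timedelta(days=after)

lab_dates = pd.to_datetime(
    pd.Series(["2024-01-06", "2024-01-07", "2024-01-11", "2024-01-12"])
)

# Assumption: both edges count as inside the window (Series.between default).
in_window = lab_dates.between(window_start, window_end)
```

With before=3 and after=1 the window runs from 2024-01-07 to 2024-01-11, so only the two middle dates are kept as candidates.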

Return type:

DataFrame

Returns:

The template DataFrame with added columns for each unique lab test, populated with the nearest result value.