pat2vec.patvec_get_batch_methods.get_merged_batches

Functions

get_merged_pat_batch_appointments(...)

Retrieves a merged batch of appointments for a list of patients.

get_merged_pat_batch_bloods(...)

Retrieves a merged batch of blood test observations for a list of patients.

get_merged_pat_batch_bmi(client_idcode_list, ...)

Retrieves a merged batch of BMI-related observations for a list of patients.

get_merged_pat_batch_demo(...)

Retrieves a merged batch of demographic information for a list of patients.

get_merged_pat_batch_diagnostics(...)

Retrieves a merged batch of diagnostic orders for a list of patients.

get_merged_pat_batch_drugs(...)

Retrieves a merged batch of drug orders for a list of patients.

get_merged_pat_batch_epr_docs(...)

Retrieves a merged batch of EPR documents for a list of patients.

get_merged_pat_batch_mct_docs(...)

Retrieves a merged batch of MCT documents for a list of patients.

get_merged_pat_batch_news(...)

Retrieves a merged batch of NEWS observations for a list of patients.

get_merged_pat_batch_obs(client_idcode_list, ...)

Retrieves a merged batch of specific observations for a list of patients.

get_merged_pat_batch_reports(...)

Retrieves a merged batch of reports for a list of patients.

get_merged_pat_batch_textual_obs_docs(...)

Retrieves a merged batch of textual observations for a list of patients.

save_group(client_idcode_group, save_folder)

Saves a single patient's data group to a CSV file.

split_and_save_csv(df, client_idcode_column, ...)

Splits a DataFrame by a key and saves each subset as a CSV using multiprocessing.

verify_split_data_concatenated(original_df, ...)

Verifies split data by concatenating and comparing with the original.

verify_split_data_individual(original_df, ...)

Verifies split data by checking each individual CSV file.

pat2vec.patvec_get_batch_methods.get_merged_batches.verify_split_data_concatenated(original_df, client_idcode_column, save_folder)

Verifies split data by concatenating and comparing with the original.

This function provides a fast verification method by reading all the split CSV files, concatenating them, and comparing the result to the original DataFrame. It assumes the files can be sorted in a way that matches the original DataFrame’s order.

Parameters:
  • original_df (DataFrame) – The original DataFrame before splitting.

  • client_idcode_column (str) – The name of the column used for splitting.

  • save_folder (str) – The directory where the split CSV files are saved.

Return type:

None
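
The concatenate-and-compare check described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name `concatenated_matches`, the `*.csv` glob, sorting both frames by the ID column, and returning a boolean (the documented function returns None) are all assumptions made for the example.

```python
from pathlib import Path

import pandas as pd


def concatenated_matches(original_df, client_idcode_column, save_folder):
    """Hypothetical sketch: True if the split CSVs reassemble into original_df."""
    files = sorted(Path(save_folder).glob("*.csv"))
    if not files:
        return False

    # Read every split file and stack the rows into one frame.
    combined = pd.concat((pd.read_csv(f) for f in files), ignore_index=True)

    # Assumption: a stable sort on the ID column puts both frames in a
    # comparable order while preserving row order within each client.
    expected = original_df.sort_values(
        client_idcode_column, kind="stable"
    ).reset_index(drop=True)
    combined = combined.sort_values(
        client_idcode_column, kind="stable"
    ).reset_index(drop=True)

    try:
        # check_dtype=False tolerates dtype changes from the CSV round trip.
        pd.testing.assert_frame_equal(expected, combined, check_dtype=False)
    except AssertionError:
        return False
    return True
```

A single comparison like this is quick, but it only reports that something differs, not which client's file is at fault.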

pat2vec.patvec_get_batch_methods.get_merged_batches.verify_split_data_individual(original_df, client_idcode_column, save_folder)

Verifies split data by checking each individual CSV file.

This function performs a more thorough verification by iterating through each expected client ID, reading its corresponding CSV file, and comparing its content to the relevant slice of the original DataFrame.

Parameters:
  • original_df (DataFrame) – The original DataFrame before splitting.

  • client_idcode_column (str) – The name of the column used for splitting.

  • save_folder (str) – The directory where the split CSV files are saved.

Return type:

None
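
A per-client check of this kind might look like the sketch below. It is illustrative only: the name `mismatched_clients`, the `<client_id>.csv` file naming, and returning a list of failing IDs (the documented function returns None) are assumptions, not details taken from pat2vec.

```python
from pathlib import Path

import pandas as pd


def mismatched_clients(original_df, client_idcode_column, save_folder):
    """Hypothetical sketch: list client IDs whose CSV is missing or differs."""
    bad = []
    for client_id, expected in original_df.groupby(client_idcode_column, sort=False):
        # Assumption: each client's rows live in "<client_id>.csv".
        path = Path(save_folder) / f"{client_id}.csv"
        if not path.exists():
            bad.append(client_id)
            continue

        # Note: IDs that look numeric may be re-read as integers; passing a
        # dtype to read_csv would be needed to keep them as strings.
        saved = pd.read_csv(path)
        try:
            pd.testing.assert_frame_equal(
                expected.reset_index(drop=True), saved, check_dtype=False
            )
        except AssertionError:
            bad.append(client_id)
    return bad
```

Compared with the concatenated check, this reads one file per client and is slower on large cohorts, but it pinpoints exactly which files are wrong.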

pat2vec.patvec_get_batch_methods.get_merged_batches.save_group(client_idcode_group, save_folder)

Saves a single patient’s data group to a CSV file.

Parameters:
  • client_idcode_group (Tuple[str, DataFrame]) – A tuple containing the client ID and their data as a DataFrame.

  • save_folder (str) – The directory where the CSV file will be saved.

Return type:

None
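
To make the expected argument shape concrete, here is a minimal sketch of a function that accepts the same `(client ID, DataFrame)` tuple. The name `save_group_sketch` and the `<client_id>.csv` file naming are assumptions for illustration; the actual file name used by pat2vec is not stated here.

```python
import os
from typing import Tuple

import pandas as pd


def save_group_sketch(client_idcode_group: Tuple[str, pd.DataFrame], save_folder: str) -> None:
    """Hypothetical sketch: write one client's rows to its own CSV file."""
    client_id, client_df = client_idcode_group
    os.makedirs(save_folder, exist_ok=True)
    # Assumption: one file per client, named after the client ID.
    client_df.to_csv(os.path.join(save_folder, f"{client_id}.csv"), index=False)
```

The tuple shape matches what iterating over `df.groupby(column)` yields, which is why a function like this can be mapped directly over the groups.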

pat2vec.patvec_get_batch_methods.get_merged_batches.split_and_save_csv(df, client_idcode_column, save_folder, num_processes=None)

Splits a DataFrame by a key and saves each subset as a CSV using multiprocessing.

This function groups a large DataFrame by the client_idcode_column and saves the data for each client into a separate CSV file in the save_folder.

Parameters:
  • df (DataFrame) – The pandas DataFrame to split.

  • client_idcode_column (str) – The name of the column to group by.

  • save_folder (str) – The path to the folder where CSVs will be saved.

  • num_processes (Optional[int]) – The number of processes to use. Defaults to all available CPUs.

Return type:

None
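
The group-then-save pattern can be sketched with pandas and the standard library as below. This is not the pat2vec source: the names `split_and_save_sketch` and `_save_one`, the `<client_id>.csv` naming, and the serial shortcut for `num_processes=1` are assumptions added for the example.

```python
import functools
import multiprocessing
import os
from typing import Optional, Tuple

import pandas as pd


def _save_one(group: Tuple[str, pd.DataFrame], save_folder: str) -> None:
    """Write one (client_id, DataFrame) pair to "<client_id>.csv" (assumed naming)."""
    client_id, client_df = group
    client_df.to_csv(os.path.join(save_folder, f"{client_id}.csv"), index=False)


def split_and_save_sketch(
    df: pd.DataFrame,
    client_idcode_column: str,
    save_folder: str,
    num_processes: Optional[int] = None,
) -> None:
    """Hypothetical sketch: one CSV per client, written by a process pool."""
    os.makedirs(save_folder, exist_ok=True)
    # groupby yields (key, sub-DataFrame) tuples, one per client.
    groups = list(df.groupby(client_idcode_column))

    if num_processes == 1:
        # Assumption for this sketch: skip the pool when one process is requested.
        for group in groups:
            _save_one(group, save_folder)
        return

    worker = functools.partial(_save_one, save_folder=save_folder)
    # processes=None lets the pool use the number of available CPUs.
    with multiprocessing.Pool(processes=num_processes) as pool:
        pool.map(worker, groups)
```

On platforms that start worker processes with "spawn" (Windows, and macOS by default), a call that uses the pool should sit under an `if __name__ == "__main__":` guard.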

pat2vec.patvec_get_batch_methods.get_merged_batches.get_merged_pat_batch_bloods(client_idcode_list, search_term, config_obj, cohort_searcher_with_terms_and_search)

Retrieves a merged batch of blood test observations for a list of patients.

This function queries the basic_observations index for all patients in client_idcode_list in a single search operation.

Parameters:
  • client_idcode_list (List[str]) – A list of client ID codes.

  • search_term (str) – The term to search for (currently unused in the query).

  • config_obj (Any) – The configuration object.

  • cohort_searcher_with_terms_and_search (Any) – The search function to use.

Return type:

DataFrame

Returns:

A DataFrame containing the merged batch of blood test observations.
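
The "single search operation" idea, shared by the batch functions in this module, is that the whole ID list goes to the injected search function once rather than once per patient. The sketch below illustrates that with a stand-in search function. Everything in it is hypothetical: `fake_search`, its keyword arguments, the `client_idcode` field name, and the sample records are invented for the example and do not describe the real `cohort_searcher_with_terms_and_search` signature.

```python
from typing import Callable, List

import pandas as pd

# Invented sample records standing in for an index; not real data.
_RECORDS = [
    {"client_idcode": "P1", "obs_name": "Haemoglobin", "value": 130},
    {"client_idcode": "P2", "obs_name": "Haemoglobin", "value": 120},
    {"client_idcode": "P3", "obs_name": "Sodium", "value": 140},
]

CALLS = []  # records each search call so the batching is visible


def fake_search(index_name: str, term_name: str, entered_list: List[str]) -> pd.DataFrame:
    """Hypothetical stand-in for the injected search function."""
    CALLS.append(index_name)
    df = pd.DataFrame(_RECORDS)
    return df[df[term_name].isin(entered_list)].reset_index(drop=True)


def merged_batch_sketch(client_idcode_list: List[str], search_fn: Callable[..., pd.DataFrame]) -> pd.DataFrame:
    """Hypothetical sketch: fetch rows for all listed patients in one call."""
    return search_fn(
        index_name="basic_observations",
        term_name="client_idcode",  # assumed field name
        entered_list=client_idcode_list,
    )


batch = merged_batch_sketch(["P1", "P3"], fake_search)
```

Passing the search function as an argument, as the documented signatures do, also makes it straightforward to substitute a stub like this in tests.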

pat2vec.patvec_get_batch_methods.get_merged_batches.get_merged_pat_batch_drugs(client_idcode_list, config_obj, cohort_searcher_with_terms_and_search)

Retrieves a merged batch of drug orders for a list of patients.

This function queries the order index for all patients in client_idcode_list in a single search operation, filtering for medication orders.

Parameters:
  • client_idcode_list (List[str]) – A list of client ID codes.

  • config_obj (Any) – The configuration object.

  • cohort_searcher_with_terms_and_search (Any) – The search function to use.

Return type:

DataFrame

Returns:

A DataFrame containing the merged batch of drug orders.

pat2vec.patvec_get_batch_methods.get_merged_batches.get_merged_pat_batch_diagnostics(client_idcode_list, config_obj, cohort_searcher_with_terms_and_search)

Retrieves a merged batch of diagnostic orders for a list of patients.

This function queries the order index for all patients in client_idcode_list in a single search operation, filtering for diagnostic orders.

Parameters:
  • client_idcode_list (List[str]) – A list of client ID codes.

  • config_obj (Any) – The configuration object.

  • cohort_searcher_with_terms_and_search (Any) – The search function to use.

Return type:

DataFrame

Returns:

A DataFrame containing the merged batch of diagnostic orders.

pat2vec.patvec_get_batch_methods.get_merged_batches.get_merged_pat_batch_mct_docs(client_idcode_list, search_term, config_obj, cohort_searcher_with_terms_and_search)

Retrieves a merged batch of MCT documents for a list of patients.

This function queries the observations index for all patients in client_idcode_list, filtering for ‘AoMRC_ClinicalSummary_FT’ documents.

Parameters:
  • client_idcode_list (List[str]) – A list of client ID codes.

  • search_term (str) – The term to search for (currently unused).

  • config_obj (Any) – The configuration object.

  • cohort_searcher_with_terms_and_search (Any) – The search function to use.

Return type:

DataFrame

Returns:

A DataFrame containing the merged batch of MCT documents.

pat2vec.patvec_get_batch_methods.get_merged_batches.get_merged_pat_batch_epr_docs(client_idcode_list, search_term, config_obj, cohort_searcher_with_terms_and_search)

Retrieves a merged batch of EPR documents for a list of patients.

This function queries the epr_documents index for all patients in client_idcode_list within the globally defined time window.

Parameters:
  • client_idcode_list (List[str]) – A list of client ID codes.

  • search_term (str) – The term to search for (currently unused).

  • config_obj (Any) – The configuration object.

  • cohort_searcher_with_terms_and_search (Any) – The search function to use.

Return type:

DataFrame

Returns:

A DataFrame containing the merged batch of EPR documents.

pat2vec.patvec_get_batch_methods.get_merged_batches.get_merged_pat_batch_textual_obs_docs(client_idcode_list, search_term, config_obj, cohort_searcher_with_terms_and_search)

Retrieves a merged batch of textual observations for a list of patients.

This function queries the basic_observations index for all patients in client_idcode_list and filters for rows containing non-empty textualObs.

Parameters:
  • client_idcode_list (List[str]) – A list of client ID codes.

  • search_term (str) – The term to search for (currently unused).

  • config_obj (Any) – The configuration object.

  • cohort_searcher_with_terms_and_search (Any) – The search function to use.

Return type:

DataFrame

Returns:

A DataFrame containing the merged batch of textual observation documents.
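
The non-empty textualObs filter can be illustrated with plain pandas. This is a sketch under assumptions, not the library's code: the helper name `keep_textual_obs` is invented, and treating whitespace-only strings as empty is a choice made for the example.

```python
import pandas as pd


def keep_textual_obs(df: pd.DataFrame, column: str = "textualObs") -> pd.DataFrame:
    """Hypothetical sketch: keep rows whose text column holds real content."""
    text = df[column]
    # Drop missing values, and (by assumption) strings that are empty
    # or contain only whitespace.
    mask = text.notna() & (text.astype(str).str.strip() != "")
    return df[mask].reset_index(drop=True)
```

Applied to a column containing `"note"`, `""`, a missing value, `"  "`, and `"x"`, this keeps only the `"note"` and `"x"` rows.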

pat2vec.patvec_get_batch_methods.get_merged_batches.get_merged_pat_batch_appointments(client_idcode_list, search_term, config_obj, cohort_searcher_with_terms_and_search)

Retrieves a merged batch of appointments for a list of patients.

This function queries the pims_apps* index for all patients in client_idcode_list within the globally defined time window.

Parameters:
  • client_idcode_list (List[str]) – A list of client ID codes.

  • search_term (str) – The term to search for (currently unused).

  • config_obj (Any) – The configuration object.

  • cohort_searcher_with_terms_and_search (Any) – The search function to use.

Return type:

DataFrame

Returns:

A DataFrame containing the merged batch of appointments.

pat2vec.patvec_get_batch_methods.get_merged_batches.get_merged_pat_batch_demo(client_idcode_list, search_term, config_obj, cohort_searcher_with_terms_and_search)

Retrieves a merged batch of demographic information for a list of patients.

This function queries the epr_documents index for all patients in client_idcode_list to get their demographic data.

Parameters:
  • client_idcode_list (List[str]) – A list of client ID codes.

  • search_term (str) – The term to search for (currently unused).

  • config_obj (Any) – The configuration object.

  • cohort_searcher_with_terms_and_search (Any) – The search function to use.

Return type:

DataFrame

Returns:

A DataFrame containing the merged batch of demographic information.

pat2vec.patvec_get_batch_methods.get_merged_batches.get_merged_pat_batch_bmi(client_idcode_list, search_term, config_obj, cohort_searcher_with_terms_and_search)

Retrieves a merged batch of BMI-related observations for a list of patients.

This function queries the observations index for all patients in client_idcode_list, filtering for BMI, Weight, and Height observations.

Parameters:
  • client_idcode_list (List[str]) – A list of client ID codes.

  • search_term (str) – The term to search for (currently unused).

  • config_obj (Any) – The configuration object.

  • cohort_searcher_with_terms_and_search (Any) – The search function to use.

Return type:

DataFrame

Returns:

A DataFrame containing the merged batch of BMI-related observations.

pat2vec.patvec_get_batch_methods.get_merged_batches.get_merged_pat_batch_obs(client_idcode_list, search_term, config_obj, cohort_searcher_with_terms_and_search)

Retrieves a merged batch of specific observations for a list of patients.

This function queries the observations index for all patients in client_idcode_list, filtering for a specific search_term.

Parameters:
  • client_idcode_list (List[str]) – A list of client ID codes.

  • search_term (str) – The specific observation term to search for.

  • config_obj (Any) – The configuration object.

  • cohort_searcher_with_terms_and_search (Any) – The search function to use.

Return type:

DataFrame

Returns:

A DataFrame containing the merged batch of specified observations.

pat2vec.patvec_get_batch_methods.get_merged_batches.get_merged_pat_batch_news(client_idcode_list, search_term, config_obj, cohort_searcher_with_terms_and_search)

Retrieves a merged batch of NEWS observations for a list of patients.

This function queries the observations index for all patients in client_idcode_list, filtering for ‘NEWS’ or ‘NEWS2’ observations.

Parameters:
  • client_idcode_list (List[str]) – A list of client ID codes.

  • search_term (str) – The term to search for (currently unused).

  • config_obj (Any) – The configuration object.

  • cohort_searcher_with_terms_and_search (Any) – The search function to use.

Return type:

DataFrame

Returns:

A DataFrame containing the merged batch of NEWS observations.

pat2vec.patvec_get_batch_methods.get_merged_batches.get_merged_pat_batch_reports(client_idcode_list, search_term, config_obj, cohort_searcher_with_terms_and_search)

Retrieves a merged batch of reports for a list of patients.

This function queries the basic_observations index for all patients in client_idcode_list, filtering for documents where the item name is ‘report’.

Parameters:
  • client_idcode_list (List[str]) – A list of client ID codes.

  • search_term (str) – The specific report type to search for.

  • config_obj (Any) – The configuration object.

  • cohort_searcher_with_terms_and_search (Any) – The search function to use.

Return type:

DataFrame

Returns:

A DataFrame containing the merged batch of reports.