pat2vec.pat2vec_get_methods.get_method_demo

Functions

get_demographics3(patlist, ...[, config_obj])

Gets demographics information for patients within a specified date range.

process_demographics_data(demo_data, patlist)

Processes raw demographics data to return the most recent record per patient.

search_demographics([...])

Searches for demographics data for patients within a date range.

pat2vec.pat2vec_get_methods.get_method_demo.search_demographics(cohort_searcher_with_terms_and_search=None, client_id_codes=None, demographics_time_field='updatetime', fields_override=None, start_year='1995', start_month='01', start_day='01', end_year='2025', end_month='12', end_day='12', additional_custom_search_string=None, index_name='epr_documents', output_filename='demographics_search_results.csv', overwrite=False, config_obj=None)

Searches for demographics data for patients within a date range.

Parameters:
  • cohort_searcher_with_terms_and_search (Optional[Callable]) – The function for cohort searching. Defaults to None.

  • client_id_codes (Optional[Union[str, List[str]]]) – The client ID code(s) of the patient(s). Defaults to None.

  • demographics_time_field (str) – The timestamp field for filtering demographics. Defaults to ‘updatetime’.

  • fields_override (Optional[List[str]]) – A list of fields to override the default DEMOGRAPHICS_FIELDS. Defaults to None.

  • start_year (str) – Start year for the search. Defaults to ‘1995’.

  • start_month (str) – Start month for the search. Defaults to ‘01’.

  • start_day (str) – Start day for the search. Defaults to ‘01’.

  • end_year (str) – End year for the search. Defaults to ‘2025’.

  • end_month (str) – End month for the search. Defaults to ‘12’.

  • end_day (str) – End day for the search. Defaults to ‘12’.

  • additional_custom_search_string (Optional[str]) – An additional string to append to the search query. Defaults to None.

  • index_name (str) – The name of the Elasticsearch index to search. Defaults to “epr_documents”.

  • output_filename (Optional[str]) – The filename or path to a CSV file to load from or save to. Defaults to “demographics_search_results.csv”.

  • overwrite (bool) – If True, perform the search even if output_filename exists. Defaults to False.

  • config_obj (Optional[object]) – Configuration object containing root_path. Defaults to None.

Returns:

A DataFrame containing the raw demographics data.

Return type:

pd.DataFrame

Raises:

ValueError – If essential arguments are None.
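The output_filename and overwrite parameters together describe a load-or-search pattern: a saved CSV is reused unless a fresh search is forced. The sketch below illustrates that pattern only and is not the library's implementation. The helper load_or_search, the stand-in fake_search, and the column name client_idcode are all hypothetical; only the filename default and the updatetime field come from the signature above.

```python
import csv
import os
import tempfile


def load_or_search(run_search, output_filename, overwrite=False):
    """Hypothetical helper: reuse a saved CSV unless overwrite is requested."""
    if os.path.exists(output_filename) and not overwrite:
        # Cached file present and no overwrite requested: load it.
        with open(output_filename, newline="") as handle:
            return list(csv.DictReader(handle))

    # No cached file, or overwrite=True: run the search and save the rows.
    rows = run_search()
    fieldnames = list(rows[0].keys()) if rows else []
    with open(output_filename, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return rows


calls = []


def fake_search():
    # Stand-in for the real Elasticsearch query; records each invocation.
    calls.append(1)
    return [{"client_idcode": "P001", "updatetime": "2020-01-01"}]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demographics_search_results.csv")
    first = load_or_search(fake_search, path)                  # runs the search
    second = load_or_search(fake_search, path)                 # loads the CSV
    third = load_or_search(fake_search, path, overwrite=True)  # searches again
    search_count = len(calls)
```

In this sketch the second call never reaches fake_search, which is the behaviour the overwrite=False default suggests for repeated runs against the same output file.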

pat2vec.pat2vec_get_methods.get_method_demo.process_demographics_data(demo_data, patlist)

Processes raw demographics data to return the most recent record per patient.

Parameters:
  • demo_data (pd.DataFrame) – Raw demographics data from the search.

  • patlist (List[str]) – List of patient IDs that were requested.

Returns:

Processed demographics data containing the single most recent record for the patient(s).

Return type:

pd.DataFrame
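Selecting the most recent record per patient can be sketched with pandas as shown here. This is a minimal illustration under stated assumptions, not the function's actual code: the helper name latest_record_per_patient and the columns client_idcode and client_gendercode are hypothetical, and updatetime is taken from the default demographics_time_field of search_demographics.

```python
import pandas as pd


def latest_record_per_patient(demo_data, patlist,
                              id_col="client_idcode", time_col="updatetime"):
    """Hypothetical sketch: keep the newest row for each requested patient."""
    # Restrict to the requested patients and parse timestamps for sorting.
    subset = demo_data[demo_data[id_col].isin(patlist)].copy()
    subset[time_col] = pd.to_datetime(subset[time_col])
    # Sort oldest to newest, then keep the last row seen for each patient.
    return (
        subset.sort_values(time_col)
        .drop_duplicates(subset=id_col, keep="last")
        .reset_index(drop=True)
    )


# Illustrative data only; column names are assumptions.
raw = pd.DataFrame({
    "client_idcode": ["P001", "P001", "P002", "P003"],
    "updatetime": ["2019-05-01", "2021-03-15", "2020-07-30", "2022-01-01"],
    "client_gendercode": ["F", "F", "M", "F"],
})
latest = latest_record_per_patient(raw, ["P001", "P002"])
```

The result holds one row each for P001 and P002; P003 is dropped because it was not in the requested list.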

pat2vec.pat2vec_get_methods.get_method_demo.get_demographics3(patlist, target_date_range, cohort_searcher_with_terms_and_search, config_obj=None)

Gets demographics information for patients within a specified date range.

Parameters:
  • patlist (List[str]) – List of patient IDs.

  • target_date_range (Tuple) – A tuple representing the target date range.

  • cohort_searcher_with_terms_and_search (Callable) – The function for cohort searching.

  • config_obj (Optional[object]) – Configuration object containing settings. Defaults to None.

Returns:

Demographics information for the specified patients.

Return type:

pd.DataFrame

Raises:

ValueError – If config_obj or cohort_searcher_with_terms_and_search is None, or if patlist is empty.
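The documented ValueError conditions can be read as a guard that runs before any search is attempted. The sketch below is a hypothetical stand-alone version of such a guard; the helper names and error messages are assumptions for illustration, and the order of the checks in the real function may differ.

```python
def validate_demographics_args(patlist, cohort_searcher, config_obj):
    """Hypothetical guard mirroring the documented ValueError conditions."""
    if config_obj is None:
        raise ValueError("config_obj must not be None")
    if cohort_searcher is None:
        raise ValueError("cohort_searcher_with_terms_and_search must not be None")
    if not patlist:
        raise ValueError("patlist must not be empty")


def error_for(patlist, cohort_searcher, config_obj):
    """Return the ValueError message for the given arguments, or None."""
    try:
        validate_demographics_args(patlist, cohort_searcher, config_obj)
    except ValueError as exc:
        return str(exc)
    return None


def dummy_searcher(**kwargs):
    # Placeholder callable; a real cohort searcher would query the index.
    return []


dummy_config = object()  # placeholder for a real configuration object

ok = error_for(["P001"], dummy_searcher, dummy_config)
missing_config = error_for(["P001"], dummy_searcher, None)
missing_searcher = error_for(["P001"], None, dummy_config)
empty_patlist = error_for([], dummy_searcher, dummy_config)
```

Callers that build patlist dynamically may want to check for an empty list themselves before calling get_demographics3, since an empty list is treated as an error rather than returning an empty DataFrame.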