pat2vec.util.methods_annotation_json_to_dataframe
Functions
| json_to_dataframe | Converts a MedCAT JSON entity dictionary to a pandas DataFrame. |
| parse_meta_anns | Parses meta-annotations from a MedCAT entity dictionary. |
- pat2vec.util.methods_annotation_json_to_dataframe.json_to_dataframe(json_data, doc, current_pat_client_id_code, full_doc=False, window=300, text_column='body_analysed', time_column='updatetime', guid_column='document_guid')
Converts a MedCAT JSON entity dictionary to a pandas DataFrame.
This function takes the ‘entities’ dictionary from a MedCAT output for a single document and transforms it into a structured DataFrame. Each row in the resulting DataFrame represents a single annotation (entity). It also extracts a text sample around the annotation and includes document-level metadata.
- Parameters:
json_data (Dict[str, Any]) – The ‘entities’ dictionary from MedCAT’s output.
doc (Series) – The pandas Series representing the original document, containing metadata like text, timestamp, and GUID.
current_pat_client_id_code (str) – The patient’s unique identifier.
full_doc (bool) – If True, includes the full document text in the first annotation row. Defaults to False.
window (int) – The number of characters to include on either side of the annotation for the ‘text_sample’. Defaults to 300.
text_column (str) – The name of the column in doc containing the text.
time_column (str) – The name of the column in doc containing the timestamp.
guid_column (str) – The name of the column in doc containing the document GUID.
- Return type:
DataFrame
- Returns:
A pandas DataFrame where each row is a single annotation, or an empty DataFrame if no entities are present in the input.
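The row-per-annotation layout and the ‘text_sample’ window can be illustrated with a minimal sketch. This is not the pat2vec implementation: the helper name entities_to_rows, the output columns, and the placeholder CUI are assumptions made for illustration, and the real function also adds document-level metadata (timestamp, GUID, patient identifier) that is omitted here.

```python
from typing import Any, Dict

import pandas as pd


def entities_to_rows(entities: Dict[Any, Dict[str, Any]], doc: pd.Series,
                     window: int = 300,
                     text_column: str = "body_analysed") -> pd.DataFrame:
    """Hypothetical stand-in: one row per entity plus a clipped text sample."""
    if not entities:
        # Mirrors the documented behaviour for input with no entities.
        return pd.DataFrame()
    text = doc[text_column]
    rows = []
    for ent in entities.values():
        # Take `window` characters on each side, clipped to the document bounds.
        lo = max(0, ent["start"] - window)
        hi = min(len(text), ent["end"] + window)
        rows.append({
            "pretty_name": ent.get("pretty_name"),
            "cui": ent.get("cui"),
            "start": ent["start"],
            "end": ent["end"],
            "text_sample": text[lo:hi],
        })
    return pd.DataFrame(rows)


doc = pd.Series({"body_analysed": "Patient reports chest pain since Monday."})
entities = {
    0: {"pretty_name": "Chest pain", "cui": "C0000000",  # placeholder CUI
        "start": 16, "end": 26},
}
df = entities_to_rows(entities, doc, window=8)
print(df[["pretty_name", "text_sample"]])
```

Clipping the slice bounds keeps the sample valid when an annotation sits near the start or end of the document, which a fixed window would otherwise overrun.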
- pat2vec.util.methods_annotation_json_to_dataframe.parse_meta_anns(meta_anns)
Parses meta-annotations from a MedCAT entity dictionary.
This function extracts the value and confidence for ‘Time’, ‘Presence’, and ‘Subject/Experiencer’ meta-annotations. It includes a fallback to check for ‘Subject’ if ‘Subject/Experiencer’ is not found.
- Parameters:
meta_anns (Dict[str, Any]) – The meta_anns dictionary from a MedCAT entity.
- Return type:
Dict[str, Any]
- Returns:
A dictionary containing the parsed meta-annotation values and confidences.
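The fallback from ‘Subject/Experiencer’ to ‘Subject’ can be sketched as follows. This is a hypothetical re-implementation, not the library's code: the function name parse_meta_anns_sketch and the output key names (such as Subject_Value) are assumptions, and the input assumes each meta-annotation is a dictionary with ‘value’ and ‘confidence’ entries.

```python
from typing import Any, Dict, Optional


def parse_meta_anns_sketch(meta_anns: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical sketch; the output key names are assumptions."""
    meta_anns = meta_anns or {}
    time = meta_anns.get("Time") or {}
    presence = meta_anns.get("Presence") or {}
    # Prefer 'Subject/Experiencer'; fall back to 'Subject' when it is absent.
    subject = meta_anns.get("Subject/Experiencer") or meta_anns.get("Subject") or {}
    return {
        "Time_Value": time.get("value"),
        "Time_Confidence": time.get("confidence"),
        "Presence_Value": presence.get("value"),
        "Presence_Confidence": presence.get("confidence"),
        "Subject_Value": subject.get("value"),
        "Subject_Confidence": subject.get("confidence"),
    }


# Example input using the older 'Subject' key, so the fallback path is taken.
example = {
    "Presence": {"value": "True", "confidence": 0.98},
    "Subject": {"value": "Patient", "confidence": 0.95},
    "Time": {"value": "Recent", "confidence": 0.9},
}
parsed = parse_meta_anns_sketch(example)
print(parsed)
```

Missing meta-annotations yield None for both value and confidence in this sketch, so downstream code can build DataFrame columns without checking for absent keys.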