pat2vec.util.get_dummy_data_medcat_annotation

Functions

dummy_medcat_annotation_generator()

Loads a sample MedCAT annotation dictionary and returns a random subset.

random_sample(pickled_dict, sample_size)

Selects a random sample of entities from a pickled dictionary.

Classes

dummy_CAT([with_filters])

A dummy MedCAT class for testing purposes.

pat2vec.util.get_dummy_data_medcat_annotation.random_sample(pickled_dict, sample_size)[source]

Selects a random sample of entities from a pickled dictionary.

Parameters:
  • pickled_dict (Dict[str, Any]) – The dictionary loaded from a pickle file, expected to have an 'entities' key.

  • sample_size (int) – The number of entities to sample.

Return type:

Dict[str, Any]

Returns:

A new dictionary containing the sampled entities.
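The sampling described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the name random_sample_sketch is hypothetical, the 'entities' value is assumed to be a mapping from entity id to annotation, and clamping sample_size to the number of available entities is a design choice made for this sketch.

```python
import random
from typing import Any, Dict


def random_sample_sketch(pickled_dict: Dict[str, Any], sample_size: int) -> Dict[str, Any]:
    """Illustrative stand-in for random_sample; not the library's code."""
    # Assumption: 'entities' maps an entity id to its annotation dict.
    entities = pickled_dict["entities"]
    # Clamp so that asking for more entities than exist does not raise.
    k = min(sample_size, len(entities))
    chosen_keys = random.sample(list(entities), k)
    # Shallow-copy the other top-level keys and swap in the sampled entities.
    sampled = dict(pickled_dict)
    sampled["entities"] = {key: entities[key] for key in chosen_keys}
    return sampled


# Made-up annotations, used only to exercise the sketch.
annotations = {
    "entities": {i: {"pretty_name": f"concept_{i}", "cui": f"C{i:07d}"} for i in range(20)},
    "tokens": [],
}
subset = random_sample_sketch(annotations, 5)
print(len(subset["entities"]))
```

The input dictionary is left unmodified, so the same loaded annotations can be sampled repeatedly.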

pat2vec.util.get_dummy_data_medcat_annotation.dummy_medcat_annotation_generator()[source]

Loads a sample MedCAT annotation dictionary and returns a random subset.

This function reads a predefined pickle file containing sample annotations and uses random_sample to return a random number of entities (between 0 and 10).

Return type:

Dict[str, Any]

Returns:

A dictionary containing a random subset of entities from the sample annotations.
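The load-then-sample flow can be approximated with the sketch below. It makes several assumptions: the function name generate_dummy_annotation and the explicit pickle_path argument are hypothetical (the documented function takes no arguments and reads a predefined file), the bounds 0 and 10 are treated as inclusive, and the pickle contents are fabricated in a temporary directory so the example runs without the packaged sample file.

```python
import pickle
import random
import tempfile
from pathlib import Path
from typing import Any, Dict


def generate_dummy_annotation(pickle_path: Path) -> Dict[str, Any]:
    """Hypothetical equivalent of the generator, taking the path explicitly."""
    with open(pickle_path, "rb") as handle:
        sample_annotations = pickle.load(handle)
    entities = sample_annotations["entities"]
    # Pick a size between 0 and 10 (assumed inclusive), capped at what is available.
    n = min(random.randint(0, 10), len(entities))
    keys = random.sample(list(entities), n)
    return {**sample_annotations, "entities": {k: entities[k] for k in keys}}


# Build a throwaway pickle so the sketch does not depend on the packaged file.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample_annotations.pkl"
    fake = {"entities": {i: {"cui": f"C{i:07d}"} for i in range(30)}, "tokens": []}
    path.write_bytes(pickle.dumps(fake))
    result = generate_dummy_annotation(path)

print(len(result["entities"]))
```

Because the size is drawn at random, callers should expect an empty 'entities' mapping on some calls.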

class pat2vec.util.get_dummy_data_medcat_annotation.dummy_CAT(with_filters=False)[source]

Bases: object

A dummy MedCAT class for testing purposes.

This class mimics the behavior of the MedCAT CAT object by providing methods that return randomly generated dummy annotations, allowing for testing of annotation pipelines without needing a real MedCAT model.

Parameters:

with_filters (bool)

class DummyFilters(*args, **kwargs)[source]

Bases: dict

Dummy filters object that behaves like a dict with attribute access.
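One common way to get a dict with attribute access is to subclass dict and route attribute lookups to keys. The sketch below shows that pattern under the hypothetical name AttrDict; it is not taken from the DummyFilters source, and the key names cuis and cuis_exclude are illustrative only.

```python
class AttrDict(dict):
    """Hypothetical dict subclass whose keys are also readable as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods still work.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Attribute assignment stores the value as a dictionary key.
        self[name] = value


filters = AttrDict(cuis={"C0000001", "C0000002"})
filters.cuis_exclude = set()
print(filters["cuis"] == filters.cuis)
```

Raising AttributeError for missing keys keeps hasattr and getattr with a default behaving as they do on ordinary objects.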

class DummyLinkingConfig[source]

Bases: object

Dummy linking configuration.

class DummyConfig[source]

Bases: object

Dummy config object.

class DummyCDB[source]

Bases: object

Dummy CDB (Concept Database) object.

__init__(with_filters=False)[source]

Initialize dummy CAT object.

Parameters:

with_filters (bool) – If True, initialize with some dummy filters for testing filter removal logic. Defaults to False.

get_entities(text)[source]

Returns a random subset of sample MedCAT annotations for a single text.

Parameters:

text (str) – The text to annotate. The value is ignored; the parameter exists only for signature compatibility with the MedCAT CAT object.

Return type:

Dict[str, Any]

Returns:

A dictionary containing a random subset of entities.

get_entities_multi_texts(texts)[source]

Returns a list of random annotations for a list of texts.

For each text in the input list, it generates a separate random subset of sample MedCAT annotations.

Parameters:

texts (List[str]) – The list of texts to annotate. The content is ignored, but the length determines the number of dummy annotations returned.

Return type:

List[Dict[str, Any]]

Returns:

A list of dictionaries, where each dictionary contains a random subset of entities.
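The contract of the two methods (input text ignored, one independent result per text) can be illustrated with a self-contained stand-in. StandInCAT and the _SAMPLE pool are hypothetical and exist only for this sketch; the 0 to 10 size range is assumed to carry over from dummy_medcat_annotation_generator.

```python
import random
from typing import Any, Dict, List

# Hypothetical pool of sample annotations standing in for the packaged pickle file.
_SAMPLE = {"entities": {i: {"cui": f"C{i:07d}"} for i in range(25)}, "tokens": []}


class StandInCAT:
    """Illustrative stand-in with the same two method signatures as dummy_CAT."""

    def get_entities(self, text: str) -> Dict[str, Any]:
        # The text is ignored; a fresh random subset is drawn on every call.
        n = random.randint(0, 10)
        keys = random.sample(list(_SAMPLE["entities"]), n)
        return {"entities": {k: _SAMPLE["entities"][k] for k in keys}, "tokens": []}

    def get_entities_multi_texts(self, texts: List[str]) -> List[Dict[str, Any]]:
        # One independent draw per input text, so output length matches input length.
        return [self.get_entities(text) for text in texts]


cat = StandInCAT()
docs = ["chest pain on exertion", "no known allergies", ""]
results = cat.get_entities_multi_texts(docs)
print(len(results) == len(docs))
```

A pipeline test written against this shape can assert on the structure of the results (list length, presence of the 'entities' key) but not on specific entities, since each draw is random.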