pat2vec.util.generate_elastic_schema

Functions

generate_schema_from_cluster([indices, ...])

Generates index schemas (mappings and settings) from the connected Elasticsearch cluster.

pat2vec.util.generate_elastic_schema.generate_schema_from_cluster(indices=None, output_file='elastic_schemas.json')

Generates index schemas (mappings and settings) from the connected Elasticsearch cluster.

This function retrieves the mappings and settings for specified indices from the live Elasticsearch instance connected via pat2vec.cs. It cleans the settings to make them suitable for creating new indices in a test environment (removing UUIDs, creation dates, etc.).
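The cleaning step can be illustrated with a minimal sketch. The helper name `clean_index_settings` and the exact list of removed keys are assumptions made for illustration; the description above only names UUIDs and creation dates explicitly, and the function's actual implementation may differ.

```python
from copy import deepcopy
from typing import Any, Dict

# Assumed set of cluster-specific keys under "index". Only UUIDs and creation
# dates are named in the description; the other two are illustrative guesses.
VOLATILE_INDEX_KEYS = ("uuid", "creation_date", "provided_name", "version")


def clean_index_settings(settings: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `settings` without cluster-specific index metadata."""
    cleaned = deepcopy(settings)  # leave the caller's dictionary untouched
    index_block = cleaned.get("index", {})
    for key in VOLATILE_INDEX_KEYS:
        index_block.pop(key, None)
    return cleaned


# Example settings shaped like an Elasticsearch "get settings" response entry.
raw_settings = {
    "index": {
        "uuid": "example-uuid",
        "creation_date": "1700000000000",
        "provided_name": "observations",
        "version": {"created": "7170099"},
        "number_of_shards": "1",
        "number_of_replicas": "0",
    }
}
cleaned_settings = clean_index_settings(raw_settings)
print(cleaned_settings)
```

Removing these keys matters because Elasticsearch rejects or ignores them when a new index is created, so settings copied verbatim from a live cluster are not reusable as-is.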

Parameters:
  • indices (Optional[List[str]]) – List of index names or patterns to export. If None, defaults to the standard pat2vec indices: ["epr_documents", "basic_observations", "observations", "order", "pims_apps*"].

  • output_file (str) – Path to save the generated schema JSON.

Return type:

Dict[str, Any]

Returns:

A dictionary where keys are the simplified index names (e.g., 'pims_apps' instead of 'pims_apps*') and values are dictionaries containing "mappings" and "settings".
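The shape of the returned dictionary, and one way it might be consumed, can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `simplify_index_name` is a hypothetical helper that guesses at how wildcard patterns become keys, the field `example_field` is invented, and the sample dictionary stands in for a real return value so the sketch runs without a cluster.

```python
import json
import os
import tempfile
from typing import Any, Dict


def simplify_index_name(pattern: str) -> str:
    """Hypothetical helper: drop a trailing wildcard ('pims_apps*' -> 'pims_apps')."""
    return pattern.rstrip("*")


# Illustrative stand-in for the returned dictionary; the field name is made up.
schemas: Dict[str, Any] = {
    simplify_index_name("pims_apps*"): {
        "mappings": {"properties": {"example_field": {"type": "keyword"}}},
        "settings": {"index": {"number_of_shards": "1"}},
    }
}

# Round-trip through a JSON file, mirroring what saving to output_file implies.
path = os.path.join(tempfile.mkdtemp(), "elastic_schemas.json")
with open(path, "w", encoding="utf-8") as fh:
    json.dump(schemas, fh, indent=2)

with open(path, encoding="utf-8") as fh:
    loaded = json.load(fh)

for index_name, schema in loaded.items():
    # Each entry holds the two parts needed to create an index in a test cluster.
    create_body = {"mappings": schema["mappings"], "settings": schema["settings"]}
    print(index_name, sorted(create_body))
```

Keeping "mappings" and "settings" as separate keys per index means each entry can be passed along largely unchanged as the body of an index-creation request in a test environment.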