# Welcome to the pat2vec Wiki!

This wiki contains comprehensive documentation for `pat2vec`.

## Overview

`pat2vec` is a Python-based tool designed to transform raw electronic health records (EHR) into structured, time-series feature vectors. This process makes the data suitable for machine learning tasks, particularly binary classification. It can aggregate data at the patient level or construct detailed longitudinal timelines.

## Example Use Cases

### 1. Patient-Level Aggregation
Compute summary statistics (e.g., the mean of *n* variables) for each unique patient, resulting in one row per patient. This is ideal for models requiring a single representation per individual.

### 2. Longitudinal Time Series Construction
Generate a monthly time series for each patient that includes:

- Biochemistry results
- Demographic attributes
- MedCat-derived clinical text annotations

The time series spans up to 25 years retrospectively, aligned to each patient's diagnosis date, enabling a consistent retrospective view across varying start times.