pat2vec.util.filter_dataframe_by_timestamp

Functions

filter_dataframe_by_timestamp(df, ...[, dropna])

Filters a DataFrame to include only rows within a specified date range.

pat2vec.util.filter_dataframe_by_timestamp.filter_dataframe_by_timestamp(df, start_year, start_month, end_year, end_month, start_day, end_day, timestamp_string, dropna=False)

Filters a DataFrame to include only rows within a specified date range.

This function filters a DataFrame on a timestamp column, keeping only the rows whose timestamp falls between the given start and end dates. It converts the timestamp column to datetime objects and ensures that the start date is chronologically before the end date.

Parameters:
  • df (DataFrame) – The DataFrame to filter.

  • start_year (Union[int, str]) – The year of the start date.

  • start_month (Union[int, str]) – The month of the start date.

  • start_day (Union[int, str]) – The day of the start date.

  • end_year (Union[int, str]) – The year of the end date.

  • end_month (Union[int, str]) – The month of the end date.

  • end_day (Union[int, str]) – The day of the end date.

  • timestamp_string (str) – The name of the column in df that contains the timestamps to filter on.

  • dropna (bool) – If True, drops rows with NaN values in the timestamp column before filtering. Defaults to False.

Return type:

DataFrame

Returns:

A new DataFrame containing only the rows that fall within the specified date range.
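Example:

The behaviour described above can be approximated with plain pandas. The following is a minimal sketch, not pat2vec's implementation. The helper name `sketch_filter_by_timestamp`, its `(year, month, day)` tuple arguments, and the `patient_id` and `updatetime` columns are hypothetical. It also assumes that both bounds are inclusive, that reversed bounds are swapped, and that unparseable timestamps become `NaT`; the documentation above does not confirm any of these.

```python
import pandas as pd


def sketch_filter_by_timestamp(df, start, end, timestamp_string, dropna=False):
    """Hypothetical stand-in for the documented function (assumptions noted inline)."""
    # Build the bounds from (year, month, day); int() accepts ints or numeric strings.
    start_date = pd.Timestamp(year=int(start[0]), month=int(start[1]), day=int(start[2]))
    end_date = pd.Timestamp(year=int(end[0]), month=int(end[1]), day=int(end[2]))

    # Assumption: if the bounds arrive in reverse order, swap them.
    if start_date > end_date:
        start_date, end_date = end_date, start_date

    # Work on a copy so the caller's DataFrame is left untouched.
    out = df.copy()

    # Assumption: values that cannot be parsed are coerced to NaT.
    out[timestamp_string] = pd.to_datetime(out[timestamp_string], errors="coerce")

    if dropna:
        out = out.dropna(subset=[timestamp_string])

    # Assumption: both ends of the range are inclusive.
    # Comparisons against NaT are False, so rows without a timestamp never match.
    in_range = (out[timestamp_string] >= start_date) & (out[timestamp_string] <= end_date)
    return out[in_range]


# Hypothetical sample data.
records = pd.DataFrame(
    {
        "patient_id": ["P1", "P2", "P3", "P4"],
        "updatetime": ["2020-01-15", "2020-06-30", "2021-03-01", None],
    }
)

filtered = sketch_filter_by_timestamp(
    records, ("2020", 1, 1), (2020, 12, 31), "updatetime", dropna=True
)
print(filtered["patient_id"].tolist())  # ['P1', 'P2']
```

With pat2vec installed, the equivalent call on the same data would presumably be filter_dataframe_by_timestamp(records, start_year=2020, start_month=1, start_day=1, end_year=2020, end_month=12, end_day=31, timestamp_string="updatetime", dropna=True). Keyword arguments are the safer choice here, because the signature places the year and month arguments ahead of the day arguments.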