pat2vec.util.methods_annotation_regex
Functions
append_regex_term_counts | Counts occurrences of regex patterns in a DataFrame's text column.
- pat2vec.util.methods_annotation_regex.append_regex_term_counts(df, terms, text_column='body_analysed', debug=False)
Counts occurrences of regex patterns in a DataFrame’s text column.
For each term (regex pattern) in the terms list, this function counts its case-insensitive occurrences in each row of the specified text_column. A new column is added to the DataFrame for each term, containing the count.
- Parameters:
  - df (DataFrame) – The DataFrame to process.
  - terms (List[str]) – A list of regex patterns to search for.
  - text_column (str) – The name of the column containing the text to search.
  - debug (bool) – If True, prints debugging information about the DataFrame.
- Return type:
DataFrame
- Returns:
The original DataFrame with new columns for the counts of each term.
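The behaviour described above can be approximated with plain pandas. The following is a minimal sketch, not the pat2vec implementation: `count_regex_terms` is a hypothetical stand-in, the sample data is invented, and naming each new column after its pattern is an assumption, since this page does not say how the count columns are named.

```python
import re

import pandas as pd


def count_regex_terms(df, terms, text_column="body_analysed"):
    """Illustrative stand-in for the documented behaviour (not pat2vec code)."""
    out = df.copy()
    # Treat missing text as an empty string so every count is an integer.
    text = out[text_column].fillna("").astype(str)
    for term in terms:
        # Assumption: the new column is named after the pattern itself.
        out[term] = text.str.count(term, flags=re.IGNORECASE)
    return out


# Invented sample data for illustration only.
notes = pd.DataFrame(
    {"body_analysed": ["Chest pain and PAIN on exertion", "No fever reported", None]}
)
counted = count_regex_terms(notes, [r"pain", r"fev(?:er|ers)"])
print(counted)
```

Note that this sketch works on a copy and leaves its input untouched, whereas the documented function is described as returning the original DataFrame with the new columns added. Because each term is treated as a regular expression, literal strings that contain metacharacters (for example `.` or `+`) would likely need escaping with `re.escape` first.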