
Extracting Age Variations Using Regex

import re
s = '18year old 23 year old 99 years old but not 25-year-old and 91year old cousin is 99 now and 90-year-old or 102 year old'

From s, I would like to extract all ages th

Solution 1:

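This regex matches ages of the form 9x or 1xx (90 to 199) followed by "year(s) old", with an optional whitespace character or hyphen between the words: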

(?:9\d|1\d{2})(?:\s|-)?years?(?:\s|-)?old


Explanation:

(?:9\d|1\d{2})     # Non-capturing group - match 9x or 1xx
(?:\s|-)?          # Non-capturing group - optionally match whitespace or -
years?             # Match year and optionally s
(?:\s|-)?          # Non-capturing group - optionally match whitespace or -
old                # Match old
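
The annotated breakdown above can be written almost verbatim in Python using the re.VERBOSE flag, which ignores whitespace and # comments inside the pattern. The following is a minimal sketch, not part of the original answer; the names AGE_PATTERN and sample are illustrative, and sample is a shortened piece of the string from the question:

```python
import re

# Illustrative name; the original answer uses a single-line pattern instead.
AGE_PATTERN = re.compile(r"""
    (?:9\d|1\d{2})   # 9x or 1xx, i.e. 90-99 or 100-199
    (?:\s|-)?        # optional whitespace or hyphen
    years?           # "year" or "years"
    (?:\s|-)?        # optional whitespace or hyphen
    old              # literal "old"
""", re.VERBOSE)

# Shortened sample, taken from the string in the question.
sample = "99 years old but not 25-year-old and 90-year-old"
matches = AGE_PATTERN.findall(sample)
print(matches)
# ['99 years old', '90-year-old']
```

Keeping the comments inside the pattern means the explanation and the regex cannot drift apart, at the cost of a longer definition.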

Code snippet:

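import re

s = '18year old 23 year old 99 years old but not 25-year-old and 91year old cousin is 99 now and 90-year-old or 102 year old'
reg = r'(?:9\d|1\d{2})(?:\s|-)?years?(?:\s|-)?old'
r1 = re.findall(reg, s)  # all non-overlapping matches, left to right
print(r1)
# ['99 years old', '91year old', '90-year-old', '102 year old']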
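
One caveat: the pattern has no boundary on its left side, so it can also match the tail of a longer number, for example "99 years old" inside "299 years old". If that matters for your input, a possible variant adds \b word boundaries and a capturing group so that the ages come back as integers. This is a sketch under that assumption, not part of the original answer; the names bounded, text and ages are hypothetical:

```python
import re

# Hypothetical variant: \b anchors on both sides, [\s-]? in place of (?:\s|-)?,
# and a capturing group around the number so findall returns only the age.
bounded = re.compile(r'\b(9\d|1\d{2})[\s-]?years?[\s-]?old\b')

text = '299 years old, 91year old, 102-year-old, 25 years old'
ages = [int(age) for age in bounded.findall(text)]
print(ages)
# [91, 102]

# For comparison, the original pattern picks up the "99" inside "299".
unbounded = re.findall(r'(?:9\d|1\d{2})(?:\s|-)?years?(?:\s|-)?old', text)
print(unbounded)
# ['99 years old', '91year old', '102-year-old']
```

With a single capturing group, re.findall returns the group contents rather than the whole match, which is why the list comprehension can convert each item with int() directly.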
