Picking Multiple Values From String
I have data like the sample data below, and I'm trying to pattern match and parse it to create something like the output data. The idea is, if I have a string value that contains
Solution 1:
You can use str.extract to capture patterns in the string and convert each one into a column:
pd.concat([
SampleDf,
SampleDf.OtherField.str.extract(r"Aggr\((?P<Part1>.*?)\),(?P<Part2>[^\(]*)", expand=True)
], axis=1)
# ReportField OtherField Part1 Part2
# 0 tom words Aggr(stuff),something1 stuff something1
# 1 bob Morewords Aggr(Diffstuff),something2 Diffstuff something2
The regex Aggr\((?P<Part1>.*?)\),(?P<Part2>[^\(]*) captures the two patterns you need. The first group, Aggr\((?P<Part1>.*?)\), is named Part1 and matches the content of the first parentheses after Aggr. The second group is ,(?P<Part2>[^\(]*) and is named Part2: it matches the text after the comma that follows the first pattern, up to the next opening parenthesis.
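To see what each named group picks up, here is a minimal sketch that runs the same pattern with Python's built-in re module, outside pandas. The sample strings and the split_parts helper are assumptions made for illustration; the third string is a hypothetical value that lacks the Aggr(...) pattern.

```python
import re

# Same pattern as in Solution 1, compiled once.
PATTERN = re.compile(r"Aggr\((?P<Part1>.*?)\),(?P<Part2>[^\(]*)")

# Assumed sample strings; the last one is a hypothetical non-matching value.
samples = [
    "words Aggr(stuff),something1",
    "Morewords Aggr(Diffstuff),something2",
    "no aggregate here",
]

def split_parts(text):
    """Return the named groups as a dict, or None when the pattern is absent."""
    match = PATTERN.search(text)
    return match.groupdict() if match else None

results = [split_parts(s) for s in samples]
for text, parts in zip(samples, results):
    print(text, "->", parts)
```

Like str.extract, re.search only reports the first match in each string, so a value with several Aggr(...) segments would need a different approach.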
Solution 2:
You can use str.extractall with regex pattern matching:
SampleDf[['Part1', 'Part2']] = SampleDf.OtherField.str.extractall(r'\((.*)\),(.*)').reset_index(drop=True)
You get:
ReportField OtherField Part1 Part2
0 tom words Aggr(stuff),something1 stuff something1
1 bob Morewords Aggr(Diffstuff),something2 Diffstuff something2
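One thing to keep in mind with this approach: str.extractall returns a frame indexed by (original row label, match number), and reset_index(drop=True) replaces that with 0, 1, 2, ..., so the assignment only lines up when SampleDf has a default index and every row has exactly one match. The sketch below is one possible variant that keeps the original row labels instead. The sample frame is an assumption, since the question's data is not shown in full, and the names matches, first and result are made up for illustration.

```python
import pandas as pd

# Assumed sample data; the column split between ReportField and
# OtherField is a guess based on the printed output.
SampleDf = pd.DataFrame({
    "ReportField": ["tom", "bob"],
    "OtherField": ["words Aggr(stuff),something1",
                   "Morewords Aggr(Diffstuff),something2"],
})

# extractall yields a MultiIndex: (original row label, match number).
matches = SampleDf.OtherField.str.extractall(r"\((.*)\),(.*)")

# Keep only the first match per row; the index is the original row label again.
first = matches.xs(0, level="match")
first.columns = ["Part1", "Part2"]

# Join on the index, so rows without a match would get NaN rather than
# silently shifting values to the wrong row.
result = SampleDf.join(first)
print(result)
```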