Finding Correct Person Among Multiple Person Names
I have a DataFrame. One column holds single values, and the corresponding column holds a subset of values.
df = pd.DataFrame() Index Values_1
Solution 1:
Use:
# collect each group's Values_1 into a list, then shift the index down by 1 so each row lines up with the names of the next group
s = df.groupby(level=0)['Values_1'].agg(list)
s.index = s.index - 1
print (s)
Index
0 [Muhammad bin Bashr bin al-Farafsa, Muhammad b...
1 [Yahya bin Sa'id bin Farroukh al-Qatan, Yahya ...
2 [Hamza bin al-Mughira bin Shu'ba, Shu'ba]
Name: Values_1, dtype: object
# replace NaN (rows that have no next group) with an empty list
df['test'] = df.index.map(s).map(lambda x: [] if isinstance(x, float) else x)
# test whether at least one name from the next group's list appears in Values_2
f = lambda x: any(y in x['Values_2'] for y in x['test'])
mask = df.apply(f, axis=1)
#filter by mask and remove helper column
df = df[mask].drop('test',axis=1)
print (df)
Values_1 \
Index
1 Muhammad bin Bashar Bindar
Values_2
Index
1 Mua'dh bin Hisham bin Aby [20287], Yahya bin S...
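Since the sample data in the question is cut off, here is a small self-contained sketch of the same idea on made-up data. The names and the frame contents are hypothetical and chosen only for illustration. It also swaps the helper column for a plain dict lookup, which is a simplification and not the code from the answer above: a row is kept when any name from the next index group occurs as a substring of its Values_2.

```python
import pandas as pd

# Hypothetical toy data: the index repeats, one row per candidate name.
df = pd.DataFrame(
    {
        "Values_1": ["Adam Long", "Adam Short", "Brian Hill", "Brian Dale", "Carl West"],
        "Values_2": [
            "Zed Fox, Yan Roe",
            "Brian Hill, Una Poe",
            "Tom Ash, Carl West",
            "Sam Elm, Ray Oak",
            "Ned Ivy",
        ],
    },
    index=pd.Index([1, 1, 2, 2, 3], name="Index"),
)

# List the names of every group, then shift the labels down by 1
# so that label k now points at the names of group k + 1.
next_names = df.groupby(level=0)["Values_1"].agg(list)
next_names.index = next_names.index - 1
lookup = next_names.to_dict()

# Keep a row if any name from the next group is a substring of its Values_2.
# The last group has no next group, so it falls back to an empty list.
mask = [
    any(name in text for name in lookup.get(idx, []))
    for idx, text in zip(df.index, df["Values_2"])
]
result = df[mask]
print(result)
```

With this toy data only "Adam Short" (index 1) and "Brian Hill" (index 2) survive, because their Values_2 strings mention a name from the following group. Note that the check is a plain substring test, so a short name such as "Shu'ba" would also match inside a longer name.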