Manipulate Values In Pandas Dataframe Columns Based On Matching Ids From Another Dataframe
Solution 1:
You can solve your problem using this instead:
for letter in ['b', 'c']:  # dropped enumerate because it isn't needed here; keep it if the rest of your code uses the index
    df[letter] = df.apply(lambda row: row[letter] if row['a'] in df_id[letter].tolist() else np.nan, axis=1)
In short, just replace isin with in.
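The problem is that when you use apply on df with axis=1, the lambda's argument (x in your code, row in the snippet above) represents one row of df, so when you select x['a'] you're actually selecting a single element. However, isin is a method of Series and DataFrame objects, not of scalar values, and calling it on a single element is what raises the error. Instead, we use in to check whether that element is in the list. Note the .tolist() call: using in directly on a Series checks its index labels, not its values.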
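The explanation above can be sketched in a few lines. The frames below are hypothetical sample data, since the question's original df and df_id are not reproduced here, and the names by_apply and by_where are only for illustration. The second loop shows a vectorized alternative that is not part of the answer: it calls isin on the whole column 'a' and passes the result to Series.where, which keeps a value where the condition is True and puts NaN elsewhere.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the question's original frames are not shown here.
df = pd.DataFrame({"a": [20, 50, 100], "b": [1.0, 2.0, 3.0], "c": [4.0, 5.0, 6.0]})
df_id = pd.DataFrame({"b": [20, 50, 70], "c": [100, 200, 300]})

# Inside apply(axis=1), each row is a Series and row["a"] is a single scalar.
first_value = df.iloc[0]["a"]
scalar_has_isin = hasattr(first_value, "isin")      # scalars have no .isin method
found_in_list = first_value in df_id["b"].tolist()  # plain membership test works

# Row-wise version from the answer, written to a copy of df.
by_apply = df.copy()
for letter in ["b", "c"]:
    ids = df_id[letter].tolist()
    by_apply[letter] = df.apply(
        lambda row: row[letter] if row["a"] in ids else np.nan, axis=1
    )

# Vectorized alternative (an assumption, not the answer's code):
# isin on the whole column, then Series.where to blank the non-matches.
by_where = df.copy()
for letter in ["b", "c"]:
    by_where[letter] = df[letter].where(df["a"].isin(df_id[letter]))

print(by_apply.equals(by_where))
```

On larger frames the where version is likely to be faster, because it avoids calling a Python function once per row.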
Hope that was helpful. If you have any questions please ask.
Solution 2:
Adapting a hard-to-find answer from Pandas New Column Calculation Based on Existing Columns Values:
for i, letter in enumerate(['b','c']):
    mask = df['a'].isin(df_id[letter])
    name = letter + '_new'
    # for some reason, df[letter] = df.loc[mask, letter] does not work
    df.loc[mask, name] = df.loc[mask, letter]
    df[letter] = df[name]
    del df[name]
This isn't pretty, but seems to work.
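If the goal is only to blank out the rows whose 'a' value has no match, one possible shortcut is to skip the temporary column and assign NaN to the non-matching rows directly. The sketch below runs both versions on hypothetical sample data (the question's frames are not shown here) so they can be compared. The names via_temp and direct are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the question's original frames are not shown here.
df = pd.DataFrame({"a": [20, 50, 100], "b": [1.0, 2.0, 3.0], "c": [4.0, 5.0, 6.0]})
df_id = pd.DataFrame({"b": [20, 50, 70], "c": [100, 200, 300]})

# The temporary-column approach from this answer, applied to a copy.
via_temp = df.copy()
for letter in ["b", "c"]:
    mask = via_temp["a"].isin(df_id[letter])
    name = letter + "_new"
    via_temp.loc[mask, name] = via_temp.loc[mask, letter]  # creates the new column
    via_temp[letter] = via_temp[name]
    del via_temp[name]

# Shortcut (an assumption, not the answer's code): overwrite the
# non-matching rows with NaN in place, no temporary column needed.
direct = df.copy()
for letter in ["b", "c"]:
    mask = direct["a"].isin(df_id[letter])
    direct.loc[~mask, letter] = np.nan

print(direct)
```

Both loops leave column 'a' untouched and keep a value in 'b' or 'c' only where the row's 'a' appears in the matching df_id column.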
Solution 3:
If you have a larger DataFrame and performance matters, you can first build a boolean mask DataFrame and then apply it to your DataFrame in one step. First create the mask:
mask = df_id.apply(lambda x: df['a'].isin(x))
       b      c
0   True  False
1   True  False
2  False   True
This can be applied to the original dataframe:
df.iloc[:,1:]= df.iloc[:,1:].mask(~mask, np.nan)
     a    b    c
0   20  1.0  NaN
1   50  NaN  NaN
2  100  NaN  1.0
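The same idea can be written with column labels instead of positions, which avoids relying on the ID column being first. This is a minimal sketch on hypothetical sample data, so its values differ from the output shown above. The names cols and result are illustrative. It also uses where(mask), which should be equivalent to mask(~mask) as used in the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the question's original frames are not shown here.
df = pd.DataFrame({"a": [20, 50, 100], "b": [1.0, 2.0, 3.0], "c": [4.0, 5.0, 6.0]})
df_id = pd.DataFrame({"b": [20, 50, 70], "c": [100, 200, 300]})

# One boolean column per ID column: True where df["a"] appears in that ID list.
mask = df_id.apply(lambda ids: df["a"].isin(ids))

# Apply the mask by column label rather than by position.
cols = list(mask.columns)
result = df.copy()
result[cols] = df[cols].where(mask)  # keep values where mask is True, NaN elsewhere

print(mask)
print(result)
```

Selecting by mask.columns means only the columns that exist in df_id are touched, whatever their position in df.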