How Do I Pivot One Dataframe Column To A Truth Table With Columns Based On Another Dataframe?
I have one df with a user_id and a category. I'd like to transform this to a truth table for whether or not that user has at least one entry for that category. However, the final t
Solution 1:
Option 1: crosstab
I'd recommend converting that column to a categorical dtype. crosstab/pivot will then handle the rest.
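# Use the categories listed in df_list so that unused ones still get a column
i = df.user_id
j = pd.Categorical(df.category, categories=df_list.category)
# dropna=False keeps categories that never occur (newer pandas may drop them by default)
pd.crosstab(i, j, dropna=False).astype(bool)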
col_0        A      B      C      D      E      F
user_id
1         True   True  False   True  False  False
2         True  False  False  False  False   True
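For a self-contained run of Option 1, here is a minimal sketch. The rows in df and df_list are assumptions reconstructed from the output shown above, not the asker's actual data. User 1 has a repeated "A" row to show that counts above one still collapse to True, and dropna=False is passed so that categories with no rows (C and E here) are kept as columns.

```python
import pandas as pd

# Hypothetical sample data, chosen to match the output shown above.
df = pd.DataFrame({
    "user_id": [1, 1, 1, 1, 2, 2],
    "category": ["A", "A", "B", "D", "A", "F"],
})
df_list = pd.DataFrame({"category": list("ABCDEF")})

i = df.user_id
j = pd.Categorical(df.category, categories=df_list.category)

# crosstab counts rows per (user_id, category); astype(bool) turns
# "count >= 1" into True and "count == 0" into False.
truth_crosstab = pd.crosstab(i, j, dropna=False).astype(bool)
print(truth_crosstab)
```

Because the categories come from df_list, the column order follows df_list rather than the order in which categories appear in df.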
Option 2: unstack + reindex
To fix your existing code, you can simplify the second step with reindex:
(df.groupby(['user_id', 'category'])
   .size()
   .unstack(fill_value=0)
   .reindex(df_list.category, axis=1, fill_value=0)
   .astype(bool)
)
category      A      B      C      D      E      F
user_id
1          True   True  False   True  False  False
2          True  False  False  False  False   True
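To see what the reindex step contributes, the sketch below splits Option 2 into two stages. The sample rows in df and df_list are hypothetical, picked to match the output shown above, and the names counts and truth_unstack are introduced only for illustration.

```python
import pandas as pd

# Hypothetical sample data; user 1 has a repeated "A" row.
df = pd.DataFrame({
    "user_id": [1, 1, 1, 1, 2, 2],
    "category": ["A", "A", "B", "D", "A", "F"],
})
df_list = pd.DataFrame({"category": list("ABCDEF")})

# Stage 1: count rows per (user_id, category) and pivot categories to columns.
counts = df.groupby(["user_id", "category"]).size().unstack(fill_value=0)
print(list(counts.columns))  # only categories present in df: ['A', 'B', 'D', 'F']

# Stage 2: reindex adds the categories missing from df (C and E) as
# zero-filled columns, then astype(bool) converts counts to True/False.
truth_unstack = (
    counts
    .reindex(df_list.category, axis=1, fill_value=0)
    .astype(bool)
)
print(truth_unstack)
```

Note that reindex also drops any column not listed in df_list.category, so the result is limited to exactly the categories in df_list.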