How Do I Pivot One Dataframe Column To A Truth Table With Columns Based On Another Dataframe?

I have one df with a user_id and a category. I'd like to transform this to a truth table for whether or not that user has at least one entry for that category. However, the final t

Solution 1:

Option 1: crosstab

I'd recommend converting that column to a categorical dtype whose categories come from df_list. crosstab/pivot will then handle the rest.

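import pandas as pd

i = df.user_id
# Use df_list's categories so every one of them gets a column
j = pd.Categorical(df.category, categories=df_list.category)

# dropna=False keeps categories with no rows on recent pandas versions
pd.crosstab(i, j, dropna=False).astype(bool)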

col_0       A      B      C      D      E      F
user_id                                         
1        True   True  False   True  False  False
2        True  False  False  False  False   True
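
For anyone who wants to run Option 1 end to end, here is a minimal sketch. The two sample frames are made up (the question's data is not reproduced here) and were chosen only to be consistent with the output above. Passing dropna=False is an assumption on my part about what recent pandas versions need in order to keep categories that never occur.

```python
import pandas as pd

# Hypothetical input data (not from the question), consistent with the output above.
df = pd.DataFrame({
    "user_id": [1, 1, 1, 1, 2, 2],
    "category": ["A", "B", "D", "A", "A", "F"],  # user 1 has "A" twice
})
# Hypothetical lookup frame listing every category that should become a column.
df_list = pd.DataFrame({"category": list("ABCDEF")})

i = df.user_id
# The categorical carries all six categories, including unobserved "C" and "E".
j = pd.Categorical(df.category, categories=df_list.category)

# Count rows per (user, category), then turn any nonzero count into True.
truth = pd.crosstab(i, j, dropna=False).astype(bool)
print(truth)
```

The duplicate "A" for user 1 shows why the cast to bool matters: the raw count is 2, but the truth table only records that at least one entry exists.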

Option 2: unstack + reindex

To fix your existing code, you can simplify the second step with reindex:

(df.groupby(['user_id', 'category'])
   .size()
   .unstack(fill_value=0)
   .reindex(df_list.category, axis=1, fill_value=0)
   .astype(bool)
)

category     A      B      C      D      E      F
user_id                                          
1         True   True  False   True  False  False
2         True  False  False  False  False   True
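
If this is needed in more than one place, Option 2 can be wrapped in a small helper. The sketch below is only an illustration: the function name truth_table, the variable wanted, and the sample data (including the stray category "Z") are all hypothetical and not part of the original question.

```python
import pandas as pd

def truth_table(df, categories):
    """Return a user_id x category boolean frame (hypothetical helper)."""
    return (
        df.groupby(["user_id", "category"])
          .size()                      # rows per (user, category)
          .unstack(fill_value=0)       # categories become columns
          # Keep exactly the wanted columns, in the wanted order;
          # missing ones are added as 0, unwanted ones are dropped.
          .reindex(list(categories), axis=1, fill_value=0)
          .astype(bool)
    )

# Made-up data: "Z" appears in df but is not one of the wanted categories.
df = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 2],
    "category": ["A", "B", "A", "F", "Z"],
})
wanted = pd.Series(list("ABCDEF"), name="category")

result = truth_table(df, wanted)
print(result)
```

One thing to be aware of with reindex: it works in both directions. Categories missing from df are added as all-False columns, and categories present in df but absent from the wanted list (here "Z") are silently dropped.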