Pandas: "distribute" Column Values Into Multiple Rows
I am a pandas newbie, and I am trying to solve the following problem. I have a large DataFrame (10000 x 28) as follows.
Col1 Col2 Col3 Col4 Col5
A B C D E
How can I r
Solution 1:
Set ['Col1', 'Col2'] as the index and use .stack():
df.set_index(['Col1', 'Col2']).stack()
Col1  Col2
A     B     0    C
            0    D
            0    E
Then use .reset_index() to get the format in your example (you can also pass name='Col' to name the value column, as suggested by @jezrael). Here df refers to the stacked result from the previous step:
df.reset_index(-1, drop=True).reset_index(name='Col')
Col1 Col2 Col
0 A B C
1 A B D
2 A B E
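For reference, here is a minimal end-to-end sketch of the stack approach. The one-row sample frame is hypothetical, built only from the column names and values shown in the question, and the variable names stacked and result are mine. The intermediate result is kept in its own variable so it is clear that the two reset_index calls act on the stacked Series, not on the original frame.

```python
import pandas as pd

# Hypothetical one-row sample modeled on the question's layout.
df = pd.DataFrame(
    [["A", "B", "C", "D", "E"]],
    columns=["Col1", "Col2", "Col3", "Col4", "Col5"],
)

# Move the identifier columns into the index, then stack the remaining
# columns into a single Series (one entry per original cell).
stacked = df.set_index(["Col1", "Col2"]).stack()

# Drop the innermost index level (the former column labels), then turn
# Col1/Col2 back into columns and name the value column 'Col'.
result = stacked.reset_index(level=-1, drop=True).reset_index(name="Col")
print(result)
```

Note that .stack() drops missing values by default in the pandas versions I am aware of, so cells that are NaN in the wide frame may not appear in the result.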
Solution 2:
print(pd.melt(df, id_vars=['Col1','Col2'], value_name='Col').drop('variable', axis=1))
Col1 Col2 Col
0 A B C
1 A B D
2 A B E
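One thing to keep in mind with melt: on a frame with more than one row, the output is ordered column by column, while the stack approach orders it row by row. The sketch below uses a hypothetical two-row sample (the second row is invented for illustration) to show the difference, and one possible way to get row-wise order back by keeping the original index and sorting on it with a stable sort.

```python
import pandas as pd

# Hypothetical two-row sample; the second row is made up for illustration.
df = pd.DataFrame(
    [["A", "B", "C", "D", "E"],
     ["F", "G", "H", "I", "J"]],
    columns=["Col1", "Col2", "Col3", "Col4", "Col5"],
)

# Plain melt: values come out grouped by source column (Col3, Col4, Col5).
melted = pd.melt(df, id_vars=["Col1", "Col2"], value_name="Col").drop(
    columns="variable"
)

# Keep the original row labels, then stable-sort on them so that all
# values from row 0 come first, followed by all values from row 1.
row_order = (
    pd.melt(df, id_vars=["Col1", "Col2"], value_name="Col", ignore_index=False)
    .drop(columns="variable")
    .sort_index(kind="mergesort")
    .reset_index(drop=True)
)

print(melted)
print(row_order)
```

If the order of the rows does not matter for your use case, the plain melt call is enough.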
Timings:
df = pd.concat([df]*1000).reset_index(drop=True)
In [58]: %timeit pd.melt(df, id_vars=['Col1','Col2'],value_name='Col').drop('variable', axis=1)
100 loops, best of 3: 2.48 ms per loop
In [59]: %timeit df.set_index(['Col1', 'Col2']).stack().reset_index(-1, drop=True).reset_index(name='Col')
100 loops, best of 3: 3.83 ms per loop
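The numbers above come from IPython's %timeit and will vary with the machine and the pandas version. If you want to repeat the comparison in a plain script, a rough sketch with the standard timeit module could look like this. The sample frame and the helper names via_melt and via_stack are hypothetical, and 20 runs is an arbitrary choice.

```python
import timeit

import pandas as pd

# Hypothetical one-row sample, repeated 1000 times as in the timings above.
base = pd.DataFrame(
    [["A", "B", "C", "D", "E"]],
    columns=["Col1", "Col2", "Col3", "Col4", "Col5"],
)
df = pd.concat([base] * 1000).reset_index(drop=True)


def via_melt():
    # Solution 2: melt, then drop the helper 'variable' column.
    return pd.melt(df, id_vars=["Col1", "Col2"], value_name="Col").drop(
        columns="variable"
    )


def via_stack():
    # Solution 1: set_index + stack, then rebuild the flat frame.
    stacked = df.set_index(["Col1", "Col2"]).stack()
    return stacked.reset_index(level=-1, drop=True).reset_index(name="Col")


runs = 20
t_melt = timeit.timeit(via_melt, number=runs)
t_stack = timeit.timeit(via_stack, number=runs)
print(f"melt:  {t_melt / runs * 1e3:.2f} ms per call")
print(f"stack: {t_stack / runs * 1e3:.2f} ms per call")
```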