Converting Column With String Separated Values Into Rows
I am trying to implement the below: I know how to use 2 columns but I need to extend it to more than 2 columns. In other words, var2 and var3 need to be replicated and extended dow
Solution 1:
In one big one-liner:
In [107]: df
Out[107]:
                var1   var2        var3
0        47429,47404  10700  1403298300
1  23030,23831,23147  99999  1403297100

In [108]: pd.concat((pd.Series((v, row['var2'], row['var3']), df.columns) for _, row in df.iterrows() for v in row['var1'].split(',')), axis=1).T
Out[108]:
    var1   var2        var3
0  47429  10700  1403298300
1  47404  10700  1403298300
2  23030  99999  1403297100
3  23831  99999  1403297100
4  23147  99999  1403297100
The nested generator expression is what does the trick. It does the same work as these for-loops:
In [112]: for _, row in df.iterrows():
   .....:     for v in row['var1'].split(","):
   .....:         print (v, row['var2'], row['var3'])
   .....:
('47429', 10700, 1403298300)
('47404', 10700, 1403298300)
('23030', 99999, 1403297100)
('23831', 99999, 1403297100)
('23147', 99999, 1403297100)
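For a version that can be pasted and run as-is, the same loops can collect their tuples instead of printing them. This is only a sketch: the sample DataFrame is rebuilt from the output of In [107], and the names `rows` and `result` are illustrative, not part of the original session.

```python
import pandas as pd

# Sample data reconstructed from the In [107] output (an assumption).
df = pd.DataFrame({
    "var1": ["47429,47404", "23030,23831,23147"],
    "var2": [10700, 99999],
    "var3": [1403298300, 1403297100],
})

# One tuple per comma-separated value in var1; var2 and var3 are repeated.
rows = [
    (v, row["var2"], row["var3"])
    for _, row in df.iterrows()
    for v in row["var1"].split(",")
]

# Build the long-format frame directly, reusing the original column labels.
result = pd.DataFrame(rows, columns=df.columns)
print(result)
```

To carry more columns along, add them to the tuple in the same order as `df.columns`.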
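In the one-liner, I also passed the column labels of the original data-frame (df.columns) as the index of each produced Series, so that they become the column headers after transposing.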
Finally, since I'm no pandas expert, I settled on concatenating the Series along axis 1 and then transposing the resulting data-frame to get it into the correct structure.
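As a possible alternative to the concat-and-transpose approach, newer pandas versions offer `DataFrame.explode`. The sketch below uses a different technique from the answer above and assumes pandas 0.25 or later; the sample DataFrame is again rebuilt from the In [107] output, and the name `result` is illustrative.

```python
import pandas as pd

# Sample data reconstructed from the In [107] output (an assumption).
df = pd.DataFrame({
    "var1": ["47429,47404", "23030,23831,23147"],
    "var2": [10700, 99999],
    "var3": [1403298300, 1403297100],
})

result = (
    df.assign(var1=df["var1"].str.split(","))  # each var1 cell becomes a list
      .explode("var1")                         # one row per list element
      .reset_index(drop=True)                  # renumber rows 0..n-1
)
print(result)
```

Every other column is repeated automatically, so this extends to any number of columns without listing them. The values in var1 remain strings; convert them with `astype(int)` if numbers are needed.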