Skip to content Skip to sidebar Skip to footer

How To Create Cummulative Sum In Dataframe Python Part2?

How to create cumulative sum (new_supply)in dataframe python from demand column from table item Date supply demand A 2018-01-01 0 10 A 2018-01-02 0 1

Solution 1:

You can change condition in boolean mask and change - to 0 in np.where:

m = df['supply'] != 0

df['new'] = np.where(m, df.groupby('item')['demand'].cumsum(), 0)
print (df)
   item        Date  supply  demand  new
0     A  2018-01-01       0      10    0
1     A  2018-01-02       0      15    0
2     A  2018-01-03     100      30   55
3     A  2018-01-04       0      10    0
4     A  2018-01-05       0      40    0
5     A  2018-01-06      50      50  155
6     A  2018-01-07       0      10    0
7     B  2018-01-01       0      20    0
8     B  2018-01-02       0      30    0
9     B  2018-01-03      20      60  110
10    B  2018-01-04       0      20    0
11    B  2018-01-05     100      10  140
12    B  2018-01-06       0      20    0
13    B  2018-01-07       0      30    0

Solution 2:

As the above data doesn't have any proper column which can be used to create groups, we need to create one:

df['grp_attr'] = df['supply'].clip(upper=1)
df['grp_attr'] = df[df['grp_attr'] != 0]['grp_attr'].cumsum()
df['grp_attr'] = df['grp_attr'].bfill().fillna(0).astype(int)

The df looks like this once you create grp_attr:

   item        Date  supply  demand  grp_attr
0     A  2018-01-01       0      10         1
1     A  2018-01-02       0      15         1
2     A  2018-01-03     100      30         1
3     A  2018-01-04       0      10         2
4     A  2018-01-05       0      40         2
5     A  2018-01-06      50      50         2
6     A  2018-01-07       0      10         3
7     B  2018-01-01       0      20         3
8     B  2018-01-02       0      30         3
9     B  2018-01-03      20      60         3
10    B  2018-01-04       0      20         4
11    B  2018-01-05     100      10         4
12    B  2018-01-06       0      20         0
13    B  2018-01-07       0      30         0`

Now we can groupby using grp_attr:

df['new_supply'] = df.groupby('grp_attr')['demand'].cumsum()
df.loc[df['supply'] == 0, 'new_supply'] = 0

Once you are done with your operations you can now drop the grp_attr columns from your dataframe.

df.drop(columns=['grp_attr'], inplace=True)

Post a Comment for "How To Create Cummulative Sum In Dataframe Python Part2?"