
How To Group Rows And Extract Mean Values

I have the following data: df = QUEUE_1 QUEUE_2 QUEUE_3 HOUR TOTAL_SERVICE_TIME TOTAL_WAIT_TIME ABC123 DEF656 7 20 30 ABC

Solution 1:

I think you first need to reshape your data with melt or lreshape:

result = pd.lreshape(df, {'QUEUE': ['QUEUE_1','QUEUE_2','QUEUE_3']})
print (result)
   HOUR  TOTAL_SERVICE_TIME  TOTAL_WAIT_TIME   QUEUE
0     7                  20               30  ABC123
1     7                  22               32  ABC123
2     8                  15               12  DEF656
3     8                  15               16  FED456
4     7                  20               30  DEF656
5     8                  15               12  ABC123
6     8                  15               16  DEF656
7     8                  15               12  FED456
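The melt route mentioned above is not shown in the answer, so here is a minimal sketch of it. The wide input frame is an assumption: the question's data is truncated, so the values below are reconstructed from the printed result and may not match the original exactly.

```python
import pandas as pd

# Hypothetical wide frame, reconstructed from the printed long-format result
# (the question's original data is truncated).
df = pd.DataFrame({
    "QUEUE_1": ["ABC123", "ABC123", "DEF656", "FED456"],
    "QUEUE_2": ["DEF656", None, "ABC123", "DEF656"],
    "QUEUE_3": [None, None, "FED456", None],
    "HOUR": [7, 7, 8, 8],
    "TOTAL_SERVICE_TIME": [20, 22, 15, 15],
    "TOTAL_WAIT_TIME": [30, 32, 12, 16],
})

result = (
    df.melt(
        id_vars=["HOUR", "TOTAL_SERVICE_TIME", "TOTAL_WAIT_TIME"],
        value_vars=["QUEUE_1", "QUEUE_2", "QUEUE_3"],
        value_name="QUEUE",
    )
    .drop(columns="variable")   # the source column name (QUEUE_1, ...) is not needed
    .dropna(subset=["QUEUE"])   # drop rows where that queue slot was empty
    .reset_index(drop=True)
)
print(result)
```

Unlike lreshape, melt keeps the rows with a missing queue, so the explicit dropna is what brings the two results in line.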

Then use groupby with mean, and finally reindex by a MultiIndex built from the unique values of the QUEUE and HOUR columns, so that combinations missing from the data are added and filled with 0:

mux = pd.MultiIndex.from_product([result.QUEUE.dropna().unique(), 
                                  result.dropna().HOUR.unique()], names=['QUEUE','HOUR'])

print (result.groupby(['QUEUE','HOUR'])
             .mean()
             .reindex(mux, fill_value=0)
             .add_prefix('AVG_')
             .reset_index())

    QUEUE  HOUR  AVG_TOTAL_SERVICE_TIME  AVG_TOTAL_WAIT_TIME
0  ABC123     7                      21                   31
1  ABC123     8                      15                   12
2  DEF656     7                      20                   30
3  DEF656     8                      15                   14
4  FED456     7                       0                    0
5  FED456     8                      15                   14
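To see why the reindex step matters, here is a small self-contained sketch. The long-format frame is copied from the printed lreshape result above; the names means, missing and full are only illustrative.

```python
import pandas as pd

# Long-format data, copied from the printed lreshape result.
result = pd.DataFrame({
    "HOUR": [7, 7, 8, 8, 7, 8, 8, 8],
    "TOTAL_SERVICE_TIME": [20, 22, 15, 15, 20, 15, 15, 15],
    "TOTAL_WAIT_TIME": [30, 32, 12, 16, 30, 12, 16, 12],
    "QUEUE": ["ABC123", "ABC123", "DEF656", "FED456",
              "DEF656", "ABC123", "DEF656", "FED456"],
})

# groupby only returns the (QUEUE, HOUR) pairs that actually occur.
means = result.groupby(["QUEUE", "HOUR"]).mean()

# Every possible (QUEUE, HOUR) combination.
mux = pd.MultiIndex.from_product(
    [result["QUEUE"].unique(), result["HOUR"].unique()],
    names=["QUEUE", "HOUR"],
)

# Pairs that are absent from the grouped result.
missing = mux.difference(means.index)
print(missing.tolist())

# Reindexing adds the absent pairs and fills their averages with 0.
full = means.reindex(mux, fill_value=0).add_prefix("AVG_").reset_index()
print(full)
```

Without the reindex, the pair FED456 / hour 7 would simply not appear in the output, because no row in the data has that combination.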

Solution 2:

Steps:

1) Use pd.lreshape to convert the DF from wide to long format for the columns whose names start with QUEUE, and name the combined column QUEUE.

2) Pivot the DF using pivot_table, which uses the mean as its aggregating function by default. Optionally fill missing values with 0.

3) Stack the obtained DF so that the HOUR column level moves into the index, resulting in a multi-index format. Add a prefix to the column names and reset the index.


df = pd.lreshape(df, {'QUEUE': df.columns[df.columns.str.startswith('QUEUE')].tolist()})
piv_df = df.pivot_table(index=['QUEUE'], columns=['HOUR'], fill_value=0)
piv_df.stack().add_prefix('AVG_').reset_index()
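For anyone who wants to try steps 2 and 3 without the original data, here is a rough sketch that starts from the long-format frame printed in Solution 1 (so the reshape step is assumed to be already done). The variable names long_df and out are just for illustration.

```python
import pandas as pd

# Long-format data, copied from the result printed in Solution 1.
long_df = pd.DataFrame({
    "HOUR": [7, 7, 8, 8, 7, 8, 8, 8],
    "TOTAL_SERVICE_TIME": [20, 22, 15, 15, 20, 15, 15, 15],
    "TOTAL_WAIT_TIME": [30, 32, 12, 16, 30, 12, 16, 12],
    "QUEUE": ["ABC123", "ABC123", "DEF656", "FED456",
              "DEF656", "ABC123", "DEF656", "FED456"],
})

# Step 2: one row per QUEUE, one column per (value, HOUR) pair; the default
# aggregation is the mean, and absent combinations become 0.
piv_df = long_df.pivot_table(index="QUEUE", columns="HOUR", fill_value=0)

# Step 3: move the HOUR column level into the index, prefix the value
# columns and flatten the index back into ordinary columns.
out = piv_df.stack().add_prefix("AVG_").reset_index()
print(out)
```

Because pivot_table already builds the full QUEUE by HOUR grid, no separate reindex is needed here. Recent pandas versions may emit a FutureWarning about the stack implementation; the result for this data should be the same either way.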
