How To Group Rows And Extract Mean Values
I have the following data: df = QUEUE_1 QUEUE_2 QUEUE_3 HOUR TOTAL_SERVICE_TIME TOTAL_WAIT_TIME ABC123 DEF656 7 20 30 ABC
Solution 1:
I think you first need to reshape your data with melt or lreshape:
result = pd.lreshape(df, {'QUEUE': ['QUEUE_1','QUEUE_2','QUEUE_3']})
print (result)
   HOUR  TOTAL_SERVICE_TIME  TOTAL_WAIT_TIME   QUEUE
0     7                  20               30  ABC123
1     7                  22               32  ABC123
2     8                  15               12  DEF656
3     8                  15               16  FED456
4     7                  20               30  DEF656
5     8                  15               12  ABC123
6     8                  15               16  DEF656
7     8                  15               12  FED456
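The melt route mentioned above is not shown in the answer. A minimal sketch of it follows, assuming a hypothetical four-row df reconstructed so that it reproduces the long-format table printed above (the question's original data is truncated, so this df is an assumption, as is the temporary SLOT column name):

```python
import pandas as pd

# Hypothetical wide-format input, reconstructed to match the printed
# long-format table; the question's real data is not fully visible.
df = pd.DataFrame({
    "QUEUE_1": ["ABC123", "ABC123", "DEF656", "FED456"],
    "QUEUE_2": ["DEF656", None, "ABC123", "DEF656"],
    "QUEUE_3": [None, None, "FED456", None],
    "HOUR": [7, 7, 8, 8],
    "TOTAL_SERVICE_TIME": [20, 22, 15, 15],
    "TOTAL_WAIT_TIME": [30, 32, 12, 16],
})

result = (
    df.melt(
        id_vars=["HOUR", "TOTAL_SERVICE_TIME", "TOTAL_WAIT_TIME"],
        value_vars=["QUEUE_1", "QUEUE_2", "QUEUE_3"],
        var_name="SLOT",        # records which QUEUE_n column a value came from
        value_name="QUEUE",
    )
    .drop(columns="SLOT")       # the source column is not needed afterwards
    .dropna(subset=["QUEUE"])   # drop rows where the queue slot was empty
    .reset_index(drop=True)
)
print(result)
```

Unlike lreshape, melt keeps the rows with missing queue values, so the explicit dropna is what brings the result in line with the table above.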
Then use groupby with mean, and finally reindex by a MultiIndex created from the unique values of the QUEUE and HOUR columns, so that combinations with no rows are still listed (filled with 0):
mux = pd.MultiIndex.from_product([result.QUEUE.dropna().unique(),
                                  result.dropna().HOUR.unique()],
                                 names=['QUEUE','HOUR'])
print(result.groupby(['QUEUE','HOUR'])
            .mean()
            .reindex(mux, fill_value=0)
            .add_prefix('AVG_')
            .reset_index())
    QUEUE  HOUR  AVG_TOTAL_SERVICE_TIME  AVG_TOTAL_WAIT_TIME
0  ABC123     7                      21                   31
1  ABC123     8                      15                   12
2  DEF656     7                      20                   30
3  DEF656     8                      15                   14
4  FED456     7                       0                    0
5  FED456     8                      15                   14
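To see why the reindex step matters, here is a small self-contained sketch. It rebuilds the long-format result from the table printed earlier in this answer and compares the grouped means before and after reindexing; the variable names means and out are only for illustration.

```python
import pandas as pd

# Long-format data copied from the reshaped table shown in this answer.
result = pd.DataFrame({
    "HOUR": [7, 7, 8, 8, 7, 8, 8, 8],
    "TOTAL_SERVICE_TIME": [20, 22, 15, 15, 20, 15, 15, 15],
    "TOTAL_WAIT_TIME": [30, 32, 12, 16, 30, 12, 16, 12],
    "QUEUE": ["ABC123", "ABC123", "DEF656", "FED456",
              "DEF656", "ABC123", "DEF656", "FED456"],
})

# groupby only returns (QUEUE, HOUR) pairs that actually occur in the data.
means = result.groupby(["QUEUE", "HOUR"]).mean()
print(("FED456", 7) in means.index)  # this pair has no rows in the data

# Every QUEUE x HOUR combination, whether or not it occurs.
mux = pd.MultiIndex.from_product(
    [result["QUEUE"].unique(), result["HOUR"].unique()],
    names=["QUEUE", "HOUR"],
)

# Reindexing adds the missing pairs and fills their averages with 0.
out = means.reindex(mux, fill_value=0).add_prefix("AVG_").reset_index()
print(out)
```

Whether 0 is the right filler is a judgment call: it makes the table complete, but a missing pair then looks the same as a genuine average of 0, so leaving NaN (by omitting fill_value) may be preferable in some analyses.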
Solution 2:
Steps:
1) Use pd.lreshape to convert the DF from wide to long format, gathering the columns whose names start with QUEUE (QUEUE_1, QUEUE_2, QUEUE_3) into a single column named QUEUE.
2) Pivot the DF using pivot_table, which uses the mean as its aggregating function by default. Optionally fill missing values with 0.
3) Stack the resulting DF so that the HOUR column level moves into the index, giving a multi-index. Add a string prefix to the column names and reset the index.
df = pd.lreshape(df, {'QUEUE': df.columns[df.columns.str.startswith('QUEUE')].tolist()})
piv_df = df.pivot_table(index=['QUEUE'], columns=['HOUR'], fill_value=0)
piv_df.stack().add_prefix('AVG_').reset_index()
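As a rough check of steps 2 and 3, the sketch below applies pivot_table and stack to the long-format table printed under Solution 1 (that is, the data after the reshape step), so it can run without the question's original df. The names long_df and out are assumptions for illustration only.

```python
import pandas as pd

# Long-format data copied from the reshaped table shown under Solution 1.
long_df = pd.DataFrame({
    "HOUR": [7, 7, 8, 8, 7, 8, 8, 8],
    "TOTAL_SERVICE_TIME": [20, 22, 15, 15, 20, 15, 15, 15],
    "TOTAL_WAIT_TIME": [30, 32, 12, 16, 30, 12, 16, 12],
    "QUEUE": ["ABC123", "ABC123", "DEF656", "FED456",
              "DEF656", "ABC123", "DEF656", "FED456"],
})

# Step 2: one row per QUEUE, one column per (value column, HOUR) pair.
# The default aggregation is the mean; empty cells are filled with 0.
piv_df = long_df.pivot_table(index=["QUEUE"], columns=["HOUR"], fill_value=0)

# Step 3: move the HOUR level from the columns into the index,
# prefix the remaining value columns, and flatten the index.
out = piv_df.stack().add_prefix("AVG_").reset_index()
print(out)
```

Because the pivot already contains a cell for every QUEUE and HOUR pair, fill_value=0 plays the role that the explicit reindex plays in Solution 1.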