Recalculate Mean Considering Each Count
If the dataframe is given as below:

index  yearmon   college  major  gpa   num
0      20140401  1        a      3.36  29
1      20180401  2        b      2.63  48
2      20160401
Option 1: A lazy way, given that the number of students in each row is an integer, is to repeat each row num times and take the plain mean:
(df.loc[df.index.repeat(df['num']), ['major', 'gpa']]
.groupby('major').mean()
)
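To make the row-repetition idea concrete, here is a minimal sketch on a small made-up frame. The values are hypothetical and are not the data from the question; only the column names major, gpa and num are taken from it.

```python
import pandas as pd

# Hypothetical data for illustration, not the original frame.
df = pd.DataFrame({
    "major": ["a", "a", "b"],
    "gpa": [3.0, 4.0, 2.5],
    "num": [1, 3, 2],
})

# Repeat each index label `num` times, giving labels 0, 1, 1, 1, 2, 2.
repeated = df.loc[df.index.repeat(df["num"]), ["major", "gpa"]]

# There is now one row per student, so the ordinary mean per major
# equals the mean weighted by `num`.
weighted = repeated.groupby("major")["gpa"].mean()
print(weighted)
```

Note that the expanded frame has as many rows as the sum of num, so this approach can use a lot of memory when the counts are large.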
Option 2: Use groupby().apply() with np.average:
import numpy as np

(df.groupby('major')
.apply(lambda x: np.average(x['gpa'], weights=x['num']))
)
Option 3: The most complicated but best-performing approach is to compute the total score per row and calculate the average manually:
df['total'] = df['gpa'] * df['num']
groups = df.groupby('major')
out = groups['total'].sum()/groups['num'].sum()
Output (as printed by Option 1; Options 2 and 3 give the same values as a Series indexed by major):
gpa
major
a 3.360
b 3.284
c 3.230
d 4.220
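As a sanity check, the three options can be run side by side. The sketch below uses a small made-up frame (hypothetical values, not the data from the question) and asserts that all three produce the same weighted means. In Option 2 the gpa and num columns are selected before apply, which is an adjustment to the snippet above so that the grouping column is not passed into the lambda.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "major": ["a", "a", "b", "b"],
    "gpa": [3.0, 4.0, 2.0, 3.5],
    "num": [10, 30, 5, 15],
})

# Option 1: expand to one row per student, then take the plain mean.
opt1 = (df.loc[df.index.repeat(df["num"]), ["major", "gpa"]]
          .groupby("major")["gpa"].mean())

# Option 2: weighted average per group via np.average.
opt2 = (df.groupby("major")[["gpa", "num"]]
          .apply(lambda g: np.average(g["gpa"], weights=g["num"])))

# Option 3: sum of gpa * num divided by sum of num, fully vectorized.
totals = (df["gpa"] * df["num"]).groupby(df["major"]).sum()
opt3 = totals / df.groupby("major")["num"].sum()

# All three should agree up to floating-point rounding.
assert np.allclose(opt1, opt2) and np.allclose(opt1, opt3)
print(opt3)
```

For major a this gives (3.0 * 10 + 4.0 * 30) / 40 = 3.75, and for major b (2.0 * 5 + 3.5 * 15) / 20 = 3.125.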