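I have the following DataFrame:
...
with more categories.
I now want to compute a 12-month rolling average for each category. The problem with the pandas rolling function is that when it computes the rolling average for cat2, it pulls in data from cat1. When it computes cat3, it pulls in data from cat2, and so on.
Kind regards,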
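To compute the rolling average for each category, you must first group the DataFrame by the category column, so that each window only sees rows from its own category: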
df['roll_avg'] = df.groupby('categorie')['avg(monthly)'].rolling(12).mean().droplevel(0)
YYYYMM avg(monthly) categorie roll_avg
0 202001 0.666667 cat1 NaN
1 202002 0.750000 cat1 NaN
2 202003 1.000000 cat1 NaN
3 202004 1.000000 cat1 NaN
4 202005 1.000000 cat1 NaN
5 202006 1.000000 cat1 NaN
6 202007 1.000000 cat1 NaN
7 202008 1.000000 cat1 NaN
8 202009 0.333333 cat1 NaN
9 202010 0.375000 cat1 NaN
10 202011 0.400000 cat1 NaN
11 202012 0.800000 cat1 0.777083
12 202101 0.833333 cat1 0.790972
13 202102 1.000000 cat1 0.811806
14 202103 0.857143 cat1 0.799901
15 202104 0.571429 cat1 0.764187
16 202105 1.000000 cat1 0.764187
17 202106 0.833333 cat1 0.750298
18 202107 0.666667 cat1 0.722520
19 202001 0.529412 cat2 NaN
20 202002 0.666667 cat2 NaN
21 202003 0.684211 cat2 NaN
22 202004 0.400000 cat2 NaN
23 202005 0.791667 cat2 NaN
24 202006 0.480000 cat2 NaN
25 202007 0.578947 cat2 NaN
26 202008 0.411765 cat2 NaN
27 202009 0.466667 cat2 NaN
28 202010 0.545455 cat2 NaN
29 202011 0.458333 cat2 NaN
30 202012 0.724138 cat2 0.561438
31 202101 0.611111 cat2 0.568247
32 202102 0.513514 cat2 0.555484
33 202103 0.560000 cat2 0.545133
34 202104 0.350000 cat2 0.540966
35 202105 0.533333 cat2 0.519439
36 202106 0.625000 cat2 0.531522
37 202107 0.000000 cat2 0.483276
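As the output above shows, the first 11 rows of each category are NaN because a 12-month window is not yet full, and the window restarts at the first cat2 row. The same behaviour can be checked on a small scale with the sketch below. It is only an illustration: the toy values and the window of 2 (instead of 12) are assumptions chosen so the result is easy to verify by hand, and it uses `transform` as an alternative to `droplevel(0)` for aligning the result with the original index. The `roll_avg_partial` column is a hypothetical extra that uses `min_periods=1`, in case averages over incomplete windows are preferred to NaN.

```python
import pandas as pd

# Toy data (made up for illustration), same column names as in the question.
df = pd.DataFrame({
    "YYYYMM": [202001, 202002, 202003, 202001, 202002, 202003],
    "avg(monthly)": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
    "categorie": ["cat1", "cat1", "cat1", "cat2", "cat2", "cat2"],
})

# Rolling windows are positional, so sort by month within each category first.
df = df.sort_values(["categorie", "YYYYMM"])

grouped = df.groupby("categorie")["avg(monthly)"]

# Window of 2 here for readability; use 12 for a 12-month average.
# transform returns a Series aligned with df's index, so no droplevel is needed.
df["roll_avg"] = grouped.transform(lambda s: s.rolling(2).mean())

# Optional variant: average over whatever is available until the window fills.
df["roll_avg_partial"] = grouped.transform(
    lambda s: s.rolling(2, min_periods=1).mean()
)

print(df)
```

In this sketch the first row of cat2 gets NaN in `roll_avg` rather than an average that includes the last cat1 value, which is the separation between categories the question asks for.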