1. Pandas aggregation
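Worked examples of aggregation in Pandas. Once a rolling, expanding, or ewm object has been created, several methods are available for aggregating the data.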
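As a minimal sketch of the three window objects named above, the example below builds each one on a small fixed Series and applies one aggregation to it. The Series values, the window size of 2, and alpha=0.5 are illustrative assumptions, not values from this article.

```python
import pandas as pd

# Small fixed input so the results can be checked by hand (illustrative values)
s = pd.Series([1.0, 2.0, 3.0, 4.0])

# Rolling: aggregate over the last 2 observations
rolling_sum = s.rolling(window=2, min_periods=1).sum()

# Expanding: aggregate over all observations seen so far
expanding_sum = s.expanding(min_periods=1).sum()

# Exponentially weighted: y[t] = (1 - alpha) * y[t-1] + alpha * x[t] when adjust=False
ewm_mean = s.ewm(alpha=0.5, adjust=False).mean()

print(pd.DataFrame({
    "rolling_sum": rolling_sum,
    "expanding_sum": expanding_sum,
    "ewm_mean": ewm_mean,
}))
```

The rolling sum forgets values that leave the window, the expanding sum never does, and the exponentially weighted mean down-weights older values gradually.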
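1.1. Aggregating a DataFrame: creating the rolling object
We create a DataFrame and build a rolling window object on it; the aggregations in the following sections are applied to that object.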
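import pandas as pd
import numpy as np

# 10 rows of standard-normal random values, indexed by consecutive days
df = pd.DataFrame(np.random.randn(10, 4), index=pd.date_range('6/1/2024', periods=10), columns=['A', 'B', 'C', 'D'])
print(df)
# Window of 3 rows; min_periods=1 produces a result from the first row onward
r = df.rolling(window=3, min_periods=1)
print(r)
Output:
A B C D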
2024-06-01 1.441992 0.507236 -1.279692 -0.283955
2024-06-02 0.732984 -1.022779 -1.188695 0.899738
2024-06-03 0.363206 -0.610489 0.987919 -0.556534
2024-06-04 1.760517 0.513175 -1.952190 -0.371333
2024-06-05 -0.975915 0.941488 0.116632 -1.384646
2024-06-06 0.278110 2.193880 0.434967 -3.136830
2024-06-07 0.998929 -1.174505 -0.512467 -0.076176
2024-06-08 -0.836676 0.255251 -0.283001 -0.069504
2024-06-09 -1.042460 1.008820 1.203172 1.790213
2024-06-10 -0.000309 0.327030 0.235055 0.137578
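Rolling [window=3,min_periods=1,center=False,axis=0,method=single]
We can aggregate by passing a function to the rolling object built on the whole DataFrame, or select a column first with the standard get-item syntax.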
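The two options can be compared side by side on a small fixed DataFrame. This is a sketch under assumptions: the data values are made up so the sums can be checked by hand, and the string alias "sum" is used in place of np.sum (both name the same reduction, and recent pandas releases may emit a warning when a NumPy callable is passed).

```python
import pandas as pd

# Fixed illustrative data instead of random values
df = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0, 4.0], "B": [10.0, 20.0, 30.0, 40.0]},
    index=pd.date_range("2024-06-01", periods=4),
)
r = df.rolling(window=3, min_periods=1)

# Aggregating the whole rolling object returns a DataFrame
whole = r.aggregate("sum")

# Selecting one column with get-item first returns a Series
single = r["A"].aggregate("sum")

print(whole)
print(single)
```

With min_periods=1 the first two rows are sums over fewer than three values, which is why no NaN appears at the start of the result.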
1.2. Aggregating the whole DataFrame
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(10, 4), index=pd.date_range('6/1/2024', periods=10), columns=['A', 'B', 'C', 'D'])
print(df)
r = df.rolling(window=3, min_periods=1)
# Rolling sum of every column
print(r.aggregate(np.sum))
Output:
A B C D
2024-06-01 0.137541 -0.666472 -0.512313 0.124189
2024-06-02 -0.274006 0.546432 0.804729 0.444257
2024-06-03 0.656569 1.087017 0.546081 -0.645019
2024-06-04 0.287474 -0.037974 0.646037 -0.116104
2024-06-05 0.159287 0.242253 1.092559 -0.437320
2024-06-06 -1.081650 0.408552 0.273044 -0.802035
2024-06-07 -1.384118 0.366630 0.503155 -1.720862
2024-06-08 0.016059 -0.177049 0.066783 0.138181
2024-06-09 0.189092 1.099488 0.788672 -0.643970
2024-06-10 0.504482 0.307674 -1.186342 -1.958610
A B C D
2024-06-01 0.137541 -0.666472 -0.512313 0.124189
2024-06-02 -0.136465 -0.120041 0.292416 0.568445
2024-06-03 0.520104 0.966977 0.838497 -0.076574
2024-06-04 0.670037 1.595475 1.996847 -0.316866
2024-06-05 1.103330 1.291296 2.284677 -1.198443
2024-06-06 -0.634889 0.612831 2.011640 -1.355459
2024-06-07 -2.306481 1.017435 1.868758 -2.960217
2024-06-08 -2.449709 0.598133 0.842982 -2.384716
2024-06-09 -1.178967 1.289068 1.358610 -2.226651
2024-06-10 0.709633 1.230113 -0.330886 -2.464399
1.3. Applying an aggregation to a single column of a DataFrame
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(10, 4), index=pd.date_range('6/1/2024', periods=10), columns=['A', 'B', 'C', 'D'])
print(df)
r = df.rolling(window=3, min_periods=1)
# Select column A, then take its rolling sum; the result is a Series
print(r['A'].aggregate(np.sum))
Output:
A B C D
2024-06-01 1.337425 -2.008430 0.487408 0.619035
2024-06-02 -1.057971 -0.454410 1.029195 1.031153
2024-06-03 0.180957 -1.598784 0.235843 -1.234636
2024-06-04 -0.215478 -0.283628 -0.159067 -0.441236
2024-06-05 0.568535 0.468742 -0.981265 -0.225904
2024-06-06 1.251656 0.045891 0.533743 -1.809453
2024-06-07 -0.118663 0.430278 -1.811598 1.199368
2024-06-08 1.103233 -0.909900 0.184519 0.363605
2024-06-09 0.499495 1.120610 -1.283629 0.073462
2024-06-10 1.182883 -0.573653 0.291168 -1.079381
2024-06-01 1.337425
2024-06-02 0.279453
2024-06-03 0.460411
2024-06-04 -1.092492
2024-06-05 0.534014
2024-06-06 1.604712
2024-06-07 1.701527
2024-06-08 2.236225
2024-06-09 1.484064
2024-06-10 2.785611
Freq: D, Name: A, dtype: float64
1.4. Applying an aggregation to multiple columns of a DataFrame
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(10, 4), index=pd.date_range('6/1/2024', periods=10), columns=['A', 'B', 'C', 'D'])
print(df)
r = df.rolling(window=3, min_periods=1)
# Select columns A and B with a list, then take their rolling sums
print(r[['A', 'B']].aggregate(np.sum))
Output:
A B C D
2024-06-01 -0.315264 -1.007784 -0.422830 0.240110
2024-06-02 -0.899798 1.220554 0.043764 -0.724214
2024-06-03 -0.506266 -1.114019 0.970437 -1.436598
2024-06-04 -0.567130 -0.358241 -2.330796 0.720396
2024-06-05 0.002677 0.358061 -0.191730 -2.024825
2024-06-06 -1.241444 -0.185388 1.539475 -0.398289
2024-06-07 -0.394370 0.899715 -0.235603 2.083027
2024-06-08 -0.063937 -0.703623 -0.771960 1.069107
2024-06-09 -0.997480 -0.145053 -2.013109 0.630082
2024-06-10 -1.323366 0.407704 -1.958234 -0.136122
A B
2024-06-01 -0.315264 -1.007784
2024-06-02 -1.215063 0.212770
2024-06-03 -1.721329 -0.901249
2024-06-04 -1.973195 -0.251705
2024-06-05 -1.070719 -1.114198
2024-06-06 -1.805897 -0.185567
2024-06-07 -1.633136 1.072388
2024-06-08 -1.699751 0.010704
2024-06-09 -1.455787 0.051039
2024-06-10 -2.384783 -0.440971
1.5. Applying multiple functions to a single column of a DataFrame
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(10, 4), index=pd.date_range('6/1/2024', periods=10), columns=['A', 'B', 'C', 'D'])
print(df)
r = df.rolling(window=3, min_periods=1)
# A list of functions yields one output column per function
print(r['A'].aggregate([np.sum, np.mean]))
Output:
A B C D
2024-06-01 -1.353324 0.127682 -0.200629 0.450458
2024-06-02 0.949610 -1.400609 0.627148 -0.043679
2024-06-03 0.033043 0.892801 -0.425507 0.880760
2024-06-04 -0.717365 -0.126336 -0.688569 0.406762
2024-06-05 -1.432076 1.305415 0.316325 1.700087
2024-06-06 -0.130123 1.470843 0.255068 -0.466856
2024-06-07 -0.259649 0.972374 -0.294581 -0.246689
2024-06-08 0.451554 0.726053 1.198266 -0.721875
2024-06-09 -1.328514 -0.188786 0.499362 -0.998840
2024-06-10 -0.235946 0.063362 -1.474905 -1.410311
sum mean
2024-06-01 -1.353324 -1.353324
2024-06-02 -0.403714 -0.201857
2024-06-03 -0.370671 -0.123557
2024-06-04 0.265288 0.088429
2024-06-05 -2.116398 -0.705466
2024-06-06 -2.279563 -0.759854
2024-06-07 -1.821847 -0.607282
2024-06-08 0.061783 0.020594
2024-06-09 -1.136609 -0.378870
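2024-06-10 -1.112906 -0.370969
1.6. Applying multiple functions to multiple columns of a DataFrame
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(10, 4), index=pd.date_range('6/1/2024', periods=10), columns=['A', 'B', 'C', 'D'])
print(df)
r = df.rolling(window=3, min_periods=1)
# Each selected column gets a sum and a mean; the result has two-level column labels
print(r[['A', 'B']].aggregate([np.sum, np.mean]))
Output:
A B C D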
2024-06-01 0.688572 0.335234 0.752168 0.961081
2024-06-02 1.085028 1.130616 -0.536655 0.779873
2024-06-03 0.867040 0.676979 -0.389117 -2.827168
2024-06-04 0.964311 0.861692 -0.421859 -1.080160
2024-06-05 -0.203971 -1.289974 -0.553891 -0.809878
2024-06-06 1.126439 1.169267 -2.039094 -1.062846
2024-06-07 0.442940 -2.056051 0.917150 -0.204623
2024-06-08 -0.441348 -0.131800 -0.884501 -0.733120
2024-06-09 -0.529172 -0.652189 -1.366874 -0.988671
2024-06-10 0.189241 0.030703 0.020499 0.532722
A        B
sum mean sum mean
2024-06-01 0.688572 0.688572 0.335234 0.335234
2024-06-02 1.773601 0.886800 1.465849 0.732925
2024-06-03 2.640640 0.880213 2.142829 0.714276
2024-06-04 2.916379 0.972126 2.669287 0.889762
2024-06-05 1.627379 0.542460 0.248697 0.082899
2024-06-06 1.886778 0.628926 0.740985 0.246995
2024-06-07 1.365408 0.455136 -2.176759 -0.725586
2024-06-08 1.128031 0.376010 -1.018584 -0.339528
2024-06-09 -0.527580 -0.175860 -2.840041 -0.946680
2024-06-10 -0.781280 -0.260427 -0.753286 -0.251095
1.7. Applying different functions to different columns of a DataFrame
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(3, 4), index=pd.date_range('6/1/2024', periods=3), columns=['A', 'B', 'C', 'D'])
print(df)
r = df.rolling(window=3, min_periods=1)
# A dict maps each column to its own function: sum for A, mean for B
print(r.aggregate({'A': np.sum, 'B': np.mean}))
Output:
A B C D
2024-06-01 0.024827 -0.020137 1.930786 -0.481966
2024-06-02 0.301334 0.295961 -0.983852 0.401034
2024-06-03 0.025677 0.625714 0.948775 -0.490254
A B
2024-06-01 0.024827 -0.020137
2024-06-02 0.326161 0.137912
2024-06-03 0.351838 0.300513
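Because the example above uses random data, its numbers change on every run. The sketch below repeats the per-column dict aggregation on fixed, made-up values and compares it with calling the two reductions separately on each column. The data and the string aliases "sum" and "mean" are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Fixed illustrative data so the window arithmetic can be followed by hand
df = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0], "B": [3.0, 6.0, 9.0]},
    index=pd.date_range("2024-06-01", periods=3),
)
r = df.rolling(window=3, min_periods=1)

# One call: a different function per column
result = r.aggregate({"A": "sum", "B": "mean"})

# Same computation spelled out column by column
expected = pd.DataFrame({"A": r["A"].sum(), "B": r["B"].mean()})

print(result)
print(np.allclose(result.to_numpy(), expected.to_numpy()))
```

Column A accumulates 1, 1+2, 1+2+3, while column B averages 3, then (3+6)/2, then (3+6+9)/3, which matches how the random-data output above relates to its input.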