做信息图的免费网站,如何获取网站是哪个公司制作,最新电大网站开发维护,中山网站上排名KFlod 适用于用户回归类型数据划分 stratifiedKFlod 适用于分类数据划分 并且在实验中也发现#xff0c;stratifiedKFlod.split(X_train,y_train)的y_train不可为连续数据#xff0c;因此无法使用#xff0c;只能用KFold
models [GBDT(n_estimators100), RF(n_estimators1…KFlod 适用于用户回归类型数据划分 stratifiedKFlod 适用于分类数据划分 并且在实验中也发现stratifiedKFlod.split(X_train,y_train)的y_train不可为连续数据因此无法使用只能用KFold
models [GBDT(n_estimators100), RF(n_estimators100), ET(n_estimators100), ADA(n_estimators100)]
X_train_stack np.zeros((X_train.shape[0], len(models))) X_test_stack np.zeros((X_test.shape[0], len(models)))
10折stacking
n_folds 10 kf KFold(n_splitsn_folds)
for i, model in enumerate(models): X_stack_test_n np.zeros((X_test.shape[0], n_folds))
for j, (train_index, test_index) in enumerate(kf.split(X_train)):tr_x X_train[train_index]tr_y y_train[train_index]model.fit(tr_x, tr_y)# 生成stacking训练数据集X_train_stack[test_index, i] model.predict(X_train[test_index])X_stack_test_n[:, j] model.predict(X_test)# 生成stacking测试数据集
X_test_stack[:, i] X_stack_test_n.mean(axis1)model_second LinearRegression() model_second.fit(X_train_stack,y_train) pred model_second.predict(X_test_stack) print(“R2:”, r2_score(y_test, pred))
GBDT
model_1 models[0] model_1.fit(X_train,y_train) pred_1 model_1.predict(X_test) print(“R2:”, r2_score(y_test, pred_1))
RF
model_2 models[1] model_2.fit(X_train, y_train) pred_2 model_2.predict(X_test) print(“R2:”, r2_score(y_test, pred_2))
ET
model_3 models[2] model_3.fit(X_train, y_train) pred_3 model_1.predict(X_test) print(“R2:”, r2_score(y_test, pred_3))
ADA
model_4 models[3] model_4.fit(X_train, y_train) pred_4 model_4.predict(X_test) print(“R2:”, r2_score(y_test, pred_4))
stackingstacking是一种分层模型集成框架。以两层为例第一层由多个基学习器组成其输入为原始训练集第二层的模型则是以第一层基学习器的输出作为特征加入训练集进行再训练从而得到完整的stacking模型。stacking的方法在各大数据挖掘比赛上都很风靡模型融合之后能够小幅度的提高模型的预测准确度。