1. Linear regression model (scikit-learn)
- import pandas as pd
- from sklearn.linear_model import LinearRegression
- from sklearn.model_selection import train_test_split
- # study hours ('学习时') vs. exam score ('分')
- examDict = {'学习时': [0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 1.75, 2.00, 2.25, 2.50,
-                       2.75, 3.00, 3.25, 3.50, 4.00, 4.25, 4.50, 4.75, 5.00, 5.50],
-             '分': [10, 22, 13, 43, 20, 22, 33, 50, 62, 48,
-                   55, 75, 62, 73, 81, 76, 64, 82, 90, 93]}
- examDf = pd.DataFrame(examDict)
- exam_X = examDf[['学习时']]  # features must be 2-D for sklearn
- exam_Y = examDf['分']
- X_train, X_test, Y_train, Y_test = train_test_split(exam_X, exam_Y, train_size=0.8)
- model = LinearRegression()
- model.fit(X_train, Y_train)
- a = model.intercept_  # intercept
- b = model.coef_  # regression coefficient
- y_train_pred = model.predict(X_train)  # predictions
- score = model.score(X_test, Y_test)  # coefficient of determination, e.g. 0.8866
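As a quick check on the snippet above: `model.score` returns the coefficient of determination (R²), the same value as `sklearn.metrics.r2_score` on the model's predictions. A minimal sketch with made-up toy data (not the exam dataset):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Toy 1-D data, roughly linear, for illustration only
X = np.array([[0.5], [1.0], [1.5], [2.0], [2.5], [3.0]])
y = np.array([10.0, 20.0, 28.0, 41.0, 52.0, 60.0])

model = LinearRegression().fit(X, y)
# model.score(X, y) and r2_score(y, model.predict(X)) agree
print(model.score(X, y))
```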
2. Multiple linear regression model (statsmodels)
- import numpy as np
- import pandas as pd
- import statsmodels.api as sm
- from sklearn import datasets  # import datasets from scikit-learn
- # Note: load_boston was removed in scikit-learn 1.2, so this snippet
- # requires scikit-learn < 1.2 (or substitute another regression dataset).
- data = datasets.load_boston()  # load the Boston housing dataset
- df = pd.DataFrame(data.data, columns=data.feature_names)
- target = pd.DataFrame(data.target, columns=["MEDV"])
- X = df[['CRIM', 'ZN', 'INDUS']]  # X: the input (independent) variables
- y = target["MEDV"]  # y: the output (dependent) variable
- X = sm.add_constant(X)  # add an intercept (beta_0) to the model
- model = sm.OLS(y, X).fit()  # sm.OLS(output, input)
- predictions = model.predict(X)
- model.summary()  # print the model's statistical summary
3. Ridge regression model
- from sklearn.linear_model import Ridge, RidgeCV
- from sklearn.model_selection import train_test_split
- # df2: features, df1: target (assumed defined earlier in the task)
- X_train, X_test, Y_train, Y_test = train_test_split(df2, df1, train_size=0.8)
- # A fixed-alpha fit would be Ridge(alpha=0.5, fit_intercept=True);
- # RidgeCV instead searches a grid of candidate alphas.
- # normalize= was removed in scikit-learn 1.2; standardize features with
- # StandardScaler beforehand if needed.
- model = RidgeCV(alphas=[0.01, 0.1, 0.2, 0.5, 1], cv=10)
- model.fit(X_train, Y_train)
- ridge_best_alpha = model.alpha_  # best lambda value
- print(f"Best ridge regularization parameter = {ridge_best_alpha}")
- # Compute the coefficient of determination
- a = model.intercept_
- b = model.coef_
- y_train_pred = model.predict(X_train)
- score = model.score(X_test, Y_test)
- print(score)
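A self-contained sketch of the `RidgeCV` alpha search on synthetic data from `make_regression` (illustrative only, not the task's `df1`/`df2`); features are standardized in a pipeline since `normalize=` no longer exists:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression problem for illustration
X, y = make_regression(n_samples=100, n_features=10, noise=5.0, random_state=7)
alphas = [0.01, 0.1, 0.2, 0.5, 1]

# StandardScaler replaces the old normalize=True behavior
model = make_pipeline(StandardScaler(), RidgeCV(alphas=alphas, cv=10))
model.fit(X, y)
best = model.named_steps["ridgecv"].alpha_  # chosen from the candidate grid
print(best)
```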
4. Modeling with the best lambda value
- import numpy as np
- from sklearn.metrics import mean_squared_error
- ridge = Ridge(alpha=ridge_best_alpha)  # normalize= removed in scikit-learn 1.2
- ridge.fit(X_train, Y_train)
- ridge_predict = ridge.predict(X_test)
- # Compute the loss (root mean squared error)
- rmse = np.sqrt(mean_squared_error(Y_test, ridge_predict))
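The RMSE line above is simply the square root of the mean squared error; a toy check with hand-picked numbers:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.0, 5.0, 9.0])

# RMSE by hand: sqrt(mean of squared residuals) = sqrt((1 + 0 + 4) / 3)
rmse_manual = np.sqrt(np.mean((y_true - y_pred) ** 2))
rmse_sklearn = np.sqrt(mean_squared_error(y_true, y_pred))
print(rmse_manual)  # ~1.291
```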
5. LASSO regression model:
- from sklearn.linear_model import Lasso, LassoCV
- # alphas: candidate grid; x_tr/y_tr and x_te/y_te: train/test splits (assumed defined earlier)
- lasso_cv = LassoCV(alphas=alphas, cv=10, max_iter=10000)  # normalize= removed in scikit-learn 1.2
- lasso_cv.fit(x_tr, y_tr)
- lasso_best_alpha = lasso_cv.alpha_  # best lambda value
- lasso = Lasso(alpha=lasso_best_alpha, max_iter=10000)
- lasso.fit(x_tr, y_tr)
- lasso_predict = lasso.predict(x_te)  # predict
- RMSE = np.sqrt(mean_squared_error(y_te, lasso_predict))
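One reason to reach for LASSO here: unlike ridge, its L1 penalty drives some coefficients exactly to zero, acting as feature selection. A sketch on synthetic data with only a few informative features (illustrative, not the task's data):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# 20 features, but only 3 actually influence y
X, y = make_regression(n_samples=200, n_features=20, n_informative=3,
                       noise=1.0, random_state=0)
lasso = Lasso(alpha=1.0, max_iter=10000).fit(X, y)

# Count coefficients shrunk exactly to zero by the L1 penalty
n_zero = int(np.sum(lasso.coef_ == 0))
print(n_zero)
```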
Additional notes from this task:
seed = 7
np.random.seed(seed)
10-fold cross-validation:
from sklearn.model_selection import StratifiedKFold
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)  # random_state only takes effect with shuffle=True; newer scikit-learn raises an error for shuffle=False with a random_state
With random_state fixed, every run builds the same model, generates the same data, and produces the same splits.
y denotes the model's output; y_ denotes the ground-truth answer:
mse = tf.reduce_mean(tf.square(y_ - y))
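The reproducibility claim can be checked directly: with `shuffle=True` and a fixed `random_state`, two `StratifiedKFold` runs produce identical splits (toy data, assumed for illustration):

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# 20 samples, 2 balanced classes -> each of the 10 folds holds one per class
X = np.arange(40).reshape(20, 2)
y = np.array([0, 1] * 10)

def split_indices(seed):
    kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    return [test.tolist() for _, test in kfold.split(X, y)]

# Two independent runs with the same seed yield identical test folds
same = split_indices(7) == split_indices(7)
print(same)  # True
```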