(1) Read the data
import pandas as pd
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
import numpy as np
data = pd.read_csv('wind_dataset.csv', index_col=0, parse_dates=True)
(2) Create lag features
data['T.MIN_lag3'] = data['T.MIN'].shift(3)
data['T.MIN.G_lag3'] = data['T.MIN.G'].shift(3)
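To make the effect of shift(3) concrete, here is a minimal sketch on a toy frame. The values and the name `toy` are made up for illustration and are not taken from wind_dataset.csv. It shows that the first three rows of a 3-step lag column are NaN, which is why the next step drops missing values.

```python
import pandas as pd

# Toy series standing in for one column of the dataset (values are made up)
toy = pd.DataFrame(
    {"T.MIN": [1.0, 2.0, 3.0, 4.0, 5.0]},
    index=pd.date_range("2020-01-01", periods=5, freq="D"),
)

# shift(3) moves every value down three rows, so row t holds the value from t-3
toy["T.MIN_lag3"] = toy["T.MIN"].shift(3)

# The first three rows have no value three steps back, so they become NaN
n_missing = int(toy["T.MIN_lag3"].isna().sum())
lagged_values = toy["T.MIN_lag3"].dropna().tolist()
print(toy)
print("missing rows:", n_missing)
```

With a daily index, a lag of 3 means "the value three days earlier"; the same call on the real data would lose its first three rows once NaN values are dropped.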
(3) Drop NaN values
data = data.dropna()
(4) Split the data
train_size = int(len(data) * 0.8)
train, test = data[:train_size], data[train_size:]
X_train = train[['RAIN', 'T.MAX', 'T.MIN_lag3', 'T.MIN.G_lag3']]
y_train = train['WIND']
X_test = test[['RAIN', 'T.MAX', 'T.MIN_lag3', 'T.MIN.G_lag3']]
y_test = test['WIND']
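Note: the training features (X_train) do not include WIND itself, so this approach builds a model that predicts WIND using only ['RAIN', 'T.MAX', 'T.MIN_lag3', 'T.MIN.G_lag3']. Personally I don't think this works well, because it ignores WIND's own history.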
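One possible way to address this concern is to add lagged copies of WIND to the features, so the model can see past wind values without seeing the value it is asked to predict. The sketch below is only an illustration under stated assumptions: it runs on synthetic random data rather than wind_dataset.csv, and the column names WIND_lag1 to WIND_lag3, the lag choices, and max_depth=5 are hypothetical, not part of the original code.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error

# Synthetic stand-in for the real dataset (random values, illustration only)
rng = np.random.default_rng(0)
n = 200
demo = pd.DataFrame(
    {
        "WIND": rng.normal(10, 3, n),
        "RAIN": rng.random(n),
        "T.MAX": rng.normal(15, 5, n),
    },
    index=pd.date_range("2020-01-01", periods=n, freq="D"),
)

# Lagged copies of the target: each row only sees earlier WIND values
for lag in (1, 2, 3):
    demo[f"WIND_lag{lag}"] = demo["WIND"].shift(lag)
demo = demo.dropna()  # the first three rows have incomplete lags

features = ["RAIN", "T.MAX", "WIND_lag1", "WIND_lag2", "WIND_lag3"]

# Chronological 80/20 split, same idea as in step (4)
split = int(len(demo) * 0.8)
train_demo, test_demo = demo.iloc[:split], demo.iloc[split:]

model = DecisionTreeRegressor(max_depth=5, random_state=0)
model.fit(train_demo[features], train_demo["WIND"])
pred = model.predict(test_demo[features])

mae = mean_absolute_error(test_demo["WIND"], pred)
print(f"MAE on synthetic data: {mae:.3f}")
```

Because the synthetic WIND values are independent noise, the error printed here says nothing about real performance; the point is only the shape of the feature construction. On the real data, comparing the error with and without the WIND lag columns would show whether the target's history actually helps.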