Experimental data:
The data has one point per minute, and seven days of data were collected in total.
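To make the shape of the input concrete, here is a minimal sketch that builds a synthetic series with the same layout: a DatetimeIndex at one-minute spacing covering seven days, with a single `val` column. The start date, the sine-plus-noise signal, and the variable names are assumptions for illustration only; the real measurements are not reproduced here.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real measurements: 7 days, one point per minute.
points_per_day = 24 * 60
index = pd.date_range("2019-06-01", periods=7 * points_per_day, freq="min")

# Assumed signal: a daily sine pattern plus noise. The real series will differ.
rng = np.random.default_rng(0)
minutes = np.arange(len(index))
values = (10
          + 5 * np.sin(2 * np.pi * minutes / points_per_day)
          + rng.normal(0, 0.5, len(index)))

data = pd.DataFrame({"val": values}, index=index)
print(data.shape)  # (10080, 1)
```

Seven days at 1440 points per day gives 10080 rows, which is the size of frame the forecasting function below expects to receive.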
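import datetime

import numpy as np
import pandas as pd
from xgboost import XGBRegressor


def xgboost_model_forecast(data, step):
    """
    Split the raw data into two parts, one for training and one for model
    evaluation (by default the most recent three days), then forecast the
    next `step` points (one point per minute).
    :param data: DataFrame with a DatetimeIndex and a 'val' column
    :param step: number of one-minute points to forecast
    :return: Series of forecast values indexed by forecast time
    """
    # Forecast timestamps start one minute after the last observation.
    latest_date = data.index[-1] + datetime.timedelta(minutes=1)
    forecast_times = pd.date_range(start=latest_date, periods=step, freq='min')
    # Hold out the most recent three days for evaluation.
    split_date = latest_date - datetime.timedelta(days=3)
    train_data = data.loc[data.index <= split_date].copy()
    evaluation_data = data.loc[data.index > split_date].copy()
    forecast_data = pd.DataFrame({'val': np.zeros(len(forecast_times))}, index=forecast_times)
    x_train, y_train = create_features(train_data, label='val')
    x_eval, y_eval = create_features(evaluation_data, label='val')
    x_forecast = create_features(forecast_data)
    # early_stopping_rounds is a constructor argument in xgboost >= 1.6;
    # older releases take it in fit() instead.
    reg = XGBRegressor(n_estimators=1000, early_stopping_rounds=50)
    reg.fit(x_train, y_train,
            eval_set=[(x_train, y_train), (x_eval, y_eval)],
            verbose=False)
    evaluation_data['val'] = reg.predict(x_eval)
    forecast_data['val'] = reg.predict(x_forecast)
    data_forecasted = pd.Series(forecast_data['val'].to_numpy(), index=forecast_times)
    return data_forecasted

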
def create_features(df, label=None):
    """Build calendar features from the DatetimeIndex (adds columns to df in place)."""
    df['date'] = df.index  # index: DatetimeIndex
    df['hour'] = df['date'].dt.hour  # dt: DatetimeProperties, hour: Series
    df['quarter'] = df['date'].dt.quarter
    df['minute'] = df['date'].dt.minute
    df['month'] = df['date'].dt.month
    df['year'] = df['date'].dt.year
    df['day_of_year'] = df['date'].dt.dayofyear
    df['day_of_month'] = df['date'].dt.day
    # Series.dt.weekofyear was removed in pandas 2.0; isocalendar() replaces it.
    df['week_of_year'] = df['date'].dt.isocalendar().week.astype(int)
    X = df[['hour', 'quarter', 'minute', 'month', 'year', 'day_of_year', 'day_of_month', 'week_of_year']]
    if label:
        y = df[label]
        return X, y
    return X

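The split rule and the calendar features can be checked without training a model. The sketch below assumes a hypothetical index starting on 2019-06-01 and a placeholder `val` column; it applies the same three-day hold-out as `xgboost_model_forecast` and then counts how many distinct values some of the calendar fields take over the week.

```python
import datetime

import pandas as pd

# Hypothetical index matching the data description: 7 days, one point per minute.
index = pd.date_range("2019-06-01", periods=7 * 24 * 60, freq="min")
data = pd.DataFrame({"val": 0.0}, index=index)

# Same split rule as xgboost_model_forecast: the last three days are held out.
latest_date = data.index[-1] + datetime.timedelta(minutes=1)
split_date = latest_date - datetime.timedelta(days=3)
train = data.loc[data.index <= split_date]
evaluation = data.loc[data.index > split_date]
print(len(train), len(evaluation))  # 5761 4319

# Distinct values per calendar field over the whole week.
fields = pd.DataFrame({
    "hour": index.hour,
    "minute": index.minute,
    "day_of_month": index.day,
    "month": index.month,
    "year": index.year,
})
distinct = fields.nunique().to_dict()
print(distinct)
```

With this assumed week, roughly four days go to training and three to evaluation. `month` and `year` take a single value across the window, so they cannot contribute any split to the trees; `hour` and `minute` are the fields that carry the daily pattern.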
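As the figure above shows, when one week of historical data is used to forecast the next day, the predicted values are quite close to the actual values.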
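"Close" can be put into numbers by comparing the forecast with the actual values once they are available. The helper below is a minimal sketch, not part of the original code: `forecast_errors` is a hypothetical name, and the three sample values are made up to show the calculation of mean absolute error (MAE) and root mean squared error (RMSE).

```python
import numpy as np


def forecast_errors(actual, predicted):
    """Return (MAE, RMSE) for two equal-length sequences of values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    diff = predicted - actual
    mae = np.mean(np.abs(diff))
    rmse = np.sqrt(np.mean(diff ** 2))
    return mae, rmse


# Made-up values for illustration only.
mae, rmse = forecast_errors([10.0, 12.0, 11.0], [10.5, 11.0, 11.0])
print(mae, rmse)
```

The same function could be applied to the evaluation window, comparing `y_eval` with `reg.predict(x_eval)`, to get an error estimate before trusting the forecast for the next day.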