Predicting gold futures price trends with Python: machine learning turns out to be this simple! (code included)

2024-04-08 09:10:00

Hello everyone. Today we are going to explore an interesting question: how can machine learning be used to predict the price movement of gold futures?

As an important safe-haven asset, gold is closely watched by investors. With the recent global turmoil, the gold price has soared and drawn even more attention. So how can we make a reasonably accurate prediction of it?

Traditional gold price forecasts rely heavily on fundamental and technical analysis, but in complex and volatile markets these methods often work less well than hoped. In recent years, with the rapid development of artificial intelligence, more and more quantitative investors have begun using machine learning models to predict gold prices, and some report good results.

The basic idea is simple. The factors that affect the price (historical prices, open interest, the inflation rate, and so on) serve as the model's input features, and the future price serves as its output target. The model is trained on a large amount of historical data so that it learns the patterns of price fluctuation and can then predict the future trend.

Let's use Python to implement this process step by step and see the charm of machine learning!

Step 1: Import the necessary libraries and get the data

import numpy as np
import pandas as pd
import akshare as ak
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM

# Fetch historical daily data for gold futures
symbol = "AU0"  # code of the Shanghai gold main contract
df = ak.futures_zh_daily_sina(symbol=symbol)

# Use the date column as the index
df['date'] = pd.to_datetime(df['date'])
df.set_index('date', inplace=True)

# Keep only the closing price
df = df[['close']]

Step 2: Data preprocessing

To make learning easier for the model, the prices are normalized to the range 0 to 1. In addition, to predict the prices for the next N days, the series must be restructured as a supervised learning problem: each sample pairs a window of past prices with the prices that immediately follow it.
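As a minimal sketch of what min-max normalization does, here is the formula (x - min) / (max - min) written out in plain Python. The prices are made-up numbers, and `min_max_scale` and `inverse_scale` are hypothetical helpers for illustration, not scikit-learn functions.

```python
def min_max_scale(values):
    """Map each value to (x - min) / (max - min), so results lie in [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        raise ValueError("all values are identical; scaling is undefined")
    return [(v - lo) / span for v in values], lo, hi


def inverse_scale(scaled, lo, hi):
    """Undo the scaling: map values in [0, 1] back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]


prices = [450.0, 460.0, 455.0, 470.0]  # made-up closing prices
scaled, lo, hi = min_max_scale(prices)
restored = inverse_scale(scaled, lo, hi)
print(scaled)    # smallest price maps to 0.0, largest to 1.0
print(restored)  # back in the original price units
```

In a stricter setup, the minimum and maximum would be taken from the training portion only, so that prices from the test period do not influence the scaling.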

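# Normalize prices to [0, 1]
# Note: the scaler is fitted on the full series, test period included
scaler = MinMaxScaler()
data = scaler.fit_transform(df)

# Restructure the series into supervised-learning samples
X = []
y = []
window = 60  # look-back window: number of past days fed to the model
future = 5   # number of future days to predict

for i in range(window, len(data) - future + 1):
    X.append(data[i-window:i])       # shape (window, 1)
    y.append(data[i:i+future, 0])    # shape (future,), matching the Dense output

X = np.array(X)
y = np.array(y)

# Split chronologically into training and test sets (80% / 20%)
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]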
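To see what the windowing loop produces, here is a small sketch on a toy series, using the same index arithmetic as above. The helper name `make_windows` and the toy values are assumptions for illustration only.

```python
def make_windows(series, window, future):
    """Pair each run of `window` past values with the `future` values after it."""
    X, y = [], []
    for i in range(window, len(series) - future + 1):
        X.append(series[i - window:i])   # inputs: the `window` values before day i
        y.append(series[i:i + future])   # targets: day i and the days after it
    return X, y


series = list(range(10))  # toy stand-in for the price series: 0, 1, ..., 9
X, y = make_windows(series, window=4, future=2)

for inputs, targets in zip(X, y):
    print(inputs, "->", targets)

# Chronological 80/20 split: earlier windows train, later windows test
split = int(len(X) * 0.8)
print(len(X[:split]), "training samples,", len(X[split:]), "test samples")
```

Because the split is by position rather than random, the model is always tested on windows that come later in time than the ones it was trained on.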

Step 3: Build and train the LSTM model

Here we use an LSTM (Long Short-Term Memory) network as the gold price prediction model. LSTM is a special kind of RNN (Recurrent Neural Network) that is well suited to time series data.

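model = Sequential()
# The LSTM layer reads each window of `window` daily prices (one feature per day)
model.add(LSTM(units=64, input_shape=(window, 1)))
# The dense layer outputs one value per predicted day
model.add(Dense(units=future))

model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(X_train, y_train, epochs=50, batch_size=32)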

The results of the training are as follows:

[Image: training log output]

The model is trained with Adam, an adaptive learning-rate optimization algorithm that combines the advantages of RMSprop and Momentum. It is one of the most commonly used optimizers today and handles local minima and saddle points in high-dimensional spaces well.
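To make the "RMSprop plus Momentum" description concrete, here is a minimal sketch of the Adam update rule applied to a one-dimensional toy function. This is not the Keras implementation; the function `adam_minimize`, the learning rate, and the step count are assumptions chosen for illustration.

```python
import math


def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a 1-D function with the Adam update rule, given its gradient."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g       # Momentum part: running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g   # RMSprop part: running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)          # bias correction for the zero initialization
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3); the minimum is at x = 3
x_min = adam_minimize(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)
```

Dividing by the square root of `v_hat` is what makes the step size adapt per parameter, while `m_hat` smooths the direction across steps.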

The loss in the final training epoch is 2.6029e-04. This is the mean squared error on the training data, measured on the normalized 0-to-1 scale, and it indicates that the gap between predicted and actual values on that data is very small.
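Since the loss is computed on prices scaled to the 0-to-1 range, it helps to translate it back into price units. The sketch below takes the square root to get an RMSE on the scaled range and multiplies by the price range. The minimum and maximum prices here are hypothetical placeholders, not values from the actual dataset.

```python
import math

train_mse_scaled = 2.6029e-04                # final training loss reported above
rmse_scaled = math.sqrt(train_mse_scaled)    # typical error as a fraction of the price range

# Hypothetical price range in yuan/gram, NOT taken from the article's data
price_min, price_max = 250.0, 550.0
rmse_yuan = rmse_scaled * (price_max - price_min)

print(f"{rmse_scaled:.4f} of the price range, roughly {rmse_yuan:.2f} yuan/gram")
```

With the real data, `scaler.data_min_` and `scaler.data_max_` from the fitted `MinMaxScaler` would supply the actual range.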

Step 4: Model evaluation and prediction

Next, evaluate the model's predictions on the test set:

from sklearn.metrics import mean_squared_error

y_pred = model.predict(X_test)

# Convert scaled values back to prices (yuan/gram)
y_test = scaler.inverse_transform(y_test.reshape(-1, 1)).flatten()
y_pred = scaler.inverse_transform(y_pred.reshape(-1, 1)).flatten()

# RMSE over all test windows and all 5 forecast days combined
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f"Test set RMSE: {rmse}")
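An RMSE figure is easier to judge next to a simple baseline. The sketch below compares a hypothetical model forecast with a naive "every future day equals the last known close" forecast. All numbers are made up for illustration and are not results from the model above.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up values: the last known close and the five closes that followed it
last_close = 500.0
actual_next = [501.0, 503.0, 502.0, 506.0, 504.0]
model_pred = [500.5, 502.0, 503.0, 504.0, 505.0]   # hypothetical model output

# Naive baseline: repeat the last known close for every forecast day
naive_pred = [last_close] * len(actual_next)

model_rmse = rmse(actual_next, model_pred)
naive_rmse = rmse(actual_next, naive_pred)
print(f"model RMSE: {model_rmse:.3f}, naive RMSE: {naive_rmse:.3f}")
```

If a model cannot beat this kind of baseline on the test set, a small RMSE by itself says little about its forecasting skill.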

Finally, use the trained model to predict the gold price for the next 5 days:

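# Take the most recent window and add a batch dimension: shape (1, window, 1)
last_data = data[-window:]
last_data = np.expand_dims(last_data, axis=0)

next_5d_price = model.predict(last_data)
# Reshape to a single column, the layout the scaler was fitted on, then flatten
next_5d_price = scaler.inverse_transform(next_5d_price.reshape(-1, 1)).flatten()

print(f"Predicted gold prices for the next 5 days: {next_5d_price}")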

Result output:

As you can see, the prediction error (RMSE) of our LSTM model on the test set is about 6 yuan/gram, which is quite good for a gold price of several hundred yuan per gram. In addition, the model predicts that gold prices will first fall and then rise over the next five days, a volatile pattern overall.

Of course, this is just a simple example. In practice, we can incorporate more features that affect prices, such as the US dollar index, inflation expectations, and geopolitical risk, and we can try more complex deep learning models to further improve forecast accuracy.

However, no model is omnipotent, however advanced. The gold price is affected by many unpredictable factors, so in actual investing we must treat model predictions rationally, weigh all factors, and control risk.

That is everything on using Python machine learning to predict the price trend of gold futures. Interesting, isn't it?
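As a rough sketch of how extra features could enter the same windowing scheme, each day becomes a list of values instead of a single close, while the target stays the future closing prices. The helper `make_multifeature_windows`, the column order, and the numbers are all assumptions for illustration.

```python
def make_multifeature_windows(rows, window, future, target_col=0):
    """rows: one list of features per day, e.g. [close, dollar_index]."""
    X, y = [], []
    for i in range(window, len(rows) - future + 1):
        X.append(rows[i - window:i])  # window x n_features inputs
        # targets: only the target column (the close) of the following days
        y.append([row[target_col] for row in rows[i:i + future]])
    return X, y


# Made-up data: column 0 is the close, column 1 is a second feature
rows = [[float(d), 100.0 + d] for d in range(8)]
X, y = make_multifeature_windows(rows, window=3, future=2)

n_features = len(rows[0])
print(len(X), "samples, each", len(X[0]), "days x", n_features, "features")
print(X[0], "->", y[0])
```

With inputs shaped like this, the LSTM layer's `input_shape` would become `(window, n_features)` instead of `(window, 1)`, and each feature column would need its own scaling.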
