Multivariate Time Series Forecasting: Predicting Future Stock Returns Using Deep Learning

The AI Quant
8 min read · Aug 5, 2023


Welcome to the exciting world of finance, where different things are connected in tricky ways. Have you ever tried guessing how stocks will do in the future? It’s like solving a puzzle where all the pieces are linked. But don’t worry, because in this guide, we’re going on an adventure into multivariate time series forecasting. Think of it like making your own special crystal ball — a super smart computer program. This program will use old data from connected stocks to make guesses about what might happen next. It’s like peeking into the future using numbers and patterns. Get ready to see the cool way math can help us predict things!

Photo by Andrik Langfield on Unsplash

Table of Contents

  1. Introduction
  2. Data Collection
  3. Data Preprocessing
  4. Model Development
  5. Model Training
  6. Model Evaluation
  7. Conclusion

1. Introduction

As financial markets continue to evolve, traditional forecasting methods often fall short in capturing the nuanced patterns present in data. Deep learning, with its ability to capture intricate temporal dependencies, offers an exciting avenue for improving the accuracy of stock return predictions. In this tutorial, we will guide you through the process of building a robust forecasting model that leverages the power of Long Short-Term Memory (LSTM) networks.

Throughout this tutorial, we will navigate the various stages of this project, starting with the collection of pertinent financial data using the yfinance library. Subsequently, we will meticulously preprocess the data, performing essential transformations to prepare it for model training. Our journey will culminate in the development, training and evaluation of the LSTM-based model.

2. Data Collection

In this section, we will collect the necessary data for our multivariate time series forecasting project. We will use the yfinance library to download historical stock data.

First, let’s install the yfinance library if it's not already installed:

!pip install yfinance

Once the library is installed, we can proceed with the data collection. We will download historical stock data for multiple assets. For this tutorial, let’s consider three assets: Apple (AAPL), Microsoft (MSFT) and Amazon (AMZN).

import yfinance as yf

# Define the tickers for the assets
tickers = ['AAPL', 'MSFT', 'AMZN']

# Download historical stock data
data = yf.download(tickers, start='2010-01-01', end='2023-06-30')

The yf.download() function allows us to download historical stock data for multiple tickers. We specify the tickers as a list and provide the start and end dates for the data. In this case, we are downloading data from January 1, 2010, to June 30, 2023.

The downloaded data contains the adjusted close prices, close prices, high prices, low prices, open prices and volume for each asset. We will use the adjusted close prices for our analysis.
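To sanity-check the download, you can peek at the DataFrame. With multiple tickers, yfinance returns a column MultiIndex of price field and ticker, so selecting 'Adj Close' gives one column per asset. A quick, optional check:

# Inspect the adjusted close prices for the three tickers
print(data['Adj Close'].tail())

# The columns form a MultiIndex of (price field, ticker)
print(data.columns[:6])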

3. Data Preprocessing

In this section, we will preprocess the downloaded data to prepare it for model development. We will calculate the log returns of the adjusted close prices and normalize the data.

First, let’s calculate the log returns:

import numpy as np

# Calculate log returns
log_returns = np.log(data['Adj Close'] / data['Adj Close'].shift(1)).dropna()

The log returns are calculated as the natural logarithm of the ratio between the current day’s adjusted close price and the previous day’s adjusted close price. The shift() function shifts the prices by one day so that each price is divided by the previous day’s price, and dropna() removes the first row, which has no previous price to compare against.
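As a quick illustration of what this formula computes (the prices below are made up, not taken from our dataset), the log return is close to the simple percentage return for small daily moves:

# Toy example: a move from 100 to 101
price_yesterday, price_today = 100.0, 101.0
log_ret = np.log(price_today / price_yesterday)   # ~0.00995
simple_ret = price_today / price_yesterday - 1    # 0.01
print(log_ret, simple_ret)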

Next, let’s normalize the data using the min-max scaling technique:

from sklearn.preprocessing import MinMaxScaler

# Normalize the log returns
scaler = MinMaxScaler()
scaled_returns = scaler.fit_transform(log_returns)

The min-max scaling technique scales the data to a specific range, typically between 0 and 1. This ensures that all the data falls within the same range and prevents any single asset from dominating the analysis.
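Under the hood, MinMaxScaler maps each column x to (x - min(x)) / (max(x) - min(x)). A minimal sketch of this equivalence, useful as a sanity check:

# Manual column-wise min-max scaling should match the scaler's output
manual = (log_returns - log_returns.min()) / (log_returns.max() - log_returns.min())
print(np.allclose(manual.values, scaled_returns))  # expected: True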

Now that we have preprocessed the data, let’s visualize the log returns:

import matplotlib.pyplot as plt

# Plot the log returns
plt.figure(figsize=(10, 6))
plt.plot(log_returns)
plt.xlabel('Date')
plt.ylabel('Log Returns')
plt.title('Log Returns of Assets')

plt.show()
Figure 1: Log Returns of Assets. Created by Author

The plot shows the log returns of the three assets over time. We can observe the volatility and trends in the returns.

4. Model Development

In this section, we will develop a deep learning model for multivariate time series forecasting. We will use LSTM networks, which are well-suited for capturing long-term dependencies in sequential data.

First, let’s split the data into training and testing sets:

train_size = int(len(scaled_returns) * 0.8)
train_data = scaled_returns[:train_size]
test_data = scaled_returns[train_size:]

We will use 80% of the data for training and the remaining 20% for testing.

Next, let’s create a class for our LSTM model:

import torch
import torch.nn as nn

class LSTMModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(LSTMModel, self).__init__()
        self.hidden_size = hidden_size
        self.lstm = nn.LSTM(input_size, hidden_size)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # x has shape (seq_len, batch, input_size)
        h0 = torch.zeros(1, x.size(1), self.hidden_size).to(x.device)
        c0 = torch.zeros(1, x.size(1), self.hidden_size).to(x.device)
        out, _ = self.lstm(x, (h0, c0))
        # Use the hidden state of the last time step for the prediction
        out = self.fc(out[-1])
        return out

The LSTMModel class inherits from the nn.Module class, which is the base class for all neural network modules in PyTorch. We define the architecture of our LSTM model in the __init__ method and the forward pass in the forward method.

The __init__ method takes three arguments: input_size, hidden_size and output_size. input_size is the number of features in the input data, hidden_size is the number of units in the hidden state of the LSTM and output_size is the number of output units.

In the forward method, we initialize the hidden state and cell state of the LSTM with zeros. We then pass the input data through the LSTM and apply a linear transformation to obtain the output.

Now, let’s instantiate the LSTM model:

input_size = len(tickers)
hidden_size = 32
output_size = len(tickers)

model = LSTMModel(input_size, hidden_size, output_size)

We set the input_size to the number of tickers (3 in this case), the hidden_size to 32 and the output_size to the number of tickers.
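Before training, it can help to confirm the tensor shapes the model expects. A small sketch (the sequence length of 10 simply mirrors the window_size we use later):

# Dummy batch: (seq_len, batch, input_size) because nn.LSTM defaults to batch_first=False
dummy = torch.randn(10, 1, input_size)
with torch.no_grad():
    print(model(dummy).shape)  # torch.Size([1, 3]): one prediction per asset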

5. Model Training

In this section, we will train our LSTM model using the training data. We will use the mean squared error (MSE) loss function and the Adam optimizer.

First, let’s convert the training data into input-output pairs:

def create_input_output_pairs(data, window_size):
    input_data = []
    output_data = []

    for i in range(len(data) - window_size):
        input_data.append(data[i:i+window_size])
        output_data.append(data[i+window_size])

    return input_data, output_data

window_size = 10

train_input, train_output = create_input_output_pairs(train_data, window_size)

The create_input_output_pairs function takes the data and a window size as input and returns a list of input-output pairs. Each input is a sequence of window_size consecutive data points and the corresponding output is the next data point.
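A quick, optional check of the resulting shapes:

print(len(train_input))        # number of training windows
print(train_input[0].shape)    # (window_size, number of assets), i.e. (10, 3)
print(train_output[0].shape)   # (number of assets,), i.e. (3,)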

Next, let’s define the training loop:

num_epochs = 100
learning_rate = 0.001

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

for epoch in range(num_epochs):
    model.train()
    train_loss = 0.0

    for i in range(len(train_input)):
        input_seq = torch.tensor(train_input[i]).unsqueeze(1).float()
        target = torch.tensor(train_output[i]).unsqueeze(0).float()

        optimizer.zero_grad()

        output = model(input_seq)
        loss = criterion(output, target)

        loss.backward()
        optimizer.step()

        train_loss += loss.item()

    train_loss /= len(train_input)

    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Train Loss: {train_loss:.4f}')

In the training loop, we iterate over the input-output pairs and perform the following steps:

  • Convert the input and target sequences to tensors and unsqueeze them to add a batch dimension.
  • Zero the gradients of the optimizer.
  • Forward pass: Pass the input sequence through the model to obtain the output.
  • Calculate the loss between the output and the target.
  • Backward pass: Compute the gradients of the model parameters with respect to the loss.
  • Update the model parameters using the optimizer.
  • Accumulate the training loss.

After each epoch, we calculate the average training loss, and every 10 epochs we print it to track progress:

Epoch [10/100], Train Loss: 0.0043
Epoch [20/100], Train Loss: 0.0042
Epoch [30/100], Train Loss: 0.0042
Epoch [40/100], Train Loss: 0.0042
Epoch [50/100], Train Loss: 0.0042
Epoch [60/100], Train Loss: 0.0042
Epoch [70/100], Train Loss: 0.0041
Epoch [80/100], Train Loss: 0.0041
Epoch [90/100], Train Loss: 0.0040
Epoch [100/100], Train Loss: 0.0040
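If you also want to persist the model while it trains, you could add a couple of lines inside the epoch loop, for example (the filename is purely illustrative):

# Optional: save the weights every 10 epochs
if (epoch + 1) % 10 == 0:
    torch.save(model.state_dict(), f'lstm_checkpoint_epoch_{epoch+1}.pt')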

6. Model Evaluation

In this section, we will evaluate the performance of our trained LSTM model using the testing data. We will calculate the mean squared error (MSE) and visualize the predicted returns.

First, let’s convert the testing data into input-output pairs:

test_input, test_output = create_input_output_pairs(test_data, window_size)

Next, let’s define the evaluation loop:

model.eval()
test_loss = 0.0

with torch.no_grad():
    predictions = []

    for i in range(len(test_input)):
        input_seq = torch.tensor(test_input[i]).unsqueeze(1).float()
        target = torch.tensor(test_output[i]).unsqueeze(0).float()

        output = model(input_seq)
        loss = criterion(output, target)

        test_loss += loss.item()
        predictions.append(output.squeeze().tolist())

test_loss /= len(test_input)

# Convert the collected predictions and targets to NumPy arrays for analysis
predictions = np.array(predictions)
test_output = np.array(test_output)

In the evaluation loop, we iterate over the input-output pairs and perform the following steps:

  • Convert the input and target sequences to tensors and unsqueeze them to add a batch dimension.
  • Forward pass: Pass the input sequence through the model to obtain the output.
  • Calculate the loss between the output and the target.
  • Accumulate the testing loss.
  • Append the predicted output to the predictions list.

After the loop, we calculate the average testing loss and convert both the predictions and the test targets to NumPy arrays.

Now, let’s calculate the mean squared error (MSE) between the predicted returns and the actual returns:

mse = np.mean((predictions - test_output) ** 2)
print(f'Test MSE: {mse:.4f}')
Test MSE: 0.0053
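Since the model predicts all three assets jointly, it can also be informative to break the error down per asset. A small sketch, assuming the columns follow yfinance's alphabetical ticker ordering (AAPL, AMZN, MSFT):

# MSE per asset
per_asset_mse = np.mean((predictions - test_output) ** 2, axis=0)
for ticker, value in zip(sorted(tickers), per_asset_mse):
    print(f'{ticker}: {value:.4f}')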

Finally, let’s visualize the predicted returns:

# Denormalize the data
predictions = scaler.inverse_transform(predictions)
test_output = scaler.inverse_transform(test_output)

# Plot the predicted returns
plt.figure(figsize=(10, 6))
plt.plot(predictions[:, 0], label='Predicted')
plt.plot(test_output[:, 0], label='Actual')
plt.xlabel('Time')
plt.ylabel('Returns')
plt.title('Predicted Returns vs Actual Returns')
plt.legend()

plt.show()
Figure 2: Predicted Returns vs Actual Returns. Created by Author

The plot shows the predicted returns and the actual returns for the first asset (Apple). We can observe how well the model captures the trends and patterns in the data.

7. Conclusion

In this tutorial, we have built a deep learning model for multivariate time series forecasting. We utilized historical stock log returns from multiple correlated assets to predict future returns. We used LSTM networks, which are well-suited for capturing long-term dependencies in sequential data.

We started by collecting the necessary data using the yfinance library. We then preprocessed the data by calculating the log returns and normalizing the data. After that, we developed an LSTM model using PyTorch and trained it using the training data. Finally, we evaluated the model's performance using the testing data and visualized the predicted returns.

By following this tutorial, you have learned how to build a deep learning model for multivariate time series forecasting in Python. You can further enhance the model by experimenting with different architectures, hyperparameters and additional features.
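As one example of such an enhancement, here is a minimal sketch of a stacked LSTM with dropout between the recurrent layers; the layer count and dropout rate are illustrative, not tuned:

# A two-layer LSTM variant with dropout between the recurrent layers
class StackedLSTMModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, num_layers=2, dropout=0.2):
        super().__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers=num_layers, dropout=dropout)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        h0 = torch.zeros(self.num_layers, x.size(1), self.hidden_size).to(x.device)
        c0 = torch.zeros(self.num_layers, x.size(1), self.hidden_size).to(x.device)
        out, _ = self.lstm(x, (h0, c0))
        return self.fc(out[-1])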

Remember to save your progress and experiment with different datasets and techniques to gain a deeper understanding of multivariate time series forecasting.

Become a Medium member today and enjoy unlimited access to thousands of Python guides and Data Science articles! For just $5 a month, you’ll have access to exclusive content and support as a writer. Sign up now using my link and I’ll earn a small commission at no extra cost to you.
