Predicting stock movements using deep learning

Swarit Dholakia
5 min read · Mar 19, 2019


In grade 8, I got grounded for 2 weeks and had my phone taken away, so I had to resort to an alternative pastime.

While cleaning up my closet looking for something to do, I stumbled upon one of my dad’s investing books.

Passive income, dividend reinvesting, NASDAQ, low-cost ETF’s, yada, yada, yada.

Between a bunch of my dad’s old MBA books, MoneySense magazines (back when they were still in print), a whole lot of Investopedia and 2 whole weeks of no screen time, I became a de facto investing guru (or at least as close to one as a 12-year-old gets).

I quickly realized the need for investing and personal finance knowledge, and hence, started my journey in learning about the markets and stocks.

Machine learning is an up-and-coming subfield of artificial intelligence that focuses on recognizing (sometimes obscure) patterns in large sets of data.

Predicting stock market price trends, interestingly, can be described in the same way.

In order to accurately realize returns and experience growth in investments, it’s crucial to be able to gauge an understanding of the market as precisely as possible. From a strictly quantitative view, machine learning is the optimal solution.

Putting a human up to this task is just wildly ineffective, but a computer — especially one equipped with properly trained machine learning models — could prove to be an asset to both consumer and institutional investors.

In this article, I walk you through the machine learning model I replicated, which can identify trends in stock prices.

The dataset we use has over 41k minutes of data, recorded from April to August 2017, covering the prices of all S&P 500 stocks along with the index price itself.
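A minimal sketch of how a dataset like this could be loaded and turned into the data array and row count n used in the snippets below; the file name data_stocks.csv and the DATE column are assumptions about the file layout, not part of the original pipeline:

import numpy as np
import pandas as pd

# Minute-level prices: one timestamp column, the S&P 500 index and one column per stock
data = pd.read_csv('data_stocks.csv')
data = data.drop(['DATE'], axis=1)  # keep only the price columns

n = data.shape[0]   # total number of minutes (rows)
data = data.values  # work with a plain NumPy array from here on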

Training and test data

The dataset was split into training and test data.

The training data contains the first 80% of the rows; the remaining 20% is held out for testing.

Note that the data was not shuffled but sliced sequentially, so the test set covers a strictly later time window than the training set.

# Training and test data (n is the total number of rows in the dataset)
train_start = 0
train_end = int(np.floor(0.8 * n))
test_start = train_end
test_end = n
data_train = data[np.arange(train_start, train_end), :]
data_test = data[np.arange(test_start, test_end), :]
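One step the split above doesn't show is input scaling. Neural networks train far more reliably on normalized data, so a common choice (an assumption here, not part of the original snippet) is to fit a min-max scaler on the training slice only and apply it to both slices, which avoids peeking into the test period:

from sklearn.preprocessing import MinMaxScaler

# Fit on the training window only, then apply the same transform to both windows
scaler = MinMaxScaler()
scaler.fit(data_train)
data_train = scaler.transform(data_train)
data_test = scaler.transform(data_test)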

Placeholders

X contains the inputs (the prices of all S&P 500 stocks at time T = t) and Y holds the network's target output (the index price of the S&P 500 at time T = t + 1).

# Placeholder
X = tf.placeholder(dtype=tf.float32, shape=[None, n_stocks])
Y = tf.placeholder(dtype=tf.float32, shape=[None])
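The placeholders only declare the shapes the network expects; the arrays fed into them still have to be built from the train and test slices. A minimal sketch, assuming the first column holds the (already time-shifted) index price and the remaining columns hold the 500 constituent prices; both of these are assumptions about the file layout:

# Features: all stock prices at time t; target: index price at t + 1
X_train = data_train[:, 1:]
y_train = data_train[:, 0]
X_test = data_test[:, 1:]
y_test = data_test[:, 0]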

Variables

Variables are the parts of the computational graph that are allowed to change during training. Weights and biases are represented as variables so the optimizer can adapt them as the model learns.

The model consists of four hidden layers. The first layer contains 1024 neurons.

Subsequent hidden layers are always half the size of the previous layer, which means 512, 256 and 128 neurons.

The shrinking number of neurons in each subsequent layer compresses the information the network extracts from the previous layers.

n_stocks = 500
n_neurons_1 = 1024
n_neurons_2 = 512
n_neurons_3 = 256
n_neurons_4 = 128
n_target = 1

# Initializers (not shown in the original snippet): variance scaling for
# the weights and zeros for the biases, a common default choice
weight_initializer = tf.variance_scaling_initializer(mode="fan_avg", distribution="uniform", scale=1.0)
bias_initializer = tf.zeros_initializer()

# Layer 1
W_hidden_1 = tf.Variable(weight_initializer([n_stocks, n_neurons_1]))
bias_hidden_1 = tf.Variable(bias_initializer([n_neurons_1]))
# Layer 2
W_hidden_2 = tf.Variable(weight_initializer([n_neurons_1, n_neurons_2]))
bias_hidden_2 = tf.Variable(bias_initializer([n_neurons_2]))
# Layer 3
W_hidden_3 = tf.Variable(weight_initializer([n_neurons_2, n_neurons_3]))
bias_hidden_3 = tf.Variable(bias_initializer([n_neurons_3]))
# Layer 4
W_hidden_4 = tf.Variable(weight_initializer([n_neurons_3, n_neurons_4]))
bias_hidden_4 = tf.Variable(bias_initializer([n_neurons_4]))
# Output layer
W_out = tf.Variable(weight_initializer([n_neurons_4, n_target]))
bias_out = tf.Variable(bias_initializer([n_target]))

Network architecture

After defining the required weight and bias variables, the placeholders (holding the data) and the variables (the weights and biases) need to be combined into a sequence of matrix multiplications.

The output of each hidden layer is transformed by an activation function.

Activation functions are important elements of the network architecture since they introduce a level of non-linearity to the model.

# Hidden layers
hidden_1 = tf.nn.relu(tf.add(tf.matmul(X, W_hidden_1), bias_hidden_1))
hidden_2 = tf.nn.relu(tf.add(tf.matmul(hidden_1, W_hidden_2), bias_hidden_2))
hidden_3 = tf.nn.relu(tf.add(tf.matmul(hidden_2, W_hidden_3), bias_hidden_3))
hidden_4 = tf.nn.relu(tf.add(tf.matmul(hidden_3, W_hidden_4), bias_hidden_4))

# Output layer (transposed so the output shape matches the target placeholder Y)
out = tf.transpose(tf.add(tf.matmul(hidden_4, W_out), bias_out))

The model consists of three major building blocks: the input layer, the hidden layers and the output layer. This architecture is called a feedforward network, meaning the batch of data flows solely from left to right, input to output.

Other network architectures, such as recurrent neural networks, also allow data to flow "backwards" through the network.

Cost function

The cost function of the network is used to generate a measure of deviation between the network’s predictions and the actually observed training targets.

# Cost function
mse = tf.reduce_mean(tf.squared_difference(out, Y))

The neural network

After defining the placeholders, variables, initializers, cost function and optimizer of the network (the optimizer isn't reproduced above; a sketch follows below), the model needs to be trained.

The training dataset gets divided into n / batch_size batches that are sequentially fed into the network. At this point the placeholders X and Y come into play: they hold the input and target data and present them to the network as inputs and targets.

A sampled data batch of X goes through the network until it reaches the output layer.

TensorFlow then compares the model's predictions against the actually observed targets Y in the batch.
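The optimizer and training loop aren't reproduced above, so here is a minimal sketch of how that step could look with TensorFlow 1.x and the Adam optimizer; the epoch count, batch size and the variable names opt, net, X_train and y_train are assumptions rather than the original code:

# Optimizer: Adam as a common default, hyperparameters left at their defaults
opt = tf.train.AdamOptimizer().minimize(mse)

# Start a session and initialize all weights and biases
net = tf.Session()
net.run(tf.global_variables_initializer())

epochs = 10        # assumed number of passes over the training data
batch_size = 256   # assumed mini-batch size

for e in range(epochs):
    # Feed the training window to the network one mini-batch at a time
    for i in range(len(y_train) // batch_size):
        start = i * batch_size
        batch_x = X_train[start:start + batch_size]
        batch_y = y_train[start:start + batch_size]
        net.run(opt, feed_dict={X: batch_x, Y: batch_y})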

The mean absolute percentage error of the forecast on the test set is equal to 4.89%, which isn’t that bad.
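For reference, a figure like that could be computed from the held-out window along these lines (net, out, X_test and y_test come from the sketches above and are assumptions, not the original code):

# Run the trained network on the test inputs; out is transposed, so flatten it
pred = net.run(out, feed_dict={X: X_test}).flatten()

# Mean absolute percentage error between predicted and actual index prices
mape = np.mean(np.abs((y_test - pred) / y_test)) * 100
print("Test MAPE: %.2f%%" % mape)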

Please note that there are tons of ways of further improving this result: redesigning the layers and neurons, choosing different initialization and activation schemes, introducing dropout layers, early stopping and so on (a dropout sketch follows below).
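As one example of those tweaks, dropout could be slotted in after a hidden layer. A minimal sketch using the TensorFlow 1.x API; keep_prob is a hypothetical extra placeholder you would feed as roughly 0.9 while training and 1.0 when evaluating:

keep_prob = tf.placeholder(dtype=tf.float32)

# Randomly zero out a fraction of the first hidden layer's activations
hidden_1_drop = tf.nn.dropout(hidden_1, keep_prob)
hidden_2 = tf.nn.relu(tf.add(tf.matmul(hidden_1_drop, W_hidden_2), bias_hidden_2))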

Special thanks to Sebastian for inspiration!

Liked this article? AWESOME! Show your appreciation down below 👏👏

  1. Follow me on Medium
  2. Connect with me on LinkedIn
  3. Reach out at dholakia.swarit@gmail.com to say hi!

I’d love to chat about fintech or any cool exponential technology!
