Building a Better Stock Market Prediction Algorithm

Introduction: Predictive Analytics for Stock Price Prediction

Most stock price prediction algorithms in use today rely on correlation and statistical analysis. In this article we challenge that approach, and propose a different, well-established alternative that produces superior outcomes.

This article contains some mathematical detail. You can skip the math-heavy sections and keep reading to find out why each concept matters.

The article covers the following sections:

Weather vs the stock market: why we can’t use the same predictive analytics for both
Why we should not rely on correlations for stock price prediction algorithms
DSP (Digital Signal Processing) method to remove noise for stock price data
Smooth price curves: the result of using DSP filters
Conclusion: the importance of noise removal for stock price prediction

Weather Prediction Algorithm vs Stock Market Prediction Algorithm

In the case of weather prediction, physical variables such as temperature, velocity, humidity, and density are available to help predict the next state of the system in some organized scientific way. Although this type of system is nonlinear and chaotic, sophisticated prediction models have been developed—thanks to those measurable data-inputs.

However, in the case of stock trading, the inputs that drive stock prices are neither directly visible nor measurable. Thus, building stock market prediction models based on causation is impractical. This is why some entities treat the market as a statistical variant.

Yet we still consider the system to be nonlinear and chaotic—just like the weather. The historical record shows that portfolio returns are still erratic, which means stock market prediction is not yet an established science.

Correlations in Stock Market Trading

Trading strategies seem to seek correlations with external events in some way. This choice is tricky, because correlation does not necessarily mean causation. In addition, correlations (when known) are often time-changing (and temporary).

Current thinking seems to favor correlation-based ideas for stock market prediction algorithms. This includes leveraging large databases with artificial intelligence and machine learning for price prediction. Ostensibly, the goal is to improve return performance. So far, AI for this purpose is still in development.

For example, in a separate article we compare AIEQ, the Russell 1000, and the S&P 500. In that article, the behavior in its Figure 4 (plus the higher returns of AIEQ) suggests that AI is sometimes “smarter” than the behavior observed in its Figure 5. But there we also note: “When the broad market goes down, AIEQ should be moving up, not down.”

Time Scales and Sample Rates Also Affect "Correlation"

Price data can look quite different depending on the time scales and the sample rates we choose. For example, in a post comparing ES and SPY on the intraday time scale, we wrote:

As the time period goes from long intervals to very short intervals, the correlation between the indexes goes from approximately 1 (strong correlation) to approximately 0.008 or less (very weak correlation).

Those who want “market neutral” can find a time scale that shows it, and those who want correlation can find one that shows that too. The result is not true correlation, only confirmation bias.
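The dependence of measured correlation on the sampling interval can be illustrated with a minimal Python sketch. It does not use ES or SPY data. Instead it builds two synthetic series that share one slow random-walk component, each with its own independent per-sample noise. The names `return_correlation`, `series_a`, and `series_b`, and all of the noise levels, are assumptions chosen only for illustration.

```python
import numpy as np

def return_correlation(a, b, step):
    """Correlation of price changes when both series are sampled every `step` points."""
    changes_a = np.diff(a[::step])
    changes_b = np.diff(b[::step])
    return float(np.corrcoef(changes_a, changes_b)[0, 1])

# Synthetic data (an assumption, not market data): a shared slow random walk
# plus independent per-sample noise in each series.
rng = np.random.default_rng(0)
n = 400_000
common = np.cumsum(rng.normal(0.0, 0.1, n))
series_a = common + rng.normal(0.0, 1.0, n)
series_b = common + rng.normal(0.0, 1.0, n)

# Longer sampling intervals let the shared component dominate the noise.
for step in (1, 100, 10_000):
    print(step, round(return_correlation(series_a, series_b, step), 3))
```

With these settings the correlation is close to zero at the shortest interval and climbs toward 1 at the longest, even though the underlying relationship between the two series never changes.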

Volatility in stock pricing hides the truth. Moving averages help clean up the noise, but predictive potential requires engineering-strength digital signal processing (DSP) methods.

Further Reading:
On the References tab, we have two links showing “visual” examples of dynamics. Each video shows something about sample rate, or frequency-related dynamics. What we “see” depends on those choices.

Digital Signal Processing and Removing Noise for Price Prediction

The Fourier Method

By using the Fourier method (see our recent post about this method), we can see that every filter equation has a characteristic frequency response. The frequency response tells us at a glance whether the equation will be quantitatively effective for our goals, or not. In essence, calculating the frequency response is a robust de-noising tool that is especially useful for messy and noisy systems, and it is also very common in aerospace flight test data analysis.
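As a minimal sketch of what "frequency response" means for a filter equation, the Python below evaluates the standard sum H(f) = sum over k of h[k] * exp(-2j*pi*f*k) for a set of coefficients h. The function name `frequency_response` and the three-point average used as the example are assumptions for illustration, and frequency is taken in cycles per sample (0.5 is the highest representable frequency).

```python
import numpy as np

def frequency_response(coeffs, freqs):
    """Evaluate H(f) = sum_k h[k] * exp(-2j*pi*f*k), with f in cycles per sample."""
    h = np.asarray(coeffs, dtype=float)
    f = np.atleast_1d(np.asarray(freqs, dtype=float))
    k = np.arange(h.size)
    # One row per frequency, one column per coefficient.
    return np.exp(-2j * np.pi * np.outer(f, k)) @ h

# Example equation: y[n] = (x[n] + x[n-1] + x[n-2]) / 3
three_point = [1 / 3, 1 / 3, 1 / 3]
gain = np.abs(frequency_response(three_point, [0.0, 1 / 3, 0.5]))
print(gain)
```

The gain is 1 at zero frequency (slow trends pass unchanged), essentially 0 at one third of a cycle per sample (a period of exactly three samples averages out), and 1/3 at 0.5 cycles per sample.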

In this context, it is not even necessary to run Monte Carlo simulations, which attempt to model the probability of all the different outcomes of a process.

Creating Filters to Remove Noise

In the example below, we create two simple filters to remove noise. One filter is an ordinary 61-point average. The other filter also has 61 points, with coefficients based upon “Pascal’s Triangle.”

One way to utilize Pascal’s Triangle is to use the numbers in any row as a set of filter coefficients. Interestingly, as the coefficient sets become longer and longer, their shape looks more and more like a “normal distribution” (i.e., the well-known Gaussian curve).
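The convergence toward a Gaussian shape can be checked numerically. The sketch below is an illustration under stated assumptions: each row is scaled to sum to 1, and it is compared with a normal curve that has the same mean (n/2) and variance (n/4) as the row. The helper names `pascal_row_normalized`, `gaussian_match`, and `max_gap` are hypothetical.

```python
import math

def pascal_row_normalized(n):
    """Row n of Pascal's Triangle (n + 1 entries), scaled to sum to 1."""
    return [math.comb(n, k) / 2**n for k in range(n + 1)]

def gaussian_match(n):
    """Normal curve with the same mean (n/2) and variance (n/4) as row n."""
    var = n / 4
    norm = math.sqrt(2 * math.pi * var)
    return [math.exp(-((k - n / 2) ** 2) / (2 * var)) / norm for k in range(n + 1)]

def max_gap(n):
    """Largest absolute difference between the scaled row and the normal curve."""
    return max(abs(p - g) for p, g in zip(pascal_row_normalized(n), gaussian_match(n)))

for n in (4, 16, 60):
    print(n + 1, "entries, largest gap:", max_gap(n))
```

The largest gap shrinks as the rows get longer, which is the behavior described above.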

To make the Pascal filter coefficients, we pick the row with 61 numerical elements, and then we divide all the numbers in that row by 2^60 (2 to the 60th power), which is the sum of that row. As a check, both sets of coefficients should add up to 1.0000. (See this method explained.) Bingo, we now have two digital filters, so let’s compare them side by side.
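The construction just described can be sketched in a few lines of Python. This is one possible implementation, not necessarily the one used to produce the figures, and the variable names `pascal_coeffs` and `average_coeffs` are assumptions.

```python
import math
import numpy as np

N_TAPS = 61

# Row of Pascal's Triangle with 61 entries, kept as exact integers.
row = [math.comb(N_TAPS - 1, k) for k in range(N_TAPS)]
assert sum(row) == 2 ** (N_TAPS - 1)  # the row sums to exactly 2^60

# Dividing by 2^60 scales the coefficients so they sum to 1.
pascal_coeffs = np.array([c / 2 ** (N_TAPS - 1) for c in row])

# Ordinary 61-point average: every coefficient is 1/61.
average_coeffs = np.full(N_TAPS, 1.0 / N_TAPS)

print(pascal_coeffs.sum(), average_coeffs.sum())
print("center / end ratio:", pascal_coeffs[30] / pascal_coeffs[0])
```

The integer check is exact, while the floating-point sums agree with 1 only up to rounding error. The center-to-end ratio shows how strongly the Pascal set concentrates its weight in the middle.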

[Figure 1: Pascal vs. average filter coefficients (the Pascal set forms a bell curve)]

[Figure 2: Pascal vs. average filter coefficients, log scale]

Figure 1 compares the coefficients. The coefficients for the ordinary average are all exactly the same, while the Pascal coefficients look more “Gaussian.”

Figure 2 compares the coefficients on a log scale. The Pascal coefficients are larger in the middle, and become tiny toward the ends.

Applying the Filters to Price Data

[Figure 3: Frequency response of the two filters; magnitude (dB) vs. normalized frequency]

Figure 3 shows scientifically what the filters will do to price data. The ordinary average tends to weigh the information similarly at all frequencies, and removes some noise at all frequencies. However, the Pascal filter keeps nearly all information at low frequencies, and removes more signal and noise as the frequency increases.

Past the frequency of X = 0.31, computer precision is limited to about -300 dB, so the plot is not able to display how much more the filter is capable of removing. But we can visually extrapolate the graph to guess how it might look. The extrapolation suggests that the Pascal filter is powerful for removing noise in a stock price prediction algorithm.
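The precision floor need not stop the analysis. A row of Pascal's Triangle holds the coefficients of (1 + z)^60, so the gain of this filter has the closed form |H(f)| = cos(pi * f)^60, with f in cycles per sample. The sketch below assumes that Figure 3 uses this frequency convention (0.5 is the highest frequency), which the article does not state. It compares an FFT-based response with the closed form; the names `gain_db` and `pascal_gain_db_exact` are hypothetical.

```python
import math
import numpy as np

N = 60  # filter order; the filters have N + 1 = 61 coefficients
pascal = np.array([math.comb(N, k) / 2**N for k in range(N + 1)])
average = np.full(N + 1, 1.0 / (N + 1))

def gain_db(coeffs, n_fft=4096):
    """FFT-based gain in dB on a grid of frequencies (cycles per sample, 0 to 0.5)."""
    magnitude = np.abs(np.fft.rfft(coeffs, n_fft))
    freqs = np.fft.rfftfreq(n_fft)
    # Tiny positive floor avoids log10(0) where the FFT result is exactly zero.
    return freqs, 20.0 * np.log10(np.maximum(magnitude, 1e-300))

def pascal_gain_db_exact(f):
    """Closed-form gain of the Pascal filter: 20 * N * log10(cos(pi * f))."""
    return 20.0 * N * np.log10(np.cos(np.pi * f))

freqs, pascal_db = gain_db(pascal)
_, average_db = gain_db(average)

for f in (0.1, 0.31, 0.45):
    i = int(np.argmin(np.abs(freqs - f)))
    print(f"f={freqs[i]:.3f}  average={average_db[i]:7.1f} dB  "
          f"pascal(FFT)={pascal_db[i]:8.1f} dB  "
          f"pascal(exact)={pascal_gain_db_exact(freqs[i]):8.1f} dB")
```

Under that assumption, the closed form gives roughly -300 dB at f = 0.31, which matches the point where the plotted curve runs into the precision limit, and it keeps falling at higher frequencies where the FFT-based values can no longer be trusted. The ordinary average, by contrast, has side lobes that begin only about 13 dB below the passband.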

The Result: Noise Free Data

[Figure 4 and Figure 5: original vs. filtered price data, for the Pascal filter and the averaging filter]

Figure 4 and Figure 5 compare the original data and the filtered data. Naturally, the original data contains lots of noisy jiggles. However, data filtered by averaging tends to track an “average” path. Of course, some noise still remains in the signal. We can see that data filtered by Pascal creates a new smooth signal with significantly reduced noise.
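A comparison of this kind can be reproduced on synthetic data. The sketch below is only an illustration: the "prices" are a made-up slow sine trend plus random noise, not market data, and the `jiggle` measure (standard deviation of second differences) is one assumed way to quantify sample-to-sample wiggle, not a metric taken from the figures. Other metrics may rank the filters differently.

```python
import math
import numpy as np

# Synthetic "prices" (an assumption): slow trend plus independent noise.
rng = np.random.default_rng(42)
t = np.arange(5000)
trend = 100.0 + 10.0 * np.sin(2.0 * np.pi * t / 1000.0)
prices = trend + rng.normal(0.0, 1.0, t.size)

N = 60
pascal = np.array([math.comb(N, k) / 2**N for k in range(N + 1)])
average = np.full(N + 1, 1.0 / (N + 1))

def smooth(x, coeffs):
    """Apply the filter, keeping only outputs where the full window fits the data."""
    return np.convolve(x, coeffs, mode="valid")

def jiggle(x):
    """Standard deviation of second differences (hypothetical roughness measure)."""
    return float(np.std(np.diff(x, n=2)))

by_average = smooth(prices, average)
by_pascal = smooth(prices, pascal)

print("raw     :", jiggle(prices))
print("average :", jiggle(by_average))
print("pascal  :", jiggle(by_pascal))
```

On this synthetic input both filters cut the jiggle sharply, and the Pascal output has the least. Note that `mode="valid"` shortens the output by 60 points, and because both filters are symmetric, each output value corresponds to the input 30 samples before the newest point in its window.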

Conclusion: The Importance of Noise Removal for Stock Price Prediction Algorithms

Even if you do not follow all the math, the two filter examples emphasize the importance of removing noise in order to obtain more information about the true signal buried within it. We want to remove the noise because it distracts us from the truth, and thus reduces our ability to predict.

A large body of books and papers covers the subject of Digital Signal Processing and how to design filters. However, most of these references tend to lean toward certain formalized subject areas. While this formal background is helpful, the arena of stock and index prices has special characteristics. For example, the stock market has impulsive crowd dynamics and chaotic effects. These characteristics require that filter designs be based on first principles, with little connection to stochastic theory.

Summing up, whenever a system (such as the stock market) does not have visible inputs, option B is to look at the data itself for clues. Typically, you first remove the noise in some scientific way. Then, physics principles and DSP can be used to reveal dynamic movements.
