Natural language processing, widely known as NLP, is a subfield of artificial intelligence. It is used to create a link between humans and machines. NLP helps machines understand human language by training them on rules or data.
NLP is mainly used in speech technology, OCR, machine learning, and related areas. Common examples of NLP include email or text filters, predictive text in email or messaging, digital assistants, and data analysis.
Researchers have now taken NLP to the next level, where a machine is trained to understand the financial market's ups and downs. Using its data analysis capability, a machine can now attempt to predict how a specific stock price will behave in the future.
It is important to understand why we need an AI-powered system that tells us which stock to buy or avoid. A human can do this as well, but a machine beats a human through its computational power. It can analyze large volumes of historical and current financial data, along with data from social media and news, and then provide a prediction.
Investment is one of the most difficult decisions, and it may result in a huge profit or loss depending on the investor's analysis. It is crucial to reduce human error in these high-pressure situations so that profit can be maximized. Technical analysts believe that future prices can be forecast from past price movements.
Sentiment analysis uses text mining, natural language processing, and computational techniques to automatically extract sentiment from text. It aims to classify the polarity of a given text at the sentence level or class level, that is, whether it reflects a positive, negative, or neutral view. In the stock market prediction task, two important sources of data are used: text from social media or online financial news articles, and historical stock prices. Sentiment analysis decreases the risk factor by informing investors about the intricacies of the decision they are about to make. The closing price of a stock for a future date can be predicted by training machine learning models on the stock prices for previous dates. When sentiment analysis is applied to news from moneycontrol.com about the public sentiment or opinion on a stock, it becomes clearer whether or not to invest in that stock.
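As a rough illustration of polarity classification, the sketch below scores a headline against a tiny hand-made word list. The word lists, the example headlines, and the `classify_polarity` function are all hypothetical and exist only for illustration; a real system would more likely rely on a trained model or an established sentiment lexicon.

```python
# Hypothetical, hand-picked word lists; not a real sentiment lexicon.
POSITIVE_WORDS = {"gain", "gains", "profit", "surge", "beats", "growth", "upgrade"}
NEGATIVE_WORDS = {"loss", "losses", "drop", "falls", "misses", "lawsuit", "downgrade"}


def classify_polarity(text: str) -> str:
    """Return 'positive', 'negative', or 'neutral' for a piece of text."""
    # Lowercase and strip simple punctuation so words match the lists.
    words = [w.strip(".,!?;:'\"").lower() for w in text.split()]
    positives = sum(w in POSITIVE_WORDS for w in words)
    negatives = sum(w in NEGATIVE_WORDS for w in words)
    score = positives - negatives
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Made-up headlines, one per polarity class.
headlines = [
    "Company beats estimates as profit and growth surge",
    "Shares drop after lawsuit and analyst downgrade",
    "Company schedules annual shareholder meeting",
]
labels = [classify_polarity(h) for h in headlines]
```

Counting words is crude (it ignores negation and context), but it shows the basic idea of mapping text to a positive, negative, or neutral label.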
Block diagram representation:
The diagram above shows how data is fed to a machine and how rules are then applied to that data to make the prediction. The more data the machine consumes during training, the more accurate the result tends to be.
To show how this prediction model works, I have created several case studies and tested them with Amazon, Facebook, and Netflix stock prices.
Note: The code is written in Python using Google Colab.
The programs created for this article are very simple and show how to train the machine with a dataset and predict future stock prices. I have used SVM, LR, and Decision Tree models. These programs use stock files downloaded from a financial site (Yahoo Finance) as input data. They could also be enhanced to read from all types of social media and news sources and from multiple financial files.
SVM model:
Support Vector Machine is a supervised machine learning algorithm that can be used for both classification and regression problems. However, it is mostly used for classification.
LR model:
Linear regression originated in statistics as a model for understanding the relationship between numerical input and output variables. It was later adopted in machine learning, including natural language processing applications, so it is both a statistical algorithm and a machine learning algorithm.
Below are the steps used for the Decision Tree Classifier with the Amazon, Facebook, and Netflix stock files.
Case Study 1: With Facebook Stock
First, we load the stock file.
The next step is to make the 'Date' field the index.
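A minimal sketch of these two steps with pandas might look like the following. It assumes the usual Yahoo Finance column layout (Date, Open, High, Low, Close, Volume); the rows are made up, and an in-memory string stands in for the downloaded file so the example runs on its own.

```python
import io

import pandas as pd

# A few made-up rows standing in for a downloaded Yahoo Finance CSV.
# With a real download, pass the file path to pd.read_csv instead
# (for example "FB.csv", a hypothetical file name).
sample_csv = io.StringIO(
    "Date,Open,High,Low,Close,Volume\n"
    "2020-01-02,100.0,102.0,99.0,101.0,1000\n"
    "2020-01-03,101.0,104.0,100.5,103.0,1200\n"
    "2020-01-06,103.0,103.5,101.0,102.0,900\n"
    "2020-01-07,102.0,105.0,101.5,104.5,1500\n"
)

# Parse the Date column as datetimes while loading the file.
df = pd.read_csv(sample_csv, parse_dates=["Date"])

# Make Date the index and keep rows in chronological order.
df = df.set_index("Date").sort_index()
```

Indexing by date keeps the rows in time order, which matters later when each day is compared with the next one.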
Here we use an identifier, 1 or 0, to mark whether the stock price went up or down. We monitor the 'Close' field for this purpose: if the price is up the next day the value is 1, and if the price is down the value is 0. Please refer to the 'Price_Up' column.
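The labeling rule can be sketched with `shift(-1)`, which lines up each row with the next day's close. The prices below are made up, and treating an unchanged price as 0 is an assumption, since the text only covers the up and down cases.

```python
import pandas as pd

# Made-up closing prices indexed by date.
closes = pd.DataFrame(
    {"Close": [101.0, 103.0, 102.0, 102.0, 105.0]},
    index=pd.to_datetime(
        ["2020-01-02", "2020-01-03", "2020-01-06", "2020-01-07", "2020-01-08"]
    ),
)

# shift(-1) moves the next day's close onto the current row.
next_close = closes["Close"].shift(-1)

# 1 if tomorrow's close is higher than today's, otherwise 0
# (an unchanged price is labeled 0 here, which is an assumption).
closes["Price_Up"] = (next_close > closes["Close"]).astype(int)

# The last row has no next day, so its label is meaningless; drop it.
closes = closes.iloc[:-1]
```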
The next step is to prepare this dataset for model training:
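One way the preparation and scoring could be sketched is shown here. Synthetic random-walk prices replace the downloaded stock file so the example runs without any download. The chosen feature columns, `max_depth=3`, and the 80/20 split are assumptions, not the article's original settings.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic random-walk prices stand in for the downloaded stock file.
rng = np.random.default_rng(0)
close = 100 + np.cumsum(rng.normal(0, 1, 300))
data = pd.DataFrame(
    {
        "Open": close + rng.normal(0, 0.5, 300),
        "Close": close,
        "Volume": rng.integers(1_000, 5_000, 300),
    }
)

# Label each day with 1 if the next day's close is higher, else 0.
data["Price_Up"] = (data["Close"].shift(-1) > data["Close"]).astype(int)
data = data.iloc[:-1]  # the last day has no next-day close

X = data[["Open", "Close", "Volume"]]  # features
y = data["Price_Up"]                   # target

# shuffle=False keeps time order, so the test set lies after the training set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)

# Mean accuracy: the fraction of test days whose direction was predicted correctly.
score = model.score(X_test, y_test)
```

On random-walk data like this, the score should hover around chance level, which is a useful baseline when judging scores obtained on real stock files.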
The score above is the one obtained by the Decision Tree Classifier.
Below is the comparison of actual and predicted data:
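Such a comparison can be laid out side by side in a small table. In this sketch the two label lists are made up for illustration; in practice they would come from the held-out labels and the classifier's `predict` output.

```python
import pandas as pd
from sklearn.metrics import accuracy_score

# Made-up labels for illustration; in practice these would be the
# held-out Price_Up values and the model's predictions for the same rows.
actual = [1, 0, 0, 1, 1, 0]
predicted = [1, 0, 1, 1, 0, 0]

comparison = pd.DataFrame({"Actual": actual, "Predicted": predicted})

# Flag the rows where the prediction agrees with the actual label.
comparison["Match"] = comparison["Actual"] == comparison["Predicted"]

# Share of rows where the prediction matches the actual label.
accuracy = accuracy_score(actual, predicted)
```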
Case Study 2: With Amazon Stock
We use the same program here with a different stock file.
The prediction is shown below:
So far, we have seen the Decision Tree model, and its prediction score is not high enough. This is where the Support Vector Machine model or the Linear Regression model comes into the picture. The studies below have been done with both SVM and LR.
Case Study 3: This has been performed with the Wiki data for the Facebook stock price, which is available on Quandl.
First, we install the required packages as shown below. Note that this is a new program, so please follow the series of steps below.
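A minimal sketch of the SVM-versus-LR comparison might look like this. Synthetic prices take the place of the Quandl download so the example runs offline. The `forecast_out` horizon and the SVR settings (`kernel="rbf"`, `C=1e3`, `gamma=0.1`) are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Synthetic closing prices stand in for the downloaded Facebook data.
rng = np.random.default_rng(1)
close = 100 + np.cumsum(rng.normal(0.1, 1.0, 200))

forecast_out = 5  # assumed horizon: predict the close 5 days ahead

# Feature: today's close. Target: the close forecast_out days later.
X = close[:-forecast_out].reshape(-1, 1)
y = close[forecast_out:]

# Keep time order so the models are tested on later dates than they trained on.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

svr = SVR(kernel="rbf", C=1e3, gamma=0.1).fit(X_train, y_train)
lr = LinearRegression().fit(X_train, y_train)

# For regressors, score() returns R^2 (1.0 is a perfect fit; it can be negative).
svr_score = svr.score(X_test, y_test)
lr_score = lr.score(X_test, y_test)

# Predictions from both models for the same test rows, ready to compare.
svr_pred = svr.predict(X_test)
lr_pred = lr.predict(X_test)
```

Both scores are R² values, so they are directly comparable between the two models but not with the accuracy score of the classifier used earlier.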
SVM model score:
Linear Regression model score:
LR vs SVM prediction:
The case studies above help us understand how an AI-based prediction system can be built. The prediction score depends on multiple factors, such as the complexity of the logic and the training dataset. The process is not simply an attempt to predict a value; it draws on stock-related sentiment and risk analysis.
Now we will perform another case study, which shows a graphical representation of the prediction.
Case Study 4: Graphical representation of Amazon stock price prediction
To showcase this, the steps and code below were built.
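A sketch of how such plots could be produced is given here. Synthetic prices replace the Amazon stock file, and the 25-day `future_days` horizon is an assumption. Each model is trained to predict the close `future_days` ahead, and its predictions for the final stretch are drawn over the original series.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Synthetic closing prices stand in for the Amazon stock file.
rng = np.random.default_rng(2)
close = 100 + np.cumsum(rng.normal(0.1, 1.0, 150))

future_days = 25  # assumed prediction horizon

# Feature: today's close. Target: the close future_days later.
X = close[:-future_days].reshape(-1, 1)
y = close[future_days:]

# Hold out the last future_days rows; their targets are the final days of the series.
X_train, y_train = X[:-future_days], y[:-future_days]
X_future = X[-future_days:]

tree_pred = DecisionTreeRegressor(random_state=0).fit(X_train, y_train).predict(X_future)
lr_pred = LinearRegression().fit(X_train, y_train).predict(X_future)

# One panel per model: original prices with the predicted final stretch on top.
fig, axes = plt.subplots(1, 2, figsize=(12, 4), sharey=True)
days = np.arange(len(close))
for ax, pred, title in zip(axes, [tree_pred, lr_pred], ["Tree model", "LR model"]):
    ax.plot(days, close, label="Original")
    ax.plot(days[-future_days:], pred, label="Predicted")
    ax.set_title(title)
    ax.set_xlabel("Day")
    ax.legend()
axes[0].set_ylabel("Close price")
```

In a notebook such as Colab the figure is typically rendered when the cell finishes; in a plain script, `plt.show()` or `fig.savefig(...)` would be needed.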
Graphical representation of original and predicted values in the Tree model:
Graphical representation of original and predicted values in the LR model:
As mentioned earlier, such a system can perform more accurately if it is built to handle a massive volume of training data and is equipped with better hardware. It can also be interfaced with feeds in any number of languages: the machine would translate each feed into a common machine language and then perform its analysis.
Reference:
1. Wikipedia
2. Yahoo Finance
3. Forbes
4. Google Images