A NOVEL TIME SERIES FORECASTING METHOD
USING FUZZY INFORMATION RETRIEVAL
SYSTEM
A project report submitted in partial fulfillment of the requirements for B.Tech. Project
B.Tech.
by
Anuj Bhatt(2016IPG-017)
Prakhar Sharma(2016IPG-071)
Pranjal Srivastava(2016IPG-072)
ATAL BIHARI VAJPAYEE INDIAN INSTITUTE OF
INFORMATION TECHNOLOGY AND
MANAGEMENT-474 010
2019
CANDIDATES DECLARATION
We hereby certify that the work presented in this report, entitled "A Novel Time Series
Forecasting Method Using Fuzzy Information Retrieval System", in partial fulfillment of the requirements for the award of the Degree of Bachelor of Technology and submitted to the institution, is an
authentic record of our own work carried out during the period May 2019 to September 2019 under the
supervision of Dr. W. Wilfred Godfrey and Dr. Jeevaraj S. We have also cited the references for the
text(s)/figure(s)/table(s) taken from other sources.
Date:
Signatures of the Candidates
This is to certify that the above statement made by the candidates is correct to the best of my knowledge.
Date:
Signatures of the Research Supervisors
Abstract
A time series is an efficient way to study existing trends and to take future decisions based on the
results of that analysis. The stock market is one platform where time series are of the utmost importance
for predicting the future market trend. Even though stock markets exhibit a high level of entropy and
randomness, they are still driven to some extent by time and by the sentiments of market players.
Modelling the stock market is therefore a difficult task, as it involves both raw, random data and hidden
market sentiment. To address this problem, we convert the time series of raw stock market data, consisting
of Open, High, Low and Close values, into a fuzzy linguistic time series. Information retrieval systems are
used to find the documents most relevant to a query. Fuzzy information retrieval systems use this logic and
find the most relevant document based on the tf-idf scores of the values in the documents. The novelty of
the approach followed here is that we add to our list of parameters the different kinds of candlestick
patterns used to quantify trend reversals in a market (for example, hanging man and kicking bullish) and
the relative strength index (RSI) values, which signify the momentum of the market. Adding these factors
to the fuzzy representation of trends in our documents makes the future trend prediction more accurate.
Keywords: RSI, tf-idf, candlestick, document, fuzzy inference system, hanging man, kicking bullish,
fuzzy linguistic time series
ACKNOWLEDGEMENTS
We are highly indebted to Dr. W. Wilfred Godfrey and Dr. Jeevaraj S. and are obliged to them for giving us
the autonomy to function and to experiment with ideas. We would like to take this opportunity to
express our profound gratitude to them, not only for their academic guidance but also for their personal
interest in our project and their constant support, coupled with confidence-boosting and motivating sessions
which proved very fruitful and were instrumental in building our self-assurance. The nurturing and
blossoming of the present work is mainly due to their valuable guidance, suggestions, astute
judgment, constructive criticism and eye for perfection. Our mentors always answered our many
doubts with smiling graciousness and prodigious patience, never letting us feel that we were novices,
always lending an ear to our views, appreciating and improving them, and giving us a free hand in
our project. It is only because of their overwhelming interest and helpful attitude that the present work has
attained the stage it has.
Finally, we are grateful to our Institution and colleagues, whose constant encouragement served to renew
our spirit, refocus our attention and energy, and helped us in carrying out this work.
(Anuj Bhatt)
(Prakhar Sharma)
(Pranjal Srivastava)
Contents
List of Tables
List of Figures
1 Introduction and Literature Review
1.1 Introduction
1.1.1 Japanese Candlestick Theory
1.1.2 Candlestick Types (Trend Reversal Patterns)
1.1.3 Relative Strength Index
1.1.4 Fuzzy Logic Theory
1.1.5 Fuzzy Information Retrieval System
1.1.6 Literature Review
1.1.7 Objectives
2 Design Details and Implementation
2.1 Proposed Methodology
2.2 Data Collection
2.3 Rearrangement and Reformation of Data
2.3.1 RSI Calculation
2.3.2 RSI - Divergence
2.3.3 RSI - Swing Rejection
2.4 Cluster Classification and Function Definition
2.4.1 Fuzzification of candlestick properties
2.4.2 Previous Trend
2.4.3 Fuzzy Rules for Candlestick Classification
2.5 Document Formulation for Data
2.6 Document Formulation for Query
2.7 Document Matching and TF-IDF Score Calculation
3 Results and Discussion
3.1 Future Trend
3.2 Future Trend Prediction (Final Output)
4 Conclusion
Bibliography
List of Tables
3.1 BSE Sensex Data without RSI values
3.2 An example cluster of 5 days
3.3 BSE Sensex Data with RSI values
3.4 An example candlestick cluster of 5 days
3.5 An example fuzzy candlestick cluster of 5 days
List of Figures
1.1 A White Japanese Candlestick and a Black Japanese Candlestick
1.2 Marubozu Candlestick
1.3 Doji Candlestick
1.4 Umbrella Candlestick
1.5 Kicking Bearish Candlestick
1.6 Engulfing Bearish Candlestick
1.7 Bearish Harami Candlestick
1.8 Bearish Meeting Line Candlestick
1.9 Bearish Hanging Man Candlestick
1.10 Bearish One Black Crow Candlestick
1.11 Bearish Descending Hawk Candlestick
1.12 Bullish Kicking Candlestick
1.13 Bullish Engulfing Candlestick
1.14 Bullish Harami Candlestick
1.15 Bullish Meeting Line
1.16 Bullish Hammer Candlestick
1.17 Bullish Piercing Line Candlestick
1.18 Bullish Homing Pigeon Candlestick
1.19 Bullish One White Soldier Candlestick
2.1 Bullish Divergence
2.2 Bullish Swing Rejection
2.3 Membership function for US(k) & LS(k)
2.4 Membership function for BL(k)
2.5 Membership function for gap(k)
2.6 Membership function for trend(k)
2.7 Membership function for difclose(k), difopen(k) & difcentral(k)
3.1 Candlesticks for 2-July-2019 to 8-July-2019
3.2 RSI graph for 2-July-2019 to 8-July-2019
Chapter 1
Introduction and Literature Review
This chapter introduces time series and the fuzzy information retrieval system.
1.1
Introduction
The amount of information in the world is growing every day in both volume and complexity. The volume
of data now available exceeds the ability of existing search technologies and information retrieval
systems to provide the required information precisely and within bounded time. To address this problem,
we have tried to develop a novel fuzzy information retrieval system and to use it for time series
forecasting over a given dataset. A new aggregation operator is used in fuzzy information retrieval to
overcome the drawbacks of the existing methods. A complete model in Python is being developed to
demonstrate the concepts we propose.
1.1.1
Japanese Candlestick Theory
Traditional statistical inference systems lack the robustness needed to deal with complex real-world time
series, since they are based on strict assumptions, while computational inference systems ignore the
dependency structure of time series observations. To overcome the shortcomings of these two kinds of
model, a fuzzy information retrieval system is proposed and used as the inference system.
1.1.2
Candlestick Types (Trend Reversal Patterns)
• Basic Candlestick Types
Some basic candlestick types are defined below:
– Normal Candlestick: A candlestick with a significant body length and significant shadow
lengths.
– Marubozu: A candlestick with negligible shadow lengths.
– Doji: A candlestick with negligible body length and significant shadow lengths.
– Umbrella: A candlestick in which one of the shadows is negligible while the other is
significant, along with a very small body length.
Figure 1.1: A White Japanese Candlestick and a Black Japanese Candlestick
Figure 1.2: Marubozu Candlestick
Figure 1.3: Doji Candlestick
• Trend Reversal Patterns
Trend reversal patterns are characteristic candlestick patterns that indicate a reversal of the current
market trend. They are helpful in forecasting the future trend of the market.
– Bearish patterns, i.e. patterns which signify a downward future trend
– Bullish patterns, i.e. patterns which signify an upward future trend
Bearish Patterns
Bearish patterns indicate a bearish trend in the market. They can be of numerous types:
Figure 1.4: Umbrella Candlestick
– Kicking Bearish
This pattern is characterised by a white marubozu immediately followed by a black
marubozu. The new session opens below the open of the previous candlestick, so
there is a gap between the two candlesticks.
Figure 1.5: Kicking Bearish Candlestick
– Engulfing Bearish
Here the market is in an upward trend. The final white body is followed by a black
body that entirely engulfs the preceding white body.
Figure 1.6: Engulfing Bearish Candlestick
– Bearish Harami
The prevailing trend in the market is an upward trend. The final white body is followed by a
black body which is completely engulfed by the white body formed on the previous day.
Figure 1.7: Bearish Harami Candlestick
– Bearish Meeting Line
This pattern is characterised by a white candlestick formed on the first day followed by
a black candlestick the next day. The black candlestick opens at a sharply higher level than
the white candle but closes at the same level as the previous session's close.
Figure 1.8: Bearish Meeting Line Candlestick
– Bearish Hanging Man
The trend in the market before this pattern is an uptrend. It is characterised by the formation of a
small black body at the top of its daily trading range. It has a lower shadow twice the
size of its body and looks like a hanging man, which is how the pattern gets its name.
Figure 1.9: Bearish Hanging Man Candlestick
– Bearish One Black Crow
The market is in an upward trend. A white candlestick is formed on the first
day and is followed by a black candlestick. The black candlestick opens at a value lower than
the previous day's close and closes below the open of the previous candlestick.
Figure 1.10: Bearish One Black Crow Candlestick
– Bearish Descending Hawk
This pattern is observed when a big white body on the first day engulfs a smaller
white body formed on the next day. It is similar to the harami pattern except that both
candlesticks are white.
Figure 1.11: Bearish Descending Hawk Candlestick
Bullish Patterns
Bullish patterns indicate a bullish trend in the market. They can be of numerous types:
• Kicking Bullish
This pattern is characterised by a black marubozu and then immediately followed by a white
marubozu. The new session opens above the previous day’s opening.
Figure 1.12: Bullish Kicking Candlestick
• Engulfing Bullish
The market is in a downward trend in which a black body is observed on the first day.
On the second day a white body is formed which engulfs the previous day's black body.
Figure 1.13: Bullish Engulfing Candlestick
• Bullish Harami
The prevailing trend in the market is a downward trend. The black body formed on the first day is
followed by a white body which is engulfed by the black body.
Figure 1.14: Bullish Harami Candlestick
• Bullish Meeting line
This candlestick is characterized by a black candlestick formed on the first day followed by a white
candlestick the next day. The white candlestick closes at the same level as the previous session’s
close.
Figure 1.15: Bullish Meeting Line
• Bullish Hammer
The trend in the market before this pattern is a downtrend. It is characterised by the formation of a
small body, either black or white, at the top of its daily trading range. It has a lower shadow
twice the size of its body and looks like a hammer, which is how the pattern gets its name.
Figure 1.16: Bullish Hammer Candlestick
• Bullish Piercing Line
The market is in a downward trend. A black candlestick is formed on the first day
and is followed by a white candlestick. The white candlestick opens with a gap down and closes
halfway into the body of the black candlestick but not above it.
Figure 1.17: Bullish Piercing Line Candlestick
• Bullish Homing Pigeon
This pattern is observed when a big black body on the first day engulfs a smaller black
body formed on the next day. It is similar to the harami pattern except that both candlesticks
are black.
Figure 1.18: Bullish Homing Pigeon Candlestick
• Bullish One White Soldier
The prevailing trend in the market is a downward trend and is characterised by two candlesticks.
The candlestick on the first day has a black body and is followed by a white body. The white
candlestick opens above the previous day’s close and closes above its open.
Figure 1.19: Bullish One White Soldier Candlestick
1.1.3
Relative Strength Index
The relative strength index (RSI) is a momentum indicator that measures the magnitude of recent price
changes to evaluate overbought or oversold conditions in the price of a stock or other asset.
1.1.4
Fuzzy Logic Theory
L. A. Zadeh introduced fuzzy set theory, in which a fuzzy set F defined over a universe of discourse U is
a set of pairs:
$$F = \{(x, \mu_F(x)) : x \in U,\ \mu_F(x) \in [0, 1]\}$$
where $\mu_F(x)$ is called the membership degree of the element x in the fuzzy set F.
In the method proposed in this report, the concept of fuzzy logic is used to represent the approximate
nature of candlestick time series and its properties in terms of linguistic variables which are saved as a
collection of documents.
1.1.5
Fuzzy Information Retrieval System
Information retrieval systems are defined to obtain the information resource which is most relevant to a
query made to a collection of these resources. These are broadly classified as:
• Algebraic models transform queries and documents into mathematical objects such as vectors and matrices.
Examples:
– Vector space model
– Extended Boolean model
– Latent semantic indexing model
• Set-theoretic models transform queries and documents into sets of phrases. Similarities are then
derived using set theory.
Examples:
– Boolean model
– Fuzzy retrieval model
Boolean models use a Boolean indexing process and are therefore intolerant of any imprecision
in the information. To overcome this limitation, fuzzy retrieval systems are being developed. Fuzzification
can effectively handle the vagueness that users introduce in queries and is also very effective
in estimating the partial relevance of documents to a query. In the method proposed in this report,
fuzzy information retrieval is used to fetch the most relevant document, which is then used to
predict the future trend with respect to the query.
1.1.6
Literature Review
According to Fama [3], stock markets are random walks, and this limits the predictability of a stock
market. As per Bagheri [2], there are mainly two kinds of tools for predicting a stock market trend:
fundamental analysis and technical analysis. Fundamental analysis utilises knowledge of the structure
of the company and the market it functions in. Technical analysis uses data mining techniques to find
association rules in the dataset. The approach used in the method proposed in this report is based on
technical analysis.
Zhang [6] used neural networks with Improved Bacterial Chemotaxis Optimization (IBCO) for predicting
stock market values. L. Wang proposed the conversion of a normal time series into a fuzzy time series and
used it to make stock market predictions. The data was fuzzified to the cluster centers in their approach.
W. Zhang suggested methods for indexing and classification of text, although the assessment of semantic
and statistical qualities of text is still not standardised.
Attia [4] proposed a linguistic fuzzy information retrieval model. Gupta proposed that performance is
increased by the use of fuzzy logic. Korol [7] proposed a system that works by using fuzzy rules contained in
a knowledge base. Partha [10] designed a time series forecasting method by utilising document retrieval
and a modified tf-idf scheme.
Gupta [9] proposed a ranking function which was used to find the most relevant document corresponding
to a query based on the weights of the terms. Fuzzy logic was used to implement ranking on two
levels, thus increasing the total number of fuzzy rules and the accuracy of the output.
Zadrozny [8] proposed a new information retrieval (IR) system based on Zadeh's calculus of linguistic
statements. This model extends the usual method of information retrieval, which is based on finding the
most relevant document out of the pool.
Hong [11] drew a comparison between the various power-mean averaging operators currently used
in retrieving relevant documents in information retrieval systems. They proposed weighted power-mean
averaging operators which found the most relevant document based on the cumulative weights of each
term in the query and the document base.
Naranjo [14] proposed a way to identify candlestick patterns in the stock market by making use of
fuzzy logic. The use of fuzzy logic made it easier to quantify the uncertainty of the market. The
performance was tested on two different stock markets, Nasdaq-100 and Eurostoxx. Using fuzzy rules
and candlesticks, they were able to improve results, as the approach was less risky and showed stable
behavior on the markets on which it was tested.
The literature review helped in the formulation of these observations:
• Forecasting is a very complex process, and if the time series data involves a
financial aspect, the complexity increases further.
• In most approaches, information retrieval systems have a very limited purpose, i.e.
assigning a relevance score to documents and returning the document with the highest score for the
query.
• A time series contains a lot of information. A forecasting algorithm can give accurate results
only if that information is extracted completely.
1.1.7
Objectives
The main objective of this project is to design a time series forecasting system by utilising fuzzy logic.
Sub-objectives are as follows:
• To convert given time series data to a linguistic fuzzy time series data. This is done by fuzzifying
all the relevant properties of time series.
• To convert the fuzzified time series into documents.
• To design a fuzzy information retrieval system that utilises these documents to predict future
trends.
• To improve the existing ranking functions used in calculating relevance of a document to a query.
Chapter 2
Design Details and Implementation
This chapter covers the design of our project and its implementation.
2.1
Proposed Methodology
The methodology used in this report is presented in Fig 2.1. The historical stock
market data contains the Open, High, Low and Close values of each day. This data is represented as Japanese
candlesticks. The properties extracted from these candlesticks are then fuzzified, and the fuzzified data
is used to form the rule base of the model. The fuzzy rule base is saved as a document
corpus, which serves as the input to the information retrieval system. Simultaneously, we develop fuzzy
queries which are used in the information retrieval process. We use the tf-idf scheme to perform the fuzzy
query processing. The result of this process gives the forecasted trend.
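The retrieval step described above can be illustrated with a minimal sketch. It uses plain tf-idf weighting with cosine similarity, which is an assumption made for illustration and not necessarily the scoring scheme used in this project. The token names (for example `BODY_BLACK_LONG`) and the function names are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build a tf-idf vector (term -> weight) per document, plus the idf table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed idf (an assumption) so that every known term gets a positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [
        {t: (c / len(doc)) * idf[t] for t, c in Counter(doc).items()}
        for doc in docs
    ]
    return vectors, idf

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def best_match(query, docs):
    """Return the index of the most similar document and all similarity scores."""
    vectors, idf = tfidf_vectors(docs)
    q = {t: (c / len(query)) * idf.get(t, 0.0) for t, c in Counter(query).items()}
    scores = [cosine(q, v) for v in vectors]
    return max(range(len(docs)), key=scores.__getitem__), scores

# Hypothetical documents: each is a bag of fuzzy labels for one cluster.
docs = [
    ["BODY_WHITE_LONG", "US_NULL", "LS_NULL", "RSI_OVERSOLD"],
    ["BODY_BLACK_LONG", "US_SHORT", "LS_NULL", "RSI_OVERBOUGHT"],
    ["BODY_EQUAL", "US_LONG", "LS_LONG", "RSI_NEUTRAL"],
]
query = ["BODY_BLACK_LONG", "US_SHORT", "RSI_OVERBOUGHT"]
index, scores = best_match(query, docs)
```

In a forecasting setting, the trend recorded alongside the best-matching document would then be read off as the predicted trend for the query.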
2.2
Data Collection
We collected S&P BSE SENSEX index data from BSE India's website. This data contains the opening,
closing, high and low values for each trading session from 1st January 1991 to 31st May
2019. It is stored as a CSV file.
2.3
Rearrangement and Reformation of Data
In this section we rearrange the data as clusters of 5 consecutive days:
1. (-> Monday -> Tuesday -> Wednesday -> Thursday -> Friday)
2. (-> Tuesday -> Wednesday -> Thursday -> Friday -> Saturday)
3. (-> Wednesday -> Thursday -> Friday -> Saturday -> Sunday)
4. (-> Thursday -> Friday -> Saturday -> Sunday -> Monday)
5. (-> Friday -> Saturday -> Sunday -> Monday -> Tuesday)
A cluster is formed starting from each consecutive day; this way we were able to create a larger number of
clusters for the given time period.
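The overlapping clusters described in this section amount to a sliding window of length 5 that advances one session at a time. A minimal sketch follows; the function name `sliding_clusters` is hypothetical and the day names stand in for rows of the dataset.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

def sliding_clusters(sessions: Sequence[T], size: int = 5) -> List[List[T]]:
    """Return every run of `size` consecutive sessions, one starting at each session."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [list(sessions[i:i + size]) for i in range(len(sessions) - size + 1)]

days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
clusters = sliding_clusters(days)
# Seven sessions yield three overlapping clusters of five sessions each.
```

A series of n sessions yields n − 4 clusters this way, compared with roughly n / 5 if the windows did not overlap.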
2.3.1
RSI Calculation
Let $X^j_k$ represent the value j (Close, Low, High or Open (OPCL)) on the kth day. We define
change as the difference between the closing values on the kth and (k − 1)th day; its absolute value is the
size of the move.
i.e.
$$\mathrm{change} = X^C_k - X^C_{k-1}$$
Let $U_k$ and $D_k$ represent the upward movement and downward movement respectively on the kth day. Then,
$$U_k = \begin{cases} 0 & \mathrm{change} \le 0 \\ \mathrm{change} & \mathrm{change} > 0 \end{cases}$$
and
$$D_k = \begin{cases} |\mathrm{change}| & \mathrm{change} < 0 \\ 0 & \mathrm{change} \ge 0 \end{cases}$$
Let $\bar{U}_k$ and $\bar{D}_k$ be the average upward movement and average downward movement from the (k − 4)th day
to the kth day respectively.
Then,
$$\bar{U}_k = \frac{\sum_{i=k-4}^{k} U_i}{5}$$
and
$$\bar{D}_k = \frac{\sum_{i=k-4}^{k} D_i}{5}$$
Relative Strength (RS) is defined as the average upward movement divided by the average downward movement.
i.e.
$$RS_k = \frac{\bar{U}_k}{\bar{D}_k}$$
and the relative strength index (RSI) is formulated as
$$RSI_k = 100 - \frac{100}{1 + RS_k}$$
2.3.2
RSI - Divergence
Divergence is an indicator of an upcoming trend reversal in the stock market. It may be bullish or bearish in
nature.
• Bullish divergence occurs when the market is in a downtrend and the trend is about to reverse.
It is characterised by the RSI dropping into the oversold category, i.e.
below 30, after which the RSI forms a higher low while the price forms correspondingly lower lows.
The RSI reading stays in the oversold region for some time while gaining strength, signifying
an upcoming reversal to a bullish trend.
• Bearish divergence occurs when the market is in an uptrend and the trend is about to reverse.
It is characterised by the RSI rising into the overbought region, i.e.
above 70, after which the RSI forms a lower high while the price forms correspondingly higher highs.
The RSI reading stays in the overbought region for some time while losing strength, signifying
an upcoming reversal to a bearish trend.
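The RSI readings that these divergence conditions refer to follow the formulas of Section 2.3.1. A minimal sketch, assuming a 5-day simple average and the 30/70 thresholds quoted above, is given here. The function names are hypothetical, and returning 100 when there is no downward movement in the window is a convention chosen for this sketch, since RS is undefined in that case.

```python
from typing import List, Optional, Sequence

def rsi_series(closes: Sequence[float], window: int = 5) -> List[Optional[float]]:
    """RSI per day using simple averages of up/down moves over `window` days.

    Days without a full window of price changes are reported as None.
    """
    ups, downs = [], []
    for prev, cur in zip(closes, closes[1:]):
        change = cur - prev
        ups.append(max(change, 0.0))     # U_k
        downs.append(max(-change, 0.0))  # D_k, stored as a magnitude
    out: List[Optional[float]] = [None] * len(closes)
    for k in range(window, len(closes)):
        # The change for day k sits at index k - 1, so this slice covers days k-window+1 .. k.
        avg_up = sum(ups[k - window:k]) / window
        avg_down = sum(downs[k - window:k]) / window
        if avg_down == 0:
            out[k] = 100.0  # assumed convention: no down moves in the window
        else:
            rs = avg_up / avg_down
            out[k] = 100.0 - 100.0 / (1.0 + rs)
    return out

def rsi_zone(value: float) -> str:
    """Label an RSI reading with the thresholds used in this section."""
    if value < 30:
        return "OVERSOLD"
    if value > 70:
        return "OVERBOUGHT"
    return "NEUTRAL"

example = rsi_series([10, 11, 10, 11, 10, 11])
```

In the example, three up moves and two down moves of equal size give RS = 1.5 and an RSI of about 60 on the last day, which falls in the neutral zone.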
Figure 2.1: Bullish Divergence
2.3.3
RSI - Swing Rejection
Swing rejection is another indicator of an upcoming trend reversal in the stock market. It may also be bullish
or bearish in nature. Bullish swing rejection is characterised by a bullish trend after a downtrend in the
market; bearish swing rejection is characterised by a bearish trend after an uptrend in the market.
• Bullish swing rejection occurs when the market is in a downtrend and the RSI falls into the oversold
territory (<30). The RSI then climbs back above 30 and keeps on climbing. After some time the RSI dips
without crossing back into the oversold territory, and then climbs higher than its previous high,
signifying an uptrend or a trend reversal.
• Bearish swing rejection occurs when the market is in an uptrend and the RSI climbs into the overbought
territory (>70). The RSI then falls back below 70 and keeps on falling. After some time the RSI rises
without crossing back into the overbought territory, and then falls lower than its previous low,
signifying a downtrend or a trend reversal.
For the bullish case, the steps are:
– RSI falls into oversold territory.
– RSI crosses back above 30.
– RSI forms another dip without crossing back into oversold territory.
– RSI then breaks its most recent high.
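The four bullish steps can be read as a small state machine over the RSI series. The sketch below is one possible interpretation, not the project's implementation; the function name and the choice to restart whenever the RSI drops below the threshold again are assumptions.

```python
from typing import Optional, Sequence

def bullish_swing_rejection(rsi: Sequence[Optional[float]],
                            oversold: float = 30.0) -> Optional[int]:
    """Return the index at which the RSI breaks its most recent high, or None."""
    state = "waiting"
    recent_high = None
    for i, value in enumerate(rsi):
        if value is None:
            continue  # RSI not yet defined for this day
        if value < oversold:
            # Step 1 (or a restart): RSI is in oversold territory.
            state = "oversold"
            recent_high = None
        elif state == "oversold":
            # Step 2: RSI has crossed back above the threshold.
            state = "recovering"
            recent_high = value
        elif state == "recovering":
            if value >= recent_high:
                recent_high = value  # still climbing, track the high
            else:
                state = "dip"        # Step 3: a dip that stays above the threshold
        elif state == "dip" and value > recent_high:
            return i                 # Step 4: the most recent high is broken
    return None

signal = bullish_swing_rejection([50, 40, 25, 35, 45, 38, 42, 48])
```

In the example the RSI dips to 25, recovers to 45, pulls back to 38 without re-entering oversold territory, and then reaches 48, so the signal fires at the last index. The bearish case is the mirror image around the overbought threshold.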
Figure 2.2: Bullish Swing Rejection
2.4
Cluster Classification and Function Definition
Let high(k), low(k), open(k) and close(k) be the highest, lowest, opening and closing values, and let
US(k), LS(k) and BL(k) be the upper shadow, lower shadow and body length respectively for a trading day k.
Then,
$$US(k) = 100 \cdot \frac{\mathrm{high}(k) - \max(\mathrm{open}(k), \mathrm{close}(k))}{\mathrm{open}(k)}$$
$$LS(k) = 100 \cdot \frac{\min(\mathrm{open}(k), \mathrm{close}(k)) - \mathrm{low}(k)}{\mathrm{open}(k)}$$
$$BL(k) = 100 \cdot \frac{\mathrm{close}(k) - \mathrm{open}(k)}{\mathrm{close}(k)}$$
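The three candlestick properties translate directly into code. The sketch below is illustrative; the `Candle` class and function names are hypothetical, and the sample values are made up.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candle:
    open: float
    high: float
    low: float
    close: float

def upper_shadow(c: Candle) -> float:
    """US(k): distance from the top of the body to the high, as a percentage of the open."""
    return 100.0 * (c.high - max(c.open, c.close)) / c.open

def lower_shadow(c: Candle) -> float:
    """LS(k): distance from the low to the bottom of the body, as a percentage of the open."""
    return 100.0 * (min(c.open, c.close) - c.low) / c.open

def body_length(c: Candle) -> float:
    """BL(k): signed body size as a percentage of the close (positive for a white candle)."""
    return 100.0 * (c.close - c.open) / c.close

candle = Candle(open=100.0, high=110.0, low=95.0, close=105.0)
features = (upper_shadow(candle), lower_shadow(candle), body_length(candle))
```

For this white candle both shadows come out at 5 percent and the body at roughly 4.76 percent. The sign of the body length is what later distinguishes the BLACK and WHITE labels.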
To gain more insight into this time series data, we define the following fuzzy variables:
• gap: It is non-zero if and only if the highest value of the preceding day is less than the lowest value
encountered today. It is the gap size expressed as a percentage.
i.e.
$$\mathrm{gap}(k) = \begin{cases} 0 & \mathrm{low}(k) \le \mathrm{high}(k-1) \\ 100 \cdot \dfrac{\mathrm{low}(k) - \mathrm{high}(k-1)}{\mathrm{low}(k)} & \text{in other cases} \end{cases}$$
• trend: It represents the trend of the last two candlesticks, i.e. whether they represent a bullish or a
bearish trend.
$$\mathrm{trend}(k) = 100 \cdot \frac{\mathrm{close}(k) - \mathrm{close}(k-1)}{\mathrm{close}(k)}$$
• open-difference: It is defined as the percentage difference between low(k-1) and open(k).
i.e.
$$\mathrm{dopen}(k) = \begin{cases} 0 & \mathrm{low}(k-1) \le \mathrm{open}(k) \\ 100 \cdot \dfrac{\mathrm{low}(k-1) - \mathrm{open}(k)}{\mathrm{low}(k)} & \text{in other cases} \end{cases}$$
• central-difference: It is defined as the percentage difference between the closing value on some
day and the average of the open and close values of the preceding day.
i.e.
$$\mathrm{dcentral}(k) = \begin{cases} 0 & \mathrm{close}(k) \le \dfrac{\mathrm{open}(k-1) + \mathrm{close}(k-1)}{2} \\ 100 \cdot \dfrac{\mathrm{close}(k) - (\mathrm{open}(k-1) + \mathrm{close}(k-1))/2}{\mathrm{close}(k)} & \text{in other cases} \end{cases}$$
• closing-difference: It is defined as the percentage difference between high(k-1) and close(k).
i.e.
$$\mathrm{dclose}(k) = \begin{cases} 0 & \mathrm{close}(k) \le \mathrm{high}(k-1) \\ 100 \cdot \dfrac{\mathrm{close}(k) - \mathrm{high}(k-1)}{\mathrm{close}(k)} & \text{in other cases} \end{cases}$$
2.4.1
Fuzzification of candlestick properties
Fig 2.3 describes the fuzzy membership function used for fuzzifying the upper shadow length and
lower shadow length of a candlestick. It converts the crisp value into one of four linguistic labels:
NULL, SHORT, MIDDLE and LONG.
Similarly, Figs 2.4 to 2.7 describe the membership functions used for fuzzifying the other crisp variables
defined above.
For body length, there are 7 possible labels: BLACK_LONG, BLACK_MIDDLE, BLACK_SHORT,
EQUAL, WHITE_SHORT, WHITE_MIDDLE and WHITE_LONG.
For gap, close-difference, central-difference and open-difference, there are 4 possible values: NULL,
SHORT, MIDDLE and LONG.
For trend, there are 7 possible values: LONG_BEARISH, MIDDLE_BEARISH, SHORT_BEARISH,
NULL, SHORT_BULLISH, MIDDLE_BULLISH and LONG_BULLISH.
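To illustrate how a crisp value is turned into one of these labels, the sketch below uses trapezoidal membership functions and picks the label with the highest membership degree. The breakpoints in `SHADOW_SETS` are hypothetical placeholders chosen for illustration; the actual shapes are the ones drawn in the membership function figures, which this sketch does not reproduce.

```python
INF = float("inf")

def trapezoid(x, a, b, c, d):
    """Membership that rises on [a, b], equals 1 on [b, c] and falls on [c, d]."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Hypothetical breakpoints (in percent) for the shadow-length labels.
SHADOW_SETS = {
    "NULL": (0.0, 0.0, 0.05, 0.15),
    "SHORT": (0.05, 0.15, 0.4, 0.6),
    "MIDDLE": (0.4, 0.6, 1.2, 1.6),
    "LONG": (1.2, 1.6, INF, INF),
}

def memberships(x, sets):
    """Membership degree of x in every fuzzy set."""
    return {label: trapezoid(x, *params) for label, params in sets.items()}

def fuzzify(x, sets):
    """Return the label whose membership degree is highest for x."""
    degrees = memberships(x, sets)
    return max(degrees, key=degrees.get)

labels = [fuzzify(v, SHADOW_SETS) for v in (0.0, 0.3, 1.0, 5.0)]
```

The same `fuzzify` helper could be reused for body length, gap, trend and the difference variables by supplying a different dictionary of sets with 4 or 7 labels.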
Figure 2.3: Membership function for US(k) & LS(k)
Figure 2.4: Membership function for BL(k)
Figure 2.5: Membership function for gap(k)
Figure 2.6: Membership function for trend(k)
Figure 2.7: Membership function for difclose(k), difopen(k) & difcentral(k)
2.4.2
Previous Trend
The crisp value of the previous trend is derived as the average of the crisp trend values (defined by the
trend(k) function) of the first three days of the cluster.
i.e.
$$\mathrm{PrevTrend}(\mathrm{cluster}) = \frac{\mathrm{trend}(\mathrm{day}_1) + \mathrm{trend}(\mathrm{day}_2) + \mathrm{trend}(\mathrm{day}_3)}{3}$$
The fuzzy value of the previous trend is generated by using the membership function shown in Figure 2.6.
• If the fuzzy value comes out to be one of LONG_BEARISH, MIDDLE_BEARISH or
SHORT_BEARISH, we call the value of previous trend BEARISH.
• If the fuzzy value comes out to be NULL, we call the value of previous trend NEUTRAL.
• If the fuzzy value comes out to be one of LONG_BULLISH, MIDDLE_BULLISH or
SHORT_BULLISH, we call the value of previous trend BULLISH.
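A compact sketch of the previous-trend computation is given below. Because the three bearish labels collapse to BEARISH and the three bullish labels to BULLISH, only the extent of the NULL region matters for the final value; the `null_band` parameter is a hypothetical stand-in for that region, and the function names are likewise illustrative.

```python
from typing import Sequence

def trend(close_today: float, close_prev: float) -> float:
    """trend(k): percentage change of the close relative to the previous close."""
    return 100.0 * (close_today - close_prev) / close_today

def previous_trend(closes: Sequence[float], null_band: float = 0.05) -> str:
    """Classify the average trend of three days.

    `closes` holds four values: the close of the day before the cluster,
    followed by the closes of days 1 to 3 of the cluster.
    """
    trends = [trend(closes[i], closes[i - 1]) for i in range(1, 4)]
    average = sum(trends) / 3
    if average < -null_band:
        return "BEARISH"
    if average > null_band:
        return "BULLISH"
    return "NEUTRAL"

rising = previous_trend([100.0, 101.0, 102.0, 103.0])
```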
2.4.3
Fuzzy Rules for Candlestick Classification
Kicking Bullish : This candlestick is observed if :
• value of fourth fuzzified candlestick day variable FuzzyUpper is NULL AND
• value of fourth fuzzified candlestick day variable FuzzyLower is NULL AND
• value of fifth fuzzified candlestick day variable FuzzyUpper is NULL AND
• value of fifth fuzzified candlestick day variable FuzzyLower is NULL AND
• value of fifth day variable of Low > value of fourth day variable of High AND
• value of fourth candlestick day variable of Body < -0.5 AND
• value of fifth candlestick day variable of Body > 0.5
Piercing Line : This candlestick is observed if:
• value of fourth candlestick day variable Body is <-0.5 AND
• value of fifth candlestick day variable Body is >0.5 AND
• value of fifth day variable of Open < value of fourth day variable of Low AND
• value of fifth day variable of Close > value of fourth day variable of Body/2 AND
• value of fifth day variable of Close < value of fourth day variable of Open AND
• value of fourth fuzzified candlestick day variable FuzzyBody is BLACKMIDDLE OR value of
fourth fuzzified candlestick day variable FuzzyBody is BLACKLONG
Engulfing (bullish) : This candlestick is observed if:
• fourth candlestick day variable Body < -0.5 AND
• fifth candlestick day variable Body > 0.5 AND
• value of fourth day variable of High <= value of fifth day variable of Close AND
• value of fourth day variable of Low >= value of fifth day variable of Open
Harami (bullish) : This candlestick is observed if:
• fourth candlestick day variable Body < -0.5 AND
• fifth candlestick day variable Body > 0.5 AND
• value of fourth day variable of Open >= value of fifth day variable of High AND
• value of fourth day variable of Close <= value of fifth day variable of Low
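The Engulfing and Harami rules just above (the variants with a black fourth day and a white fifth day) share the same body thresholds and differ only in which day contains the other. A minimal sketch, assuming a hypothetical `Day` record with crisp OHLC values and the normalised body:

```python
from dataclasses import dataclass


@dataclass
class Day:
    """Hypothetical per-day record; field names are assumptions."""
    open: float
    high: float
    low: float
    close: float
    body: float  # crisp normalised body value


def _black_then_white(fourth: Day, fifth: Day) -> bool:
    # Shared precondition of both rules.
    return fourth.body < -0.5 and fifth.body > 0.5


def is_engulfing(fourth: Day, fifth: Day) -> bool:
    """Fifth day's open-close span covers the fourth day's high-low range."""
    return (_black_then_white(fourth, fifth)
            and fourth.high <= fifth.close
            and fourth.low >= fifth.open)


def is_harami(fourth: Day, fifth: Day) -> bool:
    """Fifth day's high-low range sits inside the fourth day's open-close span."""
    return (_black_then_white(fourth, fifth)
            and fourth.open >= fifth.high
            and fourth.close <= fifth.low)
```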
Inverted Hammer : This candlestick is observed if:
• value of fifth day variable of Low < value of fourth day variable of Low AND
• value of fifth fuzzified candlestick day variable FuzzyLower is NULL AND
• value of fourth day variable of Body <-0.5 AND
• (value of fifth day variable of Low - MIN (value of fifth day variable of Open, value of fifth day
variable of Close)) < value of fifth day variable of Body/5 AND
• value of fifth day variable of High - MAX(value of fifth day variable of Open , value of fifth day
variable of Close)> 2*ABS(value of fifth day variable of Open - value of fifth day variable of
Close)
One White Soldier : This candlestick is observed if:
• value of fourth day variable of Body < -0.5 AND
• value of fifth day variable of Body > 0.5 AND
• value of fifth day variable of Open > value of fourth day variable of Close AND
• value of fifth day variable of Close > value of fourth day variable of Open AND
• (value of fourth fuzzified candlestick day variable FuzzyBody is BLACKMIDDLE OR value of
fourth fuzzified candlestick day variable FuzzyBody is BLACKLONG)
Homing Pigeon : This candlestick is observed if:
• value of fourth day variable of Body < -0.5 AND
• value of fifth day variable of Body < -0.5 AND
• value of fourth day variable of High > value of fifth day variable of High AND
• value of fourth day variable of Low < value of fifth day variable of Low
Meeting Line (bullish) : This candlestick is observed if:
• value of fourth day variable of Body < -0.5 AND
• value of fifth day variable of Body > 0.5 AND
• ((value of fourth day variable of Close - value of fifth day variable of Close) / value of fourth day
variable of Close) <= 0.5 AND
• ((value of fourth day variable of Close - value of fifth day variable of Close) / value of fourth day
variable of Close) >= 0
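The last two conditions of this Meeting Line rule bound the relative difference between the two closes, which is easier to see when the ratio is computed once. A sketch with hypothetical names (`Day`, `is_meeting_line`), following the rule as stated:

```python
from dataclasses import dataclass


@dataclass
class Day:
    """Hypothetical per-day record; only the fields this rule needs."""
    close: float
    body: float  # crisp normalised body value


def is_meeting_line(fourth: Day, fifth: Day) -> bool:
    """Black fourth day, white fifth day, closes within the stated relative band."""
    if fourth.close == 0:
        raise ValueError("fourth-day close must be non-zero")
    relative_gap = (fourth.close - fifth.close) / fourth.close
    return (fourth.body < -0.5
            and fifth.body > 0.5
            and 0 <= relative_gap <= 0.5)
```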
Kicking Bearish : This candlestick is observed if:
• value of fourth fuzzified candlestick day variable FuzzyUpper is NULL AND
• value of fourth fuzzified candlestick day variable FuzzyLower is NULL AND
• value of fifth fuzzified candlestick day variable FuzzyUpper is NULL AND
• value of fifth fuzzified candlestick day variable FuzzyLower is NULL AND
• value of fourth day variable Low > value of fifth day variable High AND
• value of fourth candlestick day variable of Body > 0.5 AND
• value of fifth candlestick day variable of Body < -0.5
Engulfing (bearish) : This candlestick is observed if:
• value of fourth day variable of Body > 0.5 AND
• value of fifth day variable of Body < -0.5 AND
• value of fourth day variable of High <= value of fifth day variable of Open AND
• value of fourth day variable of Low >= value of fifth day variable of Close
Harami (bearish) : This candlestick is observed if:
• value of fourth candlestick day variable Body > 0.5 AND
• value of fifth candlestick day variable Body < -0.5 AND
• value of fourth day variable of Close >= value of fifth day variable of High AND
• value of fourth day variable of Open <= value of fifth day variable of Low
Meeting Line (bearish) : This candlestick is observed if:
• value of fourth day variable Body > 0.5 AND
• value of fifth day variable of Body < -0.5 AND
• ((value of fifth day variable of Close - value of fourth day variable of Close) / value of fifth day
variable of Close) <= 0.5 AND
• ((value of fifth day variable of Close - value of fourth day variable of Close) / value of fifth day
variable of Close) >= 0
Hanging Man : This candlestick is observed if:
• value of fifth day variable High > value of fourth day variable High AND
• value of fifth fuzzified candlestick day variable FuzzyUpper is NULL AND
• (value of fifth day variable High - MAX(value of fifth day variable Open, value of fifth day variable
Close)) < value of fifth day variable Body/5 AND
• (MIN(value of fifth day variable Open, value of fifth day variable Close) - value of fifth day
variable Low) > 2*ABS(value of fifth day variable Open - value of fifth day variable Close)
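The Hanging Man rule is mostly shadow arithmetic: a very small upper shadow and a lower shadow more than twice the open-close distance. A sketch that follows the conditions literally, assuming a hypothetical `Day` record whose `body` field holds the normalised Body value:

```python
from dataclasses import dataclass


@dataclass
class Day:
    """Hypothetical per-day record; field names are assumptions."""
    open: float
    high: float
    low: float
    close: float
    body: float       # crisp normalised body value
    fuzzy_upper: str  # fuzzified upper-shadow label


def is_hanging_man(fourth: Day, fifth: Day) -> bool:
    upper_shadow = fifth.high - max(fifth.open, fifth.close)
    lower_shadow = min(fifth.open, fifth.close) - fifth.low
    return (fifth.high > fourth.high
            and fifth.fuzzy_upper == "NULL"
            and upper_shadow < fifth.body / 5
            and lower_shadow > 2 * abs(fifth.open - fifth.close))
```

Because the upper-shadow test compares against `body / 5`, the rule as written can only hold when the fifth day's Body value is positive.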
Descending Hawk : This candlestick is observed if:
• value of fourth day variable Body > 0.5 AND
• value of fifth day variable Body > 0.5 AND
• value of fourth day variable Close > value of fifth day variable High AND
• value of fourth day variable Open < value of fifth day variable Low
One Black Crow : This candlestick is observed if:
• value of fourth day variable Body > 0.5 AND
• value of fifth day variable Body < -0.5 AND
• (value of fourth fuzzified candlestick day variable FuzzyBody is WHITEMIDDLE OR value of
fourth fuzzified candlestick day variable FuzzyBody is WHITELONG) AND
• value of fifth day variable Close < value of fourth day variable Low AND
• value of fifth day variable Open > value of fourth day variable Body/2
Dark Cloud Cover : This candlestick is observed if:
• value of fourth day variable Body > 0.5 AND
• value of fifth day variable Body < -0.5 AND
• (value of fourth fuzzified candlestick day variable FuzzyBody is WHITEMIDDLE OR value of
fourth fuzzified candlestick day variable FuzzyBody is WHITELONG) AND
• value of fifth day variable Open > value of fourth day variable Close AND
• value of fifth day variable Close > value of fourth day variable Open
2.5 Document Formulation for Data
In our method, the data of each fuzzified cluster is categorised into three documents, representing
bearish, neutral and bullish future trends.
A document is formulated as a string of the fuzzified values of:
Previous Candlestick Trend + Identified Candlestick Cluster + Previous RSI Trend + Divergence + Swing
Rejection.
Each such string is appended to one of the three documents mentioned above according to its future
trend.
For example: Bullish + Hammer + Overbought + Bearish Divergence + No Swing Rejection
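A minimal sketch of this step, assuming the five fuzzified values are already available as strings. The function names are hypothetical, and the mapping of trend names to the document keys BR, NT and BL is inferred from how those abbreviations are used later in the report.

```python
# Assumed mapping from future trend to the document it is filed under.
CORPUS_KEYS = {"BEARISH": "BR", "NEUTRAL": "NT", "BULLISH": "BL"}


def make_document(prev_trend, pattern, rsi_trend, divergence, swing_rejection):
    """Join the five fuzzified values into one document string."""
    return " + ".join([prev_trend, pattern, rsi_trend, divergence, swing_rejection])


def file_document(corpus, future_trend, document):
    """Append a cluster's string to the document for its future trend."""
    corpus.setdefault(CORPUS_KEYS[future_trend], []).append(document)


corpus = {}
doc = make_document("Bullish", "Hammer", "Overbought",
                    "Bearish Divergence", "No Swing Rejection")
# Filing this example under BEARISH is purely illustrative.
file_document(corpus, "BEARISH", doc)
```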
2.6 Document Formulation for Query
The query is formulated in the same way as the documents above. For each set of OHLC and RSI values
in the query, the corresponding fuzzified candlestick clusters and candlestick patterns are identified. Thus, a
document is formed for the query just as before.
2.7 Document Matching and TF-IDF Score Calculation
To find the most relevant document, we use the tf-idf scheme.
The tf-idf scheme ranks the documents by their tf*idf value, where tf is the term frequency, i.e. the number of
occurrences of a term in the document, while idf is the inverse document frequency, taken here as the total
number of occurrences of the term across the documents.
We normalize the values of tf and idf before calculating the final score.
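Let tf and idf be the term frequency and inverse document frequency of any term. Then we define
\[ \mathrm{tf}_{\log} = \log_{10}(\mathrm{tf}), \qquad \mathrm{idf}_{\log} = \log_{10}(\mathrm{idf}) \]
We denote the normalized tf and idf scores by tf-norm and idf-norm respectively.
Let k be the total number of documents. Then
\[ (\mathrm{tf\text{-}norm})_i = \frac{(\mathrm{tf}_{\log})_i}{\sum_{j=1}^{k} (\mathrm{tf}_{\log})_j}, \qquad (\mathrm{idf\text{-}norm})_i = \frac{(\mathrm{idf}_{\log})_i}{\sum_{j=1}^{k} (\mathrm{idf}_{\log})_j} \]
For each term in the query, we calculate its tf-idf value as
\[ (\mathrm{tf\text{-}idf})_{term} = (\mathrm{tf\text{-}norm})_{term} \times (\mathrm{idf\text{-}norm})_{term} \]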
We calculate a relevance score for each future trend document (BR, BL, NT). The relevance value of a
document is the sum of the tf-idf scores of all the terms in the query.
The document with the maximum relevance score is taken to define the future trend.
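The scoring described in this section can be sketched as below. This is an illustration under assumptions, not the report's code: the function names are hypothetical, the raw counts are supplied by the caller (the text does not spell out how they are indexed), every raw value must be positive so that its logarithm exists, and the logs must not sum to zero.

```python
import math


def log_normalise(raw_values):
    """Return log10(x_i) / sum_j log10(x_j) for each raw value x_i."""
    logs = [math.log10(x) for x in raw_values]  # ValueError if any x <= 0
    total = sum(logs)
    if total == 0:
        raise ValueError("sum of log values is zero; cannot normalise")
    return [value / total for value in logs]


def relevance(tf_norm, idf_norm, query_terms):
    """Sum of tf-norm * idf-norm over the query terms (missing terms add 0)."""
    return sum(tf_norm.get(t, 0.0) * idf_norm.get(t, 0.0) for t in query_terms)


def most_relevant(scores):
    """Key of the document with the maximum relevance score."""
    return max(scores, key=scores.get)


# Illustrative numbers only; they are not taken from the report's data.
shares = log_normalise([10, 100, 1000])
predicted = most_relevant({"BR": 0.30, "BL": 0.10, "NT": 0.20})
```

Here `most_relevant` would be called with one relevance score per future trend document, and its result is the predicted trend.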
Chapter 3
Results and Discussion
3.1 Future Trend
The future trend is defined as the fuzzy trend of the first three days of the next cluster. It is calculated
in exactly the same way as the previous trend.
The future trend can be BEARISH, BULLISH or NEUTRAL. Each cluster is expressed as the string 'Previous Candlestick Trend + Identified Candlestick Cluster + Previous RSI Trend + Divergence + Swing Rejection'
and saved in one of three documents (BL, BR, NT) according to its future trend.
3.2 Future Trend Prediction (Final Output)
• For our future trend prediction, let us take an example of trend prediction for the dates 16 August 2019
to 20 August 2019. We have included BSE data only up to 31 May 2019, so there will be no exact
matching of trends for the upcoming future.
Table 3.1: BSE Sensex Data without RSI values
Date        Open       High       Low        Close
02-Jul-19   39811.68   39838.49   39499.19   39816.48
03-Jul-19   39907.57   39934.99   39732.38   39839.25
04-Jul-19   39917.65   39979.1    39858.33   39908.06
05-Jul-19   39990.4    40032.41   39441.38   39513.39
08-Jul-19   39476.38   39476.38   38605.48   38720.57
09-Jul-19   38754.47   38814.23   38435.87   38730.82
10-Jul-19   38701.99   38854.85   38474.66   38557.04
11-Jul-19   38751.62   38892.5    38631.31   38823.11
12-Jul-19   38941.1    39021.84   38684.85   38736.23
15-Jul-19   39009.95   39023.97   38696.6    38896.71
16-Jul-19   38961.86   39173.89   38845.27   39131.04
17-Jul-19   39171.1    39284.73   39081.14   39215.64
18-Jul-19   39204.47   39204.47   38861.25   38897.46
19-Jul-19   39058.73   39058.73   38271.35   38337.01
• Let us take the data from 25-June-2019 to 19-July-2019 for our query formulation.
• After arranging the data, we include the RSI values in it.
Table 3.2: An example cluster of 5 days
Date        Open       High       Low        Close      RSI
03-Jul-19   39907.57   39934.99   39732.38   39839.25   75.469
04-Jul-19   39917.65   39979.1    39858.33   39908.06   78.360
05-Jul-19   39990.4    40032.41   39441.38   39513.39   42.475
08-Jul-19   39476.38   39476.38   38605.48   38720.57   19.756
09-Jul-19   38754.47   38814.23   38435.87   38730.82   20.444
Table 3.3: BSE Sensex Data with RSI values
Date        Open       High       Low        Close      RSI
02-Jul-19   39811.68   39838.49   39499.19   39816.48   74.570
03-Jul-19   39907.57   39934.99   39732.38   39839.25   75.469
04-Jul-19   39917.65   39979.1    39858.33   39908.06   78.360
05-Jul-19   39990.4    40032.41   39441.38   39513.39   42.475
08-Jul-19   39476.38   39476.38   38605.48   38720.57   19.756
09-Jul-19   38754.47   38814.23   38435.87   38730.82   20.444
10-Jul-19   38701.99   38854.85   38474.66   38557.04   17.301
11-Jul-19   38751.62   38892.5    38631.31   38823.11   36.099
12-Jul-19   38941.1    39021.84   38684.85   38736.23   33.034
15-Jul-19   39009.95   39023.97   38696.6    38896.71   44.009
16-Jul-19   38961.86   39173.89   38845.27   39131.04   56.902
17-Jul-19   39171.1    39284.73   39081.14   39215.64   60.959
18-Jul-19   39204.47   39204.47   38861.25   38897.46   42.258
• After that, we take a cluster of 5 days, representing one of our clusters, from which the fuzzy values
are generated.
• From the discrete OHLC and RSI values, we derive the normalised data representing the various
attributes.
• After that, we fuzzify the normalised data.
• Document created for this cluster: Bearish Inverted Hammer Overbought No Divergence No
Swing Rejection
• Future trend: BR (the query matches the results in document BR)
• The future trend predicted for 2 July is Bearish, which matches the actual trend, as can be seen from
the RSI graph. The market is going to have a downward trend after an initial rise. This downward
trend will continue until the market starts climbing again.
Table 3.4: An example candlestick cluster of 5 days
Date        Upper    Lower   Body     Gap   Trend    Difopen   Difclose   Difcentral   RSI
02-Jul-19   0.0687   0.267   -0.171   0     0.057    0         0.002      0.063        75.469
03-Jul-19   0.153    0.124   -0.024   0     0.172    0         0          0.086        78.360
04-Jul-19   0.105    0.180   -1.207   0     -0.998   0         0          0            42.475
05-Jul-19   0.0      0.291   -1.951   0     -2.047   0         0          0            19.756
08-Jul-19   0.154    0.761   -0.061   0     0.026    0         0          0            20.444
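As a quick check on the Trend column of Table 3.4, the crisp previous trend of this cluster can be worked out from the averaging formula of Section 2.4.2, assuming that day1 to day3 are the first three rows of the table:

```latex
\[
\mathrm{PrevTrend}(\mathrm{cluster})
  = \frac{0.057 + 0.172 + (-0.998)}{3}
  = \frac{-0.769}{3}
  \approx -0.256
\]
```

The negative sign is at least consistent with the Bearish term in the document created for this cluster. The fuzzy label itself depends on the trend membership function (Figure 2.6), which is not reproduced here.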
Table 3.5: An example fuzzy candlestick cluster of 5 days
Date        FuzzyLower  FuzzyUpper  FuzzyBody   FuzzyTrend     FuzzyGap  FuzzyDifopen  FuzzyDifclose  FuzzyDifcentral  RSI
02-Jul-19   NULL        NULL        BLACKSHORT  NULL           NULL      NULL          NULL           NULL             VERYHIGHBEARISH
03-Jul-19   NULL        NULL        BLACKSHORT  NULL           NULL      NULL          NULL           NULL             VERYHIGHBEARISH
04-Jul-19   NULL        NULL        BLACKSHORT  SHORTBEARISH   NULL      NULL          NULL           NULL             LOWBEARISH
05-Jul-19   NULL        NULL        BLACKSHORT  MIDDLEBEARISH  NULL      NULL          NULL           NULL             HIGHBEARISH
08-Jul-19   NULL        NULL        BLACKSHORT  NULL           NULL      NULL          NULL           NULL             HIGHBEARISH
Figure 3.1: Candlesticks for 2-July-2019 to 8-July-2019
Figure 3.2: RSI graph for 2-July-2019 to 8-July-2019
Chapter 4
Conclusion
The proposed model captures several characteristics of the share market. The RSI (Relative Strength
Index) gives information about the momentum of the stock market. The data in the time series
is represented by candlesticks, in the form of Open, High, Low and Close values together with
the properties of the candlesticks. Candlesticks represent the discrete data in an
understandable form, and certain candlestick patterns help to predict future market trends
(such as trend reversals).
We created fuzzy rules on the basis of clusters of days, each set of 5 consecutive days being one
cluster. This lets us create a larger number of clusters and expand our document base. We then
normalised the discrete values by defining functions that quantify the properties of candlesticks
and trends. From this data we derived membership functions, through which we created the
fuzzified data. Using the fuzzified data, we created fuzzy rules that recognise the different
candlestick patterns that exist in the stock market. Together with the momentum
of the market, we used all of these characteristics to create a document for each cluster. Through a fuzzy
information retrieval system, we ranked each term according to its relevance in the pool of
documents representing the future trend. The document with the highest relevance to the query is matched,
and we finally obtain the future trend of the market along with the fuzzy documents that indicate the
strength of our prediction.
The motivation behind this work was to measure the quantitative as well as the qualitative
aspects of the time series. With the help of fuzzy logic we were able to preserve information that is
otherwise lost in other methods. We expect this to yield a more accurate prediction, one that is not
quantified but is linguistically stable. This motivates us to find trends in time series and to predict
future trends in our time series analysis.