Proposal

A graphical frontend to Numenta Platform for
Intelligent Computing
Nicholas Hainsey
10/13/2014
Background:
Introduction:
Neural networks and machine learning algorithms are being developed to solve problems that conventional algorithms struggle with, such as pattern, image, and speech recognition. To do so, computer systems are being designed to work more like a human brain than a conventional computer. The human neocortex can take in any type of sensory data, make inferences from it, predict outcomes, and react based on those predictions. If we mimic this in building computer systems, we may be able to make a computer that can recognize speech or image patterns and do the same. Essentially, the goal is to create a computer that can learn from the data it is trained on, remember what it has learned, and make inferences about new data.
Neural Networks
As stated, the goal of neural networks is to solve problems like pattern recognition more efficiently. A simple example of such a problem is given in Neural Networks for Pattern Recognition [1]. Say we want to determine the classification of some image. We could simply store each possible image with its corresponding classification. This may work for incredibly tiny images, but it quickly becomes unmanageable. Take, for example, images that are 256 x 256 pixels; each image then consists of 65,536 pixels, each one represented by an 8-bit number. The total number of possible images comes out to 2^(8 x 256 x 256), or roughly 10^158,000 images. Storing them all is simply impossible. In comparison, using a neural network, we might train on only a few thousand pictures. When presented with a new image, the neural network would then infer its classification based on its similarity to the images it already knows.
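As a rough sanity check of this combinatorial argument (a sketch of the arithmetic only, not taken from [1]), the count can be computed directly in Python:

import math

pixels = 256 * 256                      # 65,536 pixels per image
bits_per_image = 8 * pixels             # one 8-bit value per pixel
decimal_digits = bits_per_image * math.log10(2)

# Prints: distinct images = 2^524288, about 10^157826 (on the order of 10^158,000)
print("distinct images = 2^%d, about 10^%d" % (bits_per_image, round(decimal_digits)))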
An example of a neural network performing pattern recognition similar to this can be seen in an article in Expert Systems with Applications [2]. The authors describe designing a multi-layer perceptron (MLP), a type of neural network, to perform iris recognition, a common form of biometric recognition. Using a simple MLP implementation, they tested several ways of partitioning their data, which consisted of images of eyes. Depending on how they partitioned the data and organized the MLP, they were able to achieve up to 93.33% accuracy. This shows that data preprocessing can be essential to building an accurate neural network. Their accuracy exceeded that of four earlier attempts and approached results reported by others in 2008. Also working on iris recognition, Xu et al. used a different type of neural network, an intersecting cortical model, and tested its accuracy under different conditions [3]. In their tests, they were able to achieve 98% accuracy for iris recognition using this network.
Iris recognition isn’t the only task at which neural networks can excel. Another article, published in Neural Computing and Applications in 2009, shows that a self-adaptive radial basis function neural network could outperform the facial recognition methods in use at the time [4]. The network was tested separately on two facial recognition databases, one with small variation in angle and scaling of the images (ORL) and one with large variation (UMIST). On both databases, their proposed method achieved error rates better than other facial recognition methods and approaching the best reported error rates.
All of the articles so far have dealt with image recognition; however, as noted above, that is not the only type of pattern recognition neural networks can be used for. What we are particularly interested in is using neural networks to predict time-series data. In an article in Expert Systems with Applications in 2011, a neural network model was used to forecast the occurrence of seismic events such as earthquakes [5]. In the first case study, a neural network was trained only on time-series magnitude data and output the predicted magnitude for the next day. The accuracy of this method was 80.55% over all seismic events but only 58.02% for major seismic events. The second case study involved a neural network trained on seismic electric signals (SES), which occur before earthquakes, as well as the time between SES and earthquakes. After reconstructing the missing SES data in the time series, they achieved 84% accuracy when predicting magnitude alone. When predicting both the magnitude and the time lag to the seismic events, they were 83.56% accurate on the magnitude and 92.96% accurate on the time lag.
Another example of neural networks being used for time series can be seen in an article
from 2010 in Solar Energy [6]. This is another group that used an MLP to find patterns in data. Their goal was to use the MLP to predict daily solar radiation on a horizontal plane, information useful to solar electricity providers. When trained on a time series of solar radiation data, the MLP performed as well as or better than the other common models they tested, in terms of mean square error, even before deseasonalization.
This is particularly surprising given the results of another article from the European
Journal of Operational Research which found that “neural networks are not able to capture
seasonal or trend variations effectively with the unpreprocessed raw data and either
detrending or deseasonalization can dramatically reduce forecasting errors [7].” In the article
they compared using a feed forward neural network on seasonal data with a trend to using it
on the same data that has been deseasonalized, detrended, and both. In every test, the neural
network using the original data performed far worse than the other three in terms of root mean square error. This held across three separate levels of noise. This is further evidence that, when building a neural network, we should focus on preparing the data correctly beforehand.
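To make the kind of preprocessing recommended in [7] concrete, the following is a minimal sketch of detrending and deseasonalizing a series before it is fed to a network; the period of 12 (monthly data) and the additive decomposition are illustrative assumptions, not taken from the article:

import numpy as np

def detrend_and_deseasonalize(series, period=12):
    # Remove a fitted linear trend and an additive seasonal component.
    series = np.asarray(series, dtype=float)
    t = np.arange(len(series))

    # 1. Fit and subtract a linear trend.
    slope, intercept = np.polyfit(t, series, deg=1)
    trend = slope * t + intercept
    detrended = series - trend

    # 2. Estimate the seasonal component as the mean at each position in the cycle.
    seasonal = np.array([detrended[i::period].mean() for i in range(period)])
    residual = detrended - seasonal[t % period]

    # The residual is what would be fed to the network; the trend and seasonal
    # components are kept so forecasts can be re-composed afterwards.
    return residual, trend, seasonal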
HTM
The specific type of neural network we are interested in using is called hierarchical temporal memory (HTM). Jeffrey Hawkins and Numenta, Inc. hold a patent [8] for a "Trainable hierarchical memory system and method," another type of neural network, as well as a number of other patents [9] [10] on the structure and function of this system. They aim to replicate the neocortex and the way it functions by copying the hierarchical structure of its neurons. Each layer of the hierarchy learns information from the layer below it, all the way down to the input layer. The information is then passed up the hierarchy to the smaller layers above until it reaches the top layer, where output is determined [11]. HTMs are specifically designed to work with sensory data that is constantly streaming in; because of this, they could be a strong method for predicting time-series data as well as for other pattern recognition problems. Numenta has released the source code for its implementation of an HTM, the Numenta Platform for Intelligent Computing (NuPIC) [12].
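As a toy illustration only (this is not Numenta's cortical learning algorithm), the basic data flow can be pictured as each level summarizing its input and handing a smaller representation to the level above:

import numpy as np

def level(vector):
    # Stand-in for one level of the hierarchy: summarize (here, simply max-pool)
    # pairs of values from the level below into a representation half the size.
    v = np.asarray(vector, dtype=float)
    return v.reshape(-1, 2).max(axis=1)

sensory_input = np.random.rand(16)   # bottom of the hierarchy: raw input
level_1 = level(sensory_input)       # 8 values
level_2 = level(level_1)             # 4 values
top = level(level_2)                 # 2 values: the top, where output is determined
print(top)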
For example, a 2009 paper in the South African Journal of Science describes NuPIC being used for land-use classification [13]. NuPIC was trained on satellite images of different types of land use: built-up surface, irrigated land, fallow land, and different plant species. Once the HTM had learned what each type of land use looked like, it was tested with new satellite images to see whether it could determine which type of land it was looking at. In the end, the HTM's accuracy was 81.33% for its worst land type, while another land type it classified correctly 100% of the time. Overall, the HTM showed an accuracy of 90.4%, a classification 80% better than randomly classifying images. In a later article in the African Journal of Agricultural Research, the same group repeated this experiment [14]. The experiment was redone because, the first time, each training and test image contained only one land-use pattern. This time they used the same land classifications, but the images were less restrictive, allowing more than one land-use pattern per image, and they added other factors to try to increase accuracy. They produced results similar to before, identifying some land types with 100% accuracy and others with accuracies as low as 87.4%, for an overall accuracy of 96%.
Another article, from Neurocomputing, compares different hierarchical temporal
memory models and Hidden Markov Models for sign language recognition [15]. The article
discusses training different HTM models and Hidden Markov Models on input data about hand
signs such as position, velocity, and acceleration of the hand in 3D space; roll, pitch, and yaw of
the wrist; and bend coefficients for each finger. Once trained on the data, the systems would be
given new hand sign data to determine how accurately each one could predict the sign. Hidden
Markov Models ended with an accuracy of 88% while NuPIC had an accuracy of 61%. However,
when they modified the HTM by partitioning the input space into multiple regions, they were able to produce HTMs with accuracies above 61%, and in some cases even above 88%.
In 2008, Nathan C. Schey presented his Honors Thesis at Ohio State University on using
NuPIC for song identification [16]. In this study, Schey began by using a piano-roll graph of a MIDI file as the data read into the HTM. In this attempt, the HTM had an accuracy of only 47%. However, he found this was because the HTM used a Gaussian distance measure while the data was represented in binary. After changing his representation scheme for the songs and creating a larger, more robust HTM, he was able to achieve a song prediction accuracy of 100%. At that point the HTM only had to learn 5 songs, so he expanded the data set to 40 songs and still achieved 100% prediction accuracy. The songs being analyzed were stored in MIDI format, which is generally simpler than other audio file formats; however, the 100% accuracy still lends weight to the idea that HTMs can be powerful mechanisms for computerized pattern recognition.
Time Series Prediction:
Though we have given a few examples of neural networks used for pattern recognition,
the area we would really like to focus on is time-series prediction. A literature review on various
time series prediction algorithms by Kumara et al. goes into some detail on using neural
networks for time series analysis [17]. They stated that, with the invention of back-propagation
algorithms (algorithms that let information flow forward and backward in a network) in 1986,
applications of neural networks on time-series problems began to have successful results,
eventually showing the capability to outperform statistical forecasting methods like regression
analysis and Box-Jenkins forecasting. Georg Dorffner also provides an overview of neural
networks being used for time series processing [18]. He also explains that in most cases, data
must be pre-processed depending on the problem and dataset before it can be used in the
neural network, giving the examples of deseasonalizing and detrending. His reasoning for this is
that many methods for forecasting require stationarity of the data. These and the other papers
we mentioned concerning time-series prediction using neural networks [5] [6] [7] give an idea of
what must be done to use neural networks for time-series forecasting, and suggest
what we can do with HTMs given such a problem.
My Proposed Project:
Neural networks can be applied with relatively good accuracy to a multitude of pattern
recognition problems, including time-series prediction. As was said before, time-series prediction is the problem that interests us most. While we were unable to find any articles using an HTM for time-series prediction, we have found examples of it online. The HTM community on GitHub has created tutorials on the matter [19], and Numenta itself has created commercial products that use such a method. One product available is Grok [20]. Grok monitors and learns standard patterns from real-time data about a user's Amazon Web Services environment. Once it has learned these patterns, it alerts the user when it receives data that is anomalous compared to what it expects. For example, if CPU usage spikes when it is expected to stay low and stable, that is flagged as anomalous.
NuPIC:
NuPIC is open source software for building HTMs that implement Numenta’s cortical
learning algorithm (CLA).
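To make the scripted workflow concrete, the sketch below shows roughly how a NuPIC model is built and run from Python, patterned after the community "hot gym" tutorial [19]. MODEL_PARAMS, the CSV layout, and the field name "consumption" are placeholder assumptions, and exact module paths differ between NuPIC versions:

import csv
from datetime import datetime

from nupic.frameworks.opf.modelfactory import ModelFactory
from model_params import MODEL_PARAMS   # a tuned parameter dictionary (assumed)

model = ModelFactory.create(MODEL_PARAMS)
model.enableInference({"predictedField": "consumption"})

with open("data.csv") as f:
    for row in csv.DictReader(f):
        record = {
            "timestamp": datetime.strptime(row["timestamp"], "%m/%d/%y %H:%M"),
            "consumption": float(row["consumption"]),
        }
        # Feed one record at a time; the model learns online and predicts the next step.
        result = model.run(record)
        predicted = result.inferences["multiStepBestPredictions"][1]
        print("%s  actual=%.2f  predicted=%s"
              % (record["timestamp"], record["consumption"], predicted))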
Specialized Graphical Interface:
It is in our interest now to take the Numenta Platform for Intelligent Computing
and build a user interface for it that would allow an HTM to easily be applied to a time-series
problem. As it stands, NuPIC is run entirely with Python scripts from the terminal or command
prompt. Our main goal in this project is to create a graphical user interface for a certain
problem that will allow us to build an HTM and run it with a given dataset. Examples of what
such an interface may look like can be seen in video tutorials of NuPIC and also in a canceled
project called OpenHTM [21]. From the articles we reviewed, it is apparent that a main factor in
creating accurate neural networks is data preprocessing. As such, our GUI should provide a view of a user's data and allow the user to apply various preprocessing steps to it. It should also provide a view of the system and what the HTM is predicting from the given input data. In addition, it should allow the user to tweak NuPIC's customizable parameters and see the result.
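As a first sketch of what the specialized interface could look like structurally (using Tkinter from the Python standard library; run_htm() is a hypothetical placeholder for the code that would build and run the NuPIC model on the loaded file):

import Tkinter as tk               # "tkinter" on Python 3
import tkFileDialog as filedialog  # "tkinter.filedialog" on Python 3

def run_htm(csv_path, params):
    # Hypothetical placeholder: build a NuPIC model with `params`, run it over
    # the rows of csv_path, and return a summary of the predictions.
    return "ran HTM on %s with %s" % (csv_path, params)

class NupicGui(tk.Frame):
    def __init__(self, master=None):
        tk.Frame.__init__(self, master)
        self.csv_path = None
        tk.Button(self, text="Load CSV...", command=self.load_csv).pack()
        tk.Label(self, text="Predicted field:").pack()
        self.field_entry = tk.Entry(self)
        self.field_entry.pack()
        tk.Button(self, text="Run HTM", command=self.run).pack()
        self.output = tk.Label(self, text="")
        self.output.pack()
        self.pack()

    def load_csv(self):
        # Let the user pick the time-series data set to preprocess and model.
        self.csv_path = filedialog.askopenfilename(filetypes=[("CSV files", "*.csv")])

    def run(self):
        params = {"predictedField": self.field_entry.get()}
        self.output.config(text=run_htm(self.csv_path, params))

if __name__ == "__main__":
    NupicGui(tk.Tk()).mainloop()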
Generalized Graphical Interface:
Once I finish the graphical interface for the specified problem, the next goal would be to extend it to a more general purpose. Ideally, the interface could accept any type of time-series data to be analyzed with an HTM. However, this may be a difficult task, as some data may require specialized preprocessing for the HTM to produce accurate results, as was seen in a previous article [7]. This generalized graphical interface is more of a stretch goal, in case the specialized graphical interface is completed ahead of time.
Resources Needed:
As I would be creating a GUI for Python code (unless Java proves to be simpler), I would need the use of a PC, as well as whatever Python libraries are required to run NuPIC. A development environment for Python would be useful as well, though any text editor could work.
Timeline:
November: Get NuPIC tutorials working
December: Get NuPIC working with custom data sets
January: Start building the GUI around NuPIC once I can get it to work with our datasets.
February: Flesh out GUI, add graphical representations of HTM
March: Extend GUI to a more generalized time-series problem.
Works Cited
[1] C. M. Bishop, Neural Networks for Pattern Recognition., Oxford, New York: Oxford University Press,
1995.
[2] F. N. Sibai, H. I. Hosani, R. M. Naqbi, S. Dhanhani and S. Shehhi, "Iris recognition using artificial
neural networks," Expert Systems with Applications, vol. 38, no. 5, p. 5940–5946, 2011.
[3] X. Guang-zhu, Z. Zai-Feng and M. Yi-de, "An image segmentation based method for iris feature," The
Journal of China Universities of Posts and Telecommunications, vol. 15, no. 1, 2008.
[4] J. K. Sing, S. Thakur, D. K. Basu, M. Nasipuri and M. Kundu, "High-speed face recognition using self-adaptive radial basis function neural networks," Neural Computing and Applications, vol. 18, pp. 979-990, 2009.
[5] M. Moustra, M. Avraamides and C. Christodoulou, "Artificial neural networks for earthquake
prediction using time series magnitude data or Seismic Electric Signals," Expert Systems with
Applications, vol. 38, no. 12, p. 15032–15039, 2011.
[6] C. Paoli, C. Voyant, M. Muselli and M.-L. Nivet, "Forecasting of preprocessed daily solar radiation
time series using neural networks," Solar Energy, vol. 84, no. 12, p. 2146–2160, 2010.
[7] G. Zhang and M. Qi, "Neural network forecasting for seasonal and trend time series," European
Journal of Operational Research, vol. 160, no. 2, p. 501–514, 2005.
[8] D. George and J. Hawkins, "Trainable hierarchical memory system and method". United States of
America Patent US20070005531 A1, 4 Jan 2007.
[9] J. Hawkins and D. George, "Directed behavior using a hierarchical temporal memory based system".
United States of America Patent US20070192268 A1, 16 Aug 2007.
[10] S. Ahmad, J. Hawkins, F. Astier and D. George, "Extensible hierarchical temporal memory based
system". United States of America Patent US20070276774 A1, 29 Nov 2007.
[11] Numenta, Inc, "Hierarchical Temporal Memory Including HTM Cortical Learning Algorithms," 12
Sept. 2011. [Online]. Available:
http://numenta.org/resources/HTM_CorticalLearningAlgorithms.pdf. [Accessed 15 Oct 2014].
[12] Numenta, Inc, "Numenta Platform for Intelligent Computing," Numenta, Inc, [Online]. Available:
http://numenta.org/nupic.html. [Accessed 15 Oct 2014].
[13] A. Perea, J. Merono and M. Aguilera, "Application of Numenta Hierarchical Temporal Memory for
land-use classification," South African Journal of Sciences, no. 105, pp. 370-376, 2009.
[14] A. J. Perea, J. E. Merono and M. J. Aguilera, "Hierarchical temporal memory for mapping vineyards," African Journal
of Agricultural Research, vol. 7, no. 3, pp. 456-466, 2012.
[15] D. Rozado, F. B. Rodriguez and P. Varona, "Extending the bioinspired hierarchical temporal memory
paradigm for sign language recognition," Neurocomputing, vol. 79, pp. 75-86, 2012.
[16] N. C. Schey, "Knowledge Bank," May 2008. [Online]. Available:
http://kb.osu.edu/dspace/bitstream/handle/1811/32025/Schey_Thesis.pdf?sequence=1. [Accessed
15 Oct 2014].
[17] K. M.P.T.R., F. W.M.S, P. J.M.C.U. and P. C.H.C, "Slideshare," Dec 2013. [Online]. Available:
http://www.slideshare.net/tharindurusira/time-series-prediction-algorithms-literature-review.
[Accessed 6 Nov 2014].
[18] G. Dorffner, "Neural Networks for Time Series Processing," 1996. [Online]. Available:
http://machine-learning.martinsewell.com/ann/Dorf96.pdf. [Accessed 6 Nov 2014].
[19] NuPIC Community, "Github: Using Nupic," Github, 21 Jun 2014. [Online]. Available:
https://github.com/numenta/nupic/wiki/Using-NuPIC. [Accessed 16 Oct 2014].
[20] Numenta, Inc, "Numenta," 2014. [Online]. Available: http://numenta.com/grok/. [Accessed 2 Nov 2014].
[21] "SourceForge: OpenHTM," 21 6 2014. [Online]. Available:
http://sourceforge.net/projects/openhtm/. [Accessed 16 10 2014].