A Visual Data Mining Framework for Similarity Search in Large Sequential Databases;
MS-CS Practicum; Student: Sunil Gokak (2004)
Time-dependent signals are common in everyday life. They may represent acoustic
information, stock market data, or biological and clinical data sets. Although sampling and harmonic
analysis of signals can enable efficient analysis in the frequency domain, similarity search
for data mining can address tasks such as the prediction of values, the
classification of items, piecewise correlation estimation, and the unsupervised clustering of
time-dependent data sets. Efficient indexing techniques and algorithms are developed in this domain
to address the curse of dimensionality evident in such data. This work builds a
Visual Data Mining Framework that enables users, through a web-based interface, to
analyze time-series data by conveniently interacting with efficient data mining algorithms. The
application primarily addresses descriptive data mining, which is useful for
understanding the mechanics governing long-term and short-term fluctuations in
large time-series data. An efficient web crawler is also designed to access online time-series
data through an interactive, user-controlled graphical interface. Additionally, the application
demonstrates the integration of Java technology with Matlab for real-time data interoperability
between the two programming tools. The application can help users who are not data mining
experts employ efficient similarity search algorithms in areas including inventory
planning and material management, sales forecasting, demand forecasting, market research /
business conditions, biomedical signal analysis and classification, protein data mining, and
functional classification of genes.
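
The abstract does not name the specific similarity search or indexing algorithms used in the framework. Purely as an illustration of the general idea, the sketch below assumes one common approach from the time-series literature: subsequence search under z-normalized Euclidean distance, with piecewise aggregate approximation (PAA) as the dimensionality reduction. The function names (`znorm`, `paa`, `best_match`) and the choice of PAA are assumptions made for this example, not details taken from the practicum.

```python
import math


def znorm(seq):
    """Scale a sequence to zero mean and unit variance (constant input maps to zeros)."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in seq]


def paa(seq, segments):
    """Piecewise aggregate approximation: the mean of each equal-width frame."""
    n = len(seq)
    if n % segments != 0:
        raise ValueError("length must be divisible by the number of segments")
    width = n // segments
    return [sum(seq[i:i + width]) / width for i in range(0, n, width)]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_match(series, query, segments):
    """Return (offset, distance) of the window of `series` closest to `query`.

    Hypothetical helper: compares every window in reduced (PAA) space first
    and computes the full distance only when the cheap bound cannot rule
    the window out.
    """
    m = len(query)
    width = m // segments
    q = znorm(query)
    q_paa = paa(q, segments)
    best_offset, best_dist = -1, math.inf
    for offset in range(len(series) - m + 1):
        w = znorm(series[offset:offset + m])
        # PAA lower bound: sqrt(width) * ||PAA(w) - PAA(q)|| <= ||w - q||
        lower = math.sqrt(width) * euclidean(paa(w, segments), q_paa)
        if lower >= best_dist:
            continue  # this window cannot beat the best match found so far
        d = euclidean(w, q)
        if d < best_dist:
            best_offset, best_dist = offset, d
    return best_offset, best_dist
```

Because the PAA distance never exceeds the true distance, pruning on it discards no better match; the reduced representation is also what a spatial index would typically store to mitigate the curse of dimensionality.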