Mining Spatial and Spatio-temporal Patterns in Scientific Data

Sai Radha Krisna Kalluri
Chaitanya Donthineni
Pradeep Reddy
Overview
• Introduction
• Basic Concepts
• Framework and Implementation
• Empirical Evaluation
• Conclusion and Ongoing Work
Introduction
• Data mining is the process of discovering hidden and
meaningful knowledge in a data set.
• In this paper, the authors focus on designing and applying data
mining techniques to analyze spatial and spatio-temporal data
originating in scientific domains.
• The authors propose a solution framework to overcome
challenges such as the treatment of features as single points in a
multidimensional space and the incorporation of temporal information.
Basic Concepts
• The authors propose three main representation schemes:
- Parallelepiped
- Ellipsoid
- Landmark-based representation
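The slides only name the three schemes, so the following is a minimal sketch of what each could look like for a feature given as a set of points. All function names are hypothetical, and the constructions are assumptions rather than the authors' method: an axis-aligned box stands in for the parallelepiped, per-axis standard deviations stand in for the ellipsoid's semi-axes, and farthest-point sampling stands in for landmark selection.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def bounding_box(points: Sequence[Point]) -> Tuple[Point, Point]:
    """Axis-aligned box (a special case of a parallelepiped): per-axis min and max."""
    lo = tuple(min(axis) for axis in zip(*points))
    hi = tuple(max(axis) for axis in zip(*points))
    return lo, hi


def axis_aligned_ellipsoid(points: Sequence[Point]) -> Tuple[Point, Point]:
    """Centroid plus per-axis standard deviation, used here as semi-axis lengths (assumption)."""
    n = len(points)
    center = tuple(sum(axis) / n for axis in zip(*points))
    radii = tuple(
        math.sqrt(sum((x - m) ** 2 for x in axis) / n)
        for axis, m in zip(zip(*points), center)
    )
    return center, radii


def landmarks(points: Sequence[Point], k: int) -> List[Point]:
    """Farthest-point sampling: start at the first point, then repeatedly add
    the point whose distance to the already chosen landmarks is largest."""
    chosen = [points[0]]
    while len(chosen) < min(k, len(points)):
        nxt = max(points, key=lambda p: min(math.dist(p, q) for q in chosen))
        chosen.append(nxt)
    return chosen


# A flat rectangular feature in 3-D space, used only for illustration.
feature = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 4.0, 0.0), (2.0, 4.0, 0.0)]
box = bounding_box(feature)
ellipsoid = axis_aligned_ellipsoid(feature)
marks = landmarks(feature, 2)
```

All three summaries describe the extent of a feature instead of collapsing it to a single point, which is the idea the framework builds on.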
Framework and Implementation
Empirical Evaluation
• To evaluate the efficacy of this framework, the authors applied it to
the following scientific applications:
- Protein structure analysis in bioinformatics
- Vortex simulation in computational fluid dynamics
- Analysis of defect evolution in molecular dynamics
Conclusion and Future Work
• The proposed framework models features as geometric objects
rather than points.
• It supports multiple distance measures.
• Capturing other types of spatial relationships, such as topological
and directional relationships.
• Discovering rare but important patterns.
Hair Data Model: A New Data Model for Spatio-Temporal Data Mining
Introduction
What is spatio-temporal data?
• Spatio-temporal data is data tied to locations, such as satellite images, weather
maps, and transportation systems; furthermore, this data changes over time.
What is the hair data model?
• The model is analogous to hair, which has specific properties and grows over
time. Just as hair needs upkeep such as combing, cutting, colouring, covering,
and cleaning to look good and stay healthy, the data needs to be maintained
over time.
Introduction (contd.)
Data in spatio-temporal databases falls into two types:
• Spatial data: the values recorded at each position, such as temperature and
humidity.
• Temporal data: the same values as they change over time. An additional
attribute, the time stamp, is maintained to identify the data at a
particular time.
• Spatio-temporal data models can be built on conventional
relational and object-oriented databases.
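As a minimal sketch of the distinction above, the record below pairs a position with a time stamp and a set of measured values. The class and function names are hypothetical and are not part of the hair data model itself. Fixing the time stamp gives a spatial view, and fixing the position gives a temporal view.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Tuple


@dataclass
class Observation:
    lat: float                 # position (spatial part)
    lon: float
    stamp: date                # time stamp (temporal part)
    values: Dict[str, float]   # e.g. {"temperature": 21.5, "humidity": 0.6}


def snapshot(obs: List[Observation], stamp: date) -> List[Observation]:
    """Spatial view: every position as recorded at one time stamp."""
    return [o for o in obs if o.stamp == stamp]


def series(obs: List[Observation], lat: float, lon: float, var: str) -> List[Tuple[date, float]]:
    """Temporal view: one variable at one position, ordered by time stamp."""
    picked = [(o.stamp, o.values[var]) for o in obs if (o.lat, o.lon) == (lat, lon)]
    return sorted(picked)


# Illustrative records only; the numbers are made up.
records = [
    Observation(23.0, 113.0, date(2000, 2, 1), {"temperature": 15.0}),
    Observation(23.0, 113.0, date(2000, 1, 1), {"temperature": 13.0}),
    Observation(24.0, 114.0, date(2000, 1, 1), {"temperature": 12.0}),
]
january = snapshot(records, date(2000, 1, 1))
history = series(records, 23.0, 113.0, "temperature")
```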
Problems in Spatio-temporal data
• Difficulties in analytical operations.
• Poor compatibility with object-oriented environments.
• Weak user-friendliness.
Problems in Spatio-temporal Data (contd.)
• Most data models in spatio-temporal systems are hard for users to understand
because spatio-temporal data types are complex.
• As a result, users cannot learn about the facilities and features of a data
model or predict its behaviour in different situations.
Hair Natural Specifications and Hair Data Model Functions
Detecting Spatio-Temporal Outliers in Climate Dataset: A Method Study
Introduction
• Outlier detection is one of the most important data analysis techniques in
data mining. It can be used to discover anomalous phenomena in huge
datasets.
• Successful applications such as credit card fraud detection are often cited
when discussing outliers.
Motivation
• Many earlier definitions of an outlier are unsatisfactory because each is
confined to certain kinds of data. The paper presents a reasonable
definition of spatio-temporal outliers.
• “An outlier is an observation that deviates so much from other observations
as to arouse suspicion that it was generated by a different mechanism”.
Problem Definition:
• The aim is to detect useful and meaningful outliers in a climate dataset.
• The authors introduce a formal way to define outliers in spatio-temporal lattice
data, stressing the importance of clarifying the basic data structure.
• As a case study, the authors define two kinds of spatio-temporal outliers based on a
global climate dataset.
• The paper tries to establish a reasonable definition of outliers.
Introduction to the Climate Dataset:
• The dataset “CRU TS 2.0” is provided by the Climatic Research Unit.
• It is a 0.5 degree grid dataset with five variables: cloud cover
percentage, diurnal temperature range, precipitation, temperature, and vapor
pressure. It covers the global land surface with a time step of one month,
starting from the year 1901.
The authors chose the South China area as the study region and the closed
time period from 1992 to 2002.
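The grid and time-step figures above imply some simple arithmetic, sketched below. The indexing convention (rows counted from 90°S, columns from 180°W, months counted from January 1901) and the function names are assumptions for illustration, not the layout of the CRU files.

```python
CELL = 0.5  # grid resolution in degrees, as stated for the dataset

N_ROWS = int(180 / CELL)  # latitude cells covering the globe
N_COLS = int(360 / CELL)  # longitude cells covering the globe


def grid_index(lat: float, lon: float) -> tuple:
    """Map a coordinate to (row, col); the upper edge is clamped into the last cell."""
    row = min(int((lat + 90.0) // CELL), N_ROWS - 1)
    col = min(int((lon + 180.0) // CELL), N_COLS - 1)
    return row, col


def month_index(year: int, month: int, start_year: int = 1901) -> int:
    """Zero-based monthly time step counted from January of start_year."""
    return (year - start_year) * 12 + (month - 1)


# Number of monthly steps in the closed period January 1992 to December 2002.
N_MONTHS = month_index(2002, 12) - month_index(1992, 1) + 1
```

Under these assumptions the global grid has 360 by 720 cells, and the selected period spans 132 monthly time steps.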
Climate Dataset (contd.)
The definition of an outlier needs to consider three aspects:
• First, which kind of data structure is of interest, called the “basic
element”.
• Second, what the basic element is compared with, i.e., the compare
element.
• Third, what is compared and by what measure, i.e., the compare
function.
With these considerations, the authors define two large-scale outliers based on this spatio-temporal climate dataset:
• Location Outliers Given a Time Period
• Time Period Outliers Given a Region
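The three aspects can be sketched as follows. The slides do not state the paper's compare function, so this sketch assumes a simple z-score against all other elements; the function names, the threshold, and the use of a mean as the basic element's value are all hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, Hashable, List, Set, Tuple

# (location, time step) -> value of one climate variable
Grid = Dict[Tuple[Hashable, Hashable], float]


def find_outliers(elements: Dict[Hashable, float], threshold: float = 2.0) -> List[Hashable]:
    """Compare function (assumed): flag a basic element whose value lies more than
    `threshold` standard deviations from the mean of all elements."""
    mu = mean(elements.values())
    sigma = pstdev(elements.values())
    if sigma == 0:
        return []
    return [key for key, v in elements.items() if abs(v - mu) / sigma > threshold]


def location_outliers(data: Grid, periods: Set[Hashable], threshold: float = 2.0) -> List[Hashable]:
    """Basic element: one location's mean over the given time period.
    Compare element: the same statistic at every location."""
    by_loc: Dict[Hashable, List[float]] = {}
    for (loc, t), v in data.items():
        if t in periods:
            by_loc.setdefault(loc, []).append(v)
    return find_outliers({loc: mean(vs) for loc, vs in by_loc.items()}, threshold)


def period_outliers(data: Grid, region: Set[Hashable], threshold: float = 2.0) -> List[Hashable]:
    """Basic element: the region's mean at one time step.
    Compare element: the same statistic at every time step."""
    by_t: Dict[Hashable, List[float]] = {}
    for (loc, t), v in data.items():
        if loc in region:
            by_t.setdefault(t, []).append(v)
    return find_outliers({t: mean(vs) for t, vs in by_t.items()}, threshold)


# Made-up values: location "F" is unusually high in both time steps.
space_demo = {(loc, t): (30.0 if loc == "F" else 10.0) for loc in "ABCDEF" for t in (1, 2)}
# Made-up values: time step 6 is unusually high across the region.
time_demo = {(loc, t): (30.0 if t == 6 else 10.0) for loc in "AB" for t in range(1, 7)}

odd_places = location_outliers(space_demo, {1, 2})
odd_months = period_outliers(time_demo, {"A", "B"})
```

Swapping in a different statistic or distance only changes `find_outliers`; the choice of basic element and compare element stays in the two wrapper functions.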
Conclusion
• Outliers in spatio-temporal lattice data have lacked a reasonable definition so far,
yet they are very helpful in finding interesting phenomena in spatio-temporal
datasets.
• The paper points out the importance of clarifying the basic element when
defining outliers in a spatio-temporal lattice dataset.
• Introducing the basic element makes the definition clearer and more
meaningful. The authors propose three aspects for defining an outlier, the latter
two of which are based on the first.
• Following the three aspects, the authors define two of the most important and
useful kinds of spatio-temporal outliers based on the climate dataset, also taking
implementation efficiency into account. Experiments show that the definitions are
quite helpful in finding outliers of practical use.