Mining Spatial and Spatiotemporal Patterns in Scientific Data

Sai Radha Krisna Kalluri Chaitanya Donthineni Pradeep Reddy

Overview
- Introduction
- Basic Concepts
- Framework and Implementation
- Empirical Evaluation
- Conclusion and Ongoing Work

Introduction
- Data mining is the process of discovering hidden and meaningful knowledge in a data set.
- In this paper, the authors focus on designing and applying data mining techniques to analyze spatial and spatiotemporal data originating in scientific domains.
- The authors propose a solution framework that addresses two challenges: features are usually treated as single points in a multidimensional space, and temporal information has to be incorporated.

Basic Concepts
- The authors propose three main representation schemes:
  - Parallelepiped
  - Ellipsoid
  - Landmark-based representation

Framework and Implementation

Empirical Evaluation
- To evaluate the efficacy of the framework, the authors applied it to the following scientific applications:
  - Protein structure analysis in bioinformatics
  - Vortex simulation in computational fluid dynamics
  - Analysis of defect evolution in molecular dynamics

Conclusion and Future Work
- The proposed framework models features as geometric objects rather than points.
- It supports multiple distance measurements.
- Future work: capturing other types of spatial relationships, such as topological and directional relationships.
- Future work: discovering rare but important patterns.

Hair Data Model: A New Data Model for Spatio-Temporal Data Mining

Introduction
What is spatio-temporal data?
- Data such as satellite images, weather maps, and transportation systems; furthermore, this data changes over time.
What is the hair data model?
- The model is analogous to natural hair, which has specific properties and grows over time.
- Just as hair needs upkeep (combing, cutting, colouring, covering, cleaning, etc.) to keep its looks and quality, the data must be maintained over time.

Introduction (contd.)
Data types in spatio-temporal databases fall into two categories:
- Spatial data: the data at each position, such as temperature or humidity.
- Temporal data: the same data as it changes over time. An additional attribute, the timestamp, is maintained to identify the data at a particular time.
Spatio-temporal data can be implemented using conventional relational and object-oriented databases.

Problems in Spatio-Temporal Data
- Difficulties in analytical operations.
- Poor compatibility with object-oriented environments.
- Weak user-friendliness.

Problems in Spatio-Temporal Data (contd.)
Most data models in spatio-temporal systems are hard for users to understand because spatio-temporal data types are complex. As a result, users are unaware of the facilities and features of a data model and cannot predict the model's behaviour in different situations.

Hair Natural Specifications and Hair Data Model Functions

Detecting Spatio-Temporal Outliers in Climate Dataset: A Method Study

Introduction
- Outlier detection is one of the most important data analysis techniques in data mining. It can be used to discover anomalous phenomena in huge datasets.
- Successful applications, such as credit card fraud detection, are often cited when discussing outliers.

Motivation
- Many earlier definitions of an outlier are unsatisfactory because each is confined to certain kinds of data. The paper presents a reasonable definition of spatio-temporal outliers.
- “An outlier is an observation that deviates so much from other observations as to arouse suspicion that it was generated by a different mechanism.”

Problem Definition
With the aim of detecting useful and meaningful outliers in a climate dataset, the authors introduce a formalized way to define outliers in spatio-temporal lattice data, stressing the importance of clarifying the basic data structure. As a case study, the authors define two kinds of spatio-temporal outliers based on a global climate dataset. The paper tries to establish a reasonable definition of outliers.

Introduction to the Climate Dataset
- The dataset “CRU TS 2.0” is provided by the Climatic Research Unit.
- It is a 0.5-degree gridded dataset with five variables: cloud cover percentage, diurnal temperature range, precipitation, temperature, and vapor pressure. It covers the global land surface with a time step of one month, starting from the year 1901.
- The study area chosen by the authors is South China. The selected time period is the closed period from 1992 to 2002.

Climate Dataset (contd.)
The definition of an outlier needs to consider three aspects:
- First, the kind of data structure of interest, which is called the “basic element”.
- Second, what the “basic element” is compared with, i.e., the compare element.
- Third, what is compared and by what measurement, i.e., the compare function.
With these considerations, the authors define two large-scale outliers based on this spatio-temporal climate dataset.

Location Outliers Given a Time Period

Time Period Outliers Given a Region

Conclusion
- Outliers in spatio-temporal lattice data have not had a reasonable definition so far, yet they are very helpful in finding interesting phenomena in spatio-temporal datasets.
- The paper points out the importance of clarifying the basic element when defining outliers in a spatio-temporal lattice dataset.
- Introducing the basic element makes the definition clearer and more meaningful. The authors propose three aspects for defining an outlier, the latter two of which are based on the first.
- Following the three aspects, the authors define two of the most important and useful kinds of spatio-temporal outliers based on the climate dataset, also taking implementation efficiency into account. Experiments show that the definition is quite helpful in finding outliers with practical utility.
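
Illustrative sketch (not from the paper)
The three aspects named in the slides (basic element, compare element, compare function) can be made concrete with a small example for "location outliers given a time period". The slides do not give the paper's actual definitions, so every choice below is an assumption made for illustration only: the basic element is a grid cell's mean value over the period, the compare element is its 4-connected spatial neighbours, and the compare function is a standardized difference with a hypothetical threshold. The function name `location_outliers` and the `grid[t][row][col]` layout are also hypothetical.

```python
from statistics import mean, pstdev


def location_outliers(grid, threshold=2.0):
    """Flag grid cells that deviate from their spatial neighbours.

    grid[t][row][col] holds one variable (e.g. temperature) per time step.
    The grid needs at least two cells so that every cell has a neighbour.
    Returns a sorted list of (row, col) tuples.
    """
    n_rows, n_cols = len(grid[0]), len(grid[0][0])

    # Basic element (assumption): each cell's mean over the whole period.
    cell_mean = [
        [mean(step[r][c] for step in grid) for c in range(n_cols)]
        for r in range(n_rows)
    ]

    # Compare element (assumption): the 4-connected neighbours of a cell.
    def neighbours(r, c):
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < n_rows and 0 <= cc < n_cols:
                yield cell_mean[rr][cc]

    # Compare function (assumption): cell mean minus neighbour mean,
    # standardized over all cells and checked against the threshold.
    diff = {
        (r, c): cell_mean[r][c] - mean(neighbours(r, c))
        for r in range(n_rows)
        for c in range(n_cols)
    }
    mu, sigma = mean(diff.values()), pstdev(diff.values())
    if sigma == 0:
        return []  # perfectly uniform field: nothing stands out
    return sorted(
        cell for cell, d in diff.items() if abs(d - mu) / sigma > threshold
    )


# Toy data: 3 monthly steps on a 5x5 grid with one anomalous cell.
grid = [[[10.0] * 5 for _ in range(5)] for _ in range(3)]
for step in grid:
    step[2][2] = 30.0
print(location_outliers(grid))
```

Swapping the roles of space and time (basic element: a region's values in one month; compare element: the same region in other months) would give a comparable sketch for "time period outliers given a region", again under assumed definitions.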