NTU/Intel M2M Project: Wireless Sensor Networks Content Analysis

NTU/Intel M2M Project: Wireless Sensor Networks
Content Analysis and Management Special Interest Group
Data Analysis Team
Discriminant Classification Sub-Team
Graphical Learning Sub-Team
Anomaly Detection Sub-Team
Pattern Mining Sub-Team
Monthly Report: December, 2011
1. Team Organization
Principal Investigator: Shou-De Lin
Co-Principal Investigator: Yung-Jen (Jane) Hsu
Team Leader: Todd McKenzie
Team Members: Peng-Hua Gong, Hsun-Ping Hsieh, Fu-Chun Hsu, Chung-Yi Li, Ting-Wei Lin, WeiLun Su, En-Hsu Yen, Tu-Chun Yin
2. Discussion with Champions
Number of meetings with champion in current month: 2 (1/9, 1/16-1/18)
3. Progress between last month and this month
Graphical Model Team
For this period, we focused on developing an indexing technique for large-scale pattern
recognition problems. Current state-of-the-art approaches to pattern recognition and machine
learning (such as SVMs and graphical models) cannot scale to problems with a large number of
patterns or observations. In our research, we try to exploit indexing techniques, which are
widely used in search engines, to solve the scaling problem of pattern recognition in general.
Many state-of-the-art machine learning models will become feasible in large-scale sensor
networks if such techniques can be successfully applied.
In the current setup, we transform our target problem (e.g., detection or classification) into a
nearest neighbor search problem in a high-dimensional feature space and obtain fast search via
our indexing. We designed a new indexing method, “KernelSplitTree”, to index observed data and
parametric models in kernel space instead of working in an explicit feature space. In our
system, given an observation we can quickly find the relevant pattern models, and given a model
we can quickly detect the corresponding patterns among the indexed observations.
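The idea of searching in kernel space without an explicit feature map can be illustrated with a small sketch. It is not the KernelSplitTree method, whose details are not given in this report. It only shows the baseline that an index would accelerate: a linear scan using the standard identity d^2(x, y) = k(x, x) - 2 k(x, y) + k(y, y). The RBF kernel, the `gamma` value, and all function names are assumptions chosen for illustration.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    # Assumed kernel for illustration; any positive-definite kernel would do.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def kernel_distance_sq(x, y, kernel):
    # Squared distance in the implicit feature space, computed from kernel
    # evaluations only (no explicit feature vectors are needed).
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)

def nearest_neighbor(query, items, kernel=rbf_kernel):
    # Brute-force scan over all items: O(N) kernel evaluations per query.
    # An index is meant to avoid exactly this full scan.
    return min(range(len(items)),
               key=lambda i: kernel_distance_sq(query, items[i], kernel))

# Toy observations (hypothetical values).
observations = [(0.0, 0.0), (5.0, 5.0), (1.0, 1.0)]
closest = nearest_neighbor((0.9, 1.2), observations)
```

Because every query touches every indexed item, the cost of this scan grows linearly with the number of observations or models, which is the scaling problem the indexing work targets.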
Compared with existing machine learning approaches, we obtain speedups of several orders of
magnitude in the training, detection, and classification phases, at the cost of
O(D * N_o ln N_o) and O(D * N_M ln N_M) time to build the indexes at the beginning, as in the
following:
Inferring Social Relationships
1. Literature survey on co-location features and spatial-temporal co-occurrence:
• Exploiting Place Features in Link Prediction on Location-based Social Networks (KDD 2011)
– Place features, social features and global features
– Supervised learning framework
• Inferring social ties from geographic coincidences (PNAS 2010)
– Spatial-temporal co-occurrence
– Probabilistic model approach
• A Geo-Social Model: From Real-World Co-occurrences to Social Connections (Journal for
Databases in Networked Information Systems)
– Time series patterns to identify the similarity of each pair
• Finding Your Friends and Following Them to Where You Are (WSDM 2012)
– Text, co-location, and friendship-graph topology features from Twitter
– Probabilistic model approach
2. Apply our prediction model to a large-scale dataset, the check-in dataset from Gowalla
• 10-fold cross-validation (Liblinear): Precision: 85.2%, Recall: 43.5%
• Apply on testing data (Liblinear): Precision: 79.63%, Recall: 32.1%
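For reference, the precision and recall figures above follow the usual definitions for a binary link-prediction task: precision is the share of predicted links that are real, and recall is the share of real links that were predicted. The sketch below computes both from label lists. The function name and the toy labels are illustrative assumptions and do not come from the Gowalla experiments.

```python
def precision_recall(y_true, y_pred):
    # Labels: 1 = the pair is linked (friends), 0 = not linked.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly predicted links
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted but not real
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # real but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical toy labels, not project data.
truth = [1, 1, 1, 1, 0, 0]
predicted = [1, 1, 0, 0, 1, 0]
precision, recall = precision_recall(truth, predicted)
```

A classifier with high precision and much lower recall, as in the results above, makes few false link predictions but misses many true links.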
Activity Inference from Sensor Network Data
This month, our sub-team continued to examine the problem of learning and inferring a user's
activities using data from the event logs of other users. Two candidate approaches are
frequent-pattern-based profiles and transfer learning. The former generates a list of frequent
episodes from the observed data and creates a weight matrix to value the event periods. The
latter tries to find a linear mapping from the list of events happening at each location
visited by the original user to the lists of events happening at the locations visited by the
other users. Both have received considerable interest.
We also obtained the tracking records of the other four users from the original dataset
provided by Mr. Zhao. The data collected from these four users are shorter, and the
corresponding location trajectories map less distinctly onto the locations the original user
travelled. How to employ the investigated methods effectively remains our current research
direction, and the question will be examined more thoroughly in the coming month.
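The first step of the frequent-pattern idea described above, extracting frequent episodes from an event log, can be sketched as follows. This is a simplified illustration, not the sub-team's method: it treats an episode as a contiguous run of events, whereas episode mining in general may allow gaps within a time window, and it does not model the weight matrix. The function name, parameters, and the toy log are assumptions.

```python
from collections import Counter

def frequent_episodes(events, max_len=3, min_support=2):
    # Count every contiguous run of 1..max_len events in the log.
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(events) - n + 1):
            counts[tuple(events[i:i + n])] += 1
    # Keep only episodes seen at least min_support times.
    return {ep: c for ep, c in counts.items() if c >= min_support}

# Hypothetical event log for one user (location events in time order).
log = ["home", "bus", "office", "home", "bus", "office", "gym"]
episodes = frequent_episodes(log)
```

Episodes that recur, such as a repeated commute, form the profile; one-off events fall below the support threshold and are dropped.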
Classification
This month the classification sub-team continued to explore algorithms for missing-data
recovery in sensor networks. We have successfully incorporated temporal correlation into our
matrix factorization model, and we are now designing the mathematical formulation to include
spatial correlation and correlation among different attributes such as temperature and
humidity. In addition, we are conducting experiments on different datasets to make sure our
algorithm is general and to validate our findings.
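The report does not state the formulation, so the following is only a minimal sketch of how temporal correlation might enter a matrix factorization model for missing readings. It assumes a sensors-by-time matrix, a squared-error loss on observed entries, L2 regularization (`lam`), and a smoothness penalty (`mu`) that pulls the latent factors of adjacent time steps together. All names, hyperparameters, and data values are hypothetical.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def factorize(X, rank=2, lam=0.01, mu=0.1, lr=0.02, epochs=2000, seed=0):
    """Fit X[i][t] ~ dot(U[i], V[t]) on observed entries (None = missing)."""
    rng = random.Random(seed)
    n, T = len(X), len(X[0])
    U = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(T)]
    for _ in range(epochs):
        # Stochastic gradient steps on the observed readings only.
        for i in range(n):
            for t in range(T):
                if X[i][t] is None:
                    continue
                err = X[i][t] - dot(U[i], V[t])
                for f in range(rank):
                    u, v = U[i][f], V[t][f]
                    U[i][f] += lr * (err * v - lam * u)
                    V[t][f] += lr * (err * u - lam * v)
        # Temporal term: pull the factors of adjacent time steps together.
        for t in range(1, T):
            for f in range(rank):
                diff = V[t][f] - V[t - 1][f]
                V[t][f] -= lr * mu * diff
                V[t - 1][f] += lr * mu * diff
    return U, V

# Hypothetical readings: 3 sensors x 5 time steps, one missing value.
readings = [
    [1.0, 1.1, 1.2, 1.3, 1.4],
    [2.0, 2.2, None, 2.6, 2.8],
    [0.5, 0.55, 0.6, 0.65, 0.7],
]
U, V = factorize(readings)
imputed = dot(U[1], V[2])  # estimate for the missing reading
```

Spatial correlation or cross-attribute correlation could be added in the same style as extra penalty terms on the sensor factors, but that formulation is still being designed and is not sketched here.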
Local Sensor Network Setup
We produce the data by controlling the voltage. There are four different scenarios in two
environments, and the goal of this data set is to produce at least 100 data instances.
• Jan:
3.2V ~ 2.0V in 1 hour (0.2V/min)
Two environments: bathroom, balcony
Different scenarios:
1 with full-charge power, 1 with artificial power
1 with full-charge power, 2 with artificial power
1 with battery power(full), 1 with artificial power
1 with full-charge power, 2 with artificial power
3.2V-2.0V, 2.6V-2.0V -> Different exhaustion rate
We surveyed nearly all the papers discussing sensor data imputation; several of them mention
the spatial-temporal correlation of the data. Several baseline methods were surveyed and are
listed below; all of them are easy to implement.
1. Linear interpolation: temporal
2. Moving average: temporal
3. Hybrid-KNN: spatial+temporal
4. Correlated imputation: spatial+temporal
5. Recent sliding window: temporal
6. Replaced by certain: temporal
7. Adaptive weight adjustment: temporal+spatial
8. Multiple regression: spatial
9. Support vector regression
Some of the above methods will be implemented as comparison baselines for our RF algorithm.
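Two of the temporal baselines listed above, linear interpolation and moving average, are simple enough to sketch directly. This is an illustrative version under stated assumptions: missing readings are marked with `None`, the moving average uses only preceding values (including ones already filled in), and the function names and the `window` parameter are hypothetical.

```python
def linear_interpolate(series):
    # Fill each gap on the straight line between its nearest known neighbors.
    out = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = series[left] + step * (i - left)
    return out  # gaps before the first or after the last known value stay None

def moving_average(series, window=3):
    # Fill each gap with the mean of up to `window` preceding values.
    out = list(series)
    for i, v in enumerate(series):
        if v is None:
            recent = [x for x in out[max(0, i - window):i] if x is not None]
            if recent:
                out[i] = sum(recent) / len(recent)
    return out

# Hypothetical sensor traces with missing readings.
interpolated = linear_interpolate([1.0, None, None, 4.0])
averaged = moving_average([2.0, 4.0, None, 6.0], window=2)
```

Both use a single sensor's own history, which is why they are listed as purely temporal; the spatial baselines would additionally draw on neighboring sensors.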
5. Research Byproducts
5.1 Papers: N/A
(1) International Journal
(2) International Conference
(3) Domestic journal
(4) Domestic Conference
(5) Highly Cited Articles
5.2 Served on the Editorial Board of International Journals
Journal of Social Network Analysis and Mining
5.3 Invited Lectures
Intel/NTU Symposium on 1/17, Data Analysis Presentation by Professor Shou-De Lin
5.4 Significant Honors / Awards
N/A