NTU/Intel M2M Project: Wireless Sensor Networks Content Analysis

NTU/Intel M2M Project: Wireless Sensor Networks
Content Analysis and Management Special Interest Group
Data Analysis Team
Monthly Report
1. Team Organization
Principal Investigator: Shou-De Lin
Co-Principal Investigator: Mi-Yen Yeh
Team Members: Chih-Hung Hsieh (postdoc), Yi-Chen Lo (PhD student), Perng-Hwa Kung
(Graduate student), Ruei-Bin Wang (Graduate student), Yu-Chen Lu (Undergraduate student),
Kuan-Ting Chou (Undergraduate student), Chin-en Wang (Graduate student)
2. Discussion with Champions
a. Number of meetings with champion in current month: many times (during F2F meeting)
b. Major comments/conclusion from the discussion: discussed future topics and directions
3. Progress between last month and this month
a. Topic 1: Video Summarization using MSWave.
1) About manuscript to submit:
- Venue: 2014 IEEE International Conference on Image Processing
- Title: Efficiency Multi-view Keyframe Extraction on Distributed Video Sensor Network
- Estimated date of completion: 2/14
2) Experiment Results:
- Dataset: bl2, lobby, office
- Metrics:
Recall: |events we found| / |all events|
Precision: |frames in event| / |frames we choose|
Bandwidth Saving: 1 - |mswave| / |naive|
- [Figures: results for bl2, office, and lobby]
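The three metrics defined above can be computed as in the following Python sketch. It is an illustration only and rests on assumptions not stated in this report: events are given as inclusive (start, end) frame ranges, an event counts as found when at least one chosen keyframe falls inside it, and bandwidth is measured in bytes. The function name evaluate and the sample numbers are hypothetical.

```python
def evaluate(keyframes, events, mswave_bytes, naive_bytes):
    """Compute recall, precision and bandwidth saving for one video.

    keyframes: frame indices chosen by the summarizer (assumed format)
    events: list of inclusive (start, end) frame ranges (assumed format)
    """
    # Chosen frames that fall inside any ground-truth event.
    frames_in_event = [f for f in keyframes
                       if any(s <= f <= e for s, e in events)]
    # Events covered by at least one chosen frame (assumed meaning of "found").
    events_found = [(s, e) for s, e in events
                    if any(s <= f <= e for f in keyframes)]
    return {
        "recall": len(events_found) / len(events),
        "precision": len(frames_in_event) / len(keyframes),
        "bandwidth_saving": 1 - mswave_bytes / naive_bytes,
    }


# Made-up example values, not experimental results.
example = evaluate(
    keyframes=[12, 15, 30, 55],
    events=[(10, 20), (50, 60), (90, 100)],
    mswave_bytes=250,
    naive_bytes=1000,
)
print(example)
```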
b. Topic 2: Distributed Nearest Neighbor Search of Time Series Using Dynamic Time Warping
1) Setup of experiments.
- Initialization Approaches
i. Oracle (impossible in practice)
1. Assume we know which sites have the kNN
2. Initialization: Send exact query to those sites
ii. Our approach
1. Assume the order of the LBs = the order of the DTWs
2. Initialization: Send exact query to sites that have the top K lower bounds
iii. Naive approach
1. Assume we have no idea how to choose sites for initialization
2. Initialization: Send exact query to sites that have K random time series
- Initialization Comparison: Top: Oracle; Middle: Ours; Bottom: Naive
- Pruning Site Order: Top: LB order; Bottom: Random; Order: Smallest 1st-level lower bound of each site
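The three initialization strategies above can be compared on a toy scale. The following Python sketch is an illustration only: it treats every candidate series as its own site, uses LB_Keogh as a stand-in for the project's 1st-level lower bound (which this report does not define), and uses DTW with a Sakoe-Chiba band. All function names, sizes, and parameter values are assumptions.

```python
import math
import random


def dtw(a, b, window):
    """Banded DTW distance (square root of summed squared differences)."""
    n, m = len(a), len(b)
    w = max(window, abs(n - m))
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [math.inf] * (m + 1)
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            cur[j] = cost + min(prev[j], prev[j - 1], cur[j - 1])
        prev = cur
    return math.sqrt(prev[m])


def lb_keogh(query, candidate, window):
    """LB_Keogh lower bound on banded DTW for equal-length series."""
    total = 0.0
    for i, c in enumerate(candidate):
        segment = query[max(0, i - window):i + window + 1]
        upper, lower = max(segment), min(segment)
        if c > upper:
            total += (c - upper) ** 2
        elif c < lower:
            total += (c - lower) ** 2
    return math.sqrt(total)


def random_walk(length, rng):
    x, out = 0.0, []
    for _ in range(length):
        x += rng.gauss(0, 1)
        out.append(x)
    return out


rng = random.Random(0)
K, WINDOW = 3, 5  # toy values, not the report's parameters
query = random_walk(64, rng)
series = [random_walk(64, rng) for _ in range(40)]
dists = [dtw(query, s, WINDOW) for s in series]
bounds = [lb_keogh(query, s, WINDOW) for s in series]
ids = list(range(len(series)))

# Which K candidates receive the exact query during initialization.
chosen = {
    "oracle": sorted(ids, key=dists.__getitem__)[:K],     # true kNN
    "lb_order": sorted(ids, key=bounds.__getitem__)[:K],  # top-K lower bounds
    "naive": rng.sample(ids, K),                          # random K
}
# Initial pruning threshold: K-th best exact DTW among the chosen candidates.
thresholds = {name: max(dists[i] for i in c) for name, c in chosen.items()}
# Candidates whose lower bound already exceeds the threshold can be pruned.
pruned = {name: sum(b > t for b in bounds) for name, t in thresholds.items()}
print(thresholds, pruned)
```

A tighter initial threshold prunes more candidates before any further communication, which is why the oracle can never prune fewer candidates than the other two strategies in this sketch.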
- Big Data:
i. Synthetic dataset
1. 10000 time series of length 10000
2. Random walk
ii. Parameters
1. S = 9999
2. K = 10
3. M = S / 2, S / 4, S / 8, S / 16, S / 32
iii. Experiment process
1. Randomly select a time series as the query
2. Run 100 times for each group of parameters
3. Bandwidth ratio = (Framework bandwidth) / (Naive approach bandwidth)
iv. Big Data Performance: Still has good performance
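The experiment process above can be sketched as a small Python harness. This is only an outline under assumptions: the report does not say how S / 2 and similar values are rounded (integer division is assumed here), and the actual bandwidth measurement is not described, so it is passed in as a hypothetical callable named measure that returns a (framework_bytes, naive_bytes) pair. The dataset in the demo is far smaller than the one in the report.

```python
import random
from statistics import mean


def random_walk(length, rng):
    """One random-walk series with standard normal steps (assumed step law)."""
    x, out = 0.0, []
    for _ in range(length):
        x += rng.gauss(0, 1)
        out.append(x)
    return out


def make_dataset(n_series, length, seed=0):
    rng = random.Random(seed)
    return [random_walk(length, rng) for _ in range(n_series)]


def m_values(S):
    """M = S/2, S/4, ..., S/32, rounded down (rounding is an assumption)."""
    return [S // d for d in (2, 4, 8, 16, 32)]


def mean_bandwidth_ratio(dataset, K, M, runs, measure, seed=0):
    """Average (framework bandwidth) / (naive bandwidth) over random queries.

    measure(query, others, K, M) is a hypothetical hook that must return
    (framework_bytes, naive_bytes) for one query.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(runs):
        q = rng.randrange(len(dataset))  # randomly select the query series
        others = dataset[:q] + dataset[q + 1:]
        framework_bytes, naive_bytes = measure(dataset[q], others, K, M)
        ratios.append(framework_bytes / naive_bytes)
    return mean(ratios)


small = make_dataset(n_series=20, length=50)  # toy size for illustration
print(m_values(9999))
```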
2) Future work
- Design experiment presentation
- Finish writing the paper
c. Topic 3: Intelligent Transportation System (ITS) Machine Learning: Predict whether a driver will stop at an intersection without using video data.
1) Extract Stop & Non-stop cases from all users.
- Total 135 users.
i. Most of them do not have valid trajectories. (We will re-check this situation.)
ii. 44 users provide the stop & non-stop cases.
iii. Total: 65819 cases
1. Positive Samples = 38909 (stop)
2. Negative Samples = 26910 (non-stop)
- Experiment Setup
i. Random partition, training : testing = 2:1
ii. The best 5-CV rate for training set: 74.8%
iii. Accuracy for testing set = 0.699342
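For context on the accuracy figures above, the short Python sketch below derives the majority-class baseline from the reported case counts and shows one way to draw a random 2:1 partition. The baseline is computed over all cases, so it only approximates the baseline on the actual test split. The function name split_two_to_one, the seed, and the rounding of the split sizes are assumptions.

```python
import random

stop_cases = 38909       # positive samples reported above
non_stop_cases = 26910   # negative samples reported above
total = stop_cases + non_stop_cases

# Accuracy of always predicting the majority class ("stop").
majority_baseline = max(stop_cases, non_stop_cases) / total

reported_test_accuracy = 0.699342
lift = reported_test_accuracy - majority_baseline


def split_two_to_one(n, seed=0):
    """Randomly partition indices 0..n-1 into training and testing, 2:1."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = (2 * n) // 3  # rounding down is an assumption
    return idx[:cut], idx[cut:]


train_idx, test_idx = split_two_to_one(total)
print(total, round(majority_baseline, 4), round(lift, 4))
print(len(train_idx), len(test_idx))
```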
2) PCA for feature reduction
- Applying PCA to 91 features of training set.
i. Transform training and testing sets based on the resulting principal components.
- Accuracy on Testing set.
i. Original: 0.699342
ii. The first 7 components: 0.5381
iii. 91 PCA components: 0.58976
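The PCA step above (fit on the training set only, then transform both sets with the same components) can be sketched in Python with NumPy as follows. This is a minimal illustration on random placeholder data, not the project's pipeline; the helper names fit_pca and apply_pca are hypothetical, and whether the features were scaled before PCA is not stated in this report.

```python
import numpy as np


def fit_pca(X_train, n_components):
    """Estimate the mean and principal directions from training data only."""
    mean = X_train.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, vt[:n_components]


def apply_pca(X, mean, components):
    """Project data using the training mean and training components."""
    return (X - mean) @ components.T


rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 91))  # placeholder data with 91 features
X_test = rng.normal(size=(100, 91))

mean7, comp7 = fit_pca(X_train, 7)
Z_train = apply_pca(X_train, mean7, comp7)
Z_test = apply_pca(X_test, mean7, comp7)

mean91, comp91 = fit_pca(X_train, 91)
Z_test_full = apply_pca(X_test, mean91, comp91)
print(Z_train.shape, Z_test.shape, Z_test_full.shape)
```

When all 91 components are kept, the transform is an orthogonal rotation of the centered data, so Euclidean distances between samples are unchanged.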
3) Combine the driver type.
- 39 drivers have driver type
i. 31 normal drivers
1. 24731 stop cases
2. 18956 non-stop cases
ii. 8 aggressive drivers
1. 2009 stop cases
2. 712 non-stop cases
iii. Labeling rule: A driver is labeled aggressive if the driver's ratio of aggressive trajectories is higher than a threshold t, where t is the mean plus 1 sigma computed from 81 drivers.
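The labeling rule above can be written as a few lines of Python. This is a sketch under assumptions: the report does not say whether sigma is the population or the sample standard deviation (the population form is used here), and the function names and example ratios are made up for illustration.

```python
from statistics import mean, pstdev


def aggressive_threshold(ratios):
    """t = mean + 1 sigma (population standard deviation is an assumption)."""
    return mean(ratios) + pstdev(ratios)


def label_drivers(ratio_by_driver):
    """Label each driver by comparing the aggressive-trajectory ratio to t."""
    t = aggressive_threshold(list(ratio_by_driver.values()))
    return {driver: ("aggressive" if ratio > t else "normal")
            for driver, ratio in ratio_by_driver.items()}


# Made-up ratios for five hypothetical drivers.
ratios = {"d1": 0.1, "d2": 0.1, "d3": 0.1, "d4": 0.1, "d5": 0.6}
threshold = aggressive_threshold(list(ratios.values()))
labels = label_drivers(ratios)
print(threshold, labels)
```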
- Experiment results
i. 5-CV on 31 normal drivers:
1. Avg. accuracy = 0.771488
ii. 5-CV on 8 aggressive drivers:
1. Avg. accuracy = 0.780228
iii. 5-CV on 39 (all) drivers:
1. Avg. accuracy = 0.774242
iv. Currently, integrating the driver type does not seem to improve accuracy significantly.
4) To-do list.
- After discussion with Jin-Yao, the current assignments of driver type will be modified.
- Other feature selection or feature reduction methods will be used and evaluated.
i. Apply a wrapper method of feature selection (Fselect.py) to a small dataset randomly sampled from the original dataset.
- Generate datasets of other driving behaviors
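The planned combination of random subsampling and wrapper-style feature selection can be sketched generically in Python. This is not the algorithm of Fselect.py; it is a plain greedy forward selection in which the scoring function (for example, cross-validated accuracy of a classifier on the chosen features) is supplied by the caller. The function names and the toy score are hypothetical.

```python
import random


def subsample(rows, n, seed=0):
    """Draw a random subset of rows without replacement."""
    return random.Random(seed).sample(rows, min(n, len(rows)))


def forward_select(n_features, score, max_features=None):
    """Greedy wrapper selection: add the feature that improves score most.

    score(feature_indices) is a caller-supplied function returning a number
    to maximize; selection stops when no feature improves the score.
    """
    selected, best = [], float("-inf")
    remaining = set(range(n_features))
    limit = max_features or n_features
    while remaining and len(selected) < limit:
        candidate, candidate_score = max(
            ((f, score(selected + [f])) for f in sorted(remaining)),
            key=lambda pair: pair[1],
        )
        if candidate_score <= best:
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best = candidate_score
    return selected, best


# Toy score: features 2 and 5 are useful, each extra feature costs a little.
useful = {2, 5}
toy_score = lambda feats: len(set(feats) & useful) - 0.01 * len(feats)
chosen, chosen_score = forward_select(8, toy_score)
sample = subsample(list(range(1000)), 100)
print(chosen, chosen_score, len(sample))
```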
4. Brief plan for the next month
a. We will continue the paper survey and refine our proposed approaches.
b. We will implement our proposed approaches and evaluate their performance.
5. Research Byproducts
a. Paper: N/A
b. Served on the Editorial Board of International Journals: N/A
c. Invited Lectures: N/A
d. Significant Honors / Awards: N/A