NTU/Intel M2M Project: Wireless Sensor Networks
Content Analysis and Management Special Interest Group
Data Analysis Team Monthly Report

1. Team Organization
   Principal Investigator: Shou-De Lin
   Co-Principal Investigator: Mi-Yen Yeh
   Team Members: Meng-Jung Shih (postdoc), Chih-Hung Hsieh (postdoc), Yi-Chen Lo (PhD student), Perng-Hwa Kung (Graduate student), Ruei-Bin Wang (Graduate student), Yu-Chen Lu (Undergraduate student), Kuan-Ting Chou (Undergraduate student), Chin-en Wang (Graduate student)

2. Discussion with Champions
   a. Number of meetings with champion in current month: 1 (2011/10/25)
   b. Major comments/conclusion from the discussion: not yet met with champion

3. Progress between last month and this month
   a. Topic 1: identifying the K farthest points of a query with limited communication cost
      - Goal: Given a set of points U ⊂ R^n distributed across many sensors and a set of queries Q ⊂ R^n, find the K farthest points from Q with limited communication cost.
      - We combine the LEEWAVE algorithm with a modified average bound to filter out impossible candidates.
      Figure. Comparison of (a) our proposed method and (b) the baseline method. Note the different scales of communication cost.
   b. Topic 2: using a co-clustering method on sensors to reduce the communication burden
      - Goal: Use a co-clustering method to analyze the dependencies between sensors and thereby reduce the communication burden in WSNs.
      - The co-clustering method performs slightly better than the baseline method on the London_NO2 dataset.
   c. Topic 3: continuous kNN query on distributed streams
      - Goal: Given a continuous time-series query, solve the kNN problem among distributed sensors with limited communication cost.
      - We try to combine the dynamic time warping (DTW) alignment method with the LEEWAVE algorithm to solve this problem.
      - We first transform each time series into a wavelet representation, then apply DTW alignment to the wavelet representations.
   d. Topic 4: matching a set of queries on distributed streams
      - Goal: Design a novel distance measure that is more appropriate than existing alternatives.
      - The proposed measure, which combines a logarithm operation with the DTW algorithm, performs better at identifying the kNNs of queries.
   e. Topic 5: pattern learning and recognition on distributed time-series stream data
      - Goal: Propose an on-line algorithm for pattern learning and recognition on distributed time-series stream data.
      - Two papers were surveyed:
        a) Online learning meets optimization in the dual, COLT 06.
        b) Trading convexity for scalability, ICML 06.
      - Our proposed method may be an on-line support vector machine (SVM) built on a faster approximate SVM solution and a modified loss function that reduces the number of support vectors.
   f. Topic 6: a method to aggregate a set of queries with equal length
      - Goal: Find a time series that has the minimum total distance to a set of time series of equal length, where the total distance is defined as the sum of the distances to each time series in the set.
      - We use gradient descent to approximate this time series, with the total distance to the set as the objective function, and check whether it can beat the baseline, which averages the set of time series at every sample point. The result below is for the scenario of randomly chosen time series.

4. Brief plan for the next month
   a. We will continue the paper survey and refine our proposed approaches.
   b. We will implement our proposed approaches and evaluate their performance.

5. Research Byproducts
   a. Paper: N/A
   b. Served on the Editorial Board of International Journals: N/A
   c. Invited Lectures: N/A
   d. Significant Honors / Awards: N/A
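
Note on Topic 6 (Section 3f): the comparison between the gradient-descent aggregate and the point-wise average baseline can be sketched as below. This is only a minimal illustration, not the team's implementation. It assumes the distance between two time series is the Euclidean distance, which the report does not specify, and the function names, the decaying step-size schedule, and the iteration count are all hypothetical choices made for the example.

```python
import math
import random


def euclidean(x, y):
    """Euclidean distance between two equal-length time series (assumed measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def total_distance(candidate, series_set):
    """Objective: sum of distances from the candidate to every series in the set."""
    return sum(euclidean(candidate, s) for s in series_set)


def pointwise_average(series_set):
    """Baseline: average the set of time series at every sample point."""
    n = len(series_set)
    return [sum(column) / n for column in zip(*series_set)]


def gradient_descent_aggregate(series_set, step=0.1, iters=500, eps=1e-12):
    """Approximate the minimizer of total_distance by (sub)gradient descent.

    Starts from the baseline and returns the best iterate seen, so the
    result is never worse than the point-wise average.
    """
    x = pointwise_average(series_set)
    best, best_cost = list(x), total_distance(x, series_set)
    for t in range(iters):
        grad = [0.0] * len(x)
        for s in series_set:
            d = euclidean(x, s)
            if d < eps:
                continue  # candidate coincides with s: use zero subgradient
            for j in range(len(x)):
                grad[j] += (x[j] - s[j]) / d
        # Hypothetical decaying step size; not taken from the report.
        lr = step / math.sqrt(t + 1)
        x = [xj - lr * gj for xj, gj in zip(x, grad)]
        cost = total_distance(x, series_set)
        if cost < best_cost:
            best, best_cost = list(x), cost
    return best


if __name__ == "__main__":
    random.seed(0)
    # Randomly chosen time series, mirroring the scenario named in Topic 6.
    series_set = [[random.gauss(0.0, 1.0) for _ in range(16)] for _ in range(8)]
    baseline = pointwise_average(series_set)
    aggregate = gradient_descent_aggregate(series_set)
    print("baseline total distance:", total_distance(baseline, series_set))
    print("gradient descent total distance:", total_distance(aggregate, series_set))
```

Under this Euclidean assumption, the point-wise average minimizes the sum of squared distances rather than the sum of distances, so gradient descent on the stated objective can in principle improve on the baseline. How large the gap is depends on the data and on the distance measure actually used.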