DARWIN PHONES: THE EVOLUTION OF SENSING AND INFERENCE ON MOBILE PHONES
PRESENTED BY: BRANDON OCHS

Emiliano Miluzzo, Cory T. Cornelius, Ashwin Ramaswamy, Tanzeem Choudhury, Zhigang Liu, Andrew T. Campbell, "Darwin phones: the evolution of sensing and inference on mobile phones," In Proc. of 8th ACM Conference on Mobile Systems, Applications, and Services (MobiSys), 2010, pp. 5-20.

What does Darwin do?
- A smartphone platform for urban sensing
- Proof-of-concept model uses the microphone
- Communicates with other local devices to improve inference accuracy (collaborative inference)
- Framework can be expanded to gather information using a range of sensor data

What about battery life?
- Communicates with a backend server to run the CPU-intensive machine learning algorithms
- Local devices share models rather than recomputing them
- Sensing is enabled/disabled as the system sees fit

Common Urban Sensing Challenges
- Human burden of training classifiers
- Ability to perform reliably in different environments (indoor vs. outdoor)
- Ability to scale to a large number of phones without hurting usability and battery life
Darwin overcomes all of these through classifier/model evolution, model pooling, and collaborative inference.

Types of Learning
- Supervised: given a fully labeled training set
- Semi-supervised: given a small training set that is evolved
- Unsupervised: no training set is given

Darwin Steps
- Evolution, pooling, and collaborative inference
- These represent Darwin's novel evolve-pool-collaborate model implemented on mobile phones

Classifier Evolution
- Automated approach to updating models over time
- Needs to account for variability in sensing conditions and settings
- Variability in background noise and phone location requires separate models

Model Pooling
- Reuses models that have already been built and evolved on other phones
- Phones exchange classification models whenever a model is available from another phone
- Classifiers do not need to be retrained, which increases scalability
- Models can also be pooled from backend servers

Collaborative Inference
- Combines results from multiple phones
- Phones run inference algorithms in parallel on the same classifiers
- System is more robust to degradation in sensing quality
- Increases accuracy

Darwin Design: Computation
- Reduces on-the-phone computation by offloading some of the work to backend servers
- Backend server uses a machine learning algorithm to compute a Gaussian Mixture Model (2 hours for 15 seconds of audio)
- Feature vectors are computed locally

Darwin Design: Context
- Context (in/out of pocket, in/out of bag) affects sensing and inference capability
- Classifier evolution makes sure the classifier of an event is robust across different environments

Darwin Design: Co-location
- Accounts for a group of co-located phones running the same classification algorithm and sensing the same event but computing different inference results
- Phones pool classification models from co-located phones or from backend servers
- Each phone compares against its own model and the co-located models
- Drastically reduces classification latency
- Exploits the diversity of different phone sensing context viewpoints

Speaker Recognition
- Attempts to identify a speaker by analyzing the microphone's audio stream
- Suppresses silence, low-amplitude audio, and chunks that do not contain human voice
- Reduces false positives by pre-processing in 32 ms blocks

Speaker Modeling
- Feature vector consists of Mel Frequency Cepstral Coefficients (MFCCs)
- Each speaker is modeled with 20 Gaussians
- An initial speaker model is built by collecting a short training sample

Classifier Evolution: Training Step
- Short training phase (30 seconds) is used to build a model, which is later evolved
- First 15 seconds are used as the training set
- Last 15 seconds are used as the baseline for evolution

Classifier Evolution: Evolution Step
- Semi-supervised learning strategy
- If the likelihood of the incoming audio stream is much lower than any of the baselines, a new model is evolved

Collaborative Inference
- Collaborative inference can be broken into three steps:
  1. Local inference performed by each individual phone
  2. Propagation of the local inference result to the neighboring phones
  3. Final inference based on the neighboring phones' local inference results
- Each node individually runs inference on the sensed event
- Results and confidence are broadcast

Privacy and Trust
- Raw sensor data is never stored on the mobile phone and never leaves it
- The content of a conversation or raw audio data is never disclosed
- Users can choose to opt out of Darwin

Experimental Results
- Tested using a mixture of five N97 and iPhones used by eight people over a period of two weeks
- Audio recorded in different locations
- Classifier trained indoors

Experiment 1 Parameters
- Three people walk along the sidewalk of a busy road and engage in conversation
- The speaker recognition application runs without the Darwin components on each of the phones carried by the people

Experiment 1 Results: Without Evolution (figure slide)

Experiment 2 Parameters
- Meeting setting in an office environment where 8 people are involved in conversation
- The phones are located at different distances from people in the meeting, some on the table and some in people's pockets

Experiment 2 Results (two figure slides)

Experiment 3 Parameters
- Five phones in a noisy restaurant
- Three of the five people are engaged in conversation
- Two of the five phones are placed on the table
- Phone 4 is the closest phone to speaker 4 and also the closest phone to another group of people having a loud conversation

Experiment 3 Results (four figure slides)

Experiment 4 Parameters
- Five people walk along a sidewalk and three of them are talking
- The greatest improvement is observed for speaker 1, whose phone is clipped to their belt

Experiment 4 Results (two figure slides)

Time and Energy Measurements
- Baselines for power use determined
- Measurements performed using the Nokia Energy Profiler tool
- No data gathered for the iPhone
- Smart duty cycling is needed later to save battery life

Time and Energy Measurements (figure slide)

Possible Applications
- Virtual square application
- Social place discovery application
  - Uses collaborative inference among a group of friends to determine location
- Friend tagging application
  - Exploits face recognition to tag friends in pictures

Future Work
- Duty cycling for improved battery life
- Simplified classification techniques

Improvements On The Paper
- The studies do not show conclusive evidence; there should be separate control models for each of the scenarios

Conclusion
- The Darwin system combines classifier evolution, model pooling, and collaborative inference
- Results indicate that the performance boost offered by Darwin offsets problems with sensing context
- The Darwin system provides a scalable framework that can be used for other urban sensing applications

References
[1] Emiliano Miluzzo, Cory T. Cornelius, Ashwin Ramaswamy, Tanzeem Choudhury, Zhigang Liu, Andrew T. Campbell, "Darwin phones: the evolution of sensing and inference on mobile phones," In Proc. of 8th ACM Conference on Mobile Systems, Applications, and Services (MobiSys), 2010, pp. 5-20.
[2] H. Ezzaidi and J. Rouat. Pitch and MFCC Dependent GMM Models for Speaker Identification systems. In Electrical and Computer Engineering, 2004. Canadian Conference on, volume 1, 2004.
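
Appendix: Sketch of the Evolution Trigger and Collaborative Inference

The evolution step and the three-step collaborative inference described in the slides can be illustrated with a minimal Python sketch. This is not the paper's implementation. The function names, the `margin` threshold, and the fusion rule (averaging each phone's normalized confidences) are all assumptions chosen for illustration; the slides only state that a new model is evolved when the likelihood is "much lower" than the baselines and that phones broadcast results with a confidence.

```python
from collections import defaultdict


def should_evolve(log_likelihood, baseline_log_likelihoods, margin=5.0):
    """Return True when incoming audio scores far below every baseline.

    `margin` is a hypothetical threshold standing in for "much lower";
    the slides do not give a value.
    """
    if not baseline_log_likelihoods:
        return False  # nothing to compare against, so do not evolve
    return all(log_likelihood < b - margin for b in baseline_log_likelihoods)


def fuse_reports(reports):
    """Combine per-phone speaker confidences into one decision.

    `reports` maps a phone id to {speaker: confidence}. Each phone's
    confidences are normalized to sum to 1, then averaged across phones.
    Averaging is an assumed rule, not necessarily the one Darwin uses.
    """
    totals = defaultdict(float)
    contributing = 0
    for confidences in reports.values():
        norm = sum(confidences.values())
        if norm <= 0:
            continue  # skip phones that reported no usable confidence
        contributing += 1
        for speaker, confidence in confidences.items():
            totals[speaker] += confidence / norm
    if contributing == 0:
        return None, {}
    fused = {speaker: total / contributing for speaker, total in totals.items()}
    best = max(fused, key=fused.get)
    return best, fused


if __name__ == "__main__":
    # Step 1: each phone runs local inference (values here are made up).
    # Step 2: the results are propagated to neighbors as `reports`.
    reports = {
        "phone1": {"A": 0.6, "B": 0.4},
        "phone2": {"A": 0.2, "B": 0.8},  # e.g. a phone in a pocket
        "phone3": {"A": 0.9, "B": 0.1},
    }
    # Step 3: final inference from all local results.
    speaker, fused = fuse_reports(reports)
    print(speaker, fused)
    print(should_evolve(-60.0, [-40.0, -45.0]))
```

In this sketch a single phone with a poor sensing context (phone2) is outvoted by the others, which is the intuition behind the robustness claim in the Collaborative Inference slide.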