THESE AREN’T THE DROIDS YOU’RE LOOKING FOR
Retrofitting Android to Protect Data from Imperious Applications
Peter Hornyack, Seungyeop Han, Jaeyeon Jung, Stuart Schechter, David Wetherall

SIL765
Jagjeet Singh Dhaliwal (2008CS50212)
Manav Goel (2008CS50215)

Applications can’t be trusted
Recent academic research corroborates these findings
* Source: Wall Street Journal - http://online.wsj.com/article/SB10001424052748704368004576027751867039730.html

What is the threat?
• Android applications that misappropriate the user’s privacy-sensitive data
  • Transmit sensitive data that the user intends the application to use on-device only
  • Transmit sensitive data to third parties
• Third parties: servers not used directly for app functionality, but often for advertising & analytics

Outline
• Measurement study of sensitive data usage
• AppFence: a defense against misappropriation of sensitive data
• Framework for evaluating impact on user’s experience
• Evaluation of AppFence on 50 applications

What qualifies as “sensitive data”?
• The authors identified 12 types of privacy-sensitive data on Android:
  device ID, location, phone number,
  contacts, camera, accounts,
  logs, microphone, SMS messages,
  history & bookmarks, calendar, subscribed feeds

How can we tell what apps are doing?
• TaintDroid: dynamic taint tracking for Android applications [Enck et al]

    loc = getLocation();        // taint tag applied
    ...
    loc_copy = loc;             // taint propagated
    ...
    network_send(loc_copy);     // checked for taint

• Apps can’t transform or obfuscate data to remove the taint
• Enhanced TaintDroid: added tracking for all 12 data types
• Gives runtime detection of sensitive data transmission for apps

Study of sensitive data usage
• The authors performed an extensive study of sensitive data usage by Android apps
• 110 popular free apps from Android Market
• Selected to cover all 12 sensitive data types
• Manually executed each app for ~5 minutes
• Used TaintDroid to measure which types of sensitive data were sent out and which destinations they were sent to
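
The source / propagate / sink pattern from the TaintDroid slide, and the (data type, destination) log the study relies on, can be sketched in a few lines of Python. This is only an illustration of the idea, not TaintDroid itself, which tracks taint inside the Android runtime. The `Tainted` class, `get_location`, `network_send`, and the destination `ads.example.com` are hypothetical names made up for this sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tainted:
    """A string value paired with the sensitive-data tags it carries."""
    value: str
    tags: frozenset = field(default_factory=frozenset)

    def __add__(self, other):
        # Propagation: a value derived from tainted data keeps the tags.
        other_value, other_tags = _unwrap(other)
        return Tainted(self.value + other_value, self.tags | other_tags)

    def __radd__(self, other):
        other_value, other_tags = _unwrap(other)
        return Tainted(other_value + self.value, self.tags | other_tags)


def _unwrap(x):
    """Return (value, tags) for tainted and plain values alike."""
    if isinstance(x, Tainted):
        return x.value, x.tags
    return str(x), frozenset()


def get_location():
    # Source: the taint tag is applied here (hypothetical platform call).
    return Tainted("47.653,-122.306", frozenset({"location"}))


def network_send(destination, payload, log):
    # Sink: check for taint and record which data type goes where.
    _, tags = _unwrap(payload)
    for tag in sorted(tags):
        log.append((tag, destination))


log = []
loc = get_location()
loc_copy = loc                      # copy: same tagged value
query = "lat_lon=" + loc_copy       # derived value: tag propagated
network_send("ads.example.com", query, log)
network_send("ads.example.com", "hello", log)  # untainted: nothing logged
print(log)
```

Wrapping the location in a query string does not remove the tag, which is the property the slide describes as "apps can’t transform or obfuscate data to remove the taint".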
Results
For location data (across 110 apps):
[Figure: Android → Application → Third parties]
• Location? (apps asking Android for it): 73 apps
• Sent out by the application: 45 apps
• Sent to third parties: 30 apps
• Of these 30 apps, 28 sent location only to third parties!
• Third parties: Mobclix, Flurry, Inmobi, AdMob
• It appears that some apps use sensitive data only for the purpose of sharing it with third parties

Could they be tracking me?
For unique device IDs (110 apps):
[Figure: Android → Application → Third parties]
• Device ID? (apps asking Android for it): 83 apps
• Sent out by the application: 31 apps
• Sent to third parties: 14 apps
• Just 3 third-party destinations: Mobclix, Flurry, Greystripe
• Multiple apps send the device ID to the same third parties: the risk of cross-application profiling is real

What else do apps misappropriate?
• Two apps sent out the user’s phone number for no apparent reason except tracking
• A call-blocking app sent out the user’s entire contacts book, then asked the user to opt in
• Sensitive data intended only for on-device use may be sent off the device

Outline
• Measurement study of sensitive data usage
• AppFence: a defense against misappropriation of sensitive data
• Framework for evaluating impact on user’s experience
• Evaluation of AppFence on 50 applications

Our Defense: AppFence
[Figure: sensitive data flows from Android to the Application and from the Application to External servers; data shadowing acts on the first flow, exfiltration blocking on the second]
• Two complementary privacy controls:
  • Shadowing: app doesn’t get sensitive data at all
  • Blocking: app gets sensitive data, but can’t send it out

How data shadowing works
[Figure: the Application asks Android “Phone #?”; instead of the real number, (206) 555-4321, it receives the shadow data (123) 456-7890, which is what reaches analytics.com]
CCS - October 17-21, 2011

Three kinds of shadow data
• Blank data
  • e.g. contacts: {S. Han, 206-555-4321} → {}
• Fake data
  • e.g. location: {47.653,-122.306} → {41.887,-87.619}
• Constructed data
  • e.g. device ID = hash(app name, true device ID)
  • Consistent for each application, but different across applications

How exfiltration blocking works
[Figure: the Application asks Android “Phone #?” and receives the real number, (206) 555-4321; when it tries to send the number to analytics.com, it sees “Airplane mode: no network available”]

Outline
• Measurement study of sensitive data usage
• AppFence: a defense against misappropriation of sensitive data
• Framework for evaluating impact on user’s experience
• Evaluation of AppFence on 50 applications

What should we measure?
• Privacy controls may cause changes in application behavior
• Impact on whom?
  • The authors decided to measure the impact of AppFence on the user’s experience
• How did they measure this?
  • Look for user-visible changes in application behavior: side effects

An example of a side effect
• We look for user-visible changes in application screenshots
[Figure: example application screenshots]

Framework for measuring side effects
• Automate application execution by using an Android GUI testing program
  • Converts a script of high-level commands (e.g. “press button,” “select from menu”) into GUI interactions
  • Captures a screenshot after every command
• A human detects side effects by comparing screenshots taken with and without AppFence enabled
• Classify applications based on the side effects observed:
  • None
  • Ads absent
  • Less functional
  • Broken

How we check for side effects
[Figure: for each class of side effect (none, ads absent, less functional, broken), three screenshots side by side: Baseline, AppFence, Diff]

Outline
• Measurement study of sensitive data usage
• AppFence: a defense against misappropriation of sensitive data
• Framework for evaluating impact on user’s experience
• Evaluation of AppFence on 50 applications

Experiments
• Selected 50 apps that sent out sensitive data
• Wrote execution scripts for these apps
  • Exercise main features and features likely to send out sensitive data
• Enable one AppFence privacy control, execute all applications
• Check screenshots for side effects and classify applications

Configuring privacy controls?
• To reveal the most side effects:
  • Data shadowing of all sensitive data types
  • Exfiltration blocking of all types to all destinations
• This imposes a policy on the app: sensitive data should never leave the device
• But don’t some apps have a legitimate need to send out data?

Side effects shown by 50 apps

                   Data shadowing   Exfiltration blocking   Choose least-disruptive
None               28 (56%)         16 (32%)                30 (60%)
Ads absent          0 (0%)          11 (22%)                 3 (6%)
Less functional    14 (28%)         10 (20%)                11 (22%)
Broken              8 (16%)         13 (26%)                 6 (12%)

• Remember, we applied a single privacy control (one or the other) to all applications
• Slightly more than half of the apps ran with limited or no side effects
• Data shadowing was less disruptive than exfiltration blocking
• Choose the control that caused the least-severe side effects for each app: 33 apps (66%) had no side effects or ads absent
• We used profiling to choose; determining in advance is challenging

So 34% of applications didn’t work?
• These apps had four kinds of functionality that directly conflict with our configuration (sensitive data should never leave the device):
  • Location broadcast (location)
  • Geographic search (location)
  • Find friends (contacts)
  • Cross-application gaming profiles (device ID)

When to use data shadowing
• Data types such as device ID, location, phone number
  • Aren’t presented directly to the user
  • Must be transmitted off the device
• Example application behaviors:
  • Device ID sent along with login information
  • Location collected at application launch

When to use exfiltration blocking
• Data types such as contacts, SMS, calendar
  • Presented to the user on the device
  • Don’t need to be transmitted off the device
• Example application behaviors:
  • Selecting a contact to send a message to
  • Adding reminders to calendar

Conclusion
• AppFence breaks the power of the installation ultimatum
• We revealed side effects by never allowing sensitive data to leave the device
• Some apps: user must choose between functionality and privacy
• Majority of apps: two privacy controls can prevent misappropriation without side effects

Further Work
• Extending taint tracking to cover compression using java.util.zip
• Extending data shadowing to offer finer-granularity controls, such as shadowing location with a nearby but less private place, e.g. the city center

Questions?
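
The three kinds of shadow data described earlier, plus the coarser location shadowing suggested under Further Work, might be sketched as follows. This is an illustrative Python sketch under stated assumptions, not AppFence's implementation: the function names are hypothetical, SHA-256 is an arbitrary choice of hash, and rounding coordinates is a simple stand-in for picking "a nearby but less private place".

```python
import hashlib


def blank_contacts(real_contacts):
    # Blank data: the app sees an empty contact list.
    return []


def fake_location(real_location):
    # Fake data: a fixed location unrelated to the real one
    # (the example coordinates from the shadow data slide).
    return (41.887, -87.619)


def constructed_device_id(app_name, true_device_id):
    # Constructed data: hash(app name, true device ID).
    # SHA-256 is an assumption; the slides do not name a hash function.
    material = f"{app_name}|{true_device_id}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def coarse_location(real_location, decimals=1):
    # Assumed stand-in for "nearby but less private": round the
    # coordinates so that only an approximate area is revealed.
    lat, lon = real_location
    return (round(lat, decimals), round(lon, decimals))


true_id = "true-device-id-0001"  # made-up value for the example
id_for_game = constructed_device_id("com.example.game", true_id)
id_for_news = constructed_device_id("com.example.news", true_id)

print(blank_contacts([("S. Han", "206-555-4321")]))
print(fake_location((47.653, -122.306)))
print(id_for_game != id_for_news)
print(coarse_location((47.653, -122.306)))
```

Because the app name is part of the hash input, each application sees a stable identifier across runs, while two applications cannot match their identifiers against each other, which is what limits cross-application profiling.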