Efficient Privilege De-Escalation for Ad Libraries in Mobile Apps Ramesh Govindan (USC)

Efficient Privilege De-Escalation for
Ad Libraries in Mobile Apps
Bin Liu (SRA), Bin Liu (CMU), Hongxia Jin (SRA),
Ramesh Govindan (USC)
The Mobile Ad Ecosystem
[Figure: mobile ad ecosystem diagram. Actors: App Developer, Ad Network, App User. Other labels: Ad Plugin, Phone/Tablet App, Paid by Impressions, Paid by User Clicks, See/Click Ads.]
Introduction
Challenges
PEDAL
Evaluation
Conclusion
Ecosystem Incentives are Skewed Against Users
Ad libraries take unwarranted liberties with personal data on devices in order to target ads more efficiently
Users are especially concerned about the privacy risks posed by ad libraries
“Mobile advertising services were a consistent privacy concern for the most participants”
“Users felt the least comfortable when private resources were used for advertising”
Therefore, our position is that…
Given these privacy concerns about ad libraries:
Ad libraries fundamentally need less privilege than app logic
The user should be able to specify which resources are granted to ad libraries
This cannot be achieved in Android
The Android permission model governs app access to resources, but it applies to the whole app and is enforced at install time
Once the app is installed, the app and all its included libraries are granted access to these resources
Our Approach – Privilege De-Escalation
An ad library can have fewer resource access privileges than the app
logic itself
Users can selectively deny resource access privileges to the ad
libraries without affecting the main app logic
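A minimal sketch of what such a selective policy could look like, assuming a simple per-app record of granted resources plus a user-chosen deny list for ad libraries. The class name, the module kinds, and the resource names are hypothetical; the slides do not specify a policy format.

```python
from dataclasses import dataclass, field

@dataclass
class Policy:
    """Resources the app holds, minus those the user denies to ad libraries."""
    app_granted: set
    ad_denied: set = field(default_factory=set)

    def allowed(self, module_kind: str, resource: str) -> bool:
        if resource not in self.app_granted:
            return False  # the app as a whole never received this resource
        if module_kind == "ad_library":
            return resource not in self.ad_denied  # de-escalated for ad code
        return True  # app logic keeps everything the app was granted

# Hypothetical policy: the app may use location and Internet,
# but the user denies location to ad libraries.
policy = Policy(app_granted={"location", "internet"}, ad_denied={"location"})
print(policy.allowed("app_logic", "location"))
print(policy.allowed("ad_library", "location"))
```

The ad library ends up with a strict subset of the app's privileges, while the app logic is unaffected.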
Our Approach – Examples
Challenges
To implement such a system, we need to answer two questions
How to identify ad library code in an app?
How to effect selective privilege de-escalation?
Both challenges are non-trivial
Challenges in Identifying Ad Libraries
At best, we can access the app's bytecode, the intermediate code produced by compiling source code
There is no annotation that preserves the separation between bytecode from app logic and bytecode from an ad library
Challenges in Identifying Ad Libraries
Some researchers suggest using bytecode path matching to identify ad libraries, e.g. /com/google/ads
However, advanced ad libraries use package-level or code-level
obfuscation to foil this method
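To illustrate why path matching is fragile, here is a small sketch of a prefix-based matcher. Only the /com/google/ads path comes from the slide; the class names and the obfuscated path are made up for the example.

```python
# Hypothetical blacklist of ad package prefixes; only this entry appears in the slides.
AD_PATH_PREFIXES = ("com/google/ads",)

def is_ad_by_path(class_path: str) -> bool:
    """Flag a class as ad code when its package path matches a blacklisted prefix."""
    normalized = class_path.strip("/")
    return any(
        normalized == prefix or normalized.startswith(prefix + "/")
        for prefix in AD_PATH_PREFIXES
    )

plain = "/com/google/ads/AdView"  # made-up class under the known prefix
obfuscated = "/a/b/c/d"           # made-up path after package-level renaming

print(is_ad_by_path(plain))
print(is_ad_by_path(obfuscated))
```

Once the package is renamed, the matcher has nothing left to key on, even though the code behind the name is unchanged.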
Challenges in Privilege De-Escalation
Ideally, the solution must not require changes to the OS or the VM, nor rooting the phone
The solution must be highly efficient; significant slowdowns in
app execution time can affect usability
Challenges in Privilege De-Escalation
Most importantly, in a substantial fraction of apps, ad libraries inherit privileges from the app logic
Any solution for privilege de-escalation must prevent this kind of
privilege inheritance
PEDAL Overview
PEDAL has two components: a Separator and a Rewriter
Input: a packaged app
Output: a repackaged app with de-escalated privileges for any (obfuscated) ad libraries in the app
PEDAL Overview
This design addresses the challenges reviewed earlier
Obfuscation-resistant classification and binary rewriting achieve selective de-escalation for ad libraries
Because it uses binary rewriting, our approach requires no OS-level changes and remains efficient
Finally, the Rewriter, by analyzing information flow across
bytecode sets, can prevent privilege inheritance
Separator Implementation
Most important: choosing a set of features that ensures high classification accuracy
Separator Implementation
We choose six groups of features that are informative for ad library classification:
Usage of Android basic components
Usage of selected Android permissions
Usage of visual elements
Usage of information sources and sinks
Usage of APIs for runtime permission checks
Keyword matching for class/method/field names
We do not use bytecode path information, and the chosen features are resistant to code obfuscation
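The six feature groups above could be turned into a numeric vector roughly as follows. This is only a sketch: the module description format, the keyword list, and the use of simple counts are all assumptions, since the slides do not say how each feature is encoded.

```python
# Hypothetical keyword list and module format, for illustration only.
AD_KEYWORDS = ("banner", "interstitial", "adview")

def keyword_hits(names):
    """Count identifiers containing any ad-related keyword (case-insensitive)."""
    return sum(any(k in name.lower() for k in AD_KEYWORDS) for name in names)

def feature_vector(module):
    """Return one count per feature group, in the order listed on the slide."""
    return [
        len(module.get("components", [])),         # Android basic components
        len(module.get("permissions", [])),        # selected permissions
        len(module.get("visual_elements", [])),    # visual elements
        len(module.get("sources_sinks", [])),      # information sources and sinks
        len(module.get("permission_checks", [])),  # runtime permission-check APIs
        keyword_hits(module.get("names", [])),     # keyword matches in names
    ]

# Made-up module description; note that no package path is consulted.
module = {
    "components": ["Activity"],
    "permissions": ["INTERNET", "ACCESS_FINE_LOCATION"],
    "sources_sinks": ["getLastKnownLocation"],
    "permission_checks": ["checkCallingOrSelfPermission"],
    "names": ["showBanner", "x", "InterstitialAdView"],
}
vector = feature_vector(module)
print(vector)
```

A vector like this is what a classifier would consume for each candidate module.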
Rewriter Implementation
Rewriter effects privilege de-escalation by binary rewriting based on user-specified privacy policies
Rewriter interposes on resource accesses by the ad
library or the app logic
Rewriter interposes only on what we call core resource access functions
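The idea of interposing on a core resource access function can be sketched as a wrapper that consults the user's policy before delegating. Everything here is hypothetical (function names, the fake value, the made-up coordinates), and the sketch passes the caller kind explicitly, whereas a binary rewriter would presumably know it from which bytecode set the call site belongs to.

```python
from typing import Callable

# Hypothetical user policy: resources denied to ad-library code.
AD_DENIED = {"location"}
FAKE_LOCATION = (0.0, 0.0)  # placeholder returned instead of the real value

def real_get_location():
    """Stand-in for a platform location call; returns a made-up coordinate."""
    return (34.02, -118.28)

def interpose(resource: str, real_fn: Callable, fake_value):
    """Wrap a core resource access function with a policy check on the caller."""
    def wrapper(caller_kind: str):
        if caller_kind == "ad_library" and resource in AD_DENIED:
            return fake_value  # de-escalated: the ad library never sees real data
        return real_fn()       # app logic is unaffected
    return wrapper

get_location = interpose("location", real_get_location, FAKE_LOCATION)
print(get_location("app_logic"))
print(get_location("ad_library"))
```

Limiting interposition to a small set of core functions keeps the number of rewritten call sites, and hence the runtime cost, low.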
Rewriter Implementation
Preventing Privilege Inheritance
Focus on flows from core resource access functions in the app logic to Internet access calls in the ad library
Once these potential leakage paths have been identified,
Rewriter performs the same kind of interposition as above
Native Libraries Marginally Affect our Control
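The search for potential leakage paths described on this slide can be approximated as graph reachability: start from resource access functions in the app logic and check whether an Internet access call in the ad library can be reached. The graph, method names, and breadth-first search below are illustrative assumptions, not PEDAL's actual static analysis.

```python
from collections import deque

def leaking_sources(flows: dict, sources: set, sinks: set) -> set:
    """Return the sources from which some sink is reachable along flow edges."""
    leaking = set()
    for src in sources:
        seen, queue = {src}, deque([src])
        while queue:
            node = queue.popleft()
            if node in sinks:
                leaking.add(src)
                break
            for nxt in flows.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return leaking

# Hypothetical flow graph: method -> methods that receive its data.
flows = {
    "app.getLocation": ["app.buildRequest"],
    "app.buildRequest": ["ad.setTargeting"],
    "ad.setTargeting": ["ad.httpPost"],
    "app.readContacts": ["app.showList"],
}
sources = {"app.getLocation", "app.readContacts"}  # resource access in app logic
sinks = {"ad.httpPost"}                            # Internet access in ad library

print(leaking_sources(flows, sources, sinks))
```

Only the flows that actually reach an ad-library sink need interposition, which leaves ordinary app-logic resource use untouched.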
Evaluation: the Separator
Crawled 63,105 free apps from the Google Play Store
Trained an SVM on 335 ad modules and 335 non-ad modules: recall 98.4%, precision 98.5%
Randomly chose 200 apps and manually checked the classification results
Even with obfuscation in most of these apps (120/200), our classifier achieves an accuracy of 93%
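For reference, the recall, precision, and accuracy figures on this slide follow the standard definitions, sketched here with made-up confusion counts. The counts are purely illustrative and are not the study's data.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int):
    """Compute precision, recall, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)  # of modules flagged as ads, how many really are
    recall = tp / (tp + fn)     # of real ad modules, how many were flagged
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, accuracy

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
precision, recall, accuracy = classification_metrics(tp=90, tn=70, fp=10, fn=30)
print(precision, recall, accuracy)
```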
Evaluation: the Separator
Our Separator is more effective than the traditional package name matching approach
Among all apps, our Separator discovered 2,598 unique ad library modules, belonging to 546 unique ad library sources
This is at least 5x more than the numbers reported in papers that maintain a pre-defined blacklist of ad package names
Evaluation: the Rewriter
How much runtime overhead does the rewriting code add?
We selected 100 apps and used a UI automation tool to run both the original and rewritten versions
Both versions of an app were fed identical click streams
Executing these 100 apps showed a total increase in runtime of 0.89% on average.
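One plausible way to compute such an average overhead is to take the relative slowdown of each app and average across apps. The slide does not state how the runs were aggregated, so both the method and the timings below are assumptions.

```python
def mean_overhead_percent(original, rewritten):
    """Average the per-app relative slowdown, expressed in percent."""
    if len(original) != len(rewritten) or not original:
        raise ValueError("need one timing pair per app")
    per_app = [(r - o) / o * 100.0 for o, r in zip(original, rewritten)]
    return sum(per_app) / len(per_app)

# Made-up runtimes in seconds for two apps under identical click streams.
original_s = [100.0, 200.0]
rewritten_s = [101.0, 203.0]
overhead = mean_overhead_percent(original_s, rewritten_s)
print(f"{overhead:.2f}%")
```

Feeding both versions the same click stream is what makes the paired comparison meaningful.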
Evaluation: the Rewriter
How effective can the control be?
100 apps + pre-defined clickstream for each app
No control: 843 ads, 304 of them location-targeted
Control Internet (block ads): 9 ads remain, due to missing core functions
Control location (feed fake location): 806 ads, 249/23 targeting fake/real location; the real-location cases are due to limitations of static flow analysis
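The ad counts reported for the three control settings can be turned into rates with a few lines of arithmetic. The interpretation is an assumption: the sketch treats 249 and 23 as the location-targeted ads seen under fake-location control, split by whether they used the fake or the real location.

```python
# Counts taken from the evaluation slide.
ads_no_control = 843
location_targeted_no_control = 304
ads_internet_blocked = 9
targets_fake, targets_real = 249, 23

# Share of ads suppressed when Internet access is denied to ad libraries.
blocked_share = 1 - ads_internet_blocked / ads_no_control

# Share of ads that were location-targeted without any control.
targeted_share_baseline = location_targeted_no_control / ads_no_control

# Among location-targeted ads under location control, share using the fake location.
fooled_share = targets_fake / (targets_fake + targets_real)

print(f"ads blocked: {blocked_share:.1%}")
print(f"location-targeted at baseline: {targeted_share_baseline:.1%}")
print(f"targeting fake location: {fooled_share:.1%}")
```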
Conclusion
PEDAL: a system to achieve selective privilege de-escalation for
ad libraries
PEDAL performs automated classification to identify ad library code, and rewrites core resource functions to achieve de-escalation
PEDAL is robust, by design, to both package-name and source-code obfuscation
PEDAL shows high classification accuracy and efficacy, yet requires only reasonable computing power to process apps
PEDAL is effective and imposes negligible runtime overhead on apps