Physical Layer Attacks on
Unlinkability in Wireless LANs
Kevin Bauer* Damon McCoy* Ben Greenstein+
Dirk Grunwald* Douglas Sicker*
* University of Colorado
+ Intel Research Seattle
Our Wireless World
Our wireless devices reveal lots of information about us
[Figure: frames sent behind link layer headers expose private data, e.g. “Blood pressure: high”, PrivateVideo1.avi, PrivatePhoto1.jpg, “Buddy list: Alice, Bob, …”, “Home location=(47.28,…”]
1
Best Security Practices for 802.11
Bootstrap
Username: Alice
Key: 0x348190…
SSID: Bob’s Network
Key: 0x2384949…
Out-of-band (e.g., password, Wi-Fi Protected Setup)
Discover
802.11 probe: “Is Bob’s Network here?”
802.11 beacon: “Bob’s Network is here”
Authenticate and Bind
802.11 auth: “Proof that I’m Alice”
802.11 auth: “Proof that I’m Bob”
Send Data
802.11 header (frames in both directions)
• Confidentiality
• Authentication
• Integrity
2
Problem: Short-Term Linking
[Figure: frames from two transmitters interleaved over time. One stream carries address 12:34:56:78:90:ab with seqno 1, 2, 3, 4, …; the other (Alice -> AP) carries address 00:00:99:99:11:11 with seqno 102, 103, 104, …]
Easy to isolate packet streams using addresses, seq nums
3
Problem: Short-Term Linking
Isolated data streams are susceptible to side-channel
analysis using packet size and timing information
– Exposes keystrokes, VoIP calls, webpages, movies, …
[Liberatore, CCS ’06; Pang, MobiCom ’07; Saponas, Usenix Security ’07;
Song, Usenix Security ’01; Wright, IEEE S&P ’08; Wright, Usenix Security ’07]
[Figure: side-channel examples: device fingerprints, video compression signatures, keystroke timings. Plot labels: transmission sizes, DFT.]
4
Solution: Encrypt Entire Frames
“SlyFi”, MobiSys ’08
Which packets are
transmitted by which
devices?
On average, 3-9 data streams overlap
in each 100 ms interval
Unlinkability is achieved
5
Our Goal: Short-Term Linking Using
Physical Layer Information
• State-of-the-art methods require specialized
and expensive hardware [Brik, MobiCom
’08; Danev, Usenix Security ’09]
• We want to perform short-term transmitter
packet linking using low-cost commodity
hardware
[Figure: a stream of frames whose sender is hidden (??? -> AP); the goal is to label each frame with its transmitter, e.g. Alice or Charlie]
6
Talk Outline
• Motivation and Goals ✓
• Physical Layer Packet Linking
• Experimental Evaluation
• Solution: Introduce Noise
7
Signal Strength Background
RSSI values can be obtained using
commodity 802.11 radios and drivers
Increasing distance
Eavesdropper
-50 dB
Decreasing RSSI
-65 dB
-85 dB
Received signal strength indication (RSSI)
decreases as transmissions travel farther
Noise floor
8
Real World Signal Strength Behavior
Physical Location
Signal Strength (dB)
Received signal strength is influenced by
the transmitting device’s physical location
9
Packet Linking with Device Localization
We first try to link packets by location
– RSSI values fluctuate due to environmental noise
– Supervised learning algorithms: RSSI → location mapping
– We use k-nearest neighbors [Bahl, Infocom ’00]
But localization requires training
data, which is expensive and
time consuming to collect
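
A minimal sketch of the k-nearest-neighbors idea, assuming each training sample pairs an RSSI vector with known (x, y) coordinates and that the prediction averages the coordinates of the k closest samples in signal space. The function name, the value of k, and all data values are hypothetical, not taken from the talk.

```python
import math

def knn_locate(train, rssi, k=3):
    """Estimate a position by averaging the coordinates of the k training
    samples whose RSSI vectors are closest (Euclidean distance in dB)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], rssi))[:k]
    xs = [loc[0] for _, loc in nearest]
    ys = [loc[1] for _, loc in nearest]
    return (sum(xs) / len(nearest), sum(ys) / len(nearest))

# Hypothetical training data: (RSSI at 3 sensors in dB, (x, y) in meters).
train = [
    ((-50, -70, -85), (10, 5)), ((-52, -69, -83), (11, 5)), ((-49, -72, -86), (10, 6)),
    ((-80, -55, -65), (40, 30)), ((-78, -57, -66), (41, 30)), ((-82, -54, -63), (40, 31)),
]

# A new packet whose RSSI vector resembles the first group of samples.
print(knn_locate(train, (-51, -71, -84)))
```

Every row of `train` has to be measured at a known position, which is the collection cost noted above.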
10
An Unsupervised Approach
We’re not interested in mapping packets to
location, just linking packets to transmitters
Use a clustering algorithm to handle noise
11
More Details
• Use k-means to classify packets by transmitter
– n listening sensors
– Feature vector: (RSSI1, RSSI2, … , RSSIn)
• k-means is probabilistic → may not find a
globally optimal solution
– Heuristic: Run 100 times to get a stable solution
• Meets our goal: Requires only commodity
802.11 hardware, stock drivers, and no
training
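
A minimal sketch of the restart heuristic, assuming the "stable solution" is taken to be the run with the lowest residual sum of squares. That selection rule, the function names, and the RSSI values are assumptions for illustration. Each packet is a feature vector (RSSI1, …, RSSIn) with n = 3 sensors here.

```python
import math
import random

def kmeans_once(points, k, rng, max_iter=100):
    """One k-means run from random initial centroids; returns (rss, labels)."""
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    rss = sum(math.dist(p, centroids[l]) ** 2 for p, l in zip(points, labels))
    return rss, labels

def kmeans_restarts(points, k, runs=100, seed=0):
    """Repeat k-means `runs` times and keep the lowest-RSS result."""
    rng = random.Random(seed)
    return min((kmeans_once(points, k, rng) for _ in range(runs)),
               key=lambda result: result[0])

# Hypothetical packets from two transmitters, RSSI in dB at three sensors.
packets = [
    (-50, -70, -85), (-52, -69, -83), (-49, -72, -86), (-51, -71, -84),
    (-80, -55, -65), (-78, -57, -66), (-82, -54, -63), (-79, -56, -64),
]
rss, labels = kmeans_restarts(packets, k=2)
print(labels)
```

No location labels appear anywhere in the input; the only side information is k.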
12
Talk Outline
• Motivation and Goals ✓
• Physical Layer Packet Linking ✓
• Experimental Evaluation
• Solution: Introduce Noise
13
Experimental Evaluation
• Collect real signal strength data in a 75m × 50m office building
• 5 passive monitors and 58 different measurement positions
• Our dataset is available in CRAWDAD wireless trace repository:
http://crawdad.cs.dartmouth.edu/cu/rssi
14
Packet Clustering Accuracy
But is this good enough
to enable interesting traffic
analysis?
Higher = Better
• Adversary uses 5 sensors to
record packets’ RSSI values
• Generate 100 random
device configurations
• Clustering accuracy > 75%
for all experiments
• The localization-based approach is less
accurate
(see paper for details)
Vary the number of transmitters from 5 to 25
k-means is very accurate at clustering packets using RSSI
15
Website Fingerprinting Accuracy
Higher = Better
• Attack: Encrypted website
fingerprinting using
[Liberatore and Levine,
CCS ‘06]
• Naïve Bayes classifier to
identify websites after
clustering packets
Simple traffic analysis task performs well
16
Talk Outline
• Motivation and Goals ✓
• Physical Layer Packet Linking ✓
• Experimental Evaluation ✓
• Solution: Introduce Noise
17
Solution: Vary Transmit Power
Intuition:
We expect tight, separable clusters
Goal: Make the clusters overlap
Cluster is now larger,
more likely to overlap
with other clusters:
this introduces more
clustering errors
Varying transmit power introduces more noise in RSSI
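
A rough simulation of the intuition, under an idealized assumption: changing transmit power by some number of dB shifts the RSSI seen at every sensor by the same amount. The noise level, power steps, and base RSSI values are hypothetical. The sketch compares the mean squared distance to the centroid for a fixed-power device and a power-varying one.

```python
import random

def simulate(base, n_packets, noise_db, power_steps_db, rng):
    """Generate RSSI vectors for one stationary device."""
    packets = []
    for _ in range(n_packets):
        offset = rng.choice(power_steps_db)  # same shift at every sensor (assumed)
        packets.append(tuple(b + offset + rng.gauss(0, noise_db) for b in base))
    return packets

def spread(packets):
    """Mean squared Euclidean distance from the cluster centroid."""
    centroid = [sum(col) / len(packets) for col in zip(*packets)]
    return sum(sum((x - c) ** 2 for x, c in zip(p, centroid))
               for p in packets) / len(packets)

rng = random.Random(1)
base = (-50.0, -70.0, -85.0)  # hypothetical mean RSSI at three sensors, in dB
fixed = simulate(base, 500, 2.0, [0.0], rng)
varied = simulate(base, 500, 2.0, [0.0, -5.0, -10.0, -15.0], rng)
print(spread(fixed), spread(varied))
```

The power-varying device produces a much larger cluster, which is what makes overlap with neighboring clusters more likely.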
18
Solution: Directional Antenna
Intuition: Focus signal in different directions:
creates “phantom” clusters
Inexpensive “cantenna”
1 device, 4 distinct clusters
Using a directional antenna causes fluctuation in RSSI
19
Combined: Clustering Accuracy
Lower = Better
• 15 transmitters total
• Vary number of devices
that add noise
• Decreases clustering
accuracy from 80% to
50%
• Traffic analysis accuracy
decreases from 40% to
26% for devices that add
noise
Both solutions decrease clustering accuracy
20
Other Potential Solutions
• Anonymity (still) loves company
– The more devices, the better
– Devices close together have similar clusters
• Wireless cover traffic
– Devices transmit “dummy traffic” to frustrate side
channel attacks
– Wireless shared medium → degrades performance
• Physical security, jamming, frequency hopping
– Performance implications, may not be effective
• Physical layer info is hard to control
21
Conclusion
• Wireless devices are becoming personal and pervasive
• Information present at the physical layer can lead to
privacy leaks
– Short-term linking: Side-channel attacks
• Defenses to mitigate attacks
– Introducing additional noise reduces clustering accuracy
– More research is needed to help address privacy risks
exposed by the physical layer
22
Backup Slides
23
How many sensors are enough?
Almost no gain after three sensors
24
Empirical stream interleaving
Many streams interleaved at short timescales
25
Why use k-means?
• k-means performs
well with spherical
patterns
• It’s simple, yet it
out-performed
other clustering
methods on our task
26
How does distance affect accuracy?
Two transmitters at
different distances
Measured accuracy
of k-means
27
What if the attacker doesn’t know k?
Even if the attacker can only approximate k, the website
fingerprinting attack can still perform well
28
Related Work
• Device Distinction
– Detect MAC spoofing [Faria, WISE ‘06]
» Doesn’t generalize to k devices
– Uses multipathing to detect spoofing [Patwari ‘07]
» Uses non-commodity hardware
• RF Fingerprinting
– Uses electromagnetic signature [Hall ‘05]
» Uses expensive non-commodity hardware
– Uses modulation fingerprinting [Brik ’08, Danev ‘09]
» Relies on signal analyzer hardware
29
Clustering accuracy: F-measure
Weighted harmonic mean of precision and recall:
1. In terms of information retrieval:
2. In terms of classification:
tp: true positive
fp: false positive
fn: false negative
Precision: homogeneity of each cluster
Recall: extent to which packets from the same
transmitter are clustered together
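
A small sketch of the standard F-measure computed from tp, fp, and fn. The `beta` weight and its default of 1 (the balanced harmonic mean) are assumptions; the weighting used in the talk is not recoverable from the slide text.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """Weighted harmonic mean of precision and recall (F-beta)."""
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical counts: precision = 0.75, recall = 0.5.
print(f_measure(tp=6, fp=2, fn=6))
```

With beta = 1 this equals 2·tp / (2·tp + fp + fn).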
30
k-Means Clustering Algorithm
• Input: Data set and number of clusters k
• Initialization: Select initial cluster centroids by
choosing k data points at random
• Repeat until cluster membership is stable:
– Compute the distance from each data point to each
of the k centroids
– Group the data points by their closest centroid
– Compute the new cluster centroids
• k-means minimizes the residual sum of squares
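
The steps above can be sketched as follows. This is a plain illustration with hypothetical 2-D points, not the implementation used in the experiments; the iteration cap is an added safeguard.

```python
import math
import random

def kmeans(points, k, seed=0, max_iter=1000):
    rng = random.Random(seed)
    # Initialization: choose k data points at random as initial centroids.
    centroids = rng.sample(points, k)
    membership = None
    for _ in range(max_iter):
        # Compute distances to the k centroids and group by closest centroid.
        new_membership = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_membership == membership:  # cluster membership is stable
            break
        membership = new_membership
        # Compute the new centroid of each cluster as the mean of its members.
        for c in range(k):
            members = [p for p, m in zip(points, membership) if m == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, membership

def residual_sum_of_squares(points, centroids, membership):
    """The quantity k-means drives down: squared distance to assigned centroid."""
    return sum(math.dist(p, centroids[m]) ** 2 for p, m in zip(points, membership))

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, membership = kmeans(points, k=2)
print(membership, residual_sum_of_squares(points, centroids, membership))
```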
31
Why does clustering perform
better than localization for linking?
• Surprising result
– Training means it should be better, right?
• But localized packets have error (3.5 meters
at the median), so we still need to cluster the
localized packets by their location predictions
– Errors from localization and clustering steps are
additive
32
Estimating k from data
• k-means tries to minimize the within-cluster
residual sum of squares
$$\sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2$$
where μi is the centroid of cluster Si and x ranges over the points in Si
• Choose k s.t. the within-cluster sum of squares
is minimized using cross validation
– Works best when clusters are separable
33