Mapping the Urban Wireless Landscape with Argos
Ian Rose, Matt Welsh
ianrose@eecs.harvard.edu
mdw@eecs.harvard.edu

Motivation
- WiFi devices are extremely popular; usage continues to grow dramatically.
- Wireless is increasingly pervasive (no longer just indoors) and diverse.

Motivation
- Suppose we had a global view of a city's wireless traffic... What kind of questions might we ask?
  - What are users' mobility patterns?
  - How do traffic and usage vary by device type (phone vs. laptop) or setting (cafe vs. bus)?
  - How much malware is present in wireless networks?

The Big Picture
- Deploy WiFi sniffers across a large urban area
- Sniffers capture wireless traffic, merge individual traces into a global view, run custom user queries
- Our deployment: CitySense network
  - 26 sniffers in Cambridge, MA
  - using wireless mesh for network connectivity

Hardware Implementation
- SBC: Soekris net4826 or ALIX 2c2
- CM9 2.4 GHz + 8 dBi antenna for sniffer
- XR9 900 MHz + 6 dBi antenna for mesh
- Power from streetlights or wall sockets

Deployment
[Map of the deployment. Cluster labels: 2 sniffers, 13 sniffers, 9 sniffers, 2 sniffers. Distance labels: 5 km, 8.5 km.]

Challenges
- Poor packet capture rates from individual sniffers
- Scalability, esp. regarding sniffer nodes' backhaul connectivity
- Monitored population is quite diverse, exhibits large temporal and spatial variance

Privacy Concerns
- There is an (obvious) big privacy concern here.
- One goal: understand privacy vs. research tradeoffs
- Also, understand the capabilities of systems like this (whether for “good” or for “evil”): what are the actual risks/dangers?
- Identifying fields obfuscated by the system (IP address, MAC)

Architecture: User Queries
- User queries are expressed as a dataflow graph of packet processing operators.
- Think Click Modular Router or stream processing engines
- Let's consider a simple example: “stolen laptop finder”

Architecture: Collecting Packets
- Sniffer network w/ wireless mesh backhaul
- Naive method: all sniffers stream captured packets to server for merging and user queries.

Architecture: Trace Merging
- Goal: obtain complete, network-wide view of captured traffic.

Architecture: Trace Merging
- Trace merging like this is pretty standard practice (e.g. Jigsaw, Wit, Yeo et al. '04)
- In wired sniffer networks: all captured packets are collected at a central location for merging
- Expensive or impossible to do with a low-bw backhaul!
- How can we merge in the network?

Architecture: In-Network Processing
- Option #1: Centralized Merging
- Option #2: In-Network Merging
  - This reduces b/w somewhat, as it eliminates duplicates, but we can do much better!

Architecture: User Queries
- Split user queries into sniffer and server dataflows (similar to Wishbone [NSDI '09])
- So how does this help?

Architecture: In-Network Processing
- Option #3: In-Network Merging and user queries
  - Big b/w savings by sending only query outputs back to server (90% is common).

Architecture: In-Network Processing
- Main points:
  - Merging packets in-network saves some b/w
  - But the big savings come from running user queries in-network
- A few complications glossed over here (discussed in paper)

Architecture: Sniffer Nodes

Architecture: Channel Management
- There are 11 radio channels (802.11b/g)
- We need channel policies to determine when to change the radio channel
- When particularly interesting traffic is detected, sniffers can also recruit nearby sniffers to all focus on one channel to maximize capture

Sample Query
- Would not work right with packets from merged tap: requires all (and only) locally captured packets.
- Done!
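
Illustration: splitting a query into sniffer and server dataflows
The sketch below is not Argos code. It is a minimal Python illustration, under stated assumptions, of how the “stolen laptop finder” idea might be written as a chain of packet-processing operators, with filtering done on the sniffers and only the query output sent to the server. The `Frame` record, the operator names (`mac_filter`, `rate_limit`, `pipeline`, `latest_sighting`), the 10-second rate limit, and the sample MAC addresses are all hypothetical. For brevity, one stream tagged with `sniffer_id` stands in for the separate per-sniffer streams.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator

@dataclass(frozen=True)
class Frame:
    """Hypothetical record for one captured 802.11 frame."""
    sniffer_id: str
    timestamp: float  # seconds
    src_mac: str
    channel: int

Operator = Callable[[Iterable[Frame]], Iterator[Frame]]

def mac_filter(target: str) -> Operator:
    """Sniffer-side operator: keep only frames sent by `target`."""
    wanted = target.lower()
    def run(frames: Iterable[Frame]) -> Iterator[Frame]:
        return (f for f in frames if f.src_mac.lower() == wanted)
    return run

def rate_limit(min_gap_s: float) -> Operator:
    """Sniffer-side operator: forward at most one frame per sniffer
    every `min_gap_s` seconds, to spare the mesh backhaul."""
    def run(frames: Iterable[Frame]) -> Iterator[Frame]:
        last_sent: Dict[str, float] = {}
        for f in frames:
            prev = last_sent.get(f.sniffer_id)
            if prev is None or f.timestamp - prev >= min_gap_s:
                last_sent[f.sniffer_id] = f.timestamp
                yield f
    return run

def pipeline(*ops: Operator) -> Operator:
    """Chain operators into a linear dataflow."""
    def run(frames: Iterable[Frame]) -> Iterator[Frame]:
        stream: Iterable[Frame] = frames
        for op in ops:
            stream = op(stream)
        return iter(stream)
    return run

def latest_sighting(frames: Iterable[Frame]) -> Dict[str, float]:
    """Server-side operator: most recent sighting time per sniffer."""
    seen: Dict[str, float] = {}
    for f in frames:
        seen[f.sniffer_id] = max(seen.get(f.sniffer_id, f.timestamp), f.timestamp)
    return seen

# Made-up capture: three sightings of the target on s1, one on s2, one unrelated frame.
captured = [
    Frame("s1", 0.0, "AA:BB:CC:DD:EE:FF", 6),
    Frame("s1", 0.5, "11:22:33:44:55:66", 6),
    Frame("s1", 1.0, "aa:bb:cc:dd:ee:ff", 6),
    Frame("s2", 12.0, "AA:BB:CC:DD:EE:FF", 11),
    Frame("s1", 15.0, "AA:BB:CC:DD:EE:FF", 6),
]

sniffer_query = pipeline(mac_filter("aa:bb:cc:dd:ee:ff"), rate_limit(10.0))
forwarded = list(sniffer_query(captured))  # what would cross the backhaul
sightings = latest_sighting(forwarded)     # what the server computes

print(f"{len(forwarded)} of {len(captured)} frames forwarded; sightings: {sightings}")
```

Only the frames that survive the sniffer-side operators cross the backhaul, which is the source of the bandwidth savings described for Option #3.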

Performance Evaluation: Summary
- In-network Traffic Processing
  - leads to a more even distribution of traffic load over network links; bottleneck links greatly reduced
  - allows sniffer networks to scale to a greater offered load (monitored population)
- Channel Focusing
  - increases network-wide capture of small windows of “interesting” traffic in some cases (no advantage in other cases)

Performance Evaluation
- In-network processing evaluated analytically
  - 25 sniffers in grid
  - Wired sink in center
  - Variable # sources (placed randomly)
  - Empirically-derived traffic model
- Max. link load 8x higher in centralized case

Case Studies
- Popular websites and search patterns
- Malicious traffic
- Tracking public transportation services
  - Commuter trains
  - Private bus lines
- Wireless client fingerprinting

Case Study: Train Tracking
[List of captured network names (SSIDs):]
  LTRX_IBSS
  ^@
  ^@^@^@^@
  Muffin
  MITRA-PC_Network
  shafanali
  Saleeqa
  hpsetup
  BrooklineWireless
  ARTSHOP
  MBTA_WiFi_Coach0365_Box-076
  MBTA_WiFi_Coach0389_Box-180
  Free Public WiFi
  Verizon MiFi
  MNR
  Ganesh
  LINKSYS
  MBTA_WiFi_Coach0227_Box-038
  MBTA_WiFi_Coachnnnn_Box-050 Coach0385_Box-068
  skando
  MBTA_Wifi_Coach1612_Box-143

Case Study: Train Tracking

Case Study: Train Tracking
- From captured traffic, try to predict:
  - when trains passed by
  - their direction of travel
- Use published train schedule as “gold standard” (probably not 100% accurate!)
- Over a 24 hour test, all 34 trains detected successfully
  - time estimates usually accurate to within ~5 min.
  - direction estimates: 25 correct, 4 incorrect

Case Study: User Fingerprinting
- WiFi devices send Probe Requests to search for known networks
- By capturing these and geolocating the named networks (via www.wigle.net) we can fingerprint users' movements

  Rank  Probe Reqs  Unique Nets  Locatable
  1     7431        49           28
  2     87          48           11
  3     370         46           10
  4     632         47           10
  5     120         47           0

[Map of geolocated networks. Labels recovered from the slide: Tulsa OK, Chicago IL, Oregon, Mass., UK, Austin TX, Belgium, "trains".]

Related Work
- Wardriving: wide spatial coverage, but no temporal coverage (Akella et al. - MobiCom '05, Han et al. - IMC '08)
- Dense indoor monitoring: good temporal coverage and high capture fidelity, but limited spatial coverage (Jigsaw & Wit - Sigcomm '06)

Conclusions
- Urban wireless capture is a difficult business; Argos shows that the technique is possible via:
  - in-network merging & user queries to reduce traffic
  - intelligent 802.11 channel control
- Our case studies demonstrate Argos' utility, but many more opportunities exist
- Future work:
  - improved anonymity guarantees
  - other sniffer types (vehicular, mobile phone, etc.)

Thanks!
Ian Rose
ianrose@eecs.harvard.edu
http://eecs.harvard.edu/~ianrose
Matt Welsh
mdw@eecs.harvard.edu
http://eecs.harvard.edu/~mdw