SybilCast: Broadcast on the Open Airwaves
SETH GILBERT, CHAODONG ZHENG
National University of Singapore

Sunday afternoon in Starbucks
We have a Sybil attack!
[Figure: base station, Alice (u), and Sean (v) with sybil identities v1 … v9; bandwidth labels B/2 and B/10.]

Radios can access many channels
- Use radio resource testing!
- Honest users: always pass the test!
- Malicious users: lose a (fake) id with 50% chance!
[Figure: radio resource test with identities u, v, x, y on channel one and channel two; labels: msg, "Ack for msg", !ALERT!.]
[1] J. Newsome, E. Shi, D. Song, and A. Perrig. The Sybil attack in sensor networks: Analysis & defenses.
[2] D. Mónica, J. Leitão, L. Rodrigues, and C. Ribeiro. On the use of radio resource tests in wireless ad-hoc networks.

Challenges
- Colluding: malicious users can cover more than one channel
- Other malicious behavior: malicious users jam channels and/or spoof messages
- Continuous nature of the system: cannot run a set of tests and then stick to normal data delivery protocols
- Efficiency of detection: overhead for detecting sybil identities must be low

Overview
1. Introducing sybil attacks
2. Model and problem
3. The SybilCast protocol:
   - Structure
   - Why it works

Model
- Synchronous wireless network:
  - Single-hop
  - c channels
- Users:
  - One (authenticated) base station
  - Up to N real users (unauthenticated) that come and go
- Radios:
  - Everyone has one radio and chooses one channel in each round
  - Transmit or receive
[Figure: base station and users v, w on channel one, channel two, …, channel c.]

Malicious users
- Malicious users: at most t < c/8, colluding
- Capabilities:
  - Create sybil identities
  - Jam channels
  - Spoof messages
- Each has only one radio transceiver as well!
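Because a malicious user also has a single transceiver, the radio resource test from the earlier slide catches a fake identity about half the time in each round. The sketch below is a minimal Python model of that argument, not the authors' implementation. It assumes two channels per test, no jamming, and no message loss; the names radio_resource_test and survival_rate are hypothetical.

```python
import random


def radio_resource_test(same_radio: bool, rng: random.Random) -> bool:
    """Return True if both tested identities pass one test round.

    Assumed model: the base station assigns the two identities to channel 1
    and channel 2 and transmits on one of them, chosen at random.
    """
    target = rng.choice((1, 2))  # channel the base station transmits on
    if not same_radio:
        # Two real radios: each identity listens on its own channel.
        return True
    # One radio behind both identities can listen on only one channel.
    listening = rng.choice((1, 2))
    return listening == target


def survival_rate(rounds: int, trials: int, seed: int = 0) -> float:
    """Fraction of trials in which a sybil pair survives `rounds` tests."""
    rng = random.Random(seed)
    survived = sum(
        all(radio_resource_test(True, rng) for _ in range(rounds))
        for _ in range(trials)
    )
    return survived / trials


if __name__ == "__main__":
    print("survive 1 test :", survival_rate(1, 20000))
    print("survive 5 tests:", survival_rate(5, 20000))
```

Under these assumptions, repeating the test drives the survival probability of a sybil pair down geometrically, while two honest radios never fail.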
[Figure: channels one to c with the base station, Sean, Shirley, and identities q, r, v, w, x, y; labels include a "Quit" message and a burst of noise.]

Problem: fair bandwidth access
- Basic problem:
  - Users arrive and request data
  - Base station delivers data to user
- Goal: every user gets a fair share of the bandwidth:
  - If there are at most n* users in the system during request m,
  - then request m gets 1/n* of the total bandwidth
[Figure: the base station sends data to u on one of channels one to c; Sean and Shirley are also present.]

Introducing SybilCast
Three phases per epoch:
- Registration phase: new users join the network
- Data phase: registered users receive data and authentication information
- Verification phase: base station checks registered users
[Figure: timeline of one epoch, starting with d registered identities and ending with 2d − s registered identities.]
- registration phase: at most d new ids registered; length Θ((d + c) log² N)
- data phase: at most 2d ids present; length Θ((d + c) log² N)
- verification phase: s ids removed; length Θ((d + c) log N)

Why those lengths?
- Balance the admission rate of sybil identities against the admission rate of honest identities:
  - Fast admission → low registration overhead
  - However: fast admission → more sybil identities → low throughput
- Registered identities at most double!
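The identity counts within one epoch can be traced in a few lines of Python. This is only a bookkeeping sketch of the "at most double" rule stated above; the function name run_epoch and its parameters are hypothetical, and phase lengths are left out.

```python
def run_epoch(d: int, join_requests: int, failed_verification: int):
    """Trace identity counts through one epoch.

    d                   -- identities registered when the epoch starts
    join_requests       -- identities (honest or sybil) trying to register
    failed_verification -- identities that fail the verification phase
    Returns (ids present in the data phase, ids left after verification).
    """
    admitted = min(join_requests, d)       # registration: at most d new ids
    present = d + admitted                 # data phase: at most 2d ids
    s = min(failed_verification, present)  # verification: s ids removed
    return present, present - s


if __name__ == "__main__":
    d = 4
    for joins, failed in [(10, 3), (10, 6), (2, 0)]:
        present, d = run_epoch(d, joins, failed)
        print(f"data phase: {present} ids, next epoch starts with {d}")
```

Capping admissions at d means that even a flood of sybil join requests can at most double the registered set before the next verification phase.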
Registration phase
- Goal: deliver a final seed to each request:
  - Long random binary string
  - Used as a frequency hopping sequence
  - Hidden from the malicious users
- Procedure:
  - Divide the phase into Θ(log N) sub-phases
  - In each sub-phase, deliver a partial seed to the user
  - The user takes the XOR of all Θ(log N) partial seeds
[Figure: a registration phase of length Θ((x + c) log² N) split into Θ(log N) sub-phases.]

Challenges and Tools
- Avoid jamming: random uncoordinated frequency hopping
- Authenticating nodes (to counter spoofing): hash chain
- Avoid contention among nodes:
  - Backoff protocol (ensures delivery of a single partial seed)
  - Registration list (ensures enough partial seeds)

Data phase
- Goal: deliver data and nonces to registered identities
- Procedure for each round:
  - Base station chooses a random registered identity
  - It sends a packet with data and a nonce on the pre-agreed channel
  - The intended receiver gets the data
  - All nodes on that channel record the nonce!
[Figure: the base station sends packets <m_u | r1> and <m_w | r2> (data | nonce) over channels one to three. u records <m_u | r1>, v records <× | r1>, w records <m_w | r2>; one more record reads <× | r1, r2>.]

The Power of the Nonce™
- Most sybil identities miss many nonces:
  - Many sybil identities → spread over many channels.
  - Spread over many channels → high likelihood of losing nonces.
  - We show: if there are s > 12t sybil identities, then after k data rounds, ≥ s − 12t of them will lose ≥ αk/c nonces.
- Honest identities do not miss many nonces:
  - An honest node loses each nonce with probability ≤ 1/8.
  - After k data rounds, each honest node loses ≤ k/(8c) nonces.
- We show 1/8 < α, so honest nodes win!

Verification phase
- Procedure:
  - Users send collected nonces back to the base station
  - (Uncoordinated) frequency hopping to resolve jamming and contention.
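The nonce argument above can be illustrated with a small simulation. This is a rough sketch under simplifying assumptions that are not taken from the slides: one honest node, one malicious radio controlling s sybil identities, uniformly random hopping sequences, and an honest node that loses each nonce sent on its channel with probability 1/8 (the worst case quoted above). The name simulate_data_phase and all parameters are hypothetical.

```python
import random
from collections import Counter


def simulate_data_phase(c, s, k, loss_prob=1 / 8, seed=0):
    """Count missed nonces over k data rounds on c channels.

    Returns (honest_missed, sybil_missed), where sybil_missed[i] is the
    number of nonces sent on sybil identity i's channel that the single
    malicious radio was not listening to.
    """
    rng = random.Random(seed)
    honest_missed = 0
    sybil_missed = [0] * s
    for _ in range(k):
        honest_ch = rng.randrange(c)
        sybil_ch = [rng.randrange(c) for _ in range(s)]
        # One radio, many identities: listen where most identities sit.
        listen_ch = Counter(sybil_ch).most_common(1)[0][0]
        # Channel the base station transmits on (assumed uniform).
        bs_ch = rng.randrange(c)
        # The honest node is always on its own channel; it only loses a
        # nonce to interference, modelled here as an independent loss.
        if honest_ch == bs_ch and rng.random() < loss_prob:
            honest_missed += 1
        # A sybil identity misses the nonce if it was due on a channel
        # the malicious radio could not cover this round.
        for i, ch in enumerate(sybil_ch):
            if ch == bs_ch and ch != listen_ch:
                sybil_missed[i] += 1
    return honest_missed, sybil_missed


if __name__ == "__main__":
    honest, sybils = simulate_data_phase(c=8, s=16, k=4000)
    print("honest node missed:", honest)
    print("sybil identities missed (min, max):", min(sybils), max(sybils))
```

In this toy model the honest node misses far fewer nonces than any sybil identity, and that gap is what a threshold on missed nonces can exploit.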
- Threshold αk/c: the base station eliminates identities without enough nonces
- Guarantee:
  - No honest users are eliminated (w.h.p.)
  - All but 12t sybil identities are eliminated (w.h.p.)

Putting everything together
For a request m from honest node p:
- n* = maximum number of active real nodes
- d* = maximum number of registered identities
[Figure: timeline over epochs i, i+1, i+2, …, j. Events: p initiates a request, p obtains its first partial seed, p finishes registration. Interval labels: O((d* + c) log² N) and O((n* + c) c log³ N).]
- p finishes registration in O((n* + d* + c) c log³ N) time.
- However, d* may count (many) sybil identities! We need to constrain d*!
  - By the end of any epoch: of the x remaining identities, at most 12t are sybils.
  - x − 12t = O(n*), hence x = O(n*).
  - In the next epoch, at most x new identities are registered.
  - We have d* = O(2x) = O(n*).
- So p finishes registration in O((n* + c) c log³ N) time.
- Once registered, p gets m in O(|m| ∙ d*) = O(|m| ∙ n*) time.
- In total, p needs O((n* + c) c log³ N) + O(|m| ∙ n*) time.
- If |m| = Ω(c² log³ N), this is just O(|m| ∙ n*) time! I.e., (asymptotically) optimal time!

SybilCast's key property
Theorem: If an honest user requests data m of size |m| ≥ c² log³ N, and if there are at most n* concurrently active real nodes at any point during the request, then the download will complete in O(|m| ∙ n*) time w.h.p.
Corollary: On average, each honest user corresponds to O(1) sybil identities, hence each honest user can finish its data download in asymptotically optimal time.
THIS IS IT!

Conclusion
- SybilCast solves fair bandwidth allocation despite:
  - Sybil attacks!
  - Jamming!
  - Spoofing!
- Combination of existing tools: radio resource testing, frequency hopping, hash chain, …
- And innovations: admission rate control, deferred verification, …
- Distri-SybilCast?

If you have questions, now is the time!