A Framework for Secure Data Aggregation in Sensor Networks
Yi Yang
Joint work with Xinran Wang, Sencun Zhu and Guohong Cao
Dept. of Computer Science & Engineering
The Pennsylvania State University

Sensor networks
• Functions
  – Sensing
  – In-network processing
  – Ad-hoc communication
• Applications
  – Real-time traffic monitoring
  – Military surveillance
  – Homeland security
[Figure: sensor nodes reporting to the base station (BS); Berkeley Mica Motes]

Yi Yang - SDAP

Why data aggregation? (1)
• Without data aggregation
  – Data redundancy
  – Communication cost
  – Energy expenditure
Reporting raw data is unnecessary!

Why data aggregation? (2)
• With data aggregation
Reduce data redundancy, communication cost and energy expenditure in data collection!

Security challenges in aggregation (1)
• Aggregation is a lossy data compression process
  – Individual sensor readings are lost in aggregation
• A compromised intermediate node may change the aggregated data
• BS cannot verify the result without knowing the original readings
[Figure: a compromised node on the path to BS; false alarm]

Security challenges in aggregation (2)
• Question:
  – How can BS obtain a good approximation of the fusion result when a fraction of nodes are compromised?
[Figure: a compromised node on the path to BS; false alarm]

Network model
• An unbalanced tree rooted at BS
• Data are aggregated hop by hop
• Each aggregate is a tuple (value, count)
• Every node forwards only one copy

Attack model
Goal: Inject false data without being detected by BS
• Example: Legitimate temperature (32F ~ 150F)
  [Figure: BS, a compromised node, and aggregates labeled (100F, 50) and (?, ?)]
  – Without modifying the received aggregate
    • (98.7F~101F, 51)
  – Count change attack
    • (100F~150F, *)
  – Value change attack
    • (32F~150F, 51)
Combined count and value change attacks, and collusion among compromised nodes, are more destructive!
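The numbers in the attack example can be checked with a little arithmetic. The sketch below assumes the aggregation function is avg and that the compromised node receives the aggregate (100F, 50) and adds one reading of its own, which is our reading of the figure labels. merge_avg is a hypothetical helper for illustration, not part of SDAP.

```python
def merge_avg(aggregates):
    """Merge (average, count) tuples into a single (average, count) tuple."""
    total_count = sum(count for _, count in aggregates)
    weighted_sum = sum(value * count for value, count in aggregates)
    return weighted_sum / total_count, total_count


LOW_F, HIGH_F = 32.0, 150.0   # legitimate temperature range from the slide
received = (100.0, 50)        # assumed aggregate arriving at the compromised node

# Case 1: keep the received aggregate, forge only the node's own reading.
lowest = merge_avg([received, (LOW_F, 1)])
highest = merge_avg([received, (HIGH_F, 1)])
print(f"own reading only: {lowest[0]:.1f}F ~ {highest[0]:.1f}F, count {lowest[1]}")

# Case 2 (value change): overwrite the received value but keep its count.
forged_value = merge_avg([(HIGH_F, 50), (HIGH_F, 1)])
print(f"value change:     up to {forged_value[0]:.1f}F, count {forged_value[1]}")
```

Forging only its own reading moves the average by about one degree at most, which matches the (98.7F~101F, 51) case; overwriting the received value lets the node report anything in the legitimate range with an unchanged count.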
Observations
• Hop-by-hop aggregation
  – An aggregate computed by a higher-level node covers more low-level nodes
  – The closer a compromised node is to BS, the more impact its false value has on the final result computed by BS

Our solutions
Divide and conquer; commit and attest
• Tree construction and query dissemination
• Probabilistic grouping
  – Partition nodes in the tree into multiple logical groups (subtrees) of similar size
• Hop-by-hop aggregation
  – Each group generates a commitment which cannot be denied later
• Attestation between BS and suspicious groups
  – BS identifies abnormal groups from the set of received group commitments
  – Groups under suspicion prove the correctness of submitted commitments to BS
• When computing final aggregates, BS discards commitments from groups that fail to support their previously submitted values

Tree Construction & Query Dissemination
• Tree construction
  – Similar to TAG
• Query dissemination
  – BS → * : Fagg, Sg
    • Fagg: an aggregation function, e.g., avg, count
    • Sg: a random number as grouping seed
[Figure: aggregation tree rooted at BS; every node is labeled avg]

Probabilistic grouping & data aggregation
• Probabilistic grouping is conducted through group leader selection
  – H(Kx, Sg|x) < Fg(c)
    • x : node id
    • Kx : master key of x
    • H : pseudorandom function, uniform output in [0,1)
    • Sg : for security and load balance
    • c : count
    • Fg : grouping function, [0,1) output increasing with c
[Figure annotations: H(Ky, Sg|y) < Fg(c); H(Kx, Sg|x) < Fg(15); H(Kid, Sg|id) > Fg(1)]
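The leader test H(Kx, Sg|x) < Fg(c) can be sketched in a few lines. Everything concrete here is an assumption made for illustration: HMAC-SHA256 stands in for the pseudorandom function H, and grouping_fn is a hypothetical Fg chosen only because it increases with c. The slides do not specify either.

```python
import hashlib
import hmac
import math


def prf_unit(key: bytes, message: bytes) -> float:
    """Map HMAC-SHA256(key, message) to a float in [0, 1) (assumed stand-in for H)."""
    digest = hmac.new(key, message, hashlib.sha256).digest()
    # Keep the top 53 bits so the quotient is exactly representable and below 1.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def grouping_fn(count: int, scale: float = 20.0) -> float:
    """Hypothetical Fg: grows with the count; `scale` is an illustrative knob."""
    return 1.0 - math.exp(-count / scale)


def is_leader(master_key: bytes, seed: bytes, node_id: bytes, count: int) -> bool:
    """Node becomes a group leader when H(Kx, Sg|x) < Fg(c)."""
    return prf_unit(master_key, seed + b"|" + node_id) < grouping_fn(count)


if __name__ == "__main__":
    seed = b"Sg-demo"                      # grouping seed broadcast by BS
    for node_id, count in [(b"u", 1), (b"w'", 8), (b"x", 15)]:
        key = b"master-key-" + node_id     # made-up per-node master key
        print(node_id.decode(), count, round(grouping_fn(count), 3),
              is_leader(key, seed, node_id, count))
```

Because the threshold rises with the count, a node deep in the tree (small c) rarely becomes a leader, while a node that has already gathered many readings is increasingly likely to close the group. A fresh seed Sg per query changes which nodes are selected.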
[Figure annotation: H(Kw', Sg|w') < Fg(8)]

Probabilistic grouping & data aggregation
[Figure: tree with BS as default leader and nodes y, x, w']
By choosing appropriate grouping functions, group sizes are roughly even with small deviation, providing a good basis for attestation

Group aggregation (1)
• Format of aggregates
  – id | flag | count | value | seed | MAC
  – count, value and seed are encrypted; flag, count, id, value and seed are authenticated by the MAC
  – Flag: initialized to 0, set to 1 after leaders finish group aggregation, so that other nodes on the path just forward group commitments
• Leaf node aggregation
  – u → v : u, 0, E(Kuv, 1|Ru|Sg)|MACu
    MACu = MAC(Ku, 0|1|u|Ru|Sg)
[Figure annotation: H(Ku, Sg|u) > Fg(1)]

Group aggregation (2)
• Intermediate node aggregation
  – v → w : v, 0, E(Kvw, 3|Aggv|Sg)|MACv
    Aggv = Fagg(Rv, Ru, Ru')
    MACv = MAC(Kv, 0|3|v|Aggv|MACu MACu'|Sg)
• MAC is also computed hop by hop, thus representing authentication of all the nodes contributing to the data
[Figure annotation: H(Kv, Sg|v) > Fg(3)]

Group aggregation (3)
• Leader node aggregation
  – x → BS : x, 1, E(Kx, 15|Aggx|Sg)|MACx
    Aggx = Fagg(Rx, Aggw, Aggw')
    MACx = MAC(Kx, 1|15|x|Aggx|MACw MACw'|Sg)
[Figure annotations: H(Kx, Sg|x) < Fg(15); BS is the default leader of leftover nodes]
• Tracking the forwarding path:
  – A forwarding table (incoming link, group id)
  – Group id is the id of group leader
  – Bloom filter may help scale up

Verification & attestation (1)
BS identifies suspicious groups for attestation
• Outlier detection by Grubbs' Test
  – Hypothesis test: H0 vs. H1
  – Our extensions: multiple outliers, bivariate
    • Pc * Pvalue < α?
      (significance level, e.g., 0.05)
• One-sided test for count and two-sided test for values
  – Attackers tend to forge false values together with correspondingly large counts, so that the false values account for a larger fraction of the final result
[Figure: group commitments (w', 95F, 25), (x, 142F, 50), (y, 100F, 20), (BS, 90F, 28)]

Verification & attestation (2)
Forwarding attestation requests from BS
• Suppose group x is under suspicion
  – BS → y : x, Sa, Sg
    • Sa: a random number as attestation seed
  – Node y then forwards this request to leader x

Verification & attestation (3)
Group attestation
• Probabilistic attestation path selection
  – From x, each parent sums up the counts $c_1, \dots, c_d$ of all its $d$ children, then computes
    $w = H(S_a \mid id) \cdot \sum_{k=1}^{d} c_k$
    and picks the $i$th child on the path if
    $w \in \left[ \sum_{k=1}^{i-1} c_k,\ \sum_{k=1}^{i} c_k \right)$
• A node with a larger count has more chances to be attested

Verification & attestation (4)
Attestation response from groups
• Each node on the path sends back count and reading
• Sibling node sends back count, aggregate and MAC (leaf only sends count and reading)

Verification & attestation (5)
Group response validation by BS
• BS reconstructs Aggx and MACx based on responses
  – If both match the submitted values, accepts them
  – Otherwise, rejects them
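The validation step can be sketched as a bottom-up recomputation along the attested path. This is a simplified illustration under stated assumptions, not the protocol's actual implementation: HMAC-SHA256 stands in for MAC, child MACs are combined by XOR (the operator is not legible in the slides), Fagg is avg, encryption is omitted, and BS recomputes counts instead of checking the reported ones. The helpers mac, xor_macs, commit and bs_verify and all keys and readings are hypothetical.

```python
import hashlib
import hmac
from fractions import Fraction


def mac(key, *fields):
    """HMAC-SHA256 over '|'-joined fields (assumed stand-in for the slides' MAC)."""
    msg = b"|".join(str(f).encode() for f in fields)
    return hmac.new(key, msg, hashlib.sha256).digest()


def xor_macs(macs):
    """XOR equal-length MACs together (assumed way of combining child MACs)."""
    out = bytes(32)
    for m in macs:
        out = bytes(a ^ b for a, b in zip(out, m))
    return out


def commit(key, flag, node_id, reading, children, seed):
    """Return (count, avg, MAC) for a node given its children's (count, avg, MAC)."""
    count = 1 + sum(c for c, _, _ in children)
    agg = (Fraction(reading) + sum(c * a for c, a, _ in children)) / count
    if children:
        child_macs = xor_macs(m for _, _, m in children).hex()
        return count, agg, mac(key, flag, count, node_id, agg, child_macs, seed)
    return count, agg, mac(key, flag, count, node_id, agg, seed)


def bs_verify(keys, seed, path, siblings, committed):
    """Rebuild the leader's commitment from attestation responses and compare.

    path: [(node_id, reading)] from the leader down to a leaf.
    siblings: {path_node_id: [(count, avg, MAC)]} for its off-path children.
    """
    below = None
    for depth in range(len(path) - 1, -1, -1):
        node_id, reading = path[depth]
        children = list(siblings.get(node_id, []))
        if below is not None:
            children.append(below)
        flag = 1 if depth == 0 else 0          # only the leader sets the flag
        below = commit(keys[node_id], flag, node_id, reading, children, seed)
    return (below[:2] == committed[:2]
            and hmac.compare_digest(below[2], committed[2]))


# Honest group: x -> {w, w'}, w -> {v, v'}, v -> {u, u'} (illustrative readings).
seed = "Sg-demo"
keys = {n: ("key-" + n).encode() for n in ["x", "w", "w'", "v", "v'", "u", "u'"]}
u, u2 = commit(keys["u"], 0, "u", 98, [], seed), commit(keys["u'"], 0, "u'", 101, [], seed)
v2, w2 = commit(keys["v'"], 0, "v'", 99, [], seed), commit(keys["w'"], 0, "w'", 100, [], seed)
v = commit(keys["v"], 0, "v", 100, [u, u2], seed)
w = commit(keys["w"], 0, "w", 97, [v, v2], seed)
x = commit(keys["x"], 1, "x", 102, [w, w2], seed)

path = [("x", 102), ("w", 97), ("v", 100), ("u", 98)]
siblings = {"x": [w2], "w": [v2], "v": [u2]}
print("honest commitment accepted:", bs_verify(keys, seed, path, siblings, x))
print("forged value accepted:    ",
      bs_verify(keys, seed, path, siblings, (x[0], Fraction(142), x[2])))
```

Exact fractions are used for the average so that BS and the nodes agree bit for bit regardless of the order in which children are merged; a real deployment would need some fixed encoding of values for the same reason.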
Conclusion & future work
• Analysis and simulation results are skipped
• A probabilistic grouping based secure data aggregation protocol
  – Divide-and-conquer
  – Commit-and-attest
• Challenges:
  – Max/Min
  – Content-based attestation
    • Readings from nodes in the same neighborhood should bear certain temporal/spatial correlations

Thank you!
• Questions?