Computer Science

Assuring Integrity of Dataflow Processing in Large-Scale Cloud Systems
Juan Du
Co-advised by: Dr. Xiaohui (Helen) Gu, Dr. Douglas Reeves
Department of Computer Science
North Carolina State University

Outline
• Background
  – Multi-tenant cloud systems
  – Service integrity attack
• Service Integrity Assurance
  – RunTest [ASIACCS’10]
• Conclusion and Ongoing Work

Multi-Tenant Cloud Systems
[Figure: service components c1–c7 run in VMs; each is labeled with a provider (P1–P3) and a service function (f1–f4). The user sends data tuples …di,… to the portal; an intermediate stream carries …,f2(f1(di)),… and the user receives …,f3(f2(f1(di))),…]
• Platform for Software as a Service (SaaS)

Service Integrity Attack
[Figure: same topology as the previous slide, but one component applies f0 in place of f2: the intermediate stream carries …,f0(f1(di)),… and the user receives …,f3(f0(f1(di))),…]
• Service providers come from different security domains
• Not all data processing components are trustworthy

Previous Work
• Distributed dataflow processing
  – Focuses on resource and performance management issues.
  – Usually assumes that all data processing components are trustworthy.
• Trust management in distributed systems
  – Distributed messaging systems [Haeberlen, et al., SOSP 2007]
  – Pub-sub overlay [Srivatsa, et al., CCS 2005]
  – Virtualized datacenters [Berger, et al., SIGOPS 2008]
  – None of them addresses secure and scalable dataflow processing in multi-tenant cloud systems.

Previous Work (cont.)
• Byzantine fault tolerance
  – In wide-area networks [Amir, et al., DSN 2006]
  – Generally has scalability issues.
• Security in SOA
  – WS-Security v1.1 [OASIS, 2006]
  – Focuses on integrity and confidentiality of web service messages through encryption and authentication.
  – Attacks can go beyond messaging security.

RunTest
RunTest: Assuring Integrity of Dataflow Processing in Cloud Computing Infrastructures. Juan Du, Wei Wei, Xiaohui Gu, Ting Yu. ACM Symposium on Information, Computer and Communications Security (ASIACCS), Beijing, China, April 2010.
• Randomized data attestation
• Attestation graph
  – Detect integrity attacks
  – Pinpoint malicious nodes

Integrity Attestation Graph
• Randomized data attestation
  – Capture consistency/inconsistency relationships between pairs of components
[Figure: the portal sends tuples d1, d2 and their duplicates d1’, d2’ through providers s1–s3 of function f1 and providers s4–s6 of function f2, and compares the results.
  Consistent: f1(d1) = f1(d1’) and f2(f1(d1)) = f2(f1(d1’)).
  Inconsistent: f1(d2) != f1(d2’).
  Resulting attestation graphs: edge weights 1, 0.3, 0.3 among s1, s2, s3 and 1, 0.6, 0.6 among s4, s5, s6.]

Pinpoint Malicious Service Providers
• Proposition 1: All good nodes form a consistency clique.
• Assumption: Good nodes form a majority in each service function.
[Figure: consistency graph over providers P1–P5 with one clique marked; the labeled edges have weight 1.]

Identify Attack Patterns
[Figure: three example attestation graphs, each with its cliques marked.]
• Number of cliques
• Weights on the edges

Experimental Evaluation
• Implementation
  – On top of IBM System S
• Experiment setup
  – Tested on the NCSU Virtual Computing Lab (VCL)
  – Used about 10 blade servers
  – Each host runs CentOS 5.2 64-bit with Xen 3.0.3

Detection Rate
[Figure: detection rate plot.]
• Can achieve a 100% detection rate under different attack patterns

Comparison
• Full-time majority voting (pu = 1, r = 5)
  – Immediate detection
  – Not scalable
• RunTest
  – Scalable: small pu and r => less attestation traffic
  – A short delay in detection: small pu and r => takes longer to detect

Conclusion
• The first attempt to address service integrity of dataflow processing applications in multi-tenant cloud systems
• Scalable runtime service attestation
  – Lightweight
    • Randomized data attestation
  – Black-box approach
    • Application-level input replay and result consistency check
  – Effective
    • High detection rate and no false alarms

Ongoing Work
• Support stateful service functions
• Relax the assumptions for malicious service providers
  – They can form a majority in individual service functions
  – They must remain a minority in the overall system

Thank you! Questions?
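Backup note: the attestation graph and the clique-based pinpointing from the RunTest slides can be illustrated with a minimal Python sketch. It is not the System S implementation. The class and method names are hypothetical, and two details are assumptions rather than statements from the slides: an edge weight is taken to be the fraction of attestations on which two providers agreed, and two providers count as consistent only when that weight is exactly 1. The pinpointing rule follows Proposition 1 together with the majority assumption: if the largest consistency clique covers more than half of the providers of a function, every provider outside it is flagged.

```python
from collections import defaultdict
from itertools import combinations


class AttestationGraph:
    """Pairwise consistency record for the providers of one service function."""

    def __init__(self, providers):
        self.providers = list(providers)
        # (a, b) with a < b  ->  [consistent_count, total_count]
        self.counts = defaultdict(lambda: [0, 0])

    def record(self, a, b, result_a, result_b):
        """Record one attestation: the same input was replayed on a and b."""
        key = tuple(sorted((a, b)))
        self.counts[key][0] += int(result_a == result_b)
        self.counts[key][1] += 1

    def weight(self, a, b):
        """Assumed weight: fraction of consistent results (None if never attested)."""
        consistent, total = self.counts.get(tuple(sorted((a, b))), (0, 0))
        return consistent / total if total else None

    def is_consistent(self, a, b):
        # Assumption: a consistency edge requires agreement on every attestation.
        return self.weight(a, b) == 1.0

    def largest_consistency_clique(self):
        """Brute-force search, largest groups first; fine for a handful of nodes."""
        for size in range(len(self.providers), 0, -1):
            for group in combinations(self.providers, size):
                if all(self.is_consistent(a, b) for a, b in combinations(group, 2)):
                    return set(group)
        return set()

    def pinpoint(self):
        """Flag providers outside a clique that covers more than half the nodes."""
        clique = self.largest_consistency_clique()
        if 2 * len(clique) > len(self.providers):
            return set(self.providers) - clique
        return set()  # no majority clique yet: not enough evidence


def f1(d):
    return d * 2          # benign service function (made up for the demo)


def f0(d):
    return d * 2 + 1      # tampered function, standing in for f0 on the attack slide


behaviour = {"P1": f1, "P2": f1, "P3": f1, "P4": f1, "P5": f0}
graph = AttestationGraph(behaviour)
# Replay one tuple on every pair of providers and compare the results.
for d, (a, b) in enumerate(combinations(behaviour, 2)):
    graph.record(a, b, behaviour[a](d), behaviour[b](d))

print(graph.pinpoint())  # {'P5'}
```

The brute-force clique search is exponential in the number of providers and is used here only to keep the sketch short. With randomized attestation, only a sample of pairs is attested at any time, so `pinpoint()` returns an empty set until enough pairs have been compared for a majority clique to emerge.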