Part I: Securely Using Cloud Computing Services
Qin Liu
Email: gracelq628@126.com
Hunan University

Outline
1. Cloud Computing
2. Security Issues in Clouds
3. Introduction to Our Work

Evolution of Computing Patterns

What Is Cloud Computing?
- Wikipedia definition
  - Cloud computing is a concept of using the Internet to allow people to access technology-enabled services
  - It allows users to consume services without knowledge of, or control over, the technology infrastructure that supports them
- NIST definition
  - 5 essential characteristics
  - 3 cloud service models
  - 4 cloud deployment models

The NIST Cloud Definition Framework
- Deployment models: Private Cloud, Community Cloud, Public Cloud, Hybrid Clouds
- Service models: Software as a Service (SaaS), Platform as a Service (PaaS), Infrastructure as a Service (IaaS)
- Essential characteristics: On-Demand Self-Service, Broad Network Access, Rapid Elasticity, Resource Pooling, Measured Service

Essential Characteristics
- On-demand self-service: get computing capabilities as needed, automatically
- Broad network access: services are available over the network from desktops, laptops, PDAs, and mobile phones
- Resource pooling: provider resources are pooled to serve multiple clients
- Rapid elasticity: ability to quickly scale the service in or out
- Measured service: control and optimize services based on metering

Cloud Service Models
- Software as a Service (SaaS)
  - Users use the provider's applications
  - The user doesn't manage or control the network, servers, OS, storage, or applications
- Platform as a Service (PaaS)
  - The user deploys their apps on the cloud and controls those apps
  - The user doesn't manage servers, OS, or storage
- Infrastructure as a Service (IaaS)
  - Consumers get access to the infrastructure to deploy their software
  - The consumer doesn't manage or control the underlying infrastructure
  - The consumer does manage or control the OS, storage, apps, and selected network components

Service Delivery Model Examples
(Figure: example offerings from Amazon, Google, Microsoft, and Salesforce across the SaaS, PaaS, and IaaS layers.)
Products and companies shown for illustrative purposes only and should not be construed as
an endorsement.

Cloud Deployment Models
- Private cloud: enterprise owned or leased
- Community cloud: shared infrastructure for a specific community
- Public cloud: sold to the public, mega-scale infrastructure
- Hybrid cloud: composition of two or more clouds

Top 8 Cloud Computing Companies

Cloud Computing Example: Amazon EC2 (IaaS)
http://aws.amazon.com/ec2

Cloud Computing Example: Google AppEngine (PaaS)
http://code.google.com/appengine/

Google AppEngine API
- Python runtime environment
- Datastore API
- Images API
- Mail API
- Memcache API
- URL Fetch API
- Users API
- A free account can use up to 500 MB of storage, plus enough CPU and bandwidth for about 5 million page views a month

Conventional Computing vs. Cloud Computing
  Conventional                      Cloud
  Manually provisioned              Self-provisioned
  Dedicated hardware                Shared hardware
  Fixed capacity                    Elastic capacity
  Pay for capacity                  Pay for use
  Capital and operational expenses  Operational expenses
  Managed via sysadmins             Managed via APIs

Why A Cloud?

Cloud Computing Summary
- Cloud computing is a kind of network service and a trend for future computing
- Scalability matters in cloud computing technology
- Users focus on application development
- Users do not know where services are located geographically

2. Security Issues in Clouds

What Is Not a Cloud?

Cloud Providers and Security Measures
Kai Hwang and Deyi Li, “Trusted Cloud Computing with Secure Resources and Data Coloring”, IEEE Internet Computing, Sept.
2010

General Security Advantages
- Shifting public data to an external cloud reduces the exposure of internal sensitive data
- Cloud homogeneity makes security auditing and testing simpler
- Clouds enable automated security management
- Redundancy / disaster recovery

General Security Challenges
- Trusting the vendor's security model
- Customer inability to respond to audit findings
- Obtaining support for investigations
- Indirect administrator accountability
- Proprietary implementations can't be examined
- Loss of physical control

10 Security Concerns
- Where's the data?
- Who has access?
- What are your regulatory requirements?
- Do you have the right to audit?
- What type of training does the provider offer their employees?
- What type of data classification system does the provider use?
- What are the service level agreement (SLA) terms?
- What is the long-term viability of the provider?
- What happens if there is a security breach?
- What is the disaster recovery/business continuity plan (DR/BCP)?

7 Potential Risks
- Privileged user access
- Regulatory compliance
- Data location
- Data segregation
- Recovery
- Investigative support
- Long-term viability

What Is Not New?
- Data loss
- Downtimes
- Phishing: “hey! check out this funny blog about you...”
- Password cracking
- Botnets and other malware

What Is New?
- Accountability
- No security perimeter
- Larger attack surface
- New side channels
- Lack of auditability
- Regulatory compliance
- Data security

Accountability

No Security Perimeter
- Little control over the physical or network location of cloud instance VMs
- Network access must be controlled on a host-by-host basis

Larger Attack Surface
(Figure: the attack surface spans both the cloud provider and your network.)

New Side Channels
- You don't know whose VMs are sharing the physical machine with you
- Attackers can place their VMs on your machine
- See the “Hey, You, Get Off of My Cloud” paper for how.
- Shared physical resources include:
  - CPU data cache: Bernstein 2005
  - CPU branch prediction: Onur Aciiçmez 2007
  - CPU instruction cache: Onur Aciiçmez 2007
- In a single-OS environment, these attacks can be used to extract cryptographic keys

Lack of Auditability
- Only the cloud provider has access to full network traffic, hypervisor logs, and physical machine data
- Mutual auditability is needed:
  - Ability of the cloud provider to audit potentially malicious or infected client VMs
  - Ability of the cloud customer to audit the cloud provider's environment

Regulatory Compliance

Certifications

Data Security
- Confidentiality: only authorized parties may know the data
- Integrity: data has not been tampered with
- Availability: data is never lost, machines never fail

                   Storage               Processing              Transmission
  Confidentiality  Symmetric encryption  Homomorphic encryption  SSL
  Integrity        MAC                   Homomorphic encryption  SSL
  Availability     Redundancy            Redundancy              Redundancy

Data Security Is A Major Concern
- Security concerns arise because both customer data and customer programs reside on the provider's premises
- Security is always a major concern in open system architectures
(Figure: customer data and customer code located inside the provider premises.)

Why Data Is Not Secure
- Cloud security problems come from:
  - Loss of control
  - Lack of trust
  - Multi-tenancy
- These problems mainly exist in public clouds

Loss of Control in the Cloud
- Consumer's loss of control
  - Data, applications, and resources are located with the provider
  - User identity management is handled by the cloud
  - User access control rules, security policies, and enforcement are managed by the cloud provider
- The consumer relies on the provider to ensure
  - Data security and privacy
  - Resource availability
  - Monitoring and repairing of services/resources

Lack of Trust in the Cloud
- A brief deviation from the talk
- Trusting a third party requires taking risks
- Defining trust and risk
  - Opposite sides of the same coin
  - People only trust when it pays
  - The need for trust arises only in risky situations
- Defunct third-party management schemes
  - Hard to balance trust and risk, e.g., key escrow
  - Is the cloud headed toward the same path?
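As a small illustration of the integrity row in the Data Security table (a MAC protecting stored data), the sketch below uses Python's standard hmac module. It assumes the customer keeps the key locally and stores only the record and its tag with the provider; the record contents and the helper names make_tag and verify are hypothetical and not part of the talk.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, data: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the data with the customer's key."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify(key: bytes, data: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, data), tag)


# Hypothetical example: the key never leaves the customer.
key = secrets.token_bytes(32)
record = b"customer record v1"
tag = make_tag(key, record)          # stored alongside the record in the cloud

untouched_ok = verify(key, record, tag)
tampered_ok = verify(key, b"customer record v2", tag)
```

A MAC only detects tampering. It does not hide the data, which is why the table lists encryption separately for confidentiality.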
Multi-tenancy Issues in the Cloud
- Conflict between tenants' opposing goals
  - Tenants share a pool of resources and have opposing goals
- How does multi-tenancy deal with conflicts of interest?
  - Can tenants get along together and ‘play nicely’?
  - If they can't, can we isolate them?
- How to provide separation between tenants?

Possible Solutions
- Loss of control
  - Take back control
    - Data and apps may still need to be on the cloud
    - But can they be managed in some way by the consumer?
- Lack of trust
  - Increase trust (mechanisms)
    - Technology
    - Policy, regulation
    - Contracts (incentives): topic of a future talk
- Multi-tenancy
  - Private cloud
    - Takes away the reasons to use a cloud in the first place
  - Strong separation

Cloud Security Summary
- Cloud computing is sometimes viewed as a reincarnation of the classic mainframe client-server model
  - However, resources are ubiquitous, scalable, and highly virtualized
  - It contains all the traditional threats, as well as new ones
- In developing solutions to cloud computing security issues, it may be helpful to identify the problems and approaches in terms of
  - Loss of control
  - Lack of trust
  - Multi-tenancy problems

3. Introduction to Our Work

Our Main Work

Selected Publications
- G. Wang, Q. Liu, F. Li, S. Yang, and J. Wu, "Outsourcing Privacy-Preserving Social Networks to a Cloud," accepted to appear in the 32nd IEEE International Conference on Computer Communications (IEEE INFOCOM 2013).
- Q. Liu, C. C. Tan, J. Wu, and G. Wang, "Efficient Information Retrieval for Ranked Queries in Cost-Effective Cloud Environments," Proceedings of the 31st IEEE International Conference on Computer Communications (IEEE INFOCOM 2012).
- G. Wang, Q. Liu, and J. Wu, "Hierarchical Attribute-Based Encryption for Fine-Grained Access Control in Cloud Computing," Proceedings of the 17th ACM Conference on Computer and Communications Security (CCS-10).
- Q. Liu, C. C. Tan, J. Wu, and G.
Wang, "Towards Differential Query Services in Cost-Efficient Clouds," accepted to appear in IEEE Transactions on Parallel and Distributed Systems (TPDS).
- Q. Liu, G. Wang, and J. Wu, "Time-Based Proxy Re-encryption Scheme for Secure Data Sharing in a Cloud Environment," Information Sciences.
- Q. Liu, C. C. Tan, J. Wu, and G. Wang, "Cooperative Private Searching in Clouds," Journal of Parallel and Distributed Computing (JPDC).
- G. Wang, Q. Liu, and J. Wu, "Hierarchical Attribute-Based Encryption and Scalable User Revocation for Sharing Data in Cloud Servers," Computers & Security.

Multi-User Data Sharing Environment
- Cloud security problems come from:
  - Loss of control
  - Lack of trust (mechanisms)
  - Multi-tenancy
- Security issues: Data Security; Revocation; Retrieval; Privacy
- The cloud service provider is a potential attacker!

Data Security
- Natural approach: adopt cryptographic techniques
- Current solutions
  - Traditional symmetric/asymmetric encryption
    - Low cost for encryption and decryption
    - Supports key delegation (HIBE)
    - Hard to achieve fine-grained access control
  - Attribute-based encryption
    - Easy to achieve fine-grained access control
    - High cost for encryption and decryption
    - Does not support key delegation

Public Key Cryptography

Attribute-Based Encryption (ABE)
- Key-Policy ABE
- Ciphertext-Policy ABE

Hierarchical Attribute-Based Encryption (HABE)
Sample URA
- Requirements
  - Fine-grained access control
  - Hierarchical key generation
  - Efficiency
- Application scenario

Hierarchical Attribute-Based Encryption (HABE)
- Key technique
  - Combine hierarchical identity-based encryption and attribute-based encryption
  - Use the attributes and the exact ID to identify each user

HABE Architecture

User Revocation
- Naïve solution
  - The data owner re-encrypts the data and distributes new keys to the data users
  - Frequent revocation makes the data owner a performance bottleneck
- Proxy re-encryption (PRE)

Time-Based Proxy Re-Encryption
- PRE in clouds
  - The data owner sends re-encryption instructions to the cloud
  - The cloud
performs re-encryption based on proxy re-encryption
- How can automatic revocation be achieved without sending any instructions?
(Figure: Data Owner, Cloud Service Provider, and Data User. T1: data access; T2: re-encryption. T2<T1: potential security risk.)

Time-Based Proxy Re-Encryption
- Key technique
  - Incorporate time into PRE
  - This scheme is suitable for applications where the validity period of access is predetermined
- A time tree is constructed, with levels for the year (e.g., 2012), months 1...12, and days 1...31:
  - $s_a = H_s(a)$
  - $s_a^{(y)} = H_{s_a}(y)$
  - $s_a^{(y,m)} = H_{s_a^{(y)}}(m)$
  - $s_a^{(y,m,d)} = H_{s_a^{(y,m)}}(d)$
- The data owner and the cloud share a secret seed $s$
- The cloud re-encrypts data automatically, based on its internal time, when it receives a data access request

User Privacy
- User privacy
  - Search privacy: the cloud cannot know what the users are searching for
  - Access privacy: the cloud cannot know which files are returned to the users
- Existing solutions
  - Private search (PS) can protect user privacy while searching public data
  - Searchable encryption (SE) can protect search privacy while searching private data

Searchable Encryption (SE)
- Bob sends Alice an email encrypted under Alice's public key.
- Alice's email gateway wants to test whether the email contains the keyword "urgent" so that it can route the email to her PDA immediately.
- But Alice does not want the email gateway to be able to decrypt her messages.

Efficient Searchable Encryption
- Problem
  - The user needs to perform decryption
  - A thin client has only limited resources
- Requirements
  - Enable the cloud to perform partial decryption without compromising search privacy
  - Users can access data from the cloud anytime and anywhere, with any device

Efficient Searchable Encryption
- Key technique
  - Alice takes both Bob's and the CSP's public keys as inputs to the encryption algorithm
  - The CSP uses its secret key to perform partial decryption and generate an intermediate value
  - Bob uses the intermediate value to quickly recover the data

Private Search (PS)
- Given a public dictionary that contains all keywords, e.g., dictionary = <A, B, C, D>
- Bob wants to retrieve files with keywords A and B
- Bob's query to the cloud: [1] [1] [0] [0]
- Files on the cloud: F1: {A, B}, F2: {B, D}, F3: {C, D}
- The cloud returns a buffer (F1, F2, 0, NA): a compressed version of all files

Private Search (PS)
- Homomorphic encryption
  - E(x)*E(y) = E(x+y)
  - E(x)^y = E(x*y)
- Example: E(F2)*E(0) = E(F2)
- Buffer entries: F1 (survival), NA (collision), F2 (survival), 0 (unmatched)
- Key trick: map unmatched files to 0

Cooperative Private Search (COPS)
- Problem for simple PS
  - Processing each query is expensive.
  - Given n users, the cloud needs to execute n queries
  - Performance bottleneck on the cloud
- COPS architecture
  - A proxy server (ADL) is introduced between the users and the cloud (trusted)
  - It aggregates user queries
  - It distributes search results

Cooperative Private Search (COPS)
- Key technique: the user and the cloud share
  - Shuffle function: shuffles the dictionary and the query, to preserve search privacy
  - Pseudonym function: hides file names
  - Obfuscation function: hides file content, to preserve access privacy
- Key merits: user privacy is preserved from
  - The cloud
  - The proxy server
  - Other users

Efficient Information Retrieval for Ranked Queries (EIRQ)
- Problem for simple COPS
  - No ranked queries
  - The cloud returns all matched files

Efficient Information Retrieval for Ranked Queries (EIRQ)
- Queries are classified into ranks 0, 1, ..., r-1
- A rank-i query retrieves a (1 - i/r) fraction of the matched files
  - Files that match rank-0 queries will not be filtered
  - Files that match rank-1 queries are filtered with probability 1/r
  - Files that match rank-i queries are filtered with probability i/r
- The cloud
  - Cannot know which files are filtered or returned
  - Cannot know the rank of each query

Efficient Information Retrieval for Ranked Queries (EIRQ)
- Key techniques
  - Construct a mask matrix to protect query ranks
  - Filter files without knowing which files are filtered
- Workflow between User, ADL, and Cloud
  - Step 1: QueryGen (User to ADL: keywords, rank)
  - Step 2: MatrixConstruct (ADL to Cloud: mask matrix)
  - Step 3: FileFilter (Cloud to ADL: buffer)
  - Step 4: FileRecovery (ADL to User: a certain percentage of the files matching the user's keywords)

Construct Mask Matrix
- The ADL constructs a mask matrix that is encrypted with its public key and sends it to the cloud
- Example: Alice queries {A, B} with rank 0, Bob queries {A, C} with rank 1
- One row per keyword, one column per rank (number of ranks r = 2):

  A  [1] [1]
  B  [1] [1]
  C  [1] [0]
  D  [0] [0]
  …  [0] [0]

Filter Files
- The cloud chooses a random column for each file
- Files: F1: {A, B}, F2: {B, D}, F3: {C, D}
- For F3, each column is chosen with probability 50%:
  - First column: E(1)*E(0) = E(1), so E(1)^F3 = E(F3)
  - Second column: E(0)*E(0) = E(0), so E(0)^F3 = E(0)
- A file that matches a rank-i query is filtered with probability i/r
- F1 and F2 will be returned to the ADL in the buffer; F3 will be filtered with probability 50%

Evaluation

Questions?
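The mask-matrix filtering from the Construct Mask Matrix and Filter Files slides can be sketched with a toy additively homomorphic scheme. The block below uses a textbook Paillier construction with tiny primes as a stand-in, since the slides only state the properties E(x)*E(y) = E(x+y) and E(x)^y = E(x*y) and do not name the cipher. The function names, the integer "file contents", and the choice to return a pair (E(c), E(c*F)) per file instead of a compressed buffer are all assumptions made for illustration.

```python
import math
import random

# Toy Paillier keys: tiny primes for illustration only, NOT secure.
P, Q = 1009, 1013
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)  # valid because the generator is g = N + 1


def encrypt(m):
    """E(m) = (N+1)^m * r^N mod N^2 for a random r coprime to N."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2


def decrypt(c):
    """D(c) = L(c^LAM mod N^2) * MU mod N, with L(u) = (u - 1) // N."""
    return ((pow(c, LAM, N2) - 1) // N) * MU % N


def build_mask_matrix(dictionary, keyword_rank, r):
    """ADL side: a rank-i keyword gets 1 in its first r-i columns, else 0."""
    matrix = {}
    for kw in dictionary:
        ones = r - keyword_rank[kw] if kw in keyword_rank else 0
        matrix[kw] = [encrypt(1 if j < ones else 0) for j in range(r)]
    return matrix


def filter_file(matrix, keywords, content, r, column=None):
    """Cloud side: operates on ciphertexts only, never sees the 0/1 values."""
    j = random.randrange(r) if column is None else column
    c = 1                                   # 1 is a trivial encryption of 0
    for kw in keywords:
        c = (c * matrix[kw][j]) % N2        # E(x)*E(y) = E(x+y)
    return c, pow(c, content, N2)           # E(c) and E(c*content)


def recover(pair):
    """Key holder side: file content, or None if the file was filtered."""
    count, scaled = decrypt(pair[0]), decrypt(pair[1])
    return scaled // count if count else None


# Example from the slides: Alice {A, B} rank 0, Bob {A, C} rank 1, r = 2.
mask = build_mask_matrix("ABCD", {"A": 0, "B": 0, "C": 1}, r=2)
f3_first_column = recover(filter_file(mask, {"C", "D"}, 333, 2, column=0))
f3_second_column = recover(filter_file(mask, {"C", "D"}, 333, 2, column=1))
```

Passing column explicitly is only for reproducibility; leaving it as None gives the random choice described on the slide. Returning E(c) alongside E(c*F) lets the key holder divide out the number of matched keywords, which the slide's single-ciphertext notation glosses over.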