Database Laboratory 2013-10-21 TaeHoon Kim

Work Progress (Range Query)
[Figure: chart with series "query processing time" and "transfer time"; y-axis: query processing time (s); x-axis ticks: 0.001%, 0.002%, 0.005%, 0.008%, 0.010%]

Database Laboratory Regular Seminar 2013-10-21 TaeHoon Kim

Contents
1. Introduction
2. Security Overview
3. Queries Over Encrypted Data
4. Multiple Principals
5. Application Case Studies
6. Discussion
7. Implementation
8. Experimental Evaluation
9. Related Work
10. Conclusion

Introduction
Theft of private information is a significant problem
  An adversary can exploit software vulnerabilities to gain unauthorized access to servers
  A curious or malicious administrator at a hosting or application provider can snoop on private data
One approach to reducing the damage is to encrypt sensitive data
This paper presents CryptDB
  A system that explores an intermediate design point to provide confidentiality for applications that use database management systems

Introduction
CryptDB addresses two threats
1. A curious database administrator (DBA) who tries to learn private data
2.
An adversary that gains complete control of the application and DBMS servers
[Figure: Confidential Data Leaks. Labels: User 1, User 2, User 3, SQL, Application, DB Server, hackers]
• cloud.berkeley.edu/data/cryptdb.pptx

Introduction
CryptDB addresses these challenges using three key ideas
The first is to execute SQL queries over encrypted data
  This idea relies on a SQL-aware encryption strategy
The second is adjustable query-based encryption
The third is to chain encryption keys to user passwords, so that each data item in the database can be decrypted only through a chain of keys rooted in the password of one of the users with access to that data

Security Overview
Threat 1: DBMS Server Compromise
  Our approach is to allow the DBMS server to perform query processing on encrypted data as it would on an unencrypted database
Threat 2: Arbitrary Threats
  The solution is to encrypt different data items (e.g., data belonging to different users) with different keys
  CryptDB provides strong guarantees in the face of arbitrary server-side compromises

Security Overview (Threat 1)
Application: SELECT * FROM emp WHERE salary = 100
Proxy: SELECT * FROM table1 WHERE col3 = x5a8c34
table1 (emp) columns: col1/rank, col2/name, col3/salary
Plaintext values shown in the figure: 60, 100, 800, 100
Ciphertext cells from the figure (table layout not recoverable): x934bc1 x95c623 x5a8c34 ?
x5a8c34 x2ea887 x5a8c34 x2ea887 x84cec1 x4be219 x17cea7 x5a8c34

Queries Over Encrypted Data
SQL-aware Encryption
Random (RND)
  Provides indistinguishability under chosen-plaintext attack (IND-CPA)
Deterministic (DET)
  Allows the server to perform equality checks, which means it can perform selects with equality predicates, equality joins, GROUP BY, COUNT, and DISTINCT
Order-preserving encryption (OPE)
  OPE allows order relations between data items to be established based on their encrypted values, without revealing the data itself
  If x < y, then OPE_K(x) < OPE_K(y), for any secret key K

Queries Over Encrypted Data
Homomorphic encryption (HOM)
  HOM_K(x) * HOM_K(y) = HOM_K(x + y)
Join (JOIN and OPE-JOIN)
  JOIN supports all operations allowed by DET; OPE-JOIN supports joins by order relations
Word Search (SEARCH)
  SEARCH is used to perform searches on encrypted text to support operations such as MySQL's LIKE operator
  Supports only full-word keyword searches; cannot support arbitrary regular expressions

Queries Over Encrypted Data
Adjustable Query-based Encryption
Our goal is to use the most secure encryption schemes that enable running the requested queries
Our idea is to encrypt each data item in one or more onions
  Each value is dressed in layers of increasingly stronger encryption
  Purpose: to optimize adjustable query-based encryption

Queries Over Encrypted Data
Executing Over Encrypted Data
The proxy transforms the query to operate on these onions
  For instance, for the schema shown in Figure 3, a reference to the Name column for an equality comparison will be replaced with a reference to the C2-Eq column
Read Query Execution
  [Figure: steps 1, 2, 3]
Write Query Execution
  The proxy encrypts each inserted column's value with each onion layer that has not yet been stripped off in that column

Queries Over Encrypted Data
Improving Security and Performance
Minimum onion layers
  Application developers can specify the lowest onion encryption layer that may be revealed
In-proxy processing
  Since the proxy receives the entire result set from the
server, sorting these results in the proxy does not require a significant amount of computation and does not increase the bandwidth requirements
Training mode
Onion re-encryption
  When an application performs infrequent queries requiring a low onion layer, CryptDB could be extended to re-encrypt onions back to a higher layer

Queries Over Encrypted Data
Performance Optimization
Developer annotation
  If many columns are not sensitive, the developer can instead provide explicit annotations indicating the sensitive fields
Known query set
  Use training mode
  Optimize onion sets
Ciphertext pre-computing and caching
  To reduce the cost of encrypting query constants, the proxy pre-computes and caches (for OPE) encryptions of frequently used constants under different keys

Multiple Principals: Policy Annotations
Policy Annotations
1. The developer must define the principal types (using PRINCTYPE) used in her application, such as users, groups, or messages
2. The developer must specify which columns in her SQL schema contain sensitive data, along with the principals that should have access to that data, using the ENC_FOR annotation
3. Programmers can specify rules for how to delegate the privileges of one principal to other principals, using the speaks-for relation

Multiple Principals: Policy Annotations
Observation: Each row in certain tables naturally specifies
1. how data should be encrypted
privmsgs_to (msgid, senderid, recipientid)
  msgid values: 5, 6
  senderid / recipientid values as extracted: 1 9 2 6 (cell order not recoverable)
privmsgs (msgid, msgtext)
  5   "secret message"
  6   "hello world"

Multiple Principals: Policy Annotations
1. Principals
2. ENCRYPT_FOR
3.
HAS_ACCESS_TO
Securing phpBB private messages:
  PRINC TYPES physical_user EXTERNAL;
  PRINC TYPES user, msg;
  CREATE TABLE privmsgs (
    msgid int,
    subject varchar(255) ENCRYPT_FOR PRINC msgid TYPE msg,
    msgtext text ENCRYPT_FOR PRINC msgid TYPE msg );
  CREATE TABLE privmsgs_to (
    msgid int, rcpt_id int, sender_id int,
    PRINC sender_id TYPE user HAS_ACCESS_TO PRINC msgid TYPE msg,
    PRINC rcpt_id TYPE user HAS_ACCESS_TO PRINC msgid TYPE msg );
  CREATE TABLE users (
    userid int, username varchar(255),
    PRINC username TYPE physical_user HAS_ACCESS_TO PRINC userid TYPE user );
• cloud.berkeley.edu/data/cryptdb.pptx

Multiple Principals: Key chaining
All key chaining operations are done at the proxy; keys are stored encrypted at the DB server
[Figure: key chains leading to msgid 5, "secret message", encrypted under SKm5]
  userid 1: Username: Alice, Password: asdf; SKa = dblab; ESKa[SKu1]; SKu1; ESKu1[SKm5]
  userid 2: Username: Tomas, Password: dfga; SKb = dblab; ESKb[SKu2]; SKu2; ESKu2[SKm5]
• cloud.berkeley.edu/data/cryptdb.pptx
• Public key pairs are also used

Application Study
phpBB (e.g., XpressEngine board)
  A widely used open source forum with a rich set of access control settings
HotCRP
  A popular conference review application
Grad-apply
  A graduate admissions system used by MIT EECS

Discussion / Implementation
Operations CryptDB cannot support on encrypted data
  It does not support both computation and comparison on the same column
  SELECT age*2+10 FROM … WHERE salary > age*2+10
  Such a query must be (1) rewritten into a sub-query and (2) re-encrypted in the proxy
The CryptDB proxy consists of a C++ library and a Lua module
  CryptDB uses MySQL Proxy
  The CryptDB implementation consists of ~18,000 lines of C++ code and ~150 lines of Lua code

Performance Evaluation
Performance environment
  MySQL 5.1.54 server: 2 machines
    2.4 GHz Intel Xeon E5620 4-core processors
    12 GB of RAM
  CryptDB proxy and the clients: 8 machines
    2.4 GHz AMD Opteron 8431 6-core processors
    64 GB of RAM
  Shared Gigabit Ethernet network
Uses the TPC-C query set
Systems compared
  MySQL
  CryptDB
  CryptDB with only random encryption (RND): strawman
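The difference between the RND-only strawman and DET can be sketched in a few lines: a server holding DET ciphertexts can evaluate an equality predicate such as salary = 100 by comparing ciphertexts, whereas RND ciphertexts of equal values almost never match. The sketch below is a toy under stated assumptions. The cipher is an illustrative HMAC-SHA256 stream construction, not the schemes CryptDB actually uses, and every name in it (rnd_encrypt, det_encrypt, decrypt, matching_rows) is hypothetical.

```python
import hashlib
import hmac
import os

IV_LEN = 16  # bytes of IV stored in front of every ciphertext


def _keystream(key, iv, length):
    """Toy keystream: HMAC-SHA256 in counter mode (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, b"ks" + iv + counter.to_bytes(4, "big"),
                        hashlib.sha256).digest()
        counter += 1
    return out[:length]


def _xor(data, stream):
    return bytes(a ^ b for a, b in zip(data, stream))


def rnd_encrypt(key, plaintext):
    """RND-like: fresh random IV, so equal plaintexts give different ciphertexts."""
    iv = os.urandom(IV_LEN)
    return iv + _xor(plaintext, _keystream(key, iv, len(plaintext)))


def det_encrypt(key, plaintext):
    """DET-like: IV derived from the plaintext, so equal plaintexts collide."""
    iv = hmac.new(key, b"iv" + plaintext, hashlib.sha256).digest()[:IV_LEN]
    return iv + _xor(plaintext, _keystream(key, iv, len(plaintext)))


def decrypt(key, ciphertext):
    """Works for both variants, since the IV travels with the ciphertext."""
    iv, body = ciphertext[:IV_LEN], ciphertext[IV_LEN:]
    return _xor(body, _keystream(key, iv, len(body)))


def matching_rows(column, token):
    """What the server can do without the key: compare ciphertext bytes."""
    return [i for i, cell in enumerate(column) if cell == token]


key = os.urandom(32)
salaries = [b"60", b"100", b"800", b"100"]
det_column = [det_encrypt(key, s) for s in salaries]
rnd_column = [rnd_encrypt(key, s) for s in salaries]

# DET: the proxy sends the encrypted constant and the server finds rows 1 and 3.
print(matching_rows(det_column, det_encrypt(key, b"100")))
# RND: a fresh encryption of 100 matches no stored cell (except with negligible probability).
print(matching_rows(rnd_column, rnd_encrypt(key, b"100")))
```

With only RND available, the server cannot filter rows by itself, which is why that configuration serves as a baseline rather than a practical design.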

Performance Evaluation
[Figure: throughput of different types of SQL queries from the TPC-C query set]

Related Work
Theoretical approaches ([Gentry '10], [Gennaro et al., '10])
  Inefficient
Search on encrypted data (e.g., [Song et al., '00])
  Restricted set of queries, inefficient
Systems proposals (e.g., [Hacigumus et al., '02])
  Lower degree of security, rewrite the DBMS, client-side processing
Software checks (e.g., PQL, UrFlow, Resin)
  No protection against adversaries with complete access to servers

Conclusion
We presented CryptDB, a system that provides a practical and strong level of confidentiality in the face of two significant threats
1. A curious database administrator (DBA) who tries to learn private data
2. An adversary that gains complete control of the application and DBMS servers
Our evaluation shows that CryptDB can support operations over encrypted data

Note: All slide contents are based on "cloud.berkeley.edu/data/cryptdb.pptx" and the paper. Prepared by Christof Kim (TaeHoon Kim) :D
If the slides contain errors, please let me know at taehun3718@gmail.com :D
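
As a worked illustration of the HOM property from the Queries Over Encrypted Data slides, HOM_K(x) * HOM_K(y) = HOM_K(x + y), the following is a minimal sketch of a Paillier-style additively homomorphic scheme. The primes are toy values chosen only so the arithmetic is easy to follow and offer no security, and the function names (hom_encrypt, hom_decrypt, hom_add) are hypothetical, not part of CryptDB.

```python
import math
import random

# Toy parameters (assumption for illustration): real keys use primes of 1024+ bits.
P, Q = 1009, 1013
N = P * Q
N2 = N * N
G = N + 1                                            # standard simple generator choice
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)    # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)                                 # modular inverse (Python 3.8+)


def hom_encrypt(m):
    """Encrypt integer m (0 <= m < N) with fresh randomness r."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N2) * pow(r, N, N2)) % N2


def hom_decrypt(c):
    """Recover m from c using the secret values LAM and MU."""
    u = pow(c, LAM, N2)
    return ((u - 1) // N) * MU % N


def hom_add(c1, c2):
    """Server-side addition: multiply ciphertexts modulo N^2."""
    return (c1 * c2) % N2


# A SUM over an encrypted salary column, computed without decrypting any cell.
salaries = [60, 100, 800, 100]
total = hom_encrypt(0)
for cell in [hom_encrypt(s) for s in salaries]:
    total = hom_add(total, cell)
print(hom_decrypt(total) == sum(salaries))
```

The multiplication in hom_add is the only step that would run on the DBMS server; encryption and decryption stay with the key holder, which in the architecture described in the slides is the proxy.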