Data Integrity Proofs in Cloud Storage

Author: Sravan Kumar R and Ashutosh Saxena
Source: The Third International Conference on Communication Systems and Networks (COMSNETS), 2011, Bangalore, India, pp. 1–4.
Presenter: Tsuei-Hung Sun (孫翠鴻)
Date: 2011/3/4

Outline
• Introduction
• Motivation
• Scheme
• Performance Evaluation
• Advantage vs. Drawback
• Comment

Introduction (1/3)
• Data outsourcing to a cloud storage server
  – Data authentication
  – Data integrity
• Proof of retrievability (POR)
  – Obtain the data and verify that it has not been modified.
  – Uses a keyed hash function h_K(F).
  – Prevents the cloud storage archive from modifying the data without the consent of the data owner.
h(·): hash function; K: secret key; F: file

Introduction (2/3)
• Drawbacks of POR
  – It is resource-intensive to implement.
  – The verifier must store as many secret keys and hash values as there are files stored at the server.
  – It is burdensome for the server and for some devices.

Introduction (3/3)
A. Juels and B. S. Kaliski, Jr., “Pors: proofs of retrievability for large files,” Proceedings of the 14th ACM conference on Computer and communications security, New York, USA, 2007, pp. 584–597.
• Proof of retrievability for large files using “sentinels”
  – Only a single key is needed.
  – Only a small portion of the file F is accessed.
  – Setup phase: sentinels are randomly embedded among the data blocks.
  – Verification phase: the verifier checks the integrity of the data file F by challenging the prover with specified positions and asking it to return the associated sentinel values.

Motivation
• The improved scheme needs to insert sentinels and error-correcting codes.
• The improved scheme needs to store all sentinels.
• In the future, the data owner may be a small device (e.g. PDA, mobile phone).
• Goal
  – Address the problems of implementing POR.
  – Provide a proof without accessing the entire file and without the client retrieving the entire file from the server.
  – Minimize the local computation and bandwidth consumed at the client.
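For reference, the basic keyed-hash check h_K(F) mentioned in the Introduction can be sketched as below. This is only an illustration: HMAC-SHA256 is an assumed stand-in for the keyed hash, and the names keyed_hash and verify_file are hypothetical, not taken from the paper. The sketch also makes the drawback visible: the verifier has to retrieve the whole file and keep one tag per file.

```python
import hashlib
import hmac


def keyed_hash(key: bytes, data: bytes) -> bytes:
    """Compute h_K(F). HMAC-SHA256 is an assumption, not the paper's choice."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify_file(key: bytes, retrieved: bytes, stored_tag: bytes) -> bool:
    """Recompute the keyed hash over the whole retrieved file and compare."""
    return hmac.compare_digest(keyed_hash(key, retrieved), stored_tag)


# Hypothetical example values.
key = b"owner-secret-key"
original = b"example file contents"
tag = keyed_hash(key, original)  # kept by the data owner, one per file

print(verify_file(key, original, tag))         # True: file unchanged
print(verify_file(key, original + b"!", tag))  # False: file was modified
```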
Scheme (1/4)
• Assumptions and limitations
  – The storage server might not be malicious.
  – The proof of data integrity protocol only checks the integrity of the data.
  – It applies only to static storage of data.
  – The number of queries the client can ask is fixed.

Scheme (2/4)
• Setup phase
  – Let file F consist of n blocks; meta data is created and appended to it.
  – Let each of the n data blocks contain m bits.
Fig. A data file F with 6 data blocks.

Scheme (3/4)
• Setup phase (cont.)
  1. Generation of meta data: g(i, j) → {1, ..., m}, i ∈ {1, ..., n}, j ∈ {1, ..., k}
  2. Encrypting the meta data: h : i → α_i, α_i ∈ {0, ..., 2^n}; M_i = m_i ⊕ α_i
  3. Appending of meta data
k: the number of bits per data block that are read as meta data.
g: a function that generates a set of k bit positions.
h: a function that generates a k-bit integer α_i for each i.
M_i: the meta data m_i of block i, encrypted using h.

Scheme (4/4)
• Verification phase
  1. The verifier (client) sends the challenge g(i, j) to the archive (cloud).
  2. The archive uses g(i, j) to find the corresponding meta data and returns k+1 bits.
  3. The verifier uses α_i to decrypt the meta data.
  4. The verifier compares the decrypted bits with the bits sent by the cloud.

Performance Evaluation
• The client stores only a single cryptographic key and two functions, which are used to generate random sequences.
• Only part of the file is encrypted, which saves computation time at the client.
• Using XOR instead of a hash function is more efficient.
• Verification only requires finding and sending a few bits of data to the client.
• Network bandwidth usage is very low (k+1 bits per proof).

Advantage vs. Drawback
• Advantage
  – Reduces the computational and storage overhead of the client.
  – Minimizes the computational overhead of the cloud.
  – Reduces network bandwidth consumption.
  – Well suited to thin clients.
• Drawback
  – It does not prevent the archive from modifying the data.

Comment
• It is not clearly explained how the decryption and comparison establish whether the response is correct.
• As many α_i values must still be stored as there are files.