Journal of Network and Computer Applications 103 (2018) 185–193

Blockchain-based publicly verifiable data deletion scheme for cloud storage

Changsong Yang (a), Xiaofeng Chen (a,⁎), Yang Xiang (a,b)

(a) State Key Laboratory of Integrated Service Networks (ISN), Xidian University, Xi'an, PR China
(b) Digital Research & Innovation Capability Platform, Swinburne University of Technology, Hawthorn, Australia

ARTICLE INFO

Keywords: Cloud storage; Secure deletion; Blockchain; Public verification

ABSTRACT

With the rapid development of cloud storage, more and more data owners store their data on remote cloud servers. This reduces the data owners' overhead, because the cloud server maintains the data for them (e.g., storing, updating and deleting it). However, it also makes data deletion a security challenge, because the cloud server may not delete the data honestly owing to financial incentives. Recently, plenty of research has been done on secure data deletion. However, most of the existing methods essentially follow the same protocol, called the "one-bit-return" protocol: the storage server deletes the data and returns a one-bit result. The data owner has to believe the returned result because he cannot verify it. In this paper, we propose a novel blockchain-based data deletion scheme, which makes the deletion operation more transparent. In our scheme, the data owner can verify the deletion result no matter how malevolently the cloud server behaves. Besides, with the application of blockchain, the proposed scheme achieves public verification without any trusted third party.

1. Introduction

Cloud computing is the fusion and development of parallel computing, distributed computing and grid computing, and it connects large-scale storage and computing resources together through the Internet (Miao et al., 2014; Wang et al., 2015). As an attractive computing paradigm, cloud computing allows clients to conveniently and ubiquitously enjoy various data services such as cloud storage, outsourced computing, on-demand self-service, etc. (Buyya et al., 2009; Chen et al., 2016, 2015, 2014; Miao et al., 2017). In the cloud storage service paradigm, resource-constrained users can outsource expensive storage to the remote cloud and enjoy unstinted storage services.

Despite the tremendous benefits, cloud storage inevitably suffers from some new security problems. Firstly, the outsourced data often contain sensitive information, which should be kept secret from the cloud server. Thus, the secrecy of outsourced data is a security challenge: the cloud server should learn nothing about what it actually stores. Traditional encryption can be regarded as a solution to this problem. However, it is merely a partial solution, because performing meaningful operations over ciphertext is very difficult. Secondly, the data owner stores data on the remote cloud and the cloud server manages the data for him. When the data owner wants to delete the data, he sends a deletion command to the cloud server and the server executes the deletion operation. However, the cloud server is semi-trusted; that is, it may not delete the data honestly owing to financial incentives. Therefore, how to delete the data permanently and verify the deletion outcome efficiently is another security problem. That is, the verification should involve only simple computations; at the very least, it must be far more efficient than the task of maintaining the data itself.
The primitive of secure data deletion (also called data erasure) has been extensively studied in the past decade (Cachin et al., 2013; Hao et al., 2016; Luo et al., 2016; Perito and Tsudik, 2010; Sun et al., 2008; Wright et al., 2008, 2003). Most of the existing data deletion methods essentially follow the same protocol, called the "one-bit-return" protocol. That is, the data owner sends a command to delete data from the physical medium, and then receives a one-bit reply (Success/Failure) indicating the result of the deletion operation. For example, an operating system achieves deletion by removing the link. On receiving the deletion command, the data management system deletes the link of the file from the underlying file system, and then returns a one-bit result (Success/Failure) to the data owner. However, the returned deletion result can be misleading. The system only deletes the link of the file, while the content of the file still remains on the disk, so attackers can recover the file by scanning the disk (Garfinkel and Shelat, 2003). Obviously, deletion by unlinking is not sufficient in real applications. To delete the content of the file, researchers apply overwriting technology to design secure data deletion schemes. They delete the content of the file by overwriting the physical disks with random data, and many protocols have been proposed (Diesburg and Wang, 2010; Gutmann, 1996, 2001; Hughes et al., 2009; Kissel et al., 2006).

⁎ Corresponding author. E-mail address: xfchen@xidian.edu.cn (X. Chen).
https://doi.org/10.1016/j.jnca.2017.11.011
Received 12 May 2017; Received in revised form 31 August 2017; Accepted 28 November 2017; Available online 07 December 2017
1084-8045/ © 2017 Elsevier Ltd. All rights reserved.

Although overwriting the storage medium theoretically solves the problem of secure data deletion in general, the proposed schemes still have two inherent limitations. Firstly, most of the proposed overwriting-based schemes cannot support verification. In those protocols, the data owner has to believe the data management system because he cannot verify the result of the deletion. Although some schemes provide verification, they need to introduce a trusted third party. The other inherent limitation is that the proposed protocols are too inefficient for practical applications. Therefore, it is still significant to design secure data deletion schemes that delete data permanently and efficiently.

Boneh and Lipton (1996) presented the first cryptography-based scheme to solve the secure data deletion problem in 1996. In their scheme, all the data are encrypted before being saved and the plaintext is then deleted. Later, the decryption key is deleted to make the ciphertext unusable. A series of follow-up works have been proposed (Geambasu et al., 2009; Peterson and Burns, 2005; Reardon et al., 2013, 2013; Tang et al., 2012; Yuan and Yu, 2013). The cryptography-based solution is efficient since it can delete a large amount of data by just deleting a very short decryption key. Especially in distributed storage, all the duplicate copies of the data that are backed up in distributed locations can be deleted at once. However, in those schemes, the data owner also cannot verify the result of the deletion operation and has to trust the returned result. Besides, the ciphertext is still stored in the physical medium. Therefore, it is necessary to seek a publicly verifiable data deletion protocol.

Although various data deletion schemes have been presented, most of them have some inherent limitations. Firstly, most of the existing schemes assume that the cloud server is fully trusted and will delete the data honestly.
However, in cloud computing the cloud server may be dishonest and may not delete the data sincerely owing to financial incentives. Therefore, some schemes introduce a trusted third party, and both the data owner and the cloud server believe the trusted third party unconditionally. Nevertheless, it is very difficult to find such a trusted third party. Besides, plenty of the existing solutions cannot support public verification. However, to guarantee that the deletion result is correct and to trace a malevolent server, the data owner hopes that the schemes are publicly verifiable. That is, not only the data owner can audit the deletion result, but anyone else can also verify the outcome. Therefore, we propose our blockchain-based publicly verifiable data deletion scheme, which not only supports public verification but also does not involve any trusted third party. To the best of our knowledge, there seems to be no research work on efficient data deletion schemes that support public verification without any trusted third party in the malicious server model.

Our Contributions. In this paper, we propose a new blockchain-based publicly verifiable data deletion scheme for cloud storage. In our protocol, the data owner O does not fully trust the cloud server S. We use the idea of Blockchain to guarantee that no matter how a malicious S behaves, anyone can verify the result of the deletion operation. The main contributions of this paper are as follows:

• We construct a novel blockchain-based publicly verifiable data deletion scheme. If the cloud server does not delete the data honestly, our scheme enables the data owner to detect the malevolent operation of the cloud server. Different from the previous works, there is no trusted third party in our solution.
• We introduce the primitive of Blockchain to solve the public verification problem in the secure data deletion scheme. Taking advantage of the Blockchain system, the proposed protocol can achieve public verification. Besides, our solution is also efficient in communication as well as in computation.

1.1. Related work

The problem of how to delete digital data securely is particularly important. Over the past decades, plenty of researchers have paid considerable attention to this problem, and a series of schemes have been proposed. Although deletion by unlinking is efficient, it just deletes the link and the contents still remain on the disk (Garfinkel and Shelat, 2003). To delete the content of the file from the physical medium, Gutmann (1996) suggests that the storage medium should be overwritten with random data.

In 2010, Paul and Saxena (2010) present a novel data deletion protocol called "Proof of Erasability" (PoE). In their protocol, data are deleted by overwriting the disk with random patterns, and the host program returns the same patterns to the data owner as a proof after deletion. Perito and Tsudik (2010) present a solution called "Proof of Secure Erasure" (PoSE-s). In the protocol, the host program sends a string of random patterns to the embedded device. They assume that the embedded device's storage is so limited that it can only hold the received random patterns. Therefore, the original data will be overwritten. This scheme works essentially the same way as the PoE in Paul and Saxena (2010), except for the additional assumption of limited memory. Similarly, Luo et al. (2016) propose a permutation-based assured deletion scheme. In the scheme, the cloud storage service provider is economically rational and offers elastic storage service to the data owner. Since the cloud server is economically rational, they assume that the server only maintains the latest version of the user's data. Besides, when the data owner performs an update, all the backups will be consistent. Based on this assumption, they disguise the overwriting operations as data update operations to delete data. After that, the outcome is verified through a challenge-response protocol. The data owner can judge whether the server is honest by the challenge-response time.

In 1996, Boneh and Lipton (1996) propose the first cryptography-based protocol to solve the secure data deletion problem. In 2010, Tang et al. (2010) present a policy-based file assured deletion scheme (FADE). In their scheme, the file is first encrypted with a data key. The data key is then further encrypted with the control keys corresponding to the policy. Finally, the policy is removed to delete the corresponding control key. Subsequently, Xiong et al. (2014) propose a secure data self-destructing protocol, which is a key-policy attribute-based encryption scheme with time-specified attributes.

Perlman (2005) is among the first to propose the use of a trusted third party (TTP) to address the data deletion problem. In the solution, the data owner encrypts the data with a data key, and then the data key is encrypted with a control key by a separate TTP. The TTP destroys the control key to make the data corresponding to the control key unrecoverable. In 2016, Hao et al. (2016) present a secure data deletion scheme based on a "trust-but-verify" paradigm using a Trusted Platform Module (TPM). Their scheme can make the deletion process more transparent and the deletion result publicly verifiable. The data owner can verify the correctness of encryption and the honesty of deletion without access to the source code of the TPM.

1.2. Organization

The rest of the paper is organized as follows. Section 2 defines some preliminaries. We define the problem, including the system model and design goals, in Section 3. In Section 4, we propose our secure blockchain-based data deletion scheme. The analysis of the secure data deletion protocol is discussed in Section 5, including security analysis, comparison and performance evaluation. Finally, we give a brief conclusion in the last section.

2.
Preliminaries

In this section, we first introduce the basic definitions and construction of the Merkle hash tree. Then we describe the timestamping service. Finally, we give a short description of blockchain, which is very important for achieving public verification.

2.1. Merkle hash tree

A Merkle hash tree (Merkle, 1980) is a specific binary tree that can be used to authenticate digital data with low computation and communication overhead. In the tree, every internal node keeps a hash value computed over the concatenation of the hash values of its left child and right child. In particular, the hash values of the authenticated digital data are stored in the leaf nodes. To describe the Merkle hash tree more clearly, we give a simple example. Let $D = \{d_1, d_2, d_3, d_4\}$ be a data set. As shown in Fig. 1, the leaf nodes store $h_{2,j} = H(d_j)$, where $d_j \in D$, $j \in [1, 4]$ and $H(\cdot)$ is a collision-resistant hash function. Each internal node $(i, j)$ has a left child denoted by $(i+1, 2j-1)$ and a right child denoted by $(i+1, 2j)$, where $(i, j)$ means the $j$-th node at layer $i$; in particular, the root node is denoted by $(0, 1)$. Each internal node $(i, j)$ stores a hash value $h_{i,j}$, which is computed as $h_{i,j} = H(h_{i+1,2j-1} \parallel h_{i+1,2j})$. Finally, a signature on the root of the Merkle hash tree is generated with a traditional public-key signature technique. We can use the Merkle hash tree to authenticate any subset of $D$ through a verification object. In the Merkle hash tree, a verification object is the set of all the sibling nodes on the path from the specific leaf node to the root.

Fig. 1. The Merkle Hash Tree.

2.2. Timestamp server

A timestamp server offers a trustworthy timestamping service to users in the network. The timestamping service can be implemented in two ways. The first is implemented by a centralized server (Haber and Stornetta, 1991); the other is implemented with a decentralized system, which is based on the blockchain system in Bitcoin (Nakamoto, 2008). The latter works by computing a hash value of a block of items to be timestamped, and the hash is then widely published in a Usenet post or in a newspaper (Bayer et al., 1993; Haber and Stornetta, 1997). By computing the hash value, the timestamp can guarantee that the data existed at that time. The hash value of the timestamp is included in the next timestamp, which forms a hash chain. In the hash chain, every additional timestamp reinforces all the ones before it.

In our scheme, we make some changes to the implementation of trustworthy timestamping. Firstly, time is divided into fixed intervals by the timestamp server, such as one hour or two hours per interval. Secondly, at the end of every interval, the timestamp server computes a trustworthy timestamp on the signed current time. Then it publishes the timestamp widely. Finally, the cloud server uses the timestamp to calculate the hash value and form a hash chain.

2.3. Blockchain

Blockchain technology (Nakamoto, 2008) is a distributed transaction ledger proposed in 2008, which was originally applied to design the underpinning operation of the Bitcoin system (Huang et al., 2018). Generally, blockchains can be divided into two groups: public blockchains and private blockchains. As a novel data structure, both public and private blockchains have two attractive advantages. By generating a token as a chain of transactions, blockchain makes it possible to deal with the problem of double spending in a distributed network. The other attractive advantage is that, with the use of validation (e.g., proof of work), only one transaction history is accepted. When the hash chain is extended by one, the current hash value is fed into the hash function to compute the new hash value. That makes it infeasible to tamper with the hash chain maliciously; therefore, blockchain can be adopted to authenticate digital data.

In this paper, we change the content of the blocks in the blockchain. In our protocol, the blocks maintain the proofs produced by the cloud server instead of transactions. We store the proofs in a Merkle hash tree periodically, and all of the Merkle hash trees are connected by a hash chain. In the hash chain, each hash value is computed from the previous hash value of the hash chain, the Merkle root and the trusted timestamp generated by the timestamp server, as shown in Fig. 2.

Fig. 2. The Improved Blockchain.

3. Problem statement

3.1. System model

The proposed blockchain-based publicly verifiable data deletion scheme consists of three entities: the data owner, the cloud server, and the timestamp server. The architecture of the scheme is shown in Fig. 3.

• Data Owner. The data owner is the entity that wants to upload and store data on the remote cloud to reduce the burden of managing the data locally. The uploaded file is owned only by the data owner. Moreover, when he no longer needs the data, he generates a deletion request and sends it to the cloud server to delete the data. Finally, he can verify the deletion result with the deletion proof provided by the cloud server.
• Cloud Server. The cloud server is the entity that maintains the data for the data owner. Besides, the cloud server deletes the data when it receives the data deletion request from the data owner, and then generates a deletion proof using the blockchain. After that, the proof is sent to the data owner, who can verify it.
• Timestamp Server. The timestamp server is an entity that computes a trusted timestamp on the signed current time at the end of every interval. The timestamp server then provides the trusted timestamp to the cloud server, which uses the timestamp to generate the data deletion proof.

Fig. 3. The Data Deletion Model.

3.2. Design goals

In this paper, we assume that the cloud server is a "semi-honest" server. That means the cloud server may not follow our scheme honestly to delete the data, but may return an incorrect deletion result to mislead the data owner owing to financial incentives. When the data owner needs the data, he would try to download it, because he does not maintain any copy locally. Moreover, the data owner is assumed to be honest, and the communication channels are assumed to be secure. Therefore, two types of attacks are considered: (i) the semi-honest cloud server may arbitrarily delete some data that are rarely accessed by the data owner, for economic interests; (ii) the semi-honest cloud server may not delete the data as the data owner requests and may return an incorrect result to mislead the data owner for its own benefit. We aim to achieve the following three security goals in this paper:

• Correctness. We require that the deletion result is correct. Namely, if both the data owner and the cloud server follow the proposed scheme, the data are deleted permanently, and the deletion proof generated by the cloud server can pass the verification.
• Completeness. In the presented protocol, the data owner stores the data on the remote cloud and does not keep any copy locally. Therefore, we require that if the semi-honest cloud server tampers with the data arbitrarily, the data owner can detect the malicious behavior of the cloud server. That is, the proposed scheme supports completeness.
• Accountable Traceability. We require that if the semi-honest cloud server does not delete the data honestly and returns an incorrect deletion result, the misbehavior of the cloud server can be detected by the data owner with overwhelming probability. Besides, once the data owner has required the cloud server to delete the data, he cannot deny it. This implies that the proposed scheme supports accountable traceability.

4. Secure data deletion protocol based on blockchain

4.1. High description

In this paper, we consider the secure data deletion model in cloud computing, which is similar to Hao et al. (2016) and Luo et al. (2016). In this scenario, there is a trust problem between the data owner O and the cloud server S. That is, O does not believe that S sincerely deletes the file as requested. To solve this trust problem, researchers in both academia and industry have made great efforts and proposed many schemes. In the previous literature, many solutions introduced a trusted third party (TTP) to solve this type of trust issue. However, our blockchain-based data deletion protocol does not contain any trusted third party. Furthermore, with our scheme O can verify the outcome of the deletion operation. If S does not delete the file permanently, O can detect and trace this no matter how malicious S is.

The main process of our scheme is described in Fig. 4. In the first step, the data owner O stores the file on the remote cloud server S. To protect privacy, O should encrypt the file before uploading. That is, O uploads and stores the ciphertext of the file on S. Then S maintains the ciphertext for O. After that, O deletes the local file; therefore, if O needs the file later, he should send a request to S to download the ciphertext. Upon receiving the ciphertext, O checks it and decrypts it to obtain the plaintext of the file. Finally, if O does not need the file, he signs a deletion request with his private key $SK_O$ and sends the request to S to delete the ciphertext. Then S verifies the request and deletes the corresponding ciphertext if the request is valid. After deleting, S generates a proof for O to verify the result. Blockchain is introduced to achieve public verification. Besides, a timestamp server offers an inalterable timestamp $ts(t)$ to S, and $ts(t)$ is used to compute the hash value of the hash chain, as described in the following scheme. With the use of blockchain, O can verify the deletion outcome and trace S if it does not delete the file honestly.

Fig. 4. The framework of our protocol.

4.2. The proposed secure data deletion protocol

In this section, we present a blockchain-based publicly verifiable data deletion protocol in detail. In the following, we introduce some notations used in our scheme. Before enjoying the cloud storage service, each data owner must have an ID and have been authenticated by the cloud service provider. For simplicity, we assume that data owners have passed identification by the cloud server S and have obtained their unique IDs. For example, data owner O has an ID number $ID$, and he keeps $ID$ secret such that only O himself and S know it. At the same time, we assume that every data owner O has an ECDSA key pair $(PK_O, SK_O)$, and the cloud server S also has an ECDSA key pair $(PK_S, SK_S)$. $H_1(\cdot)$, $H_2(\cdot)$ and $H_3(\cdot)$ are secure hash functions. Besides, we suppose that every file has a unique name which is kept secret so that only O knows it (e.g., $name_F$ is the name of the file F). Furthermore, we assume that the file name is chosen to be strong enough to resist brute-force attacks.

• GenKey($ID$, $name_F$): Creating an instance of the data owner O. O wants to store file F on the cloud server S. To generate a random encryption key for F, O first computes a random number $r_1 = H_1(name_F)$, where $H_1(\cdot)$ is a collision-resistant hash function. Then O computes the random encryption key $k = H_1(r_1, ID)$. Finally, O keeps $name_F$, $r_1$ and $k$ secret.
• Encrypt($r_1$, $k$, $ID$, F, $name_F$): Before uploading the file F, the data owner O must encrypt the file to protect his privacy.
Firstly, O computes $C_1 = Enc_k(F)$, where $Enc$ is the encryption algorithm of a traditional symmetric encryption scheme. Secondly, O computes $MAC_F = H_2(C_1, r_1)$, where $H_2(\cdot)$ is a one-way hash function. Besides, O generates a search tag $Tag_F = tag(name_F)$, which S uses to search for the ciphertext in the physical storage medium. Then O forms the final ciphertext $C = (C_1, MAC_F, Tag_F)$. Finally, O sends the ciphertext C to S, and stores only the name of the file (i.e., $name_F$) locally. S receives C and stores it.
• Decryption: After uploading, the data owner O does not store the file F or its ciphertext locally. Therefore, when he needs F he must download the ciphertext $C = (C_1, MAC_F, Tag_F)$ from the cloud server S and then decrypt $C_1$ to get the file F. Hence, Decryption contains three algorithms: the ciphertext obtaining algorithm CipObtain(·), the audit algorithm Audit(·) and the decryption algorithm Decrypt(·). The procedures are performed as follows:
– CipObtain($SK_O$, $Tag_F$, $t_r$): To download the ciphertext C, the data owner O must send a request to the cloud server S. S checks the request and returns C to O if the request is valid.
(1) Firstly, O generates a ciphertext obtaining request $Re$. He computes the signature $Sig_F = Sig_{SK_O}(\text{"Ask for file"}, Tag_F, t_r)$, where $t_r$ is the current time. Then O sends the ciphertext obtaining request $Re = (\text{"Ask for file"}, Sig_F, Tag_F, t_r)$ to S.
(2) Upon receiving the ciphertext obtaining request $Re$, S checks whether the current time $t_r$ and the tag $Tag_F$ are valid, and whether $Sig_F$ is a valid signature on $t_r$ and $Tag_F$. Only if $t_r$, $Sig_F$ and $Tag_F$ are all valid will S search the storage medium for the ciphertext $C = (C_1, MAC_F, Tag_F)$ through the search tag $Tag_F$ and return C to O. Otherwise, S returns an error ⊥.
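The key derivation and ciphertext packaging steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes SHA-256 with a domain-separation prefix as a stand-in for $H_1$ and $H_2$, uses a hypothetical `tag` function (the paper does not specify tag(·)), and treats $C_1$ as opaque bytes produced by any symmetric cipher under $k$ (the cipher itself is not shown). The helper `mac_matches` recomputes $MAC_F$ from $C_1$ and $r_1$, which is how an owner could check the integrity of a downloaded ciphertext.

```python
import hashlib
import hmac


def _hash(label: bytes, *parts: bytes) -> bytes:
    # Assumption: SHA-256 over a label plus length-prefixed inputs stands in
    # for the paper's abstract hash functions H1 and H2.
    h = hashlib.sha256(label)
    for p in parts:
        h.update(len(p).to_bytes(8, "big"))
        h.update(p)
    return h.digest()


def H1(*parts: bytes) -> bytes:
    return _hash(b"H1", *parts)


def H2(*parts: bytes) -> bytes:
    return _hash(b"H2", *parts)


def tag(name_f: bytes) -> bytes:
    # Hypothetical search-tag function; the paper leaves tag(.) unspecified.
    return _hash(b"tag", name_f)


def gen_key(owner_id: bytes, name_f: bytes) -> tuple[bytes, bytes]:
    # GenKey: r1 = H1(name_F), k = H1(r1, ID).
    r1 = H1(name_f)
    k = H1(r1, owner_id)
    return r1, k


def package_ciphertext(c1: bytes, r1: bytes, name_f: bytes) -> tuple[bytes, bytes, bytes]:
    # Encrypt (packaging part): C = (C1, MAC_F, Tag_F) with MAC_F = H2(C1, r1).
    # c1 is assumed to be Enc_k(F) from a symmetric cipher, not shown here.
    return c1, H2(c1, r1), tag(name_f)


def mac_matches(c: tuple[bytes, bytes, bytes], r1: bytes) -> bool:
    # Recompute MAC_F and compare in constant time.
    c1, mac_f, _ = c
    return hmac.compare_digest(H2(c1, r1), mac_f)
```

Because $r_1$ and $k$ are derived deterministically from $name_F$ and $ID$, the owner in this sketch only needs to remember the file name and his ID to rebuild both values later.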
– Audit($C'$, $name_F$): Having received the ciphertext $C' = (C'_1, MAC'_F, Tag'_F)$ from S, the data owner O should check the correctness and completeness of $C'$ before decrypting.
(1) The data owner O verifies whether $Tag_F = Tag'_F$, where $Tag_F = tag(name_F)$. If $Tag_F \neq Tag'_F$, then $Tag'_F$ is invalid and the ciphertext $C'$ is incorrect. Otherwise, $Tag'_F$ is valid.
(2) If $Tag'_F$ is valid, the data owner O computes $r_1$ and $MAC_F$, where $r_1 = H_1(name_F)$ and $MAC_F = H_2(C'_1, r_1)$. Then O checks whether $MAC_F = MAC'_F$. If $MAC_F \neq MAC'_F$, then $C'_1 \neq C_1$ and $C'$ is incorrect; otherwise, $C'_1 = C_1$ and $C'$ is valid.
– Decrypt($r_1$, $ID$, $C'_1$): Only if all verifications have passed will the data owner O decrypt $C'_1$ with $k$. O then obtains the plaintext $F = Dec_k(C'_1)$, where $Dec$ is the decryption algorithm of a traditional symmetric encryption scheme, and $k$ is the decryption key $k = H_1(r_1, ID)$.
• Delete($SK_O$, $Tag_F$, $t_d$): The data owner O sends the deletion request to the cloud server S to delete the file F. S receives the request and verifies it.
– DelReqGen: The data owner generates a deletion request for deleting file F and sends it to the cloud server S. Firstly, O computes a signature $Sig_{Del} = Sig_{SK_O}(\text{"delete file"}, Tag_F, t_d)$, where $t_d$ is the current time. Secondly, O generates the request $DelRe = (\text{"delete file"}, Sig_{Del}, Tag_F, t_d)$ and sends it to S.
– Deletion: The cloud server S receives the deletion request $DelRe$ and must parse it to check whether it is valid. S first checks the time $t_d$ and verifies the signature with the public key $PK_O$. Then, only if the time $t_d$ and the signature $Sig_{Del}$ are both valid will S perform the deletion operation. Otherwise, S refuses to delete the file and returns an error ⊥ to O.
• ProofGen: After deleting the file F, the cloud server S and the timestamp server TS generate the proof, which is provided to every data owner to verify the outcome of the deletion.
– GenProof($Sig_{Del}$, $Sig_s$, $t_d$): After deleting the file F by overwriting the physical disk with random data, the cloud server S generates a proof to provide to the data owner O. S computes $proof_i = (\text{"delete file"}, Sig_{Del}, Sig_s, t_d)$, where $Sig_s = Sig_{SK_S}(\text{"delete file"}, Tag_F, t_d)$, and we assume $1 \le i \le m$ ($m$ is the number of files deleted in an interval). Here we take $m = 8$ as an example. Then S sends the proofs to the data owners such that every data owner gets the proof corresponding to his deleted file. Besides, S builds a Merkle hash tree with the proofs, as shown in Fig. 5(a). After that, S announces the tree and sends the root node $root_j$ to the timestamp server TS.
– Timestamping($root_j$, $t_j$): On receiving the root node $root_j$, the timestamp server TS first checks whether $root_j$ is valid. If $root_j$ is indeed computed from the bottom to the root, and all the proofs $proof_i$ ($1 \le i \le 8$) are correct, TS generates a trustworthy timestamp $ts_j$ on the time $t_j$, where $t_j$ is the end of the current interval, and then declares $ts_j$ to S and all data owners. Otherwise, TS returns an error ⊥ to S.
– ComputeHash($h_{j-1}$, $ts_j$, $root_j$): Upon obtaining $ts_j$, the cloud server S computes a new hash value $h_j$ as $h_j = H_3(h_{j-1}, ts_j, root_j)$, where $h_{j-1}$ is the previous hash value of the hash chain. After that, S proclaims the hash chain, which is depicted in Fig. 5(b).
At last, the data owner O obtains the final evidence $\tau = (proof_i, root_j, h_j)$.

Fig. 5. Proof Generation.

• Verify: When a data owner has received the evidence $\tau$ for his deletion request from the cloud server, he can verify $\tau$ for correctness. Without loss of generality, we take the aforementioned proof $proof_4$ as an example to describe the verification process. To verify the outcome of the deletion, the data owner O first computes $h'_{3,4} = H_3(proof_4)$. If the equation $h'_{3,4} = h_{3,4}$ holds, O then checks that the Merkle root is indeed computed from $proof_4$ by evaluating the hash tree from the bottom to the root.
Concretely, O computes and checks whether the following equations hold:

$$\begin{aligned}
h_{3,4} &= H_3(proof_4) \\
h_{2,2} &= H_3(h_{3,3} \,\|\, h_{3,4}) \\
h_{1,1} &= H_3(h_{2,1} \,\|\, h_{2,2}) \\
root'_j &= H_3(h_{1,1} \,\|\, h_{1,2}) \\
root'_j &\stackrel{?}{=} h_{0,1}
\end{aligned}$$

Then O verifies the hash chain value and checks whether the following equations hold:

$$\begin{aligned}
h_j &\stackrel{?}{=} H_3(h_{j-1} \,\|\, ts_j \,\|\, root_j) \\
h_{j+1} &\stackrel{?}{=} H_3(h_j \,\|\, ts_{j+1} \,\|\, root_{j+1}) \\
&\;\;\vdots \\
h_{m-1} &\stackrel{?}{=} H_3(h_{m-2} \,\|\, ts_{m-1} \,\|\, root_{m-1}) \\
h_m &\stackrel{?}{=} H_3(h_{m-1} \,\|\, ts_m \,\|\, root_m)
\end{aligned}$$

where $h_m$ is the latest published hash value. Only if all verifications succeed, that is, all the equations hold, can O be sure that S is honest and the proofs are trustworthy.

Remark 1. To ensure that the deleted data is irrecoverable at any time in the future, the cloud server S deletes the ciphertext by overwriting. Deletion by overwriting has been proposed in several papers (Gutmann, 1996; Kissel et al., 2006; Paul and Saxena, 2010). In our scheme, the cloud server uses random data to overwrite the physical disks which store the target data.

5. Analysis of our proposed protocol

5.1. Security analysis

In this section, we analyze the security of the proposed secure data deletion protocol. As mentioned before, we assume that the data owner O does not fully trust the cloud server S. Besides, we also consider that a malicious data owner O colludes with S and deletes data illegally.

Theorem 1. The proposed data deletion scheme satisfies the property of correctness.

Proof. If the cloud server S is assumed to be honest and deletes the data sincerely, then the evidence is $\tau = (proof_i, root_j, h_j)$. Firstly, note that the Merkle hash tree is built by the cloud server S at time $t_d$. The leaves of the tree are all the proofs corresponding to the files deleted in the interval $t_j$, and $root_j$ is the root of the Merkle hash tree. Besides, the timestamp $ts_j$ is a trustworthy timestamp on $t_j$, that is, the end of the interval. Because the cloud server S is assumed to be honest, the proof $proof_i$, the root $root_j$ and the previous hash value $h_{j-1}$ of the hash chain are all valid. Secondly, the verification algorithm computes $h'_j = H_3(h_{j-1} \,\|\, ts_j \,\|\, root_j)$. Hence the equation $h'_j = h_j$ always holds, that is, $h_j$ is always the outcome of the verification algorithm. □

Theorem 2. The proposed data deletion scheme satisfies the property of completeness.

Proof. We assume that the cloud server S is honest and that S stores the ciphertext of the file securely before deletion. Note that the ciphertext is $C = (C_1, MAC_F, Tag_F)$, where $C_1 = Enc_k(F)$, $MAC_F = H_2(C_1, r_1)$ and $Tag_F = tag(name_F)$. Because the data owner O does not keep any backup locally, when O needs the file he must download the ciphertext from S and decrypt it to obtain the plaintext. O gets $C' = (C'_1, MAC'_F, Tag'_F)$ from S. Firstly, O computes the random number $r_1 = H_1(name_F)$, where $name_F$ is the unique name of the file F and only O knows it. Secondly, O computes $MAC_F = H_2(C'_1, r_1)$ and $Tag_F = tag(name_F)$, then verifies the following two equations: $Tag_F \stackrel{?}{=} Tag'_F$ and $MAC_F \stackrel{?}{=} MAC'_F$. Only if both equations hold will O accept that $C_1$ is correct and decrypt it to get the plaintext. Since S is assumed to be honest, $C'$ is always valid and the plaintext of the file F is always the output of the Decrypt algorithm. Otherwise, the two equations cannot hold and the verification cannot pass. That is, the ciphertext is correct and intact. □

Theorem 3. The proposed data deletion scheme satisfies the property of accountable traceability.

Proof. Without loss of generality, we consider the two scenarios in which the data owner O is dishonest and the cloud server is dishonest, respectively.

Case 1: Dishonest Data Owner. If the data owner O is dishonest, we can firstly assume that O has already asked the cloud server S to delete the data, but later, when he needs the data, denies having requested the deletion. In this case, the cloud server S can present the deletion request $DelRe = (\text{"delete file"}, Sig_{Del}, Tag_F, t_d)$ that was sent by O. $Sig_{Del}$ is the signature on the message "delete file", the tag $Tag_F$ and the current time $t_d$ under O's secret key $SK_O$. Because $Sig_{Del}$ can be generated only by O, S can present $Sig_{Del}$ as a proof that O has already required S to delete the data. That is, O cannot deny that he has asked S to delete the data. Secondly, we can make another assumption: O declares that he has required S to delete the data, but in fact he did not. Here, S just needs to ask O to present $\tau = (proof_i, root_j, h_j)$. The proof $proof_i$ contains a signature $Sig_s$ generated by S with his secret key $SK_S$, and no one but S can compute it. Therefore, O cannot prove that he has required S to delete the data. In other words, O cannot cheat S.

Case 2: Dishonest Cloud Server. Similarly, if the cloud server S is dishonest, on the one hand, we assume that S deletes the data without authorization by the data owner O. In this case, S does not have the deletion request $DelRe = (\text{"delete file"}, Sig_{Del}, Tag_F, t_d)$, which should be computed by the data owner. Since the deletion request contains a signature $Sig_{Del}$ made with the secret key $SK_O$, only O can compute the signature and S cannot counterfeit it. Consequently, S cannot prove that he deleted the data according to O's request, and he must bear the loss. On the other hand, if the ciphertext C can still be detected in the cloud storage after O has required S to delete the data, it means that S did not delete C honestly. In this case, O can perform the Verify protocol to prove that he has requested S to delete C. We note that $\tau = (proof_i, root_j, h_j)$, where $proof_i$ contains a signature $Sig_s$ generated by S with his secret key $SK_S$. Since the signature $Sig_s$ can be computed only by S, O can prove that S did not delete C honestly. Therefore, S must be held responsible in this case. □

5.2. Comparison

In this section, we compare our scheme with the very recent scheme (Hao et al., 2016). Firstly, both schemes require some computational effort in the GenKey phase. Secondly, our scheme does not rely on any trusted third party and can simultaneously support public verification, which is different from the other schemes. Besides, though our scheme incurs some expensive computations in the ProofGen phase, these computations are outsourced to the cloud server. Trivially, most of the computations are finished by the cloud server (this is the same in the other scheme). Finally, the cloud server stores and manages the data for the data owner. That is, the data owner is not required to keep any backup locally.

The comparison between the two schemes is presented in Table 1. For the convenience of comparison, we introduce some marks. We denote by $\mathcal{S}$ a signature operation (resp., by $\mathcal{V}$ an operation of verifying the validity of a signature), by $\mathcal{E}$ an AES encryption operation (resp., by $\mathcal{D}$ an AES decryption operation; our scheme does not rely on a particular encryption algorithm, and any secure encryption algorithm is suitable for our scheme), by $\mathcal{H}$ an operation of computing a hash value, by $\mathcal{M}$ a multiplication, and by m the number of files deleted by the cloud server S in the interval (we assume that the data owner O deletes only one file no matter what the value of m is). Besides, we assume that $n = \log_2 m$. For simplicity, we omit the ordinary data (file and proof) upload and download operations.

Table 1
Comparison between two schemes.

| Scheme | The scheme in Hao et al. (2016) | Our scheme |
|---|---|---|
| Computational model | Amortized model | Amortized model |
| TTP | Yes | No |
| Public verifiability | Yes | Yes |
| Accountability | Yes | Yes |
| Computation (Encrypt) | $1\mathcal{M} + 2\mathcal{E} + 4\mathcal{H}$ | $1\mathcal{E} + 2\mathcal{H}$ |
| Computation (Decrypt) | $1\mathcal{E} + 1\mathcal{D} + 3\mathcal{H}$ | $1\mathcal{S} + 1\mathcal{V} + 1\mathcal{D} + 3\mathcal{H}$ |
| Computation (Delete) | $1\mathcal{S}$ | $1\mathcal{S} + 1\mathcal{V}$ |
| Computation (Verify) | $1\mathcal{V}$ | $(n + 2)\mathcal{H}$ |

It can be seen easily that our protocol does not rely on any trusted third party, and achieves public verifiability and accountability with only a small increase in computation overhead compared with scheme (Hao et al., 2016). The proposed scheme is more efficient in the Encrypt and Decrypt phases. When deleting a file, our scheme requires one signature operation and one verification operation on the signature, while scheme (Hao et al., 2016) only needs one signature operation. To verify the correctness of the result, our scheme needs to compute $(2 + \log_2 m)$ hash values. Thus, when n is small, our scheme is more efficient for real applications.

5.3. Performance evaluation

In this section, we give the experimental evaluation of the presented secure data deletion scheme. We implement our mechanism with OpenSSL on a Windows machine with an Intel(R) Core(TM) i5-4590 processor running at 3.30 GHz and 4 GB of memory. So that the computation cost can be evaluated precisely, we simulate both the data owner and the cloud server on this Windows machine.

In order to ensure the security and privacy of the file, the data owner could encrypt the data before uploading it. We provide the time costs of encryption for scheme (Hao et al., 2016) and our scheme in Fig. 6(a). The simulation result reveals that the time cost of encryption increases with the size of the plaintext file in both schemes. Besides, the growth rate of scheme (Hao et al., 2016) is relatively higher than that of our scheme. Therefore, our scheme requires less overhead than scheme (Hao et al., 2016). That is, our scheme is more efficient in the encryption process.

Similarly, Fig. 6(b) shows the efficiency comparison for decryption as the ciphertext size increases. In both schemes, the time cost of decryption increases with the size of the ciphertext. In our scheme, the computation cost is dominated by the decryption function. Besides, it also needs to generate a signature and verify the generated signature, and three hash values should be computed at the same time. In addition to the decryption calculations, scheme (Hao et al., 2016) still requires some encryption computation and three hash computations to verify the ciphertext. Fig. 6(b) shows that the growth rate of our scheme is relatively lower than that of scheme (Hao et al., 2016). In other words, our scheme is much more efficient than scheme (Hao et al., 2016) in the decryption process.

In the deletion phase, the main computation in scheme (Hao et al., 2016) is computing a signature, whereas our scheme needs to compute a signature and verify it. As shown in Fig. 6(c), the computation overhead stays constant as the number of deleted files in the interval varies. Besides, the efficiency of scheme (Hao et al., 2016) is higher than that of our scheme. However, our scheme is still efficient for real-world applications because the computation only needs to be conducted once and the time cost is acceptable.

Fig. 6(d) shows the efficiency comparison for verification as the number m of deleted files in the interval increases. In our scheme, the computation cost is dominated by hash function calculations, which are used to verify the Merkle hash tree. Trivially, the data owner conducts $(2 + \log_2 m)$ hash computations. Therefore, the time cost increases with m. In scheme (Hao et al., 2016), only one operation of verifying the validity of a signature needs to be conducted. However, our scheme is more efficient than scheme (Hao et al., 2016) in general because computing a hash function is much faster than verifying a signature.

Fig. 6. Efficiency comparison. (Plots of time cost against the size of file (Mb) and against the number n, for our scheme and the Hao et al. scheme.)

Fig. 7(a) shows the efficiency of proof generation as the number m of deleted files increases (where $n = \log_2 m$). In our scheme, the proof generation phase needs to conduct $2m - 1$ hash computations. However, scheme (Hao et al., 2016) does not contain a proof generation phase. That is, scheme (Hao et al., 2016) seems more efficient for the proof generation phase. However, a hash computation needs very little time. Therefore, even for a very large file number m, the time cost for proof generation is very short. For example, when the number of deleted files is m = 16384, the time cost for proof generation is about 10 ms. Besides, the proof generation phase is executed by the cloud server S. The data owner O does not need to bear any overhead. Therefore, our scheme is still efficient in the proof generation phase.

Fig. 7(b) shows the efficiency comparison for all the processes as the file size increases. In our scheme, the computation cost is dominated by an encryption computation, a decryption computation, two signature computations and two signature verification operations. Besides, our scheme needs some hash computations. However, scheme (Hao et al., 2016) needs three encryption computations, a decryption computation, a signature operation and a verification of the signature. Similarly, scheme (Hao et al., 2016) also needs some hash computations. Trivially, the growth rate of scheme (Hao et al., 2016) is relatively higher than that of our scheme. Therefore, our scheme is more efficient than scheme (Hao et al., 2016).

Fig. 7. Efficiency comparison. (Plots of time cost (ms) against the number n and against the size of file (Mb).)

6. Conclusion

In this paper, we propose a new publicly verifiable data deletion scheme for cloud storage based on Blockchain. In the proposed scheme, the data owner O and the cloud server S do not fully trust each other. Different from the existing schemes, we adopt a Blockchain system to guarantee that a data owner O can detect the cheating whenever a dishonest S behaves malevolently. Besides, if S is malicious and cheats O, then O can prove that S is dishonest by verifying the proof of deletion without any trusted third party.

Acknowledgement

This work was supported by the National Natural Science Foundation of China (Nos. 61572382, 61772405 and 61702401), China 111 Project (No. B16037), and the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2016JZ021).

References

Bayer, D., Haber, S., Stornetta, W.S., 1993. Improving the efficiency and reliability of digital time-stamping. Seq. II: Methods Commun. Secur. Comput. Sci., 329–334.
Boneh, D., Lipton, R.J., 1996. A revocable backup system. In: Proceedings of the Sixth USENIX Security Symposium, pp. 91–96.
Buyya, R., Yeo, C.S., Venugopal, S., Broberg, J., Brandic, I., 2009. Cloud computing and emerging IT platforms: vision, hype, and reality for delivering computing as the 5th utility. Future Gener. Comput. Syst. 25 (6), 599–616.
Cachin, C., Haralambiev, K., Hsiao, H.C., Sorniotti, A., 2013. Policy-based secure deletion.
In: The 2013 ACM SIGSAC Conference on Computer and Communications Security, pp. 259–270.
Chen, X., Li, J., Huang, X., Ma, J., Lou, W., 2015. New publicly verifiable databases with efficient updates. IEEE Trans. Dependable Secur. Comput. 12 (5), 546–555.
Chen, X., Li, J., Ma, J., Tang, Q., Lou, W., 2014. New algorithms for secure outsourcing of modular exponentiations. IEEE Trans. Parallel Distrib. Syst. 25 (9), 2386–2396.
Chen, X., Li, J., Weng, J., Ma, J., Lou, W., 2016. Verifiable computation over large database with incremental updates. IEEE Trans. Comput. 65 (10), 3184–3195.
Diesburg, S.M., Wang, A.I.A., 2010. A survey of confidential data storage and deletion methods. ACM Comput. Surv. 43 (1), 1–37.
Garfinkel, S.L., Shelat, A., 2003. Remembrance of data passed: a study of disk sanitization practices. IEEE Secur. Priv. 1 (1), 17–27.
Geambasu, R., Kohno, T., Levy, A.A., Levy, H.M., 2009. Vanish: Increasing data privacy with self-destructing data. In: Proceedings of the 18th USENIX Security Symposium, pp. 299–316.
Gutmann, P., 1996. Secure deletion of data from magnetic and solid-state memory. In: Proceedings of the Sixth USENIX Security Symposium, pp. 77–89.
Gutmann, P., 2001. Data remanence in semiconductor devices. In: Proceedings of the 10th USENIX Security Symposium, pp. 39–54.
Haber, S., Stornetta, W.S., 1991. How to time-stamp a digital document. J. Cryptol. 3 (2), 99–111.
Haber, S., Stornetta, W.S., 1997. Secure names for bit-strings. In: Proceedings of the 4th ACM Conference on Computer and Communications Security, pp. 28–35.
Hao, F., Clarke, D., Zorzo, A.F., 2016. Deleting secret data with public verifiability. IEEE Trans. Dependable Secur. Comput. 13 (6), 617–629.
Huang, H., Chen, X., Wu, Q., Huang, X., Shen, J., 2018. Bitcoin-based fair payments for outsourcing computations of fog devices. Future Gener. Comput. Syst. 78, 850–858.
Hughes, G.F., Coughlin, T., Commins, D.M., 2009. Disposal of disk and tape data by secure sanitization. IEEE Secur. Priv. 7 (4), 29–34.
Kissel, R., Scholl, M., Skolochenko, S., Li, X., 2006. Guidelines for media sanitization. NIST Spec. Publ. 800-88.
Luo, Y., Xu, X., Fu, S., Wang, D., 2016. Enabling assured deletion in the cloud storage by overwriting. In: Proceedings of the 4th ACM International Workshop on Security in Cloud Computing, pp. 17–23.
Merkle, R.C., 1980. Protocols for public key cryptosystems. In: The 1980 IEEE Symposium on Security and Privacy, pp. 122–134.
Miao, M., Wang, J., Ma, J., Susilo, W., 2017. Publicly verifiable databases with efficient insertion/deletion operations. J. Comput. Syst. Sci. 86, 49–58.
Miao, S., Li, Z., Qu, W., Du, Y., Wang, S., Qi, H., 2014. Progressive transmission based on wavelet used in mobile visual search. Int. J. Embed. Syst. 6 (2/3), 114–123.
Nakamoto, S., 2008. Bitcoin: A peer-to-peer electronic cash system.
Paul, M., Saxena, A., 2010. Proof of erasability for ensuring comprehensive data deletion in cloud computing. In: Proceedings of the Third International Conference on Recent Trends in Network Security and Applications, pp. 340–348.
Perito, D., Tsudik, G., 2010. Secure code update for embedded devices via proofs of secure erasure. In: Proceedings of the 15th European Symposium on Research in Computer Security, pp. 643–662.
Perlman, R., 2005. File system design with assured delete. In: Proceedings of the 3rd International IEEE Security in Storage Workshop, pp. 83–88.
Peterson, Z., Burns, R., 2005. Ext3cow: a time-shifting file system for regulatory compliance. ACM Trans. Storage 1 (2), 190–212.
Reardon, J., Basin, D.A., Capkun, S., 2013. SoK: Secure data deletion. In: The 2013 IEEE Symposium on Security and Privacy, pp. 301–315.
Reardon, J., Ritzdorf, H., Basin, D.A., Capkun, S., 2013. Secure data deletion from persistent media. In: The 2013 ACM SIGSAC Conference on Computer and Communications Security, pp. 271–284.
Sun, K., Choi, J., Lee, D., Noh, S.H., 2008. Models and design of an adaptive hybrid scheme for secure deletion of data in consumer electronics. IEEE Trans. Consum. Electron. 54 (1), 100–104.
Tang, Y., Lee, P.P.C., Lui, J.C.S., Perlman, R., 2010. Fade: Secure overlay cloud storage with file assured deletion. In: The SecureComm 2010-Proceedings of the 6th International ICST Conference on Security and Privacy in Communication Networks, pp. 380–397.
Tang, Y., Lee, P.P.C., Lui, J.C.S., Perlman, R.J., 2012. Secure overlay cloud storage with access control and assured deletion. IEEE Trans. Dependable Secur. Comput. 9 (6), 903–916.
Wang, J., Chen, X., Huang, X., You, I., Xiang, Y., 2015. Verifiable auditing for outsourced database in cloud computing. IEEE Trans. Comput. 64 (11), 3293–3303.
Wright, C., Kleiman, D., Shyaam, S.R.S., 2008. Overwriting hard drive data: The great wiping controversy. In: Proceedings of the 4th International Conference on Information Systems Security, pp. 243–257.
Wright, C.P., Martino, M.C., Zadok, E., 2003. Ncryptfs: A secure and convenient cryptographic file system. In: Proceedings of the General Track: 2003 USENIX Annual Technical Conference, pp. 197–210.
Xiong, J., Liu, X., Yao, Z., Ma, J., Li, Q., Geng, K., Chen, P.S., 2014. A secure data self-destructing scheme in cloud computing. IEEE Trans. Cloud Comput. 2 (4), 448–458.
Yuan, J., Yu, S., 2013. Secure and constant cost public cloud storage auditing with deduplication. In: Proceedings of the IEEE Conference on Communications and Network Security, pp. 145–153.