International Journal of Engineering Trends and Technology (IJETT) – Volume 9 Number 7 - Mar 2014

Comparative Study of Enhanced Secure and Effective Relevance Keyword Search Over Public Cloud Data

Sathya.R#1, Manjula babu#2
#1 PG Student, Dept. of MCA, Vel Tech Dr.RR & Dr.SR Technical University, Chennai, India
#2 Asst Professor, Department, Vel Tech Dr.RR & Dr.SR Technical University, Chennai, India

Abstract- Cloud computing is one of the most recent achievements in computing. However, not everyone can benefit from cloud services, because of issues of cost, security, and infrastructure. Private clouds were introduced to address these problems, but their cost is too high for an individual. The proposed system aims to provide cloud service to all individuals through the public cloud in a secure and efficient manner. Many encryption techniques exist for securing data in the public cloud, but they allow only Boolean search, which cannot give effective results because a cloud holds a large amount of data and serves many users. In this paper we introduce user-side and owner-side encryption, together with a separate extraction process for extracting keywords from the data. To further strengthen security, we use order-preserving mapping to prevent data leakage. DRKU updates the keywords of a file when the file is updated or edited; the index for a search string is generated dynamically, which helps retrieve results faster, and the index is encrypted to provide privacy. Some simple methods and a user-side encryption method are implemented to improve the security guarantee.

Keywords – Cloud Computing, RSE, OPES, DRKU, Public cloud.

I. INTRODUCTION

Cloud computing is a collection of services, applications, information, and infrastructure composed of pools of compute, network, information, and storage resources.
These components can be rapidly orchestrated, provisioned, implemented and decommissioned, and scaled up or down, providing an on-demand, utility-like model of allocation and consumption. The cloud enhances collaboration, agility, scaling, and availability, and offers the potential for cost reduction through optimized and efficient computing. The cloud model promotes availability and is composed of five essential characteristics, three service models, and four deployment models [2]. Cloud computing is an evolving paradigm. Its definitions, use cases, underlying technologies, issues, risks, and benefits will be refined in a spirited debate by the public and private sectors. These definitions, attributes, and characteristics will evolve and change over time. A consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed and automatically, without requiring human interaction with each service's provider.

ISSN: 2231-5381

In this paper we deal with the public cloud, where the cloud infrastructure is made available to the general public or a large industry group and is maintained by an organization selling cloud services. There are many problems in providing public cloud service, such as technical problems in implementing cloud computing, technical problems that arise once it has been implemented, and policy and business problems in implementing it [3]. As the cloud becomes more popular and efficient, users are ready to store their personal data in it, and government sectors are also setting up cloud environments. However, the cost of a private cloud environment is too high, and a public cloud environment is not confidential enough to hold sensitive data. The many risks of cloud computing are discussed in detail in [4]. In the cloud, the major problem is that data may be eavesdropped or leaked [5], or even hacked [6]. Unauthorized access to a cloud computing service may occur when an unauthorized person obtains a user ID or user name together with the correct password.
This can happen through a variety of technical and non-technical methods. Social engineering may be directed at the cloud service provider, for example by claiming that critical access is urgently needed but that the password is not working and needs to be changed. To overcome this entire problem we need encryption. When we use data as a service we need reliable results, and they must be the most relevant ones in the large outsourced data set. In this process the owner outsources a large amount of data and needs to specify preferences, such as who can access the data and how users are authenticated. We also cannot hand all the data to every user; to avoid this, we let users retrieve the data that matches their interests by means of keyword search. Keyword search has so far been used on plain text only; in this paper we create keyword search for encrypted files.

http://www.ijettjournal.org

Consider a huge collection of records, publications, and journals, of which, at any given time, a particular user is interested in only a small part. As a very rough estimate, we might assume that one printed page contains about 600 words, or, counting spaces and punctuation, about 1,800 characters; an 800-page book then contains about three million characters. A bookcase may have two sides and may be five shelves high and 6 m long, so it holds perhaps two and a half billion characters, or, in computer terms, 5.5 GB. Even a small library has 20 or more bookcases; a large one may have thousands. In general, we may expect even a comparatively small document collection to contain many millions of characters [7]. Although many search techniques for encrypted files exist, all of them are "Boolean search" techniques, for example [8], [9], [10], and they cannot rank the results by relevance to the user's requirement (keyword).
Applying such a method directly to a cloud service creates two main problems. First, for every search request, users without prior knowledge of the encrypted data have to go through every retrieved file in order to find the ones that best match their interest, which may demand a large amount of post-processing. Second, invariably sending back all files based solely on the presence or absence of the keyword incurs a large amount of unnecessary network traffic, which is entirely undesirable in a pay-per-use service.

To provide the most relevant data we use a scoring technique [11]. To assign a dynamic score to a file for a request, DRKU checks the similarity between the requested keyword and the index keyword, each represented as a vector. The similarity between two keyword vectors is, again, not built into DRKU. Typically, the angle between two keyword vectors is used as a measure of the difference between them, and the cosine of the angle is used as the dynamic similarity score. As an alternative, the inner product of two keyword vectors is often used as a similarity measure. If all the vectors are forced to be of unit length, then the cosine of the angle between two vectors is the same as their inner product. If M is the document vector and R is the request vector, then the similarity of document M to request R (or the score of M for R) is

Sim(M, R) = ∑_k x_{k,M} · x_{k,R},

where x_{k,M} and x_{k,R} are the k-th components of M and R.

Several steps are needed to achieve secure and relevant data outsourcing in the public cloud. First, the user must register with our provider. Authentication is key to protecting the data, and it also helps the owner grant access to the data to specific users. All files have to be maintained and stored securely. We mainly use three methods to provide relevance and security. First, we extract the keywords from the document using the extraction algorithm, Symmetric Key Extraction (SKE), shown in Fig 1.
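The cosine-based score Sim(M, R) described earlier in this section can be illustrated with a short sketch. It is only an illustration under stated assumptions: the paper does not say how vector components are weighted, so raw term counts are assumed here, and the function names `to_unit_vector` and `sim` are hypothetical.

```python
import math
from collections import Counter


def to_unit_vector(text):
    """Map text to a term -> weight dict scaled to unit length.

    Assumption: weights are raw term counts; the paper does not
    specify a weighting scheme.
    """
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    return {term: c / norm for term, c in counts.items()}


def sim(m_vec, r_vec):
    """Inner product of two unit vectors, which equals the cosine of their angle."""
    return sum(w * r_vec.get(term, 0.0) for term, w in m_vec.items())


doc = to_unit_vector("cloud search over encrypted cloud data")
req = to_unit_vector("cloud data")
score = sim(doc, req)
```

Because both vectors are normalized first, the inner product and the cosine coincide, which is the property the text relies on. A request that shares no term with the document scores zero.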
When the owner outsources the data, before it is sent to the server it is encrypted using Searchable Symmetric Encryption (SSE), in which both the data and the index are encrypted. Indexing is executed dynamically: when the user sends a request to the server, an index is created for each keyword.

Fig 1 Keyword Extraction Process

Multiple keywords can be sent at a time, and the results are retrieved one by one. When a user wants to use a document, the user must pass a security check and provide a key to download the data. To create a dynamic score for keywords we need to keep the extracted keywords up to date. For this we use Dynamic Relevance Keyword Updating (DRKU), which updates the index table when a file is edited, updated, or newly added. To further strengthen security we use an order-preserving mapping technique to prevent data leakage. Fig 2 shows the architecture of the process.

Fig 2 Architecture of System

II. SYSTEM DECLARATION

A. System Problems

Unauthorized access to cloud computing systems may occur when a user ID and password are obtained without authorization. This may happen through many technical and non-technical approaches. Social engineering may be directed at the service provider, for example by claiming that access is urgently needed but that the password is not available and needs to be changed. Passwords may also be stolen when a user logs in at a public place and leaves the session unattended, or cracked using software.

A denial of service (DoS) attack on a cloud server may prevent users from using their accounts. This may happen by flooding the targeted site with traffic so that it becomes unavailable to genuine users. When a DoS attack is launched from many sources, it is referred to as a distributed denial of service attack. A DoS attack may also concentrate on a particular user rather than on the entire server, for example by trying to change the user's password or by repeatedly logging in with a wrong password until the account is locked.

"Side channel attacks" may allow one user to access another user's account and data, as the boundary of usage has been extended. In this kind of attack both the user and the attacker are placed on the same server, so the attacker may try to attack any user's account at random. However, attacks on a specified user are still possible. It has been shown with one cloud computing service provider that this could be achieved 40 percent of the time by setting up new users while simultaneously manipulating the resource needs of the targeted user's virtual machines.

Another attack is the domain name system (DNS) attack, which works by changing the URLs or domain records on the server. Various DNS attacks are aimed at obtaining authentication credentials from broadband users, and the same applies to cloud users. Once the DNS has been changed, users are led to the attacker's server, or the attacker can easily enter the user's system. In domain name hijacking, attackers steal the cloud server's domain name and make users register on fraudulently named servers. The user is led to a duplicate URL that looks the same as the original server URL and provides all of their authentication details there; the attacker can steal these details and use them to enter the targeted server as that user. Session hijacking consists of the attacker exploiting active computer sessions by obtaining the cookies that are used to authenticate users. This can be achieved by cross-site scripting, in which malicious code is injected into the website and subsequently executed by the browser. Many more problems of cloud computing could be described; to overcome them, the proposed paper implements a number of techniques.

B. System Goals

To enable relevance searchable symmetric encryption for effective use of outsourced and protected data under the above-mentioned problems, our system design should achieve the following security and performance guarantees:
Relevance keyword search: to explore different mechanisms for designing effective relevance search schemes based on the existing searchable encryption framework;
Security guarantee: to prevent the cloud server from learning the searched keyword from the ciphertext, and to achieve a stronger security level than the existing system;
Effectiveness: the above goals should be achieved with minimum communication and computation.

The inverted index is an index that stores a score for the data on the server; it is maintained with the help of DRKU, so that keywords are updated automatically. KW denotes the keyword, F is the frequency of the keyword in a file, QF is the frequency with which the keyword is queried, ND is the total number of files in the database, DT is the number of files that match the keyword, L is the file length in bytes, and AL is the average file length.

Score = ∑ L · ((ND − DT + 0.8) / (DT + 0.8)) · ((KW + 1) · F) / (KW · (1.5 − b) + b · (DT / AL) + F) · ((KW + 1) · DT) / (KW + QF)

The important part of retrieving files is the relevance of the keyword to the document in which it appears. Computing this score in every model requires a large amount of processing. An alternative method that has been shown to be effective in increasing file relevance is query modification by relevance feedback. An updated relevance model uses an efficient calculation method in combination with a good query expansion method.

BuildIndex(K, C)
Initialization:
    scan C and extract the distinct words W = (w1, w2, ..., wm) from C.
    For each wi ∈ W, build F(wi);
Build posting list:
    for each wi ∈ W
        for 1 ≤ j ≤ |F(wi)|:
            calculate the score for file Fij according to the score equation, denoted as Sij;
            compute Ez(Sij), and store it with Fij's identifier ⟨id(Fij) || Ez(Sij)⟩ in the posting list I(wi);
Secure the index I:
    for each I(wi) where 1 ≤ i ≤ m:
        encrypt all Ni entries with key fy(wi), where 1 ≤ j ≤ Ni;
        set the remaining entries of I(wi), if any, to random values of the same size as the existing Ni entries of I(wi);
        replace wi with x(wi);
Output I.

Word              wi
File ID           Fi1   Fi2   Fi3   ...   FiNi
Relevance Score   DW1++
Fig 3 Index Sample

Relevance function: when a user submits a request with a keyword, the index checks the number of times each file has been downloaded for that keyword, and the most downloaded file is shown to the user as the first result. With a relevance keyword index we can return the most relevant file first and the remaining files in order. We need a list of queried keywords, and we need to match them to the most relevant files in our database. Central to relevance querying is the use of a similarity heuristic, a method that assigns a numeric score indicating how well a file and the keyword match.

Word              Java
File ID           x1      y2      x2      ...   Xn
Relevance Score   x1-32   y2-26   x2-21   ...   Xnn
Fig 4 Example Index

III. ENCRYPTION EVOLUTIONS

In the introduction we motivated relevance keyword search over encrypted data as a way to save space in cloud computing. In this section, we begin with an analysis of searchable symmetric encryption (SSE) systems and then give the definitions and framework for our proposed relevance symmetric encryption (RSE). By following the same approach as SSE, it is possible to support a relevance search method over encrypted data.

A. Searchable symmetric encryption (SSE)

We present new adversarial models for SSE.
The first, non-adaptive model considers only adversaries that make their search requests without taking previous searches into account. The second, adaptive model considers adversaries that choose their requests as a function of previously obtained trapdoors and search results. Whenever keywords are received from the user, the keyword index is created dynamically for those keywords. All previous work on SSE, with the exception of oblivious RAMs, falls within the non-adaptive setting. The implication is that, contrary to the normal use of searchable encryption, these definitions only guarantee security for users that perform all their searches at once. We address this by introducing game-based and simulation-based definitions in the adaptive setting.

We provide two models, which we verify under our new definitions. Our first model is secure only in the non-adaptive setting, but it is the most efficient SSE model to date. In fact, it achieves searches in a single communication round, requires an amount of work from the server that is linear in the number of files that contain the keyword, requires constant storage on the client side, and requires storage on the server that is linear in the size of the file collection. Although an earlier model also achieves searches in one round, it can induce false positives, which is not the case in our model. Furthermore, all the earlier models require the server to perform an amount of work that is linear in the total number of files in storage. Our second model is secure against an adaptive adversary, but at the price of a higher communication overhead per request and more storage at the server [12]. Although our adaptive scheme is conceptually simple, we note that constructing efficient and provably secure adaptive SSE schemes is not a trivial task. The main challenge lies in proving such constructions secure in the simulation paradigm.
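The BuildIndex procedure given in Section II, on which the SSE upload procedure relies, can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the relevance score is replaced by a placeholder (keyword frequency divided by file length), the keyed keyword label uses HMAC-SHA256 as a stand-in for x(wi), and the score encryption Ez and the padding of posting lists are omitted. All function and variable names are hypothetical.

```python
import hashlib
import hmac
from collections import Counter, defaultdict


def keyword_label(key, word):
    """Keyed pseudonym for a keyword, so the stored index never holds the word itself."""
    return hmac.new(key, word.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def build_index(key, files):
    """Build posting lists: label -> [(file_id, score), ...], best score first.

    Assumption: score = keyword frequency / file length in words,
    used here only as a placeholder for the paper's score equation.
    """
    index = defaultdict(list)
    for file_id, text in files.items():
        words = text.lower().split()
        if not words:
            continue
        for word, freq in Counter(words).items():
            index[keyword_label(key, word)].append((file_id, freq / len(words)))
    for posting in index.values():
        posting.sort(key=lambda entry: entry[1], reverse=True)
    return dict(index)


def search(key, index, word):
    """Return file identifiers for a keyword in relevance order."""
    return [file_id for file_id, _ in index.get(keyword_label(key, word), [])]


KEY = b"demo-key-not-for-production"
FILES = {
    "x1": "java cloud java server",
    "y2": "java cloud search data",
    "x2": "cloud storage",
}
INDEX = build_index(KEY, FILES)
RESULT = search(KEY, INDEX, "java")
```

Only a holder of the key can compute the label for a keyword, so the server can look up a posting list without seeing the keyword in clear. In this sketch the scores themselves are stored in clear, which a real scheme would avoid.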
Procedure of SSE: whenever the owner starts an upload, the BuildIndex process is run first to build the index for the keywords in the file. Using OPES, we add extra protection by applying 3DES. Once the index has been built, the data are encrypted together with the index and uploaded to the database (Fig 3). Having an exact model of the security guarantees of existing SSE works is very important for defining our relevance searchable symmetric encryption problem. We aim to provide "as-strong-as-possible" relevance searchable symmetric encryption. Essentially, this notion has been adopted by cryptographers in several recent models [13], [14] where efficiency is preferred over security. We include 3DES to double up the process. Fig 5 shows a single DES process; repeating the process three times gives 3DES.

Algorithm for SSE:
procedure SSE(D, R, id(F))
    build index (I1);
    while |D| != 1 do
        {D, R} ← BinarySearch(K, D, R, m);
    end while
    coin ← TapeGen(K, (D, R, 1||m, id(F)));
    c ← R;
    C ← 3DES(R)
    return c;
end procedure

Fig 5 3DES Encryption Structure

B. Relevance Symmetric Encryption (RSE)

RSE operates on the user side. The user provides keywords as the search string; any number of keywords can be provided in one request. Once the keywords are received from the user, SSE is invoked and the index is generated as described in section 3.2. The user receives the set of results for the requested keywords one by one, in order of interest. To download data, the user sends another request to the server. The user is then presented with a security question; once the answer has been verified, the server provides the key for the selected file. The user can use this file-specific key to download the file from the server. In the searchable index we combine the relevance score and the key id for each file. An attacker who gets in therefore sees only encrypted data on the server, so both the keywords and file privacy are protected. Owing to the strength of the document encryption model, the documents in the database are entirely safe. An adversary may still try to obtain partial information from the ciphertext, for example because a ciphertext might indicate a very large corresponding plaintext, but the fully randomized and highly compressed one-to-many mapping technique makes it hard for the attacker to find even partial data. Likewise, recall that we use different encryption methods for the user and the owner, which additionally reduces the information leakage from the public service. Therefore, keyword security is provided in our model.

RSE algorithm:
procedure RSE(D, R, id(F))
    build index (I1);
    while |D| != 1 do
        {D, R} ← BinarySearch(K, D, R, m);
    end while
    coin ← TapeGen(K, (D, R, 1||m, id(F)));
    c ← R;
    C ← 3DES(R)
    return c;
end procedure
procedure BinarySearch(K, D, R, m);
    M ← |D|; N ← |R|;
    d ← Data retrieved;
    y ← keyword;
    Build index = {DYM};
    coin ← TapeGen(K, (D, R, 0||y));
    if Req ≤ x then
        D ← Data for request
        K ← {r+1, …, y} KEY;
    end if
    return {D, R};
end procedure

A. ORDER PRESERVING MAPPING

Order-preserving symmetric encryption (OPE) is a deterministic encryption scheme whose encryption function preserves the numerical order of the plaintexts. OPE has a long history in the form of one-part codes, which are lists of plaintexts and the corresponding ciphertexts, both arranged in numerical order, so that only one copy of the list is needed for encryption and relevance ranking.
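The order-preserving property introduced above can be illustrated with a toy mapping. This sketch is an assumption-laden illustration only: it is not the OPES procedure of this paper, it is not a secure OPE construction, and it does not include 3DES. It builds a one-part code (a sorted table) by summing keyed pseudo-random positive gaps, so that larger plaintexts always map to larger ciphertexts. The names `build_opm_table`, `opm_encrypt`, and `opm_decrypt` are hypothetical.

```python
import bisect
import hashlib
import hmac


def _gap(key, i, max_gap):
    """Keyed pseudo-random positive gap for position i (toy construction)."""
    digest = hmac.new(key, str(i).encode("ascii"), hashlib.sha256).digest()
    return 1 + int.from_bytes(digest[:4], "big") % max_gap


def build_opm_table(key, domain_size, max_gap=1000):
    """One-part code: table[m] is the ciphertext of plaintext m.

    Ciphertexts are running sums of positive gaps, so the table is
    strictly increasing and the mapping preserves order.
    """
    table, total = [], 0
    for i in range(domain_size):
        total += _gap(key, i, max_gap)
        table.append(total)
    return table


def opm_encrypt(table, m):
    """Look up the ciphertext for plaintext m (0 <= m < len(table))."""
    return table[m]


def opm_decrypt(table, c):
    """Binary-search the sorted table for ciphertext c."""
    i = bisect.bisect_left(table, c)
    if i == len(table) or table[i] != c:
        raise ValueError("not a valid ciphertext")
    return i


table = build_opm_table(b"demo-key", 64)
scores = [32, 26, 21]  # relevance scores as in the Fig 4 example
ciphertexts = [opm_encrypt(table, s) for s in scores]
```

Because the table is sorted, a server holding only the ciphertexts can still rank files by score, which is the reason order preservation is useful for relevance search.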
OPE has also received a more formal treatment. The reason for the initial interest in such models is that they allow efficient range queries on encrypted data. That is, a remote untrusted database server is able to index the data it receives, in encrypted form, in a data structure that permits efficient range queries. Here, efficiency is measured with respect to the size of the database. In fact, OPE not only allows efficient range queries, but also allows indexing and query processing to be done exactly and as efficiently as for unencrypted data. OPE has also been suggested for use in in-network aggregation on encrypted data in sensor networks [13]. OPE helps deter attackers from extracting the data on the server and limits data leakage, and we additionally apply 3DES to strengthen security.

P/C= o (m_1), 0 < _ < 1, and n _ m3.

Algorithm for OPES

OPES PROCESS Owner Side:
Process Owner(K, RE, I)
    I ← |K|; N ← |RE|
    K ← min(K) − 1; RE ← min(RE) − 1
    j ← RE + [N/2]
    if |K| = 1 then
        if F[K, RE, I] is undefined then
            cc ← GetCoins(1L, K, RE, 1||I)
            CC ← 3DES(CCA);
            F[K, RE, I] ← RE
        return F[K, RE, I]
    if I[K, RE, j] is undefined then
        cc ← GetCoins(1L, K, RE, 0||j)
        I[K, RE, j] ← HGK(K, RE, j; cc)
    x ← I[K, RE, j]
    if I ≤ x then
        K ← {K+1, …, x}
        RE ← {j+1, …, y}
    else
        K ← {x+1, …, K+I}
        RE ← {j+1, …, RE+N}
    return (K, RE, I)

OPES PROCESS User Side:
Process User(K, RE, c)
    I ← |K|; N ← |RE|
    K ← min(K) − 1; RE ← min(RE) − 1
    j ← RE + [N/2]
    if |K| = 1 then
        I ← min(K)
        if F[K, RE, I] is undefined then
            cc ← GetCoins(1L, K, RE, 1||I)
            F[K, RE, I] ← RE
        if F[K, RE, I] = c then return I
        else return
    if I[K, RE, j] is undefined then
        cc ← GetCoins(1L, K, RE, 0||j)
        I[K, RE, j] ← HGK(K, RE, j; cc)
    x ← I[K, RE, j]
    if c ≤ j then
        K ← {K+1, …, x}
        RE ← {RE+1, …, y}
    else
        K ← {x+1, …, K+I}
        RE ← {j+1, …, RE+N}
    return (K, RE, c)

V. DYNAMIC RELEVANCE KEYWORD UPDATING (DRKU)

Relevance can be achieved only when the data in the index are updated automatically. At present, updating is done manually, or no update takes place when a new file is added or an existing file is edited. DRKU works on the keywords retrieved from the collection of files. Its main function is to update the scores of the files in the database: whenever a new file is added or an existing file is deleted, the index must reflect it, and when content inside a file is edited, its relevance score in the index must change as well. Ideally, a score change in such a situation should affect only the score of the particular file and should not disturb the scores of other files; only the relevance rank in the index may change. Order-preserving mapping by itself does not take deleted files into account and does not adjust the scores, which does not help in providing relevance to the user. Score dynamics is also the reason why we do not use the simple method for RSE, in which the owner updates the inverted index on the server before outsourcing the data. In that method the whole process has to be repeated whenever a score changes, making it impractical when the document collection is updated frequently. In fact, supporting DRKU saves a considerable amount of computation during inverted index updates, which can be regarded as a significant benefit of our proposed one-to-many mapping.

VI. SYSTEM PERFORMANCES

A. Performance of SKE

We compared several extraction processes that use a key. Whenever a file is updated, its index entry needs to be added and its keywords must be extracted. First, the keyword and description were provided manually. Keywords were extracted both by a human and automatically. Fig 6 shows the results calculated for some sample keywords.

[Fig 6 chart: series File retrieved, time taken, relevance %; vertical axis 0 to 300]
Fig 6 Relevance of data

B. Performance of RSE

SSE uses almost the same extraction-with-key process, so we look at the process of the proposed system, RSE.
We measured the time taken for processing and for file retrieval. We compared the existing and proposed encryption processes and search techniques, and cross-checked the execution time and performance of the proposed system.

[Fig 7 chart: keywords Java, Cloud, Server, Karizma; series time taken, relevance; vertical axis 0 to 7]
Fig 7 Existing Extraction Word and relevance

[Fig 8 chart: series Retrieved %, Relevance; vertical axis 0 to 10]
Fig 8 Proposed Extraction word and relevance

VII. CONCLUSION

In this paper we have provided security and user satisfaction through RSE and DRKU. At present the problems are numerous, such as DNS attacks and social hacking methods. Social hacking can be overcome only when users take responsibility for protecting their private data. Other hacking methods can be countered by a smart methodology like RSE. In our proposed approach we encrypt all the data, including the index, before saving it in the database. If an attacker hacks the server, he obtains only encrypted data, not the real content. RSE helps to extract data for the user in a relevant manner and also provides the user-side decryption method. With the help of the OPM technique we implement 3DES to provide more security. We achieved the goal of making the public cloud equal to the private cloud: anyone can access the public cloud, which can provide security equal to that of the private cloud. Through DRKU we achieve relevance in searching. Multiple-keyword search has also been proposed. As a future enhancement the encryption method can be made more secure, and the Hummingbird algorithm could be used to improve the extraction of data.

ACKNOWLEDGEMENT

This work has been done under the guidance of Manjulababu and has been supported by kamalesh.G.

REFERENCES

[1] Cloud Security and Privacy: An Enterprise Perspective on Risks and Compliance (Theory in Practice) [Kindle Edition]; http://en.wikipedia.org/wiki/Service-oriented_architecture
[2] Peter Mell and Tim Grance, "The NIST Definition of Cloud Computing," Version 15, 10-7-09.
[3] "Electrical Engineering and Computer Sciences," University of California at Berkeley, Technical Report No. UCB/EECS-2009-28, http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.html
[4] Cloud Security Alliance, "Security Guidance for Critical Areas of Focus in Cloud Computing," V2.1, December 2009.
[5] Alice Hutchings, Russell G Smith and Lachlan James, "Cloud computing for small business: Criminal and security threats and prevention measures."
[6] https://opencuro.com/pdf/security_news/heartland.pdf
[7] Ian H. Witten, Alistair Moffat, and Timothy C. Bell, "Managing Gigabytes: Compressing and Indexing Documents and Images," Second Edition.
[8] "Privacy Preserving Keyword Search over Encrypted Cloud Data," in Advances in Computing and Communications, First International Conference, ACC 2011, Kochi, India, July 22-24, 2011, Proceedings, Part I.
[9] Dan Boneh (Stanford University), Giovanni Di Crescenzo (Telcordia), Rafail Ostrovsky (UCLA), and Giuseppe Persiano (Universita di Salerno), "Public Key Encryption with Keyword Search."
[10] Dawn Xiaodong Song, David Wagner, and Adrian Perrig, "Practical Techniques for Searches on Encrypted Data," University of California, Berkeley.
[11] Amit Singhal, "Modern Information Retrieval: A Brief Overview," Google, Inc.