
International Journal of Engineering Trends and Technology (IJETT) – Volume 5 Number 4 - Nov 2013

Optimal Search Results Over Cloud with a Novel Ranking Approach

Movva Kalpana¹, JayanthiRao Madina²

¹ Final MTech student, Department of Software Engineering, SISTAM college, Srikakulam, Andhra Pradesh

² Assistant professor, Dept of CSE, SISTAM college, Srikakulam, Andhra Pradesh

Abstract: In this paper we propose an efficient search implementation over data stored in cloud services. Search over cloud services has recently grown in importance, because users need to retrieve interesting and relevant results for their queries. Outsourced data is normally stored in encrypted form. Our approach searches for keywords in the encrypted documents and ranks the results by file relevance score over service-oriented applications.

I. INTRODUCTION

Suppose user Alice wishes to read her email on a number of devices: laptop, desktop, pager, etc. Alice's mail gateway is supposed to route email to the appropriate device based on the keywords in the email. For example, when Bob sends email with the keyword "urgent", the mail is routed to Alice's pager. When Bob sends email with the keyword "lunch", the mail is routed to Alice's desktop for reading later. One expects each email to contain a small number of keywords. For example, all words on the subject line as well as the sender's email address could be used as keywords [3].

Verifying the integrity and authenticity of information is a prime necessity in computer systems and networks. In particular, two parties communicating over an insecure channel require a method by which information sent by one party can be validated as authentic (or unmodified) by the other. Most commonly such a mechanism is based on a secret key shared between the parties and takes the form of a Message Authentication Code (MAC). (Other terms used include "Integrity Check Value" or "cryptographic checksum".) In this case, when party A transmits a message to party B, it appends to the message a value called the authentication tag, computed by the MAC algorithm as a function of the transmitted information and the shared secret key. At reception, B recomputes the authentication tag on the received message using the same mechanism (and key) and checks that the value he obtains equals the tag attached to the received message [2]. Only if the values match is the information received considered as not altered on the way from A to B. The goal is to prevent forgery, namely, the computation, by the adversary, of a message (not sent by the legitimate parties) and its corresponding valid authentication tag. A precise definition of MACs and their security is given in [2].

II. RELATED WORK

Although various search schemes over encrypted data have been developed, they may have weaknesses in computational complexity or performance. Researchers have developed various symmetric and asymmetric approaches over many years. Searchable encryption allows a data owner to outsource data in encrypted form while retaining the ability to search selectively over the encrypted data. Generally, searchable encryption can be achieved in its full functionality using oblivious RAM [11]. Although this hides everything during the search from a malicious server (including the access pattern), oblivious RAM usually incurs a logarithmic number of interactions between the user and the server for each search request.

Thus, to achieve more efficient solutions, almost all existing works in the searchable encryption literature resort to a weakened security guarantee, i.e., revealing the access pattern and the search pattern but nothing else. Here the access pattern refers to the outcome of the search, i.e., which files have been retrieved. The search pattern includes the equality pattern between any two search requests (whether two searches were performed for the same keyword) and any information derived from it. We refer readers to [12] for a thorough discussion of searchable symmetric encryption (SSE) definitions.
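The search pattern leakage described above can be illustrated with a small Python sketch. It assumes, purely for illustration, that search requests are deterministic tokens derived with HMAC-SHA256; the helper name `trapdoor` and the demo key are hypothetical and do not come from any cited scheme.

```python
import hashlib
import hmac

def trapdoor(key: bytes, keyword: str) -> str:
    """Deterministic search token for a keyword (hypothetical helper)."""
    return hmac.new(key, keyword.encode("utf-8"), hashlib.sha256).hexdigest()

key = b"demo-secret-key"  # illustrative key only, not a recommendation

# The server never sees the keywords, only the token sent with each request.
requests_seen_by_server = [trapdoor(key, w) for w in ["cloud", "rank", "cloud"]]

# Search pattern: equal tokens reveal that two requests used the same keyword.
same_keyword = requests_seen_by_server[0] == requests_seen_by_server[2]
different_keyword = requests_seen_by_server[0] != requests_seen_by_server[1]
print(same_keyword, different_keyword)
```

Because the token is deterministic, the server learns when a keyword repeats even though it cannot read the keyword itself, which is the equality pattern the weakened guarantee accepts.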

A correct intuition about the security guarantee of the existing SSE literature is very important for defining our ranked searchable symmetric encryption problem. As we will show later, following exactly the same security guarantee as existing SSE schemes would make ranked keyword search very inefficient. This motivates us to weaken the security guarantee of existing SSE appropriately (leaking the relative relevance order but not the relevance score) and realize an "as-strong-as-possible" ranked searchable symmetric encryption.

ISSN: 2231-5381 http://www.ijettjournal.org

III. PROPOSED SYSTEM

Searching over outsourced data is still an interesting research issue in cloud computing and service-oriented applications, because users need to retrieve interesting and relevant results for their queries. Outsourced data is usually encrypted before storage to preserve privacy. Traditional approaches use Boolean search, which is not optimal and is not suitable for large datasets. Our approach searches the encrypted outsourced data by maintaining a search table that relates each search keyword to the documents containing it. The table also maintains the score of the keyword with respect to each document, derived from the term frequency and inverse document frequency, and results are displayed to the user in ranked order.

Many clustering approaches have evolved to find optimally ranked results that match user interest, but they have drawbacks such as convergence to local optima and random selection of the centroids, which compromise the optimality of the search results.

In this approach the data owner outsources the data to the server. The data owner has a collection of n data files $C = (F_1, F_2, \ldots, F_n)$ that he wants to outsource to the server in encrypted form while keeping the capability to search through them for effective data utilization. To do so, before outsourcing, the data owner first builds a secure searchable index I from a set of m distinct keywords $W = (w_1, w_2, \ldots, w_m)$ extracted from the file collection C, and stores both the index I and the encrypted file collection C on the server. Search results are then organized according to their ranking.

The index table contains the unique keywords from the dataset along with the corresponding file IDs. Before the keywords are placed in the index table, they are encrypted with a symmetric key using the AES algorithm for security.

A) Algorithm for index table generation

1. Read the document F.
2. Segment the document term-wise and encrypt each term with the key.
3. Calculate the term frequency (TF), the inverse document frequency (IDF), and the publishing time (P_T).
4. Generate the index table (I_table) and upload the files to the server.
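The index generation steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' C# implementation: keyword encryption is replaced by an HMAC-SHA256 token as a stand-in for AES, the IDF uses one common variant, the publishing time is omitted, and the names `encrypt_keyword` and `build_index` are hypothetical.

```python
import hashlib
import hmac
import math
import re
from collections import Counter

def encrypt_keyword(key: bytes, word: str) -> str:
    # Stand-in for the paper's AES keyword encryption: a keyed, deterministic token.
    return hmac.new(key, word.encode("utf-8"), hashlib.sha256).hexdigest()

def build_index(key: bytes, files: dict[str, str]) -> dict[str, dict]:
    """Return {encrypted keyword: {"tf": {file id: count}, "idf": value}}."""
    index: dict[str, dict] = {}
    for file_id, text in files.items():
        terms = re.findall(r"[a-z0-9]+", text.lower())  # step 2: segment term-wise
        for word, count in Counter(terms).items():      # step 3: term frequency
            entry = index.setdefault(encrypt_keyword(key, word), {"tf": {}})
            entry["tf"][file_id] = count
    for entry in index.values():
        # Step 3 (assumed IDF variant): ln(1 + N / number of files with the term).
        entry["idf"] = math.log(1 + len(files) / len(entry["tf"]))
    return index

files = {"F1": "cloud search over cloud data", "F2": "ranked search results"}
key = b"demo-secret-key"  # illustrative key only
index = build_index(key, files)
cloud = index[encrypt_keyword(key, "cloud")]
print(cloud["tf"], round(cloud["idf"], 3))
```

Only the keyed tokens appear as index keys, so the plain keywords are not stored in the table that is uploaded.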

B) Rijndael algorithm

AES is a standardized form of the Rijndael algorithm. Our paper uses this cryptographic algorithm for secure data transmission. It is a keyed algorithm and is regarded as more efficient and secure than many traditional approaches. The key is generated from the multi-key exchange group key protocol. The brief structure of the algorithm is shown below; it works mainly through substitution and affine transformation techniques.

KeyExpansion: round keys are derived from the cipher key using the key schedule.

AddRoundKey: each byte of the state is combined with the round key using bitwise XOR.

Rounds:

SubBytes: a non-linear substitution step where each byte is replaced with another according to a lookup table.

ShiftRows: a transposition step where each row of the state is shifted cyclically a certain number of steps.

MixColumns: a mixing operation which operates on the columns of the state, combining the four bytes in each column.
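The round operations listed above can be sketched in Python for illustration. This is an educational sketch only: it omits KeyExpansion and the round loop, represents the state as four rows of four bytes (an assumed layout), and is not a substitute for a vetted AES library. All function names are hypothetical.

```python
def xtime(b: int) -> int:
    """Multiply a byte by 2 in GF(2^8), reducing by the AES polynomial 0x11B."""
    b <<= 1
    return b ^ 0x11B if b & 0x100 else b

def gmul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result

def rotl8(b: int, n: int) -> int:
    """Rotate a byte left by n bits."""
    return ((b << n) | (b >> (8 - n))) & 0xFF

def sbox(b: int) -> int:
    """SubBytes value: field inverse followed by the affine transformation."""
    inv = next((x for x in range(1, 256) if gmul(b, x) == 1), 0)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63

SBOX = [sbox(b) for b in range(256)]  # the lookup table used by SubBytes

def add_round_key(state, round_key):
    """XOR every state byte with the matching round key byte."""
    return [[s ^ k for s, k in zip(srow, krow)] for srow, krow in zip(state, round_key)]

def sub_bytes(state):
    """Replace every byte through the lookup table."""
    return [[SBOX[b] for b in row] for row in state]

def shift_rows(state):
    """Rotate row r cyclically to the left by r positions."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]

def mix_columns(state):
    """Combine the four bytes of each column with the fixed matrix (2, 3, 1, 1)."""
    out = [[0] * 4 for _ in range(4)]
    for c in range(4):
        col = [state[r][c] for r in range(4)]
        for r in range(4):
            out[r][c] = (gmul(col[r], 2) ^ gmul(col[(r + 1) % 4], 3)
                         ^ col[(r + 2) % 4] ^ col[(r + 3) % 4])
    return out

state = [[0xDB, 1, 1, 1], [0x13, 1, 1, 1], [0x53, 1, 1, 1], [0x45, 1, 1, 1]]
mixed = mix_columns(state)
print([hex(mixed[r][0]) for r in range(4)])
```

The lookup table is computed from the field inverse and the affine transformation rather than hard-coded, which matches the substitution and affine transformation structure mentioned in the text.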

Our proposed architecture works with web services (service-oriented applications), which provide language interoperability and security. The server receives the query from the user, authenticates the user with the user key, encrypts the query keyword using the AES algorithm, and compares it with the encrypted keywords in the index table. It then finds the number of occurrences of the keyword, which determines the term frequency and inverse document frequency used to find the file relevance score.

[Figure: Architecture. Labels: Data Owner (outsources with index table generation), Index Tables, Query, User, Rank Oriented Results.]
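The file relevance score formula given in this section can be sketched in Python as follows (the authors' own snippet is C#). The sketch mirrors the expression term by term; the reading of `numberoffiles` as the number of files that contain the keyword, and the helper names `relevance_score` and `rank_files`, are assumptions made here for illustration.

```python
import math

def relevance_score(terms_in_file: int, term_freq: int,
                    file_count: int, number_of_files: int) -> float:
    """(1 / termsinfile) * (1 + ln(termfreq)) * ln(1 + filecount / numberoffiles)."""
    return ((1.0 / terms_in_file)
            * (1 + math.log(term_freq))
            * math.log(1 + file_count / number_of_files))

def rank_files(postings: dict[str, tuple[int, int]],
               file_count: int) -> list[tuple[str, float]]:
    """postings maps file id -> (terms in file, term frequency) for one keyword."""
    number_of_files = len(postings)  # assumed: files that contain the keyword
    scored = [(fid, relevance_score(n_terms, tf, file_count, number_of_files))
              for fid, (n_terms, tf) in postings.items()]
    # Highest score first, so the most relevant file leads the result list.
    return sorted(scored, key=lambda item: item[1], reverse=True)

postings = {"F1": (100, 5), "F2": (50, 1), "F3": (200, 10)}
ranking = rank_files(postings, file_count=10)
print([fid for fid, _ in ranking])
```

Dividing by the number of terms in the file normalizes for file length, so a long file does not outrank a short one merely because it repeats the keyword more often.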

The search proceeds in the following steps:

Step 1: The user registers at the server by requesting the key.

Step 2: The user receives the key for authenticated and secure search.

Step 3: The user searches for relevant data with a plain keyword.

Step 4: The service processes the query and checks the authentication of the user.

Step 5: The service retrieves the relevant information from the index table for the respective keyword.

Step 6: The service calculates the file relevance scores.

Step 7: The service returns the files to the user based on the file relevance score.

In this paper we propose a novel file relevance score measurement based on the number of terms in the file, the number of occurrences of the term (term frequency), and the number of files.

The ranking function calculates the term frequency and inverse document frequency to find the score of the query keyword with respect to each file, and forwards the files to the user in ranked order of score:

relevance_Scores[j] = Convert.ToDecimal((1.0 / termsinfile[j]) * (1 + Math.Log(termfreqs[j])) * Math.Log(1 + ((double)filecount / numberoffiles)));

Files are retrieved based on our novel file relevance scores.

III. EXPERIMENTAL ANALYSIS

For the implementation we used C#.NET and ASP.NET. Our experimental analysis shows the index table generated at the data owner's end as follows.


The above index table consists of the keyword, the encrypted keyword, and the frequency of the keyword; it can be uploaded to the service provider. The user's search results are shown for the specified search keyword, with the relevant results listed along with their file relevance scores.

IV. CONCLUSION AND FUTURE WORK

Our approach provides an efficient and secure search mechanism over service-oriented applications. It encrypts the search keyword at the server side, calculates the file relevance scores of the files that contain the keyword, and retrieves the relevant files.

The system can be enhanced by improving the search mechanism with semantic comparison and similarity-based approaches.


REFERENCES

[1] B. Bloom, "Space/time trade-offs in hash coding with allowable errors," Communications of the ACM, vol. 13, no. 7, pp. 422–426, 1970.

[2] M. Bellare, R. Canetti, and H. Krawczyk, "Keying hash functions for message authentication," in Proceedings of CRYPTO'96, Lecture Notes in Computer Science 1109, pp. 1–15.

[3] D. Boneh, G. Di Crescenzo, R. Ostrovsky, and G. Persiano, "Public key encryption with keyword search," in Proceedings of Eurocrypt 2004, Lecture Notes in Computer Science 3027, pp. 506–522.

[4] K. Bennett, C. Grothoff, T. Horozov, and I. Patrascu, "Efficient sharing of encrypted data," in Proceedings of ACISP 2002, Lecture Notes in Computer Science 2384, pp. 107–120.

[5] O. Goldreich, Foundations of Cryptography: Basic Tools, Cambridge University Press, 2001.

[6] B. Krebs, "Payment Processor Breach May Be Largest Ever," online at http://voices.washingtonpost.com/securityfix/2009/01/payment_processor_breach_may_b.html, Jan. 2009.

[7] I. H. Witten, A. Moffat, and T. C. Bell, "Managing gigabytes: Compressing and indexing documents and images," Morgan Kaufmann Publishing, San Francisco, May 1999.

[8] D. Song, D. Wagner, and A. Perrig, "Practical techniques for searches on encrypted data," in Proc. of IEEE Symposium on Security and Privacy'00, 2000.

[9] E.-J. Goh, "Secure indexes," Cryptology ePrint Archive, Report 2003/216, 2003, http://eprint.iacr.org/.

[10] D. Boneh, G. Di Crescenzo, R. Ostrovsky, and G. Persiano, "Public key encryption with keyword search," in Proc. of EUROCRYPT'04, volume 3027 of LNCS. Springer, 2004.

[11] Enabling Secure and Efficient Ranked Keyword Search over Outsourced Cloud Data.

[12] R. Curtmola, J. A. Garay, S. Kamara, and R. Ostrovsky, "Searchable symmetric encryption: improved definitions and efficient constructions," in Proc. of ACM CCS'06, 2006.

BIOGRAPHIES

MOVVA KALPANA received her B.Tech in Information Technology from Sri Sarathi Institute of Engineering & Technology, Nuzvid (JNTU Hyderabad) in 2006. She is currently an M.Tech candidate in the Department of Software Engineering at SISTAM college (JNTU Kakinada). Her research interests include network cryptography, information security, cloud computing, and distributed systems.

JayanthiRao Madina is working as HOD at Sarada Institute of Science, Technology And Management, Srikakulam, Andhra Pradesh. He received his M.Tech (CSE) from Aditya Institute of Technology And Management, Tekkali, Andhra Pradesh. His research areas include Image Processing, Computer Networks, Data Mining, and Distributed Systems.
