International Journal of Engineering Trends and Technology (IJETT) – Volume 9 Number 7 - Mar 2014
Comparative Study of Enhanced Secure and
Effective Relevance Keyword Search Over
Public Cloud Data
Sathya.R#1, Manjula babu#2
#1 PG Student, Dept. of MCA, Vel Tech Dr.RR & Dr.SR Technical University, Chennai, India
#2 Asst. Professor, Department, Vel Tech Dr.RR & Dr.SR Technical University, Chennai, India
Abstract-Cloud computing is among the most recent achievements in computing, but not everyone can benefit from it because of cost, security, and infrastructure issues. Private clouds were introduced to address these problems, but their cost is too high for an individual user. The proposed system aims to provide cloud service to all individuals through the public cloud in a secure and efficient manner. Many encryption techniques already provide secure service in the public cloud, but they allow only Boolean search, which cannot give effective results because the cloud holds a large amount of data and serves many users. In this paper we introduce user-side and owner-side encryption, together with a separate extraction process that extracts keywords from the data. To strengthen security further, we use order-preserving mapping to prevent data leakage. Dynamic Relevance Keyword Updating (DRKU) updates the keywords of a file whenever the file is updated or edited, and the index for a search string is generated dynamically, which helps to retrieve results faster; the index is encrypted to preserve privacy. Some simple methods and a user-side encryption method are implemented to further strengthen security.
Keywords – Cloud Computing, RSE, OPES, DRKU,
Public cloud.
I. INTRODUCTION
Cloud computing is a collection of services, applications, information, and infrastructure composed of pools of compute, network, information, and storage resources. These components can be rapidly orchestrated, provisioned, implemented and decommissioned, and scaled up or down, providing an on-demand, utility-like model of allocation and consumption.
ISSN: 2231-5381
The cloud enhances collaboration, agility, scaling, and availability, and offers the potential for cost reduction through optimized and efficient computing. The cloud model promotes availability and is composed of five essential characteristics, three service models, and four deployment models [2]. Cloud computing is an evolving paradigm. Its definitions, use cases, underlying technologies, issues, risks, and benefits will be refined in a spirited debate by the public and private sectors. These definitions, attributes, and characteristics will evolve and change over time. A consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed automatically, without requiring human interaction with each service provider. In this paper we deal with the public cloud, in which the infrastructure is made available to the general public or a large industry group and is maintained by an organization selling cloud services. Providing public cloud service raises many problems: technical problems in implementing cloud computing, technical problems that arise once it has been implemented, and policy and business problems that hinder its implementation [3].
As the cloud becomes more popular and efficient, users are ready to store their personal data in it, and government sectors are also setting up cloud environments. However, a private cloud environment is too costly, and a public cloud environment is not confidential enough to hold sensitive data. The many risks of cloud computing are discussed in detail in [4]. The major problem in the cloud is that data may be eavesdropped or leaked [5], or even hacked [6]. Unauthorized access to a cloud computing service may occur when an unauthorized person obtains a user ID or user name together with the correct password. This can happen through a variety of technical and non-technical methods. Social engineering may be directed at the cloud service provider, for example by claiming that urgent access is essential but that the password is not working and needs to be changed. To overcome all of these problems we need encryption. When data is offered as a service, the results must be reliable and must be the most relevant ones within the large body of outsourced data. The data owner outsources a large amount of data and must specify, through user authentication, who can access it. Since we cannot hand all of the data to every user, we let users retrieve the data that matches their interests, which is done by keyword search.
So far this keyword search technique has been used only on plain text; in this paper we create keyword search for encrypted files. A library is a huge collection of books, periodicals, and journals, of which, at any given time, a particular user is interested in only a small fraction. As a very rough estimate, we might assume that one printed page holds about 600 words, or, counting spaces and punctuation, about 1,800 characters; an 800-page book then contains about three million characters. A bookcase may have two sides and may be five shelves high and 6 m long, so it stores perhaps two and a half billion characters, or, in computer terms, 5.5 GB. Even a small library has 20 or more bookcases; a large one may have thousands. In general, we may expect even a comparatively small library to hold many millions of characters [7].
Many search techniques exist for encrypted files, but all of them are Boolean searches, for example [8], [9], [10], and they cannot rank the data by relevance to the user's keyword. Applying such a method directly to a cloud service creates two main problems. First, for every search request, users who have no prior knowledge of the encrypted data must go through every retrieved file in order to select the ones that best match their interest, which demands a possibly large amount of post-processing. Second, sending back every file that merely contains the keyword, one by one, creates heavy and unnecessary network traffic, which is entirely undesirable in a pay-per-use service.
To return the most relevant data we use a scoring technique [11]. To assign a dynamic score to a file for a request, DRKU measures the similarity between the requested keyword vector and the index keyword vector. The similarity between two keyword vectors is not inherent in DRKU. Typically, the angle between two vectors is used as a measure of the difference between them, and the cosine of the angle is used as the dynamic similarity score. Alternatively, the inner product of two vectors is often used as a similarity measure. If all vectors are forced to unit length, the cosine of the angle between two vectors is the same as their inner product. If \vec{M} is the document (manuscript) vector and \vec{R} is the request vector, then the similarity of document M to request R (the score of M for R) is
\mathrm{Sim}(\vec{M}, \vec{R}) = \sum_{k} x_{k,M} \cdot x_{k,R},
where x_{k,M} and x_{k,R} are the values of component k in \vec{M} and \vec{R}.
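The relation between the inner product and the cosine of the angle can be checked with a small sketch. The code below is only an illustration: the function names and the dictionary representation of a keyword vector (keyword mapped to weight) are assumptions made for this example and are not taken from the proposed system.

```python
import math

def inner_product(m, r):
    """Sum of x_{k,M} * x_{k,R} over the keywords k present in both vectors."""
    return sum(m[k] * r[k] for k in set(m) & set(r))

def normalize(v):
    """Scale a keyword vector to unit length (zero vectors are returned unchanged)."""
    length = math.sqrt(sum(x * x for x in v.values()))
    if length == 0:
        return dict(v)
    return {k: x / length for k, x in v.items()}

def cosine_similarity(m, r):
    """Cosine of the angle: the inner product of the unit-length vectors."""
    return inner_product(normalize(m), normalize(r))
```

For example, a document vector {"java": 3, "cloud": 4} and a request vector {"java": 1} have a cosine similarity of 3/5, while vectors that share no keyword score 0.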
Several steps are needed to provide a secure and relevance-aware data outsourcing service in the public cloud. First, the user must register with the provider. Authentication is the main key to protecting the data, and it also lets the owner grant access to the data to specific users. All files have to be maintained and stored securely. We mainly use three methods to provide relevance and security. First, the keywords in the document are extracted with the extraction algorithm Symmetric Key Extraction (SKE), shown in Fig 1. When the owner outsources data, it is encrypted before it is sent to the server using Searchable Symmetric Encryption (SSE), in which both the data and the index are encrypted. Indexing is executed dynamically: when the user sends a request to the server, an index is created for each and every keyword.
Fig 1 Keyword Extraction Process
http://www.ijettjournal.org
Multiple keywords can be sent at a time, and the results are retrieved one by one. When a user wants to use a document, the user is put through a security check and must provide a key to download the data. To create a dynamic score for keywords, the extracted keywords must be kept up to date. For this we use Dynamic Relevance Keyword Updating (DRKU), which updates the index table whenever a file is edited, updated, or newly added. To strengthen security further, we use an order-preserving mapping technique to prevent data leakage. Fig 2 shows the architecture of the process.
Fig 2 Architecture of System
II. SYSTEM DECLARATION
A. System Problems
Unauthorized access to cloud computing systems may occur when a user ID and password are obtained without authorization. This can happen through many technical and non-technical approaches. Social engineering may be directed at the service provider, for example by claiming that urgent access is needed but that the password is not working and needs to be changed. Passwords may also be stolen when users log in at public places and leave the session open, or cracked with software. A denial of service (DoS) attack on a cloud server may stop users from using their accounts. It works by flooding the targeted site with traffic so that the site becomes unavailable to genuine users. When a DoS attack is launched from many machines, it is referred to as a distributed denial of service attack. Some DoS attacks concentrate on a particular user rather than on the entire server: the attacker tries to change the user's password, or keeps logging in with a wrong password until the account is locked. 'Side channel attacks' may allow a user to access another user's account and data when the boundary of usage has been widened. In this kind of attack the user and the hacker are placed on the same server, so the hacker may try to hack any user's account at random. However, attacks on a specified user are still possible. It has been shown with one cloud computing service provider that this could be achieved 40 percent of the time by setting up new user instances while simultaneously triggering resource requests on the targeted user's virtual machines. Another way of attacking is the domain name system (DNS) attack, which works by changing the URLs or domain entries on the server. Various DNS attacks aim to obtain authentication credentials from Internet users, including cloud users. Once the DNS entry has been changed, it leads the user to the hacker's server, or lets the hacker easily enter the user's system. In domain name stealing, attackers steal the cloud server's domain name and make users register on the fake servers. The user is led to a duplicate URL that looks the same as the original server URL and enters all their authentication details there; the hacker steals these details and uses them to enter the targeted server as that user. Session hijacking consists of the attacker exploiting active computer sessions by obtaining the cookies that are used to authenticate users. This can be achieved by cross-site scripting, in which malicious code is injected into the website and subsequently executed by the browser. Many more problems of cloud computing could be described; to overcome them, the proposed system implements several techniques.
B. System Goals
To enable relevance searchable symmetric encryption for effective use of outsourced and protected data under the problems described above, our design should achieve the following security and performance guarantees. Relevance keyword search: to explore different mechanisms for designing effective relevance search schemes based on the existing searchable encryption framework. Security guarantee: to prevent the cloud server from learning the given keyword from the ciphertext, and to achieve stronger security than the existing system. Efficiency: the above goals should be achieved with minimum communication and computation overhead. The inverted index is the index that stores a score for all data on the server; it is maintained with the help of DRKU, so the keywords are updated automatically. In the score equation, KW denotes the keyword, F is the frequency of the keyword in the file, QF is the query frequency, ND is the total number of files in the database, DT is the number of files that match the keyword, L is the file length in bytes, and AL is the average file length.
\mathrm{Score} = \sum L \cdot \frac{ND - DT + 0.8}{DT + 0.8} \cdot \frac{(KW + 1)\,F}{KW\,(1.5 - b) + b\,(DT / AL) + F} \cdot \frac{(KW + 1)\,DT}{KW + QF}
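The score above has the same overall shape as the Okapi BM25 ranking function from the information-retrieval literature: an inverse-document-frequency factor multiplied by a length-normalized term-frequency factor. As a point of comparison only, the following minimal sketch computes standard BM25, which is not necessarily the exact variant intended by this paper. The constants k1 = 1.2 and b = 0.75 are conventional defaults, the "+ 1" inside the logarithm is a common variant that keeps the weight non-negative, and the function name and whitespace tokenization are assumptions made for this example.

```python
import math
from collections import Counter

def bm25_scores(query_terms, documents, k1=1.2, b=0.75):
    """Return one standard BM25 score per document for the given query terms."""
    if not documents:
        return []
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: the number of documents in which each term occurs.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            df = doc_freq.get(term, 0)
            if df == 0:
                continue  # the term occurs nowhere in the collection
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
            # Length normalization: long documents are penalized via b.
            norm = k1 * (1 - b + b * len(tokens) / avg_len)
            tf = term_freq[term]
            score += idf * tf * (k1 + 1) / (tf + norm)
        scores.append(score)
    return scores
```

With equal-length files, the file in which the keyword occurs more often receives the higher score, and a file that does not contain the keyword scores zero.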
The most important part of retrieving files is the relevance of the keyword to the document in which it appears. Computing this score in every model requires a large amount of processing. An alternative method that has been shown to be effective at increasing file relevance is query modification through relevance feedback. An updated relevance model uses an efficient calculation method in combination with a good query expansion method.
BuildIndex(K, C)
  Initialization:
    scan C and extract the distinct words W = (w1, w2, ..., wm) from C.
    For each wi ∈ W, build F(wi);
  Build posting list:
    for each wi ∈ W
      for 1 ≤ j ≤ |F(wi)|:
        calculate the score for file Fij according to the score equation, denoted as Sij;
        compute Ez(Sij), and store it with Fij's identifier ⟨id(Fij) || Ez(Sij)⟩ in the posting list I(wi);
  Secure the index I:
    for each I(wi) where 1 ≤ i ≤ m:
      encrypt all Ni entries with key fy(wi), where 1 ≤ j ≤ Ni;
      set the remaining entries, if any, to random values of the same size as the existing Ni entries of I(wi);
      replace wi with x(wi);
  Output I.
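The structure of BuildIndex can be sketched in a few lines. The code below is a simplified illustration under stated assumptions, not the scheme itself: HMAC-SHA256 stands in for the keyword pseudonym x(wi), a plain relative term frequency stands in for the score equation, and the encryption of scores (Ez) and the padding of posting lists are omitted. The names keyword_tag, build_index, and search are hypothetical.

```python
import hashlib
import hmac
from collections import Counter, defaultdict

def keyword_tag(key: bytes, word: str) -> str:
    """Keyed pseudonym for a keyword (assumption: HMAC-SHA256 in place of x(wi))."""
    return hmac.new(key, word.encode("utf-8"), hashlib.sha256).hexdigest()

def build_index(key: bytes, collection: dict) -> dict:
    """Map each keyword pseudonym to a posting list of (file id, score) pairs.

    collection maps a file identifier to the file's text.
    """
    postings = defaultdict(list)
    for file_id, text in collection.items():
        words = text.lower().split()
        if not words:
            continue
        for word, count in Counter(words).items():
            # Placeholder score: relative term frequency in the file.
            postings[word].append((file_id, count / len(words)))
    index = {}
    for word, entries in postings.items():
        entries.sort(key=lambda entry: entry[1], reverse=True)
        # The plaintext keyword never appears in the stored index.
        index[keyword_tag(key, word)] = entries
    return index

def search(key: bytes, index: dict, word: str) -> list:
    """Look up a keyword by recomputing its pseudonym with the same key."""
    return index.get(keyword_tag(key, word.lower()), [])
```

Only a party holding the key can compute the pseudonym for a keyword, so a lookup with a different key returns nothing. A real deployment would additionally encrypt the scores and pad the posting lists, as the last step of BuildIndex requires.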
Word             Wi
File ID          Fi1  Fi2  Fi3  ...  FiNi
Relevance Score  DW1++
Fig 3 Index Sample
Relevance function: when a user sends a request with a keyword, the index checks how many times each file has been downloaded for that keyword, and the most downloaded file is shown to the user as the first result. By maintaining a relevance keyword index we can return the most relevant file first and the remaining files in order. We need a list of queried keywords, and we need to match those keywords against the most relevant files in our database. Critical to the ease of relevance querying is the use of a similarity heuristic, a method that assigns a numeric score indicating how well a file and a keyword match.
Word             Java
File ID          x1     y2     x2     ...  Xn
Relevance Score  x1-32  y2-26  x2-21  ...  Xnn
Fig 4 Example Index
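Returning results in order of relevance amounts to sorting a keyword's posting list by score. The sketch below uses the values shown in Fig 4 for the keyword "Java"; the dictionary layout and the function name top_k are assumptions made for this illustration.

```python
def top_k(index, keyword, k):
    """Return the identifiers of the k highest-scoring files for a keyword."""
    postings = index.get(keyword, {})
    ranked = sorted(postings.items(), key=lambda item: item[1], reverse=True)
    return [file_id for file_id, _ in ranked[:k]]

# Posting list for "Java" as shown in Fig 4 (file ID mapped to relevance score).
example_index = {"Java": {"x1": 32, "y2": 26, "x2": 21}}
best_two = top_k(example_index, "Java", 2)
```

Here best_two holds the two files with the highest scores, x1 followed by y2, and a keyword that is not in the index yields an empty list.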
III. ENCRYPTION EVOLUTIONS
In the introduction we motivated relevance keyword search over encrypted data as a way to save space in cloud computing. In this section, we begin with a review of searchable symmetric encryption (SSE) systems and then give the definitions and framework for our proposed relevance symmetric encryption (RSE). By following the same approach as SSE, it becomes feasible to support a relevance search method over encrypted data.
A. Searchable symmetric encryption (SSE)
We present two adversarial models for SSE. The first, non-adaptive model considers only adversaries that make their search requests without taking into account the trapdoors and results of previous searches. The second, adaptive model considers adversaries that choose their requests as a function of previously obtained trapdoors and search results. Whenever keywords are received from the user, the keyword index is created dynamically for those keywords. All previous work on SSE, with the exception of oblivious RAMs, falls within the non-adaptive setting. The implication is that, contrary to the natural use of searchable encryption, these definitions guarantee security only for users who perform all their searches at once. We address this by introducing game-based and simulation-based definitions in the adaptive setting. We provide two models, which we verify under our new definitions. Our first model is secure only in the non-adaptive setting, but it is the most efficient SSE model to date. In fact, it performs searches in a single communication round, requires an amount of work from the server that is linear in the number of files that contain the keyword, requires constant memory on the client side, and requires storage on the server that is linear in the size of the file collection. Although other models also perform searches in one round, they can produce false positives, which is not the case in our model. Furthermore, those models require the server to perform an amount of work that is linear in the total number of files in storage. Our second model is secure against an adaptive adversary, but at the price of a higher communication overhead per request and more storage at the server [12]. Although our adaptive system is conceptually simple, we note that designing efficient and provably secure adaptive SSE schemes is not a trivial task. The main challenge lies in proving such constructions secure in the simulation paradigm, since the simulator requires the ability. Procedure of SSE: whenever the owner starts an upload, the BuildIndex process first builds the index for the keywords in the file. Using OPES we add extra protection by applying 3DES. Once the index has been built, the data is encrypted together with the index and uploaded to the database (Fig 3).
Having an exact model of the security guarantees of existing SSE work is very important for describing our relevance searchable symmetric encryption problem. We aim to provide "as-strong-as-possible" relevance searchable symmetric encryption. Essentially, this concept has been adopted by designers in several recent models [13], [14] where efficiency is preferred over security.
Algorithm for SSE:
procedure SSE(D, R, id(F))
    build index (I1);
    while |D| != 1 do
        {D, R} ← BinarySearch(K, D, R, m);
    end while
    coin ← TapeGen(K, (D, R, 1||m, id(F)));
    c ← R;
    C ← 3DES(R)
    return c;
end procedure
B. Relevance Symmetric Encryption (RSE)
RSE is the method that runs on the user side. The user provides keywords as the search string; any number of keywords may be given in a request. Once the keywords have been received from the user, they are passed to SSE, where the index is generated as described in the SSE subsection. The user receives the set of results for the requested keywords, ordered by the user's interest, one by one. To download a file, the user sends another request to the server. The user is then asked a security question; once the answer has been verified, the server provides the key for the selected file. The user can use that specific key to download that specific file from the server. In the searchable index we combine the relevance score with the key ID of the file, so a hacker who gets in sees only the encrypted data on the server; both the keyword and the file privacy are protected. Because of the strength of the document encryption model, the documents in the database are entirely safe. An adversary may still try to learn partial information from the ciphertext, for example because a large ciphertext value may indicate a correspondingly large plaintext value, but the fully randomized and highly compressed one-to-many mapping technique still makes it hard for the hacker to recover even partial data. Likewise, recall that we use different encryption methods for the user and the owner, which additionally reduces the information leakage from the public service. Therefore, keyword security is provided in our model. We include 3DES to add a second layer to the process. Fig 5 shows a single DES pass; repeating the pass three times gives 3DES.
Fig 5 3DES Encryption Structure
RSE algorithm:
procedure RSE(D, R, id(F))
    build index (I1);
    while |D| != 1 do
        {D, R} ← BinarySearch(K, D, R, m);
    end while
    coin ← TapeGen(K, (D, R, 1||m, id(F)));
    c ← R;
    C ← 3DES(R)
    return c;
end procedure
procedure BinarySearch(K, D, R, m)
    M ← |D|; N ← |R|;
    d ← data retrieved;
    y ← keyword;
    Build index = {DYM};
    coin ← TapeGen(K, (D, R, 0||y));
    if Req ≤ x then
        D ← data for request
        K ← {r+1, …, y} KEY;
    end if
    return {D, R};
end procedure
IV. ORDER PRESERVING MAPPING
Order-preserving symmetric encryption (OPE) is a deterministic encryption model whose encryption function preserves the numerical order of the plaintexts. OPE has a long history in the form of one-part codes, which are lists of plaintexts and the corresponding ciphertexts, both arranged in numerical order, so that only one copy of the list is needed for encryption and relevance ranking. The notion of OPE has since received a more formal treatment. The reason for the renewed interest in such models is that they allow efficient range requests on encrypted data. That is, a remote untrusted storage server is able to index the data it receives, in encrypted form, in a data structure that allows efficient range requests. In this way the database can be organized efficiently. In fact, OPE not only permits efficient range requests, but also allows indexing and request processing to be done exactly and as efficiently as for unencrypted data. OPE has also been suggested for use in in-network aggregation on encrypted data in sensor networks [13]. OPE helps to keep hackers from hacking the data on the server and prevents data leakage; in addition, we add 3DES to strengthen security.
P/C= o (m_1), 0 < _ < 1, and n _ m3.
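The idea of a one-to-many order-preserving mapping can be illustrated with a deliberately simple toy. In the sketch below every plaintext score owns a bucket of consecutive ciphertext values, and a keyed hash of the score and the file identifier selects one value inside the bucket, so equal scores in different files usually map to different ciphertexts while the order of scores is preserved. This is an assumption-laden illustration, not the OPES algorithm of this paper: the equal-sized buckets reveal the plaintext to anyone who knows the bucket size, so it offers no real secrecy. The names opm_encrypt and opm_decrypt, the use of HMAC-SHA256, and the bucket_size parameter are all hypothetical.

```python
import hashlib
import hmac

def opm_encrypt(key: bytes, score: int, file_id: str,
                domain_size: int, bucket_size: int = 1000) -> int:
    """Map a score in 1..domain_size into its bucket of the ciphertext range."""
    if not 1 <= score <= domain_size:
        raise ValueError("score outside the plaintext domain")
    # The keyed hash of (score, file id) picks a position inside the bucket.
    digest = hmac.new(key, f"{score}|{file_id}".encode("utf-8"),
                      hashlib.sha256).digest()
    offset = int.from_bytes(digest[:8], "big") % bucket_size
    return (score - 1) * bucket_size + offset + 1

def opm_decrypt(ciphertext: int, bucket_size: int = 1000) -> int:
    """Recover the score from the bucket that contains the ciphertext."""
    return (ciphertext - 1) // bucket_size + 1
```

Because every value in the bucket of score s is smaller than every value in the bucket of score s + 1, the server can rank ciphertexts directly without seeing the scores.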
Algorithm for OPES
OPES PROCESS Owner Side:
Process Owner(K, RE, I)
    I ← |K|; N ← |RE|
    K ← min(K) - 1; RE ← min(RE) - 1
    j ← RE + [N/2]
    if |K| = 1 then
        if F[K, RE, I] is undefined then
            cc ← GetCoins(1L, K, RE, 1||I)
            CC ← 3DES(CCA);
            F[K, RE, I] ← RE
        return F[K, RE, I]
    if I[K, RE, j] is undefined then
        cc ← GetCoins(1L, K, RE, 0||j)
        I[K, RE, j] ← HGK(K, RE, j; cc)
    x ← I[K, RE, j]
    if I ≤ x then
        K ← {K+1, …, x}
        RE ← {j+1, …, y}
    else
        K ← {x+1, …, K+I}
        RE ← {j+1, …, RE+N}
    return (K, RE, I)
OPES PROCESS User Side:
Process User(K, RE, c)
    I ← |K|; N ← |RE|
    K ← min(K) - 1; RE ← min(RE) - 1
    j ← RE + [N/2]
    if |K| = 1 then
        I ← min(K)
        if F[K, RE, I] is undefined then
            cc ← GetCoins(1L, K, RE, 1||I)
            F[K, RE, I] ← RE
        if F[K, RE, I] = c then
            return I
        else return
    if I[K, RE, j] is undefined then
        cc ← GetCoins(1L, K, RE, 0||j)
        I[K, RE, j] ← HGK(K, RE, j; cc)
    x ← I[K, RE, j]
    if c ≤ j then
        K ← {K+1, …, x}
        RE ← {RE+1, …, y}
    else
        K ← {x+1, …, K+I}
        RE ← {j+1, …, RE+N}
    return (K, RE, c)
V. DYNAMIC RELEVANCE KEYWORD UPDATING (DRKU)
Relevance can be achieved only when the data in the index is updated automatically. At present, updating is done manually, or no update takes place at all when a file is newly added or an existing file is edited. DRKU operates on the keywords that have been retrieved from the collection of files. Its main function is to update the scores of the files in the database: whenever a new file is added or an existing file is deleted, the index must reflect the change, and whenever content inside a file is edited, the relevance score in the index must change as well.
Ideally, a score change should affect only the score of the particular file concerned and should not disturb the scores of other files, and we show that our proposed one-to-many mapping has this property. The rank order of relevance in the index may, however, change. Order-preserving mapping by itself never takes deleted files into account and never adjusts the scores, which does not help in providing relevance to the user. Supporting score dynamics is also the reason why we do not use the simple method for RSE, in which the owner updates the inverted index on the server before outsourcing the data. With that method the whole process has to be repeated whenever a score changes, which makes it unusable when the document collection is updated repeatedly. In fact, supporting DRKU saves a considerable amount of computation during inverted index updates, and can be regarded as significant.
VI. SYSTEM PERFORMANCES
A. Performance of SKE
We compared several extraction processes using a set of keywords. Whenever a file is updated, its index entries must be added and its keywords must be extracted again. At first, the keywords and descriptions were provided manually. We then compared keyword extraction by a human with automatic extraction. We gave some keywords and calculated the results, which are plotted in Fig 6.
Fig 6 Relevance of data (series: files retrieved, time taken, relevance %)
B. Performance of RSE
SSE uses almost the same extraction and key process, so we look only at the proposed system, RSE. We measured the processing time and the number of files retrieved. We compared the existing and proposed encryption processes and search techniques, cross-checking the execution time and performance of the proposed system.
Fig 7 Existing Extraction Word and relevance (axis labels: Java, Cloud, Server, Karizma; series: time taken, relevance)
Fig 8 Proposed Extraction word and relevance (series: retrieved %, relevance)
VII. CONCLUSION
In this paper we have provided security and user satisfaction through RSE and DRKU. At present the problems are numerous, such as DNS attacks and social hacking methods. Social hacking can be overcome only when users take responsibility for protecting their private data. The remaining hacking methods can be overcome by smart methodologies such as RSE. In the proposed system we encrypt all the data, including the index, before saving it in the database. If a hacker breaks into the server, he obtains only encrypted data rather than the real data. RSE helps to extract data for the user in order of relevance and also provides the user-side decryption method. With the help of the OPM technique we apply 3DES to provide more security. We have achieved the goal of making the public cloud equal to the private cloud: anyone can access the public cloud, which can provide security equal to that of the private cloud. Through DRKU we achieve relevance in searching. Multiple-keyword search has also been proposed.
As a future enhancement, the encryption method can be made more secure. The Hummingbird algorithm could be used to improve the extraction of data.
ACKNOWLEDGEMENT
This work has been done under the guidance of Manjulababu and has been supported by kamalesh.G.
REFERENCES
[1] "Cloud Security and Privacy: An Enterprise Perspective on Risks and Compliance (Theory in Practice)" [Kindle Edition]; http://en.wikipedia.org/wiki/Service-oriented_architecture
[2] Peter Mell and Tim Grance, "The NIST Definition of Cloud Computing," Version 15, 10-7-09.
[3] "Electrical Engineering and Computer Sciences," University of California at Berkeley, Technical Report No. UCB/EECS-2009-28, http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.html
[4] Cloud Security Alliance, "Security Guidance for Critical Areas of Focus in Cloud Computing," V2.1, December 2009.
[5] Alice Hutchings, Russell G Smith, and Lachlan James, "Cloud computing for small business: Criminal and security threats and prevention measures."
[6] https://opencuro.com/pdf/security_news/heartland.pdf
[7] Ian H. Witten, Alistair Moffat, and Timothy C. Bell, "Managing Gigabytes: Compressing and Indexing Documents and Images," Second Edition.
[8] "Privacy Preserving Keyword Search over Encrypted Cloud Data," in Advances in Computing and Communications, First International Conference, ACC 2011, Kochi, India, July 22-24, 2011, Proceedings, Part I.
[9] Dan Boneh (Stanford University), Giovanni Di Crescenzo (Telcordia), Rafail Ostrovsky (UCLA), and Giuseppe Persiano (Universita di Salerno), "Public Key Encryption with Keyword Search."
[10] Dawn Xiaodong Song, David Wagner, and Adrian Perrig, "Practical Techniques for Searches on Encrypted Data," University of California, Berkeley.
[11] Amit Singhal, "Modern Information Retrieval: A Brief Overview," Google, Inc.