A Novel Rank Oriented and SOA Architecture Based Mobile Search Engine

International Journal of Engineering Trends and Technology (IJETT) – Volume 15 Number 7 – Sep 2014
Narasimhala Veera Lakshmi (1), N. Balayesu (2)
(1) Final M.Tech student, (2) Assistant Professor
(1,2) Computer Science and Engineering, Raghu Engineering College, Visakhapatnam.
Abstract: Optimizing search engines for mobile phones remains
an important research issue in knowledge and data engineering.
Although various approaches are available, performance and
time complexity are the primary concerns when implementing
such search engines. We propose an efficient personalized
mobile search engine that combines mining, ranking, and
cache implementation over web services.
I. INTRODUCTION
Accuracy is an important measure of how well a web search
retrieves information. In earlier work, accuracy was assessed
by human judges, who rated the relevance of query-document
pairs. Today many people interact with search engines every
day, which yields large amounts of data for analyzing and
maintaining information systems and for information system
management.
Recent work has developed automatic feedback for
information retrieval systems. After users receive results,
the provider offers them a feedback session, and the feedback
is used to improve the search engine or search website. Web
search feedback also has problems: some users behave
maliciously, and some may not be real (authenticated) users.
Both affect the data that can be collected. The basic problem
in information retrieval is the ranking of search results.
The common approach ranks a page by its similarity to the
query and by the overall quality of the data presented in the
page.
Moreover, results obtained by testing in laboratory
settings do not necessarily carry over to real-world usage.
It is therefore crucial to derive feedback automatically from
large amounts of user interactions. Recent research has
evaluated ways of interpreting click-through data. By running
eye-tracking studies and comparing the predictions of their
schemes with explicit ratings, the authors showed that
click-throughs can be interpreted accurately in a laboratory
setting. How far these traditional methods extend to
real-world web search is not clear. Meanwhile, current work
on using click-through data to improve web search ranking
focuses on only one aspect of user interactions with web
search engines.
ISSN: 2231-5381
The usual method is to re-rank the results returned by
a web search engine according to click-through and other user
interactions observed for the same query in earlier search
sessions. Every result is assigned a score according to its
expected relevance or user satisfaction based on previous
interactions, which yields a preference ordering derived from
user interactions alone.
While there has been significant work on merging
multiple rankings, we adopt a simple and robust approach:
we ignore the original rankers' scores and simply merge the
rank orders. The main reason for ignoring the original scores
is that the feature spaces and learning algorithms are
different, so the scores are not directly comparable, and
re-normalization tends to remove the benefit of incorporating
classifier scores.
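The idea of merging rank orders while ignoring the rankers' scores can be illustrated with a short Python sketch. The paper does not state the exact merge rule, so the rule used here (sum each result's positions across the lists, with unlisted results placed just past the end of a list) is an assumption, and the function name and URLs are hypothetical.

```python
from collections import defaultdict


def merge_rank_orders(*rankings):
    """Merge ranked lists using positions only; the rankers' scores are never consulted.

    Assumed rule: add up each item's 1-based position in every list, and treat
    an item missing from a list as if it sat one place past that list's end.
    """
    # Collect every item once, in first-seen order.
    all_items = list(dict.fromkeys(item for ranking in rankings for item in ranking))

    totals = defaultdict(int)
    for ranking in rankings:
        position = {item: index + 1 for index, item in enumerate(ranking)}
        missing_position = len(ranking) + 1
        for item in all_items:
            totals[item] += position.get(item, missing_position)

    # sorted() is stable, so ties keep their first-seen order.
    return sorted(all_items, key=lambda item: totals[item])


# Hypothetical example: the engine's original order and an order derived from user feedback.
original_order = ["url_a", "url_b", "url_c", "url_d"]
feedback_order = ["url_c", "url_a", "url_d", "url_b"]
merged = merge_rank_orders(original_order, feedback_order)
```

Because only positions are used, the two rankers may produce scores on entirely different scales without affecting the merged order.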
Search results have become increasingly complex,
and that trend is likely to continue. The traditional model of
ten blue links and rank checking is no longer accurate, because
users receive results that are increasingly customized
to them. As results become more personalized, it is
valuable to better understand how personalized search
results are presented to users.
II. RELATED WORK
Recent web search engines rank results according to
a large number of features, including content-based features
and query-independent quality features. In many cases an
automatic process is used to develop a ranking function that
integrates these features.
A general approach is therefore to use feedback
features directly as features of the ranking method. During
training, the ranker is trained as before with the additional
features. At run time, these features are computed for each
URL returned for a query. Because more than fifty percent of
queries are distinct, the web search engine needs a robust
ranking method. User behavior features fall into three
groups: query-text features, browsing features, and
click-through features.
For query-text features, users decide which results
to examine in more detail by viewing the output of the query;
in many cases, viewing the document itself is not required.
These features include the number of tokens in the query and
the fraction of words shared with the query, among others.
http://www.ijettjournal.org
For browsing features, simple aspects of a user's
interactions with web pages are extracted and measured. These
features characterize how users interact with pages while
browsing and make it possible to model variation in browsing
behavior within a query. Click-through features cover a
special case of interaction with the search engine, namely
clicks on results. We combine all of the above features,
which are required to study click-through-based schemes.
Mobile search is a branch of information retrieval
services centered on the convergence of mobile platforms with
mobile phones and other mobile devices, and it is used to look
up information about a subject. Web search engine capability
in mobile form allows users to find mobile content on websites
that are available to mobile devices on mobile networks. As
this happens, mobile content shows a media shift toward mobile
multimedia. Simply put, mobile search is not just a spatial
shift of PC web search to mobile equipment; it is branching,
tree-like, into specialized segments of mobile broadband and
mobile content, both of which are evolving at a fast pace.
A personalized mobile search engine (PMSE)
captures users' preferences in the form of concepts by
mining their click-through data. Because location information
is important in mobile search, PMSE classifies these concepts
into content concepts and location concepts. In addition,
users' locations (positioned by GPS) are used to supplement
the location concepts in PMSE. The user preferences are
organized in an ontology-based, multi-facet user profile,
which is used to adapt a personalized ranking function for
rank adaptation of future search results.
In PMSE, location information supplements the
location concepts: the user can submit a location by typing
it into a dedicated field, or it can be obtained from GPS.
To characterize the diversity of the concepts associated with
a query and their relevance to the user's need, four
entropies are introduced to balance the weights between the
content and location facets [1].
III. PROPOSED WORK
We propose an efficient mobile search engine
mechanism that aims to return results that satisfy user
requirements and to retrieve them in an optimal manner. It
builds on mining and on traditional search results that use
spatial information, such as geocode-based results for the
user's input query. Search results are ordered by document
weight (file relevance score), which is computed from two
parameters: TF (term frequency) and IDF (inverse document
frequency). A cache holds previously retrieved results for
frequently accessed queries, which improves performance and
reduces complexity at both end points.
It has been shown that a significant number of input
queries are geographic or location-based and focus on location
information, and many location-based search implementations
have been proposed to handle such spatial queries. Our
proposed system supports language interoperability (i.e., a
client written in any standard language can communicate with
a service written in another) through SOA (service-oriented
architecture). It also minimizes duplication of business
logic by maintaining that logic at a centralized web
application server instead of at multiple locations. Search
engine performance can be improved by a simple cache
implementation and by returning rank-oriented results based
on the file relevance of the documents.
Web services are one technology for building an
SOA (service-oriented architecture) with a three-tier
architecture. They minimize duplication of operations by
maintaining the business logic at one location (a centralized
server). The main goal of service-oriented architecture is
language interoperability (i.e., programs written in
different standard languages can communicate with each
other), and it also reduces the chance of damage from the
client end.
[Figure: Database and Business Logic at the server, exposed through WSDL with the SOAP protocol to three clients: UI (VB.Net), UI (Java), UI (Android).]
Fig. 1: Web service architecture
A data cache is a mechanism that improves
performance at the user end and reduces overhead at the
server end. It stores frequently accessed results for future
retrieval. When a user submits a query that has been
requested before, the server need not process it again and no
round trip to the server is required: the results retrieved
from the web server for the earlier request were stored in the
data cache before being forwarded to the user, so from the
next search onward the results for that query are retrieved
from cache storage instead of the web server. This reduces
execution time (the round trip of the request and response)
and avoids additional overhead on the server for processing
the same input keyword.
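A data cache of this kind can be sketched in a few lines of Python. The paper does not specify a capacity limit, an eviction policy, or how queries are compared, so the least-recently-used eviction, the case and whitespace normalization of the query, and the class name `QueryCache` are all assumptions made for illustration.

```python
from collections import OrderedDict


class QueryCache:
    """Illustrative in-memory cache mapping a search query to its ranked results."""

    def __init__(self, capacity=128):
        self.capacity = capacity          # assumed limit; not specified in the paper
        self._entries = OrderedDict()     # oldest entry first, most recent last
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(query):
        # Assumption: queries that differ only in case or spacing are the same query.
        return " ".join(query.lower().split())

    def get(self, query):
        """Return cached results, or None when the server must handle the query."""
        key = self._key(query)
        if key in self._entries:
            self._entries.move_to_end(key)    # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        return None

    def put(self, query, results):
        """Store results; evict the least recently used entry when over capacity."""
        key = self._key(query)
        self._entries[key] = results
        self._entries.move_to_end(key)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)


cache = QueryCache(capacity=2)
cache.put("hotels in Visakhapatnam", ["doc3", "doc1"])
repeat = cache.get("  Hotels in  visakhapatnam ")   # served from the cache
unseen = cache.get("restaurants nearby")            # None: needs a server round trip
```

The `hits` and `misses` counters are included only so that the saving in server round trips can be observed; they are not part of the described mechanism.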
Initially, every document is preprocessed to eliminate
inconsistent or unnecessary keywords, and its document weight
(file relevance score) is computed from term frequency (TF)
and inverse document frequency (IDF). TF is the number of
occurrences of the search keyword in an individual file. IDF
is derived from the number of files that contain the keyword;
in the conventional formulation it is the logarithm of the
total number of files divided by that count, so a keyword
that appears in fewer files receives a higher weight. The
file relevance score is then computed in terms of TF and IDF.
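The scoring described above can be sketched in Python as follows. The stop-word list, the tokenization rule, the file names, and the `+ 1` added to the logarithm (so that a keyword present in every file still receives a nonzero score) are assumptions for illustration; the paper gives only the product of TF and IDF.

```python
import math
import re
from collections import Counter

# Illustrative stop words; the paper does not list the keywords it removes.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "is", "to", "for"}


def preprocess(text):
    """Lowercase the text, split it into alphanumeric tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]


def file_scores(keyword, documents):
    """Rank files for a single keyword by TF * IDF, highest score first.

    documents maps a file name to its text. Only files containing the
    keyword are returned.
    """
    term = keyword.lower()
    counts = {name: Counter(preprocess(text)) for name, text in documents.items()}
    containing = sum(1 for c in counts.values() if c[term] > 0)
    if containing == 0:
        return []
    # Assumed IDF variant: log(N / df) + 1.
    idf = math.log(len(documents) / containing) + 1.0
    scored = [(name, c[term] * idf) for name, c in counts.items() if c[term] > 0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical three-file collection.
docs = {
    "a.txt": "mobile search engine for mobile users",
    "b.txt": "search engine ranking and search cache",
    "c.txt": "cache for web services",
}
ranked = file_scores("search", docs)   # b.txt (TF 2) ranks above a.txt (TF 1)
```

The sketch handles a single keyword; a multi-word query would need a rule for combining per-keyword scores, which the paper does not describe.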
[Figure: Mobile User, Cache, Web service, and Database connected by numbered flows: 1. New Account, 2. Forward Request, 3 and 4 (labels not recoverable), 5. Results, 6. Store in cache, 7. Search Results, 8. Send Request, 9. Result.]
Fig. 2: Proposed architecture
The sequential steps for returning rank-oriented results from the web service are as follows:
1. The user submits a search query from the mobile device.
2. The request is forwarded to the data cache, which checks
for previously retrieved results. If results for the same
query are available, they are returned from the data cache;
otherwise the request is forwarded to the business logic.
3. The service (business logic) retrieves rank-oriented
results from the data sources based on term frequency and
inverse document frequency:
FileScore = TF * IDF
where FileScore is the document weight or file relevance
score, TF is the term frequency (number of occurrences of a
keyword in a single document), and IDF is the inverse
document frequency (derived from the number of documents
that contain the keyword).
4. The search results are stored in the data cache for future
retrieval of the same query.
5. The ranked search results are forwarded to the mobile
device, and they are served from the cache when a user makes
the same request again.
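The control flow of these steps can be sketched in Python. This is a minimal sketch that uses a plain dictionary as the data cache and takes the ranking step as a function argument; the names `search` and `rank_documents`, the query normalization, and the stand-in ranking function are hypothetical.

```python
def search(query, cache, rank_documents):
    """Answer a query from the cache when possible, else from the ranking service.

    Returns (results, source), where source is "cache" or "service".
    """
    key = " ".join(query.lower().split())   # assumed normalization of the query
    if key in cache:                         # step 2: check the data cache first
        return cache[key], "cache"
    results = rank_documents(key)            # step 3: business logic ranks the files
    cache[key] = results                     # step 4: store for future requests
    return results, "service"                # step 5: forward results to the client


# Stand-in for the TF * IDF ranking service; it records each call it receives.
service_calls = []


def fake_rank(normalized_query):
    service_calls.append(normalized_query)
    return ["doc2", "doc1"]


data_cache = {}
first = search("Mobile Search", data_cache, fake_rank)      # processed by the service
second = search("mobile  search", data_cache, fake_rank)    # same query, served from cache
```

After both calls, `service_calls` holds a single entry, which reflects the saved round trip for the repeated query.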
For the experimental implementation, we tested the
SOA (service-oriented architecture) with C#.Net for the
service and Android for the user interface and the generation
of SOAP objects. The set of operations (business logic)
resides in C#.Net at the server end. The UI (user interface)
is an Android application. The input search keyword is passed
through SOAP (Simple Object Access Protocol) objects, with
WSDL (Web Services Description Language) providing an
abstract description of the communication, and the
calculations and retrieval of file-relevance-based results
are performed at the web service.
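To make the SOAP exchange concrete, the following Python sketch builds a SOAP 1.1 request envelope that carries a search keyword. The envelope namespace is the standard SOAP 1.1 one; the service namespace, the operation name `Search`, and the parameter name `keyword` are hypothetical, since the paper does not publish the WSDL of its service.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"   # standard SOAP 1.1 namespace
SERVICE_NS = "http://example.org/mobilesearch"           # hypothetical service namespace


def build_search_request(keyword):
    """Return a SOAP 1.1 envelope (as a string) wrapping one search keyword."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # Hypothetical operation and parameter names; a real client reads them from the WSDL.
    operation = ET.SubElement(body, f"{{{SERVICE_NS}}}Search")
    parameter = ET.SubElement(operation, f"{{{SERVICE_NS}}}keyword")
    parameter.text = keyword                 # special characters are escaped on output
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_search_request("hotels & cafes")
```

Because the request is plain XML, a client written in any language can produce it, which is the interoperability property the architecture relies on.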
IV. CONCLUSION
We conclude our current research work with efficient
file-relevance-based, rank-oriented results in a mobile search
engine through service-oriented architecture. The cache
implementation enhances performance by minimizing the
round-trip time (execution time) of a search query when the
same query has been processed before. Our experimental
results show better efficiency than previous mechanisms.
REFERENCES
[1] E. Agichtein, E. Brill, and S. Dumais, “Improving Web
Search Ranking by Incorporating User Behavior
Information,” Proc. 29th Ann. Int’l ACM SIGIR Conf.
Research and Development in Information Retrieval
(SIGIR), 2006.
[2] E. Agichtein, E. Brill, S. Dumais, and R. Ragno,
“Learning User Interaction Models for Predicting Web
Search Result Preferences,” Proc. Ann. Int’l ACM SIGIR
Conf. Research and Development in Information Retrieval
(SIGIR), 2006.
[3] Y.-Y. Chen, T. Suel, and A. Markowetz, “Efficient
Query Processing in Geographic Web Search Engines,”
Proc. Int’l ACM SIGIR Conf. Research and Development
in Information Retrieval (SIGIR), 2006.
[4] K.W. Church, W. Gale, P. Hanks, and D. Hindle,
“Using Statistics in Lexical Analysis,” Lexical Acquisition:
Exploiting On-Line Resources to Build a Lexicon,
Psychology Press, 1991.
[5] Q. Gan, J. Attenberg, A. Markowetz, and T. Suel,
“Analysis of Geographic Queries in a Search Engine Log,”
Proc. First Int’l Workshop Location and the Web (LocWeb),
2008.
[6] T. Joachims, “Optimizing Search Engines Using
Clickthrough Data,” Proc. ACM SIGKDD Int’l Conf.
Knowledge Discovery and Data Mining, 2002.
[7] K.W.-T. Leung, D.L. Lee, and W.-C. Lee, “Personalized
Web Search with Location Preferences,” Proc. IEEE Int’l
Conf. Data Eng. (ICDE), 2010.
[8] K.W.-T. Leung, W. Ng, and D.L. Lee, “Personalized
Concept-Based Clustering of Search Engine Queries,” IEEE
Trans. Knowledge and Data Eng., vol. 20, no. 11,
pp. 1505-1518, Nov. 2008.
[9] H. Li, Z. Li, W.-C. Lee, and D.L. Lee, “A Probabilistic
Topic-Based Ranking Framework for Location-Sensitive
Domain Information Retrieval,” Proc. Int’l ACM SIGIR Conf.
Research and Development in Information Retrieval
(SIGIR), 2009.
[10] B. Liu, W.S. Lee, P.S. Yu, and X. Li, “Partially
Supervised Classification of Text Documents,” Proc. Int’l
Conf. Machine Learning (ICML), 2002.