International Conference on Global Trends in Engineering, Technology and Management (ICGTETM-2016)
Ontology Driven Knowledge Base Information Retrieval
Ashutosh V. Girase#1, Girish Kumar Patnaik*2, Sandip S. Patil+3
#1PG Student, *2Professor and Head, +3Associate Professor
(Department of Computer Engineering, SSBT's College of Engineering & Technology, Bambhori, Jalgaon [M.S.], INDIA)
Abstract— Decision-making is the task of every top
management team in an organization, which needs relevant and meaningful information to support its decisions. Retrieving meaningful information is a challenge for effective decision-making. For lack of domain knowledge, meaningful information remains hidden in the database. Decisions based on irrelevant or meaningless information can cause irreparable damage and loss of reputation. Retrieving relevant information requires background knowledge about the domain, and background knowledge in the form of an ontology is an important source of information. This paper presents an approach to meaningful information retrieval that uses a domain ontology as domain knowledge to reveal the meaningful information in the database and support decision-making.
Keywords- Ontology, Decision-making, Domain knowledge, Meaningful information, Background knowledge, Information retrieval, Business intelligence.
I. INTRODUCTION
Information retrieval (IR) is the activity of obtaining information resources relevant to an information need from a collection of information resources. The meaning of the term information retrieval can be very broad. Just getting a credit card out of your wallet so that you can type in the card number is a form of information retrieval. Knowledge base information retrieval (KBIR) is the process of retrieving relevant and meaningful information from the resources according to the user's need. Domain knowledge plays an important role in retrieving knowledge base information, and representing that knowledge as an ontology is an efficient way to retrieve meaningful information.
An ontology is an explicit specification of a conceptualization. Ontology plays a major role in knowledge management. An ontology describes the information about a particular domain in the form of concepts and relations, and allows information to be stored in a format readable by both humans and machines [1]. Ontology is considered the backbone of the semantic web, where it is used to solve the problem of semantic heterogeneity. Ontology is also used to explore the semantic relationships between concepts and to represent background knowledge about the domain in various web-related information retrieval techniques. Background knowledge plays an important role in retrieving the relevant and meaningful information that a decision maker needs in order to take a decision. Decision-making is carried out in every organization to solve problems by using business intelligence techniques.
ISSN: 2231-5381
Business intelligence (BI) is the set of techniques and tools for transforming raw data into meaningful information. The term business intelligence covers the tools and systems that play a key role in the strategic planning process of a corporation. These systems allow a company to gather, store, access and analyse corporate data to aid decision-making. Decision-making is a crucial task in every organization, and making decisions requires relevant and meaningful information. Retrieving meaningful information from a database requires knowledge about the domain. In practice, a person cannot have knowledge about every domain, so retrieving meaningful information from a large database is a challenging job.
Effective decisions require meaningful information. In the proposed approach, ontology is used as background knowledge of the domain so that the information retrieval system can retrieve relevant and meaningful information from the database.
The rest of the paper is organized as follows: Section II gives an overview of the related work, Section III presents the proposed approach and Section IV concludes the paper.
II. RELATED WORK
Ontology is a popular area of research, used mainly in the field of artificial intelligence. Because it lacks semantics, the traditional keyword-based technique in data mining is limited in judging relevance and in understanding the user's need. Ontology offers a way to overcome these challenges. The use of ontology as a domain knowledge repository has been found promising in various data mining tasks such as information retrieval, information extraction, classification, clustering, recommender systems and link prediction.
Kaushal Giri, in [2], describes the role of ontology in the semantic web. The increasing volume of data available on the web makes information retrieval a tedious and difficult task. Researchers are now exploring the possibility of creating a semantic web. The vision of the semantic web introduces the next generation of the web by establishing a layer of machine-understandable data. Ontology is used to support the information exchange process, particularly in the semantic web. The main advantage of ontology is that it provides data in a format readable by both humans and machines.
http://www.ijettjournal.org
Mohammad Mustafa Taye, in [3], gives a brief overview of ontology and the semantic web. The basic concepts, structure and main applications of ontology and the semantic web are presented, and many relevant terms are explained in order to provide a basic understanding of ontologies. An overview of how the semantic web works is also given. The semantic web is developed on the basis of ontology, which is considered its backbone. The semantic web represents information more meaningfully for humans and computers, i.e. in machine-readable format, and enables annotating, discovering, publishing, advertising and composing services to be automated.
Dou et al., in [4], present various ontology-based approaches in semantic data mining. They explain how ontologies help in bridging semantic gaps and in providing prior knowledge and constraints. The role of ontologies in mining tasks such as information extraction, clustering, classification, recommendation and link prediction is described. They also discuss in detail why ontology has the power to help semantic data mining and how the formal semantics in ontologies can be incorporated into the data mining process.
Mishra and Jain, in [5], present a study of various approaches and tools for ontology. Various ontology-based approaches are discussed in detail, the tools used for constructing ontologies are described, and a comparative study of how the tools work is given at the end.
Tao et al., in [6], present a personalized ontology model for web information gathering. The existing traditional approach was unable to retrieve information according to the user's need. The proposed ontology-based model represents user background knowledge for personalized web information gathering. It constructs personalized ontologies by extracting world knowledge and discovering user background knowledge from the user's local instance repositories. The model is evaluated by comparing it against benchmark models in web information gathering. The experimental evaluation showed that the ontology-based model is superior to the other models.
Wang et al., in [7], present an ontology-based approach to association rule mining. The existing traditional approach cannot solve the problems of useless rules being mined and of rules being excessively concrete. To address these problems, ontology-based association rule mining is combined with the traditional Apriori algorithm. Experimental results showed that the ontology-based approach improved the efficiency of the Apriori algorithm.
Wang and Chen, in [8], present an ontology-based approach to association rule mining. The Apriori algorithm is the best-known association rule mining algorithm; its objective is to find all co-occurrence relationships between data items. The performance of the Apriori algorithm degrades as the size of the data grows. To overcome this problem, ontology is used to represent the domain knowledge, which reveals the relationships between concepts. With the domain knowledge, the search space and counting time are reduced, so knowledge discovery can be improved effectively and meaningful hierarchical rules can be found.
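As a rough illustration of why a concept hierarchy can shrink the search space, the sketch below lifts each item to its parent concept before counting co-occurring pairs. It is not the algorithm of [8]; the PARENT mapping, the transactions and the function names are hypothetical and chosen only for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical concept hierarchy: item -> parent concept.
PARENT = {
    "MacBook": "Laptop",
    "Chromebook": "Laptop",
    "Iphone": "Phone",
    "Galaxy": "Phone",
}


def generalize(transaction, parent):
    """Replace each item by its parent concept (items without a parent are kept)."""
    return frozenset(parent.get(item, item) for item in transaction)


def frequent_pairs(transactions, min_support):
    """Count co-occurring pairs and keep those seen at least min_support times."""
    counts = Counter()
    for transaction in transactions:
        for pair in combinations(sorted(transaction), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Made-up transactions, for illustration only.
transactions = [
    {"MacBook", "Iphone"},
    {"Chromebook", "Galaxy"},
    {"MacBook", "Galaxy"},
    {"Chromebook", "Iphone"},
]

item_level = frequent_pairs(transactions, min_support=2)
concept_level = frequent_pairs(
    [generalize(t, PARENT) for t in transactions], min_support=2
)
print(item_level)     # {}
print(concept_level)  # {('Laptop', 'Phone'): 4}
```

With these made-up transactions no item-level pair reaches the support threshold, while the single concept-level pair (Laptop, Phone) does. Counting over two concepts instead of four items is the kind of reduction in search space, and the kind of hierarchical rule, that the paragraph above refers to.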
Prabowo et al., in [9], present an ontology-based approach for enhancing the automatic classification of web pages. Various challenges and issues in existing ontology-based approaches are discussed. As the amount of web data increases daily, it is impossible to classify all of it manually without automated aid. Hence, ontology is used as a domain knowledge repository to help users retrieve the information relevant to their need. Experimental evaluation showed that the use of ontology improves accuracy compared with the existing technique.
Sundaramoorthy et al., in [10], present an ontology-based approach for classifying user history. The user's browsing history is used to meet the user's need by classifying the user into a particular category. The existing approach performs poorly because it lacks semantic knowledge about the user query. Hence, ontology is used to understand the user query semantically. Experimental results showed that personalization using such ontology and semantics produces effective results.
Fang et al., in [11], present ontology-based automatic classification and ranking for web documents. The ontology-based approach is used to solve the problems of training datasets and of semantic complexity between words in traditional machine learning algorithms. Issues in previous work on ontology-based classification, such as ontology construction and the ranking of classified documents, are also discussed. The experimental results showed that the ontology-based classification algorithm achieves higher precision and recall than traditional approaches.
Nadana and Shriram, in [12], present an ontology-based clustering algorithm for information retrieval. For lack of semantic knowledge, the traditional K-means algorithm fails to find words that are syntactically different but semantically the same. Ontology is used with the K-means algorithm to integrate background knowledge and to find pages containing words that are syntactically different but semantically similar. Experimental evaluation showed that the ontology-based approach outperforms the traditional K-means algorithm.
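The idea of treating syntactically different but semantically similar words alike can be sketched as a preprocessing step that maps terms to ontology concepts before pages are compared. The sketch below shows only that step, not K-means itself and not the method of [12]; the CONCEPT_OF mapping, the two page texts and the function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical ontology fragment: surface term -> shared concept.
CONCEPT_OF = {"car": "vehicle", "automobile": "vehicle", "motor": "engine"}


def vectorize(text, concept_of=None):
    """Bag-of-words vector; terms are mapped to concepts when a mapping is given."""
    tokens = text.lower().split()
    if concept_of is not None:
        tokens = [concept_of.get(token, token) for token in tokens]
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


page_a = "car engine"
page_b = "automobile motor"

keyword_similarity = cosine(vectorize(page_a), vectorize(page_b))
concept_similarity = cosine(
    vectorize(page_a, CONCEPT_OF), vectorize(page_b, CONCEPT_OF)
)
print(f"{keyword_similarity:.2f} {concept_similarity:.2f}")  # 0.00 1.00
```

On plain keywords the two pages share no term, so a distance-based clustering algorithm would keep them apart; after concept mapping their vectors coincide, and the same algorithm would place them in one cluster.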
Fernandez et al., in [13], present an ontology-based approach to semantically enhanced information retrieval. Traditional information retrieval systems are keyword-based and therefore have limited capability to understand the user's need semantically. To address this problem, searching by meaning, i.e. semantic search, is introduced. The proposed comprehensive semantic search model extends the classic IR model to address the challenges of the massive and heterogeneous web environment, and integrates the benefits of both keyword-based and semantic search. An innovative rank fusion technique is also used to minimize the undesired effect of knowledge sparseness. Experimental evaluations showed that the ontology-based approach improves the efficiency of the traditional keyword-based technique by reducing the search space, knowledge sparseness and semantic complexity of the data.
Yang and Zhu, in [14], present an ontology-based information extraction system for E-commerce websites. Traditional information extraction systems are based on dictionaries, rule-based extraction technology and the hidden Markov model. The existing approach fails to extract information for lack of semantic knowledge. Ontology technology is used to build the wrapper, which then extracts the information from the web site. Experimental results and analysis showed that information extraction technology based on ontology is not yet mature. In particular, building the ontology still involves a lot of manual work, and its development remains to be studied further.
Revoredo et al., in [15], present a probabilistic ontology-based approach to semantic link prediction in a network. Because of the semantic complexity faced by traditional machine learning algorithms, there is uncertainty in link prediction. Hence, a probabilistic ontology-based approach is used to provide information about the domain to help in link prediction. In such schemes, numerical graph-based features and ontology-based features are computed, and both sets of features are given as input to a machine learning algorithm that performs the prediction. Experimental results showed that the ontology-based model outperforms the existing prediction technique.
Caragea et al., in [16], present an ontology-based approach to predicting potential friendship links in the LiveJournal social network. Existing prediction approaches cannot capture the semantic similarity of the data, so the performance of the machine learning algorithm degrades. To overcome this problem, ontology is used as a training dataset to help the machine learning algorithm. The experimental evaluation showed that the ontology-based approach improves the performance of the machine learning classifier at the task of predicting links in the social network.
Augusto et al., in [17], present an ontology-based recommender system. The traditional recommender approach is based on keyword matching, which fails when there is no identical keyword even though there is a semantic relationship between the words. Hence, an ontology-based recommender system is proposed. The ontology-based approach and its technical issues are discussed in detail.
Kadima and Malek, in [18], present an ontology-based approach to personalizing a recommender system in a social network. They discuss the technical issues raised by integrating an ontology-based semantic user profile into a hybrid recommender system, along with possible solutions.
Martin et al., in [19], present a framework for business intelligence applications using ontology-based classification. Every business needs knowledge about its competitors in order to survive. One such information repository is the web, but retrieving the specific information required for business purposes is a challenging job. Hence, ontology is used to capture specific information by using web semantics for decision-making. A framework for business intelligence based on ontological classification is developed for retrieving the specific information from the web. Here the ontology acts as a guide, i.e. a background information repository, to help the business intelligence process.
Cui et al., in [20], discuss the benefits of ontologies in real-time data access. They highlight the importance of a data integration layer in a business intelligence system and the benefits that using ontology as a data description formalism and query interface can bring to the system. The problem of ontology mapping and enrichment is discussed. They also focus on how the use of ontologies brings benefits in the areas of communication, interoperability and knowledge management.
Ontology-based approaches have given superior performance in data mining techniques. The use of ontology as a background knowledge repository has, to a large extent, solved the problems of semantic complexity, time complexity, lack of training datasets and relevance of results. Ontology as a domain knowledge repository is therefore promising for meeting the decision maker's need.
III. PROPOSED APPROACH
In the proposed approach, ontology is used as domain knowledge to improve the relevance of the information retrieved from the database, so that more meaningful information is obtained. The use of ontology as domain knowledge in various data mining techniques has been found effective for improving the precision of information retrieval.
The architecture of the proposed KBIR process is shown in Fig. 1. The proposed system consists of three main units: the semantic query engine, the transformation unit and the query processing unit.
Domain ontology O and database D are given initially. The domain ontology represents the set of concepts, relations and attributes of the domain. The mathematical representation of the ontology is as follows:
O = {C, R, H, P, A}
Where, C: Set of Concepts
R: Set of Relations
H: Hierarchies between Concepts
P: Set of Attributes or Properties
A: Set of Axioms or Rules
The database is a set of tuples. Each tuple contains a set of attributes, and the attributes represent information about the specific domain.
Fig. 1 Architecture of Proposed Prototype
The semantic query engine is used to process SPARQL queries. SPARQL is a data query language used to retrieve and manipulate data stored in an ontology. The decision maker's needs are represented directly in the form of a SPARQL query. After the query is entered, the semantic query engine generates the results with the help of the ontology. The generated results are semantically enhanced results.
The semantic query engine function takes q, i.e. the SPARQL query, and O, i.e. the ontology, as input and gives S, i.e. the semantically enhanced information, as output. The SPARQL query is entered by the user to retrieve information according to his need. Query q is fired on ontology O by using the semantic query function.
The transformation function converts the semantically enhanced results into SQL queries. The semantically enhanced results generated by the semantic query engine are automatically transformed into SQL or OLAP queries by the transformation function, which takes the output of the semantic query engine function as input and gives an SQL / OLAP query as output.
The query processing unit is used to process the SQL queries generated by the transformation function. These queries are applied on the database, which normally consists of historical or financial data about the organization. The query processing function takes the output of the transformation function and database D as input and gives M, i.e. the meaningful information, as output. In this way meaningful information is retrieved with the help of the ontology and is then used for decision-making.
Example: Consider the database of a particular company's sales details for the year 2015, given in Table 1. Find out the market of all the products of Apple Company.
Table 1: Sample Database
Product_Name      Purchase_Date   Quantity   City         Price (1 item)
MacBook           12/02/2015      3          Mumbai       56,000
Samsung Tablet    23/07/2015      7          Pune         20,000
MacOs             15/06/2015      4          Nagpur       9,000
Iphone            12/09/2015      5          Pune         46,000
Ipod              02/09/2015      7          Nasik        8,000
RedMiPowerbank    09/03/2015      5          Pune         7,000
I Os              22/04/2015      8          Aurangabad   7,000
Samsung drives    27/06/2015      4          Pune         15,000
Sony earphones    11/02/2015      3          Nagpur       1,400
I watch           18/08/2015      9          Mumbai       25,000
Samsung Mobile    17/01/2015      2          Pune         25,000
Fablet            12/06/2015      3          Mumbai       18,000
I band            12/03/2015      4          Nagpur       25,000
Apple TV          12/07/2015      5          Mumbai       96,000
HTC Smartphone    12/02/2015      6          Pune         35,000
Samsung Watch     12/08/2015      9          Nagpur       21,000
In the traditional approach, the SQL query is fired directly on the database. The query used to retrieve all the information regarding the sales of Apple products is as follows:
SQL Query:
SELECT * FROM Sales
WHERE Product_Name = 'MacBook' OR
Product_Name = 'MacOs' OR Product_Name = 'Iphone';
Results: The results obtained by using the traditional approach are shown in Table 2.
Table 2. Results using Traditional Approach
Product_Name      Purchase_Date   Quantity   City         Price (1 item)
MacBook           12/02/2015      3          Mumbai       56,000
MacOs             15/06/2015      4          Nagpur       9,000
Iphone            12/09/2015      5          Pune         46,000
MacOs             12/03/2015      3          Aurangabad   12,000
In the proposed approach, ontology is used to represent the background knowledge about the domain. The ontology stores the information about the products of Apple Company by using the sales relationship. The information stored by the ontology in the form of subject, predicate and object is as given below:
Apple Sales MacBook, Apple Sales MacOs etc.
The SPARQL query language is used to retrieve meaningful information from the ontology. PREFIX is used to give the path of the location of the ontology. A prefix variable such as foaf is used to store the value of the path. The SELECT clause is used to select the triples related to the concept. The WHERE clause is used to filter unnecessary data from the ontology.
SPARQL Query:
PREFIX foaf: <http://www.semanticweb.org/ontologies/2015/7/untitled-ontology-24#>
SELECT ?Product
WHERE {
foaf:Apple foaf:Sales_Relationship ?Product
}
Results: The results obtained by using the proposed approach are shown in Table 3.
Table 3. Results using Proposed Approach
Product_Name      Purchase_Date   Quantity   City         Price (1 item)
MacBook           12/02/2015      3          Mumbai       56,000
MacOs             15/06/2015      4          Nagpur       9,000
Iphone            12/09/2015      5          Pune         46,000
Ipod              02/09/2015      7          Nasik        8,000
IOs               22/04/2015      8          Aurangabad   7,000
Iwatch            18/08/2015      9          Mumbai       25,000
Iband             12/03/2015      4          Nagpur       25,000
AppleTV           12/07/2015      5          Mumbai       96,000
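The flow described in this section (semantic query, transformation, query processing) can be sketched end to end as follows. This is a minimal sketch under stated assumptions, not the authors' prototype: the ontology is reduced to a list of triples, semantic_query is a stand-in for a SPARQL engine, and the function names, the in-memory SQLite database and the use of the Table 1 product spellings in the triples are all choices made for this example.

```python
import sqlite3

# Rows copied from Table 1: (Product_Name, Purchase_Date, Quantity, City, Price).
SALES = [
    ("MacBook", "12/02/2015", 3, "Mumbai", 56000),
    ("Samsung Tablet", "23/07/2015", 7, "Pune", 20000),
    ("MacOs", "15/06/2015", 4, "Nagpur", 9000),
    ("Iphone", "12/09/2015", 5, "Pune", 46000),
    ("Ipod", "02/09/2015", 7, "Nasik", 8000),
    ("RedMiPowerbank", "09/03/2015", 5, "Pune", 7000),
    ("I Os", "22/04/2015", 8, "Aurangabad", 7000),
    ("Samsung drives", "27/06/2015", 4, "Pune", 15000),
    ("Sony earphones", "11/02/2015", 3, "Nagpur", 1400),
    ("I watch", "18/08/2015", 9, "Mumbai", 25000),
    ("Samsung Mobile", "17/01/2015", 2, "Pune", 25000),
    ("Fablet", "12/06/2015", 3, "Mumbai", 18000),
    ("I band", "12/03/2015", 4, "Nagpur", 25000),
    ("Apple TV", "12/07/2015", 5, "Mumbai", 96000),
    ("HTC Smartphone", "12/02/2015", 6, "Pune", 35000),
    ("Samsung Watch", "12/08/2015", 9, "Nagpur", 21000),
]

# Assumed ontology content: one (subject, predicate, object) triple per Apple
# product, spelled as in Table 1 so the names match the database rows.
APPLE_PRODUCTS = ["MacBook", "MacOs", "Iphone", "Ipod",
                  "I Os", "I watch", "I band", "Apple TV"]
ONTOLOGY = [("Apple", "Sales_Relationship", name) for name in APPLE_PRODUCTS]


def semantic_query(subject, predicate, triples):
    """Stand-in for the semantic query engine: objects matching the pattern."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def to_sql(products, table="Sales"):
    """Transformation step: turn the retrieved concepts into a parameterized query."""
    placeholders = ", ".join("?" for _ in products)
    sql = f"SELECT * FROM {table} WHERE Product_Name IN ({placeholders})"
    return sql, list(products)


def run_query(conn, sql, params):
    """Query processing step: run the generated query on the database."""
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales (Product_Name TEXT, Purchase_Date TEXT, "
    "Quantity INTEGER, City TEXT, Price INTEGER)"
)
conn.executemany("INSERT INTO Sales VALUES (?, ?, ?, ?, ?)", SALES)

# Traditional approach: only the product names the user happens to know.
traditional = run_query(
    conn,
    "SELECT * FROM Sales WHERE Product_Name IN (?, ?, ?)",
    ["MacBook", "MacOs", "Iphone"],
)

# Proposed approach: ontology lookup, then transformation, then processing.
products = semantic_query("Apple", "Sales_Relationship", ONTOLOGY)
sql, params = to_sql(products)
proposed = run_query(conn, sql, params)

print(len(traditional), len(proposed))  # 3 8
```

With the Table 1 rows, the direct query matches three products and the ontology-driven query matches eight, in line with Table 3. Using bound parameters rather than string concatenation for the product names is a design choice of this sketch, made so that names coming from the ontology cannot alter the generated SQL.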
IV. CONCLUSION
KBIR is the process of retrieving meaningful information from the database. Meaningful information remains hidden in the database because of insufficient background knowledge. To overcome this challenge, an ontology-driven approach is introduced. The involvement of ontology is expected to give more promising results than the traditional approach.
In future work, the focus will be on the automated construction of domain ontology. As data is growing enormously, manual construction of ontology is a very challenging job. Hence there is a need for automated functionality to construct the ontology.
REFERENCES
[1] T. R. Gruber, "Toward principles for the design of ontologies used for knowledge sharing", International Journal of Human-Computer Studies, Vol. 43, No. 5-6, pp. 907-928, 1995.
[2] Kaushal Giri, "Role of Ontology in Semantic Web", DESIDOC Journal of Library and Information Technology, Vol. 31, No. 2, pp. 234-238, 2011.
[3] Mohammad Mustafa Taye, "Understanding Semantic Web and Ontologies: Theory and Applications", Journal of Computing, Vol. 2, Issue 6, pp. 182-192, June 2010.
[4] Dejing Dou, Hao Wang, Haishan Liu, "Semantic Data Mining: A Survey of Ontology-based Approaches", Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing, pp. 244-251, February 7-9, 2015, Anaheim, California, USA.
[5] Sanju Mishra and Sarika Jain, "A study of various Approaches and Tools on Ontology", International Conference on Computational Intelligence and Communication Technology, pp. 57-61, February 2015.
[6] Xiaohui Tao, Yuefeng Li, and Ning Zhong, "A Personalized Ontology Model for Web Information Gathering", IEEE Transactions on Knowledge and Data Engineering, Vol. 23, No. 4, pp. 496-511, April 2011.
[7] Wang Xuping, Ni Zijian, Cao Haiyan, "Research on Association Rules Mining Based-on Ontology in E-commerce", International Conference on Wireless Communications, Networking and Mobile Computing, Vol. 2, Issue 31, pp. 3549-3542, 2007.
[8] Yongqing Wang and Yan Chen, "A New Association Rules Mining Method based on Ontology Theory", International Conference on Advanced Computational Intelligence, pp. 287-291, October 2012, Nanjing, Jiangsu, China.
[9] Rudy Prabowo, Mike Jackson, Peter Burden, and Heinz-Dieter Knoell, "Ontology-Based Automatic Classification for the Web Pages: Design, Implementation and Evaluation", Proceedings of the 3rd International Conference on Web Information Systems Engineering, pp. 182-191, 2002.
[10] P. Sundaramoorthy, Sreekrishna M., S. Bhuvaneshwari, and M. Selvam, "Ontology Based Classification of User History in Obscured Web Search", 2nd International Conference on Current Trends in Engineering and Technology, pp. 258-261, July 2014, Coimbatore, India.
[11] Jun Fang, Lei Guo, XiaoDong Wang, and Ning Yang, "Ontology-Based Automatic Classification and Ranking for Web Documents", Fourth International Conference on Fuzzy Systems and Knowledge Discovery, Vol. 3, 2007.
[12] Nadana Ravishankar T. and Shriram R., "Ontology based Clustering Algorithm for Information Retrieval", 4th International Conference on Computing, Communications and Networking Technologies, pp. 1-4, July 2013, Tiruchengode, India.
[13] Miriam Fernández, Iván Cantador, Vanesa López, David Vallet, and Enrico Motta, "Semantically enhanced Information Retrieval: An ontology-based approach", Web Semantics: Science, Services and Agents on the World Wide Web, Vol. 9, Issue 4, pp. 434-452, 2011.
[14] Yang Xiudan and Zhu Yuanyuan, "Ontology-based information extraction system in E-commerce websites", International Conference on Control, Automation and Systems Engineering, pp. 1-4, 2011, Singapore.
[15] Kate Revoredo, José Eduardo Ochoa Luna, and Fabio Gagliardi Cozman, "Semantic Link Prediction through Probabilistic Description Logics", Journal of the Brazilian Computer Society, pp. 397-409, 2013.
[16] Doina Caragea, Vikas Bahirwani, Waleed Aljandal and William H. Hsu, "Ontology-Based Link Prediction in the LiveJournal Social Network", Proceedings of the Eighth Symposium on Abstraction, Reformulation, and Approximation, pp. 34-41, 2009.
[17] , F.A Ferreira Costa, J.A., Rodrigues Muniz Silva, C., "A Hierarchical Architecture for Ontology-based Recommender Systems", 2013 BRICS Congress on Computational Intelligence and 11th Brazilian Congress on Computational Intelligence, pp. 362-367, 2013, Ipojuca.
[18] Hubert Kadima, Maria Malek, "Toward ontology-based personalization of a Recommender System in social network", International Conference on Soft Computing and Pattern Recognition, pp. 119-122, 2010, Paris.
[19] A. Martin, D. Maladhy and Dr. V. Prasanna Venkatesan, "A framework for Business Intelligence Application using Ontological Classification", International Journal of Engineering Science and Technology, Vol. 3, No. 2, pp. 1213-1221, February 2011.
[20] Zhan Cui, Ernesto Damiani and Marcello Leida, "Benefits of Ontologies in Real Time Data Access", International Conference on Digital Ecosystems and Technologies, pp. 392-397, February 2007, Cairns.