- projectgenie

ABSTRACT
As probabilistic data management becomes one of the main research focuses and keyword search
becomes a more popular means of querying, it is natural to ask how to support keyword queries
on probabilistic XML data. For keyword queries on deterministic XML documents, ELCA
(Exclusive Lowest Common Ancestor) semantics allows more relevant fragments rooted at the
ELCAs to appear as results and is more popular than other keyword query result semantics
(such as SLCA). In this paper, we investigate how to evaluate ELCA results for keyword queries
on probabilistic XML documents. After defining probabilistic ELCA semantics in terms of
possible world semantics, we propose an approach to compute ELCA probabilities without
generating possible worlds. Then we develop an efficient stack-based algorithm that can find all
probabilistic ELCA results and their ELCA probabilities for a given keyword query on a
probabilistic XML document. Finally, we experimentally evaluate the proposed ELCA algorithm
and compare it with its SLCA counterpart in terms of result effectiveness, time and space
efficiency, and scalability.
Modules:
Data storage and search:
We describe an approach based on Tree-based Association Rules (TARs): mined rules that provide
approximate, intensional information on both the structure and the contents of XML documents
and can themselves be stored in XML format. There are two main approaches to XML document
access: keyword-based search and query answering. The idea of mining association rules to
provide summarized representations of XML documents has been investigated in many proposals,
among them ones using languages such as XQuery.
File organization:
We do not store the data in a single file because, in the Hadoop MapReduce framework, a file is
the smallest unit of input to a MapReduce job and, in the absence of caching, a file is always read
from disk. If we kept all the data in one file, the whole file would be input to the jobs for every
query. Instead, we divide the data into multiple smaller files.
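As a rough illustration of why splitting helps, the sketch below groups data into one bucket per predicate so that a query only touches the buckets it needs. The choice of the predicate as the split key, the function names, and the sample triples are assumptions made for this example, and in-memory lists stand in for the actual files.

```python
from collections import defaultdict

def split_by_predicate(triples):
    """Group (subject, predicate, object) triples into one bucket per predicate."""
    buckets = defaultdict(list)
    for subject, predicate, obj in triples:
        # The predicate names the bucket, so each row keeps only subject and object.
        buckets[predicate].append((subject, obj))
    return dict(buckets)

def files_for_query(buckets, predicates):
    """Return only the buckets that a query mentioning `predicates` has to read."""
    return {p: buckets[p] for p in predicates if p in buckets}

# Hypothetical sample data.
triples = [
    ("alice", "ub:memberOf", "dept1"),
    ("alice", "ub:undergraduateDegreeFrom", "univ1"),
    ("bob", "ub:memberOf", "dept2"),
]
buckets = split_by_predicate(triples)
needed = files_for_query(buckets, ["ub:memberOf"])
```

A query that only mentions `ub:memberOf` then reads one small bucket instead of the whole data set.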
Contact: 040-23344332, 8008491861
Email id: info@projectgenie.in, www.projectgenie.in
User index based search:
We introduce indexes on TARs to further speed up access to mined trees and, more generally,
intensional query answering. In general, path indexes are proposed to quickly answer queries that
follow some frequent path template, and are built by indexing only those paths that are queried
very frequently. We start from a different perspective: we want to provide quick, and often
approximate, answers to casual queries as well.
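A path index in the general sense mentioned above can be sketched as a map from each root-to-node tag path to the nodes found on that path. This is a generic illustration built on Python's standard `xml.etree.ElementTree`, not the TAR index itself. The function name and the sample document are assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def build_path_index(root):
    """Map every root-to-node tag path (e.g. '/lib/book/title') to its elements."""
    index = defaultdict(list)

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}"
        index[path].append(node)
        for child in node:
            walk(child, path)

    walk(root, "")
    return index

# Hypothetical sample document.
doc = ET.fromstring(
    "<lib><book><title>A</title></book><book><title>B</title></book></lib>"
)
index = build_path_index(doc)
titles = [node.text for node in index["/lib/book/title"]]
```

A lookup such as `index["/lib/book/title"]` answers a path query without traversing the document again.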
Query plan generation:
We define the query plan generation problem and show that generating the best (i.e., least-cost)
query plan is computationally expensive for both the ideal model and the practical one. We then
present a heuristic and a greedy approach that generate an approximate solution to the
best-plan problem.
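To make the greedy idea concrete, here is a minimal sketch under simplifying assumptions. Each triple pattern is reduced to its set of variables, every job joins on exactly one variable, and each step picks the variable shared by the most remaining relations (ties broken alphabetically). This rule and the name `greedy_plan` are illustrative choices, not necessarily the heuristic or cost model used in the project.

```python
from collections import Counter

def greedy_plan(patterns):
    """Return a list of (join_variable, relations_joined) steps, one per job."""
    relations = [frozenset(p) for p in patterns]
    plan = []
    while len(relations) > 1:
        counts = Counter(v for rel in relations for v in rel)
        shared = [(n, v) for v, n in counts.items() if n >= 2]
        if not shared:
            break  # no variable is common to two relations, nothing left to join
        best = max(n for n, _ in shared)
        var = min(v for n, v in shared if n == best)  # deterministic tie-break
        joined = [r for r in relations if var in r]
        rest = [r for r in relations if var not in r]
        merged = frozenset().union(*joined)  # joined relation keeps all variables
        plan.append((var, len(joined)))
        relations = rest + [merged]
    return plan

# Variable sets of the five triple patterns in the running example.
example = [{"x"}, {"y"}, {"z", "v"}, {"x", "z"}, {"x", "y"}]
plan = greedy_plan(example)
```

For the running example this yields three jobs, joining first on `x`, then on `y`, then on `z`. A real planner may join on several variables in one job and weigh input sizes, which this sketch ignores.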
Running example:
We will use the following query as a running example in this section.
SELECT ?v, ?x, ?y, ?z WHERE {
?x xml:type ub:GraduateStudent
?y xml:type ub:University
?z ?v ub:Department
?x ub:memberOf ?z
?x ub:undergraduateDegreeFrom ?y }
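The variables that link the triple patterns of this query can be found mechanically. The sketch below is an illustration only: the patterns are copied from the running example with the spacing inside identifiers normalized (an assumption about the intended spelling), and the helper names are hypothetical.

```python
import re

# Triple patterns of the running example, spacing normalized (assumption).
PATTERNS = [
    "?x xml:type ub:GraduateStudent",
    "?y xml:type ub:University",
    "?z ?v ub:Department",
    "?x ub:memberOf ?z",
    "?x ub:undergraduateDegreeFrom ?y",
]

def variables(pattern):
    """Return the set of variable names (without '?') used in one triple pattern."""
    return set(re.findall(r"\?(\w+)", pattern))

def join_variables(patterns):
    """Count, for each variable, how many patterns use it; keep those used twice or more."""
    counts = {}
    for pattern in patterns:
        for var in variables(pattern):
            counts[var] = counts.get(var, 0) + 1
    return {var: n for var, n in counts.items() if n >= 2}

shared = join_variables(PATTERNS)
```

Here `x` appears in three patterns and `y` and `z` in two each, so those are the candidate join variables. `v` appears only once and links nothing.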
Time based search:
We then develop an efficient stack-based algorithm that can find all probabilistic ELCA results
and their ELCA probabilities for a given keyword query on a probabilistic XML document.
Finally, we experimentally evaluate the proposed ELCA algorithm and compare it with its SLCA
counterpart in terms of result effectiveness and time.
Existing System:
Semantic web technologies are being developed to present data in a standardized way so that
it can be retrieved and understood by both humans and machines. Historically, web pages
were published as plain HTML files, which are not suitable for reasoning.
1. No user data privacy
2. Existing commercial tools and technologies do not scale well in cloud computing settings.
Proposed System:
The proposed system integrates the functionalities proposed in our approach. Given an XML
document, it enables users to extract intensional knowledge and compose traditional queries as
well as queries over the intensional knowledge, receiving both extensional and intensional
answers. Users formulate queries over the original data, and the queries are automatically
translated and executed on the intensional knowledge.
We propose an approach to compute ELCA probabilities without generating possible worlds. Then
we develop an efficient stack-based algorithm that can find all probabilistic ELCA results and
their ELCA probabilities for a given keyword query on a probabilistic XML document. Finally,
we experimentally evaluate the proposed ELCA algorithm and compare it with its SLCA
counterpart in terms of result effectiveness and time.
ALGORITHM:
In this section, we introduce an algorithm, PrELCA, to put the conceptual idea in the previous
section into procedural computation steps. We start with indexing probabilistic XML data, and
then introduce the PrELCA algorithm. In the end, we discuss why it is difficult to find effective
upper bounds for ELCA probabilities, and it turns out that the PrELCA algorithm may be the only
acceptable solution.
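For contrast, the possible-world definition of ELCA probability can be sketched by brute force. This is not PrELCA. It enumerates every possible world, which is exactly the exponential cost the proposed algorithm is meant to avoid. The sketch assumes a simple model in which each node exists independently with a given probability when its parent exists. The `Node` class, the function names, and the sample tree are all hypothetical.

```python
from itertools import product

class Node:
    def __init__(self, name, prob=1.0, words=(), children=()):
        self.name = name
        self.prob = prob              # P(node exists | parent exists), assumed model
        self.words = set(words)
        self.children = list(children)

def all_nodes(node):
    yield node
    for child in node.children:
        yield from all_nodes(child)

def elcas(node, keywords, present):
    """Return (keywords found in subtree, ELCA names) for one possible world."""
    found = node.words & keywords
    exclusive = set(found)
    results = []
    for child in node.children:
        if child.name not in present:
            continue                  # an absent node removes its whole subtree
        child_found, child_results = elcas(child, keywords, present)
        results.extend(child_results)
        found |= child_found
        if child_found != keywords:   # subtrees holding every keyword are excluded
            exclusive |= child_found
    if exclusive == keywords:
        results.append(node.name)
    return found, results

def elca_probabilities(root, keywords):
    """Sum the probabilities of all worlds in which each node is an ELCA."""
    keywords = frozenset(keywords)
    uncertain = [n for n in all_nodes(root) if n is not root and n.prob < 1.0]
    certain = {n.name for n in all_nodes(root)} - {n.name for n in uncertain}
    totals = {}
    for flags in product([True, False], repeat=len(uncertain)):
        weight, present = 1.0, set(certain)
        for node, keep in zip(uncertain, flags):
            weight *= node.prob if keep else 1.0 - node.prob
            if keep:
                present.add(node.name)
        for name in elcas(root, keywords, present)[1]:
            totals[name] = totals.get(name, 0.0) + weight
    return totals

# Hypothetical probabilistic tree: r -> a("xml", 0.5), b -> c("xml", 0.8), d("query").
tree = Node("r", children=[
    Node("a", prob=0.5, words=["xml"]),
    Node("b", children=[Node("c", prob=0.8, words=["xml"]), Node("d", words=["query"])]),
])
probs = elca_probabilities(tree, ["xml", "query"])
```

In this tree, `b` is an ELCA whenever `c` exists (probability 0.8), while `r` is an ELCA only when `c` is absent and `a` is present (probability 0.2 × 0.5 = 0.1). The loop visits 2^n flag assignments for n uncertain nodes, so it is only feasible for tiny documents.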
System Requirements:
Hardware Requirements:
• System : Pentium IV 2.4 GHz.
• Hard Disk : 40 GB.
• Floppy Drive : 1.44 MB.
• Monitor : 15 VGA Colour.
• Mouse : Sony.
• RAM : 512 MB.
Software Requirements:
• Operating system : Windows 7.
• Coding Language : ASP.Net 4.0 with C#
• Data Base : SQL Server 2008.