AST_2011_Jin_Guang_Zheng_Week6

Semantic Water Quality Portal
Jin Guang Zheng, Ping Wang, Deborah McGuinness, Joanne Luciano
Tetherless World Constellation, Rensselaer Polytechnic Institute, Troy, New York, USA
TWC-TR# (assigned once accepted- for homework assignments use format specified in introduction)
1 INTRODUCTION
A semantic web portal (SWP) is a website built on semantic web
technologies that collects information for a community of users and
lets those users share and exchange information [1]. Benefits of an SWP
include automated inference and reasoning and semantic querying.
Despite these useful features, the deployment of SWPs remains
unsatisfactory. One of the main obstacles is that portals and frameworks
based on semantic web technologies are difficult for non-specialists to
replicate and hard to scale to different domains. Non-specialists must
learn a substantial amount of material, e.g. provenance and SPARQL,
before they can adapt such systems to a new domain. In addition, some
SWP systems require a substantial amount of configuration to be adapted
to a new domain. In this paper, we present an easy-to-deploy SWP. The
proposed SWP requires little learning and little configuration from the
user while enabling the following semantic web features: 1. provenance
at both the data level and the application level, 2. OWL-based inference
and reasoning, 3. visualization of semantic data. We have already
deployed our SWP in the environmental domain; more specifically, we used
the system to develop a Semantic Water Quality Portal (SWQP).
2 DATA
Data from three different sources were collected to develop
our SWQP: 1. USGS data about water sources, 2. EPA data
about facilities, 3. state regulation data about pollutants.
USGS data: This dataset provides measurements of many
different chemicals (e.g. arsenic) in groundwater and
waterways.
EPA data: This dataset provides information about specific
companies that must abide by the federal guidelines in
place, and whether and when they have violated EPA regulations.
State Regulation data: Since USGS is not responsible for
managing and enforcing regulations, its data must be
evaluated under federal or state regulations such as Rhode
Island's Water Quality Regulations (2009).
Because the data come from various sources, we need
semantic technologies to combine the data and present them
to end users.
3 METHOD/SYSTEM ARCHITECTURE
To support the aforementioned SWP features and build our
SWQP, we implemented the following components:
Data Conversion Component: Two converters are
implemented. The first is a general converter, which can
convert any data in CSV format to RDF format.
The second is an ad-hoc converter for SWQP, which
converts some of the regulation data from PDF format to
RDF format.
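The general CSV-to-RDF conversion can be illustrated with a minimal sketch. It is not the converter used in SWQP: the namespace, the column names, and the sample row are hypothetical, and every cell is emitted as a plain string literal in N-Triples syntax rather than being typed against the regulation ontology.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace chosen for illustration; not taken from the paper.
BASE = "http://example.org/water/"


def escape_literal(value):
    """Escape characters that may not appear raw inside an N-Triples string literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_ntriples(csv_text, id_column, base=BASE):
    """Turn each CSV row into one subject and each non-empty cell into one triple."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{base}record/{quote(row[id_column], safe='')}>"
        for column, value in row.items():
            # Skip the identifier column, surplus cells (key None) and empty cells.
            if column is None or column == id_column or not value:
                continue
            predicate = f"<{base}property/{quote(column, safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


# Made-up sample row; the values are not real USGS measurements.
sample = "site_id,chemical,value\nUSGS-01,Arsenic,0.02\n"
for line in csv_to_ntriples(sample, "site_id"):
    print(line)
```

Deriving predicates mechanically from column headers is what makes such a converter domain-independent; mapping those predicates onto ontology terms would be a separate step.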
Ontology Component: In SWQP, we designed a core
regulation ontology. When data are converted to RDF
format, we encode the data using the ontology. Therefore,
we can perform reasoning on the data we collected. The
ontology itself is designed and encoded in OWL 2 [2].
Provenance Component: Our provenance component captures two levels of
provenance information. The first level is data-level provenance:
when data are converted to RDF using our data conversion
component, we inject provenance information about the data
sources using PML [3]. The second level is application-level
provenance: when a water source is marked as polluted (or a facility
is marked as violating), we provide provenance information about
which data were used for this reasoning.
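The two provenance levels can be pictured with a small data-structure sketch. It does not use PML and is not the portal's implementation: the class names, field names, file names, and sample values are all hypothetical and only show how a source can be attached to each fact and how a conclusion can list the facts it was derived from.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourceInfo:
    """Data-level provenance: where a converted fact came from."""
    organization: str
    document: str


@dataclass(frozen=True)
class Fact:
    """A converted statement together with its source."""
    description: str
    source: SourceInfo


@dataclass
class Conclusion:
    """Application-level provenance: a conclusion plus the facts used to reach it."""
    statement: str
    derived_from: list = field(default_factory=list)

    def explain(self):
        lines = [self.statement]
        for fact in self.derived_from:
            lines.append(
                f"  used: {fact.description} "
                f"(source: {fact.source.organization}, {fact.source.document})"
            )
        return "\n".join(lines)


# Hypothetical sources and values, for illustration only.
usgs = SourceInfo("USGS", "measurements.csv")
state = SourceInfo("State regulation", "regulation.pdf")
conclusion = Conclusion(
    "site-1 is marked as a polluted water source",
    [Fact("arsenic measured at 0.02 at site-1", usgs),
     Fact("regulatory limit for arsenic is 0.01", state)],
)
print(conclusion.explain())
```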
Visualization Component: This component is responsible
for mashing up the data we collected from various sources and
presenting them in a meaningful way. Currently, we provide a
Geo Map visualization of water sources and facilities.
Back-end Reasoner Component: We also built a back-end
reasoner using Jena and Pellet. In SWQP, the reasoner
performs OWL 2 reasoning using the ontology we designed
over the data we collected from various sources to identify polluted water sources and violating facilities.
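The portal's reasoner relies on OWL 2 reasoning with Jena and Pellet. The plain-Python sketch below is only a stand-in that illustrates the kind of conclusion being drawn. It assumes, which the paper does not state explicitly, that a water source counts as polluted when a measured value exceeds the regulatory limit for that chemical; the function name, site names, and numbers are hypothetical.

```python
def polluted_sources(measurements, limits):
    """Return {site: [(chemical, value, limit), ...]} for sites that exceed a limit.

    measurements: iterable of (site, chemical, value) tuples
    limits: mapping from chemical name to its maximum allowed value
    """
    violations = {}
    for site, chemical, value in measurements:
        limit = limits.get(chemical)
        # Chemicals without a known limit cannot trigger a violation.
        if limit is not None and value > limit:
            violations.setdefault(site, []).append((chemical, value, limit))
    return violations


# Made-up measurements and limit, for illustration only.
measurements = [("site-1", "arsenic", 0.02), ("site-2", "arsenic", 0.005)]
limits = {"arsenic": 0.01}
print(polluted_sources(measurements, limits))
```

Returning the offending measurement together with the limit keeps the inputs of each conclusion available, which is what application-level provenance needs.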
4 DISCUSSION
In this section, we discuss the project from two perspectives: 1. the class, 2. the research tasks.
4.1 Discussion for the class
In this section, I discuss the project with respect to the purpose of the class,
mainly answering two questions: Who is primarily responsible for
which tasks? What do I think I will learn?
Who is primarily responsible for which tasks?
Data Conversion: Ping is primarily responsible for
converting the different data, using either the converter written
by her or the one written by Tim.
Ontology Development and Reasoner Development: Jin is
primarily responsible for developing the ontologies
and the reasoners used in the system.
Functionality Development: Both Jin and Ping are responsible for developing the functionalities required by the system.
What do I think I will learn?
I think I will learn more about ontology and reasoner development, paper writing, and the applications of semantic web technologies.
4.2 Discussion for the research tasks
Some of the development and implementation tasks were already finished last semester. However, some minor adjustments and new functionalities need to be developed this
semester.
Scale the portal: There are two dimensions along which we can and need
to scale our portal. The first is scaling the portal
with respect to state regulations. The second is scaling the portal to different domains, e.g. a Health Portal.
Ontology and Reasoner: The goal of this task is to build a
more robust ontology that supports the various kinds of reasoning
we need in our portal. Some interesting ontologies have already been implemented [4]. We can borrow some of the
ideas from these ontology designs and possibly link to the existing ontologies.
Provenance: Two provenance tasks remain to be finished:
provenance-based queries and visualization of provenance data.
Portal Functionalities: We can also add a few more functionalities to our portal: an ontology-based facet browser and more
ways to visualize our data.
5 CONCLUSION
In this paper, we presented an easy-to-deploy Semantic Web
Portal. We also deployed the portal in the environmental domain and built the Semantic Water Quality Portal. SWQP
demonstrates interesting and useful semantic web features provided by our SWP: 1. it provides provenance information about data and reasoning, 2. it supports
automatic OWL-based reasoning, 3. it visualizes semantic data
in a meaningful way. As discussed in the previous section, we will continue to work on the portal to
make it more robust.
REFERENCES
[1] Lausen, H., Ding, Y., Stollberg, M., Fensel, D., Hernandez, R., and Han, S. (2005): Semantic web portals: state-of-the-art survey. Journal of Knowledge Management, vol. 9(5), pp. 40-49
[2] Hitzler, P., Krotzsch, M., Parsia, B., Patel-Schneider, P., Rudolph, S. (2009): OWL 2 Web Ontology Language Primer. <http://www.w3.org/TR/owl2-primer/>
[3] McGuinness, D., Silva, P., Ding, L. (2007): Proof Markup Language (PML) Primer. <http://inferenceweb.org/2007/primer/>
[4] Parekh, V. (2005): Applying Ontologies and Semantic Web technologies to Environmental Sciences and Engineering. Master's Thesis, University of Maryland, Baltimore County