Towards Using Semantic Decision Tables for Organizing
Data Semantics
Yan Tang
VUB STARLab
10G731, Vrije Universiteit Brussel
Pleinlaan 2, 1050 Elsene
Brussels, Belgium
yan.tang@vub.ac.be
Abstract. In the ITEA2 Do-It-Yourself Smart Experiences project (DIY-SE),
we are required to design an ontology-based ambient computing environment that
helps users build (DIY) their own personalized solutions. In this paper, we illustrate
how to manage data semantics using Semantic Decision Table (SDT). We use a
simple rule writing language called Decision Commitment Language (DECOL)
to store the SDT commitments. Semantic Decision Rule Language (SDRule-L),
which is an extension to Object Role Modelling (ORM), is used to graphically
represent DECOL. We demonstrate how SDT, together with
SDRule-L and DECOL, can be used by both technical and non-technical end users.
Keywords: Semantic Decision Table, Ontology, Do-It-Yourself
1 Introduction and Motivation
The ITEA2 Do-It-Yourself Smart Experiences project (DIY-SE, http://dyse.org:8080)
aims at allowing citizens to obtain highly personalized and social experiences of
smart objects at home and in the public areas, by enabling them to easily DIY
applications in their smart living environments.
The DIY aspect has been very important for decades. As Mark Frauenfelder points
out, “DIY helped them take control of their lives, offering a path that was simple,
direct, and clear. Working with their hands and minds helped them feel more engaged
with the world around them” [3].
In the IT/ICT domain, the DIY culture has been taken up through modern Internet
technologies. For instance, YouTube (http://www.youtube.com) is a website that
allows users to upload, publish and share their videos. Yahoo! Pipes
(http://pipes.yahoo.com) is an application that helps users create web-based applications
through a graphical web interface. OpenChord (http://www.openchord.org) is an
open source kit for translating the input from a regular or electronic guitar into
computer commands.
We are required to design and implement an innovative DIY-enabled, ontology-based
ambient computing environment. In particular, we use Semantic Decision Table
(SDT) to manage evolving data semantics in a ubiquitous network.
The paper is organized as follows. Section 2 gives the background. Section 3
discusses how to use SDT and ORM in a DIY scenario for modeling
data semantics. We present the implementation in Section 4. Section 5 covers the
related work. In Section 6, we conclude.
2 Background
The Developing Ontology Grounded Methodologies and Applications framework (DOGMA,
[10]) applies data modeling methods to ontology modeling. An ontology modeled
in DOGMA has two layers: a lexon layer and a commitment layer.
A lexon is a simple binary fact type, which contains a context identifier, two terms
and two roles. For instance, a lexon ⟨γ, teacher, teaches, is taught by, student⟩
presents a fact that “a teacher teaches a student, and a student is taught by a teacher”,
where “teacher” and “student” are two terms, and “teaches” and “is taught by” are two
roles. The context identifier γ points to a resource where “teacher” and “student” are
originally defined and disambiguated.
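The structure of a lexon can be sketched as a small record type. This is only an illustration under stated assumptions: the class name `Lexon`, its field names and the context identifier `"school"` are hypothetical and do not come from the DOGMA tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lexon:
    """A binary fact type that can be read in both directions."""
    context: str   # identifier of the resource where the terms are defined
    term1: str
    role: str
    co_role: str
    term2: str

    def verbalize(self) -> str:
        # Read the fact type forwards, then backwards.
        return (f"a {self.term1} {self.role} a {self.term2}, "
                f"and a {self.term2} {self.co_role} a {self.term1}")


# "school" is a placeholder context identifier chosen for illustration.
lexon = Lexon("school", "teacher", "teaches", "is taught by", "student")
print(lexon.verbalize())
```

Keeping the context identifier inside the record means that the same pair of terms can appear in different contexts without being confused.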
A commitment (also called “ontological commitment”) is an agreement made by a
community (also called “group of interests”). It is a rule in a given syntax.
Table 1. An SDT for deciding whether a student studies or visits his friends
Condition
Weather          Sunny    Sunny    Raining  Raining
Exam             Yes      No       Yes      No
Action
Study
Visit friends
(The source marks four action entries with *; their column positions are not preserved.)
SDT Commitments in DECOL
1  P1 = [Weather, has value, is value of, Value]: P1(Value) = {Sunny, Raining}
2  P2 = [Exam, has value, is value of, Value]: P2(Value) = {Yes, No}
3  P3 = [Student, takes, is taken by, Exam]: MAND(P3)
4  (P4 = [ENGINE, VERIFY, is verified by, Weather], P5 = [ENGINE, VERIFY, is verified
   by, Exam]): SEQ (P4(VERIFY, is verified by), P5(VERIFY, is verified by))
Semantic Decision Table (SDT, [11]) is a decision table properly annotated with
domain ontologies. It is modeled based on the DOGMA framework.
Table 1 is an SDT example. It contains a tabular presentation and an extra layer of
commitments. We use a simple rule writing language called Decision Commitment
Language (DECOL, [11]) to store the SDT commitments. It can be translated into a
controlled natural language for the verbalization, and be published in an XML format.
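As a rough illustration of how the commitment layer constrains the tabular layer, the value-range commitments 1 and 2 of Table 1 can be checked against the condition entries. The dictionary layout and the function name `value_range_violations` are assumptions made for this sketch; they are not part of DECOL.

```python
# Value-range commitments 1 and 2 of Table 1, keyed by condition stub.
value_ranges = {
    "Weather": {"Sunny", "Raining"},
    "Exam": {"Yes", "No"},
}

# Condition entries of Table 1, one dict per decision column.
columns = [
    {"Weather": "Sunny", "Exam": "Yes"},
    {"Weather": "Sunny", "Exam": "No"},
    {"Weather": "Raining", "Exam": "Yes"},
    {"Weather": "Raining", "Exam": "No"},
]


def value_range_violations(columns, value_ranges):
    """Return (column number, stub, entry) for entries outside their range."""
    violations = []
    for number, column in enumerate(columns, start=1):
        for stub, entry in column.items():
            # A stub without a committed range accepts any entry.
            if entry not in value_ranges.get(stub, {entry}):
                violations.append((number, stub, entry))
    return violations


print(value_range_violations(columns, value_ranges))
```

An entry such as "Snowing" under Weather would be reported, because it lies outside the committed value range.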
A graphical notation for rule modeling is defined as a set of symbols for
representing rule vocabulary and rules in concise and unambiguous manner [8].
Semantic Decision Rule Language (SDRule-L, [12]) is an extension to Object Role
Modeling (ORM/ORM2, [5]). It contains a set of rich graphical notations for business
rule modeling. We use it as the modeling means for DECOL. Note that the value
range constraint and the mandatory constraint are reused from ORM/ORM2 (e.g.,
Commitments 1, 2 and 3 in Table 1). Its extensions to ORM/ORM2 include, for instance,
cross-context subtyping and sequence.
The constraint of cross-context subtyping is used to populate the subtypes of a type
when using SDT for ontology versioning. A detailed discussion can be found in [11].
The sequence is applied between conditions or between actions in an SDT, but not
between a condition and an action. It tells the reasoner in which sequence it needs to
validate the conditions or execute the actions. Fig. 1 shows an example.
[Diagram label: Sequence]
Verbalization: ENGINE VERIFIES Weather, THEN, ENGINE VERIFIES Exam
Fig. 1. Model Commitment 4 in Table 1 using SDRule-L
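The effect of a sequence commitment on a reasoner can be sketched as follows. The function `verify_in_sequence` and the checks are hypothetical, and stopping at the first failed condition is an assumption of this sketch; the text only prescribes the order.

```python
def verify_in_sequence(sequence, checks, facts):
    """Run condition checks in the committed order; stop at the first failure.

    Returns the list of stubs that were actually verified.
    """
    verified = []
    for stub in sequence:
        verified.append(stub)
        if not checks[stub](facts):
            break
    return verified


# Commitment 4 of Table 1: Weather is verified before Exam.
sequence = ["Weather", "Exam"]

# Hypothetical checks for the column with entries (Sunny, Yes).
checks = {
    "Weather": lambda facts: facts["Weather"] == "Sunny",
    "Exam": lambda facts: facts["Exam"] == "Yes",
}

print(verify_in_sequence(sequence, checks, {"Weather": "Sunny", "Exam": "Yes"}))
```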
3 Use SDT to Manage Evolving Data Semantics
3.1 Design User-Friendly Decision Rules/Introduce New Concepts and Rules
The spreadsheet style of decision tables is convenient for non-technical people to
model procedural decision rules. An SDT is, first of all, a decision table. Therefore, it
has the advantage of a user-friendly presentation. As pointed out by Henry
Beitz et al. [7], decision tables are an excellent medium of communication
between technical and non-technical people. This is probably their biggest advantage.
In DIY-SE, there are three kinds of users: professional (e.g., technicians,
engineers and experts), semi-professional (e.g., geeks and nerds) and non-professional
(e.g., grandmothers and kids). A non-professional has the DIY needs. He/she knows
the problem but probably does not know the solution. A professional has the
technological abilities. He/she knows many solutions. A semi-professional combines
some abilities of both. He/she probably does not know
the problem or the technical solutions in depth. As he/she can play both roles, we will
only discuss the roles of professional and non-professional in the rest of the paper.
We proceed as follows. We first ask a non-professional to write down the
problem. He/she has decision rules in mind, which are not in a tabular format but in
natural language. For instance, he/she says “I want to sleep well, but I have a problem
because I’m very sensitive to the lights. So if it is sunny, please close my window. I
also want fresh air while sleeping, so if it is not sunny and the air in my room is not
clean, then please open the window”.
Table 2. An SDT that decides whether to open the window or not
Condition      1      2       3             4      5       6
Weather        Sunny  Cloudy  Dark (night)  Sunny  Cloudy  Dark (night)
Air            Fresh  Fresh   Fresh         Dirty  Dirty   Dirty
Action
Open window
Close window
(The source marks six action entries with *. As discussed in Section 3.2, column 2 has no action and column 5 has both actions; the positions of the other entries are not preserved.)
SDT commitment in DECOL
1  P1 = [Weather, has, is of, Value]: P1(Value) = {Sunny, Cloudy, Dark (night)}
2  P2 = [Air, has, is of, Value]: P2(Value) = {Fresh, Dirty}
Then, a knowledge engineer (a professional) gets the requirements and designs an
SDT as shown in Table 2. In order to design this SDT, he/she may need the
graphical modeling support from SDRule-L. Meanwhile, new lexons and
concepts are collected at a local server. When some of the lexons/concepts appear
several times, they are uploaded to the ontology server for versioning.
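The condition part of such a table can be generated mechanically from the value ranges, which may help a knowledge engineer make sure that no combination is forgotten. The following is a minimal sketch; the variable names are illustrative.

```python
from itertools import product

# Value ranges taken from commitments 1 and 2 of Table 2.
weather_values = ["Sunny", "Cloudy", "Dark (night)"]
air_values = ["Fresh", "Dirty"]

# Air is the outer loop so that Weather varies fastest, as in Table 2.
condition_columns = [
    {"Weather": weather, "Air": air}
    for air, weather in product(air_values, weather_values)
]

for number, column in enumerate(condition_columns, start=1):
    print(number, column["Weather"], column["Air"])
```

The six generated columns follow the column order of Table 2, so column 5 is the combination (Cloudy, Dirty).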
In the end, the knowledge engineer passes the SDT to a technician (another
professional), who installs the required ambient sets, e.g., a light sensor for the
window and an automatic window handler, and implements the decision rules in the
physical world. We do not discuss this process in detail, as it is outside the scope of this paper.
Note that SDRule-L and DECOL can also be translated into logical sentences. By
doing so, readers can use many existing rule engines to reason over the SDT commitments.
For instance, commitment 1 in Table 2, which restricts the values of Weather to
Sunny, Cloudy and Dark (night), can be expressed as a formula in predicate logic.
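One plausible rendering of the value-range commitment on Weather is given below. It is a reconstruction under assumptions, not the paper's own formula: the unary predicate Weather and the binary predicate has are names chosen here to mirror the lexon [Weather, has, is of, Value].

```latex
\[
\forall x\,\forall y\;\bigl(\mathit{Weather}(x) \wedge \mathit{has}(x, y)
  \rightarrow y = \text{Sunny} \vee y = \text{Cloudy} \vee y = \text{Dark (night)}\bigr)
\]
```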
3.2 Cross Check Decision Rules/Audit Decision Rules
Suppose that we introduce a new SDT commitment to Table 2. The DECOL code is
shown below:
(P3 = [ENGINE, EXECUTE, is executed by, Open window],
P4 = [ENGINE, EXECUTE, is executed by, Close window]):
P_MAND_XOR (P3, P4).
This commitment can be written in predicate logic in two equivalent forms.
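A possible pair of equivalent forms is sketched below. It is a reconstruction under assumptions rather than the paper's own notation: O abbreviates "ENGINE executes Open window" and C abbreviates "ENGINE executes Close window".

```latex
\[
\bigl(O \vee C\bigr) \wedge \neg\bigl(O \wedge C\bigr)
\qquad\text{or, equivalently,}\qquad
\bigl(O \wedge \neg C\bigr) \vee \bigl(\neg O \wedge C\bigr)
\]
```

The first form states the mandatory part (at least one action) and the exclusive part (not both) separately; the second lists the two admissible cases.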
[Diagram label: Process constraint of mandatory and exclusive]
Verbalization: ENGINE MUST EITHER EXECUTES Close window, OR EXECUTES Open
window, BUT NOT BOTH.
Fig. 2. Model Commitment 3 in Table 2 using SDRule-L
The graphical model and verbalization of this commitment are illustrated in Fig. 2.
The decision columns 2 and 5 in Table 2 are thus invalid, because column 2 does
not contain any action and column 5 contains more than one action. The knowledge
engineer then needs to modify Table 2. When more constraints are introduced, an
SDT contains more semantic information and the final implementation becomes
more accurate.
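This cross check can be sketched in a few lines. Only columns 2 and 5 follow the text (no action, and both actions); the single action assigned to columns 1, 3, 4 and 6 is an assumption made for illustration, as is the function name `violates_mandatory_xor`.

```python
# Actions selected per decision column of Table 2. Columns 2 and 5 follow the
# text; the single action assumed for columns 1, 3, 4 and 6 is illustrative.
selected_actions = {
    1: {"Close window"},
    2: set(),
    3: {"Close window"},
    4: {"Close window"},
    5: {"Open window", "Close window"},
    6: {"Open window"},
}


def violates_mandatory_xor(actions, first, second):
    """True unless exactly one of the two actions is selected."""
    return (first in actions) == (second in actions)


invalid_columns = [
    number
    for number, actions in sorted(selected_actions.items())
    if violates_mandatory_xor(actions, "Open window", "Close window")
]
print(invalid_columns)  # [2, 5]
```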
Table 3. An SDT for deciding whether to study math or physics based on the previous exams
Condition                    1  2  3  4  5  6  7  8
Previous exam – English      Y  Y  Y  Y  N  N  N  N
Previous exam – Chemistry    Y  Y  N  N  Y  Y  N  N
Previous exam – Math         Y  N  Y  N  Y  N  Y  N
Action
Study math
Study physics
(The source marks six action entries with *; their column positions are not preserved.)
SDT commitment in DECOL
1  (P1 = [Person, has, is of, Previous exam], P2 = [Previous exam – English, is instance of, is,
   Previous exam], P3 = [Previous exam – Chemistry, is instance of, is, Previous exam], P4 =
   [Previous exam – Math, is instance of, is, Previous exam]): CARD (P1(has), <=2).
2  P1 = [Person, has, is of, Previous exam]: P1(Previous exam) = {Previous exam – English,
   Previous exam – Chemistry, Previous exam – Math}, CARD (P1(has), <=2).
If conflicting rules are introduced by different knowledge engineers when an
SDT is designed by a group, then the group needs to reach a formal agreement.
This process is also called the ontological commitment process.
Another example is to use the constraint of occurrence frequency to cross check
the decision rules in an SDT. Suppose that we have the SDT shown in Table 3. There
are two SDT commitments in Table 3. They are equivalent from the modeling
perspective. The function name CARD is short for “cardinality”. Both commitments
can be graphically represented as in Fig. 3.
The SDT commitment 1 can be written in predicate logic over a binary predicate
Has_exam(x, y), where x is a person and y is a previous exam.
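A plausible form of this cardinality constraint is sketched below. It is a reconstruction, not the paper's own formula; it assumes only the predicate Has_exam(x, y) and states that a person cannot have three pairwise different previous exams.

```latex
\[
\forall x\,\forall y_1\,\forall y_2\,\forall y_3\;
\bigl(\mathit{Has\_exam}(x, y_1) \wedge \mathit{Has\_exam}(x, y_2) \wedge \mathit{Has\_exam}(x, y_3)
  \rightarrow y_1 = y_2 \vee y_1 = y_3 \vee y_2 = y_3\bigr)
\]
```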
[Diagram label: Cardinality constraint]
Verbalization: EACH Person has <=2 Previous exam(s).
Fig. 3. Model Commitment 1 or 2 in Table 3 Using SDRule-L
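One way to read this commitment against Table 3 is sketched below. It assumes that a Y entry means the person has the corresponding previous exam, so a column with three Y entries exceeds the committed bound; this reading and the function name `violates_cardinality` are assumptions of the sketch.

```python
from itertools import product

exams = ["English", "Chemistry", "Math"]

# Table 3 lists all Y/N combinations, with the last exam varying fastest.
columns = [dict(zip(exams, entries)) for entries in product("YN", repeat=3)]


def violates_cardinality(column, limit=2):
    """True when a column assumes more previous exams than CARD allows."""
    return sum(entry == "Y" for entry in column.values()) > limit


invalid = [
    number
    for number, column in enumerate(columns, start=1)
    if violates_cardinality(column)
]
print(invalid)  # [1]
```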
Note that the ontological commitment process results in a set of ontological
commitments, which are uploaded to the ontology server for versioning.
Currently, we use the SDRule-L/DECOL constraints of uniqueness, mandatory,
occurrence frequency, subset, equality, exclusion, value range, sequence, subtype,
trans-context equality and trans-context subtyping for cross checking the rules.
3.3 Map Semantics to Data/Build a Component Network
In Ontology Engineering, instances are often stored together with an ontology. We
separate instances from concepts for the following two reasons.
─ We can achieve a higher level of extensibility. The ontology no longer needs to
be updated when a new instance is introduced. We can save the efforts of
ontology versioning.
─ We can convert parts of the ontology into database schemas and use many
existing powerful database engines to deal with the constraints. By doing so, the
performance of the ontology-based system will be improved.
In DIY-SE, we have a data semantic server that deals with mapping and
interpreting data from heterogeneous databases. For example, the process
information of a particular ambient environment is stored in such a database. When a
new ambient object (also called an instance), e.g., a new window, is brought in, the
server faces the problem of finding the proper mapping.
Fig. 4. An ORM-style SDRule-L model extended from Fig. 2
To solve this problem, we derive an ORM-style SDRule-L model
from the previously designed SDRule-L model so that instances can be handled. Fig. 4
shows a model that is extended from Fig. 2.
The database engineer (a professional) maps Fig. 4 into a database schema and
implements it as follows. He/she puts all the values of window in a table named
“Window”. As a window can be either open or closed, the table contains a column
“Status”, which is declared NOT NULL.
The data of a new window will be automatically inserted into the corresponding
data table when we annotate it with the concept “Window” defined in the ontology.
When the knowledge engineers cannot find a type for an instance, they will need to
first apply an ontology versioning method to introduce a new type.
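A minimal sketch of such a schema, using SQLite through Python's standard library, is shown below. The identifier column `WindowId`, the sample identifiers and the CHECK clause are assumptions added for illustration; the text itself only states the table name, the “Status” column and its NOT NULL property.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    """
    CREATE TABLE Window (
        WindowId TEXT PRIMARY KEY,          -- hypothetical identifier column
        Status   TEXT NOT NULL
                 CHECK (Status IN ('open', 'closed'))
    )
    """
)

# A new window annotated with the concept "Window" becomes one row.
connection.execute("INSERT INTO Window VALUES (?, ?)", ("bedroom-1", "closed"))

try:
    # A window without a status violates the NOT NULL constraint.
    connection.execute("INSERT INTO Window VALUES (?, ?)", ("bedroom-2", None))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = connection.execute("SELECT COUNT(*) FROM Window").fetchone()[0]
print(rejected, row_count)
```

Letting the database engine enforce the constraint is the point made above: the ontology supplies the constraint, and an existing engine checks it on every insert.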
3.4 Inspire New DIY Ideas/Regroup Decision Rules
In DIY-SE, the context of an SDT is specified with people, devices, events/tasks, time
and location.
People are any group of human beings considered collectively. The group is specified with a
common role that all the members play. A device is an instrument or piece of equipment
serving a purpose. An event slightly differs from a task. An event is a set of circumstances or a
phenomenon located at a single point in space-time. A task is defined as an execution
path through space and time. A task in the field of management is often seen as a
specific piece of work required to be done. Both task and event have the properties of
space and time. Time is an identifiable instant or period associated with some
event. A location is a point or extent in space. It determines the place where an event
should happen or a task needs to be executed.
Each context corresponds to a set of condition alternatives and action alternatives.
New ideas can be inspired by merging several SDTs, or by creating a new SDT based on
the existing condition and action alternatives in the same (or a similar) context.
New SDTs can be created by non-professionals by simply selecting the decision
items from a context. The SDT engine takes responsibility for checking
consistency. When users come up with new rules that are not yet defined, the task of
introducing new decision rules (section 3.1) is executed.
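The regrouping of decision items can be pictured as a union of the condition and action alternatives of two contexts. The class `DecisionContext`, the function `merge_contexts` and the context names are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionContext:
    """Hypothetical container for the decision items available in a context."""
    name: str
    conditions: dict = field(default_factory=dict)  # stub -> set of alternatives
    actions: set = field(default_factory=set)


def merge_contexts(name, first, second):
    """Union the condition and action alternatives of two contexts."""
    merged = DecisionContext(name)
    for source in (first, second):
        for stub, alternatives in source.conditions.items():
            merged.conditions.setdefault(stub, set()).update(alternatives)
        merged.actions |= source.actions
    return merged


bedroom = DecisionContext(
    "bedroom",
    {"Weather": {"Sunny", "Cloudy", "Dark (night)"}, "Air": {"Fresh", "Dirty"}},
    {"Open window", "Close window"},
)
study = DecisionContext(
    "study",
    {"Weather": {"Sunny", "Raining"}, "Exam": {"Yes", "No"}},
    {"Study", "Visit friends"},
)
home = merge_contexts("home", bedroom, study)
print(sorted(home.conditions), sorted(home.actions))
```

A non-professional could then pick condition and action alternatives from the merged context, while consistency checking is left to the SDT engine.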
4 Prototype
We have been implementing a prototype in the existing SDT plug-in, which is a Java
module integrated in the DOGMA Studio Workbench.
With regard to the decision table construction, we have developed a standalone
module called Decision Table Constructor. It supports simple table editing functions.
Once a decision table is annotated with domain ontologies and the SDT
commitments are properly built, we call it an SDT. To support the approach
described in this paper, we have developed the following modules.
Module 1: Context Manager. The module is called DIBag (Decision Item Bag).
A user uses it to specify a new context, introduce new condition and action
candidates, and store the specification of a context in an XML file.
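What such a stored context specification might look like is sketched below with Python's standard XML library. The element and attribute names (`context`, `conditions`, `condition`, `alternative`, `actions`, `action`) are invented for this sketch; the actual DIBag file format is not described in the paper.

```python
import xml.etree.ElementTree as ET


def context_to_xml(name, conditions, actions):
    """Serialize a context specification; all element names are illustrative."""
    root = ET.Element("context", {"name": name})
    conditions_node = ET.SubElement(root, "conditions")
    for stub in sorted(conditions):
        stub_node = ET.SubElement(conditions_node, "condition", {"stub": stub})
        for alternative in sorted(conditions[stub]):
            ET.SubElement(stub_node, "alternative").text = alternative
    actions_node = ET.SubElement(root, "actions")
    for action in sorted(actions):
        ET.SubElement(actions_node, "action").text = action
    return ET.tostring(root, encoding="unicode")


xml_text = context_to_xml(
    "bedroom",
    {"Air": {"Fresh", "Dirty"}},
    {"Open window", "Close window"},
)
print(xml_text)
```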
Module 2: Domain Ontology Manager. Users can browse/edit the concepts and
lexons. The definitions of lexon terms can be visualized in T-Lex (graphs, [13]) or in
the SDT Term Dictionary (textual information). It contains a list of functions, including
add lexons/definitions, set ontology/context, upload ontology to the remote DOGMA
server, download ontology from the remote server, and search dictionary/ontology.
The lexons in the ontologies are grouped by their roles and visualized in a tree.
Module 3: Decision Item Annotator. With this module, users can annotate the
decision items (such as the condition stubs and the action stubs and their sub items)
with domain ontologies. For instance, we can annotate “People” from the condition
stub “People move Ear” with the concept “Name” by creating a new lexon that links
the terms “People” and “Name” through the role “OnProperty”. “OnProperty” is used
to further explain this annotation. It means that “People” has a property of “Name”.
Module 4: DECOL Editor. It lets users write DECOL code without
knowing its syntax. A user can validate a combination of constraints, e.g., the
mandatory constraint, by reading the automatically generated verbalization, e.g.,
“EACH Lamp has AT LEAST ONE Light”.
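Verbalization of this kind can be approximated with simple templates. The two functions below are hypothetical and only reproduce the wording of the verbalizations quoted in this paper; they are not the DECOL Editor's implementation.

```python
def verbalize_mandatory(subject, role, obj):
    """Verbalize a mandatory constraint on a role."""
    return f"EACH {subject} {role} AT LEAST ONE {obj}"


def verbalize_cardinality(subject, role, obj, operator, bound):
    """Verbalize an occurrence-frequency (CARD) constraint."""
    return f"EACH {subject} {role} {operator}{bound} {obj}(s)."


print(verbalize_mandatory("Lamp", "has", "Light"))
print(verbalize_cardinality("Person", "has", "Previous exam", "<=", 2))
```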
We use Decision Table Editor to define user friendly decision rules (section 3.1).
In order to cross check decision rules in SDTs (section 3.2), we use DECOL Editor
and SDT rule engine. Decision Item Annotator is used to map semantics to data
(section 3.3). The new DIY ideas are inspired with the assistance of DIBag and
Domain Ontology Manager (section 3.4).
5 Related Work
ORM has been used as an ontology modeling tool since the birth of DOGMA.
The NORMA tool [4] from Neumont University supports the ORM2 notation.
Popular Ontology Engineering tools are Protégé from Stanford University [9],
CoBrA from University of Edinburgh [1], CONE from VTT (Finland) and the
commercial TopBraid Composer¹. Compared to them, the ORM2-based T-Lex
offers an easily controlled, community-based way of modeling.
Seeing that SDT is an extension to decision tables, the related work of SDT includes
second-order decision tables [6] and fuzzy decision tables [2, 14]. A second-order
decision table uses second-order variables for condition entries. SDT does not specify
whether the annotated decision table is first order or second order. A fuzzy
decision table uses fuzzy logic to deal with the fuzziness of a condition entry. We do
not deal with fuzzy logic, but specify the ambiguous condition entries by properly
annotating them with domain ontologies.
In addition, SDT has the advantages of extensibility, formality, community
grounding, shareability and reasoning feasibility, which are brought forward by
modern Ontology Engineering technologies.
With regard to the related work of the SDT Editor, Corticon business rule modeler
(http://www.corticon.com), IBM WebSphere Decision Table Editor
(http://www-01.ibm.com/software/websphere/), JaretTable (http://www.jaret.de) and PROLOGA
(http://www.econ.kuleuven.be/prologa/) are well known decision table editors.
Corticon modeler supports online editing and browsing of decision tables.
IBM WebSphere DT Editor uses decision tables to generate program code. JaretTable
is an open source editor written in Java SWT for managing projects and analyzing
requirements. PROLOGA is a business rule modeling tool.
¹ http://www.topquadrant.com/products/TB_Composer.html (available on July 2, 2010)
The approach to cross checking decision rules illustrated in this paper deals
with the verification and validation (V&V) of the rules. The authors of [7] discuss
how to deal with this issue based on a survey of the early papers on decision
tables. Their V&V method is grounded on the semantics of basic mathematical
operators. For instance, a decision table is considered inconsistent when it contains
the conditions “X<=10” and “X<=20” at the same time, seeing that these two
conditions partly overlap. In addition to the semantics of the basic
mathematical operators, SDT goes one step further by using semantically rich
ontological commitments.
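The operator-based check mentioned above can be illustrated with intervals. This is a minimal sketch under the assumption that each condition is modeled as a closed interval (low, high) over the numbers; the function `overlaps` is not taken from [7].

```python
import math


def overlaps(first, second):
    """True when two closed intervals (low, high) share at least one value."""
    return max(first[0], second[0]) <= min(first[1], second[1])


x_le_10 = (-math.inf, 10)   # the condition "X<=10"
x_le_20 = (-math.inf, 20)   # the condition "X<=20"
x_ge_30 = (30, math.inf)    # a disjoint condition, added for contrast

print(overlaps(x_le_10, x_le_20), overlaps(x_le_10, x_ge_30))
```

Every value up to 10 satisfies both “X<=10” and “X<=20”, which is why the two conditions are reported as overlapping.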
6 Conclusion, Discussion and Future Work
In this paper, we have discussed how to use SDT to manage evolving data semantics
in a DIY environment. We manage evolving data semantics as follows:
• When we use SDTs to design user friendly decision rules based on the
new requirements, the new lexons and concepts are collected and
uploaded to the ontology server for versioning.
• When we cross check decision rules in (an) SDT(s), we often introduce
some new meta-rules, which are considered as ontological commitments
and are uploaded to the ontology versioning server.
• During the process of mapping semantics to data, new types will be
introduced when the technicians cannot find a defined type of a smart
component (an instance).
• When we regroup decision rules, new contexts are introduced. When the
users have new inspired rules, they can introduce them as new SDTs. The
new concepts will be formalized in the ontology of a newer version.
A decision rule formalizes constraints and derivation rules in order to reach a
precise conclusion. SDT helps rule modelers to correctly capture the policies using
their vocabulary. The purposes of using SDT are 1) to control
the updating of persistently stored decision rules and data; 2) to help implement and
update the requirements in a tabular format; and 3) to integrate the views of the
non-professional and the professional by mapping the information between the
requirement models and the desired computational models. These purposes are
studied in this paper. We will continue to investigate them in a broader context.
Acknowledgments. The work has been supported by the EU ITEA-2 Project
2008005 "Do-it-Yourself Smart Experiences", funded by IWT 459.
References
1. Aitken, S., Korf, R., Webber, B., and Bard, J. (2005): CoBrA: a Bio-Ontology Editor,
PMID: 15513995, http://www.ncbi.nlm.nih.gov/pubmed/15513995
2. Chen, G., Vanthienen, J., and Wets, G. (1995): Using Fuzzy Decision Tables to build valid
intelligent systems, Sixth International Fuzzy Systems Association World Congress (IFSA
95), Sao Paulo, Brazil, July 22-28
3. Frauenfelder, M. (2010): Made by Hand: Searching for Meaning in a Throwaway World,
Portfolio Hardcover, ISBN-10: 1591843324, ISBN-13: 978-1591843320
4. Curland, M. and Halpin, T. (2007): Model Driven Development with NORMA, Proc. 40th Int.
Conf. on System Sciences (HICSS-40), 10 pages, CD-ROM, IEEE Computer Society
5. Halpin, T.A. (2001): Information Modelling and Relational Databases: From Conceptual
Analysis to Logical Design, ISBN-13: 978-1-55860-672-2, ISBN-10: 1-55860-672-6,
Morgan Kaufmann Publishers, San Francisco, California
6. Hewett, R., Leuchner, J. H. (2002): The Power of Second-Order Decision Tables.
Proceedings of the Second SIAM International Conference on Data Mining (SDM’02),
Arlington, VA, USA, April 11-13, 2002, Robert L. Grossman, Jiawei Han, Vipin
Kumar, Heikki Mannila and Rajeev Motwani (eds.), ISBN: 490-89871-517-2, SIAM
7. Henry Beitz, E., Buck, N.H., Jorgensen, P. C., Larson, L., Maes, R., Marselos, N. L.,
Muntz C., Rabin, J., Reinwald, L. T., and Verhelst, M. (1982): A modern appraisal of
decision tables, a Codasyl report, ACM, New York
8. Lukichev, S., and Jarrar, M. (2009): Graphical Notations for Rule Modelling, Handbook
of Research on Emerging Rule-based Languages and Technologies: open solutions and
approaches, Adrian Giurca, Dragan Gasevic and Kuldar Taveter (eds.), Volume I,
pp. 76-98, Information Science Reference, IGI Global, ISBN 978-1-60566-402-6
9. Noy, N.F., McGuinness, D.L. (2001): Ontology development 101: A guide to creating
your first ontology. Technical Report KSL-01-05, Knowledge Systems Laboratory,
Stanford University, Stanford, CA, 94305, USA
10. Spyns, P., Tang, Y., and Meersman, R. (2008): An Ontology Engineering Methodology for
DOGMA, Journal of Applied Ontology, special issue on "Ontological Foundations for
Conceptual Modeling", G. Guizzardi and T. Halpin (eds.), Vol. 3, issue 1-2, pp. 13-39
11. Tang, Y. (2010): Semantic Decision Tables - A New, Promising and Practical Way of
Organizing Your Business Semantics with Existing Decision Making Tools,
ISBN 978-3-8383-3791-3, LAP LAMBERT Academic Publishing AG & Co., Saarbrucken, Germany
12. Tang, Y., and Meersman, R. (2009): SDRule Markup Language: Towards Modelling and
Interchanging Ontological Commitments for Semantic Decision Making, Handbook of
Research on Emerging Rule-based Languages and Technologies: open solutions and
approaches, Adrian Giurca, Dragan Gasevic and Kuldar Taveter (eds.), Vol. I, pp. 99-123,
Information Science Reference, IGI Global, ISBN 978-1-60566-402-6
13. Trog, D., Vereecken, J., Christiaens, S., De Leenheer, P., and Meersman, R. (2008):
T-Lex: a Role-based Ontology Engineering Tool, OTM workshops, ORM workshop,
Springer-Verlag, Volume 4278, Montpellier, France
14. Wets, G., Witlox, F., Timmermans, H., and Vanthienen, J. (1996): A Fuzzy Decision Table
Approach for Business Site Selection, the fifth IEEE international conference on Fuzzy
Systems, Vol. 3, pp. 1605-1610, ISBN 0-7803-3645-3, USA, 8-11 Sep. 1996