Activity Context Representation — Techniques and Languages: Papers from the 2011 AAAI Workshop (WS-11-04)

Enabling Semantic Understanding of Situations
from Contextual Data in a Privacy-Sensitive Manner

Fuming Shih
Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology
32 Vassar Street, Cambridge, MA 02139

Vidya Narayanan and Lukas Kuhn
Qualcomm, Inc.
5775 Morehouse Dr.
San Diego, CA 92121
Abstract
Mobile applications can be greatly enhanced if they
have information about the situation of the user. Situations may be inferred by analyzing several types of
contextual information drawn from device sensors, such
as location, motion, ambiance and proximity. To capture a richer understanding of users’ situations, we introduce an ontology describing the relations between
background knowledge about the user and contexts inferred from sensor data. With the right combination of
machine learning and semantic modeling, it is possible to create high-level interpretations of user behaviors
and situations. However, the potential of understanding
and interpreting behavior with such detailed granularity
poses significant threats to personal privacy. We propose
a framework to mitigate privacy risks by filtering sensitive data in a context-aware way, and to maintain the provenance of inferred situations as well as relations between
existing contexts when sharing information with other
parties.
Introduction

With sensing and communication capability, modern mobile devices are becoming the most comprehensive platform for context-aware applications. Such platforms merge a user’s content from applications on the Web with data created or sensed locally on mobile devices. Smart sensors on the mobile device can easily collect information such as the location, movement and environment of a user. Situation awareness can potentially lead to a wealth of applications, including the ability to provide user-specific experiences and improvements. Although not a solved problem, the use of machine learning techniques to derive contextual information about the user, such as motion states or significant places, is very prevalent. There have also been some attempts to approach the problem of situation awareness from a semantic perspective. Bringing the two fields together to provide a semantic understanding of a situation is promising and powerful. We borrow the terms context and situation as per the definition given by Dey (2001): “Context is any information that can be used to characterize the situation of an entity”.

There are two main challenges in realizing situation awareness: 1) fusing contextual information from disparate sources in a coherent and structured manner for reasoning, since contexts from different sources have varying semantics and relationships; 2) representing the privacy constraints in a situation- or context-specific manner that allows integration with the reasoning process.

Towards a semantic understanding of situations, we can model the relations between various types of contextual information and content in the upper-level ontology shown in Figure 1. Represented in this ontology are the actions, places and other information produced by machine-learning-based analysis of raw sensor or other data. Labels associated with data in machine learning algorithms can be seen as the first level of semantics. For instance, motion may be labeled as walking, running, etc., which are contexts indicated by the Action ontology in Figure 1. The ontologies then capture the relationships that exist across these semantic labels, as well as their links to knowledge bases built from both modeled common sense and learned user knowledge. In addition, we provide a discussion here about how privacy fits into this model and the overall situation awareness subsystem.

Figure 1: Top level ontology for situation awareness (solid lines show subClassOf). [Figure omitted: classes include Principal, Role, Situation, Context, Place, Action, Environment, Time, Device, Sensor, Model, Content, App, and Knowledge, connected by relations such as hasRole, hasContext, hasEnvironment, hasSensor, occursAt, associatedWith, and uses.]
Copyright © 2011, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Motivating Use-Case
We motivate the need for context representation and reasoning for situation-aware applications using a use case from a
scheduling application. As we will show, this use case incorporates aspects of context-awareness, reasoning over situations, information sharing, and privacy.
Consider the case where Alice is currently at a doctor’s
appointment, following which she has an important meeting
at work. Alice is supposed to be presenting at the meeting.
Consider a situation-aware application (e.g. a scheduling application) that is able to infer a situation of ’running late for a
meeting’ from derived context artifacts. Thus, such a system
can remind Alice of the meeting in a situation aware manner
and notify others that Alice is running late. For example, say
the typical travel time from the location of her doctor’s office to the location of her meeting is 20 minutes. The system
can monitor Alice to track if she is leaving on time. Based
on this information, the system can notify other participants
of the meeting that Alice is running late.
The system is also aware of the reason for her being late
and can provide this to the meeting invitees. However, making the reason known needs to happen in accordance with
Alice’s privacy preferences. For instance, the reason for being late may be shared only with certain invitees of the meeting and not others. Further, the optimal information-sharing strategy may depend not only on Alice’s situation and settings, but also on the situations of the other participants.
For example, if one of the participants is on vacation or if a
participant has a conflict that will delay his or her participation in this meeting by 30 minutes, a notification would not
do any good.
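To make the sharing logic in this use case concrete, the sketch below shows one way the notification decision could be expressed. It is only an illustration under assumed data structures (an Invitee record with an estimated arrival time and a per-invitee disclosure flag); the paper does not prescribe an implementation.

# Illustrative sketch only: deciding whom to notify that Alice is running late,
# and whether the reason may be disclosed. The Invitee fields and the policy
# flag 'may_see_reason' are assumptions made for this example.
from dataclasses import dataclass

@dataclass
class Invitee:
    name: str
    minutes_until_arrival: int   # estimated from the invitee's own situation
    may_see_reason: bool         # taken from Alice's privacy preferences

def build_notifications(invitees, alice_delay_minutes, reason):
    notices = []
    for person in invitees:
        # Skip invitees whose own situation makes the notice useless,
        # e.g. they will arrive even later than Alice does.
        if person.minutes_until_arrival >= alice_delay_minutes:
            continue
        text = f"Alice is running about {alice_delay_minutes} minutes late."
        if person.may_see_reason:
            text += f" Reason: {reason}."
        notices.append((person.name, text))
    return notices

print(build_notifications(
    [Invitee("Bob", 5, True), Invitee("Carol", 40, False)],
    alice_delay_minutes=20, reason="doctor's appointment"))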
Upper Level Ontology for Situation Awareness
Towards a collaborative situation-aware application, a first
step is to extract context artifacts that lead to a situational
understanding. Such artifacts are typically derived from various information sources such as sensors, audio, location,
time, application content, user-device interactions, etc. by
using a broad range of techniques such as signal processing, machine learning, natural language processing, etc. The
contextual information learnt after application of such preprocessing techniques is referred to in this paper as context
artifacts.
Given the capability of extracting contextual information,
a main challenge is to capture and integrate contextual information in a coherent, structured and semantically meaningful knowledge base (KB) that aids reasoning based on
the various context artifacts. This is challenging, as contextual information may be derived from different sources with
varying representations and semantic interpretations. Consider location as an example. There are many different ways
in which location can be interpreted. A location may refer to
an exact position on the globe or a place (e.g. home, office,
store). It is important to enable a common understanding of
concepts and their relations to enable semantically meaningful information sharing. For this reason, we propose the
upper-level ontology for situation awareness illustrated in
Figure 1, which is partly motivated by (Chen et al. 2004).
The core concepts in this ontology are described below.
• A principal is the primary actor who performs actions. A principal can be associated with varying contexts over time and take on multiple roles in the same or different situations.
• A role characterizes a principal with respect to a situation. Example roles include guest, waiter, and owner in a restaurant situation, or presenter, organizer and invitee in a meeting situation.
• A situation is a high-level concept which is typically defined over a set of context artifacts by tying together various pieces of contextual information. Examples include dining at a restaurant, delivering a lecture, grocery shopping, being in a meeting, etc.
• Context is any information that can characterize a situation. There are many sub-classes of context, only some of which we are able to address explicitly: place, action, environment, and time.
• A place is a location with a semantic meaning; in other words, it is a location which is relevant to a situation or an action. Examples of places include home, work, office, meeting room, etc.
• An action describes an atomic, lower-level task performed by a principal with no direct association to a role. Examples include silencing a cell phone, rejecting a call, running, driving, eating, talking, typing, etc. Actions have a temporal aspect and can be seen as context information.
• An environment characterizes the ambiance in which a principal is present. Examples include information such as dark, loud, cold, crowded, and proximity to other principals or devices.
• Time captures the temporal information of a context and is itself context.
• A device is a system that provides the contextual information that contributes to a situation. In some cases, the device is a system which follows the person (e.g., a mobile phone); in other cases, devices are systems that are present in the environment of the principal.
• The concept model captures the characteristics of another concept. Examples include the GPS model of a place or the motion model of an action.
• Sensors on a device include accelerometer, light, proximity, microphone, GPS, WiFi, etc.
• Content is any information available from a device, in the form of either structured or unstructured data.
• Applications (Apps) are resident on a device; these may produce structured, unstructured or semi-structured data.
• Knowledge may be learned or commonsense knowledge obtained from various sources.
The ontology explicitly outlines places, actions and environments as contextual information derived from physical sensors. Other types of contextual information are derived from applications, content and knowledge available
in the system. Domain specific ontologies are expected to
tie into this upper-level ontology for specific types of context information sources. For example, in the motivating use case
described, a calendar ontology (e.g., iCal RDF vocabulary)
may be a type of application context that is used to model the
calendar information and the context derived from it. The
goal of the ontology is to provide a common formalism to
enable reasoning about situations.
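As an informal illustration of how such an upper-level ontology and a domain tie-in could be encoded, the sketch below uses RDF/RDFS via the rdflib Python library. The namespace URI and the class and property names mirror Figure 1 but are chosen here for illustration; they are not an official vocabulary from the paper.

# Sketch: encoding part of the Figure 1 ontology with rdflib.
# The namespace is illustrative, not a published vocabulary.
from rdflib import Graph, Namespace, RDF, RDFS, Literal

SA = Namespace("http://example.org/situation-awareness#")
g = Graph()
g.bind("sa", SA)

# Upper-level classes; Place, Action, Environment and Time are kinds of Context.
for cls in ("Principal", "Role", "Situation", "Context", "Device", "Sensor",
            "Model", "Content", "App", "Knowledge"):
    g.add((SA[cls], RDF.type, RDFS.Class))
for sub in ("Place", "Action", "Environment", "Time"):
    g.add((SA[sub], RDFS.subClassOf, SA.Context))

# A few relations from the figure.
g.add((SA.hasRole, RDFS.domain, SA.Principal))
g.add((SA.hasRole, RDFS.range, SA.Role))
g.add((SA.hasContext, RDFS.domain, SA.Situation))
g.add((SA.hasContext, RDFS.range, SA.Context))
g.add((SA.occursAt, RDFS.domain, SA.Action))
g.add((SA.occursAt, RDFS.range, SA.Time))

# Domain tie-in for the use case: a meeting situation anchored at a place.
g.add((SA.alice, RDF.type, SA.Principal))
g.add((SA.meetingSituation, RDF.type, SA.Situation))
g.add((SA.meetingSituation, SA.hasContext, SA.meetingRoom))
g.add((SA.meetingRoom, RDF.type, SA.Place))
g.add((SA.meetingRoom, RDFS.label, Literal("meeting room")))

print(g.serialize(format="turtle"))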
Even though context artifacts provide a lot of information, they are typically isolated and do not directly lead to
situation-awareness. Fusing context artifacts for inference
at a statistical or probabilistic level produces richer context
artifacts, but still does not lead to a semantic understanding of the situation. In our use case, examples of context
artifacts are: indoor and outdoor location based on GPS or
WiFi (Place), motion states such as walking or driving based
on inertial sensors (Action), the ambiance of a user based
on inertial or audio sensors (Environment), the proximity to
other people (Principals) based on audio signatures or network interfaces (Environment), and many more. The gathering of such contextual information is by itself a hard problem which we do not address in the scope of this discussion.
Instead, we emphasize that such artifacts are only a stepping
stone towards situation-awareness.
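Before turning to reasoning, it may help to picture what a context artifact could look like as data. The sketch below is a hypothetical representation (the field names are ours, not the paper's): each artifact carries a type from the ontology, a value, a confidence from the learning step, and a record of its source for later provenance and privacy filtering.

# Hypothetical shape of a context artifact; field names are illustrative.
from dataclasses import dataclass, field
from typing import Any, Dict

@dataclass
class ContextArtifact:
    kind: str                 # ontology class, e.g. "Place", "Action", "Environment"
    value: Any                # e.g. "office", "walking", "crowded"
    confidence: float         # output of the machine-learning step, in [0, 1]
    source: str               # e.g. "GPS", "accelerometer", "calendar app"
    timestamp: float          # Unix time at which the artifact was derived
    meta: Dict[str, Any] = field(default_factory=dict)   # provenance, policy tags

walking = ContextArtifact(kind="Action", value="walking", confidence=0.87,
                          source="accelerometer", timestamp=1304500000.0)
at_office = ContextArtifact(kind="Place", value="office", confidence=0.72,
                            source="WiFi", timestamp=1304500000.0,
                            meta={"collected_during_work_hours": True})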
Reasoning over Situations

In this section, we focus on reasoning to bridge the gap between isolated context artifacts and situation awareness. Situations are typically defined over a set of context artifacts. A subset of context artifacts may be applicable to multiple situations, and this calls for consideration of sufficient artifacts in the reasoning process to infer a situation with high certainty. For example, the situation of being in a meeting is commonly defined as a gathering of two or more people (fact) that has been convened for the purpose (fact) of achieving a common goal (fact) through verbal interaction (fact), such as sharing information or reaching agreement (U.S. Bureau of Labor Statistics 2009). However, meetings may occur face to face or virtually, as mediated by communications technology (e.g., a videoconference). Thus, a meeting may be distinguished from other gatherings, such as a chance encounter (not convened), a sports game or a concert (verbal interaction is incidental), a party or the company of friends (no common goal is to be achieved) and a demonstration (whose common goal is achieved mainly through the number of demonstrators present, not verbal interaction).

Given a definition of a situation over contextual artifacts, the challenge remains to infer situations from those contextual artifacts (which are often derived in isolation). We start by formalizing the contextual artifacts and their relations using our ontology. Given this knowledge base, we illustrate rule-based reasoning to infer situations.

Table 1 illustrates an approach to express rules as first-order logic expressions, associating with each rule a probabilistic weight that indicates the likelihood that the inference is true given that the constraints or assumptions are satisfied. Inference can be performed over all possible worlds to determine the world with the highest likelihood. The variables and predicates over which the constraints and rules are defined are derived from the ontology. First-order logic is used only for illustration, and we acknowledge that there are other ways of representing the rules.

Table 1: Rule set to infer the location of an invitee of a meeting.
1. People who are invited to a meeting will go to the meeting.  Inv(x, m) ⇒ Part(x, m)  (weight 0.65)
2. People who are located at the meeting location are in the meeting.  At(m, l) ∧ At(x, l) ⇒ Part(x, m)  (weight 0.82)
3. People who answer their office phones are in their office.  Ans(x, p) ∧ At(p, l) ⇒ At(x, l)  (weight 0.98)
4. People who do not answer their office phone are not in their office.  ¬Ans(x, p) ∧ At(p, l) ⇒ ¬At(x, l)  (weight 0.81)
5. People are at the same location as their mobile phone.  At(m, l) ⇒ At(x, l)  (weight 0.94)

Recall the use case from before. Alice is running late to a meeting, and invitees should be notified if and only if they are likely to be in the meeting room before she arrives. For this, the system requires an understanding of who should be notified. If Bob is an invitee, the system can infer (without considering any other information) that Bob is at the meeting room based on Rule 1 (in Table 1). However, the likelihood supporting that inferred result is low. Given the location of Bob’s mobile phone, Bob can be placed at the same location based on Rule 5. Depending on the accuracy of the indoor location identification, the inference may still not have a high likelihood based on the place information. Acquiring other information, such as whether Bob answers his office phone, may not be a viable option; therefore, the system continues with reasoning over incomplete knowledge and infers with low confidence that Bob is at the meeting.

Challenges in Reasoning

The main challenges originate from uncertainty in the contextual facts, uncertainty in the inference rules, or from missing data. Uncertainty in the contextual artifacts comes from the fact that the artifacts themselves are derived from probabilistic information. Data may also be missing due to unavailability of data for a particular duration, due to known gaps in data for power-saving purposes, or due to privacy policies or preferences. Further, multiple contextual artifacts may lead to the same situation, or a subset of the contextual artifacts may be shared across multiple situations. Combined, these aspects present some key challenges in the reasoning process.
Privacy Sensitive Inference and Information Sharing
In any collaborative application, there is an element of information sharing. Designing a strategy for information sharing is a balance between efficient communication and privacy restrictions. Traditionally, this tradeoff is managed with
static privacy settings which are not situation sensitive. For
instance, in some situations, a user might be willing to disclose his or her location, but may not wish to disclose it all
the time. As a result, users end up with static privacy settings that are usually overly restrictive, and richer functionality cannot be offered. The ability to infer situations
allows for a better way of realizing more dynamic privacy
settings. However, in a system that is situation-aware, additional challenges arise that require privacy policies to be
applied even during the inference process.
Privacy in Situation Awareness
To properly address the privacy aspect of situation awareness, we present a shift in focus, relative to previous research on the concept of privacy (Corradi, Montanari, and Tibaldi 2004; Zhang and Parashar 2004; Toninelli et al. 2006). The focus switches from the “information access” of sensitive
information to the consequences of “information use” (Kagal and Abelson 2010). Recent privacy stories on the Web have mainly come from the inappropriate use of personal information with unintended consequences (Shih and Paradesi 2010); the situation on context-aware platforms is expected to be even worse. One observation is that users exhibit different degrees of sensitivity to information about them, and this concern is easily affected by the context in which the information is “collected” and in which it is “evaluated” (Sadeh et al. 2009).
We argue that privacy analysis is not limited to a single situation, but must also consider how each situation plays into a pattern of contexts that can be inferred by the observer. A situation, as shown in Figure 2, may be a snapshot of multiple context artifacts at a particular time. A situation may also be a pattern of context attributes over time; typically, this reveals behaviors of users and groups. From such behavioral patterns,
it is possible to construct user profile information, categorizing groups of users. Research work such as the “Reality
Mining” effort (Eagle and Pentland 2006) aggregates these
behavior-profiles and identifies an individual as belonging
to a specific social group. Such an analysis can often lead
to a loss of privacy, even if the individual context attributes or a single situation did not impact privacy on their own.
Most decisions to give away personal contexts
may be driven by the immediate benefit received from
situation-aware services. However, the purpose of information use should be clearly defined for both reasoning about
situations and sharing inferred context. The reasoning process should also be transparent and provide explanations of
the inferred situation as mentioned in (Weitzner et al. 2006).
The justifications provided by the program can help the user
to assess the quality and appropriateness of the inferences.
Architecture for Privacy in Situation Awareness Platform
Here, we would like to address privacy mechanisms in the “contexts” where personal information is collected, inferred,
shared, and evaluated. Figure 3 shows the architecture of a
privacy-sensitive situation awareness platform. In the development of a situation-aware system that uses data from a
variety of sources, three broad privacy issues arise:
a) Privacy of data collected for training to develop context-aware systems. Users must be offered privacy so that users who contributed training data cannot be identified. At a minimum, users must be offered the option of limiting when and where the data is collected.
b) Privacy of data used to infer contextual information. User preferences should guide how and what data is used for inference purposes.
c) Privacy in sharing the inferred context. The user should be able to manage permissions on what data and what inference is permitted to be shared between various requesting entities (users or applications).
We illustrate b) and c) in the context of our running example, next.

Figure 2: Privacy concerns in different levels of use of context information. [Figure omitted: individual users’ context artifacts are viewed at several levels: a combination of contexts (a situation), a pattern of contexts over time (a behavioral situation), and profiling of users based on context patterns.]

Figure 3: Architecture of privacy-aware context computing. [Figure omitted: sensor data, application data and Web data flow through data collection and data filtering under a user policy; a contextual privacy-aware module provides access control using contextual information, learning of context information (machine learning techniques), context computing (reasoning, provenance, context statistics, situation maintenance) and traceability and explanation; the output to the context consumer is the provenance, the inferred situation and the relevant contexts.]

User’s Policy as Privacy Constraints
A user should be able to specify how his or her data can
be collected, used and shared by the program in a situation
awareness platform. In addition, a user can use “context artifacts” or “situations” to specify the conditions under which
the collection of data or reasoning about data is valid. For
example, a policy from the user may state “do not collect
my location when I am not at the work place” or “any contextual information collected after working hours cannot be
used for making inferences about my relationship with my
colleagues”. In our scenario, Alice can set the rule to prohibit the program from collecting her location while she is
away for her doctor’s appointment. With that restriction, the
only inference that can be made might be “Alice is currently
away from her work place”. On the other hand, Alice can
set a policy to constrain the use of collected contexts while
away from the office, such that the reasoning engine will
skip this fact or it will not be shown in the explanation of
the inferred result. For example, if Alice prohibits the use of
her location to infer her arrival time, people will receive a
notice of “Alice being late to the meeting, but we have no
information about why or how we know it”. In this case, the
explanation that uses Alice’s location is “eclipsed” by the
privacy constraints.
Thus, aside from the context representations, a rich representation of the privacy constraints associated with various contexts also needs to be developed.
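One simple way to picture such policy constraints is as predicates applied to context artifacts before collection or before reasoning. The sketch below is hypothetical (the policy format and helper names are ours, not the paper's); it encodes Alice's "do not collect my location when I am not at the work place" rule as a collection-time filter.

# Hypothetical sketch of collection-time privacy constraints. A policy is a
# predicate over a context artifact; artifacts failing any applicable policy
# are dropped before they reach the reasoning engine.

def not_location_outside_work(artifact):
    """Alice's rule: location may only be collected at the work place."""
    if artifact["kind"] != "Place":
        return True                       # rule only constrains location
    return artifact["value"] in ("office", "meeting room")

POLICIES = [not_location_outside_work]

def filter_for_collection(artifacts, policies=POLICIES):
    return [a for a in artifacts if all(p(a) for p in policies)]

observed = [
    {"kind": "Place", "value": "doctor's office", "source": "GPS"},
    {"kind": "Action", "value": "walking", "source": "accelerometer"},
]
print(filter_for_collection(observed))
# Only the Action artifact survives; the reasoner never sees Alice's location,
# mirroring the paper's example in which at most "Alice is currently away from
# her work place" can be inferred.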
The system should be capable of tracing the data that led to a particular context being inferred, or to the sharing of particular data or contexts, and of providing an explanation for it. Indirectly, this supports accountability for the consumption of the data in question. While accountability in social situations and data sharing happens in different ways (or sometimes does not happen), traceability is important for understanding the implications and for diagnosing errors. This may lead to changes in the privacy interface, which will in turn be reflected as changes to privacy policies. The explanations help users to understand how inferences are made, in human-readable form, with some transformations from the justification tree. A user can create a policy to restrict how much explanation should be generated for an inferred situation, because sometimes even the explanations are sensitive. In our scenario, Alice can set privacy constraints on the use of the location context in explanations; thus, even when the location context is used in the inference, it will not be shown in the explanation.

Provenance of Inferred Situation and Contextual Information

Provenance information is important for data consumers to evaluate the quality of the information in order to use it appropriately. Two types of provenance information are required in situation awareness: the provenance of an inferred situation, and the provenance of a piece of contextual information. Because inferences can be made from multiple sources, it is important to understand more about the sources in order to justify the validity of an inferred situation. For example, Alice may have no problem with an inference of her being late to the meeting due to her current location and her schedule with the doctor. But it may be problematic if the same situation (Alice being late to the meeting) is inferred from a pattern of contexts showing that Alice always gets a cup of coffee at this time every day. The provenance of an inferred situation will show whether contexts collected about Alice are used appropriately, and will help the information consumer to evaluate the quality of the rule and the quality of the contexts.
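As a concrete, if hypothetical, picture of what could accompany an inferred situation when it is shared, the sketch below bundles the inference result with its provenance: the rule that fired, the contributing context artifacts, and a note about anything eclipsed by privacy constraints. The structure and field names are ours, chosen for illustration.

# Hypothetical provenance record attached to an inferred situation before it
# is shared with a context consumer; the fields are illustrative only.
from dataclasses import dataclass, field
from typing import List, Dict

@dataclass
class ProvenanceRecord:
    rule: str                              # which rule produced the inference
    rule_weight: float                     # the rule's probabilistic weight
    supporting_contexts: List[Dict]        # artifacts the rule consumed
    eclipsed_contexts: List[str] = field(default_factory=list)  # hidden by policy

@dataclass
class SharedSituation:
    situation: str
    confidence: float
    provenance: ProvenanceRecord

shared = SharedSituation(
    situation="Alice is running late for the meeting",
    confidence=0.7,
    provenance=ProvenanceRecord(
        rule="late if travel time exceeds time remaining before the meeting",
        rule_weight=0.8,
        supporting_contexts=[{"kind": "Time", "value": "13:40"},
                             {"kind": "App", "value": "meeting at 14:00"}],
        eclipsed_contexts=["Place"],       # location used but not shown, per policy
    ),
)
print(shared)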
Conclusion
In this paper, we motivate and illustrate the boundaries and
relations between concepts learned using machine learning
techniques and semantic representations. We also describe
how constraints arising from privacy preferences can be incorporated in the resulting model. We have shown how rules
can be expressed over the context artifacts represented in
the ontology to aid reasoning and inference of situations. We
have further shown that incorporating the privacy constraints
into the model leads to reasoning over missing or hidden data,
thereby bringing more challenges to the situation awareness
system.
We have been collecting datasets and developing machine-learning-based techniques to derive context artifacts. The data currently being collected come from a number of sensors (audio, GPS, WiFi, Bluetooth, etc.) and will incorporate application contexts going forward. As a next step, the extracted context artifacts will be used to instantiate the ontology elements, and experiments will be conducted on inferring situations via reasoning, using the framework described in this paper.
Acknowledgments
We would like to thank Hal Abelson, Jim Dolter, Lenny
Grokop, Jinwon Lee and Sanjiv Nanda for their valuable
contributions.
References
Chen, H.; Perich, F.; Finin, T.; and Joshi, A. 2004. SOUPA:
Standard ontology for ubiquitous and pervasive applications.
In International Conference on Mobile and Ubiquitous Systems: Networking and Services.
Corradi, A.; Montanari, R.; and Tibaldi, D. 2004. Context-based access control for ubiquitous service provisioning. In
COMPSAC, 444–451. IEEE Computer Society.
Dey, A. K. 2001. Understanding and using context. Personal
Ubiquitous Comput. 5(1):4–7.
Eagle, N., and Pentland, A. S. 2006. Reality mining: sensing complex social systems. Personal Ubiquitous Comput.
10:255–268.
Kagal, L., and Abelson, H. 2010. Access control is an inadequate framework for privacy protection. In W3C Privacy
Workshop.
U.S. Bureau of Labor Statistics. 2009. Meeting and convention planners. http://www.bls.gov/oco/ocos298.htm (Retrieved May 4, 2011).
Sadeh, N.; Hong, J.; Cranor, L.; Fette, I.; Kelley, P.;
Prabaker, M.; and Rao, J. 2009. Understanding and capturing people’s privacy policies in a mobile social networking application. Personal and Ubiquitous Computing
13(6):401–412.
Shih, F., and Paradesi, S. 2010. SaveFace: Save George’s
faces in social networks where contexts collapse. In
IAB/W3C Internet Privacy Workshop.
Toninelli, A.; Montanari, R.; Kagal, L.; and Lassila, O.
2006. A semantic context-aware access control framework
for secure collaborations in pervasive computing environments. In Cruz, I. F.; Decker, S.; Allemang, D.; Preist, C.;
Schwabe, D.; Mika, P.; Uschold, M.; and Aroyo, L., eds., International Semantic Web Conference, volume 4273 of Lecture Notes in Computer Science, 473–486. Springer.
Weitzner, D. J.; Abelson, H.; Berners-Lee, T.; Hanson, C.;
Hendler, J.; Kagal, L.; McGuinness, D. L.; Sussman, G. J.;
and Waterman, K. K. 2006. Transparent accountable data
mining: New strategies for privacy protection. Technical report, Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory.
Zhang, G., and Parashar, M. 2004. Context-aware dynamic
access control for pervasive applications. In Proceedings of
the Communication Networks and Distributed Systems Modeling and Simulation Conference, 21–30.