
A SemanticLog – Semantic logging of user interaction evidence
A.1 Basic Information
Effective personalization and (automatic) user modeling rely on the availability and
quality of evidence about a user's interaction with an adaptive system. It is therefore
imperative that precise data about user actions are recorded and that their associated
semantics are preserved for further processing by inference agents in the user modeling
process.
Traditional logging approaches focus on server-side logs produced by web servers,
which contain data about incoming client requests (e.g., timestamps, URLs). SemanticLog
takes a different approach: events are logged by the actual application that processes
requests into responses, both on the server side and on the client side.
A.1.1 Basic Terms
Event ontology – An ontology that describes the events that can occur in a system as a
result of user interaction, together with their attributes and their mapping onto
specific users and user sessions.
Semantic logging – The logging of defined events resulting from user actions (user
interaction), together with semantic metadata that describe exactly which events
occurred and with which attributes, based on an event ontology.
A.1.2 Method Description
The logging method used by SemanticLog is based on three principles:
• Representing the events via an event ontology, which describes their possible
types and attributes in a clear, concise and machine-readable way while
preserving their semantics for further use by (third-party) inference agents.
• Logging of events by the respective application modules, which "know" best
what actually happened within the application, i.e., what events happened and
what their context and semantics were.
• Logging of both client-side and server-side events and their subsequent
integration into a single stream of events for individual users on a per-session
basis.
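The third principle can be illustrated with a minimal sketch. It assumes a simplified event with only a session identifier, a timestamp, a type and a source; the class names LoggedEvent and EventStreamMerger are hypothetical and are not part of SemanticLog.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified event used only for this sketch.
final class LoggedEvent {
    final String sessionId;
    final long timestampMillis;
    final String type;   // e.g. "SelectRestriction"
    final String source; // "client" or "server"

    LoggedEvent(String sessionId, long timestampMillis, String type, String source) {
        this.sessionId = sessionId;
        this.timestampMillis = timestampMillis;
        this.type = type;
        this.source = source;
    }
}

final class EventStreamMerger {
    // Groups client-side and server-side events by session and orders each
    // session's events by timestamp, giving one stream per session.
    static Map<String, List<LoggedEvent>> mergeBySession(List<LoggedEvent> clientEvents,
                                                         List<LoggedEvent> serverEvents) {
        List<LoggedEvent> all = new ArrayList<>(clientEvents);
        all.addAll(serverEvents);

        Map<String, List<LoggedEvent>> streams = new LinkedHashMap<>();
        for (LoggedEvent event : all) {
            streams.computeIfAbsent(event.sessionId, key -> new ArrayList<LoggedEvent>()).add(event);
        }
        for (List<LoggedEvent> stream : streams.values()) {
            // List.sort is stable, so events with equal timestamps keep their input order.
            stream.sort((a, b) -> Long.compare(a.timestampMillis, b.timestampMillis));
        }
        return streams;
    }
}
```

Ordering by timestamp is one possible integration rule; it presumes that client and server clocks are comparable, which the text does not state.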
The core of the logging method is the event ontology, which describes events and their
attributes in a flexible and extensible way (Fig. 1), so that they can be used for
subsequent analysis of user behavior for personalization. Each event is associated with
exactly one user and one user session, which supports multiple simultaneous user
sessions (bottom left). For each event, a recursive set of associated eventAttributes
can be stored, describing the event and its context in more detail (bottom right). Both
events and their attributes have defined types.
[Figure 1: relational diagram of the tables users, sessions, events, typesOfEvents,
eventAttributes, typesOfEventAttributes, displayStates, displayedItems,
typesOfDisplayedItem, displayedItemAttributes and typesOfDisplayedItemAttributes, with
their keys and cardinalities.]
Fig. 1. The relational data model corresponding to the used event ontology schema.
To further support the processing of logs by (user model) inference agents, the state of
the user interface at the time an event occurred is also logged. This includes the
displayState before and after the event (left). Each display state is composed of
displayedItems and their attributes, which together make up the user interface visible
to the user and thus capture the state of the world on which the user based the
decision to cause the event (top right).
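The entities described above can be mirrored in memory roughly as follows. This is a minimal sketch: the class and field names are assumptions derived from the table names in Fig. 1, displayed-item attributes are flattened to a simple map, and none of this is SemanticLog's actual object model.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative mirror of the Fig. 1 entities; all names are assumptions.
final class EventAttribute {
    final String type;  // name of a row in typesOfEventAttributes
    final String value;
    final List<EventAttribute> children = new ArrayList<>(); // recursive nesting

    EventAttribute(String type, String value) {
        this.type = type;
        this.value = value;
    }

    EventAttribute addChild(EventAttribute child) {
        children.add(child);
        return this;
    }

    // Counts this attribute plus all attributes nested below it.
    int size() {
        int total = 1;
        for (EventAttribute child : children) {
            total += child.size();
        }
        return total;
    }
}

final class DisplayedItem {
    final String type; // name of a displayed item type
    final Map<String, String> attributes = new LinkedHashMap<>(); // simplified attributes

    DisplayedItem(String type) {
        this.type = type;
    }
}

final class DisplayState {
    final List<DisplayedItem> items = new ArrayList<>();
}

final class SemanticEvent {
    final String userLogin;
    final long sessionId;
    final String type; // name of a row in typesOfEvents
    final long timestampMillis;
    final List<EventAttribute> attributes = new ArrayList<>();
    DisplayState fromState; // user interface before the event
    DisplayState toState;   // user interface after the event

    SemanticEvent(String userLogin, long sessionId, String type, long timestampMillis) {
        this.userLogin = userLogin;
        this.sessionId = sessionId;
        this.type = type;
        this.timestampMillis = timestampMillis;
    }
}
```

An inference agent working on such a model can compare fromState and toState to see what the user had in view when the event was triggered.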
A.1.3 Scenarios of Use
SemanticLog can be used to log and integrate user interaction evidence in the form of
semantic events occurring both on the server side and on the client side of a system.
The possible usage scenarios are as follows:
• Client-side user interaction logging, e.g., moving the mouse, hovering over links.
• Server-side user interaction logging, which is usually associated with the
processing of client requests into responses sent back to the client. For example, a
faceted browser can log application-specific server-side events such as
SelectRestriction, based on which links the user clicks in the user interface.
SemanticLog should not be used in the following cases:
• There is no event ontology available for the respective application.
• There is no consumer for the logged data.
A.1.4 External Links and Publications
• Tvarožek, M., Barla, M., & Bieliková, M. (2007). Personalized Presentation in
Web-Based Information Systems. In J. Van Leeuwen, G. F. Italiano, W. van der
Hoek, H. Sack, C. Meinel, & F. Plášil (Eds.), Lecture Notes in Computer Science:
Proceedings of SOFSEM 2007 - Theory and Practice of Computer Science. LNCS
4362, pp. 796-807. Harrachov, Czech Republic: Springer-Verlag, Berlin
Heidelberg.
• Andrejko, A., Barla, M., Bieliková, M., & Tvarožek, M. (2006). Softvérové
nástroje pre získavanie charakteristík používateľa [Software tools for acquiring
user characteristics]. In P. Vojtáš, & T. Skopal (Eds.),
Proceedings of DATAKON '06, (pp. 139-148). Brno, Czech Republic.
• Hibernate, object-relational mapper, (http://www.hibernate.org/).
• Log4J, Java-based logging utility, (http://logging.apache.org/log4j), Apache
Software Foundation.
• Tomcat, a Java servlet container, (tomcat.apache.org), Apache Software
Foundation.
A.2 Integration Manual
SemanticLog is developed in Java (Standard Edition 5) as a library that uses
relational persistence via Hibernate and MySQL. It should be used in conjunction with
other applications that need to log user interaction evidence for further processing. The
distribution of SemanticLog consists of these parts:
• A jar archive, which contains the binary files of the logging service.
• A MySQL script used to initialize the MySQL database schema.
• A set of configuration files, which initialize Log4j logging, describe the
notification and use of external inference agents (LogAnalyzer), and define the
mapping rules for Hibernate.
A.2.1 Dependencies
SemanticLog uses these external tools and libraries:
• LogAnalyzer user model inference agent, which is notified and/or invoked when
new events are ready to be processed.
• UserLogs library, which facilitates relational persistence for the event ontology
model via Hibernate.
• Hibernate for relational persistence.
• Log4J logging utility.
A.2.2 Installation
Deploying SemanticLog with other applications involves these steps (a Java Integrated
Development Environment should be used):
1. Including the SemanticLog jar archive into the project.
2. Adding the SemanticLog and dependency jar archives to the classpath.
3. Adding the optional SemanticLog.properties file to the root project directory.
4. Adding the logging configuration to the log4j.properties file and the database
connection data to the hibernate.cfg.xml file.
A.2.3 Configuration
SemanticLog must be configured to use a suitable relational database
(hibernate.cfg.xml), and Log4J service (log4j.properties).
The optional SemanticLog.properties file contains the following values:
• CallLogAnalyzer – true if LogAnalyzer should be invoked (default = false).
• LogAnalyzerMessageThreshold – the number of events after which LogAnalyzer is
invoked if it is enabled (default = 1).
The log4j.properties file should contain configuration for the semanticlog logger; the
hibernate.cfg.xml file must contain connection data to a valid and initialized MySQL
database, where log data is to be stored.
Additionally, the metadata (event types and their respective attribute types) should be
initialized in the database before first use, as they are required during the logging
process.
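How the two optional values and their defaults could be read is sketched below. It assumes that SemanticLog.properties uses the standard Java properties syntax with the value names as keys; the class SemanticLogSettings is hypothetical and is not the library's own configuration code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical settings holder; only the two key names and their defaults
// are taken from the text above.
final class SemanticLogSettings {
    final boolean callLogAnalyzer;
    final int logAnalyzerMessageThreshold;

    SemanticLogSettings(Properties props) {
        // A missing key falls back to the documented default.
        callLogAnalyzer = Boolean.parseBoolean(
                props.getProperty("CallLogAnalyzer", "false").trim());
        logAnalyzerMessageThreshold = Integer.parseInt(
                props.getProperty("LogAnalyzerMessageThreshold", "1").trim());
    }

    // Parses properties text; an empty text yields the defaults.
    static SemanticLogSettings parse(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            // Not expected for an in-memory reader.
            throw new IllegalStateException("Cannot read properties", e);
        }
        return new SemanticLogSettings(props);
    }
}
```

For example, the text "CallLogAnalyzer=true" followed by "LogAnalyzerMessageThreshold=5" on the next line would enable LogAnalyzer and invoke it after every fifth event.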
A.2.4 Integration Guide
SemanticLog is used to log events caused by user interaction. The typical usage scenario
involves first a user login (method LogEventUserLogin), then zero or more application
events (method LogEvent), and finally a user logout (method LogEventUserLogout). For a
detailed description of the individual logging methods, see the accompanying javadoc
documentation.
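The call order of this scenario can be sketched with a small in-memory stand-in. Only the three method names are taken from the text; the parameters, the return types and the class RecordingSemanticLog itself are assumptions made for illustration, so the javadoc remains the authority on the real signatures.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in that records calls in memory. It is NOT the
// SemanticLog API; parameters and behavior are assumed for this sketch.
final class RecordingSemanticLog {
    private final Set<String> openSessions = new HashSet<>();
    private final List<String> entries = new ArrayList<>();

    // Opens a session for the given user.
    void LogEventUserLogin(String sessionId, String login) {
        if (!openSessions.add(sessionId)) {
            throw new IllegalStateException("Session already open: " + sessionId);
        }
        entries.add(sessionId + ":UserLogin:" + login);
    }

    // Records an application event within an open session.
    void LogEvent(String sessionId, String eventType) {
        requireOpen(sessionId);
        entries.add(sessionId + ":" + eventType);
    }

    // Closes the session; later events for it are rejected.
    void LogEventUserLogout(String sessionId) {
        requireOpen(sessionId);
        openSessions.remove(sessionId);
        entries.add(sessionId + ":UserLogout");
    }

    List<String> entries() {
        return new ArrayList<>(entries);
    }

    private void requireOpen(String sessionId) {
        if (!openSessions.contains(sessionId)) {
            throw new IllegalStateException("No open session: " + sessionId);
        }
    }
}
```

A typical sequence would call LogEventUserLogin once, LogEvent for each application event such as SelectRestriction, and LogEventUserLogout at the end of the session.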
Error handling
Most errors originate either from bad configuration and/or initialization (e.g., a
non-existent database or missing event or attribute types), or from caching, connection
pooling and session issues in MySQL used by associated tools.
If an error is too serious or unknown and thus cannot be reasonably handled, an
exception is thrown. Otherwise, a log entry is made via log4j.
A.3 Development Manual
A.3.1 Tool Structure
SemanticLog consists of a single package: sk.fiit.nazou.semanticlog.
A.3.2 Method Implementation
The implementation of SemanticLog is relatively straightforward. Its operation consists
of two primary stages. First, during initialization, the existing event, attribute and
display state types are loaded. Next, during normal operation, SemanticLog waits for
incoming events and logs them as they arrive.
The respective logging methods first create an object model of the incoming event in
memory and then store it in a relational database via Hibernate. For faster processing,
a set of hash maps is used to speed up lookups of event types, attribute types and
sessions.
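The lookup idea can be sketched as a hash map from type name to identifier that is filled at initialization and consulted before falling back to a slower source. The class TypeCache and its loader function are hypothetical; the loader stands in for a database query and is not how SemanticLog is known to load its types.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache of type identifiers keyed by type name.
final class TypeCache {
    private final Map<String, Long> idsByName = new HashMap<>();
    private final Function<String, Long> loader; // stand-in for a database lookup
    private int loaderCalls;

    TypeCache(Map<String, Long> preloaded, Function<String, Long> loader) {
        idsByName.putAll(preloaded); // types loaded during initialization
        this.loader = loader;
    }

    // Returns the identifier of a type, asking the loader only on a cache miss.
    long idFor(String typeName) {
        Long id = idsByName.get(typeName);
        if (id == null) {
            loaderCalls++;
            id = loader.apply(typeName);
            if (id == null) {
                throw new IllegalArgumentException("Unknown type: " + typeName);
            }
            idsByName.put(typeName, id);
        }
        return id;
    }

    int loaderCalls() {
        return loaderCalls;
    }
}
```

Rejecting unknown names with an exception matches the note above that non-existent event or attribute types are a common source of errors.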
A.3.3 Enhancements and Optimization
Originally, SemanticLog was implemented as a Web service running in the Axis Web
service container. However, during evaluation we identified a significant performance
bottleneck caused by the constant serialization and deserialization of displayStates.
The design was therefore changed to a library that is invoked directly, which removed
this bottleneck. However, this quick workaround largely negated the original intention
to integrate events from various sources, since each source effectively has its own
instance of SemanticLog (i.e., events from the same session may not be combined).
To fix this problem we devised, but did not implement, a hybrid approach that would use
a Web service for the initial logging of basic event data (user, session, event) and a
jar library for the logging of complex displayState data (only available on the server
side).
Otherwise, the performance of SemanticLog is near optimal, except for event evaluation
via LogAnalyzer, which is invoked synchronously and thus slows down the logging
process. An interesting optimization would therefore be to invoke LogAnalyzer
asynchronously, possibly as a Web service.
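One way to realize the asynchronous invocation is sketched below, using a single background thread from java.util.concurrent. The class AsyncAnalyzerTrigger is hypothetical, the analyzer is represented by a plain Runnable rather than the real LogAnalyzer interface, and the threshold mirrors the LogAnalyzerMessageThreshold setting only by assumption.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical trigger that runs the analyzer off the logging thread.
final class AsyncAnalyzerTrigger {
    private final int threshold;
    private final Runnable analyzer; // stand-in for a LogAnalyzer invocation
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private int pending;

    AsyncAnalyzerTrigger(int threshold, Runnable analyzer) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be >= 1");
        }
        this.threshold = threshold;
        this.analyzer = analyzer;
    }

    // Called by the logging path after an event has been stored.
    // Returns immediately; the analyzer runs on the worker thread.
    synchronized void eventLogged() {
        pending++;
        if (pending >= threshold) {
            pending = 0;
            worker.submit(analyzer);
        }
    }

    // Stops accepting work and waits for queued analyzer runs to finish.
    boolean shutdownAndWait(long timeoutMillis) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A single worker thread keeps analyzer runs in submission order, which avoids concurrent runs over the same log data at the cost of queuing when the analyzer is slow.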
A.4 Manual for Use in Other Application Domains
Using SemanticLog in another application domain requires only the definition of the
events, event types and the corresponding attributes, attribute types and display states
relevant for that domain. This can be done without any code or configuration changes
by preparing the required data offline and then populating the log database, so no
application changes are required.
A.4.1 Configuration for use in Other Application Domains
No domain-specific configuration, other than the one described in the configuration
section, is necessary. Only the definition of the respective events and corresponding
entities in the database is required.
A.4.2 Dependencies
Apart from LogAnalyzer, all dependencies are effectively domain independent.
Furthermore, LogAnalyzer itself is not strictly required by the logging method as
such; it is a dependency of convenience, so that log evaluation is invoked
automatically when new data are available. Any other inference agent specific to a
particular domain can be used instead of LogAnalyzer, provided that it accepts the
created logs as input (i.e., "understands" the event ontology).
The domain dependency of LogAnalyzer is described in detail in its own documentation.
It may require some domain-specific adjustments, although the inference rules it uses
depend more on the navigation model of a faceted browser than on the application
domain itself. Where additional domain-specific rules are used, corresponding
adjustments may be useful.