Using Map-Based Visual Interfaces to Facilitate
Knowledge Discovery in Digital Libraries
Olha Buchel
Faculty of Information and Media Studies
University of Western Ontario
oburchel@uwo.ca
Kamran Sedig
Department of Computer Science
Faculty of Information and Media Studies
University of Western Ontario
kamrans@uwo.ca
ASIST 2011, October 9-13, 2011, New Orleans, LA, USA.
ABSTRACT
In recent years there has been growing interest in
supporting knowledge discovery activities using map-based
visual interfaces. The goal is promising and ambitious, but
not easy to achieve due to the lack of understanding of
cognitive factors involved in how information is
transformed into knowledge. In this paper we present a
map-based visual interface, VICOLEX (VIsual COLlection
Explorer), aimed at facilitating and supporting knowledge
discovery and users’ cognitive activities by means of
integrated visual representations coupled with interactions.
Keywords
Map-based visual interfaces, design, knowledge discovery,
visual representation of georeferenced collections, digital
libraries.
INTRODUCTION
MacEachren et al. (1999) suggested the possibility
of integrating geographic visualizations with knowledge
discovery (KD). The possibility of bringing these two
research areas together sounds ambitious, yet promising.
The two disciplines can both contribute insight to the joint
venture. One of the possible areas of investigation that can
emerge from this joint venture is the development of
map-based visual interfaces (MVIs) that can support KD
activities. At the outset of such investigation, it seems that
the main concern of both research areas is the following:
discovering useful knowledge within given information (in
this paper, we use the terms data and information
interchangeably). This concern involves such tasks as
discovering patterns in large volumes of data, identifying
new patterns of data distribution and dispersion,
formulating hypotheses based on observed patterns and
trends, and finding new unsuspected correlations and
relationships (Fayyad, Grinstein, & Wierse, 2002;
MacEachren et al., 1999). This observation, however, does not
result in a simple borrowing of ideas and methods from the
two areas, and does not translate to readymade, easy design
choices.
Fayyad et al. (2002) and MacEachren et al. (1999)
emphasize the interactive and iterative nature of KD. In KD
humans need to interpret information and make many
decisions in the process of refining knowledge. From the
point of view of cognition, information interpretation and
decision making are complex activities in their own right.
In feature interpretation, for example, users have to link
abstract representations of data with their own prior
knowledge. Although a great portion of interpretation takes
place in the human mind, people often help their thinking
by performing small external actions with information such
as selecting, filtering, rearranging, reformulating, and
simplifying representations (Kirsh, 2009). At first glance,
such actions might seem superfluous, but their value can be
better understood when they are considered in the context
of some activity (e.g., in the context of performing KD
activities using highly cluttered maps). To cope with
complexity and to interpret encoded information, users’
visual system samples visual information on maps through
inherently selective perceptual acts that direct attention
to restricted regions of the visual field. By processing maps
selectively, people visually extract a pool of hot spots (Amit
& Geman, 1999; Yang, Yuan, & Wu, 2007), suppress the
distracters, apply spatial filters, group similar items, and
perform operations on entire groups (e.g., reject, classify)
(Luck & Hillyard, 1994). Carrying out all these operations
in the mind by relying solely on vision is difficult. For this
reason, when people work with paper maps they often
perform many external actions: they fold maps in order to
better focus on hot spots; they annotate them; they mark
locations of interest; and they carry out other actions (Knapp,
1995). This example demonstrates that with external actions
people help their vision and prepare information for
higher-level cognitive activities such as interpretation,
decision making, and KD. Therefore, MVIs that are intended to
facilitate KD activities should provide users with
mechanisms by which they can act upon visual
representations (MacEachren et al. 1999).
The goal of this paper is to examine the role of interactions
with visual representations in KD activities. In particular,
we explain the value of interactions in MVIs as front ends
to complex digital libraries (DLs).
BACKGROUND
An MVI is made of interactive representations that provide
access to information and facilitate and support a KD
activity. A representation here refers to an integrated set of
visual encodings of entities (such as documents and
locations) and their properties. Such representations can
take various forms: maps, graphs, tables, and so on. The
main representation in a georeferenced collection is a map.
However, as an information space can be very complex, a
map can only encode a subset of the space’s entities and
relationships. As a result, other information elements and
relationships can be encoded and communicated using
different representations. These representations can then be
integrated to work as a unit at the interface level. For
example, in georeferenced DLs, other representations, such
as tables and graphs, can be placed on top of a map to
communicate other aspects and properties of information.
Even though representations encode information elements
and their relationships and properties, all static
representations have limitations: each can support only
certain tasks and answer only certain questions. Finally,
due to the amount and complexity of encoded information,
a representation may become cluttered and dense, and
hence ineffective at communicating information. This is
certainly true of maps representing complex DL collections.
To compensate for some of the limitations of static
representations and to increase their utility, MVIs
should support users’ actions by means of
computer interactions. Computer interactions have two
components: actions and reactions (Fast & Sedig,
Accepted). A user acts upon a representation, and the
representation reacts and gives a response. An interface can
reduce its complexity and density by making certain
representations of information latent. Interactions, then,
allow users to perform physical actions on the interface
that bring latent information to a more observable
level, simplifying the mental unpacking and elaboration
associated with representations (Kirsh, 2003). More
specifically, interactions enable different properties,
relations, and layers of static visual representations to be
probed and made available on demand, thereby making the
information representations better suited to the individual
and contextual needs of users. This can potentially enhance
users’ ability to explore, navigate, and transform different
elements and features of map-based visual representations,
all of which are important cognitive tasks in KD activities.
Besides information latency, information context also plays
an important role in KD. Any particular object, document,
data, or event can be informative only under certain
circumstances depending on the inquiry and on the
expertise of the inquirer (Buckland, 1991). It follows from
this that the designers of visual representations, whose goal
is to facilitate KD, have to surmise situations of information
use. In this paper we take the position that situations can
be predicted for particular contexts, and interactions can
support situational use of information. Interactions serve as
the glue that binds a series of low-level actions to support
different situational tasks that can be performed with
representations. In this sense, interactions allow information
to behave dynamically and situationally so that it can
facilitate users’ needs more effectively. This in turn plays
an important role in transforming information into personal
knowledge in KD situations.
PROTOTYPE COLLECTION
Our testbed collection is about the local history of Ukraine.
It comprises 349 MAchine Readable Cataloguing
(MARC) book records from the Library of Congress
Catalogue. All these records have call numbers that belong
to the DK508 class of the Library of Congress Classification.
This class contains placenames and has many MARC
records linked to them via call numbers. Among the
selected MARC records are the entire collections of
records for 32 Ukrainian cities, which we treat as
sub-collections of the whole collection. This collection is highly
contextual: documents in this collection are interconnected
by subjects and have similarities in bibliographic
descriptions, forms/genres, languages, and places of
publication. Context in this collection is inferred from the
ontological properties of documents in the collection such
as physical descriptions, languages, subjects, and authors.
All of the above-mentioned properties were chosen to be
visually represented.
VICOLEX
In this section we present our prototype MVI, VICOLEX
(VIsual COLlection Explorer). VICOLEX is designed to
allow users to explore georeferenced collections. It is
designed with close attention to representations and
interactions, with the purpose of making collection
structure more salient, providing users with multiple
perspectives on the data, and thereby facilitating KD and
sense making.
Representations
We chose a variety of representations, each of which
presents the collection from a different perspective. More
specifically, all metadata records were mapped onto Google
Maps (GM) (see Figure 1). Each GM marker shows the number of
metadata records in one sub-collection. Since some
collections for individual locations have quite a large
number of records (e.g., Lviv – 78, Kyyiv – 92), additional
graphical representations were used to represent ontological
properties of sub-collections. An example is shown in
Figure 1. The scatter plot is utilized for showing book
heights, number of pages, and languages (Figure 1.a); the
pie chart, for displaying languages (Figure 1.b); the
histogram, for showing years of publication (Figure 1.c);
the embedded map, for visualizing places of publication
(Figure 1.d); the Kohonen map, for representing subjects
(Figure 1.e); and the tag cloud, for displaying authors
(Figure 1.f).
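As a rough illustration of how the marker counts and the input to a per-location language chart could be derived from the metadata, the following Python sketch groups records by place. The `Record` fields, the function names, and the sample records are hypothetical and invented for illustration; they are not taken from VICOLEX or from the actual collection.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    place: str     # location the book is about (sub-collection key)
    language: str  # language of the book
    year: int      # year of publication


# Invented sample records; not data from the actual collection.
RECORDS = [
    Record("Lviv", "Ukrainian", 1995),
    Record("Lviv", "Polish", 1988),
    Record("Lviv", "Ukrainian", 2001),
    Record("Kyyiv", "Russian", 1979),
    Record("Kyyiv", "Ukrainian", 1998),
]


def marker_counts(records):
    """Number of records per place: the figure a map marker would show."""
    return Counter(r.place for r in records)


def language_counts(records, place):
    """Language counts for one sub-collection: the input to a pie chart."""
    return Counter(r.language for r in records if r.place == place)


counts = marker_counts(RECORDS)
lviv_languages = language_counts(RECORDS, "Lviv")
```

Keeping the per-marker count and the per-property counts as separate functions mirrors the split between the main map and the additional representations attached to each marker.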
Figure 1. Representing collections on Google Maps.
Overall, VICOLEX has 193 representations. These
representations encode the entities in the prototype
collection and their properties. They help
users gain insight into aspects of the collection
that are hidden from view on the main map. Each
representation encodes only a small portion of the
information about the collection and supports only specific
tasks, hence making the main map in VICOLEX less
cluttered. Some representations assign additional meaning
to data (e.g., the histogram of years explains years of
publication in terms of historical periods). Each set of
representations for a location acts as a storybook about
that sub-collection, covering its subjects, years of
publication, languages, book sizes, authors, and places of
publication.
Interactions
Despite obvious advantages, the above approach of using
different representations to communicate properties and
entities in the collection still has shortcomings. In
particular, it is difficult to understand how properties are
related to each other; how they are distributed spatially and
temporally in the collection; how properties of the
collections can be combined and viewed together; and how
people can adapt VICOLEX’s MVI to their own needs. To
overcome these shortcomings in VICOLEX, we augment
representations with interactions, particularly linking,
filtering, selecting, and grouping, which we discuss next.
Filtering
Filtering allows users to sift out document properties. Users
can query the ontological properties of a collection by
ticking checkboxes and by setting limits on timelines
that show time of acquisition and publication (shown in
Figure 1). Property-based filtering reduces the complexity
of high-dimensional data, reduces clutter, gives users
flexibility in selecting properties, and generates a number of
easy-to-understand displays, each focused clearly on a
particular aspect of the underlying data. In general, filtering
helps inhibit the processing of task-irrelevant information,
increases the speed and accuracy of information processing,
and reduces the cognitive effort required to complete
property-related tasks (Enns & Akhtar, 1989). In VICOLEX, the
results of filtering can be observed not only on the surface
of the map, but also on the representations of ontological
properties of individual sub-collections that are linked to
markers. Because of filtering on the map, the
representations of properties in sub-collections become
more legible and easier to understand. Such filtering allows
completing tasks not only at the level of information
entities, but also at the level of properties.
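Property-based filtering of this kind can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the VICOLEX implementation: the `Record` fields, the `filter_records` function, and the sample records are all hypothetical. A set of allowed values stands in for ticked checkboxes, and an inclusive year range stands in for the limits set on a timeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    place: str
    language: str
    subject: str
    year: int  # year of publication


# Invented sample records; not data from the actual collection.
RECORDS = [
    Record("Lviv", "Ukrainian", "History", 1995),
    Record("Lviv", "Polish", "History", 1988),
    Record("Kyyiv", "Russian", "Description and travel", 1979),
    Record("Kyyiv", "Ukrainian", "Biographies", 1998),
]


def filter_records(records, languages=None, subjects=None, years=None):
    """Keep records that pass every active filter.

    languages / subjects: sets of allowed values (ticked checkboxes).
    years: (first, last) inclusive range (timeline limits).
    A filter left as None places no constraint on the records.
    """
    kept = []
    for r in records:
        if languages is not None and r.language not in languages:
            continue
        if subjects is not None and r.subject not in subjects:
            continue
        if years is not None and not (years[0] <= r.year <= years[1]):
            continue
        kept.append(r)
    return kept


ukrainian_recent = filter_records(RECORDS, languages={"Ukrainian"}, years=(1991, 2011))
```

Because the filter returns plain records, the same result could feed both the map markers and the per-marker property charts, which is one way the filtered state might stay consistent across representations.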
Selecting
Variable selection and feature extraction are regarded as
crucial steps in KD (Fayyad, Grinstein, & Wierse, 2002).
Selecting objects with certain properties from unnamed
geographic areas (e.g., north or south of some region) from
MVIs can be quite challenging because such regions are
rarely described in systems explicitly. To facilitate this type
of selection, VICOLEX allows selecting regions with
markers by drawing a bounding box around markers with a
drag-and-drop rectangle corner technique (Figure 2.b).
This selection is intended to give the MVI a sandbox-like
feel, with the capability to dynamically adjust
properties of objects, since such a selection can be
performed both on an entire collection and on a
filtered collection. For example, a user can make visible
only books about history and select only those from
Western Ukraine using the bounding box (Figure 2.a and
b). Properties that are suppressed by filtering cannot be
selected with the bounding box. Moreover, the area
selection mechanism in VICOLEX is coupled with a
grouping interaction, which results in representing the
selected documents with the same set of additional
representations as documents that are linked to individual
markers (Figure 2.c). Such selections with groupings can be
useful for answering questions such as the following: a) In
which area of Ukraine do collections have more
illustrations? b) Are places of publication in collections
about small locations different from places of publication
in collections about large locations? c) Is there a
difference in subjects in collections about different parts
of Ukraine?
Figure 2. Example of selection with filtering and
grouping.
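The combination of bounding-box selection and grouping described in the Selecting section can be sketched as follows. This is an illustrative Python sketch under assumptions, not the VICOLEX code: the `Marker` type, the function names, and the linked records are hypothetical, the coordinates are approximate, and the box test ignores boxes that cross the antimeridian.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Marker:
    place: str
    lat: float  # degrees north
    lon: float  # degrees east
    languages: Tuple[str, ...]  # language of each record linked to the marker


# Coordinates are approximate; the linked records are invented.
MARKERS = [
    Marker("Lviv", 49.84, 24.03, ("Ukrainian", "Polish", "Ukrainian")),
    Marker("Kyyiv", 50.45, 30.52, ("Ukrainian", "Russian")),
    Marker("Kharkiv", 49.99, 36.23, ("Russian",)),
]


def in_box(marker, south, west, north, east):
    """True if the marker lies inside the box (no antimeridian handling)."""
    return south <= marker.lat <= north and west <= marker.lon <= east


def select_and_group(markers, south, west, north, east):
    """Select markers inside the box and pool their records into one group."""
    selected = [m for m in markers if in_box(m, south, west, north, east)]
    pooled = Counter()
    for m in selected:
        pooled.update(m.languages)
    return [m.place for m in selected], pooled


# A box drawn roughly over the western part of the map.
places, languages = select_and_group(
    MARKERS, south=48.0, west=22.0, north=51.0, east=27.0
)
```

Passing an already filtered list of markers to `select_and_group` would reproduce the behaviour in which properties suppressed by filtering cannot be selected: the box only ever sees what remains visible.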
KNOWLEDGE DISCOVERY USING VICOLEX
In this section, we briefly discuss how representations of
the entities in the prototype collection along with the
implemented interactions in VICOLEX support KD. We
report a number of discoveries that we made using
VICOLEX, particularly with regard to changes in the
collection during the 1980s and 1990s. In general, the
discoveries can be classified as quantitative and qualitative.
Quantitative
One of the things that we were able to discover was that
more than half of the entire collection was published after 1991.
This is evident from filtering the main map by the years of
publication: before 1990 and after 1991. Second, we found
that books published prior to 1990 rarely included maps.
Moreover, books with maps published
before 1990 are about large cities only, whereas books with
maps published after 1991 are about both small and large places.
Third, the number of publications in Ukrainian significantly
increased after 1991. Fourth, books in Polish about Ukraine
were nonexistent before 1981. Beginning in 1981, however, the
number of publications in Polish started increasing
gradually, especially about Lviv. Fifth, after 1991
certain subjects started to demonstrate significant growth
(e.g., Biographies, Archaeological Excavations).
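Comparisons of this kind amount to splitting the records at a cutoff year and counting properties on each side. The Python sketch below shows one way to do that; the function names and the sample data are hypothetical and invented for illustration, and treating the cutoff year itself as part of the later period is an assumption, since the text does not state how boundary years were handled.

```python
from collections import Counter

# Invented (language, year of publication) pairs; not the actual collection.
BOOKS = [
    ("Ukrainian", 1978), ("Russian", 1985), ("Polish", 1988),
    ("Ukrainian", 1993), ("Ukrainian", 1996), ("Polish", 1999),
    ("Ukrainian", 2003),
]


def period(year, cutoff=1991):
    """Label a year as 'earlier' or 'later' (cutoff year counts as later)."""
    return "earlier" if year < cutoff else "later"


def counts_by_period(books, cutoff=1991):
    """Count books per (period, language) pair."""
    counts = Counter()
    for language, year in books:
        counts[(period(year, cutoff), language)] += 1
    return counts


def share_later(books, cutoff=1991):
    """Fraction of books published in or after the cutoff year."""
    later = sum(1 for _, year in books if year >= cutoff)
    return later / len(books)


by_period = counts_by_period(BOOKS)
later_share = share_later(BOOKS)
```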
Qualitative
The majority of qualitative changes are associated with
subjects. Just as the number of published books increased
after 1991, the variability of subjects became greater after
1991 too. Subjects that emerged after 1991 are “Ethnic
Relations,” “Ukrainian, Nationalism,” “Minorities,”
“Jews,” “Economic conditions,” “International Executive
Service Corps,” “Vinnytsia Massacre, Vinnytsia, Ukraine,
1937-1938,” “Political Prisoners,” “Rehabilitation,”
“Political Prosecutions,” “Prisoners of War,” “Massacres,”
and others. Many of these subjects were banned during the
period when Ukraine was part of the Soviet Union, and
therefore they do not appear in books published before
1990. Second, books about locations with a
population smaller than 200,000 people appear to be
smaller and fewer in total than books about larger locations.
Third, Russian-language books are more prevalent in the
East and South than in the West. In addition, we were able
to discover a few sub-collections with unusual language
distributions, featuring languages other than Ukrainian and Russian.
CONCLUSIONS
In this paper, we have presented VICOLEX, a prototype
front-end interface that provides ample support for users to
perform KD by means of interacting with MVIs of library
collections. The representations used in VICOLEX
include maps, pie charts, scatter plots, and tag clouds.
The multiplicity of representations is intended to keep
information latent so as not to overwhelm users with too much
information at once. The latent information remains hidden
until users interact with it. The interactions
presented in this paper include linking, filtering, and selecting.
These interactions are intended to support KD activities. As
such, they simplify interpretation and understanding of
information in various situations, facilitate transformation
of information into personal knowledge for users, and
ultimately support higher-level KD activities. The
VICOLEX conceptualization can be utilized in the design
of front ends to complex DLs with georeferenced
collections. With numerous representations coupled with
interactions, DLs will become more suitable for KD.
REFERENCES
Amit, Y., and Geman, D. (1999). A Computational Model
for Visual Selection. Neural Computation, 11, 7,
1691-1715.
Buckland, M. (1991). Information and information systems.
New York, NY: Greenwood Publishing Group, Inc.
Enns, J. T., and Akhtar, N. (1989). A Developmental Study
of Filtering in Visual Attention. Child Development
(60), 1188-1199.
Fast, K., and Sedig, K. (Accepted). Interaction and the
epistemic potential of digital libraries. International
Journal of Digital Libraries.
Fayyad, U., Grinstein, G., and Wierse, A. (2002).
Information visualization in data mining and
knowledge discovery. London, UK: Academic Press.
Kirsh, D. (2009). Interaction, External Representations and
Sense Making. Proceedings of the 31st Annual
Conference of the Cognitive Science Society. Austin,
TX: Cognitive Science Society.
Knapp, L. (1995). A task analysis approach to the
visualization of geographic data. In T. L. Nyerges et al.
(Eds.), Cognitive aspects of human-computer
interaction for geographic information systems (pp.
355-371). Springer Verlag.
Luck, S. J., and Hillyard, S. A. (1994). Spatial Filtering
During Visual Search: Evidence From Human
Electrophysiology. Journal of Experimental
Psychology, 20, 5, 1000-1014.
MacEachren, A. et al. (1999). Constructing Knowledge
From Multivariate Spatiotemporal Data: Integrating
Geographic Visualization (GVis) with Knowledge
Discovery in Database (KDD) Methods. International
Journal of Geographical Information Science, 13 (4),
311-334.
Swanson, L. (1986). Organization of mammalian
neuroendocrine system. In V. Mountcastle, F. E.
Bloom, & S. Geinger (Eds.), Handbook of physiology.
Sec. 1, The nervous system, Vol. IV, Intrinsic
regulatory systems of the brain (pp. 317-363).
Bethesda, MD: American Physiological Society.
Yang, M., Yuan, J., and Wu, Y. (2007). Spatial selection
for attentional visual tracking. Computer Vision and
Pattern Recognition, 1-8.