Visualizing Mailbox Yoo Ah Kim Abstract CMSC 838B Information Visualization

CMSC 838B Information Visualization
Visualizing Mailbox
Yoo Ah Kim
Min-ho Shin
ykim@cs.umd.edu
mhshin@cs.umd.edu
Department of Computer Science
University of Maryland
Abstract
Electronic mail is one of the most popular
computer applications. As the number of emails
we exchange grows rapidly, managing a large
volume of electronic messages becomes more and
more important. In addition, patterns in email
data may reveal useful information, including a
user's personal history. We propose two
visualizations of an email dataset: a time-based
view and a thread-based view. The time-based
view displays messages in a two-dimensional
table whose rows are people and whose columns
are received/sent times. To scale to large
volumes of data, we use dynamic queries and
zooming. The thread-based view shows emails
that belong to the same thread. It shows all
senders who participated in a thread and the
messages in time order, together with the
relations among those messages.
Keywords: Electronic mails, time-based view,
thread-based view, scalability
Introduction
Electronic mail is now one of the most popular
computer applications. As the number of emails
we exchange grows rapidly, managing a large
volume of electronic messages becomes more and
more important. Although email was invented for
asynchronous communication, it is also used for
other purposes such as task management and
personal archiving. In addition, patterns in
email data may reveal useful information,
including a user's personal history. However, no
existing visualization adequately serves these
purposes.
In this paper, we propose two visualizations of
an email dataset to help users perform these
tasks: a time-based view and a thread-based
view. The time-based view displays messages in a
two-dimensional table whose rows are people and
whose columns are received/sent times. To scale
to large volumes of data, we use dynamic queries
and zooming. The view also provides sort, filter,
and aggregate functions to help users find the
information they need. The thread-based view
shows emails that belong to the same thread.
Threads are created when users send mails using
the "reply" menu. The thread-based view shows
all senders who participated in a thread and the
messages in time order, together with the
relationships among those messages.
Design Goals

View sent/received email patterns
With an email dataset, users may want to see
mail patterns over time. Interesting questions
include who sent the most emails in a certain
period, or when a person sent emails most
frequently. To see patterns in a large volume of
messages, the scalability problem must be
solved. We use dynamic queries, zooming,
aggregation, filtering, and gradation to cope
with this problem.

Find people and emails related to each
other
Emails can be threaded using "reply", and several
users may participate in a mail thread. It would
be useful to see all participating users, who
sent or received each email in the thread, and
the relations among them.

Search information in the mailbox
Emails are used as personal archives for finding
information in the future. Several studies [8]
showed that semantic hierarchies using folders,
currently the predominant scheme, are not
suitable for this task because it is difficult for
users to organize mail folders properly and to
figure out which mail folder holds the mail they
need. Because people can easily recall the
sender and the approximate sent/received time of
a message, the time-based view can help users
find a mail they need. The thread-based view also
makes it easy to extract related information by
providing all messages in the same thread.
Related Work
Timestore [1] [9] organizes messages by time
and sender in a two-dimensional grid, as shown
in Figure 1. Messages are displayed as dots
whose size encodes the number of messages. It
allows narrowing of the search space using
full-text searching. Its authors also merged it
with a task and calendar management system.
Timestore focuses on time-based archiving and
retrieval of emails.
Figure 1. Timestore
Outlook 2000 also has a time-based view (Figure
2). It displays all messages with their subjects
at their received time, without aggregating by
date or considering senders. Because it uses a
fixed width for a day and shows every message
with its subject, the view can become cluttered
and hard to understand when there are too many
messages. When many emails arrive within a short
time period, it expands the y-axis to list them.
Figure 2. Outlook 2000
Threading is necessary to help manage
conversation history and track the status of
conversations in email [8]. Many systems have
been developed to visualize conversations in chat
programs and instant messaging services
[2][3][4][5][7]. Netscan thread trees display
conversation threads for newsgroups. But
visualizing email threads is more difficult
because both senders and receivers are important
and there are two kinds of messages, incoming
and outgoing, unlike in newsgroups.
Figure 3. Netscan Thread Tree
Time-based Visualization

Features
In this view, we display messages in a
two-dimensional grid whose rows are the email
addresses of people and whose columns are dates,
as shown in Figure 4. Each cell holds the
messages that the corresponding person sent or
received on the given date. We encode the number
of messages as the height of a bar in a bar chart
or as the gradation of a spot.
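The grid described above can be sketched as a
mapping from (address, date) cells to message
counts. This is a minimal illustration, not our
actual implementation; the Message record and
the build_grid helper are hypothetical names
introduced only for this example.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Message:
    # Hypothetical record; the field names are assumptions for illustration.
    address: str    # correspondent's email address (the grid row)
    day: date       # sent/received date (the grid column)
    incoming: bool  # True if the mailbox owner received the message


def build_grid(messages):
    """Count messages per (address, day) cell."""
    return Counter((m.address, m.day) for m in messages)


messages = [
    Message("alice@example.org", date(2001, 3, 5), True),
    Message("alice@example.org", date(2001, 3, 5), False),
    Message("bob@example.com", date(2001, 3, 6), True),
]
grid = build_grid(messages)
# Each count would drive a bar height or the gradation of a spot.
print(grid[("alice@example.org", date(2001, 3, 5))])
```

Keeping the incoming flag on each record would
allow the same grid to be built separately for
incoming and outgoing mail.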
The first section shows the email addresses of
people who sent or received mails. The second
section shows the total number of mails each
person sent/received, using a bar chart. Users
can choose whether to see incoming mails,
outgoing mails, or both.

Users can choose a date level (date, month, or
year) at which messages are aggregated. When
messages are aggregated by date, vertical lines
appear at each week to help users see weekly
patterns. Rows can be sorted by email address,
domain name, or message count. The view can
filter people whose email address contains a
given substring; filtering by domain name is an
especially interesting query. It is also
possible to search messages by email address or
subject.

Scalability
- Bar chart vs. Gradation
To see the number of messages in each grid cell
more accurately and compare it with others, a bar
chart is more helpful. But if there are many
people on the screen and the time range is very
long, it is difficult to show patterns using bar
charts. For the case of many people and a long
period, we provide another view using gradation.
Each cell has a spot, and the gradation of the
spot represents the number of messages. This
view gives a good overview of messages in terms
of people and date. While incoming and outgoing
messages can be shown simultaneously in the bar
chart through color coding, spots show only the
total number of messages for the chosen option.
Figure 5 shows the view using bar charts.
- Dynamic Query
To manage large datasets, we also use dynamic
queries on people and date. A dynamic query
filters and zooms the range of data so that
users can easily find the data they want to see
(Figure 6). If users change a range, the data in
the range is fit to the screen and data outside
the range is hidden. By moving the slider bar,
users can see the hidden data, too. Labels such
as addresses or dates adapt dynamically to the
chosen range, displaying more detailed
information as the view is zoomed in.
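As a rough sketch of two of the scalability
mechanisms in this section, the code below
filters a grid to a selected date range, as a
dynamic query slider would, and maps message
counts to gray levels for the gradation view.
The function names and the linear
count-to-shade mapping are assumptions made for
illustration; the mapping used in our tool may
differ.

```python
from datetime import date


def visible_cells(grid, start, end):
    """Keep only the cells whose date lies inside the selected range."""
    return {cell: n for cell, n in grid.items() if start <= cell[1] <= end}


def gradation(count, max_count):
    """Map a count to a gray level: 255 (white) for none, 0 (black) for the maximum.

    The linear scale is an assumption, not taken from the paper.
    """
    if max_count <= 0:
        return 255
    return round(255 * (1 - min(count, max_count) / max_count))


grid = {
    ("alice@example.org", date(2001, 3, 5)): 4,
    ("bob@example.com", date(2001, 3, 20)): 1,
    ("bob@example.com", date(2001, 4, 2)): 2,
}
# Selecting March hides the April cell, as moving the range slider would.
march = visible_cells(grid, date(2001, 3, 1), date(2001, 3, 31))
peak = max(march.values())
shades = {cell: gradation(n, peak) for cell, n in march.items()}
```

Scaling against the maximum of the visible cells
rather than of the whole dataset keeps contrast
high after zooming, at the cost of shades that
change meaning as the range changes.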
Message Selection
When the mouse is placed over a cell, the
information for the cell (person and date) is
shown. Users can see detailed information by
clicking the right mouse button on the cell. A
pop-up window appears with a list of the
messages in the cell. Each entry shows the
message subject and the number of messages in
the thread to which it belongs. To see the
thread view for a message, users choose an
individual message in the list. Figure 7 shows
the pop-up window for message selection.
Thread-based Visualization
The thread view shows the relations among
messages, as shown in Figure 8. For a chosen
message, we find all messages that are related
to it and display them with all the people who
participated in the thread. The rows are people,
and messages are listed in order of
received/sent time. Note that, unlike in
newsgroup data, both senders and receivers are
important.
We represent senders as big red rectangles and
receivers as small blue circles. Arrows connect
the sender and the receivers of the same mail.
If a mail is a reply to another mail, a second
kind of link connects the two mails, drawn as
thick red lines in Figure 8. We divide the time
axis by date to make the timing of messages
easier to understand.
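One way to turn a thread into the marks and
links described above is sketched below. It is
an illustration under assumptions, not the code
of our tool: the Mail record, the layout_thread
helper, and the tuple formats are all
hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Mail:
    # Hypothetical record for one message in a thread.
    msg_id: str
    sender: str
    receivers: list
    timestamp: int                  # e.g. seconds since the epoch
    reply_to: Optional[str] = None  # msg_id of the mail this one answers


def layout_thread(mails):
    """Return (rows, marks, arrows, reply_links) for one thread.

    rows maps each person to a row index; columns follow time order.
    """
    ordered = sorted(mails, key=lambda m: m.timestamp)
    people = []
    for m in ordered:
        for person in [m.sender, *m.receivers]:
            if person not in people:
                people.append(person)
    rows = {person: i for i, person in enumerate(people)}
    column = {m.msg_id: c for c, m in enumerate(ordered)}

    marks, arrows, reply_links = [], [], []
    for m in ordered:
        c = column[m.msg_id]
        marks.append(("sender", rows[m.sender], c))      # big red rectangle
        for r in m.receivers:
            marks.append(("receiver", rows[r], c))       # small blue circle
            arrows.append((rows[m.sender], rows[r], c))  # sender to receiver
        if m.reply_to in column:
            reply_links.append((column[m.reply_to], c))  # thick red line
    return rows, marks, arrows, reply_links


thread = [
    Mail("m1", "owner@example.org",
         ["alice@example.org", "bob@example.com"], 100),
    Mail("m2", "alice@example.org", ["owner@example.org"], 200,
         reply_to="m1"),
]
rows, marks, arrows, reply_links = layout_thread(thread)
```

A renderer would then draw each mark at its
(row, column) position and add date separators
along the time axis.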
Problems in Visualization
For outgoing mails, the receivers are the
important parties because the sender is always
the mailbox owner. A mail may have more than one
receiver, so the same message may appear several
times in the time-based view. The visualization
may therefore show more messages than really
exist. In some sense, however, we can regard
this as several messages with the same contents
being sent to different receivers.
Threads can be detected only if users write
messages using "reply", which adds reply
information to the email headers. Sometimes,
however, users send emails that are replies to
other mails without using it. In this case, we
would have to consider subjects, contents, and
the sender/receiver group, but it is much more
difficult to find the correct information this
way. "Forward" information could also be useful
for constructing threads, but it is not
available in our implementation because it is
not part of the standard email headers.
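Header-based thread detection can be sketched
with Python's standard email package. The text
above says only that "reply" adds reply
information to the headers; using the standard
Message-ID and In-Reply-To fields is an
assumption of this sketch, and thread_roots is a
hypothetical helper, not part of our
implementation.

```python
from email import message_from_string


def thread_roots(raw_messages):
    """Map each Message-ID to the Message-ID at the root of its reply chain."""
    parent = {}
    for raw in raw_messages:
        msg = message_from_string(raw)
        msg_id = msg["Message-ID"]
        if msg_id is None:
            continue  # cannot be linked without an identifier
        reply_to = msg["In-Reply-To"]  # None when the header is absent
        parent[msg_id.strip()] = reply_to.strip() if reply_to else None

    def root(msg_id):
        seen = set()  # guards against malformed, cyclic reply chains
        while parent.get(msg_id) in parent and msg_id not in seen:
            seen.add(msg_id)
            msg_id = parent[msg_id]
        return msg_id

    return {msg_id: root(msg_id) for msg_id in parent}


raw_messages = [
    "Message-ID: <1@example.org>\nSubject: Meeting\n\nbody",
    "Message-ID: <2@example.org>\nIn-Reply-To: <1@example.org>\n"
    "Subject: Re: Meeting\n\nbody",
    "Message-ID: <3@example.org>\nIn-Reply-To: <2@example.org>\n"
    "Subject: Re: Meeting\n\nbody",
    # Written as a new mail, not with "reply": no In-Reply-To header.
    "Message-ID: <4@example.org>\nSubject: Re: Meeting\n\nbody",
]
roots = thread_roots(raw_messages)
```

The fourth message shows the limitation
discussed above: it carries a "Re:" subject but
no reply header, so it ends up as the root of
its own thread.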
If the same person uses several email addresses,
we cannot detect this. In particular, if users
are on a mailing list, we cannot discover this
from mailboxes alone. In this case, users should
be able to specify which email addresses
actually belong to the same person and merge the
data related to them.
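The proposed merge step could look like the
following sketch, in which a user-supplied alias
table folds several addresses into one person.
The merge_aliases helper and the shape of the
alias table are hypothetical; this feature is
not part of our implementation.

```python
from collections import Counter


def merge_aliases(counts, aliases):
    """Merge per-address message counts using a user-supplied alias map.

    aliases maps an address to the canonical address of the same person;
    addresses that are not listed are kept unchanged.
    """
    merged = Counter()
    for address, n in counts.items():
        merged[aliases.get(address, address)] += n
    return merged


counts = {
    "alice@example.org": 12,
    "alice@example.com": 3,
    "bob@example.com": 5,
}
# Specified by the user; it cannot be inferred from the mailbox alone.
aliases = {"alice@example.com": "alice@example.org"}
merged = merge_aliases(counts, aliases)
```

Applying the map before the grid is built would
place both of a person's addresses in a single
row of the time-based view.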
Future Work
In our visualization, users can view data in
many ways using filter, sort, search, and so on.
But they may also want to edit or annotate
messages for future use. This function would be
especially useful for email datasets. For
example, users may want to mark a message as
needing a reply or as a reminder for a future
task.
Search currently works only on the subject and
the sender/receivers. It would also be useful to
search message contents. Specifically, we might
want to find messages that contain a URL, an
email address, or attached files.
In the time-based visualization, we can
aggregate or filter people based on the domain
name of their email addresses. Other kinds of
aggregation and filtering become possible if we
define groups of people in various ways. For
example, we can form a group based on a thread,
or users may define groups such as family,
friends, and colleagues. More generally, it
would be good to connect this visualization with
databases that hold information about people,
and to filter and aggregate people based on
those databases.
We can think of another useful view of emails: a
group-based visualization. Email exchange
patterns give useful information about relations
between people. We may group people based on how
frequently they appear in the same thread and
visualize those groups as graphs.
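As a sketch of how such a graph might be
derived, the code below counts how often each
pair of people appears in the same thread and
keeps the pairs above a threshold as edges. The
co_occurrence helper and the threshold of two
shared threads are assumptions for illustration
only, since this view is future work.

```python
from collections import Counter
from itertools import combinations


def co_occurrence(threads):
    """Count, for every pair of people, the number of threads they share."""
    weights = Counter()
    for participants in threads:
        for a, b in combinations(sorted(set(participants)), 2):
            weights[(a, b)] += 1
    return weights


# Each inner list holds the participants of one thread.
threads = [
    ["alice", "bob", "carol"],
    ["alice", "bob"],
    ["carol", "dave"],
]
edges = co_occurrence(threads)
# Keep only pairs that share at least two threads (arbitrary threshold).
strong = {pair for pair, weight in edges.items() if weight >= 2}
```

The weighted edges could then be passed to any
graph layout method, with groups appearing as
densely connected clusters.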
Conclusion
We proposed two visualizations of an email
dataset: a time-based view and a thread-based
view. The time-based view displays messages in a
two-dimensional table whose rows are people and
whose columns are received/sent times; each cell
holds a list of the messages for that person and
time. To manage large volumes of data, we used
dynamic queries, zooming, and gradation in this
view. This view shows users the temporal email
exchange patterns of their correspondents. The
thread-based view shows emails exchanged using
"reply". It displays all senders who
participated in the thread and the messages in
time order, together with the relations among
those messages. This view is helpful for viewing
the history and tracking the status of a
conversation about the same topic.
Acknowledgements
We would like to thank Jihwang Yeo and
Hyunmo Kang for their valuable comments.
References
[1] Baecker, R., Booth, K., Jovicic, S.,
McGrenere, J., and Moore, G., "Reducing the Gap
Between What Users Know and What They Need to
Know".
[2] Donath, J., Karahalios, K., and Viegas, F.,
"Visualizing Conversations", In Proceedings of
HICSS 32, January 5-8, 1999.
[3] Rodenstein, R. and Donath, J. S., "Talking
in Circles: Designing a Spatially-Grounded
Audioconferencing Environment", In Proceedings
of CHI 2000, pp. 81-88.
[4] Smith, M. A., Cadiz, J. J., and Burkhalter,
B., "Conversation Trees and Threaded Chats", In
Proceedings of the 2000 ACM Conference on
Computer Supported Cooperative Work.
[5] Smith, M. A. and Fiore, A., "Visualization
Components for Persistent Conversations", ACM
SIGCHI 2001.
[6] Shneiderman, B., "Dynamic Queries for Visual
Information Seeking", IEEE Software, 11(6),
70-77.
[7] Viegas, F. B. and Donath, J. S., "Chat
Circles", In Proceedings of CHI '99, 1999.
[8] Whittaker, S. and Sidner, C., "Email
Overload: Exploring Personal Information
Management of Email", In Proceedings of the
Conference on Human Factors in Computing
Systems '96.
[9] Yiu, K., Baecker, R. M., Silver, N., and
Long, B., "A Time-based Interface for Electronic
Mail and Task Management", In Design of
Computing Systems: Proceedings of HCI
International '97, Volume 2, Elsevier, 1997,
19-22.