Display-agnostic Hypermedia
Unmil P. Karadkar, Richard Furuta, Selen Ustun, YoungJoo Park,
Jin-Cheon Na*, Vivek Gupta, Tolga Ciftci, Yungah Park
Center for the Study of Digital Libraries and
Department of Computer Science
Texas A&M University
College Station, TX 77843-3112
Phone: +1-979-845-3839
*Division of Information Studies,
School of Communication & Information,
Nanyang Technological University,
31 Nanyang Link, Singapore 637718
Phone: +65-6790-5011
furuta@csdl.tamu.edu
tjcna@ntu.edu.sg
ABSTRACT
In the diversifying information environment, contemporary
hypermedia authoring and filtering mechanisms cater to specific
devices. Display-agnostic hypermedia can be flexibly and
efficiently presented on a variety of information devices without
any modification of their information content. We augment
context-aware Trellis (caT) by introducing two mechanisms to
support display-agnosticism: development of new browsers and
architectural enhancements. We present browsers that reinterpret
existing caT hypertext structures for a different presentation. The
architectural enhancements, called MIDAS, flexibly deliver rich
hypermedia presentations coherently to a set of diverse devices.
Categories and Subject Descriptors
H.5.4 [Information Interfaces and Presentation]: Hypertext/Hypermedia – architectures.
General Terms
Design, Human Factors
Keywords
Display-agnostic Hypermedia, Multi-device Integrated Dynamic
Activity Spaces (MIDAS), context-aware Trellis (caT)
1. INTRODUCTION
Over the last decade, the characteristics of information access
devices have grown dramatically more diverse. We have seen the
emergence of small, mobile information appliances on one hand
and the growth of large, community-use displays like SmartBoard
[27] and Liveboard [8] on the other. Desktop computer displays
also sport a variety of display resolutions. While PDAs and cell
phones are widely used for Web access, several other devices like
digital cameras [21] and wristwatches [24] are acquiring network
interfaces to become viable options for information access. These
devices vary in terms of characteristics such as their display real
estate, network bandwidth, processing power and storage space.
Organic LED (OLED) displays that can be tailored for individual
applications and embedded into various daily use items will soon
be widely available [15], thus further diversifying the display
properties of information appliances.
Despite the diversity in appliance characteristics, most Web pages
are created and optimized for viewing from desktop computers.
To address this issue, a significant body of research has focused
on developing methods to tailor this information for presentation
on mobile devices. Projects like WebSplitter [14], Power Browser
[5], Proteus [3] and the Content Extractor [13] filter Web content
to facilitate its presentation on mobile devices. Popular Web
portals like Yahoo! [35], The Weather Channel [30], and CNN
[6] also provide interfaces and services for mobile devices.
Typically, the Web and mobile services are based on independent
architectures and retrieve information from a common data store.
While this approach caters to mobile devices, it requires the
service providers to maintain multiple system architectures and
synchronize content across these architectures. These services
must be periodically reconfigured in order to accommodate new
devices and changes to characteristics of existing devices or risk
losing their patronage. Furthermore, these mobile services, much
like Web site design practices, focus on delivering information to
specific classes of devices.
The pro-desktop bias of the Web information access model is not
limited to technology alone. As most desktop computers are
located in home or office environments, this model inherently
assumes that Web access clients browse the information from
these environments. Mobile service architectures, whether they
filter information or replicate services, focus solely upon the
technological issues regarding information delivery. However, the
needs and expectations of mobile users are different from those of
desktop users. Few present-day models tailor the modality of
delivery based on characteristics of the surrounding environment
without explicit action from the user.
In this paper we present two approaches to separating the
information content of context-aware Trellis (caT) [20] hypertexts
from their mode of presentation. The first approach involves
development of new browsers that reorient and repurpose the
hypertext content for novel presentations. The second approach
enhances the caT architecture to support dynamic integration and
co-use of devices with different characteristics for rich
information interactions. This enhanced architecture is dubbed
Multi-device Integrated Dynamic Activity Spaces (MIDAS). To
accommodate differences in the strengths of devices that render
them, MIDAS separates the information content from the mode of
its presentation. MIDAS-based hypertexts take the form that their
rendering device can best present and are thus display-agnostic.
MIDAS supports co-use of the various devices available to a user: devices that users carry with them, like cell phones, PDAs, pagers, and notebook and tablet computers, and, in some cases, publicly available desktop computing resources in airports and malls, all augmenting the information delivery environment. While the smaller mobile devices may be individually restricted by their physical characteristics, MIDAS can use them in combination with others to overcome their individual limitations and make feature-rich presentations. For instance, a user who carries a cell phone and a networked PDA may view annotated illustrations even when neither of these devices has enough display space to visually render this information. The cell phone may aurally render the annotation
while the PDA displays the corresponding images. Textual
annotations could be rendered easily as audio via freely available
software, such as the Festival Speech Synthesis System [10], in
order to overcome the lack of display space. While MIDAS jointly
uses the cell phone and PDA for the presentation, this association
is temporary and extends for the duration of this presentation.
The rest of the paper is organized as follows: in the following
section we review the work this research builds upon. The next
section presents our approaches to tackle the issues involved in
presenting hypermedia content effectively over multiple devices.
We then describe the MIDAS architecture and discuss how it
connects to other relevant research projects. We conclude this
paper with directions for continuing our work.
2. context-aware Trellis (caT)
The context-aware Trellis (caT) hypertext model [20], an
extension of Trellis [28], affords simultaneous rendering of
diverse, multi-modal presentations of information content via a
variety of browsers. caT documents can be presented differently to
various users, based on their preferences, characteristics, and a
wide variety of environmental factors such as their location, time
of access and the actions of other users perusing the document
[11]. The caT model differs from that of the Web in several
respects. We highlight the salient differences between these
models as we describe the caT interaction model.
In Figure 1, two users, John and Bob, are simultaneously
browsing a hypertext from a caT server. John is browsing sections
of the document from his desktop computer via two different
browsers and from his notebook computer via yet another
browser. Bob is accessing parts of this document from his
notebook computer via two browsers. While each browser may
present different but related sections of the document, it is equally
likely that John is viewing the same section of the document via
two of his browsers. Unlike Web browsers, caT allows its
browsers a great deal of flexibility in presenting information.
While all Web browsers render a given document identically, caT
browsers present documents differently based on the properties of
the browser. The caT server only tells the browsers what to
present but leaves the finer aspects of the presentation to the
browsers. Browsers have some flexibility in deciding how to
present this information. Thus, John may actually be viewing a
part of the document in multiple media formats; while browser A
displays images, browser C may present information textually,
and browser B may only present information that can be
tabulated. caT also supports synchronized information
presentation to a set of browsers. Bob may thus watch a video of a
product demo in browser D, while browser E presents the salient
points about each feature as it is presented in the video.
The other interesting aspect of this interaction is that user actions
are reflected in all browsers currently viewing the information,
even if they belong to different users. When John follows a link in
one of his browsers the caT server, unlike a Web server,
propagates the effects of this action to all five browsers connected
to it. While this action will almost certainly affect the display of
John’s other browsers, it also has the potential to influence Bob’s
browsing session. If John and Bob were browsing a Web
document, the effect of John’s action would be reflected in
Browser C alone. Typically, it would not affect other browsers
whether they belong to the same user or another, and whether they
run on the same computer or a different one.
The browsing experiences of caT users may vary depending upon
a variety of environmental factors, for example, location. If Bob is
working from home he may see the document differently than
John, who may be working from his office. The document may
also be shown differently depending upon personal characteristics
such as the roles they play in their organization. John, being a
developer, may see the technical details of a project, while Bob, a
project manager, may get a quick overview of the status of various project deliverables.

Figure 1: caT interaction model and state management
In the WWW hypertext model, individual browsers maintain the
state of browsing via techniques that are generally reliant upon
cookies. Closing a Web browser often results in loss of state and
the user must restart the browsing session. In contrast, the caT
model maintains the browsing state for all users on the server.
Browsers may connect or leave the server without affecting their
browsing state. If John were to open another browser to view this
document, it would instantly reflect his current state in the new
browser as well. In fact, John could close all his browsers, return
the next day and continue browsing from where he left off today.
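The essence of this model fits in a few lines. The following sketch (names hypothetical; the actual caT engine operates on Petri-net markings and per-user contexts rather than an opaque state object) shows server-held browsing state and the propagation of one browser's action to every connected browser:

class CaTServer:
    """Minimal sketch: browsing state lives on the server, not in browsers."""
    def __init__(self, hypertext):
        self.hypertext = hypertext                 # hypertext specification
        self.state = hypertext.initial_state()     # server-side browsing state
        self.browsers = []                         # all connected browsers, any user

    def connect(self, browser):
        # A newly connected browser instantly reflects the current state.
        self.browsers.append(browser)
        browser.render(self.state)

    def follow_link(self, link):
        # Effect the action on the server-side state, then propagate the
        # result to every browser, even those belonging to other users.
        self.state = self.hypertext.fire(self.state, link)
        for b in self.browsers:
            b.render(self.state)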
caT allows users to view hypertext materials from different
devices in a variety of modes. This flexibility makes caT an ideal
vehicle for building display-agnostic hypermedia.
3. BROWSER DEVELOPMENT
Display-agnostic hypermedia structures naturally lend themselves
to multiple forms of presentation. We include display-agnosticism
in hypertexts in two different ways: by developing browsers that
can interpret information content in diverse ways and via
architectural enhancements to caT.
We have expanded caT's repertoire of browsers with an audio-video browser [29] and a spatial browser. Before we developed these browsers, caT supported textual and image browsers and a Web interface that presents text and image composites [20]. The audio-video browser renders textual information aurally, thus providing a different rendering for existing information. The spatial browser renders a hypertext's information contents as widgets on a canvas.
3.1 Audio-Video Browser
The audio-video browser serves a two-fold purpose: it renders
audio and video information associated with caT hypertexts; it
also renders textual materials aurally [29]. Auditory browsing
serves as the primary browsing interface for visually impaired
users; sighted users can also use it in conjunction with other
browsers to avail themselves of an additional mode. This browser
uses the Xine multimedia player for audio playback [32]. It
generates audio from text via the Festival Speech Synthesis
System [10], which includes English as well as Spanish voices
and supports the integration of MBROLA [18] voices for greater
versatility.
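For illustration, text-to-speech rendering of this kind can be scripted around Festival's text2wave utility. This is a sketch, not the browser's actual code path, and the player invocation varies by installation:

import subprocess
import tempfile

def speak(text, player="xine"):
    # Write the text to a temporary file for Festival's text2wave utility.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write(text)
        txt_path = f.name
    wav_path = txt_path.replace(".txt", ".wav")
    # Synthesize speech; text2wave ships with the Festival distribution.
    subprocess.run(["text2wave", txt_path, "-o", wav_path], check=True)
    # Hand the result to an audio player (flags vary by installation).
    subprocess.run([player, wav_path], check=True)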
The user interface employs a simple keyboard-based interaction
mechanism and confirms user input via audio prompts if the user
so chooses. The user interface works in two modes—navigation
and audio content. Initially the browser starts up in the navigation
mode. This mode supports users in browsing the hypertext by
following links and selecting the pages to visit. Once a user visits
a page, she has the option to listen to the contents of the page. The
audio content mode is initiated when the information associated
with the page is presented to her. This mode provides an interface
for controlling the content presentation to suit her preferences.
She can play or pause the rendering and skip or skim the contents
in either direction. Ending the audio content mode returns her to
the navigation mode.
The selection of sounds and voices to use is crucial for helping
users differentiate between audio prompts, user action
confirmation and content presentation. The interface employs
recorded audio prompts for notification of user actions and synthesized voice for audio prompts and content rendering.

Table I: Common keys and commands

Key    Associated command
H      Help
I      Information
T      Notification (toggle)
Esc    Break out of the current audio stream
Frequently used actions are mapped to the numeric keypad so that
the most common functions are grouped together. Other actions
are mapped to letters that best represent them. Some inputs are
available to the user regardless of the mode. Table I displays the
inputs available to users in both navigation and audio content
modes. The ‘T’ key toggles whether to provide audio feedback of user
actions. The escape key is used to break out of the active audio
stream, whether it is file contents, help menu, or information
options. The actions “help” and “information” both return
context-sensitive information. The help feature reminds the users
of various key mappings available in that mode. The information
command presents a brief summary of the user’s current context.
In the navigation mode, the user hears information about her location in the hypertext and the actions available to her. In the audio content mode it returns information about the page contents, for example, the name of the file being presented, the duration of its audio presentation, and the user's current position in the file.
Table II displays the commands available to a user in the
navigation mode. The up and down arrow keys are used to cycle
through the list of available links. The right and left arrow keys let
the users cycle through the list of active pages. The ‘P’ and ‘L’
keys present a complete listing of the pages and links available.
The ‘S’ key returns the description or summary associated with
the current page. The user may select a page or link from the
presented list by its number via the numeric keys located above
the alphabetic characters. The “Return” key selects the current
link or page. If the user selects a link she navigates to the next set
of pages and the interface presents her with the corresponding
information. On the other hand, if a page is selected, the system
switches to the audio content mode and the key mappings change to those shown in Table III.

Table II: Navigation mode keys and commands

Key            Associated command
→              Next page
←              Previous page
↑              Next link
↓              Previous link
P              List of pages
L              List of links
S              Summary information about the place
Numeric keys   Select a page or link by position
Enter/Return   Select the current page or link

Table III: Audio content mode keys and commands

Key    Associated command
→      Forward
←      Backward
↑      Increase skip time
↓      Decrease skip time
P      Play/Pause
Commands in the audio content mode aid the user in controlling
the presentation of the page contents. She can play or pause the
audio via the ‘P’ key, and skim the file by skipping forward or
backward in the presentation via the right and left arrow keys. The
initial skip locations are placed at increments of about 10% of the
duration of the file. Up and down arrow keys increase or decrease
this duration for rapid or more fine-grained skimming of the
contents.
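The two-mode interaction reduces to a simple key-dispatch loop. A sketch of the mappings in Tables I-III (command names are hypothetical):

# Key-to-command mappings summarizing Tables I-III.
COMMON = {"h": "help", "i": "information", "t": "toggle_notification",
          "esc": "break_audio_stream"}
NAVIGATION = {"right": "next_page", "left": "previous_page",
              "up": "next_link", "down": "previous_link",
              "p": "list_pages", "l": "list_links", "s": "summary",
              "return": "select_current"}
AUDIO_CONTENT = {"right": "skip_forward", "left": "skip_backward",
                 "up": "increase_skip_time", "down": "decrease_skip_time",
                 "p": "play_pause"}

def dispatch(mode, key):
    # Mode-specific bindings take precedence; common keys work in both modes.
    table = NAVIGATION if mode == "navigation" else AUDIO_CONTENT
    return table.get(key) or COMMON.get(key)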
3.2 Spatial Browser
We have also developed a spatial browser as an alternate
browsing interface for caT hypertexts that include textual and
image content. The spatial browser renders content information
and actions available to users on a two-dimensional canvas.
Hypertext authors can guide the placement of these elements on
the display by specifying coordinates for each element. They may
also choose to let the browser position content elements
randomly. The spatial browser, like the audio-video browser, is a composite browser. It combines the information content of all
active nodes within a hypertext and all actions available to the
user and places them on a single canvas. The links are visually
distinguished from the information content. Figure 2 displays a
snapshot of a user browsing through her friend’s hypertext about a
visit to Spain. The top left part displays the castle in Segovia and
the superimposed text is the comment added by the friend. The
location of our user within the presentation is displayed on a white background towards the bottom. The arrow in the list indicates the
current position in the presentation. Finally, the user can click on
the big arrow above the comment to visit the next image in this
presentation. If our user were to view this presentation via text
and image browsers, all of these elements would appear in
different applications. Of these, the image and comment are
displayed in figure 4.

Figure 2: Spatial browser
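The placement rule itself is straightforward. A minimal sketch (canvas size and element structure are hypothetical):

import random

CANVAS_W, CANVAS_H = 1024, 768

def place(elements):
    """Place content elements on the canvas: honor author-supplied
    coordinates when present, otherwise position the element randomly."""
    placed = []
    for e in elements:
        x = e.get("x", random.randrange(CANVAS_W))
        y = e.get("y", random.randrange(CANVAS_H))
        # Links would additionally be styled to stand out from content.
        placed.append((e["content"], x, y))
    return placed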
4. MIDAS
As another approach to support display-agnosticism, we enhance
the caT architecture to support users in browsing hypertexts from
various devices. This architecture is called MIDAS, an acronym
for Multi-device Integrated Dynamic Activity Spaces. To design a
hypermedia infrastructure that will interact with different devices
in diverse settings, we first need to understand the attributes of
devices, users, environments and the information elements that
affect the process of browsing and how these relate to each other.
As shown in figure 3, information access devices can be
characterized in terms of permanent and transient properties.
Hardware and software capabilities such as display resolution, the
number of colors they can render, local storage space, processor
power, and their network bandwidth are all intrinsic
characteristics of a device. Other properties may change more
frequently, as media drivers are installed or uninstalled or users
select whether they wish to share information with others. A
device can render some media formats, and may be shared with
other users or may display multiple information elements
simultaneously. GPS-enhanced mobile devices can locate their
position in geographic spaces, while larger devices may be
situated in well-known positions.
The location of a device helps characterize its environment. For
our purposes, we characterize environments in terms of the degree
of privacy a user has. This characteristic can help decide the
modality (whether to play audio or not) or the level of detail to
present. Interference indicates how distracted a user might be. For
example, a user waiting in an airport lounge is in a public place
but she may not be disturbed if she is traveling alone. On the other
hand, a conference attendee has a higher potential to be engaged
in conversation while she waits for her colleagues in the lobby of
their hotel. Users in insecure environments may enable stronger encryption algorithms for data transfer, possibly at the cost of
performance. While performance degradation may be insignificant
for desktop computers, it is a consideration when working with
mobile devices with low processing power.
Users indicate their preferences regarding devices, media formats,
and languages among myriad other preferences. They may also
express role-specific preferences.
In MIDAS, attributes of the hypertexts’ information contents,
such as the media types, file sizes, physical location of the files
either as paths on the disk or as URLs, their languages, versions,
and information about their creation are also externalized.

Figure 3: Mapping of attribute relationships

The system requirements for rendering files are also crucial for MIDAS to estimate whether a device can successfully render
information in the files assigned to it. These typically include the
display expectations from the device and file transfer requirements
such as the bandwidth required to transfer these files, especially
for audio or video files that may be streamed to user devices.
Finally, MIDAS must know the environmental properties and user
population that the file is suitable for. Given these attributes,
MIDAS finds the best matching resource for a user accoutered
with a set of devices within the constraints of her environment.
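One way to read figure 3 is as a set of attribute records that MIDAS matches against one another. A minimal sketch (field names hypothetical and far sparser than the real profiles):

from dataclasses import dataclass

@dataclass
class Device:
    media_formats: set        # e.g. {"text/plain", "image/jpeg"}
    display_space: tuple      # (width, height) in pixels
    bandwidth_kbps: int

@dataclass
class Environment:
    privacy: str              # governs presentation modality (e.g. audio)
    security: str             # governs encryption for transfer

@dataclass
class InformationElement:
    media_format: str
    display_space: tuple      # display space required
    bandwidth_kbps: int       # necessary bandwidth
    privacy: str

def renderable(dev, env, elem):
    # Can this device present this element within this environment?
    return (elem.media_format in dev.media_formats
            and elem.display_space[0] <= dev.display_space[0]
            and elem.display_space[1] <= dev.display_space[1]
            and elem.bandwidth_kbps <= dev.bandwidth_kbps
            and (elem.privacy == "public" or env.privacy == "private"))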
The MIDAS approach to information delivery is based on the
belief that client devices that present information to users in
diverse social settings are better equipped to judge their
capabilities and requirements than a central server, which can only
infer the clients’ state from information provided by the clients
themselves. In keeping with this spirit, MIDAS devices have a
great deal of leeway in deciding the media types to present. To
achieve this, MIDAS information contents are available in a
variety of media formats. Various instantiations of information
contents that may be interchangeably used are grouped together
under a common resource handle. A resource handle thus acts as an abstract representation of equivalent information content representations and encapsulates them into a semantic unit. Hypertexts refer to information content via these
resource handles. This scheme adds a level of indirection between
the information content and the hypertexts and serves to introduce
the display-agnosticism in MIDAS. The devices receive resource
handles and must resolve this abstract identifier to a media format that best matches their expectations. In programming language parlance, the resource handles serve to delay the binding [12] of information content with the hypertext specifications. The Resource Realizer thus acts as a virtual entity and introduces polymorphism [7] by allowing devices to bind the information content with the hypertext at display time.
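A sketch of this indirection (handle names, formats, and file names are hypothetical): the hypertext names only the handle, and each device binds it to a concrete instantiation at display time.

RESOURCES = {
    "segovia/alcazar": [
        {"format": "image/jpeg", "location": "alcazar.jpg"},
        {"format": "text/plain", "location": "alcazar_annotation.txt"},
        {"format": "audio/wav",  "location": "alcazar_annotation.wav"},
    ],
}

def realize(handle, preferred_formats):
    # Late binding: resolve the abstract handle to the first instantiation
    # matching the device's format preferences, in preference order.
    for fmt in preferred_formats:
        for inst in RESOURCES[handle]:
            if inst["format"] == fmt:
                return inst["location"]
    return None  # no instantiation this device can render

A PDA might call realize("segovia/alcazar", ["image/jpeg"]) while a cell phone passes ["audio/wav", "text/plain"], so the same hypertext node yields different media on each device.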
caT and its predecessor, Trellis [28], have espoused document-centric specification of hypertext structure and browsing behavior. However, in the interest of easing the creation and management of hypertext specifications, MIDAS extends the caT architecture to support browsing from a diverse set of information devices.
4.1 A MIDAS Browsing Session
With gadgets going mainstream, increasingly many individuals own cell phones and handheld computers. Other daily-use or general-purpose electronic items like wristwatches [24] and cameras [21] are acquiring network connectivity. Although such
devices increasingly are network enabled, they often lack the
ability to render media types such as images, audio and video by
themselves. The disparity in the properties of these devices,
however, provides for richer interaction opportunities when they
are used together. Cell phones are naturally suited for audio
rendering, while handheld computers are convenient for
presenting visual information.
For example, consider this scenario: As a researcher steps out for
coffee during a session break at a conference, she notices, on her
cell phone, that she has received an email from a friend visiting
Spain along with a link to photographs from her
trip. As she waits in a line behind other attendees
equally keen on having coffee, she decides to
check out the link. Her MIDAS-enabled cell
phone opens the link to her friend’s MIDAS site,
which contains annotated pictures of Segovia,
which she had visited earlier in the day. The cell
phone, being unable to display the photograph,
retrieves the annotation so she can listen to its
audio version. The description of the photograph
interests her and she switches on her PDA and
directs it to the link location received from her
friend. The PDA downloads and displays the
image associated with this annotation. From this
point onward, the PDA and the cell phone work in
tandem. The PDA displays the images via an
image browser while the cell phone converts the image descriptions from text to audio and plays them.
She uses the cell phone to navigate between images. Thankfully there are only a few images, which gives her enough time to send a quick email back to her friend before she reaches for a coffee mug. Figure 4 displays a snapshot of her browsing session captured from a simulated PDA browser and a textual annotation display with an integrated link that takes her back to the index.

Figure 4: The Alcazar in Segovia. (a) Image browser displays the photograph; (b) textual annotation
Unlike a typical Web browsing session, in this example the
researcher browses using two heterogeneous devices in parallel,
much like a caT session. Her browsing session is also different
from John and Bob’s sessions discussed earlier. caT users must
open the browsers that they wish to browse with. MIDAS, on the
other hand, opened the browsers that are most appropriate for her
task on both the cell phone and the PDA. Noting that the cell
phone was already being used for navigation, the PDA suppressed
its browsing interface and devoted all of its available space to
displaying the photographs. Of course, our researcher could have
switched to browsing from the PDA if she so preferred. While her
PDA could have played the audio annotations, it did not take on
that task as she was in a public location and the audio would have
disturbed others in the coffee line. Let's take a look at the architecture that would make this scenario a possibility.
4.2 Architecture
The MIDAS architecture [17] is illustrated in figure 5. The server-side extensions receive active information elements from the information server and route these to a device or a set of devices. The client-side extensions receive these information elements and invoke the browsers that render the information for user perusal.

Figure 5: The MIDAS architecture
4.2.1 Information Service
The Information Server is a hypertext engine. It reads the
hypertext specification, effects user actions received from the
browsers on this specification, and returns the resulting state back
to the browsers. While the current MIDAS implementation is
based on caT, we aim to design a system that can work with other
hypertext models such as the World-Wide Web. Hypertext
specifications stored at the server refer to information content via
abstract resource handles. Hypertext authors also specify their
preferred media formats and other properties to zoom in on their
recommended instantiation of this resource. The resource handles
and author preferences are passed on to the Device Manager along
with the browsing state. In general, MIDAS attempts to comply
with the authors’ recommendations unless they are overruled by
user preferences or there are no suitable devices for presenting the
recommended resource instances.
4.2.2 Device Manager
All devices available to a user register with the Device Manager,
which acts as the centralized display controller. The Device
Manager, together with the Browser Coordinators that run on MIDAS client devices, forms the information routing mechanism. The
Device Manager receives resource handles and author
recommendations for instantiating these from the information
server. It compares author recommendations against a variety of
other parameters such as device characteristics, system policies,
user preferences, and the current information load for these
devices, and routes each resource handle to the device that can best present it under the given circumstances.
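The routing decision might be sketched as a scoring pass over registered devices (weights and record fields are hypothetical; the real Device Manager also consults system policies):

def route(resource_handle, author_format, devices):
    """Pick the registered device best suited to present the handle."""
    def score(dev):
        s = 0
        if author_format in dev["formats"]:
            s += 2                  # comply with the author's recommendation
        if dev["user_preferred"]:
            s += 1                  # user preferences can overrule authors
        s -= dev["load"]            # penalize devices already presenting a lot
        return s
    candidates = [d for d in devices if d["formats"]]
    return max(candidates, key=score) if candidates else None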
4.2.3 Browser Coordinator
Each MIDAS client device runs a Browser Coordinator that
communicates with the Device Manager. The Browser
Coordinator receives resource handles and author preferences
from the Device Manager, and uses these, with its knowledge of
the client’s hardware and software capabilities, to retrieve the
resource instantiation that this device can best render. It invokes
an appropriate browser to render the retrieved information
elements to users.
4.2.4 Resource Realizer
The Resource Realizer provides a layer of abstraction between the specification of the various resources that a hypertext references and their physical manifestation in various formats. Conceptually,
it encapsulates all resources that contain similar or
interchangeable information. For example, a photograph of a
space shuttle launch, its video and its text description may all be
stored as a single resource or conceptual unit. Practically, it
decouples the hypertext structure from the information content
presented to its viewers. For example, if the video of the shuttle
launch were to become corrupted or unavailable (as in the case of
a Web-based resource) the author could either remove it from the
corpus of files associated with this resource or replace it with
another video without modifying its reference in the hypertext
specification. Operationally, the Resource Realizer receives an
abstract resource handle from various browser coordinators along
with their preferences and constraints. The Resource Realizer
weighs these against characteristics of the information elements
that it stores and returns the one that best matches the requested
specification to the Browser Coordinator. The Resource Realizer
may either return the file itself (if it is locally available), a location
pointer within the browser’s file system (disk path) or a globally
accessible location pointer such as a Web location.
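The three return forms lend themselves to a simple sketch (the instance structure is hypothetical):

def deliver(instance, browser_shares_filesystem):
    """Return an information element in one of the three forms above."""
    loc = instance["location"]
    if loc.startswith(("http://", "https://")):
        return ("url", loc)           # globally accessible location pointer
    if browser_shares_filesystem:
        return ("path", loc)          # disk path within the browser's file system
    with open(loc, "rb") as f:
        return ("content", f.read())  # the file itself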
4.2.5 Browsers
Browsers provide an interactive interface to MIDAS users. They
render the information content returned by the Resource Realizer
and the various actions (or links) available to the user. The
browsers report user actions to the Browser Coordinator, which
propagates them back to the Server. MIDAS devices may render
one or more information elements. For example, a cell phone may
deliver audio driving directions while displaying the latest stock
alert on its LCD panel.
4.3 Temporary integration of devices
Airport terminals and malls increasingly provide computing facilities for public use. These facilities include network connections for users' notebook computers, in some cases desktop computers, and in others printing services. The MIDAS architecture supports temporary integration of public devices into
a user’s device space. These transient devices are treated as
trusted devices available to the user for various purposes while
they are a part of the device space. Such integration will aid users
in a variety of scenarios. Users whose devices lack the necessary
resources for presenting rich media formats may use them to
expand the features and services available to them. Other users
may extend their spaces to their friends’ or colleagues’ devices for
spontaneous collaboration.
Users may either have full control of the integrated devices over
the period of their inclusion or they may be restricted to specific
operations depending upon the security considerations of these
devices. For example, a large-screen display may grant users
view-only permissions that allow them to display their non-private
and non-critical information but they may not interact with this
information (by, say, following a link). In some cases, users who
carry small mobile devices alone may need to include large screen displays if authors of information elements do not provide this content in a format that mobile devices can render.

4.4 Resource Management
While MIDAS recommends that authors provide information elements in a variety of formats, they may often be hard pressed for time or lack the inclination to provide information in multiple formats. We are exploring mechanisms to support authors in automatically or semi-automatically converting information content to other media formats. The Resource Manager [22] assists authors in adding resources to the Resource Realizer's repository. The Resource Manager works with a set of schemas that define relationships and the level of automation for converting information between various media types. For example, textual documents such as PDF or Postscript files may be automatically converted to plain text documents with some loss of formatting information; images can be scaled up or down in size or in the number of colors they use. Similarly, audio and video files may be downgraded to lower sampling rates in order to support less capable devices. While some of these conversions may be completed automatically, others may require user intervention. For example, scaling a JPEG image from 320x240 pixels to 160x120 pixels can easily be automated. However, when converting this image to 300x300 pixels, user intervention may be
necessary as there are many possibilities for achieving this
conversion. For example, the image may be cropped to bring it
down to the desired size, and user input may be required to decide
which part should be retained. Alternately, the image may be
reduced to the desired size even if that changes its aspect ratio; as yet another option, it may be scaled to the nearest possible size and the difference in dimensions may be left unused (leaving the image at a size smaller than 300 pixels in one dimension) or a filler may be inserted to restore the image to its expected size, in which case the user may specify the distribution of the filler content (color, whether it is below the image, above it, or distributed equally all around it).

Figure 6: Resource Manager – Coverage support
As this example illustrates, conversion of information content is a
complex process and while some users may be content to let
MIDAS use its judgment in performing these conversions, others
would surely like to be active decision-makers in deciding the fate
of information content associated with their hypertexts.
Furthermore, authors also need support in visualizing the
coverage provided by their information elements. By coverage, we
mean the set of devices that their hypertexts can be presented on.
Authors have a vested interest in maximizing the coverage, as
hypertexts with better coverage reach wider audiences. Figure 6
illustrates a coverage notification generated after the user has
added a large image file. The image is too large to be displayed on
desktop computers; however, the file size is small enough that it
can be presented over low bandwidth network connections. This
interface allows users to perform the basic management operations
on resource instantiations.
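The aspect-ratio example above suggests how a conversion schema might decide its level of automation. A sketch (rule and option names are hypothetical):

def plan_image_conversion(src, dst):
    """src, dst: (width, height) pairs. Decide whether scaling can be automated."""
    (sw, sh), (dw, dh) = src, dst
    if sw * dh == sh * dw:
        # Same aspect ratio, e.g. 320x240 -> 160x120: safe to automate.
        return ("automatic", "scale")
    # Different aspect ratio, e.g. 320x240 -> 300x300: the user must choose.
    return ("manual", ["crop", "stretch", "scale_and_pad"])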
5. RELATED WORK
5.1 Ubiquitous Computing
The field of Ubiquitous and Pervasive computing aims to enrich the quality of life by augmenting our physical environment with sensors and computers and using these to improve awareness and interactions, and to provide services when and where desired [1]. MIDAS focuses on enriching users' information browsing sessions by distributing information presentation across the different devices that they might possess; it does not expect any augmentation to a user's existing environment. While MIDAS supports the integration of public access computing resources when they are available in the environment, presence of these devices is not mandatory for users to fully benefit from the MIDAS architecture.

However, Ubiquitous computing and MIDAS share another focus: the issue of scale, that is, support for a broad range of computing devices [31]. Much like Ubiquitous computing [1], MIDAS attempts to provide natural interfaces and deliver context-aware information to users.

5.2 Capability-based Information Delivery
A variety of techniques have been employed to tailor information-rich Web pages for display on impoverished devices. Bickmore and Schilit's Digestor system provides device-independent Web access by automatically reauthoring HTML pages to match the capabilities of the target device [4]. The system transforms existing Web pages to suit user-specified device characteristics. However, the statelessness of the HTTP protocol prohibits users from changing devices during a browsing session. Pythia distills Web content at display time to maximize throughput for low-bandwidth Web browsing environments [9]. However, real-time image conversion is a slow and resource-intensive process and hence is not scalable. The power browser summarizes Web pages and forms and presents them on devices with small form factors in an integrated format [5]. WebSplitter supports collaborative, multi-user, multi-device Web browsing by presenting partial views on a set of devices that may belong to different users [14].

Techniques for third-party filtering and adaptation of Web pages override authors' choices of words, images, text, etc. regarding the presentation of their information. Also, they do not guarantee that the resulting information will be acceptable to the user, thus alienating both information creators and users. Our mechanism faithfully reproduces some manifestation of authors' specifications to readers within the constraints imposed by their browsing devices. Authors of hypertexts retain complete control over all the information that is presented to the users; they define the browsing structure as well as the pool of resources that are displayed to the users.

5.3 Document Specification
The Multivalent Document is a general digital document model for organizing complex digital document content and functionality [23]. It promotes integration of distinctly different but intimately related content into multi-layered, complex documents. In the Multivalent architecture, small, dynamically loaded objects called "behaviors" activate the content. Behaviors and layers work together to support arbitrarily specialized complex document types. This model is a document-centric equivalent of our resource handle that binds the media-specific representations together. However, the complex (and large) documents are not optimal for frequent transfer over networks, especially to devices with limited capabilities.

SMIL [26] provides mechanisms that permit flexible specification of multimedia presentations that can be defined to respond to a user's characteristics. However, this support is restricted to an instantiation of the presentation. Once instantiated, the changes in browsing state cannot be reflected to other browsers that the user may open later, or to other devices. Similarly, XML [33] and XSLT [34] implementations allow flexible translations of given static structures, but these transformations are static and irreversible. Generation of a new transformation requires repetition of the process with a different XSLT template.

5.4 Multi-device Information Interfaces
The Pebbles project has explored the co-use of Windows-based handhelds and PCs [19] for a variety of purposes. Multimachine user interfaces (MMUIs) extend the Windows interface to PDAs. Other applications include the use of PDAs as input devices for Windows desktop computers and to control PowerPoint presentations. The Ubiquitous Display project [2] promotes interaction with large public access displays via users' cell phones. While both projects explore the co-use of diverse devices for information access, they do not address the issues pertaining to tailoring of information to suit the devices' capabilities.
5.5 Other Application Areas for MIDAS
The Resource Realizer returns information contents that best
match the attributes specified by users’ devices. This approach
emphasizes the primacy of the devices in deciding the information
content that they can render optimally. It also helps MIDAS
address two special groups of audiences in a graceful manner and
with a minimal overhead to resource creators. The first of these
groups is international users, who may prefer to view information
in their native or preferred language(s). Resource creators can
accommodate the needs of these users by including information
content in various languages within their resources. This allows
users the flexibility of requesting resources in a language of their
choice. Furthermore, as all the information encapsulated by a resource is intricately connected, viewing a resource in different languages
may help users connect concepts and phrases in different
languages to improve their understanding of the languages they
struggle with. While textual materials can be directly translated
between languages, images, layouts and other presentation
artifacts that vary between cultures must also be considered [25].
While inclusion of these information elements is easy, language-specific browsers may better address the issues involved in
combining and presenting coherent information views to
international users.
Disabled users also face an uphill task when accessing
information content. Ensuring accessibility is unfortunately not a
prime consideration when designing Web pages. While Web page
reading software [16] assists in audio rendering of computer
displays for visually impaired users, these software solutions must
deal with Web page clutter [13]. In contrast, caT documents allow
authors to tag information contents with additional attributes [20].
These attributes may serve a variety of purposes, for example, to identify their function, and individual browsers may decide
whether they should render this information. The caT audio-video
browser [29] assists users in browsing hypertexts aurally. While
this browser is most suited for visually impaired users, including
additional attributes within the Resource Realizer and developing
browsers that render information with due consideration of these
attributes will enable MIDAS to support users with various other
impairments as well.
6. FUTURE DIRECTIONS
Our efforts to develop and improve MIDAS and caT-based
display-agnostic hypermedia continue on multiple fronts. We are
working on improving the conversion and coverage support in the
Resource Manager to help resource creators in supporting various
devices with minimal effort. Providing accurate coverage
information is tricky as device characteristics are continually
changing. Providing a functional and manageable interface that
displays coverage information and conversion suggestions is a
challenging task due to the number of variations possible.
Currently, MIDAS authors use independent interfaces to create
hypertext specifications and to provide resource instances
associated with these structures. While each of the interfaces
seems reasonable individually, it is cumbersome to associate the
resources with the structure. We are devising strategies to provide
a unified interface for these tasks.
The Device Manager and Browser Coordinators are being
developed. We are developing rulebases for automated browser
selection and for partitioning information presentations by
reconciling author recommendations with user preferences. In the
current instantiation, the state is reflected on all devices and the
users must manually start the desired browsers. The biggest
challenge in partitioning information, by far, is aiding users in
intuitively grasping interconnections between information
elements that are simultaneously presented on the various devices.
We are in the process of designing experiments to observe and understand how users perceive and work with information
presented on multiple devices as well as their preferred interaction
mechanisms in this environment.
In this paper, we have discussed display-agnosticism as a
desirable goal for hypermedia systems. Our system uses two
approaches to achieve display-agnosticism: browser multiplicity
and information abstraction. Separation of information content
from presentation mechanisms allows browsers to interpret and
present hypermedia structures in various ways. Resource
abstraction supports presentation of information content on
devices with diverse properties. Our architecture, MIDAS, aids
users in browsing information-rich, interactive hypermedia
structures via the devices that are usually available to them.
MIDAS tailors the presentation to best match the characteristics
of users, their environment, their devices, and finally, the
information itself. It offers flexible interaction and browsing
mechanisms and intrinsically supports user populations that
otherwise require special consideration, with a minimal overhead.
7. ACKNOWLEDGMENTS
This material is based upon work supported by the National
Science Foundation under Grant No. DUE-0085798.
8. REFERENCES
[1] Abowd, G., and Mynatt, E. Charting Past, Present and Future
of Research in Ubiquitous Computing. In ACM Transactions
on Computer-Human Interaction 7(1), (Mar. 2000), ACM
Press, 29-58.
[2] Aizawa, K., Kentaro, K., and Nakahira, K. Ubiquitous
Displays for Cellular Phone Based Personal Information
Environments. In Proceedings of the Third IEEE Pacific Rim
Conference on Multimedia, PCM 2002 LNCS 2532 (Hsinchu
Taiwan, Dec. 16-18, 2002), Springer-Verlag, 25-32.
[3] Anderson, C.R., Domingos, P., and Weld, D.S. Personalizing
Web Sites for Mobile Users. In Proceedings of the Tenth International World Wide Web Conference, WWW10 (Hong
Kong, May 1-5, 2001). ACM Press, 565-575.
[4] Bickmore, T.W., and Schilit, B. Digestor: Device-independent Access to the World Wide Web. In Proceedings
of the Sixth International WWW Conference (Santa Clara
CA, Apr. 1997).
[5] Buyukkokten, O., Kaljuvee, O., Garcia-Molina, H., Paepcke,
A., and Winograd, T. Efficient Web Browsing on Handheld
Devices Using Page and Form Summarization. In ACM
Transactions on Information Systems 20(1) (Jan. 2002),
ACM Press, 82-115.
[6] CNN to GO, http://www.cnn.com/togo/, accessed March
2004.
[7] Eckel, B. C++ Inside & Out. Osborne McGraw-Hill (Berkeley CA, 1993), ISBN 0-07881809-5, 18-24.
[8] Elrod, S., Bruce, R., Gold, R., Goldberg, D., Halasz, F.,
Janssen, W., Lee, D., McCall, K., Pederson, E., Pier, K.,
Tang, J., and Welch, W. Liveboard: A Large Interactive
Display Supporting Group Meetings, Presentations and
Remote Collaboration. In Proceedings of the SIGCHI
conference on Human factors in computing systems
(Monterey CA, May 1992), ACM Press, 599-607.
[9] Fox, A., and Brewer, E. Reducing WWW Latency and
Bandwidth Requirements by Real-time Distillation. In
Proceedings of the Fifth International World Wide Web
Conference (Paris France, May 1996).
[10] The Festival Speech Synthesis System.
http://www.cstr.ed.ac.uk/projects/festival/, accessed March
2004.
[11] Furuta, R., and Na, J-C. Applying caT's Programmable
Browsing Semantics to Specify World-Wide Web
Documents that Reflect Place, Time, Reader, and
Community. In Proceedings of the 2002 ACM Symposium on
Document Engineering, DocEng ’02 (McLean VA,
November 2002), ACM Press, 10-17.
[12] Gantenbein, R.E., and Jones, D.W. Dynamic Binding of
Separately Compiled Objects Under Program Control. In
Proceedings of the 1986 ACM Fourteenth Annual
Conference on Computer Science (Cincinnati OH, Feb.
1986), ACM Press, 287-292.
[13] Gupta, S., Kaiser, G., Neistadt, D., and Grimm, P. DOM-based Content Extraction of HTML Documents. In
Proceedings of the Twelfth International World Wide Web
Conference, WWW2003 (Budapest Hungary, May 20-24,
2003). ACM Press, 207-214.
[14] Han, R., Perret, V., and Naghshineh, M. WebSplitter: A
Unified XML Framework for Multi-Device Collaborative
Web Browsing. In Proceedings of the 2000 ACM Conference
on Computer Supported Cooperative Work (Philadelphia
PA, December 2000), ACM Press, 221-230.
[15] Howard, W.E. Better Displays with Organic Films. In
Scientific American (Feb. 2004). Also available on the Web
at http://www.sciam.com/print_version.cfm?
articleID=0003FCE7-2A46-1FFB-AA4683414B7F0000
[16] JAWS for Windows Overview.
http://www.freedomscientific.com/fs_products/software_jaws
.asp, accessed March 2004.
[17] Karadkar, U., Na, J.-C. and Furuta, R. Employing Smart
Browsers to Support Flexible Information Presentation in
Petri net-based Digital Libraries. In Proceedings of the Sixth
European Conference on Digital Libraries, ECDL 2002
(Rome Italy, September 2002), Springer-Verlag LNCS 2458,
324-337.
[18] The MBROLA Project Homepage.
http://www.tcts.fpms.ac.be/synthesis/mbrola.html, accessed
March 2004.
[19] Myers, B. Using Handhelds and PCs Together. In
Communications of the ACM 44(11), (November 2001), 34-41.
[20] Na, J-C., and Furuta, R. Dynamic Documents: Authoring,
Browsing, and Analysis Using a High-level Petri net-based
Hypermedia System. In Proceedings of the ACM Symposium
on Document Engineering, DocEng ’01 (Atlanta GA,
November 2001), ACM Press, 38-47.
[21] Nikon USA: D2H Set.
http://www.nikonusa.com/template.php?cat=1&grp=2&productNr=25208, accessed March 2004.
[22] Park, Y.J. Resource Manager. Texas A&M University,
Department of Computer Science Internal Report (January
2004).
[23] Phelps, T. and Wilensky, R. Toward Active, Extensive,
Networked Documents: Multivalent Architecture and
Applications. In Proceedings of the First ACM International
Conference on Digital Libraries (Bethesda MD, March
1996), ACM Press, 100-108.
[24] Raghunath, M.T., and Narayanaswami, C. User Interfaces for
Applications on a Wrist Watch. In Personal and Ubiquitous
Computing, 6(1) (2002) Springer Verlag, 17-30.
[25] Russo, P., and Boor, S. How Fluent is Your Interface?
Designing for International Users. In Proceedings of
Conference on Human Factors and Computing Systems,
InterCHI ’93 (Amsterdam, May 1993), ACM Press, 342-347.
[26] SMIL: Synchronized Multimedia Integration Language
(SMIL 2.0) specification. http://www.w3.org/TR/smil20/
W3C Proposed recommendation (2001), accessed June 2003.
[27] SMART Board Interactive WhiteBoard.
http://www.smarttech.com/Products/smartboard/index.asp
(accessed July 2003).
[28] Stotts, P.D., and Furuta, R. Petri-net-based hypertext:
Document structure with browsing semantics. ACM
Transactions on Information Systems 7(1), (January 1989),
ACM Press, 3-29.
[29] Ustun, S. Audio Browsing of Automaton-based Hypertext.
Masters Thesis, Texas A&M University (December 2003).
[30] Wireless Weather Updates – Palm Pilot or Cellular Phone,
http://www.w3.weather.com/services/, accessed March 2004.
[31] Weiser, M. The Computer for the 21st Century. In Scientific American (Sep. 1991), 94-104.
[32] Xine – A Free Video Player. http://www.xinehq.de/, accessed
March 2004.
[33] XML: Extensible Markup Language (XML) 1.0 (Second
Edition). http://www.w3.org/TR/2000/REC-xml-20001006
W3C Recommendation (2000), accessed June 2003.
[34] XSLT: XSL Transformations (XSLT) Version 1.0.
http://www.w3.org/TR/xslt, W3C Recommendation (1999),
accessed June 2003.
[35] Yahoo! Mobile, http://mobile.yahoo.com/, accessed March
2004.