WebTrails
Jure Leskovec
University of Ljubljana, Slovenia
Internship at MSR (Aug 1 – Sept 31, 2001)
Outline
 Problem statement
 Our approach
 Demo
Problem 1: History Navigation

 Insufficient support for navigation over seen pages
   One-step linear navigation (back/forward)
   Lost parts of navigation (inaccessible pages)
[Figure: example navigation over pages A to F]
Problem 2: Search

 Poor search facility in Internet Explorer
   No visual clues
   Inflexible results sorting
   No navigation information
IE History Navigation
IE History Search
Objectives
 Enhance the history with navigation information
 Provide easy access to the seen pages
   Recent history: in-session navigation support tool
   Whole history: search over all seen pages
Proposed Solution

 Easy access to a page seen in the current browsing session - Session Navigator
   Visual clues – Thumbnails of accessed pages
   Navigation Trails - Grouping pages based on navigation
 Whole history search:
   Time query
   Text query (content, query, link)
   Colour scheme (position of colours on the page)
WebTrails

Definition:
Sequence of traversed pages started by
   entering a search query
   typing a URL
and generated by
   following links
   using back/forward buttons
   using the Session Navigator.

Captured Data

For each page the WebTrail contains:
 Page URL
 Body text
 Page type
   Accessed by following a link
   Opened in a new window
   Typing in a URL
   Typing a search query (search result page)
   Back/forward navigation
 Time
 Referring page (parent)
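The captured fields listed above could be held in a record like the one below. This is only a minimal sketch: the class, field, and enum names are hypothetical and do not come from the WebTrails implementation.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class PageType(Enum):
    """How the page was reached (the 'Page type' values listed above)."""
    LINK = "followed a link"
    NEW_WINDOW = "opened in a new window"
    TYPED_URL = "typed a URL"
    SEARCH_QUERY = "typed a search query"
    BACK_FORWARD = "back/forward navigation"


@dataclass
class PageVisit:
    """One page entry of a WebTrail (field names are hypothetical)."""
    url: str
    body_text: str
    page_type: PageType
    time: datetime
    referrer: Optional[str] = None  # URL of the referring (parent) page


def starts_trail(visit: PageVisit) -> bool:
    """Per the definition, a trail starts with a search query or a typed URL."""
    return visit.page_type in (PageType.SEARCH_QUERY, PageType.TYPED_URL)
```

Keeping the referring page on every record is what would let a navigation tree be rebuilt later from a flat list of visits.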

Session Navigator

 Trail building and GUI presentation
   Flattening the graph to a linear sequence of tree branches
   Time ordered
   Cursor indicating the current page
[Figure: a Navigation Tree over pages A to F and the corresponding Linear Trail]
Session Navigator (2)

 Trail usage
   Reviewing seen pages
   Random access to a page to continue browsing (an already viewed page is added to the trail again when navigation continues away from it)
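One way to read the flattening and the "add already viewed page" rule is sketched below. The function name and the `(page, parent)` input format are assumptions made for illustration, not the tool's actual interface.

```python
def build_linear_trail(visits):
    """Flatten time-ordered (page, parent) visits into a linear trail.

    Returns (trail, cursor), where cursor is the index of the current page.
    """
    trail = []
    for page, parent in visits:
        # Continuing from a page that is not at the end of the trail starts
        # a new branch, so the already viewed parent is appended again.
        if parent is not None and trail and trail[-1] != parent:
            trail.append(parent)
        trail.append(page)
    return trail, len(trail) - 1


# Hypothetical session: A links to B and D, B to C, D to E and F.
visits = [("A", None), ("B", "A"), ("C", "B"),
          ("D", "A"), ("E", "D"), ("F", "D")]
trail, cursor = build_linear_trail(visits)
print(trail)   # ['A', 'B', 'C', 'A', 'D', 'E', 'D', 'F']
```

Each repeated page (A, then D) marks the point where a new tree branch starts, so the trail stays time ordered while still showing the branching.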
WebTrail Graph
Search

 Search over colour schemes of thumbnail images
   Assumption: people remember the position of highly distinct or predominant colours on the page
 Visual clues – Thumbnails and WebTrails in the search results
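A colour-scheme query of this kind might be matched as in the sketch below, which assumes each thumbnail has already been reduced to one named colour per region. The function name, the region names, and the scoring rule are all hypothetical.

```python
def colour_scheme_search(query, schemes):
    """Rank pages by how many queried regions match the remembered colour.

    query:   {region: colour}, e.g. {"left": "blue"}
    schemes: {url: {region: colour}} taken from thumbnail analysis
    """
    scored = []
    for url, scheme in schemes.items():
        hits = sum(1 for region, colour in query.items()
                   if scheme.get(region) == colour)
        if hits:
            scored.append((hits, url))
    # Best match first; ties are broken alphabetically by URL.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [url for _, url in scored]
```

A user who remembers only "blue on the left" would pass `{"left": "blue"}` and get back every page whose left region was assigned blue.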
Thumbnail Analysis


 Partitioning thumbnails into regions
 Assigning a single colour per region
   Histogram analysis
   Colour clustering
[Figure: histogram colour merging; thumbnail regions labelled Left, Body and Right]
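The partitioning and histogram step could look roughly like this. The sketch assumes a thumbnail given as rows of `(r, g, b)` tuples, at least four pixels wide; the region widths and the bin size are illustrative guesses, not values from the talk.

```python
from collections import Counter


def quantise(rgb, step=64):
    """Map an (r, g, b) pixel to the centre of its histogram bin."""
    return tuple((c // step) * step + step // 2 for c in rgb)


def region_colours(pixels, left=0.25, right=0.25):
    """Assign one colour per region of a thumbnail.

    pixels is a list of rows, each row a list of (r, g, b) tuples. The
    left/right width fractions are assumptions, not values from the talk.
    """
    width = len(pixels[0])
    l_end = int(width * left)
    r_start = width - int(width * right)
    spans = {"left": (0, l_end), "body": (l_end, r_start), "right": (r_start, width)}
    result = {}
    for name, (start, end) in spans.items():
        histogram = Counter(quantise(p) for row in pixels for p in row[start:end])
        # The most frequent histogram bin stands in for the whole region.
        result[name] = histogram.most_common(1)[0][0]
    return result
```

Taking the most frequent bin is the simplest form of histogram analysis; the colour clustering described in the following slides would replace the fixed bins.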
Colour Clustering

 Colour Spaces
   3D space – RGB, HSL, CIELab
[Figure: RGB Space with Red, Green and Blue axes]
Colour Similarity
[Figure: colours in RGB Space. Caption: RGB colour space, Euclidean distance 60]
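For reference, Euclidean distance in RGB can be computed as below. Treating the value 60 from the caption as a similarity threshold is an assumption, and the function names are hypothetical.

```python
import math


def rgb_distance(c1, c2):
    """Euclidean distance between two (r, g, b) colours with 0-255 channels."""
    return math.dist(c1, c2)


def similar(c1, c2, threshold=60):
    """Treat colours no further apart than the threshold as similar."""
    return rgb_distance(c1, c2) <= threshold
```

With this measure pure red `(255, 0, 0)` and `(230, 30, 20)` fall within 60 of each other, while red and blue do not.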
Colour Clustering (2)
Problem:
 Define a colour distance measure so that colours similar by human perception are close in distance
[Figure: RGB Space with Red, Green and Blue axes]
Colour Clustering (2)
Solution: 2-step colour merging
1. Conservative merging in RGB space
2. Further merging in HSV space
 Colours in the union of two neighbourhoods are merged
[Figure: a colour set shown in RGB Space (Red, Green, Blue axes) and in HSV Space (Hue, Saturation, Value axes)]
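The two-step merging might be sketched as follows. Everything specific here is an assumption: both thresholds, the cone-shaped HSV distance, and the reading of "union of two neighbourhoods" as "close in RGB or close in HSV". A union-find structure keeps track of the merged groups.

```python
import colorsys
import math


def hsv_distance(c1, c2):
    """Distance in HSV, treating hue as an angle on the colour cone."""
    def to_cone(rgb):
        h, s, v = colorsys.rgb_to_hsv(*(c / 255 for c in rgb))
        return (s * v * math.cos(2 * math.pi * h),
                s * v * math.sin(2 * math.pi * h), v)
    return math.dist(to_cone(c1), to_cone(c2))


def merge_colours(colours, rgb_threshold=30, hsv_threshold=0.15):
    """Group colours: step 1 merges in RGB, step 2 merges further in HSV."""
    parent = list(range(len(colours)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def merge_pairs(close):
        for i in range(len(colours)):
            for j in range(i + 1, len(colours)):
                if close(colours[i], colours[j]):
                    parent[find(i)] = find(j)

    # Step 1: conservative merging with a small Euclidean radius in RGB.
    merge_pairs(lambda a, b: math.dist(a, b) <= rgb_threshold)
    # Step 2: further merging of colours that are neighbours in HSV.
    merge_pairs(lambda a, b: hsv_distance(a, b) <= hsv_threshold)

    groups = {}
    for i, colour in enumerate(colours):
        groups.setdefault(find(i), []).append(colour)
    return list(groups.values())
```

With these thresholds, `(255, 0, 0)` and `(255, 35, 0)` are too far apart for the RGB step but are merged by the HSV step, which is the kind of case the second step is meant to catch.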
Colour Clustering (3)
Benefits:
 Flexible colour palette selection
   Applying a different threshold on the distance
Colour Palettes

 Automatically generated from the collection of thumbnails
[Figure: a 15-colour palette and a 25-colour palette]
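The link between the distance threshold and the palette size can be illustrated with a simple greedy rule. This is a stand-in for the clustering used in the talk, and the function name is hypothetical.

```python
import math


def build_palette(colours, threshold):
    """Greedy palette: keep a colour only if no kept colour is within threshold."""
    palette = []
    for colour in colours:
        if all(math.dist(colour, kept) > threshold for kept in palette):
            palette.append(colour)
    return palette


colours = [(0, 0, 0), (20, 0, 0), (100, 0, 0), (130, 0, 0), (255, 255, 255)]
print(len(build_palette(colours, 10)))   # 5
print(len(build_palette(colours, 50)))   # 3
```

A larger threshold merges more colours and so gives a smaller palette, which is how palettes of different sizes could come from the same thumbnail collection.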
Colour Clustering (3)
Benefits:
 Flexible colour palette selection
   Applying a different threshold on the distance
 Simplified description of the page
   By reducing the number of colours in the page representation
 Human-friendly colour distinction
   Reducing the fine variability in colours
Thumbnail Clustering

 Clustering thumbnail colour schemes
   Hierarchical clustering based on colour position
   Off-line (at the moment)
 Useful for
   Query by example
   Browsing through navigation history
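Hierarchical clustering on colour position could be sketched as below. The sketch assumes each thumbnail is a dictionary of region to RGB colour with the same regions in every thumbnail, and that `n_clusters` is at least 1. Single linkage and the summed per-region distance are choices made for illustration, not details from the talk.

```python
import math


def scheme_distance(a, b):
    """Sum of per-region RGB distances, so colour position matters."""
    return sum(math.dist(a[region], b[region]) for region in a)


def hierarchical_clusters(schemes, n_clusters):
    """Single-linkage agglomerative clustering of {url: {region: rgb}}."""
    clusters = [[url] for url in schemes]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(scheme_distance(schemes[u], schemes[v])
                        for u in clusters[i] for v in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # j > i, so index i stays valid
    return clusters
```

Because the distance compares colours region by region, two pages with the same colours in swapped positions end up far apart, which matches the idea of clustering on colour position.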
WebTrails 0.1
DEMO