WebTrails
Jure Leskovec
University of Ljubljana, Slovenia
Internship at MSR (Aug 1 – Sept 31, 2001)

Outline
- Problem statement
- Our approach
- Demo

Problem 1: History Navigation
- Insufficient support for navigation over seen pages
- One-step linear navigation (back/forward)
- Lost parts of navigation (inaccessible pages)
[Figure: example navigation over pages A, B, C, D, E, F]

Problem 2: Search
- Poor search facility in Internet Explorer
- No visual clues
- Inflexible results sorting
- No navigation information

IE History Navigation

IE History Search

Objectives
- Enhance the history with navigation information
- Provide easy access to the seen pages
  - Recent history: in-session navigation support tool
  - Whole history: search over all seen pages

Proposed Solution
- Easy access to a page seen in the current browsing session: Session Navigator
- Visual clues: thumbnails of accessed pages
- Navigation Trails: grouping pages based on navigation
- Whole history search:
  - Time query
  - Text query (content, query, link)
  - Colour scheme (position of colours on the page)

WebTrails
Definition: sequence of traversed pages
- started by
  - entering a search query
  - typing a URL
- and generated by
  - following links
  - using back/forward buttons
  - using the Session Navigator
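
The WebTrails definition above can be read as a segmentation rule over the browsing history: a search query or a typed URL opens a new trail, and every other kind of navigation extends the current one. The following is a minimal sketch of that reading, not the WebTrails implementation. The `Visit` record, the event-kind strings and `split_into_trails` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass

# Hypothetical event kinds, named after the wording of the definition.
TRAIL_STARTERS = {"search_query", "typed_url"}
TRAIL_CONTINUERS = {"link", "back_forward", "session_navigator"}


@dataclass
class Visit:
    url: str
    kind: str  # one of the kinds listed above


def split_into_trails(visits):
    """Group a time-ordered list of visits into trails.

    A visit whose kind is a trail starter opens a new trail; any other
    visit is appended to the most recent trail.
    """
    trails = []
    for visit in visits:
        if visit.kind not in TRAIL_STARTERS | TRAIL_CONTINUERS:
            raise ValueError(f"unknown visit kind: {visit.kind!r}")
        # Assumption: a history that begins mid-trail still opens a trail.
        if visit.kind in TRAIL_STARTERS or not trails:
            trails.append([visit])
        else:
            trails[-1].append(visit)
    return trails
```

Under these assumptions, a history of one search query, two followed links, one typed URL and one further link would be split into two trails, of three pages and two pages.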
Captured Data
For each page the WebTrail contains:
- Page URL
- Body text
- Page type
  - Accessed by following a link
  - Opened in a new window
  - Typing in a URL
  - Typing a search query (search result page)
  - Back/forward navigation
- Time
- Referring page (parent)

Session Navigator
Trail building and GUI presentation
- Flattening the graph to a linear sequence of tree branches
- Time ordered
- Cursor indicating the current page
[Figure: Navigation Tree over pages A to F and the corresponding Linear Trail]

Session Navigator (2)
Trail usage
- Reviewing seen pages
- Random access to a page to continue browsing (add already viewed page if navigating further away from the page)

WebTrail Graph

Search
- Search over colour schemes of thumbnail images
- Assumption: people remember position of highly distinct or predominant colours on the page
- Visual clues: thumbnails and WebTrails in the search results

Thumbnail Analysis
- Partitioning thumbnails into regions
- Assigning a single colour per region
  - Histogram analysis
  - Colour clustering
  - Histogram colour merging
[Figure: thumbnail regions Body, Left, Right]

Colour Clustering
Colour spaces
- 3D space: RGB, HSL, CIELab
[Figure: RGB Space with axes Red, Green, Blue]

Colour Similarity
- RGB colour space, Euclidean distance 60
[Figure: RGB Space with axes Red, Blue]

Colour Clustering (2)
- Problem: define a colour distance measure so that colours similar by human perception are close in distance
[Figure: RGB Space with axes Red, Green, Blue]

Colour Clustering (2)
- Solution: two-step colour merging
  1. Conservative merging in RGB space
  2. Further merging in HSV space
- Colours in the union of two neighbourhoods are merged
[Figure: colour set shown in RGB Space (axes Red, Green, Blue) and HSV Space (axes Hue, Saturation, Value)]

Colour Clustering (3)
Benefits:
- Flexible colour palette selection
  - Applying different threshold on the distance

Colour Palettes
- Automatically generated from the collection of thumbnails
[Figure: 15 colour palette, 25 colour palette]

Colour Clustering (3)
Benefits:
- Flexible colour palette selection
  - Applying different threshold on the distance
- Simplified description of the page
  - By reducing the number of colours on the page representation
- Human friendly colour distinction
  - Reducing the fine variability in colours

Thumbnail Clustering
- Clustering thumbnail colour schemes
- Hierarchical clustering based on colour position
- Off-line (at the moment)
- Useful for
  - Query by example
  - Browsing through navigation history

WebTrails 0.1
DEMO
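
As a rough illustration of the two-step colour merging from the Colour Clustering slides, the sketch below first merges colours that are close in RGB space (Euclidean distance) and then merges further in HSV space, so that a colour joins a group if it falls in either neighbourhood. This is one possible reading of the slides, not the WebTrails code. The function names, the RGB threshold of 30 and the HSV tolerances are assumptions chosen for the example; the slides do not give the values actually used.

```python
import colorsys
import math


def rgb_distance(c1, c2):
    """Euclidean distance between two (r, g, b) tuples in 0..255."""
    return math.dist(c1, c2)


def hsv_close(c1, c2, hue_tol=0.05, sat_tol=0.2, val_tol=0.2):
    """Assumed HSV neighbourhood test; the tolerances are illustrative."""
    h1, s1, v1 = colorsys.rgb_to_hsv(*(x / 255 for x in c1))
    h2, s2, v2 = colorsys.rgb_to_hsv(*(x / 255 for x in c2))
    dh = abs(h1 - h2)
    dh = min(dh, 1 - dh)  # hue is circular
    return dh <= hue_tol and abs(s1 - s2) <= sat_tol and abs(v1 - v2) <= val_tol


def merge_colours(colours, rgb_threshold=30.0):
    """Group colours by the two-step merge, using a union-find structure."""
    parent = list(range(len(colours)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    pairs = [(i, j) for i in range(len(colours))
             for j in range(i + 1, len(colours))]

    # Step 1: conservative merging in RGB space (small threshold, assumed).
    for i, j in pairs:
        if rgb_distance(colours[i], colours[j]) <= rgb_threshold:
            union(i, j)

    # Step 2: further merging in HSV space.
    for i, j in pairs:
        if hsv_close(colours[i], colours[j]):
            union(i, j)

    groups = {}
    for i, colour in enumerate(colours):
        groups.setdefault(find(i), []).append(colour)
    return list(groups.values())
```

With these assumed thresholds, a bright red and a darker red such as (255, 0, 0) and (210, 0, 0) are too far apart for the RGB step but are merged by the HSV step, because they share hue and saturation. Raising or lowering the thresholds changes how many groups remain, which is one way to obtain palettes of different sizes.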