Archiving the Mobile Web
Frank McCown, Monica Yarbrough, & Keith Enlow
Computer Science Dept
Harding University
WADL 2013
Indianapolis, IN
July 25, 2013

Mobile vs. Stationary Web

Mobile Web-Related Markup Languages
Smartphone era
http://en.wikipedia.org/wiki/File:Mobile_Web_Standards_Evolution_Vector.svg

Two Types of Mobile Web
• Feature Phone Web: cHTML (iMode), WML, WAP, etc.
• Smartphone Web: XHTML, HTML5, etc.

Serving Up Mobile Sites
1. Responsive web design
   • Same HTML content to desktop and mobile
   • CSS media queries alter appearance

    <!-- CSS media query on a link element -->
    <link rel="stylesheet" media="(max-width: 800px)" href="example.css" />

    <!-- CSS media query within a style sheet -->
    <style>
    @media (max-width: 600px) {
      .sidebar {
        display: none;
      }
    }
    </style>

Example of Responsive Web Design

Serving Up Mobile Sites
1. Responsive web design
   • Same HTML content to desktop and mobile
   • CSS media queries alter appearance
2. Redirect mobile user agent to mobile site
   • Client-side redirection
   • Server-side redirection

Client-Side Redirection
• JavaScript detects mobile user agent

    // From www.harding.edu
    // (queryString is not defined in this excerpt)
    var ua = navigator.userAgent.toLowerCase();
    if (queryString.match('version=mobile') ||
        ua.match(/IEMobile|Windows CE|NetFront|PlayStation|like Mac OS X|MIDP|UP\.Browser|Symbian|Nintendo|BlackBerry|mobile/i)) {
      if (!ua.match('ipad')) {
        if (window.location.pathname.match('.html'))
          window.location = window.location.pathname.replace('.html', '.m.html');
        else
          window.location = window.location.pathname + 'index.m.html';
      }
    }

Client-Side Redirection

Server-Side Redirection
• Server routes mobile user agent to different page

Apache Example:

    RewriteEngine On
    RewriteBase /
    RewriteCond %{HTTP_USER_AGENT} (android|bb\d+|meego).+mobile|avantgo|bada\/|blackberry|blazer|etc…|zte\-) [NC]
    RewriteRule ^$ http://detectmobilebrowser.com/mobile [R,L]

https://developers.google.com/webmasters/smartphone-sites/details

Server-Side Redirection

Serving Up Mobile Sites
1. Responsive web design
   • Same HTML content to desktop and mobile
   • CSS media queries alter appearance
2. Redirect mobile user agent to mobile site
   • Client-side redirection
   • Server-side redirection
3. User-agent content negotiation
   • Dynamically serving different HTML for the same URL

User-Agent Content Negotiation
• Server serves up different content for same URL
• Use Vary: User-Agent header in response
• Best method for serving content quickly

Archiving Mobile Sites
1. Responsive web design
   • Easy: Crawl like normal
   • Use client tools to view page formatted for mobile
2. Redirect mobile user agent to mobile site
   • Need to crawl with mobile user agent
   • Need JavaScript-enabled crawler to handle client-side redirection
3. User-agent content negotiation
   • Need to crawl with mobile user agent
   • Need to distinguish mobile vs. desktop for same URL

How are we doing archiving mobile sites so far?
Earliest archived page
Earliest 2007 archived page: WML
Finally some news!
Really???
Great… Only desktop version is archived!
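
Sketch: Telling the Three Cases Apart

The three cases under "Archiving Mobile Sites" suggest that a crawler could fetch each URL twice, once with a desktop user agent and once with a mobile one, and compare the results. The following is a minimal sketch of that comparison only, not a crawler and not the authors' method. The function name classifyMobileStrategy and the response shape { status, headers, body } (with lowercase header names) are hypothetical, chosen for illustration. Comparing bodies for exact equality is a simplifying assumption; real pages with dynamic content would need a more tolerant comparison.

```javascript
// Hypothetical helper: compares two already-fetched responses for the same URL.
// `desktop` and `mobile` are plain objects of the assumed form
//   { status: number, headers: { [lowercaseName]: string }, body: string }
// obtained with a desktop and a mobile User-Agent respectively.
function classifyMobileStrategy(url, desktop, mobile) {
  // A 3xx status with a Location header counts as a redirect.
  const isRedirect = (r) =>
    r.status >= 300 && r.status < 400 && Boolean(r.headers.location);

  // Case 2 (server-side redirection): only the mobile user agent is sent elsewhere.
  if (isRedirect(mobile) && !isRedirect(desktop)) {
    return {
      strategy: 'server-side-redirect',
      // Resolve a relative Location against the requested URL.
      mobileUrl: new URL(mobile.headers.location, url).href,
    };
  }

  // Case 3 (content negotiation): same URL, but the server announces
  // Vary: User-Agent or the HTML differs between the two fetches.
  const vary = (mobile.headers.vary || '')
    .toLowerCase()
    .split(',')
    .map((name) => name.trim());
  if (vary.includes('user-agent') || mobile.body !== desktop.body) {
    return { strategy: 'content-negotiation', mobileUrl: url };
  }

  // Otherwise both user agents received the same HTML: either responsive
  // design (case 1) or client-side redirection, which only a
  // JavaScript-enabled crawler could detect.
  return { strategy: 'same-html', mobileUrl: url };
}

// Example with made-up responses (example.com is a placeholder host).
const desktopPage = { status: 200, headers: {}, body: '<html>desktop</html>' };
const mobileRedirect = { status: 302, headers: { location: '/mobile' }, body: '' };
console.log(classifyMobileStrategy('http://example.com/', desktopPage, mobileRedirect));
```

For the 'content-negotiation' result, the archive would need to store the mobile and desktop captures separately even though they share a URL, which is the "distinguish mobile vs. desktop for same URL" requirement above.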