Project Proposal

A transcoding proxy for customizing HTML web pages on small screen devices

Andrew Stone
3/7/2016

Abstract

The vast majority of the World Wide Web is designed for high-resolution, large form factor screens and fast CPUs with decent bandwidth. Several technologies have been designed to make the web more accessible to mobile devices. The proposed project is a web proxy that transforms requested web pages in a manner customized to a particular device or user profile. This will be done by offering a WAP webpage interface for setting up a profile. Web pages are then requested as arguments to the custom proxy and are transformed to fit the requester’s profile. Transformations such as HTML to WML (or WAP) conversion, image compression or removal, and text-only retrieval will be attempted. Given time, page caching and webpage transformations to fit a device screen will also be attempted.

Introduction

Cell phone networks are becoming an omnipresent resource for accessing the internet. However, they are still very slow when compared to connections provided by wired ISPs. Technologies such as WML and WAP have been developed to provide access to the web from mobile phones. But with the proliferation of high-speed home connections, many web pages and services cannot be accessed from mobile phones at all, and others are unbearably slow to load over slow connections.

This project is aimed at creating a free solution that can be used to leverage faster connections and streamline web content for a mobile device. This will be accomplished by eliminating unnecessary information, or content that the user is willing to sacrifice for a faster load time.

There are two main modules that I would like to implement: an HTML to WAP transcoder for mobile phones (known as a WAP proxy), and an HTML optimizer that will target handheld devices and laptops using a full-blown browser. Given the possible scope of both features, I’m planning on concentrating on the HTML optimizer.
I expect this would encompass the largest user base and provide the greatest benefit, since a free software product (HoTTProxy) is already available for HTML to WAP 2.0 conversion.

Related Work

Transcoding web pages to WAP has been done numerous times. A couple of examples I’ve found are HoTTProxy (http://www.hottproxy.org), which converts to WAP 2, and iMobile [2], a proxy server now offered by AT&T.

A project by Robert Dugas [4] gives details on the reasoning behind creating a transcoding proxy for mobile phones. Essentially, the main benefit is that it cuts down the amount of data that needs to be sent to the phone to view a web page’s content. Dugas’s paper describes a transcoding proxy that converts web pages to WAP 1. He also discusses how the structure of a web page can be converted so that it can be viewed via WAP. In addition, he describes his conversion process, which this project will mimic to some degree.

Chen, Ma, and Zhang [9] provide several algorithms for changing the layout of a web page for a mobile device. They describe an algorithm for breaking a webpage down into small content blocks, which serves as the basis for restructuring the page for devices with smaller screens.

Proposed Work

A proxy server will be created that will be able to serve an HTML webpage to any browser. The proxy server component, called PocketProxy, will not make any modifications to the web page itself. Instead, it will use a group of components, each designed to make a particular modification to a webpage. PocketProxy will keep a separate component execution list for each class of device, based on browser specifications and screen size. PocketProxy will retrieve this information by reading a configuration file when it starts. The configuration file will specify which components will be loaded and which ones will be used in particular situations. These components will all support the same interface, IHtmlModifier.
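As a rough illustration of the configuration-driven component lists described above, the following Python sketch loads a set of device profiles and runs the matching list of components over a page. It is only a sketch under stated assumptions: the names BrowserProps, REGISTRY, and transform, the JSON layout of the configuration, the 480-pixel threshold, and the modifier signature are all hypothetical and not part of the planned design, and the regex-based stripping is a toy stand-in for real components.

```python
import json
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BrowserProps:
    """Hypothetical summary of the requesting browser."""
    user_agent: str
    screen_width: int  # in pixels


# Assumed shape of a modifier: page in, browser properties in, page out.
Modifier = Callable[[str, BrowserProps], str]


def strip_comments(html: str, browser: BrowserProps) -> str:
    """Toy modifier: remove HTML comments."""
    return re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)


def strip_images(html: str, browser: BrowserProps) -> str:
    """Toy modifier: remove <img> tags."""
    return re.sub(r"<img\b[^>]*>", "", html, flags=re.IGNORECASE)


# Components available to be loaded, keyed by the name used in the config.
REGISTRY: Dict[str, Modifier] = {
    "strip_comments": strip_comments,
    "strip_images": strip_images,
}

# Hypothetical configuration file contents (JSON chosen for illustration only).
CONFIG_TEXT = """
{
  "profiles": [
    {"name": "pda", "max_screen_width": 480,
     "components": ["strip_comments", "strip_images"]},
    {"name": "laptop", "max_screen_width": null, "components": []}
  ]
}
"""


def load_profiles(config_text: str) -> List[dict]:
    """Parse the config and check that every named component is known."""
    profiles = json.loads(config_text)["profiles"]
    for profile in profiles:
        for name in profile["components"]:
            if name not in REGISTRY:
                raise ValueError(f"unknown component: {name}")
    return profiles


def select_components(profiles: List[dict], browser: BrowserProps) -> List[Modifier]:
    """Return the component list of the first profile that fits the screen."""
    for profile in profiles:
        limit = profile["max_screen_width"]
        if limit is None or browser.screen_width <= limit:
            return [REGISTRY[name] for name in profile["components"]]
    return []  # no matching profile: pass the page through unchanged


def transform(html: str, browser: BrowserProps, profiles: List[dict]) -> str:
    """Feed the page through each selected component in order."""
    for modify in select_components(profiles, browser):
        html = modify(html, browser)
    return html
```

Keeping the device-to-component mapping in data rather than in code means a new device class can be supported by editing the configuration alone, which is the main reason for sketching it this way.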
This interface will provide one function, which will accept a file and an object containing the requesting browser’s properties. After the called component is done with the file, it returns a modified file. Each component in PocketProxy’s stack will receive as input the output of the component before it. The first component will receive the unchanged webpage as input, and the last component will give the altered webpage to the proxy server to send to the client’s browser.

The first component that will use the IHtmlModifier interface is the Tidy component. Tidy will take the open source code from HTML Tidy and wrap it in a component that complies with the IHtmlModifier interface.

After Tidy has run, it will pass the altered webpage to another component, HtmlStripper. HtmlStripper will remove any content deemed unnecessary. There will probably be several different versions designed for different devices. Content that could be removed includes comments, images, JavaScript, and special-purpose tags (e.g., object).

The next component is designed to change the layout of the page to fit the device screen. This component, HtmlLandscaper, uses techniques from <INSERT PAPER REFERENCE HERE>.

Although the main goal of this project is to convert webpages so they are viewable on small screen non-phone devices, I plan on implementing an additional component, HtmlWapToo, to convert the HTML from HtmlStripper into a WAP 2.0 compliant webpage. This will allow the proxy to provide faster access for laptops, PDAs, and cell phones using slower connections, and to provide viewable webpages for smaller screens.

No caching or preemptive requests will be considered in this version. Image compression will also be left out.

Evidence

Given that several projects with similar goals (converting HTML to WAP) have already been completed, this project is very feasible.

Milestones

1. PocketProxy reads configuration files, passes unmodified webpages through correctly, gets browser information from the client, and correctly loads test versions of components implementing the IHtmlModifier interface.
2. Tidy component (wrapping HTML Tidy) built and working with PocketProxy.
3. HtmlStripper component built and working with PocketProxy. Device profiling test provides laptops with the unstripped page, while PDAs receive the stripped version.
4. HtmlLandscaper component built and working with PocketProxy. Device profiling test provides laptops with the stripped page, while PDAs receive the version with the modified layout.
5. HtmlWapToo component built and working with PocketProxy. Device profiling test provides laptops with the stripped page, PDAs with the version with the modified layout, and cell phones with the WAP 2.0 converted page.

Summary

This project will provide faster access to the web for devices that connect over cellular networks, and it will make web pages more readable on small screen devices.

References

[1] de Bruijn, et al. RSVP Browser: Web Browsing on Small Screen Devices. 2002.
[2] Chen, et al. iMobile EE: an enterprise mobile service platform. Wireless Networks, Volume 9, Issue 4, July 2003. Pages 283-297.
[3] Rao, et al. iMobile: a proxy-based platform for mobile services. Proceedings of the First Workshop on Wireless Mobile Internet, Rome, Italy, July 2001. Pages 3-10.
[4] Dugas, Robert. WWW Unplugged: An HTML to WML Transcoding Proxy. http://zoo.cs.yale.edu/classes/cs490/00-01b/dugas.robert.rfd8/rfd8cs490.pdf. April 2001.
[5] Kaikkonen, Anne and Roto, Virpi. Navigating in a mobile XHTML application. CHI 2003, April 5-10, 2003, Ft. Lauderdale, Florida, USA. Pages 329-336.
[6] The University of Iowa. How to Create Handheld Friendly Web Pages. http://www.its.uiowa.edu/cs/sp/pda/PDA-handheldfriendly.html.
[7] HTML Tidy Library Project. http://tidy.sourceforge.net/#binaries.
[8] Charlie’s Tidy Add-Ons. http://users.rcn.com/creitzel/tidy.html#dotnet.
[9] Chen, Y., Zhang, H.J., and Ma, Wei-Ying. Detecting web page structure for adaptive viewing on small form factor devices. WWW 2003, May 20-24, 2003, Budapest, Hungary. Pages 225-233.
[10] W3C. Cascading Style Sheets. http://www.w3.org/Style/CSS/.
[11] W3C. HTML 4.0 specification. http://www.w3.org/TR/html4.