Content adaption

Johan Montelius

Introduction

In this lab you will do some server-side scripting to identify a terminal and to adapt content to its characteristics. We will use the following technologies:

• PHP for server-side scripting, www.php.net
• XML for representing content
• XSL for transforming content to presentation
• UAProf for extracting information about the terminal

Getting started

Since we will explore server-side scripting you need a web server to upload your documents to. If you access your documents directly over the file system the scripts will not be executed. I will assume that you use the www.it.kth.se or www.isk.kth.se web server and that the scripting language is PHP. If you’re using another server you need to check which scripting languages are supported and choose one of them. The examples in this tutorial will look different in other languages but the idea is the same.

1 Your first script

PHP scripts are written inline in any document that the server is configured to scan for scripts. Documents with a .html or .php extension are scanned by default but .wml documents are normally not scanned. The reason is that the sequence <?... is used to identify the beginning of a script, and this of course conflicts with the XML declaration in a WML document. To work with PHP and WML we need a little trick: we use the .php extension but tell the web server to present the document as a WML document. Write the following code in a file test.php and upload it to the server.

    <script language="php">
    header('Content-type: text/vnd.wap.wml');
    echo '<?xml version="1.0" ?>
    <!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.3//EN"
      "http://www.wapforum.org/DTD/wml13.dtd">';
    </script>
    <wml>
      <card title="PHP Test">
        <p>A small test.</p>
      </card>
    </wml>

The PHP script is enclosed by the <script language="php"> and </script> tags.
The first statement of the script tells the web server to serve the document as a WML document, so that a browser will accept it and handle it properly even though it has the .php extension. The second statement outputs the XML header. We need to produce it in the script since we cannot write it directly in the document. Notice that the final document consists of all text in the source document outside the script sections, plus anything the scripts output using echo or print statements.

Access the file test.php from a phone or WML browser and make sure that no traces of the PHP script remain.

2 Knock, knock. - Who is there?

The PHP script engine runs on the server and has direct access to the HTTP GET (or POST) request. The information in the request headers is made available to the scripts as entries in a keyword-indexed array. One way to find out what information we can work with is to write a script that displays the content of the array. Try the following script in a document called http.php. Don’t forget the PHP intro that generates the XML header.

    <wml>
      <card id="home" title="HTTP info">
        <p>This is an example of HTTP variables that you can access using PHP.</p>
        <p>
          <a href="#server">SERVER</a> <br/>
        </p>
      </card>
      <card id="server" title="HTTP info">
        <p><b>SERVER</b></p>
        <script language="php">
        /* print every key/value pair in the $_SERVER array */
        foreach ($_SERVER as $key => $val) {
          echo "<p>Key: ".$key." <br/>";
          echo " Val: <![CDATA[".$val." ]]></p>";
        }
        </script>
      </card>
    </wml>

If you access the document with a browser (try several) you will find that they produce quite different information. Some of the information pertains to the web server but some depends on the browser. Pay special attention to the HTTP_ACCEPT and HTTP_USER_AGENT keys. Why did I use the construct with <![CDATA[ ...]]>, why not simply echo the $val value?

If we access HTTP_USER_AGENT directly we can write the following page (I know, I’m getting sloppy).
    <wml>
      <card id="home" title="I Spy">
        <p>Hello, I see that you're using a <?php echo $_SERVER['HTTP_USER_AGENT']; ?>!</p>
      </card>
    </wml>

As you know it is very tricky to use the user-agent string to adapt the content, so let’s move on to the accept header.

3 WML/HTML

Let’s create a site that returns WML or HTML documents depending on the capabilities of the client browser. The accept header is however not as simple to parse as it might seem at first sight. Notice that you (probably) have some q=0.9 or similar directives in the sequence. These parameters are quality factors and specify whether a media type is more (1.0) or less (0.0) preferred. If nothing is mentioned the quality factor is 1.0 by default. Take a look in RFC 2616 for a complete description of the accept header.

Let’s make a simple page that returns a WML page if WML is an accepted media type and an HTML page otherwise. We use the library function strpos() to detect if the accept header contains the string “text/vnd.wap.wml” and then produce two different versions of the page. Note that since “0” could also be interpreted as “false” we need to be very careful when we check the result of the strpos() call; “===” is equality without type conversion.
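The pitfall with strpos() can be seen in isolation in a small sketch. It is meant to be run on its own from the command line, so it uses the plain <?php opening tag; the helper name accepts() and the sample accept string are made up for the illustration and are not part of the lab files.

```php
<?php
// Hypothetical helper (not part of the lab files):
// true if the media type $type occurs anywhere in the accept string.
function accepts($accept, $type) {
    // strpos() returns the integer offset of the match, or false if there is none
    return strpos($accept, $type) !== false;
}

// Made-up accept header where the WML type comes first, i.e. at offset 0
$accept = "text/vnd.wap.wml, text/html;q=0.9";

$pos = strpos($accept, "text/vnd.wap.wml");
var_dump($pos);            // int(0): found at the very start
var_dump($pos == false);   // bool(true): loose comparison mistakes 0 for "not found"
var_dump($pos === false);  // bool(false): strict comparison gets it right

var_dump(accepts($accept, "text/vnd.wap.wml"));       // bool(true)
var_dump(accepts($accept, "application/xhtml+xml"));  // bool(false)
```

A substring test like this ignores the quality factors completely; a type listed with q=0 would still count as accepted.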
    <script language="php">
    $accept = $_SERVER['HTTP_ACCEPT'];

    /* $pos will be either false or the position of the string */
    $pos = strpos($accept, "text/vnd.wap.wml");

    /* $wml will be true if the string was found */
    $wml = ($pos === 0 || $pos > 0);

    if ($wml) {
      header('Content-type: text/vnd.wap.wml');
      echo '<?xml version="1.0" ?>';
      echo '<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.3//EN"
        "http://www.wapforum.org/DTD/wml13.dtd">';
      echo '<wml>
        <card title="Dynamic">
          <p>Ahh, a WML surfer!</p>
        </card>
      </wml>';
    } else {
      header('Content-type: text/html');
      echo '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">';
      echo '<html>
        <head>
          <title>Dynamic</title>
        </head>
        <body>
          <p>Ahh, an HTML surfer!</p>
        </body>
      </html>';
    }
    </script>

We could also check for XHTML support and use the quality factors to serve the content that is most wanted. Since some browsers (WAP 2.0) can handle both XHTML and WML we would have a choice. We could also direct the client to the right document using a redirect directive. This would however not be the best solution since it would require a mobile client to perform a second HTTP GET request. One alternative would then be to serve a WML page with WML links or an XHTML page with HTML links.

4 A bit of XML parsing

In the quest to separate content from presentation we will do a bit of XML hacking. Let’s assume that we have a file with the latest news in the following format:

    <?xml version="1.0"?>
    <news>
      <story id="n1">
        <date>2004/08/20</date>
        <title>This is news</title>
        <short>This is a short description of the news. We will keep
          it short in order to deliver it to smaller devices.
        </short>
        <long>
          <pre>A short intro before the whole text</pre>
          <full>This is the full story. It could be long and include
            all the details that could not fit into the short
            description. I don't want to write it all now.</full>
        </long>
      </story>
      <story id="n2">
        <date>2004/08/21</date>
        <title>Even more news</title>
        <short>The short description of the news is now shorter.</short>
        <long>
          <pre>A small intro</pre>
          <full>And the full story. It could be long and include
            all the details that could not fit into the short
            description.
          </full>
        </long>
      </story>
    </news>

The quest is now to serve this content to smaller devices using WML and to larger devices using XHTML. We need to do the same trick as before to discover the capabilities of the device, but we also need to transform the above news descriptions into WML and XHTML. Let’s first create a page, news-wml.php, that delivers the WML version.

To help us in this process we use an XML parser accessible from PHP. The parser is a generic parser that runs through an XML document and by default does nothing. We need to create an element handler and a character handler to produce anything. The handlers are called every time the parser enters or leaves an element, or when a character data section is encountered.

Look at the function startElement below; it is called when we enter an element. If the name of the element is SHORT or TITLE we should do something. If we enter the short element we set the global variable $chardata to true and write a <p> tag to the output stream. The $chardata variable is a signal to the character data function to output the character data.
    <script language="php">
    header('Content-type: text/vnd.wap.wml');
    echo '<?xml version="1.0" ?>';
    echo '<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.3//EN"
      "http://www.wapforum.org/DTD/wml13.dtd">';
    </script>
    <wml>
      <card>
        <script language="php">
        /* true while we are inside an element whose text should be output */
        $chardata = FALSE;

        function startElement($parser, $name, $attrs) {
          global $chardata;
          if (strcmp($name, "SHORT") == 0) {
            $chardata = TRUE;
            echo "<p>";
          } elseif (strcmp($name, "TITLE") == 0) {
            $chardata = TRUE;
            echo "<p>";
          }
        }

        function endElement($parser, $name) {
          global $chardata;
          if (strcmp($name, "SHORT") == 0) {
            $chardata = FALSE;
            echo "</p>";
          } elseif (strcmp($name, "TITLE") == 0) {
            $chardata = FALSE;
            echo "</p><br/>";
          }
        }

        function characterData($parser, $data) {
          global $chardata;
          if ($chardata) {
            echo $data;
          }
        }

        /* create the parser */
        $xml_parser = xml_parser_create();

        /* we want all our tags to be upper case */
        xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, true);

        /* now we add an element handler */
        xml_set_element_handler($xml_parser, "startElement", "endElement");

        /* and a character handler */
        xml_set_character_data_handler($xml_parser, "characterData");

        /* the name of the news file */
        $file = "news.xml";

        /* open the file */
        if (!($fp = fopen($file, "r"))) {
          die("could not open XML input");
        }

        /* start reading and feeding the parser */
        while ($data = fread($fp, 4096)) {
          if (!xml_parse($xml_parser, $data, feof($fp))) {
            die(sprintf("XML error: %s at line %d",
                xml_error_string(xml_get_error_code($xml_parser)),
                xml_get_current_line_number($xml_parser)));
          }
        }

        /* clean up */
        fclose($fp);
        xml_parser_free($xml_parser);
        </script>
      </card>
    </wml>

Go through the code that sets up the parser and try to output the WML content in another way, maybe including the date. Then create a file news-html.php that produces the XHTML version.

5 XSL Transformations

The above transformations can be automated by using XSL transformations. You need a web server with PHP and XSLT support.
If you’re going to work on another web server than web.it.kth.se you must make sure that the Sablotron package is loaded. You can always check which packages are supported by writing a small PHP page with the call phpinfo(). It is also very useful to have xsltproc installed on your local machine to run XSL transformations locally. You can find documentation on xsltproc at http://xmlsoft.org/XSLT/index.html, and you can install it under Cygwin if you’re using Windows.

We start from the news data file, news.xml. Now we create an XSL document with directives on how to transform the news. We first create a file, news-wml.xsl, that will output the news in WML format.

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:output method="xml" encoding="ISO-8859-1"
          doctype-public="-//WAPFORUM//DTD WML 1.1//EN"
          doctype-system="http://www.wapforum.org/DTD/wml_1.1.xml"/>

      <xsl:template match="/">
        <wml>
          <card>
            <p>
              <table columns="1">
                <xsl:for-each select="news/story">
                  <tr>
                    <td>
                      <a>
                        <xsl:attribute name="href">#<xsl:value-of select="@id"/></xsl:attribute>
                        <xsl:value-of select="title"/>
                      </a>
                    </td>
                  </tr>
                </xsl:for-each>
              </table>
            </p>
          </card>
          <xsl:for-each select="news/story">
            <card>
              <xsl:attribute name="id">
                <xsl:value-of select="@id"/>
              </xsl:attribute>
              <p><xsl:value-of select="title"/><br/>
                <xsl:value-of select="short"/>
              </p>
            </card>
          </xsl:for-each>
        </wml>
      </xsl:template>
    </xsl:stylesheet>

Now this is of course a lot of XSL in one go, but if you go through the code you will get the main idea of how it works. The first card is an index with one link per story, and each story then gets a card of its own whose id matches the link. The next thing we need is a script file that calls the XSLT processor and applies the XSL transformation to the news.xml file.
    <script language="php">
    $file = "news.xml";
    $xsl = "news-wml.xsl";

    /* create the XSLT processor and run the transformation */
    $xh = xslt_create();
    $result = xslt_process($xh, $file, $xsl);

    header('Content-type: text/vnd.wap.wml');
    if ($result) {
      print $result;
    } else {
      print "Error";
    }
    xslt_free($xh);
    </script>

Now write a news-html.xsl file that transforms the news into XHTML. You also need a news-xsl-html.php file, but as you can see this is a trivial modification. Using the two XSL transformations and the examples above where we looked at the accept header you can easily write a news-xsl.php file that calls the right XSL file depending on the accept header.

6 UAProf

The most interesting test is of course whether we can take the UAProf from the request header and transform it into an XSL file that can be used to transform our news file. We will only do a simple transformation to see that we can actually retrieve the UAProf and parse it. It becomes quite complicated if we want to do more advanced transformations.

We first need a new directory called cache that the web server can write to (including insert rights). You need to set the AFS rights to arrange this. We will store UAProfs and also device XSL files in this directory.

We then need a file uaprof.xsl that will take a UAProf file and turn it into an XSL file for producing WML. We will only produce WML files given the news.xml file that we have, but we will augment them with a greeting that is unique for each terminal. You will find this file on the course web site.

Next we need a uaprof.php file that reads the UAProf (if available), caches it and processes it using uaprof.xsl. The result is saved as model.xsl (where model is the name of the terminal) in the cache. This XSL file is then used to process the news.xml file. Tricky, yes indeed.

Set it up and test it with a phone, then try other models and see how you start to collect UAProf data. For a more advanced exercise you can try to extract information about screen size and serve each device a banner that fits the screen.
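As a starting point for the first step of uaprof.php, here is a minimal sketch of how the profile URL could be picked out of the request and mapped to a file in the cache directory. It is not the file from the course web site. It assumes that the terminal announces its profile in the WAP 2.0 x-wap-profile header, which PHP exposes as HTTP_X_WAP_PROFILE; the helper names uaprofUrl() and cacheName() and the naming scheme for cached files are invented for the illustration. The sketch uses the plain <?php opening tag so that it can be tried from the command line.

```php
<?php
// Hypothetical helper: returns the UAProf URL announced by the client,
// or null if the request carries no profile header.
function uaprofUrl($server) {
    // Assumption: the profile is sent in the "x-wap-profile" header
    if (!isset($server['HTTP_X_WAP_PROFILE'])) {
        return null;
    }
    // The value is normally quoted and may be followed by further
    // comma-separated items; keep only the first one.
    $parts = explode(',', $server['HTTP_X_WAP_PROFILE']);
    return trim($parts[0], " \t\"");
}

// Hypothetical helper: name of the cached copy of a profile.
function cacheName($url) {
    // basename() drops the directory part, which also keeps the
    // file inside the cache directory
    return 'cache/' . basename($url);
}

$url = uaprofUrl($_SERVER);
if ($url === null) {
    echo "No UAProf header in this request\n";
} else {
    echo "Profile " . $url . " would be cached as " . cacheName($url) . "\n";
}
```

The header value comes from the client and cannot be trusted, so a real script should validate the URL before fetching it and before using any part of it as a file name.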