PHP with XML

Dequan Chen and Narith Kun
Term Project for WSU 2010 Summer Course - CS366
Emails: dequanchen2007@gmail.com, NKun08@winona.edu

Project Outline
• Introduction
• What’s PHP?
• History of PHP Supporting XML
• PHP SimpleXML Functions
• PHP DOM Parser
• PHP XML Expat Parser
• Conclusion
• Demo 1 - PHP processes a simple XML file
• Demo 2 - PHP processes a complex XML (our HWK2 XML) file
• References

Introduction
• XML is a widely adopted standard: a set of rules for encoding documents in a machine-readable form.
• XML support was taken more seriously in PHP 5 (2004) than in PHP 4 (2000) and PHP 3 (1998).
  - This presentation focuses on XML support in PHP 5, specifically the latest version, PHP 5.3.2.

What’s PHP?
• PHP (Hypertext Preprocessor) is a popular general-purpose server-side scripting language that can be embedded into HTML. It was originally designed to produce dynamic web pages, but it can now be used to build a wide variety of applications, from small scripts to large-scale, complex systems.
  - Note: The name PHP is derived from Personal Home Page Tools, or PHP/FI (as in PHP/FI 1.0 and 2.0).

History of PHP Supporting XML
• PHP 3: SAX (Simple API for XML) parser (Expat parser)
• PHP 4: SAX + DOMXML (non-standard, API-breaking, memory-leaking, incomplete functionality)
• PHP 5: SAX + DOM + SimpleXML (rewritten from scratch)

PHP SimpleXML Functions
• SimpleXML handles the most common XML tasks
• It is an easy way to get an element's attributes and text content
• SimpleXML needs only a few lines of code to read text data from an element (fewer than DOM or the Expat parser)

SimpleXML Functions:
• simplexml_import_dom: Gets a SimpleXMLElement object from a DOM node
• simplexml_load_file: Interprets an XML file into an object
• simplexml_load_string: Interprets a string of XML into an object

SimpleXMLElement Class:
• SimpleXMLElement::addAttribute: Adds an attribute to the SimpleXML element
• SimpleXMLElement::addChild: Adds a child element to the XML node
• SimpleXMLElement::asXML: Returns a well-formed XML string based on the SimpleXML element
• SimpleXMLElement::attributes: Identifies an element's attributes
• SimpleXMLElement::children: Finds the children of a given node
• SimpleXMLElement::__construct: Creates a new SimpleXMLElement object
• SimpleXMLElement::count: Counts the children of an element
• SimpleXMLElement::getDocNamespaces: Returns the namespaces declared in the document
• SimpleXMLElement::getName: Gets the name of the XML element
• SimpleXMLElement::getNamespaces: Returns the namespaces used in the document
• SimpleXMLElement::registerXPathNamespace: Creates a prefix/namespace context for the next XPath query
• SimpleXMLElement::xpath: Runs an XPath query on the XML data

PHP SimpleXML Code
For parsing an XML file:

    <?php
    // Load the XML file into a SimpleXMLElement object
    $xml = simplexml_load_file("test2.xml");

    // Print the root element's name and attributes
    echo $xml->getName() . ": ";
    foreach ($xml->attributes() as $Attr) {
        echo $Attr->getName() . "=" . $Attr . " ";
    }

    // Print every descendant element
    displayChildrenRecursive($xml);
    echo "<br />";

    // Recursively print each child element's name, text, and attributes
    function displayChildrenRecursive($xmlObj, $depth = 0) {
        foreach ($xmlObj->children() as $child) {
            echo "<br/>";
            echo str_repeat('-', $depth) . $child->getName() . ": " . $child . " ";
            foreach ($child->attributes() as $attr) {
                echo str_repeat('', $depth) . $attr->getName() . "=" . $attr . " ";
            }
            displayChildrenRecursive($child, $depth + 1);
        }
    }

How Does The Code Work?
1. Load the XML file.
2. Get the name and attributes of the root element.
3. Define a recursive function with a foreach loop that visits each child node, using the children() function.
4. For each child node: use the getName() method to display the element name, then use a foreach loop over the attributes() function to get the child's attribute nodes.
5. For each attribute, use the getName() method to display the attribute name.

PHP XML DOM Parser
• The DOM parser is a tree-based parser.
• Tree-based parser: transforms an XML document into a tree structure. It analyzes the whole document and provides access to the tree elements.
• Good for small and medium-sized XML files (the whole file is loaded into memory).
• Example:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <from>Jani</from>

• The XML DOM parses the XML above as follows:
  - Level 1: XML document
  - Level 2: Root element tag: <from>
  - Level 3: Root element content: "Jani"

DOM Document Functions
• There are many: (http://www.php.net/manual/en/class.domdocument.php)
• DOMDocument::__construct: Creates a new DOMDocument object
• DOMDocument::load: Loads XML from a file
• DOMDocument::saveXML: Dumps the internal XML tree back into a string

DOM Element Functions
• There are many: (http://www.php.net/manual/en/class.domelement.php)
• DOMElement::__construct: Creates a new DOMElement object
• DOMElement::getAttribute: Returns the value of an attribute
• DOMElement::getAttributeNode: Returns an attribute node

PHP XML DOM Code

    <?php
    // Method 1: load the file and dump the whole tree back out as a string
    $xmlDoc = new DOMDocument();
    $xmlDoc->load("test2.xml");
    print $xmlDoc->saveXML();

    print "<br/><br/>Alternative - Method 2: <br/><br/>";

    // Method 2: walk the tree, starting with the root element
    $x = $xmlDoc->documentElement;
    print $x->nodeName . ": ";
    foreach ($x->attributes as $Attr) {
        echo $Attr->nodeName . " = \"" . $Attr->nodeValue . "\" ";
    }
    displayChildrenRecursive($x);

    // Recursively print each element's name, attributes, and text content
    function displayChildrenRecursive($xmlObj, $depth = 0) {
        foreach ($xmlObj->childNodes as $child) {
            // Skip text, comment, and other non-element nodes
            if ($child->nodeType == XML_ELEMENT_NODE) {
                echo "<br/>";
                echo "-" . str_repeat('-', $depth) . $child->nodeName . ": ";
                foreach ($child->attributes as $attr) {
                    echo str_repeat('', $depth) . $attr->nodeName . " = \"" . $attr->nodeValue . "\" ";
                }
                echo "(" . $child->textContent . ")";
                displayChildrenRecursive($child, $depth + 1);
            }
        }
    }
    ?>

How Does The Code Work?
1. Create a DOMDocument instance and use it to load an XML file.
2. Method 1: The saveXML() function puts the internal XML document into a string, so we can output it.
3. Method 2:
   - Get the root element's name and attributes.
   - Loop recursively through all element nodes among the childNodes and output the names and contents of the element nodes and attribute nodes.

PHP XML Expat Parser
• The Expat parser is an event-based parser.
• Event-based parser: views an XML document as a series of events, each handled by a specific function (it accesses data faster than tree-based parsers).
• Good for XML files of any size.
• Example:

    <from>Jani</from>

• The XML Expat parser reports the XML above as a series of 3 events:
  - Start element: from
  - Character data, value: Jani
  - Close element: from

Expat Parser Functions
• There are many: (http://www.php.net/manual/en/book.xml.php)
• xml_parser_create: Creates an XML parser
• xml_parser_get_option: Gets options from an XML parser
• xml_parser_set_option: Sets options in an XML parser
• xml_set_character_data_handler: Sets up the character data handler
• xml_set_default_handler: Sets up the default handler
• xml_set_element_handler: Sets up the start and end element handlers
• xml_set_processing_instruction_handler: Sets up the processing instruction (PI) handler

PHP Expat Parser Code

    <?php
    echo "<pre>";
    $file = "test2.xml";
    #echo $file."\n";
    global $inTag;
    $inTag = "";

    // Create the parser; keep tag names in their original case and skip whitespace-only values
    $xml_parser = xml_parser_create();
    xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, 0);
    xml_parser_set_option($xml_parser, XML_OPTION_SKIP_WHITE, 1);

    // Register one handler function per kind of XML event
    xml_set_processing_instruction_handler($xml_parser, "pi_handler");
    xml_set_default_handler($xml_parser, "parseDEFAULT");
    xml_set_element_handler($xml_parser, "startElement", "endElement");
    xml_set_character_data_handler($xml_parser, "contents");
    … (Note: The other code is not displayed here)
    ?>

How Does The Code Work?
1. Initialize the XML parser with the xml_parser_create() function.
2. Write a function for each kind of XML event and register it with the matching handler setter (xml_set_element_handler(), xml_set_character_data_handler(), etc.).
3. Parse the XML file with the xml_parse() function.
4. In case of an error, use the xml_error_string() function to convert the XML error code into a textual description.
5. Call the xml_parser_free() function to release the memory.

Conclusion
• The latest PHP (5.3.2) offers 3 ways to parse an XML file:
  - SimpleXML
  - (XML) DOM
  - XML Expat Parser
• The SimpleXML code is simple and works well for the most common XML tasks in PHP.
• Each PHP XML parser has its own advantages (e.g., SimpleXML is less suited to advanced XML with namespaces, while the Expat parser and XML DOM handle this kind of XML well).

DEMOS
• Demo 1 - PHP processes a simple XML file (test1.xml)
  - xml_test1a.php (SimpleXML)
  - xml_test1b.php (XML DOM)
  - xml_test1c.php (XML Expat Parser)
• Demo 2 - PHP processes a complex XML (our HWK2 XML) file (test2.xml)
  - xml_test2a.php (SimpleXML)
  - xml_test2b.php (XML DOM)
  - xml_test2c.php (XML Expat Parser)

References
1. Official PHP website: http://www.php.net/
2. PHP & XML Support History: http://uk3.php.net/manual/en/history.php.php and http://devzone.zend.com/article/1713#Heading13
3. Full Web Building Tutorials: http://www.w3schools.com/
4. Parsing XML using SimpleXML: http://debuggable.com/posts/parsing-xml-using-simplexml:480f4dfe6a58-4a17-a133-455acbdd56cb
5. XML for PHP developers, Part 1: The 15-minute PHP-with-XML starter: http://www.ibm.com/developerworks/library/x-xmlphp1.html
6. Beginning XML, 4th Edition. Eds. David Hunter, Jeff Rafter, Joe Fawcett, Eric van der Vlist, Danny Ayers, Jon Duckett, Andrew Watt (XMML.com, Strathdon, Scotland), Linda McKinnon (Calgary, Alberta, Canada). Wiley Publishing, Inc., Indianapolis, IN, 2007.
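
Supplementary sketch: Expat parser steps
The five steps listed under "How Does The Code Work?" for the Expat parser can be sketched end to end as follows. This is a minimal illustration, not the demo code from the project (which is not shown in the slides). The function name collectXmlEvents and the inline closures are assumptions made for this example, and it parses the small <from>Jani</from> string from the Expat slide rather than test2.xml so that it runs on its own.

```php
<?php
// Hypothetical helper (not from the slides): parses an XML string with the
// Expat-style xml_* functions and returns the events in the order they fire.
function collectXmlEvents($xml)
{
    $events = array();

    // Step 1: initialize the parser and keep tag names in their original case.
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    // Step 2: register handlers; each one only records the event it receives.
    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$events) {
            $events[] = "start: $name";
        },
        function ($p, $name) use (&$events) {
            $events[] = "end: $name";
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$events) {
            // Ignore whitespace-only text between elements.
            if (trim($data) !== '') {
                $events[] = "text: $data";
            }
        }
    );

    // Step 3: parse; the third argument marks this as the final chunk of data.
    if (xml_parse($parser, $xml, true) !== 1) {
        // Step 4: convert the error code into a readable message.
        $events[] = 'error: ' . xml_error_string(xml_get_error_code($parser));
    }

    // Step 5: release the parser (on PHP 8 and later this call has no effect).
    xml_parser_free($parser);

    return $events;
}

print_r(collectXmlEvents('<from>Jani</from>'));
```

For the sample string, the returned list should contain a start event for from, the text Jani, and an end event for from, mirroring the three events described on the Expat slide. Collecting events into an array instead of echoing them from inside the handlers is a design choice made here only to keep the sketch easy to inspect.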