XML Patch Operations based on XPath selectors
Jari Urpalainen
IETF62 Minneapolis

Short overview
● Purpose was to remove the overlapping parts of xcap-diff and partial-pidf, which now reference patch-ops
● Approximately XCAP PUT and DELETE semantics. Patch operations are embedded within a transported XML document
● Uses XPath 1.0 compatible selectors
● Combines several requests: <add>, <replace> and <remove> with optional data into a single XML doc
● XML Schema defines only types without any targetNamespace -> the elements inherit the namespace of the including Schema

Changes
● The names in location steps are fully namespace qualified.
● The framing document (xcap-diff, pidf-diff etc.) carries all the namespace definitions that are needed to apply the patch requests.
● The added or modified data content is also fully namespace qualified.

Open Issues/BUGS
● Whitespace text node handling in <add> and <remove>
● In addition to the above, support is lacking for some XPath data model nodes: namespaces, comments and processing instructions
● Lack of the ability to say: I want to add these nodes immediately before/after this element that already exists within the patched document

Adding namespace definitions
● namespace axis of XPath 1.0 does the trick
    <add sel="root" type="namespace::prefix">urn:xml:ns:something</add>
● value of "sel" selects the element where the namespace definition will be added (similar to adding an attribute with "@")
    <remove sel="root/namespace::prefix"/>
    <replace sel="root/namespace::prefix">urn:xml:ns:something</replace>
● value of "sel" selects the namespace node

<add>
● except for attribute and namespace nodes, this will always (by default) append elements, text nodes, comments and processing instructions as the last child(ren)
    <add sel="root"> <!-- This is a comment --> <new-element a="1"/><new-element a="2"/></add>
● This allows easy handling of whitespace text nodes as well as several siblings at the same time

<add> parameters
● the value of the 'sel' attribute selects a single unique element from the patched doc
● 'pos' attribute: "to" [default], "before", "after"
  ● "before" = immediate preceding sibling node
  ● "after" = immediate following sibling node
  ● "to" = last child(ren) node or attribute/namespace
● 'type' attribute; values: node() [default], text(), @attr, namespace::prefix
● child element(s) or text content of <add> = the new/updated XML fragment(s) or values for ns/attr

Add before and after
● XPath 1.0 defines the axes preceding-sibling and following-sibling, which could be used:
    <add sel="root/elem[@a='1']" type="preceding-sibling::node()[1][self::comment()]">This is a comment</add>
● instead, a much simpler model using the "pos" attribute can be used:
    <add sel="root/elem[@a='1']" pos="before"><!-- This is a comment --></add>
    <add sel="root/elem[@a='1']" pos="after"> <!-- This is a new added node --><new-node a="1"/> </add>

<replace>
● Only one "sel" selector (must locate a unique node)
● The last location step may also be comment(), processing-instruction("x") or text().
    <replace sel="root/elem[@a='1']"><update/></replace>
    <replace sel="root/@a">new attr value</replace>
    <replace sel="root/namespace::prefix">urn:new</replace>
    <replace sel="root/comment()[1]">This is a new comment</replace>
    <replace sel="root/processing-instruction('foo')">bar="foobar"</replace>
    <replace sel="root/elem[1]/text()[1]">This is the new text content</replace>

<remove>
● whitespace text nodes are somewhat problematic
    A. <remove sel="root/text()[1]"/> <remove sel="root/elem[@a='1']"/>
    B. <remove sel="root/text()[1] | root/elem[@a='1']"/>
    C. <remove sel="root/elem[@a='1']/preceding-sibling::text()[1] | root/elem[@a='1']"/>
The proposed model (similar to C, but simpler):
● in addition to 'sel', an optional whitespace attribute 'ws': "before", "after", "both", "none" [default]
    <remove sel="root/elem[@a='1']" ws="before"/>

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document [
  <!ENTITY ncname "[^:\I][^:\C]*">
  <!ENTITY qname "(&ncname;:)?&ncname;">
  <!ENTITY aname "@&qname;">
  <!ENTITY pos_t "\[\d+\]">
  <!ENTITY attr_t "\[&aname;=('|&quot;)[.\n]*('|&quot;)\]">
  <!ENTITY name_t "\[(&qname;|\.)=('|&quot;)[.\n]*('|&quot;)\]">
  <!ENTITY cond "(&attr_t;|&name_t;)?(&pos_t;)?|(&pos_t;)?(&attr_t;|&name_t;)?">
  <!ENTITY step "(&qname;|\*)(&cond;)?">
  <!ENTITY pi "processing-instruction\((('|&quot;)&qname;('|&quot;))?\)">
  <!ENTITY comm "comment\(\)">
  <!ENTITY text "text\(\)">
  <!ENTITY nspace "namespace::&ncname;">
  <!ENTITY last "&step;|&aname;|&nspace;|(&comm;(&pos_t;)?)|&text;(&pos_t;)?|&pi;(&pos_t;)?">
]>
<xsd:schema ...
  <xsd:simpleType name="xpath">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="(&step;/)*(&last;)"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>

Impact on xcap-diff & pidf-diff etc.
● Common model for applying XML changes
● XML Schemas include patch-ops, which is only a framework for patch operations
● Content-Types are defined according to the including schema within xcap-diff and pidf-diff etc.
● As these operations describe the model to patch the full XPath 1.0 data model node types, applications may either extend or restrict these as desired
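
The "pos" semantics of <add> ("to", "before", "after") can be illustrated with a minimal Python sketch. It is not taken from the slides or from any reference implementation: the function name apply_add and the temporary "holder" element are hypothetical, and the sketch assumes that the payload consists of element children only and that the selector stays within the small XPath subset supported by xml.etree.ElementTree. The 'type' attribute, text, comment and namespace nodes, and namespace-qualified selectors are left out.

```python
import copy
import xml.etree.ElementTree as ET


def apply_add(root, add):
    """Apply one <add> request with element children to the tree at `root`.

    Hypothetical helper: only the 'sel' and 'pos' attributes are handled.
    """
    # Selectors start at the document root ("root/elem[...]"), so hang the
    # root element under a temporary holder and evaluate the path from there.
    holder = ET.Element("holder")
    holder.append(root)

    matches = holder.findall(add.get("sel"))
    if len(matches) != 1:
        raise ValueError("'sel' must select a single unique element")
    target = matches[0]

    # Copy the payload so the patch document itself is left untouched.
    new_nodes = [copy.deepcopy(child) for child in add]

    pos = add.get("pos", "to")
    if pos == "to":
        # Default: append as the last child(ren) of the selected element.
        target.extend(new_nodes)
        return
    if pos not in ("before", "after"):
        raise ValueError("unknown 'pos' value: %r" % pos)

    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in holder.iter() for child in parent}
    parent = parents[target]
    if parent is holder:
        raise ValueError("the document root cannot be given siblings")

    # Insert immediately before or after the selected element, in order.
    index = list(parent).index(target) + (1 if pos == "after" else 0)
    for offset, node in enumerate(new_nodes):
        parent.insert(index + offset, node)


doc = ET.fromstring('<root><elem a="1"/><elem a="2"/></root>')
patch = ET.fromstring(
    """<add sel="root/elem[@a='1']" pos="after"><new-node a="1"/></add>"""
)
apply_add(doc, patch)
print(ET.tostring(doc, encoding="unicode"))
```

After the call, new-node sits between the two elem children of root. A fuller implementation would need a complete XPath 1.0 engine and a tree model that preserves comments, processing instructions and whitespace text nodes, since those are exactly the node types the slides are concerned with.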