MEDIN Standards Workshop
Standards / XML / Validation / Transformation / ESRI

Introduction
– XML
– Schema and Validation
  • XSD Schema
  • Schematron
– Transformation
  • Stylesheets
– ESRI ArcGIS
– Search

XML

XML
• Extensible Markup Language (XML)
  – A metamarkup language
  – The basic unit is called an element

    <tag attribute="attribute value">element value</tag>

    (Callouts: Element, Opening tag, Attribute, Closing tag)
  – Apparently similar to HTML but…

Metamarkup?
• What does metamarkup mean?
  – There is no predefined and fixed set of tags for XML
  – XML allows implementers to define their own set of tags to meet their needs

Examples
• Office Open XML (ISO/IEC 29500)
• Geography Markup Language (ISO 19136)

Markup – ESRI ArcGIS 10 XML

    <idCitation>
      <resTitle>Title</resTitle>
      <date>
        <createDate>20110906</createDate>
      </date>
    </idCitation>

Markup – ISO 19139 XML

    <gmd:citation>
      <gmd:CI_Citation>
        <gmd:title>
          <gco:CharacterString>Title</gco:CharacterString>
        </gmd:title>
        <gmd:date>
          <gmd:CI_Date>
            <gmd:date>
              <gco:Date>2011-09-06</gco:Date>
            </gmd:date>
            <gmd:dateType>
              <gmd:CI_DateTypeCode codeList="...#CI_DateTypeCode"
                                   codeListValue="creation">creation</gmd:CI_DateTypeCode>
            </gmd:dateType>
          </gmd:CI_Date>
        </gmd:date>
      </gmd:CI_Citation>
    </gmd:citation>

Well-Formed
• XML has strict rules, e.g.:
  – There must be one, and only one, root element
  – Every element must have an opening and a closing tag (or be written as an empty-element tag such as <citation/>)
  – Element names are case sensitive:
    • <citation/> is different from <Citation/>
  – XML conforming to the rules is said to be well-formed

Well-Formed
Well-formed:

    <idCitation>
      <resTitle>Title</resTitle>
      <date>
        <createDate>20110906</createDate>
      </date>
    </idCitation>

Not well-formed:

    <idCitation>
      <resTitle>Title</ResTitle>     (Opening and closing tags are different)
      <date>
        <createDate>20110906         (No closing tag)
      </date>
    </idCitation>
    <idPurp>Summary</idPurp>         (Two root elements)

Structure
• The markup defines data structure:
  – It signifies which elements are associated
  – It can define semantics:

    <date>
      <createDate>20110906</createDate>
    </date>

  – It says nothing about how to display data (there are exceptions to this rule)

XML is machine readable
• And…
  – Human readable… honestly

Schema and Validation

Schema
• Schemas document the elements that are permitted in an XML application
  – XML that conforms to a schema is said to be schema-valid
  – XML that does not conform to a schema is said to be invalid

XML Schema Definition Language

    <xs:complexType name="CI_Citation_Type">
      ...
      <xs:complexContent>
        <xs:extension base="gco:AbstractObject_Type">
          <xs:sequence>
            <xs:element name="title" type="gco:CharacterString_PropertyType"/>
            <xs:element name="alternateTitle" type="gco:CharacterString_PropertyType"
                        minOccurs="0" maxOccurs="unbounded"/>
            <xs:element name="date" type="gmd:CI_Date_PropertyType" maxOccurs="unbounded"/>
            ...
          </xs:sequence>
        </xs:extension>
      </xs:complexContent>
    </xs:complexType>

Markup – ISO 19139 XML
(The same gmd:citation instance as shown earlier, repeated for comparison with the schema definition.)

Schematron
• Schematron is:
  – A schema language for XML
    • Part of the Document Schema Definition Languages (DSDL)
  – Written in XML
  – An ISO standard: ISO/IEC 19757-3
Find out more at: http://www.schematron.com/

Why use Schematron?
• An XSD schema cannot express some constraints:
  – A choice between attributes
  – A content model that varies with the value of an element or attribute (this sort of constraint is common in the ISO 19115 logical model)
• Implementing profiles (e.g. MEDIN):
  – With Schematron there is no need to edit the underlying standardised XSD

Validation Workflow
[Flowchart]
– XSD Schema Validation: ISO 19139 Schema Validation. Valid? YES: continue
– Schematron Validation: ISO 19139 Table A.1 Constraints Schematron. Valid? YES: continue
– Schematron Validation: MEDIN Profile Schematron. Valid? YES: END PASS
– NO: END FAIL

Validation Tools
[Screenshot: Select profile; XSD Schema; Schematron schemas]

Transformation

XSLT
• Extensible Stylesheet Language Transformations (XSLT)
  – Specifies rules for transforming one XML instance into another XML instance
  – The output XML instance can have a different structure from the input XML instance

ESRI XML to MEDIN XML
• MEDIN XML must follow the ISO 19139 XML encoding
  – Users may wish to use other software to create and manage metadata (e.g. ESRI desktop GIS)
  – ESRI software manages metadata using XML
  – That XML does not follow the ISO 19139 standard
  – The XML can be transformed to ISO 19139
  – MEDIN provides resources to support this

Stylesheet Tools

ESRI ArcGIS

Versions
• ArcGIS 9
  – FGDC / ISO
• ArcGIS 10
  – ESRI Core Metadata
– Both use XML encoding
– The encodings are slightly different
– Why the change at version 10?

ESRI ArcCatalog – Options

ESRI ArcCatalog Transformation
[Diagram: ArcGIS 9 Metadata and ArcGIS 10 Metadata (e.g. internal use), through Transformation, to MEDIN Metadata (e.g. external / publish to DAC)]

Transform Options
• Use MEDIN stylesheets
  – ArcGIS 9 version
  – ArcGIS 10 version (or the Validate button)
• Implementation
  – Any XSL stylesheet processor (version 1.0), e.g.:
    • ArcGIS 9 or 10 ArcToolbox
    • Metadata Maestro
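
Worked example (illustrative only)
The two citation fragments in the Markup slides show what a transformation has to do: restructure the ESRI elements and reformat the date. The sketch below does this for that one fragment using only the Python standard library. It is not the MEDIN stylesheet, which is XSLT, and it covers no other part of a metadata record. The function names (is_well_formed, esri_date_to_iso, esri_citation_to_iso) are hypothetical helpers invented for this example, the namespace URIs are the usual ISO 19139 gmd and gco namespaces, and the codeList value is left elided exactly as on the slide.

```python
import xml.etree.ElementTree as ET

# Namespace URIs conventionally bound to the gmd and gco prefixes in ISO 19139.
GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)

# The ESRI ArcGIS 10 citation fragment from the slides.
ESRI_XML = """<idCitation>
  <resTitle>Title</resTitle>
  <date>
    <createDate>20110906</createDate>
  </date>
</idCitation>"""


def is_well_formed(text):
    """Return True if the text parses as XML, i.e. it is well-formed."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def esri_date_to_iso(value):
    """Convert an ESRI-style YYYYMMDD date string to YYYY-MM-DD."""
    return f"{value[0:4]}-{value[4:6]}-{value[6:8]}"


def esri_citation_to_iso(text):
    """Build a gmd:citation element from an ESRI idCitation fragment.

    Hypothetical helper: handles only the title and creation date
    that appear in the slide example.
    """
    src = ET.fromstring(text)

    citation = ET.Element(f"{{{GMD}}}citation")
    ci = ET.SubElement(citation, f"{{{GMD}}}CI_Citation")

    # resTitle becomes gmd:title/gco:CharacterString
    title = ET.SubElement(ci, f"{{{GMD}}}title")
    ET.SubElement(title, f"{{{GCO}}}CharacterString").text = src.findtext("resTitle")

    # date/createDate becomes gmd:date/gmd:CI_Date/gmd:date/gco:Date
    date = ET.SubElement(ci, f"{{{GMD}}}date")
    ci_date = ET.SubElement(date, f"{{{GMD}}}CI_Date")
    inner = ET.SubElement(ci_date, f"{{{GMD}}}date")
    ET.SubElement(inner, f"{{{GCO}}}Date").text = esri_date_to_iso(
        src.findtext("date/createDate")
    )

    # The element name createDate implies the date type "creation".
    date_type = ET.SubElement(ci_date, f"{{{GMD}}}dateType")
    code = ET.SubElement(
        date_type,
        f"{{{GMD}}}CI_DateTypeCode",
        # codeList is elided on the slide, so it is kept elided here.
        {"codeList": "...#CI_DateTypeCode", "codeListValue": "creation"},
    )
    code.text = "creation"
    return citation
```

Calling ET.tostring(esri_citation_to_iso(ESRI_XML), encoding="unicode") serialises the result with gmd and gco prefixes. In practice an XSLT 1.0 processor running the MEDIN stylesheets does this work; the sketch only makes the structural mapping between the two encodings explicit, and is_well_formed shows that a parser rejects the broken examples from the Well-Formed slide.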