e-Documents based on reusable core assets such as core vocabularies
e-SENS WP6.2 face-to-face meeting, Poznan, 24 October 2013
ISA Programme Action 1.1 - Semantic Interoperability
Suzanne.Wigard@ec.europa.eu
Stijn.Goedertier@pwc.be

Outline
0. ISA Programme Action 1.1
1. What are the Core Vocabularies?
2. Why use the Core Vocabularies?
3. How to use?
4. Where already used?

“Standards are like toothbrushes, a good idea but no one wants to use anyone else's” - Anita Golderba

Questions raised so far during this meeting…
Which building block types: methodology, library, NDR, tools …?
Reuse by restriction/extension?
Syntax vs Semantics?
Governance mechanism and sustainability?
Metadata management?
Reuse existing tools… exchange data elements between tools?
Does it matter at all? (mediation)
Do libraries cover everything?
…
And how about RDF?

ISA undertakes initiatives to foster interoperability of information exchanges by public administrations

What is interoperability?
Ability of disparate organisations to interact towards mutually beneficial and agreed goals, involving the sharing of information and knowledge

European Interoperability Framework
Political context
Source: http://ec.europa.eu/isa/documents/isa_annex_ii_eif_en.pdf

ISA Action 1.1 Semantic Interoperability
• New Work Package:
• Define a Method to build e-Documents based on Core Assets
• Methodology
• e-Documents
• Core Assets (Building Blocks)

Methods
• Examples:
• CCTS
• NIEM
• Naming and Design Rules (NDR)
• UBL
• ISA Core Vocabularies

e-Documents
• Definition?
• Machine-readable?
• PDF?
• XML?
• Messages?

Core Assets
• UN/CEFACT Core Component Library
• UBL 2.1 Library
• ISA Core Vocabularies

ISA Core Vocabularies
Definition: …
• Core Person
• Core Location
• Org, RegOrg

Other initiatives
• Asset Description Metadata Schema (ADMS)
• DCAT application profile

Why we should collaborate
• What is e-SENS CC6.2 doing?
• "Develop Semantic Assets"
• What is a Semantic Asset?

D6.1 chapters
• Stock taking from predecessor LSP – Building Blocks
• Methodology
• SWOT analysis
• Maturity
• Semantic tools
• Requirements for semantic interactions
• EIA description of Building blocks
• Overview of semantic assets in Europe
• Priorities, Plans

Analysis of resources from the JoinUp portal (D6.1)

e-SENS domain   JoinUp asset theme                    Number of assets on JoinUp (July 2013)
eBusiness       Business and Competition              5
eJustice        Law and Justice                       19
eHealth         Health                                20
eEmployment     Employment                            16
eEducation      Education                             13
eAgriculture    Agriculture, Forestry and Fisheries   22

Types of Semantic Assets (D6.1)
• Schemas (Messages, eDocs, …)
• Knowledge organization systems:
  o Codelists, Catalogues, (Controlled) Vocabularies
  o Taxonomies, Thesauri, Name authorities
• Ontologies
• Mappings/Translations, Mapping Services
• Other Services (Syndication, Service Catalogue, Directory of Registers)
• Process flows
• Containers

Future plans for CC 6.2
• Create?
• Reuse?
• What is reusable?
• What is generic?

Thank you for your attention

Building consensus on core vocabularies
• 2 WGs with each 60+ members
• 21+ EU Member States
• Following a formal process and methodology
• Public review periods
• Re-using existing standards
Source: https://joinup.ec.europa.eu/node/43160

Core vocabularies
Simplified, re-usable, and extensible data models that capture the fundamental characteristics of a data entity in a context-neutral fashion.
Source: https://joinup.ec.europa.eu/node/43160

4 core vocabularies
• Fundamental characteristics of a person.
• Fundamental characteristics of a legal entity, such as legal identifier, name, company type, activities.
• Fundamental characteristics of a location, represented as an address, a geographic name, or a geometry.
• Fundamental characteristics of a public service.

3 representation formats
• Conceptual model: re-use existing concepts in CCL, INSPIRE, etc.
• RDF schema: re-uses existing RDF vocabularies
• XML schema: re-uses Core Components Technical Specification (CCTS) and UBL NDR
ISA Open Metadata Licence v1.1
Maintained by W3C (Government Linked Data Working Group)

Core Vocabulary UML Model
• A conceptual model of the Core Vocabularies
• To enable humans to understand the meaning of the data model
• Not (yet) used for model-driven design of schemas

Illustration: Core Person UML model
[UML class diagram; classes and attributes as shown]
Core Vocabularies::Geometry: lat :string, long :string, wkt :string, xmlGeometry :XML
Core Vocabularies::Address: addressArea :string, addressID :string, adminUnitL1 :string, adminUnitL2 :string, fullAddress :string, locatorDesignator :string, locatorName :string, poBox :string, postCode :string, postName :string, thoroughfare :string
Core Vocabularies::Location: geographicIdentifier :URI, geographicName :string
Core Vocabularies::Person: alternativeName :string, birthName :string, dateOfBirth :dateTime, dateOfDeath :dateTime, familyName :string, fullName :string, gender :code, givenName :string, patronymicName :string
Core Vocabularies::Identifier: dateOfIssue :dateTime [0..1], identifier :string [1..1], identifierType :string [0..1], issuingAuthority :string [0..1], issuingAuthorityUri :URI [0..1]
Association labels shown in the diagram: geometry, address, countryOfBirth, placeOfBirth, countryOfDeath, placeOfDeath, identifier, identifies

Core Vocabulary XML Schemas
• According to OASIS Universal Business Language (UBL) XML Naming and Design Rules (NDR)
• Garden of Eden design pattern (maximising reuse of global elements)
• Using Crane Software Genericode-to-UBL-NDR script
• Location XML Schema as subset of the INSPIRE Data Specifications (GML Application Profile)?
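
The Garden of Eden pattern named in the bullets above declares every element and every type globally, so that each can be referenced from any other schema. As a minimal sketch of what that rule means in practice, the Python below scans a schema and reports declarations that are local instead of global. The schema fragment, the function name garden_of_eden_violations and the use of xsd:string as a base type are assumptions made for illustration; they are not taken from the Core Vocabularies schemas or from the UBL NDR tooling.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema fragment, loosely modelled on a person schema.
SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="FullName" type="FullNameType"/>
  <xsd:complexType name="FullNameType">
    <xsd:simpleContent>
      <xsd:extension base="xsd:string"/>
    </xsd:simpleContent>
  </xsd:complexType>
  <xsd:element name="Person" type="PersonType"/>
  <xsd:complexType name="PersonType">
    <xsd:sequence>
      <xsd:element ref="FullName" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>"""


def garden_of_eden_violations(xsd_text):
    """List declarations that are nested (local) rather than global."""
    root = ET.fromstring(xsd_text)
    # Direct children of xsd:schema are the global declarations.
    global_ids = {id(child) for child in root}
    element_tag = f"{{{XSD_NS}}}element"
    type_tags = (f"{{{XSD_NS}}}complexType", f"{{{XSD_NS}}}simpleType")
    problems = []
    for node in root.iter():
        if node is root or id(node) in global_ids:
            continue
        # A nested element must point at a global element via ref=.
        if node.tag == element_tag and "ref" not in node.attrib:
            problems.append("local element: " + node.get("name", "?"))
        # A nested type definition is an anonymous local type.
        if node.tag in type_tags:
            problems.append("local type definition")
    return problems


print(garden_of_eden_violations(SCHEMA))
```

Replacing the ref="FullName" reference with a nested name="FullName" declaration makes the check report a local element, which is the kind of declaration the pattern avoids.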
Source (Core Vocabulary XML Schemas): https://joinup.ec.europa.eu/node/43160

Illustration: Person XML Schema

UBL-CommonBasicComponents-2.1.xsd (namespace prefix: cbc)
    <xsd:element name="FamilyName" type="FamilyNameType"/>
    <xsd:complexType name="FamilyNameType">
      <xsd:simpleContent>
        <xsd:extension base="udt:NameType"/>
      </xsd:simpleContent>
    </xsd:complexType>

CoreVocabularyBasicComponents-v1.00.xsd (namespace prefix: cvb)
    <xsd:element name="FullName" type="FullNameType"/>
    <xsd:complexType name="FullNameType">
      <xsd:simpleContent>
        <xsd:extension base="udt:TextType"/>
      </xsd:simpleContent>
    </xsd:complexType>

CorePerson.xsd (namespace prefix: cperson)
    <xsd:element name="Cvperson" type="CvpersonType"/>
    <xsd:complexType name="CvpersonType">
      <xsd:sequence>
        …
        <xsd:element ref="cbc:FamilyName" minOccurs="0" maxOccurs="unbounded"/>
        <xsd:element ref="cvb:FullName" minOccurs="0" maxOccurs="unbounded"/>
        …

• The global elements cbc:FamilyName and cvb:FullName can be reused in any schema

Core Vocabularies RDF Schemas
• Maximally reuse existing foundational RDF Vocabularies: Dublin Core Terms, FOAF, SKOS, …
• Created with a text editor
• Core Location as a foundational RDF Vocabulary for the INSPIRE Data Specifications?

Illustration: Person RDF Schema

    foaf:familyName
        rdfs:comment "..."@en ;
        rdfs:isDefinedBy <http://xmlns.com/foaf/spec/> ;
        rdfs:label "family name"@en ;
        dcterms:identifier "foaf:familyName"@en ;
        vann:usageNote "A family name is usually shared by members of a family. This attribute also carries prefixes or suffixes which are part of the Family Name, e.g. ... "@en .

    foaf:name
        rdfs:comment "..."@en ;
        rdfs:isDefinedBy <http://xmlns.com/foaf/spec/> ;
        rdfs:label "name"@en ;
        dcterms:identifier "foaf:name"@en ;
        vann:usageNote "The full name contains the complete name of a person as one string. In addition to the content of given name, family name and, in some systems, patronymic name... "@en .

    person:Person
        rdf:type rdfs:Class ;
        rdfs:comment "..."@en ;
        rdfs:isDefinedBy : ;
        rdfs:label "Person"@en ;
        rdfs:subClassOf schema:Person , foaf:Person ;
        dcterms:identifier "person:Person"@en .

• The global properties foaf:familyName and foaf:name can be reused in any schema

3 generic use cases
1. Harmonised access to base registers (basic public service)
2. Interoperable cross-border public services (aggregate public service)
3. Interoperability of public data: making it easier to mash up public data

How to use the e-Government Core Vocabularies?
• Reuse the core vocabularies as semantic building blocks for information exchange
• Reuse-by-restriction: use only a subset of the Core Vocabularies (e.g. only locn:Address)
• Reuse-by-extension: use the Core Vocabularies as a foundational data model that is extended with context-specific elements

Re-use by extension: 3 levels of abstraction
[Figure: levels of abstraction against representation techniques (UML model, RDFS/OWL, XML Schema). Message level: e-Documents, Linked Data, e-Documents (?). Domain level: domain models, domain vocabularies, domain schemas. Core level: Core Vocabularies.]

Illustration: patient healthcare domain
Patient as a subclass of Person… with a property blood type
[UML class diagram: Healthcare Domain; classes and attributes as shown]
Core Vocabularies::Geometry: lat :string, long :string, wkt :string, xmlGeometry :XML
Core Vocabularies::Address: addressArea :string, addressID :string, adminUnitL1 :string, adminUnitL2 :string, fullAddress :string, locatorDesignator :string, locatorName :string, poBox :string, postCode :string, postName :string, thoroughfare :string
Core Vocabularies::Location: geographicIdentifier :URI, geographicName :string
Core Vocabularies::Person: alternativeName :string, birthName :string, dateOfBirth :dateTime, dateOfDeath :dateTime, familyName :string, fullName :string, gender :code, givenName :string, patronymicName :string
Core Vocabularies::Identifier: dateOfIssue :dateTime [0..1], identifier :string [1..1], identifierType :string [0..1], issuingAuthority :string [0..1], issuingAuthorityUri :URI [0..1]
«enumeration» Sex (EuroStat Standard Code List): F = female, M = male, T = total, UNK = unknown, NAP = not applicable
Domain classes: Patient (bloodType :code), Health Problem, Allergy (allergens, intolerance, reaction), Social Security Number
Other labels shown in the diagram: geometry, address, countryOfBirth, placeOfBirth, countryOfDeath, placeOfDeath, identifier, identifies, symptom, hasProblem, hasAllergy

Illustration: patient healthcare domain XML Schema

Patient.xsd (namespace prefix: cpatient)
    <xsd:element name="Patient" type="PatientType"/>
    <xsd:element name="BloodType" type="BloodTypeCodeType"/>
    <xsd:complexType name="PatientType">
      <xsd:sequence>
        …
        <xsd:element ref="cbc:FamilyName" minOccurs="0" maxOccurs="unbounded"/>
        <xsd:element ref="cvb:FullName" minOccurs="0" maxOccurs="unbounded"/>
        <xsd:element ref="cpatient:BloodType" minOccurs="0" maxOccurs="unbounded"/>
        …
      </xsd:sequence>
    </xsd:complexType>
    <xsd:complexType name="BloodTypeCodeType">
      <xsd:simpleContent>
        <xsd:extension base="udt:CodeType"/>
      </xsd:simpleContent>
    </xsd:complexType>

Illustration: patient healthcare domain RDF Vocabulary

    @prefix dct: <http://purl.org/dc/terms/> .
    @prefix ex: <http://example.com/> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix person: <http://www.w3.org/ns/person#> .
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix schema: <http://schema.org/> .
    @prefix skos: <http://www.w3.org/2004/02/skos/core#> .
    @prefix vann: <http://purl.org/vocab/vann/> .
    @prefix ncicb: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#> .

    ex:Patient a rdfs:Class ;
        rdfs:label "Patient"@en ;
        rdfs:comment "A patient in the cross-border healthcare domain."@en ;
        rdfs:subClassOf person:Person ;
        rdfs:subClassOf schema:Patient .

    ex:bloodType a rdf:Property ;
        rdfs:label "blood type"@en ;
        rdfs:comment "..."@en ;
        vann:usageNote "..."@en ;
        dct:identifier ex:bloodType ;
        rdfs:domain ex:Patient ;
        rdfs:range ncicb:C61009 .

Known implementations
• e-CODEX large-scale pilot on eJustice
• Open Corporates
• The OSLO project
• 5 pilot implementations initiated by the ISA Programme:
  • 25 public administrations
  • 14 Member States
  • 4 EU Institutions

Core Location Pilot: https://joinup.ec.europa.eu/node/63242
[Figure: pilot architecture]
• Data consumer: lookup, disambiguate, link (XQuery, XPath, SPARQL endpoint)
• Linked address data; common data models (INSPIRE); XML view, RDF view
• LOGD infrastructure: XML and RDF views on relational data served over a Web interface
• Sample address data in native format: UrBIS - Brussels Capital Region, CRAB - Flanders, PICC - Wallonia, NGI – National Geographic Institute, Civil register

GR - Company data of the Greek tax authorities
• Master thesis project of Natasa Varitimou
• Using API of Greek tax administration
• 30K+ companies

GR - Ministry of administrative reform and electronic governance

Core Public Service Vocabulary
Describe public services “only once” using a standard vocabulary, and make
machine-readable descriptions available to others so that they become searchable on many governmental access portals.
https://joinup.ec.europa.eu/asset/core_public_service/description

Public services in Europe

Flemish Intergovernmental Product and service catalogue (IPDC)
• Exchange of service catalogue data between national, regional, and local governments.
• REST web service that returns XML.
• XSLT to convert into Core Public Service.
Project manager: Katrien De Smet, CORVE (present at SEMIC 2013!)
http://www.corve.be/projecten/lokaal/IPDC/

OSLO: Open Standards for Local Administrations
• Putting the core vocabularies into a local context.
• Local administrations need locally enriched data models and data.

OpenCorporates: basic company data for everyone
• Machine-readable data: (URI, legal identifier, name, company type, activities)
• Links back to the base registers

Conclusions
• The core vocabularies are used in many different contexts.
• They can easily be extended and integrated with other vocabularies.
• They can be adapted to your needs and context.
• They can be used both in an XML and an RDF world.

Core Vocabularies: a semantic building block for e-SENS?

Project Officers: Vassilios.Peristeras@ec.europa.eu, Suzanne.Wigard@ec.europa.eu
Contractor: Stijn.Goedertier@pwc.be

Visit our initiatives
• Software forges community
• Core Public Service Vocabulary
• ADMS.SW

Get involved
• Follow @SEMICeu on Twitter
• Join SEMIC group on LinkedIn
• Join SEMIC community on Joinup