ARTICLE 5: CIM - UNDER THE HOOD

Authors: Lars-Ola (LOO), Kurt (KH), Becky (BI), Kendall (KD), Margaret (MG), Tanja (TK)

Change history
Version 4, April 28, 2015
2015-05-07, TK: Accepted all changes, kept change tracking; comments on parts of the document not assigned to me start with “(note TK: ”; discuss each comment and then remove it. Below my figures I added what should look like a caption, but did not use Word’s “insert caption” command until we confirm which figures to keep. Continued editing after our web conference.

We need a starting picture of someone looking under the hood and finding … an electric motor??? What else? (note TK: pic from http://www.startrunningforbeginners.com/wp-content/uploads/2011/03/underhood-300x195.jpg , the first image by duck-duck-go; just for fun)

The big picture

Scope statement; LOO
– Primarily about (related) exchange governance.
– Each payload is typically not simply related to itself.
– Tailoring local related exchanges

Separation of roles (might be intro, roadmap for reader); MG, TK

Margaret had a good discussion of how CAISO has made this an important concern. Maybe we can get a CAISO statement on the topic.

(note TK: Margaret & all, I’ve just put some thoughts in, feel free to remove/adapt/whatever; don’t know whether to refer to the EPRI CCAPI project?)

Historically, the initial CIM canonical model and its first profile (at that time called “CIMXML NERC profile”) were created to allow EMS intra-application exchange of power system model data, such as typically used in network analysis applications (state estimation, topology processing, power flow, security analysis), as well as EMS-to-EMS import and export of network model snapshots. Quite naturally for this kind of usage, domain experts expect to work with “input and output data files”.
This has driven the initial efforts to produce a network model “data format” that carries rich semantics within the data file itself (as a replacement for counting columns up to 72 characters in a text data file of some format). …

The base canonical CIM covered the needs of network analysis studies, but the utility industry wanted more, in particular in the realm of distribution operations, where business processes are (were?) less dependent on the network analysis model but, in contrast, needed to support data exchanges with various enterprise systems. Canonical CIM started growing beyond the “pure network analysis” view to support data exchanges between distribution management systems (DMS) and outage management systems (OMS), asset record systems such as geographical information systems (GIS), work management systems (WMS), customer management systems (CMS), meter data management systems (MDMS), various network planning applications, as well as external systems (e.g., enterprise resource planning, markets, etc.).

For this kind of integration, it is important to specify not only the “what” (the canonical data exchanged, which those heterogeneous systems and their internal data stores require to function properly in the enterprise environment), but also the “how” side of data exchange. That “how” side is normally not file-based, but rather relies on an enterprise messaging infrastructure with the required qualities of service (e.g., guaranteed delivery, speed of delivery, traffic logging for audit trails, security, etc.), which is especially important for B2B exchanges. It is also usual practice to provide some kind of envelope for the payloads, which can then be exchanged by virtually any transport. This is what IEC 61968-100 does.

(note TK: Margaret, here to hook the story of 61968-100? See also section “Messaging and the 61968-100 WS and XSD structures”, you may want to pull some content of it up here.)
(note TK: Margaret, probably to add something about markets here or above; in relation to CAISO? and more recently ENTSO-E?)

(note TK: When done, remove the bullets below)
– Focus of domain experts
    – Semantics
    – Requirements for persistence and QoS
    – Management of datasets, frames, exchange sequences
    – Sensible minimal exchanges
– Focus of IT experts, more related to 61968-100, the story behind the 100
    – Files, middleware, security
    – Efficiency, persistence, QoS delivery
    – Technology

Data structuring / schema / semantic model; KH
– Data instances – MRID, or formal concepts of identity sharing and identity persistence
– Information structure sourced from canonical model
– Data exchanged (or managed) according to profiles “based on” canonical model.
– Specific exchanged as datasets, payloads, with semantics beyond the payload.
– The CIM type system.

Non-schema data partitioning (Modeling Authority Sets concepts); LOO

(note TK: Be explicit in limiting the scope here to the big TSOs and not scare away the small DSOs who need not bother with this.)

(note TK: By the way: what is “non-schema data”? Do you mean horizontal partitioning, also known as sharding? If so, use an analogy, this may help others (who don’t know about MAS) understand the topic).

The canonical model, keep small; MG
– UML and basic rules of CIM canonical model
– With some rationale for things like unique class names, package changes agility
– Non-cyclic dependencies, meaning of dependencies
– Make reference to the existing guide, make that guide an IEC TR.
– Local non-(IEC)standard extension

(note TK: Some of this is also available in the model management guide; here it would be good to cover “watch for this” aspects learnt from experience.)

CIM exchange Profiles – deep dive; KD

(note TK: I’ve edited the slide below with the letters in red showing what I propose to remove (strikethrough) and what to add. During the web conference, we started adapting. Becky and Jim to finish the figure).
(note TK: It would be good if the figure from the definitions further down, “Profiling at a glance”, came somewhere up here – maybe on the left or right of the detailed figure below; unless the IEEE P&E editors want us to have definitions + the pic in a kind of “yellow frame” – you see what I mean, that side window in the paper?).

[Figure placeholder: slide on profile and schema generation. Labels recovered from the slide:
– Semantic Models - UML: Canonical CIM (61970, 61968, 62325)
– Manual profiles generation (NWIP will add rules to this)
– Profile: 62325-352 ?, 61970-4xx, 62325-351, 61968-X
– Message gen: 62325-450 (shown twice)
– Message Profile: 62325-452
– W3C RDF Schema generation: 61970-501
– W3C RDF: CIMXML Schema
– Message payload syntax: 61970-552
– W3C XML RDF, described by CIMXML RDF Schema
– Message Profile: 62325-451
– W3C XML Schema generation: 62361-100
– W3C XML Schema
– W3C XML instance files, described by XML Schema]

– Explained in diagrams at Semantic level
– Diagram that graphically and clearly ties the canonical model to the profile. In UML, this has been expressed as the “is based on” relationship to the Canonical model, though we don’t want to say UML is required for profiles.
– Rules of restrictions by type
– Simple vs Exotic restrictions
– Possibly showing this all in XSD. Definitely not UML, though it might be shown in Visio and look similar to UML, just not EA UML and not using any rules that were part of a vendor product. Equally, CIMTool snapshots should not be the initial focus here, though they could be used as an example of a tool implementing the basic concepts of the semantic diagram.

(note TK: you may want to have a look at IEC 61968-1, Ed.2, Figures 7 + (8+9) for an example: http://iectc57.ucaiug.org/WG14/Part1/Part%201%20Drafts/61968-1%20Ed.2/IEC%20619681%20Ed.2.0%20FDIS%20(2)%20-%202012-08-07.doc; of course, you can make it nicer, but that shows the idea)

(note TK: nesting is dealt with already here, i.e., while defining a profile I am defining what is nested, what is root, whether we use byRef or not).
Payload Serialization level (and special features); TK, BI

(note TK: Below is all new text, I hope it all sounds impartial – with the good and the bad for all. There is certainly space for improvements!)

Previous sections have addressed the upper half of Figure xyzProfiling, namely canonical CIM and the derivation of profiles from it by means of restrictions. This section addresses the rest, i.e., the syntactic aspects of profiles and the considerations to take into account when selecting a syntax. At present, standard CIM profiles get translated into one of two distinct flavours, both based on W3C XML syntax. (note TK: here to refer to the picture and text below).

(note TK: Becky, if you could, it would be great to pull out a snippet of e.g. the CoreEquipment or Topology profile = any one used by WG13 with RDF in mind; and then of e.g. EndDeviceControls from 61968-9 metering. We all need to see whether to use text (might not be readable if we’re to put a reasonable amount of it on a page) or XSD visualisation by XMLSpy or anything else that will show nesting / flatness).

The very first CIM payload type was defined in the early 2000s (or late 1990s?) using W3C RDF Schema (RDFS), and the payload instance data was thus serialised as a special dialect of XML, called RDF. The RDFS representation of a profile has been defined in IEC 61970-501, and the instance data format, i.e., the payload serialisation format compliant with that schema, has been called CIMXML and defined in IEC 61970-552. The subset of RDFS still used today for some CIM profiles has been defined in such a way that the instance data (expressed in RDF) looks very flat and, importantly, leverages the file-scope uniqueness of data as defined by that syntax, namely through so-called RDF identifiers.
To ensure that there is no nesting at all, at the previous step (profile definition) the user is supposed to apply shallow relationships (like pointers) to objects, which preserves the flatness of the data (note TK: here to refer to something from the picture). RDFS based profiles are widely used for bulk network data model exchanges, for either full models or increments thereof. The syntax in itself is simple, and the language allows ontologies to be defined and reasoned on. On the negative side, the tooling support for validation, editing and parsing is somewhat limited and may have a steep learning curve.

With the need of (initially: distribution) utility organisations to exchange data among heterogeneous systems, often without the need for network models, the second flavour of syntax for payload types emerged as a natural choice well suited for enterprise integration: “vanilla” XML that validates against the W3C XML Schema (also referred to as XSD). The translation of CIM profiles into the XSD syntax is defined in IEC 62361-100 and is widely used for standard as well as for custom profiles. This syntax is the mainstream integration technology, with a myriad of tools supporting its development; it is well understood by both application developers and integrators, and is an integral part of web-enabled technologies. The profiles defined with the XSD syntax in mind have no limitation in structure and indeed often reflect the “navigation path” through relationships from the canonical CIM model. It is possible to use “shallow” references as in RDFS based profiles; however, there is no XSD native counterpart to RDF identifiers, all the more so because payloads may not use the CIM mRID for identification, but rather the battery of name-related classes that were added with base CIM15 to support object registries.
XSD syntax also allows multiple profiles for a single type (e.g., a street address with only street name, street number and postal code; and a street address with all the available descriptors), defined locally or globally. Finally, work is in progress to support modularity of payload types by enabling the inclusion of a “profile snippet” into multiple profiles.

Due to its widespread usage in the industry, the majority of new profiles of standard CIM are defined using XSD syntax, in particular those that support processes outside network model exchanges. The syntax is equally well suited for configuration (“static” / bulk) and for operation (“dynamic” / online) data exchanges, as illustrated by e.g. the metering profiles from IEC 61968-9.

Note that in theory, a profile defined for use with RDFS syntax can also be expressed with XSD syntax; the other way round is possible only in a special case (where the profile gets defined as if it were for RDFS syntax).

(note TK: When done, remove the three sets of bullets below)
– Xsd and conforming XML
    – NDR details
    – References as hierarchical (nested) XSD.
    – References without MRID
    – Generalized references.
– Rdf (RDFS, OWL)
    – RDF ID and MRID
    – Incremental
– Open to other serialization forms or technologies
    – CIM/E, JSON, Thrift, proprietary binary, EXI …
    – Keep this section short but the concept needs to be covered.

Interoperability; LOO, TK, MG
– mRID usage
– Linkages in the Canonical model
– Discussion of Name and NameType, the mRID:String and UUID.

Messaging and the 61968-100 WS and XSD structures; TK, MG, BI

(note TK: Margaret, Becky and all, I used as a starter my own text from a chapter that I contributed, with colleagues, to the in-preparation ISTE-Wiley book on demand response; please feel free to modify / shorten / whatever. Before the text below, in the book chapter, I talked about enterprise and B2B integration atop a semantically aware enterprise messaging infrastructure, a la our 61968-1 IRM.
The intro text down to the “Anatomy of an interface” would maybe better fit somewhere in the introduction, maybe the part on 61968-100 where we talk about both domain and integration concepts, then refer to subsequent sections?)

To arrive at running software, the first set of objectives is to specify interfaces:
– what is the content (or payload) of data exchanges among actors?
– how is it modelled (in the canonical information model) and how is it translated into an implementation artefact (i.e., some kind of data schema)?
– by which transport does it get exchanged?
while satisfying the requirements, including those related to interoperability.

Once we have answered these questions and started specifying concrete data exchange payloads, all actors concerned with any given payload can progress in parallel and independently of each other, implementing and testing their local applications, as long as everybody programs to the agreed interfaces. To allow for this level of flexibility, it is best to adopt for interface specification those technologies that are independent of hardware platform, operating system or programming language. This is the approach also recommended and followed by utility enterprise integration standards such as IEC 61968. In the section “Anatomy of an interface”, we will dissect a message derived from one such sample interface.

Anatomy of an interface

We will analyse a message compliant with an interface that has the properties discussed above. The left side of Figure xyz-SOAP shows the general approach to defining communication interfaces by using encapsulation or nesting. Those familiar with the OSI reference model for communication protocols will recognise the idea: the object of exchange (payload) is nested deepest and wrapped into some kind of envelope. That envelope then becomes the object of exchange (payload) one level up and is again wrapped into another kind of envelope.
We have denoted three levels in our example:
– L1, the first level, that we will call domain interface level,
– L2, the second level, that we will call message interface level, and,
– L3, the third level, that we will refer to as transport interface level.

In theory, one could go further up and wrap the content of L3 into some fourth level, L4, but for our purposes, these three levels are sufficient. The fact that we can clearly separate them means that we can independently develop and test “real” software at each level, while “mocking” the functionality of the other levels. This is an important aspect of any project execution or implementation effort.

[Figure labels, left side (nesting): L3: Transport message; L2: Message envelope; L1: Domain data. Right side (callouts on the message): SOAP Header, SOAP Body, Message envelope, Domain data.]

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Header>
    … …
  </soapenv:Header>
  <soapenv:Body>
    <msg:CreateEndDeviceControls>
      <msg:Header>
        <msg:Verb>?</msg:Verb>
        <msg:Noun>?</msg:Noun>
        <!--Optional:-->
        … …
      </msg:Header>
      <!--Optional:-->
      <msg:Request>
        <!--Optional:-->
        <msg:StartTime>?</msg:StartTime>
        <!--Optional:-->
        <msg:EndTime>?</msg:EndTime>
        <!--Zero or more repetitions:-->
        <msg:ID>?</msg:ID>
      </msg:Request>
      <!--Optional:-->
      <msg:Payload>
        <edc:EndDeviceControls>
          <edc:EndDeviceControl>
            <!--Optional:-->
            <edc:mRID>?</edc:mRID>
            <edc:description>?</edc:description>
            <edc:drProgramLevel>?</edc:drProgramLevel>
            … …
          </edc:EndDeviceControl>
        </edc:EndDeviceControls>
        <msg:Format>?</msg:Format>
      </msg:Payload>
    </msg:CreateEndDeviceControls>
  </soapenv:Body>
</soapenv:Envelope>

Figure xyz-SOAP – Sample SOAP message

Let us now look at the right-hand side of Figure xyz-SOAP. This shows a set of very concrete technology choices.

– The domain interface level L1 reflects the payload, whose type is called EndDeviceControls. It has been derived from the canonical distribution CIM (defined in IEC 61968-9 and IEC 61968-11, respectively) and is compliant with the W3C XML Schema format.
That specific domain payload allows one, for instance, to send control actions to end devices (such as smart meters) by providing the identity of the end device and the program level for demand response.

– The message interface level L2 reflects the message envelope, also expressed in W3C XML Schema format, as defined in IEC 61968-100. This is the level where we provide the extra information required by the software that handles message exchanges on a bus; that software does not need to “unpack” the payload, but just to route it to the correct receiving applications. Typical information that must be specified includes the verb and the noun, telling what needs to be done (verb) with the content of the payload (noun).

– Finally, the transport interface level L3 reflects the standard SOAP message (often meant when one says Web Service), with the SOAP header and the whole of level L2 encapsulated in the SOAP message body.

There may be cases where a single actor needs to exchange some of the messages within a single organisation and in a simplified way, without the need for platform independence and the overhead of e.g. a Web Service (and its SOAP implementation). To support this kind of exchange, we could simply skip the transport interface level L3 and pass around the message envelope of level L2, with the payload of level L1 that it contains, by using e.g. a JMS broker (in a standard way, as in IEC 61968-100) or some other means of message exchange.

The example above has used the payload (L1) in W3C XML format. The structure of the envelope (L2) allows domain payloads to be embedded in virtually any format, including W3C RDF format. Alternatively, for certain data exchange scenarios, simple file exchange may be sufficient, in which case the domain payload (L1) would simply be the file to exchange; in this case, some non-standard preconfiguration needs to take place (e.g. to state the URI of the file on a file system).
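The three-level nesting described above can be sketched in a few lines of Python using only the standard-library xml.etree.ElementTree module. This is a minimal illustration and not the normative IEC 61968-100 structure: the msg and edc namespace URIs, the mRID value, the lowercase verb value and all function names are assumptions made for the example; only the SOAP envelope namespace and the element names are taken from the sample message.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace, as in the sample message in the text.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Placeholder namespaces (assumptions): the text does not state the real ones.
MSG_NS = "urn:example:message"
EDC_NS = "urn:example:enddevicecontrols"


def build_domain_payload(mrid, program_level):
    """L1: domain data, here a single EndDeviceControl."""
    root = ET.Element(f"{{{EDC_NS}}}EndDeviceControls")
    control = ET.SubElement(root, f"{{{EDC_NS}}}EndDeviceControl")
    ET.SubElement(control, f"{{{EDC_NS}}}mRID").text = mrid
    ET.SubElement(control, f"{{{EDC_NS}}}drProgramLevel").text = str(program_level)
    return root


def wrap_in_message(verb, noun, payload):
    """L2: message envelope carrying verb, noun and the L1 payload."""
    message = ET.Element(f"{{{MSG_NS}}}{verb.capitalize()}{noun}")
    header = ET.SubElement(message, f"{{{MSG_NS}}}Header")
    ET.SubElement(header, f"{{{MSG_NS}}}Verb").text = verb
    ET.SubElement(header, f"{{{MSG_NS}}}Noun").text = noun
    ET.SubElement(message, f"{{{MSG_NS}}}Payload").append(payload)
    return message


def wrap_in_soap(message):
    """L3: transport envelope; the whole L2 message goes into the SOAP body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Body").append(message)
    return envelope


def route(message):
    """Bus-level routing reads only the L2 header and never unpacks the payload."""
    header = message.find(f"{{{MSG_NS}}}Header")
    return (header.findtext(f"{{{MSG_NS}}}Verb"),
            header.findtext(f"{{{MSG_NS}}}Noun"))


payload = build_domain_payload("hypothetical-mrid-1", 2)            # L1
message = wrap_in_message("create", "EndDeviceControls", payload)   # L2
soap = wrap_in_soap(message)                                        # L3

print(route(message))
print(ET.tostring(soap, encoding="unicode"))
```

Because each level only wraps the one below it, skipping L3 (for instance when using a JMS broker) amounts to sending `message` instead of `soap`, and each function can be tested on its own while the other levels are mocked.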
(note TK: Do we really want to address incremental changes here again? This would mean describing the 61968-100 verbs ‘execute’ and ‘executed’, and if so, one would have to describe all the other verbs. We can, of course, do that, but do we have enough space?)

(note TK: When done, remove the two bullets below)

(note TK: When settled on content, think of adapting the title of this whole section to something more catchy and less technical)
– Use of payload profiles, either RDF or XSD
– Incremental changes

Data composition by receiving systems; KD, KH, MG
– By MAS or by profile, makes no difference
– Cross technology integration
– Receiver data composition details
– Object composition diagrams
– Incremental change to existing data.
– Composition with previously received information

Abbreviations (note TK: do we need this?)
DMS –
EMS –
more…

Definitions

[Figure placeholder, layers from top to bottom:
– Information models: CIM UML
– Context: Profile (derived from CIM UML)
– Data syntax: Payload schema (XSD, RDFS, …) (derived from Profile)
– Actual data: Payload instance (XML, RDF, …) (validated by Payload schema)]

Figure xyzProfiling – Profiling at a glance

(1) Canonical CIM: This is an abstract model of the domain or data exchange (our CIM UML).

(2) Profile: This is a restriction of canonical CIM, a small subset of it as required for a specific interface for external data exchanges (although people may take the same approach for internal system interfaces). We use a profiling tool to reduce canonical CIM to the desired content, and the tool produces a schema that governs the format of the instance data to be exchanged.

(3) Payload: This is an instance of a Profile according to a given syntax. We use XML (complying with RDFS or XSD); in future we may want to use something else (e.g. JSON format); others may already be using something else based on CIM, or on CIM extensions, or not related to CIM at all. (note TK: this is L1 from Figure xyz-SOAP)

(4) Message: This is an instance of a message envelope, e.g. Message.xsd (61968-100), that includes, among other things, one or more Payloads (3).
It also contains the full context of the data exchange (verb, noun, etc.) to let the receiver know what to do with the received payload. (note TK: this is L2 from Figure xyz-SOAP)

(5) Dataset: TBD

(6) Models:
– Model: Data that describe an existing entity or thing.
– Power network model: A model or data describing a power network.
– Base model: A power network model used as part of a case.
– As-built model: A power network model that corresponds to a constructed power network.
– Case: The complete set of inputs to an analytical study case.
– Full model: A complete description of a power network model.
– Difference model or incremental model: A description of a power network model relative to another full model.

Assume X and Y are full models and Δ is a difference (incremental) model. Then the following holds:
– A new full model Y can be created from an existing full model X and a difference model Δ, i.e. Y = X + Δ.
– A full model Y just created from an existing full model X and a difference model Δ can be used to recreate the existing full model X by subtracting the difference model Δ, i.e. X = Y - Δ.
– Any two full models X and Y can be used to create a difference model such that one of the full models can be recreated from the other by adding the created difference model, i.e. if Δ1 = X - Y then X = Y + Δ1, and if Δ2 = Y - X then Y = X + Δ2, and Δ1 = -Δ2.

Further Readings – up to 6 items (note TK: ensure to include the CIM model management guidelines and the CIM primer - this leaves room for 4 more)

Authors – one line per author
Tatjana Kostic is with ABB Corporate Research, Baden-Daettwil, Switzerland.