II. Common Statistical Production Architecture

advertisement
Fostering Interoperability in Official Statistics:
Common Statistical Production Architecture
(Version 0.1, April 2013)
DRAFT FOR REVIEW
Please note the development of the Common Statistical Production Architecture is a work in
progress. This is not intended for official publication and reflects the thoughts and ideas gathered
at the first Sprint on this topic held in Ottawa, Canada during April 2013.
This document is aimed at readers who have some knowledge of Enterprise Architecture.
Instructions for reviewers and a template for providing feedback is available at
http://www1.unece.org/stat/platform/display/msis/CSPA+v0.1
This work is licensed under the Creative Commons Attribution 3.0
Unported License. To view a copy of this license, visit
http://creativecommons.org/licenses/by/3.0/. If you re-use all or part
of this work, please attribute it to the United Nations Economic
Commission for Europe (UNECE), on behalf of the international
statistical community.
Contents
I.
Introduction ............................................................................................................................. 3
II.
Common Statistical Production Architecture ......................................................................... 6
III.
How the Architecture could be used. ................................................................................... 7
IV.
Who uses the Architecture ................................................................................................... 8
V.
Impact on organizations ........................................................................................................ 10
VI.
A.
Architecture Description .................................................................................................... 11
Principles ........................................................................................................................ 12
Decision Principles ............................................................................................................... 12
Design Principles .................................................................................................................. 12
Statistical Service Design Principles .................................................................................... 13
B.
Requirements .................................................................................................................. 14
C.
Statistical Services and "Plug and Play" ........................................................................ 14
Generic Statistical Business Process Model (GSBPM) ........................................................ 16
Generic Statistical Information Model (GSIM) .................................................................... 16
Quality Attributes/ Non-Functional Requirements ............................................................... 16
Architecture Patterns ............................................................................................................. 18
D.
VII.
Catalogues ...................................................................................................................... 21
Governance of the architecture .......................................................................................... 22
Annex 1: Quality Attributes / Non-Functional Requirements ...................................................... 24
Annex 2: Differences between “Classic” SOA and EDA ............................................................. 28
Annex 3: Glossary......................................................................................................................... 30
2
I.
Introduction
1.
Many statistical organizations are facing common challenges. They have built up their
organizational structure, production process, enabling statistical infrastructure and technology
over years, through many iterations and technology changes. This can be referred to as
'accidental architecture' as they were not designed from a holistic top down view. The cost of
maintaining this business model and the associated asset bases (process, statistical, technology)
is becoming insurmountable and the model of delivery is not sustainable.
2.
There are two major threats to the continued efficient and effective supply of core
statistics come from within statistical organizations. These are:
1) rigid processes and methods and
2) inflexible ageing technology environment.
3.
Often it is difficult to replace even one of the components supporting statistical
production. Use of these processes, methods and an inflexible and aging technology
environment mean that statistical organizations find it difficult to produce data and information
aligned to modern standards. Process and methodology changes are time consuming and
expensive resulting in an inflexible, unresponsive statistical organization.
4.
Historically, statistical organizations have developed their own business processes and
IT-systems for producing statistical products. Therefore, although the products and the processes
conceptually are very similar, the individual solutions are not. Statistical organizations have
attempted many times over the years to share their processes, methodologies and solutions, as it
has long been believed that there is value in this. The mechanism for sharing has historically
meant an organization taking a copy of a component and integrating it into their own
environment. Examples include CANCEIS (CANadian Census Edit and Imputation System) and
Banff (an editing and imputation system for business surveys). However, most cases of sharing
have involved significant work to integrate the component into a different processing and
technology environment.
5.
Many statistical organizations are modernizing and transforming using Enterprise
Architecture to underpin their vision and change strategy. This work will enable the
organizations to develop statistical services in a standard way within their organization.
An enterprise architecture aims to create an environment which can change and support business
goals. It shows what the business needs are, where the organization wants to be, and ensures that
the IT strategy aligns with this. Enterprise architecture helps to remove silos, improves
collaboration across an organization and ensures that the technology is aligned to the business
needs.
6.
Figure 1 attempts to explain why this difficulty occurs. The figure assumes that the two
statistical organizations develop all their business capability and supporting components in a
standard way (i.e they have an Enterprise Architecture). The first line of the figure shows that
Canadian components have a zig zag shape and the second suggests that Sweden has components
with slanted edges. If Sweden needs a new component, ideally they need a component with a
3
slanted edge. It can be seen in the third row that while a component from Canada might support
the same process and incorporate robust statistical methodologies, it will not be simple to
integrate it into the Swedish environment.
Figure 1: Why sharing /reuse is hard now
7.
The High Level Group for the Modernization of Statistical Production and Services
(HLG) has put priority on the development of a Common Statistical Production Architecture
(CSPA).
8.
The CSPA is meant to be a generic architecture for statistical production. It will serve as
an industry architecture for statistical organizations. By adopting this common reference
architecture, it will be easier for each organization to standardize and combine the components of
statistical production, regardless of where the statistical services are built. As shown in Figure 2,
Sweden could easily reuse a statistical service from Canada because they both use the same
“shape”.
9.
The CSPA will facilitate the sharing and reuse of statistical services both across and
within statistical organizations. The statistical services that are shared or reused across statistical
organizations might be new statistical services or legacy/existing services which have been
wrapped to comply with the architecture. This is show in Figure 2 by the shapes inside the Lego
blocks. It also provides a starting point for concerted developments of statistical infrastructure
and shared investment across statistical organizations.
4
Figure 2: how it could be easier with architecture
10.
to:
The goal of the CSPA is to provide statistical organizations with a standard framework








facilitate the process of modernization
provide guidance for operating change within statistical organizations
provide statisticians with flexible information systems to accomplish their mission and to
respond to new challenges and opportunities
reduce costs of production through the reuse / sharing of solutions and services and the
standardization of processes
provide guidance for building reliable and high quality services to be shared and reused
in a distributed environment (within and across statistical organizations)
enable international collaboration initiatives for building common infrastructures and
services
foster alignment with existing industry standards such as the Generic Statistical Business
Process Model (GSBPM) and the Generic Statistical Information Model (GSIM), and
encourage interoperability of systems and processes
11.
This document provides the first thoughts on what this CSPA will look like. It is the
output from the Architecture Sprint held at Statistics Canada 8 - 12 April 2013. This output is not
designed to be the final output of this project. It is intended to outline the first ideas on the
architecture in order to seek feedback from the wider official statistics community.
5
II.
Common Statistical Production Architecture
12.
The CSPA aims to be the industry architecture for the official statistics industry. An
industry architecture is a set of agreed common principles and standards designed to promote
greater interoperability within and between the different stakeholders that make up an "industry",
where an industry is defined as a set of organizations with similar inputs, processes, outputs and
goals (in this case official statistics).The value of the architecture is that it enables collaboration
in developing and using services which will allow statistical organizations to create flexible
business processes and systems for statistical production more easily. The CSPA is sometimes
referred to as "plug and play" architecture. The idea is replacing components could be as easily
pulling out a component and plugging another one in.
13.
The CSPA gives users an understanding of the different statistical production elements
that make up a statistical organization and how those elements relate to each other. It also
provides a common vocabulary with which to discuss implementations, often with the aim to
stress commonality. It is an approach to enabling the vision and strategy of an industry, by
providing a clear, cohesive, and achievable picture of what’s required to get there.
14.
An important concept in architecture is the “separation of concerns”. For that reason, the
architecture is separated into a number of “layers”. These “layers” are:




15.
Business Architecture which defines and evolves understanding of what the industry
does and how it is done (statistics in our case),
Information Architecture which builds understanding of the information, its flows and
uses across the industry, and how that information is managed,
Application Architecture which is the set of practices used to select, define or design
software components and their relationships, and
Technology Architecture which describes the infrastructure technology underlying
(supporting) the business and application layers.
The CSPA provides a template architecture for official statistics. It describes:



What the official statistical industry want to achieve – This is the goals and vision (or
future state).
How the industry can achieve this – This is the principles that guide decisions on
strategic development and how statistics are produced.
What the industry will have to do - The industry will need to adopt a service oriented
architecture which will require them to comply with requirements, quality attributes /
non-functional requirements and architecture patterns
16.
One of the key enablers for the CSPA is Catalogues. It is envisaged that each statistical
organization will have catalogues of processes, information objects and statistical services. There
will be a Global Artefact Catalogue which contains the shareable/ reusable artefacts (i.e.
processes, information objects and statistical services) from the statistical organizations.
6
17.
The Global Artefact Catalogue will contain planning metadata. This will include
information about developments that are planned or in progress. This information will facilitate
the creation of a Global Roadmap of statistical organizations developments that can be consulted
to see which organizations are developing what and when.
18.
The architecture is described in more detail in Section VI.
III.
How the Architecture could be used.
19.
The Architecture Sprint proposed eight ways in which CSPA may be used. These are
outlined below:
20.
Component redesign: A statistical organization uses a statistical service and would like
to modify the statistical service to meet new functional and/or non-functional requirements. An
approach to collaborative renewal of existing statistical service is required. For example,
Statistics New Zealand and the Office of National Statistics (UK) would like new functionalities
in CANCEIS
21.
New component (strategic level or design level): The need for a new statistical service
has been identified as a gap in a statistical organization’s Strategic Plan. To fill the gap the
statistical organization starts looking for available solutions in the collaborative space (Global
Artefact Catalogue). If not found, the statistical organization can either start designing and
developing internally or seek collaboration with other statistical organizations. An example of
this could be a need to have metadata driven questionnaire creation.
22.
Vendor: The vendor has released a new product. Statistical organizations should verify
together if the product is relevant to the community requirements. If it is not, statistical
organizations can try to influence the vendor to meet requirements. If yes, statistical
organizations ask the vendor to integrate the product in the CSPA. For example, a SAS
component for Analytics.
23.
Vendor sought: Statistical organizations identify new needs/solution and decide to
investigate possibilities of vendors developing it. For example, the Australian Bureau of
Statistics seeking SAS agreement to develop a new procedure for regression that protect
confidentiality.
24.
Roadmap: Statistical organizations need to modify and integrate their roadmaps to align
them with the CSPA framework. They will also integrate/streamline their Investment strategies.
25.
Strategy: Statistical organizations are creating and using an industry strategy ("Industry
Architecture") and this leads to projects/work programs. For example, the Proof of Concept for
CSPA.
26.
Hosted services: One international organization hosts/deploys services available for all
statistical organizations. For example, Eurostat delivers statistical services in the cloud.
7
27.
Statistical organization cooperation: A cluster of statistical organizations have been
working on a project and want to share the results.
28.
Transition Strategy: Each statistical organization needs to define a strategy to move
from its current state to the common future state defined in their roadmap.
IV.
Who uses the Architecture
29.
Using the Architecture will create new functional roles within a statistical organization.
These roles may already exist in some statistical organizations and may have particular people
(for example, Chief Information Officer, Enterprise Architect) assigned to them. This section
explains the functional roles and leaves it to each statistical organization to decide who performs
that role in their organization.
Figure 3: Roles in project lifecycle
30.
An Investor is a person, or a group of people, who undertakes the strategic planning and
investment decision making role to support statistical production in a statistical organization.
In their decision making, they balance benefits, opportunities, risks and costs. To support this
decision making, the Investor could consult a catalogue which includes processes, components
and statistical services and evaluate the relevant components.
8
Technical Specialist roles
31.
Once the need for a component is identified and the project is funded, a Designer from
within or outside the organization will reuse, wrap or design methods and services to provide the
essential capability needed by the organization or wider statistical community. There could be a
number of types of Designer including Process Designers, Information
Designers, Methodology Designers and Service Designers. (See Box 1 for a more detailed
description of the Designer role). The Service Designer will consult the architecture to ensure
that the service is consistent with it and also authorize the service to be added to the catalogue.
32.
If a new statistical service has been designed, the Service Designer will give the work to
a Service Builder. A Service Builder is a specialist or expert from within or outside of an
organization who builds statistical services according to the specifications provided by Service
Designers.
33.
A Service Assembler is a specialist responsible for integrating or assembling relevant
statistical services into an end-to-end process that delivers a critical business capability. This role
requires the specialist skills of people experienced in integration and methodology.
Non-Technical Specialist Roles
34.
Services deliver generic business capability within the GSBPM framework. Services are
then used in a specific context, within a specific statistical process. Services therefore must be
capable of being configured or parameterized for this specific task. A Service Configurer is the
role responsible for configuring either the end-to-end process or a specific process to achieve the
business outcomes specified by the business area responsible for statistical production. People
who fulfill this role would include Subject Matter Experts and Methodologists. Their role is to
configure the parameters needed for a particular business process. For example, for
predetermined capabilities, they would decide what is used and in what order. Their interactions
with a service are only through a User Interface.
35.
Finally, Users are the business area staff who manage and operate the configured end-toend process to achieve the business outcomes sought.
Box 1: An illustration of the Designer Role
Jose is a Service Designer in a statistical organization. He has been asked to design a solution to
collect data for the Labour Force survey. Jose would look into the statistical requirements of all
data collections in the organization and design a service which would meet not just the
requirement of the Labour Force Survey, but also data collection approach for other surveys in
the organization. These statistical requirements would include details on variables to be
collected, on international standards and classification, sample techniques to be used, the
questionnaire logic and the outputs to be obtained.
Once the requirements are understood, Jose searches in the "GSIM aligned Information Object
Catalogue" for the availability of data and classifications that can be used for the survey.
9
Using the "GSBPM aligned Process Catalogue", Jose selects appropriate processes according to
the requirements for different phases of the Data Collection i.e. “Sample Selection", "Set up
Collection", "Run Collection" and "Finalize Collection".
Then Jose browses the organization’s artifact catalogue to search for suitable existing services
he can use to support the survey methodology, processes and information flows. If multiple
modules are available in the organization’s artifact catalogue for a single phase, Jose chooses
according to the requirements and interfaces, possibly verifying with statisticians.
If some of the needed modules are not available in the organization’s artifact catalogue, Jose
searches in the Global Artefact Catalogue. If no suitable services are found there, Jose
investigates various options to make an informed decision:




internal legacy systems that are candidates to become services (by wrapping)
planned initiatives on the global roadmap that meet the requirements and timelines
(using Planning Metadata in the related Catalogue)
existing services that would require extension following the collaborative change
management process (internal and or external)
vendors listed in the catalogues of the Global Artefact Catalogue that could partner or
provide a solution
Jose will need to consult with the Enterprise Architecture on his evaluations and the cost/benefit
analysis of the options seeking confirmation that his proposals aligns with transitional planning.
If there are no alternatives, Jose will design a new service that can be re-used both internally in
his Statistical Organization and in other organizations.
Following the planning metadata, frameworks, standards, policies, guidelines, and other
knowledge assets in the related Catalogues to conform to the requirements of the CSPA.
After designing the services to be used, he ensures that inputs and outputs required align with
the "Service Interfaces" and integration specifications of the services selected. After verifying
that all necessary components are available, Jose selects the appropriate "Architectural Patterns"
and composes the Services package including defining the "Guidelines" to be used by "Service
Builders" and "Service Assemblers" to meet all functional and non-functional requirements.
The service package is then sent to the Enterprise Architect review and validation process for
both internal and international re-use.
V.
Impact on organizations
36.
There will be a number of required changes for an organization implementing the
CSPA. Adoption of the CSPA will require investment with a view to generating the long term
benefits identified in the value proposition.
10
37.
The main changes required at the organization level can be grouped in layers:
A. People Changes
 Openness to international cooperation
 Building trust in international partners (especially as they may be building
services for your organization)
 Sense of compromise (acceptance that nothing will be optimized for local use,
rather it will be optimized for international or corporate use)
 Development of new functional roles to support use of the architecture (e.g.
Assembler, Builder)
B. Process Changes
 Adoption of an industry wide perspective
 Different approach to business process management and design
 Commitment to service (contract between different functional units)
C. Technology Changes
 setting up an adequate middle ware infrastructure (messaging, repositories)
 uplift of physical network capabilities (bandwidth, etc.)
 management of security features
38.
In addition to the costs and the targeted benefits, an organization adopting the CSPA will
benefit from:



VI.
a sustainable and efficient strategy to cope with legacy and phasing out of existing
applications
a cycle that enables cost saving from reduction in production costs to be reinvested in
further infrastructure transformation
a positive image both on national and international/industry scene
Architecture Description
39.
The CSPA has a number of design artefacts. These design artefacts describe the structure
of statistical services, their interrelationships and the principles that govern their design. The first
ideas for a number of design artefacts were produced at the Architecture Sprint. These include:




Principles
Requirements
Statistical Services and Plug and Play
o Generic Statistical Business Process Model (GSBPM) Alignment
o GSIM Alignment
o Quality Attributes / Non-functional Requirements
o Architecture Patterns
Catalogues
11
A. Principles
40.
Principles are high level decisions or guidelines that influence the way processes and
systems are to be designed, built and governed. Principles are derived from the mission and
values of the organization, taking into account the opportunities and threats that the organization
faces. In the CSPA, principles are used to express the high level design decisions that will shape
the future statistical processes and systems.
Decision Principles
41.
Decision principles are guidelines to help decide on strategic development. They provide
a basis for decision making and informing how the mission is fulfilled. They help enable sound
investment decisions. The following decision principles have been selected for the CSPA. They
represent the outcomes sought by the High Level Group and key elements of the United Nations
principles for Official Statistics. These provide a basis for decision making through an enterprise
and inform how that organization sets about fulfilling its mission. How we decide on strategic
development
42.
These principles are outlines below and are to be applied as a cohesive set:






Increase the cost efficiency of the statistical organization
Deliver enterprise-wide benefits
Capitalize on or influence national or international developments
Sharing and Collaboration
Increase the value of the nation's statistical assets
Maintain information security and community trust.
Design Principles
43.
The following design principles have been developed to provide an understanding of the
goals of the CSPA and the sponsors of the architecture. The principles are provided across the
architectural layers outlined earlier (business, information, application and technical) and inform
design related decisions. Applying the principles (see Table 1) in a balanced way will ensure
that statistical services will be delivered to meet the industry vision.
12
Table 1: Design Principles for the different “layers” of architecture and the Statistical Service
Business Design Principles
Information Design Principles
 Use available standards
 Information is an asset that has value to the
 Focus on efficiency
organization and must be managed
 Support responsiveness to new needs
accordingly.
 Organization wide awareness in design
 Information is consistent across all relevant
 Output driven architecture
services.
 Empowering business to reach their goals
 Use available standards to maintain
independently through metadata driven
consistence of language across all services.
processes
 Users/services have access to the necessary
 Re-use existing before creating new
information to perform their duties.
 Create new for re-use and flexible, easy
 Human intervention at run time is
assembly, plug and play
minimized by automated population.
 Use recognized standards
 Changes to Information assets are managed
 Have the right delegations and governance
rigorously to provide traceability and
with supporting metrics
versioning.
 Consider all capability elements e.g.
 Persistence is maintained within the service
Methods, standards, processes, skills, IT
and is not exposed to service consumers.
 Services do not provide persistence as a
main (or sole) purpose. They support a
GSBPM service
Application Design Principles
Technology Design Principles
 Use available standards
 Implementation agnostic
 Leverage learning from international
application platforms (for example
COmmon Reference Environment
(CORE))
 Follow both "classic" and "event-driven"
SOA architecture patterns in our Plug and
Play framework depending on best fit to
requirements
Automate as much as possible
Statistical Service Design Principles
 Managed standardized service contracts based on GSIM objects.
 Enable services to be loosely coupled externally and be aware of internal coupling.
 Maximize service autonomy (completeness) to enable share-ability and reusability
(External & Internal)
 Non-functional requirements (quality attributes) form a key input in design decisions.
 Granularity based on the level below a GSBPM sub phase
 Independence between design and implementation
Statistical Service Design Principles
44.
The design principles (see Table 1 above) have been selected to maximize the flexibility
of the statistical services wrapped or developed in the context of the CSPA. The flexibility of the
13
statistical service directly impacts the level of reuse, the flexibility required of the industry vision
and the ease with which a statistical organization can implement a statistical service.
B. Requirements
45.
A requirement is a specific functional or physical need that a process or system must
fulfill. It is a statement that identifies a necessary attribute, capability, characteristic, or quality of
a system for it to have value and utility to a customer, organization, internal user, or other
stakeholder. In the context of the CSPA, the requirements shown in Figure 4, in addition to the
principles, shape the way the architecture will be realized.
Figure 4: Requirements for the CSPA
C.
Statistical Services and "Plug and Play"
46.
The architecture is based on an architectural style called Service Oriented Architecture
(SOA). This style focuses on Services (or Statistical Services in this case). A service is a
representation of a real world business activity with a specified outcome. It is self-contained and
14
can be reused by a number of business processes (either within or across statistical
organizations).
47.
The level of reusability promised using an SOA is a direct result of the standardized
definition of the service capabilities through the service contract. For CSPA statistical services
these capabilities have been defined as:




the GSBPM for the reusable functional context
the GSIM for the information model (note this will be the implementation model)
the quality attributes/non-functional requirements
the architectural pattern(s) to be supported (traditional and event driven SOA),
Figure 5: CSPA Service Features
48.
A statistical service will perform a task in the statistical process. Statistical services will
be at different levels of granularity. An atomic or fine grained statistical service encapsulates a
small piece of functionality. An atomic service may, for example, support the application of a
particular methodological option or a methodological step within a GSBPM sub phase. Coarse
grained or aggregate statistical services will encapsulate a larger piece of functionality, for
example, a whole GSBPM sub phase. These may be composed of a number of atomic services.
49.
The CSPA is sometimes referred to as "plug and play" architecture. Plug and Play is a
concept that was borrowed from the world of computer hardware. It is used to indicate that
installing new hardware modules is easy: just plug it in and it will work. In the context of
statistical processes, we use the term to indicate that installing a new "module" of functionality in
an existing process is easy. In fact, the idea is that it should be possible to build a complete new
process by stringing together a number of such modules, like building a wall from Lego-bricks.
15
50.
Within the Plug and Play concept, the Service is the “pluggable” element. The
granularity of statistical services should be based on a balanced consideration between the
efficiency of the statistical service and the flexibility required for sharing purpose - larger
statistical services will usually enable greater efficiency, whereas a finer granularity will allow
greater flexibility for supporting sharing and reuse. It is envisaged that the statistical services will
be shared or reused at one level below a GSBPM sub phase. Services, regardless of their
granularity, must meet the architectural requirements and are aligned with the CSPA
principles. Figure 5 shows that these statistical services must align to GSBPM and GSIM as well
as adhere to the architectural patterns and quality attributes / non- functional requirements
described by the architecture.
Generic Statistical Business Process Model (GSBPM)
51.
The GSBPM is a flexible tool to describe and define the set of business processes needed
to produce official statistics. The GSBPM can also be utilized to harmonize statistical computing
infrastructures, facilitating the sharing of software components, and providing a framework for
process quality assessment and improvement.
52.
For further information please refer to the GSBPM web site
http://www1.unece.org/stat/platform/display/metis/The+Generic+Statistical+Business+Process+
Model
Generic Statistical Information Model (GSIM)
53.
The GSIM is the first internationally endorsed reference framework for statistical
information. This overarching conceptual framework plays an important part in modernising,
streamlining and aligning the standards and production associated with official statistics at both
national and international levels.
54.
GSIM is a reference framework of information objects, which enables generic
descriptions of the definition, management and use of data and metadata throughout the
statistical production process. It provides a set of standardized, consistently described
information objects, which are the inputs and outputs in the design and production of statistics.
As a reference framework, GSIM helps to explain significant relationships among the entities
involved in statistical production, and can be used to guide the development and use of consistent
implementation standards or specifications.
55.
GSIM is one of the cornerstones for modernising official statistics and moving away
from subject matter silos. It is a key element of the strategic vision prepared by the High-Level
Group for the Modernization of Statistical Production and Services (HLG), and endorsed by the
Conference of European Statisticians.
56.
For further information please refer to the GSIM web site
http://www1.unece.org/stat/platform/pages/viewpage.action?pageId=59703371
Quality Attributes/ Non-Functional Requirements
16
57.
Quality Attributes / Non-Functional Requirements are important to be captured in the
design of the services. They have a significant influence on the software architecture of a service
58.
The Quality Attributes / Non-Functional Requirements listed in Table 2 are grouped into
four specific areas linked to design, runtime, system and user qualities. Further explanations of
these quality attributes including descriptions can be found in Annex 1.
Design Qualities
Table 2: Quality Attributes / Non-Functional Requirements
 Reusability (Exploitability)
 Legal and licensing issues or patent-infringement-avoidability
 Privacy
 Multilanguage support (internationalization and localization)
 Compliance
 Extensibility (adding features, and carry-forward of customizations at
next major version upgrade)
 Maintainability
 Platform compatibility
Run time Qualities







System Qualities










Failure management
Quality (e.g. faults discovered, faults delivered, fault removal efficacy)
Configuration management
Deployment
Operability
Portability
Scalability (horizontal, vertical)
Backward Compatibility
Supportability
Testability
User Qualities


Documentation
Usability by target user community
Security
Efficiency (resource consumption for given load)
Interoperability
Effectiveness (resulting performance in relation to effort)
Performance / response time (performance engineering)
Availability (see service level agreement)
Fault tolerance (e.g. Operational System Monitoring, Measuring, and
Management)
 Backup
 Robustness
17
Architecture Patterns
59.
In simple terms, architecture patterns describe a re-usable solution to certain classes of
problems. They explain how, when and why statistical services can be used, as well as the
impact of using them in that way. They help a Service Assembler to identify combinations that
have been used successfully in the past. Although not identified in this document, there are also
anti-patterns which are examples of what should not be done.
60.
The benefits of using architecture patterns can be described by using the analogy of an
expert chess player. To play chess, you must learn the rules and the principles (for example the
value of different pieces, squares). However, to improve and become a really good player, you
need to learn the patterns used by more experienced players and apply them to your game. In the
same way, you can use the requirements, principles and quality attributes of the CSPA, but to get
the maximum benefit, Service Assemblers should learn the architecture patterns.
61.
The CSPA will incorporate both "Classic" and Event Driven Service Oriented
Architecture patterns. Both patterns provide capabilities to cope with the future challenges facing
statistical organizations. Event Driven Service Oriented Architecture would be preferred in cases
such as Big Data, new Dissemination processes, cross organization cooperation.
62.
For other issues either it is indifferent (e.g. security) or it is still undetermined (e.g. future
technology, loss of control of data collection). These aspects may be investigated on the basis of
the Proof of Concept results including aspects like impact on granularity.
63.
Further details comparing the two types can be found in Annex 2.
"Classic" Service Oriented Architecture (SOA)
64.
"Classic" SOA is a pattern that uses the Request/Response pattern for activating services.
This implies a rather fixed routing of messages between services. The integration infrastructural
platform implementing the process “orchestrates” the routing and executing of services. This
pattern leads to less flexibility and tighter coupling between services that the Event Driven
(Service Oriented) Architecture pattern described below.
65.





"Classic" SOA can be used if:
A functional style and sequential flow is required
It is known precisely which service interface should be called
The service should be called exactly once
A response when service execution completes is expected
A response is expected
66.
An example of how this pattern could be applied in relation to collection is each
questionnaire is stored in an entity service. The entity service exposes the operation to get the
questionnaire through a service call. The indicators are computed and stored and made available
using an entity service call.
18
Figure 6: Examples of “Classic” SOA
Event Driven (Service Oriented) Architecture (EDA)
67.
EDA is based on the publish/subscribe concepts and heavily utilizes many of
SOA approaches and design patterns. Some consider EDA to be an asynchronous version of
SOA.
68.
SOA emphasizes the importance of loose coupling. EDA goes a step further. As an event
is generated by an event source and is sent to the processing middleware. It is not known which
functionality is triggered next. In SOA, the concrete service call would have been made, but this
is not the case for Event Driven Architecture. For this reason, EDA talks about "decoupling"
rather than loose coupling.
69.
EDA can be used if:





All recipients that may be interested in the event should be notified
It is not exactly known which and how many recipients are interested in the event
It is not known how recipients respond to this event
Different recipients respond differently to the same event
Only one-way communication from the sender to the recipient is possible
70.
An example of how this pattern could be applied in relation to collection is when each
questionnaire completed publishes an event when available for subscribers downstream. Early
indicators can be produced by processing collection events straight through aggregation.
19
Figure 7: Examples of EDA
Enabling Architectural Mechanisms
71.
Within each statistical organization, there needs to be an infrastructural environment in
which the generic services can be combined and configured to run as element of organization
specific processes. This environment is not part of the CSPA. The CSPA assumes that each
statistical organization has such an environment and makes statements about the characteristics
and capabilities that such a platform must have in order to be able to accept and run statistical
services that comply with CSPA.
72.
Platform for Service Communication: A communication platform provides the
capability for communication between statistical services. It enables inter-service communication
while allowing statistical services to remain autonomous and add additional capabilities for
monitoring and orchestrating the information flow. To assemble a built statistical service, the
communication platform is updated to integrate with new services. There are multiple ways of
establishing a platform communication. Examples of architectural components could be BPMS,
ESB, Workflow Engines, Orchestration Engines, Message Queuing and Routing.
73.
Platform for Configuring and Controlling Services and Processes: The Platform for
Controlling Service and Process execution encompasses the functionalities and tools to support
the management and maintenance of services metadata, artifacts and policies. Examples of how
this mechanism could be achieved include Business Process Modelling System, Lifecycle
Management, Service Monitoring and Management.
74.
Platform for Reporting on Services and Processes: The Platform for reporting is
responsible for enabling real-time monitoring and near-real-time presentation of user defined
business key performance indicators (KPIs). Examples of how this mechanism could be achieved
are Static Dashboard, Business Activity Monitoring (also generates alerts and notifications to
user when these KPIs cross specified thresholds).
20
D.
Catalogues
75.
In the context of a CSPA, catalogues of various artefacts are seen as key enablers. They
provide lists and descriptions of standardized artefacts, and, where relevant, information on how
to obtain and use them.
76.
The catalogues can be at many levels, from global to local. However, for the purposes of
this project, it is the global level that is the primary interest. The global catalogue is called the
Global Artefact Catalogue. The Catalogue will provide information about resources and potential
collaboration partners, helping to ensure that the modified component conforms to the
requirements of the CSPA. Governance and support mechanisms and processes will need to be
defined to ensure the continued relevance and utility of the catalogues.
77.
The Global Artefact Catalogue will include three sub catalogues. These are shown in
Figure 8.
Figure 8: Types of Catalogues
78.
The catalogues should support the lifecycle management, governance and use of
components and services, providing the right level of artefacts for service design, integration and
transition to a production environment.
79.
They are not meant to function as an "app store", as they are not designed to be platform
specific, and will not necessarily hold copies of executable code. They will, however, provide all
necessary information about how to access the artefacts, including contact details for further
information. They should provide sufficient contents to support the CSPA.
21
Figure 9: Types of artefacts in the Catalogues
80.
As shown in Figure 9, the types of artefacts in Catalogues are:






Frameworks include artefacts such as the GSBPM and the GSIM
Standards include DDI and SDMX
Policies support governance and use, for example "no changes to GSBPM or GSIM
unless they are the result of a formal approval process"
Guidelines concern the practical use and integration of artefacts
Other knowledge assets could include architectural principles and information on how to
use the catalogues
Planning metadata is intended as a repository for information about developments in
progress or planned, to promote collaboration at the earliest possible stage in the
development of artefacts, e.g. "Statistics New Zealand aim to develop and make available
component X by (date)". This will provide the basis for a global roadmap.
VII. Governance of the architecture
81.
Creating a CSPA is only the first part of the story. It is likely that the architecture, or
elements of it, will need to be revised and refined from time to time to ensure continued
relevance. As the architecture will be a common asset for the international official statistics
community, there will need to be a clear and transparent process for change management.
82.
This process does not need to be invented from scratch. The CSPA is being developed as
a project overseen by the High-Level Group for the Modernization of Statistical Production and
Services (HLG) so it is assumed that this group will be the custodians of the outputs. Whilst the
architecture itself will be “owned” by the international official statistics community, it will be
administered by the HLG, as top-level representatives of that community.
83.
However the HLG itself is unlikely to be involved in detailed discussions about specific
elements of the architecture. Therefore, for practical purposes, proposals for changes will usually
be discussed at a lower level. These discussions will take place either in one of the four
modernization committees that are being established under the HLG, or in a specific task team.
22
These groups will then formulate recommendations for change, which will typically be bundled
together as a package to be formally signed off by the HLG.
84.
The HLG and the modernization committees will need to carefully balance the
advantages of change, in terms of increasing relevance and usefulness, against the costs of
having to implement those changes within statistical organizations. A reasonable degree of
stability over time is therefore a key requirement for the architectural framework.
23
Annex 1: Quality Attributes / Non-Functional Requirements
Table 3: Further details on Quality Attributes
CATEGORY Quality Attribute
Definition
Notes
Is the capability for components and subsystems to be
suitable for use in other applications and other
Design
Reusability (Exploitability)
scenarios. It minimizes the duplication of components
Qualities
and also the implementation time.
Legal and licensing issues or
Is the conformance of a system to legal, licensing and
patent-infringement-avoidability patent regulations.
Is the ability of a system to prevent the disclosure or
Privacy
Confidentiality
loss of information.
Multilanguage support
Is the capability of a system to handle different
(internationalization and
languages and regional differences.
localization)
Is the degree in which a system conforms to a rule,
such as a specification, an architecture description or
reference, policy, standard or law.
Extensibility (adding features,
Is the ability of system to take into consideration
and carry-forward of
future growth? It is measured by the ability of a
customizations at next major
service to be extended and the level effect required to
version upgrade)
implement the extension.
The ease with which a software system or component
can be modified to correct faults, improve
Maintainability
performance, or other attributes, or adapt to a changed
environment
Is the extent to which a system needs to be adapted to
Platform compatibility
a given technological environment?
Are the measures of the cost-benefit relation of a
Price
system to create a solution for any given problem.
24
FURPS
Functionality
Functionality
Functionality
Functionality
Supportability
Supportability
Supportability
Supportability
Supportability
Run time
Qualities
Is the capability of a system to prevent malicious or
accidental actions outside of the designed usage, and
Security
to prevent disclosure or loss of information. A secure
system aims to protect assets and prevent
unauthorized modification of information.
Is the degree in which a process uses the lowest
Efficiency (resource
amount of inputs to create the greatest amount of
consumption for given load)
outputs. It relates to the use of all inputs to produce
any given output.
Is the ability of diverse systems and organizations to
work together (inter-operate). The term is often used
in a technical systems engineering sense, or
Interoperability
alternatively in a broad sense, taking into account
social, political, and organizational factors that impact
system to system performance.
Effectiveness (resulting
Is the capability of a system to help a user to reach his
performance in relation to effort) specific goals in a given context.
Is an indication of the responsiveness of a system to
execute any action within a given time interval. It can
Performance / response time
be measured in terms of latency or throughput.
(performance engineering)
Latency is the time taken to respond to any event.
Throughput is the number of events that take place
within a given amount of time.
Is the proportion of time that the system is functional
and working. It can be measured as a percentage of
Availability (see service level
the total system downtime over a predefined period.
agreement)
Availability will be affected by system errors,
infrastructure problems, malicious attacks, and system
load.
Fault tolerance (e.g. Operational Is the ability of a system to continue its intended
System Monitoring, Measuring, operation, possibly at a reduced level, rather than
25
Functionality
Performance
Functionality
Performance
Performance
Reliability
Reliability
and Management)
System
Qualities
failing completely, when some part of the system
fails.
Is the capacity of a system to copy and to archive
Backup
computer data so it may be used to restore the original
after a data loss event.
Is the ability of a computer system to cope with errors
during execution or the ability of an algorithm to
Robustness
continue to operate despite abnormalities in input,
calculations, etc.
Are the anticipation, detection and resolution of
Failure management
programming, application, and communication errors.
Quality (e.g. faults discovered, Is the capability of a system to meet the needs and
faults delivered, fault removal
concerns of the stakeholders for whom it was
efficacy)
developed.
Is the capacity to instantiate and customize new
Configuration management
versions of the system.
Deployment
Is the ability to schedule software distribution
Is the ability to keep a system in a reliable condition
Operability
according to predefined operational requirements
Is the capacity of a system to be adapted to different
specified environments without applying other actions
Portability
or means than those provided for this purpose for the
software considered.
Is the ability of a system, network, or process to
handle a growing amount of work in a capable
Scalability (horizontal, vertical)
manner or its ability to be enlarged to accommodate
that growth.
Is the ability of a system to work with input generated
Backward Compatibility
by an older product or technology. If products
Just standards
designed for the new standard can receive, read, view
26
Reliability
Reliability
Reliability
Reliability
Supportability
Supportability
Supportability
Supportability
Supportability
Supportability
Supportability
Testability
User
Qualities
Documentation
Usability by target user
community
or play older standards or formats, then the product is
said to be backward-compatible
Is the ability of the system to provide information
helpful to identifying and resolving issues when it
fails to work correctly.
Is a measure of how easy it is to create test criteria for
the system and its components, and to execute this
tests in order to determine if the criteria are met.
Is the degree in which a system is described in all its
parts in order to get complete knowledge of how it
has been developed and how must be used.
Is the measure of how easily the system can be used
by a given group of users who share certain
characteristics.
27
Supportability
Supportability
Usability
Usability
Annex 2: Differences between “Classic” SOA and EDA
67.
The following table summarizes the main differences between the architecture
patterns "Classic" SOA and Event Driven Service Oriented Architecture.
Table 4: Side by side comparison of the characteristics of "Classic" SOA and EDA
Characteristics
Architectural Style
Service Interface
Control Flow
Request
Request access
point
"Classic" SOA
Event Driven SOA (EDA)
Functional service style is
required
Based on the operation
contracts,
data contracts (for parameters)
and fault contracts.
Event driven service style (publishsubscribe) is required
Sequential flow is required
The service can be called
exactly once.
It is known precisely which
service interface should be
called
Based on the lists of events that the
service can publish.
Sequential flow is possible by using
additional platform functionalities.
Calls are not made explicitly.
Requests are not made directly to a
service interface.
A response when service execution
completes cannot be explicitly expected
A response when service
unless there is an event associated to it.
Response
execution completes can be
Different recipients can respond
explicitly expected
differently to the same event.
It is not known how recipients respond to
this event
Only one-way communication from the
Message Exchange Various MEPs available.
publisher service to subscriber
Pattern (MEPs)
Synchronous and asynchronous
(s) (publish subscribe pattern).
All recipients listening to the event are
Multiple consumers One at a time
notified
28
Table 5: Side by side comparison of the potential adherence to the Statistical Service Design
Principles of "Classic" SOA and EDA (No, Depends on, Yes-partially, Yes-fully)
Service Design Principles
Event Driven SOA (EDA)
Yes-fully
Managed standardized service Yes-fully
Functionality need to be
contracts based on GSIM
Functionality explicitly
described externally (textually).
fragments/entities.
defined in the service interface
The event type data payload is
(in the operation parameters).
a contract based on GSIM.
Enable services to be loosely
coupled externally and be
aware of internal coupling.
Maximize service autonomy
(completeness) to enable
share-ability and reusability
(External & Internal)
Non-functional requirements
(quality attributes) form a key
input in design decisions.
Granularity based on the level
below a GSBPM sub phase
Independence between design
and implementation
"Classic" SOA
Yes-partially
Loose coupling based on strict
service contract is possible
Yes-fully
Very loose coupling. Provider
is not aware of consumer's
existence.
Consumers is coupled to the
event only (not the source of
the event if a mediator is in
place for subscription)
Depends on implementation
but potentially Yes-fully
Depends on implementation
but potentially Yes-fully
Depends on implementation.
Depends on implementation.
Depends on design
Depends on design
Depends on implementation
but potentially Yes-fully
Depends on implementation
but potentially Yes-fully
29
Annex 3: Glossary
Table 6: Glossary
Term
Definition
Application
Architecture
The set of practices used to select, define or design software components and their relationships.
Source: Sprint
Architectural
Pattern
The description of a recurring particular design problem which comes from different design contexts. The solution
schema is specified by describing its components, its responsibilities its relations and the ways they collaborate.
Business
Architecture
Defines and evolves understanding of what our organization does and how we do it (statistics in our case)
Source: Statistics New Zealand
Business Process
A series of logically related activities or tasks performed together to produce a defined set of result.
Capability
Architecture
1. Start with business outcomes. To a CEO, the metrics — the business outcomes — that most determine the
organization’s success boil down to the balance sheet, statement of operations, and statement of cash flow,
combined with soft metrics that indicate the organization’s power, stature, and influence in the marketplace.
2. Direct the evolution of business capabilities. Business plans make a poor foundation for tech strategy. They
will — and should — constantly change in response to competitive and political dynamics. A better
foundation is an organization’s core capabilities — product design, customer service, etc. — which we build
on and evolve to achieve better outcomes. But we need to do more than model and plan business capabilities,
we need to implement them using an integrated, operational, measurable combination of people, processes,
technology, and physical resources. Thus, Forrester places the notion of an evolving business capability
implementation — a complete running part of a business — as the target for Business Capability Architecture.
3. Provide design implementation models oriented around business change. A business capability map is
good, but if you can’t carry it through to implementation, it’s just a pretty picture. So Forrester’s Business
Capability Architecture — by analyzing the ways that businesses change and evolve — provides design
principles and implementation models for integrated, coherent, holistic implementation of business
capabilities. This includes the notion of a business capability platform — a cohesive, integrated,
multitechnology, business-focused platform — that gets past the single technology focus of our current
day BPM apps, event-driven apps, and the like.
Source: Forester
30
Term
Definition
Common
Statistical
Production
Architecture
A set of principles for increased interoperability within and between statistical organizations through the sharing
of processes and components, to facilitate real collaboration opportunities, international decisions and investments
and sharing of designs, knowledge and practices
Source: Sprint
Composition
Assembly of Services that in itself is a service
Enterprise
Architecture
Enterprise architecture is about understanding all of the different elements that go to make up the enterprise and
how those elements interrelate. It is an approach to enabling the vision and strategy of an organization, by
providing a clear, cohesive, and achievable picture of what’s required to get there.
Source: Sprint
Function
A sequence of program instructions that perform a specific task, packaged as a unit.
Global Platform
roadmap
An international level roadmap showing at what point in future new capabilities will be added to the Statistical
Organization library
GSBPM aligned
process
catalogue
Collection of pre-assembled GSBPM sub processes. Each element listing preconditions, inputs, output, outcome
GSIM aligned
Information
Object Catalog
Collection of schemas, each schema describing one information object exchanged between services through an
interface.
Industry
Architecture
A set of agreed common principles and standards designed to promote greater interoperability within and between
the different players that make up an "industry", where an industry is defined as a set of organizations with similar
inputs, processes, outputs and goals.
Source: Sprint
Information
Architecture
Builds understanding of our information, its flows and uses across the organization, and how we manage that
information.
Source: Sprint
Integration
An Architecture specifically focusing on the integration, co-operation, etc. of certain elements, e.g. Components,
31
Term
Definition
Architecture
Services or data.
Source: Sprint
Interface
A type of contract by which subsystems or component communicate.
Source: Sprint
Message
A model that describes how two different parts of a message passing system connect and communicate with each
Exchange Pattern other. There are two major message exchange patterns - a request-response pattern and one-way pattern (also
called fire-and-forget). For example the HTTP is a request-response pattern protocol and UDP is a one-way
pattern.
Metadata Driven
An architecture that supports the design, composition, operation, and management of statistical business
processing and its inputs and outputs through interaction with standard metadata - all of which is automated to the
maximum degree possible.
Orchestration
A way of stringing together a number of services to build a statistical process. Orchestration composes services to
a new service that has central control over the whole process.
Orchestration vs
Workflow
The main difference between a workflow automation and orchestration is that work flows are processed and
completed as processes within a single domain. (Erl)
Orchestration is the connecting and automating of workflows when applicable to deliver a defined service.
Principles
General rules and guidelines, intended to be enduring and seldom amended, that inform and support design and
decision making, and the way in which an organization sets about fulfilling its mission.
Source: Sprint
Protocol
Formats and rules for exchanging messages in or between computing systems
Quality attributes Quality attributes are the overall factors that affect runtime behavior, system design, and user experience.
They represent areas of concern that have the potential for application wide impact. When designing applications
to meet any of the quality attributes requirements, it is necessary to consider the potential impact on other
requirements and to analyze the tradeoffs between them.
The importance or priority of each quality attributes differs from system to system.
Reuse
Reuse is the concept of using a common asset (implemented component, a component definition, a pattern...)
32
Term
Definition
repetitively in different (or similar) contexts (for example in different business processes), and/or by different
participants, and/or overtime.
Source: Sprint
Service
A service is a logical representation of a repeatable business activity that has a specified outcome and is selfcontained, may be composed of other services and is a "black box" to consumers of the service.
Source: TOGAF (G113):
Service Contract
A service contract is comprised of one or more published documents (called service description documents) that
express meta information about a service. The fundamental part of a service contract consists of the service
description documents that express its technical interface. These form the technical service contract which
essentially establishes an API into the functionality offered by the service. A service contract can be further
comprised of human-readable documents, such as a Service Level Agreement (SLA) that describes additional
quality-of-service features, behaviors, and limitations.
Source: http://serviceorientation.com/soaglossary/service_contract
Service Design
Principles
A service design principle represents a highly recommended guideline for shaping solution logic in a certain way
and with certain goals in mind. These goals are usually associated with establishing one or more specific design
characteristics (as a result of applying the principle).
Source: http://serviceorientation.com/index.php/soaglossary/design_principle
Service Interface
A service interface is the abstract boundary that a service exposes. It defines the types of messages and the
message exchange patterns that are involved in interacting with the service, together with any conditions implied
by those messages.
Source: http://www.w3.org/TR/ws-arch/#service_interface
Service Level
Service Levels describes named categories of measurable increments of non-functional/quality attributes (like
scalability, availability, security, transaction support…) packages offered by a service provider. For example,
service levels for availability might be offered as GOLD (99,999%), SILVER (98%) and BRONZE (95% up
time). Service Levels are used by the provider to negotiate with a service consumer a service-level agreement
(SLA).
Service Oriented
Architecture
Service-Oriented Architecture (SOA) is an architectural style that supports a way of thinking (Service Orientation)
in terms of services and service-based development and the outcomes of services.
33
Term
Definition
The SOA architectural style has the following distinctive features:






It is based on the design of the services – which mirror real-world business activities – comprising the
enterprise (or inter-enterprise) business processes.
Service representation utilizes business descriptions to provide context (i.e., business process, goal, rule,
policy, service interface, and service component) and implements services using service orchestration.
It places unique requirements on the infrastructure – it is recommended that implementations use open
standards to realize interoperability and location transparency.
Implementations are environment-specific – they are constrained or enabled by context and must be described
within that context.
It requires strong governance of service representation and implementation.
It requires a “Litmus Test”, which determines a “good service”.
Source: The Open Group http://www.opengroup.org/soa/source-book/soa/soa.htm
Share
Share is an ownership concept where an asset is made available to other participants for use.
There are levels of sharing. A limited form of sharing would be to provide another participant with the means to
replicate (make a copy) the asset (for example give the source code)(i.e. they share an aspect of the asset only). A
more involved form of sharing would entail that asset is actually been made entirely common (in this case the
asset is also reused).
Source: Sprint
Technology
Architecture
The architecture describing the infrastructure technology underlying (supporting) the business and application
layers.
Source: Sprint
34
Download