Background Document on Data Flows

Data flows in a Shared Environmental Information System
Item: 1.1
7 November 2008
Source:
Subject:
IDS
On the evolution of data flows - Background
Introduction
In October 2007 the NRC for Information Systems (part of the EIONET network)
met at the EEA premises to discuss the future development of Reportnet. Our
understanding of the messages from that meeting was that EEA should not focus
on a new development of Reportnet: keep it mainly as it is, do not add new
features, and go mainly for an update or refresh (better processes etc.). The
MS further told us that they were willing to provide access to web services for
a number of suitable data flows in which the EEA is interested, provided that
EEA (or any other player at the European level) comes forward with the
standards. That message has also been voiced at a majority of the SEIS Country
Visits that we have done so far.
The strategic perspective for SEIS is linked to the change in information
demand that the European Union faces in the years ahead. From an Information
Technology point of view (with reference to that description) SEIS is about
moving from the Acquis-driven policy demand that resulted in Compliance
assessments, via the more Sector-oriented environmental policy (in which
environment was one sector together with Energy, Transport, Agriculture etc.),
to an information outcome that responds to the Strategic policies that already
exist in the 6th EAP, namely Integrated assessments that use existing data from
the countries as input to models that provide the results.
Since the 1970s information technology has progressed enormously, and the
type of open, integrated and technically interoperable approach that is needed
to support the strategic policy demand that leads to integrated assessments
is becoming increasingly feasible. But we should also bear in mind that SEIS is
about creating a distributed environmental information system. The focus will
be on how systems work together, i.e. adapting the existing systems in the MS,
and the information that they handle, to new demands, needs and standards.
Below we describe, as a long term objective, a data flow that could be looked
upon as the SEIS vision on the input side, i.e. deliveries of data that are
based on SEIS principles. That can only happen from systems that are operating
on line at the national level.
Kongens Nytorv 6
1050 Copenhagen K
Denmark
Tel.: +45 33 36 71 00
Fax: +45 33 36 71 99
D:\687322950.doc
E-mail: eea@eea.europa.eu
Web: www.eea.europa.eu
Long term objective - Deliveries through SEIS on line systems
[Figure: 3b) Delivery happens near real time. Both systems are monitored.
(SEIS) Labels shown: Online systems, Data Protocol and System maintenance on
both sides; Near real time DB; Services; Quality Assurance, Packaging,
Warehouse; Data flow consortium; Conceptual data flow model. Time scales shown:
~Near real time; ~6 months - 2 years.]
The national systems are linked to the European system on line. Both the
national and the European (or global) systems are monitored, so when data is
changed somewhere the updating will take place simultaneously (i.e. Near Real
Time, NRT). Both sides have a team that maintains and manages the flows. The
flow can be automated through a SEIS-induced connection between those teams.
We also propose a Data Flow Consortium, a body designed to oversee the model
and protocols needed to sustain the flow.
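As an illustration only, the monitored, change-driven update described here can
be sketched in a few lines of Python. The class name MonitoredStore, the
station identifier and the record layout are assumptions made for this sketch;
the "data protocol" is reduced to a direct function call, which a real flow
would replace with an agreed service interface.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, float]


@dataclass
class MonitoredStore:
    """A data store that notifies its subscribers whenever a record changes."""

    name: str
    records: Dict[str, Record] = field(default_factory=dict)
    subscribers: List[Callable[[str, Record], object]] = field(default_factory=list)

    def subscribe(self, callback: Callable[[str, Record], object]) -> None:
        self.subscribers.append(callback)

    def upsert(self, key: str, record: Record) -> bool:
        # Propagate only real changes, so repeated identical updates cost nothing.
        if self.records.get(key) == record:
            return False
        self.records[key] = dict(record)
        for notify in self.subscribers:
            notify(key, record)
        return True


# Hypothetical setup: one national system feeding one European system.
national = MonitoredStore("national")
european = MonitoredStore("european")
national.subscribe(european.upsert)

national.upsert("station-001", {"o3_ugm3": 82.0})
national.upsert("station-001", {"o3_ugm3": 95.5})  # a correction at the source
```

After the second call the European store holds the corrected value without any
package having been exported or uploaded, which is the point of the near real
time link.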
Today this will only work with NRT systems. Systems with a lower flow rate would
become more costly and might not have the right amount of resources available
to maintain a fully developed system (that might however change in the future).
We also believe that
- the resources spent today on package reporting could be used for other
purposes
- it will create possibilities for near real time European services and
reporting
- it will make data flows possible between anybody (i.e. between regions in
different countries or towards international organisations)
- it might create a flow of more data than was originally requested in a
Reporting Obligation (i.e. the sum of several obligations from different
international organisations dealing with the same topic)
We know that countries have shown interest in participating in this kind of
delivery by opening up their systems.
Ozone web could be seen as one of the first setups for EEA. Ozone web is now
looking into how it can be opened to other international organisations and pass
the near real time data on further. From the data flow point of view it would
be ideal if that were done already at the source. Strong involvement from the
providers will be needed to make this happen, and that could be accommodated by
a data flow consortium. Its task is to discuss and decide on common technical
solutions in order to facilitate and maintain the data flow specified.
Annexed is a catalogue of the types of data flows that we have identified so far.
How to get to the long term objective
Below we outline a process that is the result of a discussion at the EEA. As
background to the process we set up the following principles:
- The SEIS concept is about distributed systems, where the responsibility for
the quality of data lies mainly with the data provider, i.e. the organisation /
country providing the data. The data is stored and accessed as close to the
source as possible.
- EEA will work on a case by case basis with each data flow. The flows are
different and will evolve differently. The cost benefit is also important. At
all times we should ensure that the cost/benefit ratio is reasonable.
- The focus in the proposed projects should be on how systems work together,
i.e. adapting the existing systems both at the European and the national
level, and the information they handle, to new demands, needs and
standards.
- We should be active in the SEIS NESIS project and use that project to get
information on what kinds of systems exist in the countries, to define the
state of play and to learn more about the country perspective.
- We are all equal partners; any organisation (provider or receiver of data
flows) is responsible for its own system development and integration
towards the other systems, irrespective of whether the flows go
regional/regional, regional/national, regional/European or national/European.
EEA will continue to develop Reportnet and the work will be focused on
improvement of the performance of the present system for the traditional
reporting that will continue to take place. Reportnet will also be extended with
functionalities that support the SEIS concept of a distributed system, where data
is stored as close to the source as possible. In particular this will apply for spatial
data.
Taking into consideration the way data is produced at national level, the data
flows that Reportnet will handle are deliveries of data based on
Questionnaires (including Compliance Reporting), data that are maintained in
offline applications and have to be exported for delivery to Reportnet, and
data that are maintained in online systems but, in the beginning, have to be
exported for delivery in Reportnet.
In the long run the updated Reportnet should continue being a tool focused on
data reporting for compliance assessments and for those data flows that, for
cost/benefit reasons, never will evolve into online systems.
EEA is also carrying out an inventory in the countries, through the NESIS
project (www.nesis.eu), of online systems that could provide deliverables to
the European level. A change from the present situation will only make sense if
most of the countries in one topic have their data in online systems.
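The "most of the countries" condition can be made concrete once the inventory
exists. The following is a minimal sketch, assuming that "most" means more than
half (the text does not fix a number) and using made-up placeholder country
codes rather than real inventory results.

```python
from typing import Dict


def share_online(inventory: Dict[str, bool]) -> float:
    """Fraction of countries whose data for one topic sits in an online system."""
    if not inventory:
        return 0.0
    return sum(inventory.values()) / len(inventory)


def change_makes_sense(inventory: Dict[str, bool], minimum_share: float = 0.5) -> bool:
    # Assumption: "most" is read as strictly more than minimum_share.
    return share_online(inventory) > minimum_share


# Placeholder inventory for one topic; True means "data held in an online system".
topic_inventory = {"AA": True, "BB": True, "CC": False, "DD": True}
online_share = share_online(topic_inventory)
worth_changing = change_makes_sense(topic_inventory)
```

The threshold is a parameter on purpose: the cost/benefit principle above
suggests it may have to be set per data flow rather than once for all topics.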
We should also perform a pilot study for one (or several) specific data flows
in order to gain experience and assess what kind of resources are needed
to move from the present situation towards the long term objective. At present
it is difficult to say what the above mentioned development will mean on a
practical level. As an example, responsibilities such as quality checking or
problem reporting might move towards the European level. There is also the
question of resources. The implementation costs need to be looked at carefully
and secured by clear agreements. A small request or implementation from one
party might have a serious impact on the other. The Dataflow Consortium could
ensure that those issues are addressed correctly. We propose to start with
E-PRTR as a first pilot project.
Date:
29 May 2008
Bernt Röndell, IDS; Jan Bliki, IDS; Søren
Rough, IDS
On the evolution of data flows –
Annex - Catalogue of deliveries
I. Reflections on deliveries
Reportnet's focus today is mainly on compliance reporting, with a small part
on EEA’s voluntary data flows (see above). All datasets look the same from a
Reportnet point of view. But if you take into consideration the way data is
produced at national level the picture is different. Below we try to
describe a number of “types” of data flows and at the same time to define
in what way the EEA should handle the future development of these
types.
Many of the national organisations have integrated quality control and
validation functions into their on line systems (described under C below). At
the NRC IS meeting the NRCs stressed their interest in having connections
directly to those systems instead of exporting data packages and manually
cross-checking such deliveries. Deliveries described under C below are the only
type of deliveries that we can move to a higher level of automation, and also
towards near real time, without heavy and costly investments.
SEIS is about getting access to and sharing data and information that is
handled in information systems and in organisations all over Europe.
In the “Catalogue” below (under D and E) are described two closely related flows
that could be looked upon as the SEIS vision on the input side. Deliveries can
only be moved into these kinds of flows from existing on line systems at the
national level.
The main reason for discussing the alternatives is strongly related to the
costs incurred at national level in order to meet the minimum requirements
specified in the legislation / reporting obligation.
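As a reading aid for the catalogue in section II, the five delivery types can
be written down as a small Python enumeration. This is only a sketch: the
source labels ("form", "offline", "online", "sensor") and the function names
are assumptions made here, not terms used by Reportnet.

```python
from enum import Enum


class DeliveryType(Enum):
    QUESTIONNAIRE = "A"          # forms filled in directly in Reportnet
    OFFLINE_APPLICATION = "B"    # offline database, exported for delivery
    ONLINE_EXPORT = "C"          # online system, still exported for delivery
    ONLINE_NEAR_REAL_TIME = "D"  # online systems linked in near real time
    SENSOR_WEB = "E"             # sensors that alert when thresholds are exceeded


def classify(source: str, linked_to_european_system: bool = False) -> DeliveryType:
    """Map a hypothetical description of a national source to a catalogue type."""
    if source == "form":
        return DeliveryType.QUESTIONNAIRE
    if source == "offline":
        return DeliveryType.OFFLINE_APPLICATION
    if source == "sensor":
        return DeliveryType.SENSOR_WEB
    if source == "online":
        if linked_to_european_system:
            return DeliveryType.ONLINE_NEAR_REAL_TIME
        return DeliveryType.ONLINE_EXPORT
    raise ValueError(f"unknown source: {source!r}")


def cheap_to_automate(delivery: DeliveryType) -> bool:
    # Only type C can move towards near real time without heavy investment.
    return delivery is DeliveryType.ONLINE_EXPORT


candidates = [s for s in ("form", "offline", "online") if cheap_to_automate(classify(s))]
```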
II. Catalogue of Deliveries
A. Deliveries based on Questionnaires
[Figure: 1) Reportnet questionnaires. Countries fill in forms directly in
Reportnet. National: Operator filling in questionnaires (Forms). Reporting /
Reportnet: Directory, ROD, CDR, Data Dictionary, Conversion Services, DMM.
International: Quality Assurance, Packaging, Warehouse.]
Description
The National level uses forms inside Reportnet to deliver data. In practice
this means that when it is time to deliver the data / information a person
logs in to Reportnet and manually fills in a questionnaire. The result at
international level is small datasets that serve the purpose of compliance
reporting. But in some cases (OECD Questionnaire) they are also used for
creating information. That procedure, however, has been questioned by the MS,
who are asking for another kind of procedure.
Facts
- A small number of records from every National body.
- Very aggregated data
- Slow data flows that take place at most once a year.
- The data has no further value than the intended assessments originally planned.
Example
Reporting obligation for Natura 2000 Standard Data Form (Habitat)
Impacts
- Countries only invest a small amount of resources to collect and report this
data.
- Automated indicator assessment will not be possible. The level of detail in
the data is very low and does not allow any further analysis.
- SEIS will have very little interest in those flows unless more detail can be
provided. However, the Commission has a high interest.
Actions
Reportnet. User dialogue with the COM?
B. Deliveries based on offline applications
[Figure: 2) Reportnet data deliveries. Countries maintain an offline
application and export for delivery in Reportnet. National: Offline systems,
Quality Assurance, Packaging. Reporting / Reportnet: Directory, ROD, CDR, Data
Dictionary, Conversion Services, DMM. International: Quality Assurance,
Packaging, Warehouse.]
Description
The National level keeps a database designed for one or more international
obligations. Those databases are maintained offline at regular intervals and
are then packaged and passed on to Reportnet. The main reason countries do not
run these as online systems is in most cases that updates are relatively
infrequent and that the data is gathered manually from lower levels. The
cost/benefit ratio of these systems is more favourable offline than online.
Facts
- Long manual process to produce the dataset.
- Small number of additional records added to a previously collected list.
(Time series)
- The relatively low update frequency makes it possible to put one person on
the job for a short time. (Example here is one month of work to produce the
dataset from other sources.)
Example
Corine land cover. A process that is based on manual interpretation of
satellite data. It runs at 5-year intervals and is project based, meaning that
a new team is put together to generate each update.
Bathing water. Most countries gather the data during the summer period and
only for the days people go bathing at beaches, lakes or rivers. Most
information is initially gathered manually and at the end of the season entered
as yearly averages into a database.
Impacts
- Countries' investment is higher than in case A, but because of its nature
this flow cannot be handled with the solution of case A.
- Automated indicator assessment could be possible if the indicator can live
with slow data updates for this data flow.
- SEIS will have little impact on those flows. The potential to move into
online systems is greater, but this needs to be checked case by case.
Actions
Reportnet. Moving offline applications into online systems on national level is a
slow process which will take several years and must be looked at on case by
case basis to ensure that it is possible and cost effective. However, EEA should
investigate how to handle these flows when that happens. (Some MS will invest
in online services for their public and EEA should find a way to harvest them).
Create a process that makes data into information (IMS) at the European level
(an information service)
C. Deliveries based on online systems
[Figure: 3a) Reportnet data deliveries. Countries maintain an online system
and export for delivery in Reportnet. National: Online systems, Quality
Assurance, Packaging, System maintenance. Reporting / Reportnet: Directory,
ROD, CDR, Data Dictionary, Conversion Services, DMM. International: Quality
Assurance, Packaging, Warehouse.]
Description
The National level maintains an online system that either collects data
directly from stations, or automatically collects data from regional systems,
or is a central system where everybody logs in and makes their changes online.
Online systems expose this information directly to the internet (open or
secured). Those systems have a team of people who maintain the system. Typical
behavior of online systems is that the data can be changed at any time.
Facts
- Large amount of data. (ex. Monitoring Stations)
- Changes can be provided at any time. (Ex. Industry reporting to PRTR)
- Any system that is based on monitoring stations is most probably based on
online systems. The only issue might be that they are not always connected to
the internet.
- Any centralized National system could become or is already an online system.
Example
Ozone: All countries have an online system and EEA is using those inside
Ozone web. This is the first of its kind, and to be SEIS compliant it should
allow other international or national organisations to make use of that same
setup. The concept proves that near real time is technologically possible.
PRTR: Most probably all countries have moved or are moving this into an online
system, because the cost of maintaining a website and gathering the information
will be lower than doing this offline.
Impacts
- High impact on resources. Most of the time we are talking about a team of
people. The system is the cheapest solution for the necessary and required
data flow.
- This is a technology-driven move from Case B to C. The cost of
implementing this in a concept like B would be much higher, simply because the
update frequency is very high.
- SEIS will have a big impact on those flows, but the datasets could be very
valuable in an assessment process when they are seen as constants, i.e. data
that change only slowly over time.
Actions
Move towards an on line delivery accessible for the European level (D1).
Develop Discovery services that are based on a metadata standard that “goes
beyond” Dublin Core (INSPIRE metadata specifications). Create a process that
makes data into information (IMS) at the European level (an information
service).
D. Deliveries through online system
[Figure: 3b) Delivery happens near real time. Both systems are monitored.
(SEIS) Labels shown: Online systems, Data Protocol and System maintenance on
both sides; Near real time DB; Services; Quality Assurance, Packaging,
Warehouse; Data flow consortium; Conceptual data flow model. Time scales shown:
~Near real time; ~6 months - 2 years.]
Description
In this setup on line deliveries are moved into live systems on both the
national and the European level. Both sides have a team that maintains and
manages those flows. The flows can be automated only when there is a well
established connection between those teams.
Because of the statements above, near real time is the only way data can be
transported. Systems with a lower flow rate would become more costly and
might not have the right amount of resources available to maintain a fully
developed system. In practice this means that the data flows that fit into this
group will be packaged at international level, but also that the countries
have their own packaging process in place.
Facts
- Near real time update between National and International databases.
- Reduced manual costs related to package reporting. This is only true when
such national live systems exist.
- Possibilities for near real time European services and reporting.
- Possible data flows between anybody, including region to region between
different countries.
- Possible return of European services to any national or regional system.
- Common flows towards other international organisations.
- More data than originally requested inside an obligation. (Could be the sum
of several obligations from different international organisations.)
Countries have shown interest in participating in this kind of delivery by
opening up their systems.
Example
Ozone web could be seen as one of the first setups for EEA. Ozone web is now
looking into how it can be opened to other international organisations and pass
the near real time data on further. Ideally that would be done already at the
source. Strong involvement from the providers will be needed to make this
happen, and that could be accommodated by a data flow consortium.
Impacts
- Higher resource impact at International level.
- Need for technical communication between the teams hosting online
systems dealing with the same data flow (Data flow consortium, see below).
Actions
Start the project that moves processes from C towards D. (See below)
Create a process that makes data into information (IMS) at the European level
(an information service)
The “Data flow Consortium” is needed when the data flows affect more
international institutions than just EEA. Its task is to discuss common
technical solutions in order to facilitate and maintain the data flow specified.
E. Deliveries using Sensor Web
[Figure: 4) Sensor web. Sensors alert when a threshold is exceeded. Labels
shown: Online systems, Data Protocol and System maintenance on both sides; Near
real time DB; Services; Quality Assurance, Packaging, Warehouse. Time scales
shown: ~Near real time; ~6 months - 2 years.]
Description
The future data flows based on monitoring stations could evolve into sensor
web systems. Intelligent sensors only send messages when thresholds are
exceeded. The European and the national level can use the same sensor
network and maintain different threshold settings to accommodate their
needs.
Facts
- Very large networks which only pass on information above the preset
threshold.
- Can only be based on online systems. An alert can happen at any time
and any place.
- A strong need for communication and standardisation between stations and
online systems.
Example
- At the European level: Forest monitoring and Flooding
Impacts
- Higher resource impacts at International level than all other options.
- Strong need for communication and agreements with those who manage the
sensor network.
- Excellent setup for alert systems.
Actions
Follow RTD projects
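The idea that the national and the European level share one sensor network but
keep their own threshold settings can be sketched as follows. The gauge
identifier, the threshold values and the function name are illustrative
assumptions, not taken from any existing sensor web standard.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Alert:
    subscriber: str   # which level receives the alert
    sensor_id: str
    value: float
    threshold: float


def alerts_for_reading(sensor_id: str, value: float,
                       thresholds: Dict[str, float]) -> List[Alert]:
    """Return one alert for every subscriber whose own threshold is exceeded."""
    return [
        Alert(subscriber, sensor_id, value, limit)
        for subscriber, limit in thresholds.items()
        if value > limit
    ]


# Hypothetical flood gauge: water level in metres, one threshold per level.
thresholds = {"national": 2.5, "european": 4.0}

quiet = alerts_for_reading("gauge-17", 2.0, thresholds)          # nothing is sent
national_only = alerts_for_reading("gauge-17", 3.1, thresholds)  # national alert only
both = alerts_for_reading("gauge-17", 4.6, thresholds)           # both levels alerted
```

Readings below every threshold produce no message at all, which is what keeps
traffic low on very large networks.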