Earth Observation Payload Data Long Term
Archiving
The ESA Multi-Mission Facility
Infrastructure
Gian Maria Pinna, ESA
Eberhard Mikusch, Manfred Bollner, DLR
Bernard Pruin, Werum Software & Systems AG
PV-2005, 21-23 November 2005
Multi-Mission Facility Infrastructure (MMFI)
Rationale
 In 2003 the Agency approved a strategy for the evolution of the ground segments of its several missions (in operation and/or to be developed) into an open multi-mission architecture, in accordance with the Oxygen concept for harmonized ESA E.O. Ground Segment implementation
 Basic implementation principles:
 Adoption of a common, mission-independent architecture for the generic ground segment of all missions
 Decomposition of the facility architecture into functional block elements
 Identification of mission-specific and common elements
 Harmonization and standardization of interfaces
 Maximum re-use of already proven elements
 Standardization of products and formats across missions
 Adoption of the strategy for all future ESA-handled missions
 Extension of the concept to European national missions
 Harmonization and rationalization of archives
Payload Data Ground Segment Decomposition
[Diagram: each mission (A-D) retains its specific-to-mission elements (processors, acquisition, Q/C, etc.), while examples of multi-mission common elements are shared across missions: Monitoring & Control, Archives, User Services I/F, Data Management, Networks, Products Packaging]
Payload Data Ground Segment Logical Model (OAIS-based)
[Diagram: Data Producers deliver products to PDGS Ingestion, which passes data to PDGS Storage and metadata & browses to PDGS Data Management. Data Consumers place queries & orders through PDGS Consumer Access (Interactive User Services); orders & metadata flow to PDGS Order Processing, which retrieves archived products from Storage and delivers output products to the Consumers. PDGS Administration oversees the whole facility.]
PDGS Services Model
 A PDGS for a generic mission is composed of:
 a Multi-Mission Central Infrastructure component, consisting of all elements required to provide User Services (cataloguing, user access, data ordering, etc.) and Quality Assurance services (payload data quality control, sensor performance assessment, etc.)
 a distributed Multi-Mission Facility Ground Segment (FGS) component, consisting of all elements necessary for the acquisition, ingestion, long-term archiving, order processing and data dissemination to end users.
[Diagram: the FGS Services (Producer & Product Registration, Acquisition, Ingestion, Long Term Archive, Order Processing, Data Dissemination, Data Circulation) exchange data with the Multi-Mission Central Infrastructure (Data Management, User Access, QA)]
FEOMI
 The FEOMI project (Facility Evolution into an Open Multimission Infrastructure) aims at porting ESA's FGSs to an MMFI-based architecture by:
 Analyzing the existing operational requirements
 Consolidating the MMFI architecture in order to satisfy all identified operational requirements
 Implementing the specific configuration in the Core MMFI to support all operational missions
 Developing new elements, or adopting/adapting existing ones, to build the new MMFI in line with the basic concepts and logical model
 Developing a generic infrastructure for the usage of the MMFI by future missions' FGSs
FGS Logical Model Definition
 The FEOMI Requirements Analysis phase highlighted the need for:
 explicit interoperability model elements for data and metadata exchange, to take care of the dynamics of the distributed archive, with federation and co-operation between sites for data exchange, and the synchronization with the central catalogue for metadata and browse images
 the mapping of the model elements onto components that were either already in operation in the ESA FGSs, could be created by configuration of COTS, or required to be custom built
Facility Ground Segment Logical Model
[Diagram: within the Facility Ground Segment, Ingest, Archiving, Processing, Production Request Handling, Exchange, Dissemination and Monitoring & Control functions exchange information packages (SIP, ISIP, AIP, IDIP, DIP) among themselves, and Descriptive Information (DI) with the Cataloguing function of the Multi-Mission Central Infrastructure and with other FGSs]
Legend:
SIP – Submission Information Package
ISIP – Internal Submission Information Package
AIP – Archival Information Package
IDIP – Internal Dissemination Information Package
DIP – Dissemination Information Package
DI – Descriptive Information
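The package flow in the legend can be illustrated with a minimal sketch, assuming a hypothetical ingestion step that turns a producer's SIP into a self-describing AIP carrying its Descriptive Information; all class and field names here are illustrative, not the MMFI interfaces.

```python
from dataclasses import dataclass, field

# Hypothetical models of the OAIS-style packages listed in the legend above.

@dataclass
class InformationPackage:
    product_id: str
    payload: bytes

@dataclass
class SIP(InformationPackage):
    """Submission Information Package as delivered by a data producer."""
    producer: str = "unknown"

@dataclass
class AIP(InformationPackage):
    """Archival Information Package: payload plus descriptive information."""
    descriptive_info: dict = field(default_factory=dict)  # DI for the catalogue

def ingest(sip: SIP) -> AIP:
    # Ingestion validates the SIP, extracts metadata, and wraps the payload
    # into a self-describing AIP for long-term archiving (assumed logic).
    di = {"producer": sip.producer, "size": len(sip.payload)}
    return AIP(product_id=sip.product_id, payload=sip.payload, descriptive_info=di)
```

The same shape would apply to the dissemination side, with a DIP assembled from an AIP on request.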
FGS Integration
 Following the logical model defined by the FEOMI project, the elements required to build the Facility Ground Segment for a specific mission are:
 The so-called Core MMFI, i.e. a set of unconfigured multi-mission elements and services deployed in a suitable infrastructure.
 Several optional Mission Specific Elements (MSEs), accounting for the specificities of each mission's sensor data; as seen before, these are typically processing systems.
 The specific configuration of the MMFI elements to allocate the specific services required by the mission, e.g. the ingestion and cataloguing of the specific datasets generated in the FGS.
Facility Ground Segment Decomposition
[Diagram: the Core MMFI comprises the Data Library (AMS, Local Inventory, Online Archive/SatStore, ULS), Request Handling (POH, PSM-CAR, PSM-PR), Product Distribution (PSM, PFD Central Server, PFD Network Server, E-PFD), the Site Cache with Circulation Cache In/Out, Ingestion (GFE, DRS, NRT), Dissemination, and Monitoring and Control (Monitoring & Alarm, Logging, Operating Tool). Mission-Specific Elements (e.g. PSM-managed Processing Systems) and the MMFI configurations for missions connect Data Producers and Data Consumers to the MM Central Services.]
MMFI Architecture
[Diagram: the Multi-Mission Central Infrastructure (MMMC, MMOHS) sits above the MMFI. The Data Library (Local Inventory, AMS, Online Archive, ULS) connects to Request Handling (POH with CAR and PR interfaces), PSM-managed Processing Systems (IPF and other processors, integrated via GFE), the Product Distributor and PFD, NRT dissemination channels (DDS, ...), the Site Cache with Circulation Cache In/Out, Ingestion, and Monitoring and Control (Monitoring & Alarm, Logging, Operating Tool).]
MMFI Ingestion
 Controlled by the "Generic Front End" (GFE)
 Configurable workflow engine
 Registration of data producers and product types
 Ingestion of products with validation, metadata extraction, browse generation, etc.
 Generation of master catalogue (MMMC) update files
 Standard set of re-usable plug-ins relevant to ingestion
 Standard interface for the implementation of new plug-ins
 Access to data (including binary data) via the Data Request Server (DRS)
 Metadata extraction and browse generation during ingestion, based on the DRS
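A configurable workflow engine with pluggable ingestion steps, as described above, can be sketched as follows; the step functions and the `IngestionWorkflow` class are assumptions for illustration, not the GFE's actual plug-in API.

```python
from typing import Callable

Product = dict  # a product record being ingested (illustrative)

def validate(p: Product) -> Product:
    # consistency check before archival
    assert "id" in p and "data" in p, "invalid product"
    return p

def extract_metadata(p: Product) -> Product:
    p["metadata"] = {"size": len(p["data"])}
    return p

def generate_browse(p: Product) -> Product:
    p["browse"] = p["data"][:8]  # stand-in for browse-image generation
    return p

class IngestionWorkflow:
    """Chains registered plug-ins; new steps conform to one call signature."""
    def __init__(self, steps: list[Callable[[Product], Product]]):
        self.steps = steps

    def run(self, product: Product) -> Product:
        for step in self.steps:
            product = step(product)
        return product

# the workflow is assembled by configuration, mirroring the GFE concept
gfe = IngestionWorkflow([validate, extract_metadata, generate_browse])
```

The point of the standard plug-in interface is that a mission-specific step (e.g. a custom metadata extractor) slots into the same chain without touching the engine.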
MMFI Data Library
 Based on the "Archive Management System" (AMS) and the "Local Inventory" (LI)
 The AMS manages the actual long-term archiving of the data products
 Abstraction layer toward the underlying storage technology
 a single storage technology is adopted by ESA for its EO missions, in order to achieve maximum harmonization and standardization
 Automation of operations
 On-line access via client-server services
 Internal storage organization transparent to clients
 Management of clients' data access rights
 Product subsetting to handle HBR data with EAST and DRB
 The LI is the local catalogue indexing all products archived in the long-term archive at the centre
 Indexing and management of the on-line archive
 UpLoad Server (ULS) for metadata synchronization with the MMMC
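The storage-abstraction idea behind the AMS can be sketched as a front end that archives and retrieves by product id while hiding the concrete back end; class and method names here are illustrative, not the AMS interfaces.

```python
class StorageBackend:
    """Interface toward the underlying storage technology (e.g. an HSM)."""
    def write(self, key: str, data: bytes) -> None:
        raise NotImplementedError
    def read(self, key: str) -> bytes:
        raise NotImplementedError

class InMemoryBackend(StorageBackend):
    """Toy back end; a real one would wrap an HSM or object store."""
    def __init__(self):
        self._store: dict[str, bytes] = {}
    def write(self, key: str, data: bytes) -> None:
        self._store[key] = data
    def read(self, key: str) -> bytes:
        return self._store[key]

class ArchiveManagementSystem:
    """Clients see only archive/retrieve; a back-end swap leaves them untouched."""
    def __init__(self, backend: StorageBackend):
        self._backend = backend
    def archive(self, product_id: str, data: bytes) -> None:
        self._backend.write(product_id, data)
    def retrieve(self, product_id: str) -> bytes:
        return self._backend.read(product_id)
```

This is the layering the System Design Features slide relies on: a change of storage technology is resolved behind the `StorageBackend` boundary.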
MMFI Request Handling
 Based on the "Product Order Handling" (POH) system
 Handles the on-demand production and dissemination requests received from ESA's Multi-Mission Order Handling System (MMOHS)
 Organizes the required workflow based on the product type and output medium requested in the order
 Scheduling of data validation and retrieval from the Data Library, processing, QC verification, and dissemination to users
 Reports Production Request statuses back to the MMOHS
 The POH is supported by a set of auxiliary components that interface with other MMFI elements and provide specific functionality for workflow management
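Deriving a workflow from the ordered product type and output medium might look like the sketch below; the step names follow the slide, but the function, the product-type convention, and the branching rules are assumptions, not the POH's.

```python
def build_workflow(product_type: str, medium: str) -> list[str]:
    """Assemble an ordered list of workflow steps for one production request."""
    steps = ["validate_order", "retrieve_from_data_library"]
    if product_type.startswith("L2"):
        # higher-level products require a processing + QC step before delivery
        steps += ["process", "qc_verification"]
    if medium == "electronic":
        steps.append("disseminate_network")
    else:
        steps.append("generate_medium")  # tape or CD/DVD
    steps.append("report_status_to_MMOHS")
    return steps
```

For example, an order for a Level 2 product delivered electronically yields retrieval, processing, QC, network dissemination, and a final status report to the MMOHS.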
MMFI Dissemination
 "Product Formatting and Delivery" (PFD)
 Dissemination workflow management handling different dissemination channels in a configurable manner
 Optional reformatting and compression (different methods available, including wavelet/JPEG2000)
 Formatters with specific or generic plug-ins (DRB available for enhanced flexibility)
 Concurrent dissemination orders with priority handling
 Media generation on tape drives or CD/DVD: small tape libraries for tapes, and a Rimage CD/DVD producer
 Generation of a unique medium id with barcode printout, back inlay and delivery note for the end user
 PFD Network Server addition to perform electronic delivery, with dynamic generation of a random account and notification to the user by e-mail; the URL for the user is notified back to the POH/MMOHS
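The electronic-delivery step (random account plus e-mail notification) can be sketched as below; the account format, server name and message text are assumptions for illustration, not the PFD Network Server's conventions.

```python
import secrets

def create_delivery(order_id: str) -> dict:
    """Create a one-off download account for an order (hypothetical scheme)."""
    user = f"dl_{order_id}"
    password = secrets.token_urlsafe(8)        # random one-time credential
    url = f"ftp://pfd.example.org/{user}/"     # illustrative server name
    return {"user": user, "password": password, "url": url}

def notification_mail(delivery: dict) -> str:
    """Body of the e-mail sent to the user; the URL also goes back to POH/MMOHS."""
    return (f"Your products are ready for download at {delivery['url']} "
            f"(user: {delivery['user']}).")
```
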
MMFI Dissemination (2)
 "Product Distributor" (PD)
 Based on the PSM (Processing System Management) workflow engine
 Systematic dissemination management (subscription and standing-request handling)
 Systematic product dissemination (by timers and triggers)
 Dissemination via PFD
 Dissemination via DDS and other satellite multicast systems
 Direct dissemination to ftp accounts and file locations
 Product circulation to ftp accounts and file locations
 Product circulation management with state-based circulation control
 Dissemination and circulation reporting
MMFI Processing
 PSM (Processing System Management)
 Its main function is the abstraction of the processing systems from the higher-level elements of the MMFI
 Allows mission- and sensor-specific processing facilities to be integrated with minimal effort, offering a choice of protocols for the definition of the MMFI-processing facility interface
 Also used in other elements, thanks to its workflow model and interface with the LI, which make it able to perform complex scheduling of other elements:
 PSM-CAR (Check And Release)
 PSM-PR (Product Retrieval)
 PSM-PD (Product Distributor)
 Powerful subscription mechanism to the LI, based on OQL, to be notified of data appearing in the archive (systematic processing, reprocessing, circulation, etc.)
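The subscription mechanism can be sketched as follows: clients register a query over product metadata and are notified when a matching product is archived. The real MMFI expresses queries in OQL against the LI; here a plain Python predicate stands in for the query language, and all names are illustrative.

```python
from typing import Callable

class LocalInventory:
    """Toy local catalogue with subscription-based notification."""
    def __init__(self):
        self._subscribers: list[tuple[Callable[[dict], bool],
                                      Callable[[dict], None]]] = []

    def subscribe(self, query: Callable[[dict], bool],
                  callback: Callable[[dict], None]) -> None:
        self._subscribers.append((query, callback))

    def put_item(self, product: dict) -> None:
        # archiving a product triggers every subscription whose query matches,
        # which is how systematic processing/circulation is kicked off
        for query, callback in self._subscribers:
            if query(product):
                callback(product)

notified: list[dict] = []
li = LocalInventory()
# e.g. a PSM workflow subscribing to Level 0 products for systematic processing
li.subscribe(lambda p: p["level"] == "L0", notified.append)
li.put_item({"id": "A", "level": "L0"})
li.put_item({"id": "B", "level": "L2"})
```

Only the Level 0 product triggers the callback, mirroring the data-driven workflows shown in the next slides.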
Systematic data-driven reprocessing
[Sequence diagram with participants gfe:GFE, drs:DRS, li:LI, ams:AMS, ps:PSM_B, ips:Processor, mmmc:MMMC; the preceding interaction is suppressed]
1: subscribe()
2: putItem(x)
3: notify()
4: putProcessingRequest()
5: getItem()
6: productRetrieval()
7: start()
8: done
9: smartPolling()
10: productInspection()
11: browseGeneration()
12: metadataExtraction()
13: archiveKey()
14: productArchiving()
15: putItem(register)
16: exchange(add)
MMFI Use Cases - Circulation
[Sequence diagram with participants gfe1:GFE, li1:LI, ams1:AMS, pd1:PD at the source centre and gfe2:GFE, li2:LI, ams2:AMS at the destination centre; the preceding interaction is suppressed]
o1: subscribe()
1: subscribe()
2: putItem(x)
3: notify()
4: putCirculationRequest()
5: getItem()
6: productRetrieval()
7: create() pr:ISIP
8: smartPolling()
9: ingest
10: productArchiving()
11: putItem(register)
o2: notify()
o3: circulationAcknowledge()
o4: destroyItem()
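The circulation use case (product retrieved at the source, packaged as an ISIP, ingested and archived at the destination, then acknowledged) can be reduced to a minimal sketch; the `Centre` class and `circulate` function are assumptions for illustration, not the MMFI interfaces.

```python
class Centre:
    """Toy stand-in for an FGS centre with its own archive."""
    def __init__(self, name: str):
        self.name = name
        self.archive: dict[str, bytes] = {}

    def ingest(self, isip: dict) -> None:
        # destination-side ingestion archives the circulated product
        self.archive[isip["product_id"]] = isip["payload"]

def circulate(product_id: str, source: "Centre", destination: "Centre") -> bool:
    """Package a product as an ISIP, ingest it remotely, return the acknowledge."""
    isip = {"product_id": product_id, "payload": source.archive[product_id]}
    destination.ingest(isip)
    # the destination acknowledges once the product is safely archived;
    # only then may the source destroy the transfer item (o3/o4 above)
    return product_id in destination.archive
```
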
MMFI Advanced Features
 The MMFI implements various features covering preservation and value-adding concepts:
 Support for operational scenarios for preservation strategies, e.g. periodic migration of digital products to new information technology
 Encapsulation in self-describing items as defined by the OAIS model information packages; metadata maintenance is performed and means for ensuring data consistency are supplied
 Support for automated production by means of sophisticated data access and processing management
 Modular design, open architecture and streamlined interfaces that permit easier substitution of one or more of its elements, should the need arise in the future, to ensure the long-term preservation of its data holdings and of its services
 Special attention was paid in the architecture to a well-balanced assignment of functionality to components.
MMFI System Design Features
 Archiving technology migration, by shielding the physical data-sets from the applications using several software layers that can all be used to handle lower-level technology changes:
 Hierarchical Storage Management (HSM) incorporated in the AMS. A potential change of the HSM technology would be fully resolved within the AMS.
 Limited number of components directly interfacing the AMS, so that a change of the AMS or its interfaces could also be handled with minimal archive impact.
 Technological evolution and scalability through modularity, achieved by an architecture largely built from autonomous, networked components that can be combined by configuration:
 Simplified exchange of some components for other implementations
 Handling of increased load by instantiating more components in optimised configurations (e.g. increasing the number of processing nodes)
 Processor integration, to cope with changes to the Data Processors used for adding value to the EO data:
 The PSM framework supports the integration of processors by natively supporting a variety of processor interfaces and by allowing processor adapters to be integrated
 The flexibility of this approach makes it possible to substitute processors and processor interfaces without undue effort and without affecting other parts of the participating workflows
 Product data model migration, to cope with the evolution of processing algorithms that requires changes in the data models of the products to be archived:
 Configurable product object models within the Data Library that can be extended
MMFI Functional Features
Cataloguing and archiving – Basic feature to consistently manage data and metadata for the long term. In addition, advanced access capabilities are available to search and retrieve products for automated production.
Automated product ingestion – During automated product ingestion, data products are checked for consistency before archival. Metadata are extracted and, where applicable, browse images are generated for catalogue applications.
Order-driven processing and delivery – The classic dissemination workflow initiated by a user order. Optionally, a value-adding processing step may occur before delivery.
Systematic data-driven processing – The capability to automatically initiate processing workflows for higher-level products upon the reception of a lower-level product.
Systematic re-processing – Used to generate a new revision of a product collection after a processing algorithm or configuration update. It is a processing schema with data from/to the archive.
Systematic dissemination – Subscription-type systematic dissemination is similar to systematic data-driven processing, with the difference that the newly arrived products are not processed but delivered to one or more customers.
Online archive access – Allows direct retrieval of the product data with a file-based transfer protocol.
Data circulation – Distributes the data between centres to serve data migration purposes, including auxiliary products for remote processing.
MMFI Systematic Workflows
[Diagram: the Data Library (Local Inventory, AMS, Online Archive) is the hub of the systematic workflows. A Level 0 notification from acquisition triggers the PSM-managed Processing Systems (IPF, other processors) to generate higher-level products; a Level x notification triggers the Product Distributor for systematic dissemination via PFD, the Circulation Cache Out, or direct user-access download. Re-processing reads archived products (V1.0) and archives new versions (V2.0, or V1.0+2.0) back to the Data Library.]