"HST DATA MANAGEMENT SYSTEM" Dr Rosa Diaz, Space Telescope Science Institute

March 17, 2010
HST Data Management System
Overview:
• What is DMS
• What we get from HST
• Processing HST Data
• The HST Pipeline
• New Infrastructure
• Archive
The DMS System
• The Data Management System (DMS) is responsible
for some of the development of the data processing
and archive systems at STScI. These systems include
the Data Archive and Distribution System (DADS), StarView,
and the web interface to the Multimission Archive at Space
Telescope (MAST).






Science Data Receipt Pipeline (PACOR)
CDBS Pipeline
Pre-Archive Science Pipeline
OTFR Pipeline
Science End-to-End testing
Sample Pipeline
From HST to ground
Tracking and Data Relay Satellite System (TDRSS)
• Average of ~20 TDRSS contacts/day using TDRS East and West
• Average of ~7 SSA returns/day needed for SSR dumps
SI C&DH (SSR)
From HST to ground
[Diagram labels: SI C&DH (SSR); Tracking and Data Relay Satellite System (TDRSS); White Sands, New Mexico]
From ground to STScI
• White Sands, New Mexico
• Domestic satellites
• Network Control Center, Goddard Space Flight Center, MD
• Ground connection
• STScI
On average, data reach STScI about 6 hours after they were taken
First stages of data manipulation
• Science data arrives at STScI in the form of telemetry packages
  • Science data is converted to FITS files, calibrated, and archived
  • Engineering data is stored in a separate place in the SSR and
    downlinked at a different time
  • Science data needs information from the engineering data and
    cannot be calibrated until the engineering data is received
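The dependency between science and engineering data can be pictured as a simple gate: a science dataset is held back until engineering data covering its exposure has arrived. The following is only a minimal sketch of that idea. The class, the field names, and the rule of matching by time window are assumptions made for illustration, not the actual pipeline logic.

```python
from dataclasses import dataclass


@dataclass
class ScienceDataset:
    name: str      # hypothetical dataset identifier
    start: float   # exposure start (arbitrary time units)
    end: float     # exposure end (same units)


def ready_for_calibration(dataset, engineering_windows):
    """Return True if a received engineering window covers the exposure.

    engineering_windows is a list of (start, end) tuples describing the
    engineering data already downlinked (an assumed representation).
    """
    return any(w_start <= dataset.start and dataset.end <= w_end
               for w_start, w_end in engineering_windows)


def split_queue(datasets, engineering_windows):
    """Partition datasets into (ready, waiting) lists, preserving order."""
    ready, waiting = [], []
    for ds in datasets:
        if ready_for_calibration(ds, engineering_windows):
            ready.append(ds)
        else:
            waiting.append(ds)
    return ready, waiting
```

Datasets left in the waiting list would simply be checked again once the next engineering downlink is received.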
The Archive Operations/pipeline
[Flow diagram. Labels: USER; OTFR (Yes / No); Archive; POD Files; Generic Conversion; CALXXX; DB Catalog; Safe Store; Reprocessing; Ingest; Mirror sites]
DMS Operations- Functional Architecture
[Diagram. Labels: OPUS; Condor; NHPPS; Workflow]
The software release process
[Flow diagram. Labels: CALXXX development; CALXXX (STSDAS); INS Team Testing; CALXXX to OPUS; OPUS Development; OPUS Testing; Regression Test; OPUS release. Outcomes shown: Pass; CALXXX issue; OPUS fail]
New Operations Server Architecture
Note 1: Virtualized Development Environments successfully deployed
Note 2: Test/Processing hardware identical to Operations systems
and may function as failover (future clustering options)
Note 3: FY11 purchase of additional Operation Database server
for external service read access
HST DMS Storage EMC CX4-480
- HST Primary Archive on SAN CX-4 storage
- New Linux file systems
- New Windows MS SQL database file systems on Fibre Channel drives
- Replacing 1TB drives with 2TB to reclaim tray space
[Diagram labels: Sunfire 15K dev, test, ops; HLSP; HST MSR]
Multimission Archive
OTFR & Static Archive

Instrument   OTFR        Static Archive
ACS          YES
COS          YES
FOC                      YES
FOS                      YES
GHRS                     YES
HSP                      YES
NICMOS       YES
STIS         Post SM4    Pre SM4
WFC3         YES
WFPC                     YES
WFPC2                    YES

Mission      Description
BEFS         Berkeley Extreme and FUV Spectrometer
Copernicus   (FUV + NUV spectra)
DSS          Digitized Sky Survey
EUVE         Extreme UV Explorer
FUSE         FUV Explorer
GALEX        Galaxy Evolution Explorer
GSC          Guide Star Catalog
HPOL         Spectrometer
HUT          Hopkins UV Telescope
IMAPS        Interstellar Medium Absorption Profile Spectrograph
IUE          International UV Explorer
KEPLER
TUES         Tübingen UV Echelle Spectrometer
SDSS         Sloan Digital Sky Survey
UIT          UV Imaging Telescope
VLA          Very Large Array
WUPPE        Wisconsin UV Photopolarimeter Experiment
http://archive.stsci.edu/hst/
http://starview.stsci.edu/web/
HST data volume
• HST orbits the Earth in 96-97 minutes
• 104 orbits per week
• Only about 80 orbits are used to take science data
• About 16 GB of data are received per day
• In January 2010 a total of 497.0 GB were archived (16.03 GB/day)
  and 3486.9 GB were retrieved from the archive (112.48 GB/day)
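The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below assumes an orbital period of 96.5 minutes (the midpoint of the quoted 96-97 minute range) and a 31-day January. The function names are illustrative only.

```python
MINUTES_PER_WEEK = 7 * 24 * 60   # 10080 minutes
JANUARY_DAYS = 31


def orbits_per_week(period_minutes):
    """Number of orbits completed in one week for a given orbital period."""
    return MINUTES_PER_WEEK / period_minutes


def daily_rate(total_gb, days):
    """Average volume per day, in GB."""
    return total_gb / days


# Assumed midpoint of the quoted 96-97 minute period.
print(round(orbits_per_week(96.5), 1))               # 104.5
print(round(daily_rate(497.0, JANUARY_DAYS), 2))     # 16.03
print(round(daily_rate(3486.9, JANUARY_DAYS), 2))    # 112.48
# Retrieved volume relative to archived volume in January 2010.
print(round(3486.9 / 497.0, 1))                      # 7.0
```

The last line shows that about seven times more data were retrieved from the archive than were ingested during that month.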
Size of HST data per instrument
Instrument   MB/dataset
ACS          140
COS          258
STIS         29.3
WFC3         115
Data volume processed by the STScI Archive


• Calibration pipelines are available for all the instruments,
  including the legacy instruments.
• Data can be reprocessed when new calibration data become
  available.
[Chart: HST Archive Activity. Series: Total Retrievals, Science Retrievals, Ingest. Vertical axis from 0 to 6000.]
Other missions supported by the Multimission Archive include Kepler and GALEX
[Chart: Average Number of Requests per week (average daily values for each week)]
[Storage diagram. Labels: >500 TB in use; DEPOT; Tier 1; Tier 3; EMD; MAST (GALEX, HLA, GSC, DSS); 62 TB]
Tiered Storage Solutions
• TIER 1: High performance Random and Sequential I/O
  - Database Access for Catalogs and Large Scale Indexed datasets
  - GSC2, DSS, HLA catalogs and footprints, GALEX
• TIER 2: Online mid level performance for data access
  with high reliability (HRAS)
  - HST DEPOT and OTFR Calibration Files (50TB w/SM4)
  - JWST Primary Archive (100TB)
• TIER 3: Lower Cost with load balance and failover
  - HLA and MAST Data Product Files, High Capacity (0.5 Petabyte)
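The tier assignments above can be written down as a small lookup table, which makes it easy to ask which tier a given holding lives on. This is a sketch only: the dictionary layout, the holding names used as keys, and the `tier_for` helper are assumptions for illustration. The capacities are the ones quoted on the slide, with 0.5 Petabyte taken as 500 TB.

```python
# Tier descriptions and holdings as listed on the slide.
# Capacities (TB) are included only where the slide states them.
TIERS = {
    1: {
        "role": "High performance random and sequential I/O",
        "holdings": {"GSC2": None, "DSS": None,
                     "HLA catalogs and footprints": None, "GALEX": None},
    },
    2: {
        "role": "Online mid level performance with high reliability",
        "holdings": {"HST DEPOT and OTFR calibration files": 50,
                     "JWST primary archive": 100},
    },
    3: {
        "role": "Lower cost with load balance and failover",
        "holdings": {"HLA and MAST data product files": 500},
    },
}


def tier_for(holding):
    """Return the tier number that lists the holding, or None if absent."""
    for tier, info in TIERS.items():
        if holding in info["holdings"]:
            return tier
    return None


def stated_capacity_tb(tier):
    """Sum the capacities quoted for a tier, ignoring unstated ones."""
    return sum(c for c in TIERS[tier]["holdings"].values() if c is not None)
```

For example, `tier_for("JWST primary archive")` returns 2 and `stated_capacity_tb(2)` returns 150.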
Phased transition – Stage 1
[Architecture diagram. Recoverable labels:
 SAN: Fibre Channel switch; new SAN 8 Gb Fibre Channel switch; PORT A/B; PORT A
 Kepler 15K (18 slots, 10 boards): Kepl. Ops (n); Kepl. Test (m); Tbd
 HST 15K (18 slots, 72 CPUs): Pipeline (32); Code Dev (12); DB1 Ops (4); DB2 Ops (4); Test (20); OS Test (4); DB Test (4)
 Servers (some marked NEW): HST Dev Linux Server; HST Test Linux Compute Cluster (Pipeline, DADS, DADS/Pipeline); Windows DB Server (Dev DB); Windows Test Databases
 Storage: HST Storage EMC Symmetrix 32 TB; Kepler Storage EMC Clarion 60 TB; HST Storage EMC CX-4 48 TB
 Storage areas: DB1 Ops; DB2 Ops; Kepl. Ops; Kepl. Test; Code Dev; Test; DB Test; Data Depot; STIS; WFPC2; HST Pipeline; Migrate
 Legend: HST DMS; Kepler DMC]
Architecture Development: Infrastructure Testing and Validation
Long Term - HST DMS Operations Architecture
[Architecture diagram. Recoverable labels:
 SAN: 8 Gb Fibre Channel switch
 Kepler 15K (18 slots, 10 boards): Kepl. Ops (n); Kepl. Test (m); Tbd
 Linux clusters: HST Dev; HST Test; HST OPS (DADS, Pipeline, DADS/Pipeline)
 Windows Database: DB1 Ops (4); DB Test (4); Test Databases; OPS Databases
 Storage: HST Storage EMC CX-4 48 TB; Kepler Storage EMC Clarion 60 TB; HST HLA EMC Clarion 40 TB; MAST/HLA EMC AX4 120 TB; CentralStore EMC Clarion 162 TB
 Storage areas: DB1 Ops; DB2 Ops; Kepl. Ops; Kepl. Test; HLA; HST; GALEX; DSS; MAST; Code Dev; General Test; DB Test; Science; General Use; STIS; WFPC2; Data Depot; HST Pipeline
 Other systems: HLA Servers; TIB Server; Other Servers
 Legend: HST DMS; Kepler DMC; MAST/HLA; STScI general
 Annotation: Sunfire 15K Systems Cutoff]