JSOC Pipeline Processing System Components

HMI-AIA Joint Science Operations Center Science Data Processing
a.k.a. JSOC-SDP
Overview
LWS Teams Day JSOC Overview
HMI & AIA JSOC Concept
[Diagram: HMI & AIA JSOC concept, data flow among sites and subsystems]
Site labels: GSFC, White Sands, SDOGS, DDS, MOC, Stanford, LMSAL
Subsystem labels: HMI JSOC Pipeline Processing System; Redundant Data Capture System; Housekeeping Database; Quicklook Viewing; Primary Archive; 10-Day Archive; Offsite Archive; Offline Archive; Local Archive; Catalog; HMI & AIA Operations; Data Export & Web Service; AIA Analysis System; High-Level Data Import
World (data users): Science Team, Forecast Centers, EPO, Public
JSOC Dataflow Rates
[Diagram: JSOC dataflow, rates in GB/day]
Node labels: LMSAL secure host; Hk; Joint Ops; Quick Look; Data Capture (redundant data capture system); HMI & AIA Science; Level 0 (HMI & AIA); Level 1 (HMI); HMI High Level Processing; Data Exports; LMSAL Link (AIA Level 0, HMI Magnetograms); SDO Scientist & User Interface
Processor labels: 2 processors each; 2 processors; 16 processors; c. 200 processors; 2 processors
Storage labels: 30d cache, 40TB each; Online Data, 325TB+50TB/yr; Science Archive, 440TB/yr (Offsite); HMI Science Analysis Archive, 650TB/yr
Flow labels (GB/day): 0.04, 75, 240, 1200, 1210, 1230, 1610, 1820; one path is marked "rarely needed"
JSOC-SDP Major Components
JSOC DRMS/SUMS Basic Concepts
• Each “image” is stored as a record in a data “series”.
• There will be many series, e.g. hmi_ground.lev0 is ground test data.
• The image metadata is stored in a relational database: our Data Record Management System (DRMS).
• The image data is stored in SUMS (Storage Unit Management System), which itself has database tables to manage its millions of files.
• SUMS owns the disk and tape resources.
• Users interact with DRMS via a programming language, e.g. C, FORTRAN, IDL.
• The “name” of a dataset is actually a query in a simplified DRMS naming language that also allows general SQL clauses.
• Efficient use of the system relies on direct use of DRMS.
• Data may be exported from DRMS as FITS or other (TBD) protocols for remote users.
• Several Remote DRMS (RDRMS?) sites will be established which will “subscribe” to series of their choice. They will maintain RSUMS containing their local series and cached JSOC series.
• The JSOC may act as an RDRMS to access products made at remote sites.
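To make the "dataset name is a query" idea concrete, here is a minimal Python sketch that splits a name such as hmi_ground.lev0[][2007.09.09_23:50/1m] into a series name and its bracketed clauses. The function parse_record_set is hypothetical and is not part of DRMS; the real naming language (including its SQL clauses) is richer than this, and what each bracket means is left to DRMS.

```python
import re

def parse_record_set(name):
    """Split 'series[clause][clause]...' into (series, [clauses]).

    Illustration only: handles flat, non-nested brackets, not the full
    DRMS naming language.
    """
    match = re.fullmatch(r"([A-Za-z_][\w.]*)((?:\[[^\[\]]*\])*)", name)
    if match is None:
        raise ValueError(f"not a recognizable dataset name: {name!r}")
    series, bracket_part = match.groups()
    # Each [...] group becomes one clause; an empty [] yields ''.
    clauses = re.findall(r"\[([^\[\]]*)\]", bracket_part)
    return series, clauses

series, clauses = parse_record_set("hmi_ground.lev0[][2007.09.09_23:50/1m]")
print(series)   # hmi_ground.lev0
print(clauses)  # ['', '2007.09.09_23:50/1m']
```

The same function applied to "hmi_ground.lev0[566686]" returns a single clause, '566686'.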
JSOC Pipeline Processing System Components
[Diagram: pipeline processing system components]
Operator side: Pipeline Operator; Pipeline processing plan; PUI (Pipeline User Interface); Processing script, “mapfile”; List of pipeline modules with needed datasets for input, output
Program side: Pipeline Program, “module”; JSOC Science Libraries; Utility Libraries; DRMS Library (Record Management, Keyword Access, Link Management, Data Access, Record Cache)
Services: DRMS (Data Record Management System); SUMS (Storage Unit Management System); Database Server; Processing History Log
Storage: SUMS Disks; SUMS Tape Farm
Simple example – find and look at an image
Example of a simple utility “module” called “show_keys”.
First find images in the minute starting 9 Sept 2007 at 23:50, then look at one with ds9.
P%
P% show_keys "ds=hmi_ground.lev0[][2007.09.09_23:50/1m]" key=FSN,T_OBS
FSN T_OBS
0566684 2007.09.09_23:50:01_UTC
0566685 2007.09.09_23:50:06_UTC
0566686 2007.09.09_23:50:11_UTC
0566687 2007.09.09_23:50:16_UTC
0566688 2007.09.09_23:50:21_UTC
0566689 2007.09.09_23:50:26_UTC
0566690 2007.09.09_23:50:31_UTC
0566691 2007.09.09_23:50:36_UTC
0566692 2007.09.09_23:50:41_UTC
0566693 2007.09.09_23:50:46_UTC
0566694 2007.09.09_23:50:51_UTC
0566695 2007.09.09_23:50:56_UTC
P%
P% ds9 `show_keys "ds=hmi_ground.lev0[566686]" seg=file -p -q`
P%
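As a small follow-up to the listing above, this Python sketch parses the FSN and T_OBS columns and checks the spacing between frames. It only processes the text printed on this slide; the timestamp format string is inferred from the values shown, and nothing here calls DRMS.

```python
from datetime import datetime

# FSN and T_OBS values copied from the show_keys output on this slide.
listing = """\
0566684 2007.09.09_23:50:01_UTC
0566685 2007.09.09_23:50:06_UTC
0566686 2007.09.09_23:50:11_UTC
0566687 2007.09.09_23:50:16_UTC
0566688 2007.09.09_23:50:21_UTC
0566689 2007.09.09_23:50:26_UTC
0566690 2007.09.09_23:50:31_UTC
0566691 2007.09.09_23:50:36_UTC
0566692 2007.09.09_23:50:41_UTC
0566693 2007.09.09_23:50:46_UTC
0566694 2007.09.09_23:50:51_UTC
0566695 2007.09.09_23:50:56_UTC
"""

def parse_listing(text):
    """Return a list of (fsn, t_obs) tuples from two-column text."""
    rows = []
    for line in text.splitlines():
        fsn, t_obs = line.split()
        rows.append((int(fsn), datetime.strptime(t_obs, "%Y.%m.%d_%H:%M:%S_UTC")))
    return rows

rows = parse_listing(listing)
# Seconds between consecutive frames.
gaps = [(b[1] - a[1]).total_seconds() for a, b in zip(rows, rows[1:])]
print(len(rows), set(gaps))  # 12 {5.0}
```

The twelve records in the one-minute window are evenly spaced 5 seconds apart, with consecutive FSN values.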
JSOC Export
• JSOC will support VSO access.
• JSOC will also have direct web access.
• There will be remote DRMS/SUMS systems at key Co-I institutions.
• ALL HMI and AIA data will be available for export at level-0 through standard products (level-1 for both and level-2 for HMI).
• It would be unwise to expect to export all of the data. It is simply not a reasonable thing to expect and would be a waste of resources.
• Our goal is to make all useful data easily accessible.
• This means “we” must develop browse and search tools to help generate efficient data export requests.
DRMS/SUMS Configuration
• DRMS and SUMS use the open-source PostgreSQL database engine.
• DRMS will run on a pair of dedicated servers, likely with 4 quad-core processors and up to 10TB of fast disk.
• SUMS will consist of file servers with attached tape systems.
• SUMS will manage 200TB of cache disk, with 150TB/year of permanent archive for level-1 and higher-level products.
• The SUMS archive will use LTO-4 tapes in a robotic system with at least 10 drives and 2000 tapes near-line.
• The pipeline processing system will have about 50 processor cores dedicated to level-0 to level-1 processing and about 450 cores for higher-level processing in the pipeline. HMI standard products will need about half of these.
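For a rough sense of scale, the tape figures above can be turned into a back-of-envelope estimate. This Python sketch assumes the LTO-4 native (uncompressed) capacity of 800 GB per cartridge, decimal units, a single copy, and only the 150TB/year level-1 and higher stream named on this slide; those assumptions are not stated in the slides.

```python
import math

LTO4_NATIVE_GB = 800        # assumption: native LTO-4 capacity, no compression
TAPES_NEARLINE = 2000       # from the slide
ARCHIVE_TB_PER_YEAR = 150   # from the slide (level-1 and higher products)

# Total near-line capacity if every cartridge is filled to native capacity.
nearline_capacity_tb = TAPES_NEARLINE * LTO4_NATIVE_GB / 1000
# Whole cartridges consumed per year by the permanent archive stream.
tapes_per_year = math.ceil(ARCHIVE_TB_PER_YEAR * 1000 / LTO4_NATIVE_GB)

print(nearline_capacity_tb)  # 1600.0
print(tapes_per_year)        # 188
```

Under these assumptions the near-line robot holds about 1600 TB, and the level-1 and higher archive alone would use about 188 cartridges per year.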
Extra Info
Pipeline client-server architecture
[Diagram: pipeline client-server architecture]
Pipeline client process: Analysis code (C/Fortran/IDL/Matlab); JSOC Library; Generic file I/O; Data Segment I/O
JSOC Library calls: OpenRecords, CloseRecords; GetKeyword, SetKeyword; GetLink, SetLink; OpenDataSegment, CloseDataSegment
Data Record Management Service (DRMS): Record Cache (Keywords+Links+Data paths); reached from the client via the DRMS socket protocol
Storage Unit Management Service (SUMS): AllocUnit, GetUnit, PutUnit; Storage unit transfer; JSOC Disks; Tape Archive Service
PostgreSQL Database Server (reached via SQL queries): Series Tables; Record Catalogs; Record Tables; Storage Unit Tables
Pipeline batch processing
• A pipeline batch is encapsulated in a single database transaction, a “DRMS session”:
– If no module fails, all data records are committed and become visible to other clients of the JSOC catalog at the end of the session.
– If a failure occurs, all data records are deleted and the database is rolled back.
– It is possible to commit data produced up to intermediate checkpoints during sessions.
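The all-or-nothing behaviour described above can be sketched with an ordinary database transaction. This Python example uses the standard-library sqlite3 module as a stand-in for the PostgreSQL-backed DRMS service; the table name records and the helpers run_batch, make_record, and failing_module are hypothetical, and SUMS file cleanup is not modeled.

```python
import sqlite3

def run_batch(conn, modules):
    """Run all modules in one transaction: commit everything or nothing."""
    try:
        for module in modules:
            module(conn)
        conn.commit()      # records become visible only if every module succeeded
        return True
    except Exception:
        conn.rollback()    # discard every record written during this batch
        return False

def make_record(name):
    """Build a toy module that writes one output record."""
    def module(conn):
        conn.execute("INSERT INTO records(name) VALUES (?)", (name,))
    return module

def failing_module(conn):
    raise RuntimeError("simulated module failure")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records(name TEXT)")
conn.commit()

ok = run_batch(conn, [make_record("a"), make_record("b")])
failed = run_batch(conn, [make_record("c"), failing_module])
names = [row[0] for row in conn.execute("SELECT name FROM records ORDER BY name")]
print(ok, failed, names)  # True False ['a', 'b']
```

Record "c" was written before the failure but does not survive the rollback. An intermediate checkpoint would correspond to calling commit partway through the module list.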
[Diagram: Pipeline batch = atomic transaction]
Sequence: Register session; Module 1; Module 2.1 and Module 2.2; …; Module N; Commit Data & Deregister
Each step goes through the DRMS API, reading input data records and writing output data records.
DRMS Service = Session Master, backed by the Record & Series Database and SUMS.
HMI module status and MDI heritage
[Diagram: HMI primary observables, intermediate and high level data products, and code status]
Primary observables: Doppler Velocity; Stokes I,V; Continuum Brightness; Stokes I,Q,U,V
Intermediate and high level data products: Heliographic Doppler velocity maps; Spherical Harmonic Time series; Mode frequencies and splitting; Tracked Tiles of Dopplergrams; Ring diagrams; Local wave frequency shifts; Time-distance Cross-covariance function; Wave travel times; Egression and Ingression maps; Wave phase shift maps; Internal rotation; Internal sound speed; Full-disk velocity, sound speed maps (0-30Mm); Carrington synoptic v and cs maps (0-30Mm); High-resolution v and cs maps (0-30Mm); Deep-focus v and cs maps (0-200Mm); Far-side activity index; Line-of-sight Magnetograms; Full-disk 10-min Averaged maps; Tracked Tiles; Vector Magnetograms, Fast algorithm; Vector Magnetograms, Inversion algorithm; Line-of-Sight Magnetic Field Maps; Vector Magnetic Field Maps; Coronal magnetic Field Extrapolations; Coronal and Solar wind models; Solar limb parameters; Brightness Images; Brightness feature maps; Tracked full-disk 1-hour averaged Continuum maps
Code status legend: MDI pipeline modules exist; Standalone production codes in use at Stanford; Research codes in use by team; Codes being developed in the community; Codes to be developed at HAO; Codes to be developed at Stanford
AIA Level-2
JSOC Data Volumes from Proposal
this version modified to show the links to the hardware plan
Data path     Assumptions                                     Img size (B)  Channels  Cadence (s)  Compress  Processed at  Volume (GB/day)  Combined (GB/day)
In from DDS   HMI: 55,000,000 bps **                          -             -         -            -         SU            553              1,227
              AIA: 67,000,000 bps **                          -             -         -            -         SU            674
Level-0       HMI: 4k*4k*2 bytes/2-seconds*(pi/4)             3.4E+07       2         4            0.39      SU            530              1,610
              AIA: 4k*4k*2 bytes * 8 imgs per 10 seconds      3.4E+07       8         10           0.50      SU            1,080
Level-1       HMI: V,M,Ic @ 45s & B, ld, ff @ 90s*(pi/4)      3.4E+07       5.5       45           0.39      SU            130              1,210
              AIA: Level 1.0 same as level-0                  3.4E+07       8         10           0.50      tbd           1,080
Higher level  HMI: See below                                  7.5E+10       1         86400        1.00      SU            70               286
              AIA (lev1a): movies & extracted regions. @ 20%  6.7E+06       8         10           0.50      LM            216
LMSAL Link    HMI: Magnetograms (M, B)                        3.4E+07       5         90           0.39      na            59               1,193
              AIA: Full Level-0 data+lev1_extract             3.5E+07       8         10           0.50      na            1,134
Export        HMI: 2 * Higher Level products + 5*10 min B     -             -         -            -         SU            149              797
              AIA: 3* higher Level products (TRACE < 1)       -             -         -            -         SU            648
Offsite tape  HMI: tlm                                        -             -         -            -         SU            553
              AIA: tlm                                        -             -         -            -         SU            674
Local tape    HMI: Lev0, Lev-1, All Higher                    -             -         -            -         SU            730
              AIA: Lev0, Lev1a                                -             -         -            -         SU            1,296

Storage columns: Online disk cache days; Fixed disk cache (TB); Perm disk per year (TB); Tape Archive Fraction; Tape per year (TB); Nearline retain days; Nearline Cache (TB)
30 16 200% 395 90 49
30 20 200% 482 90 59
100 52 100% 189 180 93
30 32 100% 386 1,900 2,004
0 0
90 95 10% 39 0 0
25 100% 25 0 0
77 100% 77 0 0
46 0 0 0 0 0
100 6 0 0
100 111 0 0
60 1 0 0
60 6 0 0
100% 198 24
100% 241 30
1,227 2,026 412 93 743 2,004
               Fixed disk cache (TB)  Perm disk per year (TB)  Tape per year (TB)  Nearline cache (TB)
HMI Totals     68                     71                       610                 118
AIA Totals     146                    77                       984                 2,034
Combined (TB)  214                    148                      1,594               2,151
Tape shelf size (TB): 7,968
Tape shelf number of tapes - mixed density: 11,257
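The daily volumes in the HMI and AIA rows of this table can be re-derived from the stated assumptions. The Python sketch below reproduces several of them. Two points are inferences rather than statements from the slides: the tabulated "GB" match a unit of 2^30 bytes, and the HMI compress factor 0.39 is treated as 0.5 × (pi/4), i.e. a 0.50 compression factor times the pi/4 factor that appears in the HMI assumptions. The function names are hypothetical.

```python
import math

GIB = 2**30               # assumption: the table's "GB" are 2^30 bytes
SECONDS_PER_DAY = 86400

def telemetry_gb_per_day(bits_per_second):
    """Daily volume of a continuous telemetry stream."""
    return bits_per_second / 8 * SECONDS_PER_DAY / GIB

def image_gb_per_day(bytes_per_image, channels, cadence_s, compress):
    """Daily volume of an image stream after compression."""
    return bytes_per_image * channels / cadence_s * SECONDS_PER_DAY * compress / GIB

IMG = 4096 * 4096 * 2         # 4k*4k*2 bytes, about 3.4E+07
HMI_FACTOR = 0.5 * math.pi / 4  # assumption: rounds to the tabulated 0.39

print(round(telemetry_gb_per_day(55_000_000)))           # 553  (HMI from DDS)
print(round(telemetry_gb_per_day(67_000_000)))           # 674  (AIA from DDS)
print(round(image_gb_per_day(IMG, 2, 4, HMI_FACTOR)))    # 530  (HMI level-0)
print(round(image_gb_per_day(IMG, 8, 10, 0.50)))         # 1080 (AIA level-0)
print(round(image_gb_per_day(IMG, 5.5, 45, HMI_FACTOR))) # 130  (HMI level-1)
```

Using the rounded value 0.39 directly gives about 526 GB/day for HMI level-0 rather than 530, which is why the unrounded factor is used here.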
Sample of DRMS/SUMS Resource Assessment
JSOC Level-0 processing flow
JSOC-SDP Status
• Capture system complete, awaiting testing with DDS at White Sands
• DRMS and SUMS virtually done
• Level-0 work progressing, to be done by instrument deliveries
• Level-1 work to start after delivery
• Getting serious about work on basic pipeline modules
• Review of status and computer hardware plans in early November
Stanford JSOC effort plan
[Chart: HMI-SU Effort Distribution]
Vertical axis: Percent FTE, 0 to 1000
Horizontal axis: fiscal quarters, FY2006 Q1 through FY2009 Q1
Categories: On-Demand Support; Irradiance; Forecast - farside…; Coronal Inferences; Mag Field - Vector Field; Mag. Field - Line-of-Sight; Level 2 - Local HS Holography; Level 2 - Local HS Ring Diagram; Level 2 - Local HS Time Distance; Level 2 - Global HS; Level 2 - Quick Look; HK & FDS; Level-0; JSOC Verification & Test; Data Quality and Proc. Metadata; General Env.: cvs,os,oracle,compile; Archive h/w arch.; Processing Hardware Arch.; User Tools: API, Data Export, & Browsing; pui; Data Capture; DRMS; SUMS; Support for AIA