New Software for
Ensemble Creation in the
Spitzer-Space-Telescope
Operations Database
Russ Laher and John Rector
2004 ADASS XIV Conference
October 24 - 27, 2004
Russ Laher ([email protected]) and John Rector ([email protected])
Spitzer Science Center, California Institute of Technology, Pasadena, CA 91125
Preface



About one third of the 230 Spitzer data-processing pipelines require multiple input images (e.g., calibrations, image co-adds & mosaics)
The motivation is data-noise reduction and/or statistical characterization of the data
Input images are grouped for particular pipeline processing into what we call “ensembles” in the operations database
Outline

Powerpoint Presentation
• Introduction (Background, Purpose of Talk)
• Database storage of ensembles
• Ensemble-creation rules
• Ensemble-creation software
• Conclusions
• Future Work

URL of long version of paper
http://spider.ipac.caltech.edu/staff/laher/sirtf/NewEnsembleCreation.pdf

Appendices
• A. On-line software tutorial
• B. Spitzer ensemble-creation rules
• C. S/W output, test mode
• D. S/W output, normal mode
Background


Spitzer rules for ensemble creation are well documented and under version control
Spitzer pipeline-operator Ron Beck created the first version of a script for executing the ensemble-creation rules
• Rules are hard coded (and therefore hard to change)
• Direct SQL is used for DB access (open/close DB connection for each access)
New database-design improvements and software have been developed for increased speed and flexibility
Purpose of
this Talk

To acquaint you with SSC methodologies for creating/storing ensembles, including
• Database design
• “Ensemble-creation” rules
To debut our new ensemble-creation software
• New database tables and schema changes
• New database stored functions
To identify general concepts used in creating/storing ensembles (for application to other astronomical missions)
Hierarchy of
Spitzer Observations
Observing campaign
• 5-7 days in a campaign
• 30,000-100,000 observed images (DCEs)
• One instrument per campaign
• Spitzer instruments: IRAC, MIPS, IRS
• campaignId
Request i
• 200-300 in a campaign
• reqKey
Exposure j
• 10-100 in a request
• ExposureId
Instrument channel k
• 3 or 4 depending on the instrument
• chanNum or chnlNum
DCE l
• “Data Collection Event”
• 1-10 DCEs in an exposure and channel
• Each DCE gives a FITS file of observed data (single image or stack of images, depending on the instrument and mode)
• dceId
In “cluster” mode, there may be multiple exposures per cluster of observations (clusterPosNum)
At scheduling time, the “pipeline picker” assigns to each DCE a pipeline for initial processing (initPlScriptId)
Miscellaneous
Considerations


Ensembles can be created in the database after the observations are scheduled (it is not necessary to have received the actual DCEs from the spacecraft)
Wouldn’t it be nice to store with each ensemble in the database information about the “rule” applied in creating it?
Database
Storage of Ensembles
ensembles (ensId: serial, plScriptId: smallint, dceSetId: integer, expectedInputs: smallint, repDceId: integer, version: smallint, vbest: smallint, ruleId: smallint)
dceSets (dceId: integer, dceSetId: integer)
ensembleSets (inEnsId: integer, outEnsId: integer)
Diagram links between the tables are labeled with cardinalities 1 and 1+
There are three database tables for storing information about how (instances of) ensembles are defined (which DCEs are included and how they are to be processed)
DCEs are grouped explicitly into DCE sets (via association of dceIds with a dceSetId)
The type of pipeline ensemble processing to be done is stored with the ensemble (plScriptId is associated with ensId)
Database
Storage of Ensembles (cont.)
A DCE set is stored with one or more ensembles (dceSetId is associated with ensId)
An ensemble is characterized in the database by dceSetId and plScriptId
Two or more ensembles can be associated together for processing a set of ensembles by creating a new ensemble with NULL dceSetId and two or more associations in the ensembleSets database table
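
The storage scheme described above can be sketched with an in-memory SQLite database. This is only an illustration under stated assumptions: the column names come from the slides, but the SQLite types, the helper create_ensemble, and all ids and pipeline numbers are made up for the example and are not the operations database's actual definitions.

```python
import sqlite3

# Illustrative SQLite stand-ins for the three tables named on the slides.
# Types are approximations (SQLite has no "serial" or "smallint" storage class).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dceSets (dceId INTEGER, dceSetId INTEGER);
CREATE TABLE ensembles (
    ensId INTEGER PRIMARY KEY AUTOINCREMENT,  -- "serial" on the slide
    plScriptId INTEGER,
    dceSetId INTEGER,                         -- NULL for an ensemble of ensembles
    expectedInputs INTEGER,
    repDceId INTEGER,
    version INTEGER,
    vbest INTEGER,
    ruleId INTEGER
);
CREATE TABLE ensembleSets (inEnsId INTEGER, outEnsId INTEGER);
""")


def create_ensemble(conn, pl_script_id, dce_set_id, expected_inputs, rule_id):
    """Hypothetical helper: insert one ensemble row and return its ensId."""
    cur = conn.execute(
        "INSERT INTO ensembles (plScriptId, dceSetId, expectedInputs, ruleId) "
        "VALUES (?, ?, ?, ?)",
        (pl_script_id, dce_set_id, expected_inputs, rule_id),
    )
    return cur.lastrowid


# One DCE set (dceSetId 1) holding three made-up DCEs.
conn.executemany("INSERT INTO dceSets VALUES (?, ?)", [(101, 1), (102, 1), (103, 1)])

# The same DCE set is stored with two ensembles, one per pipeline script.
ens_a = create_ensemble(conn, 10, 1, 3, 1)
ens_b = create_ensemble(conn, 11, 1, 3, 1)

# An ensemble of ensembles: NULL dceSetId plus two rows in ensembleSets.
ens_out = create_ensemble(conn, 20, None, 2, 2)
conn.executemany(
    "INSERT INTO ensembleSets VALUES (?, ?)", [(ens_a, ens_out), (ens_b, ens_out)]
)

inputs = [
    row[0]
    for row in conn.execute(
        "SELECT inEnsId FROM ensembleSets WHERE outEnsId = ? ORDER BY inEnsId",
        (ens_out,),
    )
]
out_set_id = conn.execute(
    "SELECT dceSetId FROM ensembles WHERE ensId = ?", (ens_out,)
).fetchone()[0]
```

Here inputs lists the ensembles feeding the combined ensemble, and out_set_id is NULL (None in Python), which is how the slides distinguish an ensemble of ensembles from an ensemble of DCEs.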
DB Storage
of Ensemble Rules
ensRules (ruleId: smallint, instrument: char(4), sql: lvarchar, make: Boolean, ensOfEns: Boolean, minInputs: smallint, comment: varchar(255), created: datetime, createdBy: varchar(30))
ensPlScripts (ruleId: smallint, plScriptId: smallint)
One ensRules record (1) is linked to one or more ensPlScripts records (1+)
There are two database tables for storing ensemble-creation rules
The ensRules database table specifies how DCEs are to be grouped
The ensPlScripts database table specifies how a set of DCEs is to be processed (by one or more different pipelines)
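
A minimal sketch of how table-driven rules of this kind could be applied, again using SQLite as a stand-in. Several things here are assumptions rather than facts from the talk: the dces table and its contents are hypothetical, the rule SQL is invented, make is read as "this rule is active", and minInputs is treated as the smallest group size that yields an ensemble.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ensRules (ruleId INTEGER PRIMARY KEY, instrument TEXT, sql TEXT,
                       make INTEGER, ensOfEns INTEGER, minInputs INTEGER);
CREATE TABLE ensPlScripts (ruleId INTEGER, plScriptId INTEGER);
-- Hypothetical stand-in for the scheduled-DCE data that rule SQL would query.
CREATE TABLE dces (dceId INTEGER, instrument TEXT, chanNum INTEGER);
""")
conn.executemany(
    "INSERT INTO dces VALUES (?, ?, ?)",
    [(1, "IRAC", 1), (2, "IRAC", 1), (3, "IRAC", 2), (4, "IRAC", 1)],
)
# The grouping logic lives in the rule record, not in the program.
conn.execute(
    "INSERT INTO ensRules VALUES (1, 'IRAC', ?, 1, 0, 2)",
    ("SELECT chanNum, dceId FROM dces WHERE instrument = 'IRAC' ORDER BY dceId",),
)
# One rule, two pipelines: each DCE group is processed by both.
conn.executemany("INSERT INTO ensPlScripts VALUES (?, ?)", [(1, 50), (1, 51)])


def apply_rule(conn, rule_id):
    """Return (plScriptId, [dceId, ...]) pairs that the rule would produce."""
    row = conn.execute(
        "SELECT sql, minInputs FROM ensRules WHERE ruleId = ? AND make = 1",
        (rule_id,),
    ).fetchone()
    if row is None:
        return []  # unknown or inactive rule
    rule_sql, min_inputs = row
    scripts = [
        r[0]
        for r in conn.execute(
            "SELECT plScriptId FROM ensPlScripts WHERE ruleId = ? ORDER BY plScriptId",
            (rule_id,),
        )
    ]
    groups = defaultdict(list)
    for key, dce_id in conn.execute(rule_sql):  # rule SQL yields (group key, dceId)
        groups[key].append(dce_id)
    return [
        (pl_script_id, dce_ids)
        for _, dce_ids in sorted(groups.items())
        if len(dce_ids) >= min_inputs
        for pl_script_id in scripts
    ]


result = apply_rule(conn, 1)
```

Changing how DCEs are grouped then means updating a row in ensRules rather than editing the script, which is the flexibility the slides attribute to the table-driven design.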
Database Schema
for Ensemble Creation
[Entity-relationship diagram; links between tables are labeled with cardinalities 1, 0+, and 1+]
Rule tables: ensRules, ensPlScripts
Temporary tables: ensTempList, ensTempList2, ensTempList3, ensTempListMore, ensTempList3More, ensOfEnsTempList, ensOfEnsTempList2, ensOfEnsTempList3
Ensemble tables: ensembles, dceSets, ensembleSets
ensRules (ruleId: smallint, instrument: char(4), sql: lvarchar, make: Boolean, ensOfEns: Boolean, minInputs: smallint, comment: varchar(255), created: datetime, createdBy: varchar(30))
ensPlScripts (ruleId: smallint, plScriptId: smallint)
ensembles (ensId: serial, plScriptId: smallint, dceSetId: integer, expectedInputs: smallint, repDceId: integer, version: smallint, vbest: smallint, ruleId: smallint)
dceSets (dceId: integer, dceSetId: integer)
ensembleSets (inEnsId: integer, outEnsId: integer)
Key columns appearing in the temporary tables: groupId (serial or integer), serialId (serial or integer), ruleId: smallint, ensPlScriptId: smallint, dceId: integer, dceSetId: integer, ensId: integer, inEnsId: integer, outEnsId: integer, expectedInputs: smallint
Grouping columns appearing in the temporary tables: initPlScriptId: smallint, chanNum: smallint, exposureNum: smallint, fowlerNum: smallint, waitPeriod: integer, dceNum: smallint, primaryField: smallint, cycleNum: smallint, aperture: smallint, clusterPosNum: smallint, frameNum: smallint, arraycoord: smallint
Database Stored
Functions for Ensemble Creation
Database stored function                    Return value(s)
getEnsRules()                               All records
getEnsPlScripts()                           All records
getReqMode(reqKey)                          Corresponding reqMode (decoded for instrument name)
deleteAllEnsTempLists()                     None
getEnsGroupsFromEnsTempList(ruleId)         All records for given ruleId
getEnsSetsFromEnsOfEnsTempList3(ruleId)     All records for given ruleId
createEnsembles(ruleId, test)               Basic info for all ensembles created or to be created for given ruleId
createEnsembleSets(ruleId, test)            Basic info for all ensembles and ensembleSets created or to be created for given ruleId
Features of
ensembleCreation.pl








Much faster performance is expected because pre-compiled database stored functions are called
Efficient architecture: only a single database connection is needed
Software complexity is encapsulated in the database stored functions
Database-table-driven specification of ensemble-creation rules makes it flexible
On-line tutorial (lists options, switches, sample command lines)
Useful, thoughtfully organized diagnostic outputs
Test mode to verify effect of ensemble-creation rule, without actually having to create ensembles in the database
Post-mortem debugging capability via direct SQL querying of database temporary tables
Flow Chart for
createEnsembles.pl
1. Open database (DB) connection
2. Delete all records from temporary DB tables
3. Read ensRules and ensPlScripts DB tables
4. Optionally write ensemble-creation rules to output file
5. Execute ensemble-creation-rule SQL statements to pre-load data into ensTempList# temporary DB tables
6. Create ensembles and sets of DCEs in temporary DB tables and optionally in ensembles and dceSets DB tables
7. Write summary to output file
8. Execute ensemble-creation-rule SQL statements to pre-load data into ensOfEnsTempList# temporary DB tables
9. Create ensembles and sets of ensembles in temporary DB tables and optionally in ensembles and ensembleSets DB tables
10. Write summary to output file
11. Close database connection
Legend (box types): via database stored function; via dbaccess system-call; file output; open/close DB connection
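
The first pass of this flow (ensembles of DCEs) can be sketched as a single function that does all its work on one open connection. This is a simplified illustration, not the Perl script: SQLite replaces the operations database, the tempList and dces tables are hypothetical stand-ins, rule SQL is assumed to return (group key, dceId) rows, and the second pass for ensembles of ensembles is omitted.

```python
import io
import sqlite3

SCHEMA = """
CREATE TABLE ensRules (ruleId INTEGER PRIMARY KEY, sql TEXT, minInputs INTEGER);
CREATE TABLE dces (dceId INTEGER, chanNum INTEGER);  -- hypothetical source data
CREATE TABLE tempList (ruleId INTEGER, groupKey INTEGER, dceId INTEGER);  -- stand-in temporary table
CREATE TABLE dceSets (dceId INTEGER, dceSetId INTEGER);
CREATE TABLE ensembles (ensId INTEGER PRIMARY KEY, dceSetId INTEGER,
                        expectedInputs INTEGER, ruleId INTEGER);
"""


def run(conn, out, test=False):
    """Walk the flow-chart steps on an already-open connection."""
    conn.execute("DELETE FROM tempList")  # clear temporary table
    rules = conn.execute(
        "SELECT ruleId, sql, minInputs FROM ensRules ORDER BY ruleId"
    ).fetchall()
    total = 0
    for rule_id, rule_sql, min_inputs in rules:
        out.write(f"rule {rule_id}: {rule_sql}\n")  # optional rule listing
        rows = conn.execute(rule_sql).fetchall()    # pre-load the temporary table
        conn.executemany(
            "INSERT INTO tempList VALUES (?, ?, ?)",
            [(rule_id, key, dce_id) for key, dce_id in rows],
        )
        groups = conn.execute(
            "SELECT groupKey, COUNT(*) FROM tempList WHERE ruleId = ? "
            "GROUP BY groupKey HAVING COUNT(*) >= ? ORDER BY groupKey",
            (rule_id, min_inputs),
        ).fetchall()
        for key, count in groups:
            total += 1
            if test:
                continue  # test mode: temporary table only
            set_id = conn.execute(
                "SELECT COALESCE(MAX(dceSetId), 0) + 1 FROM dceSets"
            ).fetchone()[0]
            conn.execute(
                "INSERT INTO dceSets SELECT dceId, ? FROM tempList "
                "WHERE ruleId = ? AND groupKey = ?",
                (set_id, rule_id, key),
            )
            conn.execute(
                "INSERT INTO ensembles (dceSetId, expectedInputs, ruleId) "
                "VALUES (?, ?, ?)",
                (set_id, count, rule_id),
            )
        out.write(f"rule {rule_id}: {len(groups)} ensemble(s)\n")  # summary
    conn.commit()
    return total


conn = sqlite3.connect(":memory:")  # the single connection, opened once
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO dces VALUES (?, ?)", [(1, 1), (2, 1), (3, 2), (4, 2), (5, 3)]
)
conn.execute("INSERT INTO ensRules VALUES (1, 'SELECT chanNum, dceId FROM dces', 2)")

report = io.StringIO()
planned = run(conn, report, test=True)    # nothing written to ensembles or dceSets
created = run(conn, report, test=False)   # records written
n_ens = conn.execute("SELECT COUNT(*) FROM ensembles").fetchone()[0]
n_set_rows = conn.execute("SELECT COUNT(*) FROM dceSets").fetchone()[0]
conn.close()                              # closed once, at the end
```

Because the temporary table is filled even in test mode, its contents remain available for inspection afterwards, in the spirit of the post-mortem debugging feature listed earlier.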
Conclusions



Increased speed in creating database records for ensembles is achieved by using database stored functions
Flexibility in adding/changing ensemble-creation rules is achieved by storing the rules in the database
Several “small improvements” were implemented as well (e.g., storing the minimum number of DCEs with the ensemble-creation rule, storing the corresponding ruleId with each ensemble in the database)
Future Upgrades

Add new option to execute selected ensemble-creation rules
• Specify comma-separated list of ruleIds
• Application is augmenting existing set of ensembles
Add new option to create ensembleSets from existing ensembles
• Specify ruleId and ensPlScriptId
• Application is linking together existing ensembles (e.g., process the data for all reqKeys in a given 12-hour PAO to flag pixels with latent images)
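
As a small illustration of the first proposed option, a comma-separated list of ruleIds could be parsed as follows. The function name and the behavior on blanks and duplicates are assumptions; the slides only state that a comma-separated list would be specified.

```python
def parse_rule_ids(text):
    """Parse a comma-separated ruleId list, keeping first-seen order.

    Blank entries are skipped and duplicates are dropped; a non-integer
    entry raises ValueError from int().
    """
    rule_ids = []
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue
        rule_id = int(token)
        if rule_id not in rule_ids:
            rule_ids.append(rule_id)
    return rule_ids


selected = parse_rule_ids("3, 1,3,,12")
```

The resulting list could then restrict which rule records are read before ensembles are created, leaving ensembles from other rules untouched.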