GSC2.2 Calibration Details or What do all those little numbers mean?

GSC2.2 Classification
GSC II Annual Meeting
October 2001
Single Plate Classification
Decision tree classifier:
– Use ranks to handle plate-to-plate variation
– 5000+ objects in training set
– OC1 oblique decision tree (Murthy et al.)
– Build several decision trees and let them vote
– Classification categories: star / nonstar / defect
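
The slides do not say how ranks are computed. As a minimal sketch of one plausible reading, each raw feature value could be replaced by its fractional rank among the objects on the same plate, so that a plate-wide scale or offset change leaves the classifier input unchanged. The function name `rank_transform` and the [0, 1] scaling are assumptions for illustration, not the GSC-II pipeline.

```python
def rank_transform(values):
    """Map raw feature values from ONE plate to fractional ranks in [0, 1].

    Hypothetical helper: the rank definition used by GSC-II is not given
    in the slides. Tied values are not averaged here; they receive
    adjacent ranks in their input order.
    """
    n = len(values)
    if n == 0:
        return []
    if n == 1:
        return [0.5]
    # Indices of the objects, sorted by feature value.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    for position, index in enumerate(order):
        ranks[index] = position / (n - 1)
    return ranks


# Two plates with different zero points and scales give identical ranks.
plate_a = [10.0, 30.0, 20.0]
plate_b = [105.0, 315.0, 210.0]
same = rank_transform(plate_a) == rank_transform(plate_b)
```

Because only the ordering within a plate matters, any monotonic plate-to-plate change in a feature drops out before the decision trees see the data.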
GSC2.2 Classification
Unlike astrometry and photometry, where one
best value was selected per object (per
bandpass),
GSC2.2 classification can combine multiplate
information to improve the final
classifications
and counter some known weaknesses.
MultiPlate Voting
For each object:
• Collect all single-plate measurements
– Even from plates not being exported, e.g. IV-N
• Override defect->nonstar if N(obs)>1
– Matched objects likely to be real objects
• Eliminate 25um scan data, if 15um data exist
– Classifier poorly tuned for these scans
• Majority vote of remaining measurements
– Voting classifiers is known to improve results
• Break ties in favor of nonstars
– Compensates for known bias
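
The voting steps above can be sketched as follows. This is an illustration under stated assumptions, not the production code: each measurement is represented as a hypothetical `(classification, scan_um)` tuple, the function name `multiplate_vote` is invented here, and the steps are applied in the order the slide lists them.

```python
from collections import Counter


def multiplate_vote(measurements):
    """Combine single-plate classifications for one object.

    measurements: list of (classification, scan_um) tuples, where
    classification is "star", "nonstar" or "defect" and scan_um is the
    scan pixel size in microns (15 or 25). This layout is an assumption.
    """
    if not measurements:
        raise ValueError("need at least one single-plate measurement")

    # Override defect -> nonstar if N(obs) > 1: matched objects are
    # likely to be real objects.
    if len(measurements) > 1:
        measurements = [
            ("nonstar" if cls == "defect" else cls, scan)
            for cls, scan in measurements
        ]

    # Eliminate 25 um scan data if any 15 um data exist.
    if any(scan == 15 for _, scan in measurements):
        measurements = [(cls, scan) for cls, scan in measurements if scan != 25]

    # Majority vote of the remaining measurements.
    counts = Counter(cls for cls, _ in measurements)
    top = max(counts.values())
    leaders = [cls for cls, n in counts.items() if n == top]

    # Break ties in favor of nonstars.
    if len(leaders) > 1:
        return "nonstar"
    return leaders[0]
```

For example, `multiplate_vote([("star", 15), ("nonstar", 15)])` is a tie and resolves to `"nonstar"`, while a lone `("defect", 15)` measurement stays `"defect"` because the override only applies to matched objects.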
Auxiliary Information: the Source Status Flag
• GSC2.2 provides a wealth of additional
information about each object via the source
status flag.
• Much of this information is pertinent to the
quality of the final classification.
• Informed users can further optimize their
results (e.g., guide star selection) with this
auxiliary data.
Status Flag Details:
0987654321
10-digit decimal mask with relevant info
Columns:
0: blend status
9: incomplete processing
8: classification voters
7: classification unanimity
654: photometric details (V,J,F)
3: centroider details
21: number of plate observations
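
A minimal sketch of splitting the flag into its fields, assuming the column labels "0987654321" read left to right, so that column 0 is the leading (most significant) digit and columns 2 and 1 are the last two. The function name and dictionary keys are invented for illustration. The meaning of individual digit values is not given in these slides, so the fields are returned undecoded, and the three photometric digits are kept as one string because the slides list the band order two ways.

```python
def decode_status_flag(flag):
    """Split a GSC2.2 source status flag into its column fields.

    Assumes the flag is a non-negative integer of at most 10 decimal
    digits, laid out as columns 0 9 8 7 6 5 4 3 2 1 from left to right.
    """
    flag = int(flag)
    if flag < 0 or flag > 9_999_999_999:
        raise ValueError("status flag must fit in 10 decimal digits")
    digits = f"{flag:010d}"  # zero-pad so leading columns are present
    return {
        "blend_status": int(digits[0]),              # column 0
        "incomplete_processing": int(digits[1]),     # column 9
        "classification_voters": int(digits[2]),     # column 8
        "classification_unanimity": int(digits[3]),  # column 7
        "photometric_details": digits[4:7],          # columns 6, 5, 4
        "centroider_details": int(digits[7]),        # column 3
        "n_plate_observations": int(digits[8:10]),   # columns 2 and 1
    }


fields = decode_status_flag(9999999900)
```

With this layout, the value 9999999900 decodes to 9 in every single-digit field, "999" for the photometric digits, and 0 plate observations.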
Classification and the Status Flag
0: blend status
– Poorly tuned for blends => lower
confidence
9: incomplete processing
– No features computed => lower
confidence
8: classification voters
– Multiple voters => higher confidence
– 25um voters => lower confidence
Classification and the Status Flag
7: classification unanimity
– Unanimous vote => higher confidence
654: photometric details (V,F,J)
3: centroider details
21: number of plate observations
– More voters => higher confidence
Bright Objects
• Tycho stars are included in the GSC2.2.
– Classification was set to star for these objects
– Status flag = 9999999900 for Tycho stars
• GSC1 data were omitted from the GSC2.2
– Classifications were excluded from voting
– GSC1 classifier superior for m<14
• Include GSC1 classification in next export
Evaluating Performance: Not a simple problem
• What to measure?
– Correctness; completeness; contamination
• Magnitude and latitude variations
• What to compare against?
– GSCII was constructed because there is nothing
comparable to it!
– Nonstar ≠ galaxy
– Automatically classified samples are less reliable
– Visually classified samples are few and small
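
The three quantities named above are not defined in the slides. The sketch below uses common definitions as an assumption: correctness is the fraction of all objects whose catalog class matches the reference class, completeness is the fraction of reference objects of a given class that the catalog recovers, and contamination is the fraction of objects the catalog assigns to that class that the reference puts elsewhere. The function name and the sample labels are made up for illustration.

```python
def classification_metrics(predicted, reference, target="star"):
    """Compare catalog classes against a reference sample for one class.

    predicted, reference: equal-length lists of class labels.
    Returns None for a ratio whose denominator is zero.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    pairs = list(zip(predicted, reference))
    n_correct = sum(p == r for p, r in pairs)
    n_reference_target = sum(r == target for _, r in pairs)
    n_labeled_target = sum(p == target for p, _ in pairs)
    n_both = sum(p == target and r == target for p, r in pairs)
    return {
        "correctness": n_correct / len(pairs),
        "completeness": (n_both / n_reference_target
                         if n_reference_target else None),
        "contamination": ((n_labeled_target - n_both) / n_labeled_target
                          if n_labeled_target else None),
    }


# Made-up labels, for illustration only.
catalog = ["star", "star", "star", "nonstar"]
truth = ["star", "star", "nonstar", "nonstar"]
metrics = classification_metrics(catalog, truth, target="star")
```

In practice these would be evaluated in bins of magnitude and latitude, since the slides stress that performance varies with both.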
NPM/SPM Stars
vs magnitude & latitude
NPM/SPM Galaxies
vs magnitude & latitude
SDSS Stars and Galaxies
vs magnitude
Accuracy vs the real questions
• How complete is my sample of nonstars?
• How pure is my sample of stars?
• What is the probability that the GSC2.2
classification of this object is correct?
The answers depend on your sample, as well
as on the properties of the catalog.
A single quoted accuracy does not suffice.
Accuracy vs the real questions
P(Ts|S) = [P(S|Ts)*P(Ts)] / P(S)
This formulation is:
– Responsive to magnitude and latitude variations
– Adaptable to a priori effects of sampling
– Adaptable to your favorite galaxy model
– Computable (we think! - in progress)
– Answers the real questions
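
A small numerical sketch of the formula, reading S as "classified as a star" and Ts as "truly a star" (the slides do not spell the symbols out, so this reading is an assumption). It further assumes only two true classes, so that P(S) can be expanded as P(S|Ts)*P(Ts) + P(S|Tn)*P(Tn), with Tn meaning "truly a nonstar". The function name and all input numbers are hypothetical.

```python
def posterior_true_star(p_s_given_ts, p_s_given_tn, prior_ts):
    """P(Ts|S) from Bayes' theorem with two true classes.

    p_s_given_ts: probability a true star is classified as a star
    p_s_given_tn: probability a true nonstar is classified as a star
    prior_ts:     fraction of true stars in the sample, P(Ts)
    """
    prior_tn = 1.0 - prior_ts
    # Total probability of a "star" classification, P(S).
    p_s = p_s_given_ts * prior_ts + p_s_given_tn * prior_tn
    if p_s == 0.0:
        raise ValueError("P(S) is zero; the posterior is undefined")
    return p_s_given_ts * prior_ts / p_s


# Same classifier, two samples with different star fractions
# (hypothetical numbers).
star_rich = posterior_true_star(0.9, 0.2, prior_ts=0.5)
star_poor = posterior_true_star(0.9, 0.2, prior_ts=0.1)
```

With these made-up inputs the posterior drops from about 0.82 to about 0.33 when the star fraction falls from 50% to 10%, which is why the prior P(Ts) has to follow the sample's magnitude and latitude range.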