Terminology and Standards
Dan Gillman
US Bureau of Labor Statistics
Terminology

Principle –
To communicate, we need to agree on terms

Concept –
– unit of thought

Term –
– linguistic expression (similar to a word) linked to a concept

Special Language –
– set of terms describing a subject field
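
The three definitions above can be sketched as a small data model. This is only an illustration: the class names, the `concept_id` field, and the `lookup` helper are assumptions made for the example, and the definition text paraphrases the US CPS example given later in the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Concept:
    """A unit of thought."""
    concept_id: str   # hypothetical identifier, not from the slides
    definition: str


@dataclass(frozen=True)
class Term:
    """A linguistic expression linked to a concept."""
    expression: str
    concept: Concept


@dataclass
class SpecialLanguage:
    """The set of terms describing one subject field."""
    subject_field: str
    terms: Dict[str, Term] = field(default_factory=dict)

    def add(self, term: Term) -> None:
        # Index by lower-cased expression so lookups ignore case.
        self.terms[term.expression.lower()] = term

    def lookup(self, expression: str) -> Optional[Concept]:
        term = self.terms.get(expression.lower())
        return term.concept if term else None


cps = SpecialLanguage("US Current Population Survey")
unemployed = Concept("cps-unemployed", "not employed but still in the labor force")
cps.add(Term("Unemployed", unemployed))
```

Keeping `Term` and `Concept` as separate objects is what lets several expressions point at one concept, or one expression point at different concepts in different special languages.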
Terminology

Examples of special languages
Probability and statistics
Database theory
Statistical metadata
Statistical activity within each SI
– E.g., US Current Population Survey
• Labor force
• Unemployed
Union of special languages within SI
Projects

UNECE Metadata Glossary
Glossary (a.k.a. Vocabulary) –
– Alphabetical listing of terms and their definitions

BLS Taxonomy and Lexicon
Taxonomy (artefact, not the science) –
– Scheme for organizing terms within some subject field, typically a hierarchy
Lexicon –
– Vocabulary, or dictionary, of terms
UNECE Metadata Glossary

Create glossary of terms
In order of importance
– UNECE statistical metadata standards
• GSIM, GSBPM, GAMSO, CSPA, etc.
– Other statistical metadata standards
• DDI, SDMX, etc.
– Other standards and specifications
• Maybe ISO/IEC 11179, Dublin Core, etc.

Disseminate in user-friendly format
UNECE Metadata Glossary

Build special language for
Statistical institutes
– Designing metadata systems
– Building interfaces to metadata systems
– Message frameworks for sharing metadata

Establish authoritative source
Terms
Definitions
For international use
BLS Taxonomy and Lexicon

Project to
Record terms describing BLS data
– For all disseminated time series
– Separate terms into facets
• Measures (estimates on populations)
• Characteristics (classifications used to subset measures)
Produce
– Taxonomy – hierarchy of terms
– Lexicon – list of terms
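
One way to picture the two outputs: the taxonomy is a tree of terms under the two facets, and the lexicon is the flat list of every term in that tree. The sketch below assumes a nested-dictionary representation; the specific terms and the helper names `lexicon` and `path_to` are illustrative and are not BLS's actual scheme.

```python
# Hypothetical two-facet taxonomy; the terms are examples only.
taxonomy = {
    "Measures": {
        "Unemployment rate": {},
        "Earnings": {"Hourly earnings": {}, "Weekly earnings": {}},
    },
    "Characteristics": {
        "Industry": {},
        "Occupation": {},
    },
}


def lexicon(tree):
    """Flatten a taxonomy into an alphabetical list of its terms."""
    terms = []
    for term, children in tree.items():
        terms.append(term)
        terms.extend(lexicon(children))
    return sorted(terms)


def path_to(tree, target, prefix=()):
    """Return the path from the root to `target`, or None if absent."""
    for term, children in tree.items():
        here = prefix + (term,)
        if term == target:
            return here
        found = path_to(children, target, here)
        if found:
            return found
    return None
```

Here `path_to(taxonomy, "Hourly earnings")` walks the hierarchy, while `lexicon(taxonomy)` discards it and keeps only the term list.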
BLS Taxonomy and Lexicon

Goals
For each term, find related documents and data
– organize data – use taxonomy
– tag documents – use lexicon
Use taxonomy to drive and guide
– Web site reorganization
Provide plain English equivalent words
– Help unsophisticated users find resources
– Alleviate common confusions
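
Tagging documents with the lexicon could be as simple as checking which lexicon terms occur in a document's text. The function below is a minimal sketch under that assumption; `tag_document` is a hypothetical name, and a production tagger would likely also handle plurals, synonyms, and stemming.

```python
import re


def tag_document(text, lexicon_terms):
    """Return the lexicon terms that appear in `text` as whole phrases."""
    tags = []
    for term in lexicon_terms:
        # Word boundaries keep "Industry" from matching inside a longer word.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            tags.append(term)
    return tags


sample_text = "The unemployment rate fell while hourly earnings rose."
sample_lexicon = ["Unemployment rate", "Hourly earnings", "Industry"]
sample_tags = tag_document(sample_text, sample_lexicon)
```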
BLS Taxonomy and Lexicon

Plain English examples
Inflation – CPI
Field of work – industry or occupation
Wages, earnings, income, compensation
Plain English names for categories
Authoritative source for BLS language
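
The plain English examples above amount to a lookup from everyday wording to the official term or terms. A minimal sketch, assuming a hand-maintained mapping (the dictionary name and the `official_terms` helper are hypothetical):

```python
# Plain English phrase -> official BLS term(s), taken from the examples above.
PLAIN_ENGLISH = {
    "inflation": ["CPI"],
    "field of work": ["industry", "occupation"],
}


def official_terms(query):
    """Map a user's plain English phrase to official terms, if known."""
    return PLAIN_ENGLISH.get(query.strip().lower(), [])
```

When a phrase maps to more than one term, as "field of work" does, a search interface could present both and let the user choose.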

Usage of Terms

Metadata models
Names of classes, attributes, relationships
E.g., Universe, Category, Specialization

Metadata content
Content stored in attributes in a model
E.g., establishment, retail grocery store, etc.

Terminology systems
Authoritative sources for terms / meaning
Standards

Why standards?
Consistency
– Eliminate inconsequential (gratuitous) differences
• Spelling and phrasing differences
Semantic interoperability
– Shared meaning without need for negotiation
Data harmonization
– Ability to combine data from different sources
Standards

Many levels
Program, Agency, National, Regional, International

Weaker condition
Authoritative sources
– Term and meaning for some subject field(s)
• E.g., unemployed in US CPS
• Plain English -> not employed
• US CPS -> not employed but still in Labor Force
– Not necessarily standard
Standards

Consistency and Interoperability
Handled by authoritative sources
Use URIs to point to terminological entries
Spelling and phrasing differences eliminated
Access to meaning ensured

But,
Differences across subject fields remain
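
The points above can be illustrated with a tiny resolver: spelling variants within one subject field collapse to a single URI, while the same word in another subject field keeps its own entry and its own meaning. The `https://example.org/terms/` namespace, the variant table, and the `resolve` helper are all assumptions for the sketch; the two definitions come from the "unemployed" example earlier in the slides.

```python
BASE = "https://example.org/terms/"  # hypothetical namespace

# Each URI identifies one terminological entry in an authoritative source.
ENTRIES = {
    BASE + "cps/unemployed": "not employed but still in Labor Force",
    BASE + "plain/unemployed": "not employed",
}

# (subject field, surface form) -> URI; variants share one URI.
VARIANTS = {
    ("us cps", "unemployed"): BASE + "cps/unemployed",
    ("us cps", "un-employed"): BASE + "cps/unemployed",
    ("plain english", "unemployed"): BASE + "plain/unemployed",
}


def resolve(subject_field, surface_form):
    """Return the URI for a term as used in a subject field, or None."""
    key = (subject_field.strip().lower(), surface_form.strip().lower())
    return VARIANTS.get(key)


def meaning(uri):
    """Look up the definition stored under a URI."""
    return ENTRIES.get(uri)
```

Resolving "Unemployed" and "un-employed" within US CPS gives the same URI, but the Plain English entry still differs, which is the remaining cross-field difference noted above.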
Standards

Data Harmonization
Authoritative sources not sufficient
– Subject fields may differ
– Gratuitous differences may exist too
Need new standards and agreements
– Bilateral agreements not scalable
Multiple standards on the same subject are a problem
– E.g., Geographical standards (US MSA vs. CSA)
– BLS has 6 definitions of Boston
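
The geography example suggests a simple guard before combining data: two estimates that both say "Boston" are only comparable if they also use the same geographic standard. The sketch below assumes each estimate carries its standard explicitly; the `Estimate` class, the `can_combine` helper, and the numeric values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    area_name: str       # label shown to users, e.g. "Boston"
    area_standard: str   # which geographic definition applies, e.g. "MSA" or "CSA"
    value: float         # placeholder number, not real data


def can_combine(a: Estimate, b: Estimate) -> bool:
    """Two estimates are comparable only if name and standard both match."""
    return (a.area_name, a.area_standard) == (b.area_name, b.area_standard)


boston_msa = Estimate("Boston", "MSA", 100.0)
boston_csa = Estimate("Boston", "CSA", 120.0)
boston_msa_other = Estimate("Boston", "MSA", 105.0)
```

Here `can_combine(boston_msa, boston_csa)` is false even though both are labelled "Boston", because the underlying areas are defined differently.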
Contact Information
Dan Gillman
[email protected]