IMDB - unece

Use of Standardized Metadata to Find, Select and Access Statistical Data
- Experience of Statistics Canada
Joint UNECE/Eurostat/OECD Work Session on Statistical Metadata (METIS)
Geneva, February 9-11, 2004
Objective of Presentation
Answer the question:
“How can the corporate Metadatabase
(IMDB) of Statistics Canada help users
find, select and access statistical data
held in its on-line database (CANSIM)?”
Contents of Presentation
- What is CANSIM?
- What is the IMDB?
- Accessing CANSIM data from IMDB
  – Demonstration
  – Naming and defining variables
  – Finding variable & accessing data
  – Implementation schedule
What is CANSIM?
- Stands for: CANadian Socio-economic Information Management system
- Corporate data dissemination database
- Accessible on STC Web site
- 1.3K tables (+ 700 “terminated”)
- 18.3M time series (incl. 413K “terminated”; over 14M for “health” alone)
- 800 variables (as defined in IMDB)
What is the IMDB?
- Corporate repository of information on over 350 surveys (+ 400 “discontinued”)
- Development began in 1999
- 4 pre-existing systems integrated
- Supports on-line dissemination activities:
  – The Daily
  – CANSIM
  – On-line catalogue
  – Canadian Statistic Tables
What is the IMDB content?
HTML pages generated from IMDB:
- Overview of survey (mandate, users, uses)
- Survey population & Questionnaire image
- Methodology description (10 components)
- Data accuracy measures
In the Fall of 2004:
- Variable names and definitions
- Link to classifications & CANSIM tables
- “Time Travel” from November 2000 on
Naming and defining variables
- Variable = Statistical unit + property + representation (as per ISO 11179 model)
- Statistical unit is agent, event or item about which data are produced
- Property is characteristic of statistical unit being measured
- Representation is form given to resulting data, e.g. Name, Index, Type
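
The three-part model above can be sketched as a small record type. This is only an illustration of the decomposition, not the IMDB's actual schema; the class and field names are assumptions (the field is called `prop` because `property` is a Python built-in), and the sample values are taken from the naming examples in this presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    """Hypothetical record for one variable, split into its three components."""
    statistical_unit: str  # agent, event or item the data are about
    prop: str              # characteristic of the unit being measured
    representation: str    # form given to the resulting data


# "Value of Sales of Establishment" decomposed into its components
sales = Variable(statistical_unit="Establishment",
                 prop="Sales",
                 representation="Value")
```

Keeping the three components as separate fields is what makes it possible to browse the same variable by statistical unit or by any other component.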
… Naming and defining variables

Naming convention: all three elements used to create name of variable, in the order representation, property, statistical unit
- Value of Sales of Establishment
- Type of Assets of Establishment
- Name of Geographic location of Person
- Type of Occupation of Person
- Value of GDP of Economy
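
Judging from the examples above, the name appears to be built as "representation of property of statistical unit". A minimal sketch of that pattern follows; the function name and parameter names are assumptions made for illustration, not part of the IMDB.

```python
def variable_name(representation: str, prop: str, statistical_unit: str) -> str:
    """Join the three components in the order seen in the slide examples."""
    return f"{representation} of {prop} of {statistical_unit}"


# Components of the five example names listed above
examples = [
    ("Value", "Sales", "Establishment"),
    ("Type", "Assets", "Establishment"),
    ("Name", "Geographic location", "Person"),
    ("Type", "Occupation", "Person"),
    ("Value", "GDP", "Economy"),
]
names = [variable_name(*parts) for parts in examples]
```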
… Naming and defining variables

Definition of variable provided by joined definitions of its 3 components + specification of associated classifications (or units of measure)
Note about Variable – Classification relationship:
- ISO 11179: one-to-one relationship
- IMDB: one-to-many, but one-to-one between classification and variable in one CANSIM table
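
The relationship rule in the note can be expressed as a simple consistency check: a variable may be linked to several classifications overall, but within one CANSIM table each variable pairs with exactly one classification and vice versa. The sketch below assumes the links are available as (table, variable, classification) triples; the function name, the table identifiers "T1"/"T2" and the classification labels are all hypothetical placeholders.

```python
from collections import defaultdict


def one_to_one_violations(links):
    """Return the (table, name) keys that break the one-to-one rule.

    links: iterable of (table, variable, classification) triples.
    """
    by_variable = defaultdict(set)        # (table, variable) -> classifications
    by_classification = defaultdict(set)  # (table, classification) -> variables
    for table, variable, classification in links:
        by_variable[(table, variable)].add(classification)
        by_classification[(table, classification)].add(variable)
    bad = [key for key, found in by_variable.items() if len(found) > 1]
    bad += [key for key, found in by_classification.items() if len(found) > 1]
    return sorted(bad)


# One variable, two classifications, but in different tables: allowed
ok_links = [
    ("T1", "Type of Occupation of Person", "Classification A"),
    ("T2", "Type of Occupation of Person", "Classification B"),
]
# Same variable tied to two classifications inside table T1: not allowed
bad_links = ok_links + [("T1", "Type of Occupation of Person", "Classification B")]
```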
Finding variable & accessing data
- Browsing the list of 800 variables
  – By variable topic (20) and sub-topic (156)
  – By statistical unit (75)
  – By classification domain (20)
- Search engine that scans either:
  – variable names in IMDB, returning the ones containing the word entered or its thesaurus equivalent; or
  – class names/codes within classifications, returning the variables and CANSIM table numbers associated with the codes that match the word entered or its thesaurus equivalent
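
The two search modes described above can be sketched as follows. This is a toy illustration under stated assumptions, not the planned search engine: matching is done on whole lower-cased words, the thesaurus is a plain dictionary, and the function names, the thesaurus entry, the class name and the table identifier "T1" are all hypothetical.

```python
def expand(word, thesaurus):
    """Return the word entered plus its thesaurus equivalents, lower-cased."""
    word = word.lower()
    return {word} | {term.lower() for term in thesaurus.get(word, ())}


def search_variable_names(word, variable_names, thesaurus):
    """Mode 1: return variable names containing the word or an equivalent."""
    terms = expand(word, thesaurus)
    return [name for name in variable_names
            if terms & set(name.lower().split())]


def search_classes(word, classes, thesaurus):
    """Mode 2: match class names, return (variable, table number) pairs.

    classes: list of (class_name, variable_name, table_number) triples.
    """
    terms = expand(word, thesaurus)
    return [(variable, table) for class_name, variable, table in classes
            if terms & set(class_name.lower().split())]


# Hypothetical sample data
thesaurus = {"job": ["occupation"]}
variable_names = ["Type of Occupation of Person",
                  "Value of Sales of Establishment"]
classes = [("Registered nurses", "Type of Occupation of Person", "T1")]

by_name = search_variable_names("job", variable_names, thesaurus)
by_class = search_classes("nurses", classes, thesaurus)
```

Returning the table number alongside the variable in the second mode mirrors the slide's point that a class-level match should lead the user straight to the CANSIM tables that carry the data.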
Implementation schedule
- Winter 2004: loading variables and classifications in IMDB, implementing Browsing mechanism and “time travel”, finalizing re-design of web pages
- Spring 2004: display new pages with new features on Intranet to obtain feedback from survey managers
- Fall 2004: display on Internet
- Winter 2005: Implementation of Search mechanism