Transcript PPT

Semi-automatic integration in
Dutch Supply and Use Tables – with special
reference to time-series
Marcel Pommée
National Accounts Department
Statistics Netherlands (CBS)
Centraal Bureau voor de Statistiek
Outline
• SUTs, characteristics
• Reasons for automation
• Automation with ‘machines’
• Balancing machine
• Quarterly machine
• Time-series machine
• Time-series project
• Summary
SUTs, characteristics
• Annually, quarterly (t+45 and t+30 ?)
• Detail: industries (120), commodities (630), expenditure categories (50)
• Focus on year on year changes, no seasonal adjustment
• Simultaneous balancing of current and constant prices
• Data gaps filled with assumptions and extrapolations
• Manual balancing
Reasons for automation
• Efficiency gains: budget reductions up to 34% expected in 2018
• Part of redesign of chain of economic statistics:
  • More structured process
  • Top-down approach
  • Focus on major problems
• Quality
  • Visibility of the various adjustment steps (transparency)
  • Consistency over time
  • Reproducible results (consistency)
  • First GDP estimates quickly available (analysis)
Automation with ‘machines’
• Quadratic optimization model
  • Minimizing the adjustment needed to the growth rates of quarterly series
  • T-1 and unbalanced data in current and constant prices available
• Only semi-automatic integration
  • Major problems tackled manually
  • Small problems resolved through automation
• Machines for different purposes:
  • Balancing machine: balancing single SUT
  • Quarterly machine: rebasing years and aligning quarters
  • Time-series machine: rebasing time-series
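The quadratic optimization idea on this slide can be illustrated for a single commodity. The sketch below changes the growth rates as little as possible so that supply equals use. It is a minimal sketch under stated assumptions, not the CBS implementation: the function name `balance_growth_rates`, the weights, and all figures are hypothetical, and only one constraint is imposed.

```python
import numpy as np

def balance_growth_rates(prev, raw, signs, weights):
    """Adjust growth rates as little as possible so that signed levels sum to zero.

    prev    : balanced levels of year t-1
    raw     : unbalanced levels of year t
    signs   : +1 for supply items, -1 for use items (one commodity)
    weights : penalty per item; a higher weight means less adjustment
    """
    prev = np.asarray(prev, dtype=float)
    raw = np.asarray(raw, dtype=float)
    signs = np.asarray(signs, dtype=float)
    weights = np.asarray(weights, dtype=float)

    g_raw = raw / prev - 1.0       # unbalanced growth rates
    a = signs * prev               # constraint: a @ (1 + g) == 0
    imbalance = a @ (1.0 + g_raw)  # supply minus use before balancing
    # Lagrange solution of: min sum w * (g - g_raw)^2  s.t.  a @ (1 + g) = 0
    lam = imbalance / np.sum(a * a / weights)
    g = g_raw - lam * a / weights
    return prev * (1.0 + g)

# Hypothetical figures: output, imports (supply); intermediate use, final use (use).
prev = [100.0, 50.0, 90.0, 60.0]   # balanced: 150 supply, 150 use
raw = [104.0, 53.0, 92.0, 63.0]    # unbalanced: 157 supply, 155 use
balanced = balance_growth_rates(prev, raw, signs=[1, 1, -1, -1], weights=[1, 1, 1, 1])
```

With equal weights, the items with the largest previous-year level absorb most of the imbalance. Raising the weight of an item keeps its growth rate closer to the source data.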
Balancing machine
• Balancing single SUT: major problems solved manually
• Hard and soft constraints
  • Supply is equal to use by commodity
  • Preserve price indices by commodity
  • Preserve I/O ratios by branch of industry
  • Compute trade and transport margins
  • Compute taxes and subsidies on products
  • Upper and lower bounds for individual variables
  • Fixation of variables
  • Weighting of variables based on quality of data source
  • Specific relations (imports and re-exports, building materials and construction)
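The distinction between hard and soft constraints, fixation, and weighting can be sketched as a small equality-constrained least-squares problem. This is an illustrative sketch under stated assumptions, not the actual balancing machine: the function `balance`, the penalty value, and the figures are hypothetical. Upper and lower bounds are left out because they would require a proper quadratic programming solver.

```python
import numpy as np

def balance(x0, weights, hard_A, hard_b, soft_C, soft_d, soft_penalty, fixed):
    """Minimise weighted squared adjustments to x0.

    Hard constraints (hard_A @ x == hard_b) hold exactly.
    Soft constraints (soft_C @ x ~ soft_d) are only penalised.
    Variables listed in `fixed` keep their source value.
    """
    x0 = np.asarray(x0, dtype=float)
    n = x0.size
    idx = np.asarray(fixed, dtype=int)
    # Fixation: each fixed variable becomes an extra hard constraint x[i] == x0[i].
    A = np.vstack([np.atleast_2d(hard_A), np.eye(n)[idx]])
    b = np.concatenate([np.atleast_1d(hard_b), x0[idx]]).astype(float)
    C = np.atleast_2d(soft_C).astype(float)
    d = np.atleast_1d(soft_d).astype(float)
    W = np.diag(np.asarray(weights, dtype=float))

    # Stationarity and feasibility conditions (KKT system) of the quadratic problem.
    Q = W + soft_penalty * C.T @ C
    rhs_x = W @ x0 + soft_penalty * C.T @ d
    m = A.shape[0]
    kkt = np.block([[Q, A.T], [A, np.zeros((m, m))]])
    solution = np.linalg.solve(kkt, np.concatenate([rhs_x, b]))
    return solution[:n]

# Hypothetical commodity: output, imports, intermediate use, consumption, exports.
x0 = [100.0, 40.0, 55.0, 50.0, 38.0]      # supply 140, use 143
x = balance(
    x0,
    weights=[1.0, 4.0, 1.0, 1.0, 1.0],    # imports assumed to come from a better source
    hard_A=[[1, 1, -1, -1, -1]],          # supply equals use
    hard_b=[0.0],
    soft_C=[[-0.55, 0, 1, 0, 0]],         # keep intermediate use near 55% of output
    soft_d=[0.0],
    soft_penalty=10.0,
    fixed=[4],                            # exports held at the source value
)
```

The design choice here is that a hard constraint enters as an exact equation, while a soft constraint only adds a penalty term, so the i/o ratio may drift slightly if that is needed to close the balance.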
Quarterly machine
• Compilation cycle: final (F), preliminary (P), very preliminary (V)
• Quarterly machine
  • Input: F-year and 12 quarters of previous cycle
  • Output: rebased P- and V-year and 12 aligned quarters
• Updating of P- and V-year
  • Selected information added
  • With balancing machine
Time-series machine
• Time-series (1990-2009) based on benchmark revision 2010
• Reconstruction of complete SU and IO tables
• Earlier
  • Year by year compilation => very time-consuming
  • Difficult to preserve price and volume indices of original series
• Time-series machine
  • New levels given by benchmark year and reference years
  • Reference years, e.g. 1987, 1995, 2001 (previous revision years)
  • Preservation of price and volume indices of original series
  • Iterative process, manual intervention
  • Result: fully consistent time-series in current and constant prices
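The combination of "new levels in reference years" and "preservation of the original indices" can be illustrated for one series. The sketch below interpolates the level correction factor linearly between reference years, which is the choice that minimises the squared year-to-year changes in that factor when the reference levels are fixed. It is a simplified, hypothetical illustration (function name, years, and figures are assumptions) and ignores the accounting constraints that the actual time-series machine has to respect.

```python
import numpy as np

def rebase_series(years, original, reference_levels):
    """Move a series to new levels in the reference years.

    years            : increasing sequence of years
    original         : original series, same length as years
    reference_levels : dict {year: new level}; every year must occur in `years`
    """
    years = np.asarray(years)
    original = np.asarray(original, dtype=float)
    position = {int(y): i for i, y in enumerate(years)}
    ref_years = sorted(reference_levels)
    # Correction factor (new level / old level) in each reference year.
    ref_factors = [reference_levels[y] / original[position[y]] for y in ref_years]
    # Linear interpolation between reference years, constant outside them.
    factor = np.interp(years, ref_years, ref_factors)
    return original * factor

# Hypothetical series growing 3% per year, with revised levels in two years.
years = np.arange(2001, 2011)
original = 100.0 * 1.03 ** np.arange(years.size)
targets = {2001: original[0] * 1.04, 2010: original[-1] * 1.076}
revised = rebase_series(years, original, targets)

growth_old = original[1:] / original[:-1] - 1.0
growth_new = revised[1:] / revised[:-1] - 1.0
```

In this example the level shift is spread evenly over the years, so `growth_new` differs only slightly from `growth_old`. Large level shifts in the revision year force larger changes to the original growth rates.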
Time-series project (1)
• Year 2010
  • ESA 2010 conceptual revision and benchmark (statistical) revision
  • Revised GDP 7.6% higher (concepts 3% and benchmark 4.6%)
• Covers
  • Fully consistent ANA, QNA, ASA, QSA, LA, SUTs, IO-tables, and regional accounts
• Planning
  • Benchmark year 2010: publication 6th March 2014
  • 2001-2009, up to 2013: publication 20th June 2014
  • 1995-2000: publication 24th September 2014
• Extremely tight schedule
Time-series project (2)
• Time-series 2001-2009: series of problems
  • Start-up problems with the time-series machine (coding errors, retrieving data, capacity limitations, processing time)
  • Specifying constraints takes a lot of time (fixation of variables, notably government data; weighting of variables, notably prices)
  • Many interdependencies with other NA modules (LA, ASA, government data and financial institutions, FISIM)
  • Time-series machine had to make quite large adjustments to the original series due to substantial level shifts in the revision year
Time-series project (3)
• Time-series 2001-2009: consequences
  • Difficult to understand what the machine is doing (black box)
  • Planning deadlines were not met
  • Hardly any documentation
  • Machine results less than optimal: extensive manual interventions => publication is on a provisional basis
  • Highly motivated team, but tension and frustration due to adversities
Time-series project (4)
• Time-series 1995-2000: gaining experience
• Compilation process split into parts:
  • First, a basic run without some constraints (no commodity balancing) => fewer adjustments by machine
  • Second, manual intervention to solve major problems (imbalances)
  • Final run to solve minor inconsistencies and restore all constraints
• Results were much better
  • Better understanding of the machine output
  • Little or almost no manual intervention needed
Time-series project (5)
• Time-series 1995-2000: some lessons learned
  • It takes a lot of time to get the time-series machine running smoothly
  • Manual adjustments are mostly complex as they often affect large parts or the whole of the time-series
  • Due to all interdependencies it is important to stick to the planning
  • Complexities warrant an experienced team of compilers
Summary
• Almost 3 years of experience with ‘machines’
• Pros
  • Efficiency gains in terms of fewer FTEs
  • More robust statistical process
  • Improved quality
• Cons
  • Investment to develop and to build
  • Programming errors
  • More complex: takes time to understand what the machine is doing