Regional Data Validation


Expert Group Meeting on Price Statistics and National
Accounts:
ICP Round 2011
Jointly organized by:
UN-ECLAC, CARICOM, CARTAC and ECCB
27th-30th August 2012
Aruba
Summary
• INTRA country data validation
• INTER country data validation
• Quaranta Tables
• Dikhanov Tables
• Case Study
Data validation procedure
• The data validation procedure has two main steps:
  • INTRA country validation
  • INTER country validation
• Countries and ECLAC participate in this procedure
• It is an iterative process
Step 2: INTER country validation
The main data analysis within inter-country and global validation
is carried out using two validation tables:
• The Quaranta Tables
• The Dikhanov Tables
The purpose of both tables is to
• Screen the national average prices for possible errors by comparing the
average prices for the same product in different countries
• Highlight possible errors by means of special indices
This analysis can lead to
• Editing of price data for an item
• Editing of metadata for an item
Quaranta Tables
validation tables
The Quaranta Tables (QTs) consist of a set of tables
• Table details
• Summary tables for each BH
• Individual tables for each item
QTs can be used for the validation of
• BH PPPs and PLIs
• Item XR-ratios and PPP-ratios
Different calculation options
• EKS, EKS*, CPD, CPRD and W-CPD
• The method recommended by the TAG is weighted CPD
• However, comparing the results of different calculation methods can be
used to evaluate the information given on importance
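The slides list CPD among the options without spelling it out. As a rough illustration only, an unweighted CPD regression might be sketched as below. It assumes the usual log-linear form (log price = item effect + country effect + residual); the function name `cpd` and the toy price matrix are hypothetical, and the weighted variant recommended by the TAG is not shown.

```python
import numpy as np

def cpd(prices, base=0):
    """Unweighted CPD sketch: rows are items, columns are countries, NaN = not priced.

    Returns (ppp, residuals); ppp[base] is 1 by construction.
    """
    prices = np.asarray(prices, dtype=float)
    n_items, n_countries = prices.shape
    rows, cols = np.nonzero(~np.isnan(prices))
    y = np.log(prices[rows, cols])

    # Dummy columns: one per item, one per country except the base country.
    other = [c for c in range(n_countries) if c != base]
    X = np.zeros((len(y), n_items + len(other)))
    X[np.arange(len(y)), rows] = 1.0
    for k, c in enumerate(other):
        X[cols == c, n_items + k] = 1.0

    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    log_ppp = np.zeros(n_countries)
    log_ppp[other] = coef[n_items:]

    residuals = np.full(prices.shape, np.nan)
    residuals[rows, cols] = y - X @ coef
    return np.exp(log_ppp), residuals

# Hypothetical prices for 3 items in 3 countries (one price missing).
toy = [[1.0, 2.1, 10.0],
       [2.0, np.nan, 19.0],
       [4.0, 8.2, 41.0]]
ppp, resid = cpd(toy)
print(np.round(ppp, 3))
```

The residuals returned here are in logarithms, which is the form the Dikhanov Tables report.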
QUARANTA TABLE DIAGNOSTICS - Rice
Table details
Data Selection Criteria
  Basic Heading Code: 99.11.01.11.1
  Time Period: Yearly
  Averaging Method: Arithmetic Mean
  Imputation: CPD
  Run Date: 4/13/2011
Summary Information
  No of Items included in the Analysis: 6 out of 6
  Average weight of Basic Heading in Total Expenditure: 0.0
  No of Countries included in the Analysis: 18 out of 18
  Average Coefficient Variation: 27.4
  Base Country: USA
BH table
Country Level Details
# Shares are multiplied by 10000.
Country     XR        PPP        PLI(%)    Weight #   Items   Var.Co.
Country 1   4.42      1.8150     41.080%   0.0        2;*0    7.2
Country 2   959.04    718.2770   74.896%   0.0        5;*0    8.2
Country 3   1018.35   2.7696     0.272%    0.0        2;*0    33.8
Item table
Item Level Details
99.11.01.11.1.01  Long grain rice, prepacked  Var.Co.: 25.9  Pref. UoM: 1-Kilograms
Country     NC-Price   Quotations   Var.Co.   XR-price   XR-ratio   PPP-price   PPP-ratio   Pref. UoM
Country 1   1.500      151          11.2      0.34       45.89      0.83        95.02       NA
Country 2   -          -            -         -          -          -           -           NA
Country 3   766.381    10           3.0       72.93      9857.61    1.14        130.66      NA
Table details
Quaranta tables
#: Basic heading code.
Period: Period during which the prices for the products covered by the Table were collected.
Run date: Run date for the tables.
Averaging Method: Method used to calculate the averages.
Imputation: Method used to calculate the basic heading PPPs: EKS, EKS*, CPD or CPRD.
#: Number of items included in the analysis.
Weight: National expenditure weights scaled to 100,000. That part of a country's GDP that is spent on the basic heading when both expenditures are expressed in national currency and valued at national price levels. For information only: not used in the calculations.
#: Number of countries included in the analysis.
Average Coefficient Variation: Overall variation coefficient, i.e. the average product variation coefficient for the products priced for the basic heading. It is calculated as the unweighted arithmetic mean of the product variation coefficients. It measures the average variation of the PPP-Ratios of all products priced for the basic heading.
Base Country: Base country for the calculations (currency selected as numéraire).
BH table details
Quaranta tables
Country: Names of countries covered by the Table.
XR: Market exchange rates of the countries expressed as the number of units of national currency per unit of the numéraire currency (base country).
PPP: Purchasing power parities for the basic heading calculated and expressed as the number of units of national currency per unit of the selected numéraire currency. The prices used to calculate the PPPs are the average prices in national currencies that countries report for the products they priced for the basic heading - that is, the NC-Prices.
PLI: Price Level Indices. The PPPs expressed as a percentage of the corresponding exchange rate.
Weight: National expenditure weights scaled to 100,000. That part of a country's GDP that is spent on the basic heading when both expenditures are expressed in national currency and valued at national price levels.
No. of Items: Number of products that are priced by each country for the basic heading and the number of products priced by each country that are important - that is, the number of products assigned an asterisk (*).
Var.Co.: Country variation coefficient. The standard deviation of the country's CUP-Ratios for all products priced by the country for the basic heading expressed as a percentage of the arithmetic mean of the country's CUP-Ratios for all products priced by the country for the basic heading.
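As a rough illustration of the PLI and Var.Co. definitions, the sketch below recomputes the PLI of Country 1 from the sample table and a variation coefficient for a made-up set of ratios. The function names are hypothetical, and the use of the population standard deviation is an assumption, since the slides do not say which form is used.

```python
from statistics import mean, pstdev

def pli_percent(ppp, xr):
    """Price level index: the PPP as a percentage of the exchange rate."""
    return 100.0 * ppp / xr

def variation_coefficient(ratios):
    """Standard deviation as a percentage of the arithmetic mean.

    Assumption: population standard deviation (pstdev).
    """
    return 100.0 * pstdev(ratios) / mean(ratios)

# Country 1 from the sample table: PPP 1.8150; the XR shown as 4.42 is the
# rounded form of 4.418181818 in the Dikhanov example.
print(round(pli_percent(1.8150, 4.418181818), 2))

# Hypothetical ratios (in %) for the items one country priced in a BH.
print(round(variation_coefficient([95.0, 105.0, 100.0]), 2))
```

The first figure should agree with the 41.080% reported for Country 1.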
Item table details
Quaranta tables
Code: Code, name, and summary definition of the product covered in the subsequent product section.
Var.Co.: Item variation coefficient. The standard deviation of the product's PPP-Ratios expressed as a percentage of the arithmetic mean of the product's PPP-Ratios.
Pref. UoM: Unit of measurement.
Country: Country name.
NC-Price: Average price in national currency (NC).
Quotations: Number of price observations on which the average prices in national currency are based.
Var.Co.: Price observation variation coefficient. The standard deviation of the price observations underlying the product's average price expressed as a percentage of the arithmetic mean of the price observations underlying the product's average price.
XR-Price: The average prices in national currency converted to the numéraire currency with the exchange rates.
XR-Ratio: Standardized price ratios based on the exchange rate converted prices. The XR-Prices expressed as a percentage of their geometric mean.
PPP-Price: The average prices in national currency converted to the numéraire currency with the PPPs.
PPP-Ratio: Standardized price ratios based on the PPP converted prices. The PPP-Prices expressed as a percentage of their geometric mean.
Geo Mean: Geometric mean of the exchange rate converted prices and CUP-Prices.
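The item-table columns defined above can be derived from three inputs per country: the NC-Price, the exchange rate and the BH PPP. A minimal sketch follows, with hypothetical country labels and input figures (they are not taken from the sample table):

```python
from math import exp, log

def geo_mean(values):
    """Geometric mean of positive numbers."""
    return exp(sum(log(v) for v in values) / len(values))

def ratios_percent(converted):
    """Express converted prices as a percentage of their geometric mean."""
    g = geo_mean(list(converted.values()))
    return {c: 100.0 * p / g for c, p in converted.items()}

# Hypothetical inputs for one item: NC-Price, exchange rate and BH PPP per country.
nc_price = {"A": 1.5, "B": 700.0, "C": 6.0}
xr = {"A": 4.4, "B": 960.0, "C": 10.5}
ppp = {"A": 1.8, "B": 720.0, "C": 4.8}

xr_price = {c: nc_price[c] / xr[c] for c in nc_price}
ppp_price = {c: nc_price[c] / ppp[c] for c in nc_price}
xr_ratio = ratios_percent(xr_price)
ppp_ratio = ratios_percent(ppp_price)

for c in nc_price:
    print(c, round(xr_price[c], 3), round(xr_ratio[c], 1),
          round(ppp_price[c], 3), round(ppp_ratio[c], 1))
```

Because each ratio is taken against the geometric mean of the converted prices, the geometric mean of the ratios for an item is 100% by construction.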
• PPP price: the national average price of a specific product in a
specific country, deflated by the PPP.
• This price indicates how many units of currency of the
base country are needed to buy the same quantity of a
specific product that is bought with one unit of currency
of the base country in the base country.
• Since PPPs are calculated from all the products that
make up the BH, the PPP price indicates how a product
behaves in a specific country relative to the other
products in the same country.
• PPP ratio: the PPP price divided by the geometric mean
of the PPP prices in all the countries.
• The value of this ratio indicates how the country compares
with other countries for a specific product.
• That is to say, it shows how the price behaves relative to the
other products of the country in the BH and relative to the rest
of the countries for the same product.
• Inside a BH, the average of all the PPP ratios for a specific
country is 1 (or 100%), and at the same time the average
of the PPP ratios for all the countries for a specific product is
100%.
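In symbols, the two quantities can be written as follows. The notation is introduced here for illustration only: p_ij is the average price of product i in country j, PPP_j is the BH PPP of country j, and N_i is the number of countries pricing product i. The average across countries is assumed to be geometric, which is what makes the 100% property hold exactly for each product.

```latex
\[
\text{PPP-Price}_{ij} = \frac{p_{ij}}{\mathrm{PPP}_j},
\qquad
\text{PPP-Ratio}_{ij} = 100 \times
\frac{\text{PPP-Price}_{ij}}
     {\bigl(\prod_{k=1}^{N_i} \text{PPP-Price}_{ik}\bigr)^{1/N_i}}
\]
\[
\Bigl(\prod_{j=1}^{N_i} \text{PPP-Ratio}_{ij}\Bigr)^{1/N_i} = 100
\]
```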
• Suggested critical values:
  • CV larger than 33%
  • Ratios outside the 80% - 125% range
• With this tool one can detect problems with the quality of data
such as:
  • High price variation for a certain product in each country.
  • Average prices which are too high (or low) in nominal terms
  compared to the price in other countries when converted with the
  exchange rate.
  • Average prices which are too high (or low) in real terms compared
  to the price in other countries when converted with the PPP.
  • Behavior of average prices inside a BH. For instance, when the
  prices for the majority of the products in a specific BH for a
  specific country are below (above) the regional average and for a
  certain product the price is above (below) the regional average.
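The suggested critical values lend themselves to a simple automated screen. The sketch below applies them to the Country 1 and Country 3 figures from the sample item table; the function name and the dictionary keys are hypothetical, and a flag is only a prompt to check the data, not proof of an error.

```python
CV_LIMIT = 33.0               # suggested critical value for variation coefficients (%)
RATIO_RANGE = (80.0, 125.0)   # suggested acceptable range for ratios (%)

def flag_item_row(row):
    """Return a list of reasons why one country's entry for an item looks suspicious."""
    reasons = []
    if row["var_co"] > CV_LIMIT:
        reasons.append("high price variation")
    low, high = RATIO_RANGE
    for name in ("xr_ratio", "ppp_ratio"):
        if not low <= row[name] <= high:
            reasons.append(f"{name} outside {low:g}-{high:g}%")
    return reasons

# Figures for Country 1 and Country 3 from the sample item table.
rows = {
    "Country 1": {"var_co": 11.2, "xr_ratio": 45.89, "ppp_ratio": 95.02},
    "Country 3": {"var_co": 3.0, "xr_ratio": 9857.61, "ppp_ratio": 130.66},
}
for country, row in rows.items():
    print(country, flag_item_row(row))
```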
Dikhanov Tables
validation tables
Like the QTs, the Dikhanov Tables (DTs) consist of a set of tables
• Table details
• Summary table for each BH
• Individual tables or rows for each item
DTs can be used for the validation of
• Aggregated (above BH) and BH PPPs and PLIs
• Item XR-Ratios and CPD(R) residuals
Different lay-out options including color scheme
• Aggregate/BH tables + full item statistics
• Aggregate or BH tables + simple residual rows for items
Different calculation options
• CPD, CPRD, CPD-W
• DTs can be used as a substitute for, or a complement to, QTs in
order to detect potential problems with the data.
• QTs analyse one BH at a time.
• For certain products it is difficult to detect outliers
with QTs (averages are biased when few countries
collect data for a specific product).
• The main difference between QTs and DTs is that the
analysis in DTs does not consider data grouped by BH;
it considers items individually and simultaneously.
• This makes the analysis easier for products
which are the only representative of a BH, or when
there are only a few products in a BH.
Overview
Dikhanov tables
Dikhanov Temporal Analysis

Table details
                          Country1        Country2        Country3
Period                    Yearly - 2005   Yearly - 2005   Yearly - 2005
PPP                       2.934690064     658.1289976     4.040426119
STD                       0.245237431     0.256006128     0.291549487
No.of Priced Items        420             513             572
ER (LCU/US$)              2.43            527.47          5.78
Rebased_XR                4.418181818     959.0363636     10.50909091
PLI                       0.664230261     0.686239878     0.384469613

Item Level Details

BH table: Item Code 99.11.01.11.1, Item Name Rice
                          Country1        Country2        Country3
Period                    Yearly - 2005   Yearly - 2005   Yearly - 2005
PPP                       1.81507         718.297         4.84856
STD                       0.05109         0.0726994       0.274263
PLI                       0.410819        0.748978        0.461368
No.of Priced Items        2               5               6

Item table: 99.11.01.11.1.01, Long grain rice, prepacked
                          Country1        Country2        Country3
Residual                  -0.05109        -               0.26746
Average Price             1.5             -               5.51
No.of Observations        151             -               10
Coefficient of Variation  11.2214         -               3
XR Ratio                  70.4386         -               108.78
Table details
Dikhanov tables
PPP: Purchasing power parities for the basic heading or aggregate covered by the Table. They are expressed as the number of local currency units per unit of the selected numéraire currency. The prices used to calculate the PPPs are the average prices in local currencies that countries report for the products they priced for the basic heading or aggregate - that is, the average prices.
STD: Standard deviation of each country’s CPD or CPRD residuals for the basic heading or aggregate. It can be converted to a country variation coefficient by multiplying by 100.
#: Number of products specified for the basic heading or aggregate.
ER (LCU/US$): Market exchange rates of countries expressed as the number of local currency units per US dollar.
Rebased XR: Exchange rates rebased to the numéraire currency. Number of local currency units per unit of numéraire currency.
PLI: Price level indices. The PPPs expressed as a ratio of the corresponding rebased exchange rates.
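The rebasing and PLI steps can be retraced with the Country1 figures from the Dikhanov example. This is only a sketch: the function names are hypothetical, and the numéraire rate of 0.55 per US$ is an assumption inferred from the sample figures (2.43 becomes 4.418181818), not something the slides state.

```python
def rebase(er_per_usd, numeraire_er_per_usd):
    """Exchange rate per US dollar -> exchange rate per unit of the numéraire currency."""
    return er_per_usd / numeraire_er_per_usd

def pli(ppp, rebased_xr):
    """Price level index as a ratio (multiply by 100 for a percentage)."""
    return ppp / rebased_xr

# Assumption: inferred from the sample figures, not given in the slides.
NUMERAIRE_ER = 0.55

rebased = rebase(2.43, NUMERAIRE_ER)
print(round(rebased, 6))                 # Country1 rebased XR
print(round(pli(1.81507, rebased), 6))   # Country1 PLI for the Rice BH
print(round(100 * 0.05109, 3))           # STD 0.05109 as a variation coefficient (%)
```

Up to rounding of the inputs, the first two figures should match the Rebased_XR and BH PLI shown for Country1.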
BH details
Dikhanov tables
Item code: Code of the basic heading or aggregate covered by the Table.
Item name: Name of the basic heading or aggregate covered by the Table.
Period: Period during which the prices for the products covered by the Table were collected.
PPP: Purchasing power parities for the basic heading or aggregate covered by the Table. They are expressed as the number of local currency units per unit of the selected numéraire currency. The prices used to calculate the PPPs are the average prices in local currencies that countries report for the products they priced for the basic heading or aggregate - that is, the average prices.
STD: Standard deviation of each country’s CPD or CPRD residuals for the basic heading or aggregate. It can be converted to a country variation coefficient by multiplying by 100.
PLI: Price level indices. The PPPs expressed as a ratio of the corresponding rebased exchange rates.
#: Number of products specified for the basic heading or aggregate.
Item details
Dikhanov tables
Item code: Code of the basic heading or aggregate covered by the Table.
Item name: Name of the basic heading or aggregate covered by the Table.
Period: Period during which the prices for the products covered by the Table were collected.
STD: Standard deviation of the product’s CPD or CPRD residuals. It can be converted to a product variation coefficient by multiplying by 100. The mean of a product’s residuals is 1.
Count: Number of countries pricing the product.
Residual: CPD or CPRD residuals by product and country. CPD residuals in the Dikhanov table are equal to the logarithms of the CUP-Ratios in the Quaranta table.
Average Price: Average price in local currency units.
#: Number of price observations on which the average prices are based.
Coef. of Var.: Price observation variation coefficient.
XR Ratio: Price ratios based on exchange rate converted prices. The converted prices expressed as a percentage of their geometric mean.
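The link between the two tables stated under Residual can be checked with the sample figures. The sketch below assumes natural logarithms and ratios expressed in percent; the function names are hypothetical.

```python
from math import exp, log

def residual_from_ratio(ratio_percent):
    """CPD residual = natural log of the ratio expressed as a fraction."""
    return log(ratio_percent / 100.0)

def ratio_from_residual(residual):
    """Inverse mapping: residual back to a ratio in percent."""
    return 100.0 * exp(residual)

# PPP-Ratios for long grain rice in the Quaranta example: 95.02 and 130.66.
for ratio in (95.02, 130.66):
    print(ratio, round(residual_from_ratio(ratio), 4))
```

Up to rounding, these reproduce the residuals -0.05109 and 0.26746 shown for Country1 and Country3 in the Dikhanov example.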
Relation between Quaranta and Dikhanov Tables
Both tables provide essentially the same information, but
• Dikhanov Tables can be compiled for a group of BHs (an aggregate),
use color schemes to highlight potential outliers, and can be collapsed to
present only the residuals for items
• Quaranta Tables potentially present the relations within an item more clearly
Relation between PPP-Ratios and CPD residuals
• CPD residuals are equal to the logarithms of the CUP-Ratios
PPP-Ratios (%)    CPD residuals
0 to 14           Less than -2.0
14 to 47          -2.0 to -0.75
47 to 78          -0.75 to -0.25
78 to 128         -0.25 to 0.25
128 to 212        0.25 to 0.75
212 to 739        0.75 to 2.0
More than 739     More than 2.0
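The band edges in the two scales are linked by the exponential function. The sketch below assumes the residual edges are symmetric (-2.0, -0.75, -0.25, 0.25, 0.75, 2.0), which is what the PPP-Ratio edges imply; the function name `band` and the band numbering are hypothetical.

```python
from bisect import bisect_right
from math import exp, log

RESIDUAL_EDGES = [-2.0, -0.75, -0.25, 0.25, 0.75, 2.0]

# Exponentiating the residual edges gives the PPP-Ratio edges in percent.
RATIO_EDGES = [round(100 * exp(e)) for e in RESIDUAL_EDGES]
print(RATIO_EDGES)

def band(ratio_percent):
    """Band index for a positive PPP-Ratio in percent (0 = lowest, 3 = central, 6 = highest)."""
    return bisect_right(RESIDUAL_EDGES, log(ratio_percent / 100.0))

print(band(95.02), band(130.66), band(10.0))
```

A ratio in the central band (roughly 78% to 128%) would normally need no attention; the outer bands are the ones a color scheme would highlight.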
Case Study
• We will analyze a Quaranta Table and highlight the
problems encountered.
• We will follow the instructions provided in this
presentation.
THANK YOU