Coefficient of variation - example - Statistics for Development Division
Download
Report
Transcript Coefficient of variation - example - Statistics for Development Division
Sub-Regional Workshop on
International Merchandise Trade Statistics Compilation and
Export and Import Unit Value Indices
21 – 25 November 2016. Guam Reef & Olive Spa Resort, Tamuning, Guam
Price Indices and UVIs
Presentation by Rens Hendriks
Economic Statistician, Pacific Community
Email: [email protected]
Developing UVIs - Steps
•
•
•
•
•
•
Determine Scope and Coverage
Calculate the unit value indices
Are the selected items homogeneous?
Outlier detection
Calculating the weights
Aggregation of the indices
Data Requirements
• Most important:
– Detailed HS Code (8-10 digits)
– Value of the exports (f.o.b.) or imports (c.i.f.)
– The quantity of the exported or imported commodities
– Country of origin / destination
• Other important variables:
– Port of import / export
– Mode of transportation
– Country of origin / destination
– Customs procedure
Data preparation – Code validation
• Errors in the codes can occur for various reasons
– Incorrect codes given by agents
– Inconsistent lists / HS versions used between ports of entry
/ exit
• Detailed code validation difficult / expensive
• But, some checks can be done
– Easy check: can all the codes be matched to higher level
categories?
Example
8 Digit
6 Digit Code Description
code
3808.91
Insecticides
38089100
3808.91
Insecticides
38089100
3808.91
Insecticides
38089100
3808.91
Insecticides
38089100
3808.91
Insecticides
38089100
3808.91
Insecticides
38089100
3808.10
#N/A
38081000
3808.10
#N/A
38081000
3808.10
#N/A
38081000
HS 2007 Code
38
38.08
3808.91
3808.92
3808.93
3808.94
3808.99
Description
Insecticides
Insecticides
Insecticides
Insecticides
Insecticides
Insecticides
Insecticides
Insecticides
Insecticides
Qty
66
90
350
5
380
2
6
5
5
CIF Value
$782.02
$1,080.92
$4,099.20
$61.00
$4,478.62
$41.48
$89.06
$53.68
$54.90
Country
Australia
Australia
Australia
Australia
Australia
Australia
China
Solomons
China
Description
CHAPTER 38 - MISCELLANEOUS CHEMICAL PRODUCTS
Insecticides, rodenticides, fungicides…
-- Insecticides
-- Fungicides
-- Herbicides, anti-sprouting products…
-- Disinfectants
-- Other
Date
06-Jan-12
06-Jan-12
06-Jan-12
06-Jan-12
06-Jan-12
09-Jan-12
10-Jan-12
16-Jan-12
25-Jan-12
Step 1 - Coverage
• Determine Scope and Coverage
– What will be the main use of the indices?
• Limitation to UVIs
– Often excludes intra-customs area trade
– Often doesn’t cover trade in duty free goods
– Doesn’t cover trade in services
– Homogeneous items
• Discuss with users!
Coverage - Level of detail for UVIs
• In principle, UVIs should be calculated at the lowest level of
the classification
• Where possible, stratify by:
– Country of origin / destination
– Mode of transportation
– Customs procedure
• BUT: Cells should be large enough so that the XMPI can be
calculated (nearly) every quarter.
– Cell represents minimum share of total imports (e.g. 1‰)
– At least 20 consignments of the commodity per year
– UVI can be calculated at least 3 out of 4 quarters per year
Exercise 2.1
• Create two pivot tables for the data in the excel sheet
– Table 1: Determine the CIF value per commodity and
country.
• Set the value field setting to “% of column total”
– Table two: Determine the number of observations per
commodity per country per quarter.
• Three criteria: Value is at least 1‰, at least 20 consignments
per year, UVI can be calculated at least three quarters of the
year.
• Which of the cells would you cover?
Chapter
I
II
III
IV
V
VI
VII
VIII
IX
X
XI
XII
XIII
XIV
XV
XVI
XVII
XVIII
XX
XIX, XXI, XXII
Total
Description
LIVE ANIMALS; ANIMAL…
VEGETABLE PRODUCTS…
ANIMAL OR VEGETABLE …
PREPARED FOODSTUFFS;…
MINERAL PRODUCTS…
PRODUCTS OF THE CHEM…
PLASTICS AND ARTICLE…
RAW HIDES AND SKINS,…
WOOD AND ARTICLES OF…
PULP OF WOOD OR OF O…
TEXTILES AND TEXTILE…
FOOTWEAR, HEADGEAR, …
ARTICLES OF STONE, P…
NATURAL OR CULTURED …
BASE METALS AND ARTI…
MACHINERY AND MECHAN…
VEHICLES, AIRCRAFT, …
OPTICAL, PHOTOGRAPHI…
MISCELLANEOUS MANUFA…
OTHER…
Value
$79,525.00
$114,224.00
$7,969.00
$85,547.00
$394,195.00
$99,236.00
$60,136.00
$5,264.00
$19,379.00
$18,775.00
$113,543.00
$10,988.00
$24,285.00
$11,892.00
$94,771.00
$380,509.00
$240,888.00
$21,696.00
$52,424.00
$18,015.00
$1,853,261.00
Percentage
4.29%
6.16%
0.43%
4.62%
21.27%
5.35%
3.24%
0.28%
1.05%
1.01%
6.13%
0.59%
1.31%
0.64%
5.11%
20.53%
13.00%
1.17%
2.83%
0.97%
100%
Step 2 - Calculate the UVIs
• ‘Easiest’ step in the process
• For as long as possible period, but at least one year
• Quick to do in excel using pivot tables
– Rows: Tariff Code/Description, Quarter
– Value
• CIF and quantities
• Or use a calculated field to calculate
the unit values directly in the Pivot Table
(CIF / Quantity)
Exercise 2.2
• Data on two detailed items
– porland cement and vehicle parts and accessories
• Create a pivot table to calculate total value and quantities for
each item per quarter
• Calculate the UVIs
– Either from the values in the pivot table
– Or use a calculated field
• Calculate the indices from the unit values
• What do you think of the resulting index. Is one more stable
than the other?
Are the products homogeneous?
• How close are the unit value prices together?
• Law of one price
– The less spread out the prices, the more confident we can
be that the product is homogeneous.
• One measure of spread is the coefficient of variation.
– Measures the variation in relation to the mean
– Is calculated as: standard deviation / mean
Standard Deviation
Coefficient of variation - example
• Coefficient of variation = Standard Deviation / Mean
• Numbers: 10, 8, 9, 10, 7, 4
• Mean: (10 + 8 + 9 + 10 + 7 + 4) / 6 = 8
• Standard deviation = 𝑉𝑎𝑟𝑖𝑎𝑛𝑐𝑒
• Variance = sum of squared differences / N
Coefficient of variation - example
• Numbers: 10, 8, 9, 10, 7, 4
• Mean = 8
• Variance = sum of squared differences from mean / N
Number
10
8
9
10
7
4
Difference from mean
10-8=
2
8-8=
0
9-8=
1
10-8=
2
7-8=
-1
4-8=
-4
Squared
difference
4
0
1
4
1
16
• Sum of squared differences = 4 + 0 + 1 + 1 + 16 = 22
• Variance = 22 / 6 = 4.333
• Standard deviation = 4.333 = 2.0816
Coefficient of variation - example
• Coefficient of variation = Standard deviation / Mean
• Mean = 8
• Standard deviation = 2.0816
• Coefficient of variation = 2.0816 / 8 = 0.2602
Exercise 2.3
• Excel has built in functions for most parts of the calculations
– No specific function for CoV, but:
• Function for mean: = AVERAGE(…)
• Function for standard deviation: = STDEV.P(…)
• Exercise 2.3:
– Calculate the average price for each transaction
– compute the coefficient of variation of the prices
• CoV = Standard Deviation / Mean
• IMF manual suggests the CoV should be below 0.5; Can an
index for this group be published?
CoV – Multiple groups
• For the development of UVIs, we have to determine the
homogeneity for many groups at once.
• Not practical to do separately for each group.
• For serious analyses, it’s best to use statistical software.
• However, first analysis can be done in excel using pivot tables.
Exercise 2.4
• Calculate the prices for each record
• Create a pivot table
– Use Tariff Codes/Descriptions and Quarters for the rows
– Add the average price to the value field twice
• Set value field setting for first to “Average”
• For second set it to “Stddevp”
• Calculate the CoV as: Standard deviation / Average
• On the basis of the table, which indices would you publish?
• Repeat the exercise, this time, add “Country” to the row
labels.
• Does it change which indices you would publish?
Analysing heterogeneity
• Items with a high (>0.5) Coefficient of Variation are not
homogeneous.
• Why?
– Items with distinctly different prices
– Outliers
– Group is truly heterogeneous
• Inspect the groups with high CoV
– Visual tests: histogram
– Statistical tests.
Histograms
• Homogeneous Good
Histograms
• Items with distinct price clusters
Histograms
• Outliers
Histograms
• No pattern
Histogram – Excel
• Load the analysis toolpack
– Click the File tab, and then click Options.
– Click Add-Ins, and then in the Manage box, select Excel
Add-ins.
– Click Go.
– In the Add-Ins available box, select the Analysis ToolPak
check box, and then click OK.
• If Analysis ToolPak is not listed in the Add-Ins available
box, click Browse to locate it.
• If you get prompted that the Analysis ToolPak is not
currently installed on your computer, click Yes to install it.
Exercise 2.5 - Histogram
• Create a histogram of the prices in exercise 2.5
• To create the histogram
– Select data -> data analysis
– Choose Histogram
• In the dialog frame:
– In the input range, add your column with prices
– Leave the bin range empty
– Tick the box for “Chart Output”
• Do you think it is a homogeneous good?
• Do you think there might be outliers?
Testing for Outliers
• In our histogram, we’ve seen that we may have some outliers
• But histograms by themselves can be a bit deceptive.
• We need a better measure
• One such measure is Tukey’s Test:
• Delete all observations outside of the range:
• [𝑄1 − 1.5(𝑄3 − 𝑄1 ), 𝑄3 + 1.5(𝑄3 − 𝑄1 )]
• Quartiles:
Tukey’s Box Plot
An example using Excel
Value
15
25
32
33
37
42
Q1
=QUARTILE.EXC(A2:A12,1)
32
Q2
=QUARTILE.EXC(A2:A12,2)
42
Q3
=QUARTILE.EXC(A2:A12,3)
70
[𝑸𝟏 − 𝟏. 𝟓(𝑸𝟑 − 𝑸𝟏 ), 𝑸𝟑 + 𝟏. 𝟓(𝑸𝟑 − 𝑸𝟏 )]
43
Bounds
52
Lower bound = 32 - 1.5*(70-32)
-25
70
Upper bound = 70 + 1.5 *(70-32)
127
73
140
• Do we have any outliers?
Outliers for UVI
• Outlier detection assumes symmetrical distribution.
• UVIs often skewed to the rights
• Therefore we first take the natural log of the prices
Exercise 2.6
• Calculate the price for each record
• In the next column, calculate the natural logarithm of the
prices
– Use the LN() function
• Determine 1st, 2nd and 3rd Quantiles
– Use the QUANTILE.EXC() function
• Determine the upper and lower bounds for the outliers
• Can you identify outliers?
Deriving weights
• Weight for the unit value indices are derived directly from the
customs data.
• The weight of a particular unit value is simply determined as
its share in total import/exports
• $12 mln of imports of vehicles from Australia
• Total imports are $480 mln
• Weight = 12 / 480 = 0.025 = 2.5%
Adjusting for non-covered commodities
• It is unlikely that we cover all of the commodities for which
there are imports or exports.
• In order to make sure that the weights remain the same at
higher levels of the classification, we have to redistribute the
non-covered items.
• This is done by dividing the values proportionally over the
covered codes.
Example
• Imports for 38.08 Insecticides, rodenticides, fungicides…
Covered?
YES
YES
YES
YES
NO
Code
3808.91
3808.92
3808.93
3808.94
3808.99
Description
-- Insecticides
-- Fungicides
-- Herbicides, anti-spro…
-- Disinfectants
-- Other
CIF Covered
CIF Total
Adjustment factor
CIF (x 1000$)
$ 250.00
$ 450.00
$ 400.00
$ 200.00
$
50.00
$ 1,300.00
$ 1,350.00
1.03846154
CIF Adjusted
$
259.62
$
467.31
$
415.38
$
207.69
$
1,350.00