Outlets Register

Realisation of Outlets Register for the implementation of
new probability sample strategy on the Consumer Price
Index Survey
Wiesbaden Group on Business Registers
International Roundtable on Business Survey Frames
Tallinn, 27 – 30 September 2010
Simonetta Cozzi, ISTAT, Italy
Introduction
This presentation describes the process used to create a first version of the
Outlets Register.
The register was set up as part of the project “Development of instruments
for implementing a new sampling strategy and completing a new data
collection system in the production of a Consumer Price Index”.
The project was launched following the directives of a scientific committee,
composed of university professors and researchers from various institutions,
established at ISTAT to steer and monitor the innovation process for the
production of the CPI.
Introduction
Project CPIs: contributions of the three divisions involved
• Statistical Registers and Administrative Data: Outlets Register, in order to
have a useful sampling frame
• Information Technology and Methodology: new probability sampling
strategy for CPIs
• Structural Statistics and Consumer Prices: new data collection system,
quality survey to test the Outlets Register
Summary
• The current Consumer Price Index (CPI)
• The new sampling strategy proposed for the CPI
• Outlets Register
• Quality survey
Current Consumer Price Index: characteristics
The collection of prices is carried out in two different ways:
1. centrally (about 60 products), by Istat staff, for products and services subject to
national pricing policies and for prices that are difficult to observe directly;
2. locally (about 500 products), directly by the staff of Municipal Statistical Offices, from
individual outlets at territorial level.
The local survey has three sampling stages:
• The first-stage units (PSUs) are 83 municipalities.
• The second-stage units are the outlets purposively chosen in each PSU (about
40,000). The selection is made through a kind of quota sampling, so as to be
representative of consumer behaviour.
• The most sold elementary items of the fixed basket* are observed in each selected
outlet (about 400,000).
*At the beginning of this year, Istat defined a fixed basket which includes, relying on purposive selection, 562
representative elementary items (products or services), each consisting of a group of products that are as similar
as possible and relatively homogeneous.
Current Consumer Price Index: issues
• The purposive sampling strategy used in the current survey structure
prevents the exact computation of the sampling precision (the standard
error) of the current CPI estimates.
• Not all municipalities are included in the survey; this could cause bias if
the excluded municipalities display price movements that
systematically differ from those of the included municipalities.
• The selection criterion of the “most sold” elementary item of the product in
each outlet under-represents the smaller brands and products and could
introduce an unknown bias.
New sampling strategy proposed: overview
To improve the quality of the CPIs, a new probability sampling strategy has
been proposed by the division for Information Technology and Methodology.
The new sampling design is based on:
• the hypothesis that turnover is a good proxy for household
consumption expenditure
• the availability of a sampling frame consisting of a list of outlets
defined on the basis of the information collected in the Business Register (BR)
Given the availability of the frame, the sampling design consists of a three-stage
selection:
1. Local districts (selected within the geographical area through balanced
sampling)
2. Outlets (D distinct samples, one for each product, drawn through a coordinated
selection method to obtain a high level of overlap between the samples
selected for each type of product)
3. Items (based on an iterative hierarchical drawing of groups of products)
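The slides do not say which coordinated selection method is used in stage 2. As a rough illustration only, the sketch below uses permanent random numbers, one well-known way of making samples drawn from the same frame overlap. The function names, the hash-based random number and the toy frame are all assumptions, and the selection here is with equal probabilities rather than related to turnover.

```python
import hashlib

def permanent_random_number(outlet_id: str) -> float:
    """Map an outlet id to a fixed pseudo-random number in [0, 1).

    Hypothetical helper: the same outlet always receives the same number,
    which is what lets samples for different products overlap.
    """
    digest = hashlib.sha256(outlet_id.encode("utf-8")).hexdigest()
    return int(digest[:12], 16) / 16**12

def select_outlets(frame, product, n):
    """Select the n eligible outlets with the smallest permanent random numbers."""
    eligible = [o for o in frame if product in o["products"]]
    eligible.sort(key=lambda o: permanent_random_number(o["id"]))
    return [o["id"] for o in eligible[:n]]

# Toy frame (illustrative values, not taken from the register).
frame = [
    {"id": "A", "products": {"meat", "fish"}},
    {"id": "B", "products": {"meat", "fish"}},
    {"id": "C", "products": {"meat", "fish"}},
    {"id": "D", "products": {"meat", "fish"}},
    {"id": "E", "products": {"bread"}},
]

meat_sample = select_outlets(frame, "meat", 2)
fish_sample = select_outlets(frame, "fish", 2)
print(meat_sample, fish_sample)
```

Because outlets A to D sell both products and keep the same random number, the two samples coincide, so a single visit can serve both products.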
Outlets Register
The new sampling strategy for the CPI survey is based on the availability of a
complete and sufficiently up-to-date list of outlets (the Outlets Register).
The Outlets Register has been set up by the division for
“Statistical Registers and Administrative Data”, in order to provide a useful
sampling frame for the CPIs.
The information base of the Outlets Register should contain:
• Identification codes of outlets
• Economic activity code
• Types of products sold 1…n (COICOP classification)
• Turnover for each type of product sold
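One possible way to picture a single register record with these four pieces of information is sketched below. The class name, field names and sample values are hypothetical and do not come from the actual register layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class OutletRecord:
    """Hypothetical layout of one Outlets Register record."""
    outlet_id: str                      # identification code of the outlet
    ateco_code: str                     # economic activity code
    # turnover by type of product sold, keyed by COICOP group code
    turnover_by_product: Dict[str, float] = field(default_factory=dict)

    @property
    def products_sold(self) -> List[str]:
        """COICOP groups sold by the outlet (types of products 1…n)."""
        return sorted(self.turnover_by_product)

    @property
    def total_turnover(self) -> float:
        """Outlet turnover as the sum over the types of products sold."""
        return sum(self.turnover_by_product.values())

# Illustrative values only.
record = OutletRecord(
    outlet_id="OUT-0001",
    ateco_code="47.11",
    turnover_by_product={"1.1.2": 60000.0, "1.1.3": 40000.0},
)
print(record.products_sold, record.total_turnover)
```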
Outlets Register: administrative sources used
The basis for the set-up of the Outlets Register is the Statistical
Register of Local Units (ASIA-UL), which every year supplies information on the
territorial location, economic activity and employees of the local units of
enterprises; this information was previously available only every ten years, from
the Industry and Services Census.
The economic activity code makes it possible to identify:
• the local units that can be classified as outlets
• the type of products sold
Setting up the Outlets Register requires the integration of different
administrative and statistical sources in order to obtain a register with the
information needed to apply the proposed sampling strategy.
Outlets Register: administrative sources used
Studi di Settore (Sds) – “Business sector analysis”
• introduced by the Italian tax administration to calculate reference
revenue levels for taxpayers
• cover small and medium-sized firms and independent workers with an
annual turnover under 7.5 million euro
The Sds are based on a sophisticated statistical procedure which aims at
estimating a reasonable turnover value for each taxpayer.
In order to estimate a plausible level of turnover, data are collected from all
firms that report similar activity codes.
Data include structural variables (surface area of offices and warehouses,
number of employees, type of customers, product output) and accounting
variables (mainly costs).
Outlets Register: administrative sources used
Number of taxpayers in Studi di settore by sector of activity. Years 2005-2008
[Chart. Series: Manufacturing, Service Activities, Professional activities, Trade. Data labels by year:]
2005: 380,154; 1,572,640; 646,422; 676,011
2006: 385,482; 1,755,602; 708,191; 722,833
2007: 383,986; 1,850,573; 775,202; 725,171
2008: 369,318; 1,770,034; 666,997; 714,416
About 80% of Italian firms, i.e. about 3.8 million, are eligible to be audited
on the basis of the Sds.
Outlets Register: administrative sources used
Sds issues
• Different information structures with different recorded variables
(different degrees of complexity in the statistical translation)
• Different definitions and classifications, e.g. the classification of products sold
(they must be translated into a statistical framework before use)
• Lack of local unit identification
(only the municipality code, without addresses)
Outlets Register: administrative sources used
Retail Trade Register - Nielsen
The register contains information on about 28 thousand local units of enterprises
in the retail food trade sector:
• hypermarkets
• supermarkets
• discount stores
• 15 thousand stores with a sales area of between 100 and 400 m²
The information available is:
• identification variables of the unit
• size in terms of employees
• number of cash desks
• sales area
• type of counters present (frozen food, vegetables, meat, …)
• sales potential indexes, which supply an estimate of the turnover of the
main types of product sold by each local unit
Outlets Register: ATECO-COICOP correspondence table
To apply the proposed sampling strategy, each outlet must be assigned
information on the types of products sold (TPs).
To identify the TPs sold by each outlet, the division for Structural Statistics and Consumer Prices
has developed a correspondence table between the classification of economic activities
(ATECO, the national version of NACE) and the classification of
products (COICOP, Classification of Individual Consumption by Purpose).
COICOP code | COICOP group description | ATECO 2002 code | ATECO 2002 class/division description | ATECO 2007 code | ATECO 2007 class/division description
1.1.2 | Meat | 52.11 | Retail sale in non-specialized stores with food, beverages or tobacco predominating | 47.11 | Retail sale in non-specialised stores with food, beverages or tobacco predominating
1.1.2 | Meat | 52.27.2 | Other retail sale of food, beverages and tobacco in specialized stores | 47.29.9 | Other retail sale of food in specialised stores
1.1.2 | Meat | 52.62.1 | Retail sale via stalls and markets of food, beverages and tobacco products | 47.81. | Retail sale via stalls and markets of food, beverages and tobacco products
1.1.2 | Meat | 52.22.0 | Retail sale of meat and meat products | 47.22. | Retail sale of meat and meat products in specialised stores
1.1.2 | Meat | 52.31.0 | Dispensing chemists | 47.73. | Dispensing chemist in specialised stores
Outlets Register: ATECO-COICOP correspondence table
The correspondence between ATECO and COICOP is in some cases 1 to 1,
namely one ATECO code is associated with one product group, and in others 1 to
n, where n can vary from 2 to 44 (for example, 44 product groups are
associated with the ATECO code 47.11.1 – hypermarkets).
Type of correspondence | Number of ATECO codes
1-1 | 144
1-n (n = 2…44) | 47
The association between TPs and ATECO codes is fairly straightforward for TPs
related to goods, but involves some uncertainty when dealing with services.
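The two types of correspondence can be pictured as a lookup from an ATECO code to a list of COICOP groups. The sketch below is illustrative only: the dictionary holds a tiny hypothetical excerpt (the link between 47.22 and group 1.1.2 follows the meat example in the table, while the list for 47.11.1 is a three-group stand-in for the 44 groups mentioned), and the function name is an assumption.

```python
from collections import Counter

# Hypothetical excerpt of the ATECO 2007 -> COICOP group correspondence table.
ATECO_TO_COICOP = {
    "47.22": ["1.1.2"],                      # meat retailers: one product group
    "47.11.1": ["1.1.1", "1.1.2", "1.1.3"],  # hypermarkets: stand-in for 44 groups
}

def correspondence_type(ateco_code: str) -> str:
    """Return '1-1' when the code maps to one COICOP group, '1-n' otherwise."""
    groups = ATECO_TO_COICOP[ateco_code]
    return "1-1" if len(groups) == 1 else "1-n"

# Count ATECO codes by type of correspondence, as in the summary table.
counts = Counter(correspondence_type(code) for code in ATECO_TO_COICOP)
print(dict(counts))
```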
Outlets Register: Macro phases of realization
The set-up of the Outlets Register for the food trade includes five macro phases of
work:
1. Integration of ASIA-UL and the ATECO/COICOP correspondence table
2. Analysis of the administrative sources considered
3. Integration of the administrative sources
4. Imputation of the turnover to each outlet
5. Checking
Outlets Register: Phase 1 – Integration of ASIA-UL and the
ATECO/COICOP correspondence table
The subset of local units containing the outlets selling goods in the food trade is
defined by selecting all the outlets characterised by an ATECO code linked to
a COICOP code of the food trade.
The output of this phase is a list of outlets, LU0, with the following information:
• identification characters (name, fiscal code, address, etc.)
• economic activity code
• types of products sold (COICOP code at group level)
• employees
• enterprise turnover
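A minimal sketch of this selection step, under the assumption that local units and the correspondence table are held as plain Python dictionaries. The field names, the sample units and the non-food code 47.71 are hypothetical.

```python
# Hypothetical food-trade excerpt of the ATECO -> COICOP correspondence table.
FOOD_COICOP_BY_ATECO = {
    "47.11": ["1.1.2"],
    "47.22": ["1.1.2"],
}

def build_lu0(local_units, table):
    """Keep the local units whose ATECO code is linked to a food COICOP group
    and attach the types of products sold to each of them."""
    lu0 = []
    for unit in local_units:
        groups = table.get(unit["ateco"])
        if groups:
            lu0.append({**unit, "coicop_groups": list(groups)})
    return lu0

# Illustrative local units, loosely shaped like ASIA-UL records.
local_units = [
    {"fiscal_code": "01234567890", "ateco": "47.22", "employees": 3},
    {"fiscal_code": "09876543210", "ateco": "47.71", "employees": 5},  # non-food
]

lu0 = build_lu0(local_units, FOOD_COICOP_BY_ATECO)
print(lu0)
```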
Outlets Register: Phase 1 – Integration of ASIA-UL and the
ATECO/COICOP correspondence table
Outlets by enterprise typology and number of products sold (percentage values)
Number of products sold | Single-location | Multi-location | Total
1 | 18.3 | 2.4 | 20.7
>1 | 66.1 | 13.2 | 79.3
Total | 84.4 | 15.6 | 100.0
Total outlets: 229,945
Outlets Register: Phase 2 – Source analysis
Each source used to build the Outlets Register needs specific
pre-treatment before being compared and integrated:
• standardization and normalization of common variables such as
location and fiscal code
• checks and decoding (e.g. of location, at municipality code level)
to transform the values of a variable from the input source's own
classification into statistical codes
• coherence tests on the identification codes used in the linkage process
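As an illustration of what normalising and checking an identification code might involve, the sketch below cleans a fiscal code and applies a format-only check. It assumes the usual formats of Italian fiscal codes (11 digits for enterprises, 16 alphanumeric characters for individuals) and does not verify check digits; the function names are hypothetical and this is not the procedure actually used for the register.

```python
import re

def normalise_fiscal_code(raw: str) -> str:
    """Remove spaces and punctuation and convert to upper case."""
    return re.sub(r"[^0-9A-Za-z]", "", raw).upper()

def is_plausible_fiscal_code(code: str) -> bool:
    """Format-only coherence test: 11 digits or 16 alphanumeric characters."""
    return bool(re.fullmatch(r"\d{11}", code) or re.fullmatch(r"[A-Z0-9]{16}", code))

cleaned = normalise_fiscal_code(" 0123 4567-890 ")
print(cleaned, is_plausible_fiscal_code(cleaned))
```

Codes that fail the check would be set aside for review before the linkage step rather than matched as they are.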
Outlets Register: Phase 2 – Source analysis
Retail Trade Register:
• comparison of this source with the “survey on the local units of large
enterprises” (IULGI) to evaluate its quality
• analysis of the methodology used for the sales potential indexes and of their
usefulness for estimating product turnover
Sds:
• identification of the potentially useful Sds among the 206 Sds available
• analysis of the selected Sds, which gather different kinds of information
• identification of the products sold
• development of editing and imputation procedures for using the data
Outlets Register: Phase 2 – Source analysis
Sds used to set up the Outlets Register
Denomination | Code | Number of enterprises
Retail sale of meat | TM02U | 28 thousand
Retail sale in non-specialised stores with food, beverages (minimarkets, supermarkets, etc.) | TM01U | 63 thousand
Retail sale of fruit and vegetables | TM27A | 15 thousand
Retail sale of fish | TM27B | 5 thousand
Retail sale of frozen products | TM30U | 900
Retail sale of cakes | TD01 | 2 thousand
Retail sale of flour confectionery | TD12U | 5 thousand
Information useful for determining the turnover of the outlets:
• percentage of profits made on each type of product sold, as a ratio of
the total profits made at enterprise level
• percentage of profits made by each outlet, as a share of the total
profits, at local unit level (only for TM01U and TM02U)
Outlets Register: Phase 2 – Source analysis
To use the Sds data to estimate the turnover of the outlets of traditional distribution,
it is necessary to define correspondence tables between the product classifications
used in the Sds and the COICOP classification at group level.
Retail sale in non-specialised stores with food, beverages - TM01U
[Correspondence matrix. Rows: the COICOP groups listed below. Columns: the Sds product categories D12 to D20 (among them bread, fresh pasta and oven-baked products; fresh fruit and vegetables; fresh fish and shellfish; butcher products; salami and cold meats; milk and dairy products; cakes and sweets; alcoholic and super-alcoholic drinks; alcohol-free drinks). An X marks each correspondence.]
COICOP denominations:
• Bread and cereals
• Meat
• Fish and seafood
• Milk, cheese and eggs
• Oils and fats
• Fruit
• Vegetables
• Sugar, jam, honey, chocolate and confectionery
• Food products n.e.c.
• Coffee, tea and cocoa
• Mineral waters, soft drinks, fruit and vegetable juices
• Spirits
• Wine
• Beer
Outlets Register: Phase 3 – Sources integration
The integration process starts by matching the above-mentioned sources with
the first subset of local units containing the outlets selling products in the food
trade (LU0), using the primary key for unit identification (the fiscal code).
Outlets’ distribution by source presence and products sold (number of outlets, % of total outlets in brackets)
Products sold | Absence: single-location | Absence: multi-location | Absence: total | Presence: single-location | Presence: multi-location | Presence: total | Total
1 | 5,749 (2.5) | 1,610 (0.7) | 7,359 (3.2) | 36,331 (15.8) | 3,909 (1.7) | 40,240 (17.5) | 47,599 (20.7)
>1 | 77,951 (33.9) | 15,866 (6.9) | 93,817 (40.8) | 73,812 (32.1) | 14,487 (6.3) | 88,529 (38.5) | 182,346 (79.3)
Total | 83,700 (36.4) | 17,246 (7.5) | 101,176 (44.0) | 110,144 (47.9) | 18,626 (8.1) | 128,769 (56.0) | 229,945 (100.0)
More than 128 thousand outlets (56%) are present in at least one of the sources considered.
For these outlets, the percentage of turnover per product (at enterprise level), the percentage of
turnover per local unit (in some cases, for multi-location enterprises) and the
sales potential indexes are available.
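The matching on fiscal code can be sketched as follows, assuming each source has already been reduced to a set of normalised fiscal codes. The function name, the source labels and the sample codes are hypothetical.

```python
def flag_source_presence(lu0, sources):
    """Attach to each LU0 outlet the list of sources in which its fiscal code appears."""
    flagged = []
    for unit in lu0:
        found = sorted(
            name for name, codes in sources.items() if unit["fiscal_code"] in codes
        )
        flagged.append({**unit, "sources": found, "in_any_source": bool(found)})
    return flagged

# Illustrative data only.
lu0 = [
    {"fiscal_code": "01234567890"},
    {"fiscal_code": "09876543210"},
    {"fiscal_code": "05555555555"},
]
sources = {
    "sds": {"01234567890", "05555555555"},
    "retail_trade_register": {"05555555555"},
}

flagged = flag_source_presence(lu0, sources)
share_present = 100 * sum(u["in_any_source"] for u in flagged) / len(flagged)
print(flagged, share_present)
```

Tabulating the `in_any_source` flag against enterprise typology and number of products sold would give a distribution of the same shape as the table above.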
Outlets Register: Phase 4 – Turnover imputation
Type | Presence in sources | Products sold | Information used to estimate turnover | Outlets (%) | Employees (%)
single-location | yes | 1, >1 | Enterprise turnover, TPs turnover | 47.9 | 31.0
single-location | no | 1 | Enterprise turnover | 2.5 | 1.4
single-location | no | >1 | Enterprise turnover | 33.9 | 23.9
multi-location | yes | 1, >1 | Enterprise turnover, TPs turnover, turnover for local unit (only for Sds TM01U, TM02U) | 8.1 | 29.0
multi-location | no | 1 | Enterprise turnover | 0.7 | 0.7
multi-location | no | >1 | Enterprise turnover | 6.9 | 14.0
Outlets Register: Phase 4 – Turnover imputation
The next steps are:
1. Imputation of the turnover for each outlet
• For single-location enterprises, the turnover of the outlet coincides
with the turnover of the enterprise.
• For multi-location enterprises, the turnover of each outlet has to be
imputed, using models relating turnover to auxiliary
information such as the economic activity.
2. Imputation of the turnover for all the TPs sold by the outlet
• For outlets selling only one product, the turnover is
attributed entirely to the product sold.
• For outlets selling p products, the turnover for each product has to be
estimated, using as training data the information on TPs turnover
available from the Sds.
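The slides do not describe the imputation models themselves. The sketch below is a deliberately simple stand-in that only shows the shape of the two steps: it splits the enterprise turnover across outlets in proportion to employees, and splits the outlet turnover across products using reference shares that are imagined to come from Sds data. Both allocation rules, the function names and the figures are assumptions, not the method used for the register.

```python
def impute_outlet_turnover(enterprise_turnover, outlets):
    """Step 1: turnover per outlet.

    Single-location: the outlet takes the whole enterprise turnover.
    Multi-location: assumed split in proportion to employees (stand-in model).
    """
    if len(outlets) == 1:
        return {outlets[0]["id"]: enterprise_turnover}
    total_employees = sum(o["employees"] for o in outlets)
    if total_employees == 0:
        # Fallback for missing employment data: equal split.
        return {o["id"]: enterprise_turnover / len(outlets) for o in outlets}
    return {
        o["id"]: enterprise_turnover * o["employees"] / total_employees
        for o in outlets
    }

def impute_product_turnover(outlet_turnover, products, reference_shares):
    """Step 2: turnover per type of product sold.

    One product: it takes the whole turnover.
    p products: assumed split using reference shares rescaled to the products sold.
    """
    if len(products) == 1:
        return {products[0]: outlet_turnover}
    total_share = sum(reference_shares[p] for p in products)
    return {p: outlet_turnover * reference_shares[p] / total_share for p in products}

# Illustrative figures only.
by_outlet = impute_outlet_turnover(
    1000.0, [{"id": "U1", "employees": 3}, {"id": "U2", "employees": 1}]
)
shares = {"1.1.1": 0.25, "1.1.2": 0.5, "1.1.3": 0.25}  # hypothetical reference shares
by_product = impute_product_turnover(300.0, ["1.1.2", "1.1.3"], shares)
print(by_outlet, by_product)
```

In both steps the imputed values add up to the turnover being distributed, which is a convenient property to verify in the checking phase.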
Outlets Register: Survey quality
To verify the quality of the Outlets Register and of the proposed probability
sampling design for the CPI, the division for Structural Statistics and
Consumer Prices will carry out a quality survey in October.
Three main themes will be evaluated during this survey:
• coherence between the list of outlets in the selected sample and the
actual distribution of businesses in the Municipalities involved,
considering a delay of around 18 months between the reference date
of the starting list and the moment in which the sample is drawn;
• correctness of the methods for turnover imputation;
• performance of different schemes for selecting the sample of outlets.
Outlets Register: Survey quality
Municipalities involved in the survey
Municipality | Area | Municipal population | Provincial population (01 January 2009) | Municipal population as % of provincial population | Weight % of large-scale distribution in the food trade at regional level (year 2008)
Milan | North | 1,295,705 | 3,930,345 | 32.97 | 80.48
Turin | North | 908,825 | 2,290,990 | 39.67 | 63.53
Trento | North | 114,236 | 519,800 | 21.98 | 78.45
Udine | North | 99,071 | 539,723 | 18.36 | 77.45
Bologna | North | 374,944 | 976,175 | 38.41 | 77.32
Florence | Centre | 365,659 | 984,663 | 37.14 | 63.62
Rome | Centre | 2,724,347 | 4,110,035 | 66.29 | 59.25
Naples | South and Islands | 963,661 | 3,074,375 | 31.34 | 33.77
Source: Istat, division for Structural Statistics and Consumer Prices
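The population weight in the table is the municipal population divided by the provincial population, expressed as a percentage. The short check below recomputes it for three municipalities using the figures in the table; the function and variable names are illustrative.

```python
def population_weight_pct(municipal: int, provincial: int) -> float:
    """Municipal population as a percentage of the provincial population."""
    return round(100 * municipal / provincial, 2)

# (municipal population, provincial population) as reported in the table.
populations = {
    "Milan": (1_295_705, 3_930_345),
    "Rome": (2_724_347, 4_110_035),
    "Naples": (963_661, 3_074_375),
}

weights = {name: population_weight_pct(m, p) for name, (m, p) in populations.items()}
print(weights)
```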
Outlets Register: Survey quality
The survey will be divided into three phases:
• a first phase, carried out jointly by Istat and the Municipalities,
during which the sample of outlets is drawn and the coherence between the
drawn sample and the trade structure of each individual Municipality is
verified. During this phase, the software for selecting the sample will be tested
and questionnaires will be prepared containing the information held in the
Outlets Register (COICOP group turnover for each unit selected).
• a second phase, carried out partly in the field, during which the researchers
of each Municipality check the coherence of the information given in the
questionnaire with the actual situation of the outlets.
• a third phase, carried out mainly through office work and in some
cases with a return to the field, during which the results of the second phase
are evaluated.
The results will help to refine the statistical methods and to improve the
methodologies used.
Thank you for your attention