New Data Sources
NEW DATA SOURCES
3 Cases
• Industry: Varying Data Sources
• Official Statistics: Surveys and Censuses
• Academe: Simulation-Based
OBJECTIVES
■ needed to understand factors affecting volume performance
■ needed to quantify impact of each factor
■ possible forecast of immediate future performance
■ volume decomposition
■ quantified impact of each factor
■ effects of short/long term activities (promo, media)
Classes of Data
Internal
Sales by Subcategories
Government
economic indicators (GNP, inflation, P-$, unemployment, etc.)
weather (temperature, rainfall, humidity)
Competitive
Retail Audit
Consumer confidence
Media TARP
Events/Promotions
APPROACH
Pre-analysis
Estimate of Base Volume
Volume Decomposition
Effect of Short term activities
Simulations
PRE-ANALYSIS
– assigned dummy variables for events (month of occurrence)
– applied data reduction technique to address the problem of short time series and no. of parameters > no. of observations
– used principal components per group of indicators to come up with composite scores
Group of Indicators: Economy, Weather (Humidity, Rainfall), Media, Distribution, Promotion, Events, Pricing, Competitor Act.
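A minimal sketch of the composite-score step, assuming the first principal component of each indicator group serves as that group's score (the slides say only that principal components were used per group). The monthly weather series are simulated for illustration, and `standardize` and `first_pc_scores` are hypothetical helper names, not part of the original analysis.

```python
import math
import random


def standardize(columns):
    """Center each indicator and scale it to unit sample standard deviation."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (n - 1))
        out.append([(x - mean) / sd for x in col])
    return out


def first_pc_scores(columns, iters=500):
    """Return (loadings, scores) for the first principal component.

    The leading eigenvector of the correlation matrix is found by power
    iteration. This assumes the indicators in a group are positively
    correlated, so a uniform starting vector is not orthogonal to it.
    """
    z = standardize(columns)
    p, n = len(z), len(z[0])
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1)
             for j in range(p)] for i in range(p)]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # One composite score per month: projection of the standardized data on v.
    scores = [sum(v[j] * z[j][t] for j in range(p)) for t in range(n)]
    return v, scores


# Simulated (not real) monthly weather indicators sharing one latent driver.
rng = random.Random(0)
months = 24
latent = [rng.gauss(0, 1) for _ in range(months)]
temperature = [28 + 2.0 * f + rng.gauss(0, 0.5) for f in latent]
rainfall = [200 + 80.0 * f + rng.gauss(0, 20) for f in latent]
humidity = [75 + 5.0 * f + rng.gauss(0, 2) for f in latent]

loadings, weather_score = first_pc_scores([temperature, rainfall, humidity])
```

Replacing three weather series with one score per month is what eases the "no. of parameters > no. of observations" problem: a later model needs one coefficient per group instead of one per indicator.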
Predictive Analytics
All factors affect volume, but to varying degrees
Economy, Weather, Media, Distribution, Promotion, Events, Pricing, Competitor Activities → SALES
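A minimal sketch of what a volume decomposition could look like, assuming a linear model in which sales volume equals a base volume plus one contribution per factor. The slides do not state the model form; every coefficient and score below is hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Hypothetical fitted model: volume = base + sum(coefficient * factor score).
base_volume = 1000.0  # hypothetical base volume (model intercept)

# Hypothetical coefficients, one per factor group from the slide.
coefficients = {
    "economy": 40.0,
    "weather": 25.0,
    "media": 60.0,
    "distribution": 35.0,
    "promotion": 50.0,
    "events": 20.0,
    "pricing": -45.0,
    "competitor_activities": -30.0,
}

# Hypothetical factor values for one month. "events" is a 0/1 dummy
# (1 = an event occurred that month); the rest are composite scores.
scores = {
    "economy": 0.5,
    "weather": -1.0,
    "media": 1.2,
    "distribution": 0.0,
    "promotion": 2.0,
    "events": 1.0,
    "pricing": 0.4,
    "competitor_activities": 0.8,
}

# Contribution of each factor to this month's volume.
contributions = {name: coefficients[name] * scores[name] for name in coefficients}
predicted_volume = base_volume + sum(contributions.values())

# Factors ranked by the size of their effect, largest first.
ranked = sorted(contributions, key=lambda name: abs(contributions[name]), reverse=True)
```

Under these made-up numbers the ranking shows promotion and media moving volume most, which is the sense in which factors matter "to varying degrees".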
Publication
Consider these…
■ Big Data
■ Public Use Files, Census and Surveys
■ Data Mining
■ Understanding the Households
■ Business Analytics
■ Consumer Behavior
■ CRM, Strategic Marketing, Corporate Planning
■ Strategic Marketing, Corporate Planning, Policy Making
OBJECTIVES
■ Characterize the Filipino HH
■ Describe spending behavior of HH
■ Characterize segments according to spending behavior
■ Business insights from consumer behavior
DATA SOURCES
■ Family Income and Expenditure Survey
■ Reference Period: 2003
■ Sample: 42K HH from 16M HH
■ Domain: Regions
CHARACTERISTICS OF HOUSEHOLDS
■ Household Type:
– Single Family: 79%
– Extended: 20%
■ HH Agri: 30%
■ Employed Spouse: 36%
■ With Radio: 65%
■ With TV Set: 59%
■ With VTR: 33%
■ With Ref: 34%
■ With Phone: 29%
■ With PC: 4%
■ Main Source of Income
– Wage Non-Agri: 38%
– Crop Prod: 16%
– Wage Agri: 8%
– Assist. Abroad: 7%
– Whole & Retail: 7%
CSD EXPENDITURE PATTERN
■ Average Household/Yr: P981
■ Per Capita/Yr: P203
■ At P5/day, consumption days/Yr: 41
■ Average of 1 serving per week; Total Expenditure on CSD/Yr: PhP 10.57 Billion
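The arithmetic behind the consumption-days figure can be checked with a short sketch. The peso amounts come from the slide; the weekly rate and the implied household size are derived here and are rough, since they assume both averages are taken over the same households.

```python
per_capita_per_year = 203.0   # pesos per person per year, from the slide
household_per_year = 981.0    # pesos per household per year, from the slide
price_per_day = 5.0           # pesos per consumption day, from the slide

# Days per year on which a person consumes CSD at P5/day.
consumption_days = per_capita_per_year / price_per_day

# Spread over 52 weeks, this is a bit under one consumption day per week.
days_per_week = consumption_days / 52

# Derived, not stated on the slide: household spend divided by per-capita spend.
implied_household_size = household_per_year / per_capita_per_year
```

The quotient is 40.6 days, which the slide rounds to 41, and it works out to roughly 0.8 days per week, consistent with the "1 serving per week" summary.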
EXPENDITURE PATTERN
Estimated Annual Expenditures
■ CSD: 10.57B
■ Coffee: 7.86B
■ Juice: 5.42B
■ Cocoa: 5.24B
■ Beer: 4.88B
■ Liquor: 3.53B
■ B. Water: 2.74B
■ I Cream: 1.54B
■ Wine: 1.28B
■ F. Milk: 645M
■ Tea: 402M
■ Meal at School: 8.07B
■ Meal at Work: 29.94B
■ Meal at Restaurant: 8.39B
■ Snacks: 37.73B
■ Recreation: 9.95B
■ Home Food: 867B
■ Total Food: 943B
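A short sketch that turns these estimates into the ratios the insight slides rely on (water and juice relative to CSD, CSD relative to snacks). The amounts are copied from the slide in billions of pesos; the variable names are only illustrative.

```python
# Estimated annual expenditures in billions of pesos, from the slide.
expenditure = {
    "csd": 10.57,
    "coffee": 7.86,
    "juice": 5.42,
    "bottled_water": 2.74,
    "snacks": 37.73,
    "home_food": 867.0,
    "total_food": 943.0,
}

# Beverage spending relative to CSD.
water_to_csd = expenditure["bottled_water"] / expenditure["csd"]
juice_to_csd = expenditure["juice"] / expenditure["csd"]

# CSD spending relative to snack spending.
csd_to_snacks = expenditure["csd"] / expenditure["snacks"]

# Share of total food spending that is food eaten at home.
home_share = expenditure["home_food"] / expenditure["total_food"]
```

The ratios come out near 0.26 for water, 0.51 for juice, and 0.28 for CSD against snacks, so the rounded fractions of 1/3, 1/2 and ¼ quoted in the insights are approximations of these values.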
Summary of Insights
■ Consumers: 1/3 Children
■ 3 of 4 HH Heads have Elementary/HS education
■ 1/3 of HH Heads are Farmers
■ With Radio: 65%; With TV: 59%
– Importance of retail store involvement!
■ Bottom 30% are dissavers: campaigns with values => social benefits
Summary of Insights
■ Upper 10% spend 9x as much on CSD as the Lower 10%
– Focus on the middle classes!
■ Expenditure on CSD is higher than on any other beverage; water is only 1/3 of that, juice is 1/2
– Prospects of activation in water and juices
■ Complementing campaigns re: CSD, water, and juices.
Summary of Insights
■ Snacks: largest food expenditure outside the home; CSD expenditure is equal to about ¼ of it
– Promote, Collaborate with snack items
■ Food outside home mostly at work
■ CSD 1 serving per week
– Promote availability at snack time in the workplace
– Availability of multi-serve can increase consumption
Summary of Insights
■ SEC: More than 1/3 C, almost half D
■ D biggest per capita, E negligible
■ NCR: per capita CSD declining over SEC
■ Highest per capita in NCR
■ AOMM, high in Ilocos, Davao
■ AOMM, low in Bicol, MIMAROPA
Summary of Insights
■ Activate in CALABARZON: households there spend more on Juices, least on CSD
■ DAVAO & NCR: high CSD, Water, Tea Consumption
■ Maintenance of CSD in Central Visayas (big market, big expenditures)
Monte Carlo Simulation
■ Stochastic methods to generate new configurations of a system of interest – simulation of a phenomenon
■ Monte Carlo: importance sampling for systems at equilibrium.
– Start: initial configuration of the system
■ can be data-based random variable generation
– Change the configuration
■ acceptance/rejection of changes
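The start / change / accept-or-reject loop above can be sketched with a random-walk Metropolis sampler. This is one common instance of the idea, swapped in for illustration; the slides do not name a specific algorithm. The target (a standard normal, via the hypothetical `energy` function), the step size, and the function names are all assumptions.

```python
import math
import random


def energy(x):
    """Hypothetical energy; exp(-energy) is proportional to a standard normal."""
    return 0.5 * x * x


def metropolis(n_steps, x0=0.0, step=2.0, seed=0):
    """Random-walk Metropolis: propose a change, then accept or reject it."""
    rng = random.Random(seed)
    x = x0                      # start: initial configuration of the system
    chain = []
    accepted = 0
    for _ in range(n_steps):
        proposal = x + rng.uniform(-step, step)   # change the configuration
        delta = energy(proposal) - energy(x)
        # Accept moves that lower the energy; accept the rest with
        # probability exp(-delta). Rejected moves keep the current state.
        if delta <= 0 or rng.random() < math.exp(-delta):
            x = proposal
            accepted += 1
        chain.append(x)
    return chain, accepted / n_steps


chain, acceptance_rate = metropolis(20000)
chain_mean = sum(chain) / len(chain)
chain_var = sum((x - chain_mean) ** 2 for x in chain) / (len(chain) - 1)
```

After enough steps the chain's mean and variance should sit close to 0 and 1, the values of the target distribution, which is a simple check that the accept/reject rule is doing its job.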
Monte Carlo Simulation
■ Given a data-generating mechanism
– Example: drawing colored balls from an urn, input-output model, adaptive sampling, etc.
– model of the process you wish to understand
– produce new samples of simulated data, replicate current data
– examine results of those samples
– may also amplify this procedure with additional assumptions
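The urn example can be sketched as a small simulation. The urn contents (3 red, 2 blue), the event of interest (both of two draws are red), and the number of trials are hypothetical choices made only to show the steps: model the process, produce simulated samples, examine the results.

```python
import random

# Hypothetical data-generating mechanism: an urn with 3 red and 2 blue balls.
URN = ["red"] * 3 + ["blue"] * 2


def both_red(rng):
    """Draw two balls without replacement and report whether both are red."""
    draw = rng.sample(URN, 2)
    return draw == ["red", "red"]


def simulate(n_trials=50000, seed=0):
    """Share of simulated draws in which both balls are red."""
    rng = random.Random(seed)
    hits = sum(both_red(rng) for _ in range(n_trials))
    return hits / n_trials


estimate = simulate()

# Exact value for comparison: P(first red) * P(second red | first red).
exact = (3 / 5) * (2 / 4)
```

Because this mechanism is simple enough to solve exactly, the simulated share can be compared with the exact probability of 0.3; for mechanisms with no closed form, only the simulated estimate is available.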
Monte Carlo Simulation
■ Computer Simulation/Monte Carlo Models
– Not solved by mathematical analysis; used for numerical experimentation.
– Goal of Numerical Experimentation: Answer real-world questions (what-if, sensitivity analysis)
– Purpose of Sensitivity Analysis
■ Validation of the model
– Would the customers exhibit similar credit behavior?
– Is their credit behavior similar?
Simulation
[Diagram: Big Data; Machine Learning/Modeling; Leading Indicators; Simulation; Validation; Estimation; Model-Based Estimation]
Simulation Procedures (Resampling)
■ Construct a simulated universe
– composition similar to the universe whose behavior we wish to describe and investigate.
■ Specify the procedure that produces a pseudo-sample
– simulates the real-life sample in which we are interested
– specify procedural rules by which the sample is drawn from the simulated universe (purposive sampling)
■ Describe how simple events combine, if several simple events must be combined into a composite event
■ Calculate the probability of interest
– estimate parameters
– test hypotheses
– based on tabulation of outcomes of the resampling trials.
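The procedure above can be sketched end to end as follows. The universe of 1,000 households with 65% radio ownership echoes the share reported earlier in the deck, but the universe size, the sample size of 20, and the composite event (at least 15 owners in the sample) are hypothetical. The sketch also assumes simple random sampling without replacement as the drawing rule, which is a stand-in and not necessarily the rule the slides intend.

```python
import math
import random

UNIVERSE_SIZE = 1000   # hypothetical size of the simulated universe
WITH_RADIO = 650       # 65% of households own a radio
SAMPLE_SIZE = 20       # hypothetical pseudo-sample size
THRESHOLD = 15         # composite event: at least this many owners

# Step 1: simulated universe (1 = household owns a radio, 0 = it does not).
universe = [1] * WITH_RADIO + [0] * (UNIVERSE_SIZE - WITH_RADIO)


def draw_pseudo_sample(rng):
    """Step 2: draw households without replacement (assumed sampling rule)."""
    return rng.sample(universe, SAMPLE_SIZE)


def composite_event(sample):
    """Step 3: combine the simple events (each ownership) into one event."""
    return sum(sample) >= THRESHOLD


def estimate_probability(n_trials=20000, seed=0):
    """Step 4: tabulate outcomes over many resampling trials."""
    rng = random.Random(seed)
    hits = sum(composite_event(draw_pseudo_sample(rng)) for _ in range(n_trials))
    return hits / n_trials


def exact_probability():
    """Hypergeometric tail probability, used only to check the simulation."""
    total = math.comb(UNIVERSE_SIZE, SAMPLE_SIZE)
    favorable = sum(
        math.comb(WITH_RADIO, k) * math.comb(UNIVERSE_SIZE - WITH_RADIO, SAMPLE_SIZE - k)
        for k in range(THRESHOLD, SAMPLE_SIZE + 1)
    )
    return favorable / total


estimated = estimate_probability()
exact = exact_probability()
```

Here the exact answer is available, so it serves as a check on the tabulated estimate; the resampling route earns its keep when the drawing rule or the composite event is too complicated for a closed-form calculation.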
THANK YOU.