WELCOME TO
MARKETING/BUSINESS
RESEARCH
1
MARKETING RESEARCH



Definition:
Used to implement the
____________
What is that?? (think intro)
2
Research is used to:

Identify problems/opportunities




3
Research is used to:

generate, and refine
marketing actions
4
Research is used to:

Plan and Implement the Marketing Mix




5
Research used to:

Monitor marketing
performance






6
When is Market Research
Warranted:

Time Constraints

Availability of Data

Nature of Decisions

Costs vs. Benefits
7
Sources of Marketing Data

Internal



sales records
customer complaints
inventory ...

External







Syndicated
Standardized
Customized
Advertising Agencies
Field Services
Tabulation Houses
Commercial
Databases
8
Research is NOT a Cure-All!

Classic Blunders




9
Why do I have to be here?



You will use
research for
decisions
Can easily bias
research
Numbers lie
10
RESEARCH ETHICS
11
In The Beginning

1950 Fear and Authority Studies

Animal Protection

Institutional Review Boards

http://www.wvu.edu/~rc/irb/irb_guid/exempt.rtf
12
Business Ethics

Definition:
13
Teleological Ethics

Definition: Teleological moral
systems are characterized
primarily by a focus on the
consequences which any action
might have (for that reason,
they are often referred to as
consequentialist moral systems,
and both terms are used here).
Thus, in order to make correct
moral choices, we have to have
some understanding of what
will result from our choices.
When we make choices which
result in the correct
consequences, then we are
acting morally; when we make
choices which result in the
incorrect consequences, then
we are acting immorally.
14
Deontological Ethics

Definition: Deontological moral systems are
characterized primarily by a focus upon
adherence to independent moral rules or
duties. Thus, in order to make the correct
moral choices, we simply have to understand
what our moral duties are and what correct
rules exist which regulate those duties. When
we follow our duty, we are behaving morally.
When we fail to follow our duty, we are
behaving immorally.
15
Kohlberg – Value Maturity
Model

Three levels of maturity with six stages
of development



Self-centered level – (1) obedience and
punishment, (2) naively egoistic
orientations
Conformity level – (3) good person, (4)
“doing duty” orientations
Principled level – (5) contractual legalistic,
(6) conscience of principle orientations
16
Which is the “right”
perspective
17
Respondent’s Right to Choose

Can’t force compliance



Captive subject pools
Status of the researcher
Ensure that incentives
do not create pressure
18
Respondent’s Right To Safety



Preserve anonymity
Preserve privacy
No mental stress




respect subjects
debrief subjects
Protect when questions are detrimental
to subject
Inform when special equipment used
19
Respondent’s Right to be
Informed



Informed consent/assent
Parental consent
Observation??



consider risks
consider alternative methods
Deception
20
Solutions

Actively think about ethics when
designing the study


Government


Institutional Review Board
Ethics Codes


http://cme.cancer.gov/c01/
AMA
Ethics Checklist
21
THE RESEARCH PROCESS
Stages in the Research Process
22
Define the Problem
(Stage 1)

Research objectives

Research questions

Properly formulate the problem
23
Conduct a Situation Analysis -- Part of Problem Definition


General environment
Competitive
products or services

Consumers

Marketing Programs
24
Determine Research Design
(Stage 2)

How much should you spend?


What type of design should you use?



exploratory
descriptive
causal
25
Exploratory Designs

Used when you do
not have a good
understanding of the
problem and need to
gain insight

Used to:

Methods:
26
Descriptive Designs


Used to describe the
characteristics of
consumers,
competitors, etc.…
Methods
27
Causal Designs

Used to determine cause and effect
relationships.

MUST use experiments which include:



28
Preparation of the Design

Determine source of the data



primary
secondary
Determine data collection method


qualitative
quantitative
29
Sampling
(stage 3)

Sampling defined:



Who is to be sampled (the target
population)?
How big should the sample be?
Which sampling technique should be used?
30
Data Gathering
(stage 4)

Method Used

Stages


31
Data Processing and Analysis
(Stage 5)

Editing


Coding


Analysis
32
Conclusion and Report
Preparation
(Stage 6)

Written--the only tangible from the
study




interesting
easy to read
managerial implications
Oral


interesting
convincing
33
Secondary Research
The Place to Begin
34
Secondary Data
35
Secondary Data Advantages





Time
Money
Improve over other studies
Point of comparison for trends
Increase understanding of problem
36
Secondary Data Disadvantages:
Problems of Fit



Measurement units
differ
Class definitions
differ
Out of date
37
Secondary Data Disadvantages:
Problems of Accuracy



Primary vs.
secondary source
Purpose of
publication
General evidence of
quality
38
Internal Secondary Data




Sales invoices
Warranty cards
Departmental
records
Sales records
39
Locating External Secondary
Data







Identify what you need to know
Develop a list of key terms and people
Examine directories and guides
Write letters to key contacts
Talk to reference librarian
Do a computer search
Pull the information together
40
THE LAW
Always conduct secondary data
search before you do primary
data collection.
41
Qualitative Interviewing
Techniques
1. Focus Groups
2. Projective Techniques
3. Depth Interviews
4. Observation
42
Definitions (Yuck!):

Inquiry -- Person responds to a set of
Questions

Disguised:

Undisguised:
43
Final Definitions (I promise):

Structured:
Questions:
Answers:
Unstructured:
Questions:
Answers:
44
Qualitative Methods




“Touchy-feely” – no numbers
Examine thoughts, feelings,
motivations…
Can results be projected to the
population? Yes No
Can spot trends 3 to 4 years before
they show up in surveys
45
Focus Groups

______ homogeneous people carefully
recruited

Lasts _____

Types:



Round-table (Comfortable room with one-way
mirror)
Telephone
Internet
46
Focus Group Moderator






Keeps discussion focused
Truly believes that participants have wisdom
Encourages shy to talk and dominant
participants to be quiet
Should say little, but keep eye contact
Accepts all answers
Must be a quick study
47
Uses of Focus Groups





48
Advantages/Disadvantages of
Focus Groups
Advantages:
1.
2.
3.
4.
5.
6.
7.
Disadvantages:
1.
2.
3.
4.
49
Conducting a Focus Group



Register participants
(demographic information)
Small talk
Introductions




welcome
why they are here
guidelines or ground
rules
opening question





Ask questions
Anticipate flow
Control your
reactions
Probe as needed
Summarize the
discussion
50
Conducting a Focus Group:
Guidelines
Always Include:
taping discussion
do not talk over others
no names attached
sponsor of study
role of moderator to guide only
feel free to talk to each other
done by
first name basis
no wrong answers only differing
opinions
May Include:
don’t need to agree but listen to
their views
no cell phones or pagers
who will listen to tapes
who will see the report
how the report will be used
strictly research and no sales
location of the bathrooms
help yourself to refreshments
51
Developing Questions for
Focus Groups

Where to Begin:




General Rules:



52
Question Categories





Opening questions
Introductory questions
Transition questions
Key questions
Ending Questions
53
Projective Questions:



Used when subjects
cannot or will not
directly communicate
feelings
“A man is least himself
when he talks in his
own person; when he is
given a mask he will tell
the truth.”
E.g., TATs, inkblot
54
Word Association

Examine brand/service image
Measure frequency of responses and no
responses
Response Latency

Example


55
Sentence Completion


Gives more direction
than word
association
Examples:

When visiting the
President be sure
to_____________.
56
Unfinished Story

Finish the story or
tell why the person
acted the way he or
she did.
57
Third Person Role Play



What would the typical person do in
this situation?
We tend to think others are like
ourselves, yet we are more willing to
tell the truth about “others”
Example:

Why would your neighbor buy a Mercedes
58
Cartoon Completion

Subjects fill in the
bubble – suggests a
dialogue between
the characters
59
Draw a Picture


Subject given a topic to draw
Examples:
60
DEPTH INTERVIEWS


One-on-one interviews
Try to uncover underlying motivations,
prejudices and attitudes toward
sensitive information
61
Depth Interviewing Analysis

Laddering

Attributes

Consequences

Values
62
When to use depth interviews:





Sensitive subject matter
Need intensive probing
Respondent interaction unlikely to be
helpful
Have lots of $$$$ and time
Need detailed responses (> 15 minutes)
63
Some Boring Definitions:

Ethnographic/Observational Research

Direct Observation:

Indirect Observation
64
Observation can be disguised
or undisguised
65
Observation of Physical
Objects


Naturalist Inquiry
Physical Trace
evidence

wear on floor tiles

Garbology

Pantry Audit
66
Mechanical Observation

Television/Internet

Scanners

Eye Tracking

Psychogalvanometer

Response Latency
67
Experimental Research
Methods
Looking at Cause and Effect
Relationships
68
Experiment

Definition:




variable
manipulate
independent variable
dependent variable
69
Requirements for an
Experiment

Must have two or more groups of
subjects



experimental group(s)
control group(s)
Must use random assignments to
groups

controls for extraneous factors
70
Research Environments


Laboratory
experiment
Field experiment
71


Can NEVER prove causation (X → Y)
Can only INFER such a relationship
72
Reasons for Association
between X & Y :

Common causes


Confounded factors



drowning and ice cream consumption
AIDS test of ribavirin
Coincidence
Causation
73
Evidence to Support Causation



Concomitant Variation
Temporal Ordering (time order of occurrence)
Elimination of Other Causes
74
Concomitant Variation
Required for Causation

1. Concomitant
variation


positively
negatively
75
Temporal Ordering Required
for Causation
76
Elimination of Other Possible
Causes Required for Causation


You must think this through, no one will
give you a list to check
Most difficult of the criteria to
determine
77
Internal Validity

Definition:

Threatened by:





history
maturation
instrumentation
selection bias (non-random assignment )
testing
78
External Validity

Definition:

Threats to external validity

reactive/interactive testing effects

surrogate situations

demand artifacts
79
Experimental Designs -- Notation:




RR = random assignment of respondents
X = exposure to one of the possibly many
treatments
O = observation or measurement of the
respondent
T = treatment effects
80
One-Shot (After Only)
X
O
Problems?
81
One-group Pretest-Posttest
O1
X
O2
Problems?
82
Static Group
Group 1:   X   O1
Group 2:       O2
PROBLEMS?
83
Before/After With Control
RR   O1   X   O2
RR   O3       O4
Problems?
84
After Only With Control
RR   X   O1
RR       O2
Problems?
85
SURVEY INTERVIEWING
TECHNIQUES
Methods that Use Large Sample
Sizes and Create Results that Can
Be Projected to the Populations
86
Mail Surveys/Self-Administered
Questionnaires

Def:

-cold
-panels
-fax
- e-mail
87
Internet/Computer Assisted
Surveys


Allow for lots of
branching/interactive
Allows for
personalization

Great anonymity

Representative Samples
88
Other Survey Methods



Telephone
Personal in-home (Door-to-Door)
Mall intercepts
-can interact with product
replacing ___________
89
Each Method Has Advantages
and Disadvantages


See page 172 for a summary
TREND – USE A COMBINATION OF
METHODS
90
Things to consider when
choosing method

Versatility
- Visual cues
- Degree of structure
- Complexity of questions
91
Consider Quantity of Data

Function of questionnaire length:
- shortest ________
- moderate length ________
- longest _________
92
Consider Sample Control

Contact the right people



mailing list quality

interviewer qualifying

phone unlisted
Random Sampling error
93
Consider the Quality of Data

Response bias (see next slide)
Interviewer bias
Interviewer cheating
Poor questionnaire design
Sample bias

Systematic Errors




94
Response Biases





Acquiescence
Extremity
Interviewer
Auspices
Social Desirability
95
Consider Non-Response Error



Problems occur because the people
responding to the questionnaire differ
significantly from those not responding
Possible Self-selection bias
Example
-survey 500 students to see if they need transportation
to and from school
- 50 answer and say yes
-conclude that all 450 that did not answer do not need it

Did you make the correct conclusion?
96
Your Turn

Make up your own example of
nonresponse bias:
97
How to Increase Response
Rate


Prior notification
Motivate with rewards

Good looking
questionnaires
Good cover letter
Follow-up

Make it fun!!


98
Consider Speed



Phone is ____
Computer-assisted
phone/internet is
_____
Mail is ____
99
Consider Cost




Internet: relatively
inexpensive
Mail: depends on
pre-contacts and
follow-ups
Telephone: next
most expensive
Mall/In-home $30
up to $100
100
Specific Uses for Methods

Cold mail


Mail panels


general information, in-home use
Phone


respondents very interested in topic
nationwide samples
Mall intercept

copy tests, product tests, branding/package
testing
101
Measurement
Assigning Numbers To Reflect the
Degree or Amount of a
Characteristic
102
MEASURES OF
CENTRAL TENDENCY

MODE

MEDIAN

MEAN
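A minimal sketch of the three measures, assuming a small set of made-up responses (the data are hypothetical, not from the course). It uses Python's standard `statistics` module and a data set where the three measures differ, so the pull of one extreme value on the mean is visible.

```python
from statistics import mean, median, mode

# Hypothetical responses (not from the slides): store visits last month.
visits = [1, 2, 2, 2, 3, 4, 5, 6, 11]

visit_mode = mode(visits)      # most frequent value
visit_median = median(visits)  # middle value once the data are sorted
visit_mean = mean(visits)      # arithmetic average, pulled upward by the 11

print(visit_mode, visit_median, visit_mean)
```

Which measure is appropriate depends on the measurement scale of the question, which the next slides cover.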
103
Measurement Scales


Series of items that
are arranged
progressively
according to value
or magnitude
A series into which
an item can be placed
according to its
quantification
104
Nominal Scale




Identification only
No order to the
numbers
Examples:
Measure of Central
Tendency:
105
Ordinal Scale




Ranked data
Distance between two
numbers is unknown
and uneven
Examples:
Measure of Central
Tendency:
106
Interval Scale





Rank to the data
Equal distance between
numbers
No “natural zero”
We assume a lot of scales
are interval
Measure of Central
Tendency:
107
Ratio Scales




Rank to the data
Equal distance between
numbers
“natural zero” where
zero means “none”
Measure of Central
Tendency:
108
YOUR TURN --Write a question
for each type of scale

Nominal

Ordinal

Interval

Ratio
109
Criteria For Good
Measurement

Reliability

Validity

Sensitivity
110
1) Reliability of Scales

Coefficient Alpha



Are the results on
questions measuring the
same thing consistent?
Single item scales more
susceptible to random error
Test/retest

Are consistent results
found on repeated
measures
111
2) Validity

Are we measuring what we think we are
measuring?

Content validity (Face validity)

Pragmatic validity
112
3) Sensitivity


Refers to an instrument’s ability to accurately
measure variability in stimuli or responses
Example: I love to eat chocolate


Agree vs. Disagree
Strongly agree / mildly agree / neither agree nor disagree / mildly disagree / strongly disagree
113
Noncomparative Continuous
Graphic Rating Scales
Place a mark on the line indicating how
important it is to have each of the
following at your vacation resort:
Alpine slides   unimportant _____________________ important
5 inch line
127 mm
1/20 inch
114
Graphic Rating Scales

Happy faces

Thermometer
115
Noncomparative Itemized
Rating Scale


Several categories from which the respondent
can choose
Top-box method:






How likely are you to buy a Sony DVD player in the next 3
mos.
definitely will buy
Probably will buy
Might buy
Probably will not buy
Definitely will not buy
116
Examples of Itemized Rating
Scales

Likert

Semantic Differential

Stapel
117
Likert-type Scales

Sentences with which the respondent
agrees or disagrees
It would be cool to have a candy-red
1965 convertible Mustang
SD D Neither A SA
118
Likert-type Scales



Code such that higher numbers mean
better things
Can create summated scales to form an
index
Assume __________ scale
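A minimal sketch of the coding rule above, assuming three made-up 5-point Likert items (the item names and responses are hypothetical). Negatively worded items are reverse-coded so that a higher number always means a better thing, then the items are summed into an index.

```python
# Hypothetical 5-point items: 1 = strongly disagree ... 5 = strongly agree.
SCALE_MAX = 5
NEGATIVE_ITEMS = {"too_costly"}  # worded so that agreement is "bad"

def summated_score(responses):
    """Sum the items after reverse-coding the negatively worded ones."""
    total = 0
    for item, value in responses.items():
        if item in NEGATIVE_ITEMS:
            value = SCALE_MAX + 1 - value  # 1<->5, 2<->4, 3 stays 3
        total += value
    return total

respondent = {"looks_cool": 5, "fun_drive": 4, "too_costly": 2}
score = summated_score(respondent)
print(score)
```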
119
Semantic Differential



Series of attitude scales where repeated
judgments about a concept are made
Opposite adjective words or phrases
Use several of these and sum them
Fast          __:__:__:__:__:__:__  Slow
Bad Service   __:__:__:__:__:__:__  Good Service
Tasty Food    __:__:__:__:__:__:__  Not Tasty Food
120
Semantic Differential




Code such that higher numbers mean
better things or more of something
Make an overall score--sum the items
Develop a snake diagram (image
profile) to compare competitors
Assume _____ scaling
121
Stapel Scale
Use +5 (describes completely) to -5
(does not describe at all)
Assume _______ scaling
Good for phone
Easy to construct
May look difficult for respondent
-5 -4 -3 -2 -1 FUN +1 +2 +3 +4 +5
122
Questions for Itemized
Response Scales

How many categories?

Balanced or Unbalanced?

Should you have a neutral point?

Forced or unforced?
123
Comparative Scales

Compare one set of objects directly
with another



sensitive
easy
can create artificial differences
124
Paired Comparison

Which do you prefer?
____ Barry Manilow
____ Counting Crows
____ Barry Manilow
____ Rolling Stones
____ Rolling Stones
____ Counting Crows
125
Paired Comparison Table
           Manilow   Crows   Stones
Manilow     -----     0.90    0.85
Crows       0.10     -----    0.60
Stones      0.15      0.40   -----
126
Calculation of Rank-Order
Values
           Manilow   Crows   Stones
Manilow
Crows
Stones
127
Rank-order Scales


Respondents are simultaneously
presented with several objects that they
rank order
Please rank the following from 1 = most
preferred to 4 = least preferred
- Pizza Hut
- Mario’s
- Domino’s
- Little Caesar’s
128
Comparative Continuous
Graphic Rating Scale
Similarity ratings used for perceptual
maps
Pitt and WVU
Exactly the same _________________________ Completely different
129
Constant Sum Scales



Assign chips or
points to attributes
Very careful with
instructions
Difficult for the
respondent
130
Developing Questionnaires
The Art and Science of
Questionnaire Design
131
Preliminary Considerations

What information is required?

Who are the target respondents?

What data collection method will be
used?
132
Managerial Orientation

Make sure that all
information in the
questionnaire is
useful to the
manager

(demographics and
first question are
possible exceptions)
133
Make Sure Questions Are
Understandable


Do you need more
than one question?
Do respondents
have the information
needed to answer
the question?
134
Understandable Questions,
cont.

Can respondents remember the
information?


Is it too much work to get the
information?
135
Ways of Dealing with Sensitive or
Embarrassing Questions


State behavior is not unusual.
Early or late in the questionnaire?



early
late
Give categories for responses.
Phrase how others might act.
136
Need Mutually Exclusive and
Exhaustive Responses

Responses should not overlap

Must cover the entire range

Example:
137
Use Natural and Familiar
Language

Simple language

Language that the target market uses

Avoid ambiguous words:

DO NOT USE:
138
Avoid Bias

No loaded questions

Watch for sequence bias
139
No Double-Barreled Questions

A question that calls
for two responses
140
Response Formats

Open ended--respondent answers in his
or her own words
Uses:

Bad points:

141
Itemized Questions (Close-ended)

Fixed alternatives
Advantages:

MUST PRETEST

142
Types of Close-ended
Questions

Multichotomous (More than 2
responses)


Dichotomous (Only two responses)


143
Questionnaire Flow

Cover letter

First question very important, must be



_____________
_____________
Demographics late in the questionnaire
144
Sequencing

Funnel

Inverted funnel


Keep questions on related topic
together
Be very careful with branching
145
Layout




Booklets for multipage questionnaires
Attractive
Title, date, return
address on first
page
Color code
branching
146
Layout



Number the questions
Put the answers in all UPPER CASE
letters
What is better?


white space
save a page
147
Pretest the Questionnaire




First with a personal
interview
Make corrections
Next using the real
method
If you do not
pretest, you are
being

_________________
_
148
Sampling
The Statistical Adventure Begins
149
Populations

Def:

Census

Sample

Which is better?


census?
sample?
150
Step 1: Define the Target
Population

Must be very specific:





What is a user?
What demographics matter?
Are there geographic boundaries?
What is the relevant time period?
What is an element?
151
Step 2: Specify a Sampling
Frame

Def:

152
Sample Frame Problems

List may not match the target
population
over-registration

under-registration

153
Step 3: Selecting a Sampling
Method

Probability samples



example:
Non-probability samples


example:
154
What’s the Big Deal?



Probability samples let us estimate
_________
We can calculate a confidence interval
So, probability samples are more
representative than non-probability
samples.

true
false
155
Simple Random Sampling




Probability sample
Number each unit in the sampling
frame
Pick ___ units using a random numbers
table
NOT haphazard
156
Take a Simple Random Sample
(SRS) of n=3
Element    Attitude toward Motel 6
Natasha    6
Scotty     7
Kalie      4
Lynn       2
Gregory    8
Paul       4
John       7
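One way to draw the sample by computer instead of a random numbers table is sketched below. It uses the seven names and attitude scores from this slide; the seed value is an arbitrary assumption, included only so the draw can be repeated.

```python
import random

# Sampling frame and attitude scores as listed on the slide.
attitude = {"Natasha": 6, "Scotty": 7, "Kalie": 4, "Lynn": 2,
            "Gregory": 8, "Paul": 4, "John": 7}

rng = random.Random(42)                  # arbitrary fixed seed for repeatability
sample = rng.sample(list(attitude), 3)   # every set of 3 is equally likely; no repeats
sample_mean = sum(attitude[name] for name in sample) / len(sample)

print(sample, round(sample_mean, 2))
```

Changing the seed gives a different sample and usually a different sample mean, which is the idea behind sampling error later in the course.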
157
Stratified Sample

Decide on stratification
variable




homogeneous groups
related to dependent variable
Divide population into a
few mutually exclusive
and exhaustive strata
Take a SRS from each
stratum
158
Proportionate Stratified
Sample
Choose sample from strata in same
proportion as they are in the population
Strata     Population proportion     Sample proportion
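A minimal sketch of proportionate allocation, assuming made-up strata sizes and a made-up total sample size (none of these numbers come from the slides). Each stratum receives the same share of the sample that it has of the population.

```python
# Hypothetical strata sizes and total sample size.
population = {"freshmen": 6000, "sophomores": 5000, "juniors": 5000, "seniors": 4000}
n = 400

N = sum(population.values())
allocation = {stratum: round(n * size / N) for stratum, size in population.items()}

for stratum, size in population.items():
    print(stratum, size / N, allocation[stratum] / n)  # the two proportions match
```

With less tidy numbers the rounded allocations may not add up to exactly n, so a small manual adjustment can be needed.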

159
Disproportionate Stratified
Sample



Take a larger sample from the strata
with ________ variance
What is variance?
Exercise: Develop two populations with
8 elements each.


Population 1: high variance, low mean
Population 2: low variance, high mean
160
Disproportionate Stratified
Sample
Strata     Variance     Population proportion     Sample proportion
161
Why use Stratified Samples?


Make sure that you include certain
subgroups
More precise, IF we use the right
stratification variable




margin of error is ___________
sampling distribution is __________
confidence intervals are __________
What is the right variable?
162
Cluster Sampling



Divide population
into lots of
heterogeneous
clusters
Take a SRS of
clusters
Either:


sample all elements
in the selected
clusters
OR take a SRS of
163
Why use Cluster Samples




Cheap
Easy
Likely to be the way
the sampling frame
is set up
Problem

not precise, lacks
statistical efficiency
164
Non-probability Sample:
Cannot estimate margin of
error


Convenience or
accidental sample
If the sample size is
really large, we
know we have a
representative
sample

true
false
165
Judgment or Purposive
Sample


Elements selected because they can
serve the research purpose--they are
believed to be representative
Snowball sample
166
Quota Sample



Attempts to be
representative by
sampling
characteristics in the
same proportion as
the population
Interviewer chooses
sample
Are these
representative? _____
167
Step 4: Determine the
Sample Size

Must take into consideration:





cost
time
industry standards
statistical precision
Discuss this in detail in the next chapter
168
Step 5: Select Elements



Actually collect the data
Clean-up the data
Put the data into the computer
169
Characteristics of Interest
                      Population              Sample
# of elements         N                       n
Mean                  μ (mu)                  X̄ (x bar)
Variance              σ² (sigma squared)      s_x²
Standard Deviation    σ (sigma)               s_x
170
Step 6: Estimate the
Characteristics of Interest
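Sample mean:
$\bar{X} = \dfrac{\text{sum of the sample elements}}{\text{number of elements in sample}} = \dfrac{\sum_{i=1}^{n} X_i}{n}$

Sample variance:
$s_x^2 = \dfrac{\text{sum of deviations around the mean squared}}{\text{sample size minus 1}} = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$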
171
Sample Standard Deviation


The square root of
the sample variance
= sx
Has a specific
meaning
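The three statistics can be computed step by step as sketched below. As an assumption for illustration, the seven Motel 6 attitude scores from the simple random sampling slide are treated as if they were the sample.

```python
import math

# Motel 6 attitude scores, treated here as a sample (an assumption for illustration).
scores = [6, 7, 4, 2, 8, 4, 7]

n = len(scores)
x_bar = sum(scores) / n                              # sample mean
sum_sq_dev = sum((x - x_bar) ** 2 for x in scores)   # squared deviations around the mean
variance = sum_sq_dev / (n - 1)                      # divide by n - 1, not n
std_dev = math.sqrt(variance)                        # square root of the sample variance

print(round(x_bar, 2), round(variance, 2), round(std_dev, 2))
```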
172
Sampling Error


The difference between the :

population parameter

and the sample statistic
We look at confidence intervals to
estimate this but not until the next
chapter
173
Non-sampling Error
(i.e., all other kinds of errors
except for sampling error!)
174
Types of Non-Sampling Error

Sampling frame

Poor questions

Poor branching

Item non-response
175
More Non-Sampling Errors

Non-response

Interviewer bias

Interviewer cheating

Coding and editing problems
176
Which is the Larger Problem?

Sampling error

Non-sampling error
177
Sample Size Determination
Everything You Ever Wanted to
Know About Sampling
Distributions--And More!
178
Sampling Distribution


A frequency distribution of all the
means obtained from all the samples of
a given size
Example: $$ spent on CDs at Tracks
Daffy       34.00
Donald      72.00
Sylvester   36.00
All samples of n=2
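The example can be worked by listing every sample of two and averaging it, as in the sketch below. It assumes sampling without replacement with order ignored, which gives three possible samples from the three shoppers.

```python
from itertools import combinations

# Dollars spent, from the slide.
spend = {"Daffy": 34.00, "Donald": 72.00, "Sylvester": 36.00}

population_mean = sum(spend.values()) / len(spend)

# Every possible sample of n=2 and its mean: this list is the sampling distribution.
sample_means = {pair: (spend[pair[0]] + spend[pair[1]]) / 2
                for pair in combinations(spend, 2)}
mean_of_means = sum(sample_means.values()) / len(sample_means)

for pair, m in sample_means.items():
    print(pair, m)
print(round(population_mean, 2), round(mean_of_means, 2))
```

The sample means are bunched closer together than the individual amounts, and their average equals the population mean.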
179
Your Turn


Develop a sampling distribution using
n=2
Calculate the population mean
CAR             A   B   C   D   E
Expected Life   3   4   5   0   1
180
Sampling Distributions

The distribution of sample means is
skinnier than the distribution of
elements



Why?
The distribution is normal
The sampling distribution mean equals
the population mean
181
Standard Error





The variability in the sampling
distribution
Tells you how reliable your estimate of
the population mean is
If this is big (good or bad)
If this is small (good or bad)
WHY?
182
Standard Error
$s_{\bar{x}} = \dfrac{\text{standard deviation}}{\text{square root of the sample size}} = \dfrac{s_x}{\sqrt{n}}$
As the sample size gets bigger, the
standard error gets __________
183
Confidence Intervals


CI= Xbar +/- z (standard error)
Where:




z= _____ for 68% confidence
z= _____ for 95% confidence
z= _____ for 99.7% confidence
What confidence level should you use?
184
Develop a Confidence Interval

Estimate the
average number of
trips to the beach
taken by WVU
students during their
4-6 year career




xbar = 5
SD = 1.5
95% Confidence
Level
n=100
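A sketch of the calculation for this example, using the slide's numbers. The z value of 1.96 for 95% confidence is taken from the sample size slides later in the deck; treat it as the assumed table value.

```python
import math

x_bar = 5.0   # sample mean number of beach trips
sd = 1.5      # sample standard deviation
n = 100       # sample size
z = 1.96      # assumed z value for 95% confidence

standard_error = sd / math.sqrt(n)   # standard deviation / square root of n
margin = z * standard_error          # the +/- part of the interval
lower, upper = x_bar - margin, x_bar + margin

print(round(standard_error, 3), round(margin, 3), round(lower, 3), round(upper, 3))
```

Rerunning with a smaller n or a larger z shows how each one widens the interval.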
185
So,

There is a 95% chance that if all WVU
students were sampled regarding the
number of beach trips that the findings
would differ from our results by no
more than ____ in either direction.
186
or, maybe better,


If I were to conduct this study 100
times, then I would get _____ different
confidence intervals. If I have a 95%
confidence interval then ____ of the 100
CI’s will contain the true population
mean (mu) and ____ will not.
I sure hope that the confidence interval
I got is one of the 95 that contains mu!
187
Confidence Interval Issues

Reliability


Precision



how often we are correct
how wide the confidence interval is
The smaller the n, the _____ the CI
Given a particular n, the CI will be
_______ when we increase the
reliability
188
Factors that Influence n

Precision (H)


how skinny must your CI be in order to
be able to take action on the results?
I will go to a water park.

DW   PW   Maybe   PWN   DWN

I will pay _____ for a musical card.

I will pay _____ for a motorcycle.
189
More Factors That
Influence n

Confidence level (z)

Population SD

Time, money and
personnel
190
Sample Size for Interval or
Ratio Data
$n = \dfrac{z^2}{H^2} \cdot s^2$
Where:
z = 1, 1.96, or 3
H = precision (+/- H)
$s^2$ = variance (or standard deviation
squared)
191
Example: Average Number of
Books Bought Per Semester



H=0.25
s=1.5
Confidence = 95%
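A sketch of this example using the sample size formula for interval or ratio data, n = (z² / H²) × s², with z = 1.96 for 95% confidence. Rounding up to the next whole respondent is an assumed convention, and the function name is made up for illustration.

```python
import math

def sample_size_mean(z, h, s):
    """n = (z^2 / H^2) * s^2, rounded up to a whole respondent."""
    raw = (z ** 2 / h ** 2) * s ** 2
    return math.ceil(round(raw, 9))  # round first to guard against float noise

n = sample_size_mean(z=1.96, h=0.25, s=1.5)
print(n)
```

Halving H to 0.125 roughly quadruples n, because precision enters the formula squared.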
192
Sample Size for Nominal Data
$n = \dfrac{Z^2}{H^2} \cdot (P)(Q)$
Where:
Z = 1, 1.96, or 3
H = a percentage (e.g., 0.03--NOT 3)
P = initial estimate of the population
proportion
Q = (1-P)
193
n for Proportion of WVU
Students Who Read the DA

Do you read the DA?





1. YES
2. NO
Estimate that 60%
read the DA
Want a 99 % CI
Want a +/- 3%
precision
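A sketch of this example using the nominal-data formula n = (Z² / H²) × P × Q. The slides list only Z = 1, 1.96, or 3, so the first call assumes Z = 3 is intended for the 99% interval; the second call uses 2.576, the usual table value for exactly 99%, for comparison. Rounding up is an assumed convention and the function name is hypothetical.

```python
import math

def sample_size_proportion(z, h, p):
    """n = (Z^2 / H^2) * P * Q, with Q = 1 - P, rounded up."""
    q = 1 - p
    raw = (z ** 2 / h ** 2) * p * q
    return math.ceil(round(raw, 9))  # round first to guard against float noise

n_with_z3 = sample_size_proportion(z=3, h=0.03, p=0.60)
n_with_z2576 = sample_size_proportion(z=2.576, h=0.03, p=0.60)
print(n_with_z3, n_with_z2576)
```

Note that H is entered as 0.03, not 3, as the formula slide warns.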
194
The Final Sample Size

Compute n for all nominal, interval and
ratio questions

most conservative

limited resources
195
Non-statistical Approaches to
n


All you can afford method:

subtract costs from budget

figure out cost per interview

divide leftover budget by cost per interview
Rules of thumb
196
Coding and Editing
Getting the data ready for analysis
197
Coding



Each response must have its own
variable name
Variable names can have up to 8
characters
Assigning numbers to responses to
enter data into computer
198
Creating a Coding Sheet

Must have a filename at top of questionnaire


Name_data.txt
First variable is ALWAYS the ________
Why?

Write the variable names on the
questionnaire next to the matching response
199
Coding

Coding Open-Ended Questions:

Code open-ended nominal __________


EX: What State is your current state of residence?
Code open-ended numerical – enter _______

Ex: How much would you pay for this product?
200
Coding

Coding fixed-alternative responses:

Assigned numbers should be logical

One variable needed for each answer the
respondent will give

rank order

semantic differential

“Check all that apply”
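A minimal sketch of coding a "check all that apply" question, assuming a made-up soda question with four options. Each option gets its own 0/1 variable, and the variable names are kept to 8 characters or fewer to match the naming rule given earlier.

```python
# Hypothetical question: "Which sodas did you buy this month? (check all that apply)"
OPTIONS = ["coke", "pepsi", "sprite", "drpepper"]

def code_check_all(checked):
    """Return one 0/1 variable per option: 1 = checked, 0 = not checked."""
    return {option: int(option in checked) for option in OPTIONS}

row = code_check_all({"coke", "sprite"})
print(row)
```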
201
Editing

Cleaning up the data

Field edit


check for legibility
check for completeness
202
Office Editing

Outliers

Missing data

Blunders

Inconsistencies
203
Hypothesis Testing
Using the SAS System to Analyze
Questionnaires
204
Statistically Significant


Are these results for real, or did they
just occur by chance?
Remember, in sampling, all numbers
have ranges
205
Alpha and p-values

Alpha value:


the error rate you
are willing to accept
P-value

the error associated
with rejecting the
null hypothesis
206
Chi-square & T tables
t-distribution
chi-square distribution
For BOTH distributions
Area under the curve =
Alpha & p-value are areas under the curve
critical value--associated with an alpha level
calculated value--
207
Chi-Square Goodness of Fit

When to use:



number of variables ________
scaling of variable _________
Basic idea:

could the numbers you get (the observed
value) come from a population which has
the pattern I expect? (the expected value)
208
Chi-square Goodness of Fit


Ho: This sample could have come from
a population which has this pattern:
__________________________________
__________________________________
Ha: There is a different pattern in the
population than I expect (or hope).
209
Chi-Square Goodness of Fit

Chi-square calculated:
$\chi^2 = \sum_i \dfrac{(\text{Observed}_i - \text{Expected}_i)^2}{\text{Expected}_i}$
degrees of freedom = number of cells - 1
Alpha Value
Table Value
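A sketch of the goodness-of-fit calculation, assuming made-up counts for three categories and an expected pattern of equal preference (both are assumptions, not course data). The small lookup holds standard chi-square table values at alpha = 0.05. This is a plain Python stand-in, not SAS output.

```python
# Hypothetical observed counts for three categories.
observed = {"hard scoop": 45, "soft serve": 30, "ice cream bars": 15}

n = sum(observed.values())
# Assumed Ho pattern: all categories equally preferred.
expected = {k: n / len(observed) for k in observed}

chi_square = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
df = len(observed) - 1                         # number of cells - 1

CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}   # table values at alpha = 0.05
reject_ho = chi_square > CRITICAL_05[df]       # calculated value falls in the tail?

print(chi_square, df, CRITICAL_05[df], reject_ho)
```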
210
Now Graph


chi-square
calculated
chi-square table
value
211
Chi-Square Goodness of Fit
What type of dairy
treat do you like
best?
1. hard scoop ice
cream
2. soft serve ice cream
3. chocolate covered
ice cream bars
Ho:
Ha:
Chi-square Calculated:
Degrees of Freedom:
Chi-square Table:
Graph
212
Chi-square Goodness of Fit
Rules


If the chi-square
calculated is in the tail,
then _______ Ho;
conclude the pattern in the
data is NOT what you
expected or wanted.
If chi-square calculated is
in the hump, then
_______ Ho; conclude the
pattern IS what you
expected or wanted.
213
Do It Yourself Using SAS --

What is the pattern for the favorite brand of
soda?
Ho:

Ha:

Chi-square Calculated:

214
Do It Yourself Using SAS -- (cont.)

Degrees of Freedom:
Chi-square Calculated:
Chi-square Table:
Graph

Conclusion:



215
Chi-square for Two Variables

When to use:



number of variables ________
scaling of variables ________
Basic Idea:

Compare the values you actually get from
your study to the values you would expect if
there was ____________ between the
two variables
216
Chi-square for Two Variables



Ho: There is no relationship between
____ and _____
Ha: There is a relationship between
____ and _____; SPECIFICALLY
_________________
NO CALCULATIONS!! SAS DOES THIS
ONE
217
Chi-Square for Two Variables

Alpha level

Probability level
218
Chi-Square for Two Variables
If the probability level is > _____, do
not reject Ho,
conclude ________________
If the probability level is < _____, then
reject Ho, conclude ______________
AND specify the nature of the
relationship.
CAREFUL--Do not just assume that the
relationship you predicted is correct.
219

If You Reject Ho: How Strong Is the Relationship?

Look at Phi



Phi < 0.10 is
______
Between 0.11 and
0.40 is __________
Phi > 0.40 is
_______
220
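SAS reports the chi-square, probability level, and phi for you, but the idea behind them can be sketched in Python for a 2x2 table. Everything here is an assumption for illustration: the counts are hypothetical, alpha is taken as 0.05, and the p-value shortcut used is valid only for 1 degree of freedom (a 2x2 table). The function returns the magnitude of phi; SAS may print a signed value.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Return (chi-square, p-value, phi) for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if there were no relationship between the variables.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    p_value = erfc(sqrt(chi2 / 2))  # upper-tail probability, df = 1 only
    phi = sqrt(chi2 / n)            # strength of the relationship (magnitude)
    return chi2, p_value, phi

# HYPOTHETICAL counts: rows = two groups, columns = two answer categories.
table = [[30, 20],
         [10, 40]]
chi2, p_value, phi = chi_square_2x2(table)

alpha = 0.05  # assumed alpha level
decision = "reject Ho" if p_value < alpha else "do not reject Ho"
print(round(chi2, 2), p_value, round(phi, 2), decision)
```

If Ho is rejected, the cell counts still have to be inspected to describe the relationship; the test alone does not say which direction it runs.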
Do it yourself using SAS

Ho:

Ha:





Chi-square calculated:
Probability level:
Alpha level
Phi:
Conclusion:
221
Rank-Order Tests
It’s 12:00. Do you know what
your Ha is?
222
Rank-order Data & Chi-square

When to use:



number of variables ________
scaling of variable ________
Basic idea: compare the observed value
(________________) with the values
you would expect if NO PREFERENCE
was shown in the data
223
Hypotheses & Calculations



Ho: There is no ranking in the data; there is no preference.
Ha:
_____________________________,
specifically, ______________________.
Chi-square calculated:
$\sum_i \frac{(\text{Observed}_i - \text{Expected}_i)^2}{\text{Expected}_i}$
224
Rank-order Chi-square

First, compute a weighted score for each
variable: multiply each rank by the number of
respondents who gave that rank, then add

Variable 1 score = 1 (___) + 2 (___) + 3 (___) ...
Variable 2 score = 1 (___) + 2 (___) + 3 (___) ...
Variable 3 score = 1 (___) + 2 (___) + 3 (___) ...
Compute expected value

Add up the total scores and divide by the
number of variables
225
Example

Ranking of movies: Mulan, Private Ryan,
Titanic (n=20)
          Rank 1   Rank 2   Rank 3
Mulan       10        4        6
PR           7        8        5
Titanic      3        8        9
Mulan ranking = 1 (___) + 2 (___) + 3 (___)
PR ranking = 1 (___) + 2 (___) + 3 (___)
Titanic ranking = 1 (___) + 2 (___) + 3 (___)
Expected value:
226
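As a check on the hand calculation, the movie example can be worked through in Python. This sketch takes the counts as Mulan 10/4/6, PR 7/8/5, and Titanic 3/8/9 for ranks 1/2/3 (each row sums to n = 20). Two things are assumptions not stated on the slides: degrees of freedom are taken as the number of variables minus 1, and alpha is taken as 0.05, giving the standard table value 5.991 for 2 degrees of freedom.

```python
# Rank-order chi-square for the movie example (n = 20).
# Each list holds how many respondents gave rank 1, rank 2, rank 3.
rank_counts = {
    "Mulan":   [10, 4, 6],
    "PR":      [7, 8, 5],
    "Titanic": [3, 8, 9],
}

def weighted_score(counts):
    """1 * (count at rank 1) + 2 * (count at rank 2) + 3 * (count at rank 3)."""
    return sum(rank * count for rank, count in enumerate(counts, start=1))

scores = {movie: weighted_score(c) for movie, c in rank_counts.items()}

# Expected value: total of the scores divided by the number of variables.
expected = sum(scores.values()) / len(scores)

chi_calc = sum((s - expected) ** 2 / expected for s in scores.values())
df = len(scores) - 1   # ASSUMPTION: number of variables - 1
chi_table = 5.991      # table value for df = 2, alpha = 0.05 (assumed alpha)

decision = "reject Ho" if chi_calc > chi_table else "do not reject Ho"
print(scores, expected, chi_calc, decision)
```

A lower weighted score means a variable was ranked closer to first, so if Ho were rejected, the lowest score would indicate the preferred item.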
Rank-order Chi-square






Degrees of freedom
Alpha level
Chi-square table
Graph chi-square table value and chi-square calculated
Conclude
Managerial implications
227
Rules


If the chi-square calculated is in the tail,
then _______ Ho, conclude that there
is a preference shown in the data.
EXAMINE THE DATA TO
DETERMINE PREFERENCE. (It may
not be what you hypothesized!)
If the chi-square calculated is in the
hump, then ___________ Ho.
Conclude there is no preference shown.
228
Do This With the Soda Rankings

Rank calculations

Expected value
Ho:

Ha:

229
Do This With the Soda Rankings
(continued)





Chi-square calculated:
Chi-square table:
Graph:
Conclusion: reject or do not reject Ho
Managerial implication
230
T-test for One Mean

When to use:



Basic idea


number of variables __________
scaling of variables
__________
Look at the confidence intervals. Any
numbers within the same confidence interval
are considered the same.
Key question--If my sample mean
(xbar) is ___, can my population
mean (mu) be ___?
231
Hypotheses for T-test for One
Mean



Interested in the average number of
sodas drunk per day.
Ho: The opposite of Ha: The
population mean is equal to (or
less/greater than or equal to) the
number hypothesized.
Ha: What you need to be actionable.
The population mean is (less than/
greater than) _____.
232
Example Ho and Ha






Ho: X = µ
Ha: X ≠ µ (two-tailed test)
Ho: X ≥ µ
Ha: X < µ (one-tailed test, lower tail)
Ho: X ≤ µ
Ha: X > µ (one-tailed test, upper tail)
233
Calculations


T calculated = $\frac{\text{xbar} - \text{mu}}{\text{standard error}}$
Where:
xbar = sample mean
mu = hypothesized population mean
234
More T Calculations



Degrees of freedom = n - 1
Alpha level=
T-table value =
235
Now Graph



t-calculated and
t-table value
on a normal curve
236
Rules for T-test for One Mean


If the calculated t-value is in the hump,
________ Ho. Conclude that your Ha is
not correct.
If the calculated t-value is in the tail
then _____ Ho. Examine your data to
see if Ha or the opposite of Ha is
correct.
237
Practice Once









Ho:
Ha: The population's purchase intention for a
gumball machine is >4.
X-bar = 4.5, SE = 0.15, n = 60
T-calculated
Degrees of freedom
T-table
Graph
Conclusion: Reject or Do not Reject Ho
Managerial implication:
238
Practice Again!









Ho:
Ha: The population's purchase intention for a
gumball machine is >4.
X-bar = 2.3, SE = 0.18, n = 60
T-calculated
Degrees of freedom
T-table
Graph
Conclusion: Reject or Do not Reject Ho
Managerial implication:
239
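The two gumball practice problems can be checked with a short Python sketch. The sample means, standard errors, and n come from the practice slides. Alpha = 0.05 is an assumption, and the table value 1.671 is the standard one-tailed t-table entry for 60 degrees of freedom, used here as the nearest row to df = 59.

```python
# One-mean t-test, upper-tail Ha (population mean > 4).

def t_calculated(xbar, mu, standard_error):
    """T calculated = (xbar - mu) / standard error."""
    return (xbar - mu) / standard_error

def upper_tail_decision(t_calc, t_table):
    """Reject Ho only if t-calculated falls in the upper tail."""
    return "reject Ho" if t_calc > t_table else "do not reject Ho"

n = 60
df = n - 1        # degrees of freedom = n - 1
t_table = 1.671   # ASSUMPTION: alpha = 0.05, one-tailed, nearest table row (df = 60)

# Practice Once: xbar = 4.5, SE = 0.15
t_first = t_calculated(4.5, 4, 0.15)
decision_first = upper_tail_decision(t_first, t_table)

# Practice Again: xbar = 2.3, SE = 0.18
t_second = t_calculated(2.3, 4, 0.18)
decision_second = upper_tail_decision(t_second, t_table)

print(round(t_first, 2), decision_first)
print(round(t_second, 2), decision_second)
```

In the second problem the calculated t is far out in the lower tail, so the data point in the opposite direction from Ha; that is why the rules say to examine the data rather than assume Ha is correct.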
Now Use SAS








Ho:
Ha: The average population rating for Coke when
consumers know it is Coke is >6.
T-calculated
Degrees of freedom
T-table
Graph
Conclusion: Reject or Do not Reject Ho
Managerial implication:
240
T-test for Two Means

When to use:




Number of variables = _______
One variable (the groups) is _______ scaled
One variable (the dependent variable) is ________
scaled
Basic idea:

See if the confidence intervals for the two different
groups overlap. If they do, then
_________________________________ .
241
Hypotheses for T-test for Two
Means



Is there a difference between the number of
sodas males drink per day and the number of
sodas females drink per day?
Ho: The two groups are the same with
respect to __________ .
Ha: The two groups are different with
respect to _______. Specifically,
______________.
242
More on T-tests for Two
Means

No calculations

Check to see if variances are equal or
unequal

Look at “Equality of Variances” –




Ho: variances are equal
Ha: variances are not equal
If p > .05, do not reject Ho and use equal variances
If p < .05, reject Ho and use unequal variances
243
More on T-tests for two means

Check the T-test table to see if you
should reject or not reject your Ho:
T-value =
(either for equal or unequal variance)
P-value =
244
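SAS prints one t-value for equal variances and one for unequal variances. The difference between the two can be sketched in Python using only the standard library. The data are hypothetical (sodas per day for two groups), and the unequal-variance line uses the Satterthwaite approximation for degrees of freedom, which is an assumption about which method is shown in your output. The p-value is left to SAS.

```python
from math import sqrt
from statistics import mean, variance

def two_sample_t(a, b):
    """Return (t_equal, df_equal, t_unequal, df_unequal) for two groups."""
    na, nb = len(a), len(b)
    diff = mean(a) - mean(b)
    va, vb = variance(a), variance(b)  # sample variances (n - 1 denominator)

    # Equal variances: pool the two variances.
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    t_equal = diff / sqrt(pooled * (1 / na + 1 / nb))
    df_equal = na + nb - 2

    # Unequal variances: keep the variances separate.
    se2 = va / na + vb / nb
    t_unequal = diff / sqrt(se2)
    df_unequal = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t_equal, df_equal, t_unequal, df_unequal

# HYPOTHETICAL sodas-per-day data for two groups.
group_a = [3, 4, 2, 5, 4, 3]
group_b = [1, 2, 2, 3, 1, 2]

t_equal, df_equal, t_unequal, df_unequal = two_sample_t(group_a, group_b)
print(round(t_equal, 2), df_equal, round(t_unequal, 2), round(df_unequal, 2))
```

When the two groups are the same size, the two t-values coincide and only the degrees of freedom differ, which is one reason the equality-of-variances check matters most for unbalanced groups.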
Rules

If the probability level is ________
0.05, then ________ Ho. Conclude that
the two groups are different. LOOK AT
THE DATA TO DETERMINE WHAT THE
DIFFERENCE IS.

If the probability level is ______ 0.05,
then __________ Ho. Conclude that
the two groups are the same.
245
Your Turn








Is there a difference in the number of sodas
drunk per day between people who drink
soda with breakfast, and people who do not?
Nominal variable= ___________
Interval variable = ___________
Ho:
Ha:
Probability level
Conclusion: reject or do not reject Ho
Managerial Implication
246
Your Turn Again








Is there a difference in the number of sodas
drunk per day between people who drink
soda with breakfast, and people who do not?
Nominal variable= ___________
Interval variable = ___________
Ho:
Ha:
Probability level
Conclusion: reject or do not reject Ho
Managerial Implication
247
ANOVA

When to use:




Testing mean differences between groups
Have more than 2 groups
Want to test interactions between 2 variables
Same as a t-test except that you have more
than two groups



Number of variables = _______
Some variables (the groups) are _______ scaled
One variable (the dependent variable) is ________
scaled
248


Ho: all the means are equal
Ha: at least one of the means differs (specify
how the mean differs)
249
No Calculations: SAS does
this

Use Proc GLM
Class: the nominally scaled variable(s)
Model: specifies the dependent variable, the
independent variable(s), and interactions
e.g.,
proc glm;
class age;
model liking = age;
means age;
run;
250
Interpretation:
Dependent variable: liking

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              6            93.52         15.59      2.70     0.02
Error             75           433.02          5.77
Corrected Total   81           526.55
NOTE: To determine significance, check the p-value (if p is less than
.05, reject Ho)
251
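The numbers in the ANOVA output are related by simple arithmetic, which can be verified in Python. The sums of squares, degrees of freedom, and reported p-value come from the output above; alpha = 0.05 follows the note on the slide. The p-value itself is taken from the output rather than recomputed.

```python
# Check how the ANOVA table for "liking" fits together.
ss_model, df_model = 93.52, 6
ss_error, df_error = 433.02, 75

# Mean Square = Sum of Squares / DF
ms_model = ss_model / df_model
ms_error = ss_error / df_error

# F Value = Mean Square (Model) / Mean Square (Error)
f_value = ms_model / ms_error

# Corrected Total row = Model + Error
df_total = df_model + df_error
ss_total = ss_model + ss_error

p_reported = 0.02   # Pr > F as printed in the output
alpha = 0.05
decision = "reject Ho" if p_reported < alpha else "do not reject Ho"

print(round(ms_model, 2), round(ms_error, 2), round(f_value, 2), df_total, decision)
```

The computed total sum of squares differs from the printed 526.55 by about 0.01, which is consistent with rounding in the output.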
Do it yourself using SAS

You want to test whether age has an
impact on the number of sodas
consumed per day
Ho:
Ha:
252

F-calculated
Alpha
P-value

Conclusion:


253
THE END!!!
254