MKTG 3531 - Chapter 12

Chapter Twelve
Data Processing, Fundamental Data Analysis, and the Statistical Testing of Differences
Chapter Twelve Objectives
To develop an understanding of the importance and nature of quality control
checks.
To understand the data entry process and data entry alternatives.
To learn how surveys are tabulated and cross-tabulated.
To understand the concept of hypothesis development and how to test
hypotheses.
Data Analysis Overview
Validation & Editing → Coding → Data Entry → Machine Cleaning of Data → Tabulation & Statistical Analysis
Data Analysis Overview
Step One:
• Validation: Confirming the interviews/surveys occurred
• Editing: Determining the questionnaires were completed correctly
Step Two:
• Coding: Grouping and assigning numeric codes to the question responses.
Step Three:
• Data Entry: Process of converting data to an electronic form
Scanning devices can be used to enter data:
• Scanning the questionnaire into a database (such as with bubble sheets)
Step Four:
• Clean the Data: Check for data entry errors or data entry inconsistencies
• Machine cleaning - computerized check of the data
Step Five:
• Data tabulations and statistical analysis.
Editing & Skip Patterns
Editing:
The Process of ascertaining that questionnaires were
filled out properly and completely.
Skip Patterns:
Sequence in which later questions are asked, based
on a respondent’s answer to an earlier question or questions.
Coding
Coding:
The Process of grouping and assigning numeric codes
to the various responses to a question.
The Process:
• List Responses
• Consolidate Responses
• Set Codes
• Enter Codes
• Keep Coding Sheet
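As a rough illustration of the coding steps above, the sketch below lists open-ended answers, consolidates them into categories, and assigns numeric codes. The question, the answers, the categories, and the code values are all hypothetical.

```python
# Hypothetical open-ended answers to "Why do you shop here?" (list responses).
responses = ["Low prices", "cheap", "Close to home", "good prices", "Near my office"]

# Consolidate similar answers into categories (a judgment call made by the coder).
consolidation = {
    "low prices": "Price",
    "cheap": "Price",
    "good prices": "Price",
    "close to home": "Location",
    "near my office": "Location",
}

# Coding sheet: the numeric code assigned to each consolidated category.
coding_sheet = {"Price": 1, "Location": 2, "Other": 9}

def code_response(text):
    """Return the numeric code for one raw response."""
    category = consolidation.get(text.strip().lower(), "Other")
    return coding_sheet[category]

# Enter codes: replace every raw answer with its numeric code.
coded = [code_response(r) for r in responses]
print(coded)  # [1, 1, 2, 1, 2]
```

Keeping the coding sheet as a separate dictionary mirrors the last step of the process: the mapping from codes back to categories stays available when the results are read later.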
Data Entry
Data Entry:
The Process of converting information to an
electronic format.
Intelligent Data Entry:
A form of data entry in which the information being entered into
the data entry device is checked for internal logic.
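A minimal sketch of what an internal-logic check at entry time might look like. The field names, the gender codes, and the allowed age range are assumptions made for illustration only.

```python
def check_record(record):
    """Return a list of internal-logic problems found in one entered record."""
    problems = []
    if record["gender"] not in (1, 2):   # hypothetical codes: 1 = female, 2 = male
        problems.append("gender code must be 1 or 2")
    if not 18 <= record["age"] <= 99:    # hypothetical allowed age range
        problems.append("age outside 18-99")
    if record["years_in_business"] > record["age"]:
        problems.append("years in business exceeds age")
    return problems

# A record that is individually plausible field by field but logically inconsistent.
print(check_record({"gender": 2, "age": 34, "years_in_business": 40}))
```

An entry program built this way would refuse or flag the record as soon as it is keyed in, rather than leaving the problem for the later cleaning step.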
Machine Cleaning of Data
Machine Cleaning of Data:
Final computer error check of data.
Error Checking Routines:
Computer programs that accept instructions from the user to
check for logical errors in the data.
Marginal Report:
Computer-generated table of the frequencies of the responses to
each question, used to monitor entry of valid codes and correct
use of skip patterns.
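The sketch below shows one way a marginal report and an error checking routine could work together. The questions, the valid codes, and the skip rule (question 2 is asked only when question 1 is answered "yes") are hypothetical.

```python
from collections import Counter

# Hypothetical coded records. q1: 1 = yes, 2 = no. q2 is asked only if q1 == 1,
# so a "no" on q1 should leave q2 blank (None).
records = [
    {"q1": 1, "q2": 3},
    {"q1": 2, "q2": None},
    {"q1": 1, "q2": 7},    # 7 is not a valid q2 code
    {"q1": 2, "q2": 4},    # skip-pattern violation
]
valid_codes = {"q1": {1, 2}, "q2": {1, 2, 3, 4, 5, None}}

# Marginal report: frequency of every code entered for each question.
marginals = {q: Counter(r[q] for r in records) for q in valid_codes}

# Error checking routine: flag invalid codes and skip-pattern violations.
errors = []
for i, r in enumerate(records):
    for q, allowed in valid_codes.items():
        if r[q] not in allowed:
            errors.append((i, f"{q} has invalid code {r[q]}"))
    if r["q1"] == 2 and r["q2"] is not None:
        errors.append((i, "q2 answered although q1 skips it"))

print(marginals["q1"])
print(errors)
```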
Cross Tabulation Data
Examination of the responses to one question relative to
the responses to one or more questions in a survey set.
Bi-variate cross-tabulation:
• Cross-tabulating two items - “Business Category” and “Gender”
Multi-variate cross-tabulation:
• Additional filtering criterion, “Veteran Status” - now filtering on three items.
Filters: Race/Ethnicity = (All); Are You a Veteran? = Yes; You Liked the Chamber's Services = (All)
Count of Respondent
Business Category       Female   Male   Grand Total
Computers/Technology         1      3             4
Construction                        1             1
Manufacturing                5                    5
Other                        3      2             5
Professional                        1             1
Grand Total                  9      7            16
Filters: Are You a Veteran? = (All); You Liked the Chamber's Services = (All); Race/Ethnicity = (All)
Count of Respondent
Business Category       Female   Male   Grand Total
Computers/Technology         5      7            12
Construction                 2      4             6
General Services                    1             1
Manufacturing               13      6            19
No Response                  1      4             5
Other                       15     11            26
Professional                 1      3             4
Retail                       4      4             8
Wholesale                    1      1             2
#N/A                                1             1
Grand Total                 42     42            84
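A cross-tabulation of this kind can be sketched in a few lines. The respondents below are hypothetical and are not the survey data shown in the tables; the `crosstab` helper is an illustrative name, not part of any package.

```python
from collections import Counter

# Hypothetical respondents: (business category, gender, veteran status).
respondents = [
    ("Construction", "Male", "Yes"),
    ("Construction", "Female", "No"),
    ("Retail", "Female", "No"),
    ("Retail", "Female", "Yes"),
    ("Retail", "Male", "No"),
]

def crosstab(rows, veteran=None):
    """Count respondents by (category, gender), optionally filtered by veteran status."""
    return Counter(
        (category, gender)
        for category, gender, vet in rows
        if veteran is None or vet == veteran
    )

bivariate = crosstab(respondents)              # category x gender
veterans_only = crosstab(respondents, "Yes")   # same table, veterans only
print(bivariate[("Retail", "Female")])         # 2
print(veterans_only[("Retail", "Female")])     # 1
```

Adding the veteran filter is what turns the two-item table into a three-item one: the rows and columns stay the same, but only a subset of respondents is counted.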
Graphic Representations of Data
One Way Frequency Tables
A table showing the number of respondents
choosing each answer to a survey question.
[Chart: "Did You Like the Movie?" - counts of No, Yes, and Grand Total responses]
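A one-way frequency table is simply a count of each answer. The sketch below uses hypothetical answers and adds a percentage column, which is a common but optional extra.

```python
from collections import Counter

# Hypothetical answers to one survey question.
answers = ["Yes", "No", "Yes", "Yes", "No", "Yes"]

table = Counter(answers)
total = sum(table.values())

# Print one row per answer: label, count, and share of all respondents.
for answer, count in table.most_common():
    print(f"{answer:<12}{count:>3}{count / total:>8.1%}")
print(f"{'Grand Total':<12}{total:>3}")
```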
Graphic Representations of Data
Line, Pie, and Bar Charts
Line Charts: Good for demonstrating linear relationships.
Pie Charts: Good for special relationships among data points.
Bar Charts: Good for side-by-side relationships / comparisons.
[Line, pie, and bar charts: "Did You Like the Movie?" - No, Yes, and Grand Total counts for Female, Male, and Grand Total]
Descriptive Statistics
Effective means of summarizing large data sets. Key measures include:
mean, median, mode, kurtosis, standard deviation, skewness, and variance.
Significant discrepancies in “Mean” and “Median” should cause you to
look further into this data.
Years in Business
Mean                    22.4
Standard Error           2.6
Median                  15.0
Mode                     5.0
Standard Deviation      23.1
Sample Variance        534.5
Kurtosis                 3.8
Skewness                 2.1
Range                   98.0
Minimum                  2.0
Maximum                100.0
Sum                   1770.5
Count                   79.0
Descriptive Statistics
Mean:
• The sum of the values for all observations of a variable
divided by the number of observations.
Median:
• In an ordered set, the value below which 50 percent of the
observations fall.
Mode:
• The value that occurs most frequently.
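The three measures can be computed with Python's standard `statistics` module. The values below are hypothetical; the single large value is included to show how the mean can be pulled well above the median.

```python
import statistics

# Hypothetical "years in business" values for nine respondents.
years = [2, 5, 5, 8, 15, 20, 30, 45, 100]

mean = statistics.mean(years)      # sum of the values / number of values
median = statistics.median(years)  # middle value of the ordered data
mode = statistics.mode(years)      # most frequently occurring value

print(round(mean, 1), median, mode)
```

Here the mean lands noticeably above the median because of the one respondent with 100 years in business, which is the kind of gap that suggests a closer look at the data.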
Descriptive Statistics
Variance:
• The sum of the squared deviations from the mean divided by the number of
observations minus one.
• The same formula as the standard deviation, without the final square root.
Range:
• The maximum value for a variable minus the minimum value for that variable.
Standard Deviation:
$$ s = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}} $$
• Calculated by:
• subtracting the mean of a series from each value in the series
• squaring each result, then summing them
• then dividing the result by the number of items minus 1
• and finally taking the square root of this value
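The calculation steps can be followed on a small hypothetical data set. The sketch computes the variance and standard deviation by hand and cross-checks them against the standard library's sample statistics.

```python
import math
import statistics

values = [4, 8, 6, 5, 7]   # hypothetical data
n = len(values)
mean = sum(values) / n                                  # 30 / 5 = 6.0

# Subtract the mean from each value and square the result.
squared_deviations = [(x - mean) ** 2 for x in values]  # [4.0, 4.0, 0.0, 1.0, 1.0]

# Sum the squares and divide by the number of items minus 1.
variance = sum(squared_deviations) / (n - 1)            # 10 / 4 = 2.5

# The standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

# Range: maximum minus minimum.
value_range = max(values) - min(values)                 # 8 - 4 = 4

# Cross-check against the standard library's sample (n - 1) formulas.
assert math.isclose(variance, statistics.variance(values))
assert math.isclose(std_dev, statistics.stdev(values))
print(variance, round(std_dev, 3), value_range)
```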
Statistical Significance
Mathematical Differences:
• By definition, if numbers are not exactly the same, they
are different. This fact does not, however, mean that the
difference is either important or statistically significant.
Statistical Significance:
• If a particular difference is large enough to be unlikely
to have occurred because of chance or sampling error,
then the difference is statistically significant.
Statistical Significance
Managerially Important Differences:
• One must be able to distinguish between mathematical differences
and statistically significant differences when using the data analysis in
managerial decision making.
Hypothesis:
• An assumption, argument, or theory that a researcher or manager
makes about some characteristics of the population under study.
Hypothesis Testing
Step One: Stating the hypothesis
• Null Hypothesis - the status quo, assumed to be true unless the data provide evidence against it.
• Alternative Hypothesis - the competing claim, supported when the null hypothesis is rejected.
Step Two: Choosing the appropriate test statistic
• Test of means, test of proportions, ANOVA, etc.
Step Three: Developing a decision rule
• Determine the significance level.
• Need to determine whether to reject or fail to reject the null hypothesis.
Hypothesis Testing
Step Four: Calculating the value of the test statistic
• Use the appropriate formula to calculate the value of the statistic.
Step Five: Stating the conclusion
• Stated from the perspective of the original research question.
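The five steps can be traced with a two-tailed Z-test for a single mean. This is only a sketch: the hypothesized mean, the sample summary figures, and the 0.05 significance level are all assumed values chosen for illustration.

```python
import math
from statistics import NormalDist

# Step 1: state the hypotheses (hypothetical values).
# H0: population mean = 20; Ha: population mean != 20.
mu_0 = 20.0

# Step 2: choose the test statistic: Z-test for a single mean (large sample).
n, sample_mean, sample_sd = 100, 22.0, 8.0   # hypothetical sample summary

# Step 3: decision rule at the 0.05 significance level, two-tailed.
alpha = 0.05
critical_z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96

# Step 4: calculate the value of the test statistic.
standard_error = sample_sd / math.sqrt(n)          # 8 / 10 = 0.8
z = (sample_mean - mu_0) / standard_error          # 2 / 0.8 = 2.5
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Step 5: state the conclusion.
reject_null = abs(z) > critical_z
print(round(z, 2), round(p_value, 4), reject_null)
```

With these numbers the test statistic exceeds the critical value, so the null hypothesis is rejected; in terms of the research question, the population mean appears to differ from 20.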
Types of Errors in Hypothesis Testing
Type I:
• Rejection of the null hypothesis when, in fact, it is true.
Type II:
• Failure to reject the null hypothesis when, in fact, it is false.
Tests are either one-tailed or two-tailed. This decision depends on the nature of the
situation and what the researcher is trying to demonstrate.
One-Tailed:
• “If you take the medicine, you will get better”
Two-Tailed:
• “If you take the medicine, you will get either better or worse.”
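The choice of tails changes the p-value for the same test statistic. In the sketch below the z value and the 0.05 significance level are hypothetical.

```python
from statistics import NormalDist

z = 1.8        # hypothetical test statistic
alpha = 0.05   # hypothetical significance level

# One-tailed: only a change in one direction ("better") counts as evidence.
p_one_tailed = 1 - NormalDist().cdf(z)

# Two-tailed: a change in either direction ("better or worse") counts.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(p_one_tailed < alpha, p_two_tailed < alpha)  # True False
```

The same result is significant under the one-tailed test and not under the two-tailed test, which is why the direction of the hypothesis should be settled before the data are examined.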
Issues With Type I and II Errors
Actual State of the
Null Hypothesis      Fail to Reject Ho            Reject Ho
Ho is true           Correct (1 - α), no error    Type I error (α)
Ho is false          Type II error (β)            Correct (1 - β), no error
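A small simulation can make the two error rates concrete. The sketch below repeatedly runs a two-tailed Z-test with a known standard deviation, first when Ho is true and then when it is false. The sample size, the size of the true difference (0.3), the number of trials, and the random seed are all arbitrary assumptions.

```python
import math
import random
from statistics import NormalDist

random.seed(12)   # arbitrary seed so the run is repeatable
alpha = 0.05
critical_z = NormalDist().inv_cdf(1 - alpha / 2)

def rejects_null(true_mean, n=30, mu_0=0.0, sigma=1.0):
    """Draw one sample and run a two-tailed Z-test of Ho: mean = mu_0."""
    sample = [random.gauss(true_mean, sigma) for _ in range(n)]
    z = (sum(sample) / n - mu_0) / (sigma / math.sqrt(n))
    return abs(z) > critical_z

trials = 2000
# Ho is true: every rejection is a Type I error, so the rate should be near alpha.
type_1_rate = sum(rejects_null(0.0) for _ in range(trials)) / trials
# Ho is false: every failure to reject is a Type II error (the beta rate).
type_2_rate = sum(not rejects_null(0.3) for _ in range(trials)) / trials
print(type_1_rate, type_2_rate)
```

The Type I rate is set by the chosen significance level, while the Type II rate depends on the sample size and on how far the true mean is from the hypothesized one.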
Commonly Used Statistical Hypothesis Tests
Independent Samples:
• Samples in which measurement of a variable in one population
has no effect on measurement of the variable in the other.
Related Samples:
• Samples in which measurement of a variable in one population
might influence measurement of the variable in the other.
Degrees of Freedom:
• The number of observations minus the number of
assumptions or constraints necessary to calculate a statistic.
Hypothesis Tests
About One and Two Means Respectively
Z-Test:
• Hypothesis test used for a single mean if the sample is large
enough and drawn from a normal population. Usually for
samples of about 30 and above.
t-Test:
• Hypothesis test used for a single mean if the sample is too small to
use the Z-test. Usually for samples below 30.
• Also used as a hypothesis test of the difference between groups of data.
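For a small sample, the t statistic for a single mean is computed the same way as the Z statistic but is compared against a t distribution with n - 1 degrees of freedom. The ratings and the hypothesized mean below are hypothetical, and the critical value is taken from a standard t table rather than computed.

```python
import math
import statistics

# Hypothetical sample of 10 satisfaction ratings; Ho: population mean = 7.0.
sample = [6.1, 7.4, 8.0, 6.8, 7.9, 8.3, 7.2, 6.9, 8.1, 7.3]
mu_0 = 7.0

n = len(sample)
sample_mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / math.sqrt(n)   # stdev divides by n - 1
t = (sample_mean - mu_0) / standard_error
degrees_of_freedom = n - 1   # one constraint: the sample mean was estimated

# Two-tailed critical value for a 0.05 significance level and 9 degrees of
# freedom, looked up in a standard t table.
critical_t = 2.262
print(round(t, 2), degrees_of_freedom, abs(t) > critical_t)
```

In this example the t statistic falls short of the critical value, so the null hypothesis is not rejected at the 0.05 level.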
Hypothesis Tests
About Proportions and P-Value
Proportion in One Sample:
• Test to determine whether the difference between proportions is greater than
would be expected because of sampling error.
Two Proportions in Independent Samples:
• Test to determine the proportional differences between two or more groups.
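A test of two proportions in independent samples can be sketched with a pooled Z-test, one common approach for large samples. The counts below are hypothetical.

```python
import math
from statistics import NormalDist

# Hypothetical results: respondents who liked the service in two independent samples.
liked_a, n_a = 60, 100
liked_b, n_b = 45, 100

p_a, p_b = liked_a / n_a, liked_b / n_b

# Pooled proportion: the best estimate of the common proportion if Ho (p_a == p_b) is true.
pooled = (liked_a + liked_b) / (n_a + n_b)
standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))

z = (p_a - p_b) / standard_error
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed
print(round(z, 2), round(p_value, 3))
```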
p-value:
• The probability, assuming the null hypothesis is true, of getting a test
statistic at least as extreme as the one computed. The smaller the p-value,
the smaller the probability that the observed result occurred by chance.
Statistics and the Internet
ActivStats - www.datadesk.com
Autobox - www.autobox.com
Math Software - http://gams.nist.gov
Minitab - www.minitab.com
SAS - www.sas.com
SPSS - www.spss.com
Stata - www.stata.com
SYSTAT - www.systat.com
Vizion - www.datadesk.com/viz!on
XLSTAT - www.xlstat.com
Index
Cross-tabulation
Data Analysis Overview
Descriptive Statistics
Editing, Coding, & Cleaning the Data
Hypothesis Testing - Common Types
Hypothesis Testing - Steps
Measures of Central Tendency
Measures of Dispersion
Statistical Testing of Differences
Type I and Type II Errors