SPSS 101 - University of San Diego Home Pages


Data Analysis Express:
Practical Application using
SPSS
Data of Interest

National Insurance Company
– 1000 questionnaires sent
– 285 respondents

Questionnaire Presentation
– Copy given in class
SPSS Data Set

2 Views: Variable and Data.
– Raw Variable (labels and values)
– Transformed Variable (compute and recode)
Preliminary Data Analysis:
Basic Descriptive Statistics

Preliminary data analysis examines the
central tendency and the dispersion of the
data on each variable in the data set
– The measurement level dictates which statistics are appropriate
– Get a feel for the data

What can we do? (Limitations on next slide.)
Run descriptives. (outputs 1)
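
As a rough illustration of how the measurement level drives the choice of statistic, the sketch below summarizes three small variables with Python's standard library. The variable names and all values are invented for illustration; they are not taken from the National Insurance data set.

```python
import statistics
from collections import Counter

# Hypothetical responses, invented for illustration (NOT the survey data).
education = ["HS", "College", "College", "Grad", "HS", "College"]  # nominal
satisfaction = [3, 4, 4, 5, 2, 4, 3]                               # ordinal, 1-5 rating
age = [23, 35, 41, 29, 52, 38, 35]                                 # metric

def describe_nominal(values):
    """Nominal data: report the mode and the frequency of each category."""
    counts = Counter(values)
    return {"mode": counts.most_common(1)[0][0], "frequencies": dict(counts)}

def describe_ordinal(values):
    """Ordinal data: report the median and the observed range."""
    return {"median": statistics.median(values), "range": (min(values), max(values))}

def describe_metric(values):
    """Metric data: report the mean and the sample standard deviation."""
    return {"mean": statistics.mean(values), "std_dev": statistics.stdev(values)}

print(describe_nominal(education))
print(describe_ordinal(satisfaction))
print(describe_metric(age))
```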
Measures of Central Tendency and
Dispersion for Different Types of
Variables
Crosstabs: Frequencies under
specific conditions.

Used most often with categorical variables

Examples to run
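
Outside SPSS, a cross-tabulation can be sketched as a count of each combination of two categorical variables. The (education, recommend) pairs below are hypothetical and only show the mechanics.

```python
from collections import Counter

# Hypothetical (education, would_recommend) pairs, invented for illustration.
responses = [
    ("HS", "Yes"), ("HS", "No"), ("College", "Yes"), ("College", "Yes"),
    ("Grad", "No"), ("Grad", "Yes"), ("HS", "No"),
]

def crosstab(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Counter returns 0 for combinations that never occur.
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

table = crosstab(responses)
for row, cells in table.items():
    print(row, cells)
```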
Cross-Tabulations- Comparing
frequencies: Chi-square
Contingency Test

Technique used for determining whether there is a
statistically significant relationship between two
categorical (nominal or ordinal) variables
Need to Conduct Chi-square Test
to Reach a Conclusion

The hypotheses are:
– H0:There is no association between educational level
and willingness to recommend National to a friend (the
two variables are independent of each other).
– Ha:There is some association between educational level
and willingness to recommend National to a friend (the
two variables are not independent of each other).
– Let’s do it….
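
The chi-square statistic compares each observed cell count with the count expected if the two variables were independent (row total × column total / grand total). The sketch below shows that arithmetic on a hypothetical 3 × 2 table; the counts are invented and are not the National Insurance results.

```python
def chi_square_statistic(observed):
    """Return (chi-square, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

# Hypothetical counts: rows = education levels, columns = recommend (yes, no).
observed = [[30, 20], [25, 25], [20, 30]]
chi2, df = chi_square_statistic(observed)
print(chi2, df)
```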
National Insurance Company
Study
Computed chi-square value
P-value
National Insurance Company
Study: P-Value Significance

The actual significance level (p-value) = 0.019
– The chances of getting a chi-square value as high
as 10.007 when there is no relationship between
education and recommendation are about 19 in
1000.
– The apparent relationship between education and
recommendation revealed by the sample data is
unlikely to have occurred by chance alone.
– We can reject the null hypothesis at the conventional
0.05 significance level.
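
The link between the chi-square value and the p-value can be checked by hand. The slides do not state the degrees of freedom; the sketch below assumes 3, which gives a p-value close to the reported 0.019. For 3 degrees of freedom the upper-tail probability has a closed form, so only the standard library is needed.

```python
import math

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

chi_square = 10.007   # value reported on the slide
alpha = 0.05          # conventional significance level (an assumption here)

p_value = chi2_sf_df3(chi_square)   # df = 3 is assumed, not stated in the slides
reject_null = p_value < alpha
print(p_value, reject_null)
```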
Precautions in Interpreting Cross
Tabulation Results

Two-way tables cannot show conclusive evidence
of a causal relationship

Watch out for small cell sizes

The risk of drawing erroneous inferences increases
when more than two variables are involved
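
One common way to watch for small cell sizes is to flag cells whose expected count falls below 5, a widely used rule of thumb for the chi-square test. The sketch below applies it to a hypothetical table; the counts and the `minimum` threshold parameter are illustrative assumptions.

```python
def expected_counts(observed):
    """Expected cell counts under independence of rows and columns."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def small_cells(observed, minimum=5.0):
    """List (row, column, expected) for cells with expected count below minimum."""
    flagged = []
    for i, row in enumerate(expected_counts(observed)):
        for j, expected in enumerate(row):
            if expected < minimum:
                flagged.append((i, j, expected))
    return flagged

# Hypothetical counts, invented for illustration.
observed = [[12, 3], [40, 30], [2, 8]]
flagged = small_cells(observed)
print(flagged)
```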
Comparing Means

Mainly T-tests and ANOVAs

T-test on OQ and gender.
Independent T-tests

Independent variable with exactly 2 categories.

Equality of variances (cf. output)

p = .88: if there were no true difference between
the groups, a difference at least as large as the
observed .04 would occur about 88% of the time.
Cannot reject the null hypothesis.
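
For reference, the test statistic behind an independent-samples t-test can be sketched as follows. This version assumes equal variances in the two groups (the pooled form); the ratings are hypothetical and are not the OQ data from the survey.

```python
import math
import statistics

def pooled_t(group_a, group_b):
    """Return (t statistic, degrees of freedom), assuming equal group variances."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / std_error, n_a + n_b - 2

# Hypothetical ratings for two groups, invented for illustration.
men = [4, 5, 3, 4, 4, 5]
women = [4, 4, 5, 3, 4, 4]
t_value, df = pooled_t(men, women)
print(t_value, df)
```

A t statistic close to zero, as with a small mean difference relative to the spread, corresponds to a large p-value and no grounds for rejecting the null hypothesis.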
Analysis of Variance

ANOVA is appropriate in situations where
the independent variable is set at certain
specific levels (called treatments in an
ANOVA context) and metric measurements
of the dependent variable are obtained at
each of those levels
Example
24 stores chosen randomly for the study
8 stores randomly chosen for each treatment
– Treatment 1: Store brand sold at the regular price
– Treatment 2: Store brand sold at 50¢ off the regular price
– Treatment 3: Store brand sold at 75¢ off the regular price
Monitor sales of the store brand for a week in each store
Table 15.2 Unit Sales Data Under Three
Pricing Treatments

                              Treatment
                    Regular Price   50¢ off   75¢ off
Unit sales in            37           46        46
each store               38           43        49
                         40           43        48
                         40           45        48
                         38           45        47
                         38           43        48
                         40           44        49
                         39           44        49
Number of stores          8            8         8
Mean sales              38.75        44.13     48.00
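
The mean sales in Table 15.2 can be reproduced directly from the unit sales. The sketch below uses the table's figures; the dictionary keys are just labels chosen for this example.

```python
import statistics

# Unit sales per store, taken from Table 15.2.
sales = {
    "Regular price": [37, 38, 40, 40, 38, 38, 40, 39],
    "50 cents off": [46, 43, 43, 45, 45, 43, 44, 44],
    "75 cents off": [46, 49, 48, 48, 47, 48, 49, 49],
}

# Mean sales for each pricing treatment and across all 24 stores.
group_means = {name: statistics.mean(values) for name, values in sales.items()}
grand_mean = statistics.mean(x for values in sales.values() for x in values)

for name, mean in group_means.items():
    print(name, mean)
print("Grand mean", grand_mean)
```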
ANOVA – Grocery Store
Hypothesis

Grocery Store Example
– H0: μ1 = μ2 = μ3
– Ha: At least one μ is different from one or more of
the others

Hypotheses for K Treatment groups or samples
– H0: μ1 = μ2 = … = μk
– Ha: At least one μ is different from one or more of
the others
Exhibit 15.1 SPSS Computer
Output for ANOVA Analysis

Between-Subjects Factors

                     Value Label      N
Treatment group  1   Regular price    8
                 2   50 cents off     8
                 3   75 cents off     8
Exhibit 15.1 SPSS Computer
Output for ANOVA Analysis
(Cont’d)

Tests of Between-Subjects Effects
Dependent Variable: SALES

Source            Type III Sum    df   Mean Square   F           Sig.
                  of Squares
Corrected Model     345.250a       2     172.625       137.445   .000
Intercept         45675.375        1   45675.375     36367.123   .000
TREAT               345.250        2     172.625       137.445   .000
Error                26.375       21       1.256
Total             46047.000       24
Corrected Total     371.625       23

a. R Squared = .929 (Adjusted R Squared = .922)
There is less than a .001 probability of obtaining an F-value as high as 137.445 if the treatment means are equal
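
The figures in the SPSS output can be checked against the Table 15.2 data with a short one-way ANOVA calculation. This is a plain-Python sketch of the standard between/within decomposition, not the routine SPSS itself uses; the function and key names are chosen for this example.

```python
# Unit sales per store, taken from Table 15.2.
sales = {
    "Regular price": [37, 38, 40, 40, 38, 38, 40, 39],
    "50 cents off": [46, 43, 43, 45, 45, 43, 44, 44],
    "75 cents off": [46, 49, 48, 48, 47, 48, 49, 49],
}

def one_way_anova(groups):
    """Between/within sums of squares, F ratio and R squared for a one-way layout."""
    all_values = [x for g in groups for x in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    # Between-treatment variation: group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-treatment (error) variation: observations around their group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df": (df_between, df_within),
        "f_value": f_value,
        "r_squared": ss_between / (ss_between + ss_within),
    }

result = one_way_anova(list(sales.values()))
print(result)
```

The between-treatment sum of squares corresponds to the TREAT row, the within-treatment sum of squares to the Error row, and their ratio after dividing by the degrees of freedom to the F column.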
ANOVA

OQ recommendation and OQ, individual
variable

OQ and EDUC (graph), and post hoc tests