Inferential_Stat_III

Homework from last week
Two random samples of 10 patrol officers
from the XYZ Police Department, each officer
tested for cynicism (continuous variable,
scale 1-5)
Sample 1 scores: 3 3 3 3 3 3 3 1 2 5 (variance = .99)
Sample 2 scores: 2 1 1 2 3 3 3 3 4 2 (variance = .93)
Pooled sample variance Sp²
Simplified method: midpoint between the two sample variances
$$S_p^2 = \frac{s_1^2 + s_2^2}{2}$$
Standard error of the difference between means
$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$
T-Test for significance of the difference between means
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\bar{x}_1 - \bar{x}_2}}$$
CALCULATIONS
Pooled sample variance: .96
Standard error of the difference between means: .44
t statistic: 1.14
df (degrees of freedom) = (n1 + n2) - 2 = 18
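The calculations above can be reproduced with a short Python sketch. It uses only the standard library and the simplified pooled variance from this homework (the midpoint of the two sample variances). The function and variable names are illustrative assumptions, not part of the course material.

```python
from math import sqrt
from statistics import mean, variance

# Cynicism scores from the homework (scale 1-5)
sample_1 = [3, 3, 3, 3, 3, 3, 3, 1, 2, 5]
sample_2 = [2, 1, 1, 2, 3, 3, 3, 3, 4, 2]

def t_for_difference(a, b):
    """Return (pooled variance, standard error, t) using the simplified method."""
    # statistics.variance is the sample variance (divides by n - 1)
    pooled = (variance(a) + variance(b)) / 2
    std_error = sqrt(pooled * (1 / len(a) + 1 / len(b)))
    t = (mean(a) - mean(b)) / std_error
    return pooled, std_error, t

pooled, std_error, t = t_for_difference(sample_1, sample_2)
df = len(sample_1) + len(sample_2) - 2
print(round(pooled, 2), round(std_error, 2), round(t, 2), df)
```

Rounded to two decimals, the sketch gives the same pooled variance, standard error, t statistic and degrees of freedom as the slide.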
Would you use a ONE-tailed t-test OR a TWO-tailed t-test?
Depends on the hypothesis
Two-tailed (does not predict direction of the change): Gender → cynicism
One-tailed (predicts direction of the change): Males more cynical than females
Can you reject the NULL hypothesis? (The maximum probability that the null hypothesis
is true must be < .05: fewer than five chances in a hundred.)
NO – For a ONE-tailed test, need a t of 1.734 or higher
NO – For a TWO-tailed test, need a t of 2.101 or higher
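The decision rule can be written out as a small Python sketch. The critical values are the ones quoted above for 18 degrees of freedom at the .05 level. The function name, and the assumption that a positive t is the predicted direction for the one-tailed test, are illustrative choices.

```python
# Critical t values for df = 18 at the .05 level, as quoted in the notes
CRITICAL_T = {"one-tailed": 1.734, "two-tailed": 2.101}

def can_reject_null(t, test):
    """Return True when t reaches the critical value for the chosen test."""
    # One-tailed: t must fall in the predicted direction (assumed positive here).
    # Two-tailed: only the size of t matters, so compare its absolute value.
    value = t if test == "one-tailed" else abs(t)
    return value >= CRITICAL_T[test]

t = 1.14  # t statistic from the homework
for test in CRITICAL_T:
    print(test, can_reject_null(t, test))
```

Both checks come back False for the homework's t of 1.14, which matches the two "NO" answers.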
Answer for final exam
There will be one word question on the final for each statistical technique
covered during the period
Here is the question for difference between the means
– You will first be asked to state whether the null hypothesis can be rejected,
either yes or no
– Remember that rejecting a null hypothesis means you are confirming the
working hypothesis, while accepting a null hypothesis means that you are
rejecting the working hypothesis
– You will then be asked: “Justify your yes/no answer using plain English.
Avoid technical terms. Make sure to include the estimated accuracy.”
Here is a good answer:
– If you reject the null hypothesis: There are fewer than 5 chances in 100 that the
observed (actual) difference between means could have been caused by
chance alone
– If you accept the null hypothesis: There are more than 5 chances in 100 that the
observed (actual) difference between means could have been caused by
chance alone
Other tests of significance
We’ve covered confidence intervals
and difference between means. What else is there?
All variables are continuous
Correlation (r) and regression (r²)
Correlation: Relationship between variables
Regression: Correlation squared – proportion of change in the dependent
variable that is accounted for by the change in the independent variable.
If the hypothesis is that Height → Weight, we can say that 52% of the change
in weight is accounted for by the change in height
In this example, the probability that one could get r and r² statistics this high
if the null hypothesis is true is < .01 (less than 1 in 100).
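To make the link between r and r² concrete, here is a minimal Python sketch. The `pearson_r` helper and the height and weight lists are hypothetical, made up for illustration. They are not the 26 cases behind the lecture's output. The last lines simply square the r of .719 reported in that output.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# HYPOTHETICAL data for illustration only (not the lecture's sample)
heights = [60, 62, 65, 68, 70, 72]
weights = [115, 140, 130, 165, 150, 190]

r = pearson_r(heights, weights)
print(round(r, 3), round(r ** 2, 3))

# Squaring the lecture's reported r of .719 gives the 52% figure
r_squared_lecture = round(0.719 ** 2, 2)
print(r_squared_lecture)
```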
Correlations
                                HEIGHT    WEIGHT
HEIGHT   Pearson Correlation     1.000    .719**
         Sig. (2-tailed)             .    .000
         N                          26      26
WEIGHT   Pearson Correlation    .719**    1.000
         Sig. (2-tailed)          .000        .
         N                          26      26
**. Correlation is significant at the 0.01 level (2-tailed).
[Figure: scatterplot of WEIGHT (vertical axis, 100 to 240) against HEIGHT (horizontal axis, 58 to 76); r² = .52**]
Dependent variable is dichotomous
Logistic regression - “b” statistic
Dependent variable: Two-level (e.g., “Yes/No”, “M/F”); also called a “dummy” variable.
Independent variables: Categorical or continuous
Hypothesis is that arresting domestic abusers reduces the risk that their victims will be assaulted
in the future. The IVs run down the left side of the table, and the DV, repeat victimization (Yes/No), is embedded.
This table reports the influence of the independent variables on Repeat Victimization = Y
There is no significant relationship between Arrest and Repeat Victimization = Y. The negative
relationship between Arrest and Repeat Victimization = Y means that after arrest repeat
victimization decreases, but not significantly. “Not reported”, “Survey exposure” and “Prior
victimization” do have significant relationships with Repeat Victimization = Y
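A short Python sketch can show what a negative “b” means in a logistic regression. The intercept and coefficient below are hypothetical numbers chosen only to illustrate the sign. They are not taken from the table discussed above, and the function name is likewise an assumption. Whether the effect is significant is a separate question that the sketch does not address.

```python
from math import exp

def predicted_probability(intercept, b_arrest, arrested):
    """Logistic model with one predictor: P(repeat victimization = Yes)."""
    log_odds = intercept + b_arrest * arrested
    return 1 / (1 + exp(-log_odds))

# HYPOTHETICAL coefficients, chosen only to illustrate a negative b
intercept = -0.5
b_arrest = -0.3

p_no_arrest = predicted_probability(intercept, b_arrest, arrested=0)
p_arrest = predicted_probability(intercept, b_arrest, arrested=1)
odds_ratio = exp(b_arrest)  # falls below 1 when b is negative

print(round(p_no_arrest, 3), round(p_arrest, 3), round(odds_ratio, 3))
```

With a negative b, the predicted probability of repeat victimization is lower when an arrest was made, and the odds ratio falls below 1.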
Independent variable is categorical
and dependent variable is continuous
Analysis of Variance - “F” statistic
An extension of difference between the means test
Example: does officer professionalism vary between cities? (scale 1-10)
Means:   City 1   City 2   City 3
            8        5        3
Calculate the “F” statistic and look it up in the F table. An “F” statistic that is sufficiently
large can overcome the null hypothesis that the differences between the
means are due to chance.
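As a rough sketch of the calculation, the Python below computes a one-way F statistic by hand. The individual scores are hypothetical, picked only so that the three city means come out to 8, 5 and 3. The function name is also an assumption.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_scores = [s for g in groups for s in g]
    grand_mean = sum(all_scores) / len(all_scores)
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: how far each group mean sits from the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: how scores disperse around their own group mean
    ss_within = sum((s - m) ** 2 for g, m in zip(groups, means) for s in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# HYPOTHETICAL scores picked so the city means are 8, 5 and 3
city_1 = [7, 8, 9]
city_2 = [4, 5, 6]
city_3 = [2, 3, 4]

f, df_b, df_w = one_way_f([city_1, city_2, city_3])
print(round(f, 2), df_b, df_w)
```

With these made-up scores, F is 19 on 2 and 6 degrees of freedom. A standard F table puts the .05 cutoff for those degrees of freedom at about 5.14, so here the null hypothesis would be rejected.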
Two-way Analysis of Variance
Stratify independent variable (e.g. by gender)
            M      F
City 1     ---    ---
City 2     ---    ---
City 3     ---    ---
The F statistic is a ratio of “between-group” to “within-group” differences. To
overcome the null hypothesis the differences in scores between groups
(between cities and between genders) should be much greater than the
differences in scores within cities
$$F = \frac{\text{between-group variance (error + systematic effects of ind. variable)}}{\text{within-group variance (how scores disperse within each city)}}$$
Instead of using asterisks, sometimes the actual
probability that the null hypothesis is true is given.
For significant relationships, look for probabilities below .05
(< .05)
Know where the independent and dependent variables are on
the table! Sometimes, like in this example, categories of the
dependent variable run in rows, and the independent variable
categories run in columns.