4. Inferences about a single quantitative predictor


Unit 4: Inferences about a Single Quantitative Predictor
Unit Organization
• First consider details of the simplest model (one parameter estimate; the mean-only model; no X's)
• Next examine simple regression (two parameter estimates; one quantitative predictor variable X)
• These provide a critical foundation for all linear models
• Subsequent units will generalize to one dichotomous predictor variable (Unit 5; Markus), multiple predictor variables (Units 6-7), and beyond…
Linear Models as Models
Linear models (including regression) are 'models':
DATA = MODEL + ERROR
Three general uses for models:
• Describe and summarize DATA (Ys) in a simpler form using the MODEL
• Predict DATA (Ys) from the MODEL
• We will want to know the precision of prediction: how big is the error? Better prediction means less error.
• Understand (test inferences about) complex relationships between individual regressors (Xs) in the MODEL and the DATA (Ys). How precise are the estimates of these relationships?
MODELS are simplifications of reality. As such, there is ERROR. They also make assumptions that must be evaluated.
Fear Potentiated Startle (FPS)
We are interested in producing anxiety in the laboratory.
To do this, we develop a procedure in which we expose people to periods of unpredictable electric shock administration alternating with periods of safety.
We measure their startle response in the shock and safe periods.
We use the difference between their startle during the shock vs. safe periods to determine if they are anxious.
This is called fear-potentiated startle (FPS). Our procedure works if FPS > 0. We need a model of FPS scores to determine if FPS > 0.
Fear Potentiated Startle: One parameter model
A very simple model for the population of FPS scores would predict the same value for everyone in the population:
Ŷi = β0
We would like this value to be the "best" prediction. In the context of DATA = MODEL + ERROR, how can we quantify "best"?
We want to predict some characteristic about the population of FPS scores that minimizes the ERROR from our model.
ERROR = DATA – MODEL
εi = Yi – Ŷi; there is an error (εi) for each population score.
How can we quantify total model error?
Total Error
The sum of errors across all scores in the population isn't ideal because positive and negative errors will tend to cancel each other:
Σ(Yi – Ŷi)
The sum of the absolute values of errors could work. If we selected β0 to minimize the sum of the absolute values of errors, β0 would equal the median of the population:
Σ|Yi – Ŷi|
The sum of squared errors (SSE) could work. If we selected β0 to minimize the sum of squared errors, β0 would equal the mean of the population:
Σ(Yi – Ŷi)²
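A quick numeric check of these two claims: evaluate each criterion over a grid of candidate values for β0 and find the minimizer. A minimal sketch in R (the data vector y is made up for illustration; it is not the FPS data):

# The value minimizing summed absolute errors is the median;
# the value minimizing summed squared errors is the mean.
y = c(2, 3, 5, 9, 21)
candidates = seq(0, 25, by = .01)
sumAbs = sapply(candidates, function(b) sum(abs(y - b)))
sumSq = sapply(candidates, function(b) sum((y - b)^2))
candidates[which.min(sumAbs)]  # 5, the median of y
candidates[which.min(sumSq)]   # 8, the mean of y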
One parameter model for FPS
For the moment, let's assume we prefer to minimize SSE (more on that in a moment). You should predict the population mean FPS for everyone:
Ŷi = β0, where β0 = μ
What is the problem with this model, and how can we fix it?
We don't know the population mean for FPS scores (μ). We can collect a sample from the population and use the sample mean (X̄) as an estimate of the population mean (μ).
X̄ is an unbiased estimate of μ.
Model Parameter Estimation
Population model:
Ŷi = β0, where β0 = μ
Yi = β0 + εi
Estimate the population parameters from a sample:
Ŷi = b0, where b0 = X̄
Yi = b0 + ei
Least Squares Criterion
In ordinary least squares (OLS) regression and other least
squares linear models, the model parameter estimates (e.g.,
b0) are calculated such that they minimize the sum of squared
errors (SSE) in the sample in which you estimate the model.
SSE = Σ(Yi – Ŷi)²
SSE = Σei²
Properties of Parameter Estimates
There are three properties that make a parameter estimate attractive.
Unbiased: The mean of the sampling distribution for the parameter estimate is equal to the value of that parameter in the population.
Efficient: The sample estimates are close to the population parameter. In other words, the narrower the sampling distribution for any specific sample size N, the more efficient the estimator. Efficient means a small SE for the parameter estimate.
Consistent: As the sample size increases, the sampling distribution becomes narrower (more efficient). Consistent means that as N increases, the SE for the parameter estimate decreases.
Least Squares Criterion
If the εi are normally distributed, both the median and the mean are unbiased and consistent estimators.
The variance of the sampling distribution for the mean is:
σ² / N
The variance of the sampling distribution for the median is:
πσ² / 2N
Therefore the mean is the more efficient parameter estimate. For this reason, we tend to prefer to estimate our models by minimizing the sum of squared errors.
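You can verify this efficiency claim by simulation. A minimal sketch (the sample size and replication count here are arbitrary choices, not from the slides):

# Compare sampling variances of the mean and median for normal data
set.seed(1)
N = 50
means = replicate(10000, mean(rnorm(N)))
medians = replicate(10000, median(rnorm(N)))
var(means)    # close to sigma^2/N = .020
var(medians)  # close to pi*sigma^2/(2N) = .031; wider, so less efficient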
Fear-potentiated startle during Threat of Shock
> setwd("C:/Users/LocalUser/Desktop/GLM")
> d = lm.readDat('4_SingleQuantitative_FPS.dat')
> str(d)
'data.frame':	96 obs. of 2 variables:
 $ BAC: num  0 0 0 0 0 0 0 0 0 0 ...
 $ FPS: num  -98.098 -22.529 0.463 1.194 2.728 ...
> head(d)
     BAC         FPS
0125   0 -98.0977778
0013   0 -22.5285000
0113   0   0.4632944
0116   0   1.1943667
0111   0   2.7280444
0014   0   6.7237833
> some(d)
        BAC        FPS
0111 0.0000   2.728044
1121 0.0235  43.901667
1126 0.0395  14.181344
1113 0.0495  53.176722
1124 0.0580  11.859050
1112 0.0605  45.181778
2112 0.0730 162.736611
2016 0.0750  30.453111
2023 0.0925  19.598722
3112 0.1085  14.603611
Descriptives and Univariate Plots
> lm.describeData(d)
    var  n  mean    sd median   min    max  skew kurtosis
BAC   1 96  0.06  0.04   0.06   0.0   0.14 -0.09    -1.09
FPS   2 96 32.19 37.54  19.46 -98.1 162.74  0.62     1.93
> windows()  # on Mac, use quartz()
> par('cex' = 1.5, 'lwd' = 2)
> hist(d$FPS)
FPS Experiment: The Inference Details
Goal: Determine if our shock threat procedure is effective at potentiating startle (increasing startle during threat relative to safe).
Create a simple model of FPS scores in the population:
FPŜi = β0
Collect a sample of N = 96 to estimate β0.
Calculate the sample parameter estimate (b0) that minimizes SSE in the sample.
Use b0 to test hypotheses:
H0: β0 = 0
Ha: β0 ≠ 0
Estimating a one parameter model in R
> m = lm(FPS ~ 1, data = d)
> summary(m)

Call:
lm(formula = FPS ~ 1, data = d)

Residuals:
    Min      1Q  Median      3Q     Max 
-130.29  -25.40  -12.73   18.27  130.55 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   32.191      3.832   8.402 4.26e-13 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 37.54 on 95 degrees of freedom
Errors/Residuals
> summary(m)
...
Residuals:
    Min      1Q  Median      3Q     Max 
-130.29  -25.40  -12.73   18.27  130.55 
...
The "Residuals" block is simple descriptive information about the errors (ei):
ei = Yi – Ŷi
Errors/Residuals
ei = Yi – Ŷi
R can report the error for each individual in the sample:
> residuals(m)
        0125         0013         0011         0021         0023         0026 
-130.2886183  -54.7193405  -12.6999127   -2.7175738   19.6643817   35.0963817 
... (output truncated; one residual for each of the 96 participants)
You can get the SSE easily:
> sum(residuals(m)^2)
[1] 133888.3
Standard Error of Estimate
> summary(m)
...
Residual standard error: 37.54 on 95 degrees of freedom

This is the standard error of estimate. It is an estimate of the standard deviation of εi:
√(SSE / (N – P)) = √(Σ(Yi – Ŷi)² / (N – P))
NOTE: for the mean-only model, this is sY.
Coefficients (Parameter Estimates)
> summary(m)
...
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   32.191      3.832   8.402 4.26e-13 ***
...
This is b0, the unbiased sample estimate of β0, and its standard error. It is also called the intercept in regression (more on this later).
Ŷi = b0
Ŷi = 32.2
> coef(m)
(Intercept) 
   32.19084 
Predicted Values
Ŷi = 32.19
You can get the predicted value for each individual in the sample using this model:
> fitted.values(m)
    0125     0013     0113     0116     0111     0014 
32.19084 32.19084 32.19084 32.19084 32.19084 32.19084 
... (output truncated; the same predicted value, 32.19084, for all 96 participants)
Testing Inferences about β0
> summary(m)
...
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   32.191      3.832   8.402 4.26e-13 ***
...
This is the t-statistic to test the H0 that β0 = 0.
The probability (p-value) of obtaining a sample b0 = 32.2 if H0 is true (β0 = 0) is < .0001.
Describe the logic of how this was determined.
Sampling Distribution: Testing Inferences about β0
H0: β0 = 0
Ha: β0 ≠ 0
If H0 is true, the sampling distribution for b0 will have a mean of 0. We can estimate the standard deviation of the sampling distribution with the SE for b0.
t (df = N – P) = (b0 – 0) / SEb0 = (32.2 – 0) / 3.8 = 8.40
b0 is approximately 8 standard deviations above the expected mean of the distribution if H0 is true.
> pt(8.40, 95, lower.tail = FALSE) * 2
[1] 4.293253e-13
The probability of obtaining a sample b0 = 32.2 if H0 is true is very low (< .05). Therefore we reject H0 and conclude that β0 ≠ 0 and that b0 is our best (unbiased) estimate of it.
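The same t and p can be reconstructed by hand from the fitted model. A sketch:

# Recompute the t-statistic and two-tailed p-value from model components
b0 = coef(m)[1]
SEb0 = summary(m)$coefficients[1, 'Std. Error']
t = (b0 - 0) / SEb0                           # 8.402
2 * pt(abs(t), df = 95, lower.tail = FALSE)   # 4.26e-13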
Statistical Inference and Model Comparisons
Statistical inference about parameters is fundamentally about model comparisons.
You are implicitly (t-test of a parameter estimate) or explicitly (F-test of a model comparison) comparing two different models of your data.
We follow Judd et al. and call these two models the compact model and the augmented model.
The compact model represents reality as the null hypothesis predicts. The augmented model represents reality as the alternative hypothesis predicts.
The compact model is simpler (fewer parameters) than the augmented model. It is also nested in the augmented model (i.e., its parameters are a subset of the augmented model's parameters).
Model Comparisons: Testing Inferences about β0
FPŜi = β0
H0: β0 = 0
Ha: β0 ≠ 0
Compact model: FPŜi = 0
Augmented model: FPŜi = β0 (b0)
We estimate 0 parameters (P = 0) in the compact model.
We estimate 1 parameter (P = 1) in the augmented model.
Choosing between these two models is equivalent to testing whether β0 = 0, as you did with the t-test.
Model Comparisons: Testing Inferences about β0
Compact model: FPŜi = 0
Augmented model: FPŜi = β0 (b0)
We can compare (and choose between) these two models by comparing their total error (SSE) in our sample:
SSE = Σ(Yi – Ŷi)²
SSE(C) = Σ(Yi – 0)²
> sum((d$FPS - 0)^2)
[1] 233368.3
SSE(A) = Σ(Yi – Ŷi)² = Σ(Yi – 32.19)²
> sum((d$FPS - coef(m)[1])^2)  # same as sum(residuals(m)^2)
[1] 133888.3
Model Comparisons: Testing Inferences about β0
Compact model: FPŜi = 0; SSE = 233,368.3; P = 0
Augmented model: FPŜi = β0 (b0); SSE = 133,888.3; P = 1
F (PA – PC, N – PA) = [(SSE(C) – SSE(A)) / (PA – PC)] / [SSE(A) / (N – PA)]
F (1 – 0, 96 – 1) = [(233368.3 – 133888.3) / (1 – 0)] / [133888.3 / (96 – 1)]
F(1, 95) = 70.59, p < .0001
> pf(70.58573, 1, 95, lower.tail = FALSE)
[1] 4.261256e-13
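anova() can run the same comparison. One wrinkle: the compact model here estimates no parameters, so this sketch fits it as a zero-predictor model (a setup choice of ours, not from the slides):

# Compact model predicts 0 for everyone; augmented model estimates beta0
mC = lm(FPS ~ 0, data = d)   # residuals are the raw FPS scores
mA = lm(FPS ~ 1, data = d)   # same as m above
anova(mC, mA)                # F(1, 95) = 70.59, p = 4.26e-13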
Effect Sizes
Your parameter estimates are descriptive. They describe effects in the original units of the IVs and DV. Report them in your paper.
There are many other effect size estimates available. You will learn two that we prefer:
Partial eta² (ηp²): Judd et al. call this PRE (proportional reduction in error).
Eta² (η²): This is also commonly referred to as R² in regression.
Sampling Distribution vs. Model Comparison
The two approaches to testing H0 about parameters (β0, βj) are statistically equivalent.
They are complementary approaches with respect to conceptual understanding of GLMs.
Sampling distribution:
• Focus on population parameters and their estimates
• Tight connection to sampling and probability distributions
• Understanding of SE (sampling error/power; confidence intervals; graphic displays)
Model comparison:
• Focus on the models themselves
• Highlights model fit (SSE) and model parsimony (P)
• Clearer link to PRE (ηp²)
• Tests comparisons that differ by > 1 parameter (discouraged)
Partial Eta² or PRE
Compact model: FPŜi = 0; SSE = 233,368.3; P = 0
Augmented model: FPŜi = β0 (b0); SSE = 133,888.3; P = 1
How much was the error reduced in the augmented model relative to the compact model?
(SSE(C) – SSE(A)) / SSE(C) = (233,368.3 – 133,888.3) / 233,368.3 = .426
Our more complex model that includes β0 reduces prediction error (SSE) by approximately 43%. Not bad!
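In R, this PRE can be computed directly from the two SSEs. A sketch:

# PRE (partial eta-squared) for beta0 from the model comparison above
sseC = sum((d$FPS - 0)^2)    # compact model: predict 0 for everyone
sseA = sum(residuals(m)^2)   # augmented model: predict b0 for everyone
(sseC - sseA) / sseC         # 0.426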
Confidence Interval for b0
A confidence interval (CI) is an interval for a parameter estimate in which you can be fairly confident that you will capture the true population parameter (in this case, β0). Most commonly reported is the 95% CI. Across repeated samples, 95% of the calculated CIs will include the population parameter.
> confint(m)
               2.5 %   97.5 %
(Intercept) 24.58426 39.79742
Given what you now know about confidence intervals and sampling distributions, what should the formula be?
CI (b0) = b0 ± t (α; N – P) * SEb0
For the 95% confidence interval, this is approximately ± 2 SEs around our unbiased estimate of β0.
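The confint() result can be reproduced from this formula. A sketch:

# 95% CI for b0 by hand, matching confint(m)
b0 = coef(m)[1]
SEb0 = summary(m)$coefficients[1, 'Std. Error']
b0 + c(-1, 1) * qt(.975, df = 95) * SEb0   # 24.58426 39.79742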
Confidence Interval for b0
How can we tell if a parameter is "significant" from the confidence interval?
If a parameter ≠ 0 at α = .05, then the 95% confidence interval for its parameter estimate should not include 0.
The same logic applies for testing whether the parameter equals any other non-zero value: the CI should not include that value.
The One Parameter (Mean-Only) Model: Special Case
What special case (specific analytic test) is statistically equivalent to the test of the null hypothesis β0 = 0 in the one parameter model?
The one-sample t-test of whether a population mean = 0.
> t.test(d$FPS)

	One Sample t-test

data:  d$FPS
t = 8.4015, df = 95, p-value = 4.261e-13
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 24.58426 39.79742
sample estimates:
mean of x 
 32.19084 
Testing β0 = Non-Zero Values
How could you test an H0 that β0 = some value other than 0 (e.g., 10)? HINT: There are at least three methods.
Option 1: Compare SSE for the augmented model (Ŷi = β0) to SSE from a different compact model for this new H0 (Ŷi = 10).
Option 2: Recalculate the t-statistic using this new H0:
t = (b0 – 10) / SEb0
Option 3: Does the confidence interval for the parameter estimate contain this other value? (No p-value provided.)
> confint(m)
               2.5 %   97.5 %
(Intercept) 24.58426 39.79742
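Options 2 and 3 are easy to run in R. A sketch using the example value of 10 from above:

# Option 2: recompute the t-statistic against H0: beta0 = 10
b0 = coef(m)[1]
SEb0 = summary(m)$coefficients[1, 'Std. Error']
t10 = (b0 - 10) / SEb0
2 * pt(abs(t10), df = 95, lower.tail = FALSE)
# Equivalently, shift the null in a one-sample t-test:
t.test(d$FPS, mu = 10)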
Intermission…
One parameter (β0) "mean-only" model:
• Description: b0 describes the mean of Y
• Prediction: b0 is the predicted value that minimizes sample SSE
• Inference: Use b0 to test if β0 = 0 (default) or any other value. One-sample t-test.
Two parameter (β0, β1) model:
• Description: b1 describes how Y changes as a function of X1. b0 describes the expected value of Y at a specific value (0) of X1.
• Prediction: b0 and b1 yield predicted values that vary by X1 and minimize SSE in the sample.
• Inference: Test if β1 = 0. Pearson's r; independent-samples t-test. Test if β0 = 0. Analogous to a one-sample t-test controlling for X1, if X1 is mean-centered. Very flexible!
Two Parameter (One Predictor) models
We started with a very simple model of FPS:
FPŜi = β0
What if some participants were drunk and we knew their blood alcohol concentrations (BAC)? Would this help? What would the model look like? What question(s) does this model allow us to test? Think about it…
The Two Parameter Model
DATA = MODEL + ERROR
Yi = β0 + β1 * X1 + εi
Ŷi = β0 + β1 * X1
εi = Yi – Ŷi
FPŜi = β0 + β1 * BACi
The Two Parameter Model
Ŷi = β0 + β1 * X1
As before, the population parameters in the model (β0, β1) are estimated by b0 and b1, calculated from sample data based on the least squares criterion such that they minimize SSE in the sample data.
Sample model:
Ŷi = b0 + b1 * X1
To derive these parameter estimates you must solve a series of simultaneous equations using linear algebra and matrices (see supplemental reading). Or use R!
Least Squares Criterion
ei = Yi – Ŷi
SSE = Σei²
Interpretation of b0 in Two Parameter Model
Ŷi = b0 + b1 * X1
b0 is the predicted value of Y when X1 = 0. Graphically, this is the Y-intercept of the regression line (the value of Y where the regression line crosses the Y-axis at X1 = 0).
Approximately what is b0 in this example? About 42.5.
Interpretation of b0 in Two Parameter Model
IMPORTANT: Notice that b0 is very different in the two parameter model (42.5) than in the previous one parameter model (32.2). WHY?
In the one parameter model, b0 was our sample estimate of the mean FPS score for everyone. In the two parameter model, b0 is our sample estimate of the mean FPS score for people with BAC = 0, not everyone.
Interpretation of b1 in Two Parameter Model
Ŷi = b0 + b1 * X1
b1 is the predicted change in Y for every one-unit change in X1. Graphically, it is represented by the slope of the regression line. If you understand the units of your predictor and DV, this is an attractive description of their relationship.
FPŜi = 42.5 + -184.1 * BACi
For every 1% increase in BAC, FPS decreases by 184.1 microvolts.
For every .01% increase in BAC, FPS decreases by 1.841 microvolts.
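The units of b1 are up to you: rescaling the predictor rescales the slope without changing the test. A sketch (BAC100 is a made-up variable name):

# Express BAC in units of .01 so the slope is the change per .01 BAC
d$BAC100 = d$BAC * 100
m2b = lm(FPS ~ BAC100, data = d)
coef(m2b)['BAC100']   # -1.841, i.e., -184.1 / 100; t and p are unchanged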
Testing Inferences about β1
Does alcohol affect people's anxiety?
FPŜi = β0 + β1 * BACi
What are your null and alternative hypotheses about the model parameter that evaluate this question?
H0: β1 = 0
Ha: β1 ≠ 0
If β1 = 0, FPS does not change with changes in BAC; in other words, there is no effect of BAC on FPS. If β1 < 0, FPS decreases with increasing BAC (people are less anxious when drunk). If β1 > 0, FPS increases with increasing BAC (i.e., people are more anxious when drunk).
Estimating a Two Parameter Model in R
> m2 = lm(FPS ~ BAC, data = d)
> summary(m2)

Call:
lm(formula = FPS ~ BAC, data = d)

Residuals:
     Min       1Q   Median       3Q      Max 
-140.555  -21.565   -8.289   15.638  133.718 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   42.457      6.548   6.484 4.11e-09 ***
BAC         -184.092     95.894  -1.920   0.0579 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 37.02 on 94 degrees of freedom
Multiple R-squared:  0.03773,  Adjusted R-squared:  0.02749 
F-statistic: 3.685 on 1 and 94 DF,  p-value: 0.05792
Testing Inferences about β1
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   42.457      6.548   6.484 4.11e-09 ***
BAC         -184.092     95.894  -1.920   0.0579 .  

Does BAC affect FPS? Explain this conclusion in terms of the parameter estimate b1 and its standard error.
Under the H0: β1 = 0, the sampling distribution for b1 will have a mean of 0 with an estimated standard deviation of 95.894.
t (df = 96 – 2 = 94) = (-184.092 – 0) / 95.894 = -1.92
This value of the parameter estimate, b1, is 1.92 standard deviations below the expected mean of the sampling distribution under H0.
> pt(-1.92, 94, lower.tail = TRUE) * 2
[1] 0.05788984
A b1 of this size is not unlikely under the null; therefore you fail to reject the null and cannot conclude that BAC affects FPS.
Testing Inferences about β1
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   42.457      6.548   6.484 4.11e-09 ***
BAC         -184.092     95.894  -1.920   0.0579 .  

H0: β1 = 0
Ha: β1 ≠ 0
One-tailed p-value:
> pt(-1.92, 94, lower.tail = TRUE)
[1] 0.02894492
Two-tailed p-value:
> pt(-1.92, 94, lower.tail = TRUE) * 2
[1] 0.05788984
Model Comparison: Testing Inferences about β1
H0: β1 = 0
Ha: β1 ≠ 0
What two models are you comparing when you test hypotheses about β1? Describe the logic.
Compact model: FPŜi = β0 + 0 * BACi; PC = 1; SSE(C) = 133888.3
Augmented model: FPŜi = β0 + β1 * BACi; PA = 2; SSE(A) = 128837.1
F (PA – PC, N – PA) = [(SSE(C) – SSE(A)) / (PA – PC)] / [SSE(A) / (N – PA)]
F(1, 94) = 3.685383, p = 0.05792374
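Because the compact model here (β1 fixed at 0) is just the mean-only model m, this comparison is a one-liner in R. A sketch:

# Compact (mean-only) vs. augmented (adds BAC): reproduces the F test above
anova(m, m2)   # F(1, 94) = 3.685, p = 0.05792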
Sum of Squared Errors
If there is a perfect relationship between X1 and Y in your sample, what will the SSE be in the two parameter model, and why?
SSE(A) = 0. All data points will fall perfectly on the regression line. All errors will be 0.
If there is no relationship at all between X1 and Y in your sample (b1 = 0), what will the SSE be in the two parameter model, and why?
SSE(A) = SSE of the mean-only model. X1 provides no additional information about the DV. Your best prediction will still be the mean of the DV.
Testing Inferences about β0
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   42.457      6.548   6.484 4.11e-09 ***
BAC         -184.092     95.894  -1.920   0.0579 .  

What is the interpretation of b0 in this two parameter model?
It is the predicted FPS for a person with BAC = 0 (sober). The test of this parameter estimate could inform us whether the shock procedure worked among our sober participants. This is probably a more appropriate manipulation check than testing whether it worked in everyone, including drunk people, given that alcohol could have reduced FPS.
What two models are being compared?
Compact model: FPŜi = 0 + β1 * BACi
Augmented model: FPŜi = β0 + β1 * BACi
Confidence Interval for bj or b0
You can provide confidence intervals for each parameter estimate in your model.
> confint(m2)
                  2.5 %     97.5 %
(Intercept)    29.45597  55.457721
BAC          -374.49261   6.308724
The underlying logic from your understanding of sampling distributions remains the same:
CI (b) = b ± t (α; N – P) * SEb, where P = total number of parameters
How can we tell if a parameter is "significant" from the confidence interval?
If a parameter ≠ 0 at α = .05, then the 95% confidence interval should not include 0. The same holds for testing b against any other value.
Partial Eta² or PRE for β1
How can you calculate the effect size estimate partial eta² (PRE) for β1?
Compare the SSE across the two relevant models:
Compact model: FPŜi = β0 + 0 * BACi; SSE(C) = 133888.3
Augmented model: FPŜi = β0 + β1 * BACi; SSE(A) = 128837.1
(SSE(C) – SSE(A)) / SSE(C) = (133888.3 – 128837.1) / 133888.3 = 0.038
Our augmented model that includes a non-zero effect for BAC reduces prediction error (SSE) by only 3.8% over the compact model that fixes this parameter at 0.
Partial Eta² or PRE for β0
How can you calculate the effect size estimate partial eta² (PRE) for β0?
Compare the SSE across the two relevant models:
Compact model: FPŜi = 0 + β1 * BACi; SSE(C) = 186462.4
Augmented model: FPŜi = β0 + β1 * BACi; SSE(A) = 128837.1
(SSE(C) – SSE(A)) / SSE(C) = (186462.4 – 128837.1) / 186462.4 = 0.309
Our augmented model that allows FPS to be non-zero for people with BAC = 0 (sober people) reduces prediction error (SSE) by 30.9% from the model that fixes FPS at 0 when BAC = 0!
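The compact model that fixes β0 at 0 while still estimating β1 can be fit as an intercept-free regression. A sketch:

# Compact model: intercept fixed at 0, slope for BAC still estimated
mC = lm(FPS ~ BAC - 1, data = d)   # '- 1' drops the intercept
sseC = sum(residuals(mC)^2)        # 186462.4, per the slide
sseA = sum(residuals(m2)^2)        # 128837.1
(sseC - sseA) / sseC               # 0.309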
Coefficient of Determination (R²)
Coefficient of determination (R²): the proportion of explained variance (i.e., the proportion of variance in Y accounted for by all Xs in the model).
DATA = MODEL + ERROR
For individuals:
Yi = Ŷi + ei
With respect to variances:
s²Y = s²Ŷ + s²e
R² = s²Ŷ / s²Y
> var(fitted.values(m2)) / var(d$FPS)
[1] 0.03772707
Coefficient of Determination (R²)
> summary(m2)
...
Multiple R-squared:  0.03773,  Adjusted R-squared:  0.02749 
F-statistic: 3.685 on 1 and 94 DF,  p-value: 0.05792
R² and the Mean-Only Model
Why did the mean-only model not have an R²?
It explained no variance in Yi because it predicted the same value (the mean) for every person. The variance of the predicted values is 0 in the mean-only model.
In fact, the SSE for the mean-only model is the numerator of the formula for the variance of Yi:
SSE = Σ(Yi – Ȳ)²
s² = Σ(Yi – Ȳ)² / (N – 1)
R² and the Mean-Only Model
The mean-only model is used in an alternative conceptualization of R² for any augmented model:
R² = (SSE(Mean-only) – SSE(A)) / SSE(Mean-only)
Mean-only model: FPŜi = β0; SSE(Mean-only) = 133888.3
Augmented model: FPŜi = β0 + β1 * BACi; SSE(A) = 128837.1
R² = (133888.3 – 128837.1) / 133888.3 = 0.03773
In this augmented model, R² is fully accounted for by BAC. In more complex models, R² will be the aggregate of multiple predictors.
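A sketch confirming this in R:

# R^2 as proportional reduction in error relative to the mean-only model
sseMean = sum(residuals(m)^2)    # mean-only model SSE
sseA = sum(residuals(m2)^2)      # SSE with BAC in the model
(sseMean - sseA) / sseMean       # 0.03773
summary(m2)$r.squared            # same value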
Test of β1 in Two Parameter Model: Special Case
When both the predictor variable and the dependent variable are quantitative, the test of β1 = 0 is statistically equivalent to what other common statistical test?
The test of Pearson's correlation coefficient, r.
> corr.test(d$BAC, d$FPS)
Correlation matrix 
       [,1]
[1,] -0.194
Sample Size 
     [,1]
[1,]   96
Probability values adjusted for multiple tests.
      [,1]
[1,] 0.058
Furthermore, r² = R² for this model only:
(-0.194)² = .038
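Note that corr.test() comes from the psych package; base R's cor.test() gives the same t and p as the test of b1. A sketch:

# Base-R equivalent of the slope test for one quantitative predictor
cor.test(d$BAC, d$FPS)   # r = -.194, t(94) = -1.92, p = .058
cor(d$BAC, d$FPS)^2      # .038, equal to this model's R^2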
Visualizing the Model
library(effects)  # effect() comes from the effects package
e = effect('BAC', m2)
plot(e)
Displaying Model Results
Error bars/bands
You are predicting the mean Y for any X. There is a sampling distribution around this mean. The true population mean Y for any X is uncertain. You can display this uncertainty by displaying information about the sampling distribution at any/every X. This is equivalent to error bars in ANOVA.
Error Band for Ŷi
plot(d$BAC, d$FPS, xlim = c(0, .15), xlab = 'Blood Alcohol Concentration',
     ylab = 'Fear-potentiated startle')
dNew = data.frame(BAC = seq(0, .14, .0001))
pY = lm.pointEstimates(m2, dNew)
lines(dNew$BAC, pY[,1], col = 'red', lwd = 2)
lines(dNew$BAC, pY[,2], col = 'gray', lwd = .5)
lines(dNew$BAC, pY[,3], col = 'gray', lwd = .5)
effect() displays the 95% CI. However, I prefer ± 1 SE.
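lm.pointEstimates() is a course helper function; the same bands can be drawn with base R's predict(). A sketch:

# Base-R version of the error band using a 95% confidence interval
dNew = data.frame(BAC = seq(0, .14, .0001))
ci = predict(m2, newdata = dNew, interval = 'confidence')  # fit, lwr, upr
lines(dNew$BAC, ci[, 'lwr'], col = 'gray', lwd = .5)
lines(dNew$BAC, ci[, 'upr'], col = 'gray', lwd = .5)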
Error Band for Ŷi
Why are the error bands not linear?
Model predictions are better (less error) near the center of your data (X̄). The regression line will always go through the mean of X and Y. Small changes in b1 across samples will produce bigger variation in Ŷi at the edges of the model (far from the mean of X).
FPŜi = 42.5 + -184.1 * BACi
> confint(m2)
                  2.5 %     97.5 %
(Intercept)    29.45597  55.457721
BAC          -374.49261   6.308724
Error Band for Ŷi: Compare to the SE for b0
b0 is simply the predicted value for Y when X = 0. We can use additive transformations of X to make tests of the predicted value at X = 0. This is most common in repeated measures designs but is used elsewhere as well.
Publication Quality Figure