APSTAT - Unit 4b - Woodside Priory School
APSTAT - Unit 4b
Inference Continued
Chapter 23
Inference about means
One Sample Z-Test for Mean
One Sample T-Test for Mean
One Sample T-Interval for Mean
One Sample Z - Test for Mean
Example. The 2005 SAT:
Mean 1080, Standard Deviation 180
Priory Students n = 32, mean 1200
Two Possibilities (Just like for Proportions…)
Higher WPS scores just happened by chance
(natural variation of a sample)
The likelihood of 32 students averaging 1200 is so
remote that we must conclude that Priory students
are better at the SAT than the national average.
Let’s Do It! WPS SAT
Example
Step 1 Define Parameter:
μ, the true mean score of WPS SAT test-takers
Step 2 Hypotheses
H0: μ = 1080, Priory students perform at the
same level as the National Average
Ha: μ > 1080, Priory students perform better than
the National Average
WPS SAT Example Continued
Step 3 Assumptions:
SRS – Assume/not stated
Normal Distribution – n>30
σ is known – Yep!
Step 4 Name It, Show it, DO IT
One Sample Z-Test for a Mean
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{1200 - 1080}{180/\sqrt{32}} = 3.77$
WPS SAT Example Continued
Step 5 P-value and sketch of normal curve:
P(z>3.77)= 0.0000816
[Sketch: normal curve centered at 1080 with the area to the right of z = 3.77 shaded, P = 0.000082]
Step 6 Interpret P-value and Conclusion
A P-Value of .0000816 indicates that, if H0 were true, there
is less than a 1 in 10,000 chance of getting a sample mean this
far above μ merely by chance. Therefore, reject H0 in
favor of Ha. There is strong evidence that WPS students performed
better on average than the National Average on the
2005 SAT.
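The arithmetic in Steps 4 and 5 can be double-checked away from the calculator. This is a minimal sketch, assuming Python's standard library (`statistics.NormalDist`); the variable names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Values from the WPS SAT example
mu0, sigma = 1080, 180   # national mean and standard deviation (sigma known)
n, xbar = 32, 1200       # Priory sample size and sample mean

se = sigma / sqrt(n)                 # standard deviation of the sample mean
z = (xbar - mu0) / se                # test statistic
p_value = 1 - NormalDist().cdf(z)    # one-sided test: P(Z > z)

print(f"z = {z:.2f}, P-value = {p_value:.7f}")
```

Because z is not rounded to 3.77 before the tail area is looked up, the P-value may differ from the slide in the last digit or two.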
Two-Sided Example
Mr. R is convinced that his blood-sugar
count is not normal. He tests himself
35 times and finds his mean count to be 90. If
the healthy mean for blood sugar count
is 84 with a standard deviation of 23, is
Mr. R’s blood sugar count abnormal?
Let’s Do It! Blood Sugar
Example
Step 1 Define Parameter:
μ, Riebhoff’s true mean blood sugar level
Step 2 Hypotheses
H0: μ = 84, Boff’s blood sugar level is normal
Ha: μ ≠ 84, Boff’s blood sugar level is abnormal
Blood Sugar Example Continued
Step 3 Assumptions:
SRS – Assume/not stated
Normal Distribution – n>30
σ is known – Yep!
Step 4 Name Test and DO IT
One Sample Z-Test for a Mean
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{90 - 84}{23/\sqrt{35}} = 1.54$
Blood Sugar Example Continued
Step 5 P-value and sketch of normal curve:
2·P(z > 1.54) = 2(.06178) = 0.1236
[Sketch: normal curve centered at μ with both tails shaded, beyond z = -1.54 and z = +1.54]
Step 6 Interpret P-value and Conclusion
A P-Value of .1236 indicates that, if H0 were true, a sample mean this
different from μ would occur about once in
every eight samples of this size simply by chance.
Therefore we fail to reject H0. There is not enough
evidence that Mr. R’s mean blood sugar level is different from the
healthy mean of 84.
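The only change for a two-sided test is doubling the tail area. A small sketch, assuming Python's standard library; `two_sided_z_test` is a hypothetical helper name, not something from the course materials.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_z_test(xbar, mu0, sigma, n):
    """Return (z, P-value) for H0: mu = mu0 vs Ha: mu != mu0, sigma known."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    one_tail = 1 - NormalDist().cdf(abs(z))   # area beyond |z| in ONE tail
    return z, 2 * one_tail                    # double it for two sides

# Blood sugar example: 35 readings, mean 90, healthy mean 84, sigma 23
z, p_value = two_sided_z_test(xbar=90, mu0=84, sigma=23, n=35)
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

The slide rounds z to 1.54 before looking up the tail, so the unrounded P-value here comes out very slightly smaller than 0.1236.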
Confidence and
Significance…
Hmm, think about a significance level
of α = .05
Now think about a confidence interval
of C = .95
[Sketch: standard normal curve with central area P = 0.95 between z = -1.96 and z = 1.96 (or -2 and 2 by the empirical rule), and P = 0.025 in each tail]
Apples Problem, CI Style
Step 1 Define Parameter
μ,
Step 2 Hypotheses
H0:
Ha:
Step 3 Assumptions
Recall: x̄ = 122.5, n = 49, μ = 120, σ = 12
Apples Problem, CI Style
Step 4 Name Test and DO IT
Confidence Interval for Population Mean
$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}} = 122.5 \pm 1.96 \cdot \frac{12}{\sqrt{49}} = 122.5 \pm 3.36$
Apples Problem, CI Style
Step 5 Sketch of Interval:
[Sketch: interval from 119.14 to 125.86 centered at 122.5, with μ = 120 marked inside it]
Step 6 Interpret Interval and Conclusion
We are 95% confident that the true population
mean falls between 119.14 and 125.86.
Because the population mean of 120 IS
included in this interval, we must fail to reject
H0. The mean weight of this shipment cannot
be considered abnormal.
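The link between a 95% interval and a two-sided test at α = .05 can be sketched in a few lines. This assumes Python's standard library; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Apples problem
xbar, sigma, n = 122.5, 12, 49
mu0 = 120            # hypothesized population mean
conf_level = 0.95

# z* leaves (1 - C)/2 in each tail; about 1.96 for C = .95
z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
margin = z_star * sigma / sqrt(n)
low, high = xbar - margin, xbar + margin

# CI-style decision: fail to reject H0 when mu0 lands inside the interval
fail_to_reject = low <= mu0 <= high
print(f"({low:.2f}, {high:.2f}), fail to reject H0: {fail_to_reject}")
```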
Inference as a decision
Example:
I am the Major League Baseball equipment
checker dude.
In each shipment of baseballs, I select 36 balls
at random to make sure they comply with MLB
Standards.
The rules state that the balls are to be 5.2 oz
each
Over the course of a great number of years, the
standard deviation of weights (of accepted
balls) has been 0.15 (we will accept this as σ)
Here is the start of my test
μ: the true mean weight of the balls in the
shipment
Hypothesis
H0: μ = 5.2, Shipment mean is 5.2
Ha: μ ≠ 5.2, Shipment mean is not 5.2
My decision will be to accept or reject the
shipment. If I reject, the balls will be sent
back to the manufacturer. I will use an α
of 0.05
Let’s Continue
My ball sample had a mean of 5.15. Do
I accept or reject the shipment?
Assumptions:
SRS – Stated
Normal Distribution – 36 sample size should give a normal-ish distribution
σ is known – Yep!
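More
Step 4 Name Test and DO IT
One Sample Z-Test for a Mean
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{5.15 - 5.2}{0.15/\sqrt{36}} = -2.00$
More More
Step 5 P-value and sketch of normal curve:
2·P(z < -2.00) = 2(.02275) = 0.0455
[Sketch: normal curve with the "Accept" region between z = -1.96 and z = 1.96; the observed z-score of -2.00 falls just outside it, in the lower tail]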
Step 6 Interpret P-value and Conclusion
We will reject this shipment of balls since .0455<.05.
There is enough evidence that this shipment of balls is
not acceptable by MLB standards
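When inference is used as a decision, the rule can be stated up front: reject whenever |z| is beyond the critical value for the chosen α. A minimal sketch of that rule for the baseball shipment, assuming Python's standard library (names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05
mu0, sigma = 5.2, 0.15     # MLB standard weight (oz) and known sigma
n, xbar = 36, 5.15         # sample size and sample mean

z = (xbar - mu0) / (sigma / sqrt(n))              # negative: sample is light
z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # about 1.96 for alpha = .05

reject_shipment = abs(z) > z_crit                 # decision rule
p_value = 2 * NormalDist().cdf(-abs(z))           # two-sided P-value

print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject: {reject_shipment}")
print(f"P-value = {p_value:.4f}")
```

Both routes give the same decision: |z| > z* exactly when the two-sided P-value is below α.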
T-Procedures – No σ given
In most situations, knowing μ and σ is
not likely.
We must find a way to estimate σ
What we do have:
x-bar: Our unbiased estimate of μ
s: the standard deviation from our
sample. Our estimate of σ
So……
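Before: $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
Now: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Like z, with a twist. You’ll see
The denominator $s/\sqrt{n}$ is the Standard Error (SE) of the sample mean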
What is this t thang, baby?
Like z,
Shows how far x-bar is from μ in standard error units.
Bell Shaped
BUT, unlike z
Curve changes as sample size changes
If n is small, more probability rests in tail of t-distribution curve
As n increases, becomes more like z-distribution
(standard normal)
T-distribution curve
[Figure: the standard normal curve overlaid with t curves for n = 3 and n = 9, all centered at μ]
UH OH!!!
If curve changes as n changes, we will need
a crazy chart
To use it we need something new:
DEGREES OF FREEDOM (k)
k = n – 1
We describe the t-distribution curve with the
following:
t(k) – means a t-distribution with k degrees of
freedom
Let’s do it
Last page in book or Table B on AP
handout
Critical Value t*
EX: n = 10, P = .025 to right of t*
Sketch it!
P=.025
df=
m
t*=
Let’s do it again!
Find Critical Value t*
EX: n = 18, P = .98 to left of t*
P=.98
Sketch it!
df=
m
t*=
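Table lookups like these two can be checked numerically. The sketch below assumes only Python's standard library, so it builds the t curve from its density formula and integrates it with Simpson's rule; `t_pdf`, `t_cdf` and `t_star` are hypothetical helper names, and this is an approximation, not how the table or the TI actually computes values.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t): Simpson's rule on [0, |t|] plus symmetry (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_star(left_area, df):
    """Critical value with the given area to its LEFT, found by bisection."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < left_area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t1 = t_star(1 - 0.025, df=10 - 1)   # n = 10, P = .025 to the right of t*
t2 = t_star(0.98, df=18 - 1)        # n = 18, P = .98 to the left of t*
print(f"t*(df=9) = {t1:.3f}, t*(df=17) = {t2:.3f}")
```

Note that the second lookup has .98 to the LEFT, so on the table it is found under a right-tail probability of .02.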
Assumptions
SRS
Normal population or large n
Our Book’s Rule Of Thumb
n<15, ok if pretty close to normal
n>15, ok if no outliers or super skew
n>40, all good in my ‘hood
If outliers, you may eliminate them
Other texts
n>30 if non-normal population (but always check for outliers
and skew if given raw data)
σ is unknown
Significance test.
Does body fat increase after 1 wk of
McDonalds-only eating?
Sample changes in body fat % from a
random sample of 8 people:
1.3, 0.8, 2.1, 1.6, -0.2, 1.5, -1.0, 2.0
Put in List on TI, you’ll see why.
Find: n=   df=   xbar=   s=
Test is on!
PARAMETER μ: __________________________
HYPOTHESES H0: μ = 0, _____________________
Ha: μ > 0, _____________________
More…
ASSUMPTIONS
Sketch graph!
SRS
Normalness
σ is not given
TEST – One sample t-test for mean
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{1.0125 - 0}{1.0947/\sqrt{8}} = 2.616$
More….
P(VALUE)
P(t>2.616)=
Look on t-dist table
df=7, t=2.616
Find what probability this
is between
***NOTE - T-distribution table
reverse of normal distribution table
Probability on the OUTSIDE, t-score on the
INSIDE
TI version of t-distribution
table
Z test we would do normalcdf(blah)
T test we do:
Tcdf(low, high, df)
That is it.
More…….
Interpretation
A p-value between .01 and .02 would indicate
that a sample mean this large would occur in roughly
1 in 100 to 1 in 50 samples of this size simply
by chance if H0 were true. We will therefore
reject H0 in favor of Ha. There is evidence that eating
only fast food is a factor in increasing
a person’s body fat percentage.
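The whole body-fat test can be reproduced from the eight data values. A sketch assuming Python's standard library only; `t_cdf` is a hypothetical helper that approximates the t curve's area by Simpson's rule, standing in for the table or the TI's tcdf.

```python
import math
from statistics import mean, stdev

def t_cdf(t, df, steps=2000):
    """P(T <= t) for a t(df) curve: Simpson's rule on [0, |t|] plus symmetry."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    h = abs(t) / steps                      # steps must be even
    total = pdf(0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

changes = [1.3, 0.8, 2.1, 1.6, -0.2, 1.5, -1.0, 2.0]   # body fat % changes
n = len(changes)
df = n - 1
xbar = mean(changes)
s = stdev(changes)                       # sample standard deviation (n - 1)

t = (xbar - 0) / (s / math.sqrt(n))      # H0: mu = 0
p_value = 1 - t_cdf(t, df)               # Ha: mu > 0, so right tail only

print(f"n={n}, df={df}, xbar={xbar:.4f}, s={s:.4f}, t={t:.3f}, P={p_value:.4f}")
```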
Now to make you crazy and
angry
Remember, to do a test correctly you
must throw down all the PHAT PI
action. There is no shortcut in what
you need to show.
Now,
hit STAT>TESTS>T-TEST
Input:Data, μ0=0, List:L1, >, calculate
OH MY!!!!!!
Mas TI LOVE!
Go back to STAT>TESTS>T-TEST
You can also enter stats, instead of
drawing on data from a list
Boo ya!
Confidence Intervals
$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$
Based on t(k)
Let’s hop right in
We do a random sample of the length
(in inches) of 8 senior male feet and
find the following:
N=8 x-bar=12 s=2.4
Construct a 95% CI
Do Assumptions (no need to do a
Hypothesis test, why?)
Senior Feet
Test – One sample confidence interval for
mean of a population
$\bar{x} \pm t^* \frac{s}{\sqrt{n}} = 12 \pm (2.365)\frac{2.4}{\sqrt{8}} = 12 \pm 2.01$
(9.99, 14.01)
t* is from table, C=.95, df=7
Senior Feet…
Interpret CI
We are 95% confident that the true
mean of WPS senior foot size is
between 9.99 and 14.01 inches.
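The interval arithmetic, as a quick check. This sketch takes t* = 2.365 straight from the slide (table value for C = .95, df = 7) rather than computing it; variable names are illustrative.

```python
from math import sqrt

n, xbar, s = 8, 12, 2.4     # senior feet sample
t_star = 2.365              # from the t table: C = .95, df = n - 1 = 7

se = s / sqrt(n)            # standard error of the sample mean
margin = t_star * se
low, high = xbar - margin, xbar + margin

print(f"12 ± {margin:.2f} -> ({low:.2f}, {high:.2f})")
```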
Chapter 24
Comparing Means
Two Sample Z-Test for Mean
Two Sample T-Test for Mean
Two Sample T-Interval for Mean
Comparing two means
Two sample problems
Compare characteristics of two
populations
Separate samples
Random Sample from Population 1
COMPARE
Random Sample from Population 2
Comparing two means
Another example:
Group 1, Treatment A
COMPARE
Group 2, Treatment B
***Matched pairs is different, not a 2 sample problem…
Assumptions for
two sample t-test
2 SRS from different and independent
populations
Normal Population or large enough
sample size
Typically n1 + n2 >40
σ and μ are unknown
Two Sample Z Statistic
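ONE SAMPLE: $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
TWO SAMPLE (looking at difference of means): $z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
Remember, we can’t add standard deviations… (so we add variances)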
Two Sample T Statistic
ONE SAMPLE: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
TWO SAMPLE: $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
Important stuff
A Two-Sample t statistic does NOT have a
t distribution
We are replacing 2 standard deviations with 2
standard errors
It’s ok though we can use it if:
We change our degrees of freedom a bit
1 way – Big Ugly Formula (P.468)
Your TI does this
Easier Way – Just use smaller of n1-1 or n2-1
Two Sample t* Confidence
Interval
$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
•If you use CI as part of a test of significance:
•If μ0 is included in the interval, fail to reject H0
•If μ0 is NOT included in the interval, reject H0 in favor of Ha
Testing Hypotheses
In a Two Sample Test:
Null Hypothesis
H0 : μ1=μ2 OR μ1- μ2 = 0
Basically, there is NO difference between the
means
Alternative Hypothesis
Ha : μ1> (or < or ≠) μ2
OR μ1- μ2 > (or < or ≠) 0
Basically, there is a difference between the means
or one is greater than the other
Pooled
Use Pooled Formula if data have exactly the
same variance: $\sigma_1^2 = \sigma_2^2$
Honestly, since a pooled t-test is so sensitive to
slight differences in variances, JUST USE
REGULAR TWO-SAMPLE T-TEST.
Do girls take more AP classes?
An SRS is taken, here’s the data:
        Boys    Girls
n       29      25
x̄       2.9     3.2
s       1.1     .9
P
H
A
Girls in AP classes
T
P
I
Do it with A 95% Confidence
Interval
P – Same
H – Same
A – Same
T – 2-sample 95% t* Confidence Interval
P – Don’t get one
I–
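The two-sample numbers for the AP classes data can be worked through like this. A sketch only: it assumes the "Big Ugly Formula" for degrees of freedom is the usual Welch-Satterthwaite approximation (the slide does not spell it out), and it takes t* = 2.064 from a standard t table (C = .95, df = 24). Variable names are made up.

```python
from math import sqrt

# Summary statistics from the slide
n_b, xbar_b, s_b = 29, 2.9, 1.1    # boys
n_g, xbar_g, s_g = 25, 3.2, 0.9    # girls

v_g, v_b = s_g**2 / n_g, s_b**2 / n_b    # each sample's s^2 / n
se = sqrt(v_g + v_b)                     # add variances, never SDs
diff = xbar_g - xbar_b                   # girls minus boys
t = diff / se                            # H0: mu_girls - mu_boys = 0

df_conservative = min(n_g, n_b) - 1      # the "easier way"
# Assumed form of the big formula (Welch-Satterthwaite):
df_welch = (v_g + v_b) ** 2 / (v_g**2 / (n_g - 1) + v_b**2 / (n_b - 1))

t_star = 2.064                           # t table, C = .95, df = 24
low, high = diff - t_star * se, diff + t_star * se

print(f"t = {t:.2f}, df = {df_conservative} (conservative) or {df_welch:.1f}")
print(f"95% CI for the difference: ({low:.2f}, {high:.2f})")
```

The smaller-df shortcut is the cautious choice: it gives a slightly wider interval than the calculator's df.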
Chapter 25
Paired Samples
Paired T-Test
Paired T-Interval
Matched Pairs
Looking at change/difference
Before training/after training
Left hand/right hand
So…we will find the
difference/change and it will become
our data!
Then….We are basically performing
a One Sample T-Test on these
differences. Easy.
Raw data
10 Students Before/After SAT Tutoring, is
there a positive effect?
Student   before   after   Diff
1         500      525     25
2         535      550     15
3         600      590     -10
4         605      635     30
5         575      550     -25
6         560      575     15
7         525      525     0
8         400      450     50
9         415      410     -5
10        550      575     25
LET’S DO IT!
P
H
A
LET’S DO IT! – Use the Calc
T T-Test for difference of matched pairs
P
I
Paired T-Interval
Construct and interpret a 90%
Confidence Interval for the true mean
difference in the previous problem.
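Since a paired procedure is just a one-sample t procedure on the differences, the tutoring data can be run the same way. A sketch assuming Python's standard library; `t_cdf` is a hypothetical Simpson's-rule helper standing in for the table or tcdf, and t* = 1.833 is the standard table value for C = .90, df = 9.

```python
import math
from statistics import mean, stdev

def t_cdf(t, df, steps=2000):
    """P(T <= t) for a t(df) curve: Simpson's rule on [0, |t|] plus symmetry."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    h = abs(t) / steps                      # steps must be even
    total = pdf(0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

before = [500, 535, 600, 605, 575, 560, 525, 400, 415, 550]
after = [525, 550, 590, 635, 550, 575, 525, 450, 410, 575]
diffs = [a - b for a, b in zip(after, before)]    # the differences ARE the data

n = len(diffs)
dbar, sd = mean(diffs), stdev(diffs)
se = sd / math.sqrt(n)

t = (dbar - 0) / se                   # H0: mean difference = 0
p_value = 1 - t_cdf(t, n - 1)         # Ha: mean difference > 0

t_star = 1.833                        # t table, C = .90, df = 9
low, high = dbar - t_star * se, dbar + t_star * se

print(f"dbar = {dbar}, s = {sd:.2f}, t = {t:.3f}, P = {p_value:.3f}")
print(f"90% CI: ({low:.2f}, {high:.2f})")
```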
Paired Wrap-up
Not too hard, huh?
Tough part is determining when to use
Paired Procedures
Simple Signs:
2 groups of data, same exact size
Before/after data
One person doing 2 things
Chapter 26
Chi-Square Procedures
Comparing Counts (Categorical Data)
Three Tests
Goodness of Fit
Independence
Homogeneity
Chi Square – Goodness of Fit
Remember M&M’s, we did a 1-Prop z-test
for each of the 6 colors.
2 Problems
Took a loooooong time
Doesn’t give us an overall picture of how
WAC the package was overall
Chi-Square Goodness of Fit
“How well do our “observations” fit
what we “expect”?”
My M&M data:
Observed   Expected
17         .24(58)
13         .20(58)
10         .14(58)
8          .14(58)
3          .13(58)
7          .10(58)
Chi Squared is the SUM of…
Observed   Expected
17         .24(58)
13         .20(58)
10         .14(58)
8          .14(58)
3          .13(58)
7          .10(58)
$X^2 = \sum \frac{(O - E)^2}{E}$
What to do with X2 value
Remember…
T-score?
Z-Score?
X2 is same (but different)
Need X2 value
Degrees of Freedom (df)
(Number of Categories – 1)
M&M’s Example
X2 = _________
df = 6-1=5
Look at X2 distribution table
Note, curve is not normal
Skewed right
Gets “normaler” as df rises
P(X2 >___)= Between ____ and ____
Goodness of fit – on TI-83
Unfortunately no “TEST” on TI-83
Can use LISTS to make it easier
Can also use X2cdf(low,high, df)
For M&M’s Example P-Value=_____
Significance Test
Mostly the same
Parameters: Define proportions
Let p1,p2,…p6 = Proportion of each color
of M&Ms
Hypotheses:
H0 – Proportions are as stated
p1=.24, p2=.20…
Ha – H0 is NOT true
Significance test
Assumptions
Observed counts – SRS
Large enough sample (all expected
counts are above 5)
Test – We just did it
P-Value - Same as before
Interpretation – Same as before
Let’s Do One!
Are students more
likely to miss school
on certain days?
Data from a
random sample of 5
Mondays,
Tuesdays…is
taken.
Day        MON   TUE   WED   THU   FRI
Observed   18    15    12    16    19
Absence days
Parameters
Let p1,p2,…p5 = Proportion of absences
on each day Monday through Friday
Hypotheses:
H0 – p1,p2,p3,p4,p5=____
Ha – H0 is NOT true
Absence Days
Assumptions
Observed counts – SRS
Large enough sample (all expected counts are
above 5) – Will Show Below
Test – X2 Goodness of Fit Test
$X^2 = \sum \frac{(O - E)^2}{E}$
Absence Days
Day   Observed   Expected   (O-E)2/E
MON   18         16
TUE   15         16
WED   12         16
THU   16         16
FRI   19         16
Absence Days
P VALUE
X2 = ______
df = ______
P(
)=
Interpretation:
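The blanks on this slide can be checked with a short calculation. A sketch assuming Python's standard library; `chi2_sf` is a hypothetical helper that gets the upper-tail area for whole-number df from a standard incomplete-gamma recurrence, playing the role of X2cdf(low, high, df) on the TI.

```python
import math

def chi2_sf(x, df):
    """Upper-tail area P(X2 > x) for a chi-square curve with integer df."""
    y = x / 2
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)                 # exact tail for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))      # exact tail for df = 1
    while a < df / 2:                            # step up 2 df at a time
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1
    return q

days = ["MON", "TUE", "WED", "THU", "FRI"]
observed = [18, 15, 12, 16, 19]
total = sum(observed)
expected = [total / len(days)] * len(days)   # H0: all days equally likely

components = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
x2 = sum(components)                         # chi-square statistic
df = len(days) - 1
p_value = chi2_sf(x2, df)

for day, comp in zip(days, components):
    print(f"{day}: (O-E)^2/E = {comp:.4f}")
print(f"X2 = {x2:.3f}, df = {df}, P-value = {p_value:.4f}")
```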
X2 for Homogeneity and
Independence
We just looked at 1 categorical variable
i.e. Color of M&M, Day of Week
Now 2 categorical variables. Yeah!
Two Tests (done same way)
Homogeneity – No difference in
proportions within a category
Independence – Is one variable
independent of the other?
Drinking Habits
Does there appear to be a gender
difference with respect to drinking
behavior of college students? 2017
male and female students were asked
to monitor their drinking over the
course of a week. Levels were
classified as None, Low (1-7), Medium
(8-24), High (25+).
Drinking Habits – Observed
Drinking Level   Male   Female
None             140    186
Low              478    661
Medium           300    173
High             63     16
Significance Test –
X2 Homogeneity
Parameters: NOT NEEDED
HYPOTHESES
H0: True Category Proportions are the
same for all populations
Ha: True Category Proportions are NOT
the same for all populations
Assumptions: Same as other X2:
SRS and Expected Cell Counts > 5
Significance Test –
X2 Homogeneity
Test: Pretty close to Goodness-Of-Fit, but
sum ALL cells.
Expected cell counts: $E = \frac{\text{Row Marginal} \times \text{Column Marginal}}{\text{Grand Total}}$
P-Value: Same as before with X2
distribution chart or X2cdf(). BUT df is
different: df=(#Rows-1)(#Columns-1)
Interpretation: Same ol’ Same ol’
DO IT! – Drinking Example
First Find Expected Counts – Fill in chart (just do 2)
GENDER
DRINKING LEVEL    MALE   FEMALE   Row Marginal
NONE              140    186
LOW               478    661
MEDIUM            300    173
HIGH              63     16
Column Marginal
ENTER THE MATRIX
Matrix – Choose Edit [A]
Choose (r x c) – Plug in observed #s
Should Look Like Your Table
Stat>Test>X2 Test
Calculate (You can DRAW later for your
sketch)
Ignore output (for a bit) and go look at Matrix
[B] (press enter) – Those are Expected
Counts WRITE THEM IN!
Drinking Example
Hypotheses
H0: True Category Proportions are the
same for all populations
Ha: True Category Proportions are NOT
the same for all populations
Assumptions
SRS
Large enough sample size
Drinking Example
TEST: Don’t write all the way out, Do This:
$X^2 = \sum \frac{(O - E)^2}{E} = \frac{(\_\_ - \_\_)^2}{\_\_} + \dots + \frac{(\_\_ - \_\_)^2}{\_\_}$
P-Value: df =
Interpret:
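What the TI's matrix routine does for the drinking table can be sketched by hand: marginals, expected counts, then the sum over ALL cells. This assumes Python's standard library; `chi2_sf` is a hypothetical helper (an incomplete-gamma recurrence for whole-number df), not the calculator's own method.

```python
import math

def chi2_sf(x, df):
    """Upper-tail area P(X2 > x) for a chi-square curve with integer df."""
    y = x / 2
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2:
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1
    return q

# Rows: None, Low, Medium, High; columns: Male, Female
observed = [[140, 186], [478, 661], [300, 173], [63, 16]]

row_totals = [sum(row) for row in observed]          # row marginals
col_totals = [sum(col) for col in zip(*observed)]    # column marginals
grand = sum(row_totals)

# Expected = row marginal x column marginal / grand total
expected = [[r * c / grand for c in col_totals] for r in row_totals]

x2 = sum((o - e) ** 2 / e
         for o_row, e_row in zip(observed, expected)
         for o, e in zip(o_row, e_row))
df = (len(row_totals) - 1) * (len(col_totals) - 1)
p_value = chi2_sf(x2, df)

for e_row in expected:
    print([round(e, 2) for e in e_row])
print(f"X2 = {x2:.2f}, df = {df}, P-value = {p_value:.3g}")
```

The printed expected counts correspond to what shows up in Matrix [B] after running the X2 test on the calculator.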
NOW X2 - Independence
Looking to see if certain category is
independent of another.
Example: Do blondes have more fun?
Do exactly like homogeneity, but
thinking (and Hypothesis and
Interpretation) is a bit different.
Recall: If A and B are Independent:
P(A&B) = P(A) x P(B)
Observed Data – from an SRS
of 70 people
Fun Level           Blonde         Non-Blonde     Row Proportion
Always              12             12             24/70 = .34
Sometimes           9              12             21/70 = .3
Never               4              21             25/70 = .36
Column Proportion   25/70 = .36    45/70 = .64    70
Expected Value = Row Proportion x Column Proportion x Grand Total
Sig Test Differences
Hypotheses:
H0: Two variables are independent
Ha: Two variables are NOT independent
Interpretation: answer the question, is
there evidence against the hypothesis
that the variables are independent?
Everything else is same
Do it! Do blondes have more
fun?
Is fun level independent of hair color?
Hypotheses:
H0:
Ha:
Assumptions
Blonde = Fun
TEST:1st - Show Completed Chart w/
Expected counts too!
Fun Level           Blonde         Non-Blonde     Row Proportion
Always              12             12             24/70 = .34
Sometimes           9              12             21/70 = .3
Never               4              21             25/70 = .36
Column Proportion   25/70 = .36    45/70 = .64    70
Blonde = Fun
Now conduct X2 test
$X^2 = \sum \frac{(O - E)^2}{E} = \frac{(\_\_ - \_\_)^2}{\_\_} + \dots + \frac{(\_\_ - \_\_)^2}{\_\_}$
P-Value df =
Interpret
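For independence, each expected count comes from P(A&B) = P(A) x P(B) scaled up by the grand total. A sketch of that for the hair color table, assuming Python's standard library; with df = 2 the chi-square tail area has the simple closed form exp(-X2/2), so no table helper is needed. Names are illustrative.

```python
import math

fun_levels = ["Always", "Sometimes", "Never"]
observed = {"Always": (12, 12), "Sometimes": (9, 12), "Never": (4, 21)}  # (Blonde, Non-Blonde)

grand = sum(b + nb for b, nb in observed.values())
col_props = [sum(observed[f][j] for f in fun_levels) / grand for j in range(2)]

x2 = 0.0
expected = {}
for f in fun_levels:
    row_prop = sum(observed[f]) / grand
    # Expected = row proportion x column proportion x grand total
    expected[f] = tuple(row_prop * cp * grand for cp in col_props)
    x2 += sum((o - e) ** 2 / e for o, e in zip(observed[f], expected[f]))

df = (len(fun_levels) - 1) * (2 - 1)
p_value = math.exp(-x2 / 2)          # exact chi-square tail area when df = 2

for f in fun_levels:
    print(f, [round(e, 2) for e in expected[f]])
print(f"X2 = {x2:.2f}, df = {df}, P-value = {p_value:.4f}")
```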
THAT’S IT!
JUST ONE MORE CHAPTER TO GO
AFTER THIS!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Chapter 27
Inference For Regression
Final Chapter!!!!!
The Main Idea
Inference –
We take a sample
Use procedures to find
Could the results happen by chance variation of
the sample
Is there evidence that this result might not have
happened by chance
Now we apply this to linear relationships
between 2 variables
Let’s Review Bivariate Data 1st
Mother’s Age and Birth Weight
Age   Weight (lbs)
15    5.0
17    6.3
18    7.3
15    5.8
16    6.8
19    6.6
17    6.9
16    5.6
18    7.5
19    8.1
Chuck these
into L1 and L2
and let’s roll!
Stuff to do with Bivariate Data
Graph it (don’t forget labeling)
Find LSRL (graph it and label it too!)
Stuff to do with Bivariate Data
Coefficient of Correlation (interpret)
Coefficient of Determination (interpret)
Stuff to do with Bivariate Data
Graph residuals (don’t forget labels)
Stuff to do with Bivariate Data
Overall comment on Bivariate data.
Direction, Strength, Linear-ness, Weirdness:
Stuff to do with Bivariate Data
Normalness of residuals (use box-whisker, stem-leaf, or histogram)
Stuff to do with Bivariate Data
Predict the birth weight (BW) of a
child born to a 17-year-old mother.
Stuff to do with Bivariate Data
Interpret slope of regression line in the
context of the problem.
Interpret intercept of regression line in
the context of the problem.
Inference for an LSRL - Data
Two major FR problem types
Do inference based on given set of
bivariate data (not common)
Do inference based on the output from a
statistics software program (very
common)
We will do both, but raw data first…
Basics
Our LSRL is simply an estimator of the
true LSRL (which would be based on a
census of the entire population).
$\hat{y} = a + bx$
ŷ: Predicted y value
a: Estimated y-int
b: Estimated slope
Basics
Real LSRL (from a census): $\mu_y = \alpha + \beta x$
μy: Mean y-value for that given x-value
α: True y-intercept
β: True Slope
(Pretty drawing stolen with love from Gabriel Tang who stole it from someone else)
Basics
For a given x-value, the y’s will vary
normally about μy, with a spread estimated by s (the standard error of the
residuals)
Basics
So….to do inference, we need the following statistics:
a – an estimate of α, the y-int of μy (LinReg)
b – an estimate of β, the slope of μy (LinReg)
sb – the standard error of the slope of the
regression line (on computer output, not TI, I will
show you a trick to find it)
s – the standard error of the residuals (on
computer and TI output)
Basics: $\hat{y} = b_0 + b_1 x$
Check out the formulas on formula sheet (for b1, b0, r, and sb1):
Make sure you can use them/know what they are
Great multiple choice fodder…
i.e. given r, sy, and b1, find sx
Basics - SE Residuals
$s = \sqrt{\frac{1}{n-2} \sum (y - \hat{y})^2}$, where each $y - \hat{y}$ is a residual
Why n-2? Just deal.
(TI will calculate if you have the raw data!)
Basics – SE Slope
$s_b = \frac{s}{\sqrt{\sum (x - \bar{x})^2}}$, where s is the SE of Residuals
(TI will NOT calculate – Boff has a trick)
Standard Error of Slope Trick
When we do LinRegTTest on TI, it gives us:
T-value
s (SE of Resids)
Now the t-value formula is:
$t = \frac{b}{s_b}$, therefore $s_b = \frac{b}{t}$
Yee haw, plug and chug!
Significance Test for Slope of a
Regression Line – From Data
Idea: we are checking to see if there really is a linear
relationship, or if our LSRL’s slope could have happened by chance
Usual test is whether the slope of the
regression line is ZERO (no relationship) or
NOT ZERO (a relationship) – Two Sided
Sometimes we look to see if relationship is
positive or negative (>0 or <0, respectively)
– One Sided
Let’s do it – Mom Age/Baby wt
Parameters:
Let β represent the true slope of the
regression line.
Hypotheses:
H0: β = 0, No relationship between birth
weight and mother’s age
Ha: β ≠ 0, There is a relationship between
birth weight and mother’s age
Let’s do it – Mom Age/Baby wt
Assumptions:
Linear Relationship (show resids, talk
about r)
Variance about the line is both constant
and normal (show resids and comment)
WHAT TO LOOK FOR:
YUCK – Curvy!
YUCK – Not constant variance
(Better check for outlier here)
Let’s do it – Mom Age/Baby wt
Assumptions:
Linear
Relationship
Variance about
the line is both
constant and
normal
How do ours
look?
Let’s do it – Mom Age/Baby wt
TEST – T Test for a Linear Regression
On TI – STAT>TEST>LINREGTTEST
Fill in Blanks
$t = \frac{b}{s_b}$
Let’s do it – Mom Age/Baby wt
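P-Value
df = 10-2 = 8
2 x P(t > ____) = ____
[Sketch: t curve centered at t = 0 (β = 0) with both tails beyond ±t shaded; the two tails together make up the total P]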
Interpretation:
P-Value is very small so we will reject H0 in favor of
the alternative. There is almost assuredly a
relationship between birth weight and a mother’s
age.
Remember, we can never say one causes the
other unless we have a well designed and
controlled experiment
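The LinRegTTest output for the mom age / baby weight data can be rebuilt from the formulas on the earlier slides. A sketch assuming Python's standard library; `t_cdf` is a hypothetical Simpson's-rule helper standing in for the t table, and all variable names are illustrative.

```python
import math

def t_cdf(t, df, steps=2000):
    """P(T <= t) for a t(df) curve: Simpson's rule on [0, |t|] plus symmetry."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    h = abs(t) / steps                      # steps must be even
    total = pdf(0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

ages = [15, 17, 18, 15, 16, 19, 17, 16, 18, 19]
weights = [5.0, 6.3, 7.3, 5.8, 6.8, 6.6, 6.9, 5.6, 7.5, 8.1]
n = len(ages)

xbar, ybar = sum(ages) / n, sum(weights) / n
sxx = sum((x - xbar) ** 2 for x in ages)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(ages, weights))

b = sxy / sxx                      # estimated slope
a = ybar - b * xbar                # estimated y-intercept
residuals = [y - (a + b * x) for x, y in zip(ages, weights)]

s = math.sqrt(sum(e * e for e in residuals) / (n - 2))   # SE of residuals
sb = s / math.sqrt(sxx)                                  # SE of slope
t = b / sb                                               # H0: beta = 0
p_value = 2 * (1 - t_cdf(abs(t), n - 2))                 # two-sided

print(f"y-hat = {a:.2f} + {b:.2f}x, s = {s:.3f}, sb = {sb:.4f}")
print(f"t = {t:.3f}, df = {n - 2}, P-value = {p_value:.4f}")
```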
Confidence Intervals (from
data)
Similar to other CI, but remember we are
looking at slope…
$b \pm t^* s_b$
b: Our estimated slope from LinReg
t*: Get from table, need df (n-2) and confidence level
sb: Remember Riebhoff’s sneaky trick?
Part 2, Computer Output
Data collected by counting cricket
chirps in 15 seconds and noting current
temperature.
Output (from some statistical software):
TEMP = 44.0 + 0.993 NUMBER
Predictor    Coef     STDev    T-Ratio   P
CONSTANT     44.01    1.827    24.09     .000
NUMBER       0.993    0.065    15.23     .000
S = 1.538    R-sq = 95.9%    R-sq(adj) = 95.5%
Interpreting Computer Output
This is the LSRL for predicting the temperature
based on the number of chirps.
Interpreting Computer Output
The Coef for the CONSTANT is the y-intercept
(a) of the LSRL
The Coef for the NUMBER is the Slope (b) of
the LSRL
Interpreting Computer Output
STDev of NUMBER is the standard error of the
slope (Sb).
Interpreting Computer Output
T-ratio of NUMBER is the value of the t test
statistic. Check it: $t = \frac{b}{s_b}$
P is the P-value associated with the t-ratio
(assuming a 2-sided test – if you are using a 1-sided test, divide by 2):
Interpreting Computer Output
S is the standard error of the residuals (s)
Interpreting Computer Output
R-sq is r-squared (coefficient of determination
– r2 )
Interpreting Computer Output
Ignore these, we do
not need!
Significance Test – From
Output
Same as before, but maybe easier…
Here we go…
Let’s do it – Chirps/Temp
Parameters:
Let β represent the true slope of the
regression line.
Hypotheses:
H0: β = 0, No relationship between # of
chirps and temperature
Ha: β ≠ 0, There is a relationship between
# of chirps and temperature
Let’s do it – Chirps/Temp
Assumptions:
Linear Relationship (no resids, talk about
r)
Variance about the line is both constant
and normal (we must assume….)
Let’s do it – Chirps/Temp
TEMP = 44.0 + 0.993 NUMBER
Predictor    Coef     STDev    T-Ratio   P
CONSTANT     44.01    1.827    24.09     .000
NUMBER       0.993    0.065    15.23     .000
S = 1.538    R-sq = 95.9%    R-sq(adj) = 95.5%
TEST – T Test for a Linear Regression
Fill in Blanks From Output
$t = \frac{b}{s_b}$
Let’s do it – Chirps/Temp
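P-Value
df = not given
2 x P(t > ____) = ____
[Sketch: t curve centered at t = 0 (β = 0) with both tails beyond ±t shaded; the two tails together make up the total P]
Interpretation: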
DONE
DONE DONE DONE DONE DONE DONE
DONE DONE DONE DONE DONE DONE
DONE DONE DONE DONE DONE DONE
DONE DONE DONE DONE DONE DONE
DONE DONE DONE DONE DONE DONE
DONE DONE DONE DONE DONE DONE
DONE DONE DONE DONE DONE DONE
DONE DONE DONE DONE DONE DONE
DONE DONE DONE DONE DONE DONE
DONE DONE DONE DONE DONE YEAH!