APSTAT - Unit 4b - Woodside Priory School

APSTAT - Unit 4b
Inference Continued
Chapter 23
Inference about means
One Sample Z-Test for Mean
One Sample T-Test for Mean
One Sample T-Interval for Mean
One Sample Z - Test for Mean
• Example. The 2005 SAT:
  • Mean 1080, Standard Deviation 180
  • Priory Students: n = 32, mean 1200
• Two Possibilities (Just like for Proportions…)
  • Higher WPS scores just happened by chance
    (natural variation of a sample)
  • The likelihood of 32 students averaging 1200 by chance
    is so remote that we must conclude that Priory students
    score higher on the SAT than the national average.
Let’s Do It! WPS SAT Example
• Step 1 Define Parameter:
  • μ, the true mean score of WPS SAT test-takers
• Step 2 Hypotheses
  • H0: μ = 1080, Priory students perform at the
    same level as the National Average
  • Ha: μ > 1080, Priory students perform better than
    the National Average
WPS SAT Example Continued
• Step 3 Assumptions:
  • SRS: Assume/not stated
  • Normal Distribution: n > 30
  • σ is known: Yep!
• Step 4 Name It, Show it, DO IT
  • One Sample Z-Test for a Mean
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{1200 - 1080}{180/\sqrt{32}} = 3.77$
WPS SAT Example Continued
• Step 5 P-value and sketch of normal curve:
  • P(z > 3.77) = 0.0000816
[Sketch: normal curve centered at μ = 1080 with z = 3.77 marked and the upper tail shaded, P = 0.000082]
• Step 6 Interpret P-value and Conclusion
  • A P-value of .0000816 indicates that, if H0 were true, there
    is less than a 1 in 10,000 chance of getting a sample mean
    this far above μ = 1080 merely by chance. Therefore, reject
    H0 in favor of Ha. There is strong evidence that WPS students
    performed better on average than the National Average on the
    2005 SAT.
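The six steps above can be checked numerically. This is a minimal sketch using only the Python standard library; the function name `one_sample_z_upper` is made up for illustration and is not from the slides.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_upper(x_bar, mu0, sigma, n):
    """Return (z, p) for H0: mu = mu0 against Ha: mu > mu0, sigma known."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    # One-sided (upper tail) P-value from the standard normal curve.
    p = 1 - NormalDist().cdf(z)
    return z, p


# Numbers from the WPS SAT example: n = 32, sample mean 1200,
# national mean 1080, national standard deviation 180.
z, p = one_sample_z_upper(x_bar=1200, mu0=1080, sigma=180, n=32)
print(f"z = {z:.2f}, P-value = {p:.7f}")
```

The P-value here uses the unrounded z, so it can differ slightly in the last digits from a table lookup at z = 3.77.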
Two-Sided Example
• Mr R is convinced that his blood-sugar
  count is not normal. He tests himself
  35 times and finds his mean count to be 90. If
  the healthy mean for blood sugar count
  is 84 with a standard deviation of 23, is
  Mr. R’s blood sugar count abnormal?
Let’s Do It! Blood Sugar Example
• Step 1 Define Parameter:
  • μ, Riebhoff’s true mean blood sugar level
• Step 2 Hypotheses
  • H0: μ = 84, Boff’s blood sugar level is normal
  • Ha: μ ≠ 84, Boff’s blood sugar level is abnormal
Blood Sugar Example Continued
• Step 3 Assumptions:
  • SRS: Assume/not stated
  • Normal Distribution: n > 30
  • σ is known: Yep!
• Step 4 Name Test and DO IT
  • One Sample Z-Test for a Mean
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{90 - 84}{23/\sqrt{35}} = 1.54$
Blood Sugar Example Continued
• Step 5 P-value and sketch of normal curve:
  • 2·P(z > 1.54) = 2(.06178) = 0.1236
[Sketch: normal curve centered at μ with both tails shaded beyond z = -1.54 and z = +1.54]
• Step 6 Interpret P-value and Conclusion
  • A P-value of .1236 indicates that, if H0 were true, a sample
    mean this different from the true mean would occur about once
    in every eight samples of this size simply by chance.
    Therefore we fail to reject H0. There is not enough
    evidence that Mr R’s blood sugar level is different from the
    population mean.
Confidence and Significance…
• Hmm, think about a significance level of α = .05
• Now think about a confidence interval of C = .95
[Sketch: standard normal curve with central area P = 0.95 between Z = -1.96 and Z = 1.96 (or -2 and 2 by the empirical rule) and P = 0.025 in each tail]
Apples Problem, CI Style
Recall: x̄ = 122.5, n = 49, μ = 120, σ = 12
Step 1 Define Parameter
μ,
Step 2 Hypotheses
H0:
Ha:
Step 3 Assumptions
Apples Problem, CI Style
Step 4 Name Test and DO IT
Confidence Interval for a Population Mean
$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}} = 122.5 \pm 1.96 \cdot \frac{12}{\sqrt{49}} = 122.5 \pm 3.36$
Apples Problem, CI Style
Step 5 Sketch of Interval:
[Sketch: interval from 119.14 to 125.86 centered at 122.5, with μ = 120 marked inside it]
Step 6 Interpret Interval and Conclusion
We are 95% confident that the true population
mean falls between 119.14 and 125.86.
Because the hypothesized mean of 120 IS
included in this interval, we must fail to reject
H0. The mean weight of this shipment cannot
be considered abnormal.
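The link between a 95% interval and a two-sided test at α = .05 can be sketched in a few lines. This is only an illustration with the standard library; `z_interval` is a hypothetical helper name, and z* is computed rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for a mean when sigma is known."""
    alpha = 1 - confidence
    # z* leaves alpha/2 in each tail (about 1.96 for 95% confidence).
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_star * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


# Apples problem: x-bar = 122.5, sigma = 12, n = 49, hypothesized mean 120.
low, high = z_interval(x_bar=122.5, sigma=12, n=49, confidence=0.95)
mu0 = 120
reject = not (low <= mu0 <= high)  # reject H0 only if mu0 is outside the interval
print(f"({low:.2f}, {high:.2f}), reject H0: {reject}")
```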
Inference as a decision
• Example:
  • I am the Major League Baseball equipment
    checker dude.
  • In each shipment of baseballs, I select 36 balls
    at random to make sure they comply with MLB
    Standards.
  • The rules state that the balls are to be 5.2 oz
    each
  • Over the course of a great number of years, the
    standard deviation of weights (of accepted
    balls) has been 0.15 oz (we will accept this as σ)
Here is the start of my test
• μ: the true mean weight of the balls in the
  shipment
• Hypothesis
  H0: μ = 5.2, Shipment mean is 5.2
  Ha: μ ≠ 5.2, Shipment mean is not 5.2
• My decision will be to accept or reject the
  shipment. If I reject, the balls will be sent
  back to the manufacturer. I will use an α
  of 0.05
Let’s Continue
• My ball sample had a mean of 5.15. Do
  I accept or reject the shipment?
• Assumptions:
  • SRS: Stated
  • Normal Distribution: a sample size of 36
    should give a normal-ish sampling distribution
  • σ is known: Yep!
Mas
• Step 4 Name Test and DO IT
  • One Sample Z-Test for a Mean
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{5.15 - 5.2}{0.15/\sqrt{36}} = -2.00$
Mas Mas
• Step 5 P-value and sketch of normal curve:
  • 2·P(z < -2.00) = 2(.02275) = 0.0455
[Sketch: normal curve with the "Accept" region between Z = -1.96 and Z = 1.96; the test statistic, 2.00 standard errors from the mean, falls just outside it]
• Step 6 Interpret P-value and Conclusion
  • We will reject this shipment of balls since .0455 < .05.
    There is enough evidence that the mean weight of this
    shipment differs from the MLB standard of 5.2 oz.
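The accept/reject decision can be written out as a rule. The sketch below is one way to do it with the standard library; the variable names are illustrative. Because the sample mean (5.15) is below 5.2, z comes out negative, so the comparison uses its absolute value.

```python
from math import sqrt
from statistics import NormalDist

ALPHA = 0.05
std_normal = NormalDist()

# Baseball shipment: n = 36, sample mean 5.15, target 5.2, sigma 0.15.
z = (5.15 - 5.2) / (0.15 / sqrt(36))

# Two-sided test: reject when |z| is beyond the critical value (about 1.96).
z_crit = std_normal.inv_cdf(1 - ALPHA / 2)
p_value = 2 * (1 - std_normal.cdf(abs(z)))
reject_shipment = abs(z) > z_crit

print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject: {reject_shipment}")
```

Comparing |z| to the critical value and comparing the P-value to α always give the same decision; the slide shows both.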
T-Procedures – No σ given
• In most situations, knowing μ and σ is
  not likely.
• We must find a way to estimate σ
• What we do have:
  • x-bar: Our unbiased estimate of μ
  • s: the standard deviation from our
    sample. Our estimate of σ
So……
• Before:
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
• Now:
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Like z, with a twist. You’ll see.
The denominator $s/\sqrt{n}$ is the Standard Error (SE) of the
sample mean.
What is this t thang, baby?
Like z,
• Shows how far x-bar is from μ in standard deviation units.
• Bell Shaped
BUT, unlike z
• Curve changes as sample size changes
• If n is small, more probability rests in the tails of the t-distribution curve
• As n increases, becomes more like z-distribution
  (standard normal)
T-distribution curve
[Figure: Standard Normal curve plotted with T (n=3) and T (n=9) curves, all centered at μ; the t curves have heavier tails]
UH OH!!!
If curve changes as n changes, we will need
a crazy chart
To use it we need something new:
DEGREES OF FREEDOM (k)
k = n – 1
We describe the t-distribution curve with the
following:
t(k) – means a t-distribution with k degrees of
freedom
Let’s do it
• Last page in book or Table B on AP
  handout
• Critical Value t*
• EX: n = 10, P = .025 to right of t*
Sketch it! (μ at the center, P = .025 shaded to the right of t*)
df =
t* =
Let’s do it again!
• Find Critical Value t*
• EX: n = 18, P = .98 to left of t*
Sketch it! (μ at the center, P = .98 shaded to the left of t*)
df =
t* =
Assumptions
• SRS
• Normal population or large n
  • Our Book’s Rule Of Thumb
    n<15, ok if pretty close to normal
    n>15, ok if no outliers or super skew
    n>40, all good in my ‘hood
    If outliers, you may eliminate them
  • Other texts
    n>30 if non-normal population (but always check for outliers
    and skew if given raw data)
• σ is unknown
Significance test.
• Does body fat increase after 1 wk of
  McDonalds-only eating?
• Sample changes in body fat % from a
  random sample of 8 people:
  1.3, 0.8, 2.1, 1.6, -0.2, 1.5, -1.0, 2.0
Put in List on TI, You’ll see why
Find: n=      df=      xbar=      s=
Test is on!
PARAMETER μ: __________________________
HYPOTHESES H0: μ = 0, _____________________
Ha: μ > 0, _____________________
More…
ASSUMPTIONS
Sketch graph!
SRS
Normalness
σ is not given
TEST – One sample t-test for mean
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{1.0125 - 0}{1.0947/\sqrt{8}} = 2.616$
More….
P(VALUE)
P(t>2.616)=
Look on t-dist table
df=7, t=2.616
Find which two probabilities this falls between
***NOTE - T-distribution table is the
reverse of the normal distribution table:
Probability on the OUTSIDE, t-score on the
INSIDE
TI version of t-distribution table
• Z test we would do normalcdf(blah)
• T test we do:
  tcdf(low, high, df)
That is it.
More…….
Interpretation
A p-value between .01 and .02 indicates
that a sample mean this large would occur in
roughly 1 in 100 to 1 in 50 samples of this size
simply by chance if H0 were true. We will therefore
reject H0 in favor of Ha. There is good evidence
that a week of eating only fast food increases
mean body fat percentage.
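The table only brackets the P-value. As a rough check, the sketch below recomputes the statistics from the eight data values and approximates the upper-tail area by integrating the t density numerically (Simpson's rule), since the Python standard library has no t-distribution. The helper names `t_pdf` and `t_upper_tail` are made up for this example; on the TI, tcdf does this job.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev


def t_pdf(t, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0: 0.5 minus the area from 0 to t."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Changes in body fat % for the 8 people in the sample.
changes = [1.3, 0.8, 2.1, 1.6, -0.2, 1.5, -1.0, 2.0]
n = len(changes)
x_bar = mean(changes)
s = stdev(changes)                      # sample standard deviation (n - 1)
t_stat = (x_bar - 0) / (s / sqrt(n))    # H0: mu = 0
p_value = t_upper_tail(t_stat, df=n - 1)

print(f"x-bar = {x_bar:.4f}, s = {s:.4f}, t = {t_stat:.3f}, P = {p_value:.4f}")
```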
Now to make you crazy and
angry
 Remember, to do a test correctly you
must throw down all the PHAT PI
action. There is no shortcut in what
you need to show.
• Now,
  hit STAT>TESTS>T-TEST
  Input: Data, μ0 = 0, List: L1, alternative: > μ0, Calculate
OH MY!!!!!!
Mas TI LOVE!
 Go back to STAT>TESTS>T-TEST
You can also enter stats, instead of
drawing on data from a list
Boo ya!
Confidence Intervals
$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$
Based on t(k)
Let’s hop right in
• We do a random sample of the length
  (in inches) of 8 senior male feet and
  find the following:
• n=8 x-bar=12 s=2.4
• Construct a 95% CI
• Do Assumptions (no need to do a
  Hypothesis test, why?)
Senior Feet
Test – One sample confidence interval for
mean of a population
$\bar{x} \pm t^* \frac{s}{\sqrt{n}} = 12 \pm (2.365)\frac{2.4}{\sqrt{8}} = 12 \pm 2.01$
(9.99, 14.01)
t* is from table, C=.95, df=7
Senior Feet…
Interpret CI
We are 95% confident that the true
mean of WPS senior foot size is
between 9.99 and 14.01 inches.
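The interval arithmetic is quick to verify. In this small sketch the critical value t* = 2.365 is typed in from the table (as on the slide) rather than computed.

```python
from math import sqrt

# Senior feet: n = 8, x-bar = 12 inches, s = 2.4 inches.
n, x_bar, s = 8, 12, 2.4
t_star = 2.365  # table value for C = .95, df = 7

standard_error = s / sqrt(n)
margin = t_star * standard_error
low, high = x_bar - margin, x_bar + margin

print(f"12 ± {margin:.2f} -> ({low:.2f}, {high:.2f})")
```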
Chapter 24
Comparing Means
Two Sample Z-Test for Mean
Two Sample T-Test for Mean
Two Sample T-Interval for Mean
Comparing two means
• Two sample problems
  • Compare characteristics of two
    populations
  • Separate samples
[Diagram: Random Sample from Population 1 and Random Sample from Population 2, joined by COMPARE]
Comparing two means
• Another example:
[Diagram: Group 1 (Treatment A) and Group 2 (Treatment B), joined by COMPARE]
***Matched pairs is different, not a 2 sample problem…
Assumptions for
two sample t-test
• 2 SRS from different and independent
  populations
• Normal Population or large enough
  sample size
  Typically n1 + n2 >40
• σ and μ are unknown
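Two Sample Z Statistic
ONE SAMPLE
$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
TWO SAMPLE (looking at difference of means)
$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
Remember, we can’t add standard deviations… (we add variances)
Two Sample T Statistic
ONE SAMPLE
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
TWO SAMPLE
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$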
Important stuff
A Two-Sample t statistic does NOT have an exact
t distribution
We are replacing 2 population standard deviations (σ1, σ2)
with 2 sample standard deviations (s1, s2)
It’s ok though, we can use it if:
We change our degrees of freedom a bit
1 way – Big Ugly Formula (P.468)
Your TI does this
Easier Way – Just use smaller of n1-1 or n2-1
Two Sample t* Confidence Interval
$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
•If you use CI as part of a test of significance:
•If μ0 is included in the interval, fail to reject H0
•If μ0 is NOT included in the interval, reject H0 in favor of Ha
Testing Hypotheses
• In a Two Sample Test:
• Null Hypothesis
  • H0 : μ1 = μ2 OR μ1 - μ2 = 0
  • Basically, there is NO difference between the
    means
• Alternative Hypothesis
  • Ha : μ1 > (or < or ≠) μ2
  • OR μ1 - μ2 > (or < or ≠) 0
  • Basically, there is a difference between the means
    or one is greater than the other
Pooled
• Use Pooled Formula only if the populations have exactly the
  same variance:
$\sigma_1^2 = \sigma_2^2$
• Honestly, since a pooled t-test is so sensitive to
  slight differences in variances, JUST USE
  REGULAR TWO-SAMPLE T-TEST.
Do girls take more AP classes?
An SRS is taken, here’s the data:
        Boys    Girls
n       29      25
x-bar   2.9     3.2
s       1.1     .9
P
H
A
Girls in AP classes
T
P
I
Do it with A 95% Confidence
Interval
P – Same
H – Same
A – Same
T – 2-sample 95% t* Confidence Interval
P – Don’t get one
I–
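To see what the "T" step involves for this data, here is a hedged sketch of the two-sample t statistic with both degrees-of-freedom choices. The "Big Ugly Formula" is assumed here to be the usual Welch-Satterthwaite approximation; the slides only cite a page number, so treat that as an assumption. The function name `two_sample_t` is invented for the example.

```python
from math import sqrt


def two_sample_t(x1, s1, n1, x2, s2, n2):
    """Unpooled two-sample t statistic with two df choices (H0: mu1 = mu2)."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    se = sqrt(v1 + v2)                   # variances add, not standard deviations
    t = (x1 - x2) / se
    df_conservative = min(n1, n2) - 1    # "easier way" from the slides
    # Assumed form of the "Big Ugly Formula" (Welch-Satterthwaite).
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, se, df_conservative, df_welch


# Girls (sample 1) minus boys (sample 2), from the AP classes table.
t, se, df_small, df_welch = two_sample_t(3.2, 0.9, 25, 2.9, 1.1, 29)
print(f"t = {t:.3f}, SE = {se:.4f}, df = {df_small} (or about {df_welch:.1f})")
```

The conservative df is always the smaller of the two, which makes the t* critical value larger and the procedure a little more cautious.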
Chapter 25
Paired Samples
Paired T-Test
Paired T-Interval
Matched Pairs
 Looking at change/difference
Before training/after training
Left hand/right hand
 So…we will find the
difference/change and it will become
our data!
 Then….We are basically performing
a One Sample T-Test on these
differences. Easy.
Raw data
10 Students Before/After SAT Tutoring, is
there a positive effect?
Student   1     2     3     4     5     6     7     8     9     10
before    500   535   600   605   575   560   525   400   415   550
after     525   550   590   635   550   575   525   450   410   575
Diff      25    15    -10   30    -25   15    0     50    -5    25
LET’S DO IT!
P
H
A
LET’S DO IT! – Use the Calc
T T-Test for difference of matched pairs
P
I
Paired T-Interval
 Construct and interpret a 90%
Confidence Interval for the true mean
difference in the previous problem.
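Since a paired procedure is just a one-sample t procedure on the differences, the tutoring data can be worked as below. This is a sketch with the standard library; the critical value t* = 1.833 (90% confidence, df = 9) is entered from a t table, not computed.

```python
from math import sqrt
from statistics import mean, stdev

before = [500, 535, 600, 605, 575, 560, 525, 400, 415, 550]
after = [525, 550, 590, 635, 550, 575, 525, 450, 410, 575]

# The differences (after - before) become the data.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)
se = s_d / sqrt(n)

t_stat = d_bar / se          # H0: mean difference = 0
t_star = 1.833               # table value, C = .90, df = n - 1 = 9
interval = (d_bar - t_star * se, d_bar + t_star * se)

print(f"mean diff = {d_bar}, SE = {se:.2f}, t = {t_stat:.3f}")
print(f"90% CI: ({interval[0]:.2f}, {interval[1]:.2f})")
```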
Paired Wrap-up
 Not too hard, huh?
 Tough part is determining when to use
Paired Procedures
 Simple Signs:
2 groups of data, same exact size
Before/after data
One person doing 2 things
Chapter 26
Chi-Square Procedures
Comparing Counts (Categorical Data)
Three Tests
Goodness of Fit
Independence
Homogeneity
Chi Square – Goodness of Fit
• Remember M&M’s, we did a 1-Prop z-test
  for all 6 colors.
• 2 Problems
  Took a loooooong time
  Doesn’t give us an overall picture of how
  WAC the package was overall
Chi-Square Goodness of Fit
• How well do our “observations” fit
  what we “expect”?
• My M&M data:
Observed   Expected
17         .24(58)
13         .20(58)
10         .14(58)
8          .14(58)
3          .13(58)
7          .10(58)
Chi Squared is the SUM of…
Observed   Expected
17         .24(58)
13         .20(58)
10         .14(58)
8          .14(58)
3          .13(58)
7          .10(58)
$X^2 = \sum \frac{(O - E)^2}{E}$
What to do with X2 value
 Remember…
T-score?
Z-Score?
 X2 is same (but different)
Need X2 value
Degrees of Freedom (df)
(Number of Categories – 1)
M&M’s Example
 X2 = _________
 df = 6-1=5
 Look at X2 distribution table
Note, curve is not normal
Skewed right
Gets “normaler” as df rises
 P(X2 >___)= Between ____ and ____
Goodness of fit – on TI-83
 Unfortunately no “TEST” on TI-83
 Can use LISTS to make it easier
 Can also use X2cdf(low,high, df)
 For M&M’s Example P-Value=_____
Significance Test
 Mostly the same
 Parameters: Define proportions
Let p1,p2,…p6 = Proportion of each color
of M&Ms
 Hypotheses:
H0 – Proportions are as stated
p1=.24, p2=.20…
Ha – H0 is NOT true
Significance test
 Assumptions
Observed counts – SRS
Large enough sample (all expected
counts are above 5)
 Test – We just did it
 P-Value - Same as before
 Interpretation – Same as before
Let’s Do One!
• Are students more likely to miss school
  on certain days? Data from a random
  sample of 5 Mondays, Tuesdays… is taken.
Day    Observed
MON    18
TUE    15
WED    12
THU    16
FRI    19
Absence days
 Parameters
Let p1,p2,…p5 = Proportion of absences
on each day Monday through Friday
 Hypotheses:
H0 – p1,p2,p3,p4,p5=____
Ha – H0 is NOT true
Absence Days
 Assumptions
 Observed counts – SRS
 Large enough sample (all expected counts are
above 5) – Will Show Below
• Test – X2 Goodness of Fit Test
$X^2 = \sum \frac{(O - E)^2}{E}$
Absence Days
Day    Observed   Expected   (O-E)2/E
MON    18         16
TUE    15         16
WED    12         16
THU    16         16
FRI    19         16
Absence Days
 P VALUE
X2 = ______
df = ______
P(
)=
 Interpretation:
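One way to check your filled-in table and P-value is sketched below. The tail-area function uses a closed form that only works when df is even (here df = 4), so it is not a general replacement for X2cdf; both function names are made up for this example.

```python
from math import exp, factorial


def chi_square_stat(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_upper_tail_even_df(x, df):
    """P(X2 > x) using the closed form that holds only for even df."""
    if df % 2 != 0:
        raise ValueError("this closed form needs an even df")
    half = x / 2
    return exp(-half) * sum(half ** k / factorial(k) for k in range(df // 2))


# Absences by day, MON..FRI; H0 says each day gets the same share.
observed = [18, 15, 12, 16, 19]
expected = [sum(observed) / len(observed)] * len(observed)   # 16 each

x2 = chi_square_stat(observed, expected)
df = len(observed) - 1
p_value = chi_square_upper_tail_even_df(x2, df)

print(f"X2 = {x2}, df = {df}, P-value = {p_value:.4f}")
```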
X2 for Homogeneity and
Independence
 We just looked at 1 category
ie. Color of M&M, Day of Week
 Now 2 Categories. Yeah!
 Two Tests (done same way)
Homogeneity – No difference in
proportions within a category
Independence – Is one variable
independent of the other?
Drinking Habits
 Does there appear to be a gender
difference with respect to drinking
behavior of college students? 2017
male and female students were asked
to monitor their drinking over the
course of a week. Levels were
classified as None, Low (1-7), Medium
(8-24), High (25+).
Drinking Habits – Observed
                  GENDER
Drinking Level    Male    Female
None              140     186
Low               478     661
Medium            300     173
High              63      16
Significance Test –
X2 Homogeneity
 Parameters: NOT NEEDED
 HYPOTHESES
 H0: True Category Proportions are the
same for all populations
 Ha: True Category Proportions are NOT
the same for all populations
 Assumptions: Same as other X2:
 SRS and Expected Cell Counts > 5
Significance Test –
X2 Homogeneity
 Test: Pretty close to Goodness-Of-Fit, but
sum ALL cells.
Expected cell count = (Row Marginal x Column Marginal) / Grand Total
 P-Value: Same as before with X2
distribution chart or X2cdf(). BUT df is
different: df=(#Rows-1)(#Columns-1)
 Interpretation: Same ol’ Same ol’
DO IT! – Drinking Example
• First Find Expected Counts – Fill in chart (just do 2)
                   GENDER
DRINKING LEVEL     MALE    FEMALE    Row Marginal
NONE               140     186
LOW                478     661
MEDIUM             300     173
HIGH               63      16
Column Marginal
ENTER THE MATRIX
 Matrix – Choose Edit [A]
 Choose (r x c) – Plug in observed #s
 Should Look Like Your Table
 Stat>Test>X2 Test
 Calculate (You can DRAW later for your
sketch)
 Ignore output (for a bit) and go look at Matrix
[B] (press enter) – Those are Expected
Counts WRITE THEM IN!
Drinking Example
 Hypotheses
H0: True Category Proportions are the
same for all populations
Ha: True Category Proportions are NOT
the same for all populations
 Assumptions
SRS
Large enough sample size
Drinking Example
• TEST: Don’t write all the way out, Do This:
$X^2 = \sum \frac{(O - E)^2}{E} = \frac{(\_\_ - \_\_)^2}{\_\_} + \dots + \frac{(\_\_ - \_\_)^2}{\_\_}$
 P-Value: df =
 Interpret:
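The matrix [B] that the TI produces can be reproduced by hand with the marginal totals. The sketch below does that for the drinking table and sums every cell's contribution; the variable names are illustrative, and the P-value step is left to the table or X2cdf.

```python
# Observed counts: rows are None, Low, Medium, High; columns are Male, Female.
observed = [
    [140, 186],
    [478, 661],
    [300, 173],
    [63, 16],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count = row marginal x column marginal / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Sum (O - E)^2 / E over ALL cells.
x2 = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"X2 = {x2:.2f}, df = {df}")
```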
NOW X2 - Independence
 Looking to see if certain category is
independent of another.
 Example: Do blondes have more fun?
 Do exactly like homogeneity, but
thinking (and Hypothesis and
Interpretation) is a bit different.
 Recall: If A and B are Independent:
P(A&B) = P(A) x P(B)
Observed Data – from an SRS of 70 people
                Hair Color
Fun Level       Blonde         Non-Blonde     Row Total
Always          12             12             24/70 = .34
Sometimes       9              12             21/70 = .3
Never           4              21             25/70 = .36
Column Total    25/70 = .36    45/70 = .64    70
Expected Value = Row Proportion x Column Proportion x Grand Total
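This expected-value rule is the independence rule P(A&B) = P(A) x P(B) scaled up to counts. A short worked version for the Always/Blonde cell follows; R, C, and N are shorthand introduced here for the row total, column total, and grand total.

```latex
E = \frac{R}{N}\cdot\frac{C}{N}\cdot N = \frac{R \cdot C}{N}
\qquad\text{e.g.}\qquad
E_{\text{Always, Blonde}} = \frac{24}{70}\cdot\frac{25}{70}\cdot 70
  = \frac{24 \cdot 25}{70} \approx 8.57
```

So it gives the same number as the "Row Marginal x Column Marginal / Grand Total" rule used for homogeneity.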
Sig Test Differences
 Hypotheses:
H0: Two variables are independent
Ha: Two variables are NOT independent
• Interpretation: answer the question: is
  there evidence against the hypothesis
  that the variables are independent?
 Everything else is same
Do it! Do blondes have more
fun?
 Is fun level independent of hair color?
 Hypotheses:
H0:
Ha:
 Assumptions


Blonde = Fun
• TEST: 1st - Show Completed Chart w/
  Expected counts too!
                Hair Color
Fun Level       Blonde         Non-Blonde     Row Total
Always          12             12             24/70 = .34
Sometimes       9              12             21/70 = .3
Never           4              21             25/70 = .36
Column Total    25/70 = .36    45/70 = .64    70
Blonde = Fun
• Now conduct X2 test
$X^2 = \sum \frac{(O - E)^2}{E} = \frac{(\_\_ - \_\_)^2}{\_\_} + \dots + \frac{(\_\_ - \_\_)^2}{\_\_}$
 P-Value df =
 Interpret
THAT’S IT!
 JUST ONE MORE CHAPTER TO GO
AFTER THIS!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Chapter 27
Inference For Regression
Final Chapter!!!!!
The Main Idea
 Inference –
We take a sample
Use procedures to find
Could the results happen by chance variation of
the sample
Is there evidence that this result might not have
happened by chance
 Now we apply this to linear relationships
between 2 variables
Let’s Review Bivariate Data 1st
Mother’s Age and Birth Weight
Age    Weight (lbs)
15     5.0
17     6.3
18     7.3
15     5.8
16     6.8
19     6.6
17     6.9
16     5.6
18     7.5
19     8.1
Chuck these into L1 and L2 and let’s roll!
Stuff to do with Bivariate Data
 Graph it (don’t forget labeling)
• Find LSRL (graph it and label it too!)
Stuff to do with Bivariate Data
 Coefficient of Correlation (interpret)
 Coefficient of Determination (interpret)
Stuff to do with Bivariate Data
 Graph residuals (don’t forget labels)
Stuff to do with Bivariate Data
 Overall comment on Bivariate data.
Direction, Strength, Linear-ness, Weirdness:
Stuff to do with Bivariate Data
• Normalness of residuals (use box-whisker, stem-leaf, or histogram)
Stuff to do with Bivariate Data
• Predict the birth weight (BW) of a
  child born to a 17-year-old mother.
Stuff to do with Bivariate Data
 Interpret slope of regression line in the
context of the problem.
 Interpret intercept of regression line in
the context of the problem.
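For checking your calculator work on the review items above, here is a minimal sketch that finds the LSRL, r, r-squared, and the prediction for a 17-year-old mother directly from the sums of squares. It uses plain Python instead of LinReg, and the variable names are just for illustration.

```python
from math import sqrt

ages = [15, 17, 18, 15, 16, 19, 17, 16, 18, 19]
weights = [5.0, 6.3, 7.3, 5.8, 6.8, 6.6, 6.9, 5.6, 7.5, 8.1]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(weights) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - x_bar) ** 2 for x in ages)
syy = sum((y - y_bar) ** 2 for y in weights)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, weights))

b = sxy / sxx                 # slope of the LSRL
a = y_bar - b * x_bar         # y-intercept of the LSRL
r = sxy / sqrt(sxx * syy)     # correlation coefficient

predicted_17 = a + b * 17     # predicted birth weight at age 17

print(f"y-hat = {a:.2f} + {b:.2f}x, r = {r:.3f}, r^2 = {r * r:.3f}")
print(f"predicted weight at age 17: {predicted_17:.2f} lbs")
```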
Inference for an LSRL - Data
 Two major FR problem types
Do inference based on given set of
bivariate data (not common)
Do inference based on the output from a
statistics software program (very
common)
 We will do both, but raw data first…
Basics
• Our LSRL is simply an estimator of the
  true LSRL (which would be based on a
  census of the entire population).
$\hat{y} = a + bx$
where ŷ is the predicted y value, a is the estimated y-int,
and b is the estimated slope.
Basics
• Real LSRL (from a census)
$\mu_y = \alpha + \beta x$
where μy is the mean y-value for that given x-value,
α is the true y-intercept, and β is the true slope.
(Pretty drawing stolen with
love from Gabriel Tang
who stole it from someone
else)
Basics
• For a given x-value, the y’s will vary
  normally about μy. The spread is estimated by the
  standard error of the residuals.
Basics
• So….to do inference, we need the following statistics:
  a – an estimate of α, the y-int of μy (LinReg)
  b – an estimate of β, the slope of μy (LinReg)
  sb – the standard error of the slope of the
  regression line (on computer output, not TI, I will
  show you a trick to find it)
  s – the standard error of the residuals (on
  computer and TI output)
Basics
$\hat{y} = b_0 + b_1 x$
• Check out the formulas on formula sheet:
• Make sure you can use them/know what they are
• Great multiple choice fodder…
  • i.e. given r, sy, and b1 find sx
[Formula sheet entries, shown on the slide as placeholders: b1 = blob, b0 = blob, r = blob, b1 = blob, sb1 = blob]
Basics - SE Residuals
$s = \sqrt{\frac{1}{n-2}\sum (y - \hat{y})^2}$
Why n-2? Just deal.
(TI will calculate if you have the raw data!)
Basics – SE Slope
$s_b = \frac{s}{\sqrt{\sum (x - \bar{x})^2}}$
where s is the SE of Residuals
(TI will NOT calculate – Boff has a trick)
Standard Error of Slope Trick
When we do LinRegTTest on TI, it gives us:
T-value
s (SE of Resids)
Now the t-value formula is:
$t = \frac{b}{s_b}$, therefore $s_b = \frac{b}{t}$
Yee haw, plug and chug!
Significance Test for Slope of a
Regression Line – From Data
 Idea: we are checking to see if this LSRL is
a good fit for our data
 Usual test is whether the slope of the
regression line is ZERO (no relationship) or
NOT ZERO (a relationship) – Two Sided
 Sometimes we look to see if relationship is
positive or negative (>0 or <0, respectively)
– One Sided
Let’s do it – Mom Age/Baby wt
Parameters:
Let β represent the true slope of the
regression line.
Hypotheses:
H0: β = 0, No relationship between birth
weight and mother’s age
Ha: β ≠ 0, There is a relationship between
birth weight and mother’s age
Let’s do it – Mom Age/Baby wt
 Assumptions:
Linear Relationship (show resids, talk
about r)
Variance about the line is both constant
and normal (show resids and comment)
 WHAT TO LOOK FOR:
YUCK – Curvy!
YUCK – Not constant variance
(Better check for outlier here)
Let’s do it – Mom Age/Baby wt
 Assumptions:
Linear
Relationship
Variance about
the line is both
constant and
normal
 How do ours
look?
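Let’s do it – Mom Age/Baby wt
• TEST – T Test for a Linear Regression
• On TI – STAT>TEST>LINREGTTEST
• Fill in Blanks
$t = \frac{b}{s_b} = $
Let’s do it – Mom Age/Baby wt
• P-Value
• df = 10-2 = 8
• 2 x P(t >      ) =
[Sketch: t curve centered at t = 0 (β = 0) with both tails beyond -t and +t shaded; the two shaded areas together are the Total P]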
• Interpretation:
  • P-Value is very small so we will reject H0 in favor of
    the alternative. There is strong evidence of a linear
    relationship between birth weight and a mother’s
    age.
Remember, we can never say one causes the
other unless we have a well designed and
controlled experiment
Confidence Intervals (from data)
• Similar to other CI, but remember we are
  looking at slope…
$b \pm t^* s_b$
• b: our estimated slope from LinReg
• t*: get from table, need df (n-2) and confidence level
• sb: remember Riebhoff’s sneaky trick?
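Putting the regression formulas together for the mother's age data, the sketch below computes s, sb, the t statistic, and a 95% interval for the slope straight from the raw data. It is only an illustration in plain Python: the names are made up, and t* = 2.306 (95% confidence, df = 8) is entered from a t table.

```python
from math import sqrt

ages = [15, 17, 18, 15, 16, 19, 17, 16, 18, 19]
weights = [5.0, 6.3, 7.3, 5.8, 6.8, 6.6, 6.9, 5.6, 7.5, 8.1]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(weights) / n
sxx = sum((x - x_bar) ** 2 for x in ages)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, weights))

b = sxy / sxx
a = y_bar - b * x_bar

# Standard error of the residuals: divide by n - 2, not n - 1.
residuals = [y - (a + b * x) for x, y in zip(ages, weights)]
s = sqrt(sum(e ** 2 for e in residuals) / (n - 2))

s_b = s / sqrt(sxx)          # standard error of the slope
t_stat = b / s_b             # test statistic for H0: beta = 0
df = n - 2

t_star = 2.306               # table value, C = .95, df = 8
ci = (b - t_star * s_b, b + t_star * s_b)

print(f"s = {s:.4f}, sb = {s_b:.4f}, t = {t_stat:.3f}, df = {df}")
print(f"95% CI for slope: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Dividing b by the printed t gives back sb, which is the "sneaky trick" for getting sb out of LinRegTTest.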
Part 2, Computer Output
• Data collected by counting cricket
  chirps in 15 seconds and noting current
  temperature.
• Output (from some statistical software):
TEMP = 44.0 + 0.993 NUMBER
Predictor   Coef    STDev   T-Ratio   P
CONSTANT    44.01   1.827   24.09     .000
NUMBER      0.993   0.065   15.23     .000
S = 1.538   R-sq = 95.9%   R-sq(adj) = 95.5%
Interpreting Computer Output
This is the LSRL for predicting the temperature
based on the number of chirps.
Interpreting Computer Output
The Coef for the CONSTANT is the y-intercept
(a) of the LSRL
The Coef for the NUMBER is the Slope (b) of
the LSRL
Interpreting Computer Output
STDev of NUMBER is the standard error of the
slope (Sb).
Interpreting Computer Output
T-ratio of NUMBER is the value of the t test
statistic. Check it:
$t = \frac{b}{s_b}$
P is the P-value associated with the t-ratio
(assuming a 2-sided test – if you are using a 1-sided test, divide by 2).
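The "check it" step is a one-line division. The sketch below does it for both rows of the printed output; the small mismatch for the NUMBER row is presumably because the software divides unrounded values while the printout shows rounded ones (an assumption, since the software is not named).

```python
# Values copied from the printed regression output.
coef_constant, se_constant, printed_t_constant = 44.01, 1.827, 24.09
coef_number, se_number, printed_t_number = 0.993, 0.065, 15.23

# T-Ratio = Coef / STDev for each row.
t_constant = coef_constant / se_constant
t_number = coef_number / se_number

print(f"CONSTANT: {t_constant:.2f} (printed {printed_t_constant})")
print(f"NUMBER:   {t_number:.2f} (printed {printed_t_number})")
```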
Interpreting Computer Output
S is the standard error of the residuals (s)
Interpreting Computer Output
R-sq is r-squared (coefficient of determination
– r2)
Interpreting Computer Output
TEMP = 44.0 + 0.993 NUMBER
Predictor   Coef    STDev   T-Ratio   P
CONSTANT    44.01   1.827   24.09     .000
NUMBER      0.993   0.065   15.23     .000
S = 1.538   R-sq = 95.9%   R-sq(adj) = 95.5%
Ignore these, we do
not need!
Significance Test – From
Output
 Same as before, but maybe easier…
 Here we go…
Let’s do it – Chirps/Temp
Parameters:
Let β represent the true slope of the
regression line.
Hypotheses:
H0: β = 0, No relationship between # of
chirps and temperature
Ha: β ≠ 0, There is a relationship between
# of chirps and temperature
Let’s do it – Chirps/Temp
 Assumptions:
Linear Relationship (no resids, talk about
r)
Variance about the line is both constant
and normal (we must assume….)
Let’s do it – Chirps/Temp
TEMP = 44.0 + 0.993 NUMBER
Predictor   Coef    STDev   T-Ratio   P
CONSTANT    44.01   1.827   24.09     .000
NUMBER      0.993   0.065   15.23     .000
S = 1.538   R-sq = 95.9%   R-sq(adj) = 95.5%
• TEST – T Test for a Linear Regression
• Fill in Blanks From Output
$t = \frac{b}{s_b} = $
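Let’s do it – Chirps/Temp
• P-Value
• df = not given
• 2 x P(t >      ) =
[Sketch: t curve centered at t = 0 (β = 0) with both tails beyond -t and +t shaded; the two shaded areas together are the Total P]
• Interpretation: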
DONE
DONE DONE DONE DONE DONE DONE
DONE DONE DONE DONE DONE DONE
DONE DONE DONE DONE DONE DONE
DONE DONE DONE DONE DONE DONE
DONE DONE DONE DONE DONE DONE
DONE DONE DONE DONE DONE DONE
DONE DONE DONE DONE DONE DONE
DONE DONE DONE DONE DONE DONE
DONE DONE DONE DONE DONE DONE
DONE DONE DONE DONE DONE YEAH!
