Introduction to Probability and Statistics Eleventh Edition


Welcome to Pstat5E:
Statistics with Economics and
Business Applications
Solution to Practice Final Exam
Yuedong Wang
Practice Midterm
1. Each year, billions of dollars are spent at theme parks
owned by Disney, Universal Studios, Sea World and
others. A management consultant claims that 20% of
trips include a theme park visit. A survey of 1233
randomly selected people who took trips revealed that
111 of them visited a theme park.
(i) Construct a 95% confidence interval for the
proportion of trips that include a theme park visit.
(ii) Do these data support the consultant's claim?
Solution: (i) We have a binomial experiment with
p = proportion of trips that include a theme park visit.
A 95% confidence interval for p is
\[ \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = \frac{111}{1233} \pm 1.96\sqrt{\frac{(111/1233)(1 - 111/1233)}{1233}} = .0900 \pm .0160, \]
or (.0740, .1060).
(ii) Since the interval does not contain the value
.2 (20%), the consultant’s claim is not supported.
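The arithmetic in this solution can be checked with a short script. This is a minimal sketch, assuming the large-sample interval used above; the function name `proportion_ci` is made up for illustration.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Large-sample confidence interval for a binomial proportion."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin, (p_hat - margin, p_hat + margin)

# Survey data from the problem: 111 theme park visits among 1233 trips.
p_hat, margin, (lower, upper) = proportion_ci(111, 1233)
print(f"p_hat = {p_hat:.4f}, margin = {margin:.4f}")
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
print("claimed value .20 inside the interval:", lower <= 0.20 <= upper)
```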
2. A mathematical proficiency test was given to
randomly selected 13-year-old male and female
students. The following table gives the sample mean
scores and standard deviations:

                 Male Students   Female Students
Sample size      905             905
Sample mean      474.6           473.2
Sample Std Dev   192.5           153.4
(i) Estimate the difference in mean scores between male
students and female students and construct a 95%
confidence interval.
(ii) Can you conclude that the mean scores are different
for male and female students?
Solution: (i) Denote μ1=mean score for male students,
μ2=mean score for female students.
The point estimate of the difference, μ1-μ2, is
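\[ \bar{x}_1 - \bar{x}_2 = 474.6 - 473.2 = 1.4. \]
Since both sample sizes are large, a 95% confidence interval for \(\mu_1 - \mu_2\) is
\[ (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = 1.4 \pm 1.96\sqrt{\frac{192.5^2}{905} + \frac{153.4^2}{905}} = 1.4 \pm 16.04, \]
or (-14.64, 17.44).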
(ii) Since the confidence interval contains zero, we would
not conclude that the mean scores are different
between male and female students.
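As a check on the numbers, the interval can be recomputed from the summary statistics. This is a sketch under the same large-sample assumption; `two_mean_ci` is an illustrative name, not part of the course material.

```python
import math

def two_mean_ci(mean1, sd1, n1, mean2, sd2, n2, z=1.96):
    """Large-sample confidence interval for the difference of two means."""
    diff = mean1 - mean2
    margin = z * math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return diff, margin, (diff - margin, diff + margin)

# Male students first, female students second, as in the solution.
diff, margin, (lower, upper) = two_mean_ci(474.6, 192.5, 905, 473.2, 153.4, 905)
print(f"difference = {diff:.1f}, margin = {margin:.2f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
print("interval contains zero:", lower < 0 < upper)
```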
3. The paper "The association of marijuana use with
outcome of pregnancy" (Amer. J. Public Health,
1983, pp. 1161-1164) reported the following data on the
incidence of major malformations among newborns,
both for mothers who were marijuana users and for
mothers who did not use marijuana.

                                User     Nonuser
Sample size                     1,246    11,178
Number of major malformations   42       294
(i) Construct a 99% confidence interval for the
difference between the incidence rate among
all mothers who use marijuana and the
incidence rate among all mothers who do not
use marijuana.
(ii) Do these data indicate that the incidence rate
is higher for mothers who use marijuana?
Solution: (i) Denote
p1= incidence rate among all mothers who use
marijuana,
p2= incidence rate among all mothers who do
not use marijuana.
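The sample proportions are
\[ \hat{p}_1 = 42/1246 = .0337, \qquad \hat{p}_2 = 294/11178 = .0263. \]
Since both sample sizes are large, a 99% confidence interval for \(p_1 - p_2\) is
\[ (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = (.0337 - .0263) \pm 2.58\sqrt{\frac{.0337 \times .9663}{1246} + \frac{.0263 \times .9737}{11178}} = .0074 \pm .0138, \]
or (-.0064, .0212).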
(ii) Since the confidence interval contains zero, we
would not conclude that the incidence rate is
higher for mothers who use marijuana.
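The same calculation can be sketched in code. The critical value 2.58 is the table value used in the solution; the function name `two_proportion_ci` is only illustrative.

```python
import math

def two_proportion_ci(x1, n1, x2, n2, z):
    """Large-sample confidence interval for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    margin = z * se
    return diff, margin, (diff - margin, diff + margin)

# Users first, nonusers second; z = 2.58 for 99% confidence.
diff, margin, (lower, upper) = two_proportion_ci(42, 1246, 294, 11178, z=2.58)
print(f"difference = {diff:.4f}, margin = {margin:.4f}")
print(f"99% CI: ({lower:.4f}, {upper:.4f})")
```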
4. A new program has been developed to enrich the
kindergarten experience of children in preparation for
the first grade. Pupils in each classroom are tested at the
beginning of the school year (pretest) and again at the
end of the school year (posttest). The following table
gives the scores of 9 randomly selected students
exposed to the new curriculum (high score=better
performance).
Pupil        1   2   3   4   5   6   7   8   9
x=Pretest    9   6  14  12   9   8  12   8  11
y=Posttest  16  11  14  10  14  12  15  11  14
(i) Apply an appropriate test to decide at the 5% level if
the new curriculum significantly increased pupils'
performance. Follow the five steps in the lecture notes.
(ii) Specify assumptions for the above test.
(iii) Suppose that further study establishes that, in fact,
the population mean score at the beginning is 12.4 and
the mean score at the end of the year is 12.3.
Refer back to part (i). Did your analysis lead to a
(a) Type I error;
(b) Type II error;
(c) Correct decision;
(d) None of (a)-(c).
Circle the correct response.
(iv) Do you change your conclusion in (i) if α=.01?
Solution:
(i)
Since pretest and posttest scores come as pairs for
each pupil, the method we would use is the paired-difference test. Denote
x=pretest score, y=posttest score, d=x-y,
μ1=mean pretest score, μ2=mean posttest score.
Pupil        1   2   3   4   5   6   7   8   9
x=Pretest    9   6  14  12   9   8  12   8  11
y=Posttest  16  11  14  10  14  12  15  11  14
d=x-y       -7  -5   0   2  -5  -4  -3  -3  -3
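\[ H_0: \mu_1 - \mu_2 = 0 \text{ (no improvement)}, \qquad H_a: \mu_1 - \mu_2 < 0 \text{ (increased)} \]
\[ n = 9, \quad \bar{d} = -3.1111, \quad s_d = 2.7131 \]
\[ t^* = \frac{\bar{d}}{s_d/\sqrt{n}} = \frac{-3.1111}{2.7131/\sqrt{9}} = -3.44, \qquad df = n - 1 = 9 - 1 = 8 \]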
p-value (one-sided):
\[ p\text{-value} = P(t < -3.44) = P(t > 3.44) < P(t > 3.355) = .005, \]
so p-value < .005.
Decision: since the p-value is smaller than α = .05, H0 is rejected.
Conclusion: there is strong evidence that the new
curriculum increases performance on average.
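The paired-difference statistic can be reproduced from the raw scores. This is a minimal sketch using only the Python standard library. The critical value 3.355 is the table value quoted in the solution for 8 degrees of freedom; the variable names are illustrative.

```python
import math
import statistics

pretest = [9, 6, 14, 12, 9, 8, 12, 8, 11]
posttest = [16, 11, 14, 10, 14, 12, 15, 11, 14]

# Differences d = x - y, one per pupil.
d = [x - y for x, y in zip(pretest, posttest)]
n = len(d)
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)  # sample standard deviation (divisor n - 1)
t_stat = d_bar / (s_d / math.sqrt(n))

# Table value from the solution: P(t > 3.355) = .005 with df = 8.
T_005_DF8 = 3.355
p_below_005 = t_stat < -T_005_DF8

print(f"d_bar = {d_bar:.4f}, s_d = {s_d:.4f}, t* = {t_stat:.2f}, df = {n - 1}")
print("one-sided p-value below .005:", p_below_005)
```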
(ii) The differences, d=x-y, are independent for different
pupils and have the same normal distribution.
(iii) With μ1=12.4 and μ2=12.3, there is no improvement, so H0 is true.
Since we rejected H0, we committed a Type I error. Circle (a).
(iv) Since p-value < .01, we still reject H0.
5. An automobile manufacturer recommends that
any purchaser of one of its new cars bring it in
to a dealer for a 3000-mile checkup. The
company wishes to know whether the true
average mileage for initial servicing differs
from 3000.
(i) A random sample of 20 recent purchasers
resulted in a sample average mileage of 3108
and a sample standard deviation of 273 miles.
Does the data suggest that true average
mileage for this checkup is something other
than the recommended value? Use α=.01 and
follow the five steps in the lecture notes.
(ii) In (i), instead of 20, suppose that the
manufacturer selected 50 recent purchasers
and got the same sample mean and standard
deviation as in (i). Does the data suggest that
true average mileage for this checkup is
something other than the recommended value?
Use α=.01.
(iii) In (ii), what is the smallest significance
level at which you would reject the null hypothesis?
(iv) Specify assumptions for the tests in (i) and
(ii).
Solution:
(i) Denote μ=true average mileage of cars brought
to the dealer for 3000-mile checkups.
\[ H_0: \mu = 3000, \qquad H_a: \mu \neq 3000 \]
\[ n = 20 \text{ (small sample size)}, \quad \bar{x} = 3108, \quad s = 273 \]
\[ t^* = \frac{\bar{x} - 3000}{s/\sqrt{n}} = \frac{3108 - 3000}{273/\sqrt{20}} = 1.769, \qquad df = n - 1 = 20 - 1 = 19 \]
p-value (two-sided): p-value = 2 × P(t > 1.769).
Since 1.769 is between \(t_{.050} = 1.729\) and \(t_{.025} = 2.093\),
.05 < p-value < .1.
Decision: since the p-value is larger than α = .01,
H0 is not rejected.
Conclusion: there is insufficient evidence to
indicate that the true average initial checkup
mileage differs from the manufacturer's
recommended value.
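The small-sample test can be sketched from the summary statistics alone. The Python standard library has no t distribution, so this sketch brackets the p-value with the two table values quoted in the solution for 19 degrees of freedom, rather than computing it exactly. The function name `t_statistic` is illustrative.

```python
import math

def t_statistic(x_bar, mu0, s, n):
    """One-sample t statistic computed from summary statistics."""
    return (x_bar - mu0) / (s / math.sqrt(n))

t_stat = t_statistic(x_bar=3108, mu0=3000, s=273, n=20)

# Table values quoted in the solution for df = 19
# (upper-tail areas .050 and .025).
T_050, T_025 = 1.729, 2.093

# |t*| between the two values means the two-sided p-value lies
# between 2 * .025 = .05 and 2 * .05 = .10.
if T_050 < abs(t_stat) < T_025:
    verdict = "0.05 < p-value < 0.10, so H0 is not rejected at alpha = 0.01"
else:
    verdict = "t* falls outside the bracket quoted in the solution"

print(f"t* = {t_stat:.3f}; {verdict}")
```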
(ii)
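\[ H_0: \mu = 3000, \qquad H_a: \mu \neq 3000 \]
\[ n = 50 \text{ (large sample)}, \quad \bar{x} = 3108, \quad s = 273 \]
\[ z^* = \frac{\bar{x} - 3000}{s/\sqrt{n}} = \frac{3108 - 3000}{273/\sqrt{50}} = 2.80 \]
p-value (two-sided):
\[ p\text{-value} = 2 \times P(z > 2.80) = 2 \times (.5 - .4974) = .0052 \]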
Decision: since the p-value is smaller than α = .01, H0 is
rejected.
Conclusion: there is strong evidence to indicate that the
true average initial checkup mileage differs from the
manufacturer's recommended value.
(iii) The smallest significance level at which the null
hypothesis is rejected equals the p-value, .0052.
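For the large-sample case the p-value can be computed directly from the standard normal distribution. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later). Because it keeps the unrounded z statistic, its p-value agrees with the table-based .0052 only up to rounding.

```python
import math
from statistics import NormalDist

x_bar, mu0, s, n = 3108, 3000, 273, 50

z_stat = (x_bar - mu0) / (s / math.sqrt(n))

# Two-sided p-value: twice the upper-tail area beyond |z*|.
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

print(f"z* = {z_stat:.2f}, two-sided p-value = {p_value:.4f}")
print("reject H0 at alpha = .01:", p_value < 0.01)
```

The p-value is the smallest significance level at which H0 would be rejected, so any α above it leads to rejection.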
(iv) For (i), we need to assume that the sample has been
randomly selected from a normally distributed
population. For (ii), the normality assumption is not
needed.
6. In planning for a meeting with accounting
majors, the head of the Accounting Program
wants to emphasize the importance of doing
well in the major courses to get better-paying
jobs after graduation. To support this point, he
plans to show that there is a strong relationship
between starting salaries for recent accounting
graduates and their grade-point average (GPA)
in the major courses. Records for seven of last
year's accounting graduates are selected at
random:
GPA in major courses   Starting salary (in thousands of dollars)
2.58                   16.5
3.27                   18.8
3.85                   19.5
3.50                   19.2
3.33                   18.5
2.89                   16.6
2.23                   15.6

\[ \sum x_i = 21.65, \quad \sum x_i^2 = 68.84, \quad \sum y_i = 124.70, \quad \sum y_i^2 = 2235.75, \]
\[ \sum x_i y_i = 390.69, \quad S_{xx} = 1.88, \quad S_{yy} = 14.31, \quad S_{xy} = 5.01 \]
(i) What are the dependent and independent variables?
(ii) Find and report the least-squares regression line.
(iii) How much of the variability in starting salary is
explained by the GPA in major courses?
(iv) Find a 95% confidence interval for the slope.
Interpret the point and interval estimates of the slope.
(v) Obtain a 95% confidence interval for the expected
starting salary of all graduates with major GPA 3.0.
(vi) Obtain a 95% prediction interval for the starting
salary of a graduate with major GPA 3.0.
(vii) Suppose 5 graduates each have major GPA 3.0. Do
you expect these 5 graduates to have exactly the same
starting salary?
Solution:
(i) x=Independent variable=GPA in major
courses
y=dependent variable=starting salary
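(ii)
\[ \hat{\beta} = \frac{S_{xy}}{S_{xx}} = \frac{5.01}{1.88} = 2.66 \]
\[ \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} = \frac{124.70}{7} - 2.66 \times \frac{21.65}{7} = 9.59 \]
\[ \hat{y} = 9.59 + 2.66x \]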
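The fit can also be computed directly from the seven data pairs. This is a sketch with illustrative variable names. It does not round the intermediate sums, so the estimates differ slightly in the second decimal from the values above, which divide the rounded S_xy = 5.01 by the rounded S_xx = 1.88.

```python
gpa = [2.58, 3.27, 3.85, 3.50, 3.33, 2.89, 2.23]
salary = [16.5, 18.8, 19.5, 19.2, 18.5, 16.6, 15.6]  # thousands of dollars

n = len(gpa)
x_bar = sum(gpa) / n
y_bar = sum(salary) / n

# Corrected sums of squares and cross-products.
s_xx = sum((x - x_bar) ** 2 for x in gpa)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(gpa, salary))

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

print(f"S_xx = {s_xx:.4f}, S_xy = {s_xy:.4f}")
print(f"fitted line: y_hat = {intercept:.2f} + {slope:.2f} x")
```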
(iii)
\[ SS_{Total} = S_{yy}, \qquad SSR = \frac{S_{xy}^2}{S_{xx}} \]
\[ r^2 = \frac{SSR}{SS_{Total}} = \frac{S_{xy}^2}{S_{xx}S_{yy}} = \frac{5.01^2}{1.88 \times 14.31} = .93 \]
93% of the variability in starting salary is
explained by the GPA in major courses.
(iv)
\[ SSE = SS_{Total} - SSR = 14.31 - 5.01^2/1.88 = .96 \]
\[ \hat{\sigma}^2 = SSE/(n-2) = .96/5 = .19, \qquad \hat{\sigma} = \sqrt{.19} = .44 \]
A 95% confidence interval for β is
\[ \hat{\beta} \pm t_{\alpha/2}\,\hat{\sigma}/\sqrt{S_{xx}} = 2.66 \pm 2.57 \times .44/\sqrt{1.88} = 2.66 \pm .82, \]
or (1.84, 3.48).
When GPA increases by 1 unit, the starting salary
increases by an estimated 2,660 dollars on average, and we are
95% confident that the true increase in mean starting salary
associated with a one-unit increase in GPA is between
1,840 and 3,480 dollars.
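The slope interval can be recomputed from the summary statistics. This sketch starts from the rounded values S_xx = 1.88, S_yy = 14.31 and S_xy = 5.01 given in the problem, and uses 2.57 as the table value of t with 5 degrees of freedom, as in the solution. It does not round intermediate results, so the endpoints match the solution only to within rounding.

```python
import math

# Summary statistics from the problem statement.
S_XX, S_YY, S_XY = 1.88, 14.31, 5.01
N = 7
T_CRIT = 2.57  # t_{.025} with N - 2 = 5 df, as used in the solution

slope = S_XY / S_XX
sse = S_YY - S_XY ** 2 / S_XX          # SSE = SS_Total - SSR
sigma_hat = math.sqrt(sse / (N - 2))   # residual standard deviation
margin = T_CRIT * sigma_hat / math.sqrt(S_XX)
lower, upper = slope - margin, slope + margin

print(f"SSE = {sse:.2f}, sigma_hat = {sigma_hat:.2f}")
print(f"slope = {slope:.2f} +/- {margin:.2f}")
```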
(v)
When \(x_0 = 3\), \(\hat{y} = 9.59 + 2.66 \times 3 = 17.57\).
A 95% confidence interval for the average of y at \(x_0 = 3\) is
\[ \hat{y} \pm t_{\alpha/2}\,\hat{\sigma}\sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}} = 17.57 \pm 2.57 \times .44\sqrt{\frac{1}{7} + \frac{(3 - 21.65/7)^2}{1.88}} = 17.57 \pm .43, \]
or (17.14, 18.00).
(vi)
A 95% prediction interval for y at \(x_0 = 3\) is
\[ \hat{y} \pm t_{\alpha/2}\,\hat{\sigma}\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}} = 17.57 \pm 2.57 \times .44\sqrt{1 + \frac{1}{7} + \frac{(3 - 21.65/7)^2}{1.88}} = 17.57 \pm 1.21, \]
or (16.36, 18.78).
(vii) No. Individual starting salaries vary randomly around the
mean salary for graduates with GPA 3.0.
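The two intervals at GPA 3.0 differ only by the extra "1 +" under the square root, which accounts for the variation of an individual salary around the mean. A minimal sketch, assuming the rounded values from the worked solution (σ̂ = .44, t = 2.57, ŷ = 17.57); the function name `half_widths` is illustrative.

```python
import math

# Rounded values taken from the worked solution.
N, SUM_X, S_XX = 7, 21.65, 1.88
SIGMA_HAT, T_CRIT = 0.44, 2.57
Y_HAT = 17.57  # fitted salary (thousands of dollars) at GPA 3.0

def half_widths(x0):
    """Return (mean-response, single-prediction) interval half-widths at x0."""
    x_bar = SUM_X / N
    h = 1 / N + (x0 - x_bar) ** 2 / S_XX
    mean_hw = T_CRIT * SIGMA_HAT * math.sqrt(h)
    pred_hw = T_CRIT * SIGMA_HAT * math.sqrt(1 + h)
    return mean_hw, pred_hw

mean_hw, pred_hw = half_widths(3.0)
print(f"mean salary at GPA 3.0:   {Y_HAT} +/- {mean_hw:.2f}")
print(f"one graduate at GPA 3.0:  {Y_HAT} +/- {pred_hw:.2f}")
```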