p - Michigan State University`s Statistics and Probability
Review for Final Exam
• Two statistical inference methods:
Confidence interval
Estimator +/- Margin of Error
Hypothesis testing
Hypothesis: H0 v.s. Ha
Test statistic
P-value
Conclusion
Review for Final Exam
[Decision chart: for each question, choose a confidence interval (C.I.) or a test, the parameter μ or p, and one-sample (1-S) or two-sample (2-S).]
Review for Final Exam
• Inference about population proportion p
Confidence interval:
A level C confidence interval for p is given by
$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
where the square-root term is the standard error, z* is a z-critical value corresponding to the
confidence level C, n is the sample size, and $\hat{p}$ is the
sample proportion.
Review for Final Exam
• Inference about population proportion p
The level C confidence interval for a population
proportion p will have margin of error approximately
equal to a specified value m when the sample size is
$n = \left(\frac{z^*}{m}\right)^2 p^*(1-p^*)$
where p* is a guessed value for the sample proportion.
The margin of error will be at most m if p* is taken to
be 0.5.
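The sample-size formula n = (z*/m)^2 p*(1-p*) can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: the function name `required_sample_size` and its parameters are hypothetical, z* is taken from the standard library's normal distribution rather than a table, and the result is rounded up so the margin of error does not exceed m.

```python
import math
from statistics import NormalDist


def required_sample_size(margin, confidence=0.95, p_guess=0.5):
    """Smallest n whose margin of error for a proportion C.I. is at most `margin`.

    p_guess is the guessed proportion p*; 0.5 is the conservative choice.
    """
    # z* is the critical value leaving (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    n = (z_star / margin) ** 2 * p_guess * (1 - p_guess)
    # Round up: a fractional sample size must be increased, never truncated.
    return math.ceil(n)


print(required_sample_size(0.034))                # conservative guess p* = 0.5
print(required_sample_size(0.034, p_guess=0.28))  # guess from a previous survey
```

Taking p* = 0.5 maximizes p*(1-p*), which is why it guarantees the margin of error is at most m whatever the true proportion is.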
Review for Final Exam
• Inference about population proportion p
Hypothesis testing
Hypotheses:
H0:p=p0 v.s. Ha:p>p0/p<p0/p≠p0
Test Statistic:
$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$
Review for Final Exam
• Inference about population proportion p (continued):
Hypothesis testing:
P-value:
P-value=1-Φ(z), for Ha:p>p0
P-value=Φ(z), for Ha:p<p0
P-value=2(1-Φ(|z|)), for Ha:p≠p0
Here z is the value of the test statistic and Φ(z) is
the probability from the normal table
corresponding to z.
Conclusion:
Reject H0 if P-value<α
Do not reject H0 if P-value>α
Review for Final Exam
• Inference about population mean μ
Confidence interval:
A level C confidence interval for μ is given by
$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$
where $s/\sqrt{n}$ is the standard error, t* is the t-critical value corresponding to
degrees of freedom n-1 and the confidence level C,
n is the sample size, s is the sample standard
deviation, and $\bar{x}$ is the sample mean.
Review for Final Exam
• Inference about population mean μ
Hypothesis testing:
Hypotheses:
H0:μ=μ0 v.s. Ha:μ>μ0/μ<μ0/μ≠μ0
Test Statistic:
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
The test statistic follows a t-distribution with
degrees of freedom n-1.
Review for Final Exam
• Inference about population mean μ
Hypothesis testing:
P-value:
P-value=Tdf(t), for Ha:μ>μ0
P-value=Tdf(-t), for Ha:μ<μ0
P-value=2Tdf(|t|), for Ha:μ≠μ0
Here Tdf(t) denotes the upper-tail probability P(T > t) for a t-distribution with df degrees of
freedom, found from the t-Critical Values Table for the test statistic t.
Conclusion:
Reject H0 if P-value<α
Do not reject H0 if P-value>α
Review for Final Exam
• Interpretation about hypothesis testing
P-value is the probability, assuming the null
hypothesis is true, that the test statistic will take a
value as extreme or more extreme (meaning favoring
the alternative hypothesis Ha) than that actually
observed.
Caution: P-value is NOT the probability that the
null hypothesis is wrong.
Review for Final Exam
• Interpretation about hypothesis testing
Type I error: reject H0 while H0 is true
Type II error: do not reject H0 while H0 is false
The significance level α is our tolerance for the
probability of making type I error.
The P-value is the smallest significance level α at which
we would reject the null hypothesis based on our
sample.
If the consequences of rejecting the null hypothesis
are very serious, we want to be conservative at
rejecting H0. Therefore, we should choose a small α.
Review for Final Exam – Practice
• In a survey conducted by a firm, 12 of 60 families in
two-story houses were found to own their houses. Let p
denote the population proportion of families in two-story
houses who own their house.
Find a 95% confidence interval for p.
The firm came up with a confidence interval (0.1406,
0.2594) for p. What confidence level did the firm use?
Assume nothing is known about p. The firm requires a
95% confidence interval with margin of error at most
0.034 for p. What is the required sample size?
Suppose that a previous survey indicates that the p is
0.28. The firm requires a 95% confidence interval with
margin of error at most 0.034 for p. What is the
required sample size?
Review for Final Exam – Practice
• Solution:
Find a 95% confidence interval (C.I.) for p.
In general, a level C C.I. for p is given by
$\left(\hat{p} - z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right)$
In this case,
$\hat{p}$=12/60=0.2;
n=60;
z*=1.96 (according to the 95% confidence level)
Thus a 95% C.I. for p is
$\left(0.2 - 1.96\sqrt{\frac{0.2(1-0.2)}{60}},\ 0.2 + 1.96\sqrt{\frac{0.2(1-0.2)}{60}}\right) = (0.0988, 0.3012)$
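The interval for 12 owners out of 60 families can be reproduced with a short Python sketch. The function name `proportion_ci` is hypothetical, and z* comes from the standard library's `NormalDist` rather than a table, so the critical value is 1.95996... instead of the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Large-sample confidence interval for a proportion: p_hat +/- z* * SE."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * standard_error
    return p_hat - margin, p_hat + margin


low, high = proportion_ci(12, 60)
print(f"95% C.I. for p: ({low:.4f}, {high:.4f})")
```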
Review for Final Exam – Practice
• Solution:
The firm came up with a confidence interval (0.1406,
0.2594) for p. What confidence level did the firm use?
Confidence interval for p can also be given by $\hat{p} \pm ME$
where ME is the margin of error: $ME = z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
In this case,
ME=0.2594-0.2=0.0594
The standard error is
$SE = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\frac{0.2(1-0.2)}{60}} = 0.0516$
Then z*=ME/SE=0.0594/0.0516=1.15, which
corresponds to confidence level 75%.
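Working backwards from an interval to its confidence level can also be sketched in Python. The helper name `implied_confidence_level` is hypothetical. It uses the relation z* = ME/SE from above and then converts z* to a central area under the standard normal curve, C = 2Φ(z*) - 1.

```python
from math import sqrt
from statistics import NormalDist


def implied_confidence_level(p_hat, n, upper_limit):
    """Recover z* and the confidence level from the upper end of a C.I. for p."""
    margin = upper_limit - p_hat
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    z_star = margin / standard_error
    # Central area between -z* and +z* under the standard normal curve.
    level = 2 * NormalDist().cdf(z_star) - 1
    return z_star, level


z_star, level = implied_confidence_level(0.2, 60, 0.2594)
print(f"z* = {z_star:.2f}, confidence level = {level:.0%}")
```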
Review for Final Exam – Practice
• Solution:
Assume nothing is known about p. The firm requires a
95% C.I. with margin of error at most 0.034 for p. What
is the required sample size?
The required sample size for a level C
(corresponding to z*) C.I. for p with margin of
error approximately equal to m is
$n = \left(\frac{z^*}{m}\right)^2 p^*(1-p^*)$
In this case:
z*=1.96, p*=0.5, m=0.034
Then $n = \left(\frac{1.96}{0.034}\right)^2 0.5(1-0.5) = 830.8 \approx 831.$
Review for Final Exam – Practice
• Solution:
Suppose that a previous survey indicates that the p is
0.28. The firm requires a 95% C.I. with margin of error
at most 0.034 for p. What is the required sample size?
The required sample size for a level C
(corresponding to z*) C.I. for p with margin of
error approximately equal to m is
$n = \left(\frac{z^*}{m}\right)^2 p^*(1-p^*)$
In this case:
z*=1.96, p*=0.28, m=0.034
Then $n = \left(\frac{1.96}{0.034}\right)^2 0.28(1-0.28) = 669.95 \approx 670.$
Review for Final Exam – Practice
• To target the right age-group of people, a marketing
consultant must find which age-group purchases from
home-shopping channels on TVs more frequently.
According to the management of TeleSell24/7, a home-shopping store on TV, about 40% of the online-music-downloaders are in their fifties, but the marketing
consultant does not believe that figure. To test this he
selects a random sample of 205 online-music-downloaders and finds 71 of them are in their fifties.
What are the hypotheses in this case?
What is the value of the test statistic?
What is the P-value of the test?
What is your conclusion at α=5%?
Review for Final Exam – Practice
• Solution:
The sample:
$\hat{p} = \frac{71}{205} = 0.346,\ n = 205$
What are the hypotheses in this case?
H0:p=0.4 v.s. Ha:p≠0.4
What is the value of the test statistic?
$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} = \frac{0.346 - 0.40}{\sqrt{\frac{0.40(1-0.40)}{205}}} = -1.58$
Review for Final Exam – Practice
• Solution:
What is the P-value of the test?
According to Ha:p≠0.4, P-value=2(1-Φ(|1.58|))=0.1141.
What is your conclusion?
Since P-value>α(=5%), we do not reject the null
hypothesis.
If we concluded that 40% of the online-music-downloaders are in their fifties while in fact this
proportion is 35%, then
we made a Type I Error.
we made a Type II Error.
we made a correct decision.
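The one-proportion z test used in this solution can be sketched in Python. The function name `one_proportion_z_test` and the `alternative` labels are hypothetical, and Φ is evaluated with the standard library's `NormalDist` instead of a normal table. Because the code uses the unrounded sample proportion 71/205, its z and P-value differ slightly from the hand calculation, which rounds the proportion to 0.346 first; the conclusion at α = 5% is the same.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alternative="two-sided"):
    """Return (z, P-value) for H0: p = p0 against the chosen alternative."""
    p_hat = successes / n
    # The standard error uses p0 because the test assumes H0 is true.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    phi = NormalDist().cdf
    if alternative == "greater":      # Ha: p > p0
        p_value = 1 - phi(z)
    elif alternative == "less":       # Ha: p < p0
        p_value = phi(z)
    else:                             # Ha: p != p0
        p_value = 2 * (1 - phi(abs(z)))
    return z, p_value


z, p_value = one_proportion_z_test(71, 205, 0.40)
alpha = 0.05
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```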
Review for Final Exam – Practice
• The safety management of an offshore oil-mining
corporation believes that the true average escape time
would be at most 340 min. A sample of 28 offshore oil workers took part in a simulated escape exercise. The
sample yielded an average escape time of 347.68 min.
and standard deviation of 26.95 min. Does this data
contradict the management's claim?
What are the hypotheses in this case?
What is the value of the test statistic?
What is the P-value of the test?
What is your conclusion at α=5%?
What is a 98% confidence interval of the average
escape time?
Review for Final Exam – Practice
• Solution:
The sample:
$\bar{x} = 347.68,\ s = 26.95,\ n = 28.$
What are the hypotheses in this case?
H0:μ=340 v.s. Ha:μ>340
What is the value of the test statistic?
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{347.68 - 340}{26.95/\sqrt{28}} = 1.508$
The test statistic follows a t-distribution with
degrees of freedom 28-1=27.
Review for Final Exam – Practice
• Solution:
What is the P-value of the test?
According to Ha:μ>340,
P-value is between 0.05 and 0.10.
Review for Final Exam – Practice
• Solution:
What is your conclusion?
Since P-value>α(=5%), we do not reject the null
hypothesis.
If we concluded that the management's claim is
correct while in fact average escape time is 340 min.,
then
we made a Type I Error.
we made a Type II Error.
we made a correct decision.
Review for Final Exam – Practice
• Solution:
What is a 98% confidence interval of the average
escape time?
A level C confidence interval for μ is given by
$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$
We have
t*=2.473 (corresponding to degrees of freedom 27
and the confidence level 98%);
n=28, s=26.95, and $\bar{x}$=347.68.
So a 98% confidence interval of the average
escape time is
$347.68 \pm 2.473 \frac{26.95}{\sqrt{28}} = (335.0848, 360.2752).$
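The t statistic and the t interval for the escape-time data can be checked with a small Python sketch. The function names `t_statistic` and `mean_ci` are hypothetical. The standard library has no t-distribution, so the critical value t* = 2.473 (27 degrees of freedom, 98% confidence) is passed in from the table rather than computed.

```python
from math import sqrt


def t_statistic(x_bar, mu0, s, n):
    """One-sample t statistic: (x_bar - mu0) / (s / sqrt(n)), with n - 1 df."""
    return (x_bar - mu0) / (s / sqrt(n))


def mean_ci(x_bar, s, n, t_star):
    """Confidence interval x_bar +/- t* * s / sqrt(n); t* comes from a t table."""
    margin = t_star * s / sqrt(n)
    return x_bar - margin, x_bar + margin


t = t_statistic(347.68, 340, 26.95, 28)
low, high = mean_ci(347.68, 26.95, 28, t_star=2.473)
print(f"t = {t:.3f} with {28 - 1} degrees of freedom")
print(f"98% C.I.: ({low:.4f}, {high:.4f})")
```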
Review for Final Exam – Practice
• In a test of hypothesis, if we insist on very strong
evidence against the null hypothesis we should
choose α to be very small
choose α to be larger than the P-value
choose α to be very large
choose α to be smaller than the P-value
Review for Final Exam – Practice
• Based on a random sample of 50 students from
among 40,000, a 91 percent confidence interval on
the mean height of all 40,000 students was found to
be the interval from 66 inches to 69.2 inches. Select
the correct statement below:
About 91 percent of all 40,000 students have heights
between 66 and 69.2.
About 91 percent of the heights in the sample should
be between 66 and 69.2
The probability that the mean height is between 66
and 69.2 is 91 percent.
About 91 percent of all samples would produce
intervals containing μ
Review for Final Exam – Practice
• In a test of hypotheses, data are deemed to be significant
at level α=0.05, but not significant at level α=0.01. Which
of the following is true about the P-value associated with
this test?
P-value is greater than 0.05.
P-value is between 0.01 and 0.05.
P-value is less than 0.01.
Nothing can be said.
Review for Final Exam
• Sample / Population
• Statistics / Parameters
• Random sampling design
Simple random sample (SRS)
Stratified random sample
Cluster sample
Multistage sample
• Use random digits to draw simple random samples
Review for Final Exam
• Law of large numbers
• Probability: Sample space / Events
• Rules for probability model:
1. For any event A, 0 ≤ P(A) ≤ 1.
2. For sample space S, P(S) = 1.
3. If two events A and B are disjoint, then
P(A or B) = P(A) + P(B).
4. For any event A,
P(A does not occur) = 1 - P(A).
5. For two independent events A and B,
P(A and B) = P(A) X P(B).
• Venn diagram
Review for Final Exam
• General Addition Rule:
For two events A and B,
P(A or B) = P(A) + P(B) – P(A and B).
• General Multiplication Rule
For two events A and B,
P(A and B) = P(B|A) X P(A).
• Conditional probability
$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}$
• Independence: P(B|A) = P(B).
Review for Final Exam
• Random variable:
A random variable is a variable whose value is a
numerical outcome of a random phenomenon.
• Distribution:
The probability distribution (distribution) of a random
variable tells us what values this random variable can
take and how to assign probabilities to those values.
Review for Final Exam
• Statistics are random variables.
Sample proportion
Sample mean
• Central limit theorem
• Sampling distributions of statistics
Review for Final Exam
• Sampling distribution of the sample proportion $\hat{p}$ for an
SRS of size n:
mean of $\hat{p}$ equals the population proportion p;
standard deviation of $\hat{p}$ equals $\sqrt{\frac{p(1-p)}{n}}$;
If the sample size is large, then $\hat{p}$ is approximately
Normal, that is,
$\hat{p} \sim N\left(p, \sqrt{\frac{p(1-p)}{n}}\right).$
Review for Final Exam
• Sampling distribution of the sample mean $\bar{x}$ for an SRS of
size n:
mean of $\bar{x}$ equals the population mean μ;
standard deviation of $\bar{x}$ equals $\frac{\sigma}{\sqrt{n}}$, where σ is the
population standard deviation;
if the sample size is large, then $\bar{x}$ is approximately
normal, that is,
$\bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$;
if the population has a normal distribution, then the
approximation is exact.
Review for Final Exam – Practice
• Motor vehicles sold to individuals are classified as either
cars or light trucks (including SUVs) and as either
domestic or imported. In a recent year, 69% of vehicles
sold were light trucks, 78% were domestic, and 55% were
domestic light trucks. For a randomly selected vehicle,
what is the probability that
the vehicle is a car?
the vehicle is either domestic or a light truck or both?
the vehicle is an imported light truck?
the vehicle is domestic, given that it is a car?
Review for Final Exam – Practice
• 56% of all American workers have a workplace retirement
plan, 66% have health insurance, and 73% have at least
one of the benefits. We select a worker at random.
What is the probability that he has both health
insurance and a retirement plan?
What is the probability that he has neither health
insurance nor a retirement plan?
What is the probability that he only has a retirement
plan?
Knowing that he has a retirement plan, what is the
probability that he has health insurance?
Review for Final Exam – Practice
• Solution:
Let A be the event that he has a retirement plan.
Let B be the event that he has health insurance.
Then P(A)=0.56, P(B)=0.66, and P(A or B)=0.73.
[Venn diagram of events A and B]
Review for Final Exam – Practice
• Solution:
What is the probability that he has both health
insurance and a retirement plan?
P(A and B)=?
General addition rule:
P(A or B) = P(A) + P(B) - P(A and B)
Therefore, P(A and B) = P(A) + P(B) - P(A or B) =
0.56+0.66-0.73 = 0.49
Review for Final Exam – Practice
• Solution:
What is the probability that he has neither health
insurance nor a retirement plan?
The probability that he has at least one benefit is
0.73.
Therefore, the probability that he has neither
health insurance nor a retirement plan is 1-0.73=0.27.
Review for Final Exam – Practice
• Solution:
What is the probability that he only has a retirement
plan?
“Only has a retirement plan” means has a
retirement plan but no health insurance (not both).
Therefore, P(he only has a retirement plan) = P(A)
– P(A and B) = 0.56-0.49 = 0.07
Review for Final Exam – Practice
• Solution:
Knowing that he has a retirement plan, what is the
probability that he has health insurance?
$P(B \mid A) = \frac{P(B \text{ and } A)}{P(A)} = \frac{0.49}{0.56} = 0.875.$
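The four answers for the retirement-plan example follow from the general addition rule, the complement rule, and the definition of conditional probability. A minimal Python sketch of that arithmetic, with hypothetical variable names:

```python
# Given: A = has a retirement plan, B = has health insurance.
p_a, p_b, p_a_or_b = 0.56, 0.66, 0.73

# General addition rule rearranged: P(A and B) = P(A) + P(B) - P(A or B).
p_a_and_b = p_a + p_b - p_a_or_b

# Complement rule: "neither" is the complement of "at least one".
p_neither = 1 - p_a_or_b

# "Only a retirement plan" is A with the overlap removed.
p_only_a = p_a - p_a_and_b

# Conditional probability: P(B | A) = P(A and B) / P(A).
p_b_given_a = p_a_and_b / p_a

print(round(p_a_and_b, 2), round(p_neither, 2),
      round(p_only_a, 2), round(p_b_given_a, 3))
```

Since P(B | A) = 0.875 is not equal to P(B) = 0.66, the two benefits are not independent in this example.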
Review for Final Exam – Practice
• Spell-checking software catches “nonword errors” that result
in a string of letters that is not a word, as when “the” is typed
as “teh.” When undergraduates are asked to type a 250-word
essay (without spell-checking), the number X of nonword
errors has the following distribution:
X | 0 | 1 | 2 | 3 | >=4
Probability | 0.1 | 0.2 | 0.3 | 0.3 | ?
• For a randomly selected student, what is the probability that
he made 4 or more errors?
he made at most 1 error?
• For four randomly selected students, what is the probability
that
each of them made no more than 2 errors?
at least one of them made an error?
Review for Final Exam – Practice
• In a large Statistics lecture, the professor reports that
52% of the students enrolled have never taken a Calculus
course, 34% have taken only one semester of Calculus,
and the rest have taken two or more semesters of
Calculus. The professor randomly assigns students to
groups of three to work on a project for the course.
What is the probability that the first group member you
meet has studied some Calculus?
What is the probability that the first group member you
meet has studied no more than one semester of Calculus?
What is the probability that both of your two group
members have studied exactly one semester of Calculus?
What is the probability that at least one of your group
members has had more than one semester of Calculus?
Review for Final Exam – Practice
• Solution:
Let A denote the event that a student has never taken
a Calculus course
Let B denote the event that a student has taken only
one semester of Calculus
Let C denote the event that a student has taken two
or more semesters of Calculus.
Review for Final Exam – Practice
• Solution:
First, we can find the probability that a student has
taken two or more semesters of Calculus:
P(C) = 1–P(A)–P(B) = 1-0.52-0.34=0.14.
What is the probability that the first group member
you meet has studied some Calculus?
{Some Calculus} = B or C
P(Some Calculus) = P(B or C) = P(B)+P(C) =
0.34+0.14 = 0.48.
Review for Final Exam – Practice
• Solution:
What is the probability that the first group member
you meet has studied no more than one semester of
Calculus?
C = {a student has taken two or more semesters of
Calculus}
C^C = {a student has studied no more than one
semester of Calculus}
P(no more than one semester of Calculus) = P(C^C) =
1-P(C) = 1-0.14 = 0.86.
Review for Final Exam – Practice
• Solution:
What is the probability that both of your two group
members have studied exactly one semester of
Calculus?
The two events
A1={first member has studied exactly one
semester of Calculus}
A2={second member has studied exactly one
semester of Calculus}
are independent.
Thus, P(both members have studied exactly one
semester of Calculus) = P(A1 and A2) = P(A1)XP(A2) =
0.34X0.34 = 0.1156
Review for Final Exam – Practice
• Solution:
What is the probability that at least one of your group
members has had more than one semester of Calculus?
Let E={at least one of your group members has had
more than one semester of Calculus}
E^C={neither of your group members has had more
than one semester of Calculus}
E1={first member has not had more than
one semester of Calculus}
E2={second member has not had more
than one semester of Calculus}
P(E^C) = P(E1 and E2) = P(E1)XP(E2) = (1-0.14)^2.
P(E) = 1-P(E^C) = 1-(1-0.14)^2 = 0.2604.
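The "at least one" calculation uses the complement rule together with the multiplication rule for independent events. A minimal Python sketch, assuming independence between group members as the solution does; the function name `prob_at_least_one` is hypothetical:

```python
def prob_at_least_one(p_single, k):
    """P(at least one of k independent trials succeeds) = 1 - (1 - p)^k."""
    return 1 - (1 - p_single) ** k


# Probability that a student has had two or more semesters of Calculus.
p_c = 1 - 0.52 - 0.34

# Two other group members, assumed independent.
print(round(prob_at_least_one(p_c, 2), 4))
```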
Review for Final Exam – Practice
• A North American roulette wheel has 38 slots, of which 18
are red, 18 are black, and 2 are green. If you bet on red,
the probability of winning is 18/38 = .4737. The
probability .4737 represents
(A) nothing important, since every spin of the wheel
results in one of three outcomes (red, black, or green).
(B) the proportion of times this event will occur in a
very long series of individual bets on red.
(C) the fact that you're more likely to win betting on
red than you are to lose.
(D) the fact that if you make 100 wagers on red, you'll
have 47 or 48 wins.
Review for Final Exam – Practice
• A company has developed a new battery, but the average
lifetime is unknown. In order to estimate this average, a
sample of 100 batteries is tested and the average lifetime
of this sample is found to be 250 hours.
Here the population of interest is:
100 batteries, which were tested / average of 250
hours/ all newly developed batteries by the
company / lifetime of newly developed batteries
Here the sample is:
100 batteries, which were tested / lifetime of newly
developed batteries / average of 250 hours / not in
the list
Review for Final Exam – Practice
• A company has developed a new battery, but the average
lifetime is unknown. In order to estimate this average, a
sample of 100 batteries is tested and the average lifetime
of this sample is found to be 250 hours.
What is the parameter of interest in this case?
average lifetime of 100 batteries tested / average
of all newly developed batteries by the company /
100 batteries sampled and tested / no parameter is
involved in this problem
The 250 hours is the value of:
parameter / statistic / sample / variable
Review for Final Exam – Practice
• There are 30 problems in Ch12 in 4 pages and 45 problems in Ch13
in another set of 4 pages. In order to make up a homework set
based on chapters 12 and 13 the instructor considers the following
different schemes. Identify the sampling scheme employed.
Method 1: Label the 75 problems from 1 through 75 and draw 10
numbers at random and choose the corresponding problems.
Simple Random Sampling
Method 2: Pick 4 problems from the 30 in chapter 12 and pick 6
problems from the 45 in chapter 13.
Stratified Random Sampling
Method 3: Pick two pages at random and assign all the problems
in those pages
Cluster Sampling
Method 4: Pick two pages at random and pick 5 problems at
random from each of those two pages.
Multistage Sampling
Review for Final Exam – Practice
• A student group has 8 members:
1. Barrett 2. Chen 3. DeRoos 4. Maceli
5. Pagliarulo 6. Smithson 7. Williams 8. Zachary
Three of them will be selected to participate in a national
conference. If we use the following random digits (start
from the left) to select a simple random sample of size 3,
then who will attend the conference?
2023967 8523610 4317063 5689043 5463038 9406022
A. Barrett, Chen, DeRoos
B. Chen, Chen, DeRoos
C. Chen, DeRoos, Smithson
D. Chen, Pagliarulo, Williams
Review for Final Exam
• Data / Data table
• Cases
• Variables (Categorical / Quantitative)
• Display Categorical Variables
Frequency Table / Relative Frequency Table
Bar Chart / Relative Frequency Bar Chart / Pie Chart
Review for Final Exam
• Graphic techniques for displaying quantitative
variables:
Histograms
Stem-and-leaf displays
• Shape of distributions:
Unimodal / Bimodal / Multimodal / Uniform
Symmetric / Skewed to the left / Skewed to the
right
Outlier
Review for Final Exam
• Numerical descriptions for the distribution of a
quantitative variable :
The center of a distribution
Mean
Median
The spread of a distribution
Standard deviation
Interquartile Range (IQR)
Five number summary / Outlier (1.5IQR rule)
Boxplot
Review for Final Exam
• Shifting and rescaling of quantitative variables
• Standardization of quantitative variables (z-score)
$z = \frac{x - \bar{x}}{s}$
• The Normal model
Mean and standard deviation
68-95-99.7 rule
Two types of problems:
Find percentage
Find percentiles
Review for Final Exam
• Scatterplot for two quantitative variables
Direction
positive / negative
Form
linear / curved / no pattern
Strength
strong / moderate / weak
• Correlation coefficient r
Review for Final Exam
• Linear models
$\hat{y} = b_0 + b_1 x$
• Least square regression line
$b_1 = r \frac{s_y}{s_x}$ and $b_0 = \bar{y} - b_1 \bar{x}$
• Predictions and residuals
Review for Final Exam – Practice
• The mean height of American women in their early
twenties is about 64.5 inches and the standard deviation
is about 2.5 inches. The mean height of men the same age
is about 68.5 inches, with standard deviation about 2.7
inches. If the correlation between the heights
of husbands and wives is about r = 0.5, what is the
equation of the regression line of the husband’s height on
the wife’s height in young couples? Predict the height of
the husband of a woman who is 67 inches tall. What
percentage of variation in husbands’ height is explained
by wives’ height?
Review for Final Exam – Practice
• Michigan State University researchers want to investigate how
rainfall affects the yield of crops in East Lansing. The
researchers found that the average amount of rainfall over the
past 20 years is about 230 inches and the standard deviation is
about 10 inches. The average yield of crops in East Lansing is
about 280 tons with a standard deviation of 20 tons. The
correlation between the amount of rainfall and yield of crops is
about 0.4.
1) What is the slope of the regression line of yield of crop on
amount of rainfall?
2) What is the intercept of the appropriate regression line?
3) What is the predicted value of the yield of crop when the
amount of rainfall is 240 inches? If the actual yield of crop
of the year with rainfall 240 inches is 280, what is the
residual?
4) What percentage of variation in crop yield is explained by
the rainfall?
Review for Final Exam – Practice
• Solution:
1) What is the slope of the regression line of yield of
crop on amount of rainfall?
The slope is given by $b_1 = r \frac{s_y}{s_x}$
Here $r = 0.4,\ s_x = 10,\ s_y = 20.$
Thus the slope is
$b_1 = 0.4 \times \frac{20}{10} = 0.8$
2) What is the intercept of the appropriate regression
line?
The intercept is given by $b_0 = \bar{y} - b_1 \bar{x}$
Here $\bar{x} = 230,\ \bar{y} = 280,\ b_1 = 0.8.$
Thus the intercept is $b_0 = 280 - 230(0.8) = 96.$
Review for Final Exam – Practice
• Solution:
3) What is the predicted value of the yield of crop when the
amount of rainfall is 240 inches? If the actual yield of crop
of the year with rainfall 240 inches is 280, what is the
residual?
The predicted value is
$\hat{y} = 96 + 0.8x = 96 + 0.8(240) = 288.$
The residual is
$y - \hat{y} = 280 - 288 = -8.$
4) What percentage of variation in crop yield is explained by
the rainfall?
The quantity $r^2$ gives the proportion of the variation in
the response variable that is explained by the
explanatory variable. In this case,
$r^2 = 0.4^2 = 0.16$, that is, 16%.
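The four parts of the rainfall example can be reproduced from the summary statistics alone. This is a minimal Python sketch; the function name `regression_from_summary` is hypothetical, and it simply applies the slope formula b1 = r * sy / sx and the intercept formula b0 = y_bar - b1 * x_bar.

```python
def regression_from_summary(r, x_mean, x_sd, y_mean, y_sd):
    """Least-squares slope and intercept from summary statistics."""
    slope = r * y_sd / x_sd
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Rainfall (x): mean 230, sd 10. Crop yield (y): mean 280, sd 20. r = 0.4.
r = 0.4
slope, intercept = regression_from_summary(r, 230, 10, 280, 20)

predicted = intercept + slope * 240   # prediction at 240 inches of rainfall
residual = 280 - predicted            # observed minus predicted
r_squared = r ** 2                    # proportion of variation explained

print(round(slope, 2), round(intercept, 2),
      round(predicted, 2), round(residual, 2), round(r_squared, 2))
```

A negative residual means the observed yield fell below the line's prediction for that rainfall.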
Review for Final Exam – Practice
• In a population of couples the average height of wives
was 65.2 inches and that of the husbands 68.2 inches. You
use the regression line to make predictions of the wife's
height from the husband's height. Suppose a husband has
height 68.2 inches, what would be the predicted height of
the wife?
• Solution:
The regression line passes through the point of means, that is, it satisfies
$\bar{y} = b_0 + b_1 \bar{x}$
Since the husband’s height (68.2 inches) is same as the
average height of husbands, the predicted height of
the wife should also be the average height of wives,
that is, 65.2 inches.
Review for Final Exam – Practice
• A regression study on obesity shows that doing more
physical exercises reduces weight. In this study they have
found time spent in physical exercise explained 16% of
the total sample variation in weight among obese people.
What is the correlation between "time spent in physical
exercise" and "weight"?
• Solution:
The quantity $r^2$ gives the proportion of the variation in the
response variable that is explained by the
explanatory variable. In this case, $r^2 = 0.16$, so $r = \pm 0.4$.
Because more exercise goes with lower weight, the association is negative and the
correlation is r=-0.4.
Review for Final Exam – Practice
• Suppose that in families with 5 children X is the number
of boys and Y is the number of girls. What is the
correlation between X and Y?
• Solution:
Since X+Y=5, or equivalently Y =5-X, X and Y are
linearly related.
Therefore, the correlation between X and Y is -1.
Review for Final Exam – Practice
• Which scatterplot has correlation near zero?
Review for Final Exam – Practice
• In a photographic process, the developing time of prints
are approximately normal with mean 15.4 seconds and
standard deviation 0.4 seconds.
1) What proportion of prints will take at least 14.64 sec
to develop?
2) What proportion of prints will take 14.64 sec to 16.00
sec to develop?
3) How many seconds is needed at most for the
quickest 10%?
Review for Final Exam – Practice
• Solution:
1) What proportion of prints will take at least 14.64 sec
to develop?
The z-score corresponding to 14.64 is
$z = \frac{x - \mu}{\sigma} = \frac{14.64 - 15.4}{0.4} = -1.9.$
The probability corresponding to z-score -1.9 is
0.0287.
Therefore, the proportion of prints that will take at
least 14.64 sec to develop is 1-0.0287=0.9713.
Review for Final Exam – Practice
• Solution:
2) What proportion of prints will take 14.64 sec to 16.00
sec to develop?
The z-score corresponding to 16 is
$z = \frac{x - \mu}{\sigma} = \frac{16 - 15.4}{0.4} = 1.5.$
The probability corresponding to z-score 1.5 is
0.9332.
Therefore, the proportion of prints that will take 14.64
sec to 16.00 sec to develop is 0.9332-0.0287=0.9045.
Review for Final Exam – Practice
• Solution:
3) How many seconds is needed at most for the
quickest 10%?
Quickest 10% corresponds to the smallest 10%
(less time).
The z-score corresponding to probability 0.1 is -1.28.
Therefore, the seconds needed at most for the
quickest 10% is
$x = \mu + z\sigma = 15.4 + (-1.28)(0.4) = 14.888.$
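The three developing-time questions are the two standard Normal-model tasks: finding a percentage from a value and finding a value from a percentile. A minimal Python sketch using the standard library's `NormalDist` in place of the normal table; the variable names are hypothetical:

```python
from statistics import NormalDist

# Developing time: approximately Normal with mean 15.4 s and sd 0.4 s.
developing_time = NormalDist(mu=15.4, sigma=0.4)

# 1) Proportion taking at least 14.64 s: the area to the right of 14.64.
p_at_least = 1 - developing_time.cdf(14.64)

# 2) Proportion between 14.64 s and 16.00 s: a difference of two left areas.
p_between = developing_time.cdf(16.00) - developing_time.cdf(14.64)

# 3) Cutoff for the quickest 10%: the 10th percentile, mu + z * sigma with z < 0.
cutoff = developing_time.inv_cdf(0.10)

print(round(p_at_least, 4), round(p_between, 4), round(cutoff, 2))
```

The cutoff lies below the mean because the quickest prints are in the lower tail, so the z-score for the 10th percentile is negative.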
Review for Final Exam – Practice
[Boxplot]
• Which seems to be the likely value of Q1 (the first quartile)? 22
• Which seems to be the likely value of the median? 48
• What percentage of the observations is lying outside the box? 50%
• What is the approximate value of the range? 110-5=105
Review for Final Exam – Practice
• The following stem-and-leaf display shows the number of
patients attended by a house-physician in 15 randomly
selected weeks:
Stem | Leaf
-----------
0 | 8 9
1 | 3 4 6 6 6 8 8
2 | 0 1 2 4
3 | 0 6
Here 0|8 implies 8, 1|3 implies 13 etc. (i.e. the stem represents
tens and leaf represents units).
1) Which observation occurred most? 16
2) In how many weeks did the physician attend between 15
and 25 patients? 9
3) What is the median, Q1, and Q3? Median:18; Q1:14; Q3:22
4) What is the IQR? IQR=Q3-Q1=22-14=8
5) Are there any outliers? 36 is an outlier
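The answers for the stem-and-leaf data can be checked with a short Python sketch. It assumes the quartile convention that matches the answers above: Q1 and Q3 are the medians of the lower and upper halves, leaving out the overall median when the number of observations is odd. Other software may use a different quartile rule. The function names are hypothetical.

```python
from statistics import median, mode


def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves convention."""
    xs = sorted(data)
    half = len(xs) // 2
    lower = xs[:half]
    # With an odd count, the middle observation belongs to neither half.
    upper = xs[half:] if len(xs) % 2 == 0 else xs[half + 1:]
    return median(lower), median(xs), median(upper)


def outliers(data):
    """Observations beyond 1.5 * IQR from the quartiles (the 1.5 IQR rule)."""
    q1, _, q3 = quartiles(data)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]


# Values read off the stem-and-leaf display (stem = tens, leaf = units).
patients = [8, 9, 13, 14, 16, 16, 16, 18, 18, 20, 21, 22, 24, 30, 36]

print(mode(patients))                               # most frequent observation
print(sum(1 for x in patients if 15 <= x <= 25))    # weeks with 15 to 25 patients
print(quartiles(patients))
print(outliers(patients))
```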
Review for Final Exam – Practice
• What is the mean and standard deviation of the data set
{34, 40, 43, 55}?
• Solution:
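Mean: $\bar{x} = \frac{34 + 40 + 43 + 55}{4} = 43.$
Standard deviation:
x | x - x̄ | (x - x̄)^2
34 | -9 | 81
40 | -3 | 9
43 | 0 | 0
55 | 12 | 144
sum | | 234
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{234}{4 - 1}} = 8.832$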
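The same calculation can be written out step by step in Python and compared with the standard library's `statistics.stdev`, which also divides by n - 1. The variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

data = [34, 40, 43, 55]

x_bar = mean(data)
deviations = [x - x_bar for x in data]            # x - x_bar for each value
squared = [d ** 2 for d in deviations]            # (x - x_bar)^2
s_by_hand = sqrt(sum(squared) / (len(data) - 1))  # divide by n - 1, not n

print(x_bar, deviations, squared, sum(squared))
print(round(s_by_hand, 3), round(stdev(data), 3))
```

The deviations always sum to zero, which is a quick check on the middle column of the table.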
Review for Final Exam – Practice
• An airline company keeps track of the delay in its flights.
Generally most flights have small delays but there are a
few flights with very long delays. A consumer group
claims that the "average" delay is 740 minutes while the
airline company claims that the average is only 260
minutes. Why is there a difference?
• Solution:
The consumer group refers to the mean while the
company refers to median.
The distribution is skewed to the right. So the mean is
larger than the median.
Review for Final Exam – Practice
• To decide whether to provide electrical power using
overhead lines or underground lines, the state
administration has to consider the total lengths of street
(measured in miles) in each subdivision of the respective
state. Below is the histogram of street lengths of 47
subdivisions in a state.
Review for Final Exam – Practice
• What is plotted along the Y-axis (the vertical axis)? Number of subdivisions
• How many subdivisions have total length of street between 2000 and
4000 miles? 10+7=17
• What percent of subdivisions have total length less than 1000 miles? 12/47=25.5%
• Which seems more likely to be true?
Mean = Median; Mean < Median; Mean > Median
• Which class will the median street length be in?
The median is the 24th observation.
Review for Final Exam – Practice
• In order to plan transportation and parking needs, the
administrations of a private high school asked students how
they get to school. Some rode a school bus, some rode in with
parents or friends, and others used "personal" transportation: bikes, skateboards, or just walking. The following table
summarizes the responses from boys and girls.
     | Boy | Girl
Bus  | 35  | 32
Ride | 35  | 47
1) How many students took part in the survey?
2) What percentage of the students surveyed are girls?
3) What percentage of students take the school bus?
4) What percent of the students are girls who ride the bus?
5) What percent of girls ride the bus?
6) What percent of bus riders are girls?
Review for Final Exam – Practice
• Solution:
     | Boy | Girl
Bus  | 35  | 32
Ride | 35  | 47
1) How many students took part in the survey?
35+35+32+47=149.
2) What percentage of the students surveyed are girls?
(32+47)/149=53.0%.
3) What percentage of students take the school bus?
(35+32)/149=45.0%.
4) What percent of the students are girls who ride the bus?
32/149=21.5%.
5) What percent of girls ride the bus?
32/(32+47)=40.5%.
6) What percent of bus riders are girls?
32/(32+35)=47.8%.
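The six answers differ only in which total is used as the denominator: the grand total, the girls' column total, or the bus row total. A minimal Python sketch of the two-way table makes that explicit; the dictionary layout and the helper `percent` are hypothetical.

```python
# Two-way table of counts: rows are transportation, columns are gender.
counts = {
    "Bus": {"Boy": 35, "Girl": 32},
    "Ride": {"Boy": 35, "Girl": 47},
}


def percent(part, whole):
    """Percentage of `whole` made up by `part`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


total = sum(sum(row.values()) for row in counts.values())   # grand total
girls = sum(row["Girl"] for row in counts.values())          # column total
bus_riders = sum(counts["Bus"].values())                     # row total
girls_on_bus = counts["Bus"]["Girl"]                         # one cell

print(total)
print(percent(girls, total))              # girls among all students
print(percent(bus_riders, total))         # bus riders among all students
print(percent(girls_on_bus, total))       # joint: girls who ride the bus
print(percent(girls_on_bus, girls))       # conditional on being a girl
print(percent(girls_on_bus, bus_riders))  # conditional on riding the bus
```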