Categorizing Inference Questions

• 1. Infants who cry easily may be more easily
stimulated than others. This may be a sign of
higher IQ. Child development researchers
explored the relationship between the crying of
infants 4 to 10 days old and their later IQ test
scores. A snap of a rubber band on the sole of
the foot caused the infants to cry. The
researchers recorded the crying and measured
its intensity by the number of peaks in the most
active 20 seconds. They later measured the
children’s IQ at age 3 years. The table below
contains data from a random sample of 38
infants. Do these data provide convincing
evidence that there is a positive linear
relationship between the cry counts and IQ in
the population of infants?
• A) bivariate – positive linear relationship
• B) the true slope of the relationship
between cry counts and IQ in the
population of infants
• C) 1 sample of 38 infants
• D) t-test for the slope of a regression line: $t = \dfrac{b_1 - \beta}{s_{b_1}}$
• E) Assumptions:
o SRS, relationship is linear, responses vary normally about the
LSRL, standard deviation of the residuals is constant
• 2. Trace metals found in wells affect the taste of drinking
water, and high concentrations can pose a health risk.
Researchers measured the concentration of zinc (in
milligrams per liter) near the top and the bottom of 10
randomly selected wells in a large region. The data are
provided in the table below. Construct and interpret a
95% confidence interval for the mean difference in the
zinc concentrations from these two locations in the wells.
• A) means
• B) the mean difference in the zinc concentrations
from these two locations in the wells
• C) one sample of 10 wells - two measurements at
each well
• D) This will be a one-sample t-interval for matched
pairs. There are two measurements for each well.
Subtract the matching measurements and use the
difference data to do the t-interval
• $\bar{x}_{\text{difference}} \pm t^* \dfrac{s_{\text{difference}}}{\sqrt{n}}$
• E) Assumptions
o σ unknown, use t
o random sample of wells - given
o population of wells > 100
o will have to check a graph of the difference data to see if a normal
approximation is valid, since the sample size is small.
• 3. Researchers designed a survey to compare the
proportions of children who come to school without
eating breakfast in two low-income elementary schools.
An SRS of 80 students from School 1 found 19 had not
eaten breakfast. At School 2, an SRS of 150 students
included 26 who had not had breakfast. More than
1500 students attend each school. Do these data give
convincing evidence of a difference in the population
proportions?
• A) proportions
• B) the true difference in the population
proportions of children who come to school
without eating breakfast in two low income
elementary schools
• C) two samples
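• D) This will be a two-proportion z-test for the
difference in proportions: $z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
o Remember that the $\hat{p}$ with no subscript is the pooled proportion, found by
putting the two samples together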
• E) Assumptions
o SRS of 80 students from one school and independent SRS of
150 students in second school- given
o n1p1 = 19, n1(1-p1) = 61; n2p2 = 26, n2(1-p2) = 124. Since all
are greater than 10 we can use a normal approximation
o population of students must be greater than 10n: > 800 in School 1 and > 1500 in School 2
– question states both populations > 1500
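As a rough numerical check of the two-proportion z-test named in part D, the sketch below plugs the counts from the question (19 of 80 and 26 of 150) into the pooled formula. It uses only Python's standard library. The function name `two_prop_z_test` is made up for this illustration, and the two-sided p-value is an assumption based on the question asking about "a difference".

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test of H0: p1 = p2 (two-sided)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled proportion: combine the two samples into one.
    p_pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = two_prop_z_test(19, 80, 26, 150)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.3f}")
```

With these counts the statistic comes out near z = 1.17, so the p-value is well above a significance level such as 0.05.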
• 4. Bottles of a popular cola are supposed to
contain 300 milliliters of cola. There is some variation
from bottle to bottle because the filling machinery is
not perfectly precise. From experience, the
distribution of the contents of the bottles is
approximately normal. An inspector measures the
contents of six randomly selected bottles from a
single day’s production. Do these data provide
convincing evidence that the mean amount of
cola in all the bottles filled that day differs from the
target value of 300 ml?
• A) means
• B) the mean amount of cola in all the bottles filled
that day
• C) one sample
• D) This will be a one-sample t-test for
means: $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$
• E) Assumptions:
o σ unknown, use t
o Random sample of bottles from day’s production
- given
o Contents of bottles are approximately
normally distributed - given
o Population of bottles in one day’s production >
60 (10n)
• 5. Some doctors have begun to use medical magnets
to treat patients with chronic pain. Scientists wondered
whether this type of therapy really worked, so they
designed an experiment to find out. Fifty patients with
chronic pain were recruited for the study. A doctor
identified a painful site on each patient and asked him
or her to rate the pain on a scale from 0 to 10. Then, the
doctor selected a sealed envelope containing a
magnet from a box that contained both active and
inactive magnets. The chosen magnet was applied to
the site of the pain for 45 minutes. After treatment, each
patient was again asked to rate the level of pain from 0
to 10. In all, 29 patients were given active magnets and
21 patients received inactive magnets. All but one of
the patients rated their initial pain as 8, 9, or 10, so
scientists decided to focus on the patients’ final pain
ratings. Do these data show statistical evidence to
suggest that the active magnets help reduce pain?
• A) This question doesn’t explicitly say whether it’s
proportions or means, but there are no
percentages, and we could find the mean pain
rating, so it is means.
• B) The true difference (reduction) in mean pain
rating for patients using active magnets as opposed to a
placebo
• C) Two samples – even though there were 50
patients originally, they get split into two separate
treatments. Since whether they get the magnet or
not is random, we can say the two samples are
independent
• D) Two-sample t-test for difference of means: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
• E) Assumptions:
o σ unknown, use t
o Patients were randomly assigned to treatments: 29 received active
magnets and 21 received inactive magnets –
given
o Reasonable to assume true population of patients with
chronic pain is greater than 290 and 210 (10n for each group)
o Since each sample size is less than 30, we will need to look
at graphs of the pain ratings for each group to determine if
we can use a normal approximation
• 6. Tonya wants to estimate what proportion of the
seniors in her school plan to attend the prom. She
interviews an SRS of 50 of the 750 seniors in her
school and finds that 36 plan to go to the prom.
• A) proportions
• B) The true proportion of seniors at Tonya’s school
who plan to go to the prom
• C) One sample of students
• D) One-proportion z-interval since we are
estimating the parameter: $\hat{p} \pm z^* \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
• E) Assumptions:
o SRS of seniors - given
o np = 36, n(1-p) = 14. Since both are >10 we can use normal approx.
o Population of seniors = 750 which is greater than 500 (10n)
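To make the one-proportion z-interval concrete, here is a minimal sketch using Tonya's counts (36 of 50). The question does not state a confidence level, so the 95% default below is an assumption, and `one_prop_z_interval` is a hypothetical helper name. Only the standard library is used.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_interval(x, n, confidence=0.95):
    """One-proportion z-interval: p_hat +/- z* sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = x / n
    # z* leaves (1 - confidence)/2 of the area in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# 95% is an assumed confidence level; the question does not give one.
low, high = one_prop_z_interval(36, 50)
print(f"95% interval for p: ({low:.3f}, {high:.3f})")
```

Under that assumption the interval runs from roughly 0.60 to 0.84.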
• 7. A local high school makes a change that should
improve student satisfaction with the parking
situation. Before the change, 37% of the school’s
students approved of the parking that was
provided. After the change, the principal surveys
an SRS of 200 of the over 2500 students at the
school. In all, 83 students say that they approve of
the new parking arrangement. Is this evidence that
the change was effective?
• A) Proportions – you have a percentage (37%) and a fraction (83/200)
• B) The true proportion of students who approve of the
parking arrangement after the change
• C) One sample of students. The 37% approval before the change is
stated as a known value for the school, not as the result of a second sample
• D) One-sample z-test for a
proportion: $z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}}$, with $p_0 = 0.37$
• E) Assumptions
o SRS of 200 students after the change - given
o np0 = (200)(.37) = 74 n(1-p0) = (200)(.63) = 126
• Since both values > 10 we can use a normal
approximation
o Population of students is over 2500, which is greater than 2000
(10n)
• 8. Here are data on the time (in minutes) Professor
Moore takes to swim 2000 yards and his pulse rate
(beats per minute) after swimming on a random
sample of 23 days. Is there statistically significant
evidence of a linear relationship between Professor
Moore’s swim time and his pulse rate in the
population of days on which he swims 2000 yards?
• A) bivariate data
• B) the true slope of the relationship between Professor
Moore’s swim time and his pulse rate
• C) one sample of 23 days
• D) t-test for slope of LSRL: $t = \dfrac{b_1 - \beta}{s_{b_1}}$
• E) Same assumptions as question #1
• 9. Biologists studying the healing of skin wounds
measured the rate at which new cells closed a cut
made in the skin of an anesthetized newt. Here are
data from a random sample of 18 newts, measured
in micrometers per hour. We want to estimate the
mean healing rate with 95% confidence.
• A) means
• B) the true mean healing rate of skin wounds
• C) one sample
• D) one-sample t-interval for means: $\bar{x} \pm t^* \dfrac{s}{\sqrt{n}}$
• E) Assumptions:
o σ unknown, use t
o Random sample of newts – given
o Population of newts > 180
o Since the sample size is <30 we would need to look at a graph of the data
to determine if we can use a normal approximation
• 10. Breast-feeding mothers secrete calcium into
their milk. Some of the calcium may come from
their bones, so mothers may lose bone mineral.
Researchers compared a random sample of 47
breast-feeding women with a random sample of 32
women of similar age who were neither pregnant
nor lactating. They measured the percent change
in the bone mineral content of the women’s spines
over three months. Comparative data is given
below. Is the mean change in bone mineral
content significantly lower for the mothers who are
breast-feeding?
• A) means
• B) the true difference in the mean change of bone
mineral content between breast feeding women
and those not breastfeeding
• C) Two samples
• D) Two-sample t-test for difference of means: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
• E) Assumptions
o Neither σ is known, so use t
o Random sample of 47 breast-feeding women and random
sample of 32 women not breast-feeding – given
o Both sample sizes are greater than 30, so we can use a normal
approximation
o Population of breast-feeding women > 470 and population of
women not breast-feeding > 320
• 11. Some doctors argue that “normal” human
body temperature is not really 98.6°F. One
researcher took the oral temperature reading for
each of 130 randomly chosen, healthy 18- to 40-year-olds.
The mean temperature was 98.25°F, with
a standard deviation of 0.73°F. Do these data
provide convincing evidence that normal body
temperature is not 98.6°F?
• A) means
• B) the true mean temperature of healthy 18- to 40-year-olds
• C) one sample
• D) one-sample t-test for means: $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$
• E) Assumptions
o σ unknown, use t
o Random sample of healthy 18- to 40-year-olds – given
o Since the sample size is greater than 30,
we can use a normal approximation
o Population of 18- to 40-year-olds > 1300
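The summary statistics in this question are enough to compute the test statistic directly. The sketch below does only that. The names `one_sample_t_statistic` and `mu_0` (the hypothesized mean, 98.6) are illustrative assumptions, and the p-value is left to a t table or statistical software because the standard library has no t distribution.

```python
from math import sqrt


def one_sample_t_statistic(x_bar, mu_0, s, n):
    """Return the t statistic and degrees of freedom for H0: mu = mu_0."""
    standard_error = s / sqrt(n)
    t = (x_bar - mu_0) / standard_error
    return t, n - 1


t, df = one_sample_t_statistic(x_bar=98.25, mu_0=98.6, s=0.73, n=130)
print(f"t = {t:.2f} with df = {df}")
```

The sample mean sits roughly 5.5 standard errors below 98.6, with 129 degrees of freedom.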
• 12. A study followed a random sample of 8474
people with normal blood pressure for about four
years. All the individuals were free of heart disease
at the beginning of the study. Each person took a
test which measures how prone a person is to
sudden anger. Researchers also recorded whether
each individual developed coronary heart disease.
Do the data provide convincing evidence of an
association between anger level and heart disease
in the population of interest?
• A) categorical data
• B) whether there is an association between anger
level and heart disease in the population of adults
• C) one sample – two variables
• D) Chi-square test of independence
o $\chi^2 = \sum \dfrac{(\text{obs} - \text{exp})^2}{\text{exp}}$
• E) Assumptions
o random sample of people – given
o Data are counts
o All expected counts are greater than 5 – we
would need to actually check the individual
counts, but considering the sample is so large,
this condition will probably be met
• 13. A drug manufacturer claims that less than 10%
of patients who take its new drug for treating
Alzheimer’s disease will experience nausea. To test
this claim, researchers conduct an experiment.
They give the new drug to a random sample of 300
out of 5000 Alzheimer’s patients whose families have
given informed consent for the patients to
participate in the study. In all, 25 of the subjects
experience nausea.
• A) proportions
• B) the true proportion of Alzheimer’s patients
experiencing nausea when taking a drug
• C) one sample of patients
• D) one-sample z-test for a proportion: $z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}}$, where $p_0 = 0.10$ is the claimed value
• E) Assumptions
o Random sample of 300 Alzheimer’s
patients - given
o np0 = (300)(.10) = 30
n(1-p0) = (300)(.90) = 270 Since both of
these values are >10 we can use a normal
approximation
o Population of Alzheimer’s patients > 3000 (10n) – question gives 5000
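The one-sample z-test for this question can be carried out from the counts given (25 of 300, claimed value 10%). The sketch below is one way to do it with the standard library. The lower-tailed alternative is an assumption taken from the manufacturer's claim of "less than 10%", and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_test_less(x, n, p0):
    """One-sample z-test of H0: p = p0 against Ha: p < p0."""
    p_hat = x / n
    # Under H0 the standard error is computed from the claimed value p0.
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    p_value = NormalDist().cdf(z)  # lower-tail area
    return z, p_value


z, p_value = one_prop_z_test_less(25, 300, 0.10)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

Here z is about -0.96, which gives a lower-tail p-value of roughly 0.17.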
• 14. Glenn wonders what proportion of students at his
school think that tuition is too high. He interviews an SRS
of 50 of the 240 students at his college. Thirty-eight of
those interviewed think tuition is too high.
• A) proportions
• B) the true proportion of students at Glenn’s school who
think that tuition is too high
• C) one sample of students
• D) Since Glenn is trying to determine the proportion, or
estimate the proportion, this will be a confidence interval
– z-interval for one sample proportion: $\hat{p} \pm z^* \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
• E) Assumptions
o SRS of students - given
o np = 38 n(1-p) = 12 Since both values > 10 we can use a normal approx.
o Population of students is 240, which is less than 10n = 500, so the 10% condition
is not met. We can’t assume independence in our sample. We can proceed, but our
results may not be reliable.
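The condition checks in part E are mechanical, so they can be sketched as a small helper. The function name and the dictionary keys below are made up for this illustration, and "at least 10" is used as the large-counts cutoff.

```python
def check_one_prop_conditions(x, n, population_size):
    """Check the large-counts and 10% conditions for a one-proportion z-interval."""
    successes = x
    failures = n - x
    return {
        # Large counts: observed successes and failures are both at least 10.
        "large_counts": successes >= 10 and failures >= 10,
        # 10% condition: the population is at least 10 times the sample size.
        "ten_percent": population_size >= 10 * n,
    }


glenn = check_one_prop_conditions(x=38, n=50, population_size=240)
print(glenn)
```

For Glenn's survey the large-counts check passes (38 and 12) while the 10% check fails (240 < 500), which matches the warning above.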
• 15. Market researchers suspect that background
music may affect the mood and buying behavior of
customers. One study in a supermarket compared
three randomly assigned treatments: no music,
French accordion music, and Italian string music.
Under each condition, the researchers recorded
the numbers of bottles of French, Italian, and other
wine purchased. Are the distributions of wine
purchases under the three music treatments similar
or different?
• A) categorical data
• B) if the population of customers has the same
buying habits with different music treatments
• C) 3 samples (treatment groups)
• D) Chi-square test of homogeneity
• $\chi^2 = \sum \dfrac{(\text{obs} - \text{exp})^2}{\text{exp}}$
• E) Assumptions
o Random sample of supermarket
customers - given
o Data are counts
o All expected counts > 5 – would need the
data to check this
• 16. A surprising number of young adults (ages 19 to
25) still live in their parents’ homes. The National
Institutes of Health planned to estimate the
difference in proportions of women and men in this
age group who live at home. The random sample
included 2253 men and 2629 women in this age
group. The survey found that 986 of the men and
923 of the women lived with their parents.
• A) proportions
• B) the true difference in proportions of men and
women ages 19 – 25 who still live with their parents
• C) two samples – men and women
• D) two-sample z-interval for difference of
proportions: $(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
• E) Assumptions
o Random sample of 19-25 year old men and random sample of
19-25 year old women - given
o n1p1 = 986 n1(1-p1) = 1267
o n2p2 = 923 n2(1 – p2) = 1706
• Since all these values are >10, I can use a normal
approximation
o Population of 19-25 year old men > 22530 and population of 19-25
year old women > 26290
• 17. The Wade Tract Preserve in Georgia is an old-growth
forest of long-leaf pines that has survived in
a relatively undisturbed state for hundreds of years.
One question of interest to foresters who study the
area is “How do the sizes of long-leaf pine trees in
the northern and southern halves of the forest
compare?” To find out, researchers took random
samples of 30 trees from each half of the forest and
measured the trees’ diameter in centimeters. What
is the difference in mean diameters of long-leaf
pines in the northern and southern halves?
• A) mean – it’s not proportions, and we can find the
mean diameter
• B) the true difference of the mean diameters of
trees in the northern and southern halves of the
Wade Tract Preserve
• C) two samples of 30 trees
• D) We want to find/estimate the difference, so we
use a two-sample t-interval for the difference of
means: $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
• E) Assumptions:
o Random sample of trees from northern part and independent
random sample of trees from southern part of the forest - given
o Since both sample sizes are 30, we can use a normal
approximation
o Populations of trees in the northern and southern parts of
the forest are each > 300
o σ unknown, use t
• 18. Environmentalists, government officials, and
vehicle manufacturers are all interested in studying
the auto exhaust emissions produced by motor
vehicles. The major pollutants in auto exhaust from
gasoline engines are hydrocarbons, carbon
monoxide, and nitrogen oxides (NOX). Researchers
collected data on the NOX levels (in grams per
mile) for a random sample of 40 light-duty engines
of the same type. The mean NOX reading was
1.2675 and the standard deviation was 0.3332.
• A) means
• B) The true mean NOX level of the population of
light-duty engines
• C) one sample
• D) We’re not testing a claim, so we are going to
estimate the true mean using a one-sample
t-interval for means: $\bar{x} \pm t^* \dfrac{s}{\sqrt{n}}$
• E) Assumptions
o Random sample of light-duty engines – given
o Sample size 40 > 30 so we can use a normal
approximation
o Population of light-duty engines > 400
o σ unknown, use t
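Because the question reports the mean, standard deviation, and sample size, the t-interval can be sketched numerically. The question gives no confidence level, so 95% is assumed here, and the critical value t* = 2.023 (df = 39) is passed in as a table lookup rather than computed. The helper name `one_sample_t_interval` is hypothetical.

```python
from math import sqrt


def one_sample_t_interval(x_bar, s, n, t_star):
    """One-sample t-interval: x_bar +/- t* s/sqrt(n)."""
    margin = t_star * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Assumption: 95% confidence, with t* = 2.023 read from a t table for df = 39.
low, high = one_sample_t_interval(x_bar=1.2675, s=0.3332, n=40, t_star=2.023)
print(f"95% interval for the mean NOX level: ({low:.3f}, {high:.3f}) g/mi")
```

Under those assumptions the interval is roughly 1.16 to 1.37 grams per mile.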
• 19. Do experienced computer game players earn
higher scores when they play with someone present
to cheer them on or when they play alone? Fifty
teenagers who are experienced at playing a
particular computer game have volunteered for a
study. We randomly assign 25 of them to play the
game alone and the other 25 to play the game with
a supporter present. Each player’s score is
recorded.
• A) means (mean score)
• B) the true difference in the mean scores of players
with and without a supporter present
• C) two samples (or randomly assigned groups)
• D) two-sample t-test of difference of
means: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
• E) Assumptions:
o Randomly assigned players in each group – with and
without a supporter present - given
o Both sample sizes, 25, are less than 30 so we will need to
look at the data to determine if we can use a normal
approximation
o Population of computer game players > 250 + 250
o σ unknown, use t
• 20. As part of the Pew Internet and American Life
Project, researchers conducted two surveys in late
2009. The first survey asked a random sample of 800
U.S. teens about their use of social media and the
Internet. A second survey posed the same
questions to a random sample of 2253 U.S. adults.
In these two studies, 73% of teens and 47% of adults
said that they use social-networking sites. Construct
and interpret a 95% confidence interval for the
difference in the proportion of all US teens and
adults who use social-networking sites.
• A) proportions
• B) the true difference in the proportion of US teens
and adults who use social-networking sites
• C) two samples
• D) two-sample z-interval for difference of
proportions: $(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
• E) Assumptions
o Random sample of 800 US teens and independent random sample of
2253 US adults – given
o n1p1 = (800)(.73)=584 n1(1-p1)=216
o n2p2 = (2253)(.47)=1059 n2(1-p2)=1194
o Since all these values > 10, we can use a normal approximation
o Population of US teens > 8000 and population of US adults > 22530
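Since this question asks for a 95% interval and supplies both sample proportions and sample sizes, the two-sample z-interval can be worked out directly. The sketch below uses only the standard library. The function name is hypothetical, and taking teens as group 1 is an arbitrary ordering choice.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_interval(p1_hat, n1, p2_hat, n2, confidence=0.95):
    """Two-sample z-interval for p1 - p2 (unpooled standard error)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se


# Group 1 = teens (73% of 800), group 2 = adults (47% of 2253).
low, high = two_prop_z_interval(0.73, 800, 0.47, 2253)
print(f"95% interval for p_teens - p_adults: ({low:.3f}, {high:.3f})")
```

The interval is roughly 0.22 to 0.30. It lies entirely above zero, which suggests a higher proportion of teens than adults used social-networking sites.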