Investigations for Introducing
Mathematically Inclined
Students to Statistics
Allan Rossman ([email protected])
Beth Chance ([email protected])
Student Audience

Introductory statistics course for
mathematically inclined students



mathematics and statistics majors
future secondary teachers
perhaps strong science, engineering, computer
science majors
Goals for the Course?


Brainstorm your goals for these students,
particularly with attention to whether and how
these goals differ from service courses
(5 min)
Reporter summarize top three goals
Summary of Goals
Efforts for Math Stat/Prob Sequence

Supplement with data analysis component


Infuse data and applications


Witmer’s Data Analysis: An Introduction
Rice’s Mathematical Statistics and Data Analysis
Use lab activities


Nolan and Speed’s StatLabs
Baglivo’s Mathematica Laboratories for
Mathematical Statistics
Our Project
To develop and provide a:
Data-Oriented, Active Learning, Post-Calculus
Introduction to Statistical
Concepts, Applications, Theory
Supported by the NSF DUE/CCLI #9950476, 0321973
Guiding Principles









Put students in the role of active investigator
Motivate with real studies, genuine data
Emphasize connections among study design,
inference technique, scope of conclusions
Use simulations frequently
Use a variety of computational tools
Investigate mathematical underpinnings
Introduce probability “just in time”
Experience entire statistical process over and over
Provide a combination of immediate corrective
formative and summative evaluation of key concepts
Outline
Chapter 1
Data collection: Observation vs. experiment, confounding, randomization
Descriptive statistics: Conditional proportions, segmented bar graphs, odds ratio
Probability: Counting, random variable, expected value
Sampling/randomization distribution: Randomization distribution for $\hat{p}_1 - \hat{p}_2$
Model: Hypergeometric
Statistical inference: p-value, significance, Fisher’s Exact Test

Chapter 2
Descriptive statistics: Quantitative summaries, transformations, z-scores, resistance
Probability: Empirical rule
Sampling/randomization distribution: Randomization distribution for $\bar{x}_1 - \bar{x}_2$
Statistical inference: p-value, significance, effect of variability

Chapter 3
Data collection: Random sampling, bias, precision, nonsampling errors
Descriptive statistics: Bar graph
Probability: Bernoulli processes, rules for variances, expected value
Sampling/randomization distribution: Sampling distribution for $X$, $\hat{p}$
Model: Binomial
Statistical inference: Binomial tests and intervals, two-sided p-values, type I/II errors

Chapter 4
Data collection: Paired data
Descriptive statistics: Models, probability plots, trimmed mean
Probability: Normal, Central Limit Theorem
Sampling/randomization distribution: Large sample sampling distributions for $\bar{x}$, $\hat{p}$
Model: Normal, t
Statistical inference: z-procedures for proportions, t-procedures, robustness, bootstrapping

Chapter 5
Data collection: Independent random samples
Sampling/randomization distribution: Sampling distributions of $\hat{p}_1 - \hat{p}_2$, OR, $\bar{x}_1 - \bar{x}_2$
Model: Normal, t, lognormal
Statistical inference: Two-sample z- and t-procedures, bootstrap, CI for OR

Chapter 6
Descriptive statistics: Bivariate: scatterplots, correlation, simple linear regression
Sampling/randomization distribution: Chi-square statistic, F statistic, regression coefficients
Model: Chi-square, F, t
Statistical inference: Chi-square for homogeneity, independence, ANOVA, regression
Example Investigations





Full versions available at
www.rossmanchance.com/iscam/uscots/
Investigation 1: Sleep Deprivation and Visual Learning
(randomization tests)
Investigation 2: Sampling Words (random samples,
variability)
Investigation 3: Kissing the Right Way (confidence
intervals)
Investigation 4: Sleepless Drivers (CI for Odds Ratio)
Investigation 1: Sleep Deprivation

Physiology Experiment

Stickgold, James, and Hobson (2000) studied the
long-term effects of sleep deprivation on a visual
discrimination task (3 days later!)
Sleep condition   n    Mean    StDev   Median   IQR
deprived          11   3.90    12.17   4.50     20.7
unrestricted      10   19.82   14.73   16.55    19.53
Investigation 1: Sleep Deprivation




How often would such an extreme
experimental difference occur by chance, if
there was no sleep deprivation effect?
Set of 21 index cards with the improvement
scores (positive and negative).
Randomly assign 11 of the cards to the sleep
deprived group.
Calculate the difference in group means
(deprived – unrestricted)
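The card-shuffling steps above can be sketched in Python. This is a minimal illustration, not the authors' classroom code. The 21 improvement scores are an assumption: they do not appear in these slides, they are recalled from the published study, and they should be checked against the original data file (they do reproduce the group means of 3.90 and 19.82 in the summary table). The function name `randomization_differences` is hypothetical.

```python
import random

# ASSUMPTION: these scores are not in the slides; they are recalled from the
# published study and reproduce the slide's group means (3.90 and 19.82).
deprived = [-10.7, 4.5, 2.2, 21.3, -14.7, -10.7, 9.6, 2.4, 21.8, 7.2, 10.0]
unrestricted = [25.2, 14.5, -7.0, 12.6, 34.5, 45.6, 11.6, 18.6, 12.1, 30.5]


def mean(values):
    return sum(values) / len(values)


def randomization_differences(scores, n_first, reps, seed=0):
    """Re-deal the scores `reps` times; return each difference in group means."""
    rng = random.Random(seed)
    cards = list(scores)  # the "index cards"
    diffs = []
    for _ in range(reps):
        rng.shuffle(cards)  # random assignment of cards to groups
        diffs.append(mean(cards[:n_first]) - mean(cards[n_first:]))
    return diffs


# Observed difference, in the slide's direction: deprived - unrestricted
observed = mean(deprived) - mean(unrestricted)
diffs = randomization_differences(deprived + unrestricted, len(deprived), reps=1000)

# Empirical p-value: share of re-randomizations at least as extreme as observed.
# The small tolerance keeps float rounding from excluding exact ties.
p_value = sum(d <= observed + 1e-9 for d in diffs) / len(diffs)

print(f"observed difference: {observed:.2f}")
print(f"empirical p-value:   {p_value:.3f}")
```

The empirical p-value changes with the seed and the number of repetitions, which is one way to motivate the later comparison with an exact calculation.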
Investigation 1: Sleep Deprivation

After this reminder of the randomization
process, students then use a Minitab macro
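sample 21 c2 c3                 # shuffle the 21 entries of c2 (assumed: group labels) into c3
unstack c1 c4 c5;               # split c1 (assumed: improvement scores) into c4 and c5
  subs c3.                      # using the shuffled labels in c3
let c6(k1)=mean(c4)-mean(c5)    # store the difference in group means in row k1 of c6
let k1=k1+1                     # advance the counter for the next repetition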
Investigation 1: Sleep Deprivation

Students investigate this question through



Hands-on simulation (index cards)
Computer simulation (Minitab)
Exact distribution
p-value .002
p-value=.0072
15.92
Investigation 1: Sleep Deprivation

Experience the entire statistical process
again



Tools change, but reasoning remains same


Develop deeper understanding of key ideas
(randomization, significance, p-value)
Effect of variability
Tools based on research study, question – not for
their own sake
Simulation as a problem solving tool

Empirical vs. exact p-values
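To illustrate the contrast between empirical and exact p-values, the sketch below enumerates every possible assignment of 11 of the 21 scores to the deprived group instead of sampling assignments at random. The scores are an assumption: they are not in these slides, they are recalled from the published study, and they should be verified against the original data. The variable names are hypothetical.

```python
from itertools import combinations
from math import comb

# ASSUMPTION: scores recalled from the published study, not from these slides.
deprived = [-10.7, 4.5, 2.2, 21.3, -14.7, -10.7, 9.6, 2.4, 21.8, 7.2, 10.0]
unrestricted = [25.2, 14.5, -7.0, 12.6, 34.5, 45.6, 11.6, 18.6, 12.1, 30.5]

scores = deprived + unrestricted
n_dep, n_unr = len(deprived), len(unrestricted)
total = sum(scores)
observed = sum(deprived) / n_dep - sum(unrestricted) / n_unr

# Visit every way of choosing which 11 scores land in the deprived group.
as_extreme = 0
for group in combinations(scores, n_dep):
    s = sum(group)
    diff = s / n_dep - (total - s) / n_unr
    if diff <= observed + 1e-9:  # tolerance for float rounding on ties
        as_extreme += 1

n_assignments = comb(len(scores), n_dep)  # number of equally likely assignments
exact_p_value = as_extreme / n_assignments

print(f"{as_extreme} of {n_assignments} assignments are at least as extreme")
print(f"exact p-value: {exact_p_value:.4f}")
```

Unlike a simulation, this value does not vary from run to run, and it can be compared with the p-values quoted on the slides.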
Investigation 2: Sampling Words
Four score and seven years ago, our fathers brought forth upon this continent a
new nation: conceived in liberty, and dedicated to the proposition that all men
are created equal.
Now we are engaged in a great civil war, testing whether that nation, or any nation
so conceived and so dedicated, can long endure. We are met on a great
battlefield of that war.
We have come to dedicate a portion of that field as a final resting place for those
who here gave their lives that that nation might live. It is altogether fitting and
proper that we should do this.
But, in a larger sense, we cannot dedicate, we cannot consecrate, we cannot hallow
this ground. The brave men, living and dead, who struggled here have
consecrated it, far above our poor power to add or detract. The world will little
note, nor long remember, what we say here, but it can never forget what they
did here.
It is for us the living, rather, to be dedicated here to the unfinished work which they
who fought here have thus far so nobly advanced. It is rather for us to be here
dedicated to the great task remaining before us, that from these honored dead
we take increased devotion to that cause for which they gave the last full
measure of devotion, that we here highly resolve that these dead shall not have
died in vain, that this nation, under God, shall have a new birth of freedom, and
that government of the people, by the people, for the people, shall not perish
from the earth.
Investigation 2: Sampling Words

Examine the average length of words in the
sample
Example Class Results
The population mean of all 268 words is
4.295 letters
Investigation 2: Sampling Words

Students use Minitab to select sample and
compare results

Example results
Investigation 2: Sampling Words

Then turn to technology (applet)

What is the long-term behavior of this (random)
sampling method?


Unbiased method?
What happens if we change sample size?
Population size?
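The questions above about long-run behavior can be explored with a short simulation. This is a sketch under stated assumptions: to keep it small, the population is only the first sentence of the address (30 words), not the full 268-word population used in class, and `sample_means` is a hypothetical helper name.

```python
import random
import re
import statistics

# ASSUMPTION: a 30-word illustrative population (first sentence only),
# not the full 268-word population from the class activity.
TEXT = (
    "Four score and seven years ago, our fathers brought forth upon this "
    "continent a new nation: conceived in liberty, and dedicated to the "
    "proposition that all men are created equal."
)
word_lengths = [len(word) for word in re.findall(r"[A-Za-z]+", TEXT)]
population_mean = statistics.fmean(word_lengths)


def sample_means(lengths, sample_size, reps, seed=0):
    """Draw `reps` simple random samples (without replacement); return their means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.sample(lengths, sample_size)) for _ in range(reps)]


means_5 = sample_means(word_lengths, sample_size=5, reps=5000)
means_10 = sample_means(word_lengths, sample_size=10, reps=5000)

print(f"population mean word length: {population_mean:.3f}")
for label, means in (("n = 5", means_5), ("n = 10", means_10)):
    print(f"{label}: mean of sample means = {statistics.fmean(means):.3f}, "
          f"SD of sample means = {statistics.stdev(means):.3f}")
```

If the method is unbiased, the mean of the sample means should sit close to the population mean for both sample sizes, while the spread of the sample means should shrink as the sample size grows.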
Investigation 2: Sampling Words

Using various forms of technology to support
student conceptual learning





Tailored to the context
Dynamic, interactive, and visual
Easy to use
Confront most common student
misconceptions directly
Distinguish randomization from random
sampling
Investigation 3: Kissing the Right Way

Biopsychology observational study

Güntürkün (2003) recorded the direction turned by
kissing couples to see if there was also a right-sided dominance.
Investigation 3: Kissing the Right Way


Is 1/2 a plausible value for p, the probability a
kissing couple turns right?
Binomial Simulation applet


Introduce idea of two-sided p-value
Is 2/3 a plausible value for p, the probability a
kissing couple turns right?

Discuss calculation of non-symmetric two-sided p-values
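One common way to compute a two-sided binomial p-value when the null distribution is not symmetric is to add up the probabilities of all outcomes that are no more likely than the observed one. The sketch below assumes that definition (the slides do not say which definition the course uses) and assumes the published study's count of 80 right-turning couples out of 124, which does not appear in these slides.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_p_value(x, n, p):
    """Sum the probabilities of all outcomes no more likely than the observed x."""
    cutoff = binom_pmf(x, n, p) * (1 + 1e-7)  # tolerance for float ties
    total = sum(pk for pk in (binom_pmf(k, n, p) for k in range(n + 1)) if pk <= cutoff)
    return min(1.0, total)


# ASSUMPTION: 80 of 124 couples turned right (count taken from the published
# study, not from these slides).
x, n = 80, 124
p_half = two_sided_p_value(x, n, 0.5)
p_two_thirds = two_sided_p_value(x, n, 2 / 3)

print(f"H0: p = 1/2 -> two-sided p-value = {p_half:.4f}")
print(f"H0: p = 2/3 -> two-sided p-value = {p_two_thirds:.4f}")
```

Under p = 1/2 the distribution is symmetric and this rule matches doubling one tail; under p = 2/3 the two tails contribute unequal amounts, which is the point of the discussion.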
Investigation 3: Kissing the Right Way

Have students explore and develop an
“interval” of plausible values for p
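The idea of an interval of plausible values can be sketched by testing a grid of candidate values of p and keeping those that are not rejected. This is an illustration under assumptions: the count of 80 right turns in 124 couples comes from the published study rather than these slides, the 5% cutoff and the 0.01 grid step are arbitrary choices, and the two-sided p-value uses the "no more likely than observed" rule.

```python
from math import comb


def two_sided_p_value(x, n, p):
    """Exact binomial p-value: total probability of outcomes no more likely than x."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    cutoff = pmf[x] * (1 + 1e-7)  # tolerance for float ties
    return min(1.0, sum(pk for pk in pmf if pk <= cutoff))


# ASSUMPTION: 80 of 124 couples turned right (from the published study).
x, n = 80, 124
alpha = 0.05  # ASSUMPTION: values with p-value >= 0.05 count as "plausible"

grid = [i / 100 for i in range(1, 100)]  # candidate values 0.01, 0.02, ..., 0.99
plausible = [p for p in grid if two_sided_p_value(x, n, p) >= alpha]
low, high = min(plausible), max(plausible)

print(f"plausible values for p: {low:.2f} to {high:.2f}")
```

A finer grid sharpens the endpoints, and changing `alpha` shows how the interval widens or narrows with the confidence level.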
Later Investigations

Use another applet to explore the meaning of
confidence level



Wald vs. adjusted Wald
z vs. t
Robustness of t-intervals
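The Wald vs. adjusted Wald comparison can be made concrete by computing exact coverage probabilities instead of simulating them. The sketch assumes that "adjusted Wald" means the common "plus four" adjustment (add two successes and two failures) for a 95% interval; the slides do not spell this out. The sample size of 20 and p = 0.1 are illustrative choices.

```python
from math import comb, sqrt
from statistics import NormalDist


def wald_interval(x, n, z):
    p_hat = x / n
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def adjusted_wald_interval(x, n, z):
    # ASSUMPTION: "plus four" adjustment, two extra successes and two extra failures
    n_adj = n + 4
    p_adj = (x + 2) / n_adj
    margin = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - margin, p_adj + margin


def coverage(interval_fn, n, p, level=0.95):
    """Exact probability that the interval captures p when X ~ Binomial(n, p)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    covered = 0.0
    for x in range(n + 1):
        low, high = interval_fn(x, n, z)
        if low <= p <= high:
            covered += comb(n, x) * p**x * (1 - p) ** (n - x)
    return covered


wald_cov = coverage(wald_interval, n=20, p=0.1)
adjusted_cov = coverage(adjusted_wald_interval, n=20, p=0.1)
print(f"Wald coverage:          {wald_cov:.3f}")
print(f"adjusted Wald coverage: {adjusted_cov:.3f}")
```

Repeating the calculation over several values of n and p gives a picture of what a stated confidence level does and does not guarantee for each procedure.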
Investigation 3: Kissing the Right Way




Encourage students to make predictions and
test their knowledge
Use the technology to minimize
computational burden so students focus on
concepts
Return to key ideas often, increasing the level
of complexity each time
Give them a taste for the modern flavor of
statistical practice and methodology
Investigation 4: Sleepless Drivers

Sociology case-control study

Connor et al. (2002) investigated whether those in
recent car accidents had been more sleep
deprived than a control group of drivers
                                No full night’s sleep   At least one full night’s   Sample sizes
                                in past week            sleep in past week
“case” drivers (crash)          61                      474                         535
“control” drivers (no crash)    44                      544                         588
Investigation 4: Sleepless Drivers
Sample proportion that were in a car crash
Sleep deprived: .581
Not sleep deprived: .466
Odds ratio: 1.59
[Segmented bar graph: percentage of crash and no-crash drivers within each sleep group (no full night’s sleep in past week; at least one full night’s sleep in past week)]
How often would such an extreme observed odds ratio occur by
chance, if there was no sleep deprivation effect?
Investigation 4: Sleepless Drivers

Students investigate this question through

Computer simulation (Minitab)



Empirical sampling distribution of odds-ratio
Empirical p-value
Approximate mathematical model
[Figure: empirical sampling distribution of the odds ratio, with the observed value 1.59 marked]
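A simulation of this kind might look like the sketch below. It assumes one particular null model, which the slides do not specify: cases and controls are treated as independent samples of sizes 535 and 588 that share a common probability of being sleep deprived, estimated by the pooled proportion 105/1123. The function names are hypothetical.

```python
import random


def odds_ratio(a, b, c, d):
    """Odds ratio for cell counts: a, b = cases (deprived, not); c, d = controls."""
    if b * c == 0:
        return float("inf")  # guard against an empty cell
    return (a * d) / (b * c)


def simulate_odds_ratios(n_case, n_control, p_null, reps, seed=0):
    """Simulate odds ratios when both groups share the same deprivation rate."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(reps):
        a = sum(rng.random() < p_null for _ in range(n_case))
        c = sum(rng.random() < p_null for _ in range(n_control))
        ratios.append(odds_ratio(a, n_case - a, c, n_control - c))
    return ratios


observed_or = odds_ratio(61, 474, 44, 544)
pooled = (61 + 44) / (535 + 588)  # ASSUMPTION: pooled estimate as the null value
ratios = simulate_odds_ratios(535, 588, pooled, reps=1000)

# Empirical p-value: share of simulated odds ratios at least as large as observed
p_value = sum(r >= observed_or for r in ratios) / len(ratios)

print(f"observed odds ratio: {observed_or:.2f}")
print(f"empirical p-value:   {p_value:.3f}")
```

A histogram of `ratios` is right-skewed, which helps motivate working on the log scale in the approximate mathematical model.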
Investigation 4: Sleepless Drivers
SE(log-odds) = $\sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$, where a, b, c, d are the four cell counts of the two-way table

Confidence interval for population log odds:

sample log-odds ± z* SE(log-odds)
Back-transformation
90% CI for odds ratio: 1.13 – 2.24
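The interval above can be reproduced from the four cell counts of the two-way table. The sketch below is a worked check of the arithmetic, using the standard large-sample formula for the standard error of the log odds ratio; the variable names are illustrative.

```python
from math import exp, log, sqrt
from statistics import NormalDist

# Cell counts from the two-way table
a, b = 61, 474  # "case" drivers: no full night's sleep, at least one
c, d = 44, 544  # "control" drivers: no full night's sleep, at least one

# Conditional proportions in a crash, within each sleep group
prop_deprived = a / (a + c)
prop_not_deprived = b / (b + d)

# Odds ratio and its log
or_hat = (a * d) / (b * c)
log_or = log(or_hat)

# Standard error on the log scale
se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)

# 90% confidence interval on the log scale, then back-transformed
z_star = NormalDist().inv_cdf(0.95)  # leaves 5% in each tail
low = exp(log_or - z_star * se_log_or)
high = exp(log_or + z_star * se_log_or)

print(f"proportions in a crash: {prop_deprived:.3f} vs {prop_not_deprived:.3f}")
print(f"odds ratio: {or_hat:.2f}")
print(f"90% CI for odds ratio: {low:.2f} to {high:.2f}")
```

Because the interval is symmetric on the log scale, the back-transformed interval is not symmetric around 1.59.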
Investigation 4: Sleepless Drivers


Students understand process through which
they can investigate statistical ideas
Students piece together powerful statistical
tools learned throughout the course to derive
new (to them) procedures

Concepts, applications, methods, theory
Expectations of Students (Midterm Qs)






Issues in sampling, nonsampling errors
Understand the implications of improper sampling
Analyze data numerically and graphically,
communicate their results
Be able to explain how random variability affects the
conclusions we should draw
Verbalize student conclusions that follow based on
study design – Causation? Generalizability?
Explain the idea behind randomization/sampling
distributions, think statistically

Increasing understanding of confidence, p-value
Discussion

Are these worthy goals?



Is such a course feasible?




Recruiting students into statistics (2nd course…)
Preparing future teachers
Learning environment
Course structure
Integration of technology
What are the essential components in
students’ ability and understanding to
assess?
For More Information

Applets, data files, other resources:
www.rossmanchance.com/iscam/

Faculty development workshop (July 18-22,
2005):
www.rossmanchance.com/prep/workshop.html

Review copies of text:
www.duxbury.com
Thank you

Allan Rossman ([email protected])

Beth Chance ([email protected])