PowerPoint - Hope College Math Department
Nathan Tintle
Dordt College
Sioux Center, Iowa
Outline
We’ve been sitting too long!
Why change? (pre-2009)
Hold your breath
Taking the plunge… (fall 2009)
Show me the data
Data on fall 2009 implementation (pre-, post- and 4-month retention)
Are we there yet?
Joining forces, major changes and refinements (Summer 2010-present)
Are we there yet?
Open questions…(fall 2012 and beyond)
When are we going to get there (and where is there!)?
This and other common questions we’ve heard….
Background
Consensus curriculum
Descriptive statistics, design, probability and sampling distributions, inference
Using a TI
Lecture and lab separated
Feeling like we were teaching algebra and
memorization of rules and not statistical thinking
Had heard George Cobb’s talk from 2005
Permutation tests more and more in research
Taking the plunge…
…so hold your breath!
Decided to throw out the old curriculum completely and
start from scratch.
Hard to “sprinkle in”
Unclear how we would “incrementally change”
Pilot in spring 2009 (loosely based on Rossman-Chance-Cobb-Holcomb modules)
Faculty training workshop in May 2009
Flesh out more in summer 2009 for use in fall 2009
Todd Swanson, Jill VanderStoep and I
Details of fall 2009 implementation
Unit #1 (4-5 weeks). Introduction to inference
Chapter 1. Inference with a single proportion
(simulation only)
Chapter 2. Inference comparing two proportions
(randomization test only; sketched below)
Chapter 3. Inference comparing two means
(randomization test only)
Chapter 4. Inference for correlation and regression
(randomization test only)
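As a rough illustration of what Chapters 2-4 ask students to do, here is a minimal sketch of a randomization test comparing two proportions. The data (13 of 20 successes vs. 6 of 20) and all names are hypothetical, not taken from the curriculum's case studies:

```python
import random

# Hypothetical data: 13/20 successes in group A, 6/20 in group B
outcomes = [1] * 13 + [0] * 7 + [1] * 6 + [0] * 14
n_a = 20
observed_diff = 13 / 20 - 6 / 20

reps = 10_000
count_extreme = 0
for _ in range(reps):
    random.shuffle(outcomes)                     # re-randomize group labels
    diff = sum(outcomes[:n_a]) / n_a - sum(outcomes[n_a:]) / n_a
    if diff >= observed_diff:                    # one-sided: at least as extreme
        count_extreme += 1

print(f"observed difference = {observed_diff:.2f}, "
      f"approximate p-value = {count_extreme / reps:.4f}")
```

The same shuffle-and-recompute loop carries over to comparing means (Chapter 3) and to correlation/regression (Chapter 4), with only the statistic changing.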
Details of fall 2009 implementation
Unit #2 (8-9 weeks). Revisiting inference: theory-based approaches, confidence intervals and power
Chapter 5: Correlation and regression: revisited
Chapter 6: Comparing two or more means: revisited
Chapter 7: Comparing two or more proportions: revisited
Key features of fall 2009
Start with inference: simulation and randomization
first; theory-based approaches later (revisit)
Topic-based chapters (Malone et al. argued for
this): descriptive statistics are just in time
Probability and sampling distributions introduced
intuitively; less formally
Student projects (more, earlier)
Pedagogy (active learning inextricably linked with
simulation/randomization)
Case studies/research articles: Real data that matters
Assessment from fall 2009
All instructors liked it better, felt like students learned
more/better than before, and student attitudes seemed better
Used CAOS as pre-post test and 4-month follow-up
Full results are published (vs. CAOS fall 2007)
Pre-post: Tintle NL et al. “Development and assessment of a preliminary randomization-based introductory statistics curriculum.” Journal of Statistics Education, March 2011.
4-month retention: Tintle NL et al. “Retention of statistical concepts in a preliminary randomization-based introductory statistics curriculum.” Statistics Education Research Journal, May 2012.
Assessment highlights from fall 2009
Pre-post changes
6 items significantly better (p≤0.001) in fall 2009 (pre-post) as
compared to fall 2007 (at Hope) (p-values and design)
1 item significantly worse (standard deviation/histogram)
33 items not significantly different
4-month retention
Average of 48% loss of knowledge gained, 4 months post-course
(traditional curriculum)
Average of 6% loss of knowledge gained, 4 months post-course
(randomization curriculum)
Summer 2010
Joined forces with Beth Chance, George Cobb, Allan
Rossman and Soma Roy to produce an introductory
statistics textbook
Many major and minor changes to address
assessment results, teaching experience, etc. from fall
2009 (as well as experience over last two years of
teaching)
Course overview
Chapter 0: A few basics (statistical method, descriptive statistics and
probability as long-run frequency) (~1 week)
Unit 1: LOGIC AND SCOPE OF INFERENCE (5-6 weeks)
Chapter 1: Simulation and theory-based inferential
approach for single proportion (SIGNIFICANCE)
Chapter 2: Estimation using plausible values, ± 2SD and a
theory-based approach (ESTIMATION); see the sketch after this list
Chapter 3: Drawing conclusions from sample to
population (GENERALIZABILITY)
Chapter 4: Association and causation (CAUSATION)
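A minimal sketch of the plausible-values idea from Chapter 2, with assumed data (60 successes in 100 trials) and hypothetical names throughout: a parameter value is kept as plausible if the observed result would not be surprising under it, and the ± 2SD shortcut approximates the same interval.

```python
import random

n, observed = 100, 60   # hypothetical data: 60 successes in 100 trials
reps = 1_000

def two_sided_p(pi):
    """Simulate the null distribution under proportion pi; two-sided p-value."""
    stats = [sum(random.random() < pi for _ in range(n)) for _ in range(reps)]
    return sum(abs(s - n * pi) >= abs(observed - n * pi) for s in stats) / reps

# Keep every candidate value of pi under which the data are not surprising
candidates = [i / 100 for i in range(1, 100)]
plausible = [pi for pi in candidates if two_sided_p(pi) > 0.05]
print(f"plausible values: roughly {min(plausible):.2f} to {max(plausible):.2f}")

# The +/- 2SD shortcut taught alongside it
p_hat = observed / n
sd = (p_hat * (1 - p_hat) / n) ** 0.5
print(f"2SD interval: {p_hat - 2 * sd:.3f} to {p_hat + 2 * sd:.3f}")
```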
Course overview
Unit 2: COMPARING TWO GROUPS (5-6 weeks)
Chapter 5: Comparing two proportions—
randomization and theory-based
Chapter 6: Comparing two averages—randomization
and theory-based
Chapter 7: Matched pairs and single mean—
randomization and theory-based
Course overview
Unit 3: ANALYZING MORE GENERAL
SITUATIONS (3-4 weeks)
Chapter 8: Comparing more than two groups on
categorical response
Chapter 9: Comparing more than two groups with a
quantitative response
Chapter 10: Correlation and regression
**Note: Chapters 7-10 can be done in any order**
Changes/major decisions
Efficiency: randomization/simulation and
theory-based approaches done simultaneously
Logic and scope of inference (Significance,
Estimation, Generalizability and Causation)
Easier validity conditions for theory-based tests
Much more: the 3-S process for assessing statistical
significance (sketched below), the 7-step method, the approach to CIs, …
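The 3-S process (Statistic, Simulate, Strength of evidence) fits in a few lines; the 14-correct-out-of-20 data below are made up for illustration:

```python
import random

# Hypothetical example: 14 correct in 20 attempts; null: just guessing (pi = 0.5)

# 1. Statistic: summarize the observed data
observed = 14

# 2. Simulate: generate could-have-been statistics assuming the null hypothesis
reps = 10_000
simulated = [sum(random.random() < 0.5 for _ in range(20)) for _ in range(reps)]

# 3. Strength of evidence: how often does chance alone do as well or better?
p_value = sum(s >= observed for s in simulated) / reps
print(f"approximate p-value = {p_value:.4f}")
```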
Assessment 2011/2012
Portability
Dordt shows results similar to Hope’s, with improvement
over time as we make tweaks
Continues to be good
Exploring alternative assessment tests/questions more
tailored to our learning goals (e.g., GOALS, MOST,
Garfield et al.)
An example
Chapter 9. Comparing multiple group means on a quantitative
response
Exploration 9.1: Exercise and brain volume in the elderly (data
from Mortimer et al. 2012)
Brain size typically declines. Shrinkage may be linked to
dementia.
Randomized experiment with 4 groups: (a) Tai Chi (b) Walking
(c) Social Interaction and (d) Control
40 weeks: Measure percent change in brain size
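A minimal sketch of the randomization test this exploration builds toward, using made-up numbers (not the Mortimer et al. data); one common choice of statistic for several groups is the mean absolute difference (MAD) among group means:

```python
import random

# Hypothetical percent changes in brain size (NOT the Mortimer et al. data)
groups = {
    "Tai Chi": [0.5, 0.8, -0.1, 0.6, 0.3],
    "Walking": [0.2, 0.4, -0.3, 0.1, 0.0],
    "Social":  [-0.2, 0.1, -0.4, 0.0, -0.1],
    "Control": [-0.5, -0.3, -0.6, -0.2, -0.4],
}

def mad_of_means(samples):
    """Mean absolute difference among all pairs of group means."""
    means = [sum(g) / len(g) for g in samples]
    pairs = [(i, j) for i in range(len(means)) for j in range(i + 1, len(means))]
    return sum(abs(means[i] - means[j]) for i, j in pairs) / len(pairs)

observed = mad_of_means(list(groups.values()))
pooled = [x for g in groups.values() for x in g]
sizes = [len(g) for g in groups.values()]

reps = 10_000
count = 0
for _ in range(reps):
    random.shuffle(pooled)                  # re-randomize group assignments
    regrouped, start = [], 0
    for size in sizes:
        regrouped.append(pooled[start:start + size])
        start += size
    if mad_of_means(regrouped) >= observed:
        count += 1

print(f"observed MAD = {observed:.3f}, approximate p-value = {count / reps:.4f}")
```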
Where are we now
Class testing fall 2012 (seven other institutions, plus
our own three)
Anticipate a Wiley-published book within 18 months
(or so)
Draft materials prepared now
Contact if interested in learning more:
[email protected] or
http://math.hope.edu/isi for more details
Are we there yet?
Debate over
Bootstrapping vs. randomization vs. simulation
How to handle confidence intervals
Order
Etc.
Where is there?
Common questions we’ve heard
Q: How can I convince client departments?
Hard if you no longer do theory-based approaches
If you say “We’re still doing the test that you care about, but
we’re giving students a better scaffolding to understand
what that test is doing…”
I haven’t heard ANY client department respond negatively
to this rationale
Q: Ok, so how about my math colleagues?
Harder. They like the MATH part of statistics, and that’s
what we’re arguing to do less of to encourage
STATISTICAL thinking.
What is the goal of your course?
Not algebra, not probability, not calculus, not mathematical
thinking…
Common questions we’ve heard
Q: Sounds great, sounds like LOTS of work! Can I do
this?
Be willing to get out of the boat!
New things are never completely smooth
I know of no one (yet!) who has done this and wants
to go back
Utilize experienced instructor resources
We have some and are working on more
Need for more…
Conclusions
Randomization is doable as an intro course without
alienating client departments (theory-based tests retained)
Shows promise (initial assessment data; feedback,
etc. is positive)
More research needed to pinpoint the causes of
improved assessment data and accepted “best practices”
(what matters and what doesn’t) for a randomization approach
Acknowledgments
Collaborators:
Hope College: Todd Swanson and Jill VanderStoep
Cal Poly: Beth Chance, Allan Rossman and Soma Roy
Mt. Holyoke: George Cobb
Testers: Numerous other faculty and many, many
students
Funding: NSF (DUE-1140629)
Significantly better
% of Students Correct

Item Description (Topic)                         Cohort   Pretest   Posttest   Difference
Understanding of the purpose of randomization    NT        8.5      12.3        3.8
in an experiment (Data collection and design)    HT        4.6       9.7        5.1
                                                 HR        3.5      20.8       17.3
Understanding that low p-values are desirable    NT       49.9      68.5       18.6
in research studies (Tests of significance)      HT       56.9      85.6       28.7
                                                 HR       56.9      96.0       39.1
Significantly better
% of Students Correct

Item Description (Topic)                         Cohort   Pretest   Posttest   Difference
Understanding that no statistical significance   NT       63.1      64.4        1.3
does not guarantee that there is no effect       HT       66.2      72.7        6.5
(Tests of significance)                          HR       65.2      85.1       19.9
Ability to recognize a correct interpretation    NT       46.8      54.5        7.7
of a p-value (Tests of significance)             HT       36.1      41.0        4.9
                                                 HR       42.3      60.0       17.7
Significantly better
% of Students Correct

Item Description (Topic)                         Cohort   Pretest   Posttest   Difference
Ability to recognize an incorrect                NT       53.1      58.6        5.5
interpretation of a p-value; specifically,       HT       59.8      68.6        8.8
probability that a treatment is not effective    HR       58.9      79.7       20.8
(Tests of significance)
Understanding of how to simulate data to find    NT       20.4      19.5       -0.9
the probability of an observed value             HT       20.0      20.0        0.0
(Probability)                                    HR       20.0      32.2       12.2
Significantly worse
% of Students Correct

Item Description (Topic)                         Cohort   Pretest   Posttest   Difference
Ability to correctly estimate and compare        NT       34.3      51.7       17.4
standard deviations for different histograms     HT       44.8      70.8       26.0
(Descriptive statistics)                         HR       36.3      48.5       12.2
Retention
CAOS scores 4 months post-course (fall 2007 vs. fall 2009 students)

Average change in percent correct
             Posttest Mean minus     4-month Retention Mean minus
             Pretest Mean (SD)       Posttest Mean (SD)
Fall 2007    10.92 (9.5)             -5.28 (10.1)
Fall 2009    10.04 (12.3)            -0.61 (8.3)

Significantly better retention in fall 2009 sample (p=0.002)
Best retention areas
Average score on Topics

Topic                        Cohort       Pretest   Posttest   4-month retention   Change
Data collection and design   Randomized    31.58     43.09      41.12               -1.97
                             Consensus     41.02     47.44      34.94              -12.50
Tests of significance        Randomized    51.54     71.27      72.37                1.10
                             Consensus     51.51     67.31      64.31               -2.95
Common questions we’ve heard
Q: What do students prefer: simulation/randomization or theory-based?
Depends on how you present it; students will take
their cue from you
My preference:
A. Simulation requires computational power; we didn’t
have that until recently
B. Theory-based was the historical answer because it
predicts what would have happened had you simulated
C. Theory-based approaches use mathematical theory to give a good
approximation of the simulation distribution under certain
validity conditions (see the sketch below)
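Point C can be demonstrated directly: simulate the null distribution of a sample proportion and compare its mean and SD with what theory predicts. A minimal sketch with assumed values (n = 50, pi = 0.5; all names hypothetical):

```python
import math
import random

# Simulated null distribution of a sample proportion vs. the theory-based
# (normal) approximation. Hypothetical setup: n = 50, pi = 0.5.
n, pi, reps = 50, 0.5, 10_000
sims = [sum(random.random() < pi for _ in range(n)) / n for _ in range(reps)]

sim_mean = sum(sims) / reps
sim_sd = math.sqrt(sum((s - sim_mean) ** 2 for s in sims) / reps)

# Theory predicts mean pi and SD sqrt(pi(1-pi)/n) when validity conditions hold
theory_sd = math.sqrt(pi * (1 - pi) / n)
print(f"simulation: mean {sim_mean:.3f}, SD {sim_sd:.3f}")
print(f"theory:     mean {pi:.3f}, SD {theory_sd:.3f}")
```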
Are we there yet?
Ongoing questions for debate (fall 2012 and beyond)
Confidence intervals approach—does it matter?
Plausible values?
Bootstrap?
Theory only?
Study design and simulation approach
Disconnect, or connect:
Re-randomize = randomized experiment
Bootstrap = random sample (both sketched below)
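The contrast is easiest to see side by side. A minimal sketch with hypothetical two-group data: re-randomizing shuffles the observed values across groups (mimicking random assignment in an experiment), while bootstrapping resamples each group with replacement (mimicking repeated random sampling):

```python
import random

# Hypothetical two-group data to contrast the two resampling schemes
a = [4.1, 5.0, 3.8, 4.6]
b = [3.2, 2.9, 3.5, 3.0]

# Re-randomize (models a randomized experiment): shuffle values across groups
pooled = a + b
random.shuffle(pooled)
rerandomized_a, rerandomized_b = pooled[:len(a)], pooled[len(a):]

# Bootstrap (models random sampling): resample each group with replacement
boot_a = [random.choice(a) for _ in a]
boot_b = [random.choice(b) for _ in b]

print(rerandomized_a, rerandomized_b)
print(boot_a, boot_b)
```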
Common questions we’ve heard
Q: Does this work?
Assessment data
Content objectives
Attitudes
More and more people doing this and saying it works!
At least 15 faculty using our materials, ~10 new faculty
this fall
Lock’s, CATALST, NCSU, UCLA, STATCRUNCH…and
more!
Common questions we’ve heard
Q: What makes the difference? Simulation?
Randomization? Pedagogy? Talking about inference
for 16 weeks instead of 5?
We don’t know… yet
Pedagogy and these changes are inextricably linked
On Sunday at the modeling session, one of the panelists
said, “How could you teach this without hands-on
activities and technology?”