PowerPoint - Hope College Math Department

Nathan Tintle
Dordt College
Sioux Center, Iowa
Outline

 We’ve been sitting too long!
 Why change? (pre-2009)
 Hold your breath
 Taking the plunge… (fall 2009)
 Show me the data
 Data on fall 2009 implementation (pre-, post- and 4-month
retention)
 Are we there yet?
 Joining forces, major changes and refinements (Summer 2010-present)
 Are we there yet?
 Open questions…(fall 2012 and beyond)
 When are we going to get there (and where is there!)?
 This and other common questions we’ve heard….
Background

 Consensus curriculum
 Descriptive statistics, design, probability and sampling distributions, inference
 Using a TI
 Lecture and lab separated
 Feeling like we were teaching algebra and
memorization of rules and not statistical thinking
 Had heard George Cobb’s talk from 2005
 Permutation tests more and more in research
Taking the plunge…

 …so hold your breath!
 Decided to throw out the old curriculum completely and
start from scratch.
 Hard to “sprinkle in,”
 Unclear how we would “incrementally change”
 Pilot in spring 2009 (loosely based on Rossman-Chance-Cobb-Holcomb modules)
 Faculty training workshop in May 2009
 Flesh out more in summer 2009 for use in fall 2009
 Todd Swanson, Jill VanderStoep and I
Details of fall 2009 implementation

 Unit #1 (4-5 weeks). Introduction to inference
 Chapter 1. Inference with a single proportion
(simulation only)
 Chapter 2. Inference comparing two proportions
(randomization test only)
 Chapter 3. Inference comparing two means
(randomization test only)
 Chapter 4. Inference for correlation and regression
(randomization test only)
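A minimal sketch of the kind of simulation Chapter 1 refers to, written in Python for illustration (the slides do not say which software the course used). The counts (14 successes in 16 trials), the null value of 0.5, and the function name are hypothetical.

```python
import random

def simulate_one_proportion(successes, n, null_p=0.5, reps=10_000, seed=0):
    """Approximate a one-sided p-value for H0: long-run proportion = null_p."""
    rng = random.Random(seed)
    at_least_as_extreme = 0
    for _ in range(reps):
        # One simulated study: n independent trials under the null model.
        sim_successes = sum(rng.random() < null_p for _ in range(n))
        if sim_successes >= successes:
            at_least_as_extreme += 1
    return at_least_as_extreme / reps

# Hypothetical data: 14 successes out of 16 trials.
p_value = simulate_one_proportion(successes=14, n=16)
print(f"approximate p-value: {p_value:.4f}")
```

The p-value is simply the share of simulated studies that are at least as extreme as the observed one, which is the idea students meet before any formula.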
Details of fall 2009 implementation

 Unit #2 (8-9 weeks). Revisiting inference: theory-based approaches, confidence intervals and power
 Chapter 5: Correlation and regression: revisited
 Chapter 6: comparing two and more means: revisited
 Chapter 7: comparing two or more proportions:
revisited
Key features of fall 2009

 Start with inference- Simulation and Randomization
first; Theory-based approaches later (revisit)
 Topics based chapters (Malone, et al. argued for
this)- Descriptive statistics are just in time
 Probability and sampling distributions introduced
intuitively; less formally
 Student projects (more, earlier)
 Pedagogy (active learning inextricably linked with
simulation/randomization)
 Case studies/research articles: Real data that matters
Assessment from fall 2009

 All instructors liked it better and felt that students learned more, and learned it better,
than before; student attitudes also seemed better
 Used CAOS as pre-post test and 4 month follow-up
 Full results are published vs. CAOS fall 2007
 Pre-post: Tintle NL et al. “Development and assessment of a
preliminary randomization-based introductory statistics
curriculum.” Journal of Statistics Education. March 2011.
 4 month retention: Tintle NL, et al. “Retention of statistical
concepts in a preliminary randomization based introductory
statistics curriculum” Statistics Education Research Journal. May
2012.
Assessment highlights from fall 2009

 Pre-post changes
 6 items significantly better (p≤0.001) fall 2009 (pre-post) as
compared to fall 2007 (at Hope) (p-values and design)
 1 item significantly worse (standard deviation/histogram)
 33 items n/s
 4-month retention
 Average of 48% loss of knowledge gained 4-months post course
(traditional curriculum)
 Average of 6% loss of knowledge gained 4-months post-course
(randomization curriculum)
Summer 2010

 Joined forces with Beth Chance, George Cobb, Allan
Rossman and Soma Roy to produce an introductory
statistics textbook
 Many major and minor changes to address
assessment results, teaching experience, etc. from fall
2009 (as well as experience over last two years of
teaching)
Course overview

 Chapter 0: A few basics (statistical method, desc. stats and
probability as long-run frequency) (~1 week)
 Unit 1: LOGIC AND SCOPE OF INFERENCE (5-6 weeks)
 Chapter 1: Simulation and theory-based inferential
approach for single proportion (SIGNIFICANCE)
 Chapter 2: Estimation using plausible values, ± 2SD,
theory-based approach (ESTIMATION)
 Chapter 3: Drawing conclusions from sample to
population (GENERALIZABILITY)
 Chapter 4: Association and causation (CAUSATION)
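One way to picture the ± 2SD estimate named in Chapter 2, as a hedged sketch: take the observed proportion and add and subtract two standard deviations of a simulated distribution of sample proportions. Simulating around the observed proportion is an assumption of this sketch (the slides do not say which distribution supplies the SD), and the data (60 successes in 100 trials) are hypothetical.

```python
import random
from statistics import pstdev

def two_sd_interval(successes, n, reps=5_000, seed=0):
    """Observed proportion +/- 2 SDs of simulated sample proportions."""
    rng = random.Random(seed)
    p_hat = successes / n
    # Assumption: simulate studies whose long-run proportion equals p_hat.
    sims = [sum(rng.random() < p_hat for _ in range(n)) / n for _ in range(reps)]
    sd = pstdev(sims)
    return p_hat - 2 * sd, p_hat + 2 * sd

# Hypothetical data: 60 successes out of 100 trials.
low, high = two_sd_interval(successes=60, n=100)
print(f"2SD interval: ({low:.3f}, {high:.3f})")
```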
Course overview

 Unit 2: COMPARING TWO GROUPS (5-6 weeks)
 Chapter 5: Comparing two proportions—
randomization and theory-based
 Chapter 6: Comparing two averages—randomization
and theory-based
 Chapter 7: Matched pairs and single mean—
randomization and theory-based
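A rough sketch of the randomization approach for comparing two averages: pool the responses, reshuffle them into two groups many times, and see how often the reshuffled difference is at least as large as the observed one. The data and function name are hypothetical, and Python is used only for illustration.

```python
import random
from statistics import mean

def randomization_test_means(group_a, group_b, reps=5_000, seed=0):
    """Two-sided randomization test for a difference in group means."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(reps):
        # Re-randomize: shuffle responses, then split into the original sizes.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed) - 1e-12:  # tolerance for float ties
            extreme += 1
    return observed, extreme / reps

# Hypothetical responses for two treatment groups.
group_a = [12.1, 14.3, 15.0, 13.8, 16.2, 15.5]
group_b = [10.2, 11.9, 12.4, 10.8, 13.1, 11.5]
observed_diff, p_value = randomization_test_means(group_a, group_b)
print(f"observed difference: {observed_diff:.2f}, p-value: {p_value:.4f}")
```

The same loop works for two proportions if the responses are coded as 0 and 1.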
Course overview

 Unit 3: ANALYZING MORE GENERAL
SITUATIONS (3-4 weeks)
 Chapter 8: Comparing more than two groups on a
categorical response
 Chapter 9: Comparing more than two groups with a
quantitative response
 Chapter 10: Correlation and regression
**Note: Chapters 7-10 can be done in any order**
Changes/major decisions

 Efficiency with randomization/simulation and
theory-based done simultaneously
 Logic and Scope of inference (Significance,
Estimation, Generalizability and Causation)
 Easier validity conditions for theory-based tests
 Much more…3-S process for assessing statistical
significance, 7-step method, approach on CIs,…
Assessment 2011/2012

 Portability
 Dordt similar results to Hope, showing improvement
over time as we make tweaks
 Continues to be good
 Exploring alternative assessment tests/questions more
tailored to our learning goals (e.g., GOALS, MOST,
Garfield et al.)
An example

 Chapter 9. Comparing multiple group means on a quantitative
response
 Exploration 9.1: Exercise and Brain volume in the elderly (data
from Mortimer et al. 2012)
 Brain size typically declines. Shrinkage may be linked to
dementia.
 Randomized experiment with 4 groups: (a) Tai Chi (b) Walking
(c) Social Interaction and (d) Control
 40 weeks: Measure percent change in brain size
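A hedged sketch of how a four-group comparison like this one could be tested by re-randomization. The numbers below are invented for illustration and are not the Mortimer et al. data; the statistic (mean absolute difference between pairs of group means) is one possible choice, not necessarily the one used in the exploration.

```python
import random
from itertools import combinations
from statistics import mean

def mean_abs_diff(groups):
    """Average absolute difference between all pairs of group means."""
    means = [mean(g) for g in groups]
    return mean([abs(a - b) for a, b in combinations(means, 2)])

def randomization_test_groups(groups, reps=2_000, seed=0):
    rng = random.Random(seed)
    observed = mean_abs_diff(groups)
    sizes = [len(g) for g in groups]
    pooled = [x for g in groups for x in g]
    extreme = 0
    for _ in range(reps):
        # Shuffle all responses and deal them back into groups of the same sizes.
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if mean_abs_diff(shuffled) >= observed - 1e-12:
            extreme += 1
    return observed, extreme / reps

# Hypothetical percent changes in brain volume (NOT the study's data).
tai_chi = [0.8, 0.5, 1.1, 0.3, 0.9]
walking = [0.1, -0.2, 0.4, 0.0, 0.2]
social = [0.6, 0.2, 0.7, 0.4, 0.5]
control = [-0.3, -0.1, 0.1, -0.4, 0.0]
observed_stat, p_value = randomization_test_groups([tai_chi, walking, social, control])
print(f"observed statistic: {observed_stat:.3f}, p-value: {p_value:.4f}")
```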
Where are we now

 Class testing fall 2012 (seven other institutions, plus
our own three)
 Anticipate Wiley published book within 18 months
(or so)
 Draft materials prepared now
 Contact if interested in learning more:
[email protected] or
http://math.hope.edu/isi for more details
Are we there yet?

 Debate over
 Bootstrapping vs. randomization vs. simulation
 How to handle confidence intervals
 Order
 Etc.
 Where is there?
Common questions we’ve heard

 Q: How can I convince client departments?
 Hard if you don’t do theory-based approaches any longer
 If you say “We’re still doing the test that you care about, but
we’re giving students a better scaffolding to understand
what that test is doing…”
 I haven’t heard ANY client department respond negatively
to this rationale
 Q: Ok, so how about my math colleagues?
 Harder. They like the MATH part of statistics, and that’s
what we’re arguing to do less of to encourage
STATISTICAL thinking.
 What is the goal of your course?
 Not algebra, not probability, not calculus, not mathematical
thinking…
Common questions we’ve heard

 Q: Sounds great, sounds like LOTS of work! Can I do
this?
 Be willing to get out of the boat!
 New things are never completely smooth
 I know of no one (yet!) who has done this and wants
to go back
 Utilize experienced instructor resources
 We have some and are working on more
 Need for more…
Conclusions

 Randomization is doable as intro course without
alienating client departments (theory-based tests)
 Shows promise (initial assessment data; feedback,
etc. is positive)
 More research needed to pinpoint the causes of
improved assessment data and accepted “best practices” (what matters and what doesn’t) for a
randomization-approach
Acknowledgments

 Collaborators:
 Hope College: Todd Swanson and Jill VanderStoep
 Cal Poly: Beth Chance, Allan Rossman and Soma Roy
 Mt. Holyoke: George Cobb
 Testers: Numerous other faculty and many, many
students
 Funding: NSF (DUE-1140629)
Significantly better
% of Students Correct

Item: Understanding of the purpose of randomization in an experiment (Data collection and design)
Cohort  Pretest  Posttest  Difference
NT      8.5      12.3      3.8
HT      4.6      9.7       5.1
HR      3.5      20.8      17.3

Item: Understanding that low p-values are desirable in research studies (Tests of significance)
Cohort  Pretest  Posttest  Difference
NT      49.9     68.5      18.6
HT      56.9     85.6      28.7
HR      56.9     96.0      39.1
Significantly better
% of Students Correct

Item: Understanding that no statistical significance does not guarantee that there is no effect (Tests of significance)
Cohort  Pretest  Posttest  Difference
NT      63.1     64.4      1.3
HT      66.2     72.7      6.5
HR      65.2     85.1      19.9

Item: Ability to recognize a correct interpretation of a p-value (Tests of significance)
Cohort  Pretest  Posttest  Difference
NT      46.8     54.5      7.7
HT      36.1     41.0      4.9
HR      42.3     60.0      17.7
Significantly better
% of Students Correct

Item: Ability to recognize an incorrect interpretation of a p-value. Specifically, probability that a treatment is not effective. (Tests of significance)
Cohort  Pretest  Posttest  Difference
NT      53.1     58.6      5.5
HT      59.8     68.6      8.8
HR      58.9     79.7      20.8

Item: Understanding of how to simulate data to find the probability of an observed value (Probability)
Cohort  Pretest  Posttest  Difference
NT      20.4     19.5      -0.9
HT      20.0     20.0      0.0
HR      20.0     32.2      12.2
Significantly worse
% of Students Correct

Item: Ability to correctly estimate and compare standard deviations for different histograms. (Descriptive statistics)
Cohort  Pretest  Posttest  Difference
NT      34.3     51.7      17.4
HT      44.8     70.8      26.0
HR      36.3     48.5      12.2
Retention

 CAOS scores 4 months post-course (fall 2007 vs. fall 2009
students)
Average change in percent correct, mean (SD)
Cohort     Posttest minus pretest  4-month retention minus posttest
Fall 2007  10.92 (9.5)             -5.28 (10.1)
Fall 2009  10.04 (12.3)            -0.61 (8.3)
 Significantly better retention in fall 2009 sample (p=0.002)
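The 48% and 6% "loss of knowledge gained" figures quoted in the assessment highlights appear to follow from the mean changes on this slide. A small worked check, assuming the loss is the drop from posttest to 4-month follow-up divided by the pretest-to-posttest gain (the slides do not spell out the calculation, so this is one reading that reproduces the numbers):

```python
def fraction_of_gain_lost(gain, retention_change):
    """Share of the pre-to-post gain lost by the 4-month follow-up."""
    return -retention_change / gain

# Mean changes in percent correct, taken from the retention slide.
fall_2007 = fraction_of_gain_lost(gain=10.92, retention_change=-5.28)
fall_2009 = fraction_of_gain_lost(gain=10.04, retention_change=-0.61)
print(f"fall 2007: {fall_2007:.0%} of the gain lost")
print(f"fall 2009: {fall_2009:.0%} of the gain lost")
```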
Best retention areas

Average score on topics
Topic                       Cohort      Pretest  Posttest  4-month retention  Change
Data collection and design  Randomized  31.58    43.09     41.12              -1.97
Data collection and design  Consensus   41.02    47.44     34.94              -12.50
Tests of significance       Randomized  51.54    71.27     72.37              1.10
Tests of significance       Consensus   51.51    67.31     64.31              -2.95
Common questions we’ve heard

 What do students prefer—
simulation/randomization or theory-based?
 Depends on how you present it---students will take
their cue from you
 My preference:
 A. Simulation requires computational power, didn’t
have that until recently
 B. Theory-based was the historical answer because it
predicts what would have happened had you simulated
 C. Theory-based approaches use mathematical theory to give a good
approximation of the simulation distribution under certain
validity conditions
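To illustrate point C, a small sketch comparing the spread of a simulated null distribution for a sample proportion with the standard theory-based formula sqrt(p(1-p)/n). The sample size and null value are hypothetical; with a large enough n the two should agree closely.

```python
import random
from math import sqrt
from statistics import pstdev

def simulated_null_sd(n, null_p, reps=10_000, seed=0):
    """SD of simulated sample proportions under the null model."""
    rng = random.Random(seed)
    props = [sum(rng.random() < null_p for _ in range(n)) / n for _ in range(reps)]
    return pstdev(props)

n, null_p = 100, 0.5  # hypothetical study size and null value
sim_sd = simulated_null_sd(n, null_p)
theory_sd = sqrt(null_p * (1 - null_p) / n)
print(f"simulated SD: {sim_sd:.4f}, theory-based SD: {theory_sd:.4f}")
```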
Are we there yet?

 Ongoing questions for debate (fall 2012 and beyond)
 Confidence intervals approach—does it matter?
 Plausible values?
 Bootstrap?
 Theory only?
 Study design and simulation approach
 Disconnect
 Connect
 Re-randomize= randomized experiment
 Bootstrap=random sample
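For contrast with re-randomization (which shuffles group assignments), a minimal bootstrap sketch: resample with replacement from a single sample, mirroring random sampling from a population. The sample values are hypothetical, and the percentile interval is just one of several bootstrap interval methods.

```python
import random
from statistics import mean

def bootstrap_means(sample, reps=2_000, seed=0):
    """Means of resamples drawn with replacement from the one observed sample."""
    rng = random.Random(seed)
    n = len(sample)
    return [mean(rng.choices(sample, k=n)) for _ in range(reps)]

def percentile_interval(values, level=0.95):
    """Middle `level` share of the sorted bootstrap statistics."""
    ordered = sorted(values)
    tail = (1 - level) / 2
    lower = ordered[int(tail * len(ordered))]
    upper = ordered[int((1 - tail) * len(ordered)) - 1]
    return lower, upper

# Hypothetical random sample of measurements.
sample = [4.1, 5.3, 6.0, 4.8, 5.9, 5.1, 6.4, 4.5, 5.6, 5.0]
boot_low, boot_high = percentile_interval(bootstrap_means(sample))
print(f"95% bootstrap interval for the mean: ({boot_low:.2f}, {boot_high:.2f})")
```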
Common questions we’ve heard

 Q: Does this work?
 Assessment data
 Content objectives
 Attitudes
 More and more people doing this and saying it works!
 At least 15 faculty with our materials, ~10 new faculty
this fall,
 Lock’s, CATALST, NCSU, UCLA, STATCRUNCH…and
more!
Common questions we’ve heard

 Q: What makes the difference? Simulation?
Randomization? Pedagogy? Talking about inference
for 16 weeks instead of 5?
 We don’t know…yet.
 Pedagogy and these changes are inextricably linked
 On Sunday at the modeling session one of the panelists
said “How could you teach this without hands on
activities and technology?”