Assessing the association between quantitative maturity and student performance in simulation-based and non-simulation-based introductory statistics
Nathan Tintle
Associate Professor of Statistics
Dordt College, Sioux Center, Iowa
Background
• What is simulation-based inference and why are people using it?
• Preliminary evidence is positive in aggregate and on particular
subscales
• Getting questions from workshop participants and users
• Will this work for my students who are particularly weak (or strong) or…
Literature
• Some prior studies have examined student quantitative maturity
and performance in introductory statistics courses
• Johnson and Kuennen (2006) – prior math skills (ACT and basic math skills test) strongly predicted student performance in intro stats
• Green et al. (2009) – prior college math courses taken and the order they were taken were strong predictors of student performance in intro stats (business school)
• Rochelle and Dotterweich (2007) – Performance in algebra course and
college GPA strong predictors of student performance (business school)
• Li et al. (2012) and Wang et al. (2007) – college GPA and ACT both strong
predictors of student performance
• Gnaldi (2006) – Poor math training in high school leads to challenges in
learning statistics in college
Literature
• Some prior studies have examined student quantitative maturity
and performance in introductory statistics courses
• Lester (2007) – Age and algebra skills strong predictors of student
performance
• Dupuis et al. (2011) – More math coursework in high school associated with
better performance in college stats (multi-institutional)
• Cherney and Cooney (2005) and Silvia et al. (2008) – mathematical skills and
student attitudes were both significant predictors of student performance
Overall themes
• To date
• Associations found between prior abilities/performance and statistics course performance
• Tend to be single institution
• Tend to focus on general mathematical and quantitative reasoning skills
(ACT score, algebra skills)
• Tend to focus on course grade as a measure of performance
Gaps in research to date
• Gaps
• Course grade is not normalized across institutions
• E.g., not a nationally normed assessment of student abilities; no sense of how much reflects memorization/rehearsed algebra vs. conceptual understanding
• Course grade does not look at change over time (pre-course to post-course)
• E.g., Are students actually learning anything in the course or did they know it
already?
• Primarily focused on single curricula
• Similar results across institutions?
• Similar results across curricula? Simulation-based curriculum vs. Traditional
curriculum
Results addressing these gaps: Part 1
• Two ‘before and after’ stories
• Two Midwestern liberal arts colleges
• Using ‘traditional’ Stat 101 curricula (Moore at one; Agresti and Franklin at
the other)
• Changed to early versions of the ISI curriculum
Results addressing these gaps: Part 1
• Two ‘before and after’ stories
• Traditional curriculum
• 289 students
• Two semesters
• Two institutions
• Multiple sections and instructors (average section size of 25-30 students)
• Comprehensive Assessment of Outcomes in Statistics (CAOS)
  • First week of class
  • During finals week
  • Online; incentive for taking, not for performance
  • Response rate over 85% per section
Results addressing these gaps: Part 1
• Two ‘before and after’ stories
• Early SBI curriculum
• 366 students
• Three semesters
• Two institutions
• Multiple sections and instructors (average section size of 25-30 students)
• Comprehensive Assessment of Outcomes in Statistics (CAOS)
  • First week of class
  • During finals week
  • Online; incentive for taking, not for performance
  • Response rate over 85% per section
Results addressing these gaps: Part 1
• Screened out some bad data (e.g., completing the assessment too quickly)
• Similar demographics between the two (before and after, within
institution)
Table 1. Pre- and post-course CAOS scores stratified by pre-course performance and curriculum

| Pre-test score group | Curriculum | Pre-test mean % correct (SD) | Post-test mean % correct (SD) | Change in mean % correct (SD)¹ | Difference in mean change by curriculum² |
|---|---|---|---|---|---|
| Low (≤40%) | Consensus (n=80) | 35.2 (4.9) | 48.1 (8.8) | 12.9 (9.4)*** | 1.9 |
| Low (≤40%) | Early-SBI (n=141) | 35.2 (4.8) | 50.1 (9.8) | 14.9 (10.6)*** | |
| Middle (between 40 and 50%) | Consensus (n=77) | 45.1 (2.1) | 52.0 (10.2) | 6.9 (10.2)*** | 3.1* |
| Middle (between 40 and 50%) | Early-SBI (n=108) | 44.9 (2.0) | 54.9 (10.8) | 10.0 (10.4)*** | |
| High (≥50%) | Consensus (n=129) | 57.1 (6.8) | 62.3 (6.8) | 5.2 (9.1)*** | 2.1 |
| High (≥50%) | Early-SBI (n=117) | 55.8 (5.6) | 63.0 (11.3) | 7.2 (10.5)*** | |
| Overall | Consensus (n=289) | 47.7 (10.7) | 55.5 (11.8) | 7.8 (10.0)*** | 3.3*** |
| Overall | Early-SBI (n=366) | 44.6 (9.7) | 55.6 (11.9) | 11.0 (11.0)*** | |

*p<0.05; **p<0.01; ***p<0.001
1. Significance is indicated by asterisks, based on paired t-tests comparing the pre-test and post-test scores.
2. From a linear model predicting the change in score by curriculum, adjusted for institution. Institution was not significant in any of the four models (p-values: 0.62 low; 0.22 middle; 0.72 high; 0.38 overall).
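For readers who want to reproduce this kind of analysis, here is a minimal Python sketch of the two procedures named in the footnotes (paired t-tests and a curriculum model adjusted for institution); the file name and column names (pre, post, curriculum, institution) are hypothetical placeholders, not the study's actual data layout.

```python
# Minimal sketch of the Table 1 analyses. File and column names
# (pre, post, curriculum, institution) are hypothetical placeholders.
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf

df = pd.read_csv("caos_scores.csv")            # hypothetical data file
df["change"] = df["post"] - df["pre"]          # pre-to-post change in % correct

# Footnote 1: paired t-test of post- vs. pre-test scores, e.g., for the
# low pre-test stratum under the Early-SBI curriculum
low_sbi = df[(df["curriculum"] == "Early-SBI") & (df["pre"] <= 40)]
t_stat, p_val = stats.ttest_rel(low_sbi["post"], low_sbi["pre"])

# Footnote 2: linear model predicting change by curriculum, adjusted for
# institution (categorical predictors are dummy-coded automatically)
model = smf.ols("change ~ curriculum + institution", data=df).fit()
print(model.summary())
```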
Take homes from Table 1
• All groups improved significantly under both curricula
• Some evidence of improved performance within each stratum of pre-test score with the early-SBI curriculum (not always statistically significant)
Table 2. Pre- and post-course CAOS scores stratified by ACT score and curriculum¹

| ACT group | Curriculum | Pre-test mean % correct (SD) | Post-test mean % correct (SD) | Change in mean % correct (SD)² | Difference in mean change by curriculum³ |
|---|---|---|---|---|---|
| Low (≤22) | Consensus (n=21) | 41.7 (10.3) | 46.3 (10.1) | 4.0 (11.7) | 8.2*** |
| Low (≤22) | Early-SBI (n=55) | 42.7 (10.1) | 54.9 (11.9) | 12.2 (10.5)*** | |
| Middle (23-26) | Consensus (n=34) | 46.0 (8.2) | 52.4 (10.3) | 6.5 (9.2)*** | 4.7* |
| Middle (23-26) | Early-SBI (n=48) | 44.0 (10.0) | 55.1 (10.8) | 11.2 (11.4)*** | |
| High (≥27) | Consensus (n=36) | 51.3 (7.7) | 57.1 (7.7) | 5.8 (9.2)** | 6.0* |
| High (≥27) | Early-SBI (n=49) | 47.8 (9.8) | 59.5 (12.0) | 11.8 (10.1)*** | |
| Overall | Consensus (n=91) | 46.4 (9.3) | 52.0 (11.0) | 5.6 (9.8) | 6.0*** |
| Overall | Early-SBI (n=152) | 44.9 (10.1) | 56.5 (11.6) | 11.6 (10.7) | |

*p<0.05; **p<0.01; ***p<0.001
1. Only for students with ACT scores available (all students with available ACT scores were from one of the two colleges evaluated in Table 1).
2. From a paired t-test comparing the pre-test and post-test scores.
3. These values indicate how the two curricula differ in changing student scores. For example, 8.2 means the Early-SBI curriculum's improvement in percent correct is 8.2 percentage points higher than the Consensus curriculum's. A test of whether the difference in mean change varied by ACT group (8.2 vs. 4.7 vs. 6.0) did not yield evidence of a significant difference (p=0.15; ANOVA comparison of a model predicting post-test scores by pre-test, curriculum, and ACT score group against a model predicting post-test scores by pre-test and curriculum only).
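Footnote 3's model comparison could be sketched as follows; this is one plausible reading of the described ANOVA, with hypothetical file and column names (pre, post, curriculum, act_group).

```python
# Sketch of the nested-model comparison in footnote 3; file and column
# names (pre, post, curriculum, act_group) are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.read_csv("caos_with_act.csv")          # hypothetical data file

# Reduced model: post-test predicted by pre-test and curriculum only
reduced = smf.ols("post ~ pre + curriculum", data=df).fit()

# Full model: adds ACT score group
full = smf.ols("post ~ pre + curriculum + act_group", data=df).fit()

# F-test of whether ACT group adds explanatory power; the presentation
# reports p = 0.15 for this comparison (no significant difference)
print(anova_lm(reduced, full))
```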
Take homes from Table 2
• All groups show significant improvement except
  • The lowest ACT score group with the consensus curriculum
• All groups significantly better with the early-SBI curriculum
Limitation: Only one of the two institutions had ACT scores available; however, there was no significant evidence of institutional differences in the models for Table 1.
Part 1: Results by subscale
• Nine CAOS subscales
  • Graphical representations
  • Boxplots
  • Data collection and design
  • Descriptive statistics
  • Tests of significance
  • Bivariate relationships
  • Confidence intervals
  • Sampling variability
  • Probability/Simulation
Results by subscale
• Lowest pre-test group (≤40%)
• 3 subscales significant improvement (p<0.05) SBI vs. traditional
• Data collection and design (9.4 points more improvement for SBI), Tests of
significance (8.4 points), Probability/Simulation (15.8 points)
• 6 no significant change (4 of 6 improved SBI vs. traditional)
• Middle pre-test group (40-50%)
• 2 subscales significant improvement (p<0.05) SBI vs. traditional
• Data collection and design (11.5 points) and Probability/Simulation (14.4 points)
• 6 no change (5 of 6 improved SBI vs. traditional)
• 1 significantly worse (10 point decrease on descriptive statistics)
• Highest pre-test group (≥50%)
• 1 subscale significant improvement (Tests of significance 8.2 points)
• 8 no change (5 of 8 improved)
Take homes from subscale results
• SBI better across most subscales and most subgroups of students, with the largest gains on tests of significance, probability/simulation, and design
• *Note that the decrease for descriptive statistics went away in later versions of the curriculum
Part 2. Multi-institution analysis
• 1078 students
• 34 instructor sections
• 13 institutions (1 CC, 1 private university, 2 high school AP courses, 4 liberal arts colleges, and 4 public universities)
• Modified CAOS test
• Elimination or modification of questions commonly missed on the post-test or commonly answered correctly on the pre-test
• Generally similar administration (online, outside of class, first
week/finals week, incentive to participate) across sections
• College GPA via self-report
• All sections used an SBI curriculum (the ISI curriculum)
Part 2. Multi-institution analysis
| Grouping | Group | Pre-test mean (SD) | Post-test mean (SD) | Change mean (SD)¹ |
|---|---|---|---|---|
| Pre-test concept score | Low (<40% correct; n=291) | 35.0 (5.0) | 50.2 (12.0) | 15.2 (12.3)*** |
| Pre-test concept score | Middle (40-55%; n=422) | 48.1 (3.8) | 56.2 (12.1) | 8.1 (11.9)*** |
| Pre-test concept score | High (55%+; n=365) | 64.1 (7.3) | 68.1 (12.8) | 4.0 (10.8)*** |
| Self-reported college GPA | Low (B or worse; n=193) | 45.6 (12.3) | 52.9 (13.1) | 7.3 (11.8)*** |
| Self-reported college GPA | Middle (B+ to A-; n=654) | 50.0 (12.0) | 58.1 (13.6) | 8.1 (12.6)*** |
| Self-reported college GPA | High (A; n=231) | 53.8 (13.6) | 64.9 (14.7) | 11.1 (12.2)*** |
| Overall | (n=1078) | 50.0 (12.6) | 58.6 (14.3) | 8.6 (12.5)*** |

***p<0.001
1. From a paired t-test comparing the pre-test and post-test scores.
Part 2. Multi-institution analysis
• Take home messages
• All groups show significant improvement
• Regression-to-the-mean effects noted when stratifying by pre-test score
• Improvement comparable across groups
Part 2. Multi-institution analysis by subscale
• Seven subscales with new instrument; stratified by GPA
• Table shows pre-test to post-test improvement
| Subscale | Low GPA | Middle GPA | High GPA |
|---|---|---|---|
| Graphical representations | 6.5* | 6.4*** | 8.9*** |
| Data collection and design | -2.6 | -0.0 | 7.5*** |
| Descriptive statistics | 4.4 | 9.7*** | 5.7 |
| Tests of significance | 10.7*** | 10.8*** | 14.8*** |
| Confidence intervals | 10.7*** | 10.3*** | 14.0*** |
| Sampling variability | 12.6*** | 3.2* | 5.5 |
| Probability/Simulation | 10.9*** | 10.6*** | 11.4*** |

*p<0.05; ***p<0.001
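Each cell above is a mean pre-to-post change within a GPA stratum; a table like this could be computed with a pandas pivot, sketched below with hypothetical long-format columns (subscale, gpa_group, pre, post).

```python
# Sketch: mean pre-to-post change per subscale and GPA stratum, as in the
# table above; file and long-format column names are hypothetical.
import pandas as pd

df = pd.read_csv("subscale_scores_long.csv")   # hypothetical data file
df["change"] = df["post"] - df["pre"]

# Rows: subscale; columns: GPA stratum; cells: mean change in % correct
table = df.pivot_table(index="subscale", columns="gpa_group",
                       values="change", aggfunc="mean")
print(table.round(1))
```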
Take homes
• Fairly consistent improvement across subscales
• Fairly consistent improvement across subgroups stratified by GPA
• Some modest evidence of differences for data collection/design
and sampling variability
Overall take home messages
• SBI showed consistent improvement at two institutions vs.
traditional curriculum
• Across student ability groups (pre-test on stat conceptual understanding and
ACT)
• Particular gains in conceptual areas emphasized by SBI; do-no-harm in other
areas
• SBI showed improvement across conceptual subscales across
institutions and across student ability groups as measured by
conceptual pre-test and self-reported college GPA
Limitations
• Self-reported GPA; limited ACT data
• Limited institutions (pedagogical effects, institutional effects, etc.)
• Limited ability to draw cross-curricular comparisons at additional institutions
• Factors other than GPA/ACT/pre-test score worth examining (e.g., SES, race/ethnicity)
• More sophisticated statistical modelling possibilities (hierarchical models; relative vs. absolute gain); see the sketch after this list
• Incorporating student attitudes

These limitations are being addressed in current work, with an expanded set of data being gathered and ongoing statistical analysis of current data; results so far are similar.
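As one illustration of the hierarchical-modelling direction mentioned above, the sketch below fits a random intercept per instructor section using statsmodels' mixedlm; the file and column names (change, pre, gpa_group, section) are hypothetical, not the project's actual variables.

```python
# Sketch of a hierarchical (mixed-effects) model accounting for students
# nested within instructor sections; all names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("multi_institution.csv")      # hypothetical data file

# Fixed effects: pre-test score and GPA group; random intercept per section
m = smf.mixedlm("change ~ pre + gpa_group", data=df,
                groups=df["section"]).fit()
print(m.summary())
```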
Conclusions
• SBI continues to show promise overall and across student ability
groups, especially in areas of emphasis of SBI curricula (design,
tests of significance, probability/simulation)
• Future work is needed to ensure transferability of results to broader groups of students and to additional SBI and non-SBI curricula, and to consider the impacts of student demographics and attitudes towards statistics
Acknowledgments
• Beth Chance, Cindy Nederhoff and others on the ISI development
and assessment team
• Support from the National Science Foundation (Grants DUE-1140629 and DUE-1323210)
• Class testers and students!
References
• Cherney, I. D., & Cooney, R. R. (2005). The Mathematics and Statistics Perception Scale. Transactions of the Nebraska Academy of Sciences, 30, 1–8.
• Dupuis, D. N., Medhanie, A., Harwell, M., Lebeau, B., Monson, D., & Post, T. R. (2011). A multi-institutional study of the relationship between high school mathematics achievement and performance in introductory college statistics. Statistics Education Research Journal, 11(1), 4–20.
• Gnaldi, M. (2006). The relationship between poor numerical abilities and subsequent difficulty in accumulating statistical knowledge. Teaching Statistics, 28(2), 49–53.
• Green, J. J., Stone, C. C., Zegeye, A., & Charles, T. A. (2009). How much math do students need to succeed in business and economics statistics? An ordered probit analysis. Journal of Statistics Education, 17(3).
• Johnson, M., & Kuennen, E. (2006). Basic math skills and performance in an introductory statistics course. Journal of Statistics Education, 14(2). Retrieved from http://www.amstat.org/publications/jse/v14n2/johnson.html
• Lester, D. (2007). Predicting performance in a psychological statistics course. Psychological Reports, 101, 334.
• Li, K., Uvah, J., & Amin, R. (2012). Predicting students' performance in Elements of Statistics. US-China Review, 10, 875–884. Retrieved from http://files.eric.ed.gov/fulltext/ED537981.pdf
• Malone, C., Gabrosek, J., Curtiss, P., & Race, M. (2010). Resequencing topics in an introductory applied statistics course. The American Statistician, 64(1), 52–58.
• Rochelle, C. F., & Dotterweich, D. (2007). Student success in business statistics. Journal of Economics and Finance Education, 6(1).
• Scheaffer, R. (1997). Discussion to new pedagogy and new content: The case of statistics. International Statistical Review, 65(2), 156–158.
• Silvia, G., Matteo, C., Francesca, C., & Caterina, P. (2008). Who failed the introductory statistics examination? A study on a sample of psychology students. International Conference on Mathematics Education.
• Wang, J.-T., Tu, S.-Y., & Shieh, Y.-Y. (2007). A study on student performance in the college introductory statistics course. AMATYC Review, 29(1), 54–62.