Assessing the association between quantitative maturity and student performance in simulation-based and non-simulation-based introductory statistics
Nathan Tintle
Associate Professor of Statistics
Dordt College, Sioux Center, Iowa
Background
• What is simulation-based inference and why are people using it?
• Preliminary evidence is positive in aggregate and on particular
subscales
• Getting questions from workshop participants and users
• Will this work for my students who are particularly weak (or strong) or…
Literature
• Some prior studies have examined student quantitative maturity
and performance in introductory statistics courses
• Johnson and Kuennen (2006) – prior math skills (ACT and basic math skills test) strongly predicted student performance in intro stats
• Green et al. (2009) – prior college math courses taken and the order they were taken were strong predictors of student performance in intro stats (business school)
• Rochelle and Dotterweich (2007) – Performance in algebra course and
college GPA strong predictors of student performance (business school)
• Li et al. (2012) and Wang et al. (2007) – college GPA and ACT both strong
predictors of student performance
• Gnaldi (2006) – Poor math training in high school leads to challenges in
learning statistics in college
Literature
• Some prior studies have examined student quantitative maturity
and performance in introductory statistics courses
• Lester (2007) – Age and algebra skills strong predictors of student
performance
• Dupuis et al. (2011) – More math coursework in high school associated with
better performance in college stats (multi-institutional)
• Cherney and Cooney (2005) and Silvia et al. (2008) – mathematical skills and
student attitudes were both significant predictors of student performance
Overall themes
• To date
• Associations found between prior abilities/performance and statistics course performance
• Tend to be single institution
• Tend to focus on general mathematical and quantitative reasoning skills
(ACT score, algebra skills)
• Tend to focus on course grade as a measure of performance
Gaps in research to date
• Gaps
• Course grade is not normalized across institutions
• E.g., not a nationally normed assessment of student abilities; no sense of how much reflects memorization/rehearsed algebra vs. conceptual understanding
• Course grade does not look at change over time (pre-course to post-course)
• E.g., Are students actually learning anything in the course or did they know it
already?
• Primarily focused on single curricula
• Similar results across institutions?
• Similar results across curricula? Simulation-based curriculum vs. Traditional
curriculum
Results addressing these gaps: Part 1
• Two ‘before and after’ stories
• Two Midwestern liberal arts colleges
• Using ‘traditional’ Stat 101 curricula (Moore at one; Agresti and Franklin at
the other)
• Changed to early versions of the ISI curriculum
Results addressing these gaps: Part 1
• Two ‘before and after’ stories
• Traditional curriculum
• 289 students
• Two semesters
• Two institutions
• Multiple sections and instructors (average section size of 25-30 students)
• Comprehensive Assessment of Outcomes in Statistics (CAOS)
  • First week of class
  • During finals week
  • Online; incentive for taking, not for performance
  • Response rate over 85% per section
Results addressing these gaps: Part 1
• Two ‘before and after’ stories
• Early SBI curriculum
• 366 students
• Three semesters
• Two institutions
• Multiple sections and instructors (average section size of 25-30 students)
• Comprehensive Assessment of Outcomes in Statistics (CAOS)
  • First week of class
  • During finals week
  • Online; incentive for taking, not for performance
  • Response rate over 85% per section
Results addressing these gaps: Part 1
• Screened out some bad data (e.g., completing the assessment too quickly)
• Similar demographics between the two (before and after, within
institution)
Table 1. Pre- and post-course CAOS scores stratified by pre-course performance and curriculum

| Pre-test score group | Curriculum | Pre-test mean % correct (SD) | Post-test mean % correct (SD) | Change in mean % correct (SD)¹ | Difference in mean change by curriculum² |
|---|---|---|---|---|---|
| Low (≤40%) | Consensus (n=80) | 35.2 (4.9) | 48.1 (8.8) | 12.9 (9.4)*** | 1.9 |
| Low (≤40%) | Early-SBI (n=141) | 35.2 (4.8) | 50.1 (9.8) | 14.9 (10.6)*** | |
| Middle (between 40 and 50%) | Consensus (n=77) | 45.1 (2.1) | 52.0 (10.2) | 6.9 (10.2)*** | 3.1* |
| Middle (between 40 and 50%) | Early-SBI (n=108) | 44.9 (2.0) | 54.9 (10.8) | 10.0 (10.4)*** | |
| High (≥50%) | Consensus (n=129) | 57.1 (6.8) | 62.3 (6.8) | 5.2 (9.1)*** | 2.1 |
| High (≥50%) | Early-SBI (n=117) | 55.8 (5.6) | 63.0 (11.3) | 7.2 (10.5)*** | |
| Overall | Consensus (n=289) | 47.7 (10.7) | 55.5 (11.8) | 7.8 (10.0)*** | 3.3*** |
| Overall | Early-SBI (n=366) | 44.6 (9.7) | 55.6 (11.9) | 11.0 (11.0)*** | |

*p<0.05; **p<0.01; ***p<0.001
1. Significance is indicated by asterisks, based on paired t-tests comparing the pre-test and post-test scores.
2. From a linear model predicting the change in score by curriculum, adjusted for institution. Institution was not significant in any of the four models (p-values: 0.62 low; 0.22 middle; 0.72 high; 0.38 overall).
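For readers who want to reproduce this kind of analysis, here is a minimal Python sketch of the two procedures named in the footnotes (paired t-tests and a curriculum model adjusted for institution); the file name and column names (pre, post, curriculum, institution) are hypothetical placeholders, not the study's actual data layout.

```python
# Minimal sketch of the Table 1 analyses. File and column names
# (pre, post, curriculum, institution) are hypothetical placeholders.
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf

df = pd.read_csv("caos_scores.csv")            # hypothetical data file
df["change"] = df["post"] - df["pre"]          # pre-to-post change in % correct

# Footnote 1: paired t-test of post- vs. pre-test scores, e.g., for the
# low pre-test stratum under the Early-SBI curriculum
low_sbi = df[(df["curriculum"] == "Early-SBI") & (df["pre"] <= 40)]
t_stat, p_val = stats.ttest_rel(low_sbi["post"], low_sbi["pre"])

# Footnote 2: linear model predicting change by curriculum, adjusted for
# institution (categorical predictors are dummy-coded automatically)
model = smf.ols("change ~ curriculum + institution", data=df).fit()
print(model.summary())
```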
Take homes from Table 1
• All groups improved significantly under both curricula
• Some evidence of improved performance within each stratum of pre-test score with the early-SBI curriculum (not always statistically significant)
Table 2. Pre- and post-course CAOS scores stratified by ACT score and curriculum¹

| ACT group | Curriculum | Pre-test mean % correct (SD) | Post-test mean % correct (SD) | Change in mean % correct (SD)² | Difference in mean change by curriculum³ |
|---|---|---|---|---|---|
| Low (≤22) | Consensus (n=21) | 41.7 (10.3) | 46.3 (10.1) | 4.0 (11.7) | 8.2*** |
| Low (≤22) | Early-SBI (n=55) | 42.7 (10.1) | 54.9 (11.9) | 12.2 (10.5)*** | |
| Middle (23-26) | Consensus (n=34) | 46.0 (8.2) | 52.4 (10.3) | 6.5 (9.2)*** | 4.7* |
| Middle (23-26) | Early-SBI (n=48) | 44.0 (10.0) | 55.1 (10.8) | 11.2 (11.4)*** | |
| High (≥27) | Consensus (n=36) | 51.3 (7.7) | 57.1 (7.7) | 5.8 (9.2)** | 6.0* |
| High (≥27) | Early-SBI (n=49) | 47.8 (9.8) | 59.5 (12.0) | 11.8 (10.1)*** | |
| Overall | Consensus (n=91) | 46.4 (9.3) | 52.0 (11.0) | 5.6 (9.8) | 6.0*** |
| Overall | Early-SBI (n=152) | 44.9 (10.1) | 56.5 (11.6) | 11.6 (10.7) | |

*p<0.05; **p<0.01; ***p<0.001
1. Only for students with ACT scores available (all students with available ACT scores were from one of the two colleges evaluated in Table 1).
2. From a paired t-test comparing the pre-test and post-test scores.
3. These values indicate how the two curricula differ in changing student scores. For example, 8.2 means the Early-SBI curriculum's improvement in percent correct is 8.2 percentage points higher than the Consensus curriculum's. A test of whether the difference in mean change varied by ACT group (8.2 vs. 4.7 vs. 6.0) did not yield evidence of a significant difference (p=0.15; ANOVA comparison of a model predicting post-test scores by pre-test, curriculum, and ACT score group against a model predicting post-test scores by pre-test and curriculum only).
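Footnote 3's model comparison could be sketched as follows; this is one plausible reading of the described ANOVA, with hypothetical file and column names (pre, post, curriculum, act_group).

```python
# Sketch of the nested-model comparison in footnote 3; file and column
# names (pre, post, curriculum, act_group) are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.read_csv("caos_with_act.csv")          # hypothetical data file

# Reduced model: post-test predicted by pre-test and curriculum only
reduced = smf.ols("post ~ pre + curriculum", data=df).fit()

# Full model: adds ACT score group
full = smf.ols("post ~ pre + curriculum + act_group", data=df).fit()

# F-test of whether ACT group adds explanatory power; the presentation
# reports p = 0.15 for this comparison (no significant difference)
print(anova_lm(reduced, full))
```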
Take homes from Table 2
• All groups show significant improvement except
  • The lowest ACT score group with the consensus curriculum
• All groups significantly better with the early-SBI curriculum
Limitation: Only one of the two institutions had ACT scores available; however, there was no significant evidence of institutional differences in the models for Table 1.
Part 1: Results by subscale
• Nine CAOS subscales
  • Graphical representations
  • Boxplots
  • Data collection and design
  • Descriptive statistics
  • Tests of significance
  • Bivariate relationships
  • Confidence intervals
  • Sampling variability
  • Probability/Simulation
Results by subscale
• Lowest pre-test group (≤40%)
• 3 subscales significant improvement (p<0.05) SBI vs. traditional
• Data collection and design (9.4 points more improvement for SBI), Tests of
significance (8.4 points), Probability/Simulation (15.8 points)
• 6 no significant change (4 of 6 improved SBI vs. traditional)
• Middle pre-test group (40-50%)
• 2 subscales significant improvement (p<0.05) SBI vs. traditional
• Data collection and design (11.5 points) and Probability/Simulation (14.4 points)
• 6 no change (5 of 6 improved SBI vs. traditional)
• 1 significantly worse (10 point decrease on descriptive statistics)
• Highest pre-test group (≥50%)
• 1 subscale significant improvement (Tests of significance 8.2 points)
• 8 no change (5 of 8 improved)
Take homes from subscale results
• SBI better across most subscales and most subgroups of students, with the largest gains on tests of significance, probability/simulation, and design
• *Note that the decrease for descriptive statistics went away in later versions of the curriculum
Part 2. Multi-institution analysis
• 1078 students
• 34 instructor sections
• 13 institutions (1 CC, 1 private university, 2 high school AP courses, 4 liberal arts colleges, and 4 public universities)
• Modified CAOS test
• Elimination or modification of questions commonly missed on the post-test or commonly answered correctly on the pre-test
• Generally similar administration (online, outside of class, first
week/finals week, incentive to participate) across sections
• College GPA via self-report
• All sections used an SBI curriculum (the ISI curriculum)
Part 2. Multi-institution analysis
| Grouping | Group | Pre-test mean (SD) | Post-test mean (SD) | Change mean (SD)¹ |
|---|---|---|---|---|
| Pre-test concept score | Low (<40% correct; n=291) | 35.0 (5.0) | 50.2 (12.0) | 15.2 (12.3)*** |
| Pre-test concept score | Middle (40-55%; n=422) | 48.1 (3.8) | 56.2 (12.1) | 8.1 (11.9)*** |
| Pre-test concept score | High (55%+; n=365) | 64.1 (7.3) | 68.1 (12.8) | 4.0 (10.8)*** |
| Self-reported college GPA | Low (B or worse; n=193) | 45.6 (12.3) | 52.9 (13.1) | 7.3 (11.8)*** |
| Self-reported college GPA | Middle (B+ to A-; n=654) | 50.0 (12.0) | 58.1 (13.6) | 8.1 (12.6)*** |
| Self-reported college GPA | High (A; n=231) | 53.8 (13.6) | 64.9 (14.7) | 11.1 (12.2)*** |
| Overall | (n=1078) | 50.0 (12.6) | 58.6 (14.3) | 8.6 (12.5)*** |

***p<0.001
1. From a paired t-test comparing the pre-test and post-test scores.
Part 2. Multi-institution analysis
• Take home messages
• All groups show significant improvement
• Regression-to-the-mean effects noted when stratifying by pre-test score
• Improvement comparable across groups
Part 2. Multi-institution analysis by subscale
• Seven subscales with new instrument; stratified by GPA
• Table shows pre-test to post-test improvement
| Subscale | Low GPA | Middle GPA | High GPA |
|---|---|---|---|
| Graphical representations | 6.5* | 6.4*** | 8.9*** |
| Data collection and design | -2.6 | -0.0 | 7.5*** |
| Descriptive statistics | 4.4 | 9.7*** | 5.7 |
| Tests of significance | 10.7*** | 10.8*** | 14.8*** |
| Confidence intervals | 10.7*** | 10.3*** | 14.0*** |
| Sampling variability | 12.6*** | 3.2* | 5.5 |
| Probability/Simulation | 10.9*** | 10.6*** | 11.4*** |

*p<0.05; ***p<0.001
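Each cell above is a mean pre-to-post change within a GPA stratum; a table like this could be computed with a pandas pivot, sketched below with hypothetical long-format columns (subscale, gpa_group, pre, post).

```python
# Sketch: mean pre-to-post change per subscale and GPA stratum, as in the
# table above; file and long-format column names are hypothetical.
import pandas as pd

df = pd.read_csv("subscale_scores_long.csv")   # hypothetical data file
df["change"] = df["post"] - df["pre"]

# Rows: subscale; columns: GPA stratum; cells: mean change in % correct
table = df.pivot_table(index="subscale", columns="gpa_group",
                       values="change", aggfunc="mean")
print(table.round(1))
```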
Take homes
• Fairly consistent improvement across subscales
• Fairly consistent improvement across subgroups stratified by GPA
• Some modest evidence of differences for data collection/design
and sampling variability
Overall take home messages
• SBI showed consistent improvement at two institutions vs.
traditional curriculum
• Across student ability groups (pre-test on stat conceptual understanding and
ACT)
• Particular gains in conceptual areas emphasized by SBI; do-no-harm in other
areas
• SBI showed improvement across conceptual subscales across
institutions and across student ability groups as measured by
conceptual pre-test and self-reported college GPA
Limitations
• Self-reported GPA; limited ACT data
• Limited institutions (pedagogical effects, institutional effects, etc.)
• Limited ability to draw cross-curricular comparisons at additional institutions
• Factors other than GPA/ACT/pre-test score worth examining (e.g., SES, race/ethnicity)
• More sophisticated statistical modelling possibilities (hierarchical models; relative vs. absolute gain); see the sketch after this list
• Incorporating student attitudes

These limitations are being addressed in current work, with an expanded set of data being gathered and ongoing statistical analysis of current data; results so far are similar.
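As one illustration of the hierarchical-modelling direction mentioned above, the sketch below fits a random intercept per instructor section using statsmodels' mixedlm; the file and column names (change, pre, gpa_group, section) are hypothetical, not the project's actual variables.

```python
# Sketch of a hierarchical (mixed-effects) model accounting for students
# nested within instructor sections; all names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("multi_institution.csv")      # hypothetical data file

# Fixed effects: pre-test score and GPA group; random intercept per section
m = smf.mixedlm("change ~ pre + gpa_group", data=df,
                groups=df["section"]).fit()
print(m.summary())
```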
Conclusions
• SBI continues to show promise overall and across student ability
groups, especially in areas of emphasis of SBI curricula (design,
tests of significance, probability/simulation)
• Future work is needed to ensure transferability of results to broader groups of students and to additional SBI and non-SBI curricula, and to consider the impacts of student demographics and attitudes towards statistics
Acknowledgments
• Beth Chance, Cindy Nederhoff and others on the ISI development
and assessment team
• Support from the National Science Foundation (Grants DUE-1140629 and DUE-1323210)
• Class testers and students!
References
• Cherney, I. D., & Cooney, R. R. (2005). The Mathematics and Statistics Perception Scale. Transactions of the Nebraska Academy of Sciences, 30, 1–8.
• Dupuis, D. N., Medhanie, A., Harwell, M., Lebeau, B., Monson, D., & Post, T. R. (2011). A multi-institutional study of the relationship between high school mathematics achievement and performance in introductory college statistics. Statistics Education Research Journal, 11(1), 4–20.
• Gnaldi, M. (2006). The relationship between poor numerical abilities and subsequent difficulty in accumulating statistical knowledge. Teaching Statistics, 28(2), 49–53.
• Green, J. J., Stone, C. C., Zegeye, A., & Charles, T. A. (2009). How much math do students need to succeed in business and economics statistics? An ordered probit analysis. Journal of Statistics Education, 17(3).
• Johnson, M., & Kuennen, E. (2006). Basic math skills and performance in an introductory statistics course. Journal of Statistics Education, 14(2). Retrieved from http://www.amstat.org/publications/jse/v14n2/johnson.html
• Lester, D. (2007). Predicting performance in a psychological statistics course. Psychological Reports, 101, 334.
• Li, K., Uvah, J., & Amin, R. (2012). Predicting students' performance in Elements of Statistics. US-China Review, 10, 875–884. Retrieved from http://files.eric.ed.gov/fulltext/ED537981.pdf
• Malone, C., Gabrosek, J., Curtiss, P., & Race, M. (2010). Resequencing topics in an introductory applied statistics course. The American Statistician, 64(1), 52–58.
• Rochelle, C. F., & Dotterweich, D. (2007). Student success in business statistics. Journal of Economics and Finance Education, 6(1).
• Scheaffer, R. (1997). Discussion to new pedagogy and new content: The case of statistics. International Statistical Review, 65(2), 156–158.
• Silvia, G., Matteo, C., Francesca, C., & Caterina, P. (2008). Who failed the introductory statistics examination? A study on a sample of psychology students. International Conference on Mathematics Education.
• Wang, J.-T., Tu, S.-Y., & Shieh, Y.-Y. (2007). A study on student performance in the college introductory statistics course. AMATYC Review, 29(1), 54–62.