What can Mathematics Education Research Learn from Medical


The Role of RCTs in the
Design of Educational
Interventions
Finbarr Sloane & James A. Middleton
Mary Lou Fulton College of Education
Center for Research on Education in Science, Mathematics,
Engineering and Technology (CRESMET)
Arizona State University
1
Goals of the Presentation
– Going beyond the rhetoric of RCTs
– Briefly outline a “compleat” model (the clinical trials
model) and address its fit with and for
education research in teaching and learning
– Discuss the role of efficacy and effectiveness studies
in the overall process of development and vetting of
innovation
– Ask what are the Human and Intellectual Capital
needs of such a model for education research
2
Medical Research

There is more than one medical model
– e.g., Epidemiology, Surgery

The compleat model has at least seven phases
– Pre-Phase I research
– Phases I-IV (the written-about medical model)
– Education intervention typically has 2 additional
phases (Artifact Development and Field Testing)
3
Hierarchy of Clinical Trial Research
– Phase I: Safety
  • Establish feasibility
– Phase II: Initial Efficacy
  • Show improvement over historical norms
– Phase III: Confirm Efficacy
  • Comparative randomized trials
– Phase IV: Follow Up
  • Real world scaling
4
The written-about Medical/Education model
[Figure: The ‘What Works’ model of scientific research in education, drawn as a cycle of four phases.]
– Phase 1: Identifying the Research Problem (Establish Feasibility)
– Phase 2: Designing a Testable Solution (Historical Norms)
– Phase 3: Definitive Testing of the Solution (RCTs)
– Phase 4: Implementing the Solution (Scaling)
– Other labels on the diagram: Transportability; Theory-building; Systematic effects; Causal relationship; Contextual variables; Hypothesis; Question; Theoretical model; Artifact or intervention
5
A “Compleat” model of Design and
Evaluation of Innovations
[Figure: cycle of seven phases.]
– Phase 1: Grounded Models
– Phase 2: Development of Artifact
– Phase 3: Feasibility Study
– Phase 4: Prototyping and Trialling
– Phase 5: Field Study
– Phase 6: Definitive Test
– Phase 7: Dissemination and Impact
– Annotations relating these to the medical phases: Pre-Phase I; Phase I; Insert between Phase I and II; Phases II, III, and IV
6
Middleton, J. A., Gorard, S., Taylor, C., & Bannon-Ritland, B. (in press). The compleat design
experiment: From soup to nuts. To appear in E.
Kelly & R. Lesh (Eds.), Design Research:
Investigating and assessing complex systems in
mathematics, science and technology education.
Mahwah, NJ: Lawrence Erlbaum Associates.
7
The Engineering Design Process
[Figure 1. General Model of Design Research. Diagram labels: Intended Function; Intended Behavior; Form; Actual Behavior; Actual Function; Theoretical Model; Empirical Test; Theory Building.]
8
Phases I and II: Grounded Models,
Development of Artifacts/Practices

If the clinical trial is to evaluate a new
drug, the first step is an action plan called
the Investigational New Drug Application
(IND) that is presented to the FDA.
– This application contains everything known
about the therapy, including all the data from
laboratory and animal tests.
– If the FDA feels that the therapy might
possibly benefit people, it approves the IND
and the first phase of clinical trials can begin.
9
Education Research Commentary

Medical Phase I (Feasibility) efforts assume that the early Pre-Phase I
‘innovation’ work has been conducted. This assumption highlights the need for
such work before considering Phase I of a trial.
– This has major implications for education research.
– In the US, two funding agencies can be contrasted: the NSF and IES. The NSF, following its
research mission “where discoveries begin,” plays a critical role in this pre-trial
research work. IES, following its mission to find out what works, funds more
summative research.
In education, innovations often arise out of practice. As such, they
carry with them ad hoc theories of implementation and practice that are not
well specified.
  • Other innovations are generated out of well-researched grounded
models. What is at question is their transportability to other
implementation sites and conditions (e.g., scale).

– Feedback needed from Feasibility Studies (next Phase)

Most innovations, like mathematics curricula, are so complex that a
detailed theory of their cognitive and behavioral affordances is impossible to
achieve within the publication schedule of commercial vendors. This results
in a patchwork of theories cobbled together without regard to their
epistemological or empirical coherence.
10
Phase III: Feasibility

Establish feasibility
– Concerned with safety and potential benefit
– Nearly always single institution
– Focus on a delivery mechanism
– Search for interactions
– Not randomized
– Small ‘n’
– Dosage focus: How much is needed?
  • Time
  • Intensity
  • Patient-specific dosage and delivery
11
Phase III - ER Commentary

In general, these studies are local and there is no effort to
randomize.
This Phase looks specifically at dosages
– Binary?
– Is more better?
– What would be the Least Significant Effect?

There is an important caveat in educational research: the
researcher is trying to examine at least two dosages simultaneously,
which makes the inferential space more difficult.
– We need to understand what doses should be given to the
teacher (and how the dosage should be delivered), before the teacher
can appropriately dose the student.
– 3 levels of dosage?

ER maps differently and has different complications. For
example, the treatment is not delivered to an individual but to classes
within schools within districts within communities crossed
with SES.
12
A “Compleat” model of Design and
Evaluation of Innovations
[Figure repeated: the seven-phase “Compleat” model diagram, Phase 1 (Grounded Models) through Phase 7 (Dissemination and Impact), with its medical-phase annotations.]
13
Phase IV: Prototyping and Trialing






– An initial prototype of an artifact or procedure needs to be made
implementable beyond the specific conditions under which it was
developed;
– Linear design, concurrent design, iterative design;
– The more complex the intervention, the longer and more costly this Phase
is.
– Very few interventions in education which have made it to RCT Phases are
very complex.
– Very few interventions have a detailed theory of how and why they work
that is actually being tested in current RCT studies.
– Most innovations in the next 10 years will be highly complex (e.g.,
MMOLEs)
  • Teaching many things
  • Highly interdependent treatments
  • Differentially implemented for different circumstances
  • Measurements which are reliable and valid under these
conditions are not yet developed.
14
A “Compleat” model of Design and
Evaluation of Innovations
[Figure repeated: the seven-phase “Compleat” model diagram, Phase 1 (Grounded Models) through Phase 7 (Dissemination and Impact), with its medical-phase annotations.]
15
Phase V – Initial Efficacy (Field Studies)

– A Phase V study provides preliminary
information about how well the new treatment
works and generates more information about
safety and benefit.
– Conducted in defined condition of interest
– Refines delivery mechanism
– Evaluates endpoints of interest
– Compare to historical norms on pre-specified
hypotheses (if historical norms available)
– Again, no required randomization to
treatment/control groups
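
Illustration: the comparison to historical norms on a pre-specified hypothesis can be sketched as a one-sample t statistic. This is a minimal sketch and not the presenters' analysis. The function name, the pilot scores, and the norm of 70 are all hypothetical, and the calculation treats students as independent, so it ignores the clustering that education designs must acknowledge.

```python
import math
import statistics


def one_sample_t(scores, historical_norm):
    """Return (t statistic, degrees of freedom) for H0: mean == historical_norm."""
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)      # sample SD (n - 1 in the denominator)
    se = sd / math.sqrt(n)             # standard error of the mean
    return (mean - historical_norm) / se, n - 1


# Hypothetical end-of-unit scores from one pilot site (illustrative data only).
pilot_scores = [72, 75, 68, 80, 77, 74, 71, 79]

# Hypothetical historical norm, fixed before looking at the pilot data.
t_stat, df = one_sample_t(pilot_scores, historical_norm=70.0)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```

The t statistic would then be compared with a critical value chosen in advance. With a single site and a small n, the result is preliminary information, not a confirmation of efficacy.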
16
Phase V: ER Commentary

Similarly, in education we need to understand and
generate preliminary information on how and how well
the treatment works for trainers, for teachers, and for
students and specify the theory for how it all fits
together in the subsequent implementation and trials to
come:
– Some control is attained in ER through the use of
quasi-experimental designs; we may not have access to
historical norms
– Education research designs must acknowledge clustering
– Education research must also specify cross-level theory
  • Learning and instruction are not separable but go hand in hand
  • We need to find and measure these interactions. For example, can
we assume teacher effect is “fixed”?
– Because of heterogeneity of implementation, robustness of
implementation should be built in
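
Illustration: one common way to acknowledge clustering is to estimate how much of the outcome variance lies between classrooms, the intraclass correlation (ICC). The following is a minimal sketch, assuming equal-sized classrooms and the one-way ANOVA estimator; the function name and the scores are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import statistics


def anova_icc(classrooms):
    """ICC(1) from a one-way ANOVA, assuming every classroom has the same size."""
    k = len(classrooms)            # number of classrooms
    m = len(classrooms[0])         # students per classroom
    grand_mean = statistics.mean(x for room in classrooms for x in room)

    # Between-classroom mean square.
    ms_between = m * sum(
        (statistics.mean(room) - grand_mean) ** 2 for room in classrooms
    ) / (k - 1)

    # Within-classroom mean square.
    ms_within = sum(
        (x - statistics.mean(room)) ** 2 for room in classrooms for x in room
    ) / (k * (m - 1))

    return (ms_between - ms_within) / (ms_between + (m - 1) * ms_within)


# Hypothetical scores for three classrooms of three students each.
scores_by_classroom = [
    [60, 62, 64],
    [70, 72, 74],
    [80, 82, 84],
]

icc = anova_icc(scores_by_classroom)
print(f"Estimated ICC: {icc:.3f}")
```

An ICC near zero would suggest that classrooms barely differ; a large ICC, as in this contrived data, means that students in the same classroom cannot be treated as independent observations.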
17
Phase VI - Confirming Efficacy
– Multi-institutional (with standardized
procedures)
– In well defined populations
– Large ‘n’ (some subset analyses)
– Well defined endpoints
– Randomized comparisons
– Serious within study controls (blinding)
– Monitored

18
Phase VI

In Medicine, these trials compare a promising new drug,
combination of drugs, or procedure with the current
standard therapy.
– The contrast is always to best known science. This helps address the
ethical issues associated with the Hippocratic oath.


Medical trials typically involve large numbers of patients
from doctors’ offices, clinics, and cancer centers
nationwide (generalisability versus transportability).
The reason that the randomised clinical trial has been
initiated is that the superiority of one treatment over the
other has not yet been firmly established.
– Neither you nor your physician chooses whether you get the
new intervention or the standard treatment.
19
Phase VI
– If you are assigned the standard intervention,
you receive what experts view as the best
treatment available. Experts believe that
each treatment is effective, but really don’t
know which one is better. Thus the need for the
trial.
– If you are assigned the new intervention, you
receive a treatment that some experts, backed
up by appropriate theory, think may have
some advantages over the standard treatment.
(Ethics enters here as does the meaning of the
control group).

20
Phase VI: ER Commentary



Do we have anything that looks like this in education
research? Do we enact the full model?
Some IERI studies (perhaps)?
Why not?
– Lack of overall research infrastructure
  • For researchers
  • For graduate students
  • For funding agencies
– Lack of resources on the research side
  • Human and intellectual capital
  • Financial
  • Lack of critical mass of knowledge at each phase and across the
phases
21
Phase VII – Follow Up






– Real world application
– Long term follow-up
– Refines practices
– Discover new applications
– Some use the term “Phase IV” in medicine to
include the continuing evaluation that takes
place after FDA approval, when the drug or
treatment procedure is already on the market
and available for general use.
– This is also called post-marketing surveillance.
22
Logistics

Privately run trials
– Driven by regulatory concerns

Publicly run “group” trials (role of NIH and the
Consensus Boards)
– Ensures only the “best” are tested
– Standardized techniques across studies
– Provides infrastructure to support trials (i.e., the
research model and some of the capital resources)
– Public data sharing allows for other correlative
research
23
ER Commentary

We in ER do not conduct much that is remotely like this:
– This is likely why the IES “What Works Clearinghouse” is populated by
few studies. For example, there are very few mathematics
education research studies that meet criteria for inclusion
– Rubber-meets-the-road innovations are highly influenced by
publishing houses: pace of publication outstrips capacity to
provide quality data on effectiveness; emphasis is on reuse and
renovation rather than innovation.
  • The industry will not regulate itself
– Moreover, the studies honored in the What Works Clearinghouse are
simply analyzed at the wrong unit of analysis (individuals rather
than clustered by the level of the treatment, i.e., classrooms).
Put simply, the effect size estimates are way off!
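
Illustration: a standard way to quantify what is lost by analyzing individuals when the treatment is delivered to classrooms is the Kish design effect, 1 + (m - 1) × ICC for clusters of equal size m. The sketch below is a hedged example and not a reanalysis of any study; the class size of 25, the ICC of 0.20, and the 500 students are hypothetical values.

```python
def design_effect(cluster_size, icc):
    """Kish design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_students, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return n_students / design_effect(cluster_size, icc)


# Hypothetical study: 500 students in classrooms of 25, ICC assumed to be 0.20.
deff = design_effect(cluster_size=25, icc=0.20)
n_eff = effective_sample_size(n_students=500, cluster_size=25, icc=0.20)

# Standard errors computed as if students were independent are too small
# by roughly the square root of the design effect.
se_inflation = deff ** 0.5

print(f"Design effect: {deff:.1f}")
print(f"Effective sample size: {n_eff:.0f} of 500 students")
print(f"Standard errors understated by a factor of about {se_inflation:.2f}")
```

Under these assumed values, 500 students carry roughly the information of 86 independent ones, which is one way to see why an analysis at the individual level overstates the precision of its estimates.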
24
ER Commentary

The Ratio of Studies Across the Phases -- 10:1
– In general, there are at least 10 studies conducted at each phase
for every study at the next step up.
  • In sum, a 10:1 ratio of studies across the phases.
Consequently, in medicine there are 1,000 medical Phase I
studies for every major trial (not to mention the basic
research efforts that go on before medical Phase I work).
– I note that this may underestimate the problem in education
research due to the double and triple dosage issues noted
earlier.
– That is, the current medical model would need to be seriously
adapted to meet the real needs of education interventions.
  • Doctors do not need to be treated with a drug to have them
deliver treatments to their patients.
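
Illustration: the 10:1 arithmetic can be written out as a pyramid of study counts. This is a minimal sketch, assuming the “major trial” sits three steps above medical Phase I, which is the reading that reproduces the figure of 1,000; the function name is hypothetical.

```python
def studies_per_phase(ratio, n_phases):
    """Studies at each phase, earliest first, per single study at the final phase."""
    return [ratio ** (n_phases - 1 - i) for i in range(n_phases)]


# Assumption: four medical phases with a 10:1 ratio at every step.
phase_names = ["Phase I", "Phase II", "Phase III", "Phase IV"]
counts = studies_per_phase(ratio=10, n_phases=len(phase_names))

for name, count in zip(phase_names, counts):
    print(f"{name}: {count:,} studies")
```

Each additional phase in a model multiplies the number of early studies required by the same ratio, which is why adding phases for education interventions raises the infrastructure demands so sharply.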
25
Design Issues and Teacher Effects

To estimate a teacher effect in an RCT you need
a number of things to be in place:
– Random assignment of teachers to treatment and
control (within school)
  • Problem!
– What do we do to eliminate this problem?
– Fix: Randomize by schools (Clustering)
  • Problem! In doing this we collapse over all teachers but we
need to understand treatment fidelity at the level of the
classroom
  • Cost goes up exponentially by number of schools.
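
Illustration: randomizing by schools means that the school, not the teacher, is the unit of assignment, and every teacher inherits the arm of his or her school. The sketch below assumes a simple 50/50 split with no blocking or stratification; the school and teacher identifiers, the seed, and the function name are hypothetical.

```python
import random


def randomize_schools(school_ids, seed=None):
    """Randomly split schools into two arms of (nearly) equal size."""
    rng = random.Random(seed)          # seeded so the allocation can be audited
    shuffled = list(school_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "treatment": sorted(shuffled[:half]),
        "control": sorted(shuffled[half:]),
    }


# Hypothetical sample of ten schools.
schools = [f"school_{i:02d}" for i in range(1, 11)]
arms = randomize_schools(schools, seed=2024)

# Teachers are not randomized individually; they take their school's arm.
teacher_school = {"teacher_A": "school_01", "teacher_B": "school_01",
                  "teacher_C": "school_07"}
teacher_arm = {
    teacher: "treatment" if school in arms["treatment"] else "control"
    for teacher, school in teacher_school.items()
}
print(arms)
print(teacher_arm)
```

Because teachers in the same school always share an arm, this design avoids within-school contamination, but it yields no within-school contrast between teachers, so treatment fidelity at the classroom level has to be measured separately.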
26
Human and Intellectual Capital

What are the central questions of education research?
– How students learn or student learning?
 Learning versus achievement?


Is improved student achievement really a serious goal for ER?
If so, we need a fully-fledged, theoretically strong organizing
structure for our research
– Expertise both within and across researchers in quantitative hypothesis-testing methods, statistical modeling, and qualitative model-building.
– The Logic Model for interventions must be coherent, driving inquiry and
evaluation across each phase of the innovation cycle.
  • Inductive models
  • Development of intervention
  • Feasibility
  • Initial trials
  • Scale-up in the Field
  • Definitive Test (RCTs)
  • Market and implementation follow-up.
27
A “Compleat” model of Design and
Evaluation of Innovations
[Figure repeated: the seven-phase “Compleat” model diagram, Phase 1 (Grounded Models) through Phase 7 (Dissemination and Impact), with its medical-phase annotations.]
28