Peer Review: A workshop for referees

Advising on Test Validity:
Comments on Denny Borsboom
Neil K. Aaronson
The Netherlands Cancer Institute
KNAW Colloquium on
Advising on Research Methods
Amsterdam, March 29, 2007
The way to capture an audience’s
attention is with a demonstration where
there is a possibility the speaker may
die.
Jearl Walker, Cleveland State University
It usually takes more than three weeks
to prepare a good impromptu speech.
Mark Twain
Who am I?
• Health outcomes researcher
• Clinical oncology
• Develop questionnaires to assess patients’
illness and treatment experience from their
own perspective
• For use in observational and evaluative
studies in clinical research and practice
What are we attempting to measure?
• Health outcomes
• Health status
• Quality of life
• Health-related quality of life
• Patient-reported outcomes (PROs)
State of affairs in defining QL
• “Quality of life is a vague and ethereal entity, something
that many people talk about, but which nobody clearly
knows what to do about.”
Campbell et al., 1976
• “The idea has become a kind of umbrella under which are
placed many different indexes dealing with whatever the
user wants to focus on.”
Feinstein, 1987
• “Quality of life is an ill-defined term…it means different
things to different people, and takes on different meanings
according to the area of application.”
Fayers & Machin, 2000
Key dimensions of quality of life as
defined by David Karnofsky (1949), the
WHO (1949) and ASCO (1995)
Physical: Symptoms commonly caused by cancer and the
toxicities of treatment
Psychological: Effects of cancer and its treatment on
cognitive function and emotional state
Social: Effects of cancer and its treatment on
interpersonal relationships, school, work and recreation
Attributes of QL definitions
• Non-specific versus health-related
• Health states (or status) versus personal evaluation
of those states (e.g., expectations, discrepancies,
satisfaction)
• Scope of concerns (e.g., spirituality or existential
issues)
• Polarity of concerns (dysfunction and its
resolution vs. positive well-being)
Does it matter?
• Yes, because the content of QL
questionnaires reflects the underlying
definition.
• It may be less important in clinical trials,
where group comparisons will be internally
valid, regardless of the definition used.
• It is more important in comparing results
across trials and in observational (e.g.,
prevalence) studies.
Examples of QL definitions
“The difference between the hopes and
expectations of the individual and the individual’s
present experience.”
Calman, 1987
“The functional effect of an illness and its
consequent therapy upon a patient, as perceived by
the patient.”
Schipper et al., 1996
Covinsky et al. Am J Med 1999; 106:435-440
• 493 elderly patients rated their physical functioning,
psychological distress and overall QL
• More than 40% of those who reported the worst
physical functioning and/or the highest levels of
psychological distress rated their QL as “good or
excellent”
• Approximately 20% of those with the best physical
functioning and lowest levels of distress rated their QL
as “poor”
Generic HRQL instruments
• Sickness Impact Profile (SIP)
• Nottingham Health Profile (NHP)
• Spitzer QL Index
• COOP/WONCA Charts
• MOS 36-Item Health Survey (SF-36)
• World Health Organization (WHOQoL)
Cancer-specific QL questionnaires
• Functional Living Index – Cancer (FLIC)
• Cancer Rehabilitation Evaluation System
(CARES)
• Rotterdam Symptom Checklist (RSCL)
• EORTC QLQ-C30
• Functional Assessment of Cancer Therapy
(FACT-G)
Key psychometric attributes of
HRQL instruments
• measurement model
• reliability
• validity
• responsiveness
• interpretability
• cultural adaptability
• burden
Assessing validity of HRQL instruments:
classical approaches
(SAC/MOT 2001)
Content-related
• evidence that the content domain of an
instrument is appropriate relative to its intended
use
• the use of lay and expert panel (clinician)
judgments
• complete the questionnaire(s) yourself
Future perspective items
SF-36: “I expect my health to get worse.”
FACT-G: “I worry about dying.”
CARES-SF: “I worry about whether the cancer will progress.”
QLQ-C30: --
Assessing validity of HRQL instruments:
classical approaches
(SAC/MOT 2001)
Construct-related
• evidence that supports a proposed interpretation
of scores based on theoretical implications
associated with the constructs being measured.
• examine interscale correlations
• examine patterns of scores for groups known to
differ on relevant variables (e.g., disease stage,
treatment status, response to treatment)
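
The two construct-related checks listed above can be sketched in a few lines. This is a minimal illustration, not the procedure of any particular instrument: the scale names, the scores, and the treatment grouping are all hypothetical, and Cohen's d is used here only as one common way to summarise a known-groups difference.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scale scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cohens_d(group_a, group_b):
    """Standardised mean difference using the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * stdev(group_a) ** 2 + (nb - 1) * stdev(group_b) ** 2
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical scores (0-100) for eight patients; not real data.
physical = [80, 65, 90, 40, 55, 70, 30, 85]   # higher = better functioning
fatigue = [20, 45, 15, 70, 50, 35, 80, 25]    # higher = more fatigue
on_treatment = [False, True, False, True, True, False, True, False]

# Interscale correlation: functioning and fatigue should correlate negatively.
r = pearson_r(physical, fatigue)

# Known-groups comparison: patients off treatment are expected to score higher.
off_scores = [s for s, t in zip(physical, on_treatment) if not t]
on_scores = [s for s, t in zip(physical, on_treatment) if t]
d = cohens_d(off_scores, on_scores)

print(f"interscale r = {r:.2f}, known-groups d = {d:.2f}")
```

A result in the expected direction shows only that scores behave as the theory predicts for these groups; whether that settles the validity question is exactly what the next slide asks.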
Questions for Denny and audience (1)
• Examining correlations between measures
purported to assess the same concept indeed tends
to yield little useful information for instrument
developers or for end-users – the exercise is
theoretically and empirically anemic
• However, the “known groups” comparison
approach is intuitively appealing and tends to be
well-understood and accepted by end-users
• Is this latter approach equally “suspect”; i.e. does
it also fail to truly address the validity of a
measure?
Questions for Denny and audience (2)
• Item response theory (IRT) approaches are quickly coming
to dominate the field of HRQL instrument development
(NIH PROMIS initiative)
• Generating large item banks for each domain of interest,
primarily based on existing literature (e.g., depression,
pain, fatigue)
• Collecting large datasets to model item and scale
information curves
• Generating computer-adaptive versions of measures
• Will this approach really yield theoretically grounded and
valid measures, or is it yet another example of “dustbowl
empiricism”?
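
To make "item information curves" and "computer-adaptive versions" concrete, here is a minimal sketch. It assumes a two-parameter logistic (2PL) model for dichotomous items, which is a simplification chosen for brevity; item banks with multi-category responses would typically need a polytomous model. The item names and parameter values are hypothetical.

```python
from math import exp


def p_endorse(theta, a, b):
    """2PL probability of endorsing an item at trait level theta.

    a is the discrimination parameter, b the location (difficulty).
    """
    return 1.0 / (1.0 + exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


# Hypothetical item bank: name -> (discrimination a, location b).
item_bank = {
    "fatigue_01": (1.8, -1.0),
    "fatigue_02": (1.2, 0.0),
    "fatigue_03": (2.0, 1.5),
}


def next_item(theta, bank, administered):
    """Adaptive step: pick the unused item most informative at theta."""
    candidates = {k: v for k, v in bank.items() if k not in administered}
    return max(candidates, key=lambda k: item_information(theta, *candidates[k]))


# Information curve for one item, evaluated on a coarse theta grid.
curve = [(t / 2, item_information(t / 2, *item_bank["fatigue_02"]))
         for t in range(-6, 7)]

# For a respondent currently estimated at theta = 1.5:
chosen = next_item(1.5, item_bank, administered=set())
print(chosen)
```

Selection here is driven entirely by statistical information at the current trait estimate. Nothing in the procedure checks whether the items are theoretically grounded, which is the concern the question above raises.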
Suggested reading
• Fayers P, Hays R (eds). Assessing quality of
life in clinical trials: Methods and practice.
Oxford: Oxford University Press, 2005.
• Lipscomb J, Gotay CC, Snyder CF (eds).
Outcomes Assessment in Cancer.
Cambridge: Cambridge University Press, 2005.