Outcome measures: data quality is a dimension, not a category


Dr Grant Sara, MHDAO, NSW Health
HoNOS?
1. Wrong purpose: public health, not clinical decision-making
2. Reflects service priorities, not consumer priorities
3. Clinician rated, not consumer rated
4. Focus on deficits, not recovery and wellbeing
Aim
Data quality issues shouldn’t be seen as binary, all-or-nothing problems
Data quality is a dimension: no data is perfect, all data sets contain both signal and noise
Whether data is good enough depends on
- purpose or type of question
- scale
Current routine outcome data is usable and valuable for aggregate purposes: understanding variation, comparing groups of consumers or services
Individual clinical interaction
Measures can aid discussion and dialogue (McKay and Coombs)
Feedback can improve outcomes (Michael Lambert)
Feedback especially effective with integrated information systems (Perth Clinic)
Background
Aspects of data quality
Reliability
- Test-retest
- Inter-rater
Validity
- Face
- Construct
- Concurrent
- Predictive
- Incremental
Sensitivity
Specificity
Bias
- Information
- Selection
Power
Literature
Recent review (Delaffon):
HoNOS
- Wide use in service evaluation: UK, Aust, NZ. Community and rehab settings
- Clinical utility (Prowse 2009, Preti 2012, Kisely 2010)
- Sensitivity to change (Cheung 2007, Canuto 2009, Staring 2012, van Vugt 2012, Prabhu 2008)
- Predicts outcomes (Turner 2009, Hayes 2012)
- Predicts service use (Kortrijk 2012, Tulloch 2012, Andreas 2010, Bech 2006)
- However, some findings of issues with inter-rater reliability (Ecob 2004) and sensitivity to change (Hunter 2009, Duke 2010)
LSP
- Concurrent and predictive validity (Eagar 2005, Di Michele 2007, Aoyama 2011)
- Recommended as a measure of social function by recent RAND expert panel (Leifker 2011)
NSW Data
Overview
Comparing services
Comparing programs, settings and groups
Other measures (LSP, K10, APQ)
Predictive validity: readmission
Challenges
- Completion rates and selection bias?
Differences between units
Clinical benchmarking program
Provision of aggregated (6 month) data at ward level, compared to peer group wards
HoNOS
- Admission Profile
- Discharge Profile
- Change groups (Improved, unchanged, deteriorated)
K10 recently added
HoNOS Admission Profile
Data Jul-Dec 2012
“Adult Acute” peer group (n=47): excludes HDU, PICU, PECC, some specialty units.
Of episodes with a valid Admission HoNOS, percent with clinically significant scores.
Median and IQR for peer group units
Unit: Regional unit, high disadvantage, many young people
Unit: High acuity inner city unit, significant disadvantage and homelessness
Unit: Outer metro unit, very high youth disadvantage and unemployment
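The profile described above (percent of valid admission ratings that are clinically significant, then median and IQR across peer units) could be sketched as follows. This is an illustration under stated assumptions, not the benchmarking program's actual code: the unit names and ratings are invented, the 2+ threshold is taken from the later "Admission HoNOS" slide, and treating any value outside 0 to 4 (such as 9) as invalid is an assumption.

```python
from statistics import quantiles

CLINICALLY_SIGNIFICANT = 2  # ratings of 2 or more, per the "Admission HoNOS" slide


def percent_significant(ratings):
    """Percent of valid item ratings (0-4) that are clinically significant."""
    valid = [r for r in ratings if r in (0, 1, 2, 3, 4)]
    if not valid:
        return None  # no valid ratings, so no percentage can be reported
    hits = sum(1 for r in valid if r >= CLINICALLY_SIGNIFICANT)
    return 100.0 * hits / len(valid)


def peer_group_summary(unit_percents):
    """Median and interquartile range of the per-unit percentages."""
    q1, median, q3 = quantiles(unit_percents, n=4, method="inclusive")
    return {"q1": q1, "median": median, "q3": q3}


# Hypothetical admission ratings for one HoNOS item, keyed by made-up unit names.
ratings_by_unit = {
    "unit_a": [0, 1, 2, 3, 2, 9],  # 9 is assumed to be an invalid/unknown code
    "unit_b": [0, 0, 1, 2],
    "unit_c": [2, 3, 4, 1],
    "unit_d": [0, 1, 1, 1],
    "unit_e": [2, 2, 0, 0],
}
unit_percents = {u: percent_significant(r) for u, r in ratings_by_unit.items()}
summary = peer_group_summary(list(unit_percents.values()))
```

Each unit's percentage can then be read against the peer group's `q1` to `q3` band to see whether it sits inside or outside the typical range.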
Differences between programs and settings
HoNOS admission profile
[Figure: Inpatient v Community, two panels, vertical axis 0% to 50%. Diagnosis panel: Organic, Substance, Psychosis, Affective, Anx & Adjust, Personality, Devel Delay, Childhood Behaviour, Other. HoNOS panel: Overact, Self-inj, Subs, Cog, Phys, Psychosis, Dep, Oth, Rel, ADL, Acc, Occ.]
Personality Disorder
Personality disorder has significant impact for individuals, families, services
NSW inpatient data
- Any personality disorder: 6% primary, 12% comorbid
- Around two thirds is Borderline Personality Disorder
Using structured personality interviews in inpatient mental health: 30 – 60% meet criteria for one or more personality disorder diagnoses
Can routine outcome measures add to our understanding?
Admission HoNOS
Percent with clinically significant (2+) ratings
[Figure: percent with clinically significant ratings, vertical axis 0% to 70%, No Personality Disorder v Personality Disorder]
No Personality Disorder: Mean 11.8 (95% CI 11.7 – 11.9)
N = 63 708
Personality Disorder: Mean 12.5 (95% CI 12.3 – 12.6)
Kessler-10 (K10)
10 item, consumer-rated.
Yields
- Overall score
- Psychological distress “bands”
K10 at admission
Adults in adult acute units, January – June 2012, n = 12 881. 4 415 (34%) have valid admission K10.
No significant difference in completion rates between personality disorder groups
[Figure, two panels by personality disorder group (None, Comorbid, Primary). Average K10 score: vertical axis 0 to 40. K10 bands: percent of people, 0% to 100%, in bands None, Mild, Mod, Severe.]
Change in K10
January – June 2012
Adults in adult acute units, n = 12 881, of whom 8 518 stayed > 3 days.
1 339 (15.7%) have valid K10 pair.
Completion of K10 pair higher (18.4%) in people with primary personality disorder diagnoses than those without (15.7%)
[Figure: average K10 score (vertical axis 0 to 45) at Admission and Discharge, by personality disorder group (None, Comorbid, Primary)]
Other measures and issues
Concurrent and predictive validity?
Activity and Participation Questionnaire
1: Paid work
2: Looking for work
3: Unpaid work
4: Education / study
5: Social / community activity
6: Desire for change
Brief, consumer-rated measure.
“Soft launch”, opt-in, mainly used in rehabilitation and early psychosis services
Employment: APQ v LSP
[Figure: APQ employment status (Not participating, Looking, Employed) as percent, 0% to 100%, by LSP Work Capacity score (0 to 3)]
LSP Q16 (LSP16, LSP20), Q26 (LSP39). N = 3,419
Employment: APQ v HoNOS
[Figure: APQ employment status (Not participating, Looking, Employed) as percent, 0% to 100%, by HoNOS occupation and activity score (0 to 4)]
HoNOS item 12. n = 5824
Social: LSP v APQ
LSP: high scores = more difficulty
APQ Total score (prelim)
SCORE  Employment         Study           Social                Unpaid work
2      Employed           Formal study    2 or more activities  More than one role
1      Looking            Informal study  1 activity            1 role
0      Not participating  No study        No activities         No unpaid work
0-2 in 4 domains = score out of 8
7 and 8 collapsed to 7: few values at 8
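The preliminary scoring rule above (four domains scored 0 to 2, summed, with 8 collapsed into 7) can be sketched in a few lines. The domain keys and the function name `apq_total` are illustrative choices, not part of the APQ itself.

```python
DOMAINS = ("employment", "study", "social", "unpaid_work")  # hypothetical key names


def apq_total(domain_scores):
    """Sum four 0-2 domain scores and collapse a total of 8 into 7."""
    total = 0
    for domain in DOMAINS:
        score = domain_scores[domain]
        if score not in (0, 1, 2):
            raise ValueError(f"{domain} score must be 0, 1 or 2, got {score!r}")
        total += score
    return min(total, 7)  # few people score 8, so 7 and 8 share one category


# Looking for work, informal study, one social activity, no unpaid work.
example = apq_total({"employment": 1, "study": 1, "social": 1, "unpaid_work": 0})
```

Collapsing the top category trades a little resolution for a more stable top group when very few people reach the maximum.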
“Participation” v K10
K10: higher psychological distress
APQ: more social and work participation
“Participation” v LSP total
LSP: higher level of disability
APQ: more social and work participation
Predictors of 28 day readmission
Factors: Age *, Male, Homeless, Out of Area, Length of stay *, Psychosis, Mood disorder, Substance disorder, Personality disorder, Improved, Unchanged, Deteriorated
OR (95% CI), in the order listed:
0.992 (0.990 - 0.994)
0.955 (0.914 - 0.997)
1.243 (1.084 - 1.425)
1.394 (1.321 - 1.471)
0.989 (0.987 - 0.991)
1.062 (1.006 - 1.122)
0.933 (0.879 - 0.990)
1.625 (1.536 - 1.720)
0.840 (0.799 - 0.883)
1.237 (1.041 - 1.469)
(10 values for 12 factors; row pairing not shown)
Binary logistic regression. * For continuous variables, odds are per unit of change: year (age), day (LOS)
Adult admissions to acute units, Jul 2007 – Jun 2012, LOS 95% trimmed (52 days).
N = 66 926
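Because odds ratios for continuous variables are per unit of change, a value close to 1 can still matter over a realistic range. The sketch below rescales a per-unit odds ratio to a larger step, which is valid under the logistic model's own assumption that the effect is linear on the log-odds scale. It assumes the first odds ratio listed (0.992, CI 0.990 to 0.994) is the per-year age term; the function names are illustrative.

```python
import math


def rescale_odds_ratio(or_per_unit, units):
    """Odds ratio for a change of `units`, given the odds ratio for one unit."""
    return math.exp(math.log(or_per_unit) * units)


def rescale_interval(lower, upper, units):
    """Rescale both confidence limits; sorting handles negative `units`."""
    bounds = (rescale_odds_ratio(lower, units), rescale_odds_ratio(upper, units))
    return min(bounds), max(bounds)


# Assumed to be the age term: OR 0.992 per year, 95% CI 0.990 to 0.994.
or_per_decade = rescale_odds_ratio(0.992, 10)
ci_per_decade = rescale_interval(0.990, 0.994, 10)
```

On these assumptions, ten years of age corresponds to an odds ratio of roughly 0.92 rather than 0.992.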
First episode psychosis
Relationship between ongoing substance use and readmission?
[Flow diagram, 2 years of follow-up. Boxes: First Psychosis Admission; No MH readmission, No other contact; Community contacts; Non MH admissions; No MH readmission 46%; MH readmission 54%. Counts: 7269, 2276, 4993.]
Proxy drug use measure: Substance diagnosis OR HoNOS Substance > 2
Ongoing drug use and readmission
First episode psychosis, n = 4993
Drug use, ongoing: 67% readmitted
No drug use: 51% readmitted
Drug use, ceased: 40% readmitted
Proxy measure of drug use based on
- Diagnosis
- HoNOS substance subscale
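The proxy rule stated on the earlier slide (substance diagnosis OR HoNOS substance rating > 2) amounts to a single boolean test per contact. The sketch below shows only that flag. How "ongoing" and "ceased" use were derived from repeated contacts is not specified in the slides, so it is not attempted here. The function and parameter names are illustrative, and treating a missing rating as "no evidence" is an assumption.

```python
def proxy_drug_use(has_substance_diagnosis, honos_substance_rating, threshold=2):
    """True if either proxy indicator is present.

    Rule from the slide: substance diagnosis OR HoNOS substance rating > 2.
    A missing rating (None) contributes nothing, which is an assumption here.
    """
    rating_flag = (
        honos_substance_rating is not None
        and honos_substance_rating > threshold
    )
    return bool(has_substance_diagnosis) or rating_flag


flag_from_rating = proxy_drug_use(False, 3)     # rating above threshold
flag_from_diagnosis = proxy_drug_use(True, None)  # diagnosis alone is enough
flag_neither = proxy_drug_use(False, 2)         # a rating of 2 does not exceed 2
```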
Discussion
Summary
Consistent and plausible HoNOS profile between services, with meaningful variation for outlier services
Differences between service groups and settings in ways not captured by diagnostic differences
HoNOS and K10 reflect differences in profile, distress and change in people with Personality Disorder
Correlation between consumer and clinician ratings for employment, study and social participation (APQ, HoNOS subscales, LSP and K10)
HoNOS can add predictive value for 28 day readmission and for 2 year readmission in early psychosis
Utility
Validity
- Face
- Incremental
- Concurrent
- Predictive
Sensitivity
Slade’s critique of HoNOS
1. Public health or clinical decision-making
- Measures do appear to have clinical relevance, sensitivity
2. Service or consumer priorities
- Can tap into measures that are arguably shared priorities: reducing distress, improving social outcomes and work
3. Clinician or consumer rated
- Meaningful and complementary relationship between these
4. Deficit or recovery focus
- Even current measures can touch on issues of work, study, accommodation, social participation, relationships
Data quality?


As used in routine practice, the measures have some of the properties of “good enough” data
Data precision is relative to scale:
- These examples are all of aggregate data: n = 50 – 50 000+
- In aggregate uses size does matter
- Non-systematic variation (error, noise) cancels out
- Some statistical assumptions can be violated with large samples
- Large = 50 – 500?
- Large scale does not prevent systematic error
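The contrast between noise that cancels out and systematic error that does not can be illustrated with a small simulation. All numbers here are invented for illustration: a "true" mean score of 12, random rating noise with a standard deviation of 5, and a hypothetical one-point underscoring bias.

```python
import random
from statistics import mean


def simulate_mean_error(n, true_score=12.0, noise_sd=5.0, bias=0.0, seed=0):
    """Error in the observed mean of n ratings, relative to the true score."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = [true_score + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]
    return mean(observed) - true_score


results = {}
for n in (50, 500, 50_000):
    noise_only = simulate_mean_error(n)
    with_bias = simulate_mean_error(n, bias=-1.0)  # hypothetical underscoring
    results[n] = (noise_only, with_bias)
    print(f"n={n:>6}: noise only {noise_only:+.3f}, with bias {with_bias:+.3f}")
```

As n grows the noise-only error shrinks toward zero, while the biased series settles near -1 regardless of sample size, which is the sense in which scale helps with noise but not with systematic error.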
Challenges
System capacity
- NOCC review: resources, IT, leadership, training …
Quality
- Zero scoring and invalid ratings
- Underscoring
Quantity
- Incomplete collection
- Selection bias: who is being measured?
HoNOS completion
HoNOS Completion - Adult Acute Units
(Median and IQR of peer units, Jul-Dec 2012)
[Figure: percent of episodes (0 to 100) with Any, Any valid, Valid ADM, Valid DIS and ADM-DIS HoNOS]
Predictors of a valid HoNOS Pair
Person and episode of care factors
Factors: Age, Sex, Indigenous, Migrant, Psychosis, Mood disorder, Personality disorder, Substance disorder, Length of stay
OR (95% CI), in the order listed:
1.004 (1.003 - 1.005)
0.833 (0.800 - 0.868)
1.095 (1.048 - 1.144)
1.048 (1.011 - 1.087)
1.002 (1.001 - 1.004)
(5 values for 9 factors; row pairing not shown)
Binary logistic regression
Adult admissions to acute units, Jul 2007 – Jun 2012, LOS > 3 days, LOS 95% trimmed (52 days). N = 66 296
Person and episode factors
[Figure: percent with HoNOS pair (0% to 40%) by length of stay (days): 3-7, 8-14, 15-21, 22-28, 29-35, 36-42, 43-49, 50+]
Unit factors
[Figure: number of units (0 to 12) by percent of episodes with HoNOS pair]
Information systems
Resources and workforce
Culture and leadership
Conclusions
Why persist with these measures ?

For consumers


For clinicians


The data is being used and is of value – collective uses are OK
For clinical leaders, policy makers, funders




Development may be needed, but they can add meaning
Binary view of quality may be too simplistic
Very substantial investment has resulted in some returns
Timelines for information development are agonizingly long
For broader community and stakeholders: “measuring the wrong thing”



- These measures can touch on some broad aspects of experience, function and recovery
- For some comparative questions (Australia v other health systems? NGO v specialist MH service providers? Service A v Service B?) some of the weaknesses of these measures may also be their strengths: broad, universal, “common currency”
- Slade’s unique, bottom up approach (OCAN) may make it difficult to answer these types of questions
Conclusions
Our current measures
• Neither road to salvation nor perdition
• Just tools: like all tools, uses and limitations
• Particular use in collective analysis, understanding services and systems
• Scale may overcome noise and data quality problems
Acknowledgements
David Duerden
InforMH: Sharon Jones, Kieron McGlone, Connie Ho, Wendy Chen, Jenny Wildgoose, Damien McCaul, Jo Sharpe
NSW LHD MH information managers
Clinicians and consumers of NSW Health Mental services