Evaluation workshop


‘How to evaluate your own work’
Dr. Catrin Eames
Centre for Mindfulness Research and Practice
[email protected]
Workshop for the ‘Mindfulness Now’ conference, CMRP,
Bangor University
9th-11th April, 2011
• Rationale for conducting your own evaluations
• How to manage the numbers
• Suggested evaluation material
• Scoring, inputting and analysing
• Presentation of results
• To evaluate whether an intervention is worth
doing (efficacy trial)
• To evaluate whether it works in real life settings
(effectiveness trial)
• To establish for whom it might work and why
(moderators and mediators of outcome)
• To establish what service users think about the
intervention (qualitative methodologies)
• To challenge beliefs
What we feel might be true anecdotally is not always supported
by large scale research studies
• To explore new applications
Research plays a role in exploring whether existing
interventions can be applied to new sub-groups
• NHS Trusts or organisations may require you to conduct
evaluation
• It is imperative to be able to demonstrate improvement
associated with your groups
• Helps to maintain funding and/or win new funding
The most important decisions to make when considering
evaluation are:
1) Design
2) Evaluation measures
• It is very important that you have baseline (before) and
outcome measures (after)
• It is also important that you have the same measures on
everyone
• Use evidence based interventions
• Deliver intervention with fidelity
Be aware of ethical issues
• Consent, information sheet, free to withdraw
• Protection of participants, e.g. wellbeing
• Anonymity
• Data storage
• Disposal of data
One group post-test only design
• Audit of satisfaction with a service
One group pre-test/post-test design
• Common design in clinical practice
• Problem of attributing change to treatment (i.e., causality)
Non-equivalent groups post-test only design
• No pre-test data available
• Cannot assume similarity before treatment
Non-equivalent groups pre-test/post-test design
• Often one group is control
• Classic effectiveness study design
Comparison against norms
• Published data in other studies
For many of our evaluations we have used:
• Demographic Questionnaire
• Beck Depression Inventory (BDI)
• Hospital Anxiety and Depression Scale (HADS)
• Five Facet Mindfulness Questionnaire (FFMQ)
• WHO-5 Well-Being Index
• Warwick-Edinburgh Mental Well-being Scale (WEMWBS)
What do you need to know?
Do you want to compare outcomes of:
• Older versus younger participants?
• Males versus females?
• Different areas?
• Any other ideas?
• Working/ Unemployed?
• Prior mood disorder history?
• Progression/take up training/employment
• Been on another course/taster?
• Cultural background/family history
• Teacher effect on outcomes
• Gender
• Level of engagement prior to course
• Family income
• Rurality - access issues
• First language in the home / how many
languages?
• Any current medication?
• Mean & SD
• Change scores
• Effect sizes
• Excel
• Inputting data
• Analysing data
• Graphs/chart production
• Writing up results
• For evaluation purposes you are most interested in change from
start to end.
• Easiest way is to look at MEAN difference
• Add up all baseline scores and divide by the number of
participants, then do the same for follow-up.
Standard Deviation: the standard deviation is the most commonly
used measure of statistical dispersion. Simply put, it measures how
spread out the values in a data set are.
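The mean, mean difference and standard deviation described above can be sketched in a few lines of Python. This is only an illustration: the scores below are hypothetical (they are not data from any evaluation), and the variable names are made up for the example.

```python
from statistics import mean, stdev

# Hypothetical well-being scores for five participants (illustrative only).
baseline = [12, 15, 14, 18, 11]
follow_up = [16, 17, 15, 21, 16]

# Mean: add up all scores and divide by the number of participants.
baseline_mean = mean(baseline)
follow_up_mean = mean(follow_up)

# Mean difference: change from start to end.
mean_change = follow_up_mean - baseline_mean

# stdev() is the sample standard deviation (divides by n - 1),
# which is also what Excel's STDEV function calculates.
baseline_sd = stdev(baseline)
follow_up_sd = stdev(follow_up)

print(f"Baseline:  M = {baseline_mean:.2f}, SD = {baseline_sd:.2f}")
print(f"Follow-up: M = {follow_up_mean:.2f}, SD = {follow_up_sd:.2f}")
print(f"Mean change = {mean_change:.2f}")
```

A larger SD means the scores are more spread out around the mean; here the baseline scores are slightly more spread out than the follow-up scores.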
[Figure: distribution curve with the mean at the centre and markers at minus and plus 1, 2, 3… SD]
Even simple spreadsheet programmes like Excel will allow
you to conduct simple statistics
            Intervention (N = 19)     Control (N = 11)
Gender      16 female, 3 male         9 female, 2 male
Age         M = 41.89 (SD = 13.05)    M = 44.54 (SD = 11.60)
            Range 24-64               Range 24-58
Warwick-Edinburgh Mental Well-being Scale (WEMWBS)
• FREE!
• 2-5 minutes to complete
• 14 positively phrased items
• Total score (min 14 max 70)
Measure     Mean before  SD before  Mean after  SD after  Mean change  Pooled SD  Effect size
WEMWBS-I    14.16        3.66       17.84       3.30      3.68         3.48       1.06
WEMWBS-C    15.18        3.49       14.45       3.50      -0.72        3.50       -0.21
(I = intervention group, C = control group)
Effect size: difference between means divided by pooled SD.
Cohen’s (1988) guidelines: 0.3 = clinically useful change,
0.5 = medium effect, 0.8 = large effect
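As a worked check of the figures for the intervention group, the sketch below recomputes the mean change, pooled SD and effect size from the before and after means and SDs. It assumes, as the figures in the table suggest, that the pooled SD is the simple average of the two SDs; the function name `change_effect_size` is made up for this illustration.

```python
def change_effect_size(mean_before, sd_before, mean_after, sd_after):
    """Return (mean change, pooled SD, effect size) from summary statistics.

    Assumption: pooled SD is taken as the simple average of the
    before and after SDs.
    """
    mean_change = mean_after - mean_before
    pooled_sd = (sd_before + sd_after) / 2
    effect_size = mean_change / pooled_sd
    return mean_change, pooled_sd, effect_size


# Intervention group (WEMWBS-I) values from the table.
change, pooled, effect = change_effect_size(14.16, 3.66, 17.84, 3.30)
print(round(change, 2), round(pooled, 2), round(effect, 2))  # 3.68 3.48 1.06
```

An effect size of 1.06 is above the 0.8 threshold, so by the guidelines quoted above it would count as a large effect.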
• Change scores are useful
• Easy and simple way of evaluating change
• Change scores should demonstrate improvements in
behaviour outcome
Cohen’s d: difference between the means of two
groups divided by the pooled SD of both groups
Glass’s delta: difference between the means of two
groups divided by the SD of the control group
Note: both of these can be used to look at
post-treatment group differences or treatment
group pre and post differences
Cohen’s d
(Mean of intervention - Mean of control) / ((SD of
intervention + SD of control) / 2)
Glass’s delta
(Mean of intervention - Mean of control) / (SD of control)
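The two formulas can be written as small Python functions. This is a sketch, not a definitive implementation: the function names are made up for the example, and the pooled SD is taken as the simple average of the two SDs, as in the formula above. Textbook versions of Cohen’s d usually pool the variances weighted by group size, which gives a slightly different figure when the groups differ in size or spread.

```python
def cohens_d(mean_intervention, sd_intervention, mean_control, sd_control):
    """Difference between means divided by the pooled SD.

    Assumption: pooled SD is the simple average of the two SDs.
    """
    pooled_sd = (sd_intervention + sd_control) / 2
    return (mean_intervention - mean_control) / pooled_sd


def glass_delta(mean_intervention, mean_control, sd_control):
    """Difference between means divided by the control group's SD."""
    return (mean_intervention - mean_control) / sd_control


# Post-treatment comparison using the "after" means and SDs reported
# for the intervention (17.84, 3.30) and control (14.45, 3.50) groups.
d = cohens_d(17.84, 3.30, 14.45, 3.50)
delta = glass_delta(17.84, 14.45, 3.50)
print(f"Cohen's d = {d:.2f}, Glass's delta = {delta:.2f}")
```

Glass’s delta is often preferred when the intervention itself may have changed how spread out the scores are, because it uses only the control group’s SD.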
There were XX participants in total across two group
conditions (Intervention N = XX, Control N = XX).
The mean age was XX (range XX-XX, SD = XX).
At baseline the two groups DID/DID NOT
differ significantly on XX/YY. The mean at baseline
was ?? (SD = XX) and at follow-up was ?? (SD = XX).
The mean change score was therefore ??
with an effect size of ??. This study suggests the
intervention has impacted on participants’
self-reported well-being. Furthermore, this change is
statistically significant as demonstrated by t-test
analyses, t(20) = 2.61, p < .05.
• Title
• Abstract (summary)
• Introduction
• Method
Participants
Intervention
Measures
Design
• Results
• Discussion
• References
Beck Depression Inventory (BDI)
• 21-item self-report inventory measuring the severity of
characteristic attitudes & symptoms associated with
depression
• Each item contains four possible responses which range
in severity from 0 ( I do not feel sad) to 3 ( I am so sad or
unhappy that I can’t stand it)
• Score of 10-18 = mild to moderate depression
• Score of 19-29 = moderate to severe depression
• Score of 30-63 = severe depression
Purchase from: http://www.pearson-uk.com
Five Facet Mindfulness Questionnaire (FFMQ)
• 39-item self-report questionnaire used to assess five different
facets of mindful awareness:
• non-reactivity to inner experience,
• observing,
• acting with awareness,
• describing and
• non-judging of experience.
• 5-point Likert scale (1 = never or very rarely true; 5 = very often
or always true).
• Rationale for conducting your own evaluations
• How to manage the numbers
• Evaluation measures
• Scoring, inputting and analysing
• Presentation of results
• Mean score: =AVERAGE(data)
• Standard deviation: =STDEV(data)
• T-test: Data Analysis -> t-test
• Independent = Different groups
• Paired = Matched groups
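To show what the two kinds of t-test calculate, here is a minimal Python sketch using only the standard library. The scores are hypothetical, the function names are made up for the example, and the independent test assumes the two groups have equal variances. The sketch returns the t value and degrees of freedom only; the p value would still come from Excel, a statistics package or a t table.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(before, after):
    """Paired t-test: the same (or matched) participants measured twice."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1  # t value, degrees of freedom


def independent_t(group_a, group_b):
    """Independent t-test for two different groups (equal variances assumed)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, n_a + n_b - 2  # t value, degrees of freedom


# Hypothetical scores (illustrative only).
before = [12, 15, 14, 18, 11]
after = [16, 17, 15, 21, 16]
control_after = [12, 15, 14, 18, 11]

t_paired, df_paired = paired_t(before, after)
t_indep, df_indep = independent_t(after, control_after)
print(f"Paired: t({df_paired}) = {t_paired:.2f}")
print(f"Independent: t({df_indep}) = {t_indep:.2f}")
```

With the same numbers, the paired test gives a larger t value than the independent test because it works on each participant’s own change and so removes differences between people.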