Statistics for IPs


Intro to Statistics for Infection Preventionists
Presented By:
Jennifer McCarty, MPH, CIC
Shana O’Heron, MPH, PhD, CIC
Objectives
Describe the important role statistics play in infection prevention.
Describe the most common types of statistics used in hospital epidemiology.
Provide examples of how statistics are utilized in hospital epidemiology.
Role of Statistics in Hospital
Epidemiology
Aid in organizing and summarizing data
- Population characteristics
- Frequency distributions
- Calculation of infection rates
Make inferences about data
- Suggest association
- Infer causality
Communicate findings
- Prepare reports for committees
- Monitor the impact of interventions
Role of Statistics in Hospital
Epidemiology
When evaluating a study or white paper
- Are the findings statistically significant?
- Was the sample size large enough to show a difference if there is one?
- Are the groups being compared truly similar?
When investigating an unusual cluster
- Describe the outbreak
- Select control subjects
- Determine the appropriate test to use when measuring exposure
Descriptive Epidemiology
Descriptive Statistics: techniques concerned with the organization, presentation, and summarization of data.
- Measures of central tendency
- Measures of dispersion
- Use of proportions, rates, ratios
Descriptive Statistics
Variable: “Anything that is measured or manipulated in a study”
Types of variables:
- Qualitative
  - Nominal, Ordinal
- Quantitative
  - Interval, Ratio
- Independent vs. Dependent Variables
- Continuous vs. Discrete variables
Variables
Measures of Central Tendency
Mean: mathematical average of the values in a data set.
Calculation:
Patient Length of Stay: 12, 9, 3, 5, 7, 6, 13, 8, 4, 15, 6
Mean (x̄) = (the sum of each patient’s length of stay) / (the number of patients)
= (12 + 9 + 3 + 5 + 7 + 6 + 13 + 8 + 4 + 15 + 6) / 11 = 88 / 11 = 8 days
Measures of Central Tendency
Median: the value falling in the middle of the data set when the values are sorted.
Calculation:
Patient Length of Stay: 12, 9, 3, 5, 7, 6, 13, 8, 4, 15, 6
Sorted: 3, 4, 5, 6, 6, 7, 8, 9, 12, 13, 15
Median = 7 days (the 6th of the 11 sorted values)
Measures of Central Tendency
Mode: the most frequently occurring value in a data set.
Calculation:
Patient Length of Stay: 12, 9, 3, 5, 7, 6, 13, 8, 4, 15, 6
Sorted: 3, 4, 5, 6, 6, 7, 8, 9, 12, 13, 15
Mode = 6 days (the only value that occurs more than once)
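The three calculations above can be reproduced with a short script. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative, and the data are the length-of-stay values from the slides.

```python
from statistics import mean, median, mode

# Patient length of stay in days, copied from the slides.
length_of_stay = [12, 9, 3, 5, 7, 6, 13, 8, 4, 15, 6]

los_mean = mean(length_of_stay)      # sum of the values / number of values
los_median = median(length_of_stay)  # middle value after sorting
los_mode = mode(length_of_stay)      # most frequently occurring value

print(f"Mean:   {los_mean} days")
print(f"Median: {los_median} days")
print(f"Mode:   {los_mode} days")
```

The median and mode are less sensitive to a few very long stays than the mean, which is one reason to report more than one measure.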
Measures of Dispersion
Range: the difference between the smallest and largest values in a data set.
Calculation:
Patient Length of Stay: 12, 9, 3, 5, 7, 6, 13, 8, 4, 15, 6
Range = 15 – 3 = 12 days
Measures of Dispersion
Standard Deviation: measure of dispersion that reflects the variability in values around the mean.
- Deviation: the difference between an individual data point and the mean value for the data set.
- $SD = \sqrt{\sum (x - \bar{x})^2 / (n - 1)}$
  - “Take all the deviations from the mean, square them, then divide their sum by the total number of observations minus one and take the square root of the resulting number”
Variance: a measure of variability that is equal to the square of the standard deviation.
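As a rough illustration, the range, variance, and standard deviation can be computed for the same length-of-stay data. The sketch below follows the quoted recipe step by step and then checks it against the `statistics` module, which also divides by n − 1 for a sample; the variable names are illustrative.

```python
from statistics import mean, stdev, variance

# Patient length of stay in days, copied from the slides.
length_of_stay = [12, 9, 3, 5, 7, 6, 13, 8, 4, 15, 6]

# Range: largest value minus smallest value.
los_range = max(length_of_stay) - min(length_of_stay)

# Standard deviation, following the quoted recipe step by step.
x_bar = mean(length_of_stay)
squared_deviations = [(x - x_bar) ** 2 for x in length_of_stay]
manual_variance = sum(squared_deviations) / (len(length_of_stay) - 1)
manual_sd = manual_variance ** 0.5

# Library equivalents (sample variance and sample standard deviation).
los_variance = variance(length_of_stay)
los_sd = stdev(length_of_stay)

print(f"Range:    {los_range} days")
print(f"Variance: {manual_variance:.2f}")
print(f"SD:       {manual_sd:.2f} days")
```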
Normal Distribution
Continuous distribution
Bell shaped curve
Symmetric around the mean
Non-Normal Distributions
Skew
- Non-symmetric distribution
- Positive or Negative
  - Refers to the direction of the long tail
Bi/Multi-Modal
- May have distinct peaks, each with its own central tendency
- No central tendency
Proportions
Use of Proportions, Rates & Ratios
Proportions: A fraction in which the numerator is part of the denominator.
Rates: A fraction in which the denominator involves a measure of time.
Ratios: A fraction in which there is not necessarily a relationship between the numerator and the denominator.
Proportions
Prevalence: proportion of persons with a particular disease within a given population at a given time.
[Figure: Proportion of S. aureus Nosocomial Infections Resistant to Oxacillin (MRSA) Among Intensive Care Unit Patients, 1989-2003*. Y-axis: Percent Resistance (0 to 70); X-axis: Year (1989 to 2003).]
*Source: NNIS System, data for 2003 are incomplete
Rates
Rate = x/y × k
- x = The number of times the event (e.g., infections) has occurred during a specified time interval.
- y = The population (e.g., number of patients at risk) from which those experiencing the event were derived during the same time interval.
- k = A constant used to transform the result of division into a uniform quantity so that it can be compared with other, similar quantities.
Rates
Example: Foley-Associated UTIs in the ICU
- Step 1: Time period
  - April 2014
- Step 2: Patient population
  - Patients in the Medical / Surgical ICU of Hospital X who have Foley catheters
- Step 3: Infections (numerator)
  - April CAUTI infections in the ICU = 2
- Step 4: Device-days (denominator)
  - Total number of days that patients in the ICU had Foley catheters in place = 920
- Step 5: Device-associated infection rate
  - Rate = (2 / 920) × 1000 = 2.17 per 1000 Foley-days
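The five steps above reduce to one line of arithmetic. The sketch below wraps it in a small helper; `rate_per_k` is a hypothetical name, and the inputs are the April 2014 figures from the example.

```python
def rate_per_k(events, population_at_risk, k=1000):
    """Rate = x / y * k, using the definitions from the Rates slide."""
    if population_at_risk <= 0:
        raise ValueError("population_at_risk must be positive")
    return events / population_at_risk * k


cauti_count = 2    # Step 3: CAUTIs in the ICU (numerator)
foley_days = 920   # Step 4: Foley catheter days (denominator)

cauti_rate = rate_per_k(cauti_count, foley_days)
print(f"{cauti_rate:.2f} per 1000 Foley-days")
```

Choosing k = 1000 expresses the result per 1000 device-days, the unit used in the example.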
NHSN Comparison
Ratios
Calculation of Device Utilization Ratio
- Step 1: Time period
  - April 2014
- Step 2: Patient population
  - Patients in the Medical / Surgical ICU of Hospital X who have Foley catheters
- Step 3: Device-days (numerator)
  - Total number of days that patients in the ICU had Foley catheters in place = 920
- Step 4: Patient-days (denominator)
  - Total number of days that patients are in the ICU = 1176
- Step 5: Device utilization ratio
  - Ratio = 920 / 1176 = 0.78
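The device utilization ratio can be sketched the same way. The function name below is illustrative, and the inputs are the device-days and patient-days from the example.

```python
def device_utilization_ratio(device_days, patient_days):
    """Device-days divided by patient-days for the same unit and period."""
    if patient_days <= 0:
        raise ValueError("patient_days must be positive")
    return device_days / patient_days


foley_days = 920     # Step 3: days with a Foley catheter in place
patient_days = 1176  # Step 4: total patient-days in the ICU

dur = device_utilization_ratio(foley_days, patient_days)
print(f"Device utilization ratio: {dur:.2f}")
```

A ratio of 0.78 means a Foley catheter was in place on roughly 78% of ICU patient-days in this period.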
NHSN Comparison
What does this tell you?
When examined together, the device-associated infection rate and device utilization ratio can be used to appropriately target preventative measures.
- Consistently high rates and ratios may signify a problem and further investigation is suggested.
  - Potential overuse/improper use of device
- Consistently low rates and ratios may suggest underreporting of infection or the infrequent use or short duration of use of devices.
Analytic Epidemiology
Inferential statistics: procedures used to make inferences about a population based on information from a sample of measurements from that population.
- Z-test/T-test
- Chi Square
- SIR
Hypothesis Testing
Hypothesis Testing Studies
Null Hypothesis (H0): a hypothesis of no association between two variables.
- The hypothesis to be tested
Alternate Hypothesis (Ha): a hypothesis of association between two variables.
Hypothesis Testing: Error
Decision | Reality: treatments are not different | Reality: treatments are different
Conclude that treatments are not different | Correct Decision | Type II Error (Probability = β)
Conclude that treatments are different | Type I Error (Probability = α) | Correct Decision (Probability = 1 − β = Power)
Significance Testing
A p value is not the probability that your finding is due to random chance alone
- It is the probability, assuming the null hypothesis is true, of collecting a random sample of the same size from the same population that yields a result at least as extreme as the one you just calculated
Level of Significance (α level) is the probability of rejecting a null hypothesis when it is true
- The level of risk a researcher is willing to take of being wrong
- Usually set to 0.05 or 0.01
Hypothesis Testing: Error
Type I Error: rejecting the null hypothesis when the null hypothesis is true.
- α = probability of making a type I error
Type II Error: failing to reject the null hypothesis when the alternate hypothesis is true.
- β = probability of making a type II error
Power: probability of correctly concluding that the outcomes differ
- 1 − β = power
Parametric Tests
Assume Normal distribution of the sample population
Usually continuous-interval variables
z Test
Student’s t Test
z Test
Test the difference between two means or proportions (two tailed)
Use when:
- Sample size is greater than 30
- Requires a normal distribution
Example: Comparing your mean infection rate to NHSN mean rates
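The slides point to an online calculator for this test. As a rough illustration of what such a calculator does, here is a sketch of a two-tailed z test comparing two proportions. The pooled standard error is one common textbook form and is an assumption here, the function names are hypothetical, and the counts are made up for illustration.

```python
import math


def normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_proportion_z_test(events_1, n_1, events_2, n_2):
    """Two-tailed z test for the difference between two proportions."""
    p_1 = events_1 / n_1
    p_2 = events_2 / n_2
    pooled = (events_1 + events_2) / (n_1 + n_2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    if standard_error == 0:
        raise ValueError("cannot test when the pooled proportion is 0 or 1")
    z = (p_1 - p_2) / standard_error
    p_value = 2 * (1 - normal_cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 12 infections in 400 procedures at our facility
# versus 30 infections in 2000 procedures in a comparison group.
z, p_value = two_proportion_z_test(12, 400, 30, 2000)
print(f"z = {z:.2f}, two-tailed p = {p_value:.3f}")
```

If the p value is below the chosen α level (usually 0.05), the difference would be called statistically significant.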
z Test
http://www.dimensionresearch.com/resources/calculators/ztest.html
t Tests
Test the difference in means (one or two tailed)
Use when:
- Sample size is less than 30
- Assumes
  - Independence of populations & values
  - Variance is equal for both sets of data
  - No confounding variables
Types of t Tests:
- Independent sample (experiment vs. control)
- Paired sample (before and after)
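The two types of t test differ only in how the test statistic is built. The sketch below computes the t statistic and degrees of freedom for each, assuming equal variances for the independent-sample case as listed above. Function names and data are hypothetical. Converting t to a p value requires the t distribution, which is not in Python's standard library, so that step is left to a t table or one of the linked calculators.

```python
import math
from statistics import mean, stdev, variance


def independent_t_statistic(sample_a, sample_b):
    """Pooled-variance t statistic for two independent samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_variance = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


def paired_t_statistic(before, after):
    """t statistic for paired (before and after) measurements."""
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    standard_error = stdev(differences) / math.sqrt(n)
    return mean(differences) / standard_error, n - 1


# Hypothetical lengths of stay (days): control unit vs. intervention unit.
t_ind, df_ind = independent_t_statistic([9, 7, 8, 10, 6], [6, 5, 7, 4, 6])

# Hypothetical monthly infection counts on four units, before and after.
t_pair, df_pair = paired_t_statistic([5, 6, 7, 8], [4, 4, 6, 6])

print(f"Independent: t = {t_ind:.2f}, df = {df_ind}")
print(f"Paired:      t = {t_pair:.2f}, df = {df_pair}")
```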
t Tests
http://www.dimensionresearch.com/resources/calculators/ttest.html
t Tests
http://www.usablestats.com/calcs/2samplet
Non-Parametric Tests
Do not assume normal distribution
Used with more types of data: Nominal, Ordinal, Interval, Discrete (infection vs no infection)
Chi Square (χ²)
- Compares observed values against expected values
  - Example: Comparing SSI rates for Dr. X and Dr. Y
- http://www.gifted.uconn.edu/siegle/research/ChiSquare/chiexcel.htm
2x2: Exposures and Outcomes
Exposure status | Patients With Disease | Patients With No Disease
Patients Exposed | a | b
Patients Not Exposed | c | d
Chi square
http://faculty.vassar.edu/lowry/newcs.html
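For a 2x2 table, the chi-square statistic can be computed by hand from the observed and expected cell counts. The sketch below does this without a continuity correction, which is an assumption since the slides do not say which variant the linked calculators apply. It also assumes no row or column total is zero. The surgeon counts are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    Cells follow the a, b, c, d layout of the 2x2 table.
    """
    n = a + b + c + d
    observed = ((a, b), (c, d))
    row_totals = (a + b, c + d)
    column_totals = (a + c, b + d)
    chi_square = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * column_totals[j] / n
            chi_square += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Hypothetical counts: Dr. X had 20 SSIs in 100 procedures,
# Dr. Y had 10 SSIs in 100 procedures.
chi_square, p_value = chi_square_2x2(20, 80, 10, 90)
print(f"chi-square = {chi_square:.2f}, p = {p_value:.3f}")
```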
Relative Risk
Comparing the risk of disease in exposed individuals to individuals who were not exposed
Exposure status | Patients With Disease | Patients With No Disease
Patients Exposed | a | b
Patients Not Exposed | c | d
$RR = \frac{\text{Disease incidence in exposed}}{\text{Disease incidence in non-exposed}} = \frac{a / (a + b)}{c / (c + d)}$
Relative Risk
RR = 1
- Risk in exposed equal to risk in non-exposed
- No association
RR > 1
- Risk in exposed greater than risk in non-exposed
- Positive association, possibly causal
RR < 1
- Risk in exposed less than risk in non-exposed
- Negative association, possibly protective
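A short sketch of the relative risk calculation and its interpretation, using the a, b, c, d cell labels from the 2x2 table. The counts and function names are hypothetical.

```python
def relative_risk(a, b, c, d):
    """RR = [a / (a + b)] / [c / (c + d)] from a 2x2 table."""
    risk_exposed = a / (a + b)
    risk_not_exposed = c / (c + d)
    return risk_exposed / risk_not_exposed


def describe_association(ratio, tolerance=1e-9):
    """Translate a ratio into the wording used on the slide."""
    if abs(ratio - 1) <= tolerance:
        return "no association"
    if ratio > 1:
        return "positive association, possibly causal"
    return "negative association, possibly protective"


# Hypothetical cohort: 30 of 100 exposed patients developed disease,
# compared with 10 of 100 patients who were not exposed.
rr = relative_risk(a=30, b=70, c=10, d=90)
print(f"RR = {rr:.2f}: {describe_association(rr)}")
```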
Odds Ratio
Comparing the odds that a disease will develop
Exposure history | Patients With Disease (Cases) | Patients With No Disease (Controls)
Patients With History of Exposure | a | b
Patients Without History of Exposure | c | d
$OR = \frac{\text{Odds that a case was exposed}}{\text{Odds that a control was exposed}} = \frac{a / c}{b / d} = \frac{ad}{bc}$
Odds Ratio
OR = 1
- Exposure not related to the disease
OR > 1
- Exposure positively related to disease
OR < 1
- Exposure negatively related to the disease
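The odds ratio uses the same four cells but compares odds rather than risks. A minimal sketch, with hypothetical case-control counts and an illustrative function name:

```python
def odds_ratio(a, b, c, d):
    """OR = (a / c) / (b / d) = ad / bc from a case-control 2x2 table."""
    if b == 0 or c == 0:
        raise ValueError("b and c must be non-zero to compute an odds ratio")
    return (a * d) / (b * c)


# Hypothetical case-control study:
# 30 cases exposed, 20 cases not exposed,
# 10 controls exposed, 40 controls not exposed.
case_control_or = odds_ratio(a=30, b=10, c=20, d=40)
print(f"OR = {case_control_or:.1f}")
```

An OR above 1, as in this made-up example, would suggest the exposure is positively related to the disease.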
95% Confidence Interval
Confidence Interval: a computed interval of values that, with a given probability, contains the true value of the population parameter.
- 95% CI: 95% of the time the true value falls within the interval given.
Allows you to assess variability of an estimated statistic
For ratio measures (e.g., RR, OR, SIR), if the confidence interval includes the value of 1, then the statistic is not statistically significant
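The slides do not say how the interval is computed. One common approach for a relative risk is the log-based (Katz) approximation, sketched below as an assumption rather than as the presenters' method. The counts and function names are hypothetical, and all four cells are assumed to be non-zero.

```python
import math


def relative_risk_ci(a, b, c, d, z=1.96):
    """Relative risk with an approximate 95% CI (log method).

    z = 1.96 corresponds to a 95% confidence level.
    """
    rr = (a / (a + b)) / (c / (c + d))
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


def interval_includes_one(lower, upper):
    """True when the interval contains 1 (not statistically significant)."""
    return lower <= 1 <= upper


# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed with disease.
rr, lower, upper = relative_risk_ci(30, 70, 10, 90)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
print("Includes 1:", interval_includes_one(lower, upper))
```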
Standardized Infection Ratio (SIR)
Compare the HAI experience among one or more groups of patients to that of a standard population (e.g., NHSN)
Risk-adjusted summary measure
- Available for CAUTI, CLABSI, and SSI data
Details can be found in the SIR Newsletter, available at: http://www.cdc.gov/nhsn/PDFs/Newsletters/NHSN_NL_OCT_2010SE_final.pdf
SIR
Observed # of HAI – the number of events that you enter into NHSN
Expected or predicted # of HAI – comes from national baseline data*
The formula for calculating the number of expected CLABSI infections is:
- # central line days × (NHSN Rate / 1000)
*Source of national baseline data: NHSN Report, Am J Infect Control 2009;37:783-805. Available at: http://www.cdc.gov/nhsn/PDFs/dataStat/2009NHSNReport.PDF
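The expected-count formula is a single multiplication. In the sketch below, the central line days and the baseline rate are hypothetical placeholders; in practice the rate comes from the NHSN baseline data for the matching location type.

```python
def expected_clabsi(central_line_days, nhsn_rate_per_1000):
    """Expected CLABSIs = central line days * (NHSN rate / 1000)."""
    return central_line_days * (nhsn_rate_per_1000 / 1000)


# Hypothetical inputs: 2500 central line days in the period and a
# baseline pooled mean rate of 2.0 CLABSIs per 1000 central line days.
expected = expected_clabsi(central_line_days=2500, nhsn_rate_per_1000=2.0)
print(f"Expected CLABSIs: {expected:.1f}")
```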
SIR – CLAB Data for CMS IPPS
Overall SIR Interpretation
During the first half of 2011, our facility observed 15 CLABSIs in our ICU locations.
The number of expected CLABSIs during this timeframe, based on national data, was 10.397 CLABSIs.
This yields an SIR of 1.443 (15 / 10.397), indicating that we observed approx. 44% more infections than expected.
Based on statistical evidence, we can conclude that our SIR is not statistically different from 1.
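The SIR in this example is simply observed divided by expected. The sketch below reproduces that figure and adds a rough one-sided Poisson check of how unusual 15 or more events would be when 10.397 are expected. The Poisson tail is an illustrative assumption, and NHSN's own p value and confidence interval may be computed differently.

```python
import math


def standardized_infection_ratio(observed, expected):
    """SIR = observed number of HAIs / expected number of HAIs."""
    return observed / expected


def poisson_upper_tail(observed, expected):
    """P(X >= observed) when X follows a Poisson(expected) distribution."""
    below = sum(
        math.exp(-expected) * expected ** k / math.factorial(k)
        for k in range(observed)
    )
    return 1 - below


observed_clabsi = 15           # from the example
expected_clabsi_count = 10.397  # from the example

facility_sir = standardized_infection_ratio(observed_clabsi, expected_clabsi_count)
tail_probability = poisson_upper_tail(observed_clabsi, expected_clabsi_count)

print(f"SIR = {facility_sir:.3f}")
print(f"P(15 or more events | 10.397 expected) = {tail_probability:.3f}")
```

A tail probability above 0.05 is in line with the slide's conclusion that this SIR cannot be distinguished from 1.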
SIR for SSI Data
SSI Rates output options have been moved to “Advanced” folder
You can still obtain your facility’s SSI rates using Basic Risk Index; however, NHSN pooled mean and comparison statistics for SSI Rates will no longer be available
SIRs use several risk factors to build logistic regression models for improved risk adjustment
Expected SSIs
The number of expected SSIs is calculated by summing the procedure risk for all procedures included in the summarized calculation (e.g., all procedures for 2011, H1)
The procedure risk is calculated from improved risk models*
- The “Basic Risk Index” is no longer used for national SSI analyses
- New risk models provide improved risk adjustment in the prediction of SSIs
*Mu Y et al. Infect Control Hosp Epidemiol 2011;32(10):970-986.
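Summing procedure risks can be illustrated with made-up numbers. In the sketch below the per-procedure probabilities are hypothetical placeholders. In NHSN they come from the logistic regression risk models, which are not reproduced here.

```python
# Hypothetical predicted SSI probabilities, one per procedure:
# 200 lower-risk procedures at 2% and 100 higher-risk procedures at 4%.
procedure_risks = [0.02] * 200 + [0.04] * 100

# Expected SSIs = sum of the procedure risks for all included procedures.
expected_ssis = sum(procedure_risks)

# Hypothetical observed count for the same procedures and period.
observed_ssis = 6
ssi_sir = observed_ssis / expected_ssis

print(f"Expected SSIs: {expected_ssis:.1f}")
print(f"SSI SIR: {ssi_sir:.2f}")
```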
Available NHSN Risk Factors
Logistic Regression Model
Logistic Model for VHYS
Sum Probability of SSI for #Expected SSIs
Overall SSI SIR
Questions?
Useful Resources:
APIC EpiGraphics: Statistics and Surveillance Tools for IPs
APIC Manual, Chapter 5 – Use of Statistics. (2009)
PDQ Statistics. GR Norman & DL Streiner (2003). BC Decker.
Fundamentals of Biostatistics. B Rosner (2000). Brooks/Cole.
Epidemiology for Public Health Practice. RH Friis & TA Sellers (2004). Jones and Bartlett Publishers, Inc.
Excel Hacks. DE Hawley (2007). O’Reilly.
www.graphpad.com/quickcalcs
http://www.danielsoper.com/statcalc/default.aspx
- Free statistics calculators
http://nccphp.sph.unc.edu/training/index.php
- Free online epi training from North Carolina Center of Public Health Preparedness
Images From:
National prevalence of methicillin-resistant Staphylococcus aureus in inpatients at US health care facilities, 2006. W Jarvis et al. AJIC December 2007.
National Healthcare Safety Network (NHSN) report: Data summary for 2006 through 2008, issued December 2009. Edwards et al. AJIC December 2009.
PDQ Statistics. GR Norman & DL Streiner (2003). BC Decker.
Pertussis: A Disease Affecting All Ages. DS Gregory. American Family Physician. August 2006. www.aafp.org/afp/2006/0801/p420.html
Summarizing Your Data. Science Buddies. www.sciencebuddies.org/mentoring/project_data_analysis_summarizing_data.shtml