10. Direct method of standardization of indices



Direct method of standardization of indices
The most important part
The most important part is concerned with reasoning in an environment where one doesn't know, or can't know, all of the facts needed to reach conclusions with complete certainty. One deals with judgments and decisions in situations of incomplete information. In this introduction we will give an overview of statistics along with an outline of the various topics in this course.
Case-control study
In the early 1940s, Alton Ochsner, a surgeon in New Orleans, observed that virtually all of the patients on whom he was operating for lung cancer gave a history of cigarette smoking. He hypothesized that cigarette smoking was linked to lung cancer.
Case-control study
Again in the 1940s, Sir Norman Gregg, an Australian ophthalmologist, observed a number of infants and young children in his ophthalmology practice who presented with an unusual form of cataract. Gregg noted that these children had been in utero during the time of a rubella (German measles) outbreak. He suggested that there was an association between prenatal rubella exposure and the development of the unusual cataracts.
Case-Control Study
• Design of a case-control study
• Conduct of a case-control study
• Analysis of a case-control study
Design of a case-control study
[Diagram: those who HAVE THE DISEASE are divided into "were exposed" and "were not exposed"; those who DO NOT HAVE THE DISEASE are likewise divided into "were exposed" and "were not exposed".]
Design of Case-Control Studies
First select the cases (with disease) and the controls (without disease); then measure past exposure.

                      Cases             Controls
                      (With Disease)    (Without Disease)
Were exposed               a                  b
Were not exposed           c                  d
Total                    a + c              b + d
Proportion exposed     a / (a + c)        b / (b + d)

If the exposure is associated with disease, we expect that a/(a + c) > b/(b + d).
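To make the a, b, c, d notation concrete, here is a minimal Python sketch (a hedged illustration, not part of the original lecture) that computes the proportion exposed among cases and among controls and checks the inequality above; the counts reuse the CHD example that follows.

```python
# Hypothetical 2x2 case-control table:
#                    Cases   Controls
# Were exposed         a        b
# Were not exposed     c        d
a, b, c, d = 112, 176, 88, 224  # counts borrowed from the CHD example below

prop_exposed_cases = a / (a + c)      # proportion of cases that were exposed
prop_exposed_controls = b / (b + d)   # proportion of controls that were exposed

print(f"Proportion exposed, cases:    {prop_exposed_cases:.3f}")
print(f"Proportion exposed, controls: {prop_exposed_controls:.3f}")

# Exposure is associated with disease when a/(a+c) > b/(b+d)
print("Association suggested:", prop_exposed_cases > prop_exposed_controls)
```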
A case-control study of Coronary Heart Disease and Cigarette Smoking

                            CHD Cases   Controls
Smoke cigarettes               112         176
Do not smoke cigarettes         88         224
Total                          200         400
% smoking cigarettes          56.0        44.0
Question
Can we get the prevalence of disease from our study? Is it 200/(200 + 400)? No: because the investigator fixes how many cases and controls are enrolled, a case-control study cannot by itself provide the prevalence of disease.
Measures of Association
• Relative risk and cohort studies
- The relative risk (or risk ratio) is defined as the ratio of the incidence of disease in the exposed group divided by the corresponding incidence of disease in the unexposed group.
• Odds ratio and case-control studies
- The odds ratio is defined as the odds of exposure in the group with disease divided by the odds of exposure in the control group.
• Absolute risk
- The relative risk and odds ratio provide a measure of risk compared with a standard.
• Attributable risk or risk difference is a measure of absolute risk. It represents the excess risk of disease in those exposed, taking into account the background rate of disease. The attributable risk is defined as the difference between the incidence rates in the exposed and non-exposed groups.
• Population attributable risk is used to describe the excess rate of disease in the total study population of exposed and non-exposed individuals that is attributable to the exposure.
• Number needed to treat (NNT)
- The number of patients who would need to be treated to prevent one adverse outcome is often used to present the results of randomized trials.
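As a hedged illustration of these measures, the sketch below computes the odds ratio from the CHD case-control table above and then the relative risk, attributable risk, and NNT from a hypothetical cohort table (those cohort counts are invented for illustration only).

```python
# Odds ratio from the case-control table above (CHD example)
a, b, c, d = 112, 176, 88, 224          # exposed cases, exposed controls, unexposed cases, unexposed controls
odds_ratio = (a / c) / (b / d)          # equivalently (a*d)/(b*c)
print(f"Odds ratio: {odds_ratio:.2f}")  # about 1.62

# Hypothetical cohort data (invented counts, for illustration only)
exposed_cases, exposed_total = 30, 1000
unexposed_cases, unexposed_total = 10, 1000

incidence_exposed = exposed_cases / exposed_total
incidence_unexposed = unexposed_cases / unexposed_total

relative_risk = incidence_exposed / incidence_unexposed       # ratio of incidences
attributable_risk = incidence_exposed - incidence_unexposed   # risk difference
nnt = 1 / attributable_risk                                   # number needed to treat

print(f"Relative risk:     {relative_risk:.2f}")
print(f"Attributable risk: {attributable_risk:.3f}")
print(f"NNT:               {nnt:.0f}")
```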
[Diagram: from the population, the selected cases are drawn from all cases in the population, and the selected controls are drawn from all people in the population without the disease.]
The Case-Control Study
Definition (Last J., A Dictionary of Epidemiology, 2001): the observational epidemiologic study of persons with the disease (or other outcome variable) of interest (cases) and a suitable control (comparison, reference) group of persons without the disease.
Sources of cases:
• Hospital patients;
• Patients in physicians' practices;
• Clinic patients.
Selection of cases
1. Generalizability:
   incident / prevalent
   alive / dead
   hospital-based / population-based
2. Case criteria:
   specific definitions
   inclusion and exclusion criteria
Select cases from a single hospital or multiple hospitals
From a single hospital: identified risk factors may be unique to that hospital.
From multiple hospitals in the community: risk factors are generalizable to all patients with the disease.
Select incident or prevalent cases
Incident cases: risk factors identified are more likely related to the development of the disease.
Prevalent cases: risk factors identified are more likely related to survival with the disease.
Sources of controls
• Sampling frames:
1. Population of an administrative area
2. Hospital patients (often same service, variety of conditions)
3. Relatives of the cases (spouses and siblings)
4. Associates of the cases (neighbors, co-workers, etc.)
• The controls should be drawn from the population of which the cases represent the affected individuals.
Advantages of hospital controls
• Easily accessible
• Participants have time
• Participants motivated to cooperate
• Cases and controls drawn from similar social or geographic environments
• Differential recall likely to be minimised
Disadvantages of hospital controls
• Differential hospitalisation patterns may produce bias
• Difficult to blind disease status of cases and controls
• May underestimate exposure in controls
Advantages of community controls
• May reduce selection biases
• Study results more generalizable
• May provide convenient control of extraneous variables
• The coefficient of variation is a relative measure of variability: it is the ratio of the standard deviation to the arithmetic mean, expressed as a percentage.
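A minimal sketch of this calculation (the sample values are hypothetical):

```python
import statistics

values = [4.1, 4.5, 3.9, 4.8, 4.2, 4.4]   # hypothetical measurements

mean = statistics.mean(values)
sd = statistics.stdev(values)              # sample standard deviation

cv = sd / mean * 100                       # coefficient of variation, in percent
print(f"Coefficient of variation: {cv:.1f}%")
```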
Correlation coefficient
Types of correlation
There are the following types of correlation (relationship) between phenomena and characteristics in nature:
a) the cause-and-effect connection, i.e. the connection between factors and phenomena, between factor and result characteristics;
b) the dependence of parallel changes of several characteristics on some third quantity.
Quantitative types of connection
• Functional connection is one in which a strictly defined value of the second characteristic corresponds to any given value of the first (for example, a definite area of a circle corresponds to a given radius of the circle).
Quantitative types of connection
• Correlation is a connection in which several values of one characteristic correspond to each value of the other characteristic, grouped around its average (for example, height and body mass are known to be linked: in a group of persons of identical height there are different values of body mass, but these values vary within certain limits around their average).
Correlative connection
A correlative connection expresses a dependence between phenomena that does not have a strictly functional character.
A correlative connection shows up only in a mass of observations, that is, in an aggregate. Establishing a correlative connection presupposes identifying the causal connection that confirms the dependence of one phenomenon on the other.
Correlative connection
By direction (character), a correlative connection can be direct or inverse. A correlation coefficient that characterizes a direct connection is marked with a plus sign (+), and one that characterizes an inverse connection is marked with a minus sign (-).
By strength, a correlative connection can be strong, average, or weak; it can also be complete or absent.
Estimation of correlation by the coefficient of correlation

Force of connection   Direct (+)            Inverse (-)
Complete              +1                    -1
Strong                from +1 to +0.7       from -1 to -0.7
Average               from +0.7 to +0.3     from -0.7 to -0.3
Weak                  from +0.3 to 0        from -0.3 to 0
No connection         0                     0
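To connect the definition with the table above, here is a short, hedged sketch that computes Pearson's correlation coefficient for hypothetical height and body-mass data and classifies its strength and direction using the cut-offs above (statistics.correlation requires Python 3.10+):

```python
import statistics

# Hypothetical paired observations (height in cm, body mass in kg)
height = [160, 165, 170, 175, 180, 185]
mass   = [ 55,  60,  66,  72,  77,  84]

r = statistics.correlation(height, mass)   # Pearson correlation coefficient

def strength(r):
    a = abs(r)
    if a == 1:    return "complete"
    if a >= 0.7:  return "strong"
    if a >= 0.3:  return "average"
    if a > 0:     return "weak"
    return "no connection"

direction = "direct (+)" if r > 0 else "inverse (-)" if r < 0 else "none"
print(f"r = {r:.2f}: {strength(r)}, {direction}")
```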
Types of correlative connection
By direction:
• direct (+) – as one characteristic increases, the mean value of the other increases;
• inverse (-) – as one characteristic increases, the mean value of the other decreases.
Types of correlative connection
By character:
• rectilinear – relatively even changes in the mean values of one characteristic are accompanied by equal changes in the other (e.g. minimal and maximal arterial pressure);
• curvilinear – with an even change of one characteristic, the mean values of the other may increase or decrease at a varying rate.
Terms Used To Describe The Quality Of Measurements
• Reliability is the between-subject variability divided by the sum of the between-subject variability and the measurement error.
• Validity refers to the extent to which a test or surrogate is measuring what we think it is measuring.
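Read as a formula, reliability is the ratio of between-subject variance to total observed variance; a minimal sketch with hypothetical variance components:

```python
# Reliability = between-subject variance / (between-subject variance + measurement-error variance)
between_subject_var = 9.0    # hypothetical variance between subjects
error_var = 3.0              # hypothetical measurement-error variance

reliability = between_subject_var / (between_subject_var + error_var)
print(f"Reliability: {reliability:.2f}")   # 0.75: 75% of observed variability reflects real between-subject differences
```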
Measures Of Diagnostic Test Accuracy
• Sensitivity is defined as the ability of the test to identify correctly those who have the disease.
• Specificity is defined as the ability of the test to identify correctly those who do not have the disease.
• Predictive values are important for assessing how useful a test will be in the clinical setting at the individual patient level. The positive predictive value is the probability of disease in a patient with a positive test. Conversely, the negative predictive value is the probability that the patient does not have disease if he has a negative test result.
• Likelihood ratio indicates how much a given diagnostic test result will raise or lower the odds of having a disease relative to the prior probability of disease.
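A minimal sketch computing these measures from a hypothetical test-versus-disease 2x2 table (all cell counts are invented):

```python
# Hypothetical 2x2 table: diagnostic test result vs. true disease status
tp = 90   # test positive, disease present (true positives)
fn = 10   # test negative, disease present (false negatives)
fp = 45   # test positive, disease absent  (false positives)
tn = 855  # test negative, disease absent  (true negatives)

sensitivity = tp / (tp + fn)               # probability of a positive test in the diseased
specificity = tn / (tn + fp)               # probability of a negative test in the non-diseased
ppv = tp / (tp + fp)                       # positive predictive value
npv = tn / (tn + fn)                       # negative predictive value
lr_pos = sensitivity / (1 - specificity)   # likelihood ratio for a positive result
lr_neg = (1 - sensitivity) / specificity   # likelihood ratio for a negative result

print(f"Sensitivity {sensitivity:.2f}, Specificity {specificity:.2f}")
print(f"PPV {ppv:.2f}, NPV {npv:.2f}, LR+ {lr_pos:.1f}, LR- {lr_neg:.2f}")
```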
Expressions Used When Making Inferences About Data
• Confidence intervals
- The results of any study sample are an estimate of the true value in the entire population. The true value may actually be greater or less than what is observed.
• Type I error (alpha) is the probability of incorrectly concluding there is a statistically significant difference in the population when none exists.
• Type II error (beta) is the probability of incorrectly concluding that there is no statistically significant difference in a population when one exists.
• Power is a measure of the ability of a study to detect a true difference.
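Power is often easiest to see by simulation. The hedged sketch below (sample size, effect size, and alpha are illustrative choices) repeatedly draws two samples with a true mean difference and records how often a t-test reaches p < 0.05; that proportion estimates the power. Setting the true difference to zero would instead estimate the type I error rate.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

alpha = 0.05          # type I error rate
n_per_group = 30      # illustrative sample size
true_diff = 0.5       # true difference in means, in SD units
n_sim = 5000

rejections = 0
for _ in range(n_sim):
    a = rng.normal(0.0, 1.0, n_per_group)
    b = rng.normal(true_diff, 1.0, n_per_group)
    result = stats.ttest_ind(a, b)
    if result.pvalue < alpha:
        rejections += 1

print(f"Estimated power: {rejections / n_sim:.2f}")   # roughly 0.48 for these settings
```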
Multivariable Regression Methods
• Multiple linear regression is used when the outcome is a continuous variable such as weight. For example, one could estimate the effect of a diet on weight after adjusting for the effect of confounders such as smoking status.
• Logistic regression is used when the outcome is binary, such as cure or no cure. Logistic regression can be used to estimate the effect of an exposure on a binary outcome after adjusting for confounders.
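A hedged sketch of the logistic-regression idea using statsmodels; the simulated exposure, confounder, and outcome are entirely artificial. The exponentiated coefficient for the exposure is its odds ratio adjusted for the confounder.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500

smoking = rng.binomial(1, 0.4, n)                  # confounder (hypothetical)
exposure = rng.binomial(1, 0.3 + 0.2 * smoking)    # exposure correlated with the confounder
logit = -1.0 + 0.8 * exposure + 0.6 * smoking      # true model on the log-odds scale
outcome = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = sm.add_constant(np.column_stack([exposure, smoking]))
model = sm.Logit(outcome, X).fit(disp=0)

# Exponentiated coefficients are adjusted odds ratios
print(np.exp(model.params))   # [baseline odds, OR for exposure, OR for smoking]
```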
Survival Analysis
• Kaplan-Meier analysis measures the ratio of surviving subjects (or those without an event) divided by the total number of subjects at risk for the event. Every time a subject has an event, the ratio is recalculated. These ratios are then used to generate a curve to graphically depict the probability of survival.
• Cox proportional hazards analysis is similar to the logistic regression method described above, with the added advantage that it accounts for time to a binary event in the outcome variable. Thus, one can account for variation in follow-up time among subjects.
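A minimal sketch of the product-limit (Kaplan-Meier) calculation described above, using a small set of invented follow-up times (True = event observed, False = censored):

```python
# (time in months, event observed?) -- invented data for illustration
follow_up = [(2, True), (3, False), (5, True), (6, True), (6, False),
             (8, True), (10, False), (12, True)]

follow_up.sort(key=lambda record: record[0])

at_risk = len(follow_up)
survival = 1.0
print("time  at_risk  S(t)")
for time, event in follow_up:
    if event:
        survival *= (at_risk - 1) / at_risk   # recalculate the surviving proportion at each event
        print(f"{time:>4}  {at_risk:>7}  {survival:.3f}")
    at_risk -= 1                              # censored subjects leave the risk set without an event
```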
[Figure: Kaplan-Meier survival curves]
Standard Normal Distribution
Mean ± 1 SD encompasses 68% of observations
Mean ± 2 SD encompasses 95% of observations
Mean ± 3 SD encompasses 99.7% of observations
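These percentages can be checked directly from the standard normal CDF; a small sketch using scipy:

```python
from scipy.stats import norm

for k in (1, 2, 3):
    coverage = norm.cdf(k) - norm.cdf(-k)   # P(mean - k*SD < X < mean + k*SD)
    print(f"Mean +/- {k} SD: {coverage * 100:.1f}% of observations")
# prints approximately 68.3%, 95.4%, 99.7%
```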
[Figure: histograms of the same data with n = 130, w = 4 cm and with n = 1000, w = 1 cm, and the limiting case n → ∞, w → 0]
Distribution curve
In general, as the number of observations, n, approaches infinity and the width of the class interval approaches zero, we obtain a smooth curve such as the one shown in Figure 3. Such smooth curves are used to represent graphically the distribution of continuous random variables. We also call them distribution curves (or probability distribution curves).
Histogram (frequency distribution graph) → [n is large] frequency distribution curve → relative frequency distribution curve → [n → ∞] probability distribution curve
The probability distribution curve has some important consequences:
• The total area under the curve is equal to one;
• The relative frequency of occurrence of values between any two points on the X-axis is equal to the area under the curve between these two points;
• The probability of any specific value of the random variable is zero, because a specific value is represented by a point on the X-axis and the area above a point is zero.
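The second property can be illustrated with a hedged sketch that compares the exact area under the standard normal curve between two points (a CDF difference) with the relative frequency in a large simulated sample; the choice a = -0.5, b = 1.2 is arbitrary:

```python
import numpy as np
from scipy.stats import norm

a, b = -0.5, 1.2                     # arbitrary points on the X-axis
exact_area = norm.cdf(b) - norm.cdf(a)

rng = np.random.default_rng(0)
sample = rng.standard_normal(1_000_000)
relative_freq = np.mean((sample > a) & (sample < b))

print(f"Area under the curve between a and b: {exact_area:.4f}")
print(f"Relative frequency in a large sample: {relative_freq:.4f}")
# Note also that P(X == any single specific value) is zero for a continuous variable.
```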
The Normal Distribution
What is the normal distribution? The normal density is given by:

f(x) = 1 / (σ√(2π)) · e^(-(x - μ)² / (2σ²)),   -∞ < x < ∞

It is the most important distribution in all of statistics.
[Figure: a bell-shaped normal density curve with mean μ and standard deviation σ]
The normal distributions are symmetric, single-peaked, bell-shaped density curves.
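As a check on the density formula above, a short sketch that evaluates it directly and compares the result with scipy's implementation (μ = 0, σ = 1 and the x values are arbitrary choices):

```python
import math
from scipy.stats import norm

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density f(x) = 1/(sigma*sqrt(2*pi)) * exp(-(x - mu)**2 / (2*sigma**2))."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

for x in (-2.0, 0.0, 1.0):
    print(f"x = {x:+.1f}   by hand: {normal_pdf(x):.5f}   scipy: {norm.pdf(x):.5f}")
```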
Characteristics of the normal distribution:
• It is symmetrical about its mean, μ;
• The mean, the median, and the mode are all equal;
• The total area under the normal curve above the X-axis is one square unit.
Characteristics of the normal distribution:
The normal distribution is completely determined by the parameters μ and σ. In other words, a different normal distribution is specified for each different pair of values of μ and σ. This implies that the normal distribution is really a family of distributions in which one member is distinguished from another on the basis of the values of μ and σ.