Descriptive Studies in Epi
Common Statistical Tests
Descriptive statistics (common in all types of
studies – first step in reporting findings)
Continuous variables: T-test, ANOVA, Pearson
correlation, linear regression (e.g., pain VAS,
age, cholesterol)
Categorical, nominal: Chi-square test,
relative risks, proportions, Mantel-Haenszel,
Spearman correlation (ordinal scales), logistic regression
(e.g., gender, death, categorical scales)
*Most assume random sampling or random
group assignment – frequently violated.
Descriptive Statistics
Measures of central tendency
Mean, median, mode
Measures of variability
Standard deviation, standard error, confidence
intervals, range of scores
Frequency distribution
How many people in each level of the variable
Proportions
Proportion (%) of sample at each level
Often also referred to as frequency distribution
Central Tendency
Mean
Mathematical average
Used when distribution is normal
Median
50th percentile: half of scores below, half above
Used when distribution is skewed
Mode
Score with the highest frequency
Seldom reported
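A minimal sketch of the three measures, using Python's standard `statistics` module. The scores are hypothetical and were chosen with one extreme value to show why the median is preferred for skewed data.

```python
import statistics

# Hypothetical scores; the single large value skews the distribution.
scores = [1, 2, 2, 3, 4, 5, 40]

mean_score = statistics.mean(scores)      # mathematical average
median_score = statistics.median(scores)  # 50th percentile
mode_score = statistics.mode(scores)      # score with the highest frequency

print(f"mean={mean_score:.2f} median={median_score} mode={mode_score}")
```

The extreme score pulls the mean well above the median, while the median stays near the bulk of the scores.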
Measures of Variability
Standard deviation
Variability of scores around mean in your sample
(spread of scores in your sample)
E.g., with a mean of 100 and S.D. of 10, about 68% of
scores fall between 90 and 110 and about 95% of scores
fall within 2 standard deviations of the mean (80 to 120),
assuming a normal distribution
Standard error
Measure of how precisely the sample mean estimates
the true population mean (S.D. divided by the square
root of the sample size)
Often used incorrectly in presentation of results
Standard error is smaller than standard deviation, so
reporting it in place of the S.D. makes data look less variable
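The difference between the two measures can be sketched as follows. The scores are hypothetical, and the standard error is taken as the sample standard deviation divided by the square root of the sample size.

```python
import math
import statistics

# Hypothetical sample of scores.
scores = [90, 100, 110, 95, 105, 100]

sd = statistics.stdev(scores)         # spread of scores in the sample
se = sd / math.sqrt(len(scores))      # precision of the sample mean

print(f"S.D.={sd:.2f} S.E.={se:.2f}")
```

Because the standard error shrinks as the sample grows, it describes the precision of the mean, not the variability of individual scores.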
Measures of Variability
Range of scores
Range of scores observed
Confidence intervals
Range of values we are fairly confident will
include the true value we are interested in
Mean=100, 95% CI 85-105: if we drew 100
samples and computed a 95% confidence
interval from each, about 95 of those
intervals would include the true value
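A rough sketch of how a 95% confidence interval for a mean can be computed. It assumes a normal approximation (z of about 1.96); for small samples a t-based interval would be wider. The function name and the scores are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(values, level=0.95):
    """Normal-approximation confidence interval for a sample mean."""
    m = mean(values)
    se = stdev(values) / math.sqrt(len(values))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level=0.95
    return m - z * se, m + z * se

# Hypothetical sample of scores.
scores = [92, 105, 98, 110, 101, 95, 99, 104]
low, high = mean_confidence_interval(scores)
print(f"mean={mean(scores):.1f}, 95% CI {low:.1f} to {high:.1f}")
```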
Why Statistical Tests?
Sampling error
Concept of H0 => hypothesis of equality (or no association)
How likely is it that the results obtained are due to
chance? => P value
Wrongly rejecting H0 => type I error = 0.05
Wrongly accepting H0 => type II error = 0.2
Frequency Distribution
[Bar chart: responders by age group (20-29, 30-39, 40-49, 50-59, 60-69 years); vertical axis from 0 to 45]
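Statistical Tests
1. Detecting a difference:
Comparing mean blood pressure
Comparing the sex distribution across different fields of study
2. Detecting an association:
Determining the association between personality type and field of study
Determining the association between Chlamydia infection and IHD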
Statistical Tests for Comparison
Quantitative variable (mean)
Two groups: independent, Independent t test; paired, Paired t test
Three or more groups: independent, ANOVA; paired, Repeated measures
Qualitative variable (percentage)
Two groups: independent, Chi-square; paired, McNemar
Three or more groups: independent, Chi-square; paired, Cochran
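The selection scheme above can be sketched as a simple lookup. The function name, keys, and argument names are illustrative only; the test names are the ones listed on the slide.

```python
# Lookup keyed by (variable type, number of groups, design).
TESTS = {
    ("quantitative", "two", "independent"): "Independent t test",
    ("quantitative", "two", "paired"): "Paired t test",
    ("quantitative", "three_or_more", "independent"): "ANOVA",
    ("quantitative", "three_or_more", "paired"): "Repeated measures",
    ("qualitative", "two", "independent"): "Chi-square",
    ("qualitative", "two", "paired"): "McNemar",
    ("qualitative", "three_or_more", "independent"): "Chi-square",
    ("qualitative", "three_or_more", "paired"): "Cochran",
}

def choose_test(variable, n_groups, paired):
    """Return the comparison test for a variable type, group count, and design."""
    if n_groups < 2:
        raise ValueError("a comparison needs at least two groups")
    groups = "two" if n_groups == 2 else "three_or_more"
    design = "paired" if paired else "independent"
    return TESTS[(variable, groups, design)]

print(choose_test("quantitative", 3, paired=False))
```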
Statistical Analysis
Student’s T-Test
Tests for a difference between two group
means
Requires continuous data, assumes normal
distribution in each group, random
sampling
Considers variability within groups
T-test for independent samples, t-test for
dependent samples
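As an illustration of how the independent-samples test weighs the difference in means against variability within groups, here is a minimal sketch of the pooled-variance t statistic. It assumes equal variances in the two groups and stops short of the p-value, which would need the t distribution. The function name and data are hypothetical.

```python
import math
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Student's t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: within-group variability, weighted by group size.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / se_diff, df

# Hypothetical scores for two independent groups.
t_stat, df = pooled_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(f"t={t_stat:.2f}, df={df}")
```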
Statistical Analysis
Analysis of Variance
Similar in concept to t-test
Used when more than two groups
E.g., experimental group, placebo group,
alternative medication group
Requires continuous variables, normal
distribution in each group, random
sampling
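A minimal sketch of the one-way ANOVA F statistic, which compares variability between group means with variability within groups. The function name and the three groups are hypothetical, and the p-value (from the F distribution) is left out.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic with between-group and within-group degrees of freedom."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical scores: experimental, placebo, alternative medication.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f"F={f_stat:.2f} on {df_b} and {df_w} degrees of freedom")
```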
Statistical Analysis
Chi-Square
Differences between proportions, discrete data
2 X 2 table
Considers variability within groups
Mantel-Haenszel
Extension of Chi-square
Way of calculating adjusted odds ratios for
stratified data
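A sketch of the Mantel-Haenszel pooled odds ratio for stratified 2 x 2 tables. Each stratum is given as (a, b, c, d), where a = exposed with the outcome, b = exposed without, c = unexposed with, d = unexposed without. The function name and the strata counts are hypothetical.

```python
def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata, each an (a, b, c, d) 2 x 2 table."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Hypothetical strata (e.g., two age bands), each with a stratum-specific OR of 4.
strata = [(10, 20, 5, 40), (20, 40, 10, 80)]
print(f"Mantel-Haenszel OR = {mantel_haenszel_or(strata):.2f}")
```

When every stratum has the same odds ratio, the pooled estimate equals that common value.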
Chi Square
             Depressed        Not Depressed    Total
Smoker       89 (33%)   a     179 (67%)  b     268   a+b
Nonsmoker    131 (17%)  c     647 (83%)  d     778   c+d
Total        220        a+c   826        b+d   1046  T (total)
Chi Square
             Depressed   Not Depressed   Total
Smoker       a           b               a+b
Nonsmoker    c           d               c+d
Total        a+c         b+d             T = a+b+c+d
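Using the cell labels a, b, c, d, the chi-square statistic for a 2 x 2 table can be sketched with the shortcut formula T(ad - bc)^2 / ((a+b)(c+d)(a+c)(b+d)). This sketch applies no continuity correction and uses the smoking and depression counts from the table (89, 179, 131, 647); the function name is illustrative.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Chi-square statistic (1 degree of freedom) and p-value for a 2 x 2 table."""
    total = a + b + c + d
    chi2 = total * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

chi2, p_value = chi_square_2x2(89, 179, 131, 647)
print(f"chi-square={chi2:.2f}, p={p_value:.2g}")
```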
Statistical Tests for Detecting an Association
Correlation
Regression
Correlation Coefficients
Possible values from –1 to +1
-1 = perfect negative correlation
As exposure increases, disease (health
condition) decreases
0 = no relationship or no linear
relationship
+1 = perfect positive correlation
As exposure increases, disease increases
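A minimal sketch of the Pearson correlation coefficient, showing the two extremes of the scale. The function name and the exposure and disease values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / math.sqrt(ss_x * ss_y)

exposure = [1, 2, 3, 4]
r_positive = pearson_r(exposure, [2, 4, 6, 8])  # disease rises with exposure
r_negative = pearson_r(exposure, [8, 6, 4, 2])  # disease falls with exposure
print(r_positive, r_negative)
```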
Other Statistics
Logistic Regression
Odds ratios (cohort, case-control, cross-sectional studies)
Odds that an exposed person develops the
disease divided by the odds that a non-exposed person
develops the disease
Crude OR (just taking exposure and outcome
into consideration)
Adjusted OR (odds taking all other
factors/confounders into consideration)
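A sketch of a crude odds ratio from a 2 x 2 table, with an approximate 95% confidence interval on the log scale (Woolf's method, chosen here for illustration). The counts are the smoking and depression figures from the chi-square example (a=89, b=179, c=131, d=647); an adjusted OR would instead come from a logistic regression model, which is not shown.

```python
import math

def crude_odds_ratio(a, b, c, d, z=1.96):
    """Crude odds ratio (a*d)/(b*c) with an approximate 95% confidence interval."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

odds_ratio, lower, upper = crude_odds_ratio(89, 179, 131, 647)
print(f"OR={odds_ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```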
Other Statistics
Linear regression
When outcome is continuous
Closely related to correlation: estimates how much the outcome changes per unit change in a predictor
Can adjust for other factors/confounders in the
model
Cox Proportional Hazards
When outcome is time to an event
Time to death, recovery, onset of symptoms
Regression model
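For the linear regression case, a minimal least-squares sketch with a single predictor. Adjusting for confounders would require a multivariable model, and Cox regression is not covered here. The function name and data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: the outcome rises by 2 units per unit of the predictor.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(f"outcome = {intercept:.1f} + {slope:.1f} * predictor")
```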