Descriptive Studies in Epi


Common Statistical Tests




Descriptive statistics (common in all types of
studies; the first step in reporting findings)
Continuous variables (e.g., pain VAS, age,
cholesterol): t-test, ANOVA, Pearson
correlation, linear regression
Categorical variables, nominal or ordinal (e.g., gender,
death, categorical scales): chi-square test,
relative risks, proportions, Mantel-Haenszel,
Spearman correlation (ordinal data), logistic regression
*Most of these tests assume random sampling or random
group assignment, an assumption that is frequently violated.
Descriptive Statistics

Measures of central tendency
Mean, median, mode

Measures of variability
Standard deviation, standard error, confidence
intervals, range of scores

Frequency distribution
How many people in each level of the variable

Proportions
Proportion (%) of sample at each level
Often also referred to as frequency distribution
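
As a minimal sketch of a frequency distribution and the matching proportions, the snippet below counts how many people fall in each level of a variable and converts the counts to percentages. The age-group data are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical age groups for 10 respondents (illustrative data only).
age_groups = ["20-29", "30-39", "30-39", "40-49", "40-49",
              "40-49", "50-59", "50-59", "60-69", "30-39"]

counts = Counter(age_groups)   # frequency distribution: n at each level
n = len(age_groups)

# Proportion (%) of the sample at each level.
proportions = {level: 100 * k / n for level, k in counts.items()}

for level in sorted(counts):
    print(f"{level}: n={counts[level]}, {proportions[level]:.0f}%")
```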
Central Tendency

Mean
Mathematical average
Used when distribution is normal

Median
50th percentile: ½ of scores below, ½ above
Used when distribution is skewed

Mode
Score with the highest frequency
Seldom reported
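
To illustrate why the median is preferred for skewed data, here is a small sketch using Python's standard `statistics` module. The length-of-stay values are hypothetical; one extreme value pulls the mean upward while the median barely moves.

```python
import statistics

# Hypothetical right-skewed data (length of hospital stay, in days).
stays = [2, 3, 3, 3, 4, 4, 5, 6, 30]

mean_stay = statistics.mean(stays)      # pulled upward by the extreme value (30)
median_stay = statistics.median(stays)  # 50th percentile, robust to the extreme value
mode_stay = statistics.mode(stays)      # most frequent score

print(f"mean={mean_stay:.2f}, median={median_stay}, mode={mode_stay}")
```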
Measures of Variability

Standard deviation
Variability of scores around the mean in your sample
(spread of scores in your sample)
E.g., with a mean of 100 and an S.D. of 10, and scores that
are normally distributed, about 68% of scores are between
90 and 110, and about 95% of scores are within
2 standard deviations of the mean (80 to 120)

Standard error
Measure of how precisely the sample mean estimates
the true population mean (S.D. divided by the square
root of the sample size)
Often used incorrectly in presentation of results:
the standard error is smaller than the standard deviation,
so reporting it in place of the S.D. makes the
data look less variable
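
The relationship between the two measures can be sketched as follows. The scores are hypothetical and were chosen so that the mean is 100 and the sample S.D. is 10; the standard error is then the S.D. divided by the square root of the sample size.

```python
import math
import statistics

# Hypothetical scores chosen so that mean = 100 and sample S.D. = 10.
scores = [85, 90, 95, 100, 100, 105, 110, 115]

n = len(scores)
sd = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)
se = sd / math.sqrt(n)         # standard error of the mean

print(f"mean={statistics.mean(scores)}, SD={sd:.2f}, SE={se:.2f}")
```

Because the standard error shrinks as the sample grows while the S.D. does not, the two are not interchangeable as descriptions of spread.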
Measures of Variability

Range of scores
Lowest to highest score observed

Confidence intervals
Range of values we are fairly confident will
include the true value we are interested in
E.g., mean = 100, 95% CI 85-105: if we drew
100 samples and calculated a 95% CI from each,
about 95 of those intervals would include
the true value
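
The repeated-sampling meaning of a 95% confidence interval can be checked with a small simulation. This is a sketch under stated assumptions: the population values (mean 100, S.D. 10), the sample size, and the helper name `ci95` are all illustrative, and the interval uses the normal approximation (mean plus or minus 1.96 standard errors) rather than the t distribution.

```python
import math
import random
import statistics

def ci95(sample):
    """Approximate 95% CI for the mean: mean +/- 1.96 * SE (normal approximation)."""
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return m - 1.96 * se, m + 1.96 * se

random.seed(0)                        # fixed seed so the run is reproducible
TRUE_MEAN, TRUE_SD, N = 100, 10, 50   # assumed population values and sample size
REPS = 2000                           # number of repeated samples

covered = 0
for _ in range(REPS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    low, high = ci95(sample)
    covered += low <= TRUE_MEAN <= high   # True counts as 1

coverage = covered / REPS
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should be close to 0.95, which is what the "95%" refers to: the long-run behavior of the procedure, not the spread of individual scores.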
Why Statistical Tests?

Error due to sampling
The concept of H0: the hypothesis of equality (or of no association)
How likely are the observed results to have arisen by chance? P value
Rejecting H0 when it is true => type I error (conventionally set at 0.05)
Accepting H0 when it is false => type II error (conventionally set at 0.2)
Frequency Distribution
[Bar chart: number of responders (vertical axis, 0 to 45) by age group: 20-29, 30-39, 40-49, 50-59, and 60-69 years]
Statistical Tests

1. Detecting a difference:
Comparing mean blood pressure
Comparing the sex distribution across different fields of study

2. Detecting an association:
Determining the association between personality type and field of study
Determining the association between Chlamydia infection and IHD
Statistical Tests for Comparison

Quantitative variable (mean)
Two groups, independent: Independent t test
Two groups, paired: Paired t test
Three or more groups, independent: ANOVA
Three or more groups, paired: Repeated measures ANOVA

Qualitative variable (percentage)
Two groups, independent: Chi-square
Two groups, paired: McNemar
Three or more groups, independent: Chi-square
Three or more groups, paired: Cochran's Q
Statistical Analysis

Student's t-test
Tests for differences between group means
Requires continuous data; assumes normal
distribution in each group and random
sampling
Considers variability within groups
Two forms: t-test for independent samples and
t-test for dependent (paired) samples
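
The two forms of the t-test can be sketched in plain Python as below. The function names and the pain VAS scores are hypothetical; the independent version assumes equal variances (pooled variance), and both functions return only the t statistic and degrees of freedom, leaving the p-value to a t table or a statistics package.

```python
import math
import statistics

def independent_t(x, y):
    """Student's t for two independent samples (pooled variance); returns (t, df)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x) +
                  (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    se_diff = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return (statistics.mean(x) - statistics.mean(y)) / se_diff, nx + ny - 2

def paired_t(before, after):
    """t for dependent (paired) samples, computed on within-pair differences."""
    diffs = [b - a for b, a in zip(before, after)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se, len(diffs) - 1

# Hypothetical pain VAS scores (illustrative data only).
treated = [3, 4, 2, 5, 3, 4]
control = [6, 5, 7, 6, 8, 5]

# Case 1: two separate groups of patients.
t_ind, df_ind = independent_t(treated, control)

# Case 2: the same six patients measured before and after treatment.
t_pair, df_pair = paired_t(before=control, after=treated)

print(f"independent: t={t_ind:.2f}, df={df_ind}")
print(f"paired:      t={t_pair:.2f}, df={df_pair}")
```

Both statistics divide a mean difference by a standard error built from the within-group (or within-pair) variability, which is the sense in which the test "considers variability within groups".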
Statistical Analysis

Analysis of Variance (ANOVA)
Similar in concept to the t-test
Used when there are more than two groups
E.g., experimental group, placebo group,
alternative medication group
Requires continuous variables, normal
distribution in each group, random
sampling
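
A one-way ANOVA compares the variability between group means with the variability within groups. The sketch below computes the F statistic by hand; the function name and the three small groups are hypothetical, and the p-value lookup is left to an F table or a statistics package.

```python
import statistics

def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    k, n_total = len(groups), len(all_values)

    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: scores around their own group mean.
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical outcome scores for three groups (illustrative data only).
experimental = [4, 5, 6]
placebo = [7, 8, 9]
alternative = [5, 6, 7]

f_stat, df_b, df_w = one_way_anova_f(experimental, placebo, alternative)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```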
Statistical Analysis

Chi-Square
Tests for differences between proportions; discrete data
Simplest case: 2 x 2 table
Compares the observed counts with the counts expected
if there were no association

Mantel-Haenszel
Extension of chi-square
Way of calculating adjusted odds ratios for
stratified data
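
The Mantel-Haenszel adjusted odds ratio pools stratum-specific 2 x 2 tables, weighting each by its size. The sketch below uses hypothetical counts in which exposure appears harmful in the crude table but shows no effect within either stratum; the function names and the tuple layout `(a, b, c, d)` (exposed with disease, exposed without, unexposed with, unexposed without) are assumptions made for illustration.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a single 2 x 2 table: (a * d) / (b * c)."""
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled OR: sum(a*d/n) / sum(b*c/n) over strata."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Hypothetical strata (e.g., two age bands); each tuple is (a, b, c, d).
strata = [(40, 10, 20, 5),
          (5, 20, 10, 40)]

# Collapsing the strata gives the crude table.
pooled = [sum(col) for col in zip(*strata)]

crude_or = odds_ratio(*pooled)
adjusted_or = mantel_haenszel_or(strata)
print(f"crude OR = {crude_or:.2f}, Mantel-Haenszel OR = {adjusted_or:.2f}")
```

In this constructed example the crude odds ratio is 2.25 while the stratified estimate is 1.00, which is the pattern expected when the stratifying variable is a confounder.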
Chi Square
             Depressed         Not Depressed     Total
Smoker       89 (33%)   [a]    179 (67%)  [b]    268   [a+b]
Nonsmoker    131 (17%)  [c]    647 (83%)  [d]    778   [c+d]
Total        220        [a+c]  826        [b+d]  1046  [T (total)]
Chi Square
             Depressed   Not Depressed   Total
Smoker       a           b               a+b
Nonsmoker    c           d               c+d
Total        a+c         b+d             T = a+b+c+d
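
Using the cell labels a, b, c, d and the slide's smoker-by-depression counts (89, 179, 131, 647), the chi-square statistic for a 2 x 2 table can be sketched with the shortcut formula T(ad - bc)^2 / [(a+b)(c+d)(a+c)(b+d)]. The function name is hypothetical, no continuity correction is applied, and the p-value uses the fact that a chi-square variable with 1 degree of freedom is a squared standard normal.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for a 2 x 2 table; df = 1."""
    t = a + b + c + d
    chi2 = t * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability for 1 df: P(|Z| > sqrt(chi2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Counts from the table: smokers (89 depressed, 179 not),
# nonsmokers (131 depressed, 647 not).
chi2, p = chi_square_2x2(89, 179, 131, 647)
print(f"chi-square = {chi2:.2f}, p = {p:.2g}")
```

A large statistic with a small p-value indicates that the proportion depressed differs between smokers (33%) and nonsmokers (17%) by more than sampling error alone would readily explain.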
Statistical Tests for Detecting an Association
Correlation
Regression
Correlation Coefficients


Possible values from -1 to +1
-1 = perfect negative correlation
As exposure increases, disease (health
condition) decreases
0 = no relationship or no linear
relationship
+1 = perfect positive correlation
As exposure increases, disease increases
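
The three reference values can be reproduced with a short sketch of the Pearson correlation coefficient. The function name and the data are hypothetical; the last example shows a strong but curved relationship for which the coefficient is still 0, which is why 0 means "no linear relationship" rather than "no relationship".

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

exposure = [1, 2, 3, 4, 5]                      # hypothetical exposure levels

r_pos = pearson_r(exposure, [2, 4, 6, 8, 10])   # disease rises with exposure
r_neg = pearson_r(exposure, [10, 8, 6, 4, 2])   # disease falls with exposure
r_curve = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])  # U-shaped relationship

print(f"r_pos={r_pos:.2f}, r_neg={r_neg:.2f}, r_curve={r_curve:.2f}")
```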
Other Statistics

Logistic Regression
Produces odds ratios (cohort, case-control, cross-sectional studies)
Odds ratio: the odds that an exposed person develops the
disease divided by the odds that a non-exposed person
develops the disease
Crude OR (taking only exposure and outcome
into consideration)
Adjusted OR (odds ratio taking the other
factors/confounders in the model into consideration)
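
A crude odds ratio can be computed directly from a 2 x 2 table, as sketched below with hypothetical counts. The function name is an assumption, and the 95% confidence interval uses the common log-based (Woolf) approximation. An adjusted odds ratio would instead come from a fitted logistic regression model, which is not shown here.

```python
import math

def crude_odds_ratio(a, b, c, d):
    """Crude OR with an approximate 95% CI (Woolf's log method).

    a = exposed with disease,    b = exposed without disease,
    c = unexposed with disease,  d = unexposed without disease.
    """
    or_ = (a / b) / (c / d)                       # odds in exposed / odds in unexposed
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(or_) - 1.96 * se_log)
    high = math.exp(math.log(or_) + 1.96 * se_log)
    return or_, low, high

# Hypothetical counts (illustrative data only).
or_, ci_low, ci_high = crude_odds_ratio(a=20, b=80, c=10, d=90)
print(f"OR = {or_:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```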
Other Statistics

Linear regression
When outcome is continuous
Closely related to correlation
Can adjust for other factors/confounders in the
model
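
For a single predictor, the least-squares line can be sketched as below. The function name and the age and cholesterol values are hypothetical. Adjusting for confounders requires a multiple regression with several predictors, which is normally fitted with a statistics package rather than by hand.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y)) /
             sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical data: age (years) and cholesterol (mg/dL).
age = [30, 40, 50, 60, 70]
cholesterol = [180, 195, 205, 220, 230]

slope, intercept = fit_line(age, cholesterol)
print(f"cholesterol = {intercept:.1f} + {slope:.2f} * age")
```

The slope is read as the expected change in the continuous outcome per one-unit increase in the predictor.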
Cox Proportional Hazards
When outcome is time to an event
E.g., time to death, recovery, onset of symptoms
A type of regression model