Revision(UG1432)


Exam duration: 2 hours. Total marks: 40.
The exam has two parts (Part I and Part II).
Part I has 20 questions. Answer any 15 questions.
Each question carries 2 marks. Total: 30 marks.
Part II has 15 questions. Answer any 10 questions.
Each question carries 1 mark. Total: 10 marks.
No MCQs. You must write out the answers.
No major calculations. No need to memorize the formulas.
Bring your own calculator. Cell phones may not be used as calculators.
Study Designs
Levels of measurements (Type of data)
Sampling Distribution of Means and proportions
Normal Distribution
Hypothesis testing & Z-test
Student’s t-test
Chi-square test, McNemar’s Chi-square test
Confidence Intervals
How to write a research paper?
-----
( 40 marks)
QUALITATIVE DATA (Categorical data)
DISCRETE QUANTITATIVE
CONTINUOUS QUANTITATIVE

Nominal: qualitative classification of equal value: gender, race, color, city
Ordinal: qualitative classification which can be rank ordered: socioeconomic status of families
Interval: numerical or quantitative data that can be rank ordered and whose sizes can be compared: temperature
Ratio: quantitative interval data with a true zero, so that ratios are meaningful: time, age

Standard error of the mean is calculated by:
$$s_{\bar{x}} = \text{sem} = \frac{s}{\sqrt{n}}$$

The standard deviation (s) describes variability between
individuals in a sample.

The standard error describes variation of a sample statistic.


The standard deviation describes how individuals differ.
The standard error of the mean describes the precision with which we can make inferences about the true mean.
Standard error of the mean (sem):
$$s_{\bar{x}} = \text{sem} = \frac{s}{\sqrt{n}}$$
Comments:
n = sample size
even for large s, if n is large, we can get good precision for sem
always smaller than the standard deviation (s) when n > 1
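A minimal sketch of the sem calculation, using only the Python standard library. The function name and the hemoglobin values are invented for illustration; they are not from the course material.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """Return s / sqrt(n), with s the sample standard deviation."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample SD, uses the n - 1 denominator
    return s / math.sqrt(n)

# Hypothetical hemoglobin values (g/dL), invented for illustration only.
hemoglobin = [12.1, 13.4, 14.2, 11.8, 13.0, 12.7, 14.5, 13.3]
s = statistics.stdev(hemoglobin)
sem = standard_error_of_mean(hemoglobin)
print(f"s = {s:.3f}, sem = {sem:.3f}")

# For a fixed s (here 2.0), the sem shrinks as n grows.
for n in (10, 100, 1000):
    print(n, round(2.0 / math.sqrt(n), 3))
```

The loop illustrates the comment about precision: with the same variability between individuals, a larger sample gives a smaller standard error.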
The standard deviation of the sampling distribution of a proportion:
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$$
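The same idea for a proportion can be sketched as below. The function name and the example values (p = 0.5, n = 100 and n = 400) are assumptions chosen for illustration.

```python
import math

def se_of_proportion(p, n):
    """Standard deviation of the sampling distribution of a proportion."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical example: true proportion 0.5, samples of size 100 and 400.
se_100 = se_of_proportion(0.5, 100)
se_400 = se_of_proportion(0.5, 400)
print(se_100, se_400)
```

Quadrupling the sample size halves the standard error, because n sits under a square root.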
Two Steps in the Statistical Inference Process
1. Calculate “confidence intervals” from the sample mean and sample standard deviation, within which we can place the unknown population mean with some degree of probabilistic confidence.
2. Compute a “test of statistical significance” (risk statement), which is designed to assess how probable a result like the one observed in the sample would be if the null hypothesis about the true but unknown population mean were correct.
Many biologic variables follow this pattern (the normal distribution): hemoglobin, cholesterol, serum electrolytes, blood pressures, age, weight, height.
One can use this information to define what is normal and what is extreme.
In clinical medicine, 95% or 2 standard deviations around the mean is normal.
Clinically, 5% of “normal” individuals are labeled as extreme/abnormal.
We just accept this and move on.
Symmetrical about the mean, μ
Mean, median, and mode are equal
Total area under the curve above the x-axis is one square unit
1 standard deviation on both sides of the mean includes approximately 68% of the total area
2 standard deviations include approximately 95%
3 standard deviations include approximately 99.7%
The normal distribution is completely determined by the parameters μ and σ
Different values of μ shift the distribution along the x-axis
Different values of σ determine the degree of flatness or peakedness of the graph
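The areas within 1, 2, and 3 standard deviations of the mean can be checked numerically. This is a small sketch using `statistics.NormalDist` from the Python standard library; the helper name `area_within` is made up for this example.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the standard normal curve between -k and +k SD."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {area_within(k):.4f}")
```

Because any normal distribution is fixed by μ and σ, the same areas apply after converting values to standard deviation units.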
Measures of Position
z score
Sample: $z = \frac{x - \bar{x}}{s}$
Population: $z = \frac{x - \mu}{\sigma}$
Round to 2 decimal places
The z score makes it possible, under some circumstances, to compare scores that originally had different units of measurement.
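As an illustration of comparing measurements with different units, the sketch below converts a height and a weight to z scores. All numbers (means, standard deviations, and the individual's values) are hypothetical.

```python
def z_score(x, mean, sd):
    """z = (x - mean) / sd, rounded to 2 decimal places."""
    return round((x - mean) / sd, 2)

# Hypothetical reference values, invented for illustration only.
z_height = z_score(185, mean=170, sd=10)  # height in cm
z_weight = z_score(95, mean=70, sd=15)    # weight in kg
print(z_height, z_weight)
```

Under these assumed values the weight lies further from its mean, in standard deviation units, than the height does, even though centimetres and kilograms cannot be compared directly.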
Interpreting Z Scores
(Figure: z-score axis from -3 to 3. Values with z between -2 and 2 are ordinary values; values with z below -2 or above 2 are unusual values.)
Hypothesis
Null hypothesis
‘The mean sodium concentrations in the two populations are equal.’
Alternative hypothesis
Logical alternative to the null hypothesis
‘The mean sodium concentrations in the two populations are different.’
A hypothesis should be simple, specific, and stated in advance.
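One-tail test
$H_0: \mu = \mu_0$
$H_a: \mu > \mu_0$ or $\mu < \mu_0$
Alternative Hypothesis: Mean systolic BP of Nephrology patients is significantly higher (or lower) than the mean systolic BP of normal patients.
(Figure: distribution curve with a rejection region of area 0.05 in one tail; axis ticks from 100 to 140.)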
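Two-tail test
$H_0: \mu = \mu_0$
$H_a: \mu \neq \mu_0$
Alternative Hypothesis: Mean systolic BP of Nephrology patients is significantly different from the mean systolic BP of normal patients.
(Figure: distribution curve with rejection regions of area 0.025 in each tail; axis ticks from 100 to 140.)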
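The difference between a one-tail and a two-tail test can be sketched with a z-test for a single mean. This assumes the population standard deviation is known; the function names and all numbers are hypothetical. The one-tail value below also assumes that the observed difference is in the direction stated by the alternative hypothesis.

```python
import math
from statistics import NormalDist

def z_statistic(sample_mean, mu0, sigma, n):
    """z for a single mean when the population SD (sigma) is known."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

def p_value_from_z(z, two_tailed=True):
    """Tail area of the standard normal beyond |z|."""
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * upper_tail if two_tailed else upper_tail

# Hypothetical systolic BP figures, invented for illustration only.
z = z_statistic(sample_mean=126, mu0=120, sigma=20, n=64)
print(f"z = {z:.2f}")
print(f"one-tail p = {p_value_from_z(z, two_tailed=False):.4f}")
print(f"two-tail p = {p_value_from_z(z):.4f}")
```

For the same z, the two-tail p-value is twice the one-tail p-value, because the rejection area is split between both tails.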
Every decision-making process can commit two types of error.
“We may conclude that the difference is significant when in fact there is no real difference in the population, and so reject the null hypothesis when it is true. This error is known as type-I error, whose magnitude is denoted by the Greek letter ‘α’. On the other hand, we may conclude that the difference is not significant when in fact there is a real difference between the populations; that is, the null hypothesis is not rejected when it is actually false. This error is called type-II error, whose magnitude is denoted by ‘β’.”
Test result versus disease status (gold standard):

                  Disease present        Disease absent        Total
Test positive     a (correct)            b (false positive)    a+b
Test negative     c (false negative)     d (correct)           c+d
Total             a+c                    b+d                   a+b+c+d
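The false-positive and false-negative cells can be turned into rates. The sketch below assumes the usual layout in which a = test positive with disease present, b = test positive with disease absent, c = test negative with disease present, and d = test negative with disease absent. The counts are hypothetical.

```python
def error_rates(a, b, c, d):
    """Return (false-positive rate, false-negative rate) for a 2 x 2 table.

    Assumed layout: a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    false_positive_rate = b / (b + d)  # among people without the disease
    false_negative_rate = c / (a + c)  # among people with the disease
    return false_positive_rate, false_negative_rate

# Hypothetical counts, invented for illustration only.
fp_rate, fn_rate = error_rates(a=90, b=5, c=10, d=95)
print(fp_rate, fn_rate)
```

The false-positive rate plays the role of a type-I error and the false-negative rate that of a type-II error in the diagnostic-test analogy.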
This level of uncertainty is called the type 1 error or false-positive rate (α).
The corresponding quantity computed from the data is more commonly reported as a p-value.
In general, p ≤ 0.05 is the agreed upon level.
In other words, if the null hypothesis were true, a difference at least as large as the one we observed in our sample would occur by chance less than 5% of the time.
Therefore we can reject the Ho.
Stating the Conclusions of our Results
When the p-value is small, we reject the null hypothesis or, equivalently, we accept the alternative hypothesis.
“Small” is defined as a p-value ≤ α, where α = acceptable false (+) rate (usually 0.05).
When the p-value is not small, we conclude that we cannot reject the null hypothesis or, equivalently, there is not enough evidence to reject the null hypothesis.
“Not small” is defined as a p-value > α, where α = acceptable false (+) rate (usually 0.05).
P-value
A standard device for reporting quantitative results in research where variability plays a large role.
Measures the dissimilarity between two or more sets of measurements or between one set of measurements and a standard.
“The probability of obtaining the study results by chance if the null hypothesis is true”
“The probability of obtaining a value as extreme as, or more extreme than, the observed value (study results)”
“The p-value is actually a probability, normally the probability of getting a result as extreme as or more extreme than the one observed if the dissimilarity is entirely due to variability of measurements or patients’ response, or to sum up, due to chance alone.”
Small p value: the rare event has occurred
Large p value: a likely event
(Figure: standard normal curve on the Z axis. Area = .025 lies in each tail beyond Z = ±1.96; area = .005 lies in each tail beyond Z = ±2.575.)
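The cut-off values 1.96 and 2.575 are the points of the standard normal curve that leave areas of 0.025 and 0.005 in each tail. A short sketch with the standard library recovers them; the function name is an assumption of this example.

```python
from statistics import NormalDist

def two_tailed_critical_z(alpha):
    """z value leaving alpha / 2 in each tail of the standard normal."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: critical z = {two_tailed_critical_z(alpha):.3f}")
```

A test statistic beyond the critical value for a chosen alpha corresponds to a two-tail p-value smaller than that alpha.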
1. Test for single mean
Is the sample mean equal to the predefined population mean?
2. Test for difference in means
Is the CD4 level of patients taking treatment A equal to the CD4 level of patients taking treatment B?
3. Test for paired observations
Did the treatment confer any significant benefit?
t is a measure of:
How difficult is it to believe the null hypothesis?
High t
Difficult to believe the null hypothesis; accept that there is a real difference.
Low t
Easy to believe the null hypothesis; no difference has been proved.
Student’s t-test will be used:
--- When the sample size is small, for mean values, and in the following situations:
(1) to compare a single sample mean with the population mean
(2) to compare the sample means of two independent samples
(3) to compare the sample means of paired samples
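A minimal sketch of the t statistic for situations (1) and (3), using only the standard library. The function names and the data are hypothetical. Converting t to a p-value needs the t distribution with the stated degrees of freedom (a t table or a statistics package), which is not shown here.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, degrees of freedom) for a single sample mean versus mu0."""
    n = len(sample)
    sem = statistics.stdev(sample) / math.sqrt(n)
    t = (statistics.mean(sample) - mu0) / sem
    return t, n - 1

def paired_t(before, after):
    """Paired t-test: a one-sample t-test on the differences against 0."""
    diffs = [a - b for a, b in zip(after, before)]
    return one_sample_t(diffs, 0.0)

# Hypothetical paired measurements, invented for illustration only.
before = [10, 12, 14]
after = [11, 14, 17]
t_paired, df_paired = paired_t(before, after)
print(f"t = {t_paired:.3f}, df = {df_paired}")
```

The paired test works on the within-pair differences, which is why it reduces to the single-mean case.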

BACKGROUND AND NEED OF THE TEST
Data collected in the field of medicine is often qualitative.
--- For example, the presence or absence of a symptom, classification of pregnancy as ‘high risk’ or ‘non-high risk’, the degree of severity of a disease (mild, moderate, severe)
The measure computed in each instance is a proportion, corresponding to the mean in the case of quantitative data such as height, weight, BMI, serum cholesterol.
We often need to compare two or more proportions, and the test of significance employed for such purposes is called the “Chi-square test”.
McNemar’s test
Situation:
Two paired binary variables that form a
particular type of 2 x 2 table
e.g. matched case-control study or
cross-over trial
When both the study variables and outcome variables are categorical (qualitative), apply:
(i) Chi-square test
(ii) Fisher’s exact test (small samples)
(iii) McNemar’s test (for paired samples)
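The chi-square and McNemar statistics for a 2 x 2 table can be sketched as follows. This is an illustration under stated assumptions, not the course's worked method: no continuity correction is applied, the counts are hypothetical, and in `mcnemar` the arguments b and c mean the two discordant-pair counts. The p-value helper is valid only for 1 degree of freedom, where the chi-square variable is the square of a standard normal.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2

def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

def mcnemar(b, c):
    """McNemar statistic from the two discordant-pair counts (uncorrected)."""
    return (b - c) ** 2 / (b + c)

# Hypothetical counts, invented for illustration only.
chi2 = chi_square_2x2(20, 10, 10, 20)
print(f"chi-square = {chi2:.3f}, p = {p_value_df1(chi2):.4f}")
print(f"McNemar chi-square = {mcnemar(15, 5):.3f}")
```

McNemar's test uses only the pairs whose two members disagree, which is what makes it suitable for matched or cross-over designs.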
Z-test:
Study variable: Qualitative
Outcome variable: Quantitative or Qualitative
Comparison: two means or two proportions
Sample size: each group is > 50
Student’s t-test:
Study variable: Qualitative
Outcome variable: Quantitative
Comparison: sample mean with population mean;
two means (independent samples); paired
samples.
Sample size: each group < 50 (can also be used for large sample sizes)
Chi-square test:
Study variable: Qualitative
Outcome variable: Qualitative
Comparison: two or more proportions
Sample size: > 20
Expected frequency: > 5
Fisher’s exact test:
Study variable: Qualitative
Outcome variable: Qualitative
Comparison: two proportions
Sample size: < 20
McNemar’s test: (for paired samples)
Study variable: Qualitative
Outcome variable: Qualitative
Comparison: two proportions
Sample size: Any
Degrees of freedom
1. Number of observations that are free to vary after the sample statistic has been calculated
2. Example
Sum of 3 numbers is 6
X1 = 1 (or any number)
X2 = 2 (or any number)
X3 = 3 (cannot vary)
Sum = 6
Degrees of freedom = n - 1 = 3 - 1 = 2
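The degrees-of-freedom example can be written as a few lines of code: once the sum is fixed, the first n - 1 values may be anything, and the last value is forced. The function name is made up for this sketch.

```python
def forced_last_value(total, free_values):
    """Given a fixed total and n - 1 freely chosen values, return the nth."""
    return total - sum(free_values)

free_values = [1, 2]                 # X1 and X2 can be any numbers
x3 = forced_last_value(6, free_values)
degrees_of_freedom = len(free_values)  # n - 1 = 3 - 1
print(x3, degrees_of_freedom)
```

Choosing different free values, for example 0 and 10, still leaves exactly one possible value for the last number.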
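(Figure: a sample (S) is drawn from the population (P) by sampling; investigation of the sample gives results, and inference from those results back to the population uses the P value and confidence intervals.)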
Two forms of estimation
Point estimation = single value, e.g., x-bar is an unbiased estimator of μ
Interval estimation = range of values → confidence interval (CI). A confidence interval consists of:
Estimation Process
(Figure: the population mean, μ, is unknown. A random sample is drawn and gives a sample mean X̄ = 50. Conclusion: “I am 95% confident that μ is between 40 & 60.”)
“We are 95% sure that the TRUE parameter value is in the 95% confidence interval”
“If we repeated the experiment many many times, 95% of the time the TRUE parameter value would be in the interval”
“The probability that the interval would contain the true parameter value was 0.95.”
CI 90% corresponds to p = 0.10
CI 95% corresponds to p = 0.05
CI 99% corresponds to p = 0.01
Note:
p value → only for analytical studies
CI → for descriptive and analytical studies
RR = 5.6 (95% CI = 1.2 ; 23.7)
OR = 12.8 (95% CI = 3.6 ; 44.2)
NNT = 12 (95% CI = 9 ; 26)
If p value < 0.05, then the 95% CI:
excludes 0 (for a difference), because if A = B then A - B = 0 → p > 0.05
excludes 1 (for a ratio), because if A = B then A/B = 1 → p > 0.05
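The link between the p value and the 95% CI for a difference can be sketched with a large-sample z comparison of two means. The function name and all input numbers are hypothetical, and the z-based formula is an assumption suited to large groups.

```python
import math
from statistics import NormalDist

def compare_means(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (two-tail p value, 95% CI) for the difference A - B (z-based)."""
    diff = mean_a - mean_b
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    p = 2 * (1 - NormalDist().cdf(abs(diff / se)))
    z_crit = NormalDist().inv_cdf(0.975)  # about 1.96
    return p, (diff - z_crit * se, diff + z_crit * se)

# Hypothetical group summaries, invented for illustration only.
p_big, ci_big = compare_means(130, 15, 100, 120, 15, 100)
p_small, ci_small = compare_means(121, 15, 100, 120, 15, 100)
print(f"p = {p_big:.4f}, 95% CI = ({ci_big[0]:.2f}, {ci_big[1]:.2f})")
print(f"p = {p_small:.4f}, 95% CI = ({ci_small[0]:.2f}, {ci_small[1]:.2f})")
```

In the first comparison p is below 0.05 and the interval lies entirely above 0; in the second p is above 0.05 and the interval contains 0.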
--The (im)precision of the estimate is indicated by the width of the confidence interval.
--The wider the interval, the less the precision.
THE WIDTH OF C.I. DEPENDS ON:
---- SAMPLE SIZE
---- VARIABILITY
---- DEGREE OF CONFIDENCE
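The three influences on the width of a confidence interval can be seen in a small sketch. It uses a large-sample, z-based interval for a mean, which is an assumption of this example; the function names and numbers are hypothetical.

```python
import math
from statistics import NormalDist

def ci_for_mean(mean, sd, n, confidence=0.95):
    """Large-sample (z-based) confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

def width(interval):
    low, high = interval
    return high - low

base = ci_for_mean(mean=50, sd=10, n=100)
print("baseline 95% CI:", base)
print("smaller sample:", ci_for_mean(50, 10, 25))
print("more variability:", ci_for_mean(50, 20, 100))
print("99% confidence:", ci_for_mean(50, 10, 100, confidence=0.99))
```

A smaller sample, greater variability, or a higher degree of confidence each widen the interval relative to the baseline.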
p values (hypothesis testing) give you the probability of obtaining a result like the one observed by chance alone; they do not give the magnitude and direction of the difference.
Confidence intervals (estimation) indicate an estimate of the value in the population given one result in the sample; they give the magnitude and direction of the difference.
How to write a research paper
Abstract
Introduction
Methods
Results
Discussion
References
Wishing all of you the Best of Luck!
