Review of Statistics
Levels of Measurement
Descriptive and Inferential Statistics
Levels of Measurement
Nature of the variable affects the
rules applied to its measurement
Qualitative Data
Nominal
Ordinal
Quantitative Data
Interval
Ratio
Nominal Measurement
Lowest Level
Sorting into categories
Numbers merely symbols--have no
quantitative significance
Assign equivalence or nonequivalence
Examples: gender, marital status, etc.
Male/female, smoker/nonsmoker, alive/dead
(categories coded with arbitrary numbers, e.g., 1 and 2)
Rules of Nominal system
All members of one category are
assigned the same number
No two categories are assigned the
same number (mutual exclusivity)
Cannot treat the numbers
mathematically
Mode is the only appropriate measure
of central tendency
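A minimal sketch of why nominal codes cannot be treated mathematically. It assumes hypothetical data in which 1 stands for male and 2 for female; the coding and the sample are illustrative and do not come from the slides.

```python
from collections import Counter
from statistics import mean

# Hypothetical nominal codes: 1 = male, 2 = female (the numbers are arbitrary symbols)
codes = [1, 2, 2, 1, 2, 2, 1, 2]
labels = {1: "male", 2: "female"}

# Counting category membership is meaningful for nominal data
counts = Counter(labels[c] for c in codes)
modal_category = counts.most_common(1)[0][0]

# The mean of the codes can be computed, but it corresponds to no category
meaningless_mean = mean(codes)

print(counts)
print("mode:", modal_category)
print("mean of codes (not interpretable):", meaningless_mean)
```

Swapping the codes (1 = female, 2 = male) would change the mean but not the modal category, which is why only the mode is reported at this level.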
The Ordinal Scale
Sorting variations on the basis of their
relative standing to each other
Attributes ordered according to some
criterion (e.g. best to worst)
Intervals are not necessarily equal
Should not be treated arithmetically;
frequencies, modes, and medians are OK
Ordinal scale (figure: ranked scale points 0 to 4)
Interval Scale
Researcher can specify rank ordering of
variables and the distance between them
Intervals are equal but there is no rational
zero point (examples: IQ scale, Fahrenheit
scale)
Data can be treated mathematically,
most statistical tests are possible
Ratio Scale
Highest level of measurement
Rational meaningful zero point
Absolute magnitude of variable (e.g.,
mg/ml of glucose in urine)
Ideal for all statistical tests
Descriptive Statistics
Used to describe data
Frequency distributions, histograms,
polygons
Measures of Central Tendency
Dispersion
Position within a sample
Frequency Distributions
Imposing some order on a mass of
numerical data by a systematic
arrangement of numerical values from
lowest to highest with a count of the
number of times each value was
obtained--Most frequently represented
as a frequency polygon
Frequency distribution (figure: frequency polygon; vertical axis: Frequency, 0 to 30)
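The arrangement described above can be sketched in a few lines. The pain scores below are hypothetical and serve only to show the counting step.

```python
from collections import Counter

# Hypothetical pain scores (0-5 scale) from 12 patients
scores = [2, 3, 3, 1, 4, 3, 2, 5, 3, 2, 4, 3]

# Count how often each value occurs, then order the values from lowest to highest
freq = Counter(scores)
distribution = sorted(freq.items())

for value, count in distribution:
    # A row of asterisks gives a rough text version of a histogram
    print(f"{value:>2} | {'*' * count} ({count})")
```

Joining the tops of the bars with straight lines would give the frequency polygon mentioned on the slide.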
Shapes of distributions
Symmetry
Modality
Kurtosis
Symmetry
Normal curve symmetrical
If not symmetrical, the distribution is
skewed (peak is off center)
– positively skewed (tail extends toward high values)
– negatively skewed (tail extends toward low values)
Positive skew
Negative skew
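One common way to put a number on skew is the moment-based skewness coefficient. The sketch below assumes that definition (third central moment divided by the 1.5 power of the second) and uses made-up samples; other definitions exist and give slightly different values.

```python
from statistics import mean, median

def skewness(values):
    """Moment-based (population) skewness: m3 / m2**1.5."""
    m = mean(values)
    n = len(values)
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical samples
right_tailed = [1, 2, 2, 3, 3, 3, 4, 12]    # long tail toward high values
left_tailed = [-x for x in right_tailed]     # mirror image of the above
symmetric = [1, 2, 3, 4, 5]

print("right-tailed:", skewness(right_tailed))   # positive value
print("left-tailed:", skewness(left_tailed))     # negative value
print("symmetric:", skewness(symmetric))         # zero
print("mean vs median:", mean(right_tailed), median(right_tailed))
```

In the right-tailed sample the mean is pulled above the median by the single large value, which is the usual signature of positive skew.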
Modality
Describes how many peaks are in the
distribution
– unimodal
– bimodal
– multimodal
unimodal
bimodal
multimodal
Kurtosis
Peakedness of the distribution
– platykurtic (flat)
– mesokurtic (moderate, as in the normal curve)
– leptokurtic (sharply peaked)
Mesokurtic
Platykurtic
Leptokurtic
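Kurtosis is often summarised as excess kurtosis, which is about 0 for a normal curve, negative for platykurtic shapes, and positive for leptokurtic ones. The sketch below assumes the moment-based definition and uses hypothetical samples.

```python
from statistics import mean

def excess_kurtosis(values):
    """Moment-based (population) excess kurtosis: m4 / m2**2 - 3."""
    m = mean(values)
    n = len(values)
    m2 = sum((x - m) ** 2 for x in values) / n
    m4 = sum((x - m) ** 4 for x in values) / n
    return m4 / m2 ** 2 - 3

# Hypothetical samples
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]      # evenly spread, no peak
peaked = [5, 5, 5, 5, 5, 5, 5, 5, 0, 10]    # most scores at the centre, two extremes

print("flat:", excess_kurtosis(flat))       # negative: platykurtic
print("peaked:", excess_kurtosis(peaked))   # positive: leptokurtic
```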
Measures of Central Tendency
Overall summary of a group’s
characteristics
“What is the average level of pain
described by post-hysterectomy patients?”
“How much information does the typical
teen have about STDs?”
Mean
Arithmetic average
Most widely reported measure of central tendency
Not trustworthy on skewed distributions
Median
The point on a distribution above which
50% of observations fall
Shows how central the mean really is
since the median is the number which
divides the sample in half
Does not take into account the
quantitative values of individual scores
Preferred in a skewed distribution
Mode
The most frequently occurring score or
number value within a distribution
Not affected by extreme values
Shows where scores cluster
There may be more than one mode in a
distribution
Arrived at through inspection
Limited usefulness in computations
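The three measures can be compared on a small skewed sample. The lengths of stay below are hypothetical; the point is only how each measure reacts to one extreme value.

```python
from statistics import mean, median, multimode

# Hypothetical lengths of stay in days; the 30-day stay skews the distribution
stays = [2, 3, 3, 4, 4, 4, 5, 30]
without_extreme = stays[:-1]

print("mean:", mean(stays), "->", mean(without_extreme))
print("median:", median(stays), "->", median(without_extreme))
print("mode(s):", multimode(stays), "->", multimode(without_extreme))

# A distribution may have more than one mode
print("bimodal example:", multimode([1, 1, 2, 2, 3]))
```

Dropping the extreme value moves the mean noticeably while the median and mode stay put, which is why the median is preferred for skewed data.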
Which measures of central
tendency is represented by each
of these lines?
Variability or Dispersion
Measures
Percentile rank: the percentage of scores
that fall below a given score
Range: highest score minus lowest score
Standard deviation: master measure of
variability; roughly the average difference
of scores from the mean; allows one to
interpret a score as it relates to others in
the distribution
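A short sketch of these measures on hypothetical test scores. Two assumptions are worth flagging: the standard deviation is computed as the square root of the mean squared deviation (the population form), and percentile rank is taken as the percentage of scores strictly below the given score; textbooks vary on both conventions.

```python
from statistics import mean, pstdev

# Hypothetical test scores
scores = [55, 60, 65, 70, 75, 80, 85]

score_range = max(scores) - min(scores)
sd = pstdev(scores)  # population standard deviation

def percentile_rank(values, score):
    """Percentage of values strictly below the given score."""
    below = sum(1 for v in values if v < score)
    return 100 * below / len(values)

def z_score(values, score):
    """Distance of a score from the mean, in standard deviation units."""
    return (score - mean(values)) / pstdev(values)

print("range:", score_range)
print("standard deviation:", sd)
print("percentile rank of 80:", round(percentile_rank(scores, 80), 1))
print("z-score of 85:", z_score(scores, 85))
```

The z-score is what lets a single score be read relative to the rest of the distribution: a score of 85 here sits 1.5 standard deviations above the mean.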
Normal (Gaussian) Distribution
Mathematical ideal
– 68.3% of scores within ±1 SD
– 95.4% of scores within ±2 SD
– 99.7% of scores within ±3 SD
unimodal
mesokurtic
symmetrical
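Normal curve (figure: about 34% of scores lie in each band between the mean and 1 SD, about 13.5% in each band between 1 and 2 SD, and about 2.3% in each tail beyond 2 SD)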
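The percentages quoted for the normal curve can be checked with the standard library's `NormalDist` class (Python 3.8 or later). This is a quick verification sketch, not part of the original slides.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Share of scores within k standard deviations of the mean
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in within.items():
    print(f"within +/- {k} SD: {share:.1%}")

# Share in one band between 1 and 2 SD above the mean
band_1_to_2 = std_normal.cdf(2) - std_normal.cdf(1)
print(f"between 1 and 2 SD: {band_1_to_2:.1%}")
```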
Inferential Statistics
Used to make inferences about entire
population from data collected from a
sample
Two classifications based on their
underlying assumptions
Parametric
Nonparametric
Parametric
Based on population parameters
Have a number of assumptions
(requirements)
Level of measurement must be interval
or ratio
– t-test
– Pearson product moment correlation (r)
– ANOVA
– Multiple regression analysis
Parametric
Preferable because they are more
powerful--better able to detect a
significant result if one exists.
Nonparametric
Not as powerful
Have fewer assumptions
Level of measurement is nominal or
ordinal
– Chi squared
Some examples of statistical tests and their use

Statistical test                          Purpose                                                      IV                  DV
t-test (t)                                To test the difference between 2 group means                 Nominal             Interval or ratio
ANOVA (F)                                 To test the difference of means among 3 or more groups       Nominal             Interval or ratio
Pearson product moment correlation (r)    To test that a relationship exists                           Interval or ratio   Interval or ratio
Chi-squared test (X²)                     To test the differences in proportions in 2 or more groups,  Nominal             Nominal
                                          to determine if results are possibly due to chance
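As a rough illustration of two of the tests listed, the sketch below computes an independent-samples t statistic and a Pearson r by hand. The data are hypothetical, the t statistic assumes equal variances in the two groups (pooled variance), and the function names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Independent-samples t statistic (pooled variance) and its degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / standard_error, df

def pearson_r(x, y):
    """Pearson product moment correlation between paired observations."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical pain scores: nominal IV (method), interval DV (score)
method_a = [3, 4, 2, 5, 4]
method_b = [6, 5, 7, 6, 8]
t, df = pooled_t(method_a, method_b)
print("t =", round(t, 3), "df =", df)

# Hypothetical paired measurements with a perfectly linear relationship
hours = [1, 2, 3, 4, 5]
score = [2, 4, 6, 8, 10]
print("r =", pearson_r(hours, score))
```

Turning t into a p-value requires the t distribution, which the Python standard library does not provide, so that step is left out here.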
analysed with: Analyse-It + General v
Test: Chi-square test
Caffeine consumption of adults
Marital status by Caffeine consumption
Performed by Analyse-it Software, Ltd.
Date: 1 February 1999
n = 3888

Count (expected count in parentheses)

                                Caffeine consumption
Marital status                  0              1-150           151-300        >300           Total
Married                         652 (705.8)    1537 (1488.0)   598 (578.1)    242 (257.1)    3029
Divorced, separated, widowed    36 (32.9)      46 (69.3)       38 (26.9)      21 (12.0)      141
Single                          218 (167.3)    327 (352.7)     106 (137.0)    67 (60.9)      718
Total                           906            1910            742            330            3888

X² statistic: 51.66
p: <0.0001
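The expected counts and the X² statistic in the output above can be reproduced from the observed counts alone. The sketch below uses the standard formula (expected = row total × column total / n, and X² as the sum of (observed − expected)² / expected); it is a re-computation for illustration, not the Analyse-it implementation.

```python
# Observed counts from the output: rows = marital status, columns = caffeine consumption
observed = [
    [652, 1537, 598, 242],   # Married
    [36, 46, 38, 21],        # Divorced, separated, widowed
    [218, 327, 106, 67],     # Single
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Sum of (observed - expected)^2 / expected over all cells
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

print("n:", n)
print("expected count, Married / 0:", round(expected[0][0], 1))
print("chi-square:", round(chi_square, 2), "df:", df)
```

The p-value reported in the output comes from comparing this statistic with the chi-square distribution for the computed degrees of freedom, a step that needs a statistics library or a table.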
Hypothesis testing
Research Hypothesis (Hr): statement of
the researcher’s prediction
Alternate Hypothesis (Ha): competing
explanation of results
Null Hypothesis (H0): negative
statement of the hypothesis, tested by
statistical tests
Research Hypotheses
Method A is more effective than method
B in reducing pain (directional)
Method A will differ from Method B in
pain reducing effectiveness
(nondirectional)
Null Hypothesis
Method A equals Method B in pain
reduction effectiveness (any difference
is due to chance alone).
This must be statistically tested before one
can say that something other than chance is
creating any difference in results.
Type I and Type II errors
Type I--a decision to reject the null
hypothesis when it is true. The researcher
concludes that a relationship exists when
it does not.
Type II--a decision to accept (fail to reject)
the null hypothesis when it is false. The
researcher concludes no relationship
exists when it does.
Level of Significance
Degree of risk of making a Type I
error (saying a treatment works when it
doesn’t or that a relationship exists
when there is none)
Signifies the probability of obtaining results
at least this extreme by chance alone, that is,
if the null hypothesis were true.
p=.05 means that, if the null hypothesis were
true, results this extreme would occur about
5% of the time
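The link between the significance level and Type I errors can be illustrated with a small simulation. The sketch below is built on assumptions not found in the slides: it uses a two-sided z-test with known standard deviation, draws every sample from a population in which the null hypothesis is true, and fixes the random seed so the run is repeatable.

```python
import random
from statistics import NormalDist, mean

random.seed(1)          # fixed seed so the sketch is repeatable
ALPHA = 0.05            # level of significance
N_PER_SAMPLE = 30
N_EXPERIMENTS = 2000
std_normal = NormalDist()

def two_sided_p(sample, mu0=0.0, sigma=1.0):
    """Two-sided p-value of a z-test for the mean, with sigma assumed known."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - std_normal.cdf(abs(z)))

# Every sample has true mean 0, so the null hypothesis is true in every experiment
rejections = sum(
    two_sided_p([random.gauss(0, 1) for _ in range(N_PER_SAMPLE)]) < ALPHA
    for _ in range(N_EXPERIMENTS)
)
type_i_rate = rejections / N_EXPERIMENTS

print("share of true null hypotheses rejected:", type_i_rate)
```

Each rejection here is a Type I error, and their share should land near the chosen level of significance; lowering `ALPHA` reduces that share at the cost of making real effects harder to detect.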