Descriptive Statistics Powerpoint
Sampling
‘Scientific sampling’ is random sampling
Simple random samples
Systematic random samples
Stratified random samples
Random cluster samples
What?
Why?
How?
What is random sampling?
Simple random sample – Every sample with
the same number of observations has the
same probability of being chosen
Systematic random sample – Choose the first sample member
randomly, then take every k-th member after it
Stratified random sample – Choose simple
random samples from the mutually exclusive
strata of a population
Cluster sample – Choose a simple random
sample of groups or clusters
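The four designs above can be sketched in a few lines of Python. This is only an illustration: the function names, the population of 100 numbered units, and the strata and cluster groupings are all made up here, and the cluster version assumes every member of a chosen cluster is kept.

```python
import random

def simple_random_sample(population, n, rng):
    # Every subset of size n has the same chance of being drawn.
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    # Random start within the first interval, then every k-th member.
    k = len(population) // n
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    # strata: dict of stratum name -> members; strata do not overlap.
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    # Draw whole clusters at random (assumption: keep all their members).
    chosen = rng.sample(list(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

rng = random.Random(0)                      # fixed seed so runs repeat
population = list(range(1, 101))            # hypothetical units 1..100

srs = simple_random_sample(population, 10, rng)
sys_s = systematic_sample(population, 10, rng)
strata = {"low": population[:50], "high": population[50:]}
strat = stratified_sample(strata, 5, rng)
clusters = {f"c{i}": population[i * 10:(i + 1) * 10] for i in range(10)}
clus = cluster_sample(clusters, 2, rng)
```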
Why sample randomly?
To make valid statistical inferences to a
population
Conclusions from a non-probability
sample can be questioned
Conclusions from a self-selected sample
are SLOP
How can samples be randomly
chosen?
Random number generators (software)
Ping pong balls in a hopper
Other mechanical devices
Random number tables
Slips of paper in a ‘hat’
With or without replacement
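As a rough software version of the hopper of ping pong balls, the standard `random` module can draw with or without replacement. The 40 numbered balls and the sample size of 6 are assumptions for the example.

```python
import random

rng = random.Random(42)            # seeded generator, reproducible draws
balls = list(range(1, 41))         # hypothetical hopper of 40 numbered balls

# Without replacement: a ball that is drawn cannot be drawn again.
without_replacement = rng.sample(balls, 6)

# With replacement: each ball goes back, so repeats are possible.
with_replacement = rng.choices(balls, k=6)

print(without_replacement)
print(with_replacement)
```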
Descriptive Statistics –
Graphic Guidelines
Pie charts – categorical variables, nominal data, e.g. ‘religion’
Bar charts – categorical or numerical variables, nominal or
interval data, e.g. ‘religion’ or ‘margin debt’; time series or
cross sectional data
Line graphs – numerical variables, interval data, e.g. margin
debt; time series data
Histograms – numerical variables, interval data, e.g. golf
scores; cross sectional data – depict the SHAPE of a
frequency distribution
Stem and leaf plot – quick and dirty histogram
Ogive – depicts a cumulative percentage frequency
distribution
Scattergram – two quantitative variables, e.g. margin debt vs. the
market value
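The numbers behind a histogram, an ogive, and a stem and leaf plot can be produced without any plotting library. The sketch below uses made-up golf scores and an assumed bin width of 5; it is meant only to show what each display summarizes.

```python
from collections import Counter, defaultdict

# Made-up golf scores, for illustration only.
scores = [68, 70, 71, 71, 72, 72, 72, 73, 74, 74, 75, 77, 79, 81, 84]

# Histogram: count the scores falling in each bin of width 5.
bin_width = 5
counts = Counter((s // bin_width) * bin_width for s in scores)

# Ogive: cumulative percentage at the upper edge of each bin.
cumulative = []
running = 0
for lower in sorted(counts):
    running += counts[lower]
    cumulative.append((lower + bin_width, 100 * running / len(scores)))

# Stem and leaf plot: tens digit is the stem, units digit is the leaf.
stems = defaultdict(list)
for s in sorted(scores):
    stems[s // 10].append(s % 10)

for lower in sorted(counts):
    print(f"{lower}-{lower + bin_width - 1}: {'*' * counts[lower]}")
for stem in sorted(stems):
    print(stem, "|", " ".join(str(leaf) for leaf in stems[stem]))
```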
Graphic Deception – some widely
used methods
Graphs without a scale on one axis
Captions or titles intended to influence
Reporting only absolute changes in value and
not percentage changes
Changing the scale of the vertical axis with
breaks or truncations
Changing the scale of the horizontal axis
Changing the width as well as the height of
bars or pictogram figures
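One of the methods above, reporting only absolute changes, is easy to check with arithmetic. In this small sketch the two pairs of values are invented; both rise by the same absolute amount but by very different percentages.

```python
def pct_change(old, new):
    # Percentage change relative to the starting value.
    return 100 * (new - old) / old

small_base = (10, 20)      # hypothetical values
large_base = (200, 210)    # hypothetical values

for old, new in (small_base, large_base):
    print(f"{old} -> {new}: absolute {new - old}, percent {pct_change(old, new)}%")
```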
Summary of data types and
available graphic techniques
Numeric, cross-sectional data
Histograms
Percentage histograms
Ogives
Stem and leaf plots
Box plots
Numeric, time-series data
Line charts
Bar charts
Nominal
Pie charts
Bar charts
Complex pie or bar charts
Describing the frequency distribution
for numerical, cross sectional data
Shape
Center
Spread
Describing distributions
SHAPE
Graphs
Histograms
Percentage histograms
Ogives
Stem and leaf plots
Box plots
Words
Symmetric, skewed, bell shaped, flat, peaked
Descriptive Statistics –
CENTER
Quantitative measures
Mean
Median
Mode
Mid-point of the range
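The four measures of center listed above can be computed with Python's standard `statistics` module. The data values are made up, with one large value included to show how the mean and the mid-point of the range react to it while the median and mode do not.

```python
import statistics

data = [2, 3, 3, 4, 5, 7, 25]    # made-up values with one large observation

mean = statistics.mean(data)               # sum divided by count
median = statistics.median(data)           # middle value of the sorted data
mode = statistics.mode(data)               # most frequent value
midrange = (min(data) + max(data)) / 2     # mid-point of the range

print(mean, median, mode, midrange)
```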
Descriptive Statistics –
Numeric Measures – cont’d.
SPREAD (dispersion)
Range
Symmetric distributions
Standard deviation
Variance
Skewed distributions
Quartiles
Min
Max
Interquartile range
Percentiles
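A short sketch of the spread measures above, again with the standard `statistics` module and made-up data. Note that software packages use slightly different conventions for quartiles, so the quartile values from `statistics.quantiles` may differ a little from a textbook's hand method.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]    # made-up values

data_range = max(data) - min(data)

# Population versions divide by n; sample versions divide by n - 1.
pop_variance = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)
sample_variance = statistics.variance(data)
sample_sd = statistics.stdev(data)

# Quartiles split the sorted data into four parts (default method).
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(data_range, pop_variance, pop_sd, sample_variance, sample_sd)
print(min(data), q1, q2, q3, max(data), iqr)
```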
Z Scores and t-scores
Measures distance from the mean in standard
deviations
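E.g. a T score for bone density that is 1 to 2.5
standard deviations below the norm (mean)
for a 23 year old indicates osteopenia; 2.5 or
more below indicates osteoporosis
z score = (X − μ)/σ, using the population mean μ and standard deviation σ
t score = (X − X̄)/s, using the sample mean X̄ and standard deviation s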
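Both formulas above are the same calculation: a distance from the mean divided by a standard deviation. The sketch below is illustrative only. The function names and numbers are invented, and the handling of the exact cut-off values (−1 and −2.5) is an assumption, since the slide gives only the ranges.

```python
def standard_score(x, mean, sd):
    # Distance of x from the mean, measured in standard deviations.
    return (x - mean) / sd

def classify_bone_density(t_score):
    # Thresholds follow the slide; treatment of the exact boundaries
    # is an assumption made for this sketch.
    if t_score <= -2.5:
        return "osteoporosis"
    if t_score <= -1.0:
        return "osteopenia"
    return "normal"

# Hypothetical numbers: a value of 130 where the mean is 100 and sd is 15.
print(standard_score(130, 100, 15))

for t in (0.2, -1.8, -3.0):
    print(t, classify_bone_density(t))
```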
Empirical Rule
For mound shaped distributions
About 68% of observations are within one
standard deviation of the mean
About 95% of observations are within two
standard deviations of the mean
Almost all (99.7%) observations are within
three standard deviations of the mean
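The rule can be checked roughly by simulation. This sketch draws 50,000 values from a normal (mound shaped) distribution with a fixed seed and counts the share within one, two, and three standard deviations of the mean. The sample size and seed are arbitrary choices, and the shares will be close to, not exactly, 68%, 95%, and 99.7%.

```python
import random
import statistics

rng = random.Random(1)                                # arbitrary fixed seed
data = [rng.gauss(0, 1) for _ in range(50_000)]       # simulated normal data

mean = statistics.fmean(data)
sd = statistics.pstdev(data)

def share_within(k):
    # Fraction of observations within k standard deviations of the mean.
    return sum(abs(x - mean) <= k * sd for x in data) / len(data)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```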
Chebyshev’s Rule
For all distributions
Let k be greater than or equal to 1
At least 1 − 1/k² of the observations are within k
standard deviations of the mean
Examples
k = 1: the bound is 0, so possibly no observations are within one
standard deviation of the mean
k = 2: at least 3/4 of observations must be within two
standard deviations of the mean
k = 3: at least 8/9 of observations must be within three
standard deviations of the mean
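The bound and the three examples can be reproduced in a few lines. As a check, the sketch also compares the bound with simulated data that is strongly skewed; the exponential distribution, the sample size, and the seed are arbitrary choices for illustration.

```python
import random
import statistics

def chebyshev_bound(k):
    # Minimum share of observations within k standard deviations (k >= 1).
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1 - 1 / k**2

rng = random.Random(7)                                      # arbitrary seed
data = [rng.expovariate(1.0) for _ in range(10_000)]        # skewed data

mean = statistics.fmean(data)
sd = statistics.pstdev(data)

def share_within(k):
    # Observed fraction within k standard deviations of the mean.
    return sum(abs(x - mean) <= k * sd for x in data) / len(data)

for k in (1, 2, 3):
    print(k, round(chebyshev_bound(k), 4), round(share_within(k), 4))
```

The observed shares are usually well above the bound, because the rule has to hold for every possible distribution.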