Standard Deviation


Descriptive Statistics II:
By the end of this class you should be able to:
• describe the meaning of and calculate the mean and standard deviation of a sample
• estimate normal proportions based on mean and standard deviation
• plot histograms with alternative scaling
Palm: Section 7.1, 7.2
Please download cordbreak1.mat & FWtemperature.txt
Exercise
• Download FWTemperature.txt
• Read into MATLAB
• Prepare a single figure with two plots
– a histogram of March highs (row 2)
– a histogram of April highs (row 4)
• Label these plots fully
• Print out your commands and the resulting figure
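One possible way to approach the exercise is sketched below. It assumes the file is a plain numeric table that load can read, with one month per row as the exercise implies; the file layout, the variable names, and the axis label text are assumptions, and the fallback values are placeholders so the sketch runs without the file.

```matlab
% Sketch only: file layout is assumed, not confirmed by the slides.
fname = 'FWTemperature.txt';
if exist(fname, 'file')
    T = load(fname);              % numeric text file -> matrix
else
    T = 60 + 10*rand(4, 30);      % placeholder values, NOT real data
end
marchHighs = T(2, :);             % row 2: March highs (per the exercise)
aprilHighs = T(4, :);             % row 4: April highs (per the exercise)

figure
subplot(2, 1, 1)                  % top plot
hist(marchHighs)
title('March Daily High Temperatures')
xlabel('Temperature'), ylabel('Number of Days')

subplot(2, 1, 2)                  % bottom plot
hist(aprilHighs)
title('April Daily High Temperatures')
xlabel('Temperature'), ylabel('Number of Days')
```

Add the temperature units to the axis labels once you know what the data file uses.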
Review: Quantifying Variation
Mean (Central Tendency)
>> mean(x)
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
Standard Deviation (Spread)
>> std(x)
$s_x = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
• difference: deviation of each point about the mean
• squared: all values positive
• Summation: yields one number
• Divide by n-1: normalize the sum based on degrees of freedom
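To connect the formulas to the built-in functions, the following sketch computes the mean and sample standard deviation step by step and compares them with mean and std. The sample values in x are made up for illustration and are not course data.

```matlab
% Illustrative sample (hypothetical values, not from the course data)
x = [2 4 4 4 5 5 7 9];
n = numel(x);

xbar = sum(x) / n;                    % mean: sum of the values over n
dev  = x - xbar;                      % deviation of each point about the mean
sx   = sqrt(sum(dev.^2) / (n - 1));   % square, sum, divide by n-1, take the root

% Compare with the built-in functions
fprintf('mean: %.4f (formula) vs %.4f (mean)\n', xbar, mean(x));
fprintf('std:  %.4f (formula) vs %.4f (std)\n',  sx,   std(x));
```

Both pairs should agree, since std divides by n-1 by default.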
Statistic | Formula | MATLAB | EXCEL
Mean | $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ | >> mean(variable) | = average(range)
Sample Standard Deviation | $s_x = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$ | >> std(variable) | = stdev(range)
The Normal (Gaussian) Distribution
[Figure: normal probability density curve. x-axis: standard deviations from the mean (-4 to 4); y-axis: probability density (scaled frequency), 0 to 0.4. Labels mark the mode, the mean, and the (population) standard deviation.]
Note on Sample and Population Statistics
 | Mean | Standard Deviation
Sample (the estimate from a sample of the whole population) | $\bar{x}$ or $m$ | $s$
Population (the true value from the entire population) | $\mu$ | $\sigma$
As $n \to \infty$: $\bar{x} \to \mu$ and $s \to \sigma$
Expected Proportions for known σ
[Figure: normal probability density curve centered on the mean μ. x-axis: standard deviations from the mean (-4 to 4); y-axis: probability density (scaled frequency), 0 to 0.4.]
Percentage of observations in the given range:
• ±1σ: 68 %
• ±2σ: 95.5 %
• ±3σ: 99.7 %
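The percentages on this slide can be checked numerically. The sketch below uses the standard identity that, for a normal distribution, the fraction within ±k standard deviations of the mean is erf(k/sqrt(2)); it relies only on base MATLAB, and the variable names are illustrative.

```matlab
% Fraction of a normal population within +/- k standard deviations of the mean.
% For a normal distribution: P(|x - mu| < k*sigma) = erf(k / sqrt(2)).
k = 1:3;
within = erf(k / sqrt(2));

for i = 1:numel(k)
    fprintf('within +/-%d sigma: %.2f %%\n', k(i), 100 * within(i));
end
```

The printed values should land close to the rounded 68 %, 95.5 %, and 99.7 % figures quoted above.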
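Expected Proportions for known σ
[Figure: normal probability density curve with the central ±1σ region (68 %) and one tail (16 %) marked. x-axis: standard deviations from the mean (-4 to 4); y-axis: probability density (scaled frequency), 0 to 0.4.]
Because the curve is symmetric, each tail beyond ±1σ holds
$\frac{100 - 68}{2} = 16\%$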
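The tail calculation can be reproduced in a couple of lines. This is a minimal sketch using base MATLAB's erf; the variable names are illustrative.

```matlab
within1 = erf(1 / sqrt(2));     % fraction within +/- 1 sigma (about 0.68)
tail    = (1 - within1) / 2;    % symmetry: half of the remainder in each tail

fprintf('one tail beyond 1 sigma: %.1f %%\n', 100 * tail);
```

The same halving step works for any symmetric range, for example the tails beyond ±2σ.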
Proportions Problem
Data analysis of the breaking strength of a certain
fabric shows that it is normally distributed with a
mean of 200 lb and a variance (σ²) of 9.
• Estimate the percentage of fabric samples that will
have a breaking strength between 197 lb and 203 lb.
• Estimate the percentage of fabric samples that will
have a breaking strength no less than 194 lb.
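One way to check an answer to this problem is sketched below. A variance of 9 gives a standard deviation of 3 lb, so 197 to 203 lb is the mean ±1 standard deviation and 194 lb is 2 standard deviations below the mean. The helper Phi is a hypothetical name for the normal cumulative probability, written with base MATLAB's erf so no toolbox is assumed.

```matlab
mu    = 200;        % mean breaking strength, lb
sigma = sqrt(9);    % standard deviation from the variance of 9

% Normal cumulative probability P(X <= x), written with base-MATLAB erf
Phi = @(x) 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))));

pBetween = Phi(203) - Phi(197);   % between 197 lb and 203 lb (mu +/- 1 sigma)
pAtLeast = 1 - Phi(194);          % no less than 194 lb (above mu - 2 sigma)

fprintf('between 197 and 203 lb: %.1f %%\n', 100 * pBetween);
fprintf('no less than 194 lb:    %.1f %%\n', 100 * pAtLeast);
```

By hand, the second answer follows from the rule of thumb: half of the 95.5 % band lies between 194 lb and the mean, plus the 50 % above the mean.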
Cord Breaking Distribution with Normal Curve
$p(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-(x - \mu)^2 / (2\sigma^2)}$
[Figure: histogram of cord breaking force with the normal curve overlaid. x-axis: Breaking Force (n), 145 to 365; y-axis: Scaled Frequency, 0 to 9 x 10^-3.]
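The normal density formula can be written as a one-line MATLAB function and drawn on its own. In the sketch below, mu and sigma are placeholder values chosen only for illustration; with the course data you would presumably use the mean and std of the cord measurements instead.

```matlab
% Placeholder parameters (hypothetical; replace with mean(x) and std(x)
% of the measured data)
mu    = 250;
sigma = 40;

% Normal probability density p(x)
p = @(x) exp(-(x - mu).^2 ./ (2 * sigma^2)) ./ (sigma * sqrt(2 * pi));

xg   = linspace(mu - 5*sigma, mu + 5*sigma, 1001);   % grid for plotting
area = trapz(xg, p(xg));                             % should be close to 1

plot(xg, p(xg))
xlabel('Breaking Force'), ylabel('Scaled Frequency')
fprintf('area under the curve: %.4f\n', area);
```

Because the area under the curve is 1, it can be compared directly with a scaled-frequency histogram, whose bars also have a total area of 1.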
Review: Types of Histograms

Absolute Frequency: absolute count in each bin
Formula: = z
Use: for a quick picture
Matlab:
>> hist(x, n)

Relative Frequency: fraction of total count in each bin
Formula: z / sum(z)
Use: compare samples when total counts differ
Matlab:
>> [z,c] = hist(x)
>> zr = z/sum(z)
>> bar(c, zr)

Scaled Frequency: fraction of total area in each bin
Formula: z / (sum(z) * bin width)
Use: compare samples when bin sizes differ
Matlab:
>> % b = vector of bin centers, w = bin width
>> [z,c] = hist(x,b)
>> zs = z/(sum(z)*w)
>> bar(c, zs)
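The three scalings can be compared side by side. The sketch below uses a small made-up sample and equally spaced bin centers, both of which are assumptions for illustration. Note that hist returns the counts as its first output and the bin centers as its second.

```matlab
% Placeholder sample (hypothetical values, not the course data)
x = [150 172 181 195 203 204 210 218 226 231 240 262];
b = 145:20:265;            % bin centers, equally spaced (assumed)
w = b(2) - b(1);           % bin width

[z, c] = hist(x, b);       % z = counts per bin, c = bin centers
zr = z / sum(z);           % relative frequency: bar heights sum to 1
zs = z / (sum(z) * w);     % scaled frequency: total bar area is 1

subplot(3, 1, 1), bar(c, z),  ylabel('Absolute')
subplot(3, 1, 2), bar(c, zr), ylabel('Relative')
subplot(3, 1, 3), bar(c, zs), ylabel('Scaled'), xlabel('x')
```

The three plots have the same shape; only the vertical axis changes, which is what makes relative and scaled frequencies useful for comparing samples.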
Additional Example (not covered in class)
Looking at two sets of data
• Look at a histogram of the second set of data, ‘cord2’
• How would you compare it to ‘cord’, the first set of data?
• What problems do you run into?