Statistical Analysis - Descriptive Statistics
Systems Engineering Program
Department of Engineering Management, Information and Systems
EMIS 7370/5370 STAT 5340 :
PROBABILITY AND STATISTICS FOR SCIENTISTS AND ENGINEERS
Statistical Analysis – Descriptive Statistics
Dr. Jerrell T. Stracener, SAE Fellow
Leadership in Engineering
• Basic Concepts
• Analysis of Location, or Central Tendency
• Analysis of Variability
• Analysis of Shape
Population vs. Sample
Population
the total of all possible values (measurements,
counts, etc.) of a particular characteristic for a
specific group of objects.
Sample
a part of a population selected according to some
rule or plan.
Why sample?
- Population does not exist
- Sampling and testing is destructive
Sampling
Characteristics that distinguish one type of sample
from another:
• the manner in which the sample was obtained
• the purpose for which the sample was obtained
Types of Samples
• Simple Random Sample
The sample X1, X2, ... ,Xn is a random sample if
X1, X2, ... , Xn are independent identically
distributed random variables.
Remark: Each value in the population has an
equal and independent chance of being included
in the sample.
• Stratified Random Sample
The population is first subdivided into
sub-populations, or strata, and a simple random
sample is drawn from each stratum.
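The two sampling schemes above can be sketched in Python as follows. This is a minimal illustration, not a prescribed procedure: the population of part IDs, the lot labels, and the function names are all hypothetical. It samples without replacement from a finite list, which only approximates the i.i.d. definition when the population is large relative to the sample.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct items; every item is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(list(population), n)


def stratified_random_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of fixed size from each stratum.

    `strata` maps a stratum label to the list of items in that stratum.
    """
    rng = random.Random(seed)
    return {label: rng.sample(list(items), n_per_stratum)
            for label, items in strata.items()}


# Hypothetical population: 30 part IDs split across three production lots.
strata = {
    "lot_A": list(range(0, 10)),
    "lot_B": list(range(10, 20)),
    "lot_C": list(range(20, 30)),
}
population = [item for items in strata.values() for item in items]

srs = simple_random_sample(population, 6, seed=1)
strat = stratified_random_sample(strata, 2, seed=1)
print("simple random sample:", srs)
print("stratified sample:   ", strat)
```

The stratified version guarantees that every lot is represented, which a simple random sample of the same total size does not.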
Types of Samples - Continued
• Censored Samples
Type I Censoring - Sampling is terminated at a
fixed time, t0. The sample consists of k times to
failure plus the information that n-k items
survived the fixed time of truncation.
Type II Censoring - Sampling is terminated
upon the kth failure. The sample consists of k
times to failure, plus the information that n-k items
survived the random time of truncation, tk.
Progressive Censoring - Sampling is reduced in
stages.
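To make the difference between Type I and Type II censoring concrete, the sketch below applies both rules to one made-up set of failure times. The data, the cutoff t0 = 200 hours, and the function names are assumptions for illustration only. In a real test the later failure times would never be observed; they are listed here only so that both rules can be applied to the same items.

```python
def type_i_censor(failure_times, t0):
    """Type I: stop at fixed time t0.

    Returns the observed failure times and the number of survivors.
    """
    observed = sorted(t for t in failure_times if t <= t0)
    return observed, len(failure_times) - len(observed)


def type_ii_censor(failure_times, k):
    """Type II: stop at the k-th failure.

    Returns the k failure times, the number of survivors, and the
    (random) stopping time t_k.
    """
    ordered = sorted(failure_times)
    observed = ordered[:k]
    return observed, len(ordered) - k, observed[-1]


# Hypothetical times to failure, in hours, for n = 8 items.
times = [120.0, 35.0, 410.0, 88.0, 260.0, 540.0, 150.0, 305.0]

obs1, surv1 = type_i_censor(times, t0=200.0)
obs2, surv2, t_k = type_ii_censor(times, k=5)
print("Type I :", obs1, "survivors:", surv1)
print("Type II:", obs2, "survivors:", surv2, "t_k:", t_k)
```

Under Type I the number of failures is random and the stopping time is fixed; under Type II it is the other way around.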
Types of Samples - Continued
• Systematic Random Sample
The N items in the population are arranged in
some order.
Select an item at random from the first K = N/n
items, where n is the sample size.
Select every Kth item thereafter.
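The three steps above can be sketched as a short Python function. The item list and function name are hypothetical, and the sketch assumes that when N/n is not a whole number the interval K is rounded down, which the slide does not specify.

```python
import random


def systematic_sample(items, n, seed=None):
    """Pick a random start among the first K items, then every K-th item."""
    k = len(items) // n          # sampling interval K = N/n (rounded down)
    if k < 1:
        raise ValueError("sample size n must not exceed population size N")
    rng = random.Random(seed)
    start = rng.randrange(k)     # random position within the first K items
    return [items[start + j * k] for j in range(n)]


# Hypothetical ordered population of N = 20 serial numbers, sample size n = 5.
items = list(range(100, 120))
sample = systematic_sample(items, 5, seed=0)
print(sample)
```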
Statistical Analysis Objective
• Data represents the entire population
Statistical analysis is primarily descriptive.
• Data represents sample from population
Statistical analysis
- describes the sample
- provides information about the population
Analysis of Location or Central Tendency
• Sample (Arithmetic) Mean
• Sample Midrange
• Sample Mode
• Sample Median
• Sample Percentiles
Sample Mean
• Formula:
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
• Remarks:
Most frequently used statistic
Easy to understand
May be misleading due to extreme values
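A small numeric sketch of the remark about extreme values, using made-up numbers: adding a single outlier to five values near 10 moves the sample mean far from where most of the data lie.

```python
def sample_mean(xs):
    """Arithmetic mean: the sum of the values divided by the sample size."""
    return sum(xs) / len(xs)


# Hypothetical measurements clustered around 10.
values = [10.0, 11.0, 9.0, 10.0, 10.0]
with_outlier = values + [100.0]

mean_clean = sample_mean(values)          # 50 / 5
mean_outlier = sample_mean(with_outlier)  # 150 / 6
print(mean_clean, mean_outlier)
```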
Sample Mode
• Definition:
Most frequently occurring value in the sample
• Remarks:
A sample may have more than one mode
The mode may not be a central value
Not well understood, nor frequently used
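The first two remarks can be checked with Python's standard `statistics.multimode`, which returns every most-frequent value. The two small samples are invented for illustration.

```python
import statistics

# Hypothetical sample with two modes (3 and 5 each occur twice).
bimodal = [2, 3, 3, 5, 5, 9]
modes_bimodal = statistics.multimode(bimodal)

# Hypothetical sample whose mode sits at the low end, not in the center.
lopsided = [1, 1, 1, 5, 6, 7, 8]
modes_lopsided = statistics.multimode(lopsided)

print(modes_bimodal, modes_lopsided)
```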
Sample Median
• Formula:
$$x_{0.5} = \begin{cases} x_k, & \text{if } n \text{ is odd and } k = (n+1)/2 \\ \dfrac{x_k + x_{k+1}}{2}, & \text{if } n \text{ is even and } k = n/2 \end{cases}$$
where the sample values x_1, x_2, ... , x_n
are arranged in numerical order
• Remarks:
Not well understood or widely accepted
Does not appear to use all of the sample data
Not affected by extreme values
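The odd/even rule for the median translates directly into code. The sketch below follows the slide's 1-based index k; the function name and the example values are illustrative only.

```python
def sample_median(xs):
    """Median via the odd/even rule, using 1-based position k."""
    ordered = sorted(xs)
    n = len(ordered)
    if n % 2 == 1:
        k = (n + 1) // 2
        return ordered[k - 1]                   # x_k
    k = n // 2
    return (ordered[k - 1] + ordered[k]) / 2    # (x_k + x_{k+1}) / 2


odd_case = sample_median([5, 1, 3])
even_case = sample_median([4, 1, 3, 2])
# Replacing the largest value with an extreme one leaves the median unchanged.
robust_case = sample_median([1, 2, 3, 4, 1000])
print(odd_case, even_case, robust_case)
```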
Analysis of Variability
• Sample Range
• Sample Variance
• Sample Standard Deviation
• Sample Coefficient of Variation
Sample Range
• Formula:
R = Xmax - Xmin
where Xmax is the largest value in the sample
and Xmin is the smallest sample value
• Remarks:
Easy to determine
Easily understood
Determined by extreme values
Does not use all sample data
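As a quick illustration of the last two remarks, the sketch below (with invented numbers) shows two samples that share the same extremes, and therefore the same range, even though their interior values are spread very differently.

```python
def sample_range(xs):
    """R = largest sample value minus smallest sample value."""
    return max(xs) - min(xs)


# Hypothetical samples with identical extremes but different interiors.
tight = [40.0, 50.0, 50.0, 50.0, 60.0]
spread = [40.0, 41.0, 50.0, 59.0, 60.0]

r_tight = sample_range(tight)
r_spread = sample_range(spread)
print(r_tight, r_spread)
```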
Sample Variance & Standard Deviation
• Sample Variance
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}{n(n-1)}$$
• Sample Standard Deviation
$$s = (s^2)^{1/2} = \sqrt{s^2}$$
• Remarks
Most frequently used measure of variability
Not well understood
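The sample variance can be computed either from deviations about the mean or from the sums of the values and their squares. The following sketch, on a made-up sample, checks that the two forms agree and that both match Python's `statistics.variance`, which also divides by n - 1.

```python
import math
import statistics


def variance_definition(xs):
    """s^2 = sum((x_i - mean)^2) / (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def variance_shortcut(xs):
    """s^2 = (n * sum(x_i^2) - (sum(x_i))^2) / (n * (n - 1))."""
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    return (n * total_sq - total ** 2) / (n * (n - 1))


# Hypothetical sample.
xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
s2_def = variance_definition(xs)
s2_short = variance_shortcut(xs)
s = math.sqrt(s2_def)   # sample standard deviation
print(s2_def, s2_short, s)
```

The shortcut form needs only one pass over the data, but it can lose precision when the mean is large relative to the spread, so the definitional form is usually the safer choice in floating-point arithmetic.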
Sample Coefficient of Variation
$$CV_s = \frac{s}{\bar{x}}$$
• Remarks
Relative measure of variation
Used for comparing the variation in two samples of
data that are measured in two different units
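Because the coefficient of variation divides the standard deviation by the mean, it does not depend on the unit of measurement. The sketch below illustrates this with the same hypothetical lengths expressed in inches and in centimeters; the data and function name are invented.

```python
import math
import statistics


def coefficient_of_variation(xs):
    """CV = sample standard deviation divided by sample mean."""
    return statistics.stdev(xs) / statistics.mean(xs)


# Hypothetical lengths in inches, and the same lengths in centimeters.
inches = [10.0, 10.2, 9.9, 10.1, 9.8]
centimeters = [2.54 * v for v in inches]

cv_in = coefficient_of_variation(inches)
cv_cm = coefficient_of_variation(centimeters)
print(cv_in, cv_cm)
```

The standard deviations of the two lists differ by a factor of 2.54, but the two CV values agree up to floating-point rounding.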
Analysis of Shape
• Skewness
• Kurtosis
Estimate of Skewness
$$x_r = \frac{\bar{x}}{x_{0.5}}$$
For a unimodal distribution, x_r is an indicator of
distribution shape:
x_r < 1 indicates skewed to the left
x_r = 1 indicates symmetric
x_r > 1 indicates skewed to the right
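The mean-to-median ratio is easy to compute. The sketch below evaluates it for three small invented samples, one for each case. It assumes positive data, since the ratio is not meaningful when the median is zero or negative.

```python
import statistics


def mean_median_ratio(xs):
    """x_r = sample mean divided by sample median (assumes median > 0)."""
    return statistics.mean(xs) / statistics.median(xs)


# Hypothetical samples.
left_skewed = [1, 10, 11, 11, 12]    # mean 9, median 11
symmetric = [1, 2, 3, 4, 5]          # mean 3, median 3
right_skewed = [1, 2, 2, 3, 12]      # mean 4, median 2

xr_left = mean_median_ratio(left_skewed)
xr_sym = mean_median_ratio(symmetric)
xr_right = mean_median_ratio(right_skewed)
print(xr_left, xr_sym, xr_right)
```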
Measure of Skewness
• The third moment about the mean is related to
the asymmetry or skewness of a distribution
$$\mu_3 = E\left[(X - \mu)^3\right]$$
• For a unimodal (i.e., single-peaked) distribution
$\mu_3 < 0$, distribution is skewed to the left
$\mu_3 = 0$, distribution is symmetric
$\mu_3 > 0$, distribution is skewed to the right
• Measure of skewness relative to degree of spread
$$\sqrt{\beta_1} = \frac{\mu_3}{(\mu_2)^{3/2}}, \qquad \mu_2 = E\left[(X - \mu)^2\right]$$
Comparison of Distribution Skewness
• Normal: $\beta_1 = 0$
• Exponential: $\beta_1 = 4$ (i.e., $\sqrt{\beta_1} = 2$)
Estimation of Skewness
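• Estimate of skewness of a distribution from a
random sample
$$\widehat{\sqrt{\beta_1}} = \frac{m_3}{(m_2)^{3/2}}$$
where
$$m_2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
and
$$m_3 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^3, \qquad \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$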
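A minimal sketch of the moment-based skewness estimate, m3 divided by m2 raised to the power 3/2, with both moments taken about the sample mean and divided by n. The helper names and the two small samples are hypothetical.

```python
def central_moment(xs, r):
    """r-th sample moment about the mean, with divisor n."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** r for x in xs) / n


def sample_skewness(xs):
    """Skewness estimate m3 / m2^(3/2)."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5


# Hypothetical samples.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [1.0, 2.0, 2.0, 3.0, 12.0]

skew_sym = sample_skewness(symmetric)
skew_right = sample_skewness(right_skewed)
print(skew_sym, skew_right)
```

Statistical packages often apply a small-sample bias correction to this quantity, so their reported skewness can differ slightly from the plain moment ratio computed here.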
Measurement of Kurtosis
• The fourth moment about the mean is related to
the peakedness, called kurtosis, of a distribution
$$\mu_4 = E\left[(X - \mu)^4\right]$$
• Relative measure of kurtosis
$$\beta_2 = \frac{\mu_4}{(\mu_2)^2}$$
where
$$\mu_2 = E\left[(X - \mu)^2\right]$$
Estimation of Kurtosis
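• Estimate of kurtosis of a distribution ($\beta_2$) from a
random sample
$$\hat{\beta}_2 = b_2 = \frac{m_4}{(m_2)^2}$$
where
$$m_2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
and
$$m_4 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^4, \qquad \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$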
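The kurtosis estimate b2 = m4 / (m2)^2 can be sketched the same way as the skewness estimate, using moments about the sample mean with divisor n. The helper names and sample values are hypothetical. On this scale a normal distribution has a kurtosis of 3.

```python
def central_moment(xs, r):
    """r-th sample moment about the mean, with divisor n."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** r for x in xs) / n


def sample_kurtosis(xs):
    """Kurtosis estimate b2 = m4 / m2^2 (not excess kurtosis)."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2


# Hypothetical evenly spaced sample: m2 = 2 and m4 = 6.8.
flat = [1.0, 2.0, 3.0, 4.0, 5.0]
b2_flat = sample_kurtosis(flat)
print(b2_flat)
```

Some software reports excess kurtosis (b2 minus 3, sometimes with a bias correction), so it is worth checking which convention a given tool uses before comparing values.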
Comparison of Kurtosis
Presentation of Data
40 Specimens
40 specimens are cut from a plate for tensile tests. The tensile tests
were made, resulting in Tensile Strength, x, as follows:
 i     x      i     x      i     x      i     x
 1   48.5    11   55.0    21   53.1    31   54.6
 2   54.7    12   55.7    22   49.1    32   49.9
 3   47.8    13   49.9    23   55.6    33   44.5
 4   56.9    14   54.8    24   46.2    34   52.9
 5   54.8    15   49.7    25   52.0    35   54.4
 6   57.9    16   58.9    26   56.6    36   60.2
 7   44.9    17   52.7    27   52.9    37   50.2
 8   53.0    18   57.8    28   52.2    38   57.4
 9   54.7    19   46.8    29   54.1    39   54.8
10   46.7    20   49.2    30   42.3    40   61.2
Perform a statistical analysis of the tensile strength data.
40 Specimens
The following descriptive statistics were calculated from the data:
Descriptive Statistics
Count                    40
Minimum               42.35
Maximum               61.18
Range                 18.84
Sum                 2104.82
Mean                  52.62
Median                53.03
Sample Variance       19.83
Standard Deviation     4.45
Kurtosis               2.51
Skewness              -0.34
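The statistics above can be recomputed from the tabulated tensile strengths with a short Python script. Two caveats apply. The tabulated data are rounded to one decimal place, so the results differ slightly from the reported values (which appear to come from unrounded measurements). The slides also do not say which skewness and kurtosis estimators the software used; the sketch assumes the moment ratios m3 / m2^(3/2) and m4 / m2^2.

```python
import math
import statistics

# Tensile strengths of the 40 specimens, as tabulated (one decimal place).
x = [
    48.5, 54.7, 47.8, 56.9, 54.8, 57.9, 44.9, 53.0, 54.7, 46.7,
    55.0, 55.7, 49.9, 54.8, 49.7, 58.9, 52.7, 57.8, 46.8, 49.2,
    53.1, 49.1, 55.6, 46.2, 52.0, 56.6, 52.9, 52.2, 54.1, 42.3,
    54.6, 49.9, 44.5, 52.9, 54.4, 60.2, 50.2, 57.4, 54.8, 61.2,
]


def central_moment(xs, r):
    """r-th sample moment about the mean, with divisor n."""
    mean = sum(xs) / len(xs)
    return sum((v - mean) ** r for v in xs) / len(xs)


m2 = central_moment(x, 2)
summary = {
    "Count": len(x),
    "Minimum": min(x),
    "Maximum": max(x),
    "Range": max(x) - min(x),
    "Sum": sum(x),
    "Mean": statistics.mean(x),
    "Median": statistics.median(x),
    "Sample Variance": statistics.variance(x),       # divisor n - 1
    "Standard Deviation": statistics.stdev(x),
    "Kurtosis": central_moment(x, 4) / m2 ** 2,      # assumed estimator
    "Skewness": central_moment(x, 3) / m2 ** 1.5,    # assumed estimator
}

for name, value in summary.items():
    print(f"{name:20s}{value:10.2f}")
```

The negative skewness and a mean slightly below the median both point to a mild left skew, and a kurtosis below 3 suggests a distribution somewhat flatter than the normal.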