Transcript Slide 1
ENGINEERING STATISTICS
2009
Engineering System Analysis
• Engineering systems analysis is the
process of using observations to
qualitatively and quantitatively
understand a system.
• The use of mathematics to determine
how a set of interconnected
components whose individual
characteristics are known will behave
in response to a given input or set of
inputs.
1. What is meant by “understanding a system”?
• The ability to predict future outcomes from the system based on hypothetical inputs.
2. How do we go about formalizing an understanding of a system?
• Our understanding of a system is formalized by a model that maps input signals to output signals.
3. Why is this important?
• A system model is a key component in the systems engineering design cycle.
Systems Engineering Cycle
[Figure: systems engineering cycle diagram. Box labels: Problem; System Analysis Cycle (Acquire data, Identify model factors, Estimate model parameters); MODEL; System Design Cycle (Conceptual design, Simulate model response, Optimize design parameters, Build prototype, Evaluate prototype); Final Design.]
• A technician is involved in the implementation of
engineering designs. An experienced technician
can extrapolate from previous designs to obtain
effective solutions to similar problems.
• An engineer uses the tools of modeling and
optimization to generate system designs. An
experienced engineer should be able to tackle
problems that are completely novel and should
provide solutions that are optimal.
How not to solve a design problem
[Figure: systems engineering cycle diagram. Box labels: Problem; System Analysis Cycle (Acquire data, Identify model factors, Estimate model parameters); MODEL; System Design Cycle (Conceptual design, Simulate model response, Optimize design parameters, Build prototype, Evaluate prototype); Final Design.]
The Importance of models
Dictionary definition of “model”
• A representation of something (usually on a smaller
scale)
• A simplified description of a complex entity or process.
• A representative form or pattern.
A comprehensible simplified description of a
real world system that captures its most
significant patterns or form.
The key property of a model for systems
engineering design is the ability to predict
outcomes from the system.
Models
• Engineering systems analysis
can be thought of as the
process of using observations
to identify a model of a
system.
• The process of modeling a
system is one of finding
correlations or patterns in the
observed signals.
Statistical framework
• Measuring real signals is a statistical process.
– Observed signals will be noisy and this noise must be
included in the modeling process. Thus, all modeling
is inherently a statistical process.
– Identified models of systems are uncertain
approximations of the real world. The modeling error
itself is interpreted as a statistical process.
A systems engineer should have a good
understanding of statistical modeling and
statistical decision methodology.
What is statistics?
• Statistics is the scientific application of
mathematical principles to the collection,
analysis, and presentation of data
– At the foundation of all of statistics is data.
[Figure: Statistics deals with the collection, presentation, and analysis of data, and with the use of data to make decisions and solve problems.]
Engineering statistics
• Engineering statistics is the study of how
best to…
– Collect engineering data
– Summarize or describe engineering data
– Draw formal inferences and practical
conclusions on the basis of engineering data
all the while recognizing the reality of variation
Engineering Statistics
is the branch of statistics that has three subtopics which
are particular to engineering.
1. Design of experiments (DOE)
– use statistical techniques to test and construct models of
engineering components and systems.
2. Quality control and process control
– use statistics as a tool to manage conformance to specifications
of manufacturing processes and their products.
3. Time and method engineering
– use statistics to study repetitive operations in manufacturing in
order to set standards and find optimum (in some sense)
manufacturing procedures.
Data collection methods
• Observational Study
• Experimental Study
• Opposite ends of a continuum where the “scale” is in terms of the degree to which an investigator manages important variables in the study
Types of data
• Qualitative Data (Categorical)
– Non-numerical characteristics associated with items in a sample
– Examples:
• Eye color (blue, brown, green, etc.)
• Engine status (working, not working & fixable, not working & not
fixable)
• Quantitative Data (numerical)
– Numerical characteristics associated with items in a sample
– Typically counts of occurrences of a phenomenon of interest or
measurements of some physical property
– Can be further broken down into discrete (countable) and
continuous (uncountable)
Collection of quantitative data
(Measurement)
• If you can’t measure, you can’t do
statistics… or engineering for that matter!
• Issues:
– Validity
– Precision
– Accuracy (unbiasedness)
Measurement issues
• Validity – faithfully representing the aspect of
interest; i.e., usefully or appropriately
representing the feature of an object or system
• Precision – small variation in repeated
measurements
• Accuracy (unbiasedness) – producing the
“true value” “on average”
Precision and accuracy
[Figure: four panels labeled “Not Accurate, Not Precise”; “Precise, Not Accurate”; “Accurate, Not Precise”; “Accurate and Precise”.]
Statistical thinking
• Statistical methods are used to help us describe
and understand variability.
• By variability, we mean that successive
observations of a system or phenomenon do not
produce exactly the same result.
Are these gears produced exactly the same size? NO!
Sources of variability
• Method
• Environment
• Material
• Man
• Machine
Example
• An engineer is developing a rubber
compound for use in O-rings.
• The engineer uses the standard rubber
compound to produce eight O-rings in a
development laboratory and measures the
tensile strength of each specimen.
• The tensile strengths (in psi) of the eight
O-rings are
1030, 1035, 1020, 1049, 1028, 1026, 1019, and 1010.
Variability
• There is variability in the tensile strength
measurements.
– The variability may even arise from measurement
errors.
• Tensile Strength can be modeled as a random
variable.
• Tests on the initial specimens show that the
average tensile strength is 1027.1 psi.
• The engineer thinks that this may be too low for
the intended applications.
• He decides to consider a modified formulation of
rubber in which a Teflon additive is included.
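The quoted average can be checked directly from the eight measurements listed in the example. The symbol \bar{x} for the average is introduced here only for this check:

```latex
\bar{x} = \frac{1030 + 1035 + 1020 + 1049 + 1028 + 1026 + 1019 + 1010}{8}
        = \frac{8217}{8} = 1027.125 \approx 1027.1\ \text{psi}
```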
Random sampling
• Assume that X is a measurable quantity related
to a product (e.g., tensile strength of rubber). We
model X as a random variable.
– Random variables occur frequently in engineering applications
• Random sampling
– Obtain samples from a population
– All outcomes must be equally likely to be sampled
– Replacement necessary for small populations
– Meaningful statistics can be obtained from samples
$$R : x_1, x_2, x_3, \ldots, x_i, \ldots, x_N$$
Point Estimation
• The probability density function f(x) of the
random variable X is assumed to be known.
– Generally it is taken to be a Gaussian distribution, based
on the central limit theorem.
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
• Our purpose is to estimate certain parameters of
f(x) (mean $\mu$, variance $\sigma^2$) from observation of the
samples.
Sample Mean & Variance
Sample mean: $M = \frac{1}{N}\sum_{i=1}^{N} x_i$
Sample variance: $S^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - M)^2$
M is a point estimator of $\mu$
S is a point estimator of $\sigma$
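As a minimal sketch, the sample mean and sample variance formulas can be written out directly in Python. The function names `sample_mean` and `sample_variance` are illustrative, and the data are the eight O-ring tensile strengths from the earlier example.

```python
import math

def sample_mean(xs):
    """M = (1/N) * sum of x_i."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """S^2 = (1/(N-1)) * sum of (x_i - M)^2; needs at least two samples."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Tensile strengths (psi) of the eight O-rings from the earlier example.
strengths = [1030, 1035, 1020, 1049, 1028, 1026, 1019, 1010]

M = sample_mean(strengths)
S = math.sqrt(sample_variance(strengths))
print(f"M = {M:.1f} psi, S = {S:.1f} psi")
```

Dividing by N − 1 rather than N follows the definition of the sample variance given on the slide.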
Point estimates as random
variables
• Since the sample mean and variance
depend on the random sample chosen,
the values of M and S both depend on the
sample set.
• As such, they also can be considered as
random variables.
2009 Fall
Examples
Sample (N = 10)                      M       S
{55,41,50,44,55,56,48,29,51,66}    49.5    10.01
{60,34,49,43,40,38,53,46,51,46}    46       7.69
{45,54,57,71,36,40,60,46,36,53}    49.8    11.29
{66,57,70,55,69,47,39,48,62,39}    55.2    11.64
{56,44,56,39,51,30,45,55,47,62}    48.5     9.49
{44,27,38,61,49,54,59,29,44,43}    44.8    11.47
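The M and S values for these six samples can be reproduced with Python's standard `statistics` module (`fmean` for M, and `stdev` for S, which divides by N − 1). This is only a check of the arithmetic, but it also shows how much the estimate M moves from one sample of ten to the next.

```python
import statistics

samples = [
    [55, 41, 50, 44, 55, 56, 48, 29, 51, 66],
    [60, 34, 49, 43, 40, 38, 53, 46, 51, 46],
    [45, 54, 57, 71, 36, 40, 60, 46, 36, 53],
    [66, 57, 70, 55, 69, 47, 39, 48, 62, 39],
    [56, 44, 56, 39, 51, 30, 45, 55, 47, 62],
    [44, 27, 38, 61, 49, 54, 59, 29, 44, 43],
]

# One (M, S) pair per sample; each pair is a different draw of the estimators.
estimates = [(statistics.fmean(s), statistics.stdev(s)) for s in samples]
for m, s in estimates:
    print(f"M = {m:.1f}, S = {s:.2f}")

means = [m for m, _ in estimates]
print(f"M ranges from {min(means):.1f} to {max(means):.1f}")
```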
Quality of Estimators
• If $y = u(x_1, x_2, \ldots, x_N)$ is a point estimator of a
parameter $\theta$ of the population, we want
– $E\{y\} = \theta$ (unbiased)
– $V\{y\}$ should be as small as possible
(minimum variance)
• Such an estimator is called an unbiased
minimum variance estimator.
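A small simulation can illustrate the unbiasedness requirement. The sketch below assumes a Gaussian population with standard deviation 2 and samples of size N = 5 (both chosen arbitrarily for illustration). It compares the long-run average of the divide-by-(N − 1) variance estimate with the divide-by-N version; the helper name `variance_estimates` is hypothetical.

```python
import random

def variance_estimates(xs):
    """Return (divide-by-(N-1), divide-by-N) variance estimates of xs."""
    n = len(xs)
    m = sum(xs) / n
    ss = sum((x - m) ** 2 for x in xs)
    return ss / (n - 1), ss / n

rng = random.Random(0)   # fixed seed so the run is repeatable
N, trials = 5, 20000     # assumed values, for illustration only
true_var = 4.0           # population variance (sigma = 2), also assumed

sum_unbiased = sum_biased = 0.0
for _ in range(trials):
    xs = [rng.gauss(0.0, 2.0) for _ in range(N)]
    u, b = variance_estimates(xs)
    sum_unbiased += u
    sum_biased += b

avg_unbiased = sum_unbiased / trials
avg_biased = sum_biased / trials
print(f"true variance      = {true_var}")
print(f"average, N-1 form  = {avg_unbiased:.2f}")
print(f"average, N form    = {avg_biased:.2f}")
```

Under these assumptions the N − 1 form should average close to the true variance, while the N form should average close to (N − 1)/N times it, which is why the N − 1 form is the unbiased one.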
PDF of sample mean
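Sample mean: $M = \frac{1}{N}\sum_{i=1}^{N} x_i$
$$E\{M\} = \mu$$
$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
$$f_M(m) = \frac{1}{\sqrt{2\pi\sigma_M^2}} \exp\left(-\frac{(m-\mu)^2}{2\sigma_M^2}\right); \qquad \sigma_M^2 = \sigma^2/N$$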
For larger sample sizes (N), the probability that the mean estimate is close to the actual mean is higher.
[Figure: pdf of the sample mean for N = 1, N = 5, and N = 100, with $\sigma_M^2 = \sigma^2/N$.]
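The narrowing of the pdf with N can be checked by simulation. The sketch below assumes a Gaussian population with mean 5 and standard deviation 2 (arbitrary illustrative values), draws many samples of size N, and compares the observed variance of the sample mean with the variance of a single observation divided by N.

```python
import random
import statistics

rng = random.Random(1)     # fixed seed so the run is repeatable
mu, sigma = 5.0, 2.0       # assumed population parameters
trials = 4000              # number of sample sets drawn per N

results = {}
for N in (1, 5, 100):
    # Each entry is the sample mean M of one random sample of size N.
    means = [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(N)])
        for _ in range(trials)
    ]
    results[N] = statistics.variance(means)
    print(f"N = {N:3d}: observed var(M) = {results[N]:.4f}, "
          f"sigma^2 / N = {sigma ** 2 / N:.4f}")
```

The two columns should agree to within simulation noise, which is the statement that the spread of M shrinks as 1/N.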
Confidence interval
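We want to determine an interval I for the
actual mean $\mu$ so that
$$P\{\mu \in I\} = 1 - \alpha$$
where $\gamma = 1 - \alpha$ is the confidence coefficient.
$$P\{\mu - a \le M \le \mu + a\} = \int_{\mu - a}^{\mu + a} f_M(m)\,dm = 1 - \alpha$$
$$P\{M - a \le \mu \le M + a\} = 1 - \alpha$$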
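• Given that X is a Gaussian random
variable with mean $\mu$ and variance $\sigma^2$:
$$R : x_1, x_2, x_3, \ldots, x_i, \ldots, x_N$$
$$M = \frac{1}{N}\sum_{i=1}^{N} x_i; \qquad S^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - M)^2$$
$Z = \frac{M - \mu}{\sigma/\sqrt{N}}$ has distribution N(0,1);
$V = \frac{1}{\sigma^2}\sum_{i=1}^{N} (x_i - M)^2$ has a chi-square distribution
with N − 1 degrees of freedom.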
Define
$$t = \frac{Z}{\sqrt{V/(N-1)}}$$
Then the pdf of t is given by
$$h_k(t) = \frac{\Gamma\left((k+1)/2\right)}{\sqrt{k\pi}\,\Gamma(k/2)} \left(1 + \frac{t^2}{k}\right)^{-(k+1)/2}, \qquad k = N - 1$$
[Figure: plot of the pdf for $-4 \le t \le 4$.]
This distribution is known as Student’s t-distribution
with k degrees of freedom.
The distribution is named after the English statistician W.S. Gosset, who
published his research under the pseudonym “Student.”
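[Figure: the pdf $h_k(t)$ for $-4 \le t \le 4$, with the points $-t_{k,\gamma}$ and $t_{k,\gamma}$ marked and an area of $(1-\gamma)/2$ in each tail, where $\gamma$ is the confidence coefficient.]
$$P\{-t_{k,\gamma} \le t \le t_{k,\gamma}\} = \gamma$$
$$P\left\{M - \frac{S}{\sqrt{N}}\,t_{k,\gamma} \le \mu \le M + \frac{S}{\sqrt{N}}\,t_{k,\gamma}\right\} = \gamma$$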
• Thus if we obtain the estimates M and S from the
sample set, the actual value of the population
mean will lie in the interval
$$\left(M - \frac{S}{\sqrt{N}}\,t_{N,\gamma},\; M + \frac{S}{\sqrt{N}}\,t_{N,\gamma}\right)$$
with probability $\gamma$. This is called a $\gamma \times 100$ percent
confidence interval.
• The values for Student’s t-distribution are
tabulated.
Confidence coefficient
N       0.90      0.99      0.995
10      1.8331    3.2498    3.6897
50      1.6766    2.6800    2.9397
100     1.6604    2.6264    2.8713
500     1.6479    2.5857    2.8196
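Values like those tabulated for each N and confidence coefficient can be approximated without a table by integrating the pdf of Student's t-distribution numerically. The sketch below is one possible approach, not necessarily how the table was produced. It assumes k = N − 1 degrees of freedom and a symmetric interval whose central probability equals the confidence coefficient, and it uses Simpson's rule for the integral and bisection for the critical value. The function names are illustrative.

```python
import math

def t_pdf(t, k):
    """Student's t pdf with k degrees of freedom (log-gamma avoids overflow)."""
    log_c = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
             - 0.5 * math.log(k * math.pi))
    return math.exp(log_c - (k + 1) / 2 * math.log1p(t * t / k))

def central_probability(c, k, steps=1000):
    """P(-c <= t <= c), using Simpson's rule on [0, c] and symmetry."""
    h = c / steps
    total = t_pdf(0.0, k) + t_pdf(c, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, k)
    return 2 * total * h / 3

def t_critical(gamma, n):
    """Critical value for sample size n (k = n - 1), found by bisection."""
    k = n - 1
    lo, hi = 0.0, 50.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if central_probability(mid, k) < gamma:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for n in (10, 50, 100, 500):
    row = [t_critical(g, n) for g in (0.90, 0.99, 0.995)]
    print(n, " ".join(f"{v:.4f}" for v in row))
```

Using `math.lgamma` instead of `math.gamma` keeps the normalizing constant finite for large k, such as k = 499.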
Example
• Ten measurements were made on the
resistance of a certain type of wire. Suppose
that M = 10.48 Ω and S = 1.36 Ω. We want to
obtain a confidence interval for $\mu$ with
confidence coefficient 0.90. From the table,
$t_{10,0.9} = 1.83$.
$$\left(10.48 - \frac{1.36}{\sqrt{10}} \times 1.83,\; 10.48 + \frac{1.36}{\sqrt{10}} \times 1.83\right) = (9.69,\, 11.27)$$
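The arithmetic of this example can be reproduced in a few lines. The helper name `t_confidence_interval` is hypothetical; the critical value 1.83 is taken from the table rather than computed.

```python
import math

def t_confidence_interval(m, s, n, t_crit):
    """Return (low, high) = M -/+ t_crit * S / sqrt(N)."""
    half_width = t_crit * s / math.sqrt(n)
    return m - half_width, m + half_width

# Wire-resistance example: M = 10.48 ohm, S = 1.36 ohm, N = 10, t = 1.83.
low, high = t_confidence_interval(10.48, 1.36, 10, 1.83)
print(f"({low:.2f}, {high:.2f})")  # prints (9.69, 11.27)
```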
Example
[Figure: the voltage measured at the output of a system. Vertical axis: V, Volt (−0.4 to 1.4); horizontal axis: t, msec (0 to 10).]
Statistics with 500 measurements
[Figure: voltage (0 to 1.4) versus t, msec (0 to 10). Curve labels: mean; 99.9% confidence interval; $V = \exp(-t/\tau)$, $\tau = 3$ msec.]
Statistical Hypothesis Testing
(experiment design)
[Diagram: manufacturing process at T = 200 °C with output 1K, 50 Ω; manufacturing process at T = 350 °C with output 1K (?), 50 Ω (?).]
H0 : Mean and variance are not changed (null hypothesis)
H1 : Mean and variance are changed (alternative hypothesis)
Statistical Hypothesis Testing
(process optimization)
• Old manufacturing process (tested in time): MTBF 3 months
• New manufacturing process (more costly): MTBF 6 months (?)
H0 : MTBF ≤ 3 months (null hypothesis)
H1 : MTBF > 3 months (alternative hypothesis)
• To guess is cheap. To guess wrongly is
expensive - Chinese Proverb
• There are three kinds of lies: lies, damned lies,
and statistics - Benjamin Disraeli (?), British PM
• First get your facts, then you can distort them at
your leisure - Mark Twain
• Statistical Thinking will one day be as necessary
for efficient citizenship as the ability to read and
write - H. G. Wells