A Journey of Learning from Statistics to Manufacturing
A Journey of Learning from
Statistics to Manufacturing,
Logistics, Engineering Design
and to Information Technology
Professor J.-C. Lu
Industrial and Systems Engineering
Georgia Institute of Technology
Contents
1 Introduction
2 Statistics in Reliability
3 Quality Improvement in Manufacturing
4 Data Mining in Manufacturing
5 Product Design, Manufacturing and Service Chain Management System
6 Information Technology in Education
1. Introduction
• Traditional Research Approach:
[Diagram labels: Thesis Background; New Methods; “Modifications”; “Extensions”; New “Areas”; Application #1, Application #2, …, Application #k]
• Non-Traditional Research Methods:
[Diagram labels: Real-life Problems; Business; Team-work; Application-oriented; Literature; Best Practice; Practical Problem Solving; Cross-disciplines; Academic Problem Formulation; Literature Review; Academia; Discipline-focused; New Methods or New Areas in Research; Impact Analysis; Time]
2. Statistics in Reliability
Traditional Research Approach:
Lu, J. C. (1989), “Weibull Extensions of the Freund and Marshall-Olkin
Bivariate Exponential,” IEEE Transactions on Reliability, 38(5), 615-619.
Lu, J. C. and Bhattacharyya, G. K. (1990), “Some New Constructions of
Bivariate Weibull Models,” Annals of the Institute of Statistical
Mathematics, 42(3), 543-559.
Lu, J. C. (1990), “Least Squares Estimation for the Multivariate Weibull
Model of Hougaard Based on Accelerated Life Test of System and
Component,” Communications in Statistics, 19(10), 3725-3739.
Lu, J. C. and Bhattacharyya, G. K. (1991), “Inference Procedures for a
Bivariate Exponential Model of Gumbel Based on Life Test of
System and Components,” Journal of Statistical Planning and
Inference, 27, 383-396.
Lu, J. C. and Bhattacharyya, G. K. (1991), “Inference Procedures for a
Bivariate Exponential Model of Gumbel,” Statistics and Probability
Letters, 12, 37-50.
Lu, J. C. (1997), “A New Plan for Life-Testing Two-Component Parallel
Systems,” Statistics and Probability Letters, 34(1), 19-32.
Data layout of the life test:
$x_{(1)}, x_{(2)}, \ldots, x_{(r)}, x^{*}_{(r+1)}, \ldots, x^{*}_{(n)}$
$y_{[1]}, y_{[2]}, \ldots, y^{*}_{[r]}, y^{*}_{[r+1]}, \ldots, y^{*}_{[n]}$
The life-testing experiment was terminated at $x_{(r)}$,
and data with superscript “*” are censored at $x_{(r)}$.
$x_{(1)} < x_{(2)} < \cdots < x_{(r)}$ are order statistics,
$y_{[1]}, y_{[2]}, \ldots, y_{[r]}$ are concomitant order statistics.
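A minimal Python sketch of how a censored sample of this kind could be assembled. It assumes each test unit yields a pair (x, y), that pairs are ordered by x, and that every value exceeding the stopping time (the r-th smallest x) is recorded as censored at that time. The function name `censor_at_rth_failure` and the toy numbers are illustrative assumptions and are not taken from the cited papers.

```python
import numpy as np

def censor_at_rth_failure(x, y, r):
    """Order pairs by x and censor all values at the r-th smallest x.

    Returns the recorded x and y values (censored ones replaced by the
    stopping time) together with boolean censoring flags.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)                    # x_(1) < x_(2) < ... < x_(n)
    x_sorted, y_conc = x[order], y[order]    # y_[i] is the concomitant of x_(i)
    stop = x_sorted[r - 1]                   # test terminated at x_(r)
    x_cens = x_sorted > stop                 # flags for starred x values
    y_cens = y_conc > stop                   # flags for starred y values
    x_obs = np.where(x_cens, stop, x_sorted)
    y_obs = np.where(y_cens, stop, y_conc)
    return x_obs, y_obs, x_cens, y_cens

# Toy data (assumed): four units, test stopped at the second x failure.
x = [5.0, 1.0, 3.0, 9.0]
y = [2.0, 4.0, 8.0, 1.5]
x_obs, y_obs, x_cens, y_cens = censor_at_rth_failure(x, y, r=2)
print(x_obs, x_cens)
print(y_obs, y_cens)
```

In this toy run the stopping time is 3.0, so the two largest x values and the two concomitants above 3.0 are recorded as censored at 3.0.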
Sample Publications from the Traditional Research Approach:
Chen, D., and Lu, J. C. (1998), “The Asymptotics of Maximum
Likelihood Estimates of Parameters Based on a Data Type
Where Failure and Censoring Times are Dependent,”
Statistics and Probability Letters, 36, 379-391.
Chen, D., Li, C. S., Lu, J. C., and Park, J. (2000), “Simple Parameter
Estimation for Bivariate Shock Models with Singular
Distribution for Censored Data with Concomitant Order
Statistics,” Australian and New Zealand Journal of
Statistics, 42(3), 323-336.
Non-traditional Research Approaches:
A. Started working with Nortel in the printed circuit board (PCB)
manufacturing area in 1989. Received the first Nortel grant in
1990. Published the first paper (a JASA case study) in 1994.
B. Started working with NCSU’s Semiconductor Center in 1990.
Early publications appeared in 1991 (Proceedings),
1993 (engineering journal) and 1997 (statistics journal).
Reliability Degradation Studies (First example
of the Non-traditional Research Approach):
Lu, J. C., Park, J. and Yang, Q. (1997), “Statistical Inference of a
Time-to-Failure Distribution from Linear Degradation Data,”
Technometrics, 39(4), 391-400.
Su, C., Lu, J. C., Chen, D., and Hughes-Oliver, J. M. (1999), “A Linear
Random Coefficient Degradation Model with Random Sample
Size,” Lifetime Data Analysis, 5, 173-183.
Chen, D., Lu, J. C., Huo, X., and Ming, Y. (2001), “Optimum Percentile
Estimating Equations for Nonlinear Random Coefficient
Models,” Journal of Statistical Planning and Inference, 275-292.
NSF DMII-ORPS Program, “Modeling Accelerated Degradation Data for
Product Reliability Improvement and Warranty Analysis,” 2001-2003 (with Paul Kvam).
Linear Degradation Model (semiconductor manufacturing):
$y_{ij} = \beta_{0i} + \beta_{1i} \log(t_{ij}) + \varepsilon_{ij}$,
i = 1, 2, …, k (#replicates),
j = 1, 2, …, $n_i$ (#successive repeated measurements),
$y_{ij}$ = current, threshold voltage shift or transconductance degradation,
$t_{ij}$ = time.
Linear Random Coefficient Model:
Assume $\beta_0$ and $\beta_1$ have a bivariate normal distribution
with mean $(\mu_0, \mu_1)$, variance $(\sigma_0^2, \sigma_1^2)$ and correlation $\rho$.
Define the failure time T as the time that the degradation
reaches a specified level $y_f$, and set $y_f = \beta_0 + \beta_1 T$.
The distribution of the failure time $T = (y_f - \beta_0)/\beta_1$ is
$\Pr(T \le t) = \Pr((y_f - \beta_0)/\beta_1 < t) \approx \Phi\{A/B\}$, where $A = \mu_0 + t\mu_1 - y_f$
and $B = \sqrt{C}$, $C = \sigma_0^2 + \sigma_1^2 t^2 + 2 t \rho \sigma_0 \sigma_1$.
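The failure-time distribution of the linear random coefficient model can be checked numerically. The Python sketch below evaluates the normal form Phi(A/B) and compares it with a Monte Carlo estimate obtained by drawing the random intercept and slope directly. The parameter names (`mu0`, `mu1`, `s0`, `s1`, `rho`, `y_f`) and all numerical values are assumptions chosen for illustration, with the slope kept well away from zero so that the normal form is a close approximation.

```python
import math
import numpy as np

def failure_cdf(t, mu0, mu1, s0, s1, rho, y_f):
    """Normal-form Pr(T <= t) = Phi(A / B) for a linear degradation path."""
    a = mu0 + t * mu1 - y_f
    c = s0**2 + (s1 * t)**2 + 2.0 * t * rho * s0 * s1
    return 0.5 * (1.0 + math.erf(a / math.sqrt(c) / math.sqrt(2.0)))

# Hypothetical parameter values (not from the source).
params = dict(mu0=1.0, mu1=0.5, s0=0.2, s1=0.05, rho=-0.3, y_f=6.0)
t = 10.0
approx = failure_cdf(t, **params)

# Monte Carlo check: draw (intercept, slope) pairs and compute
# T = (y_f - intercept) / slope for each draw.
rng = np.random.default_rng(0)
cov_01 = params["rho"] * params["s0"] * params["s1"]
cov = [[params["s0"]**2, cov_01], [cov_01, params["s1"]**2]]
draws = rng.multivariate_normal([params["mu0"], params["mu1"]], cov, size=200_000)
failure_times = (params["y_f"] - draws[:, 0]) / draws[:, 1]
empirical = float(np.mean(failure_times <= t))

print(f"Phi(A/B) = {approx:.4f}, Monte Carlo = {empirical:.4f}")
```

With these values A is zero at t = 10, so the normal form gives exactly 0.5, and the simulated proportion should agree to within sampling error.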
Non-linear Degradation Model (motivated by
both semiconductor and PCB manufacturing studies):
$Y_i = f(X_i, \beta_i) + \varepsilon_i$,
$\beta_i = \beta + b_i$ (random effects).
Note that $E(Y_i) \ne f(X_i, E(\beta_i)) = f(X_i, \beta)$.
Thus, $f(X_i, \beta)$ is not the mean response of the population,
and may not be the median of the distribution of $Y_i$
even when zero is the distribution mean of errors $\varepsilon_i$.
By correcting the bias of the median regression, estimates of $\beta$
were obtained from solving a system of (optimum) unbiased
percentile estimating equations (PEE). The asymptotic
distribution of the estimates was derived. Several examples
of asymptotic efficiency evaluations were given.
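The gap between the population mean response and the curve evaluated at the mean coefficient can be seen in a small simulation. The Python sketch below is only an illustration under stated assumptions: the exponential form of `f`, the normal random effect, and all numerical values are hypothetical and are not the models or estimating equations of the cited paper.

```python
import numpy as np

def f(x, beta):
    """Hypothetical nonlinear degradation path (assumed for illustration)."""
    return np.exp(beta * x)

rng = np.random.default_rng(1)
beta_mean, beta_sd, x = 0.5, 0.3, 2.0

# Random coefficients: fixed part plus a normal random effect.
beta_i = beta_mean + rng.normal(0.0, beta_sd, size=500_000)
eps = rng.normal(0.0, 0.1, size=beta_i.size)   # zero-mean measurement errors
y = f(x, beta_i) + eps

plug_in = float(f(x, beta_mean))   # curve evaluated at the mean coefficient
mean_y = float(y.mean())           # simulated population mean response
median_y = float(np.median(y))     # simulated population median response

print(f"f(x, mean coefficient) = {plug_in:.3f}")
print(f"mean of Y              = {mean_y:.3f}")
print(f"median of Y            = {median_y:.3f}")
```

For this choice of `f` the mean response is exp(1.18), roughly 3.25, clearly above the plug-in value exp(1), roughly 2.72, which is the kind of bias the percentile estimating equations are meant to correct.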
3. Quality Improvement in Manufacturing
Non-Traditional Research (examples):
Mesenbrink, P., Lu, J. C., McKenzie, R., and Taheri, J. (1994),
“Characterization and Optimization of a Wave Soldering Process,”
Journal of the American Statistical Association (JASA), 89, 1209-1217.
Gardner, M. M., Lu, J. C., et al. (NCSU ECE and TI researchers) (1997),
“Equipment Fault Detection using Spatial Signatures,” IEEE Trans.
on Components, Hybrids and Manufacturing, 20(4), 295-304.
Hughes-Oliver, J. M., Lu, J. C., Davis, J. C., and Gyurcsik, R. S. (1998),
“Achieving Uniformity in a Semiconductor Fabrication Process using
Spatial Modeling,” JASA, 93, 36-45.
Lu, J. C., et al. (SRC (semiconductor research corporation) and NCSU ECE
people) (1998), “A New Device Design Methodology,” IEEE Trans.
on Electron Devices - Special Issue on Process Integration and
Manufacturability, 45(3), 634-642.
Li, C. S., Lu, J. C., Park, J., Kim, K. M., Brinkley, P. A., and Peterson, J.
(1999), “A Multivariate Zero-inflated Poisson Distribution and its
Inferences,” Technometrics, 41(1), 29-38.
4. Data Mining in Manufacturing
Rying, E. A., Bilbro, G. L., Ozturk, M. C., and Lu, J. C. (2000), “In Situ
Selectivity and Thickness Monitoring based on Quadrupole Mass
Spectroscopy during Selective Silicon Epitaxy,” Proceedings of the
197th Meeting of the Electrochemical Society, 383-392.
Lu, J. C. (2001), “Methodology of Mining Massive Data Set for Improving
Manufacturing Quality/Efficiency,” Chapter 11 (pp. 255-288) in Data
Mining for Design and Manufacturing edited by D. Braha, Kluwer
Academic Publishers: New York.
Lada, E. K., Lu, J. C., and Wilson, J. R. (2002), “A Wavelet Based Procedure
for Process Fault Detection,” IEEE Trans. on Semiconductor
Manufacturing, 15(1), 79-90.
Rying, E. A., Bilbro, G. L., and Lu, J. C. (in press), “Focused Local Learning
with Wavelet Neural Networks,” IEEE Trans. on Neural Networks.
Porter, A. L., Kongthon, A., and Lu, J. C. (in press), “Research Profiling –
Improving the Literature Review: Illustrated for the Case of Data
Mining of Large Datasets,” Scientometrics.
Data from Nortel’s Antenna Manufacturing Process
Auto-Correlation Map: Keywords (Cleaned)
[Figure: keyword map with clusters labeled A, B, C and D. Similarity legend: > 0.75; 0.50 - 0.75; 0.25 - 0.50; < 0.25. Keywords shown: temporal databases; data reduction; wavelet transforms; data analysis; remote sensing; image recognition; classification; spatial data structures; fuzzy set theory; image classification; feature extraction; decision trees; Bayes methods; statistical analysis; learning (artificial intelligence); pattern recognition; pattern classification; neural nets; data mining; very large databases; fuzzy neural nets; pattern clustering; tree data structures; backpropagation; image processing; fuzzy logic; trees (mathematics); parallel algorithms; unsupervised learning; parallel programming.]
Node size reflects relative frequency in the dataset of 991 abstract records. Placement is based on a
VantagePoint proprietary Multi-dimensional Scaling (MDS) routine. Topics depicted close together are
Discrete Wavelet Transform:
Data Reduction Procedures
1 Linear and Nonlinear Approximation in
Signal Processing
2 Information Metric Based Procedures
3 Data Denoising Procedures
4 Our Methods RRE_h and RRE_s
5 Comparisons
• Testing Curves
• “Data without Noises”
• “Data with Inherent Random Noises”
Linear and Nonlinear Approximation
in Signal Processing
Information Metric Based Procedure – AMDL
(Approximation Minimum Description Length)
Saito’s (1994) method selects C to minimize
$\mathrm{AMDL}(C) = 1.5\, C \log_2 N + 0.5\, N \log_2 \left[ \sum_{i=1}^{N} (y_i - \hat{y}_{i,C})^2 \right]$.
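A minimal Python sketch of AMDL-based selection of the number of retained wavelet coefficients C. Several choices here are assumptions made for illustration: a hand-written orthonormal Haar transform stands in for whatever wavelet basis was actually used, the coefficients are retained in order of decreasing magnitude, and the search is restricted to C no larger than N/2 so that a near-zero residual does not dominate the logarithm. The test curve is hypothetical.

```python
import numpy as np

def haar_dwt(signal):
    """Full orthonormal Haar transform of a signal whose length is a power of 2."""
    approx = np.asarray(signal, dtype=float)
    details = []
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return np.concatenate([approx] + details[::-1])

def amdl_select(signal):
    """Return (C, scores) where C minimizes AMDL over C = 1, ..., N/2."""
    coeffs = haar_dwt(signal)
    n = coeffs.size
    energy = np.sort(coeffs**2)[::-1]            # largest coefficients first
    # For an orthonormal transform, the residual sum of squares after keeping
    # the C largest coefficients equals the energy of the dropped ones.
    tail = np.cumsum(energy[::-1])[::-1]         # tail[c] = sum of energy[c:]
    c_values = np.arange(1, n // 2 + 1)          # assumed search range
    rss = tail[c_values]
    scores = 1.5 * c_values * np.log2(n) + 0.5 * n * np.log2(rss)
    return int(c_values[np.argmin(scores)]), scores

# Hypothetical step-shaped test curve with small additive noise.
rng = np.random.default_rng(2)
n = 64
clean = np.where(np.arange(n) < n // 2, 0.0, 4.0)
noisy = clean + rng.normal(0.0, 0.1, size=n)
best_c, scores = amdl_select(noisy)
print("selected C:", best_c)
```

The step curve needs only two large Haar coefficients, so the selected C should stay small; the first term of the criterion penalizes every additional coefficient kept.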
Data De-noising Procedures:
Donoho and Johnstone (1995) considered the nonparametric
regression model, $y_i = f_i + \varepsilon_i$, i = 1, 2, …, N, where the $\varepsilon_i$
are i.i.d. normal variables with zero mean and constant variance.
The goal of the data de-noising procedures is to find a smooth
estimate of f that minimizes the mean square error (MSE). Three
methods, VisuShrink, RiskShrink and SURE (Stein’s Unbiased Risk
Estimate), were compared in our studies.
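As an illustration of the first of these methods, the Python sketch below applies soft thresholding at the universal threshold, sigma times the square root of 2 log N, which is the rule VisuShrink is built on. It is a simplified sketch under assumptions: the function name `visushrink` is hypothetical, the threshold uses the number of coefficients passed in as N, and when no noise level is supplied it is estimated from the same coefficients by the median absolute deviation.

```python
import numpy as np

def visushrink(detail_coeffs, sigma=None):
    """Soft-threshold wavelet detail coefficients at the universal threshold."""
    w = np.asarray(detail_coeffs, dtype=float)
    if sigma is None:
        # Robust noise-scale estimate: median absolute deviation / 0.6745.
        sigma = np.median(np.abs(w)) / 0.6745
    lam = sigma * np.sqrt(2.0 * np.log(w.size))
    shrunk = np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)
    return shrunk, lam

# Toy coefficients (assumed): one large "signal" coefficient among small ones.
shrunk, lam = visushrink([0.1, -0.2, 5.0, 0.05], sigma=0.1)
print("threshold:", lam)
print("shrunk coefficients:", shrunk)
```

Coefficients below the threshold are set to zero and the remaining ones are pulled toward zero by the threshold amount, which is what produces the smooth estimate.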
Seven Testing Curves, Two Real-life Data Examples
Comparison Results (“Data
without Noise”)
Comparison Results (“Data with
Inherent Random Noises”)
Decision Rules (based on the “reduced-size data”)
1 Chi-square tests
2 Multi-scale Statistical Process Control (SPC)
3 (Functional) Principal Component Analysis (PCA)
4 Bayesian Odds-ratio Probability-based Classification (and Canonical Variation Analysis)
5 Decision Tree (CART)
6 Scalogram (from Signal Processing Literature)
7 Integrated Energy Metrics
Scalogram
Challenges: derive the distribution of the “energy,”
$E_j = \sum_k I(|w_{jk}| > \lambda)\, w_{jk}^2$, where $\lambda$ is decided from the
data reduction method, and $w_{jk}$ is the wavelet coefficient.
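A short Python sketch of the thresholded scale energy, assuming the indicator keeps the coefficients whose magnitude exceeds the threshold chosen by the data reduction method. The function name `scale_energies`, the threshold value and the toy coefficients are illustrative assumptions.

```python
import numpy as np

def scale_energies(coeffs_by_scale, threshold):
    """Energy per scale: sum of squared coefficients whose magnitude exceeds the threshold."""
    energies = []
    for w in coeffs_by_scale:
        w = np.asarray(w, dtype=float)
        kept = np.abs(w) > threshold          # indicator I(|w_jk| > threshold)
        energies.append(float(np.sum(w[kept] ** 2)))
    return energies

# Toy wavelet coefficients for two scales (assumed values).
coeffs = [[3.0, -0.1], [0.2, -2.0, 0.05, 1.0]]
energies = scale_energies(coeffs, threshold=0.5)
print(energies)
```

For the toy input, the first scale keeps only the coefficient 3.0 and the second keeps -2.0 and 1.0, giving energies 9.0 and 5.0.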
Key Challenges in Data Mining Procedures in
Manufacturing Applications:
The replication size in “fault classes” is small.
Proposal: generating “learning data”
Example: Rying (2001) conducted 25 runs of RTCVD experiments
with four induced fault cases.
Nominal Runs:
Four Induced Fault Cases
Challenges in Learning-data Generation:
1. It is difficult to generate the “data shifting patterns”
(e.g., Rying’s nominal data) in the wavelet
domain, which holds far fewer data points than
the original data domain, where data sets can be large.
Idea: “Zoom in” on the regions where “fault data
patterns” occurred, and generate the shifted data in the original data domain in these
focused regions.
Illustration Example:
“Zoom-in Procedure”:
Generate Replicates in the Wavelet Domain
with the following “Patching Technique”:
5. Product Design, Manufacturing and Service
(PDMS) Chain Management System
Initiatives in iTimes (Information Technology
Integrated Manufacturing Enterprise System)
Engineering Domains
Customer-Driven
Design/Engineering
Application Areas
Additive Fabrication
E-Design, Engineering
Supply Chains
Simulation-Based Design
Environments for Field
Service Engineering
Aero/Auto/Elec Systems
Education
Materials Design
Enabling Technologies
Decision Making and Design Synthesis
Interoperability: Fine and Coarse Grained
Engr. Modeling, Validation, Testbeds
IT Architectures for Affordable Change
Tools for Modeling
Current Involvement in iTimes:
(1) developing a collaborative game theory based decision
support system for structuring interactions among
partners in the ePDMS chain (e.g., random coefficient
based evolution modeling of utility functions changing
over the “co-developing periods”);
(2) extracting design-relevant relationships from “data”
collected from various sources, e.g., past designs,
conditions of machines on the factory floor at
distributed sites, etc.;
(3) monitoring and controlling resource (e.g., energy)
utilization and environmental impact.
Challenges in Data Mining on Product Design
(1) “Retrieving past design information”:
How to define “similarity” in 3-D geometric
objects with spatial relationships?
Is it possible to develop a “multi-resolution”
presentation of design models or data?
(2) Source of “variation” in design
(3) Relationship between design, manufacturing
and service activities.
Analysis Models of Varying Fidelity
[Figure labels: Design Model (CAD): Airframe Subassembly; Analysis Models (CAE): 1D Beam/Stick Model, 3D Continuum/Brick Model; Associativity Gaps; Diverse Fidelities]
Informal Associativity Diagram
[Figure labels: Design Model; Analysis Model; PWA Component Occurrence; 2 ABB; 3 APM; 4 CBAM; APM ABB; Plane Strain Bodies System; Solder Joint Plane Strain Model; body 1, body 2, body 3, body 4; plane strain body i, i = 1...4 (geometry, material); Component (base: Alumina); PWB (core: FR4); Epoxy; Solder Joint; primary structural material; linear-elastic model; total height $h_c$]
Constrained Object-based Analysis Template: Constraint Schematic View
[Figure labels: 1 SMM; 2 ABB; 3 APM; 4 CBAM; component occurrence c; solder joint sj; pwb; primary structural material; linear-elastic model; bilinear-elastoplastic model; deformation model; Fine-Grained Associativity; approximate maximum inter-solder joint distance; total height; total thickness; length 2; detailed shape (rectangle); average; Plane Strain Bodies System; geometry model 3; stress-strain models 1, 2 and 3; solder joint shear strain range; xy, extreme, 3; xy, extreme, sj; symbols T0, T1, T2, T3, Tc, Ts, Tsj, L1, L2, Lc, Ls, h1, h2, hc, hs, a]
6. Information Technology in Education
[Figure: Architecture of the Integrated Curriculum (IC)-ePDMS System. Components shown: CaMILE; IC web-page links; Laboratory project; Web-based User Interface; Modeling and analysis tools in “existing systems”; ePDMS decision support tools; Middleware (e.g., CORBA, SOAP, Jini, etc.); Case study database; Simulated enterprise operation system; Industrial practicum reports and case studies]