Kernel Density Estimation and Metropolis-Hastings Sampling in Process Capability Analysis of Unknown Distributions
Wenzhen Huang¹, Ankit Pahwa
Mechanical Engineering Department, University of Massachusetts Dartmouth, North Dartmouth, MA, USA
¹Contact author
Zhenyu Kong
School of Industrial Engineering and Management, Oklahoma State University, Stillwater, OK, USA
ABSTRACT
A strong normality assumption is associated with widely used process capability indices such as Cp, Cpk. Violation of the assumption misleads interpretation in applications. A nonparametric method is proposed for density estimation of any unknown distribution. Kernels are used for density estimation, and the Metropolis-Hastings (M-H) algorithm is adopted to generate samples from the estimated density. M-H sampling provides a tool that accommodates different kernel functions and offers flexibility for future extension to multivariate cases. Conformity (yield) based indices (Yp, Ypk) are adopted to replace Cp, Cpk. These indices can be conveniently assessed by the proposed kernel density based M-H algorithm (K-M-H). The method is validated by several simulation case studies.
INTRODUCTION
Process capability indices such as Cp, Cpk are popularly used in industry for manufacturing process evaluation and quality control. Cp, Cpk essentially use the specification limits and the estimated process mean and standard deviation to indicate the potential and actual process capability of conforming to specifications. The initial introduction and interpretation of these indices were strongly tied to the normality assumption. More and more modified indices were introduced to accommodate non-normality, off-center processes, better performance, and different interpretation requirements (e.g., quality loss). The bewildering diversity of index versions made interpretation and comprehension per se an important issue and attracted the attention of the academic and industrial communities. Early comprehensive interpretations of these indices were given by Kane (1986), Sullivan et al. (1984), and Kotz et al. (1993). More recent reviews and discussions include Tsui (1997) and Palmer and Tsui (1999). Kotz and Johnson (2002) gave an exhaustive summary and review of process capability index definitions, interpretations, and new developments.

Since non-normal populations are common in manufacturing quality control and geometric tolerancing (GD&T), as stated by Bisgaard et al. (1997), Kotz et al. (2002), and Kong et al. (2009), improved methods and indices for non-normal cases have attracted more interest. To target non-normality and to align with the 6σ dispersion of normal cases, Clements (1989) introduced a percentile-range alternative, i.e., replacing 6σ with the range between the upper and lower 0.135 percentage points of a non-normal quality characteristic in the index definition. This led to similar capability indices. Along the same line, a variety of parametric models were developed for percentile range assessment, such as the Pearson system (Clements, 1989), the Johnson system, Weibull, lognormal, generalized lambda (Pal, 2005), t, gamma, etc. (Kotz et al., 2002). These efforts broadened the generality and applicability of the initial indices. They also complicate assessment, comparison, and interpretation of the indices. For instance, with the same value of an index Cp (or Cpk), the answer may not be straightforward as to whether two processes have the same capability in terms of nonconformity. These parametric techniques assume a functional form for the density of the quality characteristic. More involved model (distribution family) assumptions, parameter estimation, and model adequacy checking are required. With the bewildering diversity of indices, confusion, incoherence, and misleading results are inevitable in application. It is well known that Cp, Cpk generally do not provide the proportion of nonconforming products; only for the normal distribution can this proportion be easily derived from Cp, Cpk.
Copyright © 2012 by ASME
Proceedings of the ASME 2012 International Manufacturing Science and Engineering Conference MSEC2012
June 4-8, 2012, Notre Dame, Indiana, USA
MSEC2012-7299
Downloaded From: http://proceedings.asmedigitalcollection.asme.org/ on 05/14/2014 Terms of Use: http://asme.org/terms
Several authors preferred going back to the basic concept of what process capability really means, i.e., replacing Cp, Cpk with more transparent and coherent indices such as the nonconformity or average fallout rate (the probability of fallout, p). Carr (1991) was probably the first to propose the estimated fallout rate p as an index. Yeh et al. (1998, 2001) proposed the ratio of the expected (desired) fallout rate to the observed one as an index for both univariate and multivariate processes. Tsui (1997) also preferred the yield or conforming rate (1 − p) for capability evaluation, and he proposed alternatives, i.e., quality yield indices, to accommodate conformity and quality loss evaluation. Flaig (1999, 2002) also suggested the use of the conforming rate (1 − p) as an index.
Parametric models were suggested for nonconformity assessment by almost all the authors who preferred nonconformity indices. If the population is believed to follow a certain family of distributions, the related parameters are estimated from the sample data, and the nonconforming probability p is then calculated with the estimated distribution model. However, no available distribution family can accommodate all possible behaviors of actual data densities. Density estimation therefore becomes a prerequisite. A nonparametric approach was also proposed to estimate the density for process capability analysis, with a Gaussian kernel adopted for density estimation (Polansky, 1998, 2000). The nonparametric kernel density estimate (KDE) is purely data driven, requiring no model training process and no prior model knowledge or assumption on the population density, i.e., "letting the data speak for themselves". The extension to multivariate KDE is straightforward. Thus it can be a promising approach to accommodate any distribution for capability analysis. Two factors are crucial for KDE: the bandwidth and the kernel function. The Gaussian kernel is one of the commonly used kernels and can simplify conformity assessment (Polansky, 1998). Exploring other kernel alternatives may increase flexibility and improve the performance of density estimation for truncated distributions. With KDE the nonconforming probability p can be assessed. One way to do this, instead of randomly sampling the actual process, is to generate random samples from the estimated density, a bootstrap idea. p can thus be estimated by the ratio of the fallout to the total sample size.

In this paper, we adopt yield based process capability indices. Several kernel functions are applied to increase model flexibility, and a Markov chain Monte Carlo (MCMC) sampling algorithm (Metropolis-Hastings sampling) is introduced to draw random samples for assessing the nonconforming probability and yield. The integration of KDE and MCMC may provide a novel tool for nonconformity based process capability evaluation.

The paper is organized as follows. The yield based indices are introduced and the proposed assessment strategy is presented in the next section. Section 3 deals with the kernel density estimation method, including kernel functions and bandwidth
selection. Section 4 presents Metropolis-Hastings sampling and a step-by-step procedure for implementation. A simulation case study is presented in Section 5, including four different types of distributions, and a comparison is made among the benchmark, the proposed method, and the commonly used approach. Section 6 gives a summary and conclusions.

CONFORMITY INDICES
The widely used process capability indices Cp, Cpk usually require the normality assumption. These indices indicate the ratio of the specification range to the process dispersion, with 6σ used to represent the process dispersion (equivalent to a 99.73% conformity range). Alternatives have also been proposed for when normality is violated, using the range of distribution quantiles (0.00135 and 0.99865) as the measure of process dispersion. These indices can be interpreted by the corresponding conformity, indicated in defective parts per million (ppm). However, the conformity interpretation of Cp, Cpk will be significantly different and misleading when the normality assumption is moderately or even slightly violated. In this paper we assume the process under study is in-control, which can be assessed by control chart techniques with subgroup samples. We further assume the process quality characteristic (individual sample) can be characterized by some known or unknown statistical model. We propose directly using the yield or conformity to characterize process capability. This gives a coherent and unified index for any distribution and prevents the confusion and misunderstanding caused by vagueness in the existing indices (Cp, Cpk). The yield is defined as the probability of the quality characteristic(s) falling in the specification range or region (for multivariate processes). Thus,
Yield = Prob[LSL ≤ x ≤ USL] (for a univariate process)   (1)

or

Yield = Prob[X ∈ Ω_S] (for a multivariate process)   (2)

where x, X are the quality characteristic and the vector of quality characteristics, respectively. Ω_S denotes a specification region. Ω_S usually represents a hypercube but can be a complex irregular region. Examples of such irregular regions can be found in many applications such as semiconductor and automotive manufacturing (Huang et al., 2008). An important example is evaluating the process capability of a multivariate process in manufacturing of geometrical features with interrelated GD&T tolerances, in which position/orientation and form tolerances are embedded and therefore not independent, creating an irregular specification space (Huang et al., 2010). For instance, a composite tolerance on two dimensional variables creates an irregular Ω_S. USL and LSL are the upper and lower specification limits. In the univariate case, the conformity is expressed as:
Y_{pk} = \int_{LSL}^{USL} f(x)\,dx, \qquad Y_p = \max_{\mu} Y_{pk} = \max_{\mu} \int_{LSL}^{USL} f(x)\,dx \qquad (3)
where Y_pk, Y_p can be interpreted as the actual and potential conformity (yield), respectively. f(x) is the density function of the quality characteristic x, and µ is the mean of x. In general Y_pk ≤ Y_p, and Y_pk = Y_p only if the process is correctly positioned; in the symmetric distribution case this means well centered. ppm can thus be conveniently expressed by:

ppm_actual = (1 − Y_pk) × 10⁶ and ppm_potential = (1 − Y_p) × 10⁶

In multivariate cases, a general expression of the conformity is
Y_{pk} = \int_{\Omega_S} f(\mathbf{X})\,d\mathbf{X}, \qquad Y_p = \max_{\boldsymbol{\mu}} Y_{pk} = \max_{\boldsymbol{\mu}} \int_{\Omega_S} f(\mathbf{X})\,d\mathbf{X} \qquad (4)
where µ denotes the mean vector of X. Y_pk, Y_p clearly define the capability of a process for producing quality products. The interpretation is coherent and independent of model assumptions. However, the applicability of Eqs. (1)-(2) relies on how easily and accurately Y_pk, Y_p can be obtained from process data. To this end, two issues must be resolved: i) estimation of the density f(x); ii) calculation of Y_pk, Y_p. We propose kernel density estimation and Metropolis-Hastings sampling in the next two sections to address these issues. The strategy is presented below:
FIGURE 1 - PROCEDURE OF PROCESS CAPABILITY EVALUATION USING KERNEL DENSITY ESTIMATION AND SAMPLING
[Flowchart: Is the process in-control? If N, process correction; if Y, data collection x_i, i = 1, 2, …, n → kernel density estimate f̂(x) → f(x) → M-H sampling x^(t) ~ f̂(x), t = N+1, … → Y_pk, Y_p estimation (Eqs. (1), (3) and Sect. 4 Eqs. (18), (19))]
KERNEL DENSITY ESTIMATION
In process capability analysis the accuracy of the Cp, Cpk results is essential for interpretation and decision making. The model (normality) assumption is one of the most critical factors, detrimentally affecting the quality of the results if the distribution is incorrectly assumed. When normality is moderately or even slightly violated but still accepted, the conformity of a process represented by Cp, Cpk can be very misleading, as shown in (Polansky, 1998) and in the case study below. These non-normality cases are common in practice. For example, as presented by Bisgaard et al. (1997), if a tolerance is subject to the maximum material condition (MMC), the process is deliberately biased in machining: the distribution of the radius of a hole tends to be skewed toward its lower tolerance boundary, and the radius of a shaft toward its upper tolerance boundary, because one cannot add material to a part in a machining process. Thus incurable mistakes, an out-of-spec oversized hole or undersized shaft, can be avoided.

The parametric method is appealing because of its simplicity. However, it lacks the flexibility to accommodate these non-normal distributions. Nonparametric density estimation is a well established method. Its most appealing advantage is the flexibility to accommodate any unknown distribution, i.e., to "let the data speak for themselves", with much less dependence on modeling experience and assumptions.

We assume the process is in a stable state but the quality characteristic x is not necessarily normally distributed. The distribution of x is denoted by f(x). If we collect n samples from the process as a training data set, denoted x_i, i = 1, …, n, the density function can be expressed as the smooth Parzen estimate:
\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right) = \alpha(n,h) \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right) \qquad (5)

where K(z) is a kernel. Any smooth function satisfying K(z) ≥ 0, ∫K(z)dz = 1, ∫zK(z)dz = 0, and ∫z²K(z)dz ≠ 0 and < ∞ can be used as a kernel. h represents the window width of the kernel function, controlling the smoothness of f̂(x). The kernel K is a local weighting function, giving more weight to an observed point x that is closer to the training point x_i, with the weight decreasing as the distance from x_i increases. α(n, h) is a function of n and h that ensures ∫_{−∞}^{∞} f̂(x)dx = 1 (a density function). For simplicity, α(n, h) can be ignored in sample generation, as shown in the next section.

Eq. (5) is essentially a regression or function estimation technique, requiring very little model training. Another appealing feature is that important distribution patterns (skewness, multimodality, thick tails, etc.) are preserved.
There are several widely used kernels in the literature. The choice of kernel is less important than the choice of h for the behavior of f̂(x) (Marron et al., 1988). To accommodate different distributions, in this paper we propose using the Gaussian, tri-cube, and Epanechnikov quadratic kernels. The Gaussian kernel has infinite support, whereas the others have finite support, which is desirable for bounded (truncated) distributions. These kernels are:

Gaussian:
K\!\left(\frac{x - x_i}{h}\right) = \frac{1}{h\sqrt{2\pi}} \exp\!\left(-\frac{1}{2}\left(\frac{x - x_i}{h}\right)^{2}\right) \qquad (6)
Tri-cube:

K\!\left(\frac{x - x_i}{h}\right) = \begin{cases} \left(1 - \left|\frac{x - x_i}{h}\right|^{3}\right)^{3} & \text{if } \left|\frac{x - x_i}{h}\right| \le 1 \\ 0 & \text{otherwise} \end{cases} \qquad (7)
Epanechnikov quadratic:

K\!\left(\frac{x - x_i}{h}\right) = \begin{cases} \frac{3}{4}\left(1 - \left(\frac{x - x_i}{h}\right)^{2}\right) & \text{if } \left|\frac{x - x_i}{h}\right| \le 1 \\ 0 & \text{otherwise} \end{cases} \qquad (8)
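The kernel forms of Eqs. (6)-(8) and the estimator of Eq. (5) can be sketched in a few lines of Python. This is a minimal illustration, not the paper's original Matlab implementation; the function names are ours:

```python
import numpy as np

def gaussian_kernel(z):
    # Standard normal kernel; the 1/h factor of Eq. (6) is applied in kde()
    return np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)

def tricube_kernel(z):
    # Eq. (7): finite support on |z| <= 1. Note this form is not normalized
    # to integrate to 1; the alpha(n, h) of Eq. (5) absorbs this, and M-H
    # sampling (next section) only needs density ratios.
    z = np.abs(z)
    return np.where(z <= 1.0, (1.0 - z**3)**3, 0.0)

def epanechnikov_kernel(z):
    # Eq. (8): quadratic kernel with finite support on |z| <= 1
    return np.where(np.abs(z) <= 1.0, 0.75 * (1.0 - z**2), 0.0)

def kde(x, data, h, kernel=gaussian_kernel):
    # Eq. (5): f_hat(x) = 1/(n h) * sum_i K((x - x_i) / h)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    z = (x[:, None] - data[None, :]) / h
    return kernel(z).sum(axis=1) / (len(data) * h)
```

With the Gaussian or Epanechnikov kernel, `kde` integrates to 1 as required of a density; with the unnormalized tri-cube form it is proportional to a density, which suffices for the ratio-based M-H sampler.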
The bandwidth parameter h is the more important factor in density estimation, controlling the smoothness of f̂(x). When h is too large, important features of the underlying distribution (e.g., multimodality) are smoothed away. When h is too small, f̂(x) tends to be too wiggly, i.e., overfitting, representing the sample randomness rather than the true patterns. Various techniques have been developed for selecting an optimal bandwidth. Optimality here means achieving a tradeoff between the bias and the variance of f̂(x), i.e., a minimum estimation error. A common measure of the estimation error is the mean integrated squared error (MISE):
MISE(h) = E \int \left(\hat{f}_h(x) - f(x)\right)^{2} dx \qquad (9)
A review of optimal bandwidth selection was given in (Jones, Marron, and Sheather, 1996). These techniques are especially beneficial for automatic bandwidth determination in multiple estimates and dimensionality reduction, where manual selection is impractical. In this paper, we propose to use the "quick and dirty" Rule of Thumb method for bandwidth selection (Turlach, 1993; Jones et al., 1996), which can be used interactively with a visual choice of bandwidth. In process capability analysis one usually does not have many processes to be estimated simultaneously, because in manufacturing the process data are costly and the careful planning and experimental setup are time consuming. Another reason is simplicity of implementation, which makes the method more appealing and desirable for practitioners. If we take K as the Gaussian kernel, the Rule of Thumb optimal h can be derived based on the asymptotic mean integrated squared error (AMISE) (Silverman, 1986):
AMISE(h) = (nh)^{-1} R(K) + \frac{1}{4} h^{4} \mu_2^{2}(K)\, R(f^{(2)}) \qquad (10)
Minimizing AMISE with respect to h leads to the following optimal bandwidth:

h_o = \left[\frac{R(K)}{\mu_2^{2}(K)\, R(f^{(2)})}\right]^{1/5} n^{-1/5} \qquad (11)
where R(K) = ∫K²(x)dx, and µ₂²(K) = (∫x²K(x)dx)² is independent of the bandwidth h. f^(2) denotes the 2nd derivative of f. By using the standard normal distribution as a reference distribution to replace the unknown f in Eq. (11), the Rule of Thumb yields the estimate

h_o = 1.06\, \hat{\sigma}\, n^{-1/5} \qquad (12)
where σ̂² is the sample variance of the x_i. For other, non-Gaussian kernels, an equivalent h can be obtained by rescaling (Marron et al., 1988). Suppose we have estimated an unknown density f using the Gaussian kernel K_A and bandwidth h_o, and wish to estimate f with a different kernel, say K_B. Then the appropriate bandwidth h_B to use with K_B is

h_B = \frac{\delta_o^{B}}{\delta_o^{A}}\, 1.06\, \hat{\sigma}\, n^{-1/5} \qquad (13)

where the scale factors δ_o (the so-called canonical bandwidths) can be found in the literature; for the Gaussian kernel,

\delta_o^{A} = \left(\frac{1}{4\pi}\right)^{1/10} \cong 0.7764 \qquad (14)
h_o in Eq. (12) is sensitive to outliers in the x_i, which may cause a too-large estimate of σ and hence a too-large h_o, resulting in an oversmoothed f̂. A more robust alternative for estimating σ is to use the estimated quantile range R̂ = x_{0.75n} − x_{0.25n}, where x_{0.25n}, x_{0.75n} denote the 25% and 75% quantile points of the x_i. It can be shown that the robust estimate of σ is σ̂ = R̂/1.34. This yields a Better Rule of Thumb (robust) estimate of h_o:

h_o = 1.06\, \min\!\left(\hat{\sigma}, \frac{\hat{R}}{1.34}\right) n^{-1/5} \qquad (15)
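The Rule of Thumb bandwidths of Eqs. (12) and (15) are a one-liner each. A sketch in Python (function names are illustrative, not from the paper):

```python
import numpy as np

def rule_of_thumb_bandwidth(x):
    # Eq. (12): h_o = 1.06 * sigma_hat * n^(-1/5), Gaussian reference density
    n = len(x)
    return 1.06 * np.std(x, ddof=1) * n ** (-0.2)

def robust_bandwidth(x):
    # Eq. (15): h_o = 1.06 * min(sigma_hat, R_hat / 1.34) * n^(-1/5),
    # where R_hat is the 25%-75% quantile range; robust against outliers
    n = len(x)
    sigma = np.std(x, ddof=1)
    r = np.quantile(x, 0.75) - np.quantile(x, 0.25)
    return 1.06 * min(sigma, r / 1.34) * n ** (-0.2)
```

Since min(σ̂, R̂/1.34) ≤ σ̂, the robust bandwidth never exceeds the plain Rule of Thumb value, which is exactly the protection against outlier-inflated σ̂ described above.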
With the selected kernel and bandwidth h_o, Y_p and Y_pk can be estimated by plugging f̂(x) into Eq. (3) with the proposed numerical method, as presented below.

The statistical properties of the kernel estimate reveal the effects of the kernel function and the factor h on how well f̂(x) approximates the true f(x). The bias of the estimate is expressed as

\mathrm{Bias}(\hat{f}(x)) = \frac{h^{2}}{2}\, f^{(2)}(x)\, \mu_2(K) + o(h^{2}), \quad h \to 0 \qquad (16)

The variance of the estimate is

\mathrm{Var}(\hat{f}(x)) = \frac{1}{nh}\, R(K)\, f(x) + o\!\left(\frac{1}{nh}\right), \quad nh \to \infty \qquad (17)
The optimal bandwidth h_o is set to balance these two estimation errors. In conformity analysis we are more interested in the tail bias, since it directly affects the nonconformity estimate. Thus, once h_o is determined, µ₂(K) is another independent control factor, which determines how weight is allocated along the tails. A larger µ₂(K) implies that more weight is put on the tails, where

\mu_2(K) = \int x^{2} K(x)\, dx
Numerical Monte Carlo integration was conducted to investigate the µ₂(K) effects, which depend only on the kernel K. The results are summarized in Table 1 below.
Table 1: µ₂(K) evaluation for three kernel functions*

Kernel | Equation | µ₂(K)
Epanechnikov quadratic | Eq. (8) | 0.19841
Tri-cube | Eq. (7) | 0.16639
Gaussian | Eq. (6) | 0.98934

* 10,000 Monte Carlo samples were used for the µ₂(K) integration.
The results in Table 1 mean that the Gaussian kernel puts more weight on the tails than the other two kernels. For the same h_o, the Gaussian kernel may cause more bias error at the tails. A similar observation can be found in the simulation case study in Section 5.
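A Table-1-style Monte Carlo estimate of µ₂(K) is easy to reproduce. The sketch below uses plain uniform sampling over a chosen integration interval; the paper does not specify its sampling scheme, so the interval choice and function names here are our assumptions:

```python
import numpy as np

def mu2_mc(kernel, support, n_samples=100_000, seed=1):
    # Monte Carlo estimate of mu_2(K) = integral of x^2 K(x) dx:
    # sample x ~ U(a, b) and average (b - a) * x^2 K(x)
    a, b = support
    rng = np.random.default_rng(seed)
    x = rng.uniform(a, b, n_samples)
    return (b - a) * np.mean(x**2 * kernel(x))

def epanechnikov(x):
    return np.where(np.abs(x) <= 1, 0.75 * (1 - x**2), 0.0)

def tricube(x):
    return np.where(np.abs(x) <= 1, (1 - np.abs(x)**3)**3, 0.0)

def gaussian(x):
    return np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)

# Exact values for comparison with Table 1: 1/5 (Epanechnikov),
# 1/6 (tri-cube), 1 (Gaussian); the Gaussian integral is truncated
# to a wide finite interval, e.g. [-6, 6], with negligible error.
```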
METROPOLIS-HASTINGS (M-H) SAMPLING AND CONFORMITY ESTIMATION
Monte Carlo integration provides a generic and flexible method for evaluating Eq. (3). With a Gaussian kernel density estimate, Polansky (1998, 2000) gave closed-form expressions for the process conformity. In this paper, an alternative method, Metropolis-Hastings (M-H) sampling, is proposed. The idea is to draw a large number M of random samples that follow the distribution f̂(x), and then use them to estimate the conformity probability. In comparison, the M-H method is more flexible, easy to implement, and independent of the kernel function and the type of specification region. It is more desirable for multivariate applications where the design spaces (specification regions) are irregular, making direct integration infeasible. Since the training data collected from a manufacturing process are usually costly and limited (e.g., N ≤ 100), the computation cost of M-H for density evaluation at M target points is linear, i.e., O(M). The M-H approach involves the following steps: i) generating random samples from f̂(x); ii) counting the fall-out (nonconforming) samples; iii) calculating the yield as the ratio of the number of conforming samples to the total number of samples. The one-sided fall-outs can also easily be counted, which provides off-center information and is useful in estimating the optimal position of the process (for Y_p). In addition, the sampling method can accommodate both regular and complex irregular specification regions in multivariate processes.
Since the shape of the estimated density f̂(x) can be complex, there is no straightforward way to draw samples directly from f̂(x) as in Monte Carlo simulation with well-known parametric models such as the normal, uniform, Gamma, etc. The Metropolis-Hastings sampling algorithm is therefore adopted to draw samples from f̂(x).
The M-H algorithm requires a target density and a proposal density. The former, in evaluating Eq. (1), is defined as f̂(x). The latter, a conditional density denoted q(·|x), is selected to be easy to simulate and explicitly available. Commonly used proposals include the uniform (U(a, b)) and the normal. The uniform proposal is independent of x, i.e., q(y|x) = q(y), and the normal proposal

q(y|x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(y - x)^{2}}{2\sigma^{2}}\right)

is symmetric. The procedure of Metropolis-Hastings sampling is as follows (Robert and Casella, 2006). With an initial point x^(0):

Step 1: Repeat for t = 1, 2, …, N with current x^(t).
Step 2: Draw a new sample Y_t from q(y|x^(t)).
Step 3: Draw a sample u ~ U(0, 1).
Step 4: Calculate

a_1 = \frac{\hat{f}(Y_t)}{\hat{f}(x^{(t)})}, \qquad a_2 = \frac{q(x^{(t)}|Y_t)}{q(Y_t|x^{(t)})}, \qquad a = a_1 a_2 \qquad (18)

Step 5: Update the current sample: if a ≥ 1, let x^(t+1) = Y_t, i.e., accept the new sample; otherwise, if a > u, let x^(t+1) = Y_t, accepting the new sample; if a ≤ u, let x^(t+1) = x^(t), rejecting the new sample.

In effect, Steps 4-5 update the sample with acceptance probability ρ(x, y), i.e.,

x^{(t+1)} = \begin{cases} Y_t & \text{with probability } \rho(x^{(t)}, Y_t) \\ x^{(t)} & \text{with probability } 1 - \rho(x^{(t)}, Y_t) \end{cases} \qquad (19)

where \rho(x, y) = \min\!\left\{ \frac{\hat{f}(y)}{\hat{f}(x)}\, \frac{q(x|y)}{q(y|x)},\; 1 \right\}. Uniform and normal proposals are symmetric, i.e., q(x|y) = q(y|x); thus q(x|y)/q(y|x) ≡ 1 and ρ(x, y) = min{ f̂(y)/f̂(x), 1 }.
The above M-H algorithm produces a series of samples x^(t), which forms a Metropolis-Hastings Markov chain with stationary distribution f̂(x). After a sufficiently long initial burn-in period, the x^(t), t > N_B, can be taken as random samples from f̂(x).
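Steps 1-5 above can be sketched in Python for a Gaussian-kernel target and a symmetric normal proposal, so that the acceptance ratio reduces to min{f̂(y)/f̂(x), 1}. The function names and default settings below are illustrative, not from the paper:

```python
import numpy as np

def kde_unnorm(x, data, h):
    # Target density proportional to Eq. (5); the normalizing alpha(n, h)
    # is dropped, since M-H (Eqs. (18)-(19)) only uses density ratios
    z = (x - data) / h
    return float(np.sum(np.exp(-0.5 * z**2)))

def mh_sample(data, h, n_samples=50_000, burn_in=10_000,
              proposal_sd=None, seed=2):
    # Metropolis-Hastings chain with a symmetric normal proposal q(y|x),
    # so rho(x, y) = min(f_hat(y) / f_hat(x), 1)
    rng = np.random.default_rng(seed)
    if proposal_sd is None:
        proposal_sd = np.std(data, ddof=1)
    x = float(np.mean(data))            # initial point x(0)
    fx = kde_unnorm(x, data, h)
    chain = np.empty(n_samples)
    for t in range(burn_in + n_samples):
        y = rng.normal(x, proposal_sd)  # Step 2: draw from q(y|x)
        fy = kde_unnorm(y, data, h)
        # Steps 3-5: accept the move with probability min(fy/fx, 1)
        if rng.uniform() < fy / fx:
            x, fx = y, fy
        if t >= burn_in:
            chain[t - burn_in] = x
    return chain
```

After the burn-in, `chain` is (approximately) a dependent sample from f̂, ready for the conformity counting of Eqs. (20)-(21).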
As shown in Fig. 1, Y_p and Y_pk are estimated by counting the nonconforming M-H samples x^(t):

Y_{pk} = 1 - \frac{\#\,\text{nonconforming samples}}{\#\,\text{total samples after burn-in}} \qquad (20)
We can count the numbers of nonconforming samples beyond the USL and below the LSL separately. Denote I_{>USL} = # of nonconforming samples beyond the upper specification limit and I_{<LSL} = # of nonconforming samples below the lower specification limit. We designed the following procedure to estimate the potential capability (yield) Y_p: if I_{>USL} > I_{<LSL}, shift the process x^(t) down and update I_{>USL}, I_{<LSL} iteratively until I_{>USL} ≈ I_{<LSL}; or, if I_{>USL} < I_{<LSL}, shift the process x^(t) up and update I_{>USL}, I_{<LSL} iteratively until I_{>USL} ≈ I_{<LSL}. For convenience, instead of shifting the process up or down in the calculation, we shift the relative position of the specification limits and record the amount of shift required. With the updated I_{>USL}, I_{<LSL}, we have
Y_{p} = 1 - \frac{I_{>USL} + I_{<LSL}}{\#\,\text{total samples after burn-in}} \qquad (21)
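The count-and-shift estimates of Eqs. (20)-(21) can be sketched as follows. Instead of the iterative balancing of the one-sided counts, this sketch scans a grid of spec-limit shifts and keeps the maximum yield, which matches the definition of Y_p in Eq. (3); the helper names and grid size are our assumptions:

```python
import numpy as np

def yield_pk(samples, lsl, usl):
    # Eq. (20): Y_pk = 1 - (# nonconforming) / (# total after burn-in)
    out = np.sum(samples > usl) + np.sum(samples < lsl)
    return 1.0 - out / len(samples)

def yield_p(samples, lsl, usl, n_steps=200):
    # Eq. (21): shift the spec limits (equivalently, the process) and keep
    # the shift that maximizes yield; the shift itself is returned, since
    # it can guide process adjustment (centering)
    shifts = np.linspace(samples.min() - usl, samples.max() - lsl, n_steps)
    best_y, best_shift = -1.0, 0.0
    for d in shifts:
        y = yield_pk(samples, lsl + d, usl + d)
        if y > best_y:
            best_y, best_shift = y, d
    return best_y, best_shift
```

The recorded M-H samples are reused for every candidate shift, so no extra M-H simulation is needed, as noted below.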
This procedure can be conducted by retrieving the recorded M-H samples x^(t); no extra M-H simulation is needed. The resulting total amount of shift can be recorded and used to guide process adjustment, similar to centering a process.

SIMULATION CASE STUDY
Two Beta distributions, a mixture of two normal distributions representing a bimodal case, and a standard normal distribution were selected to simulate typical distributions in applications. The Beta distributions are selected because they can be skewed and have finite support, representing truncated and skewed distributions. The mixture of normal distributions represents multimodal processes and has infinite support (infinite tails). The standard normal pdf can be taken as an extreme case for Gaussian kernel density estimation, i.e., using one kernel function to fit all data points. For convenience we compare only the conformities of the benchmark, the kernel based M-H sampling approach (K-M-H), and the parametric model method, i.e., Cp, Cpk analysis. The Cp, Cpk are translated into the corresponding conformity using Eqs. (22)-(24) below. The benchmarks were set by generating 100 million samples directly from Beta(2,1), Beta(2,6), a bi-normal mixture, and normal N(0,1). Monte Carlo simulations were conducted and the conformities were checked by counting the nonconforming samples (Eqs. (20)-(21)). The benchmark distributions are shown in Fig. 2, and the conformity results in Table 2. For comparison, we use traditional process capability analysis to estimate
C_p = \frac{USL - LSL}{6\hat{\sigma}}, \qquad C_{pk} = \min\!\left(\frac{USL - \hat{\mu}}{3\hat{\sigma}},\; \frac{\hat{\mu} - LSL}{3\hat{\sigma}}\right) \qquad (22)
The underlying assumption is normality of the quality characteristics. The equivalent potential conformity (associated with Cp) is:
Y_p = 2\,\Phi(3 C_p) - 1 \qquad (23)
And the actual capability (yield) is:

Y_{pk} = \Phi\!\left(\frac{USL - \hat{\mu}}{\hat{\sigma}}\right) - \Phi\!\left(\frac{LSL - \hat{\mu}}{\hat{\sigma}}\right) \qquad (24)
where µ̂, σ̂ are estimated from N = 100 Monte Carlo samples generated from the specified distributions. These samples were also used for the kernel density estimation and M-H sampling. We used these distributions to generate N = 100 samples x_i, i = 1, 2, …, N, to simulate the individual quality characteristic data from these processes. The training data size corresponds to a sampling plan of 25 subgroups with n = 4 samples in each subgroup for the phase I SPC charting process (to ensure the process is in-control). The procedure in Fig. 1 was then followed to estimate the density and yield. In the M-H sampling, the initial 10,000 samples were taken as the burn-in process. The following samples were split into 30 sample sets, each containing 40,000 samples. From each sample set the conformity (yield) was estimated; the average of these 30 conformities gives the conformity results in Table 2 below. The last column in Table 2 gives the relative errors, i.e.,
\varepsilon\% = \frac{|\text{Estimated Conformity} - \text{Benchmark Conformity}|}{\text{Benchmark Conformity}} \times 100\%
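The Cp/Cpk-to-conformity translation of Eqs. (22)-(24) is a direct transcription. A sketch using only the Python standard library (function names are illustrative):

```python
import math

def cp_cpk(mu_hat, sigma_hat, lsl, usl):
    # Eq. (22)
    cp = (usl - lsl) / (6.0 * sigma_hat)
    cpk = min((usl - mu_hat) / (3.0 * sigma_hat),
              (mu_hat - lsl) / (3.0 * sigma_hat))
    return cp, cpk

def phi(z):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def yields_from_normal_model(mu_hat, sigma_hat, lsl, usl):
    cp, _ = cp_cpk(mu_hat, sigma_hat, lsl, usl)
    yp = 2.0 * phi(3.0 * cp) - 1.0                # Eq. (23)
    ypk = (phi((usl - mu_hat) / sigma_hat)
           - phi((lsl - mu_hat) / sigma_hat))     # Eq. (24)
    return yp, ypk
```

For a perfectly centered normal process with limits at ±3σ, both formulas give the familiar 99.73% conformity; the non-normal rows of Table 2 show how far these normal-model values can drift from the benchmark.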
Table 2: Comparison of process capability indices Y_pk, Y_p with different approaches

Method | Pdf | USL/LSL | Y_pk / Y_p | 95% CI of Y_pk | Errors ε% (Y_pk / Y_p)

Benchmark via simulation:
Beta(2,1) | 0.97/0.03 | 0.9432 / 0.9984 | — | 0% / 0%
Beta(2,6) | 0.97/0.03 | 0.986 / 0.9999 | — | 0% / 0%
Bi-normal | 9.5/−3 | 0.9987 / — | — | 0% / 0%
N(0,1) | 3/−3 | 0.9973 / 0.9973 | — | 0% / 0%

Epanechnikov quadratic kernel:
Beta(2,1) | 0.97/0.03 | 0.9534 / 0.9999 | [0.9451, 0.9617] | 1.08% / 0.16%
Beta(2,6) | 0.97/0.03 | 0.9835 / 1.0 | [0.9775, 0.9862] | 0.25% / 0%
Bi-normal | 9.5/−3 | 0.9981 / 0.9997 | [0.9975, 0.9987] | 0.06% / 0.02%
N(0,1) | 3/−3 | 1.0 / 1.0 | [1.0, 1.0] | 0.27% / 0.27%

Tri-cube kernel:
Beta(2,1) | 0.97/0.03 | 0.9510 / 0.9944 | [0.9428, 0.9593] | 0.83% / 0.4%
Beta(2,6) | 0.97/0.03 | 0.98775 / 1.0 | [0.9838, 0.9917] | 0.18% / 0%
Bi-normal | 9.5/−3 | 0.99986 / 0.9999 | [0.9997, 1.000] | 0.12% / 0%
N(0,1) | 3/−3 | 1.0 / 1.0 | [1.0, 1.0] | 0.27% / 0.27%

Gaussian kernel:
Beta(2,1) | 0.97/0.03 | 0.9563 / 0.9852 | [0.9503, 0.9623] | 1.39% / 1.32%
Beta(2,6) | 0.97/0.03 | 0.9614 / 1.0 | [0.9547, 0.9681] | 2.49% / 0%
Bi-normal | 9.5/−3 | 0.97939 / 0.9903 | [0.9775, 0.9813] | 1.93% / 0.96%
N(0,1) | 3/−3 | 0.9967 / 0.9946 | [0.9957, 0.9978] | 0.06% / 0.27%

Parametric model via Cp, Cpk analysis:
Beta(2,1) | 0.97/0.03 | 0.8920 / 0.9532 | — | 5.43% / 4.52%
Beta(2,6) | 0.97/0.03 | 0.9210 / 0.9987 | — | 6.59% / 0.13%
Bi-normal | 9.5/−3 | 0.9669 / 0.9737 | — | 3.18% / 2.62%
N(0,1) | 3/−3 | 0.9983 / 0.9985 | — | 0.1% / 0.12%
The 95% confidence intervals were estimated from the 30 sample sets. From the results and computation experience, the following was observed:
1. Among the three K-M-H algorithms, the Epanechnikov quadratic and tri-cube kernels give better results on all non-normally distributed processes. The Gaussian kernel tends to overestimate the nonconformity, i.e., it is too conservative. This spillover effect is caused by its relatively large µ₂(K), i.e., over-emphasizing the tails, as presented in Table 1 of Section 3. However, the Gaussian kernel does produce smoother density estimates.
2. The conformity results of all three K-M-H algorithms are superior to the traditional parametric normal model method, except for the perfectly normal distribution. For the non-normal scenarios the conformity estimates are significantly different.
3. The quick-and-dirty bandwidth selection works well in the case study.
4. The computation cost is trivial. It depends on the sample size; in most cases it takes about 30 seconds for about 600,000 kernel function evaluations in M-H sampling.
5. The estimated Y_p, Y_pk are random in the sense that they rely on the actual sampling process, i.e.,
the samples x_i drawn directly from the unknown process model, and also on the M-H sampling process.
6. Another advantage over the traditional approach is that the M-H procedure is effectively bootstrap sampling from the nonparametric kernel density model. Hence, confidence intervals can be estimated simultaneously.
FIGURE 2 - DISTRIBUTIONS: (a) BETA(2,1) DISTRIBUTION; (b) BETA(2,6) DISTRIBUTION; (c) MIXTURE OF N(0,1) AND N(5,1.5); (d) NORMAL N(0,1)
In the following figures (Figs. 3-11), panel (a) shows f̂(x), (b) the M-H sample distribution, (c) the training samples (n = 100), and (d) the 30 estimated yields.
0 10 20 30 400
5
10
15
20
0 10 20 30 400
5
10
15
20
0 0.5 10
5
10
15
20
25
0 10 20 300.945
0.95
0.955
0.96
0.965
0.97
FIGURE 3 - BETA(2,1) WITH EPAN: YPK=0.9534, 95% C.I= [0.94511, 0.9617]
0 10 20 30 400
0.5
1
1.5
2
0 10 20 30 400
0.5
1
1.5
2
0 0.5 10
5
10
15
20
25
0 10 20 300.945
0.95
0.955
0.96
0.965
0.97
FIGURE 4 - BETA(2,1) WITH GAUSSIAN: YPK=0.9563, 95% C.I= [0.95034, 0.9623]
0 10 20 30 400
10
20
30
40
0 10 20 30 400
10
20
30
40
0 0.5 10
10
20
30
0 10 20 300.93
0.94
0.95
0.96
FIGURE 5 - BETA(2,1) WITH TRI: YPK=0.9510, 95% C.I= [0.9428, 0.9593]
0 10 20 30 400
5
10
15
20
25
0 10 20 30 400
5
10
15
20
25
0 0.2 0.4 0.6 0.80
5
10
15
20
25
0 10 20 300.975
0.98
0.985
0.99
FIGURE 6 - BETA(2,6) WITH EPAN: YPK=0.98185, 95% C.I= [0.97745, 0.98624]
(a) (b)
(c) (d)
(b) (a)
(d) (c)
(a) (b)
(c) (d)
(a) (b)
(c) (d)
(a) (b)
(c) (d)
7 Copyright © 2012 by ASME
Downloaded From: http://proceedings.asmedigitalcollection.asme.org/ on 05/14/2014 Terms of Use: http://asme.org/terms
0 10 20 30 400
0.5
1
1.5
2
2.5
0 10 20 30 400
0.5
1
1.5
2
2.5
0 0.2 0.4 0.6 0.80
5
10
15
20
0 10 20 300.95
0.955
0.96
0.965
0.97
0.975
FIGURE 7 - BETA(2,6) WITH GAUSSIAN: YPK=0.9614, 95% C.I= [0.95473, 0.96814]
FIGURE 8 - BETA(2,6) WITH TRI: YPK = 0.98775, 95% C.I. = [0.98383, 0.99167]
FIGURE 9 - BI-NORMAL WITH EPAN: YPK = 0.9981, 95% C.I. = [0.99753, 0.99867]
FIGURE 10 - BI-NORMAL WITH GAUSSIAN: YPK = 0.97939, 95% C.I. = [0.97751, 0.98128]
FIGURE 11 - BI-NORMAL WITH TRI: YPK = 0.99986, 95% C.I. = [0.9997, 1]
MATLAB m-file codes were developed, and simulations were conducted on a Genuine Intel(R) CPU T2500 @ 2.0 GHz with 2046 MB RAM under Windows XP. Computation cost is not a serious concern: most of the calculations complete within one minute.

SUMMARY
The process capability indices Cp, Cpk have been widely accepted and used in industry for process capability evaluation. However, Cp, Cpk analysis may produce ambiguous and misleading interpretations when the normality assumption is even slightly or moderately violated. A new method for process capability evaluation is proposed based on a nonparametric model and the Markov chain Monte Carlo technique. The new method directly defines conformity, or yield, as the process capability index, which avoids ambiguity and misinterpretation. Kernel density estimation and Metropolis-Hastings sampling, a popular sampling algorithm in
Markov chain Monte Carlo, were adopted for kernel pdf model checking and conformity analysis. Yield computation in conformity analysis involves a multivariate integral. If prior experience or knowledge provides sufficient fidelity to the kernel pdf model, the model checking by M-H sampling can be avoided; Monte Carlo or quasi-Monte Carlo [25] methods, as well as other space-filling techniques [26], can then be applied for highly efficient conformity (yield) computation. Four distributions, representing truncated, skewed, multimodal, and perfectly normal densities, were used in the simulations for validation. The results show that the proposed K-M-H approach to conformity estimation has the following features:
1. Assumption-free: the kernel estimation strategy is subject to no prior model assumption, i.e., it lets the data speak for themselves. This is very appealing to practitioners who are concerned about how to interpret capability indices and how much violation of normality should be tolerated.
2. The M-H sampling method serves both model checking and conformity analysis; once sufficient model fidelity is established, other techniques can provide more efficient yield calculation algorithms;
3. The method can also be easily extended to multivariate processes;
4. More accurate conformity estimation than the currently used parametric model methods, especially when normality is violated;
5. Coherent and unambiguous interpretation of process capability by using conformity indices Yp, Ypk;
6. Superiority of the Epanechnikov quadratic and tri-cube kernels over the Gaussian kernel in conformity estimation for the non-normal distributions in the simulation case study;
7. The quick and dirty bandwidth selection approach works well in the simulation case study;
8. Coding and implementation: once h0 is determined, the expression of the kernel density is straightforward, and the M-H sampling code is extremely simple;
9. Computation cost for process capability estimation is negligible.
Despite being assumption-free, the method can still benefit from prior knowledge about the process density. For instance, if process knowledge suggests a truncated distribution, a kernel with finite support, such as the Epanechnikov quadratic or tri-cube kernel, is the better choice to prevent spillover effects that lead to overestimated nonconformity. The quick-and-dirty selection of h0 is preferred because it avoids computation-intensive optimal h0 searching and additional mathematical involvement. In addition, the contribution of bias and variance error from the tails is much smaller because f(x) and f(2)(x) in Eqs. (16) and (17) are normally very small compared with the central region (e.g., with peaks and valleys) of f(x). Thus, the tedious effort of minimizing MISE or AMISE in Eqs. (9) and (10) may yield only a trifling improvement in the density tails, making it not worthwhile.
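The paper's "quick and dirty" h0 formula is defined in an earlier section and not restated here. As a hedged illustration of what such a rule-of-thumb bandwidth typically looks like, the sketch below uses Silverman's rule, which may differ from the paper's exact choice of h0.

```python
import numpy as np

def rule_of_thumb_bandwidth(x):
    """Silverman-style rule-of-thumb bandwidth: one common
    'quick and dirty' choice (the paper's exact h_o may differ)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    sigma = x.std(ddof=1)
    # Robust spread: the smaller of the sample std and IQR/1.34.
    iqr = np.subtract(*np.percentile(x, [75, 25]))
    spread = min(sigma, iqr / 1.34) if iqr > 0 else sigma
    return 0.9 * spread * n ** (-1 / 5)

# Example: bandwidth for 100 training samples from a Beta(2,6) process.
rng = np.random.default_rng(2)
x = rng.beta(2, 6, size=100)
h0 = rule_of_thumb_bandwidth(x)
print(h0)
```

Because the rule depends only on the sample size and a robust spread estimate, it requires no iterative MISE/AMISE minimization, which is the point made above.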
The proposed technique can find applications in most multivariate quality control problems, such as those in the semiconductor, process, pharmaceutical, and general manufacturing industries. One specific application area is manufacturing quality control with geometric dimensioning and tolerancing (GD&T) requirements, in which the interrelated tolerances and the complexity of the multivariate statistical model prevent current tolerance techniques from being a viable option.
ACKNOWLEDGEMENTS
The authors gratefully acknowledge the financial support of the National Science Foundation under awards CMMI-0928609 and CMMI-0927557.
REFERENCES
[1] Polansky, A. M., 1998, "A smooth nonparametric approach to process capability", Qual. Reliab. Engng. Int., 14, pp. 43-48.
[2] Polansky, A. M., 2000, "An algorithm for computing a smooth nonparametric capability estimate", J. of Qual. Tech., 32, pp. 284-289.
[3] Wand, M. P. and M. C. Jones, 1995, Kernel Smoothing, Vol. 60 of Monographs on Statistics and Applied Probability, Chapman and Hall, London.
[4] Kotz, S. and N. L. Johnson, 1993, Process Capability Indices, Chapman and Hall, London.
[5] Kotz, S. and N. L. Johnson, 2002, "Process capability indices—A review, 1992-2000", J. Qual. Tech., 34(1), pp. 2-19.
[6] Robert, C. P. and G. Casella, 2006, Monte Carlo Statistical Methods, Springer-Verlag, New York.
[7] Palmer, K. and K. Tsui, 1999, "A review and interpretations of process capability indices", Annals of Operations Research, 87, pp. 31-47.
[8] Silverman, B. W., 1986, Density Estimation for Statistics and Data Analysis, Vol. 26 of Monographs on Statistics and Applied Probability, Chapman and Hall, London.
[9] Turlach, B. A., 1993, "Bandwidth selection in kernel density estimation: A review", Discussion Paper 9307, Institut für Statistik und Ökonometrie, Humboldt-Universität zu Berlin.
[10] Tsui, K., 1997, "Interpretation of process capability indices and some alternatives", Qual. Engng, 9(4), pp. 587-596.
[11] Clements, J. A., 1989, "Process capability calculation for non-normal distributions", Qual. Progress, 22, pp. 95-100.
[12] Flaig, J. J., 2002, "Process Capability Optimization", Quality Engineering, 15(2), pp. 233-242.
[13] Flaig, J. J., 1999, "Process Capability Sensitivity Analysis", Quality Engineering, 11, pp. 587-592.
[14] Marron, J. S. and D. Nolan, 1988, "Canonical kernels for density estimation", Statistics & Probability Letters, 7(3), pp. 195-199.
[15] Kane, V. E., 1986, "Process Capability Indices", Journal of Quality Technology, 18, pp. 41-52.
[16] Sullivan, L. P., 1984, "Reducing variability: A new approach to quality", Quality Progress, 17(7), pp. 15-21.
[17] Yeh, A. B. and S. Bhattacharya, 1998, "A robust capability index", Communications in Statistics—Simulation and Computation, 27, pp. 565-589.
[18] Yeh, A. B. and H. Chen, 2001, "A nonparametric multivariate process capability index", International Journal of Modeling & Simulation, 21(8), pp. 218-223.
[19] Carr, W. E., 1991, "A new process capability index: parts per million", Quality Progress, 24(2), pp. 152-154.
[20] Pal, S., 2005, "Evaluation of Nonnormal Process Capability Indices using Generalized Lambda Distribution", Quality Engineering, 17, pp. 77-85.
[21] Huang, W., T. Phoomboplab, and D. Ceglarek, 2008, "Process Capability Surrogate Model Based Tolerance Synthesis of Multi Station Manufacturing Systems (MMS)", IIE Transactions, special issue on Quality Control and Improvement for Multistage Systems, 41, pp. 309-322.
[22] Jones, M. C., J. S. Marron, and S. J. Sheather, 1996, "Progress in data-based bandwidth selection for kernel density estimation", Computational Statistics, 11, pp. 337-381.
[23] Bisgaard, S. and S. Graves, 1997, "A Negative Process Capability Index from Assembling Good Components? A Problem in Statistical Tolerancing", Qual. Engng, 10(2), pp. 409-414.
[24] Huang, W., B. R. Konda, and Z. Kong, 2010, "Geometric Tolerance Simulation Model for Rectangular and Circular Planar Features", Trans. of NAMRI/SME, 38.
[25] Huang, W., D. Ceglarek, and Z. G. Zhou, 2004, "Using Number-Theoretical Net Method (NT-net) in Tolerance Analysis", International Journal of Flexible Manufacturing Systems, 6(1), pp. 65-90.
[26] Huang, W., Z. Kong, and A. Chennamaraju, 2010, "Fixture Robust Design by Sequential Space Filling Methods in Multi-Station Manufacturing Systems", ASME Trans. Journal of Computing & Information Science in Engineering (JCISE), 10, pp. 041001-1 ~ 041001-11.
[27] Kong, Z., W. Huang, and A. Oztekin, 2009, "Stream of Variation Analysis for Multiple Station Assembly Process with Consideration of GD&T Factors", ASME Trans. Journal of Manufacturing Science and Engineering, 131, pp. 51010-51020.