IIE Transactions (2011) 43, 471–482
Copyright © "IIE"
ISSN: 0740-817X print / 1545-8830 online
DOI: 10.1080/0740817X.2010.532854
A cautious approach to robust design with model parameter uncertainty
DANIEL W. APLEY1,∗ and JEONGBAE KIM2

1Department of Industrial Engineering & Management Sciences, Northwestern University, Evanston, IL 60208-3119, USA
E-mail: [email protected]
2Telecom Headquarters, 206 Jungja-Dong, Bundang-Gu, Seongnam, Kyunggi, Korea, 463-711
E-mail: [email protected]

Received September 2009 and accepted August 2010
Industrial robust design methods rely on empirical process models that relate an output response variable to a set of controllable input variables and a set of uncontrollable noise variables. However, when determining the input settings that minimize output variability, model uncertainty is typically neglected. Using a Bayesian problem formulation similar to what has been termed cautious control in the adaptive feedback control literature, this article develops a cautious robust design approach that takes model parameter uncertainty into account via the posterior (given the experimental data) parameter covariance. A tractable and interpretable expression for the posterior response variance and mean square error is derived that is well suited for numerical optimization and that also provides insight into the impact of parameter uncertainty on the robust design objective. The approach is cautious in the sense that, as parameter uncertainty increases, the input settings are often chosen closer to the center of the experimental design region or, more generally, in a manner that mitigates the adverse effects of parameter uncertainty. A brief discussion on an extension of the approach to consider model structure uncertainty is presented.
Keywords: Robust parameter design, cautious control, model uncertainty, Bayesian estimation, quality control, variation reduction, Six Sigma
1. Introduction
In robust parameter design, which has received considerable attention from academia and industry, one optimally selects the levels of a set of controllable variables (a.k.a. inputs) in order to minimize variability in an output response variable, while keeping the mean of the response variable close to a target. The component of the response variability that can be affected by adjusting the inputs is typically assumed to be due to a set of uncontrollable (a.k.a. noise) variables. Hence, minimizing response variability amounts to choosing the inputs so that the output response is robust or insensitive to variations in the noise variables. Two main approaches to this problem are Taguchi's robust parameter design (Taguchi, 1986; Nair, 1992; Wu and Hamada, 2000), which employs signal-to-noise ratios and crossed-array experimental designs, and response surface methodology in conjunction with combined-array designs (Vining and Myers, 1990; Shoemaker et al., 1991; Myers et al., 1992; Lucas, 1994; Khattree, 1996).
∗Corresponding author

This article is focused on the response surface approach, which is often advocated because of its stricter adherence to well-established techniques for statistical modeling, analysis, and experimental design. Consider the following response surface model, which is widely assumed in robust design studies (Myers and Montgomery, 2002). The output response y is represented as
y = α + β′g(x) + γ′w + w′Bx + ε (1)
where x = [x1, x2, . . . , xp]′ is a vector of p controllable input variables, w = [w1, w2, . . . , wm]′ is a vector of m uncontrollable noise variables, and ε is the model residual error. It is assumed that w is random with mean zero and known covariance matrix Σw (typically diagonal) and that ε is normally distributed with mean zero and variance σ², independent of w. Each element of the l-length vector g(x) = [g1(x), g2(x), . . . , gl(x)]′ is a known function of the p controllable input variables. The scalar α, the l-length vector β, the m-length vector γ, and the m × p matrix B comprise the model parameters (excluding σ, which we treat differently), which we denote collectively by the vector θ = [α β′ γ′ b′1 b′2 · · · b′p]′, where bi denotes the i-th column of B. If g(x) = x, for example, the model includes the main effects of x and its interactions with w. A more common choice for g(x) in robust design studies is g(x) =
[x1, x2, . . . , xp, x1², x2², . . . , xp², x1x2, x1x3, . . . , x1xp, x2x3, . . . , xp−1xp]′, in which case the model is full quadratic in x.

The standard dual response approach is to select x to minimize
Varε,w(y | θ, σ) = (γ + Bx)′Σw(γ + Bx) + σ², (2)

subject to the constraint that the mean

Eε,w(y | θ, σ) = α + β′g(x) (3)

equals some specified target T. Alternatively, one may select x to minimize a Mean Square Error (MSE) objective function Eε,w[(y − T)² | θ, σ]. The subscripts on the variance and expectation operators indicate which random variables the operations are with respect to, and we have written them as conditioned on the model parameters θ and σ. The reason for the latter is that when optimizing x, one generally estimates the parameters using experimental design and analysis techniques and then views the estimates as if they were the true parameters. Hence, parameter uncertainty due to estimation error is generally neglected, which, as noted in Shoemaker et al. (1991), could actually result in an increase in the response variance.
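The CE quantities in Equations (2) and (3) are direct to evaluate for any candidate x. A minimal numerical sketch (the function name and the illustrative parameter values are ours, not from the article):

```python
import numpy as np

def ce_mean_var(x, alpha, beta, gamma, B, Sigma_w, sigma2, g=lambda x: x):
    """Mean (Eq. 3) and variance (Eq. 2) of y given the model parameters."""
    mean = alpha + beta @ g(x)
    v = gamma + B @ x            # sensitivity of y to the noise variables w
    var = v @ Sigma_w @ v + sigma2
    return mean, var

# Illustrative values (ours): p = 2 inputs, m = 1 noise variable
alpha, beta = 7.5, np.array([0.1, -0.05])
gamma, B = np.array([-0.06]), np.array([[0.02, 0.04]])
Sigma_w, sigma2 = np.eye(1), 0.03
mean, var = ce_mean_var(np.array([0.5, -0.5]), alpha, beta, gamma, B, Sigma_w, sigma2)
```

The vector γ + Bx is the gradient of y with respect to w, so choosing x to shrink it is exactly what makes the response insensitive to the noise variables.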
This article develops a method for taking parameter uncertainty into consideration when choosing the input settings. The objective is to find robust design input settings for which the response is robust to parameter estimation errors, as well as to the noise w. The Bayesian MSE, which is referred to as the Cautious Robust Design (CRD) objective function in this article, is minimized. Here, the CRD objective function is defined as

JCRD(x) = Eε,w,θ,σ[(y − T)² | Y], (4)

where Y denotes the observed response values over the experiment from which the parameters are estimated. The subscripts θ and σ are added on the expectation operator to indicate that it is with respect to the posterior distribution of the parameters, given the data Y, in addition to the distributions of w and ε. It will be shown that JCRD(x) can be expressed as a quite tractable function of x, θ̂, σ̂², Σθ, Σw, and T, where θ̂ and σ̂² denote the posterior means (point estimates) of θ and σ², and Σθ denotes the posterior covariance matrix of θ. Thus, minimizing JCRD(x) will yield optimal x settings that are a function of the posterior covariance Σθ, thereby taking into account parameter uncertainty. The Bayesian strategy of minimizing an objective function of the form of Equation (4) bears close resemblance to what has been referred to as cautious control in the adaptive control literature (Åström and Wittenmark, 1995). Hence the use of the CRD terminology.
The CRD objective function (or the posterior variance of the response in an analogous dual response CRD formulation, which will be considered later in this article) is a natural extension of the standard robust design objective function and, hence, should have familiar conceptual appeal to practitioners. The resulting CRD approach has a number of attractive characteristics. It leads to a relatively simple, closed-form expression for the objective function. Other Bayesian approaches for considering parameter uncertainty in robust design (reviewed in Section 2) require Monte Carlo simulation to calculate the objective function. A different Monte Carlo simulation must be conducted for each x of interest, which prohibits analytical optimization of the objective function and complicates numerical optimization. The analytical expressions provide a convenient smooth function that can be easily evaluated within an optimization routine. More generally, these analytical expressions provide insight into the mechanisms behind robustness to parameter uncertainty or the lack thereof.
The format of the remainder of this article is as follows. Section 2 reviews prior Bayesian and frequentist approaches for considering model structure and parameter uncertainty in experimental-based process optimization and robust design. Section 3 discusses the prior and posterior distributions for the parameters and provides expressions for θ̂, σ̂², and Σθ. These in turn are used in Section 4 to develop a tractable expression for JCRD(x). For a special case of the model (1) that is linear in x (i.e., g(x) = x), a closed-form expression exists for the x that minimizes the CRD objective function. This provides insight into how CRD ensures robustness to parameter uncertainty, which is discussed in Section 5. In Section 6, a leaf spring manufacturing example from the literature is used to illustrate CRD and compare it to standard robust design in which parameter uncertainty is neglected. Although the CRD objective function considers variability due to parameter uncertainty on par with variability due to the noise variables, it has a natural decomposition into (i) the standard robust design expression for the response variability that results when the parameters are treated as known; and (ii) the additional response variability due to parameter uncertainty. Section 7 continues the leaf spring example to illustrate how this decomposition provides insight into whether the current experiment yields sufficient information for optimizing the process versus whether additional experimentation is needed to reduce parameter uncertainty. Section 8 discusses the distinctions between parameter uncertainty and noise variability. Section 9 considers the implications of having constraints on the input variables, which are common in robust design optimization problems. Section 10 briefly discusses an extension of the CRD approach that considers uncertainty in the model structure. Section 11 discusses an extension of the CRD concepts to dual-response robust design, in which the objective is to minimize Varε,w,θ,σ[y | Y], subject to the constraint Eε,w,θ,σ[y | Y] = T. Section 12 concludes the article.
2. Review of prior work on model and parameter uncertainty in experimental-based process optimization and robust design
Cautious adaptive feedback control strategies that involve a Bayesian MSE objective function have been widely
investigated (Åström and Wittenmark, 1995). Recently, Apley (2004) and Apley and Kim (2004) have investigated non-adaptive versions of cautious control. Their work was in the context of automatic feedback control, in which the input settings x are actively adjusted online as each new response observation is obtained. In the context of robust design, in which the online input settings are held fixed at some optimized values based on an offline experiment, the use of the Bayesian MSE objective function (4) follows Apley and Kim (2002) and Kim (2002). A number of other approaches have also been proposed for taking into account model uncertainty in robust design and, more generally, in experimental-based process optimization. Box and Hunter (1954) developed a confidence region for the values of x that constitute a stationary point for (e.g., that maximize or minimize) a response surface. The confidence region is with respect to uncertainty in the parameters, which is taken into account from a frequentist perspective via the dependence of the confidence region on the distribution of the parameter estimates. Peterson et al. (2002) developed an alternative confidence region that distinguishes saddle points from minima/maxima and that can handle constraints on the inputs. In another frequentist approach, Myers and Montgomery (2002, p. 576) derived an unbiased estimate of Varε,w(y | θ, σ) in Equation (2), which involves parameter uncertainty via the dependence of their bias correction term on the covariance matrix of θ̂. They recommended minimizing the unbiased variance estimate and also discussed a graphical approach in which one plots unbiased estimates of the mean and variance in Equations (2) and (3) as functions of x, while simultaneously displaying a confidence region for the true x settings that minimize Varε,w(y | θ, σ). Miró-Quesada and Del Castillo (2004) used the same bias correction term as Myers and Montgomery (2002), but their objective was to minimize an unbiased estimate of Varθ̂,w(ŷ(w) | θ, σ), where ŷ(w) is Equation (1) with θ̂ substituted for θ. Parameter uncertainty was taken into account from a frequentist perspective by virtue of the variance operation being with respect to θ̂, as well as w, which resulted in an expression that was a function of the covariance matrix of θ̂. In a sense that will be discussed in Section 11, the CRD objective function that is adopted results in a more complete accounting of parameter uncertainty. Sahni et al. (2009) considered model uncertainty in the context of mixture-process optimization. Monroe et al. (2010) considered the effects of model uncertainty on the selection of optimal designs for accelerated life tests.
A number of Bayesian approaches have also been proposed for taking model uncertainty into account. Chipman (1998) considered the posterior distribution of {θ, σ} | Y and then recommended a Monte Carlo simulation in which values of {θ, σ} are drawn from their posterior distribution and substituted into Equations (2) and (3) (or any other robust design criterion). A separate Monte Carlo simulation is conducted for each value of x of interest. The average and/or sample variance of Equations (2) and (3) over the Monte Carlo simulation can guide a designer in choosing x settings for which the response mean and variance are robust to uncertainty in θ. Chipman (1998) also considered uncertainty in the model structure. Using the approach of Box and Meyer (1993), Chipman calculated the posterior probabilities that each model within some class (e.g., all models consisting of subsets of the individual terms in Equation (1)) is the true one, and then within the Monte Carlo simulation drew the model structures, as well as the parameters, from their posterior distributions.
Peterson (2004) and Miró-Quesada et al. (2004) proposed a Bayesian approach in which one calculates the posterior (given Y) probability that y falls within some specified tolerance interval. The posterior distribution of y | Y considers uncertainty/randomness in w, ε, θ, and σ. Peterson (2004) considered the noiseless case (i.e., terms involving w absent from Equation (1)), and Miró-Quesada et al. (2004) extended the approach to include noise. Rajagopal and Del Castillo (2005) and Rajagopal et al. (2005) further extended the approach to incorporate uncertainty in the model structure, the former treating the noiseless case and the latter including noise. They used the approach of Box and Meyer (1993), also used by Chipman (1998), to calculate the posterior model probabilities. For the analyses with no noise, they utilized analytical expressions for the posterior distribution of y | Y (a t-distribution under a certain choice of priors). For the analyses with noise terms in the model, they relied heavily on Monte Carlo simulation to calculate the objective function for each x of interest, as in Chipman (1998).
Relative to the aforementioned Bayesian approaches for robust design with noise, a primary advantage of the proposed approach is that it is possible to derive a relatively simple closed-form analytical expression for the objective function. As mentioned in the Introduction, the analytical expression provides insight into robustness issues; facilitates optimization; and offers a natural decomposition of the response variability into the standard robust design component (i.e., assuming θ coincides with θ̂) and the additional component due to parameter uncertainty. This allows one to conveniently plot the individual components versus x. A plot of the additional variability due to parameter uncertainty is informative when deciding whether further experimentation is necessary to reduce parameter uncertainty, which is illustrated with examples later. Moreover, graphical exploration of each component plot is quite useful if there are other design considerations (qualitative or quantitative) that are difficult to incorporate into a formal mathematical optimization criterion, as is often the case in practical robust design problems.
Another obvious difference between the approach proposed in this article and the approaches of Peterson (2004) and Miró-Quesada et al. (2004) is that the underlying design criteria are quite different. When deciding which method is more appropriate, one should also consider
which criterion is more physically meaningful for the problem at hand. For some problems, minimizing the posterior MSE may be more meaningful than maximizing the posterior probability that y falls within some specified tolerance interval, and vice versa for other problems.
3. Prior and posterior distributions for the parameters
Let Y denote the n × 1 vector of observations of y obtained from an experiment, over which the x and w settings were varied according to some design matrix Z. By this it is meant that for the n observations, the model (1) can be written as

Y = Zθ + ε,

where ε is an n × 1 Gaussian random vector with zero mean and covariance matrix σ²In, and In denotes the n × n identity matrix. If k denotes the dimension of the parameter vector θ, each column of the n × k matrix Z corresponds to a single term in Equation (1). Each row of Z consists of a "1" (corresponding to the intercept term α) and the values of {gi(x): i = 1, 2, . . . , l}, {wj: j = 1, 2, . . . , m}, and {xiwj: i = 1, 2, . . . , p; j = 1, 2, . . . , m} for a single experimental run.
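For the linear-effects case g(x) = x, a row of Z can be assembled as follows (a sketch; the helper name is ours, and the interaction columns are ordered to match θ = [α β′ γ′ b′1 · · · b′p]′):

```python
import numpy as np

def z_row(x, w):
    """One row of Z for model (1) with g(x) = x:
    [1, x_1..x_p, w_1..w_m, then x_i*w_j grouped by i, matching
    theta = [alpha, beta', gamma', b_1', ..., b_p']']."""
    x, w = np.asarray(x, float), np.asarray(w, float)
    # np.outer(x, w).ravel() yields x_1*w_1..x_1*w_m, x_2*w_1, ...
    return np.concatenate(([1.0], x, w, np.outer(x, w).ravel()))

# p = 4 controllable inputs, m = 1 noise variable -> k = 10 columns
row = z_row([-1, -1, -1, -1], [1])
```

Stacking one such row per experimental run gives the n × k matrix Z used below.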
For the linear Gaussian model a common choice of prior distributions, which will be assumed in this work, is an uninformative (locally flat) prior for log(σ) and θ | σ ∼ Nk(µ, σ²Σ). In other words, the prior distribution of σ is ∝ 1/σ, and the prior distribution of θ given σ is multivariate normal with mean µ (some specified k × 1 vector) and covariance matrix σ²Σ (for some specified k × k matrix Σ). One often selects Σ to be diagonal with diagonal entries {φi: i = 1, 2, . . . , k}, in which case σ²φi is the prior variance of θi, the i-th element of θ. Most of the prior Bayesian treatments of parameter uncertainty reviewed in Section 2 have assumed these priors. Notice that letting φi → ∞ represents minimal prior knowledge of θi.
Under these priors, it can be shown (Bunke and Bunke, 1986, p. 439) that the posterior distribution of θ given Y and σ is

θ | Y, σ ∼ Nk(µY, σ²ΣY),

where µY = [Σ⁻¹ + Z′Z]⁻¹[Σ⁻¹µ + Z′Y] and ΣY = [Σ⁻¹ + Z′Z]⁻¹. It can also be shown that the posterior distribution of σ⁻² | Y is gamma. The full posterior distributions will not be needed to derive an expression for JCRD(x), as will be seen in the following section. All that are needed are the posterior mean and covariance of θ | Y and the posterior mean of σ² | Y. Because σ⁻² | Y is gamma, it follows that the posterior mean of σ² | Y is (Bunke and Bunke, 1986, A 2.22)

σ̂² = Eσ[σ² | Y] = ([θ̂ − µ]′Σ⁻¹[θ̂ − µ] + [Y − Zθ̂]′[Y − Zθ̂]) / (n − 2), (5)

where

θ̂ = Eθ,σ[θ | Y] = Eσ[Eθ[θ | σ, Y] | Y] = Eσ[µY | Y] = µY = [Σ⁻¹ + Z′Z]⁻¹[Σ⁻¹µ + Z′Y], (6)

and

Σθ = Eθ,σ[(θ − θ̂)(θ − θ̂)′ | Y] = Eσ[Eθ[(θ − θ̂)(θ − θ̂)′ | σ, Y] | Y] = Eσ[σ²ΣY | Y] = σ̂²ΣY = σ̂²[Σ⁻¹ + Z′Z]⁻¹, (7)

are the posterior mean and covariance of θ | Y. Notice that with minimal prior knowledge of θ (i.e., Σ⁻¹ → 0k), Equations (6) and (7) reduce to the standard least squares parameter estimates and covariance matrix, respectively, albeit using a different estimate of σ². For minimal prior knowledge of θ, Equation (5) reduces to the residual sum of squares [Y − Zθ̂]′[Y − Zθ̂] divided by n − 2.
Joseph (2006) and Joseph and Delaney (2007) investigated an alternative choice of prior covariance for θ that consisted of choosing Σ so that the resulting prior distribution of the response is in agreement (over some full factorial grid in the design space) with a specified Gaussian random process model for the response. This may be useful in highly fractionated designs in which one prefers to retain many high-order terms in the model.
4. A closed-form expression for the CRD objective function
The results of the previous section yield a rather tractable expression for JCRD(x). Toward this end, let θ̃ = θ − θ̂ denote the vector of parameter estimation errors, and let α̃, β̃, γ̃, and B̃ be defined similarly. Substituting θ = θ̂ + θ̃ for the parameters in Equation (1) gives

y − T = {α̂ + β̂′g(x) + γ̂′w + w′B̂x − T} + {α̃ + β̃′g(x) + γ̃′w + w′B̃x} + ε. (8)

Since the posterior distribution of θ | Y is multivariate normal with mean θ̂ and covariance Σθ, the posterior distribution of θ̃ | Y is multivariate normal with mean zero and the same covariance. Moreover, θ̃ | Y is independent of w, which denotes some future noise that is independent of the experimental data Y. Although θ̃ | Y is not independent of ε | Y (because their distributions both depend on σ), it is straightforward to show that they are uncorrelated, which is the property that is needed in the following.

Substituting Equation (8) into Equation (4) gives

JCRD(x) = (α̂ + β̂′g(x) − T)² + (γ̂ + B̂x)′Σw(γ̂ + B̂x) + Σα + g′(x)Σβg(x) + x′Ax + 2g′(x)Σβα + 2x′a + d + σ̂², (9)
where Σα = Eθ,σ(α̃² | Y), Σβ = Eθ,σ(β̃β̃′ | Y), and Σβα = Eθ,σ(β̃α̃ | Y) denote the posterior variances/covariances of α and β, and we define A = Eθ,σ(B̃′ΣwB̃ | Y), d = Eθ,σ(γ̃′Σwγ̃ | Y), and a = Eθ,σ(B̃′Σwγ̃ | Y). In arriving at terms like x′Ax in Equation (9), we have used the relationship Ew,θ,σ[(w′B̃x)² | Y] = Ew,θ,σ(x′B̃′ww′B̃x | Y) = Eθ,σ(Ew(x′B̃′ww′B̃x | θ, σ, Y) | Y) = x′Eθ,σ(B̃′ΣwB̃ | Y)x = x′Ax. The scalar d, the p × p matrix A, and the p × 1 vector a can be readily constructed from the covariance matrices Σw and Σθ using the relationships d = Eθ,σ(γ̃′Σwγ̃ | Y) = trace(ΣγΣw), Ai,j = Eθ,σ(b̃′iΣwb̃j | Y) = trace(Σbibj Σw), and ai = Eθ,σ(b̃′iΣwγ̃ | Y) = trace(Σbiγ Σw), where Σbibj = Eθ,σ(b̃ib̃′j | Y) and Σbiγ = Eθ,σ(b̃iγ̃′ | Y). Notice that Σα, Σβ, Σβα, Σγ, Σbibj, and Σbiγ are all directly available as submatrices of Σθ. Consequently, given g(x), θ̂, σ̂², Σθ, Σw, and T, Equation (9) can be easily evaluated analytically, without the need for Monte Carlo simulation.
Equation (9) has a revealing interpretation in terms of the effects of parameter uncertainty on the posterior MSE. Since Eε,w,θ,σ(y | Y) = α̂ + β̂′g(x), the first term in Equation (9) is the component of the MSE due to differences between the posterior mean of y and the target. The second term represents the variance of y that is due to the random noise variables w, under the assumption that the true parameters are equal to their estimates. This assumption is often referred to as Certainty Equivalence (CE) in the adaptive control literature. Borrowing this terminology, the analogous CE objective function is the familiar standard robust design expression

JCE(x) = Eε,w[(y − T)² | θ = θ̂, σ = σ̂] = (α̂ + β̂′g(x) − T)² + (γ̂ + B̂x)′Σw(γ̂ + B̂x) + σ̂². (10)

Notice that we can write Equation (9) as JCRD(x) = JCE(x) + Jθ(x), where

Jθ(x) = Σα + g′(x)Σβg(x) + x′Ax + 2g′(x)Σβα + 2x′a + d (11)

represents the additional MSE due to parameter uncertainty. As parameter uncertainty decreases (i.e., as Σθ → 0k), all of the terms in Equation (11) disappear. As parameter uncertainty increases, Jθ(x) increases, because each term in Equation (11) is proportional to elements of the posterior covariance Σθ.
5. A closed-form solution for xCRD with only linear effects
When g(x) = x, the model (1) reduces to the linear effects model y = α + β′x + γ′w + w′Bx + ε, in which case we can obtain a closed-form solution for the optimal CRD settings when there are no constraints on x (constraints are discussed in Section 9). Although the linear model is not as broadly applicable as the model with quadratic g(x), the closed-form solution for the linear case provides insight into the nature of CRD.

Substituting g(x) = x in Equation (9) and setting the partial derivative equal to zero gives the optimal CRD input settings:

xCRD = [β̂β̂′ + B̂′ΣwB̂ + Σβ + A]⁻¹{(T − α̂)β̂ − B̂′Σwγ̂ − Σβα − a}. (12)

In contrast, the input settings that minimize the analogous CE objective function (10) are

xCE = [β̂β̂′ + B̂′ΣwB̂]⁻¹{(T − α̂)β̂ − B̂′Σwγ̂}, (13)

which follows from Equation (12) with all parameter covariance terms set equal to zero. Notice that if p > m + 1, the matrix β̂β̂′ + B̂′ΣwB̂ in Equation (13) is not invertible, and the inputs that minimize JCE are not unique. In this case, replacing the inverse of β̂β̂′ + B̂′ΣwB̂ by its singular value decomposition pseudoinverse corresponds to taking xCE to be the minimum-norm solution. This minimum-norm solution is used for all of the CE examples in this article.
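Equations (12) and (13) can be sketched as follows (function names ours; the pseudoinverse yields the minimum-norm xCE when p > m + 1):

```python
import numpy as np

def x_ce(beta_h, gamma_h, B_h, Sigma_w, T, alpha_h):
    """Minimum-norm x_CE, Equation (13), via the pseudoinverse."""
    M = np.outer(beta_h, beta_h) + B_h.T @ Sigma_w @ B_h
    return np.linalg.pinv(M) @ ((T - alpha_h) * beta_h - B_h.T @ Sigma_w @ gamma_h)

def x_crd(beta_h, gamma_h, B_h, Sigma_w, T, alpha_h, S_beta, A, S_beta_alpha, a):
    """x_CRD, Equation (12): the posterior covariance terms enter the
    bracketed matrix, acting like a data-driven regularizer on x_CE."""
    M = np.outer(beta_h, beta_h) + B_h.T @ Sigma_w @ B_h + S_beta + A
    rhs = (T - alpha_h) * beta_h - B_h.T @ Sigma_w @ gamma_h - S_beta_alpha - a
    return np.linalg.solve(M, rhs)

# Illustrative values (ours): p = 2, m = 1, singular CE matrix
beta_h, gamma_h = np.array([1.0, 0.0]), np.array([0.0])
B_h, Sigma_w = np.array([[0.0, 0.0]]), np.eye(1)
xce = x_ce(beta_h, gamma_h, B_h, Sigma_w, 1.0, 0.0)
xcrd = x_crd(beta_h, gamma_h, B_h, Sigma_w, 1.0, 0.0,
             0.5 * np.eye(2), 0.5 * np.eye(2), np.zeros(2), np.zeros(2))
```

Adding Σβ + A inside the inverse is structurally the same move as ridge regularization, which is one way to see why larger uncertainty pulls the solution toward the origin.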
Comparing Equations (12) and (13), the origin of the term cautious in CRD becomes more apparent. Larger parameter uncertainty (as measured by, say, the eigenvalues of the positive semi-definite matrices Σβ and A) results in the inverse of the matrix in brackets in Equation (12), and thus xCRD, being smaller than if parameter uncertainty were neglected. Strictly speaking, larger parameter uncertainty causes xCRD to be closer to the center of the experimental design region, but this translates to xCRD being smaller if x is coded so that the zero vector represents the center. In this sense, the optimal settings for x are chosen more cautiously in CRD than if parameter uncertainty is neglected.
The effects of parameter uncertainty on xCRD are even more apparent in the special case that a fractional factorial design (with no aliasing of terms that are included in the model) is used and x and w are transformed to a scale for which they take on values of ±1 over the experiment. With an orthogonal design matrix Z, and assuming a non-informative prior for θ, the posterior covariance becomes Σθ = σ̂²n⁻¹Ik. Thus, Σβα and a are zero, Σβ = σ̂²n⁻¹Ip, and A = σ̂²n⁻¹trace(Σw)Ip, which, when substituted into Equation (12), yields

xCRD = [β̂β̂′ + B̂′ΣwB̂ + σ̂²n⁻¹{1 + trace(Σw)}Ip]⁻¹{(T − α̂)β̂ − B̂′Σwγ̂}.

When parameter uncertainty (as measured by σ̂²n⁻¹, the posterior variance of the estimated parameters) is zero, xCRD coincides with xCE. As parameter uncertainty increases, the xCRD settings shrink monotonically toward 0, the center of the experimental design region.
Table 1. Description of variables in the leaf spring example

Variable   Represents            Low level   High level
x1         Temperature           1840        1880
x2         Heating time          25          23
x3         Hold down time        2           3
x4         Transfer time         12          10
w1         Quench temperature    130–150     150–170
6. An example
The use of the proposed approach is illustrated with data from an experiment involving the manufacture of truck leaf springs, originally analyzed in Pignatiello and Ramberg (1985) and later in Chipman (1998). There are four controllable variables and one noise variable, whose descriptions are given in Table 1. All temperatures are in degrees Fahrenheit, and all times are in seconds. The high and low values in Table 1 correspond to ±1 values for x and w, which are expressed in coded units for the analyses in this article. The output variable y is the free height of the leaf spring, for which the target is T = 8 inches. The experiment was three replicates of a crossed-array design with a two-level fractional factorial in x, the data for which are shown in Table 2. No quadratic effects can be estimated with this experiment, and, as shown in Chipman (1998), all x-by-x interactions appear insignificant. Hence, throughout the analysis, the linear effects model y = α + β′x + γw1 + w1Bx + ε discussed in Section 5 is used (i.e., Equation (1) with g(x) = x and m = 1).
Let φi → ∞ in order to represent minimal prior knowledge of θ, in which case Equations (5) and (6) yield the point estimates α̂ = 7.636, β̂ = [0.111, −0.088, −0.014, 0.052]′, γ̂ = −0.062, B̂ = [0.016, 0.037, 0.005, −0.018], and σ̂² = (0.186)². Since the design matrix was orthogonal, Z′Z = nI10 = 48I10, and the parameter covariance becomes Σθ = σ̂²[Σ⁻¹ + Z′Z]⁻¹ = n⁻¹σ̂²I10 = (0.0268)²I10. The noise variance was assumed to be Σw = 1.
Table 2. Response data for the leaf spring example

                 |       w1 = −1        |       w1 = +1
x1  x2  x3  x4   | rep 1  rep 2  rep 3  | rep 1  rep 2  rep 3
−1  −1  −1  −1   | 7.78   7.50   7.78   | 7.25   7.81   7.12
 1  −1  −1   1   | 8.15   7.88   8.18   | 7.88   7.88   7.44
−1   1  −1   1   | 7.50   7.50   7.56   | 7.56   7.50   7.50
 1   1  −1  −1   | 7.59   7.63   7.56   | 7.75   7.75   7.56
−1  −1   1   1   | 7.94   7.32   8.00   | 7.44   7.88   7.44
 1  −1   1  −1   | 7.69   7.56   8.09   | 7.69   8.06   7.62
−1   1   1  −1   | 7.56   7.18   7.62   | 7.18   7.44   7.25
 1   1   1   1   | 7.56   7.81   7.81   | 7.50   7.69   7.59
Fig. 1. Plots of JCRD, JCE, and Jθ versus the first two inputs for the leaf spring example. JCRD results in a smaller optimal value for x1, for which the adverse effects of parameter uncertainty are lessened.
Neglecting parameter uncertainty, the CE input settings from Equation (13) are xCE = [3.43, 0.24, −0.01, 0.09]′. In comparison, the CRD input settings from Equation (12) are xCRD = [2.51, −0.45, −0.10, 0.38]′.
Figure 1 plots JCRD(x), JCE(x), and Jθ(x) versus x1 and x2, with x3 and x4 held fixed at −0.10 and 0.38, respectively (their optimal CRD values). Notice that as x1 increases,
JCRD increases more so than JCE, because Jθ(x) increases as one attempts to extrapolate to values of x1 beyond the experimental region. This is the primary net effect of parameter uncertainty in this example, and it has the beneficial consequence of helping to ensure that the CRD x1 setting is closer to the center of the experimental region without adding an explicit constraint (explicit constraints will be discussed in Section 9). Recall that the CE setting for x1 is 3.43, which is far outside the experimental region. The CRD setting for x1 is 2.51, which, while still far enough outside the experimental region to cause concern, is substantially smaller than the CE setting. The posterior MSEs corresponding to the optimal CE and CRD input settings are JCRD(xCE) = 0.053 and JCRD(xCRD) = 0.048. Hence, a modest 10% improvement in the posterior MSE is achieved by taking into account parameter uncertainty when selecting the optimal input settings. For this example, there was a relatively large number of experimental runs (n = 48) and, correspondingly, a relatively small level of parameter uncertainty. In the next section it will be demonstrated that the differences between the CRD and CE are much more pronounced for larger parameter uncertainty.
7. Assessing the impact of parameter uncertainty and the need for further experimentation
For higher levels of parameter uncertainty than in the preceding example, the posterior MSE when using xCRD may be much lower than when using xCE, as will be demonstrated shortly. However, even if the CRD settings are used, one may find that the inflation of the MSE due to parameter uncertainty is still unacceptably large. The decomposition of the CRD objective function (9) into JCRD(x) = JCE(x) + Jθ(x), where JCE(x) and Jθ(x) are given by Equations (10) and (11), respectively, can aid in assessing whether this is the case. Recall that Jθ(x) represents the additional MSE due to parameter uncertainty and that when parameter uncertainty disappears, Jθ(x) reduces to zero, and JCRD(x) reduces to JCE(x). A simple plot of Jθ(x) versus x, as in the bottom panel of Fig. 1, can be used to assess the net effect of parameter uncertainty in terms of its direct impact on the robust design objective. This, in turn, provides insight into whether additional experimentation is necessary to reduce parameter uncertainty, which is illustrated with a continuation of the leaf spring example.
Example continued: The n = 48 response observations in Table 2 represent three replicates of a 2⁵⁻¹ fractional factorial design in x and w. Suppose, instead, that only a single replicate was conducted, resulting in only n = 16 runs, and that σ̂ increased from 0.186 to 0.372 (i.e., doubled). Rather than arbitrarily choose one of the three replicates from Table 2 to retain, it will be simply assumed that the point estimates from the previous section (α̂ = 7.636, β̂ = [0.111, −0.088, −0.014, 0.052]′, γ̂ = −0.062, B̂ = [0.016, 0.037, 0.005, −0.018], and σ̂² = (0.372)²) came from a single replicate.
x1x2
JCRD
x1x2
JCE
x1x2
Jθθ
-10
12
34
-1
-0.5
0
0.5
10
0.1
0.2
0.3
0.4
0.5
-10
12
34
-1
-0.5
0
0.5
10
0.1
0.2
0.3
0.4
0.5
-10
12
34
-1
-0.5
0
0.5
10
0.1
0.2
0.3
0.4
0.5
Fig. 2. Plots of JCRD, JCE, and Jθ versus the first two inputs
for themodified leaf spring example with n = 16 and σ̂ = 0.372,
insteadof n = 48 and σ̂ = 0.186. The effects of parameter
uncertaintynow dominate, resulting in a more cautious CRD setting
for x1.
(0.372)2) came from a single replicate. In terms of theanalysis
for this case, the net effect is that the pa-rameter covariance
matrix increases by a factor of 12:From �θ = (0.0268)2I10 to �θ =
σ̂ 2[�−1 + Z ′Z]−1 =n−1σ̂ 2I10 = (0.0928)2I10. The CE input
settings, which ne-glect parameter uncertainty, remain unchanged
from their
Downloaded By: [[email protected]] At: 16:21 1 April
2011
earlier values: xCE = [3.43, 0.24, −0.01, 0.09]′. In contrast, the CRD input settings change from xCRD = [2.51, −0.45, −0.10, 0.38]′ (for the original Fig. 1 example) to xCRD = [1.10, −0.66, −0.11, 0.40]′ (for the example with smaller n and larger σ̂). This has the benefit of further reducing the magnitude of the largest input x1 to the point that it is almost within the experimental region.
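The factor-of-12 covariance inflation quoted above follows from the general posterior formula Σθ = σ̂²[Σ⁻¹ + Z′Z]⁻¹. A minimal sketch of this scaling, assuming an illustrative orthogonal coded design matrix Z and a flat prior (prior precision zero), so that replication alone drives the posterior covariance:

```python
import numpy as np

# Sketch of the posterior parameter covariance Sigma_theta =
# sigma2 * inv(prior_precision + Z'Z). Z is an illustrative stand-in
# (a coded orthogonal two-factor design), not the paper's design matrix.

def posterior_cov(Z, sigma2, prior_prec=None):
    """Posterior covariance under a flat prior when prior_prec is None."""
    k = Z.shape[1]
    P = np.zeros((k, k)) if prior_prec is None else prior_prec
    return sigma2 * np.linalg.inv(P + Z.T @ Z)

Z1 = np.array([[1.0, 1], [1, -1], [-1, 1], [-1, -1]])  # one replicate
Z3 = np.vstack([Z1] * 3)                               # three replicates

c1 = posterior_cov(Z1, sigma2=0.14)
c3 = posterior_cov(Z3, sigma2=0.14)
print(np.diag(c1), np.diag(c3))  # posterior variances drop by a factor of 3
```

With r replicates of an orthogonal design, Z′Z = r·(Z1′Z1), so the posterior covariance shrinks as 1/r; doubling σ̂ on top of dropping from three replicates to one yields the 2² × 3 = 12-fold inflation in the example.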
Figure 2 plots JCRD(x) and its two components, JCE(x) and Jθ(x), versus x1 and x2. The other two inputs, x3 and x4, were held fixed at their optimal CRD values of −0.11 and 0.40, respectively. A few points are worth noting. JCRD(x) increases dramatically as x1 increases, because of the very pronounced component due to parameter uncertainty (Jθ, plotted in the bottom panel of Fig. 2). The contribution of Jθ(x) to the MSE is much larger for the higher levels of parameter uncertainty considered in this example, especially for large values of x1. Consequently, JCRD(x) penalizes large values of x1 more than was the case for the Fig. 1 example, which explains why the optimal x1 setting is substantially smaller in this case. The posterior MSEs corresponding to the optimal CE and CRD input settings are JCRD(xCE) = 0.360 and JCRD(xCRD) = 0.219. Thus, with the larger parameter uncertainty in this example, neglecting parameter uncertainty when selecting the inputs results in a 64% larger posterior MSE than when CRD is used.
At the optimal CRD inputs, JCRD(xCRD) = 0.219 decomposes into its two components: JCE(xCRD) = 0.170 and Jθ(xCRD) = 0.049. The contribution of parameter uncertainty to the posterior MSE is roughly 29% of the CE contribution. Based on this, one might decide that further experimentation is required to reduce the parameter uncertainty. When assessing whether further experimentation is needed, one should also consider the relative contributions of the two components of JCRD at the optimal CE inputs xCE. These are the input settings that would be used if there were no parameter uncertainty and the true parameters coincided with the point estimates. Because the point estimates are the posterior mean of θ, they do, after all, represent one's best guess at the true parameters. Following this line of reasoning, one might be interested in the following questions.
1. What benchmark MSE could be achieved in the hypothetical scenario that we know the true parameters and they happen to coincide with their point estimates?
2. How much will the reality of parameter uncertainty add to the MSE if we use the inputs optimized under the hypothetical benchmark scenario?
The answers to these two questions are precisely JCE(xCE) and Jθ(xCE), respectively. For the results shown in Fig. 2, we have JCE(xCE) = 0.138 and Jθ(xCE) = 0.221. This hypothetical benchmark MSE of 0.138 is better than JCE(xCRD) = 0.170, the analogous value using the cautious inputs xCRD. Based on this, one might hesitate to so quickly rule out using the potentially very good xCE settings in favor of the more-robust-to-parameter-uncertainty xCRD settings. However, neither should one just go ahead and use xCE, considering that parameter uncertainty at xCE could be expected to almost triple the MSE (from 0.138 to 0.138 + 0.221 = 0.360). One might conclude from this analysis that further experimentation is necessary to reduce parameter uncertainty to levels at which one can use input settings with greater confidence.
Further experimentation also encompasses confirmation experiments, which are generally considered sound practice in any response surface optimization. In the preceding example, it had appeared that xCE may result in a lower MSE than xCRD if one neglects parameter uncertainty (JCE(xCE) = 0.138 versus JCE(xCRD) = 0.170). However, after performing the analyses described in the preceding paragraph, it is also clear that the MSE could in fact be substantially higher (e.g., 0.360 versus 0.138) because of parameter uncertainty if one uses xCE. Consequently, if one wished to entertain the notion of using xCE in hopes of achieving a lower MSE, then at the very least one should run a confirmation experiment at xCE. After running a confirmation experiment, the entire analysis should be repeated to update all relevant posterior distributions.
8. Parameter versus noise uncertainty
In the CRD paradigm, when formulating the objective function and solving for the optimal inputs, no distinction has been made between parameter uncertainty and noise uncertainty. However, the two forms of uncertainty are of course very different: The noise variables w are true random variables and will vary from part-to-part or batch-to-batch. In contrast, the parameters θ are fixed (but unknown) variables. They have been assigned a probability distribution only as a convenient means of quantifying their uncertainty. Instead of considering the uncertainties in the noise and parameters on par in the CRD criterion, it is straightforward to keep them distinct by decomposing

JCRD(x) = JCE(x) + Jθ(x) = (α̂ + β̂′g(x) − T)² + (γ̂ + B̂x)′Σw(γ̂ + B̂x) + Jθ(x) + σ̂²,

similar to what was done when plotting the individual components JCE(x) and Jθ(x) in Figs. 1 and 2.
The term (γ̂ + B̂x)′Σw(γ̂ + B̂x) represents the contribution of noise variability to the MSE, whereas the term Jθ(x) represents the contribution of parameter uncertainty. The term (α̂ + β̂′g(x) − T)² represents the contribution of an off-target mean. These terms could all be plotted individually to understand their relative contributions to the MSE, keeping the effects of noise variability distinct from parameter uncertainty. Considering them together by minimizing JCRD(x) is simply a convenient means of mitigating the adverse impact of parameter uncertainty.
9. Input constraints and extrapolation beyond the experimental region
In robust design optimization, it is common to incorporate constraints on the inputs. Constraints can arise if the inputs are physically or economically restricted to lie in a region, or simply because the experimental region is finite and extrapolation beyond it is viewed as risky. Regarding the latter, CRD inherently penalizes extrapolation beyond the experimental region, where the effects of parameter uncertainty are greater. However, constraining the inputs to be near the experimental region while optimizing the CE objective function may also possess an inherent form of caution.
To illustrate this, let xCE,con denote the input settings that minimize JCE(x) under the constraint that x lies within the experimental region (e.g., that each element of x lies between −1 and 1). For the example of Fig. 2, numerical optimization reveals that xCE,con = [1, −1, −1, 1]′, for which the posterior MSE is JCRD(xCE,con) = 0.246. This is only a modest 12% larger than the posterior MSE for the optimal CRD inputs [JCRD(xCRD) = 0.219].
The examples of Figs. 1 and 2 are regular orthogonal designs for which Σβα and a are zero and, hence, the effect of parameter uncertainty is the lowest at the origin x = 0 (see Equation (11)). In this case, minimizing JCE while constraining x to be closer to the origin inherently results in more cautious input settings than if JCE is minimized without constraints. Furthermore, suppose one conducted a ridge analysis in which JCE is optimized under the constraint that Jθ = λ for a range of values λ > 0. Because the CRD solution lies somewhere along the ridge path, there exists some λ > 0 for which the constrained CE optimization solution coincides with the CRD solution.
The situation becomes more nuanced for non-orthogonal designs. Consider a further modification of the leaf spring example with everything the same as in the example in Fig. 2, except that only n = 13 runs are conducted. Suppose that the three omitted runs (relative to the example in Fig. 2, for which n = 16) are three of the four runs at the {x1, x2} = {1, −1} corner. In particular, suppose the omitted runs are {x1, x2, x3, x4, w} = {1, −1, −1, 1, −1}, {1, −1, 1, −1, −1}, and {1, −1, −1, 1, 1}. In this case, the optimal CRD input settings are xCRD = [0.62, −0.08, 0.17, 0.38]′, for which JCRD(xCRD) = 0.278. Figure 3 plots JCRD(x), JCE(x), and Jθ(x) versus x1 and x2, with x3 and x4 held fixed at their optimal CRD values of 0.17 and 0.38, respectively. Notice that the effects of parameter uncertainty are much higher at the {x1, x2} = {1, −1} corner in the bottom panel of Fig. 3.
In comparison, the optimal constrained CE inputs are xCE,con = [1, −1, −1, 1]′ (from numerical optimization), for which the posterior MSE is JCRD(xCE,con) = 0.413. This is almost 50% larger than the posterior MSE for the optimal CRD inputs [JCRD(xCRD) = 0.278].

Fig. 3. Plots of JCRD, JCE, and Jθ versus the first two inputs for the modified leaf spring example with n = 13 and σ̂ = 0.372. Because there was only one run at {x1, x2} = {1, −1}, there is higher uncertainty at this corner.

The reason why simply constraining the CE inputs to the experimental region did not provide sufficient caution in this case is that the optimal constrained CE settings happened to fall in the corner of the experimental region for which the effects of parameter uncertainty were large (the three omitted runs were
deliberately chosen so as to create this scenario). In situations like this, in which the design matrix is not orthogonal and the effects of parameter uncertainty are larger in certain directions of the input space, CRD automatically accounts for the nuanced characteristics of the parameter uncertainty.
Inclusion of model structure uncertainty within the CRD framework, as discussed in the following section, would tend to further penalize extrapolation outside the experimental region. More generally, it would penalize choosing input settings that are far from design points. Consider the two-level fractional factorial design with no center points that was used in the example of Fig. 2. Under the linear model assumption, the smallest uncertainty is at the origin (see Equation (11) and the bottom panel of Fig. 2), even though there were no design points there. However, if one considers the possible presence of a quadratic term, then the uncertainty would be much larger at the origin. Similarly, the possible presence of quadratic terms would greatly inflate the uncertainty as one extrapolates outside the range of the experimental region.
10. Consideration of model structure uncertainty
Following the approach of Chipman (1998), which was also adopted by Apley and Kim (2002), Kim (2002), Rajagopal and Del Castillo (2005), Rajagopal et al. (2005), and Ng (2010) when considering model structure uncertainty in robust design, the CRD results can be extended to account for model structural uncertainty. Let {S1, S2, . . . , Sq} denote the set of all candidate model structures, where each model structure consists of subsets of the gi(x), wj, and xiwj interaction terms in Equation (1). Under certain assumptions on the prior probabilities, the posterior probabilities {π1, π2, . . . , πq} that each model structure holds can be calculated in much the same manner as the posterior distributions for the parameters (refer to Box and Meyer (1993) or Chipman (1998) for details). The CRD strategy would be to select the x settings to minimize the weighted sum:

J(x) = ∑_{i=1}^q πi Ji(x),
where Ji(x) = Eε,w,θ,σ[(y − T)² | Si] is the MSE from Equation (5) under the assumption that the model structure Si holds. Since some of the models will exclude subsets of the controllable inputs, uncontrollable noise, and/or their interactions, one must be careful in summing the Ji(x) across the different models in any optimization algorithm. The most straightforward way to do this is to include all of the input and noise variables in each Ji(x). If Si excludes a particular main effect or interaction term, the corresponding element of θ̂ (and the variance/covariance for that parameter) would be set equal to zero when forming Ji(x).
11. A dual-response version of CRD
The CRD problem has been formulated as minimizing the single MSE criterion Eε,w,θ,σ[(y − T)² | Y]. It is straightforward to extend the CRD approach to the analogous dual-response criterion in which Varε,w,θ,σ[y | Y] is minimized subject to the constraint Eε,w,θ,σ[y | Y] = T. Then, it can be written that

Eε,w,θ,σ[y | Y] = α̂ + β̂′g(x), and

Varε,w,θ,σ[y | Y] = Eε,w,θ,σ[(y − T)² | Y] − (Eε,w,θ,σ[y | Y] − T)²
= JCRD(x) − (α̂ + β̂′g(x) − T)²
= (γ̂ + B̂x)′Σw(γ̂ + B̂x) + Σα + g′(x)Σβ g(x) + x′Ax + 2g′(x)Σβα + 2x′a + d + σ̂²,   (14)
where Equation (9) has been used for JCRD(x). For this formulation, the use of Lagrange multipliers when performing the constrained optimization may be helpful. For the special case that g(x) = x, the variance expression is quadratic in x and the constraint is linear. In this case, using Lagrange multipliers, it is straightforward to show (see Kim (2002)) that the closed-form solution is

xCRD,dual = D⁻¹β̂ (T − α̂ + z′D⁻¹β̂) / (β̂′D⁻¹β̂) − D⁻¹z,   (15)

where D = B̂′ΣwB̂ + Σβ + A, and z = a + Σβα + B̂′Σwγ̂.
The dual-response CRD approach bears some resemblance to the two frequentist approaches mentioned in Section 2 for taking parameter uncertainty into account in dual-response robust design. As a dual-response objective function, Miró-Quesada and Del Castillo (2004) proposed using an unbiased estimate of Varθ̂,w(ŷ(w) | θ, σ), where ŷ(w) = α̂ + β̂′g(x) + γ̂′w + w′B̂x. Their objective reduces to minimizing (γ̂ + B̂x)′Σw(γ̂ + B̂x) + g′(x)Σβ̂ g(x), where the parameter estimates and covariance matrix Σβ̂ are from standard least squares. For comparison purposes, suppose a non-informative prior is assumed in the CRD approach, so that the posterior parameter estimates and covariance matrix coincide with their least squares counterparts. The objective function of Miró-Quesada and Del Castillo (2004) is missing a number of terms related to parameter uncertainty that are present in the CRD objective function (14). In particular, the quantity x′Ax + 2x′a + d is missing. Consequently, although their approach considers uncertainty in β in the same manner as does CRD, it does not take into account uncertainty in B. It is worth noting that the quantity x′Ax + 2x′a + d appears at one point in the derivations of Miró-Quesada and Del Castillo (2004), but it is later subtracted out as a result of the fact that it is precisely the bias correction suggested by Myers and
Montgomery (2002, p. 576) when estimating the response variance.
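The closed-form dual-response solution of Equation (15) is a few lines of linear algebra. In the sketch below, every matrix and estimate is an illustrative stand-in (not the leaf spring fit); the final print confirms a property that follows algebraically from Equation (15), namely that the posterior mean lands exactly on target:

```python
import numpy as np

# Hypothetical sketch of Eq. (15):
#   x = D^{-1} b (T - a0 + z' D^{-1} b) / (b' D^{-1} b) - D^{-1} z,
# with D = B' Sigma_w B + Sigma_beta + A and z = a + Sigma_ba + B' Sigma_w g.
# All quantities below are illustrative stand-ins for the posterior estimates.

rng = np.random.default_rng(1)
p, m = 4, 1                                   # number of inputs, noise vars
B = rng.standard_normal((m, p)) * 0.1         # noise-by-input interactions
Sigma_w = np.eye(m)                           # noise covariance
Sigma_beta = 0.01 * np.eye(p)                 # posterior covariance of beta
A = 0.02 * np.eye(p)                          # quadratic term from J_theta
a = 0.01 * rng.standard_normal(p)             # linear term from J_theta
Sigma_ba = 0.005 * rng.standard_normal(p)     # beta-alpha posterior covariance
gamma = np.array([-0.06])
alpha, beta, T = 7.6, rng.standard_normal(p) * 0.1, 7.8

D = B.T @ Sigma_w @ B + Sigma_beta + A
z = a + Sigma_ba + B.T @ Sigma_w @ gamma

Dinv_beta = np.linalg.solve(D, beta)
Dinv_z = np.linalg.solve(D, z)
x_dual = Dinv_beta * (T - alpha + z @ Dinv_beta) / (beta @ Dinv_beta) - Dinv_z

# the dual solution puts the posterior response mean exactly on target:
print(alpha + beta @ x_dual)   # equals T up to rounding
```

Setting A, a, and Sigma_ba to zero in this sketch recovers the comparison solution of Miró-Quesada and Del Castillo (2004) discussed above.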
To contrast the two approaches, reconsider the example of Fig. 3, discussed in Section 9. From Equation (15), the CRD inputs are xCRD,dual = [2.10, −0.86, 0.40, 1.18]′, for which the posterior MSE is JCRD(xCRD,dual) = 0.480. In comparison, the optimal inputs using the criterion of Miró-Quesada and Del Castillo (2004), which are given by Equation (15) with A, a, and Σβα all set equal to zero, are xMdC = [2.17, −0.85, 0.31, 1.00]′, for which the posterior MSE is JCRD(xMdC) = 0.481. Two things are apparent. First, the dual-response solution of Miró-Quesada and Del Castillo (2004) is very similar to the dual-response CRD solution for this example. Second, both of the dual-response solutions call for much larger input settings and incur substantially higher posterior MSE than the non-dual CRD solution. Recall that xCRD = [0.62, −0.08, 0.17, 0.38]′, for which JCRD(xCRD) = 0.278. Evidently, even though parameter uncertainty is incorporated into the objective function (14), the hard constraint that the posterior response mean α̂ + β̂′g(x) equals the target can outweigh the penalty that Equation (14) places on using input settings for which the effects of parameter uncertainty are large.
Myers and Montgomery (2002, p. 576) recommend minimizing (γ̂ + B̂x)′Σw(γ̂ + B̂x) − x′Ax − 2x′a − d + σ̂², which they derive as an unbiased estimate of Varε,w(y | θ, σ). Notice the minus sign on the three bias correction terms. From a frequentist perspective, the minus signs are reasonable if the objective is purely to obtain an unbiased estimate of the response variance. From a Bayesian perspective, however, it is counterproductive as a strategy for minimizing the response variance in a manner that takes into account parameter uncertainty. It would tend to call for input settings that make x′Ax + 2x′a larger, which, from Equation (14), serves to increase the posterior response variance, rather than decrease it. Even from a frequentist perspective, it has some undesirable properties as an objective function. Because A is positive semi-definite, including the −x′Ax term in the objective function will tend to call for larger x settings than if the standard variance objective function (γ̂ + B̂x)′Σw(γ̂ + B̂x) + σ̂² is used. In fact, if parameter uncertainty in B is large enough that B̂′ΣwB̂ − A has a negative eigenvalue, their objective function to be minimized (the unbiased expression for the variance) is unbounded from below, which will call for infinitely large x settings. The cautious approach, on the other hand, tends to call for smaller x settings as parameter uncertainty increases, which seems a more intuitively appealing way to mitigate the adverse effects of parameter uncertainty.
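The unboundedness condition above (a negative eigenvalue of B̂′ΣwB̂ − A) is easy to check numerically. The matrices below are illustrative stand-ins chosen so that parameter uncertainty in B (the A term) dominates the noise-transmission term:

```python
import numpy as np

# Hypothetical sketch of the unboundedness check discussed above: if
# B' Sigma_w B - A has a negative eigenvalue, the bias-corrected objective
# (gamma + Bx)' Sigma_w (gamma + Bx) - x'Ax - 2x'a - d + sigma2
# is unbounded below in x. All matrices are illustrative stand-ins.

def unbounded_below(B, Sigma_w, A):
    """True if the quadratic part of the bias-corrected objective is indefinite."""
    H = B.T @ Sigma_w @ B - A
    return np.min(np.linalg.eigvalsh(H)) < 0

B = np.array([[0.05, 0.02]])           # small noise-by-input interactions
Sigma_w = np.eye(1)
A = 0.04 * np.eye(2)                   # comparatively large uncertainty in B

print(unbounded_below(B, Sigma_w, A))  # True: the objective has no minimum
```

When the check returns True, a minimizer would drive x to infinity along the eigenvector associated with the negative eigenvalue, which is the pathology the text describes.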
12. Conclusions
This article has investigated a Bayesian MSE objective function as a means of taking parameter uncertainty into account in robust design optimization. A key feature of this approach is that a tractable, closed-form expression for the CRD objective function is obtained as a function of the input settings. The only posterior information needed to calculate the CRD objective function is the posterior mean (i.e., point estimates) and posterior covariance matrix of the parameters, for which simple expressions have been provided in Section 3. The resulting CRD objective function (9) is a quadratic function of g(x) and x and, hence, is a well-behaved one for numerical optimization, at least for typical g(x) considered in robust design studies.
The term cautious robust design is fitting. As has been demonstrated, it tends to call for more cautious x settings than if parameter uncertainty is neglected. By cautious it is meant x settings at which the effects of parameter uncertainty are mitigated. For rotatable, orthogonal designs, for which parameter uncertainty is the same in all directions of the parameter space, the cautious settings amount to ones that are closer to the center of the experimental design region. On the other hand, for non-orthogonal experimental designs such as in the example of Fig. 3, the cautious nature of the CRD solution is more nuanced. The effect of parameter uncertainty is larger in certain directions or regions of the input space, and CRD automatically takes this into account by selecting x settings that avoid this.
It has been shown that the CRD objective function, which is a Bayesian MSE, decomposes naturally into three components: the first component is the square of the response mean deviation from target; the second component is the certainty equivalence variance, i.e., what would result due to noise and random error variability if there were no parameter uncertainty; and the third component is the additional variance due to the uncertainty in the parameters, characterized by their posterior distribution. It has been demonstrated that this decomposition is particularly useful in determining whether further experimentation is necessary to reduce parameter uncertainty. The assessment of parameter uncertainty is entirely objective oriented, in the sense that the third component quantifies the extent to which parameter uncertainty inflates the MSE objective function. The current authors believe that this is a very direct and appropriate way of quantifying parameter uncertainty and facilitating efficient experimentation.
A dual-response version of CRD has also been investigated. In situations with no parameter uncertainty, a dual-response criterion may have a conceptual advantage over a single MSE criterion, since the former constrains the response mean to be on target. In the presence of parameter uncertainty, however, the dual-response constraint that the posterior mean is on target is less meaningful: Enforcing the constraint α̂ + β̂′g(x) = T will generally not ensure that the actual response mean α + β′g(x) equals the target for a specific realization of α and β. Ironically, it could happen that the response mean using the CRD MSE criterion is closer to the target than when the constraint α̂ + β̂′g(x) = T is
enforced. Furthermore, it was demonstrated in the example that enforcing the hard constraint that α̂ + β̂′g(x) = T results in settings that are substantially less robust to parameter errors. For these reasons, the authors believe that the CRD MSE criterion of Equation (9) is preferable to the dual-response criterion when parameter uncertainty is large.
Acknowledgements
This work was supported in part by the National Science Foundation under grant CMMI-0758557. We also thank two anonymous referees and the Department Editor, Russell Barton, for their many helpful comments.
References
Apley, D.W. (2004) A cautious minimum variance controller with ARIMA disturbances. IIE Transactions, 36, 417–432.
Apley, D.W. and Kim, J.B. (2002) A cautious approach to robust design with model uncertainty. In Proceedings of the 2002 Industrial Engineering Research Conference, IIE, Orlando, FL, paper 2164.
Apley, D.W. and Kim, J.B. (2004) Cautious control of industrial process variability with uncertain input and disturbance model parameters. Technometrics, 46(2), 188–199.
Åström, K.J. and Wittenmark, B. (1995) Adaptive Control, second edition. Addison-Wesley, New York, NY.
Box, G.E.P. and Hunter, J.S. (1954) A confidence region for the solution of a set of simultaneous equations with an application to experimental design. Biometrika, 41, 190–199.
Box, G.E.P. and Meyer, R.D. (1993) Finding the active factors in fractional screening experiments. Journal of Quality Technology, 25, 94–105.
Bunke, H. and Bunke, O. (1986) Statistical Inference in Linear Models: Statistical Methods of Model Building, Volume I, Wiley, New York, NY.
Chipman, H. (1998) Handling uncertainty in analysis of robust design experiments. Journal of Quality Technology, 30, 11–17.
Joseph, V.R. (2006) A Bayesian approach to the design and analysis of fractionated experiments. Technometrics, 48, 219–229.
Joseph, V.R. and Delaney, J.D. (2007) Functionally induced priors for the analysis of experiments. Technometrics, 49, 1–11.
Khattree, R. (1996) Robust parameter design: a response surface approach. Journal of Quality Technology, 28, 187–198.
Kim, J.B. (2002) A cautious approach to minimizing industrial process variability. Ph.D. dissertation, Department of Industrial Engineering, Texas A&M University.
Lucas, J.M. (1994) How to achieve a robust process using response surface methodology. Journal of Quality Technology, 26, 248–260.
Miró-Quesada, G. and Del Castillo, E. (2004) Two approaches for improving the dual response method in robust parameter design. Journal of Quality Technology, 36, 154–168.
Miró-Quesada, G., Del Castillo, E. and Peterson, J. (2004) A Bayesian approach for multiple response surface optimization in the presence of noise variables. Journal of Applied Statistics, 31, 251–270.
Monroe, E.M., Pan, R., Anderson-Cook, C.M., Montgomery, D.C. and Borror, C.M. (2010) Sensitivity analysis of optimal designs for accelerated life testing. Journal of Quality Technology, 42(2), 121–135.
Montgomery, D.C. (2001) Design and Analysis of Experiments, fifth edition, John Wiley & Sons, New York, NY.
Myers, R.H., Khuri, A.I. and Vining, G. (1992) Response surface alternatives to the Taguchi robust parameter design approach. The American Statistician, 46, 131–139.
Myers, R.H. and Montgomery, D. (2002) Response Surface Methodology: Process and Product Optimization Using Designed Experiments, John Wiley & Sons, New York, NY.
Nair, V.N. (1992) Taguchi's parameter design: a panel discussion. Technometrics, 34, 128–161.
Ng, S.H. (2010) A Bayesian model-averaging approach for multiple-response optimization. Journal of Quality Technology, 42, 52–68.
Peterson, J.J. (2004) A posterior predictive approach to multiple response surface optimization. Journal of Quality Technology, 36, 139–153.
Peterson, J.J., Cahya, S. and Del Castillo, E. (2002) A general approach to confidence regions for optimal factor levels of response surfaces. Biometrics, 58, 422–431.
Pignatiello, J.J. and Ramberg, J.S. (1985) Discussion of off-line quality control, parameter design, and the Taguchi method. Journal of Quality Technology, 17, 198–206.
Rajagopal, R. and Del Castillo, E. (2005) Model-robust process optimization using Bayesian model averaging. Technometrics, 47, 152–163.
Rajagopal, R., Del Castillo, E. and Peterson, J.J. (2005) Model and distribution-robust process optimization with noise factors. Journal of Quality Technology, 37, 210–222.
Sahni, N.S., Piepel, G.F. and Naes, T. (2009) Product and process improvement using mixture-process variable methods and robust optimization techniques. Journal of Quality Technology, 41(2), 181–197.
Shoemaker, A.C., Tsui, K. and Wu, C.F.J. (1991) Economical experimentation methods for robust design. Technometrics, 33, 415–427.
Taguchi, G. (1986) Introduction to Quality Engineering, UNIPUB/Kraus International, White Plains, NY.
Vining, G.G. and Myers, R.H. (1990) Combining Taguchi and response surface philosophies: a dual response approach. Journal of Quality Technology, 22, 38–45.
Wu, C.F.J. and Hamada, M. (2000) Experiments: Planning, Analysis, and Parameter Design Optimization, John Wiley & Sons, New York, NY.
Biographies
Daniel W. Apley is an Associate Professor of Industrial Engineering and Management Sciences at Northwestern University, Evanston, IL. He obtained B.S., M.S., and Ph.D. degrees in Mechanical Engineering and an M.S. degree in Electrical Engineering from the University of Michigan. His research interests lie at the interface of engineering modeling, statistical analysis, and data mining, with particular emphasis on manufacturing variation reduction applications in which very large amounts of data are available. His research has been supported by numerous industries and government agencies. He received the NSF CAREER award in 2001, the IIE Transactions Best Paper Award in 2003, and the Wilcoxon Prize for best practical application paper appearing in Technometrics in 2008. He currently serves as Editor-in-Chief for the Journal of Quality Technology and has served as Chair of the Quality, Statistics & Reliability Section of INFORMS, Director of the Manufacturing and Design Engineering Program at Northwestern, and Associate Editor for Technometrics.
Jeongbae Kim is the Director of the Service Innovation Team at Korea Telecom Headquarters. He obtained his Ph.D. in Industrial Engineering from Texas A&M University.