Classification Algorithms for Virtual Metrology
Shaima Tilouche
Department of Mathematical and
Industrial Engineering
Montreal Polytechnique
Email: [email protected]
Samuel Bassetto
Department of Mathematical and
Industrial Engineering
Montreal Polytechnique
Email: [email protected]
Vahid Partovi Nia
GERAD Research Center
Department of Mathematical and
Industrial Engineering
Montreal Polytechnique
Email: [email protected]
Abstract: Virtual metrology in quality control deals with drifts in product quality that occur during non-sampling periods. This approach enables one hundred percent control and improves the precision of statistical control, especially when there is no sampling activity in the manufacturing process. The main challenge in virtual metrology is inaccurate prediction. As such, the choice of an appropriate prediction algorithm is crucial. We compare several algorithms that can be used for prediction in virtual metrology. The comparison is made on simulated data inspired by a virtual metrology application.
I. INTRODUCTION
A semiconductor manufacturing production line includes many steps and requires a high level of accuracy in each step. Several sensors are placed at different locations to detect defects, so that already defective items are not processed further. The information collected from these sensors is then used in quality control. The control is performed at many levels to ensure the stability and the performance of the production process. [2] enumerates some levels that should be taken into account during the control process. These levels are described as layers of control, from tools to product. At the product level, electrical tests verify the proper functionality of chips. These tests are performed at the end of the manufacturing stage [13] [18]. Other integration tests are applied during fabrication to verify the properties of technological modules, for example verifying that the transistor's shape is correctly processed, or checking that the electrical properties of several layers of materials are within the desired specifications. After each step, it is usually possible to perform tests to monitor any abnormal variation of the tools that operate a particular process. Finally, at the tool level, numerous sensors are employed to regulate and to monitor the process.
All the tests and monitoring steps described above are based on a few samples of products, except for the electrical wafer sort on the final product. As a consequence, a drift in production can occur in the non-sampling period, and this drift is hard to detect [6]. Virtual metrology (VM) algorithms have been suggested as an alternative to 100% wafer measurement, in order to support wafer-to-wafer control [12]. VM can increase metrology data availability, reduce send-ahead wafers, improve quality guarantee levels, and reduce cycle time. The main challenge in using VM is poor prediction accuracy. Thus, the choice of the algorithm used in a VM method is of great importance. Here, VM refers to the classification of products (correct or defective) using auxiliary information collected during the manufacturing process. We briefly review and compare several classification algorithms that can be used in VM. Section II presents a literature review on VM. Section III describes the data simulation setup used to compare the classification algorithms. Section IV introduces the classification algorithms, and Section V discusses the numerical comparison.
II. VIRTUAL METROLOGY
Virtual metrology was introduced to apply mathematical models to accessible measurements from an operating process, with the aim of predicting some variables of interest. This methodology allows relevant variables to be predicted from equipment measurements, without physically conducting quality measurements [21] [14]. [6] observed a strong correlation between the tool history and the wafer measurement. They found that a coefficient of determination $R^2 > 0.97$ can be achieved with more than 500 wafers, using deposition as the response and equipment variables as the predictors. This strong correlation suggests that wafer-to-wafer control can be quickly enabled using an existing lot-to-lot control system. This indirect technique, called virtual metrology, provides an efficient and economical alternative to wafer-to-wafer control.
Several algorithms have been proposed in the VM literature. [14] suggested a linear regression model to predict the state of the wafer in real time. They proposed estimating the coefficients by least squares. The univariate output of their model was the wafer measurement, and the inputs were the equipment measurements. The model was updated as new output measurements became available. [3] studied a virtual metrology model using partial least squares. This model predicts chemical vapor deposition oxide thickness for an inter-metal dielectric deposition process.
Many VM algorithms have been developed based on neural networks [22]. A neural network fits an implicit nonlinear model. Different versions of neural networks have been considered in the VM literature. [15] established a VM model using a radial basis function network. The effectiveness of the proposed VM system was tested on chemical vapor deposition processes in a practical semiconductor manufacturing setting. Their results confirmed that a neural network can be used effectively to construct a predictive model. [14] adopted a back-propagation neural network to establish a model for the etching process in semiconductor manufacturing. [7] compared the performance of the radial basis function network and the back-propagation neural network in the thin-film transistor liquid crystal display industry. The two networks produced quite similar results. Other versions of neural models have been proposed to detect wafer anomalies, such as the polynomial neural network, the piecewise linear neural network, and the fuzzy neural network; see for instance [5], [4], [11], and [20].
Kernel approaches, specifically support vector machines, are a powerful tool to predict wafer drifts based on equipment measurements [16]. [7] reports that the support vector machine approach gives better prediction accuracy than both the radial basis function network and the back-propagation approach; see also [1].
The genetic algorithm is a powerful optimization tool for model fitting [8]. A kernel adjustment is proposed to deal with overfitting problems. [7] combines support vector machines and the genetic algorithm to construct a virtual metrology system for the chemical vapor deposition process. [23] suggests principal component axes to reduce the dimensionality of plasma dimensions after the etching process. It is well known that uncorrelated features improve the estimation and the prediction of statistical models. The principal components are linear combinations of the inputs and are mutually uncorrelated.
The existence of a long list of classification algorithms makes the choice of an appropriate algorithm difficult. We aim to compare different classification algorithms proposed in the literature on a simple simulated example motivated by a VM problem.
III. DATA SIMULATION
We tried to simulate the data according to realistic conditions appearing in VM. The inputs, say x, represent the equipment measurements. In practice, the inputs can be power, pressure, temperature, etc. Some of these inputs are intercorrelated and others are independent of each other. We simulated a total of 10 inputs. The output variable, say y, is a binary variable that represents the final product's state, being correct or defective. The following matrix describes the data structure
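\[
\begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1,10} & y_1 \\
x_{21} & x_{22} & \cdots & x_{2,10} & y_2 \\
\vdots & \vdots &        & \vdots   & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{n,10} & y_n
\end{pmatrix},
\]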
where each row corresponds to the measurements on a wafer. Each column xj holds the continuous values of an input variable, say equipment j, and the last column y shows the binary output. The number of rows, n, is the total number of wafers. The matrix entry xij represents the measurement of wafer i on equipment j, and yi is the final state of wafer i.
We simulated the data with the following structure. Input variables x1 and x2 form a block of intercorrelated variables; another block of correlated variables contains x3, x4, and x5. The other inputs x6, x7, x8, x9, and x10 are generated independently, and all are irrelevant for classifying the output. This last block does not affect the output variable, but it contributes to the classification error generated by the measurement system. Table I illustrates the dependence structure of the simulated input variables, in which Np(µ, Σ) denotes a p-variate Gaussian distribution with mean µ and variance-covariance matrix Σ.
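Block   Inputs             Correlation   Distribution
1       (x1, x2)           yes           $N_2\left[\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & 0.9 \\ 0.9 & 1 \end{pmatrix}\right]$
2       (x3, x4, x5)       yes           $N_3\left[\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & 0.9 & 0.9 \\ 0.9 & 1 & 0.9 \\ 0.9 & 0.9 & 1 \end{pmatrix}\right]$
3       x6, . . . , x10    no            $N_1(0, 1)$

TABLE I: The generated input data structure. Three blocks of input variables are generated, each with n = 100 observations.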
First, we simulated a binary output yi generated as a function of only three input variables, xi1, xi3, and xi4:
\[
y_i =
\begin{cases}
1 & \text{if } a' z_i \ge 0, \\
0 & \text{if } a' z_i < 0,
\end{cases}
\tag{1}
\]
where a = (a1, a2, a3) and zi = (xi1, xi3, xi4). Under this model only x1, x3, and x4 are useful for classification, and the other variables are noise. Second, we simulated a binary output using a quadratic function of x1, x2, and x4:
\[
y_i =
\begin{cases}
1 & \text{if } a' z_i + z_i' A z_i \ge 0, \\
0 & \text{if } a' z_i + z_i' A z_i < 0.
\end{cases}
\tag{2}
\]
The elements of the vector a are sampled independently and uniformly from {−6, −3, 3, 6}, and the elements of the symmetric matrix A are sampled from {−6, −3, 0, 3, 6}. We generated n = 100 observations as the training set. A dataset of the same size is generated as the validation set. The model is fitted on the training set, and the accuracy of the resulting classification is evaluated on the validation set. A total of 20 Monte Carlo simulations were run.
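The simulation design described above can be sketched in a few lines. This is a minimal illustration in Python (the original study used R), not the authors' code. The helper names `cholesky`, `equicorrelated`, `draw_block`, and `simulate` are hypothetical, the seed is arbitrary, and two points are assumptions: the symmetric matrix A is built by sampling its upper triangle and mirroring it, and the quadratic output in (2) uses the same zi = (xi1, xi3, xi4) as in (1).

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular factor L with L L' = matrix (symmetric positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def equicorrelated(p, rho):
    """p x p covariance with unit variances and common correlation rho (Table I)."""
    return [[1.0 if i == j else rho for j in range(p)] for i in range(p)]


def draw_block(rng, lower):
    """One draw from a zero-mean Gaussian whose covariance has Cholesky factor lower."""
    noise = [rng.gauss(0.0, 1.0) for _ in lower]
    return [sum(lower[i][k] * noise[k] for k in range(i + 1))
            for i in range(len(lower))]


def simulate(n, rng, quadratic=False):
    """Return inputs X (n x 10), binary outputs y, and the sampled a and A."""
    chol2 = cholesky(equicorrelated(2, 0.9))
    chol3 = cholesky(equicorrelated(3, 0.9))
    a = [rng.choice([-6, -3, 3, 6]) for _ in range(3)]
    # Assumption: sample the upper triangle of A and mirror it for symmetry.
    A = [[0] * 3 for _ in range(3)]
    for r in range(3):
        for c in range(r, 3):
            A[r][c] = A[c][r] = rng.choice([-6, -3, 0, 3, 6])
    X, y = [], []
    for _ in range(n):
        row = (draw_block(rng, chol2) + draw_block(rng, chol3)
               + [rng.gauss(0.0, 1.0) for _ in range(5)])
        z = [row[0], row[2], row[3]]  # z_i = (x_i1, x_i3, x_i4)
        score = sum(a_j * z_j for a_j, z_j in zip(a, z))
        if quadratic:
            score += sum(z[r] * A[r][c] * z[c] for r in range(3) for c in range(3))
        X.append(row)
        y.append(1 if score >= 0 else 0)
    return X, y, a, A


# One training set of n = 100 wafers for the linear model (1).
train_X, train_y, a, A = simulate(100, random.Random(2014))
```

Calling `simulate` a second time with the same `a` and `A` would be needed to produce a matching validation set; the sketch resamples them on each call for brevity.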
IV. CLASSIFICATION ALGORITHMS
The classification algorithms listed below are used to predict the output as a function of the inputs.
A. k-Nearest-Neighbours
The k-nearest-neighbours method is a model-free algorithm that predicts the output at a point from its k nearest neighbours. The nearest neighbours are found using a distance, often the Euclidean distance, computed over the corresponding input variables. Suppose N(x) is the neighbourhood of the point x containing its k nearest data points; then
\[
\hat{y}(x) = \frac{1}{k} \sum_{x_i \in N(x)} y_i. \tag{3}
\]
This technique gives a step-function approximation to the classification function; see Fig. 1. The tuning parameter k is chosen manually or estimated using cross-validation.
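Equation (3) translates almost directly into code. The following is a minimal sketch, assuming Euclidean distance and a small in-memory training set. The function names are hypothetical, and cutting the average at 0.5 to obtain a class label is an assumption made here for illustration.

```python
import math


def knn_predict(train_x, train_y, x, k=3):
    """Average of the outputs of the k training points closest to x, as in (3)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    neighbours = order[:k]
    return sum(train_y[i] for i in neighbours) / k


def knn_classify(train_x, train_y, x, k=3, cut=0.5):
    """Binary label obtained by cutting the neighbourhood average (assumed cut)."""
    return 1 if knn_predict(train_x, train_y, x, k) >= cut else 0


# Toy one-dimensional example: three wafers of each class.
toy_x = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
toy_y = [0, 0, 0, 1, 1, 1]
label = knn_classify(toy_x, toy_y, [1.5], k=3)
```

Sorting all distances costs O(n log n) per query, which is negligible for n = 100 but would call for a spatial index on larger datasets.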
Fig. 1: An illustrative example of the 3-nearest-neighbours algorithm (horizontal axis: Input; vertical axis: Output). The circles are observations from an unknown function. The three green blobs are the data that fall in the neighbourhood of x. The vertical red line represents x and the horizontal blue line shows the neighbourhood of size 3, denoted by N(x) in (3). The output is predicted by the average of the closest 3 points, denoted by the red blob.
B. Logistic Regression
Logistic regression is a generalization of linear regression to a binary output variable. This technique is used to predict a binary outcome based on one or more continuous predictor variables. Logistic regression estimates the coefficients of a linear classifier using the conditional distribution of yi | xi. Since logistic regression uses a probabilistic model to estimate the classification function, the probability of (y = 1 | x) can be extracted after fitting for a given x. To produce a binary prediction, this estimated probability is cut at a certain point, usually 0.5. The probability of yi = 1 given xi is expressed as
\[
\Pr(y_i = 1 \mid x_i) = \frac{\exp(\beta_0 + x_i'\beta)}{1 + \exp(\beta_0 + x_i'\beta)},
\]
where x′i = (xi1, xi2, . . . , xi10) and β = (β1, . . . , β10)′. The regression coefficients are estimated by maximum likelihood. The log-likelihood function of the Bernoulli distribution is maximized using iteratively reweighted least squares. The Bernoulli log-likelihood, say ℓ(β), is expressed as
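\[
\ell(\beta) = \sum_{i=1}^{n} \log\left[
\left\{ \frac{\exp(\beta_0 + x_i'\beta)}{1 + \exp(\beta_0 + x_i'\beta)} \right\}^{y_i}
\left\{ 1 - \frac{\exp(\beta_0 + x_i'\beta)}{1 + \exp(\beta_0 + x_i'\beta)} \right\}^{1 - y_i}
\right].
\]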
Like linear regression, logistic regression suffers from overfitting and produces unstable coefficient estimates when many noise variables are added to the model. As a remedy, a penalized logistic regression is fitted. The penalty term penalizes large absolute values of the model coefficients. The function to maximize is
\[
\ell(\beta) - \lambda \|\beta\|_2^2,
\]
where $\|\beta\|_2^2 = \sum_{j=1}^{10} \beta_j^2$ is the squared Euclidean norm of the regression coefficients. Here λ is a positive tuning parameter, usually estimated by cross-validation.
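To make the penalized criterion concrete, the sketch below maximizes ℓ(β) − λ||β||² by plain gradient ascent. This is a swapped-in optimizer chosen for brevity, not the iteratively reweighted least squares mentioned above. The function names, step size, and iteration count are illustrative assumptions; the intercept β0 is left unpenalized, matching the penalty sum over j = 1, . . . , 10.

```python
import math


def sigmoid(t):
    """Numerically stable logistic function exp(t) / (1 + exp(t))."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def penalized_loglik(beta0, beta, X, y, lam):
    """Bernoulli log-likelihood minus lam times the squared norm of beta."""
    total = 0.0
    for xi, yi in zip(X, y):
        eta = beta0 + sum(b * v for b, v in zip(beta, xi))
        # log[p^y (1 - p)^(1 - y)] simplifies to y * eta - log(1 + exp(eta)).
        total += yi * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return total - lam * sum(b * b for b in beta)


def fit_penalized_lr(X, y, lam=1.0, step=0.1, iters=2000):
    """Gradient ascent on the penalized log-likelihood (illustrative optimizer)."""
    n, p = len(X), len(X[0])
    beta0, beta = 0.0, [0.0] * p
    for _ in range(iters):
        grad0, grad = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            resid = yi - sigmoid(beta0 + sum(b * v for b, v in zip(beta, xi)))
            grad0 += resid
            for j in range(p):
                grad[j] += resid * xi[j]
        beta0 += step * grad0 / n
        beta = [b + step * (g - 2.0 * lam * b) / n for b, g in zip(beta, grad)]
    return beta0, beta


def predict(beta0, beta, x, cut=0.5):
    """Binary prediction obtained by cutting the fitted probability at `cut`."""
    prob = sigmoid(beta0 + sum(b * v for b, v in zip(beta, x)))
    return 1 if prob >= cut else 0
```

Increasing `lam` shrinks the fitted coefficients toward zero, which is the stabilizing effect the penalty is meant to provide when noise variables are present.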
C. Neural Network
A neural network is a set of simple but highly interconnected processing elements, called neurons, used to fit a highly nonlinear model; see Fig. 2. The neural network was evaluated for different numbers of hidden layers with different weights, and the most predictive number of layers was chosen. This approach helps regularize the algorithm and avoid overfitting.
Fig. 2: Schematic representation of a neural network model.
D. Linear Discriminant
Linear discriminant analysis separates data into different classes (two classes for a binary output) using a linear hyperplane; see Fig. 3 (top left panel). However, linear discriminant coefficients are sensitive to the correlation between the input variables. To improve the classification performance, it has been proposed to perform the classification on the principal components of the data [17]. We applied the linear discriminant to four principal components.
An alternative way to improve performance in the presence of correlated variables is penalization. A penalized discriminant analysis is suggested in [9]. An absolute norm penalty, also called the lasso penalty, is applied to the discriminant vectors to encourage simultaneous variable selection.
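For orientation, the plain (unpenalized) two-class linear discriminant can be sketched as follows. This illustration is restricted to two inputs so that the pooled covariance can be inverted in closed form; it includes neither the principal component step nor the lasso penalty discussed above, and the function names are hypothetical.

```python
import math


def class_mean(rows):
    """Component-wise mean of a list of two-dimensional points."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def fit_lda_2d(X, y):
    """Return (w, c) such that a point x is assigned to class 1 when w.x + c >= 0."""
    group0 = [x for x, lab in zip(X, y) if lab == 0]
    group1 = [x for x, lab in zip(X, y) if lab == 1]
    m0, m1 = class_mean(group0), class_mean(group1)
    # Pooled within-class covariance matrix (2 x 2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, m in ((group0, m0), (group1, m1)):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(X) - 2
    s = [[v / dof for v in row] for row in s]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # Discriminant direction w = S^{-1} (m1 - m0).
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Offset: midpoint between class means, shifted by the log prior ratio.
    c = (-0.5 * (w[0] * (m0[0] + m1[0]) + w[1] * (m0[1] + m1[1]))
         + math.log(len(group1) / len(group0)))
    return w, c


def lda_classify(w, c, x):
    """Assign class 1 on the positive side of the linear boundary."""
    return 1 if w[0] * x[0] + w[1] * x[1] + c >= 0 else 0
```

When two inputs are nearly collinear, `det` approaches zero and the direction `w` becomes unstable, which is the sensitivity to correlation that motivates the principal component and penalized variants.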
E. Quadratic Discriminant
Quadratic discriminant analysis, as its name indicates, uses quadratic boundaries to separate data. This algorithm is similar to the linear discriminant, except that it also allows for quadratic coefficients; see Fig. 3 (top right panel). We also applied this algorithm to the principal components of the data.
F. Mixture Discriminant
Polynomial boundaries such as linear and quadratic functions are too restrictive for complex data. A mixture of discriminant functions covers a flexible class of classification functions. This algorithm is called mixture discriminant analysis [10]. A Gaussian mixture model for the kth class has density
\[
\Pr(X \mid G = k) = \sum_{r=1}^{R_k} \pi_{kr}\, \phi(X, \mu_{kr}, \Sigma_{kr}),
\]
where the mixing proportions πkr sum to one over r. This model has Rk prototypes for the kth class and, in our specification, the covariance matrix Σkr is used as the metric throughout. Given such a model for each class, the class posterior probabilities are given by
\[
\Pr(G = k \mid X) = \frac{\sum_{r=1}^{R_k} \pi_{kr}\, \phi(X, \mu_{kr}, \Sigma_{kr})\, \pi_k}{\sum_{l=1}^{K} \sum_{r=1}^{R_l} \pi_{lr}\, \phi(X, \mu_{lr}, \Sigma_{lr})\, \pi_l},
\]
where πl represents the class prior probabilities. The parameters of the mixture discriminant are estimated by maximum likelihood. The classification obtained through the mixture discriminant is compared with the linear and quadratic discriminants and with neural networks in Fig. 3. For this example, the mixture discriminant gives better results than the linear and quadratic discriminants, and results about as good as those of neural networks. This good classification result is due to the flexibility of the mixture discriminant boundaries.

Fig. 3: Scatter plot of the quadratic simulated data over Input 1 and Input 2; see also Table I. Linear discriminant (top left panel), quadratic discriminant (top right panel), mixture discriminant (bottom left panel), and neural networks (bottom right panel) are used to find the decision boundaries.
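The class posterior of the mixture discriminant can be illustrated with a small sketch. For brevity it assumes univariate inputs and takes the mixture parameters as given rather than estimating them by maximum likelihood; the function names and the numerical values are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def class_density(x, components):
    """Mixture density of one class; components are (proportion, mean, variance)."""
    return sum(w * normal_pdf(x, m, v) for w, m, v in components)


def class_posteriors(x, mixtures, priors):
    """Posterior probability of each class at x, given per-class mixtures and priors."""
    joint = [prior * class_density(x, comps)
             for comps, prior in zip(mixtures, priors)]
    total = sum(joint)
    return [j / total for j in joint]


# Class 0 has two prototypes on either side of class 1, so no single
# linear or quadratic boundary in x separates the two classes well.
mixtures = [
    [(0.5, -3.0, 1.0), (0.5, 3.0, 1.0)],  # class 0: R_0 = 2 prototypes
    [(1.0, 0.0, 1.0)],                    # class 1: R_1 = 1 prototype
]
priors = [0.5, 0.5]
post_at_zero = class_posteriors(0.0, mixtures, priors)
```

A point is assigned to the class with the largest posterior, so in this toy setting points near 0 go to class 1 and points near ±3 go to class 0.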
G. Kernelized Support Vector Machines
A support vector machine constructs a separating hyperplane in a high-dimensional space. Intuitively, a good separation is achieved by the hyperplane that has the largest distance to the nearest training data point of any class, the so-called functional margin; see Fig. 4. As with the other methods, once the hyperplane is computed, the data are categorized into two classes. Instead of the linear support vector machine, we tested a more flexible version called the kernelized support vector machine. The kernel function transforms the classification problem into a new space defined by the kernel (inner product) on the input variables. We used the radial basis kernel, also called the Gaussian kernel.
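As a rough illustration of the kernel idea, the sketch below computes the Gaussian kernel and evaluates a kernelized decision function for given support vectors. It does not train a support vector machine: the support vectors, signed coefficients, and intercept are hypothetical inputs that a fitted model would supply, and the parametrization exp(−γ||x − z||²) with `gamma` is one common convention assumed here.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (radial basis) kernel exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def gram_matrix(points, gamma=1.0):
    """Matrix of pairwise kernel values, the inner products in the new space."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]


def decision_value(x, support_vectors, coefs, intercept, gamma=1.0):
    """Kernel expansion sum_i coef_i * K(sv_i, x) + intercept."""
    return sum(c * rbf_kernel(s, x, gamma)
               for s, c in zip(support_vectors, coefs)) + intercept


def svm_classify(x, support_vectors, coefs, intercept, gamma=1.0):
    """Class 1 on the non-negative side of the decision function, else class 0."""
    value = decision_value(x, support_vectors, coefs, intercept, gamma)
    return 1 if value >= 0 else 0


# Hypothetical fitted model: one support vector per class.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
coefs = [-1.0, 1.0]  # signed weights, negative for class 0
label = svm_classify([1.8, 2.1], support_vectors, coefs, intercept=0.0)
```

The boundary is linear in the kernel-induced space but can be strongly curved in the original input space, which is what makes the kernelized version more flexible than the linear one.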
V. NUMERICAL RESULTS
A Monte Carlo simulation study over the linear and the quadratic models (1) and (2) is summarized in Table II.
Fig. 4: Linear support vector machines shown on a separable illustrative example (axes: Input 1 and Input 2). The dashed lines show the margins, and the data that fall on the margin, shown by triangles, are called the support vectors.
                      Linear            Quadratic
Algorithm          p̂L   p̂    p̂U      p̂L   p̂    p̂U
Neural Network     94   94   94      83   84   84
Kernel SVM         88   88   88      82   82   83
MDA-PCA            86   86   87      81   82   82
QDA-PCA            86   87   87      81   82   82
KNN                85   86   85      82   82   82
LDA-PCA            87   87   88      75   76   77
Penalized LDA      87   88   88      76   76   77
Penalized LR       90   91   91      66   67   68
LR                 88   89   89      63   64   64

TABLE II: The estimated correct classification rates for different algorithms in percentages, p̂, and their respective 95% confidence lower and upper bounds, p̂L and p̂U. The results are shown once for the linear simulated data (left) and once for the quadratic simulated data (right).
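The paper does not state how the confidence bounds in Table II were computed. One plausible construction, shown here purely as an assumption, is a normal-approximation interval around the mean correct classification rate over the Monte Carlo replicates. The function names and the example rates are hypothetical and are not taken from the study.

```python
import math
import statistics


def correct_rate(y_true, y_pred):
    """Percentage of validation wafers whose predicted state matches the true one."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return 100.0 * hits / len(y_true)


def rate_with_interval(rates, z=1.96):
    """Mean rate with an approximate 95% interval over Monte Carlo replicates.

    Assumption: mean +/- z * standard error, which may differ from the
    construction actually used for Table II.
    """
    mean = statistics.mean(rates)
    half_width = z * statistics.stdev(rates) / math.sqrt(len(rates))
    return mean - half_width, mean, mean + half_width


# Hypothetical rates from five replicates (the study used 20).
example_rates = [84.0, 86.0, 85.0, 87.0, 83.0]
lower, centre, upper = rate_with_interval(example_rates)
```

With 20 replicates the standard error shrinks by a factor of about 4.5 relative to the spread of individual rates, which is consistent with the narrow intervals reported in the table.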
The simulation code is written in the statistical programming language R [19]. Simulations were performed on a 2.30 GHz Intel Core i5-2410M processor with 6.00 GB of RAM, taking around 2 minutes to run all algorithms. The datasets and the R code are available and will be provided upon request. The correct classification rates of the output variable are summarized in Table II.
The neural network outperforms all other algorithms for both the linear and the quadratic data. Logistic regression (LR) and penalized logistic regression (Penalized LR) are also good classifiers for the linear data. However, they give substantially lower correct classification rates for the quadratic data. Penalized LR improves the rate of correct classification compared to LR, particularly for the quadratic output, where it achieves an increase of 3 percentage points (from 64% to 67%). On the quadratic output, the quadratic discriminant combined with the principal components (QDA-PCA) shows better results than the linear discriminant (LDA-PCA), with an increase of 6 percentage points in the correct classification rate (from 76% to 82%). The mixture discriminant method (MDA-PCA) gives results similar to the quadratic discriminant, and better than LDA-PCA and Penalized LDA for the quadratic output (82% against 76%). The kernelized support vector machine (Kernel SVM) gives accurate predictions for both the linear and the quadratic outputs.
VI. CONCLUSION
We briefly reviewed the existing literature on quality control with an emphasis on virtual metrology. We stress that the choice of a proper classification algorithm is of great importance in this area. Therefore, we studied several algorithms that could be used for VM on simulated data. These algorithms perform differently depending on the output (linear or quadratic). However, the neural network outperforms all the others in both cases. This suggests keeping the neural network as a strong candidate for modelling in VM.
REFERENCES
[1] R. Baly and H. Hajj, “Wafer classification using support vector machines,” Semiconductor Manufacturing, IEEE Transactions on, vol. 25, no. 3, pp. 373–383, 2012.
[2] S. Bassetto and A. Siadat, “Operational methods for improving manufacturing control plans: case study in a semiconductor industry,” Journal of Intelligent Manufacturing, vol. 20, no. 1, pp. 55–65, 2009.
[3] J. Besnard, D. Gleispach, H. Gris, A. Ferreira, A. Roussy, C. Kernaflen, and G. Hayderer, “Virtual metrology modeling for CVD film thickness,” International Journal of Control Science and Engineering, vol. 2, no. 3, pp. 26–33, 2012.
[4] S. Bhatikar and A. Siadat, “Operational methods for improving manufacturing control plans: case study in a semiconductor industry,” Journal of Intelligent Manufacturing, vol. 20, no. 1, pp. 55–65, 2009.
[5] Y. J. Chang, Y. Kang, C. L. Hsu, C. T. Chang, and T. Y. Chan, “Virtual metrology technique for semiconductor manufacturing,” in Neural Networks, 2006. IJCNN ’06. International Joint Conference on. IEEE, 2006, pp. 5289–5293.
[6] Y. T. Chen, H. C. Yang, and F. T. Cheng, “Multivariate simulation assessment for virtual metrology,” in Robotics and Automation, 2006. ICRA 2006. Proceedings 2006 IEEE International Conference on. IEEE, 2006, pp. 1048–1053.
[7] P. H. Chou, M. J. Wu, and K. K. Chen, “Integrating support vector machine and genetic algorithm to implement dynamic wafer quality prediction system,” Expert Systems with Applications, vol. 37, no. 6, pp. 4413–4424, 2010.
[8] D. E. Goldberg, Genetic algorithms in search, optimization, and machine learning. Addison-Wesley Reading Menlo Park, 1989, vol. 412.
[9] T. Hastie, A. Buja, and R. Tibshirani, “Penalized discriminant analysis,” The Annals of Statistics, vol. 23, no. 1, pp. 73–102, 1995.
[10] T. Hastie and R. Tibshirani, “Discriminant analysis by Gaussian mixtures,” Journal of the Royal Statistical Society, Series B, vol. 58, no. 1, pp. 155–176, 1996.
[11] K. L. Hsieh and L. I. Tong, “Optimization of multiple quality responses involving qualitative and quantitative characteristics in IC manufacturing using neural networks,” Computers in Industry, vol. 46, no. 1, pp. 1–12, 2001.
[12] A. A. Khan, J. R. Moyne, and D. M. Tilbury, “Virtual metrology and feedback control for semiconductor manufacturing processes using recursive partial least squares,” Journal of Process Control, vol. 18, no. 10, pp. 961–974, 2008.
[13] W. Kuo and T. Kim, “An overview of manufacturing yield and reliability modeling for semiconductor products,” Proceedings of IEEE, vol. 87, no. 8, pp. 1329–1344, 1999.
[14] T. H. Lin, F. T. Cheng, W. M. Wu, C. A. Kao, A. J. Ye, and F. C. Chang, “NN-based key-variable selection method for enhancing virtual metrology accuracy,” Semiconductor Manufacturing, IEEE Transactions on, vol. 22, no. 1, pp. 204–211, 2009.
[15] T. H. Lin, M. H. Hung, R. C. Lin, and F. T. Cheng, “Virtual metrology scheme for predicting CVD thickness in semiconductor manufacturing,” in Robotics and Automation, 2006. ICRA 2006. Proceedings 2006 IEEE International Conference on. IEEE, 2006, pp. 1054–1059.
[16] K. Mao, “Feature subset selection for support vector machines through discriminative function pruning analysis,” Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on, vol. 34, no. 1, pp. 60–67, 2004.
[17] G. L. Marcialis and F. Roli, “Fusion of LDA and PCA for face verification,” in Biometric Authentication. Springer, 2002, pp. 30–37.
[18] J. Moyne, E. Del Castillo, and A. M. Hurwitz, Run-to-run control in semiconductor manufacturing. CRC Press, 2010.
[19] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2014. [Online]. Available: http://www.R-project.org/
[20] D. Stokes and G. May, “Real-time control of reactive ion etching using neural networks,” Semiconductor Manufacturing, IEEE Transactions on, vol. 13, no. 4, pp. 469–480, 2000.
[21] G. A. Susto, A. Beghi, and C. De Luca, “A virtual metrology system for predicting CVD thickness with equipment variables and qualitative clustering,” in Emerging Technologies & Factory Automation (ETFA), 2011 IEEE 16th Conference on. IEEE, 2011, pp. 1–4.
[22] J. C. Yung-Cheng and F. T. Cheng, “Application development of virtual metrology in semiconductor industry,” in Industrial Electronics Society, 2005. IECON 2005. 31st Annual Conference of IEEE. IEEE, 2005.
[23] D. Zeng and C. J. Spanos, “Virtual metrology modeling for plasma etch operations,” Semiconductor Manufacturing, IEEE Transactions on, vol. 22, no. 4, pp. 419–431, 2009.