Dynamic Quantiﬁcation of Process Parameters in Viscose … · 2014-01-15 · Dynamic Quantiﬁcation of Process Parameters in Viscose Production with Evolving Fuzzy Systems? Carlos

Dynamic Quantification of Process Parameters in Viscose Production with Evolving Fuzzy Systems *

Carlos Cernuda1, Edwin Lughofer1, Lisbeth Suppan2, Thomas Roder3, Roman Schmuck3, Peter Hintenaus4, Wolfgang Marzinger5 and Jurgen Kasberger6

1 Department of Knowledge-based Mathematical Systems, Johannes Kepler University of Linz, Austria
2 Kompetenzzentrum Holz GmbH, St.-Peter-Str. 25, 4021 Linz, Austria
3 Lenzing AG, 4860 Lenzing, Austria
4 Software Research Center, Paris Lodron University Salzburg, Austria
5 i-RED Infrarot Systeme GmbH, Linz, Austria
6 Recendt GmbH, Linz, Austria

Abstract. In viscose production, it is important to monitor three process parameters of the spin-bath in order to assure a high quality of the final product: the concentrations of H2SO4, Na2SO4 and ZnSO4. During on-line production these process parameters are usually highly dynamic, depending on the fibre type that is produced. Thus, conventional chemometric models, kept fixed during the whole life-time of the on-line process, behave imprecisely and unreliably when predicting the concentrations of new on-line data. In this paper, we demonstrate evolving chemometric models based on the TS fuzzy systems architecture, which are able to adapt automatically to varying process dynamics by updating their inner structures and parameters in a single-pass incremental manner. Gradual forgetting mechanisms are necessary in order to out-date older learned relations and to make the models more flexible and spontaneous. The results show that our dynamic approach is able to overcome the huge prediction errors produced by various state-of-the-art static chemometric models, which could be elicited over a three-month period.

Keywords: viscose production, dynamic processes, evolving chemometric models, gradual forgetting

1 Introduction

1.1 Motivation

The viscose process is of economic significance and production has been growing rapidly for the past two decades. However, the analytics accompanying the process can hardly follow this growth due to the long analysis time for withdrawn samples. In highly dynamic processes, whose system behavior or operating conditions change quite frequently within days, new and fast measurement methods are required to achieve adequate response times for process control. In this regard, NIR spectroscopy is a powerful method for measuring the most important process parameters, namely the concentrations of H2SO4, Na2SO4 and ZnSO4, which are contained in the spin-bath and whose composition determines the forming of the viscose filament, which is associated with the viscose fibre properties. Closed loops and frequent changes in the production lead to unsteady process conditions. This means that not only the measured variables themselves change, but also the level of accompanying substances and impurities. Thus, chemometric models built from NIR measurements [3] are also expected to handle the dynamic changes of the system behavior and to include new states and operating conditions on-the-fly during the on-line production process.

* This work was funded by the Austrian research funding association (FFG) under the scope of the COMET programme within the research network 'Process Analytical Chemistry (PAC)' (contract # 825340). This publication reflects only the authors' views.


Conventional chemometric models (such as PLS and its robust version in [22], PCR [20], LWR and many others, see e.g. [23], including clustering and soft computing techniques) cannot cover the entire range of occurrences, leading to severe drops in predictive accuracy on new on-line samples. They also do not allow on-line process control without permanently re-training the whole model with all samples seen so far, which is usually time-intensive and slow. Thus, the usage of such off-line methods in a fully automatic on-line setting is neither promising nor recommended, and in most cases even impossible. Furthermore, intrinsic non-linear behaviors between input spectra and target concentrations may be present in the process, which rules out the application of recursive linear models [14].

1.2 Our Approach (Summary)

In this paper, we present a new methodology for setting up dynamic, updateable chemometric models. The approach is evolving in the sense that the structure of the models may change and expand on demand, according to the variations in the production process, accounting for more or less non-linearity. Therefore, we speak of evolving chemometric models. Both the recursive adaptation of parameters and the evolution of structural components are driven by single-pass incremental learning techniques (see Section 2.3). The applied model architecture is a Takagi-Sugeno fuzzy system, which from a chemometric point of view can be interpreted as a weighted sum of local linear predictors with multivariate Gaussian kernels that forms a global non-linear and smooth model. In contrast to [8], where a recursive multi-model partial least squares approach is demonstrated for chemometric purposes, the number of local models (rules in the fuzzy system) accounting for different degrees of non-linearity does not need to be pre-parameterized, as the rules are evolved fully automatically from the process data on-line. The complexity of the chemometric model is automatically reduced whenever some rules are no longer needed because they have become overlapping and redundant to other rules. Furthermore, our approach supports the option of a gradual, smooth forgetting over time in order to react to high process dynamics.

The approach will be evaluated based on a streaming data set from the viscose production process at Lenzing AG (Section 3). The results will show that our evolving modeling technique overcomes the huge errors produced by various conventional chemometric models formerly applied to the process (using different modeling methods such as PLS, PCR, LWR, MLR, Stepwise Regression, ANN, Regtree and GLMnet).

2 Chemometric Modeling Procedure

As model architecture, we use Takagi-Sugeno fuzzy systems with linear consequent functions, Gaussian fuzzy sets and product t-norm as conjunction operator, also called fuzzy basis function networks [24]. The reason for this choice lies in the interpretation capabilities of TS fuzzy systems within the context of chemometrics, as basically statistical methods are used there for calibration purposes (see e.g. [23]). As in many applications the corresponding concentration values for calibration samples are quite costly to obtain (e.g. by laboratory analysis), the ratio between the number of training samples and the dimensionality of the learning problem is unfavorable for learning algorithms. Thus, linear methods are often used [22], as they suffer less from the curse of dimensionality than complex non-linear models [10]. However, implicit non-linearities in the relations between spectral data and concentrations favor the usage of non-linear methods. In this context, TS fuzzy systems provide a good balance, as they consist of piece-wise local linear predictors of the form (C = the number of rules)

$$l_i = w_{i0} + w_{i1}x_1 + w_{i2}x_2 + \dots + w_{ip}x_p, \qquad i = 1,\dots,C \qquad (1)$$

with all the characteristics and properties of linear regression models, which are combined with basis functions

$$\Psi_i(x) = \frac{\mu_i(x)}{\sum_{k=1}^{C}\mu_k(x)}, \qquad \mu_i(x) = \prod_{j=1}^{p}\exp\left(-0.5\,\frac{(x_j - c_{ij})^2}{\sigma_{ij}^2}\right)$$

(normalized multivariate Gaussians, with j indexing the p input dimensions) for a smooth transition between two linear predictors to form a non-linear model. The degree of non-linearity can be simply steered by the number of rules, so that the training process can react automatically to the type of learning problem (see steps below). Furthermore, they can be used within the context of on-line incremental learning scenarios (termed evolving fuzzy systems [17]), which is required in the dynamic viscose production process.

Fig. 1. (a): an outlier sample (indicated with arrow and circle) lying far away from the real trend of the relationship spoils the model significantly; (b): a systematic error (surrounded by an ellipse) significantly spoils the model trying to follow the real trend
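The weighted sum of local linear predictors can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `ts_predict` and the nested-list layout of centers, widths and consequent weights are assumptions made for illustration only.

```python
import math


def ts_predict(x, centers, sigmas, weights):
    """Evaluate a Takagi-Sugeno fuzzy model at input vector x.

    centers[i][j], sigmas[i][j]: Gaussian center and width of rule i in dimension j.
    weights[i]: consequent parameters [w_i0, w_i1, ..., w_ip] of rule i.
    (Hypothetical data layout, chosen for this sketch.)
    """
    # Rule activations mu_i(x): product of one-dimensional Gaussians.
    mu = [
        math.prod(
            math.exp(-0.5 * (xj - c) ** 2 / s ** 2)
            for xj, c, s in zip(x, c_i, s_i)
        )
        for c_i, s_i in zip(centers, sigmas)
    ]
    total = sum(mu)  # assumed > 0, i.e. x is not extremely far from all rules
    # Normalized basis functions Psi_i(x); they sum to one.
    psi = [m / total for m in mu]
    # Local linear predictors l_i(x) as in Eq. (1).
    local = [w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)) for w in weights]
    # Global output: basis-function-weighted sum of the local predictors.
    return sum(p * l for p, l in zip(psi, local))
```

For two rules with constant consequents 1 and 3, centered at 0 and 2 with unit widths, the input x = 1 lies exactly between both centers, so both rules fire equally and the output is 2. With a single rule the model reduces to an ordinary linear predictor.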

Our procedure for applying chemometric fuzzy models consists of three stages: 1.) a pre-processing phase that eliminates outliers in the inputs as well as the outputs from the training data set; 2.) an initial modeling phase including a wavelength reduction and the extraction of fuzzy systems from NIR spectral data, embedded in a cross-validation process coupled with an optimal parameter search (varied over a parameter grid); and 3.) a model adaptation phase throughout the on-line process with specific emphasis on sufficient model flexibility. In the following, we provide a short summary of each of these phases.

2.1 Pre-Processing Phase

The influence of possible outliers on the final models can be critical. An example of outliers affecting the final model tendency/surface even in a batch training phase is shown in Figure 1. Therefore, we have performed tests to look for outliers in both the input and the output values.

The test performed on the targets is based on the Mahalanobis distance, defined for two vectors x and y as

$$d_M(x,y) = \sqrt{(x-y)^T S^{-1} (x-y)}.$$

It takes into account the covariance matrix S, where $S_{ij}$ is the covariance between $x_i$ and $x_j$ and $S_{ii}$ is the variance of $x_i$. Thus, we consider elliptic, instead of circular, regions of equidistant points. The procedure is to calculate the mean and the standard deviation of all pairwise Mahalanobis distances between the target values. A value is then considered an outlier if its distance to the mean is higher than n times the standard deviation, with n = 3 as the default value.
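The following sketch shows one possible reading of this target test for a single scalar target, where S reduces to the variance and the Mahalanobis distance to |x − y| / s. The exact statistic used by the authors is not spelled out in the text, so averaging the pairwise distances per sample before applying the n-sigma rule is an assumption of this sketch, as is the function name `target_outliers`.

```python
import statistics


def target_outliers(targets, n=3.0):
    """Return indices of target values flagged as outliers.

    Assumption of this sketch: each sample is summarized by its average
    pairwise Mahalanobis distance to all other samples, and it is flagged
    when this summary deviates from the mean summary by more than n
    standard deviations. Requires at least two distinct target values.
    """
    s = statistics.stdev(targets)  # scalar case: sqrt of the 1x1 matrix S
    m = len(targets)
    # Average pairwise distance per sample (the zero self-distance is
    # included in the sum but not in the divisor).
    avg_dist = [sum(abs(t - u) / s for u in targets) / (m - 1) for t in targets]
    center = statistics.mean(avg_dist)
    spread = statistics.pstdev(avg_dist)
    return [k for k, d in enumerate(avg_dist) if abs(d - center) > n * spread]
```

Note that with very few samples a single extreme value inflates the spread so much that it can never exceed the 3-sigma bound; the rule only becomes effective for reasonably large calibration sets.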

The test performed for the inputs is based on statistical approximations in projection methods [21]. The projection method employed is PCA. Suppose that we have a matrix X whose N columns are the predictor variables, i.e. we are working in an N-dimensional space E. Once a number a of PCs has been selected, PCA projects the data onto a subspace S of dimensionality a, defined by the first a PCs. Then we can consider the orthogonal complement T of S, which is (N − a)-dimensional, meaning that S ⊕ T = E. Thus any element x in E has a projection in both S and T such that their sum is x. The distance from x to S (called score distance, SD) and to T (called orthogonal distance, OD) can be modeled by means of Snedecor's F distributions

$$SD \equiv F(a, M-a), \qquad OD \equiv F(K-a, (K-a)(M-a)),$$

where M is the number of instances (i.e. rows of X) and K is the rank of X. Taking into account that K is unknown but can be estimated, and using the approximation $\lim_{n_2 \to \infty} F(n_1, n_2) = \chi^2(n_1)$, we can approximate both distances by

$$SD \cong \chi^2(a), \qquad OD \cong \chi^2(M-a).$$

Now p-values can be calculated for a chosen critical level to decide whether an input is an outlier or not.

2.2 Initial Modeling Phase

The first step of the initial modeling phase deals with dimensionality reduction in order to select those wavelengths from the NIR spectra which are really necessary for explaining the variance in the target concentrations. This is done by successively adding new wavelength regressors until a certain level of saturation in terms of model quality is reached. In each iteration we elicit the wavelength band which is most important for explaining the (remaining) information contained in the target, store it in a list of selected regressors and subtract its contribution, together with the contribution of all previously selected regressors, from the target. Finally, a ranking of wavelengths (bands) is obtained according to their importance for explaining the target (the first selected wavelength is the most important, the second selected the second most important, and so on) and stored in an ordered list Rank.
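The ranking idea can be sketched as a greedy residual-based forward selection. This is a simplified stand-in and not the authors' procedure: here each selected wavelength is fitted to the current residual by a one-variable least-squares regression, and the importance criterion (explained sum of squares) as well as the name `rank_features` are assumptions of this sketch.

```python
def rank_features(X, y, n_select):
    """Greedy forward ranking of the columns of X for explaining y.

    X: list of rows (samples), y: list of target values.
    Returns the ordered list of selected column indices ("Rank").
    """
    m = len(y)
    cols = list(zip(*X))
    mean_y = sum(y) / m
    residual = [v - mean_y for v in y]  # remaining information in the target
    rank = []
    for _ in range(n_select):
        best = None  # (score, column index, slope, centered column)
        for j, col in enumerate(cols):
            if j in rank:
                continue
            mean_c = sum(col) / m
            centered = [c - mean_c for c in col]
            sxx = sum(c * c for c in centered)
            if sxx == 0.0:  # constant column carries no information
                continue
            sxy = sum(c * r for c, r in zip(centered, residual))
            score = sxy * sxy / sxx  # residual sum of squares explained by column j
            if best is None or score > best[0]:
                best = (score, j, sxy / sxx, centered)
        if best is None:
            break
        _, j, slope, centered = best
        rank.append(j)
        # Subtract the contribution of the selected regressor from the target.
        residual = [r - slope * c for r, c in zip(residual, centered)]
    return rank
```

In practice the loop would be stopped once the model quality saturates rather than after a fixed number of selections.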

The second step concerns the extraction of the TS fuzzy systems from pre-collected calibration samples. This is achieved by applying a two-stage clustering algorithm in the reduced wavelength space: the first stage passes over the data to elicit an appropriate number of local regions C for the problem at hand, the second stage fine-tunes the parameters, i.e. the cluster (= rule) centers and ranges of influence. After the local regions are found and positioned, a regularized weighted least squares (WLS) approach is applied for estimating the consequent parameter vectors w, where the regularization parameter is set automatically based on the condition of the inverse Hessian matrix. For further details on the batch learning phase, please refer to [4].

Steps 1 and 2 are wrapped within a cross-validation (CV) procedure in order to elicit the optimal model structure in terms of number of inputs and rules. This is performed for each input set using the first i features in Rank, with i varying from 1 to max dim, and for each variation of the criterion responsible for adding new clusters in the clustering process (vigilance). This finally provides a 3-D error surface over different input dimensionalities and model complexities in terms of the number of rules. In order to be able to penalize the number of inputs and the model complexity with different intensities, the parameter setting finally selected is the one which achieves the minimal penalized error given by

$$RMSE^{(pen)}_{\alpha} = RMSE \cdot e^{\alpha\, param_1 + \beta\,(1 - param_2)} \qquad (2)$$

with $param_1$ the number of inputs (normalized by the maximal allowed dimensionality) and $param_2$ the normalized vigilance parameter criterion.
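A short sketch of how Eq. (2) can be used to pick a setting from a CV grid is given below. The function names and the tuple layout of the CV results are assumptions of this sketch; the paper does not state the values of α and β, so any values passed in are illustrative.

```python
import math


def penalized_rmse(rmse, n_inputs, max_dim, vigilance_norm, alpha, beta):
    """Eq. (2): RMSE penalized for many inputs and for a small vigilance."""
    param1 = n_inputs / max_dim  # number of inputs, normalized
    param2 = vigilance_norm      # vigilance, assumed already normalized to [0, 1]
    return rmse * math.exp(alpha * param1 + beta * (1.0 - param2))


def select_setting(cv_results, max_dim, alpha, beta):
    """Pick the (n_inputs, vigilance_norm, rmse) tuple with minimal penalized error."""
    return min(
        cv_results,
        key=lambda r: penalized_rmse(r[2], r[0], max_dim, r[1], alpha, beta),
    )
```

With α = β = 0 the selection falls back to the plain CV error; larger α favors models with fewer wavelengths, larger β favors a larger vigilance and hence fewer rules.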

2.3 Incremental On-line Learning (Evolving Model)

Basic Strategy The initial model plus the optimal parameter setting found during the validation phase (number of inputs and vigilance criterion) are used as parameters throughout the model adaptation phase based on new incoming data. In fact, the optimized parameters can be seen as a valuable (start) setting for the problem at hand. Rules may be merged automatically whenever they become superfluous (due to a too small vigilance setting, for instance), and features may be automatically down-weighted in the piecewise linear predictors serving as consequent outputs of the TS fuzzy systems. The FLEXFIS approach [16] is used as incremental learning engine for updating the fuzzy systems on a sample-per-sample basis. The model's own predictions should not be fed back for learning, as this increases the risk of self-error back-propagation [7]. A new target value is available every 15 minutes as measured by the titration automat, whose sampling apparatus is placed in close proximity to the IR probe used for interfacing the IR spectrometer to the process.

The consequent parameters in the FLEXFIS approach are updated by a recursive fuzzily weighted least squares estimator (see [17], Chapter 2), providing exact solutions, meaning that the same solutions are found as when estimating the linear parameters in batch mode. The antecedent learning takes place in the cluster space using an incremental evolving version of vector quantization [15], which adds new rules on demand and updates the centers and ranges of influence of the clusters. When using the Euclidean distance, the ellipsoidal clusters become axes-parallel, thus a recursive variance formula is used for estimating and updating the ranges of influence σ:

$$(n_{win}+1)\,\sigma_{win,j}^2 \leftarrow n_{win}\,\sigma_{win,j}^2 + (n_{win}+1)\,\Delta c_{win,j}^2 + (c_{win,j} - x_j)^2 \qquad \forall j = 1,\dots,p+1 \qquad (3)$$

with $c_{win}$ the center of the updated cluster, $\Delta c$ the difference between the updated and the old position of the cluster center, and $n_{win}$ the support of the updated cluster. When using the Mahalanobis distance, the ellipsoidal clusters can take an arbitrary position, thus the incremental (recursive) update of the covariance matrix Σ is applied [18]:

$$\Sigma(new) = \frac{1}{N+1}\left(N\,\Sigma(old) + \frac{N}{N+1}\,(\bar{X}(N) - x_{N+1})^T(\bar{X}(N) - x_{N+1})\right) \qquad (4)$$

with $\bar{X}$ the mean value of all features in X and N the number of samples seen so far. In order to speed up the process, the calculation of the inverse of the covariance matrix can be omitted by directly updating the inverse covariance matrix, which, however, is just a rough approximation of the batch inverse covariance matrix [1]:

$$\Sigma_{win}^{-1}(new) = \frac{\Sigma_{win}^{-1}(old)}{1-\alpha} - \frac{\alpha}{1-\alpha}\,\frac{\left(\Sigma_{win}^{-1}(old)(x - c_{win}(old))\right)\left(\Sigma_{win}^{-1}(old)(x - c_{win}(old))\right)^T}{1 + \alpha\left((x - c_{win}(old))^T\,\Sigma_{win}^{-1}(old)\,(x - c_{win}(old))\right)} \qquad (5)$$

with $\alpha = \frac{1}{n_{win}+1}$. For details on FLEXFIS, refer to [16].
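The recursive updates in Eqs. (3) and (4) can be written down directly. The sketch below uses plain Python lists; the function names are hypothetical, and the new cluster center is passed in as an argument because the center update rule of the evolving vector quantization is not reproduced here.

```python
def update_sigma2(sigma2, n_win, c_old, c_new, x):
    """Eq. (3): per-dimension update of the squared range of influence.

    sigma2, c_old, c_new, x are lists over the dimensions j; n_win is the
    support of the winning cluster before the update.
    """
    return [
        (n_win * s2 + (n_win + 1) * (cn - co) ** 2 + (cn - xj) ** 2) / (n_win + 1)
        for s2, co, cn, xj in zip(sigma2, c_old, c_new, x)
    ]


def update_mean_cov(mean, cov, n, x):
    """Eq. (4): update mean and covariance after n samples with a new sample x."""
    p = len(x)
    d = [m - xi for m, xi in zip(mean, x)]  # difference between old mean and new sample
    new_cov = [
        [(n * cov[a][b] + n / (n + 1) * d[a] * d[b]) / (n + 1) for b in range(p)]
        for a in range(p)
    ]
    new_mean = [m + (xi - m) / (n + 1) for m, xi in zip(mean, x)]
    return new_mean, new_cov
```

Started from the first sample (mean equal to that sample, zero covariance, n = 1), `update_mean_cov` reproduces the batch covariance with 1/N normalization, so no past samples need to be stored.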

Necessary Extensions in Viscose Production The first extension integrates a forgetting factor for the parameters learned so far, i.e. the impact of parameters learned on older data is down-weighted over time. This gives the model higher flexibility to react to system changes. In the viscose production process, it turned out that strong forgetting is required to account for the very high dynamics of the spin-bath. In particular, the consequent parameters are updated by RWLS with exponential forgetting:

$$w_i(N+1) = w_i(N) + \gamma(N)\left(y(N+1) - r^T(N+1)\,w_i(N)\right) \qquad (6)$$

$$\gamma(N) = \frac{P_i(N)\,r(N+1)}{\frac{\lambda}{\Psi_i(x(N+1))} + r^T(N+1)\,P_i(N)\,r(N+1)}, \qquad P_i(N+1) = \left(I - \gamma(N)\,r^T(N+1)\right)P_i(N)\,\frac{1}{\lambda} \qquad (7)$$

with $P_i$ the inverse Hessian matrix and r the current regressor. It turned out that a forgetting factor λ of 0.9 is an appropriate choice (see also the evaluation section). In the exponential smoothing context, this means that only the latest 21 samples are reflected in the model and its parameters with a weight larger than 0.1.
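One update step of Eqs. (6) and (7) for a single rule can be sketched as follows, again with plain lists. The function names are hypothetical; the regressor is assumed to be r = [1, x_1, ..., x_p] (intercept first, matching Eq. (1)), and P is assumed to be initialized as a large multiple of the identity, which is a common choice but not stated in the text.

```python
def rwls_update(w, P, r, y, psi, lam=0.9):
    """One RWLS step with forgetting factor lam for rule i (Eqs. (6)-(7)).

    w: consequent parameters, P: inverse Hessian (list of lists),
    r: regressor, y: measured target, psi: Psi_i(x) > 0 of the current sample.
    """
    p = len(r)
    Pr = [sum(P[a][b] * r[b] for b in range(p)) for a in range(p)]
    denom = lam / psi + sum(r[a] * Pr[a] for a in range(p))
    gamma = [v / denom for v in Pr]  # gain vector gamma(N)
    err = y - sum(ra * wa for ra, wa in zip(r, w))  # prediction error of rule i
    w_new = [wa + g * err for wa, g in zip(w, gamma)]
    rP = [sum(r[a] * P[a][b] for a in range(p)) for b in range(p)]
    # (I - gamma r^T) P / lam, expanded as (P - gamma (r^T P)) / lam.
    P_new = [[(P[a][b] - gamma[a] * rP[b]) / lam for b in range(p)] for a in range(p)]
    return w_new, P_new


def effective_window(lam, threshold):
    """Largest age k (in samples) whose weight lam**k still exceeds threshold."""
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    k = 0
    while lam ** (k + 1) > threshold:
        k += 1
    return k
```

A sample that lies k steps in the past carries the weight λ^k. For λ = 0.9 this gives 0.9^21 ≈ 0.109 and 0.9^22 ≈ 0.098, so `effective_window(0.9, 0.1)` returns 21, in line with the statement above.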

Forgetting in the antecedent part is achieved by re-activating the clusters through a reduction of the number of samples attached to them:

$$n_{win} = n_{win} - n_{win} \cdot \lambda_{trans}, \qquad \lambda_{trans} = -9.9\,\lambda + 9.9 \qquad (8)$$

This automatically increases the learning gain in the evolving vector quantization process, which had been decreasing monotonically with increasing $n_{win}$ over time.
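As a small worked example of Eq. (8), the sketch below computes the reduced support for a given forgetting factor. The function name and the range check are assumptions of this sketch; the check only reflects that $\lambda_{trans}$ stays within [0, 1] for λ between 8.9/9.9 ≈ 0.899 and 1.

```python
def forget_support(n_win, lam):
    """Eq. (8): reduce the support n_win of a cluster for forgetting factor lam."""
    lam_trans = -9.9 * lam + 9.9
    if not 0.0 <= lam_trans <= 1.0:
        # Outside this range the support would become negative or grow.
        raise ValueError("lam must lie between about 0.899 and 1")
    return n_win - n_win * lam_trans
```

For λ = 1 the support is left unchanged (no forgetting), whereas for the default λ = 0.9 the transformed factor is 0.99 and a cluster with 1000 attached samples is treated as if it had only about 10.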


Fig. 2. (a): two distinct clusters from original data; (b): samples are filling up the gap between the two original clusters, which become overlapping due to movements of their centers and expansion of their ranges of influence

The second extension concerns the reduction of unnecessary complexity of the model over time in order to keep it as slender as possible, which also decreases the computation time for model updates during the on-line process. In an incremental learning context, unnecessary complexity may arise due to the nature of the data stream: two rules may originally be required to resolve non-linearities in the approximation problem sufficiently, but may turn out to be unnecessary at a later stage because they have become significantly redundant (overlapping), see Figure 2. In order to avoid time-intensive overlap criteria between two clusters i and k on high-dimensional ellipsoids, we use virtual projections of the two clusters in all dimensions to one-dimensional Gaussians and calculate an aggregated overlap degree based on all intersection points according to the highest membership degree in each dimension (note that there are two intersection points between two Gaussians):

$$overlap_{ik} = \mathrm{Agg}_{j=1}^{p+1}\, overlap_{ik}(j), \qquad overlap_{ik}(j) = \max\left(\mu(inter_x(1)),\, \mu(inter_x(2))\right) \qquad (9)$$

where Agg denotes an aggregation operator and $\mu(inter_x(1))$ and $\mu(inter_x(2))$ the membership degrees of the two intersection points of the virtually projected Gaussians on dimension j. A feasible choice for Agg is a t-norm, as a strong non-overlap along one single dimension is sufficient for the clusters not to overlap at all; we used the minimum operator in all test cases. If $overlap_{ik}$ is higher than a pre-defined threshold (we used 0.8 in all tests), a merge is conducted in the antecedents using a recursive weighted strategy [19], and in the consequents by resolving possible contradictions with Yager's participatory learning concepts [25].
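The overlap degree of Eq. (9) can be computed in closed form, since two one-dimensional Gaussians of the form exp(−0.5 (x − c)²/σ²) intersect where (x − c_i)/σ_i = ±(x − c_k)/σ_k. The following sketch implements this; the function names are hypothetical, and when both widths are equal only one finite intersection point exists, which the sketch handles as a special case.

```python
import math


def overlap_1d(c_i, s_i, c_k, s_k):
    """Highest membership degree at the intersection points of two 1-D Gaussians."""
    def mu(x):
        # Membership to cluster i; at an intersection it equals that of cluster k.
        return math.exp(-0.5 * ((x - c_i) / s_i) ** 2)

    # Intersection lying between the two centers.
    points = [(c_i * s_k + c_k * s_i) / (s_i + s_k)]
    # Second intersection outside the centers (only for unequal widths).
    if s_i != s_k:
        points.append((c_i * s_k - c_k * s_i) / (s_k - s_i))
    return max(mu(x) for x in points)


def overlap(centers_i, sigmas_i, centers_k, sigmas_k, agg=min):
    """Eq. (9): aggregate the per-dimension overlaps (minimum t-norm by default)."""
    return agg(
        overlap_1d(ci, si, ck, sk)
        for ci, si, ck, sk in zip(centers_i, sigmas_i, centers_k, sigmas_k)
    )
```

Two clusters would then be merged when `overlap(...)` exceeds the threshold of 0.8. Because the minimum is used, two clusters that nearly coincide in one dimension but are well separated in another obtain a low overlap degree and stay separate.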

3 Experimental Setup

3.1 Data set description

We handle three data sets containing NIR spectral data coming from the same system, but with three different targets: H2SO4, Na2SO4 and ZnSO4. The number of wavelengths considered (i.e. the dimensionality of the input space) is 2658 for all three targets. The data was recorded over a time frame of 3 months. The number of instances available is summarized in Table 1. We used the measurements from the first two weeks to set up an initial model based on batch learning techniques. This gave us a first impression of how our method works, also in comparison with state-of-the-art modeling techniques (see below). The remaining blocks were used as data stream for incremental adaptation of our fuzzy system. The last column of Table 1 indicates the number of outliers identified and thus not used in the batch training procedure.


| Target | Samples | Batch Training | Incremental Learning | Outliers identified |
|---|---|---|---|---|
| H2SO4 | 16467 | 3375 | 13092 | 165 |
| Na2SO4 | 8406 | 1809 | 6597 | 69 |
| ZnSO4 | 8742 | 1881 | 6861 | 57 |

Table 1. Data sets description.

3.2 State-of-the-art methods used for comparison

In order to compare the initial batch trained fuzzy systems as non-linear predictors with state-of-the-art methods widely used in the field of chemometric modeling, we applied the following techniques:

– Multiple Linear Regression (MLR): standard least squares regression using all wavelengths
– Principal Components Regression (PCR) [13]: MLR on principal components
– Partial Least Squares (PLS) Regression [9]: similar to PCR, but dimensionality is reduced w.r.t. the target
– Locally Weighted Regression (LWR) [5]: fitting a regression surface to data through multivariate smoothing
– Regression Trees (RegTree) [10]: use a tree to represent the recursive partition of the input space
– Stepwise Regression (StepwiseReg) [6]: includes a statistical-based variable selection
– Artificial neural networks (ANN) [12] after the possibilistic approach by Wasserman
– GLMNet [11]: a recent development combining the concepts of lasso and ridge regression.

3.3 Evaluation scheme

The procedure chosen for model selection is a 10-fold cross-validation (CV) over the training data set. The CV is repeated 5 times, calculating the average root mean squared error (RMSE). The learning parameters and the dimensionality of the input space were tuned within the CV procedure. The optimal parameters are determined based on the RMSE of the CV, but penalizing high complexity according to (2). Once the best models have been selected for all the algorithms, they are tested using approximately the first quarter of the validation data sets.

In the on-line phase we employ the incremental learning concepts for evolving chemometric models (EvolvingChemo), as presented in Section 2.3. Several error measures are calculated (both the sample-by-sample and the accumulated RMSE and NormRMSE for any incoming sample), given by

$$RMSE = \sqrt{(y - \hat{y})^2} = |y - \hat{y}|, \qquad NormRMSE = \frac{RMSE}{\max(targets) - \min(targets)},$$

$$RMSE^{acum}_{i+1} = \sqrt{\frac{i \cdot (RMSE^{acum}_{i})^2 + (y - \hat{y}_i)^2}{i+1}}, \qquad NormRMSE^{acum}_{i+1} = \frac{RMSE^{acum}_{i+1}}{\max(targets) - \min(targets)} \qquad (10)$$

Apart from the error measures, observed vs. predicted plots and the correlation coefficients between them will be shown, as well as the computation time and the model complexity. In principle, the model adaptation is performed for each sample, meaning that once a new input arrives the target is predicted, the error measures are calculated and stored, and the model is updated. This is in accordance with the well-known interleaved test-and-then-train scenario [2]. Merging and forgetting are both switched on, with default values of 0.8 for the local region similarity degree and 0.9 for the forgetting factor.


| Algorithm (H2SO4) | Parameters | RMSE | NormRMSE | SD |
|---|---|---|---|---|
| MLR | - | 3.2405 | 0.1403 | 0.1921 |
| PCR | PC = 6 | 2.1377 | 0.0925 | 0.1156 |
| PLSR | LV = 12 | 1.8603 | 0.0805 | 0.1191 |
| LWR | LV = 3 | 1.6880 | 0.0731 | 0.2486 |
| RegTree | mpar = 10 | 1.2439 | 0.0538 | 0.2029 |
| StepwiseReg | pval = 0.045 | 1.1208 | 0.0485 | 0.1115 |
| ANN | s = 1.00 | 2.0281 | 0.0878 | 0.1585 |
| GLMNet | λ = 0.04 | 1.6281 | 0.0705 | 0.1184 |
| FLEXFIS batch (sel 1) | dim = 7, vigi = 0.1 | 1.0956 | 0.0474 | 0.1430 |

Table 2. Cross validation results from initial batch modeling phase for H2SO4

4 Results

4.1 Model Selection and Validation (off-line)

The results obtained in the CV for the target H2SO4 are summarized in Table 2, containing the best parameters, the root mean squared error, the normalized root mean squared error and the standard deviation over all the folds. The results for the best algorithms are highlighted in bold. For H2SO4 (Table 2), FLEXFIS batch and StepwiseReg show similar behavior, outperforming the rest of the algorithms. Similar results could be observed for the other two targets.

In all cases, the error increases dramatically when we apply the final trained models (with optimal parameters) to new on-line data without any adaptation and evolution phase. To illustrate this, Figure 3 shows the predicted and the observed values for the first quarter of the validation instances using the best two algorithms for every target. Obviously, none of the algorithms is able to model the dynamics of the process within a reasonable error, especially not after the first 300-400 samples. For the remaining three quarters of the validation instances the behavior is even worse, underlining the absolute necessity of on-line incremental techniques in this production process.

(a) FLEXFIS static (b) GLMNet (c) StepwiseReg

Fig. 3. Observed vs. predicted on H2SO4 (a), Na2SO4 (b) and ZnSO4 (c) (off-line models); note the bad performance, producing correlations of observed vs. predicted below 0.5

4.2 On-line Phase

The procedure in the on-line phase is as follows: a) the system is modeled using the dimensionality and vigilance suggested by the model selection procedures in the off-line phase, where several options are considered, such as pruning or not pruning and the amount of forgetting we allow (no forgetting and forgetting); b) once a new incoming sample arrives, the target is predicted and the errors are calculated, accumulated and stored; and c) the model is updated with the new sample.

Figure 4 shows the observed vs. predicted plots in the case of pruning and forgetting. As the number of validation instances is large, we have split them into blocks (4 blocks for H2SO4 and 3 for the other targets), showing only the first 2 of each to make the achieved accuracy easier to assess by visual inspection. The observed and predicted lines more or less overlap each other, indicating a very high accuracy of our method (which is further underlined with concrete numbers in the tables).

Fig. 4. Observed vs. predicted on H2SO4 (a), Na2SO4 (b) and ZnSO4 (c) when applying evolving chemometric models, with observed and predicted values lying over each other (compare the improvement over the results shown in Figure 3)

Table 3 contains the results for all three targets, considering both selection procedures and pruning or not pruning. 'AvComp' denotes the average complexity and 'Corr' the correlation coefficient between predicted and observed values; the other measures are described in Section 3.3. To allow a fast comparison between the on-line and off-line validation processes, the last row of each part shows the errors, times and correlation obtained by the best algorithm in the off-line phase. The difference is significant: the correlations between predicted and measured values increase from below 0.5 up to [0.95,0.98].

As an example, Figure 5 (a) shows the evolution of the number of rules over time for Na2SO4. Obviously, at around sample 4000 a significant number of rules can be merged, as they are no longer needed. Finally, Figure 5 (b) and (c) underline why forgetting is necessary: the predictions with forgetting are more stable and smooth than without forgetting (compare the right with the left figure), and omitting forgetting causes the correlation between measured and predicted values to drop below 0.9.

5 Conclusion

In this paper, we demonstrated an approach for building chemometric models on-the-fly based on on-line process data in a viscose production process, where the prediction, quantification and supervision of three concentrations (H2SO4, Na2SO4 and ZnSO4) is essential in order to guarantee high quality of the products. These models, employing the TS model architecture, are able to permanently adapt to changing system characteristics and to include new upcoming system states and operating conditions on demand without the necessity of time-intensive re-calibration phases. The results show that our method is able to outperform conventional state-of-the-art chemometric methods, calibrated based on pre-recorded off-line data samples and kept fixed during the on-line process, when predicting the actual concentrations of the three process parameters over a time period of three months. In fact, the high error rate of conventional methods could be reduced by a factor of 10, and high correlations between observed and predicted values in the range of [0.95,0.98] could be achieved (formerly the correlations lay below 0.5).


| Setting | AvRMSE | NAvRMSE | AccRMSE | NAccRMSE | Corr | Time | AvComp |
|---|---|---|---|---|---|---|---|
| H2SO4 | | | | | | | |
| No Pru-Sel1 | 0.2495 | 0.0123 | 0.5881 | 0.0291 | 0.9737 | 0.1433 | 775.9539 |
| Pru-Sel1 | 0.2570 | 0.0127 | 0.5952 | 0.0294 | 0.9731 | 0.2045 | 599.3145 |
| No Pru-Sel2 | 0.3447 | 0.0170 | 0.6241 | 0.0308 | 0.9703 | 0.0031 | 21.2282 |
| Pru-Sel2 | 0.3457 | 0.0171 | 0.6300 | 0.0311 | 0.9697 | 0.0039 | 12.5538 |
| FLEXFIS (static) | 5.2633 | 0.2278 | - | - | 0.1757 | 0.0001 | 401 |
| Na2SO4 | | | | | | | |
| No Pru-Sel1 | 0.6413 | 0.0154 | 1.5131 | 0.0363 | 0.9575 | 0.0309 | 260.1313 |
| Pru-Sel1 | 0.7525 | 0.0181 | 2.2621 | 0.0543 | 0.9074 | 0.0254 | 95.4457 |
| No Pru-Sel2 | 0.8006 | 0.0192 | 1.6128 | 0.0387 | 0.9516 | 0.0040 | 29.7643 |
| Pru-Sel2 | 0.8462 | 0.0203 | 2.1512 | 0.0516 | 0.9150 | 0.0037 | 10.7014 |
| GLMNet (static) | 12.2347 | 0.2384 | - | - | 0.4134 | 0.00001 | - |
| ZnSO4 | | | | | | | |
| No Pru-Sel1 | 0.1013 | 0.0158 | 0.2143 | 0.0335 | 0.9796 | 0.0665 | 377.3374 |
| Pru-Sel1 | 0.1119 | 0.0174 | 0.2838 | 0.0444 | 0.9648 | 0.0845 | 242.8069 |
| No Pru-Sel2 | 0.1589 | 0.0248 | 0.2533 | 0.0396 | 0.9720 | 0.0011 | 2 |
| Pru-Sel2 | 0.1589 | 0.0248 | 0.2533 | 0.0396 | 0.9720 | 0.0012 | 2 |
| StepwiseReg (static) | 12.1905 | 1.6474 | - | - | 0.1466 | 0.000002 | - |

Table 3. On-line validation results for the three targets; the last row in each part represents the performance of static models

(a) # of rules over time (b) No forgetting (c) Forgetting

Fig. 5. Example of the evolution of the complexity (a), example of the influence of not forgetting (b) or forgetting (c)

References

1. Backer, S.D., Scheunders, P.: Texture segmentation by frequency-sensitive elliptical competitive learning. Image and Vision Computing 19(9–10), 639–648 (2001)
2. Bifet, A., Holmes, G., Kirkby, R., Pfahringer, B.: MOA: Massive online analysis. Journal of Machine Learning Research 11, 1601–1604 (2010)
3. Brereton, R.: Chemometrics: Data Analysis for the Laboratory and Chemical Plant. John Wiley & Sons, Hoboken, New Jersey (2003)
4. Cernuda, C., Lughofer, E., Maerzinger, W., Kasberger, J.: NIR-based quantification of process parameters in polyetheracrylat (pea) production using flexible non-linear fuzzy systems. Chemometrics and Intelligent Laboratory Systems 109(1), 22–33 (2011)
5. Cleveland, W., Devlin, S.: Locally weighted regression: An approach to regression analysis by local fitting. Journal of the American Statistical Association 84(403), 596–610 (Sep 1988)
6. Draper, N., Smith, H.: Applied regression analysis. Wiley Interscience, Hoboken, NJ (1998)
7. Gama, J.: Knowledge Discovery from Data Streams. Chapman & Hall/CRC, Boca Raton, Florida (2010)
8. Haavisto, O., Hyotyniemi, H.: Recursive multimodel partial least squares estimation of mineral flotation slurry contents using optical reflectance spectra. Analytica Chimica Acta 642, 102–109 (2009)
9. Haenlein, M., Kaplan, A.: A beginner's guide to partial least squares (PLS) analysis. Understanding Statistics 3(4), 283–297 (2004)
10. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference and Prediction - Second Edition. Springer, New York Berlin Heidelberg (2009)
11. Hastie, T., Tibshirani, R., Friedman, J.: Regularized paths for generalized linear models via coordinate descent. Journal of Statistical Software 33(1) (2010)
12. Haykin, S.: Neural Networks: A Comprehensive Foundation. Prentice Hall (1999)
13. Jolliffe, I.: Principal Component Analysis. Springer Verlag, Berlin Heidelberg New York (2002)
14. Ljung, L.: System Identification: Theory for the User. Prentice Hall PTR, Prentice Hall Inc., Upper Saddle River, New Jersey (1999)
15. Lughofer, E.: Extensions of vector quantization for incremental clustering. Pattern Recognition 41(3), 995–1011 (2008)
16. Lughofer, E.: FLEXFIS: A robust incremental learning approach for evolving TS fuzzy models. IEEE Trans. on Fuzzy Systems 16(6), 1393–1410 (2008)
17. Lughofer, E.: Evolving Fuzzy Systems – Methodologies, Advanced Concepts and Applications. Springer, Berlin Heidelberg (2011)
18. Lughofer, E.: On-line incremental feature weighting in evolving fuzzy classifiers. Fuzzy Sets and Systems 163(1), 1–23 (2011)
19. Lughofer, E., Bouchot, J.L., Shaker, A.: On-line elimination of local redundancies in evolving fuzzy systems. Evolving Systems 2(3), 165–187 (2011)
20. Næs, T., Martens, H.: Principal component regression in NIR analysis: Viewpoints, background details and selection of components. Journal of Chemometrics 2(2), 155–167 (1988)
21. Pomerantsev, A.: Acceptance areas for multivariate classification derived by projection methods. Journal of Chemometrics 22, 601–609 (2008)
22. Shao, X., Bian, X., Cai, W.: An improved boosting partial least squares method for near-infrared spectroscopic quantitative analysis. Analytica Chimica Acta 666(1–2), 32–37 (2010)
23. Varmuza, K., Filzmoser, P.: Introduction to Multivariate Statistical Analysis in Chemometrics. CRC Press, Boca Raton (2009)
24. Wang, L., Mendel, J.: Fuzzy basis functions, universal approximation and orthogonal least-squares learning. IEEE Transactions on Neural Networks 3(5), 807–814 (1992)
25. Yager, R.R.: A model of participatory learning. IEEE Transactions on Systems, Man and Cybernetics 20(5), 1229–1234 (1990)
