Transcript
Page 1: Computational tools in metrology and testing x
Page 2: Computational tools in metrology and testing x


Page 3: Computational tools in metrology and testing x


Page 4: Computational tools in metrology and testing x

World Scientific


Page 5: Computational tools in metrology and testing x

Published by

World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

Library of Congress Cataloging-in-Publication Data
Advanced mathematical and computational tools in metrology and testing X / edited by Franco Pavese (Istituto Nazionale di Ricerca Metrologica, Italy) [and four others].
pages cm. -- (Series on advances in mathematics for applied sciences ; volume 86)
Includes bibliographical references and index.
ISBN 978-9814678612 (hardcover : alk. paper)
1. Metrology. 2. Statistics. I. Pavese, Franco, editor.
QC88.A38 2015
389'.1015195--dc23
2015008632

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

Copyright © 2015 by World Scientific Publishing Co. Pte. Ltd.

All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

Printed in Singapore


Page 6: Computational tools in metrology and testing x


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (p. v)

Foreword

This volume contains original refereed worldwide contributions. They were

prompted by presentations made at the tenth Conference held in St. Petersburg,

Russia, in September 2014 on the theme of advanced mathematical and

computational tools in metrology and also, in the title of this book series, in

testing.

The aims of the IMEKO Committee TC21 “Mathematical Tools for

Measurements” (http://www.imeko.org/index.php/tc21-homepage) supporting

the activities in this field and this book Series were

• To present and promote reliable and effective mathematical and

computational tools in metrology and testing.

• To understand better the modelling, statistical and computational

requirements in metrology and testing.

• To provide a forum for metrologists, mathematicians, software and IT

engineers that will encourage a more effective synthesis of skills,

capabilities and resources.

• To promote collaboration in the context of EU and International

Programmes, Projects of EURAMET, EMRP, EA and of other world

Regions, MRA requirements.

• To support young researchers in metrology, testing and related fields.

• To address industrial and societal requirements.

The themes in this volume reflect the importance of the mathematical, statistical

and numerical tools and techniques in metrology and testing and also, in keeping with

the challenge promoted by the Metre Convention, the effort to achieve mutual recognition

of the measurement standards.

Torino, February 2015 The Editors

Page 7: Computational tools in metrology and testing x


Page 8: Computational tools in metrology and testing x


Contents

Foreword v

Fostering Diversity of Thought in Measurement Science

F. Pavese and P. De Bièvre 1

Polynomial Calibration Functions Revisited: Numerical and Statistical

Issues

M.G. Cox and P. Harris 9

Empirical Functions with Pre-Assigned Correlation Behaviour

A.B. Forbes 17

Models and Methods of Dynamic Measurements: Results Presented by

St. Petersburg Metrologists

V.A. Granovskii 29

Interval Computations and Interval-Related Statistical Techniques:

Estimating Uncertainty of the Results of Data Processing and

Indirect Measurements

V.Ya. Kreinovich 38

Classification, Modeling and Quantification of Human Errors in

Chemical Analysis

I. Kuselman 50

Application of Nonparametric Goodness-of-Fit Tests: Problems and

Solution

B.Yu. Lemeshko 54

Dynamic Measurements Based on Automatic Control Theory Approach

A.L. Shestakov 66

Models for the Treatment of Apparently Inconsistent Data

R. Willink 78

Model for Emotion Measurements in Acoustic Signals and Its Analysis

Y. Baksheeva, K. Sapozhnikova and R. Taymanov 90

Page 9: Computational tools in metrology and testing x


Uncertainty Calculation in Gravimetric Microflow Measurements

E. Batista, N. Almeida, I. Godinho and E. Filipe 98

Uncertainties Propagation from Published Experimental Data to

Uncertainties of Model Parameters Adjusted by the Least Squares

V.I. Belousov, V.V. Ezhela, Y.V. Kuyanov, S.B. Lugovsky,

K.S. Lugovsky and N.P. Tkachenko 105

A New Approach for the Mathematical Alignment Machine Tool-Paths

on a Five-Axis Machine and Its Effect on Surface Roughness

S. Boukebbab, J. Chaves-Jacob, J.-M. Linares and N. Azzam 116

Goodness-of-Fit Tests for One-Shot Device Testing Data

E.V. Chimitova and N. Balakrishnan 124

Calculation of Coverage Intervals: Some Study Cases

A. Stepanov, A. Chunovkina and N. Burmistrova 132

Application of Numerical Methods in Metrology of Electromagnetic

Quantities

M. Cundeva-Blajer 140

Calibration Method of Measuring Instruments in Operating Conditions

A.A. Danilov, Yu.V. Kucherenko, M.V. Berzhinskaya and

N.P. Ordinartseva 149

Statistical Methods for Conformity Assessment When Dealing with

Computationally Expensive Systems: Application to a Fire

Engineering Case Study

S. Demeyer, N. Fischer, F. Didieux and M. Binacchi 156

Overview of EMRP Joint Research Project NEW06 “Traceability for

Computationally-Intensive Metrology”

A.B. Forbes, I.M. Smith, F. Härtig and K. Wendt 164

Stable Units of Account for Economic Value Correct Measuring

N. Hovanov 171

A Novel Approach for Uncertainty Evaluation Using Characteristic

Function Theory

A.B. Ionov, N.S. Chernysheva and B.P. Ionov 179

Estimation of Test Uncertainty for TraCIM Reference Pairs

F. Keller, K. Wendt and F. Härtig 187

Page 10: Computational tools in metrology and testing x


Approaches for Assigning Numerical Uncertainty to Reference Data

Pairs for Software Validation

G.J.P. Kok and I.M. Smith 195

Uncertainty Evaluation for a Computationally Expensive Model of a

Sonic Nozzle

G.J.P. Kok and N. Pelevic 203

EllipseFit4HC: A MATLAB Algorithm for Demodulation and

Uncertainty Evaluation of the Quadrature Interferometer Signals

R. Köning, G. Wimmer and V. Witkovský 211

Considerations on the Influence of Test Equipment Instability and

Calibration Methods on Measurement Uncertainty of the Test

Laboratory

A.S. Krivov, S.V. Marinko and I.G. Boyko 219

A Cartesian Method to Improve the Results and Save Computation

Time in Bayesian Signal Analysis

G.A. Kyriazis 229

The Definition of the Reliability of Identification of Complex Organic

Compounds Using HPLC and Base Chromatographic and Spectral Data

E.V. Kulyabina and Yu.A. Kudeyarov 241

Uncertainty Evaluation of Fluid Dynamic Simulation with

One-Dimensional Riser Model by Means of Stochastic Differential

Equations

E.A.O. Lima, S.B. Melo, C.C. Dantas, F.A.S. Teles and

S. Soares Bandiera 247

Simulation Method to Estimate the Uncertainties of ISO Specifications

J.-M. Linares and J.M. Sprauel 252

Adding a Virtual Layer in a Sensor Network to Improve Measurement

Reliability

U. Maniscalco and R. Rizzo 260

Calibration Analysis of a Computational Optical System Applied in the

Dimensional Monitoring of a Suspension Bridge

L.L. Martins, J.M. Rebordão and A.S. Ribeiro 265

Page 11: Computational tools in metrology and testing x


Determination of Numerical Uncertainty Associated with Numerical

Artefacts for Validating Coordinate Metrology Software

H.D. Minh, I.M. Smith and A.B. Forbes 273

Least-Squares Method and Type B Evaluation of Standard Uncertainty

R. Palenčár, S. Ďuriš, P. Pavlásek, M. Dovica, S. Slosarčík

and G. Wimmer 279

Optimising Measurement Processes Using Automated Planning

S. Parkinson, A. Crampton and A.P. Longstaff 285

Software Tool for Conversion of Historical Temperature Scales

P. Pavlásek, S. Ďuriš, R. Palenčár and A. Merlone 293

Few Measurements, Non-Normality: A Statement on the Expanded

Uncertainty

J. Petry, B. De Boeck, M. Dobre and A. Peruzzi 301

Quantifying Uncertainty in Accelerometer Sensitivity Studies

A.L. Rukhin and D.J. Evans 310

Metrological Aspects of Stopping Iterative Procedures in Inverse

Problems for Static-Mode Measurements

K.K. Semenov 320

Inverse Problems in Theory and Practice of Measurements and Metrology

K.K. Semenov, G.N. Solopchenko and V.Ya. Kreinovich 330

Fuzzy Intervals as Foundation of Metrological Support for

Computations with Inaccurate Data

K.K. Semenov, G.N. Solopchenko and V.Ya. Kreinovich 340

Testing Statistical Hypotheses for Generalized Semiparametric

Proportional Hazards Models with Cross-Effect of Survival Functions

M.A. Semenova and E.V. Chimitova 350

Novel Reference Value and DOE Determination by Model Selection

and Posterior Predictive Checking

K. Shirono, H. Tanaka, M. Shiro and K. Ehara 358

Certification of Algorithms for Constructing Calibration Curves of

Measuring Instruments

T. Siraya 368

Page 12: Computational tools in metrology and testing x


Discrete and Fuzzy Encoding of the ECG-Signal for Multidisease

Diagnostic System

V. Uspenskiy, K. Vorontsov, V. Tselykh and V. Bunakov 377

Application of Two Robust Methods in Inter-Laboratory Comparisons

with Small Samples

Е.T. Volodarsky and Z.L. Warsza 385

Validation of CMM Evaluation Software Using TraCIM

K. Wendt, M. Franke and F. Härtig 392

Semi-Parametric Polynomial Method for Retrospective Estimation of

the Change-Point of Parameters of Non-Gaussian Sequences

S.V. Zabolotnii and Z.L. Warsza 400

Use of a Bayesian Approach to Improve Uncertainty of Model-Based

Measurements by Hybrid Multi-Tool Metrology

N.-F. Zhang, B.M. Barnes, R.M. Silver and H. Zhou 409

Application of Effective Number of Observations and Effective

Degrees of Freedom for Analysis of Autocorrelated Observations

A. Zieba 417

Author Index 425

Keywords Index 427

Page 13: Computational tools in metrology and testing x

May 2, 2013 14:6 BC: 8831 - Probability and Statistical Theory PST˙ws

This page intentionally left blankThis page intentionally left blank

Page 14: Computational tools in metrology and testing x


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 1–8)

FOSTERING DIVERSITY OF THOUGHT IN MEASUREMENT

SCIENCE

FRANCO PAVESE

Torino, Italy

PAUL DE BIÈVRE

Kasterlee, Belgium

The contrast between single thought and diversity is long since inherent to the search for

‘truth’ in science—and beyond. This paper aims at summarizing the reasons why

scientists should be humble in contending about methods for expressing experimental

knowledge. However, we suppose that there must be reasons for the present trend toward

selection of a single direction in thinking rather than using diversity as the approach to

increase confidence that we are heading for correct answers: some examples are listed.

Concern is expressed that this trend could lead to ‘political’ decisions, hindering,

rather than promoting, scientific understanding.

1. Introduction

In many fields of science we think we see increasing symptoms of an

attitude that seems to be fostered by either the anxiety to take a decision, or by

the intention to attempt to ‘force’ a conclusion upon the reader.

Limiting ourselves to a field where we have some competence, measurement

science, a few sparse examples of exclusive choices have been selected in no

particular order, including two documents that are widely assumed to master this

field:

− The Guide for the Expression of Uncertainty in Measurement (GUM) [1],

– being now in favour of choosing a single framework, the

‘uncertainty approach’, discontinuing the ‘error approach’ [2, 3];

–seeming now to be heading for a total ‘Bayesian approach’ replacing

all ‘frequentist’ approaches [4–6].

− The International System of Measurement Units (SI) [7], which now seems

to be proposed for a fundamental change by the Consultative Committee for

Units (CCU) to the CIPM and CGPM [8,9], with the

‘fundamental’ or ‘reference constants’ replacing ‘physical states’ or

‘conditions’ for the definitions of units.

Page 15: Computational tools in metrology and testing x


− The VIM with the basic change from “basic and general terms” to “basic

and general concepts and associated terms”.

− The “recommended values” of the numerical values of fundamental

constants, atomic masses, differences in scales …, e.g., specific data from

CODATA [8,10] being restricted to one single ‘official’ set.

− The stipulation of numerical values in specific frames, claimed to have

universal and permanent validity.

− The traditional classification of the errors/effects in random and systematic,

with the concept of “correction” associated with the latter, being claimed to

be exclusive.

This paper does not intend to discuss any specific example, since its focus is

not on the validity of any specific choice, but on the importance of creating the

choice. The two issues should not be confused with each other. Parts of the paper

may look ‘philosophical’, but they are only intended to concern philosophy of

science, i.e. not to be extraneous to scientists’ interests: any concerned scientist

should be aware of the difference between ‘truth’ and ‘belief’. Accepting

diversity in thinking is a mental attitude, which should never be ignored or

considered a mere option for scientists. It is a discriminating issue. The paper is

only devoted to this science divide: either one goes to single thought, or one

picks up from diversity a wider view of solutions to scientific problems. We

think that disappointment with this position can be expected from single thought

advocates only.

2. Truth in philosophy and science

The gnoseological1 issue of truth is itself a dilemma, since different

fundamental aspects can be attributed to this concept: one can have truth by

correspondence, by revelation (disclosure), by conformity to a rule, by

consistency (coherence), or by benefit. These aspects are not mutually exclusive; they are

diverse and not reducible to each other [11]. Several of them are appropriate in

science. With particular respect to consistency, it is relevant to recall a sentence

of Steven G. Vick: “Consistency is indifferent to truth. One can be entirely

consistent and still be entirely wrong” [12].

In the search for truth, the history of thinking shows that general principles

are typically subject to irresolvable criticism, leading to—typically two—

contrasting positions: it is the epistemological dilemma, long since recognized

1 Gnoseological: concerning the philosophy of knowledge and the human faculties for

learning.

Page 16: Computational tools in metrology and testing x


(e.g., David Hume (1711-1776): “Reason alone is incapable of resolving the

various philosophical problems”) and has generated several ‘schools of

thinking’: pragmatism, realism, relativism, empiricism, ….

Modern science, as basically founded on one of the two extreme

viewpoints—empiric, as opposed to metaphysical—is usually considered exempt

from the above weakness. Considering doubt as a shortcoming, scientific

reasoning aims at reaching, if not truth, at least certainties, and many scientists

tend to believe that this goal can be fulfilled in their field of competence.

Instead, they should remember the Francis Bacon (1605) memento: “If we begin

with certainties, we shall end in doubts; but if we begin with doubts, and are

patient with them, we shall end with certainties” … still an optimistic one.

3. Does certainty imply objectivity? The rise of the concept of

uncertainty as a remedy in science

As alerted by philosophers, the belief in certainty simply arises from the

illusion of science being able to attain objectivity as a consequence of being

based on information drawn from the observation of natural phenomena, and

considered as ‘facts’. A fact, as defined in English dictionaries, means:

“A thing that is known or proven to be true” [Oxford Dictionary]

“A piece of information presented as having objective reality” [Merriam-

Webster Dictionary].

Objectivity and cause-effect-cause chain are the pillars of single-ended

scientific reasoning. Should this be the case, the theories developed for

systematically interlocking the empirical experience would similarly consist of a

single building block, with the occasional addition of ancillary building blocks

accommodating specific new knowledge. This is a ‘static’ vision of science (and

of knowledge in general). In that case “Verification” [13] would become

unnecessary, ‘Falsification’ [14] a paradox, and the road toward any “Paradigm

change” or “Scientific revolution” [15] prevented.

On the contrary, confronted with the long-available and daily reconfirmed

evidence that the objectivity scenario does not normally apply, the

concept of uncertainty came in. It should be noted that, strictly speaking, it applies

only if the object of the uncertain observations is the same (the ‘measurand’ in

measurement science), hence the issue is not resolved, the problem is simply

shifted to another concept, the uniqueness of the measurand, a concept of non-

random nature, leading to “imprecision”. This term is used here in the sense

indicated in [16]: “Concerning non-precise data, uncertainty is called

Page 17: Computational tools in metrology and testing x


imprecision … is not of stochastic nature … can be modelled by the so-called

non-precise numbers”.

4. From uncertainty to chance: the focus on decision in science

Confronted with the evidence of diverse results of observations, the way out

for modern science was to introduce the concept of ‘chance’—replacing ‘certainty’.

This was done with the illusion of reaching firmer conclusions by

establishing a hierarchy in measurement results (e.g. based on the frequency of

occurrence), in order to take a ‘decision’ (i.e. for choosing from various

measurement results).

The chance concept initiated the framework of ‘probability’, but expanded

later into several other streams of thinking, e.g., possibility, fuzzy, cause-effect,

interval, non-parametric, … reasoning frames depending on the type of

information available or on the approach to it. Notice that philosophers of

science warned us about the logical weakness of the probability approach: “With

the idol of certainty (including that of degrees of imperfect certainty or

probability) there falls one of the defences of obscurantism which bars the way

of scientific advance” [14] (emphasis added).

Limiting ourselves to the probability frame, any decision strategy requires

the choice of an expected value as well of the limits of the dispersion interval of

the observations.

The choice of the expected value (‘expectation’: “a strong belief that

something will happen or be the case” [Oxford Dictionary]) is not unequivocal,

since several location parameters are offered by probability theory—with a ‘true

value’ still standing in the shade, deviations from which are called ‘errors’.

As to data dispersion, most theoretical frameworks tend to lack general

reasons for bounding a probability distribution, whose tails thus extend without

limit ad infinitum. However, without a limit, no decision is possible; and, the

wider the limit, the less meaningful a decision is. Stating a limit becomes itself a

decision, based on the fitness for the intended use of the data.

In fact, the terms used in this frame clearly indicate the difficulty and the

meaning that is applicable in this context:

‘confidence level’ (confidence: “the feeling or belief that one can have faith

in or rely on someone or something” [from Oxford Dictionary]), or

‘degree of belief’ (belief: “trust, faith, or confidence in (someone or

something)” or “an acceptance that something exists or is true, especially one

without proof” [ibidem])

Page 18: Computational tools in metrology and testing x


Still about data dispersion: one can believe in using truncated (finite tail-

width) distributions. However, reasons for truncation are generally supported by

uncertain information. In rare cases it may be justified by theory, e.g. a bound to

zero –itself not normally reachable exactly (experimental limit of detection).

Again, stating limits becomes itself a decision, also in this case, on the fitness for

the intended use of the data.

5. The fuzziness of decision and the concept of risk in science

The ultimate common goal of any branch of science is to communicate

measurement results and to perform robust prediction. Prediction is necessary to

forecast, i.e. to decision. However, what about the key term ‘decision’?

When (objective) reasoning is replaced by choice, a decision can only be

based on (i) a priori assumptions (for hypotheses), or (ii) inter-subjectively

accepted conventions (predictive for subsequent action).

However, hypotheses cannot be proved, and inter-subjective agreements are

strictly relative to a community and for a given period of time.

The loss of certainty resulted in the loss of uniqueness of decisions, and the

concept of ‘risk’ emerged as a remedy.

Actually, any parameter chosen to represent a set of observations becomes

‘uncertain’ not because it must be expressed with a dispersion attribute

associated to an expected value, but because the choice of both parameters is the

result of decisions. Therefore, when expressing an uncertain value the

components are not two (best value and its uncertainty), but three, the third being

the decision taken for defining the values of the first two components, e.g., the

chosen width of the uncertainty interval, the chosen ‘level’ of confidence, ….

A decision cannot be ‘exact’ (unequivocal). Any decision is fuzzy. The use

of risk does not alleviate the issue: if a decision cannot be exact, the risk cannot

be zero.

In other words: the association of a risk to a decision, a recent popular issue,

does not add any real benefit with respect to the fundamental issue. Risk is only

zero for certainty, so zero risk is unreachable.

This fact has deep consequence, as already expressed by Karl Popper in

1936: “The relations between probability and experience are also still in need of

clarification. In investigating this problem we shall discover what will at first

seem an almost insuperable objection to my methodological views. For although

probability statements play such a vitally important role in empirical science,

they turn out to be in principle impervious to strict falsification.” [14]

Page 19: Computational tools in metrology and testing x


6. The influence of the observer in science

In conclusion, chance is a bright prescription for working on symptoms of

the disease, but is not a therapy for its deep origin, subjectivity.

In fact, the very origin of the problem is related to our knowledge

interface—the human being.

It is customary to make a distinction between the ‘outside’ and the ‘inside’

of the observer, the ‘real world’ and the ‘mind’. We are not fostering here a

vision of the world as a ‘dream’: there are solid arguments for conceiving a

structured and reasonably stable reality outside us (objectivity of the “true

value”). However, this distinction is one of the reasons that have generated a

dichotomy, for at least a couple of centuries, between ‘exact sciences’

and other branches, often called ‘soft’, like psychology, medicine, economy.

For ‘soft’ science we are ready to admit that the objects of the observations

tend to be dissimilar, because every human individual is dissimilar from any

other. In ‘exact science’ we are usually not ready to admit that the human

interface between our ‘mind’ and the ‘real world’ is a factor of influence

greatly affecting our knowledge. Mathematics stays in between, being based

not on the ‘real world’ but on an ‘exact’ construction of concepts in our

mind.

7. Towards an expected appropriate behaviour of the scientist

All the above should suggest that scientists be humble about contending on

methods for expressing experimental knowledge—apart from obvious mistakes

(“blunders”). Unlike the theoretical context, experience can be shared to

a certain degree, but leads, at best, to a shared decision. The association of a

‘risk’ with a decision, a relatively recent and very popular issue, does not add any real

benefit with respect to the fundamental issue; this new concept is basically

merely the complement to one of the concept of chance: it is zero for certainty,

zero risk being an unreachable asymptote.

For the same reason, one cannot expect that a single decision be valid in all

cases, i.e. without exceptions. In consequence, no single frame of reasoning

leading to a specific type of decision can be expected to be valid in all cases.

The logical consequence of the above should be, in most cases, that not all

decisions (hence not all frames of reasoning) are necessarily mutually exclusive.

Should this be the case, diversity rather becomes richness, deserving a higher

degree of confidence that we are pointing to the correct answers. Also in

science, ‘diversity’ is not always a synonym for ‘confusion’, a popular way to

Page 20: Computational tools in metrology and testing x


counter it; rather, it is an invaluable additional resource leading to better

understanding.

This fact is already well understood in experimental science, where the main

way to detect systematic effects is to diversify the experimental methods. Why

should this diversifying methodology not extend also to principles?

It might be argued that the metrological traceability requirement—a

fundamental one not only in metrology but in general in measurement—may

conflict with diversity, since metrological traceability requires metrological

criteria as given in [3], potentially creating a conflict between diversity and

uniformity by invoking the principle of (decision) efficiency in the science of

measurement. Based on the meaning of “metrological traceability” and of

“measurement result” involved in it, as defined in [3], we do not see a possible

conflict in allowing for diversity. Take, for example, the well-known issue of the

frequentist versus Bayesian treatments: both are used depending on the decision

of the individual scientist, without ever having been considered, to our knowledge, to

have affected the validity of any metrological traceability assessment.

The origin of the trend indicated may be due to an incorrect assignment to a

scientific Commission asked to reach a single ‘consensus’ outcome instead of a

rationally-compounded digest of the best information/knowledge available.

However, the consequence would be politics (needing decisions) leaking into

science (seeking understanding); a potential trend also threatening scientific

honesty.

References

1. Guide for the expression of uncertainty in measurement; JCGM 100:2008,

ISO Geneva, at http://www.bipm.org/en/publications/guides/gum.html

2. GUM Anniversary Issue, Metrologia Special Issue, 51 (2014) S141–S244.

3. International vocabulary of metrology – Basic and general concepts and

associated terms – VIM, 3rd edition, 2012 (2008 with minor corrections),

ISO, Geneva, at http://jcgm.bipm.org/vim

4. B. Efron, Bayesians, Frequentists, and Scientists, Technical Report No.

2005-1B/230, January 2005, Division of Biostatistics, Stanford,

California 94305-4065.

5. R. Willink and R. White, Disentangling Classical and Bayesian Approaches

to Uncertainty Analysis, Doc. CCT/12-07, Comité Consultatif de

Thermométrie, BIPM, Sèvres (2012).

Page 21: Computational tools in metrology and testing x


6. Stanford Encyclopedia of Philosophy, Interpretations of Probability,

http://plato.stanford.edu/entries/probability-interpret/, pp 40.

7. http://www.bipm.org/en/measurement-units/

8. http://www.bipm.org/en/measurement-units/new-si/

9. F. Pavese, How much does the SI, namely the proposed ‘new SI’, conform to

the spirit of the Metre Treaty?, ACQUAL, 19 (2014) 307–314.

10. F. Pavese, Some problems concerning the use of the CODATA fundamental

constants in the definition of measurement units, Letter to the Editor,

Metrologia 51 (2014) L1–L4.

11. N. Abbagnano, Dictionary of Philosophy (in Italian), UTET, Torino (1971).

12. S.G. Vick, Degrees of Belief: Subjective Probability and Engineering

Judgment, ASCE Publications (2002).

13. L. Wittgenstein, Philosophical Investigations (Translated by G. E. M.

Anscombe), Basil Blackwell, Oxford, 1st edition (1953).

14. K. Popper, The Logic of Scientific Discovery (Taylor & Francis e-Library

ed.). London and New York: Routledge / Taylor & Francis e-Library

(2005).

15. T.S. Kuhn, The Structure of Scientific Revolutions. 3rd ed. Chicago, IL:

University of Chicago Press (1996).

16. R. Viertl, Statistical Inference with imprecise data, in Probability and

Statistics, in Encyclopedia of Life Support Systems (EOLSS), Developed

under the Auspices of the UNESCO, Eolss Publishers, Oxford, UK,

[http://www.eolss.net] (2003).

Page 22: Computational tools in metrology and testing x


Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. – )

POLYNOMIAL CALIBRATION FUNCTIONS REVISITED:

NUMERICAL AND STATISTICAL ISSUES

MAURICE COX AND PETER HARRIS

National Physical Laboratory, Teddington, Middlesex TW11 0LW, UK

E-mail: [email protected]

The problem of constructing a polynomial calibration function is revisited,

paying attention to the representation of polynomials and the selection of an
appropriate degree. It is noted that the monomial representation (powers of the

‘natural’ variable) is inferior to the use of monomials in a normalized variable,

which in turn is bettered by a Chebyshev representation, use of which also
gives stability and insight. Traditional methods of selecting a degree do not

take fully into account the mutual dependence of the statistical tests involved.
We discuss degree-selection principles that are more appropriate.

Keywords: calibration, polynomial representation, uncertainty, degree selection

1. Introduction

Calibration consists of two stages.1 In stage 1 a relation is established be-

tween values provided by measurement standards and corresponding instru-

ment response values. In stage 2 this relation is used to obtain measurement

results from further response values. We consider polynomial calibration

functions that describe the response variable y in terms of the stimulus

variable x. Polynomials of various degrees, determined by least squares, are

extensively used as empirical calibration functions in metrology.

A polynomial of degree n has n+1 coefficients or parameters b. An esti-

mate b of b is to be determined given calibration data (xi, yi), i = 1, . . . ,m,

provided by a measuring system. For a further response value y0, the poly-

nomial is then used inversely to predict the corresponding stimulus value x0.

The xi and yi are assumed to be realizations of random variables having

Gaussian distributions (not necessarily independent).

Section 2 considers uncertainty and model validity, Sect. 3 the represen-

tation of polynomials, Sect. 4 measures of consistency, Sect. 5 an example

of thermocouple calibration and Sect. 6 our conclusions.


Page 23: Computational tools in metrology and testing x


2. Uncertainty and model validity

Calibration data invariably have associated measurement uncertainty (un-

certainties associated with the xi or the yi or both), which means in the first

stage there will be uncertainty associated with b in the form of a covariance

matrix U_b. In turn, in the second stage, U_b and the standard uncertainty

associated with y0 contribute to the standard uncertainty u(x0) associated

with x0. Given the uncertainties associated with the calibration data (most

generally in the form of a covariance matrix), an appropriate numerical

algorithm2 is used to produce b and U_b.

Once a candidate polynomial model has been fitted to the data, it is

necessary to determine the extent to which the model explains the data,

ideally in a parsimonious way. Only when the model is acceptable in this

regard should it be used to predict x0 given y0 and to evaluate u(x0).
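
By way of illustration only (a sketch that is not part of the paper), the second stage can be carried out numerically by root-finding on the fitted calibration function. The short Python sketch below is our own: it assumes a callable polynomial p that is monotonic over the calibration interval, and the function name inverse_predict is hypothetical.

from scipy.optimize import brentq

def inverse_predict(p, y0, xmin, xmax):
    # Stage 2: find the stimulus x0 at which the fitted calibration
    # function p(x) reproduces the measured response y0 (Brent's method).
    return brentq(lambda x: p(x) - y0, xmin, xmax)

A first-order treatment would then approximate u(x0) by u(y0)/|p′(x0)| when the contribution of U_b is neglected; the full evaluation combines both contributions as indicated above.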

3. Polynomial representation

Whilst the traditional representation of a polynomial in x is the monomial

form pn(x) = c0 + c1x + · · · + cnx^n, its use can lead to numerical prob-

lems.3 Representing pn(x) in Chebyshev form generally overcomes such

difficulties, and has advantages mathematically and computationally.4

First, consider x varying within the finite interval [xmin, xmax] and trans-

forming it to a normalized variable t ∈ I = [−1, 1]:

t = (2x− xmin − xmax)/(xmax − xmin). (1)

This normalization avoids working with numbers that are possibly very

large or very small in magnitude for high or even modest polynomial degree.

Second, the Chebyshev-polynomial representation

pn(x) ≡ (1/2)a0T0(t) + a1T1(t) + · · · + anTn(t) (2)

is beneficial since polynomial functions expressed in this manner facilitate

working with them in a numerically stable way.5 The Tj(t), which lie be-

tween −1 and 1 for t ∈ I, are generated for any t ∈ I using

T0(t) = 1, T1(t) = t, Tj(t) = 2tTj−1(t)− Tj−2(t), j ≥ 2.

Algorithms based on Chebyshev polynomials appear in the NAG Library6

and other software libraries and have been used successfully for decades.
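
Purely as an illustration (not part of the original text), the normalization (1) and the recurrence above translate directly into code; in the Python sketch below the function name chebyshev_eval and the use of NumPy are our own choices.

import numpy as np

def chebyshev_eval(a, x, xmin, xmax):
    # Evaluate p_n(x) = (1/2) a[0] T0(t) + a[1] T1(t) + ... + a[n] Tn(t),
    # where t is x mapped onto [-1, 1] by expression (1).
    x = np.asarray(x, dtype=float)
    t = (2.0 * x - xmin - xmax) / (xmax - xmin)              # expression (1)
    T_prev, T_curr = np.ones_like(t), t                      # T0(t), T1(t)
    value = 0.5 * a[0] * T_prev
    if len(a) > 1:
        value = value + a[1] * T_curr
    for j in range(2, len(a)):
        T_prev, T_curr = T_curr, 2.0 * t * T_curr - T_prev   # Tj = 2 t T(j-1) - T(j-2)
        value = value + a[j] * T_curr
    return value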

For many polynomial calibration functions the polynomial degree is

modest, often no larger than three or four. For such cases, the use of mono-

mials in a normalized (rather than the raw) variable generally presents few

Page 24: Computational tools in metrology and testing x


numerical difficulties. There are cases, such as the International Tempera-

ture Scale ITS-90,7 where the reference functions involved take relatively

high degrees such as 12 or 15. For such functions, working with a normal-

ized variable offers numerical advantages and the Chebyshev form confers

even more, not only numerically, but also in terms of giving a manageable

and sometimes a more compact representation.

For instance, the monomial representation of the thermoelectric voltage,

E = Σ_{r=0}^{n} c_r x^r,

in the reference function for Type S thermocouples, for Celsius temper-

atures x in the interval [−50, 1 064.18] °C, is given in a NIST database.8

There is a factor of some 10^21 between the non-zero coefficients of

largest and smallest magnitude, which are held to 12 significant decimal

digits (12S); presumably it was considered that care is needed in working

with this representation. The cr are given in Table 1 (column “Raw”).

Table 1. Polynomial coefficients for a Type S thermocouple

Coeff   Raw                          Scaled      Normalized   Chebyshev
0       0                              0            4.303 6     9.278 2
1       5.403 133 086 31×10^−3         5.749 9      5.527 8     5.371 1
2       1.259 342 897 40×10^−5        14.261 8      0.478 4     0.370 6
3      −2.324 779 686 89×10^−8       −28.017 4     −0.054 3    −0.072 9
4       3.220 288 230 36×10^−11       41.300 5      0.220 6     0.037 1
5      −3.314 651 963 89×10^−14      −45.239 0     −0.163 7    −0.013 0
6       2.557 442 517 86×10^−17       37.144 7      0.021 6     0.002 2
7      −1.250 688 713 93×10^−20      −19.331 0     −0.024 9    −0.000 4
8       2.714 431 761 45×10^−24        4.464 8      0.025 2     0.000 2

A scaled variable q = x/B has been used in work on ITS-90 in

recent years, where B = 1 064.18 °C is the upper interval endpoint.

Then E = Σ_{r=0}^{n} d_r q^r, with d_r = B^r c_r. The scaling implies that the con-

tribution from the rth term in the sum is bounded in magnitude by |d_r|.
Values of E in mV are typically required to 3D (three decimal places). Ac-

cordingly, the coefficients dr are given in Table 1 (column “Scaled”) to 4D

(including a guard digit) and are much more manageable. Alternatively,

the variable can be normalized to the interval I (not done in ITS-90) using

expression (1) with xmin = −50 °C and xmax = B. The corresponding co-

efficients are given in column “Normalized” and the Chebyshev coefficients

in column “Chebyshev”, both to 4D, obtained using Refs. 4 and 9.

Page 25: Computational tools in metrology and testing x


Figure 1 depicts the reference function. It is basically linear, but the

non-linearity present cannot be ignored. The coefficients in the monomial

representation in terms of the raw or scaled variable in Table 1 give no

indication of the fundamentally linear form. That the first two coefficients

are dominant is strongly evident, though, in the normalized and Chebyshev

forms. The Chebyshev coefficients for degree 8 and arguably for degree 7

could be dropped, since to 3D they make little or no contribution. Such

reasoning could not be applied to the other polynomial representations.

Further remarks on the benefits of the Chebyshev form appear in Sec. 5.

Fig. 1. Relationship between temperature and thermoelectric voltage

4. Measures of consistency

Usually the degree n is initially unknown. It is traditionally chosen by fitting

polynomials of increasing degree, for each polynomial forming a goodness-

of-fit measure, and, as soon as that measure demonstrates that the model

explains the data, accepting that polynomial.

A common measure is the chi-squared statistic χ²_obs, the sum of squares

of the deviations of the polynomial from the yi, weighted inversely by the

squared standard uncertainties associated with the yi-values (or a modified

measure when xi-uncertainties or covariances are present). The statistic is

compared with a quantile of the chi-squared distribution, with m − n − 1

degrees of freedom, that corresponds to a stipulated probability 1− α (α

is often taken to be 0.05). A value of χ²_obs that exceeds the quantile is

considered significant at the 100(1 − α) % level, and therefore that the

polynomial does not explain the data.

Page 26: Computational tools in metrology and testing x


This approach is not statistically rigorous because the sequence of tests

does not form an independent set: the successive values of χ²_obs depend on

the same data and hence are statistically interrelated.

Bonferroni10 regarded such a process as a multiple hypothesis test:

whilst a given α may be appropriate for each individual test, it is not for the

set of tests. To control the number of hypotheses that are falsely rejected, α

is reduced in proportion to the number of tests. (Less conservative tests of

this type are also available.11) A problem is that α depends on the number

of tests. If polynomials of all degrees lying between two given degrees are

considered, that set of degrees has to be decided a priori, with the results

of the tests depending on that decision. (In practice, the number of data is

often very limited, reducing somewhat the impact of this difficulty.)

We consider an approach that is independent of the number of tests to

be performed. Such an approach makes use of generally accepted model-

selection criteria, specifically Akaike’s Information Criterion (AIC) and the

Bayesian Information Criterion (BIC).12 For a model with n+1 parameters,

these criteria are

AIC = χ²_obs + 2(n + 1),   BIC = χ²_obs + (n + 1) ln m.

Both criteria are designed to balance goodness of fit and parsimony. AIC

can possibly choose too high a degree regardless of the value of m. BIC

tends to penalize high degrees more heavily. Recent considerations of AIC

and BIC are available together with detailed simulations.13

Given a number of candidate models, the model with the lowest value

of AIC (or BIC) would be selected. Some experiments14 with these and

other criteria for polynomial modelling found that AIC and BIC performed

similarly, although a corrected AIC was better for small data sets. Consid-

erably more experience is needed before strong conclusions can be drawn.

Related to the χ²_obs statistic is the root-mean-square residual (RMSR)

for unit weight defined as [χ²_obs/(m − n − 1)]^{1/2}, which we use subsequently.
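
To make the criteria concrete, the following Python sketch (ours, not the authors’; the function name and the use of NumPy’s Chebyshev routines are assumptions) tabulates χ²_obs, AIC, BIC and RMSR for weighted fits of increasing degree to data x, y with uncorrelated standard uncertainties u_y; the degree with the smallest AIC (or BIC) would then be selected.

import numpy as np

def tabulate_criteria(x, y, u_y, max_degree):
    x, y, u_y = (np.asarray(v, dtype=float) for v in (x, y, u_y))
    m = len(y)
    t = (2.0 * x - x.min() - x.max()) / (x.max() - x.min())   # normalization, expression (1)
    rows = []
    for n in range(max_degree + 1):
        # weighted least-squares fit of degree n in the Chebyshev basis of t
        coef = np.polynomial.chebyshev.chebfit(t, y, n, w=1.0 / u_y)
        chi2_obs = np.sum(((y - np.polynomial.chebyshev.chebval(t, coef)) / u_y) ** 2)
        aic = chi2_obs + 2 * (n + 1)
        bic = chi2_obs + (n + 1) * np.log(m)
        rmsr = np.sqrt(chi2_obs / (m - n - 1)) if m > n + 1 else float('nan')
        rows.append({'degree': n, 'chi2': chi2_obs, 'AIC': aic, 'BIC': bic, 'RMSR': rmsr})
    return rows

# e.g. the AIC choice: min(tabulate_criteria(x, y, u_y, 10), key=lambda r: r['AIC'])['degree']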

5. Example: thermocouple calibration

A thermocouple is to be calibrated, with temperature in C as the stimulus

variable and voltage in mV as the response variable. The data consists of

temperature values 10 °C apart in the interval [500, 1 100] °C and the cor-

responding measured voltage values, 61 data pairs in all. The temperature

values are regarded as having negligible uncertainty and standard uncer-

tainties associated with the voltage values are based on knowledge of the

display resolution (4S). The coefficients of weighted least-squares polyno-

Page 27: Computational tools in metrology and testing x


mial models of degrees from zero to ten were determined. Corresponding

values of RMSR appear in Fig. 2. This statistic shows a clear decrease until

degree 6, at which point it saturates at 0.9 mV. [The RMSR-values would

decrease once more for higher degrees when the polynomial model follows

more closely the noise in the data (corresponding to over-fitting).]

Fig. 2. RMSR values for thermocouple calibration data

One way of selecting a degree is to identify the saturation level visually,

the polynomial of smallest degree with RMSR value at this level being

accepted.4 This approach works well for a sufficient quantity of data such

as in this example, but the saturation level may not be obvious for the small

data sets often arising in calibration work. We thus suggest that AIC or

BIC could be used instead to provide a satisfactory degree for large or small

calibration data sets. Applying these criteria to this example, degree 6 (bold

in Table 2), as given by visual inspection, would be selected. Note that once

a degree has been reached at which χ²_obs saturates, the other terms in these

criteria cause their values to rise (as in Fig. 3), potentially more clearly

indicating an acceptable degree than does the RMSR saturation level.

Table 2. Information criteria as function of polynomial degree for

a thermocouple relationship

Degree        3          4      5     6     7     8     9    10
AIC     7 585 937    2 632    806    56    58    60    62    63
BIC     7 585 946    2 643    819    71    75    79    83    87

An advantage of the Chebyshev polynomials, due to their near-

orthogonality properties,9 is that the Chebyshev coefficients of successive

Page 28: Computational tools in metrology and testing x


Fig. 3. Information criteria AIC and BIC versus polynomial degree

polynomials also tend to stabilize once the RMSR-value has saturated. For

instance the Chebyshev coefficients for degrees 3 to 7 are given in Table 3.

Coefficients in other representations do not have this property.

Table 3. Chebyshev coefficients in polynomial models for a thermocou-

ple relationship

Coeff    Degree 3   Degree 4   Degree 5   Degree 6   Degree 7
0        6 245.08   6 262.43   6 262.33   6 261.92   6 261.93
1        4 530.90   4 575.62   4 575.17   4 574.94   4 574.94
2        1 756.70   1 854.03   1 852.99   1 852.92   1 852.92
3          309.11     407.25     405.85     405.48     405.49
4                      39.31      38.05      37.21      37.23
5                                 −0.55      −1.40      −1.37
6                                            −0.37      −0.35
7                                                        0.01

6. Conclusions

Polynomials form a nested family of empirical functions often used to ex-

press a relation underlying calibration data. We advocate the Chebyshev

representation of polynomials. We have considered the selection of an ap-

propriate polynomial degree when these functions are used to represent

calibration data. After having used polynomial regression software (NAG

Library6 routine E02AD, say) to provide polynomials of several degrees for a

data set, it is straightforward to carry out tests to select a particular poly-

nomial model. Information criteria AIC and BIC are easily implemented

Page 29: Computational tools in metrology and testing x


and appear to work satisfactorily, but more evidence needs to be gathered.

Based on limited experience, these criteria select a polynomial degree

that is identical or close to that chosen by visual inspection.

Acknowledgments

The National Measurement Office of the UK’s Department for Business,

Innovation and Skills supported this work as part of its Materials and Mod-

elling programme. Clare Matthews (NPL) reviewed a draft of this paper.

References

1. BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP and OIML, International Vocabulary of Metrology — Basic and General Concepts and Associated Terms, Joint Committee for Guides in Metrology, JCGM 200:2012 (2012).

2. M. G. Cox, A. B. Forbes, P. M. Harris and I. M. Smith, The classification and solution of regression problems for calibration, Tech. Rep. CMSC 24/03, National Physical Laboratory (Teddington, UK, 2003).

3. M. G. Cox, A survey of numerical methods for data and function approximation, in The State of the Art in Numerical Analysis, ed. D. A. H. Jacobs (Academic Press, London, 1977).

4. C. W. Clenshaw and J. G. Hayes, J. Inst. Math. Appl. 1, 164 (1965).

5. R. M. Barker, M. G. Cox, A. B. Forbes and P. M. Harris, SSfM Best Practice Guide No. 4. Discrete modelling and experimental data analysis, tech. rep., National Physical Laboratory (Teddington, UK, 2007).

6. The NAG library (2013), The Numerical Algorithms Group (NAG), Oxford, United Kingdom, www.nag.com.

7. L. Crovini, H. J. Jung, R. C. Kemp, S. K. Ling, B. W. Mangum and H. Sakurai, Metrologia 28, p. 317 (1991).

8. http://srdata.nist.gov/its90/download/allcoeff.tab.

9. L. N. Trefethen, Approximation Theory and Approximation Practice (SIAM, Philadelphia, 2013).

10. L. Comtet, Bonferroni Inequalities – Advanced Combinatorics: The Art of Finite and Infinite Expansions (Reidel, Dordrecht, Netherlands, 1974).

11. Y. Benjamini and Y. Hochberg, Journal of the Royal Statistical Society, Series B (Methodological), 289 (1995).

12. K. P. Burnham and D. R. Anderson, Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, 2nd edn (New York: Springer, 2002).

13. J. J. Dziak, D. L. Coffman, S. T. Lanza and R. Li, Sensitivity and specificity of information criteria, tech. rep., Pennsylvania State University, PA, USA (2012).

14. X.-S. Yang and A. Forbes, Model and feature selection in metrology data approximation, in Approximation Algorithms for Complex Systems, eds. E. H. Georgoulis, A. Iske and J. Levesley, Springer Proceedings in Mathematics, Vol. 3 (Springer Berlin Heidelberg, 2011) pp. 293–307.

Page 30: Computational tools in metrology and testing x


Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. – )

EMPIRICAL FUNCTIONS WITH PRE-ASSIGNED

CORRELATION BEHAVIOUR

ALISTAIR B. FORBES∗

National Physical Laboratory, Teddington, Middlesex, UK

∗E-mail: [email protected]

Many model-fitting problems in metrology involve fitting a function f(x) to

data points (xi, yi). The response of an ideal system may be known from
physical theory so that the shape of f(x) = f(x,a) is specified in terms of

parameters a of a model. However for many practical systems, there may be

other systematic effects, for which there is no accepted model, that modify
the response of the actual system in a smooth and repeatable way. Gaussian

processes (GP) models can be used to account for these unknown systematic

effects. GP models have the form y = f(x,a) + e, where f(x,a) is a function
describing the response due to the known effects and e represents an effect drawn

from a Gaussian distribution. These effects are correlated so that if x is close

to x′ then e is similar to e′. An alternative is to regard e(x, b) as described by
an empirical function such as a polynomial, spline, radial basis function, etc.,

that also reflects a correlation structure imposed by assigning a Gaussian prior

to b, the parameters of the empirical model. The advantage of this approach
is that the empirical models can provide essentially the same flexible response

as a GP model but with much less computational expense. In this paper, we
describe how a suitable Gaussian prior for b can be determined and discuss

applications that involve such empirical models with a correlation structure.

Keywords: Empirical function, Gaussian processes

1. Introduction

Many physical systems respond in ways that are only partially understood,

and empirical models such as polynomials or splines are used to capture

the observed behaviour. Here, the choice of the empirical model can be

important to the success of the representation. For example, we may decide

to use a polynomial to describe the behaviour but we have to choose the

order (degree + 1) of the polynomial. A model selection approach2,6,12 is

to try all plausible models and then choose the best of them using a crite-

rion, such as the Akaike Information Criterion1 or the Bayes Information

Criterion,10 that balances the goodness of fit with compactness of repre-


Page 31: Computational tools in metrology and testing x


sentation. The compactness of representation is usually measured in terms

of the number of data points and the number of degrees of freedom associ-

ated with the model, i.e., the number of free parameters to be fitted, e.g.,

the order of the polynomial model. This approach can be expected to work

well if the underlying system can in fact be described accurately by one of

the proposed models. In this case, the model selection amounts to identify-

ing the model with the correct number of degrees of freedom. If the set of

plausible models does not contain one that describes the system, then the

model selection process could well lead to a model that poorly describes

the underlying behaviour and is otherwise unsuitable for basing inferences

about the underlying system.

Another approach is represented by smoothing splines.6,11 Here, the

number of parameters associated with the model is chosen to match exactly

the number of data points but an additional smoothing term is introduced

to penalise unwanted variation in the fitted function, with the degree of

penalty determined by a smoothing parameter. For the smoothing spline,

the penalty term is defined in terms of the curvature of the fitting func-

tion which can then be re-expressed in terms of the fitted parameters. The

larger the smoothing parameter, the smoother the fitted function and, in

the limit, the fitted function coincides with a straight line fit. The effect of

the increasing the smoothing parameter is to reduce the effective number of

degrees of freedom associated with the model. Thus, the fitted function can

be regarded as a linear function augmented by an empirical function whose

number of degrees of freedom is determined by the smoothing parameter.

Kennedy and O’Hagan8 suggest that model inadequacies can be com-

pensated for by augmenting the model with a Gaussian processes (GP)

model9 that assumes the underlying system shows a smooth response so

that nearby inputs will give rise to similar responses. Again the degree of

smoothness is determined by one or more smoothing parameters. Here again,

implicitly, is the notion that the model is augmented by a model with po-

tentially a large (or even infinite) number of degrees of freedom, but these

degrees of freedom are effectively reduced by adding a prior penalty term.

While calculations to determine a smoothing spline can be implemented

efficiently by exploiting the banded nature of spline approximation,3 the

Gaussian processes approach involves variance matrices whose size depends

on the number m of data points. This can be computationally expensive

for large data sets since the number of calculations required is O(m3). In

this paper, we show how augmenting a model using a Gaussian process

model that has a modest effective number of degrees of freedom can be

Page 32: Computational tools in metrology and testing x


implemented using an empirical model also depending on a modest number

of parameters, so that the computational requirement is greatly reduced.

The remainder of this paper is organised as follows. In section 2 we

overview linear regression for standard and Gauss-Markov models, the lat-

ter of interest to us since models derived from Gaussian processes lead to

a Gauss-Markov regression problem. In section 3, we show how a Gaussian

process model can be approximated by an empirical model with a pre-

assigned correlation structure. Example applications are given in section 4

and our concluding remarks in section 5.

2. Linear regression

2.1. Ordinary least squares regression

Consider the standard model

yi = f(xi, a) + εi,   f(xi, a) = Σ_{j=1}^{n} aj fj(xi),   εi ∈ N(0, σ_M^2),   i = 1, . . . , m,

or in matrix terms, y ∈ N(Ca, σ_M^2 I), where Cij = fj(xi). Given data

(xi, yi), i = 1, . . . , m, the linear least-squares estimate â of the parameters

a is found by solving

min_a (y − Ca)^T (y − Ca).

If C has QR factorisation5 C = Q1R1 where Q1 is an m × n orthogonal

matrix and R1 is an n × n upper-triangular matrix, then

â = (C^T C)^{-1} C^T y = R1^{-1} Q1^T y,

and â can be calculated by solving the upper-triangular system of linear

equations R1 â = Q1^T y. The model predictions ŷ are given by

ŷ = Câ = C (C^T C)^{-1} C^T y = Q1 Q1^T y = Sy,   S = Q1 Q1^T.

The matrix S is a projection, so that S = S^T = S², and projects the data

vector y on to the n-dimensional subspace spanned by the columns of C;

the columns of Q1 define an orthogonal axes system for this subspace. In

Hastie et al.,6 the sum of the eigenvalues of the matrix S is taken to be

the effective number of degrees of freedom associated with the model. In

this case, S is a projection and has n eigenvalues equal to 1 and all others

0, so that the effective number of degrees of freedom is n, the number of

parameters in the model. The symbol S is chosen to represent ‘smoother’

as it smooths the noisy data vector y to produce the smoother vector of

model predictions ŷ = Sy.

Page 33: Computational tools in metrology and testing x


The variance matrices V_â and V_ŷ associated with the fitted parameters

and model predictions are given by

V_â = σ_M^2 (C^T C)^{-1} = σ_M^2 (R1^T R1)^{-1},   V_ŷ = σ_M^2 S S^T = σ_M^2 S,

(recalling that S is a projection).
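
As a purely illustrative sketch (our own, not the paper’s), the ordinary least-squares estimate, the smoother matrix S and its trace can be computed with NumPy’s QR factorisation as follows; the function name ols_qr is hypothetical.

import numpy as np

def ols_qr(C, y, sigma_M):
    # Ordinary least-squares fit via the QR factorisation C = Q1 R1.
    Q1, R1 = np.linalg.qr(C, mode='reduced')
    a_hat = np.linalg.solve(R1, Q1.T @ y)            # solves R1 a = Q1^T y
    S = Q1 @ Q1.T                                    # projection / smoother matrix
    y_hat = S @ y                                    # model predictions
    dof_eff = np.trace(S)                            # equals n for a projection
    V_a = sigma_M**2 * np.linalg.inv(R1.T @ R1)      # variance matrix of the parameters
    return a_hat, y_hat, dof_eff, V_a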

2.2. Gauss-Markov regression

Now suppose that the data $y$ arises according to the model
$$y \in N(Ca, V), \qquad (1)$$
where the variance matrix $V$ reflects correlation due, for example, to common systematic effects associated with the measurement system. The Gauss-Markov estimate of $a$ is found by solving
$$\min_{a} (y - Ca)^T V^{-1} (y - Ca). \qquad (2)$$
If $V$ has a Cholesky-type factorisation5 of the form $V = UU^T$ where $U$ is upper-triangular, then we can solve the ordinary linear least-squares regression problem
$$\min_{a} (\tilde{y} - \tilde{C}a)^T (\tilde{y} - \tilde{C}a),$$
involving the transformed observation matrix $\tilde{C}$ and data vector $\tilde{y}$, where $U\tilde{C} = C$ and $U\tilde{y} = y$. If $\tilde{C} = \tilde{Q}_1 \tilde{R}_1$, then the transformed model predictions $\hat{\tilde{y}}$ are given in terms of the transformed data vector $\tilde{y}$ by $\hat{\tilde{y}} = \tilde{S}\tilde{y}$, where $\tilde{S} = \tilde{Q}_1 \tilde{Q}_1^T$. The matrix $\tilde{S}$ is a projection matrix, as discussed above, and has $n$ eigenvalues equal to 1 and all others equal to zero. The unweighted model predictions $\hat{y}$ are given in terms of the original data vector $y$ by $\hat{y} = S y$, where $S = U\tilde{S}U^{-1}$. The matrix $S$ is not in general a projection, but it is conjugate to a projection matrix and therefore has the same eigenvalues: if $\tilde{S}v = \lambda v$, then $S(Uv) = U\tilde{S}v = \lambda(Uv)$. Thus, the effective number of degrees of freedom associated with the model is $n$, the number of parameters in the model, as for the case of ordinary least squares regression.
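A minimal Python/NumPy sketch of the Gauss-Markov computation just described, with the whitening carried out by a Cholesky factor of V (the function name and inputs are illustrative; NumPy's factor is lower-triangular, whereas the text uses an upper-triangular one, but either choice whitens the problem):

import numpy as np

def gauss_markov_fit(C, y, V):
    """Solve min_a (y - C a)^T V^{-1} (y - C a) by whitening with V = U U^T."""
    U = np.linalg.cholesky(V)                 # lower-triangular factor of V
    C_t = np.linalg.solve(U, C)               # transformed observation matrix: U C_t = C
    y_t = np.linalg.solve(U, y)               # transformed data vector:        U y_t = y
    Q1, R1 = np.linalg.qr(C_t)                # ordinary least squares on the whitened problem
    return np.linalg.solve(R1, Q1.T @ y_t)    # Gauss-Markov estimate of a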

2.3. Explicit parameters for the correlated effects

Suppose now that $V$ in (1) can be decomposed as $V = V_0 + \sigma_M^2 I$, where $V_0$ is a positive semi-definite symmetric matrix. Here, we are thinking that $y = Ca + e + \varepsilon$ where $e \in N(0, V_0)$ represents correlated effects and $\varepsilon \in$


$N(0, \sigma_M^2)$ random effects such as measurement noise. If $V_0$ is factored as $V_0 = U_0 U_0^T$, we can write this model equivalently as
$$y = U_0 e + Ca + \varepsilon, \qquad e \in N(0, I), \quad \varepsilon \in N(0, \sigma_M^2).$$

Estimates $\hat{e}$ and $\hat{a}$ of $e$ and $a$, respectively, are found by solving the augmented least squares system
$$\tilde{C}\tilde{a} \approx \tilde{y}, \qquad \tilde{C} = \begin{bmatrix} U_0/\sigma_M & C/\sigma_M \\ I & 0 \end{bmatrix}, \quad \tilde{a} = \begin{bmatrix} e \\ a \end{bmatrix}, \quad \tilde{y} = \begin{bmatrix} y/\sigma_M \\ 0 \end{bmatrix}. \qquad (3)$$
The solution $\hat{a}$ is the same as that determined by solving (2) for $V = V_0 + \sigma_M^2 I$.

If $\tilde{C}$ has QR factorisation $\tilde{C} = \tilde{Q}_1 \tilde{R}_1$, then the projection $\tilde{S} = \tilde{Q}_1 \tilde{Q}_1^T$ calculates the $2m$-vector of weighted model predictions $\hat{\tilde{y}} = \tilde{S}\tilde{y}$ from the augmented data vector $\tilde{y}$. It has $m + n$ eigenvalues equal to 1, the same as the number of parameters in $\tilde{a}$, and all others are zero. The unweighted model predictions $\hat{y}$ are given by
$$\hat{y} = S y, \qquad S = \frac{1}{\sigma_M^2}\,[\,U_0 \;\; C\,]\,(\tilde{C}^T \tilde{C})^{-1}\,[\,U_0 \;\; C\,]^T.$$
If the $2m \times 2m$ matrix $\tilde{S}$ is partitioned into $m \times m$ blocks as
$$\tilde{S} = \begin{bmatrix} \tilde{S}_{11} & \tilde{S}_{12} \\ \tilde{S}_{12}^T & \tilde{S}_{22} \end{bmatrix},$$
then $S = \tilde{S}_{11}$. As a submatrix of a projection, $S$ has eigenvalues $\lambda_j$ with

0 ≤ λj ≤ 1. Hastie et al.6 use the term shrinkage operator for this kind of

matrix. In fact, n of eigenvalues of S are equal to 1 corresponding to the

free parameters a in the model: if y = Ca for some a, then Sy = y. The

number of effective degrees of freedom associated with the model is given

by the sum of the eigenvalues of S and can be thought of as that fraction

of the total number of degrees of freedom, m+ n, used to predict y.

The sum of the eigenvalues of S22 must be at least n. In fact, S22 also

has n eigenvalues equal to 1: if U0e = Cδ for some δ, then S22e = e. The

effective number of degrees of freedom of the model can range between

n and m (≥ n). If the prior information about the correlated effects U0e is

strong, then the effective number of degrees of freedom will be closer to n;

if it is weak, it will be closer to m.

Note that while the solution $\hat{a}$ in (3) is the same as that in (2), the vector of model predictions associated with (3) is $\hat{y} = U_0\hat{e} + C\hat{a}$, as opposed to $\hat{y} = C\hat{a}$ for model (2). The extra degrees of freedom provided by the parameters $e$ allow the data $y$ to be approximated more closely by $\hat{y}$.
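To make the computation concrete, here is a short Python/NumPy sketch of solving an augmented system of the form reconstructed in (3); the function name, the dense treatment of U0 and all inputs are illustrative assumptions, not the authors' code.

import numpy as np

def augmented_fit(C, y, U0, sigma_M):
    """Jointly estimate the correlated effects e and the parameters a from system (3)."""
    m, n = C.shape
    top = np.hstack([U0, C]) / sigma_M                  # [U0/sigma_M  C/sigma_M]
    bottom = np.hstack([np.eye(m), np.zeros((m, n))])   # prior block: I e ~ 0 (e ~ N(0, I))
    C_aug = np.vstack([top, bottom])
    y_aug = np.concatenate([y / sigma_M, np.zeros(m)])
    sol, *_ = np.linalg.lstsq(C_aug, y_aug, rcond=None)
    e_hat, a_hat = sol[:m], sol[m:]
    y_hat = U0 @ e_hat + C @ a_hat                      # model predictions U0 e + C a
    return e_hat, a_hat, y_hat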


3. Spatially correlated empirical models

Gaussian processes (GP) are typically used to model correlated effects where the strength of the correlation depends on spatial and/or temporal separations. Consider the model
$$y_i = f(x_i, a) + e_i + \varepsilon_i, \qquad \varepsilon_i \in N(0, \sigma_M^2),$$
where $e_i$ represents a spatially correlated random effect. For example, the correlation could be specified by

$$V_0(i, q) = \mathrm{cov}(e_i, e_q) = k(x_i, x_q) = \sigma^2 \exp\left\{-\frac{1}{2\lambda^2}(x_i - x_q)^2\right\}. \qquad (4)$$

The parameter σ governs the likely size of the effect while the parameter

λ determines the spatial length scale over which the correlation applies.

For linear models, estimates of a are determined from the data using the

approaches described in sections 2.2 or 2.3.

While GP models can be very successful in representing spatially corre-

lated effects, the computational effort required to implement them is gen-

erally $O(m^3)$, where m is the number of data points. If the length scale

parameter λ is small relative to the span of the data x, then the matrix

V0 can be approximated well by a banded matrix and the computations

can be made much more efficiently. For longer length scales, V0 will be full

but will represent a greater degree of prior information so that the effec-

tive number of degrees of freedom associated with the model will be much

less than m + n. This suggests augmenting the model using an empirical

model involving a modest number of degrees of freedom but retaining the

desired spatial correlation.4 Thus, we regard e as described by an empirical

function $e(x, b) = \sum_{j=1}^{p} b_j e_j(x)$, expressed as a linear combination of basis functions. We impose a correlation structure by assigning a Gaussian prior to $b$ of the form $b \sim N(0, V_b)$. The issue is how to choose $V_b$ to impose the correct spatial correlation, in order that $\mathrm{cov}(e(x, b), e(x', b)) \approx k(x, x')$.

Suppose $z = (z_1, \ldots, z_m)^T$ is a dense sampling over the working range

for $x$, and let $V_0$ be the $m \times m$ variance matrix with $V_0(i, q) = k(z_i, z_q)$, and $E$ the associated observation matrix with $E_{ij} = e_j(z_i)$. If $E$ has QR factorisation $E = Q_1 R_1$ and $e \in N(0, V_0)$, then $b = R_1^{-1} Q_1^T e$ defines the empirical function $e(x, b)$ that fits closest to $e$ in the least squares sense. If $e \sim N(0, V_0)$, then
$$b = R_1^{-1} Q_1^T e \sim N(0, V_b), \qquad V_b = R_1^{-1} Q_1^T V_0 Q_1 R_1^{-T}.$$

We use Vb so defined as the prior variance matrix for b. Setting e = Eb,


then for $b \sim N(0, V_b)$, we have
$$e \sim N(0, V_e), \qquad V_e = E V_b E^T = P_1 V_0 P_1^T, \qquad P_1 = Q_1 Q_1^T.$$

The matrix $P_1$ is a projection and $P_1 V_0 P_1^T$ represents the variance matrix of the form $E V_b E^T$ that is closest to $V_0$ in some sense. The quality of the approximation can be quantified, for example, in terms of the trace
$$\mathrm{Tr}(V_0 - P_1 V_0 P_1^T), \qquad (5)$$
where $\mathrm{Tr}(A)$ is the sum of the diagonal elements of $A$.

If the dense sampling of points used to generate the variance matrix $V_0$ is regularly spaced, then $V_0$ is a Toeplitz matrix.5 Matrix-vector multiplications using a Toeplitz matrix of order $p$ can be executed in $O(p \log p)$ operations using the fast Fourier transform, and the matrix itself can be represented by $p$ elements.

If $V_b$ is factored as $V_b = U_b U_b^T$, then estimates $\hat{a}$ and $\hat{b}$ are determined by finding the least squares solution of (3) where now
$$\tilde{C} = \begin{bmatrix} E U_b/\sigma_M & C/\sigma_M \\ I & 0 \end{bmatrix}, \quad \tilde{a} = \begin{bmatrix} b \\ a \end{bmatrix}, \quad \tilde{y} = \begin{bmatrix} y/\sigma_M \\ 0 \end{bmatrix}. \qquad (6)$$

The matrix $\tilde{C}$ above is an $(m + p) \times (p + n)$ matrix whereas in (3) it is a $2m \times (m + n)$ matrix (and $m$ is likely to be much larger than $p$). The

unweighted model predictions $\hat{y}$ are given by
$$\hat{y} = S y, \qquad S = \frac{1}{\sigma_M^2}\,[\,E U_b \;\; C\,]\,(\tilde{C}^T \tilde{C})^{-1}\,[\,E U_b \;\; C\,]^T, \qquad (7)$$
where $\tilde{C}$ is given by (6). The shrinkage operator $S$ above has $n$ eigenvalues

equal to 1, a further p eigenvalues between 0 and 1 and all other eigenvalues

equal to 0.
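The construction of this section can be sketched in a few lines of Python/NumPy. The sketch below builds the prior variance V_b from a dense sampling using kernel (4), factors it, and solves system (6); the monomial basis, the jitter added before the Cholesky factorisation and all inputs are illustrative assumptions, not the original computation.

import numpy as np

def kernel(x, xq, sigma, lam):
    """Spatial correlation kernel (4): sigma^2 exp(-(x - x')^2 / (2 lam^2))."""
    return sigma**2 * np.exp(-0.5 * ((x[:, None] - xq[None, :]) / lam)**2)

def empirical_prior(z, p, sigma, lam):
    """Prior variance V_b = R1^{-1} Q1^T V0 Q1 R1^{-T} for a p-term polynomial model."""
    V0 = kernel(z, z, sigma, lam)
    E = np.vander(z, p, increasing=True)          # E[i, j] = e_j(z_i): monomial basis (illustrative)
    Q1, R1 = np.linalg.qr(E)
    Rinv = np.linalg.inv(R1)
    return Rinv @ Q1.T @ V0 @ Q1 @ Rinv.T

def augmented_empirical_fit(x, y, C, p, sigma, lam, sigma_M):
    """Solve system (6): base model C a plus correlated empirical polynomial E b."""
    z = np.linspace(x.min(), x.max(), 200)        # dense sampling of the working range
    Vb = empirical_prior(z, p, sigma, lam)
    E = np.vander(x, p, increasing=True)          # basis evaluated at the data sites
    jitter = 1e-9 * np.trace(Vb) / p              # small jitter for numerical stability
    Ub = np.linalg.cholesky(Vb + jitter * np.eye(p))
    top = np.hstack([E @ Ub, C]) / sigma_M
    bottom = np.hstack([np.eye(p), np.zeros((p, C.shape[1]))])
    sol, *_ = np.linalg.lstsq(np.vstack([top, bottom]),
                              np.concatenate([y / sigma_M, np.zeros(p)]), rcond=None)
    b_hat, a_hat = sol[:p], sol[p:]
    return C @ a_hat + E @ (Ub @ b_hat)           # fitted curve at the data sites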

4. Example applications

4.1. Instrument drift

In the calibration of a 2-dimensional optical standard using a coordinate

measuring machine (CMM), it was noticed that the origin of the mea-

surement system drifted by a few micrometres over time, possibly due to thermal effects arising from the CMM's internal heat sources as it moves. The drift in x and y is modelled as a quadratic polynomial (n = 3) augmented by an order p = 10 polynomial with a preassigned correlation derived from kernel k in (4) with λ = 0.25. The time units are scaled so that 0 represents the start time of the measurements and 1 the end time


(in reality, a number of hours later). Figure 1 shows the measured data and

the fitted functions for the x- and y-coordinate drift.

The shrinkage operator S defined by (7) has n + p nonzero eigenvalues, n of them equal to 1 and the rest between 0 and 1. For this model, there are in fact only p nonzero eigenvalues, because the order p augmenting polynomial e(x, b) has n basis functions in common with the polynomial representing the drift. Table 1 shows those eigenvalues of the shrinkage operator S that are not necessarily 1 or 0 for different values of the length scale parameter λ. Increasing λ decreases the effective number of degrees of freedom from a maximum of p = 10 to a minimum of n = 3.

Fig. 1. The fits of quadratic drift functions augmented with the spatially correlated polynomials of order p = 10 corresponding to λ = 0.25 to data measuring instrument drift in x- and y-coordinates, top and bottom, respectively. (Axes: time/arbitrary units versus coordinate drift/mm.)

4.2. Trends in oxygen data

Since the early 1990s, the Scripps Institute7 has monitored the change in the

ratio of O2 to N2, relative to a reference established in the 1980s, at 9 remote

locations around the globe. Figure 2 shows the record at two sites, Alert

in Canada, latitude 82 degrees North, and Cape Grim, Australia, latitude

41 degrees South. All the records i) have missing data, ii) show a yearly

cyclical variation, and iii) show an approximately linear decrease. The units


Table 1. Modelling drift: non-unit and non-zero eigenvalues (rows 2 to 8) associated with the shrinkage operator S for different values of λ (row 1), for p = 10 and n = 3. The number of effective degrees of freedom are given in row 9.

λ        0.10   0.15   0.20   0.25   0.30   0.35   0.40
         0.99   0.99   0.99   0.99   0.98   0.97   0.95
         0.99   0.99   0.98   0.97   0.93   0.86   0.74
         0.99   0.98   0.93   0.79   0.53   0.26   0.11
         0.98   0.95   0.80   0.45   0.15   0.04   0.01
         0.97   0.83   0.40   0.08   0.02   0.00   0.00
         0.92   0.59   0.13   0.02   0.00   0.00   0.00
         0.81   0.25   0.02   0.00   0.00   0.00   0.00
d.o.f.   9.65   8.59   7.24   6.29   5.61   5.14   4.82

associated with the vertical axis in figure 2 are such that a decrease of 100

units represents a 0.01 % decrease in the ratio of oxygen to nitrogen.

Fig. 2. Oxygen data gathered by the Scripps Institute for two sites: Alert, Canada (top) and Cape Grim, Australia (bottom). (Axes: time/year versus δ(O2/N2)/10^-6.)

The data is analysed using a model of the form

$$y = f_1(t, a_1) + f_2(t, a_2) + e_1(t, b) + e + \varepsilon,$$
where $f_1(t, a_1)$ represents a linear trend, $f_2(t, a_2)$ a Fourier series to model cyclical variation, $e_1(t, b)$ a temporally correlated polynomial to model the long term trend with a time constant $\lambda_1$ equal to approximately 5 years, $e$ a temporally correlated effect to model short term seasonal variations with a


time constant λ2 equal to approximately 1 month, and ε represents ran-

dom noise associated with short term variations and measurement effects.

It would be possible to use a temporally correlated polynomial e2(t, b2)

to represent the shorter term variations. However, in order to deliver the

appropriate effective number of degrees of freedom (of the order of 60) or,

in other terms, approximate the variance matrix V0 well, at least an order

p = 100 polynomial would be required. This does not pose any real prob-

lem (if orthogonal polynomials are used) but it is computationally more

efficient to exploit the fact that the variance matrix V0 is effectively banded

with a bandwidth of about 25 for the type of data considered here. If the

extent of the spatial/temporal correlation length is small relative to the

span of the data, then the variance matrix can be approximated well by

a banded matrix (and there is a large number of degrees of freedom in the

model) while if the spatial/temporal correlation length is comparable with

data span, the variance matrix can be approximated well using a correlated

empirical model (and there are a modest number of degrees of freedom). In

either case, the computations can be made efficiently.

Figures 3 and 4 show the results of calculations on the data in figure 2.

The units in these graphs are the same as that for figure 2. Figure 3 shows

the fitted model for the time period between 2000 and 2004. The Fourier

model included terms of up to order 4, so that 8 Fourier components are

present. Note that the northern hemisphere fit (top) is out of phase with the

southern hemisphere fit (bottom) by approximately half a year. The figure

also shows the uncertainty band representing ± 2 u(yi), where u(yi) is the

standard uncertainty of the model prediction at the ith data point. Figure 3

relates to a linear trend function f1(t,a1). We can also perform the same

fitting but with a quadratic trend function. Figure 4 shows the differences

in the combined trend functions (polynomial plus augmented polynomial)

for the two datasets in figure 2 along with the estimated uncertainty associ-

ated with the fitted trend functions. It can be seen that both sets of trend

functions agree with each other well, relative to the associated uncertain-

ties. Thus, the effect of a choice of linear or quadratic trend function is

minimised by the use of an augmented model that can compensate for the

mismatch between the model and the data. The invariance with respect to

such model choices is one of the benefits in employing an augmented model.

5. Concluding remarks

In this paper we have been concerned with fitting data that is subject to

systematic effects that are only partially understood. We use a model that


Fig. 3. Fitted model to oxygen data in figure 2 shown for the period 2000 to 2004.

Fig. 4. Differences in the combined trend functions f1(t, a1) + e1(t, b) determined for the datasets in figure 2 for the cases of linear and quadratic f1(t, a1). The bottom graph shows the estimated uncertainties associated with the trend functions (these uncertainties are virtually the same for the linear and quadratic functions). Uncertainties are larger at the ends of the data record due to the fact that the temporally correlated models have only future or past data to determine model estimates.

reflects what we believe about the system response but augmented by a

model to account for our incomplete knowledge. Gaussian processes (GP)


models can be used to provide these augmentations but can be compu-

tationally expensive for large datasets. A more computationally efficient

approach can be found by replacing the Gaussian process model with an

empirical model that provides almost the same functionality as the GP

model. The correlation structure in the GP model is translated to a corre-

lation structure applying to the parameters associated with the empirical

model and acts as a regularisation term.

Acknowledgements

This work was funded by the UK’s National Measurement Office Innova-

tion, Research and Development programme. I thank my colleague Dr Dale

Partridge for his comments on an earlier draft of this paper. The support

of the AMCTM programme committee is gratefully acknowledged.

References

1. H. Akaike. A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19:716–723, 1974.
2. H. Chipman, E. I. George, and R. E. McCulloch. The practical implementation of Bayesian model selection. Institute of Mathematical Statistics, Beachwood, Ohio, 2001.
3. M. G. Cox. The least squares solution of overdetermined linear equations having band or augmented band structure. IMA J. Numer. Anal., 1:3–22, 1981.
4. A. B. Forbes and H. D. Minh. Design of linear calibration experiments. Measurement, 46(9):3730–3736, 2013.
5. G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University Press, Baltimore, third edition, 1996.
6. T. Hastie, R. Tibshirani, and J. Friedman. Elements of Statistical Learning. Springer, New York, 2nd edition, 2011.
7. R. Keeling. http://scrippsO2.ucsd.edu/, accessed 12 November 2014.
8. M. C. Kennedy and A. O'Hagan. Bayesian calibration of computer models. J. Roy. Stat. Soc. B, 64(3):425–464, 2001.
9. C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, Cambridge, Mass., 2006.
10. G. Schwarz. Estimating the dimension of a model. Annals of Statistics, 6:461–464, 1978.
11. G. Wahba. Spline models for observational data. SIAM, Philadelphia, 1990.
12. X.-S. Yang and A. B. Forbes. Model and feature selection in metrology data approximation. In E. H. Georgoulis, A. Iske, and J. Levesley, editors, Approximation Algorithms for Complex Systems, pages 293–307, Heidelberg, 2010. Springer-Verlag.


9610-04:Advanced Mathematical and Computational Tools

Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 29–37)

MODELS AND METHODS OF DYNAMIC MEASUREMENTS:

RESULTS PRESENTED BY ST. PETERSBURG

METROLOGISTS*

V.A. GRANOVSKII

Concern CSRI Elektropribor, JSC

30, Malaya Posadskaya Str., 197046, Saint Petersburg, Russia

E-mail:[email protected]

The paper reviews the results of St. Petersburg metrologists' work on the development of dynamic measurement and instrument models, as well as algorithms for measurement data processing. The results were obtained in the 1960s–1980s within

the framework of three generalized formal problems, two of which are related to ill-posed

inverse problems. The developed methods for dynamic measurement instrument

identification are presented. The general characteristic is given for the problem of an input

signal recovery of the dynamic measurement instrument. The topicality of the obtained

results is pointed out.

Keywords: Dynamic measurements; Instrument; Dynamic characteristic, Inverse problem,

Correctness, Regularity.

1. Introduction

The memoir [1] should be considered the first work on the theory of dynamic

measurements in St. Petersburg.

The regular development of the problem started in the 1960s in two research centers: the Mendeleev Institute of Metrology (VNIIM) and the Research Institute of Electrical Instruments (VNIIEP). This work passed through two stages; the publication of the books [2, 3], respectively, can be considered the origin

of each stage.

The paper is aimed at reviewing the work of St. Petersburg metrologists, the

results of which remain relevant today.

1.1. Dynamic measurements [3-5]

The idea of dynamic measurements is usually associated with the presence of a

substantial component of a measurement error, caused by the discrepancy

* This work was supported by the Russian Foundation for Basic Research (project 13-08-00688)


between inertial (dynamic) properties of an instrument and the rate of measured

process change (frequency content). Such an interpretation determines the range

of problems for solving formal tasks of modeling and algorithm elaboration.

Concrete definition of problems first of all requires the analysis of relation

between measurement and the inverse problem.

1.1.1. Measurement and inverse mathematical physics problem

Being an instrument of knowledge, the measurement is aimed at recovery of the

phenomenon under investigation by the measured response of an object to the

controllable effect. The measurement itself acts as recovery of a measured

attribute by the result of its influence on an instrument, in the context of the

object model. Processing of measurement data is the recovery of actual effect on

the instrument, disturbed by a chain of physical measurement transformations.

1.1.2. Formal model of dynamic measurements

A model of direct dynamic measurements [3] is considered in order to find basic

features of dynamic measurements (Fig. 1).

Figure 1. Block diagram of dynamic measurement error origin

A true signal xt(t) contains the information about the property of the object

under investigation OI. A measurand in general is defined as a result of

functional transformation of signal xt(t):

Q = ΦnBn[xt(t)], (1)


where Bn – a required transformation operator; Φn – a required functional. Due to

the measuring instrument MI influence on the object, the real output signal of the

target of research xout(t) differs from xt(t). The device input is also affected by

disturbance ξ(t). Hence, the real input signal xin(t) differs from xout(t). The

instrument transforms xin(t) into the output signal y(t). The real transformation

operator Br.t, expressing the properties of the type of instruments, differs from Bn

because of imperfect principle of operation and device design. The properties of

a certain instrument, expressed by operator Br.c, differ from the typical ones.

Besides, the instrument is affected by influence quantities v1, …, vl. The

combined action of these quantities may be expressed by an influence signal ζ(t),

disturbing the operator Br.c. As a result, the operator Br, realized during

measurements, differs from Bn. Calculation of the functional Φ(y) is included

either in the instrument operation algorithm, or in the algorithm of measurement

data processing MP, particularly, processing the output signal values, read out

from the device scale. In the latter case the readout error and the calculation error

should be taken into account separately.

The following parameter should be taken as a measurement result:

Ǭ = Φr[ў(t)] = ΦrBr[xr(t)] + µ(t). (2)

An error

δΣ = Ǭ – Q = ΦrBr [xr(t)] + µ(t) – ΦnBn[xt(t)]. (3)

On the assumption of linearity of operators and functionals:

δΣ = Φn[δBr.c(xд)] + Φn[δBr.t(xin)] + Φn[δBn(xin)] + Φn[µ(t)] + Φn[Bn(δx)] +

δΦ[Br(xin)], (4)

where δBr.c = Br-Br.c; δBr.t = Br.c-Br.t; δBr.n = Br.t-Bn; δΦ = Φr-Φn; δx = δxt+ξ.

Linear operator B takes, in the time and frequency (complex) domain, the

different forms, to which the following total dynamic characteristics of the

instrument correspond: (a) set of differential equation structure and coefficients;

(b) impulse response g(t); (c) transient performance h(t); (d) complex frequency

characteristic W(jω) and its two components – amplitude- and phase-frequency

characteristics; (e) transfer function W(p).

1.1.3. Three typical problems of dynamic measurements

Metrological support of dynamic measurements is represented by a set of one

direct and two inverse problems. The direct problem is to determine the response

y of an instrument with the known dynamic properties (operator B) to the given

effect x:


$x, B \;\to\; y = Bx. \qquad (5)$

The first inverse problem is to determine the dynamic properties of the

instrument by the known test influence x and the instrument response to it y:

$x, y \;\to\; B^{*}, \quad (B^{*})^{-1}(y) = x. \qquad (6)$

Expression (6) is certainly symbolical.

The second inverse problem consists in the recovery of the input effect by

the known dynamic properties of the instrument and its response to the desired

effect:
$B, y \;\to\; x = B^{-1}(y). \qquad (7)$

Generally, inverse problems are related to ill-posed ones from Hadamard’s

viewpoint, and regularization methods should be used to solve them [6].

2. Results of solving typical problems of dynamic measurements

2.1. Direct problem of dynamic measurement [7-10]

2.1.1. Metrological statement of direct problem

The direct problem of dynamic measurements is concerned with the dynamic

measurement error estimation, or dynamic error of measurement.

From the expression (4) it follows that the basic contribution to the dynamic

error is induced by the difference between function g(t) and δ-function, or, what

is the same, by the difference between h(t) and the unit step.

In [10] the matrix of typical direct problem statements is analyzed. The

matrix is defined by varieties of dynamic characteristics of the instrument and

input signals. It contains above hundred specified tasks, from which only one

third have a solution; note that most of unsolved problems are those concerned

with particular dynamic characteristics of the instrument.

2.1.2. Direct problem solution

The work [9] presents the results of transformation error analysis for a variable

signal modeled by the stationary random process. The expression was obtained

for autocorrelation function of an error, as well as for error variance in the

steady-state mode, when the transformation operator is exactly known (in one

form or another). As for the process, it is assumed that we know either the

autocorrelation function, or the spectrum density of the process, or the

generating stochastic differential equation. Besides, the influence of the real

operator divergence from the nominal one on error estimation is studied.


The authors in paper [8] suggest estimating a transformation error by using

an inverse operator in the complex domain, the expression for which is derived

from the direct operator represented by the Taylor expansion. The estimates

obtained are reliable only when the input signal can be approximated by the low-

order polynomial.

2.2. The first inverse problem of dynamic measurements [3, 11-29]

2.2.1. Metrological statement of the problem of dynamic measurement

instrument identification

The first inverse problem of dynamic measurement, or the problem of determining the total dynamic characteristics, requires, in addition to dealing with the ill-posedness of the problem, accounting for the limited accuracy with which the required test effects on the instrument can be formed.

2.2.2. Identification problem solution for an instrument dynamic properties

Determining the total dynamic characteristics of the instrument is defined by the

peculiarities of the characteristics and the test signals. When characteristic signals are used, which fit the determination of the corresponding dynamic characteristics, we only have to compute the error of the characteristic estimate caused by the non-ideal test signal. The expressions for the errors are derived as applied to linearly or exponentially increasing test signals and to transient responses of instruments having linear models of the 1st and 2nd orders. Similar results were obtained for pulse characteristics and frequency responses, on the assumption that the real test signal is, respectively, a rectangular pulse and a sum of the first and second harmonics [3].

In the general case of determining characteristic g(t) or h(t) from the

convolution equation, by the known input x(t) and output y(t) signals, we have to

regularize it. Because the problem is an ill-posed one and a priori

information about the desired dynamic characteristic is very important, adaptive

identification methods have become widely used. The work [13] describes the

method of adaptive selection of regularization parameter by the statistical

criterion based on the given fractile of the χ2 distribution. The method is realized

by digitalization of the convolution equation and transfer to the matrix equation,

which is regularized and converted to the following form:

$$\left(A^T \Sigma^{-1} A + \lambda I\right)\mathbf{g} = A^T \Sigma^{-1} \mathbf{y}. \qquad (8)$$


The criterion of selecting an optimal regularization parameter λ has the

form:

$$\left(\mathbf{y} - A\mathbf{g}_\lambda\right)^T \Sigma^{-1} \left(\mathbf{y} - A\mathbf{g}_\lambda\right) \le \chi_l^2(1-\alpha). \qquad (9)$$
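As an illustration of this adaptive regularization scheme, the following Python/NumPy sketch assumes the reconstructed forms of (8) and (9) above; the matrix A (discretized convolution), the noise covariance Σ, the candidate λ values and the chi-square fractile are all placeholders supplied by the user (the fractile could be obtained, for example, from scipy.stats.chi2.ppf).

import numpy as np

def identify_impulse_response(A, y, Sigma, chi2_fractile, lambdas):
    """Adaptive choice of the regularization parameter for A g ~ y (sketch of (8)-(9))."""
    Sigma_inv = np.linalg.inv(Sigma)
    AtSA = A.T @ Sigma_inv @ A
    AtSy = A.T @ Sigma_inv @ y
    for lam in sorted(lambdas, reverse=True):      # try the strongest smoothing first
        g = np.linalg.solve(AtSA + lam * np.eye(A.shape[1]), AtSy)   # equation (8)
        r = y - A @ g
        if r @ Sigma_inv @ r <= chi2_fractile:     # statistical criterion (9)
            return lam, g
    return lam, g                                  # none satisfied (9): weakest smoothing tried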

In contrast to [13], the identification method [12] uses the iteration

procedure of selecting a vector of the instrument model parameters by the

squared criterion of discrepancy between the responses from the real instrument

and its model. The iteration method of selecting a dynamic model of the

instrument (in the form of transfer function) by the squared criterion of

discrepancy, presented in [3], is based on model generation by integration of

typical dynamic elements of the 1st and 2nd orders.

Being a dynamic model of the instrument, the differential equation is mostly

used for theoretical construction. But coefficients of linear differential equation

can be determined by successive integration, if a steady-state signal (or pulse

signal) is used as a test input signal [3]. Here the major problem is correct

determination of the equation order or the iteration procedure stopping criterion.

It may be based on convergence of discrepancy between the responses from the

instrument and its model, when its order increases.

The methods considered above were related to the instrument with the linear

dynamic model, because in this case general solutions are possible. As for

nonlinear instruments, their identification requires much a priori information.

Usually this means a limited number of model classes, first of all, Hammerstein

and Wiener integral equations.

The instrument identification methods with such models are considered in

[16]. They mean separate determination of characteristics of linear and nonlinear

elements of the model. The nonlinear element is identified in static mode, after

that the problem of linear element identification is solved. The solution process

quickly “branches” into variants based on versions of a particular instrument

model. The use of pseudorandom test sequence of pulses makes it possible to

ease the restrictions imposed on the model being identified.

2.3. The second inverse problem of dynamic measurements

2.3.1. Metrological statement of the problem of recovering the input signal

of dynamic measurement instrument

The recovery of the instrument input signal by the known output signal and total

dynamic characteristic of the instrument, in terms of metrology, means

correction of a dynamic error of input signal transformation or correction of non-

ideal dynamic characteristic. Put another way, the signal processing is


required to go from real pulse and transient characteristics, and frequency

response to ideal ones: $g_u(t) = \delta(t)$; $h_u(t) = 1(t)$; $A_u(\omega) = 1$; $\Phi_u(\omega) = 0$.

The problem regularization is very difficult, owing to the nature of the

available a priori information. In contrast to identification, during the recovery

we rarely come up against the a priori known input signal type. For the same

reason it is more difficult to implement the iteration methods.

2.3.2. Solution of the input signal recovery problem for the dynamic

measurement instrument

The works [3, 30] are devoted to the solution of this problem. The authors try to analyze its peculiarities and to outline ways of solving this most difficult inverse problem of dynamic measurements. Because it is impossible to regularize the problem without relevant a priori information, we obtain an unlimited number of particular solutions, each with a restricted domain of applicability, instead of general solutions.

2.4. Overall evaluation of results

In recent years a European programme on dynamic measurement metrology has been implemented [31, 32]. The published results show that our colleagues from six countries of the European Community are at the initial stage of the path that St. Petersburg metrologists followed in the 1970s–1990s. So, the results obtained in the past remain valid and relevant.

References

1. А. N. Кrylov, Some remarks on ‘kreshers’ and indicators, Proceedings of

the St. Petersburg Academy of Sciences, 1909.

2. А. N. Gordov, Temperature Measurement of Gas Streams (Мashgiz,

Moscow–Leningrad, 1962).

3. V. А. Granovskii, Dynamic measurements: Basics of metrological

assurance (Leningrad, Energoatomizdat, 1984).

4. V. О. Аrutyunov, V. А. Granovskii, V. S. Pellinets, S. G. Rabinovich,

D. F. Таrtakovskiy, К. P. Shirokov, Basic concepts of dynamic

measurement theory, Dynamic measurements. Abstracts of the 1st

All-Union

Symposium (VNIIM, 1974).

5. G. I. Каvalerov, S. М. Маndel’shtam, Introduction in Information Theory

of Measurement, (Moscow, Energiya, 1974).

6. А. N. Тikhonov, V. Ya. Аrsenin, Solution methods for ill-posed problems

(Nauka, Moscow, 1974).


7. Ya. G. Neuymin, I. A. Popova, B. L. Ryvkin, B. A. Schkol’nik, Estimate of

measurement dynamic error, Metrology (Measurement Techniques), 1

(1973).

8. М. D. Vaysband, Approximate method for dynamic errors calculation of

linear conversion, Measurement Techniques, 12 (1975).

9. N. I. Seregina, G. N. Solopchenko, V. M. Chrumalo, Error determination of

direct measurement of variable quantities, Measurements, control,

automation, 4 (1978).

10. V. А. Granovskii, О. А. Кudryavtsev, Error estimation of direct dynamic

measurements, Metrology (Measurement Techniques), 1 (1981).

11. B. А. Shkol’nik, Estimation error of dynamic characteristic by system

response on periodic signal, in Transactions of the metrology institutes of

the USSR, 137 (197) (1972).

12. M. D. Brener, V. M. Chrumalo, On dynamic parameter estimates of typical

inertial measuring instruments, Тransactions, VNIIEP, 16 (1973).

13. N. Ch. Ibragimova, G. N. Solopchenko, Estimate of weight and transfer

functions by statistical regularization, Transactions, VNIIEP, 16 (1973).

14. N. А. Zeludeva, Some numerical transformations of measuring instrument

transfer functions, Transactions, VNIIEP, 16 (1973).

15. V. К. Zelenyuk, D. F. Тartakovskiy, Dynamic characteristic of film

measuring transducers, Measurement Techniques, 6 (1973).

16. М. D. Brener, G. N. Solopchenko, On dynamic characteristic estimates in

the process of non-linear measuring instruments calibration, Transactions,

VNIIEP, 18 (1973).

17. V. А. Granovskii, Оn dynamic properties normalization of measuring

transducers, Problems of theory and designing of information transducers:

Records of seminar, Cybernetics Institute of the Ukraine Academy of

sciences (Кiev, 1973).

18. B. А. Schkol’nik, Development of methods and apparatus for dynamic

characteristics determination of measuring transducers of continuous time

electrical signals, Abstract of a thesis on academic degree (VNIIM, 1974).

19. V. О. Аrutyunov, V. А. Granovskii, S. G. Rabinovich, Standardization and

determination of measuring instrument dynamic properties, Dynamic

measurements. Abstracts of the 1st

All-Union Symposium (VNIIM, 1974).

20. V. А. Granovskii, Yu. S. Etinger, Determination procedure of measuring

instrument dynamic properties, Меtrology (Measurement Techniques), 10

(1974).

21. G. N. Solopchenko, Dynamic error of measuring instrument identification,

Меtrology (Measurement Techniques), 1 (1975)


22. V. О. Arutyunov, V. А. Granovskii, Standardization and determination of

dynamic properties of measuring instruments, Measurement Techniques, 12

(1975).

23. V. А. Granovskii, Determination procedure of error determination of full

dynamic characteristics of measuring instruments, Measurement

Techniques, 7 (1977).

24. V. А. Granovskii, Yu. S. Etinger, Transfer function determination of

measuring instruments with distributed parameters, Dynamic

measurements. Abstracts of the 2nd

All-Union Symposium (VNIIM, 1978).

25. S. M. Mandel’shtam, G. N. Solopchenko, Dynamic properties of measuring

and computing aggregates, Measurement Techniques, 22, 4 (1979).

26. G. N. Solopchenko, I. B. Chelpanov, Method for determining and

normalizing the measuring-equipment dynamic characteristics,

Measurement Techniques, 22, 4 (1979).

27. G. N. Solopchenko, Minimal fractionally rational approximation of the

complex frequency characteristic of a means of measurement,

Measurement Techniques, 25, 4 (1982).

28. V. А. Granovskii, М. B. Мinz, Yu. S. Etinger, Dynamic characteristics

determination of sulphide-cadmium photoresistors, Metrology

(Measurement Techniques), 10 (1982).

29. V. Ya. Kreinovich, G. N. Solopchenko, Canonical-parameter estimation for

instrument complex frequency response, Measurement Techniques, 36, 9

(1993).

30. R. А. Poluektov, G. N. Solopchenko, Correction methods of dynamic

errors, Avtometriya, 5 (1971).

31. T. Esward, C. Elster, J. P. Hessling, Analysis of Dynamic Measurements:

New Challenges Require New Solutions, in Proc. XIII International

Congress IMEKO (Lisbon, 2009).

32. C. Bartoli, M. F. Beug, T. Bruns, C. Elster, L. Klaus, M. Kobusch,

C. Schlegel, T. Esward, A. Knott, S. Saxholm, Traceable Dynamic

Measurement of Mechanical Quantities: Objectives and First Results of

this European Project, in Proc. XX International Congress IMEKO (Busan,

2012).


Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. – )

INTERVAL COMPUTATIONS AND INTERVAL-RELATED

STATISTICAL TECHNIQUES: ESTIMATING

UNCERTAINTY OF THE RESULTS OF DATA PROCESSING

AND INDIRECT MEASUREMENTS

V. KREINOVICH

Computer Science Department, University of Texas at El Paso,

El Paso, Texas 79968, USA
E-mail: [email protected]

http://www.cs.utep.edu/vladik

In many practical situations, we only know the upper bound ∆ on the measurement error: |∆x| ≤ ∆. In other words, we only know that the measurement error is located on the interval [−∆, ∆]. The traditional approach is to assume that ∆x is uniformly distributed on [−∆, ∆]. In some situations, however, this approach underestimates the error of indirect measurements. It is therefore desirable to directly process this interval uncertainty. Such "interval computations" methods have been developed since the 1950s. In this paper, we provide a brief overview of related algorithms and results.

Keywords: interval uncertainty, interval computations, interval-related statis-

tical techniques

1. Need for Interval Computations

Data processing and indirect measurements. We are often interested

in a physical quantity y that is difficult (or impossible) to measure directly:

distance to a star, amount of oil in a well. A natural idea is to measure y

indirectly: we find easier-to-measure quantities x1, . . . , xn related to y by a

known relation y = f(x1, . . . , xn), and then use the results xi of measuring

xi to estimate y:

$\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_n \;\longrightarrow\; \boxed{f} \;\longrightarrow\; \tilde{y} = f(\tilde{x}_1, \ldots, \tilde{x}_n)$


This is known as data processing.

Estimating uncertainty of the results of indirect measurements: a

problem. Measurements are never 100% accurate. The actual value $x_i$ of the $i$-th measured quantity can differ from the measurement result $\tilde{x}_i$; in other words, there are measurement errors $\Delta x_i \stackrel{\rm def}{=} \tilde{x}_i - x_i$. Because of that, the result $\tilde{y} = f(\tilde{x}_1, \ldots, \tilde{x}_n)$ of data processing is, in general, different from the actual value $y$: $\tilde{y} = f(\tilde{x}_1, \ldots, \tilde{x}_n) \neq f(x_1, \ldots, x_n) = y$. It is desirable to describe the error $\Delta y \stackrel{\rm def}{=} \tilde{y} - y$ of the result of data processing. For this, we must have information about the errors of direct measurements.

Uncertainty of direct measurements: need for overall error bounds

(i.e., interval uncertainty). Manufacturers of a measuring instrument

(MI) usually provide an upper bound $\Delta_i$ for the measurement error: $|\Delta x_i| \le \Delta_i$. (If no such bound is provided, then $\tilde{x}_i$ is not a measurement, it is a wild guess.)
Once we get the measured value $\tilde{x}_i$, we can thus guarantee that the actual (unknown) value of $x_i$ is in the interval $\mathbf{x}_i \stackrel{\rm def}{=} [\tilde{x}_i - \Delta_i, \tilde{x}_i + \Delta_i]$. For example, if $\tilde{x}_i = 1.0$ and $\Delta_i = 0.1$, then $x_i \in [0.9, 1.1]$.

In many practical situations, we also know the probabilities of different

values ∆xi within this interval. It is usually assumed that ∆xi is normally

distributed with 0 mean and known standard deviation.

In practice, we can determine the desired probabilities by calibration,

i.e., by comparing the results $\tilde{x}_i$ of our MI with the results $\tilde{x}_i^{\,\rm st}$ of measuring

the same quantity by a standard (much more accurate) MI. However, there

are two cases when calibration is not done: (1) cutting-edge measurements

(e.g., in fundamental science), when our MI is the best we have, and (2)

measurements on the shop floor, when calibration of MI is too expensive.

In both cases, the only information we have is the upper bound on the

measurement error. In such cases, we have interval uncertainty about the

actual values xi; see, e.g.,11.

Interval computations: a problem. When the inputs xi of the data

processing algorithms are known with interval uncertainty, we face the fol-

lowing problem:

• Given: an algorithm $y = f(x_1, \ldots, x_n)$ and $n$ intervals $\mathbf{x}_i = [\underline{x}_i, \overline{x}_i]$.
• Compute: the corresponding range of $y$:
$$[\underline{y}, \overline{y}] = \{ f(x_1, \ldots, x_n) \mid x_1 \in [\underline{x}_1, \overline{x}_1], \ldots, x_n \in [\underline{x}_n, \overline{x}_n] \}.$$


$\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n \;\longrightarrow\; \boxed{f} \;\longrightarrow\; \mathbf{y} = f(\mathbf{x}_1, \ldots, \mathbf{x}_n)$

It is known that this problem is NP-hard even for quadratic f ; see, e.g.,8.

In other words, unless P=NP (which most computer scientists believe to be

impossible), no feasible algorithm is possible that would always compute the

exact range $\mathbf{y}$. We thus face two major challenges: (1) find situations where feasible algorithms are possible, and (2) in situations where the exact computation of $\mathbf{y}$ is not feasible, find feasible algorithms for computing a good approximation $\mathbf{Y} \supseteq \mathbf{y}$.

2. Alternative Approach: Maximum Entropy (MaxEnt)

Idea: a brief reminder. The traditional engineering approach to uncertainty is to use probabilistic techniques, based on probability density functions (pdf) $\rho(x)$ and cumulative distribution functions (cdf) $F(x) \stackrel{\rm def}{=} P(X \le x)$. As we have mentioned, in many practical applications, it is very difficult to come up with the probabilities. In such applications, many different probability distributions are consistent with the same observations. In such situations, a natural idea is to select one of these distributions – e.g., the one with the largest entropy $S \stackrel{\rm def}{=} -\int \rho(x) \cdot \ln(\rho(x))\, dx$; see, e.g.,5.

Often, this idea works. This approach often leads to reasonable results.

For example, for the case of a single variable x, if all we know is that

x ∈ [x, x], then MaxEnt leads to a uniform distribution on [x, x]. For sev-

eral variables, if we have no information about their dependence, MaxEnt

implies that different variables are independently distributed.

Sometimes, this idea does not work. Sometimes, the results of MaxEnt

are misleading. As an example, let us consider the simplest algorithm y =

x1 + . . . + xn, with ∆xi ∈ [−∆,∆]. In this case, ∆y = ∆x1 + . . . + ∆xn.

The worst case is when ∆i = ∆ for all i, then ∆y = n ·∆.

What will MaxEnt return here? If all ∆xi are uniformly distributed,

then for large n, due to the Central Limit Theorem, ∆y is approximately


normal, with $\sigma = \Delta \cdot \dfrac{\sqrt{n}}{\sqrt{3}}$.
With confidence 99.9%, we can thus conclude that $|\Delta y| \le 3\sigma$; so, we get $\Delta y \sim \sqrt{n}$, but, as we mentioned, it is possible that $\Delta y = n \cdot \Delta \sim n$, which, for large $n$, is much larger than $\sqrt{n}$.

The conclusion from this example is that using a single distribution

can be very misleading, especially if we want guaranteed results – and we

do want guaranteed results in high-risk application areas such as space

exploration or nuclear engineering.
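To make the arithmetic concrete, here is a small Python/NumPy simulation (the values n = 100 and ∆ = 1 are invented for illustration) comparing the MaxEnt-style 3σ bound with the guaranteed worst-case bound n·∆.

import numpy as np

n, Delta, trials = 100, 1.0, 100_000
rng = np.random.default_rng(1)
# MaxEnt model: independent errors, each uniform on [-Delta, Delta].
dy = rng.uniform(-Delta, Delta, size=(trials, n)).sum(axis=1)
sigma = Delta * np.sqrt(n / 3.0)
print(np.mean(np.abs(dy) <= 3 * sigma))   # ~0.999: the 3*sigma bound (~17.3) almost always holds
print(n * Delta)                          # 100.0: the guaranteed worst-case bound is much larger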

3. Possibility of Linearization

Linearization is usually possible. Each interval has the form $[\tilde{x}_i - \Delta_i, \tilde{x}_i + \Delta_i]$, where $\tilde{x}_i$ is the midpoint and $\Delta_i$ the half-width. Possible values $x_i$ are $x_i = \tilde{x}_i + \Delta x_i$, with $|\Delta x_i| \le \Delta_i$, so $f(x_1, \ldots, x_n) = f(\tilde{x}_1 + \Delta x_1, \ldots, \tilde{x}_n + \Delta x_n)$. The values $\Delta_i$ are usually reasonably small, hence the values $\Delta x_i$ are also small. Thus, we can expand $f$ into a Taylor series and keep only linear terms in this expansion:
$$f(\tilde{x}_1 + \Delta x_1, \ldots) = \tilde{y} + \sum_{i=1}^{n} c_i \cdot \Delta x_i, \quad \text{where } \tilde{y} \stackrel{\rm def}{=} f(\tilde{x}_1, \ldots) \text{ and } c_i \stackrel{\rm def}{=} \frac{\partial f}{\partial x_i}.$$
Here, $\max(c_i \cdot \Delta x_i) = |c_i| \cdot \Delta_i$, so the range of $f$ is $[\tilde{y} - \Delta, \tilde{y} + \Delta]$, where $\Delta = \sum_{i=1}^{n} |c_i| \cdot \Delta_i$.

Towards an algorithm. To compute $\Delta = \sum_{i=1}^{n} |c_i| \cdot \Delta_i$, we need to find $c_i$. If we replace one of the $\tilde{x}_i$ with $\tilde{x}_i + \Delta_i$, then, due to linearization, we get
$$\tilde{y}_i \stackrel{\rm def}{=} f(\tilde{x}_1, \ldots, \tilde{x}_{i-1}, \tilde{x}_i + \Delta_i, \tilde{x}_{i+1}, \ldots, \tilde{x}_n) = \tilde{y} + c_i \cdot \Delta_i.$$
Thus, $|c_i| \cdot \Delta_i = |\tilde{y}_i - \tilde{y}|$ and hence $\Delta = \sum_{i=1}^{n} |\tilde{y}_i - \tilde{y}|$.

Resulting algorithm. Compute $\tilde{y} = f(\tilde{x}_1, \ldots, \tilde{x}_n)$, compute the $n$ values $\tilde{y}_i = f(\tilde{x}_1, \ldots, \tilde{x}_{i-1}, \tilde{x}_i + \Delta_i, \tilde{x}_{i+1}, \ldots, \tilde{x}_n)$, then compute $\Delta = \sum_{i=1}^{n} |\tilde{y}_i - \tilde{y}|$ and the interval $[\tilde{y} - \Delta, \tilde{y} + \Delta]$.
This algorithm requires $n + 1$ calls to $f$: to compute $\tilde{y}$ and the $n$ values $\tilde{y}_i$.
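The linearization-based algorithm can be written in a few lines; the Python/NumPy sketch below uses an invented function f and invented measured values and bounds purely for illustration.

import numpy as np

def linearized_bounds(f, x_meas, deltas):
    """Estimate [y~ - Delta, y~ + Delta] using n + 1 calls to f (linearization approach)."""
    y0 = f(np.asarray(x_meas, dtype=float))
    Delta = 0.0
    for i, d in enumerate(deltas):
        x_pert = np.asarray(x_meas, dtype=float)
        x_pert[i] += d                         # replace x~_i by x~_i + Delta_i
        Delta += abs(f(x_pert) - y0)           # |c_i| * Delta_i = |y~_i - y~|
    return y0 - Delta, y0 + Delta

# Invented example: f(x1, x2, x3) = x1 * x2 + x3, x~ = (1.0, 2.0, 0.5), Delta_i = 0.05.
f = lambda x: x[0] * x[1] + x[2]
print(linearized_bounds(f, [1.0, 2.0, 0.5], [0.05, 0.05, 0.05]))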

Towards a faster algorithm. When the number of inputs n is large, n+1

calls may be too long. To speed up computations, we can use the following


property of the Cauchy distribution with density $\rho_\delta(x) = \dfrac{\delta}{\pi\,(\delta^2 + x^2)}$: if $\eta_i$ are independently Cauchy-distributed with parameters $\Delta_i$, then $\eta \stackrel{\rm def}{=} \sum_{i=1}^{n} c_i \cdot \eta_i$ is Cauchy-distributed with parameter $\Delta = \sum_{i=1}^{n} |c_i| \cdot \Delta_i$.

Once we get simulated Cauchy-distributed values η, we can estimate

∆ by the Maximum Likelihood method. We also need to scale ηi to the

interval [−∆i,∆i] on which the linear approximation is applicable.

Resulting faster algorithm.7 First, we compute $\tilde{y} = f(\tilde{x}_1, \ldots, \tilde{x}_n)$. For some $N$ (e.g., 200), for $k = 1, \ldots, N$, we repeatedly:

• use the random number generator to compute $r^{(k)}_i$, $i = 1, 2, \ldots, n$, uniformly distributed on $[0, 1]$;
• compute Cauchy-distributed values as $c^{(k)}_i = \tan(\pi \cdot (r^{(k)}_i - 0.5))$;
• compute the largest value $K$ of the values $|c^{(k)}_i|$;
• compute simulated "actual values" $x^{(k)}_i = \tilde{x}_i + \dfrac{\Delta_i \cdot c^{(k)}_i}{K}$;
• apply $f$ and compute $\Delta y^{(k)} = K \cdot \bigl(f(x^{(k)}_1, \ldots, x^{(k)}_n) - \tilde{y}\bigr)$.

Then, we compute $\Delta \in \bigl[0, \max_k |\Delta y^{(k)}|\bigr]$ by applying the bisection method to the equation
$$\frac{1}{1 + \bigl(\Delta y^{(1)}/\Delta\bigr)^2} + \ldots + \frac{1}{1 + \bigl(\Delta y^{(N)}/\Delta\bigr)^2} = \frac{N}{2}.$$
We stop when we get $\Delta$ with accuracy $\approx 20\%$ (from the practical viewpoint, an uncertainty of 1% and an uncertainty of 1.2% are approximately the same).

The Cauchy-variate algorithm requires $N \approx 200$ calls to $f$. So, when $n \gg 200$, it is much faster than the above linearization-based algorithm.
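A Python/NumPy sketch of the Cauchy-deviate algorithm as described above (the bisection is run for a fixed number of steps; the example function and bounds are invented):

import numpy as np

def cauchy_deviate_delta(f, x_meas, deltas, N=200, seed=0):
    """Cauchy-deviate estimate of Delta = sum_i |c_i| Delta_i using about N calls to f."""
    rng = np.random.default_rng(seed)
    x_meas = np.asarray(x_meas, dtype=float)
    deltas = np.asarray(deltas, dtype=float)
    y0 = f(x_meas)
    dys = np.empty(N)
    for k in range(N):
        r = rng.random(x_meas.size)
        c = np.tan(np.pi * (r - 0.5))            # simulated Cauchy-distributed values
        K = np.max(np.abs(c))
        dys[k] = K * (f(x_meas + deltas * c / K) - y0)
    # Maximum-likelihood equation: sum_k 1/(1 + (dy_k/Delta)^2) = N/2, solved by bisection.
    lo, hi = 0.0, np.max(np.abs(dys)) + 1e-15
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if np.sum(1.0 / (1.0 + (dys / mid) ** 2)) > N / 2.0:
            hi = mid                             # sum too large: Delta estimate too big
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Invented example: f(x1, x2) = x1 + x2 with Delta_1 = Delta_2 = 0.1 (true Delta = 0.2).
print(cauchy_deviate_delta(lambda x: x[0] + x[1], [1.0, 2.0], [0.1, 0.1]))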

4. Beyond Linearization, Towards Interval Computations

Linearization is sometimes not sufficient. In many application areas,

it is sufficient to have an approximate estimate of y. However, sometimes, we

need to guarantee that y does not exceed a certain threshold y0: in nuclear

engineering, the temperatures and the neutron flows should not exceed the

critical values; a spaceship should land on the planet and not fly past

it, etc.


The only way to guarantee this is to have an interval $\mathbf{Y} = [\underline{Y}, \overline{Y}]$ for which $\mathbf{y} \subseteq \mathbf{Y}$ and $\overline{Y} \le y_0$. Such an interval is called an enclosure. Computing such an enclosure is one of the main tasks of interval computations.

Interval computations: a brief history. The origins of interval compu-

tations can be traced to the work of Archimedes from Ancient Greece who

used intervals to bound values like π; see, e.g.,1. Its modern revival was

boosted by three pioneers: Mieczyslaw Warmus (Poland), Teruo Sunaga

(Japan), and Ramon Moore (USA) in 1956–59. The first successful appli-

cation was taking interval uncertainty into account when planning space-

flights to the Moon. Since then, there were many successful applications: to

design of elementary particle colliders (Martin Berz, Kyoko Makino, USA),

to checking whether an asteroid will hit the Earth (M. Berz, R. Moore,

USA), to robotics (L. Jaulin, France; A. Neumaier, Austria), to chemical

engineering (Marc Stadtherr, USA), etc.4,9

Interval arithmetic: foundations of interval techniques. The prob-

lem is to compute the range
$$[\underline{y}, \overline{y}] = \{ f(x_1, \ldots, x_n) \mid x_1 \in [\underline{x}_1, \overline{x}_1], \ldots, x_n \in [\underline{x}_n, \overline{x}_n] \}.$$
For arithmetic operations $f(x_1, x_2)$ (and for elementary functions), we have explicit formulas for the range. For example, when $x_1 \in \mathbf{x}_1 = [\underline{x}_1, \overline{x}_1]$ and $x_2 \in \mathbf{x}_2 = [\underline{x}_2, \overline{x}_2]$, then:

• The range $\mathbf{x}_1 + \mathbf{x}_2$ for $x_1 + x_2$ is $[\underline{x}_1 + \underline{x}_2,\; \overline{x}_1 + \overline{x}_2]$.
• The range $\mathbf{x}_1 - \mathbf{x}_2$ for $x_1 - x_2$ is $[\underline{x}_1 - \overline{x}_2,\; \overline{x}_1 - \underline{x}_2]$.
• The range $\mathbf{x}_1 \cdot \mathbf{x}_2$ for $x_1 \cdot x_2$ is
$[\min(\underline{x}_1\underline{x}_2, \underline{x}_1\overline{x}_2, \overline{x}_1\underline{x}_2, \overline{x}_1\overline{x}_2),\; \max(\underline{x}_1\underline{x}_2, \underline{x}_1\overline{x}_2, \overline{x}_1\underline{x}_2, \overline{x}_1\overline{x}_2)]$.
• The range $1/\mathbf{x}_1$ for $1/x_1$ is $[1/\overline{x}_1,\; 1/\underline{x}_1]$ (if $0 \notin \mathbf{x}_1$).

Straightforward interval computations. In general, we can parse an

algorithm (i.e., represent it as a sequence of elementary operations) and

then perform the same operations, but with intervals instead of numbers.

For example, to compute f(x) = (x − 2) · (x + 2), the computer first

computes r1 := x− 2, then r2 := x+ 2, and r3 := r1 · r2. So, for estimating

the range of f(x) for x ∈ [1, 2], we compute r1 := [1, 2] − [2, 2] = [−1, 0],

r2 := [1, 2] + [2, 2] = [3, 4], and r3 := [−1, 0] · [3, 4] = [−4, 0].

Here, the actual range is f(x) = [−3, 0]. This example shows that we

need more efficient ways of computing an enclosure Y ⊇ y.
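A tiny Python interval class makes the "straightforward interval computations" example concrete; the class below is a sketch implementing only the operations used here, not a full interval library.

class Interval:
    """Closed interval [lo, hi] with the arithmetic rules listed above."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)
    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)
    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))
    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

# f(x) = (x - 2) * (x + 2) evaluated "straightforwardly" on x = [1, 2]:
x, two = Interval(1, 2), Interval(2, 2)
print((x - two) * (x + two))   # [-4, 0]: an enclosure of the true range [-3, 0]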


First idea: use of monotonicity. For arithmetic, we had exact ranges,

because +, −, · are monotonic in each variable, and monotonicity

helps: if $f(x_1, \ldots, x_n)$ is (non-strictly) increasing ($f \uparrow$) in each $x_i$, then $f(\mathbf{x}_1, \ldots, \mathbf{x}_n) = [f(\underline{x}_1, \ldots, \underline{x}_n),\; f(\overline{x}_1, \ldots, \overline{x}_n)]$. Similar formulas hold if $f \uparrow$ for some $x_i$ and $f \downarrow$ for other $x_j$.
It is known that $f \uparrow$ in $x_i$ if $\frac{\partial f}{\partial x_i} \ge 0$. So, to check monotonicity, we can check that the range $[\underline{r}_i, \overline{r}_i]$ of $\frac{\partial f}{\partial x_i}$ on $\mathbf{x}_i$ has $\underline{r}_i \ge 0$. Here, differentiation can be performed by available Automatic Differentiation (AD) tools, and estimating the ranges of $\frac{\partial f}{\partial x_i}$ can be done by using straightforward interval computations.

For example, for f(x) = (x − 2) · (x + 2), the derivative is 2x, so its

range on x = [1, 2] is [2, 4], with 2 ≥ 0. Thus, we get the exact range

f([1, 2]) = [f(1), f(2)] = [−3, 0].

Second idea: centered form. In the general non-monotonic case, we can

use the general version of linearization – the Intermediate Value Theorem,

according to which
$$f(x_1, \ldots, x_n) = f(\tilde{x}_1, \ldots, \tilde{x}_n) + \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(\chi) \cdot (x_i - \tilde{x}_i)$$
for some $\chi_i \in \mathbf{x}_i$. Because of this theorem, we can conclude that $f(x_1, \ldots, x_n) \in \mathbf{Y}$, where
$$\mathbf{Y} = \tilde{y} + \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(\mathbf{x}_1, \ldots, \mathbf{x}_n) \cdot [-\Delta_i, \Delta_i].$$

Here also, differentiation can be done by Automatic Differentiation (AD)

tools, and estimating the ranges of derivatives can be done, if appropriate,

by monotonicity, or else by straightforward interval computations, or also by

centered form (this will take more time but lead to more accurate results).

Third idea: bisection. It is known that the inaccuracy of the first order

approximation (like the ones we used) is $O(\Delta_i^2)$. So, when $\Delta_i$ is too large and the accuracy is low, we can split the corresponding interval in half (reducing the inaccuracy from $\Delta_i^2$ to $\Delta_i^2/4$), and then take the union of the resulting ranges.

For example, the function f(x) = x · (1 − x) is not monotonic for x ∈x = [0, 1]. So, we take x′ = [0, 0.5] and x′′ = [0.5, 1]; on the 1st subinterval,

the range of the derivative is 1 − 2 · x = 1 − 2 · [0, 0.5] = [0, 1], so f ↑


and f(x′) = [f(0), f(0.5)] = [0, 0.25]. On the 2nd subinterval, we have

1−2·x = 1−2·[0.5, 1] = [−1, 0], so f ↓ and f(x′′) = [f(1), f(0.5)] = [0, 0.25].

The resulting estimate is f(x′)∪f(x′′) = [0, 0.25], which is the exact range.

These ideas underlie efficient interval computations algorithms and soft-

ware packages.3,4,6,9
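The monotonicity and bisection ideas can be combined in a short Python sketch; it bounds the range of f over [lo, hi] using a user-supplied enclosure of the derivative, bisecting where the sign of the derivative is not constant (a toy illustration under these assumptions, not an optimised interval package).

def range_bounds(f, df_range, lo, hi, depth=8):
    """Enclosure of f over [lo, hi]: monotonicity when the derivative enclosure
    has constant sign, otherwise bisection; crude first-order bound at max depth."""
    dlo, dhi = df_range(lo, hi)
    if dlo >= 0:                       # f increasing on [lo, hi]
        return f(lo), f(hi)
    if dhi <= 0:                       # f decreasing on [lo, hi]
        return f(hi), f(lo)
    if depth == 0:                     # fallback: centered-form-style bound at the midpoint
        mid, half = 0.5 * (lo + hi), 0.5 * (hi - lo)
        bound = max(abs(dlo), abs(dhi)) * half
        return f(mid) - bound, f(mid) + bound
    mid = 0.5 * (lo + hi)
    l1, u1 = range_bounds(f, df_range, lo, mid, depth - 1)
    l2, u2 = range_bounds(f, df_range, mid, hi, depth - 1)
    return min(l1, l2), max(u1, u2)

# f(x) = x (1 - x); its derivative 1 - 2x on [lo, hi] has range [1 - 2 hi, 1 - 2 lo].
print(range_bounds(lambda x: x * (1 - x), lambda lo, hi: (1 - 2 * hi, 1 - 2 * lo), 0.0, 1.0))
# (0.0, 0.25): the exact range, as in the bisection example above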

5. Partial Information about Probabilities

Formulation of the problem. In the ideal case, we know the probability

distributions. In this case, in principle, we can find the distribution for

y = f(x1, . . . , xn) by using Monte-Carlo simulations.

In the previous section, we considered situations when we only know

an interval of possible values. In practice, in addition to the intervals, we

sometimes also have partial information about the probabilities. How can

we take this information into account?

How to represent partial information about probabilities. In gen-

eral, there are many ways to represent a probability distribution; it is de-

sirable to select a representation which is the most appropriate for the

corresponding practical problem. In most practical problems, the ultimate

objective is to make decisions. According to decision theory, a decision

maker should look for an alternative a that maximizes the expected utility

$E_x[u(x, a)] \to \max_a$.

When the utility function u(x) is smooth, we can expand it in Taylor

series u(x) = u(x0) + (x − x0) · u′(x0) + . . .; this shows that, to estimate

E[u], we must know moments. In this case, partial information means that

we only have interval bounds on moments. There are known algorithms for

processing such bounds; see, e.g.,10.

Another case is when we have a threshold-type utility function u(x): e.g.,

for a chemical plant, drastic penalties start if the pollution level exceeds a

certain threshold x0. In this case, to find the expected utility, we need to

know the values of the cdf F (x) = P (ξ ≤ x). Partial information means

that, for every x, we only have interval bounds $[\underline{F}(x), \overline{F}(x)]$ on the actual

(unknown) cdf; such bounds are known as a p-box. There are also known

algorithms for processing such boxes; see, e.g.,2,10.

Example of processing p-boxes. Suppose that we know p-boxes $[\underline{F}_1(x_1), \overline{F}_1(x_1)]$ and $[\underline{F}_2(x_2), \overline{F}_2(x_2)]$ for quantities $x_1$ and $x_2$, we do not have any information about the relation between $x_1$ and $x_2$, and we want to find the p-box $[\underline{F}(y), \overline{F}(y)]$ corresponding to $y = x_1 + x_2$.


It is known that for every two events A and B,

P (A ∨B) = P (A) + P (B)− P (A&B) ≤ P (A) + P (B).

In particular, P (¬A ∨ ¬B) ≤ P (¬A) + P (¬B). Here, P (¬A) = 1 − P (A),

P (¬B) = 1−P (B), and P (¬A∨¬B) = 1−P (A&B), thus, 1−P (A&B) ≤(1 − P (A)) + (1 − P (B)) and so, P (A&B) ≥ P (A) + P (B) − 1. We also

know that P (A&B) ≥ 0, hence P (A&B) ≥ max(P (A) +P (B)−1, 0). Let

us use this inequality to get the desired bounds for F (y).

If $\xi_1 \le x_1$ and $\xi_2 \le x_2$, then $\xi \stackrel{\rm def}{=} \xi_1 + \xi_2 \le x_1 + x_2$. Thus, if $x_1 + x_2 = y$, then $F(y) = P(\xi \le y) \ge P(\xi_1 \le x_1 \,\&\, \xi_2 \le x_2)$. Due to the above inequality, $P(\xi_1 \le x_1 \,\&\, \xi_2 \le x_2) \ge P(\xi_1 \le x_1) + P(\xi_2 \le x_2) - 1$. Here, $P(\xi_i \le x_i) \ge \underline{F}_i(x_i)$, so $F(y) \ge \underline{F}_1(x_1) + \underline{F}_2(x_2) - 1$. Thus, as the desired lower bound $\underline{F}(y)$, we can take the largest of the corresponding right-hand sides:
$$\underline{F}(y) = \max\Bigl(\max_{x_1, x_2:\, x_1 + x_2 = y} (\underline{F}_1(x_1) + \underline{F}_2(x_2) - 1),\; 0\Bigr), \quad \text{i.e.,}$$
$$\underline{F}(y) = \max\Bigl(\max_{x_1} (\underline{F}_1(x_1) + \underline{F}_2(y - x_1) - 1),\; 0\Bigr).$$

To find the upper bound for $F(y)$, let us find a similar lower bound for $1 - F(y) = P(\xi > y)$. If $x_1 + x_2 = y$, $\xi_1 > x_1$, and $\xi_2 > x_2$, then $\xi = \xi_1 + \xi_2 > y$. Here, $P(\xi_i > x_i) = 1 - P(\xi_i \le x_i) = 1 - F_i(x_i)$. Thus,
$$1 - F(y) = P(\xi > y) \ge P(\xi_1 > x_1 \,\&\, \xi_2 > x_2) \ge P(\xi_1 > x_1) + P(\xi_2 > x_2) - 1$$
$$= (1 - F_1(x_1)) + (1 - F_2(x_2)) - 1 = 1 - F_1(x_1) - F_2(x_2),$$
hence $F(y) \le F_1(x_1) + F_2(x_2)$. Since $F_i(x_i) \le \overline{F}_i(x_i)$, we have $F(y) \le \overline{F}_1(x_1) + \overline{F}_2(x_2)$. Thus, as the desired upper bound $\overline{F}(y)$, we can take the smallest of the corresponding right-hand sides:
$$\overline{F}(y) = \min\Bigl(\min_{x_1, x_2:\, x_1 + x_2 = y} (\overline{F}_1(x_1) + \overline{F}_2(x_2)),\; 1\Bigr), \quad \text{i.e.,}$$
$$\overline{F}(y) = \min\Bigl(\min_{x_1} (\overline{F}_1(x_1) + \overline{F}_2(y - x_1)),\; 1\Bigr).$$

Similar formulas can be derived for other elementary operations.
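
A rough numerical illustration of the two bounds just derived for y = x_1 + x_2 (a Python/NumPy sketch with our own function names, evaluating the max/min formulas on a finite grid; it is not an implementation from Refs. [2, 10]):

    import numpy as np

    def pbox_sum_bounds(y_grid, x_grid, F1_low, F1_up, F2_low, F2_up):
        # F*_low / F*_up are callables returning the lower / upper cdf bounds.
        # Lower bound: max( max_x (F1_low(x) + F2_low(y - x) - 1), 0 )
        # Upper bound: min( min_x (F1_up(x)  + F2_up(y - x)), 1 )
        F_low = np.empty_like(y_grid)
        F_up = np.empty_like(y_grid)
        for k, y in enumerate(y_grid):
            F_low[k] = max(np.max(F1_low(x_grid) + F2_low(y - x_grid) - 1.0), 0.0)
            F_up[k] = min(np.min(F1_up(x_grid) + F2_up(y - x_grid)), 1.0)
        return F_low, F_up

    # Example: x1 and x2 each known only up to a cdf band of half-width 0.05 around Uniform(0, 1).
    base = lambda x: np.clip(x, 0.0, 1.0)               # cdf of Uniform(0, 1)
    low = lambda x: np.clip(base(x) - 0.05, 0.0, 1.0)
    up = lambda x: np.clip(base(x) + 0.05, 0.0, 1.0)

    xs = np.linspace(-1.0, 2.0, 301)
    ys = np.linspace(-1.0, 3.0, 11)
    Flo, Fup = pbox_sum_bounds(ys, xs, low, up, low, up)
    print(np.round(Flo, 3))
    print(np.round(Fup, 3))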

How to represent p-boxes. Representing a p-box means representing two cdfs $\underline{F}(x)$ and $\overline{F}(x)$. For each cdf F(x), to represent all its values with accuracy 1/n, it is sufficient to store n − 1 quantiles x_1 < … < x_{n−1},


i.e., values x_i for which F(x_i) = i/n. These values divide the real line into segments [x_i, x_{i+1}], where x_0 := −∞ and x_{n+1} := +∞. Each real value x belongs to one of these segments [x_i, x_{i+1}], in which case, due to the monotonicity of F(x), we have F(x_i) = i/n ≤ F(x) ≤ (i + 1)/n = F(x_{i+1}), hence |F(x) − i/n| ≤ 1/n.
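
For instance, a cdf known through a large sample can be stored this way with a couple of NumPy calls (a purely illustrative sketch; the sample and the value of n are invented):

    import numpy as np

    rng = np.random.default_rng(0)
    sample = rng.normal(size=10_000)          # stand-in for whatever data defines the cdf

    n = 20
    # n - 1 quantiles x_1 < ... < x_{n-1} with F(x_i) ~ i/n represent the cdf with accuracy 1/n.
    quantiles = np.quantile(sample, np.arange(1, n) / n)
    print(quantiles)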

Need to go beyond p-boxes. In many practical situations, we need to maintain the value within a certain interval: e.g., the air conditioning must maintain the temperature within certain bounds, a spaceship must land within a certain region, etc. In such cases, the utility drastically drops if we are outside the interval; thus, the expected utility is proportional to the probability F(a, b) = P(ξ ∈ (a, b]) of being within the corresponding interval (a, b]. In such situations, partial information about probabilities means that, for every a and b, we only know an interval $[\underline{F}(a, b), \overline{F}(a, b)]$ containing the actual (unknown) value F(a, b).

When we know the exact cdf F(x), we can compute F(a, b) as F(b) − F(a). However, in the case of partial information, it is not sufficient to only know the p-box. For example, let us assume that x is uniformly distributed on some interval of known width ε > 0, but we do not know on which. In this case, as one can easily see, for every x, $\underline{F}(x) = 0$ and $\overline{F}(x) = 1$ – irrespective of ε. On the other hand, for any interval [a, b], we have $\overline{F}(a, b) = \min\bigl(\frac{b - a}{\varepsilon},\ 1\bigr)$. This bound clearly depends on ε and thus cannot be uniquely determined by the p-box values.

How to process this more general information. The good news is that we can process this more general information similarly to how we process p-boxes. Specifically, when $\xi_1 \in x_1 = (\underline{x}_1, \overline{x}_1]$ and $\xi_2 \in x_2 = (\underline{x}_2, \overline{x}_2]$, then $\xi = \xi_1 + \xi_2 \in x_1 + x_2 = (\underline{x}_1 + \underline{x}_2,\ \overline{x}_1 + \overline{x}_2]$. Thus, if $x_1 + x_2 \subseteq y = [\underline{y}, \overline{y}]$, we have

$\underline{F}(\underline{y}, \overline{y}) \ge P(\xi_1 \in x_1 \,\&\, \xi_2 \in x_2) \ge P(\xi_1 \in x_1) + P(\xi_2 \in x_2) - 1 \ge \underline{F}_1(x_1) + \underline{F}_2(x_2) - 1.$

So, as the desired lower bound $\underline{F}(\underline{y}, \overline{y})$, we can take the largest of the corresponding right-hand sides:

$\underline{F}(\underline{y}, \overline{y}) = \max\Bigl(\max_{x_1, x_2:\, x_1 + x_2 \subseteq y}\bigl(\underline{F}_1(x_1) + \underline{F}_2(x_2) - 1\bigr),\ 0\Bigr).$


This formula is very similar to the formula for p-boxes. The formula for the upper bound comes from the fact that $F(\underline{y}, \overline{y}) = F(\overline{y}) - F(\underline{y})$, and thus $F(\underline{y}, \overline{y}) \le \overline{F}(\overline{y}) - \underline{F}(\underline{y})$. We already know bounds for the values $\overline{F}(\overline{y})$ and $\underline{F}(\underline{y})$, so we can take their difference as the desired upper bound $\overline{F}(\underline{y}, \overline{y})$:

$\overline{F}(\underline{y}, \overline{y}) = \min\Bigl(\min_{x_1}\bigl(\overline{F}_1(x_1) + \overline{F}_2(\overline{y} - x_1)\bigr),\ 1\Bigr) - \max\Bigl(\max_{x_1}\bigl(\underline{F}_1(x_1) + \underline{F}_2(\underline{y} - x_1) - 1\bigr),\ 0\Bigr).$

Similar formulas can be obtained for other elementary operations.

How to represent this more general information. The not-so-good news is that representing such more general information is much more difficult than representing p-boxes.

Indeed, similarly to p-boxes, we would like to represent all the values $\underline{F}(a, b)$ and $\overline{F}(a, b)$ with a given accuracy 1/n, i.e., we would like to find values x_1 < … < x_N for which x_i ≤ a ≤ x_{i+1} and x_j ≤ b ≤ x_{j+1} imply $|\underline{F}(a, b) - \underline{F}(x_i, x_j)| \le 1/n$ and $|\overline{F}(a, b) - \overline{F}(x_i, x_j)| \le 1/n$.

For p-boxes, we could use N = n values x_i. Let us show that for the bounds on F(a, b), there is no upper bound on the number of values needed. Namely, we will show that in the above example, when ε → 0, the corresponding number of points N grows indefinitely: N → ∞. Indeed, when j = i, a = x_i, and b = x_{i+1}, then, due to F(x_i, x_i) = 0, the above condition means F(x_i, x_{i+1}) ≤ 1/n. Thus, we must have (x_{i+1} − x_i)/ε ≤ 1/n, i.e., x_{i+1} − x_i ≤ ε/n. Each next point x_{i+1} is this close to the previous one, so, e.g., on the unit interval [0, 1], we need at least N ≥ n/ε such points. When ε → 0, the number of such points indeed tends to infinity.

It is worth mentioning that we can have an upper bound on N if we know an upper bound d on the probability density ρ(x): in this case, F(a, b) ≤ (b − a)·d and thus, to get the desired accuracy 1/n, it is sufficient to have x_{i+1} − x_i = 1/(n·d). On an interval of width W, we thus need N = W/(x_{i+1} − x_i) = W·n·d points.


Acknowledgments

This work was supported in part by the National Science Foundation grants

HRD-0734825 and HRD-1242122 (Cyber-ShARE Center of Excellence) and

DUE-0926721. The author is greatly thankful to Scott Ferson, to Franco

Pavese, and to all the participants of the International Conference on Ad-

vanced Mathematical and Computational Tools in Metrology and Testing

AMTCM’2014 (St. Petersburg, Russia, September 9–12, 2014) for valuable

discussions.

References

1. Archimedes, On the measurement of the circle, In: T. L. Heath (ed.), The Works of Archimedes (Dover, New York, 1953).
2. S. Ferson et al., Constructing Probability Boxes and Dempster-Shafer Structures (Sandia Nat'l Labs, Report SAND2002-4015, 2003).
3. Interval computations website http://www.cs.utep.edu/interval-comp
4. L. Jaulin et al., Applied Interval Analysis (Springer, London, 2001).
5. E. T. Jaynes and G. L. Bretthorst, Probability Theory: The Logic of Science (Cambridge University Press, Cambridge, UK, 2003).
6. V. Kreinovich, Interval computations and interval-related statistical techniques, In: F. Pavese and A. B. Forbes (eds.), Data Modeling for Metrology and Testing in Measurement Science (Birkhauser-Springer, Boston, 2009), pp. 117–145.
7. V. Kreinovich and S. Ferson, A new Cauchy-based black-box technique for uncertainty in risk analysis, Reliability Engineering and Systems Safety 85(1–3), 267–279 (2004).
8. V. Kreinovich et al., Computational Complexity and Feasibility of Data Processing and Interval Computations (Kluwer, Dordrecht, 1997).
9. R. E. Moore, R. B. Kearfott, and M. J. Cloud, Introduction to Interval Analysis (SIAM Press, Philadelphia, Pennsylvania, 2009).
10. H. T. Nguyen et al., Computing Statistics under Interval and Fuzzy Uncertainty (Springer, Berlin, Heidelberg, 2012).
11. S. G. Rabinovich, Measurement Errors and Uncertainty: Theory and Practice (Springer, Berlin, 2005).


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 50–53)

CLASSIFICATION, MODELING AND QUANTIFICATION OF

HUMAN ERRORS IN CHEMICAL ANALYSIS

ILYA KUSELMAN

National Physical Laboratory of Israel, Givat Ram, Jerusalem 91904, Israel

E-mail: [email protected]

Classification, modeling and quantification of human errors in chemical analysis are

described. The classification includes commission errors (mistakes and violations) and

omission errors (lapses and slips) by different scenarios at different stages of the analysis.

A Swiss cheese model is used for characterization of the error interaction with a

laboratory quality system. A new technique for quantification of human errors in

chemical analysis, based on expert judgments, i.e. on the expert(s) knowledge and

experience, is discussed.

Keywords: Human errors; Classification; Modeling; Quantification; Analytical Chemistry

1. Introduction

Human activity is never free from errors: the majority of incidents and accidents

are caused by human errors. In chemical analysis, human errors may lead to

atypical test results, in particular out-of-specification test results that fall outside

established specifications in the pharmaceutical industry, or do not comply with

regulatory, legislation or specification limits in other industries and fields, such

as environmental and food analysis. Inside the limits, or in their absence (e.g., for an environmental object or a new material), errors may also lead to incorrect evaluation of the tested properties. Therefore, the study of human errors is necessary

in any field of analytical chemistry and required from any laboratory (lab)

seeking accreditation. Such a study consists of classification, modeling and

quantification of human errors [1].


2. Classification

The classification includes commission errors (knowledge-, rule- and skill-based

mistakes and routine, reasoned, reckless and malicious violations) and omission

errors (lapses and slips) by different scenarios at different stages of the analysis

[1]. There are active errors by a sampling inspector and/or an analyst/operator.

Errors due to poor lab design, a defect of the equipment, or a faulty management decision are latent errors [2].

3. Modeling

A Swiss cheese model is used for characterization of the errors interaction with

a lab quality system. This model considers the quality system components j = 1,

2, .., J as protective layers against human errors. For example, the system

components are: validation of the measurement/analytical method and

formulation of standard operation procedures (SOP); training of analysts and

proficiency testing; quality control using statistical charts and/or other means;

and supervision. Each such component has weak points, whereby errors are not

prevented, similar to holes in slices of the cheese. Coincidence of the holes in all

components of the lab quality system on the path of a human error is a defect of

the quality system, which does not allow prevention of an atypical result of the

analysis [1].

4. Quantification

4.1. A new technique

By this technique [3], the kinds of human error k = 1, 2, …, K and the steps of the analysis m = 1, 2, …, M in which an error may happen (locations of the error) form event scenarios i = 1, 2, …, I, where I = K × M. An expert may estimate the likelihood p_i of scenario i on the following scale: the likelihood of an unfeasible scenario – as p_i = 0, weak likelihood – as p_i = 1, medium – as p_i = 3, and strong (maximal) likelihood – as p_i = 9. The expert estimates/judgments on the severity of an error by scenario i, as the expected loss l_i of reliability of the analysis, are


performed with the same scale (0, 1, 3, 9). Estimates of the possible reduction r_ij of the likelihood and severity of human error scenario i, as a result of the error blocking by quality system layer j (degree of interaction), are made by the same expert(s) using this scale again. The interrelationship matrix of the r_ij has I rows and J columns, hence it contains I × J cells of estimate values. Blocking a human error according to scenario i by a quality system component j can be more effective in the presence of another component j′ (j′ ≠ j) because of their synergy: the synergy indicator Δ^(i)_jj′ equals 0 when this effect is absent and 1 when it is present. Estimates q_j of the importance of quality system component j in decreasing losses from human error are calculated as

q_j = \sum_{i=1}^{I} p_i\, l_i\, r_{ij}\, s_{ij},

where the synergy factor is

s_{ij} = 1 + \frac{1}{J-1}\sum_{j' \ne j} \Delta^{(i)}_{jj'}.

The technique allows transformation of the semi-intuitive expert judgments on human errors and on the laboratory quality system into the following quantitative scores expressed in %:

a) the likelihood score of human error in the analysis, P* = \frac{100\%}{9I}\sum_{i=1}^{I} p_i;

b) the severity (loss) score of human error for reliability of the analysis results, L* = \frac{100\%}{9I}\sum_{i=1}^{I} l_i;

c) the importance score of a component of the lab quality system, q_j* = 100\%\, q_j \Big/ \sum_{j=1}^{J} q_j;

and d) the effectiveness score of the quality system, as a whole, against human error, Eff* = 100\% \sum_{j=1}^{J} q_j \Big/ \Bigl(9 \sum_{j=1}^{J}\sum_{i=1}^{I} p_i\, l_i\, s_{ij}\Bigr).
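
As a numerical illustration, the sketch below (Python/NumPy, with hypothetical expert estimates invented for the example; it is not the software used in the cited studies) evaluates these scores from given p_i, l_i, r_ij and synergy indicators Δ^(i)_jj′, using the formulas as reconstructed above:

    import numpy as np

    def human_error_scores(p, l, r, delta):
        # p, l: (I,) likelihood/severity estimates on the 0-1-3-9 scale;
        # r: (I, J) reduction estimates; delta: (I, J, J) synergy indicators (0 or 1).
        I, J = r.shape
        # synergy factors s_ij = 1 + (sum over j' != j of delta^(i)_{jj'}) / (J - 1)
        s = 1.0 + (delta.sum(axis=2) - np.diagonal(delta, axis1=1, axis2=2)) / (J - 1)
        q = np.einsum("i,i,ij,ij->j", p, l, r, s)            # importance of component j
        P_star = 100.0 * p.sum() / (9.0 * I)                 # likelihood score, %
        L_star = 100.0 * l.sum() / (9.0 * I)                 # severity score, %
        q_star = 100.0 * q / q.sum()                         # importance scores, %
        eff_star = 100.0 * q.sum() / (9.0 * np.einsum("i,i,ij->", p, l, s))  # effectiveness, %
        return P_star, L_star, q_star, eff_star

    # Hypothetical judgments: I = 6 scenarios, J = 4 quality-system components.
    p = np.array([9, 3, 1, 3, 0, 9])
    l = np.array([3, 9, 1, 1, 3, 3])
    r = np.array([[9, 3, 1, 0],
                  [3, 3, 9, 1],
                  [1, 9, 3, 3],
                  [0, 1, 3, 9],
                  [3, 0, 1, 3],
                  [9, 9, 3, 1]])
    delta = np.zeros((6, 4, 4))
    delta[0, 0, 1] = delta[0, 1, 0] = 1.0    # components 1 and 2 act synergistically in scenario 1
    print(human_error_scores(p, l, r, delta))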

4.2. Further developments

Calculation of the score values q_j* allows evaluation of the quality system

components for all steps of the analysis together. The columns of the

interrelationship matrix are used for that: it is the "vertical vision" of the matrix.

However, an analyst may be interested to know which step m is less protected

from errors, with intent to improve it. To obtain this information the "horizontal

vision" of the interrelationship matrix (by the rows) is necessary. The scores

similar to q_j*, but related to the same error location, i.e., the same step m of the

analysis, are applicable for that.

Variability of the expert judgments and robustness of the quantification

parameters of human errors are also important. Any expert feels a natural doubt when choosing between close values on the proposed scale: 0 or 1? 1 or 3? 3 or 9?


One change of an expert judgment on the likelihood of scenario i from p_i = 0 to p_i = 1 (or vice versa) changes the likelihood score P* by 11.11 % when there is only one scenario, but only by 0.21 % when I = 54 scenarios, for example. The same is true for the severity score L*. Evaluation of the robustness of quality system

scores to variation of the expert judgments is more complicated and can be

based on Monte Carlo simulations.
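
Such a Monte Carlo check might look as follows (a schematic Python sketch with hypothetical judgments; it is not the procedure of the cited case studies): each judgment is randomly moved to an adjacent value of the 0-1-3-9 scale and the spread of the resulting score P* is examined.

    import numpy as np

    scale = np.array([0, 1, 3, 9])
    p = np.array([9, 3, 1, 3, 0, 9])                     # hypothetical likelihood judgments, I = 6

    def perturb(values, rng, prob=0.2):
        # With probability `prob`, move a judgment one step up or down the 0-1-3-9 scale.
        idx = np.searchsorted(scale, values)
        step = rng.choice([-1, 0, 1], size=values.shape, p=[prob / 2, 1 - prob, prob / 2])
        return scale[np.clip(idx + step, 0, len(scale) - 1)]

    rng = np.random.default_rng(2)
    P_star = [100.0 * perturb(p, rng).sum() / (9.0 * len(p)) for _ in range(10_000)]
    print(round(np.mean(P_star), 2), round(np.std(P_star), 2))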

Examples of the human error classification, modeling and quantification

using this technique are considered for pH measurements of groundwater [3],

multi-residue analysis of pesticides in fruits and vegetables [4], and ICP-MS of

geological samples [5].

Acknowledgements

This research was supported in part by the International Union of Pure and

Applied Chemistry (IUPAC Project 2012-021-1-500). The author would like to

thank the project team members Dr. Francesca Pennecchi (Istituto Nazionale di

Ricerca Metrologica, Italy), Dr. Aleš Fajgelj (International Atomic Energy

Agency, Austria), Dr. Stephen L.R. Ellison (Laboratory of Government Chemist

Ltd, UK) and Prof. Yury Karpov (State Research and Design Institute for Rare

Metal Industry, Russia) for useful discussions.

References

1. I. Kuselman, F. Pennecchi, A. Fajgelj, Y. Karpov. Human errors and reliability

of test results in analytical chemistry. Accred. Qual. Assur. 18, 3 (2013).

2. ISO/TS 22367. Medical laboratories – Reduction of error through risk

management and continual improvement (2008).

3. I. Kuselman, E. Kardash, E. Bashkansky, F. Pennecchi, S. L. R. Ellison, K.

Ginsbury, M. Epstein, A. Fajgelj, Y. Karpov. House-of-security approach to

measurement in analytical chemistry: quantification of human error using expert

judgments. Accred. Qual. Assur. 18, 459 (2013).

4. I. Kuselman, P. Goldshlag, F. Pennecchi. Scenarios of human errors and their

quantification in multi-residue analysis of pesticides in fruits and vegetables.

Accred. Qual. Assur. 19, online, DOI 10.1007/s00769-014-1071-6 (2014).

5. I. Kuselman, F. Pennecchi, M. Epstein, A. Fajgelj, S. L. R. Ellison. Monte Carlo

simulation of expert judgments on human errors in chemical analysis – a case

study of ICP-MS. Talanta 130C, 462 (2014).


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 54–65)

APPLICATION OF NONPARAMETRIC GOODNESS-OF-FIT

TESTS: PROBLEMS AND SOLUTION*

B. YU. LEMESHKO

Applied Mathematics Department, Novosibirsk State Technical University,

Novosibirsk, Russia

E-mail: [email protected]

www.ami.nstu.ru/~headrd/

In this paper, the problems of application of nonparametric goodness-of-fit tests in the case of composite hypotheses are considered. The factors influencing the test statistic distributions are discussed. A manual on the application of nonparametric tests has been prepared. The proposed recommendations will reduce errors in statistical inference when the considered tests are used in practice.

Keywords: Composite goodness-of-fit hypotheses; Anderson-Darling test; Cramer-von Mises-Smirnov test; Kolmogorov test; Kuiper test; Watson test; Zhang tests.

1. Introduction

In applications of statistical data analysis, there are a lot of examples of incorrect

usage of nonparametric goodness-of-fit tests (Kolmogorov, Cramer-von Mises-Smirnov, Anderson-Darling, Kuiper, Watson, Zhang tests). The most common

errors in testing composite hypotheses are associated with using classical results

obtained for simple hypotheses.

There are simple and composite goodness-of-fit hypotheses. A simple hypothesis has the form H_0: F(x) = F(x, θ), where F(x, θ) is the distribution function tested for goodness-of-fit with the observed sample, and θ is a known value of the parameter (scalar or vector). A composite hypothesis has the form H_0: F(x) ∈ {F(x, θ), θ ∈ Θ}, where Θ is the definition domain of the parameter θ. If the estimate of the scalar or vector parameter of the tested distribution was not found by using the sample on which the goodness-of-fit hypothesis is tested, then the application of a goodness-of-fit test for a composite hypothesis is similar to the application of the test in the case of a simple hypothesis.

* This work is supported by the Russian Ministry of Education and Science (project 2.541.2014K).


Problems arise in testing a composite hypothesis when the estimate of the distribution parameter is found by using the same sample on which the goodness-of-fit hypothesis is tested.

2. Goodness-of-fit tests for simple hypotheses

In the case of simple hypotheses, nonparametric tests are “distribution free”, i.e. the limiting distributions of the statistics of the classical nonparametric goodness-of-fit tests do not depend on the tested distribution and its parameters.

The Kolmogorov test (which is usually called the Kolmogorov–Smirnov test) is based on the statistic

D_n = \sup_{|x| < \infty} \left| F_n(x) - F(x, \theta) \right|,   (1)

where F_n(x) is the empirical distribution function, F(x, θ) is the hypothetical distribution function, and n is the sample size. The limiting statistic distribution for testing a simple hypothesis was obtained by Kolmogorov in Ref. [6]. The distribution function of \sqrt{n}\,D_n uniformly converges to the Kolmogorov distribution function K(S) as n → ∞, see Ref. [3, 11].

The Kolmogorov test is recommended to be used with Bolshev's correction, see Ref. [3, 11]:

S_K = \frac{6 n D_n + 1}{6\sqrt{n}} = \sqrt{n}\,D_n + \frac{1}{6\sqrt{n}},   (2)

where D_n = \max(D_n^{+}, D_n^{-}),

D_n^{+} = \max_{1 \le i \le n}\left\{\frac{i}{n} - F(x_i, \theta)\right\}, \qquad D_n^{-} = \max_{1 \le i \le n}\left\{F(x_i, \theta) - \frac{i - 1}{n}\right\},

and x_1 ≤ x_2 ≤ … ≤ x_n is the variational series (the sample sorted in increasing order).

The Cramer-von Mises-Smirnov test is based on the statistic

S_\omega = n\omega_n^{2} = \frac{1}{12n} + \sum_{i=1}^{n}\left(F(x_i, \theta) - \frac{2i - 1}{2n}\right)^{2},   (3)

which has the distribution a_1(s) when a simple hypothesis is tested, see Ref. [3, 11].

The statistic of the Anderson-Darling test has the form (Ref. [1, 2])

S_\Omega = -n - 2\sum_{i=1}^{n}\left\{\frac{2i - 1}{2n}\ln F(x_i, \theta) + \left(1 - \frac{2i - 1}{2n}\right)\ln\bigl(1 - F(x_i, \theta)\bigr)\right\}   (4)

and has the distribution a_2(s) for simple hypotheses, see Ref. [3, 11].

The Kuiper test is based on the statistic V_n = D_n^{+} + D_n^{-} (Ref. [7]). It is preferred to use it in the form (Ref. [25])


V = V_n\left(\sqrt{n} + 0.155 + \frac{0.24}{\sqrt{n}}\right),   (5)

or in the form (Ref. [8])

V_n^{mod} = \sqrt{n}\,\bigl(D_n^{+} + D_n^{-}\bigr) + \frac{1}{3\sqrt{n}}.   (6)

This statistic has the distribution Ku(s) = 1 - \sum_{m=1}^{\infty} 2\,(4 m^{2} s^{2} - 1)\, e^{-2 m^{2} s^{2}}, see Ref. [7].

The statistic of the Watson test has the form (Refs. [26, 27])

U_n^{2} = \sum_{i=1}^{n}\left(F(x_i, \theta) - \frac{2i - 1}{2n}\right)^{2} - n\left(\frac{1}{n}\sum_{i=1}^{n} F(x_i, \theta) - \frac{1}{2}\right)^{2} + \frac{1}{12n},   (7)

and has the distribution W(s) = 1 - 2\sum_{m=1}^{\infty} (-1)^{m-1} e^{-2 m^{2} \pi^{2} s} for simple hypotheses tested.

The statistics of the Zhang tests can be written as (Ref. [8])

Z_A = -\sum_{i=1}^{n}\left[\frac{\ln F(x_i, \theta)}{n - i + 1/2} + \frac{\ln\bigl(1 - F(x_i, \theta)\bigr)}{i - 1/2}\right],   (8)

Z_C = \sum_{i=1}^{n}\left[\ln\frac{F(x_i, \theta)^{-1} - 1}{(n - 1/2)/(i - 3/4) - 1}\right]^{2},   (9)

Z_K = \max_{1 \le i \le n}\left[\left(i - \frac{1}{2}\right)\ln\frac{i - 1/2}{n F(x_i, \theta)} + \left(n - i + \frac{1}{2}\right)\ln\frac{n - i + 1/2}{n\bigl(1 - F(x_i, \theta)\bigr)}\right].   (10)

The tests based on the statistics Z_A and Z_C have higher power in comparison with the Kolmogorov, Cramer-von Mises-Smirnov and Anderson-Darling tests.

However, the application of these powerful tests is complicated because of the

dependence of statistic distributions on the sample size.
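
For reference, the statistics above can be computed directly from their definitions; the following Python/NumPy sketch (ours, not the ISW software of Ref. [4]) evaluates S_K with Bolshev's correction, S_ω, S_Ω and Z_A for a fully specified (simple) hypothesis given as a callable cdf:

    import numpy as np
    from scipy import stats

    def gof_statistics(sample, cdf):
        # S_K (2), S_omega (3), S_Omega (4) and Z_A (8) for a simple hypothesis.
        x = np.sort(sample)
        n = len(x)
        i = np.arange(1, n + 1)
        F = cdf(x)
        D_plus = np.max(i / n - F)
        D_minus = np.max(F - (i - 1) / n)
        D_n = max(D_plus, D_minus)
        S_K = np.sqrt(n) * D_n + 1.0 / (6.0 * np.sqrt(n))
        S_omega = 1.0 / (12.0 * n) + np.sum((F - (2 * i - 1) / (2 * n)) ** 2)
        S_Omega = -n - 2.0 * np.sum((2 * i - 1) / (2 * n) * np.log(F)
                                    + (1 - (2 * i - 1) / (2 * n)) * np.log(1 - F))
        Z_A = -np.sum(np.log(F) / (n - i + 0.5) + np.log(1 - F) / (i - 0.5))
        return S_K, S_omega, S_Omega, Z_A

    sample = stats.norm.rvs(size=100, random_state=0)
    print(gof_statistics(sample, stats.norm.cdf))      # simple hypothesis: standard normal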

3. Problems of application of tests for composite hypotheses

In the case of testing composite hypotheses, all nonparametric goodness-of-fit tests lose the property of being distribution-free if the parameter estimation is based on the same sample on which the hypothesis is tested. The statistic distributions G(S|H_0) of these tests depend on:


− the distribution F(x, θ) corresponding to the hypothesis H_0 tested (see Fig. 1);

− the type and the number of parameters estimated;

− the estimation method used (see Fig. 2);

− in some cases, a particular value of parameter (for example, in the case

of gamma-distribution).

The statistic distributions for simple hypotheses and the distributions of the

same statistics for composite hypotheses are quite different. Therefore, it is

unacceptable to disregard this difference.

In Fig. 1, empirical distributions G_n(S|H_0) of the Cramer-von Mises-Smirnov statistic S_ω are presented for the case of testing a composite hypothesis H_0,

when the maximum likelihood method is used for estimation of two parameters.

The dependence of the test statistic distribution on the estimation method used is shown in Fig. 2, which presents the density functions g_n(S|H_0) of the Kolmogorov test statistic S_K for the following methods of estimating the parameters of the normal distribution: the methods based on minimizing the statistics S_K, S_ω, S_Ω (MD-estimates) and the maximum likelihood method.


Fig. 1. Distributions G_n(S|H_0) of the Cramer-von Mises-Smirnov test statistic S_ω in the case of estimation of two parameters of the distribution corresponding to H_0 (1 – normal, 2 – logistic, 3 – Laplace, 4 – extreme-value (minimum), 5 – Cauchy); the maximum likelihood method is used; a_1(s) is the distribution function for simple hypotheses tested.

Moreover, the greatest problem consists in the dependence of the test statistic distributions on the specific value of the distribution shape parameter. For example, in the case of the generalized normal distribution with density


f(x) = \frac{\theta_2}{2\,\theta_1\,\Gamma(1/\theta_2)}\exp\left\{-\left(\frac{|x - \theta_0|}{\theta_1}\right)^{\theta_2}\right\},   (11)

the value of the shape parameter θ_2 influences the statistic distributions of nonparametric goodness-of-fit tests. Such influence is illustrated in Fig. 3 in the case of estimating three parameters of (11) by the maximum likelihood method.

Fig. 2. Distribution densities g_n(S|H_0) of the statistic S_K in the case of testing a composite hypothesis (H_0 – normal distribution, two parameters estimated: 1 – MD-estimates S_K; 2 – MD-estimates S_ω; 3 – MD-estimates S_Ω; 4 – maximum likelihood method; k(s) – density of the Kolmogorov distribution).


Fig. 3. The dependence of the statistic distribution of the Kolmogorov test on the value of the shape parameter θ_2 (curves for θ_2 = 0.25, 0.75, 1, 1.6, 2, 4 and 7, together with the Kolmogorov distribution K(S_K)), when three parameters of the generalized normal distribution are estimated.


4. Application of tests for composite hypotheses: solution of problems

The investigation of limiting statistic distributions of nonparametric goodness-

of-fit tests for composite hypotheses was initiated in Ref. [5].

Various approaches have been used for solving problems in this area: the

limiting statistic distributions have been studied by analytical and numerical

methods. In particular cases, the tables of percentage points for the limiting

statistic distributions of nonparametric tests have been obtained by using

statistical simulation methods.

Apparently, the first papers in which the Monte Carlo method and computer simulation appeared to be an efficient tool for the development of applied mathematical statistics were Refs. [22, 23]. In these papers, the percentage points for the Kolmogorov test statistic (without Bolshev's correction) were obtained for testing composite hypotheses relative to the normal and exponential distribution laws.

In a number of our papers, analytically simple models approximating the limiting statistic distributions of nonparametric tests in the case of testing composite hypotheses, when unknown parameters are estimated by the maximum likelihood method, have been constructed by using computer simulation of the statistic distributions for various distributions corresponding to the hypothesis H_0. The recommendations for standardization R 50.1.037-2002 were published on the basis of these studies (Ref. [24]). Later, the results presented in Ref. [24] were made more precise and extended in Refs. [8-16]. At present, the manual (Ref. [17]) based on the obtained results has been prepared and is intended to replace the recommendations in Ref. [24].

The manual Ref. [17] includes the tables of percentage points and the

models of limiting statistic distributions of nonparametric tests (altogether 63

tables), which can be used for testing various composite hypotheses (on the

following distributions: exponential, seminormal, Rayleigh, Maxwell, Laplace,

normal, log-normal, Cauchy, logistic, extreme-value (minimum and maximum),

Sb-Johnson, Sl-Johnson, Su-Johnson, Weibull, generalized Weibull, family of

gamma-distribution, family of beta-distribution, generalized normal, inverse

Gaussian distribution). Moreover, the manual includes the description of

computer simulation techniques for research of probabilistic regularities,

which can be used for investigation of test statistic distributions.

The tables of percentage points and the models of the test statistic distributions were based on simulated samples of the statistics of size N = 10^6. The difference between the actual distribution G(S|H_0) and the empirical distribution G_N(S|H_0) does not exceed 10^{-3} for such N. The values of the test statistic were calculated using samples of pseudorandom values simulated for the


observed distribution F(x, θ) with size n = 10^3. In such a case, the distribution G_n(S|H_0) is practically equal to the limiting one G(S|H_0). The given models can be used for statistical analysis if the sample size n > 25.

Unfortunately, the dependence of the statistic distributions of the nonparametric goodness-of-fit tests for composite hypotheses on the values of the shape parameter (or parameters) (see Fig. 3) occurs for many parametric distributions used in the most interesting applications, particularly in problems of survival analysis and reliability. This is true for families of gamma-,

beta-distributions of the 1st, 2nd and 3rd kind, generalized normal, generalized

Weibull, inverse Gaussian distributions, and many others.

5. An interactive method to study distributions of statistics

In the cases when the statistic distributions of nonparametric tests depend on specific values of the shape parameter(s) of the tested distribution, the statistic distribution cannot be found in advance (before computing the corresponding estimates). In such situations, it is recommended to find the test statistic distribution by using an interactive mode in the statistical analysis process, see Ref. [18], and then to use this distribution for testing the composite hypothesis.

The dependence of the test statistics distributions on the values of the shape

parameter or parameters is the most serious difficulty that is faced while

applying nonparametric goodness-of-fit criteria to test composite hypotheses in

different applications.

Since the estimates of the parameters only become known during the analysis, the statistic distribution required to test the hypothesis cannot be obtained in advance. For the criteria with statistics (8)–(10), the problem is harder to solve because the statistic distributions also depend on the sample size. Therefore, the statistic distributions of the applied tests should be obtained interactively during the statistical analysis (see Refs. [19, 20]) and then used to make conclusions about the composite hypothesis under test.

The implementation of such an interactive mode requires developed software that allows parallelizing the simulation process and exploiting the available computing resources. The usage of parallel computing decreases the time needed to simulate the required test statistic distribution G_N(S_n|H_0) (with the required accuracy), which is used to calculate the achieved significance level P{S_n ≥ S*}, where S* is the value of the statistic calculated using the original sample.
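
Schematically, the interactive procedure can be sketched as follows (a simplified single-process Python/SciPy illustration of ours for the Kolmogorov statistic and a composite normal hypothesis with both parameters estimated by the maximum likelihood method; the actual implementation is the parallelized ISW system of Ref. [4]):

    import numpy as np
    from scipy import stats

    def S_K(sample, cdf):
        # Kolmogorov statistic with Bolshev's correction, Eq. (2).
        x = np.sort(sample)
        n = len(x)
        i = np.arange(1, n + 1)
        F = cdf(x)
        D_n = max(np.max(i / n - F), np.max(F - (i - 1) / n))
        return np.sqrt(n) * D_n + 1.0 / (6.0 * np.sqrt(n))

    def p_value_composite_normal(sample, N=10_000, seed=0):
        # Achieved significance level P{S_n >= S*} for the composite normal hypothesis,
        # simulating G_N(S_n|H_0) with the parameters re-estimated in every simulated sample.
        rng = np.random.default_rng(seed)
        n = len(sample)
        mu, sigma = np.mean(sample), np.std(sample)        # MLEs for the normal law
        S_star = S_K(sample, lambda x: stats.norm.cdf(x, mu, sigma))
        S_sim = np.empty(N)
        for k in range(N):
            z = rng.normal(mu, sigma, size=n)
            m, s = np.mean(z), np.std(z)                   # re-estimate on each simulated sample
            S_sim[k] = S_K(z, lambda x: stats.norm.cdf(x, m, s))
        return S_star, np.mean(S_sim >= S_star)

    data = stats.norm.rvs(loc=5.0, scale=2.0, size=100, random_state=1)
    print(p_value_composite_normal(data, N=2_000))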

In the software system (see Ref. [4]), the interactive method for the

research of statistics distributions is implemented for the following

nonparametric goodness-of-fit tests: Kolmogorov, Cramer-von Mises-Smirnov,


Anderson-Darling, Kuiper, Watson and three Zhang tests. Moreover, the

different methods of parameter estimation can be used there.

The following example demonstrates the accuracy of calculating the achieved significance level depending on the size N of the interactively simulated empirical statistic distributions (software system, Ref. [4]).

Example. It is necessary to check a composite hypothesis on goodness-of-fit of the inverse Gaussian distribution with the density function

f(x) = \frac{1}{\theta_2}\left[\frac{\theta_0}{2\pi\left(\frac{x - \theta_3}{\theta_2}\right)^{3}}\right]^{1/2}\exp\left\{-\frac{\theta_0\left(\frac{x - \theta_3}{\theta_2} - \theta_1\right)^{2}}{2\,\theta_1^{2}\,\frac{x - \theta_3}{\theta_2}}\right\}

on the basis of the following sample of size n = 100:

0.945 1.040 0.239 0.382 0.398 0.946 1.248 1.437 0.286 0.987

2.009 0.319 0.498 0.694 0.340 1.289 0.316 1.839 0.432 0.705

0.371 0.668 0.421 1.267 0.466 0.311 0.466 0.967 1.031 0.477

0.322 1.656 1.745 0.786 0.253 1.260 0.145 3.032 0.329 0.645

0.374 0.236 2.081 1.198 0.692 0.599 0.811 0.274 1.311 0.534

1.048 1.411 1.052 1.051 4.682 0.111 1.201 0.375 0.373 3.694

0.426 0.675 3.150 0.424 1.422 3.058 1.579 0.436 1.167 0.445

0.463 0.759 1.598 2.270 0.884 0.448 0.858 0.310 0.431 0.919

0.796 0.415 0.143 0.805 0.827 0.161 8.028 0.149 2.396 2.514

1.027 0.775 0.240 2.745 0.885 0.672 0.810 0.144 0.125 1.621

The shift parameter θ_3 is assumed to be known and equal to 0.

The shape parameters θ_0, θ_1 and the scale parameter θ_2 are estimated using the sample. The maximum likelihood estimates (MLEs) calculated using the sample above are θ_0 = 0.7481, θ_1 = 0.7808, θ_2 = 1.3202. The statistic distributions of the nonparametric goodness-of-fit tests depend on the values of the shape parameters θ_0 and θ_1 (see Ref. [21]), do not depend on the value of the scale parameter θ_2, and have to be calculated using the values θ_0 = 0.7481, θ_1 = 0.7808.

The calculated values of the statistics S_i* for the Kuiper, Watson, Zhang, Kolmogorov, Cramer-von Mises-Smirnov and Anderson-Darling tests, and the achieved significance levels P{S ≥ S_i* | H_0} (p-values) for these values, obtained with different accuracy of simulation (i.e., with different sizes N of the simulated samples of statistics), are given in Table 1.


Similar results for testing goodness-of-fit of the Г-distribution with the density

f(x) = \frac{\theta_1}{\theta_2\,\Gamma(\theta_0)}\left(\frac{x - \theta_3}{\theta_2}\right)^{\theta_0\theta_1 - 1}\exp\left\{-\left(\frac{x - \theta_3}{\theta_2}\right)^{\theta_1}\right\}

on the given sample are given in Table 2. The MLEs of the parameters are θ_0 = 2.4933, θ_1 = 0.6065, θ_2 = 0.1697, θ_3 = 0.10308. In this case, the distribution of the test statistic depends on the values of the shape parameters θ_0 and θ_1.

Table 1. The achieved significance levels for different sizes N when testing goodness-of-fit of the inverse Gaussian distribution

Test statistic value     N = 10^3   N = 10^4   N = 10^5   N = 10^6
V_n^mod = 1.1113         0.479      0.492      0.493      0.492
U_n^2   = 0.05200        0.467      0.479      0.483      0.482
Z_A     = 3.3043         0.661      0.681      0.679      0.678
Z_C     = 4.7975         0.751      0.776      0.777      0.776
Z_K     = 1.4164         0.263      0.278      0.272      0.270
S_K     = 0.5919         0.643      0.659      0.662      0.662
S_ω     = 0.05387        0.540      0.557      0.560      0.561
S_Ω     = 0.3514         0.529      0.549      0.548      0.547

Table 2. The achieved significance levels for different sizes N when testing goodness-of-fit of the Г-distribution

Test statistic value     N = 10^3   N = 10^4   N = 10^5   N = 10^6
V_n^mod = 1.14855        0.321      0.321      0.323      0.322
U_n^2   = 0.057777       0.271      0.265      0.267      0.269
Z_A     = 3.30999        0.235      0.245      0.240      0.240
Z_C     = 4.26688        0.512      0.557      0.559      0.559
Z_K     = 1.01942        0.336      0.347      0.345      0.344
S_K     = 0.60265        0.425      0.423      0.423      0.424
S_ω     = 0.05831        0.278      0.272      0.276      0.277
S_Ω     = 0.39234        0.234      0.238      0.238      0.237


Fig. 4 presents the empirical distribution and the two theoretical ones (IG-distribution and Г-distribution) obtained from the sample above while testing the composite hypotheses.

The results presented in Table 1 and Table 2 show that the estimates of the p-value obtained for the IG-distribution are higher than those for the Г-distribution, i.e. the IG-distribution fits the given sample better than the Г-distribution. Moreover, it is obvious that the number of simulated samples of statistics N = 10^4 is sufficient to obtain the estimates of the p-value with the accuracy desired in practice, and this does not lead to a noticeable increase of the time of statistical analysis.

Fig. 4. Empirical and theoretical distributions (IG-distribution and Г-distribution), calculated using

given sample

6. Conclusion

The prepared manual on the application of nonparametric goodness-of-fit tests (Ref. [17]) and the technique of interactive simulation of the test statistic distributions ensure the correctness of statistical inferences when testing composite and simple hypotheses.

References

1. T. W. Anderson, D. A. Darling. Asymptotic theory of certain “goodness of fit” criteria based on stochastic processes, Ann. Math. Statist., 23, 1952, pp. 193–212.


2. T. W. Anderson, D. A. Darling. A test of goodness of fit, J. Amer. Statist. Assoc., 49, 1954, pp. 765–769.

3. L.N. Bolshev, N.V. Smirnov. Tables of Mathematical Statistics. (Moscow:

Science, 1983).

4. ISW – Program system of the statistical analysis of one-dimensional

random variables. URL: http://ami.nstu.ru/~headrd/ISW.htm (address date

02.09.2014)

5. M. Kac, J. Kiefer, J. Wolfowitz. On tests of normality and other tests of

goodness of fit based on distance methods, Ann. Math. Stat., 26, 1955, pp.

189–211.

6. A. N. Kolmogoroff. Sulla determinazione empirica di una legge di distri-

buzione, G. Ist. Ital. attuar. 4(1), 1933, pp. 83–91.

7. N. H. Kuiper, Tests concerning random points on a circle, Proc. Konikl.

Nederl. Akad. Van Wettenschappen, Series A, 63, 1960, pp. 38-47.

8. B. Yu. Lemeshko, A. A. Gorbunova. Application and Power of the

Nonparametric Kuiper, Watson, and Zhang Tests of Goodness-of-Fit,

Measurement Techniques. 56(5), 2013, pp. 465-475.

9. B. Yu. Lemeshko, S. B. Lemeshko. Distribution models for nonparametric

tests for fit in verifying complicated hypotheses and maximum-likelihood

estimators. P. 1, Measurement Techniques. 52(6), 2009, pp. 555–565.

10. B. Yu. Lemeshko, S. B. Lemeshko. Models for statistical distributions in

nonparametric fitting tests on composite hypotheses based on maximum-

likelihood estimators. P. II, Measurement Techniques. 52(8), 2009, pp. 799–

812.

11. B. Yu. Lemeshko, S. B. Lemeshko, S. N. Postovalov. Statistic Distribution

Models for Some Nonparametric Goodness-of-Fit Tests in Testing

Composite Hypotheses, Communications in Statistics – Theory and

Methods, 39(3), 2010, pp. 460–471.

12. B. Yu. Lemeshko, S. B. Lemeshko, M. S. Nikulin, N. Saaidia. Modeling

statistic distributions for nonparametric goodness-of-fit criteria for testing

complex hypotheses with respect to the inverse Gaussian law, Automation

and Remote Control, 71(7), 2010, pp. 1358–1373.

13. B. Yu. Lemeshko, S.B. Lemeshko. Models of Statistic Distributions of

Nonparametric Goodness-of-Fit Tests in Composite Hypotheses Testing for

Double Exponential Law Cases, Communications in Statistics - Theory and

Methods, 40(16), 2011, pp. 2879-2892.

14. B. Yu. Lemeshko, S. B. Lemeshko. Construction of Statistic Distribution

Models for Nonparametric Goodness-of-Fit Tests in Testing Composite

Hypotheses: The Computer Approach, Quality Technology & Quantitative

Management, 8(4), 2011, pp. 359-373.

15. B. Yu. Lemeshko, A. A. Gorbunova. Application of nonparametric Kuiper

and Watson tests of goodness-of-fit for composite hypotheses,

Measurement Techniques, 56(9), 2013, pp. 965-973.


16. B. Yu. Lemeshko, A. A. Gorbunova, S. B. Lemeshko, A. P. Rogozhnikov.

Solving problems of using some nonparametric goodness-of-fit tests, Opto-

electronics, Instrumentation and Data Processing, 50(1), 2014, pp. 21-35.

17. B. Yu. Lemeshko. Nonparametric goodness-of-fit tests. Guide on the

application. – M.: INFRA–M, 2014. – 163 p. (in Russian)

18. B. Yu. Lemeshko, S. B. Lemeshko, A. P. Rogozhnikov. Interactive investi-

gation of statistical regularities in testing composite hypotheses of goodness

of fit, Statistical Models and Methods for Reliability and Survival Analysis :

monograph. – Wiley-ISTE, Chapter 5, 2013, pp. 61-76.

19. B. Yu. Lemeshko, S. B. Lemeshko, A. P. Rogozhnikov. Real-Time Study-

ing of Statistic Distributions of Non-Parametric Goodness-of-Fit Tests when

Testing Complex Hypotheses, Proceedings of the International Workshop

“Applied Methods of Statistical Analysis. Simulations and Statistical

Inference” – AMSA’2011, Novosibirsk, Russia, 20-22 September, 2011, pp.

19-27.

20. B. Yu. Lemeshko, A. A. Gorbunova, S. B. Lemeshko, A. P. Rogozhnikov.

Application of Nonparametric Goodness-of-fit tests for Composite Hypo-

theses in Case of Unknown Distributions of Statistics, Proceedings of the

International Workshop “Applied Methods of Statistical Analysis.

Applications in Survival Analysis, Reliability and Quality Control” –

AMSA’2013, Novosibirsk, Russia, 25-27 September, 2013, pp. 8-24.

21. B. Yu. Lemeshko, S. B. Lemeshko, M. S. Nikulin, N. Saaidia. Modeling

statistic distributions for nonparametric goodness-of-fit criteria for testing

complex hypotheses with respect to the inverse Gaussian law, Automation

and Remote Control, 71(7), 2010, pp. 1358-1373.

22. H. W. Lilliefors. On the Kolmogorov-Smirnov test for normality with mean

and variance unknown, J. Am. Statist. Assoc., 62, 1967, pp. 399–402.

23. H. W. Lilliefors. On the Kolmogorov-Smirnov test for the exponential

distribution with mean unknown, J. Am. Statist. Assoc., 64, 1969, pp. 387–

389.

24. R 50.1.037-2002. Recommendations for Standardization. Applied Statistics.

Rules of Check of Experimental and Theoretical Distribution of the

Consent. Part II. Nonparametric Goodness-of-Fit Test. Moscow: Publishing

House of the Standards, 2002. (in Russian)

25. M. A. Stephens. Use of Kolmogorov–Smirnov, Cramer – von Mises and

related statistics – without extensive table, J. R. Stat. Soc., 32, 1970, pp.

115–122.

26. G. S. Watson. Goodness-of-fit tests on a circle. I, Biometrika, 48(1-2),

1961. pp. 109-114.

27. G. S. Watson. Goodness-of-fit tests on a circle. II, Biometrika, 49(1-2),

1962, pp. 57- 63.


Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. 66–77)

DYNAMIC MEASUREMENTS BASED ON AUTOMATIC CONTROL THEORY APPROACH

A. L. SHESTAKOV South Ural State University (National Research University)

Chelyabinsk, Russian Federation E-mail: [email protected]

www.susu.ru

The paper deals with improvement of the accuracy of dynamic measurements based on the automatic control theory approach. A review of dynamic measuring system models developed by the author and his disciples is given. These models are based on the modal control method, the iterative principle of dynamic system synthesis, the observed state vector, the sliding mode control method, parametric adaptation and the neural network approach. Issues of dynamic measurement error evaluation are considered.

Keywords: Dynamic Measurement, Dynamic Measuring System, Dynamic Measurement Error Evaluation, Modal Control of Dynamic Behavior, Iterative Signal Recovery Approach, Observed State Vector, Sliding Mode Control, Adaptive Measuring System, Neural Network Approach.

1. Modal control of dynamic behavior method

The dynamic measurement error (DME) is determined by two main factors: the dynamic characteristics of a measuring system (MS) and the parameters of the measured signals. Requirements for improvement of the accuracy of dynamic measurements initiated the study of two approaches to DME correction: on the basis of the solution to convolution integral equations and its regularization [1–4], and with the use of the inverse Fourier [5] or Laplace [6] transformation. In the present paper, a third group of approaches to DME correction, based on automatic control theory methods, is proposed.

1.1. Measuring system with modal control of dynamic behavior

The analysis of MSs can be made in terms of the automatic control theory (as well as of the theory of automatic control system sensitivity [7, 8]), but the main structural difference between automatic control systems and MSs is that the latter have a primary measuring transducer (sensor) whose input is accessible neither for direct measurement nor for correction. Therefore, it is


impossible to close a feedback loop around the MS as a whole from the output to the input. This means that it is impossible to directly use modal control approaches or other methods of the automatic control theory in MSs. However, it is possible to offer special structures of correcting devices of MSs in which the idea of modal control can be implemented. MSs with the sensor model [9–11] are among such structures.

Let the transfer function (TF) of a sensor be generally represented as follows:

W_S(p) = \frac{\prod_{i=1}^{l}\left(T_{1i}^{2}p^{2} + 2\xi_{1i}T_{1i}p + 1\right)\prod_{j=1}^{q}\left(T_{1j}p + 1\right)}{\prod_{i=1}^{r}\left(T_{i}^{2}p^{2} + 2\xi_{i}T_{i}p + 1\right)\prod_{j=1}^{s}\left(T_{j}p + 1\right)},   (1)

where T_{1i}, T_i, T_{1j}, T_j are time constants and ξ_{1i}, ξ_i are damping coefficients. Its differential equation can be represented in the following way:

\left(p^{n} + a_{n-1}p^{n-1} + \dots + a_1 p + a_0\right) y = \left(b_m p^{m} + b_{m-1}p^{m-1} + \dots + b_1 p + b_0\right) u,   (2)

where y is the sensor output, u is the sensor input to be measured, a_0, a_1, …, a_{n-1}, b_0, b_1, …, b_m are constant coefficients (m ≤ n), and p = d/dt is the differentiation operator. Similarly, the sensor model, which is implemented as a real unit, is described by the equation

\left(p^{n} + a_{n-1}p^{n-1} + \dots + a_1 p + a_0\right) y_M = \left(b_m p^{m} + b_{m-1}p^{m-1} + \dots + b_1 p + b_0\right) u_M,   (3)

where y_M and u_M are the sensor model output and input signals, respectively. The differential equations of the sensor and its model are identical. Therefore, if

their outputs are close to each other, their inputs will differ little from one another. Hence, the sensor model input, which is accessible for observation, can be used to evaluate the sensor input, which is inaccessible for observation. This is the basic idea of applying the sensor model to the DME correction. To implement this idea, the system of the sensor and its model shown in Fig. 1 is formed. To achieve proximity of the signals, feedbacks with coefficients K_j (for j = 0, …, n−1) and an m-order filter with numerator coefficients d_i (for i = 0, …, m) and denominator coefficients b_i (for i = 0, …, m) are introduced. This structure of the MS is recognized as an invention [11]. The proposed MS is described by the following TF:

W_{MS}(p) = \frac{a_0 + K_0}{b_0 d_0}\cdot\frac{d_m b_m p^{m} + d_{m-1} b_{m-1} p^{m-1} + \dots + d_1 b_1 p + d_0 b_0}{p^{n} + (a_{n-1} + K_{n-1})p^{n-1} + \dots + (a_1 + K_1)p + a_0 + K_0}.   (4)


The last equation shows that, by changing the adjustable coefficients d_i (for i = 0, …, m) and K_j (for j = 0, …, n−1), it is possible to obtain any desired TF of the MS. The proposed method of synthesizing the MS with modal control of dynamic behavior in accordance with the required DME is as follows. The type and parameters of model measured signals, which are closest to the actual measured signal, are evaluated a priori. In accordance with the maximum permissible value of the DME, the zeros and poles of the MS are selected. Then the adjustable coefficients of the MS, which define the desired locations of the zeros and poles, are calculated.

Fig. 1. Block diagram of the dynamic measuring system.

1.2. Dynamic error evaluator based on sensor model

A properly designed MS recovers the sensor input with a smaller DME than the sensor output. The availability of the sensor model input and output allows evaluation of the DME of the sensor. Therefore, having available some additional signals of the measuring transducer, it is possible to evaluate the DME of the entire MS. The proposed method is the basis of the DME evaluator, which is recognized as an invention [12].

The input of the MS (see Fig. 1) is

u_{MS}(p) = W_{MS}(p)\,u(p),   (5)

the output of the sensor is

y(p) = W_S(p)\,u(p),   (6)


the output of the sensor model is

y_M(p) = W_S(p)\,u_{MS}(p).   (7)

The DME signal is formed as follows:

\varepsilon(p) = \frac{a_0 + K_0}{b_0 d_0}\,\bigl(y(p) - y_M(p)\bigr).   (8)

Taking into account equations (6) and (7) the following equation is obtained from the last one:

\varepsilon(p) = \frac{a_0 + K_0}{b_0 d_0}\,\bigl(W_S(p)u(p) - W_S(p)u_{MS}(p)\bigr) = \frac{a_0 + K_0}{b_0 d_0}\,W_S(p)\,\bigl(u(p) - u_{MS}(p)\bigr) = \frac{a_0 + K_0}{b_0 d_0}\,W_S(p)\,\Delta_{MS}(p),   (9)

where Δ_MS(p) = u(p) − u_MS(p) is the MS error. Thus, formation of the signal difference according to (9) gives an evaluation of the MS error, which differs from the true error in the same way as the sensor output differs from its input. This allows the evaluation to be corrected in the same way as the sensor signal [10, 13]. The block diagram of the DME evaluator is shown in Fig. 2.

Fig. 2. Block diagram of the dynamic measurement error evaluator.

The DME evaluator is described by the following TF:

W^{*}(p) = \frac{a_0 + K_0}{b_0 d_0}\cdot\frac{d_m b_m p^{m} + \dots + d_1 b_1 p + d_0 b_0}{p^{n} + (a_{n-1} + K_{n-1})p^{n-1} + \dots + (a_1 + K_1)p + a_0 + K_0}.   (10)


The TF of the DME evaluator (10) has the same form as the TF of the MS (4). The adjustable coefficients of the evaluator d_i (for i = 0, …, m) and K_j (for j = 0, …, n−1) affect the corresponding coefficients of its TF numerator and denominator in the same way. It should also be noted that the DME evaluator does not require the complete model of the sensor.

Block diagrams above reflect all significant links in the implementation of the dynamic MS. They can be considered as structural representations of differential equations, which must be numerically integrated in the implementation of the MS in the form of the sensor output processing program.

2. Iterative dynamic measuring system

The sensor model, which is described by the same differential equation as the sensor, allows the DME to be reduced within the MS structure (see Fig. 1). If we consider the sensor model not as a device distorting the signal, but as a device reproducing some input signal, it is possible to improve this reproduction by introducing additional channels with sensor models. The well-known iterative principle of automatic control system synthesis allows the development of systems of high dynamic accuracy. However, due to implementation difficulties, such systems are not widely used in control. In MSs, the idea of iterative channels can be implemented easily, namely in the form of additional data processing channels. The structure of an MS of dynamic parameters differs from that of automatic control systems; the main difference is the impossibility of introducing feedbacks and corrective signals directly to the MS input. However, the iterative signal recovery approach in this case allows the DME to be significantly reduced.

The block diagram of the proposed iterative MS is shown in Fig. 3. The idea of the DME correction in such a system is as follows. The sensor output y reproduces the sensor input u with a certain dynamic error. At the output of the sensor there is the sensor model. This model reproduces the signal y, which itself is dynamically distorted relative to the measured signal u. If the difference of signals y − y_M1 is fed to the second model input, a reproduction of this difference is obtained at the model output. The sum of signals y_2 = y_M1 + y_M2 reproduces the signal y more accurately. Hence, the sum of their inputs u_2 = u_M1 + u_M2 more accurately reflects the signal u, because the TFs of the model and of the sensor are identical. Then the difference of signals y − y_2 is fed to the third model input, so that the error of the reproduction of the signal y by the first two models is processed by the third model. The sum of the first three model outputs y_3 reproduces the sensor output y more accurately. Therefore, the sum of the first three model inputs u_3 = u_M1 + u_M2 + u_M3 more accurately reproduces the sensor input u.


Fig. 3. Block diagram of the iterative dynamic measuring system.

The iterative MS for an arbitrary number N of models is described by the following TF [10, 14]:

W_N(p) = 1 - \bigl(1 - W_S(p)\bigr)^{N}.   (11)

It should be noted that iterative MSs have high noise immunity.
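
A minimal simulation sketch (Python/SciPy, ours; the sensor, the test signal and all names are assumed for the illustration) of this iterative recovery for a hypothetical first-order sensor: by the description above, each new channel input equals the previous channel input minus the previous model output, and the sum of the channel inputs estimates u, which corresponds to TF (11).

    import numpy as np
    from scipy import signal

    T = 0.1
    sensor = signal.lti([1.0], [T, 1.0])       # hypothetical first-order sensor W_S(s) = 1/(T s + 1)

    t = np.linspace(0.0, 2.0, 2001)
    u = np.sin(2.0 * np.pi * 2.0 * t)          # measured signal assumed for the illustration
    _, y, _ = signal.lsim(sensor, u, t)        # dynamically distorted sensor output

    N = 4                                      # number of iterative channels (sensor models)
    u_hat = np.zeros_like(t)
    channel_input = y.copy()                   # the first model is driven by the sensor output y
    for _ in range(N):
        u_hat += channel_input                               # accumulate the model inputs u_M1 + u_M2 + ...
        _, y_model, _ = signal.lsim(sensor, channel_input, t)  # model output for this channel
        channel_input = channel_input - y_model                # input of the next channel

    print("max error of y     :", np.max(np.abs(y - u)))
    print("max error of u_hat :", np.max(np.abs(u_hat - u)))

With four channels the residual dynamic error of u_hat is much smaller than that of the raw sensor output, illustrating the factor (1 − W_S)^N in (11).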

3. Dynamic measuring system with observed state vector

A dynamic model of a linear system with constant parameters is considered. Let u(t) be the input r-vector, y(t) the output l-vector (r, l ≤ n) and x(t) the state n-vector (here n is the dimension of the state space). Then its state space model is described by the following system of linear differential equations in vector-matrix form:

\dot{x}(t) = A\,x(t) + B\,u(t), \qquad y(t) = C\,x(t) + D\,u(t),   (12)


where A, B, C, D are matrices of sizes n×n, n×r, l×n, l×r, respectively; A is the system matrix, B is the control matrix, C is the output matrix and D is the feed-forward matrix. All these matrices are constant.

The block diagram of the primary measuring transducer with observed state vector [10, 15] is shown in Fig. 4. The coefficients c_i indicate the possibility of measuring the state vector coordinates (if c_i = 1, the measurement of coordinate i is possible, and if c_i = 0, the measurement is impossible). The outputs of the sensor with the observed state vector are

y_1(t) = b_0 x_1(t) + b_1 x_2(t) + \dots + b_m x_{m+1}(t),
y_2(t) = c_1 b_1 x_2(t),
\dots
y_{m+1}(t) = c_m b_m x_{m+1}(t).   (13)

If m = n and c_i = 1 for i from 1 to m, then the outputs y_1(t), y_2(t), …, y_{m+1}(t) in (13) determine the complete state vector of the sensor. Otherwise, some coordinates of the state vector should be measured.

Fig. 4. Block diagram of the primary measuring transducer with observed state vector.

The possibility of measuring the state vector coordinates allows the design of various block diagrams of the MS, with a flexible choice of its form according to the actual measurement situation. On the basis of the primary measuring transducer with observed state vector, various block diagrams of the MS were examined [10, 15]. An algorithm for optimal adjustment of the MS parameters was proposed [10, 15].


4. Dynamic measuring system with sliding mode control

To ensure the proximity of the sensor model output to the sensor output, feedbacks are introduced in the MS with modal control of dynamic behavior. It is also possible to achieve the proximity of these signals by implementing sliding mode control. The block diagram of the MS with sliding mode control [10, 16] is shown in Fig. 5. In this diagram, a nonlinear unit (relay) is introduced to launch the sliding mode. The gain factor K, which affects both the amplitude of the relay output signal and the switching frequency of the relay, is introduced after the nonlinear unit.

Oscillations in the closed-loop nonlinear MS with sliding mode control were examined. The MS with the sensor model in the form of serial dynamic units was also proposed to ensure sliding mode stability.

Fig. 5. Block diagram of the dynamic measuring system with sliding mode control.

5. Adaptive measuring system

On the basis of the MS with modal control of dynamic behavior it is possible to design an MS adaptive to the minimum of the DME [10, 17]. The MS with modal control of dynamic behavior and with adaptation of its TF coefficients by


direct search was investigated. A DME evaluation method in the presence of a priori information about the characteristics of the measured signal and the noise of the sensor was proposed. The dynamic model of the MS with adaptation of its adjustable coefficients to the minimum of the DME in real time (see Fig. 6) was examined. This adaptation was implemented on the basis of the DME evaluator output (see Fig. 7). The coefficients k_i in the diagrams below are obtained as the solutions of certain differential equations [17].

Fig. 6. Block diagram of the adaptive measuring system.

Fig. 7. Block diagram of the dynamic measurement error evaluator.


6. Neural network dynamic measuring system

The application of neural networks is one of the approaches to the development of intelligent MSs [10, 18]. A neural network dynamic model of the sensor and a training algorithm for determining its dynamic parameters were considered. A neural network dynamic model of the MS with the sensor inverse model (see Fig. 8) and the algorithm for its training by the criterion of minimum mean-squared DME were examined. On this basis, the MS in the form of serial sections of the first and second order, as well as in the form of a correcting filter with a special structure and identical serial first-order sections to ensure the MS stability, were proposed. Neural network dynamic models of the MS in the presence of noise at the sensor output were investigated.

Fig. 8. Block diagram of the neural network inverse sensor model.


Acknowledgments

Author is grateful to his disciples for participation in the research and development of proposed approaches: D. Yu. Iosifov (section 3), O. L. Ibryaeva (section 3), M. N. Bizyaev (section 4), E. V. Yurasova (section 5) and A. S. Volosnikov (section 6).

References

1. G. N. Solopchenko, “Nekorrektnye zadachi izmeritel'noy tekhniki [Ill-conditioned Problems of Measuring Engineering]”, Izmeritel'naya tekhnika [Measuring Engineering], no. 1 (1974): 51-54.

2. A. I. Tikhonov, V. Ya. Arsenin, Metody resheniya nekorrektnykh zadach [Methods of Solution to Ill-conditioned Problems] (Moscow: Nauka, 1979).

3. V. A. Granovskiy, Dinamicheskie izmereniya: Osnovy metrologicheskogo obespecheniya [Dynamic Measurements: Fundamentals of Metrological Support] (Leningrad: Energoatomizdat, 1984).

4. E. Layer, W. Gawedzki, “Theoretical Principles for Dynamic Errors Measurement”, Measurement 8, no. 1 (1990): 45-48.

5. G. N. Vasilenko, Teoriya vosstanovleniya signalov: O reduktsii k ideal'nomu priboru v fizike i tekhnike [The theory of Signals Recovery: about Reduction to Ideal Instrument in Physics and Engineering] (Moscow: Sovetskoe Radio, 1979).

6. S. Dyer, “Inverse Laplace Transformation of Rational Functions. Part I”, IEEE. Instrumentation and Measurement Magazine 5, no. 4 (2006): 13-15.

7. E. N. Rosenwasser, R. M. Yusupov, Sensitivity of Automatic Control Systems (CRC Press, 2000).

8. M. Eslami, Theory of Sensitivity in Dynamic Systems: An Introduction (Springer-Verlag, 1994).

9. A. L. Shestakov, “Dynamic Error Correction Method”, IEEE Transactions on Instrumentation and Measurement 45, no. 1 (1996): 250-255.

10. A. L. Shestakov, Metody teorii avtomaticheskogo upravleniya v dinamicheskikh izmereniyakh: monografiya [Theory Approach of Automatic Control in Dynamic Measurements: Monograph] (Chelyabinsk: Izd-vo Yuzhno-Ural'skogo gosudarstvennogo universiteta, 2013).

11. A. L. Shestakov, “A. s. 1571514 SSSR. Izmeritel'nyy preobrazovatel' dinamicheskikh parametrov [Author's Certificate 1571514 USSR. Measuring Transducer of Dynamic Parameters]”, Otkrytiya, izobreteniya [Discoveries and Inventions], no. 22 (1990): 192.


12. V. A. Gamiy, V. A. Koshcheev and A. L. Shestakov, “A. s. 1673990 SSSR. Izmeritel'nyy preobrazovatel' dinamicheskikh parametrov [Author's Certificate 1673990 USSR. Measuring Transducer of Dynamic Parameters]”, Otkrytiya, izobreteniya [Discoveries and Inventions], no. 12 (1991): 191.

13. A. L. Shestakov, “Modal'nyy sintez izmeritel'nogo preobrazovatelya [Modal Synthesis of Measuring Transducer]”, Izv. RAN. Teoriya i sistemy upravleniya [Proceedings of the RAS. Theory and Control Systems], no. 4 (1995): 67-75.

14. A. L. Shestakov, “Izmeritel'nyy preobrazovatel' dinamicheskikh parametrov s iteratsionnym printsipom vosstanovleniya signala [Measuring Transducer of Dynamic Parameters with Iterative Approach to Signal Recovery]”, Pribory i sistemy upravleniya [Instruments and Control Systems], no. 10 (1992): 23-24.

15. A. L. Shestakov, O. L. Ibryaeva and D. Yu. Iosifov, “Reshenie obratnoy zadachi dinamiki izmereniy s ispol'zovaniem vektora sostoyaniya pervichnogo izmeritel'nogo preobrazovatelya [Solution to the Inverse Dynamic Measurement Problem by Using of Measuring Transducer State Vector]”, Avtometriya [Autometering] 48, no. 5 (2012): 74-81.

16. A. L. Shestakov and M. N. Bizyaev, “Vosstanovlenie dinamicheski iskazhennykh signalov ispytatel'no-izmeritel'nykh sistem metodom skol'zyashchikh rezhimov [Dynamically Distorted Signals Recovery of Testing Measuring Systems by Sliding Mode Control Approach]”, Izv. RAN. Energetika [Proceedings of RAS. Energetics], no. 6 (2004): 119-130.

17. A. L. Shestakov and E. A. Soldatkina, “Algoritm adaptatsii parametra izmeritel'noy sistemy po kriteriyu minimuma dinamicheskoy pogreshnosti [Adaptation Algorithm of Measuring System Parameters by Criterion of Dynamic Error Minimum]”, Vestnik Yuzhno-Ural'skogo gosudarstvennogo universiteta. Seriya “Komp'yuternye tekhnologii, upravlenie, radioelektronika” [Bulletin of the South Ural State University. Series “Computer Technologies, Automatic Control & Radioelectronics”], iss. 1, no. 9 (2001): 33-40.

18. A. S. Volosnikov and A. L. Shestakov, “Neyrosetevaya dinamicheskaya model' izmeritel'noy sistemy s fil'tratsiey vosstanavlivaemogo signala [Neural Network Dynamic Model of Measuring System with Recovered Signal Filtration]”, Vestnik Yuzhno-Ural'skogo gosudarstvennogo universiteta. Seriya “Komp'yuternye tekhnologii, upravlenie, radioelektronika” [Bulletin of the South Ural State University. Series “Computer Technologies, Automatic Control & Radioelectronics”], iss. 4, no. 14 (2006): 16-20.


MODELS FOR THE TREATMENT OF APPARENTLY INCONSISTENT DATA

R. WILLINK

Wellington, New Zealand

E-mail: [email protected]

Frequently the results of measurements of a single quantity are found to be mutually inconsistent under the usual model of the data-generating process. Unless this model is adjusted, it becomes impossible to obtain a defensible estimate of the quantity without discarding some of the data. However, taking that step seems arbitrary and can appear unfair when each datum is supplied by a different laboratory. Therefore, we consider various models that do not involve discarding any data. Consider a set of measurement results from n independent measurements with stated standard uncertainties. The usual model takes the standard uncertainties to be the standard deviations of the distributions from which the measurement results are drawn. One simple alternative involves supposing there is an unknown extra variance common to each laboratory. A more complicated model has the extra variance differing for each laboratory. A further complication is to allow the extra variance to be present with an unknown probability different for each laboratory. Maximum-likelihood estimates of the measured quantity can be obtained with all these models, even though the last two models have more unknown parameters than there are data. Simulation results support the use of the model with the single unknown variance.

Keywords: Combination of data; Random effects; Inconsistent data.

1. Introduction

Frequently data put forward as the results of measurements of a single quantity appear mutually inconsistent. Unless the model of the data-generating process is adjusted, it becomes impossible to obtain a defensible estimate of that quantity without discarding some of the data. However, discarding data can appear unfair and arbitrary, especially when each datum is supplied by a different laboratory. Therefore, in this paper we consider alternative models for the generation of the data.

Let θ denote the fixed unknown quantity measured. The information at hand is a set of measurement results x1, . . . , xn and standard uncertainties u1, . . . , un from n independent measurements of θ. The usual model for the generation of the data takes xi to be drawn from the normal distribution with mean θ and variance ui². This model can be written as
\[
x_i \leftarrow N(\theta,\, u_i^2), \qquad i = 1, \ldots, n. \tag{1}
\]
Sometimes it will be apparent that this model cannot properly describe the spread in the data. If we are to continue to use the data without down-weighting or removing any of them then another model must be proposed. Any model should be a realistic representation of the system studied, and the relevant system here is the data-generating process. So if we are to propose an alternative model then it should be realistic as a description of how the xi data arise. Also, it should be amenable to analysis, lead to a meaningful estimate of θ and, arguably, should contain (1) as a special case.

One useful possibility is the model
\[
x_i \leftarrow N(\theta,\, u_i^2 + \sigma^2), \qquad i = 1, \ldots, n, \tag{2}
\]
where σ² is an unknown nuisance parameter.¹ This model involves the ideas that (i) the measurement procedure in each laboratory incurred an additional error not accounted for in the uncertainty calculations and (ii) the sizes of the additional errors in the n measurements can be regarded as a random sample from the normal distribution with mean 0 and unknown variance σ². (It is a special case of a standard ‘random effects’ model as discussed by Vangel and Rukhin,² where the ui² variances are not known but are estimated from data and where concepts of ‘degrees of freedom’ apply.) The merit of model (2) is its simplicity and its additive nature, which is realistic. It might be criticised for the implication that every laboratory has failed to properly assess some source of error (even if the estimates of the extra errors turn out to be small). One subject of this paper is a generalization of (2) constructed to address this criticism.

Section 2 gives more details of models (1) and (2) and the ways in which they are fitted to the data by the principle of maximum-likelihood. Section 3 describes an extension to (2) and its solution by maximum-likelihood, and Section 4 describes a more complicated model that turns out to have the same solution. Sections 5 and 6 present examples of the results obtained with the models, and Section 7 uses simulation to examine the abilities of the models to give accurate estimates of quantities measured.

A different alternative to (1), which has been proposed in Bayesian analyses, is xi ← N(θ, κui²) for i = 1, . . . , n.³,⁴ If this model is interpreted as describing the data-generating process then its implication is that every laboratory has erred by a common factor κ in assessing the overall error variance. Given that the assessments of variance are made for individual components of error, added together, and assessed independently at different laboratories, such a model seems highly unrealistic. Also, it can be inferred that, unless some values for κ are favoured over others a priori, multiplying every submitted standard uncertainty ui by a constant would produce no change in the estimate of θ or in the standard uncertainty of this estimate.⁵ This does not seem reasonable.

2. The standard models

The total error in a measurement result xi can be seen as the sum of a component whose scale is accurately ‘counted’ in the uncertainty budget, e_c,i, and a component whose scale is not properly assessed, e_nc,i. The laboratory will see the standard uncertainty u(xi) as the parent standard deviation of e_c,i, i.e. the standard deviation of the distribution from which e_c,i arose. Also, the laboratory claims that e_nc,i does not exist, which is functionally equivalent to claiming that e_nc,i = 0. If we accept this claim then we are led to adopt (1), which we shall call Model I.

2.1. Model I

Suppose that, for each i, we accept the claim that e_nc,i is zero. For laboratory i we obtain the model xi ← (θ, ui²), which indicates that xi was drawn from some distribution with mean θ and variance ui². The weighted-least-squares estimate of θ and the minimum-variance unbiased linear estimate of θ under this model are both given by
\[
\hat{\theta} = \frac{\sum_{i=1}^{n} x_i/u_i^2}{\sum_{i=1}^{n} 1/u_i^2}. \tag{3}
\]
The distribution from which θ̂ is drawn has mean θ and variance (Σ_{i=1}^{n} 1/ui²)⁻¹, so the corresponding standard uncertainty is
\[
u(\hat{\theta}) = \sqrt{\frac{1}{\sum_{i=1}^{n} 1/u_i^2}}. \tag{4}
\]
If the parent distributions of the xi values are treated as being normal then the complete model is (1), in which case θ̂ in (3) is also the maximum-likelihood estimate, as can be found by maximising the likelihood function
\[
L(\theta) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,u_i} \exp\!\left\{ \frac{-(x_i-\theta)^2}{2u_i^2} \right\}.
\]
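As a short illustration of the Model I computation in equations (3) and (4), the sketch below assumes only that numpy is available; it is not part of the original paper.

```python
import numpy as np

def model_I(x, u):
    """Weighted-least-squares estimate of theta and its standard uncertainty."""
    x, u = np.asarray(x, float), np.asarray(u, float)
    w = 1.0 / u**2
    theta_hat = np.sum(w * x) / np.sum(w)    # equation (3)
    u_theta = np.sqrt(1.0 / np.sum(w))       # equation (4)
    return theta_hat, u_theta
```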


Henceforth, we assume that the distributions are sufficiently close to normal for this step to be taken. We will also use the principle of maximum-likelihood exclusively in fitting a model for the estimation of θ.

2.2. Model II

If, by some principle, Model I is deemed to be inconsistent with the data then we must conclude that either (i) the xi values were not drawn independently, (ii) the distributions are not well modelled as being normal or (iii) one or more of the e_nc,i errors are non-zero. One unprejudiced modification of the model based on the third of these possibilities involves the idea that each e_nc,i was drawn from a normal distribution with mean zero and unknown variance σ². The model then becomes (2). This assumption of a single distribution for the extra errors does not mean that each laboratory incurs an extra error of the same size. Rather it means that there will be extra errors of different sizes for different laboratories, as would be expected in practice, and that the underlying effects can be modelled as being normally distributed across the hypothetical population of laboratories. The spread of values of these extra errors can reflect the spread of resources and expertise in the laboratories. Indeed, the implied values of e_nc,i for a large subset of laboratories whose results are consistent under model (1) will be negligible under model (2). In an inter-laboratory comparison involving the circulation of an artefact for measurement among many laboratories, the extra variance σ² could be seen as an effect of artefact instability.

Let us refer to (2) as Model II. The corresponding likelihood function is
\[
L(\theta, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi(u_i^2+\sigma^2)}} \exp\!\left\{ \frac{-(x_i-\theta)^2}{2(u_i^2+\sigma^2)} \right\}.
\]
Fitting the model by the principle of maximum-likelihood means finding the values of the unknown parameters θ and σ² that maximise this function. This means maximising the logarithm of the likelihood, which is equivalent to minimising the quantity
\[
\sum_{i=1}^{n} \left[ \log(u_i^2+\sigma^2) + \frac{(x_i-\theta)^2}{u_i^2+\sigma^2} \right]. \tag{5}
\]
The fitted parameter values θ̂ and σ̂² are those for which the partial derivatives of (5) are zero. Differentiating with respect to θ and setting the result to zero gives
\[
\sum_{i=1}^{n} \frac{x_i-\hat{\theta}}{u_i^2+\sigma^2} = 0,
\]


which implies that at the point of maximum-likelihood
\[
\hat{\theta} = \frac{\sum_{i=1}^{n} x_i/(u_i^2+\sigma^2)}{\sum_{i=1}^{n} 1/(u_i^2+\sigma^2)}.
\]
This expression for θ̂ is substituted into (5), and we find that σ̂² is the value minimising
\[
Q(\sigma^2) = \sum_{i=1}^{n} \left[ \log(u_i^2+\sigma^2) + \frac{\left( x_i - \dfrac{\sum_{j=1}^{n} x_j/(u_j^2+\sigma^2)}{\sum_{j=1}^{n} 1/(u_j^2+\sigma^2)} \right)^{\!2}}{u_i^2+\sigma^2} \right].
\]
This is found by searching between zero and some upper bound, say (x_max − x_min)², where x_max and x_min are the largest and smallest values of x1, . . . , xn. Finally the estimate θ̂ is given by
\[
\hat{\theta} = \frac{\sum_{i=1}^{n} x_i/(u_i^2+\hat{\sigma}^2)}{\sum_{i=1}^{n} 1/(u_i^2+\hat{\sigma}^2)}. \tag{6}
\]

It is clear from symmetry that θ̂ is an unbiased estimate of θ under this model. So a suitable estimate of the parent standard deviation of θ̂ can act as the standard uncertainty of θ̂. One possibility is
\[
u(\hat{\theta}) = \sqrt{\frac{1}{\sum_{i=1}^{n} 1/(u_i^2+\hat{\sigma}^2)}}. \tag{7}
\]
If σ̂² were equal to σ² then this figure describes the smallest possible standard deviation of any unbiased linear estimator of θ. So, in practice, u(θ̂) in (7) might tend to be smaller than the parent standard deviation of θ̂. Another possibility is the standard deviation of the ML estimator of θ that would apply if θ and σ² were equal to θ̂ and σ̂², which is a figure that can be found by simulation. We generate a set of simulated measurement results according to the model
\[
\tilde{x}_i \leftarrow N(\hat{\theta},\, u_i^2 + \hat{\sigma}^2), \qquad i = 1, \ldots, n,
\]
and then apply the estimation procedure to the x̃i values and the ui² values to obtain a simulated estimate θ̃. (The tilde indicates a simulated value.) This is repeated m times to form a set of simulated estimates θ̃1, . . . , θ̃m. Then the standard uncertainty to associate with θ̂ is
\[
u^{*}(\hat{\theta}) = \sqrt{\frac{1}{m}\sum_{j=1}^{m} \left( \tilde{\theta}_j - \hat{\theta} \right)^{2}}. \tag{8}
\]
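A sketch of how the Model II maximum-likelihood fit and the simulation-based uncertainty (8) could be implemented is given below. The one-dimensional search of Q(σ²) with scipy.optimize.minimize_scalar, the boundary handling and the number of simulation repeats are implementation choices made here, not prescriptions from the paper.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def theta_of(x, u2, s2):
    w = 1.0 / (u2 + s2)
    return np.sum(w * x) / np.sum(w)

def Q(s2, x, u2):
    t = theta_of(x, u2, s2)
    return np.sum(np.log(u2 + s2) + (x - t)**2 / (u2 + s2))

def model_II(x, u):
    x, u2 = np.asarray(x, float), np.asarray(u, float)**2
    upper = (x.max() - x.min())**2                        # search bound quoted in the text
    res = minimize_scalar(Q, bounds=(0.0, upper), args=(x, u2), method="bounded")
    s2_hat = res.x if res.fun < Q(0.0, x, u2) else 0.0    # allow the boundary sigma^2 = 0
    theta_hat = theta_of(x, u2, s2_hat)                   # equation (6)
    u_theta = np.sqrt(1.0 / np.sum(1.0 / (u2 + s2_hat)))  # equation (7)
    return theta_hat, s2_hat, u_theta

def u_star(x, u, m=1000, rng=np.random.default_rng(1)):
    """Simulation-based standard uncertainty (8) under the fitted Model II."""
    th, s2, _ = model_II(x, u)
    u2 = np.asarray(u, float)**2
    sims = [model_II(rng.normal(th, np.sqrt(u2 + s2)), u)[0] for _ in range(m)]
    return np.sqrt(np.mean((np.array(sims) - th)**2))
```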

Model II is a straightforward extension of Model I designed to accommodate situations where an unprejudiced assessment of a set of data is required. Although it supposes the existence of a shared extra variance, it does permit the extra error e_nc,i to be negligible for almost all of the laboratories.

3. Model III

A natural modification to Model II involves allowing the extra errors e_nc,1, . . . , e_nc,n to be drawn from distributions with different unknown variances σ1², . . . , σn². The model becomes
\[
x_i \leftarrow N(\theta,\, u_i^2 + \sigma_i^2), \qquad i = 1, \ldots, n,
\]
with θ and each σi² being unknown. We call this Model III. There are now n + 1 unknown parameters, but we only wish to estimate θ.

The likelihood function under this model is
\[
L(\theta, \sigma_1^2, \ldots, \sigma_n^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi(u_i^2+\sigma_i^2)}} \exp\!\left\{ \frac{-(x_i-\theta)^2}{2(u_i^2+\sigma_i^2)} \right\}. \tag{9}
\]
Let θ̂ indicate the estimate that we shall obtain of θ. Even though θ̂ is as yet unknown, (9) implies that the corresponding estimates of σ1², . . . , σn² are the values minimising the sum Σ_{i=1}^{n} H_i(σi²), where
\[
H_i(\sigma_i^2) = \log(u_i^2+\sigma_i^2) + \frac{(x_i-\hat{\theta})^2}{u_i^2+\sigma_i^2}. \tag{10}
\]
This means minimizing each H_i(σi²) term, because each is unrelated. From
\[
\frac{\partial H_i(\sigma_i^2)}{\partial \sigma_i^2} = \frac{1}{u_i^2+\sigma_i^2} - \frac{(x_i-\hat{\theta})^2}{(u_i^2+\sigma_i^2)^2}
\]
we infer that H_i(σi²) has a minimum at σi² = (xi − θ̂)² − ui² and that this is the only minimum. So, because σi² ≥ 0, the fitted value of σi² is
\[
\hat{\sigma}_i^2 = \max\{(x_i-\hat{\theta})^2 - u_i^2,\; 0\}. \tag{11}
\]
If t indicates a possible value for θ̂ then the corresponding fitted value of ui² + σi² is max{(xi − t)², ui²}. So, from (10), we set θ̂ to be the value of t that minimises
\[
Q^{*}(t) = \sum_{i=1}^{n} \left[ \log\!\big( \max\{(x_i-t)^2,\, u_i^2\} \big) + \frac{(x_i-t)^2}{\max\{(x_i-t)^2,\, u_i^2\}} \right].
\]
That is, we set
\[
\hat{\theta} = \operatorname*{arg\,min}_{t} \; Q^{*}(t). \tag{12}
\]


This estimate can be found by searching over t between the lowest and highest values of xi. The corresponding maximum-likelihood choice for σi² is then given by (11), and, like (7), one simple figure of standard uncertainty is
\[
u(\hat{\theta}) = \sqrt{\frac{1}{\sum_{i=1}^{n} 1/(u_i^2+\hat{\sigma}_i^2)}}. \tag{13}
\]
From symmetry, it is clear that θ̂ is an unbiased estimate of θ. So, as in Model II, we could instead take the standard uncertainty of θ̂ to be the parent standard deviation of θ̂ under the condition that the parameters θ, σ1², . . . , σn² are equal to the fitted values θ̂, σ̂1², . . . , σ̂n². Again, we can evaluate this standard deviation by simulating the measurement process many times. Thus for i = 1, . . . , n we draw a value x̃i from the distribution N(θ̂, ui² + σ̂i²), and then we apply the estimation procedure to the x̃i values and the ui² values to obtain a simulated estimate θ̃. This is repeated m times to form a set of simulated estimates θ̃1, . . . , θ̃m. Then, as in (8), u(θ̂) is given by
\[
u^{*}(\hat{\theta}) = \sqrt{\frac{1}{m}\sum_{j=1}^{m} \left( \tilde{\theta}_j - \hat{\theta} \right)^{2}}. \tag{14}
\]
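A corresponding sketch for Model III is given below. The grid search over t between min(xi) and max(xi), and its resolution, are illustrative choices rather than part of the paper.

```python
import numpy as np

def model_III(x, u, n_grid=20001):
    x, u2 = np.asarray(x, float), np.asarray(u, float)**2
    ts = np.linspace(x.min(), x.max(), n_grid)
    d2 = (x[None, :] - ts[:, None])**2                   # (x_i - t)^2 on the grid
    v = np.maximum(d2, u2[None, :])                      # max{(x_i - t)^2, u_i^2}
    Qstar = np.sum(np.log(v) + d2 / v, axis=1)
    theta_hat = ts[np.argmin(Qstar)]                     # equation (12)
    s2_hat = np.maximum((x - theta_hat)**2 - u2, 0.0)    # equation (11)
    u_theta = np.sqrt(1.0 / np.sum(1.0 / (u2 + s2_hat))) # equation (13)
    return theta_hat, s2_hat, u_theta
```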

4. Model IV

Let us now consider a model that allows many of the e_nc,i errors to be exactly zero. We suppose that laboratory i had probability λi of incurring a non-zero e_nc,i error and that this error would be drawn from the normal distribution with mean 0 and unknown variance σi². The model is
\[
x_i \leftarrow N(\theta,\, u_i^2 + k_i\sigma_i^2), \qquad k_i \leftarrow \mathrm{Bernoulli}(\lambda_i).
\]
(A Bernoulli variable with parameter λi takes the value 1 with probability λi and takes the value 0 otherwise.) There are now 2n + 1 unknown parameters, but our primary attention is on estimating θ.

The parent probability distribution of xi is now a mixture of the distributions N(θ, ui²) and N(θ, ui² + σi²) in the ratio (1 − λi) : λi. This mixture distribution has probability density function
\[
f_i(x) = \frac{1-\lambda_i}{\sqrt{2\pi u_i^2}} \exp\!\left\{ \frac{-(x-\theta)^2}{2u_i^2} \right\} + \frac{\lambda_i}{\sqrt{2\pi(u_i^2+\sigma_i^2)}} \exp\!\left\{ \frac{-(x-\theta)^2}{2(u_i^2+\sigma_i^2)} \right\}.
\]


The likelihood function is ∏_{i=1}^{n} f_i(x_i). Setting each λi to zero gives Model I, while setting each λi to one gives Model III.

Again, let θ̂ denote the MLE of θ. Even though θ̂ is as yet unknown, the corresponding fitted values of λ1, . . . , λn, σ1², . . . , σn² are the values maximising ∏_{i=1}^{n} g(λi, σi²), where
\[
g(\lambda_i, \sigma_i^2) = \frac{1-\lambda_i}{u_i}\exp\!\left( \frac{-(x_i-\hat{\theta})^2}{2u_i^2} \right) + \frac{\lambda_i}{\sqrt{u_i^2+\sigma_i^2}}\exp\!\left( \frac{-(x_i-\hat{\theta})^2}{2(u_i^2+\sigma_i^2)} \right),
\]
subject to 0 ≤ λi ≤ 1 and σi² > 0. This means maximising each of the individual g(λi, σi²) factors separately. Setting ∂g(λi, σi²)/∂λi = 0 implies that
\[
\frac{1}{u_i}\exp\!\left( \frac{-(x_i-\hat{\theta})^2}{2u_i^2} \right) = \frac{1}{\sqrt{u_i^2+\sigma_i^2}}\exp\!\left( \frac{-(x_i-\hat{\theta})^2}{2(u_i^2+\sigma_i^2)} \right). \tag{15}
\]
So if λ̂i ≠ 0, 1 then (15) holds with σ̂i² replacing σi². Also, setting ∂g(λi, σi²)/∂σi² = 0 implies that either λi = 0 or
\[
-\frac{1}{u_i^2+\sigma_i^2} + \frac{(x_i-\hat{\theta})^2}{(u_i^2+\sigma_i^2)^2} = 0,
\]
in which case (xi − θ̂)² = ui² + σi². So if λ̂i ≠ 0, 1 then, using this result and (15), we find that σ̂i² satisfies
\[
\frac{u_i^2+\hat{\sigma}_i^2}{u_i^2}\exp\!\left( -\frac{u_i^2+\hat{\sigma}_i^2}{u_i^2} \right) = \exp(-1),
\]
which implies that σ̂i² = 0. Thus, if λ̂i ≠ 0, 1 then σ̂i² = 0, in which case the value of λ̂i does not matter, and we recover the solution under Model I. Also, if λ̂i = 0 then we again recover the solution under Model I. However, if λ̂i = 1 then we recover the solution under Model III. Model III encompasses Model I as a special case, so the value of the likelihood function at the solution under Model III must be at least as large as the value of the likelihood function at the solution under Model I. From this we can infer that, when fitting is carried out by the method of maximum likelihood, the model described in this section leads to the same result as Model III. Therefore, this model is not considered further as a means of solution.


5. Example: the gravitational constant G

Consider the formation of a combined estimate of Newton's gravitational constant from the 10 ordered measurement results given in Table 1.³

Table 1: Measurement results for G (10⁻¹¹ m³ kg⁻¹ s⁻²)

 i   xi        ui          i   xi        ui
 1   6.6709    0.0007      6   6.67407   0.00022
 2   6.67259   0.00043     7   6.67422   0.00098
 3   6.6729    0.0005      8   6.674255  0.000092
 4   6.67387   0.00027     9   6.67559   0.00027
 5   6.6740    0.0007     10   6.6873    0.0094

Analysis is carried out in the units of 10⁻¹¹ m³ kg⁻¹ s⁻². With Model I we obtain, from (3) and (4), θ̂ = 6.674186 and u(θ̂) = 0.000074. With Model II we obtain, from (6) and (7), θ̂ = 6.673689 and u(θ̂) = 0.000401, with σ̂² = 1.21 × 10⁻⁶. With Model III we obtain, from (12) and (13), θ̂ = 6.674195 and u(θ̂) = 0.000081, with σ̂i² = 1.04 × 10⁻⁵, 2.39 × 10⁻⁶, 1.43 × 10⁻⁶, 3.27 × 10⁻⁸, 0, 0, 0, 0, 1.87 × 10⁻⁶, 8.34 × 10⁻⁵. Using Model III instead of Model II brings the estimate back towards the value obtained with Model I. This was a pattern observed in other examples also.

Figure 1 shows the corresponding intervals θ̂ ± u(θ̂) and the intervals xi ± ui for Model I, xi ± √(ui² + σ̂²) for Model II and xi ± √(ui² + σ̂i²) for Model III. In accordance with (11), every laboratory with σ̂i² > 0 has its interval xi ± √(ui² + σ̂i²) with one end-point at the estimate θ̂.

Fig. 1. Gravitational constant G: estimate ± standard uncertainty (10−11m3kg−1s−2)


6. Example: Planck’s constant

Similarly, consider the formation of a combined estimate of Planck's constant from the 20 ordered measurement results given in Table 2.⁴ Analysis is carried out in the units of 10⁻³⁴ J s. Model I gives, from (3) and (4), θ̂ = 6.62606993 and u(θ̂) = 0.00000010. Model II gives, from (6) and (7), θ̂ = 6.62606986 and u(θ̂) = 0.00000020, with σ̂² = 2.17 × 10⁻¹³. Model III gives, from (12) and (13), θ̂ = 6.62607004 and u(θ̂) = 0.00000011, with maxi σ̂i² = 1.23 × 10⁻¹¹. Figure 2 presents the results graphically.

Table 2: Measurement results for Planck's constant (10⁻³⁴ J s)

 i   xi          ui            i   xi          ui
 1   6.6260657   0.0000088    11   6.62607000  0.00000022
 2   6.6260670   0.0000042    12   6.62607003  0.00000020
 3   6.6260682   0.0000013    13   6.62607009  0.00000020
 4   6.6260684   0.0000036    14   6.62607063  0.00000043
 5   6.6260686   0.0000044    15   6.626071    0.000011
 6   6.6260686   0.0000034    16   6.6260712   0.0000013
 7   6.62606887  0.00000052   17   6.62607122  0.00000073
 8   6.62606891  0.00000058   18   6.6260715   0.0000012
 9   6.62606901  0.00000034   19   6.6260729   0.0000067
10   6.6260691   0.0000020    20   6.6260764   0.0000053


Fig. 2. Planck’s constant: estimate ± standard uncertainty (10−34 J s)


7. Performance assessment

We envisage each error distribution being symmetric. So each method is unbiased and its performance can be judged by its standard error. Model I will perform best when the ui uncertainties are correctly assessed, so we also consider the application of Model I unless the data fail the test of consistency undertaken by comparing the statistic Σ_{i=1}^{n} (xi − θ̂)²/ui² with the 95th percentile of the chi-square distribution with n − 1 degrees of freedom.
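A short sketch of this consistency test, assuming scipy is available, is given below; it illustrates the criterion just described and is not code from the paper.

```python
import numpy as np
from scipy.stats import chi2

def model_I_is_consistent(x, u, level=0.95):
    """True if the Model I weighted mean passes the chi-square consistency test."""
    x, u = np.asarray(x, float), np.asarray(u, float)
    w = 1.0 / u**2
    theta_hat = np.sum(w * x) / np.sum(w)          # weighted mean, equation (3)
    stat = np.sum((x - theta_hat)**2 / u**2)       # observed chi-square statistic
    return stat <= chi2.ppf(level, len(x) - 1)     # compare with the 95th percentile
```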

The five methods of analysis studied were therefore:

I - apply Model I

II - apply Model II

III - apply Model III

I+II - apply Model II if Model I fails the chi-square test

I+III - apply Model III if Model I fails the chi-square test.

Datasets were generated by mechanisms obeying Models I to III. There were four settings for the main parameters and five settings for the extra variances (which are nuisance parameters), as follows:

θ = 0, n = 8 and ui = 1, 1, 1, 1, 1, 1, 1, 1
θ = 0, n = 8 and ui = 1, 1, 1, 1, 9, 9, 9, 9
θ = 0, n = 8 and ui = 1, 1, 4, 4, 4, 4, 9, 9
θ = 0, n = 8 and ui = 1, 4, 4, 4, 4, 4, 4, 4

extra variances = 0, 0, 0, 0, 0, 0, 0, 0 (as per Model I)

extra variances = 1, 1, 1, 1, 1, 1, 1, 1 (as per Model II)

extra variances = 9, 9, 9, 9, 9, 9, 9, 9 (as per Model II)

extra variances = 15, 0, 0, 0, 0, 0, 0, 0 (as per Model III)

extra variances = 15, 15, 0, 0, 0, 0, 0, 0 (as per Model III).

For each of the 20 corresponding combinations, there were 10,000 simulated experiments. The appropriate row in Table 3 shows the standard errors for the five methods normalized to the value of the best performing method, which is given the value 1. (So entries in Table 3 may be compared across rows but not down columns.) To indicate relatively poor performance, entries of 1.2 or more were italicised in the original table. Bold type was used to indicate where the best performing model was not the model under which the data were actually generated.

The results support the use of Model II with or without a chi-square test. They indicate that Model II performed relatively well with data generated under Model I. Also they show it to be the best performing model in several of the scenarios involving data generated under Model III, which was the most general of the three models. This phenomenon will be associated with the simpler model having a smaller number of free parameters.

Table 3: Relative standard errors of estimators of θ

ui (known)         extra variances        I     II    III   I+II  I+III
1,1,1,1,1,1,1,1    0,0,0,0,0,0,0,0        1.00  1.00  1.17  1.00  1.05
1,1,1,1,1,1,1,1    1,1,1,1,1,1,1,1        1.00  1.00  1.33  1.00  1.26
1,1,1,1,1,1,1,1    9,9,9,9,9,9,9,9        1.00  1.00  1.54  1.00  1.54
1,1,1,1,1,1,1,1    15,0,0,0,0,0,0,0       1.33  1.33  1.01  1.33  1.00
1,1,1,1,1,1,1,1    15,15,0,0,0,0,0,0      1.53  1.53  1.00  1.53  1.00

1,1,1,1,9,9,9,9    0,0,0,0,0,0,0,0        1.00  1.00  1.17  1.00  1.04
1,1,1,1,9,9,9,9    1,1,1,1,1,1,1,1        1.00  1.00  1.29  1.00  1.18
1,1,1,1,9,9,9,9    9,9,9,9,9,9,9,9        1.02  1.00  1.43  1.00  1.42
1,1,1,1,9,9,9,9    15,0,0,0,0,0,0,0       1.51  1.45  1.00  1.45  1.01
1,1,1,1,9,9,9,9    15,15,0,0,0,0,0,0      1.23  1.17  1.00  1.17  1.00

1,1,4,4,4,4,9,9    0,0,0,0,0,0,0,0        1.00  1.06  1.16  1.04  1.04
1,1,4,4,4,4,9,9    1,1,1,1,1,1,1,1        1.00  1.03  1.20  1.04  1.11
1,1,4,4,4,4,9,9    9,9,9,9,9,9,9,9        1.15  1.00  1.35  1.01  1.34
1,1,4,4,4,4,9,9    15,0,0,0,0,0,0,0       1.32  1.00  1.02  1.02  1.03
1,1,4,4,4,4,9,9    15,15,0,0,0,0,0,0      1.32  1.00  1.33  1.03  1.34

1,4,4,4,4,4,4,4    0,0,0,0,0,0,0,0        1.00  1.11  1.17  1.05  1.05
1,4,4,4,4,4,4,4    1,1,1,1,1,1,1,1        1.00  1.02  1.15  1.02  1.05
1,4,4,4,4,4,4,4    9,9,9,9,9,9,9,9        1.34  1.00  1.37  1.07  1.34
1,4,4,4,4,4,4,4    15,0,0,0,0,0,0,0       1.63  1.00  1.25  1.15  1.29
1,4,4,4,4,4,4,4    15,15,0,0,0,0,0,0      1.65  1.00  1.28  1.14  1.31

References

1. R. Willink, Statistical determination of a comparison reference value using hidden errors, Metrologia 39, 343 (2002).

2. M. G. Vangel and A. L. Rukhin, Maximum Likelihood Analysis for Heteroscedastic One-Way Random Effects ANOVA in Interlaboratory Studies, Biometrics 55, 129 (1999).

3. V. Dose, Bayesian estimate of the Newtonian constant of gravitation, Meas. Sci. Technol. 18, 176 (2007).

4. G. Mana, E. Massa and M. Predescu, Model selection in the average of inconsistent data: an analysis of the measured Planck-constant values, Metrologia 49, 492 (2012).

5. R. Willink, Comments on ‘Bayesian estimate of the Newtonian constant of gravitation’ with an alternative analysis, Meas. Sci. Technol. 18, 2275 (2007).


MODEL FOR EMOTION MEASUREMENTS IN ACOUSTIC SIGNALS AND ITS ANALYSIS

Y. BAKSHEEVA
Radioelectronic System Department, St. Petersburg State University of Aerospace Instrumentation, St. Petersburg, 190000, Russian Federation
E-mail: [email protected], www.guap.ru

K. SAPOZHNIKOVA, R. TAYMANOV†
Computerized Sensors and Measuring Systems Laboratory, D. I. Mendeleyev Institute for Metrology, St. Petersburg, 190005, Russian Federation
E-mail: [email protected], †[email protected], www.vniim.ru

In the paper a hypothesis concerning the mechanism of emotion formation as a result of perception of acoustic impacts is justified. A model for measuring emotions and some methods of emotion information processing are described, which enable signals-stimuli and their ensembles causing the emotions to be revealed.

Keywords: Measurement Model, Acoustic Signals, Emotion Measurement

1. Introduction

As civilization develops, the priorities of the tasks that society puts in the forefront for metrology change. In recent decades, the emphasis of scientific research has been shifting more and more to the study of humans, their abilities, the special features of their communication and perception of external impacts, their interaction with the environment, etc.

Interest in measuring quantities characterizing properties which until recently were referred to as immeasurable is increasing [1]. They were considered to be nominal properties that, according to [2], have “no magnitude”. For the most part, these quantities are of a multiparametric (multidimensional) character.

The approach used in processing the results of such measurements, as well as the reliability of the results, depends to a significant extent on the measurement model. In fact, the measurement model reflects the conception of its designers about the “mechanism” forming the corresponding quantities. Its design is associated with a step-by-step development of this conception.

When developing such a model, it is necessary to use knowledge from fields far from metrology and to put forward hypotheses based on it.

2. Stages of development and justification of a measurement model

The experience gained in developing a model for emotion measurement in musical fragments, communication (biolinguistic) signals of animals, as well as other acoustic signals with emotional colour, is a representative one. The statement of the task with regard to the possibility of measuring emotions in acoustic signals is based on the hypothesis that the signals considered contain certain “signals-stimuli” in the infrasound and the low part of the sound range (hereinafter, these ranges will be referred to as the IFR), approximately up to 30 Hz. These signals-stimuli initiate the emotions [3-5].

At the first stage of the model development it was required:
- to put forward and justify a hypothesis that the selection of signals-stimuli from complicated acoustic signals of various types (for example, chords) is carried out by nonlinear conversion;
- to determine possible parameters that can describe these signals-stimuli;
- to reveal the correlation of some signals-stimuli with certain emotions and to build a simplified measurement scale (nominal scale);
- to evaluate the ranges of variation of the signals-stimuli parameters;
- to prove that, at a certain stage of evolution, nonlinear conversion of acoustic communication signals was included in the “mechanism” of emotion formation.

Within the frames of this proof it was demonstrated that the evolution of biolinguistic signals proceeded along the path of increasing the number and frequency of IFR signals and, later on, of forming ensembles of such signals.

Shrimps Alpheidae have only one communication signal (a danger signal). Crabs Uca annulipes emit two signals, the emotional colours of which are different. Fishes use two or three types of signals.

When amphibians and, later, reptiles left the water and settled on dry land, where the density of the medium is significantly lower, they needed to keep a “vocabulary” of vitally important signals. This resulted in the use of modulation in biolinguistic signals as well as their demodulation for perception (with the help of nonlinear conversion).


Highly developed animals (birds, mammals) have in their arsenal many more signals, the meanings and emotional colours of which are different, but they have preserved the ancient signals-stimuli.

On the whole, the work performed at the first stage made it possible to develop the simplest measurement model of the “mechanism” providing formation of emotions. The results of the corresponding investigations are published in [6] and other papers.

The model contains a nonlinear converter, a selector of the frequency zone of the energy maximum, a selector of signals-stimuli, as well as a comparison and recognition unit. In the comparison and recognition unit, the frequencies of the selected signals-stimuli are compared with frequency intervals on a scale of elementary emotions.

The functionality of the simplest model was tested by “decoding” the emotional content of fragments of drum ethnic music and bell rings [5, 7].
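For illustration only, the sketch below assumes (our assumption, not stated in the paper) that the nonlinear conversion can be approximated by squaring followed by low-pass filtering, which demodulates the envelope of an acoustic signal and exposes its components in the IFR below about 30 Hz.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def ifr_components(signal, fs, f_cut=30.0):
    """Spectrum of the low-frequency (IFR) content after a simple nonlinear conversion."""
    demod = np.asarray(signal, float) ** 2         # nonlinear conversion (squaring)
    b, a = butter(4, f_cut / (fs / 2), btype="low")
    envelope = filtfilt(b, a, demod)               # keep only the IFR band
    spectrum = np.abs(np.fft.rfft(envelope))
    freqs = np.fft.rfftfreq(len(envelope), 1.0 / fs)
    keep = freqs <= f_cut
    return freqs[keep], spectrum[keep]
```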

At the second stage of the model development its limitations were analyzed. Special measures were taken for the step-by-step removal of the above-mentioned limitations. This required:
- to suggest and substantiate a hypothesis about the way by which signals-stimuli are singled out when listening to the simplest melodies;
- to evaluate the parameters characterizing this process;
- to show the role of an associative memory in the “mechanism” of emotion formation considered at the first stage.

Investigations [5-7] have demonstrated the necessity to improve the simplest measurement model. It was supplemented with:
- a preselector that restricts the frequency and amplitude range of perceived acoustic signals X;
- a time delay unit that delays signals in order to form elementary emotions Y while listening to sounds with a changing frequency;
- a memory unit assigned for memorizing ensembles consisting of 3-4 signals-stimuli;
- an associative memory unit assigned for memorizing emotional images corresponding to certain signals-stimuli (it carries out the function of a multidimensional scale of emotional images);
- a comparison and recognition unit 2, which forms emotional images Z.

The improved model linking emotions and the acoustic signals causing them is shown in Fig. 1. The emotional content of some animal biolinguistic signals in various situations was “decoded” in order to study the capabilities of this model [6-8]. It should be emphasized that, as further study of the “mechanism” providing formation of human emotions takes place, the structure of the improved measurement model can be corrected somewhat.

The third stage, being performed at present, stipulates optimization of the measurement model parameters. The results of the work at this stage should become a basis for designing a special measurement instrument capable of measuring the expected emotional reaction of listeners to various acoustic impacts.

Fig. 1. Measurement model.

3. Optimization of conversion function

In the experiments carried out at the previous stages of the measurement model development, a nonstationary acoustic signal converted nonlinearly was presented as a Fourier spectrum in the IFR. The duration of the acoustic signal fragments under investigation was from fractions of a second up to a few seconds. Within such a time interval the acoustic signal can contain a number of signals-stimuli.

A corresponding Fourier spectrum in the IFR included a large quantity of spectrum components. It was caused by a number of reasons.

Firstly, the signals-stimuli can have a nonsinusoidal form, i.e. they can contain a number of harmonics “masking” the remaining signals-stimuli. Secondly, a short duration of the analyzed fragments results in a supplementary distortion of the spectrum. In addition, the spectrum of emotionally coloured acoustic signals is to some extent “blurred” owing to special features of the “instrument” emitting them with some modulation.


Optimization of the transform function is aimed at finding a form of the nonlinearly converted signal such that the number of components considered to be effective signals-stimuli is minimal. This requirement is caused by the fact that the elementary emotions scale has a comparatively small number of gradations.

As the first result of this search, taking into account special features of the Fourier transform [9, 10], a modified algorithm of signal presentation was proposed. In this algorithm the Fourier spectrum under consideration is subjected to a supplementary transform. It was proposed to select the most probable “basic” frequencies of signals-stimuli in the original spectrum and then, on the basis of the corresponding oscillations, to form synthesized signals-stimuli, using the harmonics of the oscillations with the “basic” frequencies and the components with frequencies close to them.

A resulting signal-stimulus contains only one spectral line, at the “basic” frequency, in the modified spectrum. For the components remaining in the original spectrum, the procedure is repeated until a number of signals-stimuli containing the greater part of the IFR spectrum energy have been synthesized. Fig. 2 and Fig. 3 illustrate the efficiency of the modified algorithm.
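A rough numerical sketch of such a modified spectrum is given below, under simplifying assumptions of our own: the “basic” frequency is taken as the strongest remaining spectral line, and every component lying close to one of its harmonics is merged into that synthesized signal-stimulus. The tolerance, the 30 Hz IFR limit and the 90 % energy criterion are illustrative choices.

```python
import numpy as np

def modified_spectrum(signal, fs, f_max=30.0, tol=0.5, energy_fraction=0.9):
    """Return a list of (basic frequency, merged power) pairs covering most IFR energy."""
    spec = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    keep = (freqs > 0) & (freqs <= f_max)          # restrict to the IFR
    power, f = spec[keep].copy(), freqs[keep]
    total, stimuli = power.sum(), []
    while power.sum() > (1.0 - energy_fraction) * total:
        k = int(np.argmax(power))                  # strongest remaining line = "basic" frequency
        f0 = f[k]
        harmonics = np.arange(1, int(f_max / f0) + 1) * f0
        members = np.any(np.abs(f[:, None] - harmonics[None, :]) < tol, axis=1)
        stimuli.append((f0, power[members].sum())) # one line per synthesized stimulus
        power[members] = 0.0                       # remove merged components and repeat
    return stimuli
```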

Fig.2 demonstrates a growling of a cheetah as the signal analyzed [11].

Fig. 2. Spectra of the IFR signals after nonlinear conversion. Growling of cheetah (axis of abscissa is the frequency, Hz; ordinate axis is the level of spectrum components, relative units); a) Fourier spectrum, b) spectrum modified.


A Fourier spectrum of a nonlinearly converted signal in the IFR is shown in Fig. 2a). Fig.2b) demonstrates the spectrum of the same nonlinearly converted signal in the IFR processed with the help of the modified algorithm.

In Fig.3 Fourier spectra and modified spectra of signals emitted by various animals (a dhole, white-naped crane, and red-breasted goose) in the process of their coupling [11], are given.

Sounds emitted by these animals are perceived by ear quite differently, but their spectra in the IFR after nonlinear conversion are similar. This fact may indicate the ancient origin of the corresponding emotion and the same “mechanism” of its origination in different animals. Of course, for analogous situations, the modified spectra of biolinguistic signals (after nonlinear conversion) can also differ between various animals. These differences are influenced by the “age” and complexity of an emotion, inaccurate interpretation of an animal's behavior, technical distortions due to signal recording, etc. However, the efficiency of the selection of “basic” signals-stimuli using the modified algorithm indicates that the search in the chosen direction is promising.

Fig. 3. Spectra of the IFR signals after nonlinear conversion. Signals emitted by various animals in the process of their coupling (axis of abscissa is the frequency, Hz; ordinate axis is the level of spectrum components, relative units); a), c), and e) Fourier spectra; b), d), and f) spectra modified.

This approach can be applied not only to the analysis of biolinguistic signals, but also to the decoding of any acoustic signals that can form an emotional response of listeners. A future plan is to continue the search using the wavelet transform and other methods for studying nonstationary signals.

4. Conclusion

Papers in which measurements of multidimensional quantities are considered can be found in scientific journals more and more often. However, the experience gained in designing measurement models for such quantities has not yet received due generalization and methodological support in metrology.

It should be noted that work in the field of multidimensional measurements has a wide spectrum of practical applications. In particular, the measurement model intended for investigating the relationship between acoustic signals and the emotions of listeners opens opportunities for applied work in the fields of musicology, medicine, mathematical linguistics, etc.

References

1. K. Sapozhnikova, A. Chunovkina, and R. Taymanov, “Measurement” and related concepts. Their interpretation in the VIM, Measurement 50(1), 390 (2014).

2. International Vocabulary of Metrology – Basic and General Concepts and Associated Terms. 3rd edn., 2008 version with minor corrections (BIPM, JCGM 200, 2012).

3. K. Sapozhnikova and R. Taymanov, About a measuring model of emotional perception of music, in Proc. XVII IMEKO World Congress, (Dubrovnik, Croatia, 2003).

4. K. Sapozhnikova and R. Taymanov, Measurement of the emotions in musical fragments, in Proc. 12th IMEKO TC1 & TC7 Joint Symposium on Man, Science & Measurement, (Annecy, France, 2008).

5. R. Taymanov and K. Sapozhnikova, Improvement of traceability of widely-defined measurements in the field of humanities, MEAS SCI REV 3 (10), 78 (2010).

6. K. Sapozhnikova and R. Taymanov, Role of measuring model in biological and musical acoustics, in Proc. 10th Int. Symposium on Measurement Technology and Intelligent Instruments (ISMTII-2011), (Daejeon, Korea, 2011).

7. R. Taymanov and K. Sapozhnikova, Measurements enable some riddles of sounds to be revealed, KEY ENG MAT 613 482 (2014).


8. R. Taymanov and K. Sapozhnikova, Measurement of multiparametric quantities at perception of sensory information by living creatures, EPJ WOC 77, 00016 (2014) http://epjwoc.epj.org/articles/epjconf/abs/2014/ 14/epjconf_icm2014_00016/epjconf_icm2014_00016.html

9. L. R. Rabiner and B. Gold, Theory and Application of Digital Signal Processing, Textbook (Prentice-Hall Inc., 1975).

10. L. Yaroslavsky, Fast Transform Methods in Digital Signal Processing, v.2 (Bentham E-book Series “Digital Signal Processing in Experimental Research”, 2011)

11. Volodins Bioacoustic Group Homepage, Animal sound gallery, http://www.bioacoustica.org/gallery/gallery_eng.html


UNCERTAINTY CALCULATION IN GRAVIMETRIC MICROFLOW MEASUREMENTS

E. BATISTA*, N. ALMEIDA, I. GODINHO AND E. FILIPE

Instituto Português da Qualidade

Caparica, 2828-513, Portugal *E-mail: [email protected]

www.ipq.pt

The primary realization of microflow measurements is often done by the gravimetric method. This new measurement field arises from the need of industry and laboratories to have their instruments traceable to reliable standards. In the frame of the EMRP - European Metrology Research Programme, a new project on metrology for drug delivery started in 2012 with the purpose of developing science and technology in the field of health. One of the main goals of this project is to develop primary microflow standards and, in doing so, also to develop the appropriate uncertainty calculation. To validate the results obtained by the Volume Laboratory of the Portuguese Institute for Quality (IPQ) through that model, both the GUM and MCM methodologies were applied.

Keywords: Microflow, uncertainty, drug delivery devices, calibration.

1. Introduction

With the development of science and the widespread use of nanotechnology, the measurement of fluid flow has reached the order of microliters per minute or even nanoliters per minute.

In order to pursue the needs of industry and laboratories in such fields as health, biotechnology, engineering and physics, the need to develop a primary standard for microflow measurement, giving traceability to these measurements, was identified not only nationally but also at the international level [1].

Therefore, in 2011, Metrology for Drug Delivery - MeDD [2] was funded by the EMRP. This joint research project (JRP) aims to develop the required metrology tools, and one of the chosen JRP subjects was Metrology for Health. The choice of this subject had the purpose of developing science and technology in the field of health, specifically to assure the traceability of clinical data, allowing the comparability of diagnostic and treatment information.


2. Microflow measurements

The scientific laws used for the study of fluids at the macro scale are not always applicable to microfluids. This happens because some physical phenomena, like capillarity, thermal influences and evaporation, have a bigger influence in microfluid measurements than in larger flows.

Based on recent studies [1], several parameters have to be taken into account, such as: thermal influence, dead volume, the system for delivering and collecting the fluid, continuity of the flow, pulsation of the flow generator, evaporation effects, surface tension/drop effects and capillarity, balance effects (floatability and impacts), contamination and air bubbles, variation of pressure, and time measurement.

A reference for microflow is often the gravimetric setup, which requires a scale, a measuring beaker or collecting vessel, a flow generator and a water reservoir. This is the type of setup used by IPQ, one of the MeDD project partners, to calibrate drug delivery devices normally used in hospitals, microflow meters and other microflow generators.

IPQ has two different setups that cover a range from 1 µL/h up to 600 mL/h with corresponding uncertainties from 4 % down to 0.15 %. Two types of scales are used according to the flow: a 20 g balance (AX26) with 0.001 mg resolution, Fig. 1 a), and a 200 g balance (XP205) with 0.01 mg resolution, Fig. 1 b).

A data acquisition system was developed using the LabView graphical environment. Different modules were implemented for the acquisition, validation and online visualization of data, statistical processing and uncertainty calculation. The data acquisition is done directly from the balance every 250 ms and the measurement of time is done simultaneously.

Figure 1. a) IPQ microflow setup AX26 Figure 1. b) IPQ microflow setup XP205


3. Uncertainty calculation

The measurement uncertainty of the gravimetric method used for microflow determination is estimated following the Guide to the Expression of Uncertainty in Measurement (GUM) [3]. The measurement model is presented along with the standard uncertainty components, the sensitivity coefficient values, the combined standard uncertainty and the expanded uncertainty.

It was also considered to perform a validation by a robust method, using for that purpose the Monte Carlo Method (MCM) simulation, as described in GUM Supplement 1 [4], using MATLAB programming software. The computational simulation was carried out using the same input information used in the Law of Propagation of Uncertainties (LPU) evaluation, namely the same mathematical model (equation 1), estimates and assigned probability density functions (PDF) characterizing each input quantity. The number of Monte Carlo trials considered was 1.0×10⁶.

3.1. Measurements model

The gravimetric dynamic measurement method is, by definition, the measurement of the mass of fluid obtained during a specific time period. For volume flow rates (Q) the density of the liquid (ρW) is included in equation (1) along with the following components: final time (tf), initial time (ti), final mass (IL), initial mass (IE), air density (ρA), mass pieces density (ρB), expansion coefficient (γ), water temperature (T) and evaporation rate (δQevap):
\[
Q = \frac{(I_L - I_E)}{(t_f - t_i)} \times \frac{1}{(\rho_W - \rho_A)} \times \left(1 - \frac{\rho_A}{\rho_B}\right) \times \left[1 - \gamma\,(T - 20)\right] + \delta Q_{evap} \tag{1}
\]
If the buoyancy correction (δmbuoy) of the dispensing needle is determined by
\[
\delta m_{buoy} = (I_L - I_E) \times \frac{D_{tube}^2}{D_{tank}^2}, \tag{2}
\]
where Dtube is the immersed tube diameter and Dtank is the measuring beaker diameter, then
\[
Q = \frac{(I_L - I_E)\left(1 - \dfrac{D_{tube}^2}{D_{tank}^2}\right)}{(t_f - t_i)} \times \frac{1}{(\rho_W - \rho_A)} \times \left(1 - \frac{\rho_A}{\rho_B}\right) \times \left[1 - \gamma\,(T - 20)\right] + \delta Q_{evap} \tag{3}
\]
The evaporation rate was determined by leaving the collecting vessel, full of water, in the balance for 24 h, under the same conditions in which the measurements are normally done.
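Below is a minimal sketch of the flow model as reconstructed in equations (1)-(3). Because the printed equations are garbled in this copy, the exact algebra here is an assumption based on the components listed in the text and on standard gravimetric volume equations; all names are illustrative.

```python
def flow_rate(IL, IE, tf, ti, rho_W, rho_A, rho_B, gamma, T,
              dQ_evap, D_tube=0.0, D_tank=1.0):
    """Volume flow rate Q from a gravimetric (dynamic weighing) run, per eq. (3)."""
    mass = (IL - IE) * (1.0 - D_tube**2 / D_tank**2)   # needle buoyancy correction, eq. (2)
    Q = (mass / (tf - ti)) * (1.0 / (rho_W - rho_A)) \
        * (1.0 - rho_A / rho_B) * (1.0 - gamma * (T - 20.0)) + dQ_evap
    return Q
```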

3.2. Uncertainty evaluation

The main standard uncertainties considered are: the mass measurements (m), the density of the mass pieces (ρB), the density of the water (ρW), the density of the air (ρA), the evaporation rate (δQevap), the water temperature (T), the time (t), the expansion coefficient (γ), the standard deviation of the measurements (δQrep) and the buoyancy on the immersed dispensing needle (δQmbuoy). Detailed information regarding the uncertainty components is given in Table 1.

Table 1. Uncertainty components in the microflow measurements.

Uncertainty component        Standard uncertainty   Evaluation process                       Evaluation type   Distribution
Final mass                   u(IL)                  Calibration certificate                  B                 Normal
Initial mass                 u(IE)                  Calibration certificate                  B                 Normal
Density of the water         u(ρW)                  Literature                               B                 Rectangular
Density of the air           u(ρA)                  Literature                               B                 Rectangular
Density of the mass pieces   u(ρB)                  Calibration certificate                  B                 Rectangular
Temperature                  u(T)                   Calibration certificate                  B                 Normal
Expansion coefficient        u(γ)                   Literature                               B                 Rectangular
Evaporation                  u(δQevap)              Standard deviation of the measurements   A                 Normal
Final time                   u(tf)                  Estimation (1 µs)                        B                 Rectangular
Initial time                 u(ti)                  Estimation (1 µs)                        B                 Rectangular
Buoyancy                     u(δQmbuoy)             Calibration certificate                  B                 Normal
Repeatability                u(δQrep)               Standard deviation of the measurements   A                 Normal


The combined uncertainty for the gravimetric method is given by the law of propagation of uncertainties applied to the model above:
\[
u(Q) = \sqrt{ \sum_{x_i} \left( \frac{\partial Q}{\partial x_i} \right)^{\!2} u^2(x_i) }, \qquad x_i \in \{ I_L,\, I_E,\, \rho_W,\, \rho_A,\, \rho_B,\, T,\, \gamma,\, t_f,\, t_i,\, \delta Q_{evap},\, \delta Q_{mbuoy},\, \delta Q_{rep} \}. \tag{4}
\]
From the determined value of the coverage factor k and the combined standard uncertainty of the measurand, the expanded uncertainty is deduced as
\[
U = k \times u(Q). \tag{5}
\]
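The sketch below shows one way the LPU evaluation of equation (4) could be carried out, using central finite differences for the sensitivity coefficients instead of analytic derivatives; the numerical differentiation and the dictionary interface are choices of this sketch, not the authors' implementation. Terms that enter with sensitivity 1 (such as repeatability) can simply be added to the sum of squares.

```python
import numpy as np

def lpu_uncertainty(f, nominal: dict, u: dict, rel_step=1e-6):
    """Combined standard uncertainty of f(**nominal) by the GUM LPU formula (4)."""
    uc2 = 0.0
    for name, ux in u.items():
        x0 = nominal[name]
        h = abs(x0) * rel_step if x0 != 0 else rel_step
        hi = dict(nominal, **{name: x0 + h})
        lo = dict(nominal, **{name: x0 - h})
        ci = (f(**hi) - f(**lo)) / (2 * h)        # sensitivity coefficient dQ/dx_i
        uc2 += (ci * ux) ** 2
    return np.sqrt(uc2)                           # expanded uncertainty: U = k * u(Q)
```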

4. Results

A Nexus 3000 pump (microflow generator) was calibrated using the gravimetric method, with the AX26 balance, for a programmed flow rate of 0.1 mL/h. The measurement results were collected using a LabView application and the average of approximately 60 values was used to determine the measured flow, Fig. 2, with a mean value equal to 2.73×10⁻⁵ mL/s.


Figure 2. Flow measurement results

The uncertainty results using the GUM approach are presented in Table 2. A comparison between the GUM and MCM approaches (considering coverage intervals of 68 %, 95 % and 99 %) is presented in Table 3, which gives the estimated values of the output quantity with the associated standard uncertainties and the limits determined for each coverage interval. A difference of 2.8×10⁻¹¹ mL/s was obtained for the output quantity, which is a negligible value compared to the experimental system accuracy (5.7×10⁻⁸ mL/s).


Table 2. Uncertainty components in the calibration of a Nexus 3000 pump - GUM

Uncertainty component         Estimation    u(xi)        ci            (ci·u(xi))²
Final mass (g)                5.12          4.8×10⁻⁶     5.25×10⁻⁴     6.46112×10⁻¹⁸
Density of water (g/mL)       0.9980639     9.00×10⁻⁷    −2.72×10⁻⁵    6.00614×10⁻²²
Density of air (g/mL)         0.001202      2.89×10⁻⁷    2.38×10⁻⁵     4.72819×10⁻²³
Density of weights (g/mL)     7.96          3.46×10⁻²    5.15×10⁻¹⁰    3.1831×10⁻²²
Temperature (°C)              20.68         5.00×10⁻³    −2.71×10⁻¹⁰   1.84216×10⁻²⁴
Expansion coefficient (/°C)   1×10⁻⁵        2.89×10⁻⁷    −1.85×10⁻⁵    2.83938×10⁻²³
Initial mass (g)              5.06          4.8×10⁻⁶     −5.25×10⁻⁴    6.42455×10⁻¹⁸
Evaporation (mL/s)            1.09×10⁻⁷     1.12×10⁻⁸    1             1.2544×10⁻¹⁶
Initial time (s)              0.249         5.77×10⁻⁵    1.42×10⁻⁸     6.73403×10⁻²⁵
Final time (s)                191           5.77×10⁻⁵    −1.42×10⁻⁸    6.73403×10⁻²⁵
Buoyancy (g)                  0.0007        9.01×10⁻⁶    5.25×10⁻⁴     2.23655×10⁻¹⁷
Repeatability (mL/s)          0             5.55×10⁻⁸    1             3.08025×10⁻¹⁵

Flow (mL/s)     2.7254×10⁻⁵
ucomb (mL/s)    5.7×10⁻⁸
Uexp (mL/s)     1.1×10⁻⁷

Table 3. Comparison between the GUM and MCM approaches

M (Monte Carlo trials): 1.0×10⁶        MCM: y = 2.7254×10⁻⁵ mL/s, u = 5.6×10⁻⁸ mL/s        GUM: y = 2.7254×10⁻⁵ mL/s, u = 5.7×10⁻⁸ mL/s

Coverage probability        MCM u (mL/s)   MCM limits (mL/s)              GUM u (mL/s)   GUM limits (mL/s)
68 % ⇔ (y ± u)              5.6×10⁻⁸       2.7197×10⁻⁵ to 2.7310×10⁻⁵     5.7×10⁻⁸       2.7197×10⁻⁵ to 2.7311×10⁻⁵
95 % ⇔ (y ± 1.96 u)         1.1×10⁻⁷       2.714×10⁻⁵ to 2.736×10⁻⁵       1.1×10⁻⁷       2.714×10⁻⁵ to 2.737×10⁻⁵
99 % ⇔ (y ± 2.68 u)         1.5×10⁻⁷       2.710×10⁻⁵ to 2.740×10⁻⁵       1.5×10⁻⁷       2.710×10⁻⁵ to 2.741×10⁻⁵

From the Monte Carlo simulation performed, an approximately normal probability density function of the output quantity was obtained, as presented in Fig. 3.


Figure 3. Probability density function of output quantity Q using MCM
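The following sketch outlines a GUM-S1 Monte Carlo propagation of the kind compared with the GUM results in Table 3. Each input quantity is sampled from the PDF assigned to it in Table 1 (normal or rectangular), the model is evaluated for every draw, and the standard deviation and a coverage interval of the resulting sample are reported. The callable model, the distribution table and the percentile-based interval are assumptions made for illustration.

```python
import numpy as np

def mcm_propagation(model, dists, trials=1_000_000, seed=0):
    """dists maps argument name -> ("normal", mu, u) or ("rect", mu, half_width)."""
    rng = np.random.default_rng(seed)
    samples = {}
    for name, (kind, centre, spread) in dists.items():
        if kind == "normal":
            samples[name] = rng.normal(centre, spread, trials)
        else:  # rectangular PDF with the given half-width
            samples[name] = rng.uniform(centre - spread, centre + spread, trials)
    q = model(**samples)                        # vectorised evaluation of the flow model
    y, u_y = q.mean(), q.std(ddof=1)
    lo95, hi95 = np.percentile(q, [2.5, 97.5])  # 95 % coverage interval of the output PDF
    return y, u_y, (lo95, hi95)
```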

5. Conclusions

In the gravimetric determination of microflow there are several influence factors that make a major contribution to the uncertainty calculation due to the very small amount of liquid used, namely the evaporation of the fluid and capillary effects such as the buoyancy correction. The standard deviation of the measurements (repeatability) is also one of the major uncertainty sources.

Comparing the results and the corresponding uncertainties obtained by the two approaches, it can be concluded that the estimated output quantity values obtained with the GUM and MCM approaches show an excellent agreement (of the order of 10⁻¹¹ mL/s, negligible compared to the experimental system accuracy of 5.7×10⁻⁸ mL/s), which allows the validation of the methodology used for microflow measurements.

Acknowledgments

This work is part of a project under the European Metrology Research Programme (EMRP), MeDD – Metrology for Drug Delivery.

References

1. C. Melvad, U. Kruhne and J. Frederiksen, IOP Publishing, Measurement Science and Technology, no. 21, 2010.

2. P. Lucas, E. Batista, H. Bissig et al., Metrology for drug delivery, research project 2012–2015, www.drugmetrology.com

3. JCGM, Evaluation of measurement data – Guide to the expression of uncertainty in measurement, 1st ed., 2008.

4. BIPM, IEC, IFCC, ILAC, ISO, IUPAP and OIML, Evaluation of measurement data – Supplement 1 to the Guide to the Expression of Uncertainty in Measurement – Propagation of distributions using a Monte Carlo method, Joint Committee for Guides in Metrology, JCGM 101, 2008.


Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. – )

UNCERTAINTIES PROPAGATION FROM PUBLISHED

EXPERIMENTAL DATA TO UNCERTAINTIES OF MODEL

PARAMETERS ADJUSTED BY THE LEAST SQUARES

V.I. BELOUSOV, V.V. EZHELA∗, Y.V. KUYANOV, S.B. LUGOVSKY,

K.S. LUGOVSKY, N.P. TKACHENKO

COMPAS group, IHEP, Science square 1

Protvino, Moscow region, 142280, Russia
∗E-mail: [email protected]

www.ihep.ru

This report presents results of the indirect multivariate “measurements” of

model parameters in a task of experimental data description by analytical models with “moderate” nonlinearity. In our “measurements” we follow the

recommendations of the GUM-S1 and GUM-S2 documents in places where

they are appropriate.

Keywords: GUM; JCGM; Propagation of distributions; Indirect measurements;

Uncertainty evaluation; Criteria for the measurement result; Numerical peer

review.

1. Introduction

Our task is as follows: for an available experimental data sample of N_d data points (x_i, u_i) we need to find an algebraic model t_i(y_j), dependent upon N_p parameters y_j, and a vector y_j^ref, such that for almost all i we have |x_i − t_i(y_j^ref)| ≤ u_i, where u_i is an estimate of the uncertainty of the random variable x_i with mean (or estimated) value x̄_i. This is an optimisation task that one can solve by the method of least squares (MLS) using the estimator function χ²(x, y_j) defined as:

\chi^2(x, y_j) = \sum_{i=1}^{N_d} \left( \frac{x_i - t_i(y_j)}{u_i} \right)^2

The solution(s) is contained in the set of roots of the equation:

\min_{y_j} \chi^2(x_i, y_j) = \chi^2(x_i, y_j^{ref})
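As an illustration only (this is not the authors' software), the least-squares task above can be sketched as follows; the model t_i(y), the data and the uncertainties are invented for the example, and scipy's least_squares is used as the minimizer.

```python
# Minimal sketch of the MLS task: synthetic data x_i with uncertainties u_i are
# fitted by a toy model t_i(y) via the chi-square estimator chi2(y) = sum(res^2).
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
s = np.linspace(10.0, 1000.0, 60)                 # "energy" points (arbitrary)

def model(y, s):                                  # toy 3-parameter model
    H, P, R = y
    return H * np.log(s) ** 2 + P + R / np.sqrt(s)

y_true = np.array([0.25, 35.0, 50.0])
u = 0.5 + 0.02 * model(y_true, s)                 # assumed uncertainties u_i
x = rng.normal(model(y_true, s), u)               # observed values x_i

residuals = lambda y: (x - model(y, s)) / u       # chi2(y) = sum(residuals(y)**2)
fit = least_squares(residuals, x0=[0.1, 30.0, 40.0])
y_ref = fit.x                                     # reference (best-fit) vector
chi2_ref = np.sum(residuals(y_ref) ** 2)
print(y_ref, chi2_ref / (len(s) - len(y_ref)))    # FQ = chi2 / ndf
```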


If the selected “best” vector y_j^ref points to a local minimum of a sufficiently smooth estimator function then, by the necessary conditions for an extremum, the vector y_j is determined as a function of the random variables x_i in the vicinity of the values x̄_i:

y_j = F_j(x_i) \;\Rightarrow\; y_j^{ref} = F_j(\bar{x}_i)

We see that in such tasks almost all requirements for the applicability of

the GUM-S1 and GUM-S2 recommendations are met.

In our case study we have conducted a few computer experiments on the simultaneous indirect “measuring” of all components of the vector of adjustable parameters Y_j, using algebraic formulae for the physical observables Σ^{ab}(s^{ab}; Y_j) and ℜ^{ab}(s^{ab}; Y_j) (also indirect measurands, but named here as direct for clarity) describing possible measurable outcomes after collisions of two particles a, b at total collision energy √s^{ab} in the centre of mass of the colliding particles.

The formulae come from theory and phenomenology to model direct experimental data on σ^{ab}(s^{ab}) and ρ^{ab}(s^{ab}) and connect them with the adjustable model parameters (indirect measurands). Best estimates of the parameters (reference estimates) y_j^ref are obtained by tuning the parameters by MLS so as to obtain the “best” currently possible quantifiable consistency between theory and experiment.

2. Experimental data input

The experimental data samples used are from recent compilations [1], [2] of the measurement results on the hadronic production total cross sections σ^{ab}(s^{ab}) and the other measurands ρ^{ab}(s^{ab}) in various two-particle collisions at centre-of-mass energies √s^{ab} above 5 GeV. The compilations were collected from published scientific reports (1960–2013). They contain 1047 data points of (σ^{ab}(s^{ab}_l), u(σ^{ab}_l)) or (ρ^{ab}(s^{ab}_k), u(ρ^{ab}_k)), where u(...) stands for the total experimental uncertainty at each energy point (marked as l or k). These data should be compared with the model tables Σ^{ab}(s^{ab}_l; y_j^ref) and ℜ^{ab}(s^{ab}_k; y_j^ref) calculated using our algebraic formulae with the reference values y_j^ref of the adjusted parameters inserted.

3. Phenomenological models

In this section we show results of data description by different variants of

the model used in our mini-review on the current situation of the subject in


RPP 2013/2014 [1]. Each variant was adjusted on the same data sample by a simultaneous fit to the data on the collisions:

(p, p) (p, n, d); Σ−p; π∓ (p, n, d); K∓ (p, n, d); γ p; γ γ; γ d.

To trace the variation of the range of applicability of the simultaneous fit results, several fits were produced with lower energy cutoffs √s^{ab} ≥ 5, ≥ 6, ≥ 7 GeV, until the “uniformity” of the fit quality across the different collisions became acceptable with a good value of the overall fit quality (FQ = χ²/ndf, FQ ≤ 1).

3.1. Model HPR1R2

\Sigma^{a^\mp b} = H\log^2\!\left(\frac{s}{s^{ab}_M}\right) \;[\text{Heisenberg term}]
\;+\; P^{ab} \;[\text{Pomeranchuk term}]
\;+\; R^{ab}_1\left(\frac{s^{ab}_M}{s}\right)^{\eta_1} \;[C^{+}\text{ Reggeon term}]
\;\pm\; R^{ab}_2\left(\frac{s^{ab}_M}{s}\right)^{\eta_2} \;[C^{-}\text{ Reggeon term}],

\Re^{a^\mp b} = \frac{1}{\Sigma^{a^\mp b}}\Big[\, \pi H\log\!\left(\frac{s}{s^{ab}_M}\right) \;[\text{Heisenberg term}]
\;-\; R^{ab}_1\tan\!\left(\frac{\eta_1\pi}{2}\right)\left(\frac{s^{ab}_M}{s}\right)^{\eta_1} \;[C^{+}\text{ Reggeon term}]
\;\pm\; R^{ab}_2\cot\!\left(\frac{\eta_2\pi}{2}\right)\left(\frac{s^{ab}_M}{s}\right)^{\eta_2} \;[C^{-}\text{ Reggeon term}] \,\Big],

where upper signs are for particles and lower signs for anti-particles.

The adjustable parameters are as follows:

H = \pi(\hbar c)^2/M^2, in mb, where the notation H is after Heisenberg (1952, 1975);
P^{ab}, in mb, are Pomeranchuk's (1958) constant terms;
R^{ab}_i, in mb, are the intensities of the effective secondary Regge pole contributions, named after Regge–Gribov (1961);
s and s^{ab}_M = (m_a + m_b + M)^2 are in GeV²;
m_a, m_b (with m_{γ*} = m_{ρ(770)}) and M, all in GeV, are the masses of the initial state particles and the mass parameter defining the rate of the universal rise of the total cross sections. The parameters M, η_1 and η_2 are universal for all collisions considered. For collisions with a deuteron target H_d = λH, where the dimensionless parameter λ is introduced to test the universality of the Heisenberg rise for particle–nuclear and nuclear–nuclear collisions.

An exact factorization hypothesis was used for both H\log^2(s/s^{ab}_M) and P^{ab} to extend the universal rise of the total hadronic cross sections to the γ(p, d) → hadrons and γγ → hadrons collisions.


This results in one additional adjustable parameter δ with the substitutions:

H\log^2\!\left(\frac{s}{s^{\gamma(p,d)}_M}\right) + P^{\gamma(p,d)} \;\Rightarrow\; \delta\left[(1,\lambda)\,H\log^2\!\left(\frac{s}{s^{\gamma(p,d)}_M}\right) + P^{p(p,d)}\right],

H\log^2\!\left(\frac{s}{s^{\gamma\gamma}_M}\right) + P^{\gamma\gamma} \;\Rightarrow\; \delta^2\left[H\log^2\!\left(\frac{s}{s^{\gamma\gamma}_M}\right) + P^{pp}\right].

In this variant we have 35 adjustable parameters and 1047 observational equations for the indirect “measurement” (estimation) of the best “reference values” of the parameters and of their scattering region (SR) in the 35-dimensional parameter space.

In cases with “moderate” nonlinearity one can construct the SR by two methods: the Hessian method recommended in the GUM [3], and the adaptive Monte Carlo method (MCM) advocated in the GUM-S1 [4] and GUM-S2 [5] documents. In the cases under study we construct and compare both SRs:

• the SR_hess, constructed by the standard NonlinearModelFit procedure in Mathematica 8;
• the SR_prop, constructed by propagation of the assumed normal distribution of the experimental uncertainties to the “empirical” distribution of the parameter uncertainties.

In fact, the Hessian method gives the parameter covariance matrix as the inverse Hessian matrix calculated at the minimum point corresponding to \vec{y}^{ref}:

\overline{(y_i - y_i^{ref})(y_j - y_j^{ref})} = \left( \frac{1}{2}\,\frac{\partial^2 \chi^2(\vec{y})}{\partial y_i \partial y_j}\bigg|_{\vec{y}^{ref}} \right)^{-1}.

Inserting it into the expansion

\Delta\chi^2(\vec{y}\,|\,\vec{y}^{ref}) = \frac{1}{2}\,\frac{\partial^2 \chi^2(\vec{y})}{\partial y_i \partial y_j}\bigg|_{\vec{y}^{ref}} \cdot (y_i - y_i^{ref})(y_j - y_j^{ref}) + \dots

we obtain \Delta\chi^2(\vec{y}\,|\,\vec{y}^{ref}) = N_p, and SR_hess is defined as the region in the parameter space given by the inequality \chi^2(\vec{y}) - \chi^2(\vec{y}^{ref}) \le N_p = 35.
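A minimal sketch of this Hessian construction of SR_hess, under the assumption of a smooth χ² (the toy two-parameter quadratic below stands in for the real 35-parameter estimator): the covariance is the inverse of one half of the numerically evaluated Hessian, and membership in SR_hess is the inequality above.

```python
# Minimal sketch: parameter covariance from the chi-square Hessian at the
# minimum, and the SR_hess membership test chi2(y) - chi2(y_ref) <= Np.
import numpy as np

def chi2(y):
    # toy chi-square with minimum at y_ref = (1.0, 2.0); any smooth chi2 works
    return 4.0 * (y[0] - 1.0) ** 2 + (y[1] - 2.0) ** 2 \
        + 0.5 * (y[0] - 1.0) * (y[1] - 2.0) + 7.0

def hessian(f, y0, h=1e-4):
    n = len(y0); H = np.zeros((n, n)); y0 = np.asarray(y0, float)
    for i in range(n):
        for j in range(n):
            ei = np.eye(n)[i] * h; ej = np.eye(n)[j] * h
            H[i, j] = (f(y0 + ei + ej) - f(y0 + ei - ej)
                       - f(y0 - ei + ej) + f(y0 - ei - ej)) / (4 * h * h)
    return H

y_ref = np.array([1.0, 2.0])
cov = np.linalg.inv(0.5 * hessian(chi2, y_ref))   # parameter covariance matrix
Np = len(y_ref)

def in_SR_hess(y):                                # chi2(y) - chi2(y_ref) <= Np
    return chi2(y) - chi2(y_ref) <= Np

print(cov, in_SR_hess(y_ref + np.array([0.3, 0.3])))
```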

The input data and the plots with their model description can be accessed via the URLs given in references [1], [2].


3.1.1. Parameter uncertainties estimation

We have 1047 independent random input quantities x_i ∈ N(x̄_i, u_i) and 35 dependent quantities y_j(x_i), which are estimated by MLS.

First of all we should decide what the result of an indirect measurement is in this case. From GUM-S2 (2011), clause 7.6, we have the general recommendation

\bar{y}_j = \frac{1}{N_{stop}} \sum_{r=1}^{N_{stop}} y_j^r, \qquad U_{ij} = \frac{1}{N_{stop}-1} \sum_{r=1}^{N_{stop}} (y_i^r - \bar{y}_i)(y_j^r - \bar{y}_j),

where y_j^r = y_j(x_i^r) is the reference vector obtained from the r-th independent random draw x_i^r by the measuring procedure; the overbar is used to indicate the expectation value (or estimated output value), which constitutes a part of the measurement result; U_{ij} is the output covariance matrix of the obtained MC sample of y_j^r; and N_{stop} is the cardinality of the MC sample.

This recommendation works well in the case of a linear measuring model only; in the general case we propose more natural estimates as the result of the indirect measurement, namely:

\bar{y}_j \;\Rightarrow\; y_j^{ref}, \qquad U_{ij}^{ref} = \frac{1}{N_{stop}} \sum_{r=1}^{N_{stop}} (y_i^r - y_i^{ref})(y_j^r - y_j^{ref}).

In the case of the MLS measurement method the estimates \bar{y}_j should be replaced by the best-fit parameter values y_j^ref. For the nonlinear measuring model Y_j = F_j(X_1, X_2, ..., X_N), \bar{y}_j = \overline{F_j(x_1^r, x_2^r, ..., x_N^r)} \neq F_j(\bar{x}_1, \bar{x}_2, ..., \bar{x}_N), and again the estimates \bar{y}_j should be replaced by y_j^{ref} = F_j(\bar{x}_1, \bar{x}_2, ..., \bar{x}_N), as it:

(i) is independent of N_stop;
(ii) always belongs to the manifold where the probability is defined;
(iii) tends to the indirect measuring result recommended by GUM-S2 when the input data become more and more precise (u(x_i) → 0).

The covariance matrix U_{ij} should also be replaced, U_{ij} \Rightarrow U_{ij}^{ref}.

Our measuring method is to estimate y_j^ref as the best-fit parameters y_j by minimization of the quadratic form

\chi^2(y_j) = \sum_{a,b,l} \left( \frac{\sigma^{ab}_l - \Sigma^{ab}(s^{ab}_l; y_j)}{u(\sigma^{ab}_l)} \right)^2 + \sum_{a,b,k} \left( \frac{\rho^{ab}_k - \Re^{ab}(s^{ab}_k; y_j)}{u(\rho^{ab}_k)} \right)^2

over the parameters y_j, i.e.

\min_{y_j} \chi^2(y_j) = \chi^2(y_j^{ref}).


We obtain a goodness-of-fit indicator FQ = 0.963, which corresponds to a fit confidence level CL_ref(1047 − 35) ≈ 0.9993 at ndf = 1012.

In our case, to perform the propagation of the assumed normal distributions of the experimental scores for the observable values at each energy point, we construct a new quadratic form χ²_r(y_j):

\chi^2_r(y_j) = \sum_{a,b,l} \left( \frac{\sigma^{ab}_{l,r} - \Sigma^{ab}(s^{ab}_l; y_j)}{u(\sigma^{ab}_l)} \right)^2 + \sum_{a,b,k} \left( \frac{\rho^{ab}_{k,r} - \Re^{ab}(s^{ab}_k; y_j)}{u(\rho^{ab}_k)} \right)^2

with the same set of adjustable parameters y_j, where the experimental value of the observable at each energy point is replaced by a random value drawn from the corresponding assumed distribution independently, but simultaneously for all experimental data points. The index r marks the consecutive simultaneous replacements. Minimizations of χ²_r(y_j) in the parameter space, with the fit starting vector y_j^ref, for all r give us a sample of propagated reference vectors, i.e.

\min_{y_j} \chi^2_r(y_j) = \chi^2_r(y_j^{prop}),

which form the empirical distribution of random reference vectors in the propagated scattering region SR_prop.
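A minimal sketch of this propagation step (assumed, not the COMPAS group code): each observation is replaced by a draw from N(x̄_i, u_i), the fit is repeated starting from y^ref, and the refitted vectors populate SR_prop. The toy model, data and N_stop are illustrative only.

```python
# Minimal sketch: propagation of assumed normal experimental distributions to an
# empirical distribution of fitted parameters (SR_prop) by repeated refitting.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
s = np.linspace(10.0, 1000.0, 60)

def model(y, s):
    H, P, R = y
    return H * np.log(s) ** 2 + P + R / np.sqrt(s)

u = np.full_like(s, 1.0)
x = rng.normal(model([0.25, 35.0, 50.0], s), u)        # "experimental" sample

def fit(data, start):
    res = least_squares(lambda y: (data - model(y, s)) / u, x0=start)
    return res.x, np.sum(res.fun ** 2)

y_ref, chi2_ref = fit(x, start=[0.1, 30.0, 40.0])

N_stop = 2000                                          # the paper uses 1.7e6 vectors
y_prop = np.empty((N_stop, 3)); dchi2 = np.empty(N_stop)
for r in range(N_stop):
    x_r = rng.normal(x, u)                             # simultaneous replacement of all points
    y_r, _ = fit(x_r, start=y_ref)
    y_prop[r] = y_r
    # delta-chi2 of the propagated vector, evaluated with the ORIGINAL data
    dchi2[r] = np.sum(((x - model(y_r, s)) / u) ** 2) - chi2_ref

U_ref = (y_prop - y_ref).T @ (y_prop - y_ref) / N_stop  # covariance about y_ref
print(U_ref, np.percentile(dchi2, 95))
```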

The obtained “empirical” PDF for the sample of values

\Delta\chi^2(y_j^{prop}) = \chi^2(y_j^{prop}) - \chi^2(y_j^{ref})

is visually well fitted by a χ²(ν) distribution with ν ≈ 35 (see Fig. 1). This is a signal of the “moderate” nonlinearity, and the Δχ²(y_j^prop) quantiles can be used for sampling from the whole MC propagated vector sample to construct scatter regions with different coverage probabilities.

At this stage we have a lucky situation (in our task). Indeed, we have N_stop = 1.7×10⁶ vectors belonging to SR_prop and a repeatable, reliable procedure to extract samples with a predefined coverage probability.

Indeed, we have a scatter region like a “jet” of vectors that is mapped, via the PDF histogram of our Δχ²(y_j^prop) quadratic-form values, to a one-dimensional distribution. This distribution is well fitted by a χ²(34.91577...) distribution, with a confidence level of the distribution fit test of CL_DFT(1.7×10⁶) = 0.858. This value is acceptable, as we have

\max_{\vec{y}^{prop}} \Delta\chi^2(\vec{y}^{prop}) = 93.7814

and Quantile[χ²(34.91577...), p = 0.99999972579] = 93.7814.


Fig. 1. Distribution of Δχ²(y_j^prop) (gray histogram) and curve of the fitted χ²(ν = 34.92) (the gray part of the curve corresponds to coverage probability ≤ 0.95).

Thus, we may take N_stop = 1.7×10⁶ as the “stopping rule” for our MC sampling, as we have obtained an SR_prop sample with practically 100% coverage probability in terms of the approximate analytic distribution, chosen by a reasonable fit of our 35-dimensional scatter region mapped to the one-dimensional statistic χ²(34.91577...) with the help of the Δχ²(y_j^prop) construction.

3.1.2. Summary on HPR1R2 model parameters measurement

Now we can formulate the first level result of our measurement. Our task

was to get reliable estimates of the HPR1R2 model parameters and to

construct reasonable parameter scattering region with traceable calculation

of its coverage probability.

We propose a simple quantitative probabilistic reliability indicator (RLEV) of the parameter measurement:

RLEV(N_{stop}) = CL_{ref}(ndf) \times CL_{DFT}(N_{stop}) \times SCP,

where CL_DFT(N_stop) denotes the CL of the “DistributionFitTest” at N_stop, and SCP stands for the stipulated coverage probability.

In our case we have RLEV(N_stop) = 0.999 × 0.858 × 0.999 = 0.857. Thus, our measurement is reliable at the level of 86%.


This RLEV could be used for classifying results of measurements as “reliable” or “inconsistent” and in risk assessments when implementing measurement results in applications.

It is strongly recommended by the JCGM documents that the summarization of the measurement results should be as complete as possible and expressed in computer-usable form as well. The minimal structure that will allow any interested person to check and reproduce our statements is as follows:

• the measured experimental data sample treated as independent variables in the measurement model (file with data, or URL and procedure to extract the needed sample);
• the parametric model and the procedure to construct our best estimate of the model parameter vector value based on the available experimental knowledge;
• the procedure that maps the scatter region of the experimental data in the 1047-dimensional space onto the scatter region in the 35-dimensional space of the model parameters, which are treated as dependent variables in the measurement model;
• a file with N_stop 36-component binary vectors (χ²(y_j^r) − χ²(y_j^ref), y_j^r) with y_j^r ∈ SR_prop (in our case it is ≈ 490 Mb).

At this stage we can say nothing about the geometric form of the scattering region, except that we have a jet of N_stop 35-dimensional vectors randomly populated around the best-fit vector y_j^ref.

CAUTION! Nevertheless, it should be noted that there is no way to model the parameter distribution as a 35-dimensional normal distribution with the covariance matrix constructed on the whole SR_prop sample.

Let

Q^2(\vec{y}) = U^{ref}_{\alpha\beta}\,(y_\alpha - y_\alpha^{ref})(y_\beta - y_\beta^{ref})

be the quadratic form dependent on \vec{y}, centered at \vec{y}^{ref}, with the covariance matrix constructed as the second moment of the whole set of SR_prop vectors with respect to \vec{y}^{ref}. In this case, if \vec{y}^{gen} ∈ N[\vec{y}^{ref}, U^{ref}], then Q²(\vec{y}^{gen}) ∈ χ²[35]. Let \vec{y}^{gen} ∈ SR_gen, the scatter region with cardinality N_stop drawn from N[\vec{y}^{ref}, U^{ref}]. We have calculated the two statistics

\chi^2(\vec{y}^{prop}), \qquad \chi^2(\vec{y}^{gen})

and plotted the corresponding histograms in Fig. 2 for comparison.


Fig. 2. Distributions of Δχ²(y_j^prop) (gray histogram) and χ²(\vec{y}^{100% gen}) (hard gray histogram). The curve is the fitted χ²(ν = 34.92...) (the gray part of the curve corresponds to coverage probability ≤ 0.95).

It is seen that the χ²(\vec{y}^{100% gen}) histogram has a large part of its right-hand tail outside the χ²(\vec{y}^{prop}) histogram area. This means that in SR_gen there are vectors that could not be obtained by our MCM procedure. This is an indication that the obtained scatter region is non-convex, or that the distribution is asymmetric in the parameter space.

Clearly, using the 100% quantile area of SR_prop to calculate the second moment of the empirical distribution will give an ellipsoid possibly containing vectors outside the fit stability region.

We tried the 50% quantile area and obtained the results presented in Fig. 3, where the situation is much better. The χ²(\vec{y}^{50% gen}) histogram now has a much larger overlap with the χ²(\vec{y}^{prop}) histogram, and its maximum is close to the left edge of SR_prop, with the main body inside it. We have no good enough idea of how to embed a 35-dimensional ellipsoid of maximal possible volume (to get as large a coverage probability as possible) into such an SR_prop. Nevertheless, we have played with the quantile values and obtained the more hopeful situation plotted in Fig. 4.


Fig. 3. Distributions of Δχ²(y_j^prop) (gray histogram) and Δχ²(y_j^{50% gen}) (hard gray histogram). The curve is the fitted χ²(ν = 34.92...) (the gray part of the curve corresponds to coverage probability ≤ 0.95).

Fig. 4. Distributions of Δχ²(y_j^prop) (gray histogram) and Δχ²(y_j^{68.5% gen}) (hard gray histogram). The curve is the fitted χ²(ν = 34.2...) (the gray part corresponds to coverage probability ≤ 0.95).


In the last case we can claim a more compact result: instead of a file with the whole set of propagated vectors, we propose to present a covariance matrix U^{ref}_{68.5%} constructed on the 0.685 quantile of the whole propagated sample and, as the parameters' multivariate probability distribution function, the normal distribution N(\vec{y}^{ref}, U^{ref}_{68.5%}). Now the reliability indicator is not so good, because we use the poorer statistics of SR_gen:

RLEV(N^{gen}_{stop}) = 0.999 \times 0.685 \times 0.765 = 0.52.

The last factor in the RLEV (0.765) is the coverage probability forced to keep all generated vectors inside SR_prop for sure.

Acknowledgements

This work is supported in part by the Russian Foundation for Basic Re-

search (RFBR) grants 14-07-00362 and 14-07-00950.

References

[1] J. Beringer et al. [Particle Data Group], Review of Particle Physics, Phys. Rev. D86 (2012) 010001, http://pdg.lbl.gov/2013/hadronic-xsections/hadron.html
[2] K.A. Olive et al. [Particle Data Group], Review of Particle Physics, Chin. Phys. C38 (2014) 090001, http://pdg.lbl.gov/2014/hadronic-xsections/hadron.html
[3] JCGM 100:2008, Evaluation of measurement data – "Guide to the expression of uncertainty in measurement", http://www.bipm.org/utils/common/documents/jcgm/JCGM_100_2008_E.pdf
[4] JCGM 101:2008, Evaluation of measurement data – Supplement 1 to the "Guide to the expression of uncertainty in measurement" – Propagation of distributions using a Monte Carlo method, http://www.bipm.org/utils/common/documents/jcgm/JCGM_101_2008_E.pdf
[5] JCGM 102:2011, Evaluation of measurement data – Supplement 2 to the "Guide to the expression of uncertainty in measurement" – Extension to any number of output quantities, http://www.bipm.org/utils/common/documents/jcgm/JCGM_102_2011_E.pdf


Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. 116–123)

A NEW APPROACH FOR THE MATHEMATICAL ALIGNMENT OF MACHINE TOOL-PATHS ON A FIVE-AXIS MACHINE AND ITS EFFECT ON SURFACE ROUGHNESS

SALIM BOUKEBBAB1,*, JULIEN CHAVES-JACOB2 1 Laboratoire Ingénierie des Transports et Environnent,

Faculté des Sciences de la Technologie Université Constantine 1, Campus Universitaire Zarzara,

25000 Constantine, Algérie Tel: +213 (0)31 81 90 66

*E-mail : [email protected]

JEAN-MARC LINARES2,†, NOUREDDINE AZZAM1,2 2Aix Marseille Université

CNRS - UMR 7287 13288 Marseille Cedex 9, France

Tel: + 33 (0) 4 42 93 90 96 †E-mail : [email protected]

This paper proposes a procedure to adapt the geometry of the toolpath in order to remove a constant thickness on a five-axis machine. The aim of this work is to contribute to the automation of prosthesis machining, mainly in the preparation of the surface for polishing. The proposed method can deform and adapt a toolpath to respect the geometry of the manufactured surface. The method is based on three steps: alignment, deformation and smoothing of the toolpath. In the alignment step, a mapping is carried out between the measured surface of the prosthesis and the nominal toolpath using the Iterative Closest Point (ICP) algorithm. The aligned toolpath is then deformed in two steps. The first step is the projection of the aligned points onto the measured surface (defined by an STL file). In the second step, these points are offset by a value (ap) to obtain the required geometry. During the deformation step a meshed surface is used, which reduces the smoothness of the deformed toolpath. Experimental tests on industrial prostheses are conducted to validate the effectiveness of this method. During these tests the effects of the smoothing methods on the surface quality of the machined parts are presented.

1. Introduction

The surface quality of surgical implants is one of the most important properties to be controlled in their design and manufacture. The polishing operation represents the final action in the production cycle to improve the quality of implant surfaces. Generally, a knee prosthesis is constituted of three parts. Two


metal parts are fixed, one on the femur and one on the tibia. The third part is intercalated between the two metallic parts and is made of a very strong, resistant plastic called polyethylene, which improves the knee slip [1].

To reduce the removed bone volume, the knee prosthesis thickness is reduced. Because of this small thickness, the part is prone to deformations due to the foundry process [2]. The geometry has a small influence on the lifespan of the prosthesis, because the intercalated polyethylene part will deform to compensate for geometry errors of the femoral part, which is commonly made of cobalt–chromium alloy. On the other hand, surface discontinuities and surface quality (roughness and waviness) have a major influence on the lifespan of the prosthesis; this implies that a very accurate surface quality must be achieved and the thickness of the prosthesis ensured in order to avoid prosthesis failure. When CNC machines are used to polish these functional surfaces, the polishing force is not controlled, because usual CNC machines drive the position and not the applied force. This requires a geometrical adaptation of the machining toolpath to each rough workpiece [3]. In manual polishing, the operator uses his eyes to adapt his toolpath.

In the proposed method, a three-dimensional measurement is needed to obtain the geometry of the rough part produced by the foundry process. An STL model is generated after this measurement process; it should be noted that the STL format is obtained by a triangulation of the real workpiece after the acquisition step. The initial tool trajectory is calculated by CAM (Computer Aided Manufacturing) software. It is defined on the nominal model given by the CAD (Computer Aided Design) software, with respect to toolpath synchronization (figure 1). This makes it possible to avoid traces on the manufactured surface and thus to avoid building a CAD model of each deformed part and remaking a special CNC program [4-5].

Figure 1: Tool-path generated using the CAD model of the knee prosthesis (femoral condyle).

The main objective of this research work is to modify a machining trajectory calculated on a nominal model in order to remove a constant thickness over a rough


surface of a part coming from the foundry. In this paper, the case of the femoral component of knee prostheses (femoral implant: condyles) is studied. The CNC toolpaths are made only on the upper part of the knee condyle.

2. Description of the developed procedure

This study proposes a method to adapt the geometry of the toolpath with the aim of removing a constant thickness. As presented in the introduction, this case arises in the machining process of the femoral component of knee prostheses. Figure 2 illustrates the stages of this method. The proposed toolpath deformation method is composed of three stages: measured surface alignment, toolpath deformation and toolpath smoothing. Each of these three items is studied in relation to the bibliography.

Figure 2: The stages of the method.

The deformation of the toolpath is performed in three steps: a) aligning the toolpath (computed on the nominal model) and the STL model of the rough surface using the ICP algorithm; b) deformation of the toolpath; c) smoothing of the deformed toolpath.


3. The alignment process using ICP algorithm

The alignment process using the ICP algorithm begins with the measurement of the rough surface, which must be aligned with the nominal toolpath. The ICP algorithm is a well-known method for registering a 3D set of points to a 3D model [6]. It will be noted that the successive coordinates of the drive point, expressed in the coordinate system of the workpiece, give the nominal toolpath. Some CAM software options allow expression of the toolpath at the cutter contact point [3]. Subsequently, these coordinates are noted PCC(xi, yi, zi) and the tool axis direction u. On the other hand, an STL file defines the measured surface [7]. It is composed of vertices, edges and triangular facets. Each facet has a normal vector, n. It should be noted here that P'CC(xi, yi, zi) is the vertical projection of PCC(xi, yi, zi) on a triangular facet. A rigid transformation [Tt] consists of the rotation matrix [R] and the translation vector T, giving the iterative transformation of Eq. 1.

P’CC(xi,yi,zi) = [R]× PCC(xi,yi,zi) + T (1)

The transformation is calculated with the aim of displacing the nominal toolpath onto the measured surface. The algorithm minimizes the sum of squared residual errors between the set of points and the model, and finds a registration that is locally the best fit using the least-squares method, Eq. 2.

f([R], T) = \frac{1}{N_s} \sum_{i=1}^{N_s} \left( P'_{CC_i} - T_t(P_{CC_i}) \right)^2 (2)
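A minimal point-to-point ICP sketch (an illustration only, not the authors' implementation): at each iteration the toolpath points are matched to their nearest measured points and the best rigid transform [R], T is obtained in closed form from the SVD of the cross-covariance, in the spirit of Eqs. (1)-(2). The data below are synthetic.

```python
# Minimal ICP sketch: nearest-neighbour matching + closed-form rigid transform.
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(P, Q):
    """Least-squares rotation R and translation T mapping points P onto Q."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    U, _, Vt = np.linalg.svd((P - cP).T @ (Q - cQ))
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:            # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cQ - R @ cP

def icp(source, target, n_iter=30):
    tree = cKDTree(target)
    P = source.copy()
    for _ in range(n_iter):
        _, idx = tree.query(P)          # closest measured point for each P_CC
        R, T = best_rigid_transform(P, target[idx])
        P = P @ R.T + T                 # Eq. (1): P' = R * P + T
    return P

# synthetic example: a rotated/translated patch of the measured point cloud
rng = np.random.default_rng(2)
target = rng.uniform(0, 10, (500, 3))
ang = 0.1
Rz = np.array([[np.cos(ang), -np.sin(ang), 0],
               [np.sin(ang),  np.cos(ang), 0],
               [0, 0, 1]])
source = (target[:200] - 5) @ Rz.T + 5 + np.array([0.3, -0.2, 0.1])
aligned = icp(source, target)
```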

4. Deformed toolpath and offset step

After the alignment phase, the toolpath is deformed in two steps (figure 3). In the first one, the aligned points are projected onto the measured surface (STL model). In the second step, these points are offset by a value (ap) to obtain the required geometry. These steps are detailed below.

Figure 3: Deformation of the nominal toolpath


4.1. Projection of aligned points

Firstly, all the points of the trajectory P'CC(xi, yi, zi) are projected onto the facets of the STL model. A test is carried out to verify whether the projection lies inside the triangle or not. The distance between P'CC(xi, yi, zi) and a triangular element of the STL model (figure 3) is determined using Eq. 3. The triangle vertices are denoted N1, N2 and N3. Eq. 4 is used to calculate the point PCC_def(xi, yi, zi).

Ei = PCCiN1 · n (3)

OPCC_def(xi, yi, zi) = OP'CC(xi, yi, zi) + Ei · n (4)

where n is the unit normal vector of the triangular element and Ei is the distance between P'CC(xi, yi, zi) and PCC_def(xi, yi, zi).

4.2. Offsetting the toolpath after projection

The projected toolpath is offset by a quantity ap, the depth of cut inside the material (figure 3). Eq. 5 is used to determine the points PCCi_def_dec(xi, yi, zi).

OPCCi_def_dec(xi, yi, zi) = OP'CC(xi, yi, zi) + (Ei – ap) · n (5)

It will be noted that, on a meshed surface (planar elements), the local normal is subject to discontinuous variations along a toolpath. These variations induce discontinuities in the deformed toolpath [3]. This deformation induces oscillations, principally in the machine axes, and these are observed on the manufactured surface because the initial trajectory is far from the target surface (figure 4). To resolve this impediment, section 5 proposes a method to smooth the toolpath within a pre-assigned tolerance.
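A minimal sketch of the deformation step of Eqs. (3)-(5), assuming a single facet with vertices N1, N2, N3 (the facet search and inside-triangle test are omitted): the point is projected along the facet normal and then offset by the depth of cut ap.

```python
# Minimal sketch of Eqs. (3)-(5): projection onto a facet plane and offset by a_p.
import numpy as np

def deform_point(p_cc, N1, N2, N3, a_p):
    n = np.cross(N2 - N1, N3 - N1)
    n = n / np.linalg.norm(n)            # unit facet normal
    E = np.dot(N1 - p_cc, n)             # Eq. (3): signed distance to the facet plane
    p_def = p_cc + E * n                 # Eq. (4): projection onto the facet
    return p_cc + (E - a_p) * n          # Eq. (5): projected point offset by the depth of cut

N1 = np.array([0.0, 0.0, 0.0])
N2 = np.array([1.0, 0.0, 0.0])
N3 = np.array([0.0, 1.0, 0.2])
print(deform_point(np.array([0.3, 0.3, 0.5]), N1, N2, N3, a_p=0.05))
```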

Figure 4: Discontinuities observed on the deformed toolpath.

The generation of the toolpath from the STL model generates disturbances of the deformed trajectory, then decelerations of the machine and defects on the part. These phenomena are harmful with respect to production and to surface quality.


Toolpath smoothing is carried out to improve surface quality after the deformation step. Several smoothing techniques have been developed in the literature. Some authors propose B-spline curve interpolation to smooth the nominal toolpath points [8-9].

5. Smoothing toolpath and experimental validations

The proposed smoothing method is based on smoothing axis by axis with a three-dimensional admissible tolerance IT. This method may be applied to the three axes of the toolpath or only to one. On each axis a low-degree polynomial (degree < 6) is calculated using the least-squares method. In addition, we propose the use of Bézier curves to smooth the toolpaths with the aim of obtaining a better surface quality. Tests were carried out on a femoral prosthesis. This prosthesis is a uni-compartmental knee component. The shape complexity of these surfaces requires machining on a multi-axis CNC machine, in this case the five-axis "ULTRASONIC 20 linear". A Siemens 840D CNC was used to carry out the tests. The measurement of the manufacturing time gives an idea of the effectiveness of the smoothing technique and makes it possible to select the smoothest toolpath trajectory, as shown in figure 5. It shows clearly that the proposed technique of smoothing by Bézier curves offers a better fluidity.

Figure 5: Experimental testing and validation.

The machined surfaces were measured with an optical coordinate measuring machine. Figure 6 presents the obtained results. This figure illustrates the surface roughness of the machined surface and compares the total depth of the surface profile in microns [µm].


Figure 6: The surface roughness measurement results.

From the machining experiments and the results of the roughness measurements, we can conclude that the best strategy is smoothing along the three machine axes X, Y, Z.

6. Conclusion

In this paper a method to adapt a toolpath to a geometrical target in order to remove a constant thickness on a rough surface was proposed. This case is commonly present in the production of knee prostheses. An STL model is generated after the measurement process. The toolpath deformation method starts by aligning the measured surface and the nominal toolpath. After this, a deformation


toolpath method is proposed to remove a constant thickness on the rough surface. However, the use of a meshed model to deform the toolpath induces a systematic effect (the appearance of pattern marks) on the manufactured surface. To resolve this problem, a toolpath smoothing method was developed. To validate the usefulness of the presented method and its effects on the machined surface quality, industrial tests were carried out and analysed, leading to the selection of an optimal smoothing method.

References

1. Gacon G, Hummer J (2006). Les prothèses tricompartimentaires du genou de première intention. Techniques opératoires. Problèmes et solutions. Collection GECO. Springer-Verlag France, Paris.

2. Lison D, Lauwerys R, Demedts M, Nemery B. (1996), Experimental research into the pathogenesis of cobalt/hard metal lung disease, European Respiratory Journal 9: 1024–1028. doi: 10.1183/09031936.96.09051024

3. Azzam N., Chaves-Jacob J. Boukebbab S, Linares J.M. (2014), Adaptation of machining toolpath to distorted geometries: application to remove a constant thickness on rough casting prosthesis, International Journal of Advanced Manufacturing Technology. DOI 10.1007/s00170-014-5738-2

4. Hu Gong, Li-Xin Cao, Jian Liu (2005), Improved positioning of cylindrical cutter for flank milling ruled surfaces, Computer-Aided Design 37 1205–1213.

5. Chaves-Jacob J, Linares J-M, Sprauel J-M (2013), Improving tool wear and surface covering in polishing via toolpath optimization, Journal Of Materials Processing Technology 213/10 : 1661-1668. Doi: 10.1016/j.jmatprotec.2013.04.005

6. S. Rusinkiewicz and M. Levoy, Efficient Variants of the ICP algorithm, in Proceeding of the 3rd IEEE International Conference on 3-D Digital Imaging and Modeling, Quebec, 2001, pp. 145–152

7. Boukebbab S, Bouchenitfa H, Boughouas H, Linares JM (2007), Applied Iterative Closest Point algorithm to automated inspection of gear box tooth, International Journal of Computers & Industrial Engineering 52: 162-173. doi: 10.1016/j.cie.2006.12.001.

8. Can A, Ünüvar A. (2010), Five-axis tool path generation for 3D curves created by projection on B-spline surfaces, International Journal of Advanced Manufacturing Technology. Vol. 49. pp 1047–1057.

9. Pechard P-Y, Tournier C, Lartigue C. Lugarini J-P. (2009), Geometrical deviations versus smoothness in 5-axes high–speed flank milling, International Journal of Machine Tools & Manufacture. Vol. 49. pp 454–461.


Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. – )

GOODNESS-OF-FIT TESTS FOR ONE-SHOT DEVICE

TESTING DATA∗

E. V. CHIMITOVA

Department of Theoretical and Applied Informatics, Novosibirsk State Technical

University, Novosibirsk, Russia

E-mail: [email protected]

N. BALAKRISHNAN

Department of Mathematics and Statistics, McMaster University,

Hamilton, Ontario, Canada

E-mail: [email protected]

In this paper, we propose a formal goodness-of-fit testing procedure for current status data, in which each observation in a sample is either left censored or right censored. Such data are a particular case of interval-censored data. We consider four different statistics for the purpose of testing goodness-of-fit in this set-up: a chi-square type statistic based on the difference between the observed and expected numbers of failures at each inspection time; two statistics based on the difference between the nonparametric maximum likelihood estimator of the lifetime distribution obtained from the observed current status data and the distribution under the null hypothesis; and finally, a statistic based on White's idea of comparing two consistent estimators of the Fisher information.

Keywords: One-shot devices; current status data; nonparametric maximum

likelihood estimator; goodness-of-fit tests; Monte Carlo simulations.

1. Introduction

In reliability analysis, testing one-shot devices at specific inspection times

results in total destruction of tested devices. The status of a device is re-

ported in this case instead of an actual failure time of the device. Each

failure time here is either left censored, a case when the test outcome is a

failure (that is, the lifetime is less than the inspection time), or right cen-

sored, a case when the test outcome is a success (that is, the lifetime is more

∗This research has been supported by the Russian Ministry of Education and Science (project 2.541.2014K).


than the inspection time). Such data arise, for example, in tests of electro-

explosive devices, military weapons and automobile air bags. In Refs. 1–3,

the EM-algorithm has been developed for the determination of maximum

likelihood estimates (MLE) of the parameters of exponential, Weibull and

gamma distributions based on such data.

If there is no prior information about the distribution of the corre-

sponding random variable, then nonparametric methods are used for the

estimation of the underlying lifetime distribution function. In Refs. 6,7, the

nonparametric MLE for current status data with competing risks has been

discussed when inspection times are random.

One-shot device testing data and current status data are, in fact, ex-

treme cases of interval-censoring. There are only a few approaches for test-

ing the goodness-of-fit in the literature for interval-censored data. Among

them is the leveraged bootstrap (see Ref. 9), which is based on pseudo

complete samples, drawn from a nonparametric estimate of distribution

function. In Ref. 8, a test, which requires a series of nested alternative

models, has been discussed. In Ref. 5, a sampling-based chi-square test for

interval-censored data has been proposed. Classical goodness-of-fit tests are

not applicable for these data, as there is no lifetime observation made in

the sample, and so it is impossible to construct the empirical distribution

function or the Kaplan-Meier estimator of the reliability function.

In this paper, we study the statistical properties of MLEs of model

parameters determined from current status data depending on the choice of

inspection times. Subsequently, we propose different statistics for testing the

composite hypotheses of goodness-of-fit and compare these tests in terms

of power by using Monte Carlo simulations.

2. Maximum likelihood estimates from current status data

Let us consider a reliability experiment, in which n one-shot devices are

tested at k different inspection times t1, t2, ..., tk. Let ni denote the number

of devices tested at the i-th inspection time, with \sum_{i=1}^{k} n_i = n. During

the experiment, some tested devices fail, resulting in left censored obser-

vations, while the other devices operate successfully, resulting in right cen-

sored observations. Thus, the obtained random sample can be presented in

the following form:

X = (ti, ri, ni), i = 1, ..., k, (1)

where ri is the number of failures at inspection time ti, with ri ≤ ni.


Suppose each device has a random lifetime T with cumulative distri-

bution function F (t; θ). The loglikelihood function for the sample in (1) is

then given by

\ln L(X; \theta) = \sum_{i=1}^{k} \left[ r_i \ln F(t_i; \theta) + (n_i - r_i) \ln(1 - F(t_i; \theta)) \right]. (2)

Maximum likelihood estimates of the model parameters are then obtained

by maximizing (2) with respect to the unknown parameter θ.

The MLE so determined has the following properties for regular distri-

butions F (t; θ):

• asymptotic unbiasedness: \lim_{n\to\infty} E(\hat\theta_n) = \theta;
• consistency: \hat\theta_n \xrightarrow{P} \theta;
• efficiency: D\hat\theta_n = I_n^{-1}(\theta), where I_n(\theta) is the Fisher information;
• asymptotic normality: \sqrt{n}(\hat\theta_n - \theta) \xrightarrow{P} \xi \sim N(0, I_n^{-1}(\theta)).

The Fisher information matrix of the vector parameter θ in sample (1)

has the following form:

I_n(\theta) = \sum_{i=1}^{k} \frac{n_i}{F(t_i; \theta)\,(1 - F(t_i; \theta))}\, \frac{\partial F(t_i; \theta)}{\partial\theta}\, \frac{\partial F(t_i; \theta)}{\partial\theta^{T}}.

If there is prior information about the lifetime distribution, then it is possi-

ble to find the optimal inspection times and the number of devices tested in

each inspection time as a result of maximization of some objective function

based on the Fisher information matrix In(θ).

Let us consider the inspection times and the numbers of devices as a

discrete normalized design of experiment

\varepsilon_k = \begin{pmatrix} t_1 & t_2 & \dots & t_k \\ p_1 & p_2 & \dots & p_k \end{pmatrix},

where \sum_{i=1}^{k} p_i = 1 and p_i = n_i/n. Then, the optimal design \varepsilon_k^* can be found by solving the following optimization problem:

\det I_n(\theta) \to \max_{t_1, \dots, t_k;\; n_1, \dots, n_k} (3)

under the conditions 0 < t_1 < t_2 < \dots < t_k, \sum_{i=1}^{k} n_i = n, n_i \in \mathbb{N}, i = 1, 2, \dots, k.

We have obtained the solution of the conditional optimization problem

in (3) for different lifetime distributions:


• in the case of the exponential distribution with density function f(t; \theta) = \frac{1}{\theta_1}\exp\!\left(-\frac{t}{\theta_1}\right), the optimal design consists of a single point (a numerical check of this value is sketched after this list):

\varepsilon_1^* = \begin{pmatrix} 1.594\,\theta_1 \\ 1 \end{pmatrix};

• in the case of the Weibull distribution with density function f(t; \theta) = \frac{\theta_2}{\theta_1}\left(\frac{t}{\theta_1}\right)^{\theta_2-1}\exp\!\left(-\left(\frac{t}{\theta_1}\right)^{\theta_2}\right), the optimal design consists of two inspection points:

\varepsilon_2^* = \begin{pmatrix} (0.262\,\theta_1)^{1/\theta_2} & (2.665\,\theta_1)^{1/\theta_2} \\ 0.5 & 0.5 \end{pmatrix};

• in the case of the lognormal distribution with density function f(t; \theta) = \frac{1}{\sqrt{2\pi}\,\theta_2 t}\exp\!\left(-\frac{1}{2\theta_2^2}\ln^2\frac{t}{\theta_1}\right), the optimal design is also invariant relative to the values of the parameters:

\varepsilon_2^* = \begin{pmatrix} \theta_1\cdot 0.320^{\theta_2} & \theta_1\cdot 3.121^{\theta_2} \\ 0.5 & 0.5 \end{pmatrix};

• in the case of the gamma distribution with density function f(t; \theta) = \frac{1}{\theta_1\Gamma(\theta_2)}\left(\frac{t}{\theta_1}\right)^{\theta_2-1}\exp\!\left(-\frac{t}{\theta_1}\right), the optimal inspection times are not invariant relative to the value of the shape parameter. When \theta_2 = 1 and treated as unknown, the optimal design is

\varepsilon_2^* = \begin{pmatrix} 0.140\,\theta_1 & 2.205\,\theta_1 \\ 0.5 & 0.5 \end{pmatrix}.
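The exponential single-point design above can be checked numerically with the following sketch (assumed, not the authors' code): for one inspection time t and n devices, the Fisher information about θ₁ is proportional to (t/θ₁)² e^{−t/θ₁}/(1 − e^{−t/θ₁}), and its maximum is located on a grid.

```python
# Numerical check of the exponential optimal design: maximize the single-point
# Fisher information I(theta1) = n (dF/dtheta1)^2 / (F (1 - F)), F = 1 - exp(-t/theta1).
import numpy as np

theta1 = 1.0                                 # any positive scale; optimum scales with theta1
a = np.linspace(0.01, 10.0, 100000)          # a = t / theta1
F = 1.0 - np.exp(-a)
dF = (a / theta1) * np.exp(-a)               # |dF/dtheta1| = (t/theta1^2) exp(-t/theta1), here theta1 = 1
info = dF ** 2 / (F * (1.0 - F))             # Fisher information per device (up to the factor n)
print("optimal t/theta1 =", a[np.argmax(info)])   # expected approximately 1.594
```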

It is possible to evaluate the precision of the MLE based on current

status data in comparison with the MLE based on complete data using the

ratio det In(θ)/det I∗n(θ), where I∗n(θ) is the Fisher information in a com-

plete sample. The values of this ratio for different numbers of inspection

points are given in Table 1. In the cases of non-optimal design, the inspection times were taken to obtain equiprobable intervals between them, t_i = F^{-1}\!\left(\frac{i}{k+1}\right), and the numbers of devices tested at each inspection time were taken to be equal.

As can be seen from Table 1, the choice of optimal inspection points

results in increasing the precision of the MLE based on current status data

for all considered distributions. It is interesting to note that in the case of

exponential distribution, one-shot device testing data with only one optimal


inspection point contains about 65% of the Fisher information in a complete sample of the same size.

Table 1. Values of the ratio det I_n(θ)/det I*_n(θ) for some lifetime distributions

k               Exponential  Weibull  Lognormal  Gamma (θ2 = 1)
Optimal design  0.6476       0.0638   0.0993     0.0361
k = 1           0.4805       -        -          -
k = 2           0.4661       0.0300   0.0328     0.0112
k = 5           0.4442       0.0535   0.0579     0.0199
k = 10          0.4298       0.0605   0.0648     0.0224
k = 20          0.4193       0.0626   0.0666     0.0230
k = 50          0.4110       0.0627   0.0663     0.0229

It should be mentioned that the optimal design problem has been dis-

cussed here in terms of determinant of the Fisher information matrix. How-

ever, in the case of one-shot-device testing, it will be of natural interest to

discuss the optimal design problem in estimating the reliability of test units

at a specific mission time as done in Ref. 4.

3. Goodness-of-fit tests

The most commonly used approach in testing the goodness-of-fit is based on

a distance between some nonparametric estimator of the distribution func-

tion obtained from the sample of observations and the distribution function

under test. In the case of current status data, one can use the nonparamet-

ric maximum likelihood estimator (NPMLE), which can be obtained by

solving the following optimization problem:

\sum_{i=1}^{k} \left[ r_i \ln F_i + (n_i - r_i) \ln(1 - F_i) \right] \to \max_{F_1, F_2, \dots, F_k} (4)

under the conditions 0 ≤ F_1 ≤ F_2 ≤ ... ≤ F_k ≤ 1, where F_1, ..., F_k denote the values of the unknown distribution function at the inspection times, F(t_1), ..., F(t_k), respectively.

The solution of the optimization problem in (4) can be obtained by

adopting the following algorithm:

(1) Calculate the initial values F_i = r_i/n_i;
(2) If the obtained values F_1, ..., F_k satisfy the conditions F_1 ≤ F_2 ≤ ... ≤ F_k, then the solution is found, else go to Step (3);
(3) Find the minimal index i < k for which F_i > F_{i+1};
(4) Recalculate the values F_i, ..., F_{i+m}, which satisfy the inequalities F_i > F_{i+1} ≥ ... ≥ F_{i+m}, in the following way:

F_i = \dots = F_{i+m} = \frac{\sum_{j=i}^{i+m} r_j}{\sum_{j=i}^{i+m} n_j};

(5) Repeat Steps (3) and (4) until the conditions F_1 ≤ F_2 ≤ ... ≤ F_k are satisfied.

Then, the NPMLE of the unknown distribution function F(t) from current status data can be expressed as follows:

\hat{F}(t) = \begin{cases} 0, & t < t_1, \\ F_1, & t_1 \le t < t_2, \\ \;\vdots \\ F_k, & t_k \le t. \end{cases}
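A minimal sketch (assumed, not the authors' code) of the algorithm above, implemented as pooling of adjacent violators; the example data r_i, n_i are invented.

```python
# Minimal NPMLE sketch for current status data: start from r_i/n_i and pool
# adjacent violators until the estimates are non-decreasing (steps (1)-(5)).
import numpy as np

def npmle_current_status(r, n):
    """Return F(t_1), ..., F(t_k) maximizing the loglikelihood in (4)."""
    r, n = np.asarray(r, float), np.asarray(n, float)
    F = r / n                                    # step (1): initial values
    blocks = [[i] for i in range(len(F))]        # pooled index blocks
    vals = list(F)
    i = 0
    while i < len(vals) - 1:
        if vals[i] > vals[i + 1]:                # step (3): violator found
            blocks[i] += blocks.pop(i + 1)       # step (4): pool the adjacent block
            vals[i] = r[blocks[i]].sum() / n[blocks[i]].sum()
            vals.pop(i + 1)
            i = max(i - 1, 0)                    # re-check to the left
        else:
            i += 1
    out = np.empty(len(F))                       # steps (2)/(5): non-decreasing result
    for b, v in zip(blocks, vals):
        out[b] = v
    return out

# example: 5 inspection times, n_i devices tested, r_i observed failures
print(npmle_current_status(r=[3, 2, 10, 9, 18], n=[20, 20, 20, 20, 20]))
```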

Let us now consider the following statistics for testing the goodness-of-fit hypothesis H_0: F(t) ∈ {F_0(t; θ), θ ∈ Θ}. The chi-square type statistic can be written as

X_n^2 = \sum_{i=1}^{k} \frac{(e_i^n - e_i)^2}{e_i}, (5)

where e_i^n = n_i \hat{F}(t_i) is the empirical number of failures at the i-th inspection time, and e_i = n_i F_0(t_i; \hat\theta_n) is the expected number of failures at the i-th inspection time, with \hat\theta_n being the MLE of the unknown parameter.

The Kolmogorov type statistic can be defined as

D_n = \sup_{0 < t < \infty} \left| \hat{F}(t) - F_0(t; \hat\theta_n) \right|, (6)

and the ω² type test statistic can be written as

W_n^2 = \int_{0}^{\infty} \left( \hat{F}(t) - F_0(t; \hat\theta_n) \right)^2 dF_0(t; \hat\theta_n). (7)

In Ref. 10, White proposed a unified framework for detection of model misspecification when maximum likelihood techniques are used. We utilize here his idea to construct a goodness-of-fit test for current status data. White's statistic is based on the comparison of two consistent estimators of the Fisher information matrix:

A_n(\hat\theta_n) = -\frac{1}{n} \sum_{i=1}^{k} \left[ r_i \frac{\partial^2 \ln F_0(t_i; \hat\theta_n)}{\partial\theta^2} + (n_i - r_i) \frac{\partial^2 \ln(1 - F_0(t_i; \hat\theta_n))}{\partial\theta^2} \right]


and

B_n(\hat\theta_n) = \frac{1}{n} \sum_{i=1}^{k} r_i \frac{\partial \ln F_0(t_i; \hat\theta_n)}{\partial\theta} \left( \frac{\partial \ln F_0(t_i; \hat\theta_n)}{\partial\theta} \right)^{T} + \frac{1}{n} \sum_{i=1}^{k} (n_i - r_i) \frac{\partial \ln(1 - F_0(t_i; \hat\theta_n))}{\partial\theta} \left( \frac{\partial \ln(1 - F_0(t_i; \hat\theta_n))}{\partial\theta} \right)^{T}.

The statistic can be defined as any function of the matrix |A_n − B_n|; for example, it could be

V_n = \frac{\left| \det A_n(\hat\theta_n) - \det B_n(\hat\theta_n) \right|}{\det B_n(\hat\theta_n)}. (8)

Evidently, the null hypothesis is rejected for large values of the proposed

statistics. The distributions of these statistics under H_0 can be obtained by a parametric bootstrap procedure, using the following algorithm.

(1) Generate current status sample of the form in (1) according to the

distribution under test F0(t; θn), where θn is the MLE obtained from

the given data;

(2) Determine the MLE of θ from the simulated current status sample;

(3) Calculate the test statistic in (5), (6), (7) or (8).

By repeating the above process N times, a random sample from the distri-

bution of the test statistic can be generated from which the required critical

values can be obtained.
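A minimal sketch (assumed) of this bootstrap for the chi-square type statistic (5) under an exponential null; for simplicity the unpooled estimates r_i are used here in place of n_i F̂(t_i), and the inspection times, sample sizes and N are illustrative.

```python
# Minimal parametric-bootstrap sketch for the chi-square type statistic under an
# exponential null F0(t; theta) = 1 - exp(-t/theta) with current status data.
import numpy as np
from scipy.optimize import minimize_scalar

t = np.array([0.5, 1.0, 1.6, 2.4, 3.5])          # inspection times
n_i = np.full(5, 40)                              # devices per inspection time

F0 = lambda t, th: 1.0 - np.exp(-t / th)

def mle_theta(r):
    """Maximize the loglikelihood (2) for the exponential model."""
    def negloglik(th):
        F = np.clip(F0(t, th), 1e-12, 1 - 1e-12)
        return -np.sum(r * np.log(F) + (n_i - r) * np.log(1 - F))
    return minimize_scalar(negloglik, bounds=(1e-3, 100.0), method="bounded").x

def chi2_stat(r, th):
    e = n_i * F0(t, th)
    return np.sum((r - e) ** 2 / e)               # statistic (5), with e_i^n approximated by r_i

rng = np.random.default_rng(0)
r_obs = rng.binomial(n_i, F0(t, 1.3))             # "observed" current status data
th_hat = mle_theta(r_obs)
stat_obs = chi2_stat(r_obs, th_hat)

N = 2000                                          # the paper uses N = 10000
boot = np.empty(N)
for b in range(N):
    r_b = rng.binomial(n_i, F0(t, th_hat))        # step (1): simulate under F0(t; th_hat)
    boot[b] = chi2_stat(r_b, mle_theta(r_b))      # steps (2)-(3)

print("p-value =", np.mean(boot >= stat_obs),
      "critical value (alpha = 0.1) =", np.quantile(boot, 0.90))
```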

We now compare the proposed goodness-of-fit tests in terms of power

determined from Monte Carlo simulations. The number of simulations used

for the distributions of statistics is N = 10000. The values of the power

of the tests are estimated with the nominal significance level α = 0.1. In

Table 2, the powers of the considered tests are presented for the composite

hypothesis H0 : the exponential distribution against competing hypotheses

corresponding to the Weibull and gamma distributions with different values

of the shape parameter. The scale parameter θ1 = 1. We also set the sample

size n = 200 and the number of inspection points k = 5.

As can be seen from Table 2, the chi-square type test turns out to be

the most preferable one, as in some cases it has the highest power among

considered tests, and in the cases when its power is not the highest, the

difference in powers compared to the best test turns out to be small.


Table 2. Estimates of the power of the proposed goodness-of-fit tests.

H1:     Weibull                          Gamma
        θ2 = 0.5  θ2 = 1.5  θ2 = 2      θ2 = 0.5  θ2 = 1.5  θ2 = 2
X2n     0.998     0.479     0.853       0.939     0.258     0.523
W2n     0.693     0.501     0.887       0.389     0.273     0.542
Dn      0.621     0.283     0.901       0.405     0.146     0.361
Vn      0.999     0.389     0.510       0.952     0.266     0.414

References

1. N. Balakrishnan and M.H. Ling, Multiple-stress model for one-shot device testing data under exponential distribution. IEEE Trans. Reliab., 61, 809-821 (2012).
2. N. Balakrishnan and M.H. Ling, Expectation maximization algorithm for one shot device accelerated life testing with Weibull lifetimes, and variable parameters over stress. IEEE Trans. Reliab., 62, 537-551 (2013).
3. N. Balakrishnan and M.H. Ling, Gamma lifetimes and one-shot device testing analysis. Reliab. Eng. System Safety, 126, 54-64 (2014).
4. N. Balakrishnan and M.H. Ling, Best constant-stress accelerated life-test plans with multiple stress factors for one-shot device testing under a Weibull distribution. IEEE Trans. Reliab., 63, 955-952 (2014).
5. M.L. Calle and G. Gomez, A sampling-based chi-squared test for interval-censored data. In Statistical Models and Methods for Biomedical and Technical Systems (Eds., F. Vonta, M. Nikulin, N. Limnios, H. Huber-Carol), Birkhauser, Boston, 2008.
6. P. Groeneboom, M.H. Maathuis and J.A. Wellner, Current status data with competing risks: Consistency and rates of convergence of the MLE. Ann. Stat., 36, 1031-1063 (2008).
7. N.P. Jewell, M.J. Van der Laan and T. Henneman, Nonparametric estimation from current status data with competing risks. Biometrika, 90, 183-197 (2003).
8. R. Nysen, M. Aerts and C. Faes, Testing goodness of fit of parametric models for censored data. Stat. Med., 31, 2374-2385 (2012).
9. J.J. Ren, Goodness of fit tests with interval censored data. Scand. J. Stat., 30, 211-226 (2003).
10. H. White, Maximum likelihood estimation of misspecified models. Econometrica, 50, 1-25 (1982).


Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. – )

CALCULATION OF COVERAGE INTERVALS:

SOME STUDY CASES

A. STEPANOV†, A. CHUNOVKINA, N. BURMISTROVA

D.I.Mendeleyev Institute for Metrology (VNIIM),

19, Moskovsky pr., 190005, St. Petersburg, Russian Federation
†E-mail: [email protected]

The aims of the work are to derive the coverage factor K0.95 for a confidence level of 95% using the Bayesian approach to uncertainty analysis, and to check whether the commonly used coverage factor value K0.95 = 2 is an appropriate choice.

1. Introduction

At present the GUM is the internationally acknowledged document on calculating measurement uncertainty. The GUM has solved its main task, namely it has provided a uniform, transparent and rather simple method of calculating the measurement uncertainty. Practically from the very moment of its publication in 1993, a certain internal inconsistency of the GUM was noted, but to a considerable degree it was corrected in the process of developing the Supplements to it. Finally, a revision of the GUM is planned to correct this inconsistency and to harmonize it with the Supplements developed later. At the same time, one would like the GUM to remain a simple and easy-to-use document in practice. An important point is that the GUM should cover the simplest tasks of calculating the uncertainty, which frequently occur in practice. In this paper a measurement model with two input quantities, and consequently two uncertainty sources, is considered. One of them is associated with the measuring instrument in use; usually a calibration certificate contains information about the measuring instrument accuracy. The second source of uncertainty is the dispersion of the indications. Coverage factor expressions are given for a coverage probability of 0.95. Normal and uniform distributions of the indications, as well as of the random variable assigned to the systematic error of the measuring instrument, are considered. The expressions for the coverage factors are given as functions of the number of repeated indications and of the ratio of the two above-indicated sources of uncertainty.


1.1. Measurement model

Consider the following measurement model:

Y = X +B, or X = Y −B,

where X is the measurand, Y is an indication of the measuring instrument, and B is a measurement (systematic) error.

1.2. Prior information; measurement data

Information about the accuracy of a measuring instrument can be obtained

from the calibration certificate. If an instrumental uncertainty uB is given,

then the corresponding normal distribution is assigned to B:

p(b) = \frac{1}{\sqrt{2\pi}\,u_B}\exp\!\left(-\frac{b^2}{2u_B^2}\right). (1)

Fairly often the maximum permissible error θ_B of the measuring instrument is indicated. In this case a uniform distribution on the interval [−θ_B, +θ_B] (here θ_B = \sqrt{3}\,u_B) is assigned to B:

p(b) = \begin{cases} \dfrac{1}{2\sqrt{3}\,u_B}, & |b| \le \sqrt{3}\,u_B, \\[4pt] 0, & |b| > \sqrt{3}\,u_B. \end{cases} (2)

Let us take n indications y1, . . . yn for Y having normal distribution

yi ∈ N (X + b, σ) (3)

or uniform distribution

yi ∈ U(X + b− θ,X + b+ θ). (4)

Here σ or θ are parameters describing measurement precision.

Let us denote y = 1n

∑n1 yi and S2 = 1

n

∑n1 (yi − y)

2sample mean and

variance for yi, correspondingly.

1.3. Bayesian inference

In the absence of prior information about the measurement accuracy, non-informative prior pdfs¹ are assigned to the parameters σ and θ:

p(\sigma) = \sigma^{-1}, \qquad p(\theta) = \theta^{-1}.

The joint pdf is given by the formula

p(x, b, s \mid y_1, \dots, y_n) \sim L(y_1, \dots, y_n \mid x, b, s)\, p(b)\, s^{-1},

where L is the likelihood function and s is a common symbol for σ and θ. So the posterior pdf for x is

p(x \mid y_1, \dots, y_n) = C \int_{-\infty}^{+\infty} db \int_{s_0}^{+\infty} L(y_1, \dots, y_n \mid x, b, s)\, p(b)\, s^{-1}\, ds, (5)

where C is the norming factor (i.e. \int_{-\infty}^{+\infty} p(x)\,dx = C^{-1}).

Given the pdf for the measurand, one can easily calculate a critical point α_{0.95} such that \int_{-\alpha_{0.95}}^{\alpha_{0.95}} p(x)\,dx = 0.95, and, knowing the uncertainty u(x), the coverage factor can be obtained from

K_{0.95} = \frac{\alpha_{0.95}}{u(x)}, \qquad\text{where}\qquad u^2(x) = \mathrm{Var}(x) = \mathrm{Var}(y) + u_B^2.

2. Calculating coverage factor

2.1. Normal distribution for yi and normal distribution

assigned to B

Let us calculate the coverage factor in the case when the measuring instrument indications are distributed normally and the normal distribution (1) is assigned to the systematic error of the measuring instrument. Use of (3) leads to the following expression for the likelihood function:

L(y_1, \dots, y_n \mid x, b, \sigma) \sim \frac{1}{\sigma^n}\exp\!\left(-\frac{\sum_{i=1}^{n}(y_i - x - b)^2}{2\sigma^2}\right), \qquad \sigma > 0. (6)

So the posterior pdf has the following form:

p(x) \sim \int_{-\infty}^{+\infty} db \int_{0}^{+\infty} \exp\!\left(-\frac{\sum_{i=1}^{n}(y_i - x - b)^2}{2\sigma^2}\right) \exp\!\left(-\frac{b^2}{2u_B^2}\right) \frac{d\sigma}{\sigma^{n+1}}.

Integration with respect to σ gives the pdf for the measurand:

p(x) \sim \int_{-\infty}^{+\infty} \left( n(x + b - \bar{y})^2 + (n-1)S^2 \right)^{-n/2} \exp\!\left(-\frac{b^2}{2u_B^2}\right) db,

with expectation \bar{y} and variance

\mathrm{Var}(x) = u^2(x) = \frac{n-1}{n-3}\,\frac{S^2}{n} + u_B^2, \qquad n > 3. (7)

For the transformed variable

\tilde{X} = \frac{X - \bar{y}}{S/\sqrt{n}} (8)


one can rewrite (5) as

p(\tilde{x}) = C_1 \int_{-\infty}^{+\infty} \left( \frac{1}{n-1}(\gamma z + \tilde{x})^2 + 1 \right)^{-n/2} \exp\!\left(-\frac{z^2}{2}\right) dz,

where z = b/u_B and γ is a parameter: γ = u_B\sqrt{n}/S.

So the α_{0.95} value corresponding to \tilde{x} should be calculated using this formula. Moreover,

u^2(\tilde{x}) = \frac{n-1}{n-3} + \gamma^2, \qquad K_{0.95} = \frac{\alpha_{0.95}}{u(x)} = \frac{S}{\sqrt{n}}\,\frac{\alpha_{0.95}(\tilde{x})}{u(x)} = \frac{\alpha_{0.95}(\tilde{x})}{\sqrt{\,\gamma^2 + \frac{n-1}{n-3}\,}}, \qquad n > 3.

The K0.95(γ) dependency graphs for some n are presented below, see Fig. 1.

Fig. 1. K0.95(γ) dependency.

Note that K_{0.95} does not exceed 2 and has the following asymptotics:

K_{0.95} \xrightarrow{\;\gamma\to 0\;} \sqrt{\frac{n-3}{n-1}}\,\alpha_{0.95}(t_{n-1}), \qquad K_{0.95} \xrightarrow{\;\gamma\to +\infty\;} 1.96

(α_{0.95}(t_{n-1}) is the 95% critical point of the t-distribution with n−1 degrees of freedom). The deviation of the coverage factor from 2 for n > 4 does not exceed 2%.
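The dependence K_{0.95}(γ) can be reproduced numerically with the following sketch (assumed, not the authors' code): the posterior p(x̃) is evaluated on a grid by integrating over z, the symmetric 95 % point α_{0.95} is located from the cumulative distribution, and the factor is α_{0.95}/u(x̃).

```python
# Minimal numerical sketch of Sec. 2.1: posterior on a grid, symmetric 95 % point,
# and coverage factor K_0.95 = alpha_0.95 / u(x~) as a function of gamma and n.
import numpy as np

def coverage_factor(n, gamma, x_max=60.0, nx=4001):
    z = np.linspace(-8.0, 8.0, 801)                   # integration grid for z = b/u_B
    x = np.linspace(-x_max, x_max, nx)
    integrand = (1.0 + (gamma * z[None, :] + x[:, None]) ** 2 / (n - 1)) ** (-n / 2) \
        * np.exp(-z[None, :] ** 2 / 2)
    p = np.trapz(integrand, z, axis=1)
    p /= np.trapz(p, x)                               # normalize the posterior
    cdf = np.cumsum(p) * (x[1] - x[0])
    alpha = x[np.searchsorted(cdf, 0.975)]            # symmetric 95 % critical point
    u = np.sqrt(gamma ** 2 + (n - 1) / (n - 3))       # u(x~) from the text above
    return alpha / u

for n in (4, 6, 10):
    print(n, [round(coverage_factor(n, g), 3) for g in (0.1, 1.0, 5.0)])
```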


2.2. Uniform distribution for yi and normal distribution

assigned to B

Consider at first a particular task: obtaining the pdf for a measuring instrument indication in the case where there exists a series of repeated indications having the uniform distribution with unknown interval bounds, i.e. y_i ∈ U(X − θ, X + θ), where θ is unknown. The likelihood function and the joint pdf have the form

L(y_1, \dots, y_n \mid x, \theta) \sim \theta^{-n}, \qquad p(x, \theta \mid y_1, \dots, y_n) \sim \theta^{-(n+1)}.

Denote by y_{min} and y_{max} the minimum and maximum of the y_i, correspondingly, and let

\bar{y} = \frac{1}{2}(y_{min} + y_{max}), \qquad r = y_{max} - y_{min} > 0.

As

X - \theta \le y_{min} \le y_{max} \le X + \theta, \qquad \theta \ge A = \max\{X - y_{min},\, y_{max} - X\},

integration of the joint pdf with respect to θ leads to

p(x \mid y_1, \dots, y_n) \sim \int_{A}^{+\infty} \theta^{-(n+1)}\, d\theta = n^{-1} A^{-n}.

Here A = \begin{cases} x - y_{min}, & x \ge \bar{y}; \\ y_{max} - x, & x < \bar{y}, \end{cases} so

p(x \mid y_1, \dots, y_n) \sim \begin{cases} (x - y_{min})^{-n}, & x \ge \bar{y}; \\ (y_{max} - x)^{-n}, & x < \bar{y} \end{cases} \;\sim\; \left( |x - \bar{y}| + \frac{r}{2} \right)^{-n}

or, after normalization,

p(x \mid y_1, \dots, y_n) = \frac{(n-1)\,r^{n-1}}{(2|x - \bar{y}| + r)^n}, \qquad E(x) = \bar{y}, \qquad \mathrm{Var}(x) = \frac{r^2}{2(n-2)(n-3)}.

Let us now return to the initial task and suppose that the y_i are distributed in compliance with (4) (assuming the normal distribution (1) for B). In this case

L(y_1, \ldots, y_n \mid x, b, \theta) \sim \theta^{-n},

p(x) = p(x \mid y_1, \ldots, y_n) \sim \int_{-\infty}^{+\infty} db \int_{A}^{+\infty} \exp\!\left(-\frac{b^2}{2u_B^2}\right) \frac{d\theta}{\theta^{n+1}}, \qquad A = \max\{X + B - y_{\min},\; y_{\max} - X - B\};

the pdf p(x) has the form:

p(x) \sim \int_{-\infty}^{+\infty} \left( 2|x + b - \bar{y}| + r \right)^{-n} \exp\!\left(-\frac{b^2}{2u_B^2}\right) db,


and

E(x) = \bar{y}, \qquad \mathrm{Var}(x) = \frac{r^2}{2(n-2)(n-3)} + u_B^2.

For the transformed variable \tilde{X} = \frac{X - \bar{y}}{r/\sqrt{n}}, the pdf has the form

p(\tilde{x}) = C_2 \int_{-\infty}^{+\infty} \left( \frac{2}{\sqrt{n}}|\mu z + \tilde{x}| + 1 \right)^{-n} \exp\!\left(-\frac{z^2}{2}\right) dz,

where µ is a parameter: \mu = \frac{u_B\sqrt{n}}{r}; the value of α_{0.95} should be calculated using the latter formula for p(\tilde{x}). So

u^2(\tilde{x}) = \frac{n}{2(n-2)(n-3)} + \mu^2, \qquad K_{0.95} = \frac{\alpha_{0.95}}{u(\tilde{x})} = \frac{\alpha_{0.95}}{\sqrt{\mu^2 + \frac{n}{2(n-2)(n-3)}}}, \qquad n > 3.

The K0.95(µ) dependency graphs for some n are presented below (see

Fig. 2).

Fig. 2. K0.95(µ) dependency.

Note that K_{0.95} has the following asymptotics:

K_{0.95} \xrightarrow[\mu \to 0]{} \sqrt{\frac{(n-2)(n-3)}{2}}\left( \sqrt[n-1]{20} - 1 \right), \qquad K_{0.95} \xrightarrow[\mu \to +\infty]{} 1.96.


The deviation of the coverage factor from 2 for n > 4 does not exceed 2%.

2.3. Normal distribution for y_i and uniform distribution for B

Suppose that the indications y_i are distributed normally (according to (3)) and p(b) is given by (2). So we again have expression (6) for the likelihood function, and the posterior pdf is given by:

p(x \mid y_1, \ldots, y_n) \sim \int_{-\sqrt{3}u_B}^{\sqrt{3}u_B} db \int_{0}^{+\infty} \exp\!\left(-\frac{\sum_{i=1}^{n}(y_i - x - b)^2}{2\sigma^2}\right) \frac{d\sigma}{\sigma^{n+1}}.

Integration with respect to σ yields the following expression for p(x):

p(x) \sim \int_{-\sqrt{3}u_B}^{\sqrt{3}u_B} \left( n(x + b - \bar{y})^2 + (n-1)S^2 \right)^{-\frac{n}{2}} db.

The variance of x is again given by (7). So

p(\tilde{x}) \sim \int_{-1}^{1} \left( \frac{1}{n-1}(\tilde{x} + \nu z)^2 + 1 \right)^{-\frac{n}{2}} dz = C_3 \int_{\tilde{x}-\nu}^{\tilde{x}+\nu} \left( \frac{t^2}{n-1} + 1 \right)^{-\frac{n}{2}} dt,

where \tilde{X} is given by (8), ν is a parameter: \nu = \frac{u_B\sqrt{3n}}{S} = \frac{\theta_B\sqrt{n}}{S}, and z = \frac{b}{\sqrt{3}u_B}. The latter formula for p(\tilde{x}) is to be used for the calculation of the α_{0.95} values. So, as

\mathrm{Var}(\tilde{x}) = \frac{n-1}{n-3} + \frac{\nu^2}{3}, \qquad \text{then} \qquad K_{0.95} = \frac{\alpha_{0.95}}{u(\tilde{x})} = \frac{\alpha_{0.95}}{\sqrt{\frac{n-1}{n-3} + \frac{\nu^2}{3}}}.

The K_{0.95}(ν) dependency graphs for some n are presented below (see Fig. 3). Note that K_{0.95} does not exceed 2 and has the following asymptotics:

K_{0.95} \xrightarrow[\nu \to 0]{} \sqrt{\frac{n-3}{n-1}}\,\alpha_{0.95}(t_{n-1}), \qquad K_{0.95} \xrightarrow[\nu \to +\infty]{} 0.95\sqrt{3}.


Fig. 3. K0.95(ν) dependency.

In that case a single coverage factor cannot be recommended for different

ratios of the measuring instrument accuracy to the indications dispersion.

But for n > 4 the graphs practically coincide.

3. Conclusion

The paper deals with the simplest linear model with two input quantities:

measuring instrument indication and the systematic error of measuring in-

strument. Combinations of normal and uniform distributions describing the

repeated indications and the systematic measuring instrument error are

considered. The dependence of the coverage factor for the coverage probability of 0.95 on the ratio of the input quantities' uncertainties is analyzed.

It is shown that the coverage factor 2 can be recommended for practical

calculations if the normal pdf is assigned to the measuring instrument error.

References

1. I. Lira, W. Woger, Comparison between the conventional and Bayesian ap-proaches to evaluate measurement data, Metrologia 43 (2006) S249 – S259.


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 140–148)

APPLICATION OF NUMERICAL METHODS IN METROLOGY

OF ELECTROMAGNETIC QUANTITIES

M. CUNDEVA-BLAJER

Ss. Cyril and Methodius University, Faculty of Electrical Engineering and Information

Technologies, Karpos II, b.b., POBox 574, 1000 Skopje, R. Macedonia

E-mail: [email protected]

www.feit.ukim.edu.mk

A metrological analysis of non-linear measurement devices for electromagnetic

quantities can be correctly conducted by application of numerical methods, i.e. the finite

element method. This method is most proper for solving a system of non-linear partial

differential variable coefficients type equations, which describe the electromagnetic field

distribution in some testing devices for electromagnetic quantities. Examples discussed

are: an electrical steel sheet testing device-Epstein frame (EF) and a combined

instrument transformer (CIT). Both devices must be in metrological conformance with

IEC standards. In this contribution an application of an originally developed software

FEM-3D for metrological analysis of the EF and CIT is presented. The numerically

derived results are experimentally verified.

Keywords: Finite element method, Metrology, Electromagnetic quantities, Numerical

methods, Epstein frame, Instrument transformer

1. Introduction

The numerical simulations and calculations are an important phase of the design

and verification of measurement configurations, [1]. Different numerical

techniques serve as basis for the development of software for estimation and

reduction of measurement devices uncertainties and errors in various fields of

metrology [1-3]. In the metrology of electromagnetic quantities, some of the

testing devices are non-linear electromagnetic systems. The non-linearity is

mainly introduced by the materials. The finite element method (FEM) is shown

to be the most convenient tool for the system’s metrological non-linear analysis.

In this contribution two non-linear electromagnetic testing devices are analyzed

by FEM: an electrical steel sheet testing device-Epstein frame (EF) and a

combined instrument transformer (CIT) for measurement of high voltages and

currents, [4]. The EF must comply with the standard IEC 60404-2, [5] and the


CIT with the standard IEC 61869-4, [6]. Both objects of metrological analysis

can be treated as special transformers:

a. The EF forms an unloaded transformer with changeable magnetic core

(the electrical steel sheet under test), very often in the area of high

magnetic saturation, i.e. non-linearity;

b. The CIT comprises two transformer systems in complex electromagnetic coupling: the current measurement system, which operates in a regime close to a short circuit of the secondary winding, and the voltage measurement system, which operates in a regime of an almost unloaded transformer, i.e. the open-circuit regime. Both measurement systems are in a common housing with strong electromagnetic interaction, which significantly contributes to the metrological properties of the whole device.

Their initial study starts with the classical analytical transformer theory. The

accurate non-linear analysis continues with application of the numerical method

FEM.

The EF forms an unloaded transformer comprising a magnetizing winding, a

voltage winding and a magnetic core, formed by the electrical steel sheet test

specimen, [4], as given in Figure 1 a).

The second analyzed measurement device CIT is a complex electromagnetic

system, with two measurement cores: current (CMC) and a voltage core (VMC).

The CIT comprises two electrical systems with four windings and two magnetic

cores magnetically coupled as displayed in Figure 1 b).

a) Epstein frame: 1 - air flux compensation coil; 2 - magnetic core; 3 - voltage winding; 4 - current winding.

b) Combined instrument transformer: 1 - VMC magnetic core; 2 - CMC magnetic core; 3, 4 - VMC electrical system; 5, 6 - CMC electrical system.

Figure 1. Two electromagnetic measurement devices metrologically analyzed by a numerical method.

The characteristics of electrical steel sheet have to be measured as accurately as

possible, because they are the main construction material of critical power


devices, e.g. transformers, electrical generators or motors, directly determining

their energy efficiency. The standard EF method introduces some systematic

errors, [4, 5], which should be eliminated or reduced. The EF is a non-linear

electromagnetic system, [4]. So, for a correct uncertainty budget evaluation,

numerical methods such as the finite element method are necessary [4]. The

initial analysis of the EF method using the classical analytical transformer theory

is rather approximate: the constant magnetic path length as stated in [5] (lm=0,94

m), the magnetic field distribution is approximate and the leakage fluxes in the

air are not exactly calculated. By using the model of an unloaded transformer

with a changeable magnetic core, the main metrological parameters of the EF

prototype are estimated. The three-dimensional iterative calculation of the

magnetic field distribution is made by using the originally developed program

package FEM-3D, based on the finite element method, [7].

The metrological significance of the instrument transformers is in the power

measurements, especially in the field of legal metrology and trade of electrical

energy. Electromagnetic coupling between the two measurement cores is proved

to exist, [4]. The initial analysis of the CIT is done by the classical analytical

transformer theory. This is again only an approximate estimation. Namely,

analytically it is impossible to consider the non-linear electromagnetic coupling

between the two cores; hence the magnetic field distribution is approximated.

The accurate metrological analysis of the 20 kV CIT (voltage transformation ratio 20000/√3 V : 100/√3 V and current transformation ratio 100 A : 5 A) is

possible only by numerical calculation of the magnetic field distribution in the

3D domain, [4]. The following four metrological parameters are relevant to the

CIT: VMC voltage error pu, VMC phase displacement error δu, CMC current

error pi, CMC phase displacement error δi.

Most of the uncertainty sources arise from the unknown magnetic field distribution in the domain of the EF or the CIT. In the EF the effective magnetic field path length, and in the CIT the leakage reactances of the four windings, are determined by the field distribution. They are the main contributors to the

uncertainty budget of both devices.

2. FEM electromagnetic metrological analysis

The two devices are closed and bounded non-linear electromagnetic systems.

The mathematical modeling of the quasi-stationary electromagnetic field in all

closed and bounded non-linear electromagnetic devices, such as the EF and the

CIT, is described by the system of Maxwell’s equations. Two auxiliary quantities

are introduced: the electric scalar potential V and the magnetic vector potential \vec{A}. The electromagnetic fields in such cases are best described through the magnetic vector potential \vec{A}, where:


\mathrm{rot}\,\vec{A} = \vec{B} \qquad (1)

\mathrm{div}\,\vec{A} = 0 \qquad (2)

After some mathematical transformations Poisson's equation is derived. In non-linear electromagnetic systems such as the EF and the CIT the relationship between the magnetic flux density \vec{B} and the magnetic field strength \vec{H} is determined by the magnetization characteristic: \mu = f(H) or \mu = f(B). In this case Poisson's equation is transformed into:

\frac{\partial}{\partial x}\!\left(\nu(B)\frac{\partial A}{\partial x}\right) + \frac{\partial}{\partial y}\!\left(\nu(B)\frac{\partial A}{\partial y}\right) + \frac{\partial}{\partial z}\!\left(\nu(B)\frac{\partial A}{\partial z}\right) = -j(x, y, z) \qquad (3)

where \nu = 1/\mu. Because of the non-linear relationship \nu = f(B), equation (3) is

a non-linear partial differential equation and it can only be solved by numerical

methods. One option is the finite element method through which the whole three-

dimensional domain is divided into sub-domains (finite elements). During the

discretization process one main principle must be fulfilled: each finite element must have homogeneous electrical and magnetic properties. Both metrological

devices, the EF and the CIT, with their complex structures consist of more

domains with different electrical and magnetic properties. For the magnetic field

calculation the whole devices’ domains are discretized through meshes of nodes

dividing them into a large number of finite elements. The magnetic vector

potential is approximately determined through the value of the three vector

components Ax, Ay, Az in the finite element nodes. In the non-linear case the calculation of the magnetic vector potential is performed iteratively, in several steps.
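The iterative treatment of the non-linearity can be illustrated on a drastically simplified one-dimensional analogue (this sketch is purely illustrative; all material data and dimensions are assumed, and the actual FEM-3D package works in three dimensions with the Galerkin formulation). A fixed-point (successive substitution) loop updates the reluctivity ν(B) from the previous solution and re-assembles the linear-element system:

import numpy as np

def nu(B):                                   # assumed saturation-type magnetization curve
    mu0 = 4e-7 * np.pi
    mur = 1.0 + 2000.0 / (1.0 + (B / 1.5)**6)
    return 1.0 / (mu0 * mur)

L, ne, J = 0.1, 50, 1e5                      # domain length [m], elements, source density [A/m^2]
x = np.linspace(0.0, L, ne + 1)
h = np.diff(x)
A = np.zeros(ne + 1)                         # nodal values of the potential

for it in range(50):                         # successive substitution on nu(B)
    B = np.abs(np.diff(A) / h)               # element-wise flux density from the previous iterate
    k = nu(B) / h
    K = np.zeros((ne + 1, ne + 1)); f = np.zeros(ne + 1)
    for e in range(ne):                      # assemble the global system from element matrices
        K[e:e+2, e:e+2] += k[e] * np.array([[1.0, -1.0], [-1.0, 1.0]])
        f[e:e+2] += J * h[e] / 2.0
    K[0, :], K[0, 0], f[0] = 0.0, 1.0, 0.0   # Dirichlet boundary conditions A(0) = A(L) = 0
    K[-1, :], K[-1, -1], f[-1] = 0.0, 1.0, 0.0
    A_new = np.linalg.solve(K, f)
    if np.max(np.abs(A_new - A)) < 1e-9:
        A = A_new; break
    A = A_new

print("iterations:", it + 1, "  max A:", A.max())

Whether a fixed-point or a Newton-type update is used in FEM-3D is not stated here; the sketch only shows the general structure of re-assembling the system with updated material properties at each iteration.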

In the program originally developed at the Ss. Cyril and Methodius University in Skopje at the Faculty of Electrical Engineering and Information Technologies, named FEM-3D, the Galerkin method (Method of Weighted Residuals) is used. The program package FEM-3D is universal and has already been applied for

analysis of other non-linear electromagnetic systems in electrical engineering

such as, rotational electrical machines, power transformers or linear actuators,

[7]. The structure of the FEM-3D is given in the flow chart in Figure 2. The

main program G1 is the preprocessing module for automatic finite element mesh

generation in the three-dimensional domain. The main program G2 is a post-

processing module for graphical display of the results and of the finite element

mesh. The main program G3 defines the vector of free terms (the right-hand side) for the finite elements. The main program G4 is the processing module, forming the global

system matrix and iterative equation system solver. It derives the results of the

magnetic vector potential in each of the nodes of the finite element mesh. The

non-linearity of the magnetic characteristics is embedded in the calculation. The

main program G5 is the post-processing module which derives the equipotential


surfaces after numerical integration of the magnetic vector values in the finite

element mesh nodes. It forms the output database for graphical display in the

main program G2.

Figure 2. Flow chart of the program package FEM-3D

3. FEM metrological results and experimental verification

As mentioned above, in the standard IEC 60404-2, [5], the effective magnetic

path length in the EF is adopted as constant and equal to 94 cm. However, the

magnetic flux distribution in the EF is very complex and varies by different

grades of electrical steel sheet as well as by different magnetic flux densities.

The degree of leakage fluxes also differs at various materials and magnetic

polarizations. So the effective magnetic length is variable. In the standard test

procedure where the EF is used for determination of the electrical steel sheet

magnetic characteristics, the lm is considered as constant with the value of 94 cm.

This is a systematic error in the measurements. In Table 1 the numerically

derived results by FEM-3D are compared to experimental values of the magnetic

characteristics of three grades of electrical steel sheet derived by the standard EF

test procedure. There is a difference between the numerical and the experimental

values. The numerical calculation does not include the effective magnetic path


length, but the geometrical length of each finite element separately, through

which the magnetic equipotential line passes. The numerical calculation

iterations are done separately for each of the finite elements and the magnetic

properties change in the next iteration.

Table 1. Comparison of the magnetic characteristics of three different grades of electrical steel sheet, numerically calculated by FEM-3D (Jmnum) and experimentally derived by the EF (Jmexp); δJ is the relative deviation.

         Grade A                          Grade B                          Grade C
H      Jmexp  Jmnum  δJ         H      Jmexp  Jmnum  δJ         H       Jmexp  Jmnum  δJ
A/m    T      T      %          A/m    T      T      %          A/m     T      T      %
3,7    0,2    0,21   -4,76      3,9    0,2    0,21   -5,21      5,9     0,2    0,22   -7,27
6,2    0,4    0,42   -4,54      6,7    0,4    0,42   -5,21      9,3     0,5    0,52   -5,19
8,3    0,6    0,63   -4,61      9,0    0,6    0,63   -5,21      12,2    0,7    0,69   -6,38
10,6   0,8    0,84   -4,19      11,0   0,8    0,85   -5,32      14,9    0,9    1,00   -5,90
13,4   1,0    1,02   -1,77      12,9   1,0    1,06   -5,30      17,7    1,2    1,22   -4,34
17,5   1,2    1,19   1,09       15,0   1,2    1,27   -5,29      21,1    1,4    1,41   0,141
23,9   1,4    1,38   1,67       17,4   1,4    1,48   -5,28      26,4    1,5    1,45   7,374
38,6   1,6    1,55   3,29       21,7   1,6    1,64   -2,20      41,4    1,6    1,58   4,246
115,2  1,8    1,79   0,33       47,4   1,8    1,71   4,956      167,1   1,8    1,71   5,494

The magnetic properties are different for the different finite elements; therefore

the magnetic flux density is calculated in each of the mesh nodes. This is a

distributed approach for determination of the magnetic field distribution. In Table 1 a comparison of the FEM-3D numerically calculated magnetic characteristics of three different grades of electrical steel sheet with those experimentally derived by the EF is given.

Figure 3. FEM results of the most important metrological characteristics of the EF-specific power

losses of the tested electrical steel sheet (Grade A)


In Figures 3 and 4 some of the FEM-derived results for the two devices are compared to the approximate analytical values. In the same displays the experimental characteristics obtained through testing of real prototypes in the laboratory are shown. They demonstrate very good agreement with the FEM results, which verifies the methodology.

[Figure 4 comprises two panels. a) CIT magnetizing characteristics in the VMC: magnetic flux density Bmu [T] versus the VMC relative input voltage Uu/Uur [r.u.], comparing the analytical and FEM-3D results for CMC loadings I/In = 0 to 1,2. b) CIT magnetizing characteristics in the CMC: magnetic flux density Bmi [T] versus the CMC relative input current Ii/Iir [r.u.], comparing the analytical and FEM-3D results for VMC loadings U/Un = 0 to 1,2.]

Figure 4. FEM results of the magnetic characteristics of the CIT in comparison to the analytical and experimentally derived values.


a) VMC voltage error versus the VMC input voltage, with the CMC current as a parameter

b) CMC current error versus the CMC input current, with the VMC voltage as a parameter

Figure 5. FEM-derived CIT metrological characteristics

By numerical integration of the magnetic vector potential, the magnetic flux

density is derived. In the case of the CIT the mutual inductances between the primary and the secondary windings of both cores, as well as the leakage inductances and

reactances are further determined. This enables the calculation of the CIT

metrological characteristics given in Figure 5.


4. Conclusion

The numerical simulation of the electromagnetic field distribution could be a

very useful tool in the development of testing methods as well as for uncertainty

estimation in metrology of electromagnetic quantities. The applied FEM method

is universal and experimentally verified. The two presented examples illustrate

the variety of applications of the numerical methods in the metrology of

electromagnetic quantities.

References

1. R. Model, S Schmelter, G. Lindner, M. Baer „Numerical Simulations and

Turbulent Modelling for Applications in Flow Metrology“, Adv. Math. &

Comp. Tools in Metrology and Testing 9, Ser. on Advances in Mathematics

for Applied Sciences, 84, World Scientific, Singapore, 268-275 (2012)

2. M Baer, S Bauer, K John, R Model and R W dos Santos, “Modeling

measurement processes in complex systems with partial differential

equations: from heat conduction to the heart”, Adv. Math. & Comp. Tools in

Metrology, 7, Ser. on Advances in Mathematics for App. Sciences 72,

World Scientific, Singapore, 1-12 (2006)

3. M Orlt, D Richter, “Uncertainty estimation of numerically computed

quantities: a case study for the twofold derivative”, Adv. Math. & Comp.

Tools in Metrology 4, Ser. on Advances in Mathematics for Applied

Sciences 53, World Scientific, Singapore, 171-181 (2000)

4. M. Cundeva-Blajer, L. Arsov, „Computer Aided Techniques for Estimation

and Reduction of Electromagnetic Measurement Devices Uncertainties“, Int.

Jour. of Metrology and Quality Engineering, vol. 2, EDP Sciences, Paris,

pp. 89-97 (2010)

5. EN ISO/IEC 60404-2, 1996+A1:2008: Magnetic Materials, Part 2: Methods

of measurement of magnetic properties of electrical steel sheet and strip by

means of an Epstein frame, Geneva, (2008)

6. IEC 61869-4, Edition 1.0, 2013-11: Instrument transformers, Part 4:

Additional requirements for combined transformers, Geneva, (2013)

7. M. Cundev, L. Petkovska, V. Stoilkov “3D Magnetic Field Calculation in

Compound Configurations”, Proc. Int. Conf. ACOMEN ’98, Ghent, Belgium

503-510, (1998)


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 149–155)

CALIBRATION METHOD OF MEASURING INSTRUMENTS

IN OPERATING CONDITIONS

A. A. DANILOV* AND YU. V. KUCHERENKO

The state regional center of standardization, metrology and tests in the Penza region

Penza, 440028, Russia *E-mail: [email protected]

www.penzacsm.ru

M. V. BERZHINSKAYA† AND N. P. ORDINARTSEVA

Penza state university

Penza, 440017, Russia †E-mail: mberj@ mail.ru

www.pnzgu.ru

A method which allows calibration of measuring instruments (MI) under operating conditions is described. The method is most efficient for the calibration of multichannel measurement systems. It is based on determining the conversion characteristic of a portable calibrator, taking into account the effects of influence quantities. An example of the calibration of a direct electric current measurement channel under operating conditions (at elevated ambient temperature) is given. Measurement uncertainty estimation is discussed.

Keywords: calibration, measuring instruments, uncertainty.

1. Introduction

Arguments of economical worthy MI calibration in operating conditions of

operation are considered in the articles [1]. These include: lack of time and cost

of transportation to the place of MI calibration and back, as well as costs arising

from downtime due to waiting MI queue and proper calibration. While MI

transportation is usually necessary in the calibration laboratory not only to make

better use of the standard, but also in order to ensure normal operating

conditions required for transmission unit size value from standard to measuring

instruments.

However, some people insist on holding MI calibration in operating

conditions, explaining that the conversion MI characteristic in normal operating

conditions is not any interest for them. The fact is that due to the peculiarities of

specific instances of the application operating space MI values of influence

Page 163: Computational tools in metrology and testing x

150

9610-17:Advanced Mathematical and Computational Tools

quantities is relatively small for them, and can take a range of values located

away from the normal range of values of influence quantities.

Of course, the following features [1] of MI calibration under operating conditions must be taken into account:

- the reference standard should not only remain operational but also preserve its accuracy specifications under the operating conditions of the MI, in order to ensure the transfer of the unit under these conditions;

- the uncertainty of measurement when calibrating the MI under operating conditions is definitely larger than for calibration under normal operating conditions.

To calibrate MI under operating conditions, especially the measuring channels of multichannel measurement systems, portable multifunction calibrators of electrical quantities (voltage, electric current strength, electrical resistance, etc.), such as the Calog-PRO-R, Calys 100R, Fluke 724, Fluke 753, MC6-R, MFT 4000R, etc., are typically used. Although these calibrators can operate over a wide range of ambient temperatures (0 to 50 °C), under such conditions they cannot maintain the same size of the unit as under normal conditions (18 to 28 °C). For this reason, during MI calibration under operating conditions measures have to be taken to provide the multifunction calibrator with operating conditions similar to normal ones.

2. Procedure for calibration of measuring instruments under operating conditions

In some cases, when the operating conditions of the MI differ significantly from normal conditions, the proposed two-stage MI calibration method can be used. It is based on two standards (a portable multifunction calibrator of electrical quantities and a stationary reference calibrator installed in the calibration laboratory) and consists of the following [2].

In the first stage the MI is calibrated under its operating conditions using the portable multifunction calibrator of electrical quantities. In the second stage the portable multifunction calibrator is compared with the stationary standard (located under normal operating conditions) at the same points of the measuring range at which the MI was calibrated (i.e. the substitution method is implemented). During the second stage, the same operating conditions under which the calibrator was used in the first stage are provided for the portable multifunction calibrator with the help of special technical means.

calibrator of electrical quantities.

Page 164: Computational tools in metrology and testing x

151

9610-17:Advanced Mathematical and Computational Tools

Of course, operating conditions artificially reproduced (repeated by

technical means) in the calibration laboratory, uniquely developed for MI

calibration are likely to fail. This difference may be taken into account when

estimating the uncertainty of measurement of MI calibration.

However, it is advisable to change the sequence of steps in the algorithm of

MI calibration in operating conditions. Please perform the second stage, i.e.

conduct an experiment to determine the calibration characteristics of

multifunction portable calibrator for some set of combinations of values of

influence quantities that characterize its performance operating conditions. Then

determine the amendments to the values of the quantities, playing with it, as a

function of a set of values of influence quantities. This will allow calibrating the

MI in operating conditions by using values obtained amendments to the value

obtained amendments to the value you get from a portable multifunction

calibrator.

3. Example

It is necessary to calibrate the measurement channel of direct electric current in a multichannel measuring system (hereinafter, the ammeter) in the range of 4 to 20 mA, operated at ambient temperatures from 15 to 40 °C.

3.1. Calibration for a single value of the ambient temperature

Let us suppose that at the time of calibration the ambient temperature is 35 °C. As the means of calibration of the ammeter, one of the portable calibrators listed above may be used. Let us also assume that the conversion characteristic of the portable calibrator was previously determined in a calibration laboratory using a permanently installed calibrator of higher accuracy. The portable calibrator was calibrated at the following five points of its reproduction range, 4, 8, 12, 16 and 20 mA, for three values of the ambient temperature simulated using a climate chamber: 0, 23 and 50 °C.

As a result, the conversion characteristic of the portable calibrator is represented in Table 1.


Table 1. The conversion characteristic of the portable calibrator.

Θ, °C          I, mA
               4      8      12     16     20
Θ1 = 0         I11    I12    I13    I14    I15
Θ2 = 23        I21    I22    I23    I24    I25
Θ3 = 50        I31    I32    I33    I34    I35
Θ4 = 35        I41    I42    I43    I44    I45

The last row of Table 1 presents the calculated values I4i of the direct electric current reproduced by the portable calibrator at an ambient temperature of 35 °C (i.e. under the conditions of the ammeter calibration). These values belong to approximating equations that can be obtained by the method of least squares [3]:

I_{4i} = a_i + b_i \Theta_4, \qquad (1.1)

where

a_i = \frac{\sum_{j=1}^{3} I_{ji}\,\sum_{j=1}^{3} \Theta_j^2 - \sum_{j=1}^{3} \Theta_j\,\sum_{j=1}^{3} \Theta_j I_{ji}}{3\sum_{j=1}^{3} \Theta_j^2 - \left( \sum_{j=1}^{3} \Theta_j \right)^2}, \qquad (1.2)

b_i = \frac{3\sum_{j=1}^{3} \Theta_j I_{ji} - \sum_{j=1}^{3} \Theta_j\,\sum_{j=1}^{3} I_{ji}}{3\sum_{j=1}^{3} \Theta_j^2 - \left( \sum_{j=1}^{3} \Theta_j \right)^2}. \qquad (1.3)
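As an illustrative sketch (the current values I_ji below are invented placeholders, since the actual table entries are not given numerically), the coefficients of (1.2)-(1.3) and the predicted values (1.1) can be computed as follows:

import numpy as np

theta = np.array([0.0, 23.0, 50.0])          # calibration temperatures Theta_1..Theta_3, deg C
I = np.array([[4.002, 8.003, 12.006, 16.005, 20.009],   # indications at Theta_1 (placeholders)
              [4.000, 8.001, 12.001, 16.002, 20.003],   # indications at Theta_2 (placeholders)
              [3.997, 7.997, 11.995, 15.996, 19.995]])  # indications at Theta_3 (placeholders)
theta4 = 35.0                                # temperature of the ammeter calibration

m = len(theta)
den = m * np.sum(theta**2) - np.sum(theta)**2
b = (m * (theta @ I) - np.sum(theta) * I.sum(axis=0)) / den                    # eq. (1.3)
a = (I.sum(axis=0) * np.sum(theta**2) - np.sum(theta) * (theta @ I)) / den     # eq. (1.2)
I4 = a + b * theta4                          # eq. (1.1): last row of Table 1
print(np.round(I4, 4))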

After calibration of the ammeter with the portable calibrator, previously calibrated as above for an ambient temperature of 35 °C, the conversion characteristic of the ammeter shown in Table 2 is obtained.

It should be noted that the resulting conversion characteristic can be used to improve the accuracy of the ammeter only when the ambient temperature is 35 °C. If the ambient temperature of the ammeter changes, the calibration procedure needs to be repeated at the new temperature.


Table 2. Ammeter conversion characteristic for a single value of ambient temperature.

Θ, °C          I, mA
               I41    I42    I43    I44    I45
Θ4 = 35        i41    i42    i43    i44    i45

3.2. Calibration for arbitrary values of influence quantities

We now assume that the ammeter calibration is performed under significant changes of the values of the influence quantity, which in this example is the ambient temperature.

Suppose that the calibration of the ammeter was performed at moments t_k at which the values of the ambient air temperature were equal to Θ_k. The ammeter calibration was performed using the same pre-calibrated portable calibrator.

As a result of the calibration, the ammeter conversion characteristic shown in Table 3 is obtained.

Table 3. Conversion characteristic of the ammeter.

Θ, °C          Point number
               1      2      3      4      5
Θ4 = 35        I41    I42    I43    I44    I45
               i41    i42    i43    i44    i45
Θ5             I51    I52    I53    I54    I55
               i51    i52    i53    i54    i55
Θ6             I61    I62    I63    I64    I65
               i61    i62    i63    i64    i65
Etc.

The resulting matrix of values can be used for constructing an equation approximating the ammeter conversion characteristic for arbitrary values of the measured direct electric current i and ambient temperature Θ:

I(i, \Theta) = a_0 + a_1 i + \ldots + a_k i^k + b_1 \Theta + b_{11} i \Theta. \qquad (1.4)

The parameters a_0, a_1, \ldots, a_k, b_1, b_{11} of this equation can be found by the method of least squares [3].
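A minimal sketch of such a fit (with invented data and k = 2 assumed purely for illustration) solves the corresponding linear least-squares problem directly:

import numpy as np

# invented calibration data: ammeter indications i_m, temperatures Theta_m, calibrator values I_m
i_m     = np.array([4.01, 8.02, 12.02, 16.03, 20.05, 4.02, 8.03, 12.04, 16.06, 20.08])
Theta_m = np.array([35.0, 35.0, 35.0, 35.0, 35.0, 42.0, 42.0, 42.0, 42.0, 42.0])
I_m     = np.array([4.00, 8.00, 12.00, 16.00, 20.00, 4.00, 8.00, 12.00, 16.00, 20.00])

# design matrix for I(i, Theta) = a0 + a1*i + a2*i^2 + b1*Theta + b11*i*Theta, cf. eq. (1.4)
X = np.column_stack([np.ones_like(i_m), i_m, i_m**2, Theta_m, i_m * Theta_m])
coef, *_ = np.linalg.lstsq(X, I_m, rcond=None)
a0, a1, a2, b1, b11 = coef
print("I(10 mA, 38 degC) =", a0 + a1*10 + a2*100 + b1*38 + b11*10*38)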


4. Assessment of uncertainty of measurements at calibration

Evaluation of the measurement uncertainty at the calibration of the ammeter can be performed in accordance with the Guide [4] and should take into account the following sources:

- Random errors of stationary standard, portable calibrator and calibrated

ammeter;

- Random errors of measurement of the influence quantities;

- The uncertainty of the actual values of stationary and portable standards;

- Instabilities of stationary and portable standards;

- Instability of calibrated ammeter;

- Non-linearity of the calibration characteristics of stationary and portable

standards;

- Additional measurement error due to the deviation of the measurement

conditions;

- Rounding the measurement results, etc.

5. Conclusions

Thus, the proposed method makes it possible to calibrate MI under operating conditions.

As the number of MI calibrated by the proposed method under similar operating conditions increases, the share of the cost of the pre-calibrated portable calibrator attributable to the calibration of one MI decreases. Consequently, the calibration of the measuring channels of multichannel measurement systems is the most cost-effective application.

References

1. Berzhinskaya, M. V., Danilov, A. A., Kucherenko, Yu. V.,

Ordinartseva, N. P. About calibration of measuring instruments in operating

conditions of operation, in Metrology and metrology assurance:

Proceedings of the 23rd National Scientific Symposium with International

Participation, September 9–13, 2013, Sozopol. (Technical University of

Sofia, Bulgaria) pp. 443-447.

2. Berzhinskaya, M. V., Danilov, A. A., Kucherenko, Yu. V.,

Ordinartseva, N. P. Calibration of Measuring Instruments Under Working

Conditions in Measurement Techniques: V. 57, Is. 3 (2014), pp. 228-230.

3. JCGM 107 Evaluation of measurement data – Applications of the least-

squares method (ISO/IEC Guide 98-5).


4. ISO/IEC Guide 98-3:2008 Uncertainty of measurement – Part 3: Guide to

the expression of uncertainty in measurement (GUM:1995).


Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. – )

STATISTICAL METHODS FOR CONFORMITY

ASSESSMENT WHEN DEALING WITH

COMPUTATIONALLY EXPENSIVE SYSTEMS:

APPLICATION TO A FIRE ENGINEERING CASE STUDY

S. DEMEYER∗, N. FISCHER

Statistics and Mathematics Department, Laboratoire National de Metrologie et

d’Essais,78197 Trappes, France

∗E-mail: [email protected]

www.lne.fr

F. DIDIEUX, M. BINACCHI

Fire Engineering Department, Laboratoire National de Metrologie et d'Essais, 78197 Trappes, France

E-mail: [email protected]

Statistical methods are compared to assess the conformity of outputs of computationally expensive systems with respect to regulatory thresholds. The direct Monte Carlo method provides baseline results, obtained at a high computational cost. Metamodel-based methods (in conjunction with Monte Carlo or importance sampling) allow the computation time to be reduced, the latter correcting for the metamodel approximation. These methods have been implemented on a fire engineering case study to compute the probability that the temperature above the smoke layer exceeds 200 °C.

Keywords: Monte Carlo Method, Importance Sampling, Gaussian Process,

Computational Code, Probability of Exceeding Threshold, Fire Engineering

1. Introduction

When dealing with a computational code viewed as a black box, one may

be interested in propagating the uncertainties related to the input variables

to estimate the uncertainty associated with the output variable. In the

decision theoretical framework pertaining to conformity assessment, one

is interested also in the position of the output variable with respect to a

given threshold (regulatory threshold,...). The problem of knowing whether

a computationally expensive model output exceeds a given threshold is

very common for reliability analysis and safety-critical applications such


as aerospace, nuclear power stations and civil engineering. The decision

of conformity is based on the probability of non conformity, that is the

probability that the code output exceeds the threshold.

The probabilistic model is defined as Y_i = F(X_i), i = 1, \ldots, N, where F denotes the system (computational code), X_i = (X_{i1}, \ldots, X_{iK})^T denotes the row vector of K input variables defined on a domain of variation D, Y_i is the system output and N is the number of runs of the code.

Given a threshold s, non conformity (also known as failure in reliability assessment) is the event \mathcal{F} = \{x \in D : F(x) > s\}. The probability of non conformity p_f is defined as the following integral

p_f = P(\mathcal{F}) = P(x \in D : F(x) > s) = \int 1_{F(x)>s}\, f(x)\, dx \qquad (1)

where f is the joint density of the input variables defined on D.

Since F is a numerical algorithm it is impossible to compute pf analyt-

ically. Rather, pf is estimated from a number of simulation runs that have

to be carefully planned.

2. Fire engineering case study

The aim of this case study is to assess the impact of a fire starting in a large

test hall (dimensions: 19.75m(length) × 12m(width) × 16.50m(height)) on

the ability to safely egress, based on fire simulations performed with the

numerical code CFAST (Consolidated Model of Fire and Smoke Transport)

developed at NIST a. CFAST assumes that the test hall is empty and

divided into two zones delimited by the smoke layer (see figure 1): the upper

layer and the lower layer. Non conformity is the event ”the upper layer

temperature (UL) exceeds 200C”, where the code output UL is influenced

by 7 input variables relating to the environmental conditions and the course

of the fire defined in table 1.

CFAST is computationally cheap (a few seconds per run) and thus al-

lows to compute Monte Carlo baseline results.

3. Direct Monte Carlo method

Consider a set of L independent random vectors X1, ...,XL with density

f . The Monte Carlo estimator of integral (1) is given by

a: http://www.nist.gov/el/fire_research/cfast.cfm


Fig. 1. Fire representation with the zone model CFAST of a test hall. The blue horizontal line models the smoke layer that divides the space into 2 zones: the upper layer (above the layer) and the lower layer (under the layer). The grey cone models the fire. The time elapsed (bar at the bottom of the figure) indicates that the fire has just started. The color bar at the right shows that the temperature of the smoke layer is under 40 °C.

Table 1. Description of the input variables: symbol, name, unit, range of variation and distribution (N: normal, U: uniform).

Variable   Name                                       Unit      Range               Distribution
AP         atmospheric pressure                       Pa        [98000, 102000]     N
Text       external temperature                       K         [263.15, 303.15]    N
Tamb       ambient temperature                        K         [290, 303.15]       N
α          fire growth rate                           kW/s²     [0.011338, 0.20]    U
Af         fire area                                  m²        [1, 20]             U
Q″         characteristic HRR per unit area           kW/m²     [300, 500]          U
qfd        design fire load density per unit area     MJ/m²     [300, 600]          U

\hat{p}_f^{MC} = \frac{1}{L} \sum_{l=1}^{L} 1_{F(X_l)>s} \qquad (2)

Each random variable 1_{F(X_l)>s} follows a Bernoulli distribution with finite expectation p_f and variance p_f(1-p_f). According to the strong law of large numbers,3 \hat{p}_f^{MC} \to p_f almost surely. The estimator \hat{p}_f^{MC} is unbiased and its mean square error is \sigma^2_{\hat{p}_f^{MC}} = \frac{p_f(1-p_f)}{L}. Throughout this paper, the

coefficient of variation (CV) is used to compare methods. For Monte Carlo


method, it reads CV_{\hat{p}_f^{MC}} = \frac{\sigma_{\hat{p}_f^{MC}}}{\hat{p}_f^{MC}}.
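A minimal sketch of the direct Monte Carlo estimator (2) and of its coefficient of variation is given below; the function F is a cheap analytical stand-in for the expensive code, and the input density, threshold and sample size are illustrative assumptions (the actual study uses CFAST with the seven inputs of Table 1):

import numpy as np

rng = np.random.default_rng(1)

def F(x):                                  # stand-in for the expensive computational code
    return 150.0 + 20.0 * x[..., 0] + 5.0 * x[..., 1]**2

s, L = 180.0, 1000                         # illustrative threshold and number of runs
X = rng.normal(size=(L, 2))                # draws from the input density f (here standard normal)
ind = (F(X) > s).astype(float)             # indicators 1_{F(X_l) > s}
p_mc = ind.mean()                          # estimator (2)
cv = np.sqrt(p_mc * (1.0 - p_mc) / L) / p_mc
print("p_f =", p_mc, " CV =", cv)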

4. Gaussian Process based statistical methods

4.1. Gaussian process based conformity assessment

Gaussian Process (GP) modelling assumes that the true unknown function

F is a realization of a random function y such that F(x_i) = y(x_i) for x_i \in D_n, where D_n = \{x_i, i = 1, \ldots, n\} denotes a set of n training (design) points, with D_n \subset D.

The distribution of y is uniquely determined by its mean function m(x) = E(y(x)) and its covariance function k(x, x') = E((y(x) - m(x))(y(x') - m(x'))).

For example, take m(x) = µ, where µ is an unknown constant (the case of ordinary kriging), and a Gaussian covariance function that reads k(x, x') = \exp\!\left(-\frac{\|x - x'\|_2^2}{2\theta^2}\right), where θ is the range parameter. Then µ and θ are estimated from the training points by maximum likelihood and their estimates are used to predict the distribution of y(x) at a non-observed input x.

More generally, for x \in D \setminus D_n, y(x) \sim N(m(x), \sigma^2(x)), where m(x) denotes the predicted mean at point x and \sigma^2(x) denotes the variance of the prediction at point x. Their expressions can be found in Santner.1

The probability of excursion π(x) is defined for a location x \in D \setminus D_n as

\pi(x) = P(y(x) > s) = P\!\left( \frac{y(x) - m(x)}{\sigma(x)} > \frac{s - m(x)}{\sigma(x)} \right) = \Phi\!\left( \frac{m(x) - s}{\sigma(x)} \right) \qquad (3)

where Φ is the cdf of the standard Gaussian distribution.

The probability of non conformity of a metamodel y is given by p_f^{GP} = \int 1_{y(x)>s}\, f(x)\, dx. The best estimator \hat{p}_f^{GP} of p_f^{GP} that minimizes the mean squared error MSE := E_y\!\left( (\hat{p}_f^{GP} - p_f^{GP})^2 \right) is given in Bect2

\hat{p}_f^{GP} = E_y\!\left( p_f^{GP} \right) = \int P(y(x) > s)\, f(x)\, dx = \int \pi(x)\, f(x)\, dx \qquad (4)

A Monte Carlo estimate of \hat{p}_f^{GP} is given by \hat{\hat{p}}_f^{GP} = \frac{1}{N_\varepsilon} \sum_{i=1}^{N_\varepsilon} \pi(x^{(i)}), where x^{(i)}, i = 1, \ldots, N_\varepsilon, are draws from the input distribution f.

This method is computationally cheap compared to its Monte Carlo counterpart but provides a biased estimate of p_f.
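Given GP predictions m(x) and σ(x) at draws from f, the estimate of (4) reduces to averaging the excursion probabilities (3). A minimal sketch follows; the toy predictor gp_predict stands in for a fitted kriging model (in the paper the predictions come from DiceKriging) and all numerical values are assumptions:

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
s, N_eps = 180.0, 100_000                      # illustrative threshold and sample size

def gp_predict(x):                             # toy stand-in for a fitted kriging predictor
    m = 150.0 + 20.0 * x[..., 0] + 5.0 * x[..., 1]**2    # predicted mean m(x)
    sigma = 2.0 + 0.5 * np.abs(x[..., 0])                 # prediction std sigma(x)
    return m, sigma

X = rng.normal(size=(N_eps, 2))                # draws x^(i) from the input density f
m, sigma = gp_predict(X)
pi = norm.cdf((m - s) / sigma)                 # probability of excursion, eq. (3)
p_gp = pi.mean()                               # Monte Carlo estimate of eq. (4)
print("p_f^GP =", p_gp)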


4.2. Correcting the metamodel approximation

According to (1), p_f can be estimated by importance sampling with the quasi-optimal importance density h_{opt}(x) = \frac{\pi(x) f(x)}{\int_D \pi(x) f(x)\, dx}, which forces simulation into the non conformity domain detected by the metamodel:

p_f = \int_D 1_{F(x)>s} \frac{f(x)}{h_{opt}(x)}\, h_{opt}(x)\, dx = \hat{p}_f^{GP}\, \alpha_{corr} \qquad (5)

where the correction \alpha_{corr} = \int \frac{1_{F(x)>s}}{\pi(x)}\, h_{opt}(x)\, dx = E_{h_{opt}}\!\left[ \frac{1_{F(x)>s}}{\pi(x)} \right] measures the fit of the metamodel in the non conformity domain.

The estimate \hat{p}_f of p_f is computed as the product \hat{p}_f = \hat{\hat{p}}_f^{GP}\, \hat{\alpha}_{corr}, where \hat{\alpha}_{corr} is the Monte Carlo estimate of \alpha_{corr} given by \hat{\alpha}_{corr} = \frac{1}{N_{corr}} \sum_{j=1}^{N_{corr}} \frac{1_{F(h^{(j)})>s}}{\pi(h^{(j)})}, and h^{(j)}, j = 1, \ldots, N_{corr}, are draws from h_{opt} that can be obtained with a Metropolis-Hastings algorithm.3

According to Dubourg,4 the coefficient of variation of \hat{p}_f reads CV_{\hat{p}_f} \approx \sqrt{CV^2_{\hat{\alpha}_{corr}} + CV^2_{\hat{\hat{p}}_f^{GP}}} for small values of CV_{\hat{\alpha}_{corr}} and CV_{\hat{\hat{p}}_f^{GP}}.

This method yields an unbiased estimate of p_f at the cost of new calls to the code (F(h^{(j)})) to correct \hat{\hat{p}}_f^{GP} and of Markov chain algorithms, and its efficiency depends on the fit of the metamodel in \mathcal{F}.
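A compact sketch of the metamodel-based importance sampling correction is given below. It reuses the toy F and gp_predict of the previous sketches, draws from h_opt with a random-walk Metropolis-Hastings sampler and forms the correction; the burn-in, thinning and step size are crude illustrative choices, not those of the study:

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
s = 180.0

def F(x):                                        # the "expensive" code (toy stand-in)
    return 150.0 + 20.0 * x[0] + 5.0 * x[1]**2

def gp_predict(x):                               # toy metamodel predictions m(x), sigma(x)
    return 150.0 + 19.0 * x[0] + 5.5 * x[1]**2, 2.0 + 0.5 * np.abs(x[0])

def log_target(x):                               # log h_opt(x) up to a constant: log pi(x) + log f(x)
    m, sigma = gp_predict(x)
    return norm.logcdf((m - s) / sigma) - 0.5 * np.sum(x**2)

N_corr, step = 500, 0.5
x = np.array([2.5, 0.0])                         # start near the detected non conformity region
chain = []
for _ in range(20 * N_corr):                     # random-walk Metropolis-Hastings
    prop = x + step * rng.normal(size=2)
    if np.log(rng.uniform()) < log_target(prop) - log_target(x):
        x = prop
    chain.append(x.copy())
h = np.array(chain[::20])                        # N_corr thinned draws h^(j) from h_opt

m, sigma = gp_predict(h.T)
pi_h = norm.cdf((m - s) / sigma)
alpha_corr = np.mean((np.array([F(xj) for xj in h]) > s) / pi_h)   # correction (N_corr code runs)

p_gp = 0.03                                      # placeholder for the estimate of the previous sketch
print("corrected estimate p_f =", p_gp * alpha_corr)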

5. Application

The methods presented previously have been applied to a simplified case study derived from section 2, focusing on the relationship UL = F(A_f) with all other input variables set to their mean values. This is to show the influence of the learning database when using a metamodel to estimate the probability of non conformity p_f = P(UL > 200 °C).

Monte Carlo simulations have been performed on the computational code CFAST with n_{MC} = 1000 to obtain the baseline results \hat{p}_f^{MC} = 0.271 and CV_{\hat{p}_f^{MC}} = 1.64%. They were completed in 90 minutes. These simulations allow the relationship between the output UL and A_f to be represented, which appears to be regular (see figure 2). Moreover, from these data we get the non conformity region as the A_f-interval \mathcal{F} = [14.85 m², 20 m²].

Two competing metamodels M3 and M10 have been built on n = 3 and n = 10 training points respectively, with the same model hypotheses: a constant but unknown trend µ, a Gaussian covariance function parameterized by the unknown range parameter θ and the unknown variance of the GP σ². Estimation of µ, θ and σ² has been performed with the R package


DiceKriging.5 A nugget effect τ2 has also been estimated. Results for both

metamodels are displayed in table 2.

Table 2. Estimated parameters of the kriging models with n=3 and n=10 training points.

                 n=3         n=10
trend µ          108.76      99.13
range θ          16.80       8.68
variance σ²      15614.56    12465.26
nugget τ²        1.6e-4      1.4e-2

Once estimated, these metamodels produce trajectories interpolating

their respective training points. Figure 2 shows that the trajectories pro-

duced with M3 depart from the true function in the extremities and be-

tween the training points. Inversely, trajectories produced withM10 match

exactly the true function.

The metamodel based importance sampling method has been imple-

mented. The key point is to sample from the best knowledge of the non

conformity domain given a metamodel with a Metropolis Hastings (MH)

algorithm. Figure 3 displays the empirical cumulative distribution func-

tions (cdf) of the MH-samples produced under the two metamodels and

compares them to the true empirical cdf obtained from the Monte Carlo

sample. It appears that the distribution of the samples obtained with n=10

matches the true distribution whereas the distribution obtained with n=3

departs from the true distribution by sampling lower values, known not

to belong to the true non conformity domain. Estimates of the probabil-

ities and of the correction are displayed in table 3. When the metamodel

behaves poorly in the non conformity domain (case n=3) the correction

requires Ncorr = 480 additional runs of the code to compensate for this dis-

crepancy. Inversely, when the metamodel fits the true function in the non

conformity domain (case n=10), the metamodel based estimate of the prob-

ability requires no correction. The computational times under metamodels

M3 and M10 amount to 45 min and 1 min respectively.

6. Conclusion

A metamodel based method has been presented (see Dubourg4) to cut costs

when computing a probability of non conformity from a computationally


Table 3. Probability results obtained with metamodels M3 and M10.

                                          M3 (n=3)    M10 (n=10)
N_ε                                       100,000     10,000
N_corr                                    480         -
\hat{\hat{p}}_f^{GP}                      0.306       0.269
CV_GP                                     0.46%       1.64%
\hat{\alpha}_{corr}                       0.896       -
CV_{\alpha_{corr}}                        1.55%       -
\hat{p}_f                                 0.274       0.269
CV_{\hat{p}_f}                            1.62%       1.64%
Total number of code runs (n + N_corr)    483         10

Fig. 2. Plot of the simulated trajectories obtained with n=3 (blue curves) and n=10 (green curves) learning points. Trajectories obtained with the 10 learning points and the true function are superimposed.

expensive numerical code. The one dimensional case study shows the impact

of the choice of the metamodel (here restricted to the choice of the num-

ber of training points) on the correction. More generally, the metamodel

based importance sampling method is able to produce an estimate of the

target probability of non conformity at a much lower cost than the direct

Monte Carlo method with the same coefficient of variation. This relies on

an optimization of the additional calls to the numerical code.


Fig. 3. Plot of the empirical distribution functions of h^{(j)}, j = 1, \ldots, 1000 samples obtained with 2 learning databases: n=3 (black), n=10 (blue) under the same simulation conditions. The dotted line represents the theoretical cdf of h_{opt}.

Acknowledgements

This work is supported by the European Metrology Research Programme

(EMRP), which is jointly funded by the EMRP participating countries

within EURAMET and the European Union.

References

1. T. J. Santner, B. J. Williams and W. I. Notz, The design and analysis of computer experiments (Springer Verlag, 2003).

2. J. Bect, D. Ginsbourger, L. Li, V. Picheny and E. Vazquez, Sequential design of computer experiments for the estimation of a probability of failure, Statistics and Computing, 22, 3 (2012).

3. C. P. Robert and G. Casella, Monte Carlo Statistical Methods (Springer Series in Statistics, Springer Verlag, 2002).

4. V. Dubourg, F. Deheeger and B. Sudret, Metamodel-based importance sampling for structural reliability analysis, Probabilistic Engineering Mechanics, 33 (2013).

5. O. Roustant, D. Ginsbourger, Y. Deville, DiceKriging, DiceOptim: Two R Packages for the Analysis of Computer Experiments by Kriging-Based Metamodeling and Optimization, JSS, 51, 1 (2012).


Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. – )

OVERVIEW OF EMRP JOINT RESEARCH PROJECT

NEW06 “TRACEABILITY FOR

COMPUTATIONALLY-INTENSIVE METROLOGY”

A. B. FORBES and I. M. SMITH∗

National Physical Laboratory, Teddington, Middlesex, UK∗E-mail: [email protected]

F. HARTIG and K. WENDT

Physikalisch-Technische Bundesanstalt, Braunschweig, Germany

Today, almost all measuring systems involve computation and it is important that software components can be shown to be operating correctly. The European Metrology Research Programme Joint Research Project (JRP) NEW06 "Traceability for computationally-intensive metrology" is specifically concerned with developing technology that will deliver such traceability. This paper provides a broad overview of the main activity being undertaken within the JRP. An ultimate goal of the JRP is to establish an information and communications infrastructure for software validation. The steps required to reach this goal are described.

Keywords: Traceability, software validation

1. Introduction

A key requirement of traceability is that measurement results can be linked

to references, e.g., measurement units, through a documented unbroken

chain. To an ever-increasing degree, such chains nowadays involve com-

putation, and it is important that computational links are known to be

operating correctly. Analogous to physical artefacts are reference data sets

(sometimes referred to as “numerical artefacts”) that may be used to test

that software components in a measurement chain are operating correctly.

The European Metrology Research Programme (EMRP)1 is currently

funding the Joint Research Project (JRP) NEW06 “Traceability for

computationally-intensive metrology” (referred to as “TraCIM”).2 The

main objective of the JRP, which runs from June 2012 to May 2015, is

to develop new technology that will deliver traceability of computationally-

intensive metrology, transparently and efficiently, at point of use.


In order to meet this objective, a number of tasks are being under-

taken within the JRP: 1) identification of priority metrology applications

in different metrology domains, 2) establishment of a general framework

for a system of traceability in computationally-intensive metrology, 3) pro-

vision of a mechanism that allows definitions of computational aims to be

provided in clear, unambiguous terms, and the specification of computa-

tional aims for the priority metrology applications identified in task 1), 4)

development of software to produce reference data for the computational

aims specified in task 3), 5) provision of appropriate metrics to evaluate

the performance of software under test, and 6) development of a state-of-

the-art information and communications technology (ICT) infrastructure

for software validation.

The JRP Consortium comprises: A) National Metrology Institute

(NMI) partners NPL (JRP-Coordinator, United Kingdom), PTB (Ger-

many), CMI (Czech Republic), UM (Slovenia), VSL (The Netherlands)

and INRIM (Italy), B) industrial partners Hexagon, Mitutoyo, Werth and

Zeiss (all Germany), and C) Researcher Excellence Grant (REG) partners

Westsachsische Hochschule Zwickau (Germany), Ostfalia Hochschule fur

angewandte Wissenschaften (Germany) and the University of Huddersfield

(United Kingdom). In addition, the University of York, UK, and the Pol-

ish Central Office of Measures (GUM), have participated in the project

through additional grants. The JRP has a focus on coordinate metrology,

and therefore involves working closely with the listed industrial partners

from that area. However, the infrastructure that is being developed may

easily be applied to any metrology area.

This paper provides an overview of the JRP and an introduction to

other papers that describe in more detail aspects of the technical work

being undertaken within the JRP.3–5 The paper is organised as follows.

Sections 2 to 7 contain general descriptions of each of tasks (1) to (6) listed

above. Conclusions are given in section 8.

2. Priority metrology applications

In task (1), JRP-Partners have undertaken a review of the metrology areas

in which computation plays a critical part. This review, together with the

JRP’s focus on coordinate metrology, has identified ten priority application

areas, with each application area belonging to either the Length, Chemistry

or Interdisciplinary domain (indicated in brackets by “L”, “C” and “I”,

respectively):


• Least squares geometric element fitting (L).

• Chebyshev geometric element fitting (L).

• Evaluation of surface texture parameters (L).

• Least squares non-uniform rational B-splines (NURBS) fitting (L).

• Peak assessment (C).

• Least squares exponential decay fitting (C).

• Principal component analysis (I).

• Uncertainty evaluation (I).

• Regression (I).

• Interlaboratory comparisons (I).

3. General framework for traceability

Task (2) involves three main activities:

• A review of current internet-aided software validation has been un-

dertaken to help understand the limitations of what is currently

in place. In addition, the unfunded JRP-Partners have helped to

document the requirements of the ICT infrastructure to be devel-

oped during the JRP, identifying how the TraCIM system can go

beyond the current state-of-the-art.

• A glossary of terms and definitions for traceability in

computationally-intensive metrology has been developed.

• Requirements for the formation and principles of the “TraCIM As-

sociation” concerned with software validation have been developed.

During the JRP, membership of the TraCIM Association will be re-

stricted to the JRP-Partners. After the JRP ends, membership will

be open to other NMIs or designated institutes (DIs).

4. Specifications of computational aims

Verification and validation of software may only be undertaken if there

exists a clear statement of the problem that the software is intended to

solve or the task that the software is intended to execute. Such a statement

is essential both to act as the user and functional requirements for the

software developer, and to provide a basis for verification and validation of

the software implementation.

As part of task (3), a procedure has been developed that provides a

clear description of how a computational aim should be specified. The

specification of the computational aim includes information contained in

the following fields:


• Title.

• Keywords.

• Mathematical area.

• Dependencies.

• Input parameters.

• Output parameters.

• Mathematical model.

• Signature.

• Properties.

• References.

A computational aims database6 has been developed that acts as a

repository for specifications of computational aims. For the ten priority

application areas identified in task (1) (section 2), specifications of compu-

tational aims have been developed. The University of York is looking at the

use of formal specification languages in defining computational aims.

5. Reference data

Data generators are commonly used in the process of testing software imple-

mentations of computational aims.7,8 A data generator, also implemented

in software, though not necessarily in the same programming language as

the software under test, is used to produce a reference pair, comprising

reference input data and reference output data. The reference input data

is processed by the software under test to produce test output data that is

compared (in an appropriate way) with the reference output data. Repeat-

ing this process a number of times allows a statement to be made about

the quality of the software under test.

Data generators generally implement one of the following approaches:

• Forward data generation, which refers to the process of taking ref-

erence input data and using it to produce corresponding reference

output data.

• Reverse data generation, which refers to the process of taking ref-

erence output data and using it to produce corresponding reference

input data.7,9–11

Forward data generation involves developing reference software that pro-

cesses input data to produce output data. Reverse data generation typically

requires an understanding or analysis of the computational aim such that

output data can be processed to produce input data. Reverse data gen-


eration is often simpler to implement than forward data generation,

and avoids the need to develop reference software, a process that may be

particularly complicated and costly.
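The following minimal Python sketch illustrates the reverse (null-space) idea for the simple case of ordinary least-squares straight-line fitting; the function names and the straight-line model are illustrative assumptions only and are not the data generators developed within the JRP. Perturbations are projected out of the column space of the design matrix, so the chosen reference parameters are, by construction, the exact least-squares solution for the generated data.

import numpy as np

def reverse_line_data(a_ref, b_ref, x, scale=0.01, seed=0):
    # Reverse data generation: start from the reference output (a_ref, b_ref)
    # and construct reference input data y whose exact least-squares
    # solution is (a_ref, b_ref).
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones_like(x), x])          # design matrix
    e = rng.normal(size=x.size, scale=scale)           # raw perturbations
    e -= X @ np.linalg.lstsq(X, e, rcond=None)[0]      # enforce X.T @ e = 0
    return a_ref + b_ref * x + e                       # reference input data

x = np.linspace(0.0, 1.0, 11)
y = reverse_line_data(2.0, -0.5, x)
a_fit, b_fit = np.linalg.lstsq(np.column_stack([np.ones_like(x), x]), y, rcond=None)[0]
# a_fit, b_fit recover (2.0, -0.5) up to rounding error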

Within task (4), for each priority application area specified in task (1)

(section 2), appropriate data generators will be developed.

6. Performance metrics

The aim of task (5) is to develop metrics that will be used to evaluate the

performance of software under test. These metrics should take into account:

• The numerical uncertainty associated with the reference data, i.e.,

how close the reference input and output data is to the true math-

ematical solution.

• Characteristics of the measurement data likely to arise in practice,

such as the simulated measurement uncertainty associated with the

reference input and/or output data.

• A maximum permissible error (MPE), or other specification, that

applies in the relevant metrology domain.

Performance metrics will be designed to assess two main features of software under test: numerical accuracy – how far the computed solution is from the reference solution – and fitness for purpose – whether the difference between the computed solution and the reference solution is significant relative to other influence factors such as measurement uncertainties.
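As a simple illustration of how such metrics might be combined, the following Python sketch reports an approximate number of correct significant digits (numerical accuracy) and a pass/fail decision against a maximum permissible error plus the numerical uncertainty of the reference data (fitness for purpose); the function names and the scalar comparison are hypothetical simplifications, not the metrics actually developed in task (5).

import math

def correct_digits(test_value, ref_value):
    # Approximate number of correct significant decimal digits.
    if test_value == ref_value:
        return 16.0                       # limited by double precision
    return -math.log10(abs(test_value - ref_value) / max(abs(ref_value), 1e-300))

def fit_for_purpose(test_value, ref_value, mpe, u_num):
    # Pass if the deviation lies within the MPE plus the numerical
    # uncertainty of the reference data.
    return abs(test_value - ref_value) <= mpe + u_num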

7. ICT infrastructure

One of the key aims of the JRP is to establish a sustainable and durable

service for stakeholder communities that survives beyond the lifetime of

the JRP. To this end, task (6) is concerned with the development and

establishment of an ICT infrastructure for software validation. This ICT

infrastructure, referred to as the “TraCIM system”, allows a customer, e.g.,

a software developer, to interface with the TraCIM server. A secure data

exchange system has been developed that allows the TraCIM server to

deliver reference input data sets to the customer. The customer then applies

the software to the data sets to obtain corresponding test output data sets.

These data sets are then delivered securely to the TraCIM server so that

comparison can be made between corresponding test and reference output

data sets. Finally, the user is provided with information, e.g., a certificate,

on the performance of the software under test. The TraCIM system has


already been used to undertake the certification of software that implements

least squares geometric element fitting.

8. Conclusions

The EMRP JRP “Traceability for computationally-intensive metrology”

addresses the need for the provision of a software validation service in order

to address the ever-increasing use of software within computational chains

within metrology. This paper summarises the main steps to be undertaken

to achieve the goal of establishing an ICT infrastructure for software vali-

dation.

Acknowledgements

This work has been undertaken as part of the EMRP Joint Research Project

NEW06 “Traceability for computationally-intensive metrology”, co-funded

by the UK’s National Measurement Office Programme for Materials and

Modelling and the European Union. The EMRP is jointly funded by the

EMRP participating countries within EURAMET and the European Union.

References

1. European Metrology Research Programme. www.emrponline.eu.
2. Summary of Joint Research Project NEW06 (“TraCIM”). www.euramet.org/index.php?id=emrp_call_2011#c11010.
3. F. Hartig, M. Franke and K. Wendt. Validation of CMM evaluation software using TraCIM. In Advanced Mathematical and Computational Tools for Metrology X.
4. G. J. P. Kok and I. M. Smith. Approaches for assigning numerical uncertainty to reference data pairs for software validation. In Advanced Mathematical and Computational Tools for Metrology X.
5. H. D. Minh, I. M. Smith and A. B. Forbes. Determination of the numerical uncertainty for numerical artefacts for validating coordinate metrology software. In Advanced Mathematical and Computational Tools for Metrology X.
6. TraCIM Computational Aims Database. www.tracim-cadb.npl.co.uk.
7. B. P. Butler, M. G. Cox, A. B. Forbes, S. A. Hannaby and P. M. Harris. A methodology for testing the numerical correctness of approximation and optimisation software. In The Quality of Numerical Software: Assessment and Enhancement, ed. R. Boisvert (Chapman and Hall, 1997).
8. R. Drieschner, B. Bittner, R. Elligsen and F. Waldele. Testing Coordinate Measuring Machine Algorithms, Phase II. Technical Report EUR 13417 EN, Commission of the European Communities (BCR Information) (Luxembourg, 1991).


9. M. G. Cox, M. P. Dainton, A. B. Forbes and P. M. Harris. Validation of CMM form and tolerance assessment software. In Laser Metrology and Machine Performance V, ed. G. N. Peggs (WIT Press, Southampton, 2001).
10. M. G. Cox and A. B. Forbes. Strategies for testing form assessment software. Technical Report DITC 211/92, National Physical Laboratory (Teddington, 1992).
11. A. B. Forbes and H. D. Minh. Generation of numerical artefacts for geometric form and tolerance assessment. Int. J. Metrol. Qual. Eng. 3, 145 (2012).


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 171–178)

STABLE UNITS OF ACCOUNT FOR ECONOMIC VALUE

CORRECT MEASURING

NIKOLAI V. HOVANOV

Department of Economics, Saint Petersburg State University,

7/9, Universitetskaya nab., St. Petersburg, 199034, Russia

E-mail: [email protected]

A previously developed method for constructing minimally volatile “baskets” of economic goods is outlined. These baskets are proposed for the role of stable aggregated units (SAU) of account (i.e., units for measuring the value of goods). Some numerical examples demonstrate that SAU-baskets with very low volatility over rather long periods of time may be formed on the basis of historical data analysis and/or on the basis of experts’ non-numeric, non-precise and non-complete information.

1. Introduction

In recent decades there has been a rather wide flow of publications on “baskets” (sets, aggregates, collections, portfolios, “bundles”, “cocktails”, etc.) of economic goods (commodities, services, currencies, assets, stocks, negotiable papers, derivative instruments, etc.). Some of these composite (aggregated) goods are proposed as a “standard of value”, a “unit of account” (i.e., a unit of goods value

measurement), etc. For example, Robert Shiller, laureate of Nobel Prize in

Economics (2013), considers that the introduction of such composite unit of

account “would mean the creation of a new system of economic measurement,

which would make real quantities, rather than arbitrary nominal quantities, the

centre of our attention” [1, p. 13].

As the main attribute of any unit of measurement is its stability through time, we propose to select for the role of “value standard” a composite good with minimal volatility over a sufficiently long period of time. For example, a low-volatility basket of national currencies may define such a Stable Aggregated Unit (SAU).

To present a general approach to the problem of SAU formation, we outline a model of the exchange of simple and composite (aggregated) economic goods (section 2). For these exchangeable goods, some multiplicative monetary indices of exchange value are introduced (section 3). As a measure of the volatility of exchange-value indices we use the mean square deviation from unit (MSDU). A stable aggregated good with minimal MSDU over a fixed historic period of time may serve in the role of a SAU (section 4). We give an example of SAU construction on the basis of historic data on the exchange coefficients of national currencies, namely EUR, GBP, JPY, and USD (section 5). For SAU construction one may use not only historical data but also non-numeric, non-precise and non-complete expert information (section 6). The conclusion of the paper contains some general remarks on using a SAU as a unit for measuring the value of economic goods.

2. Extended model of simple exchange

The model of simple exchange rests on the following suppositions. There is a given set G = {g_1, …, g_n} of economic goods (commodities, services, currencies, valuable assets, stocks, securities, derivatives, etc.). A volume q_i of any good g_i is expressed in terms of the measurement unit u_i from the set U = {u_1, …, u_n}. So, a volume of g_i may be represented by a “named” number q_i u_i, where q_i is an “abstract” number and the measurement unit u_i defines the “dimension” of the volume. We assume that on the “market” of economic goods g_1, …, g_n all possible named quantities q_1 u_1, …, q_n u_n are offered for exchange.

If one can exchange a volume q_i u_i of good g_i for a volume q_k u_k of good g_k on the market, we say that between the named quantities q_i u_i, q_k u_k there is a “relation of exchange” (written q_i u_i ≡ q_k u_k). Suppose this exchange relation is reflexive, symmetric and transitive (if q_i u_i ≡ q_k u_k and q_k u_k ≡ q_m u_m then q_i u_i ≡ q_m u_m), i.e., the relation of exchange is a binary relation of equivalence. In this case we say that there is equivalent pair-wise exchange between the corresponding volumes of economic goods.

The exchange relation q_i u_i ≡ q_k u_k may be written in the form 1·u_i ≡ c_ki u_k, where the coefficient of exchange c_ki(t) = q_k/q_i > 0 shows how many units u_k of good g_k may be exchanged for one unit u_i of good g_i at a point of time t. The exchange coefficients form a square transitive matrix C(t) = (c_ki(t)). Transitivity of the exchange matrix implies the possibility of reconstructing the whole matrix C(t) from a single row c_i(t) = (c_1i(t), …, c_ni(t)).

The finite set G of “simple” economic goods may be extended by introducing composite economic goods, each composite economic good g being determined by a vector q = (q_1, …, q_n) (q_i ≥ 0, q_1 + … + q_n > 0) of quantities (volumes) q_1, …, q_n of the corresponding simple economic goods g_1, …, g_n. A composite economic good q = (q_1, …, q_n) may be interpreted as a “basket” B(q) = {q_1 u_1, …, q_n u_n} which contains q_i measurement units u_i of the simple economic good g_i from the set G.

Any composite good q = (q_1, …, q_n) may be represented as the product of a scalar q′ = q_1 + … + q_n and a normalized vector v = (v_1, …, v_n) (v_i ≥ 0, v_1 + … + v_n = 1): q = q′·v = (q′·v_1, …, q′·v_n). Therefore, the normalized vector may be taken for the role of a “natural” unit of measurement of the quantity (volume) of the composite good. If we choose such a unit of measurement u_v = v = (v_1, …, v_n), then the volume of the composite good is equal to q′ = q_1 + … + q_n.

The exchange coefficient c_kv(t) = c(u_k, u_v; t) = v_1 c_k1(t) + … + v_n c_kn(t) shows how many units u_k of the simple good g_k may be exchanged for one unit u_v of the composite good g at a point of time t.

3. Monetary indices of economic value

As the exchange relation q_i u_i ≡ q_k u_k is a binary relation of equivalence, there must be “something equal” in the exchangeable quantities q_i u_i, q_k u_k of the simple economic goods g_i, g_k (even though the goods may be quite different in their qualities). This “something equal” in exchangeable volumes of goods was named by Adam Smith the “relative or exchangeable value of goods”, with the synonyms “value in exchange” and “power of purchasing” [2, pp. 44-46].

Transitivity of the exchange matrix C(t) permits the construction of an index I(u_i; t) of the exchange value of unit u_i such that any exchange coefficient c_ki(t) may be represented as the ratio I(u_i; t)/I(u_k; t) of corresponding values of the index. Analogous indices I(u_v; t) can be constructed for a normalized composite economic good v = (v_1, …, v_n). It is rather natural to call such indices I(u_i; t), I(u_v; t) “monetary indices of value in exchange” (“monetary exchange-value indices”).

J. S. Mill noted in his “Principles of Political Economy” (1848) that “the value of a thing means the quantity of some other thing or of things in general, which it exchanges for” [3]. So, if we want to construct a monetary exchange-value index I(u_i; t) for the simple economic good g_i, then we must take into consideration all the exchange coefficients in the row c_i(t) = (c_1i(t), …, c_ni(t)) that show the proportions in which the unit of measurement u_i exchanges for the units u_1, …, u_n: I(u_i; t) = I(u_i; c_i(t)). We use the multiplicative monetary index (Jevons index) I_×(i; t) = I_×(u_i; c_i(t)) of the exchange value of the simple good g_i, this index being defined as the geometric mean

I_×(i; t) = [c_1i(t) · … · c_ni(t)]^(1/n)

of the exchange coefficients c_1i(t), …, c_ni(t) [4, p. 332]. Analogously, the value in exchange of a normalized composite good v is measured by a monetary index I_×(v; t) = I_×(u_v; c_v(t)), defined as the geometric mean

I_×(v; t) = [c_1v(t) · … · c_nv(t)]^(1/n)

of the coefficients c_1v(t), …, c_nv(t) [5].
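For illustration, a minimal Python sketch of the Jevons index is given below; it assumes that a complete matrix of pairwise exchange coefficients for one point of time is available (the function name and array layout are assumptions made only for this example).

import numpy as np

def jevons_index(c):
    # c[k, i] = c_ki(t): units of good k exchangeable for one unit of good i.
    # Returns I_x(i; t) for every i as the geometric mean over k of c[k, i].
    return np.exp(np.mean(np.log(c), axis=0))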

4. Composite good with minimal volatility

On real markets, the exchange coefficients and exchange-value indices of different simple and composite economic goods are rather unsteady. This unsteadiness (volatility) complicates the problem of choosing a “standard good” for the needs of measuring the “value” of goods. Indeed, any “standard good” (simple or composite) chosen to play the role of a “common measure” (“numeraire”, “unit of account”, “standard of value”) for the value in exchange of all economic goods has some fluctuations in its level of value in exchange. However, as was reasoned by A. Smith, “a commodity which is itself continually varying in its own value, can never be an accurate measure of the value of other commodities” [2, p. 50].

In our opinion, it is impossible to find a universal standard good whose value in exchange is absolutely constant across time, space, and economic systems. Therefore, we prefer to set a more modest but more realistic aim: not to reveal a standard good with an ideally constant measure of value in exchange, but to construct a composite good with the minimal volatility of value in exchange for a fixed market of goods and for a fixed period of time.

To measure the variation of the multiplicative monetary exchange-value index I_×(i; t) relative to its value I_×(i; 1) at a fixed point of time t = 1, we introduce the indicator I(i; t/1) = I_×(i; t)/I_×(i; 1). Analogously, to measure the variation of the multiplicative monetary exchange-value index I_×(v; t) relative to its value I_×(v; 1) at the fixed point of time t = 1, we introduce the indicator I(v; t/1) = I_×(v; t)/I_×(v; 1).

To measure the variability (instability, volatility, etc.) of the relative deviations I(v; t/1) of the monetary exchange-value index over a fixed period of time [1, T], we use the Mean Square Deviation from Unit:

V(v) = MSDU(v; [1, T]; 1) = {[(I(v; 1/1) − 1)² + … + (I(v; T/1) − 1)²]/T}^(1/2).

Now the problem of choosing the minimally volatile composite good v* = (v*_1, …, v*_n) may be formally stated as the problem of minimizing V(v) = MSDU(v; [1, T]; 1) under the constraints v_i ≥ 0, i = 1, …, n, v_1 + … + v_n = 1. The obtained optimal basket v* = (v*_1, …, v*_n) determines the required composite good with minimal volatility V(v*), i.e., the stable composite good. It is important to note that the constructed composite good of minimal volatility is intended mainly for the role of a Stable Aggregated Unit of account (SAU), and does not pretend to other functions of money (e.g., the function of legal tender).
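A minimal Python sketch of this constrained minimization is given below; it assumes that a time series of full exchange-coefficient matrices is available and uses scipy's SLSQP solver, so it is an illustrative computation only, not the tool of the web-site mentioned in the next section.

import numpy as np
from scipy.optimize import minimize

def composite_index(v, c):
    # c has shape (T, n, n) with c[t, k, i] = units of k per one unit of i at time t.
    c_kv = c @ v                                      # basket coefficients c_kv(t)
    return np.exp(np.mean(np.log(c_kv), axis=1))      # I_x(v; t), t = 1, ..., T

def msdu(v, c):
    I = composite_index(v, c)
    return np.sqrt(np.mean((I / I[0] - 1.0) ** 2))    # deviation from the value at t = 1

def optimal_basket(c):
    n = c.shape[1]
    v0 = np.full(n, 1.0 / n)
    res = minimize(msdu, v0, args=(c,), method="SLSQP",
                   bounds=[(0.0, 1.0)] * n,
                   constraints=[{"type": "eq", "fun": lambda v: v.sum() - 1.0}])
    return res.x                                      # optimal volumes v*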

5. SAU: construction on the base of historic data

The most straightforward and plain approach to SAU construction is direct ex post calculation of the optimal basket components v*_1, …, v*_n on the basis of statistical data on the exchange coefficients c_ki(t), i, k = 1, …, n, for a Learning Period (LP) [1, T], t = 1, …, T.

For example, we choose LP: 01.01.2012-31.12.2012, t = 1, …, 366, and four currencies: EUR, GBP, JPY′ = 100 JPY, and USD. Then the optimal volumes v*_1 = v*(EUR) = 0.243, v*_2 = v*(GBP) = 0.200, v*_3 = v*(JPY′) = 0.252, v*_4 = v*(USD) = 0.305 of the currencies in the basket SAU* may be calculated by the tools of the web-site stable-basket-money.com. Analogously, ex post calculation of the optimal volumes of the currencies in the basket SAU# for the Testing Period (TP): 01.01.2013-31.12.2013 gives the results: v#_1 = v#(EUR) = 0.237, v#_2 = v#(GBP) = 0.177, v#_3 = v#(JPY′) = 0.284, v#_4 = v#(USD) = 0.302.

We may calculate the previously specified measures of volatility V(XYZ) = MSDU(XYZ; [1, T]; 1) (by the tools of the mentioned web-site) for the national currencies (see the second and third rows in Table 1) and for the composite currencies: V(SAU*) = 0.0001, V(SAU#) = 0.0003.

Table 1. Measures of volatility for national currencies

Type of analysis    EUR      GBP      JPY      USD
Ex post: LP         0.0205   0.0296   0.0351   0.0093
Ex post: TP         0.0480   0.0207   0.0882   0.0414

We see that the volatility of SAU* is extremely small in comparison with the volatilities of the national currencies: the measure V(SAU*) = 0.0001 of the composite currency volatility is 93 times less than that for USD, 205 times less than for EUR, 296 times less than for GBP, and 351 times less than for JPY! Similar results hold for the Testing Period (TP): the measure V(SAU#) = 0.0003 of the composite currency volatility is about 69 times less than that for GBP, 138 times less than for USD, 160 times less than for EUR, and 294 times less than for JPY!

Now let us carry out an ex ante analysis of the composite currency SAU*(TP) on the exchange-coefficient data for the Testing Period (TP), with the nominal volumes v*_1, …, v*_4 calculated for the Learning Period (LP). In this case, one can see again that the volatility of such a SAU*(TP) is rather small in comparison with the volatilities of the national currencies: the measure V(SAU*(TP)) = 0.0034 of the composite currency volatility is about 6 times less than that for GBP, 12 times less than for USD, 14 times less than for EUR, and 26 times less than for JPY. But, in this case of ex ante analysis, the level 0.0034 of the volatility measure for SAU*(TP) is 34 times greater than in the ex post analysis of the 2012 data, and more than 11 times greater than in the ex post analysis of the 2013 data.

A possible approach to reducing the SAU volatility obtained from an ex ante analysis, by using non-numeric and/or non-precise expert information, is presented in the next section.

6. SAU: construction on the base of non-numeric data

In the ex ante construction of a SAU for the testing period, an analyst can use not only historical statistical data from the previous (learning) period, but may also take into account his personal information (knowledge) on the possible optimal volumes v+_1, …, v+_n for the next (testing) period. This personal information may include the analyst’s own opinion, and/or additional conjectures from different experts and experts’ committees. Such expert information I usually has a non-numerical (ordinal) or non-precise (interval) form and may be represented as a system I = {v_i > v_j, …, v_r = v_s, …, a_i ≤ v_i ≤ b_i, …} of equalities and inequalities. As a rule, a vector of supposed optimal volumes cannot be determined uniquely on the basis of the information I. For this reason, we refer to the experts’ knowledge I by the term non-numeric, non-precise, and non-complete information (NNN-information).

Under such uncertainty, i.e., in a situation when we know only the whole set AV(I) of all admissible (from the point of view of the NNN-information) vectors, we turn to the concept of uncertainty randomization [8]. In accordance with this concept, the uncertain choice of a vector v+ = (v+_1, …, v+_n) is modelled by a random choice of an element from the set AV(I). Such randomization produces a random vector ṽ+(I) = (ṽ+_1(I), …, ṽ+_n(I)), ṽ+_1(I) + … + ṽ+_n(I) = 1, ṽ+_i(I) ≥ 0, uniformly distributed on the set AV(I). The mathematical expectation v+_i(I) = E ṽ+_i(I) may be used as a numerical estimate of the i-th expected optimal volume.

Let us consider an example where an expert has the NNN-information I = {v+_4 > v+_3 > v+_1 > v+_2; v+_1 ≤ 0.25; v+_2 ≤ 0.18; v+_3 ≥ 0.26; v+_4 ≤ 0.31} about the optimal volumes of the national currencies for the Testing Period (TP): 01.01.2013-31.12.2013. By using the method of randomized volumes, we can estimate the expected normalized volumes: v+_1 = v+(EUR) = 0.239, v+_2 = v+(GBP) = 0.169, v+_3 = v+(JPY′) = 0.285, v+_4 = v+(USD) = 0.306. The aggregated currency SAU+(TP) (formed by these volumes of the national currencies) has volatility measure V(SAU+(TP)) = 0.0004, which is about 52 times less than that for GBP, 104 times less than for USD, 120 times less than for EUR, and 221 times less than for JPY.
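A minimal Python sketch of the randomized-volumes estimation for this particular example is given below (rejection sampling from the uniform distribution on the simplex; the sample size and function name are assumptions made for the illustration, and the resulting means only approximate the values quoted above).

import numpy as np

def expected_volumes(n_samples=200_000, seed=1):
    rng = np.random.default_rng(seed)
    v = rng.dirichlet(np.ones(4), size=n_samples)      # uniform on the simplex
    v1, v2, v3, v4 = v.T
    admissible = ((v4 > v3) & (v3 > v1) & (v1 > v2)     # ordering from the NNN-information
                  & (v1 <= 0.25) & (v2 <= 0.18)
                  & (v3 >= 0.26) & (v4 <= 0.31))        # interval bounds
    return v[admissible].mean(axis=0)                   # estimates of the expected volumes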

The proposed method of SAU construction on the basis of experts’ NNN-information is in good accordance with the requirements of the GUM (“Guide to the expression of uncertainty in measurement”), which plays the role of an international standard for measurement concepts and procedures [6]. Indeed, in Annex D.6.1 of the GUM one can read about probability distributions (italics and bold letters in the quotation are mine): “… All one can do is estimate the values of input quantities … either from unknown probability distributions that are sampled by means of repeated observations, or from subjective or a priori distributions based on the pool of available information”. But the features of a probability distribution p_1, …, p_n (p_i ≥ 0, p_1 + … + p_n = 1) are quite equal, from the mathematical point of view, to those of a distribution of normalized volumes v_1, …, v_n (v_i ≥ 0, v_1 + … + v_n = 1).

An additional stimulus to use experts’ knowledge is the fact that often NNN-information is the only “pool of available information” on a distribution of normalized quantities (probabilities, volumes, weights, etc.).

7. Conclusion

In the paper we outline two main approaches to measuring economic value: construction of a stable aggregated unit of account (SAU) on the basis of historic exchange-coefficient time series [7], and direct construction of such a SAU on the basis of non-numeric, non-precise, and non-complete expert information (NNN-information) [8].

Each of these two approaches to SAU construction is used for the practical solution of some topical problems of the correct measurement of economic value: monitoring of commodity prices; development of virtual regional common currencies (e.g., for BRICS, MERCOSUR, EurAsEC); hedging the currency risks of long-term contracts, etc. We now consider that the topical problem is to combine the two above-mentioned methods for more precise and reliable measurement of the value of economic goods.

Acknowledgment

The work was supported by grant 14-06-00347 of the Russian Foundation for Basic Research (RFBR).


References

1. Shiller R.J. The Case for a Basket: A New Way of Showing the True Value

of Money. London: The Policy Exchange, 2009.

2. Smith A. An Inquiry into the Nature and Causes of the Wealth of Nations.

Oxford, Oxford University Press. 1976 (1776).

3. Mill J.S. Principles of Political Economy. 7-th ed. London: Longmans,

Green and Co., 1909 (1848).

4. Jevons S.W. Money and the Mechanism of Exchange. New York: D.

Appleton and Company, 1875.

5. Hovanov N., Kolari J., Sokolov M. The problem of money as a measuring

stick // XRDS: Crossroads. 2011. Vol. 17. P. 23-27.

6. Guide to the expression of uncertainty in measurement. Geneva: Joint

Committee for Guides in Metrology, 2008.

7. Hovanov N., Kolari J., Sokolov M. Computing currency invariant indices

with an application to minimum variance currency. Journal of Economic

Dynamics and Control. 2004. Vol. 28. P. 1481-1504.

8. Hovanov N., Yudaeva M., Hovanov K. Multi-criteria estimation of

probabilities on basis of expert non-numeric, non-exact and non-complete

knowledge // European Journal of Operational Research. 2009. Vol. 195. P.

857-863.


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 179–186)

A NOVEL APPROACH FOR UNCERTAINTY EVALUATION

USING CHARACTERISTIC FUNCTION THEORY

A. B. IONOV*, N. S. CHERNYSHEVA AND B. P. IONOV

Chair for Radio-Engineering Devices and Monitoring Systems, Omsk State Technical

University, Mira av., 11, 644050, Omsk, Russia *E-mail: [email protected]

www.omgtu.ru

In this paper a novel approach for the accurate representation and evaluation of uncertainty in intelligent measuring systems is suggested. According to the approach, the available probabilistic knowledge of the quantity value is presented in the form of a characteristic function. As an example, an on-line uncertainty calculation procedure for radiation thermometry is examined.

Keywords: Uncertainty Calculation; Intelligent Instruments; Characteristic Function;

Radiation Thermometry.

1. Introduction

At the present time, intelligent measuring systems operating on-line without interruption are becoming more and more popular in industry. As a rule,

such systems are used to control manufacturing processes and prevent accidents.

They usually include a number of intelligent sensors and computing facilities for

multi-aspect information processing and decision making as well [1-2].

The key characteristic of intelligent measuring systems is built-in

metrological quality control of all operations being performed by the system.

Such operations include: (a) transformation of physical quantity into primary

electrical signal; (b) signal processing, filtration, correction; (c) storage and

transmission of measurement information; (d) automatic decision making; (e)

presenting measurement information to the operator. As a result, measurements

are becoming more reliable and decisions made on the basis of these

measurements are becoming more adequate.

As monitoring systems for industrial purposes should operate continuously for a long time, it is very important that such operation be as independent as possible. In other words, the operator should receive information about the quality of every measurement result and make arrangements only when this quality is not appropriate. From this point of view, the best way


to implement such functional capabilities is the on-line estimation of uncertainty

in a device itself [1].

2. Main Issues of Uncertainty Evaluation in Intelligent Measuring

Systems

To ensure proper operation, all measurement data processed in intelligent instruments should be treated as random quantities with unconditioned (real) probability laws. So, the basic characteristic for the correct presentation of measurement uncertainty is the probability density function (PDF). Moreover, in the

case of intelligent automatic system operating on-line there are specific

requirements for data handling:

• high level of accuracy and adequacy of the probabilistic model used;

• simplicity of data processing algorithms;

• absence of information redundancy in the storing data.

At the present time two main approaches to uncertainty analysis (propagation of distributions) are popular. One of them is the use of the widespread

techniques described in “Guide to the Expression of Uncertainty in

Measurement” (GUM) [3]. But such methods cannot guarantee the correct

estimation of uncertainty in the following main cases [4]:

• distribution of output variable is not Gaussian;

• distributions of input variables exhibit asymmetry;

• measurement model is a strongly nonlinear function;

• uncertainty ranges of individual input quantities are incomparable.

Therefore, the total approximation error can cause a large distortion of the result due to the great number of consecutive transformations of quantities in intelligent instruments. The other widely used approach is the Monte Carlo technique, which allows deriving the exact probability distribution of the result. The principal disadvantages of this

method (from the point of view of on-line data processing) are:

• enormous amount of calculations;

• necessity of approximation of output PDF by analytic function.

So, in this case it is reasonable to pay attention to alternative approaches to the propagation of distributions. They should provide more accurate PDF estimation than the GUM techniques, but have to be less resource-consuming than Monte Carlo simulation procedures. In this paper one such method, based on the use of the characteristic function for constructing PDF models, is suggested and examined.


3. Fundamentals of the Approach

3.1. Description of the characteristic function

According to the approach suggested, available probabilistic knowledge of

the quantity value is presented in the form of characteristic function. From

theory, it is known that the characteristic function φx(t) of random quantity X is a

probabilistic characteristic, relating to the probability density function f(x) by

Fourier transform [5-6]:

φx(t) = E[exp(itX)] = ∫_{−∞}^{+∞} exp(itx) f(x) dx. (1)

This characteristic is complex and can be represented either on the basis of

real and imaginary parts or on the basis of magnitude and argument:

φx(t) = Re φx(t) + i·Im φx(t) = |φx(t)| · exp(i·arg φx(t)). (2)

A significant feature of the characteristic function is its boundedness – it tends to zero when the argument tends to infinity. Apart from that, owing to the bounded support of the probability density function, the characteristic function can be discretized without information loss. This allows the probability distribution of a quantity to be stored in the form of arrays (which are usually not large) containing characteristic function values. Besides, we can find the optimal balance between the required accuracy and the acceptable volume of information by managing the number of characteristic function samples N (see Fig. 1).

Fig. 1. Illustration of searching of balance between the number N of characteristic function samples

used and the accuracy of corresponding probabilistic model.


3.2. The algorithm of uncertainty evaluation

The following steps can be marked out in the main procedure of uncertainty

evaluation (propagation of distributions) by the CF approach.

1. Definition of the output quantity of the considered measurement model.

2. Definition of input quantities of this model.

3. Design of the measurement model (in form of analytical relationships) that

should be realized in intelligent instrument.

4. Determination of the characteristic functions of the model inputs.

5. Forming of array of CF samples (using the appropriate value of N) for each

of input quantities by discretization of the CF defined in the step 4.

6. Evaluation of characteristic function samples of the output using the

analytical relationships of the measurement model and the determined CF

arrays of the inputs.

7. Estimation of shape of probability density function and the parameters of

output quantity (exact mean value, standard or extended uncertainty, etc.)

on the basis of the calculated CF samples.

The determination of characteristic function of input quantity can be

performed in three different ways:

• by utilizing the analytical expression of the CF described in a reference

source (for any standard probability distribution);

• by conversion of previously recorded time series of input quantity in

accordance with left equality of (1);

• by Fourier transform of PDF determined earlier.

The key feature of the concept suggested is the possibility to do

mathematical operations to transform probabilistic characteristics directly on the

basis of the characteristic function in the framework of appropriate theory [5-6].

In some cases it is more convenient (in terms of computational efficiency) than

to operate with probability density function and other characteristics. In

particular, CF φy(t) of the sum Y of two independent random quantities X and A

equals the product of their characteristic functions φx(t) and φa(t) (so it doesn’t

require calculation of convolution).

Thanks to the fact that characteristic function is analytically connected to

lots of other probabilistic characteristics, the operator can receive the final result

in any form that is convenient for him. For example, expectation value Ex can be

calculated on the basis of characteristic function samples by the expression [5]

Ex = (2/∆t) · Σ_{n=1}^{N} [(−1)^(n+1)/n] · Im φx(n·∆t), (3)


where ∆t is the sampling interval of the characteristic function. The standard uncertainty σx (through the variance σx²) can be estimated by the formula

σx² = π²/(3·∆t²) + (4/∆t²) · Σ_{n=1}^{N} [(−1)^n/n²] · Re φx(n·∆t) − Ex². (4)

And for the evaluation of the probability density function fx(x) we should use the

following expression

fx(x) = (∆t/π) · { 1/2 + Σ_{n=1}^{N} [ Re φx(n·∆t)·cos(n·∆t·x) + Im φx(n·∆t)·sin(n·∆t·x) ] }. (5)
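As an illustration (a minimal sketch only, with an assumed pair of input distributions and an assumed number of CF samples), the following Python code propagates two independent inputs through the model Y = X + A by multiplying their CF samples and then recovers the expectation, the standard uncertainty and the PDF of Y using Eqs. (3)-(5); these formulas assume that the distribution of Y lies within (−π/∆t, +π/∆t).

import numpy as np

def cf_normal(t, mu, sigma):
    return np.exp(1j * mu * t - 0.5 * (sigma * t) ** 2)

def cf_uniform(t, a, b):
    t = np.asarray(t, dtype=float)
    out = np.ones_like(t, dtype=complex)
    nz = t != 0
    out[nz] = (np.exp(1j * t[nz] * b) - np.exp(1j * t[nz] * a)) / (1j * t[nz] * (b - a))
    return out

N, dt = 8, 0.5                                   # number of CF samples, sampling interval
n = np.arange(1, N + 1)
t = n * dt

phi_y = cf_normal(t, 1.0, 1.0) * cf_uniform(t, -0.5, 0.5)        # CF of Y = X + A

E_y = (2.0 / dt) * np.sum((-1.0) ** (n + 1) / n * phi_y.imag)                  # Eq. (3)
var_y = (np.pi ** 2 / (3 * dt ** 2)
         + (4.0 / dt ** 2) * np.sum((-1.0) ** n / n ** 2 * phi_y.real)
         - E_y ** 2)                                                           # Eq. (4)

x = np.linspace(-np.pi / dt, np.pi / dt, 401)
f_y = (dt / np.pi) * (0.5 + sum(phi_y.real[k] * np.cos(t[k] * x)
                                + phi_y.imag[k] * np.sin(t[k] * x)
                                for k in range(N)))                            # Eq. (5)

print(E_y, np.sqrt(var_y))   # close to the exact values 1.0 and 1.041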

4. Example: On-Line Uncertainty Calculation in the Radiation

Thermometer

We consider non-contact temperature measuring systems as a promising

area of implementation of the concept introduced. In radiation thermometry the

measuring process is greatly influenced by external factors such as [7-9]: non-

ideal state of an object surface (unknown emissivity), atmospheric absorption or

scattering of thermal radiation, etc. In a single-channel radiation thermometer

for the compensation of these factors the operator should estimate and input a

proper value of correction factor K into the device. In this case, the total

uncertainty of measurement is significantly influenced by the uncertainty of the

compensation factor K [9].

As an example, the functional scheme of the procedure of on-line

uncertainty calculation in case of a single-channel radiation thermometer is

shown in Fig. 2. The intrinsic noise of the optical detector and non-perfection of

the compensation factor are sources of total uncertainty. The characteristic

feature of the radiation thermometer is that: (a) it has a non-linear functional transformation “measured code – temperature”; (b) it is not reasonable to create probabilistic models on the basis of the normal distribution law, owing to large distortions. Preliminary research showed that in this case it was not reasonable to use the methods of defining combined standard uncertainty described in the GUM.

When using the characteristic function method for uncertainty calculation, the accuracy of the result strongly depends upon the number of characteristic function samples used (see Fig. 3). Moreover, in this case acceptable accuracy is already reached using 6-8 samples. This demonstrates the practical applicability of the approach.

Fig. 2. Functional scheme of the procedure of on-line uncertainty calculation in case of a single-

channel radiation thermometer (an example situation).

Fig. 3. Example of the uncertainty calculation in the radiation thermometer (see Fig. 2) – results by

the characteristic function approach.


The distinctive feature of the CF approach is the possibility of more accurate estimation of confidence and “residual” probabilities in comparison with the general GUM technique, as demonstrated in Fig. 4. In situations where a certain critical temperature level exists, the maintenance personnel must pay special attention to the probability of exceeding the limiting value.

Fig. 4. Comparison of the results between CF approach and general GUM technique.

5. Conclusion

The approach discussed cannot be considered a perfect alternative to the methods that are currently in wide use for uncertainty propagation, as in the form described it is oriented to a quite specific application area. In particular, it is reasonable to look further into the delimitation between the conditions for effective practical application of the GUM technique and of the CF approach. And, unfortunately, the current level of development of characteristic function theory does not allow analytic transformations of distributions in the case of strong non-linearity.

The use of the CF method seems rational when the GUM approach cannot assure the required accuracy of the result and the application of the Monte Carlo technique is impossible because of computational constraints. It should be noted that in any specific practical case it is possible to reach an optimal balance between the required accuracy and the acceptable volume of information for storing and processing by optimal selection of the number of characteristic function samples.

For instance, this approach can be useful for constructing computer-based automatic intelligent measurement systems aimed at the monitoring of manufacturing processes.

References

1. M.P. Henry and D.W. Clarke, The Self-Validating Sensor: Rationale,

Definitions and Examples, Control Engineering Practice, No. 1, pp. 585-

610, 1993.

2. R. Taymanov, K. Sapozhnikova and I. Druzhinin, Sensor Devices with

Metrological Self-Check, Sensors & Transducers, Vol. 10, pp 30-45, 2011.

3. ISO/IEC Guide 98-3:2008, Uncertainty of Measurement – Part 3: Guide to

the Expression of Uncertainty in Measurement (GUM:1995), (IOS, 2008).

4. W. Minkina and S. Dudzik, Infrared Thermography: Errors and

Uncertainties (John Wiley & Sons, 2009).

5. Yu. M. Veshkourtsev, The Application Study of the Characteristic Function

of Random Process, (Radio i Svyaz, 2003).

6. E. Lukacs, Characteristic Functions, (Charles Griffin & Company Limited,

1970).

7. D.P. DeWitt and G.D. Nutter, Theory and Practice of Radiation

Thermometry (Wiley Interscience, 1988).

8. Z.M. Zhang, B.K. Tsai and G. Machin, Radiometric Temperature

Measurements, (Elsevier, 2010).

9. A.B. Ionov, Metrological Problems of Pyrometry: an Analysis and the

Prospects for Solving Them, Measurement Techniques, Vol. 56, No. 6, pp.

658-663, 2013.


Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. – )

ESTIMATION OF TEST UNCERTAINTY FOR TraCIM

REFERENCE PAIRS

F. KELLER∗, K. WENDT, F. HARTIG

Department of Coordinate Metrology, Physikalisch-Technische Bundesanstalt (PTB),

D-38116 Braunschweig, Germany
∗E-mail: [email protected]

www.ptb.de

In this paper we propose a way to parametrize certain geometrical elements such as lines, planes or cones, and define distance functions between the respective parameters. These functions are used as test values for the validation of Gaussian best-fit software. Furthermore we describe the concept of reference pairs used for the test and show how one can calculate the numerical uncertainty for these reference pairs applying a Monte-Carlo simulation.

1. Introduction

The TraCIM (TRAceability of Computationally-Intensive Metrology)1,2

system was developed to provide a tool for the validation of software ap-

plications used in metrology, in particular software deployed on coordinate

measuring machines. The latter are mainly software applications which calculate, for a given point set obtained by a measurement, an associated geometrical feature. In this paper we concentrate on the Gaussian (least square)

best-fit elements for the geometrical elements line, plane, circle, cylinder,

cone and sphere. For validation, the TraCIM system supplies a point set

representing one of the geometrical elements, for which the software under

test has to calculate the fitting feature, i.e. parameters representing this

feature. To decide if the software has passed the test, the TraCIM system

compares the parameters calculated by the software with reference param-

eters stored in a database.

These reference pairs consisting of point sets and parameters can be

regarded as numerical artefacts of standards. Similar to physical artefacts of

standards, their numerical equivalents are subject to uncertainty. Since the

point sets and reference parameters are numerically calculated and stored

in variables with finite precision, they are subject to rounding errors and

differ inevitably from the mathematically exact values.


To compare the result from the software under test with the reference

parameters, we cannot just look at the difference between the components

of two parameters, since different parameters can describe the same geomet-

rical object. We rather have to define appropriate functions δ = δ(T1, T2)

to compare two parameters T1 and T2. So if Tref is a reference parameter

for a certain geometrical feature and T the parameter returned from the

software under test, we consider the test value δ(Tref , T ).

Due to the numerical reasons mentioned above, the reference parameter

Tref is not the mathematically exact solution of the Gaussian best-fit for

the supplied point set P , hence there remains an uncertainty for the value

of δ(Tref , T ) as well. To estimate the uncertainty for the test values we use

a kind of Monte-Carlo simulation. To this aim we make use of a reference

software owned by the PTB for the calculation of the Gaussian best-fit

elements.

2. Geometrical features

We first want to specify the geometrical elements as subsets of R3 together with their parameter description, as well as the distances between two elements of the same type as functions of the corresponding parameters. More precisely, for each element type E (i.e. line, plane, circle, cylinder, cone and sphere) we will define a parameter space T together with a surjective map φ : T → {elements of type E}. These maps can for example be given by φ(T) = {x ∈ R3 | L(T, x) = 0} for some map L : T × R3 → Rd. We further define for each geometrical feature maps δi : T × T → R≥0 for i = 1, . . . , K with 2 ≤ K ≤ 4 depending on the element type E, which we will use to compare two elements of the same type. These functions may also depend on a reference parameter Tref.

In the following we use the set S2κ = {x ∈ R3 | | ‖x‖ − 1 | ≤ κ} with 0 ≤ κ ≪ 1 as domain for directional or normal vectors. We have to allow κ ≠ 0 since in finite precision we cannot expect to have vectors with length exactly one. The actual value of κ depends on the precision one is dealing with. For our purposes we choose κ = 10^−15. Moreover, we set n̂ = n/‖n‖ for n ∈ S2κ. We further write 〈x, y〉 for the Euclidean inner product, and x × y for the cross product of two vectors x, y ∈ R3.

Line A line is determined by a point p ∈ R3 on it and by its orientation, given by a vector n ∈ S2κ. Hence T = R3 × S2κ, and we define φ(p, n) = {x ∈ R3 | (x − p) × n = 0} for (p, n) ∈ R3 × S2κ. Note that two parameters T1 = (p1, n1) and T2 = (p2, n2) describe the same line, i.e. the same subset of R3, if and only if n1 = ±n2 and p2 − p1 ∈ R · n1.

Definition 2.1. Given a reference parameter Tref = (pref , nref) ∈ T , we

make for T1 = (p1, n1), T2 = (p2, n2) ∈ T the following definition:

(1) To compare the two normal vectors we define
δ1(T1, T2) = arcsin ‖n̂1 × n̂2‖,
which gives us the smaller angle between the lines R · n1 and R · n2.
(2) To measure the distance of two lines, we use
δ2(T1, T2) = ‖p1 − p2 + 〈pref − p1, n̂1〉 n̂1 − 〈pref − p2, n̂2〉 n̂2‖.
The interpretation of δ2 is the following: if a1, a2 ∈ R3 are the projections of the point pref onto the lines given by T1 and T2, then δ2(T1, T2) = ‖a1 − a2‖.
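The following minimal Python sketch (illustrative only; the helper names are chosen here and are not part of the TraCIM system) evaluates the two test values of Definition 2.1 for line parameters (p, n).

import numpy as np

def unit(n):
    n = np.asarray(n, dtype=float)
    return n / np.linalg.norm(n)

def delta1_line(n1, n2):
    # smaller angle between the lines spanned by n1 and n2
    s = np.linalg.norm(np.cross(unit(n1), unit(n2)))
    return np.arcsin(min(1.0, s))

def delta2_line(p1, n1, p2, n2, p_ref):
    # distance between the projections of p_ref onto the two lines
    p1, p2, p_ref = (np.asarray(p, dtype=float) for p in (p1, p2, p_ref))
    u1, u2 = unit(n1), unit(n2)
    a1 = p1 + np.dot(p_ref - p1, u1) * u1
    a2 = p2 + np.dot(p_ref - p2, u2) * u2
    return np.linalg.norm(a1 - a2)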

Remark 2.1. Later the reference parameter Tref will describe the Gaussian

best-fit element to a point cloud Pref . In the case of lines and planes, the

point pref is the centroid of Pref , while for cylinders and cones pref is the

projection of the centroid onto the axis of the cylinder or cone, respectively.

The distance of two elements is thus measured near the center of the point

cloud.

Plane A plane is given by a point p ∈ R3 on it and by a vector n ∈ S2κ orthogonal to the plane, so T = R3 × S2κ. We map parameters T = (p, n) ∈ T to planes by φ(p, n) = {x ∈ R3 | 〈x − p, n〉 = 0}.

Definition 2.2. Let Tref = (pref , nref) ∈ T be a reference parameter.

(1) The angle δ1(T1, T2) between the normal vectors is defined as for lines.

(2) We define the distance between the planes given by T1 = (p1, n1), T2 = (p2, n2) ∈ T as
δ2(T1, T2) = ‖〈pref − p1, n̂1〉 n̂1 − 〈pref − p2, n̂2〉 n̂2‖.
Similarly to the case of lines, this is the distance between the projections of pref onto the two planes.

Cylinder A cylinder is given by a point p ∈ R3 on its axis, the orientation n ∈ S2κ of the axis and the radius r ∈ R>0 of the cylinder. The parameter space is thus given by T = R3 × S2κ × R>0, and we have φ(p, n, r) = {x ∈ R3 | ‖(x − p) × n̂‖ = r}.

Definition 2.3. Given a reference parameter Tref ∈ T we define:


(1) The functions δ1 and δ2 are the same as for lines, applied to the axes

of the cylinders.

(2) The difference δ3(T1, T2) = |r1 − r2| of the radii.

Circle A circle is given by its center p ∈ R3, its orientation n ∈ S2κ (i.e. a vector orthogonal to the plane in which the circle lies), and its radius r ∈ R>0. We can map parameters to circles by φ(p, n, r) = {x ∈ R3 | 〈x − p, n〉 = 0 ∧ ‖x − p‖ = r}, where (p, n, r) ∈ T = R3 × S2κ × R>0.

Definition 2.4. To compare two circles we use the following functions:

(1) The angle δ1(T1, T2) between the normal vectors of two circles, defined

as in the case of lines.

(2) The distance δ2(T1, T2) = ‖p1 − p2‖ between the positions of two cir-

cles.

(3) The difference δ3(T1, T2) = |r1 − r2| of the radii.

Cone A cone is given by a point p ∈ R3 on its axis, its orientation n ∈ S2κ, by the radius r ∈ R≥0 measured at the point p, and by the apex angle α ∈ (0, π). Hence T = R3 × S2κ × R≥0 × (0, π), and for T = (p, n, r, α) ∈ T we define
φ(T) = {x ∈ R3 | ‖(x − p) × n̂‖ + 〈x − p, n̂〉 tan(α/2) = r}.
Note that for the cone, unlike for the other elements, the sign of its orientation vector is crucial.
Two parameters T1 = (p1, n1, r1, α1) and T2 = (p2, n2, r2, α2) describe the same cone if and only if n̂1 = n̂2, α1 = α2 and tan(α1/2)(p1 − p2) = (r2 − r1)n̂1. To compare two cones given by Ti = (pi, ni, ri, αi) ∈ T, i = 1, 2, we make the following definition:

Definition 2.5. Let Tref ∈ T be a reference parameter.

(1) The angle between the normal vectors of the two cones is defined as
δ1(T1, T2) = arccos(〈n̂1, n̂2〉).
(2) The position distance δ2 is defined as for lines, applied to the axes of the cones.
(3) To compare the two radii we use
δ3(T1, T2) = |r1 − r2 − tan(αref/2) 〈p2 − p1, n̂ref〉|.
(4) The difference between the two angles is
δ4(α1, α2) = |α1 − α2|.


Sphere A sphere is given by its center point p ∈ R3 and its radius r ∈ R>0. Thus T = R3 × R>0 and φ(p, r) = {x ∈ R3 | ‖x − p‖ = r} for (p, r) ∈ T.

Definition 2.6. To compare two spheres given by T1 = (p1, r1) and T2 =

(p2, r2) we define the following functions:

(1) The distance δ1(T1, T2) = ‖p1 − p2‖ between the center points.

(2) The difference δ2(T1, T2) = |r1 − r2| of the two radii.

Note that for all geometrical features considered here except for the sphere,

different parameters can describe the same point set, i.e. there exist parameters T1, T2 such that φ(T1) = φ(T2).

Lemma 2.1. Fix a geometrical feature of one of the above described types.

Let T be the corresponding parameter set and φ : T −→ subsets of R3 the

respective map. Let further be δi for i = 1, 2, . . . ,K the associated distance

functions.

(1) The functions δi satisfy the triangle inequality:

δi(T1, T3) ≤ δi(T1, T2) + δi(T2, T3)

for all parameters T1, T2 and T3 ∈ T .

(2) We have for all T, T ′ ∈ T that

δi(T, T′) = 0 ∀i ⇔ φ(T ) = φ(T ′).

(3) Let T1, T1′, T2, T2′ ∈ T with φ(T1) = φ(T1′) and φ(T2) = φ(T2′). Then for all functions δi one has
δi(T1, T2) = δi(T1′, T2′).

To show (1) and (2), one simply has to do the calculations for all element types and functions δi and φ. The last assertion (3) then follows from (1) and (2). Note that (3) means that the calculated distances δi do not depend on the parameters themselves but only on the geometrical objects defined by the parameters.

3. Reference Pairs

For the following we fix a geometrical feature E. If P is a point cloud and T

a parameter for the feature E, we denote by S(P, T ) the sum of the squared

distances between the points in P and the element given by φ(T ). More

precisely, for P = {p1, p2, . . . , pn} ≠ ∅ we have

S(P, T) = Σ_{i=1}^{n} d(pi, φ(T))²,

where d(p, φ(T)) = min{‖p − x‖ | x ∈ φ(T)}. (Note that φ(T) is always a non-empty, closed subset of R3.)

Definition 3.1 (Reference pairs). (1) A reference pair is a pair

(Pref, Tref) where Pref = {p1, p2, . . . , pn} ⊂ R3 is a point cloud and Tref ∈ T is a parameter describing a geometrical element φ(Tref).
(2) The Gaussian best-fit element is unique: for a reference pair there exists a global minimum S(Pref, Tth) at a point Tth ∈ T of the sum of the squared distances S(Pref, ·). If S(Pref, Tth′) with Tth′ ∈ T is also the global minimum, then φ(Tth) = φ(Tth′).

(3) There are values u1, u2, . . . , uK (uncertainties) such that

δi(Tref , Tth) ≤ ui

with a probability of 95% for all associated test values δ1, δ2, . . . , δK .

These reference pairs can now be used to test software for calculating the Gaussian best-fit elements. Suppose that such software calculates, for the point cloud Pref, the parameter T as the Gaussian best-fit element. Since the parameters for a given element are not unique, we cannot directly compare T with the reference parameter Tref. Instead, we have to use the distance functions defined in section 2. More precisely, if δi is one of the distance functions for the feature under consideration, we consider the value of δi(Tref, T). This value has to be smaller than a maximum permissible error εi specified by the owner of the tested software. Taking into account the uncertainties for the reference pairs, we say that the software has passed the test if and only if δi(Tref, T) ≤ εi + ui for all test values δi.

4. Transformation of reference pairs

Given a reference pair (Pref , Tref) we can create new reference pairs via

scaling, rotation and spatial shift.

Definition 4.1. The orientation preserving Helmert group is the group of

transformations on R3 consisting of scaling, rotations and spatial shift, i.e.

G = R>0 × SO(3) ⋉ R3,

and g = (s,R, b) ∈ G acts on x ∈ R3 by g.x = sRx+ b.

The Helmert group thus also acts on subsets of R3 like point clouds or

geometrical elements. Moreover, we define actions of g = (s,R, b) ∈ G on

the parameter spaces as follows:


• On points as just defined.

• On orientation vectors as rotation: g.n = Rn.

• On R≥0 (e.g. on radii or distances) as scaling: g.r = s · r.
• On (0, π/2) (i.e. on angles) as the identity: g.α = α.

Let T be the parameter space for one of the considered element types. Then

G acts on T as just described and it follows that φ(g.T ) = g.φ(T ) for T ∈T and g ∈ G, i.e. the transformed parameters represent the transformed

geometrical elements. Moreover, if P is a point set and T a parameter

representing the Gaussian best-fit element to P , then g.T is a parameter for

the Gaussian best-fit element to g.P . Given a reference pair (Pref , Tref) we

can thus use the group action to generate new reference pairs (g.Pref , g.Tref)

for all g ∈ G. Unfortunately, the uncertainty for the new reference pairs

cannot be obtained in a simple way from the uncertainty of the original pair.

This is due to the fact that both g.Pref and g.Tref are subject to numerical

errors. Moreover, the components of the parameters are correlated and

the test values δi may depend non-linearly on the parameters. In the next

section we will describe how the uncertainty for the (transformed) reference

pairs can be estimated by a Monte-Carlo simulation.
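A minimal Python sketch of such a transformation for a cylinder reference pair is shown below (the function name and the row-wise point layout are assumptions made for this illustration); it applies g = (s, R, b) to the point cloud and to the parameter (p, n, r) consistently with the actions defined above.

import numpy as np

def transform_cylinder_pair(P_ref, p, n, r, s, R, b):
    # g.x = s * R @ x + b for points; the orientation is only rotated, the radius only scaled.
    P_new = s * (P_ref @ R.T) + b          # transformed point cloud (points as rows)
    p_new = s * (R @ p) + b                # point on the axis
    n_new = R @ n                          # orientation of the axis
    r_new = s * r                          # radius
    return P_new, p_new, n_new, r_new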

5. Uncertainty

For the following fix again a feature type E. Suppose that we have a ref-

erence software Γ which calculates, for a given point cloud P, a numerical solution Γ(P) ∈ T for the Gaussian best-fit element. We may think here of P as measuring points taken on a geometrical feature of type E. Let further δ : T × T → R≥0 be one of the distance functions for the parameters of elements of type E, and let (Pref, Tref) be a reference pair for this type. We then consider the function P ↦ δ(Tref, Γ(P)) as a model for our measuring process.

Suppose that the points p in Pref satisfy ‖p‖ < 1, and that the coordinates x, y, z ∈ R of p are given up to 15 decimal places. To these coordinates we add random values uniformly distributed between −10−15 and +10−15 to simulate the numerical uncertainty of the coordinates. In this manner we create new point clouds P1, P2, . . . , PN, where N = 10^4. For all these point clouds we then calculate the parameters Tr = Γ(Pr), as well as the values δi,r = δi(Tref, Tr) for i = 1, . . . , K and r = 1, . . . , N. The uncertainty for δi is then given by the smallest value ui such that δi,r ≤ ui holds for at least 95% of all r = 1, 2, . . . , N.
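A sketch of this procedure in MATLAB follows; Gamma and delta are hypothetical function handles standing for the reference software Γ and one of the distance functions δi, and Pref, Tref are the given reference pair.

% Sketch of the Monte Carlo uncertainty estimate described above.
N = 1e4;                                       % number of perturbed point clouds
delta_r = zeros(N, 1);
for r = 1:N
    % perturb each coordinate uniformly in [-1e-15, 1e-15]
    Pr = Pref + (2*rand(size(Pref)) - 1) * 1e-15;
    Tr = Gamma(Pr);                            % best-fit parameters for Pr
    delta_r(r) = delta(Tref, Tr);              % test value against the reference
end
delta_sorted = sort(delta_r);
u_i = delta_sorted(ceil(0.95 * N));            % smallest u covering at least 95% of the values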

Suppose now T is a parameter calculated by a software under test for


the point cloud Pref . The customer claims that the test values δi(T, Tth)

are smaller than the specified maximum permissible errors εi, where Tth is the

(unknown) mathematically exact solution for the Gaussian best-fit element

to the points Pref . By the triangle inequality for δi follows now that

δi(T, Tref)− δi(Tth, Tref) ≤ δi(T, Tth) ≤ δi(T, Tref) + δi(Tref , Tth),

and thus with a probability of 95% that

δi(T, Tref)− ui ≤ δi(T, Tth) ≤ δi(T, Tref) + ui.

Therefore δi(T, Tth) ≤ εi would imply that δi(T, Tref) ≤ εi + ui. Since we

cannot charge the customer for our uncertainty, we hence say that the

software has passed the test, if all test values satisfy

δi(T, Tref) ≤ εi + ui.

For the customer's parameter T this finally implies that δi(T, Tth) ≤ εi + 2ui with a probability of 95%.

6. Summary

In many fields of metrology the analysis and processing of data by software applications play an increasingly important role. For that reason, to obtain reliable and comparable results it is crucial that the employed software fulfils the specified task accurately. Therefore a well-defined computational aim for the software as well as a method to test the correctness of the software is required. In this paper a method to test Gaussian best-fit software was proposed. We defined appropriate test values, explained the concept of reference pairs, and showed a method to calculate the numerical uncertainty for these reference pairs.



APPROACHES FOR ASSIGNING NUMERICAL

UNCERTAINTY TO REFERENCE DATA PAIRS FOR

SOFTWARE VALIDATION

G. J. P. KOK∗

VSL, Delft, The Netherlands
∗E-mail: [email protected]

I. M. SMITH

National Physical Laboratory, Teddington, Middlesex, UK

This paper discusses the numerical uncertainty related to reference data pairs used for software validation. Various methods for assigning a numerical accuracy bound to reference data pairs are compared. Several performance metrics are discussed which can be used to summarize the results of software testing. They require the calculation of the condition number of the problem solved by the software, or a related quantity called numerical sensitivity. These performance metrics may be used to demonstrate traceability of metrology software.

Keywords: Numerical uncertainty, numerical accuracy, condition number, numerical sensitivity, performance metric, software validation

1. Introduction

Many results in metrology depend heavily on calculations performed using

software. A precise description of such a calculation is called a computa-

tional aim (CA, symbol f). To assure traceability of the result it is essential

that the software is validated using reference data pairs consisting of ref-

erence data (input) X and the corresponding reference result (output) a.

These reference data pairs should be designed carefully and bounds on

their numerical accuracy should be calculated. There are various ways of

interpreting and summarizing the results of software testing. Several perfor-

mance metrics are presented. They require the calculation of the condition

number of the CA, or a related quantity called numerical sensitivity. These

performance metrics may be used to demonstrate traceability of metrology

software.


2. Methods for the generation of reference data pairs

Methods for the generation of reference data pairs can be classified into two

groups. In the case of forward methods, reference software that is known to

solve the CA a = f(X) is used. It is applied to different input data X1,

X2, . . . and the calculated results a1, a2, . . . are recorded. Examples are

fitting routines that provide solutions to least squares fitting problems.

In the case of backward methods, one starts with a reference result a

and computes data X such that a = f(X). This process is then repeated

for several reference results a1, a2, . . . and the calculated input data X1,

X2, . . . are recorded. Sometimes it is easier to proceed in this way than

to construct forward reference software. A common backward method for

generating reference data pairs for least-squares fitting problems is the null-

space method.1

3. Numerical uncertainty

In this section we present concepts related to numerical uncertainty. We do

not give a formal definition for numerical uncertainty itself.

3.1. Numerical accuracy bounds

A numerical accuracy bound (NAB) $(v(X^{(0)}), w(a^{(0)}))$ of the reference data pair $(X^{(0)}, a^{(0)})$ is such that, given $m$, $n$, $X^{(0)} = (X^{(0)}_1, \ldots, X^{(0)}_n)$, $v = (v_1, \ldots, v_n)$, $a^{(0)} = (a^{(0)}_1, \ldots, a^{(0)}_m)$ and $w = (w_1, \ldots, w_m)$, there exists an exact mathematical solution $(X, a)$ of the CA with

$$|X_i - X^{(0)}_i| \le v_i \quad \text{and} \quad |a_j - a^{(0)}_j| \le w_j.$$

The special case vi = 0 for all i corresponds to a bound on the forward error

of the reference data pair: the reference input data is assumed to be exact

and the wj provide a bound on the possible error of the reference result

(output). When a reference algorithm is being used, this special case may

be easiest to assess. It is also the easiest statement to use in the exercise of

software validation.

The special case wj = 0 for all j corresponds to a bound on the backward

error of the reference data pair: the reference result is assumed to be exact

and the vi provide a bound on the possible error of the reference data

(input), i.e. they specify the maximal distance from X(0) of the problem

instance X that is actually solved.


In the intermediate case there are non-zero NABs v and w for both

the reference data and the reference result. This case can be useful when

accounting for rounding effects when presenting a finite number of digits of

the reference data and reference result.

An NAB of a reference data pair may be calculated using one of the

following methods: mathematical analysis, extended precision arithmetic

(e.g. using Advanpix toolbox4 for Matlab2), interval arithmetic (IA) (e.g.

using IntLab toolbox3 for Matlab), advanced computing software tools (e.g.

using Mathematica5), heuristic methods and, possibly, a Monte Carlo (MC)

method.

An example of a heuristic method is to look at the value of some

intermediate quantities in the calculations which should attain a known

value. For example, when using the null-space method, certain derivatives

should vanish. If their evaluation shows they do not, their values can be used

to calculate an NAB of the reference data or reference result.

A Monte Carlo method can possibly be used in combination with a for-

ward reference algorithm to evaluate an NAB. The idea is that a forward

reference algorithm is prone to numerical error due to round-off errors in

finite precision arithmetic. When the input reference data is slightly per-

turbed with a relative perturbation of a little more than the value of the

working precision εwa, the calculated reference result may vary. The mag-

nitude of this variation may provide an NAB of the reference result. In two

examples later in this paper it will be evaluated if this idea is correct.

3.2. Condition number and numerical sensitivity

The relative condition number Kr(X) of an instance X of a computational

aim f is defined as

$$K_r(X) = \lim_{\varepsilon \to 0^+} \; \sup_{\|\delta X\| \le \varepsilon} \; \frac{\|f(X + \delta X) - f(X)\| / \|f(X)\|}{\|\delta X\| / \|X\|},$$

where ‖.‖ denotes norms in the space of problem instances X and in the

solution space of a = f(X). The condition number sets limits on achiev-

able accuracies of algorithms. It is related to a worst case analysis.

We can define the relative numerical sensitivity coefficient at an instance

X of a computational aim f as

$$S_r(X) = \lim_{\varepsilon \to 0^+} \left[ \frac{E_{\|\delta X\| \le \varepsilon}\, \|f(X + \delta X) - f(X)\|^2 / \|f(X)\|^2}{E_{\|\delta X\| \le \varepsilon}\, \|\delta X\|^2 / \|X\|^2} \right]^{1/2},$$

a) εw is the difference between 1 and the smallest number larger than 1 that can be represented in the arithmetic used.


where E denotes the expectation operator.

The numerical sensitivity coefficient is related to the condition number.

However, an ‘average’ error is considered and not the worst case error and

therefore Sr(X) ≤ Kr(X). Sr(X) may be approximated using a Monte Carlo

method and a forward reference method using a sufficiently small value of

ε.
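A sketch of such a Monte Carlo approximation is given below. It is an illustration only, not reference code; f is a hypothetical function handle for the computational aim, X the problem instance (a column vector) and epsr a small relative perturbation size, all assumed given.

% Sketch: Monte Carlo approximation of the relative numerical sensitivity S_r(X).
N    = 1000;
fX   = f(X);
num  = zeros(N, 1);                     % squared relative output changes
den  = zeros(N, 1);                     % squared relative input changes
for r = 1:N
    dX     = epsr * norm(X) * randn(size(X));   % small random perturbation of X
    num(r) = norm(f(X + dX) - fX)^2 / norm(fX)^2;
    den(r) = norm(dX)^2 / norm(X)^2;
end
Sr = sqrt(mean(num) / mean(den));       % approximate sensitivity coefficient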

4. Performance metrics

The idea behind a ‘performance metric’ is that one would like to have a

single number to quantify the performance of software under test for a

given reference data pair. This performance metric is not a ‘metric’ in the

mathematical sense. It should reflect how ‘well’ the test software performs

compared to ‘what can reasonably be expected’. It should take into account

the accuracy of the reference result as well as the working precision of the

arithmetic used. It can be expressed as an approximate number of decimal

digits lost in addition to the mean (or worst case) number when using

a stable algorithm. It is probably of greater interest for the test software

writer than for the end-user, whose primary concern may be accuracy. Some

possible performance metrics are as follows:

$$P_0 = \log_{10}\left( 1 + \frac{\|a_t - a_r\| / \|a_r\|}{\varepsilon_w} \right), \qquad (1)$$

$$P_1 = \log_{10}\left( 1 + \frac{\|a_t - a_r\| / \|a_r\|}{S_r(X_r)\,\varepsilon_w} \right), \qquad (2)$$

$$P_2 = \log_{10}\left( 1 + \frac{\|a_t - a_r\| / \|a_r\|}{\max\!\left( \frac{\|w(a_r)\|}{\|a_r\|},\; K_r(X_r)\,\varepsilon_w \right)} \right), \qquad (3)$$

$$P_3 = \log_{10}\left( 1 + \frac{\|a_t - a_r\| / \|a_r\|}{\max\!\left( \varepsilon_w,\; \frac{\|w(a_r)\|}{\|a_r\|},\; K_r(X_r)\,\varepsilon_w \right)} \right). \qquad (4)$$

The first performance metric P0 simply compares the test result at with

the reference result ar and assesses the ‘number of lost digits’ with respect

to the working precision εw (e.g. εw ≈ 2.2×10−16 corresponding to 16 digits

of precision). A disadvantage of this metric is that it does not reflect if it

is reasonable to expect all digits to be correct.

The second performance metric P1 also takes into account the number

of digits that would be reasonably lost by a stable reference algorithm


calculating in the same working precision εw as the software under test.

A disadvantage of this metric is that its interpretation when P1 is large

is ambiguous. It could mean that one had ‘bad luck’ when testing the

software (i.e. the error in the reference result was closer to the ‘worst case’

rather than the ‘mean case’ error) or it could mean that the test software

indeed does not perform optimally. One can solve this issue by using the

condition number instead of the numerical sensitivity. Another issue is that

the performance metric does not take into account the accuracy of the

reference result ar and its effect on P1.

These issues are solved by P2. However, a possible problem with this

performance metric is posed by the case when both w(ar) and Kr are

small. For example, is (1, 1 + εw) a reasonable test result for CA a(X) =

1 − εw + εwX with reference result (Xr, ar) = (1, 1)? We obtain P2 = 16,

reflecting that 16 digits are unduly lost compared to a stable reference

implementation in the case that (at least) 32 digits would be visible. However,

the end-user only sees 16 digits and he would rather judge that at most the

last digit is lost, and expect a small value of P . This will be handled in the

next performance metric.

Performance metric P3 is similar to P2, but it additionally compares

the relative deviation of the test software result with the working precision.

Although this performance metric captures now several ideas, it still allows

for extensions. The cases Xr = 0 and ar = 0 are not covered, and more

importantly, the numerical accuracy bound v on the reference data input

Xr is not taken into account at all.
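For illustration, the four metrics of equations (1)-(4) can be evaluated as follows for a single reference pair. This is a sketch; the names a_t, a_r, w_ar, Sr and Kr are assumptions standing for the test result, the reference result, the NAB on the reference result, and the sensitivity and condition number at X_r.

% Sketch: evaluating the four performance metrics for one test.
eps_w  = eps;                                    % working precision, ~2.2e-16
relerr = norm(a_t - a_r) / norm(a_r);            % relative deviation of the test result
relnab = norm(w_ar) / norm(a_r);                 % relative NAB of the reference result
P0 = log10(1 + relerr / eps_w);
P1 = log10(1 + relerr / (Sr * eps_w));
P2 = log10(1 + relerr / max(relnab, Kr * eps_w));
P3 = log10(1 + relerr / max([eps_w, relnab, Kr * eps_w]));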

5. Examples

In this section the performance metrics presented are tested. In addition,

the use of Monte Carlo methods will be assessed.

5.1. Example 1: The identity function

In this example we consider the identity function implemented as:

$$a = \mathrm{id}(x) = \left( x^{k} + 1 - 1 \right)^{1/k}, \qquad x \in [0, 1], \quad k = 20.$$

We now study the function id(x) from two different viewpoints.
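A direct double-precision implementation of this formula (a sketch only) makes the source of the difficulty visible:

% One-line double-precision implementation of the identity function above;
% for small x the intermediate x^k is lost when 1 is added, which is what
% makes this a hard test case.
k  = 20;
id = @(x) (x.^k + 1 - 1).^(1/k);
id(0.1)     % returns 0 instead of 0.1 (0.1^20 = 1e-20 vanishes when added to 1)
id(0.5)     % returns 0.5 (0.5^20 = 2^-20 survives the addition of 1)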

In interpretation 1 we consider this implementation of the identity func-

tion as a black box. Assume that this black box represents a stable forward

reference data generator function of a problem with a large condition num-

ber. We will study methods for calculating NABs and the numerical sensi-

tivity of the resulting reference data pairs. The results displayed in the table


below show that interval arithmetic provides valid NABs on the result. The

standard deviation of the result values for a using the Monte Carlo method

may be either much larger or much smaller than the true error, and thus

this standard deviation is not a good estimate of an NAB. The Monte Carlo

method implemented in double precision also does not provide good results

related to the numerical sensitivity of the problem. It turns out that us-

ing extended precision with 34 digits solves all these problems. The Monte

Carlo results then correspond with the column aKrδ∗, where δ∗ = 1×10−15,

the relative perturbation size used in the Monte Carlo method.

In interpretation 2 we simply consider this implementation of the iden-

tity function as ‘software under test’. The values of all four performance

metrics coincide for this example and are shown in the last column of the

table below.

Correct   Difference with   IA NAB      MC std. dev.   Result for    P0 = P1 =
a = x     correct result    for a       of a           a·Kr·δ∗       P2 = P3
0.0       0.0e+00           4.9e-324    0.0e+00        0.0e+00       (0.0)
0.1       -1.0e-01          8.3e-02     0.0e+00        1.0e-16       15.6
0.2       -4.8e-05          1.1e-04     6.7e-14        2.0e-16       12.0
0.3       1.8e-08           4.8e-08     1.1e-13        3.0e-16       8.4
0.4       1.7e-10           2.0e-10     5.1e-14        4.0e-16       6.3
0.5       0.0e+00           2.2e-16     0.0e+00        5.0e-16       0.0
0.6       4.5e-14           9.1e-14     2.2e-13        6.0e-16       2.5
0.7       2.7e-15           5.1e-15     3.2e-13        7.0e-16       1.3
0.8       -2.2e-16          6.7e-16     3.8e-13        8.0e-16       0.4
0.9       0.0e+00           3.3e-16     2.1e-13        9.0e-16       0.0
1.0       0.0e+00           0.0e+00     1.0e-15        1.0e-15       0.0

5.2. Example 2: Fit of exponential decay and baseline

The second example consists of fitting a function representing exponential

decay with a baseline to data points. The computational aim is: Given M

and data points (xi, yi), i = 1, . . . ,M , determine a, b and c that minimize

$$\sum_{i=1}^{M} \left[ y_i - \left( a + b \exp(-x_i/c) \right) \right]^2$$

and return c.

A reference data set was generated using the null-space method for xi values 0, 1, 2, . . . , 10, a = 1, b = 2, c = 3 and noise variance σ² = 0.25².

As forward fitting software Matlab’s fminsearch in double precision and

in extended precision (using Advanpix toolbox) were used. Two different

analyses were performed.

In interpretation 1 the numerical uncertainty of the reference data pair

generated by the null-space method is assessed. Using Matlab’s fminsearch


in double precision leads to a large NAB of 5× 10−9. Nevertheless Matlab

reports that the fitting algorithm has converged and all tolerances (set to

10−40) have been met. The standard deviation of 2×10−8 resulting from the

Monte Carlo method implemented in double precision is consistent with this

overestimate. The correctness of the reference data to a level of 1 × 10−16

could be established using Advanpix’s extended precision implementation of

fminsearch. The results of a Monte Carlo method implemented in extended

precision correspond well with an approximate analytical result by using

the Jacobian of the CA, the uncertainties in the yi and neglecting the (less

important) uncertainties of the xi (column LPU-y, law of propagation of

uncertainty with yi).

                            Double precision   Extended precision
Comparison with fit         5 × 10−9           1 × 10−16
Std. dev. MC                2 × 10−8           6 × 10−15
LPU-y                       7 × 10−15          7 × 10−15

In interpretation 2 the fminsearch function in double

and extended precision applied to the data stored in double precision are

evaluated as test software. Approximate values of the relative numerical

sensitivity coefficient and the relative condition number were calculated

using a Monte Carlo method. For the double precision implementation the

values of the four performance metrics are 7.3, 6.6, 6.2 and 6.2. For the

extended precision implementation these values are 0.2, 0.0, 0.0 and 0.0.

6. Conclusion

Assessing the numerical uncertainty of generated reference data pairs re-

quires a careful analysis. Using a very high numerical precision is usually the

easiest way to get a small numerical accuracy bound of the reference data

pair. Combining this with interval arithmetic at the same time would be ideal, but

is not straightforward. Assessing the quality of software under test working

in double precision requires knowledge of the condition number of the CA,

which may be difficult to obtain in some cases. It was shown that the re-

sults obtained by a Monte Carlo method do not always provide valid NABs

and that the results obtained by a Monte Carlo method implemented in

double precision with small εMC do not always correspond with numerical

sensitivity. Results obtained by a Monte Carlo method implemented in ex-

tended precision corresponded well with the numerical sensitivity. Finally


we presented some performance metrics to summarize the results of soft-

ware testing. They may be used to demonstrate traceability of metrology

software.

Acknowledgements

This work has been undertaken as part of the EMRP Joint Research Project

NEW06 “Traceability for computationally-intensive metrology”,6 jointly

funded by the EMRP participating countries within EURAMET and the

European Union.

We thank Peter Harris from NPL for useful discussions and comments

on the paper.

References

1. M G Cox and A B Forbes, Strategies for testing form assessment software, NPL Report DITC 211/92, 1999.
2. Matlab software, The Mathworks, www.mathworks.com.
3. S.M. Rump, INTLAB - INTerval LABoratory, in Tibor Csendes, editor, Developments in Reliable Computing, pages 77-104, Kluwer Academic Publishers, Dordrecht, 1999, www.ti3.tuhh.de/rump/.
4. Advanpix Multiprecision Computing Toolbox for MATLAB, www.advanpix.com.
5. Mathematica software, Wolfram, www.wolfram.com.
6. EMRP Project NEW06: “Traceability for Computationally-Intensive Metrology”, www.tracim.eu.


UNCERTAINTY EVALUATION FOR A

COMPUTATIONALLY EXPENSIVE MODEL OF A SONIC

NOZZLE

G. J. P. KOK∗, N. PELEVIC

VSL, Delft, The Netherlands
∗E-mail: [email protected]

Uncertainty evaluation in metrological applications with computationally expensive model functions can be challenging if it is not clear whether the model can be locally linearized and the law of propagation of uncertainty of the Guide to the Expression of Uncertainty in Measurement can be applied. The use of the Monte Carlo method as presented in GUM Supplement 1 is not practical, as it requires a vast number of model evaluations, which can be very time consuming in the case of computationally expensive model functions. For this type of model function, smart sampling approaches can be used to assess the uncertainty of the measurand. In this paper a computational fluid dynamics model of sonic gas flow through a Venturi nozzle is studied. Various smart sampling methods for uncertainty quantification of the model's output parameter, the mass flow rate, are assessed. Other sources of uncertainty of the model are briefly discussed, and a comparison with measurement data and with the results of a one-dimensional simplified model is made.

Keywords: Uncertainty evaluation, sonic nozzle, CFD, Monte Carlo, Latin Hypercube sampling, polynomial chaos

1. Introduction

Uncertainty evaluation in metrological applications with computationally

expensive model functions can be challenging if it is not clear whether the model can be locally linearized and the law of the propagation of uncertainty (LPU) of the Guide to the Expression of Uncertainty in Measurement (GUM1) can be applied. The use of the Monte Carlo method as presented in GUM supplement 12 is not practical, as it requires a vast number of model evaluations, which can be very time consuming in the case of computationally expensive model functions. For this type of model function, smart sampling

approaches can be used to assess the uncertainty of the measurand. In this

paper a computational fluid dynamics model of sonic gas flow through a

Venturi nozzle is studied. Various smart sampling methods for uncertainty


quantification of the model’s output parameter mass flow rate are assessed.

Other sources of uncertainty of the model are briefly discussed, and a com-

parison with measurement data and with the results of a 1-dimensional

simplified model is made.

2. Mathematical and stochastic model

In figure 1 an image of a Venturi nozzle is shown. Air flows in from the

left side, is accelerated to sonic and supersonic velocities and, after a shock

wave, decelerates again to subsonic conditions. A necessary condition for

reaching these sonic velocities is that the imposed ratio of outlet pressure

to inlet pressure is below a critical value.


Fig. 1. Venturi nozzle.

The first question to be answered in this paper is to predict the mass flow

rate through the nozzle and to calculate its uncertainty, given the nozzle’s

geometry, inlet and outlet pressure, inlet temperature and the values of a

number of physical constants, each known with a given uncertainty. The

second question is which method for uncertainty quantification is best to

use, i.e. converges fastest.

The flow model3 is governed by the classical Navier-Stokes equations.

It was solved using a transient solver for trans-sonic/supersonic turbulent

flow of a compressible gas in the CFD software OpenFoam.4 As it took too

long to solve the problem in three dimensions, the problem was simplified by using the rotational symmetry of the problem and solving a ‘slice’

of the nozzle flow problem in 2 dimensions. Performing four model evalua-

tions in parallel on a desktop computer with 4 Intel(R) Core (TM) i7 CPU

[email protected] GHz processors and 7.8 GB of RAM memory took approximately

30 minutes.


2.1. Stochastic model inputs

As noted before some model parameters are known with an uncertainty.

These parameters, together with their uncertainties are listed in table 1.

The parameters are assumed independent.

Table 1. Definition of distributions of the uncertain input quantities of the model. The symbol N(µ, σ²) is used to denote a normal distribution with mean µ and variance σ². The symbol R(a, b) is used to denote a rectangular distribution on the interval [a, b]. Parameters are assumed independent.

Variable   Description                         Unit        Distribution
pin        Inlet pressure                      Pa          N(100 000, 13²)
pout       Outlet pressure                     Pa          N(60 000, 13²)
Tin        Inlet temperature                   K           N(295, 0.1²)
M          Molar mass                          kg/kmol     N(28.85, 0.005²)
µ          Dynamic viscosity                   µPa s       R(18.09, 18.51)
cv         Specific heat at constant volume    kJ/(kg K)   R(710.5, 725.0)
Pr         Prandtl number                      -           R(0.687, 0.729)

The uncertainty in the mass flow rate through the nozzle due to the

uncertainty in the input parameters was calculated based on an evaluation

of the CFD model at different sets of input parameter values. The spe-

cific values of the input parameters were chosen according to the following

sampling methods:

• Monte Carlo sampling (abbreviated MC),

• Law of Propagation of Uncertainty sampling (LPU),

• Stratified sampling (SS),

• Latin Hypercube sampling (LHS),

• Polynomial Chaos sampling (PC).

Monte Carlo sampling simply consists of random samples from the distribu-

tions of each of the input parameters. The reference result was calculated

using this method with 1998 samples, denoted with REF-MC-1998. Fur-

thermore this method was tested with 8 and 128 samples, each repeated

three times. The results are identified with MC-8 and MC-128.

In law-of-propagation-of-uncertainty sampling the model is evaluated

at the mean input parameter values and subsequently each of the 7 input parameters is varied by one (resp. two and three) standard deviation(s), one after another, resulting in 8 model evaluations in total. From the


results sensitivity coefficients are calculated and the standard uncertainty

of the flow rate is calculated using the law of propagation of uncertainty. If

there is one input parameter mainly contributing to the uncertainty of the

flow rate, the distribution of this parameter can also be used as distribution

of the flow rate. The results of the three repetitions of this sampling method

are called LPU-8.
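A minimal sketch of the one-standard-deviation variant of this procedure follows; the names model_fun, mu and u are assumptions standing for the model, the vector of mean input values and the vector of standard uncertainties.

% Sketch of the LPU-8 approach: one run at the mean values plus one run per
% input varied by one standard deviation (1 + 7 = 8 model evaluations).
q0 = model_fun(mu);                       % model at the mean input values
c  = zeros(size(mu));                     % sensitivity coefficients
for j = 1:numel(mu)
    xj    = mu;
    xj(j) = mu(j) + u(j);                 % vary input j by one standard deviation
    c(j)  = (model_fun(xj) - q0) / u(j);
end
u_q = sqrt(sum((c .* u).^2));             % law of propagation of uncertainty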

In stratified sampling the input space is sub-divided into a number of

non-overlapping regions of (in our case) equal probability. As the input

quantities are independent and have a known distribution this subdivision

is straightforward. From each subdivision a random draw of the input pa-

rameters is taken. In our case the range of each of the 7 input parameters

was divided in 2, leading to a total of 2⁷ = 128 regions and the same number of samples. The procedure was repeated three times, and the results are denoted

with SS-128.

In Latin Hypercube sampling the range of each input variable is subdi-

vided in the same number N of non-overlapping intervals of (in our case)

equal probability. Then for each of the 7 input parameters a permutation (i1,j, i2,j, . . . , iN,j) of the numbers (1, 2, . . . , N) is randomly generated, and subsequently N samples are constructed, with sample k containing a random sample from the ik,j-th interval of input parameter j. We used the values N = 8 and N = 128 to allow fair comparisons with the other sampling methods using 8 and 128 model evaluations, and repeated the procedure three times. The results are denoted

with LHS-8 and LHS-128.
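As an illustration of this construction (a sketch only, not the sampling code used in the study), the following lines generate a Latin Hypercube design on the unit hypercube in base MATLAB:

% Sketch of Latin Hypercube sampling for d independent uniform(0,1) inputs.
d = 7;                                  % number of input parameters
N = 8;                                  % number of samples
U = zeros(N, d);
for j = 1:d
    perm    = randperm(N);              % random permutation of the N intervals
    U(:, j) = (perm' - rand(N, 1)) / N; % one random draw from each interval
end
% Column j is then mapped to the actual input distribution via its inverse CDF,
% e.g. for a rectangular distribution R(a, b):  X = a + (b - a) * U(:, j).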

In polynomial chaos sampling (usually simply called polynomial chaos,

we used the non-intrusive variant) the random output variable mass flow

rate is represented by a series development in the input random variables.

By using quadrature rules specific to each input probability distribution, moments of the output variable can be approximated. The complete prob-

ability distribution of the output variable can be approximated by a La-

grange interpolation. For a given order of series development the values of

the input parameters at which the model needs to be evaluated are fixed.

In the multi-dimensional case a tensor-grid of input parameter values is

constructed. In this study we used a second order development leading to

2⁷ = 128 model evaluations to be performed. As this method is determin-

istic, only 1 repetition was done resulting in PC-128. Note that so called

sparse grid approaches exist (e.g. Smolyak grids) that reduce the number

of model evaluations required compared to the full tensor grid. Sparse grids

were not used in this study.


3. Results

The reference result for the uncertainty calculation is based on 1998 model

evaluations based on random sampling (Monte Carlo method). We found

as mean flow rate qref = 43720(1) mg/s and a standard uncertainty of

u(qref) = 27(5) (mg/s), which is 0.06 % (0.01 %) in relative terms. The

numbers in the parentheses denote the uncertainties of these mean values.

Results for the calculated flow rate and the standard uncertainty for

several sampling methods and several sampling sizes, presented as difference

with the reference values are shown in table 2. In parentheses the standard

deviation of q and u(q) are shown based on the three repetitions of each

simulation. All methods seem to converge to the correct values, but the

stochastic methods MC and LHS clearly have a higher standard deviation

when using 8 samples than when using 128 samples.

Table 2. Results for the calculated flow rate and the standard uncertainty for several sampling methods and several sampling sizes, presented as difference with the reference values.

Sampling method   (q − qref) / (mg/s)   (u(q) − u(qref)) / (mg/s)
MC-8              2 (5)                 2 (8)
LPU-8             0 (0)                 0 (0)
LHS-8             0 (2)                 0 (6)
MC-128            0 (2)                 1 (1)
SS-128            -1 (2)                0 (0)
LHS-128           0 (0)                 0 (1)
PC-128            0 (0)                 -1 (0)

The full cumulative

distribution functions (CDF) were also compared for the different methods.

For LPU the rectangular distribution of the main uncertainty source cv was

used. For MC, LHS and SS a maximum and minimum CDF was calculated

based on the collected results. For PC the estimated output probability

function was calculated as well. In table 3 the maximum differences of

the reference CDF with these CDFs was calculated. It can be seen that the

CDFs of PC-128 and also LPU-8 always have good correspondence with the

reference CDF, whereas the other methods may lead to larger differences

in case of ‘bad luck’ in the sampling result. Note that as the reference

distribution is based on 1998 samples, the agreement is not expected to be

better than 0.005 at best.


Table 3. Maximal absolute difference of the CDF constructed from the samples with the reference CDF for various sampling strategies. As LPU-8 and PC-128 are deterministic methods the values in both columns are equal.

Sampling method   Max. abs. difference with CDF     Max. abs. difference with CDF
                  based on the minimum values       based on the maximum values
MC-8              0.308                             0.131
LPU-8             0.033                             0.033
LHS-8             0.250                             0.156
MC-128            0.088                             0.076
SS-128            0.077                             0.056
LHS-128           0.047                             0.048
PC-128            0.013                             0.013

4. Other uncertainty contributions

Other uncertainty sources contributing to the uncertainty of the flow rate predicted by the CFD model were studied as well. They

are listed in table 4. It can be seen that several uncertainty sources yield

higher uncertainty contributions than the 0.06 % due to the uncertainty in

the parameter values of the model.

Table 4. Other uncertainty sources related to the CFD model.

Uncertainty source                                    Contribution to relative standard uncertainty of q
Nozzle geometry (0.1 mm change in throat diameter)    1.4 %
Temperature dependence of physical parameters         0.1 %
Physical turbulence model                             0.5 %
Boundary conditions                                   0.0 %
Grid convergence / discretization error               1.1 %
Iteration convergence error                           0.0 %

5. Comparison with measurement data and with a

1-dimensional simplified model equation

The mass flow rate through the nozzle has been experimentally measured

at VSL’s gas flow lab with an uncertainty of 0.14 %. ISO-norm ISO-93005

provides a 1-dimensional formula for the mass flow rate with an uncertainty

of about 1 %. A quick comparison of the predicted values of the mass flow


rate and the measured mass flow rate is shown in table 5. In view of the

uncertainties the CFD model is both consistent with the predicted value of

the ISO-norm and with the measurement data.

Table 5. Comparison of predicted results for flow rate by CFD and ISO-9300 formula and experimental measurement.

Comparison   Relative difference   Relative standard uncertainty of difference
CFD - Exp.   2.0 %                 1.4 %
ISO - Exp.   3.8 %                 1.0 %
CFD - ISO    -1.8 %                1.7 %

6. Conclusions

We studied different methods to calculate the standard uncertainty and

full probability distribution of the mass flow rate through a sonic nozzle

given inlet and outlet pressure and inlet temperature of the flowing gas.

The problem appeared to be linear around the working point in relation to the size of the uncertainties. Therefore the law of the propagation

of uncertainty performed very well. Polynomial chaos was considered the

best method, but is more expensive in terms of model evaluations. At a

fixed sample size the Monte Carlo method showed the slowest convergence.

Stratified sampling performed better than Monte Carlo, and Latin Hyper-

cube sampling performed better than stratified sampling. In general it is

not known beforehand which input parameters are contributing most to

the output uncertainty, whether a local linearization of the model is sufficient, and which order of expansion is needed when using polynomial chaos. Therefore an analysis in stages is advised, rather than pointing out one

single method of choice. It was found that the uncertainty contribution

due to the uncertainty of the input parameters of the CFD model is much

smaller than some other uncertainty contributions. The correspondence of

the CFD model with the measurement data was acceptable in view of the

model uncertainty.

Acknowledgements

This work has been undertaken as part of the EMRP6 Joint Research

Project NEW04,7 “Novel mathematical and statistical approaches to un-


certainty evaluation”, co-funded by the Dutch ministry of Economic Affairs

and the European Union.

References

1. BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP and OIML, Evaluation of measurement data - Guide to the expression of uncertainty in measurement (Joint Committee for Guides in Metrology, JCGM 100:2008).
2. BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP and OIML, Evaluation of measurement data - Guide to the expression of uncertainty in measurement - Propagation of distributions using a Monte Carlo method (Joint Committee for Guides in Metrology, JCGM 101:2008).
3. M. Baer et al., Novel mathematical and statistical approaches to uncertainty evaluation: Physical model and statistical model for uncertainties for flow application defined (EMRP NEW04 Deliverable 2.1.6, 2013).
4. OpenFoam, Open source CFD toolbox (OpenCFD Ltd), www.openfoam.com.
5. ISO, ISO Norm-9300, Measurement of gas flow by means of critical flow Venturi nozzles (ISO, 2005).
6. European Metrology Research Programme, www.emrponline.eu.
7. Project website of EMRP Joint Research Project NEW04, “Novel mathematical and statistical approaches to uncertainty evaluation”, www.ptb.de/emrp/new04-home.html.


EllipseFit4HC: A MATLAB ALGORITHM FOR

DEMODULATION AND UNCERTAINTY EVALUATION OF

THE QUADRATURE INTERFEROMETER SIGNALS

RAINER KONING
Physikalisch-Technische Bundesanstalt, Bundesallee 100, 38116 Braunschweig, Germany
E-mail: [email protected]

GEJZA WIMMER
Faculty of Natural Sciences, Matej Bel University, Banska Bystrica, Slovakia
Mathematical Institute, Slovak Academy of Sciences, Bratislava, Slovakia
E-mail: [email protected]

VIKTOR WITKOVSKY
Institute of Measurement Science, Slovak Academy of Sciences, Dubravska cesta 9, 84104 Bratislava, Slovakia
E-mail: [email protected]

We present the MATLAB algorithm EllipseFit4HC for fitting an ellipse to noisy data by minimizing the geometric distance between the measured and fitted values. Quadrature homodyne interferometers typically exhibit two sinusoidal interference signals shifted, in the ideal case, by 90 degrees to allow detection of the direction of the motion responsible for the actual phase change. But practically encountered signals exhibit additional offsets, unequal amplitudes and a phase shift that differs from 90 degrees. In order to demodulate such interference signals an ellipse is fitted to both (possibly correlated) signals simultaneously. The algorithm EllipseFit4HC is suggested for estimating the ellipse parameters required for the demodulation of quadrature homodyne interferometer signals by using the Heydemann correction (HC), together with the associated uncertainties of the estimated ellipse parameters and interferometric phases and/or displacements. The accuracy of the proposed method has been verified by Monte Carlo simulations. The algorithm EllipseFit4HC is freely available at the MATLAB Central File Exchange, http://www.mathworks.com/matlabcentral/fileexchange/47420-ellipsefit4hc.

Keywords: Quadrature homodyne interferometers; Heydemann correction; uncertainties of interferometric phases; ellipse fitting; Matlab; EllipseFit4HC.


Fig. 1. Left: Schematic setup of a quadrature homodyne interferometer, where BS is the beam splitter and PBS is the polarizing beam splitter (schematic components: laser, detector, λ/8 plate, PBS, BS, reference mirror, measurement mirror). Right: Real interferometer signals (sine signal [V] plotted against cosine signal [V]).

1. Introduction

Quadrature homodyne interferometers are used in many dimensional

metrology applications. The output signals, called sine / cosine signals or

quadrature signals, usually exhibit offsets, unequal amplitudes and a phase

difference, that is not exactly 90 degree. Mathematically, the (noiseless)

output signals can be described as

x(ϕ) = α0 + α1 cosϕ

y(ϕ) = β0 + β1 sin(ϕ+ ϕ0), (1)

where ϕ is the phase (the parameter of a primary interest), α0 and β0 denote

the coordinates of the ellipse center (the offsets), α1 and β1 are the signal

amplitudes, and −π/2 < ϕ0 < π/2 is the phase offset. Under these circum-

stances, given the true values of the ellipse parameters, α0, β0, α1, β1, ϕ0,

and the particular signal values x and y (lying on this specific ellipse), the

required interferometric phase ϕ is determined by using the relation

$$\varphi = \arctan\left[ \frac{\alpha_1 (y - \beta_0) - \beta_1 (x - \alpha_0) \sin\varphi_0}{\beta_1 (x - \alpha_0) \cos\varphi_0} \right]. \qquad (2)$$
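As a minimal illustration (not part of the EllipseFit4HC interface), relation (2) can be evaluated directly for a single signal pair with known ellipse parameters; using atan2 instead of arctan is an assumption made in this sketch to resolve the quadrant of the phase.

% Sketch: demodulating one signal pair (x, y) with known ellipse parameters
% alpha0, beta0, alpha1, beta1, phi0 via relation (2).
num = alpha1*(y - beta0) - beta1*(x - alpha0)*sin(phi0);
den = beta1*(x - alpha0)*cos(phi0);
phi = atan2(num, den);          % interferometric phase, resolved over (-pi, pi]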

However, real applications have to use noisy experimental data (xi, yi),

i = 1, . . . , n. So it is a problem of fitting an ellipse to data by minimizing

$$SS(\vartheta) = \sum_{i=1}^{n} \left\{ \left[ x_i - (\alpha_0 + \alpha_1 \cos\varphi_i) \right]^2 + \left[ y_i - (\beta_0 + \beta_1 \sin(\varphi_i + \varphi_0)) \right]^2 \right\}. \qquad (3)$$

The procedure requires a minimization in the (n + 5)-dimensional param-

eter space, with the parameters ϑ = (α0, β0, α1, β1, ϕ0, ϕ1, . . . , ϕn). This

approach is predictably cumbersome and slow for relatively large n (a typi-

cal case for the interferometer measurements). The estimation of the ellipse


parameters, phases, and their covariance matrix (and/or the associated un-

certainties) is still a computational challenge in such types of models. So,

in such cases one should typically resort to various approximations. Here

we present an implementation of the MATLAB algorithm ellipseFit4HC,

which is based on an approximate method for estimation of the ellipse pa-

rameters and their uncertainties, as suggested in [1]. For further details and

alternative methods for fitting an ellipse to experimental data see [2–8].

2. Approximate method based on linearization of the

regression model with constraints

According to [1] and [9], the originally nonlinear model (in fact the linear

regression model with nonlinear constraints on its parameters) is approxi-

mated locally by a linear regression model with linear constraints of type II.

This allows one to derive the locally best linear unbiased estimators (BLUEs) of the model parameters, as well as the (approximate) covariance matrix of the estimators. Using this solution the required interferometric phases follow from (2), and their uncertainties can be obtained in a straightforward way by the law of propagation of uncertainty. The pro-

cess of linearization/estimation can be iterated, until an adequately chosen

convergence criterion is reached.

2.1. The algorithm

The measurement model for the quadrature output signals (xi, yi), i =

1, . . . , n, can be specified by

xi = µi + εx,i,

yi = νi + εy,i, (4)

with the following set of nonlinear restrictions on the model parameters,

$$\mu_i^2 + B\nu_i^2 + C\mu_i\nu_i + D\mu_i + F\nu_i + G = 0, \qquad i = 1, \ldots, n, \qquad (5)$$

where B,C,D, F,G represent the algebraic ellipse parameters. Notice that

the ellipse parameters B,C,D, F,G appear only in the restrictions. In a

matrix notation we get

$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \mu \\ \nu \end{pmatrix} + \begin{pmatrix} \varepsilon_x \\ \varepsilon_y \end{pmatrix}, \qquad \begin{pmatrix} \varepsilon_x \\ \varepsilon_y \end{pmatrix} \sim N\!\left( \begin{pmatrix} \mathbf{0} \\ \mathbf{0} \end{pmatrix}, \; \sigma^2 \begin{pmatrix} I & \varrho I \\ \varrho I & I \end{pmatrix} \right), \qquad (6)$$

where x = (x1, . . . , xn)′, y = (y1, . . . , yn)′, µ = (µ1, . . . , µn)′, ν =

(ν1, . . . , νn)′, εx = (εx,1, . . . , εx,n)′, εy = (εy,1, . . . , εy,n)′, such that εx ∼


N(0, σ²I) and εy ∼ N(0, σ²I) (possibly correlated, with corr(εx,i, εy,i) = ϱ, i = 1, . . . , n), and with nonlinear restriction on the model parameters of the form

$$\mathbf{B}\theta + b = \mathbf{0}, \qquad (7)$$

where $\mathbf{B} = [\,\nu^2 \;\vdots\; \mu\nu \;\vdots\; \mu \;\vdots\; \nu \;\vdots\; \mathbf{1}\,]$, $\theta = (B, C, D, F, G)'$, $b = \mu^2$, with $\mu^2 = (\mu_1^2, \ldots, \mu_n^2)'$, $\nu^2 = (\nu_1^2, \ldots, \nu_n^2)'$, $\mu\nu = (\mu_1\nu_1, \ldots, \mu_n\nu_n)'$, $\mathbf{1} = (1, \ldots, 1)'$ and $\mathbf{0} = (0, \ldots, 0)'$. Here, $[u \;\vdots\; v]$ denotes the concatenation of the vectors u and v to a matrix.

We shall linearize the nonlinear system of restrictions, Bθ + b = 0, by the first-order Taylor expansion about µ0, ν0, and θ0,

$$\mathbf{B}\theta + b \approx (\mathbf{B}_0\theta_0 + b_0) + \left.\frac{\partial(\mathbf{B}\theta + b)}{\partial\mu'}\right|_0 (\mu - \mu_0) + \left.\frac{\partial(\mathbf{B}\theta + b)}{\partial\nu'}\right|_0 (\nu - \nu_0) + \left.\frac{\partial(\mathbf{B}\theta + b)}{\partial\theta'}\right|_0 (\theta - \theta_0) \approx A_0 \begin{pmatrix} \mu_\Delta \\ \nu_\Delta \end{pmatrix} + \mathbf{B}_0\theta_\Delta + c_0, \qquad (8)$$

where

$$A_0 = \left[\, \mathrm{Diag}\!\left( [\,\mathbf{0} \;\vdots\; \nu_0 \;\vdots\; \mathbf{1} \;\vdots\; \mathbf{0} \;\vdots\; \mathbf{0}\,]\,\theta_0 + 2\mu_0 \right) \;\vdots\; \mathrm{Diag}\!\left( [\,2\nu_0 \;\vdots\; \mu_0 \;\vdots\; \mathbf{0} \;\vdots\; \mathbf{1} \;\vdots\; \mathbf{0}\,]\,\theta_0 \right) \right], \quad \mu_\Delta = \mu - \mu_0, \quad \nu_\Delta = \nu - \nu_0,$$

$$\mathbf{B}_0 = [\,\nu_0^2 \;\vdots\; \mu_0\nu_0 \;\vdots\; \mu_0 \;\vdots\; \nu_0 \;\vdots\; \mathbf{1}\,], \quad \theta_\Delta = \theta - \theta_0, \quad c_0 = \mathbf{B}_0\theta_0 + b_0, \quad \theta_0 = (B_0, C_0, D_0, F_0, G_0)', \quad b_0 = \mu_0^2. \qquad (9)$$

Thus, we get the (approximate) linear regression model with linear constraints,

$$\begin{pmatrix} x_\Delta \\ y_\Delta \end{pmatrix} \overset{\mathrm{approx}}{\sim} N\!\left( \begin{pmatrix} \mu_\Delta \\ \nu_\Delta \end{pmatrix}, \; \sigma^2 H \right) \;\;\wedge\;\; A_0 \begin{pmatrix} \mu_\Delta \\ \nu_\Delta \end{pmatrix} + \mathbf{B}_0 \theta_\Delta + c_0 = \mathbf{0}, \qquad (10)$$

where x∆ = x − µ0, y∆ = y − ν0, A0, B0, and c0 are given by (9), and H is a known correlation matrix of the measurement errors (εx′, εy′)′, here $H = \begin{pmatrix} I & \varrho I \\ \varrho I & I \end{pmatrix}$. This model serves as a first-order approximation to the

originally nonlinear model (4)–(5), which is correct in the vicinity of the

preselected fixed values of the parameters, µ0, ν0, and θ0.

Hence, the (locally) best linear unbiased estimators (BLUEs) of the

model parameters and their covariance matrix can be estimated by a


method suggested in [9], for more details see also [1]:

$$\begin{pmatrix} \begin{pmatrix} \hat{\mu}_\Delta \\ \hat{\nu}_\Delta \end{pmatrix} \\ \hat{\theta}_\Delta \end{pmatrix} = -\begin{pmatrix} H A_0' Q_{11,0} \\ Q_{21,0} \end{pmatrix} c_0 + \begin{pmatrix} I - H A_0' Q_{11,0} A_0 \\ -Q_{21,0} A_0 \end{pmatrix} \begin{pmatrix} x_\Delta \\ y_\Delta \end{pmatrix}, \qquad (11)$$

where Q11,0 and Q21,0 are blocks of the matrix Q0 defined by

$$Q_0 = \begin{pmatrix} Q_{11,0} & Q_{12,0} \\ Q_{21,0} & Q_{22,0} \end{pmatrix} = \begin{pmatrix} A_0 H A_0' & \mathbf{B}_0 \\ \mathbf{B}_0' & 0 \end{pmatrix}^{-1},$$

together with its covariance matrix

$$\mathrm{cov}\begin{pmatrix} \begin{pmatrix} \hat{\mu}_\Delta \\ \hat{\nu}_\Delta \end{pmatrix} \\ \hat{\theta}_\Delta \end{pmatrix} = \sigma^2 \begin{pmatrix} H - H A_0' Q_{11,0} A_0 H & -H A_0' Q_{12,0} \\ -Q_{21,0} A_0 H & -Q_{22,0} \end{pmatrix}. \qquad (12)$$

Then, the estimators of the original parameters µ, ν, and θ are given by

$$\hat{\mu} = \hat{\mu}_\Delta + \mu_0, \quad \hat{\nu} = \hat{\nu}_\Delta + \nu_0, \quad \hat{\theta} = \hat{\theta}_\Delta + \theta_0. \qquad (13)$$

The estimator of the common variance σ² is given by

$$\hat{\sigma}^2 = \frac{1}{n-5} \left( \begin{pmatrix} x \\ y \end{pmatrix} - \begin{pmatrix} \hat{\mu} \\ \hat{\nu} \end{pmatrix} \right)' H^{-1} \left( \begin{pmatrix} x \\ y \end{pmatrix} - \begin{pmatrix} \hat{\mu} \\ \hat{\nu} \end{pmatrix} \right). \qquad (14)$$

In particular, for uncorrelated measurement deviations we get

$$\hat{\sigma}^2 = \frac{1}{n-5} \sum_{i=1}^{n} \left( [x_i - \hat{\mu}_i]^2 + [y_i - \hat{\nu}_i]^2 \right). \qquad (15)$$

The process of linearization/estimation can be iterated until the stated convergence criterion is reached. We suggest starting with µ0 = x, ν0 = y, and $\theta_0 = -(\mathbf{B}_0'\mathbf{B}_0)^{-1}\mathbf{B}_0' x^2$, where $\mathbf{B}_0 = [\,\nu_0^2 \;\vdots\; \mu_0\nu_0 \;\vdots\; \mu_0 \;\vdots\; \nu_0 \;\vdots\; \mathbf{1}\,]$.
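A minimal sketch of these suggested starting values (assuming column vectors x and y of the measured quadrature signals; this is not the full ellipseFit4HC implementation):

% Sketch of the suggested starting values for the iteration.
mu0 = x;                                   % initial fitted means
nu0 = y;
B0  = [nu0.^2, mu0.*nu0, mu0, nu0, ones(size(mu0))];   % design matrix
th0 = -((B0' * B0) \ (B0' * mu0.^2));      % initial algebraic ellipse parameters
% th0 = (B, C, D, F, G)' as in the restrictions (5)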

3. Example

We shall illustrate applicability and usage of the MATLAB algorithm

EllipseFit4HC by a simple example using artificially generated quadrature

homodyne interferometer signals with correlated measurement deviations.

We have generated and fitted the measurement signals with the following

(known) true parameters and by using the following MATLAB code:


alpha0true = 0;                        % x center offsets
beta0true = 0;                         % y center offsets
alpha1true = 1;                        % x amplitudes
beta1true = 0.98;                      % y amplitudes
phi0true = pi/10;                      % phase offset
sigma = 0.01;                          % true std
n = 1000;                              % number of measurements
rho = 0.95;                            % true correlation
phitrue = (2*pi)*sort(rand(n,1));      % true phases phi_i
%% Set the true ellipse values
X = @(t) alpha0true + alpha1true * cos(t);
Y = @(t) beta0true + beta1true * sin(t + phi0true);
%% Generate the measurement deviations (use Statistics Toolbox)
err = mvnrnd([0 0],(sigma*sigma)*[1 rho; rho 1],n);
x = X(phitrue) + err(:,1);
y = Y(phitrue) + err(:,2);
%% Set the optional parameters
options.alpha = 0.05;                  % level of significance
options.correlation = rho;             % known correlation coeff
options.displconst = 633.3/(4*pi);     % displacement constant
options.displunit = 'nanometer [nm]';  % displacement unit
%% Fit the ellipse
results = ellipseFit4HC(x,y,options);

The outcome of the algorithm (results) is a rich structure of different

outcomes which includes the estimated ellipse parameters, phases, displace-

ments, their estimated standard uncertainties and/or confidence intervals

(Type A evaluation). For example, here we present the tables with esti-

mated algebraic ellipse parameters (B,C,D, F,G), as well as the geometric

ellipse parameters (α0, β0, α1, β1, ϕ0), together with their standard uncer-

tainties and the 95% confidence intervals. Compare the estimated parame-

ters and their confidence intervals with the true values of the parameters.

          ESTIMATE       STD           LOWER         UPPER
B         1.0425         0.0013493     1.0398        1.0451
C         -0.62988       0.0012831     -0.6324       -0.62736
D         0.0012767      0.00056939    0.00015932    0.002394
F         0.00069123     0.00058911    -0.0004648    0.0018473
G         -0.90511       0.00069285    -0.90647      -0.90375

alpha_0   -0.00082085    0.00036463    -0.0015364    -0.00010531
beta_0    -0.00057952    0.0003603     -0.0012866    0.00012751
alpha_1   1.0001         0.0004787     0.9992        1.0011
beta_1    0.97956        0.00047917    0.97862       0.9805
phi_0     0.31357        0.00066165    0.31227       0.31487


[Figure 2 contains four panels: "Statistical Uncertainty: Min: 0.13246, Max: 0.58993" (statistical uncertainty [nm] versus true displacement), "Expanded Uncertainty of Fitted Displacement" (residuals versus true displacement), "Expanded Uncertainty of Fitted X Residuals" and "Expanded Uncertainty of Fitted Y Residuals" (X and Y residuals versus true phi).]

Fig. 2. The statistical uncertainties of the estimated displacements, the residuals and their expanded uncertainties (coverage factor k = 2).

The algorithm EllipseFit4HC also provides the estimate of the common variance (σ̂² = 0.00009895) and similar tables for estimated phases and/or

displacements (the parameters of main interest), together with their esti-

mated standard uncertainties and confidence intervals. Figure 2 presents

the statistical uncertainties of the estimated displacements graphically, to-

gether with the (different) residuals and their expanded uncertainties.

4. Conclusions

The advantage of the presented algorithm is that it provides the (locally) best linear unbiased estimators (BLUEs) of the parameters and also their covariance matrix. In the calculation of the phase values the observed input

signals (measurements) are replaced by their estimates (the fitted values on

the ellipse curve). The covariances of these estimates and the estimates of

the curve parameters are also included. Therefore the statistical uncertainty


of the phase can be determined using the error propagation law without

any difficulties.

The MATLAB algorithm EllipseFit4HC has been implemented and is freely available at the MATLAB Central File Exchange, http://www.mathworks.com/matlabcentral/fileexchange/47420-ellipsefit4hc.

Acknowledgements

The work was partly supported by the Slovak Research and Development

Agency, projects APVV-0096-10, SK-AT-0025-12, and by the Scientific

Grant Agency of the Ministry of Education of the Slovak Republic and the

Slovak Academy of Sciences, projects VEGA 2/0038/12, and 2/0043/13.

References

[1] R. Koning, G. Wimmer and V. Witkovsky, Ellipse fitting by nonlinear constraints to demodulate quadrature homodyne interferometer signals and to determine the statistical uncertainty of the interferometric phase, Measurement Science and Technology 25, p. 115001 (11pp) (2014).
[2] W. Gander, G. H. Golub and R. Strebel, Least-squares fitting of circles and ellipses, BIT Numerical Mathematics 34, 558 (1994).
[3] N. Chernov and C. Lesort, Statistical efficiency of curve fitting algorithms, Computational Statistics and Data Analysis 47, 713 (2004).
[4] N. Chernov and S. Wijewickrema, Algorithms for projecting points onto conics, Journal of Computational and Applied Mathematics 251, 8 (2013).
[5] S. Van Huffel and J. Vandewalle, The Total Least Squares Problem: Computational Aspects and Analysis (SIAM, Philadelphia, 1991).
[6] A. Malengo and F. Pennecchi, A weighted total least-squares algorithm for any fitting model with correlated variables, Metrologia 50, 654 (2013).
[7] S. J. Ahn, Least squares orthogonal distance fitting of curves and surfaces in space, in Lecture Notes in Computer Science Volume (Springer, Heidelberg, 2004) pp. 17–34.
[8] C.-M. Wu, C.-S. Su and G.-S. Peng, Correction of nonlinearity in one-frequency optical interferometry, Measurement Science and Technology 7, 520 (1996).
[9] L. Kubacek, Foundations of Estimation Theory (Elsevier, Amsterdam, 1988).


CONSIDERATIONS ON THE INFLUENCE OF TEST

EQUIPMENT INSTABILITY AND CALIBRATION METHODS

ON MEASUREMENT UNCERTAINTY OF THE TEST

LABORATORY

KRIVOV A.S.

Moscow div. of Dipaul, 20 b.1, Ogorodny proezd, Moscow, 127322, Russia

MARINKO S.V.

Metrological Center, 13 Komarova str., Mytischi, Moscow reg., 141006, Russia

BOYKO I.G.

Metrological Center, 13 Komarova str., Mytischi, Moscow reg., 141006, Russia

The purpose of this paper is to show how the accuracy, frequency and number of measurements used for calibration, and the instability of the test equipment, can affect the evaluation of measurement uncertainty in product testing for resistance to the impact of the external environment. The proposed scheme for calculating the uncertainty estimate is based on linear filtering of the measurements from periodic calibration of equipment with unstable characteristics. This work considers the example of a temperature chamber.

Keywords: Test equipment calibration, measurement uncertainty, Kalman filtering,

temperature chamber

1. Introduction

Evaluation of measurement uncertainty is necessary for confidence in the results of tests in the laboratory. Sources of uncertainty involve methods and equipment, the environment, the technical performance and state of the test object, as well as the operator. Inaccuracy of the external impacts on the test object, which may be associated with drift of the test equipment characteristics in time and the uncertainty of their estimates from periodic calibration, should be considered in the case of product testing for resistance to the impact. For product testing for resistance to external impacts the main sources of uncertainty of the measured parameters can be grouped into two categories: 1) imperfect measurement procedures (calibration uncertainty, characteristics drift, outlying measurement samples, influence of external conditions, etc.); 2) inaccurate knowledge of the actual impact values because of the uncertainty of the test


equipment parameters and their changes since the last calibration. The influence

of these components differs depending on the test purpose. When checking the

deviation of object parameters it is evident that the influence of the test

equipment characteristics will be significantly greater than in the case of simple

verification that the object works normally.

The authors consider product testing which includes the assessment of the influence of external conditions on the object parameters. In this case, one should apply a correction for the deviation of the actual impact value from the setpoint as

$$\Delta x_f = \Delta x_f^{meas} - C\,\Delta f_{meas}, \qquad (1)$$

where ∆x_f is the change of object parameter x due to impact value f, ∆x_f^meas is the estimate of the change of parameter x due to impact value f obtained by measurement, ∆f_meas is the deviation of impact value f from the setpoint, and C is the coefficient of influence of impact value f on the product parameter x.

Accordingly, the evaluation of standard uncertainty can be written as

2 2 2( ) ( ) ( )f fmeas meas

u x u x C u f∆ = ∆ + ∆ , (2)

where u(∆xfmeas) is the total number of all standard uncertainties associated with

the imperfection of measurement procedures, u(∆fmeas) is the evaluation of

measurement uncertainty of impact values during product testing.

The state of the test equipment can be evaluated in terms of various characteristics, not only the deviations ∆f. Therefore, a generalized model of the condition of the test equipment was used to analyze the instability of its characteristics. The authors represent the dynamics of the test equipment characteristics as a partially observable stochastic sequence:

Uk = Uk−1 + Bk Sk ,   (3)

where Uk is the vector of the test equipment characteristics at time tk, Bk is the matrix of coupling coefficients between the random changes of the elements of the vector Uk, and Sk is a sequence of random Gaussian vectors with zero mathematical expectation and covariance matrix RS.

The test equipment is calibrated at the beginning of operation and at regular intervals thereafter. The direct and indirect measurements of the test equipment characteristics at operation time tk are represented by the expression

Ik = αk (Ck + Fk Uk) + Vk ,   (4)

where αk is a matrix whose diagonal elements take the values 0 or 1, depending on whether or not a particular parameter is measured during the calibration of the test equipment at time tk, Ck is the deterministic vector defining the operating regime of the test equipment, Fk is the matrix projecting the vector Uk into the measurement space of Ik, and Vk is a Gaussian random variable with covariance matrix Rv.

Applying the apparatus of optimal linear filtering to the dynamic models (3) and (4), at each time step we can obtain the best (minimum-variance) estimate Ûk of the test equipment characteristics vector and its covariance matrix.

Thus, the obtained accuracy characteristics can be considered as the total

uncertainty, including components associated with the equipment model errors

(including spatial and temporal variability of parameters), as well as the

uncertainties in the calibration measurements, that are accounted for in the

models (3) and (4).

2. Consideration of models

We consider the test equipment, a test chamber, as a multidimensional dynamic system. The parameters of such a system are correlated, and this correlation can be used as a source of additional a priori information. The variation of the parameters is treated as a process represented by a random time-varying vector. The equipment characteristics are evaluated at discrete times tk, k = 0, 1, …, N.

It is desirable to obtain estimates of the time-varying parameters that mitigate the effect of the process noise and of the measurement noise. The purpose of this smoothing is to obtain estimates of the dynamic system parameters that converge to the mathematical expectation and have minimal variance. Under certain conditions such estimation is provided by the Kalman filter, or linear quadratic estimation [1]. The Kalman filter is now widely used for processing measurement results of dynamic systems in electronics, communications equipment and other technologies; modern navigation systems are impossible to imagine without it. The algorithm works in a two-step process of prediction and update. In the prediction step, the Kalman filter produces estimates of the current state. Once the results of the next measurement, which contain some errors, are received, these estimates are updated using a weighted average, with more weight given to the estimates with higher certainty. The algorithm can run in real time using only the present input measurements and the previously calculated state and its uncertainty matrix; no additional information is required.

Study of the real application of the Kalman filter involves exploring the conditions under which the optimal estimates of the parameters and the lowest variances are obtained. The Kalman filter assumes that the processes (3), (4) comply with some restrictions. The first is the requirement of linearity of the system. We assume the intervals ∆t = tk+1 − tk small enough to represent the change of the equipment characteristics as a linear relationship. The state model of the equipment is then characterized as a vectorial differential equation of first order. The following restrictions are applied to the state vector and the measurement model (3), (4):
– the random sequences Sk, Vk do not depend on the current values of the phase coordinates and are not correlated with each other;
– the sequences Sk, Vk, with zero mathematical expectations E(Sk), E(Vk) and covariance matrices RS, Rv, are defined for all tk.
A necessary condition for the Kalman filter to work correctly is that the system whose states are to be estimated is observable. The system (3) is observable if and only if the dimension of the measurement vector is equal to the dimension of the state vector (the number of variables).
In actual practice, ensuring all of the above assumptions is complex and not always justified by the task. Therefore, the results of the calculations do not have the properties of ideal estimates. The study of the application of the Kalman filter is thus the study of the conditions of its applicability and of the extent to which they can be fulfilled. The main sources of non-optimal estimates can be a mismatch between the model (3) and the real characteristics of the process over time, an incomplete observability of the characteristics in the working volume of the equipment in the model (4), and incomplete information about the characteristics of the measuring equipment.

For the computational scheme we consider the recursive estimation algorithm that applies the Kalman filter [1] to the equations of the state vector (3) and the measurement vector (4). The most common type of test equipment measurement (4) is the periodic calibration of the equipment.

For the implementation of the state prediction at time tk+1 it is enough to know the filtered estimate at time tk and the transition matrices Φk+1/k and Bk+1/k. The estimation filter includes two components: the estimate of the state Ûk and the covariance matrix Kk. An efficient one-step prediction of the state vector is

Ûk+1/k = Φk+1/k Ûk + Bk+1/k Sk ,   (5)

where Φk+1/k, Bk+1/k are the transition matrices and Sk is the Gaussian Markov random vector sequence.

The prediction of the covariance matrix is estimated as

Kk+1/k = Φk+1/k Kk/k Φk+1/kᵀ + Bk+1/k RS Bk+1/kᵀ .   (6)

This one-step prediction is regarded as a dynamic extrapolation assessing the current state over a small interval. The prediction is then filtered as follows. The filtered estimate of the state vector is given by

Ûk+1/k+1 = Ûk+1/k + Pk+1 [Ik+1 − αk+1 (Ck+1 + Fk+1 Ûk+1/k)] .   (7)

This expression shows that filtering corrects the prediction Ûk+1/k by a value proportional to the discrepancy between the measured parameter vector and the one determined on the basis of the prediction. The weighting matrix Pk+1 is defined by the expression

Pk+1 = Kk+1/k Fk+1ᵀ αk+1ᵀ [αk+1 Fk+1 Kk+1/k Fk+1ᵀ αk+1ᵀ + Rv]⁻¹ .   (8)

Similarly, the prediction of the covariance matrix is filtered:

Kk+1/k+1 = Kk+1/k − Pk+1 αk+1 Fk+1 Kk+1/k .   (9)

Equations (5)–(9) form the recursive Kalman filter algorithm. Information about the previous state of the object is reflected in the weight matrix Pk+1. Thus, the algorithm for obtaining recurrent estimates, comprising prediction and Kalman filtering, forms a single computing scheme: each filtered estimate is calculated from the prediction, and the next prediction is calculated from the filtered estimate.
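As an illustration, the recursion (5)–(9) can be written in a few lines of code. The sketch below is a minimal version assuming Python with NumPy; in the prediction step the zero-mean noise Sk enters only through its covariance RS, and all matrices and numerical values are hypothetical placeholders for a small three-point chamber model, not values from the paper.

import numpy as np

def kalman_step(U_hat, K, Phi, B, R_S, alpha, C, F, R_v, I_meas):
    # Prediction, Eqs. (5)-(6); the zero-mean noise S_k contributes only via R_S
    U_pred = Phi @ U_hat
    K_pred = Phi @ K @ Phi.T + B @ R_S @ B.T
    # Gain, Eq. (8); alpha switches individual measurement channels on or off
    H = alpha @ F
    P = K_pred @ H.T @ np.linalg.inv(H @ K_pred @ H.T + R_v)
    # Update, Eqs. (7) and (9)
    U_new = U_pred + P @ (I_meas - alpha @ (C + F @ U_pred))
    K_new = K_pred - P @ H @ K_pred
    return U_new, K_new

# Hypothetical three-point chamber: identity dynamics, direct observation
n = 3
Phi = B = F = np.eye(n)
R_S = R_v = 0.01 * np.eye(n)
C = np.zeros(n)
alpha = np.diag([1.0, 0.0, 1.0])           # point 2 not measured at this calibration
U_hat, K = np.full(n, 23.0), np.eye(n)
I_meas = alpha @ np.array([23.1, 0.0, 22.9])
U_hat, K = kalman_step(U_hat, K, Phi, B, R_S, alpha, C, F, R_v, I_meas)
print(U_hat, np.trace(K))                   # filtered state and Tr(K), cf. Figures 1-4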

3. Computational scheme and properties of the evaluations

The computational procedure can be carried out if the state of the object is sufficiently observable from the measurements. The probabilistic correlation between the parameters of the object's state suggests that the number of measurement channels involved need not equal the order of the state vector. Applied to the mathematical models, this means the possibility of lowering the rank of the matrix F in (4) at the (k+1)-th estimation step relative to the order of the state vector. The matrix α implements this procedure and enables the choice of the number of measuring channels. However, the stability of the algorithm in this case requires further study.

Consider the physical interpretation of the state model and measurement parameters for the example of spatially distributed temperature in the working volume of a chamber. The elements of the vector U are the reproduced temperatures at n points of the working volume. At the initial time tk = t0 these parameters are the elements of the state vector U0. The temperature fluctuations at the i-th point of the working volume (i = 1, 2, …, n) can be regarded as random. The point estimate of the mathematical expectation of the i-th element of the state vector Uk at tk = t0 is determined from the expression

Ûi0 = (1/l) Σm=1..l Tim ,   (10)

where l is the number of observations at the i-th point of the working volume and Tim is the result of the m-th observation of the temperature at the i-th point.

The covariance matrix of the disturbances of the process at the initial time t0 is formed of elements defined as the covariances between the random temperature fluctuations at the i-th and j-th points of the working volume:

RS0^ij = (1/l) Σm=1..l (Tim − Ûi0)(Tjm − Ûj0) ,   (11)

where Tim, Tjm are the results of the m-th observation at the i-th and j-th points. The diagonal elements of the n×n matrix RS0 are point estimates of the variances of the temperature fluctuations.

Any temperature parameter in the working volume can be measured, ranging from the directly measured temperature to complex functions of temperature. The n×n matrix αk manages the observation process by switching the i-th measurement channel on or off, i = 1, …, n. The relevant elements of the matrix αk are determined as follows:
αii = 0, if the temperature at the i-th point is not measured,
αii = 1, in the opposite case.
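A short sketch of this initialization, assuming Python with NumPy, is given below; the temperature readings in temps are hypothetical values used only to show how Eqs. (10) and (11) are evaluated.

import numpy as np

# Hypothetical readings: l = 5 observation cycles at n = 3 points (deg C)
temps = np.array([[23.1, 22.9, 23.0],
                  [23.2, 22.8, 23.1],
                  [23.0, 23.0, 22.9],
                  [23.1, 22.9, 23.0],
                  [23.3, 22.7, 23.1]])
l, n = temps.shape

U0 = temps.mean(axis=0)          # Eq. (10): point estimates of the mean temperatures
dev = temps - U0
R_S0 = (dev.T @ dev) / l         # Eq. (11): covariance matrix of the fluctuations
print(U0)                        # initial state vector estimate
print(np.diag(R_S0))             # variances of the fluctuations at the n points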

A property of the Kalman filter is that the algorithm for computing the estimates (5)–(9) is always asymptotically stable if the state of the dynamic object is fully observable from the measurements. In this case, the diagonal elements of the estimation error matrix Kk/k form a decreasing sequence over time. The results of calculations using the algorithm (5)–(9), presented as the trajectory of the trace of the estimation error matrix of the filter, are shown in Figure 1. The presented trajectory reflects the theoretical stability of the Kalman filter algorithm; the stability of the filter does not depend on the characteristics of the state model.


Figure 1. Asymptotic stability of the Kalman filter

The assumptions of asymptotic stability and of the absence of measurement errors are theoretical. A lack of a priori information about the dynamics of the state parameters can be offset by introducing "fictitious" noise [2]. The method increases the weight of the measurements at the next step by introducing into equation (6) a scalar weighting factor s > 1:

Kk+1/k = s (Φk+1/k Kk/k Φk+1/kᵀ + Bk+1/k RS Bk+1/kᵀ) .   (12)

Divergence of the filtering results manifests itself as a monotonic, unbounded accumulation of errors over time. The results of the calculations for s = 1.0…1.5 are shown in Figure 2, and the influence of the measurement accuracy on the filtering error in Figure 3.

Figure 2. Filter estimation errors accumulating over time.

[Figures 1 and 2 plot Tr(Kk/k), °C² versus t, year; the curves in Figure 2 correspond to ∆s = 0 %, 30 % and 50 % at α→100 %, ∆V = 0.1 °C.]


Figure 3. Influence of the measurement accuracy on error filtering.

The choice of the number of measuring points is also associated with the

divergence of the Kalman filter. The results of calculations for the number of

measuring points in the working volume from 2 to 9 are presented in Figure 4.

Figure 4. Increase of the filtering errors with a reduced number of measurement points.

4. Application to temperature chambers

Consider the problem of choosing the number of measurement points for periodic calibrations. It can be interpreted as a search for a rational sequence of measurement vectors over a sufficiently long time tN, represented by the sequence

Tr K1/1(α1), Tr K2/2(α2), …, Tr Kk/k(αk), …, Tr KN/N(αN) .   (13)

The goal was to find a sequence (13) that satisfies the temperature accuracy requirements and minimizes the total measurement costs. For example, the algorithm in [3] is based on the discrete version of the Pontryagin maximum principle. In most cases, a simple solution of the problem by trying variations of the sequence (13) is enough.

[Figures 3 and 4 plot Tr(Kk/k), °C² versus t, year; the curves in Figure 3 correspond to ∆V = 0.1, 0.2 and 0.5 °C, and those in Figure 4 to α→25 %, 60 % and 100 % at ∆s = 30 %, ∆V = 0.1 °C.]

Take a time interval equal to 12 years with a sampling interval of one year. As a first approximation, we took nine measuring points, α→9, at each time tk, k = 1, 2, …, N. As second and subsequent approximations, we consider options with fewer measurement points. In this way we try to obtain the minimum number of measurements for the selected time period without exceeding the acceptable threshold of the estimation uncertainty. After some iterations we reach the minimum number of measurement points on the trajectory of the state uncertainty:

α1→1   α2→1   α3→1   α4→3   α5→1   α6→1   …
0.65    0.58    0.66    0.83    0.64    0.57    …   (14)

Calculations showed that the optimal solution (Fig. 5) corresponds to measurements at one point every year and at three points every three years.

Figure 5. Trajectory of temperature uncertainty corresponding to an optimal number of measurement points.

5. Conclusions

When estimating the measurement uncertainty in a test laboratory, the instability of the equipment parameters and the composition and accuracy of the measurements made for its calibration must be taken into account. Joint consideration of all uncertainty components is suggested on the basis of a recursive algorithm that filters the results of the measurements made during periodic calibrations.

The example of a temperature chamber demonstrated the suitability of the approach for estimating the temperature uncertainty in the working volume for different numbers of measurement points and periodic calibrations. The possibility of optimizing the cost of calibration is also shown.



References

1. V.I. Mudrov, V.L. Kushko, Methods of Measurement Processing (Radio and Communication, Moscow, 1983), 304 p.
2. V.I. Sobolev, Information-Statistical Theory of Measurement (Machinery, Moscow, 1983), 224 p.
3. A.S. Krivov, S.V. Marinko, Optimization of measurement processes under metrological provision for complex technological systems, Measurement Techniques 37(8), 849-853 (1994).
4. GOST R 53618-2009 (IEC 60068-3-5:2001).


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 229–240)

A CARTESIAN METHOD TO IMPROVE THE RESULTS AND SAVE COMPUTATION TIME IN BAYESIAN SIGNAL ANALYSIS

G. A. KYRIAZIS

Instituto Nacional de Metrologia, Qualidade e Tecnologia (Inmetro)

Av. Nossa Senhora das Graças, 50, Duque de Caxias, RJ, 25250-020, Brasil

[email protected]

A Cartesian method to improve the results and save computation time in the Bayesian

analysis of combined amplitude and phase modulated signals is presented. Such signals

have been employed in the dynamic testing of phasor measurement units (PMU). A

compact time domain model is employed at the outset to reduce the number of model

functions required. The modulated signal is decomposed into simpler components that are

progressively recomposed. Metrologists have cogent prior information about the signal

components: they can generate, digitize and analyze independently each of them. At each

stage, parameters estimated in a previous stage are constrained to those estimates

obtained, thus reducing the dimensionality of the search algorithm with consequent time

savings. A computer simulated example is presented and discussed in detail.

1. Introduction

Bayesian parameter estimation techniques [1], [2] have been applied to

waveform metrology [3], [4]. The method uses approximations based on the

posterior mode which are valid when the data size is large and/or the signal-to-

noise ratio (SNR) is high. Metrologists have cogent prior information about the

signal waveform, they can design their experiment and select an arbitrarily large

number of samples, and they typically work with high SNRs.

As with other time-domain analysis alternatives to the Fourier

transformation, the price for reduced uncertainties is computation time.

Computation time is expected to grow approximately as the square of the

number of sinusoids in the model function. This is especially critical for phase

modulation (PM) signals. The larger the PM index is, the greater the number of

significant side-frequency terms [5], and therefore the longer the time required

for the algorithm to converge and the greater the sensitivity of the search

algorithm to starting point changes. An additional shortcoming is that the PM

index cannot be directly estimated when the PM signal is expanded in a series of

sinusoids.


To address the aforementioned problems, compact time domain models

have been employed [6] to reduce the number of model functions required.

However, even using such models, the computation time is still expected to

grow as the models become increasingly complex. For instance, consider that

we are interested in estimating parameters of carrier signals with combined

amplitude (AM) and phase (PM) modulation. Such signals have been employed

in the dynamic testing of phasor measurement units (PMU) [7]. Electric utilities

are installing a significant number of these instruments throughout the power grid

to monitor the state of the grid at single points and transmit the voltage and the

current magnitudes and phases at those points in real time to control centers or

other instruments. The measurements made by PMUs must be time

synchronized.

The equation for phase A voltage Xa with combined AM and PM is

Xa(t) = E0 + Xm [1 + kx cos(ωx t + γ)] · { D0 + cos[ω0 t + ka(C0 + sin(ωa t + φ)) + θ] } ,   (1)

where C0, D0, E0 are the offsets, Xm is the carrier amplitude, t is time, ω0 is the

carrier frequency (the power system frequency), ka is the PM index, φ is the PM

phase angle, ωa is the PM frequency, kx is the AM index, γ is the AM phase

angle, ωx is the AM frequency, and θ is a reference phase angle. Though the

carrier frequency could be regarded as known and the offsets ignored, thus

simplifying the estimation problem [8], they are assumed unknown in what

follows. Only time is assumed to be known.

A Cartesian method is proposed here to reduce the time required for

estimating all the above unknown parameters. The modulated signal in Eq. (1) is

decomposed into simpler signal components that are progressively recomposed.

Metrologists can generate, digitize and analyze independently each signal

component. The components selected have increasing complexity: each signal

component incorporates parameters that were estimated in previous stages. At

each stage, parameters estimated in a previous stage are constrained to those

estimates obtained, thus reducing the dimensionality of the search algorithm in

[3], [4] with consequent time savings. This is reasonable for large data sets and

for those arbitrary waveform function generators used by metrologists, whose

frequency settings may be assumed not to vary over all the estimation stages.

An application example to specific data is discussed in this paper.

Dimensionless units (the data sampling interval is 1) are employed so that in

principle the method is applicable to any frequency range in physical units if


data acquisition systems are available for that range. Applications to real-world

metrology data are reported elsewhere [9].

The article is organized as follows. The Cartesian method proposed here is

summarized in section 2. The modulated signal is described in section 3. The

method employed for signal analysis is studied in section 4. In section 5 the

results obtained from simulations are discussed. The conclusions are drawn in

section 6.

2. Cartesian Method

René Descartes advanced his method in 1635 [10]. It consists essentially of the

four principles here summarized:

1. ´The first was never to accept anything for true which I did not clearly

know to be such …´. Here we assume that nothing is known about the

signal parameters before the data is available. Noninformative priors are

therefore employed throughout this paper for all unknown parameters.

2. ´The second, to divide each of the difficulties under examination into as

many parts as possible, …´. Here we decompose the complex model whose

parameters we want to estimate into simpler model components.

3. ´The third, to conduct my thoughts in such order that, by commencing with

objects the simplest and easiest to know, I might ascend by little and little,

and, as it were, step by step, to the knowledge of the more complex; …´.

Here we select the model components so that they have increasing

complexity: each component incorporates parameters that were estimated in

previous stages.

4. ´And the last, in every case to make enumerations so complete, and reviews

so general, that I might be assured that nothing was omitted.´. Here, the

residuals are reviewed at each stage to see if there is any coherent

characteristic that has not been accounted for in each model component.

He also added a second maxim: ´when it is not in our power to determine

what is true, we ought to act according to what is most probable; …´. Here, at

each stage, parameters estimated in a previous stage are constrained to those

most probable estimates obtained. Note that a parameter estimated in a previous

stage becomes a nuisance parameter to be eliminated in the subsequent stages.

For large data sets, the posterior for such parameter is nearly a delta function.

Thus, integrating out such parameter at a given stage is nearly equivalent to

calculating its posterior mode in a previous stage and plugging the mode value

in the model component for the given stage. Note that all informative priors used

at a given stage are based on data observed and analyzed in previous stages.


3. Modulated Signal

The equation for phase A voltage Xa with combined amplitude and phase

modulation was given in Eq. (1). Specific equations for each of the three-phase

combined modulation waveforms are given in [7].

The dynamic phasor for Xa in Eq. (1) at time t = nT, where n is an integer

and T is the sampling interval in physical units, is

Xa(nT) = (Xm/√2) [1 + kx cos(ωx nT + γ)] e^{j[ka sin(ωa nT + φ) + θ]} .   (2)

The magnitude of the phasor is the root mean square (RMS) value of the sinusoid. As shown in Eq. (2), the dynamic phasor magnitude is modulated in amplitude. The carrier frequency does not appear explicitly in the phasor representation, but it is an implied property of the phasor. The frequency of the dynamic phasor is the rate of change of the dynamic phase angle, independent of the carrier frequency, that is

fD(nT) = ω0/(2π) + (ka ωa/(2π)) cos(ωa nT + φ) .   (3)

The ROCOF is the derivative of the frequency and is given by

ROCOF(nT) = −(ka ωa²/(2π)) sin(ωa nT + φ) .   (4)

The procedure adopted here for estimating the unknown parameters is

described in the next section.
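For illustration, the sequences (2)–(4) are straightforward to evaluate once the signal parameters are available. The sketch below assumes Python with NumPy; the parameter values in the example call are placeholders loosely inspired by the simulation settings of section 5 and are not results of the paper.

import numpy as np

def phasor_sequences(Xm, kx, wx, gamma, ka, wa, phi, w0, theta, T, N):
    # Eqs. (2)-(4): dynamic phasor, frequency and ROCOF at times t = n*T
    t = np.arange(N) * T
    phasor = (Xm / np.sqrt(2)) * (1 + kx * np.cos(wx * t + gamma)) \
             * np.exp(1j * (ka * np.sin(wa * t + phi) + theta))
    freq = w0 / (2 * np.pi) + (ka * wa / (2 * np.pi)) * np.cos(wa * t + phi)
    rocof = -(ka * wa**2 / (2 * np.pi)) * np.sin(wa * t + phi)
    return phasor, freq, rocof

# Placeholder values for illustration only
phasor, freq, rocof = phasor_sequences(
    Xm=np.sqrt(2), kx=0.1, wx=2*np.pi*1.0987654321, gamma=2.07,
    ka=0.1, wa=2*np.pi*0.87654321, phi=0.46,
    w0=2*np.pi*59.987654321, theta=0.62, T=65.2e-6, N=20000)
print(abs(phasor[0]), freq[0], rocof[0])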

4. Signal Analysis

4.1. Modulating Signal

First, we digitize the sinusoid used to phase modulate the carrier (see Eq. (1)). A

total of N uniform samples is taken at times n = 0, …, N−1 (dimensionless

units). The time series is postulated to contain the signal f[n] with additive noise.

It is assumed that the data can be modeled as

y[n] = f[n] + e[n] ,
f[n] = C0 + A1 cos(ωa n) + B1 sin(ωa n) ,   (5)

where y[n] is the n-th sample value, C0 is the offset, A1 and B1 are the

amplitudes, ωa is the frequency (in fact, the modulation frequency of the PM

signal) and e[n] is the n-th noise term. Additive white Gaussian noise with null

expectation and unknown variance is assumed throughout this paper.


Model (5) has three model functions. The fitting parameters are estimated as

described in [3], [4]. The model functions are made orthonormal by standard

procedures. We compute the mode of the marginal posterior for ωa using its

(simulated) true value as our initial estimate. Last, we use the linear relations

between the orthogonal and nonorthogonal models to compute C0, A1 and B1.

The amplitude and phase angle of the modulating signal are calculated from XPM = (A1² + B1²)^1/2 and φ = atan2(−B1, A1), respectively, where atan2 is the four-quadrant inverse tangent function.

This prior information about the modulating signal is then used in the

subsequent analyses of the PM signal and the combined AM and PM signal.
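A simplified numerical sketch of this first stage is given below, assuming Python with NumPy and SciPy. For Gaussian noise and noninformative priors, the search over the linear amplitudes reduces to least squares, so the sketch refines the frequency near the starting guess by minimizing the residual sum of squares; it is a stand-in for, not a reproduction of, the orthonormalized procedure of [3], [4], and the synthetic data use assumed values similar to Table 1.

import numpy as np
from scipy.optimize import minimize_scalar

def fit_linear(w_a, y):
    # For a trial frequency, fit C0, A1, B1 of model (5) by linear least squares
    n = np.arange(len(y))
    G = np.column_stack([np.ones(len(y)), np.cos(w_a * n), np.sin(w_a * n)])
    coef, *_ = np.linalg.lstsq(G, y, rcond=None)
    rss = np.sum((y - G @ coef) ** 2)
    return coef, rss

def fit_modulating(y, w_guess):
    # Refine the dimensionless frequency near the starting guess
    res = minimize_scalar(lambda w: fit_linear(w, y)[1],
                          bounds=(0.95 * w_guess, 1.05 * w_guess), method='bounded')
    (C0, A1, B1), _ = fit_linear(res.x, y)
    X_pm = np.hypot(A1, B1)        # XPM = (A1^2 + B1^2)^(1/2)
    phi = np.arctan2(-B1, A1)      # phi = atan2(-B1, A1)
    return res.x, C0, X_pm, phi

# Synthetic check with assumed values (T = 65.2 us, N = 20000, sigma = 0.001)
T, N = 65.2e-6, 20000
n = np.arange(N)
w_true = 2 * np.pi * 0.87654321 * T
y = 0.1 + np.sqrt(2) * np.cos(w_true * n + 0.4555) + np.random.normal(0, 1e-3, N)
print(fit_modulating(y, 2 * np.pi * 0.9 * T))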

4.2. PM Signal

We record a noisy set of N uniform samples from a PM signal with carrier

frequency ω0 and modulation frequency ωa, and assume the data can be modeled

as

x[n] = D0 + F1 cos(ω0 n + ka C0 + ka w1[n]) + G1 sin(ω0 n + ka C0 + ka w1[n]) + e[n] ,   (6)

where

w1[n] = sin(ωa n + φ) ,   (7)

x[n] is the n-th sample value, C0 and D0 are the offsets, and F1 and G1 are the

amplitudes.

Model (6) has three model functions with all parameters unknown, except

for ωa and C0 which are constrained to their estimates evaluated in section 4.1.

We again apply the method described in [3], [4]. The model functions are made

orthonormal by standard procedures. We compute the mode of the marginal

posterior for ω0, φ and ka using the (simulated) true values as our initial

estimates of ω0 and ka. The initial estimate of φ is null. Note that φ needs to be

estimated again here for it changes randomly at each acquisition. Last, we use

the linear relations between the orthogonal and nonorthogonal models to

compute D0, F1 and G1. The amplitude and phase angle of the PM signal are

then calculated from Xm = (F1² + G1²)^1/2 and θ = atan2(−G1, F1), respectively.

This prior information about the PM signal is then used in the final analysis

of the combined AM and PM signal.


4.3. Combined AM and PM Signal

We record a noisy set of N uniform samples from a combined AM and PM

signal with carrier frequency ω0, AM frequency ωx, and PM frequency ωa, and

assume the data can be modeled as

z[n] = K0 + H1 w2[n] + R1 w2[n] cos(ωx n) + S1 w2[n] sin(ωx n) + R2 cos(ωx n) + S2 sin(ωx n) + e[n] ,   (8)

where

w2[n] = cos(ω0 n + ka C0 + ka w1[n] + θ) ,   (9)

w1[n] = sin(ωa n + φ) ,   (10)

z[n] is the n-th sample value, K0 and C0 are the offsets, and H1, R1, S1, R2 and S2

are the amplitudes. R2 and S2 are needed because the offset D0 of the PM signal

is also modulated in amplitude. This offset is regarded as unknown here.

Model (8) has six model functions with all parameters unknown, except for

ωa, C0, ω0, and ka which are constrained to their estimates obtained in sections

4.1 and 4.2. The model functions are made orthonormal by standard procedures.

We compute the mode of the marginal posterior for θ, φ and ωx using the

(simulated) true value as our initial estimate of ωx. The initial estimates of θ

and φ are null. Note that both θ and φ need to be estimated again here as they

change randomly at each acquisition. Last, we use the linear relations between

the orthogonal and nonorthogonal models to compute K0 and all the amplitudes.

The carrier amplitude, the AM index, the AM phase angle and the offsets D0 and E0 (see section 5.3) are then calculated from Xm = H1, kx = (R1² + S1²)^1/2 / H1, γ = atan2(−S1, R1), D0 = (R2² + S2²)^1/2 / kx and E0 = K0 − Xm D0, respectively.
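The conversion from the fitted linear amplitudes to the physical parameters is a one-line-per-quantity computation; the sketch below assumes Python with NumPy and uses hypothetical amplitude values chosen only so that the derived quantities come out near the simulated settings.

import numpy as np

# Hypothetical linear-stage estimates (placeholders, not the paper's results)
K0, H1, R1, S1, R2, S2 = 0.2414, 1.414, 0.1, -0.1, 0.008, -0.006

Xm = H1                              # carrier amplitude
kx = np.hypot(R1, S1) / H1           # AM index
gamma = np.arctan2(-S1, R1)          # AM phase angle
D0 = np.hypot(R2, S2) / kx           # PM-signal offset, also amplitude modulated
E0 = K0 - Xm * D0                    # remaining DC offset
print(Xm, kx, gamma, D0, E0)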

In all aforementioned stages, the uncertainties associated with the estimates

are computed using approximations based on the normal distribution and the

posterior mode. The derivatives which appear in this procedure are evaluated

numerically. The residuals are reviewed at each stage to see if there is any

coherent characteristic that has not been accounted for in the model component.

The dynamic phasor, frequency and ROCOF sequences can be estimated by

inserting all signal parameter estimates in Eqs. (2) – (4). The uncertainty

associated with each sequence estimate can be evaluated according to [11] by

propagating the uncertainties associated with those signal parameter estimates.


5. Simulation Results

The uncertainties associated with the estimates decrease with increasing SNR

and/or number of samples. The examples in this section assume the noise term

to be normally distributed with null mean and noise variance σ² equal to 1⋅10⁻⁶.

This is the order of magnitude of the noise variance typically found in

calibration systems that employ commercial signal generators.

We first note that the “starting guess” ω(0) array, also referred to in [3], [4], here includes frequencies as well as phase angles. We confirmed that the elements of this array should be sorted in decreasing order of accuracy to ensure algorithm convergence. The experienced metrologist will be able to sort the ω(0) array appropriately. The parameter ρ and the “minimum” step size ε of the pattern search algorithm referred to in [3], [4] were set at 0.50 and 1⋅10⁻⁸, respectively, for all signal components in the sequel.

The algorithm uses dimensionless units. ´Starting guess´ frequencies f in

physical units are selected here instead. The conversion of f to dimensionless ω

is ω = 2πfT. The frequency estimate in hertz is obtained from dimensionless ω

by reconverting the units at the end of the algorithm. ´Starting guess´ phase

angles in radians of course need no conversion.

The number of samples should be such that they cover more than one

modulation period, i.e., Nωx > 2π and Nωa > 2π. Very low AM and PM

frequencies (about 1 Hz) are assumed here. They reflect real power grid

conditions. Thus, a total of N = 20,000 samples is taken at times t[n] = nT,

where T = 65.2 µs.

Standard uncertainties (k = 1) associated with the fitting residuals are

reported in this paper. This contribution was evaluated according to [3], [4]. The

noise standard deviation was also estimated at each stage according to [3], [4] to

check how well the model component fits the data.

The algorithm has been implemented in LabWindows/CVI v. 6.0. The

processing time for estimating the modulating signal parameters was less than

1 s. The processing time for estimating the remaining parameters of the PM

signal was 22 s. The additional processing time for estimating the final

parameters of the combined AM and PM signal was 7 s. Such times refer to the

3 GHz 1.96 GB random-access memory (RAM) Duo Core computer used to

process the data under a Windows XP®

environment.

5.1. Modulating Signal

The set of samples of the signal used to phase modulate the carrier was

generated from


y[nT] = C0 + XPM sin(ωa nT + φ) + e[nT] ,   (11)

where φ is a specific (pseudo) randomly selected phase angle distributed uniformly between 0 and 2π, and e[nT] ~ N(0, 1⋅10⁻⁶). Here the starting guess is ω(0) = [ωa(0)T]´ (see [3], [4]), where ωa(0) = 2π⋅0.9 rad/s. The true values and the corresponding estimates are listed in Table 1. The sampled data are shown in Fig. 1. The residuals are plotted in Fig. 2.

Table 1. Modulating signal parameters.

True value                        Estimate
ωa / 2π = 0.8765432100 Hz         0.876541(3) Hz
XPM / √2 = 1.0000000000           1.000010(14)
φ (randomly selected)             0.455485(14) rad
C0 = 0.1000000000                 0.099997(7)
σ = 0.001                         0.001002(10)

Figure 1. Sampled data for the modulating signal.

Figure 2. Residuals between the sampled and reconstructed data for the modulating signal.


5.2. PM Signal

The set of samples of the PM signal was generated from

x[nT] = D0 + Xm cos(ω0 nT + ka C0 + ka g[nT] + θ) + e[nT] ,   (12)

where

g[nT] = sin(ωa nT + φ) + e[nT] ,   (13)

θ and φ are specific (pseudo) randomly selected phase angles distributed uniformly between 0 and 2π, and e[nT] ~ N(0, 1⋅10⁻⁶). Here the starting guess is ω(0) = [ω0(0)T, φ(0), ka(0)]´ (see [3], [4]), where ω0(0) = 2π⋅60 rad/s, φ(0) = 0.0 rad and ka(0) = 0.1 rad. The true values and the corresponding estimates are listed in Table 2. The sampled data are shown in Fig. 3. The residuals are plotted in Fig. 4.

Figure 3. Sampled data for the PM signal.

Figure 4. Residuals between the sampled and reconstructed data for the PM signal.


Table 2. PM signal parameters.

True value Estimate

ω0 / 2π = 59.9876543210 Hz 59.987656(3) Hz

ka = 0.1000000000 rad 0.100004(5) rad

Xm / √2 = 1.0000000000 1.000008(14)

θ (randomly selected) −0.881583(14) rad

φ (randomly selected) 1.476797(68) rad

D0 = 0.1000000000 0.100006(7)

σ = 0.001 0.001004(10)

5.3. Combined AM and PM Signal

The set of samples of the combined AM and PM signal was generated from

z[nT] = E0 + Xm [1 + kx cos(ωx nT + γ) + e[nT]] · h[nT] + e[nT] ,   (14)

where

h[nT] = D0 + cos(ω0 nT + ka C0 + ka g[nT] + θ) ,   (15)

g[nT] = sin(ωa nT + φ) + e[nT] ,   (16)

φ, γ and θ are specific (pseudo) randomly selected phase angles distributed uniformly between 0 and 2π, and e[nT] ~ N(0, 1⋅10⁻⁶). Here the starting guess is ω(0) = [θ(0), φ(0), ωx(0)T]´ (see [3], [4]), where θ(0) = φ(0) = 0.0 rad and ωx(0) = 2π⋅1.1 rad/s. The true values and the corresponding estimates are listed in Table 3. The sampled data are shown in Fig. 5. The residuals are plotted in Fig. 6.

Table 3. Combined AM and PM signal parameters.

True value Estimate

ωx / 2π = 1.0987654321 Hz 1.098739(63) Hz

kx = 0.1000000000 0.099959(23)

Xm / √2 = 1.0000000000 1.000004(10)

γ (randomly selected) 2.066566(21) rad

θ (randomly selected) 0.616594(7) rad

φ (randomly selected) −0.821215(43) rad
D0 = 0.1000000000 0.099981(27)

E0 = 0.1000000000 0.100028(30)

σ = 0.001 0.001016(10)

The dynamic phasor, frequency and ROCOF sequences can be estimated by

inserting the tabled estimates in Eqs. (2), (3) and (4), respectively. The

uncertainty associated with each sequence estimate is evaluated according to

[11] by propagating all tabled uncertainties through Eqs. (2) – (4).


Figure 5. Sampled data for the combined AM and PM signal.

Figure 6. Residuals between the sampled and reconstructed data for the combined AM and PM

signal.

6. Conclusions

In the Cartesian method proposed here a complex signal is decomposed into

simpler signal components that are progressively recomposed. The signal

models increase in complexity at each estimation stage. As the metrologist

knows in advance which signal component has been generated at a given stage,

the question to be asked in that stage simply becomes “What is the evidence of

such signal component in these data?”. The Bayesian framework then derives

automatically the statistic that is best suited to answer the question and also

indicates how it should be processed to obtain estimates of signal parameters.

Since metrologists can select an arbitrarily large number of samples,

nuisance parameters are eliminated at each stage in a rather trivial way: at any

given stage, such parameters are constrained to their estimates obtained in

previous stages. This allows the reduction of the nonlinear optimization problem


to one of low dimensionality with significant reduction in computation. Though

the method requires more data, this should not be viewed as a disadvantage

since the availability of more data contributes to reducing the

uncertainties associated with the signal parameter estimates.

References

1. E. T. Jaynes, “Bayesian spectrum and chirp analysis”, in Maximum

Entropy and Bayesian Spectrum Analysis and Estimation Problems, C.

Ray Smith and G. J. Erickson, ed., D. Reidel, Dordrecht-Holland 1-37

(1987).

2. G. L. Bretthorst, “Bayesian spectrum analysis and parameter estimation”,

in Lecture Notes in Statistics, 48, J. Berger, S. Fienberg, J. Gani, K.

Krickenberg, and B. Singer (eds.), New York: Springer-Verlag (1988).

3. G. A. Kyriazis, IEEE Trans. Instrum. Meas. 60 2314 (2011).

4. G. A. Kyriazis, “Bayesian inference in waveform metrology”, in Advanced

Mathematical and Computational Tools in Metrology and Testing IX, F.

Pavese, M. Bär, J –R. Filtz, A. B. Forbes, L. Pendrill and K. Shirono Eds.,

Singapore: World Scientific 232 (2012).

5. M. Engelson, Modern Spectrum Analyzer Theory and Applications,

Dedham, Massachusetts: Artech House, 1984.

6. G. A. Kyriazis, IEEE Trans. Instrum. Meas. 62 1681 (2013).

7. IEEE Standard for Synchrophasor Measurements for Power Systems,

IEEE Std. C37.118.1 (2011).

8. G. N. Stenbakken, “Calculating combined amplitude and phase modulated

power signal parameters”, Power and Energy Society General Meeting,

2011 IEEE, 24-29 July, pp. 1-7 (2011).
9. G. A. Kyriazis, W. G. Kürten Ihlenfeld and R. P. Landim, IEEE Trans.

Instrum. Meas. 64 (2015). DOI 10.1109/TIM.2015.2395491

10. R. Descartes, Discourse on the Method of Rightly Conducting the Reason

and Seeking Truth in the Sciences (1635). [Online]. Available:

http://www.gutenberg.org/files/59/59-h/59-h.htm (accessed Jan. 29, 2014).

11. BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP and OIML, Evaluation of

measurement data: Guide to the Expression of Uncertainty in

Measurement, GUM 1995 with minor corrections, Joint Committee for

Guides in Metrology, JCGM 100 (2008). [Online]. Available:

http://www.bipm.org/utils/common/documents/jcgm/JCGM_100_2008_E.

pdf (accessed Jan. 29, 2014).


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 241–246)

THE DEFINITION OF THE RELIABILITY OF IDENTIFICATION OF COMPLEX ORGANIC COMPOUNDS USING HPLC AND BASE CHROMATOGRAPHIC AND SPECTRAL DATA

E.V. KULYABINA* AND YU. A. KUDEYAROV

Russian Research Institute for Metrological Service (VNIIMS)

Moscow, 119361, Russia *E-mail: [email protected]

www.vniims.ru

In this work a method for determining the reliability of the identification of complex organic compounds using HPLC is proposed. The method is based on the use of Student's statistical criterion. Its advantage is reliable identification using the values of the retention parameters and the measured information about the spectrum of a substance, without the use of traditional calibration.

Key words: HPLC, identification, reliability, Student criterion

1. Introduction

This article describes a method for the identification and determination of complex organic compounds. More precisely, this particular method uses high-performance liquid chromatography (HPLC) with UV detection without the use of traditional calibration with standard samples of the analytes [1]. It is hard to reliably identify complex organic compounds because of the absence of standard samples. A method of analysis [2] and a multi-component test mixture [1] were developed to solve these tasks and are summarized in this article.

2. Materials and Methods

To carry out this research, we used the high performance liquid chromatograph ‘MiliChrome A-02’ (manufactured by the Institute of Chromatography ‘EcoNova’, Novosibirsk) with a UV spectrophotometric detector which can record the spectrum and detect a substance at 8 wavelengths. The chromatograph has a chromatographic column with a volume variability of no more than ± 2 % from instrument to instrument and a column efficiency variability of no more than ± 5 %.

The test mixture ‘DB-2012’ was used to check the stability of the parameters of the chromatograph. This test mixture is made from substances that are markers of the chromatograph parameters. The basic parameters of the chromatograph under control are the free volume of the column, the accuracy of setting the wavelength of the UV detector, the deviation of the elution gradient from its correct shape, and the linearity of the detector.

The test mixture consists of five substances from the database of spectral and chromatographic data of 500 compounds, DB-2012: potassium iodide, pyrene, caffeine, meta-nitroaniline and ortho-nitroaniline. The database DB-2012 was used for the identification of complex organic compounds.

An integral part of the method is a certified measurement procedure that contains the ranges and errors of determination of the chromatographic and spectral parameters of the substances from DB-2012. This procedure is the ‘Method of identification and quantification of organic compounds without calibration by the sample for analysis using HPLC with UV detector’.

The method of identification and quantification of complex organic compounds is as follows.

(1) Pass the multi-component test mixture through the chromatograph (each component of the mixture is designed to test a particular parameter or parameters of the chromatograph).
(2) Check whether the resulting chromatogram of the test mixture agrees with the certified values within the certified tolerances.
(3) Pass the sample to be analyzed (the analyte) through the chromatograph, keeping all settings the same as in the previous step.
(4) Initial identification of the analyte can be carried out in terms of the retention parameters. Final identification of the analyte can be carried out by analyzing the spectral ratios.
(5) If a unique substance is identified, the amount of the substance in the sample can be determined.

To test the statistical hypothesis, Student's criterion was used.

3. Results and Discussion

In this work we have adopted the following definition of identification: the substance is considered to be identified if there is a one-to-one match between the characteristics of the analyte and the characteristics of the substances from the database.

The DB itself is derived from the chromatograms of substances. In particular, from the chromatograms we obtained the chromatographic characteristics of the substances (retention time and volume, peak height and area) and the ultraviolet spectrum (with the spectral ratios), recorded in the range from 190 nm to 400 nm.

For identification we consider two types of data [3].

The first type are the data from DB-2012, which are the chromatographic characteristics and spectra of standard samples of pure substances. DB-2012 gives the average values of these characteristics aR(i,j), (i = 1, 2, …, m, j = 1, 2, …, k), where the index i denotes the substance in the database and j the characteristic. The standard deviation σ(i,j) and the corresponding tolerance d(i,j) are determined for each characteristic a(i,j), i.e. retention volume or spectral ratio.

The recommended range of tolerances [4] is given by

2σ(i,j) < d(i,j) < 3σ(i,j) .   (1)

The second type of data is the experimentally obtained characteristics of the substances x, such as the average values of the retention volumes and spectral ratios b(j) and the corresponding values of the mean square deviation σx(j).

It is necessary to determine the limits of applicability of the theory. We assume that the measurements of the substance parameters behave as random variables with a normal distribution. The characteristics of the analyte themselves cannot be treated as random quantities, but the difference between the retention times or between the corresponding spectral ratios is a random variable, and the methods of mathematical statistics can be applied to it.

The distance between the measured and reference values in this case is the difference ∆(i,j) between the measured average value of a characteristic and the reference value of this characteristic in the DB:

∆(i,j) = |aR(i,j) − b(j)| .   (2)

Identification is considered effective if the difference between the measured value and the reference value listed in the database is less than or equal to some criterion (in our case, a threshold for this difference), that is, if the following condition holds:

∆(i,j) ≤ d(i,j) .   (3)

Condition (3) is necessary but not sufficient for unambiguous identification. To evaluate the accuracy of the identification, the statistical hypotheses H0 (null hypothesis) and H1 (alternative hypothesis) must be formulated.

The null hypothesis H0 is the statement that the substance is not present in the sample.

An error of the 1st kind is made with a false positive result: the hypothesis H0 is rejected and the hypothesis H1 is accepted.


The probability of an error of the 1st kind is denoted α; it is the probability of accepting the alternative hypothesis, i.e. the significance level. An error of the 2nd kind is made with a false negative result: the hypothesis H1 is rejected and the hypothesis H0 is accepted. The probability of an error of the 2nd kind is denoted β.

The accuracy of the identification P, associated with α and β [4], is given by the equation

P = 1 − α − β .   (4)

4. Example

In an experiment we obtained the following retention volumes:

V(1,1) = 3300 µl, V(2,1) = 3572 µl, V(3,1) = 3306 µl .   (5)

By comparing these values with the corresponding retention volumes in the database, we see that the first material can be identified as pyrene (VR(1,1) = 3301), the second as ionol (VR(2,1) = 3569), and the third as isoamyl benzoate (VR(3,1) = 3304); the error of determination of the retention volumes is 7 %.

The identification of the second substance is not complicated, but the identification of the first and third materials is problematic, since each could be identified either as pyrene or as isoamyl benzoate, while β is low (β = 0.003). Therefore, it is not possible to identify these two substances uniquely from the retention volumes alone. The above results indicate that, for unambiguous identification, additional properties of the substances must be considered, namely the spectral ratios.

We have 7 spectral ratios for each substance. We do not use the index j

because we will consider one characteristic – spectral ratio.

Table 1. Spectral ratios of the investigated substances (Sλ/S210).

Name of substance               VR, µl   λ=220  λ=230  λ=240  λ=250  λ=260  λ=280  λ=300
pyrene                          3300     1.15   3.55   5.77   1.08   1.88   0.40   0.59
substance 1 (unknown)           3302     1.12   3.22   6.00   0.98   1.86   0.50   0.53
substance 2 (ionol)             3572     0.54   0.33   0.06   0.01   0.04   0.15   0.00
substance 3 (isoamyl benzoate)  3306     2.15   3.15   1.58   0.27   0.17   0.17   0.00


For identification we use the average value of the spectral ratios of a particular substance, which is simply the arithmetic average of the set of values of its spectral ratios. We introduce the difference between the base and the measured spectral ratios [3] for each spectral ratio:

∆(i,l) = sa(i,l) − sb(l) ,   (6)

where l is the number of a spectral ratio, and sa(i,l), sb(l) are the values of a spectral ratio for the i-th reference substance and for the analysed substance, respectively.

The average value of the difference between the base and the measured spectral ratios for a substance is

∆(i) = (1/N) Σl=1..N ∆(i,l) , where N = 7 .   (7)

A substance is considered identified if the following condition holds:

∆(i) < d , i.e. sa(i) − sb < d ,   (8)

where d is the tolerance on the average base spectral ratios:

sa(i) − d < sb < sa(i) + d .   (9)

The hypothesis H0 (null hypothesis) is the assertion that the considered substance is absent from the sample. This means that in this case the inequality

∆(i) − d > 0   (10)

holds. The hypothesis H1 (alternative hypothesis) implies that the substance is present in the sample. The condition for the presence of the substance is expressed by the inequality

∆(i) − d < 0 .   (11)

We determine which hypothesis is realized using Student's criterion. The quantile of the Student distribution tN−1,P corresponding to the probability P = 0.95 for N − 1 = 6 (N = 7) degrees of freedom is equal to 2.4470. This means that, with probability P = 0.95, the quantile of the Student distribution of the random variable (∆(i) − d) is 2.4470.

The Student's coefficient is calculated by the equality

tb(i) = √N (∆(i) − d) / Sb(i) ,   (12)

where Sb(i) is the estimate of σx(j).

If the experimental value of the quantile tb(i) is less than the tabulated value, i.e. if the condition

tb(i) < tN−1,P   (13)

holds, then the H0 hypothesis is rejected and H1 is adopted.
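A minimal sketch of the decision rule (12)–(13) is given below, assuming Python with NumPy and SciPy. The spectral ratios are taken from Table 1, but the tolerance d is a placeholder (it is not tabulated in the paper), so the resulting value of tb is for illustration only and is not meant to reproduce the figures quoted below.

import numpy as np
from scipy import stats

# Spectral ratios (S_lambda / S_210) from Table 1
pyrene    = np.array([1.15, 3.55, 5.77, 1.08, 1.88, 0.40, 0.59])  # reference s_a(i, l)
unknown_1 = np.array([1.12, 3.22, 6.00, 0.98, 1.86, 0.50, 0.53])  # analysed  s_b(l)

d = 0.2   # hypothetical tolerance on the average spectral ratio (placeholder)

def student_coefficient(s_a, s_b, d):
    # Eqs. (6), (7), (12): per-ratio differences, their mean, and the t coefficient
    diff = s_a - s_b
    N = len(diff)
    delta_mean = diff.mean()
    S_b = diff.std(ddof=1)
    return np.sqrt(N) * (delta_mean - d) / S_b, N

t_b, N = student_coefficient(pyrene, unknown_1, d)
t_crit = stats.t.ppf(1 - 0.05 / 2, N - 1)   # two-sided 95 % quantile, about 2.4470 for 6 dof
print(t_b, t_crit, "identified" if t_b < t_crit else "not identified")  # Eq. (13)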

Taking into account the spectral ratios, let us calculate the Student's coefficient (by formula (12)) for the substances from list (5):

tb(1) = 1.9875,
tb(2) = 2.6395,
tb(3) = 2.6292.

For substances 2 and 3 the hypothesis H0 is true (these substances are not present in the sample), while for substance 1 this hypothesis should be rejected, that is, it may be identified as pyrene. The level of significance (the error of the 1st kind, or the probability of false identification) is taken equal to 0.05, and the corresponding probability of identification is P = 0.95.

5. Conclusions

The above results suggest that the use of the spectra of compounds and of the spectral ratios allows the reliability of identification to be quantified. It is also possible to extend this method to determine the reliability of identification from a set of characteristics considered jointly, as opposed to only one characteristic.

References

1. E.V. Kulyabina, Development and study of identification methods and

quantification of complex organic compounds by using of complex

substances with standardized chromatographic and spectral parameters,

Dissertation, FSUE "VNIIMS", Moscow (2013).

2. M.A. Grachev, G.I. Baram, I.N. Azarova, MVI Mass concentration of UV-

absorbing substances. Methods of measurement by HPLC, Limnological

Institute of SO RAS, Irkutsk (2003) Attestation Certificate 37-03 dated

10.12.2003, number FR.1.31.2003.00950.

3. Y.A. Kudeyarov, E.V. Kulyabina, O.L. Rutenberg, Application of Student's

criterion to determine the reliability of substances identification in

chromatographic analysis, J. Legislative and applied metrology, 3 (2013)

pp. 44-48.

4. V.I. Vershinin, B.G. Derendyaev, K.S. Lebedev, Computer identification of

organic compounds (Science, Moscow, 2002).


Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. – )

UNCERTAINTY EVALUATION OF FLUID DYNAMIC SIMULATION WITH ONE-DIMENSIONAL RISER MODEL BY MEANS OF STOCHASTIC DIFFERENTIAL EQUATIONS

EMERSON A. O. LIMA1

Universidade de Pernambuco

Rua Benfica, 455 - Madalena Recife/PE CEP: 50720-001 - Recife (PE), Brazil1 Escola Politecnica, E-mail: [email protected]

SILVIO B. MELO3,†, CARLOS C. DANTAS∗,2 AND FRANCISCO A. S. TELES3

Universidade Federal de Pernambuco, Av. Prof. Luiz Freire, 1000 - Cidade Universitaria CEP 50.740-540 - Recife (PE),

Brazil∗,2 Departamento de Energia Nuclear, E-mail: [email protected]

3 Centro de Informatica, †E-mail: [email protected]

SILVIO SOARES BANDEIRA

Universidade Catolica de Pernambuco,

Rua do Principe, 526 - Boa Vista CEP: 50.050-900 - Recife (PE), Brazil

Uncertainty evaluation of fluid dynamic simulations with a one-dimensional model describing the riser of a Fluid Catalytic Cracking (FCC) type cold unit was carried out. Simulation of the circulation flow by a deterministic approach is taken as the reference for the uncertainty evaluation. The classical numerical formulation of the problem is given as a matter of comparison. The stochastic formulation of the fluid dynamic model according to the Euler-Maruyama and Euler-Heun methods is described. The solutions are discussed along with a graphical presentation of the results. Uncertainty as a stochastic data evaluation is presented.

Keywords: FCC simulation, numerical computation, continuous space, uncertainty

1. Introduction

The fluid dynamics of an FCC type cold unit was studied experimentally by parameter determination with gamma ray transmission measurements. These experimental parameters served as input to solve a fluid dynamic model system of equations. Axial simulations of solid and air flow described the riser in operational conditions according to a one-dimensional formulation. Uncertainty in the experimental parameters measured by gamma ray transmission was evaluated in a previous work by using discrete models [1]. The problem may be seen as a Taylor Rule, which presents some desirable robustness properties, although the necessity of using a continuous space model was reported by Brock et al. [3]. The work by Barker et al. [2] systematizes models by category, and the ISO-GUM literature is fairly considered. From this point of view, a mathematically structured model and its numerical solution are the focus of this work, which follows the methodology for continuous models as basically stated by Lord and Wright [5]. Choosing their stochastic implementation for its simplicity and the computation facilities of Matlab, calculations of the fluid parameters by deterministic numerical methods are presented and a comparison with the stochastic evaluation is detailed. Uncertainty is proposed as a stochastic data treatment, and the results can be observed in a graphic showing the evolution of the solution.

2. Fluid Dynamics of a One-Dimensional Riser: Classical Formulation

We are interested in describing the gas-solid fluid dynamic behavior in a one-dimensionally modeled riser at a steady and isothermal state. The flow is assumed to be in a permanently incompressible regime, without the occurrence of chemical reactions, but with the catalyst (solid phase) completely fluidized. The solid particles are assumed spherical and the friction between the riser walls and the solid phase is considered negligible. With these assumptions, the equations that model this system are (see [7]):

\frac{dU_g}{dz} = -\frac{U_g}{\varepsilon_g}\,\frac{d\varepsilon_g}{dz}    (1)

\frac{dU_s}{dz} = \frac{U_s}{\varepsilon_s}\,\frac{d\varepsilon_g}{dz}    (2)

\frac{dP}{dz} = -\frac{d\varepsilon_g}{dz}\left(\rho_s U_s^2 - \rho_g U_g^2\right) - \left(\varepsilon_s\rho_s + \varepsilon_g\rho_g\right)g - f_w    (3)

3. Numerical Solution of the Classical Formulation

The numerical solutions by the Euler method and the order-4 Runge-Kutta method (with an adaptive step) of the system formed by equations (1)-(3), for different choices of the term dε_g/dz (equations (4), (5) and (6)) and with the parameters listed in Table 1, are illustrated in Figure 1 (smooth curve).


\frac{d\varepsilon_g}{dz} = \frac{g\,\varepsilon_s}{U_s^2}\left(1 - \frac{\rho_g}{\rho_s}\right)\left[\left(\frac{1 - U_s/U_g}{1 - S_\infty}\right)^2 - 1\right]    (4)

\frac{d\varepsilon_g}{dz} = \frac{(\varepsilon^* - \varepsilon_g)(\varepsilon^* - \varepsilon_a)}{z_0\,(\varepsilon_g - \varepsilon_a)}    (5)

\frac{d\varepsilon_g}{dz} = \frac{g - F_{arraste}/\rho_s}{-\left(\dfrac{U_g}{\varepsilon_g} + \dfrac{U_s}{\varepsilon_s}\right)(U_g - U_s)}    (6)

Table 1. Fluid Dynamic Parameters: Typical Values

Parameter                            Symbol   Value
Riser Length                         L        2.3 m
Riser Diameter                       D        0.032 m
Pressure in the Riser                P        104.364 kPa
Temperature in the Riser             T        302 K
Catalyst Particles Mean Diameter     d        0.000072 m
Solid Flow (per unit area)           W        7.1 kg/(m^2·s)
Solid Density                        ρ_s      850 kg/m^3
Gas Molecular Mass (mean)            M        28 kg/mol
Gas Dynamic Viscosity (at 29 °C)     µ_g      0.0000186 kg/(m·s)
Gas Density                          ρ_g      1.164 kg/m^3
Gas Flow                             Q        0.0038 m^3/s
Gravity Acceleration                 g        9.806 m/s^2
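For illustration, the following minimal Python sketch (not the authors' Matlab implementation) integrates the deterministic system formed by equations (1), (2), (3) and (5) with an adaptive Runge-Kutta routine; the closure constants ε*, ε_a, z_0 and the inlet conditions are illustrative placeholders, while the physical parameters are taken from Table 1.

```python
# Minimal sketch: deterministic integration of the one-dimensional riser model,
# using eq. (5) as the closure for d(eps_g)/dz.  Closure constants and inlet
# conditions are placeholders, not experimental values.
import numpy as np
from scipy.integrate import solve_ivp

L = 2.3                          # riser length, m (Table 1)
rho_s, rho_g = 850.0, 1.164      # solid and gas densities, kg/m^3 (Table 1)
g = 9.806                        # m/s^2 (Table 1)

eps_star, eps_a, z0 = 0.99, 0.60, 0.5   # placeholder constants of eq. (5)

def rhs(z, y):
    """y = [Ug, Us, eps_g, P]; equations (1), (2), (5) and (3)."""
    Ug, Us, eps_g, P = y
    eps_s = 1.0 - eps_g
    deps_dz = (eps_star - eps_g) * (eps_star - eps_a) / (z0 * (eps_g - eps_a))  # eq. (5)
    dUg_dz = -(Ug / eps_g) * deps_dz                                            # eq. (1)
    dUs_dz = (Us / eps_s) * deps_dz                                             # eq. (2)
    # eq. (3), with the wall friction term f_w neglected in this sketch
    dP_dz = -deps_dz * (rho_s * Us**2 - rho_g * Ug**2) - (eps_s * rho_s + eps_g * rho_g) * g
    return [dUg_dz, dUs_dz, deps_dz, dP_dz]

y0 = [4.7, 0.5, 0.85, 104364.0]  # placeholder inlet Ug, Us, eps_g; P from Table 1 (Pa)
sol = solve_ivp(rhs, (0.0, L), y0, method="RK45", dense_output=True, max_step=0.01)

z = np.linspace(0.0, L, 200)
Ug, Us, eps_g, P = sol.sol(z)
print("slip ratio Us/Ug at riser top: %.3f" % (Us[-1] / Ug[-1]))
```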

4. Fluid Dynamics of a One-Dimensional Riser: Stochastic Formulation

Consider that the gas volume fraction ε_g is composed of two components of the form ε_g = ε_g(z) = ε_g^{det}(z) + W(z), where the function ε_g^{det}(z) corresponds to the deterministic behavior and W(z) to the stochastic component of ε_g, seen as a stochastic process along the z direction.
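A minimal sketch of the Euler-Maruyama scheme mentioned in the abstract is given below; as a simple illustration it is assumed that the stochastic component enters the closure (5) as additive Gaussian noise of intensity σ = 0.01, the value quoted for Figure 1, and the closure constants remain the placeholders used above.

```python
# Minimal Euler-Maruyama sketch for the stochastic gas fraction eps_g(z) = eps_det(z) + W(z).
# Assumptions: additive noise of intensity sigma on eq. (5); placeholder closure constants.
import numpy as np

rng = np.random.default_rng(0)

L, N = 2.3, 1000                         # riser length (Table 1) and number of steps
dz = L / N
sigma = 0.01                             # noise level used for Figure 1
eps_star, eps_a, z0 = 0.99, 0.60, 0.5    # placeholder closure constants of eq. (5)

def drift(eps_g):
    """Deterministic drift of eps_g, eq. (5)."""
    return (eps_star - eps_g) * (eps_star - eps_a) / (z0 * (eps_g - eps_a))

def euler_maruyama(eps0, n_paths=100):
    eps = np.full(n_paths, eps0)
    path = np.empty((N + 1, n_paths))
    path[0] = eps
    for k in range(N):
        dW = rng.normal(0.0, np.sqrt(dz), n_paths)     # Brownian increments
        eps = eps + drift(eps) * dz + sigma * dW
        path[k + 1] = eps
    return path

paths = euler_maruyama(0.85)
mean = paths.mean(axis=1)
unc = paths.std(axis=1, ddof=1) / np.sqrt(paths.shape[1])   # sigma / sqrt(number of simulations)
print("eps_g at riser top: %.4f +/- %.4f" % (mean[-1], unc[-1]))
```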


Figure 1. Example of a numerical solution for the riser dynamics. On the top, the ratio U_s/U_g as a function of height in the riser for 10 simulations with σ = 0.01; on the bottom, as a smooth curve, this ratio for the deterministic model. The noisy line at the center is the average of the ratio U_s/U_g over 100 simulations with σ = 0.01 using the Euler-Maruyama method. The other noisy lines next to the center curve lie at a distance of one standard deviation from the average value. The noisy lines far from the center line lie at one uncertainty deviation (σ/√(number of simulations)).

5. Results and Discussion

The slip velocity graph from the SDE evaluation follows the classical approach of a one-dimensional model in which a recirculation term for the solid flux was not included. The focus of this work is applying the SDE calculation, with a simple computational approach, to fluid dynamic model simulations that are well known in the literature [8]. Such a one-dimensional model is consistent with experimental data for dilute solid fluxes. As industrial processes usually operate in highly concentrated solid regimes, recirculation is present and brings several problems which are under investigation towards a precise modeling. As an example, the flux structure according to the core/annular model is discussed under a novel formulation proposal, fully based on particle interaction [8]. The proposed method allows some insight into uncertainty evaluation problems for fluid dynamic simulations. First of all, it offers a quite simple implementation of effective numerical stochastic methods that can easily be evaluated. Numerical computation using classical algorithms such as Runge-Kutta in an accessible software environment such as Matlab motivates further steps, for example the investigation of a two-dimensional model. Particularly in fluid dynamic simulation, the present method can be an alternative to CFD for uncertainty evaluation due to its operational simplicity. Nevertheless, the mathematical understanding of a discrete or continuous model can be improved by these simple experiments.

References

1. C. C. Dantas, A. E. de Moura, H. J. B. Lima Filho, S. B. Melo, V. A. Santos, E. A. de Oliveira Lima, "Uncertainty evaluation by gamma transmission measurements and CFD model comparison in a FCC cold pilot unit", International Journal of Metrology and Quality Engineering, v. 4, p. 9-15, 2013.

2. R. M. Barker, M. G. Cox, A. B. Forbes and P. M. Harris, "Discrete Data Analysis, Software Support for Metrology", Best Practice Guide No. 4, 2004.

3. W. A. Brock, S. N. Durlauf, K. D. West, "Model uncertainty and policy evaluation: Some theory and empirics", Journal of Econometrics 136 (2007) 629-664.

4. K.-D. Sommer, p. 275, in "Data Modeling for Metrology and Testing in Measurement Science", F. Pavese, A. Forbes (eds.), Birkhauser, 2008.

5. G. J. Lord, L. Wright, "Uncertainty Evaluation in Continuous Modeling", Report to the National Measurement System Policy Unit, Department of Trade and Industry.

6. D. J. Higham, "An Algorithmic Introduction to Numerical Simulation of Stochastic Differential Equations", SIAM Review Vol. 43(3), pp. 525-546 (2001).

7. A. C. B. A. Melo, "Validacao de modelos matematicos para descrever a fluidodinamica de um riser a frio utilizando atenuacao gama" (in Portuguese), Doctorate Thesis, UFPE-DEN, Recife-PE (Brazil).

8. C. Zhu, D. Wang, "Resistant effect of non-equilibrium inter-particle collisions on dense solids transport", Department of Mechanical and Industrial Engineering, New Jersey Institute of Technology, Newark, NJ 07102, USA.


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 252–259)

SIMULATION METHOD TO ESTIMATE THE

UNCERTAINTIES OF ISO SPECIFICATIONS

J.M. LINARES, J.M. SPRAUEL

Aix Marseille Université, CNRS, ISM UMR 7287, 13288 Marseille, cedex 09, France

In this work a simplified method, dedicated to use in industrial environments, is proposed to evaluate uncertainties of ISO 1101 specifications. For that purpose a Delete d-Jack-knife method is implemented and adapted to the estimation of specification uncertainties. As an example, the verification uncertainty of an ISO geometrical specification will be presented. The advantages and limitations of the method are then discussed.

1. Introduction:

The ISO/IEC 17000 Standard [1] defines accreditation as an "attestation issued by a third party related to a conformity assessment body conveying formal recognition of its competence to carry out specific conformity assessment tasks". ISO/IEC 17000 also specifies and defines the general terms relating to conformity assessment, including the accreditation of conformity assessment bodies, and the use of conformity assessment to facilitate trade. Recently, the accreditation for 3D measurements of ISO specifications was launched in European countries [2]. This accreditation requires the estimation of the uncertainties of ISO 1101 geometrical specifications [3], even for measurements carried out in industrial environments. In this work a simplified method, dedicated to use in industrial environments, is therefore proposed to evaluate uncertainties of ISO 1101 geometrical specifications [4,5]. For that purpose a Delete d-Jack-knife method is implemented and adapted to the estimation of specification uncertainties. As an example, the verification uncertainty of an ISO specification will be presented. The advantages and limitations of the method will then be discussed.

2. Uncertainty propagation methods:

2.1. GUM:

Uncertainties are generally evaluated using the classical GUM method [6], which is based on specific laws of propagation. A first order Taylor series expansion is used to propagate elementary uncertainties to the combined uncertainty of the measurand. This propagation method nevertheless has some limitations, principally when the model of the measurand is non-linear. In such a case, indeed, the shape of the PDF is distorted and some bias is observed for the calculated mean value of the result.

2.2. Monte Carlo Simulation Method:

Recently, a supplement to the GUM (GUM S1) has shown how to overcome this problem by using the Monte Carlo simulation Method (MCM) to evaluate uncertainties. MCM is a computational algorithm that relies on repeated random sampling to obtain numerical results and derive statistical parameters (mean value, standard deviation) [7-8]. MCM is a common tool in the uncertainty evaluation of complex measurement processes. It is used because of the lack of, or the difficulty to express, analytical solutions. The convergence rate of Monte Carlo methods is O(1/√N), where N is the number of simulated experiments. Instead of using pseudo-random generators, the method can be accelerated by employing deterministic uniformly distributed sequences known as low-discrepancy sequences. Methods based on such sequences are named Quasi Monte Carlo. Asymptotically, Quasi Monte Carlo can provide a rate of convergence of about O(1/N) [9]. MCM, however, needs numerous repeated random samples and thus often leads to large tables of simulated values.
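A minimal sketch of this repeated-sampling approach (not the GUM S1 reference algorithm) is given below for a generic measurement model; the model and the input PDFs are arbitrary placeholders.

```python
# Minimal Monte Carlo propagation sketch for a generic model y = f(x1, x2).
# The model and the input PDFs are illustrative placeholders.
import numpy as np

rng = np.random.default_rng(1)
N = 10**5                                  # number of simulated experiments

def model(x1, x2):
    return x1 * np.sqrt(x2)                # placeholder measurement model

x1 = rng.normal(10.0, 0.05, N)             # Gaussian input quantity
x2 = rng.uniform(1.9, 2.1, N)              # rectangular input quantity

y = model(x1, x2)
print("estimate: %.4f  standard uncertainty: %.4f" % (y.mean(), y.std(ddof=1)))
print("95 %% coverage interval: [%.4f, %.4f]" % tuple(np.percentile(y, [2.5, 97.5])))
```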

2.3. Sobol's method:

In the analytical propagation approach, the sensitivity coefficients may also be defined by Sobol's approach [10]. This method [11] is a variance-based global sensitivity analysis technique founded upon "Total Sensitivity Indices" that account for interaction effects of the variables. The Total Sensitivity Index of an input is defined as the sum of all the sensitivity indices involving that input. It therefore includes both the main effect of each input and its interactions with the other variables [12]. Sobol's method can cope with both nonlinear and non-monotonic models, and provides a truly quantitative ranking of inputs and not just a relative qualitative measure. Effort has been made to reduce the computational complexity associated with the calculation of Sobol's indices. However, even with its most recent developments, Sobol's method remains computationally time consuming.


2.4. Jack-knife, Bootstrap or delete d-Jack-knife methods:

To reduce the computing time of MCM, the Jack-knife, Bootstrap or delete d-Jack-knife methods can be used to estimate the uncertainties of ISO standard specifications [13, 14].

The Jack-knife was thought up by Quenouille in 1949. Ten years later, Tukey developed its use in statistics. This method requires less computational power than MCM. For a dataset x = (x1, x2, ..., xn) of size n and an estimator θ̂, the Jack-knife derives estimators θ̂(i) on subsamples that leave out a given selected element xi. The subsample is defined by the equation:

x_{(i)} = (x_1, x_2, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)    (1)

The size of each Jack-knife subsample x(i) is p = n − 1 and the total number of datasets that can be built is n. No sampling method is needed to define the subsamples. To estimate an uncertainty, the standard error of the Jack-knife replications is needed. Its estimate is defined by [15]:

\hat{s}_e = \sqrt{\frac{n-1}{n}\sum_{i=1}^{n}\left(\hat{\theta}_{(i)} - \hat{\theta}_{(\bullet)}\right)^2}, \quad \text{with}\quad \hat{\theta}_{(\bullet)} = \frac{1}{n}\sum_{i=1}^{n}\hat{\theta}_{(i)}    (2)

Generally, the Jack-knife method gives fine results for smooth statistics and for sufficiently large n. Nevertheless, it does not give accurate estimations for non-smooth statistics or nonlinear behavior.

The Bootstrap method was thought up after the Jack-knife method; B. Efron introduced it in 1979. For a dataset x = (x1, x2, ..., xn) of size n and an estimator θ̂, the Bootstrap derives the estimator θ̂(b) on a resample b of the same size n. Each resample is obtained by random sampling with replacement from the original dataset. The total number of resamples that can thus be built is n^n. A lower number B of datasets is however used in practice to estimate uncertainties. It is usually fixed to B = 200 for standard error estimation and B = 500 for error bar estimation. The standard error of the estimator θ̂ can be derived from the Bootstrap replications using the equation [13, 15]:

\hat{s}_e = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}\left(\hat{\theta}_{(b)} - \hat{\theta}_{(\bullet)}\right)^2}, \quad \text{with}\quad \hat{\theta}_{(\bullet)} = \frac{1}{B}\sum_{b=1}^{B}\hat{\theta}_{(b)}

The delete d-Jack-knife method consists in generating subsamples simply by randomly removing a number d of elements from the initial dataset. The size of each subsample is thus n − d. The total number of subsamples that can be built is the number of combinations of d elements removed from the original dataset of size n. As compared to the earlier Jack-knife scheme, the delete d-Jack-knife


sub-sampling technique thus leads to a greater number of sub-datasets. This can improve the accuracy of the method in the case of non-smooth statistics [16]. The standard error of the estimator θ̂ can be evaluated through this equation [15]:

\hat{s}_e = \sqrt{\frac{n-d}{d\,\binom{n}{d}}\sum_{i=1}^{\binom{n}{d}}\left(\hat{\theta}_{(i)} - \hat{\theta}_{(\bullet)}\right)^2}, \quad \text{with}\quad \hat{\theta}_{(\bullet)} = \frac{1}{\binom{n}{d}}\sum_{i=1}^{\binom{n}{d}}\hat{\theta}_{(i)}    (3)

To obtain an accurate estimate of the standard error, the number d of deleted data elements has to be selected in the range \sqrt{n} \le d \le n-1.
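As an illustration, the sketch below evaluates the Jack-knife standard error of equation (2) and a randomized version of the delete d-Jack-knife standard error of equation (3), in which the C(n, d) subsamples are replaced by a fixed number of random subsamples, a practical shortcut rather than the exact formula; the dataset and the estimator (the sample mean) are placeholders.

```python
# Minimal sketch of the Jack-knife (eq. 2) and a randomized delete d-Jack-knife
# (approximating eq. 3 with random subsamples instead of all C(n, d) combinations).
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(10.0, 0.5, 30)              # illustrative dataset
theta = np.mean                            # placeholder estimator (smooth statistic)

def jackknife_se(x, theta):
    n = len(x)
    reps = np.array([theta(np.delete(x, i)) for i in range(n)])
    return np.sqrt((n - 1) / n * np.sum((reps - reps.mean())**2))

def delete_d_jackknife_se(x, theta, d, n_subsamples=500):
    n = len(x)
    reps = np.array([theta(x[rng.choice(n, size=n - d, replace=False)])
                     for _ in range(n_subsamples)])
    # (n - d)/d scaling of eq. (3); the sum over all C(n, d) subsamples is replaced
    # by an average over the randomly generated subsamples
    return np.sqrt((n - d) / d * np.mean((reps - reps.mean())**2))

print("jack-knife SE:          %.4f" % jackknife_se(x, theta))
print("delete-d jack-knife SE: %.4f" % delete_d_jackknife_se(x, theta, d=6))
print("analytical SE of mean:  %.4f" % (x.std(ddof=1) / np.sqrt(len(x))))
```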

An overview of other potential methods that can be used to estimate the uncertainties of geometrical specifications has been presented in this section. A modified delete d-Jack-knife method was finally chosen in our study to evaluate the measurement uncertainty of an ISO 1101 specification. The results of this work will be developed in the next section.

3. Estimation of the verification uncertainty of a geometrical specification using a modified Jack-knife method:

3.1. Geometrical specification checking:

Figure 1 shows an example of a parallelism constraint as specified in the ISO standard. The tolerance zone which defines the limits of the checked surface is bounded by two planes parallel to the datum plane A. In the example of Figure 1, which deals with the parallelism between the specified surface and the datum plane A, the geometrical defect to be evaluated and checked is defined by the distance between the two planes that bound the measured points Mj while remaining parallel to the datum feature A, itself characterized by the digitized coordinates Mi. In a given reference frame (O, X, Y, Z) these requirements are expressed by the minimisation conditions:

\min\big[\max_i(e_i) - \min_i(e_i)\big] \quad\text{and}\quad \min\big[\max_j(e_j) - \min_j(e_j)\big] \ \text{while}\ \vec{n} = \vec{n}_A,
\qquad\text{where}\quad e_i = \overrightarrow{OM_i}\cdot\vec{n}_A, \quad e_j = \overrightarrow{OM_j}\cdot\vec{n}    (4)

This equation needs two optimisation steps. The initial optimisation, which corresponds to the first minimization condition, permits the determination of the normal vector n_A of the datum plane A. The final step, which is expressed by the second minimization condition, defines the parallelism defect between the specified surface and the datum feature A. The direction cosines of the normal vectors of the specified surface and the datum surface are imposed to be equal. In consequence, after the best fit, the value of the minimax criterion applied to the distances between the measured points and the specified surface gives the value of the parallelism defect. These calculations thus allow the parallelism defect of the checked surface to be evaluated. The next section will focus on the presentation of the method used to estimate the uncertainty of this value.

Figure 1. ISO Parallelism Specification.
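To make the two-step evaluation concrete, the following sketch computes a parallelism defect from two simulated point sets; for simplicity the datum normal is obtained by a least-squares plane fit, used here only as a stand-in for the minimax association described above, and the point sets are synthetic placeholders.

```python
# Minimal sketch of the parallelism defect evaluation: least-squares datum plane
# (a simplification of the minimax association used in the paper), then the
# min-max spread of the specified surface along the datum normal.
import numpy as np

rng = np.random.default_rng(3)

def plane_normal_lsq(points):
    """Unit normal of the least-squares plane through a point cloud."""
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[-1]                                # direction of smallest variance

def parallelism_defect(datum_pts, spec_pts):
    n_A = plane_normal_lsq(datum_pts)            # first optimisation step (datum A)
    e = spec_pts @ n_A                           # signed distances along n_A
    return e.max() - e.min()                     # width of the bounding zone

# Illustrative datasets: 25 points per surface with a small form defect (mm)
xy1 = rng.uniform(0, 50, (25, 2))
datum = np.column_stack([xy1, rng.normal(0.0, 0.002, 25)])
xy2 = rng.uniform(0, 50, (25, 2))
spec = np.column_stack([xy2, 10.0 + 0.0004 * xy2[:, 0] + rng.normal(0.0, 0.002, 25)])

print("parallelism defect: %.4f mm" % parallelism_defect(datum, spec))
```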

3.2. Modified Jack-knife method:

The orientation of the datum surface A is greatly influenced by outlier measured points. This fact can lead to a non-smooth statistic for the searched parallelism defect. As stated in Section 2.4, the Delete d-Jack-knife permits standard errors to be accurately estimated even in the case of non-smooth statistics [16]. This method was therefore chosen to estimate the verification uncertainty of the parallelism defect.

In the classical Delete d-Jack-knife method, the number of deleted points is usually fixed to a given value d that remains the same for all subsamples. In the proposed method, on the contrary, this number d was selected randomly. In order to contain sufficient statistical information, both the datum plane and the specified surface were characterized by datasets of at least 25 acquisition points. For each sub-sampling, the modified Delete d-Jack-knife then consisted in the random generation of two subsets of the initial data: one constructed with the coordinates that define the datum plane and one created with the points acquired to characterize the specified feature.


Figure 2. Sub-sampling procedure and addition of the acquisition uncertainty.

The smallest number of points Mi or Mj required to build these two sub-samples was fixed to 4, that is, one more than the minimal number needed to define a plane (3 points). A specific selection procedure was implemented to generate the two sub-datasets. For every sub-sampling sequence, it consisted in randomly associating with each element of the initial datasets a floating-point number drawn in the range [0,1]. A cut-off threshold p was then fixed so as to select only the points with an associated value lower than this limit. A random perturbation was finally added to each coordinate to account for the calibration uncertainty of the Coordinate Measuring Machine (CMM). The ISO 10360 standard [17] was used for this last operation. This standard gives the calibration error bar ∆ for a measured length L in the CMM volume. For a CMM, this value can be defined by the equation:

\Delta = \pm\,(a + b\,L)    (5)

The measured length L was derived from the coordinates of the acquired points Mi or Mj. This equation was then applied to add random perturbations to the three initial coordinates, which permits the acquisition error ∆ of the CMM to be accounted for. These calculations assume a uniform probability density of the calibration errors in the range [−∆, ∆]:

X^{*}_{M_{i\,\mathrm{or}\,j}} = X_{M_{i\,\mathrm{or}\,j}} + \varepsilon, \quad \text{with}\quad \varepsilon = \Delta\left(2\,\mathrm{Rnd}() - 1\right)
\quad\text{and}\quad L = \sqrt{X^{2}_{M_{i\,\mathrm{or}\,j}} + Y^{2}_{M_{i\,\mathrm{or}\,j}} + Z^{2}_{M_{i\,\mathrm{or}\,j}}}\,; \qquad Y^{*} \text{ and } Z^{*} \text{ are similarly defined.}


In these relationships, Rnd() is the random generation of a uniformly distributed variable in the interval [0,1]. The whole procedure used to build each sub-sample is presented in Figure 2. After optimisation based on different tests, the value of the cut-off threshold p was set to 0.4. This value guarantees that sub-samples with a sufficient number of points are obtained.
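The sub-sampling and perturbation procedure of Figure 2 can be sketched as follows; the CMM coefficients a and b of equation (5), the measured point sets and the defect estimator (a crude least-squares stand-in for the minimax evaluation of Section 3.1) are illustrative placeholders.

```python
# Minimal sketch of the modified delete d-Jack-knife sub-sampling of Fig. 2:
# random cut-off selection (p = 0.4, at least 4 points kept) plus a uniform
# perturbation of each coordinate according to eq. (5).
import numpy as np

rng = np.random.default_rng(4)
p = 0.4                      # cut-off threshold chosen in the paper
a, b = 0.0025, 2.5e-6        # placeholder CMM coefficients of eq. (5), mm and mm/mm

def subsample(points):
    """Randomly select points (associated value < p), keeping at least 4."""
    while True:
        keep = rng.uniform(0.0, 1.0, len(points)) < p
        if keep.sum() >= 4:
            return points[keep]

def perturb(points):
    """Add a uniform perturbation in [-Delta, +Delta] to each coordinate, eq. (5)."""
    L = np.linalg.norm(points, axis=1, keepdims=True)
    delta = a + b * L
    return points + delta * (2.0 * rng.uniform(0.0, 1.0, points.shape) - 1.0)

def defect(datum_pts, spec_pts):
    """Placeholder parallelism-defect estimator relative to a least-squares datum plane."""
    A = np.column_stack([np.ones(len(datum_pts)), datum_pts[:, 0], datum_pts[:, 1]])
    c = np.linalg.lstsq(A, datum_pts[:, 2], rcond=None)[0]
    e = spec_pts[:, 2] - (c[0] + c[1] * spec_pts[:, 0] + c[2] * spec_pts[:, 1])
    return np.ptp(e)

# Illustrative measured datasets (25 points each, coordinates in mm)
datum = np.column_stack([rng.uniform(0, 50, (25, 2)), rng.normal(0, 0.002, 25)])
spec = np.column_stack([rng.uniform(0, 50, (25, 2)), 10 + rng.normal(0, 0.002, 25)])

M = 1000   # number of replications
reps = [defect(perturb(subsample(datum)), perturb(subsample(spec))) for _ in range(M)]
print("defect estimate: %.4f mm, standard error: %.4f mm"
      % (np.mean(reps), np.std(reps, ddof=1)))
```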

3.3. Error Bar of ISO specification:

The uncertainty of a given specification is defined by the standard error of the geometrical defect to be characterized. It is based on the repeated random generation of subsamples of the datum and specified surfaces using the modified d-Jack-knife procedure. After estimation of the mean value of the geometrical defect to be checked, the initial set of points is replicated by the random d-Jack-knife method already presented. At each replication step, the geometrical defect of the virtual surfaces associated with the generated datasets is then computed. This operation is repeated M times. The standard error of the set of values thus obtained is finally calculated. It represents the uncertainty of the estimated geometrical defect. The lower bound of the tolerance interval of orientation specifications is always equal to 0. A test was therefore implemented in the calculation of the error bars of the estimated geometrical defect to avoid negative values for the starting point of the confidence interval. If this check detects a negative indicator, a unilateral probability distribution is considered to define the error bars; the lower bound of the confidence interval is then fixed to 0. A bilateral probability distribution is considered otherwise.

4. Conclusion:

The uncertainty calculation of ISO specifications is a complex task. In an industrial context, the calculation time is a major constraint. The revised version of the GUM (GUM S1) proposes the use of Monte Carlo simulations for the evaluation of uncertainties, but this method is very computer time consuming. To avoid this impediment, an alternative uncertainty calculation method was proposed. It is based on a modified delete d-Jack-knife sub-sampling technique. The results obtained with this method for measured surfaces with a small form defect are in complete adequacy with the ISO 1101 standard for parallelism specification. The orientation of the datum surface was deduced tangent to the material using the minimax criterion, and the error bar of the ISO specification was deduced quickly using the modified Delete d-Jack-knife method.


References

1. ISO/IEC 17000:2004, Conformity assessment - Vocabulary and general

principles.

2. http://www.mesures.com/pdf/old/816-Dossier-Metrologie-3.pdf

3. ISO 1101: Third edition 2012, Geometrical product specifications (GPS) -

Geometrical tolerancing -Tolerances of form, orientation, location and run-

out.

4. Ricci, F., Scott, P.J., Jiang X., 2013, A categorical model for uncertainty

and cost management within the Geometrical Product Specification (GPS)

framework Precision Engineering 37, p.265- 274.

5. Maihle, J., Linares, J.M., Sprauel, J.M., 2009, The statistical gauge in

geometrical verification: Part I. Field of probability of the presence of

matter, Precision Engineering 33, p.333-341.

6. BIPM, IEC, ISO, IUPAC, IUPAP, OIML; "Guide to the expression of the

uncertainty in measurement, First Edition". 1993, ISBN 92-6710188-9.

7. Wen, X.L., Zhao, Y.B., Pan, J., 2013, Adaptive Monte Carlo and GUM methods for evaluation of measurement uncertainty of cylindricity error. Precision Engineering 37, p. 856-864.

8. Linares, J.M., Sprauel, J.M., Bourdet, P., 2009, Uncertainty of reference frames characterized by real time optical measurements: Application to Computer Assisted Orthopaedic Surgery. CIRP Annals - Manufacturing Technology 58, p. 447-450.

9. Søren Asmussen and Peter W. Glynn, Stochastic Simulation: Algorithms

and Analysis, Springer, 2007, 476 pages.

10. Allard, A. and Fischer N., 2009, Sensitivity analysis in metrology: study and

comparison on different indices for measurement uncertainty, Advanced

Mathematical and Computational Tools in Metrology and Testing VIII,

World Scientific, p1-6.

11. Sobol, I.M., 1993, Sensitivity estimates for nonlinear mathematical models,

Mathematical Modelling and Computation 1, p.407-414.

12. Saltelli, A., 2002. Making best use of model evaluations to compute

sensitivity indices, Computer Physics Communication 145, p.280-297.

13. Farooqui, S.A., Doiron, T., Sahay, C., 2009, Uncertainty analysis of

cylindricity measurements using bootstrap method, Measurement 42, p.524-

531.

14. Efron, B., 1993. An Introduction to the Bootstrap. Chapman & Hall.

15. Rapacchi, B., Une introduction au bootstrap, Centre Interuniversitaire de

Calcul de Grenoble, 1994, 74 pages.

16. Shao, J., Wu, C.F.J., 1989, A general-theory for jackknife variance-

estimation, Annals of statistics 17/3, p.1176-1197.

17. ISO 10360 Part 2: 2005, Performance assessment of coordinate measuring

machines.


Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. – )

ADDING A VIRTUAL LAYER IN A SENSOR NETWORK

TO IMPROVE MEASUREMENT RELIABILITY

U. MANISCALCO AND R. RIZZO

Istituto di Calcolo e Reti ad Alte Prestazioni - Italian National Research Council

Viale delle Scienze, Ed. 11, 90128, Palermo - ITALY. E-mail: umberto.maniscalco, [email protected]

Keywords: Soft Sensors, Measurement Estimation, Neural Network

1. Introduction

A layer of soft sensors based on neural networks is designed and trained with the aim of constituting a virtual measurement layer in a sensor network. Each soft sensor of the layer estimates the missing values of some hardware sensors by using the values obtained from other sensors, performing a spatial forecast. A correlation analysis for each parameter taken into account is used to define the cluster of real sensors used as measurement sources to estimate the missing values. An application in the field of fire prevention is used as a test case and for the evaluation of the results.

A sensor network is a set of transducers or sensory stations that uses a communication infrastructure to communicate with a remote station that collects the data. These stations can be distributed over a large area or even world-wide. In a distributed sensor network, the measures obtained from a sensor s can be considered to be related to those of the nearest sensors: for example, temperature can vary over a wide geographical area, but we expect that the values vary slowly between neighboring positions. Thus, there should exist a mathematical model that describes the functional link among these measures.

Soft Sensors, also known as Virtual Sensors or Inferential Models, are mathematical models implemented as software tools capable of calculating quantities that are difficult or impossible to measure. They are able to learn the functional link among measures and are then able to estimate a missing value starting from other measures. Their implementation is low-cost and they can also be used in parallel with hardware sensors1,2.


Using soft sensors based on artificial neural networks, we approximate the missing values of some hardware sensors by using the values obtained from other sensors. This can thus be thought of as a spatial forecast rather than a temporal forecast.

In the next section we present the theoretical framework; in Section 3 the experimental setup and the results are presented.

2. Theoretical framework

The starting hypothesis is that there exists some mathematical model describing the functional link among the measures of the same parameter in a sensor network; this means that the values measured by a real sensor s will be somehow related to the values measured by the set S1 of neighboring real sensors (see Fig. 1).

Fig. 1. The virtual layer and the hardware sensor network: the soft sensor ss corresponds to the real sensor s. It works using the data of the sensors in S1.

This is actually the key idea of the soft sensor network: a malfunctioning real sensor s can be substituted by a soft sensor (ss). In our case this soft sensor is implemented by a neural network that learns the correspondence between the values measured by s and the values measured by the sensors in the set S1, as depicted in Fig. 1. This is not an averaging operation: the neural network has to learn the relationship among the measures. Such a


soft sensor can be obtained for each hardware sensor in a sensor network, creating a second layer of soft sensors, the Virtual Layer in Fig. 1. This layer will be used to back up the sensor network, and each soft sensor can estimate the measure in place of the homologous real sensor.

In this paper we show how a virtual layer added to a sensor network can be used both to improve measurement reliability and as a backup for a set of geographically distributed sensors. Each soft sensor is constituted by one (or more than one) neural network that is trained using a set of data from other hardware sensory stations. The choice of the neural network model and of the set of data is a fundamental part of the design process.

It is straightforward to think that two neighboring sensors measuring the same parameter produce more closely related values than randomly coupled sensors. However, this is not always true. Microclimate conditions, for example, can produce uncorrelated measures even for very close sensors. Thus, we use a methodology to design an effective virtual layer starting from the correlation analysis for each parameter taken into account. The use of the correlation analysis (see Fig. 3) to select the suitable set of sensory stations, instead of the geographical neighborhood (see the shaded ring in the hardware layer of Fig. 1), improves the performance of the soft sensors. Each soft sensor ssx is trained using the set of data produced by the most correlated real sensors. The best number of most correlated sensors to use in the training and in the working phases is fixed experimentally for each soft sensor.
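A minimal sketch of this correlation-driven design is given below: the stations most correlated with the target sensor are selected and a regressor is trained to estimate its signal. A plain feed-forward network from scikit-learn is used only as a stand-in for the Elman recurrent network employed in the paper, and the hourly data are synthetic placeholders.

```python
# Minimal sketch of the virtual-layer design: pick the k stations most correlated
# with the target sensor and train a regressor to estimate its measurements.
# MLPRegressor is a stand-in for the Elman network; the data are synthetic.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(5)
n_hours, n_stations = 24 * 180, 30            # six months of hourly records, 30 stations
common = rng.normal(size=n_hours)             # shared regional signal
data = np.array([common * rng.uniform(0.5, 1.5) + rng.normal(0, 0.3, n_hours)
                 for _ in range(n_stations)]).T    # shape (hours, stations)

target, k = 14, 4                              # soft sensor ss14 built from 4 stations
corr = np.corrcoef(data, rowvar=False)[target]
corr[target] = -np.inf                         # exclude the target itself
sources = np.argsort(corr)[-k:]                # indices of the k most correlated stations
print("stations feeding ss%d:" % target, sources)

X, y = data[:, sources], data[:, target]
split = int(0.8 * n_hours)
model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
model.fit(X[:split], y[:split])

err = y[split:] - model.predict(X[split:])
print("error mean %.3f, standard deviation %.3f" % (err.mean(), err.std(ddof=1)))
```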

3. Experimental set up

The experimental set up is based on a set of weather sensory stations distributed over a region of about 70 km2 in Irpinia, Italy. The set of hardware sensors counts 30 sensory stations; each one measures, hourly, the soil moisture, the soil temperature, the leaf wetness and the air temperature.

The data are taken from a set of sensory networks that is a subset of the weatherlink networks.a Each of these stations produces a record of four measures each hour, so that, at the end of the day, we have a matrix of 24×4 values for each station.

We tested two different kinds of soft sensor topology. The first one feeds several inputs into the same Neural Network (see Fig. 2, left) to estimate the target parameter. The second one uses several Neural Networks to obtain several estimates of the target parameter and selects the best one using a gating network (see Fig. 2, right).

a http://www.weatherlink.com/


Fig. 2. Soft sensor topologies: several inputs feeding the same Neural Network (left side); several Neural Network outputs fused by a Gating Network (right side).

Fig. 3. A graphical representation of the correlation matrix for the parameter leaf wetness. The black stripes are due to malfunctioning stations.

The neural network used is the Elman neural network.

A statistical procedure based on specific evaluators2 is used to assess the performance of each soft sensor from a metrological point of view. Preliminary results, obtained using a dataset formed by six months of sensor network data acquisition, seem good and effective with respect to the kind of application. Here we report just one example of a result: the parameter Leafwet is estimated by ss14 (multi-input model), using data coming from the real sensor set formed by s8, s16, s3, and s1. We obtained an error vector with a standard deviation of 4.035 C and a very low mean.

4. Conclusions

The soft sensor virtual layer is a virtual structure that can be used as a backup for real sensors. This layer can be obtained using an Elman neural network for each soft sensor, and the error of these sensors can be greatly reduced if a suitable set of data is considered during the training phase of the neural network. Our experiments demonstrate that this approach is viable even for sensor networks spread over a wide area.

Fig. 4. Histogram of the error for the Leafwet parameter.

Acknowledgments

This work is part of a wider Italian research project named INSYEME (INtegrated SYstem for EMErgency).

References

1. P. Ciarlini and U. Maniscalco, Wavelets and Elman neural networks for monitoring environmental variables, Journal of Computational and Applied Mathematics 221, 302 (November 2008).

2. P. Ciarlini, U. Maniscalco and G. Regogliosi, Validation of soft sensors in Monitoring Ambient Parameters, in Advanced Mathematical and Computational Tools in Metrology VII, ed. E. Ciarlini et al., Series on Advances in Mathematics for Applied Science, Vol. 72 (World Scientific, 2006), pp. 252-259.


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 265–272)

CALIBRATION ANALYSIS OF A COMPUTATIONAL OPTICAL

SYSTEM APPLIED IN THE DIMENSIONAL MONITORING OF

A SUSPENSION BRIDGE

L. L. MARTINS†1, J. M. REBORDÃO2 AND A. S. RIBEIRO1

1Scientific Instrumentation Centre, Laboratório Nacional de Engenharia Civil,

Avenida do Brasil 101, 1700-066 Lisbon, Portugal

E-mail: †[email protected], [email protected]

www.lnec.pt

2Laboratory of Optics, Lasers and Systems, Faculdade de Ciências, Univ. de Lisboa

Campo Grande, 1749-016 Lisbon, Portugal

E-mail: [email protected]

www.ciencias.ulisboa.pt

This paper describes the analysis of the calibration procedure of a computational optical

system applied in the dimensional monitoring of the 25th of April suspension bridge

(P25A) in Lisbon (Portugal). The analysis includes the displacement optical measurement

approach, the calibration method, the reference standard prototype and the experimental

setup. The evaluation of the measurement uncertainty is described, including input

measurement uncertainty contributions related to the experimental design and the use of

Monte Carlo numerical simulation as tool for determination of the measurement

uncertainty related to the calibration test, as well as a sensitivity analysis to identify the

major sources of uncertainty. Conclusions are drawn about the suitability of the

calibration method and reference standard prototype.

Keywords: Optical Metrology, Computational Vision, Suspension Bridge, Displacement

1. Introduction

Safe mobility of persons and goods in transport networks is a growing concern

of society due to human and economic consequences related to eventual failure.

Visual inspection, observation and monitoring of key-elements in transport

networks − such as bridges and viaducts − provide relevant information on their

condition and structural safety. In this framework, several types of quantities can

be measured in order to characterize both structural actions and responses.

† Work partially supported by grant SFRH/BD/76367/2011 of the Portuguese National Foundation

for Science and Technology (FCT).


In the case of long-span suspension bridges, such as the Portuguese 25th of April bridge (P25A) shown in Figure 1, 3D displacement measurement in the main span of its stiffness beam is a challenge, since conventional instrumentation is not suitable due to the lack of nearby absolute reference points combined with the bridge's dynamical behaviour, characterized by low frequency and high amplitude vertical displacements (exceeding one meter).

Fig. 1. The 25th of April bridge in Lisbon (Portugal) with location of the measurement system.

Research efforts in this area have been geared towards non-contact measurement approaches, such as global navigation satellite systems and microwave interferometric radar systems. However, in the case of metallic bridges, as is the case of the P25A, the measurement accuracy of the mentioned systems can be compromised by multiple signal reflections on the stiffness beam components. This creates an opportunity for new approaches based on optical systems and computational vision, composed of active targets, high focal length lenses and digital cameras.

A computational optical system was designed and tested for continuous

monitoring of the P25A aiming to measure the 3D displacement of the central

section on its main span. The system (shown in Fig. 1) is composed of one

digital camera with a high focal length lens rigidly installed in the lower surface

of the bridge central section and orientated towards the south tower foundation,

where a set of four active targets1 is located, establishing an observation distance

close to 500 meters.

Prior to their installation on the P25A, both the digital camera and the set of

targets were subjected to laboratorial testing aiming at the determination of

intrinsic parameters2 (focal length and principal point coordinates) and world

coordinates (relative to one of the targets). Based on the knowledge of these

input quantities and supported on collinearity equations of the adopted pinhole

1 Each target was composed of 16 near-infrared LEDs in a circular pattern.
2 These tests revealed an irrelevant effect of the lens radial distortion and, therefore, this effect was not accounted for in both the parameterization and measurement processes.


geometric model, the camera’s projection centre can be determined using targets

image coordinates (obtained from digital image processing). The recorded

temporal evolution of the camera’s projection centre can be considered

representative of the bridge’s 3D displacement at the camera’s location.

This computational optical system was successfully tested on the P25A, allowing the measurement of vertical, transverse and longitudinal displacements of its main span central section: maximum values of, respectively, 1,69 m, 0,39 m and 0,07 m were recorded under standard operational road and rail traffic.

2. Collinearity equations between camera and targets

The optical approach for the displacement measurement of the P25A was supported by the collinearity equations established between image and world target points. Each observed target originated two equations, given by

x - u_0 + c\,\frac{r_{11}(X - X_0) + r_{21}(Y - Y_0) + r_{31}(Z - Z_0)}{r_{13}(X - X_0) + r_{23}(Y - Y_0) + r_{33}(Z - Z_0)} = 0 ,    (1)

y - v_0 + c\,\frac{r_{12}(X - X_0) + r_{22}(Y - Y_0) + r_{32}(Z - Z_0)}{r_{13}(X - X_0) + r_{23}(Y - Y_0) + r_{33}(Z - Z_0)} = 0 ,    (2)

where (u_0, v_0) are the principal point image coordinates, c is the focal length, (x, y) and (X, Y, Z) are the target image and world coordinates, respectively, r_{ij} are the rotation matrix elements related to the camera orientation and (X_0, Y_0, Z_0) are the projection centre world coordinates.

The solution of the collinearity equation system built this way is conventionally given by an iterative procedure based on the Generalized Least Squares method, which implies its linearization by a first order Taylor expansion. In this study – with a high focal length camera – this method showed numerical instability due to an ill-conditioned system, explained by the narrow field-of-view with several correlated variables. An alternative approach was studied – unconstrained non-linear optimization [1] – which revealed no numerical instability and allowed convergent solutions for the unknown variables to be obtained.
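A minimal sketch of this alternative approach is given below: the camera projection centre and orientation are obtained by unconstrained minimization of the collinearity residuals of equations (1)-(2) with the Nelder-Mead simplex method of [1]; the intrinsic parameters, target coordinates and image observations are synthetic placeholders.

```python
# Minimal sketch: estimate the camera projection centre (X0, Y0, Z0) and orientation
# by unconstrained minimization of the collinearity residuals (1)-(2) with Nelder-Mead.
# Intrinsic parameters, target world coordinates and observations are placeholders.
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.transform import Rotation

c = 0.600 / 4.8e-6                 # focal length in pixels (600 mm lens, 4.8 um pixels)
u0, v0 = 545.6, 960.4              # principal point (pixels)

# Four active targets (world coordinates, metres) observed from roughly 500 m
targets = np.array([[0.0, 0.0, 0.0], [0.35, 0.0, 0.0],
                    [0.0, 0.25, 0.0], [0.35, 0.25, 0.0]])

def project(pose, XYZ):
    """Image coordinates predicted by the collinearity equations (1)-(2)."""
    X0, Y0, Z0, a1, a2, a3 = pose
    r = Rotation.from_euler("xyz", [a1, a2, a3]).as_matrix()
    d = XYZ - np.array([X0, Y0, Z0])
    den = d @ r[:, 2]
    x = u0 - c * (d @ r[:, 0]) / den
    y = v0 - c * (d @ r[:, 1]) / den
    return np.column_stack([x, y])

true_pose = np.array([0.15, 0.10, -500.0, 0.002, -0.001, 0.0005])
obs = project(true_pose, targets) + np.random.default_rng(6).normal(0, 0.15, (4, 2))

def cost(pose):
    return np.sum((project(pose, targets) - obs) ** 2)

start = np.array([0.0, 0.0, -499.0, 0.0, 0.0, 0.0])
res = minimize(cost, start, method="Nelder-Mead",
               options={"xatol": 1e-8, "fatol": 1e-12, "maxiter": 20000, "maxfev": 20000})
print("estimated projection centre (m):", np.round(res.x[:3], 4))
```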

3. Calibration method and reference standard prototype

In order to assure reliable displacement measurements in the P25A study, the

computational optical system was calibrated according to a SI dimensional

traceability chain, and the measurement deviations and system’s accuracy were

both quantified. Since this system is able to perform a non-contact and long-


distance dimensional measurement, a field calibration method was developed by

placing both the camera and the set of targets in static regions of the structural

observation scenario, respectively, in the south anchorage and tower foundation

(see Fig. 1). This geometrical configuration testing mimics the operational

conditions of observation distance, line-of-sight elevation and atmospheric

effects.

A calibration device was specially built for the installation of the set of targets and the application of reference displacements in the transverse (X), vertical (Y) and longitudinal (Z) observation directions, allowing images of the set of targets to be obtained in different, well-known, 3D reference positions. The camera's virtual displacement results from the targets' displacement between positions (at the south tower foundation), maintaining the camera at the same static observation position (at the south anchorage).

Reference displacement values had been previously obtained in the laboratory by dimensional testing of the calibration prototype device. An SI-traceable 3D coordinate measuring (contact) machine was used to determine the spatial coordinates of the targets' LEDs in four positions: initial position; 250 mm in the longitudinal and vertical directions; and 350 mm in the transverse direction. Calibration deviations were found between −1,5 mm and 1,4 mm, showing no significant differences between displacement directions. If vertical refraction corrections [2] are applied to the targets' world coordinates in the vertical direction, the vertical deviation found for the Y direction (0,3 mm) becomes null, thus improving the system's accuracy.

4. Measurement uncertainty evaluation of calibration deviations

4.1. Probabilistic formulation of input quantities

4.1.1 Intrinsic parameters

The camera’s intrinsic parameters were obtained by the diffractive optical

element (DOE) method [3] using a collimated laser beam and a diffraction

grating with multi-level microstructure, generating a regular spatial distribution

of diffraction dots in the camera’s focal plane. The knowledge of the laser

wavelength and diffraction grating period, combined with the accurate

determination of the diffraction dots centroids location, allowed performing a

non-linear optimization [1] aiming to minimize the sum of differences between

ideal and measured locations, considering the pinhole geometrical model.

Due to the non-linear and iterative nature of the intrinsic parameterization, the

Monte Carlo Method (MCM) [4] was used for uncertainty propagation from


input quantities to the focal length and principal point coordinates. In this process, the laser wavelength estimate (632,8·10⁻⁹ m) was considered constant and a Gaussian probability density function (PDF) was adopted for the diffraction grating period (centered at 152,4·10⁻⁶ m with a standard uncertainty of 0,15·10⁻⁶ m, according to the manufacturer specifications). A measurement standard uncertainty of 0,25 pixel was assigned to the estimates of the diffraction dots' centroid locations, based on digital image processing performance. Results are shown in Table 1, supported on 10⁵ Monte Carlo runs.

Table 1. Estimates and standard uncertainties of the camera's intrinsic parameters

Intrinsic parameter      Focal length (mm)   Principal point x coordinate (pixel)   Principal point y coordinate (pixel)
Average value            599,95              545,61                                 960,41
Standard uncertainty     0,38                0,030                                  0,045

A significant correlation between intrinsic parameters was noticed (correlation

coefficients between − 0,25 and − 0,35), being included in the remaining

uncertainty propagation process. Sensitivity analysis revealed that the

uncertainty related to the diffraction dots centroids location is dominant relative

to the measurement uncertainty of the diffraction grating period, which only

contributed about 25% to the combined uncertainty.

4.1.2 Targets world coordinates

Regarding the targets world coordinates estimates, Table 2 presents the

corresponding uncertainty components and measurement combined uncertainties

for each displacement direction.

Table 2. Measurement uncertainty evaluation of targets world coordinates

Uncertainty component               Remarks
Dimensional testing                 Set of targets measured in a 3D coordinate measuring (contact) machine.
Target circularity                  Statistically evaluated based on the deviations obtained from all the performed least squares computational adjustments.
Environmental temperature changes   Thermal expansion/contraction of an aluminium frame, with temperatures ranging from 5 ºC up to 35 ºC.
Transportation and installation     Based on dimensional testing performed before and after installation in the P25A on three different occasions.
Vertical refraction correction      Related to the applied power-law model for the vertical refractive index.

Measurement combined uncertainty:   0,57 mm (X)   0,69 mm (Y)   0,36 mm (Z)


The major contributions to the combined uncertainty were identified, namely, the target transportation and installation (between 58 % and 61 %) for the X and Y directions and, for the Z direction, the target circularity (about 54 %), followed by the thermal expansion/contraction and transportation (23 % contribution from each mentioned component). It should be mentioned that, if no vertical refraction correction were applied to the vertical world target coordinates, the standard uncertainty of 0,69 mm for the Y direction would increase significantly, reaching a value of 6,5 mm in Summer.

4.1.3 Targets image coordinates

Two major uncertainty components – digital image processing (consisting of an

ellipse fitting algorithm for target centre location) and turbulence due to vertical

thermal gradients – were identified. In order to quantify their combined

uncertainty effect, beam wandering tests were also performed at the P25A in a

similar way as described for the system’s field calibration, but without moving

the set of targets. Considering that the camera and targets are in a static

observation condition, any modification in the targets image coordinates will be

justified by the uncertainty components mentioned and the sample experimental

standard deviation can be used as a parameter to characterize the dispersion of

values attributed to this input quantity. The beam wandering experimental

results were obtained for different observation conditions, namely, season of the

year (Summer or Winter) and shadow over the set of targets in the P25A south

tower foundation.

As expected, significant differences were found in the sample experimental standard deviation, ranging from 0,13 pixel (Winter, with shadow over the targets) up to 0,56 pixel (Summer, without shadow over the targets). It was also noticed that the y direction image coordinate has a slightly higher standard deviation than that relative to the x direction, due to more relevant thermal vertical gradients in Summer.

4.1.4 Reference displacements

Regarding the reference displacements applied to the targets of the calibration device prototype, Table 3 presents the related uncertainty components and measurement combined uncertainties for each displacement direction.

The uncertainty analysis of the data showed that the uncertainty of the transverse reference displacement (between 0,90 mm and 0,97 mm) was nearly two times larger than the uncertainties obtained for the remaining reference displacements. This is explained by the influence of the uncertainty component due to the differential displacement between targets, which is quite high for this displacement direction


(near 0,8 mm). Besides this component, other major uncertainty contributions

found were mainly related to the dimensional measurement of the targets at

initial and final positions.

Table 3. Measurement uncertainty evaluation of reference displacements

Uncertainty component                                 Remarks
Initial and final positioning measurement             Combines both the dimensional testing measurement uncertainty and the circularity deviation.
Return to zero deviation                              Based on observed zero deviations when the set of targets returns to the initial position after each displacement.
Differential displacement between the four targets    Quantified by the maximum deviation between target displacements in all directions.
Targets installation repeatability                    Due to the manual installation of the set of targets in each reference position of the calibration device.

Measurement combined uncertainty:
Transverse displacement     0,97 mm (X)   0,91 mm (Y)   0,90 mm (Z)
Vertical displacement       0,58 mm (X)   0,58 mm (Y)   0,47 mm (Z)
Longitudinal displacement   0,59 mm (X)   0,51 mm (Y)   0,45 mm (Z)

4.2. Intermediate quantities

Due to the non-linear and iterative nature of the optimization procedure related to the determination of the camera's 3D position, the MCM was used to propagate the PDFs of the input quantities, namely, of the intrinsic parameters (section 4.1.1), target world coordinates (section 4.1.2) and target image coordinates (section 4.1.3). With respect to this last quantity, a measurement standard uncertainty of 0,15 pixel was assumed, considering that the calibration was performed in favourable conditions (in Winter, with shadowed targets). Numerical simulations (with 10⁵ runs) led to similar standard measurement uncertainties between the initial and final positions in all displacement directions: 0,95 mm – 1,0 mm for the X and Y directions and 8,5 mm up to 9,0 mm for the Z direction. Sensitivity analysis showed that the targets image coordinates uncertainty provides the major contribution (close to 60 %) to the camera's 3D position uncertainty, followed by the targets world coordinates uncertainty, with a contribution between 18 % and 30 %.

4.3. Output quantities

The combination of the measurement uncertainties of the camera's initial and final positions (section 4.2) and of the reference displacements (section 4.1.4) originated the measurement expanded uncertainties (95 %) related to the calibration deviations. Measurement uncertainties in the X and Y directions showed similar values (roughly around 3 mm). A significant difference is noticed when compared to the results obtained for the Z direction, where an expanded uncertainty value of 26 mm was obtained. This fact results from the longitudinal alignment between the camera and the stiffness beam, which results in reduced measurement sensitivity in the mentioned displacement direction. If the calibration operation were performed during Summer without shadow over the targets, the expanded uncertainty would increase from 3 mm to 8,5 mm in the X and Y directions and from 26 mm up to 81 mm in the Z direction, since the targets image coordinate uncertainty is a major uncertainty component, which reinforces the importance of performing the calibration in favourable environmental conditions.

5. Conclusions

The proposed calibration method and reference standard prototype allowed the estimates of the 3D displacement calibration deviations and their measurement uncertainties to be determined. In the case of transverse and vertical displacements, measurement expanded uncertainties lower than 10 mm (as required for the structural analysis of the P25A) were achieved, even in a worst-case field calibration scenario, demonstrating the suitability of the computational optical system, the proposed calibration method and the reference dimensional standard.

Longitudinal displacement measurements were affected by the camera's reduced sensitivity in that direction due to the adopted measurement geometrical configuration (foundation/central section). This remark is less important in the case of the P25A because this bridge is already equipped with longitudinal displacement transducers located at the stiffness beam connections in the north and south anchorages, which provide more accurate estimates of the longitudinal displacement.

References

1. L. Lagarias, J. Reeds, M. Wright and P. Wright, Convergence properties of

the Nelder-Mead simplex method in low dimensions, SIAM Journal of

Optimization 9, 1, 112-147 (1998).

2. L. Martins, J. Rebordão and A. S. Ribeiro, Thermal influence on

long-distance optical measurement of suspension bridge displacement,

International Journal of Thermophysics 35, 1 (2014).

3. M. Bauer, D. Grießbach, A. Hermerschmidt, S. Krüger, M. Scheele and A.

Schischmanow, Geometrical camera calibration with diffractive optical

elements, Optics Express, 16, 25, 20241-20248 (2008).

4. Supplement 1 to the Guide to the expression of Uncertainty in Measurement

(S1 GUM), Joint Committee for Guides in Metrology (France, 2008).


Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. – )

DETERMINATION OF NUMERICAL UNCERTAINTY

ASSOCIATED WITH NUMERICAL ARTEFACTS FOR

VALIDATING COORDINATE METROLOGY SOFTWARE

HOANG D MINH, IAN M SMITH AND ALISTAIR B FORBES∗

National Physical Laboratory, Hampton Road,TW11 0LW, Middlesex, United Kingdom

∗E-mail: [email protected]

In manufacturing engineering an important task is to assess the conformance of a manufactured part to its design specification. This task is usually carried out by measuring the manufactured part using a coordinate measuring machine to provide a set of coordinate data. Software is used to fit the design surface to the data, and the closeness of the data to the fitted surface is examined to determine if the manufactured part is within tolerance. The result of this process strongly depends on the correctness of the software used for fitting. To validate the software, one often uses pre-generated reference datasets. A reference dataset is associated with a reference solution, which is compared with the solution returned by the software under test, referred to as the test solution.

In real world applications datasets are represented and stored using finite numbers of digits. Using such finite precision, when the software under test processes a dataset, the test solution generally differs from the reference solution. In this paper we present a method for determining the numerical uncertainty of the reference solution that is then used in the comparison of the test solution with the reference solution. The results of applying the method to fitting data to geometric elements are presented.

Keywords: Uncertainty, numerical error, geometric element

1. Introduction

Reference data sets can be used to evaluate the performance of metrology software1–3 and there are a number of methodologies that can be applied to generate such data sets.4,5 However, the question arises of how accurate these reference data sets are.6 This is necessary since, when we evaluate the performance of software, we need to know to what extent the difference between the test results and the reference results can be explained in terms of the uncertainty associated with the reference results. This paper is concerned with evaluating the effect of the finite precision representation of data sets


on the numerical accuracy of the data sets for evaluating the performance

of coordinate metrology software.

2. Computational aims represented as input-output models

We assume that a computational aim associates with an input data vector x ∈ R^m an output vector a ∈ R^n. We assume that a = A(x) is a deterministic function of x, so that knowing x and the computational aim uniquely determines a. We refer to x and a as a reference pair, sometimes denoted by ⟨x, a⟩_A. Often, A(x) is given implicitly by equations of the form g(x, a) = 0. The input and output vectors could be partitioned to specify different types of inputs and outputs. For the case of least squares orthogonal distance regression (LSODR) with geometric elements,7–9 the inputs are a 3m-vector of coordinates x_I = (x_1, y_1, z_1, x_2, y_2, z_2, ..., x_m, y_m, z_m)^T and the primary outputs are the n-vector of parameters a specifying the geometric element and the m-vector of residual distances.

3. Sensitivity matrix associated with a computational aim

Suppose a = A(x), where A is sufficiently smooth so that A has continuous first derivatives with respect to the x_i. (We note that not all computational problems satisfy this smoothness condition.) Let S be the n × m sensitivity matrix with
$$S_{ji} = \frac{\partial a_j}{\partial x_i}, \quad i = 1, \ldots, m, \quad j = 1, \ldots, n.$$
For many problems, the condition a = A(x) can be written as a set of equations involving x and a of the form g(x, a) = 0. These conditions define a implicitly as a function of x. In this case, we have
$$H S + J^T = 0, \qquad J_{ik} = \frac{\partial g_k}{\partial x_i}, \qquad H_{kj} = \frac{\partial g_k}{\partial a_j},$$
so that S = −H^{−1} J^T.
For the case of LSODR with geometric elements, the optimality conditions are g(x_I, a) = 0 with
$$g_k(x_I, a) = \sum_{i=1}^{m} d_i \frac{\partial d_i}{\partial a_k}, \qquad d_i = d(x_i, a),$$
$$\frac{\partial g_k}{\partial a_j} = \sum_{i=1}^{m}\left(d_i \frac{\partial^2 d_i}{\partial a_k \partial a_j} + \frac{\partial d_i}{\partial a_k}\frac{\partial d_i}{\partial a_j}\right), \qquad \frac{\partial g_k}{\partial x_i} = d_i \frac{\partial^2 d_i}{\partial a_k \partial x_i} + \frac{\partial d_i}{\partial a_k}\frac{\partial d_i}{\partial x_i},$$
with similar expressions for ∂g_k/∂y_i and ∂g_k/∂z_i.
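As an illustration of how these relations can be evaluated in practice, the following Python sketch (ours, not part of the paper; the function g and the names are assumptions) forms J and H by central finite differences and obtains the sensitivity matrix from S = −H^{−1} J^T:

    import numpy as np

    def sensitivity_matrix(g, x, a, h=1e-7):
        # S (n x m) for a solution a = A(x) defined implicitly by g(x, a) = 0,
        # using H S + J^T = 0, i.e. S = -H^{-1} J^T
        m, n = len(x), len(a)
        J = np.zeros((m, n))            # J_ik = dg_k / dx_i
        H = np.zeros((n, n))            # H_kj = dg_k / da_j
        for i in range(m):
            dx = np.zeros(m); dx[i] = h
            J[i, :] = (g(x + dx, a) - g(x - dx, a)) / (2 * h)
        for j in range(n):
            da = np.zeros(n); da[j] = h
            H[:, j] = (g(x, a + da) - g(x, a - da)) / (2 * h)
        return -np.linalg.solve(H, J.T)

For geometric elements, g would implement the LSODR optimality conditions given above; analytic derivatives can of course replace the finite differences.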


4. Finite precision representation of real numbers

We assume that computer representable real numbers, e.g., in IEEE floating point arithmetic,10 belong to a finite set F ⊂ R. For x ∈ R, x_f = f(x) ∈ F is the output of the rounding operator f : R → F. We are not particularly interested in the exact specification of the rounding operator, only in its general behaviour. For f ∈ F, [f] = {x ∈ R : f(x) = f} is the subset of R that is rounded to f ∈ F. Given x ∈ R, we define the rounding error e(x) associated with x to be given by
$$e(x) = \max_{x' \in [x_f]} |x' - x_f|. \qquad (1)$$
With this definition, [x_f] = [x_f − e(x), x_f + e(x)]. Note that e(x) = e(x_f) for all x ∈ [x_f]. For a vector x ∈ R^m, we set e(x) = (e(x_1), ..., e(x_i), ..., e(x_m))^T. We assume that the internal machine representation of x_f could be any element of [x_f].
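For data sets stored with a fixed number of decimal digits, the rounding error defined in (1) is simply half a unit in the last retained place. A small sketch (ours; round-half-to-nearest is assumed):

    import numpy as np

    def round_data(x, decimals):
        # return (x_f, e) where x_f = f(x) and e holds the elementwise
        # rounding half-widths e(x_i) for rounding to 'decimals' places
        x = np.asarray(x, dtype=float)
        xf = np.round(x, decimals)
        e = np.full(x.shape, 0.5 * 10.0 ** (-decimals))
        return xf, e

For example, rounding coordinates to 13 decimal places gives e(x_i) = 5 × 10^−14, the value used in the example of section 6.3.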

5. Numerical artefacts and numerical standards

A numerical artefact associated with a computational aim A is a pair

〈xf ,af 〉A of finite precision vectors representing a reference pair 〈x,a〉A.

The very fact that xf and af are finite precision vectors means that, in

general, af will not be an exact solution for xf , but we say that, nominally,

af = A(xf ). By analogy, a physical mass standard may be made to have

a nominal mass of 1 kg, but its actual mass will differ from 1 kg. A numerical standard is a numerical artefact for which a quantitative measure, its numerical uncertainty, of how far ⟨x_f, a_f⟩_A is from an exact reference pair ⟨x, a⟩_A has been evaluated. By analogy again, a physical artefact is regarded as a physical standard if it has been calibrated and given a calibrated value and an associated uncertainty.

The numerical uncertainty associated with a numerical artefact will have

a contribution that arises simply by the representation of a mathematically

exact reference pair a = A(x) in finite precision. Additionally, there may be

an uncertainty component arising from the fact that a and x are arrived at

using a computational approach that is approximate. In this paper we are

interested only in the contribution due to the finite precision representation

of the numerical artefact. In practice 〈xf ,af 〉 are often derived by rounding

reference artefacts determined using extended precision and the rounding

operator is the dominant uncertainty component.


6. Evaluating the numerical accuracy of a reference data set

6.1. Numerical accuracy bounds

For computational aims that define a as a sufficiently smooth function a = A(x) of the data x, it is possible to derive numerical accuracy bounds using the sensitivity matrix associated with the computational aim discussed in section 3. If x is perturbed by ε, where ‖ε‖ ≈ 0, then, to first order, a(x + ε) ≈ a(x) + Sε, so that a is perturbed by δ = Sε. Let s_j, j = 1, ..., n, be the 1 × m row vectors of S, so that δ_j = s_j ε.
If ‖x‖_p = (Σ_{i=1}^m |x_i|^p)^{1/p} is the p-norm of a vector x, then Hölder's inequality11 states that for vectors x and y, |x^T y| ≤ ‖x‖_p ‖y‖_q if 1/p + 1/q = 1. From this we have
$$|\delta_j| \le \|s_j\|_2\|\varepsilon\|_2, \qquad |\delta_j| \le \|s_j\|_1\|\varepsilon\|_\infty, \qquad |\delta_j| \le \|s_j\|_\infty\|\varepsilon\|_1.$$
Thus, if |ε_i| ≤ e > 0, then |δ_j| ≤ ‖s_j‖_1 e. More generally, if |ε_i| ≤ e_i, then
$$|\delta_j| \le \sum_{i=1}^{m} |s_{ji}|\, e_i, \qquad (2)$$
with equality achieved if ε_i = sign(s_{ji}) e_i or ε_i = −sign(s_{ji}) e_i, i = 1, ..., m, where sign(x) = 1 if x ≥ 0 and −1 for x < 0. This specification of ε represents the worst case scenario for δ_j.
Suppose a = A(x), x_f = f(x), a_f = f(a). For any x̃ ∈ [x_f], we have
$$|\tilde{x}_i - x_i| \le |\tilde{x}_i - x_{f,i}| + |x_{f,i} - x_i| \le 2 e(x_i).$$
If ã = A(x̃), S is the sensitivity matrix calculated for x_f, and ã_f = f(ã), then, from (2),
$$|\tilde{a}_j - a_j| \le \delta_j, \qquad \delta_j = 2 \sum_i |s_{ji}|\, e(x_i),$$
and so |ã_{f,j} − a_{f,j}| ≤ δ_j + e(a_j) + e(ã_j). This provides an upper bound on the difference between the reference solution a_f for x_f and another solution ã_f that can be accounted for by the rounding of x, a and ã.

6.2. Numerical uncertainty estimates

While numerical accuracy bounds provide a worst case scenario, for these bounds to be attained requires that the effects of the rounding operator match exactly the required conditions. In practice, this is extremely unlikely to happen. A statistical accuracy statement can be derived as follows. We regard x_f = f(x) as providing partial information about x and, given the 'observed' x_f, we assign a rectangular distribution to x: x|x_f ∼ R(x_f − e(x), x_f + e(x)). The standard deviation associated with this distribution is e(x)/√3. For the data vector x_f, the variance matrix V_{x|x_f} is a diagonal matrix with e^2(x_i)/3 in the ith diagonal element. The variance matrix V_{a|x_f} associated with a|x_f is estimated by
$$V_{a|x_f} = S\, V_{x|x_f}\, S^T. \qquad (3)$$
The variance matrix V_{a_f|x_f} = E + V_{a|x_f}, where E is the diagonal matrix with e^2(a_j)/3 in the jth diagonal element, takes into account the additional variance associated with the rounding of a.
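The bound (2) and the variance propagation (3) can be combined in a few lines. The following sketch (ours, with assumed names) returns the worst-case bounds and the numerical uncertainties given a sensitivity matrix S and the rounding errors of the data and the solution:

    import numpy as np

    def bounds_and_uncertainties(S, e_x, e_a):
        # S: n x m sensitivity matrix; e_x: rounding errors e(x_i); e_a: e(a_j)
        S, e_x, e_a = np.asarray(S), np.asarray(e_x), np.asarray(e_a)
        delta = 2.0 * np.abs(S) @ e_x                # accuracy bounds, cf. (2)
        V_x = np.diag(e_x ** 2 / 3.0)                # rectangular distributions
        V_a = S @ V_x @ S.T                          # eq. (3)
        V_af = V_a + np.diag(e_a ** 2 / 3.0)         # adds the rounding of a
        return delta, np.sqrt(np.diag(V_a)), np.sqrt(np.diag(V_af))

The square roots of the diagonal of V_{a|x_f} correspond to the numerical uncertainties u(a_j|x_f) reported in Table 1 below.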

6.3. Example: least squares cylinder fitting

Table 1 gives the numerical accuracy bounds δj and numerical uncertainties

given by the square roots of the diagonal elements of Va|xf defined in

(3) for two sets of 14 data points lying on a cylinder. The first set has

the data points uniformly distributed around two complete circles while

for the second set they are limited to semi-circles. For each coordinate,

e(xi) = 5× 10−14.

Table 1. Numerical accuracy bounds and numerical uncertainties associated with two data sets representing a cylinder. The units are 10^−13 mm.

           First data set           Second data set
           δj        u(aj|xf)       δj        u(aj|xf)
  x0       0.813     0.109          1.712     0.260
  y0       0.813     0.109          0.813     0.109
  r0       0.634     0.077          1.290     0.184

7. Concluding remarks

This paper has been concerned with the finite precision representation of

data sets – numerical standards – used to evaluate the performance of

metrology software. We have shown how to evaluate numerical accuracy

bounds and numerical uncertainties associated with numerical standards.

The numerical accuracy bounds represent a worst case scenario, while the

numerical uncertainties provide a statistical characterisation. Both types

of calculations involve the determination of the sensitivity matrix giving

the partial derivatives of the solution parameters with respect to the input

data.


Acknowledgement

This work has been undertaken as part of the EMRP project NEW06,

Traceability for computationally-intensive metrology, co-funded by the

UK’s National Measurement Office Programme for Materials and Modelling

and by the European Union. The EMRP is jointly funded by the EMRP

participating countries within EURAMET and the European Union.

References

1. B. P. Butler, M. G. Cox, A. B. Forbes, S. A. Hannaby and P. M. Harris, A methodology for testing the numerical correctness of approximation and optimisation software, in The Quality of Numerical Software: Assessment and Enhancement, ed. R. Boisvert (Chapman and Hall, 1997).
2. R. Drieschner, B. Bittner, R. Elligsen and F. Waldele, Testing Coordinate Measuring Machine Algorithms, Phase II, Tech. Rep. EUR 13417 EN, Commission of the European Communities (BCR Information) (Luxembourg, 1991).
3. F. Hartig, M. Franke and K. Wendt, Validation of CMM evaluation software using TraCIM, in Advanced Mathematical and Computational Tools for Metrology X, eds. F. Pavese, A. Chunovkina, M. Bar, N. Fischer and A. B. Forbes (World Scientific, Singapore, 2014). Submitted.
4. M. G. Cox and A. B. Forbes, Strategies for testing form assessment software, Tech. Rep. DITC 211/92, National Physical Laboratory (Teddington, 1992).
5. A. B. Forbes and H. D. Minh, Int. J. Metrol. Qual. Eng., 145 (2012).
6. G. J. P. Kok and I. M. Smith, Approaches for assigning numerical uncertainty to reference data pairs for software validation, in Advanced Mathematical and Computational Tools for Metrology X, eds. F. Pavese, A. Chunovkina, M. Bar, N. Fischer and A. B. Forbes (World Scientific, Singapore, 2014). Submitted.
7. A. B. Forbes, Least-Squares Best-Fit Geometric Elements, Tech. Rep. DITC 140/89, National Physical Laboratory (Teddington, 1989).
8. A. B. Forbes, Least squares best fit geometric elements, in Algorithms for Approximation II, eds. J. C. Mason and M. G. Cox (Chapman & Hall, London, 1990).
9. A. B. Forbes and H. D. Minh, Form assessment in coordinate metrology, in Approximation Algorithms for Complex Systems, eds. E. H. Georgoulis, A. Iske and J. Levesley, Springer Proceedings in Mathematics, Vol. 3 (Springer-Verlag, Heidelberg, 2011).
10. IEEE, IEEE Standard for Floating Point Arithmetic (IEEE Computer Society, Piscataway, NJ, 2008).
11. L. Hogben (ed.), Handbook of Linear Algebra (Chapman & Hall/CRC, Boca Raton, 2007).


9610-34:Advanced Mathematical and Computational Tools

Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 279–284)

LEAST-SQUARES METHOD AND TYPE B EVALUATION OF

STANDARD UNCERTAINTY

R. PALENČÁR, S. ĎURIŠ AND P. PAVLÁSEK†

Institute of automation, measurement and applied informatics

Slovak University of Technology,

Bratislava, 812 31, Slovakia †E-mail: [email protected]

www.sjf.stuba.sk

M. DOVICA AND S. SLOSARČÍK

Department of Biomedical Engineering and Measurement,

Technical University of Košice,

Košice, 04183, Slovakia

www.tuke.sk/tuke

G. WIMMER

Mathematical Institute, Slovak Academy of Sciences,

Bratislava, 814 73, Slovakia

www.sav.sk

Linear regression models are frequently used for modelling measurements with a larger number of output quantities. Very often the least squares method (LSM) is used to determine estimates of the unknown parameter values. When the estimation is done by the LSM, it is possible to make use of the input quantity uncertainties, and once the estimates of the unknown parameters have been obtained by the LSM, the uncertainties of the estimates and the covariances among them are determined according to the law of propagation of uncertainty. In the wide range of publications on this matter, only a few cases have considered type B evaluations of uncertainty (e.g. caused by measuring gauge error). A question arises whether there are cases where the type B uncertainties may be omitted while the unknown model parameters are estimated. In this paper a method of regression model parameter estimation is presented for the case when the type B evaluation of uncertainties does not affect the estimates.

Keywords: uncertainty, model of experiment, covariance matrix.


1. Introduction

Measurements of various physical quantities play a crucial role in many industrial fields. These measurements are used to define certain manufacturing procedures and to enhance their safety and effectiveness, as well as to better understand the process as a whole. Measurements provide valuable information, but this information would be of minor importance without knowledge of the measurement uncertainty.
To determine the overall uncertainty of a measurement with confidence, a series of repeated measurements has to be made. This increases costs and time consumption. In the following parts of this paper we describe cases in which the precision of the measuring device, and other influences of a similar nature, may be omitted from the estimation of the model parameters. This enables us to reduce the number of repeated measurements.
Measurements with a larger number of output quantities are frequently modelled by linear regression models. For the determination or estimation of the unknown parameter values the least squares method (LSM) is used. This method is one of the most commonly used methods for value estimation and it allows us to use the input quantity uncertainties. When the estimates of the unknown parameters are obtained by the LSM, the uncertainty propagation law makes it possible to determine the uncertainties of the estimates and the covariances between them. Further in this paper a method for the estimation of regression model parameters is described in which type B uncertainties are taken into account; furthermore, an example is given in which type B uncertainties do not affect the estimates of the unknown model parameters. The described procedures respect the principles of publications [1, 2].

1.1. Linear regression model of measurement

This section presents, in general terms, the linear regression model of measurement that is commonly used. The theoretical linear model of measurement is
$$W = AY \qquad (1)$$
The theoretical model in eq. (1) can be expressed as a stochastic model in the form
$$(W,\, AY,\, U_W) \qquad (2)$$
In the following, W is a random vector of input quantities with mean value E(W) = AY and covariance matrix D(W) = U_W. The component A represents a known (r × p) matrix and Y is the vector of output quantities (vector of unknown parameters). If the matrix A has rank p (the number of unknown parameters), with p < r (the number of equations), and U_W is a known positive definite matrix, the Best Linear Unbiased Estimator (BLUE) of Y in the model (2) is given by [3, 4]
$$\hat{Y} = (A^T U_W^{-1} A)^{-1} A^T U_W^{-1} W \qquad (3)$$
The covariance matrix of this estimate is
$$U_{\hat{Y}} = (A^T U_W^{-1} A)^{-1} \qquad (4)$$
In the case that U_W = σ² H_W, Ŷ can be expressed as
$$\hat{Y} = (A^T H_W^{-1} A)^{-1} A^T H_W^{-1} W \qquad (5)$$
and
$$U_{\hat{Y}} = \sigma^2 (A^T H_W^{-1} A)^{-1} \qquad (6)$$
In the case that σ² is unknown, we estimate this component from the following relation:
$$\hat{\sigma}^2 = (W - A\hat{Y})^T H_W^{-1} (W - A\hat{Y}) / (r - p)$$
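The estimators (3)–(6) translate directly into a few lines of code; the Python sketch below (ours, with illustrative names) is one way to evaluate them:

    import numpy as np

    def blue(A, W, U_W):
        # BLUE of Y in the model (W, AY, U_W): eq. (3) and its covariance, eq. (4)
        Ui = np.linalg.inv(U_W)
        U_Y = np.linalg.inv(A.T @ Ui @ A)
        return U_Y @ A.T @ Ui @ W, U_Y

    def blue_sigma_unknown(A, W, H_W):
        # eqs. (5)-(6) for U_W = sigma^2 H_W, with sigma^2 estimated from the residuals
        Y_hat, M = blue(A, W, H_W)
        r, p = A.shape
        res = W - A @ Y_hat
        sigma2 = res @ np.linalg.solve(H_W, res) / (r - p)
        return Y_hat, sigma2 * M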

2. Linear regression model of measurement and type B uncertainty

In metrological applications the vector W is often used in the form
$$W = X_1 + C X_2 \qquad (7)$$
In this equation X_1 is the vector of directly measured quantities and X_2 is the vector of input quantities whose estimates, uncertainties and covariances among the estimates are known from other sources. The matrix C expresses the structure of the influence of the input quantities X_2 on the measurement of the quantities X_1. The covariance matrix of the vector (7) is given by eq. (8); in this form it is valid only under the assumption that the random vectors X_1 and X_2 are independent:
$$U_W = \sigma^2 H_{X_1} + C\, U_{X_2}\, C^T \qquad (8)$$
In eq. (8), σ² H_{X_1} is the covariance matrix of the vector of input quantities X_1 and U_{X_2} is the covariance matrix of the vector of input quantities X_2. The first component is evaluated by the type A method and the second by the type B method. If σ² is not known, it is not possible to use relations (3) or (5) directly for the estimation of the output quantities. This raises the question of whether there are cases in which the uncertainties and covariances evaluated by the type B method (the second component on the right-hand side of eq. (8)) do not influence the estimates of the output quantities; if such a case occurs, the relation presented in eq. (5) can be used for the estimation of the output quantities. One case when this is possible will be presented [2, 5].

Suppose that in the model (2) the vector of input quantities W has the form given in eq. (7), where X_1 and X_2 are independent, and that the matrix C satisfies
$$M(C) \subset M(A) \qquad (9)$$
(M(C) is the vector space generated by the columns of the matrix C), so that a matrix Q exists that satisfies
$$C = AQ \qquad (10)$$
Then, according to the literature [2, 5], the BLUE of Y is
$$\hat{Y} = (A^T H_{X_1}^{-1} A)^{-1} A^T H_{X_1}^{-1} W \qquad (11)$$
and its covariance matrix is
$$U_{\hat{Y}} = \sigma^2 (A^T H_{X_1}^{-1} A)^{-1} + Q\, U_{X_2}\, Q^T = U_{\hat{Y}A} + U_{\hat{Y}B} \qquad (12)$$
If σ² is unknown, it is estimated as (see also e.g. [2, 5])
$$\hat{\sigma}^2 = (W - A\hat{Y})^T H_{X_1}^{-1} (W - A\hat{Y}) / (r - p) \qquad (13)$$

This means that, for the estimation of the output quantities, we do not have to know the uncertainties and covariances of the input quantities X_2 (evaluated by the type B method); it is enough to know the relations among the uncertainties and covariances of the input quantities X_1 (evaluated by the type A method), that is, the matrix H_{X_1}. The covariance matrix σ² H_{X_1} here represents the uncertainties and covariances specified by the type A method characterising the actual measurement. The matrix H_{X_1} is known; most often the parameter σ² is unknown and we estimate it from the data of the actual measurement.
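This property is easy to verify numerically. In the following Python sketch (ours; the data are synthetic) the matrix C is constructed as C = AQ, and the estimate (11), which uses only H_X1, coincides with the full BLUE (3) based on U_W from eq. (8):

    import numpy as np

    rng = np.random.default_rng(1)
    r, p = 8, 3
    A = rng.normal(size=(r, p))
    Q = rng.normal(size=(p, 2))
    C = A @ Q                                # eq. (10), so condition (9) holds
    W = rng.normal(size=r)

    H_X1 = np.eye(r)                         # type A structure
    U_X2 = np.diag([0.5, 2.0])               # type B covariance of X2
    U_W = 1.3 * H_X1 + C @ U_X2 @ C.T        # eq. (8) with sigma^2 = 1.3

    def blue(A, W, U):
        M = np.linalg.inv(A.T @ np.linalg.inv(U) @ A)
        return M @ A.T @ np.linalg.inv(U) @ W

    print(np.allclose(blue(A, W, U_W), blue(A, W, H_X1)))   # True

The covariance matrix of the estimate, however, does depend on the type B component, as eq. (12) shows.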

3. The polynomial model of measurement

In a similar way we can treat the polynomial model of measurement,
$$W_i = Y_0 + Y_1 t_i + Y_2 t_i^2 + \cdots + Y_p t_i^p \qquad (14)$$
where
$$W_i = X_i + X_{2,0} + X_{2,1} t_i + X_{2,2} t_i^2 + \cdots + X_{2,p} t_i^p \qquad (15)$$


$$X_1 + C X_2 = AY \qquad (16)$$
In these equations the X_i represent the directly measured input quantities, whose estimates are the measured values, and the X_{2,j} are input quantities with estimates known from sources other than the actual measurement. The estimates of the parameters Y_j will not depend on the uncertainties of the input quantities X_{2,j}. As in the case of the linear model, some of the X_{2,j} do not have to be included in the model.
In matrix form the polynomial model of measurement is given by (16), where
$$X_1 = (X_1, X_2, \ldots, X_n)^T \qquad (17)$$
$$X_2 = (X_{2,0}, X_{2,1}, \ldots, X_{2,p})^T \qquad (18)$$
$$C = \begin{pmatrix} -1 & -t_1 & \cdots & -t_1^p \\ -1 & -t_2 & \cdots & -t_2^p \\ \vdots & \vdots & & \vdots \\ -1 & -t_n & \cdots & -t_n^p \end{pmatrix}, \qquad A = \begin{pmatrix} 1 & t_1 & \cdots & t_1^p \\ 1 & t_2 & \cdots & t_2^p \\ \vdots & \vdots & & \vdots \\ 1 & t_n & \cdots & t_n^p \end{pmatrix} \qquad (19)$$
$$Y = (Y_0, Y_1, \ldots, Y_p)^T \qquad (20)$$
For this model a matrix Q that satisfies the condition of eq. (10) exists and has the form
$$Q = -I \qquad (21)$$
where I = diag(1, 1, ..., 1) is the identity matrix.
The covariance matrix of the vector W = X_1 + C X_2 has the form given in eq. (8). The estimates of the unknown parameters are determined according to relation (3), and their uncertainties and the covariances among them from relation (12). Special cases also occur in this polynomial model. The most common cases are
$$X_2 = (X_{2,0}, X_{2,1})^T, \qquad C = \begin{pmatrix} -1 & -t_1 \\ -1 & -t_2 \\ \vdots & \vdots \\ -1 & -t_n \end{pmatrix} \qquad (22)$$
where
$$Q = \begin{pmatrix} -I_{2\times 2} \\ 0_{(p-1)\times 2} \end{pmatrix} \qquad (23)$$
resp. X_2 = X_{2,0}, where
$$C = (-1, -1, \ldots, -1)^T \qquad (24)$$


$$Q = (-1, 0, \ldots, 0)^T \qquad (25)$$
resp. X_2 = X_{2,1}, where
$$C = (-t_1, -t_2, \ldots, -t_n)^T \qquad (26)$$
$$Q = (0, -1, 0, \ldots, 0)^T \qquad (27)$$
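A short numerical check of this structure (ours; the t_i are arbitrary illustrative values) confirms that the matrices Q given above satisfy C = AQ:

    import numpy as np

    t = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
    p = 3
    A = np.vander(t, N=p + 1, increasing=True)           # rows (1, t_i, ..., t_i^p)
    print(np.allclose(-A, A @ (-np.eye(p + 1))))         # full case, Q = -I, eq. (21)

    C2 = -np.column_stack([np.ones_like(t), t])          # eq. (22)
    Q2 = np.vstack([-np.eye(2), np.zeros((p - 1, 2))])   # eq. (23)
    print(np.allclose(C2, A @ Q2))                       # True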

4. Conclusion

This paper presents a method that takes into account the influence of the measurement uncertainties on the estimation of regression model parameters. The mathematical analysis, together with examples, has shown that in cases when the uncertainty of the measuring gauge and the uncertainty from other influences are comparable with the variance of the measured values, it is not possible to omit these influences. Along with this finding, an example is given of one special situation where the precision of the gauge and other influences of such a nature do not have to be considered when estimating the model parameters, but only when the uncertainties of these estimates are determined.

Acknowledgement

The authors would like to thank the Slovak University of Technology (STU) Bratislava and the Slovak Institute of Metrology (SMU) for their support, and acknowledge the grant agency APVV, grant No. APVV-0096-10, VEGA grants No. 1/0120/12 and 1/0085/12, and KEGA grant No. 005 STU-4/2012.

References

1. Guide to the Expression of Uncertainty in Measurement, BIPM/IEC/ISO/OIML, Geneva, Switzerland, 1995.
2. Palenčár R., Wimmer G., Halaj M., Measurement Science Review, 2, Section 1, 2002, 9-20, www.measurement.sk.
3. Rao C. R., Linear Statistical Inference and Its Applications, 2nd edition, John Wiley & Sons, New York, 1993.
4. Bich W., Cox M. G., Harris P. M., Metrologia, 30, 1994, 495-502.
5. Palenčár R., Wimmer G., Journal on Electrical Engineering, 45, 1994, 230-235.



Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. – )

OPTIMISING MEASUREMENT PROCESSES USING

AUTOMATED PLANNING

S. PARKINSON∗, A. CRAMPTON AND A. P. LONGSTAFF

Department of Informatics, University of Huddersfield, HD1 3DH, UK
∗E-mail: [email protected]

Many commercial measurement processes are planned with little or no regard to optimality in terms of measurement time and the estimated uncertainty of measurement. This can be because the complexity of the planning problem makes optimality in a dynamic environment difficult to achieve, even with expert knowledge. This paper presents a novel approach to measurement planning using automated planning. Detailed information regarding the modelling and encoding of measurement processes is provided. The benefits of this approach are demonstrated through the results of applying it to machine tool calibration. A discussion is then formed around the development of future tools to further validate the approach.

Keywords: Automated Planning; Measurement Uncertainty; Optimisation

1. Introduction

Measurement processes often contain multiple measurements, which have

time and order dependencies when estimating and minimising the uncer-

tainty of measurement. The scheduling of interrelated measurements can

have a significant impact on the estimated uncertainty of measurement, especially in dynamic environments such as those with non-stable environmental temperature. Expert knowledge is required to produce both valid and optimal measurement plans. This can present problems for industrialists who want to implement or improve their measurement processes.

The Guide to the expression of Uncertainty in Measurement (GUM)1

establishes general rules for evaluating and expressing the uncertainty of

measurement with the intention of being applicable to a broad range of

measurements. An expert will use the GUM to plan and optimise a sequence

of measurements by making informed decisions. However, it is often the case

that planning a sequence of measurements against a continuously changing


environment (e.g. changing temperature) can be cumbersome and little or

no regard is taken to optimality, resulting in a higher uncertainty than that

which is achievable.

The GUM, and other theoretical guides, contain detailed, advanced procedures for estimating the uncertainty of measurement. However, this level of detail can often make them difficult to implement at an industrial level. The Procedure

for Uncertainty MAnagement (PUMA)2 provides an iterative method for

reducing the estimated uncertainty per measurement. However, this ap-

proach does not consider scheduling a sequence of measurements to reduce

the overall estimated uncertainty of measurement.

This paper proposes an approach that utilises Automated Planning

(AP) to encode and deliberate over the measurements, optimising their

order by anticipating their expected outcome. The theory of automated

planning and the implementation of knowledge are discussed in the follow-

ing two sections. This leads to a demonstration of how expert knowledge can

be encoded and subsequently used to produce optimal measurement plans.

A discussion is then formed around the authors’ ambition to develop and

extend tools which will enable users to easily optimise their measurement

processes.

2. Automated Planning

Planning is an abstract, explicit deliberation process that chooses and or-

ganises actions by anticipating their expected outcome. Automated plan-

ning is a branch of Artificial Intelligence (AI) that studies this deliberation

process computationally and aims to provide tools that can be used to

solve real-world planning problems.3 To explain the basic concepts of AP, a

conceptual model is provided, based on the state-transition system. A state-transition system is a triple Σ = (S, A, →), where S = {s1, s2, ...} is a finite set of states, A = {a1, a2, ...} is a finite set of actions, and → : S × A → 2^S is a state-transition function. A classical planning problem for a restricted state-transition system is defined as a triple P = (Σ, s0, g), where s0 is the initial state and g is the set of goal states. A solution to P is a sequence of actions (a1, a2, ..., ak) corresponding to a sequence of state transitions (s1, s2, ..., sk) such that s1 = →(s0, a1), ..., sk = →(sk−1, ak), and sk is a goal state. The state s1 is achieved by applying action a1 in state s0, and so on.
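As a concrete, deliberately tiny illustration of this model — our own Python sketch, not taken from the paper, with invented predicate and action names — states are represented as sets of propositions and a plan is found by breadth-first search over the state-transition system:

    from collections import deque

    # each action: (name, preconditions, add effects, delete effects)
    actions = [
        ("setup",   {"at-machine"},       {"instrument-ready"},   set()),
        ("measure", {"instrument-ready"}, {"measured"},           set()),
        ("remove",  {"measured"},         {"instrument-removed"}, {"instrument-ready"}),
    ]

    def successors(state):
        for name, pre, add, dele in actions:
            if pre <= state:                       # preconditions satisfied
                yield name, frozenset((state - dele) | add)

    def plan(s0, goal):
        # breadth-first search for a sequence of actions reaching a goal state
        frontier, seen = deque([(frozenset(s0), [])]), set()
        while frontier:
            state, path = frontier.popleft()
            if goal <= state:
                return path
            if state not in seen:
                seen.add(state)
                frontier.extend((nxt, path + [name]) for name, nxt in successors(state))
        return None

    print(plan({"at-machine"}, {"instrument-removed"}))   # ['setup', 'measure', 'remove']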

In AI planning, when planning for a complex problem, it can become

practically impossible to represent explicitly the entire state space; since the

number of states can potentially increase exponentially. In classical plan-


ning, the state of the world is represented by a set of first-order predicates

which are set true or false by an action a. An action has three elements:

(1) a parameter list that is used for identifying the action, (2) a list of

preconditions precond(a) that must be satisfied before the action can be

executed, and (3) an effect effects(a) that contains a list of predicates that

represent the resulting state from the execution of this action.

A full conceptual model for planning is shown in Figure 1 (Modified

from3). The model has three parts: (1) a planner, (2) a controller, and

(3) the state-transition system. The planner generates a plan (sequence

of actions) for a specified problem model by using the domain model. A

domain model is an abstraction of the real-world domain which is sufficient

to be used in conjunction with a planner to automatically solve the planning

problem specified in the problem model. A planning problem consists of an

initial and goal state composed of a set of first-order predicates. A controller

observes the current state of the system from the state-transition function

and chooses an action that is generated by the planner based on the domain

model. The state-transition system progresses according to the actions that

it receives from the controller.

[Figure: three blocks — Planner, Controller and State-transition System — connected by the Domain Model, the Planning Problem, Plans, Actions, Observations and the Executing Status.]

Fig. 1. A conceptual model of AI planning.

3. Knowledge Engineering

Knowledge Engineering (KE), for automated planning, is the process that

deals with acquisition, formulation, validation and maintenance of planning

knowledge; where the key product is the domain model. To enable a wide

use of planning applications, the Planning Domain Definition Language

(PDDL)4 is used to encode the domain. A PDDL problem is comprised of


two parts. Firstly, a domain that consists of predicates and actions, and

secondly the problem definition, consisting of the initial and goal state.

Domain engineers will typically either develop domain models using (1)

a traditional text editor, or (2) a Graphical User Interface (GUI). Tradition-

ally, all domain models had to be developed in a text editor (e.g. Notepad),

but recent improvements in GUI knowledge engineering tools are helping

to make knowledge engineering a more efficient process. One of the more

prominent tools available for domain engineering is itSIMPLE5 which pro-

vides an environment that enables knowledge engineers to model a planning

domain using the Unified Modelling Language (UML) Standards.6 This is

significant as it opens up the potential use of the tool to most software engi-

neers with knowledge of UML, but not necessarily AP. itSIMPLE, just like

many other tools, focuses on the initial phase of a disciplined life-cycle, fa-

cilitating the transition of requirements to formal specification. The design

life-cycle goes from gathering requirements and modelling them in UML,

right through to the generation of a PDDL model which can be used with

state-of-the-art planning tools. The current state-of-the-art in knowledge

engineering for AP is sufficient for initial development of real-world appli-

cations. However, as the domain advances, features that are not supported

by knowledge engineering tools are required. Therefore, for the application

presented in this paper, a traditional text editor is used.

3.1. Implementation of Measurement Knowledge

In this section, the knowledge required to automatically construct mea-

surement plans, as well as the methods of encoding it, are presented and

discussed.

3.1.1. Temporal Information

Within metrology, especially industrial metrology, the financial cost of a measurement process can be related to the time it takes to complete: the direct labour cost, and any lost revenue due to 'opportunity cost' when measuring a production asset. For modelling purposes, each individual measure-

ment can be broken up into individual temporal components. For example,

when performing a measurement, equipment will need to be set-up, the

measurement will be performed, and then the equipment will be removed.

To enable planning with time, the durative action model of PDDL2.17 is

used.

In PDDL, a durative action encoding includes a numeric fluent which


represents a non-binary resource and can be used in the duration, pre-

conditions and effects of an action. The effects use operators (scale up,

scale down, increase, decrease and assign) to modify the value of

the fluent by using the binary functions (+, -, /, *). Comparisons between fluents are performed by using comparators (≤, <, =, >, ≥) between

functions or fluents and real numbers. Durations are expressed either as a

predetermined value, or dynamically using binary functions. For example,

the following PDDL syntax for the set-up action :duration(= ?duration

(setup-time ?in ?mv)) specifies that a chosen action will take a quantity

of time specified in the initial state for when the instrument ?in is chosen

to take measurement ?mv.

3.1.2. Uncertainty Contributors

Factors that contribute to the total uncertainty of measurement are also

encoded as numeric fluents which are specified in the initial state and ex-

pressed through action effects. For example, Equation 1 can be easily en-

coded in PDDL, as provided in Figure 2. Equation 1 shows how to esti-

mate the uncertainty contribution when using a laser interferometer, where U_calibration is provided on the device's calibration certificate, L is the length in metres and k is the coverage factor:
$$u_{\text{device laser}} = \frac{U_{\text{calibration}} \times L}{k} \qquad (1)$$

(* (/ (* (u-calib ?in) (length-to-measure ?ax ?er)) (k-value ?in))
   (/ (* (u-calib ?in) (length-to-measure ?ax ?er)) (k-value ?in)))

Fig. 2. Example PDDL uncertainty effect.

3.1.3. Dynamics

Throughout the measurement process, dynamics such as the continuous

change in temperature, affect the estimated uncertainty. In order to op-

timise the measurement process effectively, it is important that such dy-

namics are encoded into the model. In PDDL, dynamics in the measure-

ment process are encoded either using PDDL2.1 or PDDL+. In PDDL2.1,

dynamics can be represented as effects of continuous change throughout


an action’s duration. For example, (increase (temperature ?t) (* #t

(rate-of-change ?r))) describes how the environment temperature, ?t,

increases continuously, as a function of the rate-of-change of ?r. In PDDL+,

numerics of continuous, non-linear change can be implemented using the

stop, start process model exhibited through processes and effects.8

However, there is currently no planning tool capable of supporting the

full PDDL+ syntax. The solution is to discretise the continuous change

into a set of durative actions with time-dependent continuous effects. How-

ever, this requires pre-processing of non-linear resources to discretise them

based on a discretisation threshold. If the chosen value is too low, then too many actions could be generated, rendering the planner unable to solve the problem. If the value is too high, the discretisation may no longer be representative and lead to the generation of suboptimal plans.
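The discretisation idea can be sketched as follows (our illustration; the profile and threshold are invented): a continuous temperature profile is split into segments with a constant rate of change, each of which would become one durative action:

    import numpy as np

    def discretise(profile, t_end, threshold, dt=1.0):
        # start a new segment whenever the temperature has moved by more than
        # the threshold since the current segment began
        segments, t0 = [], 0.0
        for t in np.arange(dt, t_end + dt, dt):
            if abs(profile(t) - profile(t0)) > threshold:
                segments.append((t0, t, (profile(t) - profile(t0)) / (t - t0)))
                t0 = t
        if t0 < t_end:
            segments.append((t0, t_end, (profile(t_end) - profile(t0)) / (t_end - t0)))
        return segments        # (start, end, rate-of-change) per durative action

    segments = discretise(lambda t: 20.0 + 0.5 * np.sqrt(t), t_end=60.0, threshold=0.5)

A small threshold produces many segments (and actions), a large one few; this is the trade-off described above.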

3.1.4. Optimisation

Based on ISO recommendations, the root of the sum of squares is used

to calculate the combined uncertainty.9 The square root function is not a

PDDL operator. However, considering that the square root is a monotonic

function, minimising the sum of the squares is as optimal as minimising

the square root of the sum of the squares. In the PDDL model this can

be achieved by combining the individual, squared contributions for each

measurement, and then adding this to an accumulative uncertainty value,

U . This optimisation metric is encoded to minimise the global uncertainty

value. For applications where time to measure is cost sensitive, it is possible

to minimise the total measurement time, T . It is also possible to perform

multi-optimisation by calculating the arithmetic mean of both T and U .

However, this could be expanded to the weighted optimisation

αU + (1− α)T , 0 ≤ α ≤ 1.
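As a plain illustration of this metric outside PDDL (our sketch, with invented numbers), the accumulated squared uncertainty U and the total time T are combined through the weight α:

    def combined_metric(u_contributions, total_time, alpha=0.5):
        # minimising the sum of squares is equivalent to minimising its square root
        U = sum(u ** 2 for u in u_contributions)
        return alpha * U + (1.0 - alpha) * total_time

    score = combined_metric([2e-6, 5e-6, 1e-6], total_time=680.0, alpha=0.5)

With α = 1 only the uncertainty is optimised, with α = 0 only the measurement time.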

4. Example: Machine Tool Calibration

In current work, AP has been successfully applied to the calibration of pre-

cision machine tools where multiple measurements are performed to deter-

mine the machine’s accuracy and repeatability.10,11 This has been achieved

by encoding the planning problem in the PDDL2.1 planning language7

alongside a state-of-the-art planning tool (LPG-td12).

Empirical analysis has shown that it is possible to achieve a reduction

in machine tool downtime greater than 10 % (12:30 to 11:18 (hh:mm))

over expert generated plans. In addition, the estimated uncertainty due


to the schedule of the plan can be reduced by 59 % (48 µm to 20 µm).13

Further experiments have investigated the trade-off when optimising cali-

bration plans for both time and the uncertainty of measurement. We have

demonstrated that it is possible to optimise functions of both metrics reach-

ing a compromise that is on average only 5 % worse than the best-known

solution for each individual metric.14 Additional experiments, using a High

Performance Computing architecture, show that on average, optimality of

calibration plans can be further improved by 4 %. This gain was due to

the planner having access to more powerful hardware and so could explore

more plans in a reduced time. However, the 4 % improvement demonstrates

that in most cases it is sufficient to use a standard PC architecture.

5. Conclusion and Future Challenges

The successful application in the machine tool calibration domain has high-

lighted the possibility to extend and generalise the technology for a wide

variety of measurement problems. The diversity of the problems means that

there is no single planner that can be used for all. For example, some plan-

ning problems are rich in constraints restricting the measurements, whereas

some are rich in temporal and numeric information. In the Automated Planning (AP) community, planners often perform better on domains of particular complexities and characteristics. Therefore, as well as having the facility to de-

termine the best planning tool for each problem, it is also important to

study the development of AP tools that can be applied to a wide range

of different problems. This will significantly improve their ability to solve

complex, real-world problems.

A main aim of this paper is to increase interest in applying automated

planning to metrological processes. It is the authors’ intention to apply the

proposed technology to a broad range of applications, through which both

the theoretical approach of using automated planning as well as the pro-

duced measurement plans can be validated. However, for this to be possible

suitable tools and guidelines need to be made available for metrologists to

use. The future challenge will be in developing tools that are useful for

as broad a range of metrology planning problems as possible without the

requirement of specific AP knowledge.

References

1. GUM, Guide to the expression of Uncertainty in Measurement (International Standards Organisation, 1995).
2. ISO 14253-2, Geometrical product specifications (GPS) – Inspection by measurement of workpieces and measuring equipment – Part 2: Guidance for the estimation of uncertainty in GPS measurement, in calibration of measuring equipment and in product verification (International Standards Organisation, 2013).
3. M. Ghallab, D. Nau and P. Traverso, Automated planning: theory & practice (Morgan Kaufmann, 2004).
4. D. McDermott, M. Ghallab, A. Howe, C. Knoblock, A. Ram, M. Veloso, D. Weld and D. Wilkins, PDDL – the planning domain definition language (1998).
5. T. S. Vaquero, V. Romero, F. Tonidandel and J. R. Silva, itSIMPLE 2.0: An integrated tool for designing planning domains, in Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS), 2007.
6. UML OMG, 2.0 superstructure specification, OMG, Needham (2004).
7. M. Fox and D. Long, PDDL2.1: An extension to PDDL for expressing temporal planning domains, Journal of Artificial Intelligence Research (JAIR) 20, 61 (2003).
8. M. Fox and D. Long, Modelling mixed discrete-continuous domains for planning, Journal of Artificial Intelligence Research 27, 235 (2006).
9. ISO 230, Part 9: Estimation of measurement uncertainty for machine tool tests according to series ISO 230, basic equations (International Standards Organisation, 2005).
10. S. Parkinson, A. Longstaff, A. Crampton and P. Gregory, The application of automated planning to machine tool calibration, in ICAPS, 2012.
11. S. Parkinson, A. Longstaff, S. Fletcher, A. Crampton and P. Gregory, Automatic planning for machine tool calibration: A case study, Expert Systems with Applications 39, 11367 (2012).
12. A. Gerevini and I. Serina, LPG: A planner based on local search for planning graphs with action costs, in Proceedings of the Artificial Intelligence Planning Systems (AIPS), 2002.
13. S. Parkinson, A. Longstaff and S. Fletcher, Automated planning to minimise uncertainty of machine tool calibration, Engineering Applications of Artificial Intelligence, 63 (2014).
14. S. Parkinson, A. Longstaff, A. Crampton and P. Gregory, Automated planning for multi-objective machine tool calibration: Optimising makespan and measurement uncertainty, in ICAPS, 2014.


9610-36:Advanced Mathematical and Computational Tools

Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 293–300)

SOFTWARE TOOL FOR CONVERSION OF HISTORICAL

TEMPERATURE SCALES

P. PAVLASEK†, S. ĎURIŠ AND R. PALENČÁR

Institute of automation, measurement and applied informatics,

Slovak University of Technology,

Bratislava, 812 31, Slovakia †E-mail: [email protected]

www.sjf.stuba.sk/

P. PAVLASEK

Temperature group, Slovak Institute of Metrology, Karloveská 63

Bratislava, 842 55, Slovakia

E-mail: [email protected]

A. MERLONE

Temperature group, Istituto Nazionale di Ricerca Metrologica, Strada delle Cacce 91, Torino, 10135, Italy

E-mail: [email protected]

Measurements of temperature play an important role in a wide variety of applications in which the quality, effectiveness and safety of processes are affected. Therefore the determination of temperature values is of great interest. Different sources affect the temperature sensors, which results in changes of the indicated temperature. These influences are mostly bound to the sensor and to its typical limitations. In this paper we concentrate on the mathematical aspect of temperature determination.
Throughout the history of temperature measurement, different temperature scales were introduced and used to ensure the consistency and confidence of the measured temperature. This paper deals with the previous mathematical conversions between these temperature scales and enhances them into a new mathematical model. Furthermore, the implementation of this mathematical model into a reliable and effective conversion software tool is presented. The usage of this conversion tool is focused on climatological applications, to directly convert large files of historical records to the current international temperature scale. This fast, reliable and easy to handle conversion tool will help in the harmonization process of historical surface temperature data.

Keywords: Historical temperature scale, Conversion, Software tool


1. Introduction

To better understand the problem of historical temperature data conversion we must first understand the basic principles on which international temperature scales work. Temperature scales are built on a series of temperature points characterized by a typical material phase transition or phase equilibrium (fixed points). This method has remained to this day and is still adopted in the current international temperature scale (ITS). Although the principles of the scales have remained, the evolution of sensors and measuring techniques and the usage of different fixed points have caused the introduction of different international temperature scales throughout history. This variation of temperature scales has created difficulties in the comparison between older temperature measurements and data measured according to the current ITS-90, the International Temperature Scale of 1990. The problem is particularly important when historical temperature data are processed to obtain a trend of temperature drift. This method, which compares surface temperature records, is commonly used to obtain the trend of climate change. This type of investigation has several sources of inhomogeneity, caused by the usage of various sensor types with different accuracy, stability and sensitivity, and by the different calibration of these sensors. This work deals with the differences in the calibrations performed on these sensors due to the different international temperature scales.
Several publications have dealt with the conversion between historical temperature scales [1, 7]. Although these publications have created mathematical conversions between individual scales, a direct conversion from the older ITS to the ITS-90 was not created. As the temperature scales are characterized by individual fixed temperature points, a connection had to be made. In practice this is done by so-called interpolation instruments and by mathematical functions for the individual sub-ranges. These mathematical functions represent only the interpolation between certain fixed temperature points; they do not represent the whole scale. As previously mentioned, conversion between temperature scales has been carried out, but only in the form of conversion coefficients for broad temperature ranges, and no conversion function was created. This can create conversion errors, mainly at the boundaries of the different ranges. The creation of a conversion function brings a faster and more effective conversion of historical temperature data, together with interpolation of the conversion over any needed range.

2. Conversion function creation

As can be seen in publications [1-7], the ITS scales have gone through a long evolution leading to today's ITS-90. Through this process many changes in both the mathematical and the practical realizations have been made. These variations have created differences between the temperature scales and the numerical


values. These differences can be found in the supplementary information for

ITS-90 [1].

The differences between the historical temperature scales are considerable. They were expressed numerically in publication [1]; unfortunately, no conversion function was created for direct conversion to the current ITS-90. This section shows the process of creating such a function, together with the function itself.

Calculation model for the linear theoretical model:
$$Y = a_1 + a_2 X \qquad (1)$$
The corresponding stochastic model has the form
$$E(Y_1) = a_1 + a_2 X_1, \quad E(Y_2) = a_1 + a_2 X_2, \quad \ldots, \quad E(Y_n) = a_1 + a_2 X_n \qquad (2)$$
where n ≥ 2, the Y_i represent the input variables, the a_j are unknown parameters and the X_i are the known values of temperature. In the case where a polynomial approach of degree p is used, the theoretical model is
$$Y = a_1 + a_2 X + \cdots + a_{p+1} X^p \qquad (3)$$
and the stochastic model for this approach is
$$E(Y_i) = a_1 + a_2 X_i + \cdots + a_{p+1} X_i^p, \quad i = 1, \ldots, n. \qquad (4)$$
Estimation of the calculation model parameters:
The measurement model has a linear stochastic form; we use the same designation for a physical (measured) quantity and for the random quantity that represents its possible observation.


$$(Y,\, A a,\, U_Y) \qquad (5)$$
Y is the random vector of input quantities with mean value E(Y) = Aa and covariance matrix D(Y) = U_Y. A is a known (n × 2) matrix and a is the vector of unknown parameters.
If the rank of the matrix A is 2, with 2 < n, and U_Y is a known positive definite matrix, the Best Linear Unbiased Estimator (BLUE) of a in the model (5) is [9, 10]
$$\hat{a} = (A^T U_Y^{-1} A)^{-1} A^T U_Y^{-1} Y \qquad (6)$$
The covariance matrix of this estimate is then
$$U_{\hat{a}} = (A^T U_Y^{-1} A)^{-1} \qquad (7)$$
For the case when U_Y = σ² H_Y, the following equations are used:
$$\hat{a} = (A^T H_Y^{-1} A)^{-1} A^T H_Y^{-1} Y \qquad (8)$$
$$U_{\hat{a}} = \sigma^2 (A^T H_Y^{-1} A)^{-1} \qquad (9)$$
In the case when σ² is unknown, we estimate it from the following equation:
$$\hat{\sigma}^2 = (Y - A\hat{a})^T H_Y^{-1} (Y - A\hat{a}) / (n - 2) \qquad (10)$$
The realisation errors of the temperature scales caused by fixed point realisation and by resistance measurement errors are neglected in the historical temperature scale conversion. This means that the temperature measurements according to both ITS can be regarded as equally precise and uncorrelated, so the calculation error is determined only by the inadequacy of the calculation model used. For the estimation of the calculation model parameters, eq. (8) is used with H_Y = I (I is the identity matrix); the uncertainties of the parameter estimates and the covariances among them are obtained from eq. (9) under the same condition, and σ is estimated from (10). The calculation model follows eq. (3) and has the form
$$Y = a_1 + a_2 X + \cdots + a_{p+1} X^p \qquad (11)$$

where X is the value of temperature determined by one ITS (for instance ITS-68) and Y is the temperature determined by another ITS (for example ITS-90). The uncertainty of the conversion can be calculated according to eq. (12).


$$u^2(Y) = \sum_{i=1}^{p+1} X^{2(i-1)} u^2(a_i) + 2 \sum_{i=1}^{p+1} \sum_{j>i}^{p+1} X^{i-1} X^{j-1} u(a_i, a_j) \qquad (12)$$
The elements u²(a_i) and u(a_i, a_j) of eq. (12) are the elements of the covariance matrix (9) of the estimated model parameters. If we intend to make a reverse conversion, it is necessary to interchange the axes and re-estimate the parameters of the calculation model. For the case of a linear model of the form shown in (1), the calculation model is
$$Y = a_1 + a_2 X \qquad (13)$$
and the uncertainty of the calculation is then
$$u^2(Y) = u^2(a_1) + X^2 u^2(a_2) + 2X\, u(a_1, a_2) \qquad (14)$$
For the reversed model the following equation is valid:
$$X = -\frac{a_1}{a_2} + \frac{1}{a_2} Y = b_1 + b_2 Y \qquad (15)$$
The uncertainty of this conversion is
$$u^2(X) = u^2(b_1) + Y^2 u^2(b_2) + 2Y\, u(b_1, b_2) \qquad (16)$$
where
$$u^2(b_1) = \frac{1}{a_2^2} u^2(a_1) + \frac{a_1^2}{a_2^4} u^2(a_2) - 2\frac{a_1}{a_2^3} u(a_1, a_2) \qquad (17)$$
$$u^2(b_2) = \frac{1}{a_2^4} u^2(a_2) \qquad (18)$$
$$u(b_1, b_2) = -\frac{a_1}{a_2^4} u^2(a_2) + \frac{1}{a_2^3} u(a_1, a_2) \qquad (19)$$
By using the mathematical computation described above we are able to create a polynomial function that enables us to directly convert historical temperature data to the current ITS of 1990. To convert a historical temperature value to the ITS-90, a correction is calculated by the polynomial function (21) and inserted into eq. (20) to obtain the converted temperature. The coefficients needed to calculate the correction can be


found in Table 1. The conversion is valid for the temperature range from -50 °C

to 50 °C and the uncertainty caused by the mathematical conversion is within ±

0.0005 °C (k = 2).

$$t_{90} = t_{\mathrm{Historical}} + t_C \qquad (20)$$
$$t_C = a_0 + a_1 t_{\mathrm{Historical}} + a_2 t_{\mathrm{Historical}}^2 + a_3 t_{\mathrm{Historical}}^3 + a_4 t_{\mathrm{Historical}}^4 \qquad (21)$$

Table 1. Correction polynomial coefficients.

         Prior to 1968        1968 to 1989
  a0      4.17095 x 10-4      -8.31908 x 10-5
  a1     -6.85871 x 10-4      -2.07761 x 10-4
  a2      4.84595 x 10-6      -3.7404 x 10-7
  a3      3.44392 x 10-8      -5.78149 x 10-9
  a4     -7.24503 x 10-10     -1.09541 x 10-10
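The conversion of eqs. (20)–(21) with the coefficients of Table 1 can be implemented in a few lines; the following Python sketch (ours; function and variable names are illustrative) reproduces the corrections shown in Table 2:

    # coefficients a0..a4 of the correction polynomial (21), from Table 1
    COEFFS = {
        "prior-1968": [4.17095e-4, -6.85871e-4, 4.84595e-6, 3.44392e-8, -7.24503e-10],
        "1968-1989":  [-8.31908e-5, -2.07761e-4, -3.7404e-7, -5.78149e-9, -1.09541e-10],
    }

    def to_its90(t_historical, period):
        # t90 = t_historical + t_C, eq. (20), valid from -50 C to +50 C
        t_c = sum(a_i * t_historical ** i for i, a_i in enumerate(COEFFS[period]))
        return t_historical + t_c

    print(to_its90(25.0, "prior-1968") - 25.0)   # about -0.0134 C, cf. Table 2 (-13.5 mK)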

3. Conversion program

The conversion program that was designed is a practical implementation of the mathematical conversion of temperature scales described above. It provides a quick and reliable conversion of temperature data from the ITS-27 to the current ITS-90. It was designed for use in meteorology and, more specifically, to help determine historical climate change. The cooperation between meteorology and metrology has brought much useful information to both scientific fields; furthermore, the program is compatible with the databases used in meteorology.
The program is written in Visual Basic and is compatible with most operating systems. It is simple yet effective in design. The program offers mass conversion, using a predefined text file structure, which can be used for vast data clusters; a single value conversion is also possible. The year of the data's origin is selected manually. Typical examples of the corrections applied by the program can be seen in Table 2.


Table 2. Correction in mK to be applied to historical data, for typical temperatures.

  Year        at 25 °C   at 15 °C   at 5 °C   at -5 °C   at -15 °C   at -25 °C
  1927-1966    -13.5      -8.8       -3.0       4.0        11.8        19.8
  1976-1989     -5.6      -3.2       -1.1       0.9         2.9         4.9

As the program was designed for climate purposes, the temperature range we concentrated on was from -50 °C to +50 °C. The working program can be seen on the following website [11].

4. Conclusion

As shown above, there are indisputable differences between the historical temperature scales. These differences have multiple sources, such as changes in the fixed points, changes of the temperature values assigned to them, new mathematical calculations, etc. This makes it difficult to compare historical temperature values directly and thus to determine possible climate change. In this paper we present a mathematical function for the direct conversion of historical temperature data from the year 1927 to the present. This function has proven to be an effective and reliable tool for the conversion of such data, with an uncertainty within ±0.0005 °C over the temperature range from -50 °C to +50 °C. The conversion function was implemented in a computer program which can be used by meteorologists and climatologists and can be a useful tool for the analysis of climate change.

Acknowledgement

This work is part of the European Metrology Research Program (EMRP) joint

research project ENV07 “METEOMET”. The EMRP is jointly funded by the

EMRP participating countries within EURAMET and the European Union.

The authors would also like to thank the Slovak University of Technology (STU) Bratislava and the Slovak Institute of Metrology (SMU) for their support, and furthermore the grant agency APVV, grant No. APVV-0090-10, VEGA grant No. 1/0120/12, and KEGA grant No. 005 STU-4/2012.

References

1. BIPM, Supplementary Information for the International Temperature Scale of 1990, BIPM, Paris, 1990.
2. CGPM: Comptes Rendus des Séances de la Septième Conférence Générale des Poids et Mesures, 94-99, 1927.
3. CGPM: Comptes Rendus des Séances de la Neuvième Conférence Générale des Poids et Mesures, 89-100, 1948.
4. CGPM: Comptes Rendus des Séances de la Onzième Conférence Générale des Poids et Mesures, 124-133, 1960.
5. CGPM: Comptes Rendus des Séances de la Treizième Conférence Générale des Poids et Mesures, A1-A24, 1967-1968.
6. CGPM: Comptes Rendus des Séances de la Quinzième Conférence Générale des Poids et Mesures, A1-A21, 1975.
7. H. Preston-Thomas, The International Temperature Scale of 1990 (ITS-90), Metrologia, Vol. 27, pp. 3-10, 1990.
8. H. Preston-Thomas, The International Practical Temperature Scale of 1968, Amended Edition of 1975, Metrologia, Vol. 12, pp. 7-17, 1976.
9. Rao, C. R., Linear Statistical Inference and Its Applications, 2nd edition, John Wiley & Sons, New York, 1993.
10. Bich, W., Cox, M. G., Harris, P. M., Metrologia, 30, 1994, 495-502.
11. http://surfacetemperatures.blogspot.no/2014/06/understanding-effects-of-changes-in.html, 2014.


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 301–309)

FEW MEASUREMENTS, NON-NORMALITY: A STATEMENT

ON THE EXPANDED UNCERTAINTY

J. PETRY, B. DE BOECK, M. DOBRE

SMD, FPS Economy, Boulevard du Roi Albert II 16

1000 Brussels, Belgium

A. PERUZZI

VSL, Dutch Metrology Institute, P.O. Box 654

2600 AR Delft, The Netherlands

In the current GUM approach the law of propagation of uncertainty (LPU) is not used consistently for type A input quantities. This may cause understatement of measurement uncertainty in the case of few repeated measurements. The Welch-Satterthwaite formula only partially solves this issue and is hardly utilized in practice. Performing Monte Carlo simulations (GUM-S1) might be a solution, but many metrologists prefer the use of simple rules of thumb in daily practice. Therefore we suggest a more consistent treatment of type A input quantities using a full Bayesian framework. Our method remains an approximation, but we believe it to be more correct and straightforward than the current GUM approach. It also allows for the treatment of type A input quantities arising from non-normal repeated measurements.

1. Introduction

Due to the lack of measurement data, metrologists are often facing the problem

of evaluating the measurement uncertainty with too few data to draw statistically

consistent conclusions. Moreover, the distribution underlying the data is often

considered to be normal while it might not be.

The reference text GUM [1] recommends the use of a coverage factor in

order to determine an expanded uncertainty with a certain coverage probability.

Type A input quantities are assumed to have a normal underlying distribution

and the model for the measurand is linearized. Another important condition for

application of the GUM uncertainty framework is that the distribution for the

measurand can adequately be approximated by a Gaussian distribution, which

can be ensured by applicability of the Central Limit Theorem (CLT). If there are

only few or dominating input quantities, however, the distribution for the

measurand might deviate strongly from normality. In this case the only

appropriate option is to use the method proposed in Supplement 1 (S1) of the

GUM which gives as output the probability density function for the measurand


and as a consequence the coverage probability of an interval, once the

probability distribution functions for the input quantities are known [2].

In the present publication, we propose an improved uncertainty calculation

method, close to the actual practice of metrologists, for few and non-normal

measurements. Note that in this method a linearized measurement model is

adopted and the output quantity is assumed to be normally distributed.

The coverage factor is defined in the GUM as the factor to multiply the

combined standard uncertainty, uc(x), in order to have the expanded uncertainty

U, on the basis of a level of confidence. The combined standard uncertainty is the square root of the sum of the squared individual components. Assuming that

the number of individual components is large and that there is not a dominant

component either of type A based on few observations or of type B based on a

rectangular distribution, then the validity of the CLT justifies calculating

coverage factors as the appropriate normal quantiles (GUM G.2.3, [3]). A

criticism of this method is the inconsistent treatment of the propagation of the

uncertainty of a type A component. The current GUM propagates the sample

mean variance for repeated measurements whereas it should propagate the

variance of the appropriate t-distribution, if it exists. Using Student’s t-quantiles

derived from the Welch-Satterthwaite formula (GUM G.4) only partially

compensates for this, especially in the case of few repeated measurements. A

proposed revision of the GUM addresses this issue in a consistent way [4]. Also,

a type A input quantity arising from non-normal measurements is not provided for in the GUM approach.
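As a side note (our own sketch, not part of the paper), the Welch-Satterthwaite route of GUM G.4 mentioned above computes an effective number of degrees of freedom and a t-based coverage factor; the budget values used below are hypothetical.

from scipy.stats import t

def welch_satterthwaite(contributions, dofs):
    """contributions: u_i = |c_i| * u(x_i); dofs: degrees of freedom nu_i."""
    uc2 = sum(u * u for u in contributions)
    return uc2 ** 2 / sum(u ** 4 / nu for u, nu in zip(contributions, dofs))

nu_eff = welch_satterthwaite([0.11, 0.05, 0.07], [3, 50, 50])   # hypothetical budget
k = t.ppf(0.975, nu_eff)        # coverage factor for approximately 95 % coverage
print(f"nu_eff = {nu_eff:.1f}, k = {k:.2f}")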

For the sake of consistency, we chose to work in a Bayesian framework.

Component coverage factors for each type A component of the uncertainty

budget are introduced. For a large number of measurements (> 30) the

difference of our suggestion compared to the classical approach is negligible,

even in the case of non-normality.

2. Theoretical inset

In section 4.2 of the GUM, type A evaluation of standard uncertainty is justified from a frequentist point of view. The measurements x_i (i = 1, ..., n) are assumed to be observations from a normal distribution N(μ, σ²), where the parameter μ is the expected value of the arithmetic mean of the measurements. The estimate x̄ is an unbiased estimate of the quantity μ. As usual, s² is used to denote the sample variance with denominator n − 1. The square of the standard uncertainty, s²/n, is an unbiased estimate of Var[x̄] = σ²/n. The measurements are thus the random variables and the true value μ of the quantity a fixed but unknown parameter. In section 4.3 of the GUM, however, type B evaluation of standard uncertainty is developed from a Bayesian point of view. The true value of the quantity is considered to be associated with a random variable μ which has a prior distribution, which is informative if and only if prior knowledge of the quantity


is available. Measurement data can possibly be used to update the distribution of

the random variable μ. The expected value and variance of this random variable

are then the typical location and dispersion parameters that are used to

summarize distributional information about the quantity represented by μ.

Also, in section 5 of the GUM the input quantities are considered to be random

variables which are combined to obtain a random variable as the output quantity.

Combining the quantities in this way thus follows the Bayesian approach by

assuming the quantities to be associated with random variables. The distribution

of a random variable associated with a quantity represents the knowledge we

have about its true value. To obtain a more coherent framework we therefore

choose a full Bayesian approach, but we will also keep in mind the classical

GUM type A approach for reasons of comparison. We will now treat more

elaborately the input quantities and the output quantity separately.

2.1. Input quantity

2.1.1. Type A

Given are measurements x_i (i = 1, ..., n) of the input quantity, which are assumed to be drawn independently and identically from a distribution which is symmetric around its expected value μ. We further assume the prior distributions to be the uninformative Jeffreys priors [5], for μ as well as for possible nuisance parameters. Bayes' theorem is then used to obtain the posterior distribution of μ, as in [5]. We denote x̄ = (1/n) Σ x_i and s² = (1/(n−1)) Σ (x_i − x̄)², and define the t-statistics

T = (μ − x̄)/(s/√n)   and   T* = (μ − x̄)/SD[μ],

where SD[.] denotes the standard deviation. Given the posterior distribution of μ, the distributions of T and T* are determined and can be shown to be symmetric around 0. However, in general, they cannot be obtained in analytical form and simulations are needed. Nevertheless, it is well known that T ~ t_{n−1}, i.e. T has a t-distribution with n − 1 degrees of freedom, if the x_i ~ N(μ, σ²) and n > 1.

In the Bayesian framework a possible choice is to summarize the input quantity represented by μ by the location parameter E[μ] = x̄ and the dispersion parameter SD[μ]. We thus define the standard uncertainty as u*(μ) = SD[μ]. In the classical type A GUM approach the location estimate is also given by x̄, but the standard uncertainty is defined as u(μ) = s/√n. Starting from those two separate tracks, parallel definitions can be given for a component coverage factor.


Starting from u(μ) = s/√n, the component coverage factor k_c is defined as the quantile q_n of the distribution of T satisfying P(T > q_n) = (1 − p)/2. By definition we thus have

P(−k_c ≤ T ≤ k_c) = p,

or

P(x̄ − k_c·s/√n ≤ μ ≤ x̄ + k_c·s/√n) = p.

Starting from u*(μ) = SD[μ], the component coverage factor k_c* is defined as the corresponding quantile of the distribution of T*. By definition we thus have

P(−k_c* ≤ T* ≤ k_c*) = p,

or

P(x̄ − k_c*·SD[μ] ≤ μ ≤ x̄ + k_c*·SD[μ]) = p.

Thus the component coverage factor k_c is by definition the quantile q_n that satisfies P(T > q_n) = (1 − p)/2. If the underlying distribution is normal, the statistic T is known to be t-distributed; in this case the t-quantiles are denoted t_n.

A quantile q_n of the distribution of T depends, besides on the coverage probability p, also on the number of measurements n and on the underlying distribution of the x_i. Few measurements and/or a non-normal distribution may cause the dispersion of T to increase, which then leads to an increase of k_c; this is not reflected in the classical GUM approach.

We have performed a simulation study based on Monte Carlo simulation to assess the behaviour of k_c for various underlying distributions, symmetric around their mean μ, and various n, since analytical determination of k_c is generally unfeasible. The Monte Carlo simulation consists of the following algorithm:

repeat M times (for l = 1, ..., M):
    generate a random sample of n observations x_{l,i} (i = 1, ..., n) from the underlying distribution
    compute the arithmetic mean x̄_l and the sample variance s_l²
    calculate the t-statistic t_l = (μ − x̄_l)/(s_l/√n)

The set of t-statistics t_l is a discrete representation of the distribution of T: the quantile of this discrete representation that is exceeded with probability (1 − p)/2 can be found easily by sorting the t_l, and it approximates k_c (or q_n).

The structure of the t-statistic ensures that its distribution is location-scale invariant, and consequently the quantiles are as well. The results of the algorithm are therefore independent of the location and scale parameters of the underlying distribution.
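For illustration, a minimal Monte Carlo sketch of this algorithm follows (our own code, not from the paper; the choice of a rectangular underlying distribution, n = 5, p = 0.95 and M = 10^6 is an assumption). The printed value can be compared with the corresponding entry of Table 1 below (about 3.2 for the rectangular distribution with n = 5).

import numpy as np

rng = np.random.default_rng(1)
M, n, p = 1_000_000, 5, 0.95
mu = 0.5                                      # mean of the uniform distribution on [0, 1]
x = rng.uniform(0.0, 1.0, size=(M, n))        # M samples of n observations each
xbar = x.mean(axis=1)
s = x.std(axis=1, ddof=1)                     # sample standard deviation (denominator n - 1)
t = (mu - xbar) / (s / np.sqrt(n))            # the t-statistics t_l
k_c = np.quantile(t, 1.0 - (1.0 - p) / 2.0)   # quantile exceeded with probability (1 - p)/2
print(round(k_c, 2))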

The component coverage factors have been calculated for the normal,

rectangular, triangular, Laplace and standard trapezoidal underlying

distributions, which are unimodal and symmetric around their mean. With


standard trapezoidal we mean all location-scale transformations of a trapezoidal

distribution with min=0, mode1=1/3, mode2=2/3 and max=1.

The results of the simulation study are reported in Table 1 for a coverage probability of 95 %. The excess kurtosis κ of the distributions is also shown. As expected, in the case of a normal underlying distribution, the algorithm gives rise to the same values as the Student's t-quantiles for ν = n − 1 degrees of freedom. For small n, the component coverage factors of the other underlying distributions differ strongly from the t-quantiles. The k_c are larger than the t-quantiles, except for the Laplace distribution. For large n we see that k_c ≈ 2, but the k_c are not equal to 2 for all underlying distributions and all numbers of measurements n. Neglecting this would often lead to an understatement of uncertainty. These results are in agreement with values reported in [6].

 n    norm.     rect.      tri.       trap.      lapl.     asin
      κ = 0     κ = −1.2   κ = −0.6   κ = −0.8   κ = 3     κ = −1.5
 2    12.7      18.9       13.2       15.2       10.0      37.0
 3     4.3       5.8        4.5        5.0        3.5       8.5
 4     3.2       3.9        3.3        3.6        2.7       4.8
 5     2.8       3.2        2.8        3.0        2.5       3.5
10     2.3       2.3        2.3        2.3        2.2       2.3
30     2.0       2.1        2.1        2.1        2.0       2.1

Table 1: Component coverage factors k_c for coverage probability p = 95 %.

From these results, we can state that in the class of unimodal symmetric

distributions, the rectangular distribution is the least favorable underlying

distribution in the sense that it leads to maximal coverage factors for a fixed

coverage probability p and a fixed sample size n. If the underlying distribution

of measurements may be assumed unimodal and symmetric, using the

component coverage factors of the rectangular distribution is thus a conservative

choice.

We extended our scope to underlying distributions which are not unimodal,

but we will not discuss this here because their behavior is less clear. Hence, in

the case of a suspected non-unimodal distribution, we recommend simulating the appropriate k_c values. We only show the component coverage factors for the arcsine distribution in Table 1, which is an (atypical) bimodal symmetric distribution with very small kurtosis. The k_c are found to be even larger than for

the rectangular distribution.

2.1.2. Type B

Given is a priori knowledge about this input quantity. Using the maximum

entropy principle or Bayes' theorem, a probability density function (pdf) is

derived for the random variable μ associated with this input quantity. A possible


choice is to summarize the input quantity (represented by) μ by the location parameter E[μ] and the dispersion parameter SD[μ]. The standard uncertainty is thus defined as u(μ) = SD[μ].

2.2. Output quantity

Let us assume an explicit measurement model Y = f(X_1, ..., X_N) = f(X) of independent input quantities. Again, the same summary measures E[Y] and u_c(y) = SD[Y] characterize the location and dispersion of the measurand; the latter is the combined standard uncertainty u_c(y). The linearized model, obtained by a first-order Taylor expansion, is given by

Y = f(x) + Σ_{i=1}^{N} c_i (X_i − x_i),

where the notation x_i = E[X_i] and c_i = ∂f/∂X_i evaluated at x is used. In the current GUM the law of propagation of uncertainty (LPU), given by

u_c²(y) = Var[Y] = Σ_{i=1}^{N} c_i² Var[X_i],

is inconsistently reduced to Σ_i c_i² u²(X_i). After all, u(X_i) = s_i/√n_i ≠ SD[X_i] = u*(X_i) for type A input quantities, which shows that the LPU is correctly restated as

u_c²(y) = Σ_{i∈A} c_i² u*²(X_i) + Σ_{i∈B} c_i² u²(X_i),

where A denotes the index set of type A input quantities and B the index set of type B input quantities. From the definitions of the component coverage factors k_c and k_c* one can easily derive the equation k_{c,i}·s_i/√n_i = k*_{c,i}·SD[X_i] for the type A input quantity X_i. As discussed, few measurements and non-normal underlying distributions often lead to an increased dispersion of X_i and thus to an increased k_{c,i} or SD[X_i]. The classical GUM approach does not take this into account, but our revised approach does. The ratio k_{c,i}/k*_{c,i} captures the difference between the two approaches for every type A component.

Eventually, we obtain the combined standard uncertainty:

u_c(y) = [ Σ_{i∈A} c_i² (k_{c,i}/k*_{c,i})² s_i²/n_i + Σ_{i∈B} c_i² u²(X_i) ]^{1/2}.

For a practical implementation that yields the (expanded) combined standard uncertainty, we still need to determine some parameters, namely k and the ratio k_c/k_c* for every input quantity X_i. We suggest using k = 2 because it is common practice in metrology and it can usually be justified by the CLT. If the distribution of T (or X_i) is known, we can directly calculate the ratio k_c/k_c*. For a


normal underlying distribution, we have T ~ t_{n−1} and thus k_c/k_c* = SD[T] = √((n−1)/(n−3)). This is a strong indication that at least 4 repeated measurements (from a normal distribution) should be made to obtain a finite variance for a type A input quantity.

If the distribution of T is unknown, we propose to use k_c* = 2 and the k_c values detailed in Table 1. The choice k_c* = 2 has proven to be a good approximation in various simulations, because T* has a standardized symmetric distribution. Altogether this leads to the expression:

U = 2 u_c(y) = [ Σ_{i∈A} c_i² k_{c,i}² s_i²/n_i + 4 Σ_{i∈B} c_i² u²(X_i) ]^{1/2}.

We note from Table 1 that k_c ≈ 2 if n is large enough, making the revised approach coincide with the classical one. We realize that possible pitfalls remain for this approach (e.g. the linearization, although seldom a problem in practice; the determination of k if the CLT is not applicable), but it is an improvement of the practical LPU method.
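The expression above can be evaluated mechanically once the uncertainty budget is known. A minimal sketch follows (our own illustration, not part of the paper; the budget values are hypothetical), using k_c values of Table 1 for the type A components and k_c* = 2.

from math import sqrt

# Type A components: (sensitivity c_i, sample std s_i, n_i, k_ci from Table 1);
# type B components: (sensitivity c_i, standard uncertainty u_i).
# All numbers are hypothetical, for illustration only.
type_a = [(4.0, 0.054, 4, 3.2)]          # k_c from Table 1 (normal, n = 4)
type_b = [(1.0, 0.05), (1.0, 0.07)]

U = sqrt(sum((c * k * s) ** 2 / n for c, s, n, k in type_a)
         + 4.0 * sum((c * u) ** 2 for c, u in type_b))
print(f"U = {U:.3f}")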

3. A practical example

We consider the calibration of standard platinum resistance thermometers in

fixed point cells of mercury (Hg). Under assigned pressure conditions, the fixed

point temperature is the phase transition temperature of the substance (melting,

freezing or triple point). Checking the reproducibility of one fixed point cell is a

long and expensive experimental process.

All input quantities are type B except reproducibility which is type A. Two

situations are considered: a routine calibration based on n = 4 reproducibility measurements, and n = 10 historical measurements obtained over a 10-year

period by the VSL laboratory. In both cases it is plausible and assumed that the

reproducibility measurements come from a normal distribution.

The routine calibration data leads to s/√n = 0.027 mK for the reproducibility with sensitivity coefficient c = 4, and thus we get a type A contribution of c·u(μ) = 0.11 mK in the current GUM approach. Since n = 4 we have k_c/k_c* = √3 ≈ 1.73, and c·u*(μ) = 0.19 mK in the revised approach. We obtained the following results:

• GUM: u_c(y) = 0.14 mK and k = 2 → U = 0.27 mK
• Revision: u_c(y) = 0.21 mK and k = 2 → U = 0.41 mK
• GUM-S1: u_c(y) = 0.21 mK and k = 1.93 → U = 0.40 mK

There is a significant difference between the classical GUM approach and our

revision because the type A input quantity has a dominant uncertainty that is

based on only 4 measurements. The revised result is very close to the result of

Supplement 1.


The historical data leads to s/√n = 0.079 mK for the reproducibility with sensitivity coefficient c = 4, and thus we get a type A contribution of c·u(μ) = 0.31 mK in the current GUM approach. Since n = 10 we have k_c/k_c* = √(9/7) ≈ 1.13, and c·u*(μ) = 0.36 mK in the revised approach. We obtained the following results:

• GUM: u_c(y) = 0.33 mK and k = 2 → U = 0.65 mK
• Revision: u_c(y) = 0.37 mK and k = 2 → U = 0.74 mK
• GUM-S1: u_c(y) = 0.37 mK and k = 1.88 → U = 0.69 mK

The difference between the classical GUM approach and our proposed revision is much smaller here. The type A input quantity still has a dominant uncertainty, but the correction is relatively small because n is larger. The revised combined standard uncertainty is approximately equal to that of S1.

Although the few measurements of the short term repetition inflate the

uncertainty w.r.t. the standard deviation of the mean, the long term

measurements still entail a relatively larger uncertainty.

4. Conclusions

We have proposed a revision of the classical LPU approach described in the

GUM. This revision is based on a full Bayesian framework, and is in line with

Supplement 1 of the GUM. For few or non-normal measurements of a type A

input quantity, the revised approach entails a considerable difference in

uncertainty compared to the classical approach. This is experienced in

thermometry practice of fixed point cell calibration: our proposed calculation

gives uncertainty values close to the one obtained from the Supplement 1 of the

GUM, although the formulation is not too different from the classical GUM

approach.

References

1. BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP and OIML 2008 Guide to

the Expression of Uncertainty in Measurement—GUM 1995 with minor

corrections JCGM 100:2008.

2. BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP and OIML 2008 Evaluation

of Measurement Data—Supplement 1 to the ‘Guide to the Expression of

Uncertainty in Measurement’—Propagation of distributions using a Monte

Carlo method JCGM 101:2008.

3. Wilks, S.S., “Mathematical statistics”, ed. Wiley (New York), 257-

258:1962.

4. Bich, W., “Revision of the ‘Guide to the Expression of Uncertainty in

Measurement’. Why and how”, Metrologia 51(4): 2014.


5. Gelman A., Carlin J.B., Stern H.S. and Rubin D.B., “Bayesian data

analysis”, Chapman and Hall (London), 2004

6. Boon, P.C., Hendriks H., Klaassen C.A.J. and Muijlwijk R., "To t or not to t?", 33rd European Study Group with Industry, 1-10:1998.


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. – )

QUANTIFYING UNCERTAINTY IN

ACCELEROMETER SENSITIVITY STUDIES

A. L. RUKHIN*, D. J. EVANS

National Institute of Standards and Technology, USA

*E-mail: [email protected]

Key Comparisons of accelerometer sensitivity measurements are performed to compare the sensitivity of linear accelerometers. The key comparison reference value (KCRV) for charge sensitivity as a function of frequency and the accompanying uncertainty are the principal objectives of these studies. In a mixed effects model, several methods for the evaluation of the vector KCRV and its uncertainty are suggested. A practical remark is that iterated log-scaled frequencies could lead to a better data description than the frequencies themselves.

Keywords: Growth curves, heterogeneous linear models, key comparisons, restricted maximum likelihood, uncertainty evaluation

1. Introduction

International and Regional Key Comparisons of measurement of accelerometer sensitivity are periodically organized to compare data obtained by participating National Metrology Institutes (NMIs) on the sensitivity of linear accelerometers. These measurements use sinusoidal excitation to determine sensitivity as a function of frequency [6]. Two such comparisons are discussed here: one conducted by the International Committee for Weights and Measures (CIPM) Consultative Committee for Acoustics, Ultrasound and Vibration (CCAUV.V-K1) [7]; and one conducted by the Inter-American Metrology System Working Group 9 (SIM.AUV.V-K1) [2]. The CCAUV.V-K1 comparison included 12 NMIs and covered the frequency range from 40 Hz to 5 kHz. The vibration acceleration SIM.AUV.V-K1 comparison included 5 NMIs and covered the frequency range from 50 Hz to 5 kHz. The measurand in each of


comparisons was charge sensitivity (electrical charge per unit acceleration) as a function of frequency, with each laboratory determining sensitivity at the same frequencies but generally at different amplitudes. The Physikalisch-Technische Bundesanstalt (PTB) served as

the Pilot Laboratory for the CCAUV.V-K1 comparison, and the

National Institute of Standards and Technology (NIST) served as

the Pilot Laboratory for the SIM.AUV.V-K1 comparison.

2. Model: Heterogeneous growth curves

To describe the mathematical setting we suggest the following

mixed effects linear model,

Y_i = B_i(θ + ℓ_i) + e_i,    i = 1, ..., p = 12,    (1)

Here the data vector Y_i is composed of r_i repeats in the i-th lab (out of twelve) made at n = 14 frequencies. Thus Y_i is formed by the measurements of the i-th lab and has dimension n_i = n r_i. In (1), B_i = [B, ..., B]^T, where the given n × q design matrix B is stacked r_i times. The q-dimensional parameter θ is of interest. The errors e_i are assumed to be independent and normally distributed with the variance depending only on the study, e_i ~ N_{n_i}(0, σ_i² I), where I denotes the identity matrix. The independent vectors ℓ_i of the same dimension q as θ represent a random between-study effect with zero mean and some (unknown) covariance matrix V. As Figure 1 shows noticeably heterogeneous, noisy data from these twelve institutes, one can expect a consequential matrix V.

The usual motivation of (1) is provided by a two-stage modeling, with the first stage introducing all parameters and variables for fixed ℓ_i, and the second stage specifying the distribution of these effects. The statistical goal is to estimate the vector parameter θ and to provide a standard error of the estimate so that a confidence region for this parameter, or for a function thereof, can be obtained. In our model the object of interest is the vector of KCRV values, which is a linear function of the matrix B and the parameter θ: KCRV = Bθ. This model extends the balanced scenario (n_i ≡ n, σ_i² ≡ σ²) and, when q = p, falls into the class of classical growth curve models [8].



Fig. 1. The CCAUV data for 12 institutes.

3. Restricted likelihood, DerSimonian-Laird and

Hedges procedures

The algorithms of fitting general linear mixed models by using

classical techniques of maximum likelihood estimation are imple-

mented in the R statistical language. In the variance components

setting including (1) there are general results on maximum like-

lihood and restricted maximum likelihood estimation (REML) as

well as the algorithms for their calculation [9]. However in some cases

(false or singular) convergence problems appear. For this reason we

look here at the following simpler procedures which are both more

specific and easier to evaluate.

When there is just one unknown parameter (q = 1), DerSimonian and Laird [1] suggested an estimation method for θ. This


procedure became widely used in biostatistics, especially in anal-

ysis of multicenter clinical trials. Its popularity is due mainly to

the fact that this is a simple non-iterative procedure admitting an

approximate formula for the variance of the resulting θ-estimator.

Here we review an extension of this and similar moments equation

type estimators to the multiparameter situation in the random-

effects model (1). The method consists of estimating the covari-

ance matrix V first, and then determining the matrix weights for

the weighted means statistic to estimate θ itself.

The following estimator of the unknown covariance matrix V is derived in [11]:

V_DL = [ Σ_i r_i/σ̂_i² − (Σ_i r_i/σ̂_i²)^{-1} Σ_i r_i²/σ̂_i⁴ ]^{-1}
       × [ Σ_i (r_i/σ̂_i²)(X_i − X_0)(X_i − X_0)^T − (p − 1)(B^T B)^{-1} ]        (2)

     = (B^T B)^{-1} B^T { [ Σ_i (r_i/σ̂_i²)(Ȳ_i − Ȳ_0)(Ȳ_i − Ȳ_0)^T − (p − 1) I ]
       / [ Σ_i r_i/σ̂_i² − (Σ_i r_i/σ̂_i²)^{-1} Σ_i r_i²/σ̂_i⁴ ] } B (B^T B)^{-1}.

Here, for i = 1, ..., p, X_i is the ordinary least squares estimator of θ based only on the results of the i-th laboratory, X_i = (B^T B)^{-1} B^T Ȳ_i, with Ȳ_i denoting the average of the r_i repeats over the n frequencies. Similarly,

σ̂_i² = Σ_{k=1}^{r_i} (Y_k^{(i)} − B(B^T B)^{-1} B^T Ȳ_i)^T (Y_k^{(i)} − B(B^T B)^{-1} B^T Ȳ_i) / (n_i − q)

is the (unbiased) error variance estimator for the i-th laboratory. Both X_i and σ̂_i² can be found from a linear fit of the laboratory i data.

Since the covariance matrix V must be nonnegative-definite,

we take the positive part of the (symmetric) matrix in (2) which

may not be positive definite. Thus V_DL has the same spectral decomposition as (2), with new eigenvalues being the positive parts of the


eigenvalues of the matrix there. It is the direct extension of the

DerSimonian and Laird formula.

The Graybill-Deal estimator,

X_0 = Σ_i ω_i^0 X_i = (B^T B)^{-1} B^T Σ_i ω_i^0 Ȳ_i = (B^T B)^{-1} B^T Ȳ_0,

with

ω_i^0 = (Σ_k r_k/σ̂_k²)^{-1} r_i/σ̂_i²,   i = 1, ..., p,   Σ_i ω_i^0 = 1,

is used in (2) as a centering vector, and Ȳ_0 = B X_0.

When the role of X_0 is played by the unweighted average X̄ = Σ_i X_i / p, a procedure extending the Hedges estimator [4] of the heterogeneity matrix V obtains,

V_H = Σ_i (X_i − X̄)(X_i − X̄)^T / (p − 1) − (1/p) Σ_i (σ̂_i²/r_i) (B^T B)^{-1}        (3)

    = (B^T B)^{-1} B^T [ Σ_i (Ȳ_i − Ȳ)(Ȳ_i − Ȳ)^T / (p − 1) − (1/p) Σ_i (σ̂_i²/r_i) I ] B (B^T B)^{-1}.

Here Ȳ = B X̄ and, as in (2), we take the positive part V_H^+ in this formula.

Both the DerSimonian-Laird estimator X_DL and the Hedges estimator X_H of the parameter θ (as well as the Graybill-Deal estimator X_0 or the mean X̄ of the least squares estimates) are examples of weighted means statistics X̃ whose matrix weights have the form

W_i = [ V̂ + r_i^{-1} σ̂_i² (B^T B)^{-1} ]^{-1},   i = 1, ..., p,

with the non-negative definite matrix V̂ estimating V. These estimators,

X̃ = (Σ_k W_k)^{-1} Σ_i W_i X_i = Σ_i ω_i X_i,

employ normalized matrix weights, ω_i = (Σ_k W_k)^{-1} W_i, Σ_i ω_i = I.


The traditional plug-in estimator of the covariance matrix Var(X̃), given by

Var(X̃) = (Σ_k W_k)^{-1},        (4)

is known in many instances to underestimate the true variance, so we give here an alternative statistic for Var(X̃). Namely, to determine the matrix Var(X̃) = Σ_i ω_i Var(X_i) ω_i^T for the (unbiased) weighted means statistic X̃, with fixed normalized matrix weights, one can use the almost unbiased estimate of Var(X_i) whose origin is in a more general setting of linear models [3]. This estimator V_i is derived by solving the following equation,

V_i = (X_i − X̃)(X_i − X̃)^T + (1/2)(ω_i V_i + V_i ω_i^T).

Provided that none of the matrix weights ω_i dominates, an explicit solution of this equation in terms of a matrix series is

V_i = Σ_{k=0}^{∞} (ω_i/2)^k (I − ω_i/2)^{-k-1} (X_i − X̃)(X_i − X̃)^T (I − ω_i^T/2)^{-k-1} (ω_i^T/2)^k.

Since Var(X_i) ≥ σ_i² (B^T B)^{-1}/r_i, and an unbiased estimate σ̂_i² of σ_i² is available, we use

Var(X_i) = max[ V_i, σ̂_i² (B^T B)^{-1}/r_i ] = σ̂_i² (B^T B)^{-1}/r_i + [ V_i − σ̂_i² (B^T B)^{-1}/r_i ]^+

as the final estimate of Var(X_i). The resulting formula for the Var(X̃) estimator has the form

Var(X̃) = Σ_i ω_i Var(X_i) ω_i^T.

This estimator leads to an approximate (1 − α) confidence ellipsoid for θ,

(X̃ − θ)^T [Var(X̃)]^{-1} (X̃ − θ) ≤ q F_α(q, p − q).

Here F_α(q, p − q) denotes the critical point of the F-distribution with the given degrees of freedom. The suggestion to use the F-distribution parallels the case q = 1 [3].


A confidence interval for a linear function of θ, say a^T θ, follows from Var(a^T X̃) = a^T Var(X̃) a. An approximate (1 − α) confidence interval is defined by

a^T X̃ ± t_{α/2}(p − q) √( a^T Var(X̃) a ),

with t_{α/2}(p − q) a critical point of the t-distribution with p − q degrees of freedom. Simultaneous confidence intervals for several linear functions can be derived similarly. In particular, a confidence set for the KCRV can be obtained.
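A compact numerical sketch of this machinery is given below (our own illustration, not the authors' code); it computes the multivariate DerSimonian-Laird estimate (2) with its positive part, the matrix-weighted mean and the plug-in covariance (4) for synthetic per-laboratory summaries. All numerical inputs other than the repeat counts r and the log(log(f)) design are made up.

import numpy as np

def positive_part(S):
    """Spectral decomposition of a symmetric matrix, negative eigenvalues set to zero."""
    w, U = np.linalg.eigh(S)
    return (U * np.maximum(w, 0.0)) @ U.T

def dl_weighted_mean(B, X, sigma2, r):
    """B: n x q design; X: p x q per-lab OLS estimates; sigma2, r: length-p arrays."""
    p, q = X.shape
    BtB_inv = np.linalg.inv(B.T @ B)
    w = r / sigma2                                   # scalar weights r_i / sigma_i^2
    X0 = (w[:, None] * X).sum(axis=0) / w.sum()      # Graybill-Deal centering vector
    S = sum(wi * np.outer(xi - X0, xi - X0) for wi, xi in zip(w, X))
    denom = w.sum() - (w ** 2).sum() / w.sum()
    V_DL = positive_part((S - (p - 1) * BtB_inv) / denom)     # eq. (2), positive part
    W = [np.linalg.inv(V_DL + BtB_inv * s2 / ri) for s2, ri in zip(sigma2, r)]
    W_sum_inv = np.linalg.inv(sum(W))                # plug-in covariance, eq. (4)
    theta_hat = W_sum_inv @ sum(Wi @ xi for Wi, xi in zip(W, X))
    return theta_hat, V_DL, W_sum_inv

# Hypothetical use with the (1, log(log(f))) design at the 14 common frequencies
# and synthetic per-laboratory OLS summaries:
f = np.array([40, 50, 63, 80, 100, 125, 160, 200, 250, 315, 400, 500, 630, 800.0])
B = np.column_stack([np.ones_like(f), np.log(np.log(f))])
r = np.array([9, 5, 5, 5, 5, 5, 5, 3, 4, 5, 5, 5], dtype=float)
rng = np.random.default_rng(0)
X = np.array([0.1283, 0.00043]) + rng.normal(scale=[2e-4, 5e-5], size=(12, 2))
sigma2 = np.full(12, 1e-8)
theta_hat, V_DL, var_plugin = dl_weighted_mean(B, X, sigma2, r)
print(theta_hat)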

3.1. Numerical Results

The model (1) was used with the matrices B_i = (B, ..., B)^T, B_i^T B_i = r_i B^T B, r = [9, 5, 5, 5, 5, 5, 5, 3, 4, 5, 5, 5]. The 14 × 2 design matrix B is formed by the rows (1, log(log(f))) at the common

frequencies f as given below. Figure 2 suggests that sensitivity can

be viewed as an approximately linear function of the iterated log-

arithm of frequencies.

The nlme algorithm for REML does not converge for non-transformed frequencies or when a quadratic regression is fitted. An

alternative approach is to use double exponential transformation of

sensitivities. It is less attractive as this transform clearly cannot

give good answers outside of the considered frequency range.

Here are the estimates of the parameters, in pC/(m/s²) units:

        REML        X_DL        X_H         X_0         X̄
θ0      0.128316    0.128312    0.128317    0.128493    0.128326
θ1      0.000430    0.000433    0.000430    0.000337    0.000425



Fig. 2. The plot of log(log(frequency)) and sensitivity.

The REML-based KCRV values along with the lower (LKCRV) and the upper (UKCRV) 95 % confidence bounds are:

frequency    KCRV        LKCRV       UKCRV

40 0.128878 0.128228 0.129527

50 0.128903 0.128241 0.129565

63 0.128927 0.128253 0.129602

80 0.128952 0.128266 0.129638

100 0.128973 0.128277 0.129669

125 0.128993 0.128287 0.129700

160 0.129015 0.128298 0.129732

200 0.129033 0.128307 0.129759

250 0.129051 0.128316 0.129786

315 0.129069 0.128325 0.129812

400 0.129086 0.128334 0.129838

500 0.129102 0.128342 0.129862

630 0.129118 0.128350 0.129885

800 0.129133 0.128358 0.129908

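As a quick consistency check (our own sketch, not part of the paper), the KCRV column above can be reproduced from the REML estimates of θ given earlier and the design rows (1, log(log(f))).

import numpy as np
f = np.array([40, 50, 63, 80, 100, 125, 160, 200, 250, 315, 400, 500, 630, 800.0])
theta = np.array([0.128316, 0.000430])          # REML estimates quoted above
kcrv = theta[0] + theta[1] * np.log(np.log(f))  # KCRV = B * theta
print(np.round(kcrv, 6))   # agrees with the KCRV column above to within rounding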


The KCRV values determined from X_DL and X_H are very similar to those of REML. The almost unbiased estimator of Var(X_DL) turns out to be, in 10^-7 pC/(m/s²) units,

(  0.356979   -0.193912
  -0.193912    0.116735 ),

while Var(X̂) based on REML is

(  3.964788   -0.028366
  -0.028366    1.23667 ).

Traditional estimators (4) of Var(X_DL) and of Var(X_H) are considerably smaller.

The between-laboratory variance Ξ estimators are

Ξ̂ = (  3.96479    -2.125727
      -2.125727    1.236667 )

(found from the R-language VarCorr function),

Ξ̂_DL = (  2.248314   -1.220877        Ξ̂_H = (  4.044016   -2.145403
         -1.220877    0.739065 ),             -2.145403    1.231740 ).


References

1. R. DerSimonian and N. Laird. Control. Clin. Trials 7 177

(1986).

2. D. J. Evans, A. Hornikova, S. Leigh, A. L. Rukhin and W.

Strawderman. Metrologia 46 Technical Supplement 09002 (2009).

3. D. A. Follmann and M. A. Proshan. Biometrics 55 732 (1999).

4. L. V. Hedges. Psy. Bull. 93 388 (1983).

5. S. A. Horn, R. A. Horn and D. B. Duncan. Journ. Amer.

Statist. Assoc. 70 380 (1975).

6. ISO. ISO Guide 16063-11: Methods for the calibration of

vibration and shock transducers. International Organization for

Standardization (ISO) Geneva, Switzerland, (2011).

7. H. J. von Martens, C. Elster, A. Link, A. Taebner and W.

Wabinski. CCAUV.V-K1 Final report, Metrologia 40 Technical

Supplement 09001 (2003).

8. J. X. Pan and K. T. Fang. Growth Curve Models and Statis-

tical Diagnostics, New York, Springer, (2002).

9. J. Pinheiro and D. Bates. Mixed Effects Models in S and

S-Plus, New York, Springer, (2000).

10. A. L. Rukhin. Journ. Multivar. Anal., 98 435 (2007).

11. A. L. Rukhin. Journ. Statist. Plann. Inf. 141 3181 (2011).


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 320–329)

METROLOGICAL ASPECTS OF STOPPING ITERATIVE

PROCEDURES IN INVERSE PROBLEMS FOR STATIC-MODE

MEASUREMENTS

SEMENOV K. K.

Department of measurement informational technologies, St. Petersburg State

Polytechnical University, 29, Polytechnicheskaya str., St. Petersburg, 195251, Russia

If the measurand is converted nonlinearly by a gage, then, to determine its value from the gage output signal, we need to solve a nonlinear equation, most probably using one of the iterative techniques (for example, the Newton method). Each measuring conversion is always performed with errors, which are transformed by the equation solver and cause solution uncertainty. Unlike the usual stopping rules for iterative processes, which ignore the inaccuracy of the initial data, a new rule that takes this into account is proposed in this paper: the decision to stop should be based on a comparison between the size of the next iteration step and the transformed error bounds of the solution estimate at the previous iteration step.

To control or manage a complex technical object (unit under test), we need to perform measurements of some quantities x_1, x_2, ..., x_n that describe the object state. For this purpose, multichannel measuring systems are used that may contain nonlinear converters. If we take into account that sensors are usually selective to only one measurand, then we can describe the nonlinear conversion in the i-th measuring channel by the equation y_i = f_i(x_i) + Δy_i, where y_i is the i-th channel output, Δy_i is its absolute error, i = 1, 2, ..., n. If the unit under test is functioning normally, then some measurands from the set x_1, x_2, ..., x_n can depend on each other. In the general case, we can describe these dependencies by implicit functions g_j(x_1, x_2, ..., x_n) = Δg_j, where j = 1, 2, ..., m. These functions can be known exactly (Δg_j = 0), or determined experimentally with errors Δg_j (then we use interval or random variables to describe this uncertainty), or determined by experts (then we use fuzzy variables). As a result, we deal with a system S of (n + m) equations, some of which are probably nonlinear. Generally, the left side of this system can be expressed using a vector function S(x), and the right side with the block vector Δ = ((y + Δy)^T, Δg^T)^T, where x = (x_1, x_2, ..., x_n)^T is the vector of measurands,


y = (y_1, y_2, ..., y_n)^T is the vector of output values of the measuring channels, Δy = (Δy_1, Δy_2, ..., Δy_n)^T is the vector of absolute errors of the components of y and, finally, Δg = (Δg_1, Δg_2, ..., Δg_m)^T is the vector of errors of the functions that describe the relations between the measurands. Since the main purpose of the performed measurements is to estimate the values of the measurands x_1, x_2, ..., x_n, i.e. the values of the solutions of the equation system S(x) − Δ = 0, where 0 = (0, ..., 0)^T, such a problem turns out to be an inverse problem for static-mode measurements. If even some of these equations are nonlinear, then we need to solve the system with one of the iterative methods (for example, with the Newton, Newton-Raphson or Marquardt method, or with another approach). We should solve this system using the obtained values y_i and the error characteristics (Δy_i and Δg_j).

In commonly used equation-system solvers, the iteration is stopped when the absolute value of the next iteration step becomes less than an arbitrarily assigned small positive number ε that does not depend on the error values which the solver transforms from the measurement errors of the initial data. In the present paper, the following stopping rule is proposed that does not have this disadvantage.

1. For each k-th iteration step (k = 1, 2, ...), we should calculate the intervals I_{i,k} (i = 1, 2, ..., n) of possible errors of the components of the intermediate solution estimate x^(k) = (x_{1,k}, x_{2,k}, ..., x_{n,k})^T (the transformed errors), which are caused by the errors Δy_i of the measurement results and by the uncertainty Δg_j of the coupling equations for the measurands. For this, we can use the technique [1, 2] based on automatic differentiation and fuzzy intervals, which allows operating with different error types (systematic, random, etc.).

2. After calculation of the next improvement x^(k+1) of the desired solution, we should compare each component of the current solution adjustment (x^(k+1) − x^(k)) with the corresponding interval I_{i,k} of the transformed errors of these components: we need to determine whether the relation x_{i,k+1} − x_{i,k} ∈ I_{i,k} is true or not for all i = 1, ..., n.

3. If the absolute value of the i-th component of the vector (x^(k+1) − x^(k)) is less than the half-width of the interval I_{i,k} for all i = 1, 2, ..., n, then the iterative process should be stopped.

The main difference of the proposed rule from traditional stopping criteria is that it deals with the uncertainty of the initial data and allows the iteration to be stopped as early as possible without loss of quality of the results.

Usually, the following conditions are used as a stopping criterion [3, 4]:

||x^(k) − x^(k−1)|| ≤ ε   or   ||x^(k) − x^(k−1)|| ≤ δ · ||x^(k)||,


where ε and δ are predetermined, sufficiently small positive numbers and ||x^(k)|| is the norm of the vector x^(k). Such inequalities carry little information about the solution accuracy: we do not know how far the estimate x^(k) is from the exact solution x_exact. That is why a more adequate stopping criterion is like the following one: all elements of the vector-function value S(x^(k) − ε·1), or S(x^(k)·(1 − δ)), should have signs opposite to the corresponding elements of the vector-function value S(x^(k) + ε·1), or S(x^(k)·(1 + δ)), where 1 = (1, ..., 1)^T. If such a criterion is satisfied, then we can estimate the solution accuracy: all absolute errors (x^(k) − x_exact) of the solution component estimates are within the interval [−ε, +ε], or correspondingly all relative errors have absolute values less than δ.

This stopping rule ignores the uncertainty of the equation parameters; they are treated as absolutely accurate. The only way to update such traditional stopping rules to take the initial data uncertainty into account is to use multiple calculations, for example the Monte Carlo technique. We can model different possible values of the initial data errors and use a traditional system solver and stopping criterion for each computation. As a result, we obtain the set of possible values of the solution of the examined equation system and estimate its bounds. Such an approach allows us to obtain adequate results but requires too much time to execute. The presented method does not incur such computational cost.

The proposed rule is similar to an idea noted in [5, 6]: the right moment to stop the iterative process is an important factor in the regularization of ill-posed problems. So, we should choose the moment to stop in coordination with the inaccuracy of the current solution estimate.

The approach of papers [1, 2] allows one to obtain the domain ΔX^(k) of the possible error of the current solution estimate x^(k) that is caused by the transformation of the errors Δy and Δg through the iterative procedure. The domain ΔX^(k) is produced in the form of a parallelepiped. So, at every iteration k, we get the approximate root values x^(k), the parallelepiped ΔX^(k) of their possible errors and the improvement e^(k+1) = x^(k+1) − x^(k) for the next iteration step. We can formulate the proposed stopping criterion for the iteration process in the following way: if k is the smallest natural integer such that the value e^(k+1) is inside the domain ΔX^(k), then iteration (k+1) should be the final one. So, it is reasonable in the metrological sense to stop the iterative process at the smallest k for which one of the following equivalent relations holds:

e^(k+1) ∈ ΔX^(k)   or   x^(k+1) ∈ X^(k),      (1)


where X^(k) = x^(k) + ΔX^(k) is the domain of possible values of the root estimate at iteration k.

We can consider a simple example as an illustration of the reliability of the results obtained with the proposed stopping rule. Let n be equal to 1. Then the system S contains the only equation y = f(x) and x = (x). The value y represents the measured value of the function f for some argument and is inaccurate: its absolute error does not exceed Δy. If the Newton method is used to find the root of this equation, then

x^(k+1) = x^(k) − (f(x^(k)) − y) / f'(x^(k)),      x^(k+1) − x^(k) = (y − f(x^(k))) / f'(x^(k)),

where f' is the derivative of the function f.

In accordance with (1), the difference x^(k+1) − x^(k) should be less than Δx^(k), the maximum possible error of x^(k) caused by the inaccuracy of y. So, the value |f(x^(k)) − y| / |f'(x^(k))| should be less than Δx^(k) too. As a conclusion, relation (1) for the Newton method can be rewritten as

|f(x^(k)) − y| ≤ Δx^(k) · |f'(x^(k))|.      (2)

For such a one-dimensional case, the domain ΔX^(k) is the interval [−Δx^(k), +Δx^(k)].
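For the one-dimensional case, the rule can be sketched in a few lines (our own illustration, not the author's program; here only y is treated as uncertain, and the transformed error bound is propagated to first order as Δx^(k) = Δy / |f'(x^(k))|, which is a simplification of the technique of [1, 2]).

def solve_with_metrological_stop(f, fprime, y, dy, x0, max_iter=100):
    """Newton iteration for f(x) = y, stopped when the next adjustment falls
    inside the transformed error interval of the current estimate."""
    x = x0
    err = 0.0                               # half-width of the transformed error interval
    for _ in range(max_iter):
        step = (y - f(x)) / fprime(x)       # Newton adjustment x_{k+1} - x_k
        if abs(step) <= err:                # proposed rule: adjustment inside error interval
            break
        x += step
        err = dy / abs(fprime(x))           # first-order error bound at the new estimate
    return x, err

# Example: f(x) = x^3 with a measured value y = 8 +/- 0.05.
x_hat, half_width = solve_with_metrological_stop(lambda x: x**3, lambda x: 3 * x**2,
                                                 y=8.0, dy=0.05, x0=1.0)
print(x_hat, half_width)    # root near 2, half-width about 0.004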

Let k = k_0 be the iteration number at which we stopped the iterative process. If Δy is small and f is smooth, as usually takes place in practice, then we can state with sufficient confidence that the actual value of the root will be inside the interval [x^(k_0) − Δx^(k_0), x^(k_0) + Δx^(k_0)]. Really, the interval

( )( ) ( )( )( )

[ ]000000

,kk

xkkxkk xfxfxfxf ∆⋅′+∆⋅′− almost sure

contains zero and I is an almost exact boundaries for value

( ) ( )( )00 kk xxf ∆+ ≈ ( )( ) ( )( ) ( )000

kxxfxf kk ∆⋅′+ for all ( )( ) ( )

[ ]000

,kk

xxkx ∆+∆−∈∆ .

This reasoning can be extended to the multidimensional case (when n > 1) of the Newton method. Then

x^(k+1) − x^(k) = −J^{-1}(x^(k)) · (S(x^(k)) − y),


where J^{-1}(x) is the inverse of the Jacobian matrix of partial derivatives and S(x) is the vector function of the equation system being solved, as defined above. So, inequality (2) changes to

J^{-1}(x^(k)) · (S(x^(k)) − y) ∈ ΔX^(k)   or   ||S(x^(k)) − y|| ≤ max_{Δx ∈ ΔX^(k)} ||J(x^(k)) · Δx||.

Similar considerations on reliability can be given for other methods of solving equation systems, not only for the Newton approach.

In the presented work, the advantages of the proposed stopping rule are illustrated with two examples. The first example is an extremely simple equation taken from a metrological application. It was briefly discussed in [7]. The problem is to determine the variable x from the equation

exp(y_1·x) = y_2·x.      (3)

This relation is used [8] in metrology for measuring the radius of the dangerous zone of surface pollution by a toxic contaminant. Let the parameters y_1 and y_2 be obtained from measurements and their values be inaccurate: y_1 ∈ [0.99; 1.01] and y_2 ∈ [2.71; 2.73]. The midpoints of these intervals (ȳ_1 = 1.00 and ȳ_2 = 2.72) are the measurement results for the parameters y_1 and y_2, whose absolute errors Δy_1 and Δy_2 satisfy |Δy_1| ≤ 0.01 and |Δy_2| ≤ 0.01. We can estimate the root of this equation using the Newton method. Let the initial estimate be x^(1) = 0. All possible roots lie within [0.849, 1.190].

For every iteration step, we estimate the bounds of the possible values of the absolute error of the current root estimate x^(k) by the relation

Δx^(k) = |∂x^(k)/∂y_1|·Δy_1 + |∂x^(k)/∂y_2|·Δy_2,

where Δy_1 and Δy_2 are the limit values of the absolute errors given above. The obtained results are presented in Table 1 and illustrated by Fig. 1.

Table 1. Iterative procedure results detailed for every iteration step.

iteration    interval boundaries        current root      interval half-    root adjustment
number k     X^(k) = x^(k) ± Δx^(k)     estimate x^(k)    width Δx^(k)      x^(k+1) − x^(k)
1            [0.000; 0.000]             0.000             0.000             0.582
2            [0.575; 0.589]             0.582             0.007             0.224
3            [0.775; 0.835]             0.805             0.030             0.100
4            [0.827; 0.985]             0.906             0.079             0.048
5 √          [0.775; 1.132]             0.954             0.178             0.023
6            [0.599; 1.355]             0.977             0.378             0.012
7            [0.210; 1.766]             0.988             0.778             0.006


The iterative process should be stopped at iteration No 5, in correspondence with formula (1). So, the final interval x^(k) ± Δx^(k) for the root is [0.775, 1.132].

Fig. 1. Intermediate results of root-finding iterative procedure

We can see that if we do not stop the iterations, the error interval widens. This demonstrates that the problem is ill-conditioned.

The reliability of the obtained root estimate was studied numerically for the proposed stopping rule. The actual value of the parameter y_1 was taken equal to 1.0 and different actual values of y_2 were chosen greater than 3.0. These values were distorted with absolute random errors Δy_1 and Δy_2 that were generated uniformly within the corresponding error limits. N = 10^4 different values of the errors Δy_1 and Δy_2 were taken for every pair of actual values of y_1 and y_2. Various error limits from 0.01 to 0.50 were used in the performed numerical tests. For all variants of the modeling, all 10^4 attempts were successful: the actual values of the root lay inside the final interval estimates. If we take into account the statistical uncertainty related to the finite sample size, then we can state that the true probability of success is more than 0.9996 (with a confidence probability equal to 95 %). This conclusion is derived from the Clopper-Pearson confidence interval for a probability [9].
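For reference, the quoted bound can be reproduced with the exact Clopper-Pearson limit for the case of N successes out of N trials (our own sketch, not part of the paper).

# Clopper-Pearson lower confidence bound for the success probability when all
# N = 10^4 trials succeeded (two-sided 95 % interval, i.e. alpha/2 = 0.025).
N, alpha = 10**4, 0.05
p_lower = (alpha / 2) ** (1.0 / N)   # exact lower limit when x = N
print(f"{p_lower:.4f}")              # about 0.9996, as stated above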

The following simple Matlab program code can be used to obtain the results described above. It contains a realization of the proposed approach with respect to Newton's root-finding method for a scalar real-valued function. The code corresponds to equation (3) but can easily be modified for any other equation.

N = 10^4;      % used sample size.
y10 = 1.0;     % true value of equation parameter y1.
y20 = 3.0;     % true value of equation parameter y2.
dy1 = 0.01;    % limit value of absolute error of y1.
dy2 = 0.05;    % limit value of absolute error of y2.
x0  = 0.0;     % initial root estimate for Newton's method.

% true equation exp(y10*x) - y20*x = 0; the value found numerically here
% is treated as the real root.
real_root = fzero(@(x) exp(y10*x) - y20*x, 0.5);

% equation to solve, in the form f(x) = 0, with parameter offsets d1, d2 added
% to the true values (the measured values are y10 + d1 and y20 + d2).
f      = @(x, d1, d2) exp((y10 + d1).*x) - (y20 + d2).*x;
% derivative of f with respect to x (for Newton's method).
diff_f = @(x, d1, d2) (y10 + d1).*exp((y10 + d1).*x) - (y20 + d2);

is_root = zeros(N, 1);  % indicator of attempts when the true root
                        % is inside the derived interval estimate.
for i = 1 : N
    % random generation of the measured values y1 and y2.
    y1 = unifrnd(y10 - dy1, y10 + dy1);
    y2 = unifrnd(y20 - dy2, y20 + dy2);
    e1 = y1 - y10;      % realized measurement errors used in the iteration
    e2 = y2 - y20;

    % Newton's method combined with complex-step derivative approximation:
    x_prev_1 = x0; x_new_1 = x0;   % iteration perturbed with respect to y1
    x_prev_2 = x0; x_new_2 = x0;   % iteration perturbed with respect to y2
    dx  = 1.0;          % root estimate adjustment.
    err = 0.0;          % root estimate error bound related to the
                        % uncertainty of the equation parameters.
    alpha = 1e-100;     % small number for the complex-step method.

    while abs(dx) > err             % proposed stopping rule
        % Newton step with a complex perturbation of y1.
        x_new_1 = x_prev_1 - f(x_prev_1, e1 + 1i*alpha, e2) / ...
                             diff_f(x_prev_1, e1 + 1i*alpha, e2);
        % Newton step with a complex perturbation of y2.
        x_new_2 = x_prev_2 - f(x_prev_2, e1, e2 + 1i*alpha) / ...
                             diff_f(x_prev_2, e1, e2 + 1i*alpha);
        % partial derivatives of the current root estimate
        % with respect to y1 and y2 correspondingly.
        d_x_new_1 = abs(imag(x_new_1) / alpha);
        d_x_new_2 = abs(imag(x_new_2) / alpha);
        % absolute error bound for the current root estimate.
        err = d_x_new_1 * dy1 + d_x_new_2 * dy2;
        % root estimate adjustment.
        dx = real(x_new_1 - x_prev_1);
        x_prev_1 = x_new_1;         % passage to the next iteration
        x_prev_2 = x_new_2;
    end

    % is the true root inside the derived interval estimate?
    for j = 1 : length(real_root)
        if (real_root(j) > real(x_new_1) - err) && ...
           (real_root(j) < real(x_new_1) + err)
            is_root(i) = 1;
        end
    end
end

% fraction of trials in which the true root was inside
% the final interval estimate for the root
sum(is_root ~= 0) / N

The code above was intended to be as compact as possible, and that is why it uses a reduced and modified version of the approach of [1, 2] for dealing with the uncertainty of the initial data. The realization of the discussed stopping criterion is applied to determine the moment to break execution of the 'while' iteration. To determine the error bound of the current root estimate, the technique of complex-step derivative estimation is used [10, 11] (a kind of automatic differentiation approach for numerical software that deals with real numbers). This technique is applied to Newton's root-finding procedure.

The second example illustrates the case of solving a system that contains two nonlinear equations. The presented problem is related to a metrological application in hydrodynamics and represents the data processing needed when pressure values measured inside the liquid have to be recalculated into the elevation of the free surface [12]. The system to solve has the form presented by expression (4). This system should be solved with respect to x_1 and x_2. The unknowns have the following meaning: x_1 is the amplitude multiplier that characterizes the dependence between the pressure fall inside the liquid (which takes place during wave propagation on the water free surface) and the wave height; x_2 is the wave number, equal to 2π/λ, where λ is the wave length. The system has six parameters, some of which are measured directly and some are estimated roughly a priori. The equation parameters have the following meaning: y_1 is the water depth (and is measured), y_2 is the mark of the pressure gage displacement above the seabed (and is measured when the gage is mounted), y_3 is

displacement above the seabed (and is measured when gage is mounted), y3 is

Page 341: Computational tools in metrology and testing x

328

9610-39:Advanced Mathematical and Computational Tools

free fall acceleration (taken in accordance with geographical location), y4 is

water density (estimated in accordance with current temperature), y5 is pressure

fall (and is measured directly by the pressure gage), y6 is wave period (and is

measured indirectly from data provided by the pressure gage).

( )( )

( ) ( ) ( )

( )( )( )

( )

( )( )

( ) ( ) ( )( )

( )( ) ( )

( )

⋅=

⋅⋅

⋅⋅

+⋅⋅−⋅⋅

+⋅⋅⋅

=

⋅⋅

+⋅⋅−⋅⋅+

+⋅⋅⋅

⋅⋅

⋅⋅−

+⋅⋅⋅−×

×

⋅⋅⋅⋅

⋅+

⋅⋅

.

,

263

21

12

1212122

43

52212121

221

21

21221

2121

21

21

2211

yy

πxx

yx

yxyxyxx

yy

yx-yyxyxy

x-yy

xy

xyx-yy

xyxy

xx

xy

x-yyx

22

22

4

24

24

2

2

2

2

23

4

sh16

9ch8ch81th

ch1ch3

8ch

3

8

3chsh4

ch4133ch2

2shsh4

3

ch

ch2

(4)

When the system is solved, we can estimate wave height h on free surface

with the following expression from the Stokes theory of waves on the water:

( ) ( ) ( )( )

( )

.

12

121212211

⋅⋅

+⋅⋅−⋅⋅+⋅⋅⋅⋅+⋅⋅=

yx

yxyxyxxxxh

6

24622

sh64

39ch76ch32ch3212

The proposed stopping rule was applied to the solving procedure for this system and was tested on real data [12]. From the laboratory experiment, we obtained the following values: y_1 = 0.657±0.002 m, y_2 = 0.265±0.002 m, y_3 = 9.819±0.001 m/sec², y_4 = 998.20 kg/m³ (for 20 degrees Celsius), y_5 = 1.0±0.1 kPa, y_6 = 1.93±0.01 sec. We applied the Newton method to solve this system and chose x^(1) = (0.5, 0.15)^T as the initial guess for x_1 and x_2.

In Fig. 2, we can see that the solution estimate at iteration No 8 is inside the uncertainty domain of the solution estimate at iteration No 7. So, we need to stop the iteration procedure here. The obtained results are: x_1 = 0.064±0.06 m, x_2 = 1.435±0.015 m⁻¹. The corresponding value of the wave height h is 0.13±0.01 m. The value of h was also measured with a wave meter; the obtained value was 0.129±0.03 m for the presented test parameters. We can see that the obtained solution agrees satisfactorily with the measurement result.

The accuracy of the parameter y_5 is determined by several factors and was estimated on the assumption of the worst case. A more plausible bound on its error is lower: y_5 = 9.92±0.05 kPa. This bound is estimated from the consideration that the influencing factors are independent. In this case, the iteration process should also be stopped at iteration No 8, and the value of the wave height h is 0.130±0.06 m.

As a conclusion, we suppose that the proposed stopping criterion is valid in the metrological sense for the iteration processes used in metrology (the presented examples illustrate this).


Fig. 2. Intermediate solution estimates

Acknowledgments

The author would like to thank Gennady N. Solopchenko for valuable comments and a careful reading of this paper, and the anonymous reviewer for valuable remarks and commentaries.

References

1. K. Semenov, G. Solopchenko. Measurement Techniques. 53 (6), 529 (2010).

2. K. Semenov, G. Solopchenko. Measurement Techniques. 54 (4), 378 (2011).

3. G. Recktenwald. Stopping Criteria for Iterative Solution Methods (2012).

4. V. Berinde. Novi Sad J. Math. 27 (1), 19 (1997).

5. M. Krasnoselskiy, I. Emelin, V. Kozyakin. About iteration procedures in

linear problems (1979). In Russian.

6. G. Vaynikko, A. Veretennikov. Iterative procedures in ill-posed problems

(1986). In Russian.

7. V. Kreinovich, L. Reznik, K. Semenov, G. Solopchenko. Proceedings of XX

IMEKO World Congress (2012). Paper IMEKO-WC-2012-ADC-O3, file

762.

8. A. Votschanin. Factorial laboratory. 68 (2002). In Russian.

9. C. Clopper, E. Pearson. Biometrika. 26 (4), 404 (1934).

10. W. Squire, G. Trapp. SIAM Rev. 40 (1), 110 (1998).

11. J. R. R. A. Martins, P. Sturdza, J. J. Alonso. ACM Transactions on

Mathematical Software. 29 (3), 245 (2003).

12. V. Maximov, I. Nudner, K. Semenov, N. Titova. Geophysical Research

Abstracts, 15 (2013), abstract EGU2013-3659.


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 330–339)

INVERSE PROBLEMS IN THEORY AND PRACTICE OF

MEASUREMENTS AND METROLOGY

SEMENOV K. K., SOLOPCHENKO G. N.

Department of measurement informational technologies, St. Petersburg State

Polytechnical University, 29, Polytechnicheskaya str., St. Petersburg, 195251, Russia

KREINOVICH V. YA.

Department of Computer Science, University of Texas at El Paso,

500 W. University, El Paso, TX 79968, USA

In this paper, we consider the role of inverse problems in metrology. We describe general

methods of solving inverse problems which are useful in measurements practice. We also

discuss how to modify these methods in situations in which there is a need for real-time

data processing.

1. Introduction

What mathematical physics calls inverse problems is, in effect, the class of

problems, which are fundamental in measurement theory and practice [1, 2]. The

main objective of such problems is to develop procedures for acquiring

information about objects and phenomena, accompanied by decreasing the

distortion caused by the measuring instruments. Lord Rayleigh was the first to

formulate such a problem in 1871, using the example of spectroscopy. His purpose was to decrease the influence of diffraction as much as possible. Rayleigh showed that, in mathematical terms, the problem of reconstructing the actual spectrum x(ν) from the measured signal y(u) can be reformulated as the problem of solving the integral equation

y(u) = ∫_{−∞}^{+∞} K(u − ν)·x(ν) dν,    (1)

where K(u − ν) is the apparatus function of the spectrometer – which describes

the distortion caused by diffraction.

The relation between inverse problems and measurements was emphasized

by G. I. Vasilenko [3], who explicitly stated that the main objective of the

inverse problem is “restoring the signals” or “reduction to the ideal instrument”.


Eq. (1) is a Fredholm integral equation of the first kind; it can be represented in the form y(u) = A x(ν), where A is a compact linear convolution operator – describing a generic analog transformation of a signal inside a measuring instrument – and K(u − ν) is the kernel of this operator. From the mathematical viewpoint, the solution of Eq. (1) can be expressed as x(ν) = A⁻¹ y(u), where A⁻¹ is the operator inverse to the compact operator A. From the practical viewpoint, however, we have a problem: it is known that such inverse operators are not bounded (see [4, p. 509]); as a result, a small noise in the measured signal can lead to drastic changes in the reconstructed solution x(ν). Such problems are known as ill-posed. A general approach to generating a physically reasonable solution to this problem – known as regularization – was formulated by A. N. Tikhonov in 1963 [5].

2. Inverse problems in metrology

If we take into account the inaccuracy e(u) with which we register the output signal and the inaccuracy ε(u − ν) with which we know the apparatus function of the measurement device, then Eq. (1) takes the form

y(u) = ∫_{−∞}^{+∞} K_ε(u − ν)·x(ν) dν + e(u).

This equation with infinite (symmetric) integration limits describes spatial distortion processes in spectroscopy, chromatography, and in acoustic and other antenna-based measurements. For dynamic measurements – i.e., for measuring dynamic signals – the measurement result can only depend on the past values of the signal, so the integration starts at 0:

y(t) = ∫_{0}^{∞} K_ε(t − τ)·x(τ) dτ + e(t) = A_ε x(t) + e(t),    (2)

where A_ε is the convolution operator with the kernel K_ε(t − τ) (known with inaccuracy ε(t − τ)) and e(t) is the additive noise.

The main idea behind Tikhonov's regularization is that we look for an (approximate) solution x̃(t) to Eq. (2) by minimizing an appropriate stabilizing functional Ω(x̃(t)) in a Sobolev space of smooth functions [5]. Usually, a functional Ω(x̃(t)) = β₀·∫₀^∞ x̃²(t) dt + β₁·∫₀^∞ [x̃′(t)]² dt, with β₀ > 0 and β₁ > 0, is used under the condition that the difference between y(t) and A x̃(t) is of the same order as the error Δ caused by e(t) and ε(t): ‖A x̃(t) − y(t)‖² = Δ². The Lagrange multiplier technique reduces this constrained optimization problem to the unconstrained


optimization of the functional [5]:

min_{x̃(t)} { ‖A x̃(t) − y(t)‖² + α·Ω(x̃(t)) },    (3)

where α is called a regularization parameter.

2.1. The minimal modulus principle

When we have a priori information about the norm of the solution and/or of its derivative, we can find α. In particular, we can use fuzzy (imprecise) expert a priori information [6]. In the absence of such a priori information, we can use the principle of minimal modulus [7, 8] to select α.

This method is based on the fact that, in the frequency domain, the stabilizing functional takes the form Ω(x(jω)) = β₀·∫₀^∞ |x(jω)|² dω + β₁·∫₀^∞ ω²·|x(jω)|² dω, where j is the imaginary unit and ω is the circular frequency. The minimum of this functional is attained when the modulus |x(jω)| is minimal.

The Fourier transform of Eq. (2) leads to y(jω) = K_ε(jω)·x(jω) + e(jω). Based on the 95% confidence intervals K(τ) − ε₀.₉₅(τ) ≤ K_ε(τ) ≤ K(τ) + ε₀.₉₅(τ) and −e₀.₉₅(t) ≤ e(t) ≤ e₀.₉₅(t) in the time domain, we can find the ellipses describing the uncertainty in the frequency domain [9]. As a result, for every frequency ω_i we obtain two error-related ellipses in the complex plane: the first one centered at y(jω_i) (the Fourier transform of the output signal) and the other one centered at K_ε(jω_i) (the Fourier transform of the apparatus function), as shown in Fig. 1. As shown in [7], for all values ω_i the value x̃(jω_i) corresponding to the regularized solution is equal to x̃(jω_i) = y*(jω_i)/K*(jω_i), where K*(jω_i) is the point on its ellipse that is the farthest from the coordinate origin, and y*(jω_i) is the point on the corresponding ellipse that is the closest to the coordinate origin. This prevents a zero value from appearing in the denominator. So the problem stops being ill-posed, but the numerator y*(jω_i) of the ratio x̃(jω_i) undergoes a step to zero at some frequency. This causes the Gibbs phenomenon when we perform the inverse Fourier transform of x̃(jω). In each concrete case, manual adjustment of the input data error characteristics may decrease the influence of this effect.
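A minimal numerical sketch of the per-frequency rule x̃(jω_i) = y*(jω_i)/K*(jω_i) is given below. For simplicity it replaces the uncertainty ellipses by discs of given radii; the function name and the illustrative usage values are our assumptions, not part of [7, 9].

```python
import numpy as np

def minimal_modulus_solution(Y, K, r_y, r_k, eps=1e-15):
    """Per-frequency regularized estimate x~(jw) = y*(jw) / K*(jw).

    Y, K     : complex arrays with Fourier transforms of the registered signal
               and of the apparatus function at the frequencies w_i;
    r_y, r_k : radii of the uncertainty regions around Y and K (discs are used
               here as a simplification of the ellipses described in the text).
    y* is the point of the Y-region closest to the origin, K* the point of the
    K-region farthest from the origin, so the modulus |x~| is minimal."""
    Y = np.asarray(Y, dtype=complex)
    K = np.asarray(K, dtype=complex)
    absY, absK = np.abs(Y), np.abs(K)
    y_star = Y * np.maximum(absY - r_y, 0.0) / np.maximum(absY, eps)
    k_star = K * (absK + r_k) / np.maximum(absK, eps)
    return y_star / k_star

# illustrative usage with synthetic spectra
w = np.linspace(0.1, 10.0, 5)
Y = 1.0 / (1.0 + 1j * w)
K = 1.0 / (1.0 + 0.5j * w)
print(np.abs(minimal_modulus_solution(Y, K, r_y=0.05, r_k=0.05)))
```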


Fig. 1. Illustration of minimal modulus principle

From Fig. 1, it is clear that this solution indeed minimizes the modulus |x(jω)|, and the condition ‖A x̃(t) − y(t)‖² = Δ² holds. After applying the inverse Fourier transform to the solution x̃(jω_i), we get the desired regularized solution

to the inverse problem – in other words, we achieve the desired reduction to the

ideal measuring instrument. We have shown that this method works very well in

many practical situations [10, 11]. This method also allows us to take into

account the “objective” prior information about errors and also “subjective”

information – as described by (possibly imprecise) expert estimates [6].

2.2. The inverse filter

The principle of minimal modulus can only be used after the whole signal is

measured. This is reasonable in spectroscopy and chromatography, but in

processing dynamic signals, we often need to produce results in real time, before

all the measurements are finished. This can be achieved by using an inverse

filter, which can be physically implemented as one or several sequential

dynamically stable circuits. An example is given on Fig. 2.

Fig. 2. Inverse filter circuit


If the amplifier gain is ampK and R and C are the resistance and capacitance

of inertial RC-circuit, then the complex frequency characteristic (CFC) of circuit

on Fig. 2 is equal to

K_f(jω) = K_amp·[1 + K_amp·(1 + jω·R·C)⁻¹]⁻¹ = K_amp·(1 + jω·R·C)/(1 + K_amp + jω·R·C).

This filter can be used if the modulus of CFC of the measuring instrument is

monotonically decreasing. For example, such property is usual for thermistors,

thermocouples, Hall sensors for current strength etc. Such gauges have first

order CFC:

K_g(jω) = K₀/(1 + jω·τ_g),

where τ_g is its time constant and K₀ is the gain coefficient in static mode. In this case, if the values of R and C of the inverse filter (Fig. 2) are chosen such that R·C = τ_g, then

( ) ( )

1

0

11

1

+

⋅+⋅

+

=⋅

ampamp

ampωωω

К

CRj

К

KKjKjK fg .

We can see that the time constant of this series connection is decreased by a factor of (K_amp + 1) >> 1 compared with τ_g. This accelerates the response by the same ratio and represents the solution of the inverse problem of signal restoration.
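The band-widening effect of the first-order inverse filter can be checked numerically with the short sketch below; the parameter values (K_amp = 500, τ_g = R·C = 21.5 µs) are illustrative choices echoing the ADC example of the next section, not prescriptions.

```python
import numpy as np

# Numerical check of the band-widening effect described above.
K0, K_amp, tau_g = 1.0, 500.0, 21.5e-6       # static gain, amplifier gain, s
R_C = tau_g                                   # inverse-filter time constant

def K_gauge(w):
    return K0 / (1.0 + 1j * w * tau_g)

def K_filter(w):
    return K_amp / (1.0 + K_amp / (1.0 + 1j * w * R_C))

w = 2.0 * np.pi * np.logspace(1, 5, 5)        # 10 Hz ... 100 kHz
series = K_gauge(w) * K_filter(w)
# the effective time constant of the series drops to tau_g / (1 + K_amp)
print(np.round(np.abs(K_gauge(w)), 4))
print(np.round(np.abs(series) / (K0 * K_amp / (1.0 + K_amp)), 4))
```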

If the order of the CFC of the measurement device is larger than one, then the number of first-order inverse filters (Fig. 2) that should be concatenated one after another equals that order. A positive result can be achieved by individually tuning the gain and the parameters R and C of every first-order circuit. The inverse problem solution can be achieved using similar inverse filters even for converters whose CFC order cannot be determined exactly.

Let us examine an example of using such an inverse filter for ΣΔ analog-to-digital conversion: let us consider the approximation of the frequency characteristic of the ADC ADS1256 [12]. This ADC is used for digitizing analog signals with frequency bands (0÷25), (0÷50) and (0÷500) Hz. To construct an inverse filter improving its metrological properties, we should use a fractionally rational approximation of the ADC frequency characteristic.


To approximate the ADS1256 CFC K_ADC(jω̄), we can use separate values of its squared amplitude frequency characteristic (AFC) |K_ADC(jω̄)|² that are presented by the ADC producer [12]:

|K_ADC(jω̄)| = | sin(π·ω̄)/(64·sin(π·ω̄/64)) |⁵ · | sin(N·π·ω̄)/(N·sin(π·ω̄)) |,

where ω̄ = ω/ω_s = f/f_s is the relative frequency, f is the ADC input signal frequency, f_s = 30 kHz is the ADC maximum sampling frequency, N is the number of averaged output values, and ω = 2πf and ω_s = 2πf_s are the angular frequencies.

The mentioned data points are placed in the second row of table 1.

Fractionally rational approximation was performed for the function |K_ADC(jω̄)|². This function is real-valued, and its argument is ω̄². So, we can apply traditional approximation techniques developed for real-valued functions. Two variants of approximation were considered: the case N = 1, where the frequency band for approximation is [0, 0.06666] in ω̄ or [0, 2000] Hz in f, and the case N = 8, where the band is [0, 0.06] in ω̄ or [0, 1800] Hz in f. We use uniform meshes of 81 points in both cases.
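The mesh values of the AFC can be generated with a few lines of Python; the sinc-type expression and the first-order coefficient 4.12 used below follow the reconstruction above and should be treated as assumptions rather than as data quoted from [12].

```python
import numpy as np

# Tabulate the sinc-type AFC reconstructed above on the 81-point mesh and
# compare it with a first-order fit 1/|1 + a*j*w_rel|.  Both the AFC form
# and the coefficient a = 4.12 are assumptions based on the reconstruction.
f_s = 30e3                                    # maximum sampling frequency, Hz

def afc_adc(w_rel, N=1):
    w_rel = np.asarray(w_rel, dtype=float)
    sinc5 = np.ones_like(w_rel)
    avg = np.ones_like(w_rel)
    nz = w_rel > 0
    sinc5[nz] = (np.sin(np.pi * w_rel[nz]) /
                 (64.0 * np.sin(np.pi * w_rel[nz] / 64.0))) ** 5
    avg[nz] = np.sin(N * np.pi * w_rel[nz]) / (N * np.sin(np.pi * w_rel[nz]))
    return np.abs(sinc5 * avg)

w_rel = np.linspace(0.0, 0.06666, 81)         # mesh for the case N = 1
afc = afc_adc(w_rel, N=1)
fit = 1.0 / np.abs(1.0 + 1j * 4.12 * w_rel)   # first-order approximation
print(np.max(np.abs(afc - fit)))              # stays within the 0.3 % tolerance
```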

The obtained approximations were factorized to get an expression for the CFC K̃_ADC(jω̄). The factorization method used is described in [13]. The approximation accuracy was set to 0.3%. As a result, the following CFCs were obtained:

N = 1: K̃_ADC(jω̄) = 1/(1 + 4.12·j·ω̄),

N = 8: K̃_ADC(jω̄) = 1/(1 + 2·0.8598·14.73·j·ω̄ − (14.73·ω̄)²).

In Table 1 below, the values of the real AFC |K_ADC(jf)| and of its obtained approximation |K̃_ADC(jf)| are presented for N = 1. Fig. 3 illustrates the data presented in Table 1. The averaging that takes place during the analog-to-digital conversion causes this decrease of the AFC.


Table 1. Results of approximation of K_ADC(jf) for ADC ADS1256 (N = 1)

f, Hz          0.0   50       100       200      300      400      500     1000    2000
|K_ADC(jf)|    1.0   0.99998  0.999909  0.99964  0.99918  0.99854  0.9977  0.9909  0.9641
|K̃_ADC(jf)|    1.0   0.99998  0.999906  0.99962  0.99916  0.99850  0.9977  0.9907  0.9644

Fig. 3. Results of AFC approximation and correction for ADC ADS1256 (case N = 1)

In Fig. 3, a curve of the approximation error |K_ADC(jf)| − |K̃_ADC(jf)| is also presented. Its scale is given on the right side of the graph.

In Table 2 the results are placed for the case N = 8. Fig. 4 contains graphical

representation of data from this table.

Table 2. Results of approximation of K_ADC(jf) for ADC ADS1256 (N = 8)

f, Hz          0.0   60      300     600     900     1200    1500    1800
|K_ADC(jf)|    1.0   0.9996  0.9889  0.9559  0.9025  0.8311  0.7444  0.6463
|K̃_ADC(jf)|    1.0   0.9996  0.9895  0.9576  0.9035  0.8297  0.7426  0.6513


Fig. 4. Results of AFC approximation and correction for ADC ADS1256 (case N = 8)

We can see that for frequencies f less than 100 Hz the AFC is close to unity (the difference is less than 0.02%) for both approximations, which is acceptable. For frequencies from 500 to 1000 Hz, the AFC differs from 1.0 by less than 0.5%. The approximate order of the CFC is now determined. Our further purpose is to construct a physically realizable inverse filter corresponding to the obtained approximation.

Let us first consider the case N = 1.

The CFC approximation has the first order, so we can use the simplest inverse filter, as presented in Fig. 2. A good AFC correction is obtained if we choose the amplifier gain K_amp = 500 and the time constant of the filter in the feedback circuit τ_f = R·C = 21.5 µs. Let K̃_filt(jf) be the AFC of the inverse filter. Then the AFC of the connection of the inverse filter (first circuit in the sequence) and the ADC (second circuit in the sequence) will be K̃_ADC+filt(jf) = K̃_filt(jf)·K_ADC(jf).

Values of the corrected AFC for some frequencies are presented in Table 3 and plotted in Fig. 3 (marked as squares and a dashed curve), together with the values of the correction inaccuracy, equal to 1.0 − K̃_ADC+filt(jf) (its scale is on the right side of the graph).

Table 3. Correction with the inverse filter for ADC ADS1256 (N = 1)

f, Hz              0        60       300     600     900     1200    1500    2000
|K_ADC(jf)|        1.00000  0.99997  0.9992  0.9967  0.9926  0.9869  0.9796  0.9641
|K̃_ADC+filt(jf)|   0.99800  0.99804  0.9980  0.9980  0.9979  0.9978  0.9976  0.9966


We see that the corrected AFC has an essentially wider frequency band.

Let us now consider the case N = 8.

The simplest realization of an inverse filter for correction of the second-order CFC that describes the ADC time-frequency characteristics is a concatenation of two inverse filters of first order. The time constant τ_f of each of them should be about τ_f = (1.0÷5.0)/f_s for K̃₁(ω̄) and K̃₂(ω̄). A block scheme of such a composite inverse filter is presented in Fig. 5.

A filter of such structure can easily be realized in analog or in digital form. However, to obtain higher accuracy, it is better to put the inverse filter before the ADC and combine it with the input gain amplifier.

The time constant τ_f of each first-order inverse filter in Fig. 5 should be adjusted using a mathematical model of the filter. It can happen that the best result is obtained when the time constants of these filters are different. Mathematical modeling can also help to determine the best gain value K_amp for the direct circuits in Fig. 5. They should have a working frequency band wider than the frequency range chosen for the CFC correction.

Fig. 5. Block-scheme for inverse filter of second order

Mathematical modeling shows that for the case N = 8 a satisfactory correction can be achieved if we use two inverse filters of first order whose parameters are K_amp1 = 1000, τ_f1 = 60 µs and K_amp2 = 1000, τ_f2 = 58 µs. The results of such correction are presented in Table 4 and plotted in Fig. 4 (marked as squares and a dashed curve).

Table 4. Correction with the inverse filter for ADC ADS1256 (N = 8)

f, Hz              0        30       60       150      300      500     1000    1500    2000
|K_ADC(jf)|        1.00000  0.99989  0.99955  0.99721  0.98886  0.9692  0.8806  0.7445  0.5764
|K̃_ADC+filt(jf)|   0.99800  0.99802  0.99805  0.99829  0.99909  1.0005  0.9996  0.9728  0.8915


It is clear that the AFC unevenness over the frequency band [0, 1] kHz is less than 0.2%. We can conclude that the described technique of inverse filter design allows obtaining a measuring channel with a wider frequency band and a faster response. Such a technique can be applied to any measurement instrument or converter with a monotonically decreasing amplitude frequency characteristic.

References

1. G. Solopchenko, Measurement Techniques. 17 (1974).

2. V. Knorring and G. Solopchenko, Measurement Techniques. 46, 546

(2003).

3. G. Vasilenko, Theoriya vosstanovleniya signalov. (1979). In Russian.

4. S. Mikhlin, Mathematical physics: an advanced course. (1970).

5. A. Tikhonov and V. Arsenin, Solutions of Ill-Posed Problems. (1977).

6. V. Kreinovich, C.-C. Chang, L. Reznik and G. Solopchenko, NASA

Conference Publication (NAFIPS-92), 2, 418. (1992).

7. G. Solopchenko. Measurement Techniques, 44, 546 (2001).

8. N. Seregina and G. Solopchenko, Izvestiya AN SSSR. Technical cybernetics,

2, 166 (1984). In Russian.

9. K. Semenov and G. Solopchenko. Measurement Techniques. 53, 592

(2010).

10. K. Savkov, N. Seregina and G. Solopchenko. Journal of Advanced

Materials, 1 (2), 205 (1994).

11. N. Seregina, G. Solopchenko. Pribory i sistemy upravleniya. (4), 19 (1992)

In Russian.

12. ADS1255, ADS1256: Very Low Noise, 24-bit Analog-to-Digital Converter.

Texas Instruments technical document SBAS288K. Available at:

www.ti.com/lit/ds/sbas288k/sbas288k.pdf.

13. V. Kreinovich, G. Solopchenko. Measurement Techniques. 36 (9), 968

(1993).


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 340–349)

FUZZY INTERVALS AS FOUNDATION OF METROLOGICAL

SUPPORT FOR COMPUTATIONS WITH INACCURATE DATA

SEMENOV, K. K., SOLOPCHENKO, G. N.

Department of Measurement Informational Technologies, St. Petersburg State

Polytechnical University, 29, Polytechnicheskaya str., St. Petersburg, 195251, Russia

KREINOVICH, V. YA.

Department of Computer Science, University of Texas at El Paso,

500 W. University, El Paso, TX 79968, USA

In this paper, we discuss the possibility of using the formalism of fuzzy intervals

combined with automatic differentiation technique as a basis for numerical software self-

verification in metrology. The natural domain of such approach is calculating indirect

measurements results using the inaccurate results of direct measurements as the initial

data. We propose to support software for such computations with tools that allow us to

receive simultaneously calculated results and their error characteristics. Only such

software can be put to metrological validation in full.

In many practical situations, the inaccurate results of direct measurements

are used for calculations of indirect measurements results. Final data are also

uncertain. Characteristics of this uncertainty should be expressed in quantitative

form and presented together with indirect measurement result. The main purpose

of this paper is to discuss ways to provide software for measured data processing

with tools of automatic calculation of final result uncertainty. Only software that

is supported in such manner can pass the metrological certification in full.

To achieve this purpose, we propose to use combination of two formalisms:

fuzzy intervals approach – to represent inaccuracy of initial data for calculations,

and formalism of software automatic differentiation – to compute how initial

data uncertainty transforms to inherited uncertainty of final result.

There are many approaches for representing inaccuracy of measured data

that act as initial information for subsequent calculations. Modern approaches

take into account different information about the initial data inaccuracy. Some of them use random variables [1-3] for representing and handling uncertainty, while others use bounds on the possible values of the initial data [4-6]. The interval representation of data inaccuracy was first mentioned by Wiener [7] and Kantorovich [8]. With the development of the fuzzy set theory, its formalism


became an accepted tool for expressing uncertainty in metrology [9, 10]. A natural evolution of the ideas of the interval and fuzzy frameworks is the concept of the fuzzy interval [9]. In this paper, we show that the combination of the fuzzy interval approach with the technique of automatic differentiation of programs is the most promising way to achieve the declared purpose in metrology. This approach allows operating with both objective and subjective (expert) data that can occur in applications.

Let us consider the advantages of using fuzzy intervals instead of traditional intervals as a characteristic of uncertainty in computations with inaccurate data.

Let x̃₁ = x₁ + Δx₁, …, x̃_n = x_n + Δx_n be the measurement results for quantities x₁, …, x_n that were obtained with absolute errors Δx₁, …, Δx_n. Let y = f(x₁, …, x_n) be the function that describes the necessary computations. We should compute not only the value ỹ = f(x̃₁, …, x̃_n) = f(x₁ + Δx₁, …, x_n + Δx_n), but also the characteristics of its inaccuracy:

Δy = f(x₁ + Δx₁, x₂ + Δx₂, …, x_n + Δx_n) − f(x₁, x₂, …, x_n).

If the errors Δx₁, …, Δx_n are small, then we can simplify the problem by linearizing the function ỹ = f(x̃₁, …, x̃_n). In this case, the resulting inaccuracy becomes a linear combination of the errors Δx₁, …, Δx_n:

Δy ≈ Σ_{i=1}^{n} [∂f(x̃₁, x̃₂, …, x̃_n)/∂x_i]·Δx_i.    (1)

Fig. 1. Membership function construction for a fuzzy interval

Since the computation of f is performed by a computer program, we can estimate the derivatives in Eq. 1 efficiently and with absolute accuracy using the technique of automatic differentiation [11]. This technique is used in [12, 13] for solving a wide class of metrological problems.
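For illustration, a minimal forward-mode automatic differentiation in Python (dual numbers) is sketched below; the class, the sin overload and the example function are ours and are not the implementations referenced in [11-13].

```python
import math

class Dual:
    """Forward-mode automatic differentiation: carries (value, derivative)."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.der + o.der)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val * o.val, self.der * o.val + self.val * o.der)
    __rmul__ = __mul__

def sin(x):
    return Dual(math.sin(x.val), math.cos(x.val) * x.der) if isinstance(x, Dual) else math.sin(x)

def gradient(f, xs):
    """One forward pass per argument gives the exact partial derivatives of f."""
    grads = []
    for i in range(len(xs)):
        args = [Dual(x, 1.0 if j == i else 0.0) for j, x in enumerate(xs)]
        grads.append(f(*args).der)
    return grads

# illustrative indirect measurement y = f(x1, x2) and the Eq. 1 error bound
f = lambda x1, x2: x1 * sin(x2) + 2.0 * x1 * x2
dx = [0.002, 0.001]                       # absolute errors of the inputs
g = gradient(f, [0.657, 0.265])
delta_y = sum(abs(gi) * di for gi, di in zip(g, dx))
print(g, delta_y)
```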

Let us describe the errors Δx₁, …, Δx_n as fuzzy variables. The operations of addition and multiplication by a constant in Eq. 1 should then be treated accordingly. The errors Δx₁, …, Δx_n are composed of systematic components Δ_syst x₁, …, Δ_syst x_n and random components Δ_rand x₁, …, Δ_rand x_n. It should be taken into account that they act differently when we perform multiple measurements.


Usually, it is known from the technical documentation of the measuring instruments that |Δ_syst x_i| ≤ Δ_syst,i with probability P_syst = 1 and that |Δ_rand x_i| ≤ Δ_rand,i with probability greater than or equal to P_rand < 1. So, the inequality |Δx_i| = |Δ_syst x_i + Δ_rand x_i| ≤ Δ_syst,i + Δ_rand,i = Δ_total,i holds with probability P > P_syst·P_rand = P_rand. The value Δ_total,i of the total error bound is a function of the confidence probability P: Δ_total,i = Δ_total,i(P). If we associate the set of intervals J_P = [−Δ_total,i(P), Δ_total,i(P)] with the values α = 1 − P, then the received curve α = α(Δ_total,i) will correspond to the membership function µ(Δ_i) of a fuzzy interval that represents the information about the total error (Fig. 1).

The curve µ(Δ_i) is a symmetrical curvilinear trapezoid. Its upper base represents the information about the systematic part of the error, and its lateral sides describe the known information about the error's random component. The value α is the degree of belief of the statement "the limit possible value of the total error Δx_i of the measurement result x̃_i will be inside the interval J_α".

In [14, 15], it is theoretically justified that the trapezoid α = µ(Δ_i) should have the left and right halves of a Gaussian curve as its lateral sides (Fig. 1c). If experts produce a membership function of another type, it can easily be approximated by a function µ̃(Δ_i) of the necessary form. Indeed, let experts give two sets Δ_ij^(1) ≤ −Δ_syst,i and Δ_ij^(2) ≥ Δ_syst,i, j = 1, 2, …, m, of values that satisfy the conditions µ(Δ_ij^(1)) = µ(Δ_ij^(2)) = α_j, where the α_j are pre-defined degrees of belief. Then the parameters σ^(1) and σ^(2) of the Gaussian curves for the left and right sides of µ̃(Δ_i) can be estimated as

σ^(1) = max_j |Δ_ij^(1) + Δ_syst,i| / √(−2·ln α_j)   and   σ^(2) = max_j (Δ_ij^(2) − Δ_syst,i) / √(−2·ln α_j),

correspondingly. The final value of σ can be taken as max(σ^(1), σ^(2)) for a symmetrical membership function. Some examples of such approximations are presented in Fig. 2.
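A short sketch of this estimation, under the assumption that the expert points are given as pairs of left/right values with their degrees of belief, may look as follows (the function name and the numbers are illustrative only):

```python
import math

def gaussian_side_sigma(points, alpha, delta_syst):
    """Estimate the Gaussian-side parameter of the membership function from
    expert alpha-cut points, following the formulas above.  `points` holds
    pairs (left_j, right_j) with left_j <= -delta_syst <= delta_syst <= right_j,
    and `alpha` the corresponding degrees of belief (0 < alpha_j < 1)."""
    s1 = max(abs(l + delta_syst) / math.sqrt(-2.0 * math.log(a))
             for (l, _), a in zip(points, alpha))
    s2 = max((r - delta_syst) / math.sqrt(-2.0 * math.log(a))
             for (_, r), a in zip(points, alpha))
    return max(s1, s2)        # sigma of the symmetrical membership function

# expert opinions: at belief 0.5 the error lies in [-0.8, 0.7], at 0.1 in [-1.5, 1.4]
print(gaussian_side_sigma([(-0.8, 0.7), (-1.5, 1.4)], [0.5, 0.1], delta_syst=0.2))
```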


Fig. 2. Gaussian-type approximation for membership function of fuzzy interval

constructed by expert evaluations

So, the membership function of a fuzzy interval can be described by only two parameters ⟨Δ₀, σ⟩, where Δ₀ is the value such that µ(Δ_i) = 1 if |Δ_i| ≤ Δ₀ and µ(Δ_i) < 1 if |Δ_i| > Δ₀, and σ is the parameter of the lateral sides of µ(Δ_i).

To process fuzzy intervals, different definitions of arithmetic operations can be used. A rational choice depends on the concrete problem to solve. The general definition is the following. Let µ₁(Δ₁) and µ₂(Δ₂) be the membership functions of two fuzzy intervals Δ_total,1 and Δ_total,2. If we sum them, then the resulting interval Δ_total,3 = Δ_total,1 + Δ_total,2 will have the membership function µ₃(Δ₃) = sup_{Δ₁+Δ₂=Δ₃} T(µ₁(Δ₁), µ₂(Δ₂)), where T is a triangular norm (see [10] for details). For metrology, we should choose such a norm T, or such an algebraically closed family of suitable membership functions, that provides a decrease of fuzziness when we average fuzzy intervals. If we choose the widely used product triangular norm T(a, b) = a·b, then it can be proved [15] that such a family exists, and an example of the corresponding membership function is presented in Fig. 1. This class is closed under addition and multiplication by a constant, as required in Eq. 1. So, to process fuzzy intervals, we can process only the tuples ⟨Δ₀, σ⟩ [16]. Linear operations with fuzzy intervals, which are used in Eq. 1, lead to the following operations with tuples:

⟨Δ₀₁, σ₁⟩ ± ⟨Δ₀₂, σ₂⟩ = ⟨Δ₀₁ + Δ₀₂, √(σ₁² + σ₂²)⟩,    c·⟨Δ₀₁, σ₁⟩ = ⟨|c|·Δ₀₁, |c|·σ₁⟩.

We can see that these rules repeat the well-known rules that are used in metrology for processing systematic errors and the standard deviations of random errors. From [17, 18], we can conclude that we should use values α = 0.05 ÷ 0.10 to get the most credible confidence interval from a fuzzy interval. As was demonstrated in [16], averaging of fuzzy intervals for multiple measurement results reduces the uncertainty of their borders and makes them tend to the classical deterministic interval – in full correspondence with traditional metrology.
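A minimal sketch of this tuple arithmetic (product norm) is given below; the class name, the half-width helper based on the Gaussian lateral sides, and the numerical values are illustrative assumptions:

```python
import math

class FuzzyInterval:
    """Tuple representation <delta0, sigma> of a fuzzy interval (product norm),
    with the linear operations used in Eq. 1.  A simplified sketch of the
    formalism described above, not the authors' library."""
    def __init__(self, delta0, sigma):
        self.delta0, self.sigma = delta0, sigma
    def __add__(self, other):
        return FuzzyInterval(self.delta0 + other.delta0,
                             math.hypot(self.sigma, other.sigma))
    def scale(self, c):
        return FuzzyInterval(abs(c) * self.delta0, abs(c) * self.sigma)
    def half_width(self, alpha):
        """Half-width of the nested interval J_alpha at degree of belief alpha."""
        return self.delta0 + self.sigma * math.sqrt(-2.0 * math.log(alpha))

# propagate through a linearized model dy = c1*dx1 + c2*dx2 (coefficients illustrative)
dx1 = FuzzyInterval(delta0=0.02, sigma=0.01)
dx2 = FuzzyInterval(delta0=0.05, sigma=0.03)
dy = dx1.scale(1.7) + dx2.scale(-0.4)
print(dy.delta0, dy.sigma, dy.half_width(alpha=0.05))
```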

An alternative approach is to choose the Lukasiewicz triangular norm T(a, b) = max(a + b − 1, 0), which is also widely used. In the manner of paper [15], we can prove that the only possible form of the membership function is then a symmetrical curvilinear trapezoid with parabolic lateral sides (left curve in Fig. 3).

Fig. 3 shows the following quantitative property of a fuzzy interval. Let µ_syst(Δ_i) and µ_rand(Δ_i) be the membership functions of the fuzzy intervals that represent the purely systematic Δ_syst x_i and purely random Δ_rand x_i components of the total error Δ_total x_i = Δ_syst x_i + Δ_rand x_i of some quantity x̃_i. Then the membership function of the sum of these two fuzzy intervals is exactly µ(Δ_i), the membership function of the fuzzy interval constructed for the total error. Moreover, we can always break the fuzzy interval of the total error into a sum of fuzzy intervals for the systematic and random error components, and such a decomposition is unique.

Fig. 3. Membership function of fuzzy interval corresponding to Lukasiewicz triangular norm

It can easily be shown [15] that an analogous property holds in the case of the product triangular norm for membership functions with Gaussian lateral sides. In the case of the Lukasiewicz triangular norm, the membership function of a fuzzy interval can also be described by only two parameters, as is clear from Fig. 3. Linear operations with fuzzy intervals turn into rules for these parameters that repeat the well-known rules of metrology, as was already stated for the product norm. The main difference between the Lukasiewicz and product triangular norms for the considered problem is whether the carrier of the fuzzy interval is bounded or not.

If the examined triangular norms are not applicable for some reason, then we can organize a new norm T₁ from the product or Lukasiewicz norm T₀ using the relationship T₁(µ₁, µ₂) = ϕ⁻¹(T₀(ϕ(µ₁), ϕ(µ₂))), where µ₁ and µ₂ are the membership functions being operated on, and ϕ is an arbitrary increasing function that maps [0, 1] → [0, 1] with ϕ(0) = 0 and ϕ(1) = 1. The membership function family that can express a fuzzy interval for the norm T₁ can be obtained by applying the transformation ϕ⁻¹ to all elements of the original function class for the norm T₀.

Fuzzy interval description of measurement inaccuracy is in good agreement

with known approaches [1-6] used for numerical software self-verification. It can

be shown that the fuzzy intervals formalism is in good accordance with

probabilistic [1] and interval [4] arithmetics.

Indeed, let fuzzy intervals represent pure systematic error. Then all linear operations with them will be performed by interval arithmetic. This can easily be seen from the operations with tuples for such a case:

⟨Δ₀₁, 0⟩ ± ⟨Δ₀₂, 0⟩ = ⟨Δ₀₁ + Δ₀₂, 0⟩   and   c·⟨Δ₀₁, 0⟩ = ⟨|c|·Δ₀₁, 0⟩.

We see that these operations are identical to classic interval arithmetic. This was shown for the product norm and can be shown for the Lukasiewicz norm as well. Since interval arithmetic is used when we have only limit values for quantities, we can state that the proposed approach covers this important particular case.

If the fuzzy data obtained from experts are open to question, or the number of experts is insufficient, then the only way to manage data uncertainty is to obtain objective information by performing multiple measurements. In many practical applications, probabilistic arithmetic [1] is used for this purpose. This is a tool for handling imprecise distributions of random variables. It is based on objects called probabilistic boxes (p-boxes for short) that represent the domain of possible values of the cumulative distribution function F(x). A p-box can be expressed in the following form: F(x) ∈ [F̲(x), F̄(x)] for all possible values x of the random variable, where F̲(x) and F̄(x) are the lower and upper bounds of the p-box. It is well known [1] that if we want to sum two variables (possibly correlated) represented by p-boxes F₁(x) ∈ [F̲₁(x), F̄₁(x)] and F₂(x) ∈ [F̲₂(x), F̄₂(x)], then the resulting p-box will have the following bounds:

F₃(z) ∈ [ sup_{z=x+y} max(F̲₁(x) + F̲₂(y) − 1, 0), inf_{z=x+y} min(F̄₁(x) + F̄₂(y), 1) ].
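A brute-force evaluation of these bounds on a discretized grid can be sketched as follows (the grids and the example p-boxes are illustrative; this is not the algorithm of [1]):

```python
import numpy as np

def pbox_sum_bounds(z_grid, x_grid, F1_lo, F1_hi, F2_lo, F2_hi):
    """Bounds for the sum of two variables given by p-boxes, evaluated on a
    coarse grid, following the formula above (a sketch)."""
    lo = np.empty_like(z_grid)
    hi = np.empty_like(z_grid)
    for k, z in enumerate(z_grid):
        y = z - x_grid
        f2_lo = np.interp(y, x_grid, F2_lo, left=0.0, right=1.0)
        f2_hi = np.interp(y, x_grid, F2_hi, left=0.0, right=1.0)
        lo[k] = np.max(np.maximum(F1_lo + f2_lo - 1.0, 0.0))
        hi[k] = np.min(np.minimum(F1_hi + f2_hi, 1.0))
    return lo, hi

# two identical p-boxes for a purely random error bounded by +/- 1
x = np.linspace(-1.0, 1.0, 201)
F_lo = np.clip(x, 0.0, 1.0)            # lower CDF bound
F_hi = np.clip(x + 1.0, 0.0, 1.0)      # upper CDF bound
z = np.linspace(-2.0, 2.0, 11)
print(pbox_sum_bounds(z, x, F_lo, F_hi, F_lo, F_hi))
```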

Let F⁻(x) ∈ [F̲(x), Φ(x)] be the p-box presenting the negative limit value of the purely random component of the measurement error. Here Φ(x) is the Heaviside step function: Φ(x) = 1 for x ≥ 0 and Φ(x) = 0 for x < 0. Let F⁺(x) ∈ [Φ(x), F̄(x)] be the p-box presenting the positive limit value of the same random error. Then we can require symmetry in the following sense: F̲(x) = 1 − F̄(−x). We suppose that there is no systematic error at all, and that is why F̄(x) = 1 and F̲(x) = 0 at x = 0. We can construct the membership function µ from these two p-boxes:

µ(x) = F̲(x) for x ≤ 0   and   µ(x) = 1 − F̄(x) for x > 0.

Then the value α = µ(x) of the belief degree has the meaning that was stated above.

Let us choose the Lukasiewicz triangular norm. Let µ₁(Δ₁) and µ₂(Δ₂) be the membership functions of two fuzzy intervals constructed for the purely random errors of the quantities x̃₁ and x̃₂ (see the right curve in Fig. 3). Then the sum of these fuzzy intervals will have a membership function of the following type: µ₃(Δ₃) = sup_{Δ₁+Δ₂=Δ₃} max(µ₁(Δ₁) + µ₂(Δ₂) − 1, 0). Let F₁⁻(Δ₁) ∈ [F̲₁(x), Φ(x)] and F₁⁺(Δ₁) ∈ [Φ(x), F̄₁(x)] be the p-boxes corresponding to the first fuzzy interval, F₂⁻(Δ₂) ∈ [F̲₂(x), Φ(x)] and F₂⁺(Δ₂) ∈ [Φ(x), F̄₂(x)] those corresponding to the second fuzzy interval, and F₃⁻(Δ₃) ∈ [F̲₃(x), Φ(x)] and F₃⁺(Δ₃) ∈ [Φ(x), F̄₃(x)] those corresponding to the resulting fuzzy interval.

For values Δ₃ ≤ 0, we see total identity between the fuzzy approach and probabilistic arithmetic:

µ₃(Δ₃) = sup_{Δ₁+Δ₂=Δ₃} max(µ₁(Δ₁) + µ₂(Δ₂) − 1, 0) = sup_{Δ₁+Δ₂=Δ₃} max(F̲₁(Δ₁) + F̲₂(Δ₂) − 1, 0) = F̲₃(Δ₃).

Let us examine the values Δ₃ > 0:

µ₃(Δ₃) = sup_{Δ₁+Δ₂=Δ₃} max(µ₁(Δ₁) + µ₂(Δ₂) − 1, 0),

1 − F̄₃(Δ₃) = sup_{Δ₁+Δ₂=Δ₃} max((1 − F̄₁(Δ₁)) + (1 − F̄₂(Δ₂)) − 1, 0),

F̄₃(Δ₃) = 1 − sup_{Δ₁+Δ₂=Δ₃} max(1 − F̄₁(Δ₁) − F̄₂(Δ₂), 0) = inf_{Δ₁+Δ₂=Δ₃} min(F̄₁(Δ₁) + F̄₂(Δ₂), 1).

We have thus established the connection between the fuzzy interval and p-box formalisms for metrological applications. Concerning the relationship with other approaches to dealing with uncertainty, we refer to paper [19], which notes the equivalence of probabilistic arithmetic with a series of other formalisms.

Let us examine an example of constructing a fuzzy interval from empirical data. Let x̃_ij be the values of multiple measurement results of one quantity, j = 1, 2, …, m. We should obtain a fuzzy interval for the uncertainty of its value. From the technical documentation of the measuring instrument used, we can find out the bound Δ_syst,i for the possible systematic error of every x̃_ij. Then we can estimate the bound Δ_rand,i of the possible random error of the obtained mean value x̄_i = (1/m)·Σ_{j=1}^{m} x̃_ij as Δ_rand,i = (s/√m)·t₀.₉₇₅(m − 1), where t₀.₉₇₅(m − 1) is the 97.5% quantile of Student's distribution with parameter (m − 1) and s² = [1/(m − 1)]·Σ_{j=1}^{m} (x̃_ij − x̄_i)² is the dispersion estimate of the random error population. The probability associated with this confidence interval is P = 95%.

Let us choose the product triangular norm; then we can easily construct the fuzzy interval for x̄_i from its tuple ⟨Δ₀, σ⟩ with Δ₀ = Δ_syst,i and σ = Δ_rand,i / √(−2·ln(1 − P)). The value of the Gaussian parameter σ is determined from the following consideration: the nested interval J_{1−P} of the fuzzy interval at the degree of belief equal to (1 − P) must be of the form [−Δ_syst,i − Δ_rand,i, Δ_syst,i + Δ_rand,i].

We can see again that all operations performed to construct fuzzy interval

don’t go beyond the scope of traditional metrology and its approaches.
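A compact sketch of this construction for a sample of repeated readings is given below; it assumes SciPy is available for the Student quantile, and the data are invented for illustration:

```python
import math
from statistics import mean, stdev
from scipy.stats import t    # assumed available for the Student quantile

def fuzzy_interval_from_sample(sample, delta_syst, P=0.95):
    """Build the tuple <delta0, sigma> for the mean of repeated measurements,
    following the construction described above (a sketch; names are ours)."""
    m = len(sample)
    x_bar = mean(sample)
    delta_rand = stdev(sample) / math.sqrt(m) * t.ppf(0.975, m - 1)
    sigma = delta_rand / math.sqrt(-2.0 * math.log(1.0 - P))
    return x_bar, (delta_syst, sigma)

# ten repeated readings with an instrument whose systematic error bound is 0.02
readings = [9.81, 9.83, 9.80, 9.82, 9.81, 9.84, 9.79, 9.82, 9.80, 9.83]
print(fuzzy_interval_from_sample(readings, delta_syst=0.02))
```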

From the results of paper [15], it can be concluded that the natural domain of application of fuzzy intervals is limited to linear operations. Attempts to use them for nonlinear transforms lead to difficulties. That is why it is reasonable to use fuzzy intervals jointly with automatic differentiation techniques. A program code example of a simple realization of automatic differentiation for a real-valued function, oriented to metrological problems, is presented in [20].

Let us turn back to Eq. 1. The partial derivatives ∂f/∂x_i of the function f are computed accurately, but at the inaccurate values x̃₁, …, x̃_n. Thus, the value of Δy can be underestimated, because we use the linearization of f in a slightly different domain: we take the domain [x̃₁ − Δ₁, x̃₁ + Δ₁] × … × [x̃_n − Δ_n, x̃_n + Δ_n] instead of [x₁ − Δ₁, x₁ + Δ₁] × … × [x_n − Δ_n, x_n + Δ_n]. To prevent this situation, we can use the following approach. If automatic differentiation is used to estimate the first-order derivatives, then we can apply this technique again (recursively) to obtain the values of the second-order derivatives. If the errors Δx₁, …, Δx_n are small enough, then the following inequality will hold:

|∂f(x₁, x₂, …, x_n)/∂x_i| ≤ |∂f(x̃₁, x̃₂, …, x̃_n)/∂x_i| + Σ_{j=1}^{n} |∂²f(x̃₁, x̃₂, …, x̃_n)/(∂x_i ∂x_j)|·Δ_total,j.

This improves Eq. 1 and allows us to obtain more correct results. For that, we should use in Eq. 1 the upper bounds on the absolute values of the derivatives produced by the inequality above:

|Δy| ≤ Σ_{i=1}^{n} [ |∂f(x̃₁, x̃₂, …, x̃_n)/∂x_i| + Σ_{j=1}^{n} |∂²f(x̃₁, x̃₂, …, x̃_n)/(∂x_i ∂x_j)|·Δ_total,j ]·Δ_total,i.
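A sketch of the corrected bound is given below. For brevity it approximates the second derivatives by finite differences of a user-supplied gradient; in the scheme described above, both the gradient and the second derivatives would be obtained by (recursive) automatic differentiation. The toy model is illustrative.

```python
import numpy as np

def corrected_delta_y(grad, x_meas, delta_total, h=1e-6):
    """Upper bound on |dy| from the inequality above.  `grad` returns the
    gradient of f (e.g. obtained by automatic differentiation); the second
    derivatives are approximated here by finite differences of `grad`."""
    x = np.asarray(x_meas, dtype=float)
    d = np.asarray(delta_total, dtype=float)
    g = np.abs(np.asarray(grad(x)))
    n = len(x)
    bound = 0.0
    for i in range(n):
        second_row = 0.0
        for j in range(n):
            e = np.zeros(n); e[j] = h
            d2 = abs((grad(x + e)[i] - grad(x - e)[i]) / (2.0 * h))
            second_row += d2 * d[j]
        bound += (g[i] + second_row) * d[i]
    return bound

# toy model f(x1, x2) = x1 * x2**2 with the gradient supplied analytically
grad = lambda x: np.array([x[1] ** 2, 2.0 * x[0] * x[1]])
print(corrected_delta_y(grad, x_meas=[2.0, 3.0], delta_total=[0.05, 0.02]))
```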

In this work, we have shown that it is possible to realize software that presents simultaneously the final result and its uncertainty characteristics. For this, it is proposed to use the combination of the fuzzy interval formalism and the automatic differentiation technique. Realization of this approach requires only very small modifications of the initial software code [21].

References

1. R. Williamson and T. Downs. International Journal of Approximate

Reasoning. 4, 89 (1990).

2. D. Berleant and C. Goodman-Strauss. Reliable Computing. 4 (2), 147

(1998).

3. S. Ferson, V. Kreinovich, L. Ginzburg, D. Myers and K. Sentz.

Constructing probability boxes and Dempster–Shafer structures, Technical

report SAND2002-4015 (2003).

4. R. Moore. Methods and Applications of Interval Analysis. (1979).

5. L. de Figueiredo and J. Stolfi. Numerical Algorithms. 37 (1-4), 147 (2004).

6. M. Berz and G. Hoffstätter. Reliable Computing. 4 (1), 83 (1998).

7. N. Wiener. Proc. of the London Mathematical Society, 19, 185 (1921).

8. L. Kantorovich. Siberian Mathematical Journal. 3, 701 (1962). In Russian.

9. L. Reznik, W. Jonson and G. Solopchenko. Proc. NASA Conf. NAFIPS’94.

405 (1994).

10. S. Salicone. Measurement Uncertainty. An approach via the Mathematical

Theory of Evidence. Springer Series in Reliability Engineering (2007).

11. G. Corliss, C. Faure, A. Griewank, L. Hascoët and U. Naumann. Automatic Differentiation of Algorithms: From Simulation to Optimization. 383 (2002).

12. B. Hall. Measurement Science and Technology. 13 (4), 421 (2002).

13. L. Mari. Measurement. 42 (6), 844 (2009).

14. V. Kreinovich, C. Quintana and L. Reznik. Gaussian membership functions

are most adequate in representing uncertainty in measurements. Technical

Report, University of Texas at El-Paso (1992).

15. K. Semenov. Informatics and its applications. 6 (2), 101 (2012). In Russian.

16. V. Kreinovich, L. Reznik, K. Semenov and G. Solopchenko. Proceedings of

XX IMEKO World Congress. paper IMEKO-WC-2012-ADC-O3 (2012).

17. H. Nguen, V. Kreinovich, C.-W. Tao, G. Solopchenko. Soft Computing in

Measurement and Information Acquisition. 10 (2003).

18. P. Novitskiy and I. Zograf. Errors estimation for measurements results.

(1991). In Russian.


19. H. Regan, S. Ferson, D. Berleant. Equivalence of methods for uncertainty

propagation of real-value random variables. International Journal of

Approximate Reasoning. 36 (1) (2004).

20. K. Semenov. Metrological aspects of stopping iterative procedures in

inverse problems for static-mode measurements. This book.

21. K. Semenov, G. Solopchenko. Theoretical prerequisites for implementation

of metrological self-tracking of measurement data analysis programs.

Measurement techniques. 53 (6), 592 (2010).


Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. – )

TESTING STATISTICAL HYPOTHESES FOR

GENERALIZED SEMIPARAMETRIC PROPORTIONAL

HAZARDS MODELS WITH CROSS-EFFECT OF SURVIVAL

FUNCTIONS∗

M. A. SEMENOVA+ AND E. V. CHIMITOVA

Department of Applied Mathematics, Novosibirsk State Technical University,

Novosibirsk, Russia. E-mail: [email protected]

www.nstu.ru

The paper is devoted to the methods of testing hypotheses on the insignificance of regression parameters and goodness-of-fit hypotheses for the semiparametric proportional hazards model and its generalizations. These generalized models, such as the simple cross-effect model proposed by Bagdonavicius and Nikulin, are well adapted to study the cross-effect of survival functions, which is often observed in clinical trials.

Keywords: Survival analysis; proportional hazards model; simple cross-effect

model; insignificance of regression parameters; goodness-of-fit.

1. Introduction

Survival regression models are used for estimation of the effect of covariates

or stresses on life time and for estimation of the survival functions under

given values of covariates, see Ref. 5.

The most popular and most widely applied survival regression model

is the proportional hazards model (called also the Cox model) introduced

by Sir David Cox. The popularity of this model is based on the fact that

there are simple semiparametric estimation procedures which can be used

when the form of the survival distribution function is not specified, see

Ref. 4. The survival functions for different values of covariates according

to the Cox proportional hazards (PH) model do not intersect. However,

the proportional hazards model is not applicable, when the proportional

hazards assumption does not hold. Then, we need to apply some more

∗This research has been supported by the Russian Ministry of Education and Science (project 2.541.2014K).


sophisticated models which allow decreasing, increasing or nonmonotonic

behavior of the ratio of hazard rate functions.

In Refs. 1 and 2, generalized proportional hazards models are proposed. We will investigate the distributions of the statistics and the power of the Wald test for hypotheses on regression parameters, as well as of the likelihood ratio test and the score test proposed in Ref. 2, by Monte-Carlo simulation methods.

2. Models

Suppose that each individual in a population has a lifetime Tx under a

vector of covariates x = (x1, x2, ..., xm)ᵀ. Let us denote by Sx(t) = P(Tx ≥ t) = 1 − Fx(t) the survival function and by λx(t) and Λx(t) the hazard rate

function and the cumulative hazard rate function of Tx, respectively.

In survival analysis, lifetimes are usually right censored. The observed

data usually are of the form (t1, δ1, x1), ..., (tn, δn, xn), where δi = 1 if ti is

an observed complete lifetime, while δi = 0 if ti is a censoring time, which

simply means that the lifetime of the i-th individual is greater than ti.

The cumulative hazard rate for the Cox proportional hazards model is

given by

Λx(t; β) = exp(βᵀ·x)·Λ0(t),    (1)

where β is the vector of unknown regression parameters, Λ0(t) is the base-

line cumulative hazard rate function, see Ref. 4.

This model implies that the ratio of hazard rates under different values

of covariate x2 and x1 is constant over time:

λx2(t)/λx1(t) = exp(βᵀ·x2)/exp(βᵀ·x1) = const.    (2)

In Fig.1, it is shown that the hazard rates curves for different values of

covariate are parallel for this model with the exponential baseline distribu-

tion and the parameter β = 0.3.

However, this model is rather restrictive and is not applicable when the

ratios of hazard rates are not constant in time. There may be an interaction

between covariates and time, in which case hazards are not proportional.

A more versatile model including not only crossing but also going away

of hazard rates is the simple cross-effect (SCE) model given by (see Ref. 2)

Λx(t; β, γ) = (1 + exp((β + γ)ᵀ·x)·Λ0(t))^{exp(−γᵀ·x)} − 1.    (3)



Fig. 1. Hazard rates under the proportional hazards model

The parameters β and γ are m-dimensional. The ratio

λx2(t)/λx1(t) = [exp(βᵀ·x2)·(1 + exp((β + γ)ᵀ·x2)·Λ0(t))^{exp(−γᵀ·x2)−1}] / [exp(βᵀ·x1)·(1 + exp((β + γ)ᵀ·x1)·Λ0(t))^{exp(−γᵀ·x1)−1}]

is monotone and larger than 1 at t = 0: λx2(0)/λx1(0) = exp(βᵀ·x2)/exp(βᵀ·x1) = c0 > 1.

If γ < 0, then λx2(∞)/λx1(∞) = ∞: the hazard rates diverge and the survival functions do not intersect. As an example, Fig. 2 presents the hazard rate curves for

the SCE model with regression parameters β = 0.3 and γ = −0.25 . The

baseline cumulative hazard rate function was also taken according to the

exponential distribution.

Fig. 2. Hazard rates under the SCE model, γ < 0

Fig. 3. Hazard rates under the SCE model, γ > 0

If γ > 0, then λx2(∞)/λx1(∞) = 0: the hazard ratio decreases from the value

c0 > 1 to 0, i.e. the hazard rates and survival functions intersect once in

the interval (0,∞), see Fig.3 with β = 0.3 and γ = 0.5.
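The qualitative behaviour described above can be reproduced with a short numerical sketch (exponential baseline, one scalar covariate, illustrative parameter values):

```python
import numpy as np

def cum_hazard_ph(t, x, beta, lam0=0.01):
    # Cox PH model, Eq. (1), with exponential baseline Lambda0(t) = lam0 * t
    return np.exp(beta * x) * lam0 * t

def cum_hazard_sce(t, x, beta, gamma, lam0=0.01):
    # simple cross-effect model, Eq. (3)
    return (1.0 + np.exp((beta + gamma) * x) * lam0 * t) ** np.exp(-gamma * x) - 1.0

t = np.linspace(0.1, 50.0, 500)
beta, gamma = 0.3, 0.5
h_ratio_ph = (np.gradient(cum_hazard_ph(t, 1, beta), t) /
              np.gradient(cum_hazard_ph(t, 0, beta), t))
h_ratio_sce = (np.gradient(cum_hazard_sce(t, 1, beta, gamma), t) /
               np.gradient(cum_hazard_sce(t, 0, beta, gamma), t))
print(h_ratio_ph[[0, -1]])    # constant ratio exp(beta)
print(h_ratio_sce[[0, -1]])   # decreases from exp(beta) > 1 towards 0
```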


To estimate the regression parameters β, γ and baseline hazard rate

function Λ0 (t) for these models we can use the modified maximum likeli-

hood method, in particular partial likelihood method for the proportional

hazards model and simple cross-effect model, see Ref. 1.

3. Hypothesis and Tests

In this paper, the hypothesis of goodness-of-fit for proportional hazards

model is considered. This hypothesis has the following form

H0: Λx(t; β) = exp(βᵀ·x)·Λ0(t),    (4)

and can be tested by several methods. We investigated the method based on the residuals Ri = Λx(ti; β̂), i = 1, …, n, which should fit closely to the standard exponential distribution if the tested model is indeed "correct". In the case of parametric models, the hypothesis of the exponential distribution of the residuals can be tested by the Kolmogorov, Cramer-von Mises-Smirnov and Anderson-Darling tests (Ref. 3). However, in the case of semiparametric models, the distributions of the test statistics need to be simulated by algorithms based on the identification of an appropriate parametric model.

Another goodness-of-fit test is oriented against the wide class of alternatives H1: Λx(t; β, γ) = (1 + exp((β + γ)ᵀ·x)·Λ0(t))^{exp(−γᵀ·x)} − 1, including monotone hazard ratios and crossing of survival functions (Ref. 2). The statistic of this test can be written as

T = n−1UT D−1U, (5)

where U_k = Σ_{i: δi=1} [ −x_ik·ln(1 + exp(β̂′x_i)·Λ̂0(t_i)) − S̃1k(t_i)/S0(t_i) ], k = 1, …, m, the covariance matrix of the vector U is D = Σ** − Σ*·Σ0⁻¹·Σ*ᵀ, and

Σ0 = (1/n)·Σ_{i: δi=1} [ S2(t_i)/S0(t_i) − E·Eᵀ ],    Σ* = (1/n)·Σ_{i: δi=1} [ S̃2(t_i)/S0(t_i) − E·Ẽᵀ ],    Σ** = (1/n)·Σ_{i: δi=1} [ S̃̃2(t_i)/S0(t_i) − Ẽ·Ẽᵀ ],

E_k = S1k(t_i)/S0(t_i),    Ẽ_k = S̃1k(t_i)/S0(t_i),

S0(t_i) = Σ_{j: tj≥ti} exp(β̂′x_j),    S1(t_i) = Σ_{j: tj≥ti} x_j·exp(β̂′x_j),    S2(t_i) = Σ_{j: tj≥ti} x_j·(x_j)ᵀ·exp(β̂′x_j),

S̃1(t_i) = Σ_{j: tj≥ti} x_j·exp(β̂′x_j)·ln(1 + exp(β̂′x_j)·Λ̂0(t_i)),    S̃2(t_i) = Σ_{j: tj≥ti} x_j·(x_j)ᵀ·exp(β̂′x_j)·ln(1 + exp(β̂′x_j)·Λ̂0(t_i)),

S̃̃2(t_i) = Σ_{j: tj≥ti} x_j·(x_j)ᵀ·exp(β̂′x_j)·ln²(1 + exp(β̂′x_j)·Λ̂0(t_i)).


Statistic (5) under true hypothesis H0 belongs to the chi-squared dis-

tribution with m degrees of freedom as n→∞.

Hypothesis (4) against alternative of simple cross-effect model can be

written as hypothesis of insignificance of the regression parameters of gen-

eralized model

H0 : Λx

(t; β, γ0

)=(

1 + exp(

(β + γ0)T · x)Λ0 (t)

)exp(−γT0 ·x)− 1, (6)

where γ0 = [0, 0, ..., 0]T

and β is the estimation of regression parameters

for the proportional hazards model. In this case, one can test H0 by the

likelihood ratio test or the Wald test. The statistic of likelihood ratio test

has the following form

LR = 2·l(θ̂) − 2·l(θ̂0),    (7)

where l(θ) is the partial likelihood function for the simple cross-effect model, θ̂ = [β̂, γ̂]ᵀ are the estimates of the regression parameters for the simple cross-effect model, θ̂0 = [β̂0, γ0]ᵀ is the maximum likelihood estimate of the parameters for the proportional hazards model and γ0 = [0, 0, ..., 0]ᵀ. Statistic (7) under the true hypothesis of the proportional hazards model follows the chi-squared distribution with p degrees of freedom as n → ∞, where p is the number of parameters estimated.

The Wald statistic

W = (θ̂ − θ̂0)ᵀ·I(θ̂)·(θ̂ − θ̂0)    (8)

can also be used for testing the insignificance of the regression parameters of the simple cross-effect model, where I(θ̂) = [∂²l(θ̂)/(∂θi·∂θj)]_{i,j} is the observed Fisher information matrix, i, j = 1, ..., m. The statistic W is asymptotically distributed according to the χ²-distribution with p degrees of freedom.
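For orientation, the two statistics and their asymptotic p-values can be computed as in the following sketch (SciPy is assumed for the chi-squared tail; the numbers are invented and are not results of the paper):

```python
import numpy as np
from scipy.stats import chi2   # assumed available for the chi-squared limit law

def lr_test(loglik_sce, loglik_ph, p):
    """Likelihood ratio statistic (7) and its asymptotic p-value."""
    LR = 2.0 * (loglik_sce - loglik_ph)
    return LR, chi2.sf(LR, df=p)

def wald_test(theta_hat, theta_0, info_matrix):
    """Wald statistic (8) and its asymptotic p-value (df = len(theta))."""
    d = np.asarray(theta_hat) - np.asarray(theta_0)
    W = float(d @ np.asarray(info_matrix) @ d)
    return W, chi2.sf(W, df=len(d))

# illustrative numbers only (not the partial likelihoods of the paper)
print(lr_test(loglik_sce=-480.2, loglik_ph=-483.1, p=2))
print(wald_test([0.31, 0.45], [0.31, 0.0], [[120.0, 10.0], [10.0, 35.0]]))
```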

4. Research Results

We have investigated the convergence of the test statistics distributions to

the corresponding limiting chi-squared distributions. It has been shown, the

distributions of statistic (5) and the likelihood ratio statistic converge to

the corresponding chi-squared distributions beginning with the sample size

n = 50.

Fig.4 and Fig.5 present the distributions of likelihood ratio (LR) test

statistic (7) and Wald test statistic (8) under null hypothesis about the


proportional hazard model against alternative hypothesis of simple cross-

effect model for type II censored samples, sample size n = 100, regression

parameter β = 0.3, and various censoring degrees: 0%−50%. The covariate

x is constant in time and takes one of the two values: 0 or 1. We gener-

ated equal numbers of observations corresponding to different values of the

covariate. The numbers of censored observations in groups of objects with

different values of covariate were also taken to be equal. The number of

simulations used is N = 10000.

Fig. 4. Statistic distributions of LR test Fig. 5. Statistic distributions of Wald test

In Fig.6, the distributions of test statistic T are presented for the null

hypothesis of the proportional hazards model with two constant covariates x1, x2 ∈ {0, 1}, regression parameters β1 = 0.3, β2 = 0.6, and various sample sizes.

As can be seen, the likelihood ratio test statistic distribution and the T statistic distributions for complete samples are close to the corresponding limiting distribution, while the Wald test statistic distribution differs significantly from the χ²₁-distribution. This fact is a great disadvantage of the Wald test and can lead to incorrect inference using statistic (8) and the limiting distribution. Moreover, the investigation shows that the convergence rate of all considered test statistic distributions to the chi-squared distribution depends on the censoring degree and the number of covariates: the larger the number of censored observations or the number of covariates, the more slowly the empirical distributions of the test statistics approach the χ²-distribution.

We compared likelihood ratio test, Wald test and test with statistic T

in terms of power for H0 (the adequacy of proportional hazards model) and

two alternative hypotheses of simple cross-effect models:

• H1: the hazard rates and survival functions intersect once, x ∈ {0, 1}, β = 0.3, γ = 0.8;


Fig. 6. Distributions of test statistic T

• H2: the hazard rates go away from each other and the survival functions do not intersect, x1 ∈ {0, 1}, x2 ∈ {0, 1, 2, 3}, β1 = 0.3, β2 = −0.2, γ1 = γ2 = −0.4.

For all cases, the powers were calculated basing on the distributions of

the test statistics under null and competing hypotheses, which were simu-

lated for Type II censored samples of size n = 100. The parameters of the

model under H0 were estimated by the maximum likelihood method. The

number of simulations used is N = 10000. The values of the tests power

were calculated for the significance level α = 0.1. The estimated powers of

considered tests are given in Table 1.

Table 1. Estimated powers of considered tests.

Hypothesis   Test   0%     10%    20%    30%
H1           LR     0.12   0.12   0.11   0.10
             W      0.24   0.16   0.12   0.10
             T      0.27   0.18   0.14   0.12
H2           LR     0.18   0.17   0.17   0.16
             W      0.16   0.15   0.15   0.14
             T      0.17   0.16   0.15   0.13

In general, the powers of the likelihood ratio test, the Wald test, and the test with statistic T presented in Table 1 decrease when the number of censored observations grows.

In the case of alternative hypothesis H1, the test with statistic T has a higher power in comparison with the power of the likelihood ratio and Wald tests. However, when the hazard rates go away from each other, as in the case


of alternative hypothesis H2 the power of the likelihood ratio test exceeds

the power of other tests in the study.

The results of the investigation of the test statistic distributions and the comparison of test powers obtained in this paper allow us to recommend using the likelihood ratio statistic LR, the statistic T, and the corresponding limiting chi-squared distributions for testing goodness-of-fit of the proportional hazards model against the model with cross-effect of survival functions.

References

1. Bagdonavicius, V., Nikulin, M.: Accelerated Life Models. Boca Raton, Chapman and Hall/CRC (2002)
2. Bagdonavicius, V., Levuliene, R., Nikulin, M.: Modeling and testing of presence of hazard rates crossing under censoring. Comm. in Stat. - Sim. and Comp., 41, 980-991 (2012)
3. Balakrishnan, N., Chimitova, E., Galanova, N., Vedernikova, M.: Testing goodness-of-fit of parametric AFT and PH models with residuals. Comm. in Stat. - Sim. and Comp., 42, 1352-1367 (2013)
4. Cox, D.R.: Regression models and life tables (with discussion). Journal of the Royal Statistical Society, Series B, 34, 187-220 (1972)
5. Klein, J.P., Moeschberger, M.L.: Survival analysis. New York, Springer (1997).


Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. – )

NOVEL REFERENCE VALUE AND DOE DETERMINATION

BY MODEL SELECTION AND POSTERIOR PREDICTIVE

CHECKING

K. SHIRONO∗, H. TANAKA, M. SHIRO AND K. EHARA

National Metrology Institute of Japan (NMIJ), National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Ibaraki 305-8565, Japan
∗E-mail: [email protected]

Bayesian analyses with models in which unknown biases and unknown variances are respectively considered are applied to interlaboratory comparison data, using priors chosen through maximization of the marginal likelihood. Both types of analysis seem appropriate to be employed in cases where more than half of the reported data are considered to be consistent.

Keywords: Key comparison; Uncertainty; Marginal likelihood

1. Introduction

This paper describes the theoretical background and practical applicabil-

ity of two statistical models for handling interlaboratory comparison test

data. In the theoretical section, a parameter determination method using

the marginal likelihood is proposed. To discuss their applicability, the ref-

erence values and degrees of equivalence (DOEs) yielded by these models

are investigated.

A guideline proposed by Cox1 for statistical methods in interlaboratory

comparison tests provides Procedures A and B for consistent and inconsis-

tent comparison data, respectively. However, there has been a great deal

of discussion on the statistical handling of inconsistent data including the

analysis using the largest consistent subset.2 (See Refs. 3, 4 and the paper

cited therein.)

A statistical method applicable to both consistent and inconsistent data

is proposed in this study. The proposals made in our previous papers5,6 can-

not be applied to consistent data. In the present study, a parameter set for

each value is adjusted according to its consistency, so that robust analy-

sis can be implemented without explicit outlier removal. The following two


technically specific approaches are employed in this study: (i) Selection of

the priors, which is a fundamental task in Bayesian statistics, is conducted

through maximization of the marginal likelihood. (ii) Statistical models for

determination of the DOEs are developed.

The following two statistical models are employed in this study:

(1) a model in which unknown biases are considered, and

(2) a model in which unknown variances are considered.

There may be discussions on the employment of the data in choosing priors.

Demonstrations are, hence, implemented to show the practical applicability

of the proposed methods.

2. Statistical Models

2.1. Model with unknown biases (bias model)

In this model, unknown biases are considered. It is assumed that n labora-

tories participated in the comparison test, and that Laboratory i reported

the measurement value xi and its standard uncertainty ui (i = 1, 2, ..., n).

Let u2i be qi for simplicity. The vectors x = (x1, x2, ..., xn)T and q = (q1,

q2, ..., qn)T are defined, where T is the transpose.

xi is assumed to be derived from the normal distribution with a mean

of µ + ζi and a population variance of qi. In other words, while µ is the true

value of the measurand, ζi is the unknown bias intrinsic to Laboratory i.

The incorporation of the parameter ζi in the statistical model consequently

provides robust analysis. The vector ζ = (ζ1, ζ2, ..., ζn)T is defined.

Thus, the likelihood, l(µ, ζ|x, q), is given as follows:

l(µ, ζ|x, q) =n∏

i=1

(2πqi)−1/2 exp

(−

n∑i=1

xi − (µ+ ζi)2

2qi

). (1)

Although the likelihood can be derived as above, the priors p(µ) and

p(ζi) cannot be given uniquely. The following priors, however, may be ac-

ceptable:

\[
p(\mu) \propto 1 \quad (-\infty \le \mu \le +\infty), \qquad (2)
\]
\[
p(\zeta_i) = (2\pi \eta_i)^{-1/2} \exp\left( -\frac{\zeta_i^2}{2\eta_i} \right). \qquad (3)
\]

The vector η = (η1, η2, ..., ηn)T is defined here; it is the parameter of this
model.


Concerning ηi, although it may be acceptable that ηi → ∞ because

of the prior’s non-informativeness, the posterior becomes improper in such

a case. Hence, analyses cannot be conducted using that prior. To avoid

improperness and allow ηi to be chosen objectively, maximization of the

marginal likelihood is applied. The details are described in Section 3.

It is natural to let the reference value, Xbias, and its standard uncer-

tainty, u(Xbias), be the mean and the standard deviation of the poste-

rior of µ. The posterior is a normal distribution whose mean and standard deviation are
\[
X_{\mathrm{bias}} = \frac{\sum_{i=1}^{n} x_i/(q_i + \eta_i)}{\sum_{i=1}^{n} 1/(q_i + \eta_i)}, \qquad u(X_{\mathrm{bias}}) = \left( \sum_{i=1}^{n} \frac{1}{q_i + \eta_i} \right)^{-1/2},
\]
respectively.

This model is referred to as the bias model in this study.
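As a minimal illustration only (a Python sketch with hypothetical function and variable names; the paper itself prescribes no implementation), the reference value and its standard uncertainty for a given parameter vector η reduce to a weighted-mean computation:

```python
import numpy as np

def bias_model_reference(x, q, eta):
    """Posterior mean X_bias and standard uncertainty u(X_bias) of mu in the
    bias model, for reported values x, squared uncertainties q = u_i**2 and
    prior variances eta of the laboratory biases (illustrative sketch)."""
    x, q, eta = map(np.asarray, (x, q, eta))
    w = 1.0 / (q + eta)                 # weights 1/(q_i + eta_i)
    X_bias = np.sum(w * x) / np.sum(w)  # weighted mean
    u_X_bias = np.sum(w) ** -0.5        # standard uncertainty
    return X_bias, u_X_bias
```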

2.2. Model with unknown variances (variance model)

In this model, the unknown variances are considered in order to analyze

the data robustly. n, xi, qi (i = 1, 2, ..., n), and the vectors x = (x1, x2,

..., xn)T and q = (q1, q2, ..., qn)T are defined as in Subsection 2.1.

It is assumed that xi is derived from the normal distribution with the

mean µ and the variance qi+θi, where θi is the unknown variance other

than the reported variance qi. The parameter θi is incorporated into the

statistical model to accomplish robust analysis of the data. The vector θ =

(θ1, θ2, ..., θn)T is defined. Thus, this model differs from statistical models
in which a common random effect is considered, such as that in ISO 5725-2.7

The likelihood of the unknown parameters µ and θ is given as follows:
\[
l(\mu, \boldsymbol{\theta} \mid \boldsymbol{x}, \boldsymbol{q}) = \prod_{i=1}^{n} \left(2\pi (q_i + \theta_i)\right)^{-1/2} \exp\left( -\sum_{i=1}^{n} \frac{(x_i - \mu)^2}{2(q_i + \theta_i)} \right). \qquad (4)
\]

With regard to the priors, p(µ) is given in Eq. (2). Although the priors

of θi, p(θi) (i = 1, 2, ..., n), are not uniquely chosen, the following delta

function form is employed in this study:

p(θi) = δ(θi − φi), (5)

where φi is the parameter determined through maximization of the marginal

likelihood. The vector φ = (φ1, φ2, ..., φn)T is defined here. The marginal

likelihood shown in Section 3 when using the delta function form is larger

than or equal to that when using any other function form. Thus, even if

another function form is employed as the priors, the parameters are chosen

in order for the variance of the distribution to be zero when possible. This

means that, irrespective of the functional form of the priors, the results are the

same as those with priors having the delta function form.


It is natural to let the reference value, X_var, and its standard uncertainty,
u(X_var), be the mean and the standard deviation of the posterior of µ. Thus,
\[
X_{\mathrm{var}} = \frac{\sum_{i=1}^{n} x_i/(q_i + \phi_i)}{\sum_{i=1}^{n} 1/(q_i + \phi_i)}, \qquad u(X_{\mathrm{var}}) = \left( \sum_{i=1}^{n} \frac{1}{q_i + \phi_i} \right)^{-1/2}.
\]

This model is referred to as the variance model in this study.

3. Marginal likelihood

Since p(µ) is improper, the marginal likelihood cannot be defined in the usual way.^a It is redefined in this study by regarding the prior of Eq. (2) as the limit of the proper prior p(µ) = 1/(2C) (−C ≤ µ ≤ +C) as C → +∞.

In the bias model, the marginal likelihood in this study is redefined as

follows:

\[
\Lambda_{\mathrm{bias}} = \int_{W(\boldsymbol{\zeta})} \int_{-\infty}^{+\infty} l(\mu, \boldsymbol{\zeta} \mid \boldsymbol{x}, \boldsymbol{q})\, p(\boldsymbol{\zeta})\, d\mu\, d\boldsymbol{\zeta}, \qquad (6)
\]

where W(ζ) is the integration range {ζ : −∞ < ζi < +∞, i = 1, ...,
n}. Equation (6) is obtained by multiplying the usual marginal likelihood

defined with the above proper prior by the constant 2C and taking its limit

as C → +∞. It is obvious that η obtained by maximizing the marginal

likelihood according to this definition is the same as that obtained by max-

imizing the usual marginal likelihood.

The integration calculation of Eq. (6) yields the following simple math-

ematical expression of Λbias:

\[
\Lambda_{\mathrm{bias}} = \frac{(2\pi)^{1/2}\, u(X_{\mathrm{bias}})}{\prod_{i=1}^{n} \left(2\pi (q_i + \eta_i)\right)^{1/2}} \exp\left( -\frac{1}{2} \sum_{i=1}^{n} \frac{(x_i - X_{\mathrm{bias}})^2}{q_i + \eta_i} \right), \qquad (7)
\]

where X_bias and u(X_bias) are as defined in Subsection 2.1. The vector η
maximizing Λ_bias is denoted by η̂ = (η̂1, η̂2, ..., η̂n)T.

The marginal likelihood of the variance model, Λvar, is given similarly

as follows:

\[
\Lambda_{\mathrm{var}} = \frac{(2\pi)^{1/2}\, u(X_{\mathrm{var}})}{\prod_{i=1}^{n} \left(2\pi (q_i + \phi_i)\right)^{1/2}} \exp\left( -\frac{1}{2} \sum_{i=1}^{n} \frac{(x_i - X_{\mathrm{var}})^2}{q_i + \phi_i} \right), \qquad (8)
\]
where X_var and u(X_var) are as defined in Subsection 2.2. The vector φ
maximizing Λ_var is denoted by φ̂ = (φ̂1, φ̂2, ..., φ̂n)T.

^a The usual marginal likelihood for the bias model is defined as Λ = ∫_{W(ζ)} ∫_{−∞}^{+∞} l(µ, ζ | x, q) p(µ) p(ζ) dµ dζ. See above for the definition of W(ζ).


It is found from a comparison between Eq. (7) and Eq. (8) that the

maximized values of Λbias and Λvar are equal to each other. This implies that

it is impossible to choose the better of the two models using the marginal

likelihood. Moreover, it can be said that η̂ = φ̂.
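As a rough numerical sketch of this maximization (Python; the optimizer, starting point and bounds are our own assumptions, since the paper does not prescribe how the maximization is carried out), η may be chosen by maximizing the logarithm of Λ_bias in Eq. (7):

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_marginal_likelihood(eta, x, q):
    """Negative logarithm of Lambda_bias, Eq. (7), for a given eta >= 0."""
    w = 1.0 / (q + eta)
    X = np.sum(w * x) / np.sum(w)               # X_bias for this eta
    u_X = np.sum(w) ** -0.5                     # u(X_bias) for this eta
    log_lam = (0.5 * np.log(2.0 * np.pi) + np.log(u_X)
               - 0.5 * np.sum(np.log(2.0 * np.pi * (q + eta)))
               - 0.5 * np.sum(w * (x - X) ** 2))
    return -log_lam

def choose_eta(x, q):
    """Choose eta by maximizing the marginal likelihood (sketch only)."""
    x, q = np.asarray(x, float), np.asarray(q, float)
    res = minimize(neg_log_marginal_likelihood, x0=np.zeros_like(x), args=(x, q),
                   method="L-BFGS-B", bounds=[(0.0, None)] * len(x))
    return res.x                                # the maximizing vector (eta-hat)
```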

4. Degree of equivalence (DOE)

The statistical models optimized above are appropriate only for a robust analysis.
To determine the DOE, a statistical model without an additional bias or
variance on the data of the laboratory concerned must be considered. For
a robust and reasonable analysis, statistical models for the DOE are
introduced, and the DOEs are obtained as model-checking indexes.

4.1. DOE in the bias model

When Laboratory k's performance is examined, the bias model with the
parameter η0(k) = (η̂1, η̂2, ..., η̂_{k−1}, η_k = 0, η̂_{k+1}, ..., η̂_n)T is considered. In
this modified form of the bias model, since η_k = 0 is given, the bias ζ_k is
fixed at zero. On the other hand, η_i = η̂_i for all of the laboratories other
than Laboratory k.

Here, the following statistic to be employed in the posterior predictive

checking is proposed:

Rk = xk −Xbias|η=η0(k). (9)

This statistic was employed by Kacker et al.8 for a model without biases, and
it yields a DOE mathematically identical to that in the guideline by Cox.1

Since the posterior predictive distribution of x is, to a good approximation, a
multivariate normal distribution and X_bias is a linear combination of
x1, x2, ..., xn, the posterior predictive distribution of the replicated value
of R_k, R^rep_k, is easily obtained, as shown in the Appendix.

Employing the mean R̄^rep_k and the standard deviation u(R^rep_k), the value
and uncertainty parts of the unilateral DOE of Laboratory k, d^bias_k and
U^bias_k, are given as follows:
\[
(d^{\mathrm{bias}}_k, U^{\mathrm{bias}}_k) = \left( R_k - \bar{R}^{\mathrm{rep}}_k,\; 2u(R^{\mathrm{rep}}_k) \right). \qquad (10)
\]

When the absolute value of E^bias_n = d^bias_k / U^bias_k is greater than 1, the
hypothesis "Laboratory k's bias ζ_k is zero" is rejected and it can be con-
cluded that the performance of Laboratory k is "unsatisfactory." On the
other hand, when |E^bias_n| ≤ 1, it is "satisfactory."

The above unilateral DOE is a statistic only applicable to model check-

ing in relation to Laboratory k; that is, the models to reduce the unilateral


DOEs are different for each of the laboratories. This makes the statistical

meaning of the DOE clearer than in the previous DOE concept.1

4.2. DOE in the variance model

In the variance model, the posterior predictive test is applied to the model

with the prior parameter φ0(k) = (φ1, φ2, ..., φk−1, φk = 0, φk+1, , φn)T.

In this model, the additional uncertainty of Laboratory k is not given.

Here, the following statistic is proposed:

Tk = xk −Xvar|φ=φ0(k). (11)

As with the bias model, the posterior predictive distribution of T_k is easily

yielded. In this model, the DOE of Laboratory k is given in a relatively

simple form, as follows:

\[
(d^{\mathrm{var}}_k, U^{\mathrm{var}}_k) = \left( x_k - X_{\mathrm{var}}|_{\boldsymbol{\phi}=\boldsymbol{\phi}_0(k)},\; 2\sqrt{q_k - u^2\!\left(X_{\mathrm{var}}|_{\boldsymbol{\phi}=\boldsymbol{\phi}_0(k)}\right)} \right). \qquad (12)
\]

The performance evaluation using E^var_n = d^var_k / U^var_k is similar to that using
E^bias_n.

5. Demonstrations

In this section, demonstrations with the following cases are implemented:

(1) Case 1: A major consistent subset and a few outliers are found.

(2) Case 2: No consistent subset is found.

The values and analysis results are shown in Table 1 and Fig. 1. It is noted

that Xbias = Xvar, because η = φ.

In Case 1, four of the six reported values are identical (x1 = ... = x4 =

1), and the other two are outliers (x5 = 5 and x6 = 6). Figure 2 shows the

calculated marginal likelihoods, Λbias and Λvar, as the functions of (a) η1 and φ1 and (b) η5 and φ5. This graph implies that the additional bias or

variance is not given for the relatively consistent reported values (as x1), but

for the relatively inconsistent reported values (as x5). Since the determined

reference value (Xbias = Xvar) is close to 1, it can be said that both of the

models provide robust analyses. Furthermore, the standard uncertainty of

the reference value, 0.50, is the same as the standard uncertainty of the

weighted mean of the four consistent values. Although the outliers are not

explicitly removed in either the bias or the variance models, the analyses

are not influenced by the outliers. In this case, the analysis with the largest

consistent subset,2 in which the outliers are removed explicitly, offers almost


Table 1. Input and output of the two demonstrations.

Case 1:
  xi   ui    ηi or φi   E^bias_n   E^var_n
  1    1     0.0        0.0        0.0
  1    1     0.0        0.0        0.0
  1    1     0.0        0.0        0.0
  1    1     0.0        0.0        0.0
  5    0.1   3.92       4.0        4.0
  6    0.1   4.92       5.0        5.0
  X_bias|η=η̂ = X_var|φ=φ̂ = 1.11,  u(X_bias|η=η̂ = X_var|φ=φ̂) = 0.50

Case 2:
  xi   ui    ηi or φi   E^bias_n   E^var_n
  1    0.1   2.62       −0.34      −2.4
  2    0.1   1.72       −0.20      −1.5
  3    0.1   0.92       −0.07      −0.6
  4    0.1   0.92        0.07       0.6
  5    0.1   1.72        0.20       1.5
  6    0.1   2.62        0.34       2.4
  X_bias|η=η̂ = X_var|φ=φ̂ = 3.50,  u(X_bias|η=η̂ = X_var|φ=φ̂) = 0.55


Fig. 1. Input values for the two demonstrations shown in Table 1. The error bars

correspond to the expanded uncertainties (k = 2).

the same En numbers, as follows: En = 0.0, 0.0, 0.0, 0.0, 3.9, and 4.9 for

Laboratories 1 to 6, respectively.

The calculated En numbers seem pertinent for the performance evalua-

tion. Only the absolute values of the two outliers’ En numbers are greater

than 1 in both models, denoting ”unsatisfactory” performances. Thus, the

proposed method can give the appropriate analysis using both the bias and

variance models. In Case 2, no consistent subset is found. The reported

values vary uniformly from 1 to 6 with a standard uncertainty of 0.1. The

reference value (Xbias = Xvar) is 3.5. Since this is in good agreement with

the arithmetic mean of the reported values, it is intuitively considered to

be pertinent. The standard uncertainty of 0.55 is slightly smaller than the

standard deviation of the arithmetic mean (0.71).



Fig. 2. Calculated marginal likelihoods, Λbias and Λvar, with the data of Case 1 as the functions of (a) η1 or φ1 and (b) η5 or φ5, with ηi = η̂i or φi = φ̂i fixed for i other than
1 and 5, respectively.

On the other hand, the results of calculating the En numbers might

be unacceptable. With the bias model, all En numbers are smaller than 1,

denoting ”satisfactory” performance. Even if this performance evaluation

is correct from a statistical perspective, we must say that it is poor from

a practical standpoint. With the variance model, the performances of four

laboratories are evaluated as ”unsatisfactory.” However, a question remains

with regard to evaluating the other two laboratories’ performances as ”sat-

isfactory” without consideration of the possible existence of a large and

dominant unknown uncertainty source.

Qualitatively, it can be said that the proposed method can be employed

when more than half of the reported values are consistent. Since this situ-

ation is common in key comparisons, the proposed method is useful in the

analysis of key comparisons. In particular, when it is technically certain

that there are no large unknown uncertainty sources, the variance model

can be employed. Development of a quantitative approach to the applicability
from the statistical point of view will be a future task.

6. Summary

Bayesian analyses using statistical models in which unknown biases and

variances are respectively considered are proposed for analysis of interlab-

oratory comparison test data. Both of the proposed analyses are applicable


when more than half of the reported values seem consistent; in other words,

in almost all cases of key comparison.


Acknowledgments

This work was supported by a Grant-in-Aid for Scientific Research (No.
26870899) from the Japan Society for the Promotion of Science (JSPS).

Appendix A. Calculation of R̄^rep_k and u(R^rep_k)

In this appendix, the equations to calculate R̄^rep_k and u(R^rep_k) are provided.

The posterior predictive distribution of xrep for the bias model with the

parameter of η, p(xrep|x), is given as follows:

\[
p(\boldsymbol{x}^{\mathrm{rep}} \mid \boldsymbol{x}) \propto \int_{W(\boldsymbol{\zeta})} \int_{-\infty}^{+\infty} \exp\left( -\sum_{i=1}^{n} \frac{\left(x^{\mathrm{rep}}_i - (\mu + \zeta_i)\right)^2}{2 q_i} \right) l(\mu, \boldsymbol{\zeta} \mid \boldsymbol{x}, \boldsymbol{q})\, p(\boldsymbol{\zeta})\, d\mu\, d\boldsymbol{\zeta}
\]
\[
\propto \exp\left( -\frac{1}{2} (\boldsymbol{x}^{\mathrm{rep}} - \boldsymbol{A}^{-1}\boldsymbol{B}\boldsymbol{x})^{\mathrm{T}} \boldsymbol{A}\, (\boldsymbol{x}^{\mathrm{rep}} - \boldsymbol{A}^{-1}\boldsymbol{B}\boldsymbol{x}) \right), \qquad (\mathrm{A.1})
\]

where A and B are n × n matrices whose respective (i, j) components a_ij and b_ij are given as follows:

\[
a_{ij} = \delta_{ij}\left( \frac{1}{2q_i + 4\eta_i} + \frac{1}{2q_i} \right) - \frac{(2q_i + 4\eta_i)^{-1}(2q_j + 4\eta_j)^{-1}}{\sum_{m=1}^{n} (2q_m + 4\eta_m)^{-1}}, \qquad (\mathrm{A.2})
\]
\[
b_{ij} = \delta_{ij}\left( -\frac{1}{2q_i + 4\eta_i} + \frac{1}{2q_i} \right) + \frac{(2q_i + 4\eta_i)^{-1}(2q_j + 4\eta_j)^{-1}}{\sum_{m=1}^{n} (2q_m + 4\eta_m)^{-1}}. \qquad (\mathrm{A.3})
\]

Here, δij means the Kronecker delta. Thus, xrep is derived from the mul-

tivariate normal distribution with the mean and the variance-covariance

matrix of A−1Bx and A−1, respectively.

The replicated value of X_bias, X^rep_bias, can be expressed as follows:
\[
X^{\mathrm{rep}}_{\mathrm{bias}} = \boldsymbol{c}^{\mathrm{T}} \boldsymbol{x}^{\mathrm{rep}}, \qquad (\mathrm{A.4})
\]
where, letting Q and H be the n × n diagonal matrices whose (i, i) components are q_i and η_i respectively, and 1 the vector of size n whose components are all 1, c^T = (1^T (Q + H)^{-1} 1)^{-1} 1^T (Q + H)^{-1}.

The replicated value of R_k, R^rep_k, is hence given by the following linear expression:
\[
R^{\mathrm{rep}}_k = x^{\mathrm{rep}}_k - X^{\mathrm{rep}}_{\mathrm{bias}} = (\boldsymbol{e}_k - \boldsymbol{c})^{\mathrm{T}} \boldsymbol{x}^{\mathrm{rep}}, \qquad (\mathrm{A.5})
\]
where e_k is the vector of size n whose i-th component is δ_{ik}. Thus, the mean value and the variance of R^rep_k, namely R̄^rep_k and u²(R^rep_k), are given as follows:
\[
\bar{R}^{\mathrm{rep}}_k = (\boldsymbol{e}_k - \boldsymbol{c})^{\mathrm{T}} \boldsymbol{A}^{-1} \boldsymbol{B} \boldsymbol{x}, \qquad (\mathrm{A.6})
\]


\[
u^2(R^{\mathrm{rep}}_k) = (\boldsymbol{e}_k - \boldsymbol{c})^{\mathrm{T}} \boldsymbol{A}^{-1} (\boldsymbol{e}_k - \boldsymbol{c}). \qquad (\mathrm{A.7})
\]
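A compact numerical sketch (Python with NumPy; the names are illustrative, and the matrix expressions simply transcribe Eqs. (A.2)–(A.7) together with Eqs. (9) and (10)) of how the unilateral DOE of Laboratory k could be evaluated in the bias model:

```python
import numpy as np

def doe_bias_model(x, q, eta0, k):
    """Sketch of Eqs. (A.2)-(A.7) with Eqs. (9)-(10): unilateral DOE of
    Laboratory k (0-based index). eta0 is the parameter vector eta_0(k),
    i.e. the maximizing eta with its k-th component set to zero."""
    x, q, eta0 = map(np.asarray, (x, q, eta0))
    n = len(x)
    d = 2 * q + 4 * eta0                        # recurring denominators 2q_i + 4eta_i
    s = np.sum(1.0 / d)
    outer = np.outer(1.0 / d, 1.0 / d) / s      # shared off-diagonal term of (A.2), (A.3)
    A = np.diag(1.0 / d + 1.0 / (2 * q)) - outer       # Eq. (A.2)
    B = np.diag(-1.0 / d + 1.0 / (2 * q)) + outer      # Eq. (A.3)
    w = 1.0 / (q + eta0)
    c = w / np.sum(w)                           # c^T = (1^T(Q+H)^-1 1)^-1 1^T(Q+H)^-1
    e_k = np.zeros(n); e_k[k] = 1.0
    v = e_k - c
    A_inv = np.linalg.inv(A)
    R_rep_mean = v @ A_inv @ B @ x              # Eq. (A.6)
    u_R_rep = np.sqrt(v @ A_inv @ v)            # from Eq. (A.7)
    R_k = x[k] - np.sum(c * x)                  # Eq. (9)
    return R_k - R_rep_mean, 2 * u_R_rep        # (d_k^bias, U_k^bias), Eq. (10)
```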

In the calculation of the DOE, the bias model with the parameter of η

= η0(k) is considered. The DOEs in the bias model are different from those

in the variance model even for identical data, because of the correlation

among the components of ζ in the bias model. The components of θ in the

variance model do not correlate with each other, because they are constant.

References

1. M. G. Cox, The evaluation of key comparison data, Metrologia 39, 589 (2002).
2. M. G. Cox, The evaluation of key comparison data: determining the largest consistent subset, Metrologia 44, 187 (2007).
3. A. G. Chunovkina, C. Elster, I. Lira and W. Woger, Analysis of key comparison data and laboratory biases, Metrologia 45, 211 (2008).
4. C. Elster and B. Toman, Analysis of key comparison data: Critical assessment of elements of current practice with suggested improvements, Metrologia 50, 549 (2013).
5. K. Shirono, H. Tanaka and K. Ehara, Bayesian statistics for determination of the reference value and degree of equivalence of inconsistent comparison data, Metrologia 47, 444 (2010).
6. K. Shirono, H. Tanaka and K. Ehara, Theory of and computation program for determination of the reference value in key comparisons based on Bayesian statistics, Advanced Mathematical and Computational Tools in Metrology and Testing IX, 366 (2012).
7. International Organization for Standardization, ISO 5725-2 (1994).
8. R. N. Kacker, A. Forbes, R. Kessel and K.-D. Sommer, Bayesian posterior predictive p-value of statistical consistency in interlaboratory evaluation, Metrologia 45, 512 (2008).


CERTIFICATION OF ALGORITHMS FOR CONSTRUCTING

CALIBRATION CURVES OF MEASURING INSTRUMENTS

TATIANA SIRAYA

Concern CSRI Elektropribor, JSC, 197046, 30 Malaya Posadskaya,

St. Petersburg, Russia

When complex data processing algorithms are used for the calibration of measuring
instruments, there are problems of rational selection of algorithms and software, as well as
of estimation of result errors. The scheme of certification of data processing algorithms
is applied for these aims; it is realized as a procedure of estimating the algorithm
characteristics for selected typical models of data. The paper presents the problem of
certification of calibration algorithms, and details the basic characteristics of algorithms and
the typical models of experimental data.

1. Introduction

Calibration curves are widely used as primary metrological characteristics of

sensors, measuring transducers, and devices. The quality of calibration curves

determines the accuracy of measuring instruments to a great extent; so the

properties of algorithms for constructing and testing of calibration curves are

essential. The traditional technique is classical least squares (LS) fitting [1], but
it relies on strict assumptions about the data errors. So in practice various methods

are applied, including generalised LS [2], confluent estimates [3, 4], robust and

heuristic methods [5, 6].

Thus the quality characteristics of various data processing algorithms are to

be evaluated; for these aims the scheme of certification for algorithms is

developed in metrology [7, 8]. The basic concept of the algorithm certification is

the idea that various algorithms are tested using the same set of characteristics or

criteria. In so doing, one considers the unified set of typical models of input data.

These characteristics should specify the basic properties of an algorithm, such as

precision, stability and complexity. Certification of calibration algorithms

provides a researcher with information for the rational choice of algorithm and

for the estimation of the result errors.

In a way, the algorithm certification principles may be considered as an

extension of the data analysis to the domain of data processing in measurements.

We should mention the most wide-ranging investigation of robust algorithms, the


so-called “Princeton robustness study” [9], and the study of several groups of

regression algorithms in [5].

Works on certification of data processing algorithms in measurements have

been conducted for several years. The initial formulation of the problem was

suggested by Prof. I. B. Chelpanov, and a series of studies has been carried
out at the D. I. Mendeleyev VNIIM [7, 8].

So, today, the methodological basis of the algorithms certification has been

developed [7, 8]. In the first place, it is a formal scheme of the certification, with

the principles for choosing its elements. It was also considered for some

important cases, such as direct measurements with multiple observations.

The paper considers certification of calibration algorithms and details basic

characteristics of algorithms and typical models of data. On the one hand, the

certification scheme for calibration algorithms is compared with the basic

scheme for direct measurement algorithms [8]. On the other hand, various

calibration algorithms are compared with the classical LS fitting.

2. Formal scheme of certification

The general scheme of certification for the algorithms of data processing may be

described as follows.

1. A homogeneous group of algorithms A = {a} is specified. This is a
group of algorithms which could be used for a certain problem of
measurement data processing.

2. A set of algorithm characteristics Π1, ..., Πn is chosen. These

characteristics are used for comparing the algorithms in the group A =
{a}. There are three main groups of characteristics:

a) characteristics of accuracy, which are used for estimating of the result

errors;

b) stability characteristics used for defining the domain of correct

operation of algorithm;

c) complexity characteristics, specifying the labour and temporal

expenses for the algorithm realisation.

3. A set of typical models for the input data {u1, ..., uk} is also stated. It

usually includes models both for effective signals and the data errors. The

data error models comprise both systematic and random component

models.

4. The values of the algorithm characteristics Π1, ..., Πn are evaluated

(computed or estimated) for the typical data models u1, ..., uk :

π (i, j) = Πi (a | uj ) . (1)


It may be realized either using analytical methods or by statistical modelling.

So the result of the algorithm certification is presented as a matrix

Π(a) = || π (i, j) || = || Πi (a, uj ) ||. (2)

The entries of the matrix are either numbers or functions, depending on the

model parameters.

This formal scheme is just a general algorithm of certification. In practice,

the certification procedure is specified in several aspects.

Firstly, the procedure is specified according to the main groups of data

processing algorithms. In this paper, the scheme is specified for algorithms of

fitting calibration curves.

Secondly, it is specified according to the aims and scope of the certification.

In this aspect, two main kinds of certification are distinguished: wide sense (or
general) and narrow sense procedures.

On the one hand, wide sense certification is a full and comprehensive study

of the properties of an algorithm (for the fullest set of characteristics and for a

wide range of typical models). It is performed with the aim to recommend the

algorithm for practical application. So, this case is quite close to the algorithm

investigation in data analysis.

On the other hand, the narrow sense certification is a specific and detailed

study of the algorithm properties for rather definite conditions in order to

estimate the accuracy of the results. For instance, it may be performed for the

given measurement procedure, on a rather limited set of typical models. So, this

case is quite close to the measurement procedure certification in metrology.

Nevertheless, a third case also seems to be used in practice, which may be
called comparative certification. In this case, a group of algorithms is

examined together under a limited set of characteristics and for a few typical

models. The aim is to provide a rational choice of algorithm for the given

measurement problem.

3. Basic elements of the certification procedure

3.1. Classification of the calibration algorithms

The certification system is based on a full and adequate classification of data

processing algorithms. It was suggested [8] to develop classification based on

the principal structural elements of data processing algorithms. Thus, there are

principal classification criteria, concerned with the forms of initial data and of

measurement result, and the type of the computational procedure. Classification


by the third criterion is fundamental for certification as far as it is classification

within homogeneous groups of data processing algorithms.

The minor criteria are related to the mathematical content of the algorithms,

and to the structure of the computational procedure. For instance, according to

the first criterion, it is possible to distinguish the optimal statistical procedures, the
efficient robust procedures, and heuristic methods.

The general scheme is specified for the case of calibration algorithms. It is

assumed that a calibration curve is presented as

Y = f (X) = f (X, a1, …, al), (3)

where X and Y are input and output values; f is an l-parameter function of a certain
type (selected based on a priori data).

A brief classification of the calibration algorithms is presented in Table 1. In

the table, one can see two main groups of the algorithms. The algorithms of the

first group are valid for regression models, when the errors of input values Xi are

negligible as compared with the errors of output values Yi, and the algorithms of

the second group are valid for confluent models, when the errors of Xi and Yi are

of the same order.

Table 1. Main groups of the calibration algorithms

Data model        | Classical optimal                                        | Robust                                                              | Heuristic
Regression model  | Least squares algorithms; maximum likelihood method      | Least module method; M-estimates by Huber, Andrews, Hampel, Tukey   | Median estimates; estimates based on order statistics and ranks
Confluent model   | Modified least squares estimates; orthogonal regression  | Modified M-estimates                                                | Fractional rational estimates; grouping estimates

In the table, the classical optimal algorithms are presented, such as least

squares (LS) and maximum likelihood methods. Some groups of robust methods

are also given, such as M-estimates with weight functions by Andrews, Huber,

Hampel, and Tukey. Several kinds of heuristic estimates, including median and

rank methods, are shown as well.

In a similar way, the main groups of confluent algorithms are also presented.


3.2. The main characteristics of the calibration algorithms

While the calibration curve is presented as (3), the main characteristics of

algorithms are concerned with the vector of parameter estimates (a1*, …, al*) or

the function value estimates Y* = f* (X) = f (X, a1*, …, al*).

The characteristics of algorithms include three main groups:

A. Characteristics of accuracy, meant for estimating parameter errors,
include the following:

- vector of standard deviations (or variances) of parameter estimates;

- vector of biases of parameter estimates;

- vector of limits of systematic errors;

- covariance matrix of parameter estimates vector.

As compared with the basic case of direct measurements [8], the

algorithm characteristics become complicated, due to consideration of

vectors and covariance matrices. The primary significance of the bias of

estimate, along with the variance (especially for confluent models) is also

to be mentioned.

B. Characteristics of stability, meant to define the domain of normal

operation of the algorithm, include the fractions of initial data distortion

which are tolerable for the parameter estimates. So, they are defined like

breakdown points for robust estimates of location, but may be different

for some parameters.

There are also some specific parameters, which are useful for comparing

the stability of different robust regression estimates [10].

C. Characteristics of complexity of the algorithm realisation, including

computing complexity of the algorithm, defined as the number of
operations necessary for the algorithm realization, and temporal

complexity (for the program), defined as the time needed for calculations.

As compared with the basic case of direct measurements [8], these

characteristics become more diverse. The computing complexity

characteristics of various LS algorithms are studied, for instance, in [11].

3.3. The typical models of data

The experimental data used to construct the calibration curve (3) may be given

by the general model of the form

xij = Xi + θxi +εxij ; yij = f (Xi) + θyi + εyij , i=1…m, j=1…ni , (4)

where Xi and Yi = f (Xi) are the true values of input and output quantities;


θxi , θyi are the systematic errors of measured Xi, Yi;

εxij , εyij are the random errors of Xi, Yi.

So the main aspects of the typical models of data are as follows:

1) functional type of the function f (X, a1, …, al);

2) typical models of vectors of random errors εxij , εyij;

3) typical models of vectors of systematic errors θxi , θyi;

4) location of the points Xi within the range.

Considering the second aspect, the most significant is the ratio between the

errors in input Xi and output Yi values. According to this ratio, there are two main

groups of models, as presented in Table 1; those are regression and confluent

models. Then typical models of distributions for the errors εxij and εyij are just the

same as for the case of direct measurements [8]. In particular, they include

Gaussian, uniform, and double exponential (Laplace) distributions.

The contaminated Gaussian distributions are also useful. These models allow

for a certain fraction of data to contain the outliers or to follow another

distribution, so, the density of data distribution can be represented as:

f (x) = (1-q) g(x) + q h(x), (5)

where g(x) is the density of the main (Gaussian) distribution,

h(x) is the density of the foreign (contaminating) distribution,

q is the level of contamination (a small number, usually 0.05 or 0.1).
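A minimal sketch of generating data errors from the contaminated model (5), assuming for illustration that both the main and the contaminating densities are zero-mean Gaussians (the parameter values are arbitrary):

```python
import numpy as np

def contaminated_errors(rng, size, sigma=0.1, q=0.05, sigma_out=1.0):
    """Random errors from f(x) = (1 - q) g(x) + q h(x), Eq. (5), where g is
    N(0, sigma^2) and h is a wider Gaussian N(0, sigma_out^2).
    All parameter values here are arbitrary illustrations."""
    outlier = rng.random(size) < q
    return np.where(outlier,
                    rng.normal(0.0, sigma_out, size),
                    rng.normal(0.0, sigma, size))

rng = np.random.default_rng(1)
eps_y = contaminated_errors(rng, size=100)   # e.g. random errors of the output values
```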

The typical models for systematic errors θxi , θyi are usually defined as

deterministic sequences of the following form: constant, linear, and harmonic

ones. Often it is also useful to include quasi-stochastic sequences (e.g. uniformly
distributed on intervals within the given limits).

Some additional models, based on peculiar features of the data processing
problem, may also be included. So the variety of typical data models which
are useful for algorithm certification is very wide.

The concluding tables of the characteristics of the calibration algorithms are

presented mainly in the parametric form. It is possible to present a total

certification table for one algorithm, which gives the complete description of its

properties. But in practice it is convenient to present the certification results for a

few similar algorithms in the unified table, which lets the users compare the

algorithm properties directly. It also enables users to make a well-founded choice
of the rational algorithm for the given measurement problem.


4. Examples of algorithms for calibration curves

For example, consider a linear calibration curve Y = a + b X. If the errors in

measured input values Xi are insignificant, then classical LS fitting is valid; but

in the case of significant errors in Xi the LS estimates turn out to be statistically
inconsistent.

Therefore the following groups of consistent confluent estimates may be used

(according to available additional information on data):

a) generalized orthogonal regression estimates (with the known ratio of

error variances λ=D(εy)/D(εx) );

b) variance analysis estimates (with multiple observations in points Xi , Yi);

c) homographic, or fractional rational estimates (with a priori known
increasing order of Xi).

The main characteristics to compare confluent estimates are bias and

variance; these characteristics are presented in Table 2 for several estimates.

Here the sums of data are denoted as follows:
Σx = Σ(xi − x̄)², Σy = Σ(yi − ȳ)², Σxy = Σ(xi − x̄)(yi − ȳ).
It is also supposed that the points Xi do not concentrate close to one point:
Σx / m → τ > 0.

The characteristics of confluent estimates are also compared with those of LS

estimates. In particular, it is seen that confluent estimates are consistent, and they

essentially differ in bias from LS estimates.

Table 2. Characteristics of confluent estimates for coefficient b of the linear calibration line

A priori information           | Estimate                                                  | Bias B(b)                  | Variance
Known parameters (or their estimates):
  λ = σ²y/σ²x                  | b1 = ν ± (ν² + λ)^(1/2),  ν = (Σy − λΣx)/(2Σxy)           | (2σ²y + b²σ²x)/(bΣxy)      | (σ²y + b²σ²x)/Σx
  σ²x                          | b2 = Σxy/(Σx − (m−1)σ²x)                                  | 2bσ²x/Σx                   | (σ²y + b²σ²x)/Σx
  σ²y                          | b3 = (Σy − (m−1)σ²y)/Σxy                                  | (b² − λ)σ²x/(bΣx)          | (σ²y + b²σ²x)/Σx
Estimates from multiple data:
  estimate of λ                | b4 = ν ± (ν² + λ)^(1/2),  ν = (Σy − λΣx)/(2Σxy)           | B(b1)                      | (σ²y + b²σ²x)/Σx
  S²x                          | b5 = Σxy/(Σx − (m−1)S²x)                                  | B(b2)                      | (σ²y + b²σ²x)/Σx
  S²y                          | b6 = (Σy − (m−1)S²y)/Σxy                                  | B(b3)                      | (σ²y + b²σ²x)/Σx
Least squares estimate         | b0 = Σxy/Σx                                               | −b(m−3)σ²x/Σx              | (σ²y + b²σ²x)/Σx
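Purely as an illustration (a Python sketch; Table 2 leaves the sign of the root in b1 as ±, and here it is chosen to match the sign of Σxy), the LS estimate b0 and the generalized orthogonal regression estimate b1 can be computed as follows:

```python
import numpy as np

def slope_ls(x, y):
    """Classical LS slope b0 = S_xy / S_x (inconsistent when x contains errors)."""
    xc, yc = x - x.mean(), y - y.mean()
    return np.sum(xc * yc) / np.sum(xc * xc)

def slope_orthogonal(x, y, lam):
    """Generalized orthogonal regression slope b1 (first row of Table 2),
    for a known ratio of error variances lam = D(eps_y) / D(eps_x)."""
    xc, yc = x - x.mean(), y - y.mean()
    S_x, S_y, S_xy = np.sum(xc ** 2), np.sum(yc ** 2), np.sum(xc * yc)
    nu = (S_y - lam * S_x) / (2.0 * S_xy)
    return nu + np.sign(S_xy) * np.sqrt(nu ** 2 + lam)   # root with the sign of S_xy
```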


5. Conclusions

As regards to certification of calibration algorithms, some conclusions can be

drawn.

First, certification of calibration algorithms provides a researcher with useful
information for the rational choice of algorithm and the estimation of result errors.

Secondly, it is often convenient to present comparative tables with certification

results for a few similar algorithms, which let the users compare the algorithm

properties directly. Thirdly, certification of calibration algorithms seems to be

especially useful as applied to the confluent models of data.

Acknowledgments

This work is supported by the Russian Foundation for Basic Research,
grant No. 13-08-00778.

References

1 Handbook of Applicable Mathematics. Vol. VI: Statistics. (1984 ) Eds.

Lederman W. and Lloyd, E. (John Wiley & Sons, New York).

2 Forbes, A. B. (2000) Fusing prior calibration information in metrology data

analysis. Advances mathematical & computational tools in metrology IV.

Series on Advances in mathematics for Applied Sciences 53 (World

Scientific, New York).

3 Demidenko, E. Z. (1981) Linear and non-linear regression (Finance and

Statistica Press, Moscow, in Russian).

4 Granovsky, V. A. and Siraya, T. N. (1990) Methods for Data Processing in

Measurement (Energoatomizdat, Leningrad, in Russian).

5 Robustness in statistics (1979) - Eds. Launer, R. L. and Wilkinson, G. N.

(Academic Press, New York).

6 Mosteller, F. and Tukey, J.W. (1982) Data analysis and Regression

(Reading: Addison-Wesley, Massachusetts).

7 Tarbeyev, Yu.V., Chelpanov, I. B. and Siraya, T. N. (1983) Investigation

and certification of data processing algorithms in precise measurements,

Acta IMEKO (Budapest).

8 Chelpanov, I. B. and Siraya, T. N. (1999) Certification of data processing

algorithms in measurements: principles and results. Metrological aspects of

data processing and information systems in metrology. PTB-Bericht IT-7.

Eds. Richter, D. and Granovsky, V. A. (PTB, Braunschweig und Berlin).

9 Andrews, D. F., Bickel, P. J., Hampel, F. R., Huber, P. J., Rogers, W. H., and
Tukey, J. W. (1972) Robust estimates of location: Survey and advances
(Princeton University Press, Princeton).


10 Riani, M., Atkinson, A. C. and Perrotta, D. (2014) A parametric framework

for the comparison of methods of very robust regression, Statistical Science,

29, N 1, pp. 128-143.

11 Maindonald, J. H. (1984) Statistical computation. Wiley series in probability

and statistics (John Wiley & Sons, New York).


DISCRETE AND FUZZY ENCODING OF THE ECG-SIGNAL

FOR MULTIDISEASE DIAGNOSTIC SYSTEM

V. USPENSKIY

Federal Medical Educational-Scientific Clinical Center n. a. P. V. Mandryka
of the Ministry of Defence of the Russian Federation, Moscow, Russia
E-mail: [email protected]

K. VORONTSOV, V. TSELYKH AND V. BUNAKOV

Moscow Institute of Physics and Technology, Moscow, Russia
Dorodnicyn Computing Centre of RAS, Moscow, Russia

E-mail: [email protected], [email protected], [email protected]

In the information analysis of the ECG signal, discrete and fuzzy variants of signal
encoding are compared for a multidisease diagnostic system. Cross-validation
experiments on more than 10 000 ECGs and 18 internal diseases show that the
AUC performance criterion can be improved by up to 1% with fuzzy encoding.

Keywords: electrocardiography, information function of the heart, multidisease
diagnostic system, signal discretization, machine learning, cross-validation.

1. Introduction

Heart rate variability (HRV) is the physiological phenomenon of variation in

the time interval between heartbeats, or, more precisely, between R-peaks

(see Fig. 1). HRV analysis is widely used to diagnose cardiovascular dis-

eases.1,3 HRV reflects many regulatory processes of the human body and

therefore has a high potential to contain valuable diagnostic information

about many internal diseases, not only related to heart problems.

The information analysis of ECG signals,4 instead of averaging time in-

terval variability around the signal, discovers patterns of variability for both

intervals and amplitudes of consecutive R-peaks. It was found that some of

these patterns are significantly correlated with various diseases.5,6 This ap-

proach has been implemented in the multidisease diagnostic system which

permits a diagnosis of a multitude of internal diseases through a single ECG

record. This diagnostic technology is based on the encoding of the electro-

cardiogram into a symbolic string with each cardiac cycle corresponding to

one symbol. Subsequently, computational linguistics and machine learning


Fig. 1. Three consecutive R-peaks of the ECG signal determine two full cardiac cycles

with amplitudes Rn, Rn+1, intervals Tn, Tn+1, and “phase angles” αn, αn+1.

techniques are used to infer diagnostic rules from a training sample of ECGs

collected from healthy and sick persons.

In this paper, we improve the diagnostic performance by means of fuzzy

encoding. Note that we use the term “fuzzy” only in its intuitive sense,

without regard to the fuzzy logic. Fuzzy encoding aims to smooth out the

noise and decrease uncertainties in the ECG signal. To do this, we introduce

a simple two-parametric probabilistic model of measurements. We make an

extensive cross-validation experiment to estimate the model parameters and

to show that fuzzy encoding improves the performance.

2. Discrete and Fuzzy Encoding

The informational analysis of the ECG is based on the measurement of

the interval Tn and amplitude Rn for each cardiac cycle, n = 1, . . . , N

(see Fig. 1). The sequence T1, . . . , TN represents the intervalogram of the

ECG, and the sequence R1, . . . , RN represents the amplitudogram of the

ECG. Note that in HRV analysis only intervals Tn are used; in contrast, we

analyze the variability of intervals Tn and amplitudes Rn together.

Discrete Encoding. In successive cardiac cycles, we take the signs of the
increments ∆Rn, ∆Tn and ∆αn, where αn = arctan(Rn/Tn). Only six of the
eight combinations of increment signs are possible. They are encoded by

the letters of a six-character alphabet A = {A, B, C, D, E, F}:

                     A   B   C   D   E   F
∆Rn = Rn+1 − Rn      +   −   +   −   +   −
∆Tn = Tn+1 − Tn      +   −   −   +   +   −
∆αn = αn+1 − αn      +   +   +   −   −   −


Fig. 2. An example of a codegram with a sliding window of three symbols.

Fig. 3. Vector representation nw(S) of the codegram S shown in Fig. 2. Only 64 of 216 trigrams with frequency nw(S) ≥ 2 are shown.

Thus, the ECG is encoded into a sequence of characters from A called

a codegram, S = (s1, . . . , sN−1), see Fig. 2. We define a frequency pw(S) of

a trigram w = (a, b, c) with three symbols a, b, c from A in the codegram S:

\[
p_w(S) = \frac{n_w(S)}{N - 3}, \qquad n_w(S) = \sum_{n=1}^{N-3} [s_n = a][s_{n+1} = b][s_{n+2} = c],
\]

where brackets transform logical values false/true into numbers 0/1.

Denote by p(S) = (pw(S) : w ∈ A³) a frequency vector of all |A|³ = 216

trigrams w in the codegram S, see Fig. 3. The informational analysis of the

ECG is based on the idea that each disease has its own diagnostic subset

of trigrams frequently observed in the presence of that disease.4,6
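A short Python sketch of the discrete encoding and of the trigram frequency vector p(S) (the names are illustrative, and the handling of zero increments — mapped here to a "+" sign — is our own simplification rather than the authors' rule):

```python
import numpy as np

ALPHABET = "ABCDEF"
# Sign patterns (dR, dT, dalpha) of the six admissible letters, as in the table above.
SIGNS = {(+1, +1, +1): "A", (-1, -1, +1): "B", (+1, -1, +1): "C",
         (-1, +1, -1): "D", (+1, +1, -1): "E", (-1, -1, -1): "F"}

def encode_codegram(R, T):
    """Discrete encoding: one letter per pair of consecutive cardiac cycles."""
    R, T = np.asarray(R, float), np.asarray(T, float)
    alpha = np.arctan2(R, T)                      # alpha_n = arctan(R_n / T_n)
    letters = []
    for n in range(len(R) - 1):
        key = (int(np.sign(R[n + 1] - R[n]) or 1),    # zero increments mapped to '+'
               int(np.sign(T[n + 1] - T[n]) or 1),
               int(np.sign(alpha[n + 1] - alpha[n]) or 1))
        letters.append(SIGNS.get(key, "A"))
    return "".join(letters)

def trigram_frequencies(S):
    """Frequency vector p_w(S) over all 216 trigrams of the codegram S."""
    N = len(S) + 1                                # S has N - 1 symbols
    counts = {a + b + c: 0 for a in ALPHABET for b in ALPHABET for c in ALPHABET}
    for n in range(len(S) - 2):
        counts[S[n:n + 3]] += 1
    return {w: c / (N - 3) for w, c in counts.items()}
```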

Fuzzy encoding. There are two reasons to consider a smooth variant of

discrete encoding. First, the ECG may contain up to 5% of outliers among

the values Rn and Tn. In discrete encoding, each outlier distorts four neigh-

boring trigrams; accordingly, the total number of distorted trigrams may

reach 20%. Second, the discreteness of the ECG digital sensor results in

uncertainties ∆Tn = 0 and ∆Rn = 0 in 5% of cardiac cycles. In such cases,

it is appropriate to consider the increment as positive or negative with equal

probabilities. In general, the smaller the increment, the greater the uncer-


tainty in their sign.

Fig. 4. An example of discrete and fuzzy encoding: for each cardiac cycle the measured Rn, Tn and αn, the increments ∆Rn, ∆Tn, ∆αn, the discrete symbol sn, and the fuzzy probabilities qn(A), ..., qn(F) are listed.
Fig. 5. The six sectors of the (∆T, ∆R) plane corresponding to the symbols A, B, C, D, E, F.

We can replace each character sn with a probability

distribution qn(s) over A (see Fig. 4) and redefine the frequency of a trigram

w = (a, b, c) as a probability of w averaged across the codegram S:

pw(S) =1

N − 3

N−3∑n=1

qn(a) qn+1(b) qn+2(c).

To estimate the probability qn(s) from Rn, Rn+1, Tn, and Tn+1 we introduce a
probabilistic model of measurement. We assume that each amplitude Rn comes
from a Laplace distribution with a fixed but unknown RMS error parameter σR,
which is the same for all ECGs. For intervals Tn, we introduce a similar model
with the RMS error parameter σT. Subsequently, we calculate the probabilities
qn(s) analytically by integrating a two-dimensional probability distribution
centered at the point (∆Tn, ∆Rn) over the six sectors corresponding to symbols
A, B, C, D, E, F shown in Fig. 5.
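The sketch below illustrates the idea in Python, but with two simplifications that are ours, not the authors': the sector integrals are approximated by Monte Carlo sampling instead of being computed analytically, and a single Laplace error term is attached to each increment rather than to each individual measurement. The values σT = 10.6 ms and σR = 3.5 mV anticipate the optimal parameters selected in Section 3.

```python
import numpy as np

def fuzzy_probabilities(T_n, R_n, dT, dR, sigma_T=10.6, sigma_R=3.5,
                        n_samples=20000, rng=None):
    """Monte Carlo sketch of the fuzzy probabilities q_n(s): Laplace-distributed
    errors are sampled around the observed increments (dT, dR) and the fraction
    falling in each of the six sectors of Fig. 5 is counted."""
    rng = rng or np.random.default_rng(0)
    b_T, b_R = sigma_T / np.sqrt(2), sigma_R / np.sqrt(2)   # Laplace scale b = sigma/sqrt(2)
    t = dT + rng.laplace(0.0, b_T, n_samples)               # plausible true dT values
    r = dR + rng.laplace(0.0, b_R, n_samples)               # plausible true dR values
    s_R, s_T = np.sign(r), np.sign(t)
    s_alpha = np.sign(r * T_n - R_n * t)    # sign of d(alpha) = sign(dR*T_n - R_n*dT), T_n > 0
    sectors = {"A": (+1, +1, +1), "B": (-1, -1, +1), "C": (+1, -1, +1),
               "D": (-1, +1, -1), "E": (+1, +1, -1), "F": (-1, -1, -1)}
    return {s: float(np.mean((s_R == sr) & (s_T == st) & (s_alpha == sa)))
            for s, (sr, st, sa) in sectors.items()}
```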

Machine learning techniques are designed to learn a classifier automati-

cally from a sample of classified cases.2 We learn a diagnostic rule for each

disease from a two-class training sample that contains both healthy persons

and patients, each represented by its trigram frequency vector.

In this work we compare three classification models: NB — Naïve Bayes
with greedy feature selection, LR — Logistic Regression after dimensionality
reduction via Principal Components Analysis, and RF — Random Forest,
which is known as one of the strongest classification models. For all
classifiers we use binary features [pw(S) ≥ θ] instead of frequencies pw(S),
and optimize the threshold parameter θ experimentally.



Fig. 6. The result of permutational tests for three diseases (left to right: necrosis of the femoral head, toxic nodular goiter, coronary heart disease). Points indicate trigrams.
The X-axis and the Y-axis indicate the proportion of healthy and sick people, respectively, with two or more occurrences of the trigram in their codegram. The trigrams located in the region of acceptance near the diagonal are likely to have occurred by chance (the significance level equals 10% for the narrow region and 0.2% for the wider one). The trigrams located in the critical region far above the diagonal are specific to the disease, and the trigrams far below the diagonal are specific to a healthy condition.

This approach is motivated by an empirical observation that each dis-

ease induces a diagnostic subset of trigrams that are significantly more

frequent in the codegrams of sick people. Also, there are trigrams that are

highly specific to the codegrams of healthy people. Fig. 6 shows the results

of permutational statistical tests for three diseases. If the frequency of the

trigram and the class label were independent random variables, then all tri-

grams would be close to the diagonal of the chart. However, many trigrams

are located far away from the chart diagonal. This fact means that for each

disease the diagnostic subset of highly specific trigrams exists and can be

reliably determined.

Note that both discrete and fuzzy encoding can be used to calculate fea-

tures pw(S), thus enabling a comparative study of the two types of encoding

with the same performance criterion.

We measure the performance of the diagnostic rules using a standard 40×10-
fold cross-validation procedure. During the procedure, a two-class sample of
codegrams is randomly divided into 10 equi-sized blocks 40 times. Each
block is used in turn as a testing sample, while the other nine blocks are
used as a training sample in order to learn a classifier.

For each partitioning, we calculate three performance measures, for both

training and testing samples. Sensitivity is the proportion of sick people

with true positive diagnosis. Specificity is the proportion of healthy people

with true negative diagnosis. AUC is defined as the area under the curve

of specificity as a function of sensitivity. For each of the three performance
measures, the higher the value, the better. From all 40 partitionings
we estimate the mean AUC values as well as their confidence intervals.
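A schematic version of this procedure using scikit-learn is shown below (a sketch, not the authors' code: the classifier settings are arbitrary, the threshold corresponds to two occurrences of a trigram for N = 600 cycles, and the simple confidence half-width ignores the dependence between folds):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

def repeated_cv_auc(X, y, n_repeats=40, n_folds=10, threshold=2 / 597):
    """Sketch of the 40 x 10-fold cross-validation with binary features
    [p_w(S) >= theta]. X is the matrix of trigram frequencies p_w(S),
    y the disease labels (0 = healthy, 1 = sick)."""
    X_bin = (X >= threshold).astype(int)
    aucs = []
    for repeat in range(n_repeats):
        cv = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=repeat)
        for train_idx, test_idx in cv.split(X_bin, y):
            clf = RandomForestClassifier(n_estimators=300, random_state=repeat)
            clf.fit(X_bin[train_idx], y[train_idx])
            scores = clf.predict_proba(X_bin[test_idx])[:, 1]
            aucs.append(roc_auc_score(y[test_idx], scores))
    aucs = np.asarray(aucs)
    return aucs.mean(), 1.96 * aucs.std(ddof=1) / np.sqrt(len(aucs))
```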


Table 1. The AUC (in percents) on testing data for three types of classifiers (RF, LR,

NB) and two types of encoding (.d for discrete and .f for fuzzy). Confidence intervals

are: ±0.26 for RF, ±0.19 for LR, and ±0.08 for NB.

disease  cases  RF.d   RF.f   LR.d   LR.f   NB.d   NB.f   RF-2   RF-4
(1)      278    98.72  99.00  99.00  98.94  98.96  99.00  95.16  94.49
(2)      324    99.24  98.86  99.26  99.07  99.24  99.01  98.11  95.49
(3)      1265   98.43  98.75  98.21  98.70  97.85  98.52  91.68  92.72
(4)      530    97.15  97.99  96.79  97.42  96.03  96.45  93.09  93.43
(5)      700    97.74  97.95  97.64  97.67  97.81  98.20  82.54  87.14
(6)      871    97.34  97.79  97.10  97.74  96.68  97.17  91.05  92.73
(7)      260    96.65  97.55  96.64  97.38  96.61  96.96  89.33  90.59
(8)      1894   97.13  97.49  96.87  97.68  96.59  97.31  87.43  90.12
(9)      748    96.07  96.90  95.73  96.04  95.17  95.72  85.56  88.10
(10)     324    95.53  96.37  95.20  95.98  94.79  95.85  88.95  92.17
(11)     340    95.21  96.25  95.06  96.17  95.51  96.44  86.29  87.60
(12)     717    95.29  96.20  95.13  96.12  95.13  95.82  86.92  87.86
(13)     654    95.09  96.16  95.14  95.94  95.14  96.03  87.80  86.90
(14)     785    94.99  95.58  94.74  95.33  94.68  95.09  86.60  89.17
(15)     781    94.43  95.26  93.58  94.74  93.38  94.28  84.06  85.97
(16)     276    92.37  92.65  92.44  92.32  91.88  91.50  81.49  84.96
(17)     260    90.03  91.82  90.03  91.07  89.56  90.34  79.39  81.77
(18)     694    88.07  88.63  87.70  87.65  86.59  86.50  76.48  82.39

3. Experiments and Results

In the experiment, we used more than 10 000 ECG records with N = 600

cardiac cycles in each. 193 ECGs were taken from healthy participants,

while the others were taken from patients who were reliably diagnosed

with one or more of the 18 diseases: (1) cholelithiasis, (2) AVN, necro-

sis of the femoral head, (3) coronary heart disease, (4) cancer, (5) chronic

hypoacidic gastritis (gastroduodenitis), (6) diabetes, (7) BPH, benign pro-

static hyperplasia, (8) HTN, hypertension, (9) TNG, toxic nodular goiter or

Plummer syndrome, (10) chronic hyperacidic gastritis (gastroduodenitis),

(11) chronic cholecystitis, (12) biliary dyskinesia, (13) urolithiasis, (14) pep-

tic ulcer, (15) hysteromyoma, (16) chronic adnexitis, (17) iron-deficiency

anemia, (18) vasoneurosis.

Table 1 compares the performance of three classifiers (Random Forest,
Logistic Regression and Naïve Bayes) on testing data for discrete and fuzzy
encoding. Fuzzy encoding gives better results for 16 of the 18 diseases.
Random Forest is usually the best choice. Nonetheless, Naïve Bayes with
feature selection is not much worse. Two additional columns RF-2 and

RF-4 show the performance of Random Forest for two simplified discrete

encodings. RF-2 uses a two-character alphabet for ∆Tn signs. RF-4 uses

Page 396: Computational tools in metrology and testing x

March 12, 2015 18:3 ws-procs9x6-9x6 9610-45 page 383

383

0 5 10 15

0

1

2

3

4

5

6

70.942

0.944

0.946

0.948

0.95

0.952

0.954

0.956

training testing1.0 1.2 1.4 1.6 1.8 2.0 2.2 2.4 2.6 2.8 3.0

0.946

0.948

0.950

0.952

0.954

0.956

0.958

Fig. 7. The AUC on the testing set averaged across all diseases depending on σT (X-axis) and σR (Y-axis).
Fig. 8. The AUC on the training and testing sets averaged across all diseases depending on the threshold parameter θ(N − 3).

a four-character alphabet for ∆Tn and ∆Rn signs. From the comparison we

conclude that the six-character encoding gives significantly better results.

Fig. 7 shows the AUC on testing data averaged across all diseases as

a function of the RMS error parameters σR and σT . Based on the charts we

selected the optimal values of parameters σR = 3.5 mV and σT = 10.6 ms.

Note that the zero values σT = σR = 0, which correspond to discrete encoding,
are evidently far from optimal.

Fig. 8 shows how the average AUC for the NB classifier on testing data
depends on the frequency threshold parameter θ(N − 3). Trigrams that
occur less than twice in a codegram are not meaningful for the diagnosis.
Fig. 9 shows how the AUC for the NB classifier on testing data depends on
the RMS error parameters σR and σT for 2 of the 18 diseases.
The proximity of training and testing AUCs in all charts indicates that
overfitting of the NB classifier is minute, and optimal parameters could be
obtained from the training set even without cross-validation.

4. Conclusion

The information analysis of ECG signals improves on HRV analysis in two
directions. Firstly, it identifies patterns of joint variability of intervals and

amplitudes of R-peaks specific to diseases. Secondly, this type of analysis

is not restricted to cardiovascular diseases. Our experiments show that the

information analysis of the ECG signals reaches a high level of sensitivity

and specificity (90% and higher) in cross-validation experiments.

On average, fuzzy encoding helps to improve this level by 0.65%.

Future research will benefit from more accurate techniques for signal

encoding, statistical modeling, and machine learning.


Fig. 9. AUC on the training and testing sets depending on σR at fixed σT = 10.6 (left-hand charts) and depending on σT at fixed σR = 3.5 (right-hand charts) for two of the 18 diseases: urolithiasis and coronary heart disease.

The work was supported by the Russian Foundation for Basic Research

grants 14-07-00908, 14-07-31163. We thank Alex Goborov for his help with

English translation and valuable discussion.

References

1. A. J. Camm, M. Malik, J. T. Bigger, et al. Heart rate variability — standards of measurement, physiological interpretation, and clinical use. Circulation, vol. 93 (1996), pp. 1043–1065.

2. T. Hastie, R. Tibshirani, J. Friedman. The Elements of Statistical Learning, 2nd edition. Springer (2009), 533 p.

3. M. Malik, A. J. Camm. Components of heart rate variability. What they really mean and what we really measure. Am. J. Cardiol., vol. 72 (1993), pp. 821–822.

4. V. Uspenskiy. Information Function of the Heart. Clinical Medicine, vol. 86, no. 5 (2008), pp. 4–13.

5. V. Uspenskiy. Information Function of the Heart. A Measurement Model. Measurement 2011, Proceedings of the 8th International Conference (Slovakia, 2011), pp. 383–386.

6. V. Uspenskiy. Diagnostic System Based on the Information Analysis of Electrocardiogram. MECO 2012. Advances and Challenges in Embedded Computing (Bar, Montenegro, June 19-21, 2012), pp. 74–76.


9610-46:Advanced Mathematical and Computational Tools

Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 385–391)

APPLICATION OF TWO ROBUST METHODS IN
INTERLABORATORY COMPARISONS WITH SMALL SAMPLES

EVGENIY T. VOLODARSKY

State Technological University Kiev Politekhnika, Ukraine

ZYGMUNT L. WARSZA

Industrial Research Institute of Automation and Measurement

PIAP Warszawa Poland

Email: [email protected] & [email protected]

Two robust methods of assessing the value and the uncertainty of the measurand from
samples with a small number of experimental data are presented. These methods should be
used when some measurement results contain outliers, i.e. when the values of certain
measurements differ significantly from the others. They allow credible statistical
parameters of the measurements to be established using all experimental data of the small samples.
The considerations are illustrated by a few numerical examples of an
interlaboratory key comparison, where the number of measurement data is very small.
The results of calculations obtained for the numerical example by the
classical method, without and with the rejection of outliers, and by two robust methods, the
rescaled median absolute deviation MADS and an iterative two-criteria method, are compared.

Keywords: precision, uncertainty of measurements, outliers, robust statistics, inter-

laboratory comparisons, proficiency testing.

1. Introduction

In many experimental studies in various fields, including technical and
scientific research, interlaboratory comparisons and laboratory proficiency testing,
the measurement samples may contain only a few elements. This occurs
because of the high costs of measurements, the use of destructive methods, the
poor availability of objects for testing, or the impossibility of multiple tests due to
long or limited execution times. For small samples the measurement
result and its uncertainty uA evaluated by the GUM recommendations [1]
depend significantly on the outliers. Therefore the obtained values sometimes
may even be unreliable or unrealistic. Removing only one observation from a
small sample significantly reduces the credibility of the evaluation results. For
example, for a very small sample of 4 elements the relative standard deviation of
the uncertainty s(uA)/uA is as high as 42%, and for n = 3 it increases even up to
52% (GUM [1], Table E.1 in Appendix E.1). The removal of only one
observation from such a small sample increases the relative standard deviation


of the uncertainty by approximately 24%. So the general tendency for small samples
with outliers is to use robust statistical methods, which apply all data
obtained experimentally, including the outliers. These methods have been developed and
used since the late 1970s. They are highly resistant to
the influence of outliers. Data previously considered in conventional methods
to be "bad" can now be used successfully. The literature on these methods is quite
rich; an overview of the basic items is given in [6]-[9], [12].
Robust methods reduce, more than conventional methods do, the impact of excessive
errors caused by different, usually unrecognized, sources. The term robust means
resistance (immunity) to irregularities and inhomogeneities of the sample data.

In robust statistics the outlier data are not removed; instead, different
ways are used to modify their values or their participation in the procedures estimating the
statistical parameters of the sample. A number of robust statistical methods
(among others) are programmed in MATLAB. They can be included in the main
international metrological recommendations [3]-[5] and in the new upgraded GUM.
Various data processing tasks appear constantly in new applications of
robust statistical methods, including the calibration of multi-parameter
measurements in chemometrics. One of the areas where robust methods could
also be usefully applied is the estimation of the accuracy of results obtained by
some measurement method in inter-laboratory comparison experiments [6], [9].

2. Method of rescaled median deviation

In the simplest robust method, for a sample of n elements the Median
Absolute Deviation is used:

MAD = med_{i≤n} |x_i − M|    (1)

where x_i is the i-th element of the sample and M = med_{i≤n} x_i is the median.

This simple robust procedure is as follows:
- for all n data x_i, ordered by value, the median M is determined and
taken as the estimate of the measurement result value,
- from the deviations of the sample data from this median, the median absolute
deviation MAD is calculated,
- the standard uncertainty s(x) of the measurand is taken as the rescaled median
deviation MADS:

s(x) ≡ MADS = κ(n) MAD    (2)

For a normal distribution the value κ∞ = 1.483 is the asymptotic limit of the
ratio s(x)/MAD when n → ∞, i.e. for the general population. Use of κ∞ for
samples with a finite number n of measurements gives too low an assessment of
the uncertainty, as s(x_n) > s(x_∞). Then, for a more accurate estimation, a coefficient
κ(n) dependent on the number of elements n in the data sample has to be


applied. We use values of the coefficient κ(n) published by Randa of NIST in

[10]. Other values are proposed by Haque and Khan but only for n>20 [11].
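As an illustration of Eqs. (1)–(2), a minimal sketch in Python follows; the κ(n) values for small n must be taken from Randa [10] (the empty dictionary below is only a placeholder for that table), and κ∞ = 1.483 is used as a fallback.

import numpy as np

# Placeholder for the small-sample coefficients kappa(n) from Randa [10];
# kappa_inf = 1.483 is the asymptotic value for n -> infinity.
KAPPA = {}            # e.g. {4: ..., 9: ...} filled from [10]
KAPPA_INF = 1.483

def mads_estimate(x):
    """Return (median, MADS) for the sample x according to Eqs. (1)-(2)."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)                      # estimate of the measurand value
    mad = np.median(np.abs(x - med))        # median absolute deviation, Eq. (1)
    kappa = KAPPA.get(len(x), KAPPA_INF)    # rescaling coefficient kappa(n)
    return med, kappa * mad                 # Eq. (2): s(x) = kappa(n) * MAD

# Example: the nine laboratory means from the numerical example in section 4.
x = [17.570, 19.500, 20.100, 20.155, 20.300, 20.705, 20.940, 21.185, 24.140]
print(mads_estimate(x))

With the tabulated κ(9) from [10] this corresponds to the MADS column of Table 1 in section 4.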

3. The robust iterative method

More reliable statistical parameters than those from the rescaled median method above can
be obtained by iterative robust methods. In the method of robust statistics
considered here the outlier data are drawn to positions closer to the center
of the distribution. This operation is called winsorization after the
American mathematician Winsor. Samples with outlier data should not be
modelled by a single normal distribution, and the least squares method
(LSM) is not useful, as the share of a single datum in it increases with the square of its
distance from the center of concentration. More resistant to large deviations is
the least modules criterion (LMM) given by Laplace. So, in robust
methods, many ways of combining both criteria are used. It is assumed that
only the central part of the PDF (probability density function) of the sample data
distribution, i.e. for small deviations from the estimate of the measurand value, does
not differ from the normal distribution. Only for these data can the least-squares criterion
LSM be used. Beyond the limits of this range the least modules criterion
LMM is used to reduce the impact of outliers. Following the works of Tukey [7] and
Huber, it is possible to apply for data processing the iterative robust
method known under the acronym IRLS (iteratively reweighted least squares). In this
method the following functional is used for the sensitivity:

Σ_{i=1}^{n} ρ(x_i − µ)    (3)

where ρ(x_i − µ) is a function dependent on the selected parameter c.

For observations with deviation values |ε| < cσ (where σ is the standard
deviation and c is a factor) a quadratic function is used, and for larger deviations the
modules |ε| = |x_i − µ| are minimized. So the function ρ(ε) is "milder" for data
outliers with values |ε| > cσ from the center of the sample distribution. The constant
c determines the degree of "robustness". The value of the constant c depends on
the percentage of "contamination" of the sample distribution: for 1% c = 2, and
for 5% c = 1.4; commonly c = 1.5 is used. Experimental data are modified in
accordance with the selected criterion as follows:

x_i* = x_i          for |x_i − µ̂| ≤ cσ̂
x_i* = µ̂ − cσ̂     for x_i < µ̂ − cσ̂        (4)
x_i* = µ̂ + cσ̂     for x_i > µ̂ + cσ̂

where µ̂ = med x_i is obtained from the data x_i ranked in ascending order.


"Treatment" of the data by (4) is one of the ways of winsoryzation. As resistant

to outliers the estimate of the sample data grouping center µ the median i

xmed

shall be preliminary adopted. Huber [12] finds that the best assessment of the

distribution center is the midrange between the lower first (p=1/4) and the higher

third (p=3/4) of the sample quartiles (inter-quartile midrange), see Fig 1.

Fig. 1. Definition of the inter-quartile midrange: dotted lines are the ordinates of the first and third quartiles,
a = µ − 0.6745σ, b = µ + 0.6745σ.

The iterative procedure starts after arranging the elements of the sample
according to their values x_1, x_2, ..., x_n. Then the center of data grouping is

x* = med x_i,  i = 1, ..., n    (5)

In this case the standard deviation is

s* = 1.483 · MAD_n    (6)

Then for c = 1.5, with φ = 1.5 s*, the boundaries of the range x* ± φ can be determined,
with which the original data x_i are compared. Data protruding beyond this range
are pulled to these boundaries and the whole procedure is repeated. In any step (j) of
the iterative procedure, after the values have been modified in step (j−1) according to
the conditions (4), a new mean value and a new standard deviation
of the sample truncated on both sides (2 · 13.4%) are found in turn as a result of using formula (4):

x*_(j) = (1/n) Σ_{i=1}^{n} x_{i(j)}    (7)

s_(j) = 1.134 · sqrt( Σ_{i=1}^{n} ( x_{i(j)} − x*_(j) )² / (n − 1) )    (8)

The factor 1.134 is used when c = 1.5. If c =1.4 this coefficient is 1.162.

The resulting value s_(j) is used to calculate a new distance φ_(j) = 1.5 s_(j) to the
boundaries of the inter-quartile interval, and again the data falling outside as outliers are
pulled onto them, and the procedure is continued as above. Convergence of the
algorithm is determined by comparing the calculated values x*_(j) and x*_(j−1) of the
current and the previous iteration step. The procedure is repeated until the changes


of x*_(j) and s_(j) between successive steps become minimal. The procedure is
stopped after j = m steps, when the difference of the standard deviations
s_(m) − s_(m−1) for the two successive steps has become acceptably small.

The robust iterative two-criteria method IRLS does not have the drawbacks of the median
method. It also allows the outliers, i.e. the data of maximum absolute deviation, to be included
in the calculation of the standard deviation of the sample, by bringing
them to the borders of the inter-quartile range of the normal probability
distribution of the data. This procedure is recommended for inter-laboratory
comparison measurements [4], but without explanations of the formulas.
An example of its use is presented in section 4.

The test of the data homogeneity conditions and the determination of the limits of extreme
deviations for a small number of data in the laboratory proficiency procedure are
described in detail in [13]. Their use is illustrated there on the numerical example
of a very small, four-element sample of data with an outlier, where the results of
the Grubbs test and of the robust method are compared. These problems are also presented
in [14].
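The procedure of Eqs. (4)–(8) can be summarised in a short Python sketch; this is only an illustrative reading of the algorithm described above (with c = 1.5 and the factor 1.134 quoted in the text), not a reference implementation.

import numpy as np

def robust_iterative(x, c=1.5, factor=1.134, tol=1e-6, max_iter=100):
    """Two-criteria iterative robust estimate (winsorization), Eqs. (4)-(8)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    mu = np.median(x)                         # initial grouping center, Eq. (5)
    s = 1.483 * np.median(np.abs(x - mu))     # initial standard deviation, Eq. (6)
    for _ in range(max_iter):
        lo, hi = mu - c * s, mu + c * s
        x_star = np.clip(x, lo, hi)           # pull outliers to the boundaries, Eq. (4)
        mu_new = x_star.mean()                # new mean value, Eq. (7)
        s_new = factor * np.sqrt(((x_star - mu_new) ** 2).sum() / (n - 1))  # Eq. (8)
        converged = abs(s_new - s) < tol and abs(mu_new - mu) < tol
        mu, s = mu_new, s_new
        if converged:
            break
    return mu, s

# Example: the nine laboratory means of the numerical example in section 4;
# the result is close to the "robust iterative" column of Table 1.
x = [17.570, 19.500, 20.100, 20.155, 20.300, 20.705, 20.940, 21.185, 24.140]
print(robust_iterative(x))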

4. Numerical example

In this example, the mean value of the measurement results of nine laboratories and
its estimated uncertainty are calculated by two classical and the two above robust
methods, and the obtained values are compared. The measurement data are taken from
[2]. Nine laboratories conducted a joint experiment involving comparative
measurements by a tested method to assess its accuracy. It was assumed initially
that the credibility of all laboratory measurements is the same. From the
measurements made by the tested method in n = 9 laboratories, the mean
values x1–x9 given below in ascending order were obtained:

17.570 19.500 20.100 20.155 20.300 20.705 20.940 21.185 24.140.

The two extreme results x1 and x9 are significant outliers. The results obtained by
the four methods are shown in Table 1.

Table 1. Comparison of the results obtained by four methods.

Method                  For all data    Rejected x1, x9       Robust MADS      Robust iterative
                                        by Grubbs crit.
Result value            x̄0 = 20.511    m = 20.4              med = 20.3       x̄* = 20.412
Standard uncertainty    s0 = 1.727      s = 0.501             s(x9) = 1.045    s* = 1.039

For all the 9 initial data x_i = x_i(0) the mean value is x̄0 = 20.511 and the sample
standard deviation is s0 = 1.727. In the traditional model (cross-contamination) it is


assumed that only valid observations are derived from a normal distribution. A
consequence of that is the use of a proper test, e.g. the Grubbs test, to find the outliers:

G = max |x_i − x̄| / s    (9)

After rejection of the two outliers x1(0) = 17.570 and x9(0) = 24.140, the average value
x̄ = 20.41 is obtained from the remaining data as the result common to the whole
experiment, together with a much lower standard deviation s = 0.50, both
calculated from the measurements of 7 laboratories only. These assessments are of
relatively lower statistical reliability.

In the classical approach the average values x̄0 and m calculated by both
methods differ relatively little. The standard uncertainty s0 of the data of all nine
laboratories is very high. After elimination of the two outliers by the Grubbs
criterion, the uncertainty calculated for seven laboratories is almost 3.5 times
lower. However, the measurements are unreasonably idealized here. The reliability
of the averaged data for 7 labs is decreasing, as the formula s(uA)/uA = 1/√(2(n−1))
(Appendix E.1, GUM [1]) shows that the relative standard deviation of the
standard measurement uncertainty increases from 25% to 29%.

For both robust methods the values of the data grouping center are nearly the same.
Their uncertainties differ from each other by only 9% and lie between the two values of
the classical method. For the iterative method s* = 1.039 > s is achieved. The
mean value and standard deviation determined by this method are based on the
data of all laboratories and seem to be closer to the values which would be obtained for
a larger number of independent measurements treated as the general population.

5. Summary

The rescaled median deviation method given in section 2 is very simple, but it
does not give correct results when the outlier is far from the rest of the data.
The iterative method of section 3 is more complicated, but its algorithm is easier to
automate. With the introduction of the threshold ±cσ decreasing the sensitivity to
data outliers, it is oriented mainly towards determining a robust assessment of the uncertainty.
The results of the calculations carried out in section 4 showed the usefulness of the
application of the two-criteria iterative robust method for determining the
statistical parameters of samples with a small number of data, when they are
taken from a general population of the assumed normal distribution but
include results significantly different from the others. It allows the value of the
result and the accuracy of the test methods to be assessed more objectively.
The analysis shows that the evaluation of the results presented by the participating
laboratories should take into account the number n of samples obtained for the
investigated objects. When a sample has a small number of items, to evaluate
the performance of the results one can use the robust method of iterative processing


of the data with winsorization of the outliers. In this case a much smaller
variance and greater credibility than by standard methods are obtained.
The application of robust methods should be added to the upgraded GUM
version.

References:

1. Guide to the Expression of Uncertainty in Measurement (GUM), revised

and corrected version of GUM 1995, BIPM_JCGM 100:2008.

2. ISO 5725-2:1994 - Accuracy (trueness and precision) of measurement

methods and results. Part 2: Basic method for the determination of

repeatability and reproducibility of a standard measurement method.

3. EN ISO/IEC 17025: 2005 General requirements for the competence of

testing and calibration laboratories. ICS 03.120.20

4. ISO 13528-2005 Statistical methods for use in proficiency testing by

interlaboratory comparisons. Annex C.

5. EN ISO/IEC 17043 Conformity assessment — General requirements for

proficiency testing.

6. Belli M., Ellison S. L. et al.: Implementation of proficiency testing

schemes for a limited number of participants. Accreditation and Quality

Assurance (2007) 12:391–398

7. Tukey J. W.: Exploratory Data Analysis. Addison-Wesley. 1978

8. Olive David J.: Applied Robust Statistics - Southern Illinois University

Department of Mathematics. June 23, 2008

9. Wilrich P.T.: Robust estimates of the theoretical standard deviation to be

used in interlaboratory precision experiments. Accreditation and Quality

Assurance 2007, v. 12, Issue 5, pp 231-240

10. Randa J.: Update to Proposal for KCRV & Degree of Equivalence for

GTRF Key Comparisons. NIST, 2005 GT-RF / 2005-04 Internet.

11. Haque M. Ershadul, Khan Jafar A., Globally Robust Confidence Intervals

for Location. Dhaka Univ. J. Sci. v. 60(1), 2012, p.109-113

12. Huber P. J., Ronchetti E. M.: Robust Statistics. 2nd ed. Wiley 2011 pp. 380

13. Volodarski E. T., Warsza Z. L.: Applications of the robust statistic

estimation on the example of inter-laboratory measurements. Przegląd

Elektrotechniczny - Electrical Review 11, 2013 p.260 -267 (in Polish)

14. Volodarski E. T., Warsza Z. L.: Robust estimation in multi-laboratory

measurements for samples with small number of elements. Electronic page

of AMCTM X Symposium, 9 -12 September 2014, VNIM St. Petersburg.


9610-47:Advanced Mathematical and Computational Tools

Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 392–399)

VALIDATION OF CMM EVALUATION SOFTWARE

USING TraCIM*

K. WENDT†, M. FRANKE, F. HÄRTIG,

Physikalisch-Technische Bundesanstalt (PTB),

D-38116 Braunschweig, Germany †E-mail: [email protected]

www.ptb.de

Computational software used in metrology can be validated online at the point of use

based on test data and associated reference results. Since modern metrological

applications often use complex algorithms for calculating results, it is very important that all

computational links are recognized explicitly and are known to be operating correctly. In

order to establish traceability to national standards also in metrological computation, the

European project EMRP NEW06 TraCIM was launched by the EC and the European

metrology association EURAMET. An example from the field of coordinate metrology is

used to explain the basic concept: the validation of least squares fitting algorithms for

regular features such as cylinders, cones, planes or spheres via the Internet by means of

the TraCIM system.

Keywords: Software test, validation of least squares fit, substitute element, TraCIM.

1. Introduction

In the past, certain national metrological institutes and organizations for

standardization (e.g. NIST, PTB, ISO 10360-6, and B89.4) have put some effort

into establishing standards for testing evaluation software in the field of

coordinate metrology, in particular software for fitting an associated feature to

data points measured by a coordinate measuring machine (CMM) on a real

workpiece. However, existing implementations for the validation of such

algorithms are error-prone, time-consuming and cost-intensive as there is

manual interaction needed on both sides involved, the testing body and the

manufacturer of the software under test. Furthermore, the present situation is

unsatisfactory as it leaves the end user uncertain of whether his particular

installation of the software can still be considered as valid, e.g. after the release

of updates of such software. In order to facilitate and to automate the validation

* TraCIM is a network to deliver computational traceability in metrology at the point of use. The

work is jointly funded by the EMRP participating countries within EURAMET and the European

Union (Joint Research Project (JRP) NEW06).


of metrological software at the point of use, the TraCIM project was initiated,

funded by the EMRP [1].

2. Overview of the TraCIM system

One of the main objectives of the TraCIM project (Traceability of Computa-

tionally-Intensive Metrology) is to develop new technology that allows users to

validate their software directly at the point of use (e.g. on the measuring

instrument itself) and at any time. To achieve this goal, an entire infrastructure

will be provided, including not only technical, but also legal and commercial

aspects.

TraCIM has three major aspects, its technical implementation, legal issues and

commercial requirements. The technical implementation provides a client-server

architecture which allows a direct link between the NMIs and the service users.

It is a fundamental principle that the TraCIM service is provided and hosted

only by national metrology institutes (NMI) or other authorized organizations.

These institutions assume del credere liability and are ultimately most able to

guarantee the correctness of the test results.

TraCIM’s IT architecture consists of four central modules (Figure 1):

(a) The TraCIM core system is a JavaEE application running on a JBoss

server which processes requests from test client software (ordering of

tests, sending and receiving test data, sending certificates), stores

customer and order data and communicates with a Web shop and expert

modules. It is operated by a competent metrology institute.

(b) The so-called expert modules are applications, which provide specific

test data on demand, compare reference data to data calculated by the

software under test and issue test certificates or test reports. Each

expert module operates basically autonomously. Since the individual

tests may vary significantly from one test application to another, only

few input/output parameters have been specified for the data traffic

between the expert module and the core application. This applies, for

instance, to the support of a software interface in JAVA which allows

the expert module to be linked to the server system.

(c) The Web shop provides methods for user registration and for ordering

tests. Additionally a payment system may be integrated. Both modules

are currently still in development.

(d) The client software is an interface program on the computer of the user,

which is preferably smoothly integrated in the user application. It is

responsible for connecting the software under test with the TraCIM

server. The client interface allows the reception of test data and – after

processing them – the sending of the calculated results directly to the


TraCIM core application. Client-server communication runs via a REST interface. Hereby, the data are embedded into an XML structure in which test data can be defined in a free data format (such as binary formats, existing test data structures or newly specified formats) depending on the application. The provider of the expert module is solely responsible for the test data format, the test data and the reference results.

To order a software test, users can visit the TraCIM Web shop via an Internet browser. There, they will find information about available tests and costs. When selecting a specific test, the Web shop will prompt the user for additional information required for that test (extent of the test, range of test data, etc.) and forward it to a payment system, if present. At the end of the ordering process the customer will receive an e-mail with an individual order key. At the current stage of implementation the Web shop is not yet accessible. Until the Web shop has been set up, interested users can contact the TraCIM secretariat at PTB directly at [email protected] to register and to have the chance to see and try out the TraCIM system.

Fig. 1: TraCIM modules

The validation of CMM evaluation software is subject to charges. Figure 2 explains the costs of the so-called ‘Gaussian test’ which is already offered by the German national metrology institute (PTB) in the department of coordinate metrology. Various types of tests are on offer. Besides an individual test, also packages of 10 and 50 tests are available.


Fig. 2: Business concept Gaussian test of PTB

After successfully completing the payment process, the test client can

submit the order key to the TraCIM System. On acceptance of the key, it will

send test data and administrative information to the client computer together

with a process key, which identifies the specific test (important if an order key is

good for more than one test) (Fig. 3a).

Fig. 3a: Mandatory TraCIM header Fig. 3b: Mandatory data elements of the

calculation report

The user then has to process the test data by means of the software under
test, compute the test result referring to the computational aim for the specific
test, format it according to the TraCIM expert module requirements and send the

calculation report back to the TraCIM system (Fig. 3b). Finally, the addressed

TraCIM expert module evaluates the result, generates a report and a certificate


and sends both to the core application, which forwards both of them to the

client.

To exchange data between the server and the client, REST is used, which is

a technique using the HTTP/HTTPS protocol. The exchange of data is provided

via a specified XML format. The data format contains a number of elements that

are mandatory. The most important one is the ‘process key’. This key allows the

unambiguous assignment of test data, test results and the test certificate and

prevents a ‘mismatch’ of sent and received data. The format of the elements is
clearly specified in TraCIM, except for those which are test-specific. Elements
such as the ‘test package’ or ‘result package’ contain information and data
specially tailored for the corresponding software test (see section 3). Therefore,
the structure and format of these data packages strongly depend on the type of
the particular software test. All data sent and received through the TraCIM
server are stored in a database for archival purposes and for reasons of
traceability. This applies to data exchanged with clients as well as with expert
modules.
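For illustration only, the sketch below (Python) shows what such a client-side exchange could look like; the server address and the XML element names (processKey, resultPackage, ...) are hypothetical placeholders, since the actual structure is defined by the TraCIM XML schemas and the respective expert module.

# Illustrative only: the endpoint URL and all XML element names are
# hypothetical; the real structure is defined by the TraCIM schemas.
import xml.etree.ElementTree as ET
import requests

SERVER = "https://tracim.example.org/api"   # placeholder address
process_key = "0123-4567-89AB"              # received together with the test data

report = ET.Element("calculationReport")
ET.SubElement(report, "processKey").text = process_key
result = ET.SubElement(report, "resultPackage")
# The content of the result package is test-specific (free format);
# a single fitted element is shown here as an example.
elem = ET.SubElement(result, "element", attrib={"type": "sphere", "id": "1"})
ET.SubElement(elem, "centre").text = "10.0 20.0 30.0"
ET.SubElement(elem, "radius").text = "5.0"

payload = ET.tostring(report, encoding="utf-8", xml_declaration=True)
response = requests.post(SERVER + "/report", data=payload,
                         headers={"Content-Type": "application/xml"})
print(response.status_code)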

3. Outline of the test for Gaussian best fit elements

This section gives a description of the software design of a specific expert

module for testing Gaussian best fit elements (the so-called ‘Gaussian expert’).

The test is targeted towards software manufacturers in the domain of coordinate

metrology. It should prove that their evaluation software is operating correctly

within specified error tolerances.

It is only possible to verify and validate a software algorithm when the kind of

mathematical problem it is intended to solve has been clearly defined. Formal

and complete statements of the computational aims are available in the TraCIM

computational aims database [2]. These statements clearly show what

computational task the software for computing least squares best fit geometric

elements to data should execute.

3.1. Test data

In order to test software for computing Gaussian least squares best fit elements,

different sets of data are sent to the client. The data consist of point clouds. Each

point cloud is composed of data points or vectors, i.e. values of x-, y- and z-

coordinates representing certain geometric elements such as lines, planes,

circles, cylinders, cones, and spheres, or, more generally, computational objects

(Figure 4a). The test data sets are designed to simulate a range of sizes, shapes,

locations, orientations, and samplings of real inspection features. They are also


designed to simulate typical CMM measurement errors, including probing errors

and form deviations on workpieces. For instance, systematic and random errors

in the range of 20 to 50 µm have been superimposed on the nominal coordinates.

Each set of data has between 8 and 50 data points. A complete set of test data

comprises in total 44 different data sets. The individual data sets consist of
complete or non-extreme partial features, e.g. a spherical cap or a cylindrical gusset.

The test data are randomly selected from a database containing tens of thousands

of verified data sets and sent to the client in an XML format. Initial estimates of

the parameters to be calculated are not provided.

The test data are sent from the ‘Gaussian expert’ to the core server and then

via XML/REST to the client application. As the TraCIM system also supplies

the appropriate XML schemas, it is easy to validate and extract the data

received. The received points are directly imported into the program under test

to calculate the desired result parameters and subsequently to export the
calculated parameters in the test-specific XML format.

It is essential that the algorithm under test and the reference results are

designed for exactly the same computational aim. Therefore, a database with

computational aims is under construction, as part of the EMRP project TraCIM.

Fig. 4a: Gaussian test data Fig. 4b: Calculation report
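For orientation, a minimal sketch of a least squares (Gaussian) fit for one element type, the sphere, is given below; it merely illustrates the computational aim on synthetic data and is in no way the reference implementation of the ‘Gaussian expert’.

import numpy as np
from scipy.optimize import least_squares

def fit_sphere(points):
    """Gaussian sphere fit: minimize the sum of squared orthogonal
    distances of the data points from the sphere surface."""
    points = np.asarray(points, dtype=float)

    def residuals(p):
        centre, radius = p[:3], p[3]
        return np.linalg.norm(points - centre, axis=1) - radius

    centre0 = points.mean(axis=0)                        # crude initial estimate
    r0 = np.linalg.norm(points - centre0, axis=1).mean()
    fit = least_squares(residuals, np.r_[centre0, r0])
    return fit.x[:3], fit.x[3]

# Synthetic example: noisy points on a sphere of radius 5 centred at (10, 20, 30).
rng = np.random.default_rng(0)
directions = rng.normal(size=(30, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
pts = np.array([10.0, 20.0, 30.0]) + 5.0 * directions + rng.normal(scale=0.02, size=(30, 3))
centre, radius = fit_sphere(pts)
print(centre, radius)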


3.2. Test evaluation

As mentioned above the calculated test results are automatically sent from the

end user to the expert module via the TraCIM core application. Within the

‘Gaussian expert’ the calculated parameters are extracted from the XML

structure for all 44 geometric elements together with the maximum permissible

errors (MPE) specified by the client.

To verify the correctness of the calculated test parameters, they are

compared with reference values. However, the calculated Gaussian fit

parameters are not determinable independently of each other and are thus frequently

highly correlated. Therefore, corresponding to ISO 10360-6, four classes of

performance values are defined:

(a) Location:

The distance of the reference point normal to the calculated line, plane,

axis of cylinder or cone, whereby the reference point is defined as the

centroid of the data points lying on the associated line or plane or,

respectively, the projection of the centroid onto the axis of the

associated cylinder or cone. For a circle and a sphere it is defined as the

distance between the centre points.

(b) Orientation:

The angle between the reference and the calculated direction

(c) Size:

The difference between the reference and the calculated radius of a

cylinder, circle or cone.

(d) Angle:

The difference between the reference and calculated apex angle of a

cone.

It is the responsibility of the client to state the quality of the calculated results

by specifying maximum permissible errors for each class.

Assuming that in each class d refers to the maximum performance value

determined and MPEd refers to the maximum permissible error associated with

the class, then the software under test does not yield sufficiently accurate
results if, for any d, the following applies:

d > MPEd + Ud (1)

where Ud denotes the numerical test uncertainty, which quantifies how accurate

the reference values are. This value has to be determined by the tester/test body

providing the test. In case all values d are smaller than the corresponding limit

values (1), a test certificate is issued stating that the software successfully passed

the test.
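In code form, the acceptance rule (1) amounts to the following check (a sketch with illustrative class names and numbers only, not the expert module's internal representation):

def software_passes(performance, mpe, u_num):
    """Check criterion (1): the test fails if d > MPE_d + U_d in any class.

    performance, mpe, u_num: dicts keyed by class name
    ('location', 'orientation', 'size', 'angle') with the maximum
    performance value, the client's MPE and the numerical test
    uncertainty of the reference values, respectively."""
    return all(performance[c] <= mpe[c] + u_num[c] for c in performance)

# Illustrative numbers only.
d   = {"location": 0.08e-3, "orientation": 2e-6, "size": 0.05e-3, "angle": 1e-6}
mpe = {"location": 0.10e-3, "orientation": 5e-6, "size": 0.10e-3, "angle": 5e-6}
u   = {"location": 0.01e-3, "orientation": 1e-6, "size": 0.01e-3, "angle": 1e-6}
print("certificate issued" if software_passes(d, mpe, u) else "test failed")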


4. The TraCIM research project

The TraCIM system is being set up within the scope of the European Metrology

Research Project (EMRP) under the denomination ‘Traceability for

Computationally-Intensive Metrology’. The project is coordinated by the

National Physical Laboratory (NPL). Furthermore, it involves five other

European metrology institutes, four partners from industry and four universities.

The first implementation of the TraCIM system is being undertaken by the

Physikalisch-Technische Bundesanstalt (PTB).

5. Summary

The verification of application software is becoming increasingly challenging

due to the lack of validation tools that use adequate methods to check
the correctness of each software implementation at the point of use. This paper

aims to describe a ‘black-box’ test procedure, which allows each individual user

to verify the software at the point of use by means of verified test data. In

particular, criteria to assess the performance of software for fitting Gaussian

substitute elements are presented.

The development of appropriate tools, methods and procedures is part of the

Joint Research Project NEW06 ‘Traceability for computational-intensive

metrology’ (TraCIM).

References:

1. A.B. Forbes, I.M. Smith, F. Härtig, K. Wendt: Overview of EMRP Joint

Research Project NEW06 ’Traceability for Computational Intensive

Metrology‘, in Proc. Int. Conf. on Advanced Mathematical and Computa-

tional Tools in Metrology and Testing (AMCTM 2014), (St. Petersburg,

Russia, 2014)

2. http://www.tracim-cadb.npl.co.uk/tracim_compaims_menu.php


9610-48:Advanced Mathematical and Computational Tools

Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 400–408)

SEMI-PARAMETRIC POLYNOMIAL METHOD FOR

RETROSPECTIVE ESTIMATION OF THE CHANGE-POINT

OF PARAMETERS OF NON-GAUSSIAN SEQUENCES

S. V. ZABOLOTNII

Department of Radioengineering, Cherkasy State Technological University,

Cherkasy, 18000, Ukraine, www.chdtu.edu.ua

E-mail: [email protected]

Z. L. WARSZA

Industrial Research Institute of Automation and Measurement

PIAP Warszawa Poland

Email: [email protected] & [email protected]

An application of the maximization technique in the synthesis of polynomial adaptive

algorithms for retrospective (a posteriori) estimation of the change-point of the mean

value and standard deviation (uncertainty) of the non-Gaussian random sequences is

presented. Statistical simulation shows a significant increase in the accuracy of

polynomial estimates, which is achieved by taking into account the higher-order statistics

(cumulant coefficients) of handled statistical data.

Keywords: change-point, retrospective estimation, stochastic polynomial, non-Gaussian

sequence; moments, cumulant coefficients.

1. Introduction

One of the important tasks of the diagnosis of stochastic processes is the

measurement of the point at which the properties of the observed process are

subject to a change (disorder). This point is called “the change-point”. Statistical

methods of detecting the change-point can be used in real time or a posteriori.

The latter, carried out on a sample of fixed volume, are also called
retrospective methods [1]. Solving such problems is needed in many
applications, such as the continuous measurement of statistical components for the
diagnosis of some industrial processes, the detection of climate change [2],
genetic analysis of time series [3], and segmentation of speech signals and messages
of social networks [4]. Such a wide range of tasks requires the development of a
large variety of mathematical models and tools, including analysis of uncertainty.
Most theoretical papers devoted to the problem of discovering the time of
the “change-point” of a stochastic process are focused on the class of processes
described by the Gaussian law. However, the actual statistical data often differ


significantly from the Gaussian model. The classical parametric methods (e.g.
maximum likelihood MML, Bayesian) require a priori information about the form of the
distribution laws and are of high complexity. Thus, a
significant amount of contemporary research concerns the construction of
applied statistical methods which allow the required amount of a priori
information to be removed or minimized. Such methods are based on robust
statistical processing procedures that are insensitive to the “non-exactness” of
probabilistic models, or on nonparametric criteria independent of specific types
of distributions. The price for the “omission” of probabilistic properties of the handled
statistical data is a deterioration of the quality characteristics in comparison with
the optimal parametric methods.
The use of higher-order statistics (described by moments or cumulants) is
one of the alternative approaches to solving problems related to the processing of
non-Gaussian signals and data. In this paper the application of a novel statistical
method to solving problems of a posteriori estimation of change-points is
considered. The method is called the polynomial maximization method (PMM)
and was proposed by Kunchenko [5]. This method, used in conjunction with
the moment–cumulant description, allows the process of synthesis of adaptive
statistical algorithms to be simplified substantially. Studies of its effectiveness using
statistical modeling are also included.

2. The mathematical formulation of the problem

Let us consider the observed sample x = (x_1, x_2, ..., x_n) obtained by the regular
sampling of a measured random process. If autocorrelation is negligible, the elements
of this sample can be interpreted as a set of n independent random variables. The
probabilistic nature of this sample can be described by the initial moments
α_j(ϑ): the mean value ϑ, the variance σ² and the cumulant coefficients γ_l up to a
given order l = 3, ..., 2s. Up to some (a priori unknown) point of the discrete time
τ, the mean value is equal to ϑ_0, and then, at the time τ + 1, its value jumps to
ϑ_1. On the basis of the analysis of the entire sample of measurement observations it is
necessary to estimate the change-point τ and its uncertainty.

3. General algorithm of polynomial change-point estimation

Let x be a sample of identically distributed elements. Consider the algorithm
presented in [5], denoted by the acronym PMM. It is shown in that paper that the
estimate of an arbitrary parameter ϑ can be found by solving the following
stochastic equation with respect to ϑ:


Σ_{i=1}^{s} h_i(ϑ) [ (1/n) Σ_{v=1}^{n} x_v^i − α_i(ϑ) ] = 0   at   ϑ = ϑ̂ ,

where s is the order of the polynomial used for parameter estimation and α_i(ϑ) are the
theoretical initial moments of the i-th order.

Coefficients ( )ϑih (for si ,1= ) can be found by solving the system of linear

algebraic equations, given by conditions of minimization of variance (with the

appropriate order s) of the estimate of the parameter ϑ .

A new approach for finding the a posteriori estimates of the change-point,
proposed in this paper, is based on the application of the PMM method. This approach
uses a property of the following stochastic polynomials:

l_n^{(s)}(x, ϑ) = n k_0(ϑ) + Σ_{i=1}^{s} k_i(ϑ) Σ_{v=1}^{n} x_v^i ,    (1)

where

k_0(ϑ) = − Σ_{i=1}^{s} ∫_a^ϑ h_i(t) α_i(t) dt ,    k_i(ϑ) = ∫_a^ϑ h_i(t) dt ,   i = 1, ..., s,    (2a,b)

which ensures that their expectation attains a maximum, as a function of ϑ, at the true
value of this parameter.
The true value of the parameter ϑ belongs to some interval (a, b). If the stochastic
polynomial of the form (1) is maximized with respect to a parameter ϑ which
has a change-point (a step change from the value ϑ_0 to the value ϑ_1), then we can build the
polynomial statistics:

P_r^{(s)}(ϑ_0, ϑ_1) = r k_0(ϑ_0) + Σ_{i=1}^{s} k_i(ϑ_0) Σ_{v=1}^{r} x_v^i + (n − r) k_0(ϑ_1) + Σ_{i=1}^{s} k_i(ϑ_1) Σ_{v=r+1}^{n} x_v^i ,    (3)

which will have a maximum in a neighborhood of the true value τ of the

change-point. Thus, the general algorithm of applying PMM method for finding

the estimation of the moment of the change-point τ can be formulated as

follows:

τ̂ = arg max_{1 ≤ r ≤ n−1} P_r^{(s)}(ϑ_0, ϑ_1)    (3a)

4. A posteriori estimation of the change-point of mean value by maximum

likelihood method

One of the basic directions in the investigation of a posteriori change-point
problems is based on the idea of the maximization of the likelihood. It
was elaborated in detail by Hinkley [6], who proposed a general asymptotic


approach to obtain the distributions of a posteriori change-point estimates by the method of
maximum likelihood (MML). Application of this approach requires a priori
information about the distribution law of the statistical data before and after the
change point. For a Gaussian distribution it is known that the estimation of the mean
value by the MML method coincides with the linear estimation by the method of moments (MM):

θ̂ = (1/n) Σ_{v=1}^{n} x_v    (4)

The estimate of the form (4) is consistent and unbiased, so the non-parametric
MM estimator can be used for estimating the mean value of random variables
of an arbitrary distribution. However, this estimate is efficient only for the
Gaussian model. For this probabilistic model the logarithm of the maximum
likelihood function (MML) with known variance σ² is transformed [1] into
a statistic of the form:

T_r(θ_0, θ_1) = 2θ_0 Σ_{v=1}^{r} x_v − r θ_0² + 2θ_1 Σ_{v=r+1}^{n} x_v − (n − r) θ_1² ,    (5)

T_r(θ_0, θ_1) has a maximum in a neighborhood of the true value of the change-
point τ. Thus, the desired change-point estimate can be found by the algorithm:

τ̂ = arg max_{1 ≤ r ≤ n−1} T_r(θ_0, θ_1)    (5a)

Since the statistics (5) and (5a) do not depend on any other probabilistic
parameters, they can be used for nonparametric estimation of the change-point of
the mean value of random sequences with an arbitrary distribution. However, in
such situations (similarly to the case where the mean is evaluated according to
(4)) the nonparametric algorithms lose their optimality. To overcome this
difficulty, nonlinear estimation algorithms based on the maximization
of stochastic polynomials are described below. They allow the degree of non-Gaussianity
of the statistical data to be taken into account in a simple way.
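For illustration, the estimate (5a) can be computed directly. The Python sketch below maximizes the split Gaussian log-likelihood with known θ0 and θ1, which is equivalent to maximizing T_r in (5); the data used are synthetic.

import numpy as np

def change_point_mml(x, theta0, theta1):
    """Retrospective change-point estimate of the mean with known theta0, theta1.

    Equivalent to maximizing T_r in (5): for each candidate r the split
    negative sum of squares is evaluated and the maximizing r is returned."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    best_r, best_val = None, -np.inf
    for r in range(1, n):                       # change assumed after position r
        val = -np.sum((x[:r] - theta0) ** 2) - np.sum((x[r:] - theta1) ** 2)
        if val > best_val:
            best_r, best_val = r, val
    return best_r

# Synthetic example: the mean jumps from 0 to 1 after tau = 120 (n = 200).
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0.0, 1.0, 120), rng.normal(1.0, 1.0, 80)])
print(change_point_mml(x, theta0=0.0, theta1=1.0))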

5. Polynomial estimation of the change-point of mean value

It is known from [5] that the estimate of the mean value θ obtained by the PMM
method using a polynomial of degree s = 1 coincides with the linear estimate (4)
of the moment method MM. Hence the synthesis of polynomial
algorithms for estimating the change-point of this parameter is justified only for
degrees s ≥ 2. For the degree s = 2 the polynomial estimate of the mean value can be
calculated by solving the following quadratic equation:


γ_3 θ̂² − [ 2γ_3 (1/n) Σ_{v=1}^{n} x_v − (2 + γ_4) σ ] θ̂ − (2 + γ_4) σ (1/n) Σ_{v=1}^{n} x_v + γ_3 [ (1/n) Σ_{v=1}^{n} x_v² − σ² ] = 0    (6)

The analysis of eq. (6) shows that the estimated value θ̂_{s=2} depends
additionally on the coefficients of skewness γ_3 and excess kurtosis γ_4. If the values
of these parameters are equal to zero, as for example for the normal (Gaussian)
distribution, the polynomial estimate (6) reduces to
the classical estimate of the form (4). It is shown in [5] that the use of eq. (6)
ensures a higher accuracy (a decrease of the variance) than the estimate (4).

The asymptotic value (for n → ∞) of this reduction of the variance is given by the
following formula:

g(γ_3, γ_4) = 1 − γ_3² / (2 + γ_4)    (7)

Using the analytical expressions (2a,b) one can easily find that, for order

s=2, the coefficients maximizing the selected stochastic polynomial of the form

(1) in a neighborhood of the true value of the parameter θ are the following:

k_0(θ) = −(σ³ / (6Δ)) [ 2γ_3 θ³ + 3(2 + γ_4) σ θ² − 6γ_3 σ² θ ],

k_1(θ) = (σ³ / Δ) [ γ_3 θ² + (2 + γ_4) σ θ ],    k_2(θ) = −(σ³ / Δ) γ_3 θ ,    (8a-c)

where Δ = σ⁶ (2 + γ_4 − γ_3²).

In the presence of a priori information about the mean values θ_0 before and θ_1 after
the change-point, and under the condition θ_1 > θ_0, the polynomial statistic (3) for the
order s = 2 can be expressed as follows:

P_r^{(2)}(θ_0, θ_1) = (n − r) [ γ_3 σ² (θ_1 − θ_0) − (γ_3 / 3)(θ_1³ − θ_0³) − ((2 + γ_4) σ / 2)(θ_1² − θ_0²) ]
    + [ γ_3 (θ_1² − θ_0²) + (2 + γ_4) σ (θ_1 − θ_0) ] Σ_{v=r+1}^{n} x_v − γ_3 (θ_1 − θ_0) Σ_{v=r+1}^{n} x_v² .    (9)


6. Statistical modeling of a posteriori estimate of change-point

Based on the results of the above considerations, a software package in the MATLAB
environment has been developed. It allows the statistical modeling of the
proposed semi-parametric estimation procedures, applied to the
estimation of the change-points of the mean value and variance of non-Gaussian
random sequences, to be performed. Both single and multiple experiments (in the sense of the
Monte Carlo method) can be simulated. The accuracy obtained by the classical and the
proposed polynomial algorithms for experimental data can also be compared.

In Figure 1b the results of a numerical example are presented, obtained by the
estimation procedures for the change-point τ of the mean values
θ_0 = 0 and θ_1 = 1 of the non-Gaussian sequence shown in Figure 1a, where σ = 1,
γ_3 = 2 and γ_4 = 5. The calculations were performed using the classical version
(5) of the algorithm of a posteriori estimation by the MML method (coinciding with
PMM for s = 1) as well as by the polynomial algorithm (9) of PMM for s = 2.

Figure 1. Example of a posteriori estimation of the change-point of the mean value.

The results presented in Figure 1b clearly confirm the potentially higher
precision obtained by the polynomial statistics for s = 2, since the maximum of the
corresponding function is strongly marked, as compared with the smoothed form
of the statistic for s = 1.


The results of a single experiment do not allow the accuracy of the statistical
estimation algorithms to be compared adequately. As a comparative criterion of
efficiency, the ratio of the variances of the estimates of the change-point is used.
It can be obtained by a series of experiments with the same initial values of
the model parameters. It should be noted that theoretically the results of the
statistical algorithms of a posteriori estimation of the change-point can depend
on various factors, including e.g.: the relative value of the mean jump at the
change-point, the probabilistic nature (values of the higher-order
cumulant coefficients) of the non-Gaussian random sequences, and the presence of a priori
information about the values of the variable parameters. Furthermore, the accuracy
of the estimates of the change-point depends on the chosen sample size n
and on the accuracy of the variance estimates, i.e., on the number of
experiments m performed under the same initial conditions.

As an example, the results of statistical modeling for n = 200 and
m = 2000 are shown in Figures 2a,b. The coefficient G_2 is the ratio of the variances
of the change-point estimates of the mean value obtained by the PMM method with the
polynomial order s = 2 and with the s = 1 statistic, respectively. The value of G_2
characterizes the relative increase of statistical accuracy.

Figure 2. Experimental values of the coefficients of the variance reduction of the estimates of
the change-point of mean values.

Figure 2a shows the dependence of G_2 on the relative value of the jump
q = (θ_1 − θ_0)/σ at the change-point, obtained for different coefficients of
skewness γ_3 and kurtosis γ_4. Figure 2b presents the dependence of G_2 on γ_3
(for γ_4 = 10 and q = 0.5), obtained under different a priori information about
the mean values of the random sequences before and after the change point.


The analysis of these and many other experimental results confirms the
theoretical results concerning the effectiveness of the polynomial method in
change-point estimation. It turns out that the relative growth of accuracy is
roughly the same for the different formulations of the problem related to the
presence or absence of a priori information about the values of the variable
parameter. The improvement does not depend significantly on the relative
magnitude of the jump at the change-point. It is determined primarily by the
degree of non-Gaussianity of the process, which is expressed numerically by the
absolute values of the higher-order cumulant coefficients. More details and the
evaluation of the change-point of the variance can be found in our paper [7].

7. Conclusions

The results of the research lead to the general conclusion about the potentially
high efficiency of the implementation of the polynomial maximization method
PMM in the synthesis of simple adaptive algorithms for estimating the
change-points of parameters of stochastic processes with non-Gaussian statistical data.
The obtained results allowed a fundamentally new approach to be developed for
the construction of semi-parametric algorithms for a posteriori estimation of the
change-point. This approach is based on the application of stochastic polynomials.

References

1. Chen J., Gupta A. K. Parametric statistical change point analysis.

Birkhaeuser, p. 273, 2012.

2. Reeves J., Chen J., Wang X. L., Lund R., and Lu Q. A review and

comparison of change-point detection techniques for climate data. Journal

of Applied Meteorology and Climatology, 46 (6) p. 900 -915, 2007.

3. Wang Y., Wu C., Ji Z., Wang B., and Liang Y. Non-parametric change-

point method for differential gene expression detection. PLoS ONE, 6 (5):

e20060, 2011.

4. Liu S., Yamada M., Collier N., & Sugiyama M. Change-point detection in

time-series data by relative density-ratio estimation. Neural Networks,

vol.43, p.72- 83, 2013.

5. Kunchenko Y., Polynomial Parameter Estimations of Close to Gaussian

Random variables. Shaker Verlag, Aachen Germany, 2002.

6. Hinkley D. Inference about the change-point in a sequence of random
variables. Biometrika, 1970, vol. 57, No. 1, p. 1-17.


7. Zabolotnii S. V., Warsza Z. L., Semi-parametric estimation of the change-

point of parameters of the non-Gaussian sequences by polynomial

maximization method. Przegląd Elektrotechniczny - Electrical Review vol.

91, no 1 (2015) p. 102 -107 (in Polish)


9610-49:Advanced Mathematical and Computational Tools

Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 409–416)

USE OF A BAYESIAN APPROACH TO IMPROVE

UNCERTAINTY OF MODEL-BASED MEASUREMENTS BY

HYBRID MULTI-TOOL METROLOGY

NIEN FAN ZHANG

Statistical Engineering Division, National Institute of Standards and Technology,

Gaithersburg, MD 20899, USA

BRYAN M. BARNES, RICHARD M. SILVER, AND HUI ZHOU

Semiconductor and Dimensional Metrology Division, National Institute of Standards and

Technology, Gaithersburg, MD 20899, USA

In high resolution critical dimensional metrology, when modeling measurements, a

library of curves is usually assembled through the simulation of a multi-dimensional

parameter space. A nonlinear regression routine described in this paper is then used to

identify an optimum set of parameters that yields the closest experiment-to-theory

agreement and generates the model-based measurements for the desired parameters. To

improve the model-based measurements, other measurement techniques can also be used

to provide a priori information. In this paper, a Bayesian statistical approach is proposed

to allow the combination of different measurement techniques that are based on different

physical measurements. The effect of this hybrid metrology approach is shown to reduce

the uncertainties of the parameter estimators, i.e., the model-based measurements.

Keywords: Covariance matrix, critical dimension measurements, generalized least

squares estimator, nonlinear regression, prior information, simulation

1. Introduction

In high resolution critical dimensional metrology, when modeling measurements

a library of curves can be assembled through the simulation of a multi-

dimensional parameter space. A nonlinear regression is then used to identify an

optimum set of parameters that yields the closest experiment-to-theory

agreement. This approach assumes that the model is adequately describing the

physical conditions and that an acceptable fit is achieved with the best set of

parameters, which are the desired model-based measurements for those

parameters. However, measurement noise, model inaccuracy, and parametric

correlation all lead to measurement uncertainty in the fitting process for critical

dimension measurements. To improve the measurements, techniques based on

different physical measurement principles may be used to provide supplemental

a priori information and augment the parametric fitting. The Bayesian approach


proposed in this paper allows the combination of different measurement

techniques that are based on different physical measurement principles. The

effect of this approach will be shown to reduce the uncertainties of the

parameter estimators.

2. Nonlinear regression models for critical dimension study

A complete set of model-based measurements for a high resolution critical dimension study such as scatterfield microscopy (see [1]) or scanning electron microscopy includes $y_1,\ldots,y_N$, the measured values of a variable of interest $Y$, e.g., intensity, and $x_1,\ldots,x_N$, which represent the measurement conditions, e.g., the values of wavelength or angle, under which the $N$ data points $y_1,\ldots,y_N$ are obtained correspondingly. See [2], where the details for the case of optical measurements are presented. As mentioned in the Introduction, model-based simulations can be performed at each of $x_1,\ldots,x_N$ based upon a representation of the sample defined using $K$ measurement/tool parameters. The simulated response is denoted by $y(x_i;\mathbf{a})$, $i=1,\ldots,N$, where $\mathbf{a}=(a_1,\ldots,a_K)^T$ is a parameter vector representing the adjustable (i.e., variable) parameters, for example, line height, line width, etc. Our goal is to compare $y_1,\ldots,y_N$ with the simulated values $y(x_i;\mathbf{a})$, $i=1,\ldots,N$, obtained under the conditions $x_i$, $i=1,\ldots,N$, for the parameters $\mathbf{a}=(a_1,\ldots,a_K)^T$, to find an optimal estimator of the parameter vector $\mathbf{a}$. In general $y(x_i;\mathbf{a})$ is a nonlinear function of the parameter vector $\mathbf{a}$. We have a nonlinear regression for $y_i$ and $y(x_i;\mathbf{a})$ given by

$$ y_i = b + y(x_i;\mathbf{a}) + \varepsilon_i \quad \text{for } i=1,\ldots,N, \qquad (1) $$

where $b$ is an unknown constant and $\varepsilon_i$ is the corresponding random error with zero mean, and the parameters $\mathbf{a}$ are to be estimated. Using a first-order Taylor expansion at a specific point of the vector $\mathbf{a}$, $\mathbf{a}(0)=(a_1(0),\ldots,a_K(0))^T$, similar to Eq. 3 in [2], an approximation of the nonlinear function in (1) gives the linear regression model

$$ y_i = b + y(x_i;\mathbf{a}(0)) + \sum_{k=1}^{K} \left.\frac{\partial y(x_i;\mathbf{a})}{\partial a_k}\right|_{\mathbf{a}=\mathbf{a}(0)} \bigl(a_k - a_k(0)\bigr) + \varepsilon_i, \quad i=1,\ldots,N, \qquad (2) $$

where $y(x_i;\mathbf{a}(0))$ is the simulated value of $y(x_i;\mathbf{a})$ at $\mathbf{a}=\mathbf{a}(0)$ and $\left.\partial y(x_i;\mathbf{a})/\partial a_k\right|_{\mathbf{a}=\mathbf{a}(0)}$ is the value of the partial derivative of $y(x_i;\mathbf{a})$ with respect to $a_k$


at $\mathbf{a}=\mathbf{a}(0)$. The covariance matrix of $\boldsymbol{\varepsilon}=(\varepsilon_1,\ldots,\varepsilon_N)^T$ is denoted by $\mathbf{V}$. In general, $\mathbf{V}$ does not have to be a diagonal matrix. However, it is easy to reduce it to the case of a diagonal matrix by a matrix transformation; see p. 221 of [3]. Thus, without loss of generality, we assume that the random variables $\varepsilon_i$ are uncorrelated, namely, that $\mathbf{V}$ is a diagonal matrix denoted by $\mathbf{V}=\mathrm{diag}[\sigma_1^2,\ldots,\sigma_N^2]$. By re-parameterization, (2) can be written as

$$ y_i(0) = b + \sum_{k=1}^{K} D_{ik}(0)\,\beta_k(0) + \varepsilon_i, \quad i=1,\ldots,N, \qquad (3) $$

where

$$ \beta_k(0) = a_k - a_k(0), \qquad (4) $$

$$ D_{ik}(0) = \left.\frac{\partial y(x_i;\mathbf{a})}{\partial a_k}\right|_{\mathbf{a}=\mathbf{a}(0)}, \qquad (5) $$

and

$$ y_i(0) = y_i - y(x_i;\mathbf{a}(0)). \qquad (6) $$

The linear model in (3) is expressed in matrix form by

$$ \begin{pmatrix} y_1(0) \\ y_2(0) \\ \vdots \\ y_N(0) \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix} b + \begin{pmatrix} D_{11}(0) & \cdots & D_{1K}(0) \\ D_{21}(0) & \cdots & D_{2K}(0) \\ \vdots & & \vdots \\ D_{N1}(0) & \cdots & D_{NK}(0) \end{pmatrix} \begin{pmatrix} \beta_1(0) \\ \vdots \\ \beta_K(0) \end{pmatrix} + \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_N \end{pmatrix} = \mathbf{1}\,b + \mathbf{D}(0)\,\boldsymbol{\beta}(0) + \boldsymbol{\varepsilon} \qquad (7) $$

or

$$ \mathbf{Y}(0) = \mathbf{1}\,b + \mathbf{D}(0)\cdot\boldsymbol{\beta}(0) + \boldsymbol{\varepsilon}. \qquad (8) $$

We denote the vector of the new parameters by $\boldsymbol{\beta}'(0) = \begin{pmatrix} b \\ \boldsymbol{\beta}(0) \end{pmatrix}$, which is a $(K+1)\times 1$ vector, and $\mathbf{D}'(0) = (\mathbf{1}\;\;\mathbf{D}(0))$, which is an $N\times(K+1)$ matrix with $\mathbf{1}$ a unit vector of length $N$ and $\mathbf{D}(0)$ an $N\times K$ matrix. From (7),

$$ \mathbf{Y}(0) = \mathbf{D}'(0)\cdot\boldsymbol{\beta}'(0) + \boldsymbol{\varepsilon}. \qquad (9) $$

The mean of $\boldsymbol{\varepsilon}$ is $\mathbf{0}$ and the covariance matrix of $\mathbf{Y}(0)$ and of $\boldsymbol{\varepsilon}$ is $\mathbf{V}$. Based on (9) and the Gauss-Markov-Aitken theorem, the best linear unbiased estimator (B.L.U.E.) of $\boldsymbol{\beta}'(0)$ is the generalized least squares (GLS) estimator given by

$$ \hat{\boldsymbol{\beta}}'(0) = \left(\mathbf{D}'(0)^T \mathbf{V}^{-1} \mathbf{D}'(0)\right)^{-1} \mathbf{D}'(0)^T \mathbf{V}^{-1} \mathbf{Y}(0). \qquad (10) $$


See [4], pp. 97-98. Namely, among all the linear unbiased estimators of $\boldsymbol{\beta}'(0)$, $\hat{\boldsymbol{\beta}}'(0)$ is the one with the smallest variance. The corresponding estimator of the original parameters $\mathbf{a} = (a_k;\ k=1,\ldots,K)$ is $\hat{\mathbf{a}}$, given by

$$ \hat{a}_k = \hat{\beta}'_{k+1}(0) + a_k(0) \qquad (11) $$

for $k=1,\ldots,K$. The covariance matrix of the parameter estimators is given by

$$ \mathrm{Cov}[\hat{\boldsymbol{\beta}}'(0)] = \left(\mathbf{D}'(0)^T \mathbf{V}^{-1} \mathbf{D}'(0)\right)^{-1}. \qquad (12) $$

The standard deviations or standard uncertainties of $\hat{\beta}_k(0)$ or $\hat{a}_k$, $k=1,\ldots,K$, are given by the square roots of the diagonal elements of $\mathrm{Cov}[\hat{\mathbf{a}}]$ and are denoted by $\sigma_{\hat{a}_k}$.
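As a concrete illustration of (10)-(12) (an addition to this text, not part of the original paper), the following minimal Python sketch builds the augmented design matrix $\mathbf{D}'(0)=(\mathbf{1}\;\mathbf{D}(0))$ from a generic simulator, forms the GLS estimator with a diagonal $\mathbf{V}$, and returns the standard uncertainties $\sigma_{\hat a_k}$. The callable `y_model` and the finite-difference Jacobian are assumptions of this sketch; in practice the sensitivities come from the simulated library.

```python
import numpy as np

def gls_fit(y, x, y_model, a0, sigma):
    """Linearized GLS fit of Eqs. (2)-(12): y_i = b + y(x_i; a) + eps_i.

    y_model(x, a) is a hypothetical stand-in for the library-based simulator;
    a0 is the Taylor expansion point a(0); sigma gives V = diag(sigma**2).
    """
    y = np.asarray(y, dtype=float)
    a0 = np.asarray(a0, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    N, K = len(y), len(a0)
    y0 = y - y_model(x, a0)                      # y_i(0), Eq. (6)
    D = np.empty((N, K))                         # sensitivities D_ik(0), Eq. (5)
    for k in range(K):                           # central finite differences
        h = 1e-6 * max(abs(a0[k]), 1.0)
        ap, am = a0.copy(), a0.copy()
        ap[k] += h
        am[k] -= h
        D[:, k] = (y_model(x, ap) - y_model(x, am)) / (2.0 * h)
    Dp = np.column_stack([np.ones(N), D])        # D'(0) = (1  D(0))
    w = 1.0 / sigma**2                           # diagonal of V^{-1}
    cov = np.linalg.inv(Dp.T @ (w[:, None] * Dp))        # Eq. (12)
    beta_prime = cov @ (Dp.T @ (w * y0))                 # Eq. (10)
    b_hat, a_hat = beta_prime[0], beta_prime[1:] + a0    # Eq. (11)
    u_a = np.sqrt(np.diag(cov)[1:])              # standard uncertainties of a_hat
    return b_hat, a_hat, u_a, cov
```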

In practice, we need to check whether the parameter $b$ is statistically significantly different from zero. Under the normality assumption, a t-test can be carried out using $t = \hat{b}/\hat{\sigma}_{\hat{b}}$, where the denominator is the estimated standard deviation of $\hat{b}$ and is obtained from (12). The value is compared with the critical point at level $\alpha$, $t_{1-\alpha/2}(N-K-1)$, to determine whether the intercept $b$ differs significantly from zero or not. Once a significant $b$ is found, we may subtract it from $\mathbf{Y}(0)$, and then (1) reduces to Equation 1 in [2]. From now on in this paper, we assume that Equation 1 in [2] is an appropriate model, in which the assumption of zero mean of $\boldsymbol{\varepsilon}$ is satisfied, leading to the model described by Equation 8 in [2], i.e.,

$$ \mathbf{Y}(0) = \mathbf{D}(0)\cdot\boldsymbol{\beta}(0) + \boldsymbol{\varepsilon} \qquad (13) $$

and other corresponding results.
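For completeness, a small illustrative sketch (not from the paper) of this significance check for the intercept; the argument names are assumptions:

```python
import numpy as np
from scipy.stats import t

def intercept_significant(b_hat, cov_beta_prime, N, K, alpha=0.05):
    """Two-sided t-test of the intercept b against zero, using Cov from Eq. (12)."""
    sigma_b = np.sqrt(cov_beta_prime[0, 0])   # first entry of beta'(0) is b
    t_stat = b_hat / sigma_b
    t_crit = t.ppf(1.0 - alpha / 2.0, df=N - K - 1)
    return abs(t_stat) > t_crit, t_stat, t_crit
```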

3. Bayesian analysis and the use of prior information of parameters

Recent studies have shown that measurements made by some techniques are

intrinsically limited by correlation of the fitting parameters and some other

causes, leading to greater uncertainty than desired. See [5] and [6]. However,

quantitative information regarding these parameters, either from other

measurement techniques or a priori manufacturing knowledge of material

parameters may be available and used to improve measurement uncertainties. A

Bayesian approach was proposed in [2] to allow the combination of different

measurement techniques.


In Bayesian analysis, model parameters such as $\boldsymbol{\beta}(0)$ in (13), or equivalently $a_k$, $k=1,\ldots,K$, are treated as random and have their own probability distributions. We assume that among the $K$ parameters, the first $p$ ($p \le K$) parameters $a_1,\ldots,a_p$, or equivalently $\beta_1(0),\ldots,\beta_p(0)$, have prior probability distributions. The prior distributions of $a_1,\ldots,a_p$ or $\beta_1(0),\ldots,\beta_p(0)$ are combined with the data's likelihood function based on $\mathbf{Y}(0)$ and $\mathbf{D}(0)$ in (13) according to Bayes' Theorem to yield the posterior distributions of the parameters $\beta_1(0),\ldots,\beta_p(0)$. Under Gaussian assumptions for the random errors $\varepsilon_i$ in (1) and for the prior distributions of the parameters, a direct approach with analytical results can be applied. Specifically, we assume that each $a_k$, $k=1,\ldots,p$, has prior information and is Gaussian distributed. The mean of $(a_1,\ldots,a_p)$ is given by $\mathbf{a}^* = (a_1^*,\ldots,a_p^*)^T$ with a known covariance matrix. The means of the corresponding adjusted regression parameters are given by $\boldsymbol{\beta}_p^*(0) = (\beta_1^*(0),\ldots,\beta_p^*(0))^T$ with $\beta_k^*(0) = E[\beta_k(0)] = a_k^* - a_k(0)$, $k=1,\ldots,p$. We assume that the $a_i$, $i=1,\ldots,p$, are uncorrelated with each other, and the covariance matrix of $(a_1,\ldots,a_p)^T$, or equivalently of the sub-vector of $\boldsymbol{\beta}(0)$ denoted by $\boldsymbol{\beta}_p(0) = (\beta_1(0),\ldots,\beta_p(0))^T$, is given by $\Sigma_{\boldsymbol{\beta}_p(0)} = \mathrm{diag}[\sigma_{a_1}^2,\ldots,\sigma_{a_p}^2]$. That is, $\boldsymbol{\beta}_p(0)$ is Gaussian distributed with mean $\boldsymbol{\beta}_p^*(0)$ and covariance matrix $\Sigma_{\boldsymbol{\beta}_p(0)}$, i.e., $\boldsymbol{\beta}_p(0) \sim N(\boldsymbol{\beta}_p^*(0), \Sigma_{\boldsymbol{\beta}_p(0)})$. Referring to the regression model in (13), in the Bayesian approach we treat $\mathbf{Y}(0) \sim N(\mathbf{D}(0)\boldsymbol{\beta}(0), \mathbf{V})$ with $\boldsymbol{\beta}_p(0) \sim N(\boldsymbol{\beta}_p^*(0), \Sigma_{\boldsymbol{\beta}_p(0)})$ and the parameters excluding $\boldsymbol{\beta}_p(0)$ having noninformative prior distributions. The posterior distribution of $\boldsymbol{\beta}(0)$ can then be obtained by a weighted linear regression. See [7]. Therefore, we can treat the prior information on $\boldsymbol{\beta}_p(0)$ as $p$ additional “data points” of the response variable in (13). See [8], pp. 382-384. In general, for $k=1,\ldots,p$, we have

$$ \beta_k^*(0) = \beta_k(0) + \varepsilon_{N+k}. \qquad (14) $$

Combining (14) with (13), we have an expanded linear model given by

$$ \mathbf{Y}^*(0) = \mathbf{D}^*(0)\cdot\boldsymbol{\beta}(0) + \boldsymbol{\varepsilon}^*, \qquad (15) $$

where the $(N+p) \times K$ matrix


$$ \mathbf{D}^*(0) = \begin{pmatrix} \mathbf{D}(0) \\ \mathbf{1} \end{pmatrix} = \begin{pmatrix} D_{11}(0) & \cdots & D_{1p}(0) & \cdots & D_{1K}(0) \\ \vdots & & \vdots & & \vdots \\ D_{N1}(0) & \cdots & D_{Np}(0) & \cdots & D_{NK}(0) \\ 1 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & & \vdots \\ 0 & \cdots & 1 & \cdots & 0 \end{pmatrix} \qquad (16) $$

with $\mathbf{1}$ a $p \times K$ matrix consisting of $p$ row vectors of length $K$, each containing a single 1 with all other elements zero, as shown by the second equality. In (15), $\mathbf{Y}^*(0) = (y_1(0),\ldots,y_N(0),\beta_1^*(0),\ldots,\beta_p^*(0))^T$ and $\boldsymbol{\varepsilon}^* = (\varepsilon_1,\ldots,\varepsilon_N,\varepsilon_{N+1},\ldots,\varepsilon_{N+p})^T$ with $E[\boldsymbol{\varepsilon}^*] = \mathbf{0}$ and the covariance matrix of $\boldsymbol{\varepsilon}^*$ given by $\mathrm{Cov}[\boldsymbol{\varepsilon}^*] = \mathbf{V}^* = \mathrm{diag}[\sigma_1^2,\ldots,\sigma_N^2,\sigma_{a_1}^2,\ldots,\sigma_{a_p}^2]$. Similar to (10), the posterior estimators of $\boldsymbol{\beta}(0)$ based on the GLS are given by

$$ \hat{\boldsymbol{\beta}}^{\#}(0) = \left(\mathbf{D}^*(0)^T \mathbf{V}^{*-1} \mathbf{D}^*(0)\right)^{-1} \mathbf{D}^*(0)^T \mathbf{V}^{*-1} \mathbf{Y}^*(0) \qquad (17) $$

with the posterior covariance matrix of the parameter estimators given by

$$ \mathrm{Cov}[\hat{\mathbf{a}}^{\#}] = \mathrm{Cov}[\hat{\boldsymbol{\beta}}^{\#}(0)] = \left(\mathbf{D}^*(0)^T \mathbf{V}^{*-1} \mathbf{D}^*(0)\right)^{-1}, \qquad (18) $$

where $\hat{\mathbf{a}}^{\#} = \hat{\boldsymbol{\beta}}^{\#}(0) + \mathbf{a}(0)$ are the posterior estimators of the original parameters $\mathbf{a}$. It is clear that $\hat{\boldsymbol{\beta}}^{\#}(0)$ is the B.L.U.E. of $\boldsymbol{\beta}(0)$ based on the expanded model in (15). It is shown in [2] that

$$ \mathrm{Cov}[\hat{\boldsymbol{\beta}}^{\#}(0)] \le \mathrm{Cov}[\hat{\boldsymbol{\beta}}(0)]. \qquad (19) $$

That is, the difference of the covariance matrices of $\hat{\boldsymbol{\beta}}(0)$ and $\hat{\boldsymbol{\beta}}^{\#}(0)$ is a nonnegative definite matrix. See [4], pp. 97-98. From [9],

$$ \mathrm{Var}[\hat{\beta}_k^{\#}(0)] \le \mathrm{Var}[\hat{\beta}_k(0)] \quad \text{for } k=1,\ldots,K, \qquad (20) $$

or equivalently, $\mathrm{Var}[\hat{a}_k^{\#}] \le \mathrm{Var}[\hat{a}_k]$ for $k=1,\ldots,K$. This indicates that the variance of a posterior parameter estimator is equal to or smaller than that of the corresponding usual GLS estimator without prior information on the model parameters. In addition, no matter which parameter the prior information is used for, the variances of all posterior estimators are equal to or smaller than those of the corresponding usual GLS estimators.


The two variances are the same if and only if $\sigma_{a_i}^2 = \infty$ for all $i=1,\ldots,p$, i.e., if there is no prior information for any of these $p$ model parameters. Note that since both estimators are unbiased estimators of the model parameters, the posterior estimators correspondingly have smaller mean squared errors than the usual GLS estimators. By the same argument as for (19), it can be shown that

$$ \mathrm{Var}[\hat{\beta}_i^{\#}(0)] \le \sigma_{\beta_i}^2 \qquad (21) $$

for $i=1,\ldots,p$. Thus, the posterior variances are also correspondingly smaller than the prior variances of the model parameters. From (20) and (21), it is clear that when we use prior information about the regression model parameters from other metrology sources, e.g., atomic force microscopy for optical critical dimension parametric modeling, the resultant uncertainties of the posterior estimators are smaller than both the prior uncertainties and the uncertainties of the regular GLS estimators of the model parameters. Therefore, by using Bayesian analysis to form hybrid measurement results from multiple sources, the resultant uncertainties are improved. For illustration, a practical example is discussed in [2].
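To make the hybrid computation of (15)-(18) concrete, the following sketch (an illustrative addition under the stated Gaussian assumptions, not the authors' code) appends the prior means and standard deviations of the first p parameters as pseudo-observations and reuses the same weighted least-squares formula; all variable names are assumptions:

```python
import numpy as np

def hybrid_gls(D0, y0, sigma, prior_mean_beta, prior_sd, a0):
    """Posterior GLS estimators of Eqs. (15)-(18): priors enter as p pseudo-observations.

    D0 (N x K), y0 (N,), sigma (N,): data part of model (13);
    prior_mean_beta (p,): beta_k*(0) = a_k* - a_k(0); prior_sd (p,): sigma_{a_k};
    a0 (K,): expansion point a(0).
    """
    D0 = np.asarray(D0, dtype=float)
    N, K = D0.shape
    p = len(prior_mean_beta)
    lower = np.hstack([np.eye(p), np.zeros((p, K - p))])   # one 1 per row, Eq. (16)
    D_star = np.vstack([D0, lower])                        # D*(0)
    Y_star = np.concatenate([y0, prior_mean_beta])         # Y*(0)
    w = 1.0 / np.concatenate([sigma, prior_sd])**2         # diagonal of V*^{-1}
    cov_post = np.linalg.inv(D_star.T @ (w[:, None] * D_star))   # Eq. (18)
    beta_post = cov_post @ (D_star.T @ (w * Y_star))             # Eq. (17)
    a_post = beta_post + np.asarray(a0, dtype=float)             # a# = beta#(0) + a(0)
    return a_post, np.sqrt(np.diag(cov_post)), cov_post
```

Letting a prior standard deviation tend to infinity removes the weight of the corresponding pseudo-observation, and the posterior estimate reverts to the ordinary GLS result, consistent with the remark following (20).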

4. Conclusions

In this paper, a nonlinear regression is used to identify an optimum set of

parameters that yields the closest experiment-to-theory agreement and thus

generates the model-based measurements for the desired parameters with the

associated uncertainties. To improve the measurements, a Bayesian statistical

approach has been applied to combine measurement information from other

reference metrology platforms into the regression analysis for the original

model-based measurements. The resultant estimators of the model parameters have smaller variances and smaller mean squared errors than those based on the original model-based measurements alone. The

measurement uncertainties are also improved. The new methodology has

important implications in devising measurement strategies that take advantage

of the best measurement attributes of each individual technique.

References

1. R. M. Silver, B. M. Barnes, R. Attota, R. Jun, M. Stocker, E. Marx, H. J.

Patrick, Scatterfield microscopy for extending the limits of image-based

optical metrology, Applied Optics, 46, 4248-4257, 2007.

2. N. F. Zhang, R. M. Silver, H. Zhou, and B. M. Barnes, Improving optical

measurement uncertainty with combined multitool metrology using a

Bayesian approach, Applied Optics, 51(25), 6196-6206, 2012.


3. C. R. Rao, Linear Statistical Inference and Its Applications, 2nd Ed., John Wiley & Sons, New York, 1973.

4. C. R. Rao and H. Toutenburg, Linear Models: Least squares and

Alternatives, Springer, New York, 1995.

5. R. M. Silver, T. A. Germer, R. Attota, B. M. Barnes, B. Bunday, J. Allgair,

E. Marx, and J. Jun, Fundamental limits of optical critical dimension

metrology: a simulation study, Proc. SPIE, 6518, 65180U, 2007.

6. A. Vaid, B. B. Yan, Y. T. Jiang, M. Kelling, C. Hartig, J. Allgair, P.

Ebersbach, M. Sendelbach, N. Rana, A. Katnani, E. Mclellan, C. Archie, C.

Bozdog, H. Kim, M. Sendler, S. Ng, B. Sherman, B. Brill, I. Turovets, and

R. Urensky, Holistic metrology approach: hybrid metrology utilizing

scatterometry, critical dimension atomic force microscope and critical

dimension-scanning electron microscope, J. Micro/Nanolith. MEMS

MOEMS, 10, 043016, 2011.

7. D. V. Lindley and A. F. M. Smith, Bayes estimates for the linear model, J. R. Stat. Soc. Ser. B, 34(1), 1-41, 1972.

8. A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin, Bayesian Data

Analysis, 2nd Ed., Chapman & Hall/CRC, Boca Raton, 2004.

9. R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University

Press, Cambridge, 1985.


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 417–424)

APPLICATION OF EFFECTIVE NUMBER OF OBSERVATIONS

AND EFFECTIVE DEGREES OF FREEDOM FOR ANALYSIS OF

AUTOCORRELATED OBSERVATIONS

A. ZIEBA

AGH University of Science and Technology

Krakow, Poland

[email protected]

The minimum extension to the standard formalism of Type A uncertainty evaluation for

autocorrelated observations consists in retaining the arithmetic mean as an estimator of

expected value but changing the formulae for sample standard deviation, standard

deviation of the mean and coverage interval. These formulae can be expressed in a

compact form by introducing two quantities: the effective number of observations neff

and the effective degrees of freedom νeff. They are fixed real numbers when the

autocorrelation function is known and they become estimators when only an estimate of

the autocorrelation function is available. The presentation of the subject involves a

critical synthesis of available solutions, augmented by some new results and tested using

a Monte Carlo method.

Keywords: Autocorrelation; Type A uncertainty; effective number; unbiased estimator

1. Introduction

The standard algorithm of Type A evaluation of uncertainty is optimal when

observations are independent, identically distributed, and with a normal

distribution of measurement error. This paper concerns a case when the first

assumption is lifted, i. e., observations are autocorrelated. Autocorrelated data

are ubiquitous in Earth sciences [1] and economics, and can occur in other areas

of science.

Assuming that normality holds, a suitable stochastic model is fully defined

by the expectation µ, standard deviation σ and autocorrelation function ρk. An

exact solution for the best linear unbiased estimator (BLUE) can be derived

using generalized least squares [2]. This work considers a minimal

generalization of the standard formalism, with the sample mean $\bar{x}$ retained as an estimator of the expectation. The loss of efficiency for a finite sample is usually small [2].


On the other hand, autocorrelation has a strong influence on the evaluation of standard and expanded uncertainty. The well-known estimators of the variance, $s^2$, and of the variance of the mean, $s^2(\bar{x})$, are no longer unbiased, and the latter is not even consistent. When the autocorrelation function (ACF) is known, one can derive new estimators $s_a^2$ and $s_a^2(\bar{x})$ (the index $a$ stands for 'autocorrelated') for which all the well-known properties of the standard estimators are retained.

To keep the notation unchanged as much as possible, one can introduce two quantities, namely the effective number of observations and the effective degrees of freedom. Both depend only on the sample size $n$ and $\rho_k$, hence they are fixed numbers when the ACF is known. They become estimators when the ACF has to be estimated from the investigated sample $x_i$.

2. Effective Numbers for Known Autocorrelation Function

2.1. Definition of effective number of observations neff

The relation between the variance $\sigma^2$ of the data and the variance $\sigma_{\bar{x}}^2$ of the estimated mean for an autocorrelated sample of size $n$ is given by the formula [3]

$$ \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}\left[1 + \frac{2}{n}\sum_{k=1}^{n-1}(n-k)\,\rho_k\right]. \qquad (1) $$

It can be rewritten in the familiar form

$$ \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n_{\mathrm{eff}}}, \qquad (2) $$

with the effective number of observations defined as

$$ n_{\mathrm{eff}} = \frac{n}{1 + \dfrac{2}{n}\sum_{k=1}^{n-1}(n-k)\,\rho_k}. \qquad (3) $$
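As a numerical illustration (an addition to this text, not part of the original paper), Eq. (3) can be evaluated directly when the ACF is known; the sketch below, in Python, does so for the SMA(5) model used later in Fig. 1:

```python
import numpy as np

def n_eff(n, rho):
    """Effective number of observations, Eq. (3); rho[k-1] = rho_k for k >= 1."""
    rho = np.asarray(rho, dtype=float)
    m = min(len(rho), n - 1)              # lags that actually contribute
    k = np.arange(1, m + 1)
    return n / (1.0 + (2.0 / n) * np.sum((n - k) * rho[:m]))

# True ACF of the SMA(5) process used in Fig. 1: rho_1..rho_4 = 0.8, 0.6, 0.4, 0.2
print(n_eff(60, [0.8, 0.6, 0.4, 0.2]))   # approx. 12.3 effective observations out of 60
```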

As discussed in [4], this effective number was introduced independently in no

fewer than ten papers. There are several reasons to introduce this quantity.

• It organizes our intuitive understanding of the problem by suggesting that the stochastic properties of an autocorrelated sample are similar to those of another, usually smaller, number of independent observations.
• It allows us to express the formulae for standard deviations in a compact form.


• neff approaches unity in the limit of all ρk → 1. This transition provides

the link to the case of systematic error.

• An important application of neff concerns a sample of finite length taken

from a continuous stochastic process. It contains an infinite continuum

of points, but it can be characterized by a finite effective number of

observations [5].

2.2. Unbiased estimators of variance

The formulae for the unbiased estimators of the sample variance $s_a^2$ and of the variance of the mean $s_a^2(\bar{x})$ were given for the first time by Bayley and Hammersley [6] (a published proof is given in [3]).

The sample variance

$$ s_a^2 = C\,s^2, \quad \text{with} \quad C = \frac{n_{\mathrm{eff}}\,(n-1)}{n\,(n_{\mathrm{eff}}-1)}, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2, \qquad (4) $$

is expressed as the product of the standard sample variance $s^2$ (for uncorrelated data) and a correction factor $C$. As long as $n_{\mathrm{eff}}$ is not very small, the value of $s_a^2$ is rather close to $s^2$ because $C$ is not much larger than 1 (for positive correlations). However, the variance of the mean,

$$ s_a^2(\bar{x}) = \frac{s_a^2}{n_{\mathrm{eff}}} = \frac{1}{n\,(n_{\mathrm{eff}}-1)}\sum_{i=1}^{n}(x_i-\bar{x})^2, \qquad (5) $$

is markedly different even for a large sample size. The GUM formula for the Type A standard uncertainty consists in taking the square root of the unbiased estimator of the variance. Hence, the Type A standard uncertainty for autocorrelated observations is

$$ u(\bar{x}) \equiv \sqrt{s_a^2(\bar{x})}. \qquad (6) $$
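A minimal sketch (an illustrative addition; the function and variable names are assumptions) of Eqs. (3)-(6) for a sample whose ACF is known:

```python
import numpy as np

def type_a_autocorrelated(x, rho):
    """Unbiased variance estimator and Type A uncertainty, Eqs. (3)-(6), known ACF."""
    x = np.asarray(x, dtype=float)
    rho = np.asarray(rho, dtype=float)
    n = len(x)
    m = min(len(rho), n - 1)
    k = np.arange(1, m + 1)
    neff = n / (1.0 + (2.0 / n) * np.sum((n - k) * rho[:m]))   # Eq. (3)
    ss = np.sum((x - x.mean())**2)
    s2_a = neff * (n - 1) / (n * (neff - 1)) * (ss / (n - 1))  # Eq. (4)
    u_mean = np.sqrt(ss / (n * (neff - 1)))                    # Eqs. (5)-(6)
    return s2_a, u_mean
```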

2.3. Effective degree of freedom

This parameter is defined [6] by the formula

$$ \mathrm{Var}(s_a^2) = \frac{2\sigma^4}{\nu_{\mathrm{eff}}}, \qquad (7) $$

analogous to the expression $\mathrm{Var}(s^2) = 2\sigma^4/(n-1)$ for independent observations.


A complicated exact formula for $\nu_{\mathrm{eff}}$ is given in [6]. By retaining its leading terms one can derive [7] a simple asymptotic formula

$$ \nu_{\mathrm{eff}} \cong \frac{n}{1 + 2\sum_{k=1}^{n-1}\rho_k^2} - 1. \qquad (8) $$

The effective degrees of freedom is a real number in the interval $(0, n-1]$ for both positive and negative correlations. Note that $\nu_{\mathrm{eff}} \neq n_{\mathrm{eff}} - 1$.

It follows from definition (7) that $\nu_{\mathrm{eff}}$ can be used to estimate the relative dispersion of an estimator of the standard deviation through the asymptotic formula $\sigma(s_a) \cong \sigma\,(2\nu_{\mathrm{eff}})^{-1/2}$. Even more important is the application of $\nu_{\mathrm{eff}}$ in calculating the expanded uncertainty

$$ U(\bar{x}) \equiv k\,u(\bar{x}) = t_P\, s_a(\bar{x}) \qquad (9) $$

defining the coverage interval. Eq. (9) relies on the conjecture that the expansion coefficient $k$ is given by a critical value $t_P$ of a Student variable with the effective degrees of freedom. An MC simulation [10], aimed at checking that conjecture and the accuracy of the approximation (8) for finite samples, has shown that the presented formalism can safely be used when the ACF is known.
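The following sketch (an illustrative addition, assuming the asymptotic form (8) and SciPy's Student-t quantile function) turns a known ACF into the effective degrees of freedom and the expanded uncertainty of Eq. (9):

```python
import numpy as np
from scipy.stats import t

def expanded_uncertainty(u_mean, rho, n, P=0.95):
    """Expanded uncertainty U(xbar) = t_P * u(xbar) of Eq. (9), known ACF."""
    rho = np.asarray(rho, dtype=float)
    m = min(len(rho), n - 1)
    nu_eff = n / (1.0 + 2.0 * np.sum(rho[:m]**2)) - 1.0   # asymptotic Eq. (8)
    t_P = t.ppf(0.5 * (1.0 + P), df=nu_eff)               # two-sided Student critical value
    return t_P * u_mean, nu_eff
```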

3. The Case of Autocorrelation Function Estimated from the Data

3.1. Estimators of autocorrelation function

The estimator of the ACF that is most commonly used (and implemented in computer programs) is

$$ r_k = \frac{\sum_{i=1}^{n-k}(x_i-\bar{x})(x_{i+k}-\bar{x})}{\sum_{i=1}^{n}(x_i-\bar{x})^2}. \qquad (10) $$

The calculation of both effective numbers by direct replacement of the

autocorrelation function ρk by rk in Eqs. (3) and (8) leads to estimators with

large negative bias [8]. There are two reasons for that undesirable feature.

First, estimator (10) is biased, with a negative bias proportional to $n^{-1}$. The

first source of bias is the different number of terms in the numerator and

denominator of (10). This component of bias can be compensated for by

introducing a factor n/(n − k). A more important source of bias is the replacement


of the expectation $\mu$ by the mean $\bar{x}$. Quenouille [9] has introduced a bias-reduced estimator of the ACF, with a residual bias $\propto n^{-2}$. It is given by the expression

$$ r_k^{(Q)} = 2\,r_k - \frac{r_k^{(1)} + r_k^{(2)}}{2}. \qquad (11) $$

The symbols $r_k^{(1)}$ and $r_k^{(2)}$ denote elements of the two autocorrelation functions calculated, respectively, from the two halves of the sample.

Figure 1 shows an example estimate $r_k$ of the ACF. The real information on autocorrelation is contained in the slope (at small values of the lag $k$). The remaining part (the tail) resembles a continuous function, but it is merely autocorrelated noise.

[Figure 1: plot of the estimated ACF $r_k$ versus lag $k$ for the SMA sample with $n = 60$, showing the informative initial slope, the noise tail, and the truncation lag $n_c = 13$.]

Fig. 1. Exemplary estimate of the ACF calculated from a 60-element autocorrelated sample generated using a simple moving average (SMA) model of 5 successive independent random numbers. The non-zero elements of the true ACF are: ρ0 = 1, ρ1 = 0.8, ρ2 = 0.6, ρ3 = 0.4, ρ4 = 0.2.

Zhang [11] has proposed a procedure of truncating the ACF based on an

idea of detecting the essentially nonzero values of rk. A better and simpler

method introduced in [8] consists in limiting the estimate rk to its positive

elements before its first transition through zero (FTZ method). The limiting lag

obtained, nc (Fig. 1), is used instead of n − 1 in formulae (3) and (8) to now

define estimators of both effective numbers. In particular


$$ \hat{n}_{\mathrm{eff}}^{*} = \frac{n}{1 + 2\sum_{k=1}^{n_c} r_k}. \qquad (12) $$

An analogous Quenouille estimator of the effective number of observations, $\hat{n}_{\mathrm{eff}}^{(Q)}$, is obtained by using $r_k^{(Q)}$ instead of $r_k$ in (12).

The FTZ method can only be applied when all elements of the ACF are

nonnegative. This is the case in nearly all experimental situations.
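To make the estimation chain of Section 3.1 concrete, the sketch below (an illustrative addition with assumed function names) computes the standard estimator (10), the Quenouille-corrected estimator (11) from the two half-samples, applies the FTZ truncation, and evaluates (12):

```python
import numpy as np

def acf_estimate(x, max_lag):
    """Standard ACF estimator r_k of Eq. (10) for k = 1..max_lag (max_lag < len(x)/2)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = np.sum(d * d)
    return np.array([np.sum(d[:-k] * d[k:]) / denom for k in range(1, max_lag + 1)])

def acf_quenouille(x, max_lag):
    """Bias-reduced estimator r_k^(Q) of Eq. (11) from the two half-samples."""
    half = len(x) // 2
    return (2.0 * acf_estimate(x, max_lag)
            - 0.5 * (acf_estimate(x[:half], max_lag) + acf_estimate(x[half:], max_lag)))

def n_eff_ftz(n, r):
    """FTZ truncation and the effective-number estimator of Eq. (12)."""
    below = np.where(r <= 0.0)[0]
    nc = below[0] if len(below) else len(r)   # keep lags before the first zero crossing
    return n / (1.0 + 2.0 * np.sum(r[:nc]))
```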

3.2. Estimators of standard deviation and standard deviation of the mean

When the ACF is known, the scatter of the estimators $s_a$ and $s_a(\bar{x})$ is the same, determined solely by the effective degrees of freedom $\nu_{\mathrm{eff}}$. When the ACF is estimated from the data, this scatter increases because, in addition to the stochastic properties of the sum $\sum (x_i-\bar{x})^2$, the effective number of observations is now an estimator with nonzero dispersion. This increase is modest for the sample standard deviation because the factor $C$ in (4) is close to unity. However, the scatter of the standard deviation of the mean is much larger because of the stochastic properties of the factor $1/(\hat{n}_{\mathrm{eff}}-1)$. Quantitative investigations of both estimators, obtained using the MC method, are presented in [8].

3.3. Coverage interval

The idea of truncating the experimental ACF (the FTZ method) and the use of bias-reduced estimators of the ACF can also be used to estimate the effective degrees of freedom. The value of $\nu_{\mathrm{eff}}$ obtained allows us to define the critical value of Student's variable $t_P$ for an assumed coverage probability $P$ and the expanded uncertainty (9).

The validity of the calculation of the expanded uncertainty was investigated using the Monte Carlo method for the first-order autoregressive model AR(1) ($\rho_k = a^{k}$, $a = 0.66$) [10]. It was tentatively assumed that Eq. (9) can be used with the critical value of Student's variable corresponding to the nominal coverage probability $P = 0.95$ and the effective degrees of freedom estimated from the data. In that way one can calculate, for a given $n$-element MC sample, the coverage interval

$$ \bar{x} \pm t_P\, s_a(\bar{x}), \qquad (13) $$

and check whether it covers the true value ($\mu = 0$ in the simulations). Such results, for a sufficiently large set (about 100 000) of MC samples, allow us to determine the “real” coverage probability P*.
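A compact version of this Monte Carlo check (an illustrative addition, not the authors' code; it assumes the AR(1) parameter a = 0.66 and a reduced number of MC samples) is sketched below:

```python
import numpy as np
from scipy.stats import t

def coverage_ar1(n, a=0.66, P=0.95, n_mc=20000, seed=1):
    """Fraction of intervals (13) that cover mu = 0 for AR(1) samples of size n."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_mc):
        # AR(1) series with stationary start, unit innovation variance, mean 0
        x = np.empty(n)
        x[0] = rng.normal() / np.sqrt(1.0 - a * a)
        for i in range(1, n):
            x[i] = a * x[i - 1] + rng.normal()
        d = x - x.mean()
        denom = np.sum(d * d)
        r = np.array([np.sum(d[:-k] * d[k:]) / denom for k in range(1, n // 4 + 1)])
        below = np.where(r <= 0.0)[0]
        nc = below[0] if len(below) else len(r)                          # FTZ truncation
        n_eff = n / (1.0 + 2.0 * np.sum(r[:nc]))                         # Eq. (12)
        nu_eff = max(n / (1.0 + 2.0 * np.sum(r[:nc] ** 2)) - 1.0, 1.0)   # Eq. (8) with r_k
        s_a_mean = np.sqrt(denom / (n * max(n_eff - 1.0, 1e-6)))         # Eq. (5)
        half_width = t.ppf(0.5 * (1.0 + P), df=nu_eff) * s_a_mean
        hits += abs(x.mean()) <= half_width
    return hits / n_mc

print(coverage_ar1(60))   # the "real" coverage P*, expected to fall below the nominal 0.95
```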


Preliminary results (Fig. 2) were obtained for sample sizes n = 15, 60, 240

and 1000; the corresponding values of neff are indicated on the graph. The value

of 1 − P* remains in agreement with nominal 1 − P = 0.05 in the limit of large n.

The discrepancies at the smaller sample sizes are considerable. This simulation

also suggests that the use of the bias-reduced estimator $r_k^{(Q)}$ to define the effective numbers is important, because the discrepancy is about half that for the standard estimator $r_k$.

[Figure 2: plot of 1 − P* versus sample size n (n = 10 to 1000, logarithmic scale) for the standard estimator $r_k$ and for $r_k^{(Q)}$, with the nominal level 1 − P = 0.05 marked and the corresponding values of $n_{\mathrm{eff}}$ annotated at the data points.]

Fig. 2. MC investigation of the validity of the coverage interval estimated with the use of the ACF derived from the data. See the text.

4. Conclusions

The standard GUM algorithm for Type A uncertainty evaluation can be

generalized for the case of autocorrelated observations. The described formalism

represents its minimal extension and does not depend on there being any

particular model of the autocorrelated series.

Although the theory for the case of a known ACF has existed for six decades, it is not widely known. Ongoing investigation concerns the case when the ACF

is to be estimated from the data. A Monte Carlo method can be used to check the

validity of various approaches for the given type of autocorrelated data and

sample size.

References

[1] H. v. Storch and F. W. Zwiers. Statistical Analysis in Climate Research.

Cambridge University Press 1999.


[2] J. S. Chipman, K. R. Kadiyala, A. Madansky, and J. W. Pratt. Efficiency of

the sample mean when residuals follow a first-order stationary Markoff

process. J. Amer. Statist. Assoc. 63, 1237 (1968).

[3] G. E. P. Box, G. M. Jenkins, and G. C. Reinsel. Time Series Analysis:

Forecasting and Control 3rd ed. Englewood Cliffs: Prentice Hall, 1994, p.

30.

[4] A. Zieba, Effective number of observations and unbiased estimators of

variance for autocorrelated data − an overview. Metrol. Meas. Syst. 17, 3

(2010).

[5] C. E. Leith. The standard error of time-averaged estimates of climatic

means. J. Appl. Meteorol. 12, 1066 (1973).

[6] G. V. Bayley and J. M. Hammersley. The “effective” number of

independent observations in an autocorrelated time series. J. R. Stat. Soc.

Suppl. 8, 184 (1946).

[7] Eq. (8) is a corrected version of unnumbered formula in Ref. [6], p. 185.

[8] A. Zieba and P. Ramza. Standard deviation of the mean of autocorrelated

observations estimated with the autocorrelation function estimated from the

data. Metrol. Meas. Syst. 18, 529 (2011).

[9] M. H. Quenouille. Approximate tests of correlation in time-series. J. R.

Statist. Soc. B, 11, 68 (1949). Better presentation: F. C. H. Marriott and J.

A. Pope. Bias in the estimation of autocorrelations. Biometrika, 41, 390

(1954).

[10] P. Ramza and A. Zieba, to be published.

[11] N. F. Zhang. Calculation of the uncertainty of the mean of autocorrelated measurements. Metrologia 43, 276 (2006).


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 425–426)

Author Index

Almeida, N., 98

Azzam, N., 116

Baksheeva, Y., 90

Balakrishnan, N., 124

Barnes, B.M., 409

Batista, E., 98

Belousov, V.I., 105

Berzhinskaya, M.V., 149

Binacchi, M., 156

Boukebbab, S., 116

Boyko, I.G., 219

Bunakov, V., 377

Burmistrova, N., 132

Chimitova, E.V., 124, 350

Chaves-Jacob, J., 116

Chernysheva, N.S., 179

Chunovkina, A., 132

Cox, M.G., 9

Crampton, A., 285

Cundeva-Blajer, M., 140

Danilov, A.A., 149

Dantas, C.C., 247

De Bièvre, P., 1

De Boeck, B., 301

Demeyer, S., 156

Didieux, F., 156

Dobre, M., 301

Dovica, M., 279

Ďuriš, S., 279, 293

Ehara, K., 358

Evans, D.J., 310

Ezhela, V.V., 105

Fischer, N., 156

Filipe, E., 98

Forbes, A.B., 17, 164, 273

Franke, M., 392

Godinho, I., 98

Granovskii, V.A., 29

Harris, P., 9

Härtig, F., 164, 187, 392

Hovanov, N., 171

Ionov, A.B., 179

Ionov, B.P., 179

Keller, F., 187

Kok, G.J.P., 195, 203

Köning, R., 211

Kreinovich, V.Ya., 38, 330, 340

Krivov, A.S., 219

Kucherenko, Yu.V., 149

Kudeyarov, Yu.A., 241

Kulyabina, E.V., 241

Kuselman, I., 50

Kuyanov, Y.V., 105

Kyriazis, G.A., 229

Lemeshko, B.Yu., 54

Lima, E.A.O., 247

Linares, J.-M, 116, 252

Longstaff, A.P., 285

Lugovsky, K.S., 105

Lugovsky, S.B., 105

Maniscalco, U., 260

Marinko, S.V., 219


Martins, L.L., 265

Melo, S.B., 247

Merlone, A., 293

Minh, H.D., 273

Ordinartseva, N.P., 149

Palenčár, R., 279, 293

Parkinson, S., 285

Pavese, F., 1

Pavlásek, P., 279, 293

Pelevic, N., 203

Peruzzi, A., 301

Petry, J., 301

Rebordão, J.M., 265

Ribeiro, A.S., 265

Rizzo, R., 260

Rukhin, A.L., 310

Sapozhnikova, K., 90

Semenov, K.K., 320, 330, 340

Semenova, M.A., 350

Shestakov, A.L., 66

Shiro, M., 358

Shirono, K., 358

Silver, R.M., 409

Siraya, T., 368

Slosarčík, S., 279

Smith, I.M., 164, 195, 273

Soares Bandiera, S., 247

Solopchenko, G.N., 330, 340

Sprauel, J.M., 252

Stepanov, A., 132

Taymanov, R., 90

Tanaka, H., 358

Teles, F.A.S., 247

Tkachenko, N.P., 105

Tselykh, V., 377

Uspenskiy, V., 377

Volodarsky, E.T., 385
Vorontsov, K., 377

Warsza, Z.L., 385, 400

Wendt, K., 164, 187, 392

Willink, R., 78

Wimmer, G., 211, 279

Witkovský, V., 211

Zabolotnii, S.V., 400

Zhang, N.-F., 409

Zhou, H., 409

Zieba, A., 417


Advanced Mathematical and Computational Tools in Metrology and Testing X

Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes

© 2015 World Scientific Publishing Company (pp. 427–429)

Keywords Index

Accelerometers 310

Accuracy 385

Acoustic signals 90

Adaptive measuring system 66

Alignment iterative closest point

116

Analytical chemistry 50

Anderson–Darling test 54

Autocorrelation 417

Automated planning 286

Bayesian approach 132, 301

Bayesian signal analysis 229

Calibration 9, 98, 149

Calibration curves 368

Cartesian method 229

Certification of algorithms 368

CFD 203

Change-point 400

Characteristic function 179

Classification 50

Combination of data 78

Composite hypotheses 54

Computation time 229

Computational code 156

Computational vision 265

Condition number 195

Continuous space 247

Conversion 293

Correct measuring 171

Correctness 29

Covariance matrix 279, 409

Coverage intervals 132

Cramer–von Mises–Smirnov test

54

Criteria for the measurement result

105

Critical dimension measurements

409

Cross-validation 377

Cumulative coefficients 400

Current status data 124

Decision 1

Degree selection 9

Displacement 265

Diversity of thinking 1

Drug delivery devices 98

Dynamic characteristic 29

Dynamic error evaluation 66

Dynamic measurements 29, 66

Dynamic measuring system 66

Economic value 171

Effective number 417

Electrocardiography 377

Electromagnetic quantities 140

Ellipse fitting 211

EllipseFit4HC 211

Emotion measurement 90

Empirical function 17

Epstein frame 140

Expanded uncertainty 301

FCC simulation 247

Finite element method 140

Fire engineering 156

Fluid Catalytic Cracking 247

Fuzzy intervals 340

Gaussian distribution 301


Gaussian processes 17, 156

Generalized least squares 409

Geometric element 273

Goodness-of-fit 54, 124, 350

GUM 105, 301

Heydemann correction 211

Historical temperature scale 293

Homogeneity 385

HPLC 241

Human errors 50

Identification 241

Importance sampling 156

Inaccurate data 340

Inconsistent data 78

Indirect measurements 105

Information function of heart 377

Insignificance of regression

parameters 350

Instrument 29

Instrument transformer 140

Intelligent instruments 180

Inter-laboratory comparisons 385

Interval computations 38

Interval uncertainty 38

Interval-censored samples,

Maximum likelihood estimator

124

Interval-related statistical

techniques 38

Inverse problem 29, 320, 330

ISO 1101 252

Iterative signal recovery approach

66

Jackknife 252

JCGM 105

Kalman filtering 220

KCRV 310

Key comparison uncertainty 358

Kolmogorov test 54

Kuiper test 54

Latin hypercube sampling 204

Least squares 392

Machine learning 377

Machining toolpath 116

Marginal likelihood 358

MatLab 211

Measurement estimation 260

Measurement model 90

Measurement process 286

Measuring instruments 149

Metrology 140

Microflow 98

Modal control of dynamic

behaviour 66

Model of experiment 279

Modelling 50

Moments 400

Monte-Carlo 124, 156, 203

Multi-disease diagnostic system

377

Neural networks 66, 260

Non-Gaussian sequence 400

Non-normality 301

Non-parametric maximum

likelihood estimator 124

Nonlinear regression 409

Numerical accuracy 195

Numerical computation 247

Numerical error 273

Numerical methods 140

Numerical peer review 105

Numerical sensitivity 195

Numerical software self-

verification 340

Numerical uncertainty 195

Observed state vector 66

One-shot devices 124

Optical metrology 265

Optimization 286

Outliers 385


Performance metric 195

Polynomial chaos 203

Polynomial representation 9

Precision 385

Prior information 409

Probability of exceeding threshold

156

Procedure for uncertainty

management 286

Proficiency testing 385

Propagation of distributions 105

Proportional hazards model 350

PUMA 286

Quadrature homodyne

interferometers 211

Quantification 50

Radiation thermometry 179

Random effects 78

Real-time data processing 330

Reference pairs 187

Regularity 29

Reliability 241

Retrospective estimation 400

Risk 1

Robust statistics 385

Sensitivity 310

Signal discretisation 377

Simple cross-effect model 350

Simulation 310

Single thinking 1

Sliding mode control 66

Soft sensors 260

Software test 392

Software tool 293

Software validation 164, 195

Sonic nozzle 203

Static measurements 320

Stochastic polynomial 400

Stopping iterative procedures 320

Student criterion 241

Substitute element 392

Surface roughness 116

Survival analysis 350

Suspension bridge 265

Temperature chamber 219

Test equipment calibration 219

Test uncertainty 187

Traceability 164

TraCIM 187, 392

Type A uncertainty 417

Unbiased estimator 417

Uncertainties of interferometric

phases 211

Uncertainty 1, 9, 98, 149, 187,

219, 247, 273, 279, 286, 310, 385

Uncertainty calculation 179

Uncertainty evaluation 105, 203,

252

Units of account 171

Validation 392

Watson test 54

Zhang tests 54


Series on Advances in Mathematics for Applied Sciences

Editorial Board

M. A. J. Chaplain, Department of Mathematics, University of Dundee, Dundee DD1 4HN, Scotland
C. M. Dafermos, Lefschetz Center for Dynamical Systems, Brown University, Providence, RI 02912, USA
J. Felcman, Department of Numerical Mathematics, Faculty of Mathematics and Physics, Charles University in Prague, Sokolovska 83, 18675 Praha 8, The Czech Republic
M. A. Herrero, Departamento de Matematica Aplicada, Facultad de Matemáticas, Universidad Complutense, Ciudad Universitaria s/n, 28040 Madrid, Spain
S. Kawashima, Department of Applied Sciences, Engineering Faculty, Kyushu University 36, Fukuoka 812, Japan
N. Bellomo, Editor-in-Charge, Department of Mathematics, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Torino, Italy, E-mail: [email protected]
F. Brezzi, Editor-in-Charge, IMATI - CNR, Via Ferrata 5, 27100 Pavia, Italy, E-mail: [email protected]
M. Lachowicz, Department of Mathematics, University of Warsaw, Ul. Banacha 2, PL-02097 Warsaw, Poland
S. Lenhart, Mathematics Department, University of Tennessee, Knoxville, TN 37996-1300, USA
P. L. Lions, University Paris XI-Dauphine, Place du Marechal de Lattre de Tassigny, Paris Cedex 16, France
B. Perthame, Laboratoire J.-L. Lions, Université P. et M. Curie (Paris 6), BC 187, 4 Place Jussieu, F-75252 Paris cedex 05, France
K. R. Rajagopal, Department of Mechanical Engrg., Texas A&M University, College Station, TX 77843-3123, USA
R. Russo, Dipartimento di Matematica, II University Napoli, Via Vivaldi 43, 81100 Caserta, Italy



Series on Advances in Mathematics for Applied Sciences

Aims and Scope

This Series reports on new developments in mathematical research relating to methods, qualitative and numerical analysis, mathematical modeling in the applied and the technological sciences. Contributions related to constitutive theories, fluid dynamics, kinetic and transport theories, solid mechanics, system theory and mathematical methods for the applications are welcomed.

This Series includes books, lecture notes, proceedings, collections of research papers. Monograph collections on specialized topics of current interest are particularly encouraged. Both the proceedings and monograph collections will generally be edited by a Guest editor.

High quality, novelty of the content and potential for the applications to modern problems in applied science will be the guidelines for the selection of the content of this series.

Instructions for Authors

Submission of proposals should be addressed to the editors-in-charge or to any member of the editorial board. In the latter case, the authors should also notify one of the editors-in-charge of the proposal. Acceptance of books and lecture notes will generally be based on the description of the general content and scope of the book or lecture notes as well as on a sample of the parts judged by the authors to be most significant.

Acceptance of proceedings will be based on relevance of the topics and of the lecturers contributing to the volume.

Acceptance of monograph collections will be based on relevance of the subject and of the authors contributing to the volume.

Authors are urged, in order to avoid re-typing, not to begin the final preparation of the text until they have received the publisher’s guidelines. They will receive from World Scientific the instructions for preparing a camera-ready manuscript.



Series on Advances in Mathematics for Applied Sciences

Published*:

Vol. 72 Advanced Mathematical and Computational Tools in Metrology VII eds. P. Ciarlini et al.

Vol. 73 Introduction to Computational Neurobiology and Clustering by B. Tirozzi, D. Bianchi and E. Ferraro

Vol. 74 Wavelet and Wave Analysis as Applied to Materials with Micro or Nanostructure by C. Cattani and J. Rushchitsky

Vol. 75 Applied and Industrial Mathematics in Italy II eds. V. Cutello et al.

Vol. 76 Geometric Control and Nonsmooth Analysis eds. F. Ancona et al.

Vol. 77 Continuum Thermodynamics by K. Wilmanski

Vol. 78 Advanced Mathematical and Computational Tools in Metrology and Testing eds. F. Pavese et al.

Vol. 79 From Genetics to Mathematics eds. M. Lachowicz and J. Miękisz

Vol. 80 Inelasticity of Materials: An Engineering Approach and a Practical Guide by A. R. Srinivasa and S. M. Srinivasan

Vol. 81 Stability Criteria for Fluid Flows by A. Georgescu and L. Palese

Vol. 82 Applied and Industrial Mathematics in Italy III eds. E. De Bernardis, R. Spigler and V. Valente

Vol. 83 Linear Inverse Problems: The Maximum Entropy Connection by H. Gzyl and Y. Velásquez

Vol. 84 Advanced Mathematical and Computational Tools in Metrology and Testing IX eds. F. Pavese et al.

Vol. 85 Continuum Thermodynamics Part II: Applications and Examples by B. Albers and K. Wilmanski

Vol. 86 Advanced Mathematical and Computational Tools in Metrology and Testing X eds. F. Pavese et al.

*To view the complete list of the published volumes in the series, please visit:http://www.worldscibooks.com/series/samas_series.shtml
