  • 8/20/2019 Probability of Detection Curves

    1/144

HSE
Health & Safety Executive

    Probability of Detection (PoD) curves 

    Derivation, applications and limitations

    Prepared by Jacobi Consulting Limited

    for the Health and Safety Executive 2006

    RESEARCH REPORT 454 


    George A Georgiou

    Jacobi Consulting Limited

    57 Ockendon Road

    London N1 3NL

There is a large amount of ‘Probability of Detection’ (PoD) data available (eg National NDT Centre (UK), NORDTEST (Norway), NIL (Netherlands) and in particular NTIAC (USA)). However, it is believed that PoD curves produced from PoD data are not very well understood by many who use and apply them. For example, in producing PoD curves, a certain material and thickness may have been used, and yet one can find the same PoD quoted for a range of thicknesses. In other cases, PoD curves may have been developed for pipes, but they have been applied to plates or other geometries. Similarly, PoD curves for one type of weld (eg single sided) have been used for other welds (eg double sided). PoD data are also highly dependent on the Non-Destructive Testing (NDT) methods used to produce them, and these data can be significantly different, even when applied to the same flaws and flaw specimens. It is often assumed that the smallest flaw detected is a good measure of PoD, but there is usually a large gap between the smallest flaw detected and the largest flaw missed. Similarly, it is often assumed that human reliability is a very important factor in NDT procedures, and yet it is usually found not to be as important as other operational and physical parameters.

It is important to question the validity of how PoD curves are applied, as well as their limitations. This report aims to answer such questions and in particular their relevance to fitness for service issues involving PoD.

The overall goal of this project is to provide clear, concise, understandable and practical information on PoD curves, which will be particularly useful for Health and Safety Inspectors when discussing safety cases involving PoD curves.

This report and the work it describes were funded by the Health and Safety Executive (HSE). Its contents, including any opinions and/or conclusions expressed, are those of the author alone and do not necessarily reflect HSE policy.

    HSE BOOKS


     © Crown copyright 2006

    First published 2006

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording or otherwise) without the prior written permission of the copyright owner.

Applications for reproduction should be made in writing to: Licensing Division, Her Majesty's Stationery Office, St Clements House, 2-16 Colegate, Norwich NR3 1BQ or by e-mail to [email protected]


TABLE OF CONTENTS

TABLE CAPTIONS AND FIGURE CAPTIONS v

EXECUTIVE SUMMARY vi
Background vi
Objectives vi
Work Carried Out vi
Conclusions vii
Recommendations vii

1. INTRODUCTION 1

2. OBJECTIVES 2

3. DERIVATION OF POD CURVES 2
3.1. A HISTORICAL BACKGROUND AND DEVELOPMENT OF NDT RELIABILITY METHODS 2
3.2. EXPERIMENTAL REQUIREMENTS TO PRODUCE POD CURVES 3
3.3. THE AVAILABLE PROBABILITY METHODS TO PRODUCE POD CURVES 4
3.3.1. PoD Curves for Hit/Miss Data 4
3.3.2. PoD Curves for Signal Response Data 5
3.3.3. Sample Sizes 6
3.4. CONFIDENCE LIMITS (OR CONFIDENCE INTERVALS) 7
3.5. PUBLISHED WORK ON THE MODELLING OF POD 8
3.5.1. An Overview 8
3.5.2. The PoD-generator (The Netherlands) 9
3.5.3. Iowa State University (USA) 9
3.5.4. National NDT Centre (UK) 9

4. THE PRACTICAL APPLICATION OF POD CURVES 10
4.1. HOW POD CURVES ARE USED IN INDUSTRY 10
4.2. PUBLISHED WORK ON POD CURVES IN DIFFERENT INDUSTRIES 10
4.2.1. Aerospace (NASA) 10
4.2.2. Aircraft Structures, Inclusions in Titanium Castings 10
4.2.3. NORDTEST Trials 11
4.2.4. Nuclear Components (The PISC Trials) 11
4.2.5. Offshore Tubular Joints 11
4.2.6. Dutch Welding Institute (NIL) 11
4.2.7. Railways (National NDT Centre (UK)) 12
4.2.8. LPG Storage Vessels 13

5. THE LIMITATIONS OF APPLYING POD CURVES 13
5.1. COMMENTS ON HIT/MISS DATA AND SIGNAL RESPONSE DATA 13
5.2. IMPORTANT OPERATING AND PHYSICAL PARAMETERS 14
5.2.1. NDT Method 15
5.2.2. Fluorescent Penetrant NDT 15
5.2.3. Material Properties 15
5.2.4. Specimen Weld Geometry 15
5.2.5. Flaw Characteristics 16
5.2.6. Human Reliability 16

6. DISCUSSION 16
6.1. INTRODUCTION 16
6.2. AIMS AND OBJECTIVES 17
6.3. HISTORICAL DEVELOPMENT 17
6.4. FLAW SAMPLE SIZES FOR ‘HIT/MISS’ DATA AND ‘SIGNAL RESPONSE’ DATA 18
6.4.1. Model for Hit/Miss Data 18
6.4.2. Model for Signal Response Data 19
6.4.3. To Compute PoD Parameters 20
6.4.4. To Achieve the Desired PoD/Confidence Limit Combination 20
6.5. POD MODELLING 20
6.6. PRACTICAL APPLICATIONS OF POD 21
6.6.1. Aircraft Structures, Inclusions in Titanium Castings 21
6.6.2. NORDTEST Trials 21
6.6.3. Nuclear Components (The PISC Trials) 21
6.6.4. Offshore Tubular Joints 21
6.6.5. Dutch Welding Institute (NIL) 22
6.6.6. Railways 22
6.6.7. LPG Storage Vessels 22
6.7. DEPENDENCE OF POD ON OPERATIONAL AND PHYSICAL PARAMETERS 22
6.7.1. Important Operational and Physical Parameters 22
6.7.2. NDT Method 23
6.7.3. Fluorescent Penetrant NDT 23
6.7.4. Material Properties 23
6.7.5. Specimen Weld Geometry 23
6.7.6. Flaw Characteristics 23
6.7.7. Human Reliability 24

7. INDEPENDENT VERIFICATION 24

8. CONCLUSIONS 24

9. RECOMMENDATIONS 25

10. ACKNOWLEDGEMENTS 25

11. REFERENCES 25

12. VERIFICATION STATEMENT

TABLES 1 - 13

FIGURES

APPENDIX A  GLOSSARY OF TERMS, STATISTICAL TERMINOLOGY AND OTHER RELEVANT INFORMATION

APPENDIX B  AN AUDIT TOOL FOR THE PRODUCTION AND APPLICATION OF POD CURVES

APPENDIX C  THE VALIDITY OF THE JCL ‘INDEX OF DETECTION’ MODEL


TABLE CAPTIONS AND FIGURE CAPTIONS

TABLE CAPTIONS

Table 1  Maximum Probability Tables

FIGURE CAPTIONS

Figure 1  Example of detection percentages for a hand-held eddy-current inspection and a ‘log-odds’ distribution fit to the data.

Figure 2  Ultrasonic NDT hit/miss data illustrating the relatively large gap between the smallest flaw detected and the largest flaw missed.

Figure 3  The linear relationship between the log-odds and log flaw size.

Figure 4  Schematic of the PoD for flaws of fixed dimension for ‘hit/miss’ data.

Figure 5  Schematic of the PoD for flaws of fixed dimension for ‘signal response’ data.

Figure 6  A comparison between the log-odds and cumulative log-normal distribution functions for the same parameters μ = 0 and σ = 1.0.

Figure 7  An example of when the log-odds model was not applicable to the data collected.

Figure 8  PoD(a) log-odds model results for different NDT methods applied to the same flaw specimen.

Figure 9  PoD(a) log-odds model results for fluorescent penetrant: no developer and developer applied to the same flaw specimen.

Figure 10  PoD(a) log-odds model results for manual eddy currents: different materials but nominally the same flaws.

Figure 11  PoD(a) log-odds model results for X-ray radiography: different weld conditions but nominally the same flaws.

Figure 12  PoD(a) log-odds model results for fluorescent penetrant: different flaws but nominally the same specimens.

Figure 13  PoD(a) log-odds model results for ultrasound (immersion): different operators but inspecting the same flaw specimen.


To illustrate and explain many of the important issues discussed, which are particularly relevant to PoD, a number of experimental and theoretical examples are provided throughout the report.

Conclusions

•  The ‘log-odds’ distribution is found to be one of the best fits for hit/miss NDT data.

•  The log-normal distribution is found to be one of the best fits for signal response NDT data, and in particular for flaw length and flaw depth data as determined by ultrasonic NDT.

•  In some cases, the ‘log-odds’ and cumulative log-normal distributions are very similar, but there are many cases where they are significantly different.

•  There are NDT data for which neither the ‘log-odds’ nor the log-normal distribution is appropriate and other distributions need to be considered.

•  There is often a large gap between the smallest flaw detected and the largest flaw missed.

•  Very small or very large flaws do not contribute much to the PoD analysis of hit/miss data.

•  To achieve a valid ‘log-odds’ model solution for hit/miss data, a good overlap between the smallest flaw detected and the largest flaw missed is necessary.

•  To achieve a valid log-normal model solution for signal response data, there is less reliance on flaw size range overlap and more on the linear relationship between ln(â) and ln(a).

•  When the PoD(a) function decreases with increasing flaw size, it is usually an indication that the NDT procedures are poorly designed.

•  When the lower confidence limit decreases with increasing flaw size, notwithstanding an acceptable PoD(a) function, it is usually associated with extreme or unreasonable values of the mean and standard deviation.

•  The effect on PoD results of particular operational and physical parameters can be significant for datasets selected from the NTIAC data book of PoD curves.

•  The PoD data in the NTIAC data book were collected some 30 years ago and may not necessarily reflect current capabilities with modern digital instrumentation. However, the results are still believed to be relevant to best practice NDT.

•  The PoD data illustrated in each of Figures 7 – 13 are valid for the particular datasets in question. It would be wrong to draw too many general conclusions about the particular PoD values (e.g. ultrasound is better than X-ray).

•  Figures 7 – 13 serve to illustrate the possible effects that the physical and operational parameters can have on the PoD, and an awareness of these effects is important when quoting PoD results.

•  NDT methods, equipment ‘calibration’, fluorescent penetrant developers, material, surface condition, flaws and human factors are all important operational and physical parameters, which can have a significant effect on PoD results.

•  Whilst human factors are important variables in NDT procedures, they are often found not to be as important as other operational and physical variables.

•  The ‘log-odds’ distribution was found to be the most appropriate distribution to use with the JCL ‘Probability of Inclusion’ model.

•  The earlier JCL ‘Probability of Inclusion’ model has been validated against an independently developed ‘Probability of Inclusion’ model by MBEL.

Recommendations

•  Publish a signal response data book of PoD results.

•  Publish a more up-to-date data book from different PoD studies and collate the results in a way which best serves more general industrial and modelling applications.

•  Set up a European-style project or Joint Industry Project to realise the above recommendations.


(Appendix D). Both the updated model and updated companion guidelines are now considered as having wider applications than just the ultrasonic NDT of LPG storage vessels.

The whole report has been read by a qualified statistician to verify and check the calculations and to assess that the conclusions and recommendations are based on sound scientific reasoning. Additional verifications have been carried out by others; the full details are discussed in Section 7 and a formal verification statement is made in Section 12.

The overall goal of this project is to provide clear, concise and understandable information on PoD curves, which will be particularly useful for Health and Safety Inspectors in discussing safety cases involving PoD curves.

    2.  OBJECTIVES

•  To provide a clear and understandable description of how PoD curves are derived.

•  To provide practical applications of how PoD curves are used and their relevance to fitness for service issues.

•  To quantify the limitations of PoD curves.

    3.  DERIVATION OF POD CURVES

3.1.  A HISTORICAL BACKGROUND AND DEVELOPMENT OF NDT RELIABILITY METHODS

Non-destructive Testing (NDT) reliability may be defined as ‘the probability of detecting a crack in a given size group under the inspection conditions and procedures specified’ (1). There are of course other similar definitions, but the underlying statistical parameter is the PoD, which has become the accepted formal measure for quantifying NDT reliability. The PoD is usually expressed as a function of flaw size (i.e. length or depth), although in reality it is a function of many other physical and operational parameters, such as the material, the geometry, the flaw type, the NDT method, the testing conditions and the NDT personnel (e.g. their certification, education and experience).

Repeat inspections of the same flaw size or the same flaw type will not necessarily result in consistent hit or miss indications. Hence there is a spread of detection results for each flaw size and flaw type, and this is precisely why the detection capability is expressed in statistical terms such as the PoD. An early example of this is illustrated in the paper by Lewis et al (2), who had 60 air force inspectors use the same surface eddy-current technique to inspect 41 known cracks around countersunk fastener holes in a 1.5 m length of a wing box. The results are illustrated in Figure 1 in terms of a detection percentage (i.e. the number of times a crack was detected relative to the number of detection attempts). The chance of detecting the cracks increases with crack size, as one might expect, but none of the cracks was detected 100% of the time, and different cracks of the same size have quite different detection percentages. Figure 1 also shows that the ‘log-odds’ distribution is a reasonable fit to these data and illustrates why PoD is considered an appropriate measure of detection capability.

PoD functions for describing the reliability of an NDT method or technique have been the subject of many studies and have undergone considerable development since the late 1960s and early 1970s, when most of the pioneering work was carried out in the aerospace industry (3, 4). In order to ensure the structural integrity of critical components, it was becoming evident that instead of asking ‘…what is the smallest flaw that can be detected by an NDT method?’ it was more appropriate, from a fracture mechanics point of view, to ask ‘…what is the largest flaw that can be missed?’ To elaborate on this point, ultrasonic inspection data have been re-plotted from the ‘Non-destructive Testing Information Analysis Centre’ (NTIAC) capabilities data book (5). Figure 2


illustrates the detection capabilities of an ultrasonic surface wave inspection of two flat aluminium plates (thicknesses 1.5 mm and 5.6 mm), containing a total of 311 simulated fatigue cracks with varying depths. The flaws are recorded as detected (or hit) with PoD = 1, or missed with PoD = 0. Figure 2 shows three distinct regions separated by the lines a_smallest (i.e. the smallest flaw detected) and a_largest (i.e. the largest flaw missed). The region between a_smallest and a_largest shows that there are flaws of the same size which are sometimes detected and sometimes not detected. It is also clear that a_largest is significantly larger than a_smallest.
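The a_smallest and a_largest boundaries are trivial to extract from a set of hit/miss records. A minimal Python sketch, using invented records purely for illustration (not the NTIAC data):

```python
# Illustrative hit/miss records: (flaw depth in mm, 1 = detected, 0 = missed).
# These values are made up for illustration; they are not the NTIAC dataset.
records = [
    (0.2, 0), (0.3, 0), (0.4, 1), (0.5, 0), (0.6, 1),
    (0.7, 0), (0.8, 1), (0.9, 1), (1.1, 0), (1.4, 1),
]

a_smallest = min(size for size, hit in records if hit == 1)  # smallest flaw detected
a_largest = max(size for size, hit in records if hit == 0)   # largest flaw missed

# The region between a_smallest and a_largest contains both hits and misses
print(a_smallest, a_largest)  # → 0.4 1.1
```

Note that a_largest exceeds a_smallest even in this tiny example; it is this overlap region that carries most of the statistical information for a PoD fit.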

    In 1969, a program was initiated by the National Aeronautics and Space Administration (NASA) to

    determine the largest flaw that could be missed for the various NDT methods that were to be used in

    the design and production of the space shuttle. The methodology by NASA was soon adopted by the

    US Air Force as well as the US commercial aircraft industry. In the last two decades many more

    industries have adopted similar NDT reliability methods based on PoD. Some of these will be

    discussed in more detail in section 4 below.

In the mid-1970s, a constant PoD for all flaw types of a given size was proposed, and Binomial distribution methods were used to estimate this probability, along with an associated error or ‘lower confidence limit’ as it is often called (1). Whilst good PoD estimates could be obtained for a single flaw size, very large sample sizes were required to obtain good estimates of the ‘lower confidence limit’ (see section 3.4 below for more details on the confidence limit). It is clear from Figure 1 that this early assumption of a constant PoD for flaws of a given size, whilst making the probability calculations easier, was too simplistic, as different detection percentages were being recorded for the same flaw size.
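The Binomial calculation for a single flaw size group can be sketched as follows. This is an illustrative Clopper-Pearson style lower confidence limit computed by bisection (stdlib only); the ‘29 of 29’ input reproduces the well-known result that 29 detections in 29 trials demonstrate at least 90% PoD at 95% confidence:

```python
import math

def binomial_lower_limit(hits: int, n: int, confidence: float = 0.95) -> float:
    """Lower confidence limit for a detection probability estimated from
    `hits` detections in `n` trials (Clopper-Pearson style, via bisection).

    The limit is the smallest p for which observing `hits` or more
    detections still has probability at least (1 - confidence)."""
    if hits == 0:
        return 0.0
    alpha = 1.0 - confidence

    def prob_at_least(p: float) -> float:
        # P(X >= hits) for X ~ Binomial(n, p); increases monotonically with p
        return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
                   for i in range(hits, n + 1))

    lo, hi = 0.0, 1.0
    for _ in range(100):  # bisection on the monotone function above
        mid = 0.5 * (lo + hi)
        if prob_at_least(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return lo

# 29 hits out of 29 trials: lower 95% confidence limit on PoD is about 0.90
print(binomial_lower_limit(29, 29))
```

The need for very large samples is visible here: demonstrating 90% PoD at 95% confidence with no misses already requires 29 flaws of that one size, and any miss pushes the required sample size up sharply.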

In cases where there was an absence of large sample sizes, various grouping schemes were introduced to analyse the data, but in these cases estimates for the lower confidence limit were no longer valid. In the early to mid-1980s, the approach was to assume a more general model for the PoD vs. flaw size ‘a’. Various analyses of data from reliability experiments on NDT methods indicated that the PoD(a) function could be modelled closely by either the cumulative ‘log-normal’ distribution or the ‘log-logistic’ (or ‘log-odds’) distribution (6). Both of these models will be discussed in more detail below.

The statistical parameters (e.g. mean, median and standard deviation) associated with the PoD(a) functions can be estimated using standard statistical methods like ‘maximum likelihood methods’ (6) (see also Appendix A, section 2).

3.2.  EXPERIMENTAL REQUIREMENTS TO PRODUCE POD CURVES

The ‘Recommended Practice’ (1), which was originally prepared for the aircraft industry, provides comprehensive information on the experimental sequence of events for generating data to produce PoD curves and to ‘certify’ (i.e. validate) an NDT method or procedure.

The sequence of events can be broadly summarised as follows (see also (3)):

•  Manufacture or procure flaw specimens with the required large number of relevant flaw sizes and flaw types

•  Inspect the flaw specimens with the appropriate NDT method

•  Record the results as a function of flaw size

•  Plot the PoD curve as a function of flaw size

However, before the manufacture or procurement of flaw specimens, it is necessary to make the following crucial decisions:


where a is the flaw size and m and σ are the median and standard deviation respectively.

Another convenient form of equation (1) can be written as:

    PoD(a) = exp(α + β ln(a)) / (1 + exp(α + β ln(a)))    (2)

and it is straightforward (see Appendix A, Section 3.2) to show that the parameters α and β are related to m and σ by:

    m = exp(−α/β)    (3)

    σ = π / (β√3)    (4)

From equation (2), it is straightforward to show that (see Appendix A, section 3.2):

    ln[ PoD(a) / (1 − PoD(a)) ] = α + β ln(a)    (5)

The term on the left-hand side is called the logarithm of the ‘odds’ (i.e. odds = probability of success/probability of failure) and equation (5) demonstrates that:

    ln(odds) = α + β ln(a)    (6)

hence the name ‘the log-odds model’ when applied to hit/miss data.

In Figure 1, it is evident that the log-odds PoD(a) function fits the particular hit/miss eddy current data well. Further evidence is given in Figure 3, where the linear relationship shown above in equation (6) is demonstrated (see also reference 6). The particular parameters α and β in Figure 3 (i.e. α = −2.9 and β = 1.69) were computed using maximum likelihood methods (6). The statistical parameters m and σ can be calculated from equations (3) and (4).

Recall the discussion above in section 3.1 regarding the detection probabilities of repeat inspections of the same flaw, as well as of different flaw types with the same size. The different detection probabilities result in a distribution of probabilities for some fixed flaw length (or flaw depth). The standard way of defining the distribution of these probabilities is through a ‘probability density function’ (see Appendix A, section 3.4). In the case of ‘hit/miss’ data the PoD(a) function is the mean of the probability density function for each flaw length or depth (Figure 4).
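The algebra of the log-odds model can be sketched in a few lines of Python, taking the two maximum likelihood values quoted for Figure 3 (−2.9 and 1.69) as the intercept and slope of the log-odds form. This is an illustration of the equations only, not a re-analysis of that dataset, and the flaw size units are those of the original data:

```python
import math

def pod_log_odds(a: float, alpha: float, beta: float) -> float:
    """Log-odds PoD model: PoD(a) = e^(alpha + beta*ln a) / (1 + e^(alpha + beta*ln a))."""
    t = alpha + beta * math.log(a)
    return math.exp(t) / (1.0 + math.exp(t))

alpha, beta = -2.9, 1.69                    # maximum likelihood values quoted for Figure 3
m = math.exp(-alpha / beta)                 # median flaw size (PoD = 0.5)
sigma = math.pi / (beta * math.sqrt(3.0))   # standard deviation of the fitted distribution

# At the median flaw size the log-odds are zero, so PoD(m) = 0.5 by construction
print(round(m, 2), round(sigma, 2), round(pod_log_odds(m, alpha, beta), 3))
```

Plotting pod_log_odds over a range of a reproduces the familiar S-shaped PoD curve; the slope parameter controls how sharply the curve rises through the 50% point at a = m.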

    3.3.2.  PoD Curves for Signal Response Data

For signal response data, much more information is supplied in the signal for analysis than in the hit/miss data. In fact, as will be shown below, the PoD(a) function is derived from the correlation of â vs. a data.

For signal response data it has been observed in a number of studies (6, 8) that an approximate linear relationship exists between ln(â) and ln(a). The relationship is usually expressed by:


    ln(â) = α₁ + β₁ ln(a) + ε    (7)

where ε is an error term, normally distributed with zero mean and constant standard deviation σ_ε. The term α₁ + β₁ ln(a) in equation (7) is the mean μ(a) of the probability density function of ln(â). In signal response data, a flaw is regarded as ‘detected’ if â exceeds some pre-defined threshold â_th.

Equation (7) is really expressing the fact that ln(â) is normally distributed with mean μ(a) = α₁ + β₁ ln(a) and constant standard deviation σ_ε (i.e. N(μ(a), σ_ε²)).

The PoD(a) function for signal response data (i.e. ln(â)) can be expressed as:

    PoD(a) = Probability( ln(â) > ln(â_th) )    (8)

In other words, it is the area contained between the probability density function of ln(â) and above the flaw evaluation threshold ln(â_th) (see Figure 5).

Using standard statistical notation (9), equation (8) can be written as:

    PoD(a) = 1 − F[ ( ln(â_th) − α₁ − β₁ ln(a) ) / σ_ε ]    (9)

where F is the continuous cumulative distribution function (see Appendix A, Section 3).

It is fairly straightforward to show, using the symmetry properties of the Normal distribution, that equation (9) can be written as (see Appendix A, Section 3):

    PoD(a) = F[ ( ln(a) − ( ln(â_th) − α₁ )/β₁ ) / ( σ_ε/β₁ ) ]    (10)

which is the cumulative log-normal distribution with:

    mean = ( ln(â_th) − α₁ ) / β₁    (11)

and

    standard deviation = σ_ε / β₁    (12)

The estimates for α₁, β₁ and σ_ε are computed from the PoD data using the maximum likelihood method (6).
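The chain from â vs. a data to a PoD curve can be sketched numerically. The example below uses simulated data and an ordinary least squares fit of the ln(â) vs. ln(a) line rather than the full maximum likelihood treatment of reference (6) (so censored readings below the noise floor are ignored); the parameter values and threshold are invented for illustration:

```python
import math
import random

def normal_cdf(x: float) -> float:
    """Cumulative distribution function of the standard Normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Simulated signal response data obeying equation (7): ln(ahat) = a1 + b1*ln(a) + noise
random.seed(1)
true_a1, true_b1, true_sigma = 0.5, 1.2, 0.3
sizes = [0.5 + 0.1 * i for i in range(30)]   # flaw sizes a (arbitrary units)
ln_a = [math.log(a) for a in sizes]
ln_ahat = [true_a1 + true_b1 * x + random.gauss(0.0, true_sigma) for x in ln_a]

# Ordinary least squares fit of the linear relationship
n = len(ln_a)
mx, my = sum(ln_a) / n, sum(ln_ahat) / n
b1 = sum((x - mx) * (y - my) for x, y in zip(ln_a, ln_ahat)) / \
     sum((x - mx) ** 2 for x in ln_a)
a1 = my - b1 * mx
# Residual standard deviation estimates the scatter about the fitted line
sigma_eps = math.sqrt(sum((y - a1 - b1 * x) ** 2
                          for x, y in zip(ln_a, ln_ahat)) / (n - 2))

ln_ath = math.log(2.0)  # illustrative decision threshold for ahat

def pod(a: float) -> float:
    """Cumulative log-normal PoD: F[(ln(a) - mu)/sig] with the derived mean and std dev."""
    mu = (ln_ath - a1) / b1       # mean of the log-normal in ln(a)
    sig = sigma_eps / b1          # standard deviation of the log-normal
    return normal_cdf((math.log(a) - mu) / sig)

# PoD is exactly 0.5 where the fitted mean response crosses the threshold
a50 = math.exp((ln_ath - a1) / b1)
print(round(a50, 3), round(pod(a50), 3))
```

The steeper the fitted slope and the smaller the residual scatter, the narrower the transition region of the resulting PoD curve.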

    3.3.3.  Sample Sizes

    (a) To compute PoD parameters


If we have an unknown exact value ‘e’ for the area and a known approximate value ‘A’ for the area, we will be able to calculate a maximum possible error, or deviation ‘d’, from the error formulae. Hence we can say that:

    A − d ≤ e ≤ A + d

That is, ‘e’ lies between A − d and A + d with 100% certainty.

In statistics, however, a similar problem of estimating the true parameter ‘p’ of a population (e.g. the PoD) would require us to determine two numerical values ‘p1’ and ‘p2’ that depend on a particular random sample set and include ‘p’ with 100% certainty. However, from a sample set we cannot draw conclusions about the population with 100% certainty. We need to modify our approach, since the numerical quantities p1 and p2 depend on the sample set and will be different for each random set. The interval with end points p1 and p2 is called a ‘confidence interval’. The concept of the confidence interval is usually expressed in the following way:

    P( p1 ≤ p ≤ p2 ) = C    (13)

where C is called ‘the confidence level’. The point p1 is called ‘the lower confidence limit’ and the point p2 is called ‘the upper confidence limit’ (9, 10).

For example, if we assign C to be 95%, what is the meaning of a ‘95% confidence interval’ for the population parameter p? To illustrate the point, let p be the mean of the population. Equation (13) is often wrongly interpreted as ‘there is a 95% probability that the confidence interval contains the population mean p’. However, any particular confidence interval will either contain the population mean or it won’t. The confidence level C does have this probability value associated with it, but it is not a probability in the normal usage, since p1 and p2 in equation (13) are not unique and are different for each random sample selected. The correct interpretation of equation (13) is based on repeated sampling. If samples of the same size are drawn repeatedly from a population and a confidence interval is calculated from each sample, then we can expect 95% of these different intervals to contain the true population mean.

Formal definitions of terms associated with the confidence interval are provided in Appendix A, section 2, together with an example of how the confidence interval is calculated for the population mean with a known standard deviation.
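The repeated-sampling interpretation is easy to check numerically. The sketch below (stdlib only, with invented population values) draws many samples from a Normal population with known standard deviation, builds the 95% interval for the mean from each sample, and counts how often the true mean is covered:

```python
import math
import random
import statistics

random.seed(42)
TRUE_MEAN, KNOWN_SIGMA = 10.0, 2.0   # illustrative population parameters
SAMPLE_SIZE, TRIALS = 30, 2000
Z_95 = 1.96                          # two-sided 95% point of the standard Normal

covered = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, KNOWN_SIGMA) for _ in range(SAMPLE_SIZE)]
    centre = statistics.fmean(sample)
    half_width = Z_95 * KNOWN_SIGMA / math.sqrt(SAMPLE_SIZE)
    p1, p2 = centre - half_width, centre + half_width  # lower/upper confidence limits
    if p1 <= TRUE_MEAN <= p2:
        covered += 1

print(covered / TRIALS)  # close to 0.95: about 95% of the intervals contain the true mean
```

Each individual interval either contains the true mean or it does not; it is only the long-run fraction of intervals, over repeated sampling, that approaches the 95% confidence level.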

    3.5.  PUBLISHED WORK ON THE MODELLING OF POD

    During the last two decades, the modelling of NDT capability has increased and improved

    substantially. The models are now being used as part of PoD studies to simulate the results of

    inspecting components with quite complex geometries.

3.5.1.  An Overview

The savings in carrying out modelling of PoD, as opposed to the experimental determination of PoD, have been a strong motivation in the development of such models.

The historical development of computational NDT and PoD models is discussed in some detail in a relatively recent NTIAC publication (11), covering the period from 1977 to 2001. The development of modelling PoD has focussed on NDT methods such as ultrasound, eddy currents and X-ray radiography,


and numerous publications are cited in reference 11. Whilst the models have been used to produce PoD results for particular NDT methods and flaws, their other main contribution has been to optimise and validate the NDT procedures.

During the 1990s there were major research efforts in modelling NDT reliability and PoD from Iowa State University (USA) and the National NDT Centre, Harwell (UK). Two notable publications in the 1990s were Thompson (12), which contained an updated review of the PoD methodology developed for the NDT of titanium components, and Wall (13), which focussed on the PC-based models at Harwell and included corrections to PoD models due to human and environmental factors. Both of the above publications are worthy of consideration for anyone wishing to start modelling PoD or to get a very good overview of the capabilities and usefulness of modelling PoD.

A number of models discussed in reference 11 also consider the probability of false calls (or probability of false alarms, PFA) and there is a good description of PFA in references (5) and (11). PFA will not be reported here as it is outside the scope of the project.

    3.5.2.  The PoD-generator (The Netherlands)

A recent NDT reliability model that is worth mentioning is the ‘PoD-generator’. This particular model was developed in the Netherlands as part of a joint industry project and presented at the 16th world conference on NDT 2004 (14). The model allows the assessment and optimisation of an inspection programme for in-service components. The PoD-generator is really three models in one: the ‘degradation model’, which predicts the initiation and growth of flaws; the ‘inspection model’, which simulates the performance of the NDT method (currently it can deal with ultrasound or radiography); and the ‘integrity model’, which predicts the probability of failure. The degradation model passes information about the flaws to the inspection model, which in turn passes information about the inspection performance to the integrity model. A simple example of ultrasonic pulse-echo measurements to illustrate the concept of the PoD-generator is provided in reference (14).

    3.5.3.  Iowa State University (USA)

The main centre of excellence in the USA for PoD studies is almost certainly Iowa State University. In the field of modelling they have developed physically detailed models for predicting PoD. Some of their main collaborations in the USA have been, understandably, with the aerospace industry and the air force research laboratories. In fact, in the September 2005 NTIAC Newsletter there was an interesting article on developments at Iowa State regarding PoD. The article reported that the Model Assisted PoD (MAPOD) Working Group has been established with the joint support of some major aerospace and air force research laboratories. The MAPOD approach is based on using modelling to determine PoD results in a way that reduces the need for the empirical approach, which can incur substantial costs and is usually slow to deliver results. More detailed information on MAPOD can be found in the September 2005 NTIAC Newsletter or by visiting the NTIAC website at www.ntiac.com.

    3.5.4.  National NDT Centre (UK)

The National NDT Centre (NNDTC) in the UK, which is now part of ESR Technology Ltd, holds a similar position in the UK and Europe on PoD as Iowa State holds in the USA. One of the major contributions of the NNDTC has been in the development of computer models for predicting PoD. However, they have also contributed to a number of national and international trials on PoD (e.g. the USA ageing aircraft programme) as well as some high profile industrial applications of PoD (see Section 4).


On the modelling side, there is a PoD model for ultrasonic corrosion mapping (15), which predicts the PoD

    theoretically as well as by a simulation approach. Simulated images are brought up on the screen and

    the inspector can mark where flaws are seen, like a ‘spot the ball’ approach. The data is then

analysed in terms of PoD and false calls. There are also PoD models originally developed for the European Space Agency (ESA), which deal with ultrasonic C-scanning and radiography of

    composite materials. The ESA work was reported at WCNDT 2000 (16). More recent work on

modelling PoD includes the Magnetic Flux Leakage method in floor scanners and Eddy Currents for fastener inspection in airframe structures. There are also a number of other model applications,

    notably in the offshore industry and more specific information on these can be found on the NNDTC

    website at www.nndtc.com.

    4.  THE PRACTICAL APPLICATION OF POD CURVES

    4.1.  HOW POD CURVES ARE USED IN INDUSTRY 

PoD curves provide reference to results that have been obtained for particular flaws using specific NDT procedures. However, it is important to appreciate that, when particular PoD curves are used for different applications, some validation of the NDT procedures should be carried out. The PoD curves provide important results for quantifying the performance capability of NDT procedures, as well as of the operators, and could be used as a basis for:

    •  Establishing design acceptance requirements

    •   NDT procedure qualification and acceptance

    •  Qualification of personnel performance

    •  Comparing the performance capabilities of NDT procedures

    •  Selecting an applicable NDT procedure

    •  Quantifying improvements in NDT procedures

    •  Developing repeatable NDT data for fracture mechanics

    The examples provided below of PoD applications to different industries link in well with the above

    uses of PoD curves (3).

    4.2.  PUBLISHED WORK ON POD CURVES IN DIFFERENT INDUSTRIES 

The methodology of PoD reliability studies, developed in the late 1960s and early 1970s for the aerospace industry, has been adopted by a number of other industries and some of these will be discussed here.

    4.2.1.  Aerospace (NASA)

The first general requirements to quantify the capabilities of NDT methods came with the design and production of the NASA space shuttle system. In the past, the capability and reliability of routinely applied NDT procedures was assumed, but no one had produced any factual evidence. For example,

knowing the smallest flaw detected by an NDT method was not much use, as there were many flaws larger than this smallest flaw that were missed. The more relevant flaw size was the

    largest flaw that could be missed (c.f. Figure 2). NASA initiated a research program in 1969 to

    determine the largest flaw that could be missed for the materials and NDT methods that were to be

    used in relation to the design and production of the space shuttle (1, 3).

    4.2.2.  Aircraft Structures, Inclusions in Titanium Castings

Childs et al (17) assessed X-ray radiography for the detection of ceramic inclusions in thick Titanium (Ti) castings used in aircraft structures. The castings were manufactured using the ‘Hot Isostatic Pressure’ (HIP) process. During the HIP process, the ceramic face coat can break into


    splinters (or ‘spall’) and become embedded in the casting as ceramic inclusions called ‘shells’. The

    X-ray radiography results were analysed in terms of PoD as a function of shell diameter for different

    face coat formulations from different suppliers. The PoD results were used to improve the face coat

formulations and hence improve detectability.

    4.2.3.  NORDTEST Trials

    The NORDTEST programme (18) set out to compare manual ultrasonic NDT with X-ray

radiography when applied to carbon manganese steel butt welds ≤ 25mm thick. The study was used

    to establish ‘acceptance curves’ as opposed to PoD curves. The acceptance curves defined

    acceptance probabilities vs. flaw height, where the acceptance probabilities were really 1 – PoD. The

    results of the NORDTEST trials demonstrated that there was an approximate relationship between

    certain ultrasonic NDT and radiographic NDT acceptance criteria (see also reference (19)).

    4.2.4.  Nuclear Components (The PISC Trials)

    The Programme for the Inspection of Steel Components (PISC), carried out in the mid to late

    seventies (20), was concerned with the flaw detection capabilities of ultrasonic NDT on thick walled

    nuclear pressure vessel components (i.e. ~ 250mm).

    The ultrasonic NDT procedures used in the trials were applied too rigidly and did not allow the

signal responses from large planar flaws to be evaluated properly. Hence, relatively low PoDs were obtained for quite large flaws. This is a good example of poorly designed NDT procedures leading to unexpected and low PoD results (see also the discussion below in section 5.1).

    In the above PISC-I trials some of the inspectors were allowed to use their own preferred NDT

     procedures. This approach proved more effective and the PoD results were much higher for the same

    large flaws.

    In the PISC-II trials (21), the approach of using more flexible ultrasonic NDT procedures showed

    that the flaw characteristics (e.g. flaw shape, flaw geometry, orientation) had a relatively larger

    influence on the final PoD results compared to other parameters of the NDT procedures.

    4.2.5.  Offshore tubular Joints

The underwater PoD trials at University College London in the early 1990s considered the detection of fatigue cracks in offshore tubular joints (22). The results of the trials were used to compare the

    flaw detection capabilities of Magnetic Particle Inspection (MPI) with a number of eddy current

NDT techniques as well as ultrasonic NDT techniques using creeping waves. For the techniques

considered, the 90%/95% PoD/confidence limit combination was being achieved for cracks with typical lengths ≥ 100 mm.

    4.2.6.  Dutch Welding Institute (NIL)

The Dutch Welding Institute (Nederlands Instituut voor Lastechniek (NIL)) acts as a moderator of

     NDT in the Netherlands, but does not have its own experts in NDT.

During the mid-1980s to the mid-1990s NIL produced four reports based on four major joint industry

     projects (JIP), which were funded and carried out by Dutch industry. One of the JIP projects (23)

    involved assessing the reliability of mechanised ultrasonic NDT, in comparison with standard film

    radiography and manual ultrasonic NDT, for detecting flaws in thin steel welded plates (i.e. 6mm to

    15mm).


    There were 244 simulated, but realistic, flaw types such as lack of penetration, lack of fusion, slag

    and gas inclusions and cracks, which spanned 21 flat welded test plates. Some of the main

    conclusions were:

•  Mechanised ultrasonic NDT (i.e. mechanised pulse-echo and time of flight diffraction (TOFD)), performed better than manual ultrasonic NDT with respect to flaw detection capability.

•  Mechanised ultrasonic NDT was better at flaw sizing than manual ultrasonic NDT.

•  Double exposure weld bevel radiography performed better than 0° film radiography.

•  The detection performance did not depend on the wall thickness in the range 6mm to 12mm.

    Some of the PoD values associated with this particular flaw population, and the 6mm to 12mm

     plates, are as follows:

NDT Methods                                  PoD Values (%)

Mechanised ultrasonic and TOFD               60-80

Manual ultrasonic NDT                        50

0° film radiography                          65

Double exposure weld bevel radiography       95

False calls                                  10-20

    It is always important in these kinds of studies not to draw too many general conclusions but simply

    accept the results for this particular set of flaw specimens.

    The results of this particular NIL study on PoD, along with the NORDTEST (18), PISC (20, 21) and

    the underwater trials at UCL (22), are reviewed in more detail in an HSE report with a focus on

    offshore technology (24).

    4.2.7.  Railways (National NDT Centre (UK))

During the last 5 years the NNDTC has worked with the UK rail industry’s main line and London Underground to improve and quantify the reliability of inspection. This has included looking at the reliability of ultrasonic near-end and far-end scan methods used on axles as well as other work on

     bogie frames, wheel sets and train structures. There has been work also on the rail infrastructure

    including rail inspection and edge-corner cracking issues and electromagnetic modelling.

PoD is commonly used in the rail industry to quantify reliability and to optimise the inspection periodicity using probabilistic methods. NNDTC has produced improved estimates of PoD for ultrasonic axle inspection. PoD estimates have also been produced for the improved NDE methods

    and for new designs utilising hollow axles.

More recently, the NNDTC has developed a simulation model utilising real A-Scan data and data from real flaws to produce PoD curves for far-end and near-end axle inspection. This enables

specific PoD curves to be produced for individual axle designs and geometries. The location and

    sizes of the cracks can be altered and the effect of geometric features on detectability evaluated.

    There has been a lot of interest in the industry in improved methods for NDT measurement of bogie

    frames and NNDTC has been heavily involved in this, particularly for inspecting less accessible parts

of the bogie. This work included PoD trials on manual ultrasonic inspection of welds in bogie

    frames (25).


    4.2.8.  LPG Storage Vessels

    The extent of non-invasive inspection of Liquid Petroleum Gas (LPG) storage vessels has been

    considered previously by Georgiou and a probabilistic model was devised for optimising flaw

    detection and a number of reports and papers were published (26-29). A guidelines document (28)

    was written to assist companies and HSE inspectors to assess how much NDT was required in order

to achieve a desired probability of detecting a flaw and was based on a concept called the ‘index of detection’ (IoD). The IoD was related to ‘Probability of Inclusion’ (PoI) curves and to a particular

PoD(a) curve (i.e. for ultrasonic NDT), which was kindly provided by NNDTC from a particular

    PoD study (30).

    Since the work by Georgiou, some HSE inspectors have considered the PoI curves as well as the IoD

    results in the guidelines document (28). It was considered timely to assess their comments as well as

     pull together all the statistical models considered so far, validate them against real data using

    appropriate statistical techniques and select the best available model. The additional work on the PoI

curves and the IoD have been completed alongside this PoD work and are provided as two self-contained reports in Appendices C and D respectively. The updated work is now considered to have

    contained reports in Appendices C and D respectively. The updated work is now considered to have

    wider applications than just the ultrasonic NDT of LPG storage vessels.

    5.  THE LIMITATIONS OF APPLYING POD CURVES

    5.1.  COMMENTS ON HIT/MISS DATA AND SIGNAL R ESPONSE DATA 

Whilst the approaches to determine the PoD(a) function for hit/miss data and the signal response data are quite different, the log-odds and cumulative log-normal distribution functions are very similar for the same statistical parameters. Figure 6 shows a comparison between the log-odds and cumulative log-normal for µ = 0 and σ = 1 (6).
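The closeness of the two functions can be checked numerically. The sketch below (an illustration written for this discussion, not code from any of the trials) evaluates a variance-matched log-odds function and the cumulative log-normal for µ = 0 and σ = 1 over a range of flaw sizes:

```python
import math

def pod_log_odds(a, mu=0.0, sigma=1.0):
    """Log-odds (log-logistic) PoD, scaled so ln(a) has mean mu and sd sigma."""
    z = (math.log(a) - mu) / sigma
    return 1.0 / (1.0 + math.exp(-math.pi * z / math.sqrt(3.0)))

def pod_log_normal(a, mu=0.0, sigma=1.0):
    """Cumulative log-normal PoD: standard normal CDF of (ln(a) - mu)/sigma."""
    z = (math.log(a) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# largest difference between the two curves over a wide range of flaw sizes
sizes = [0.05 * k for k in range(1, 201)]
max_diff = max(abs(pod_log_odds(a) - pod_log_normal(a)) for a in sizes)
print(f"largest difference in PoD: {max_diff:.3f}")  # about 0.02
```

With matched mean and variance the two curves differ by only about two percentage points at worst, which is why either distribution is normally an acceptable fit to the same data.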

On occasions the behaviour of the PoD data may appear illogical and the PoD(a) function selected (e.g. log-odds or cumulative log-normal) may not fit the data. It may be, of course, that other modelling approaches need to be considered (31, 32). However, it is useful to carry out some quick

    checks to see if there is something specific about the data in order to decide what action to take.

In the case of hit/miss data it has been observed that the PoD(a) function can sometimes decrease with flaw size (i.e. large flaws are missed more than small ones). This is usually because the NDT experiment was poorly designed, and it would require repeating some of the trials with better-designed NDT procedures. There was a good example of this in the PISC-I trials discussed above in section

    4.2.4. There are also a number of examples of this in the NTIAC data book (5) and a particular one is

     provided in Figure 7, simply to illustrate how the PoD curve can behave for such a case.

Regions of flaw hits and flaw misses should not be distinct; there has to be a good overlap (c.f. Figure 2), otherwise the analysis that fits the log-odds model will not produce a valid solution (6). This usually means more data is required in the region between a_smallest and a_largest (c.f. Figure 2).

It is also possible to produce what appears to be an acceptable PoD(a) function that fits the data well, but the confidence limit decreases with increasing flaw size. This is usually evidence that the log-odds model is not a good fit. This behaviour is usually associated with extreme values of µ and σ (e.g. large µ and small σ).

In signal response data, there is less reliance on the overlap of flaw size range and more emphasis on the linear relationship between ln(â) and ln(a). When the relationship is not linear, the cumulative log-normal will not fit the data. This is usually associated with unreasonable values for µ and σ, and the lower confidence limit will eventually decrease with increasing flaw size (similar to that


    observed with the hit/miss data). When these situations occur, it is worth checking that the NDT

    experiment was designed and executed properly. Failing that, it is likely that a different model needs

    to be investigated (32).
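The linearity of ln(â) against ln(a) mentioned above is straightforward to check with an ordinary least-squares fit before any PoD curve is fitted; an r² well below 1 flags a non-linear relationship. A sketch with invented signal amplitudes:

```python
import math

# (flaw size a, signal response â) pairs -- invented values for illustration
pairs = [(0.5, 2.1), (1.0, 4.2), (1.5, 6.9), (2.0, 8.8), (3.0, 13.5)]
x = [math.log(a) for a, _ in pairs]
y = [math.log(ahat) for _, ahat in pairs]

n = len(pairs)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
syy = sum((yi - my) ** 2 for yi in y)

beta1 = sxy / sxx            # slope of the fit ln(â) = beta0 + beta1*ln(a)
beta0 = my - beta1 * mx
r2 = sxy ** 2 / (sxx * syy)  # r^2 near 1 => the linear model is reasonable

print(f"beta0 = {beta0:.2f}, beta1 = {beta1:.2f}, r^2 = {r2:.3f}")
```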

    5.2.  IMPORTANT OPERATING AND PHYSICAL PARAMETERS 

In the Recommended Practice (1), operating parameters for each of five NDT methods are provided (i.e. Ultrasound, Eddy Currents, Penetrants, Magnetic Particle and Radiography). Each method has a

    very detailed list of both operator controlled parameters, relating to the NDT method, and physical

     parameters associated with the specimen and flaws. The parameters for each NDT method are too

    numerous to repeat here, but different NDT methods will be considered to assess the differences in

    their respective PoD curves. In addition, the effects on the PoD curves from material properties, the

    specimen geometry and the flaw characteristics will also be considered.

    The NTIAC data book (5) provides information on 423 PoD curves covering eight NDT methods

(i.e. the five mentioned in the Recommended Practice (1) above as well as visual testing and two so-called emerging NDT methods, which are Holographic Interferometry and ‘Edge of Light’

    inspection).

    In this report, the NTIAC data book was used as the prime source of raw PoD data and these data

    have been used to assess how the PoD curves are affected by the various operational and physical

     parameters in the sub-headings below. It is important to note that the NTIAC data book contains only

    hit/miss data and in each of the 423 PoD curves it is the log-odds model that has been used to fit the

    data (using a 95% confidence limit). In all cases the actual flaw dimensions have been verified by

    destructive analysis and measurement.

In order to assess the effects of the operational and physical parameters, it was necessary to find PoD data where only one of the operational or physical parameters changed while the other parameters remained the same. This was not always completely clear, as there were always some uncertainties, notwithstanding the data sheets indicating which parameters were nominally the same and which

    were different. The examples selected cover a range of NDT methods and help to illustrate the kind

    of differences that can exist between PoD results, but without any deliberate attempt to maximise

    these differences.

Before observing the effects of certain operational and physical parameters on the PoD results, it is important to note the following points:

    •  The PoD data in the NTIAC data book (5) was collected about 30 years ago and may not

    necessarily reflect current capabilities with modern digital instrumentation.

•  The PoD data illustrated in each of the figures 7-13 are valid for the particular datasets in question. It would be wrong to draw general conclusions about PoD values (e.g. ultrasound is

     better than X-ray). The figures merely serve to illustrate the possible effects that the physical and

    operational parameters can have on the PoD and that we should be aware of these effects when

    quoting PoD results.

    •  Equipment ‘calibration’ is also one of the important variables in the application of an NDT procedure. It is believed that no attempt was made to resolve calibration issues in collecting the

    inspection data.

    •  The designated operators A, B and C recorded in the datasets are not necessarily the same 3

     persons each time.

 Notwithstanding the above points, the PoD datasets in the NTIAC data book (5) are considered a rich, comprehensive and valid set of data, which would almost certainly be prohibitively expensive


    to repeat by any one organisation using more modern digital technology. Such data does not appear

    to exist elsewhere in such an easily accessible and consistent format in which to illustrate the

    comparisons below.

    5.2.1.  NDT Method

To illustrate the differences in the PoD curves that can occur for different NDT methods, the NTIAC data book was used to identify the NDT carried out on the same flaw specimen and by the same

    designated operator. Whether the designated operator (e.g. operator C) is precisely the same person

    in each case is not absolutely clear. However, it is believed that the cases selected offer a reasonable

    independent measure of the differences.

    Two Titanium flat plates (i.e. thicknesses 1.7mm and 5.7mm) with a total of 135 cracks were

    inspected by the same designated operator using; manual eddy currents, manual ultrasound (surface

    waves) and X-ray radiography. The PoD curves for each method are plotted in Figure 8 and show the

    differences in the PoD curves for a particular dataset. In particular, the flaw size corresponding to the

    90% PoD varies significantly for each NDT method (i.e. 3.4mm, 14.8mm and 18.5mm for

    ultrasound, eddy currents and X-ray radiography respectively).

    5.2.2.  Fluorescent Penetrant NDT

    In the case of fluorescent penetrant NDT, two datasets were considered which quantify the

differences in PoD between the cases of no developer and developer being used to reveal surface flaws (Figure 9). Best practice cleaning procedures were followed between inspections. Whilst

    surface lengths were measured, the depths were predicted from validated crack growth procedures.

The PoD differences in Figure 9 for this particular flaw specimen (i.e. Haynes 188 alloy (AMS 5608A), with a 125 RMS surface finish and dimensions 3.5in x 16in x 0.19in) are quite large, with the 90% PoD

    not being achieved without the developer.

    5.2.3.  Material Properties

To assess the differences in PoD for different materials being inspected with the same NDT method,

    the same size flaw specimen with the same nominal flaws had to be found. Clearly, this was not

    going to be 100% possible with different material specimens. However, by a close examination of

the datasheets which accompany each dataset, it was possible to find flaw specimens that were produced in the same way with the intention of producing the same flaw types.

    The three PoD results illustrated in Figure 10 are for three different materials (i.e. aluminium,

    titanium and steel). The datasheets for these three datasets suggest that the only physical difference is

the material, although the width of the steel plate is different from the other two. The thickness of all three is the same and they are in the same ‘as machined’ state. The flaws were all initiated using the

    same mechanism and cover a similar length range, but clearly the flaws in each different material

    specimen will not be identical.

It is worth mentioning that these particular steel PoD results improved dramatically once the specimen went beyond the ‘as machined’ state (e.g. etching and proof loading).

    5.2.4.  Specimen Weld Geometry

    The aim here was to look for PoD data that had different weld conditions. Since the NTIAC data

 book has considered mainly flat panel specimens and bolt holes, there was not a V-butt weld to compare with a J-prep weld, for example. The closest was a particular comparison for the same flaw


    specimen, but with a different condition of the weld. The PoD results for aluminium welds with

crowns were compared to PoD results with the welds ground flush. The NDT method used was X-ray

    radiography and the results are illustrated in Figure 11.

The differences in PoD are relatively small and the 90% PoD was not achieved in either case, although for the ‘welds ground flush’ condition the 90% PoD was very nearly achieved at 0.75in (~19mm).

    5.2.5.  Flaw Characteristics

    To illustrate the possible effect on the PoD from inspecting different flaws, fluorescent penetrant

     NDT was considered for longitudinal cracks and transverse cracks covering the same flaw length

    range. The datasets for two specimens were found that were physically the same apart from the

    flaws. The PoD results for this comparison are illustrated in Figure 12. The transverse flaws are

associated with much lower PoD values for relatively small crack lengths (i.e. below about 0.15in

    ( ~ 4mm)), but the PoD results are more similar for larger crack lengths (i.e. above about 0.25in (~

    6mm)), beyond which the PoD results both converge to unity.

5.2.6.  Human Reliability

This is an area which has been researched extensively in the UK at the NNDTC (c.f. reference 13). It

    would be very easy to show some quite startling differences in detection capability based on human

    reliability studies. In manual ultrasonic NDT, for example, an often quoted anecdote is ‘you can only

 believe a manual ultrasonic NDT result 50% of the time’. Perhaps this originated from typical differences observed in the past between two operators in various detection trials (c.f. the NIL study (23) discussed in section 4.2.6). The ‘50% anecdote’ is believed to be too simplistic for many situations and more information needs to be considered.

    The NTIAC data book (5) does contain a great deal of data where the physical and operational

     parameters are the same, for a particular NDT method, and the only difference is the operator.

However, the data book did not set out to study human reliability. Nevertheless, it does appear that for nearly every datasheet there are three PoD results (i.e. one each for operators A, B and C). There are a

    number of cases where the differences in the PoD results are significant and others where the PoD

    results are very similar.

The cases selected to illustrate operator variability here are shown in Figure 13 for ultrasonic immersion NDT, inspecting titanium plates with low cycle fatigue cracks.

    Whilst the results appear similar, and the same order of flaws is missed by each operator, the 90%

PoD varies by at least a factor of 2 (c.f. operator A with operators B and C).

    6.  DISCUSSION

    This section brings together all the salient features of this study. It is much more than an Executive

Summary and is aimed particularly at HSE inspectors who want a reasonably quick understanding of PoD without having to read the whole report.

    6.1.  INTRODUCTION 

There is a large amount of ‘Probability of Detection’ (PoD) information available, from the pioneering work in the late 1960s and early 1970s for the aerospace industry to more recent and

    more general industrial applications.


    PoD curves have been produced for a range of Non-Destructive Testing (NDT) methods (e.g.

    ultrasound, radiography, eddy currents, fluorescent penetrants). Whilst it is reasonable to assume that

    each NDT method will produce different PoD curves (even when applied to the same flaws), it is

     believed that many who use PoD curves do not fully appreciate how significant the differences can be. PoD curves are also dependent on a number of physical and operational parameters.

It is important to understand how PoD curves are derived and to question the validity of their application, as well as to appreciate their limitations. This discussion aims to provide this

    information as well as considering the relevance of PoD to fitness for service issues.

    6.2.  AIMS AND OBJECTIVES 

    The overall goal is to provide concise and understandable information on PoD curves, which Health

    and Safety Inspectors should find useful when discussing safety cases involving PoD curves. The

    specific objectives are:

    •  To provide a clear and understandable description of how PoD curves are derived.

•  To provide practical applications of PoD curves, particularly to fitness for service issues.

•  To quantify the limitations of PoD curves.

    6.3.  HISTORICAL DEVELOPMENT 

    An early definition of NDT reliability is, 'the probability of detecting a crack in a given size group

    under the inspection conditions and procedures specified'  (1). The PoD is usually expressed as a

    function of flaw size (i.e. length or depth), although it is a function of many physical and operational parameters.

    Repeat inspections of the same flaw size or the same flaw type do not necessarily result in consistent

    hit or miss indications. There is a spread of detection results, which is why the detection capability is

    expressed in terms of the PoD. An early example is illustrated by Lewis et al (2), where 60 air force

    inspectors, using the same surface eddy-current technique, inspected 41 cracks around fastener holes.

    The results in Figure 1 show that the chances of detecting the cracks increase with crack size.

However, none of the cracks were detected 100% of the time and different cracks of the same size have different detection percentages. Figure 1 also shows that the ‘log-odds’ distribution is a good fit

    to the data and illustrates why PoD is an appropriate measure of detection capability.

PoD functions have been the subject of many studies since the late 1960s and early 1970s, where

    most of the work was carried out in the aerospace industry (3, 4). It was becoming clear that to

    ensure the structural integrity of critical components, the question ‘…what is the smallest flaw that

    can be detected by an NDT method?’   was less appropriate than the question ‘…what is the largest flaw that can be missed?’   To elaborate on this, some real ultrasonic inspection data has been

    considered from the ‘Non-destructive Testing Information Analysis Centre’ (NTIAC) capabilities

data book (5). Figure 2 illustrates the detection capabilities of an ultrasonic surface wave inspection of two flat aluminium plates, containing a total of 311 simulated fatigue cracks with varying depths. The flaws were recorded as detected (or hit) with PoD=1, or missed with PoD=0. In Figure 2, there are three distinct regions separated by the lines a_smallest and a_largest. The region between a_smallest and a_largest shows flaws of the same size which are both hit and missed, and a_largest is much larger than a_smallest.
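The two bounding sizes are easy to extract from hit/miss records; a minimal sketch with invented data (not the NTIAC values) shows how far apart they can be:

```python
# hit/miss records: (flaw depth in mm, 1 = detected, 0 = missed) -- invented
records = [(0.4, 0), (0.6, 1), (0.8, 0), (1.1, 1), (1.5, 0),
           (1.9, 1), (2.3, 1), (2.8, 0), (3.5, 1), (4.2, 1)]

a_smallest = min(a for a, hit in records if hit)     # smallest flaw detected
a_largest = max(a for a, hit in records if not hit)  # largest flaw missed

# flaws between the two lines are sometimes hit and sometimes missed
print(f"a_smallest = {a_smallest} mm, a_largest = {a_largest} mm")
```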

    In 1969, a program was initiated by the National Aeronautics and Space Administration (NASA) to

    determine the largest flaw that could be missed, for various NDT methods to be used in the design

    and production of the space shuttle. The methodology by NASA was soon adopted by the US Air


    Force as well as the US commercial aircraft industry. In the last two decades many more industries

    have adopted similar NDT reliability methods based on PoD. Some of these will be discussed below.

Early on, in the mid-1970s, a constant PoD for all flaw types of a given size was proposed and binomial distribution methods were used to estimate this probability, along with an associated error or ‘lower confidence limit’ (1). It is clear from Figure 1 that this early assumption about a constant

    PoD for flaws of a given size was too simplistic.
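The binomial approach can be illustrated with a short worked calculation using the classic ‘29 out of 29’ example from the aerospace literature (illustrative, not a figure taken from these trials): if all n flaws in a size group are detected, the one-sided lower confidence bound p on the PoD satisfies p^n = 1 − confidence.

```python
# If all n flaws in a size group are detected, the lower 95% confidence
# bound p on the PoD solves p**n = 0.05: were the true PoD any lower,
# a run of n straight hits would have probability below 5%.
n = 29
p_lower = 0.05 ** (1.0 / n)
print(f"{n}/{n} hits gives PoD >= {p_lower:.3f} at 95% confidence")  # ~0.90
```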

    In the early to the mid-1980s, the approach was to assume a more general model for the PoD vs. flaw

    size ‘a’ . Various analyses of data from reliability experiments on NDT methods indicated that the

 PoD(a) function could be modelled closely by either the 'log-logistic' (or ‘log-odds’) distribution or

    the cumulative 'log-normal' distribution (6).

    6.4.  FLAW SAMPLE SIZES FOR ‘HIT/MISS’ DATA AND ‘SIGNAL R ESPONSE’ DATA 

    The ‘Recommended Practice’ (1), originally prepared for the aircraft industry, provides

    comprehensive information on the experimental sequence of events for generating data to produce

    PoD curves and to ‘certify’ (i.e. validate) an NDT method or procedure.

    The sequence of events can be broadly summarised as follows (see also (3)):

    •  Manufacture or procure flaw specimens with a large number of flaw sizes and flaw types

    •  Inspect the flaw specimens with the appropriate NDT method

    •  Record the results as a function of flaw size

    •  Plot the PoD curve as a function of flaw size
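As an illustration of the recording and plotting steps, raw results can first be grouped into flaw-size intervals and the observed detection fraction computed per interval (a sketch with invented results; the final plots use fitted PoD(a) curves rather than raw fractions):

```python
from collections import defaultdict

# (flaw length in mm, detected?) for a batch of inspected flaws -- invented
results = [(0.5, False), (0.9, False), (1.4, True), (1.8, False),
           (2.2, True), (2.7, True), (3.1, False), (3.6, True),
           (4.2, True), (4.8, True)]

# group into 1 mm intervals and compute the observed detection fraction
bins = defaultdict(list)
for size, hit in results:
    bins[int(size)].append(hit)

for lo in sorted(bins):
    hits = bins[lo]
    print(f"{lo}-{lo + 1} mm: {sum(hits)}/{len(hits)} detected")
```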

    However, before the manufacture or procurement of flaw specimens, it is necessary to ask:

     

•  What flaw size parameter will be used (e.g. flaw length or flaw depth)?

•  What overall flaw size range is to be investigated (e.g. 1mm to 9mm)?

•  How many intervals are required within the flaw size range?

    The recommended practice (1) also provides critical information on the flaw sample size, for each

flaw size interval, in order to achieve the desired PoD and the appropriate lower confidence limit.

    It is important to appreciate that in selecting the sample size there are two distinct issues to address.

First, the sample size has to be large enough to achieve the desired PoD and confidence limit combination. Second, the sample size has to be large enough to determine the statistical parameters associated with the PoD curve that best fits the data.

    Originally, NDT results were always recorded in terms of ‘hit/miss’ data (c.f. Figure 2), which is

    discrete data. This way of recording data is still appropriate for some NDT methods (e.g. magnetic

     particle testing). However, in many inspections there is more information in the NDT response (e.g.

    the light intensity in fluorescent NDT). Since the NDT signal response can be interpreted as the

    perceived flaw size, the data is often called â, that is, ‘a hat’ or ‘signal response’ data, which is continuous data.

    6.4.1.  Model for Hit/Miss Data

    For hit/miss data a number of different statistical distributions have been considered (7). It was found

    that the log-logistic distribution was the most acceptable and the PoD(a) function can be written as:

    PoD(a) = exp(α + β·ln(a)) / [1 + exp(α + β·ln(a))]

    where the parameters α and β are estimated from the inspection data.

    6.4.2.  Model for Signal Response Data

    For signal response data, a flaw is recorded as detected when the signal response â exceeds the decision threshold â_th. Assuming the linear relationship ln(â) = β0 + β1·ln(a) + δ, where δ is a normally distributed error term with zero mean and standard deviation σ_δ, the PoD(a) function can be written as:

    PoD(a) = Φ( (ln(a) − μ) / σ )

    which is the cumulative log-normal distribution, where Φ is the standard normal distribution function, the mean μ = (ln(â_th) − β0)/β1 and the standard deviation σ = σ_δ/β1.

    The estimates for β0, β1 and σ_δ are computed using ‘maximum likelihood’ methods (6).
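The two standard models, the ‘log-odds’ (log-logistic) model for hit/miss data and the cumulative log-normal model for signal response data (6), depend only on elementary functions, so they can be sketched in a few lines of Python. This is an illustrative sketch, not code from the report; the parameter values used in examples are arbitrary:

```python
import math

def pod_log_odds(a, alpha, beta):
    """Log-odds (log-logistic) PoD model for hit/miss data."""
    z = alpha + beta * math.log(a)
    return 1.0 / (1.0 + math.exp(-z))

def pod_log_normal(a, mu, sigma):
    """Cumulative log-normal PoD model for signal response data.

    mu and sigma come from the linear fit of ln(a_hat) on ln(a):
    mu = (ln(a_hat_threshold) - beta0) / beta1, sigma = sigma_delta / beta1.
    """
    z = (math.log(a) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF
```

For example, with mu = ln(2.0) the log-normal model gives PoD = 0.5 at a flaw size of 2.0mm, rising towards 1 as the flaw size increases.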

    6.4.3.  To Compute PoD Parameters

    In order to determine the parameters associated with the PoD(a) function, for hit/miss data, it is

    recommended that the flaw sizes be uniformly distributed between the minimum and maximum flaw size of interest, with a minimum of 60 flaws (6).

    For signal response data, a direct consequence of the additional information is that the range of flaw

    sizes is not as critical. The recommendation is a minimum of 30 flaws in the sample (6).
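The ‘maximum likelihood’ fitting of hit/miss data can be illustrated with a deliberately simple sketch: the negative log-likelihood of the log-odds model is minimised by a crude grid search, standing in for the proper optimisation methods described in reference (6). The dataset and grid ranges below are invented for illustration:

```python
import math

def neg_log_likelihood(alpha, beta, data):
    """Negative log-likelihood of the log-odds PoD model.

    data is a list of (flaw_size, hit) pairs with hit in {0, 1}.
    """
    nll = 0.0
    for a, hit in data:
        z = alpha + beta * math.log(a)
        p = 1.0 / (1.0 + math.exp(-z))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        nll -= math.log(p if hit else 1.0 - p)
    return nll

def fit_log_odds(data, steps=60):
    """Crude maximum-likelihood fit by grid search over (alpha, beta)."""
    grid = [(-6.0 + 12.0 * i / steps, 0.1 + 6.0 * j / steps)
            for i in range(steps + 1) for j in range(steps + 1)]
    return min(grid, key=lambda ab: neg_log_likelihood(ab[0], ab[1], data))

# Small illustrative dataset: misses at small flaw sizes, hits at large ones
data = [(1.0, 0), (1.5, 0), (2.0, 0), (3.0, 1), (4.0, 1), (6.0, 1), (9.0, 1)]
alpha_hat, beta_hat = fit_log_odds(data)
```

In practice a proper optimiser (e.g. Newton's method on the likelihood equations) replaces the grid search, but the likelihood being maximised is the same.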

    6.4.4.  To Achieve the Desired PoD/Confidence Limit Combination

    In practice, a PoD and lower confidence limit combination often quoted is 90% and 95%

    respectively. For hit/miss and signal response NDT data (1), it is necessary to have a minimum

    sample of 29 flaws in each flaw width interval. This could be interpreted as 29 flaw specimens with

    one flaw in each specimen. This means that if 6 flaw width intervals were used, a minimum of 174 flaw specimens would be necessary, a considerable cost to produce PoD curves experimentally.

    With such a large number of flaws, the requirement to compute the PoD(a) function parameters, as

    discussed in section 6.4.3 above, is easily satisfied.
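The figure of 29 flaws follows from a binomial argument: if the true PoD were only 90%, the probability of detecting all n flaws is 0.9^n, and n = 29 is the smallest sample for which this falls below 5%. Observing 29 hits out of 29 therefore demonstrates PoD ≥ 90% at 95% confidence. A quick check:

```python
# Find the smallest n for which detecting all n flaws would be
# "surprising" (probability < 5%) if the true PoD were only 90%.
n = 1
while 0.9 ** n >= 0.05:
    n += 1
print(n, 0.9 ** n)  # n = 29, 0.9**29 is about 0.047
```

This also makes clear why a single miss breaks the demonstration: the 29/29 criterion rests on a clean sweep.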

    6.5.  POD MODELLING 

    During the last two decades, the modelling of NDT capability has increased and improved substantially. The savings in carrying out modelling of PoD, as opposed to the experimental

    determination of PoD, have been a strong motivation in developing models.

    The historical development of computational NDT and PoD models is discussed in some detail in a

    relatively recent NTIAC publication (11), covering the period from 1977-2001. The development of

    modelling PoD has focussed on NDT methods such as ultrasound, eddy currents and X-ray radiography, and numerous publications are cited in reference (11).

    During the 1990s there were major research efforts in modelling NDT reliability and PoD from

    Iowa State University (USA) and the National NDT Centre, Harwell (UK). Two notable publications in the 1990s were Thompson (12), which contained an updated review of the PoD methodology

    developed for the NDT of titanium components, and Wall (13), which focussed on the PC-based models at Harwell and included corrections to PoD models due to human and environmental factors.

    Both the above publications are worthy of consideration for anyone wishing to start modelling PoD

    or to get a very good overview of the capabilities and usefulness of modelling PoD.


    In recent years both Iowa State and the National NDT Centre (NNDTC) have continued developing

    models to determine PoD results. Iowa State has established the Model Assisted PoD (MAPOD)

    working group, with the joint support of some major aerospace and airframe research laboratories.

    The NNDTC, which is now part of ESR Technology Ltd, has continued to develop some interesting PoD models for composites (16), magnetic flux leakage in floor scanners and other PoD applications

    in the offshore industry (see www.nndtc.com).

    A recent interesting NDT reliability model is the ‘PoD-generator’ (14). The model allows the

    assessment and optimisation of an inspection program for in-service components using ultrasound

    and radiography.

    6.6.  PRACTICAL APPLICATIONS OF POD

    The methodology of PoD reliability studies, developed in the 60s and 70s for the aerospace industry,

    has been adopted by a number of other industries and some of these are discussed below.

    6.6.1.  Aircraft Structures, Inclusions in Titanium Castings

    Childs et al (17) assessed X-ray radiography for the detection of ceramic inclusions in thick Titanium (Ti) castings for aircraft structures. The castings were manufactured using the ‘Hot

    Isostatic Pressure’ (HIP) process. During this process, the ceramic face coat can break into splinters

    and become embedded in the casting as inclusions (i.e. ‘shells’). The X-ray radiography results were

    analysed in terms of PoD of the shell diameter, for different face coat formulations, and the results were used to improve the face coat formulations and also improve detectability.

    6.6.2.  NORDTEST Trials

    The NORDTEST trials (18) set out to compare manual ultrasonic NDT with X-ray radiography when

    applied to carbon manganese steel butt welds ≤ 25mm thick. The trials were used to establish

    ‘acceptance curves’, which defined acceptance probabilities (i.e. 1-PoD) against flaw height. The results of the NORDTEST trials demonstrated that there was an approximate relationship between

    certain ultrasonic NDT and radiographic NDT acceptance criteria (see also reference (19)).

    6.6.3.  Nuclear Components (The PISC Trials)

    The Programme for the Inspection of Steel Components (PISC), carried out in the mid to late

    seventies (20), considered flaw detection capabilities of ultrasonic NDT on thick nuclear pressure

    vessel components (i.e. ~ 250mm). The ultrasonic NDT procedures were applied too rigidly and

    signal responses from large planar flaws were not evaluated properly, resulting in low PoD values.

    However, some of the inspectors were also allowed to use their own preferred NDT procedures,

    which proved more effective and the PoD results were much higher for the same large flaws.

    In the PISC-II trials (21), the approach of using more flexible ultrasonic NDT procedures showed

    that the flaw characteristics (e.g. flaw shape, flaw geometry, orientation) had a relatively larger influence on the final PoD results compared to other physical parameters.

    6.6.4.  Offshore Tubular Joints

    The underwater PoD trials at University College London in the early 1990’s considered the detection

    of fatigue cracks in offshore tubular joints (22). The results of the trials were used to compare the

    flaw detection capabilities of Magnetic Particle Inspection (MPI) with a number of eddy current

    NDT techniques as well as ultrasonic creeping wave NDT techniques. The 90-95 PoD/Confidence limit

    combination was being achieved for cracks with typical lengths ≥ 100mm.


    6.6.5.  Dutch Welding Institute (NIL)

    During the mid-90s, the Dutch Welding Institute (Nederlands Instituut voor Lastechniek (NIL))

     produced a report (23) involving, amongst other NDT methods, the reliability of mechanised

    ultrasonic NDT for detecting flaws in thin steel welded plates (i.e. 6mm to 15mm).

    Some of the main conclusions for ultrasonic NDT were:

    •  Mechanised ultrasonic NDT and time of flight diffraction (TOFD) had a higher flaw detection

    capability than manual ultrasonic NDT (i.e. PoD of 60%-80% compared to 50% respectively).

    •  Mechanised ultrasonic NDT was better at flaw sizing than manual ultrasonic NDT.

    An HSE report, focussing on offshore technology (24), has carried out a detailed review of the

    NORDTEST trials, the PISC trials, the underwater trials at UCL and the NIL PoD study.

    6.6.6.  Railways

    During the last 5 years the NNDTC has worked with the UK rail industry’s main line and London

    Underground to improve and quantify the reliability of inspection.

    PoD methods are commonly used in the rail industry to quantify reliability and to optimise inspection

     periodicity. The NNDTC has developed a simulation model utilising real A-Scan data and data from

    real flaws to produce PoD curves for far-end and near-end axle inspection. NNDTC has also been

    heavily involved in PoD trials on manual ultrasonic inspection of welds in bogie frames (25).

    6.6.7.  LPG Storage Vessels

    The NDT of LPG storage vessels was considered by Georgiou and a probabilistic model for optimising flaw detection was developed. The reports and papers published (26-29) included a

    guidelines document (28) written to assist companies and HSE inspectors to assess the amount of NDT

    required in order to achieve a desired PoD, and used a concept called the ‘index of detection’ (IoD).

    In the meantime, some HSE inspectors have considered the IoD work. It was timely to assess their

    comments as well as pull together the statistical models considered so far, validate them against

    different data using statistical techniques and select the best available model. This additional work

    has been completed alongside this PoD study and is now considered to have wider applications than

     just the ultrasonic NDT of LPG storage vessels (c.f. Appendices C and D).

    6.7.  DEPENDENCE OF POD ON OPERATIONAL AND PHYSICAL PARAMETERS 

    6.7.1.  Important Operational and Physical Parameters

    The NTIAC data book (5) has been the prime source of raw PoD data (i.e. 423 PoD curves) for assessing the effects of the operational and physical parameters on PoD curves. The

    NTIAC data book contains only hit/miss data and in each of the 423 PoD curves it is the log-

    odds model that has been used to fit the data, along with a 95% confidence limit.

    To assess the effects of the parameters, PoD data was considered where only one of the parameters changed while the other parameters remained the same. The examples selected cover a range of NDT

    methods and help to illustrate the kind of differences that can exist between PoD results, but without

    any deliberate attempt to maximise these differences.

    Before observing the effects of certain parameters on the PoD curves, it is important to note the

    following points:


    •  The PoD data in the NTIAC data book was collected about 30 years ago and may not necessarily

    reflect current capabilities with modern digital instrumentation.

    •  The PoD data illustrated in the figures to follow are valid for the particular datasets in question.

    It would be wrong to draw general conclusions about PoD values. The figures merely serve to illustrate the possible effects that the parameters can have on the PoD and that we should be

    aware of these effects when quoting PoD results.

     

    •  Equipment ‘calibration’ is an important variable in the application of an NDT procedure. It is believed that no attempt was made to resolve this issue in collecting the inspection data.

    •  The designated operators A, B and C recorded in the datasets are not necessarily the same 3

     people each time.

    Notwithstanding the above points, the PoD datasets in the NTIAC data book are a rich and

    comprehensive set of data, which would almost certainly be prohibitively expensive to repeat by any

    one organisation using more modern digital technology. Such data does not appear to exist elsewhere

    in such an easily accessible and consistent format to illustrate the comparisons below.

    6.7.2.  NDT Method

    Two Titanium flat plates (i.e. thicknesses 1.7mm and 5.7mm) with a total of 135 cracks were inspected by the same designated operator using manual eddy currents, manual ultrasound (surface

    waves) and X-ray radiography. The PoD curves for each method are plotted in Figure 8 and show the

    differences in the PoD curves for a particular dataset.

    6.7.3.  Fluorescent Penetrant NDT

    Two datasets were considered which quantify the differences in PoD between the cases of no

    developer and developer being used to reveal surface flaws (Figure 9). Whilst surface lengths were

    measured, the depths were predicted from validated crack growth procedures.

    6.7.4.  Material Properties

    The three PoD results illustrated in Figure 10 are for aluminium, titanium and steel. The datasheets

    for these datasets suggest the only physical difference is the material, although the width of the steel plate is different from the other two. The thickness of all three is the same and they are in the same

    ‘as machined’ state. The flaw types were all initiated using the same mechanism. However, the flaws

    in each different material specimen are clearly not identical. It is worth noting that the particular PoD

    results for steel improves dramatically once the specimen goes beyond the ‘as machined’ state (e.g.

    etching and proof loading).

    6.7.5.  Specimen Weld Geometry

    The NTIAC data book contains data on flat panel specimens and bolt holes, with no V-butt welds to compare with J-prep welds, for example. However, there are particular PoD results for aluminium

    welds with crowns and PoD results with the same aluminium welds ground flush. The NDT method

    used was X-ray radiography and the results are illustrated in Figure 11.

    6.7.6.  Flaw Characteristics

    Fluorescent penetrant NDT was considered for inspecting longitudinal cracks and transverse cracks

    covering the same flaw length range. The PoD results for this comparison are illustrated in Figure 12. The transverse flaws are associated with much lower PoD values for relatively smaller crack lengths

    (i.e. below about 4mm), but are more similar for larger crack lengths (i.e. above about 6mm).


    6.7.7.  Human Reliability

    It would be very easy to show significant differences in detection capability based on human

    reliability studies. In manual ultrasonic NDT an often quoted anecdote is, ‘you can only believe a

    manual ultrasonic NDT result 50% of the time’. Perhaps this originated from typical differences

    observed in the past (c.f. the NIL study (23)). The ‘50% anecdote’ is believed to be too simplistic for

    many situations and more information is required. The NTIAC data book does contain a great deal of data where the only difference is the operator (i.e. operators A, B, C).

    The cases selected here for operator variability are illustrated in Figure 13 for ultrasonic

    immersion NDT, inspecting titanium plates with low cycle fatigue cracks.

    7.  INDEPENDENT VERIFICATION

    The independent verification was carried out by George A Georgiou (GAG), Emilie Beye (EB), and Melody Drewry (MD). Whilst GAG is the author of this report, there were specific aspects of the

    statistical theory and calculations in Appendix C that he did not carry out and particular checks were

    carried out in relation to the formal statistical tests and the datasets used in those tests. Similarly, EB

    and MD, who are co-authors of Appendix C, were not involved at all in the main study and numerous checks were carried out by them in the following areas:

    •  Derivation of the mathematical equations

    •  The statistical definitions and terminologies in Appendix A

    •  The figures and the corresponding datasets (i.e. Figure 2 and Figures 7 – 13)

    •  The comparison of the Probability of Inclusion curves in Appendix C

    •  The updated Index of Detection model in Appendix D

    •  The Conclusions and Recommendations

    A formal verification statement is made in section 12.

    8.  CONCLUSIONS

    •  The ‘log-odds’ distribution is found to be one of the best fits for hit/miss NDT data.

    •  The log-normal distribution is found to be one of the best fits for signal response NDT data, and

    in particular for flaw length and flaw depth data as determined by ultrasonic NDT.

    •  In some cases, the ‘log-odds’ and cumulative log-normal distributions are very similar, but there are many cases where they are significantly different.

    •  There are NDT data when neither the ‘log-odds’ nor the log-normal distributions are appropriate

    and other distributions need to be considered.

    •  There is often a large gap between the smallest flaw detected and the largest flaw missed.

    •  Very small or very large flaws do not contribute much to the PoD analysis of hit/miss data.

    •  To achieve a valid ‘log-odds’ model solution for hit/miss data, a good overlap between the

    smallest flaw detected and the largest flaw missed is necessary.

    •  To achieve a valid log-normal model solution for signal response data, there is less reliance on flaw size range overlap, but more on the linear relationship between ln(â) and ln(a).

    •  When the PoD(a) function decreases with increasing flaw size, it is usually an indication that the NDT procedures are poorly designed.

    •  When the lower confidence limit decreases with increasing flaw size, notwithstanding an

    acceptable PoD(a) function, it is usually associated with extreme or unreasonable values of the

    mean and standard deviation.

    •  The effect on PoD results for particular operational and physical parameters can be significant

    for datasets selected from the NTIAC data book of PoD curves.


    •  The PoD data in the NTIAC data book were collected some 30 years ago and may not

    necessarily reflect current capabilities with modern digital instrumentation. However, the results

    are still believed to be relevant to best practice NDT.

    •  The PoD data illustrated in each of the figures 7 – 13 are valid for the particular datasets in question. It would be wrong to draw too many general conclusions about the particular PoD

    values (e.g. ultrasound is better than X-ray).

     

    •  Figures 7 - 13 serve to illustrate the possible effects that the physical and operational parameters can have on the PoD and an awareness of these effects is important when quoting PoD results.

    •   NDT methods, equipment ‘calibration’, fluorescent penetrant developers, material, surface

    condition, flaws and human factors are all important operational and physical parameters, which

    can have a significant effect on PoD results.

    •  Whilst human factors are important variables in NDT procedures, they are often found not to be

    as important as other operational and physical variables.

    •  The ‘log-odds’ distribution was found to be the most appropriate distribution to use with the JCL ‘Probability of Inclusion’ model.

    •  The earlier JCL ‘Probability of Inclusion’ model has been validated against an independently developed ‘Probability of Inclusion’ model by MBEL.

    9.  RECOMMENDATIONS

    •  Publish a signal response data book of PoD results.

    •  Publish a more up-to-date data book from different PoD studies, collated in a way which best serves more general industrial and modelling applications.

    •  Set up a European style project or Joint Industry Project to realise the above recommendations.

    10.  ACKNOWLEDGEMENTS

    The author would like to acknowledge the organisations ASM and NTIAC for giving permission to

    reprint and re-plot various figures in this study. Special acknowledgement goes to Ward Rummel, the

    author of the NTIAC data book, for his advice during discussions on various PoD datasets in the data book.

    The author would also like to acknowledge Martin Wall (ESR Technology Ltd) for highlighting the PoD research work that has been carried out at the NNDTC over the last two decades.

    Lastly, the author would like to thank the HSE for funding this work, and in particular Graeme

    Hughes for his guidance and useful discussion throughout this study.

    11.  REFERENCES

    1.  Rummel W D: ‘Recommended practice for a demonstration of non-destructive evaluation (NDE) reliability on aircraft production parts’. Materials Evaluation, Vol. 40, August 1982.

    2.  Lewis W H, Sproat W H, Dodd B D and Hamilton J M: ‘Reliability of non-destructive inspection

    – Final Report’. SA-ALC/MME 76-6-38-1, San Antonio Air Logistics Centre, Kelly Air Force Base, Texas, 1978.

    3.  Rummel W D: ‘Probability of detection as a quantitative measure of non-destructive testing end-to-end process capabilities’, 1998. www.asnt.org/publications/materialseval/basics/Jan98.

    4.  AGARD Lecture Series 190: ‘A recommended methodology to quantify NDE/NDI based on

    aircraft engine experience’, April 1993, ISBN 92-835-0707-X

    5.  NTIAC Non-destructive Evaluation (NDE) capabilities data book, 3rd ed., November 1997, NTIAC DB-97-02, Non-destructive Testing Information Analysis Centre.


    6.  Berens A P: ‘NDE reliability data analysis’, Non-destructive Evaluation and Quality Control: Qualitative Non-destructive Evaluation, ASM Metals Data Book, Volume 17, Fifth printing, December 1997, ISBN 0-87170-007-7 (v.1).

    7.  Berens A P and Hovey P W: ‘Evaluation of NDE reliability characterisation’, AFWAL-TR-81-4160, Vol. 1, Air Force Wright-Aeronautical Laboratories, Wright-Patterson Air Force Base,

    December 1981.

    8.  Sturges D J: ‘Approaches to measuring probability of detection for subsurface flaws’, Proc. 3rd Ann. Res. Symp., ASNT 1994 Spring Conference, New Orleans, 1994, pp229-231.

    9.  Crawshaw J and Chambers J: ‘A concise course in A-level statistics’, 1984, Stanley Thornes

    (Publishers) Ltd. ISBN 0-7487-0455-8

    10. Kreyszig E: ‘Advanced engineering mathematics’, John Wiley & Sons, Inc. 1983 (5th Edition,

     p947).

    11. Matzkanin G A and Yolken H T: ‘Probability of detection (PoD) for non-destructive evaluation

    (NDE)’, NTIAC-TA-00-01, August 2001.

    12. Thompson R Bruce: ‘Overview of the ETC PoD methodology’, Review of Progress in

    Quantitative Non-destructive Evaluation, Vol. 18b, Plenum Press, New York, July 19-24, 1998, pp2295-2304.

    13. Wall M: ‘Modelling of NDT reliability and applying corrections for human factors’, European-American Workshop, Deter