3/2003 Rev 1 I.2.15-16 – slide 1 of 33 Session I.2.15-16 Part I Review of Fundamentals Module 2 Basic Physics and Mathematics Used in Radiation Protection Session 15-16 Data Analysis IAEA Post Graduate Educational Course Radiation Protection and Safety of Radiation Sources

Mar 29, 2015

Max Toogood

Session I.2.15-16

Part I Review of Fundamentals

Module 2 Basic Physics and Mathematics Used in Radiation Protection

Session 15-16 Data Analysis

IAEA Post Graduate Educational Course

Radiation Protection and Safety of Radiation Sources

Objectives

Upon completion of this section, the student should be able to demonstrate an understanding of the following statistical concepts pertaining to sample data:

Regression

Correlation

Regression

A “regression” is a statistical technique used to investigate the relationship among two or more variables

An independent variable, “x”, is linked to a dependent variable, “y”

The relationship may follow the generic form of a straight line, for example, y = mx + b, where m is the slope and b is the intercept

Regression

Other forms of relationships include:

$Y = a + bx + cx^2$, a parabola

$Y = ab^x$, an exponential curve

$Y = ax^b$, a geometric curve

Linear Least Squares Regression

Linear least squares regression is by far the most widely used modeling method. It is what most people mean when they say they have used "regression", "linear regression" or "least squares" to fit a model to their data.

Not only is linear least squares regression the most widely used modeling method, but it has been adapted to a broad range of situations that are outside its direct scope.

It plays a strong underlying role in many other modeling methods.

Linear Least Squares Regression

Used directly, with an appropriate data set, linear least squares regression can fit the data with any function of the form:

$F(x, \beta) = \beta_1 + \beta_2 x_1 + \beta_3 x_2 + \dots$

in which the following conditions hold:

Linear Least Squares Regression

Each explanatory variable in the function is multiplied by an unknown parameter

There is at most one unknown parameter with no corresponding explanatory variable

All of the individual terms are summed to produce the final function value

Linear Least Squares Regression

In statistical terms, any function that meets these criteria would be called a "linear function"

The term "linear" is used, even though the function may not be a straight line

Linear Least Squares Regression

The unknown parameters are considered to be variables and the explanatory variables are considered to be known coefficients corresponding to those "variables"

The problem becomes a system of “linear” equations that can be solved for the values of the unknown parameters

Linear Least Squares Regression

In the least squares method the unknown parameters are estimated by minimizing the sum of the squared deviations between the data and the model

The minimization process reduces the system of equations formed by the data to a sensible system of p (where p is the number of parameters in the functional part of the model) equations in p unknowns

This new system of equations is then solved to obtain the parameter estimates
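For the straight-line model y = mx + b introduced earlier, the minimization can be written out explicitly, giving p = 2 equations in 2 unknowns. The following is a sketch of the standard derivation; the symbol S for the sum of squared deviations is introduced here for illustration and does not appear in the slides.

```latex
\begin{align*}
S(m, b) &= \sum_{i=1}^{n} (y_i - m x_i - b)^2 \\
\frac{\partial S}{\partial m} = 0 \;&\Rightarrow\; m \sum x_i^2 + b \sum x_i = \sum x_i y_i \\
\frac{\partial S}{\partial b} = 0 \;&\Rightarrow\; m \sum x_i + n\,b = \sum y_i \\
m &= \frac{n \sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)}{n \sum x_i^2 - \left(\sum x_i\right)^2},
\qquad
b = \frac{\sum y_i - m \sum x_i}{n}
\end{align*}
```

The two middle lines are the reduced system of equations; the last line is its solution.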

Linear Least Squares Regression

Linear models are not limited to being straight lines or planes, but include a fairly wide range of shapes

For example, a simple quadratic curve is linear in the statistical sense

A straight-line model in log(x), or a polynomial in sin(x), is also linear in the statistical sense because it is linear in the parameters, though not with respect to the observed explanatory variable, x
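To illustrate why a quadratic counts as "linear", the sketch below fits y = b1 + b2*x + b3*x**2 using nothing but linear algebra: each data point contributes a row of known values (1, x, x**2), and the three parameters are found by solving a 3-by-3 linear system. The function names and the data are hypothetical, chosen only for illustration; the data are generated from y = 1 + 2x + 3x**2 so the fit should recover those values to within rounding.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_quadratic(xs, ys):
    """Least squares fit of y = b1 + b2*x + b3*x**2 via the normal equations."""
    rows = [[1.0, x, x * x] for x in xs]  # known "coefficients" for each point
    p = 3                                 # number of unknown parameters
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve_linear_system(ata, aty)


# Hypothetical data lying exactly on y = 1 + 2x + 3x^2
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1 + 2 * x + 3 * x * x for x in xs]
coeffs = fit_quadratic(xs, ys)
print([round(c, 6) for c in coeffs])
```

Although the fitted curve is a parabola in x, the unknowns b1, b2 and b3 enter only as multipliers of known quantities, which is what allows a linear solver to be used.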

Linear Least Squares Regression

Just as models that are linear in the statistical sense do not have to be linear with respect to the explanatory variables, nonlinear models can be linear with respect to the explanatory variables, but not with respect to the parameters

Linear Least Squares Regression

For example, $F(x, \beta) = \beta_1 + \beta_1 \beta_2 x$ is linear in x but it cannot be written in the general form of a linear model

This is because the slope of this line is expressed as the product of two parameters

As a result, nonlinear least squares regression could be used to fit this model, but linear least squares cannot be used

Linear Least Squares Regression

Advantages:

Although there are types of data that are better described by functions that are nonlinear in the parameters, many processes in science and engineering are well-described by linear models

This is because either the processes are inherently linear or because, over short ranges, any process can be well-approximated by a linear model

Linear Least Squares Regression

Disadvantages:

The main disadvantages of linear least squares are:

limitations in the shapes that linear models can assume over long ranges

poor extrapolation properties

sensitivity to outliers

Linear models with nonlinear terms in the predictor variables curve relatively slowly, so for inherently nonlinear processes it becomes increasingly difficult to find a linear model that fits the data well as the range of the data increases.

Linear Least Squares Regression

Finally, while the method of least squares often gives optimal estimates of the unknown parameters, it is very sensitive to the presence of unusual data points in the data used to fit a model

One or two outliers can sometimes seriously skew the results of a least squares analysis
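The effect of a single unusual point can be seen with a short sketch. It uses the five sample points from the regression example in this session (X = 1, 2, 3, 5, 7 and Y = 4, 3, 4, 8, 9) and then adds one invented outlier at (8, 0). The helper name `fit_line` and the outlier value are assumptions made purely for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares straight line."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return slope, intercept


# Sample data from the regression example
xs = [1, 2, 3, 5, 7]
ys = [4, 3, 4, 8, 9]
clean_slope, clean_intercept = fit_line(xs, ys)

# Same data plus one hypothetical outlier (not from the source)
xs_out = xs + [8]
ys_out = ys + [0]
out_slope, out_intercept = fit_line(xs_out, ys_out)

print(f"without outlier: slope = {clean_slope:.3f}")
print(f"with outlier:    slope = {out_slope:.3f}")
```

With these numbers the slope falls from about 1.04 to about 0.09, so one added point is enough to nearly erase the trend shown by the other five.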

Linear Least Squares Regression

Application of this concept enables us to fit a model of a relationship to data

Do not expect the data to perfectly follow the model

The idea is to establish a model that will provide a “best fit” in representing the data

Regression Example

For the following data, what is the function that relates the “Y” values to the “X” values?

X 1 2 3 5 7

Y 4 3 4 8 9

Regression Example

[Figure: plot of sample data, Y = F(X), with Y on the vertical axis and X on the horizontal axis]

Regression Example

A line is drawn through the data points

The difference between the value represented by the line and the observed value is determined

This value is squared (which gets rid of negative values)

Sum of Squares - Total

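The squared differences are summed over all data points. This is called the “sum of squares,” or SS

The line that is the “best fit” is the one that has the smallest residual sum of squares

The total sum of squares measures the variation of the observed values about their mean, and it does not depend on the fitted line:

$SS_{TOTAL} = \sum_i (Y_i - Y_{AVG})^2$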

Sum of Squares - Residual

The residual sum of the squares, SSRES, is calculated by determining all the residuals, squaring them, and summing the squares. It has n-2 degrees of freedom

$SS_{RES} = \sum_i (Y_i - \hat{Y}_i)^2$

where $\hat{Y}_i$ is the “fitted” or predicted value of Y

Another procedure for calculating SSRES is:

$SS_{RES} = SS_{TOTAL} - SS_{REG}$

Sum of Squares - Regression

The regression sum of the squares, SSREG, is calculated by determining the deviations of the fitted values from the mean of Y, squaring them, and summing the squares.

It has 1 degree of freedom.

$SS_{REG} = \sum_i (\hat{Y}_i - Y_{AVG})^2$

where $\hat{Y}_i$ is the “fitted” or predicted value of Y
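The three sums of squares can be computed directly for the sample data of the regression example (X = 1, 2, 3, 5, 7 and Y = 4, 3, 4, 8, 9). The sketch below is one straightforward way to do so; the variable names are illustrative and not taken from the slides.

```python
xs = [1, 2, 3, 5, 7]
ys = [4, 3, 4, 8, 9]
n = len(xs)

x_avg = sum(xs) / n
y_avg = sum(ys) / n

# Least squares straight line through the data
sxx = sum((x - x_avg) ** 2 for x in xs)
sxy = sum((x - x_avg) * (y - y_avg) for x, y in zip(xs, ys))
slope = sxy / sxx
intercept = y_avg - slope * x_avg
fitted = [intercept + slope * x for x in xs]

# Total: observed values about their mean
ss_total = sum((y - y_avg) ** 2 for y in ys)
# Residual: observed values about the fitted line
ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
# Regression: fitted values about the mean
ss_reg = sum((f - y_avg) ** 2 for f in fitted)

print(round(ss_total, 4), round(ss_reg, 4), round(ss_res, 4))
# approximately 29.2, 25.2431 and 3.9569
```

The results agree with the SS column of the summary output later in this session, and they satisfy SSTOTAL = SSREG + SSRES to within rounding.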

F Statistic

When comparing the variance of two different groups (populations), the null hypothesis, H0, is that the variance of the two groups is equal ($\sigma_A^2 = \sigma_B^2$)

The alternative hypothesis, H1, is that the variance of the two groups is not equal ($\sigma_A^2 \neq \sigma_B^2$)

F Statistic

From the first population, you make $n_A$ observations and calculate the sample variance, $S_A^2$, with $df_A = n_A - 1$

From the second population, you make $n_B$ observations and calculate the sample variance, $S_B^2$, with $df_B = n_B - 1$

F Statistic

Let $S_{MAX}^2$ and $S_{MIN}^2$ denote the larger and smaller of $S_A^2$ and $S_B^2$, respectively

Likewise, let $df_{MAX}$ and $df_{MIN}$ denote their respective degrees of freedom

The test statistic, F, also known as the “F ratio” or “variance ratio” is:

$F = \frac{S_{MAX}^2}{S_{MIN}^2}$

F Statistic

If H0 is correct, the F ratio should not be much larger than 1

The question is “how large is large?”

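For a two-tailed test at the 5% significance level, the F ratio is compared with the 0.975 quantile of the F distribution, $f_{0.975}(df_{MAX}, df_{MIN})$; H0 is rejected if F exceeds this value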
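The variance-ratio calculation can be sketched in a few lines using only the Python standard library. The two samples below are invented for illustration, and the function name `f_ratio` is hypothetical. The critical value is deliberately not computed here; it would be looked up in an F table for the returned degrees of freedom.

```python
from statistics import variance


def f_ratio(sample_a, sample_b):
    """Return (F, df_max, df_min), with the larger sample variance on top."""
    var_a = variance(sample_a)  # sample variance, divides by n - 1
    var_b = variance(sample_b)
    df_a = len(sample_a) - 1
    df_b = len(sample_b) - 1
    if var_a >= var_b:
        return var_a / var_b, df_a, df_b
    return var_b / var_a, df_b, df_a


# Hypothetical observations from two populations (not from the source)
sample_a = [10, 12, 9, 11, 13]
sample_b = [8, 15, 11, 6, 14, 12]

f_value, df_max, df_min = f_ratio(sample_a, sample_b)
print(f"F = {f_value:.2f} with ({df_max}, {df_min}) degrees of freedom")
```

Here the sample variances are 2.5 and 12, so F = 4.8 with (5, 4) degrees of freedom. Because the larger variance is always placed in the numerator, F is never less than 1.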

Regression Analysis Table

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F |
|---|---|---|---|---|
| Regression | SSREG | 1 | MSREG = SSREG/df | MSREG/MSRES |
| Residual | SSRES | n - 2 | MSRES = SSRES/df | |
| Total | SSTOTAL | n - 1 | | |
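The table can be filled in mechanically once two of the sums of squares are known. The sketch below does this for a straight-line fit, using the SSREG and SSTOTAL values reported in the summary output for the sample data (25.24310345 and 29.2, with n = 5). The function name and the dictionary layout are illustrative assumptions.

```python
def regression_anova(ss_reg, ss_total, n):
    """Build the regression analysis table for a straight-line fit."""
    ss_res = ss_total - ss_reg
    df_reg = 1          # one slope parameter
    df_res = n - 2      # n points minus slope and intercept
    ms_reg = ss_reg / df_reg
    ms_res = ss_res / df_res
    return {
        "Regression": {"SS": ss_reg, "df": df_reg, "MS": ms_reg},
        "Residual": {"SS": ss_res, "df": df_res, "MS": ms_res},
        "Total": {"SS": ss_total, "df": n - 1},
        "F": ms_reg / ms_res,
    }


table = regression_anova(ss_reg=25.24310345, ss_total=29.2, n=5)
for source in ("Regression", "Residual", "Total"):
    print(source, table[source])
print("F =", round(table["F"], 3))
```

For these inputs the F value comes out near 19.14, in line with the ANOVA section of the summary output.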

Slope

The equation for the slope of a regression line is:

$B = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2}$

For the data in the example, the slope, B = 1.04
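As a check on the quoted slope, the sums for the example data (X = 1, 2, 3, 5, 7 and Y = 4, 3, 4, 8, 9, with n = 5) can be substituted into the standard least squares slope formula. This is a worked sketch added for illustration, not part of the original slides.

```latex
\begin{align*}
\sum X &= 18, \quad \sum Y = 28, \quad \sum XY = 125, \quad \sum X^2 = 88 \\
B &= \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2}
   = \frac{5(125) - (18)(28)}{5(88) - (18)^2}
   = \frac{121}{116} \approx 1.04
\end{align*}
```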

Regression Analysis

[Figure: Line Fit Plot. Regression analysis plot of sample data, showing Y and Predicted Y against X]

Regression Analysis

A regression analysis of these sample data indicates the slope to be 1.04, with an intercept value of 1.84 (see the summary output that follows)

Regression Analysis

SUMMARY OUTPUT

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.92977946 |
| R Square | 0.864489844 |
| Adjusted R Square | 0.819319792 |
| Standard Error | 1.14846224 |
| Observations | 5 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 25.24310345 | 25.2431 | 19.13856 | 0.022100495 |
| Residual | 3 | 3.956896552 | 1.318966 | | |
| Total | 4 | 29.2 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% |
|---|---|---|---|---|---|
| Intercept | 1.844827586 | 1.000297221 | 1.844279 | 0.162345 | -1.338567595 |
| X Variable 1 | 1.043103448 | 0.238436495 | 4.374764 | 0.0221 | 0.284291395 |
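Several entries in the summary output follow directly from the sums of squares. The sketch below reproduces them from the reported SS values, the reported slope, and the X data of the example; the formulas are the standard ones for a straight-line fit with one explanatory variable, and the variable names are illustrative.

```python
import math

n = 5
ss_reg = 25.24310345   # regression sum of squares from the ANOVA output
ss_res = 3.956896552   # residual sum of squares from the ANOVA output
slope = 1.043103448    # "X Variable 1" coefficient from the output
xs = [1, 2, 3, 5, 7]   # X values of the sample data

ss_total = ss_reg + ss_res
r_square = ss_reg / ss_total
multiple_r = math.sqrt(r_square)
# Adjusted for one explanatory variable: residual df is n - 2
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - 2)
# Standard error of the regression is the square root of MSRES
standard_error = math.sqrt(ss_res / (n - 2))

# Standard error and t statistic of the slope
x_avg = sum(xs) / n
sxx = sum((x - x_avg) ** 2 for x in xs)
slope_se = standard_error / math.sqrt(sxx)
t_stat = slope / slope_se

print(f"R Square          = {r_square:.6f}")
print(f"Multiple R        = {multiple_r:.6f}")
print(f"Adjusted R Square = {adjusted_r_square:.6f}")
print(f"Standard Error    = {standard_error:.6f}")
print(f"Slope t Stat      = {t_stat:.4f}")
```

For a fit with a single explanatory variable, the square of the slope's t statistic equals the F value in the ANOVA section, which is why the slope's P-value matches Significance F.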

Where to Get More Information

Cember, H., Johnson, T. E., Introduction to Health Physics, 4th Edition, McGraw-Hill, New York (2008)

Martin, A., Harbison, S. A., Beach, K., Cole, P., An Introduction to Radiation Protection, 6th Edition, Hodder Arnold, London (2012)

Firestone, R.B., Baglin, C.M., Frank-Chu, S.Y., Eds., Table of Isotopes (8th Edition, 1999 update), Wiley, New York (1999)