Chapter 11 – Covariance and Correlation (kwng/fall2010/phy335/lecture/lecture 14.pdf)
Transcript
Topics

Part 1 – Single measurement
1. Basic stuff (Chapters 1 and 2)
2. Propagation of uncertainties (Chapter 3)

Part 2 – Multiple measurements as independent results
1. Mean and standard deviation (Chapter 4)
2. Basics of probability distribution functions (not in text explicitly)
3. The binomial distribution (Chapter 10)
4. The Poisson distribution (Chapter 11)
5. The normal distribution (first half of Chapter 5)
6. The χ² test – how well does the data fit the distribution model? (Chapter 12)

Part 3 – Multiple measurements as one sample
1. Central limit theorem (not in text explicitly)
2. Normal distribution (second half of Chapter 5)
3. Propagation of error (Chapter 3)
4. Rejection of data (Chapter 6)
5. Merging two sets of data together (Chapter 7)

Part 4 – Dependent variables
1. Curve fitting (Chapter 8)
2. Covariance and correlation (Chapter 9)
Basic Idea

Given two random variables x and y, how can we show that they are independent of each other (or, in the opposite sense, correlated)?
[Figure: scatter plot titled "Independent random variables x and y"; both axes run from −1 to 1. Each dot is a pair of independent random numbers (x, y) in the range [−1, 1]; 100 dots in total.]
Depending on the distribution, the scatter may not center about the origin, but it always centers about the mean $(\bar{x}, \bar{y})$.
Consider the product $(x_i - \bar{x})(y_i - \bar{y})$. It is positive in quadrants I and III and negative in quadrants II and IV. If x and y are independent of each other, the points scatter randomly over all four quadrants, and the sum of these products tends to 0.
$$\sum_{i=1}^{N}(x_i-\bar{x})(y_i-\bar{y}) \to 0 \quad \text{if } x \text{ and } y \text{ are independent random variables}$$
Covariance
The covariance between two random variables x and y is defined as

$$\sigma_{xy} = \frac{1}{N}\sum_{i=1}^{N}(x_i-\bar{x})(y_i-\bar{y})$$
If the random variables x and y are independent of each other, $|\sigma_{xy}| \to 0$. If x and y are correlated, $\sigma_{xy}$ will be far from 0 (it can be positive or negative), but its magnitude will not exceed $\sigma_x \sigma_y$ (the Schwarz inequality).
Example

I have generated 15 random x and 15 random y independently.

x        y         (x−x̄)    (y−ȳ)     (x−x̄)(y−ȳ)
0.07253  0.964829  0.39924  0.397462  0.15868
… (the remaining 14 rows did not survive extraction)
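To make the definition concrete, here is a minimal sketch (my own code, not from the lecture) that computes the covariance for 15 independently generated pairs and for a fully correlated sample; the random seed and the y = 2x example are assumptions chosen for illustration.

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible

def covariance(xs, ys):
    """Population covariance: sigma_xy = (1/N) * sum((x_i - xbar)*(y_i - ybar))."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / n

# 15 independent random pairs in [-1, 1], as in the example above
xs = [random.uniform(-1, 1) for _ in range(15)]
ys = [random.uniform(-1, 1) for _ in range(15)]
print(covariance(xs, ys))   # small in magnitude: independent variables

# Fully correlated data (y = 2x): |sigma_xy| reaches its Schwarz bound sigma_x*sigma_y
ys2 = [2 * x for x in xs]
print(covariance(xs, ys2))
```

For the correlated case the covariance equals $2\sigma_x^2 = \sigma_x \sigma_y$ exactly, the maximum allowed by the Schwarz inequality.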
For simplicity, we assume q is a function of two variables x and y. We have learned two equations of error propagation:
First equation (Differential Sum):
$$\sigma_q = \left|\frac{\partial q}{\partial x}\right|\sigma_x + \left|\frac{\partial q}{\partial y}\right|\sigma_y$$

Second equation (Quadrature Sum):
$$\sigma_q^2 = \left(\frac{\partial q}{\partial x}\right)^2\sigma_x^2 + \left(\frac{\partial q}{\partial y}\right)^2\sigma_y^2$$
Is there any relation between these two equations?
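As a quick numeric comparison of the two rules before the derivation, here is a toy calculation (my own assumed function q = x·y and made-up uncertainties, not from the lecture):

```python
import math

# Hypothetical measurement: q = x * y, with assumed central values and uncertainties
x, sx = 10.0, 0.5   # x = 10.0 +/- 0.5
y, sy = 4.0, 0.2    # y = 4.0 +/- 0.2

dqdx = y   # partial derivative dq/dx for q = x*y
dqdy = x   # partial derivative dq/dy for q = x*y

diff_sum = abs(dqdx) * sx + abs(dqdy) * sy                   # Differential Sum
quad_sum = math.sqrt((dqdx * sx)**2 + (dqdy * sy)**2)        # Quadrature Sum

print(diff_sum, quad_sum)   # the quadrature sum is never larger
```

The quadrature sum is always less than or equal to the differential sum; the derivation below shows why both arise from the same expansion.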
Error propagation and Covariance I

Expand q to first order about the means:
$$q(x_i, y_i) \approx q(\bar{x},\bar{y}) + \frac{\partial q}{\partial x}(x_i-\bar{x}) + \frac{\partial q}{\partial y}(y_i-\bar{y}) \qquad (1)$$

Averaging over all N measurements,
$$\bar{q} = \frac{1}{N}\sum_i q(x_i,y_i) = q(\bar{x},\bar{y}) + \frac{\partial q}{\partial x}\,\frac{1}{N}\sum_i(x_i-\bar{x}) + \frac{\partial q}{\partial y}\,\frac{1}{N}\sum_i(y_i-\bar{y})$$

The two sums of deviations vanish by the definition of the mean, so
$$\bar{q} \approx q(\bar{x},\bar{y}) \qquad (2)$$

Substituting (2) back into (1):
$$q(x_i,y_i) - \bar{q} \approx \frac{\partial q}{\partial x}(x_i-\bar{x}) + \frac{\partial q}{\partial y}(y_i-\bar{y})$$
If we consider the uncertainty in q as the standard deviation in q,
$$\sigma_q^2 = \frac{1}{N}\sum_i (q_i-\bar{q})^2 = \frac{1}{N}\sum_i\left[\frac{\partial q}{\partial x}(x_i-\bar{x}) + \frac{\partial q}{\partial y}(y_i-\bar{y})\right]^2$$
Expanding the square,
$$\sigma_q^2 = \left(\frac{\partial q}{\partial x}\right)^2\frac{1}{N}\sum_i(x_i-\bar{x})^2 + \left(\frac{\partial q}{\partial y}\right)^2\frac{1}{N}\sum_i(y_i-\bar{y})^2 + 2\,\frac{\partial q}{\partial x}\frac{\partial q}{\partial y}\,\frac{1}{N}\sum_i(x_i-\bar{x})(y_i-\bar{y})$$
$$= \left(\frac{\partial q}{\partial x}\right)^2\sigma_x^2 + \left(\frac{\partial q}{\partial y}\right)^2\sigma_y^2 + 2\,\frac{\partial q}{\partial x}\frac{\partial q}{\partial y}\,\sigma_{xy}$$
Error propagation and Covariance II

$$\sigma_q^2 = \left(\frac{\partial q}{\partial x}\right)^2\sigma_x^2 + \left(\frac{\partial q}{\partial y}\right)^2\sigma_y^2 + 2\,\frac{\partial q}{\partial x}\frac{\partial q}{\partial y}\,\sigma_{xy}$$

This is the general case.
Case 1. x and y are independent variables: σxy = 0.
$$\sigma_q^2 = \left(\frac{\partial q}{\partial x}\right)^2\sigma_x^2 + \left(\frac{\partial q}{\partial y}\right)^2\sigma_y^2 \quad \text{(Quadrature Sum)}$$
Case 2. x and y are dependent variables: σxy ≠ 0. When x and y are highly correlated, |σxy| is maximal and equal to σxσy (Schwarz inequality).
$$\sigma_q^2 = \left(\frac{\partial q}{\partial x}\right)^2\sigma_x^2 + \left(\frac{\partial q}{\partial y}\right)^2\sigma_y^2 + 2\,\frac{\partial q}{\partial x}\frac{\partial q}{\partial y}\,\sigma_x\sigma_y = \left(\frac{\partial q}{\partial x}\sigma_x + \frac{\partial q}{\partial y}\sigma_y\right)^2$$

$$\therefore\ \sigma_q = \left|\frac{\partial q}{\partial x}\right|\sigma_x + \left|\frac{\partial q}{\partial y}\right|\sigma_y \quad \text{(Differential Sum)}$$
Error propagation and Covariance III
In Summary,
$$\sigma_q^2 = \left(\frac{\partial q}{\partial x}\right)^2\sigma_x^2 + \left(\frac{\partial q}{\partial y}\right)^2\sigma_y^2 \quad \text{(Quadrature Sum)}$$

corresponds to the case when x and y are completely independent of each other.
$$\sigma_q = \left|\frac{\partial q}{\partial x}\right|\sigma_x + \left|\frac{\partial q}{\partial y}\right|\sigma_y \quad \text{(Differential Sum)}$$
corresponds to the case when x and y are completely correlated.
Everything else lies between these two extreme cases.
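The general formula can be checked numerically. The sketch below (my own Monte Carlo construction, not from the lecture) builds correlated Gaussian pairs with an assumed correlation ρ = 0.8 and verifies that, for q = x + y (where both partial derivatives are 1), the variance of q matches σx² + σy² + 2σxy:

```python
import math
import random

random.seed(1)
N = 100_000

# Correlated pairs: y = rho*x + sqrt(1 - rho^2)*z with x, z standard normal,
# so sigma_xy is close to rho (an assumed construction for this demo)
rho = 0.8
xs, ys = [], []
for _ in range(N):
    x = random.gauss(0.0, 1.0)
    z = random.gauss(0.0, 1.0)
    xs.append(x)
    ys.append(rho * x + math.sqrt(1.0 - rho**2) * z)

def mean(v):
    return sum(v) / len(v)

mx, my = mean(xs), mean(ys)
sx2 = mean([(x - mx)**2 for x in xs])                       # sigma_x^2
sy2 = mean([(y - my)**2 for y in ys])                       # sigma_y^2
sxy = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])   # sigma_xy

# For q = x + y, dq/dx = dq/dy = 1, so the general formula reads:
# sigma_q^2 = sigma_x^2 + sigma_y^2 + 2*sigma_xy
predicted = sx2 + sy2 + 2.0 * sxy

qs = [x + y for x, y in zip(xs, ys)]
mq = mean(qs)
observed = mean([(q - mq)**2 for q in qs])
print(predicted, observed)   # identical up to floating-point rounding
```

For q = x + y the identity is exact, not merely approximate, because the first-order Taylor expansion of a linear function has no truncation error.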
Coefficient of Linear Correlation

In linear regression, we expect x and y to follow a linear relationship y = A + Bx. In other words, for our least-squares fit to work, we expect a high correlation between x and y.

To measure the correlation between x and y, we define the coefficient of linear correlation (r) as the covariance normalized by $\sigma_x\sigma_y$:
$$r = \frac{\sigma_{xy}}{\sigma_x\sigma_y} = \frac{\sum_{i=1}^{N}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{N}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{N}(y_i-\bar{y})^2}}$$
Schwarz inequality: $|\sigma_{xy}| \le \sigma_x\sigma_y$

$$\therefore\ |r| = \frac{|\sigma_{xy}|}{\sigma_x\sigma_y} \le 1 \quad\Rightarrow\quad -1 \le r \le 1$$

If x and y are independent (uncorrelated), r = 0, and we should not try to do linear regression in this case.
If all data points fall exactly on a straight line y = A + Bx, then $\bar{y} = A + B\bar{x}$ and $y_i - \bar{y} = B(x_i - \bar{x})$, so

$$r = \frac{\sum_i (x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_i(x_i-\bar{x})^2}\,\sqrt{\sum_i(y_i-\bar{y})^2}} = \frac{B\sum_i(x_i-\bar{x})^2}{\sqrt{\sum_i(x_i-\bar{x})^2}\,\sqrt{B^2\sum_i(x_i-\bar{x})^2}} = \frac{B}{|B|} = \pm 1$$

(+1 for B > 0, −1 for B < 0).
Note: if you have only two data points, |r| = 1 automatically. Hence correlation has no meaning if there are only two data points.
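A short sketch of these properties (my own example data, not from the lecture): r is +1 or −1 for points lying exactly on a line, and any two distinct points always give |r| = 1.

```python
import math

def corr(xs, ys):
    """Coefficient of linear correlation r = sigma_xy / (sigma_x * sigma_y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx)**2 for x in xs)
    syy = sum((y - my)**2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
print(corr(xs, [3 + 2 * x for x in xs]))    # +1: exact line with B = 2 > 0
print(corr(xs, [3 - 2 * x for x in xs]))    # -1: exact line with B = -2 < 0

# Any two distinct points define a line, so |r| = 1 regardless of the values
print(abs(corr([0.0, 1.0], [0.3, -2.7])))   # 1.0
```

This is why the note above matters in practice: a high |r| is only evidence of correlation when N is appreciably larger than 2.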
Example

Consider the following data points and do a linear regression: