Body size distribution of European Collembola Lecture 9 Moments of distributions

Feb 23, 2016

lenora

Page 1: Body  size distribution  of  European Collembola

Body size distribution of European Collembola

Lecture 9: Moments of distributions

Page 2: Body  size distribution  of  European Collembola

Body size distribution of European Collembola

Species                                                     Body weight [mg]  ln weight
Tetrodontophora bielanensis (Waga 1842)                     13.471729         2.6006
Orchesella chiantica Frati & Szeptycki 1990                 13.471729         2.6006
Disparrhopalites tergestinus Fanciulli, Colla, Dallai 2005  12.924837         2.5592
Orchesella dallaii Frati & Szeptycki 1990                   9.4503028         2.246
Seira pini Jordana & Arbea 1989                             9.4503028         2.246
Isotomurus pentodon (Kos,1937)                              7.1044808         1.9607
Heteromurus (V.) longicornis (Absolon 1900)                 7.1044808         1.9607
Pogonognathellus flavescens (Tullberg 1871)                 6.9512714         1.9389
Orchesella hoffmanni Stomp 1968                             6.9512714         1.9389
Heteromurus (H) constantinellus Lučić, Ćurčić & Mitić 2007  6.3862223         1.8541
Pogonognathellus longicornis (Müller 1776)                  6.2133935         1.8267
Orchesella devergens Handschin 1924                         6.2133935         1.8267
Orchesella flavescens (Bourlet 1839)                        6.2133935         1.8267
Orchesella quinquefasciata (Bourlet 1841)                   6.2133935         1.8267

ln body weight [mg] class means  Number of species
-4.71511                         7
-4.018377                        53
-3.321643                        133
-2.624909                        224
-1.928176                        353
-1.231442                        395
-0.534708                        325
0.162025                         126
0.858759                         45
1.555493                         24
2.252226                         9

[Figure: Collembola. The histogram of raw data: number of species against ln body weight class. The mode (modus) is marked at the most frequent class.]

Page 3: Body  size distribution  of  European Collembola

Weighted mean

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{k} n_i x_i = \sum_{i=1}^{k} \frac{n_i}{n} x_i = \sum_{i=1}^{k} x_i f(x_i)$$

        Class 1     Class 2    Class 3
N       25          31         43
Mean    1.8169079   1.032923   0.531059
        2.6005933   1.313477   0.651808
        2.5591508   1.313477   0.651808
        2.2460468   1.313477   0.651808
        2.2460468   1.313477   0.651808
        1.9607257   1.313477   0.651808
        1.9607257   1.301948   0.651808
        1.9389246   1.225568   0.651808
        1.9389246   1.165038   0.651808
        1.8541429   1.165038   0.651808
        1.8267072   1.165038   0.651808
        1.8267072   1.165038   0.651808
        1.8267072   1.006355   0.651808
        1.8267072   1.006355   0.651808
        1.8267072   1.006355   0.651808
        1.584378    1.006355   0.651808
        1.584378    1.006355   0.651808
        1.584378    1.006355   0.651808
        1.584378    1.006355   0.613152
        1.584378    1.006355   0.573835
        1.584378    1.006355   0.573835
        1.5326904   1.006355   0.533834
        1.5326904   0.939683   0.493125
        1.5064044   0.871022   0.493125
        1.4529137   0.871022   0.493125
        1.4529137   0.835906   0.493125
                    0.835906   0.493125
                    0.800247   0.489014
                    0.800247   0.451682
                    0.764026   0.451682
                    0.756712   0.451682
                    0.727225   0.451682
                               0.409479

Three Collembolan weight classes

What is the average body weight?

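$$\bar{x} = \frac{25}{99}\cdot 1.817 + \frac{31}{99}\cdot 1.033 + \frac{43}{99}\cdot 0.531 = 1.013$$

Population mean:

$$\mu = \frac{\sum_{i=1}^{n} x_i}{n}$$

Sample mean:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$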
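The weighted mean of the three weight classes can be checked with a few lines of Python. This is only a minimal sketch: the class sizes and class means are the values from the slide, while the variable names are chosen here for illustration.

```python
# Class sizes (N) and class means of the three Collembolan weight classes.
class_sizes = [25, 31, 43]
class_means = [1.8169079, 1.032923, 0.531059]

n = sum(class_sizes)  # total number of species in the three classes

# Weighted mean: every class mean counts in proportion to its class size.
weighted_mean = sum(n_i * x_i for n_i, x_i in zip(class_sizes, class_means)) / n

# Unweighted mean of the class means, shown only for comparison.
unweighted_mean = sum(class_means) / len(class_means)

print(f"n = {n}")
print(f"weighted mean   = {weighted_mean:.3f}")
print(f"unweighted mean = {unweighted_mean:.3f}")
```

The unweighted mean of the class means is larger because it ignores that the class with the smallest values contains the most species.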

Page 4: Body  size distribution  of  European Collembola

ln body weight [mg] class means  Number of species  Frequency    Arithmetic mean  Variance
-4.72                            7                  =B2/B14      =A2*C2           =(A2-D14)^2*C2
-4.02                            53                 0.031286895  -0.125723        0.202268085
-3.32                            133                0.078512397  -0.26079         0.267516588
-2.62                            224                0.132231405  -0.347095        0.174619987
-1.93                            353                0.208382527  -0.401798        0.042653444
-1.23                            395                0.233175915  -0.287143        0.013917567
-0.53                            325                0.191853601  -0.102586        0.169898317
0.16                             126                0.074380165  0.0120514        0.199510727
0.86                             45                 0.026564345  0.0228124        0.144774029
1.56                             24                 0.014167651  0.0220377        0.130178627
2.25                             9                  0.005312869  0.0119658        0.073837264
Sum                              1694                            -1.475751        1.462535979
StDev                                                                             1.209353538

[Figure: Collembola. Relative frequency of species (0 to 0.25) against ln body weight class.]

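$$f(x_i) = \frac{n_i}{n}$$

Weighted mean

Discrete distributions:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{n}\sum_{i=1}^{k} n_i x_i = \sum_{i=1}^{k} x_i f(x_i)$$

Continuous distributions:

$$\mu = \int_{\min}^{\max} x f(x)\,dx$$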

The average European springtail has a body weight of $e^{-1.476}$ = 0.23 mg.

The most frequently encountered weight is around $e^{-1.23}$ = 0.29 mg.

Page 5: Body  size distribution  of  European Collembola

Why did we use log transformed values?

Species                                                     Average body length [mm]  Body weight [mg]
Tetrodontophora bielanensis (Waga 1842)                     7                         13.472
Orchesella chiantica Frati & Szeptycki 1990                 7                         13.472
Disparrhopalites tergestinus Fanciulli, Colla, Dallai 2005  6.875                     12.925
Orchesella dallaii Frati & Szeptycki 1990                   6                         9.4503
Seira pini Jordana & Arbea 1989                             6                         9.4503
Isotomurus pentodon (Kos,1937)                              5.3                       7.1045
Heteromurus (V.) longicornis (Absolon 1900)                 5.3                       7.1045
Pogonognathellus flavescens (Tullberg 1871)                 5.25                      6.9513
Orchesella hoffmanni Stomp 1968                             5.25                      6.9513
Heteromurus (H) constantinellus Lučić, Ćurčić & Mitić 2007  5.06                      6.3862
Pogonognathellus longicornis (Müller 1776)                  5                         6.2134
Orchesella devergens Handschin 1924                         5                         6.2134
Orchesella flavescens (Bourlet 1839)                        5                         6.2134
Orchesella quinquefasciata (Bourlet 1841)                   5                         6.2134

Spreadsheet formula of the body weight column, shown for a body length of 5 mm (JEŻELI is the Polish-locale Excel name of IF):

=JEŻELI(B86=0;0;EXP(-1.875+LN(B86)*2.3))

$$W\,[\mathrm{mg}] = e^{-1.875}\, L^{2.3}\,[\mathrm{mm}]$$
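The length-to-weight conversion of the spreadsheet can be sketched in Python as follows. The constants -1.875 and 2.3 are those of the slide formula; the function name `body_weight_mg` is hypothetical, and returning 0 for non-positive lengths is an assumption that extends the spreadsheet's guard for empty cells.

```python
import math


def body_weight_mg(length_mm: float) -> float:
    """Body weight [mg] from body length [mm]: W = exp(-1.875) * L**2.3."""
    if length_mm <= 0:
        # The spreadsheet returns 0 for a length of 0; negative input is
        # treated the same way here (assumption).
        return 0.0
    return math.exp(-1.875 + 2.3 * math.log(length_mm))


for length in (7, 6, 5):
    weight = body_weight_mg(length)
    print(f"L = {length} mm -> W = {weight:.4f} mg, ln W = {math.log(weight):.4f}")
```

For a length of 7 mm this reproduces the 13.47 mg (ln weight 2.6006) listed for the largest species.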

[Figure: Collembola, two histograms of the number of species. Log transformed data: number of species against ln body weight class. Linear data: number of species against body weight class.]

The distribution is skewed.

Page 6: Body  size distribution  of  European Collembola

Body weight [mg] class means  Number of species  Frequency    Arithmetic mean  Geometric mean
0.01                          7                  0.004132231  3.702E-05        -0.019483926
0.02                          53                 0.031286895  0.0005626        -0.125722539
0.04                          133                0.078512397  0.0028338        -0.260790153
0.07                          224                0.132231405  0.0095797        -0.347095405
0.15                          353                0.208382527  0.0303016        -0.401798187
0.29                          395                0.233175915  0.0680574        -0.287142615
0.59                          325                0.191853601  0.1123956        -0.102585655
1.18                          126                0.074380165  0.0874629        0.012051446
2.36                          45                 0.026564345  0.062698         0.02281237
4.74                          24                 0.014167651  0.0671181        0.022037681
9.51                          9                  0.005312869  0.0505194        0.011965782
Sum                           1694                            0.491566         -1.4757512
Exp()                                                                          0.228606933

[Figure: Collembola. Number of species against body weight class (linear scale).]

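$$W = W_0 L^z \;\Rightarrow\; \ln W = \ln W_0 + z \ln L$$

$$W\,[\mathrm{mg}] = e^{-1.875}\, L^{2.3}\,[\mathrm{mm}]$$

In the case of exponentially distributed data we have to use the geometric mean. To make things easier we first log-transform our data.

Geometric mean

$$\bar{x}_g = \sqrt[n]{\prod_{i=1}^{n} x_i} = e^{\frac{1}{n}\sum_{i=1}^{n} \ln x_i}$$

The average European springtail has a body weight of $e^{-1.476}$ = 0.23 mg.

lb (binary logarithm) scaled weight classes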
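A short Python sketch makes the difference between the two means concrete. It assumes the class means of ln body weight and the species counts from the histogram table of the lecture; the variable names are illustrative only.

```python
import math

# ln body weight class means and number of species per class (lecture table).
ln_class_means = [-4.71511, -4.018377, -3.321643, -2.624909, -1.928176,
                  -1.231442, -0.534708, 0.162025, 0.858759, 1.555493, 2.252226]
species_counts = [7, 53, 133, 224, 353, 395, 325, 126, 45, 24, 9]

n = sum(species_counts)

# Mean of the log-transformed values, weighted by class size.
mean_ln = sum(c * x for c, x in zip(species_counts, ln_class_means)) / n

# Geometric mean: back-transform the mean of the logarithms.
geometric_mean = math.exp(mean_ln)

# Arithmetic mean of the back-transformed class weights.
arithmetic_mean = sum(c * math.exp(x) for c, x in zip(species_counts, ln_class_means)) / n

print(f"mean of ln weight: {mean_ln:.4f}")
print(f"geometric mean:    {geometric_mean:.3f} mg")
print(f"arithmetic mean:   {arithmetic_mean:.3f} mg")
```

The arithmetic mean is pulled upwards by the few heavy species, which is why the geometric mean is the better description of a typical springtail here.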

Page 7: Body  size distribution  of  European Collembola

[Table and figure as on Page 4: ln body weight class means, number of species, frequency, and the per-class contributions to the arithmetic mean (sum -1.475751) and to the variance (sum 1.462535979, StDev 1.209353538).]

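$$f(x_i) = \frac{n_i}{n}$$

Variance

Sample variance, with n - 1 degrees of freedom:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

Population variance:

$$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}$$

From class frequencies:

$$s^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 f(x_i)$$

Continuous distributions:

$$s^2 = \int_{\min}^{\max} (x - \bar{x})^2 f(x)\,dx$$

Standard deviation:

$$s = \sqrt{s^2}$$

[Figure: distribution with the mean and ± 1 SD marked.]

The standard deviation is a measure of the width of the statistical distribution that has the same dimension as the mean.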
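The variance and standard deviation of the ln body weight classes can be recomputed from the class frequencies. The following is a minimal sketch that assumes the class means and species counts of the lecture table; it uses the frequency form of the variance, not the n - 1 form.

```python
import math

# ln body weight class means and number of species per class (lecture table).
ln_class_means = [-4.71511, -4.018377, -3.321643, -2.624909, -1.928176,
                  -1.231442, -0.534708, 0.162025, 0.858759, 1.555493, 2.252226]
species_counts = [7, 53, 133, 224, 353, 395, 325, 126, 45, 24, 9]

n = sum(species_counts)
frequencies = [c / n for c in species_counts]  # f(x_i) = n_i / n

# Mean: sum of x_i * f(x_i).
mean = sum(x * f for x, f in zip(ln_class_means, frequencies))

# Variance: sum of (x_i - mean)^2 * f(x_i).
variance = sum((x - mean) ** 2 * f for x, f in zip(ln_class_means, frequencies))

# The standard deviation has the same dimension as the mean.
std_dev = math.sqrt(variance)

print(f"mean     = {mean:.4f}")
print(f"variance = {variance:.4f}")
print(f"st. dev. = {std_dev:.4f}")
```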

Page 8: Body  size distribution  of  European Collembola

The standard deviation as a measure of errors

Environmental pollution

Station  NOx [ppm]
1        8.49
2        1.12
3        9.11
4        7.75
5        0.75
6        8.23
7        0.97
8        6.06
9        8.48
10       5.88
11       8.51
12       9.62
13       3.35
14       7.74
15       2.03
16       5.06
17       7.61
18       0.99
19       2.55
20       8.91

Mean                5.66
Variance            10.45
Standard deviation  3.23

Distance  Average NOx concentration  Standard deviation
1         9.53                       1.70
2         7.37                       1.18
3         5.24                       0.86
4         3.15                       0.26
5         2.17                       0.18
6         1.05                       0.09
7         0.84                       0.14
8         0.63                       0.10
9         0.32                       0.03
10        0.21                       0.02

The precision of derived metrics should always match the precision of the raw data

[Figure: average NOx concentration against distance [km], with error bars of ± 1 standard deviation.]

± 1 standard deviation is the most often used estimator of error. For approximately normally distributed values, the probability that a value lies within ± 1 standard deviation of the mean is approximately 68%. The probability that it lies within ± 2 standard deviations is approximately 95%.
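The summary statistics of the 20 NOx stations can be reproduced with Python's standard `statistics` module. This is a sketch under the assumption that the slide reports the sample variance (division by n - 1), which is what `statistics.variance` computes.

```python
import statistics

# NOx concentrations [ppm] at the 20 stations from the slide.
nox_ppm = [8.49, 1.12, 9.11, 7.75, 0.75, 8.23, 0.97, 6.06, 8.48, 5.88,
           8.51, 9.62, 3.35, 7.74, 2.03, 5.06, 7.61, 0.99, 2.55, 8.91]

mean = statistics.mean(nox_ppm)
variance = statistics.variance(nox_ppm)  # sample variance, divides by n - 1
std_dev = statistics.stdev(nox_ppm)      # square root of the sample variance

# Report with the two decimals of the raw data.
print(f"mean     = {mean:.2f}")
print(f"variance = {variance:.2f}")
print(f"st. dev. = {std_dev:.2f}")
```

Rounding the results to two decimals follows the rule that derived metrics should match the precision of the raw data.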

Page 9: Body  size distribution  of  European Collembola

Standard deviation and standard error

Environmental pollution

[Table: NOx [ppm] at stations 1 to 20, as on Page 8.]

Mean  Standard deviation
5.44  4.15
4.49  5.29
5.55  3.39
5.56  3.13

The standard deviation is constant irrespective of sample size.

The precision of the estimate of the mean should increase with sample size n.

The standard error is a measure of precision.

$$SE = \frac{SD}{\sqrt{n}}$$

Distance  Average NOx concentration  Standard deviation  Standard error (n=20)
1         9.53                       3.32                0.74
2         7.37                       2.45                0.55
3         5.24                       1.24                0.28
4         3.15                       0.67                0.15
5         2.17                       0.87                0.19
6         1.05                       0.34                0.08
7         0.84                       0.14                0.03
8         0.63                       0.10                0.02
9         0.32                       0.03                0.01
10        0.21                       0.02                0.01
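The standard error column follows directly from SE = SD / sqrt(n). A minimal Python sketch, using the standard deviations of the first distances from the table and n = 20; the helper name `standard_error` is hypothetical.

```python
import math


def standard_error(std_dev: float, n: int) -> float:
    """Standard error of the mean: SD divided by the square root of n."""
    return std_dev / math.sqrt(n)


# Standard deviations at distances 1 to 5 (from the table), n = 20 each.
std_devs = [3.32, 2.45, 1.24, 0.67, 0.87]
standard_errors = [standard_error(sd, 20) for sd in std_devs]

for distance, (sd, se) in enumerate(zip(std_devs, standard_errors), start=1):
    print(f"distance {distance}: SD = {sd:.2f}, SE = {se:.2f}")
```

Because the table values are already rounded, the smallest entries may differ from the slide in the last digit.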

[Figure: average NOx concentration against distance [km], with error bars.]

Page 10: Body  size distribution  of  European Collembola

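$$s^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 f(x_i) = \sum_{i=1}^{n} x_i^2 f(x_i) - 2\bar{x}\sum_{i=1}^{n} x_i f(x_i) + \bar{x}^2 \sum_{i=1}^{n} f(x_i)$$

$$s^2 = \sum_{i=1}^{n} x_i^2 f(x_i) - 2\bar{x}\cdot\bar{x} + \bar{x}^2\cdot 1 = \sum_{i=1}^{n} x_i^2 f(x_i) - \bar{x}^2$$

The first term is E(x^2), the second term is [E(x)]^2:

$$\sigma^2 = E(x^2) - [E(x)]^2$$

The variance is the difference between the mean of the squared values and the squared mean.

Mathematical expectation

$$E(X^k) = \sum_{i=1}^{n} x_i^k f(x_i)$$

$$E(X^k) = \int x^k f(x)\,dx$$

Central moments

$$E((X - \mu)^k) \quad \text{k-th central moment}$$

$$\sigma^2 = \sum_{i=1}^{n} (X_i - \mu)^2 f(X_i) = E((X - \mu)^2)$$

First central moment; first moment of central tendency

$$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\right]$$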

Page 11: Body  size distribution  of  European Collembola

Skewness

$$\text{skewness} = \frac{E((X - \mu)^3)}{\sigma^3}$$

Third central moment:

$$E((X - \mu)^3) = E(X^3) - 3\mu E(X^2) + 3\mu^2 E(X) - \mu^3 = E(X^3) - 3\mu E(X^2) + 2\mu^3$$

Kurtosis

$$\text{kurtosis} = \frac{E((X - \mu)^4)}{\sigma^4} - 3$$

[Figure: density curves f(x) against x. Skewness: symmetric distribution (= 0), right skewed distribution (> 0), left skewed distribution (< 0). Kurtosis: = 0 and > 0.]

Page 12: Body  size distribution  of  European Collembola

Lecture 10: Important statistical distributions

Bernoulli distribution

What is the probability that of 10 newborn babies at least 7 are boys?

p(girl) = p(boy) = 0.5

$$p(k) = \binom{n}{k} p^k q^{n-k}$$

$$p(k > 6) = \binom{10}{7} 0.5^7\, 0.5^3 + \binom{10}{8} 0.5^8\, 0.5^2 + \binom{10}{9} 0.5^9\, 0.5^1 + \binom{10}{10} 0.5^{10}\, 0.5^0 = 0.172$$

$$\sum_{i=0}^{n} p_i = 1$$

[Figure: p(X) against X for X = 0 to 10.]
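The probability of at least 7 boys among 10 newborns can be verified numerically. A minimal sketch using `math.comb` from the Python standard library; the function name `binomial_pmf` is chosen here for illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# At least 7 boys means k = 7, 8, 9 or 10.
p_at_least_7 = sum(binomial_pmf(k, 10, 0.5) for k in range(7, 11))

# All probabilities of the distribution sum to 1.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(0, 11))

print(f"p(k >= 7) = {p_at_least_7:.3f}")
print(f"sum of all p(k) = {total:.3f}")
```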

Page 13: Body  size distribution  of  European Collembola

Bernoulli or binomial distribution

$$p(k) = \binom{n}{k} p^k q^{n-k}$$

$$F(k) = p(x \le k) = \sum_{x=0}^{k} \binom{n}{x} p^x q^{n-x}$$

The Bernoulli or binomial distribution comes from the Taylor expansion of the binomial

$$(p + q)^n = \sum_{i=0}^{n} \binom{n}{i} p^i q^{n-i} = \sum_{i=0}^{n} \binom{n}{i} p^i (1-p)^{n-i} = 1$$

$$\mu = np$$

$$\sigma^2 = npq$$

[Figure: binomial distribution for k = 0 to 10 with]

$$p(k) = \binom{10}{k}\, 0.2^k\, 0.8^{10-k}$$

Page 14: Body  size distribution  of  European Collembola

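Assume the probability to find a certain disease in a tree population is 0.01. A bio-monitoring program surveys 10 stands of trees and takes in each case a random sample of 100 trees. How large is the probability that in these stands 1, 2, 3, and more than 3 cases of this disease will occur?

$$p(1) = \binom{1000}{1}\, 0.01^1 \cdot 0.99^{999} = 0.0004$$

$$p(2) = \binom{1000}{2}\, 0.01^2 \cdot 0.99^{998} = 0.0022$$

$$p(3) = \binom{1000}{3}\, 0.01^3 \cdot 0.99^{997} = 0.0074$$

Mean, variance, standard deviation

$$\mu = 1000 \cdot 0.01 = 10$$

$$\sigma^2 = 1000 \cdot 0.01 \cdot 0.99 = 9.9$$

$$\sigma = \sqrt{9.9} = 3.146$$

$$p(k > 3) = 1 - p(k \le 3) = 1 - \sum_{i=0}^{3} \binom{1000}{i}\, 0.01^i\, 0.99^{1000-i}$$

$$p(k > 3) = 1 - \left[\binom{1000}{0} 0.01^0\, 0.99^{1000} + \binom{1000}{1} 0.01^1\, 0.99^{999} + \binom{1000}{2} 0.01^2\, 0.99^{998} + \binom{1000}{3} 0.01^3\, 0.99^{997}\right] = 0.99$$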
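The binomial (Bernoulli) solution of the tree disease example can be recomputed in Python. This is a sketch that assumes n = 10 * 100 = 1000 sampled trees and p = 0.01, as in the example; the function name is illustrative.

```python
from math import comb, sqrt

n, p = 1000, 0.01  # 10 stands of 100 trees each, disease probability 0.01


def binomial_pmf(k: int) -> float:
    """Probability of exactly k diseased trees among the n sampled trees."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


probs = [binomial_pmf(k) for k in range(4)]  # p(0), p(1), p(2), p(3)
p_more_than_3 = 1 - sum(probs)

mean = n * p
variance = n * p * (1 - p)
std_dev = sqrt(variance)

for k, pk in enumerate(probs):
    print(f"p({k}) = {pk:.4f}")
print(f"p(k > 3) = {p_more_than_3:.2f}")
print(f"mean = {mean:.1f}, variance = {variance:.1f}, st. dev. = {std_dev:.3f}")
```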

Page 15: Body  size distribution  of  European Collembola

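What happens if the number of trials n becomes larger and larger and the event probability p becomes smaller and smaller?

$$p(k) = \binom{n}{k} p^k q^{n-k}$$

Writing r for the number of trials n:

$$\lambda = np = rp \;\Rightarrow\; p = \frac{\lambda}{r}, \quad q = 1 - \frac{\lambda}{r}$$

$$p(X = k) = \frac{r!}{k!\,(r-k)!}\left(\frac{\lambda}{r}\right)^k \left(1 - \frac{\lambda}{r}\right)^{r-k} = \frac{\lambda^k}{k!}\,\frac{r!}{(r-k)!\,(r-\lambda)^k}\left(1 - \frac{\lambda}{r}\right)^{r}$$

$$\lim_{r\to\infty}\left(1 - \frac{\lambda}{r}\right)^{r} = e^{-\lambda}, \qquad \lim_{r\to\infty}\frac{r!}{(r-k)!\,(r-\lambda)^k} = 1$$

$$p(X = k) = \frac{\lambda^k}{k!}\, e^{-\lambda}$$

Poisson distribution

The distribution of rare events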

Page 16: Body  size distribution  of  European Collembola

Assume the probability to find a certain disease in a tree population is 0.01. A bio-monitoring program surveys 10 stands of trees and takes in each case a random sample of 100 trees. How large is the probability that in these stands 1, 2, 3, and more than 3 cases of this disease will occur?

$$\lambda = 1000 \cdot 0.01 = 10$$

Poisson solution:

$$p(1) = \frac{10^1}{1!}\, e^{-10} = 0.00045$$

$$p(2) = \frac{10^2}{2!}\, e^{-10} = 0.0023$$

$$p(3) = \frac{10^3}{3!}\, e^{-10} = 0.0076$$

Bernoulli solution:

$$p(1) = 0.0004, \quad p(2) = 0.0022, \quad p(3) = 0.0074$$

The probability that no infected tree will be detected

$$p(0) = \frac{10^0}{0!}\, e^{-10} = e^{-10} = 0.000045$$

The probability of more than three infected trees

$$p(0) + p(1) + p(2) + p(3) = 0.000045 + 0.00045 + 0.0023 + 0.0076 \approx 0.0104$$

$$p(k > 3) = 1 - 0.0104 \approx 0.990$$

Bernoulli solution:

$$p(k > 3) = 0.99$$
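The Poisson solution of the same example can be checked in a few lines of Python. A minimal sketch with lambda = 1000 * 0.01 = 10, as in the example; the function name `poisson_pmf` is chosen here for illustration.

```python
from math import exp, factorial

lam = 1000 * 0.01  # expected number of diseased trees among 1000 sampled trees


def poisson_pmf(k: int) -> float:
    """Poisson probability of exactly k events: lam**k / k! * exp(-lam)."""
    return lam**k / factorial(k) * exp(-lam)


probs = [poisson_pmf(k) for k in range(4)]  # p(0), p(1), p(2), p(3)
p_up_to_3 = sum(probs)
p_more_than_3 = 1 - p_up_to_3

for k, pk in enumerate(probs):
    print(f"p({k}) = {pk:.6f}")
print(f"p(k <= 3) = {p_up_to_3:.4f}")
print(f"p(k > 3)  = {p_more_than_3:.4f}")
```

With only rare events and many trials, the Poisson values stay close to the binomial values of the Bernoulli solution.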

Page 17: Body  size distribution  of  European Collembola

[Figure: Poisson distributions p(k) against k (0 to 13) for λ = 1, 2, 3, 4 and 6.]

Variance, mean:

$$\sigma^2 = \mu = \lambda$$

Skewness:

$$\text{skewness} = \frac{1}{\sqrt{\lambda}}$$

Page 18: Body  size distribution  of  European Collembola

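What is the probability in Duży Lotek to have a cumulation (a draw without a winner) three times in a row if the first time 14 000 000 people bet, the second time 20 000 000, and the third time 30 000 000?

The probability to win is

$$p(6) = \frac{6!\,43!}{49!} \approx \frac{1}{14000000}$$

$$\lambda_1 = 14000000 \cdot \frac{1}{14000000} = 1$$

$$\lambda_2 = 20000000 \cdot \frac{1}{14000000} = 1.428571$$

$$\lambda_3 = 30000000 \cdot \frac{1}{14000000} = 2.142857$$

$$p_1(0) = \frac{1^0}{0!}\, e^{-1} = 0.368$$

$$p_2(0) = \frac{1.428571^0}{0!}\, e^{-1.428571} = 0.239$$

$$p_3(0) = \frac{2.142857^0}{0!}\, e^{-2.142857} = 0.117$$

The events are independent:

$$p_{1,2,3} = 0.368 \cdot 0.239 \cdot 0.117 = 0.01$$

The zero term of the Poisson distribution gives the probability of no event. The probability of at least one event:

$$p(k \ge 1) = 1 - e^{-\lambda}$$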
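The lottery example can be reproduced with the zero term of the Poisson distribution. This sketch uses the rounded winning probability of 1 in 14 000 000 from the slide and assumes that every bet is a single independent ticket.

```python
from math import comb, exp

# Exact number of ways to choose 6 out of 49 numbers.
print(f"6 out of 49 combinations: {comb(49, 6)}")

p_win = 1 / 14_000_000                      # rounded value used on the slide
bets = [14_000_000, 20_000_000, 30_000_000]

# Expected number of winners per draw and probability of no winner at all.
lambdas = [b * p_win for b in bets]
p_no_winner = [exp(-lam) for lam in lambdas]

# The three draws are independent, so the probabilities multiply.
p_three_cumulations = p_no_winner[0] * p_no_winner[1] * p_no_winner[2]

for lam, p0 in zip(lambdas, p_no_winner):
    print(f"lambda = {lam:.6f}, p(0) = {p0:.3f}, p(k >= 1) = {1 - p0:.3f}")
print(f"p(three cumulations) = {p_three_cumulations:.3f}")
```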

Page 19: Body  size distribution  of  European Collembola

A pile model to generate the binomial. If the number of steps is very, very large the binomial becomes smooth.

The normal distribution is the continuous equivalent of the discrete Bernoulli distribution.

Abraham de Moivre (1667-1754)

$$f(x) = C e^{-(x - \mu)^2}$$

$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} x^2}$$

Page 20: Body  size distribution  of  European Collembola

The central limit theorem

If we have a series of random variates X_n, a new random variate Y_n that is the sum of all X_n will for n→∞ be a variate that is asymptotically normally distributed.

[Figure: six histograms of frequency against X (X from -2 to 2) illustrating the central limit theorem.]

Page 21: Body  size distribution  of  European Collembola

The normal or Gaussian distribution

[Figure: f(x) against X for n = 10, n = 20 and n = 50, together with the density f(x) and the cumulative distribution F(x) of the normal distribution.]

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

$$F(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{(v-\mu)^2}{2\sigma^2}}\, dv$$

Mean: μ. Variance: σ².

Page 22: Body  size distribution  of  European Collembola

Important features of the normal distribution

• The function is defined for every real x.
• The frequency at x = μ is given by

$$p(x = \mu) = \frac{1}{\sigma\sqrt{2\pi}} \approx \frac{0.4}{\sigma}$$

• The distribution is symmetrical around μ.
• The points of inflection are given by the second derivative. Setting this to zero gives

$$x = \mu \pm \sigma$$

Page 23: Body  size distribution  of  European Collembola

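[Figure: f(x) against X for n = 10, n = 20 and n = 50, and the normal density with the areas within μ ± σ (0.68) and μ ± 2σ (0.95) marked.]

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

$$\frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = 1$$

$$\frac{1}{\sigma\sqrt{2\pi}} \int_{\mu-\sigma}^{\mu+\sigma} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = 0.68$$

$$\frac{1}{\sigma\sqrt{2\pi}} \int_{\mu-2\sigma}^{\mu+2\sigma} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = 0.95$$

$$\frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\mu} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = 0.5$$

$$\frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\mu+2\sigma} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = 0.975$$

Many statistical tests compare observed values with those of the standard normal distribution and assign the respective probabilities to H1.

$$F(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{(v-\mu)^2}{2\sigma^2}}\, dv$$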

Page 24: Body  size distribution  of  European Collembola

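The Z-transform

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

$$Z = \frac{x - \mu}{\sigma}$$

The standard normal

$$f(Z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} Z^2}$$

The variate Z has a mean of 0 and a variance of 1.

A Z-transform standardizes every statistical distribution to a mean of 0 and a variance of 1. Tables of statistical distributions are always given as Z-transforms.

[Figure: the standard normal with the 95% confidence limit marked.]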

Page 25: Body  size distribution  of  European Collembola

The Fisherian significance levels

P(μ - σ < X < μ + σ) = 68%
P(μ - 1.65σ < X < μ + 1.65σ) = 90%
P(μ - 1.96σ < X < μ + 1.96σ) = 95%
P(μ - 2.58σ < X < μ + 2.58σ) = 99%
P(μ - 3.29σ < X < μ + 3.29σ) = 99.9%

The Z-transformed (standardized) normal distribution

[Figure: f(x) against X for n = 10, n = 20 and n = 50, and the normal density with the areas within ± σ (0.68) and ± 2σ (0.95) marked.]
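The significance levels can be checked against the standard normal distribution with the error function of the Python standard library. A minimal sketch; the helper name `central_probability` is hypothetical.

```python
import math


def central_probability(z: float) -> float:
    """P(-z < Z < z) for a standard normal variate Z."""
    return math.erf(z / math.sqrt(2))


z_values = [1.0, 1.65, 1.96, 2.58, 3.29]
levels = [central_probability(z) for z in z_values]

for z, level in zip(z_values, levels):
    print(f"P(mu - {z} sigma < X < mu + {z} sigma) = {100 * level:.1f}%")
```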

Page 26: Body  size distribution  of  European Collembola

The estimation of the population mean from a series of samples

[Figure: a population with parameters μ, σ from which a series of samples is drawn, each with its own mean and standard deviation x̄, s.]

$$Z = \frac{\sum_{i=1}^{n} x_i - \sum_{i=1}^{n} \mu_i}{\sqrt{\sum_{i=1}^{n} \sigma_i^2}} = \frac{n\bar{x} - n\mu}{\sigma\sqrt{n}} = \frac{(\bar{x} - \mu)\sqrt{n}}{\sigma}$$

The n samples form an additive random variate.

Z is asymptotically normally distributed.

Standard error:

$$\frac{\sigma}{\sqrt{n}}$$

Confidence limit of the estimate of a mean from a series of samples:

$$\mu = \bar{x} \pm Z_\alpha \frac{\sigma}{\sqrt{n}}$$

α is the desired probability level.

[Figure: f(x) against X for n = 10, n = 20 and n = 50, and the normal density with the areas within ± σ (0.68) and ± 2σ (0.95) marked.]

Page 27: Body  size distribution  of  European Collembola

How to apply the normal distribution

Intelligence is approximately normally distributed with a mean of 100 (by definition) and a standard deviation of 16 (in North America). For an intelligence study we need 100 persons with an IQ above 130. How many persons do we have to test to find this number if we take random samples (and do not test university students only)?

$$F(x > 130) = 1 - \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{130} e^{-\frac{(v-\mu)^2}{2\sigma^2}}\, dv = \frac{1}{\sigma\sqrt{2\pi}} \int_{130}^{\infty} e^{-\frac{(v-\mu)^2}{2\sigma^2}}\, dv$$

$$\Phi(z_a) = F(x < a)$$

[Figure: normal density f(IQ) against IQ (40 to 160) with the regions IQ<130 and IQ>130 marked.]
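The slide poses the question without a worked answer. One possible way to evaluate it numerically is sketched below, assuming an exactly normal IQ distribution with mean 100 and standard deviation 16; the helper name `normal_cdf` is hypothetical.

```python
import math


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """Cumulative normal distribution F(x) for mean mu and standard deviation sigma."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))


mu, sigma, threshold = 100, 16, 130

z = (threshold - mu) / sigma                     # Z-transform of the threshold
p_above = 1 - normal_cdf(threshold, mu, sigma)   # P(IQ > 130)

# Expected number of persons to test to find 100 with an IQ above 130.
persons_to_test = math.ceil(100 / p_above)

print(f"z = {z:.3f}")
print(f"P(IQ > 130) = {p_above:.4f}")
print(f"persons to test: about {persons_to_test}")
```

Under these assumptions only about 3% of a random sample exceed an IQ of 130, so a few thousand persons have to be tested.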

Page 28: Body  size distribution  of  European Collembola
Page 29: Body  size distribution  of  European Collembola

One and two sided tests

We measure blood sugar concentrations and know that our method estimates the concentration with an error of about 3%. What is the probability that our measurement deviates from the real value by more than 5%?
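The blood sugar question can be treated as a two-sided test. The sketch below rests on two assumptions that the slide does not state explicitly: the error of 3% is read as one standard deviation, and the measurement errors are normally distributed.

```python
import math

sigma = 3.0      # assumed standard deviation of the measurement error [%]
deviation = 5.0  # deviation from the real value in question [%]

z = deviation / sigma

# One-sided: the measurement is more than 5% too high (or more than 5% too low).
p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))

# Two-sided: the measurement deviates by more than 5% in either direction.
p_two_sided = math.erfc(z / math.sqrt(2))

print(f"z = {z:.3f}")
print(f"one-sided probability: {p_one_sided:.3f}")
print(f"two-sided probability: {p_two_sided:.3f}")
```

The two-sided probability is twice the one-sided one because the normal distribution is symmetric.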

Page 30: Body  size distribution  of  European Collembola

Albinos are rare in human populations. Assume their frequency is 1 per 100000 persons. What is the probability to find 15 albinos among 1000000 persons?

$$p(X = 15) = \binom{1000000}{15} (0.00001)^{15} (0.99999)^{999985}$$

=KOMBINACJE(1000000,15)*0.00001^15*(1-0.00001)^999985 = 0.0347

$$\mu = np, \quad \sigma^2 = npq$$
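The spreadsheet result can be cross-checked in Python. The sketch evaluates the binomial term directly and, for comparison, the Poisson approximation with λ = np = 10; the comparison with the Poisson distribution is an addition for illustration and not part of the slide.

```python
from math import comb, exp, factorial

n, p, k = 1_000_000, 0.00001, 15

# Exact binomial probability of finding exactly 15 albinos.
p_binomial = comb(n, k) * p**k * (1 - p) ** (n - k)

# Poisson approximation for rare events with lambda = n * p.
lam = n * p
p_poisson = lam**k / factorial(k) * exp(-lam)

print(f"binomial: p(X = 15) = {p_binomial:.4f}")
print(f"Poisson:  p(X = 15) = {p_poisson:.4f}")
print(f"mean = {n * p:.1f}, variance = {n * p * (1 - p):.4f}")
```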