Measurement and Representation of Hydrological Quantities Leonardo da Vinci - Vitruvian Man, ca 1487 photo by Luc Viatour, www.lucnix.be Riccardo Rigon Sunday, September 12, 2010

Jan 29, 2015


How hydrological measurements appear and how to treat them (an introduction)
Page 1: 6   measurement&representation

Measurement and Representation of Hydrological Quantities

Leonardo da Vinci - Vitruvian Man, ca. 1487 (photo by Luc Viatour, www.lucnix.be)

Riccardo Rigon

Sunday, September 12, 2010

Page 2: 6   measurement&representation

Objectives:

• These pages discuss, by means of examples, the spatio-temporal variability of measurements of hydrological quantities.

• The conclusion is that statistical tools must be used to describe these quantities.

Page 3: 6   measurement&representation

[Figure: Hydrometric height at Frickenhausen, on the River Main]

Page 4: 6   measurement&representation

[Figure: Hydrometric height at Frickenhausen, on the River Main]

Page 5: 6   measurement&representation

Hydrological Data have Complex Trends 1/2

The hydrological cycle is controlled by innumerable factors and therefore depends on innumerable degrees of freedom. Only a small portion of these factors can be taken into consideration; the rest must be modelled as boundary conditions or as "background noise" (noise that is either modelled or removed with statistical tools).

The dynamics of the hydrological cycle are non-linear. Both the hydrodynamics and the thermodynamics of the processes, which involve numerous phase changes, are non-linear. A further non-linear characteristic is that many of these processes are activated when some regulating quantity exceeds a threshold value. For example, the condensation of water vapour into raindrops is triggered when air humidity exceeds saturation; landslides are triggered when the internal friction forces of the material are overcome by the thrust of water within the capillaries of the soil; the channels of a hydrographic network begin to form when running water exerts a sufficient force per unit area.

Page 6: 6   measurement&representation

Hydrological Data have Complex Trends 2/2

The dynamics include processes that are linearly unstable: for example, the baroclinic instability that drives meteorological processes at the middle latitudes.

The dynamics of climate and hydrology are dissipative; that is, they transfer and transform mechanical energy into thermal energy. The hydrodynamic process of turbulence transports energy from the larger spatial scales to the smaller ones, where the energy is dissipated through friction. Wave phenomena of various kinds (e.g. gravity waves) transport the energy contained in water and in air.

Page 7: 6   measurement&representation

Some Typical Problems: precipitation

Page 8: 6   measurement&representation

Some Typical Problems: incident solar radiation

Page 9: 6   measurement&representation

Some Typical Problems: Flow of the River Adige at San Lorenzo Bridge

[Figure: time series of discharge. x-axis: year, 1990 to 2005; y-axis: discharge (m^3/s), from 0 to 1400]

Page 10: 6   measurement&representation

Some Typical Problems: Distribution of monthly river flows in Trento

Page 11: 6   measurement&representation

Some Typical Problems: Annual water budget for the Lake of Serraia catchment

[Figure: annual water budget of the catchment (2000). x-axis: time (month-year), Jan-00 to Dec-00; y-axis: value (m^3/s), from -0.3 to 1; series: P - precipitation, ET - evapotranspiration, Inv - stored volume (accumulation), R - release; labelled values: 0.8675, 0.343, -0.184, 0.797]

Page 12: 6   measurement&representation

Some Typical Problems: Water content of the soil in the Little Washita catchment (Oklahoma)

Page 13: 6   measurement&representation

Some Typical Problems: Water content of the soil in the Little Washita catchment (Oklahoma)

Page 14: 6   measurement&representation

Some Typical Problems: Spatial distribution of precipitation

Page 15: 6   measurement&representation

Some Typical Problems: Spatial pattern of the hydrographic network

Page 16: 6   measurement&representation

Statistical Inference and Descriptive Statistics

Lucio Fontana - Expectations (MoMA), 1959

Riccardo Rigon

Page 17: 6   measurement&representation

Objectives:

• In these pages the fundamental elements of statistical analysis will be recalled.

• Population, sample and various elementary statistics, such as mean, variance and covariance, will be defined.

• The existence of statistics and their value will be argued.

• The concept of random sampling will be introduced.

Page 18: 6   measurement&representation

Population and Sample

Statistical inference assumes that a dataset represents a subset, called the sample, of all the possible cases. The set of all possible cases is the population from which the dataset has been extracted.

While the sample is known, the population generally is not. Hypotheses about the population are made implicitly.

Page 19: 6   measurement&representation

Exploratory Data Analysis: temporal representation - histogram

A set of n data therefore constitutes a sample of data.

These data can be represented in various forms. Each form of representation emphasises certain characteristics.

[Figure: a) Bergen: September temperature, time series (temperature in °C, from 8 to 15, against time, 1860 to 2000); b) Bergen: September temperature distribution (1861-1997), histogram (frequency, from 0 to 30, against temperature in °C, from 5 to 15)]

Page 20: 6   measurement&representation

Sample Means

Given a sample, various statistics can be calculated. For example:

Temporal Mean

$\bar{x} := \frac{1}{n} \sum_{t=1}^{n} x_{,t}$

Spatial Mean

$\langle x \rangle := \frac{1}{n} \sum_{i=1}^{n} x_i$

The mean is an indicator of position.
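As a minimal sketch in R (the language used for the simulation later in these slides), the two means can be computed from a matrix with one row per measurement site and one column per time step. The matrix `x` and its values are hypothetical and chosen only for illustration.

```r
# Hypothetical data: 3 sites (rows) observed at 4 time steps (columns)
x <- matrix(c(1, 2, 3, 4,
              2, 4, 6, 8,
              0, 1, 0, 1),
            nrow = 3, byrow = TRUE)

# Temporal mean: average over time, one value per site
temporal_mean <- rowMeans(x)

# Spatial mean: average over sites, one value per time step
spatial_mean <- colMeans(x)

print(temporal_mean)
print(spatial_mean)
```

Storing space along rows and time along columns is only one possible layout; with the transposed layout the roles of `rowMeans` and `colMeans` are swapped.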


Page 24: 6   measurement&representation

Statistical Inference

(Source: Corrado Caudek)

• Statistical inference is the process that allows one to draw conclusions about a population on the basis of a sample of observations drawn at random from that population.

• Central to classical statistical inference is the notion of the sampling distribution, that is, how the statistics of the samples vary when random samples of the same size n are repeatedly drawn from the population.

• Even though in any practical application of statistical inference the researcher has only one random sample of size n, the possibility of repeating the sampling provides the conceptual foundation for deciding how informative the observed sample is about the population as a whole.

Page 25: 6   measurement&representation

Exploratory Data Analysis

The mean is not the only indicator of position.

Mode

Page 26: 6   measurement&representation

Median and Mode

The mode is the most frequent value.

The median is the value below which 50% of the data lie and (obviously!) above which the other 50% lie.

If the histogram shows several distinct maxima, the dataset is said to be multimodal, although identifying them can be controversial.
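A minimal sketch in R of the two indicators, assuming a small made-up sample `h` of discrete values (for continuous data the mode is usually read from the histogram classes instead):

```r
# Hypothetical sample of precipitation depths (mm), for illustration only
h <- c(12, 15, 15, 18, 20, 22, 15, 30, 41)

# Median: half of the data lie below it, half above
med <- median(h)

# Mode of discrete data: the most frequent value
counts <- table(h)
mode_value <- as.numeric(names(counts)[which.max(counts)])

print(med)
print(mode_value)
```

If several values share the highest count, `which.max` returns only the first of them, so a multimodal sample needs a separate check on `counts`.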

Page 27: 6   measurement&representation

Empirical Distribution Function

Given the dataset

$h_i = \{h_1, \cdots, h_n\}$

and having derived from it the set ordered in ascending order

$h_j = (h_1, \cdots, h_n), \quad h_1 \le h_2 \le \cdots \le h_n$

the empirical cumulative distribution function is defined as

$ECDF_i(h) := \frac{1}{n} \sum_{j=1}^{i} 1 = \frac{i}{n}$

that is, the fraction of the data that do not exceed the i-th ordered value.
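The construction can be sketched in R as follows. The dataset `h` is hypothetical; `ecdf()` is the base R function that builds the same step function, and it is used here only as a cross-check of the manual i/n values.

```r
# Hypothetical dataset (e.g. precipitation depths in mm), for illustration only
h <- c(35, 20, 60, 45, 80)
n <- length(h)

# Order the data in ascending order
h_sorted <- sort(h)

# ECDF at the i-th ordered value: i / n
ecdf_manual <- seq_len(n) / n

# Base R builds the same step function with ecdf()
Fn <- ecdf(h)
ecdf_builtin <- Fn(h_sorted)

print(data.frame(h = h_sorted, ECDF = ecdf_manual))
```

With tied values the manual i/n sequence assigns different frequencies to equal data, whereas `ecdf()` assigns the largest of them to all ties; the example avoids ties for that reason.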

Page 28: 6   measurement&representation

ECDF

The empirical cumulative distribution function can be represented as illustrated.

The ordinate of the curve is the frequency of non-exceedance; the value of h at which a given frequency q is reached is the q quantile.

[Figure: ECDF curve. x-axis: h [mm], from 20 to 80; y-axis: frequency of non-exceedance P[H<h], from 0.0 to 1.0]

Page 29: 6   measurement&representation

ECDF

The 0.5 quantile separates the data distribution in half with respect to the ordinate.

[Figure: ECDF curve with the 0.5 quantile marked. x-axis: h [mm], from 20 to 80; y-axis: frequency of non-exceedance P[H<h], from 0.0 to 1.0]


Page 31: 6   measurement&representation

ECDF

And so the median is identified.

[Figure: ECDF curve with the 0.5 quantile and the corresponding median marked. x-axis: h [mm], from 20 to 80; y-axis: frequency of non-exceedance P[H<h], from 0.0 to 1.0]

Page 32: 6   measurement&representation

Box and Whisker Diagrams

The procedure can be generalised and represented with a box and whisker diagram.

The box and whisker diagram is another way of representing the data distribution.

[Figure: ECDF curve with the 0.25, 0.5 and 0.75 quantiles marked, together with the corresponding box and "whisker". x-axis: h [mm], from 20 to 80; y-axis: frequency of non-exceedance P[H<h], from 0.0 to 1.0]
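A minimal sketch in R of the quantities that a box and whisker diagram displays, assuming a made-up sample `h`. `quantile()` and `boxplot.stats()` are base R functions; note that R offers several quantile definitions, and the default one is used here.

```r
# Hypothetical sample (mm), for illustration only
h <- c(20, 25, 30, 35, 40, 45, 50, 60, 70)

# 0.25, 0.5 and 0.75 quantiles: the edges and the centre line of the box
q <- quantile(h, probs = c(0.25, 0.5, 0.75))

# Five numbers drawn by boxplot(): lower whisker, lower hinge, median,
# upper hinge, upper whisker
stats <- boxplot.stats(h)$stats

print(q)
print(stats)
```

Calling `boxplot(h)` draws the diagram itself from the same five numbers.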

Page 33: 6   measurement&representation

Parameters and Statistics

A parameter is a number that describes a certain aspect of the population.

• For example, the (true) mean annual precipitation at a weather station is a parameter. Let us suppose that this mean is $\mu_h = 980$ mm.

• In any concrete situation the parameters are unknown.

Page 34: 6   measurement&representation

Parameters and Statistics

A statistic is a number that can be calculated from the data of a sample, without any knowledge of the parameters of the population.

• Let us suppose, for example, that the random sample of precipitation data covers 30 years of measurement and that the mean annual precipitation, calculated from the sample, is $\bar{h} = 1002$ mm.

• This mean is a statistic.

Page 35: 6   measurement&representation

Other Statistics: the Range

$R_x := \max(x) - \min(x)$

The range is the simplest indicator of the spread of the data. It is an indicator of the scale of the data. However, it uses only two values and ignores the other n-2 values that make up the sample.

Page 36: 6   measurement&representation

Other Statistics: Variance and Standard Deviation

$Var(x) := \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$

$\sigma_x := \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}$

The variance is an indicator of "scale" that considers all the data of the sample.

Page 37: 6   measurement&representation

Other Statistics: Variance and Standard Deviation

"Corrected" (unbiased) version:

$Var(x) := \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$

$\sigma_x := \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$

The unbiased version of the variance takes into account that only n-1 data are independent once their mean is fixed.
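The two versions of the variance can be compared with a short R sketch. The four values are the marble numbers used in the urn example later in these slides; the variable names are illustrative. R's built-in `var()` uses the n-1 divisor.

```r
x <- c(2, 3, 5, 9)
n <- length(x)

# Sum of squared deviations from the sample mean
ss <- sum((x - mean(x))^2)

var_n  <- ss / n         # divisor n
var_n1 <- ss / (n - 1)   # divisor n - 1 ("corrected" version)

print(var_n)
print(var_n1)
print(var(x))            # base R: same as var_n1
```

The difference between the two versions shrinks as n grows, since their ratio is (n-1)/n.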

Page 38: 6   measurement&representation

Coefficient of Variation

• The coefficient of variation (CV) of a data sample is defined as the ratio between the standard deviation and the mean:

$CV_x := \frac{\sigma_x}{\bar{x}}$

• The greater the coefficient of variation, the less informative and indicative the mean is about the future behaviour of the population.

Page 39: 6   measurement&representation

Other Statistics: Skewness and Kurtosis

Skewness is a measure of the asymmetry of the data distribution:

$sk_x := \sum_{i=1}^{n} \frac{1}{n} \left( \frac{x_i - \bar{x}}{\sigma_x} \right)^3$

Kurtosis is a measure of the "peakedness" of the data distribution:

$k_x := -3 + \sum_{i=1}^{n} \frac{1}{n} \left( \frac{x_i - \bar{x}}{\sigma_x} \right)^4$
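A minimal sketch in R of the two moment-based statistics, assuming the standard deviation with divisor n and the usual convention of subtracting 3 from the fourth standardised moment (excess kurtosis). The sample `x` is made up; base R has no built-in skewness or kurtosis function, so both are computed by hand.

```r
# Hypothetical symmetric sample, for illustration only
x <- c(1, 2, 3, 4, 5)
n <- length(x)

# Standard deviation with divisor n
sigma_x <- sqrt(sum((x - mean(x))^2) / n)

# Standardised deviations
z <- (x - mean(x)) / sigma_x

sk_x <- sum(z^3) / n   # skewness: 0 for a symmetric sample
m4   <- sum(z^4) / n   # fourth standardised moment
k_x  <- m4 - 3         # excess kurtosis (assumed convention)

print(sk_x)
print(k_x)
```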

Page 40: 6   measurement&representation

Estimation and Hypothesis Testing

Usually, we are not interested in the statistics in themselves, but in what they tell us about the population of interest.

• We could, for example, use the mean annual precipitation measured at all hydro-meteorological stations to estimate the mean annual precipitation for the Italian Peninsula.

• Or we could use the mean of the sample to establish whether the mean annual precipitation has changed over the period covered by the sample.

Page 41: 6   measurement&representation

Estimation and Hypothesis Testing

These two questions belong to the two main branches of classical statistical inference:

• The estimation of parameters

• Statistical hypothesis testing

Page 42: 6   measurement&representation

Sample Variability

A fundamental aspect of sample statistics is that they vary from one sample to the next. In the case of annual precipitation, it is very improbable that the sample mean of 1002 mm will coincide with the mean of the population.

• The variability of a sample statistic from sample to sample is called sample variability.

- When sample variability is very high, the sample is uninformative, or even misleading, about the population parameter.

- When sample variability is small, the statistic is informative, even though it is practically impossible for the statistic of a sample to be exactly equal to the population parameter.

Page 43: 6   measurement&representation

Sample Variability

Simulation

Sample variability will be illustrated as follows:

1. we will consider a discrete variable that can only assume a small number of possible values (N = 4);

2. all possible samples of size n = 2 will be listed;

3. the mean will be calculated for each possible sample of size n = 2;

4. the distribution of the means of the samples of size n = 2 will be examined.

The mean $\mu$ and the variance $\sigma^2$ of the population will be calculated. Note that $\mu$ and $\sigma^2$ are parameters, while the mean $\bar{x}_i$ and the variance $s_i^2$ of each sample are statistics.

(Source: Corrado Caudek, Techniques in Psychological Research and Data Analysis)

Page 44: 6   measurement&representation

Sample Variability

• The experiment in this example consists of n=2 draws, with replacement, of a marble $x_i$ from an urn that contains N=4 marbles.

• The marbles are numbered as follows: 2, 3, 5, 9

• Drawing with replacement corresponds to a population of infinite size (it is in fact always possible to draw a marble from the urn).

Page 45: 6   measurement&representation

Sample Variability

• For each sample of size n=2 the mean of the values of the marbles drawn is calculated:

$\bar{x} = \frac{\sum_{i=1}^{2} x_i}{2}$

• For example, if the marbles drawn are x1=2 and x2=3, then:

$\bar{x} = \frac{2 + 3}{2} = \frac{5}{2} = 2.5$

Page 46: 6   measurement&representation

Sample Variability

Three Distributions

We must distinguish between three distributions:

1. the population distribution

2. the distribution of a sample

3. the sampling distribution of the means of all possible samples

Page 47: 6   measurement&representation

Sample Variability

1. The Population Distribution

The population distribution: the distribution of X (the value of the marble drawn) in the population. In this specific case the population is of infinite size and has the following probability distribution:

x_i      p_i
2        1/4
3        1/4
5        1/4
9        1/4
Total    1

Page 48: 6   measurement&representation

Sample Variability

• The mean of the population is:

$\mu = \sum x_i p_i = 4.75$

• The variance of the population is:

$\sigma^2 = \sum (x_i - \mu)^2 p_i = 7.1875$

Page 49: 6   measurement&representation

Sample Variability

2. The Distribution of a Sample

The distribution of a sample: the distribution of X in a specific sample.

• If, for example, x1 = 2 and x2 = 3, then the mean of this sample is $\bar{x} = 2.5$ and the variance is $s^2 = 0.5$.

Page 50: 6   measurement&representation

Sample Variability

3. The Sampling Distribution of the Means

The sampling distribution of the means: the distribution of the means of all the possible samples.

• If the size of the samples is n=2, then there are 4x4=16 possible samples. We can therefore list their means.

sample   mean     sample   mean
3, 2     2.5      2, 3     2.5
5, 2     3.5      2, 5     3.5
9, 2     5.5      2, 9     5.5
5, 3     4.0      3, 5     4.0
9, 3     6.0      3, 9     6.0
9, 5     7.0      5, 9     7.0
2, 2     2.0      3, 3     3.0
5, 5     5.0      9, 9     9.0
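The enumeration above can be sketched in R. `expand.grid()` is a base R function that builds all ordered pairs; the variable names are illustrative. Because all 16 samples are equally likely, the mean and variance of the sampling distribution are plain averages over the 16 sample means.

```r
X <- c(2, 3, 5, 9)                        # values on the marbles

# All 4 x 4 = 16 ordered samples of size n = 2 drawn with replacement
samples <- expand.grid(x1 = X, x2 = X)

# Mean of each sample
means <- rowMeans(samples)

# Probability distribution of the sample means
distr <- table(means) / nrow(samples)

# Mean and variance of the sampling distribution of the means
mu_xbar  <- mean(means)
var_xbar <- mean((means - mu_xbar)^2)

print(distr)
print(c(mu_xbar, var_xbar))
```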

Page 51: 6   measurement&representation

Sample Variability

• The sampling distribution of the means has the following probability distribution:

mean     p_i
2.0      1/16
2.5      2/16
3.0      1/16
3.5      2/16
4.0      2/16
5.0      1/16
5.5      2/16
6.0      2/16
7.0      2/16
9.0      1/16
Total    1

Page 52: 6   measurement&representation

Sample Variability

• The mean of the sampling distribution of the means is:

$\mu_{\bar{x}} = \sum \bar{x}_i p_i = 4.75$

• The variance of the sampling distribution of the means is:

$\sigma^2_{\bar{x}} = \sum (\bar{x}_i - \mu_{\bar{x}})^2 p_i = 3.59375$

Page 53: 6   measurement&representation

Sample Variability

The example we have seen is a very particular one, in that the population is known. In practice the population distribution is never known.

However, we can take note of two important properties of the sampling distribution of the means:

• The mean $\mu_{\bar{x}}$ of the sampling distribution of means is the same as the population mean $\mu$.

• The variance $\sigma^2_{\bar{x}}$ of the sampling distribution of means is equal to the ratio of the population variance $\sigma^2$ to the sample size n:

$\sigma^2_{\bar{x}} = \frac{\sigma^2}{n} = \frac{7.1875}{2} = 3.59375$
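The second property follows in a few steps if the n draws are assumed to be independent and identically distributed, as they are when drawing with replacement. A short derivation under that assumption, using only the symbols already defined:

```latex
\sigma^2_{\bar{x}}
  = \mathrm{Var}\!\left( \frac{1}{n} \sum_{i=1}^{n} x_i \right)
  = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(x_i)
  = \frac{n \, \sigma^2}{n^2}
  = \frac{\sigma^2}{n}
```

The second equality is where independence is used: without it, covariance terms between the draws would appear in the sum.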

Page 54: 6   measurement&representation

Sample Variability

The two things to note can be summarised as follows:

• The mean and variance of the sampling distribution of means are determined by the mean and variance of the population:

$\mu_{\bar{x}} = \mu, \quad \sigma^2_{\bar{x}} = \frac{\sigma^2}{n}$

• The variance of the sampling distribution of the means is smaller than the variance of the population (for n > 1).

Page 55: 6   measurement&representation

Sample Variability

In what follows, we will use the properties of the sampling distribution to make inferences about the parameters of the population even when the population distribution is not known.

Page 56: 6   measurement&representation

Sample Variability

Three Distributions

We have therefore distinguished between three distributions:

1. the population distribution: $\Omega = \{2, 3, 5, 9\}$, $\mu = 4.75$, $\sigma^2 = 7.1875$

2. the distribution of a sample: $\Omega_i = \{2, 3\}$, $\bar{x} = 2.5$, $s^2 = 0.5$

3. the sampling distribution of the means of all possible samples: $\Omega_{\bar{x}} = \{2.0, 2.5, 3.0, 3.5, 4.0, 5.0, 5.5, 6.0, 7.0, 9.0\}$, $\mu_{\bar{x}} = 4.75$, $\sigma^2_{\bar{x}} = 3.59375$

Page 57: 6   measurement&representation

Sample Variability

1. The population distribution: this is the distribution that contains all possible observations. The mean and variance of this distribution are denoted by $\mu$ and $\sigma^2$.

2. The distribution of a sample: this is the distribution of the values of the population that make up a particular random sample of size n. The single values are denoted by $x_1, \ldots, x_n$, and the mean and variance by $\bar{x}$ and $s^2$.

3. The sampling distribution of the means of the samples: this is the distribution of the $\bar{x}_i$ for all the possible samples of size n that can be drawn from the population being considered. The mean and variance of the sampling distribution of means are denoted by $\mu_{\bar{x}}$ and $\sigma^2_{\bar{x}}$.

Page 58: 6   measurement&representation

Sample Variability

The distribution that is the basis of statistical inference is the sampling distribution.

Definition: the sampling distribution of a statistic is the distribution of the values that the statistic takes over all samples of size n that can be drawn from the population.

Note that if a simulation considers fewer samples than all those theoretically possible, then the resulting distribution will only be an approximation of the true sampling distribution.

Page 59: 6   measurement&representation

Estimation and Hypothesis Testing

Having computed different statistics, we can now formulate some hypotheses. For example:

• Do the samples all have the same mean and the same variance?

• Does the mean depend on the size of the sample?

• Does the variance depend on the size of the sample?

Page 60: 6   measurement&representation

Estimation and Hypothesis Testing

If the samples do not have the same mean, a trend may be present.

Page 61: 6   measurement&representation

Estimation and Hypothesis Testing

The variance can vary with the size of the sample!

If it does not stabilise as the number of data in the sample increases, then the data are said to have "Infinite Variance Syndrome".
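One way to check whether the variance stabilises is to compute it on the first k data for increasing k. The sketch below is an illustration under stated assumptions: the helper `running_var` is hypothetical, and the two synthetic series (normal and Cauchy) are generated only to contrast a distribution with finite variance and one whose variance does not exist.

```r
# Variance (divisor k - 1) of the first k values, for k = 1, ..., length(x)
running_var <- function(x) {
  k <- seq_along(x)
  m <- cumsum(x) / k                    # running mean
  (cumsum(x^2) - k * m^2) / (k - 1)     # NaN at k = 1, where it is undefined
}

set.seed(1)                             # reproducible synthetic data
n <- 5000
x_normal <- rnorm(n)                    # finite variance: the curve settles
x_cauchy <- rcauchy(n)                  # no finite variance: the curve jumps

rv_normal <- running_var(x_normal)
rv_cauchy <- running_var(x_cauchy)

# Compare the running variance at a few sample sizes
idx <- c(10, 100, 1000, 5000)
print(data.frame(k = idx, normal = rv_normal[idx], cauchy = rv_cauchy[idx]))
```

Plotting `rv_normal` and `rv_cauchy` against k makes the contrast visible at a glance. The cumulative-sum formula is compact but can lose precision for data with a large mean relative to their spread.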

Page 62: 6   measurement&representation

Null Hypothesis

We will have a chance to look at hypothesis testing in detail in future lectures. However, it is worth remembering the following:

• Generally, it is not possible to definitively prove anything. One can only attempt to prove that a hypothesis is not true.

• Let H0 be the (null) hypothesis to be tested. If H0 cannot be rejected, then one can affirm that "it is true" with a certain degree of confidence.

Page 63: 6   measurement&representation

Other Statistics: Covariance

Given two datasets, for example

$h_i = \{h_1, \cdots, h_n\}$ and $l_i = \{l_1, \cdots, l_n\}$

the covariance between these two datasets is defined as:

$Cov(h_i, l_i) := \frac{1}{n-1} \sum_{i=1}^{n} (l_i - \bar{l})(h_i - \bar{h})$

Page 64: 6   measurement&representation

Other Statistics: Correlation

Given two datasets, for example

$h_i = \{h_1, \cdots, h_n\}$ and $l_i = \{l_1, \cdots, l_n\}$

the correlation between these two datasets is defined as:

$\rho_{lh} := \frac{Cov(l, h)}{\sqrt{\sigma_h^2 \, \sigma_l^2}} = \frac{Cov(l, h)}{\sigma_h \, \sigma_l}$
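A minimal sketch in R of the two definitions, assuming two made-up series `h` and `l` of equal length and the n-1 divisor throughout. The base R functions `cov()`, `cor()` and `sd()` use the same divisor.

```r
# Hypothetical paired data, for illustration only
h <- c(10, 20, 30, 40, 50)
l <- c(12, 18, 33, 39, 48)
n <- length(h)

# Covariance: mean product of the deviations from the two means
cov_hl <- sum((l - mean(l)) * (h - mean(h))) / (n - 1)

# Correlation: covariance divided by the two standard deviations
rho_lh <- cov_hl / (sd(h) * sd(l))

print(c(cov_hl, cov(h, l)))   # manual and built-in covariance
print(c(rho_lh, cor(h, l)))   # manual and built-in correlation
```

The correlation is dimensionless and lies between -1 and 1, which makes it easier to compare across datasets than the covariance.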

Page 65: 6   measurement&representation

Other Statistics: Correlation

Note that one can also consider the correlation between two sample series of equal length taken from the same dataset:

$h_i = \{h_1, \cdots, h_{n-1}\}$ and $h_{i+1} = \{h_2, \cdots, h_n\}$

Resulting in:

$Cov(h_i, h_{i+1}) := \frac{1}{n-1} \sum_{j=1}^{n-1} (h_j - \bar{h}_i)(h_{j+1} - \bar{h}_{i+1})$

where $\bar{h}_i$ and $\bar{h}_{i+1}$ are the means of the two series.

Page 66: 6   measurement&representation

Other Statistics: Correlation

Repeating this operation for series that are progressively shorter and separated by r instants, the resulting series are:

$h_i^r = \{h_1, \cdots, h_{n-r}\}$ and $h_{i+r} = \{h_{1+r}, \cdots, h_n\}$

Hence:

$Cov(h_i^r, h_{i+r}) := \frac{1}{n-1} \sum_{j=1}^{n-r} (h_j - \bar{h}_i^r)(h_{j+r} - \bar{h}_{i+r})$

$\rho(h_i^r, h_{i+r}) := \frac{Cov(h_i^r, h_{i+r})}{\sigma_i^r \, \sigma_{i+r}}$
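The lagged construction can be sketched in R as follows. The function `lag_correlation` and the series `h` are hypothetical names and values introduced for illustration. The sketch relies on `cor()`, in which the normalising factor cancels between covariance and standard deviations; R's built-in `acf()` uses a slightly different estimator (the overall mean and a common divisor), so its values need not coincide exactly.

```r
# Correlation between a series and itself shifted by r instants
lag_correlation <- function(h, r) {
  n <- length(h)
  head_part <- h[1:(n - r)]     # h_1, ..., h_{n-r}
  tail_part <- h[(1 + r):n]     # h_{1+r}, ..., h_n
  cor(head_part, tail_part)
}

# Hypothetical time series, for illustration only
h <- c(3, 5, 4, 6, 8, 7, 9, 11, 10, 12)

# Correlation at lags 1, 2 and 3
rho <- sapply(1:3, function(r) lag_correlation(h, r))
print(rho)
```

As r grows, the two sub-series get shorter, so the estimates at large lags rest on few pairs and become less reliable.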

Page 67: 6   measurement&representation

Other Statistics: Autocorrelation

Page 68: 6   measurement&representation

Random Sampling

Within the strategy of creating and analysing data samples, the selection (or, sometimes, the generation) of random samples plays an important role.

A sample of n events selected from a population is random if it has the same probability of being selected as any other sample of the same size.

If the data are generated, then one is carrying out a random experiment. Some examples of this are:

• tossing a coin;

• counting the rainy days in a year; and

• counting the days when the river flow at the San Lorenzo Bridge, Trento, is greater than a predetermined value.

Page 69: 6   measurement&representation

Sample Variability

Simulation 2

Let us consider another example where sample variability is illustrated as follows:

1. the same population as in the previous example will be used (N = 4);

2. using the R software, 50,000 samples of size n = 2 will be drawn, with replacement, from the population;

3. the mean will be calculated for each of these samples of size n = 2;

4. the mean and variance of the distribution of the means of the 50,000 samples of size n = 2 will be calculated.


Page 73: 6   measurement&representation

Sample Variability

Simulation 2

N <- 4                  # number of marbles in the urn
n <- 2                  # size of each sample
nSamples <- 50000       # number of samples to draw
X <- c(2, 3, 5, 9)      # values on the marbles

# Mean and variance of the population
Mean <- mean(X)
Var <- var(X) * (N - 1) / N    # var() divides by N - 1; rescale to divisor N

# 50,000 samples are drawn with replacement
SampDistr <- rep(0, nSamples)
for (i in 1:nSamples) {
  samp <- sample(X, n, replace = TRUE)
  SampDistr[i] <- mean(samp)
}

# Mean and variance of the simulated distribution of the means
MeanSampDistr <- mean(SampDistr)
VarSampDistr <- var(SampDistr) * (nSamples - 1) / nSamples

(Source: Corrado Caudek, Techniques in Psychological Research and Data Analysis)

Page 74: 6   measurement&representation

Sample Variability

Simulation 2: results of the analysis with R

> Mean
[1] 4.75
> Var
[1] 7.1875
> MeanSampDistr
[1] 4.73943
> VarSampDistr
[1] 3.578548
> Var/n
[1] 3.59375

Page 75: 6   measurement&representation

Sample Variability

Population: $\mu = 4.75$, $\sigma^2 = 7.1875$

Sampling distribution of the means: $\mu_{\bar{x}} = 4.75$, $\sigma^2_{\bar{x}} = 3.59375$

Results of the R simulation: $\mu_{\bar{x}} = 4.73943$, $\sigma^2_{\bar{x}} = 3.578548$

Page 76: 6   measurement&representation

Thank you for your attention!

G. Ulrici - Uomo dopo aver lavorato alle slides (Man after working on the slides), 2000?