How Many Samples do I Need? Part 3

1 of 45

How Many Samples do I Need? Part 3

Presenter: Sebastian Tindall

60 minutes

DQO Training Course Day 1

Module 6

2 of 45

How Many Samples do I Need?

REMEMBER:

HETEROGENEITY

IS THE RULE!

3 of 45

Sampling for Environmental Activities

Chuck Ramsey

EnviroStat, Inc.

PO Box 636

Fort Collins, CO 80522

970-689-5700

970-229-9977 fax

[email protected]

www.envirostat.org

4 of 45

Sampling for Analytical Purposes

Pierre Gy

Translated by A.G. Royle

John Wiley & Sons

1998

ISBN: 0-471-97956-2

5 of 45

Pierre Gy’s Sampling Theory and Sampling Practices, 2nd Edition: Heterogeneity, Sampling Correctness, and Statistical Process Control

Francis F. Pitard

CRC Press

1993

ISBN: 0-8493-8917-8

6 of 45

Seven Major Sampling Errors

• Fundamental Error - FE
• Grouping and Segregation Error - GSE
• Materialization Error - ME
  – Delimitation Error - DE
  – Extraction Error - EE
• Preparation Error - PE
• Trends - CE2
• Cycles - CE3

7 of 45

Seven Major Sampling Errors

SE = FE + GSE + DE + EE + PE + CE2 + CE3

8 of 45

Ramsey’s “Rules”

All measurements are an average

With discrete sampling, the sample average is a random variable

With discrete sampling, the sample SD is an artifact of the sample collection process

9 of 45

Ramsey’s “Rules”

• Heterogeneity is the rule
• Multi-increment sampling can drive a skewed distribution towards normal (by invoking the CLT)
• FE² is
  – proportional to particle size (cubed, d³)
  – inversely proportional to mass
• Lab data are suspect (error can be large)

10 of 45

Ramsey’s “Rules” (cont.)

• Good sampling technique is critical
• Typical sample sizes will underestimate the mean
• Quality control (QC) is important
  – NO boilerplate (e.g., PARCC)
  – QC must be problem specific

Maximize the use of onsite analysis to guide planning & decisions

DQOs are the most important component of the process

11 of 45

Ramsey’s “Rules” (cont.)

• One measurement is a crap shoot:
  – Tremendous heterogeneity (variability) between:
    Particles within a sample
    Aliquots of a sample
    Duplicate samples
• Never base a decision on ONE grab sample
  – Always collect X increments and use AT LEAST one multi-increment sample to make the decision

12 of 45

Average Exposure

In discrete sampling:
• the sample mean is a random variable.
• the 95% UCL is a random variable.
• the sample range is a random variable.
• the sample standard deviation is a random variable.
• the sample standard deviation is an artifact of the sample collection process.
• n (# samples) is NOT proportional to the size of the population (e.g., area, mass, or volume).

13 of 45

Average A = 16 ppm

Average B = 221 ppm

Average from discrete sampling is a random variable

[Figure: sampling locations marked A and B across the site]

Average depends on locations sampled

14 of 45

Hot Spots

• 1,000,000 g at site
• 100,000 g > AL
• Take 10 samples
• 1 > AL
• Remove that 1
• Re-sample = clean
• Wrong!
• If 100,000 > AL
• Minus 1
• Still 99,999 > AL

AL = action level

15 of 45

Hot Spots Simply Means: “I want to look at units (e.g., mass, volume) that are becoming smaller and smaller and smaller”

$ $ $ $ $ $ $ $ $

16 of 45

Additional Population Considerations

• Sample support - “physical size, shape and orientation of the material that is extracted from the sampling unit that is actually available to be measured or observed, and therefore, to represent the sampling unit.”

• Assure enough sample for analyses

• Specify how the sample support will be processed and sub-sampled for analysis.

EPA Guidance on Choosing a Sampling Design for Environmental Data Collection, EPA QA/G-5S, December 2002, EPA/240/R-02/005

17 of 45

Sub-Sampling

• The DQO must define what represents the population in terms of laboratory sample size:

• Typical laboratory sample sizes that are digested or extracted: metals - 1g, volatiles - 5g, semi-volatiles - 30 g

• The 1g or 30g sample analyzed by the lab is supposed to represent a larger area/mass (e.g., acre). Does it?

18 of 45

Multi-Increment Sampling is the Way to Go

Next slides show “How to” perform multi-increment sampling

19 of 45

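n = m * k

[Figure: field samples grouped into sets of k = 3 increments, combined into m = 2 multi-increment samples for FAM/Laboratory analysis]

Collect “n” samples

Group into “k” increments

Combine “k” into “m” multi-increments

Remember: we want the AVERAGE over the Decision Unit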

20 of 45

Multi-Increment Sampling

n = number of samples required
k = increments
m = samples analyzed

n = m * k         Mass of m   Mass of k   Total mass sent to lab
100 = 100 * 1     1 kg        1000 g      100 kg
100 = 50 * 2      1 kg        500 g       50 kg
100 = 25 * 4      1 kg        250 g       25 kg
100 = 20 * 5      1 kg        200 g       20 kg
100 = 10 * 10     1 kg        100 g       10 kg
100 = 5 * 20      1 kg        50 g        5 kg
100 = 4 * 25      1 kg        40 g        4 kg
100 = 2 * 50      1 kg        20 g        2 kg
100 = 1 * 100     1 kg        10 g        1 kg
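The rows of this table follow directly from n = m * k. As a minimal sketch (the function name and the 1 kg default are assumptions taken from the table's "Mass of m" column, not part of the original slides), the same trade-off can be computed like this:

```python
def multi_increment_plan(n, m, sample_mass_g=1000.0):
    """Split n field increments into m multi-increment samples.

    sample_mass_g is the assumed mass of each multi-increment sample
    (1 kg, as in the table). Hypothetical helper for illustration only.
    """
    if m <= 0 or n % m != 0:
        raise ValueError("n must be a positive multiple of m")
    k = n // m  # increments combined into each multi-increment sample
    return {
        "m": m,
        "k": k,
        "increment_mass_g": sample_mass_g / k,       # mass of each increment
        "total_mass_kg": m * sample_mass_g / 1000.0,  # mass shipped to the lab
    }


# Reproduce the table for n = 100 increments.
for m in (100, 50, 25, 20, 10, 5, 4, 2, 1):
    plan = multi_increment_plan(100, m)
    print(f"100 = {plan['m']:>3} * {plan['k']:>3}   "
          f"increment {plan['increment_mass_g']:>6.0f} g   "
          f"to lab {plan['total_mass_kg']:>5.0f} kg")
```

Fewer, larger composites (small m, large k) keep the same field coverage while cutting the mass and number of samples the lab has to handle.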

21 of 45


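[Flow diagram: multi-increment sampling of a decision unit, with increment locations marked “x”; numbers in parentheses are the diagram's step labels]

exposure unit = decision unit [DU] (1)

calc d & FE & mass (2, 3, 4)

10 scoops (5)

Samples & QC (6)

Lab (7)

Re-calculate particle size (8)

Grind (9)

Sub-sample mass for lab analysis (10)

Analyze entire sub-sample (11)

Average concentration for DU (12, 13)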

22 of 45


1. Agree on exposure unit or decision unit.
2. Select or measure a reasonable maximum sample particle size.
3. Select the FE.
4. Calculate the mass of sample needed based on the FE and particle size.
5. Select n, m, & k.
6. Using a square scoop large enough to capture the maximum particle size, collect enough sample increments (k) to equal the mass calculated in #4 and place in a jar, combining increments into one “sample”.
7. Repeat within a given decision unit to produce replicates (duplicates, triplicates, etc.) to generate QC “samples”.
8. Deliver the sample and QC sample(s) to the lab (m).

23 of 45

Multi-Increment Sampling is the Way to Go, continued

9. Calculate the particle size of sample needed based on the desired sub-sampling FE and the mass that the lab normally uses for a given analysis (extraction).

10. Lab may have to grind entire mass of field sample (& QCs) to the agreed upon maximum analytical particle size in #9.

11. Lab must perform one-dimensional sub-sampling of entire mass [spread entire ground sample on flat surface in thin layer, then systematically or randomly collect sufficient small mass sub-sampling increments to equal the mass the laboratory requires for an analysis; do likewise for each QC sample].

12. Combine sub-sampling increments into the “sample”, then digest/extract/analyze the sample and QC samples.

13. Calculate the COPC concentration from each sample.

14. Concentration represents average concentration or activity per decision unit.

24 of 45

Comparison of Discrete vs. Multi-Increment

Discrete                   Multi-Increment
n    Avg      SD           n     m    k     Avg     SD
5    34.2     51.4         250   5    50    68.9    145.8
5    48.0     40.7         250   5    50    72.4    153.9
5    9.6      12.9         250   5    50    74.6    150.3
5    291.8    331.0        250   5    50    66.5    157.1

Remember (in discrete sampling):

1. An average is a random variable;

2. The SD is an artifact of the sample collection process.
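The comparison can be illustrated with a small simulation. This is only a sketch under stated assumptions: the site is modeled as a skewed lognormal population (the parameters `MU` and `SIGMA` are invented for illustration), each multi-increment sample is assumed to average its k increments perfectly, and the numbers it prints are not the values in the table.

```python
import random
import statistics

MU, SIGMA = 3.0, 1.5  # assumed lognormal parameters for a skewed "site"


def draw(rng):
    """Concentration of one randomly located increment (hypothetical units)."""
    return rng.lognormvariate(MU, SIGMA)


def discrete_estimate(rng, n=5):
    """Site mean estimated from n discrete (grab) samples."""
    return statistics.mean([draw(rng) for _ in range(n)])


def multi_increment_estimate(rng, m=5, k=50):
    """Site mean estimated from m samples, each averaging k increments."""
    samples = [statistics.mean([draw(rng) for _ in range(k)]) for _ in range(m)]
    return statistics.mean(samples)


def spread(estimator, trials=300, seed=0):
    """SD of the estimated mean across repeated, independent sampling events."""
    rng = random.Random(seed)
    return statistics.stdev([estimator(rng) for _ in range(trials)])


discrete_spread = spread(discrete_estimate)
mi_spread = spread(multi_increment_estimate)
print(f"spread of discrete means (n = 5):          {discrete_spread:.1f}")
print(f"spread of multi-increment means (5 x 50):  {mi_spread:.1f}")
```

Both designs send five samples to the lab, but the multi-increment estimate varies far less from one sampling event to the next because each sample already averages 50 locations.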

25 of 45

SHOW VDT File

X-bar as Random Variable

26 of 45

Effects of Grinding a Soil

Rep        2-g Not Ground   50-g Not Ground   2-g Ground   50-g Ground
1          0.39             0.25              2.33         2.03
2          0.48             1.81              2.25         2.04
3          0.37             0.37              2.22         2.00
4          0.41             1.48              2.28         2.03
5          28.61            7.93              2.15         1.97
6          0.48             0.56              2.15         2.00
7          0.45             0.35              2.15         1.90
8          0.68             0.75              2.17         2.02
9          0.77             0.56              2.00         1.97
10         1.08             0.35              1.98         1.98
11         0.77             0.62              2.10         1.90
12         0.47             5.62              1.96         1.91

mean       2.91             1.72              2.15         1.98
std dev    8.09             2.46              0.12         0.05
RSD        278%             143%              5.50%        2.57%

Walsh, Marianne E.; Ramsey, Charles A.; Jenkins, Thomas F., The Effect of Particle Size Reduction by Grinding on Subsampling Variance for Explosives Residues in Soil, Chemosphere 49 (2002) 1267-1273.
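The summary rows can be recomputed from the twelve replicates. A short sketch, with the values transcribed from the table on this slide and the sample standard deviation (n - 1 denominator) assumed:

```python
import statistics

# Replicate results transcribed from the slide (12 replicates per column).
data = {
    "2-g not ground": [0.39, 0.48, 0.37, 0.41, 28.61, 0.48,
                       0.45, 0.68, 0.77, 1.08, 0.77, 0.47],
    "50-g not ground": [0.25, 1.81, 0.37, 1.48, 7.93, 0.56,
                        0.35, 0.75, 0.56, 0.35, 0.62, 5.62],
    "2-g ground": [2.33, 2.25, 2.22, 2.28, 2.15, 2.15,
                   2.15, 2.17, 2.00, 1.98, 2.10, 1.96],
    "50-g ground": [2.03, 2.04, 2.00, 2.03, 1.97, 2.00,
                    1.90, 2.02, 1.97, 1.98, 1.90, 1.91],
}


def summarize(values):
    """Return (mean, sample SD, relative SD in percent)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # n - 1 denominator
    return mean, sd, 100.0 * sd / mean


summary = {name: summarize(values) for name, values in data.items()}
for name, (mean, sd, rsd) in summary.items():
    print(f"{name:16s} mean = {mean:5.2f}   sd = {sd:5.2f}   RSD = {rsd:6.1f}%")
```

A single high replicate (28.61) dominates the unground 2-g column, which is why its SD exceeds its mean; after grinding, the replicates agree to within a few percent.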

27 of 45

Fundamental Error

FE = fundamental error
M = mass of sample (g)
d = maximum particle size, <5% oversize (cm)

$FE^2 \approx \frac{22.5\, d^3}{M}$

EPA/600/R-92/128, July 1992
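As a minimal sketch of the relationship FE² ≈ 22.5 d³ / M (the function name and the `factor` parameter are illustrative assumptions; d is in cm and M in g, as defined on this slide):

```python
import math


def fundamental_error(d_cm, mass_g, factor=22.5):
    """Relative fundamental error from FE^2 ~ factor * d^3 / M.

    d_cm: maximum particle size in cm; mass_g: sample mass in g.
    factor defaults to the 22.5 constant used on the slide.
    """
    if d_cm <= 0 or mass_g <= 0:
        raise ValueError("particle size and mass must be positive")
    return math.sqrt(factor * d_cm ** 3 / mass_g)


# FE for 2 mm (0.2 cm) material at several typical laboratory masses.
for mass_g in (1, 2, 10, 30):
    print(f"{mass_g:>3} g  ->  FE ~ {100 * fundamental_error(0.2, mass_g):.0f}%")
```

Because FE scales with d^1.5 and with 1/sqrt(M), halving the particle size reduces the error much more than doubling the mass does.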

28 of 45

Fundamental Error

$22.5 \approx c\,l\,f\,g$

c - mineralogical factor
density factor (for soil ~ 2.5)
l - liberation factor (between 0 and 1)
f - shape factor (for soil ~ 0.5)
g - granulometric factor (~ 0.25)

$FE^2 \approx \frac{22.5\, d^3}{M}$

29 of 45

Fundamental Error

Solve for particle size

$d = \sqrt[3]{\frac{FE^2\, M}{22.5}}$

OR

Solve for mass of sample

$M = \frac{22.5\, d^3}{FE^2}$
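The two rearrangements of FE² ≈ 22.5 d³ / M can be sketched as a pair of helpers. The function names are hypothetical; d is in cm, M in g, and FE is a fraction (0.15 for 15%):

```python
def mass_for_fe(d_cm, fe, factor=22.5):
    """Sample mass (g) needed to reach a target FE at particle size d (cm)."""
    return factor * d_cm ** 3 / fe ** 2


def particle_size_for_fe(mass_g, fe, factor=22.5):
    """Maximum particle size (cm) allowed to reach a target FE at mass M (g)."""
    return (fe ** 2 * mass_g / factor) ** (1.0 / 3.0)


# Mass needed for 1 inch (2.54 cm) material at FE = 20%: roughly 9.2 kg.
print(f"{mass_for_fe(2.54, 0.20):.0f} g")

# Round trip: the particle size that yields the same FE at that mass.
required_mass = mass_for_fe(0.4, 0.15)
print(f"{particle_size_for_fe(required_mass, 0.15):.2f} cm")
```

Which form to use depends on what is fixed: in the field the particle size is usually given and the mass is chosen, while in the lab the aliquot mass is fixed and the material is ground to the required size.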

30 of 45

Constant Particle Size

Particle Size - 2 mm

Sample Mass    Approx. FE (%)
1 g            42%
2 g            30%
10 g           13%
30 g           8%

Particle Size - 2.54 cm

Sample Mass    Approx. FE (%)
9217 g         20%
4097 g         30%

31 of 45

Examples of FE, Mass, Particle Size

Mass    Particle Size (FE ~ 20%)    Particle Size (FE ~ 30%)
1 g     0.12 cm                     0.16 cm
2 g     0.16 cm                     0.20 cm
10 g    0.26 cm                     0.35 cm
30 g    0.38 cm                     0.50 cm
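These particle sizes can be checked against d = (FE² M / 22.5)^(1/3). A brief sketch, with a hypothetical helper name; because d enters the formula in cm, the values it prints are in cm (0.12 cm is 1.2 mm):

```python
def particle_size_cm(mass_g, fe, factor=22.5):
    """Maximum particle size (cm) giving fundamental error fe at mass M (g)."""
    return (fe ** 2 * mass_g / factor) ** (1.0 / 3.0)


print("Mass    d at FE~20%   d at FE~30%")
for mass_g in (1, 2, 10, 30):
    d20 = particle_size_cm(mass_g, 0.20)
    d30 = particle_size_cm(mass_g, 0.30)
    print(f"{mass_g:>3} g   {d20:.2f} cm       {d30:.2f} cm")
```

The 2 g, 30% entry (0.20 cm) agrees with the 2 mm, 2 g, 30% row of the constant-particle-size table, which is a quick consistency check on the units.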

32 of 45

Examples of FE, Mass, Particle Size

May not work well or at all with some media

•Clay

•Water

•Air

33 of 45

Example

• Soil-like material
• Largest particle about 4 mm
• Action limit is 500 ppm
• Analytical aliquot is one gram
• Is this acceptable?

Compliments of EnviroStat, Inc.

34 of 45

Example (cont)

Check particle size representativeness

$S_{FE}^2 = \frac{18 f d^3}{m_S} = \frac{(22.5)(0.4)^3}{1} = 1.44$

$FE = \sqrt{S_{FE}^2} = \sqrt{1.44} = 1.2$

FE percent = 1.2 * 100

FE percent = 120%

EPA/600/R-92/128, July 1992

35 of 45

Example (cont)

What mass is required to reduce FE to 15%?

$M_S = \frac{(22.5)(0.4)^3}{(0.15)^2} = 64\ \text{g}$

But lab can analyze 10 grams at the most


36 of 45

Example (cont)

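To what particle size does the sample need to be reduced to achieve FE of 15%?

For a 1 g aliquot:

$d = \sqrt[3]{\frac{FE^2\, M_S}{C}} = \sqrt[3]{\frac{(0.15)^2 (1)}{22.5}} = 0.1\ \text{cm} = 1\ \text{mm}$

For a 10 g aliquot:

$d = \sqrt[3]{\frac{FE^2\, M_S}{C}} = \sqrt[3]{\frac{(0.15)^2 (10)}{22.5}} \approx 0.22\ \text{cm} = 2.2\ \text{mm}$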

37 of 45

Example (cont)

What is the total FE if we take 64 grams, grind it to 0.1 cm, and take one gram?

Ignoring all the other errors

$TE = \sqrt{FE_1^2 + FE_2^2} = \sqrt{(0.15)^2 + (0.15)^2} = 21\%$
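The whole example can be followed end to end in a few lines. This is a sketch only: the helper names are hypothetical, the constant 22.5 is taken from the slides, and all other error sources are ignored, as stated above.

```python
import math

FACTOR = 22.5  # sampling constant used on the slides


def fe(d_cm, mass_g):
    """Fundamental error (fraction) for particle size d (cm) and mass M (g)."""
    return math.sqrt(FACTOR * d_cm ** 3 / mass_g)


def mass_for(d_cm, target_fe):
    """Mass (g) needed to reach target_fe at particle size d (cm)."""
    return FACTOR * d_cm ** 3 / target_fe ** 2


def size_for(mass_g, target_fe):
    """Particle size (cm) needed to reach target_fe at mass M (g)."""
    return (target_fe ** 2 * mass_g / FACTOR) ** (1.0 / 3.0)


def total_error(*stage_errors):
    """Combine independent stage errors in quadrature."""
    return math.sqrt(sum(e * e for e in stage_errors))


d_field = 0.4                         # largest particle, 4 mm, in cm
fe_direct = fe(d_field, 1.0)          # 1 g aliquot taken from unground soil
field_mass = mass_for(d_field, 0.15)  # field mass for FE = 15%
d_1g = size_for(1.0, 0.15)            # grind size for a 1 g aliquot
d_10g = size_for(10.0, 0.15)          # grind size for a 10 g aliquot
te = total_error(0.15, 0.15)          # field stage + lab sub-sampling stage

print(f"FE, 1 g from 4 mm material: {100 * fe_direct:.0f}%")
print(f"field mass for 15% FE:      {field_mass:.0f} g")
print(f"grind size, 1 g aliquot:    {10 * d_1g:.1f} mm")
print(f"grind size, 10 g aliquot:   {10 * d_10g:.1f} mm")
print(f"total error, two stages:    {100 * te:.0f}%")
```

Each sampling stage contributes its own fundamental error, so holding both stages to 15% still leaves about 21% overall.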


38 of 45

Example (cont)

Option 1
– take at least 64 grams and grind to 0.1 cm
– analyze one gram

Option 2
– take at least 64 grams and grind to 0.22 cm
– analyze 10 grams

Other options
– investigate/estimate sampling factors (clfg)


39 of 45

Multi-increment Sampling

• Saves money by taking fewer samples to make a decision
• Eliminates the classical statistics obstacles
• Samples are representative of population
• Results are defensible
• Does not excite the public
• Faster
• Cheaper

40 of 45

Key Points

• All measurements are an average
• In discrete sampling,
  – the sample average is a random variable
  – the sample range is a random variable
  – the sample UCL is a random variable
  – the sample standard deviation is a random variable
• In discrete sampling, the SD is an artifact of the sample collection process
• Heterogeneity is the rule
• Multi-increment sampling can save your butt!
• Multi-increment sampling can get you defensible data within your sampling & analyses budget

41 of 45

Key Points (cont.)

• Due to inherent heterogeneity, collecting a representative sample is difficult
• Managing Uncertainty approach and “Ramsey’s Rules” advocate
  – using cheaper, real-time, on-site methods
  – increasing sample density or coverage
• Controlling laboratory analysis quality does not control all error
• Errors occur in each step of the collection and analysis process

42 of 45

Key Points (cont.)

• Managing Uncertainty approach encourages use of DWP to provide flexibility to obtain sufficient sample density
• The larger the “mass”, the lower the sampling error
• The smaller the “particle”, the lower the sampling error
• Proper sub-sampling is critical
• Sample design must assess the normal, skewed, and badly skewed distributions
• For badly skewed distributions, computer simulations are needed
• Multi-increment samples drive the distribution to normal

43 of 45

How Many Samples do I Need?

REMEMBER:

HETEROGENEITY

IS THE RULE!

44 of 45

Summary

Use Classical Statistical sampling approach:
• Very likely to fail to get representative data in most cases

Use Other Statistical sampling approaches:
• Bayesian
• Geo-statistics
• Kriging

Use M-Cubed Approach: Based on Massive FAM

Use Multi-Increment sampling approach:
• Can use classical statistics
• Cheaper
• Faster
• Defensible: restricted to surfaces (soils, sediments, etc.)

MASSIVE DATA Required

45 of 45

End of Module 6

Thank you

Questions? Comments?

This concludes our presentation for Day 1. See you here at 8:30 AM tomorrow for Day 2.
