Dec 30, 2015
1 of 45
How Many Samples do I Need? Part 3
Presenter: Sebastian Tindall
60 minutes
DQO Training Course, Day 1
Module 6
2 of 45
How Many Samples do I Need?
REMEMBER:
HETEROGENEITY
IS THE RULE!
3 of 45
Sampling for Environmental Activities
Chuck Ramsey
EnviroStat, Inc.
PO Box 636
Fort Collins, CO 80522
970-689-5700
970-229-9977 fax
www.envirostat.org
4 of 45
Sampling for
Analytical Purposes
Pierre Gy
Translated by A.G. Royle
John Wiley & Sons
1998
ISBN: 0-471-97956-2
5 of 45
Pierre Gy’s Sampling Theory and
Sampling Practices, 2nd Edition
Francis F. Pitard
CRC Press
1993
ISBN: 0-8493-8917-8
Heterogeneity, Sampling Correctness, and Statistical Process Control
6 of 45
Seven Major Sampling Errors
• Fundamental Error - FE
• Grouping and Segregation Error - GSE
• Materialization Error - ME
– Delimitation Error - DE
– Extraction Error - EE
• Preparation Error - PE
• Trends - CE2
• Cycles - CE3
7 of 45
Seven Major Sampling Errors
SE = FE + GSE + DE + EE + PE + CE2 + CE3
8 of 45
Ramsey’s “Rules”
• All measurements are an average
• With discrete sampling, the sample average is a random variable
• With discrete sampling, the sample SD is an artifact of the sample collection process
9 of 45
Ramsey’s “Rules”
• Heterogeneity is the rule
• Multi-increment sampling can drive a skewed distribution towards normal (by invoking the Central Limit Theorem, CLT)
• FE²
– proportional to the cube of particle size
– inversely proportional to mass
• Lab data are suspect (error can be large)
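The CLT claim above can be checked with a small simulation. This is a minimal sketch, not part of the original course material: it assumes a hypothetical lognormal concentration field (mu = 0, sigma = 1) and compares the skewness of single-grab results with the skewness of results that each average k = 50 increments. The function names `skewness` and `simulate` are illustrative.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: the mean cubed z-score (population form)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def simulate(k, trials=5000, seed=1):
    """Return `trials` sample results, each the mean of k increments."""
    rng = random.Random(seed)
    # Assumption: a skewed (lognormal) concentration field, mu=0, sigma=1.
    return [
        statistics.fmean([rng.lognormvariate(0.0, 1.0) for _ in range(k)])
        for _ in range(trials)
    ]


discrete = simulate(k=1)   # one grab per sample
multi = simulate(k=50)     # 50 increments combined per sample

print(f"skewness, discrete grabs:   {skewness(discrete):.2f}")
print(f"skewness, multi-increment:  {skewness(multi):.2f}")
```

Under these assumptions the multi-increment results should be far less skewed than the single grabs, which is the practical meaning of "driving the distribution towards normal".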
10 of 45
Ramsey’s “Rules” (cont.)
• Good sampling technique is critical
• Typical sample sizes will underestimate the mean
• Quality control (QC) is important
– NO boiler plate (e.g., PARCC)
– QC must be problem specific
• Maximize the use of onsite analysis to guide planning & decisions
• DQOs are the most important component of the process
11 of 45
Ramsey’s “Rules” (cont.)
• One measurement is a crap shoot:
– Tremendous heterogeneity (variability) between:
  Particles within a sample
  Aliquots of a sample
  Duplicate samples
• Never take ONE grab sample to base a decision
– Always collect X increments and use AT LEAST one multi-increment sample to make the decision
12 of 45
Average Exposure
In discrete sampling:
• the sample mean is a random variable.
• the 95% UCL is a random variable.
• the sample range is a random variable.
• the sample standard deviation is a random variable.
• the sample standard deviation is an artifact of the sample collection process.
• n (# samples) is NOT proportional to the size of the population (e.g., area, mass, or volume).
13 of 45
Average A = 16 ppm
Average B = 221 ppm
Average from discrete sampling is a random variable
[Figure: sample locations labeled A and B scattered across the site]
Average depends on locations sampled
14 of 45
Hot Spots
• 1,000,000 g at site
• 100,000 g > AL
• Take 10 samples
• 1 > AL
• Remove that 1
• Re-sample = clean
• Wrong!
• If 100,000 > AL
• Minus 1
• Still 99,999 > AL
AL = action level
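The arithmetic on this slide can be sketched as follows. The assumptions are not in the original: each sample is treated as a 1 g unit (implied by "Minus 1"), and samples are treated as independent random draws from the site.

```python
site_mass_g = 1_000_000   # total mass at the site
hot_mass_g = 100_000      # mass above the action level (AL)
n_samples = 10

# Expected number of the 10 samples that exceed the AL.
hot_fraction = hot_mass_g / site_mass_g
expected_hits = n_samples * hot_fraction

# Remove the single 1 g sample that exceeded the AL.
remaining_hot_g = hot_mass_g - 1
remaining_site_g = site_mass_g - 1
remaining_fraction = remaining_hot_g / remaining_site_g

# Chance that a second round of 10 samples finds nothing above the AL,
# even though almost all of the contaminated mass is still in place.
p_all_clean = (1 - remaining_fraction) ** n_samples

print(f"expected hits in first round: {expected_hits:.1f}")
print(f"mass still above AL:          {remaining_hot_g:,} g")
print(f"P(re-sample looks clean):     {p_all_clean:.0%}")
```

Under these assumptions a "clean" re-sample happens roughly one time in three by chance alone, so it says little about whether the hot material was removed.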
15 of 45
Hot Spots Simply Means: “I want to look at units (e.g., mass, volume) that are becoming
smaller and smaller and
smaller and smaller and
smaller and smaller and
smaller”
$ $ $ $ $ $ $ $ $
16 of 45
Additional Population Considerations
• Sample support - “physical size, shape and orientation of the material that is extracted from the sampling unit that is actually available to be measured or observed, and therefore, to represent the sampling unit.”
• Assure enough sample for analyses
• Specify how the sample support will be processed and sub-sampled for analysis.
EPA Guidance on Choosing a Sampling Design for Environmental Data Collection, EPA QA/G-5S, December 2002, EPA/240/R-02/005
17 of 45
Sub-Sampling
• The DQO must define what represents the population in terms of laboratory sample size:
• Typical laboratory sample sizes that are digested or extracted: metals - 1g, volatiles - 5g, semi-volatiles - 30 g
• The 1g or 30g sample analyzed by the lab is supposed to represent a larger area/mass (e.g., acre). Does it?
18 of 45
Multi-Increment Sampling is the Way to Go
Next slides show “How to” perform multi-increment
sampling
19 of 45
n = m * k
[Diagram: samples grouped into m = 2 multi-increment samples of k = 3 increments each, then sent to FAM/Laboratory]
• Collect “n” samples
• Group into “k” increments
• Combine “k” into “m” multi-increments
Remember: we want the AVERAGE over the Decision Unit
20 of 45
Multi-Increment Sampling
n = number of samples required
k = increments
m = samples analyzed

n = m * k          Mass of m   Mass of k   Total Mass sent to lab
100 = 100 * 1      1 kg        1000 g      100 kg
100 = 50 * 2       1 kg        500 g       50 kg
100 = 25 * 4       1 kg        250 g       25 kg
100 = 20 * 5       1 kg        200 g       20 kg
100 = 10 * 10      1 kg        100 g       10 kg
100 = 5 * 20       1 kg        50 g        5 kg
100 = 4 * 25       1 kg        40 g        4 kg
100 = 2 * 50       1 kg        20 g        2 kg
100 = 1 * 100      1 kg        10 g        1 kg
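The table follows directly from n = m * k with each multi-increment sample fixed at 1 kg. A minimal sketch that regenerates it; the function name `split_options` and the `sample_mass_g` parameter are illustrative, not from the course:

```python
def split_options(n, sample_mass_g=1000):
    """List every way to split n increments into m samples of k increments.

    Returns tuples (m, k, mass per increment in g, total mass to lab in kg),
    assuming each multi-increment sample weighs `sample_mass_g` grams.
    """
    rows = []
    for m in range(n, 0, -1):
        if n % m == 0:
            k = n // m
            rows.append((m, k, sample_mass_g / k, m * sample_mass_g / 1000))
    return rows


for m, k, increment_g, lab_kg in split_options(100):
    print(f"100 = {m:>3} * {k:<3}  increment = {increment_g:>6.0f} g  "
          f"to lab = {lab_kg:>5.0f} kg")
```

The total number of increments collected in the field stays at 100 in every row; only the number of analyses (m) and the mass shipped to the lab change.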
21 of 45
Multi-Increment Sampling is the Way to Go
[Flow diagram; numbers in parentheses refer to the steps listed on the next two slides]
exposure unit = decision unit [DU], increments marked x (1)
→ calc d & FE & mass (2,3,4)
→ 10 scoops (6)
→ Samples & QC (7)
→ Lab (8)
→ Re-calculate particle size (9)
→ Grind (10)
→ Sub-sample mass for lab analysis (11)
→ Analyze entire sub-sample (12)
→ Average concentration for DU (13,14)
22 of 45
Multi-Increment Sampling is the Way to Go
1. Agree on exposure unit or decision unit.
2. Select or measure a reasonable maximum sample particle size.
3. Select the FE.
4. Calculate the mass of sample needed based on the FE and particle size.
5. Select n, m, & k.
6. Using a square scoop large enough to capture the maximum particle size, collect enough sample increments (k) to equal the mass calculated in #4 and place in a jar, combining increments into one “sample”.
7. Repeat within a given decision unit to produce replicates (duplicates, triplicates, etc.) to generate QC “samples”.
8. Deliver the sample and QC sample(s) to the lab (m).
23 of 45
Multi-Increment Sampling is the Way to Go, continued
9. Calculate the particle size of sample needed based on the desired sub-sampling FE and the mass that the lab normally uses for a given analysis (extraction).
10. Lab may have to grind entire mass of field sample (& QCs) to the agreed upon maximum analytical particle size in #9.
11. Lab must perform one-dimensional sub-sampling of entire mass [spread entire ground sample on flat surface in thin layer, then systematically or randomly collect sufficient small mass sub-sampling increments to equal the mass the laboratory requires for an analysis; do likewise for each QC sample].
12. Combine sub-sampling increments into the “sample”, then digest/extract/analyze the sample and QC samples.
13. Calculate the COPC concentration from each sample.
14. Concentration represents average concentration or activity per decision unit.
24 of 45
Comparison of Discrete vs. Multi-Increment
Discrete                  Multi-Increment
n    Avg     SD           n     m    k    Avg    SD
5    34.2    51.4         250   5    50   68.9   145.8
5    48.0    40.7         250   5    50   72.4   153.9
5    9.6     12.9         250   5    50   74.6   150.3
5    291.8   331.0        250   5    50   66.5   157.1

Remember (in discrete sampling):
1. An average is a random variable;
2. The SD is an artifact of the sample collection process.
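The pattern in the comparison (discrete averages swinging widely, multi-increment averages staying close together) can be illustrated with a simulation. This sketch does not reproduce the table's data: the decision unit is a hypothetical population of 10,000 lognormally distributed concentrations, and the function names are illustrative.

```python
import random
import statistics

rng = random.Random(42)

# Assumption: a skewed decision unit of 10,000 cells (lognormal values).
site = [rng.lognormvariate(3.0, 1.5) for _ in range(10_000)]


def discrete_average(n=5):
    """Average of n discrete grab samples."""
    return statistics.fmean(rng.sample(site, n))


def multi_increment_average(m=5, k=50):
    """Average of m samples, each the physical average of k increments."""
    samples = [statistics.fmean(rng.sample(site, k)) for _ in range(m)]
    return statistics.fmean(samples)


campaigns = 1000  # repeat each sampling campaign many times
discrete = [discrete_average() for _ in range(campaigns)]
multi = [multi_increment_average() for _ in range(campaigns)]

print(f"true site mean:            {statistics.fmean(site):.1f}")
print(f"discrete avg, spread (SD): {statistics.stdev(discrete):.1f}")
print(f"multi-inc avg, spread (SD): {statistics.stdev(multi):.1f}")
```

Both designs analyze five samples per campaign, but the multi-increment average is built from 250 increments, so it varies much less from one campaign to the next.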
25 of 45
SHOW VDT File
X-bar as Random Variable
26 of 45
Effects of Grinding a Soil
Rep       2-g Not Ground   50-g Not Ground   2-g Ground   50-g Ground
1         0.39             0.25              2.33         2.03
2         0.48             1.81              2.25         2.04
3         0.37             0.37              2.22         2.00
4         0.41             1.48              2.28         2.03
5         28.61            7.93              2.15         1.97
6         0.48             0.56              2.15         2.00
7         0.45             0.35              2.15         1.90
8         0.68             0.75              2.17         2.02
9         0.77             0.56              2.00         1.97
10        1.08             0.35              1.98         1.98
11        0.77             0.62              2.10         1.90
12        0.47             5.62              1.96         1.91
mean      2.91             1.72              2.15         1.98
std dev   8.09             2.46              0.12         0.05
RSD       278%             143%              5.50%        2.57%

Walsh, Marianne E.; Ramsey, Charles A.; Jenkins, Thomas F., The Effect of Particle Size Reduction by Grinding on Subsampling Variance for Explosives Residues in Soil, Chemosphere 49 (2002) 1267-1273.
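The summary rows of the table can be recomputed from the twelve replicates. A short sketch using only the values shown on the slide; the dictionary keys and the `summary` name are illustrative:

```python
import statistics

# Replicate results copied from the slide's table.
data = {
    "2-g not ground": [0.39, 0.48, 0.37, 0.41, 28.61, 0.48,
                       0.45, 0.68, 0.77, 1.08, 0.77, 0.47],
    "50-g not ground": [0.25, 1.81, 0.37, 1.48, 7.93, 0.56,
                        0.35, 0.75, 0.56, 0.35, 0.62, 5.62],
    "2-g ground": [2.33, 2.25, 2.22, 2.28, 2.15, 2.15,
                   2.15, 2.17, 2.00, 1.98, 2.10, 1.96],
    "50-g ground": [2.03, 2.04, 2.00, 2.03, 1.97, 2.00,
                    1.90, 2.02, 1.97, 1.98, 1.90, 1.91],
}

summary = {}
for label, values in data.items():
    mean = statistics.mean(values)
    sd = statistics.stdev(values)      # sample standard deviation (n - 1)
    rsd_percent = 100 * sd / mean      # relative standard deviation
    summary[label] = (mean, sd, rsd_percent)
    print(f"{label:<16} mean={mean:5.2f}  sd={sd:5.2f}  RSD={rsd_percent:6.1f}%")
```

A single replicate (28.61 in the unground 2 g column) dominates both the mean and the SD of that column, which is the sub-sampling variability that grinding removes.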
27 of 45
Fundamental Error
FE = fundamental error
M = mass of sample (g)
d = maximum particle size, <5% oversize (cm)

$FE^2 \approx \frac{22.5\, d^3}{M}$

EPA/600/R-92/128, July 1992
28 of 45
Fundamental Error
$22.5 \approx c\,l\,f\,g$
c - mineralogical factor
- density factor (for soil ~ 2.5)
l - liberation factor (between 0 and 1)
f - shape factor (for soil ~ 0.5)
g - granulometric factor (~ 0.25)

$FE^2 \approx \frac{22.5\, d^3}{M}$
29 of 45
Fundamental Error
Solve for particle size:

$d = \sqrt[3]{\frac{FE^2\, M}{22.5}}$

OR

Solve for mass of sample:

$M = \frac{22.5\, d^3}{FE^2}$
30 of 45
Constant Particle Size

Particle Size - 2 mm
Sample Mass   Approx. FE (%)
1 g           42%
2 g           30%
10 g          13%
30 g          8%

Particle Size - 2.54 cm
Sample Mass   Approx. FE (%)
9217 g        20%
4097 g        30%
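These values follow from FE² ≈ 22.5 d³ / M with d in cm and M in g. A minimal sketch that recomputes them; the function name `fundamental_error` is illustrative, and the constant 22.5 is the soil default given on the Fundamental Error slides:

```python
import math

C = 22.5  # lumped sampling constant for soil, from the Fundamental Error slides


def fundamental_error(d_cm, mass_g):
    """Relative fundamental error (as a fraction) for particle size d and mass M."""
    return math.sqrt(C * d_cm ** 3 / mass_g)


# Particle size 2 mm = 0.2 cm
for mass_g in (1, 2, 10, 30):
    fe = fundamental_error(0.2, mass_g)
    print(f"d = 0.2 cm,  M = {mass_g:>4} g  ->  FE ~ {fe:.0%}")

# Particle size 2.54 cm
for mass_g in (9217, 4097):
    fe = fundamental_error(2.54, mass_g)
    print(f"d = 2.54 cm, M = {mass_g:>4} g  ->  FE ~ {fe:.0%}")
```

The contrast between the two particle sizes is the point: at 2.54 cm, kilograms of material are needed to reach the same FE that a few grams achieve at 2 mm.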
31 of 45
Examples of FE, Mass, Particle Size
Mass    Particle Size (FE ~ 20%)   Particle Size (FE ~ 30%)
1 g     0.12 cm                    0.16 cm
2 g     0.16 cm                    0.20 cm
10 g    0.26 cm                    0.35 cm
30 g    0.38 cm                    0.50 cm
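The particle sizes in this table come from solving FE² ≈ 22.5 d³ / M for d. With M in grams, the formula returns d in centimeters (the unit defined on the Fundamental Error slide), so the tabulated numbers correspond to centimeters, roughly 1 to 5 mm. A short sketch; the function name `max_particle_size_cm` is illustrative:

```python
C = 22.5  # lumped sampling constant for soil, from the Fundamental Error slides


def max_particle_size_cm(fe, mass_g):
    """Largest particle size (cm) that keeps the fundamental error at `fe`."""
    return (fe ** 2 * mass_g / C) ** (1 / 3)


print("Mass    d at FE 20%   d at FE 30%")
for mass_g in (1, 2, 10, 30):
    d20 = max_particle_size_cm(0.20, mass_g)
    d30 = max_particle_size_cm(0.30, mass_g)
    print(f"{mass_g:>3} g   {d20:.2f} cm       {d30:.2f} cm")
```

The computed values agree with the slide to within rounding; a few of the slide's entries appear to be rounded up slightly.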
32 of 45
Examples of FE, Mass, Particle Size
May not work well or at all with some media
•Clay
•Water
•Air
33 of 45
Example
• Soil-like material
• Largest particle about 4 mm
• Action limit is 500 ppm
• Analytical aliquot is one gram
• Is this acceptable?
Compliments of EnviroStat, Inc.
34 of 45
Example (cont)
Check particle size representativeness

$S_{FE}^2 = \frac{22.5\, d^3}{M_S} = \frac{(22.5)(0.4)^3}{1} = 1.44$

$FE = \sqrt{S_{FE}^2} = \sqrt{1.44} = 1.2$

FE percent = 1.2 * 100 = 120%

Compliments of EnviroStat, Inc.
EPA/600/R-92/128, July 1992
35 of 45
Example (cont)
What mass is required to reduce FE to 15%?

$M_S = \frac{22.5\,(0.4)^3}{(0.15)^2} = 64\ \text{g}$

But lab can analyze 10 grams at the most
Compliments of EnviroStat, Inc.
36 of 45
Example (cont)
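To what particle size does the sample need to be reduced to achieve FE of 15%? (C = 22.5)

For a 1 g aliquot:
$d = \sqrt[3]{\frac{FE^2\, M_S}{C}} = \sqrt[3]{\frac{(0.15)^2 (1)}{22.5}} = 0.1\ \text{cm} = 1\ \text{mm}$

For a 10 g aliquot:
$d = \sqrt[3]{\frac{FE^2\, M_S}{C}} = \sqrt[3]{\frac{(0.15)^2 (10)}{22.5}} \approx 0.22\ \text{cm} = 2.2\ \text{mm}$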
Compliments of EnviroStat, Inc.
37 of 45
Example (cont)
What is the total error (TE) from taking 64 grams, grinding it to 0.1 cm, and taking one gram?
Ignoring all the other errors:

$TE = \sqrt{FE_1^2 + FE_2^2} = \sqrt{(0.15)^2 + (0.15)^2} \approx 21\%$
Compliments of EnviroStat, Inc.
38 of 45
Example (cont)
Option 1
– take at least 64 grams and grind to 0.1 cm
– analyze one gram
Option 2
– take at least 64 grams and grind to 0.22 cm
– analyze 10 grams
Other options
– investigate/estimate sampling factors (clfg)
Compliments of EnviroStat, Inc.
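The whole worked example (4 mm soil, 1 g aliquot, 15% target) can be retraced in a few lines. This is a sketch under the slides' own simplification FE² ≈ 22.5 d³ / M; the function names are illustrative, and combining the two stages in quadrature follows the total-error slide.

```python
import math

C = 22.5  # lumped sampling constant for soil, from the Fundamental Error slides


def fe(d_cm, mass_g):
    """Fundamental error (fraction) for particle size d (cm) and mass M (g)."""
    return math.sqrt(C * d_cm ** 3 / mass_g)


def mass_needed_g(d_cm, fe_target):
    """Sample mass (g) needed to reach fe_target at particle size d (cm)."""
    return C * d_cm ** 3 / fe_target ** 2


def particle_size_cm(fe_target, mass_g):
    """Particle size (cm) needed to reach fe_target with mass M (g)."""
    return (fe_target ** 2 * mass_g / C) ** (1 / 3)


fe_as_is = fe(0.4, 1)                     # 1 g aliquot of 4 mm material
field_mass = mass_needed_g(0.4, 0.15)     # field sample for FE = 15%
d_for_1g = particle_size_cm(0.15, 1)      # Option 1 grind size
d_for_10g = particle_size_cm(0.15, 10)    # Option 2 grind size
total_error = math.hypot(0.15, 0.15)      # two 15% stages in quadrature

print(f"FE as-is:               {fe_as_is:.0%}")
print(f"field mass for 15%:     {field_mass:.0f} g")
print(f"grind size, 1 g aliquot:  {d_for_1g:.2f} cm")
print(f"grind size, 10 g aliquot: {d_for_10g:.2f} cm")
print(f"total error, two stages: {total_error:.0%}")
```

Both options end with the same total error of about 21%, because each has one 15% field stage and one 15% sub-sampling stage; they differ only in how finely the lab must grind.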
39 of 45
Multi-increment Sampling
• Saves money by taking fewer samples to make decision
• Eliminates the classical statistics obstacles
• Samples are representative of population
• Results are defensible
• Does not excite the public
• Faster
• Cheaper
40 of 45
Key Points
• All measurements are an average
• In discrete sampling,
– the sample average is a random variable
– the sample range is a random variable
– the sample UCL is a random variable
– the sample standard deviation is a random variable
• In discrete sampling, the SD is an artifact of the sample collection process
• Heterogeneity is the rule
• Multi-increment sampling can save your butt!
• Multi-increment sampling can get you defensible data within your sampling & analyses budget
41 of 45
Key Points (cont.)
• Due to inherent heterogeneity, collecting a representative sample is difficult
• Managing Uncertainty approach and “Ramsey’s Rules” advocate
– using cheaper, real-time, on-site methods
– increasing sample density or coverage
• Controlling laboratory analysis quality does not control all error
• Errors occur in each step of the collection and analysis process
42 of 45
Key Points (cont.)
• Managing Uncertainty approach encourages use of DWP to provide flexibility to obtain sufficient sample density
• The larger the “mass”, the lower the sampling error
• The smaller the “particle”, the lower the sampling error
• Proper sub-sampling is critical
• Sample design must assess the normal, skewed, and badly skewed distributions
• For badly skewed distributions, computer simulations are needed
• Multi-increment samples drive the distribution to normal
43 of 45
How Many Samples do I Need?
REMEMBER:
HETEROGENEITY
IS THE RULE!
44 of 45
Summary
Use Classical Statistical sampling approach:
• Very likely to fail to get representative data in most cases
Use Other Statistical sampling approaches:
• Bayesian
• Geo-statistics
• Kriging
Use M-Cubed Approach: Based on Massive FAM
Use Multi-Increment sampling approach:
• Can use classical statistics
• Cheaper
• Faster
• Defensible: restricted to surfaces (soils, sediments, etc.)
MASSIVE DATA Required
45 of 45
End of Module 6
Thank you
Questions? Comments?
This concludes our presentation for Day 1.
See you here at 8:30 AM tomorrow for Day 2.