Dec 19, 2015
Monroe L. Weber-Shirk
School of Civil and
Environmental Engineering
“BAD” DATA
Overview
  Bad Data
  Learning from unexpected results
  Isotherm Analysis
  Research
Sources of “Bad” Data
Error in preparation of samples
  mass or volume measurement error
  contamination
  improper storage
  sample substitution
  sample loss
  samples with high heterogeneity
Apparatus failures
  leaks
  incompatible materials
  inadequate control of an important parameter
Instrument Errors
  detector malfunction
  below detection limit or above maximum
  interference
  software (instrument or computer)
  hardware (analog to digital converter, power supply, ...)
  calibration
More sources of “Bad Data”
Error in data analysis
  numerical error (data entry)
  units (classic errors of factors of 10 and factors of 1000)
  incorrectly applied theory
Error in theory
Bad Data aren’t Bad!
“Bad” data usually means the results were unexpected, perhaps unorthodox!
Copernicus, “Concerning the Revolutions of the Celestial Bodies”, 1543
  On the Papal Index of forbidden books until 1835: bad data for 292 years!
Data do not lie!
  Data always mean something
  If you ignore data that you don’t understand you are missing an opportunity to learn
Unexpected Results
Lack of repeatability (poor precision)
  scatter for all data
  outlier
  systematic error
[Figure: three plots of concentration (mg/L) versus distance from source (m), each comparing measured concentration with expected concentration]
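The three patterns on this slide (scatter, an outlier, a consistent offset) can often be told apart by looking at residuals, that is, measured minus expected values. The following is a minimal sketch only: the function name, the example data, and the outlier rule (a modified z-score based on the median absolute deviation, with a cutoff of 3.5) are assumptions chosen for illustration, not part of the lecture.

```python
import statistics

def summarize_residuals(measured, expected, outlier_z=3.5):
    """Summarize how measured values deviate from expected values.

    Hypothetical helper: the 3.5 cutoff and the 1.4826 scaling of the
    median absolute deviation (MAD) are illustrative assumptions.
    """
    residuals = [m - e for m, e in zip(measured, expected)]
    bias = statistics.mean(residuals)      # consistent offset suggests systematic error
    scatter = statistics.stdev(residuals)  # spread suggests poor precision
    median = statistics.median(residuals)
    mad = statistics.median([abs(r - median) for r in residuals])
    outliers = []
    if mad > 0:  # with zero MAD the rule is undefined, so flag nothing
        limit = outlier_z * 1.4826 * mad
        outliers = [i for i, r in enumerate(residuals) if abs(r - median) > limit]
    return {"bias": bias, "scatter": scatter, "outliers": outliers}

# Made-up concentrations (mg/L) at five distances; the fourth point is suspect.
expected = [10.0, 8.0, 6.0, 4.0, 2.0]
measured = [10.1, 7.9, 6.1, 9.0, 2.0]
print(summarize_residuals(measured, expected))
```

A large bias with small scatter points toward a systematic error, while a small bias with large scatter points toward poor precision.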
Unexpected Results
Inconsistent with theory
  mass balances indicate loss or gain of mass
Inconsistent with previous results
[Figure: parameter B versus parameter A, comparing the measured trend with the expected trend]
Responses to Unexpected Results
Determine accuracy of technique by analyzing known samples
Determine precision of technique by analyzing replicates
Evaluate propagation of errors through analysis
  Are you trying to measure the difference between two large numbers?
  Is the precision of the measurement similar to the magnitude of the estimate?
Are you not controlling an important parameter?
Is the parameter that you are studying insignificant?
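The first two responses above can be put into numbers. This is a sketch under assumptions: accuracy is taken as the relative error of the replicate mean against a known sample, and precision as the relative standard deviation of the replicates. The function name and the example values are hypothetical.

```python
import statistics

def accuracy_and_precision(replicates, known_value):
    """Return (relative_error, relative_standard_deviation).

    relative_error: (mean of replicates - known value) / known value
    relative_standard_deviation: sample standard deviation / mean
    """
    mean = statistics.mean(replicates)
    relative_error = (mean - known_value) / known_value
    relative_sd = statistics.stdev(replicates) / mean
    return relative_error, relative_sd

# Made-up replicate analyses of a known 10.0 mg/L standard.
rel_err, rel_sd = accuracy_and_precision([9.8, 10.0, 10.2], known_value=10.0)
print(f"relative error: {rel_err:.3f}, relative standard deviation: {rel_sd:.3f}")
```

If the relative standard deviation is comparable to the size of the effect being estimated, the measurement cannot resolve that effect.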
Isotherm Analysis Pointers
Units
  Express mass of VOC in grams
  Express concentrations as g per mL
  Remember GC injection volume was 0.1 mL
Use names to keep track of parameters in spreadsheet
Build sheet from left to right
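The unit pointers above can be sketched as a single named conversion. This assumes, purely for illustration, that the GC calibration reports the mass of VOC in the injected sample in nanograms; the function name and that input unit are hypothetical, and only the 0.1 mL injection volume comes from the slide.

```python
GC_INJECTION_VOLUME_ML = 0.1  # injection volume stated on the slide
GRAMS_PER_NANOGRAM = 1e-9

def gas_concentration_g_per_ml(mass_detected_ng):
    """Convert mass detected in one injection (ng, assumed unit) to g/mL."""
    mass_g = mass_detected_ng * GRAMS_PER_NANOGRAM  # express mass in grams first
    return mass_g / GC_INJECTION_VOLUME_ML          # then divide by injected volume

# Made-up detector result of 50 ng in the 0.1 mL injection.
print(gas_concentration_g_per_ml(50.0))
```

Keeping the conversion factors as named constants mirrors the advice to use names in the spreadsheet, and makes factor-of-1000 mistakes easier to spot.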
More Pointers
Soil density = 1.6 g/mL
Soil moisture content is 13%
Soil mass was close to 20 g
Analyze the 4 data sets as sets
  Use the data from one group to calculate a single value for each parameter
  You will get 4 estimates for each parameter
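The soil values above translate into volumes, and the four group estimates can be summarized by a mean and a standard deviation. This sketch rests on assumptions the slide does not state: the 13% moisture content is treated as a fraction of the weighed soil mass, the density of water is taken as 1.0 g/mL, and the four parameter estimates are invented placeholders. The function names are hypothetical.

```python
import statistics

SOIL_DENSITY_G_PER_ML = 1.6    # from the slide
SOIL_MOISTURE_FRACTION = 0.13  # from the slide; mass basis is an assumption
WATER_DENSITY_G_PER_ML = 1.0   # assumption

def soil_volumes(soil_mass_g=20.0):
    """Return (soil_volume_mL, water_volume_mL) for a weighed soil sample."""
    soil_volume_ml = soil_mass_g / SOIL_DENSITY_G_PER_ML
    water_mass_g = soil_mass_g * SOIL_MOISTURE_FRACTION
    return soil_volume_ml, water_mass_g / WATER_DENSITY_G_PER_ML

def summarize_estimates(estimates):
    """Return (mean, sample standard deviation) of per-group estimates."""
    return statistics.mean(estimates), statistics.stdev(estimates)

print(soil_volumes(20.0))
# Placeholder values standing in for one parameter estimated by 4 groups.
print(summarize_estimates([1.0, 2.0, 3.0, 4.0]))
```

Use the measured soil mass for each bottle rather than the nominal 20 g when the actual value is available.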
[Equations: VOC mass balance across the gas, liquid, and soil phases (surviving symbols: M_VOC, C, V, H, K_sc; subscripts G, L, S)]
spreadsheet
Why is a small fs a measurement problem?
EPICS Error Analysis
[Figure labels: water; low solubility; high solubility]
Assume 10% error in measuring gas concentrations
What are the maximum and minimum values of mass in liquid phase?
Low solubility: (10 ± 1) – (9 ± 0.9)
  maximum: 11 – 8.1 = 2.9
  minimum: 9 – 9.9 = -0.9
High solubility: (10 ± 1) – (1 ± 0.1)
  maximum: 11 – 0.9 = 10.1
  minimum: 9 – 1.1 = 7.9
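The worst-case arithmetic above can be reproduced for any pair of gas-phase values. This is a minimal sketch assuming, as on the slide, that each value carries the same relative error; the function name is hypothetical.

```python
def difference_bounds(first, second, relative_error=0.10):
    """Return (minimum, maximum) of first - second.

    Each value may be off by +/- relative_error of itself; the extremes
    pair the low end of one value with the high end of the other.
    """
    first_low = first * (1 - relative_error)
    first_high = first * (1 + relative_error)
    second_low = second * (1 - relative_error)
    second_high = second * (1 + relative_error)
    return first_low - second_high, first_high - second_low

for label, second in (("low solubility", 9.0), ("high solubility", 1.0)):
    low, high = difference_bounds(10.0, second)
    print(f"{label}: minimum {low:.1f}, maximum {high:.1f}")
```

In the low solubility case the nominal difference is 1, but the possible range runs from about -0.9 to 2.9, so the uncertainty exceeds the estimate itself. In the high solubility case the nominal difference of 9 stays within about 7.9 to 10.1.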