Presentations in this series
1. Overview and Randomization
2. Self-matching
3. Proxies
4. Intermediates
5. Instruments
6. Equipoise
Avoiding Bias Due to Unmeasured Covariates
Alec Walker
Dec 28, 2015
A textbook definition from econometrics.
Let
O be an outcome (either T, treatment, or D, disease),
P be a proxy,
X be an unmeasured covariate.
P is a proxy for X with respect to O if the distribution of O given P is identical to the distribution of O given P and X.
Which is to say that X adds no information about O, if you know P.
Note that O, P and X could all be multidimensional, that is, vectors of outcomes, proxies and unmeasured covariates, respectively. This definition could also be conditioned on other, measured covariates.
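In symbols, the definition can be read as a conditional-independence statement. The notation below ($\Pr$ for a distribution, $\perp\!\!\!\perp$ for independence) is an assumption of this sketch rather than something shown on the slide:

```latex
\Pr(O \mid P, X) = \Pr(O \mid P)
\quad\Longleftrightarrow\quad
O \perp\!\!\!\perp X \mid P
```

With additional measured covariates, written here as a hypothetical vector $C$, the same statement would condition on $C$ throughout: $\Pr(O \mid P, X, C) = \Pr(O \mid P, C)$.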
Proxy variables are
– Correlates of an unmeasured covariate
– That are useful to the extent that they capture the influence of the unmeasured covariate on a third characteristic
Control for a proxy replaces control for the unmeasured covariate.
Examples of proxies
Interview responses may be proxies for
– Historical measurements (diet, smoking, alcohol …)
– Internal states
– Genetic traits
Biological markers are proxies for biological processes.
Age, sex, SES are stand-ins for their many correlates.
In diabetics, retinal vascular disease is a proxy for vascular disease more generally and is easily ascertained by funduscopic examination. In looking at determinants of myocardial infarction, control for retinal vascular disease could represent control for coexisting vascular pathology.
Early diabetic retinopathy (figure label: microaneurysms). Source: US Department of Veterans Affairs
Advanced diabetic retinopathy. Source: US Department of Veterans Affairs
https://www.myhealth.va.gov/mhv-portal-web/anonymous.portal?_nfpb=true&_pageLabel=commonConditions&contentPage=va_health_library/diabetic_retinopathy_advanced_info.html
Causal diagram. Nodes: Thiazolidinediones for diabetes; Acute myocardial infarction; Coronary artery disease; (Unmeasured) Severity of Diabetes; Retinal vascular disease; UT; UD.
Without mechanistic information, for each of these situations,
– covariate causes proxy
– proxy causes covariate
– both caused by a third factor
… the proxy looks like a transformation of the predictor, with added error.
Proxy value = f(Predictor value) + error
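The relation "Proxy value = f(Predictor value) + error" can be illustrated with a small simulation. This is only a sketch under stated assumptions: the linear `f`, the Gaussian error, and the two error sizes are hypothetical choices made for illustration, not values from the presentation.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def make_proxy(predictor, f, error_sd, rng):
    """Proxy value = f(Predictor value) + error, with Gaussian error (assumed)."""
    return [f(x) + rng.gauss(0.0, error_sd) for x in predictor]

def f(x):
    # Hypothetical transformation chosen for illustration only.
    return 2.0 * x + 1.0

rng = random.Random(0)
predictor = [rng.gauss(0.0, 1.0) for _ in range(5000)]

# Same transformation, two different amounts of added error.
accurate_proxy = make_proxy(predictor, f, error_sd=0.2, rng=rng)
noisy_proxy = make_proxy(predictor, f, error_sd=4.0, rng=rng)

r_accurate = pearson(predictor, accurate_proxy)
r_noisy = pearson(predictor, noisy_proxy)
print(f"accurate proxy: r = {r_accurate:.2f}")
print(f"noisy proxy:    r = {r_noisy:.2f}")
```

The only thing that differs between the two proxies is the size of the error term, which is what separates an accurate proxy from one with substantial random error.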
An accurate proxy
The true value of the unmeasured covariate is a predictor of treatment.
(Figure legend: Treated, Untreated)
The proxy predicts treatment almost as well as does the true value.
The proxy almost perfectly represents the value of the unmeasured covariate.
The proportion of treated among subjects in a particular small range of proxy values
… is the same as the proportion of treated among subjects in the corresponding small range of true values.
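A minimal simulation of this claim follows. It assumes, purely for illustration, that the proxy equals the true value plus a small Gaussian error (so that "corresponding" ranges are the same numeric range) and that treatment follows a logistic model in the true value; none of these specifics come from the slides.

```python
import math
import random

def treated_fraction(values, treated, low, high):
    """Proportion treated among subjects whose value lies in [low, high)."""
    in_range = [t for v, t in zip(values, treated) if low <= v < high]
    return sum(in_range) / len(in_range)

rng = random.Random(1)
n = 100_000
true_value = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Assumed treatment model: probability of treatment rises with the true value.
treated = [rng.random() < 1.0 / (1.0 + math.exp(-2.0 * x)) for x in true_value]
# An accurate proxy: the true value plus a small random error.
proxy = [x + rng.gauss(0.0, 0.05) for x in true_value]

frac_by_proxy = treated_fraction(proxy, treated, 0.9, 1.1)
frac_by_true = treated_fraction(true_value, treated, 0.9, 1.1)
print(f"treated fraction, proxy in [0.9, 1.1):      {frac_by_proxy:.3f}")
print(f"treated fraction, true value in [0.9, 1.1): {frac_by_true:.3f}")
```

Because the error is small, nearly the same subjects fall in the range whether they are selected on the proxy or on the true value, so the two proportions should nearly coincide.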
Proxies with substantial random error
The proxy is still correlated with the unknown measure.
(Figure legend: Treated, Untreated)
Treatment is still associated with higher values of the proxy, but the discrimination is much worse.
The correlation between the two proxy measures is still evident.
Both proxies show poor discrimination between treated and untreated.
The two proxies can be combined into a function that discriminates better than either proxy alone.
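One way to see this is to simulate two proxies that share the same underlying covariate but carry independent errors, and to compare how well each separates treated from untreated subjects. The sketch below uses a simple average as the combining function and the area under the ROC curve (AUC) as the measure of discrimination; both are assumptions chosen for illustration, as are the treatment model and error sizes.

```python
import math
import random

def auc(scores, treated):
    """Chance that a random treated subject outscores a random untreated one
    (Mann-Whitney form of the area under the ROC curve)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    rank_sum = sum(rank for rank, i in enumerate(order, start=1) if treated[i])
    n_treated = sum(treated)
    n_untreated = len(treated) - n_treated
    return (rank_sum - n_treated * (n_treated + 1) / 2) / (n_treated * n_untreated)

rng = random.Random(42)
n = 6000
true_value = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Assumed treatment model: logistic in the true (unmeasured) covariate.
treated = [rng.random() < 1.0 / (1.0 + math.exp(-2.0 * x)) for x in true_value]
# Two proxies of the same covariate, each with substantial independent error.
proxy_a = [x + rng.gauss(0.0, 1.5) for x in true_value]
proxy_b = [x + rng.gauss(0.0, 1.5) for x in true_value]
# Hypothetical combining function: the simple average of the two proxies.
combined = [(a + b) / 2.0 for a, b in zip(proxy_a, proxy_b)]

auc_true = auc(true_value, treated)
auc_a = auc(proxy_a, treated)
auc_b = auc(proxy_b, treated)
auc_combined = auc(combined, treated)
for label, value in [("true value", auc_true), ("proxy A", auc_a),
                     ("proxy B", auc_b), ("combined", auc_combined)]:
    print(f"{label:>10}: AUC = {value:.3f}")
```

Averaging works here because the two errors are independent and partly cancel, while the shared signal does not.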
A textbook definition from econometrics.
Let
O be an outcome (either T, treatment, or D, disease),
P be a proxy,
X be an unmeasured covariate.
P is a proxy for X with respect to O if the distribution of O given P is identical to the distribution of O given P and X.
None of the causal graphs or correlation patterns that we’ve looked at so far produce this behavior, unless the proxy is perfect.
What are the economists talking about?
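The point that an error-laden proxy fails the textbook definition can be checked by simulation: hold the proxy (nearly) fixed and ask whether the true value still predicts treatment. The sketch below assumes a proxy equal to the true value plus Gaussian error and a logistic treatment model; these choices, and the band width, are hypothetical.

```python
import math
import random

def treated_gap_within_proxy_band(error_sd, rng, n=200_000, half_width=0.05):
    """Among subjects whose proxy lies in a narrow band around 0, return the
    treated fraction for true value > 0 minus that for true value <= 0."""
    counts = {True: [0, 0], False: [0, 0]}  # key: x > 0; value: [treated, total]
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        proxy = x + rng.gauss(0.0, error_sd)
        if abs(proxy) > half_width:
            continue  # keep only subjects with (nearly) the same proxy value
        is_treated = rng.random() < 1.0 / (1.0 + math.exp(-2.0 * x))
        cell = counts[x > 0]
        cell[0] += is_treated
        cell[1] += 1
    high, low = counts[True], counts[False]
    return high[0] / high[1] - low[0] / low[1]

# error_sd = 0 makes the proxy identical to the true value (a perfect proxy).
gap_perfect = treated_gap_within_proxy_band(0.0, random.Random(7))
gap_noisy = treated_gap_within_proxy_band(1.0, random.Random(7))
print(f"perfect proxy: gap in treated fraction = {gap_perfect:.3f}")
print(f"noisy proxy:   gap in treated fraction = {gap_noisy:.3f}")
```

With a perfect proxy the gap should be close to zero, so X adds essentially nothing once P is known. With a noisy proxy the gap stays large, so the distribution of T given P differs from the distribution of T given P and X.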
Proxy variables can correspond to different components of a composite predictor
Proxy A = f(Predictor Component A) + error A
Proxy B = f(Predictor Component B) + error B
For example, “Severity of Diabetes.”
Hemoglobin A1C = f(Glucose control last 90 days) + error A
Retinal vascular disease = f(Vascular damage) + error B
Causal diagram. Nodes: Thiazolidinediones for diabetes; Acute myocardial infarction; Coronary artery disease; Retinal vascular disease; Hb A1C; Diabetes Mellitus; UT; UD; UX; UY.
Proxies for components of a composite variable
The proxy measures are uncorrelated with one another.
Proxy A captures more of the distinction.
Proxy B captures none of the distinction between treatments.
(Figure legend: Treated, Untreated)
When you have several candidate proxies for an unmeasured covariate, examine them simultaneously for prediction of the outcome (treatment, disease or both), and retain only those that do.
Measurement error: correlated proxies; keeps all relevant ones.
Proxies for components of a composite unmeasured covariate: uncorrelated proxies; keeps the correct predictor.
Propensity scores (composite multivariate treatment predictors) allow you to account for both settings.
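The idea of examining candidate proxies simultaneously and keeping those that predict treatment might be sketched as follows. Everything specific here is an assumption made for illustration: the simulated data (treatment driven by component A only), the hand-written gradient-ascent logistic fit, and especially the retention rule based on a hypothetical coefficient threshold of 0.2. The fitted probabilities play the role of a propensity score.

```python
import math
import random

def predict(coef, row):
    """Fitted probability of treatment for one subject."""
    z = coef[0] + sum(c * v for c, v in zip(coef[1:], row))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, steps=100, rate=1.0):
    """Logistic regression by plain gradient ascent; returns [intercept, *slopes]."""
    coef = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(coef)
        for row, y in zip(rows, labels):
            resid = y - predict(coef, row)
            grad[0] += resid
            for j, v in enumerate(row, start=1):
                grad[j] += resid * v
        coef = [c + rate * g / len(rows) for c, g in zip(coef, grad)]
    return coef

rng = random.Random(3)
n = 3000
# Two components of a composite unmeasured covariate (simulated, independent).
component_a = [rng.gauss(0.0, 1.0) for _ in range(n)]
component_b = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Assumed truth: only component A influences treatment.
labels = [1.0 if rng.random() < 1.0 / (1.0 + math.exp(-1.5 * a)) else 0.0
          for a in component_a]
# Candidate proxies: each component measured with its own error.
proxy_a = [a + rng.gauss(0.0, 0.5) for a in component_a]
proxy_b = [b + rng.gauss(0.0, 0.5) for b in component_b]
rows = list(zip(proxy_a, proxy_b))

# Examine the candidate proxies simultaneously in one treatment model.
coef = fit_logistic(rows, labels)
names = ["proxy_a", "proxy_b"]
threshold = 0.2  # hypothetical retention rule, not from the presentation
retained = [name for name, c in zip(names, coef[1:]) if abs(c) > threshold]

# The fitted probabilities form a composite treatment predictor.
propensity = [predict(coef, row) for row in rows]
print("coefficients:", [round(c, 2) for c in coef])
print("retained proxies:", retained)
```

In practice a standard statistical package and a formal test or cross-validated criterion would normally replace the hand-written fit and the fixed threshold; the sketch only shows the shape of the procedure.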
The physician’s belief in the patient’s risk for peptic ulcer and bleeding cannot be measured directly. But we can look to known correlates of treatment choice as measures of the physician’s belief and treat these as proxy variables.
Causal diagram. Nodes: Celecoxib versus Naproxen; PUB Hospital Admission; MD-perceived risk of peptic ulcer & bleeding (PUB); True risk of PUB.
After control for correlates that completely capture perceived PUB diathesis, there is no further confounding.
Table columns: Primary Discharge Diagnosis, N, %, N, %, RR
With control for many, many proxies, a strong effect emerges.
A proxy is (1) a correlate that (2) captures the effect of an unmeasured covariate on either treatment or disease.
Whether a correlate is a proxy is defined only in respect of a third, predicted variable.
Strong correlates may be only weak proxies.
Composite (multidimensional) proxies are useful when no single candidate proxy captures the unmeasured covariate.
Propensity scoring creates multidimensional proxies.