
Discussion of The Problem of False Discoveries: How to Balance Objectives. 2009 IES Research Conference. David Judkins, Westat.

Dec 31, 2015


Page 1

Discussion of The Problem of False Discoveries: How to Balance Objectives

2009 IES Research Conference

David Judkins

Westat

Page 2

I would like to commend the authors on their fine work.

I found nothing to disagree with.

I would like to spend my time talking about the nature of confirmatory versus exploratory analysis, how to group outcomes, how to drill down, and the utility of single-dimensional summaries of multi-dimensional outcomes.

Thanks to Andrea Piesse of Westat for valuable comments.

Of course, my remarks are personal and do not necessarily reflect Westat policies.

Page 3

G.E.P. Box

I haven’t read any work by him directly on multiple comparisons or false discovery control.

But he has written elegantly about the nature of discovery and the use of statistics in that process.

An understanding of his work will help researchers distinguish between exploratory and confirmatory analysis in their own work.

Page 4

Statistics for Discovery

Box, 2001, Journal of Applied Statistics

Based on his 2000 Deming Lecture

Knowledge development is an iterative process.

It alternates between induction and deduction.

In the inductive phase, we use new data to improve current models.

In the deductive phase, we design and conduct experiments to test the logical consequences of the improved models.

Page 5

Long History

Francis Bacon discussed the iterative nature of knowledge development at the beginning of the Age of Enlightenment.

Steve Stigler told Box that Bishop Robert Grosseteste, one of the founders of Oxford University in the 1200s, also talked about this idea and attributed it to Aristotle.

Page 6

Box’s Illustration

Model: Today is like every day.

Deduction: My car will be in my parking space.

Data: It isn’t!

Induction: Someone must have taken it.

Page 7

Box’s Illustration (2)

Model: My car has been stolen.

Deduction: My car will not be in the parking lot.

Data: No. It’s over there!

Induction: Someone took it and brought it back.

Page 8

Box’s Illustration (3)

Model: A thief took it and brought it back.

Deduction: My car will be broken into.

Data: No. It’s unharmed and locked!

Induction: Someone who had a key took it.

Page 9

Box’s Illustration (4)

Model: My wife used my car.

Deduction: She has probably left me a note.

Data: Yes. Here it is!

Page 10

Box on Judge versus Detective

In the trial, there is a judge and jury before whom, under very strict rules, all the evidence must be brought together at one time and the jury must decide, whether the hypothesis of innocence can be rejected beyond all reasonable doubt. This is very much like a statistical test.

Page 11

Box on Judge versus Detective (2)

However, the apprehension of the defendant by a detective will have been conducted by a very different process. … The approach of the detective closely parallels that of the scientific investigator.

Page 12

Fitting Randomized Trials into this Paradigm

“Randomized trials” is, I believe, the name favored in education research for experiments.

Much of the tradition for how to run them and analyze them comes from the fields of medical interventions, devices and pharmaceuticals, where, of course, they are known as randomized clinical trials.

What aspects of that tradition are appropriate in education research?

Page 13

Regulatory Role of CRTs

I think that much of the tradition has arisen from the regulatory role of CRTs.

The FDA panels are much like Box’s juries, and the FDA administrators like Box’s judges.

Of course, there is a huge set of investigators at the drug companies working to synthesize new drugs and to develop new devices.

But there is a severe administrative and legal separation between the two operations.

Page 14

Education Researchers Wear Both Hats

So when are we acting like judges and when like investigators? When like the FDA and when like the drug company development arms?

This determines to a large extent whether formal control over family-wise error rates is appropriate and thus whether adjustments must be made for multiple comparisons.
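To make the stakes of family-wise error rate (FWER) control concrete, the following minimal sketch computes how quickly the chance of at least one false rejection grows with the number of outcomes tested. It assumes the tests are independent and that every null hypothesis is true, which is an idealization not stated in the talk; correlated outcomes would inflate less. The function names are hypothetical, and Bonferroni is used only as the simplest adjustment.

```python
def familywise_error_rate(per_test_alpha: float, n_tests: int) -> float:
    """Chance of at least one false rejection among n_tests independent
    tests of true null hypotheses, each run at level per_test_alpha.
    Independence is an assumption of this sketch."""
    return 1.0 - (1.0 - per_test_alpha) ** n_tests


def bonferroni_level(target_fwer: float, n_tests: int) -> float:
    """Per-test level that keeps the family-wise rate at or below target_fwer."""
    return target_fwer / n_tests


for n_tests in (1, 2, 5, 10, 20):
    unadjusted = familywise_error_rate(0.05, n_tests)
    adjusted = familywise_error_rate(bonferroni_level(0.05, n_tests), n_tests)
    print(f"{n_tests:>2} outcomes: unadjusted {unadjusted:.3f}, Bonferroni {adjusted:.3f}")
```

With ten independent outcomes each tested at 0.05, the unadjusted family-wise rate is about 0.40, which is why the judge's role calls for an adjustment that the investigator's role may not need.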

Page 15

Enshrinement

I would say that we should treat an analysis as a confirmatory analysis, in the language of Schochet and Deke, if there is a good chance that the findings will become accepted knowledge for years to come.

I also think that there is a fairly strong danger of exploratory analyses being mistaken for confirmatory, so I urge very clear language in the caveats of exploratory analyses.

Page 16

What Works Clearinghouse

The title suggests that all the guidance to be found is very solid and reliable.

Thus, I think that requiring FWER control for entry into WWC is very appropriate.

But then how do we facilitate the induction phase?

How do we work to improve the models that for the most part are still very primitive in education research?

Page 17

What Might Work Clearinghouse

Report all the findings from randomized trials with no concern about FWER?

Also, report findings from poorly controlled observational studies?

A resource for experimenters, not for implementers.

Page 18

Grouping

Peter and John mention that grouping outcomes is a powerful way of mitigating the multiple comparison problem.

But how to form them?

In education research, there is a strong urge to treat each assessment as a separate domain.

Are receptive and expressive vocabulary skills really separate domains?

Page 19

Sources of Resistance to Grouping

Maybe a sense that they want to be doing investigative work rather than judging work?

Pressure from test publishers to see results for their assessments presented separately?

Page 20

Post-Peek Grouping

What happens when we use conditional grouping rules?

Let X and Y be two outcome variables, and Z be the average of the two.

Suppose we only estimate the effect of T on Z if we first find that the effects of T on X and Y are not statistically different from each other?

Otherwise, we publish the effects of T on X and T on Y separately with multiple comparison adjustment.

Page 21

Post-Peek Grouping (2)

The math is complex. Preliminary simulations hint, however, that the procedure is too liberal, failing to provide FWER control.

Also not clear how to generalize to more than two outcomes.
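The post-peek rule can be explored with a small Monte Carlo sketch. This is not the speaker's simulation; it is a simplified illustration under assumptions that are mine: the estimated effects of T on X and Y are standardized, bivariate normal with known unit variance and correlation `rho`, every true effect is zero, all tests are two-sided z-tests, and the separate tests use a Bonferroni adjustment. The function name and parameters are hypothetical.

```python
import math
import random
from statistics import NormalDist


def simulate_post_peek_fwer(rho=0.5, alpha=0.05, alpha_peek=0.05,
                            reps=100_000, seed=0):
    """Estimate the chance of any false rejection under the post-peek rule
    when T has no effect on X or Y. Requires -1 < rho < 1."""
    rng = random.Random(seed)
    normal = NormalDist()
    z_single = normal.inv_cdf(1 - alpha / 2)      # one test on Z
    z_bonf = normal.inv_cdf(1 - alpha / 4)        # two tests, Bonferroni
    z_peek = normal.inv_cdf(1 - alpha_peek / 2)   # preliminary difference test
    sd_diff = math.sqrt(2 * (1 - rho))            # SD of est_x - est_y
    sd_avg = math.sqrt((1 + rho) / 2)             # SD of (est_x + est_y) / 2

    false_rejections = 0
    for _ in range(reps):
        # Correlated standardized effect estimates under the global null.
        u, v = rng.gauss(0, 1), rng.gauss(0, 1)
        est_x = u
        est_y = rho * u + math.sqrt(1 - rho ** 2) * v

        if abs(est_x - est_y) / sd_diff <= z_peek:
            # Effects look alike: test only the average Z, unadjusted.
            rejected = abs((est_x + est_y) / 2) / sd_avg > z_single
        else:
            # Effects look different: test X and Y separately, adjusted.
            rejected = abs(est_x) > z_bonf or abs(est_y) > z_bonf
        false_rejections += rejected
    return false_rejections / reps


print(simulate_post_peek_fwer())
```

In this idealized setup the peek statistic is independent of the test on Z, so the inflation comes entirely from the branch where the peek rejects: a large gap between the two estimates makes it likely that at least one of them is itself large. The estimated rate therefore lands above the nominal 0.05, which is in line with the hint on the slide, though the size of the excess depends on `rho` and on the assumptions above.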

Page 22

Drilling Down

If there is a significant effect on the composite domain outcome, then there is natural interest in the components.

I think this falls under the rubric of exploratory analysis done to facilitate the induction phase of knowledge building.

If FWER control is attempted for the drill-down, the resampling methods would certainly appear best suited given the strong correlations.
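As an illustration of why resampling suits strongly correlated components, here is a minimal sketch of a single-step max-statistic permutation adjustment, in the style of the Westfall and Young maxT procedure. The talk does not say which resampling method is meant, so this choice is an assumption. The sketch also assumes individual-level random assignment (a cluster-randomized design would need to permute whole clusters) and non-constant outcomes; the function name and data layout are hypothetical.

```python
import random
from math import sqrt


def max_stat_adjusted_pvalues(treated, control, n_perm=2000, seed=0):
    """Adjusted p-values for each component outcome.
    treated, control: lists of rows, one row of component scores per unit."""
    rng = random.Random(seed)
    pooled = list(treated) + list(control)
    n_t, n, k = len(treated), len(pooled), len(pooled[0])

    # Overall SD of each component; relabelling units does not change it.
    sds = []
    for j in range(k):
        col = [row[j] for row in pooled]
        m = sum(col) / n
        sds.append(sqrt(sum((x - m) ** 2 for x in col) / (n - 1)))

    def stats(rows):
        # Absolute standardized mean difference; first n_t rows act as treated.
        out = []
        for j in range(k):
            mean_t = sum(r[j] for r in rows[:n_t]) / n_t
            mean_c = sum(r[j] for r in rows[n_t:]) / (n - n_t)
            out.append(abs(mean_t - mean_c) / sds[j])
        return out

    observed = stats(pooled)
    exceed = [0] * k
    for _ in range(n_perm):
        rng.shuffle(pooled)              # random relabelling of whole rows
        perm_max = max(stats(pooled))    # largest statistic across components
        for j in range(k):
            if perm_max >= observed[j]:
                exceed[j] += 1
    return [(c + 1) / (n_perm + 1) for c in exceed]


# Made-up demo data: an effect on the first component only.
demo = random.Random(42)
treated = [[demo.gauss(2, 1), demo.gauss(0, 1), demo.gauss(0, 1)] for _ in range(40)]
control = [[demo.gauss(0, 1), demo.gauss(0, 1), demo.gauss(0, 1)] for _ in range(40)]
print(max_stat_adjusted_pvalues(treated, control, n_perm=500, seed=1))
```

Because whole rows are permuted, the correlation among components is preserved in every relabelling, so the reference distribution of the maximum reflects it automatically. That is the property that makes this family of methods less conservative than Bonferroni when components move together.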

Page 23

Multi-Domain Outcome Indices

Not every summary measure needs to be built up from a set of correlated items around the same latent construct.

Think of the quality-of-life indices published for cities around the world.

Educational and developmental progress is multi-dimensional, but that does not mean that every dimension needs to be reported separately.

We should not insist that all outcome measures have high reliability for uni-dimensional latent variables.