Background | Methods | Results | So what?

A re-analysis of the Cochrane Library data: the dangers of unobserved heterogeneity in meta-analyses

Evan Kontopantelis (1, 2), David Springate (1, 3), David Reeves (1, 3)

1 NIHR School for Primary Care Research
2 Centre for Health Informatics, Institute of Population Health
3 Centre for Biostatistics, Institute of Population Health

Centre for Biostatistics, 10 Feb 2014

Internal 2014 - Cochrane data

May 10, 2015


Page 2: Internal 2014 - Cochrane data

Outline

1 Background
2 Methods
  Data
  Analyses
3 Results
  Method performance
  Cochrane data
4 So what?
  Summary
  Relevant and future work

Page 3: Internal 2014 - Cochrane data

Meta-analysis

Synthesising existing evidence to answer clinical questions
Relatively young and dynamic field of research
Activity reflects the importance of MA and its potential to provide conclusive answers
Individual Patient Data meta-analysis is the best option, but it carries considerable cost and requires access to patient data
When original data are unavailable, evidence is combined in a two-stage process:
  1. retrieving the relevant summary effect statistics
  2. using an MA model to calculate the overall effect estimate µ
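The second stage of the two-stage process can be sketched in a few lines. This is a minimal illustration that assumes a fixed-effect inverse-variance model; the function name `fixed_effect_iv` and the three study values are hypothetical and do not come from the talk.

```python
import math

def fixed_effect_iv(effects, variances):
    """Inverse-variance pooled estimate of the overall effect (fixed-effect model)."""
    weights = [1.0 / v for v in variances]          # w_i = 1 / sigma_i^2
    total = sum(weights)
    mu = sum(w * y for w, y in zip(weights, effects)) / total
    se = math.sqrt(1.0 / total)                     # standard error of the pooled estimate
    ci = (mu - 1.96 * se, mu + 1.96 * se)           # normal-approximation 95% CI
    return mu, se, ci

# Stage 1 output (hypothetical numbers): one effect size and one variance per study.
effects = [0.10, 0.30, 0.50]
variances = [0.04, 0.04, 0.04]

# Stage 2: combine the summary statistics into an overall effect estimate.
mu, se, ci = fixed_effect_iv(effects, variances)
print(f"pooled effect = {mu:.3f}, SE = {se:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

With equal variances the pooled effect is simply the mean of the study effects; unequal variances shift the estimate towards the more precise studies.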

Page 4: Internal 2014 - Cochrane data

Heterogeneity estimate
or between-study variance estimate τ²

Model selection depends on the heterogeneity estimate
If heterogeneity is present, a random-effects approach is usually selected
But a fixed-effects model may be chosen for theoretical or practical reasons
Different approaches for combining study results:
  Inverse variance
  Mantel-Haenszel
  Peto

Page 5: Internal 2014 - Cochrane data

Meta-analysis methods

Inverse variance: fixed- or random-effects; continuous or dichotomous outcome
  DerSimonian-Laird, moment-based estimator
  Also: ML, REML, PL, Biggerstaff-Tweedie, Follmann-Proschan, Sidik-Jonkman
Mantel-Haenszel: fixed-effect; dichotomous outcome
  odds ratio, risk ratio or risk difference
  different weighting scheme
  low event numbers or small studies
Peto: fixed-effect; dichotomous outcome
  Peto odds ratio
  small intervention effects or very rare events

If τ² > 0, it is only modelled through inverse variance weighting
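To make the "different weighting scheme" concrete, here is a small sketch of the standard Mantel-Haenszel pooled odds ratio for 2x2 tables. The function name and the two example tables are hypothetical; the slides give no worked numbers.

```python
def mantel_haenszel_or(tables):
    """Pooled Mantel-Haenszel odds ratio.

    Each table is (a, b, c, d): events and non-events in the treatment arm,
    followed by events and non-events in the control arm.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d                 # total sample size of the study
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

# Two hypothetical trials with a dichotomous outcome.
tables = [(10, 90, 20, 80), (5, 45, 10, 40)]
or_mh = mantel_haenszel_or(tables)
print(f"Mantel-Haenszel odds ratio = {or_mh:.3f}")
```

Unlike inverse-variance weighting, the weights here come directly from the cell counts, which is why the method copes better with low event numbers.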

Page 6: Internal 2014 - Cochrane data

Random-effects (RE) models

An accurate τ² is an important performance driver
A large τ² leads to wider CIs
A zero τ² reduces all methods to fixed-effect
Three main approaches to estimating τ²:
  DerSimonian-Laird (τ²_DL)
  Maximum Likelihood (τ²_ML)
  Restricted Maximum Likelihood (τ²_REML)
Many methods use one of these but vary in estimating µ
In practice, τ²_DL is computed and heterogeneity is quantified and reported using Cochran's Q, I² or H²
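The quantities named on this slide can be computed as follows. This is a sketch of the usual moment-based DerSimonian-Laird formulas, with negative estimates truncated at zero; the function name, the returned dictionary keys and the example data are assumptions made for illustration.

```python
import math

def dersimonian_laird(effects, variances):
    """One-step DerSimonian-Laird tau^2 with Cochran's Q, I^2, H^2 and the RE estimate."""
    k = len(effects)
    w = [1.0 / v for v in variances]                       # fixed-effect weights
    sw = sum(w)
    mu_fe = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w, effects))   # Cochran's Q
    df = k - 1
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)                          # truncated at zero
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0          # share of variability due to heterogeneity
    h2 = q / df
    w_re = [1.0 / (v + tau2) for v in variances]           # random-effects weights
    mu_re = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_re = math.sqrt(1.0 / sum(w_re))
    return {"Q": q, "tau2": tau2, "I2": i2, "H2": h2, "mu": mu_re, "se": se_re}

# Hypothetical meta-analysis of three studies with equal within-study variances.
res = dersimonian_laird([0.0, 0.3, 0.9], [0.04, 0.04, 0.04])
print({name: round(value, 4) for name, value in res.items()})
```

When Q does not exceed its degrees of freedom the estimate is set to zero, and the random-effects result collapses to the fixed-effect one.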

Page 7: Internal 2014 - Cochrane data

Random or fixed?
two 'schools' of thought

Fixed-effect (FE)
  'what is the average result of trials conducted to date?'
  assumption-free
Random-effects (RE)
  'what is the true treatment effect?'
  various assumptions
    normally distributed trial effects
    varying treatment effect across populations, although findings are limited since they are based on observed studies only
  more conservative; findings potentially more generalisable
Researchers are reassured when τ² = 0
FE is often used when low heterogeneity is detected

Page 8: Internal 2014 - Cochrane data

Simples!

[Flowchart: choosing a meta-analysis model, from "Start (sort of)".
Decision nodes: Outcome(s) continuous? Outcome(s) dichotomous? Outcome(s) time-to-event? Combining dichotomous and continuous outcomes (transform dichotomous outcomes to SMD)? Fixed-effect by conviction? Detected heterogeneity? Rare events? Very rare events? Estimate heterogeneity (τ²)? Feeling adventurous? Bayesian? Non-zero prior?
Method families: Inverse Variance weighting methods (IV), Mantel-Haenszel methods (MH), Peto methods (P).
Model nodes: fixed-effect IV model; random-effects IV model (DL, VC, ML, REML, PL); random-effects IV model (DL); fixed-effect MH true model; random-effects MH-IV hybrid model; fixed-effect Peto true model; fixed-effect Peto O-E true model.
τ² estimation options: DL, DL2, DLb, VC, VC2, ML, REML, PL; Bayesian τ² estimates: B0, BP, MVa, MVb.]

Page 9: Internal 2014 - Cochrane data

Cochrane Database of Systematic Reviews
hidden data

Richest resource of meta-analyses in the world
Fifty-four active groups responsible for organising, advising on and publishing systematic reviews
Authors are obliged to use RevMan and submit the data and analyses file along with the review, contributing to the creation of a vast data resource
RevMan offers quite a few fixed-effect choices, but only the DerSimonian-Laird random-effects method has been implemented to quantify and account for heterogeneity

Page 10: Internal 2014 - Cochrane data

Software options

RevMan
  Easy to use
  Streamlined and 'idiot-proof'
  Limited model options
  Data manipulation generally not possible
MetaEasy for data collection and some manipulation
Stata offers quite a few packages with advanced options and model choices: metan, metaan, metabias, etc.
R is similarly very well supported

Page 11: Internal 2014 - Cochrane data

'Real' Data
Cochrane Database of Systematic Reviews

Python code to crawl the Wiley website for RevMan files
Downloaded 3,845 relevant RevMan files (of 3,984 available in Aug 2012) and imported them into Stata
Each file is a systematic review
Within each file, various research questions might have been posed
  investigated across various relevant outcomes?
  variability in intervention or outcome?

Page 12: Internal 2014 - Cochrane data

'Real' Data
Cochrane Database of Systematic Reviews

[Diagram: structure of one review file in the Cochrane database]

CD000006
Group: Pregnancy and Childbirth
Review name: Absorbable suture materials for primary repair of episiotomy and second degree tears
  Meta-analysis 1: Synthetic sutures versus catgut
    Outcome 1.1: Short-term pain: pain at day 3 or less (women experiencing any pain)
      Subgroup 1.1.1: Standard synthetic; k=9
      Subgroup 1.1.2: Fast absorbing; k=1
      Main 1.1.0: k=10
    Outcome 1.9: Dyspareunia - at 3 months postpartum
      Subgroup 1.9.1: Standard synthetic; k=5
      Subgroup 1.9.2: Fast absorbing; k=1
      Main 1.9.0: k=6
  Meta-analysis 2: Fast-absorbing synthetic versus standard absorbable synthetic material
    Outcome 2.1: Short-term pain: at 3 days or less
      Main 2.1.0: k=3
    Outcome 2.11: Maternal satisfaction: satisfied with repair at 12 months
      Main 2.11.0: k=1
  Meta-analysis 3: Glycerol impregnated catgut (softgut) versus chromic catgut
    Outcome 3.1: Short-term pain: pain at 3 days or less
      Main 3.1.0: k=1
    Outcome 3.8: Dyspareunia at 6 - 12 months
      Main 3.8.0: k=1
  Meta-analysis 4: Monofilament versus standard polyglycolic sutures
    Outcome 4.1: Short-term pain: mean pain scores at 3 days
      Main 4.1.0: k=1
    Outcome 4.4: Wound problems at 8 - 12 weeks: women seeking professional help for problem with perineal repair
      Main 4.4.0: k=1

Page 13: Internal 2014 - Cochrane data

Simulated Data

Generated effect size Y_i and within-study variance estimates σ²_i for each simulated meta-analysis study
Distribution for σ²_i based on the χ²_1 distribution
For Y_i (where Y_i = θ_i + e_i):
  assumed e_i ∼ N(0, σ²_i)
  various distributional scenarios for θ_i: normal, moderate and extreme skew-normal, uniform, bimodal
  three τ² values to capture low (I² = 15.1%), medium (I² = 34.9%) and large (I² = 64.1%) heterogeneity
For each distributional assumption and τ² value, 10,000 meta-analysis cases simulated
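The data-generating process on this slide might look roughly like the sketch below, which covers the normal scenario for θ_i only. The slide says only that σ²_i is "based on" a χ²_1 distribution, so the 0.25 scaling and the clamping bounds used here are assumptions, as are the function name, the seed and the values chosen for k, µ and τ².

```python
import random

def simulate_meta_analysis(k, mu, tau2, rng):
    """Return k (Y_i, sigma2_i) pairs for one simulated meta-analysis (normal theta_i)."""
    studies = []
    for _ in range(k):
        # A chi-square(1) draw is the square of a standard normal draw.
        sigma2 = 0.25 * rng.gauss(0.0, 1.0) ** 2       # scaling factor: assumption
        sigma2 = min(max(sigma2, 0.009), 0.6)          # clamping bounds: assumption
        theta = rng.gauss(mu, tau2 ** 0.5)             # study-specific true effect theta_i
        y = theta + rng.gauss(0.0, sigma2 ** 0.5)      # Y_i = theta_i + e_i
        studies.append((y, sigma2))
    return studies

rng = random.Random(2014)                              # fixed seed for reproducibility
cases = [simulate_meta_analysis(k=5, mu=0.5, tau2=0.03, rng=rng) for _ in range(10000)]
print(len(cases), "simulated meta-analyses of", len(cases[0]), "studies each")
```

Other scenarios (skew-normal, uniform, bimodal) would replace only the line that draws `theta`.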

Page 14: Internal 2014 - Cochrane data

The questions

Investigate the potential bias when assuming τ² = 0
Compare the performance of τ² estimators in various scenarios
Present the distribution of τ² derived from all meta-analyses in the Cochrane Library
Present details on the number of meta-analysed studies, model selection and zero τ²
Assess the sensitivity of results and conclusions using alternative models

Page 15: Internal 2014 - Cochrane data

Between-study variance estimators
frequentist, more or less

DerSimonian-Laird
  one-step (τ²_DL)
  two-step (τ²_DL2)
  non-parametric bootstrap (τ²_DLb)
  minimum τ²_DL = 0.01 assumed (τ²_DLi)
Variance components
  one-step (τ²_VC)
  two-step (τ²_VC2)
Iterative
  Maximum likelihood (τ²_ML)
  Restricted maximum likelihood (τ²_REML)
  Profile likelihood (τ²_PL)
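As a rough illustration of the non-parametric bootstrap variant, the sketch below resamples studies with replacement, recomputes the one-step DerSimonian-Laird estimate for each replicate and averages the results. Summarising the replicates by their mean is an assumption here (the slide does not say how they are combined), and the function names, replicate count and seed are hypothetical; this is not the `metaan` implementation.

```python
import random

def dl_tau2(effects, variances):
    """One-step DerSimonian-Laird estimate of tau^2, truncated at zero."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    mu = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi * wi for wi in w) / sw
    return max(0.0, (q - (len(effects) - 1)) / c)

def dlb_tau2(effects, variances, replicates=1000, seed=1):
    """Bootstrap DL: mean of DL estimates over resampled sets of k studies."""
    rng = random.Random(seed)
    k = len(effects)
    total = 0.0
    for _ in range(replicates):
        idx = [rng.randrange(k) for _ in range(k)]     # sample k studies with replacement
        total += dl_tau2([effects[i] for i in idx], [variances[i] for i in idx])
    return total / replicates

effects, variances = [0.0, 0.3, 0.9], [0.04, 0.04, 0.04]    # hypothetical data
print("DL :", round(dl_tau2(effects, variances), 4))
print("DLb:", round(dlb_tau2(effects, variances), 4))
```

Because each replicate is truncated at zero before averaging, the bootstrap estimate is exactly zero only when every replicate is, which is one way to read its lower rate of zero estimates.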

Page 16: Internal 2014 - Cochrane data

Between-study variance estimators
Bayesian

Sidik and Jonkman model error variance
  crude ratio estimates used as a-priori values (τ²_MVa)
  VC estimator used to inform a-priori values, with a minimum value of 0.01 (τ²_MVb)
Rukhin
  prior between-study variance zero (τ²_B0)
  prior between-study variance non-zero and fixed (τ²_BP)

Page 17: Internal 2014 - Cochrane data

Assessment criteria
in the 10,000 meta-analysis cases for each simulation scenario

Average bias and average absolute bias in τ²
Percentage of zero τ²
Coverage probability for the effect estimate
  Type I error
  proportion of 95% CIs for the overall effect estimate that contain the true overall effect θ_i
Error-interval estimation for the effect
  quantifies accuracy of estimation of the error-interval around the point estimate
  ratio of the estimated confidence interval for the effect to the interval based on the true τ²
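Three of these criteria (average bias in τ², percentage of zero τ² and coverage) can be sketched for a single scenario as below. The sketch assumes the one-step DerSimonian-Laird estimator with a normal-approximation 95% CI, normally distributed study effects, and an illustrative variance distribution (scaled and clamped χ²_1); all names and parameter values are hypothetical, and the true overall effect is called `mu` in the code.

```python
import math
import random

def simulate(k, mu, tau2, rng):
    """One simulated meta-analysis: list of (effect, within-study variance)."""
    out = []
    for _ in range(k):
        s2 = min(max(0.25 * rng.gauss(0, 1) ** 2, 0.009), 0.6)   # assumption
        theta = rng.gauss(mu, math.sqrt(tau2))
        out.append((theta + rng.gauss(0, math.sqrt(s2)), s2))
    return out

def dl_random_effects(studies):
    """Return (tau2, lower, upper) for a DL random-effects 95% CI."""
    y = [s[0] for s in studies]
    v = [s[1] for s in studies]
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    mu_fe = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w, y))
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (len(y) - 1)) / c)
    w_re = [1.0 / (vi + tau2) for vi in v]
    mu_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return tau2, mu_re - 1.96 * se, mu_re + 1.96 * se

def assess(k, mu, tau2, n_cases, seed):
    rng = random.Random(seed)
    covered = zeros = 0
    bias = 0.0
    for _ in range(n_cases):
        est, lo, hi = dl_random_effects(simulate(k, mu, tau2, rng))
        covered += lo <= mu <= hi          # CI contains the true overall effect
        zeros += est == 0.0                # heterogeneity not detected
        bias += est - tau2
    return {"coverage": covered / n_cases,
            "pct_zero": 100.0 * zeros / n_cases,
            "avg_bias": bias / n_cases}

result = assess(k=5, mu=0.5, tau2=0.03, n_cases=2000, seed=7)
print(result)
```

The slides use 10,000 cases per scenario; 2,000 are used here only to keep the example quick.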

Page 18: Internal 2014 - Cochrane data

Which method?

Performance not affected much by the effects' distribution
Absolute bias
  B0 (k ≤ 3) and ML
Coverage
  MVa-BP (k ≤ 3) and DLb
Error-interval estimation and detecting heterogeneity
  DLb
DLb seems the best method overall, especially in detecting heterogeneity
  this appears to be a big problem: DL failed to detect high τ² for over 50% of small meta-analyses
Bayesian methods did well for very small MAs

Page 19: Internal 2014 - Cochrane data

Meta-analyses numbers

Of the 3,845 files, 2,801 had identified relevant studies and contained any data
98,615 analyses extracted, 57,397 of which were meta-analyses
  32,005 were overall meta-analyses
  25,392 were subgroup meta-analyses
Estimation of an overall effect
  Peto method in 4,340 (7.6%)
  Mantel-Haenszel in 33,184 (57.8%)
  Inverse variance in 19,873 (34.6%)
  random-effects more prevalent in inverse variance methods and larger meta-analyses
34% of meta-analyses on 2 studies (53% k ≤ 3)!

Page 20: Internal 2014 - Cochrane data

Meta-analyses by Cochrane group

[Figure 1: All meta-analyses, including single-study and subgroup meta-analyses. Bar chart of the number of meta-analyses (y-axis, 0 to 14,000) by Cochrane group, with series for single study, fixed-effect model (by choice or necessity) and random-effects model.
Groups along the x-axis, in order: Pregnancy and Childbirth; Schizophrenia; Neonatal; Menstrual Disorders and Subfertility; Depression Anxiety and Neurosis; Airways; Hepato-Biliary; Fertility Regulation; Musculoskeletal; Stroke; Acute Respiratory Infections; Renal; Dementia and Cognitive Improvement; Pain Palliative and Supportive Care; Infectious Diseases; Heart; Bone Joint and Muscle Trauma; Metabolic and Endocrine Disorders; Gynaecological Cancer; Developmental Psychosocial and Learning…; Colorectal Cancer; Hypertension; Anaesthesia; Haematological Malignancies; Drugs and Alcohol; Incontinence; Inflammatory Bowel Disease and Functional…; Movement Disorders; Neuromuscular Disease; Oral Health; Peripheral Vascular Diseases; Breast Cancer; Tobacco Addiction; Cystic Fibrosis and Genetic Disorders; Back; Skin; HIV/AIDS; Injuries; Eyes and Vision; Wounds; Ear Nose and Throat Disorders; Epilepsy; Upper Gastrointestinal and Pancreatic Diseases; Effective Practice and Organisation of Care; Prostatic Diseases and Urologic Cancers; Multiple Sclerosis and Rare Diseases of the…; Multiple Sclerosis; Consumers and Communication; Lung Cancer; Sexually Transmitted Diseases; Childhood Cancer; Occupational Safety and Health; Sexually Transmitted Infections; Public Health.]

Page 21: Internal 2014 - Cochrane data

Model selection by meta-analysis size

[Figure: Model selection by number of available studies (% of random-effects meta-analyses). Bar chart of the number of meta-analyses (y-axis, 0 to 20,000), with series for fixed-effect (by choice or necessity) and random-effects. Percentage labels as extracted, in order: 16%, 20%, 24%, 25%, 27%, 29%, 32%, 31%, 31%, 33%, 33%, 34%, 30%, 38%, 30%, 30%, 32%, 33%, 35%, 37%, 38%.]

Page 22: Internal 2014 - Cochrane data

Meta-analyses by method choice

[Figure 2: Model selection by number of available studies (and % of random-effects meta-analyses)*. Bar chart of the number of meta-analyses (y-axis, 0 to 12,000) by number of studies in the meta-analysis (2, 3, 4, 5, 6-9, 10+), with series for Peto (FE), Inverse Variance (FE), Inverse Variance (RE), Mantel-Haenszel (FE) and Mantel-Haenszel (RE). Percentage labels as extracted: 21%, 27%, 31%, 37%, 41%, 51% and 15%, 19%, 22%, 22%, 27%, 30%.]
*Note that in many cases fixed-effect models were used when heterogeneity was detected.

[Figure 3: Comparison of zero between-study variance estimates rates in the Cochrane library data and in simulations, using the DerSimonian-Laird method*. Plot of the % of zero τ² estimates with DerSimonian-Laird (y-axis, 0 to 100) by number of studies in the meta-analysis (2, 3, 4, 5, 10, 20), with series for observed data and for simulations with true τ² = 0.01, 0.03 and 0.10.]
*Normal distribution of the effects assumed in the simulations (more extreme distributions produced similar results).

Page 23: Internal 2014 - Cochrane data

Comparing Cochrane data with simulated

To assess the validity of a homogeneity assumption, we compared the percentage of zero τ²_DL in real and simulated data
Calculated τ²_DL for all Cochrane meta-analyses
Percentage of zero τ²_DL was lower in the real data than in the low and moderate heterogeneity simulated data
Suggests that mean true between-study variance is higher than generally assumed but fails to be detected, especially for small meta-analyses
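The simulated side of this comparison can be sketched by counting how often the DerSimonian-Laird estimate is exactly zero for different meta-analysis sizes, even though the true τ² is positive. The variance distribution, the true τ² of 0.03, the number of cases and all names below are illustrative assumptions, so the percentages will not match those on the slides.

```python
import random

def dl_tau2(effects, variances):
    """One-step DerSimonian-Laird estimate of tau^2, truncated at zero."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    mu = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi * wi for wi in w) / sw
    return max(0.0, (q - (len(effects) - 1)) / c)

def pct_zero_dl(k, tau2, n_cases, rng):
    """Percentage of simulated meta-analyses of size k with a zero DL estimate."""
    zeros = 0
    for _ in range(n_cases):
        variances = [min(max(0.25 * rng.gauss(0, 1) ** 2, 0.009), 0.6)   # assumption
                     for _ in range(k)]
        effects = [rng.gauss(0.0, tau2 ** 0.5) + rng.gauss(0.0, v ** 0.5)
                   for v in variances]
        zeros += dl_tau2(effects, variances) == 0.0
    return 100.0 * zeros / n_cases

rng = random.Random(42)
pct = {k: pct_zero_dl(k, tau2=0.03, n_cases=2000, rng=rng) for k in (2, 3, 4, 5, 10, 20)}
for k, value in pct.items():
    print(f"k={k:2d}: {value:.1f}% zero DL estimates")
```

Comparing such simulated rates with the observed rate in real data is what motivates the slide's conclusion that true heterogeneity is often present but undetected.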

Page 24: Internal 2014 - Cochrane data

Comparing Cochrane data with simulated

[Figure 3: Comparison of zero between-study variance estimates rates in the Cochrane library data and in simulations, using the DerSimonian-Laird method*. Plot of the % of zero τ² estimates with DerSimonian-Laird (y-axis, 0 to 100) by number of studies in the meta-analysis (2, 3, 4, 5, 10, 20), with series for observed data and for simulations with true I² = 15%, 35% and 64%.]
*Normal distribution of the effects assumed in the simulations (more extreme distributions produced similar results).

Page 25: Internal 2014 - Cochrane data

Reanalysing the Cochrane data

We applied all methods to all 57,397 meta-analyses to assess τ² distributions and the sensitivity of the results and conclusions
For simplicity, we discuss differences between standard methods and DLb; not a perfect method but one that performed well overall
As in simulations, DLb identifies more heterogeneous meta-analyses; τ²_DL = 0 for 50.5% and τ²_DLb = 0 for 31.2%
Distributions of τ² agree with the hypothesised χ²_1

Page 26: Internal 2014 - Cochrane data

Distributions for τ²

[Figure: four panels showing the distribution of τ² estimates, non-zero estimates only (x-axis: τ² estimate, 0 to 0.5; y-axis: number of meta-analyses), with series for DL, DLb, VC, ML, REML, B0, VC2 and DL2. Panel notes follow.]

Inverse Variance
  Zero estimates (%): DL=44.9, DLb=29.6, VC=48.9, REML=45.4, ML=62.2, B0=49.2, VC2=44.3, DL2=45.3
  Non-convergence (%): ML=0.7, REML=1.4
Mantel-Haenszel
  Zero estimates (%): DL=54.2, DLb=32.7, VC=58.8, REML=55.6, ML=75.0, B0=59.6, VC2=53.9, DL2=55.5
  Non-convergence (%): ML=1.3, REML=1.9
Peto & O-E
  Zero estimates (%): DL=50.8, DLb=27.3, VC=54.2, REML=51.4, ML=70.0, B0=54.8, VC2=49.6, DL2=51.0
  Non-convergence (%): ML=0.6, REML=1.0
All methods
  Zero estimates (%): DL=50.7, DLb=31.2, VC=55.0, REML=51.7, ML=70.2, B0=55.6, VC2=50.2, DL2=51.6
  Non-convergence (%): ML=1.0, REML=1.6

Page 27: Internal 2014 - Cochrane data

Changes in results and conclusions

Inverse variance with DLb
  when τ²_DL > 0 but ignored, conclusions change for 19.1% of analyses
  in the overwhelming majority of changes, effects stopped being statistically significant
Findings were similar for Mantel-Haenszel and Peto methods, although the validity of the inverse variance weighting in these (which is a prerequisite for the use of random-effects models) warrants further investigation
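One way to operationalise a "change in conclusions" is to check whether statistical significance at the 5% level differs between the fixed-effect analysis and a random-effects analysis that uses a given τ². The sketch below assumes that definition, which the slides do not spell out; the function names and example data are hypothetical, and the τ² passed in could come from any estimator (DL, DLb or another).

```python
import math

def pooled(effects, variances, tau2=0.0):
    """Inverse-variance pooled effect and standard error; tau2=0 gives fixed-effect."""
    w = [1.0 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    return mu, math.sqrt(1.0 / sum(w))

def significant(mu, se, z_crit=1.96):
    """Two-sided test of no overall effect at the 5% level."""
    return abs(mu / se) > z_crit

def conclusion_changes(effects, variances, tau2):
    """True if significance differs between the FE and RE analyses."""
    fe = significant(*pooled(effects, variances, 0.0))
    re = significant(*pooled(effects, variances, tau2))
    return fe != re

effects, variances = [0.0, 0.3, 0.9], [0.04, 0.04, 0.04]    # hypothetical data
print("FE:", pooled(effects, variances))
print("RE:", pooled(effects, variances, tau2=0.17))
print("conclusion changes:", conclusion_changes(effects, variances, tau2=0.17))
```

In this example the fixed-effect result is significant but the random-effects result is not, which is the direction of change the slide reports for the overwhelming majority of cases.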

Page 28: Internal 2014 - Cochrane data

Changes in results and conclusions
e.g. inverse variance analyses

[Flowchart]
RevMan DerSimonian-Laird random-effects method says heterogeneity is present?
  No: analysis with bootstrap DL rarely changes conclusions (although heterogeneity estimates are higher and heterogeneity is found in around 20% more meta-analyses). Conclusions change for 0.9% of analyses.
  Yes: estimated heterogeneity 'ignored' by authors and a fixed-effect model is chosen?
    No: analysis with bootstrap DL rarely changes conclusions. Conclusions change for 2.4% of analyses.
    Yes: analysis with bootstrap DL makes a difference in 1 in 5 analyses (as would analysis with standard DL, but to a smaller extent). Conclusions change for 19.1% of analyses.

Page 29: Internal 2014 - Cochrane data

Findings

Methods often fail to detect τ² in small MAs
Even when τ² > 0, it is often ignored
Mean true heterogeneity is higher than assumed or estimated, but the standard method fails to detect it
Non-parametric DerSimonian-Laird bootstrap seems the best method overall, especially in detecting heterogeneity
Bayesian estimators MVa (Sidik-Jonkman) and BP (Rukhin) performed very well when k ≤ 3
19-21% of statistical conclusions change when τ²_DL > 0 but ignored

Page 30: Internal 2014 - Cochrane data

Conclusions

Detecting and accurately estimating τ² in a small MA is very difficult; yet for 53% of Cochrane MAs, k ≤ 3
τ² = 0 is assumed to lead to a more reliable meta-analysis, and high τ² is alarming and potentially prohibitive
Estimates of zero heterogeneity should also be a concern, since heterogeneity is likely present but undetected
Bootstrapped DL leads to a small improvement but the problem largely remains, especially for very small MAs
Caution against ignoring heterogeneity when detected
For full generalisability, random-effects essential?

Page 31: Internal 2014 - Cochrane data

Effect sizes in Randomised Controlled Trials

Most large treatment effects emerge from small studies, and when additional trials are performed, the effect sizes typically become much smaller
Well-validated large effects are uncommon and pertain to nonfatal outcomes

[Slide shows the first page of: Pereira TV, Horwitz RI, Ioannidis JPA. Empirical Evaluation of Very Large Treatment Effects of Medical Interventions. JAMA. 2012;308(16):1676-1684. Abstract as shown:]

Context: Most medical interventions have modest effects, but occasionally some clinical trials may find very large effects for benefits or harms.
Objective: To evaluate the frequency and features of very large effects in medicine.
Data Sources: Cochrane Database of Systematic Reviews (CDSR, 2010, issue 7).
Study Selection: We separated all binary-outcome CDSR forest plots with comparisons of interventions according to whether the first published trial, a subsequent trial (not the first), or no trial had a nominally statistically significant (P < .05) very large effect (odds ratio [OR], ≥ 5). We also sampled randomly 250 topics from each group for further in-depth evaluation.
Data Extraction: We assessed the types of treatments and outcomes in trials with very large effects, examined how often large-effect trials were followed up by other trials on the same topic, and how these effects compared against the effects of the respective meta-analyses.
Results: Among 85 002 forest plots (from 3082 reviews), 8239 (9.7%) had a significant very large effect in the first published trial, 5158 (6.1%) only after the first published trial, and 71 605 (84.2%) had no trials with significant very large effects. Nominally significant very large effects typically appeared in small trials with median number of events: 18 in first trials and 15 in subsequent trials. Topics with very large effects were less likely than other topics to address mortality (3.6% in first trials, 3.2% in subsequent trials, and 11.6% in no trials with significant very large effects) and were more likely to address laboratory-defined efficacy (10% in first trials, 10.8% in subsequent, and 3.2% in no trials with significant very large effects). First trials with very large effects were as likely as trials with no very large effects to have subsequent published trials. Ninety percent and 98% of the very large effects observed in first and subsequently published trials, respectively, became smaller in meta-analyses that included other trials; the median odds ratio decreased from 11.88 to 4.20 for first trials, and from 10.02 to 2.60 for subsequent trials. For 46 of the 500 selected topics (9.2%; first and subsequent trials) with a very large-effect trial, the meta-analysis maintained very large effects with P < .001 when additional trials were included, but none pertained to mortality-related outcomes. Across the whole CDSR, there was only 1 intervention with large beneficial effects on mortality, P < .001, and no major concerns about the quality of the evidence (for a trial on extracorporeal oxygenation for severe respiratory failure in newborns).
Conclusions: Most large treatment effects emerge from small studies, and when additional trials are performed, the effect sizes become typically much smaller. Well-validated large effects are uncommon and pertain to nonfatal outcomes.

Page 32: Internal 2014 - Cochrane data

BackgroundMethodsResults

So what?

SummaryRelevant and future work

Publication bias

Publication bias was present in a substantial proportion oflarge meta-analyses that were recently published in fourmajor medical journals (BMJ, JAMA, Lancet, and PLOSMedicine between 2008 and 2012).

Publication Bias in Recent Meta-AnalysesMichal Kicinski*

Department of Science, Hasselt University, Hasselt, Belgium

Abstract

Introduction: Positive results have a greater chance of being published and outcomes that are statistically significanthave a greater chance of being fully reported. One consequence of research underreporting is that it may influencethe sample of studies that is available for a meta-analysis. Smaller studies are often characterized by larger effects inpublished meta-analyses, which can be possibly explained by publication bias. We investigated the associationbetween the statistical significance of the results and the probability of being included in recent meta-analyses.Methods: For meta-analyses of clinical trials, we defined the relative risk as the ratio of the probability of includingstatistically significant results favoring the treatment to the probability of including other results. For meta-analyses ofother studies, we defined the relative risk as the ratio of the probability of including biologically plausible statisticallysignificant results to the probability of including other results. We applied a Bayesian selection model for meta-analyses that included at least 30 studies and were published in four major general medical journals (BMJ, JAMA,Lancet, and PLOS Medicine) between 2008 and 2012.Results: We identified 49 meta-analyses. The estimate of the relative risk was greater than one in 42 meta-analyses,greater than two in 16 meta-analyses, greater than three in eight meta-analyses, and greater than five in four meta-analyses. In 10 out of 28 meta-analyses of clinical trials, there was strong evidence that statistically significant resultsfavoring the treatment were more likely to be included. In 4 out of 19 meta-analyses of observational studies, therewas strong evidence that plausible statistically significant outcomes had a higher probability of being included.Conclusions: Publication bias was present in a substantial proportion of large meta-analyses that were recentlypublished in four major medical journals.

Citation: Kicinski M (2013) Publication Bias in Recent Meta-Analyses. PLoS ONE 8(11): e81823. doi:10.1371/journal.pone.0081823

Editor: Daniele Marinazzo, Universiteit Gent, Belgium

Received June 27, 2013; Accepted October 17, 2013; Published November 27, 2013

Copyright: © 2013 Michal Kicinski. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permitsunrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: Michal Kicinski is currently a PhD fellow at the Research Foundation-Flanders (FWO). The funders had no role in study design, data collectionand analysis, decision to publish, or preparation of the manuscript.

Competing interests: The author has declared that no competing interests exist.

* E-mail: [email protected]

Introduction

When some study outcomes are more likely to be publishedthan other, the literature that is available to doctors, scientists,and policy makers provides misleading information. Thetendency to decide to publish a study based on its results hasbeen long acknowledged as a major threat to the validity ofconclusions from medical research[1,2]. During the past 25years, the phenomenon of research underreporting has beenextensively investigated. It is clear that statistically significantresults supporting the hypothesis of the researcher often havea greater chance of being published and fully reported[3–7].

Meta-analysis, a statistical approach to estimate a parameter of interest based on multiple studies, plays an essential role in medical research. One consequence of research underreporting is that it influences the sample of studies that is available for a meta-analysis [8,9]. This causes a bias, unless the process of study selection is modeled correctly [10]. Such modeling requires strong assumptions about the nature of the publication bias, especially when the size of a meta-analysis is not very large and when robust techniques cannot be used [11–13]. As a result, when publication bias occurs, the validity of the meta-analysis is uncertain.

It is well known that smaller studies are often characterized by larger effects in published meta-analyses [14–16]. Publication bias is one of the possible explanations of this phenomenon [17]. Although a meta-analysis is typically preceded by an investigation of the presence of publication bias, the standard detection methods have low power [11,18–22]. Therefore, the sample of included studies may be unrepresentative of the population of all conducted studies even when publication bias has not been detected. In this study, we investigated whether statistically significant outcomes that showed a positive effect of the treatment (in the case of clinical trials) and plausible statistically significant outcomes (in the case of observational studies and interventional studies) had a greater probability of being included in recent meta-analyses than other outcomes. We

PLOS ONE | www.plosone.org 1 November 2013 | Volume 8 | Issue 11 | e81823

Page 33: Internal 2014 - Cochrane data

Future work

Look for publication bias
Examine factors that predict large effect sizes and significant findings (e.g. subanalyses)
Is model choice (FE or RE) driven by the results? (i.e. 'hope' for a significant finding?)
Update our Stata metaan command to include the Bayesian methods (DLb already added)

Page 34: Internal 2014 - Cochrane data

Appendix Thank you!

A Re-Analysis of the Cochrane Library Data: The Dangers of Unobserved Heterogeneity in Meta-Analyses

Evangelos Kontopantelis1,2,3*, David A. Springate1,2, David Reeves1,2

1 Centre for Primary Care, NIHR School for Primary Care Research, Institute of Population Health, University of Manchester, Manchester, United Kingdom, 2 Centre for Biostatistics, Institute of Population Health, University of Manchester, Manchester, United Kingdom, 3 Centre for Health Informatics, Institute of Population Health, University of Manchester, Manchester, United Kingdom

Abstract

Background: Heterogeneity has a key role in meta-analysis methods and can greatly affect conclusions. However, true levels of heterogeneity are unknown and often researchers assume homogeneity. We aim to: a) investigate the prevalence of unobserved heterogeneity and the validity of the assumption of homogeneity; b) assess the performance of various meta-analysis methods; c) apply the findings to published meta-analyses.

Methods and Findings: We accessed 57,397 meta-analyses, available in the Cochrane Library in August 2012. Using simulated data we assessed the performance of various meta-analysis methods in different scenarios. The prevalence of a zero heterogeneity estimate in the simulated scenarios was compared with that in the Cochrane data, to estimate the degree of unobserved heterogeneity in the latter. We re-analysed all meta-analyses using all methods and assessed the sensitivity of the statistical conclusions. Levels of unobserved heterogeneity in the Cochrane data appeared to be high, especially for small meta-analyses. A bootstrapped version of the DerSimonian-Laird approach performed best in both detecting heterogeneity and in returning more accurate overall effect estimates. Re-analysing all meta-analyses with this new method we found that in cases where heterogeneity had originally been detected but ignored, 17–20% of the statistical conclusions changed. Rates were much lower where the original analysis did not detect heterogeneity or took it into account, between 1% and 3%.
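To make the idea of a bootstrapped DerSimonian-Laird estimate concrete, here is a minimal Python sketch. It is one plausible reading (resample studies with replacement, compute the DerSimonian-Laird between-study variance in each resample, average the results) and should not be taken as the authors' implementation. The function names `dl_tau2` and `bootstrap_dl_tau2`, the default of 1000 resamples and the fixed seed are all assumptions made for illustration.

```python
import random


def dl_tau2(effects, variances):
    """DerSimonian-Laird moment estimate of the between-study variance."""
    k = len(effects)
    if k < 2:
        raise ValueError("need at least two studies")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mu_fe = sum(w * y for w, y in zip(weights, effects)) / total
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(w * (y - mu_fe) ** 2 for w, y in zip(weights, effects))
    c = total - sum(w * w for w in weights) / total
    # Negative estimates are truncated at zero
    return max(0.0, (q - (k - 1)) / c)


def bootstrap_dl_tau2(effects, variances, n_boot=1000, seed=0):
    """Mean DL estimate over non-parametric bootstrap resamples of studies.

    Hypothetical sketch: n_boot and seed are illustrative defaults.
    """
    rng = random.Random(seed)
    k = len(effects)
    estimates = []
    for _ in range(n_boot):
        # Draw k study indices with replacement
        idx = [rng.randrange(k) for _ in range(k)]
        estimates.append(
            dl_tau2([effects[i] for i in idx], [variances[i] for i in idx])
        )
    return sum(estimates) / n_boot
```

In this sketch the averaged estimate is zero only if every resample yields a zero estimate, so it returns exactly zero less often than a single truncated DL estimate would.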

Conclusions: When evidence for heterogeneity is lacking, standard practice is to assume homogeneity and apply a simpler fixed-effect meta-analysis. We find that assuming homogeneity often results in a misleading analysis, since heterogeneity is very likely present but undetected. Our new method represents a small improvement but the problem largely remains, especially for very small meta-analyses. One solution is to test the sensitivity of the meta-analysis conclusions to assumed moderate and large degrees of heterogeneity. Equally, whenever heterogeneity is detected, it should not be ignored.
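The sensitivity check suggested above could look roughly like the following Python sketch. It rests on several assumptions that are not in the text: heterogeneity is specified as an assumed I² value, I² is converted to a between-study variance through the Higgins-Thompson "typical" within-study variance, the levels 0.25, 0.50 and 0.75 stand in for low, moderate and large heterogeneity, and a normal-approximation 95% interval is used. The function names are hypothetical.

```python
import math


def typical_variance(variances):
    """Higgins-Thompson 'typical' within-study variance (an assumed choice)."""
    weights = [1.0 / v for v in variances]
    k = len(weights)
    sw = sum(weights)
    sw2 = sum(w * w for w in weights)
    return (k - 1) * sw / (sw ** 2 - sw2)


def pool_with_assumed_i2(effects, variances, i2):
    """Inverse-variance pooled estimate and 95% CI for an assumed I^2."""
    if not 0.0 <= i2 < 1.0:
        raise ValueError("i2 must lie in [0, 1)")
    # I^2 = tau^2 / (tau^2 + s^2)  =>  tau^2 = s^2 * I^2 / (1 - I^2)
    tau2 = typical_variance(variances) * i2 / (1.0 - i2)
    weights = [1.0 / (v + tau2) for v in variances]
    sw = sum(weights)
    mu = sum(w * y for w, y in zip(weights, effects)) / sw
    se = math.sqrt(1.0 / sw)
    return mu, mu - 1.96 * se, mu + 1.96 * se


def sensitivity(effects, variances, i2_levels=(0.0, 0.25, 0.5, 0.75)):
    """Pooled estimate and CI at each assumed heterogeneity level."""
    return {i2: pool_with_assumed_i2(effects, variances, i2) for i2 in i2_levels}
```

For example, `sensitivity([0.2, 0.4, 0.6], [0.04, 0.04, 0.04])` gives an interval that excludes zero when I² is assumed to be 0 but includes zero when I² is assumed to be 0.75, which is the kind of change in statistical conclusion the check is meant to surface.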

Citation: Kontopantelis E, Springate DA, Reeves D (2013) A Re-Analysis of the Cochrane Library Data: The Dangers of Unobserved Heterogeneity in Meta-Analyses. PLoS ONE 8(7): e69930. doi:10.1371/journal.pone.0069930

Editor: Tim Friede, University Medical Center Göttingen, Germany

Received February 20, 2013; Accepted June 13, 2013; Published July 26, 2013

Copyright: © 2013 Kontopantelis et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: EK was partly supported by a National Institute for Health Research (NIHR) School for Primary Care Research fellowship in primary health care. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. No additional external funding received for this study.

Competing Interests: The authors have declared that no competing interests exist.

* E-mail: [email protected]

Introduction

Meta-analysis (MA), the methodologies of synthesising existing evidence to answer a clinical or other research question, is a relatively young and dynamic area of research. The furore of methodological activity reflects the clinical importance of meta-analysis and its potential to provide conclusive answers, rather than incremental knowledge contributions, much more cheaply than a new large Randomised Clinical Trial (RCT).

The best analysis approach is an Individual Patient Data (IPD) meta-analysis, which requires access to patient level data and considerably more effort (to obtain the datasets mainly). However, with IPD data, clinical and methodological heterogeneity, arguably the biggest concern for meta-analysts, can be addressed through patient-level covariate controlling or subgroup analyses when covariate data are not available across all studies.

When the original data are unavailable, researchers have to combine the evidence in a two stage process, retrieving the relevant summary effects statistics from publications and using a suitable meta-analysis model to calculate an overall effect estimate μ̂. Model selection depends on the estimated heterogeneity, or between-study variance, and its presence usually leads to the adoption of a random-effects (RE) model. The alternative, the fixed-effects model (FE), is used when meta-analysts, for theoretical or practical reasons, decide not to adjust for heterogeneity, or have assumed or estimated the between-study variability to be zero. Different approaches exist for combining individual study results into an overall estimate of effect under the fixed- or random-effects assumptions: inverse variance, Mantel-Haenszel and Peto [1].
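The second stage of the process described above can be sketched in Python for the inverse variance approach. This is an illustrative sketch only: the function names `pool_fixed`, `dl_tau2` and `pool_random` are hypothetical, and the random-effects version assumes the DerSimonian-Laird moment estimator for the between-study variance, which is one of several available choices.

```python
import math


def pool_fixed(effects, variances):
    """Fixed-effect inverse-variance pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mu = sum(w * y for w, y in zip(weights, effects)) / total
    return mu, math.sqrt(1.0 / total)


def dl_tau2(effects, variances):
    """DerSimonian-Laird moment estimate of the between-study variance."""
    k = len(effects)
    if k < 2:
        raise ValueError("need at least two studies")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mu_fe, _ = pool_fixed(effects, variances)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(w * (y - mu_fe) ** 2 for w, y in zip(weights, effects))
    c = total - sum(w * w for w in weights) / total
    return max(0.0, (q - (k - 1)) / c)  # truncated at zero


def pool_random(effects, variances):
    """Random-effects pooled estimate, standard error and tau^2 estimate."""
    tau2 = dl_tau2(effects, variances)
    # Each study's variance is inflated by the between-study variance
    mu, se = pool_fixed(effects, [v + tau2 for v in variances])
    return mu, se, tau2
```

When the estimated between-study variance is zero, `pool_random` returns the same estimate and standard error as `pool_fixed`; this is the situation in which heterogeneity can be present yet invisible to the analysis.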

Inverse variance approaches are the most flexible and are suitable for continuous or dichotomous data through a fixed-effect or one of numerous random-effects methods. The DerSimonian and Laird [2] method (DL), a moment-based estimator, is the

PLOS ONE | www.plosone.org 1 July 2013 | Volume 8 | Issue 7 | e69930

This project was supported by the School for Primary Care Research which is funded by the National Institute for Health Research (NIHR). The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.

Comments, suggestions: [email protected]
