Choosing Monitoring Boundaries: Balancing Risks and Benefits

[email protected]

Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania

April 19, 2017

Outline

• What do we monitor?
• How do we monitor it?
• A challenging example
• Some new proposals
• Discussion

What do we monitor?

• Clinical trial architecture is typically defined by a primary efficacy outcome
• A fundamental role of a DSMB is to assess the benefit/risk ratio
• Prior studies can yield a list of potential risks/benefits
  – May be symptoms (nausea, pain, etc.) or may be risks of severe outcomes (elevated stroke, cancer, death)
  – May also have important secondary efficacy endpoints (e.g. fractures in WHI Hormone Replacement Trial)
  – Trial structure is also set up to capture unanticipated adverse events

Some of the challenges to assessing benefit/risk

1. Multivariate outcomes need to be considered
  – These outcomes may be of varying severity
2. Risks may change over time
3. Risks may be infrequent/rare
4. For novel therapies, risks may be largely unknown
5. Expect the unexpected…

1 & 2 imply that in order to evaluate risk/benefit one has to prioritize the outcomes and prioritize the importance of early/late events (explicitly or implicitly, formally or informally)

Two approaches for monitoring risk/benefit

1. Multiple outcomes assessed separately
  – Primary endpoint may have a formal monitoring boundary
  – DSMB is presented with analyses of several separate endpoints: primary, key secondary, important safety outcomes
  – DSMB weighs totality of evidence; a subjective judgment is made for overall balance of risk/benefit
2. A statistic summarizing risk/benefit is assessed
  – Composite endpoint determined prior to start of trial
  – Risk/benefit p-value calculated/compared to a boundary
  – Subjective judgment still needed to weigh totality of evidence

Issues that complicate evaluation of the benefit/risk tradeoff

• Severity of health outcomes affected by the treatment may be very different
  – Assessing overall benefit means giving relative weights to these risks/benefits
  – Patients/clinicians may have differing opinions on these weights
• Frequency of health outcomes affected by the treatment may be very different
  – When does the increased risk of a rare, but serious side effect offset the benefit of a treatment?
• Tolerance of a side effect depends on whether it is in a healthy population or sick population
• Timing of endpoints may differ: early harm, later benefit or vice versa

WHI Example

• Women's Health Initiative (WHI) conducted two hormone therapy (HT) trials
• Trials were unique in the amount of data collected on HT prior to the trial start
  – Expecting 40-50% decrease in heart attacks
  – Observational studies raised concern over increase in breast cancer
• A formal monitoring plan was put into place for both efficacy and harm for both HT trials
  – Considered 8 outcomes of roughly equal importance. Most thought to be related to efficacy
  – Had a global index of benefit/risk (Z=-1) (Wittes et al. 2007; Freedman et al. 1996)

WHI HT monitoring plan

• Primary efficacy endpoint: Coronary heart disease (CHD)
• Primary safety endpoint: invasive breast cancer
• Formal analyses used weighted log-rank statistic to further down-weight early events
  – Motivated by expected early CHD benefit and late BCA harm. Also, drug needed time to have an effect
  – Unweighted statistic useful in case there were early harms: don't want to down-weight them
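The contrast between weighted and unweighted log-rank statistics can be sketched as follows. This is a minimal illustration, not the WHI analysis: the slides do not give the WHI weight function, so `linear_ramp` (weights rising linearly to 1 by a hypothetical `full_weight_at` of 3 years) and the toy data are assumptions made only for the example.

```python
import numpy as np

def weighted_logrank(time, event, group, weight_fn=lambda t: 1.0):
    """Two-sample weighted log-rank Z statistic.

    time      : follow-up time per subject
    event     : 1 if the event was observed, 0 if censored
    group     : 1 for the active arm, 0 for control
    weight_fn : weight applied at each distinct event time
    Positive Z means more events than expected in the active arm.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    group = np.asarray(group, dtype=int)

    num, var = 0.0, 0.0
    for t in np.unique(time[event == 1]):
        at_risk = time >= t
        n = at_risk.sum()                         # all subjects at risk at t
        n1 = (at_risk & (group == 1)).sum()       # active-arm subjects at risk
        died = (time == t) & (event == 1)
        d = died.sum()                            # events at t, both arms
        d1 = (died & (group == 1)).sum()          # events at t, active arm
        w = weight_fn(t)
        num += w * (d1 - d * n1 / n)              # weighted observed - expected
        if n > 1:                                 # hypergeometric variance term
            var += w**2 * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return num / np.sqrt(var)

def linear_ramp(t, full_weight_at=3.0):
    """Hypothetical weight: rises linearly from 0 to 1, down-weighting early events."""
    return min(t / full_weight_at, 1.0)

# Toy data (made up): the active arm has the two earliest events.
time = [1, 2, 3, 4, 5, 6, 7, 8]
event = [1, 1, 1, 0, 1, 1, 0, 1]
group = [1, 1, 0, 1, 0, 0, 1, 0]

z_unweighted = weighted_logrank(time, event, group)
z_weighted = weighted_logrank(time, event, group, weight_fn=linear_ramp)
print(z_unweighted, z_weighted)
```

In this toy data the unweighted statistic is positive (excess early events on the active arm) while the down-weighted statistic is negative, which shows how the choice of weights alone can change the apparent direction of an effect.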

WHI Trial: Unexpected outcomes

• Discrepancy between expected and observed efficacy and safety endpoints
  – Early on, an increased risk of CHD/stroke/PE for active arm emerged in both trials
  – Later on, divergent effects appeared for breast cancer
• Debate ensued over whether and how the safety endpoint should be modified (Wittes et al. 2007)
• Level of significance and direction of effect varied based on weighted vs unweighted log-rank statistics

WHI Example: Lessons learned/affirmed

• Monitoring multivariate outcomes is complex
• Reliably assessing risk and harms means knowing which endpoints are which
• Difficult to rely on a single p-value when considering a multivariate outcome
• Decision-making is ultimately a subjective activity

WHI example highlights monitoring duality

• Pre-specified boundaries protect against inflating p-values by defining risk categories after a difference is observed
• Formal boundaries, however, can "lock thinking" and need to be flexible in the face of unexpected risks
  – A desire to stick to pre-specified boundaries
  – Ironically, statisticians can be quicker to ditch the boundaries than clinical colleagues
• Desire to have a clear, data-driven statistic do the work, but interpretation needs to bring in a global perspective
  – Data from other trials
  – Leanings of other trends in data
  – Uncertainty in assumptions behind monitoring boundary

Need for better statistical approaches to assess benefit/risk?

Usual statistical approaches have some limitations:
• Time to first event ignores subsequent and potentially more severe endpoints
• HR can overemphasize increases in small absolute risk
• HR: precision limited by number of events
• Uno et al. 2015 discuss advantages of risk difference, percentile difference, restricted mean survival difference in non-inferiority trials
• Multiplicity
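A small worked example of the point about relative versus absolute measures. The risks below are hypothetical and chosen only to show the arithmetic; for rare events the risk ratio is used here as a stand-in for the hazard ratio.

```python
def risk_summary(risk_control, risk_treated):
    """Relative and absolute comparisons of two event risks (proportions)."""
    difference = risk_treated - risk_control
    return {
        "risk_ratio": risk_treated / risk_control,
        "risk_difference": difference,
        # Patients treated per one additional event (only meaningful if difference > 0).
        "number_needed_to_harm": 1 / difference if difference > 0 else float("inf"),
    }

# Hypothetical rare side effect: 1 vs 2 events per 10,000 patients.
rare = risk_summary(0.0001, 0.0002)
# Hypothetical common side effect with the same ratio: 10% vs 20%.
common = risk_summary(0.10, 0.20)

print(rare)
print(common)
```

Both scenarios have a ratio of 2, yet the rare event adds about one case per 10,000 patients treated while the common event adds about one per 10, which is why a ratio alone can overstate the importance of a small absolute increase.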

Many recent proposals for assessing benefit/risk

– Win-ratio: Pocock et al. 2012, Finkelstein and Schoenfeld 1999; Bebu and Lachin 2016, Oakes 2016
– Severity ranking: Shaw and Fay 2016
– Total Assessment: Evans et al. 2015 (DOOR), Berry et al. 2013
– Outcome Weighting: Bakal et al. 2013
– Proportion favoring treatment: Buyse 2010
– Joint test: Finkelstein and Schoenfeld 2014

Approaches to assessing benefit/risk

1. Create an aggregate score from a weighted sum of outcomes
  – Interpreted as a global assessment of patient outcome
  – Naturally incorporates multiple events
2. Order outcomes in terms of a preferred importance and rank/classify patients using the highest ordered outcome possible
  – For censored event times often means ranking patients over a common follow-up time
  – Essentially creates a weighted combination of score statistics, where the rates relate to the probability of the events of higher order being observed
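The two approaches can be contrasted on a single patient record. This is only a sketch: the outcome names, the weights in `WEIGHTS`, and the order in `PRIORITY` are hypothetical and are not taken from any trial discussed here.

```python
# Hypothetical severity weights (larger = worse); chosen for illustration only.
WEIGHTS = {"death": 1.0, "stroke": 0.5, "hospitalization": 0.2}

# Hypothetical priority order, most severe outcome first.
PRIORITY = ["death", "stroke", "hospitalization"]

def aggregate_score(events):
    """Approach 1: weighted sum over every event the patient experienced."""
    return sum(WEIGHTS[e] for e in events)

def worst_outcome(events):
    """Approach 2: classify the patient by the highest-priority outcome observed."""
    for outcome in PRIORITY:
        if outcome in events:
            return outcome
    return "none"

# One patient with two hospitalizations and a stroke.
patient_events = ["hospitalization", "hospitalization", "stroke"]
score = aggregate_score(patient_events)   # counts all three events
category = worst_outcome(patient_events)  # keeps only the most severe one
print(score, category)
```

The weighted sum uses every event, including repeats, whereas the priority classification discards everything below the most severe observed outcome.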

Differing opinions on whether to create separate safety and efficacy composites

• Evans and Follmann 2016 advocate a unified composite of benefit and risk as a pragmatic endpoint of effectiveness
• Kip et al. 2008 recommend against lumping: combining safety and efficacy limits interpretability in the setting of cardiovascular disease
  • Frequently dominated by a subclass of endpoints
  • Too susceptible to providing misleading evidence
• "Although numerous approaches and frameworks have been proposed in recent years, there is no single approach or framework that can be applied and utilized in every setting." (Ch 8, Jiang, He 2016)

Win ratio (Pocock et al. 2012)

• Patients in treatment and control groups are placed into matched pairs according to their risk profiles
• Determine prioritization of outcomes
  – Example with two endpoints, death or MI hospitalization: consider time to death first, then time to hospitalization
• Within each pair, a tx subject is labeled a winner or loser using the outcome of highest priority possible
  – Compare time to death if possible; otherwise compare time to hospitalization; otherwise tied
• The win ratio is the ratio of wins to losses for the treatment arm
  – P-value and CI are readily obtainable
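The pairwise rule above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the published implementation: the dict layout (`fu`, `death`, `hosp`), the function names, and the four toy pairs are all hypothetical, and each pair is compared only over its shared follow-up time.

```python
def compare_pair(tx, ctl):
    """Label the treated member of a matched pair 'win', 'loss', or 'tie'.

    Each patient is a dict with follow-up time 'fu' and event times
    'death' and 'hosp' (None if the event was not observed).
    """
    horizon = min(tx["fu"], ctl["fu"])           # shared follow-up for the pair
    for key in ("death", "hosp"):                # highest priority first
        t_tx = tx[key] if tx[key] is not None and tx[key] <= horizon else None
        t_ctl = ctl[key] if ctl[key] is not None and ctl[key] <= horizon else None
        if t_tx is None and t_ctl is None:
            continue                             # undecided, try the next outcome
        if t_tx is None:
            return "win"                         # only the control had the event
        if t_ctl is None:
            return "loss"                        # only the treated had the event
        if t_tx != t_ctl:
            return "win" if t_tx > t_ctl else "loss"   # later event is better
    return "tie"

def win_ratio(pairs):
    """Return (wins, losses, wins / losses) for a list of (tx, ctl) pairs."""
    results = [compare_pair(tx, ctl) for tx, ctl in pairs]
    wins, losses = results.count("win"), results.count("loss")
    return wins, losses, (wins / losses if losses else float("inf"))

# Toy matched pairs (made up); times in years.
pairs = [
    ({"fu": 4, "death": None, "hosp": 1},    {"fu": 2, "death": 2, "hosp": None}),
    ({"fu": 1, "death": 1, "hosp": None},    {"fu": 3, "death": None, "hosp": None}),
    ({"fu": 3, "death": None, "hosp": 2},    {"fu": 3, "death": None, "hosp": 1}),
    ({"fu": 2, "death": None, "hosp": None}, {"fu": 2, "death": None, "hosp": None}),
]
wins, losses, wr = win_ratio(pairs)
print(wins, losses, wr)
```

In the first pair the treated patient's earlier hospitalization is ignored because the pair is already decided on death, which is the sense in which more severe events take priority.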

Useful features of win ratio

• Can consider all observed events on a patient
  – Allows more severe events to have higher priority
  – Particularly useful in cases where first event is expected to be the less severe event
• Potentially higher power than any single endpoint
  – Particularly if treatment effect similar across endpoints
• Easy to calculate and make inference (Pocock et al. 2012; Bebu and Lachin 2016; Oakes 2016)
  – Unpaired version is available using a U-statistic derived from all possible tx-control pairs

Win ratio example: The SOLVD trial (NEJM 1991)

Background
• SOLVD included a RCT of a novel treatment for prevention of mortality/hospitalization in patients with congestive heart failure (CHF) and weak left ventricle ejection fraction (EF)
• In 1986-89, 2569 patients randomized to enalapril or placebo
• Enalapril found beneficial for mortality (p=0.0036) and time to first hospitalization/death (p<0.0001)

Analysis
• Considered a subset of 662 diabetic subjects
• Compute usual time-to-first (TTF) endpoint
• Compute win ratio for control-treatment patient pairs formed using a baseline Cox model risk score for death

SOLVD Trial: Time-to-first analysis

[Figure: time-to-first analysis for the SOLVD subset; plot not recovered]

SOLVD: Win ratio

• 343 on placebo arm, 319 on active arm
  – 24 patients go unused in the paired analysis
• 145 wins on active; 112 wins on placebo
• WR = 145/112 = 1.29 (p=0.038)
  – 189 ranked on death: 98 wins for active, 91 wins for placebo
  – 68 ranked on hospitalization: 47 wins active; 21 wins placebo
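The counts on this slide can be checked with a few lines of arithmetic. The slide does not say which test produced p = 0.038; the sketch below assumes a normal-approximation test on the proportion of wins among decided pairs, and the number of tied pairs is derived here rather than reported.

```python
from math import erf, sqrt

n_placebo, n_active = 343, 319
wins_active, wins_placebo = 145, 112

pairs = min(n_placebo, n_active)            # one matched pair per active patient
unused = abs(n_placebo - n_active)          # patients left without a partner
decided = wins_active + wins_placebo        # pairs with a winner
ties = pairs - decided                      # derived, not stated on the slide

win_ratio = wins_active / wins_placebo

# Assumed test: is the proportion of active-arm wins different from 1/2?
p_win = wins_active / decided
z = (p_win - 0.5) / sqrt(p_win * (1 - p_win) / decided)
normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
p_value = 2 * (1 - normal_cdf)              # two-sided

print(unused, decided, ties)
print(round(win_ratio, 2), round(z, 2), round(p_value, 3))
```

Under that assumption the calculation agrees with the reported WR of 1.29 and p-value of 0.038, and the component counts are internally consistent (98 + 47 = 145 wins for active, 91 + 21 = 112 for placebo).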

A few key points about win ratio

• The parameter the WR estimates depends on the censoring distributions of the endpoint
  – Important consideration if there are early and late risks
• Trials of different lengths will generally be estimating a different effect
• When patients have varying follow-up lengths the WR becomes more difficult to interpret
  – SOLVD follow-up: 1 day to 4.6 years in example
• If death determines severity, then is ranking by other less severe endpoints gaining information or a means of potential misclassification?

Win ratio: Gaining information from hospitalization or misclassifying?

[Figure: follow-up timelines for Patients 1 and 2, with hospitalizations (H) and the censoring time marked]

Patient 1 died at 3 years; Patient 2 censored at 2.5 years; died at 4 years

The true state of information here is that the severity of patient 1 relative to patient 2 is interval censored.
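The example can be made concrete by treating each death time as an interval of what is actually known. This is an illustrative sketch: the death and censoring times come from the slide, but the hospitalization times `hosp_1` and `hosp_2` are hypothetical because the figure's values were not recovered.

```python
INF = float("inf")

def death_order(interval_1, interval_2):
    """Compare two death times given as (lower, upper) bounds.

    Returns 1 or 2 for the patient known to die first, or None when the
    intervals overlap and the order cannot be determined.
    """
    if interval_1[1] < interval_2[0]:
        return 1
    if interval_2[1] < interval_1[0]:
        return 2
    return None

# What the data show: Patient 1 died at 3 years; Patient 2 was alive at 2.5 years.
observed = death_order((3.0, 3.0), (2.5, INF))

# What actually happened: Patient 2 died at 4 years.
truth = death_order((3.0, 3.0), (4.0, 4.0))

# Hypothetical hospitalization times (years), used only to break the tie on death.
hosp_1, hosp_2 = 2.0, 1.0
worse_by_hospitalization = 1 if hosp_1 < hosp_2 else 2

print(observed, truth, worse_by_hospitalization)
```

With these assumed hospitalization times, the tie-break ranks Patient 2 as worse even though Patient 1 truly died first, which is the misclassification the slide title asks about.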

Clinical severity ranking (Shaw and Fay, SIM 2016)

• Rank individuals according to clinical severity, using information on both the surrogate and true endpoint
  – Ranking function of the two event times can vary by setting
• Setting of interest: XDR-TB: sputum conversion/death
  – Rank patients by time of death if observed
    • Earlier is worse
  – Rank time to sputum conversion for the survivors
    • Earlier is better
  – Conversion time irrelevant if patient later dies
• Perform two sample test on an interval-censored clinical severity which incorporates bivariate survival information

Shaw and Fay, SIM 2016

[Figure: "Ranking Values: Worst to Best". Grid of severity ranks with time to surrogate on the horizontal axis and time to death on the vertical axis (both 0 to 20, then ∞). Patients who die at time 1 through 20 receive ranks 1 through 20; survivors receive ranks from 41 (earliest surrogate time) down to 21 (surrogate never observed).]

Grey box: Severity score for a patient who converted in weeks 6-8, inclusive, but dropped out after week 16. Interval censoring in disjoint intervals.
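The grey-box example can be worked through with a scoring function read off the grid values in the figure. This is a reconstruction under assumptions, not the authors' code: it assumes a 20-week horizon, that a death in week d scores d, and that survivors score from 41 (conversion in week 1) down to 21 (no conversion); the function names are hypothetical.

```python
def severity_score(death_week=None, conversion_week=None, horizon=20):
    """Severity rank, worst (1) to best (2 * horizon + 1)."""
    if death_week is not None:
        return death_week                      # earlier death is worse
    if conversion_week is None:
        return horizon + 1                     # survived, never converted
    return 2 * horizon + 2 - conversion_week   # earlier conversion is better

def possible_scores(conversion_weeks, last_seen_week, horizon=20):
    """All scores consistent with a patient who dropped out alive."""
    # If the patient died after dropout, the score is the (unknown) death week.
    if_died = {severity_score(death_week=w, horizon=horizon)
               for w in range(last_seen_week + 1, horizon + 1)}
    # If the patient survived, the score depends on the conversion week.
    if_survived = {severity_score(conversion_week=c, horizon=horizon)
                   for c in conversion_weeks}
    return sorted(if_died | if_survived)

# Converted somewhere in weeks 6-8, last seen alive at week 16.
scores = possible_scores(conversion_weeks=range(6, 9), last_seen_week=16)
print(scores)
```

Under these assumptions the consistent scores fall into two separate blocks, 17 to 20 if the patient died after dropout and 34 to 36 if the patient survived, which is what "interval censoring in disjoint intervals" refers to.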

Further musings on tests of severity using joint survival distribution

• Take advantage of all the information regarding the survival time (not limited to common follow-up times for pairs)
• Including the true uncertainty about the severity of a patient
• Test statistic for composite still has the problem that the parameter estimated depends on length of trial
  – Shaw and Fay showed that the resulting test statistic is a weighted sum of a test statistic on death and on the surrogate

DOOR Ranking (Evans et al. 2015)

• The possible clinical outcomes of a patient are ranked from most preferred to least preferred
  – Rate all possible clinical paths on an ordinal scale
  – Rating/ranking can be done by expert clinical panel, potentially also including patients
  – Then a U-type statistic could be used to examine whether the outcome for a patient on tx is better than that for a patient on control
• Proposed that a blinded adjudication committee could evaluate clinical severity based on patient chart
  – Not practical for larger trials, nor very reproducible
• Similar ideas discussed by a number of authors, including Chuang-Stein et al. 1991

DOOR hypothetical example (Evans et al. 2015)

1. Clinical benefit without AE
2. Clinical benefit with AE
3. Survival w/o clinical benefit or AE
4. Survival w/o clinical benefit + AE
5. Death

• In setting of anti-infective, break ties using length of antibiotic regimen (DOOR/RADAR)
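One common form of the U-type comparison on such an ordinal scale is the probability that a randomly chosen treated patient has a more desirable category than a randomly chosen control, counting ties as one half. The sketch below assumes that form and uses made-up category assignments on the five-level scale above (1 = most desirable).

```python
from itertools import product

def door_probability(tx_ranks, ctl_ranks):
    """Estimate P(tx better) + 0.5 * P(tie) over all tx-control pairs.

    Ranks are ordinal categories where a lower number is more desirable.
    """
    total = 0.0
    for tx, ctl in product(tx_ranks, ctl_ranks):
        if tx < ctl:
            total += 1.0       # treated patient has the more desirable outcome
        elif tx == ctl:
            total += 0.5       # tie counts as half
    return total / (len(tx_ranks) * len(ctl_ranks))

# Made-up category assignments for five patients per arm.
tx_ranks = [1, 1, 2, 3, 5]
ctl_ranks = [1, 2, 3, 4, 5]

prob = door_probability(tx_ranks, ctl_ranks)
print(prob)
```

A value of 0.5 would indicate no difference between arms; here 15.5 of the 25 pairwise comparisons favor treatment, giving 0.62. With only five categories many pairs tie, which is the loss of information noted for ordinal scales.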

DOOR: Advantages and limitations

Advantages
– Simple and intuitive measure
– Ranking cognitively easier than weighting
– Can incorporate different ranking systems

Limitations
– Varying length of follow-up can be a challenge
– Loss of information through ties can be a problem for an ordinal scale
– Will be difficult to adapt to unexpected benefits or risks. Would need to reconvene outcome ranking panel

• Perhaps best used alongside individual components for interpretation

Some pragmatic considerations

• For composite endpoints: create group(s) based on similar severity
  – Some settings may want to pool safety and risk for net clinical benefit
  – Added interpretability if individual outcomes occur with similar frequency
• Sensitivity analysis to see impact of value systems
  – If using outcome weighting, can be used to identify the "value breaking point"
• Practice run decision scenarios: valuable exercise to hone the needed value judgements (some can be pre-specified) and statistical decision boundaries
• Clear presentation and visualization of data (estimates) for DSMB report will aid in assessment of totality of evidence
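A sensitivity analysis over value systems can be as simple as sweeping the weight given to harm and noting where the conclusion flips. The sketch below assumes a linear net-benefit score with a single benefit and a single harm; the rate differences and function names are hypothetical.

```python
def net_benefit(benefit_diff, harm_diff, harm_weight):
    """Net benefit per patient: events prevented minus weighted events caused."""
    return benefit_diff - harm_weight * harm_diff

def breaking_point(benefit_diff, harm_diff):
    """Harm weight at which net benefit is exactly zero."""
    return benefit_diff / harm_diff

# Hypothetical absolute rate differences (treatment vs control).
benefit_diff = 0.030   # e.g. 3 fewer efficacy events per 100 patients
harm_diff = 0.010      # e.g. 1 more serious adverse event per 100 patients

threshold = breaking_point(benefit_diff, harm_diff)
sweep = {w: net_benefit(benefit_diff, harm_diff, w) for w in (1, 2, 4, 5)}

print(threshold)
for weight, value in sweep.items():
    print(weight, "favors treatment" if value > 0 else "favors control")
```

In this example treatment is favored as long as one harm event is valued at less than about three efficacy events; reporting that threshold lets a DSMB judge the result against its own weights rather than a single pre-chosen set.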

Conclusions

• No one approach will work for every setting
• Good to remember all approaches involve subjectivity
• Specific endpoint + composites that summarize effect on multiple endpoints seems like a flexible and powerful combination
• Statistical properties of composites need rigorous examination and thorough numerical investigation before start of trial for expected scenarios
• Practice run decision scenarios: valuable exercise to hone the needed value judgements and statistical decision boundaries
• A priori development of risk-benefit statistic and boundary is a useful decision tool but cannot be prescriptive

Thank you!

References

1. Bakal JA, Westerhout CM, Cantor WJ, Fernández-Avilés F, Welsh RC, Fitchett D, Goodman SG, Armstrong PW. Evaluation of early percutaneous coronary intervention vs. standard therapy after fibrinolysis for ST-segment elevation myocardial infarction: contribution of weighting the composite endpoint. European Heart Journal. 2013 Mar 21;34(12):903-8.
2. Bebu I, Lachin JM. Large sample inference for a win ratio analysis of a composite outcome based on prioritized components. Biostatistics. 2016 Jan 1;17(1):178-87.
3. Berry JD, Miller R, Moore DH, Cudkowicz ME, Van Den Berg LH, Kerr DA, Dong Y, Ingersoll EW, Archibald D. The Combined Assessment of Function and Survival (CAFS): a new endpoint for ALS clinical trials. Amyotrophic Lateral Sclerosis and Frontotemporal Degeneration. 2013 Apr 1;14(3):162-8.
4. Buyse M. Generalized pairwise comparisons of prioritized outcomes in the two-sample problem. Statistics in Medicine. 2010 Dec 30;29(30):3245-57.
5. Chuang-Stein C, Mohberg NR, Sinkula MS. Three measures for simultaneously evaluating benefits and risks using categorical data from clinical trials. Statistics in Medicine. 1991 Sep 1;10(9):1349-59.
6. Evans SR, Follmann D. Using Outcomes to Analyze Patients Rather than Patients to Analyze Outcomes: A Step Toward Pragmatism in Benefit:Risk Evaluation. Statistics in Biopharmaceutical Research. 2016 Oct 1;8(4):386-93.
7. Evans SR, Rubin D, Follmann D, Pennello G, Huskins WC, Powers JH, Schoenfeld D, Chuang-Stein C, Cosgrove SE, Fowler VG, Lautenbach E. Desirability of outcome ranking (DOOR) and response adjusted for duration of antibiotic risk (RADAR). Clinical Infectious Diseases. 2015 Jun 25:civ495.
8. Finkelstein DM, Schoenfeld DA. Combining mortality and longitudinal measures in clinical trials. Statistics in Medicine. 1999 Jun 15;18(11):1341-54.
9. Finkelstein DM, Schoenfeld DA. A joint test for progression and survival with interval-censored data from a cancer clinical trial. Statistics in Medicine. 2014 May 30;33(12):1981-9.
10. Freedman L, Anderson G, Kipnis V, Prentice R, Wang CY, Rossouw J, Wittes J, DeMets D. Approaches to monitoring the results of long-term disease prevention trials: examples from the Women's Health Initiative. Controlled Clinical Trials. 1996 Dec 1;17(6):509-25.
11. Jiang Q, He W, editors. Benefit-Risk Assessment Methods in Medical Product Development: Bridging Qualitative and Quantitative Assessments. Chapman and Hall/CRC; 2016 Jul 7.
12. Kip KE, Hollabaugh K, Marroquin OC, Williams DO. The problem with composite endpoints in cardiovascular studies: the story of major adverse cardiac events and percutaneous coronary intervention. Journal of the American College of Cardiology. 2008 Feb 19;51(7):701-7.
13. Oakes D. On the win-ratio statistic in clinical trials with multiple types of event. Biometrika. 2016 Jul 25:asw026.
14. Pocock SJ, Ariti CA, Collier TJ, Wang D. The win ratio: a new approach to the analysis of composite endpoints in clinical trials based on clinical priorities. European Heart Journal. 2012 Jan 1;33(2):176-82.
15. Shaw PA, Fay MP. A rank test for bivariate time-to-event outcomes when one event is a surrogate. Statistics in Medicine. 2016 Aug 30;35(19):3413-23.
16. Uno H, Wittes J, Fu H, Solomon SD, Claggett B, Tian L, Cai T, Pfeffer MA, Evans SR, Wei LJ. Alternatives to hazard ratios for comparing the efficacy or safety of therapies in noninferiority studies. Annals of Internal Medicine. 2015 Jul 21;163(2):127-34.
17. Wittes J, Barrett-Connor E, Braunwald E, Chesney M, Cohen HJ, DeMets D, Dunn L, Dwyer J, Heaney RP, Vogel V, Walters L. Monitoring the randomized trials of the Women's Health Initiative: the experience of the Data and Safety Monitoring Board. Clinical Trials. 2007 Jun 1;4(3):218-34.
