Data Abstraction Assistant (DAA): A new tool for data abstraction during systematic reviews
Ian Saldanha, MBBS, MPH, PhD
May 16, 2017
JHU CHSOR Seminar/EPC Journal Club
Collaborators
• Dr. Tianjing Li (PI, JHU)
• Dr. Christopher Schmid (PI of the subcontract to Brown U)
• Dr. Ian Saldanha (Project Director, JHU)
• Mr. Bryant Smith (SRDR Coordinator, Brown U)
• Mr. Jens Jap (Programmer, Brown U)
• Mr. Joseph Canner (Data Manager, JHU)
Other Steering Committee Members
• Dr. Jesse Berlin (JnJ)
• Ms. Vernal Branch
• Ms. Simona Carini (UCSF)
• Dr. Wiley Chan (Kaiser Permanente)
• Dr. Kay Dickersin (JHU)
• Dr. Susan Hutfless (JHU)
• Dr. Ida Sim (UCSF)
• Dr. M. Hassan Murad (Mayo Clinic)
• Ms. Elizabeth Whamond
• Dr. Byron Wallace (Northeastern U)
Background
• Data abstraction – Key systematic review step!
• Three main approaches
  – Single abstraction
  – Single abstraction plus verification
  – Dual independent abstraction plus adjudication
Problem 1 - Inaccuracy
• Single abstraction ~30% error rate [1]
• Single abstraction plus verification
  – 20% more errors than independent dual abstraction [2]
  – Errors highest for numerical results [3]
  – Errors higher for less experienced abstractors [3]
• Errors in published systematic reviews
  – ≥1 error in 37% of meta-analyses [4]
  – ≥1 error in 48% of meta-analyses [5]

1. Horton J et al. J Clin Epidemiol. 2010
2. Buscemi N et al. J Clin Epidemiol. 2006
3. Gresham G et al. Cochrane Colloquium 2014. (Abstract)
4. Gøtzsche PC et al. JAMA 2007
5. Jones AP et al. J Clin Epidemiol. 2005
Problem 2 - Inefficiency
• Time
  – ~50 minutes per article per abstractor [1]
  – Additional time for data adjudication
• 50% more time for independent dual abstraction plus adjudication than single abstraction plus verification [2]

1. Gresham G et al. Cochrane Colloquium 2014. (Abstract)
2. Buscemi N et al. J Clin Epidemiol. 2006
Problem 3 - Inconsistent recommendations
“As a minimum, information that involves subjective interpretation and information that is critical to the interpretation of results (e.g. outcome data) should be extracted independently by at least two people.”
“Ideally two researchers should independently perform the data extraction…. As an accepted minimum, one researcher can extract the data with a second researcher independently checking the data...”
“At a minimum, use two or more researchers, working independently, to extract quantitative and other critical data…. For other types of data, one individual could extract the data while the second individual independently checks...”
How does DAA work?
1. View study articles (PDFs) side by side with data abstraction forms (e.g., in Systematic Review Data Repository [SRDR]).
2. Highlight (or “pin”) the location of text in the PDF.
3. Copy text from the PDF into the data abstraction form.
[Screenshot of DAA with callouts]
1. Once document is selected, PDF will load on right half of screen in this area
2. Drag and drop flag onto PDF text
3. Click on flag to view pinned locations
4. Click on text to automatically select the PDF and scroll to the marked text.
https://www.youtube.com/watch?v=5dGIL6jltYQ&feature=youtu.be
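The pin-and-copy workflow above can be pictured as a small data structure that links each form field to the PDF locations it was taken from. This is only an illustrative sketch and not DAA's or SRDR's actual implementation: the names `Pin`, `AbstractionForm`, `pin`, and `locations`, and the example file name and text, are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Pin:
    """Hypothetical flag linking one form field to a location in a PDF."""
    form_field: str  # name of the data abstraction form item
    pdf_name: str    # which article PDF the text came from
    page: int        # 1-based page number
    text: str        # the highlighted text


@dataclass
class AbstractionForm:
    """Form values plus the pins recording where each value came from."""
    values: dict = field(default_factory=dict)
    pins: dict = field(default_factory=dict)

    def pin(self, form_field, pdf_name, page, text, copy_text=True):
        """Flag a PDF location for a field; optionally copy its text in."""
        self.pins.setdefault(form_field, []).append(
            Pin(form_field, pdf_name, page, text))
        if copy_text:
            self.values[form_field] = text

    def locations(self, form_field):
        """Return (pdf_name, page) for every pin on a field, in pin order."""
        return [(p.pdf_name, p.page) for p in self.pins.get(form_field, [])]


# Made-up example: two pins supporting the same form field.
form = AbstractionForm()
form.pin("sample_size", "trial_01.pdf", 3, "120 participants were randomized")
form.pin("sample_size", "trial_01.pdf", 7, "Table 2: N = 120", copy_text=False)
print(form.values["sample_size"])
print(form.locations("sample_size"))
```

Keeping the pins next to the values is what would let a verifier, or someone updating the review later, jump from a form entry back to its source text.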
How might DAA help?
• Reduce errors
• Reduce time taken for data abstraction
• Facilitate the tracking of abstracted information
  – Help with updating systematic reviews
Will it work?
• Randomized crossover trial, entirely online
  A. DAA-facilitated single abstraction plus verification
  B. Traditional single data abstraction plus verification
  C. Traditional independent dual data abstraction
• Outcomes
  – Accuracy of abstraction (i.e., error rates)
  – Efficiency of abstraction (i.e., time taken)
DAA Trial – Eligibility Criteria
• Age ≥20 years
• Have abstracted data from ≥1 article for a systematic review in any field
• Self-reported comfort with scientific articles in English
DAA Trial – Sample Size
• 48 data abstractors (24 pairs)
• Each pair: one LESS experienced abstractor (published <3 systematic reviews) & one MORE experienced abstractor (published ≥3 systematic reviews)
DAA Trial – Data abstraction
• Data abstraction forms for 48 pre-identified clinical trial articles
• After training in SRDR and DAA, each pair abstracts data from 6 articles, two under each approach (A, B, and C) in a randomly assigned sequence
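One way to picture "two articles under each approach in a randomly assigned sequence" is the sketch below. It assumes a plain shuffle of the six slots for each pair; the slides do not say how sequences were actually generated, so the function name `assign_sequences`, the seed, and the shuffle itself are illustrative assumptions rather than the trial's procedure.

```python
import random

APPROACHES = ["A", "B", "C"]  # the three abstraction approaches in the trial


def assign_sequences(n_pairs=24, articles_per_approach=2, seed=2017):
    """Give each pair a random order of approaches (assumed simple shuffle)."""
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    sequences = {}
    for pair_id in range(1, n_pairs + 1):
        order = APPROACHES * articles_per_approach  # A, B, C, A, B, C
        rng.shuffle(order)
        sequences[pair_id] = order
    return sequences


sequences = assign_sequences()
print(sequences[1])  # order of approaches for pair 1's six articles
```

A real crossover design might instead balance sequences across pairs (for example, with a Latin square), which a plain shuffle does not guarantee.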
DAA Trial – Outcome 1
Error rates
• Abstraction by PI and Project Director will be the reference standard (“answer key”)
• No open-ended text questions used for analysis
• For a given data item, any difference between abstracted data and answer key is an error.
  – Determined using a computer program
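The item-by-item comparison against the answer key could look roughly like the sketch below. It is not the trial's actual program: the function names, the example items, and in particular the whitespace and case normalization of text answers are assumptions made for illustration.

```python
def normalize(value):
    """Assumed normalization: trim and lowercase text; leave other types as-is."""
    if isinstance(value, str):
        return value.strip().lower()
    return value


def error_rate(abstracted, answer_key):
    """Fraction of answer-key items whose abstracted value differs.

    A missing item counts as an error because None never equals the key.
    """
    if not answer_key:
        raise ValueError("answer key is empty")
    errors = sum(
        1
        for item, correct in answer_key.items()
        if normalize(abstracted.get(item)) != normalize(correct)
    )
    return errors / len(answer_key)


# Made-up example data, not taken from the trial.
answer_key = {"design": "parallel", "n_randomized": 120,
              "mean_age": 54.2, "blinded": "yes"}
abstracted = {"design": "Parallel ", "n_randomized": 102, "mean_age": 54.2}

print(error_rate(abstracted, answer_key))  # 0.5: one wrong value, one missing
```

Restricting the comparison to closed-ended items, as the slide describes, is what makes an exact-match rule like this workable.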
DAA Trial – Outcome 2
Time taken for data abstraction
• Both initial abstraction and data verification/adjudication
• Measurement
  – Automatically recorded by SRDR timer
  – Self-recorded by abstractors
    • In SRDR
    • A Qualtrics survey
DAA Trial – Minimizing bias
Potential source: Strategy
• Sequence generation: Computer-generated random numbers (1 to 24)
• Allocation concealment: Project Director emails the Senior Statistician
• Masking/Blinding: Not feasible for abstractors or Project Director. Both outcomes (time and errors) are objective.
• Incomplete outcome data: Active follow-up. Two methods for measuring time.
• Other: Washout not needed (no anticipated carryover effect). Analysis accounts for sequence.
Statistical Analysis for Error Rates
• Question type (Design/Baseline/Results)
• Data abstraction approach (A/B/C)
• Data abstraction sequence 1-6
• Systematic review 1-4
• Abstractor pair 1-24
• Article 1-48
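The slide lists the factors but not the model. One plausible way to combine them, offered purely as an assumption, is a mixed-effects logistic regression with fixed effects for approach, question type, sequence position, and systematic review, and random intercepts for abstractor pair and article. Here y_{ijk} = 1 if pair j's abstraction of data item i in article k differs from the answer key; the index functions a, q, s, r and all coefficient symbols are notation introduced for this sketch (it uses amsmath for \operatorname and \text).

```latex
\operatorname{logit}\,\Pr(y_{ijk} = 1)
  = \beta_0
  + \beta^{\text{approach}}_{a(j,k)}
  + \beta^{\text{type}}_{q(i)}
  + \beta^{\text{seq}}_{s(j,k)}
  + \beta^{\text{review}}_{r(k)}
  + u_j + v_k,
\qquad
u_j \sim \mathcal{N}(0, \sigma_u^2), \quad
v_k \sim \mathcal{N}(0, \sigma_v^2)
```

Under this formulation, the contrasts among the three approach coefficients would carry the comparison of error rates between approaches A, B, and C.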
DAA Trial – Where are we?
• All data abstraction completed
• Data cleaning is ongoing
• Data analysis over the summer
• Results expected – September 2017
DAA Trial – Exit Poll Results
• All 52 abstractors completed the exit poll
• 83% found DAA overall “very easy” or “somewhat easy” to use
• 74% were “very likely” or “somewhat likely” to use DAA in the future
• 84% were “very likely” or “somewhat likely” to recommend that others use DAA in the future
• 62% named the “ability to click on flags” as their favorite DAA feature
Implications
• Novel software tool
  – Designed to work with various data systems
  – Tested in a rigorous randomized trial
• DAA trial could:
  – Inform recommendations for data abstraction
  – Promote more robust evidence-based decisions
  – Save our enterprise resources