Do Open Access Articles Have Greater Citation Impact?

A critical review of the literature

Iain D. Craig,1 Andrew M. Plume,2 Marie E. McVeigh,3 James Pringle3 and Mayur Amin2

1 Wiley-Blackwell, 9600 Garsington Road, Oxford, OX4 2DQ, UK

2 Elsevier, The Boulevard, Langford Lane, Kidlington, Oxford, OX5 1GB, UK

3 Thomson Scientific, 3501 Market Street, Philadelphia, PA 19104, USA

NOTICE: This is the author’s version of a work accepted for publication by Elsevier. Changes resulting from the publishing process, including peer review, editing, corrections, structural formatting and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Journal of Informetrics, Volume 1, Issue 3, July 2007, pp 239-248, http://dx.doi.org/10.1016/j.joi.2007.04.001

Contents

Executive overview

Introduction

Methodological issues in citation analysis

Correlations of online availability and increased citations

Correlations of Open Access and increased citations

When should citation counting begin?

Deconstructing the Open Access citation effect

What does Open Access mean for individual authors?

Conclusions

Acknowledgements

Glossary

Executive overview

1. The last few years have seen the emergence of several Open Access options in scholarly communication which can broadly be grouped into two areas referred to as ‘Gold’ and ‘Green’ Open Access (OA). In this article we review the literature examining the relationship between OA status and citation counts of scholarly articles, and take no position on the relative value or sustainability of these communication models.

2. Early studies showed a correlation between the free online availability or OA status of articles and higher citation counts.

3. The authors of many of these studies implied that this correlation was causal, without due consideration of potential confounding factors.

4. More recent investigations have applied sophisticated bibliometric methods to dissect the nature of the relationship between article OA status and citations.

5. Three non-exclusive postulates have been proposed to account for the observed citation differences between OA and non-OA articles: an Open Access postulate, a Selection Bias postulate, and an Early View postulate.

a. The Open Access (OA) postulate suggests that authors are more likely to read, and thus cite, articles that are made available in an OA model.

b. The Selection Bias (SB) postulate suggests that the most prominent (and thus most citable) authors are more likely to make their articles available in an OA model, and that they are more likely to do so with their most important (and thus most citable) articles.

c. The Early View (EV) postulate relates only to articles posted before final journal publication, and suggests that the period between the early posting of an article (either pre-print or post-print) and the appearance of the cognate published journal article allows for earlier accrual of citations. Failing to account for this effect must necessarily give a biased result.

6. The most rigorous study to date, conducted in the field of condensed matter physics, showed that after controlling for a clearly demonstrated Early View postulate, the remaining difference in citation counts between OA and non-OA articles is explained by the Selection Bias postulate. No evidence was found to support the OA postulate per se; i.e. article OA status alone has little or no effect on citations.

7. As citation practices vary widely by discipline, further studies using a similarly rigorous approach are required to determine the generality of this finding in other fields of research. Such studies must account for the heterogeneous distribution of citations across any group of articles and establish the date of earliest availability of each article in the study, as citation accumulation is time sensitive.

Introduction

With the advent of the internet and electronic publishing, new models of scholarly communication have emerged that simultaneously complement and challenge established systems. Although the term ‘Open Access’ (OA) is taken broadly to mean that accessing, downloading, and reading material is free to the entire population of internet users, several options for the provision of that access have emerged. These can be grouped into two broad models: ‘Gold’ and ‘Green’. The Gold model uses a traditional journal publication system, but shifts the economic/financial model. Instead of a subscriber paying to read the final version of a peer-reviewed article, an author or sponsor pays to publish the article, and reading the article is free to anyone wishing to do so. A journal may operate wholly under this model, or may use a hybrid model combining subscription and article sponsorship. The Green model of OA relies on posting the author’s manuscript of an article into an institutional or subject-based electronic archive, either in the form of a pre-print (as submitted to a journal for peer review) or as a final copy of the peer-reviewed edited full text (a post-print).¹

A less rigorous but increasingly common form of archiving is the use of individual author web pages, outside of a structured archive. Articles that are posted as pre-prints may be subsequently accepted for publication in a journal and may then also be archived as post-prints, sometimes, but not universally, replacing the pre-print version. Economically, Green OA relies on the sustainability of the existing journal system as, unlike Gold OA, it does not provide any financial support for journals.

An increasing amount of research on the effects of OA models on scholarly communication has emerged in recent years, and it is clear that the methods for performing this research have been developing alongside the new models that they study. One of the foremost questions asked is: ‘Do Open Access research articles have a greater citation impact?’ Another way of asking this question at the most personal level for the authors of journal articles is: ‘Will my research paper(s), and therefore will I, get a citation benefit from the Gold and Green Open Access models?’ In this article we survey the original research literature on this topic to date, with a particular emphasis on methodological issues, and highlight areas in which further research is required. As this is an evolving area of research, the terminology has yet to become fixed; we follow the terminology used by the authors of each article under discussion.

1 Although sometimes Green OA is used in a narrower context to refer to post-print archiving only, we use the broader definition here, inclusive of both pre-print and post-print archiving, because citation studies referenced here are generally broader in their definition of what is termed OA.

Methodological issues in citation analysis

A citation is defined as the listing of a previously published article in the reference section of a current work; this is usually taken to imply the relevance of the cited article to the current work. Information about articles and the citations between them is collected in databases known as citation indexes. The best-known example of a citation index is Thomson Scientific’s Web of Science®, which now contains about 40 million bibliographic records and over 550 million citations from the past century. Other indexes include Scopus™, Google Scholar™, CiteSeer, and NASA’s Astrophysics Data System. Citation analysis is a core tool in the research discipline known as bibliometrics, defined as the quantitative analysis of the units of scientific communication (e.g. articles, book chapters, etc.) and the citations that connect them. This field consists of a worldwide community of research bibliometricians, with their own learned societies, journals, and active online discussion lists.

Bibliometrics is a specialized and often complex field of study, and far transcends a simple counting of citations. Three measurement problems in bibliometrics are of particular relevance to the relationship between OA and citations. Firstly, it can be difficult to match the development of citations to publication dates or article posting dates. Citations accrue to articles over time, and thus older articles typically have greater citation counts than new articles. To overcome this age effect, citations must be counted over a fixed period of time after publication or posting to allow useful comparisons between articles published at different times. Secondly, comparisons of the average properties of two sets of articles (grouped for example by journal, subject area, nationality, or OA status) must be interpreted very carefully, as such aggregates usually represent a heterogeneous population with a skewed distribution of citations. Finally, subject variations must be acknowledged, because dissemination of research findings via journal articles is not the primary channel of scholarly communication in all disciplines (see Table 7.3 in Moed, 2005),² and because citation practices themselves differ greatly between subject fields,³ making it difficult to translate observations from one subject directly to the prediction of effects in another subject.
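The age effect and the fixed counting period described above can be illustrated with a minimal Python sketch. The two articles, their citation dates, the 24-month window, and the helper names `add_months` and `citations_in_window` are all hypothetical and chosen only for illustration; they are not drawn from any of the studies reviewed here.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Return the first day of the month that lies `months` after d's month."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


def citations_in_window(available: date, citing_dates: list[date],
                        window_months: int = 24) -> int:
    """Count citations dated within `window_months` of the availability month."""
    start = date(available.year, available.month, 1)
    end = add_months(start, window_months)
    return sum(1 for c in citing_dates if start <= c < end)


# Hypothetical articles: the older one has simply had more time to be cited.
older = {
    "available": date(2000, 1, 1),
    "cited_on": [date(2000, 9, 1), date(2001, 6, 1),
                 date(2003, 2, 1), date(2004, 5, 1)],
}
newer = {
    "available": date(2002, 1, 1),
    "cited_on": [date(2002, 8, 1), date(2003, 7, 1)],
}

for name, article in (("older", older), ("newer", newer)):
    total = len(article["cited_on"])
    windowed = citations_in_window(article["available"], article["cited_on"])
    print(name, "total to date:", total, "within 24 months:", windowed)
```

In this toy example the raw totals differ only because of article age, whereas the windowed counts are directly comparable.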

2 Moed, H.F. (2005) Citation Analysis in Research Evaluation (Dordrecht: Springer).

3 Zitt, M., Ramanana-Rahary, S., and Bassecoulard, E. (2005) Relativity of citation performance and excellence measures: from cross-field to cross-scale effects of field-normalisation. Scientometrics 63: 373-401.

Correlations of online availability and increased citations

The first research that showed a correlation between articles made available online and higher citations was carried out by Lawrence.⁴,⁵ Lawrence based his study solely on conference proceedings articles in computer sciences and related disciplines that were listed in the Digital Bibliography & Library Project Computer Science Bibliography. He assessed the availability of a corresponding full-text article online and the number of citations (excluding author self-citations) received to date using the ResearchIndex (often referred to as CiteSeer) autonomous indexing database. Importantly, he made no distinction between the different methods of making the full-text articles available online. Online availability can occur in many ways, only some of which can be correctly termed ‘Open Access’. There is also ambiguity about the definition of ‘online’ and ‘offline’ in this study – it is not clear whether the latter includes articles that are available online, but only via a subscription. A recent investigation comparing CiteSeer to Thomson Scientific’s Web of Science and to Google Scholar suggested that CiteSeer may not be ideal for citation studies.⁶ However, it is not clear how this apparent over-reporting of citations might affect within-subject, and within-citation-index, comparisons.

Lawrence demonstrated a correlation between the likelihood of online availability of the full-text article and the total number of citations to date for articles published in non-overlapping consecutive pairs of years from 1989 to 1999. He further showed that the relative citation counts for articles available online are on average 336% higher than those for articles not found online but that were presented at the same conference (or ‘publication venue’).

4 Lawrence, S. (2001a) Online or invisible? Available at http://citeseer.ist.psu.edu/online-nature01/ (link verified 23rd April 2007)

5 Lawrence, S. (2001b) Free online availability substantially increases a paper’s impact. Nature 411: 521.

6 Bar-Ilan, J. (2006) An ego-centric citation analysis of the works of Michael O. Rabin based on multiple citation indexes. Information Processing and Management 42: 1553-1566.

Figure 1. Typical skewed distribution of citations across a grouping of articles. Just 15% of the articles attract 50% of citations and almost 90% of citations were to 50% of the articles. The selection bias hypothesis suggests that Open Access articles are represented disproportionately in the former sub-group. Adapted from Seglen⁷. (Axes: percentage of articles and percentage of citations, each from 0 to 100.)

In his analysis Lawrence assumed that ‘articles published in the same venue are likely to be of similar quality’, and that, by inference, articles of similar quality would receive similar numbers of citations. Neither of these key assumptions was supported by previous quantitative analyses. Seglen⁷ reported that within the ‘publication venue’ of a single journal, 15% of articles accounted for 50% of citations, and almost 90% of citations were to just 50% of the articles (Figure 1). Lawrence did acknowledge that there may be an element of what would later be called Selection Bias at work; that is, that the ‘higher quality articles are more likely to be made available online’. When Lawrence attempted to account for this bias, by limiting the analysis by publication venue to 20 conferences with stringent selection criteria, the relative citation counts of online articles reduced to 286%, while still not ensuring a population of uniformly citable articles. In other words, some articles will naturally be more cited than others, irrespective of any other attribute or condition of that article. In his conclusions, Lawrence (quite rightly) did not claim that this observed correlation was proof of causality, as correlation alone cannot prove a causal link.
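The kind of concentration reported by Seglen can be made concrete with a short sketch. The citation counts below are hypothetical and were chosen only to be skewed; they are not Seglen's data, and the function name `share_of_citations` is an assumption introduced for this illustration.

```python
from statistics import mean, median


def share_of_citations(counts: list[int], top_fraction: float) -> float:
    """Fraction of all citations received by the most-cited `top_fraction` of articles."""
    ranked = sorted(counts, reverse=True)
    k = round(len(ranked) * top_fraction)
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


# Hypothetical citation counts for 20 articles from a single venue.
counts = [40, 22, 15, 9, 7, 5, 4, 3, 3, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 0]

print("top 15% share:", round(share_of_citations(counts, 0.15), 2))
print("top 50% share:", round(share_of_citations(counts, 0.50), 2))
print("mean:", mean(counts), "median:", median(counts))
```

In such a distribution the mean sits well above the median, which is one reason why articles from the same venue cannot be assumed to be uniformly citable.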

At around the same time, Anderson et al.⁸ reported on the consequences of making selected articles, accepted for publication in Pediatrics between 1997 and 1999, freely available online-only at the journal’s website. This is article-level Gold OA, but without the requirement for sponsorship of the costs of publication by the author or another party. The analysis was complicated by the fact that up to the half-way point in this period (July 1998), articles not made ‘Open Access’ in this way were published solely in the print journal, with no online availability. After the mid-point of the study, both OA and non-OA articles were published online, although their access model differed. Thus, accepted papers made OA in the first half of this period were posted online immediately, while those destined for print only were subject to additional printing and distribution delays. Anderson and colleagues investigated citation counts for each article (cumulative to the end of 2000), using an unspecified version of Thomson Scientific’s Science Citation Index®. The authors noted a citation disadvantage for the OA articles of more than twofold. Articles published online-only were selected not by the submitting authors themselves, but rather by the Pediatrics editor. These online-only articles were selected to give ‘preference to articles of broader international interest’, but not necessarily their relative quality or scientific importance, compared with those accepted for print publication. The removal of the possible influence of an author Selection Bias and its replacement with an editorial Selection Bias may have been sufficient to account for the observed differences in mean citation counts.

The outcomes of the Lawrence and Anderson et al. studies were exactly opposite to each other, but together illustrate some of the methodological problems in determining whether OA affects citations. The former showed a citation advantage, the latter a disadvantage. Both suffered from lack of clarity about the precise method of citation counting, and both suffered from Selection Bias in some form; in Lawrence’s case an author Selection Bias, and in Anderson et al.’s case an editorial Selection Bias. Lawrence accounted for an important potential confounder in removing subsequent self-citations by one or more of the same authors in subsequent articles, whereas Anderson et al. did not. Authors’ self-citations have been acknowledged as a source of distortion in the analysis of scientific communication and are more likely to be made to papers with multiple authors.⁹ It is reasonable to expect that articles with numerous authors would also have a greater likelihood of being self-archived by one or more of the authors, or to be otherwise made available online.

7 Seglen, P.O. (1992) The skewness of science. Journal of the American Society for Information Science 43: 628-638.

8 Anderson, K., Sack, J., Krauss, L., and O’Keefe, L. (2001) Publishing online-only peer-reviewed biomedical literature: three years of citation, author perception, and usage experience. Journal of Electronic Publishing 6.

9 Glänzel, W. and Thijs, B. (2004) Does co-authorship inflate the share of self-citations? Scientometrics 61: 395-404.

Correlations of Open Access and increased citations

The first study to assess the effect of Green OA relating to published journal articles (and not simply to conference proceedings and ‘online availability’) was published by Harnad and Brody,¹⁰ and elements of these data were included in later papers.¹¹,¹² Over 95,000 pre-print manuscripts in physics and mathematics deposited in the subject-based repository arXiv (http://www.arXiv.org) were matched with the final published journal article indexed in Thomson Scientific’s Web of Science and were termed ‘Open Access’. Citation counts to these articles were then compared with those for all other articles (termed ‘Non-Open Access’) published in the same journal and the same year (between 1992 and 2003) and the ratio of the two values derived. Articles with a corresponding pre-print version deposited in arXiv had higher citation counts than those that did not. The derived Open Access/Non-Open Access ratio varies by subject field, year of publication, and whether certain factors have been controlled or not (such as removal of author self-citations or comparison with articles published in the same journal).

An important methodological issue in this study is that it ignored the potential skewness of the distribution of citations within each group of articles. Coupled with the fact that a small proportion of articles had a corresponding pre-print version in arXiv, this means that distortions due to non-uniform sampling were even more likely than if a more representative sample were available. Moreover, expression of the citation count differences as a ratio obscures the actual magnitude of the effect: in the example of very small sample sizes of Open Access and Non-Open Access articles, large ratios may be due to a small number of additional citations for the Open Access articles, an effect that is readily attributable to many other factors and indiscernible from common citation ‘background noise’. Furthermore, these articles may have been published at any time during a 12-month period, so that they were available for citation for very different durations; however, this effect diminishes with larger counting intervals. In this study, the authors also drew a direct causal link between Open Access and citations, based on an observed correlation between them, but without substantiating this claim.
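The point about ratios can be shown with a small sketch, assuming entirely hypothetical citation counts for two pairs of article groups. The function name `compare_groups` is likewise an assumption made for illustration and does not correspond to any method used in the studies discussed.

```python
def compare_groups(oa_counts: list[int], non_oa_counts: list[int]) -> dict:
    """Return the ratio of mean citation counts and their absolute difference."""
    mean_oa = sum(oa_counts) / len(oa_counts)
    mean_non = sum(non_oa_counts) / len(non_oa_counts)
    if mean_non == 0:
        raise ValueError("ratio is undefined when the comparison mean is zero")
    return {"ratio": mean_oa / mean_non, "difference": mean_oa - mean_non}


# Hypothetical low-citation groups: half a citation per article separates them.
low = compare_groups([1, 1, 2, 0], [0, 1, 0, 1])

# Hypothetical high-citation groups: ten citations per article separate them.
high = compare_groups([20, 24, 16, 20], [10, 12, 8, 10])

print("low-citation groups: ", low)
print("high-citation groups:", high)
```

Both comparisons yield the same ratio, although the absolute differences are of quite different magnitude; reporting the ratio alone hides this.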

Antelman¹³ took a novel approach to examining the relationship between online availability of full-text articles (not strictly Green Open Access) and citation counts. Using a method that mimics the information-seeking behaviour of researchers, she manually searched online for randomly selected articles published in leading journals across four distinct disciplines (mathematics, electrical and electronic engineering, political science, and philosophy). The articles considered were published in 2001 and 2002 (1999 and 2000 for philosophy), and citation counts until 2003 (excluding author self-citations and citations from within the same journal issue) were collected from Thomson Scientific’s Web of Science. Full-text articles freely available online (at a location other than the publisher’s website) that had the same title as the selected articles were deemed ‘Open’; the remainder were termed ‘Not Open’. Only in mathematics was a substantial number of ‘Open’ articles made available through a subject-based electronic archive (which is a major aspect of the Green mode of Open Access). In the other disciplines, posting on the author’s own website was the primary means of online availability. Antelman calculated mean citation counts of Open and Not Open articles and showed that the percentage difference between the means of the two cohorts varied by discipline, from 45% in philosophy up to 91% in mathematics. It is difficult to generalize about these results, owing to societal differences in the citation behaviours of authors in these disparate academic communities.

Antelman recognized small sample sizes and the skewness of citation distributions amongst articles appearing in the same journal as confounding factors, and made an attempt to account for this in the statistical analysis of the data. Like Harnad et al.,¹² she did not suggest or explore any potential reasons for the observed correlation, beyond the assumption that online availability leads to increased citation counts. In a subsequent response¹⁴ to a letter to the editor about this paper,¹⁵ Antelman was careful to state that her data did not support any notion of causality. Instead she noted that additional unpublished research indicates a significant Selection Bias effect (at least in the social sciences).

10 Harnad, S. and Brody, T. (2004) Comparing the impact of Open Access (OA) vs. non-OA articles in the same journals. D-Lib Magazine 10.

11 Brody, T. (2004) Citation analysis in the Open Access world. Available at http://eprints.ecs.soton.ac.uk/10000 (link verified 23rd April 2007)

12 Harnad, S., Brody, T., Vallières, F., Carr, L., Hitchcock, S., Gingras, Y., Oppenheim, C., Stamerjohanns, H., and Hilf, E.R. (2004) The access/impact problem and the green and gold roads to Open Access. Serials Review 30: 310-314.

13 Antelman, K. (2004) Do Open-Access articles have a greater citation impact? College & Research Libraries 65: 372-382.

Following a similar approach, Hajjem et al.¹⁶ used a robot to search online for more than 1.3 million articles published in Thomson Scientific-indexed journals across ten distinct disciplines (biology, psychology, sociology, health, political science, economics, education, law, business, and management). The articles were published between 1992 and 2003, and citation counts were cumulative to the end of this period, including author self-citations. Full-text articles available online that had the same title and first author name as the selected articles were deemed ‘Open’; the remainder were termed ‘Not Open’. The magnitude of the derived Open Access/Not Open Access ratio varied by subject field and year of publication between 25% and 250%. Again, citation practices in these areas are so divergent as to confound generalization across subject areas.

The robot’s ability to identify Open and Not Open articles correctly was subsequently re-analysed in a technical paper¹⁷ and found to significantly overestimate Open articles. The data were re-analysed by the two authors who had contributed to both papers, and were published in another technical article that reversed this finding to agree again with the original article.¹⁸ There are additional methodological issues concerning the apparent averaging of averages: Open and Not Open citation counts were generated by averaging for each group the citations per issue, then issues per journal, then journals per discipline. These data could be very susceptible to outlier values in a potentially skewed citation universe. Sample sizes across subject disciplines are also heterogeneous: the reported ‘Open Access Advantage’ (or OAA) in Biology, which comprised 49% of the articles in the study, was by far the lowest OAA reported at 36%. Conversely the highest value of OAA, 172%, occurred in Sociology, a discipline that accounted for about 8% of the articles studied. The calculated mean and median values of the OAA across all ten disciplines (83% and 77%, respectively) also took no account of the relative sizes of the disciplines involved, and so also represented averages of averages in a potentially skewed distribution. While the authors noted that the observed correlations did not demonstrate causality, they dismissed the possible influence of a Selection Bias without further analysis.
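The sensitivity of an average of averages to the relative sizes of the groups can be sketched using only the two disciplines for which both figures are quoted above (Biology: 49% of articles, OAA 36%; Sociology: about 8% of articles, OAA 172%). Restricting the calculation to these two disciplines, and weighting a discipline-level percentage by its share of articles, are simplifying assumptions made purely for illustration; the helper names are hypothetical.

```python
def unweighted_mean(values: list[float]) -> float:
    """Plain average that treats every discipline as equally large."""
    return sum(values) / len(values)


def weighted_mean(values: list[float], weights: list[float]) -> float:
    """Average in which each discipline counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Figures quoted in the text; the other eight disciplines are omitted.
oaa_percent = [36.0, 172.0]     # Biology, Sociology
article_share = [49.0, 8.0]     # percentage of all articles in the study

print("unweighted mean OAA:", unweighted_mean(oaa_percent))
print("size-weighted mean OAA:", round(weighted_mean(oaa_percent, article_share), 1))
```

Even in this two-discipline illustration, the unweighted figure is pulled strongly towards the small discipline with the large reported advantage.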

14 Antelman, K. (2006) Letter to the Editor: Response to Philip Davis. College & Research Libraries 67: 105.

15 Davis, P.M. (2006) Letter to the Editor: Do Open-Access articles have a greater citation impact? College & Research Libraries 67: 103-104.

16 Hajjem, C., Harnad, S., and Gingras, Y. (2005) Ten-year cross-disciplinary comparison of the growth of Open Access and how it increases research citation impact. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering 28: 39-47.

17 Antelman, K., Bakkalbasi, N., Goodman, D., Hajjem, C., and Harnad, S. (2005) Evaluation of algorithm performance on identifying OA. Available at http://eprints.ecs.soton.ac.uk/11689 (link verified 23rd April 2007)

18 Hajjem, C. and Harnad, S. (2006) Manual evaluation of robot performance in identifying Open Access articles. Available at http://eprints.ecs.soton.ac.uk/12220 (link verified 23rd April 2007)

When should citation counting begin?

None of the studies discussed so far have taken account of the critical dimension of temporal progression: that is, the time difference between when an article is made available online or deposited in an electronic archive and when it is published. The sole consideration is whether or not the article was freely available at the time of the study. Furthermore, the relative timing of publication and the counting of references to the article must be precisely defined (ideally imposing a fixed window of citation counting) to allow true measurement of citation effects (Figure 2). Fixed citation windows are a standard method in bibliometric analysis, in order to give equal time-spans for citation to articles published in different years, or at different times in the same year. In order to argue that online availability bears a causal relationship with subsequent citations, the duration of this online availability must be established and the time-course of citation accrual to Open and Not Open articles must be examined relative to their earliest availability in either form.
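The importance of counting from the date of earliest availability can be sketched with a toy model. The citation curve below is hypothetical (it is not fitted to any data), as are the 12-month pre-print lead and the 24-month cut-off; the function names are assumptions introduced for this illustration.

```python
def monthly_citations(months_since_available: int) -> int:
    """Hypothetical citation curve: a slow rise for two years, then a plateau."""
    if months_since_available < 0:
        return 0
    return min(months_since_available // 4, 6)


def citations_by(cutoff_month: int, available_month: int) -> int:
    """Citations accrued before `cutoff_month`; months are relative to journal publication (month 0)."""
    return sum(monthly_citations(t - available_month)
               for t in range(available_month, cutoff_month))


# Both articles follow the same curve; only the date of first availability differs.
journal_only = citations_by(cutoff_month=24, available_month=0)
with_preprint = citations_by(cutoff_month=24, available_month=-12)

# Counting over 24 months from earliest availability removes the head start.
aligned_preprint = citations_by(cutoff_month=12, available_month=-12)

print("24 months after publication:", journal_only, "vs", with_preprint)
print("24 months after earliest availability:", journal_only, "vs", aligned_preprint)
```

Under these assumptions the two articles are equally citable by construction, so any difference in the first comparison arises solely from the longer counting period.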

Figure 2. Typical citation time-course across a grouping of articles. The black line represents articles published in a journal at month 0, while the grey line represents articles deposited as a pre-print 12 months prior to publication in a journal at month 0. Citation counts at 24 months after journal publication therefore allow an additional 12 months of citation counting for the pre-print group – this is the basis of the Early View hypothesis. Adapted from Henneken¹⁹ and Moed²⁰. (Axes: months after publication, from -12 to 84, and citations per month.)

A breakthrough in the analysis of the relationship between citations and Open Access status came with the study performed by Schwarz and Kennicutt²¹ on articles published in Astrophysical Journal (ApJ) in 1999 or 2002 with a matching pre-print version deposited in the astrophysics section of arXiv. These authors were among the first to recognize that citation counting begins earlier for articles deposited in the arXiv repository as pre-prints (i.e. before publication in a peer-reviewed journal) than for articles without pre-prints deposited in arXiv. Schwarz and Kennicutt were certainly the first to attempt to account for this in the analysis of their findings. Comparing citations to these ‘posted’ articles within the limited citation data source of NASA’s Astrophysics Data System (ADS) versus those of non-posted ApJ articles showed that the former had citation counts on average twice those of the latter (including self-citations). An (approximately) fixed citation window was imposed by counting cites to articles published in the second half of 1999 until a fixed point during 2003. Follow-up work by Metcalfe²² expanded this analysis to a total of 13 journals (including the generalist titles Nature and Science) and broadly confirmed these findings for articles published in 2002.

19 Henneken, E.A., Kurtz, M.J., Eichhorn, G., Accomazzi, A., Grant, C., Thompson, D., Murray, S.S. (2006) Effect of e-printing on citation rates in astronomy and physics. Journal of Electronic Publishing 9.

20 Moed, H.F. (2007) The effect of “Open Access” upon citation impact: An analysis of ArXiv’s Condensed Matter Section. Journal of the American Society for Information Science and Technology (in press). Available at http://arxiv.org/abs/cs.DL/0611060 (link verified 23rd April 2007).

21 Schwarz, G.J. and Kennicutt, R.C. (2004) Demographic and citation trends in astrophysical journal papers and preprints. Bulletin of the American Astronomical Society 36: 1654-1663.

Schwarz and Kennicutt produced citation histograms showing the accrual of citations to articles with a pre-print version and those without, and found a marked difference in the profiles. Pre-printed articles were available to be cited by an average of 12 months before the final ApJ published article, effectively starting the citation counting process earlier. The data suggested that this earlier citation counting does not affect the final magnitude of citations accrued to a journal article. At least part of the remaining difference may be attributable to the Selection Bias in posted articles that the authors described in their discussion. In related work, Metcalfe²³ analysed citation counts (again from ADS) for articles published in Solar Physics in 2003 with a matching pre-print version deposited in the astrophysics section of arXiv or in an independent solar physics archive at Montana State University (MSU). Despite very small sample sizes (170 articles, 13 of which were ‘posted’), Metcalfe showed a significant difference in citation counts between ‘unposted’ articles, those ‘posted’ in arXiv, and those ‘posted’ in MSU. Like Schwarz and Kennicutt, Metcalfe also compared citation counts for ‘posted’ and ‘unposted’ conference proceedings articles. Whereas Schwarz and Kennicutt did so to determine the relationship between publication venue and citations, Metcalfe attempted to interpret these data as evidence for a lack of a Selection Bias effect. However, the fact that ‘posted’ proceedings papers follow the same higher citation patterns as ‘posted’ journal articles does not support this view.

22 Metcalfe, T.S. (2005) The rise and citation impact of astro-ph in major journals. Bulletin of the American Astronomical Society 37: 555-557.

23 Metcalfe, T.S. (2006) The citation impact of digital preprint archives for solar physics papers. Solar Physics 239: 549-553.

24 Kurtz, M.J., Eichhorn, G., Accomazzi, A., Grant, C., Demleitner, M., Henneken, E., and Murray, S.S. (2005) The effect of use and access on citations. Information Processing and Management 41: 1395-1402.

Deconstructing the Open Access citation effect

All of the studies discussed above were concerned with demonstrating a difference between average citation counts to articles that were made available online and those that were not. While some implied a causal relationship, most acknowledged Selection Bias as a possible explanation for the observed citation patterns, and some also noted differences in the effective citation life-times of the two groups. The articles discussed in this section represent a new phase in the development of the literature on this topic. They are concerned with systematically deconstructing the elements of the Open Access citation effect, which they recognize as being a complex, multi-dimensional phenomenon.

Kurtz et al.²⁴ were the first to formalize these possible explanations for the observed differences in citation patterns and to examine systematically their effects by controlling for each, one at a time: a general Open Access effect due to unrestricted ability to read and cite articles (the OA postulate); the Early View postulate (which they term the ‘Early Access effect’), due to articles appearing sooner; and a Selection Bias due to more prominent authors posting their articles, and/or authors preferentially posting their better works (the Selection Bias postulate). To investigate the OA and Early View postulates the authors calculated the probability that an article will cite another article previously published in a defined window of time, using citations only within and between a set of seven core astrophysics journals in the ADS database. The results clearly showed that there is no general OA effect: the massive increase in online availability of articles in both arXiv and ADS in the 1990s was not correlated with any subsequent increase in citations to these articles. A strong Early View effect was suggested by an increase from the late 1990s in the probability of subsequent citation of an article within the first six months after publication (whether it was also deposited in arXiv or not). While this period correlates with the rise in popularity of arXiv, this correlation cannot be taken as proof of cause: other factors affecting the citation behaviour of authors in this field could be influencing the age of their cited references.

To investigate the Selection Bias postulate, Kurtz et al. (2005) took an approach similar to that of Schwarz and Kennicutt (2004), looking at articles published in ApJ in 2003 with a matching pre-print version deposited in the astrophysics section of arXiv. Comparing citations (again taken from ADS) of these ‘posted’ articles with those of non-posted ApJ articles showed that the former had citation counts on average twice those of the latter (including self-citations). A strong Selection Bias effect was demonstrated by the observation that articles with a version posted in arXiv had a greater probability of occurring in the top 200 most cited articles published in ApJ in 2003. A follow-up study by most of the same authors19 used similar methods to confirm the Selection Bias effect in ApJ and four other astrophysics journals: articles with a version posted in arXiv were over-represented in the top 100 most cited articles published since the late 1990s. To re-examine the Early View postulate the authors examined articles published in ApJ between 1997 and 1999 and counted citations accrued each month after publication for 5 years. The resultant citation curves for articles with or without a version posted in arXiv were very similar in shape. Like Schwarz and Kennicutt (2004), Henneken and colleagues failed to account for the clear Early View effect of about 12 months of earlier citation counting (the average length of early visibility for ApJ pre-prints found by Schwarz and Kennicutt). Were they to do so, the remaining differences would become more easily visible and open for explanation in terms of the Selection Bias postulate.

Using these same three postulates (but with Selection Bias re-termed ‘Quality Differential’), Davis and Fromerth25 analysed the articles published in four mathematics journals between 1997 and 2005, either with or without matching versions deposited in the mathematics section of arXiv. Using citation counts from the limited MathSciNet database (which includes self-citations and, uniquely, citations of the pre-print versions), the authors showed an average 35% increase in citations (or 1.1 citations per article) of articles posted in arXiv versus those that were not posted. The Early View postulate was tested by regression analysis of citation counts for individual articles posted in arXiv and subsequently published in the same journal against the number of days before journal publication they were posted in arXiv. Although there was no significant correlation (indeed, many highly-cited articles were deposited in arXiv after publication), we might not expect one in a field such as mathematics, in which citation practices are such that the average age of cited references is relatively high, and the frequency and speed of publication are relatively low. By contrast, articles with a version posted in arXiv were over-represented among the most highly cited articles in this study, implying a Selection Bias effect.
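The regression test of the Early View postulate described above can be sketched as follows. This is a minimal illustration and not Davis and Fromerth's actual analysis: the function name `least_squares` and the five data points are hypothetical, chosen only to show how a slope and a correlation coefficient would be obtained from lead time (days posted before journal publication) and citation count.

```python
from math import sqrt
from statistics import mean


def least_squares(xs, ys):
    """Ordinary least-squares fit of ys on xs.

    Returns (slope, intercept, r), where r is Pearson's correlation.
    """
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical data (not from the study): days each article was posted in
# arXiv before journal publication, and its citation count.
lead_days = [0, 30, 60, 90, 120]
citations = [3, 5, 2, 6, 4]

slope, intercept, r = least_squares(lead_days, citations)
print(f"slope={slope:.3f} citations/day, intercept={intercept:.2f}, r={r:.2f}")
```

A weak correlation in such a fit would not by itself rule out an Early View effect in a slow-citing field, for the reasons given above.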

25 Davis, P.M. and Fromerth, M.J. (2007) Does the arXiv lead to higher citations and reduced publisher downloads for mathematics articles? Scientometrics 71: 203-215.


26 Eysenbach, G. (2006) Citation advantage of Open Access articles. PLoS Biology 4, e157.

27 Boyack, K.W. (2004) Mapping knowledge domains: Characterizing PNAS. Proceedings of the National Academy of Sciences USA 101: 5192-5199.

What does Open Access mean for individual authors?

In an attempt to investigate the effect of Gold Open Access on citations, Eysenbach26 undertook an analysis of articles published in the latter half of 2004 in a single hybrid Gold journal, the Proceedings of the National Academy of Sciences (PNAS). PNAS is a large multidisciplinary journal, publishing in areas as diverse as biochemistry, neuroscience, genetics, biophysics, chemistry, evolution, microbiology, and plant sciences. Articles whose cost of publication was borne by the authors of the article (or a sponsor) were termed ‘OA’, while the remainder were termed ‘non-OA’. As all the articles were published without delay after peer review and typesetting, no Early View effect would have been present to confound the analysis.

The two groups of articles were subjected to logistic regression using several variables, including author, funding, subject, and other publication characteristics. While OA status was found to remain a significant predictor of the likelihood that an article would be cited at least once within 10-16 months after publication, so too were several other factors. Among these were the number of authors on the paper and funding from competitive grants, each of which can be construed as an independent marker for scientific rigour and thus article ‘quality’. Indeed, an earlier study showed a clear correlation between grant funding and citation counts for articles published in PNAS.27 Given the presence of these potentially confounding factors, further analysis would be needed to determine whether OA status is indeed the primary ‘driver’ for citation, as Eysenbach maintains. Moreover, the first authors of OA papers also tended to be more senior (more lifetime publications), and the combined average citations per paper for the first and last authors of OA papers tended to be higher too. Logistic regression analyses were not presented for either of these potential confounders. Taken together, these data suggested a strong influence from Selection Bias effects, which were not explored in the study. Differences between subject areas are alluded to, but sample sizes were variable and reduced the robustness of the findings. The main study was not stratified by subject area and so may have suffered from subject-specific citation biases. Eysenbach extended his study to assess the effect on citations of online availability of the full-text articles at any location other than the PNAS website or PubMed Central. This analysis suggested that additional online availability of OA articles (which are of course already freely available) did not significantly enhance citability further, arguing against any general Open Access effect.
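The risk posed by an unstratified analysis can be illustrated with a small sketch. It does not reproduce Eysenbach's logistic regression; it uses the simpler Mantel-Haenszel pooled odds ratio instead, with invented counts for two hypothetical subject areas, to show how a crude OA 'advantage' in the odds of being cited can arise purely because OA articles are concentrated in a highly cited field.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a: OA cited, b: OA uncited, c: non-OA cited, d: non-OA uncited.
    """
    return (a * d) / (b * c)


def mantel_haenszel(strata):
    """Mantel-Haenszel pooled odds ratio over (a, b, c, d) tables."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# Invented counts, for illustration only: (a, b, c, d) per subject area.
strata = {
    "high-citation field": (90, 10, 180, 20),
    "low-citation field": (10, 10, 100, 100),
}

# Crude table: add the strata together, ignoring subject area.
crude_table = tuple(sum(col) for col in zip(*strata.values()))
crude = odds_ratio(*crude_table)
pooled = mantel_haenszel(strata.values())
print(f"crude OR={crude:.2f}, stratified OR={pooled:.2f}")
```

In these invented data the odds of citation are identical for OA and non-OA articles within each field, yet the pooled table suggests an OA effect, which is the kind of subject-specific bias a stratified design guards against.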

Eysenbach’s work highlighted the importance of author characteristics (reputation, prior citation history, lifetime publication count, country, funding organization, etc.) as confounding variables in the analysis of Open Access and citations, but he did not fully explore the effect of author prestige when looking at the Selection Bias effect. Moed20 undertook a study methodologically similar to that of Harnad and Brody,10 matching almost 75,000 pre-print manuscripts deposited in the condensed matter physics section of arXiv with the final published journal article indexed in Thomson Scientific’s Web of Science (appearing mostly in one of 24 physics journals). Citation counts (excluding self-citations) to both the arXiv version and the final published version of these ‘OA’ articles were then compared with those for all other ‘non-OA’ articles published in the same journal and a ratio of the two values derived: the Citation Impact Differential (CID). This analysis confirmed that articles with a corresponding pre-print version deposited in arXiv had higher citation counts than those that did not, and that the size of increase varied by year of publication and by journal. It should be noted that Moed’s CID values were systematically lower than the ‘OA’/‘non-OA’ ratios of Harnad and Brody. However, Moed did not ascribe the CID he observed to an OA effect but went on to explore the observed difference further.
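As a rough sketch of the calculation, such a ratio can be computed per journal from two lists of citation counts. The function name, the journal labels and the counts below are hypothetical, and the exact definition Moed used (for example, how the ratio is normalized) is not reproduced here, so expressing it as a percentage excess is an assumption.

```python
from statistics import mean


def citation_ratio(oa_counts, non_oa_counts):
    """Mean citations per 'OA' article divided by mean per 'non-OA' article."""
    return mean(oa_counts) / mean(non_oa_counts)


# Hypothetical citation counts per article, grouped by journal.
journals = {
    "Journal A": {"oa": [4, 6, 8], "non_oa": [3, 5, 7]},
    "Journal B": {"oa": [2, 2, 5], "non_oa": [1, 3, 5]},
}

for name, groups in journals.items():
    ratio = citation_ratio(groups["oa"], groups["non_oa"])
    # Assumption: express the differential as percentage excess over parity.
    print(f"{name}: ratio={ratio:.2f}, excess={100 * (ratio - 1):.0f}%")
```

Computing the ratio journal by journal, rather than over the pooled set, keeps differences in citation rates between journals from leaking into the comparison.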


Moed’s analysis of the underlying reasons for this difference introduced innovative approaches and methods to the study of Open Access status and citations for the first time, drawing heavily on the standard practices of citation analysis used by research bibliometricians. This was the first study to impose fixed windows of time for counting citations to each article analysed. As noted earlier, this is imperative for fair comparison between articles published at different times. Moed analysed the Early View effect by imposing two fixed citation periods for each article analysed, either the first three years after publication or the fourth to the sixth year after publication. Citations were also counted on a monthly basis after publication to give a granular view of early citation counts, thus minimizing the effects of differences in publication frequency of the cognate journals. Both methods showed a clear Early View effect, and the latter approach produced the most visually striking results: when monthly citation curves for articles with or without a version posted in arXiv were plotted on the same chart, translation of the arXiv curves by the average length of deposit in arXiv (6 months, the average time between deposit in arXiv and formal publication of the refereed, final journal article) resulted in almost indistinguishable curves (Figure 3; cf. Schwarz and Kennicutt21 and Henneken et al.19). Moed also found a strong quality bias effect, reflected in a significant over-representation of prominent authors in arXiv-deposited articles, and showed that articles with higher proportions of prominent authors are more likely to be cited than those with a higher proportion of less prominent authors. When both the Early View effect and the Selection Bias effect were controlled for, the magnitude of the CID across all 24 major physics journals was variable but averaged just 7% (for authors with more than four lifetime articles).
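The two devices described above, a fixed citation window and a translation of the citation curve by the pre-print lead time, can be sketched in a few lines. This is an illustration under simplifying assumptions rather than Moed's procedure: dates are represented as integer month indices, and the function names and sample figures are invented.

```python
def citations_in_window(pub_month, citing_months, start=0, end=36):
    """Count citations received from `start` (inclusive) to `end`
    (exclusive) months after publication."""
    return sum(1 for m in citing_months if start <= m - pub_month < end)


def monthly_curve(pub_month, citing_months, n_months):
    """Citations received in each of the first n_months after publication."""
    curve = [0] * n_months
    for m in citing_months:
        age = m - pub_month
        if 0 <= age < n_months:
            curve[age] += 1
    return curve


def translate_right(curve, lead_months):
    """Shift a curve right so that position i means i months since the
    article first became available, not since journal publication."""
    return [None] * lead_months + list(curve)


# Invented example: published in month 100, cited in the months listed.
citing = [99, 100, 110, 135, 136, 150]
first_three_years = citations_in_window(100, citing, 0, 36)
years_four_to_six = citations_in_window(100, citing, 36, 72)

curve = monthly_curve(100, citing, 12)
shifted = translate_right(curve, 6)  # 6-month average arXiv lead time
```

Once the pre-print curve is expressed in months since first availability, it can be compared point by point with the curve for articles that appeared in the journal only.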

Moed concluded that, based on these results, there is no general Open Access citation advantage for individual authors, especially the more prolific ones. Like all of the other studies discussed above, the findings of this study are only directly applicable to the subject discipline it is based on; generality cannot be assumed across all areas of research, as citation behaviour and author attitudes to Open Access are cultural factors that differ across different fields.3,28 However, this study has set a benchmark for a rigorous method that can be applied to further studies of these effects in other subject areas.

28 Swan, A. and Brown, S. (2005) Open access self-archiving – An author study. Available at http://eprints.ecs.soton.ac.uk/10999 (link verified 23rd April 2007)

Figure 3. Citation time-course for published articles with (‘in ArXiv-CM’) and without (‘not in ArXiv-CM’) a self-archived version in the Condensed Matter section of arXiv. The curves represent three-month moving averages. The open squares and grey line represent articles published in a journal only, while the black diamonds and black line represent articles published in a journal but also deposited in arXiv an average of 6 months prior to their publication in a journal. To account for the Early View effect the latter have been translated to the right by 6 months. Reprinted from Moed20 with permission.


Conclusions

We posed two questions at the beginning of this article: firstly ‘Do Open Access research articles have a greater citation impact?’, and secondly ‘Will my research paper(s), and therefore will I, get a citation benefit from the Gold and Green Open Access models?’ These two questions in turn represent the main stages in the development of the research literature on this subject. While early work was simply concerned with seeking a positive correlation between Open Access and citation counts, more recent work has begun methodically to dissect the factors that drive the observed correlation and to discover what this might mean for individual authors using the Green or Gold Open Access models.

These questions have driven the development of increasingly robust research methods that account for potentially confounding factors and biases. Citation analysis is not a trivial undertaking, not least because it requires technical ability with data manipulation and analysis, and also requires an understanding of the underlying drivers for citation in scholarly publications. All but one of the studies discussed above failed to determine accurately the date of earliest dissemination of each article, and then to impose a defined citation window, which must be used if citation analysis of Open Access status is to yield definitive results. Even the most robust methods developed to date are unable to show causality unequivocally, nor can they generalize the observed effects at the author level or across a large number of varied disciplines.

Initially, the observed positive correlation between the OA status of a given article and higher citation counts was interpreted as causal, since there was both an intuitive and sociological appeal to this mechanism. However, closer analysis has not confirmed any causal relationship, and has actually shown a more complex set of contributors to the effect itself. Assuming that citation differences are due solely to the free availability of an article implies that many scholars working in a given discipline are currently totally unaware of important, relevant literature in their field and are unable to read and cite it. This further suggests that authors will limit their citations to those works that are readily available rather than citing works that are of the highest relevance. This view of citation behaviour dismisses any contributing role from long-established and robust means of scientific and scholarly communication, namely all mechanisms of peer communication, the influence and availability of cited references, and the inherent value a given researcher will place on the content of a paper, independently of the mechanism by which it might have been retrieved.

Instead, the concept of a general and highly influential Open Access effect requires us to imagine a situation in which authors either refer to a particular article simply because it is Open Access (not for any inherent relevance to the topic at hand) or fail to cite a relevant paper which they are unable to read (and so unable to cite) because it is not Open Access. Familiarity with the citation history of influential or canonical works in a discipline suggests that the overriding determinant of lifetime citations of an article is the quality, importance, and relevance of the work reported in the article.

What does Open Access provide for an individual author? The most rigorous study available to date20 suggests that any residual Open Access effect in condensed matter physics is negligible, after accounting for Selection Bias and Early View effects. This suggests that the benefits of self-archiving for an individual article or the work of an individual author are uncertain and could be as much affected by subject area, inherent variations in publication and citation patterns generally, and the presence and/or importance of a specialized online pre-print archive. Scientific citation is influenced, overwhelmingly, by the relevance and importance of a given scholarly work to other scholars in the field. While other factors might have moderate effects, the process of science is driven not by access, but by discovery.

With this in mind, we invite the bibliometrics and broader scientific community to contribute methodologically sound and well-interpreted studies of the relationships between OA and citation counts across diverse disciplines. Such studies need to ensure that the signature skewed distribution of citation patterns is not casually assigned on the basis of simple correlations and that artefacts such as Selection Bias and Early View are accounted for. True randomised studies (with articles randomly selected from the same journal for OA treatment) may offer one approach, provided these can be managed practically, and other factors (e.g. the citing window, seasonality) can be controlled for.
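A randomised design of this kind might be set up along the following lines. The sketch is hypothetical throughout: the article identifiers, the seed and the citation counts are invented, and the median is used for the comparison only because citation distributions are skewed.

```python
import random
from statistics import median


def assign_oa(article_ids, n_oa, seed):
    """Randomly pick n_oa articles for OA treatment; the rest are controls."""
    rng = random.Random(seed)  # fixed seed makes the assignment reproducible
    oa = set(rng.sample(sorted(article_ids), n_oa))
    control = set(article_ids) - oa
    return oa, control


def median_difference(citations, oa, control):
    """Median citations of OA articles minus median of control articles."""
    return median(citations[a] for a in oa) - median(citations[a] for a in control)


# Invented articles from a single journal issue.
articles = [f"art{i:02d}" for i in range(20)]
oa, control = assign_oa(articles, n_oa=10, seed=2007)

# Invented citation counts, all taken over the same fixed window.
citations = {a: i % 7 for i, a in enumerate(articles)}
diff = median_difference(citations, oa, control)
print(f"{len(oa)} OA, {len(control)} control, median difference={diff}")
```

Because assignment is random and drawn from one journal, author prominence and article quality should on average be balanced between the two groups, which is what removes Selection Bias from the comparison.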


Acknowledgements

The authors would like to thank Jeffrey Aronson at the Department of Clinical Pharmacology, Oxford University, and Henry Small, Chief Scientist at Thomson Scientific, for helpful comments on the manuscript.


Glossary

Term: Definition

Gold Open Access: Publication of a peer-reviewed article in a journal where the final published article is made free to read to anyone wishing to do so; the cost of publication is typically borne by the author or a sponsor on behalf of the author. Gold OA comprises two distinct approaches: one where publication is contingent on the author or sponsor paying (often called ‘author pays’), and another where publication is not dependent on payment but where an author or sponsoring organisation can opt to make the published article freely available through the payment of a fee (sometimes called the ‘sponsored article’ option).

Green Open Access: Posting into an institutional or subject-based electronic archive of a pre-print or post-print article. Some have used a narrower definition to include only post-print articles but in this case we use a broader definition as it applies best to the studies reviewed here.

Pre-print article: Author’s manuscript of an article as submitted to a journal for peer review.

Post-print article: Author’s peer-reviewed manuscript of an article accepted for publication in a journal.

Citation window: Time-span over which citations to an article are counted.

Author self-citation: Citation to an article by one or more of the same authors in a subsequent article.

Citation index: A database containing information about articles and the citations between them.

Open Access postulate: Suggests that authors are more likely to read, and thus cite, articles that are made available under an OA model.

Early View postulate: Suggests that the period between the early posting of an article in a repository and the appearance of the cognate published journal article allows time for earlier citation; refers to the Green OA model where articles (pre-prints or post-prints) are posted earlier than the final published article.

Selection Bias postulate: Suggests that the most prominent (and thus most citable) authors are more likely to make their articles available under an OA model, and that they are more likely to do so with their most important (and thus most citable) articles. In other words, authors tend to promote their best work and the best authors are more likely to do so.