Composition Originality Tools Comparative Research
Preliminary Report
September 15, 2017 • http://adam.rusch.me
By: Adam Rusch
PhD Candidate, College of Education
Education Policy, Organization and Leadership
University of Illinois at Urbana-Champaign

As online systems make it easier for learners to access a multitude of academic resources, it becomes harder for teachers and graders to differentiate original content from materials that are plagiarized or inappropriately cited. At best, a student may have made an innocent mistake by copying information without giving appropriate credit to the original source. At worst, a student may be deliberately plagiarizing another scholar’s work to represent as his or her own. In response, various “originality tools” have arisen in the education technology marketplace to scan student papers and compare them against currently known works to determine if a paper contains original or unoriginal writing.

The purpose of this project is to compare five originality tools that are commonly integrated into Learning Management Systems (LMS) used in academic settings. After students submit an assigned paper through the LMS, the instructor is able to use the tool to check for originality as part of the assessment process. This allows for corrective action, such as asking for more appropriate citation, or punitive action, such as flagging the work for plagiarism, to be taken.

The five originality tools being investigated are: SafeAssign, Turnitin, VeriCite, UniCheck and Urkund. SafeAssign is a Blackboard product available exclusively on their platforms and was used on the Blackboard Learn system for this study. The other originality tools are available on most leading LMS platforms. Turnitin was used as a Moodle plugin. VeriCite was used as a Sakai plugin. UniCheck and Urkund were each used through their standalone web portals.

Methods

To compare the five tools, I ran a series of 50 sample papers through each system. The papers were 10-page selections from the Princeton University Doctoral Dissertations, 2011-2017 database [1] and came from across 10 different disciplines to account for variations in subjects and themes. This database was selected because it had not been indexed by any of the originality tools being investigated. Of the papers, 25 were “unseeded” and only contained original content from the author. Another 25 of the papers were “seeded” with known non-original content coming from easily available web sources, such as Wikipedia, Encyclopedia Britannica, Center for Disease Control, HowStuffWorks.com, and the EBSCO online journal database available to me as a student at the University of Illinois. See Appendix I for the full list of test works.

[1] http://dataspace.princeton.edu/jspui/handle/88435/dsp01td96k251d

The originality score and significant matching citations were recorded as the papers were run through each system. I had two major questions.

Q1: Does the originality tool properly identify the seeded material as non-original content and name the site I took it from?

Q2: Does the tool find examples of the author’s original work in any other places online? These were cases where the authors had used portions of their doctoral work in other scholarly pieces that had been published before or after the dissertation defense.

Results

For Q1, VeriCite and UniCheck were the best performers with Urkund close behind: VeriCite and UniCheck each identified 22 of the 25 examples of seeded material, and Urkund identified 21. SafeAssign identified the site for our seeded material only 15 times and Turnitin identified the site 18 times. However, there are some interesting notes of divergence.
Plagiarism Analysis Results

SafeAssign  Turnitin  VeriCite  Urkund  UniCheck
15*         18*       22        21      22
60%         72%       88%       84%     88%
See Appendix II for the full list.

In one case, SafeAssign did identify portions of the seeded material as non-original. However, it listed the source as a site that had copied information from the website where I had obtained it. Essentially, it found a fellow plagiarizer while it did not find the original seeded material itself.

In four cases, Turnitin identified seeded material as non-original without identifying the site where I obtained it. Twice, like SafeAssign, it gave us websites that had copied the material and (in these cases) properly cited it. In two other cases, Turnitin told us that the materials were from their cross-institutional student paper database (“Submitted to [Institution] Student Paper”).

For Q2, I did not know if there would be any additional publications but was interested in seeing what the tools would find. Turnitin found the most examples of original source material being published elsewhere, identifying 12 works. Urkund and UniCheck each found 11 works published elsewhere and VeriCite identified 9. SafeAssign did not find any.
Source Analysis Results

SafeAssign  Turnitin  VeriCite  Urkund  UniCheck
0           12        9         11      11

See Appendix III for the full list.
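
The Q1 percentages in the Plagiarism Analysis Results table are each tool's count of credited seeded papers divided by the 25 seeded samples. A minimal Python sketch of that arithmetic follows; the dictionary and function names are illustrative and are not part of the study.

```python
# Q1 counts as reported in the Plagiarism Analysis Results table.
# `seeded_found` and `detection_rate` are illustrative names only.
SEEDED_SAMPLES = 25

seeded_found = {
    "SafeAssign": 15,
    "Turnitin": 18,
    "VeriCite": 22,
    "Urkund": 21,
    "UniCheck": 22,
}


def detection_rate(found: int, total: int = SEEDED_SAMPLES) -> float:
    """Return the share of seeded papers credited to a tool, as a percentage."""
    return 100.0 * found / total


rates = {tool: detection_rate(count) for tool, count in seeded_found.items()}

for tool, rate in rates.items():
    print(f"{tool:<10} {seeded_found[tool]:>2}/{SEEDED_SAMPLES}  {rate:.0f}%")
```

The Q2 counts are left as raw tallies here because the report does not state how many of the 50 papers had been published elsewhere, so there is no denominator to divide by.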
In a few cases, there were close similarities for source papers and seeded materials that were not considered significant results and so were not recorded as matches here, even if they gave high non-originality percentages. These generally fell into one of three categories: 1) The author used common jargon for the field that we would expect to see repeated across multiple papers, 2) The author inserted long quotations that others had also included in their works, 3) The system identified a bibliographic reference as matching content because it used a standard citation style.

Discussion

The differences in the ways that each tool performed help us understand some of the differences that may be part of their design. See Appendix IV for screenshots that give a sense of how each system works.

In the first case, SafeAssign seems to be the tool with the most limited scans for outside sources. While it picked up 60% of the seeded materials, they all came from easily imagined research sites like Wikipedia, History.com, and the Center for Disease Control. SafeAssign appears to apply a very tight set of parameters to its web search. It missed quite a few general websites, found none of the journal articles used for seeded materials, and found no examples of the original material being published in journal articles elsewhere. It may be part of the SafeAssign design philosophy that these gaps will be filled as an institutional database is built up of submitted student papers.

Turnitin does seem to have an extensive web cache built up for originality comparison. It is notable that this tool found the most examples of original work published elsewhere. On the other hand, there are some intriguing questions about how this web cache is handled. In four cases, the system correctly identified seeded material as being non-original content but did not give us a link to the original site. In two cases, the non-original content was attributed to students from another institution. While the match was sufficient to determine likely plagiarism in these cases, it is very likely that in another scenario with less of a match it would be less clear, and I can’t be sure whether Turnitin recognized that the seeded material also came from the World Wide Web. There is also considerable debate over the way that Turnitin co-mingles all institutional works into a common database because of privacy and intellectual property concerns. Opponents say it is wrong to appropriate work that was only intended for a class assignment for a shared database or corporate endeavor. [2]

[2] https://www.insidehighered.com/news/2017/06/19/anti-turnitin-manifesto-calls-resistance-some-technology-digital-age

In two more cases, the non-original content was identified but attributed to a site that didn’t match our seeded material. However, in both of those cases the links to the websites were dead, and I only had the Turnitin web cache for reference. Based on the ability that Turnitin showed for finding source material in Q2, I expected that it would also be able to find the site for seeded material. But again, without having the correct site listed I can’t credit Turnitin with finding the examples I had seeded. Did the Turnitin web crawler come across these secondary sites first and then determine that it didn’t need to mention the other sites (the correct sites) as the source of non-original content? If that is the case, then it is especially problematic that the Turnitin web cache is out of date and contains dead links. Did Turnitin decide that the sources it showed us were more valid? If an instructor cannot verify the status of allegedly non-original content it becomes more difficult to prepare a plan for correction.

The performance of the last three tools was nearly equivalent. VeriCite and UniCheck performed the best in the main question, with Urkund close behind. In the secondary question, VeriCite was behind by just 2-3 samples. I found in examining these tools that the difference may be in the depth to which the web crawl indexed sites. While all three found the sites that any student might access from a simple web search, the differences emerged as the tools dug into deeper layers. There were several instances of papers being found that had been saved in journal, organizational, or governmental archives, or on the personal webspace of faculty members (i.e. spaces where they could give their students the link to download a PDF and read the article for a college course).

These results reveal two different forms of originality checks. SafeAssign and Turnitin seem to be designed foremost for checking submitted content against the institutional databases of their clients and partner institutions. This is useful for discovering if students have submitted the same paper in more than one course, copied work from a friend, or bought a paper from a “term paper” database. VeriCite, UniCheck and Urkund performed better in identifying web sources that may be the source of non-original content. While UniCheck and Urkund fared slightly better than VeriCite in the deep search, it was not enough to render a clear verdict as to any system being superior. All three found the most commonly searched elements that students would find when composing an academic work. This is important for using composition originality tools as a teaching resource to train students in the writing process.

Conclusion

With a variety of composition originality tools to select from, institutions looking to integrate a service into their LMS will be able to consider whichever criteria are best for them. There may be good reasons why an institution would prefer the institutional database check or the web search check to be the main focus of their tool. There are also structural and bureaucratic considerations that will inform the decision. Ultimately, these tools are intended to make the assessment process easier and more informative for both students and instructors. Freeing up instructor time that might be spent investigating papers for possible infractions gives more opportunities for other class activities. Anything that the tools can do to guide students in developing good writing practices is an obvious bonus that we should hope to see in the future of these systems.
Appendix I

Paper/Subject                 Year  Seed    Seed Source

Anthropology
Gordon, Gwendolyn             2014
Polk, Daniel                  2014
Robinson, Mark                2014
Moran Thomas, Amy             2012  2 Page  Wikipedia
Savova, Nadezhda Dimitrova    2013  4 Page  EBSCO Journal

Architecture
Buckley, Craig                2013
Efrat, Zvi                    2014
Hsieh, Lisa L.                2013
Campbell, Mark                2014  2 Page  EBSCO Journal
Sunwoo, Irene                 2013  4 Page  Encyclopedia Britannica

Chemical and Biological Engineering
Girardi, Matthew              2015
Bozym, David                  2015
Davis, Raleigh Lloyd          2015  2 Page  EBSCO Journal
Dsilva, Carmeline Joan        2015  2 Page  Wolfram Mathworld
Hiszpanski, Anna Maria        2015  4 Page  HowStuffWorks.com

Chemistry
Fortmeyer, Ivy Camille        2016
Ganguly, Aahana Nibedita      2016
Terrett, Jack Alexander       2016  2 Page  Phys.org
Hone, Graham                  2016  2 Page  Wikipedia
Digianantonio, Katherine      2016  4 Page  Center for Disease Control

Classics
Jones, Madeleine Kersti       2013
Clark, Virginia Emily         2014
Meinrath, Danielle            2015
Oswald, Simon                 2014  2 Page  History.com
Sirois, Martin                2014  4 Page  Scopus Journal

East Asian Studies
Bridges IV, William H.        2012
Gregory, Scott Wentworth      2012
Hunter, Michael               2012  2 Page  AsiaSociety.org
Ro, Sang-ho                   2012  2 Page  Today Translations
Compton, Eno                  2013  4 Page  New World Encyclopedia

Economics
Alvarez, Jorge Alejandro      2016
Ge, Qi                        2016
Ravit, Jason Gregory          2016
Zeltzer, Dan                  2016  2 Page  Amer Soc of Mech Engr
Feng, Xiaochen                2016  4 Page  BeBusinessed.com

German
Attanucci, Timothy J.         2012
Christian, Margareta Ingrid   2012
Eldridge, Sarah Vandegrift    2012
Spies, Petra                  2012  2 Page  Wikipedia
King, Alana Jane              2014  4 Page  Christian Cyclopedia

Neuroscience
Coen, Philip                  2015
Silbert, Lauren               2014
Solway, Alec                  2014  2 Page  Encyclopedia of Philosophy
Eldar, Eran                   2014  4 Page  Wikipedia
Opendak, Maya                 2015  4 Page  Scopus Journal

Public and International Affairs
Coffey, Diane                 2015
Palmer, John RB               2013
Collins, Liam                 2014  2 Page  Small Wars Journal
Kanter, David                 2014  4 Page  Wikipedia
Lim, Darren James             2014  4 Page  CIA Worldbook
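
The 25/25 split between unseeded and seeded papers described in Methods can be checked against the Appendix I listing. The Python sketch below encodes only the per-discipline counts read off that listing; the variable names are illustrative and not part of the study.

```python
# Per-discipline counts read from Appendix I, as
# (unseeded papers, papers with a "2 Page" seed, papers with a "4 Page" seed).
# The structure and names here are illustrative only.
design = {
    "Anthropology": (3, 1, 1),
    "Architecture": (3, 1, 1),
    "Chemical and Biological Engineering": (2, 2, 1),
    "Chemistry": (2, 2, 1),
    "Classics": (3, 1, 1),
    "East Asian Studies": (2, 2, 1),
    "Economics": (3, 1, 1),
    "German": (3, 1, 1),
    "Neuroscience": (2, 1, 2),
    "Public and International Affairs": (2, 1, 2),
}

unseeded = sum(u for u, _, _ in design.values())
seeded = sum(two + four for _, two, four in design.values())
papers_per_discipline = {name: sum(counts) for name, counts in design.items()}

print(f"disciplines: {len(design)}")
print(f"unseeded: {unseeded}, seeded: {seeded}, total: {unseeded + seeded}")
```

Every discipline contributes five papers, but the seeded share varies between two and three papers per discipline.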
Appendix II

Plagiarism Analysis Results

Paper/Subject          SafeAssign  Turnitin  VeriCite  Urkund  UniCheck  Seed Source

Anthropology
Gordon                 2%          17%       11%       9%      4.3%
Polk                   2%          10%       10%       2%      0.0%
Robinson               1%          10%       10%       2%      0.0%
Moran Thomas 2P        13%         25%       19%       19%     14.7%     Wikipedia
Savova 4P              1%          10%       53%       47%     42.1%     EBSCO Journal

Architecture
Buckley                1%          10%       10%       4%      0.0%
Efrat                  1%          17%       17%       13%     9.8%
Hsieh                  2%          12%       10%       2%      0.0%
Campbell 2P            1%          17%       10%       1%      2.1%      EBSCO Journal
Sunwoo 4P              28%         54%       48%       46%     46.4%     Encyclopedia Britannica

Chemical and Biological Engineering
Girardi                11%         52%       47%       4%      37.3%
Bozym                  4%          19%       10%       6%      4.7%
Davis 2P               10%         37%       31%       1%      24.6%     EBSCO Journal
Dsilva 2P              9%          49%       42%       24%     60.0%     Wolfram Mathworld
Hiszpanski 4P          28%         56%*      41%       41%     43.5%     HowStuffWorks.com

Chemistry
Fortmeyer              1%          16%       10%       4%      0.0%
Ganguly                4%          28%       10%       6%      2.1%
Terrett 2P             16%         77%       77%       83%     66.8%     Phys.org
Hone 2P                21%         41%       18%       24%     18.7%     Wikipedia
Digianantonio 4P       41%         63%       60%       44%     56.1%     Center for Disease Control

Classics
Jones                  1%          7%        10%       4%      0.0%
Clark                  8%          12%       22%       15%     9.9%
Meinrath               2%          23%       10%       2%      0.0%
Oswald 2P              20%         26%*      23%       22%     21.1%     History.com
Sirois 4P              2%          45%       10%       3%      2.2%      Scopus Journal

East Asian Studies
Bridges                7%          15%       10%       4%      6.5%
Gregory                4%          14%       10%       5%      0.0%
Hunter 2P              26%         32%       27%       27%     21.2%     AsiaSociety.org
Ro 2P                  9%          25%       25%       24%     23.5%     Today Translations
Compton 4P             11%         56%       55%       50%     44.4%     New World Encyclopedia

Economics
Alvarez                2%          96%       10%       94%     74.3%
Ge                     1%          8%        65%       71%     55.4%
Ravit                  11%         14%       10%       2%      0.0%
Zeltzer 2P             20%         49%       41%       35%     29.8%     Amer Soc of Mech Engr
Feng 4P                32%         94%       34%       96%     30.8%     BeBusinessed.com

German
Attanucci              2%          23%       10%       9%      3.3%
Christian              1%          17%       10%       5%      2.7%
Eldridge               1%          17%       10%       1%      7.9%
Spies 2P               13%         30%       24%       23%     22.3%     Wikipedia
King 4P                45%         51%*      39%       52%     34.8%     Christian Cyclopedia

Neuroscience
Coen                   2%          30%       10%       74%     0.0%
Silbert                9%          62%       47%       38%     23.7%
Solway 2P              10%         94%*      88%       95%     74.6%     Encyclopedia of Philosophy
Eldar 4P               34%         61%       52%       84%     53.5%     Wikipedia
Opendak 4P             20%         48%       50%       4%      16.3%     Scopus Journal

Public and International Affairs
Coffey                 2%          93%       94%       96%     63.3%
Palmer                 1%          9%        38%       38%     2.5%
Collins 2P             15%         50%       33%       28%     19.0%     Small Wars Journal
Kanter 4P              40%         65%       51%       51%     59.3%     Wikipedia
Lim 4P                 33%*        58%       48%       44%     43.5%     CIA Worldbook

Seeded Material Found  15*         18*       22        21      22
Total Samples = 25     60%         72%       88%       84%     88%

Red: Seeded material was not identified
Green: Seeded material was identified
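
Q1 credits a tool only when it both flags the seeded passage and names the site the seed was taken from; the starred entries in the table appear to mark the cases where the passage was flagged but a different source was listed. A rough Python sketch of that scoring rule follows. The `Outcome` labels, the `score_seeded_paper` function, and the sample inputs are hypothetical and are not taken from any tool's report format.

```python
from enum import Enum
from typing import Iterable, Optional


class Outcome(Enum):
    CREDITED = "flagged, seed source named"
    WRONG_SOURCE = "flagged, but a different source named"
    MISSED = "not flagged"


def score_seeded_paper(
    seed_source: str, reported_sources: Optional[Iterable[str]]
) -> Outcome:
    """Classify one tool's result for one seeded paper.

    `reported_sources` is None when the tool did not flag the seeded passage;
    otherwise it holds the sources the tool listed for that passage.
    """
    if reported_sources is None:
        return Outcome.MISSED
    names = {source.strip().lower() for source in reported_sources}
    if seed_source.strip().lower() in names:
        return Outcome.CREDITED
    return Outcome.WRONG_SOURCE


# Hypothetical inputs for illustration only; they are not rows from the study.
examples = [
    ("Wikipedia", ["wikipedia"]),
    ("History.com", ["Submitted to [Institution] Student Paper"]),
    ("EBSCO Journal", None),
]
outcomes = [score_seeded_paper(seed, reported) for seed, reported in examples]
credited = sum(outcome is Outcome.CREDITED for outcome in outcomes)
print(f"credited {credited} of {len(outcomes)} seeded papers")
```

Under this rule only `CREDITED` outcomes count toward the "Seeded Material Found" row, which is why the SafeAssign and Turnitin totals exclude their starred cases.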
Appendix III

Source Analysis Results

Paper/Subject          SafeAssign  Turnitin  VeriCite  Urkund  UniCheck

Anthropology
Gordon                 2%          17%       11%       9%      4.3%
Polk                   2%          10%       10%       2%      0.0%
Robinson               1%          10%       10%       2%      0.0%
Moran Thomas 2P        13%         25%       19%       19%     14.7%
Savova 4P              1%          10%       53%       47%     42.1%

Architecture
Buckley                1%          10%       10%       4%      0.0%
Efrat                  1%          17%       17%       13%     9.8%
Hsieh                  2%          12%       10%       2%      0.0%
Campbell 2P            1%          17%       10%       1%      2.1%
Sunwoo 4P              28%         54%       48%       46%     46.4%

Chemical and Biological Engineering
Girardi                11%         52%*      47%*      4%      37.3%*
Bozym                  4%          19%       10%       6%      4.7%
Davis 2P               10%         37%       31%       1%      24.6%
Dsilva 2P              9%          49%       42%       24%     60.0%*
Hiszpanski 4P          28%         56%       41%       41%     43.5%

Chemistry
Fortmeyer              1%          16%       10%       4%      0.0%
Ganguly                4%          28%       10%       6%      2.1%
Terrett 2P             16%         77%*      77%*      83%*    66.8%*
Hone 2P                21%         41%*      18%       24%     18.7%
Digianantonio 4P       41%         63%*      60%*      44%     56.1%*

Classics
Jones                  1%          7%        10%       4%      0.0%
Clark                  8%          12%       22%       15%     9.9%
Meinrath               2%          23%       10%       2%      0.0%
Oswald 2P              20%         26%       23%       22%     21.1%
Sirois 4P              2%          45%       10%       3%      2.2%

East Asian Studies
Bridges                7%          15%       10%       4%      6.5%
Gregory                4%          14%       10%       5%      0.0%
Hunter 2P              26%         32%       27%       27%     21.2%
Ro 2P                  9%          25%       25%       24%     23.5%
Compton 4P             11%         56%       55%       50%     44.4%

Economics
Alvarez                2%          96%*      10%       94%*    74.3%*
Ge                     1%          8%        65%*      71%*    55.4%*
Ravit                  11%         14%       10%       2%      0.0%
Zeltzer 2P             20%         49%       41%       35%     29.8%
Feng 4P                32%         94%*      34%       96%*    30.8%

German
Attanucci              2%          23%       10%       9%      3.3%
Christian              1%          17%       10%       5%      2.7%
Eldridge               1%          17%       10%       1%      7.9%
Spies 2P               13%         30%       24%       23%     22.3%
King 4P                45%         51%       39%       52%*    34.8%

Neuroscience
Coen                   2%          30%*      10%       74%*    0.0%
Silbert                9%          62%*      47%*      38%*    23.7%*
Solway 2P              10%         94%*      88%*      95%*    74.6%*
Eldar 4P               34%         61%*      52%       84%*    53.5%*
Opendak 4P             20%         48%*      50%*      4%      16.3%*

Public and International Affairs
Coffey                 2%          93%*      94%*      96%*    63.3%*
Palmer                 1%          9%        38%*      38%*    2.5%
Collins 2P             15%         50%       33%       28%     19.0%
Kanter 4P              40%         65%*      51%       51%     59.3%
Lim 4P                 33%         58%       48%       44%     43.5%

Original Material Found (published elsewhere)
                       0           12        9         11      11

Blue: Original material found elsewhere
Appendix IV

Screenshots to show system differences:

SafeAssign summary of Chemical and Biological Engineering works.

SafeAssign report for the Dsilva paper. Note that the seeded material source, Wolfram Mathworld, is not identified.
Turnitin summary of Classics works.

Turnitin results for the Oswald paper. Note that an incorrect source is identified for the seeded material and that this result comes from the cross-institutional database.
VeriCite summary of Anthropology works.

VeriCite results for the Savova paper. Note how seeded content is highlighted and corresponds to the source listed in the side panel.
UniCheck summary of Anthropology works.

UniCheck results page of Moran Thomas work.
Urkund summary of Architecture works.

Urkund results for the Sunwoo paper. Note how both a site that copied this seeded material and the legitimate site of the seeded material are listed.
Example of SafeAssign correctly identifying a source of seeded material in Moran Thomas paper from Anthropology. The text highlighted in yellow is identified as being in another student paper and the green highlight appears in another version of Wikipedia.

Example of Turnitin finding original work published in another place in Feng paper from Economics. In this case it is a PDF stored on a university webspace in Singapore.
Example of VeriCite identifying the original work that a block quote came from in the Efrat paper from Architecture. Upon inspection, it can be seen that this selection is properly cited, which is good for an instructor to know.

Example of Urkund identifying seeded material and finding another place where the original material was published in the King paper from German.