
(2015). Consistent commitment: Patterns of engagement across time in Massive Open Online Courses (MOOCs). Journal of Learning Analytics, 2(3), 55–80. http://dx.doi.org/10.18608/jla.2015.23.5

ISSN 1929-7750 (online). The Journal of Learning Analytics works under a Creative Commons License, Attribution-NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0)

Consistent Commitment: Patterns of Engagement across Time in Massive Open Online Courses (MOOCs)

Rebecca Ferguson and Doug Clow

Institute of Educational Technology, The Open University, [email protected]

ABSTRACT: Massive open online courses (MOOCs) are being used across the world to provide millions of learners with access to education. Many who begin these courses complete them successfully, or to their own satisfaction, but the high numbers who do not finish remain a subject of concern. In 2013, a team from Stanford University analyzed engagement patterns on three MOOCs run on the Coursera platform. They found four distinct patterns of engagement that emerged from MOOCs based on videos and assessments. Subsequent studies on the FutureLearn platform, which is underpinned by social-constructivist pedagogy, indicate that patterns of engagement in these massive learning environments may be influenced by decisions about pedagogy and learning design. This paper reports on two of these studies of learner engagement with FutureLearn courses. Study One first tries, not wholly successfully, to replicate the findings of the Coursera study in a new context. It then uses the same methodological approach to identify patterns of learner engagement on the FutureLearn platform, and explores how these patterns may have been influenced by pedagogy and elements of learning design. Study Two investigates whether these patterns of engagement are stable on subsequent presentations of the same courses. Two patterns are found consistently in this and other work: samplers who visit briefly, and completers who fully engage with the course. The paper concludes by exploring the implications for both research and practice.

1 INTRODUCTION

The generic name for MOOCs emphasizes their commonalities of scale (massive), economic/philosophical perspective (open), location (online), and structure (course). At the same time, it omits a key area of difference, their underlying pedagogy, which can also be described as their approach to teaching and learning and their understanding of how learning takes place.

The original MOOCs, developed by Siemens and Downes from 2008 onwards, employed a connectivist approach to learning (Downes, 2012). This approach is signalled by references to them as cMOOCs. Connectivism considers learning to be social, technologically enhanced, distributed within a network and associated with the recognition and interpretation of patterns (Siemens, 2005). Knowledge is developed as a result of experience and the self-organizing nature of appropriately designed networks (Downes, 2012).

The cMOOCs were followed by xMOOCs. In this case, the defining letter “x” did not refer to a specific pedagogy, but to the role of these MOOCs as an extension to a previous offering. However, in many cases, this “extension” version took the essential elements of education to be content and assessment, with input from educators bundled as part of the content. This led to an instructivist approach to teaching and learning in which “learning goals are predefined by an instructor, learning pathways structured by environment and learners have limited interactions with other learners” (Littlejohn, 2013). In Siemens’ view, “cMOOCs focus on knowledge creation and generation whereas xMOOCs focus on knowledge duplication” (Siemens, 2012).

This binary distinction between cMOOCs and xMOOCs was useful when the first xMOOCs appeared, but has become less relevant as new MOOC platforms and course developers have emerged. The FutureLearn platform, launched in 2013, employs a social-constructivist pedagogy, based on the Conversational Framework (Laurillard, 2002; Pask, 1976). This is a general theory of effective learning through conversations, with oneself and others, about the immediate world and about abstract concepts (Ferguson & Sharples, 2014). To engage in successful conversations, all parties need access to a shared representation of the subject matter as well as tools for commenting, responding, and reflecting, and so these tools and shared representations formed part of the design of the FutureLearn platform.

Although different types of MOOCs have different pedagogies, a problem that most of them experience is the large difference between numbers registering and numbers completing. Some of this drop-off can be explained in positive ways. MOOC registration can be regarded as similar to bookmarking a college website, sending off for a course prospectus, or placing a textbook temporarily in an online shopping basket: it does not necessarily represent a commitment to engage further with learning in that context.

FutureLearn reports that around a third of those who register do not return to start the course. This suggests it is more useful to compare the number of learners starting a course with the number of learners who complete, rather than taking the number of people who register as the baseline. Up to a third of those who “drop out” may have clicked the “Register” button with little or no intention of completing, or even starting, the course. Another large group, characterized as “Samplers,” appears to be made up of those who take a brief look at a course, decide it is not for them, and then leave (Kizilcec, Piech, & Schneider, 2013).
Respondents to surveys associated with FutureLearn MOOCs have shared various reasons for participating, including trying out learning online, finding out more about MOOCs, or finding out more about a specific university. The learning objectives of these learners may be met without working through an entire course.

Another perspective is offered by Downes, who points out that most media are not consumed in their entirety by any single individual: “It’s actually very rare to find media of any sort that is intended to be consumed in its entirely. Most of the time, in most things, we pick and choose what is important to us. That is the normal mode of interacting with content […] nobody thinks a library a failure if you don’t read everything in the collection, or an author a failure if you don’t read their entire corpus. And just so with MOOCs” (Downes, 2014).

Despite these valid reasons and explanations for non-completion, the high dropout rates on most MOOCs still provide cause for concern. MOOCs are open, but “students seek not merely access, but access to success” (Daniel, 2012) and it seems that many encounter not success but failure when they study these courses. The steep drop-off of the “Funnel of Participation” (Clow, 2013), which shows similar declines in engagement across different types of online learning sites, is not welcomed by all.

Jordan’s work (Jordan, 2014a) provides an overview of this issue, and her website (Jordan, 2014b) provides a dynamic view of the data about MOOC completion rates that has been made public. The site currently presents data relating to around 220 MOOCs based on a variety of different platforms. The site deals with the percentage of registered learners who complete a course (rather than the percentage of people who actually started the course who go on to complete). It shows that MOOCs typically report completion rates of around 15%. At the time of writing, no MOOC with more than 60,000 registered students had reported a completion rate of higher than 13%, and only three MOOCs had achieved a completion rate of over 40%. However, these high-performing outliers suggest that we should not be satisfied with current MOOC completion rates, because higher ones are possible. This variability in figures suggests that course context, course design and course pedagogy have an effect on retention, and thus on students’ chances of success.

1.1 Using Analytics to Investigate MOOCs

Learning analytics are concerned with the use of trace data relating to learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs (SoLAR, 2011). They therefore offer a way of identifying factors that influence retention, enabling educators and platform providers to make changes to context, design, and pedagogy where appropriate. The large datasets generated by MOOC activity provide a strong basis for this type of approach.

Kop and colleagues looked at engagement patterns on the PLENK2010 connectivist MOOC (Kop, Fournier, & Sui Fai, 2011). As this type of MOOC links a network of tools and resources, it is difficult to track all activity, but their conclusion was that 40–60 of the 1,641 learners registered on the course contributed actively on a regular basis, and that this engagement supported positive learning outcomes. The visible participation rate of others was much lower, indicating a consuming behaviour. This division between the Visible Contributors and the Consumers signalled that there are different ways of interacting on a MOOC.

Milligan and colleagues investigated patterns of engagement in connectivist MOOCs and identified three distinct types of engagement: Active Participation, Passive Participation, and Lurking (Milligan, Littlejohn, & Margaryan, 2013). These classifications are strongly associated with connectivist pedagogy and with MOOCs that are structured to enable learners to construct knowledge together, so they are less relevant to courses in which key elements are pre-defined course content and assessment.

In 2013, Kizilcec, Piech, and Schneider analyzed patterns of engagement and disengagement in three MOOCs on the Coursera platform. Two understandings underpinned this analysis. One was the view that “learning is a process of individual knowledge construction.” The second was the view that the principal features of these courses were video lectures and assessments. This analysis found four patterns of engagement with these courses:

• Completing: These learners completed the majority of assessments.
• Auditing: These learners watched most of the videos but completed assessments infrequently, if at all.
• Disengaging: These learners completed assessments at the start of the course, then reduced their engagement.
• Sampling: These learners explored some course videos.

As these clusters were found consistently across three MOOCs, each targeted at students working at a different educational level, it appeared plausible that they could be applied to other courses and that “MOOC designers can apply this simple and scalable categorization to target interventions and develop adaptive course features” (Kizilcec, Piech, & Schneider, 2013).

This Coursera study has proved to be very influential. Not only is it highly cited within the academic literature but also the terms it uses are widely applied when describing and planning for MOOC engagement. However, the generalizability of these categories is qualified on the grounds that these classifications “would make sense in any MOOC that is based on videos and assessments” (Kizilcec, Piech, & Schneider, 2013). This suggests that these categories may only be applicable to courses using pedagogies based on course content and on the use of either formative or summative assessment.

In this paper, we investigate whether the same patterns of engagement are found in MOOCs that employ social-constructivist pedagogy (Study One), or if other patterns of engagement apply. We also investigate whether patterns of learner engagement in MOOCs are stable across multiple presentations of the same course (Study Two).

In social-constructivist MOOCs, knowledge is jointly constructed through conversation. Contributing to or reading discussion comments is therefore an important part of the learning process. In these cases, three elements should be taken into account: 1) active engagement with course content, 2) active engagement with course assessment, and 3) active engagement with course discussion.

1.2 Replication

Within the hard sciences, replication of previously published work is routine. Within the social sciences, it is unusual. Recent efforts to replicate key findings in social psychology have been controversial (Bohannon, 2014). There has been significant replication work in the Educational Data Mining field, particularly the body of work associated with the Pittsburgh Science of Learning Center’s DataShop.[1]

[1] http://www.learnlab.org/technologies/datashop/



However, outside the context of cognitive tutors and similar fine-grained analysis, replication work in learning analytics is very unusual, as far as we are aware. We believe that the field of learning analytics, as an empirical discipline, is well placed to use such rigorous approaches. To be clear, we regard the work of Kizilcec and colleagues as important and interesting, and in the studies reported here we are seeking to test how robust their findings are in a different context.

2 FUTURELEARN DATASET

In order to investigate patterns of engagement within MOOCs employing social-constructivist pedagogy, we used data provided by FutureLearn. This company, owned by The Open University, is currently in partnership with fifty universities, sixteen specialist organizations, and six centres of excellence to deliver free online courses. The company has developed a new MOOC platform based on scalable web technology and underpinned by a social-constructivist pedagogy (Sharples & Ferguson, 2014). Each teaching element (step) is associated with a free-flowing discussion. These are intended to emulate a “water cooler discussion” about the immediate content. As with conversations around an office water cooler, people come and go, nobody is expected to engage throughout, but there can be a continuous sense of lively interaction. Typically, the discussion associated with any step on a FutureLearn MOOC will attract hundreds, thousands, or even tens of thousands of contributions, with “Like” and “Follow” options providing ways of navigating these.

2.1 Study One: Identifying Patterns of Engagement

Each FutureLearn partner institution has access to data related to courses they run. In Study One, designed to identify patterns of engagement in FutureLearn MOOCs, we therefore focused our attention on data from four MOOCs run by our institution, The Open University. The data on platform activity are not directly associated with demographic data (as discussed below), and so the figures for gender balance presented in Table 1 are taken from responses to the start-of-course survey for each course. These four MOOCs ran soon after the launch of FutureLearn, at a point when the majority of marketing was taking place within the UK, so the majority of participants on each course were based in the UK.

The number of participants reported in Table 1 indicates the number of individuals (both educators and learners) who appeared in our Study One dataset because they were active on the platform after the start date of the course. People active before the start date were removed from the dataset entirely, because the only people with access at that point were educators or FutureLearn staff. In FutureLearn terms, a Fully Participating learner is one who marked a majority of the course steps complete and completed all the assessments.



Table 1: Overview of dataset for Study One

                     MOOC1              MOOC2          MOOC3   MOOC4
Subject area         Physical Sciences  Life Sciences  Arts    Business
M                    51%                39%            32%     35%
F                    48%                61%            67%     65%
Participants         5,069              3,238          16,118  9,778
Fully Participating  1,548              684            3,616   1,416
Participation Rate   31%                21%            22%     14%
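As a small worked check, the participation rates in Table 1 can be reproduced from the participant counts. The sketch below is illustrative only: the function names are hypothetical, "majority" is assumed to mean more than half of the steps, and rounding to the nearest whole percent is an assumption about how the table was produced.

```python
def is_fully_participating(steps_completed, total_steps,
                           assessments_completed, total_assessments):
    """Assumed reading of the FutureLearn definition: more than half of
    the steps marked complete AND every assessment completed."""
    majority_of_steps = steps_completed * 2 > total_steps
    all_assessments = assessments_completed == total_assessments
    return majority_of_steps and all_assessments


def participation_rate(fully_participating, participants):
    """Percentage of active participants who fully participated,
    rounded to the nearest whole percent (an assumption)."""
    return round(100 * fully_participating / participants)


# Counts taken from Table 1 (fully participating, participants).
table_1 = {
    "MOOC1": (1548, 5069),
    "MOOC2": (684, 3238),
    "MOOC3": (3616, 16118),
    "MOOC4": (1416, 9778),
}
rates = {name: participation_rate(*counts) for name, counts in table_1.items()}
```

Note that the denominator here is the number of people active after the course start date, not the number who registered, which is the baseline the introduction argues for.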

Each of these MOOCs specified that no previous experience of studying the subject was required. However, the Arts MOOC3 was recommended for learners aged 16+, due to the possibility that discussion content might include material that would be inappropriate for minors. Fully Participating learners on these MOOCs could gain a Statement of Participation for a fee, but no credits were awarded and participants were not required to complete assessments by a certain date. As a result, the “finished assessment on time” and “finished assessment late” classifications were of less relevance than in the Coursera study (Kizilcec, Piech, & Schneider, 2013). Coursera course content focuses on videos. While FutureLearn courses also include videos, they make use of substantial quantities of other learning material, predominantly discussion and text with pictures.

The different context of Coursera and FutureLearn affected the raw data accessible to us as analysts. Kizilcec and colleagues were able to obtain full log data, assessment scores, demographic data, and survey responses, all at an individual learner level. This enabled them to report some interesting correlations. On early FutureLearn MOOCs, demographic, survey, and activity data were not linked by personal identifiers, so we had access to demographic and survey data only at a summary level for the courses we were researching. We were able to obtain individual-level activity data, which were anonymized before we received them. As well as assigning random IDs, this process partially aggregated the activity data, so we only had access to the date and time of a learner’s first visit to a step, not the date or time of any subsequent visits. Graphical inspection of these data suggested that there were no gross differences in activity patterns compared to what one would expect in more traditional log file analysis, although this change has had some effects, which we note where they arise.

3 CLUSTERING

3.1 Replicating the Method

Our initial approach in Study One followed the one applied in the Coursera study as far as possible, in order to investigate whether previous findings could be replicated in a different context. That study adopted a methodology set out in Kizilcec, Piech, and Schneider (2013), designed to identify a small number of ways in which learners interact with MOOCs.



Kizilcec and his colleagues began by computing a description for individual learners of the way in which they engaged in each assessment period of the course (typically a week), and then applied clustering techniques to find subpopulations in these engagement descriptions. FutureLearn courses are divided into weeks and, although there is no compulsion to work through at the same pace as others, an overview of the data showed that activity did indeed follow a weekly cycle, spiking after the weekly course email was sent out each Monday (see Figure 1). Underlying the spikes, there was a general falling-off in activity over time, as would be expected in a MOOC (Clow, 2013). It is possible, however, that our measure of activity (time of first visit to each piece of content) masked later activity because re-visits to content visited earlier were not counted, and so Figure 1 is likely to over-state the fall-off in activity.

We therefore followed the Coursera study and computed a description for each learner based on individual activity in a course week. These “engagement descriptions” have six or eight elements, because each course ran for six or eight weeks.
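The weekly description above depends on assigning each first-visit timestamp to a course week. A minimal sketch of that step follows; the function names and the start date are hypothetical, and ignoring activity after the final week is a simplification made here rather than something the study states.

```python
from datetime import datetime


def course_week(first_visit, course_start, n_weeks):
    """Return the 0-based course week containing a first-visit timestamp.

    Returns None for activity before the start date (removed from the
    study dataset) or after the final week (ignored in this sketch).
    """
    elapsed_days = (first_visit - course_start).days
    if elapsed_days < 0:
        return None
    week = elapsed_days // 7
    return week if week < n_weeks else None


def weeks_with_activity(first_visits, course_start, n_weeks):
    """One boolean per course week: did the learner first visit any step?"""
    active = [False] * n_weeks
    for timestamp in first_visits:
        week = course_week(timestamp, course_start, n_weeks)
        if week is not None:
            active[week] = True
    return active
```

Because only first visits are available, a learner who returns to earlier steps in a later week would appear inactive in that week, which is the masking effect noted above.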

Figure 1: Activity on Study One, MOOC1. Dark red vertical lines show the date of the weekly course email.

As in the Coursera study, students’ activity in each week was assigned to one of four categories:

• “T = on track” if they undertook the assessment on time
• “B = behind” if they submitted the assessment late
• “A = auditing” if they engaged with content but not with the assessment
• “O = out” if they did not participate.

Although FutureLearn students were not required to submit assessed work by a particular date, we retained these classifications and counted an assessment as submitted “late” if it was completed after the end of the course week. For example, if a learner missed the first week of a course (out), submitted assessments on time in weeks 2–6 (on track), submitted week 7’s assessment in week 8 (behind), and then engaged with content only in week 8 (auditing), their engagement profile during this section of the study would appear as [O, T, T, T, T, T, B, A].

Once we had created engagement profiles for each Study One learner in this way, we followed the method of Kizilcec’s team and applied the k-means clustering algorithm to partition the learners into a small number of groups. To cluster engagement patterns, a numerical value for the dissimilarity between them is required. There are several possibilities for this, but in this initial exploration, we followed their approach: we first assigned numerical values to each label (On Track = 3, Behind = 2, Auditing = 1, Out = 0). We then calculated the L1 norm for each engagement pattern and used that as the basis for one-dimensional k-means clustering, thus minimizing the sum of the differences between individual patterns in each cluster.
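The encoding and one-dimensional clustering described above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the clustering is a basic Lloyd's-algorithm k-means, and the restart with the lowest within-cluster sum of squares is kept as a stand-in for the "highest likelihood" criterion used in the study.

```python
import random

# Numeric values assigned to each weekly label in the text.
LABEL_VALUES = {"T": 3, "B": 2, "A": 1, "O": 0}


def encode_profile(labels):
    """Turn weekly labels such as ["O", "T", "B"] into numbers."""
    return [LABEL_VALUES[label] for label in labels]


def l1_norm(profile):
    """Sum of absolute values: collapses a profile to a single number."""
    return sum(abs(value) for value in profile)


def kmeans_1d(values, k, restarts=100, iterations=50, seed=0):
    """Basic k-means on scalars; returns the sorted centres of the
    restart with the lowest within-cluster sum of squares."""
    rng = random.Random(seed)
    distinct = sorted(set(values))
    if k > len(distinct):
        raise ValueError("k exceeds the number of distinct values")
    best_cost, best_centres = None, None
    for _ in range(restarts):
        centres = rng.sample(distinct, k)  # random initial centres
        for _ in range(iterations):
            groups = [[] for _ in range(k)]
            for value in values:
                nearest = min(range(k), key=lambda i: abs(value - centres[i]))
                groups[nearest].append(value)
            # Keep the old centre if a group ends up empty.
            updated = [sum(g) / len(g) if g else centres[i]
                       for i, g in enumerate(groups)]
            if updated == centres:
                break
            centres = updated
        cost = sum(min((v - c) ** 2 for c in centres) for v in values)
        if best_cost is None or cost < best_cost:
            best_cost, best_centres = cost, sorted(centres)
    return best_centres


example = encode_profile(["O", "T", "T", "T", "T", "T", "B", "A"])
example_norm = l1_norm(example)
```

Collapsing each profile to its L1 norm discards the timing of activity: a learner who is on track early and absent later receives the same value as one who does the reverse, which is the information loss that motivates the multi-dimensional analysis reported later in this section.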

Figure 2: Proportion of users on Study One MOOC1 falling into each category, by week

We also followed their practice in repeating clustering 100 times and selecting the solution with the highest likelihood, because k-means has random aspects.

To attempt to replicate their findings, we focused on extracting four clusters. However, our clusters did not match those found in the Coursera study. The method produced two clusters that were very similar to the ones they found. We found a cluster strikingly similar to their “Completing” group: learners who completed almost all the assessments. We also found a cluster similar to their “Sampling” group: learners who visited only once or twice, and did not attempt any assessment.



The other two clusters did not match theirs so well. They found an “Auditing” cluster “who did assessments infrequently if at all and engaged instead by watching video lectures,” and a “Disengaging” cluster “who did assessments at the beginning of the course but then have a marked decrease in engagement.” Our two equivalent clusters did include some learners with patterns like that, but they also included many who did not fit those descriptions neatly, and there seemed to be significant overlap. This was reflected in the silhouette scores: they reported an average silhouette width of 0.8, but our data achieved only 0.67. (The closer to 1.0, the better clustered.)

Selecting a number of clusters, k, to extract using k-means is notoriously problematic, unless there is an unambiguous a priori rationale. A notable feature of k-means clustering is that it will always generate k clusters regardless of whether another number would provide a better fit for the data.

We therefore repeated the analysis using values for k that ranged from 3 to 8. We found that the silhouette width was at a minimum for k = 4, suggesting that this might be the least suitable number of clusters for our data.

We were concerned that the one-dimensional approach was discarding potentially useful information about patterns of engagement before the clustering algorithm could use it, so we repeated the analysis again, this time running k-means on the numeric engagement profiles directly, treating them as six- or eight-dimensional vectors. We explored the four-cluster solution in detail, and again found “Completing” and “Sampling” clusters, but no clear pattern in the other two.

Again, we explored k from 3 to 8, which yielded silhouette widths rather lower than for the one-dimensional approach (around 0.4), and which decreased monotonically as k increased.

3.2 Adapting the Method

Having had mixed results from direct application of the method, we planned to adapt their method, making a minimal number of changes to reflect the context of our data. We developed a new classification, in order to reflect the importance of discussion in FutureLearn MOOCs. For each content week, students were assigned the value 1 if they viewed content, 2 if they posted a comment, 4 if they submitted the week’s assessment late and 8 if they completed the final assessment before the end of the week in which it was set (i.e., early, or on time). These values were added up to give a total for each week. The possible scores that could be assigned to an individual learner in any one week are set out in Table 2 (below).

The majority of comments on FutureLearn are associated with a content step, as study weeks typically include only one or two discussion steps. There is therefore no way to tell from our log data whether learners had participated in the discussion by reading but not by adding a comment, so this option could not be coded.
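The additive weekly scoring described above behaves like a set of bit flags, which the following sketch makes explicit. The constant and function names are hypothetical, and treating "late" and "on time" as mutually exclusive within a week is an assumption drawn from the description.

```python
# Weights from the text: each activity contributes a distinct power of two,
# so every weekly total maps back to exactly one combination of activities.
VIEWED_CONTENT = 1
POSTED_COMMENT = 2
ASSESSMENT_LATE = 4
ASSESSMENT_ON_TIME = 8


def weekly_score(viewed, commented, assessment=None):
    """Score one learner-week; assessment is None, "late", or "on_time"."""
    if assessment not in (None, "late", "on_time"):
        raise ValueError("assessment must be None, 'late', or 'on_time'")
    score = 0
    if viewed:
        score += VIEWED_CONTENT
    if commented:
        score += POSTED_COMMENT
    if assessment == "late":
        score += ASSESSMENT_LATE
    elif assessment == "on_time":
        score += ASSESSMENT_ON_TIME
    return score


def describe(score):
    """Decode a weekly score back into the activities it represents."""
    parts = []
    if score & VIEWED_CONTENT:
        parts.append("visited content")
    if score & POSTED_COMMENT:
        parts.append("commented")
    if score & ASSESSMENT_LATE:
        parts.append("assessment late")
    if score & ASSESSMENT_ON_TIME:
        parts.append("assessment on time")
    return parts or ["no activity"]
```

Under this scheme a week with no recorded activity scores 0, and a week combining content, a comment, and an on-time assessment scores 1 + 2 + 8 = 11, the two values that appear alongside the tabulated scores in the engagement profiles reported later.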

Table 2: Scoring method

Score  Interpretation
1      only visited content (for example, video, audio, text)
2      commented but visited no new content
3      visited content and commented
4      did the assessment late and did nothing else that week
5      visited content and did the assessment late
6      did the assessment late, commented, but visited no new content
7      visited content, commented, late assessment
8      assessment early or on time, but nothing else that week
9      visited content and completed assessment early/on time
10     assessment early or on time, commented, but visited no new content

We used the k-means algorithm to extract clusters from the engagement profiles directly, as a six- or eight-dimensional vector for each learner, to allow for the possibility of clustering by time of activity, as well as by total activity. The variation in dimensions here was because three of these MOOCs ran for eight weeks but MOOC2, the life sciences MOOC, ran for six weeks.

Figure 3: Study One silhouette widths for k = 2 to 8 (left) and scree plot of the within-groups sum of squares for k = 1 to 10 (right)

Selecting a number of clusters to extract was very problematic. A wide variety of techniques was employed, almost all of which yielded unhelpful results, recommending either a minimal number of clusters (2) or a maximal number.[2]

[2] Specifically, we applied all the techniques listed in this Stack Exchange post: http://stackoverflow.com/questions/15376075/cluster-analysis-in-r-determine-the-optimal-number-of-clusters/15376462
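The two diagnostics plotted in Figure 3, mean silhouette width and within-groups sum of squares, can be computed from any labelled clustering. The sketch below is a plain-Python illustration with hypothetical function names; it is not the tooling used in the study, and the handling of singleton clusters (width 0) follows the usual convention rather than anything stated in the text.

```python
import math


def _group(points, labels):
    """Collect points by cluster label."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters


def _distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_silhouette(points, labels):
    """Average silhouette width; requires at least two clusters."""
    clusters = _group(points, labels)
    widths = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            widths.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(_distance(point, q) for q in own) / (len(own) - 1)
        # b: lowest mean distance to the members of any other cluster
        b = min(sum(_distance(point, q) for q in members) / len(members)
                for other, members in clusters.items() if other != label)
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(widths) / len(widths)


def within_groups_ss(points, labels):
    """Total squared distance of points from their cluster centroid,
    the quantity k-means seeks to minimize."""
    total = 0.0
    for members in _group(points, labels).values():
        dims = range(len(members[0]))
        centroid = [sum(m[d] for m in members) / len(members) for d in dims]
        total += sum((m[d] - centroid[d]) ** 2 for m in members for d in dims)
    return total
```

Repeating these calculations for each candidate k gives the two curves of the kind shown in Figure 3: a silhouette width closer to 1.0 indicates better-separated clusters, while the within-groups sum of squares generally falls as k grows, so the scree plot is read for a kink rather than a minimum.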



That almost all of these techniques yielded unhelpful results suggests in itself that this entire approach to clustering was far from ideal. However, we wished to pursue the approach to explore whether the clusters nonetheless had the vivid face validity of the Kizilcec et al. paper. Manual inspection of the two-cluster solution suggested that this separated the learners into those who visited only briefly (Samplers; see below) and others. We wished to explore the other patterns of engagement in some detail. A maximal number of clusters would be uninformative and unwieldy analytically.

Two methods supported a pragmatic decision to extract seven clusters. The mean silhouette width reached a local maximum at k = 7 (see Figure 3, left). We also used a scree plot to visualize the total within-groups sum of squares (the measure of how closely the clusters group together that k-means seeks to minimize). The method in using a scree plot is to identify a “kink” in the plot, which is somewhat subjective. In this case, the within-groups sum of squares dipped at k = 7 (see Figure 3, right).

3.3 Analysis of Clusters

Our cluster descriptions were developed by analysis of learner profiles on MOOC1. Like Kizilcec and colleagues, we wanted to ensure that the clusters made sense from an educational point of view, even though these descriptions were developed after the analysis had taken place.

After developing the cluster profiles for MOOC1, we examined the other Study One MOOCs. With three exceptions, there was very close agreement between the clusters found.

Two of the MOOCs were structurally different to the others. MOOC1, MOOC3, and MOOC4 ran for eight weeks, but MOOC2 ran for only six weeks. MOOC1, MOOC2, and MOOC4 included assessment every week, but MOOC3 included only three assessments. Despite these differences, five of the seven clusters were found in substantially identical form in all four MOOCs. MOOC2 and MOOC3 between them generated three clusters that did not match those developed on MOOC1, which are appended to the list as supplementary clusters.

Cluster descriptions for Study One are given below, together with notes of where individual MOOCs differed. The clusters found across all four MOOCs are indicated with a *, as well as in the text.
The proportion of learners in each cluster is shown in Table 3 (below). In all cases, the term “average” refers to the mean. We have provided typical engagement profiles for each cluster, which we derived by inspecting the profiles within each cluster manually, and selecting one that occurred very frequently.

Cluster I: Samplers*

Learners in this cluster visited, but only briefly. This cluster is essentially identical to Kizilcec et al.’s “Sampling” cluster.

Samplers made up the largest cluster in all four MOOCs, accounting for 37%–39% of learners (56% on MOOC4). They typically visited about 5% of the course, with a few Samplers (11%–24%) visiting only a single step, although only 1% of Samplers on MOOC3 did so. They were active in a very small number of weeks, often including week 1, but not always: 25%–40% joined the course after the first week. Very few Samplers posted comments (6%–15%), and very few submitted any assessment, although the handful that did so typically did this in week 1.

A typical engagement profile for this cluster shows individuals looking at content in Week 1 and not engaging in subsequent weeks: [1, 0, 0, 0, 0, 0, 0, 0]. This cluster was highly stable across all MOOCs, and across most values of k.

Cluster II: Strong Starters*

These learners completed the first assessment of the course, but then dropped out.

Strong Starters made up 8%–14% of learners. All of them submitted the first assignment, but then their engagement dropped off sharply, with very little activity after that. A little over a third of them posted comments (35%–38%), and those who did so did not post very many (1.7–4.0), except on MOOC3, where 73% of learners posted an average of 13.7 comments.

A typical engagement profile for this cluster shows them visiting content and submitting their assessment on time in Week 1, visiting content in Week 2, then disengaging: [9, 1, 0, 0, 0, 0, 0, 0]. Again, this cluster was highly stable across all MOOCs and most values of k.

Cluster III: Returners

These learners completed the assessment in the first week, returned to do so again in the second week, and then dropped out.

Returners made up 6%–8% of learners, with the exception of those studying MOOC3, where this cluster did not appear. This was almost certainly because there were only three assessments in that course, with at least one week between each, so this pattern of activity was not possible.

Almost all the learners in this cluster (>97%) submitted the assessments for week 1 and week 2. This does not mean that they all visited in both weeks; some of them did the week 1 assessment late. No Returners explored all the course steps; the average proportion of steps visited varied from 23% to 47%. After the first two weeks, there was very little activity indeed.

A typical engagement profile for this cluster shows individuals viewing content and completing assessments on time in the first two weeks, but then disengaging: [9, 9, 0, 0, 0, 0, 0, 0]. This cluster showed some variability with varying k.

Cluster IV: Mid-way Dropouts

These learners completed three or four assessments, but dropped out about half-way through the course.



Mid-way Dropouts made up 6% of learners on MOOC1, and 7% of learners on MOOC4. This cluster did not appear for MOOC2 and MOOC3, because their unusual structure (shorter, fewer assessments) meant that there were not enough assessments other than the final one for them to have this profile. These learners visited about half of the course (47%, 59%), and roughly half of them posted comments (38%, 49%), posting 6.3–6.5 comments on average.

A typical engagement profile for this cluster shows individuals submitting assessments on time in the first three weeks, submitting late in the fourth week, viewing content in weeks five and six and then dropping out: [9, 9, 9, 4, 1, 1, 0, 0]. This cluster, like the previous one, showed some variability with varying k.

Cluster V: Nearly There*

These learners consistently completed assessments, but then dropped out just before the end of the course.

Nearly There learners accounted for 5%–6% of learners on all four MOOCs. They typically visited around three-quarters of the course (72%–80%) and submitted assessments consistently (>90%) until week 5, and mostly on time (40%–75%), after which their activity declined steeply, and few completed the final assessment (3%–17%), none on time. Many of them posted comments (48%–65%), and those who did posted an average of 5.7–8.3, except for MOOC3, where 80% of learners posted an average of 21.8 comments each.

A typical engagement profile for this cluster shows high engagement early on, viewing content and submitting on time in the first six weeks, and sometimes commenting as well, then submitting the assessment in week seven and ceasing to engage: [11, 11, 9, 11, 9, 9, 8, 0]. This cluster appeared definitively for all four MOOCs, but was somewhat variable with varying k.

Cluster VI: Late Completers*

This cluster includes learners who completed the final assessment, and submitted most of the other assessments, but were either late or omitted some.

Late Completers accounted for 6%–8% of learners, except on MOOC3, where only 0.2% of learners fell into this cluster. This was presumably because MOOC3 only included three assessment points, so there were fewer opportunities to get behind on the assessment. Each week, including the final week, more than 94% of this cluster submitted their assessment. The average proportion submitting late varied from 16% to 59%. However, more than three-quarters submitted the final assessment on time (78%–90%). Fewer than half of these learners posted comments (40%–43%), apart from on MOOC3 where 76% did so, and those that posted made an average of 7.9–15.0 comments.



Atypicalengagementprofileforthisclustershowsindividualsvisitingcontentandsubmittingassessmentslate in the first fiveweeks,and thenviewingcontentandsubmittingassessmentson time in the finalweek:[5,5,5,5,5,9,9,9].ThisclusterwasfairlystableacrossallMOOCs,andacrossmostvaluesofk.ClusterVII:KeenCompleters*Thisclusterconsistsoflearnerswhocompletedthecoursediligently,engagingactivelythroughout.KeenCompletersaccountedfor7%to13%oflearners,apartfromonMOOC1,where23%oflearnersfellintothiscluster.All learners inthisclustercompletedall theassessments, includingthefinalone,andalmostallofthemontime(>80%).Onaverage,theKeenCompletersvisited>90%ofthecoursecontent.Theywerealsoassiduouscommenters.Abouttwo-thirdsofthiscluster(68%–73%)contributed20.8–24.4comments on average, although again MOOC3 stood out, with all learners commenting, posting animpressiveaverageof53.7commentseach.A typical engagement profile for this cluster shows individuals viewing comment and submittingassessmentontimeeveryweek,andalsocommentinginseveralweeks:[11,11,9,9,11,11,9,9].ThisclusterwashighlystableacrossallMOOCs,andacrossvaluesofk.MOOC2SupplementaryClusterBecauseitwasshorterthantheotherMOOCs,theMid-wayDropoutsclusterwasnotfoundonthisMOOC.Instead,arathermiscellaneousandhard-to-interpretclusterwasgenerated, fallingbetweenSamplersandStrongStarters,andaccountingfor13%ofthelearners.Sixty-sevenpercentoftheclustersubmittedthefirstassessment,butlate,andmanyeithervisitedregularlyorleftcomments.MOOC3SupplementaryCluster1MOOC3hadonlythreeassessments,sotheReturnersandMid-wayDropoutsclusterswerenotfound.Thefirstalternativeclusterwasasubstantialone,accountingfor20%oflearners.ItwasverysimilartotheSamplersclusterinthatthelearnersvisitedonlybriefly—however,allofthemleftcomments.Theyleftanaverageof5.3comments,whichishighcomparedtotheotherMOOCs,butlowforMOOC3.MOOC3SupplementaryCluster2The second alternative cluster wasmuch smaller, accounting for only 2% ofMOOC3. 
These learnerssubmittedthefinalassessmentontime,butdidnotengageconsistentlyacrossthecourse.Inthiscluster,44%submittedthefirstassessment,22%thesecond.Theyvisitedonaverage44%ofthecourse.Abouthalfoftheselearnersleftcomments(53%),atarateof10.1each.
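The weekly values in the engagement profiles quoted above (5, 9, and 11) can be read as sums of activity codes. The following is a minimal sketch, assuming weights of 1 (visited content), 2 (commented), 4 (submitted assessment late), and 8 (submitted assessment on time). These weights are inferred from the quoted profiles and the 0 to 11 score range; Table 2 remains the authoritative definition, and the function names are hypothetical.

```python
# Assumed activity weights, inferred from the profiles quoted in the text:
# 5 = visit + late submission, 9 = visit + on-time submission,
# 11 = visit + comment + on-time submission. Table 2 is the authority.
VISITED = 1
COMMENTED = 2
SUBMITTED_LATE = 4
SUBMITTED_ON_TIME = 8


def weekly_score(visited, commented, submitted_late, submitted_on_time):
    """Combine one week's activity flags into a single score (0 to 11)."""
    if submitted_late and submitted_on_time:
        raise ValueError("an assessment is either late or on time, not both")
    return (VISITED * bool(visited)
            + COMMENTED * bool(commented)
            + SUBMITTED_LATE * bool(submitted_late)
            + SUBMITTED_ON_TIME * bool(submitted_on_time))


def describe(score):
    """List the activities encoded in one weekly score."""
    labels = []
    if score & VISITED:
        labels.append("visited content")
    if score & COMMENTED:
        labels.append("commented")
    if score & SUBMITTED_LATE:
        labels.append("submitted assessment late")
    if score & SUBMITTED_ON_TIME:
        labels.append("submitted assessment on time")
    return labels or ["no activity"]


# Decode the typical Late Completers profile week by week.
late_completer = [5, 5, 5, 5, 5, 9, 9, 9]
for week, score in enumerate(late_completer, start=1):
    print(f"Week {week}: {', '.join(describe(score))}")
```

Because the assumed weights are distinct powers of two, each weekly score maps back to exactly one combination of activities, which is what makes the profiles readable at a glance.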



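Table 3: Proportion of learners in each Study One cluster, by MOOC

| Cluster | MOOC1 | MOOC2 | MOOC3 | MOOC4 |
|---|---|---|---|---|
| I Samplers | 39% | 39% | 37% | 56% |
| II Strong Starters | 11% | 14% | 8% | 10% |
| III Returners | 6% | 8% | - | 7% |
| IV Mid-way Dropouts | 6% | - | - | 7% |
| V Nearly There | 6% | 6% | 6% | 5% |
| VI Late Completers | 8% | 7% | 0.2% | 6% |
| VII Keen Completers | 23% | 13% | 7% | 9% |
| MOOC2 Sup | - | 13% | - | - |
| MOOC3 Sup 1 | - | - | 20% | - |
| MOOC3 Sup 2 | - | - | 2% | - |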

4 DISCUSSION OF STUDY ONE

Kizilcec and colleagues found four clusters within their data: learners who were completing, auditing, disengaging, and sampling (Kizilcec, Piech, & Schneider, 2013). They identified these clusters after structuring their data in a way that placed emphasis on learner interaction with content, assessment, and deadlines. As Section 3.1 showed, we found “Completing” and “Sampling” clusters, but the others were not found so clearly within the FutureLearn datasets.

The socio-constructivist pedagogy of FutureLearn incorporates not only content and assessment within a course’s learning design but also discussion. In addition, as credit is not currently available for FutureLearn courses, assessment is typically used formatively (to support learning) rather than summatively (to gauge how much has been learned). The results of our analysis differed from those of Kizilcec et al., which suggests that this different context may have influenced the results. Our approach included data about contribution to discussion as an input, and some of the clusters we found reflected particular patterns of participation in discussion. This suggests that a single approach is unlikely to be pedagogy-neutral. We suggest that opportunities for joint knowledge construction should be taken into account where they are available.

The “Sampling” cluster identified in 2013 (Kizilcec, Piech, & Schneider, 2013) can be related to two clusters in the current study: the Samplers and the Strong Starters. The Samplers cluster includes people who arrived, visited a few pieces of content, and then left. However, it also includes many learners who arrived late (in Week 1, in Week 2 or even later in the course) and who then did not sustain their engagement. In many cases, this late arrival may have indicated that they never intended to engage with the entire course. However, it is likely that many found and joined the course after its start, or were unable to take part in its early days. This suggests that the time-bounded nature of the course, with a cohort working through the material together week by week, may have discouraged participation by late arrivals.

The Strong Starters appear, in the FutureLearn datasets, to be distinct from the Samplers. On average, they completed the first week, they made one or two comments, and many of them looked beyond the first week’s materials. These learners did more than simply sample content, but were not able to sustain their engagement with the course.

The clusters of Strong Starters, Returners, Mid-way Dropouts, and Nearly There are chiefly distinguished by the time period over which their members engaged with the course. Strong Starters engaged for a week, Returners for a couple of weeks, Mid-way Dropouts for about half the course, and Nearly There for all but the last week or two. Not surprisingly, an increase in time spent on the course was associated with a rise in the mean proportion of the course visited by group members and a rise in the mean number of comments contributed.

The two clusters of Late Completers and Keen Completers can be considered as a pair, because they include the learners who have engaged with the majority of the material and all the assessments and who are therefore classified by FutureLearn as Fully Participating learners. The main characteristic that differentiates the two clusters is the number of comments posted. All the clusters considered previously in the analysis averaged about one post per person per week of engagement. On average, Samplers engaged for less than a week and posted fewer than one comment. Strong Starters participated for just over a week and posted just over one comment. Returners participated for a little over two weeks and posted just over two comments. Learners in the Nearly There cluster engaged for four to six weeks and posted around five comments. The Late Completers cluster also broadly followed this one-comment-per-week average. However, the Keen Completers posted twice as much as those clusters, averaging well over two comments per week.

It seems, then, that there were two dominant approaches to completing the course. The more popular approach (there were consistently more Keen Completers than Late Completers) was to engage fully. Keen Completers did almost everything, every week, and visited almost every step of the course.
The Late Completers group did not engage with every aspect of the course: they commented less, and they were more frequently late in submitting assessments. So engaging actively with the comments is associated with a more extensive engagement with the course materials and assessment than is apparent in other clusters.

4.1 Exploring the Clusters in More Detail

A subsequent study, reported at EC-TEL 2015 (Ferguson et al., 2015), explored these clusters in more detail, examining learner engagement with MOOCs run by different universities on the FutureLearn platform. That study found that the seven main clusters identified in this paper were consistently found on seven- to eight-week MOOCs that support engagement with content, assessment, and discussion.

These studies showed that the process of developing a “simple and scalable categorization to target interventions” (Kizilcec, Piech, & Schneider, 2013) is not as straightforward as it originally seemed. Studies of engagement suggest, perhaps unsurprisingly, that all MOOCs include learners who arrive, look around a little, and then leave, as well as learners who arrive, engage with everything, and stay until the end. However, a large percentage of learners (37% in MOOC1, for example) do not fit these patterns. How these learners behave depends on the pedagogy of the course, and on how it is designed. A variation in course length, as with MOOC2, introduces new clusters, as does a variation in assessment patterns, as was the case with MOOC3.

If engagement patterns shift with pedagogy and learning design, do they also shift as educators become more experienced and learners grow more familiar with learning in MOOCs? To answer this question, we returned to MOOC1 and MOOC4 (the eight-week MOOCs with assessment points each week) and studied both the first and the second presentation of each of these courses.

5 STUDY TWO: IDENTIFYING PATTERNS OF ENGAGEMENT ACROSS TIME

5.1 Study Two: Dataset

Table 4 (below) provides a brief overview of the dataset for Study Two, the two presentations of MOOC1 and MOOC4. This overview supplements, and to some extent repeats, the information provided about these two MOOCs in Table 1 above. The figures for gender and for previous experience of online courses and MOOCs are taken from responses to the start-of-course survey for each presentation. The number of participants indicates the number of individuals (both educators and learners) who appeared in our dataset because they were active after the start date of the course. People who were active before the start date were removed from the dataset entirely, because the only people with access at that point were educators or FutureLearn staff. Fully Participating learners marked a majority of the course steps complete and completed all the assessments.

Although the two presentations of each course were run less than a year apart, there is a change in participants’ reported previous experience. In both cases, experience of courses run partly or solely online rises by two percentage points (n.s. for either). At the same time, previous experience of MOOCs rises by six to eight percentage points (p<0.01 for MOOC1, p<0.001 for MOOC4). This implies that the second presentation of MOOC1 had around 780 more participants with experience of MOOCs than it did on its first presentation, while the second presentation of MOOC4 had around 1,380 more participants with experience of MOOCs than it did on its first presentation.

5.2 Study Two: Method

For Study Two, we followed the method outlined in Section 3.2 (above). Individuals’ engagement each week was scored using the values set out in Table 2 (above). Values assigned for different types of activity were totalled to assign each individual an engagement score ranging from 0 to 11 each week. The engagement scores for each of the eight weeks of the course were used to produce an engagement profile containing eight values for each learner. The k-means algorithm was used as before to extract seven clusters from these engagement profiles, treating each learner’s profile as an eight-dimensional vector.
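The clustering step described above can be illustrated with a small, self-contained sketch. It is not the authors' implementation: it is a plain k-means (Lloyd's algorithm) written with the standard library only, run on a handful of invented eight-week profiles with k = 2 so that the result is easy to inspect. The study itself extracted seven clusters from thousands of profiles, and the function names here are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(profiles, k, iterations=100, seed=0):
    """Cluster profiles into k groups; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct learners chosen at random.
    centroids = [list(p) for p in rng.sample(profiles, k)]
    labels = [None] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each profile joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Invented toy data: three sampler-like and three completer-like profiles.
toy_profiles = [
    [1, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [3, 0, 0, 0, 0, 0, 0, 0],
    [11, 11, 9, 9, 11, 11, 9, 9],
    [9, 11, 9, 9, 11, 9, 9, 9],
    [11, 9, 11, 9, 9, 11, 9, 11],
]
toy_labels, toy_centroids = k_means(toy_profiles, k=2)
print(toy_labels)
```

The number of clusters k has to be supplied by the analyst, which is the arbitrary choice the paper later flags as a limitation of this approach.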

Table 4: Overview of dataset for Study Two

| | MOOC1 first | MOOC1 second | MOOC4 first | MOOC4 second |
|---|---|---|---|---|
| Subject area | Physical Sciences | Physical Sciences | Business | Business |
| Start month | Mar 2014 | Feb 2015 | May 2014 | Mar 2015 |
| Male | 51% | 50% | 35% | 36% |
| Female | 48% | 50% | 65% | 64% |
| Taken online course before | 65% | 67% | 55% | 57% |
| Taken MOOC before | 48% | 54% | 32% | 40% |
| Participants | 5,069 | 5,953 | 9,778 | 11,264 |
| Fully Participating | 1,548 | 1,304 | 1,416 | 1,240 |
| Participation Rate | 31% | 30% | 14% | 11% |
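The implied increases in MOOC-experienced participants quoted in Section 5.1 can be checked against the Table 4 figures. This is a rough sketch that assumes, as the text does, that the survey percentage for "Taken MOOC before" can be applied to the full participant count of each presentation; the variable and function names are illustrative only.

```python
# (participants, share reporting previous MOOC experience), from Table 4.
table4 = {
    "MOOC1": {"first": (5069, 0.48), "second": (5953, 0.54)},
    "MOOC4": {"first": (9778, 0.32), "second": (11264, 0.40)},
}


def implied_increase(course):
    """Estimated extra MOOC-experienced participants on the second run."""
    (n_first, p_first) = course["first"]
    (n_second, p_second) = course["second"]
    return n_second * p_second - n_first * p_first


for name, course in table4.items():
    # Round to the nearest ten, matching the precision used in the text.
    print(name, round(implied_increase(course), -1))
```

Rounded to the nearest ten, this gives about 780 for MOOC1 and about 1,380 for MOOC4.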

5.3 Study Two: Analysis of Clusters

The seven clusters for the second presentation of MOOC1 and MOOC4 were found on manual inspection to align very closely with those identified in Study One. Quantifying this similarity in the absence of an independent output variable is far from straightforward. As a simplistic but hopefully comprehensible approach, we calculated a series of descriptive statistics to facilitate comparison.

Size: What percentage of learners was included in the cluster? Was any cluster consistently larger or smaller than the others?

Steps: Each FutureLearn MOOC is divided into steps. A mean number of steps visited for each cluster was calculated by totalling all step visits and dividing by the number of learners in the cluster. This mean number was then compared with the total number of steps in the MOOC.

Comments: What percentage of learners in the cluster added at least one comment to the MOOC discussion?

Assessment: What percentage of learners in the cluster ever completed the first assessment? How many assignments were completed by at least 90% of the cluster?

Considering these four areas enabled a detailed comparison of the attributes of each cluster.

5.3.1 Study Two: Samplers

Samplers (Table 5), the least engaged learners, make up the biggest cluster in each presentation of these two MOOCs. On average, they very consistently visited 4–5% of the steps. However, more than a fifth of them visited only one step and left after their first visit. This cluster also contains a very high percentage of latecomers, with more than a third first visiting the course after its first week. Relatively few commented and few did the first (or any) assessment.
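The four descriptive statistics defined at the start of Section 5.3 (size, steps, comments, assessment) could be computed along the following lines. This is a hedged sketch over an invented record structure: the `Learner` fields, the `cluster_summary` function, and the toy data are assumptions for illustration and do not reflect the format of the FutureLearn activity data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Learner:
    cluster: str             # cluster label assigned by the clustering step
    steps_visited: int       # number of distinct course steps visited
    comments: int            # number of comments posted
    assessments: List[bool]  # one flag per assessment: completed or not


def cluster_summary(learners, cluster, total_steps):
    """Size, steps, comments and assessment statistics for one cluster."""
    members = [lr for lr in learners if lr.cluster == cluster]
    if not members:
        raise ValueError(f"no learners in cluster {cluster!r}")
    n = len(members)
    n_assessments = len(members[0].assessments)
    # Count the assessments completed by at least 90% of the cluster.
    widely_completed = sum(
        1 for i in range(n_assessments)
        if sum(m.assessments[i] for m in members) >= 0.9 * n
    )
    return {
        "size_pct": 100 * n / len(learners),
        "mean_steps_pct": 100 * sum(m.steps_visited for m in members) / n / total_steps,
        "comment_pct": 100 * sum(1 for m in members if m.comments > 0) / n,
        "first_assessment_pct": 100 * sum(1 for m in members if m.assessments[0]) / n,
        "assessments_done_by_90pct": widely_completed,
    }


# Invented toy data: a 40-step course with eight assessments.
toy_learners = [
    Learner("Samplers", 2, 0, [False] * 8),
    Learner("Samplers", 6, 1, [True] + [False] * 7),
    Learner("Keen Completers", 40, 12, [True] * 8),
    Learner("Keen Completers", 38, 0, [True] * 8),
]
samplers = cluster_summary(toy_learners, "Samplers", total_steps=40)
keen = cluster_summary(toy_learners, "Keen Completers", total_steps=40)
print(samplers)
print(keen)
```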

Table 5: Study Two: Samplers

| | MOOC1 1st | MOOC1 2nd | MOOC4 1st | MOOC4 2nd | Cluster characteristics |
|---|---|---|---|---|---|
| Size of cluster | 39% | ***47% | 56% | 56% | Over a third |
| Mean % steps visited | 5% | 5% | 5% | *4% | 5%–10% |
| % Visit only one step | 24% | ***32% | 24% | ***27% | >20% |
| % Join after week 1 | 27% | ***34% | 39% | ***35% | >25% |
| % Leave a comment | 15% | ***11% | 13% | 12% | <20% |
| % Do first assessment | 4% | 5% | 13% | ***7% | ≤15% |
| % Do any assessment | 6% | 6% | 15% | ***10% | ≤15% |

*p<0.05, **p<0.01, ***p<0.001 between 1st and 2nd

5.3.2 Study Two: Strong Starters

The Strong Starters (Table 6) are a smaller cluster than the Samplers (Table 5), but they are much more engaged. They visit more steps, a lot of them comment, and all of them complete the first assessment. However, few of them go on to do more assessments.

Table 6: Study Two: Strong Starters

| | MOOC1 1st | MOOC1 2nd | MOOC4 1st | MOOC4 2nd | Cluster characteristics |
|---|---|---|---|---|---|
| Size of cluster | 11% | 12% | 10% | **14% | 8%–14% |
| Mean % steps visited | 17% | 17% | 12% | 11% | 10%–20% |
| % Visit only one step | 37% | ***27% | 35% | ***28% | 20%–40% |
| % Do first assessment | 100% | 100% | 100% | 100% | 100% |
| % Do >1 assessment | 5% | 6% | 7% | *5% | <10% |

*p<0.05, **p<0.01, ***p<0.001 between 1st and 2nd

5.3.3 Study Two: Returners

The Returners (Table 7) make up a small, but fairly active cluster. They engage with a lot of the content, although none of them looks at it all. Most of them do the first assessment and go on to do another assessment, and a lot of them engage by commenting.



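Table 7: Study Two: Returners

| | MOOC1 1st | MOOC1 2nd | MOOC4 1st | MOOC4 2nd | Cluster characteristics |
|---|---|---|---|---|---|
| Size of cluster | 6% | 8% | 7% | 8% | 6%–8% |
| Mean % steps visited | 33% | 36% | 24% | 26% | 20%–50% |
| % Visit all steps | 0% | 0% | 0% | 0% | 0% |
| % Leave a comment | 39% | 35% | 23% | ***32% | >20%–60% |
| % Do first assessment | 95% | 96% | 93% | 94% | >90% |
| % Do >1 assessment | 95% | 96% | 91% | 91% | >90% |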

***p<0.001 between 1st and 2nd
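The significance markers in these tables compare a proportion between the first and second presentation of a course. The paper does not state which test was used, so the following is only an illustrative sketch using a standard two-proportion z-test. The counts are assumptions reconstructed from rounded figures: the Returners cluster is taken as 7% of 9,778 participants (about 684) and 8% of 11,264 participants (about 901) on the two presentations of MOOC4, with 23% and 32% of those learners commenting.

```python
import math


def two_proportion_z_test(successes_1, n_1, successes_2, n_2):
    """Two-sided z-test for a difference between two proportions."""
    p_1 = successes_1 / n_1
    p_2 = successes_2 / n_2
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p_2 - p_1) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Assumed counts, reconstructed from rounded percentages (see lead-in).
n_first, n_second = 684, 901
commenters_first = round(0.23 * n_first)
commenters_second = round(0.32 * n_second)

z, p_value = two_proportion_z_test(
    commenters_first, n_first, commenters_second, n_second
)
print(f"z = {z:.2f}, p = {p_value:.5f}")
```

Under these assumptions the difference falls below the 0.001 threshold, in line with the marker on the "% Leave a comment" row for MOOC4. Because the inputs are rounded, the exact statistic should not be read as a reproduction of the authors' calculation.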

5.3.4 Study Two: Mid-way Dropouts

The Mid-way Dropouts (Table 8) make up another small cluster. Almost all of them do the first assessment, and most do three or more assessments. About half of them comment at least once, and they view about half of the course content.

Table 8: Study Two: Mid-way Dropouts

| | MOOC1 1st | MOOC1 2nd | MOOC4 1st | MOOC4 2nd | Cluster characteristics |
|---|---|---|---|---|---|
| Size of cluster | 6% | 7% | 7% | 6% | 6%–7% |
| Mean % steps visited | 59% | 65% | 49% | 48% | 45%–65% |
| % Leave a comment | 49% | 43% | 38% | **46% | 35%–50% |
| % Do first assessment | 98% | 97% | 96% | ***99% | >95% |
| % Do >2 assessments | 97% | 99% | 96% | *98% | >90% |

*p<0.05, **p<0.01, ***p<0.001 between 1st and 2nd

5.3.5 Study Two: Nearly There

Nearly There (Table 9) is consistently the smallest cluster, made up of people who view most of the course content and do most of the assessment. They are also more likely than not to post a comment.

Table 9: Study Two: Nearly There

| | MOOC1 1st | MOOC1 2nd | MOOC4 1st | MOOC4 2nd | Cluster characteristics |
|---|---|---|---|---|---|
| Size of cluster | 6% | 5% | 5% | 4% | 4%–6% |
| Mean % steps visited | 76% | ***89% | 76% | 74% | 70%–90% |
| % Leave a comment | 65% | **53% | 48% | *55% | 45%–65% |
| % Do first assessment | 98% | 98% | 99% | 99% | >95% |
| % Do >4 assessments | 99% | ***100% | 97% | 97% | >90% |

*p<0.05, **p<0.01, ***p<0.001 between 1st and 2nd

5.3.6 Study Two: Late Completers

In most cases, the engagement in clusters rises consistently. Each cluster described above has viewed more content than the last, included more people who comment than the last, and completed more assessments than the last. The Late Completers (Table 10) are an exception. They visit almost all the steps, they do almost all of the assessment, but the percentage that comment is relatively low.



Table 10: Study Two: Late Completers

| | MOOC1 1st | MOOC1 2nd | MOOC4 1st | MOOC4 2nd | Cluster characteristics |
|---|---|---|---|---|---|
| Size of cluster | 8% | 6% | 6% | 4% | 6%–8% |
| Mean % steps visited | 92% | 95% | 95% | 92% | 90%–97% |
| % Leave a comment | 44% | 38% | 42% | *36% | 35%–45% |
| % Do first assessment | 96% | 97% | 98% | 97% | >95% |
| % Do >6 assessments | 93% | 93% | 95% | ***89% | >90% |

*p<0.05, **p<0.01, ***p<0.001 between 1st and 2nd

5.3.7 Study Two: Keen Completers

The Keen Completers (Table 11) can be considered ideal MOOC students from an educator’s perspective. They visit all, or almost all, of the content. They do all, or almost all, of the assessment. Two-thirds of them add to the course by commenting. They represent most, but not all, of the learners classified as “Fully Participating” and therefore eligible for a FutureLearn Statement of Participation.

Table 11: Study Two: Keen Completers

| | MOOC1 1st | MOOC1 2nd | MOOC4 1st | MOOC4 2nd | Cluster characteristics |
|---|---|---|---|---|---|
| Size of cluster | 23% | ***15% | 9% | 8% | 8%–23% |
| Mean % steps visited | 95% | ***100% | 98% | 99% | 95%–100% |
| % Leave a comment | 71% | *66% | 68% | 67% | 65%–75% |
| % Do first assessment | 100% | 100% | 100% | 100% | >98% |
| % Do >7 assessments | 99% | 99% | 100% | 100% | >90% |

*p<0.05, **p<0.01, ***p<0.001 between 1st and 2nd

5.4 Study Two: Discussion

Are the clusters effectively the same? This is difficult to quantify. The clusters are clearly not identical: some of the descriptive statistics in Tables 5–11 are statistically significantly different from one presentation to the next at a high level of significance. But are those differences meaningful? As an extreme, 99% of learners in the Nearly There cluster on the first presentation of MOOC1 completed more than four assessments, compared to 100% in the second (see the bottom row of Table 9), and this difference is significant at greater than the 0.001 level. It seems hard to argue that this is a meaningful difference between clusters of learner behaviours on subsequent presentations of the same MOOC.

Standard hypothesis testing, as we have carried out here, seeks to reject the null hypothesis of no difference between the two groups. This is not the same as testing whether there is no difference: that is the domain of testing for equivalence. To test for equivalence, one has to first define the range of results that would be considered effectively the same, and then test whether the results fall inside some confidence interval based on that range. The “cluster characteristics” column is our suggestion for a range that is effectively the same, and all the data fall within this range. For instance, the precise percentage of learners in the Samplers cluster (Table 5) who leave a comment is different in a statistically significant way from one presentation of a MOOC to a subsequent one (15% on MOOC1’s first presentation, 11% on the second). However, it is fairly low in all cases (<20%), and this (and other characteristics) distinguishes this cluster from the other clusters. The clusters are practically similar between MOOCs and between one presentation of a MOOC and the subsequent one.

Unfortunately, as already discussed, the dataset available for these studies does not contain any outcome measures (such as assessment results or survey responses) that would provide an independent comparison.

The similarity in the descriptive statistics from the more than 30,000 learners who engaged with the two presentations of MOOC1 and MOOC4 indicates that these seven clusters are stable across presentations. It is not possible to test this directly, as the assessment and survey data cannot be aligned with activity data. However, there is no evidence that these clusters change meaningfully as learners gain more experience in online learning.

6 CONCLUSION

6.1 Limitations of this Approach to Cluster Analysis

This work has followed and extended the original methodology of Kizilcec et al. It has yielded some potentially useful insights for practice and for future research (see subsequent sections). However, there are significant limitations to the approaches taken by both this work and the work it builds on.

Both approaches entailed significant processing and data reduction and choices around the cluster analysis. One significant practical advantage of coding weekly behaviour into a single measure is that it enables analysts to inspect individual patterns at a glance in a way impossible if the direct log file data is used.
However, this makes data unavailable for clustering: for instance, it would be impossible to distinguish learners who tended to work in one burst a week from learners who studied in a more spread-out pattern.

The approach taken to the cluster analysis in both approaches is rather simplistic. It is not obvious that k-means is the best clustering algorithm in this context, not least because it requires what may be an arbitrary decision to be made by the analyst about how many clusters to extract.

It seems highly likely that the patterns that emerged in the final clusters reflect, at least to some degree, the choices made in this process. Kizilcec et al. coded weekly behaviour based on whether or not learners engaged with content or did the assessment, and compressed this into a single dimension before clustering: they found four clusters that reflected the four logical possibilities (Sampling: low engagement with content; Auditing: high engagement with content; Disengaging: low engagement with assessments; Completing: high engagement with assessment). The adapted method in this paper added data about commenting, and found clusters where commenting behaviour was a distinguishing feature. Rather than compressing the weekly data into a single dimension, this paper used a dimension for each week, and duly found clusters where behaviour changed over the weeks. That is not to say that these patterns do not exist in the data: they almost certainly do. However, they do not emerge from the data without influence from the choices made in the analysis.

Accordingly, we wish to be very modest about the clusters from our own analysis. They are far from a correct, complete, or robust analysis of the participation patterns of learners in a MOOC. We claim only that they provide an informative view of learner behaviour, which has some potentially valuable implications for practice. However, two patterns of behaviour do seem to emerge robustly from cluster analysis of learner behaviour on MOOCs: samplers, who visit only briefly; and completers, who fully engage with the course.

6.2 Implications for Practice

The key characteristic of learning analytics, as distinct from the general category of quantitative educational research, is a direct concern with improving learning and learning environments. “The key step is ‘closing the loop’ by feeding back […] to learners through one or more interventions” (Clow, 2012). How can this analysis achieve that?

The clusters identified here can help inform a range of strategies for intervention and improvement. As these clusters have now been found on a range of presentations of MOOCs from different universities and covering different subjects, educators can be confident that they are stable as long as the learning design (eight-week courses with weekly assessment) and pedagogy (conversational learning) remain constant.

Providing previews of course material would allow Samplers to make a more informed decision about whether to register in the first place. The four clusters in which learners engage solidly but fail to complete (the Strong Starters, Returners, Mid-way Dropouts, and particularly the Nearly There cluster) are precisely those learners that educators would want to focus on to improve completion.

This data analysis suggests that there are several points at which learners may leave a course, and these may be dealt with in different ways. Discussion steps set up for latecomers could be used to support those who fall behind at the start, while sign-up pages could draw attention to the difficulties encountered by learners who move through the course out of step with the cohort. These pages might suggest that potential learners return later and register for a subsequent presentation. Free-text survey comments on FutureLearn MOOCs that have run more than once show that some learners return to subsequent presentations in order to finish working through the material with a group, even though they already have access to the same content because they registered on the original presentation.



The course week is an artificial distinction that imposes a weekly format on the learning design of the course. In doing so, it makes the decision to move on to the next week of the course appear more significant than the decision to move on to the next step. Changes to learning design could provide bridges between course weeks, stressing links between these weeks, and pointing learners forward. This might reduce the drop-off experienced by these courses, particularly at the ends of Week 1 and Week 2.

The two most clearly successful clusters (the Late Completers and the Keen Completers) also afford actionable insights. In the context of FutureLearn, at least, they appear to provide some small degree of vindication of the extensive and structured use of discussion comments.

6.3 Implications for Future Research

These clusters could be used to support qualitative research into learner engagement. In particular, it would be useful for practice to identify what prompts Keen Completers to engage to such an extent, and what factors influence dropout by the Nearly There cluster.

However, these results suggest that this approach to clustering data from learners on MOOCs has been pursued as far as is defensible. Neither the Kizilcec et al. approach nor the adaptation of it in this paper appears to be particularly robust. Future work should explore more rigorous and sophisticated approaches to analyzing learner behaviour, including less reductive data preparation and coding, other clustering methods (e.g., hierarchical clustering approaches), similarity measures other than Euclidean distance (e.g., Levenshtein distance on textual profiles), and so on. Other data may well prove useful. For instance, it is well known that simply counting forum contributions can be misleading if one is interested in the contribution to learning.
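One of the alternative similarity measures named above, Levenshtein distance on profiles, can be sketched as follows. This is an illustration rather than a method used in the paper: it treats each engagement profile as a sequence of weekly scores and counts the insertions, deletions, and substitutions needed to turn one profile into another. The example profiles are invented.

```python
import math


def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            current.append(min(
                previous[j] + 1,         # delete x from a
                current[j - 1] + 1,      # insert y into a
                previous[j - 1] + cost,  # substitute x with y, or match
            ))
        previous = current
    return previous[-1]


def euclidean(a, b):
    """Euclidean distance, the measure implied by k-means."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two invented learners with the same pattern, shifted by one week.
on_schedule = [9, 9, 9, 0]
one_week_late = [0, 9, 9, 9]
print(levenshtein(on_schedule, one_week_late))
print(euclidean(on_schedule, one_week_late))
```

An edit-based measure depends on how many weekly values differ or shift rather than on the numeric size of each difference, so it may group learners differently from a Euclidean measure. Whether that is desirable for engagement data remains an open question.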
A more robust set of patterns would be potentially very useful for practitioners to target on-course support.

This work has explored patterns of individual learner behaviour within a MOOC, and on subsequent presentations of the same MOOC. With a more robust classification system for learner behaviour, it would be interesting to explore learner behaviour in different MOOCs, which might shed light on the stability or otherwise of such behaviours and on what associations there may be between MOOC characteristics and learner behaviours.

6.4 Final Remarks

As open offerings, for some values of “open,” MOOCs afford a wide range of engagement patterns. Two have emerged consistently from this work and previous research: the sampling behaviour employed by people who visit a course briefly, and the fully engaged behaviour employed by learners who engage with all aspects of a course. The studies reported here suggest that both these and other patterns of engagement remain stable across presentations of a course, when key elements of pedagogy and learning design remain constant.



ACKNOWLEDGEMENTS

We would like to thank the FutureLearn team for giving us access to the activity data used in this analysis.

REFERENCES

Bohannon, J. (2014). Replication effort provokes praise – and “bullying” charges. Science, 344(6186). http://dx.doi.org/10.1126/science.344.6186.788

Clow, D. (2012). The learning analytics cycle: Closing the loop effectively. Proceedings of the 2nd International Conference on Learning Analytics & Knowledge (LAK ʼ12), 134–138. http://dx.doi.org/10.1145/2330601.2330636

Clow, D. (2013). MOOCs and the funnel of participation. Proceedings of the 3rd International Conference on Learning Analytics and Knowledge (LAK ʼ13), 185–189. http://dx.doi.org/10.1145/2460296.2460332

Daniel, J. (2012). Making sense of MOOCs: Musings in a maze of myth, paradox and possibility. Journal of Interactive Media in Education, 2012(3). http://dx.doi.org/10.5334/2012-18

Downes, S. (2012). Connectivism and connective knowledge: Essays on meaning and learning networks. Retrieved from http://www.downes.ca/files/books/Connective_Knowledge-19May2012.pdf

Downes, S. (2014, 21 March). Like reading a newspaper (Web log post). Retrieved from http://halfanhour.blogspot.co.uk/2014/03/like-reading-newspaper.html

Ferguson, R., & Sharples, M. (2014). Innovative pedagogy at massive scale: Teaching and learning in MOOCs. Proceedings of the 9th European Conference on Technology Enhanced Learning (EC-TEL ʼ14), 98–111. http://dx.doi.org/10.1007/978-3-319-11200-8_8

Ferguson, R., Clow, D., Beale, R., Cooper, A. J., Morris, N., Bayne, S., & Woodgate, A. (2015). Moving through MOOCS: Pedagogy, learning design and patterns of engagement. Proceedings of the 10th European Conference on Technology Enhanced Learning (EC-TEL ʼ15), 70–84. http://dx.doi.org/10.1007/978-3-319-24258-3_6

Jordan, K. (2014a). Initial trends in enrolment and completion of massive open online courses. The International Review of Research in Open and Distance Learning, 15(1), 133–160. Retrieved from http://www.irrodl.org/index.php/irrodl/article/view/1651

Jordan, K. (2014b). MOOC completion rates: The Data. Retrieved February 2, 2016, from http://www.katyjordan.com/MOOCproject.html

Kizilcec, R., Piech, C., & Schneider, E. (2013). Deconstructing disengagement: Analyzing learner subpopulations in massive open online courses. Proceedings of the 3rd International Learning Analytics & Knowledge Conference (LAK ʼ13), 170–179. http://dx.doi.org/10.1145/2460296.2460330

Kop, R., Fournier, H., & Sui Fai, J. M. (2011). A pedagogy of abundance or a pedagogy to support human beings? Participant support on Massive Open Online Courses. The International Review of Research in Open and Distance Learning, 12(7), 74–93. Retrieved from http://www.irrodl.org/index.php/irrodl/article/view/1041/2025

Laurillard, D. (2002). Rethinking university teaching: A conversational framework for the effective use of learning technologies (2nd ed.). London: RoutledgeFalmer.

Littlejohn, A. (2013). CEMCA edtech notes: Understanding Massive Open Online Courses. Retrieved from http://cemca.org.in/ckfinder/userfiles/files/EdTechNotes2_Littlejohn_final_1June2013.pdf

Milligan, C., Littlejohn, A., & Margaryan, A. (2013). Patterns of engagement in connectivist MOOCs. Journal of Online Learning and Teaching, 9(2). Retrieved from http://jolt.merlot.org/vol9no2/milligan_0613.htm

Pask, G. (1976). Conversation theory: Applications in education and epistemology. New York: Elsevier.

Siemens, G. (2005). Connectivism: A learning theory for the digital age. International Journal of Instructional Technology and Distance Learning, 2(1). Retrieved from http://www.itdl.org/journal/jan_05/article01.htm

Siemens, G. (2012, 25 July). MOOCs are really a platform (Web log post). Retrieved from http://www.elearnspace.org/blog/2012/07/25/moocs-are-really-a-platform/

Society for Learning Analytics Research. (2011). Open learning analytics: An integrated & modularized platform (White Paper). Retrieved from http://solaresearch.org/OpenLearningAnalytics.pdf
