(2017). An application of extreme value theory to learning analytics: Predicting collaboration outcome from eye-tracking data. Journal of Learning Analytics, 4(3), 140–164. http://dx.doi.org/10.18608/jla.2017.43.8

ISSN 1929-7750 (online). The Journal of Learning Analytics works under a Creative Commons License, Attribution - NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0)

An Application of Extreme Value Theory to Learning Analytics: Predicting Collaboration Outcome from Eye-tracking Data

Kshitij Sharma
Department of Operations, Faculty of Business and Economics, University of Lausanne, Switzerland
Computer Human Interaction in Learning and Instruction, School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne, Switzerland

Valérie Chavez-Demoulin
Department of Operations, Faculty of Business and Economics, University of Lausanne, Switzerland

Pierre Dillenbourg
Computer Human Interaction in Learning and Instruction, School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne, Switzerland

ABSTRACT: The statistics used in education research are based on central trends such as the mean or standard deviation, discarding outliers. This paper adopts another viewpoint that has emerged in statistics, called extreme value theory (EVT). EVT claims that the bulk of normal distribution is comprised mainly of uninteresting variations while the most extreme values convey more information. We apply EVT to eye-tracking data collected during online collaborative problem solving with the aim of predicting the quality of collaboration. We compare our previous approach, based on central trends, with an EVT approach focused on extreme episodes of collaboration. The latter provided a better prediction of the quality of collaboration.

KEYWORDS: Eye-tracking, dual eye-tracking, extreme value theory, computer supported collaborative learning, learning analytics, collaboration quality

1 INTRODUCTION

This contribution borrows a framework from the field of statistics called extreme value theory (EVT), which has been developed for analyzing time series in domains such as finance and environmental sciences. We explore the relevance of EVT for learning analytics, namely for analyzing collaborative interactions in an educational setting. For these kinds of analyses, statistical methods traditionally focus
on the central tendencies (mean, median, and standard deviation). Generally, we discarded what we considered to be outliers, which we suspected might be due to measurement errors, cheating, or miscellaneous events foreign to the cognitive mechanisms under scrutiny. Instead, EVT invites us to focus on the interaction episodes that deviate from those central tendencies. The shift between these two approaches, from central to extremes, is accompanied by another shift: the extreme data points do not correspond to an individual subject or a pair but to some specific time episodes within a long series of time events produced by each individual or pair. The goal of this paper is to determine if EVT could provide us with better discrimination among different levels of collaboration quality compared to traditional methods. We therefore apply both methods to the time series produced by eye trackers and compare the results. Since we study collaboration, we synchronized the eye-tracking data produced by each peer (what we call “dual eye-tracking”). EVT has been traditionally used to quantify rare events like century floods, avalanches, market crashes, or more recently terrorist attacks. Outside of the risk management context, it has not been much developed because of the lack of rare data. In this paper, we propose the use and development of extreme value learning tools to explore “rare data” from educational “big data” experiments such as eye-tracking experiments.

The paper is organized as follows: Section 2 describes the nature of dual eye-tracking data (DUET), followed in Section 3 by an introduction to EVT. Section 4 introduces the concept that bridges DUET and EVT in two ways. In the univariate way, each pair of time episodes from learners A and B is substituted by a measure of their differences, which produces a time series of single values. In the bivariate mode, we take into consideration the dynamic coupling of the two time series. The rest of the paper compares the results produced by EVT to those resulting from traditional approaches.

2 EYE-TRACKING

Eye-tracking provides researchers with unprecedented access to information about users’ attention. The eye-tracking data is rich in terms of temporal resolution. With advances in eye-tracking technology, the eye-tracking apparatus has become compact and easy to use without sacrificing much of its ecological validity during controlled experiments. Previous research has shown that eye-tracking can be useful for unveiling the cognitive processes that underlie verbal interaction and problem-solving strategies. We introduce here some key concepts necessary to understand the study presented later.

2.1 Fixations and Saccades

In a nutshell, gaze does not glide over visual material in a smooth continuous way but rather jumps around the stimulus: small stops around 200 milliseconds, called “fixations,” are followed by long jumps, called “saccades.” It is hypothesized that information is collected only during fixations. However, the data analysis is more complex. What if the eyes stop after 180 or 170 milliseconds? Can this still be considered as a fixation? Eye-tracking methods require different thresholds to be defined in order to process data. Are these thresholds the same for all subjects and for all tasks? If we consider a single subject on a single task, is the threshold stable over time? Is it the same in the middle of the screen or on the periphery? Eye-tracking relies on the craft of “thresholding.” Nüssli (2011) developed optimization algorithms that systematically explore threshold parameters in order to maximize the quality of produced data.

Several studies have shown that the level of expertise of an individual (Ripoll, Kerlirzin, Stein, & Reine, 1995; Abernethy & Russell, 1987; Charness, Reingold, Pomplun, & Stampe, 2001; Reingold, Charness, Pomplun, & Stampe, 2001) could be determined from eye-tracking data since the way one looks at an X-ray (Grant & Spivey, 2003; Thomas & Lleras, 2007) or a piece of programming code (Sharma, Jermann, Nüssli, & Dillenbourg, 2012) reveals the way one understands these things. We will not develop these findings in this paper as we focus on collaborative situations. For instance, within a collaborative Tetris game, Jermann, Nüssli, and Li (2010) predicted the level of expertise in a pair (expert–expert, novice–novice, or expert–novice pair) with an accuracy of 75%. The core relationship between gaze and collaboration results from the gaze-dialogue coupling.

2.2 Gaze-dialogue Coupling

Two eye-trackers can be synchronized for studying the gaze of two persons interacting to solve a problem and for understanding how gaze and speech are coupled. Meyer, Sleiderink, and Levelt (1998) showed that the duration between looking at an object and naming it is between 430 and 510 milliseconds (eye–voice span). Griffin and Bock (2000) found an eye–voice span of about 900 milliseconds. Zelinsky and Murphy (2000) discovered a correlation between the time spent gazing at an object and the time taken to say the name of the object aloud. Richardson, Dale, and Kirkham (2007) proposed the eye–eye span as the difference between the time when the speaker starts looking at the referred object and the time when listeners look at it. This time lag was termed the “cross-recurrence” between the participants. The average cross-recurrence was found to be between 1,200 and 1,400 milliseconds. Jermann and Nüssli (2012) applied cross-recurrence to a pair programming task, enabling the remote collaborators to see their actions on the screen. The authors found that the cross-recurrence levels were higher when selection was mutually visible on the screen, which relates cross-recurrence to team coordination.
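To make the notion of a gaze lag between two peers concrete, the following minimal sketch computes, for each candidate lag, the share of time steps at which the second peer looks at the object the first peer looked at earlier. It is a simplified, discrete illustration and not the procedure used in the cited studies; the function name `cross_recurrence`, the object labels, and the fixed sampling rate are assumptions made for the example.

```python
def cross_recurrence(gaze_a, gaze_b, lag):
    """Share of time steps where B looks, `lag` steps later, at what A looked at."""
    if lag < 0:
        # A negative lag means B leads A, so the roles are swapped.
        return cross_recurrence(gaze_b, gaze_a, -lag)
    pairs = list(zip(gaze_a, gaze_b[lag:]))
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


# Hypothetical object labels sampled at a fixed rate; the listener follows
# the speaker with a delay of two samples.
speaker = ["map", "map", "node1", "node1", "node2", "node2", "link", "link"]
listener = ["menu", "menu", "map", "map", "node1", "node1", "node2", "node2"]

# Recurrence profile over lags 0..3 and the lag with the highest recurrence.
profile = {lag: cross_recurrence(speaker, listener, lag) for lag in range(4)}
best_lag = max(profile, key=profile.get)
print(profile, best_lag)
```

With real data, the lag would be converted to milliseconds using the sampling rate of the eye-trackers.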

2.3 Quality of Interaction and Cross-recurrence

Several authors have found a relationship between the cross-recurrence of gazes and the quality of collaboration. Cherubini and Dillenbourg (2007) found a correlation between gaze-recurrence and the performance of teams in a map annotation task. In a peer programming task, Jermann and Nüssli (2012) found higher gaze recurrence for pairs that collaborate well, as estimated by the Meier, Spada, and Rummel (2007) qualitative coding scheme. In a concept-map task, Sharma, Caballero, Verma, Jermann, and Dillenbourg (2015) and Sharma, Jermann, Nüssli, and Dillenbourg (2013) related cross-recurrence to higher learning gains. In a collaborative learning task using tangible objects, Schneider and Blikstein (2015) found that cross-recurrence is correlated with the learning gains. In a nutshell, gaze is coupled with cognition, and since gaze is coupled with dialogue, DUET methods constitute a powerful tool with which to quantitatively investigate the quality of collaboration. The observed correlations do not imply causality, but some studies show that displaying the gaze of one peer to the other, as a deictic gesture, increases team performance (Duchowski et al., 2004; Sharma, D’Angelo, Gergle, & Dillenbourg, 2016; Stein & Brennan, 2004; Van Gog, Jarodzka, Scheiter, Gerjets, & Paas, 2009; Van Gog, Kester, Nievelstein, Giesbers, & Paas, 2009; Van Gog & Scheiter, 2010). More importantly, these reported studies have mostly been conducted using ANOVAs, correlation tests, F- and t-tests and regressions, which assume that the data follow a normal distribution. We will show that the distribution tail of eye-tracking data (low frequency events) is quite different from the tail of a normal distribution. Specifically, EVT hypothesizes that the events that occur in the tail of a distribution are more distinguishable than average behaviour. The next section, therefore, introduces the basics of EVT.

3 AN INTRODUCTION TO EXTREME VALUE THEORY

Extreme events are defined as those having low frequency and high severity (or impact). EVT is a branch of statistics that deals with modelling the occurrence and magnitude of such events. For instance, flood-walls are not built for average events but rather for rare and catastrophic occurrences. EVT for financial or insurance risk management looks at extreme events and concentrates on the risk of situations that might never have happened before (McNeil, Frey, & Embrechts, 2015). Such events (market crashes, insurance losses, etc.) are rare but very severe for companies, hence the need to model the deviations from the central tendencies in a different manner. Actually, the distribution of financial time series is known to be heavy-tailed. Therefore, EVT methods aim to model the tail with concepts described hereafter. For a comprehensive introduction, see Coles (2001), or see Chavez-Demoulin and Davison (2012) for a review of EVT for analyzing time series.

EVT is based on asymptotic results. Therefore, the data used to model events is a very small subset of the whole dataset (usually above the 90th or 95th percentile). The main advantages of using EVT¹ are as follows: First, it is based on the mathematical foundations that for any common distribution F, we can characterize the tail of F and can therefore understand the generating process of extreme events from any underlying distribution F. F can be any standard continuous distribution (normal, Student's t, uniform, exponential, gamma, etc.); hence, EVT imposes no strong assumption upon the data generating processes, unlike ANOVAs. Second, when analyzing the dependence structure between two sequences of extreme events, the bivariate EVT context does not impose a linear shape of dependence as correlation requires (Sharma, Chavez-Demoulin, & Dillenbourg, 2016). Third, even if the theory is established for independent and identically distributed variables, it can be straightforwardly extended to the stationary context (the context we meet in eye-tracking and collaborative learning) or to the non-stationary context. Why is dual eye-tracking a stationary context? The gaze time series are invariant to temporal shifts, i.e., if we shift the time by a constant, the variability in the gaze patterns remains the same. Moreover, the gaze data at time t are not completely independent of where the person was looking at time t − 1, i.e., there exists an auto-correlation in the gaze data. We further describe the advantages of EVT over general methods used in behavioural research:

• Advantage of EVT over parametric models that assume normality of the data: As previously mentioned, EVT does not assume any underlying distribution that generates the data. That is, EVT can be applied to data from any standard continuous distribution (normal, Student's t, uniform, exponential, gamma, etc.).

• Advantage of EVT over parametric models applied on the normalized data: EVT offers a complementary viewpoint to look at the data, more particularly to look at the tail of the data distribution. This is justified because, often in the learning analytics context, the tail of the distribution is more informative than the body of the distribution. This is illustrated by the real data of Figure 5. In that context, even if the normality of the transformed data holds, the parametric models applied on the data would not bring much information because there is no dependence structure to explore among the average values (the points seem to be randomly spread in the middle quadrant of the plot containing the average values). More generally, when a group of students is interacting to accomplish a task, the upper tail of the joint distribution of temporal concentration (or lower tail of the joint distribution of their spatial entropy, like in Figure 5) actually represents the episodes during which the subjects are together focused in a high level of collaborative quality. The average joint values are less informative, probably containing other effects than collaboration. In such cases, the competitive performance of EVT approaches over parametric models, applied on the normalized data, emerges from the fact that EVT provides the correct tools to look at the extreme sequences of the data.

• Advantage of EVT over non-parametric models: Both rely only on the assumption that the data are continuous. Many of the non-parametric methods used in learning analytics are hypothesis tests and provide one value (the p-value), which summarizes the data. Non-parametric forms can handle only low dimensional problems, which goes against the flow of big data. In general, in the (non-stationary) time series context, there is much more to gain from dynamic parametric models than from hypothesis testing. Because EVT is available for any common continuous distribution, it offers the advantages of parametric models, like relying on likelihood, allowing formal inference and likelihood ratio-based hypothesis tests, and it also takes into account the non-stationary nature of time series and covariate dependence. Note that non-parametric methods in the EVT context are also possible.

¹ Source: http://www.bioss.ac.uk/people/adam/teaching/OREVT/2007/node12.html

3.1 Univariate Case

Classical EVT considers two different approaches. The first approach provides the asymptotic behaviour of the maximum:

$$M_n = \max(X_1, X_2, \ldots, X_n), \tag{1}$$

where $X_1, X_2, \ldots, X_n$ is an independent and identically distributed random sequence with distribution $F$. Suppose that we can find sequences of real numbers $\{a_n > 0\}$ and $\{b_n\}$ such that the sequence of normalized (or stabilized) maxima $M_n^* = (M_n - b_n)/a_n$ converges in distribution.

A remarkable result states that the only possible non-degenerate limiting distribution for the normalized maximum is the generalized extreme value (GEV) distribution:

$$G(x) = \exp\left\{-\left[1 + \xi\left(\frac{x - \mu}{\sigma}\right)\right]^{-1/\xi}\right\}, \tag{2}$$

defined on $\{x : 1 + \xi(x - \mu)/\sigma > 0\}$, where $-\infty < \mu < \infty$ is the location parameter, $\sigma > 0$ is the scale parameter, and $-\infty < \xi < \infty$ is the shape parameter. This result is analogous to the well-known central limit theorem (which provides a limiting distribution for the mean of any underlying distribution) but for the maximum. Concretely, in modelling extremes of a series of observed data $x_1, x_2, \ldots, x_q$, we divide the data into $m$ blocks of $n$. This gives us an observed series of block maxima $m_{n,1}, m_{n,2}, \ldots, m_{n,m}$ on which we fit a GEV, by maximum likelihood estimation, and get estimated location ($\hat{\mu}$), scale ($\hat{\sigma}$), and shape ($\hat{\xi}$) parameters. The top panels in Figure 1 show an example of the selection of extreme events using the blockwise-maxima method for GEV model fitting.

The second classical EVT approach (mathematically related to the first one) characterizes the tail of any continuous common distribution $F$ and is referred to as the peaks-over-threshold (POT) approach. More precisely, it considers a model for the exceedances above some high threshold $u$ that defines the tail of the distribution $F$. Under the POT approach it can be shown that:

• the number of exceedances above the threshold $u$ arises according to a Poisson process with parameter $\lambda$, and independently,

• the exceedance size $W = X - u$ follows a generalized Pareto distribution (GPD):

$$H(w) = 1 - \left(1 + \frac{\xi w}{\tilde{\sigma}}\right)^{-1/\xi}, \tag{3}$$

defined on $\{w : w > 0 \text{ and } 1 + \xi w/\tilde{\sigma} > 0\}$, where:

$$\tilde{\sigma} = \sigma + \xi(u - \mu). \tag{4}$$

Essentially, the parameters of the GPD (threshold excesses) can be determined from those of the GEV (block maxima). The parameter $\xi$, which controls the shape of the tail of the distribution $F$, is the same for both GPD and GEV. In applications, the POT approach is more flexible than the block maxima approach and often allows more data (more than just one per block) and therefore leads to less uncertainty. As we can see in the top-left panel of Figure 1 (below), the number of points considered for modelling is the same as the number of blocks. On the other hand, the number of points in the bottom-left panel of Figure 1 is larger because the POT method is used. Once we have determined the appropriate threshold, the parameter $\lambda$ of the Poisson process and the GPD parameters $\tilde{\sigma}$ and $\xi$ can be estimated by maximizing the likelihood function. The bottom panels in Figure 1 show an example of the selection of extreme events using the POT method for the GPD model fitting.
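The two ways of selecting extreme events can be sketched in a few lines. The example below is only an illustration on simulated data: the helper names (`block_maxima`, `empirical_quantile`, `exceedances`), the block size of 100, and the 95th-percentile threshold are assumptions chosen for the example, and the maximum likelihood fitting of the GEV or GPD itself is not shown.

```python
import random


def block_maxima(series, block_size):
    """Maximum of each complete block of `block_size` consecutive values."""
    n_blocks = len(series) // block_size
    return [max(series[i * block_size:(i + 1) * block_size])
            for i in range(n_blocks)]


def empirical_quantile(series, q):
    """Simple order-statistic quantile (no interpolation)."""
    ordered = sorted(series)
    index = min(int(q * len(ordered)), len(ordered) - 1)
    return ordered[index]


def exceedances(series, threshold):
    """Excess sizes w = x - u for every observation above the threshold u."""
    return [x - threshold for x in series if x > threshold]


rng = random.Random(0)
series = [rng.expovariate(1.0) for _ in range(1000)]  # simulated data

maxima = block_maxima(series, block_size=100)   # one value per block (GEV input)
u = empirical_quantile(series, 0.95)            # high threshold for POT
excesses = exceedances(series, u)               # roughly 5% of the data (GPD input)
exceedance_rate = len(excesses) / len(series)   # empirical estimate of Pr(X > u)
print(len(maxima), len(excesses), exceedance_rate)
```

On the same series, the POT selection keeps several times more points than the block maxima, which is the practical reason given above for its lower uncertainty.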


Figure 1: Top left: a random variable simulation and the blockwise-maxima. Top right: the density plot for one of the blocks; the red points show the maximum value of each block. Bottom left: the same random variable as in the top-left panel, the red horizontal line shows the threshold for the POT method, the red points are the points-over-threshold. Bottom right: the density plot for the whole distribution; the red vertical line shows the threshold for the POT method and denotes the beginning of the tail for the distribution; the red coloured area shows the tail, which corresponds to the red points in the bottom-left panel.

The main practical use of such fitted models (GEV block maxima or POT) is the adequate calculation of the extreme quantile of F, that is, the quantile at a very high level. Using either the GEV or POT, we calculate a value that has a very low probability of being exceeded in a given time period. This value is called the “return value” (or return level), a name inspired by environmental data, in which the corresponding return-period question is: in how many months or years can a value of the time series be expected to exceed this value again? The return value is set at a very high quantile, usually 95%, which means that there is only a 5% chance that a value will exceed the computed return value. In Section 4, we present the calculation of the return level, and in Section 6 we see that the return level is actually effective for determining collaborative quality.

3.2 Bivariate Case

Another way of modelling collaboration with EVT is to use the gaze patterns from the two participants in a pair and analyze them as a bivariate time series. Given a bivariate random sample $(X_1, Y_1), \ldots, (X_n, Y_n)$, EVT addresses the limiting behaviour of the component-wise maxima $(M_{1,n}, M_{2,n})$, that is, the respective maximum of the sequences $\{X_i\}$ and $\{Y_i\}$, $i = 1, \ldots, n$, as in (1).

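The asymptotic theory of bivariate extremes deals with finding a non-degenerate bivariate distribution function $G$ (that is, one whose margins are not concentrated on a single point) such that, as $n \to \infty$,

$$\Pr\left\{\frac{M_{1,n} - b_{1,n}}{a_{1,n}} \le z_1, \; \frac{M_{2,n} - b_{2,n}}{a_{2,n}} \le z_2\right\} \to G(z_1, z_2), \tag{5}$$

with sequences $a_{l,n} > 0$ and $b_{l,n} \in \mathbb{R}$, $l = 1, 2$. If the limit (5) exists² and $G$ is a non-degenerate distribution function, then $G$ has the form:

$$G(z_1, z_2) = \exp\left\{-\left(\frac{1}{z_1} + \frac{1}{z_2}\right) A\left(\frac{z_1}{z_1 + z_2}\right)\right\}. \tag{6}$$

The function $A(\omega)$, defined for $0 \le \omega \le 1$, is the so-called Pickands dependence function. Since the independence case corresponds to $G(z_1, z_2) = \exp\{-(1/z_1 + 1/z_2)\}$, that is, to $A(\omega) = 1$, the Pickands function measures the departure from independence. Complete dependence between the two series is reflected by $A(1/2) = 0.5$, while at complete independence, $A(1/2) = 1$.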
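The role of the Pickands function can be checked numerically on the two boundary cases. The sketch below assumes unit Fréchet margins and the usual Pickands representation $G(z_1, z_2) = \exp\{-(1/z_1 + 1/z_2)A(z_1/(z_1 + z_2))\}$; the function names are hypothetical, and $A(\omega) = \max(\omega, 1 - \omega)$ is the standard form of the dependence function under complete dependence.

```python
import math


def pickands_independent(w):
    """A(w) = 1: the two series are independent."""
    return 1.0


def pickands_complete_dependence(w):
    """A(w) = max(w, 1 - w): the two series are completely dependent."""
    return max(w, 1.0 - w)


def bivariate_ev_cdf(z1, z2, pickands):
    """G(z1, z2) = exp{-(1/z1 + 1/z2) * A(z1 / (z1 + z2))}, unit Frechet margins."""
    w = z1 / (z1 + z2)
    return math.exp(-(1.0 / z1 + 1.0 / z2) * pickands(w))


z1, z2 = 2.0, 3.0
g_independent = bivariate_ev_cdf(z1, z2, pickands_independent)
g_dependent = bivariate_ev_cdf(z1, z2, pickands_complete_dependence)

# Independence factorizes into the two margins; complete dependence reduces
# to the margin evaluated at the smaller of the two arguments.
print(g_independent, math.exp(-1 / z1) * math.exp(-1 / z2))
print(g_dependent, math.exp(-1 / min(z1, z2)))
```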

While analyzing the eye-tracking time series of two peers, the main practical use of the bivariate EVT is to measure extreme dependence, which is the probability of finding an extreme event in one time series, given that we observe an extreme event in the second time series. The two extreme events must occur at the same time, as the two dimensions in this bivariate space are the two gaze time series for the two peers. This probability is quantified as the tail-dependence between the two time series. The classical measure typically used to quantify the dependence between two series is the correlation coefficient. The correlation coefficient is computed at the central tendencies, while the tail-dependence is, as in the case of return values, computed at a very high quantile. In Section 4, we use three different extremal dependence measures as complementary and interpretable ways for determining collaborative quality.

4 CONCEPTS

To apply EVT to our research question, predicting collaboration quality from DUET traces, we need to define a few variables.

² To simplify the representation and without loss of generality, we transform the data $(X_i, Y_i)$ to $(Z_{1i}, Z_{2i})$, $i = 1, \ldots, n$, with standard Fréchet margins so that $\Pr(Z_{li} \le z) = \exp\{-1/z\}$ for all $z > 0$ and $l = 1, 2$.


4.1 Gaze Visual Agitation (VA)

VA is defined as the coefficient of variance (CoV) of the fixation duration. Visual agitation for a given time window t is computed as follows:

(7)

In accordance with Richardson, Dale, and Tomlinson (2009), we chose a time window size of two seconds. The main reason for analyzing the variance of the fixation duration and not the fixation duration itself is the fact that the fixation duration is task-dependent. For instance, in a visual search task, the fixation durations will inherently be small, as the eyes would be constantly moving to search for the target object, whereas in a task that requires deeper information processing, the fixation durations are higher. The task used in our experiment, a concept-map drawing task, lies in between: short fixation durations when peers search for a concept on the map versus longer fixations when they discuss the link between the two concepts. In order to keep various task episodes comparable, we use the scaled variance of the fixation duration. A low value of VA would mean relaxed gaze patterns while a high value could result from stress or fatigue.
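As an illustration of a windowed agitation measure, the sketch below computes the coefficient of variation (sample standard deviation divided by the mean) of the fixation durations in each two-second window. This reading of the CoV is an assumption, since the exact form of equation (7) is not reproduced here; the function name `visual_agitation`, the input layout, and the choice to skip windows with fewer than two fixations are likewise hypothetical.

```python
import statistics


def visual_agitation(fixations, window=2.0):
    """Coefficient of variation of fixation durations per time window.

    `fixations` is a list of (start_time_s, duration_s) tuples. Returns a
    dict {window_index: sd / mean}; windows with fewer than two fixations
    are skipped because the sample standard deviation is undefined.
    """
    by_window = {}
    for start, duration in fixations:
        by_window.setdefault(int(start // window), []).append(duration)
    return {
        index: statistics.stdev(durations) / statistics.mean(durations)
        for index, durations in sorted(by_window.items())
        if len(durations) >= 2
    }


# Hypothetical fixations: a steady first window, an irregular second one.
fixations = [(0.1, 0.20), (0.5, 0.20), (1.2, 0.20),
             (2.0, 0.10), (2.3, 0.50), (3.1, 0.15)]
va = visual_agitation(fixations)
print(va)
```

Dividing by the mean is what makes episodes with short and long typical fixation durations comparable.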

4.2 Gaze Spatial Entropy (SE)

SE measures the spatial distribution of the gaze of each peer. To compute SE, we first define a 100-pixel-by-100-pixel grid over the screen and we compute for each peer the proportion of gaze fixations located in each grid cell (Figure 2). This results in a matrix of proportions, and the SE is computed as the Shannon entropy of this matrix. The spatial entropy is also task-independent, as it can be computed for any task, but the interpretation of the entropy values might be dependent on the visual stimuli. A low value of SE would mean that the subject is concentrating on a few elements on the screen, while a high SE value would depict a wider focus size.
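The computation of SE can be sketched as follows. The example assumes that the grid is made of cells of 100 by 100 pixels and that the entropy is taken in bits (base-2 logarithm); both are assumptions of this illustration, as is the function name `spatial_entropy`.

```python
import math
from collections import Counter


def spatial_entropy(fixations, cell=100):
    """Shannon entropy (in bits) of the share of fixations per grid cell.

    `fixations` is a list of (x, y) screen coordinates in pixels.
    """
    counts = Counter((int(x // cell), int(y // cell)) for x, y in fixations)
    total = sum(counts.values())
    # Sum of p * log2(1 / p) over the occupied cells.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical fixations: one peer stays in a single cell, the other
# spreads the gaze evenly over four cells.
focused = [(110, 120), (130, 150), (160, 140), (170, 190)]
spread = [(50, 50), (150, 50), (50, 150), (150, 150)]

se_focused = spatial_entropy(focused)
se_spread = spatial_entropy(spread)
print(se_focused, se_spread)
```

A gaze concentrated in one cell gives an entropy of zero, while a gaze spread evenly over k cells gives log2(k).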

Figure 2: The process of computing entropy. The image on the left shows the exemplar concept-map and gaze patterns (grey circles and arrows). The image on the right shows the placement of the grid.


4.3 Return Levels: Univariate Extremes

The return level is the quantile at a high level (above 90% for example) of the data distribution. Why do we not simply calculate this quantile from the distribution of our entire dataset? We could do this, but small discrepancies in the estimation of the body of the distribution would lead to large errors in the estimation of the quantiles in the tail. The POT model presented in Section 3 is the mathematically correct way to estimate such high quantiles and in practice leads to more accurate estimation. The EVT estimation also brings information about how heavy the tail of the distribution F is; that is, how large are the extremes that distribution F can generate? This information is provided by the value of the shape parameter ξ in (2) or (3): as ξ becomes larger, the tail of F becomes heavier. We do not explore this feature further in this paper because, as with any other modelling approach, just from the set of estimated parameters of location μ, scale σ or σ̃, and shape ξ, it is cumbersome to explain and compare the different models. Hence, we use the return level, calculated using the model parameters, which has a valuable interpretation.

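As mentioned in Section 3, the return value (say, calculated at the 95% quantile) symbolizes the measure of the (unseen) extreme event with a 5% probability that the actual (unseen) event exceeds this value. In what follows, we derive the return level calculation from the POT model above a threshold $u$. We recall that the underlying variable is denoted $X$, that the exceedances occur according to a Poisson process with parameter $\lambda$, and that the exceedance size $W = X - u$ follows a GPD denoted as $H$ in (3) with parameters $(\tilde{\sigma}, \xi)$. For $x > u$, we have:

$$\Pr(X > x \mid X > u) = 1 - H(x - u) = \left[1 + \xi\left(\frac{x - u}{\tilde{\sigma}}\right)\right]^{-1/\xi}.$$

It follows that

$$\Pr(X > x) = \Pr(X > u)\left[1 + \xi\left(\frac{x - u}{\tilde{\sigma}}\right)\right]^{-1/\xi}. \tag{8}$$

Hence, the return level $x_p$ or extreme quantile at the percentile $p$ (large) is the solution of

$$\Pr(X > u)\left[1 + \xi\left(\frac{x_p - u}{\tilde{\sigma}}\right)\right]^{-1/\xi} = 1 - p, \tag{9}$$

so that,

$$x_p = u + \frac{\tilde{\sigma}}{\xi}\left[\left(\frac{1 - p}{\Pr(X > u)}\right)^{-\xi} - 1\right]. \tag{10}$$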


In a non-mathematical way, the return level $x_p$ is the value at which the probability of exceeding this value is equal to $1 - p$. We obtain the estimated return level (10) by fitting the POT model to the exceedance data, estimating the probability of exceeding the threshold, $\Pr(X > u)$, using the Poisson model and replacing the parameters $\tilde{\sigma}$ and $\xi$ with their maximum likelihood estimates.
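The return level calculation can be sketched numerically. The code below assumes the standard POT tail formula, $\Pr(X > x) = \Pr(X > u)[1 + \xi(x - u)/\tilde{\sigma}]^{-1/\xi}$ for $x > u$ and $\xi \neq 0$, and inverts it for a given level $p$. The function names and the parameter values are hypothetical and serve only as an illustration; in practice they would be replaced by the estimates obtained from the fitted model.

```python
def tail_probability(x, u, sigma, xi, p_exceed):
    """Pr(X > x) for x > u under the POT model (xi != 0)."""
    return p_exceed * (1.0 + xi * (x - u) / sigma) ** (-1.0 / xi)


def return_level(p, u, sigma, xi, p_exceed):
    """Quantile x_p with Pr(X > x_p) = 1 - p, for xi != 0 and 1 - p <= p_exceed."""
    return u + (sigma / xi) * (((1.0 - p) / p_exceed) ** (-xi) - 1.0)


# Hypothetical fitted values, for illustration only.
u, sigma, xi, p_exceed = 2.0, 0.5, 0.2, 0.10

x_99 = return_level(0.99, u, sigma, xi, p_exceed)
# Plugging the return level back into the tail formula recovers 1 - p.
check = tail_probability(x_99, u, sigma, xi, p_exceed)
print(x_99, check)
```

When $1 - p$ equals the exceedance probability, the return level coincides with the threshold $u$, which gives a quick sanity check of an implementation.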

Is EVT overkill, or is it really necessary to analyze the two variables that we have defined, visual agitation and spatial entropy? Figure 3 uses Q–Q plots for comparing the distribution of these two variables with a normal distribution. Both plots show a heavy tail for low frequency values of spatial entropy (left plot) and visual agitation (right plot), respectively. This justifies the use of sophisticated EVT methods to process these tails. We will therefore compare the return levels calculated for the two participants. Similar return levels would depict a higher amount of temporal concordance. In Section 6, we will see that comparing return levels indeed provides an accurate (and interpretable) way of discriminating high and low collaboration quality.
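The idea behind a normal Q–Q comparison can be sketched as follows: each ordered observation is paired with the quantile that a normal distribution with the same mean and standard deviation would predict. The data in this example are not the eye-tracking data; they are deterministic quantiles of a Pareto distribution with tail index 2, chosen as a hypothetical heavy-tailed sample, and the function name `normal_qq_points` is an assumption.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(sample):
    """Pairs (theoretical normal quantile, observed value) for a Q-Q plot."""
    ordered = sorted(sample)
    n = len(ordered)
    fitted = NormalDist(mean(ordered), stdev(ordered))
    return [(fitted.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(ordered)]


# Hypothetical heavy-tailed sample: Pareto quantiles Q(p) = (1 - p) ** -0.5.
n = 2000
heavy = [(1.0 - (i + 0.5) / n) ** -0.5 for i in range(n)]

points = normal_qq_points(heavy)
theoretical_max, observed_max = points[-1]
# For a heavy upper tail, the largest observation lies far above the value
# predicted by the fitted normal distribution.
print(theoretical_max, observed_max)
```

On a plot, such points bend away from the diagonal in the tail, which is the pattern described for Figure 3.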

Figure 3: Q–Q plots of Spatial Entropy (left) and Visual Agitation (right) defined in Section 4.

4.4 Three Measures of Extremal Dependence: Bivariate Extremes

Estimating the dependence between the extremal behaviour of the two partners in a pair provides some complementary information about the peers’ concordance. We first introduce the extremal coefficient

$$\theta = 2A(1/2), \tag{11}$$

where $A$ is the Pickands function mentioned in Section 3. Thus, $\theta \in [1, 2]$, and it can be conveniently interpreted as the effective number of independent series; the case $\theta = 2$ means that the two series are independent and we therefore get complete independence. The case $\theta = 1$ means that the effective number of independent series is 1, and therefore we get complete dependence.
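The interpretation of θ as an effective number of independent series can be illustrated by simulation. The sketch below relies on the fact that, for a max-stable pair with unit Fréchet margins, $\Pr(\max(Z_1, Z_2) \le z) = \exp(-\theta/z)$, so that $1/\max(Z_1, Z_2)$ is exponentially distributed with rate θ. The resulting estimator is a simple device for this example, valid only under these assumptions, and is not presented as the estimator used in the study; all names are hypothetical.

```python
import math
import random


def unit_frechet(rng):
    """Draw from the unit Frechet distribution, Pr(Z <= z) = exp(-1/z)."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return -1.0 / math.log(u)


def extremal_coefficient(z1, z2):
    """Estimate theta from Pr(max(Z1, Z2) <= z) = exp(-theta / z)."""
    inverse_maxima = [1.0 / max(a, b) for a, b in zip(z1, z2)]
    return len(inverse_maxima) / sum(inverse_maxima)


rng = random.Random(42)
n = 5000
a = [unit_frechet(rng) for _ in range(n)]
b = [unit_frechet(rng) for _ in range(n)]  # independent of a

theta_independent = extremal_coefficient(a, b)  # close to 2
theta_dependent = extremal_coefficient(a, a)    # close to 1
print(theta_independent, theta_dependent)
```

Dividing either estimate by two gives the corresponding value of A(1/2).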

    The two other extremal dependence measures we consider come from conventional multivariate extreme value theory, characterizing two classes of extreme value dependence: asymptotic dependence and asymptotic independence, which characterize the behaviour of variables as they become more extreme. In this context, we consider the coefficient of extremal dependence

    (12)

    The limit value χ ∈ [0, 1] is strictly positive when a large value of Z2 leads to a non-zero probability of an equally large value of Z1. In other words, χ is the tendency for one variable to be large given that the other is large. This means that the only possibility for asymptotic independence is when χ = 0. When χ > 0, the variables are asymptotically dependent. In that context, we define, as a second extremal coefficient, the conditional probability

    (13)

    From this we see that χ̅ = 1 means perfect dependence between the two series while χ̅ = 0 implies independence. The coefficient χ̅ is therefore a measure of dependence for the class of asymptotically independent models. In our context, χ tells us the level of asymptotic dependence, and χ̅ tells us about the strength of the asymptotic dependence. In practice, as (12) and (13) are limits, we set a value of z typically at a very high quantile for (12) and a very low one for (13), referred to as the z × 100 percentile for (12) and taking the (1 − z) × 100 percentile for (13), as shown in the results in Section 6.
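
    The following Python sketch shows how such finite-quantile estimates can be computed. It is an illustration only: it assumes the threshold-based estimators χ(u) = 2 − log Pr(U < u, V < u)/log Pr(U < u) and χ̅(u) = 2 log Pr(U > u)/log Pr(U > u, V > u) − 1 that are commonly used in the extreme value literature (e.g., Coles, 2001), where U and V are the two series transformed to uniform margins by ranks. These may differ in form from (12) and (13), and the function names are hypothetical.

```python
import math


def to_uniform(series):
    """Rank-transform a series to approximately uniform margins on (0, 1)."""
    n = len(series)
    order = sorted(range(n), key=lambda i: series[i])
    uniform = [0.0] * n
    for rank, index in enumerate(order, start=1):
        uniform[index] = rank / (n + 1)
    return uniform


def chi_and_chibar(series_a, series_b, u):
    """Empirical chi(u) and chibar(u) at quantile level u (0 < u < 1)."""
    ua, ub = to_uniform(series_a), to_uniform(series_b)
    n = len(ua)
    p_below = sum(a < u for a in ua) / n
    p_both_below = sum(a < u and b < u for a, b in zip(ua, ub)) / n
    p_above = sum(a > u for a in ua) / n
    p_both_above = sum(a > u and b > u for a, b in zip(ua, ub)) / n
    if not (0.0 < p_below < 1.0 and 0.0 < p_both_below and 0.0 < p_both_above < 1.0):
        raise ValueError("too few observations on one side of the quantile level")
    chi = 2.0 - math.log(p_both_below) / math.log(p_below)
    chibar = 2.0 * math.log(p_above) / math.log(p_both_above) - 1.0
    return chi, chibar


# Hypothetical check: a series compared with itself is perfectly tail dependent.
series = [float(i) for i in range(1, 201)]
chi_same, chibar_same = chi_and_chibar(series, series, 0.95)
```

    Evaluating the function over a grid of quantile levels u gives curves of the kind shown in Figure 4, from which the values at the highest quantiles are read off.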

    Figure 4: Example illustrating the determination of the coefficient of extremal dependence χ and the strength of dependence χ̅ for the visual agitation of a pair. The dashed lines represent the 95% confidence intervals for χ and χ̅. The tail-dependence and its strength are determined by the values at the higher quantiles (typically between 95% and 99%). The red lines correspond to 95%.

    Figure 4 shows an example illustrating the determination of the coefficient of extremal dependence χ and the strength of dependence χ̅ for the spatial entropy of a pair. Why do we calculate χ and χ̅ for all the quantiles? This is just an empirical method, and we are only interested in the highest quantile values.

    Again, is bivariate EVT overkill, or is it really necessary to analyze the variables that we have defined, visual agitation and spatial entropy? Figure 5 shows that the dependence structure between the spatial entropy of the two peers is far from linear (for both low and high collaborative quality pairs). In such a case, a Pearson correlation would lead to erroneous conclusions. This leads to the development of more sophisticated methods to adequately model dependence structure; see, for instance, Sharma et al. (2017).

    Figure 5: Scatterplots of spatial entropy between the peers with low (left panel) and high (right panel) quality of collaboration.

    5 EXPERIMENT

    The EVT framework presented above provides a new method for analyzing the dual eye-tracking data. The research question we specifically address is the following: Do extreme values from gaze episodes predict the quality of collaboratively produced concept maps better than central trends?

    To answer this question, we conducted an experiment with 66 master’s students from École Polytechnique Fédérale de Lausanne, 20 of whom were female. The participants were each compensated with 30 Swiss francs for their participation in the study. The flow of the experiment is shown in Figure 6.

    Upon their arrival in the laboratory, the participants signed a consent form. Then they took an individual pre-test on the basics of neuronal transmission. Then the participants individually watched two videos about “resting membrane potential.” Next, they created a collaborative concept-map using IHMC CMap tools.3 Finally, they took an individual post-test. The two videos were taken from “Khan Academy.”4,5 The total length of the videos was 17 minutes. It is worth mentioning that the teacher was not physically present during the videos. The participants came to the laboratory in pairs. While watching the videos, the participants had full control over the video player without any time constraint. The collaborative concept-map phase was 10–12 minutes long. During that time participants could talk to each other while their screens were synchronized, i.e., peers were able to see each other’s actions. Both the pre-test and the post-test contained true–false questions.

    Figure 6: Schematic representation of the different phases of the experiment.

    5.1 Quality of Collaboration

    The final concept-map was compared with the concept-map created by the two experts. The pair received a score using the following rules: 1) one mark for each correct connection between two concepts, 2) one mark for each correct label of the edge between two concepts, 3) half a mark for each partially correct label of the edge between two concepts. The pairs were then divided into two levels based on the concept-map score using a median split. Why do we consider this as a measure of collaboration quality? The reason rests in the work of Jermann, Mullins, Nüssli, and Dillenbourg (2011), Jermann and Nüssli (2012), and Kahrimanis, Chounta, and Avouris (2010), who showed that the actions/task-based outcome is often correlated with the collaboration quality. Hence, our assumption about having the collaborative product quality as a proxy of collaboration quality is grounded in previous findings. As Wise and Shaffer (2015) suggest, “...theory plays an ever-more critical role in analysis,” so using these supports from the literature, we can proceed with the aforementioned assumption.
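
    The scoring rules and the median split can be sketched in a few lines of Python. This is an illustration, not the scoring script used in the study: the function names, the pair identifiers, the counts, and the convention that a score equal to the median falls in the low group are all assumptions made here.

```python
import statistics


def concept_map_score(correct_links, correct_labels, partial_labels):
    """One mark per correct connection, one per correct edge label,
    half a mark per partially correct edge label."""
    return 1.0 * correct_links + 1.0 * correct_labels + 0.5 * partial_labels


def median_split(scores):
    """Assign each pair to 'high' or 'low' relative to the median score.

    Assumption: scores equal to the median are placed in the 'low' group.
    """
    cutoff = statistics.median(scores.values())
    return {pair: ("high" if score > cutoff else "low") for pair, score in scores.items()}


# Hypothetical counts for four pairs: (correct links, correct labels, partial labels).
counts = {"pair01": (2, 1, 0), "pair02": (3, 2, 0), "pair03": (4, 3, 2), "pair04": (6, 4, 0)}
scores = {pair: concept_map_score(*c) for pair, c in counts.items()}
levels = median_split(scores)
```

    With these made-up counts the scores are 3, 5, 8, and 10, the median is 6.5, and the last two pairs form the high-quality group.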

    3 CMap tools

    4 Resting Membrane Potential - Part 1

    5 Resting Membrane Potential - Part 2

    6 RESULTS

    6.1 Univariate Extremes

    Recall the question we address in this paper: Does EVT reveal differences that central trends failed to reveal?

    Figure 7 shows the pipeline for data processing. Let us begin with the central trends approach. If we compare the difference in the average levels of entropy of the peers, we observe no significant differences between high- and low-quality pairs. An ANOVA shows no significant difference in the average entropy difference for the peers with high and low collaboration quality (F[1, 21.48] = 0.01, p-value = .93, Figure 8d). The same lack of difference is found with the visual agitation (F[1, 22] = 1.73, p-value = .20, Figure 8c).

    Figure 7: The pipeline for univariate data-processing.

    Now, we compare the previous results with those provided by EVT. We estimated the return level (10) at percentile p. To keep enough data, we set p = 90 for visual agitation and p = 95 for spatial entropy. The reason for setting p = 90 for visual agitation is to have enough data points to fit a GEV or POT. The difference between peers in terms of return levels tells us about their synchronicity. The difference between peers in return levels for visual agitation is lower for high-quality pairs than for low-quality pairs (F[1, 14.08] = 4.92, p-value = .04, one-way ANOVA without assuming equal variances). Similarly, the difference between peers in return levels for spatial entropy is also lower for high-quality pairs (F[1, 15.15] = 8.39, p-value = .01, one-way ANOVA without assuming equal variances). Figures 8a and 8b show the means and confidence intervals for the difference in the return levels for visual agitation and spatial entropy, respectively. In other words, both for agitation and entropy, the extremes occur with higher synchronicity for the high-quality pairs than for the low-quality pairs.
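
    The fractional denominator degrees of freedom reported above (e.g., F[1, 14.08]) are of the kind produced by Welch's correction for unequal variances. The sketch below shows that computation for two groups, assuming this is the test meant by "one-way ANOVA without assuming equal variances"; the function name and the numbers are hypothetical, and the p-value is omitted because the F distribution is not available in the Python standard library.

```python
import statistics


def welch_anova_two_groups(group_a, group_b):
    """Welch's one-way ANOVA for two groups.

    Returns (F, df1, df2). With two groups, F equals the squared Welch
    t statistic and df2 is the Welch-Satterthwaite approximation.
    """
    na, nb = len(group_a), len(group_b)
    va = statistics.variance(group_a) / na  # squared standard error, group a
    vb = statistics.variance(group_b) / nb  # squared standard error, group b
    diff = statistics.fmean(group_a) - statistics.fmean(group_b)
    f_stat = diff * diff / (va + vb)
    df2 = (va + vb) ** 2 / (va * va / (na - 1) + vb * vb / (nb - 1))
    return f_stat, 1.0, df2


# Hypothetical between-peer differences in return levels for two groups of pairs.
high_quality = [1.0, 2.0, 3.0, 4.0, 5.0]
low_quality = [2.0, 4.0, 6.0, 8.0, 10.0]
f_stat, df1, df2 = welch_anova_two_groups(high_quality, low_quality)
```

    The resulting df2 is generally not an integer, which is why the statistics in this section are reported with fractional degrees of freedom.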

    (a) Means and confidence intervals (blue bars) for the difference in the estimated return levels (10) at the 90th percentile for visual agitation, for high- and low-quality pairs.

    (b) Means and confidence intervals (blue bars) for the difference in the estimated return levels (10) at the 90th percentile for spatial entropy, for high- and low-quality pairs.

    (c) Means and confidence intervals (blue bars) for the difference in the mean values for visual agitation, for high- and low-quality pairs.

    (d) Means and confidence intervals (blue bars) for the difference in the mean values for spatial entropy, for high- and low-quality pairs.

    Figure 8: Results: Univariate extremes

    6.2 Bivariate Extremes

    We again compare the two methods: Does EVT reveal differences (of dependencies among peers) that central trends did not?

    Let us start with standard correlations. If we compute the correlation between the spatial entropy of two peers, we can see in both Figures 10c and 10d that we cannot learn anything from the average values (the body of the distribution), and the Pearson correlation/linear model does not make sense here. This might lead to false interpretations of the underlying collaborative processes.

    Let us now compare the EVT approach to the bivariate time series. To estimate their extremal dependence, we start by estimating the extremal coefficient θ as in (11) between the variables for the two peers. We observe that high-quality pairs have a higher dependence for visual agitation than low-quality pairs (F[1, 22] = 6.07, p-value = 0.02, Figure 9a). Similarly, high-quality pairs have a higher level of dependence in spatial entropy than low-quality pairs, with the difference being even more significant (F[1, 22] = 7.65, p-value = 0.01, Figure 9b). The scales on the y-axes for Figures 9a and 9b are inverted. As we mentioned in Section 4.4, complete dependence is reflected by θ = 1, whereas complete independence is reflected by θ = 2.

    Next, we estimate the level χ defined in (12) and strength χ̅ defined in (13) of the extremal dependence. We observe a higher extremal dependence (calculated at the 95% quantile) between the visual agitation of peers for pairs with high collaboration quality (F[1, 22] = 9.19, p-value = 0.006, Figure 11a). Moreover, we observe an even more significant difference in the strength of the extremal dependence (calculated at the 95% quantile) in favour of the pairs with high collaboration quality (F[1, 22] = 11.71, p-value = 0.002, Figure 11c).

    Regarding spatial entropy, we observe effects similar to visual agitation. There is a higher extremal dependence (calculated at the 95% quantile) between the spatial entropy of peers with high collaboration quality (F[1, 22] = 6.31, p-value = 0.01, Figure 11b). Similar to the case of visual agitation, we observe an even more significant difference in the strength of extremal dependence (calculated at the 95% quantile) for the pairs with high collaboration quality (F[1, 22] = 14.28, p-value = 0.001, Figure 11d).

    There is a higher (χ) and stronger (χ̅) extremal dependence (calculated at the 95% quantile) for both visual agitation (Figure 10a) and spatial entropy (Figure 10b) for the high-quality pairs than for the low-quality pairs. We observe a clear separation, in the 2-dimensional space of χ and χ̅, between the high- and low-quality pairs (with three exceptions for visual agitation and one exception for spatial entropy). As we observed in the case of temporal univariate return levels, the difference is more evident in the case of spatial entropy than in the case of visual agitation.

    (a) Means and confidence intervals (blue bars) for the estimated extremal coefficient θ for VA of the participants, for high- and low-quality pairs.

    (b) Means and confidence intervals (blue bars) for the estimated extremal coefficient θ for SE of the participants, for high- and low-quality pairs.

    Figure 9: Bivariate extremes: Dependence measures.

    (a) Coefficient χ and strength χ̅ of extremal dependence for VA for high (red points) and low (blue points) collaboration quality pairs.

    (b) Coefficient χ and strength χ̅ of extremal dependence for SE for high (red points) and low (blue points) collaboration quality pairs.

    (c) SE values for peers in a high-quality pair. The correlation does not reflect the true relationship, as there is no linear relation between the SE values for peers.

    (d) SE values for peers in a low-quality pair. The correlation does not reflect the true relationship, as there is no linear relation between the SE values for peers.

    Figure 10: Results: Bivariate extremes, extremal coefficient, and tail dependence.

    (a) Means and confidence intervals (blue bars) for the estimated level of extremal dependence χ in the visual agitation of the participants, for high- and low-quality pairs.

    (b) Means and confidence intervals (blue bars) for the estimated level of extremal dependence χ for spatial entropy of the participants, for high- and low-quality pairs.

    (c) Means and confidence intervals (blue bars) for the estimated strength of extremal dependence χ̅ in the visual agitation of the participants, for high- and low-quality pairs.

    (d) Means and confidence intervals (blue bars) for the estimated strength of extremal dependence χ̅ in the spatial entropy of the participants, for high- and low-quality pairs.

    Figure 11: Results: Bivariate extremes, levels, and strength of tail dependence.

    7 DISCUSSION

    Does EVT provide interesting findings compared to statistical methods based on central trends?

    Let us first address this question in the univariate context. The comparison of mean values of visual agitation or spatial entropy did not reveal any difference between high-quality and low-quality pairs. On the contrary, EVT revealed that high-quality pairs have a significantly smaller difference of return levels for both variables. This shows that during extreme episodes of collaboration there exists a higher amount of “togetherness” among the participants in high-quality pairs.

    The bivariate context is even more interesting. The three tail dependence coefficients we used measure dependence between the extremes of visual agitation and spatial entropy in a time series. More specifically, from the extremal coefficient θ we learn the effective number of independent series: for high-quality pairs, θ̂ ≈ 1, meaning that the time series of one peer, for both variables, suffices to explain (or describe) the extremes of the other peer. This highlights an extreme “togetherness” in collaboration between the two participants of the pair.

    The dependence measures χ and χ̅ play a role similar to the Pearson correlation, but they avoid the drawbacks of standard correlation (not robust to outliers, restricted to linear dependence structure, spoiled by other effects affecting the body of the distribution). The extremal dependence measures χ and χ̅ focus on the extreme values of the two variables. Similarly to the interpretation of correlation, large values of χ̂ and χ̅̂ indicate a strong dependence between their episodes of high VA and SE. The fact that the bivariate tail-dependence is higher and stronger for the high-quality pairs confirms the univariate findings.

    Using the bivariate space formed by the same gaze measure for both participants in the pair (both for VA and SE), we eliminate the need for grouping (averaging or grouping the individual measures in a regression model) the peer measures into pair variables.

    7.1 Why Does EVT Work?

    One reason EVT works is that, unlike standard methods that suffer from the difference between the assumed underlying distribution and the actual distribution, EVT properly models the tail (of any common distribution) using the correct model (POT or GEV block maxima). Second, when we use the extreme episodes, we focus only on the moments that might reflect the episodes during which the collaborators are most likely to be “together.” Then, by focusing on extreme collaboration episodes, we remove the noise that could have prevented classical methods from differentiating the collaboration quality levels. This fact is also evident in Figures 10c and 10d. Correlation does not reflect the correct relation between the SE for the two participants.

    However, why could we not take the top 5% quantile and perform an ANOVA on those values? A very simple answer is that the main assumption for ANOVA is that the values should follow a normal distribution, and it is mathematically proven that the tail of any distribution (the normal distribution, in the case of ANOVA) does not itself follow that distribution. Instead, it follows the GPD. Hence, it would be statistically wrong to perform an ANOVA on such variables. Could we simply normalize the data and then perform the ANOVA? This could lead to a problem, as we completely ignore many other properties of the data (e.g., skew and kurtosis) while normalizing the data. Thus, key aspects of the data generation process might be hidden or removed. EVT provides a method that assumes no underlying distribution regarding the data generating process, unlike other classical methods. This removes the need to force the data to follow any given statistical distribution.

    7.2 When to Use EVT?

    EVT offers the correct way (in the sense that it is based on mathematical foundations) to analyze abnormal data (in the sense of data far from the average values). The EVT theory for the largest values, peaks-over-threshold, or the bivariate case presented in this paper is available for any underlying continuous distribution. It should be used when analyzing the tail distribution (for any kind of continuous distribution) as a complementary exploration of the data, or when traditional methods fail or are uninformative, either because the assumptions required by these methods (like the linear model/Pearson correlation) based on linear dependence between the two variables are violated or nearly violated, or because the average values on which all these (parametric or non-parametric) methods are based do not contain the relevant information of interest, being therefore less predictive. For example, when a student is writing on a graphical tablet, the extreme values of her time series of writing speed/pressure are her abnormal sequences (in the sense of departure from her standard measures) and relate to her episodes of stress. As another example, when a teacher looks at the exams to infer the heterogeneity of the class, she cannot just be satisfied by a robust measure of the variability of marks. She has to carefully consider the worst and the best marks (the extremes) as the limits of the class heterogeneity frame. Neglecting the worst and the best would not only mean neglecting some students (who probably have an important impact on the class) but also neglecting relevant information. Furthermore, while analyzing trace data (for example, click-streams), although the theory is not established for the discrete case, it is typically applied to count variables, such as Poisson variables, because of their approximation by a continuous distribution.

    8 CONCLUSION

    It is easy to understand that a statistical model that predicts a rise in water level of 5 metres has more social relevance than a model that predicts a rise of 5 centimetres. In education, this approach is less intuitive. Typically, a teacher would care for the average level of his class and try to cope with its heterogeneity. It is hence very counter-intuitive that EVT reaches a higher discriminative power than methods based on central trends. In sciences, what is counter-intuitive is always interesting. However, we should not forget that the extreme values are not outliers but extreme time episodes during collaboration, which is less counter-intuitive. If a teacher monitors a classroom with several teams, (s)he would probably be also attracted by “extreme” episodes; for instance, when peers do not speak at all or when they shout at each other. In our experiment, the raw data is not dialogue but gaze patterns, and at this point nothing proves that similar results would be obtained with other behavioural traces. We do not claim that EVT should replace other statistical methods used in learning analytics, but rather that it expands the range of tools available to learning scientists. By using it across multiple learning contexts, we will learn when and why it brings more discriminative power than methods based on central trends.

    REFERENCES

    Abernethy, B., & Russell, D. G. (1987). The relationship between expertise and visual search strategy in a racquet sport. Human Movement Science, 6(4), 283–319. http://dx.doi.org/10.1016/0167-9457(87)90001-7

    Charness, N., Reingold, E. M., Pomplun, M., & Stampe, D. M. (2001). The perceptual aspect of skilled performance in chess: Evidence from eye movements. Memory & Cognition, 29(8), 1146–1152. http://dx.doi.org/10.3758/BF03206384

    Chavez-Demoulin, V., & Davison, A. C. (2012). Modelling time series extremes. REVSTAT: Statistical Journal, 10(1), 109–133.

    Cherubini, M., & Dillenbourg, P. (2007). The effects of explicit referencing in distance problem solving over shared maps. In Proceedings of the 2007 International ACM Conference on Supporting Group Work (GROUP ’07), 4–7 November 2007, Sanibel Island, FL, USA (pp. 331–340). New York: ACM. http://dx.doi.org/10.1145/1316624.1316674

    Coles, S. (2001). An introduction to statistical modeling of extreme values. London: Springer. http://dx.doi.org/10.1007/978-1-4471-3675-0

    Duchowski, A. T., Cournia, N., Cumming, B., McCallum, D., Gramopadhye, A., Greenstein, J., … Tyrrell, R. A. (2004). Visual deictic reference in a collaborative virtual environment. In Proceedings of the 2004 Symposium on Eye Tracking Research & Applications (ETRA ’04), 22–24 March 2004, San Antonio, TX, USA (pp. 35–40). New York: ACM. http://dx.doi.org/10.1145/968363.968369

    Grant, E. R., & Spivey, M. J. (2003). Eye movements and problem solving: Guiding attention guides thought. Psychological Science, 14(5), 462–466. http://dx.doi.org/10.1111/1467-9280.02454

    Griffin, Z. M., & Bock, K. (2000). What the eyes say about speaking. Psychological Science, 11(4), 274–279. http://dx.doi.org/10.1111/1467-9280.00255

    Jermann, P., Mullins, D., Nüssli, M. A., & Dillenbourg, P. (2011). Collaborative gaze footprints: Correlates of interaction quality. In Connecting Computer-Supported Collaborative Learning to Policy and Practice: Proceedings of the 9th International Conference on Computer-Supported Collaborative Learning (CSCL 2011), 4–8 July 2011, Hong Kong, China (Vol. 1, No. EPFL-CONF-170043, pp. 184–191). International Society of the Learning Sciences.

    Jermann, P., & Nüssli, M.-A. (2012). Effects of sharing text selections on gaze cross-recurrence and interaction quality in a pair programming task. In Proceedings of the 2012 ACM Conference on Computer Supported Cooperative Work (CSCW ʼ12), 11–15 February, Seattle, WA, USA (pp. 1125–1134). New York: ACM. http://dx.doi.org/10.1145/2145204.2145371

    Jermann, P., Nüssli, M.-A., & Li, W. (2010). Using dual eye-tracking to unveil coordination and expertise in collaborative Tetris. In Proceedings of the 24th BCS Interaction Specialist Group Conference (BCS ’10), 6–10 September 2010, Dundee, UK (pp. 36–44). Swindon, UK: BCS Learning & Development Ltd.

    Kahrimanis, G., Chounta, I. A., & Avouris, N. (2010). Study of correlations between logfile-based metrics of interaction and the quality of synchronous collaboration. International Reports on Socio-Informatics, 7(1), 24–31.

    McNeil, A., Frey, R., & Embrechts, P. (2015). Quantitative risk management: Concepts, techniques and tools. Princeton, NJ: Princeton University Press.

    Meier, A., Spada, H., & Rummel, N. (2007). A rating scheme for assessing the quality of computer-supported collaboration processes. International Journal of Computer-Supported Collaborative Learning, 2(1), 63–86. http://dx.doi.org/10.1007/s11412-006-9005-x

    Meyer, A. S., Sleiderink, A. M., & Levelt, W. J. (1998). Viewing and naming objects: Eye movements during noun phrase production. Cognition, 66(2), B25–B33. http://dx.doi.org/10.1016/S0010-0277(98)00009-2

    Nüssli, M.-A. (2011). Dual eye-tracking methods for the study of remote collaborative problem solving. PhD Thesis, École Polytechnique Fédérale de Lausanne.

    Reingold, E. M., Charness, N., Pomplun, M., & Stampe, D. M. (2001). Visual span in expert chess players: Evidence from eye movements. Psychological Science, 12(1), 48–55. http://dx.doi.org/10.1111/1467-9280.00309

    Richardson, D. C., Dale, R., & Kirkham, N. Z. (2007). The art of conversation is coordination: Common ground and the coupling of eye movements during dialogue. Psychological Science, 18(5), 407–413. http://dx.doi.org/10.1111/j.1467-9280.2007.01914.x

    Richardson, D. C., Dale, R., & Tomlinson, T. M. (2009). Conversation, gaze coordination, and beliefs about visual context. Cognitive Science, 33(8), 1468–1482. http://dx.doi.org/10.1111/j.1551-6709.2009.01057.x

    Ripoll, H., Kerlirzin, Y., Stein, J.-F., & Reine, B. (1995). Analysis of information processing, decision making, and visual strategies in complex problem solving sport situations. Human Movement Science, 14(3), 325–349. http://dx.doi.org/10.1016/0167-9457(95)00019-O

    Schneider, B., & Blikstein, P. (2015). Comparing the benefits of a tangible user interface and contrasting cases as a preparation for future learning. In Proceedings of the 11th International Conference on Computer Supported Collaborative Learning (CSCL 2015), 7–11 June 2015, Gothenburg, Sweden. International Society of the Learning Sciences.

    Sharma, K., Caballero, D., Verma, H., Jermann, P., & Dillenbourg, P. (2015). Looking AT versus looking THROUGH: A dual eye-tracking study in MOOC context. Proceedings of the 11th International Conference on Computer Supported Collaborative Learning (CSCL 2015), 7–11 June 2015, Gothenburg, Sweden. International Society of the Learning Sciences.

    Sharma, K., Chavez-Demoulin, V., & Dillenbourg, P. (2017). Non-stationary modeling of tail-dependence of two subjects’ concentration. (To appear) Annals of Applied Statistics.

    Sharma, K., D’Angelo, S., Gergle, D., & Dillenbourg, P. (2016). Visual augmentation of deictic gestures in MOOC videos. In Proceedings of the 12th International Conference of the Learning Sciences (ICLS ’16), 20–24 June 2016, Singapore. ISLS. http://dx.doi.org/10.22318/icls2016.28

    Sharma, K., Jermann, P., Nüssli, M.-A., & Dillenbourg, P. (2012). Gaze evidence for different activities in program understanding. In Proceedings of the 24th Annual Conference of Psychology of Programming Interest Group. London, UK, November 21–23, 2012. https://infoscience.epfl.ch/record/184006

    Sharma, K., Jermann, P., Nüssli, M.-A., & Dillenbourg, P. (2013). Understanding collaborative program comprehension: Interlacing gaze and dialogues. In Proceedings of the 10th International Conference on Computer-Supported Collaborative Learning (CSCL 2013), 15–19 June 2013, Madison, WI, USA. International Society of the Learning Sciences. https://infoscience.epfl.ch/record/184007

    Stein, R., & Brennan, S. E. (2004). Another person’s eye gaze as a cue in solving programming problems. In Proceedings of the 6th International Conference on Multimodal Interfaces (ICMI ’04), 13–15 October 2004, State College, PA, USA (pp. 9–15). New York: ACM. http://dx.doi.org/10.1145/1027933.1027936

    Thomas, L. E., & Lleras, A. (2007). Moving eyes and moving thought: On the spatial compatibility between eye movements and cognition. Psychonomic Bulletin & Review, 14(4), 663–668. http://dx.doi.org/10.3758/BF03196818

    Van Gog, T., Jarodzka, H., Scheiter, K., Gerjets, P., & Paas, F. (2009). Attention guidance during example study via the model’s eye movements. Computers in Human Behavior, 25(3), 785–791. http://dx.doi.org/10.1016/j.chb.2009.02.007

    Van Gog, T., Kester, L., Nievelstein, F., Giesbers, B., & Paas, F. (2009). Uncovering cognitive processes: Different techniques that can contribute to cognitive load research and instruction. Computers in Human Behavior, 25, 325–331. http://dx.doi.org/10.1016/j.chb.2008.12.021

    Van Gog, T., & Scheiter, K. (2010). Eye tracking as a tool to study and enhance multimedia learning. Learning and Instruction, 20(2), 95–99. http://dx.doi.org/10.1016/j.learninstruc.2009.02.009

    Wise, A., & Shaffer, D. W. (2015). Why theory matters more than ever in the age of big data. Journal of Learning Analytics, 2(2), 5–13. http://dx.doi.org/10.18608/jla.2015.22.2

    Zelinsky, G. J., & Murphy, G. L. (2000). Synchronizing visual and language processing: An effect of object name length on eye movements. Psychological Science, 11(2), 125–131. http://dx.doi.org/10.1111/1467-9280.00227