Detecting and Responding to Student Emotion within an Online Tutor
Beverly Park WOOLF
College of Information and Computer Sciences, University of Massachusetts Amherst
Supported by: National Science Foundation; U.S. Dept. of Education; Bill & Melinda Gates Foundation
General Motivation
• What to do in the moment when students are frustrated, bored, etc.?
– Increase challenge? Decrease challenge?
– Provide extra scaffolds?
– Provide “affective” scaffolds? What are those?
– Encourage students to stop and think about what is going on?
• How to measure changes in student affect, capturing micro-changes in student affective states?
People and Computers
Many people:
• relate to computers in the same way they relate to humans (Nass, 2010);
• continue to engage in frustrating tasks significantly longer after an empathic digital response (Picard);
• have lowered stress levels after receiving an empathetic message from a digital character (Arroyo et al., 2009);
• recalled more information when interacting with an artist agent compared to a scientist agent;
• report reduced frustration and more general interest when working with gender-matched characters.
In Summary
• Empathetic characters help decrease students’ anxiety and boredom.
• Simple 2D characters instead of 3D characters work well with students.
• Tutoring systems that are not based on natural language processing work well.
• Learning companions that show empathy help with students’ negative affective states.
• Growth mindset messages provide a boost in student math learning.
• Empathic messages yield higher math performance and learning.
• Simple success/failure comments are harmful to students, in comparison to other conditions.
Negative Emotion and Learning
• Confusion is associated with learning under certain conditions
D'Mello, S. K., Lehman, B., Pekrun, R. & Graesser, A. C. (2014). Confusion Can Be Beneficial for Learning. Learning & Instruction.
• Boredom reduces task performance
Pekrun, R., Goetz, T., Daniels, L., Stupinsky, R. & Perry, R. (2010). Boredom in Achievement Settings: Exploring Control–Value Antecedents and Performance Outcomes of a Neglected Emotion. Journal of Educational Psychology. 102(3), 531-549.
• Boredom increases ineffective behaviors such as gaming
Baker, R. S. J. d., D'Mello, S. K., Rodrigo, M. M. T. & Graesser, A. C. (2010). Better to Be Frustrated than Bored: The Incidence, Persistence, and Impact of Learners' Cognitive-Affective States during Interactions with Three Different Computer-Based Learning Environments. International Journal of Human-Computer Studies. 68(4), 223-241.
Negative Valence Emotions
Affective learning companions congratulate students on effort exerted and talk to them about their effort and learning.
Agents support frustrated students by acting helpful, bored, or confused.
Arroyo et al., AIED 2009
[Figure: Agent Emotion. Labels shown: Effort Attribution; Shrug; High interest]
Students believe agents are part of the learning experience, mentors . . . who are together with students against the computer, . . . who are more knowledgeable (most of the time) cognitively and emotionally.
Arroyo et al., AIED 2009
Methodology
Measure students’ cognitive and affective attributes (skills, motivation, engagement) in real time.
Offer appropriate and timely interventions.
Measure the impact of each intervention.
Reduced Frustration
[Figure: frustration level; axis labels: More Frustrated, Neutral, Less Frustrated]
Increased Interest
• Less boredom for math at posttest time in the LC (learning companion) condition.
• + F(94,1) = 3.4, p = .07
AGENDA
• Remediation of Student Emotion
– Teacher-based
– Peer-based
– Game-based
How are you feeling? Please rate your level of interest in this
Students Self-Report Emotions
Four bipolar emotional axes
Resulting Data
How can we understand student affect?
How can students’ actual words be used?
Research to Detect Affect
Two major approaches to examine affect:
• Categorical
– Each affect is considered a discrete category
– Approach usually used in ITS
• Dimensional models of emotion
– Valence (+/-), activation (high/low)
– Locus of control (internal/external)
– Previous results mixed
Self-report Methods
• Student was prompted for affect every 5 problems or 8 minutes (whichever came first)
– Asked about either frustration, confidence, excitement or interest each time
• Asked to rate their affect on a 5-point Likert scale and to answer “why”
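The prompting rule above ("every 5 problems or 8 minutes, whichever came first") can be sketched as a small scheduler. This is only an illustration: the class name `PromptScheduler`, its fields, and the idea of rotating through the four emotions in a fixed order are assumptions, since the slide says only that one of the four emotions was asked about each time.

```python
from dataclasses import dataclass

# Thresholds taken from the slide: prompt every 5 problems or 8 minutes.
PROBLEMS_BETWEEN_PROMPTS = 5
SECONDS_BETWEEN_PROMPTS = 8 * 60

# The four emotions the slide says were asked about, one per prompt.
EMOTIONS = ["frustration", "confidence", "excitement", "interest"]


@dataclass
class PromptScheduler:
    """Hypothetical helper that decides when to show a self-report prompt."""
    problems_since_prompt: int = 0
    last_prompt_time: float = 0.0  # seconds since the session started
    prompts_shown: int = 0

    def record_problem(self) -> None:
        """Call once each time the student finishes a problem."""
        self.problems_since_prompt += 1

    def should_prompt(self, now: float) -> bool:
        # "Whichever came first": either threshold alone is enough.
        return (self.problems_since_prompt >= PROBLEMS_BETWEEN_PROMPTS
                or now - self.last_prompt_time >= SECONDS_BETWEEN_PROMPTS)

    def next_prompt(self, now: float) -> str:
        # Rotating through the emotions is an assumption, not the authors' rule.
        emotion = EMOTIONS[self.prompts_shown % len(EMOTIONS)]
        self.prompts_shown += 1
        self.problems_since_prompt = 0
        self.last_prompt_time = now
        return emotion
```

Resetting both counters after each prompt keeps the two triggers independent, so a fast student is prompted by problem count and a slow student by elapsed time.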
Phase 1: Open Coding
• 450 random responses from 2011 given to 5 coders
• Each coder independently created ~10 categories and tagged the responses using these, i.e., a student’s words were an example of emotion XX
– Each coder’s categories had to cover at least 70% of all responses
– Could tag a response with multiple categories if appropriate
Phase 2: Axial Coding
• 3 coders used the resulting 5 schemes to create 10 final categories:
– IDK: student doesn’t know why they feel that way
– Boring: student says they are bored/material is boring
– Easy: student says material is easy
– Hard: student says material is hard
– Internal: student attributes feelings to self
– External: student attributes feelings to something outside self
– Positive: valence of comment is positive
– Negative: valence of comment is negative
– Supportive: student says tutor is helpful or supports them
– Unsupportive: student says tutor is not helpful or does not support them
Phase 3: Application and Validation of Tags
• Four coders each coded the 2015 data using the 10 agreed upon categories
• Inter-rater reliability by Cohen’s Kappa
• Used the coded data from the coder who had highest agreement with others overall
[Table: Highest Kappa between any 2 coders for each tag]
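As a reminder of what the reliability step computes, here is a minimal pure-Python sketch of Cohen's kappa for two coders, plus two helpers that mirror the slide: the best-agreeing pair of coders, and the coder with the highest average agreement with the others. The function names and the input layout (a dict from coder name to a list of labels for one tag) are assumptions made for illustration, not the authors' analysis code.

```python
from collections import Counter
from itertools import combinations


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each coder's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def highest_pairwise_kappa(codings):
    """Return (kappa, (coder_a, coder_b)) for the best-agreeing pair."""
    return max(
        (cohens_kappa(codings[a], codings[b]), (a, b))
        for a, b in combinations(sorted(codings), 2)
    )


def most_consistent_coder(codings):
    """Coder whose mean kappa with every other coder is highest."""
    def mean_kappa(coder):
        others = [c for c in codings if c != coder]
        return sum(cohens_kappa(codings[coder], codings[o]) for o in others) / len(others)
    return max(sorted(codings), key=mean_kappa)
```

With one such dict per tag (1 = tag applied, 0 = not applied), `highest_pairwise_kappa` gives the kind of per-tag value the table caption describes, and `most_consistent_coder` picks whose codes to keep.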
Results & Analyses
Frequency of Each Code Out of a Total Sample Group (2015 N = 449; 2011 N = 464)
The tutor seems to improve in promoting positive student affect.
Results & Analyses
Percentage of reports containing each tag, broken down by affect
[Figure panels: 2011; 2015]
More positive, less negative affect.
More internal, less external affect.
Less boring material.
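The quantity plotted above, the percentage of reports containing each tag broken down by the affect that was prompted, can be computed along these lines. The function name and the `(affect, tags)` input layout are assumptions for illustration.

```python
from collections import Counter, defaultdict


def tag_percentages_by_affect(reports):
    """Percentage of reports containing each tag, grouped by prompted affect.

    `reports` is an iterable of (affect, tags) pairs, where `tags` is the
    collection of codes applied to one student's "why" response.
    """
    totals = Counter()                 # number of reports per affect
    tag_counts = defaultdict(Counter)  # affect -> tag -> number of reports
    for affect, tags in reports:
        totals[affect] += 1
        for tag in set(tags):          # count each tag once per report
            tag_counts[affect][tag] += 1
    return {
        affect: {tag: 100.0 * n / totals[affect]
                 for tag, n in tag_counts[affect].items()}
        for affect in totals
    }
```

Because a response may carry several tags, the percentages within one affect can sum to more than 100.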
Results & Analyses
Frequent Combinations
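One simple way to find frequent combinations is to count how often each pair of tags is applied to the same response. The sketch below assumes each response is represented by its set of tags; the function name is hypothetical, and the slide does not say how the combinations were actually computed.

```python
from collections import Counter
from itertools import combinations


def frequent_tag_pairs(tagged_responses, top_n=3):
    """Return the `top_n` most common pairs of tags that co-occur."""
    pair_counts = Counter()
    for tags in tagged_responses:
        # Sort so ("Hard", "Negative") and ("Negative", "Hard") count as one pair.
        pair_counts.update(combinations(sorted(set(tags)), 2))
    return pair_counts.most_common(top_n)
```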
Discussion
• Many students tend to externalize affect
– Especially when negative; “the problems are too difficult”
• Populations differed on when they reported the material was “hard”
• The tutor seems to be improving in promoting positive student affect.
Representations of Affect
Use External Coders
•She looks “Angry”
•She looks “Angry”
•She looks “Angry”
BROMP Observation Method Protocol
• Ocumpaugh, J., Baker, R., and Rodrigo, M.M.T. (2012). Baker-Rodrigo Observation Method Protocol (BROMP) 1.0. Training Manual version 1.0. Technical Report. New York, NY: EdLab. Manila, Philippines: Ateneo Laboratory for the Learning Sciences.
Inter-Rater Reliability
• Reliability is a decent “goodness” metric
• Reliability ≠ Validity
• Good face validity
• ≈ “Angry”
• ≈ “Constipated”?
Internal Representation ≈ Experience?
• Self Appraisal
• Self Report
• Relate an Experience to a Representation
•Can I understand how I feel?
Self-Report Requires Matching Experience to Representation
Let’s Address This
• Self-Report Method Reliability
• Establish Method to Measure Reliability
• Distinguish Relative Reliability of Different Methods
Participants
• Eighty-one (81) seventh graders from two California middle schools
• Majority Latin American
• Close to California Median Income
Students Were Given
& the Following Stickers
Angry Anxious Bored Confident Confused
Enjoying Excited Frustrated Interested Relieved
Motivations For Stickers
• Words based on relevance in education and similarity to emotes (faces)
• Emoticons (faces) based on Broekens & Brinkman 2013 Affect Button
– Has extremes of Valence (Pleasure), Activation, & Dominance
– We chose each extreme for faces (2^3)
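The "2^3" above is the number of corners of the valence/activation/dominance cube: each of the three dimensions is taken at its low or high extreme, giving eight faces. The snippet below simply enumerates them; coding the extremes as -1 and +1 is an assumption made for illustration.

```python
from itertools import product

DIMENSIONS = ("valence", "activation", "dominance")

# One face per corner of the cube: every dimension at its low (-1) or
# high (+1) extreme. The -1/+1 coding is an illustrative assumption.
FACE_EXTREMES = [
    dict(zip(DIMENSIONS, corner))
    for corner in product((-1, 1), repeat=len(DIMENSIONS))
]

for face in FACE_EXTREMES:
    print(face)
```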
Averaged Self-Report Values
Can progress reports from virtual teachers improve student interest, excitement?
Given that they encourage students to stop and reflect... And give them a choice…
Do progress reports have the capacity to improve student interest and excitement?
What do researchers know about showing progress to students?
• Cognitive side: basic progress charts every 6 problems in an intelligent tutor led to higher learning gains
Arroyo et al. (2007). Repairing disengagement with non-invasive interventions. AIED 2007.
• NSF Cyberlearning DIP Collaborative Research: Impact of Adaptive Interventions on Student Affect, Performance and Learning (2.5 more years)
• ACTIVATING EMOTION EXPERIMENTS: frustration, anxiety (intervention: Characters)
• DEACTIVATING EMOTION EXPERIMENTS: boredom, unexcitement (intervention: Student Progress Page)
The MathSpring System
The Student Progress Page
Dovan Rai, PhD, WPI
In Ed Psych, it is well accepted that motivation/emotions influence achievement, but…
• Pekrun, R., Elliot, A. J. & Maier, M. A. (2009)
Three Experimental Conditions
• No access to the Student Progress Page
• “My Progress” button was present (student choice)
• Prompted invitation to see “My Progress” when the student reports being bored (disinterested) or unexcited
• Student forced to see “My Progress” when the student reports being bored (disinterested) or unexcited
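The conditions listed above differ only in what the tutor does when a student reports low interest or low excitement. A minimal sketch of that decision follows. The enum names, the 5-point rating scale, and the threshold of 2 for "low affect" are all assumptions made for illustration; the slides do not state the cut-off that was used.

```python
from enum import Enum


class Condition(Enum):
    NO_BUTTON = "no access to the Student Progress Page"
    BUTTON_PRESENT = "button present, student choice"
    PROMPT = "invite when bored or unexcited"
    FORCE = "force when bored or unexcited"


class Action(Enum):
    NONE = "do nothing"
    INVITE = "invite the student to open the SPP"
    OPEN_SPP = "open the SPP"


# Assumption: ratings of 1-2 on a 5-point scale count as low affect.
LOW_AFFECT_THRESHOLD = 2


def spp_action(condition, interest, excitement):
    """Decide what the tutor does after a self-report of interest/excitement."""
    low_affect = min(interest, excitement) <= LOW_AFFECT_THRESHOLD
    if not low_affect:
        return Action.NONE
    if condition is Condition.PROMPT:
        return Action.INVITE
    if condition is Condition.FORCE:
        return Action.OPEN_SPP
    # NO_BUTTON and BUTTON_PRESENT never intervene on the tutor's initiative.
    return Action.NONE
```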
Initial Results about receiving SPP
• SPP access indeed increased across conditions
– No button: M = 1.3
– Button present: M = 3.1
– SPP offered upon low affect: M = 6.0
– SPP forced upon low affect: M = 8.8
• Confirmed that there were no differences between conditions in terms of baseline interest and excitement as measured by the pre-affect survey (ns).
• Our current gold standard/truth: self-reports
• During experiment:
– Ask students about their “interest” level and their “excitement” level every 7 minutes, on average.
Bold = features about student interaction with tutor
Predictions of Interest/Excitement
[Figure: sequences of predicted states, labeled DISINT (disinterested), NEUTRAL, and INTER (interested)]
• Evidence that having the SPP present instead of absent can lead towards interest and excitement
– Not shooting for an ideal policy, just a policy that works to some extent and is able to compete against others
AGENDA
• Remediation of Student Emotion
– Teacher-based
– Peer-based
– Game-based
Remaining Questions
• What if students started unexcited/bored instead of neutral?
• Can we compute the probability of “improving” their affective state, regardless of where they started?
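One way to approach the second question is to treat each student's sequence of affective states as a first-order Markov chain, estimate transition probabilities from the observed sequences, and sum the probability of moving to a better state from each starting state. The sketch below uses the DISINT / NEUTRAL / INTER labels from the prediction slide. The Markov assumption, the ordering of states, and the function names are illustrative choices, not the authors' model.

```python
from collections import Counter, defaultdict

# Ordered from worst to best so "improving" means moving to a higher rank.
STATE_RANK = {"DISINT": 0, "NEUTRAL": 1, "INTER": 2}


def transition_probabilities(sequences):
    """Estimate P(next state | current state) from observed state sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return {
        state: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for state, nexts in counts.items()
    }


def probability_of_improving(sequences):
    """For each starting state, the probability that the next state is better."""
    probs = transition_probabilities(sequences)
    return {
        state: sum(p for nxt, p in nexts.items()
                   if STATE_RANK[nxt] > STATE_RANK[state])
        for state, nexts in probs.items()
    }
```

Computing these probabilities separately per experimental condition would allow a comparison that does not depend on where students started.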
Future Work
• Look in more detail at:
– Internal vs. external tag
– Relationship between difficulty and affect
• Possible new affective constructs (e.g., persistence)
• Create better predictive models for specific reasons
• Use NLP to label the responses rather than have human coders do it.
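As a point of comparison for any future NLP labeler, a naive keyword matcher is about the simplest possible baseline. The sketch below covers four of the ten tags; the keyword lists are invented for illustration, and a real system would more plausibly be trained on the human-coded responses.

```python
import re

# Hypothetical keyword lists for a few of the ten tags. These are
# illustrative guesses, not derived from the study's data.
TAG_KEYWORDS = {
    "IDK": ["i don't know", "idk", "not sure"],
    "Boring": ["boring", "bored"],
    "Easy": ["easy", "simple"],
    "Hard": ["hard", "difficult", "confusing"],
}


def tag_response(text):
    """Return the sorted list of tags whose keywords appear in `text`."""
    # Lowercase, normalize curly apostrophes, and collapse whitespace.
    normalized = re.sub(r"\s+", " ", text.lower().replace("’", "'"))
    tags = []
    for tag, keywords in TAG_KEYWORDS.items():
        # Word boundaries keep "easy" from matching inside "uneasy".
        if any(re.search(r"\b" + re.escape(k) + r"\b", normalized)
               for k in keywords):
            tags.append(tag)
    return sorted(tags)
```

Agreement between such a baseline and the human coders could be measured with Cohen's Kappa, the same metric used for inter-rater reliability in Phase 3.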
Unresolved Questions
• No clear evidence that encouraging the SPP at moments of low affect is better than simply having the button available
• Choice in the Button-Present condition might give students a sense of agency, which in turn might make them “feel good”, engaged, etc.
• It might be that intervening based only on the last report of affect is not good enough. Intervene after episodes of boredom/lack of excitement?
• Maybe showing the SPP at moments of boredom/lack of excitement is not that effective at repairing those states.
• Maybe our models are not accurate enough, and that impacts both the affective paths and the results.
Thank You.
Any Questions?
This research was funded by the National Science Foundation, #1324385, Cyberlearning DIP, Impact of Adaptive Interventions on Student Affect, Performance, and Learning; Burleson, Arroyo and Woolf (PIs). Any opinions, findings, conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the funding agencies.