[email protected] @cab938 Shape of Educational Data: Predictive Modeling as an Enabler of Personalized Learning Christopher Brooks Research Assistant Professor, School of Information Director of Learning Analytics and Research Digital Education and Innovation University of Michigan

Christopher Brooks SOED 2016

Jan 26, 2017

Page 1: Christopher Brooks SOED 2016

[email protected] @cab938

Shape of Educational Data: Predictive Modeling as an Enabler of Personalized Learning

Christopher Brooks
Research Assistant Professor, School of Information

Director of Learning Analytics and Research
Digital Education and Innovation

University of Michigan

Page 2: Christopher Brooks SOED 2016

Psychohistory

“…[it] combined history, psychology and mathematical statistics to create a (nearly) exact science of the behavior of very large populations of people…Asimov used the analogy of a gas: in a gas, the motion of a single molecule is very difficult to predict, but the mass action of the gas can be predicted to a high level of accuracy. Asimov applied this concept to the population of the fictional Galactic Empire, which numbered in the quadrillions.”

http://asimov.wikia.com/wiki/Psychohistory

Page 3: Christopher Brooks SOED 2016

• "Averaginarianism"
• Regression towards a mean that doesn't actually naturally exist
• There is a gulf between the predictive modeling perspectives and the explanatory modeling ones

Page 4: Christopher Brooks SOED 2016

Research Perspective
• Learners are individuals
• There is nuance in data that is important and being missed by studying populations vs. individuals
• Computational modelling (esp. predictive modelling) has an opportunity to help

Page 5: Christopher Brooks SOED 2016

Traditional Higher Education

Low Stakes Lifelong Learning

Page 6: Christopher Brooks SOED 2016

Lecture Capture
• How do students integrate educational technologies into their study habits?
– (and do those technologies have any effect?)
• A need for insight
– Studies largely show only student satisfaction benefits from lecture capture
– Several studies show no effect of lecture capture use on performance
• Data mining for usage patterns
– Apply unsupervised machine learning methods (k-means clustering) to viewership data by week
– Then build a general model from the prototypes, apply it to new datasets, and determine fit (replication)
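The clustering step can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the weekly view counts are invented, the choice of k=3 is arbitrary, and the deterministic farthest-point initialization is an assumption made so the example gives the same result on every run.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_centroids(points, k):
    """Deterministic start (an assumption): first point, then farthest points."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids

def kmeans(points, k, max_iter=100):
    """Plain k-means: alternate assignment and centroid update until stable."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each student goes to the nearest prototype.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # no student changed cluster, so we have converged
        labels = new_labels
        # Update step: each prototype becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Hypothetical data: lecture views per week for six students over six weeks.
weekly_views = [
    [3, 3, 2, 3, 3, 3],  # steady viewer
    [2, 3, 3, 3, 2, 3],  # steady viewer
    [0, 0, 0, 0, 1, 6],  # views only near the end
    [0, 0, 0, 1, 0, 5],  # views only near the end
    [0, 0, 0, 0, 0, 0],  # never views
    [0, 1, 0, 0, 0, 0],  # almost never views
]

prototypes, labels = kmeans(weekly_views, k=3)
print(labels)
```

The returned prototypes are the per-week mean viewing patterns of each group; fitting a new course's data to them is then a matter of running only the assignment step against the saved prototypes.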

Page 7: Christopher Brooks SOED 2016

Results (Chemistry 2xx 2010)

• 5 groups found, each pedagogically labelled (by investigators!)
• Error and size of groups range considerably
• The final exam period is not indicative of activity throughout the semester

Page 8: Christopher Brooks SOED 2016

The general model

• How well does the model generalize?

Page 9: Christopher Brooks SOED 2016

(Grades Pairwise Tukey HSD, * p<0.1 ** p<0.05)

(Grades)

Page 10: Christopher Brooks SOED 2016

Results
• Not a predictive model, but a more discriminating descriptive model
– Showed an effect not for general use of lecture capture, but for specific ways of using lecture capture
• Replication suggests there is merit to the model, but that it is highly contextualized (theme of course)
• Data from more sources could add further detail to the model as to causal effects

Brooks, C. A., Erickson, G., Greer, J. E., Gutwin, C. (2014) Modelling and Quantifying the Behaviours of Students in Lecture Capture Environments. In Computers & Education. Vol 75 June. pages 282-292.

Page 11: Christopher Brooks SOED 2016

Bonus Calculus Slide

Page 12: Christopher Brooks SOED 2016

Massive Open Online Courses
• As of the end of 2014, MOOCs at Michigan have attracted 1.9 million enrollees and nearly 1 million participants
• Of these participants, ~300K attempt some assessment task, ~80K end up passing the course (certificate)
• Can we do better in understanding student success in this environment?
• Could we predict which students who want to obtain a certificate are at risk?

Page 13: Christopher Brooks SOED 2016

• MOOCs lack the diversity of data we have about residential students
– Previous achievement (SAT/ACT, last year's course)
– Socioeconomic status (distance from university, first in family, wealth)
– Gender
– Ethnicity
– Motivation
• Building predictive models of student achievement in learning analytics is largely done on these entry-level features
• Both frustrating and refreshing
– Want accurate models, but want actionable data

Page 14: Christopher Brooks SOED 2016

• Built a novel feature selection algorithm inspired by work in the text-mining community
• It looks at the pattern of engagement that a student has with course resources
• Build off historical data (last year's course) to create day-by-day multilevel models (C4.5)
• Initial work is based on student certificate achievement (pass/fail)
– (not the only valuable outcome variable to try and predict!)

Page 15: Christopher Brooks SOED 2016

Resource: Video
Day of Course: 1 2 3 4 5 6 7 8 9

Daily Accesses
Day 1: Yes
Day 2: No
Day 3: Yes
Day 4: No
Day 5: No
Day 6: No
Day 7: No
Day 8: Yes
Day 9: No

3-Day counts
Day 1-3: Yes
Day 4-6: No
Day 7-9: Yes

Weekly counts
Week 1: Yes
Week 2: Yes

Monthly counts
Month 1: Yes

For a 104 day long course, with three resources (videos, forums, quizzes), this gives us 408 features for the modelling activity.
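The windowing shown on this slide can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: each window is treated as a yes/no flag meaning "any access in that window" (matching the Yes/No values shown), a month is taken as 30 days, and a partial trailing window is kept. It does not attempt to reproduce the 408 feature count, since the exact windowing rules behind that number are not given here.

```python
def window_flags(daily, size):
    """Collapse daily yes/no accesses into one 'any access' flag per window."""
    return [any(daily[i:i + size]) for i in range(0, len(daily), size)]

# The nine days of video access from the slide (True = Yes, False = No).
daily_video = [True, False, True, False, False, False, False, True, False]

features = {
    "daily": daily_video,
    "3-day": window_flags(daily_video, 3),     # days 1-3, 4-6, 7-9
    "weekly": window_flags(daily_video, 7),    # days 1-7, 8-9 (partial week)
    "monthly": window_flags(daily_video, 30),  # assumed 30-day month
}

for name, flags in features.items():
    print(name, ["Yes" if f else "No" for f in flags])
```

Running the same function once per resource (videos, forums, quizzes) and concatenating the results gives one feature vector per student.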

Page 16: Christopher Brooks SOED 2016

Text Mining Inspiration
• Text mining often uses n-grams as features in a document
– A bigram (cat, good) is the number of pairs of these two words in a document, a trigram (cat, was not good), etc.
– We build engagement n-grams up to 5-grams

Daily Accesses
Day 1: Yes
Day 2: No
Day 3: Yes
Day 4: No
Day 5: No
Day 6: No
Day 7: No
Day 8: Yes
Day 9: No

Possible bigrams:
[yes, yes]: 0
[no, no]: 3
[yes, no]: 3
[no, yes]: 2

Possible trigrams:
[yes, yes, yes]: 0
[yes, yes, no]: 0
[yes, no, yes]: 1
…

For a 104 day long course, with three resources (videos, forums, quizzes), this gives us 717 more features for the modelling activity.
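The bigram and trigram counts on this slide can be reproduced with a short sliding-window count. The function name `engagement_ngrams` is made up for this sketch, and how the full method arrives at 717 features is not reconstructed here.

```python
from collections import Counter

def engagement_ngrams(daily, n):
    """Count every run of n consecutive daily access values."""
    return Counter(tuple(daily[i:i + n]) for i in range(len(daily) - n + 1))

# The nine days of access from the slide.
daily = ["yes", "no", "yes", "no", "no", "no", "no", "yes", "no"]

bigrams = engagement_ngrams(daily, 2)
trigrams = engagement_ngrams(daily, 3)

# Counter returns 0 for patterns that never occur, such as (yes, yes).
for pattern in [("yes", "yes"), ("no", "no"), ("yes", "no"), ("no", "yes")]:
    print(pattern, bigrams[pattern])
print(("yes", "no", "yes"), trigrams[("yes", "no", "yes")])
```

Each distinct pattern becomes one feature whose value is its count, in the same way a word n-gram count becomes a feature of a document.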

Page 17: Christopher Brooks SOED 2016

In a nutshell
• We do not have a diverse set of data, but we do have a detailed set of data
• And there is a lot of it (200 million+ clickstream events)
• By pulling out patterns of resource access, we can use supervised machine learning (C4.5) techniques to build predictive models
• But what if we did have entry data from students?
– Gender & Ethnicity, certification status, country of origin, etc.
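C4.5 decision trees choose which feature to split on using the information gain ratio. The sketch below computes that criterion for a single yes/no engagement feature against a pass/fail outcome. The feature names and the eight-student toy data are hypothetical, and a full C4.5 learner (recursive splitting, pruning, continuous attributes) is not shown.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())

def gain_ratio(feature, labels):
    """Information gain of splitting on `feature`, normalised by split entropy."""
    total = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    # Expected label entropy left over after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical toy data for eight students.
passed = ["pass", "pass", "pass", "pass", "fail", "fail", "fail", "fail"]
video_week1 = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]  # separates perfectly
forum_week1 = ["yes", "no", "yes", "no", "yes", "no", "yes", "no"]  # tells us nothing

print(gain_ratio(video_week1, passed))
print(gain_ratio(forum_week1, passed))
```

A tree learner would pick the feature with the highest ratio at each node, which is how detailed access patterns end up driving the predictions.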

Page 18: Christopher Brooks SOED 2016

[Figure: Fleiss' κ versus Time in Days. x-axis: Day of Course Offering (1 to 75); y-axis: Fleiss' κ (0 to 0.9); series: Activity Features Only, Demographics Features Only, Activity and Demographics Features]
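For readers unfamiliar with the metric, Fleiss' κ measures agreement beyond chance among a fixed number of raters. The sketch below implements the standard formula. Treating the model's prediction and the actual outcome as two "raters" is an assumption made for illustration; the slides do not say exactly how κ was computed for this figure.

```python
def fleiss_kappa(counts):
    """counts[i][j] = number of raters placing subject i in category j.

    Every row must sum to the same number of raters (at least 2).
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])
    # Share of all ratings that fall in each category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Per-subject agreement: fraction of rater pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(p_i) / n_subjects
    expected = sum(p * p for p in p_j)
    return (observed - expected) / (1 - expected)

# Hypothetical example: columns are [pass, fail]; two "raters" per student
# (predicted outcome and actual outcome).
perfect = [[2, 0], [2, 0], [0, 2], [0, 2]]  # prediction always matches
chance = [[2, 0], [1, 1], [0, 2], [1, 1]]   # agreement no better than chance

print(fleiss_kappa(perfect))
print(fleiss_kappa(chance))
```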

Page 19: Christopher Brooks SOED 2016

Results
• It is possible to create predictive models on clickstream data for MOOCs
• 3 weeks into the MOOC seems to be an interesting point for some courses
• It is computationally intensive to create these models (daily!)
• MOOC entry/demographics information does not seem to add value

C. Brooks, C. Thompson, S. Teasley. (2015) A Time Series Interaction Analysis Method for Building Predictive Models of Learners using Log Data. 5th International Conference on Learning Analytics and Knowledge 2015 (LAK'15)

C. Brooks, C. Thompson, S. Teasley. (2015) Who You Are or What You Do: Comparing the Predictive Power of Demographics vs. Activity Patterns in Massive Open Online Courses (MOOCs). The second annual conference on Learning At Scale 2015 (L@S2015), Works in Progress track. Vancouver, BC, March 14-15, 2015.

Page 20: Christopher Brooks SOED 2016

Page 21: Christopher Brooks SOED 2016

No Particular Night or Morning

“I looked at the page with my name under the title…it was some other man…the story was familiar – I knew I had written it – but that name on the paper still was not me. It was a symbol, a name.”

“I’ve always figured it that you die each day, and each day is a box…but you never go back and lift the lids...each is a different you, somebody you do not know or understand or want to understand.”

Page 22: Christopher Brooks SOED 2016

Questions? Comments?

Christopher Brooks
Research Assistant Professor, School of Information

Director of Learning Analytics and Research
in Digital Education and Innovation

University of Michigan

[email protected] @cab938