Generating educational assessment items from Linked Open Data

The case of DBpedia

Muriel Foulonneau

“To Really Learn, Quit Studying and Take a Test” (NYT, Jan, 2011)

Formative assessment

Self-assessment

Items are expensive

Creating, reusing, sharing test items

05/2011 ESWC 2011

Why generate items?

• Security issue

Adding variability to an item

No expected variation of the construct

• Model-based learning

Generating items from knowledge represented as a model

The construct is modified for each item

30/05/2011 Presentation Tudor

Assumption on model-based learning

INTERESTING BECAUSE

- Can enable adaptive learning paths

- Independent from particular representations of learning resources

CONSTRAINTS

A domain model must exist

- Bring experts together to design a model of what learners should learn

LIMITATIONS

- Experts are difficult to mobilize for a long modeling exercise

- What about specialized/professional knowledge?

- How to ensure the evolution of the model?


The LoD Cloud as a source of knowledge

• Existing data sources

No need to gather experts

• Including knowledge which is not well codified in curricula

Knowledge gathered from experts as well as non-experts

• Many datasets added or modified all the time

Can reflect the evolution of knowledge


Using LoD for model-based learning


Limitations of model-based learning ⇒ LoD as a source of knowledge

• Experts are difficult to mobilize for a long modeling exercise
⇒ Existing data sources: no need to gather experts

• What about specialized/professional knowledge?
⇒ Including knowledge which is not well codified in curricula; knowledge gathered from experts as well as non-experts

• How to ensure the evolution of the model?
⇒ Many datasets added or modified all the time; can reflect the evolution of knowledge

Objectives of the experimentation

• Are there limitations to the use of Linked Open Data as a knowledge model for learning?

• Is this feasible?

• Are the datasets relevant?

• How much quality control is needed?

Test on factual knowledge for simple choice items


Semi-automatic item generation

• Manual definition of an item template

• Automatic generation of variables

Stem variables

options

key

Auxiliary information

Existing strategies

• Algorithms
  - X: value range 3 to 18 by 3

• Natural language processing
  - Vocabulary questions and cloze questions

• Structured datasets
  - Vocabulary questions from the WordNet dataset

• Model extraction, then question generation
  - From natural language (or model creation by experts)

• Mostly used in mathematics and scientific subjects, where algorithmic definition of variables is easier

• And for L2 learning

Challenge: generating other types of variables

• Additional information, historical knowledge, feedback…
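As an illustration of the algorithmic strategy listed above, the variable definition "X: value range 3 to 18 by 3" can be expanded into concrete stems. This is a minimal sketch: the stem text and the function names are hypothetical, and only the range itself comes from the slide.

```python
def value_range(start, stop, step):
    """Return every value from start to stop (inclusive) in increments of step."""
    return list(range(start, stop + 1, step))


# Hypothetical stem: the slide gives only the definition of the variable X.
STEM = "What is {x} divided by 3?"


def generate_stems(values):
    """Instantiate the stem once per value of the variable."""
    return [STEM.format(x=v) for v in values]


xs = value_range(3, 18, 3)
stems = generate_stems(xs)
```

Each generated stem tests the same construct with a different surface value, which is the kind of variability that is easy to define algorithmically in mathematics and harder to define for factual knowledge.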


The QTI item generation process


QTI Item template

IMS Question & Test Interoperability Specification

XML serialization using JSON templates

<choiceInteraction responseIdentifier="RESPONSE" shuffle="false" maxChoices="1">
  <prompt>What is the capital of {prompt}?</prompt>
  <simpleChoice identifier="{responseCode1}">{responseOption1}</simpleChoice>
  <simpleChoice identifier="{responseCode2}">{responseOption2}</simpleChoice>
  <simpleChoice identifier="{responseCode3}">{responseOption3}</simpleChoice>
</choiceInteraction>

Get the knowledge from LoD

SELECT ?country ?capital
WHERE {
  ?c <http://dbpedia.org/property/commonName> ?country .
  ?c <http://dbpedia.org/property/capital> ?capital
}
LIMIT 30

SPARQL query to generate capitals in Europe

It is never possible to generate an item from a single triple, because appropriate labels must also be found for the resources involved.
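The query result has to be turned into labelled (country, capital) pairs before it can feed an item. The sketch below assumes the endpoint returns the standard SPARQL JSON results format with the variable names `country` and `capital` used in the query. The sample document is hand-written for illustration and is not real DBpedia output; the function name is hypothetical.

```python
import json


def pairs_from_sparql_json(raw):
    """Extract (country, capital) label pairs, skipping rows with an empty label."""
    doc = json.loads(raw)
    pairs = []
    for binding in doc["results"]["bindings"]:
        country = binding.get("country", {}).get("value", "").strip()
        capital = binding.get("capital", {}).get("value", "").strip()
        if country and capital:  # an item needs a usable label on both sides
            pairs.append((country, capital))
    return pairs


# Hand-written sample in SPARQL JSON results format (an assumption, not real data).
SAMPLE = """
{
  "head": {"vars": ["country", "capital"]},
  "results": {"bindings": [
    {"country": {"type": "literal", "value": "Azerbaijan"},
     "capital": {"type": "literal", "value": "Baku"}},
    {"country": {"type": "literal", "value": "Argentina"},
     "capital": {"type": "literal", "value": "Buenos Aires"}},
    {"country": {"type": "literal", "value": "  "},
     "capital": {"type": "literal", "value": "Baku"}}
  ]}
}
"""

pairs = pairs_from_sparql_json(SAMPLE)
```

Rows without a usable label are dropped here rather than repaired; whether to drop or repair them is a quality-control decision the slides leave open.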

Generating item distractors

i.e., incorrect answer options

Strategies

- Instances of the same class

⇒ Creation of a variable store

⇒ Random selection of distractors

Next step: attribute-based resource similarity (can be instances of a different class)

⇒ Use of a semantic recommender system
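The first strategy (instances of the same class, drawn at random from a variable store) can be sketched as follows. This is a simplified illustration, assuming the variable store is a plain list of labels for one class; the function name and the default of two distractors per item are assumptions, chosen to match a three-option template.

```python
import random


def pick_distractors(variable_store, key, count=2, rng=None):
    """Randomly pick `count` distinct incorrect options that differ from the key."""
    if rng is None:
        rng = random.Random()
    # Deduplicate labels and exclude the correct answer.
    candidates = sorted({label for label in variable_store if label != key})
    if len(candidates) < count:
        raise ValueError("not enough candidates in the variable store")
    return rng.sample(candidates, count)


# Labels taken from the examples in this presentation.
store = ["Baku", "Buenos Aires", "Mbabane", "Managua", "Baku"]
distractors = pick_distractors(store, key="Baku", count=2, rng=random.Random(0))
```

Passing a seeded `random.Random` makes a generated item reproducible. Random selection does not check that a distractor is plausible or non-defective, which is what the similarity-based next step is meant to address.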

Item data dictionary

Generation of the QTI-XML item
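A minimal sketch of this step, assuming the item data dictionary is a flat mapping from the placeholder names of the choice template to values. The substitution mechanism shown here (plain string replacement with XML escaping) is an assumption for illustration and not necessarily how the original tool works; `render_item` is a hypothetical name.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The choice template from the presentation, with {placeholders} for variables.
TEMPLATE = (
    '<choiceInteraction responseIdentifier="RESPONSE" shuffle="false" maxChoices="1">'
    "<prompt>What is the capital of {prompt}?</prompt>"
    '<simpleChoice identifier="{responseCode1}">{responseOption1}</simpleChoice>'
    '<simpleChoice identifier="{responseCode2}">{responseOption2}</simpleChoice>'
    '<simpleChoice identifier="{responseCode3}">{responseOption3}</simpleChoice>'
    "</choiceInteraction>"
)


def render_item(template, data):
    """Replace each {name} placeholder with its XML-escaped value."""
    xml = template
    for name, value in data.items():
        # Escape &, <, > and double quotes so labels cannot break the markup.
        xml = xml.replace("{" + name + "}", escape(str(value), {'"': "&quot;"}))
    return xml


# Hypothetical data dictionary for one item.
data = {
    "prompt": "Azerbaijan",
    "responseCode1": "A", "responseOption1": "Baku",
    "responseCode2": "B", "responseOption2": "Buenos Aires",
    "responseCode3": "C", "responseOption3": "Mbabane",
}
item_xml = render_item(TEMPLATE, data)
root = ET.fromstring(item_xml)  # raises ParseError if the result is not well-formed
```

Escaping matters because labels come from uncontrolled data: a label containing `&` or `<` would otherwise produce an item that fails to import.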

Publication on the TAO platform

• TAO is an open source e-assessment platform based on semantic technologies.

Used for diagnostic, formative, and large-scale assessment, including national school monitoring, OECD PISA/PIAAC surveys, and competence assessment for the unemployed.

• Supports import of IMS-QTI items


Different types of questions

Q1: queries uncontrolled datasets

Q2: queries revised ontology

Q3: queries historical information

Q4: queries a linked data set to add item feedback

Q5: queries medical information

Q1: What is the capital of { Azerbaijan }?

• Infobox dataset

• 3 items were generated for entities that are not countries (Neuenburg am Rhein, Wain, and Offenburg)

• Defective label: “Managua right|20px”

• Two distinct capitals were found for Swaziland (Mbabane, the administrative capital, and Lobamba, the royal and legislative capital)

Q2: Which country is represented by this flag?

• Use of FOAF and YAGO

• Transitive closures

<http://dbpedia.org/class/yago/EuropeanCountries>
<http://dbpedia.org/class/yago/Country108544813>

• 6/30 URIs did not resolve to a usable picture (HTTP 404 errors or encoding problems).

Q3: Who succeeded { Charles VII the Victorious } as ruler of France?

• YAGO ontology

• 1 item was incorrect (The Three Musketeers)

• Multiple labels for the same king

Louis IX, Saint Louis, Saint Louis IX

• One item was generated with inconsistently named options:

Charles VII the Victorious, Charles 09 Of France, Louis VII

Q4: What is the capital of { Argentina }? With feedback

• Uses the linkage of the DBpedia dataset with the Flickr wrapper dataset

• The Flickr wrapper data source was unavailable

• No IPR information

Q5: Which category does { Asthma } belong to?

• Retrieves diseases and their categories

• SKOS and Dublin Core, Infobox dataset for labels

• SKOS concepts are not related to a specific SKOS scheme

• Categories of diseases range from Skeletal disorders to Childhood ⇒ the correct answer to the question on Obesity is Childhood.

Data quality challenges

For Q1, 53.33% of the items were directly usable:

they had neither a defective prompt, nor a defective correct answer, nor a defective distractor.

Benchmark from unstructured content: between 3.5% and 21%.

Issues

• Ontology issues

• Labels

• Inaccurate statements

• Data linkage (resolvable URIs)

• Missing inferences

Chance that an item will have a defective distractor =
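The expression itself is not reproduced in this text. As an illustration only, and not as the presenter's formula, one simple way to estimate this chance is to assume that distractors are drawn uniformly at random, without replacement, from a variable store in which a known number of entries are defective. All names and parameters below are assumptions.

```python
from math import comb


def p_defective_distractor(store_size, defective, drawn):
    """Probability that at least one of `drawn` randomly chosen distractors is defective.

    Assumes a uniform draw without replacement from `store_size` candidates,
    of which `defective` are unusable (hypergeometric model).
    """
    if not 0 <= defective <= store_size:
        raise ValueError("defective must be between 0 and store_size")
    if not 0 <= drawn <= store_size:
        raise ValueError("drawn must be between 0 and store_size")
    # 1 minus the probability that every drawn distractor is non-defective.
    return 1 - comb(store_size - defective, drawn) / comb(store_size, drawn)


# Example: 4 candidates, 1 defective, 2 distractors per item.
p = p_defective_distractor(4, 1, 2)
```

Under this model, the chance grows with both the share of defective entries and the number of distractors per item, which is why the selection of data for the variable store matters.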

Data selection

Item difficulty

- Can change even with variables not related to the construct (cognitive issues)

- Can change according to the distractors

- ⇒ Need to establish a framework to assess the difficulty of the construct AND of the item in general (including, for instance, the relevance of the distractors)

- Psychometric model: what do we know about previous test takers? What can we infer from their performance?

- Ad hoc model: can an a priori difficulty assessment be performed or inferred?

Future work

• Assessing models on Linked Open Data as a source of knowledge for supporting formative assessment and the learning process

• Improving the selection of distractors by integrating a dedicated similarity approach (from a semantic recommender system)

• A wider variety of assessment item models

• A framework to assess the difficulty of items

• Authoring interface for item templates
