WHAT’S NEXT? TARGET CONCEPT IDENTIFICATION AND SEQUENCING LEE BECKER 1 , RODNEY NIELSEN 1,2 , IFEYINWA OKOYE 1 , TAMARA SUMNER 1 AND WAYNE WARD 1,2 1 Center for Computational Language and EducAtion Research (CLEAR) University of Colorado at Boulder 2 Boulder Language Technologies 2010.06.18
WHAT’S NEXT? TARGET CONCEPT
IDENTIFICATION AND SEQUENCING
LEE BECKER1, RODNEY NIELSEN1,2, IFEYINWA OKOYE1,
TAMARA SUMNER1 AND WAYNE WARD1,2
1 Center for Computational Language and EducAtion Research (CLEAR), University of Colorado at Boulder
2 Boulder Language Technologies
2010.06.18
Goals:
Introduce Target Concept Identification (TCI)
  Potentially the most important QG-related task
  Encourage discussion related to TCI
Define a TCI-based shared task
  Illustrate viability via baseline and straw man systems
Challenge the QG community to consider TCI
Overview
Define the Target Concept Identification and Sequencing tasks
Describe component and baseline systems
Discuss the utility of these subtasks in the context of the full Question Generation task
Final Thoughts
QG as a Dialogue Process
Question Generation is much more than surface form realization
  Depends not only on the text or knowledge source
  Also depends on the context of all previous
Key Concept Identification: CLICK: Building a gold standard concept map
Experts asked to extract and potentially paraphrase spans of text (concepts) from each resource
  Concept 19: Mantle convection is the process that carries heat from the core and up to the crust and drives the plumes of magma that come up to the surface and makes islands like Hawaii.
  Concept 21: asthenosphere is hot, soft, flowing rock
  Concept 176: The Theory of Plate tectonics
  Concept 224: a plate is a large, rigid slab of solid rock
Key Concept Identification: CLICK: Building a gold standard concept map
Experts linked and labeled concepts (i.e., built a map) for each of the 20 resources
Open-ended label vocabulary
  Discourse-style relations: elaborates, cause, defines, evidence, etc.
  Domain-specific relations: technique, type of, indicates, etc.
10 most frequent labels account for 64% of labels
Key Concept Identification: CLICK: Building a gold standard concept map
Experts individually combined 20 resource maps to span the whole domain
Experts collaboratively combined their individual resource maps to create a final concept map
elaborates… Domain-specific labels: technique, type of, slower than
Vocabulary unspecified
10 most frequent labels account for 64% of the links
With some refinement could use RST or Penn Discourse labels to create gold standard
Next steps
  Create more reliable link classifier
  Develop a link relation classifier
Key Concept Identification: Graph Analysis
Given a concept map (graph), identify the key or central concepts (versus supporting concepts)
Approach: Graph analysis using PageRank + HITS algorithm
  Key concepts are the intersection of:
    Concepts selected by PageRank + HITS
    Concepts with the highest ratio of incoming vs. outgoing links
    Concepts with the highest term density
Evaluation:
  No gold standard set of core concepts
  Experts asked to identify subtopic regions on concept map
    Earthquake types, Tsunamis, theory of continental drift…
  80% core concept coverage of 25 subtopics
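A minimal sketch of the graph-analysis idea, assuming the concept map is available as a list of directed (source, target) links. The toy links, the `k` cutoff, the damping factor, and the +1 smoothing in the link ratio are all illustrative assumptions; the slides do not say how scores were thresholded, and term density is left out here.

```python
from collections import defaultdict

def pagerank(nodes, edges, damping=0.85, iters=50):
    """Power-iteration PageRank over a directed edge list."""
    out = defaultdict(list)
    for src, dst in edges:
        out[src].append(dst)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Rank held by nodes without outgoing links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out[v])
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            for dst in out[src]:
                new[dst] += damping * rank[src] / len(out[src])
        rank = new
    return rank

def hits_authority(nodes, edges, iters=50):
    """HITS authority scores (hub scores are computed but not returned)."""
    hub = {v: 1.0 for v in nodes}
    auth = {v: 1.0 for v in nodes}
    for _ in range(iters):
        auth = {v: 0.0 for v in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        norm = sum(a * a for a in auth.values()) ** 0.5 or 1.0
        auth = {v: a / norm for v, a in auth.items()}
        hub = {v: 0.0 for v in nodes}
        for src, dst in edges:
            hub[src] += auth[dst]
        norm = sum(h * h for h in hub.values()) ** 0.5 or 1.0
        hub = {v: h / norm for v, h in hub.items()}
    return auth

def in_out_ratio(nodes, edges):
    """Incoming vs. outgoing link ratio; +1 avoids division by zero (assumption)."""
    indeg, outdeg = defaultdict(int), defaultdict(int)
    for src, dst in edges:
        outdeg[src] += 1
        indeg[dst] += 1
    return {v: indeg[v] / (outdeg[v] + 1) for v in nodes}

def top_k(scores, k):
    return set(sorted(scores, key=scores.get, reverse=True)[:k])

def key_concepts(nodes, edges, k=1):
    """Intersect the top-k concepts under each criterion."""
    return (top_k(pagerank(nodes, edges), k)
            & top_k(hits_authority(nodes, edges), k)
            & top_k(in_out_ratio(nodes, edges), k))

# Hypothetical links, loosely named after the example concepts.
nodes = ["theory of plate tectonics", "plate", "asthenosphere", "mantle convection"]
edges = [("plate", "theory of plate tectonics"),
         ("asthenosphere", "theory of plate tectonics"),
         ("mantle convection", "theory of plate tectonics"),
         ("mantle convection", "plate")]
print(key_concepts(nodes, edges))  # {'theory of plate tectonics'}
```

Taking an intersection keeps only concepts that every criterion ranks highly, so a larger `k` trades precision for coverage.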
Concept Sequencing
Goal: Create a directed acyclic graph that represents the logical order in which concepts should be introduced in a lesson or tutorial dialogue (with respect to a pedagogy)
Partial Ordering Example:
1. Pitch represents the perceived fundamental frequency of a sound.
2. A shorter string produces a higher pitch.
3. A tighter string produces a higher pitch.
4. A discussion of the difference in pitch across each of the strings of a violin and a cello.
[Figure: partial-order graph over concepts 1, 2, 3 and 4]
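A partial ordering like this one can be held as a directed acyclic graph and read off level by level. The sketch below uses Python's standard `graphlib` module (Python 3.9+). The edges are an assumption, since the transcript does not preserve the arrows of the diagram: concept 1 is taken to precede 2 and 3, which both precede 4.

```python
from graphlib import TopologicalSorter

# Each concept maps to the set of concepts that must come first (assumed edges).
predecessors = {1: set(), 2: {1}, 3: {1}, 4: {2, 3}}

def levels(predecessors):
    """Group concepts into levels; concepts within a level are mutually unordered."""
    sorter = TopologicalSorter(predecessors)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a loop
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result

print(levels(predecessors))  # [[1], [2, 3], [4]]
```

Concepts 2 and 3 land in the same level because neither constrains the other, which is what distinguishes a partial order from a single linear sequence.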
Concept Sequencing: Straw Man Approach
Aim: Show the viability of a concept sequencing task
Intuition: Concepts that should precede other concepts will exhibit this behavior across the corpus of digital library resources
Issues:
  Concepts may not appear in their entirety in a document
  Aspects of concepts may show up earlier than the concept as a whole
Approach: Treat concept to document alignment as an information retrieval task
Concept Sequencing: Implementation
Indexed the original 20 CLICK resources at the sentence level using Lucene (Standard Analyzer, similarity score threshold = 0.26)
Concepts are queries
A concept’s position in a resource is the sentence number of the earliest matching sentence
[Figure: Concept A matched as a query against the numbered sentences 1 to 6 of a resource]
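The "earliest matching sentence" rule can be sketched without a search engine. The example below swaps Lucene's scoring for a plain Jaccard word-overlap score, so its `threshold` value is a hypothetical stand-in and is not comparable to the 0.26 Lucene threshold; the function names and the sample resource are also illustrative.

```python
import re

def tokens(text):
    """Lowercased set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def similarity(concept, sentence):
    """Jaccard overlap of word sets; a simple stand-in for Lucene's scoring."""
    a, b = tokens(concept), tokens(sentence)
    return len(a & b) / len(a | b) if a | b else 0.0

def concept_position(concept, sentences, threshold=0.3):
    """Return the 1-based number of the earliest matching sentence, or None."""
    for number, sentence in enumerate(sentences, start=1):
        if similarity(concept, sentence) >= threshold:
            return number
    return None

# Hypothetical three-sentence resource.
resource = [
    "The Earth has several layers.",
    "A plate is a large, rigid slab of solid rock.",
    "Plates move slowly over the asthenosphere.",
]
print(concept_position("a plate is a rigid slab of rock", resource))  # 2
print(concept_position("tsunami waves", resource))                    # None
```

Returning `None` for an unmatched concept makes it easy to leave that concept out of later pairwise comparisons for the resource.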
With concept positions identified and tabulated, compute pairwise comparisons between all concepts’ sentence numbers
If a concept does not appear in a resource, exclude it from that resource's comparisons
Concepts with an identical number of predecessors are considered to be at the same level
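Pairwise precedence counts (row concept precedes column concept; "-" marks an empty cell, "X" marks a pair excluded because a concept does not appear in the resource)

Resource 1
Precedes   A   B   C
A          -   1   1
B          -   -   1
C          -   -   -

Resource 2
Precedes   A   B   C
A          -   1   X
B          -   -   X
C          -   -   -

Resource 3
Precedes   A   B   C
A          -   -   1
B          1   -   1
C          -   -   -

Total
Precedes   A   B   C
A          -   2   1
B          1   -   2
C          -   -   -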
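The tabulation and leveling steps can be sketched as follows. The sentence positions are hypothetical, and the rule used to decide that one concept is a predecessor of another (it precedes more often than it follows) is an assumption; the slides only state that concepts with the same number of predecessors share a level.

```python
from collections import Counter
from itertools import permutations

def precedence_counts(positions_by_resource):
    """Count, over all resources, how often concept x appears before concept y."""
    counts = Counter()
    for positions in positions_by_resource:
        # Concepts missing from a resource (position None) are left out.
        present = {c: p for c, p in positions.items() if p is not None}
        for x, y in permutations(present, 2):
            if present[x] < present[y]:
                counts[(x, y)] += 1
    return counts

def assign_levels(counts, concepts):
    """Place concepts with the same number of predecessors at the same level."""
    by_level = {}
    for c in concepts:
        # Assumption: x is a predecessor of c if x precedes c more often than c precedes x.
        n_pred = sum(1 for x in concepts
                     if x != c and counts[(x, c)] > counts[(c, x)])
        by_level.setdefault(n_pred, []).append(c)
    return [by_level[k] for k in sorted(by_level)]

# Hypothetical sentence positions of three concepts in three resources.
positions_by_resource = [
    {"c1": 2, "c2": 5, "c3": 9},
    {"c1": 1, "c2": 4, "c3": None},   # c3 does not appear in this resource
    {"c1": 6, "c2": 3, "c3": 8},
]
counts = precedence_counts(positions_by_resource)
print(counts[("c1", "c2")], counts[("c2", "c1")])      # 2 1
print(assign_levels(counts, ["c1", "c2", "c3"]))       # [['c1'], ['c2'], ['c3']]
```

Because every concept receives a level, this kind of ordering is all-inclusive: each concept ends up ordered after everything in the levels above it.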
Concept Sequencing: Results
[Figure: Concept Sequencing System Output]
Concept Sequencing: Evaluation Data
Currently no canonical concept sequence for CLICK data
Instead derived gold-standard evaluation data using a set of expert-provided remediation strategies for individual students' essays

Remediation Strategy (rows follow the Remediation Order)
Student Essay Sentence Number    Concept Number
21, 23                           85, 88, 92, 94, 176
1, 3                             210, 215, 217, 53, 55, 57, 58
24, 26                           444, 324, 342, 360
19, 31                           94, 95, 96, 138
42, 44, 45, 46                   610, 615, 613, 616, 618, 627
Concept Sequencing: Evaluation Data
Of 55 key concepts
  14 did not occur in any of the remediation strategies
  41 left to define concept sequence evaluation
Used frequency of precedence across remediations to create a first-pass concept sequence
Manually removed loops and errant orderings
Concept Sequencing: Evaluation Data
[Figure: Gold-standard Evaluation Sequence]
Concept Sequencing: Evaluation
F1-Measure
  Average Instance Recall (IR) over all gold-standard key concepts that have predecessors
  Average Instance Precision (IP) over all of the non-initial system-output concepts that are aligned to gold-standard key concepts
  G_i: all predecessors of the i-th gold-standard key concept
  O_j: all predecessors of the j-th system-output concept

\[ R = \frac{1}{h} \sum_{i=1}^{h} IR_i = \frac{1}{h} \sum_{i=1}^{h} \frac{|G_i \cap O_i|}{|G_i|} \]

\[ P = \frac{1}{l} \sum_{j=1}^{l} IP_j = \frac{1}{l} \sum_{j=1}^{l} \frac{|O_j \cap G_j|}{|O_j|} \]
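The averages above can be computed directly from predecessor sets. In this sketch both sequences are dictionaries mapping a concept to the set of all its predecessors, and system concepts are assumed to be aligned to gold concepts by sharing the same key. F1 is taken to be the usual harmonic mean of P and R, which the slides do not spell out. The toy data is hypothetical.

```python
def average_ratio(pairs):
    """Mean of |A ∩ B| / |A| over (A, B) set pairs whose A is non-empty."""
    ratios = [len(a & b) / len(a) for a, b in pairs if a]
    return sum(ratios) / len(ratios) if ratios else 0.0

def sequencing_scores(gold, system):
    """Return (precision, recall, f1) for predecessor-set dictionaries."""
    # Recall: averaged over gold concepts that have predecessors.
    recall = average_ratio((g, system.get(c, set())) for c, g in gold.items())
    # Precision: averaged over non-initial system concepts aligned to gold concepts.
    precision = average_ratio((o, gold[c]) for c, o in system.items() if c in gold)
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical gold standard: "b" and "c" are unordered with respect to each other.
gold = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"a", "b", "c"}}
# Hypothetical system output: a strictly linear sequence a, b, c, d.
system = {"a": set(), "b": {"a"}, "c": {"a", "b"}, "d": {"a", "b", "c"}}

precision, recall, f1 = sequencing_scores(gold, system)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.833 1.0 0.909
```

In the toy data the linear system output recovers every gold predecessor (recall 1.0) but adds an ordering between "b" and "c" that the gold standard does not require, which lowers precision.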
Concept Sequencing: Results and Discussion
F1=0.526 (P=0.383, R=0.726)
Gold-standard
  Multiple initial nodes
System output
  One single initial node
  Linear hierarchies
  All nodes with same number of predecessors at the same level
  All-inclusive ordering favors recall
Future Work
  Utilize pairwise data to produce less densely packed graphs
  More sophisticated measures of semantic similarity
  Make use of concept map link relationships (cause, define…)
  Conduct expert studies to get gold-standard sequences and concepts
Tutorial Dialogue and Question Realization
Dialogue-based ITS
  Labor intensive
  Effort centers on authoring of dialogue content and flow
  Design of dialogue states non-trivial
Tutorial Dialogue and Question Realization
So what does Target Concept Identification buy us?
  Critical steps towards more automated ITS
TCI Mappings to Dialogue Management
  Key Concepts = States or Frames
  Concept Sequence = Default Dialogue Management Strategy
Tutorial Dialogue and Question Realization
Example:
  Concept 486: an earthquake is the sudden slip of part of the Earth’s crust...
  Concept 561: …When the stress in a particular location is great enough... an earthquake begins
  [Figure: the two concepts joined by a "Caused-by" link]
Suppose student has stated a paraphrase of 486
ITS can produce: Now that you have defined what an earthquake is, can you explain what causes them?
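One way to picture the step from concept-map link to question is a template lookup keyed by the link label. Everything in this sketch is hypothetical: the `Link` type, the `TEMPLATES` table, the `next_question` function, and the assumed direction of the "caused-by" link (from concept 486 to concept 561). It only illustrates how a labeled link plus the set of concepts the student has covered could select the next question.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Link:
    source: int   # concept the student has already covered
    label: str    # relation label from the concept map
    target: int   # concept to ask about next

# Hypothetical question templates, one per link label.
TEMPLATES = {
    "caused-by": "Now that you have defined what {topic} is, "
                 "can you explain what causes them?",
}

def next_question(covered, links, topics):
    """Return (target concept, question) for the first usable link, else None."""
    for link in links:
        usable = (link.source in covered
                  and link.target not in covered
                  and link.label in TEMPLATES)
        if usable:
            question = TEMPLATES[link.label].format(topic=topics[link.source])
            return link.target, question
    return None

links = [Link(486, "caused-by", 561)]
topics = {486: "an earthquake"}
print(next_question({486}, links, topics))
```

Once the student has also covered concept 561, the same call returns `None`, and a default strategy such as the concept sequence could pick the next target.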
Final Thoughts
Defined Target Concept Identification
Baseline and past results suggest feasibility of TCI subtasks
Challenge the QG community to continue to think of QG as the product of several tasks including TCI
Acknowledgements
Advisers and colleagues at:
  The University of Colorado at Boulder
  The Center for Computational Language and EducAtion Research (CLEAR)
  Boulder Language Technologies
Support from:
  The National Science Foundation. NSF (DRL-0733322, DRL-0733323, DRL-0835393, IIS-0537194)
  The Institute of Educational Sciences. IES (R3053070434).
Any findings, recommendations, or conclusions are those of the author and do not necessarily represent the views of NSF or IES.