WHAT’S NEXT? TARGET CONCEPT IDENTIFICATION AND SEQUENCING
LEE BECKER (1), RODNEY NIELSEN (1,2), IFEYINWA OKOYE (1), TAMARA SUMNER (1) AND WAYNE WARD (1,2)
(1) Center for Computational Language and EducAtion Research (CLEAR), University of Colorado at Boulder
(2) Boulder Language Technologies
2010.06.18
Goals:
• Introduce Target Concept Identification (TCI)
  - Potentially the most important QG-related task
• Encourage discussion related to TCI
  - Define a TCI-based shared task
  - Illustrate viability via baseline and straw man systems
• Challenge the QG community to consider TCI
Overview
• Define the Target Concept Identification and Sequencing tasks
• Describe component and baseline systems
• Discuss the utility of these subtasks in the context of the full Question Generation task
• Final Thoughts
QG as a Dialogue Process
• Question Generation
  - is much more than surface form realization
  - depends not only on the text or knowledge source
  - also depends on the context of all previous interactions
The Stages of Question Generation
1. Target Concept Identification: What to talk about next?
   Example: direction of flow, or series circuits
2. Question Type Determination: How to ask it?
   • Definition Question
   • Prediction Question
   • Hypothesis Question
3. Question Realization: Final natural language output
   Example: "What will happen to the flow of electricity if you flip the battery around?"
Target Concept Identification
• Out of the limitless number of concepts related to the current dialogue, which one should be used to construct the question?
• Inputs:
  - Knowledge sources
  - Dialogue Context / Interaction History
• Output:
  - The next target concept
Subtasks
• Key Concept Identification
• Concept Relation Identification and Classification
• Concept Sequencing
Key Concept Identification
• Goal: Extract important concepts from a knowledge source (plain text, structured databases, etc.)
• Want not just the concepts, but the concepts most critical to learning
• Preferably identify core versus supporting concepts
Key Concept Identification:
• CLICK - Customized Learning Service for Concept Knowledge [Gu, et al. 2008]
  - Personalized learning system
  - Utilizes Key Concept Identification to:
    - Assess learner’s work
    - Recommend digital library resources to help learner remedy diagnosed deficiencies
  - Driven by concept maps
    - Expert concept map
    - Automatically derived concept maps
Key Concept Identification:
CLICK: Building a gold standard concept map

• Source data
  - 20 digital library resources
  - Textbook-like web text
  - Collectively considered to contain all the information a high school graduate should know about earthquakes and plate tectonics

Key Concept Identification:
CLICK: Building a gold standard concept map

• Experts asked to extract and potentially paraphrase spans of text (concepts) from each resource
  - Concept 19: Mantle convection is the process that carries heat from the core and up to the crust and drives the plumes of magma that come up to the surface and makes islands like Hawaii.
  - Concept 21: asthenosphere is hot, soft, flowing rock
  - Concept 176: The Theory of Plate tectonics
  - Concept 224: a plate is a large, rigid slab of solid rock
Key Concept Identification:
CLICK: Building a gold standard concept map

• Experts linked and labeled concepts (i.e. built a map) for each of the 20 resources
• Open-ended label vocabulary
  - Discourse-style relations: elaborates, cause, defines, evidence, etc.
  - Domain-specific relations: technique, type of, indicates, etc.
  - 10 most frequent labels account for 64% of labels
Key Concept Identification:
CLICK: Building a gold standard concept map


• Experts individually combined 20 resource maps to span the whole domain
• Experts collaboratively combined their individual resource maps to create a final concept map
Key Concept Identification:
CLICK: Automated Approach
1. Start from digital library resources (webtext)
2. Extract concepts from texts using multidocument summarization [De la Chica et al. 2008]
3. Identify and classify links between extracted concepts to create a network (concept map) [De la Chica et al. 2008]
4. Discover central or key concepts (versus supporting concepts) through graph analysis [Ahmad et al. 2008]
Key Concept Identification:
Concept Extraction

• COGENT System [De la Chica 2008]
  - MEAD [Radev et al. 2004] – Multi-document summarizer
  - Supplemented with additional features to tie into educational goals
  - Run on 20 digital library resources used to construct expert concept map
  - Extracted concepts evaluated against expert map concepts
    - ROUGE-L: F-Measure 0.6001
    - Cosine Similarity: 0.8325
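To make the two evaluation measures concrete, here is a minimal sketch of a ROUGE-L F-measure (based on the longest common subsequence) and a cosine similarity over raw term counts. It assumes whitespace tokenization and a balanced F-measure; the actual COGENT evaluation settings are not given on the slide, and the function names are illustrative.

```python
import math
from collections import Counter


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f(candidate, reference):
    """ROUGE-L F-measure (balanced, an assumption) over whitespace tokens."""
    c, r = candidate.lower().split(), reference.lower().split()
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)


def cosine_similarity(candidate, reference):
    """Cosine similarity between raw term-frequency vectors."""
    c, r = Counter(candidate.lower().split()), Counter(reference.lower().split())
    dot = sum(c[t] * r[t] for t in c)
    norm = math.sqrt(sum(v * v for v in c.values())) * math.sqrt(sum(v * v for v in r.values()))
    return dot / norm if norm else 0.0


# Expert concept (Concept 224, punctuation dropped) vs. a hypothetical extracted span.
expert = "a plate is a large rigid slab of solid rock"
extracted = "a plate is a rigid slab of rock"
print(rouge_l_f(extracted, expert), cosine_similarity(extracted, expert))
```

Both scores reward overlap with the expert concept, but ROUGE-L is sensitive to word order while the cosine measure is not, which is one reason to report them side by side.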
Key Concept Identification:
Concept Relation ID and Classification

• Concept Relation Identification
  - AKA Link Identification
  - Given two concepts, determine if they should be linked
• Concept Relation Classification
  - AKA Link Classification
  - Given a linked pair of concepts, assign a label describing their relationship
• This information can be useful both for concept sequencing and question realization
• Can potentially comprise a separate task
Key Concept Identification:
Concept Relation Identification


• Given two concepts, determine if they should be linked
• Approach [De la Chica et al. 2008]:
  - SVM-based classifier
  - Lexical, syntactic, semantic, and document structure features
• Performance
  - P = 0.2061
  - R = 0.0153
• Data set is extremely unbalanced
  - Majority classification (no-link) overwhelmingly dominates
• A good starting point for a challenging task worthy of further investigation
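The effect of class imbalance on these numbers can be illustrated with a small sketch. The data below is hypothetical (it is not the CLICK data): a classifier that almost always predicts "no-link" looks accurate overall while scoring poorly on precision and recall for the link class.

```python
def precision_recall(gold, predicted):
    """Precision and recall for the positive ("link") class.

    gold and predicted are equal-length sequences of booleans,
    one entry per candidate concept pair (True = linked).
    """
    tp = sum(g and p for g, p in zip(gold, predicted))
    fp = sum((not g) and p for g, p in zip(gold, predicted))
    fn = sum(g and (not p) for g, p in zip(gold, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data: 1000 candidate pairs, only 20 truly linked.
gold = [True] * 20 + [False] * 980
# A classifier that nearly always answers "no-link":
# it finds 1 real link and raises 3 false alarms.
predicted = [True] + [False] * 19 + [True] * 3 + [False] * 977

precision, recall = precision_recall(gold, predicted)
accuracy = sum(g == p for g, p in zip(gold, predicted)) / len(gold)
print(precision, recall, accuracy)
```

In this toy setting accuracy is close to 1 even though almost every real link is missed, which is why precision and recall on the link class are the more informative figures.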
Key Concept Identification:
Concept Relation Classification

• Towards a gold standard
  - Experts labeled links on concept maps [Ahmad et al. 2008]
    - Discourse-like labels: cause, evidence, defines, elaborates…
    - Domain-specific labels: technique, type of, slower than
    - Vocabulary unspecified
  - 10 most frequent labels account for 64% of the links
  - With some refinement, could use RST or Penn Discourse labels to create a gold standard
• Next steps
  - Create more reliable link classifier
  - Develop a link relation classifier

Key Concept Identification:
Graph Analysis


• Given a concept map (graph), identify the key or central concepts (versus supporting concepts)
• Approach:
  - Graph analysis using PageRank + HITS algorithm
  - Key concepts are the intersection of:
    - Concepts selected by PageRank + HITS
    - Concepts with the highest ratio of incoming vs. outgoing links
    - Concepts with the highest term density
• Evaluation:
  - No gold standard set of core concepts
  - Experts asked to identify subtopic regions on concept map
    - Earthquake types, Tsunamis, theory of continental drift…
  - 80% core concept coverage of 25 subtopics
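The intersection idea can be sketched in a few lines. This is a simplified illustration, not the system described above: it uses a plain power-iteration PageRank and the incoming/outgoing link ratio only, and it leaves out HITS and term density. The damping factor, the add-one smoothing, the `top_k` cutoff, and the toy concept map are all assumptions.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [t for s, t in edges if s == n] for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        new_rank = {n: (1 - damping + damping * dangling) / len(nodes) for n in nodes}
        for n in nodes:
            for target in out_links[n]:
                new_rank[target] += damping * rank[n] / len(out_links[n])
        rank = new_rank
    return rank


def in_out_ratio(edges):
    """Incoming-to-outgoing link ratio per node (add-one smoothed denominator)."""
    nodes = {n for edge in edges for n in edge}
    incoming = {n: sum(1 for _, t in edges if t == n) for n in nodes}
    outgoing = {n: sum(1 for s, _ in edges if s == n) for n in nodes}
    return {n: incoming[n] / (outgoing[n] + 1) for n in nodes}


def key_concepts(edges, top_k=2):
    """Intersect the top-k nodes under each criterion."""
    def top(scores):
        return set(sorted(scores, key=scores.get, reverse=True)[:top_k])
    return top(pagerank(edges)) & top(in_out_ratio(edges))


# Hypothetical concept map: supporting concepts point at "plate tectonics".
edges = [
    ("asthenosphere", "mantle convection"),
    ("mantle convection", "plate tectonics"),
    ("plate", "plate tectonics"),
    ("continental drift", "plate tectonics"),
]
print(key_concepts(edges, top_k=1))
```

Requiring agreement between several criteria trades coverage for precision: a concept must look central under every measure before it is treated as key.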
Concept Sequencing



• Goal: Create a directed acyclic graph that represents the logical order in which concepts should be introduced in a lesson or tutorial dialogue (with respect to a pedagogy)
• Partial ordering
• Example:
  1. Pitch represents the perceived fundamental frequency of a sound.
  2. A shorter string produces a higher pitch.
  3. A tighter string produces a higher pitch.
  4. A discussion of the difference in pitch across each of the strings of a violin and a cello.

[Figure: partial-ordering graph over concepts 1, 2, 3, and 4]
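A partial ordering like the one in this example can be represented as a prerequisite map and flattened into levels. The sketch below uses Python's standard `graphlib`; the specific edges (1 before 2 and 3, both before 4) are an assumed reading of the example, since the figure itself is not reproduced here.

```python
from graphlib import TopologicalSorter

# Each concept maps to the set of concepts that must come before it.
# These edges are an assumed reading of the example, not taken from the figure.
prerequisites = {
    1: set(),
    2: {1},
    3: {1},
    4: {2, 3},
}


def levels(prerequisites):
    """Group concepts into successive levels of a partial order."""
    sorter = TopologicalSorter(prerequisites)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a loop
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result


print(levels(prerequisites))  # [[1], [2, 3], [4]]
```

Concepts 2 and 3 land on the same level because neither depends on the other, which is what makes the ordering partial rather than total.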
Concept Sequencing:
Straw Man Approach



• Aim: Show the viability of a concept sequencing task
• Intuition: Concepts that should precede other concepts will exhibit this behavior across the corpus of digital library resources
• Issues:
  - Concepts may not appear in their entirety in a document
  - Aspects of concepts may show up earlier than the concept as a whole
• Approach: Treat concept-to-document alignment as an information retrieval task
Concept Sequencing:
Implementation



• Indexed the original 20 CLICK resources at the sentence level using Lucene (Standard Analyzer, similarity score threshold = 0.26)
• Concepts are queries
• A concept’s position in a resource is the sentence number of the earliest matching sentence
[Figure: Concepts A, B, and C aligned to numbered sentences (1-6) in Resource 1, Resource 2, and Resource 3]
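The alignment step can be sketched without a search engine. The code below is a stand-in, not the Lucene setup described above: it scores each sentence with a cosine similarity over term counts and returns the earliest sentence at or above a threshold. The 0.26 value is borrowed from the slide, but scores from this stand-in are not comparable to Lucene's, and the sample resource is hypothetical.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercased alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def similarity(query, sentence):
    """Cosine similarity of term counts; a stand-in for Lucene's scoring."""
    q, s = Counter(tokens(query)), Counter(tokens(sentence))
    dot = sum(q[t] * s[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in s.values()))
    return dot / norm if norm else 0.0


def concept_position(concept, sentences, threshold=0.26):
    """1-based number of the earliest sentence matching the concept, or None."""
    for number, sentence in enumerate(sentences, start=1):
        if similarity(concept, sentence) >= threshold:
            return number
    return None


# Hypothetical resource, already split into sentences.
resource = [
    "The Earth is made of several layers.",
    "The asthenosphere is hot, soft, flowing rock.",
    "A plate is a large, rigid slab of solid rock.",
]
print(concept_position("asthenosphere is hot, soft, flowing rock", resource))
print(concept_position("a plate is a large, rigid slab of solid rock", resource))
```

Returning `None` for a concept that never clears the threshold matters downstream, because such concepts are excluded from the pairwise comparisons for that resource.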
Concept Sequencing:
Implementation
[Figure: pairwise "Precedes" matrices over Concepts A, B, and C for Resource 1, Resource 2, and Resource 3, and their Total]
• With concept positions identified and tabulated, compute pairwise comparisons between all concepts’ sentence numbers
• If a concept does not appear in a resource, do not include it in the comparison
• Concepts with an identical number of predecessors are considered to be at the same level
[Figure: resulting "Precedes" matrix and levels for Concepts A, B, and C]
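The tabulation and leveling steps can be sketched as follows. The positions are hypothetical, and one detail is an assumption rather than something stated on the slides: when two concepts precede each other in different resources, the direction seen more often wins, and ties add no predecessor.

```python
from itertools import combinations

# Hypothetical sentence positions per resource; None = concept not found.
positions = {
    "Resource 1": {"A": 1, "B": 3, "C": 5},
    "Resource 2": {"A": 2, "B": 4, "C": 3},
    "Resource 3": {"A": 1, "B": None, "C": 6},
}


def precedence_counts(positions):
    """Count, over all resources, how often each concept precedes another."""
    counts = {}
    for found in positions.values():
        # Concepts missing from a resource are left out of its comparisons.
        present = {c: p for c, p in found.items() if p is not None}
        for a, b in combinations(sorted(present), 2):
            if present[a] < present[b]:
                counts[(a, b)] = counts.get((a, b), 0) + 1
            elif present[b] < present[a]:
                counts[(b, a)] = counts.get((b, a), 0) + 1
    return counts


def levels(positions):
    """Group concepts by number of predecessors (majority direction wins)."""
    counts = precedence_counts(positions)
    concepts = sorted({c for found in positions.values() for c in found})
    predecessors = {c: set() for c in concepts}
    for (a, b), n in counts.items():
        if n > counts.get((b, a), 0):
            predecessors[b].add(a)
    grouped = {}
    for c in concepts:
        grouped.setdefault(len(predecessors[c]), []).append(c)
    return [grouped[k] for k in sorted(grouped)]


print(levels(positions))  # [['A'], ['B', 'C']]
```

Here B and C each have A as their only predecessor, so they share a level; this is the behavior that makes the straw man's output densely packed.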
Concept Sequencing
Results
[Figure: Concept Sequencing System Output]
Concept Sequencing
Evaluation Data

Remediation Strategy (rows in remediation order)

Student Essay Sentence Number | Concept Number
21, 23                        | 85, 88, 92, 94, 176
1, 3                          | 210, 215, 217, 53, 55, 57, 58
24, 26                        | 444, 324, 342, 360
19, 31                        | 94, 95, 96, 138
42, 44, 45, 46                | 610, 615, 613, 616, 618, 627

• Currently no canonical concept sequence for CLICK data
• Instead, derived gold-standard evaluation data from a set of expert-provided remediation strategies for individual student essays
Concept Sequencing
Evaluation Data

• Of 55 key concepts
  - 14 did not occur in any of the remediation strategies
  - 41 left to define concept sequence evaluation
• Used frequency of precedence across remediations to create a first-pass concept sequence
• Manually removed loops and errant orderings
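A first pass of this kind, plus a check for loops that would need manual attention, might look like the sketch below. The representation (each strategy as an ordered list of steps, each step a list of concept numbers), the `min_count` cutoff, and the sample strategies are assumptions for illustration; the loop removal itself was done by hand and is not automated here.

```python
from collections import Counter
from graphlib import CycleError, TopologicalSorter


def precedence_frequency(strategies):
    """Count how often concept a is remediated in an earlier step than b."""
    counts = Counter()
    for steps in strategies:
        for i, earlier in enumerate(steps):
            for later in steps[i + 1:]:
                counts.update((a, b) for a in earlier for b in later if a != b)
    return counts


def first_pass_edges(counts, min_count=1):
    """Keep a -> b when seen at least min_count times and more often than b -> a."""
    return {(a, b) for (a, b), n in counts.items()
            if n >= min_count and n > counts[(b, a)]}


def find_loop(edges):
    """Return one cycle as a list of nodes, or None if the graph is acyclic."""
    graph = {}
    for a, b in edges:
        graph.setdefault(b, set()).add(a)
    try:
        TopologicalSorter(graph).prepare()
    except CycleError as error:
        return error.args[1]
    return None


# Hypothetical remediation strategies for three essays.
strategies = [
    [[1, 2], [3]],
    [[3], [4]],
    [[4], [1]],
]
edges = first_pass_edges(precedence_frequency(strategies))
print(find_loop(edges))  # a cycle through concepts 1, 3, and 4
```

Flagging the cycle rather than breaking it automatically leaves the decision about which ordering is errant to an expert.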
Concept Sequencing
Evaluation Data
[Figure: Gold-standard Evaluation Sequence]
Concept Sequencing
Evaluation

• F1-Measure
  - Average Instance Recall (IR) over all h gold-standard key concepts that have predecessors
  - Average Instance Precision (IP) over all l non-initial system-output concepts that are aligned to gold-standard key concepts
  - G_i: all predecessors of the i-th gold-standard key concept
  - O_j: all predecessors of the j-th system-output concept

$$R = \frac{1}{h}\sum_{i=1}^{h} IR_i = \frac{1}{h}\sum_{i=1}^{h} \frac{|G_i \cap O_i|}{|G_i|}$$

$$P = \frac{1}{l}\sum_{j=1}^{l} IP_j = \frac{1}{l}\sum_{j=1}^{l} \frac{|O_j \cap G_j|}{|O_j|}$$
Concept Sequencing
Results and Discussion


F1=0.526 (P=0.383, R=0.726)
Gold-standard


System output





Multiple initial nodes
One single initial node
Linear hierarchies
All nodes with same number of predecessors at the same level
All inclusive ordering favors recall
Future Work




• Utilize pairwise data to produce less densely packed graphs
• More sophisticated measures of semantic similarity
• Make use of concept map link relationships (cause, define…)
• Conduct expert studies to get gold-standard sequences and concepts
Tutorial Dialogue and
Question Realization

• Dialogue-based ITS
  - Labor intensive
  - Effort centers on authoring of dialogue content and flow
  - Design of dialogue states non-trivial
Tutorial Dialogue and
Question Realization

• So what does Target Concept Identification buy us?
  - Critical steps towards more automated ITS creation
  - Decreased effort
  - Scalability
  - Contextual grounding
• TCI Mappings to Dialogue Management
  - Key Concepts = States or Frames
  - Concept Sequence = Default Dialogue Management Strategy
Tutorial Dialogue and
Question Realization

• Example:
  - Concept 486: an earthquake is the sudden slip of part of the Earth’s crust...
  - Concept 561: …When the stress in a particular location is great enough... an earthquake begins
  - Link label between the two concepts: Caused-by
  - Suppose student has stated a paraphrase of 486
  - ITS can produce: "Now that you have defined what an earthquake is, can you explain what causes them?"
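One way to picture how a labeled link could drive question realization is a template lookup keyed by the link label. This is only an illustrative sketch: the template table, the function name, the `topics` lookup, and the direction of the link (486 caused-by 561) are assumptions, and only the "caused-by" wording mirrors the example above.

```python
# Hypothetical templates keyed by link label; only "caused-by" mirrors the example.
TEMPLATES = {
    "caused-by": "Now that you have defined what {topic} is, can you explain what causes {pronoun}?",
    "type-of": "Now that you have described {topic}, can you name some types of {pronoun}?",
}


def next_question(covered_concept, links, topics):
    """Pick a concept linked to the covered one and realize a question about it.

    links:  (source_id, label, target_id) triples from the concept map.
    topics: source_id -> (noun phrase, pronoun) used to fill the template.
    Returns (target_id, question), or (None, None) if no template applies.
    """
    for source, label, target in links:
        if source == covered_concept and label in TEMPLATES:
            topic, pronoun = topics[source]
            return target, TEMPLATES[label].format(topic=topic, pronoun=pronoun)
    return None, None


links = [(486, "caused-by", 561)]
topics = {486: ("an earthquake", "them")}
print(next_question(486, links, topics))
```

In this picture the concept map supplies both the next target concept (561) and, through the link label, the type of question to ask about it.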
Final Thoughts



• Defined Target Concept Identification
• Baseline and past results suggest feasibility of TCI subtasks
• Challenge the QG community to continue to think of QG as the product of several tasks including TCI
Acknowledgements

• Advisers and colleagues at:
  - The University of Colorado at Boulder
  - The Center for Computational Language and EducAtion Research (CLEAR)
  - Boulder Language Technologies
• Support from:
  - The National Science Foundation, NSF (DRL-0733322, DRL-0733323, DRL-0835393, IIS-0537194)
  - The Institute of Education Sciences, IES (R3053070434)

Any findings, recommendations, or conclusions are those of the author and do not necessarily represent the views of NSF or IES.
References
1. F. Ahmad, S. de la Chica, K. Butcher, T. Sumner, and J.H. Martin. Towards automatic conceptual personalization tools. In Proc 7th ACM/IEEE-CS joint conference on Digital Libraries. ACM, 2007.
2. I. L. Beck, M. G. McKeown, C. Sandora, L. Kucan, and J Worthy. Questioning the author: A year-long classroom implementation to engage students with text. The Elementary School Journal, 98:385–414, 1996.
3. B.S. Bloom. Taxonomy of Educational Objectives: The Classification of Educational Goals. Susan Fauer Company, Inc, 1956.
4. S. de la Chica, F. Ahmad, J.H. Martin, and T. Sumner. Pedagogically useful extractive summaries for science education. In Proc CoLing, volume 1, pages 177–184. Association for Computational Linguistics, 2008.
5. A Graesser, V Rus, and Z Cai. Question classification schemes. In Proc WS on the QGSTEC, 2008.
6. Q. Gu, S. Chica, F. Ahmad, H. Khan, T. Sumner, J.H. Martin, and K. Butcher. Personalizing the selection of digital library resources to support intentional learning. In Proc Euro Research and Advanced Technology for Digital Libraries, 2008.
7. P.W. Jordan, B Hall, M Ringenberg, Y Cue, and C Rose. Tools for authoring a dialogue agent that participates in learning studies. In Proc AIED, pages 43–50, Amsterdam, The Netherlands, 2007. IOS Press.
8. W.C. Mann and S.A. Thompson. Rhetorical structure theory: Toward a functional theory of text organization. Text, 8(3):243–281, 1988.
9. RD. Nielsen. Question generation: Proposed challenge tasks and their evaluation. In Proc WS on the QGSTEC, 2008.
10. RD Nielsen, J Buckingham, G Knoll, B Marsh, and L. Palen. A taxonomy of questions for question generation. In Proc WS on the Question Generation Shared Task and Evaluation Challenge, 2008.
11. R Prasad, N Dinesh, A Lee, E Miltsakaki, L Robaldo, A Joshi, and B Webber. The Penn Discourse Treebank 2.0. In Proc LREC, 2008.
12. R Prasad and Aravind Joshi. A discourse-based approach to generating why-questions from texts. In Proc WS on the QGSTEC, 2008.
13. D. Radev, T. Allison, S. Blair-Goldensohn, J. Blitzer, A. Çelebi, S. Dimitrov, E. Drabek, A. Hakim, W. Lam, D. Liu, J. Otterbacher, H. Qi, H. Saggion, S. Teufel, M. Topper, A. Winkel, and Z. Zhang. MEAD - a platform for multidocument multilingual text summarization. In Proc. LREC 2004, 2004.
14. C.M. Reigeluth. The elaboration theory: Guidance for scope and sequence decisions. In Instructional-Design Theories and Models: A New Paradigm of Instructional Theory. Lawrence Erlbaum Assoc, 1999.
15. V. Rus, Z. Cai, and A.C. Graesser. Question generation: An example of a multi-year evaluation campaign. In Proc WS on the QGSTEC, 2008.
16. R. Soricut and D. Marcu. Sentence level discourse parsing using syntactic and lexical information. In Proc HLT/NAACL, pages 228–235, 2003.
17. S. Susarla, A. Adcock, R. Van Eck, K. Moreno, A. C. Graesser, and the Tutoring Research Group. Development and evaluation of a lesson authoring tool for AutoTutor. In V. Aleven, U. Hoppe, R. Mizoguchi, J. Kay, H. Pain, F. Verdejo, and K. Yacef, editors, Proc. AIED2003, pages 378–387, 2003.
18. L. Vanderwende. The importance of being important. In Proc WS on the QGSTEC, 2008.
19. Howard Wainer. Computer-Adaptive Testing: A Primer. 2000.