Semantic Inference for Question Answering

Sanda Harabagiu
Department of Computer Science, University of Texas at Dallas

Srini Narayanan
International Computer Science Institute, Berkeley, CA

Feb 02, 2016
Outline

Part I. Introduction
- The need for Semantic Inference in QA
- Current State-of-the-art in QA
- Parsing with Predicate Argument Structures
- Parsing with Semantic Frames
- Special Text Relations

Part II. Extracting Semantic Relations from Questions and Texts
- Knowledge-intensive techniques
- Supervised and unsupervised techniques

Part III. Knowledge Representation and Inference
- Representing the semantics of answers
- Extended WordNet and abductive inference
- Intentional Structure and Probabilistic Metonymy
- An example of Event Structure
- Modeling relations, uncertainty and dynamics
- Inference methods and their mapping to answer types

Part IV. From Ontologies to Inference
- From OWL to CPRM
- FrameNet in OWL
- FrameNet to CPRM mapping

Part V. Results of Event Structure Inference for QA
- AnswerBank examples
- Current results for Inference Type
- Current results for Answer Structure
The need for Semantic Inference in QA

Some questions are complex! Example:

How can a biological weapons program be detected?

Answer: In recent months, Milton Leitenberg, an expert on
biological weapons, has been looking at this murkiest and most dangerous corner of Saddam Hussein's armory. He says a series of reports add up to indications that Iraq may be trying to develop a new viral agent, possibly in underground laboratories at a military complex near Baghdad where Iraqis first chased away inspectors six years ago. A new assessment by the United Nations suggests Iraq still has chemical and biological weapons - as well as the rockets to deliver them to targets in other countries. The UN document says Iraq may have hidden a number of Scud missiles, as well as launchers and stocks of fuel. US intelligence believes Iraq still has stockpiles of chemical and biological weapons and guided missiles, which it hid from the UN inspectors.
Complex questions

Example: How can a biological weapons program be detected?

This question is complex because:
- It is a manner question. All other manner questions evaluated in TREC asked about three things:
  • Manners to die, e.g. "How did Cleopatra die?", "How did Einstein die?"
  • Manners to get a new name, e.g. "How did Cincinnati get its name?"
  • Manners to say something in another language, e.g. "How do you say house in Spanish?"
- The answer does not contain any explicit manner-of-detection information; instead, it talks about reports that give indications that Iraq may be trying to develop a new viral agent, and assessments by the United Nations suggesting that Iraq still has chemical and biological weapons.
Complex questions and semantic information

Complex questions are not characterized only by a question class (e.g. manner questions).

Example: How can a biological weapons program be detected?
- Associated with the pattern "How can X be detected?"
- And with the topic X = "biological weapons program"

Processing complex questions is also based on access to the semantics of the question topic. The topic is modeled by a set of discriminating relations, e.g.:
  Develop(program); Produce(biological weapons); Acquire(biological weapons) or Stockpile(biological weapons)
Such relations are extracted from topic-relevant texts.
Alternative semantic representations

Using PropBank to access a 1 million word corpus annotated with predicate-argument structures (www.cis.upenn.edu/~ace).

We can train a generative model for recognizing the arguments of each predicate in questions and in the candidate answers.

Example: How can a biological weapons program be detected?

Predicate: detect
  Argument 0 = detector: Answer(1)
  Argument 1 = detected: biological weapons
  Argument 2 = instrument: Answer(2)
(Answer(1) and Answer(2) mark the expected answer types.)
More predicate-argument structures for questions

Example: From which country did North Korea import its missile launch pad metals?

Predicate: import
  Argument 0 (role = importer): North Korea
  Argument 1 (role = commodity): missile launch pad metals
  Argument 2 (role = exporter): ANSWER

Example: What stimulated India's missile programs?

Predicate: stimulate
  Argument 0 (role = agent): ANSWER (part 1)
  Argument 1 (role = thing increasing): India's missile programs
  Argument 2 (role = instrument): ANSWER (part 2)
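A question's predicate-argument structure with unfilled ANSWER slots can be held in a small data structure. The sketch below is purely illustrative (the class and helper names are invented, not from any system described here): unfilled argument positions become the expected-answer slots the answer passage must supply.

```python
# A minimal sketch (hypothetical names) of a question's predicate-argument
# structure; an unfilled argument position is an expected-answer slot.
from dataclasses import dataclass, field

ANSWER = None  # sentinel for an argument the answer must provide

@dataclass
class PredicateArgumentStructure:
    predicate: str
    # maps an argument label (e.g. "Arg0") to its text, or ANSWER if unfilled
    arguments: dict = field(default_factory=dict)

    def answer_slots(self):
        """Argument positions the answer passage must fill."""
        return [label for label, text in self.arguments.items() if text is ANSWER]

# "From which country did North Korea import its missile launch pad metals?"
pas = PredicateArgumentStructure(
    predicate="import",
    arguments={"Arg0": "North Korea",                 # importer
               "Arg1": "missile launch pad metals",   # commodity
               "Arg2": ANSWER},                       # exporter -> the answer
)
print(pas.answer_slots())  # -> ['Arg2']
```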
Additional semantic resources

Using FrameNet:
• frame-semantic descriptions of several thousand English lexical items with semantically annotated attestations (www.icsi.berkeley.edu/~framenet)

Example: What stimulated India's missile programs?

Frame: STIMULATE
  Frame Element CIRCUMSTANCES: ANSWER (part 1)
  Frame Element EXPERIENCER: India's missile programs
  Frame Element STIMULUS: ANSWER (part 2)

Frame: SUBJECT STIMULUS
  Frame Element CIRCUMSTANCES: ANSWER (part 3)
  Frame Element COMPARISON SET: ANSWER (part 4)
  Frame Element EXPERIENCER: India's missile programs
  Frame Element PARAMETER: nuclear/biological proliferation
Semantic inference for Q/A

The problem of classifying questions:
- E.g. "manner questions", such as "How did Hitler die?"

The problem of recognizing answer types/structures:
- Should "manner of death" be considered an answer type?
- What other manners of event/action should be considered as answer types?

The problem of extracting/justifying/generating answers to complex questions:
- Should we learn to extract "manner" relations?
- What other types of relations should we consider?
- Is relation recognition sufficient for answering complex questions? Is it necessary?
Manner-of-death

In previous TREC evaluations, 31 questions asked about manner of death, e.g. "How did Adolf Hitler die?"

State-of-the-art solution (LCC):
- We considered MANNER-OF-DEATH an answer type, pointing to a variety of verbs and nominalizations encoded in WordNet.
- We developed text mining techniques for identifying such information, based on lexico-semantic patterns from WordNet.

Example:
• [kill #sense1 (verb) – CAUSE → die #sense1 (verb)]
• The troponyms of the [kill #sense1 (verb)] concept are candidates for the MANNER-OF-DEATH hierarchy, e.g. drown, poison, strangle, assassinate, shoot.
Practical Hurdle

Not all MANNER-OF-DEATH concepts are lexicalized as a verb, so we set out to determine additional patterns that capture such cases.

Goal: (1) a set of patterns, and (2) dictionaries corresponding to such patterns, using a well-known IE technique (Riloff & Jones, IJCAI'99).

Results: 100 patterns were discovered, e.g.:

Pattern                                   Seed                       New terms learned
X DIE in ACCIDENT                         train accident, be killed  (ACCIDENT) car wreck
X DIE {from|of} DISEASE                   cancer, be killed          (DISEASE) AIDS
X DIE after suffering MEDICAL CONDITION   stroke, suffering of       (MEDICAL CONDITION) complications caused by diabetes
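The bootstrapping loop behind these patterns can be sketched in a few lines. This is a toy illustration in the spirit of Riloff & Jones, not their algorithm: the corpus, the seed dictionary, and the fixed-window pattern induction are all invented for the example.

```python
# Toy pattern bootstrapping: induce patterns from seed contexts, then
# apply the patterns to harvest new dictionary entries. (Illustrative
# only; corpus, seeds, and window size are invented.)
def induce_patterns(corpus, seeds, window=2):
    """Keep the `window` tokens immediately preceding a seed term as a
    candidate extraction pattern."""
    patterns = set()
    for sentence in corpus:
        tokens = sentence.rstrip(".").split()
        for i, token in enumerate(tokens):
            if token in seeds and i >= window:
                patterns.add(tuple(tokens[i - window:i]))
    return patterns

def apply_patterns(corpus, patterns):
    """Harvest the token that follows any induced pattern."""
    harvested = set()
    for sentence in corpus:
        tokens = sentence.rstrip(".").split()
        for pattern in patterns:
            n = len(pattern)
            for i in range(len(tokens) - n):
                if tuple(tokens[i:i + n]) == pattern:
                    harvested.add(tokens[i + n])
    return harvested

corpus = [
    "The actor died of cancer last year.",
    "She died of AIDS in 1994.",
    "He was killed in a car wreck near Dallas.",
]
seeds = {"cancer"}
patterns = induce_patterns(corpus, seeds)          # {('died', 'of')}
new_terms = apply_patterns(corpus, patterns) - seeds
print(new_terms)  # -> {'AIDS'}
```

In a real bootstrapping run, the harvested terms would be scored, the best ones added to the dictionary, and the loop repeated.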
Answer types in state-of-the-art QA systems

- Labels questions with an answer type based on a taxonomy
- Classifies questions (e.g. by using a maximum entropy model)

[Pipeline diagram: the Question feeds Answer Type Prediction, which uses question features and an Answer Type Hierarchy to produce an answer type; Question Expansion and IR over the document collection produce a ranked set of passages; Answer Selection combines the answer type and the passages into the Answer.]
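The answer-type labeling step can be sketched with hand-written heuristics. This is a toy stand-in for the taxonomy lookup and statistical classifiers real systems use; all mappings below are invented for illustration.

```python
# A minimal sketch of answer-type prediction: the wh-word selects a node
# in a small answer-type taxonomy; for "what"/"which" questions, a head
# noun is consulted. (Hand-written heuristics, not a trained classifier.)
TAXONOMY = {
    "who": "PERSON",
    "where": "LOCATION",
    "when": "DATE",
    "how many": "NUMBER",
    "how much": "QUANTITY",
}
HEAD_NOUN_TYPES = {"country": "COUNTRY", "city": "CITY", "company": "ORGANIZATION"}

def predict_answer_type(question):
    q = question.lower().rstrip("?")
    for prefix, answer_type in TAXONOMY.items():
        if q.startswith(prefix):
            return answer_type
    if q.startswith(("what", "which")):
        for noun, answer_type in HEAD_NOUN_TYPES.items():
            if noun in q.split():
                return answer_type
    return "UNKNOWN"

print(predict_answer_type("Who is Alberto Tomba?"))       # -> PERSON
print(predict_answer_type("What country borders Chad?"))  # -> COUNTRY
```

A maximum entropy classifier replaces these brittle rules with features (wh-word, head noun, verb, named entities) weighted on annotated questions.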
In Question Answering, two heads are better than one

The idea originated in IBM's PIQUANT project.

Traditional Q/A systems employ a pipeline approach:
• Question analysis
• Document/passage retrieval
• Answer selection

Questions are classified based on the expected answer type; answers are also selected based on the expected answer type, regardless of the question class.

Motivated by the success of ensemble methods in machine learning, PIQUANT uses multiple classifiers to produce the final output for an ensemble made of multiple QA agents: a multi-strategy, multi-source approach.
Multiple sources, multiple agents

[Architecture diagram: Question Analysis produces a Q-Frame, an answer type, and QGoals; a QPlan Generator and QPlan Executor dispatch the question to multiple answering agents (Predictive Annotation, Statistical, Definitional Question, KSP-Based, and Pattern-Based agents); the agents consult a Knowledge Source Portal (WordNet, Cyc, the Web, semantic search, keyword search) over the AQUAINT, TREC, and CNS collections; Answer Resolution and Answer Classification combine the agents' candidates into the final answer.]
Multiple Strategies

In PIQUANT, the answer resolution strategies consider that different combinations of question processing, passage retrieval, and answer selection from different agents are ideal. This entails that all questions are processed depending on the question class, not the question type.
• There are multiple question classes, e.g. "What" questions asking about people, "What" questions asking about products, etc.
• Only three types of questions have so far been evaluated in systematic ways: factoid questions, definition questions, and list questions.

Another option is to build an architecture in which question types are processed differently, and the semantic representations and inference mechanisms are adapted for each question type.
The Architecture of LCC's QA System

[Architecture diagram: Question Processing parses the question, applies semantic transformations, recognizes the expected answer type (using a WordNet-based Answer Type Hierarchy and Named Entity Recognition with CICERO LITE), and extracts keywords; definition questions instead use question parsing, pattern matching, and keyword extraction. Factoid, list, and definition questions are routed to Passage Retrieval over the indexed AQUAINT document collection, yielding single factoid passages, multiple list passages, or multiple definition passages. Factoid Answer Processing performs answer extraction, answer justification with a theorem prover and an axiomatic knowledge base, and answer reranking; List Answer Processing performs answer extraction with a threshold cutoff; Definition Answer Processing performs pattern matching against a pattern repository and answer extraction.]
Extracting Answers for Factoid Questions

In TREC 2003 the LCC QA system extracted 289 correct answers for factoid questions. The Named Entity Recognizer was responsible for 234 of them:

QUANTITY         55   ORGANIZATION   15   PRICE          3
NUMBER           45   AUTHORED WORK  11   SCIENCE NAME   2
DATE             35   PRODUCT        11   ACRONYM        1
PERSON           31   CONTINENT       5   ADDRESS        1
COUNTRY          21   PROVINCE        5   ALPHABET       1
OTHER LOCATIONS  19   QUOTE           5   URI            1
CITY             19   UNIVERSITY      3
Special Case of Names

Questions asking for names of authored works:

1934: What is the play "West Side Story" based on?  Answer: Romeo and Juliet
1976: What is the motto for the Boy Scouts?  Answer: Driving Miss Daisy
1982: What movie won the Academy Award for best picture in 1989?  Answer: Driving Miss Daisy
2080: What peace treaty ended WWI?  Answer: Versailles
2102: What American landmark stands on Liberty Island?  Answer: Statue of Liberty
NE-driven QA

The results of the past 5 TREC evaluations of QA systems indicate that current state-of-the-art QA is determined by the recognition of named entities:
- Precision of recognition
- Coverage of name classes
- Mapping into concept hierarchies
- Participation in semantic relations (e.g. predicate-argument structures or frame semantics)

Concept Taxonomies

For 29% of questions, the QA system relied on an off-line taxonomy with semantic classes such as: disease, drugs, colors, insects, games. The majority of these semantic classes are also associated with patterns that enable their identification.
Definition Questions

They asked about:
- PEOPLE (most of them starting with "Who")
- other types of NAMES
- general concepts

People questions:
- Many use the PERSON name in the format [First name, Last name]. Examples: Aaron Copland, Allen Iverson, Albert Ghiorso
- Some had the PERSON name in the format [First name, Last name1, Last name2]. Example: Antonia Coello Novello
- Others had the name as a single word (a very well known person). Examples: Nostradamus, Absalom, Abraham
- Some questions referred to names of kings or princes. Examples: Vlad the Impaler, Akbar the Great
Answering definition questions

Most QA systems use between 30 and 60 patterns. The most popular patterns:

Id 25 — Pattern: person-hyponym QP — Freq.: 0.43%
  Usage: "The doctors also consult with former Italian Olympic skier Alberto Tomba, along with other Italian athletes"
  Question: 1907: Who is Alberto Tomba?

Id 9 — Pattern: QP, the AP — Freq.: 0.28%
  Usage: "Bausch & Lomb, the company that sells contact lenses, among hundreds of other optical products, has come up with a new twist on the computer screen magnifier"
  Question: 1917: What is Bausch & Lomb?

Id 11 — Pattern: QP, a AP — Freq.: 0.11%
  Usage: "ETA, a Basque language acronym for Basque Homeland and Freedom, has killed nearly 800 people since taking up arms in 1968"
  Question: 1987: What is ETA in Spain?

Id 13 — Pattern: QP, an AP — Freq.: 0.02%
  Usage: "The kidnappers claimed they are members of the Abu Sayaf, an extremist Muslim group, but a leader of the group denied that"
  Question: 2042: Who is Abu Sayaf?

Id 21 — Pattern: AP such as QP — Freq.: 0.02%
  Usage: "For the hundreds of Albanian refugees undergoing medical tests and treatments at Fort Dix, the news is mostly good: Most are in reasonable good health, with little evidence of infectious diseases such as TB"
  Question: 2095: What is TB?
Complex questions

- Characterized by the need for domain knowledge
- There is no single answer type that can be identified; rather, an answer structure needs to be recognized
- Answer selection becomes more complicated, since inference based on the semantics of the answer type needs to be activated
- Complex questions need to be decomposed into a set of simpler questions
Example of Complex Question

How have thefts impacted on the safety of Russia's nuclear navy, and has the theft problem been increased or reduced over time?

Need for domain knowledge:
- To what degree do different thefts put nuclear or radioactive materials at risk?

Question decomposition:
- Definition questions: What is meant by nuclear navy? What does 'impact' mean? How does one define the increase or decrease of a problem?
- Factoid questions: What is the number of thefts that are likely to be reported? What sort of items have been stolen?
- Alternative questions: What is meant by Russia? Only Russia, or also former Soviet facilities in non-Russian republics?
The answer structure

For complex questions, the answer structure has a compositional semantics, comprising the answer structures of each simpler question into which it is decomposed.

Example:
Q-Sem: How can a biological weapons program be detected?
Question pattern: How can X be detected?  X = Biological Weapons Program

Conceptual Schemas:
- INSPECTION Schema: Inspect, Scrutinize, Monitor, Detect, Evasion, Hide, Obfuscate
- POSSESSION Schema: Acquire, Possess, Develop, Deliver

Structure of the Complex Answer Type EVIDENCE: CONTENT, SOURCE, QUALITY, JUDGE, RELIABILITY
Answer Selection

Based on the answer structure. Example:
- The CONTENT is selected based on conceptual schemas, which are instantiated when predicate-argument structures or semantic frames are recognized in the text passages.
- The SOURCE is recognized when the content source is identified.
- The QUALITY of the judgements, the RELIABILITY of the judgements, and the judgements themselves are produced by an inference mechanism.
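The slots of the complex answer type can be pictured as a small record that the different components fill in. The sketch below is only a data-shape illustration (field names follow the slide; the class itself is invented): schema instantiation fills `content`, source identification fills `source`, and the inference mechanism fills the judgement fields.

```python
# A sketch (hypothetical class, slide-derived field names) of the complex
# answer type EVIDENCE and how its slots get filled.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class EvidenceAnswer:
    content: List[str] = field(default_factory=list)  # instantiated schema relations
    source: Optional[str] = None                      # who reported the content
    judge: Optional[str] = None                       # who assessed it
    quality: Optional[str] = None                     # quality of the judgements
    reliability: Optional[str] = None                 # reliability of the judgements

answer = EvidenceAnswer(
    content=["develop(Iraq, Viral_Agent(instance_of: new))",
             "possess(Iraq, Chemical and Biological Weapons)"],
    source="UN documents, US intelligence",
    judge="UN, US intelligence, Milton Leitenberg",
    quality="low-medium",
    reliability="low-medium",
)
```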
ANSWER: Evidence-Combined: Pointer to Text Source:
A1: In recent months, Milton Leitenberg, an expert on biological weapons, has been looking at this murkiest and most dangerous corner of Saddam Hussein's armory.
A2: He says a series of reports add up to indications that Iraq may be trying to develop a new viral agent, possibly in underground laboratories at a military complex near Baghdad where Iraqis first chased away inspectors six years ago.
A3: A new assessment by the United Nations suggests Iraq still has chemical and biological weapons - as well as the rockets to deliver them to targets in other countries.
A4: The UN document says Iraq may have hidden a number of Scud missiles, as well as launchers and stocks of fuel.
A5: US intelligence believes Iraq still has stockpiles of chemical and biological weapons and guided missiles, which it hid from the UN inspectors
Answer Structure

Content: Biological Weapons Program:

develop(Iraq, Viral_Agent(instance_of: new))
  Justification: POSSESSION Schema; Previous (Intent and Ability): Prevent(ability, Inspection); Inspection terminated; Status: Attempt ongoing; Likelihood: Medium; Confirmability: difficult, obtuse, hidden

possess(Iraq, Chemical and Biological Weapons)
  Justification: POSSESSION Schema; Previous (Intent and Ability): Prevent(ability, Inspection); Status: Hidden from Inspectors; Likelihood: Medium

possess(Iraq, delivery systems(type: rockets; target: other countries))
  Justification: POSSESSION Schema; Previous (Intent and Ability): Hidden from Inspectors; Status: Ongoing; Likelihood: Medium
Answer Structure (continued)

Content: Biological Weapons Program:

possess(Iraq, fuel stock(purpose: power launchers))
  Justification: POSSESSION Schema; Previous (Intent and Ability): Hidden from Inspectors; Status: Ongoing; Likelihood: Medium

possess(Iraq, delivery systems(type: scud missiles; launchers; target: other countries))
  Justification: POSSESSION Schema; Previous (Intent and Ability): Hidden from Inspectors; Status: Ongoing; Likelihood: Medium

hide(Iraq, Seeker: UN Inspectors; Hidden: CBW stockpiles & guided missiles)
  Justification: DETECTION Schema; Inspection status: Past; Likelihood: Medium
Answer Structure (continued)

Source: UN documents, US intelligence
  SOURCE.Type: Assessment reports; SOURCE.Reliability: Med-high; Likelihood: Medium

Judge: UN, US intelligence, Milton Leitenberg (biological weapons expert)
  JUDGE.Type: mixed; JUDGE.manner; JUDGE.stage: ongoing
  Quality: low-medium; Reliability: low-medium
State-of-the-art QA: Learning surface text patterns

Pioneered by Ravichandran and Hovy (ACL 2002). The idea is that, given a specific answer type (e.g. BIRTH-DATE), the system learns all surface patterns that enable the extraction of the answer from any text passage. The approach relies on Web redundancy. Patterns are learned by two algorithms:

Algorithm 1 (Generates Patterns)
Step 1: Select an answer type AT and a question Q(AT).
Step 2: Generate a query (Q(AT) & AT) and submit it to a search engine (Google, AltaVista).
Step 3: Download the first 1000 documents.
Step 4: Select only those sentences that contain the question content words and the AT.
Step 5: Pass the sentences through a suffix tree constructor.
Step 6: Extract only the longest matching substrings that contain the AT and the question word it is syntactically connected with.

Algorithm 2 (Measures the Precision of Patterns)
Step 1: Query by using only the question Q(AT).
Step 2: Download the first 1000 documents.
Step 3: Select only those sentences that contain the question word connected to the AT.
Step 4: Compute C(a) = # patterns matched by the correct answer, and C(o) = # patterns matched by any word.
Step 5: The precision of a pattern is given by C(a)/C(o).
Step 6: Retain only patterns matching more than 5 examples.
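The precision computation of Algorithm 2 can be sketched as follows. This is a toy illustration, not the original implementation: the sentences are invented, the slot filler is taken to be the single token after the pattern prefix, and only one pattern is scored.

```python
# Toy sketch of Algorithm 2's precision measure for one pattern: C(a)
# counts matches where the <ANSWER> slot holds the correct answer, C(o)
# matches where it holds any word; precision = C(a)/C(o).
def pattern_counts(pattern, sentences, name, correct_answer):
    """`pattern` contains <NAME> and <ANSWER> placeholders; the slot
    filler is taken to be the first token after the pattern prefix."""
    prefix, suffix = pattern.replace("<NAME>", name).split("<ANSWER>")
    c_a = c_o = 0
    for sentence in sentences:
        if prefix not in sentence:
            continue
        filler_text = sentence.split(prefix, 1)[1]
        if suffix:
            filler_text = filler_text.split(suffix, 1)[0]
        tokens = filler_text.split()
        if tokens:
            c_o += 1
            if tokens[0] == correct_answer:
                c_a += 1
    return c_a, c_o

sentences = [
    "Mozart was born in 1756 .",
    "Mozart was born in Salzburg .",
]
c_a, c_o = pattern_counts("<NAME> was born in <ANSWER>", sentences, "Mozart", "1756")
print(c_a / c_o)  # -> 0.5
```

The second sentence shows exactly why the measure is needed: the pattern also fires on a location, so its precision for BIRTH-YEAR is only 0.5 on this tiny sample.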
Results and Problems

Some results:

Answer Type = INVENTOR:
  <ANSWER> invents <NAME>
  the <NAME> was invented by <ANSWER>
  <ANSWER>'s invention of the <NAME>
  <ANSWER>'s <NAME> was
  <NAME>, invented by <ANSWER>
  That <ANSWER>'s <NAME>

Answer Type = BIRTH-YEAR:
  <NAME> (<ANSWER>- )
  <NAME> was born on <ANSWER>
  <NAME> was born in <ANSWER>
  born in <ANSWER>, <NAME>
  Of <NAME>, (<ANSWER>

Limitations:
- Cannot handle long-distance dependencies
- Cannot recognize paraphrases, since no semantic knowledge is associated with these patterns (unlike patterns used in Information Extraction)
- Cannot recognize paraphrased questions
Shallow semantic parsing

Part of these problems can be solved by using shallow semantic parsers: parsers that use shallow semantics encoded as either predicate-argument structures or semantic frames.
• Long-distance dependencies are captured.
• Paraphrases can be recognized by mapping onto IE architectures.

In the past 4 years, several models for training such parsers have emerged. Lexico-semantic resources are available (e.g. PropBank, FrameNet), and several evaluations measure the performance of such parsers (e.g. SENSEVAL, CoNLL).
Proposition Bank Overview

- A one million word corpus annotated with predicate-argument structures [Kingsbury, 2002]. Currently only predicates lexicalized by verbs are covered.
- Numbered arguments from 0 to 5. Typically ARG0 = agent; ARG1 = direct object or theme; ARG2 = indirect object, benefactive, or instrument.
- Functional tags: ARGM-LOC = locative, ARGM-TMP = temporal, ARGM-DIR = direction.
[Parse tree for "The futures halt was assailed by Big Board floor traders", annotated with ARG1 = entity assailed ("The futures halt"), PRED ("assailed"), and ARG0 = agent ("Big Board floor traders").]
The Model
Consists of two tasks: (1) identifying parse tree constituents corresponding to predicate arguments, and (2) assigning a role to each argument constituent.

Both tasks are modeled using C5.0 decision tree learning and two sets of features: Feature Set 1, adapted from [Gildea and Jurafsky, 2002], and Feature Set 2, a novel set of semantic and syntactic features [Surdeanu, Harabagiu et al., 2003].
[The same parse tree: Task 1 identifies the candidate argument constituents of PRED "assailed"; Task 2 assigns them the roles ARG1 and ARG0.]
Feature Set 1
PHRASE TYPE (pt): type of the syntactic phrase serving as the argument. E.g. NP for ARG1.
PARSE TREE PATH (path): path between argument and predicate. E.g. NP S VP VP for ARG1.
PATH LENGTH (pathLen): number of labels stored in the predicate-argument path. E.g. 4 for ARG1.
POSITION (pos): indicates whether the constituent appears before the predicate in the sentence. E.g. true for ARG1 and false for ARG0.
VOICE (voice): predicate voice (active or passive). E.g. passive for PRED.
HEAD WORD (hw): head word of the evaluated phrase. E.g. “halt” for ARG1.
GOVERNING CATEGORY (gov): indicates whether an NP is dominated by an S phrase or a VP phrase. E.g. S for ARG1, VP for ARG0.
PREDICATE WORD: the verb with morphological information preserved (verb), and the verb normalized to lower case and infinitive form (lemma). E.g. for PRED verb is “assailed”, lemma is “assail”.
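Two of these features, the parse tree path and its length, can be computed from a constituent tree in a few lines. The sketch below uses a hand-built toy tree (only the spine needed for ARG1's path is constructed; class and function names are invented), and runs the path to the VP immediately containing the predicate, matching the slide's "NP S VP VP" example.

```python
# A sketch of the `path` and `pathLen` features over a toy constituent
# tree for "The futures halt was assailed by Big Board floor traders".
class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self

def ancestors(node):
    """The node itself followed by its chain of ancestors up to the root."""
    chain = [node]
    while chain[-1].parent is not None:
        chain.append(chain[-1].parent)
    return chain

def parse_tree_path(arg, pred_phrase):
    """Phrase labels from the argument up to the lowest common ancestor,
    then down to the phrase containing the predicate."""
    up, down = ancestors(arg), ancestors(pred_phrase)
    common = next(n for n in up if n in down)
    up_labels = [n.label for n in up[: up.index(common) + 1]]
    down_labels = [n.label for n in down[: down.index(common)]][::-1]
    return up_labels + down_labels

# Only the spine relevant to ARG1 is built:
# S -> NP ("The futures halt", ARG1)  VP -> was  VP -> assailed ...
arg1 = Node("NP")
inner_vp = Node("VP")            # the VP containing the predicate "assailed"
outer_vp = Node("VP", [inner_vp])
s = Node("S", [arg1, outer_vp])

path = parse_tree_path(arg1, inner_vp)
print(" ".join(path), len(path))  # -> NP S VP VP 4
```

The printed length, 4, is the `pathLen` feature for ARG1 quoted above.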
Observations about Feature Set 1
Because most of the argument constituents are prepositional attachments (PP) and relative clauses (SBAR), often the head word (hw) is not the most informative word in the phrase.
Due to its strong lexicalization, the model suffers from data sparsity. E.g. hw used < 3%. The problem can be addressed with a back-off model from words to part of speech tags.
The features in set 1 capture only syntactic information, even though semantic information like named-entity tags should help. For example, ARGM-TMP typically contains DATE entities, and ARGM-LOC includes LOCATION named entities.
Feature set 1 does not capture predicates lexicalized by phrasal verbs, e.g. “put up”.
[Example subtrees for content-word selection: the PP "in last June", the SBAR "that occurred yesterday", and the VP "to be declared".]
Feature Set 2 (1/2)

CONTENT WORD (cw): lexicalized feature that selects an informative word from the constituent, other than the head. Selection heuristics are available in the paper. E.g. "June" for the phrase "in last June".

PART OF SPEECH OF CONTENT WORD (cPos): part of speech tag of the content word. E.g. NNP for the phrase "in last June".

PART OF SPEECH OF HEAD WORD (hPos): part of speech tag of the head word. E.g. NN for the phrase "the futures halt".

NAMED ENTITY CLASS OF CONTENT WORD (cNE): the class of the named entity that includes the content word; 7 named entity classes (from the MUC-7 specification) are covered. E.g. DATE for "in last June".

Feature Set 2 (2/2)

BOOLEAN NAMED ENTITY FLAGS: a set of features that indicate whether a named entity is included at any position in the phrase:
- neOrganization: true if an organization name is recognized in the phrase
- neLocation: true if a location name is recognized in the phrase
- nePerson: true if a person name is recognized in the phrase
- neMoney: true if a currency expression is recognized in the phrase
- nePercent: true if a percentage expression is recognized in the phrase
- neTime: true if a time of day expression is recognized in the phrase
- neDate: true if a date temporal expression is recognized in the phrase

PHRASAL VERB COLLOCATIONS: two features that capture information about phrasal verbs:
- pvcSum: the frequency with which a verb is immediately followed by any preposition or particle
- pvcMax: the frequency with which a verb is followed by its predominant preposition or particle
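The two phrasal-verb features can be sketched directly from verb/next-token bigram counts. The tiny corpus and particle list below are invented for illustration.

```python
# A sketch of pvcSum and pvcMax from (verb, next-token) bigram counts:
# pvcSum = how often the verb is immediately followed by any particle,
# pvcMax = how often by its single most frequent particle.
from collections import Counter

PARTICLES = {"up", "off", "out", "in", "down", "on"}

def pvc_features(verb, bigrams):
    counts = Counter(nxt for v, nxt in bigrams
                     if v == verb and nxt in PARTICLES)
    pvc_sum = sum(counts.values())
    pvc_max = max(counts.values()) if counts else 0
    return pvc_sum, pvc_max

bigrams = [("put", "up"), ("put", "up"), ("put", "off"), ("put", "the"),
           ("assail", "the")]
print(pvc_features("put", bigrams))     # -> (3, 2)
print(pvc_features("assail", bigrams))  # -> (0, 0)
```

High values of these features signal phrasal verbs such as "put up", which Feature Set 1 could not capture.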
Results

Features                         Arg P   Arg R   Arg F1   Role A
FS1                              84.96   84.26   84.61    78.76
FS1 + POS tag of head word       92.24   84.50   88.20    79.04
FS1 + content word and POS tag   92.19   84.67   88.27    80.80
FS1 + NE label of content word   83.93   85.69   84.80    79.85
FS1 + phrase NE flags            87.78   85.71   86.73    81.28
FS1 + phrasal verb information   84.88   82.77   83.81    78.62
FS1 + FS2                        91.62   85.06   88.22    83.05
FS1 + FS2 + boosting             93.00   85.29   88.98    83.74
Other parsers based on PropBank

- Pradhan, Ward et al., 2004 (HLT/NAACL and J. of ML) report on a parser trained with SVMs which obtains an F1-score of 90.4% for argument classification and 80.8% for detecting the boundaries and classifying the arguments, when only the first set of features is used.
- Gildea and Hockenmaier (2003) use features extracted from Combinatory Categorial Grammar (CCG); the F1-measure obtained is 80%.
- Chen and Rambow (2003) use syntactic and semantic features extracted from a Tree Adjoining Grammar (TAG) and report an F1-measure of 93.5% for the core arguments.
- Pradhan, Ward et al. use a set of 12 new features and obtain an F1-score of 93.8% for argument classification and 86.7% for argument detection and classification.
Applying Predicate-Argument Structures to QA

Parsing Questions:
Q: What kind of materials were stolen from the Russian navy?
PAS(Q): What [Arg1: kind of nuclear materials] were [Predicate: stolen] [Arg2: from the Russian Navy]?

Parsing Answers:
A(Q): Russia's Pacific Fleet has also fallen prey to nuclear theft; in 1/96, approximately 7 kg of HEU was reportedly stolen from a naval base in Sovetskaya Gavan.
PAS(A(Q)): [Arg1(Predicate 1): Russia's Pacific Fleet] has [ArgM-DIS(Predicate 1): also] [Predicate 1: fallen] [Arg1(Predicate 1): prey to nuclear theft]; [ArgM-TMP(Predicate 2): in 1/96], [Arg1(Predicate 2): approximately 7 kg of HEU] was [ArgM-ADV(Predicate 2): reportedly] [Predicate 2: stolen] [Arg2(Predicate 2): from a naval base] [Arg3(Predicate 2): in Sovetskaya Gavan]

Result: exact answer = "approximately 7 kg of HEU"
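The extraction step can be sketched as an alignment of the two structures: the argument position carrying the question's wh-phrase is read off the matching predicate in the answer. The structures below are written out by hand (a real system would obtain them from a semantic parser), and the alignment rule is deliberately simplified.

```python
# A toy sketch of answer extraction by predicate-argument alignment.
# The PAS dicts are hand-written stand-ins for semantic parser output;
# "WHAT" marks the wh-phrase slot.
question_pas = {"predicate": "steal",
                "Arg1": "WHAT kind of nuclear materials",  # wh-phrase here
                "Arg2": "from the Russian Navy"}

answer_pas = {"predicate": "steal",
              "Arg1": "approximately 7 kg of HEU",
              "Arg2": "from a naval base",
              "Arg3": "in Sovetskaya Gavan"}

def extract_exact_answer(q_pas, a_pas):
    """Return the answer-side filler of the question's wh-argument,
    provided the two structures share a predicate."""
    if q_pas["predicate"] != a_pas["predicate"]:
        return None
    for role, filler in q_pas.items():
        if role != "predicate" and filler.startswith("WHAT"):
            return a_pas.get(role)
    return None

print(extract_exact_answer(question_pas, answer_pas))
# -> approximately 7 kg of HEU
```

Because the alignment is over argument labels rather than surface strings, the long intervening material ("was reportedly") and the extra Arg3 do not disturb the extraction.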
The Model
Consists of two tasks: (1) identifying parse tree constituents corresponding to frame elements, and (2) assigning a semantic role to each frame element.

Both tasks were introduced for the first time by Gildea and Jurafsky in 2000, using Feature Set 1, which Gildea and Palmer later used for parsing based on PropBank.
[Parse tree for "She clapped her hands in inspiration": Task 1 identifies the frame element constituents of PRED "clapped"; Task 2 assigns them the roles Agent, Body Part, and Cause.]
Extensions

Fleischman et al. (2003) extend the model in three ways:
- Adopt a maximum entropy framework for learning a more accurate classification model.
- Include features that look at previous tags, using previous tag information to find the highest probability semantic role sequence for any given sentence.
- Examine sentence-level patterns that exploit more global information in order to classify frame elements.
Applying Frame Structures to QA

Parsing Questions:
Q: What kind of materials were stolen from the Russian navy?
FS(Q): What [GOODS: kind of nuclear materials] were [Target-Predicate: stolen] [VICTIM: from the Russian Navy]?

Parsing Answers:
A(Q): Russia's Pacific Fleet has also fallen prey to nuclear theft; in 1/96, approximately 7 kg of HEU was reportedly stolen from a naval base in Sovetskaya Gavan.
FS(A(Q)): [VICTIM(P1): Russia's Pacific Fleet] has also fallen prey to [GOODS(P1): nuclear] [Target-Predicate(P1): theft]; in 1/96, [GOODS(P2): approximately 7 kg of HEU] was reportedly [Target-Predicate(P2): stolen] [VICTIM(P2): from a naval base] [SOURCE(P2): in Sovetskaya Gavan]

Result: exact answer = "approximately 7 kg of HEU"
Additional types of relations

- Temporal relations (TERQUAS ARDA Workshop)
- Causal relations
- Evidential relations
- Part-whole relations

Temporal relations in QA

Results of the workshop are accessible from http://www.cs.brandeis.edu/~jamesp/arda/time/documentation/TimeML-use-in-qa-v1.0.pdf

A set of questions that require the extraction of temporal relations was created (the TimeML question corpus), e.g.:
• "When did the war between Iran and Iraq end?"
• "Who was Secretary of Defense during the Gulf War?"

A number of features of these questions were identified and annotated, e.g.:
• Number of TEMPEX relations in the question
• Volatility of the question (how often does the answer change)
• Reference to repetitive events
• Number of events mentioned in the question
Part II. Extracting Semantic Relations from Questions and Texts
- Knowledge-intensive techniques
- Unsupervised techniques
Information Extraction from texts

Extracting semantic relations from questions and texts can be solved by adapting IE technology to this new task.

What is Information Extraction (IE)?
- The task of finding facts about a specified class of events from free text
- Filling a table in a database with the information; such a database entry can be seen as a list of slots of a template
- Events are instances comprising many relations that span multiple arguments
IE Architecture Overview

[Pipeline diagram: phrasal parser → entity coreference (with coreference filters) → domain event rules (rules from the domain API) → domain coreference → templette merging (with a merge condition).]
Walk-through Example
... a bomb rigged with a trip wire that exploded and killed him...
TEMPLETTE BOMB: “a bomb rigged with a trip wire”
TEMPLETTE DEAD: “A Chinese restaurant chef”
... a bomb rigged with a trip wire/NG that/P exploded/VG and/P killed/VG him/NG...
Parser
Entity Coref: him → A Chinese restaurant chef
... a bomb rigged with a trip wire that exploded/PATTERN and killed him/PATTERN...
Domain Rules
TEMPLETTE BOMB: “a bomb rigged with a trip wire”; LOCATION: “MIAMI”
TEMPLETTE BOMB: “a bomb rigged with a trip wire”; DEAD: “A Chinese restaurant chef”
Domain Coref
TEMPLETTE BOMB: “a bomb rigged with a trip wire”; DEAD: “A Chinese restaurant chef”; LOCATION: “MIAMI”
Merging
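The merging step above can be sketched in code. This is a minimal illustration, not the actual system: templettes are plain dicts, and the merge condition used here (no filled slot may conflict) is a hypothetical simplification.

```python
def merge_templettes(t1, t2):
    """Merge two event templettes under a hypothetical merge condition:
    no slot may be filled with conflicting values (None = unfilled)."""
    for slot in set(t1) & set(t2):
        if t1[slot] and t2[slot] and t1[slot] != t2[slot]:
            return None           # conflicting fills: refuse to merge
    merged = dict(t1)
    for slot, value in t2.items():
        merged[slot] = merged.get(slot) or value
    return merged

# The two templettes produced in the walk-through:
bombing = {"BOMB": "a bomb rigged with a trip wire", "LOCATION": "MIAMI"}
death   = {"BOMB": "a bomb rigged with a trip wire",
           "DEAD": "A Chinese restaurant chef"}
merged = merge_templettes(bombing, death)
print(merged["DEAD"], "/", merged["LOCATION"])
```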
Learning domain event rulesand domain relations
• build patterns from examples (Yangarber ’97)
• generalize from multiple examples, annotated text: Crystal, Whisk (Soderland), Rapier (Califf)
• active learning to reduce annotation (Soderland ’99, Califf ’99)
• learning from a corpus with relevance judgements (Riloff ’96, ’99)
• co-learning/bootstrapping (Brin ’98, Agichtein ’00)
Changes in IE architecture for enabling the extraction of semantic relations
Relation Merging
Document
Tokenizer
Event Recognizer
Event/Relation Coreference
Entity Recognizer
Relation Recognizer
Entity Coreference
- Addition of a Relation Layer
- Modification of NE and pronominal coreference to enable relation coreference
- Addition of a relation merging layer
EEML File Generation
EEML Results
Entity: Person
Entity: Person
Walk-through Example
Entity: City
Entity: Person
Entity: Time-Quantity
Entity: GeopoliticalEntity
Entity: Person
Event: Murder
Event: Murder
The murder of Vladimir Golovlyov, an associate of the exiled
tycoon Boris Berezovsky, was the second contract killing in
the Russian capital in as many days and capped a week of
setbacks for the Russian leader.
Walk-through Example
Event-Entity Relation: Victim
Entity-Entity Relation: AffiliatedWith
Event-Entity Relation: Victim
Event-Entity Relation: EventOccurredAt
Entity-Entity Relation: GeographicalSubregion
Entity-Entity Relation: hasLeader
The murder of Vladimir Golovlyov, an associate of the exiled
tycoon Boris Berezovsky, was the second contract killing in
the Russian capital in as many days and capped a week of
setbacks for the Russian leader.
Application to QA
Who was murdered in Moscow this week?
  Relations: EventOccurredAt + Victim
Name some associates of Vladimir Golovlyov.
  Relations: AffiliatedWith
How did Vladimir Golovlyov die?
  Relations: Victim
What is the relation between Vladimir Golovlyov and Boris Berezovsky?
  Relations: AffiliatedWith
Outline Part II. Extracting Semantic
Relations from Questions and Texts Knowledge-intensive techniques Unsupervised techniques
Learning extraction rules and semantic lexicons
Generating Extraction Patterns: AutoSlog (Riloff 1993), AutoSlog-TS (Riloff 1996)
Semantic Lexicon Induction: Riloff & Shepherd (1997), Roark & Charniak (1998), Ge, Hale, & Charniak (1998), Caraballo (1999), Thompson & Mooney (1999), Meta-Bootstrapping (Riloff & Jones 1999), (Thelen and Riloff 2002)
Bootstrapping/Co-training: Yarowsky (1995), Blum and Mitchell (1998), McCallum & Nigam (1998)
Generating extraction rules From untagged text: AutoSlog-TS (Riloff 1996)
The rule relevance is measured by: Relevance rate * log2 (frequency)
Pre-classified Texts
STAGE 1
Sentence Analyzer
Subject: World Trade Center
Verb: was bombed
PP: by terrorists
AutoSlog Heuristics
Concept Nodes:
<x> was bombed
bombed by <y>
Pre-classified Texts
STAGE 2
Sentence Analyzer
Concept Node Dictionary:
<x> was killed
<x> was bombed by <y>
Concept Nodes: REL%
<x> was bombed   87%
bombed by <y>    84%
<w> was killed   63%
<z> saw          49%
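The ranking formula above (relevance rate * log2(frequency)) can be sketched as follows. The match counts are invented so that they reproduce the REL% figures on the slide; they are not taken from the paper.

```python
import math

def rule_score(relevant_matches, total_matches):
    """AutoSlog-TS ranking: relevance rate * log2(frequency), where the
    relevance rate is the fraction of a pattern's matches that occur in
    relevant (pre-classified) texts."""
    return (relevant_matches / total_matches) * math.log2(total_matches)

# Hypothetical match counts chosen to reproduce the REL% figures above:
patterns = {
    "<x> was bombed": (87, 100),
    "bombed by <y>":  (42, 50),
    "<w> was killed": (63, 100),
    "<z> saw":        (49, 100),
}
ranked = sorted(patterns, key=lambda p: rule_score(*patterns[p]), reverse=True)
print(ranked[0])  # <x> was bombed
```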
Learning Dictionaries for IE with mutual bootstrapping: Riloff and Jones (1999)
Generate all candidate extraction rules from the training corpus using AutoSlog.
Apply the candidate extraction rules to the training corpus and save the patterns with their extractions to EPdata.
SemLex = {seed words}; Cat_EPlist = {}
MUTUAL BOOTSTRAPPING LOOP
1. Score all extraction patterns in EPdata
2. best_EP = the highest scoring extraction pattern not already in Cat_EPlist
3. Add best_EP to Cat_EPlist
4. Add best_EP’s extractions to SemLex
5. Go to step 1.
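A minimal sketch of this loop, assuming a toy EPdata mapping from patterns to their extracted noun phrases. The scoring here (overlap with the current lexicon) is a simplification of the RlogF score used by Riloff and Jones.

```python
def mutual_bootstrap(ep_data, seed_words, iterations=3):
    """Sketch of the mutual bootstrapping loop. ep_data maps each
    extraction pattern to the set of NPs it extracts; a pattern is scored
    by how many of its extractions are already in the semantic lexicon
    (a simplification of the RlogF score used in the paper)."""
    sem_lex = set(seed_words)
    cat_ep_list = []
    for _ in range(iterations):
        candidates = [p for p in ep_data if p not in cat_ep_list]
        if not candidates:
            break
        best_ep = max(candidates, key=lambda p: len(ep_data[p] & sem_lex))
        cat_ep_list.append(best_ep)
        sem_lex |= ep_data[best_ep]        # add best_EP's extractions
    return sem_lex, cat_ep_list

ep_data = {                                # hypothetical pattern/extraction data
    "offices in <x>":  {"baghdad", "london", "a suburb"},
    "attack on <x>":   {"baghdad", "the base"},
    "<x> was elected": {"a senator"},
}
lex, eps = mutual_bootstrap(ep_data, seed_words={"baghdad", "london"})
print(eps[0])  # offices in <x>
```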
The BASILISK approach (Thelen & Riloff)
(Diagram: a corpus and seed words initialize the bootstrapping loop; the best extraction patterns and their extractions feed a Pattern Pool, whose extractions fill a Candidate Word Pool; at each iteration the 5 best candidate words are added to the semantic lexicon.)
BASILISK = Bootstrapping Approach to SemantIc Lexicon Induction using Semantic Knowledge
Key ideas:
1. Collective evidence over a large set of extraction patterns can reveal strong semantic associations.
2. Learning multiple categories simultaneously can constrain the bootstrapping process.
Learning Multiple Categories Simultaneously
Bootstrapping a single category Bootstrapping multiple categories
•“One Sense per Domain” assumption: a word belongs to a single semantic category within a limited domain.
The simplest way to take advantage of multiple categories is to resolve conflicts when they arise.
1. A word cannot be assigned to category X if it has already been assigned to category Y.
2. If a word is hypothesized for both category X and category Y at the same time, choose the category that receives the highest score.
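These two conflict-resolution rules can be sketched as follows; the categories, words, and scores are hypothetical.

```python
def assign_words(scores, lexicons):
    """Conflict resolution across categories. scores: {category: {word:
    score}} for this iteration's hypothesized words; lexicons: {category:
    set of already-assigned words}. Rule 1: a word already in some
    lexicon is never reassigned. Rule 2: a word hypothesized by several
    categories goes to the one scoring it highest."""
    assigned = {w for lex in lexicons.values() for w in lex}
    for category, word_scores in scores.items():
        for word in word_scores:
            if word in assigned:
                continue                                   # Rule 1
            best = max(scores, key=lambda c: scores[c].get(word, float("-inf")))
            if best == category:                           # Rule 2
                lexicons[category].add(word)

lexicons = {"location": {"baghdad"}, "weapon": set()}
scores = {"location": {"kabul": 2.0, "scud": 0.5},
          "weapon":   {"scud": 3.0, "baghdad": 1.0}}
assign_words(scores, lexicons)
print(sorted(lexicons["location"]), sorted(lexicons["weapon"]))
```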
Kernel Methods for Relation Extraction
Pioneered by Zelenko, Aone and Richardella (2002)
Uses Support Vector Machines and the Voted Perceptron algorithm (Freund and Schapire, 1999)
Operates on the shallow parses of texts, using two functions:
• a matching function between the nodes of the shallow parse tree, and
• a similarity function between the nodes
It obtains very high F1-score values for relation extraction (86.8%)
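A toy recursion in the spirit of this kernel, not the published one: nodes are (label, text, children) triples, the matching function gates on labels, and the similarity function compares texts. The real kernel also sums over child subsequences rather than aligning children positionally.

```python
def kernel(n1, n2):
    """Toy tree similarity in the spirit of the shallow-parse kernel.
    A label mismatch contributes 0 (the matching function); equal texts
    score 1.0 and different texts 0.5 (the similarity function); aligned
    children are compared recursively."""
    label1, text1, kids1 = n1
    label2, text2, kids2 = n2
    if label1 != label2:
        return 0.0
    score = 1.0 if text1 == text2 else 0.5
    for c1, c2 in zip(kids1, kids2):
        score += kernel(c1, c2)
    return score

t1 = ("S", "", [("PER", "john", []), ("VG", "owns", []), ("ORG", "acme", [])])
t2 = ("S", "", [("PER", "mary", []), ("VG", "owns", []), ("ORG", "acme", [])])
print(kernel(t1, t2))  # 3.5
```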
Outline Part III. Knowledge representation and
inference Representing the semantics of answers Extended WordNet and abductive inference Intentional Structure and Probabilistic
Metonymy An example of Event Structure Modeling relations, uncertainty and dynamics Inference methods and their mapping to
answer types
Three representations
• A taxonomy of answer types onto which Named Entity classes are also mapped
• A complex structure that results from schema instantiations
• An answer type generated by inference on the semantic structures
Possible Answer Types
TOP: PERSON, LOCATION, DATE, TIME, PRODUCT, NUMERICAL VALUE, MONEY, ORGANIZATION, MANNER, REASON
NUMERICAL VALUE: DEGREE, DIMENSION, RATE, DURATION, PERCENTAGE, COUNT
(Diagram: the leaves of the hierarchy are WordNet subtrees, e.g. time of day {midnight, prime time, clock time}; team, squad {hockey team}; institution, establishment {financial institution, educational institution}; numerosity, multiplicity {integer, whole number, population, denominator}; dimension words {thickness; width, breadth; distance, length; altitude; wingspan}.)
Examples
“What is the name of the actress that played in Shine?” expected answer type: PERSON
“What does the BMW company produce?” expected answer type: PRODUCT
(Diagram: the question terms (what, actress, name, played, Shine; what, BMW, company, produce) are mapped under TOP in the answer-type hierarchy.)
Outline Part III. Knowledge representation and
inference Representing the semantics of answers Extended WordNet and abductive inference Intentional Structure and Probabilistic
Metonymy An example of Event Structure Modeling relations, uncertainty and dynamics Inference methods and their mapping to
answer types
Extended WordNet
eXtended WordNet is an ongoing project at the Human Language Technology Research Institute, University of Texas at Dallas (http://xwn.hlt.utdallas.edu/).
The goal of this project is to develop a tool that takes as input the current or future versions of WordNet and automatically generates an eXtended WordNet that provides several important enhancements intended to remedy the present limitations of WordNet.
In the eXtended WordNet the WordNet glosses are syntactically parsed, transformed into logic forms and content words are semantically disambiguated.
Logic Abduction
(Diagram: the COGEX architecture. The question logic form (QLF) and answer logic form (ALF), together with XWN axioms, NLP axioms, and lexical chains, feed an Axiom Builder; Justification follows, with Relaxation applied when the proof fails; on success, Answer Ranking produces ranked answers and an answer explanation.)
Motivation: goes beyond keyword-based justification by capturing:
• syntax-based relationships
• links between concepts in the question and the candidate answers
COGEX = the LCC Logic Prover for QA
Inputs to the Logic Prover
A logic form provides a mapping of the question and candidate answer text into first order logic predicates.
Question:
Where did bin Laden 's funding come from other than his own wealth ?
Question Logic Form:
( _multi_AT(x1) ) & bin_NN_1(x2) & Laden_NN(x3) & _s_POS(x5,x4) & nn_NNC(x4,x2,x3) & funding_NN_1(x5) & come_VB_1(e1,x5,x11) & from_IN(e1,x1) & other_than_JJ_1(x6) & his_PRP_(x6,x4) & own_JJ_1(x6) & wealth_NN_1(x6)
Justifying the answer
Answer:
... Bin Laden reportedly sent representatives to Afghanistan opium farmers to buy large amounts of opium , probably to raise funds for
al - Qaida ....
Answer Logic Form:
… Bin_NN(x14) & Laden_NN(x15) & nn_NNC(x16,x14,x15) & reportedly_RB_1(e2) & send_VB_1(e2,x16,x17) & representative_NN_1(x17) & to_TO(e2,x21) & Afghanistan_NN_1(x18) & opium_NN_1(x19) & farmer_NN_1(x20) & nn_NNC(x21,x19,x20) & buy_VB_5(e3,x17,x22) & large_JJ_1(x22) & amount_NN_1(x22) & of_IN(x22,x23) & opium_NN_1(x23) & probably_RB_1(e4) & raise_VB_1(e4,x22,x24) & funds_NN_2(x24) & for_IN(x24,x26) & al_NN_1(x25) & Qaida_NN(x26) ...
Lexical Chains
Lexical chains provide an improved source of world knowledge by supplying the Logic Prover with much needed axioms to link question keywords with answer concepts.
Question: How were biological agents acquired by bin Laden?
Answer: On 8 July 1998 , the Italian newspaper Corriere della Serra indicated that members of The World Front for Fighting Jews and Crusaders , which was founded by Bin Laden , purchased three chemical and biological_agent production facilities in
Lexical Chain: ( v - buy#1, purchase#1 ) HYPERNYM ( v - get#1, acquire#1 )
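A lexical chain of this kind can be found by walking hypernym links. The sketch below hard-codes a two-synset fragment of WordNet instead of consulting a real WordNet API.

```python
# Hand-coded fragment of WordNet (an assumption for this sketch; the real
# system consults WordNet itself): words map to synsets, and synsets carry
# hypernym links.
SYNSET = {"buy": "buy#1", "purchase": "buy#1", "get": "get#1", "acquire": "get#1"}
HYPERNYM = {"buy#1": "get#1"}

def lexical_chain(word_a, word_b):
    """Walk hypernym links upward from word_a's synset until word_b's
    synset is reached; return the chain of synsets, or None."""
    target = SYNSET[word_b]
    node = SYNSET[word_a]
    chain = [node]
    while node != target:
        node = HYPERNYM.get(node)
        if node is None:
            return None
        chain.append(node)
    return chain

# Links the answer's "purchased" to the question's "acquired":
print(lexical_chain("purchase", "acquire"))  # ['buy#1', 'get#1']
```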
Axiom selection
XWN Axioms
Another source of world knowledge is a general purpose knowledge base of more than 50,000 parsed and disambiguated glosses that are transformed into logic form for use during the course of a proof.
Gloss: Kill is to cause to die
GLF: kill_VB_1(e1,x1,x2) -> cause_VB_1(e1,x1,x3) & to_TO(e1,e2) & die_VB_1(e2,x2,x4)
Logic Prover
Axiom Selection
Lexical chains and the XWN knowledge base work together to select and generate the axioms needed for a successful proof when not all of the question keywords are found in the answer.
Question: How did Adolf Hitler die?
Answer: … Adolf Hitler committed suicide …
The following Lexical Chain is detected: ( n - suicide#1, self-destruction#1, self-annihilation#1 ) GLOSS ( v - kill#1 ) GLOSS ( v - die#1, decease#1, perish#1, go#17, exit#3, pass_away#1, expire#2, pass#25 )
The following axioms are loaded into the Usable List of the Prover: exists x2 all e1 x1 (suicide_nn(x1) -> act_nn(x1) & of_in(x1,e1) & kill_vb(e1,x2,x2)).
exists x3 x4 all e2 x1 x2 (kill_vb(e2,x1,x2) -> cause_vb_2(e1,x1,x3) & to_to(e1,e2) & die_vb(e2,x2,x4)).
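The effect of loading these axioms can be illustrated with a toy forward-chaining step. The real prover works by refutation over the full logic forms; this sketch only shows the two lexical-chain axioms (suicide implies kill, kill implies die) firing.

```python
def prove_die(facts):
    """Close a fact set under two hand-coded lexical-chain axioms:
    suicide(x) -> kill(x, x)   and   kill(a, b) -> die(b).
    Returns the set of individuals proved to have died."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for pred, *args in list(facts):
            if pred == "suicide" and ("kill", args[0], args[0]) not in facts:
                facts.add(("kill", args[0], args[0]))
                changed = True
            if pred == "kill" and ("die", args[1]) not in facts:
                facts.add(("die", args[1]))
                changed = True
    return {args[0] for pred, *args in facts if pred == "die"}

# "Adolf Hitler committed suicide" justifies "How did Adolf Hitler die?"
print(prove_die([("suicide", "Adolf Hitler")]))  # {'Adolf Hitler'}
```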
Outline Part III. Knowledge representation and
inference Representing the semantics of answers Extended WordNet and abductive inference Intentional Structure and Probabilistic
Metonymy An example of Event Structure Modeling relations, uncertainty and dynamics Inference methods and their mapping to
answer types
Intentional Structure of Questions
Example: Does x have y? (x = Iraq, y = biological weapons)
Predicate-argument structure: have/possess(Iraq, biological weapons), where Arg-0 = Iraq and Arg-1 = biological weapons
Question pattern: possess(x, y)
Intentional Structure:
Evidence: ??? Coercion
Means of Finding: ??? Coercion
Source: ??? Coercion
Consequence: ??? Coercion
Coercion of Pragmatic Knowledge
0*Evidence (1-possess (2-Iraq, 3-biological weapons))
A form of logical metonymy: Lapata and Lascarides (Computational Linguistics, 2003) allow coercion of interpretations by collecting possible meanings from large corpora.
Examples:
Mary finished the cigarette = Mary finished smoking the cigarette.
Arabic is a difficult language = Arabic is a language that is difficult to learn, or a language that is difficult to process automatically.
The Idea
Logical metonymy is in part processed as verbal metonymy. We model, after Lapata and Lascarides, the interpretation of verbal metonymy as p(e, o, v), where:
v is the metonymic verb (enjoy)
o is its object (the cigarette)
e is the sought-after interpretation (smoking)
A probabilistic model
By choosing the ordering <e, v, o>, the probability may be factored as:
P(e, o, v) = P(e) P(v | e) P(o | e, v)
where we make the estimations:
P^(e) = f(e) / N
P^(v | e) = f(v, e) / f(e)
P^(o | e, v) = f(o, e, v) / f(v, e), approximated for sparse data by P^(o | e) = f(o, e) / f(e)
which gives:
P(e, o, v) = f(v, e) f(o, e) / (N f(e))
This is a model of interpretation and coercion
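A small numeric sketch of this estimate, with invented co-occurrence counts: it picks the interpretation e maximizing f(v, e) f(o, e) / (N f(e)) for v = finish and o = cigarette.

```python
# Invented co-occurrence counts for illustration: f(e) = frequency of the
# interpretation verb e, f(v, e) = co-occurrence with the metonymic verb v,
# f(o, e) = co-occurrence with the object o. N = corpus size.
N = 1000
f_e  = {"smoke": 50, "buy": 80}
f_ve = {("finish", "smoke"): 10, ("finish", "buy"): 2}
f_oe = {("cigarette", "smoke"): 30, ("cigarette", "buy"): 8}

def p_interpretation(e, o, v):
    """P(e, o, v) estimated as f(v, e) * f(o, e) / (N * f(e))."""
    return f_ve.get((v, e), 0) * f_oe.get((o, e), 0) / (N * f_e[e])

# "Mary finished the cigarette": rank candidate interpretations of "finish".
best = max(f_e, key=lambda e: p_interpretation(e, "cigarette", "finish"))
print(best)  # smoke
```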
Coercions for intentional structures
0*Evidence (1-possess (2-Iraq, 3-biological weaponry))
P(e, 1, 3):  1. v = discover(1, 2, 3)  2. v = stockpile(2, 3)  3. v = use(2, 3)  4. v = 0(1, 2, 3)
P(e, 2, 3):  1. e = develop(_, 3)  2. e = acquire(_, 3)
P(e, 0, 1, v):  1. e = inspections(_, 2, 3)  2. e = ban(_, 2, from 3)
Topic Coercion:  P(e, 3, topic)
Outline Part III. Knowledge representation and
inference Representing the semantics of answers Extended WordNet and abductive inference Intentional Structure and Probabilistic
Metonymy An example of Event Structure Modeling relations, uncertainty and dynamics Inference methods and their mapping to
answer types
ANSWER: Evidence-Combined: Pointer to Text Source:
A1: In recent months, Milton Leitenberg, an expert on biological weapons, has been looking at this murkiest and most dangerous corner of Saddam Hussein's armory.
A2: He says a series of reports add up to indications that Iraq may be trying to develop a new viral agent, possibly in underground laboratories at a military complex near Baghdad where Iraqis first chased away inspectors six years ago.
A3: A new assessment by the United Nations suggests Iraq still has chemical and biological weapons - as well as the rockets to deliver them to targets in other countries.
A4: The UN document says Iraq may have hidden a number of Scud missiles, as well as launchers and stocks of fuel.
A5: US intelligence believes Iraq still has stockpiles of chemical and biological weapons and guided missiles, which it hid from the UN inspectors.
Content: Biological Weapons Program:
develop(Iraq, Viral_Agent(instance_of: new))
Justification: POSSESSION Schema; Previous (Intent and Ability): Prevent(ability, Inspection); Inspection terminated; Status: Attempt ongoing; Likelihood: Medium; Confirmability: difficult, obtuse, hidden
possess(Iraq, Chemical and Biological Weapons)
Justification: POSSESSION Schema; Previous (Intent and Ability): Prevent(ability, Inspection); Status: Hidden from Inspectors; Likelihood: Medium
possess(Iraq, delivery systems(type: rockets; target: other countries))
Justification: POSSESSION Schema; Previous (Intent and Ability): Hidden from Inspectors; Status: Ongoing; Likelihood: Medium
Answer
Structure
Answer Structure: Temporal Reference/Grounding
ANSWER: Evidence-Combined (A1-A5 as above)
Content: Biological Weapons Program:
possess(Iraq, fuel stock(purpose: power launchers)) Justification: POSSESSION Schema Previous (Intent and Ability): Hidden from Inspectors; Status: Ongoing Likelihood: Medium
possess(Iraq, delivery systems(type : scud missiles; launchers; target: other countries)) Justification: POSSESSION Schema Previous (Intent and Ability): Hidden from Inspectors; Status: Ongoing Likelihood: Medium
Answer Structure
(continued)
hide(Iraq, Seeker: UN Inspectors; Hidden: CBW stockpiles & guided missiles) Justification: DETECTION Schema Inspection status: Past; Likelihood: Medium
Present Progressive Perfect
Present Progressive Continuing
Answer Structure: Uncertainty and Belief
Answer Structure: Uncertainty and Belief; Multiple Sources with reliability
Event Structure Metaphor
Event Structure for semantically based QA
Reasoning about dynamics
Complex event structure
• Multiple stages, interruptions, resources, framing
Evolving events
• Conditional events, presuppositions
Nested temporal and aspectual references
• Past, future event references
Metaphoric references
• Use of motion domain to describe complex events
Reasoning with Uncertainty
• Combining evidence from multiple, unreliable sources
Non-monotonic inference
• Retracting previous assertions
• Conditioning on partial evidence
Relevant Previous Work
Event Structure: Aspect (VDT, TimeML), Situation Calculus (Steedman), Frame Semantics (Fillmore), Cognitive Linguistics (Langacker, Talmy), Metaphor and Aspect (Narayanan)
Reasoning about Uncertainty: Bayes Nets (Pearl), Probabilistic Relational Models (Koller), Graphical Models (Jordan)
Reasoning about Dynamics: Dynamic Bayes Nets (Murphy), Distributed Systems (Alur, Meseguer), Control Theory (Ramadge and Wonham), Causality (Pearl)
Outline Part III. Knowledge representation and
inference Representing the semantics of answers Extended WordNet and abductive inference Intentional Structure and Probabilistic
Metonymy An example of Event Structure Modeling relations, uncertainty and dynamics Inference methods and their mapping to
answer types
Structured Probabilistic Inference
Probabilistic inference
Filtering
• P(X_t | o_1..t, X_1..t)
• Update the state based on the observation sequence and state set
MAP Estimation
• argmax_{h_1..h_n} P(X_t | o_1..t, X_1..t)
• Return the best assignment of values to the hypothesis variables given the observations and states
Smoothing
• P(X_{t-k} | o_1..t, X_1..t)
• Modify assumptions about previous states, given the observation sequence and state set
Projection/Prediction/Reachability
• P(X_{t+k} | o_1..t, X_1..t)
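Filtering, the first of these operations, can be sketched for a two-state model. The states, transition probabilities, and observation model below are hypothetical, chosen to echo the weapons-program example.

```python
def filter_step(belief, transition, emission, obs):
    """One step of exact filtering P(X_t | o_1..t): propagate the belief
    through the transition model, weight by the observation likelihood,
    then renormalize."""
    states = list(belief)
    predicted = {s2: sum(belief[s1] * transition[s1][s2] for s1 in states)
                 for s2 in states}
    unnorm = {s: predicted[s] * emission[s][obs] for s in states}
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()}

# Hypothetical two-state model: is a weapons program Active or Dormant?
T = {"active": {"active": 0.9, "dormant": 0.1},
     "dormant": {"active": 0.2, "dormant": 0.8}}
E = {"active": {"report": 0.7, "silence": 0.3},
     "dormant": {"report": 0.1, "silence": 0.9}}
belief = {"active": 0.5, "dormant": 0.5}
for obs in ["report", "report"]:       # two consecutive intelligence reports
    belief = filter_step(belief, T, E, obs)
print(round(belief["active"], 3))  # 0.971
```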
Answer Type to Inference Method
ANSWER TYPE               INFERENCE                DESCRIPTION
Justify (Proposition)     MAP                      Proposition is part of the MAP
Ability (Agent, Act)      Filtering; Smoothing     Past/current action enabled given current state
Prediction (State)        Projection; MAP          Propagate current information and estimate best new state
Hypothetical (Condition)  Smoothing; Intervention  Intervene and compute state
Outline Part IV. From Ontologies to Inference
From OWL to CPRM FrameNet in OWL FrameNet to CPRM mapping
Semantic Web The World Wide Web (WWW) contains a large
and expanding information base. HTML is accessible to humans but does not
formally describe data in a machine interpretable form.
XML remedies this by allowing for the use of tags to describe data (ex. disambiguating crawl)
Ontologies are useful to describe objects and their inter-relationships.
DAML+OIL (http://www.daml.org) is a markup language based on XML and RDF that is grounded in description logic and is designed to allow for ontology development, transfer, and use on the web.
Programmatic Access to the web
Web-accessible programs and devices
Knowledge Rep’n for the “Semantic Web”
XML Schema RDF (Resource Description Framework)
RDFS (RDF Schema)
OWL/DAML-L (Logic)
OWL (Ontology)
XML (Extensible Markup Language)
Knowledge Rep’n for “Semantic Web Services”
XML Schema RDF (Resource Description Framework)
RDFS (RDF Schema)
DAML-L (Logic)
DAML+OIL (Ontology)
XML (Extensible Markup Language)
DAML-S (Services)
DAML-S: Semantic Markup for Web Services
DAML-S: A DARPA Agent Markup Language for Services • DAML+OIL ontology for Web services:
• well-defined semantics• ontologies support reuse, mapping, succinct markup, ...
• Developed by a coalition of researchers from Stanford, SRI, CMU, BBN, Nokia, and Yale, under the auspices of DARPA.
• DAML-S version 0.6 posted October 2001: http://www.daml.org/services/daml-s [DAML-S Coalition, 2001, 2002]
[Narayanan & McIlraith 2003]
DAML-S/OWL-S Compositional Primitives
process
• atomic process
• composite process, composedBy control constructs: sequence, while, if-then-else, fork, ...
• inputs, (conditional) outputs, preconditions, (conditional) effects
PROCESS.OWL
The OWL-S Process Description
Implementation
DAML-S translation to the modeling environment KarmaSIM [Narayanan, 97] (http://www.icsi.berkeley.edu/~snarayan)
Basic Program:
Input: DAML-S description of events
Output: network description of events in KarmaSIM
Procedure:
• Recursively construct a sub-network for each control construct, bottoming out at atomic events
• Construct a net for each atomic event
• Return the network
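The recursive procedure can be sketched as follows. The event encoding ((construct, parts) tuples for composites, strings for atomic processes) and the flat network representation are both hypothetical simplifications of the KarmaSIM translation.

```python
def to_network(event):
    """Recursively translate an event description into a flat network
    sketch: composite processes recurse over their control construct and
    parts; atomic processes bottom out as single nodes."""
    if isinstance(event, str):                    # atomic process
        return {"nodes": [event], "construct": None}
    construct, parts = event                      # composite process
    subnets = [to_network(part) for part in parts]
    return {"nodes": [n for s in subnets for n in s["nodes"]],
            "construct": construct}

# Hypothetical composite process built from DAML-S control constructs:
plan = ("sequence", ["acquire_precursors",
                     ("if-then-else", ["inspected", "hide", "produce"])])
net = to_network(plan)
print(net["construct"], len(net["nodes"]))  # sequence 4
```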
Example of A WMD Ontology in OWL
<rdfs:Class rdf:ID="DevelopingWeaponOfMassDestruction">
  <rdfs:subClassOf rdf:resource="SUMO.owl#Making"/>
  <rdfs:comment>
    Making instances of WeaponOfMassDestruction.
  </rdfs:comment>
</rdfs:Class>
http://reliant.teknowledge.com/DAML/SUMO.owl
Outline Part IV. From Ontologies to Inference
From OWL to CPRM FrameNet in OWL FrameNet to CPRM mapping
The FrameNet Project
C. Fillmore, PI (ICSI)
Co-PIs: S. Narayanan (ICSI, SRI), D. Jurafsky (U Colorado), J. M. Gawron (San Diego State U)
Staff: C. Baker, Project Manager; B. Cronin, Programmer; C. Wooters, Database Designer
Frames and Understanding Hypothesis: People understand
things by performing mental operations on what they already know. Such knowledge is describable in terms of information packets called frames.
FrameNet in the Larger Context The long-term goal is to reason about the
world in a way that humans understand and agree with.
Such a system requires a knowledge representation that includes the level of frames.
FrameNet can provide such knowledge for a number of domains.
FrameNet representations complement ontologies and lexicons.
The core work of FrameNet
1. characterize frames
2. find words that fit the frames
3. develop descriptive terminology
4. extract sample sentences
5. annotate selected examples
6. derive "valence" descriptions
The Core Data
The basic data on which FrameNet descriptions are based take the form of a collection of annotated sentences, each coded for the combinatorial properties of one word in it. The annotation is done manually, but several steps are computer-assisted.
Types of Words / Frames
o events
o artifacts, built objects
o natural kinds, parts and aggregates
o terrain features
o institutions, belief systems, practices
o space, time, location, motion
o etc.
Event Frames
Event frames have temporal structure, and generally have constraints on what precedes them, what happens during them, and what state the world is in once the event has been completed.
Sample Event Frame: Commercial Transaction
Initial state:
Vendor has Goods, wants Money
Customer wants Goods, has Money
Transition:
Vendor transmits Goods to Customer
Customer transmits Money to Vendor
Final state:
Vendor has Money
Customer has Goods
(It’s a bit more complicated than that.)
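The initial state / transition / final state reading of the frame can be executed directly. In this sketch, participants are plain dicts of possessions, which is of course a simplification of the frame semantics.

```python
def run_commercial_transaction(vendor, customer, goods, money):
    """Execute the frame's transition and check the final state.
    Initial state: Vendor has Goods, Customer has Money."""
    assert goods in vendor["has"] and money in customer["has"]
    vendor["has"].remove(goods); customer["has"].append(goods)   # Goods move
    customer["has"].remove(money); vendor["has"].append(money)   # Money moves
    # Final state: Vendor has Money, Customer has Goods
    return money in vendor["has"] and goods in customer["has"]

vendor = {"has": ["carrots"]}
customer = {"has": ["dollar"]}
ok = run_commercial_transaction(vendor, customer, "carrots", "dollar")
print(ok)  # True
```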
Partial Wordlist for Commercial Transactions
Verbs: pay, spend, cost, buy, sell, charge
Nouns: cost, price, payment
Adjectives: expensive, cheap
Meaning and Syntax
The various verbs that evoke this frame introduce the elements of the frame (the identities of the buyer, seller, goods, and money) in different ways. Information expressed in sentences containing these verbs occurs in different places in the sentence, depending on the verb.
Valence diagrams (summarized from the figures):

• BUY: Customer (subject) buys Goods from Vendor for Money.
  "She bought some carrots from the greengrocer for a dollar."
• PAY: Customer (subject) pays Money to Vendor for Goods.
  "She paid a dollar to the greengrocer for some carrots."
• PAY (ditransitive): Customer (subject) pays Vendor Money for Goods.
  "She paid the greengrocer a dollar for the carrots."
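These valence patterns can be written down as data. A minimal sketch: each pattern lists, for each grammatical slot, the frame element it realizes (prepositions mark oblique slots). The tuple representation is an assumption of this sketch, not FrameNet's actual format.

```python
# Hedged sketch of the three valence patterns above as data.
# (grammatical slot, frame element) pairs; not FrameNet's real format.
VALENCE = {
    "buy": [("subject", "Customer"), ("object", "Goods"),
            ("from", "Vendor"), ("for", "Money")],
    "pay": [("subject", "Customer"), ("object", "Money"),
            ("to", "Vendor"), ("for", "Goods")],
    "pay-ditransitive": [("subject", "Customer"), ("object1", "Vendor"),
                         ("object2", "Money"), ("for", "Goods")],
}

def roles_realized(verb):
    """Return the set of frame elements a verb's pattern expresses."""
    return {fe for _, fe in VALENCE[verb]}

# Every pattern realizes all four Commercial Transaction frame elements,
# just in different syntactic positions.
```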
FrameNet Product

For every target word, describe the frames or conceptual structures which underlie it, and annotate example sentences that cover the ways in which information from the associated frames is expressed in these sentences.
Complex Frames

With Criminal_process we have, for example, sub-frame relations (one frame is a component of a larger, more abstract frame) and temporal relations (one process precedes another).
FrameNet Entities and Relations

• Frames: Background, Lexical
• Frame Elements (Roles)
• Binding Constraints: Identify
• ISA(x:Frame, y:Frame)
• SubframeOf(x:Frame, y:Frame)
• Subframe Ordering: precedes
• Annotation
A DAML+OIL Frame Class

<daml:Class rdf:ID="Frame">
  <rdfs:comment> The most general class </rdfs:comment>
  <daml:unionOf rdf:parseType="daml:collection">
    <daml:Class rdf:about="#BackgroundFrame"/>
    <daml:Class rdf:about="#LexicalFrame"/>
  </daml:unionOf>
</daml:Class>

<daml:ObjectProperty rdf:ID="Name">
  <rdfs:domain rdf:resource="#Frame"/>
  <rdfs:range rdf:resource="&rdf-schema;#Literal"/>
</daml:ObjectProperty>
DAML+OIL Frame Element

<daml:ObjectProperty rdf:ID="role">
  <rdfs:domain rdf:resource="#Frame"/>
  <rdfs:range rdf:resource="&daml;#Thing"/>
</daml:ObjectProperty>

<daml:ObjectProperty rdf:ID="frameElement">
  <daml:samePropertyAs rdf:resource="#role"/>
</daml:ObjectProperty>

<daml:ObjectProperty rdf:ID="FE">
  <daml:samePropertyAs rdf:resource="#role"/>
</daml:ObjectProperty>
FE Binding Relation

<daml:ObjectProperty rdf:ID="bindingRelation">
  <rdfs:comment> See http://www.daml.org/services </rdfs:comment>
  <rdfs:domain rdf:resource="#Role"/>
  <rdfs:range rdf:resource="#Role"/>
</daml:ObjectProperty>

<daml:ObjectProperty rdf:ID="identify">
  <rdfs:subPropertyOf rdf:resource="#bindingRelation"/>
  <rdfs:domain rdf:resource="#Role"/>
  <daml-s:sameValuesAs rdf:resource="#rdfs:range"/>
</daml:ObjectProperty>
Subframes and Ordering
<daml:ObjectProperty rdf:ID="subFrameOf">
  <rdfs:domain rdf:resource="#Frame"/>
  <rdfs:range rdf:resource="#Frame"/>
</daml:ObjectProperty>

<daml:ObjectProperty rdf:ID="precedes">
  <rdfs:domain rdf:resource="#Frame"/>
  <rdfs:range rdf:resource="#Frame"/>
</daml:ObjectProperty>
The Criminal Process Frame
Frame Element   Description
Court           The court where the process takes place
Defendant       The charged individual
Judge           The presiding judge
Prosecution     The attorneys prosecuting the defendant
Defense         The attorneys defending the defendant
The Criminal Process Frame in DAML+OIL
<daml:Class rdf:ID="CriminalProcess">
  <rdfs:subClassOf rdf:resource="#BackgroundFrame"/>
</daml:Class>

<daml:Class rdf:ID="CP">
  <daml:sameClassAs rdf:resource="#CriminalProcess"/>
</daml:Class>
DAML+OIL Representation of the Criminal Process Frame Elements
<daml:ObjectProperty rdf:ID="court">
  <rdfs:subPropertyOf rdf:resource="#FE"/>
  <rdfs:domain rdf:resource="#CriminalProcess"/>
  <rdfs:range rdf:resource="&CYC;#Court-Judicial"/>
</daml:ObjectProperty>

<daml:ObjectProperty rdf:ID="defense">
  <rdfs:subPropertyOf rdf:resource="#FE"/>
  <rdfs:domain rdf:resource="#CriminalProcess"/>
  <rdfs:range rdf:resource="&SRI-IE;#Lawyer"/>
</daml:ObjectProperty>
FE Binding Constraints
<daml:ObjectProperty rdf:ID="prosecutionConstraint">
  <rdfs:subPropertyOf rdf:resource="#identify"/>
  <rdfs:domain rdf:resource="#CP.prosecution"/>
  <daml-s:sameValuesAs rdf:resource="#Trial.prosecution"/>
</daml:ObjectProperty>

• The identification constraints can hold between frame FEs and subframe FEs, or between subframe FEs.
• DAML does not support the dot notation for paths.
Criminal Process Subframes

<daml:Class rdf:ID="Arrest">
  <rdfs:comment> A subframe </rdfs:comment>
  <rdfs:subClassOf rdf:resource="#LexicalFrame"/>
</daml:Class>

<daml:Class rdf:ID="Arraignment">
  <rdfs:comment> A subframe </rdfs:comment>
  <rdfs:subClassOf rdf:resource="#LexicalFrame"/>
</daml:Class>

<daml:ObjectProperty rdf:ID="arraignSubFrame">
  <rdfs:subPropertyOf rdf:resource="#subFrameOf"/>
  <rdfs:domain rdf:resource="#CP"/>
  <rdfs:range rdf:resource="#Arraignment"/>
</daml:ObjectProperty>
Specifying Subframe Ordering
<daml:Class rdf:about="#Arrest">
  <rdfs:subClassOf>
    <daml:Restriction>
      <daml:onProperty rdf:resource="#precedes"/>
      <daml:hasClass rdf:resource="#Arraignment"/>
    </daml:Restriction>
  </rdfs:subClassOf>
</daml:Class>
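The effect of this ordering restriction can be checked outside DAML+OIL. A minimal Python sketch, assuming the `precedes` relation is given as a set of (earlier, later) pairs; `Trial` is added here only to illustrate transitivity:

```python
# Illustrative sketch (not DAML+OIL tooling): encode the subframe ordering
# as a 'precedes' relation and check an observed event sequence against it.
PRECEDES = {("Arrest", "Arraignment"), ("Arraignment", "Trial")}

def transitive_closure(pairs):
    """Smallest transitively closed superset of the given pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def consistent(sequence, order=PRECEDES):
    """True if no event occurs before one that must precede it."""
    closure = transitive_closure(order)
    pos = {ev: i for i, ev in enumerate(sequence)}
    return all(pos[a] < pos[b] for a, b in closure
               if a in pos and b in pos)
```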
DAML+OIL CP Annotations

<fn:Annotation>
  <tpos> "36352897" </tpos>
  <frame rdf:about="&fn;Arrest">
    <time> In July last year </time>
    <authorities> a German border guard </authorities>
    <target> apprehended </target>
    <suspect> two Irishmen with Kalashnikov assault rifles. </suspect>
  </frame>
</fn:Annotation>
Outline

Part IV. From Ontologies to Inference
• From OWL to CPRM
• FrameNet in OWL
• FrameNet to CPRM mapping
Representing Event Frames

At the computational level, we use a structured event representation of event frames that formally specifies:
• The frame
• Frame Elements and filler types
• Constraints and role bindings
• Frame-to-Frame relations
  • Subcase
  • Subevent
Events and actions

schema Event
  roles
    before : Phase
    transition : Phase
    after : Phase
    nucleus
  constraints
    transition :: nucleus

schema Action
  evokes Event as e
  roles
    actor : Entity
    undergoer : Entity
    self ↔ e.nucleus

(Diagram: phases before → transition → after, with the nucleus inside the transition; Action contributes the actor and undergoer roles.)
The Commercial-Transaction schema

schema Commercial-Transaction
  subcase of Exchange
  roles
    customer ↔ participant1
    vendor ↔ participant2
    money ↔ entity1 : Money
    goods ↔ entity2
    goods-transfer ↔ transfer1
    money-transfer ↔ transfer2
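The subcase relation can be read as a set of role bindings onto the parent Exchange schema. A minimal sketch in Python; the plain-dict representation is an assumption, not the actual schema implementation:

```python
# Illustrative rendering of the subcase relation: Commercial-Transaction
# inherits Exchange's roles and binds local names to them.
EXCHANGE_ROLES = ["participant1", "participant2", "entity1", "entity2",
                  "transfer1", "transfer2"]

COMMERCIAL_TRANSACTION = {
    "subcase_of": "Exchange",
    "bindings": {              # local role -> inherited Exchange role
        "customer": "participant1",
        "vendor": "participant2",
        "money": "entity1",    # additionally typed as Money
        "goods": "entity2",
        "goods-transfer": "transfer1",
        "money-transfer": "transfer2",
    },
}

def inherited_role(local_role):
    """Which Exchange role does a local Commercial-Transaction role bind?"""
    return COMMERCIAL_TRANSACTION["bindings"][local_role]
```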
Implementation

DAML-S translation to the modeling environment KarmaSIM [Narayanan, 97] (http://www.icsi.berkeley.edu/~snarayan)

Basic Program:
Input: DAML-S description of Frame relations
Output: Network description of Frames in KarmaSIM
Procedure:
• Recursively construct a sub-network for each control construct; bottom out at atomic frames
• Construct a net for each atomic frame
• Return the network
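The procedure above can be sketched as a recursive function, under the assumption that the DAML-S description arrives as nested (construct, children) tuples; `make_atomic_net` is a stand-in for KarmaSIM's actual network constructor:

```python
# Sketch of the recursive translation procedure. The nested-tuple input
# format and make_atomic_net are assumptions, not KarmaSIM's real API.

def make_atomic_net(frame_name):
    """Stand-in for constructing a KarmaSIM net for one atomic frame."""
    return {"type": "atomic", "frame": frame_name}

def build_network(description):
    """Recursively construct a sub-network for each control construct,
    bottoming out at atomic frames."""
    if isinstance(description, str):       # atomic frame
        return make_atomic_net(description)
    construct, children = description      # e.g. ("sequence", [...])
    return {"type": construct,
            "subnets": [build_network(c) for c in children]}

net = build_network(("sequence", ["Arrest",
                                  ("choice", ["Arraignment", "Trial"])]))
```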
Outline

Part V. Results of Event Structure Inference for QA
• AnswerBank
• Current results for Inference Type
• Current results for Answer Structure
AnswerBank

• AnswerBank is a collection of over 1,200 QA annotations from the AQUAINT CNS corpus.
• Questions and answers cover the different domains of the CNS data.
• Questions and answers are POS tagged and syntactically parsed.
• Question and answer predicates are annotated with PropBank arguments and FrameNet tags (when available).
• FrameNet is annotating CNS data with frame information for use by the AQUAINT QA community.
• We are planning to add more semantic information, including temporal and aspectual information (TimeML+) and information about event relations and figurative uses.
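A single annotation might look like the following record. The field names and the FrameNet frame chosen are hypothetical illustrations of the layers described above, not the actual AnswerBank schema:

```python
# Hypothetical AnswerBank-style record: the keys, the POS excerpt, and the
# FrameNet frame are illustrative assumptions, not the real annotation format.
annotation = {
    "question": "How can a biological weapons program be detected?",
    "answer_type": "Ability",
    "pos_tags": [("How", "WRB"), ("can", "MD"), ("a", "DT")],  # excerpt
    "propbank": {
        "predicate": "detect",
        "ARG1": "a biological weapons program",
    },
    "framenet": {                      # hypothetical frame assignment
        "frame": "Becoming_aware",
        "Phenomenon": "a biological weapons program",
    },
}
```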
Event Simulation

(Architecture diagram: Retrieved Documents feed Predicate Extraction; FrameNet Frames and OWL/OWL-S Topic Ontologies feed Model Parameterization; the resulting PRM sits in the CONTEXT. Messages passed: <Pred(args), Topic Model, Answer Type>, <PRM Update>, <Simulation Triggering>.)
Answer Types for complex questions in AnswerBank

ANSWER TYPE               EXAMPLE                                                                              NUMBER
Justify (Proposition)     What is the evidence that IRAQ has WMD?                                              89
Ability (Agent, Act)      How can a Biological Weapons Program be detected?                                    71
Prediction (State)        What were the possible ramifications of India's launch of the Prithvi missile?       63
Hypothetical (Condition)  If Musharraf is removed from power, will Pakistan become a militant Islamic State?   62
Answer Type to Inference Method

ANSWER TYPE               INFERENCE             DESCRIPTION
Justify (Proposition)     MAP                   Proposition is part of the MAP
Ability (Agent, Act)      Filtering; Smoothing  Past/current action enabled given current state
Prediction (State)        P;R' MAP              Propagate current information and estimate the best new state
Hypothetical (Condition)  S, R_I                Smooth, intervene, and compute state
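The mapping from answer type to inference method can be sketched as a dispatch table. The inference functions below are stubs standing in for real PRM filtering, smoothing, and MAP routines; the table itself mirrors the one above:

```python
# Illustrative dispatch from answer type to inference operations.
# The three functions are stubs, not actual probabilistic inference.

def map_inference(q):
    return f"MAP({q})"

def filtering(q):
    return f"filter({q})"

def smoothing(q):
    return f"smooth({q})"

INFERENCE_FOR_TYPE = {
    "Justify": [map_inference],                # proposition is part of the MAP
    "Ability": [filtering, smoothing],         # action enabled in current state
    "Prediction": [filtering, map_inference],  # propagate, then estimate state
    "Hypothetical": [smoothing],               # smooth, intervene, compute
}

def infer(answer_type, question):
    """Run the inference operations registered for an answer type."""
    return [op(question) for op in INFERENCE_FOR_TYPE[answer_type]]
```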
Outline

Part V. Results of Event Structure Inference for QA
• AnswerBank
• Current results for Inference Type
• Current results for Answer Structure
AnswerBank Data

• We used 80 QA annotations from AnswerBank.
• Questions were of the four complex types: Justification, Ability, Prediction, Hypothetical.
• Answers were combined from multiple sentences (average 4.3) and multiple annotations (average 2.1).
• CNS domains covered were WMD-related (54%), nuclear theft (25%), and India's missile program (21%).
Building Models

• Gold standard: from the hand-annotated data in the CNS corpus, we manually built CPRM domain models for inference.
• Semantic Web based: from FrameNet frames and from Semantic Web ontologies in OWL (SUMO-based, OpenCYC, and others), we built CPRM models (semi-automatic).
Percent correct by inference type

(Bar chart; y-axis: % correct compared to the gold standard, scale 50-90.)

Inference type   OWL-based Domain Model   Manually generated from CNS data
Justification    66                       87
Prediction       66                       83
Ability          51                       73
Hypothetical     63                       83
Event Structure Inferences

For the annotations, we classified complex event structure inferences as:
• Aspectual: stages of events, viewpoints, temporal relations (such as start(ev1, ev2), interrupt(ev1, ev2))
• Action-based: resources (produce, consume, lock), preconditions, maintenance conditions, effects
• Metaphoric: the Event Structure Metaphor (ESM), mapping events and predications (Motion => Action), objects (Motion.Mover => Action.Actor), and parameters (Motion.speed => Action.rateOfProgress)
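The ESM mappings listed above can be sketched as a substitution table. A minimal Python rendering; the string-rewriting approach is an assumption of this sketch, not the actual metaphor-inference implementation:

```python
# Illustrative Event Structure Metaphor table: source-domain (Motion)
# predications, objects, and parameters map to target-domain (Action) ones.
ESM = {
    "Motion": "Action",
    "Motion.Mover": "Action.Actor",
    "Motion.speed": "Action.rateOfProgress",
}

def project(source_expr):
    """Rewrite a source-domain expression into the target domain.
    Longest keys are substituted first so 'Motion.Mover' is not
    clobbered by the bare 'Motion' mapping."""
    for src, tgt in sorted(ESM.items(), key=lambda kv: -len(kv[0])):
        source_expr = source_expr.replace(src, tgt)
    return source_expr
```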
ANSWER: Evidence-Combined: Pointer to Text Source:
A1: In recent months, Milton Leitenberg, an expert on biological weapons, has been looking at this murkiest and most dangerous corner of Saddam Hussein's armory.
A2: He says a series of reports add up to indications that Iraq may be trying to develop a new viral agent, possibly in underground laboratories at a military complex near Baghdad where Iraqis first chased away inspectors six years ago.
A3: A new assessment by the United Nations suggests Iraq still has chemical and biological weapons - as well as the rockets to deliver them to targets in other countries.
A4: The UN document says Iraq may have hidden a number of Scud missiles, as well as launchers and stocks of fuel.
A5: US intelligence believes Iraq still has stockpiles of chemical and biological weapons and guided missiles, which it hid from the UN inspectors.
Content: Biological Weapons Program:

develop(Iraq, Viral_Agent(instance_of: new))
  Justification: POSSESSION Schema
  Previous (Intent and Ability): Prevent(ability, Inspection); Inspection terminated
  Status: Attempt ongoing
  Likelihood: Medium
  Confirmability: difficult, obtuse, hidden

possess(Iraq, Chemical and Biological Weapons)
  Justification: POSSESSION Schema
  Previous (Intent and Ability): Prevent(ability, Inspection)
  Status: Hidden from Inspectors
  Likelihood: Medium

possess(Iraq, delivery systems(type: rockets; target: other countries))
  Justification: POSSESSION Schema
  Previous (Intent and Ability): Hidden from Inspectors
  Status: Ongoing
  Likelihood: Medium
Answer Structure

Content of Inferences

Component       Number   F-Score (Manual)   F-Score (OWL)
Aspectual       375      .74                .65
Action-Feature  459      .62                .45
Metaphor        149      .70                .62
Conclusion

• Answering complex questions requires semantic representations at multiple levels:
  • NE- and extraction-based representations
  • Predicate Argument Structures
  • Frame, Topic, and Domain Models
• All these representations should be capable of supporting inference about relational structures, uncertain information, and dynamic context.
• Both semantic extraction techniques and structured probabilistic KR and inference methods have matured to the point that we understand the various algorithms and their properties.
• Flexible architectures that embody these KR and inference techniques and make use of the expanding linguistic and ontological resources (such as those on the Semantic Web) point the way to the future of semantically based QA systems!
References (URLs)

Semantic Resources
• FrameNet: http://www.icsi.berkeley.edu/framenet (papers on FrameNet and computational modeling efforts using FrameNet can be found here)
• PropBank: http://www.cis.upenn.edu/~ace/
• Gildea's Verb Index: http://www.cs.rochester.edu/~gildea/Verbs/ (links FrameNet, PropBank, and VerbNet)

Probabilistic KR (PRM)
• http://robotics.stanford.edu/~koller/papers/lprm.ps (Learning PRM)
• http://www.eecs.harvard.edu/~avi/Papers/thesis.ps.gz (Avi Pfeffer's PRM Stanford thesis)

Dynamic Bayes Nets
• http://www.ai.mit.edu/~murphyk/Thesis/thesis.pdf (Kevin Murphy's Berkeley DBN thesis)

Event Structure in Language
• http://www.icsi.berkeley.edu/~snarayan/thesis.pdf (Narayanan's Berkeley PhD thesis on models of metaphor and aspect)
• ftp://ftp.cis.upenn.edu/pub/steedman/temporality/temporality.ps.gz (Steedman's article on Temporality, with links to previous work on aspect)
• http://www.icsi.berkeley.edu/NTL (publications on Cognitive Linguistics and computational models of cognitive linguistic phenomena)