Surface Realization
Mahalingam.P.R
Semester III, M.Tech CSESIS, RSET
Jan 17, 2015
Agenda
Introduction
Systemic Grammar
Interpersonal meta-function
Ideational meta-function
Textual meta-function
Functional Unification Grammar
Functional Description
Conclusion
Introduction
Discourse Plan
The discourse plan is generated by the discourse planner, taking into consideration the communicative goal and the available knowledge base.
The content is structured appropriately.
The discourse plan defines:
• Choices made for the entire communication (may span multiple sentences)
• Annotations (hypertext, figures, etc.)
The surface realizer receives the fully specified discourse plan.
It generates individual sentences, constrained by the lexical and grammatical resources.
These resources define the realizer’s potential range of output.
If the plan specifies multiple-sentence output, the surface realizer is called multiple times.
So, the surface realization component produces an ordered sequence of words, as constrained by the lexicon and grammar.
Input
Sentence-sized chunks of the discourse specification
Influential approaches for surface realization
Systemic Grammar
Functional Unification Grammar
No general consensus as to the level at which the
input to the surface realizer should be specified.
Some approaches specify only the propositional
content.
What does it do?
Derive a human readable sentence from a discourse
plan.
The discourse plan does not give syntax, only functional information. The surface realizer adds syntactic information and ensures that the sentence complies with lexical and grammatical constraints.
What doesn’t it do?
Will not verify the correctness of the data provided by the discourse planner, or that the information makes sense.
Does not deal with more than one sentence at a
time. If the plan calls for many sentences, the
surface realizer will be called once for each sentence
required.
Simple Surface Realization Tools
Canned Text Systems
- Take a given input and match it directly to a pre-made sentence.
- Commonly used in simple systems, such as error messages or warnings.
- Have no flexibility whatsoever.
Template Systems
- A template is a pre-made sentence with blanks that are filled in from the input.
- These systems work well for form letters and slightly more advanced error or warning messages.
- They are still very inflexible, but better than canned text systems.
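The difference between the two simple approaches can be sketched in a few lines of Python. This is only an illustration: the message codes, the sentences in `CANNED`, and the `SAVE_TEMPLATE` wording are made up for the example and do not come from any particular system.

```python
from string import Template

# Canned text: each input code maps directly to a fixed, pre-made sentence.
# (Codes and sentences are hypothetical.)
CANNED = {
    "disk_full": "The disk is full.",
    "no_network": "The network is unavailable.",
}


def canned_text(code: str) -> str:
    """Return the pre-made sentence for a message code."""
    return CANNED[code]


# Template: a pre-made sentence whose named blanks are filled from the input.
SAVE_TEMPLATE = Template("The $actor will save the $goal.")


def template_text(actor: str, goal: str) -> str:
    """Fill the blanks of the (hypothetical) save template."""
    return SAVE_TEMPLATE.substitute(actor=actor, goal=goal)
```

Neither function can produce a passive variant or a different tense without a new canned sentence or a new template, which is the inflexibility noted above.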
The simple surface realization tools eventually gave
way to advanced Feature-based systems
Systemic Grammar: represents sentences as collections of functions. Rules map from functions to grammatical forms. (Halliday, 1985)
Functional Unification Grammar: represents sentences as feature structures that can be combined and altered to produce sentences. (Kay, 1979)
“The system will save the document”
The discourse plan
would specify a saving
action done by a system
entity to a document
entity.
Other approaches
Include the specification
of the grammatical form
In this case, a future
tense assertion
Specification of lexical
items
In this case, save,
system and document
The two approaches take input at different levels.
Common factor
Input is functionally specified, rather than syntactically
specified
This is typical of generation systems
Generation systems start with meaning and context
They specify the intended output in terms of function, rather than form.
“The system will save the document”
Can be stated in two ways
ACTIVE FORM
PASSIVE FORM
Discourse planners tend not to work with syntactic terms.
They are more likely to keep track of the focus or local topic of the discourse.
It is more natural to define this distinction in terms of focus.
Surface
Realization
Approaches
Systemic Grammar
Functional Unification
Grammar
If the document is the local topic of
discourse, it would be marked as
the focus which could trigger the
use of the passive.
“The document will be saved by
the system”
Both surface realization
approaches categorize grammar in
functional terms.
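The idea that focus, rather than a syntactic flag, selects between active and passive can be sketched as follows. This is a toy illustration, not how either grammar formalism works internally: the `focus` key, the `PARTICIPLE` table and the fixed future-tense word order are all assumptions made for the example.

```python
# Toy lexicon of past participles (illustration only).
PARTICIPLE = {"save": "saved"}


def choose_voice(spec: dict) -> str:
    """Choose passive when the goal is marked as the focus, else active."""
    return "passive" if spec.get("focus") == spec["goal"] else "active"


def realize_future(spec: dict) -> str:
    """Realize a future-tense clause from a functional specification."""
    actor, goal, verb = spec["actor"], spec["goal"], spec["process"]
    if choose_voice(spec) == "passive":
        words = ["the", goal, "will", "be", PARTICIPLE[verb], "by", "the", actor]
    else:
        words = ["the", actor, "will", verb, "the", goal]
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:]


plain = {"process": "save", "actor": "system", "goal": "document"}
focused = dict(plain, focus="document")
```

Here `realize_future(plain)` gives the active form and `realize_future(focused)` gives the passive form; the caller never mentions voice.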
Systemic Grammar
Systemic-Functional linguistics
A branch of linguistics
that views language as a
resource for expressing
meaning in context
-An Introduction to
Functional Grammar,
Halliday (1985)
A part of Systemic-Functional linguistics.
Represents sentences as collections of functions and maintains rules for mapping these functions onto explicit grammatical forms.
Well suited for generation
Widely influential in NLG
Systemic sentence analysis organizes the functions being expressed in multiple layers.
“The system will save the document”
Layers
Concepts of theme and
rheme were developed by
the Prague school of
linguistics
-Firbas, 1966
Thematic roles apply here
too, like AGENT,
EXPERIENCER,
INSTRUMENT, and so on.
Mood layer –simple declarative
structure
Subject
Finite (auxiliary)
Predicator (verb)
Object
Transitivity layer
Actor / Doer (system)
Process (saving)
Goal – the object being acted upon (document)
Theme layer
Theme
Rheme
The rheme is what the clause says about the theme (here, “will save the document”); it is distinct from the theme itself.
The three layers deal with different sets of functions.
Meta-functions
• Interpersonal meta-function: mood layer
• Ideational meta-function: transitivity layer
• Textual meta-function: theme layer
Interpersonal meta-function
Groups the functions that establish and maintain the interaction between the writer and the reader.
Represented by the mood layer
Determines whether the writer is
Commanding
Telling
Asking
For example, the writer may be telling the reader something or asking a question.
Ideational meta-function
Concerned with the propositional content of the expression.
Represented by the transitivity layer, which determines
The nature of the process being expressed
The variety of case roles that must be expressed
Covers much of the semantics.
In other words, it identifies who the actors are, what the goals are, and what type of process is being performed.
Textual meta-function
Concerned with the way the expression fits into the
current discourse.
Includes issues of thematization and reference.
Tries to fit the expression with a given theme and
reference.
Represented by the theme layer
In the example sentence, it explicitly marks “the system” as the theme
Explicit concern for interpersonal and textual issues, as well as traditional semantics
This feature of systemic linguistics is attractive for NLG.
Many choices that generation systems make depend on the context of communication
This is formalized by the interpersonal and textual meta-functions.
System network
The grammar is represented using a directed acyclic AND/OR graph, called a system network
Curly braces: AND (parallel systems)
Vertical lines: OR (disjoint systems)
Every clause (represented as the highest-level feature) simultaneously has a separate set of features for each of mood, transitivity and theme.
“The system will save the document” is an indicative, declarative clause expressing an active material process with an unmarked theme.
Realization Statements
A systemic grammar uses realization statements to
map from the features specified in the grammar (like
Indicative, Declarative) to syntactic form.
Each feature in the network can have a set of
realization statements specifying constraints on the
final form of the expression.
Shown as italicized statements below each feature
Realization statements allow the grammar to
constrain the structure of the expression as the
system network is traversed.
Some simple operators
+X
Insert the function X.
Example: the grammar here specifies that all clauses will have a predicator.
X=Y
Conflate the functions X and Y. This allows the grammar to build a layered function structure by assigning different functions to the same portion of the expression.
Example: active clauses conflate the actor with the subject; passive clauses conflate the goal with the subject.
X>Y
Order function X somewhere before function Y.
Example: indicative sentences place the subject somewhere before the predicator.
X:A
Classify the function X with the lexical or grammatical feature A.
This signals a recursive pass through the grammar at a lower level.
The grammar would include other networks, similar to the clause network, that apply to phrases, lexical items and morphology.
Example: the indicative feature inserts a subject function that must be a noun phrase. That phrase is further specified by another pass through the grammar.
X!L
Assign function X the lexical item L.
Example: the finite element of the passive is assigned the lexical item “be”.
Procedure for generation
-Given a fully specified system network
1. Traverse the network from left to right, choosing
the appropriate features and collecting the
associated realization statements.
2. Build an intermediate expression that reconciles
the constraints set by the realization statements
collected during the traversal.
3. Recurse back through the grammar at a lower level
for any function that is not fully specified.
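Step 1 of this procedure can be sketched in Python as below. This is a minimal sketch under stated assumptions: the `STATEMENTS` table is a toy fragment of a clause network, the assignment of realization statements to individual features is a guess made for illustration, and `choose_features` is a stand-in for the query or decision networks attached to each system. Steps 2 and 3 are not implemented here.

```python
# Toy fragment of a clause network: feature -> realization statements.
# Which statement hangs off which feature is an assumption for illustration.
STATEMENTS = {
    "clause": ["+predicator", "predicator:verb"],
    "indicative": ["+subject", "+finite",
                   "subject>predicator", "finite>predicator"],
    "declarative": ["subject>finite"],
    "material-process": ["+goal", "+process", "process=finite,predicator"],
    "active": ["+actor", "actor=subject",
               "+object", "object=goal", "predicator>object"],
    "unmarked-theme": ["+theme", "+rheme",
                       "theme=subject", "rheme=predicator,object"],
}


def choose_features(spec: dict, kb: dict) -> list:
    """Stand-in for the decision networks: pick one path through the network."""
    features = ["clause"]
    if spec["speechact"] == "assertion":
        features += ["indicative", "declarative"]
    if kb[spec["process"]]["type"] == "material":
        features.append("material-process")
    # Nothing in this toy input asks for a passive or a marked theme.
    features += ["active", "unmarked-theme"]
    return features


def collect_statements(spec: dict, kb: dict) -> list:
    """Step 1: traverse the chosen features and gather their statements."""
    return [s for f in choose_features(spec, kb) for s in STATEMENTS[f]]


SPEC = {"process": "save-1", "actor": "system-1", "goal": "document-1",
        "speechact": "assertion", "tense": "future"}
KB = {"save-1": {"type": "material"}}  # hypothetical knowledge base entry
```

The collected list is the raw material for step 2, where the insertions, conflations and orderings have to be reconciled into one structure.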
We can use the following specification as input.
(
:process save-1
:actor system-1
:goal document-1
:speechact assertion
:tense future
)
“The system will save the document”
The save-1 knowledge base instance is identified as the process of the intended expression.
Assume all knowledge base objects to be KL-ONE-style instances.
The actor and goal are similarly specified as system-1 and document-1, respectively.
The input also specifies that the expression be in the form of an assertion in the future tense.
Generation Process
Start at the clause feature
Insert a predicator
+predicator
Classify the predicator as a verb
predicator:verb
Proceed to the mood system
The correct option for a system is chosen by a simple query or decision network associated with that system
The decision is based on the relevant information from the input specification and from the knowledge base.
The mood system chooses the indicative and declarative features, since the input specifies an assertion.
The realization statements associated with the indicative and declarative features insert the subject and finite functions and order them as subject, then finite, then predicator.
+subject
subject > predicator
+finite
finite > predicator
subject > finite
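One way to see how these pairwise ordering statements pin down a single linear order is to treat them as edges of a graph and sort topologically. The sketch below uses Python's standard `graphlib` module (Python 3.9+); the function name `linear_order` is made up for the example, and a real systemic generator need not work this way.

```python
from graphlib import TopologicalSorter


def linear_order(order_statements):
    """Turn 'X > Y' (X somewhere before Y) statements into one ordering."""
    sorter = TopologicalSorter()
    for statement in order_statements:
        before, after = (part.strip() for part in statement.split(">"))
        # TopologicalSorter.add(node, *predecessors): 'before' precedes 'after'.
        sorter.add(after, before)
    return list(sorter.static_order())


MOOD_ORDER = ["subject > predicator", "finite > predicator", "subject > finite"]
# linear_order(MOOD_ORDER) → ['subject', 'finite', 'predicator']
```

Contradictory statements (for example, adding "predicator > subject") would make `static_order` raise `graphlib.CycleError`, which corresponds to constraints that cannot be reconciled.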
The resulting function structure is as follows:
Mood: subject, finite, predicator
Assume save-1 is marked as a material process in the knowledge base.
The transitivity system chooses the material process feature, which
Inserts the goal and process functions
Conflates the process with the finite/predicator pair
+goal
+process
process = finite, predicator
Since there is no indication in either the input or knowledge base to use a passive, the system chooses the active feature, which
Inserts the actor and conflates it with the subject
+actor
actor=subject
Inserts the object, conflating it with the goal and ordering it after the predicator
+object
object=goal
predicator>object
This results in the following function structure:
Mood: subject, finite, predicator, object
Transitivity: actor (= subject), process (= finite, predicator), goal (= object)
There is no thematic specification in the input
The thematic network chooses the unmarked theme feature, which
Inserts theme and rheme
Conflates the theme with the subject
Conflates the rheme with the finite/predicator/object group
+theme +rheme
theme=subject
rheme=predicator,object
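The effect of the conflation statements collected so far can be sketched as a small table that records, for each position in the clause, every function assigned to it. This is an illustrative data structure only; the names `SLOTS`, `CONFLATIONS` and `function_structure` are invented here, and the rheme is conflated with the finite/predicator/object group as described in the prose.

```python
# Positions in the clause, in the order fixed by the mood-layer statements.
SLOTS = ["subject", "finite", "predicator", "object"]

# (function, positions it is conflated with), gathered from the three layers.
CONFLATIONS = [
    ("process", ["finite", "predicator"]),
    ("actor", ["subject"]),
    ("goal", ["object"]),
    ("theme", ["subject"]),
    ("rheme", ["finite", "predicator", "object"]),
]


def function_structure(slots, conflations):
    """Build the layered structure: position -> all functions sharing it."""
    structure = {slot: [slot] for slot in slots}
    for function, targets in conflations:
        for slot in targets:
            structure[slot].append(function)
    return structure


STRUCTURE = function_structure(SLOTS, CONFLATIONS)
# STRUCTURE["subject"] → ['subject', 'actor', 'theme']
```

Reading the table column by column gives the layered analysis: the first position is at once subject, actor and theme, while the last is object, goal and part of the rheme.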
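This results in the full function structure:
Mood: subject, finite, predicator, object
Transitivity: actor (= subject), process (= finite, predicator), goal (= object)
Theme: theme (= subject), rheme (= finite, predicator, object)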
The generation process recursively enters the grammar a number of times at lower levels to fully specify the phrases, lexical items, and morphology.
This is triggered by the following classification statements
From the indicative feature
finite : auxiliary
subject : noun phrase
From the active feature
object : noun phrase
Noun phrase network
Creates the phrases “the system” and “the document”
Auxiliary network
Creates the lexical item “will”
The choice of the lexical items system, document and save can be handled in a number of ways, most typically by retrieving the lexical item associated with the relevant knowledge base instances.
The noun phrase and auxiliary networks work similarly to the clause network seen so far.
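Retrieving lexical items from knowledge base instances can be pictured as a simple lookup. The sketch below is hypothetical: the layout of `KNOWLEDGE_BASE` and the `lex` key are assumptions for illustration, not the structure of any actual KL-ONE-style system.

```python
# Hypothetical knowledge base: instance name -> properties, including the
# lexical item associated with the instance.
KNOWLEDGE_BASE = {
    "save-1": {"type": "material", "lex": "save"},
    "system-1": {"lex": "system"},
    "document-1": {"lex": "document"},
}

# The input specification from the example, written as a Python dict.
SPEC = {"process": "save-1", "actor": "system-1", "goal": "document-1",
        "speechact": "assertion", "tense": "future"}


def lexical_items(spec: dict, kb: dict) -> dict:
    """Look up the lexical item for each role's knowledge base instance."""
    return {role: kb[spec[role]]["lex"] for role in ("process", "actor", "goal")}
```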
Functional Unification Grammar
Functional Unification Grammar uses unification to
manipulate and reason about feature structures.
With a few manipulations, the same technique can be
applied to NLG.
Basic Idea
Build the generation grammar as a feature structure with
lists of potential alternations
Then unify this grammar with an input specification built
using the same sort of feature structure.
Unification process
takes the features specified in the input
reconciles them with those in the grammar
produces a full feature structure which can then be
linearized to form sentence output.
“The system will save the document”
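The core operation can be sketched as a recursive merge of nested dictionaries that fails on conflicting atomic values. This is a minimal sketch, not the full formalism: it ignores alternations and paths, and the shapes of `GRAMMAR_S` and `INPUT_FD` are assumptions about what the top-level sentence alternative and the input might look like.

```python
def unify(a, b):
    """Unify two feature structures (nested dicts with atomic leaves).

    Returns a new merged structure, or None if two atomic values conflict.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            if key in result:
                merged = unify(result[key], value)
                if merged is None:
                    return None
                result[key] = merged
            else:
                result[key] = value
        return result
    return a if a == b else None


# Assumed shape of the sentence-level alternative in the grammar.
GRAMMAR_S = {
    "cat": "S",
    "actor": {"cat": "NP"},
    "process": {"cat": "VP"},
    "goal": {"cat": "NP"},
    "pattern": ("actor", "process", "goal"),
}

# Assumed shape of the input specification.
INPUT_FD = {
    "cat": "S",
    "actor": {"head": {"lex": "system"}},
    "process": {"head": {"lex": "save"}, "tense": "future"},
    "goal": {"head": {"lex": "document"}},
}

RESULT = unify(INPUT_FD, GRAMMAR_S)
```

After unification, `RESULT["actor"]` carries both the lexical item from the input and the category from the grammar, while unifying `{"cat": "S"}` with `{"cat": "NP"}` fails and returns `None`.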
A simple functional
unification grammar.
Expressed as an
attribute-value matrix
Supports simple
transitive sentences in
present or future tense
Enforces subject-verb
agreement on number
At the highest level, the grammar provides alternatives for sentences, noun phrases and verb phrases
CAT S
CAT NP
CAT VP
The alternation is given by the ALT feature on the left. Curly braces indicate that any one of the enclosed alternatives may be chosen and followed.
This level also specifies a pattern indicating the order of the features specified at this level
Actor
Process
Goal
At the sentence level, the grammar supports the following features
Actor: NP
Process: VP
Goal: NP
Subject-verb agreement is enforced using the number feature inside the process feature.
The number of the process must unify with the path {actor number}.
A path is a list of features specifying a route from the root to a particular feature.
Here, the number of the process must unify with the number of the actor.
While this path is given explicitly, paths can also be relative
An example is the number feature of the head feature of the NP.
The path here, {↑↑number}, indicates that the number of the head of the NP must unify with the number of the feature two levels up.
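Absolute and relative paths can be sketched as tuples of feature names. The reading of ↑ used below is an assumption for illustration: each ↑ drops one trailing feature from the location where the relative path is written, so {↑↑number} written at actor/head/number points to actor/number. The function names and the sample structure `FD` are invented for the example.

```python
def follow(fd: dict, path: tuple):
    """Follow an absolute path, such as ("actor", "number"), from the root."""
    node = fd
    for feature in path:
        node = node[feature]
    return node


def resolve_relative(location: tuple, ups: int, feature: str) -> tuple:
    """Convert a relative path into an absolute one.

    location: absolute path of the feature that holds the relative path
    ups:      how many ↑ symbols the relative path starts with
    feature:  the feature named after the ↑ symbols
    """
    if ups > len(location):
        raise ValueError("relative path climbs above the root")
    return location[:len(location) - ups] + (feature,)


# Sample structure (hypothetical values).
FD = {"actor": {"cat": "NP", "number": "singular",
                "head": {"lex": "system"}}}

# {↑↑number} written at the number feature of the actor's head:
TARGET = resolve_relative(("actor", "head", "number"), 2, "number")
```

Here `TARGET` is `("actor", "number")`, so `follow(FD, TARGET)` returns the number that the head's own number feature must unify with.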
The VP level is similar to the NP level, except that it has its own alternation between future and present tense.
Tense is specified in the input feature structure.
Unification selects the alternative that matches and then proceeds to unify the associated values.
For example:
If the tense is present, the head will be a single verb.
If the tense is future, the modal auxiliary “will” is inserted before the head verb.
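Selecting between the tense alternatives can be sketched as trying each alternative in turn and keeping the first one that does not conflict with the input. This simplified version only compares top-level features rather than performing full recursive unification, and the contents of `VP_ALTERNATIVES` are an assumed encoding of the alternation described above.

```python
# Assumed encoding of the VP-level tense alternation.
VP_ALTERNATIVES = [
    {"tense": "present", "pattern": ["head"]},
    {"tense": "future", "pattern": ["aux", "head"], "aux": {"lex": "will"}},
]


def select_alternative(fd: dict, alternatives: list):
    """Merge fd with the first alternative whose features do not conflict.

    Only top-level features are compared (a simplification of unification).
    Returns None when no alternative is compatible.
    """
    for alt in alternatives:
        if all(fd.get(key, value) == value for key, value in alt.items()):
            return {**alt, **fd}
    return None


FUTURE_VP = select_alternative(
    {"cat": "VP", "tense": "future", "head": {"lex": "save"}},
    VP_ALTERNATIVES,
)
```

For the future-tense input, the selected alternative contributes the auxiliary "will" and the pattern that places it before the head verb.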
This grammar is similar to the systemic grammar in that it supports multiple levels, which are entered recursively during the generation process.
The details of the particular sentence we want to generate are given in an input feature structure.
Functional Description (FD)
The input feature structure.
It defines the input specifications for the particular
sentence we want to generate.
It is a feature structure just like the grammar.
Here, we see a sentence specification with
A particular actor: the system
A particular goal: the document
A process: the saving of the document by the system in the future
The input structure specifies the particular verbs and nouns to be used, as well as the tense
This differs from the input to the systemic grammar
In the systemic grammar, lexical items are retrieved from the knowledge base entries associated with the actor and goal.
Tense, although not covered by the example systemic grammar, would be computed by a decision network that determines the relative points in time relevant to the content of the expression.
Since tense is also to be included in the input feature
structure (Functional Description), more decisions
have to be made by the discourse planning
component.
To produce the output, the input is unified with the
grammar.
May require multiple passes through the grammar.
The preliminary unification unifies the input FD with
the S level in the grammar
First alternative at the top level
This results in the structure:
The features specified in the input structure have
been unified and merged with the features at the top
level of the grammar.
Features associated with actor include the lexical item
system from the input FD and category NP from the
grammar.
Process feature combines the lexical item and tense from
the input FD with the category and number features from
the grammar.
Generation mechanism
now recursively enters the
grammar for each of the
sub-constituents.
It enters the NP level
twice for actor and
goal
It enters the VP level once
for the process.
Final FD
Every constituent feature that is internally complex has a pattern specification.
Every simple constituent feature has a lexical specification
The system now uses the pattern specifications to linearize the output, producing
“The system will save the document”
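Linearization by pattern can be sketched as a recursive walk: a complex constituent is expanded in the order given by its pattern, and a simple constituent contributes its lexical item. The `FINAL_FD` below is an assumed reconstruction of what a fully unified structure could look like, including the `det` and `aux` feature names, and is not copied from the grammar.

```python
def linearize(fd: dict) -> list:
    """Flatten a fully specified FD into a list of words."""
    if "pattern" in fd:
        words = []
        for name in fd["pattern"]:
            words.extend(linearize(fd[name]))
        return words
    return [fd["lex"]]


# Assumed shape of the final FD for the example sentence.
FINAL_FD = {
    "cat": "S",
    "pattern": ["actor", "process", "goal"],
    "actor": {"cat": "NP", "pattern": ["det", "head"],
              "det": {"lex": "the"}, "head": {"lex": "system"}},
    "process": {"cat": "VP", "tense": "future", "pattern": ["aux", "head"],
                "aux": {"lex": "will"}, "head": {"lex": "save"}},
    "goal": {"cat": "NP", "pattern": ["det", "head"],
             "det": {"lex": "the"}, "head": {"lex": "document"}},
}

SENTENCE = " ".join(linearize(FINAL_FD))
# SENTENCE → 'the system will save the document'
```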
The example didn’t specify the actor to be plural. We
can do that by adding the feature-value pair
number plural
to the actor structure in the input FD.
Subject-verb agreement would then be enforced by
the unification process.
The grammar requires that the number of the heads of the NP and the VP match the number of the actor that was specified in the input FD.
Conclusion
The two surface generation grammars illustrate the
nature of computational grammars for generation.
Both used functional categorizations.
Bidirectional grammar
A single grammar for both generation and understanding
Currently under investigation
Has not found widespread use in NLG
Requires additional semantic and contextual information as input to the generator
Sample NLG programs
KPML
A text generation system based on the earlier Penman system.
Uses Systemic-Functional Linguistics principles.
http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/README.html
FUF/SURGE
A text generation system and English grammar using functional unification.
FUF (Functional Unification Formalism) is an implementation of Functional Unification Grammar developed by Elhadad (1992, 1993).
http://www.cs.bgu.ac.il/research/projects/surge/index.htm
THANK YOU…