Lecture 2: From Semantics To Semantic-Oriented Applications

MARINA SANTINI

PROGRAM: COMPUTATIONAL LINGUISTICS AND LANGUAGE TECHNOLOGY

DEPT OF LINGUISTICS AND PHILOLOGY

UPPSALA UNIVERSITY, SWEDEN

14 NOV 2013

Semantic Analysis in Language Technology


Course Website: http://stp.lingfil.uu.se/~santinim/sais/sais_fall2013.htm

http://www.linkedin.com/in/marinasantini

http://www.lingfil.uu.se/st

http://www.lingfil.uu.se/






http://www.uu.se/en


From Formal Systems to Natural Language Semantics

The past: Aristotelian Logic → Propositional Logic

[huge temporal gap] → Predicate Logic (FOL & co.) → Formal Semantics

The present: Computational Semantics & Semantic-Oriented Applications

The future: Actionable Intelligence



Aristotelian Logic

The fundamental assumption behind the theory is that propositions are composed of two terms (hence the name "two-term theory" or "term logic") and that the reasoning process is in turn built from propositions.

Aristotle distinguishes singular terms such as Socrates and general terms such as Greeks. He further distinguishes (a) terms that can be the subject of predication, and (b) terms that can be predicated of others by the use of the copula ("is a").

A proposition consists of two terms, in which one term (the "predicate") is "affirmed" or "denied" of the other (the "subject"), and which is capable of truth or falsity. Examples: "Socrates is a man"; "Socrates is not immortal".

The syllogism is an inference in which one proposition (the "conclusion") follows of necessity from two others (the "premises"). Example: Socrates is a man, all men are mortal, therefore Socrates is mortal. The conclusion is new knowledge (inferential knowledge).
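The Socrates syllogism can be sketched as a tiny inference program. This is only an illustration, assuming facts are stored as (predicate, individual) pairs and "all X are Y" premises as (X, Y) pairs; the names `facts`, `rules` and `infer` are hypothetical and not part of the lecture.

```python
# Fact: "Socrates is a man", stored as a (predicate, individual) pair.
facts = {("man", "Socrates")}

# Premise "all men are mortal", stored as (general term, general term).
rules = [("man", "mortal")]


def infer(facts, rules):
    """Apply every rule repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for predicate, individual in list(known):
                new_fact = (conclusion, individual)
                if predicate == premise and new_fact not in known:
                    known.add(new_fact)
                    changed = True
    return known


derived = infer(facts, rules)
# The conclusion was never stated explicitly: it is inferential knowledge.
print(("mortal", "Socrates") in derived)  # True
```

The derived fact plays the role of the "new knowledge" on the slide: it appears only because the two premises were combined.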



Syllogistic fallacies

People often make mistakes when reasoning syllogistically and mathematically with natural language:

• A=B
• B=C
• A=C

Some cats (A) are black things (B) and some black things (B) are televisions (C), but it does not follow from these premises that some cats (A) are televisions (C).

Existential fallacy (use of quantifiers)

The existential fallacy, or existential instantiation, is a formal fallacy: "Everyone in the room is pretty and smart" does not imply that there is a pretty, smart person in the room, because it does not state that there is a person in the room.
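Both fallacies can be checked against a small counter-model. The sketch below assumes a toy world modelled with Python sets and an empty room; all individual names (`Tom`, `Felix`, `tv_1`, `tv_2`) are invented for illustration.

```python
# Toy world: both premises hold, yet the conclusion fails.
cats = {"Tom", "Felix"}
black_things = {"Felix", "tv_1"}
televisions = {"tv_1", "tv_2"}

some_cats_are_black = bool(cats & black_things)        # premise 1
some_black_are_tvs = bool(black_things & televisions)  # premise 2
some_cats_are_tvs = bool(cats & televisions)           # claimed conclusion

print(some_cats_are_black, some_black_are_tvs, some_cats_are_tvs)
# True True False

# Existential fallacy: a universal claim about an empty room is (vacuously)
# true, but the corresponding existential claim is false.
people_in_room = []
everyone_pretty_and_smart = all(p["pretty"] and p["smart"] for p in people_in_room)
someone_pretty_and_smart = any(p["pretty"] and p["smart"] for p in people_in_room)

print(everyone_pretty_and_smart, someone_pretty_and_smart)
# True False
```

One counter-model is enough to show that an inference pattern is not valid, which is why a single toy world suffices here.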

http://en.wikipedia.org/wiki/Formal_fallacy


Propositional Logic

Propositional logic was developed into a formal logic by Chrysippus and expanded by the Stoics.

The logic was focused on propositions. This advancement was different from the traditional syllogistic logic, which was focused on terms.

It represents any given proposition with a letter.

It requires that all propositions have exactly one of two truth-values: true or false.

To take an example, let P be the proposition that it is raining outside. This will be true if it is raining outside and false otherwise.
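As a rough illustration of "one letter per proposition, exactly two truth-values", the sketch below enumerates a truth table. It assumes P stands for "it is raining outside"; the second proposition Q ("the street is wet") and the choice of the conditional "if P then Q" are additions made only for this example.

```python
from itertools import product


def truth_table(variables, formula):
    """Evaluate `formula` under every assignment of True/False to `variables`."""
    rows = []
    for values in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        rows.append((assignment, formula(assignment)))
    return rows


# "If P then Q" is false only when P is true and Q is false.
def implication(v):
    return (not v["P"]) or v["Q"]


table = truth_table(["P", "Q"], implication)
for assignment, value in table:
    print(assignment, value)
```

Note that the logic never looks inside P or Q: each proposition is an unanalysed unit, which is exactly the limitation predicate logic later removes.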

http://en.wikipedia.org/wiki/Chrysippus

http://en.wikipedia.org/wiki/Stoics

http://en.wikipedia.org/wiki/Proposition

http://en.wikipedia.org/wiki/Syllogism


The father of Predicate Logic

In 1879 Frege published his Begriffsschrift (Concept Script). This introduced a calculus, a method of representing statements by the use of quantifiers and variables.

http://en.wikipedia.org/wiki/Frege



Predicate Logic (aka FOL, etc.)

Predicate logic is also known as first-order predicate calculus, the lower predicate calculus, quantification theory, and first-order logic.

First-order logic is a formal system used in mathematics, philosophy, linguistics, and computer science.

First-order logic is distinguished from propositional logic by its use of quantified variables.


http://en.wikipedia.org/wiki/Formal_system

http://en.wikipedia.org/wiki/Mathematics

http://en.wikipedia.org/wiki/Philosophy

http://en.wikipedia.org/wiki/Linguistics

http://en.wikipedia.org/wiki/Computer_science

http://en.wikipedia.org/wiki/Propositional_logic

http://en.wikipedia.org/wiki/Quantifier


Quantifiers

The two fundamental kinds of quantification in predicate logic are universal quantification and existential quantification. The traditional symbol for the universal quantifier "all" is "∀", an inverted letter "A", and for the existential quantifier "exists" is "∃", a rotated letter "E".

http://en.wikipedia.org/wiki/Predicate_(logic)

http://en.wikipedia.org/wiki/Universal_quantification

http://en.wikipedia.org/wiki/Existential_quantification

http://en.wikipedia.org/wiki/A


http://en.wikipedia.org/wiki/E



Propositional Logic vs Predicate Logic

A predicate takes an entity or entities in the domain of discourse as input and outputs either True or False.

Consider the two sentences "Socrates is a philosopher" and "Plato is a philosopher".

In propositional logic, these sentences are viewed as being unrelated and are denoted, for example, by p and q. However, the predicate "is a philosopher" occurs in both sentences, which share the common structure "a is a philosopher". The variable a is instantiated as "Socrates" in the first sentence and as "Plato" in the second sentence.

A variable can also be existentially quantified: "There exists a such that a is a philosopher".

Predicates can also be related to one another. Ex: "if a is a philosopher, then a is a scholar". This formula is a conditional statement with "a is a philosopher" as hypothesis and "a is a scholar" as conclusion.

The truth of this formula depends on which object is denoted by a, and on the interpretations of the predicates "is a philosopher" and "is a scholar".

Variables can be quantified over. "For every a, if a is a philosopher, then a is a scholar". The universal quantifier "for every" in this sentence expresses the idea that the claim "if a is a philosopher, then a is a scholar" holds for all choices of a.
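The philosopher/scholar example can be evaluated over a small, finite domain of discourse. This is a minimal sketch: the domain, the extra individual `Bucephalus` and the membership sets are assumptions chosen only to make the formulas checkable.

```python
# A finite domain of discourse (Bucephalus is an invented non-philosopher).
domain = ["Socrates", "Plato", "Bucephalus"]
philosophers = {"Socrates", "Plato"}
scholars = {"Socrates", "Plato"}


def is_philosopher(a):
    """Predicate: takes an entity, returns True or False."""
    return a in philosophers


def is_scholar(a):
    return a in scholars


# Existential quantification: there exists a such that a is a philosopher.
exists_philosopher = any(is_philosopher(a) for a in domain)

# Universal quantification over a conditional:
# for every a, if a is a philosopher, then a is a scholar.
every_philosopher_is_scholar = all(
    (not is_philosopher(a)) or is_scholar(a) for a in domain
)

print(exists_philosopher, every_philosopher_is_scholar)  # True True
```

The conditional holds for Bucephalus too, because its hypothesis is false for that individual; the truth of the quantified formula depends on the whole domain and on how the predicates are interpreted.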

http://en.wikipedia.org/wiki/Domain_of_discourse

http://en.wikipedia.org/wiki/Propositional_calculus


http://en.wikipedia.org/wiki/Material_conditional


http://en.wikipedia.org/wiki/Universal_quantifier


Formal Semantics (wikipedia)

In linguistics, formal semantics seeks to understand linguistic meaning by constructing precise mathematical models of the principles that speakers use to define relations between expressions in a natural language and the world which supports meaningful discourse.

The mathematical tools used are the confluence of formal logic and formal language theory, especially lambda calculus.

Linguists rarely employed formal semantics until Richard Montague showed how English (or any natural language) could be treated like a formal language. His contribution to linguistic semantics, which is now known as Montague grammar, was the basis for further developments.



Translating Natural Language to Formal Language by:

Lambda calculus: a formal system in mathematical logic and computer science for expressing computation based on function abstraction and application via variable binding and substitution. (Cf. also J&M: 593)

Prolog: a general-purpose logic programming language associated with artificial intelligence and computational linguistics. Prolog has its roots in first-order logic, a formal logic, and unlike many other programming languages, Prolog is declarative: the program logic is expressed in terms of relations, represented as facts and rules.

Top-down rule-based systems
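Function abstraction and application, as used when translating natural language into a formal language, can be imitated with Python lambdas. The sketch below is an assumption-laden toy: the model (`domain`, `men`, `mortals`, the individual `Fido`) and the treatment of "every" as a function from a noun meaning to a function over verb-phrase meanings are illustrative choices, not the lecture's own formalisation.

```python
# Toy model of the world (Fido is an invented mortal non-man).
domain = {"Socrates", "Plato", "Fido"}
men = {"Socrates", "Plato"}
mortals = {"Socrates", "Plato", "Fido"}

# Word meanings as functions (lambda abstraction).
man = lambda x: x in men
mortal = lambda x: x in mortals

# "every" takes a noun meaning and returns a function that takes a
# verb-phrase meaning and returns a truth value.
every = lambda noun: lambda verb_phrase: all(
    verb_phrase(x) for x in domain if noun(x)
)

# Sentence meanings are obtained by function application.
socrates_is_mortal = mortal("Socrates")       # "Socrates is mortal"
every_man_is_mortal = every(man)(mortal)      # "Every man is mortal"
every_mortal_is_a_man = every(mortal)(man)    # "Every mortal is a man"

print(socrates_is_mortal, every_man_is_mortal, every_mortal_is_a_man)
# True True False
```

Each sentence meaning is built only from the meanings of its parts and the way they are combined, which is the compositional idea the next slide refers to.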



Syntax-based compositionality of meaning



Stumbling block: meaning is not always compositional…



Multi-Word Expressions

MWEs (a.k.a. multiword units or MUs) are lexical units encompassing a wide range of linguistic phenomena, such as:

• idioms (e.g. kick the bucket = to die)
• collocations (e.g. cream tea = a small meal eaten in Britain, with small cakes and tea)
• regular compounds (e.g. cosmetic surgery)
• graphically unstable compounds (e.g. self-contained <> self contained <> selfcontained; all graphical variants have a huge number of hits in Google)
• light verbs (e.g. do a revision vs. revise)
• lexical bundles (e.g. in my opinion)
• etc.
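One simple way to keep such units from being interpreted word by word is to look them up as wholes before any compositional analysis. The sketch below assumes a tiny hand-made lexicon and a greedy longest-match strategy; `MWE_LEXICON` and `tokenize_with_mwes` are hypothetical names, and inflected forms (e.g. "kicked the bucket") are deliberately not handled.

```python
# Hand-made lexicon: word sequences mapped to a single unit of meaning.
MWE_LEXICON = {
    ("kick", "the", "bucket"): "die",
    ("in", "my", "opinion"): "in_my_opinion",
    ("cosmetic", "surgery"): "cosmetic_surgery",
}
MAX_LEN = max(len(key) for key in MWE_LEXICON)


def tokenize_with_mwes(words):
    """Greedy longest-match: prefer the longest MWE starting at each position."""
    units = []
    i = 0
    while i < len(words):
        # Try candidate spans from longest to shortest (minimum two words).
        for n in range(min(MAX_LEN, len(words) - i), 1, -1):
            candidate = tuple(words[i:i + n])
            if candidate in MWE_LEXICON:
                units.append(MWE_LEXICON[candidate])
                i += n
                break
        else:
            # No MWE starts here: keep the single word.
            units.append(words[i])
            i += 1
    return units


print(tokenize_with_mwes("he will kick the bucket".split()))
# ['he', 'will', 'die']
```

A lookup like this only covers fixed, listed expressions; graphically unstable compounds and flexible idioms would need normalisation or pattern matching on top of it.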



Stumbling Block: Ambiguity

Lexical ambiguity, e.g. polysemy. Ex: bank

Referential ambiguity, e.g. anaphoric ambiguity. Ex: … it was funded by a tycoon

Scopal ambiguity. Ex: I can’t find a piece of paper (a particular piece of paper or any piece of paper? Existential or universal quantifier, "∃" or "∀"?)
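The two readings of "I can’t find a piece of paper" can be told apart by evaluating them in a small model where they disagree. The sketch assumes a toy situation with two pieces of paper, one of which can be found; the names `papers` and `findable` and the formulas in the comments are one possible formalisation, not the only one.

```python
# Toy situation: two pieces of paper, only the napkin can be found.
papers = {"receipt", "napkin"}
findable = {"napkin"}

# Reading 1 (a particular piece of paper):
#   ∃x (paper(x) ∧ ¬find(x))
particular_reading = any(p not in findable for p in papers)

# Reading 2 (any piece of paper at all):
#   ¬∃x (paper(x) ∧ find(x)), equivalent to ∀x (paper(x) → ¬find(x))
any_reading = not any(p in findable for p in papers)
any_reading_universal = all(p not in findable for p in papers)

print(particular_reading, any_reading, any_reading_universal)
# True False False
```

Because the same sentence comes out true under one reading and false under the other, a system has to choose (or leave open) the quantifier scope before it can evaluate the sentence.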



Computational Semantics (wikipedia)

Computational semantics is the study of how to automate the process of constructing and reasoning with meaning representations of natural language expressions. It consequently plays an important role in natural language processing and computational linguistics.

Some traditional topics of interest are: construction of meaning representations, semantic underspecification, anaphora resolution, presupposition projection, quantifier scope resolution.

Methods employed usually draw from formal semantics or statistical semantics. Computational semantics has points of contact with the areas of lexical semantics (word sense disambiguation and semantic role labeling), discourse semantics, knowledge representation and automated reasoning …



What is Semantics? ---- What is LT?

Students’ intuition about semantics:

1. Meaning of language (words, phrases, etc.)
2. Break down complex meaning into simpler blocks of meaning
3. Content understanding
4. Disambiguation
5. Understanding a phrase
6. Understanding the meaning of phrases depending on different contexts
7. Meaning and connotation

Language technology is often called human language technology (HLT) or natural language processing (NLP) and consists of computational linguistics (or CL) and speech technology as its core but includes also many application-oriented aspects of them. Language technology is closely connected to computer science and general linguistics. (wikipedia)

Must add:
• Statistics
• Machine learning



What shall we keep from the past?

Computational semantics must be….



Computational semantics must address open issues:

• Ambiguity
• Overcome compositionality
• Etc.



Our definition of semantics for LT must include:

• Continuity with the past approaches
• Must be computationally tractable

1. Meaning of language (words, phrases, etc.)
2. Break down complex meaning into simpler blocks of meaning
3. Content understanding
4. Disambiguation
5. Understanding a phrase
6. Understanding the meaning of phrases depending on different contexts
7. Meaning and connotation

More advanced than past systems:

• Must address ambiguity
• Must address non-compositional meaning

Above all, must tackle new media. In less than 50 years, new media (internet, web, social networks) have completely scrambled "traditional" semantics and human communication by creating:

• New meanings (sentiment, opinion, etc.)
• New language (unconventional texts and syntax and many sublanguages, like tweets, FB posts, etc.)
• Big amounts of wild data

Semantics for Language Technology must now also take these aspects into account.



In conclusion

More than creating an "understanding system", the current stress is on how to automatically extract meaningful and actionable information depending on specific tasks…



Visual Insight into big data around us…

Big Data Video: http://youtu.be/qqfeUUjAIyQ (2:21 min)


New meanings: the so-called sentiment

The purpose of Sentiment Analysis: detect and extract emotions, attitudes and opinions from text… People's behaviour and choices (politics, products, reactions) are driven by sentiment rather than "sense"

(Sense and Sensibility by Jane Austen describes these two opposite behaviours well)

A basic ML algorithm underlying many (but not all) applications detecting sentiments: Daniel Jurafsky, Coursera, NLP – Stanford University (video, 13 min)

http://stp.lingfil.uu.se/~santinim/sais/7SentimentAnalysisBaselineAlgorithm.mp4
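The slide does not name the baseline algorithm, so the sketch below assumes a common choice for this task: a multinomial Naive Bayes classifier over word counts with add-one smoothing. The four training sentences are invented toy data, and `train` and `classify` are hypothetical helper names, not an existing library API.

```python
import math
from collections import Counter


def train(labelled_docs):
    """Count documents per class and words per class."""
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, label in labelled_docs:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in text.lower().split():
            counts[word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def classify(text, model):
    """Return the class with the highest log prior + log likelihood."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy training data, for illustration only.
training = [
    ("great film loved it", "pos"),
    ("wonderful acting great plot", "pos"),
    ("boring film hated it", "neg"),
    ("terrible plot awful acting", "neg"),
]
model = train(training)
print(classify("great acting", model))       # pos
print(classify("awful boring film", model))  # neg
```

With realistic data the same structure applies, but the quality of the result depends heavily on tokenisation, negation handling and the size of the training set.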



Conclusions

Think about semantics, computational semantics and big data

Think how ML is important for semantic-oriented applications (be proud of the many things you learned during the previous course)

Next time we will continue with Sentiment Analysis, which is a semantic-oriented application…



This is the end… Thanks for your attention!
