Part II: Statistical NLP. Advanced Artificial Intelligence: Introduction and Grammar Models. Wolfram Burgard, Luc De Raedt, Bernhard Nebel, Kristian Kersting.


Part II: Statistical NLP

Advanced Artificial Intelligence

Introduction and Grammar Models

Wolfram Burgard, Luc De Raedt, Bernhard Nebel, Kristian Kersting

Some slides taken from Helmut Schmid, Rada Mihalcea, Bonnie Dorr, Leila Kosseim, Peter Flach, and others.

Topic

Statistical Natural Language Processing applies:

• Machine Learning / Statistics: learning is the ability to improve one's behaviour at a specific task over time, and involves the analysis of data (statistics)
• Natural Language Processing

Following parts of the book: Statistical NLP (Manning and Schütze), MIT Press, 1999.

Contents

Motivation: Zipf's law
Some natural language processing tasks
Non-probabilistic NLP models:
• Regular grammars and finite state automata
• Context-free grammars
• Definite clause grammars
Motivation for statistical NLP
Overview of the rest of this part

Rationalism versus Empiricism

Rationalist:
• Noam Chomsky: innate language structures
• AI: hand-coding of NLP
• Dominant view 1960-1985
• Cf. e.g. Steven Pinker's The Language Instinct (popular science book)

Empiricist:
• The ability to learn is innate
• AI: language is learned from corpora
• Dominant 1920-1960, and becoming increasingly important

Rationalism versus Empiricism

Noam Chomsky:
• "But it must be recognized that the notion of 'probability of a sentence' is an entirely useless one, under any known interpretation of this term."

Fred Jelinek (IBM, 1988):
• "Every time a linguist leaves the room, the recognition rate goes up."
• (Alternative: "Every time I fire a linguist, the recognizer improves.")

This course

Empiricist approach:
• Focus will be on probabilistic models for learning of natural language

No time to treat natural language in depth:
• (though this would be quite useful and interesting)
• Deserves a full course by itself
• Covered in more depth in Logic, Language and Learning (SS 05, probably SS 06)

Ambiguity

Statistical disambiguation:
• Define a probability model for the data
• Compute the probability of each alternative
• Choose the most likely alternative

NLP and Statistics

Statistical methods deal with uncertainty. They predict the future behaviour of a system based on the behaviour observed in the past.

Statistical methods require training data. The data in statistical NLP are the corpora.

NLP and Statistics

Corpus: a text collection assembled for linguistic purposes.

Tokens: how many words are contained in Tom Sawyer? 71,370.
Types: how many different words are contained in Tom Sawyer? 8,018.
Hapax legomena: words appearing only once.
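These counts are easy to reproduce. A minimal sketch in Python (the toy sentence and the crude whitespace tokenization are my own, not from the slides):

```python
from collections import Counter

def corpus_stats(text):
    """Count tokens, types, and hapax legomena in a text."""
    tokens = text.lower().split()          # crude whitespace tokenization
    counts = Counter(tokens)
    hapax = [w for w, c in counts.items() if c == 1]
    return len(tokens), len(counts), hapax

tokens, types, hapax = corpus_stats("the cat saw the dog and the dog saw a bird")
print(tokens, types, sorted(hapax))  # 11 tokens, 7 types, hapax: a, and, bird, cat
```

On a real corpus like Tom Sawyer the same three numbers would be the 71,370 / 8,018 / hapax figures quoted above (up to tokenization details).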

Corpora

The most frequent words are function words:

word  freq     word  freq
the   3332     in    906
and   2972     that  877
a     1775     he    877
to    1725     I     783
of    1440     his   772
was   1161     you   686
it    1027     Tom   679

Word Counts

How many words appear f times?

f        n_f
1        3993
2        1292
3        664
4        410
5        243
6        199
7        172
8        131
9        82
10       91
11-50    540
51-100   99
> 100    102

Word Counts

About half of the words occurs just onceAbout half of the text consists of the

100 most common wordshellip

Word Counts (Brown corpus)

word   f     r    f*r      word        f   r     f*r
the    3332  1    3332     turned      51  200   10200
and    2972  2    5944     you'll      30  300   9000
a      1775  3    5325     name        21  400   8400
he     877   10   8770     comes       16  500   8000
but    410   20   8200     group       13  600   7800
be     294   30   8820     lead        11  700   7700
there  222   40   8880     friends     10  800   8000
one    172   50   8600     begin       9   900   8100
about  158   60   9480     family      8   1000  8000
more   138   70   9660     brushed     4   2000  8000
never  124   80   9920     sins        2   3000  6000
Oh     116   90   10440    Could       2   4000  8000
two    104   100  10400    Applausive  1   8000  8000

Zipf's Law: f ~ 1/r (f*r = const)
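The f*r column can be checked directly. A small sketch over a few rank-frequency pairs from the table above:

```python
# Rank-frequency pairs (word, f, r) taken from the table above.
pairs = [("the", 3332, 1), ("and", 2972, 2), ("he", 877, 10),
         ("one", 172, 50), ("two", 104, 100), ("turned", 51, 200),
         ("comes", 16, 500), ("family", 8, 1000)]

# Zipf's law predicts f*r is roughly constant; the very top ranks deviate most.
for word, f, r in pairs:
    print(f"{word:8s} f*r = {f * r}")
```

For ranks 10 and beyond the products all fall in a narrow band around 8000-10500, while the rank-1 word "the" is a visible outlier, which is the usual picture for Zipf fits.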

Zipf's Law

Minimize effort

Language and sequences

Natural language processing:
• is concerned with the analysis of sequences of words / sentences
• constructs language models

Two types of models:
• Non-probabilistic
• Probabilistic

Human language is highly ambiguous at all levels:

• acoustic level: "recognize speech" vs. "wreck a nice beach"
• morphological level: "saw": to see (past), saw (noun), to saw (present, infinitive)
• syntactic level: "I saw the man on the hill with a telescope"
• semantic level: "One book has to be read by every student"

Key NLP problem: ambiguity.

Language Model

A formal model of language. Two types:

• Non-probabilistic: allows one to compute whether a certain sequence (a sentence or part thereof) is possible; often grammar based
• Probabilistic: allows one to compute the probability of a certain sequence; often extends grammars with probabilities

Example of a bad language model

(image slides)

A good language model

Non-probabilistic:
• "I swear to tell the truth" is possible
• "I swerve to smell de soup" is impossible

Probabilistic:
• P(I swear to tell the truth) ≈ 0.0001
• P(I swerve to smell de soup) ≈ 0

Why language models?

Consider a Shannon game:
• predicting the next word in a sequence:
  - Statistical natural language …
  - The cat is thrown out of the …
  - The large green …
  - Sue swallowed the large green …
  - …

Model at the sentence level.
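A bigram model is already enough to play this Shannon game on a toy scale. A sketch (the miniature corpus is invented for illustration):

```python
from collections import Counter, defaultdict

# Tiny invented training corpus.
corpus = ("the cat is thrown out of the house "
          "the cat is on the mat the large green tree").split()

# Bigram counts: how often each word follows the previous one.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict(prev):
    """Most likely next word after `prev` under the bigram model."""
    return bigrams[prev].most_common(1)[0][0]

print(predict("the"))  # "cat" follows "the" most often in this toy corpus
```

A real Shannon-game predictor would use the same idea with longer n-grams, smoothing, and a large corpus.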

Applications

Spelling correction, mobile phone texting, speech recognition, handwriting recognition, disabled users, …

Spelling errors

They are leaving in about fifteen minuets to go to her house.
The study was conducted mainly be John Black.
Hopefully all with continue smoothly in my absence.
Can they lave him my messages?
I need to notified the bank of…
He is trying to fine out.

Handwriting recognition

Assume a note is given to a bank teller, which the teller reads as "I have a gub." (cf. Woody Allen)

NLP to the rescue:
• "gub" is not a word
• "gun", "gum", "Gus", and "gull" are words, but "gun" has a higher probability in the context of a bank

For Spell Checkers

Collect a list of commonly substituted words:
• piece/peace, whether/weather, their/there, …

Example:
• "On Tuesday, the whether …"
• "On Tuesday, the weather …"
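A context-sensitive spell checker along these lines can be sketched as follows. The context counts are invented for illustration; a real system would estimate them from a corpus:

```python
# Invented counts: how often each candidate appears after the word "the".
context_counts = {
    ("the", "weather"): 120,
    ("the", "whether"): 2,
}

# Confusion sets of commonly substituted words.
confusion_sets = [{"whether", "weather"}, {"piece", "peace"}, {"their", "there"}]

def correct(prev, word):
    """Replace `word` by the member of its confusion set most likely after `prev`."""
    for cs in confusion_sets:
        if word in cs:
            return max(cs, key=lambda w: context_counts.get((prev, w), 0))
    return word  # not in any confusion set: leave unchanged

print(correct("the", "whether"))  # -> "weather"
```

This is exactly the "choose the most likely alternative" recipe from the disambiguation slide, applied to spelling.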

Another dimension in language models

Do we mainly want to infer (probabilities of) legal sentences / sequences?
• So far, yes.

Or do we want to infer properties of these sentences?
• e.g. parse trees, part-of-speech tagging
• needed for understanding NL

Let's look at some tasks.

Sequence Tagging

Part-of-speech tagging:
• He drives with his bike
• N V PR PN N (noun, verb, preposition, pronoun, noun)

Text extraction:
• The job is that of a programmer
• X X X X X X JobType
• The seminar is taking place from 15:00 to 16:00
• X X X X X X Start End

Sequence Tagging

Predicting the secondary structure of proteins, mRNA, …
• X = AFARLMMA…
• Y = hehestststhesthe…

Parsing

Given a sentence, find its parse tree. An important step in understanding NL.

Parsing

In bioinformatics, parsing allows one to predict (elements of) structure from sequence.

Language models based on Grammars

Grammar types:
• Regular grammars and finite state automata
• Context-free grammars
• Definite clause grammars: a particular type of unification-based grammar (Prolog)

Distinguish the lexicon from the grammar:
• The lexicon (dictionary) contains information about words, e.g. word → possible tags (and possibly additional information): flies → V(erb), N(oun)
• The grammar encodes the rules

Grammars and parsing

The syntactic level is the best understood and formalized. Derivation of grammatical structure is parsing (more than just recognition). The result of parsing is usually a parse tree showing the constituents of a sentence, e.g. verb or noun phrases. Syntax is usually specified in terms of a grammar consisting of grammar rules.

Regular Grammars and Finite State Automata

Lexical information: which words are
• Det(erminer)
• N(oun)
• Pn (pronoun)
• Vi (intransitive verb): takes no argument
• Vt (transitive verb): takes an argument
• Adj (adjective)

Now accept:
• The cat slept
• Det N Vi

As a regular grammar:
• S --> [Det], S1     ([...] denotes a terminal)
• S1 --> [N], S2
• S2 --> [Vi]

Lexicon:
• the: Det
• cat: N
• slept: Vi
• …

Finite State Automaton

Sentences:
• John smiles: Pn Vi
• The cat disappeared: Det N Vi
• These new shoes hurt: Det Adj N Vi
• John liked the old cat: Pn Vt Det Adj N
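The regular grammar above corresponds to a finite state automaton over tag sequences. A sketch in Python (the state names and the extra transitions for Pn, Adj, and Vt are my own extension so the example sentences are covered):

```python
# Transition table of a finite state automaton over POS-tag sequences.
transitions = {
    ("s", "Det"): "s1", ("s", "Pn"): "s2",
    ("s1", "Adj"): "s1", ("s1", "N"): "s2",   # optional adjectives before the noun
    ("s2", "Vi"): "end",                       # intransitive verb ends the sentence
    ("s2", "Vt"): "s3",                        # transitive verb needs an object NP
    ("s3", "Det"): "s4", ("s3", "Pn"): "end",
    ("s4", "Adj"): "s4", ("s4", "N"): "end",
}

def accepts(tags):
    """Run the automaton; accept iff we end in the final state."""
    state = "s"
    for tag in tags:
        state = transitions.get((state, tag))
        if state is None:
            return False
    return state == "end"

print(accepts(["Det", "N", "Vi"]))               # The cat slept -> True
print(accepts(["Pn", "Vt", "Det", "Adj", "N"]))  # John liked the old cat -> True
print(accepts(["Det", "Vi"]))                    # -> False
```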

Phrase structure

[S [NP [D the] [N dog]] [VP [V chased] [NP [D a] [N cat]] [PP [P into] [NP [D the] [N garden]]]]]

"the dog chased a cat into the garden"

Notation

S: sentence; D or Det: determiner (e.g. articles); N: noun; V: verb; P: preposition; NP: noun phrase; VP: verb phrase; PP: prepositional phrase

Context-Free Grammar:
S --> NP, VP
NP --> D, N
VP --> V, NP
VP --> V, NP, PP
PP --> P, NP
D --> [the]
D --> [a]
N --> [dog]
N --> [cat]
N --> [garden]
V --> [chased]
V --> [saw]
P --> [into]

Terminals ~ lexicon

Phrase structure

Formalism of context-free grammars:
• Nonterminal symbols: S, NP, VP, …
• Terminal symbols: dog, cat, saw, the, …

Recursion:
• "The girl thought the dog chased the cat"

VP --> V, S
N --> [girl]
V --> [thought]

Top-down parsing

S => NP VP
  => Det N VP
  => The N VP
  => The dog VP
  => The dog V NP
  => The dog chased NP
  => The dog chased Det N
  => The dog chased the N
  => The dog chased the cat
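This top-down derivation can be mechanized as a small recursive-descent recognizer for the CFG above (a sketch; it backtracks over rule choices but not over alternative sub-derivations, which suffices for this grammar):

```python
# The CFG and lexicon from the preceding slides.
grammar = {
    "S":  [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP", "PP"], ["V", "NP"]],
    "PP": [["P", "NP"]],
}
lexicon = {
    "D": ["the", "a"], "N": ["dog", "cat", "garden"],
    "V": ["chased", "saw"], "P": ["into"],
}

def parse(symbol, words, i):
    """Try to derive a prefix of words[i:] from `symbol`; return new position or None."""
    if symbol in lexicon:
        return i + 1 if i < len(words) and words[i] in lexicon[symbol] else None
    for rhs in grammar[symbol]:          # try each rule top-down, left to right
        j = i
        for s in rhs:
            j = parse(s, words, j)
            if j is None:
                break
        else:
            return j
    return None

def accepts(sentence):
    words = sentence.split()
    return parse("S", words, 0) == len(words)

print(accepts("the dog chased the cat"))                # True
print(accepts("the dog chased a cat into the garden"))  # True
```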

Context-free grammar

S --> NP, VP
NP --> PN             (proper noun)
NP --> Art, Adj, N
NP --> Art, N
VP --> VI             (intransitive verb)
VP --> VT, NP         (transitive verb)
Art --> [the]
Adj --> [lazy]
Adj --> [rapid]
PN --> [achilles]
N --> [turtle]
VI --> [sleeps]
VT --> [beats]

Parse tree

[S [NP [Art the] [Adj rapid] [N turtle]] [VP [VT beats] [NP [PN achilles]]]]

"the rapid turtle beats achilles"

Definite Clause Grammars

Non-terminals may have arguments:

S --> NP(N), VP(N)
NP(N) --> Art(N), N(N)
VP(N) --> VI(N)
Art(singular) --> [a]
Art(singular) --> [the]
Art(plural) --> [the]
N(singular) --> [turtle]
N(plural) --> [turtles]
VI(singular) --> [sleeps]
VI(plural) --> [sleep]

Number agreement.

DCGs

Non-terminals may have arguments:
• Variables (start with a capital), e.g. Number, Any
• Constants (start with lower case), e.g. singular, plural
• Structured terms (start with lower case and take arguments themselves), e.g. vp(V, NP)

Parsing needs to be adapted:
• using unification

Unification in a nutshell (cf. AI course)

Substitutions, e.g. {Num / singular}, {T / vp(V, NP)}

Applying a substitution:
• simultaneously replace variables by the corresponding terms
• S(Num) {Num / singular} = S(singular)

Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.:
• Art(singular) and Art(Num): gives {Num / singular}
• Art(singular) and Art(plural): fails
• Art(Num1) and Art(Num2): {Num1 / Num2}
• PN(Num, accusative) and PN(singular, Case): {Num / singular, Case / accusative}
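A toy unification routine for such flat terms might look as follows. The representation is my own: terms are tuples, variables are capitalized strings as in Prolog. It omits the occurs check and would misread a distinct uppercase functor name as a variable, so it is only a sketch for the examples above:

```python
def is_var(t):
    """Prolog convention: variables are strings starting with a capital letter."""
    return isinstance(t, str) and t[0].isupper()

def unify(t1, t2, subst=None):
    """Return a most general unifier (as a dict) of two terms, or None on failure."""
    subst = dict(subst or {})
    t1, t2 = subst.get(t1, t1), subst.get(t2, t2)   # dereference bound variables
    if t1 == t2:
        return subst
    if is_var(t1):
        subst[t1] = t2
        return subst
    if is_var(t2):
        subst[t2] = t1
        return subst
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for a, b in zip(t1, t2):                     # unify argument by argument
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None  # clash: unification fails

print(unify(("Art", "singular"), ("Art", "Num")))     # {'Num': 'singular'}
print(unify(("Art", "singular"), ("Art", "plural")))  # None (fails)
```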

Parsing with DCGs

Now require successful unification at each step:

S => NP(N) VP(N)
  => Art(N) N(N) VP(N)       {N / singular}
  => a N(singular) VP(singular)
  => a turtle VP(singular)
  => a turtle sleeps

"a turtle sleep" fails.

Case Marking

PN(singular, nominative) --> [he]; [she]
PN(singular, accusative) --> [him]; [her]
PN(plural, nominative) --> [they]
PN(plural, accusative) --> [them]

S --> NP(Number, nominative), VP(Number)
VP(Number) --> V(Number), NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any) --> Det, N(Number)

He sees her. She sees him. They see her.
But not: Them see he.

DCGs

DCGs are strictly more expressive than CFGs. They can represent, for instance, the language a^n b^n c^n:

S(N) --> A(N), B(N), C(N)
A(0) --> []
B(0) --> []
C(0) --> []
A(s(N)) --> A(N), [a]
B(s(N)) --> B(N), [b]
C(s(N)) --> C(N), [c]
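A recognizer for this language (a^n b^n c^n, which no CFG can generate) can be sketched by making the shared count explicit, mirroring the DCG's number argument s(N):

```python
def block(ch, n, s, i):
    """Derive exactly n copies of `ch` starting at position i; new position or None."""
    if s[i:i + n] == ch * n:
        return i + n
    return None

def accepts(s):
    """Recognize a^n b^n c^n: like S(N) --> A(N), B(N), C(N), trying each count n."""
    for n in range(len(s) + 1):
        i = block("a", n, s, 0)
        if i is None:
            continue
        i = block("b", n, s, i)
        if i is None:
            continue
        i = block("c", n, s, i)
        if i == len(s):
            return True
    return False

print(accepts("aabbcc"))   # True
print(accepts("aabbc"))    # False
```

The single count n shared by all three blocks is exactly what the CFG formalism cannot express and the DCG argument N can.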

Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• define a probability model for the data
• compute the probability of each alternative
• choose the most likely alternative

Illustrated on:
• the Shannon game
• spelling correction
• parsing

Illustration

Wall Street Journal corpus: 3,000,000 words, with the correct parse tree for each sentence known:
• constructed by hand
• can be used to derive stochastic context-free grammars
• SCFGs assign a probability to parse trees

Compute the most probable parse tree.
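How an SCFG ranks alternative parses can be sketched on the attachment ambiguity from the earlier slide ("I saw the man … with a telescope"). The rule probabilities and tree encoding below are invented for illustration, not estimated from the WSJ corpus:

```python
# Invented SCFG rule probabilities for the two PP attachments.
rules = {
    ("VP", ("V", "NP", "PP")): 0.2,   # PP attaches to the verb
    ("VP", ("V", "NP")):       0.5,
    ("NP", ("NP", "PP")):      0.2,   # PP attaches to the noun
    ("NP", ("D", "N")):        0.5,
}

def tree_prob(tree):
    """Probability of a parse tree = product of the probabilities of its rules."""
    if isinstance(tree, str):          # preterminal leaf: probability 1 here
        return 1.0
    label, children = tree
    child_labels = tuple(c[0] if isinstance(c, tuple) else c for c in children)
    p = rules.get((label, child_labels), 1.0)
    for c in children:
        p *= tree_prob(c)
    return p

# Trees are (label, children) pairs; strings stand for preterminals.
verb_attach = ("VP", ["V", ("NP", ["D", "N"]), "PP"])
noun_attach = ("VP", ["V", ("NP", [("NP", ["D", "N"]), "PP"])])
print(tree_prob(verb_attach), tree_prob(noun_attach))  # 0.1 vs 0.05
```

Choosing the most probable parse tree then just means taking the argmax over such products, which is what SCFG parsers do (efficiently, via dynamic programming).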

Sequences are omnipresent

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
• Regular grammars and finite state automata:
  - Markov models using n-grams
  - (Hidden) Markov models
  - Conditional random fields (as an example of using undirected graphical models)
• Probabilistic context-free grammars
• Probabilistic definite clause grammars

All use principles of Part I on graphical models.

  • Advanced Artificial Intelligence
  • Topic
  • Contents
  • Rationalism versus Empiricism
  • Slide 5
  • This course
  • Ambiguity
  • NLP and Statistics
  • Slide 9
  • Corpora
  • Word Counts
  • Slide 12
  • Word Counts (Brown corpus)
  • Slide 14
  • Zipflsquos Law
  • Language and sequences
  • Key NLP Problem Ambiguity
  • Language Model
  • Example of bad language model
  • A bad language model
  • Slide 22
  • A good language model
  • Why language models
  • Applications
  • Spelling errors
  • Handwriting recognition
  • For Spell Checkers
  • Another dimension in language models
  • Sequence Tagging
  • Slide 31
  • Parsing
  • Slide 33
  • Language models based on Grammars
  • Grammars and parsing
  • Regular Grammars and Finite State Automata
  • Finite State Automaton
  • Phrase structure
  • Notation
  • Context Free Grammar
  • Slide 41
  • Top-down parsing
  • Context-free grammar
  • Parse tree
  • Definite Clause Grammars Non-terminals may have arguments
  • DCGs
  • Unification in a nutshell (cf AI course)
  • Unification
  • Parsing with DCGs
  • Case Marking
  • Slide 51
  • Probabilistic Models
  • Illustration
  • PowerPoint Presentation
  • Sequences are omni-present
  • Rest of the Course

    Topic

    Statistical Natural Language Processing Applies

    bull Machine Learning Statistics to Learning the ability to improve onersquos behaviour at a specific

    task over time - involves the analysis of data (statistics)

    bull Natural Language Processing Following parts of the book

    bull Statistical NLP (Manning and Schuetze) MIT Press 1999

    Contents

    Motivation Zipfrsquos law Some natural language processing tasks Non-probabilistic NLP models

    bull Regular grammars and finite state automatabull Context-Free Grammarsbull Definite Clause Grammars

    Motivation for statistical NLP Overview of the rest of this part

    Rationalism versus Empiricism

    Rationalist bull Noam Chomsky - innate language structuresbull AI hand coding NLPbull Dominant view 1960-1985bull Cf eg Steven Pinkerrsquos The language instinct (popular

    science book) Empiricist

    bull Ability to learn is innatebull AI language is learned from corporabull Dominant 1920-1960 and becoming increasingly important

    Rationalism versus Empiricism

    Noam Chomskybull But it must be recognized that the notion of ldquoprobability of

    a sentencerdquo is an entirely useless one under any known interpretation of this term

    Fred Jelinek (IBM 1988)bull Every time a linguist leaves the room the recognition rate

    goes upbull (Alternative Every time I fire a linguist the recognizer

    improves)

    This course

    Empiricist approach bull Focus will be on probabilistic models for learning

    of natural language No time to treat natural language in depth

    bull (though this would be quite useful and interesting)

    bull Deserves a full course by itself Covered in more depth in Logic Language and

    Learning (SS 05 prob SS 06)

    Ambiguity

    Statistical Disambiguation

    bull Define a probability model for the data

    bull Compute the probability of each alternative

    bull Choose the most likely alternative

    NLP and Statistics

    Statistical Methods deal with uncertaintyThey predict the future behaviour of a systembased on the behaviour observed in the past

    Statistical Methods require training data

    The data in Statistical NLP are the Corpora

    NLP and Statistics

    Corpus text collection for linguistic purposes

    TokensHow many words are contained in Tom Sawyer 71370

    TypesHow many different words are contained in TS 8018

    Hapax Legomenawords appearing only once

    Corpora

    The most frequent words are function words

    word freq word freq

    the 3332 in 906

    and 2972 that 877

    a 1775 he 877

    to 1725 I 783

    of 1440 his 772

    was 1161 you 686

    it 1027 Tom 679

    Word Counts

    f nf

    1 39932 12923 6644 4105 2436 1997 1728 1319 8210 9111-50 54051-100 99gt 100 102

    How many words appear f times

    Word Counts

    About half of the words occurs just onceAbout half of the text consists of the

    100 most common wordshellip

    Word Counts (Brown corpus)

    Word Counts (Brown corpus)

    word f r fr word f r frthe 3332 1 3332 turned 51 200 10200and 2972 2 5944 youlsquoll 30 300 9000a 1775 3 5235 name 21 400 8400he 877 10 8770 comes 16 500 8000but 410 20 8400 group 13 600 7800be 294 30 8820 lead 11 700 7700there 222 40 8880 friends 10 800 8000one 172 50 8600 begin 9 900 8100about 158 60 9480 family 8 1000 8000more 138 70 9660 brushed 4 2000 8000never 124 80 9920 sins 2 3000 6000Oh 116 90 10440 Could 2 4000 8000two 104 100 10400 Applausive 1 8000 8000

    Zipflsquos Law f~1r (fr = const)

    Zipflsquos Law

    Minimize effort

    Language and sequences

    Natural language processingbull Is concerned with the analysis of

    sequences of words sentencesbullConstruction of language models

    Two types of modelsbullNon-probabilisticbull Probabilistic

    Human Language is highly ambiguous at all levels

    bull acoustic levelrecognize speech vs wreck a nice beach

    bull morphological levelsaw to see (past) saw (noun) to saw (present inf)

    bull syntactic levelI saw the man on the hill with a telescope

    bull semantic levelOne book has to be read by every student

    Key NLP Problem Ambiguity

    Language Model

    A formal model about language Two types

    bull Non-probabilistic Allows one to compute whether a certain sequence

    (sentence or part thereof) is possible Often grammar based

    bull Probabilistic Allows one to compute the probability of a certain

    sequence Often extends grammars with probabilities

    Example of bad language model

    A bad language model

    A bad language model

    A good language model

    Non-Probabilisticbull ldquoI swear to tell the truthrdquo is possiblebull ldquoI swerve to smell de souprdquo is impossible

    Probabilisticbull P(I swear to tell the truth) ~ 0001bull P(I swerve to smell de soup) ~ 0

    Why language models

    Consider a Shannon Gamebull Predicting the next word in the sequence

    Statistical natural language hellip The cat is thrown out of the hellip The large green hellip Sue swallowed the large green hellip hellip

    Model at the sentence level

    Applications

    Spelling correction Mobile phone texting Speech recognition Handwriting recognition Disabled users hellip

    Spelling errors

    They are leaving in about fifteen minuets to go to her house

    The study was conducted mainly be John Black Hopefully all with continue smoothly in my absence Can they lave him my messages I need to notified the bank ofhellip He is trying to fine out

    Handwriting recognition

    Assume a note is given to a bank teller which the teller reads as I have a gub (cf Woody Allen)

    NLP to the rescue hellipbull gub is not a wordbull gun gum Gus and gull are words but gun has a

    higher probability in the context of a bank

    For Spell Checkers

    Collect list of commonly substituted wordsbull piecepeace whetherweather theirthere

    ExampleldquoOn Tuesday the whether helliprsquorsquoldquoOn Tuesday the weather helliprdquo

    Another dimension in language models

    Do we mainly want to infer (probabilities) of legal sentences sequences bull So far

    Or do we want to infer properties of these sentences bull Eg parse tree part-of-speech-taggingbullNeeded for understanding NL

    Letrsquos look at some tasks

    Sequence Tagging

    Part-of-speech taggingbull He drives with his bikebull N V PR PN N noun verb preposition pronoun noun

    Text extractionbull The job is that of a programmerbull X X X X X X JobTypebull The seminar is taking place from 1500 to 1600bull X X X X X X Start End

    Sequence Tagging

    Predicting the secondary structure of proteins mRNA hellipbull X = AFARLMMAhellipbull Y = hehestststhesthe hellip

    Parsing

    Given a sentence find its parse tree Important step in understanding NL

    Parsing

    In bioinformatics allows to predict (elements of) structure from sequence

    Language models based on Grammars

    Grammar Typesbull Regular grammars and Finite State Automatabull Context-Free Grammarsbull Definite Clause Grammars

    A particular type of Unification Based Grammar (Prolog)

    Distinguish lexicon from grammarbull Lexicon (dictionary) contains information about

    words eg word - possible tags (and possibly additional information) flies - V(erb) - N(oun)

    bull Grammar encode rules

    Grammars and parsing Syntactic level best understood and formalized Derivation of grammatical structure parsing

    (more than just recognition) Result of parsing mostly parse tree

    showing the constituents of a sentence eg verb or noun phrases

    Syntax usually specified in terms of a grammar consisting of grammar rules

    Regular Grammars and Finite State Automata

    Lexical information - which words are bull Det(erminer)bull N(oun)bull Vi (intransitive verb) - no

    argumentbull Pn (pronoun) bull Vt (transitive verb) - takes an

    argumentbull Adj (adjective)

    Now acceptbull The cat sleptbull Det N Vi

    As regular grammarbull S -gt [Det] S1 [ ] terminal bull S1 -gt [N] S2bull S2 -gt [Vi]

    Lexicon bull The - Detbull Cat - Nbull Slept - Vi

    bull hellip

    Finite State Automaton

    Sentences

    bull John smiles - Pn Vibull The cat disappeared - Det N Vibull These new shoes hurt - Det Adj N Vibull John liked the old cat PN Vt Det Adj N

    Phrase structure

    S

    NP

    D N

    VP

    NPV

    D N

    PP

    P NP

    D N

    the dog chased a cat into the garden

    Notation

    S sentence D or Det Determiner (eg articles) N noun V verb P preposition NP noun phrase VP verb phrase PP prepositional phrase

    Context Free GrammarS -gt NP VPNP -gt D NVP -gt V NPVP -gt V NP PPPP -gt P NPD -gt [the]D -gt [a]N -gt [dog]N -gt [cat]N -gt [garden]V -gt [chased]V -gt [saw]P -gt [into]

    Terminals ~ Lexicon

    Phrase structure

    Formalism of context-free grammarsbull Nonterminal symbols S NP VP bull Terminal symbols dog cat saw the

    Recursionbull bdquoThe girl thought the dog chased the catldquo

    VP -gt V SN -gt [girl]V -gt [thought]

    Top-down parsing

    S -gt NP VP S -gt Det N VP S -gt The N VP S -gt The dog VP S -gt The dog V NP S -gt The dog chased NP S -gt The dog chased Det N S-gt The dog chased the N S-gt The dog chased the cat

    Context-free grammarSS --gt --gt NPNPVPVP

    NPNP --gt PN --gt PN Proper nounProper noun

    NPNP --gt Art Adj N--gt Art Adj N

    NPNP --gt ArtN--gt ArtN

    VPVP --gt VI --gt VI intransitive verbintransitive verb

    VPVP --gt VT --gt VT NPNP transitive verbtransitive verb

    ArtArt --gt [the]--gt [the]

    AdjAdj --gt [lazy]--gt [lazy]

    AdjAdj --gt [rapid]--gt [rapid]

    PNPN --gt [achilles]--gt [achilles]

    NN --gt [turtle]--gt [turtle]

    VIVI --gt [sleeps]--gt [sleeps]

    VTVT --gt [beats]--gt [beats]

    Parse tree

    SS

    NPNP VPVP

    ArtArt AdjAdj NN VtVt NPNP

    PNPN

    achillesachillesbeatsbeatsturtleturtlerapidrapidthethe

    Definite Clause GrammarsNon-terminals may have arguments

    SS --gt --gt NPNP((NN))VPVP((NN))

    NP(NP(NN)) --gt Art(--gt Art(NN)N()N(NN))

    VP(VP(NN)) --gt VI(--gt VI(NN))

    Art(Art(singularsingular)) --gt [a]--gt [a]

    Art(Art(singularsingular)) --gt [the]--gt [the]

    Art(Art(pluralplural)) --gt [the]--gt [the]

    N(N(singularsingular)) --gt [turtle]--gt [turtle]

    N(N(pluralplural)) --gt [turtles]--gt [turtles]

    VI(VI(singularsingular)) --gt [sleeps]--gt [sleeps]

    VI(VI(pluralplural)) --gt [sleep]--gt [sleep]

    Number Agreement

    DCGs

    Non-terminals may have argumentsbull Variables (start with capital)

    Eg Number Any

    bull Constants (start with lower case) Eg singular plural

    bull Structured terms (start with lower case and take arguments themselves) Eg vp(VNP)

    Parsing needs to be adapted bull Using unification

    Unification in a nutshell (cf AI course)

    Substitutions

    Eg Num singular T vp(VNP)

    Applying substitution bull Simultaneously replace variables by

    corresponding termsbull S(Num) Num singular = S(singular)

    Unification

    Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

    Gives Num singular

    bull Art(singular) and Art(plural) Fails

    bull Art(Num1) and Art(Num2) Num1 Num2

    bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

    Parsing with DCGs

    Now require successful unification at each step

    S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

    S-gt a turtle sleep fails

    Case Marking

    PN(singularnominative)PN(singularnominative) --gt --gt [he][she][he][she]

    PN(singularaccusative)PN(singularaccusative) --gt --gt [him][her][him][her]

    PN(pluralnominative)PN(pluralnominative) --gt --gt [they][they]

    PN(pluralaccusative)PN(pluralaccusative) --gt --gt [them][them]

    S --gt NP(Numbernominative) NP(Number)S --gt NP(Numbernominative) NP(Number)

    VP(Number) --gt V(Number) VP(Anyaccusative)VP(Number) --gt V(Number) VP(Anyaccusative)

    VP(NumberCase) --gt PN(NumberCase)VP(NumberCase) --gt PN(NumberCase)

    VP(NumberAny) --gt Det N(Number)VP(NumberAny) --gt Det N(Number)

    He sees her She sees him They see her

    But not Them see he

    DCGs

    Are strictly more expressive than CFGs Can represent for instance

    bull S(N) -gt A(N) B(N) C(N)bull A(0) -gt [] bull B(0) -gt []bull C(0) -gt []bull A(s(N)) -gt A(N) [A]bull B(s(N)) -gt B(N) [B]bull C(s(N)) -gt C(N) [C]

    Probabilistic Models

    Traditional grammar models are very rigid bull essentially a yes no decision

    Probabilistic grammarsbull Define a probability models for the databull Compute the probability of each alternativebull Choose the most likely alternative

    Ilustrate on bull Shannon Gamebull Spelling correctionbull Parsing

    Illustration

    Wall Street Journal Corpus 3 000 000 words Correct parse tree for sentences known

    bull Constructed by handbull Can be used to derive stochastic context free

    grammarsbull SCFG assign probability to parse trees

    Compute the most probable parse tree

    Sequences are omni-present

    Therefore the techniques we will see also apply tobull Bioinformatics

    DNA proteins mRNA hellip can all be represented as strings

    bullRobotics Sequences of actions states hellip

    bullhellip

    Rest of the Course

    Limitations traditional grammar models motivate probabilistic extensions bull Regular grammars and Finite State Automata

    All use principles of Part I on Graphical Models Markov Models using n-gramms (Hidden) Markov Models Conditional Random Fields

    bull As an example of using undirected graphical models

    bull Probabilistic Context Free Grammarsbull Probabilistic Definite Clause Grammars

    • Advanced Artificial Intelligence
    • Topic
    • Contents
    • Rationalism versus Empiricism
    • Slide 5
    • This course
    • Ambiguity
    • NLP and Statistics
    • Slide 9
    • Corpora
    • Word Counts
    • Slide 12
    • Word Counts (Brown corpus)
    • Slide 14
    • Zipflsquos Law
    • Language and sequences
    • Key NLP Problem Ambiguity
    • Language Model
    • Example of bad language model
    • A bad language model
    • Slide 22
    • A good language model
    • Why language models
    • Applications
    • Spelling errors
    • Handwriting recognition
    • For Spell Checkers
    • Another dimension in language models
    • Sequence Tagging
    • Slide 31
    • Parsing
    • Slide 33
    • Language models based on Grammars
    • Grammars and parsing
    • Regular Grammars and Finite State Automata
    • Finite State Automaton
    • Phrase structure
    • Notation
    • Context Free Grammar
    • Slide 41
    • Top-down parsing
    • Context-free grammar
    • Parse tree
    • Definite Clause Grammars Non-terminals may have arguments
    • DCGs
    • Unification in a nutshell (cf AI course)
    • Unification
    • Parsing with DCGs
    • Case Marking
    • Slide 51
    • Probabilistic Models
    • Illustration
    • PowerPoint Presentation
    • Sequences are omni-present
    • Rest of the Course

      Contents

      Motivation Zipfrsquos law Some natural language processing tasks Non-probabilistic NLP models

      bull Regular grammars and finite state automatabull Context-Free Grammarsbull Definite Clause Grammars

      Motivation for statistical NLP Overview of the rest of this part

      Rationalism versus Empiricism

      Rationalist bull Noam Chomsky - innate language structuresbull AI hand coding NLPbull Dominant view 1960-1985bull Cf eg Steven Pinkerrsquos The language instinct (popular

      science book) Empiricist

      bull Ability to learn is innatebull AI language is learned from corporabull Dominant 1920-1960 and becoming increasingly important

      Rationalism versus Empiricism

      Noam Chomskybull But it must be recognized that the notion of ldquoprobability of

      a sentencerdquo is an entirely useless one under any known interpretation of this term

      Fred Jelinek (IBM 1988)bull Every time a linguist leaves the room the recognition rate

      goes upbull (Alternative Every time I fire a linguist the recognizer

      improves)

      This course

      Empiricist approach bull Focus will be on probabilistic models for learning

      of natural language No time to treat natural language in depth

      bull (though this would be quite useful and interesting)

      bull Deserves a full course by itself Covered in more depth in Logic Language and

      Learning (SS 05 prob SS 06)

      Ambiguity

      Statistical Disambiguation

      bull Define a probability model for the data

      bull Compute the probability of each alternative

      bull Choose the most likely alternative

      NLP and Statistics

      Statistical Methods deal with uncertaintyThey predict the future behaviour of a systembased on the behaviour observed in the past

      Statistical Methods require training data

      The data in Statistical NLP are the Corpora

      NLP and Statistics

Corpus: a text collection for linguistic purposes

Tokens: How many words are contained in Tom Sawyer? 71,370

Types: How many different words are contained in TS? 8,018

Hapax legomena: words appearing only once
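These corpus statistics are easy to reproduce. The sketch below uses naive lowercased whitespace tokenization on a toy sentence (real corpus work would use a proper tokenizer and the actual Tom Sawyer text, neither of which is included here):

```python
from collections import Counter

def corpus_stats(text):
    """Return (token count, type count, hapax legomena) for a text,
    using naive lowercased whitespace tokenization."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    hapax = sorted(w for w, c in counts.items() if c == 1)
    return len(tokens), len(counts), hapax

tokens, types, hapax = corpus_stats("the cat saw the dog and the dog ran")
print(tokens, types, hapax)  # 9 tokens, 6 types; hapax: and, cat, ran, saw
```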

      Corpora

      The most frequent words are function words

      word freq word freq

      the 3332 in 906

      and 2972 that 877

      a 1775 he 877

      to 1725 I 783

      of 1440 his 772

      was 1161 you 686

      it 1027 Tom 679

      Word Counts

How many words appear f times?

f       nf
1       3993
2       1292
3       664
4       410
5       243
6       199
7       172
8       131
9       82
10      91
11-50   540
51-100  99
>100    102

      Word Counts

About half of the words occur just once.
About half of the text consists of the 100 most common words…

      Word Counts (Brown corpus)


word    f     r    f·r      word        f   r     f·r
the     3332  1    3332     turned      51  200   10200
and     2972  2    5944     you'll      30  300   9000
a       1775  3    5235     name        21  400   8400
he      877   10   8770     comes       16  500   8000
but     410   20   8400     group       13  600   7800
be      294   30   8820     lead        11  700   7700
there   222   40   8880     friends     10  800   8000
one     172   50   8600     begin       9   900   8100
about   158   60   9480     family      8   1000  8000
more    138   70   9660     brushed     4   2000  8000
never   124   80   9920     sins        2   3000  6000
Oh      116   90   10440    Could       2   4000  8000
two     104   100  10400    Applausive  1   8000  8000

Zipf's Law: f ~ 1/r (f·r = const)

Zipf's Law

      Minimize effort
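Zipf's law is easy to check numerically. The sketch below multiplies frequency by rank for a few (word, f, r) rows from the table above; under Zipf's law the product should be roughly constant, and indeed it stays within a single order of magnitude:

```python
# (word, frequency f, rank r) rows taken from the table above
rows = [("the", 3332, 1), ("and", 2972, 2), ("he", 877, 10),
        ("but", 410, 20), ("there", 222, 40), ("one", 172, 50),
        ("never", 124, 80), ("two", 104, 100)]

# f * r for each row; Zipf's law predicts these are all about the same size
products = [f * r for _, f, r in rows]
print(min(products), max(products))  # products range from 3332 to 10400
```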

      Language and sequences

Natural language processing:
• is concerned with the analysis of sequences of words / sentences
• construction of language models

Two types of models:
• Non-probabilistic
• Probabilistic

      Human Language is highly ambiguous at all levels

• acoustic level: "recognize speech" vs. "wreck a nice beach"

• morphological level: "saw": to see (past), saw (noun), to saw (present, inf.)

• syntactic level: "I saw the man on the hill with a telescope"

• semantic level: "One book has to be read by every student"

      Key NLP Problem Ambiguity

      Language Model

A formal model about language. Two types:

• Non-probabilistic: allows one to compute whether a certain sequence (a sentence or part thereof) is possible; often grammar-based

• Probabilistic: allows one to compute the probability of a certain sequence; often extends grammars with probabilities

      Example of bad language model

      A bad language model

      A bad language model

      A good language model

Non-probabilistic:
• "I swear to tell the truth" is possible
• "I swerve to smell de soup" is impossible

Probabilistic:
• P(I swear to tell the truth) ~ 0.0001
• P(I swerve to smell de soup) ~ 0

      Why language models

Consider a Shannon game:
• predicting the next word in a sequence

"Statistical natural language …"
"The cat is thrown out of the …"
"The large green …"
"Sue swallowed the large green …"
…

      Model at the sentence level
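A first sentence-level model for the Shannon game can be sketched with bigram counts: record which word follows which, then predict the most frequent continuation. The toy corpus and function names below are illustrative, not from the lecture:

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Collect next-word counts for each word in a list of sentences."""
    nxt = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for a, b in zip(words, words[1:]):
            nxt[a][b] += 1
    return nxt

def predict_next(nxt, word):
    """Return the most frequent continuation of `word`, or None if unseen."""
    followers = nxt.get(word.lower())
    return followers.most_common(1)[0][0] if followers else None

corpus = ["the cat is thrown out of the house",
          "the cat is sleeping",
          "the dog is thrown a bone"]
model = train_bigrams(corpus)
print(predict_next(model, "is"))  # "thrown" (seen twice after "is")
```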

      Applications

• Spelling correction
• Mobile phone texting
• Speech recognition
• Handwriting recognition
• Disabled users
• …

      Spelling errors

"They are leaving in about fifteen minuets to go to her house."
"The study was conducted mainly be John Black."
"Hopefully all with continue smoothly in my absence."
"Can they lave him my messages?"
"I need to notified the bank of…"
"He is trying to fine out."

(Each error is itself a valid word, so a purely dictionary-based checker misses it.)

      Handwriting recognition

Assume a note is given to a bank teller, which the teller reads as "I have a gub." (cf. Woody Allen)

NLP to the rescue…:
• gub is not a word
• gun, gum, Gus, and gull are words, but gun has a higher probability in the context of a bank

      For Spell Checkers

Collect a list of commonly substituted words:
• piece/peace, whether/weather, their/there, …

Example:
"On Tuesday the whether …"
"On Tuesday the weather …"
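The confusion-set idea can be sketched directly: for each confusable word, pick the variant the language model prefers in context. The confusion sets and bigram counts below are invented stand-ins for a real corpus model:

```python
from collections import Counter

# Hypothetical confusion sets of commonly substituted words
CONFUSION = {"whether": {"whether", "weather"},
             "weather": {"whether", "weather"},
             "piece": {"piece", "peace"},
             "peace": {"piece", "peace"}}

# Toy bigram counts standing in for a corpus-trained language model
BIGRAMS = Counter({("the", "weather"): 50, ("the", "whether"): 1,
                   ("of", "peace"): 30, ("of", "piece"): 2})

def correct(sentence):
    """Replace each confusable word by the variant the bigram model prefers."""
    words = sentence.lower().split()
    out = []
    for i, w in enumerate(words):
        if w in CONFUSION and i > 0:
            prev = words[i - 1]
            w = max(CONFUSION[w], key=lambda v: BIGRAMS[(prev, v)])
        out.append(w)
    return " ".join(out)

print(correct("on tuesday the whether was fine"))
# -> "on tuesday the weather was fine"
```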

      Another dimension in language models

Do we mainly want to infer (probabilities of) legal sentences / sequences?
• So far

Or do we want to infer properties of these sentences?
• E.g. parse tree, part-of-speech tagging
• Needed for understanding NL

Let's look at some tasks.

      Sequence Tagging

Part-of-speech tagging:
• He drives with his bike
• N V PR PN N (noun, verb, preposition, pronoun, noun)

Text extraction:
• The job is that of a programmer.
• X X X X X X JobType
• The seminar is taking place from 15:00 to 16:00.
• X X X X X X Start End

      Sequence Tagging

Predicting the secondary structure of proteins, mRNA, …
• X = AFARLMMA…
• Y = hehestststhesthe…

      Parsing

Given a sentence, find its parse tree.
An important step in understanding NL.

      Parsing

In bioinformatics, parsing allows one to predict (elements of) structure from sequence.

      Language models based on Grammars

Grammar types:
• Regular grammars and Finite State Automata
• Context-Free Grammars
• Definite Clause Grammars: a particular type of Unification-Based Grammar (Prolog)

Distinguish lexicon from grammar:
• Lexicon (dictionary): contains information about words, e.g. word -> possible tags (and possibly additional information): flies -> V(erb) or N(oun)
• Grammar: encodes rules

Grammars and parsing

Syntactic level best understood and formalized.
Derivation of grammatical structure: parsing (more than just recognition).
Result of parsing: mostly a parse tree, showing the constituents of a sentence, e.g. verb or noun phrases.
Syntax is usually specified in terms of a grammar consisting of grammar rules.

      Regular Grammars and Finite State Automata

Lexical information: which words are
• Det(erminer)
• N(oun)
• Vi (intransitive verb): takes no argument
• Pn (pronoun)
• Vt (transitive verb): takes an argument
• Adj (adjective)

Now accept:
• The cat slept
• Det N Vi

As a regular grammar ([ ] marks a terminal):
• S -> [Det], S1
• S1 -> [N], S2
• S2 -> [Vi]

Lexicon:
• the: Det
• cat: N
• slept: Vi
• …

      Finite State Automaton

Sentences:
• John smiles: Pn Vi
• The cat disappeared: Det N Vi
• These new shoes hurt: Det Adj N Vi
• John liked the old cat: Pn Vt Det Adj N
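The regular-grammar analysis above is equivalent to running a finite state automaton over the tag sequence. The sketch below invents a small lexicon and transition table just large enough to cover the example sentences; it is not the full automaton from the slides:

```python
# Toy lexicon mapping words to tags (illustrative, not complete)
LEXICON = {"the": "Det", "these": "Det", "john": "Pn", "cat": "N",
           "shoes": "N", "new": "Adj", "old": "Adj", "smiles": "Vi",
           "slept": "Vi", "disappeared": "Vi", "hurt": "Vi", "liked": "Vt"}

# States: S (start), S1 (after Det, optional Adjs), S2 (after subject),
# S3 (after Vt, expecting object NP), S4 (inside object NP), F (accept)
TRANSITIONS = {("S", "Det"): "S1", ("S", "Pn"): "S2",
               ("S1", "Adj"): "S1", ("S1", "N"): "S2",
               ("S2", "Vi"): "F", ("S2", "Vt"): "S3",
               ("S3", "Det"): "S4", ("S3", "Pn"): "F",
               ("S4", "Adj"): "S4", ("S4", "N"): "F"}

def accepts(sentence):
    """Run the automaton over the sentence's tag sequence."""
    state = "S"
    for word in sentence.lower().split():
        tag = LEXICON.get(word)
        state = TRANSITIONS.get((state, tag))
        if state is None:          # no transition: reject
            return False
    return state == "F"

print(accepts("the cat slept"))           # True
print(accepts("john liked the old cat"))  # True
print(accepts("cat the slept"))           # False
```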

      Phrase structure

Parse tree for "the dog chased a cat into the garden":

[S [NP [D the] [N dog]]
   [VP [V chased]
       [NP [D a] [N cat]]
       [PP [P into] [NP [D the] [N garden]]]]]

      Notation

S: sentence
D or Det: determiner (e.g. articles)
N: noun
V: verb
P: preposition
NP: noun phrase
VP: verb phrase
PP: prepositional phrase

Context Free Grammar:
S -> NP VP
NP -> D N
VP -> V NP
VP -> V NP PP
PP -> P NP
D -> [the]
D -> [a]
N -> [dog]
N -> [cat]
N -> [garden]
V -> [chased]
V -> [saw]
P -> [into]

      Terminals ~ Lexicon

      Phrase structure

Formalism of context-free grammars:
• Nonterminal symbols: S, NP, VP, …
• Terminal symbols: dog, cat, saw, the, …

Recursion:
• "The girl thought the dog chased the cat"
• VP -> V S
• N -> [girl]
• V -> [thought]

      Top-down parsing

S -> NP VP
  -> Det N VP
  -> The N VP
  -> The dog VP
  -> The dog V NP
  -> The dog chased NP
  -> The dog chased Det N
  -> The dog chased the N
  -> The dog chased the cat
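The derivation above is exactly what a backtracking top-down (recursive-descent) recognizer does mechanically. A minimal sketch over the toy CFG from the earlier slide (the generator-based backtracking is one of several possible implementations):

```python
# Toy CFG from the slides; a symbol not in GRAMMAR is a terminal word
GRAMMAR = {"S": [["NP", "VP"]],
           "NP": [["D", "N"]],
           "VP": [["V", "NP", "PP"], ["V", "NP"]],
           "PP": [["P", "NP"]],
           "D": [["the"], ["a"]],
           "N": [["dog"], ["cat"], ["garden"]],
           "V": [["chased"], ["saw"]],
           "P": [["into"]]}

def parse(symbol, words, pos):
    """Try to derive a prefix of words[pos:] from `symbol`;
    yield every position reachable after a successful match."""
    if symbol not in GRAMMAR:                    # terminal: match one word
        if pos < len(words) and words[pos] == symbol:
            yield pos + 1
        return
    for rhs in GRAMMAR[symbol]:                  # try each rule (backtracking)
        positions = [pos]
        for sym in rhs:
            positions = [p2 for p in positions for p2 in parse(sym, words, p)]
        yield from positions

def recognize(sentence):
    words = sentence.lower().split()
    return len(words) in parse("S", words, 0)    # must consume all words

print(recognize("the dog chased a cat into the garden"))  # True
print(recognize("the dog chased"))                        # False
```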

Context-free grammar:

S   --> NP, VP
NP  --> PN              (proper noun)
NP  --> Art, Adj, N
NP  --> Art, N
VP  --> VI              (intransitive verb)
VP  --> VT, NP          (transitive verb)
Art --> [the]
Adj --> [lazy]
Adj --> [rapid]
PN  --> [achilles]
N   --> [turtle]
VI  --> [sleeps]
VT  --> [beats]

      Parse tree

Parse tree for "the rapid turtle beats achilles":

[S [NP [Art the] [Adj rapid] [N turtle]]
   [VP [VT beats] [NP [PN achilles]]]]

Definite Clause Grammars: non-terminals may have arguments.

S     --> NP(N), VP(N)
NP(N) --> Art(N), N(N)
VP(N) --> VI(N)
Art(singular) --> [a]
Art(singular) --> [the]
Art(plural)   --> [the]
N(singular)   --> [turtle]
N(plural)     --> [turtles]
VI(singular)  --> [sleeps]
VI(plural)    --> [sleep]

      Number Agreement
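The agreement check that the DCG performs through a shared variable can be sketched directly: give each lexical entry a number feature and require all specified features in an Art N VI sentence to be compatible. The lexicon and the None-as-unspecified convention are my own simplification, not the Prolog mechanism itself:

```python
# Lexicon with number features; None means "either number" (like "the")
LEXICON = {"a": ("Art", "singular"), "the": ("Art", None),
           "turtle": ("N", "singular"), "turtles": ("N", "plural"),
           "sleeps": ("VI", "singular"), "sleep": ("VI", "plural")}

def agrees(sentence):
    """Accept Art N VI sentences whose number features are compatible."""
    entries = [LEXICON.get(w) for w in sentence.lower().split()]
    if len(entries) != 3 or None in entries:
        return False
    (c1, n1), (c2, n2), (c3, n3) = entries
    if (c1, c2, c3) != ("Art", "N", "VI"):
        return False
    nums = {n for n in (n1, n2, n3) if n is not None}
    return len(nums) <= 1      # all specified features must be identical

print(agrees("a turtle sleeps"))    # True
print(agrees("the turtles sleep"))  # True
print(agrees("a turtle sleep"))     # False: singular/plural clash
```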

      DCGs

Non-terminals may have arguments:
• Variables (start with a capital), e.g. Number, Any
• Constants (start with lower case), e.g. singular, plural
• Structured terms (start with lower case and take arguments themselves), e.g. vp(V, NP)

Parsing needs to be adapted:
• using unification

      Unification in a nutshell (cf AI course)

Substitutions, e.g. {Num / singular}, {T / vp(V, NP)}

Applying a substitution:
• simultaneously replace variables by the corresponding terms
• S(Num) {Num / singular} = S(singular)

      Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.:
• Art(singular) and Art(Num): gives {Num / singular}
• Art(singular) and Art(plural): fails
• Art(Num1) and Art(Num2): {Num1 / Num2}
• PN(Num, accusative) and PN(singular, Case): {Num / singular, Case / accusative}
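The same computation can be sketched in Python, representing Art(singular) as the tuple ("art", "singular") and variables as capitalized strings. This is a simplification of full Prolog unification (no occurs-check, no deep variable-chain resolution):

```python
def is_var(t):
    """Variables are capitalized strings, e.g. "Num"."""
    return isinstance(t, str) and t[:1].isupper()

def bind(var, val, subst):
    if var in subst:                     # already bound: unify with old value
        return unify(subst[var], val, subst)
    return {**subst, var: val}

def unify(x, y, subst=None):
    """Return a substitution dict making x and y identical, or None."""
    if subst is None:
        subst = {}
    if x == y:
        return subst
    if is_var(x):
        return bind(x, y, subst)
    if is_var(y):
        return bind(y, x, subst)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for a, b in zip(x, y):           # unify argument by argument
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None

print(unify(("art", "singular"), ("art", "Num")))   # {'Num': 'singular'}
print(unify(("art", "singular"), ("art", "plural")))  # None (fails)
print(unify(("pn", "Num", "accusative"), ("pn", "singular", "Case")))
```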

      Parsing with DCGs

Now require successful unification at each step:

S -> NP(N) VP(N)
  -> Art(N) N(N) VP(N)      with {N / singular}
  -> a N(singular) VP(singular)
  -> a turtle VP(singular)
  -> a turtle sleeps

"a turtle sleep" fails.

      Case Marking

PN(singular, nominative) --> [he] ; [she]
PN(singular, accusative) --> [him] ; [her]
PN(plural, nominative)   --> [they]
PN(plural, accusative)   --> [them]

S --> NP(Number, nominative), VP(Number)
VP(Number) --> V(Number), NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any) --> Det, N(Number)

Accepts: "He sees her", "She sees him", "They see her"

But not: "Them see he"

      DCGs

Are strictly more expressive than CFGs. Can represent, for instance:

S(N)    --> A(N), B(N), C(N)
A(0)    --> []
B(0)    --> []
C(0)    --> []
A(s(N)) --> A(N), [a]
B(s(N)) --> B(N), [b]
C(s(N)) --> C(N), [c]
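This grammar derives the language a^n b^n c^n (each application of s(·) adds one symbol to each block), the classic example of a language that no context-free grammar can generate. A direct membership check, standing in for running the DCG:

```python
def in_anbncn(s):
    """True iff s is a^n b^n c^n for some n >= 0."""
    n = len(s) // 3
    return len(s) == 3 * n and s == "a" * n + "b" * n + "c" * n

print(in_anbncn("aabbcc"))  # True  (n = 2)
print(in_anbncn("abcabc"))  # False (wrong order)
print(in_anbncn("aabbc"))   # False (unequal blocks)
```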

      Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• define a probability model for the data
• compute the probability of each alternative
• choose the most likely alternative

Illustrated on:
• the Shannon game
• spelling correction
• parsing

      Illustration

Wall Street Journal corpus: 3,000,000 words, with the correct parse tree for each sentence known:
• constructed by hand
• can be used to derive stochastic context-free grammars
• SCFGs assign probabilities to parse trees

Compute the most probable parse tree.
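The core idea of an SCFG can be sketched in a few lines: each rule carries a probability, and the probability of a parse tree is the product of the probabilities of the rules it uses. The rule probabilities below are invented, and lexical probabilities are omitted; a real treebank-derived grammar would estimate them from counts:

```python
# Hypothetical rule probabilities (keys: (lhs, rhs-labels))
RULE_PROB = {("S", ("NP", "VP")): 1.0,
             ("NP", ("D", "N")): 0.7, ("NP", ("PN",)): 0.3,
             ("VP", ("V", "NP")): 0.6, ("VP", ("V", "NP", "PP")): 0.4}

def tree_prob(tree):
    """tree = (label, child, ...) with leaf words as plain strings."""
    label, *children = tree
    if all(isinstance(c, str) for c in children):
        return 1.0                       # lexical rules: probability omitted
    rhs = tuple(c[0] for c in children)  # labels of the children
    p = RULE_PROB[(label, rhs)]
    for c in children:
        p *= tree_prob(c)
    return p

t = ("S", ("NP", ("PN", "john")),
          ("VP", ("V", "saw"), ("NP", ("D", "the"), ("N", "cat"))))
print(tree_prob(t))  # 1.0 * 0.3 * 0.6 * 0.7, i.e. about 0.126
```

Picking the most probable parse then means computing this product for every candidate tree (efficiently, via dynamic programming such as the probabilistic CYK algorithm) and taking the maximum.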

      Sequences are omni-present

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

      Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
• Regular grammars and Finite State Automata:
  Markov Models using n-grams, (Hidden) Markov Models, Conditional Random Fields
  • as an example of using undirected graphical models
• Probabilistic Context-Free Grammars
• Probabilistic Definite Clause Grammars

All use the principles of Part I on Graphical Models.



          Rationalism versus Empiricism

          Noam Chomskybull But it must be recognized that the notion of ldquoprobability of

          a sentencerdquo is an entirely useless one under any known interpretation of this term

          Fred Jelinek (IBM 1988)bull Every time a linguist leaves the room the recognition rate

          goes upbull (Alternative Every time I fire a linguist the recognizer

          improves)

          This course

          Empiricist approach bull Focus will be on probabilistic models for learning

          of natural language No time to treat natural language in depth

          bull (though this would be quite useful and interesting)

          bull Deserves a full course by itself Covered in more depth in Logic Language and

          Learning (SS 05 prob SS 06)

          Ambiguity

          Statistical Disambiguation

          bull Define a probability model for the data

          bull Compute the probability of each alternative

          bull Choose the most likely alternative

          NLP and Statistics

          Statistical Methods deal with uncertaintyThey predict the future behaviour of a systembased on the behaviour observed in the past

          Statistical Methods require training data

          The data in Statistical NLP are the Corpora

          NLP and Statistics

          Corpus text collection for linguistic purposes

          TokensHow many words are contained in Tom Sawyer 71370

          TypesHow many different words are contained in TS 8018

          Hapax Legomenawords appearing only once

          Corpora

          The most frequent words are function words

          word freq word freq

          the 3332 in 906

          and 2972 that 877

          a 1775 he 877

          to 1725 I 783

          of 1440 his 772

          was 1161 you 686

          it 1027 Tom 679

          Word Counts

          f nf

          1 39932 12923 6644 4105 2436 1997 1728 1319 8210 9111-50 54051-100 99gt 100 102

          How many words appear f times

          Word Counts

          About half of the words occurs just onceAbout half of the text consists of the

          100 most common wordshellip

          Word Counts (Brown corpus)

          Word Counts (Brown corpus)

          word f r fr word f r frthe 3332 1 3332 turned 51 200 10200and 2972 2 5944 youlsquoll 30 300 9000a 1775 3 5235 name 21 400 8400he 877 10 8770 comes 16 500 8000but 410 20 8400 group 13 600 7800be 294 30 8820 lead 11 700 7700there 222 40 8880 friends 10 800 8000one 172 50 8600 begin 9 900 8100about 158 60 9480 family 8 1000 8000more 138 70 9660 brushed 4 2000 8000never 124 80 9920 sins 2 3000 6000Oh 116 90 10440 Could 2 4000 8000two 104 100 10400 Applausive 1 8000 8000

Zipf's Law: f ~ 1/r  (f·r = const)

Zipf's Law

Explanation: speakers and hearers minimize effort (Zipf's principle of least effort).
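The rank-frequency tables above can be checked mechanically: rank the word types by frequency and inspect f·r. Below is a minimal sketch of that check on a toy word list (the corpus and the chosen ranks are illustrative, not the actual Tom Sawyer or Brown data):

```python
from collections import Counter

def zipf_table(words, ranks=(1, 2, 3)):
    """Rank words by frequency and report f*r for selected ranks r."""
    counts = Counter(words)
    # Frequencies sorted in descending order; rank r starts at 1.
    by_rank = sorted(counts.values(), reverse=True)
    return {r: r * by_rank[r - 1] for r in ranks if r <= len(by_rank)}

# Synthetic corpus with roughly Zipfian counts 100, 50, 33:
corpus = ["the"] * 100 + ["and"] * 50 + ["a"] * 33
print(zipf_table(corpus))  # f*r stays close to 100 for each rank
```

On real corpora the products f·r fluctuate (as in the Brown table above) but stay within the same order of magnitude.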

          Language and sequences

Natural language processing
• Is concerned with the analysis of sequences of words / sentences
• Construction of language models

Two types of models
• Non-probabilistic
• Probabilistic

Human language is highly ambiguous at all levels

• acoustic level: "recognize speech" vs. "wreck a nice beach"

• morphological level: "saw": to see (past), saw (noun), to saw (present, inf.)

• syntactic level: "I saw the man on the hill with a telescope"

• semantic level: "One book has to be read by every student"

          Key NLP Problem Ambiguity

          Language Model

          A formal model about language Two types

• Non-probabilistic: allows one to compute whether a certain sequence (sentence or part thereof) is possible. Often grammar based.

• Probabilistic: allows one to compute the probability of a certain sequence. Often extends grammars with probabilities.

          Example of bad language model


          A good language model

Non-probabilistic
• "I swear to tell the truth" is possible
• "I swerve to smell de soup" is impossible

Probabilistic
• P(I swear to tell the truth) ≈ 0.001
• P(I swerve to smell de soup) ≈ 0

          Why language models

Consider a Shannon game
• Predicting the next word in a sequence:

Statistical natural language …
The cat is thrown out of the …
The large green …
Sue swallowed the large green …
…

          Model at the sentence level
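The Shannon game can be played mechanically with a simple bigram model: count which word follows which in training text, then guess the most frequent continuation. A minimal sketch (the tiny training corpus is invented for illustration):

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count word bigrams from a list of tokenised sentences."""
    nxt = defaultdict(Counter)
    for sent in corpus:
        for w1, w2 in zip(sent, sent[1:]):
            nxt[w1][w2] += 1
    return nxt

def predict_next(nxt, word):
    """Shannon game: guess the most likely continuation of `word`."""
    if word not in nxt:
        return None
    return nxt[word].most_common(1)[0][0]

model = train_bigrams([["the", "cat", "is", "thrown"],
                       ["the", "cat", "sleeps"]])
print(predict_next(model, "the"))  # -> "cat"
```

Real systems use longer n-gram contexts and smoothing, but the prediction principle is the same.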

          Applications

Spelling correction
Mobile phone texting
Speech recognition
Handwriting recognition
Disabled users
…

          Spelling errors

They are leaving in about fifteen minuets to go to her house.

The study was conducted mainly be John Black.
Hopefully, all with continue smoothly in my absence.
Can they lave him my messages?
I need to notified the bank of…
He is trying to fine out.

          Handwriting recognition

Assume a note is given to a bank teller, which the teller reads as "I have a gub" (cf. Woody Allen).

NLP to the rescue …
• "gub" is not a word
• gun, gum, Gus and gull are words, but gun has a higher probability in the context of a bank

          For Spell Checkers

Collect a list of commonly substituted words
• piece/peace, whether/weather, their/there, …

Example:
"On Tuesday, the whether …"
"On Tuesday, the weather …"

          Another dimension in language models

Do we mainly want to infer (probabilities of) legal sentences / sequences?
• So far

Or do we want to infer properties of these sentences?
• E.g. parse tree, part-of-speech tagging
• Needed for understanding NL

Let's look at some tasks

          Sequence Tagging

Part-of-speech tagging
• He drives with his bike
• N V PR PN N (noun, verb, preposition, pronoun, noun)

Text extraction
• The job is that of a programmer.
• X X X X X X JobType
• The seminar is taking place from 15:00 to 16:00.
• X X X X X X Start End

          Sequence Tagging

Predicting the secondary structure of proteins, mRNA, …
• X = AFARLMMA…
• Y = hehestststhesthe…

          Parsing

Given a sentence, find its parse tree. An important step in understanding NL.

          Parsing

In bioinformatics, parsing allows one to predict (elements of) structure from sequence.

          Language models based on Grammars

Grammar types
• Regular grammars and finite state automata
• Context-free grammars
• Definite clause grammars: a particular type of unification-based grammar (Prolog)

Distinguish lexicon from grammar
• Lexicon (dictionary): contains information about words, e.g. word - possible tags (and possibly additional information): flies - V(erb) - N(oun)
• Grammar: encodes rules

Grammars and parsing

Syntactic level: best understood and formalized.
Derivation of grammatical structure: parsing (more than just recognition).
Result of parsing: mostly a parse tree, showing the constituents of a sentence, e.g. verb or noun phrases.
Syntax is usually specified in terms of a grammar consisting of grammar rules.

          Regular Grammars and Finite State Automata

Lexical information - which words are:
• Det(erminer)
• N(oun)
• Pn (pronoun)
• Vi (intransitive verb) - no argument
• Vt (transitive verb) - takes an argument
• Adj (adjective)

Now accept:
• The cat slept
• Det N Vi

As regular grammar:
• S  -> [Det], S1   ([ ]: terminal)
• S1 -> [N], S2
• S2 -> [Vi]

Lexicon:
• the - Det
• cat - N
• slept - Vi
• …
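The regular grammar above corresponds directly to a finite state automaton: each nonterminal is a state and each terminal labels a transition. A minimal sketch of the acceptor (the tiny lexicon is illustrative):

```python
# Hypothetical lexicon mapping words to their tags.
LEXICON = {"the": "Det", "cat": "N", "dog": "N", "slept": "Vi"}

# Automaton transitions: state -> {tag: next state}.
# Mirrors S -> [Det], S1; S1 -> [N], S2; S2 -> [Vi].
TRANSITIONS = {"S": {"Det": "S1"}, "S1": {"N": "S2"}, "S2": {"Vi": "END"}}

def accepts(sentence):
    """True iff the tag sequence drives the automaton to its final state."""
    state = "S"
    for word in sentence.lower().split():
        tag = LEXICON.get(word)
        state = TRANSITIONS.get(state, {}).get(tag)
        if state is None:         # no transition: reject
            return False
    return state == "END"

print(accepts("The cat slept"))   # Det N Vi -> accepted
```

Note that an incomplete sentence such as "The cat" is rejected because the automaton halts in a non-final state.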

          Finite State Automaton

          Sentences

• John smiles - Pn Vi
• The cat disappeared - Det N Vi
• These new shoes hurt - Det Adj N Vi
• John liked the old cat - Pn Vt Det Adj N

          Phrase structure

Parse tree for "the dog chased a cat into the garden":

[S [NP [D the] [N dog]]
   [VP [V chased]
       [NP [D a] [N cat]]
       [PP [P into] [NP [D the] [N garden]]]]]

          Notation

S - sentence
D or Det - determiner (e.g. articles)
N - noun
V - verb
P - preposition
NP - noun phrase
VP - verb phrase
PP - prepositional phrase

Context Free Grammar

S  -> NP VP
NP -> D N
VP -> V NP
VP -> V NP PP
PP -> P NP
D -> [the]
D -> [a]
N -> [dog]
N -> [cat]
N -> [garden]
V -> [chased]
V -> [saw]
P -> [into]

          Terminals ~ Lexicon

          Phrase structure

Formalism of context-free grammars
• Nonterminal symbols: S, NP, VP, …
• Terminal symbols: dog, cat, saw, the, …

Recursion
• "The girl thought the dog chased the cat"

VP -> V S
N -> [girl]
V -> [thought]

          Top-down parsing

S -> NP VP
  -> Det N VP
  -> The N VP
  -> The dog VP
  -> The dog V NP
  -> The dog chased NP
  -> The dog chased Det N
  -> The dog chased the N
  -> The dog chased the cat
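This top-down expansion can be implemented as a recursive-descent recogniser over the CFG from the earlier slide. A minimal sketch (a recogniser only, returning yes/no rather than a parse tree):

```python
# The CFG from the "Context Free Grammar" slide, rules as symbol lists.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP", "PP"], ["V", "NP"]],
    "PP": [["P", "NP"]],
    "D":  [["the"], ["a"]],
    "N":  [["dog"], ["cat"], ["garden"]],
    "V":  [["chased"], ["saw"]],
    "P":  [["into"]],
}

def parse(symbol, words, i):
    """Try to derive a prefix of words[i:] from symbol; yield end positions."""
    if symbol not in GRAMMAR:                  # terminal symbol
        if i < len(words) and words[i] == symbol:
            yield i + 1
        return
    for rule in GRAMMAR[symbol]:               # try each rule top-down
        positions = [i]
        for sym in rule:                       # thread positions through the rule body
            positions = [k for j in positions for k in parse(sym, words, j)]
        yield from positions

def recognises(sentence):
    words = sentence.split()
    return len(words) in parse("S", words, 0)

print(recognises("the dog chased the cat"))  # True
```

This naive backtracking search would loop on left-recursive rules; practical parsers use chart-based algorithms such as Earley or CYK instead.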

Context-free grammar

S   --> NP, VP
NP  --> PN           % proper noun
NP  --> Art, Adj, N
NP  --> Art, N
VP  --> VI           % intransitive verb
VP  --> VT, NP       % transitive verb
Art --> [the]
Adj --> [lazy]
Adj --> [rapid]
PN  --> [achilles]
N   --> [turtle]
VI  --> [sleeps]
VT  --> [beats]

          Parse tree

Parse tree for "the rapid turtle beats achilles":

[S [NP [Art the] [Adj rapid] [N turtle]]
   [VP [VT beats] [NP [PN achilles]]]]

Definite Clause Grammars: non-terminals may have arguments

S             --> NP(N), VP(N)
NP(N)         --> Art(N), N(N)
VP(N)         --> VI(N)
Art(singular) --> [a]
Art(singular) --> [the]
Art(plural)   --> [the]
N(singular)   --> [turtle]
N(plural)     --> [turtles]
VI(singular)  --> [sleeps]
VI(plural)    --> [sleep]

Number agreement

          DCGs

Non-terminals may have arguments
• Variables (start with capital), e.g. Number, Any
• Constants (start with lower case), e.g. singular, plural
• Structured terms (start with lower case and take arguments themselves), e.g. vp(V, NP)

Parsing needs to be adapted
• Using unification

          Unification in a nutshell (cf AI course)

Substitutions

E.g. {Num / singular}, {T / vp(V, NP)}

Applying a substitution
• Simultaneously replace variables by corresponding terms
• S(Num) {Num / singular} = S(singular)

          Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.
• Art(singular) and Art(Num): gives {Num / singular}
• Art(singular) and Art(plural): fails
• Art(Num1) and Art(Num2): {Num1 / Num2}
• PN(Num, accusative) and PN(singular, Case): {Num / singular, Case / accusative}
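The examples above can be reproduced with a small unification routine. A minimal sketch, deliberately simplified (no occurs check, only single-level variable dereferencing): variables are capitalised strings, constants are lowercase strings, and compound terms like vp(V, NP) are tuples:

```python
def unify(t1, t2, subst=None):
    """Most general unifier of two terms, or None on failure."""
    if subst is None:
        subst = {}
    # Dereference already-bound variables (one level, enough for flat terms).
    t1, t2 = subst.get(t1, t1), subst.get(t2, t2)
    if t1 == t2:
        return subst
    if isinstance(t1, str) and t1[0].isupper():      # t1 is a variable
        return {**subst, t1: t2}
    if isinstance(t2, str) and t2[0].isupper():      # t2 is a variable
        return {**subst, t2: t1}
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for a, b in zip(t1, t2):                     # unify argument by argument
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None                                      # constant clash: failure

print(unify(("pn", "Num", "accusative"), ("pn", "singular", "Case")))
# {'Num': 'singular', 'Case': 'accusative'}
```

Prolog provides this operation natively, which is why DCGs are usually written and executed in Prolog.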

          Parsing with DCGs

          Now require successful unification at each step

S -> NP(N), VP(N)
  -> Art(N), N(N), VP(N)        {N / singular}
  -> a N(singular) VP(singular)
  -> a turtle VP(singular)
  -> a turtle sleeps

S -> a turtle sleep: fails

          Case Marking

PN(singular, nominative) --> [he]; [she]
PN(singular, accusative) --> [him]; [her]
PN(plural, nominative)   --> [they]
PN(plural, accusative)   --> [them]

S                --> NP(Number, nominative), VP(Number)
VP(Number)       --> V(Number), NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any)  --> Det, N(Number)

He sees her. She sees him. They see her.

But not: Them see he.

          DCGs

Are strictly more expressive than CFGs. Can represent, for instance, the non-context-free language a^n b^n c^n:

• S(N)    -> A(N), B(N), C(N)
• A(0)    -> []
• B(0)    -> []
• C(0)    -> []
• A(s(N)) -> A(N), [a]
• B(s(N)) -> B(N), [b]
• C(s(N)) -> C(N), [c]
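The DCG above counts the three blocks with the same argument N, which no CFG can do. The language it generates is easy to check directly:

```python
def is_anbncn(s):
    """Recognise a^n b^n c^n (n >= 0), the language of the DCG above."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n

print(is_anbncn("aabbcc"))  # True
print(is_anbncn("aabbc"))   # False: block lengths disagree
```

By the pumping lemma for context-free languages, no CFG generates exactly this set of strings, which is the sense in which DCGs are strictly more expressive.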

          Probabilistic Models

Traditional grammar models are very rigid
• essentially a yes/no decision

Probabilistic grammars
• Define a probability model for the data
• Compute the probability of each alternative
• Choose the most likely alternative

Illustrated on
• Shannon game
• Spelling correction
• Parsing

          Illustration

Wall Street Journal Corpus: 3,000,000 words. Correct parse trees for sentences are known
• Constructed by hand
• Can be used to derive stochastic context-free grammars
• SCFGs assign probabilities to parse trees

Compute the most probable parse tree
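In an SCFG, each rule carries a probability (the probabilities for one left-hand side sum to 1), and a parse tree's probability is the product of the probabilities of the rules used in it. A minimal sketch with invented rule probabilities (lexical probabilities are omitted for brevity):

```python
# Hypothetical rule probabilities; each left-hand side sums to 1.0.
RULE_P = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("D", "N")): 0.6, ("NP", ("PN",)): 0.4,
    ("VP", ("V", "NP")): 0.7, ("VP", ("Vi",)): 0.3,
}

def tree_prob(tree):
    """Probability of a parse tree (lhs, [children]); leaves are words."""
    lhs, children = tree
    if all(isinstance(c, str) for c in children):
        return 1.0            # preterminal -> word: lexical probs omitted here
    rhs = tuple(c[0] for c in children)
    p = RULE_P.get((lhs, rhs), 0.0)
    for c in children:        # multiply in each subtree's probability
        p *= tree_prob(c)
    return p

tree = ("S", [("NP", [("PN", ["john"])]), ("VP", [("Vi", ["sleeps"])])])
print(tree_prob(tree))  # 1.0 * 0.4 * 0.3 = 0.12
```

Disambiguation then amounts to computing this product for each candidate tree of an ambiguous sentence and keeping the maximum, which the CYK algorithm does efficiently.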

          Sequences are omni-present

Therefore the techniques we will see also apply to
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

          Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions
• Regular grammars and finite state automata
  - Markov models using n-grams
  - (Hidden) Markov models
  - Conditional random fields (as an example of using undirected graphical models)
• Probabilistic context-free grammars
• Probabilistic definite clause grammars

All use principles of Part I on graphical models.


            This course

            Empiricist approach bull Focus will be on probabilistic models for learning

            of natural language No time to treat natural language in depth

            bull (though this would be quite useful and interesting)

            bull Deserves a full course by itself Covered in more depth in Logic Language and

            Learning (SS 05 prob SS 06)

            Ambiguity

            Statistical Disambiguation

            bull Define a probability model for the data

            bull Compute the probability of each alternative

            bull Choose the most likely alternative

            NLP and Statistics

            Statistical Methods deal with uncertaintyThey predict the future behaviour of a systembased on the behaviour observed in the past

            Statistical Methods require training data

            The data in Statistical NLP are the Corpora

            NLP and Statistics

            Corpus text collection for linguistic purposes

            TokensHow many words are contained in Tom Sawyer 71370

            TypesHow many different words are contained in TS 8018

            Hapax Legomenawords appearing only once

            Corpora

            The most frequent words are function words

            word freq word freq

            the 3332 in 906

            and 2972 that 877

            a 1775 he 877

            to 1725 I 783

            of 1440 his 772

            was 1161 you 686

            it 1027 Tom 679

            Word Counts

            f nf

            1 39932 12923 6644 4105 2436 1997 1728 1319 8210 9111-50 54051-100 99gt 100 102

            How many words appear f times

            Word Counts

            About half of the words occurs just onceAbout half of the text consists of the

            100 most common wordshellip

            Word Counts (Brown corpus)

            Word Counts (Brown corpus)

            word f r fr word f r frthe 3332 1 3332 turned 51 200 10200and 2972 2 5944 youlsquoll 30 300 9000a 1775 3 5235 name 21 400 8400he 877 10 8770 comes 16 500 8000but 410 20 8400 group 13 600 7800be 294 30 8820 lead 11 700 7700there 222 40 8880 friends 10 800 8000one 172 50 8600 begin 9 900 8100about 158 60 9480 family 8 1000 8000more 138 70 9660 brushed 4 2000 8000never 124 80 9920 sins 2 3000 6000Oh 116 90 10440 Could 2 4000 8000two 104 100 10400 Applausive 1 8000 8000

            Zipflsquos Law f~1r (fr = const)

            Zipflsquos Law

            Minimize effort

            Language and sequences

            Natural language processingbull Is concerned with the analysis of

            sequences of words sentencesbullConstruction of language models

            Two types of modelsbullNon-probabilisticbull Probabilistic

            Human Language is highly ambiguous at all levels

            bull acoustic levelrecognize speech vs wreck a nice beach

            bull morphological levelsaw to see (past) saw (noun) to saw (present inf)

            bull syntactic levelI saw the man on the hill with a telescope

            bull semantic levelOne book has to be read by every student

            Key NLP Problem Ambiguity

            Language Model

            A formal model about language Two types

            bull Non-probabilistic Allows one to compute whether a certain sequence

            (sentence or part thereof) is possible Often grammar based

            bull Probabilistic Allows one to compute the probability of a certain

            sequence Often extends grammars with probabilities

            Example of bad language model

            A bad language model

            A bad language model

            A good language model

            Non-Probabilisticbull ldquoI swear to tell the truthrdquo is possiblebull ldquoI swerve to smell de souprdquo is impossible

            Probabilisticbull P(I swear to tell the truth) ~ 0001bull P(I swerve to smell de soup) ~ 0

            Why language models

            Consider a Shannon Gamebull Predicting the next word in the sequence

            Statistical natural language hellip The cat is thrown out of the hellip The large green hellip Sue swallowed the large green hellip hellip

            Model at the sentence level

            Applications

            Spelling correction Mobile phone texting Speech recognition Handwriting recognition Disabled users hellip

            Spelling errors

            They are leaving in about fifteen minuets to go to her house

            The study was conducted mainly be John Black Hopefully all with continue smoothly in my absence Can they lave him my messages I need to notified the bank ofhellip He is trying to fine out

            Handwriting recognition

            Assume a note is given to a bank teller which the teller reads as I have a gub (cf Woody Allen)

            NLP to the rescue hellipbull gub is not a wordbull gun gum Gus and gull are words but gun has a

            higher probability in the context of a bank

            For Spell Checkers

            Collect list of commonly substituted wordsbull piecepeace whetherweather theirthere

            ExampleldquoOn Tuesday the whether helliprsquorsquoldquoOn Tuesday the weather helliprdquo

            Another dimension in language models

            Do we mainly want to infer (probabilities) of legal sentences sequences bull So far

            Or do we want to infer properties of these sentences bull Eg parse tree part-of-speech-taggingbullNeeded for understanding NL

            Letrsquos look at some tasks

            Sequence Tagging

            Part-of-speech taggingbull He drives with his bikebull N V PR PN N noun verb preposition pronoun noun

            Text extractionbull The job is that of a programmerbull X X X X X X JobTypebull The seminar is taking place from 1500 to 1600bull X X X X X X Start End

            Sequence Tagging

            Predicting the secondary structure of proteins mRNA hellipbull X = AFARLMMAhellipbull Y = hehestststhesthe hellip

            Parsing

            Given a sentence find its parse tree Important step in understanding NL

            Parsing

            In bioinformatics allows to predict (elements of) structure from sequence

            Language models based on Grammars

            Grammar Typesbull Regular grammars and Finite State Automatabull Context-Free Grammarsbull Definite Clause Grammars

            A particular type of Unification Based Grammar (Prolog)

            Distinguish lexicon from grammarbull Lexicon (dictionary) contains information about

            words eg word - possible tags (and possibly additional information) flies - V(erb) - N(oun)

            bull Grammar encode rules

            Grammars and parsing Syntactic level best understood and formalized Derivation of grammatical structure parsing

            (more than just recognition) Result of parsing mostly parse tree

            showing the constituents of a sentence eg verb or noun phrases

            Syntax usually specified in terms of a grammar consisting of grammar rules

            Regular Grammars and Finite State Automata

            Lexical information - which words are bull Det(erminer)bull N(oun)bull Vi (intransitive verb) - no

            argumentbull Pn (pronoun) bull Vt (transitive verb) - takes an

            argumentbull Adj (adjective)

            Now acceptbull The cat sleptbull Det N Vi

            As regular grammarbull S -gt [Det] S1 [ ] terminal bull S1 -gt [N] S2bull S2 -gt [Vi]

            Lexicon bull The - Detbull Cat - Nbull Slept - Vi

            bull hellip

            Finite State Automaton

            Sentences

            bull John smiles - Pn Vibull The cat disappeared - Det N Vibull These new shoes hurt - Det Adj N Vibull John liked the old cat PN Vt Det Adj N

            Phrase structure

            S

            NP

            D N

            VP

            NPV

            D N

            PP

            P NP

            D N

            the dog chased a cat into the garden

            Notation

            S sentence D or Det Determiner (eg articles) N noun V verb P preposition NP noun phrase VP verb phrase PP prepositional phrase

            Context Free GrammarS -gt NP VPNP -gt D NVP -gt V NPVP -gt V NP PPPP -gt P NPD -gt [the]D -gt [a]N -gt [dog]N -gt [cat]N -gt [garden]V -gt [chased]V -gt [saw]P -gt [into]

            Terminals ~ Lexicon

            Phrase structure

            Formalism of context-free grammarsbull Nonterminal symbols S NP VP bull Terminal symbols dog cat saw the

            Recursionbull bdquoThe girl thought the dog chased the catldquo

            VP -gt V SN -gt [girl]V -gt [thought]

            Top-down parsing

            S -gt NP VP S -gt Det N VP S -gt The N VP S -gt The dog VP S -gt The dog V NP S -gt The dog chased NP S -gt The dog chased Det N S-gt The dog chased the N S-gt The dog chased the cat

            Context-free grammarSS --gt --gt NPNPVPVP

            NPNP --gt PN --gt PN Proper nounProper noun

            NPNP --gt Art Adj N--gt Art Adj N

            NPNP --gt ArtN--gt ArtN

            VPVP --gt VI --gt VI intransitive verbintransitive verb

            VPVP --gt VT --gt VT NPNP transitive verbtransitive verb

            ArtArt --gt [the]--gt [the]

            AdjAdj --gt [lazy]--gt [lazy]

            AdjAdj --gt [rapid]--gt [rapid]

            PNPN --gt [achilles]--gt [achilles]

            NN --gt [turtle]--gt [turtle]

            VIVI --gt [sleeps]--gt [sleeps]

            VTVT --gt [beats]--gt [beats]

            Parse tree

            SS

            NPNP VPVP

            ArtArt AdjAdj NN VtVt NPNP

            PNPN

            achillesachillesbeatsbeatsturtleturtlerapidrapidthethe

            Definite Clause GrammarsNon-terminals may have arguments

            SS --gt --gt NPNP((NN))VPVP((NN))

            NP(NP(NN)) --gt Art(--gt Art(NN)N()N(NN))

            VP(VP(NN)) --gt VI(--gt VI(NN))

            Art(Art(singularsingular)) --gt [a]--gt [a]

            Art(Art(singularsingular)) --gt [the]--gt [the]

            Art(Art(pluralplural)) --gt [the]--gt [the]

            N(N(singularsingular)) --gt [turtle]--gt [turtle]

            N(N(pluralplural)) --gt [turtles]--gt [turtles]

            VI(VI(singularsingular)) --gt [sleeps]--gt [sleeps]

            VI(VI(pluralplural)) --gt [sleep]--gt [sleep]

            Number Agreement

            DCGs

            Non-terminals may have argumentsbull Variables (start with capital)

            Eg Number Any

            bull Constants (start with lower case) Eg singular plural

            bull Structured terms (start with lower case and take arguments themselves) Eg vp(VNP)

            Parsing needs to be adapted bull Using unification

            Unification in a nutshell (cf AI course)

            Substitutions

            Eg Num singular T vp(VNP)

            Applying substitution bull Simultaneously replace variables by

            corresponding termsbull S(Num) Num singular = S(singular)

            Unification

            Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

            Gives Num singular

            bull Art(singular) and Art(plural) Fails

            bull Art(Num1) and Art(Num2) Num1 Num2

            bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

            Parsing with DCGs

            Now require successful unification at each step

            S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

            S-gt a turtle sleep fails

            Case Marking

            PN(singularnominative)PN(singularnominative) --gt --gt [he][she][he][she]

            PN(singularaccusative)PN(singularaccusative) --gt --gt [him][her][him][her]

            PN(pluralnominative)PN(pluralnominative) --gt --gt [they][they]

            PN(pluralaccusative)PN(pluralaccusative) --gt --gt [them][them]

            S --gt NP(Numbernominative) NP(Number)S --gt NP(Numbernominative) NP(Number)

            VP(Number) --gt V(Number) VP(Anyaccusative)VP(Number) --gt V(Number) VP(Anyaccusative)

            VP(NumberCase) --gt PN(NumberCase)VP(NumberCase) --gt PN(NumberCase)

            VP(NumberAny) --gt Det N(Number)VP(NumberAny) --gt Det N(Number)

            He sees her She sees him They see her

            But not Them see he

            DCGs

            Are strictly more expressive than CFGs Can represent for instance

            bull S(N) -gt A(N) B(N) C(N)bull A(0) -gt [] bull B(0) -gt []bull C(0) -gt []bull A(s(N)) -gt A(N) [A]bull B(s(N)) -gt B(N) [B]bull C(s(N)) -gt C(N) [C]

            Probabilistic Models

            Traditional grammar models are very rigid bull essentially a yes no decision

            Probabilistic grammarsbull Define a probability models for the databull Compute the probability of each alternativebull Choose the most likely alternative

            Ilustrate on bull Shannon Gamebull Spelling correctionbull Parsing

            Illustration

            Wall Street Journal Corpus 3 000 000 words Correct parse tree for sentences known

            bull Constructed by handbull Can be used to derive stochastic context free

            grammarsbull SCFG assign probability to parse trees

            Compute the most probable parse tree

            Sequences are omni-present

            Therefore the techniques we will see also apply tobull Bioinformatics

            DNA proteins mRNA hellip can all be represented as strings

            bullRobotics Sequences of actions states hellip

            bullhellip

            Rest of the Course

            Limitations traditional grammar models motivate probabilistic extensions bull Regular grammars and Finite State Automata

            All use principles of Part I on Graphical Models Markov Models using n-gramms (Hidden) Markov Models Conditional Random Fields

            bull As an example of using undirected graphical models

            bull Probabilistic Context Free Grammarsbull Probabilistic Definite Clause Grammars

            • Advanced Artificial Intelligence
            • Topic
            • Contents
            • Rationalism versus Empiricism
            • Slide 5
            • This course
            • Ambiguity
            • NLP and Statistics
            • Slide 9
            • Corpora
            • Word Counts
            • Slide 12
            • Word Counts (Brown corpus)
            • Slide 14
            • Zipflsquos Law
            • Language and sequences
            • Key NLP Problem Ambiguity
            • Language Model
            • Example of bad language model
            • A bad language model
            • Slide 22
            • A good language model
            • Why language models
            • Applications
            • Spelling errors
            • Handwriting recognition
            • For Spell Checkers
            • Another dimension in language models
            • Sequence Tagging
            • Slide 31
            • Parsing
            • Slide 33
            • Language models based on Grammars
            • Grammars and parsing
            • Regular Grammars and Finite State Automata
            • Finite State Automaton
            • Phrase structure
            • Notation
            • Context Free Grammar
            • Slide 41
            • Top-down parsing
            • Context-free grammar
            • Parse tree
            • Definite Clause Grammars Non-terminals may have arguments
            • DCGs
            • Unification in a nutshell (cf AI course)
            • Unification
            • Parsing with DCGs
            • Case Marking
            • Slide 51
            • Probabilistic Models
            • Illustration
            • PowerPoint Presentation
            • Sequences are omni-present
            • Rest of the Course

              Ambiguity

              Statistical Disambiguation

              bull Define a probability model for the data

              bull Compute the probability of each alternative

              bull Choose the most likely alternative

              NLP and Statistics

              Statistical Methods deal with uncertaintyThey predict the future behaviour of a systembased on the behaviour observed in the past

              Statistical Methods require training data

              The data in Statistical NLP are the Corpora

              NLP and Statistics

              Corpus text collection for linguistic purposes

              TokensHow many words are contained in Tom Sawyer 71370

              TypesHow many different words are contained in TS 8018

              Hapax Legomenawords appearing only once

              Corpora

              The most frequent words are function words

              word freq word freq

              the 3332 in 906

              and 2972 that 877

              a 1775 he 877

              to 1725 I 783

              of 1440 his 772

              was 1161 you 686

              it 1027 Tom 679

              Word Counts

              f nf

              1 39932 12923 6644 4105 2436 1997 1728 1319 8210 9111-50 54051-100 99gt 100 102

              How many words appear f times

              Word Counts

              About half of the words occurs just onceAbout half of the text consists of the

              100 most common wordshellip

              Word Counts (Brown corpus)

              Word Counts (Brown corpus)

              word f r fr word f r frthe 3332 1 3332 turned 51 200 10200and 2972 2 5944 youlsquoll 30 300 9000a 1775 3 5235 name 21 400 8400he 877 10 8770 comes 16 500 8000but 410 20 8400 group 13 600 7800be 294 30 8820 lead 11 700 7700there 222 40 8880 friends 10 800 8000one 172 50 8600 begin 9 900 8100about 158 60 9480 family 8 1000 8000more 138 70 9660 brushed 4 2000 8000never 124 80 9920 sins 2 3000 6000Oh 116 90 10440 Could 2 4000 8000two 104 100 10400 Applausive 1 8000 8000

              Zipflsquos Law f~1r (fr = const)

              Zipflsquos Law

              Minimize effort

              Language and sequences

              Natural language processingbull Is concerned with the analysis of

              sequences of words sentencesbullConstruction of language models

              Two types of modelsbullNon-probabilisticbull Probabilistic

              Human Language is highly ambiguous at all levels

• acoustic level: "recognize speech" vs. "wreck a nice beach"

• morphological level: saw: to see (past), saw (noun), to saw (present, inf.)

• syntactic level: "I saw the man on the hill with a telescope"

• semantic level: "One book has to be read by every student"

              Key NLP Problem Ambiguity

              Language Model

A formal model of language. Two types:

• Non-probabilistic: allows one to compute whether a certain sequence (a sentence or part thereof) is possible; often grammar-based.

• Probabilistic: allows one to compute the probability of a certain sequence; often extends grammars with probabilities.

Example of a bad language model

              A bad language model

              A bad language model

              A good language model

Non-probabilistic:
• "I swear to tell the truth" is possible
• "I swerve to smell de soup" is impossible

Probabilistic:
• P(I swear to tell the truth) ≈ 0.0001
• P(I swerve to smell de soup) ≈ 0

              Why language models

Consider a Shannon game: predict the next word in a sequence.
• Statistical natural language …
• The cat is thrown out of the …
• The large green …
• Sue swallowed the large green …
• …

Model at the sentence level.
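The Shannon game can be played mechanically with bigram counts. A Python sketch with an invented mini-corpus and a hypothetical `predict_next` helper that proposes the most frequent follower of a word:

```python
from collections import Counter

# Invented mini-corpus; any real corpus slots in here.
corpus = "the cat is thrown out of the house the cat is hungry".split()
bigram_counts = Counter(zip(corpus, corpus[1:]))

def predict_next(word):
    """Shannon game: most likely next word under a bigram model."""
    followers = {b: c for (a, b), c in bigram_counts.items() if a == word}
    return max(followers, key=followers.get) if followers else None

print(predict_next("the"))   # the most frequent follower of "the"
```

Real language models smooth these counts and condition on longer histories, but the prediction step is the same argmax.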

              Applications

• spelling correction
• mobile phone texting
• speech recognition
• handwriting recognition
• disabled users
• …

              Spelling errors

              They are leaving in about fifteen minuets to go to her house

The study was conducted mainly be John Black.
Hopefully, all with continue smoothly in my absence.
Can they lave him my messages?
I need to notified the bank of …
He is trying to fine out.

              Handwriting recognition

Assume a note is given to a bank teller, which the teller reads as "I have a gub." (cf. Woody Allen)

NLP to the rescue:
• "gub" is not a word
• gun, gum, Gus and gull are words, but "gun" has a higher probability in the context of a bank
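The "gub" reasoning can be sketched as: generate candidates within edit distance 1, keep those that are in the lexicon, and rank them by frequency. A Python sketch in the spirit of Norvig's well-known spelling corrector, with hypothetical unigram counts standing in for a bank-context model:

```python
import string

# Hypothetical unigram counts; in a bank context "gun" dominates.
lexicon = {"gun": 50, "gum": 20, "gus": 5, "gull": 5}

def edits1(word):
    """All strings at edit distance 1: deletes, transposes, substitutions, inserts."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    substitutions = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + transposes + substitutions + inserts)

def correct(word):
    """Return the most frequent in-lexicon word within edit distance 1."""
    candidates = edits1(word.lower()) & set(lexicon)
    return max(candidates, key=lexicon.get) if candidates else word

print(correct("gub"))
```

Note that "gull" is at edit distance 2 from "gub" and so falls outside this one-edit sketch; a full corrector would also search distance-2 candidates.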

              For Spell Checkers

Collect a list of commonly substituted words:
• piece/peace, whether/weather, their/there, …

Example:
• "On Tuesday, the whether …"
• "On Tuesday, the weather …"

              Another dimension in language models

Do we mainly want to infer (probabilities of) legal sentences / sequences?
• so far

Or do we want to infer properties of these sentences?
• e.g. parse tree, part-of-speech tagging
• needed for understanding NL

Let's look at some tasks.

              Sequence Tagging

Part-of-speech tagging:
• He drives with his bike
• N V PR PN N (noun, verb, preposition, pronoun, noun)

Text extraction:
• The job is that of a programmer
  X   X   X  X   X  X JobType
• The seminar is taking place from 15:00 to 16:00
  X   X       X  X      X     X    Start    End

              Sequence Tagging

Predicting the secondary structure of proteins, mRNA, …
• X = AFARLMMA…
• Y = hehestststhesthe…

              Parsing

Given a sentence, find its parse tree. An important step in understanding NL.

              Parsing

In bioinformatics, parsing allows one to predict (elements of) structure from sequence.

              Language models based on Grammars

Grammar types:
• regular grammars and finite state automata
• context-free grammars
• definite clause grammars: a particular type of unification-based grammar (Prolog)

Distinguish the lexicon from the grammar:
• lexicon (dictionary): contains information about words, e.g. word → possible tags (and possibly additional information): flies → V(erb), N(oun)
• grammar: encodes the rules

Grammars and parsing:
• the syntactic level is the best understood and formalized
• derivation of grammatical structure: parsing (more than just recognition)
• the result of parsing is mostly a parse tree, showing the constituents of a sentence, e.g. verb or noun phrases
• syntax is usually specified in terms of a grammar consisting of grammar rules

              Regular Grammars and Finite State Automata

Lexical information: which words are
• Det(erminer)
• N(oun)
• Pn (pronoun)
• Adj (adjective)
• Vi (intransitive verb): takes no argument
• Vt (transitive verb): takes an argument

Now accept:
• "The cat slept"
• Det N Vi

As a regular grammar:
• S -> [Det], S1      ([ ] = terminal)
• S1 -> [N], S2
• S2 -> [Vi]

Lexicon:
• the: Det
• cat: N
• slept: Vi
• …

              Finite State Automaton

              Sentences

• John smiles: Pn Vi
• The cat disappeared: Det N Vi
• These new shoes hurt: Det Adj N Vi
• John liked the old cat: Pn Vt Det Adj N
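Tag sequences like these can be recognized by a small finite state automaton. A Python sketch; the state names and the transition table are assumptions chosen to cover exactly the example sentences above:

```python
# Hypothetical states s0..s4 plus a final state "end"; transitions cover
# the tag sequences Pn Vi, Det N Vi, Det Adj N Vi, Pn Vt Det Adj N.
transitions = {
    ("s0", "Det"): "s1", ("s0", "Pn"): "s2",
    ("s1", "Adj"): "s1", ("s1", "N"): "s2",
    ("s2", "Vi"): "end", ("s2", "Vt"): "s3",
    ("s3", "Det"): "s4", ("s3", "Pn"): "end",
    ("s4", "Adj"): "s4", ("s4", "N"): "end",
}

def accepts(tags):
    """Run the automaton over a tag sequence; accept iff it ends in "end"."""
    state = "s0"
    for tag in tags:
        state = transitions.get((state, tag))
        if state is None:            # no transition defined: reject
            return False
    return state == "end"
```

The self-loop on ("s1", "Adj") is what lets the automaton accept any number of adjectives before the noun, mirroring the regular grammar's recursion.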

              Phrase structure

Phrase-structure tree for "the dog chased a cat into the garden":

[S [NP [D the] [N dog]]
   [VP [V chased]
       [NP [D a] [N cat]]
       [PP [P into] [NP [D the] [N garden]]]]]

              Notation

S: sentence
D or Det: determiner (e.g. articles)
N: noun
V: verb
P: preposition
NP: noun phrase
VP: verb phrase
PP: prepositional phrase

Context-Free Grammar:
S -> NP VP
NP -> D N
VP -> V NP
VP -> V NP PP
PP -> P NP
D -> [the]
D -> [a]
N -> [dog]
N -> [cat]
N -> [garden]
V -> [chased]
V -> [saw]
P -> [into]

              Terminals ~ Lexicon

              Phrase structure

Formalism of context-free grammars:
• nonterminal symbols: S, NP, VP, …
• terminal symbols: dog, cat, saw, the, …

Recursion, e.g. "The girl thought the dog chased the cat":
VP -> V S
N -> [girl]
V -> [thought]

              Top-down parsing

S -> NP VP
  -> Det N VP
  -> The N VP
  -> The dog VP
  -> The dog V NP
  -> The dog chased NP
  -> The dog chased Det N
  -> The dog chased the N
  -> The dog chased the cat
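This top-down, left-to-right derivation can be mechanized as a recursive, backtracking recognizer. A Python sketch for the toy CFG above (the grammar dictionary and function names are illustrative choices):

```python
# The toy CFG as a dictionary: nonterminal -> list of alternatives.
# "V NP PP" is listed before "V NP" so the longer alternative is tried first.
grammar = {
    "S":  [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP", "PP"], ["V", "NP"]],
    "PP": [["P", "NP"]],
    "D":  [["the"], ["a"]],
    "N":  [["dog"], ["cat"], ["garden"]],
    "V":  [["chased"], ["saw"]],
    "P":  [["into"]],
}

def derive(symbols, words, i):
    """Top-down, left to right: yield every position j such that
    words[i:j] can be derived from the symbol sequence."""
    if not symbols:
        yield i
        return
    first, rest = symbols[0], symbols[1:]
    if first in grammar:                           # nonterminal: try each rule
        for rhs in grammar[first]:
            for j in derive(rhs, words, i):
                yield from derive(rest, words, j)
    elif i < len(words) and words[i] == first:     # terminal must match input
        yield from derive(rest, words, i + 1)

def recognize(sentence):
    words = sentence.split()
    return any(j == len(words) for j in derive(["S"], words, 0))
```

Backtracking happens for free through the generators: if one alternative fails to consume the input, the next one is tried, exactly as in the hand derivation.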

Context-free grammar:

S --> NP, VP
NP --> PN            (proper noun)
NP --> Art, Adj, N
NP --> Art, N
VP --> VI            (intransitive verb)
VP --> VT, NP        (transitive verb)
Art --> [the]
Adj --> [lazy]
Adj --> [rapid]
PN --> [achilles]
N --> [turtle]
VI --> [sleeps]
VT --> [beats]

              Parse tree

Parse tree for "the rapid turtle beats achilles":

[S [NP [Art the] [Adj rapid] [N turtle]]
   [VP [VT beats] [NP [PN achilles]]]]

Definite Clause Grammars: non-terminals may have arguments.

S --> NP(N), VP(N)
NP(N) --> Art(N), N(N)
VP(N) --> VI(N)

Art(singular) --> [a]
Art(singular) --> [the]
Art(plural) --> [the]
N(singular) --> [turtle]
N(plural) --> [turtles]
VI(singular) --> [sleeps]
VI(plural) --> [sleep]

Number agreement

              DCGs

Non-terminals may have arguments:
• variables (start with a capital), e.g. Number, Any
• constants (start with lower case), e.g. singular, plural
• structured terms (start with lower case and take arguments themselves), e.g. vp(V,NP)

Parsing needs to be adapted:
• using unification

              Unification in a nutshell (cf AI course)

Substitutions, e.g. {Num/singular}, {T/vp(V,NP)}

Applying a substitution:
• simultaneously replace the variables by the corresponding terms
• S(Num) {Num/singular} = S(singular)

              Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.:
• Art(singular) and Art(Num): gives {Num/singular}
• Art(singular) and Art(plural): fails
• Art(Num1) and Art(Num2): {Num1/Num2}
• PN(Num, accusative) and PN(singular, Case): {Num/singular, Case/accusative}
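The unification step can be sketched for the flat case. A Python sketch that handles variables and constants only; structured terms such as vp(V,NP) are deliberately left out of this toy version:

```python
def unify(args1, args2):
    """Most general unifier for two flat argument tuples.
    Names starting with an upper-case letter are variables; everything
    else is a constant. Returns a substitution dict, or None on failure."""
    if len(args1) != len(args2):
        return None
    subst = {}
    for a, b in zip(args1, args2):
        a, b = subst.get(a, a), subst.get(b, b)   # follow earlier bindings
        if a == b:
            continue
        if a[0].isupper():        # a is a variable: bind it
            subst[a] = b
        elif b[0].isupper():      # b is a variable: bind it
            subst[b] = a
        else:                     # two different constants: clash
            return None
    return subst
```

The four slide examples come out as expected: a binding, a failure, a variable-to-variable binding, and a two-argument substitution.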

              Parsing with DCGs

              Now require successful unification at each step

S -> NP(N) VP(N)
  -> Art(N) N(N) VP(N)        {N/singular}
  -> a N(singular) VP(singular)
  -> a turtle VP(singular)
  -> a turtle sleeps

"S -> a turtle sleep" fails.

              Case Marking

PN(singular, nominative) --> [he]; [she]
PN(singular, accusative) --> [him]; [her]
PN(plural, nominative) --> [they]
PN(plural, accusative) --> [them]

S --> NP(Number, nominative), VP(Number)
VP(Number) --> V(Number), NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any) --> Det, N(Number)

Accepts: "He sees her", "She sees him", "They see her"

But not: "Them see he"

              DCGs

Are strictly more expressive than CFGs. They can represent, for instance, the language a^n b^n c^n:

• S(N) -> A(N), B(N), C(N)
• A(0) -> []
• B(0) -> []
• C(0) -> []
• A(s(N)) -> A(N), [a]
• B(s(N)) -> B(N), [b]
• C(s(N)) -> C(N), [c]
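The counting that the DCG does with its argument N (0, s(0), s(s(0)), …) can be mirrored procedurally. A Python sketch of a recognizer for a^n b^n c^n, a language no context-free grammar can capture:

```python
def accepts_anbncn(s):
    """Recognize a^n b^n c^n (n >= 0): the role of the DCG's counter
    argument N is played here by a single integer n."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n
```

The single shared n enforces exactly the agreement that the DCG expresses by threading the same argument N through A, B and C.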

              Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• define a probability model for the data
• compute the probability of each alternative
• choose the most likely alternative

Illustrated on:
• the Shannon game
• spelling correction
• parsing

              Illustration

Wall Street Journal corpus: 3,000,000 words; the correct parse tree for each sentence is known.
• constructed by hand
• can be used to derive stochastic context-free grammars
• SCFGs assign a probability to each parse tree

Compute the most probable parse tree.
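Under an SCFG, the probability of a parse tree is the product of the probabilities of the rules used in its derivation. A Python sketch with hypothetical rule probabilities (for each left-hand side the alternatives sum to 1); real probabilities would be estimated from treebank counts:

```python
from math import prod

# Hypothetical rule probabilities for a tiny SCFG.
rule_prob = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("D", "N")): 1.0,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("V", "NP", "PP")): 0.3,
    ("D", ("the",)): 0.6,
    ("D", ("a",)): 0.4,
    ("N", ("dog",)): 0.5,
    ("N", ("cat",)): 0.5,
    ("V", ("chased",)): 1.0,
}

def tree_prob(tree):
    """P(tree) = product over the rules used in the derivation.
    A tree is a nested tuple (label, child, ...); leaves are words."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return rule_prob[(label, (children[0],))]        # lexical rule
    rhs = tuple(child[0] for child in children)
    return rule_prob[(label, rhs)] * prod(tree_prob(c) for c in children)

tree = ("S",
        ("NP", ("D", "the"), ("N", "dog")),
        ("VP", ("V", "chased"), ("NP", ("D", "a"), ("N", "cat"))))
print(tree_prob(tree))
```

Disambiguation then means enumerating (or dynamic-programming over) the candidate trees for a sentence and picking the one with the highest product.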

              Sequences are omni-present

Therefore, the techniques we will see also apply to:
• bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• robotics: sequences of actions, states, …
• …

              Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
• regular grammars and finite state automata →
  Markov models using n-grams, (hidden) Markov models, conditional random fields (as an example of using undirected graphical models)
• probabilistic context-free grammars
• probabilistic definite clause grammars

All use the principles of Part I on graphical models.



                TypesHow many different words are contained in TS 8018

                Hapax Legomenawords appearing only once

                Corpora

                The most frequent words are function words

                word freq word freq

                the 3332 in 906

                and 2972 that 877

                a 1775 he 877

                to 1725 I 783

                of 1440 his 772

                was 1161 you 686

                it 1027 Tom 679

                Word Counts

                f nf

                1 39932 12923 6644 4105 2436 1997 1728 1319 8210 9111-50 54051-100 99gt 100 102

                How many words appear f times

                Word Counts

                About half of the words occurs just onceAbout half of the text consists of the

                100 most common wordshellip

                Word Counts (Brown corpus)

                Word Counts (Brown corpus)

                word f r fr word f r frthe 3332 1 3332 turned 51 200 10200and 2972 2 5944 youlsquoll 30 300 9000a 1775 3 5235 name 21 400 8400he 877 10 8770 comes 16 500 8000but 410 20 8400 group 13 600 7800be 294 30 8820 lead 11 700 7700there 222 40 8880 friends 10 800 8000one 172 50 8600 begin 9 900 8100about 158 60 9480 family 8 1000 8000more 138 70 9660 brushed 4 2000 8000never 124 80 9920 sins 2 3000 6000Oh 116 90 10440 Could 2 4000 8000two 104 100 10400 Applausive 1 8000 8000

                Zipflsquos Law f~1r (fr = const)

                Zipflsquos Law

                Minimize effort

                Language and sequences

                Natural language processingbull Is concerned with the analysis of

                sequences of words sentencesbullConstruction of language models

                Two types of modelsbullNon-probabilisticbull Probabilistic

                Human Language is highly ambiguous at all levels

                bull acoustic levelrecognize speech vs wreck a nice beach

                bull morphological levelsaw to see (past) saw (noun) to saw (present inf)

                bull syntactic levelI saw the man on the hill with a telescope

                bull semantic levelOne book has to be read by every student

                Key NLP Problem Ambiguity

                Language Model

                A formal model about language Two types

                bull Non-probabilistic Allows one to compute whether a certain sequence

                (sentence or part thereof) is possible Often grammar based

                bull Probabilistic Allows one to compute the probability of a certain

                sequence Often extends grammars with probabilities

                Example of bad language model

                A bad language model

                A bad language model

                A good language model

                Non-Probabilisticbull ldquoI swear to tell the truthrdquo is possiblebull ldquoI swerve to smell de souprdquo is impossible

                Probabilisticbull P(I swear to tell the truth) ~ 0001bull P(I swerve to smell de soup) ~ 0

                Why language models

                Consider a Shannon Gamebull Predicting the next word in the sequence

                Statistical natural language hellip The cat is thrown out of the hellip The large green hellip Sue swallowed the large green hellip hellip

                Model at the sentence level

                Applications

                Spelling correction Mobile phone texting Speech recognition Handwriting recognition Disabled users hellip

                Spelling errors

                They are leaving in about fifteen minuets to go to her house

                The study was conducted mainly be John Black Hopefully all with continue smoothly in my absence Can they lave him my messages I need to notified the bank ofhellip He is trying to fine out

                Handwriting recognition

                Assume a note is given to a bank teller which the teller reads as I have a gub (cf Woody Allen)

                NLP to the rescue hellipbull gub is not a wordbull gun gum Gus and gull are words but gun has a

                higher probability in the context of a bank

                For Spell Checkers

                Collect list of commonly substituted wordsbull piecepeace whetherweather theirthere

                ExampleldquoOn Tuesday the whether helliprsquorsquoldquoOn Tuesday the weather helliprdquo

                Another dimension in language models

                Do we mainly want to infer (probabilities) of legal sentences sequences bull So far

                Or do we want to infer properties of these sentences bull Eg parse tree part-of-speech-taggingbullNeeded for understanding NL

                Letrsquos look at some tasks

                Sequence Tagging

                Part-of-speech taggingbull He drives with his bikebull N V PR PN N noun verb preposition pronoun noun

                Text extractionbull The job is that of a programmerbull X X X X X X JobTypebull The seminar is taking place from 1500 to 1600bull X X X X X X Start End

                Sequence Tagging

                Predicting the secondary structure of proteins mRNA hellipbull X = AFARLMMAhellipbull Y = hehestststhesthe hellip

                Parsing

                Given a sentence find its parse tree Important step in understanding NL

                Parsing

                In bioinformatics allows to predict (elements of) structure from sequence

                Language models based on Grammars

                Grammar Typesbull Regular grammars and Finite State Automatabull Context-Free Grammarsbull Definite Clause Grammars

                A particular type of Unification Based Grammar (Prolog)

                Distinguish lexicon from grammarbull Lexicon (dictionary) contains information about

                words eg word - possible tags (and possibly additional information) flies - V(erb) - N(oun)

                bull Grammar encode rules

                Grammars and parsing Syntactic level best understood and formalized Derivation of grammatical structure parsing

                (more than just recognition) Result of parsing mostly parse tree

                showing the constituents of a sentence eg verb or noun phrases

                Syntax usually specified in terms of a grammar consisting of grammar rules

                Regular Grammars and Finite State Automata

                Lexical information - which words are bull Det(erminer)bull N(oun)bull Vi (intransitive verb) - no

                argumentbull Pn (pronoun) bull Vt (transitive verb) - takes an

                argumentbull Adj (adjective)

                Now acceptbull The cat sleptbull Det N Vi

                As regular grammarbull S -gt [Det] S1 [ ] terminal bull S1 -gt [N] S2bull S2 -gt [Vi]

                Lexicon bull The - Detbull Cat - Nbull Slept - Vi

                bull hellip

                Finite State Automaton

                Sentences

                bull John smiles - Pn Vibull The cat disappeared - Det N Vibull These new shoes hurt - Det Adj N Vibull John liked the old cat PN Vt Det Adj N

                Phrase structure

                S

                NP

                D N

                VP

                NPV

                D N

                PP

                P NP

                D N

                the dog chased a cat into the garden

                Notation

                S sentence D or Det Determiner (eg articles) N noun V verb P preposition NP noun phrase VP verb phrase PP prepositional phrase

                Context Free GrammarS -gt NP VPNP -gt D NVP -gt V NPVP -gt V NP PPPP -gt P NPD -gt [the]D -gt [a]N -gt [dog]N -gt [cat]N -gt [garden]V -gt [chased]V -gt [saw]P -gt [into]

                Terminals ~ Lexicon

                Phrase structure

                Formalism of context-free grammarsbull Nonterminal symbols S NP VP bull Terminal symbols dog cat saw the

                Recursionbull bdquoThe girl thought the dog chased the catldquo

                VP -gt V SN -gt [girl]V -gt [thought]

                Top-down parsing

                S -gt NP VP S -gt Det N VP S -gt The N VP S -gt The dog VP S -gt The dog V NP S -gt The dog chased NP S -gt The dog chased Det N S-gt The dog chased the N S-gt The dog chased the cat

                Context-free grammarSS --gt --gt NPNPVPVP

                NPNP --gt PN --gt PN Proper nounProper noun

                NPNP --gt Art Adj N--gt Art Adj N

                NPNP --gt ArtN--gt ArtN

                VPVP --gt VI --gt VI intransitive verbintransitive verb

                VPVP --gt VT --gt VT NPNP transitive verbtransitive verb

                ArtArt --gt [the]--gt [the]

                AdjAdj --gt [lazy]--gt [lazy]

                AdjAdj --gt [rapid]--gt [rapid]

                PNPN --gt [achilles]--gt [achilles]

                NN --gt [turtle]--gt [turtle]

                VIVI --gt [sleeps]--gt [sleeps]

                VTVT --gt [beats]--gt [beats]

                Parse tree

                SS

                NPNP VPVP

                ArtArt AdjAdj NN VtVt NPNP

                PNPN

                achillesachillesbeatsbeatsturtleturtlerapidrapidthethe

                Definite Clause GrammarsNon-terminals may have arguments

                SS --gt --gt NPNP((NN))VPVP((NN))

                NP(NP(NN)) --gt Art(--gt Art(NN)N()N(NN))

                VP(VP(NN)) --gt VI(--gt VI(NN))

                Art(Art(singularsingular)) --gt [a]--gt [a]

                Art(Art(singularsingular)) --gt [the]--gt [the]

                Art(Art(pluralplural)) --gt [the]--gt [the]

                N(N(singularsingular)) --gt [turtle]--gt [turtle]

                N(N(pluralplural)) --gt [turtles]--gt [turtles]

                VI(VI(singularsingular)) --gt [sleeps]--gt [sleeps]

                VI(VI(pluralplural)) --gt [sleep]--gt [sleep]

                Number Agreement

                DCGs

                Non-terminals may have argumentsbull Variables (start with capital)

                Eg Number Any

                bull Constants (start with lower case) Eg singular plural

                bull Structured terms (start with lower case and take arguments themselves) Eg vp(VNP)

                Parsing needs to be adapted bull Using unification

                Unification in a nutshell (cf AI course)

                Substitutions

                Eg Num singular T vp(VNP)

                Applying substitution bull Simultaneously replace variables by

                corresponding termsbull S(Num) Num singular = S(singular)

                Unification

                Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

                Gives Num singular

                bull Art(singular) and Art(plural) Fails

                bull Art(Num1) and Art(Num2) Num1 Num2

                bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

                Parsing with DCGs

                Now require successful unification at each step

                S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

                S-gt a turtle sleep fails

                Case Marking

                PN(singularnominative)PN(singularnominative) --gt --gt [he][she][he][she]

                PN(singularaccusative)PN(singularaccusative) --gt --gt [him][her][him][her]

                PN(pluralnominative)PN(pluralnominative) --gt --gt [they][they]

                PN(pluralaccusative)PN(pluralaccusative) --gt --gt [them][them]

                S --gt NP(Numbernominative) NP(Number)S --gt NP(Numbernominative) NP(Number)

                VP(Number) --gt V(Number) VP(Anyaccusative)VP(Number) --gt V(Number) VP(Anyaccusative)

                VP(NumberCase) --gt PN(NumberCase)VP(NumberCase) --gt PN(NumberCase)

                VP(NumberAny) --gt Det N(Number)VP(NumberAny) --gt Det N(Number)

                He sees her She sees him They see her

                But not Them see he

                DCGs

                Are strictly more expressive than CFGs Can represent for instance

                bull S(N) -gt A(N) B(N) C(N)bull A(0) -gt [] bull B(0) -gt []bull C(0) -gt []bull A(s(N)) -gt A(N) [A]bull B(s(N)) -gt B(N) [B]bull C(s(N)) -gt C(N) [C]

                Probabilistic Models

                Traditional grammar models are very rigid bull essentially a yes no decision

                Probabilistic grammarsbull Define a probability models for the databull Compute the probability of each alternativebull Choose the most likely alternative

                Ilustrate on bull Shannon Gamebull Spelling correctionbull Parsing

                Illustration

                Wall Street Journal Corpus 3 000 000 words Correct parse tree for sentences known

                bull Constructed by handbull Can be used to derive stochastic context free

                grammarsbull SCFG assign probability to parse trees

                Compute the most probable parse tree

                Sequences are omni-present

                Therefore the techniques we will see also apply tobull Bioinformatics

                DNA proteins mRNA hellip can all be represented as strings

                bullRobotics Sequences of actions states hellip

                bullhellip

                Rest of the Course

                Limitations traditional grammar models motivate probabilistic extensions bull Regular grammars and Finite State Automata

                All use principles of Part I on Graphical Models Markov Models using n-gramms (Hidden) Markov Models Conditional Random Fields

                bull As an example of using undirected graphical models

                bull Probabilistic Context Free Grammarsbull Probabilistic Definite Clause Grammars

                • Advanced Artificial Intelligence
                • Topic
                • Contents
                • Rationalism versus Empiricism
                • Slide 5
                • This course
                • Ambiguity
                • NLP and Statistics
                • Slide 9
                • Corpora
                • Word Counts
                • Slide 12
                • Word Counts (Brown corpus)
                • Slide 14
                • Zipflsquos Law
                • Language and sequences
                • Key NLP Problem Ambiguity
                • Language Model
                • Example of bad language model
                • A bad language model
                • Slide 22
                • A good language model
                • Why language models
                • Applications
                • Spelling errors
                • Handwriting recognition
                • For Spell Checkers
                • Another dimension in language models
                • Sequence Tagging
                • Slide 31
                • Parsing
                • Slide 33
                • Language models based on Grammars
                • Grammars and parsing
                • Regular Grammars and Finite State Automata
                • Finite State Automaton
                • Phrase structure
                • Notation
                • Context Free Grammar
                • Slide 41
                • Top-down parsing
                • Context-free grammar
                • Parse tree
                • Definite Clause Grammars Non-terminals may have arguments
                • DCGs
                • Unification in a nutshell (cf AI course)
                • Unification
                • Parsing with DCGs
                • Case Marking
                • Slide 51
                • Probabilistic Models
                • Illustration
                • PowerPoint Presentation
                • Sequences are omni-present
                • Rest of the Course

                  Statistical Methods deal with uncertaintyThey predict the future behaviour of a systembased on the behaviour observed in the past

                  Statistical Methods require training data

                  The data in Statistical NLP are the Corpora

                  NLP and Statistics

                  Corpus text collection for linguistic purposes

                  TokensHow many words are contained in Tom Sawyer 71370

                  TypesHow many different words are contained in TS 8018

                  Hapax Legomenawords appearing only once

                  Corpora

                  The most frequent words are function words

                  word freq word freq

                  the 3332 in 906

                  and 2972 that 877

                  a 1775 he 877

                  to 1725 I 783

                  of 1440 his 772

                  was 1161 you 686

                  it 1027 Tom 679

                  Word Counts

                  f nf

                  1 39932 12923 6644 4105 2436 1997 1728 1319 8210 9111-50 54051-100 99gt 100 102

                  How many words appear f times

                  Word Counts

                  About half of the words occurs just onceAbout half of the text consists of the

                  100 most common wordshellip

                  Word Counts (Brown corpus)

                  Word Counts (Brown corpus)

                  word f r fr word f r frthe 3332 1 3332 turned 51 200 10200and 2972 2 5944 youlsquoll 30 300 9000a 1775 3 5235 name 21 400 8400he 877 10 8770 comes 16 500 8000but 410 20 8400 group 13 600 7800be 294 30 8820 lead 11 700 7700there 222 40 8880 friends 10 800 8000one 172 50 8600 begin 9 900 8100about 158 60 9480 family 8 1000 8000more 138 70 9660 brushed 4 2000 8000never 124 80 9920 sins 2 3000 6000Oh 116 90 10440 Could 2 4000 8000two 104 100 10400 Applausive 1 8000 8000

Zipf's Law: f ~ 1/r (f·r = const)

Zipf's Law

                  Minimize effort
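Zipf's relation f·r ≈ const is easy to check on any corpus with a few lines of code. A minimal sketch (a toy sentence stands in for Tom Sawyer; the function name is ours):

```python
from collections import Counter

def rank_frequency(text):
    """Return (word, frequency, rank, f*r) tuples sorted by descending frequency."""
    counts = Counter(text.lower().split())
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    # Zipf's law predicts f * r to be roughly constant across ranks
    return [(w, f, r, f * r) for r, (w, f) in enumerate(ranked, start=1)]

text = "the cat saw the dog and the dog saw the cat and the turtle"
for word, f, r, fr in rank_frequency(text):
    print(word, f, r, fr)
```

On a real corpus the f·r column flattens out much like in the table above.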

                  Language and sequences

Natural language processing
• is concerned with the analysis of sequences of words and sentences
• construction of language models

Two types of models
• Non-probabilistic
• Probabilistic

Human Language is highly ambiguous at all levels

• acoustic level: recognize speech vs. wreck a nice beach

• morphological level: saw: to see (past), saw (noun), to saw (present, inf)

• syntactic level: I saw the man on the hill with a telescope

• semantic level: One book has to be read by every student

Key NLP Problem: Ambiguity

                  Language Model

A formal model about language. Two types:

• Non-probabilistic: allows one to compute whether a certain sequence (sentence or part thereof) is possible. Often grammar based.

• Probabilistic: allows one to compute the probability of a certain sequence. Often extends grammars with probabilities.

                  Example of bad language model

                  A bad language model

                  A bad language model

                  A good language model

Non-Probabilistic
• "I swear to tell the truth" is possible
• "I swerve to smell de soup" is impossible

Probabilistic
• P(I swear to tell the truth) ~ 0.001
• P(I swerve to smell de soup) ~ 0

                  Why language models

Consider a Shannon Game
• Predicting the next word in the sequence

Statistical natural language …
The cat is thrown out of the …
The large green …
Sue swallowed the large green …
…

                  Model at the sentence level

                  Applications

Spelling correction
Mobile phone texting
Speech recognition
Handwriting recognition
Disabled users
…

                  Spelling errors

They are leaving in about fifteen minuets to go to her house.

The study was conducted mainly be John Black.
Hopefully all with continue smoothly in my absence.
Can they lave him my messages?
I need to notified the bank of…
He is trying to fine out.

                  Handwriting recognition

Assume a note is given to a bank teller, which the teller reads as I have a gub. (cf. Woody Allen)

NLP to the rescue …
• gub is not a word
• gun, gum, Gus and gull are words, but gun has a higher probability in the context of a bank

                  For Spell Checkers

Collect a list of commonly substituted words
• piece/peace, whether/weather, their/there

Example:
"On Tuesday, the whether …"
"On Tuesday, the weather …"
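A frequency-based sketch of this idea (the confusion sets follow the slide; the counts are invented, and a real checker would also condition on the surrounding context rather than raw counts alone):

```python
from collections import Counter

# Hypothetical confusion sets, as in the slide's examples
CONFUSION_SETS = [{"piece", "peace"}, {"whether", "weather"}, {"their", "there"}]

def suggest(word, corpus_counts):
    """Suggest the most frequent member of the word's confusion set."""
    for s in CONFUSION_SETS:
        if word in s:
            return max(s, key=lambda w: corpus_counts[w])
    return word  # not in any confusion set: leave unchanged

counts = Counter({"weather": 120, "whether": 80, "piece": 50, "peace": 45})
print(suggest("whether", counts))  # prefers "weather" on raw counts alone
```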

                  Another dimension in language models

Do we mainly want to infer (probabilities of) legal sentences / sequences?
• So far

Or do we want to infer properties of these sentences?
• E.g. parse tree, part-of-speech tagging
• Needed for understanding NL

Let's look at some tasks

                  Sequence Tagging

Part-of-speech tagging
• He drives with his bike
• N V PR PN N (noun, verb, preposition, pronoun, noun)

Text extraction
• The job is that of a programmer
• X X X X X X JobType
• The seminar is taking place from 15:00 to 16:00
• X X X X X X Start End

                  Sequence Tagging

Predicting the secondary structure of proteins, mRNA, …
• X = AFARLMMA…
• Y = hehestststhesthe …

                  Parsing

Given a sentence, find its parse tree. An important step in understanding NL.

                  Parsing

In bioinformatics, parsing allows one to predict (elements of) structure from sequence.

                  Language models based on Grammars

Grammar Types
• Regular grammars and Finite State Automata
• Context-Free Grammars
• Definite Clause Grammars: a particular type of Unification Based Grammar (Prolog)

Distinguish lexicon from grammar
• Lexicon (dictionary): contains information about words, e.g. word - possible tags (and possibly additional information): flies - V(erb) - N(oun)
• Grammar: encodes rules

Grammars and parsing
Syntactic level best understood and formalized.
Derivation of grammatical structure: parsing (more than just recognition).
Result of parsing: mostly a parse tree, showing the constituents of a sentence, e.g. verb or noun phrases.
Syntax usually specified in terms of a grammar consisting of grammar rules.

                  Regular Grammars and Finite State Automata

Lexical information - which words are:
• Det(erminer)
• N(oun)
• Vi (intransitive verb) - no argument
• Pn (pronoun)
• Vt (transitive verb) - takes an argument
• Adj (adjective)

Now accept:
• The cat slept
• Det N Vi

As regular grammar:
• S -> [Det] S1    ([ ] = terminal)
• S1 -> [N] S2
• S2 -> [Vi]

Lexicon:
• The - Det
• Cat - N
• Slept - Vi
• …

                  Finite State Automaton

                  Sentences

• John smiles - Pn Vi
• The cat disappeared - Det N Vi
• These new shoes hurt - Det Adj N Vi
• John liked the old cat - Pn Vt Det Adj N
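A regular grammar like the one above corresponds to a finite state automaton over tag sequences. A sketch that also covers the pronoun, adjective, and transitive-verb sentences (the state names and the extra transitions are our own extension of the three-rule grammar):

```python
# Deterministic FSA: (state, tag) -> next state; "end" is the accepting state.
TRANSITIONS = {
    ("s",    "Det"): "np1",  ("s",    "Pn"):  "np",
    ("np1",  "Adj"): "np1",  ("np1",  "N"):   "np",
    ("np",   "Vi"):  "end",  ("np",   "Vt"):  "obj",
    ("obj",  "Det"): "obj1", ("obj",  "Pn"):  "end",
    ("obj1", "Adj"): "obj1", ("obj1", "N"):   "end",
}

def accepts(tags):
    state = "s"
    for tag in tags:
        state = TRANSITIONS.get((state, tag))
        if state is None:  # no transition defined: reject
            return False
    return state == "end"

print(accepts(["Det", "N", "Vi"]))               # The cat slept
print(accepts(["Pn", "Vt", "Det", "Adj", "N"]))  # John liked the old cat
```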

Phrase structure: parse tree for "the dog chased a cat into the garden"

[S [NP [D the] [N dog]]
   [VP [V chased]
       [NP [D a] [N cat]]
       [PP [P into] [NP [D the] [N garden]]]]]

                  Notation

S: sentence
D or Det: Determiner (e.g. articles)
N: noun
V: verb
P: preposition
NP: noun phrase
VP: verb phrase
PP: prepositional phrase

Context Free Grammar

S  -> NP VP
NP -> D N
VP -> V NP
VP -> V NP PP
PP -> P NP
D  -> [the]
D  -> [a]
N  -> [dog]
N  -> [cat]
N  -> [garden]
V  -> [chased]
V  -> [saw]
P  -> [into]

                  Terminals ~ Lexicon

                  Phrase structure

Formalism of context-free grammars
• Nonterminal symbols: S, NP, VP, …
• Terminal symbols: dog, cat, saw, the, …

Recursion
• "The girl thought the dog chased the cat"

VP -> V S
N -> [girl]
V -> [thought]

                  Top-down parsing

S -> NP VP
S -> Det N VP
S -> The N VP
S -> The dog VP
S -> The dog V NP
S -> The dog chased NP
S -> The dog chased Det N
S -> The dog chased the N
S -> The dog chased the cat
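This derivation is exactly what a top-down parser does mechanically: expand the leftmost nonterminal, backtracking over alternatives. A minimal recursive-descent recognizer for the small context-free grammar above (a sketch; the rule names follow the slides, the function is ours):

```python
# Grammar rules: nonterminal -> list of alternative right-hand sides.
# Symbols not appearing as a key (e.g. "the") are terminals.
RULES = {
    "S":  [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP"], ["V", "NP", "PP"]],
    "PP": [["P", "NP"]],
    "D":  [["the"], ["a"]],
    "N":  [["dog"], ["cat"], ["garden"]],
    "V":  [["chased"], ["saw"]],
    "P":  [["into"]],
}

def derives(symbols, words):
    """Top-down, depth-first: can the symbol sequence derive exactly `words`?"""
    if not symbols:
        return not words
    head, rest = symbols[0], symbols[1:]
    if head not in RULES:  # terminal symbol: must match the next word
        return bool(words) and words[0] == head and derives(rest, words[1:])
    return any(derives(rhs + rest, words) for rhs in RULES[head])

print(derives(["S"], "the dog chased a cat into the garden".split()))  # True
```

Note this simple strategy loops on left-recursive rules; chart parsers avoid that.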

Context-free grammar

S   --> NP, VP.
NP  --> PN.            (Proper noun)
NP  --> Art, Adj, N.
NP  --> Art, N.
VP  --> VI.            (intransitive verb)
VP  --> VT, NP.        (transitive verb)
Art --> [the].
Adj --> [lazy].
Adj --> [rapid].
PN  --> [achilles].
N   --> [turtle].
VI  --> [sleeps].
VT  --> [beats].

Parse tree for "the rapid turtle beats achilles":

[S [NP [Art the] [Adj rapid] [N turtle]]
   [VP [Vt beats] [NP [PN achilles]]]]

Definite Clause Grammars: Non-terminals may have arguments

S     --> NP(N), VP(N).
NP(N) --> Art(N), N(N).
VP(N) --> VI(N).
Art(singular) --> [a].
Art(singular) --> [the].
Art(plural)   --> [the].
N(singular)   --> [turtle].
N(plural)     --> [turtles].
VI(singular)  --> [sleeps].
VI(plural)    --> [sleep].

Number Agreement

                  DCGs

Non-terminals may have arguments
• Variables (start with capital): e.g. Number, Any
• Constants (start with lower case): e.g. singular, plural
• Structured terms (start with lower case and take arguments themselves): e.g. vp(V, NP)

Parsing needs to be adapted
• Using unification

                  Unification in a nutshell (cf AI course)

Substitutions
E.g. {Num / singular}, {T / vp(V, NP)}

Applying a substitution
• Simultaneously replace variables by the corresponding terms
• S(Num) {Num / singular} = S(singular)

                  Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.
• Art(singular) and Art(Num): gives {Num / singular}
• Art(singular) and Art(plural): fails
• Art(Num1) and Art(Num2): {Num1 / Num2}
• PN(Num, accusative) and PN(singular, Case): {Num / singular, Case / accusative}
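These examples can be reproduced with a small unifier. A sketch (no occurs check; representing variables as capitalized strings and compound terms as (functor, argument-list) pairs is our own encoding choice):

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings until a non-variable (or unbound variable) remains."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(t1, t2, subst=None):
    """Return the most general unifier extending `subst`, or None on failure."""
    subst = {} if subst is None else subst
    t1, t2 = walk(t1, subst), walk(t2, subst)
    if t1 == t2:
        return subst
    if is_var(t1):
        return {**subst, t1: t2}
    if is_var(t2):
        return {**subst, t2: t1}
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and t1[0] == t2[0] and len(t1[1]) == len(t2[1])):
        for a, b in zip(t1[1], t2[1]):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None  # constant clash, e.g. singular vs. plural

print(unify(("Art", ["singular"]), ("Art", ["Num"])))     # {'Num': 'singular'}
print(unify(("Art", ["singular"]), ("Art", ["plural"])))  # None
```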

                  Parsing with DCGs

                  Now require successful unification at each step

S -> NP(N) VP(N)
S -> Art(N) N(N) VP(N)    {N / singular}
S -> a N(singular) VP(singular)
S -> a turtle VP(singular)
S -> a turtle sleeps

S -> a turtle sleep: fails

                  Case Marking

PN(singular, nominative) --> [he]; [she].
PN(singular, accusative) --> [him]; [her].
PN(plural, nominative)   --> [they].
PN(plural, accusative)   --> [them].

S --> NP(Number, nominative), VP(Number).
VP(Number) --> V(Number), NP(Any, accusative).
NP(Number, Case) --> PN(Number, Case).
NP(Number, Any)  --> Det, N(Number).

He sees her. She sees him. They see her.
But not: Them see he.

                  DCGs

Are strictly more expressive than CFGs. Can represent, for instance, the (non-context-free) language AⁿBⁿCⁿ:

• S(N) --> A(N), B(N), C(N).
• A(0) --> [].
• B(0) --> [].
• C(0) --> [].
• A(s(N)) --> A(N), [A].
• B(s(N)) --> B(N), [B].
• C(s(N)) --> C(N), [C].
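The counting argument N (built from s(N) successor terms) forces the three blocks to have equal length, which no context-free grammar can do. A direct recognizer for that language (the integer n stands in for the successor numeral, and lowercase a, b, c for the terminals; both are our encoding choices):

```python
def is_anbncn(words):
    """Accept exactly the strings a^n b^n c^n (n >= 0)."""
    n = len(words) // 3
    return words == ["a"] * n + ["b"] * n + ["c"] * n

print(is_anbncn(["a", "a", "b", "b", "c", "c"]))  # True
print(is_anbncn(["a", "b", "b", "c"]))            # False
```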

                  Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes / no decision

Probabilistic grammars
• Define a probability model for the data
• Compute the probability of each alternative
• Choose the most likely alternative

Illustrated on:
• Shannon Game
• Spelling correction
• Parsing

                  Illustration

Wall Street Journal Corpus: 3,000,000 words. Correct parse trees for the sentences are known
• Constructed by hand
• Can be used to derive stochastic context free grammars
• SCFGs assign probabilities to parse trees

Compute the most probable parse tree
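How an SCFG scores a parse tree can be shown in a few lines: the tree's probability is the product of the probabilities of the rules it uses. A toy sketch (the rule probabilities are invented for illustration, not treebank estimates):

```python
# Toy stochastic CFG: each rule carries a probability; probabilities of
# rules with the same left-hand side sum to 1 (numbers are made up).
PROBS = {
    ("S",  ("NP", "VP")): 1.0,
    ("NP", ("D", "N")): 1.0,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("V", "NP", "PP")): 0.3,
    ("D",  ("the",)): 0.6,
    ("D",  ("a",)): 0.4,
    ("N",  ("dog",)): 0.5,
    ("N",  ("cat",)): 0.5,
    ("V",  ("chased",)): 1.0,
}

def tree_prob(tree):
    """P(tree) = product of the probabilities of all rules used in it.
    A tree is (label, child, ...); terminal leaves are plain strings."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = PROBS[(label, rhs)]
    for child in children:
        if not isinstance(child, str):
            p *= tree_prob(child)
    return p

t = ("S",
     ("NP", ("D", "the"), ("N", "dog")),
     ("VP", ("V", "chased"), ("NP", ("D", "a"), ("N", "cat"))))
print(tree_prob(t))  # 1.0 * 1.0 * 0.6 * 0.5 * 0.7 * 1.0 * 1.0 * 0.4 * 0.5 ≈ 0.042
```

Picking the most probable parse then means comparing `tree_prob` over the alternative trees for the same sentence.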

                  Sequences are omni-present

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

                  Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
• Regular grammars and Finite State Automata:
  Markov Models using n-grams
  (Hidden) Markov Models
  Conditional Random Fields
  • As an example of using undirected graphical models
• Probabilistic Context Free Grammars
• Probabilistic Definite Clause Grammars

All use principles of Part I on Graphical Models.

                  • Advanced Artificial Intelligence
                  • Topic
                  • Contents
                  • Rationalism versus Empiricism
                  • Slide 5
                  • This course
                  • Ambiguity
                  • NLP and Statistics
                  • Slide 9
                  • Corpora
                  • Word Counts
                  • Slide 12
                  • Word Counts (Brown corpus)
                  • Slide 14
                  • Zipflsquos Law
                  • Language and sequences
                  • Key NLP Problem Ambiguity
                  • Language Model
                  • Example of bad language model
                  • A bad language model
                  • Slide 22
                  • A good language model
                  • Why language models
                  • Applications
                  • Spelling errors
                  • Handwriting recognition
                  • For Spell Checkers
                  • Another dimension in language models
                  • Sequence Tagging
                  • Slide 31
                  • Parsing
                  • Slide 33
                  • Language models based on Grammars
                  • Grammars and parsing
                  • Regular Grammars and Finite State Automata
                  • Finite State Automaton
                  • Phrase structure
                  • Notation
                  • Context Free Grammar
                  • Slide 41
                  • Top-down parsing
                  • Context-free grammar
                  • Parse tree
                  • Definite Clause Grammars Non-terminals may have arguments
                  • DCGs
                  • Unification in a nutshell (cf AI course)
                  • Unification
                  • Parsing with DCGs
                  • Case Marking
                  • Slide 51
                  • Probabilistic Models
                  • Illustration
                  • PowerPoint Presentation
                  • Sequences are omni-present
                  • Rest of the Course

                    Corpus text collection for linguistic purposes

                    TokensHow many words are contained in Tom Sawyer 71370

                    TypesHow many different words are contained in TS 8018

                    Hapax Legomenawords appearing only once

                    Corpora

                    The most frequent words are function words

                    word freq word freq

                    the 3332 in 906

                    and 2972 that 877

                    a 1775 he 877

                    to 1725 I 783

                    of 1440 his 772

                    was 1161 you 686

                    it 1027 Tom 679

                    Word Counts

                    f nf

                    1 39932 12923 6644 4105 2436 1997 1728 1319 8210 9111-50 54051-100 99gt 100 102

                    How many words appear f times

                    Word Counts

                    About half of the words occurs just onceAbout half of the text consists of the

                    100 most common wordshellip

                    Word Counts (Brown corpus)

                    Word Counts (Brown corpus)

                    word f r fr word f r frthe 3332 1 3332 turned 51 200 10200and 2972 2 5944 youlsquoll 30 300 9000a 1775 3 5235 name 21 400 8400he 877 10 8770 comes 16 500 8000but 410 20 8400 group 13 600 7800be 294 30 8820 lead 11 700 7700there 222 40 8880 friends 10 800 8000one 172 50 8600 begin 9 900 8100about 158 60 9480 family 8 1000 8000more 138 70 9660 brushed 4 2000 8000never 124 80 9920 sins 2 3000 6000Oh 116 90 10440 Could 2 4000 8000two 104 100 10400 Applausive 1 8000 8000

                    Zipflsquos Law f~1r (fr = const)

                    Zipflsquos Law

                    Minimize effort

                    Language and sequences

                    Natural language processingbull Is concerned with the analysis of

                    sequences of words sentencesbullConstruction of language models

                    Two types of modelsbullNon-probabilisticbull Probabilistic

                    Human Language is highly ambiguous at all levels

                    bull acoustic levelrecognize speech vs wreck a nice beach

                    bull morphological levelsaw to see (past) saw (noun) to saw (present inf)

                    bull syntactic levelI saw the man on the hill with a telescope

                    bull semantic levelOne book has to be read by every student

                    Key NLP Problem Ambiguity

                    Language Model

                    A formal model about language Two types

                    bull Non-probabilistic Allows one to compute whether a certain sequence

                    (sentence or part thereof) is possible Often grammar based

                    bull Probabilistic Allows one to compute the probability of a certain

                    sequence Often extends grammars with probabilities

                    Example of bad language model

                    A bad language model

                    A bad language model

                    A good language model

                    Non-Probabilisticbull ldquoI swear to tell the truthrdquo is possiblebull ldquoI swerve to smell de souprdquo is impossible

                    Probabilisticbull P(I swear to tell the truth) ~ 0001bull P(I swerve to smell de soup) ~ 0

                    Why language models

                    Consider a Shannon Gamebull Predicting the next word in the sequence

                    Statistical natural language hellip The cat is thrown out of the hellip The large green hellip Sue swallowed the large green hellip hellip

                    Model at the sentence level

                    Applications

                    Spelling correction Mobile phone texting Speech recognition Handwriting recognition Disabled users hellip

                    Spelling errors

                    They are leaving in about fifteen minuets to go to her house

                    The study was conducted mainly be John Black Hopefully all with continue smoothly in my absence Can they lave him my messages I need to notified the bank ofhellip He is trying to fine out

                    Handwriting recognition

                    Assume a note is given to a bank teller which the teller reads as I have a gub (cf Woody Allen)

                    NLP to the rescue hellipbull gub is not a wordbull gun gum Gus and gull are words but gun has a

                    higher probability in the context of a bank

                    For Spell Checkers

                    Collect list of commonly substituted wordsbull piecepeace whetherweather theirthere

                    ExampleldquoOn Tuesday the whether helliprsquorsquoldquoOn Tuesday the weather helliprdquo

                    Another dimension in language models

                    Do we mainly want to infer (probabilities) of legal sentences sequences bull So far

                    Or do we want to infer properties of these sentences bull Eg parse tree part-of-speech-taggingbullNeeded for understanding NL

                    Letrsquos look at some tasks

                    Sequence Tagging

                    Part-of-speech taggingbull He drives with his bikebull N V PR PN N noun verb preposition pronoun noun

                    Text extractionbull The job is that of a programmerbull X X X X X X JobTypebull The seminar is taking place from 1500 to 1600bull X X X X X X Start End

                    Sequence Tagging

                    Predicting the secondary structure of proteins mRNA hellipbull X = AFARLMMAhellipbull Y = hehestststhesthe hellip

                    Parsing

                    Given a sentence find its parse tree Important step in understanding NL

                    Parsing

                    In bioinformatics allows to predict (elements of) structure from sequence

                    Language models based on Grammars

                    Grammar Typesbull Regular grammars and Finite State Automatabull Context-Free Grammarsbull Definite Clause Grammars

                    A particular type of Unification Based Grammar (Prolog)

                    Distinguish lexicon from grammarbull Lexicon (dictionary) contains information about

                    words eg word - possible tags (and possibly additional information) flies - V(erb) - N(oun)

                    bull Grammar encode rules

                    Grammars and parsing Syntactic level best understood and formalized Derivation of grammatical structure parsing

                    (more than just recognition) Result of parsing mostly parse tree

                    showing the constituents of a sentence eg verb or noun phrases

                    Syntax usually specified in terms of a grammar consisting of grammar rules

                    Regular Grammars and Finite State Automata

                    Lexical information - which words are bull Det(erminer)bull N(oun)bull Vi (intransitive verb) - no

                    argumentbull Pn (pronoun) bull Vt (transitive verb) - takes an

                    argumentbull Adj (adjective)

                    Now acceptbull The cat sleptbull Det N Vi

                    As regular grammarbull S -gt [Det] S1 [ ] terminal bull S1 -gt [N] S2bull S2 -gt [Vi]

                    Lexicon bull The - Detbull Cat - Nbull Slept - Vi

                    bull hellip

                    Finite State Automaton

                    Sentences

                    bull John smiles - Pn Vibull The cat disappeared - Det N Vibull These new shoes hurt - Det Adj N Vibull John liked the old cat PN Vt Det Adj N

                    Phrase structure

                    S

                    NP

                    D N

                    VP

                    NPV

                    D N

                    PP

                    P NP

                    D N

                    the dog chased a cat into the garden

                    Notation

                    S sentence D or Det Determiner (eg articles) N noun V verb P preposition NP noun phrase VP verb phrase PP prepositional phrase

                    Context Free GrammarS -gt NP VPNP -gt D NVP -gt V NPVP -gt V NP PPPP -gt P NPD -gt [the]D -gt [a]N -gt [dog]N -gt [cat]N -gt [garden]V -gt [chased]V -gt [saw]P -gt [into]

                    Terminals ~ Lexicon

                    Phrase structure

                    Formalism of context-free grammarsbull Nonterminal symbols S NP VP bull Terminal symbols dog cat saw the

                    Recursionbull bdquoThe girl thought the dog chased the catldquo

                    VP -gt V SN -gt [girl]V -gt [thought]

                    Top-down parsing

                    S -gt NP VP S -gt Det N VP S -gt The N VP S -gt The dog VP S -gt The dog V NP S -gt The dog chased NP S -gt The dog chased Det N S-gt The dog chased the N S-gt The dog chased the cat

                    Context-free grammarSS --gt --gt NPNPVPVP

                    NPNP --gt PN --gt PN Proper nounProper noun

                    NPNP --gt Art Adj N--gt Art Adj N

                    NPNP --gt ArtN--gt ArtN

                    VPVP --gt VI --gt VI intransitive verbintransitive verb

                    VPVP --gt VT --gt VT NPNP transitive verbtransitive verb

                    ArtArt --gt [the]--gt [the]

                    AdjAdj --gt [lazy]--gt [lazy]

                    AdjAdj --gt [rapid]--gt [rapid]

                    PNPN --gt [achilles]--gt [achilles]

                    NN --gt [turtle]--gt [turtle]

                    VIVI --gt [sleeps]--gt [sleeps]

                    VTVT --gt [beats]--gt [beats]

                    Parse tree

                    SS

                    NPNP VPVP

                    ArtArt AdjAdj NN VtVt NPNP

                    PNPN

                    achillesachillesbeatsbeatsturtleturtlerapidrapidthethe

                    Definite Clause GrammarsNon-terminals may have arguments

                    SS --gt --gt NPNP((NN))VPVP((NN))

                    NP(NP(NN)) --gt Art(--gt Art(NN)N()N(NN))

                    VP(VP(NN)) --gt VI(--gt VI(NN))

                    Art(Art(singularsingular)) --gt [a]--gt [a]

                    Art(Art(singularsingular)) --gt [the]--gt [the]

                    Art(Art(pluralplural)) --gt [the]--gt [the]

                    N(N(singularsingular)) --gt [turtle]--gt [turtle]

                    N(N(pluralplural)) --gt [turtles]--gt [turtles]

                    VI(VI(singularsingular)) --gt [sleeps]--gt [sleeps]

                    VI(VI(pluralplural)) --gt [sleep]--gt [sleep]

                    Number Agreement

                    DCGs

                    Non-terminals may have argumentsbull Variables (start with capital)

                    Eg Number Any

                    bull Constants (start with lower case) Eg singular plural

                    bull Structured terms (start with lower case and take arguments themselves) Eg vp(VNP)

                    Parsing needs to be adapted bull Using unification

                    Unification in a nutshell (cf AI course)

                    Substitutions

                    Eg Num singular T vp(VNP)

                    Applying substitution bull Simultaneously replace variables by

                    corresponding termsbull S(Num) Num singular = S(singular)

                    Unification

                    Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

                    Gives Num singular

                    bull Art(singular) and Art(plural) Fails

                    bull Art(Num1) and Art(Num2) Num1 Num2

                    bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

                    Parsing with DCGs

                    Now require successful unification at each step

                    S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

                    S-gt a turtle sleep fails

                    Case Marking

                    PN(singularnominative)PN(singularnominative) --gt --gt [he][she][he][she]

                    PN(singularaccusative)PN(singularaccusative) --gt --gt [him][her][him][her]

                    PN(pluralnominative)PN(pluralnominative) --gt --gt [they][they]

                    PN(pluralaccusative)PN(pluralaccusative) --gt --gt [them][them]

                    S --gt NP(Numbernominative) NP(Number)S --gt NP(Numbernominative) NP(Number)

                    VP(Number) --gt V(Number) VP(Anyaccusative)VP(Number) --gt V(Number) VP(Anyaccusative)

                    VP(NumberCase) --gt PN(NumberCase)VP(NumberCase) --gt PN(NumberCase)

                    VP(NumberAny) --gt Det N(Number)VP(NumberAny) --gt Det N(Number)

                    He sees her She sees him They see her

                    But not Them see he

                    DCGs

                    Are strictly more expressive than CFGs Can represent for instance

                    bull S(N) -gt A(N) B(N) C(N)bull A(0) -gt [] bull B(0) -gt []bull C(0) -gt []bull A(s(N)) -gt A(N) [A]bull B(s(N)) -gt B(N) [B]bull C(s(N)) -gt C(N) [C]

                    Probabilistic Models

                    Traditional grammar models are very rigid bull essentially a yes no decision

                    Probabilistic grammarsbull Define a probability models for the databull Compute the probability of each alternativebull Choose the most likely alternative

                    Ilustrate on bull Shannon Gamebull Spelling correctionbull Parsing

                    Illustration

                    Wall Street Journal Corpus 3 000 000 words Correct parse tree for sentences known

                    bull Constructed by handbull Can be used to derive stochastic context free

                    grammarsbull SCFG assign probability to parse trees

                    Compute the most probable parse tree

                    Sequences are omni-present

                    Therefore the techniques we will see also apply tobull Bioinformatics

                    DNA proteins mRNA hellip can all be represented as strings

                    bullRobotics Sequences of actions states hellip

                    bullhellip

                    Rest of the Course

                    Limitations traditional grammar models motivate probabilistic extensions bull Regular grammars and Finite State Automata

                    All use principles of Part I on Graphical Models Markov Models using n-gramms (Hidden) Markov Models Conditional Random Fields

                    bull As an example of using undirected graphical models

                    bull Probabilistic Context Free Grammarsbull Probabilistic Definite Clause Grammars

                    • Advanced Artificial Intelligence
                    • Topic
                    • Contents
                    • Rationalism versus Empiricism
                    • Slide 5
                    • This course
                    • Ambiguity
                    • NLP and Statistics
                    • Slide 9
                    • Corpora
                    • Word Counts
                    • Slide 12
                    • Word Counts (Brown corpus)
                    • Slide 14
                    • Zipflsquos Law
                    • Language and sequences
                    • Key NLP Problem Ambiguity
                    • Language Model
                    • Example of bad language model
                    • A bad language model
                    • Slide 22
                    • A good language model
                    • Why language models
                    • Applications
                    • Spelling errors
                    • Handwriting recognition
                    • For Spell Checkers
                    • Another dimension in language models
                    • Sequence Tagging
                    • Slide 31
                    • Parsing
                    • Slide 33
                    • Language models based on Grammars
                    • Grammars and parsing
                    • Regular Grammars and Finite State Automata
                    • Finite State Automaton
                    • Phrase structure
                    • Notation
                    • Context Free Grammar
                    • Slide 41
                    • Top-down parsing
                    • Context-free grammar
                    • Parse tree
                    • Definite Clause Grammars Non-terminals may have arguments
                    • DCGs
                    • Unification in a nutshell (cf AI course)
                    • Unification
                    • Parsing with DCGs
                    • Case Marking
                    • Slide 51
                    • Probabilistic Models
                    • Illustration
                    • PowerPoint Presentation
                    • Sequences are omni-present
                    • Rest of the Course

                      The most frequent words are function words

word  freq    word  freq
the   3332    in     906
and   2972    that   877
a     1775    he     877
to    1725    I      783
of    1440    his    772
was   1161    you    686
it    1027    Tom    679

                      Word Counts

f        n_f
1        3993
2        1292
3         664
4         410
5         243
6         199
7         172
8         131
9          82
10         91
11-50     540
51-100     99
> 100     102

                      How many words appear f times

                      Word Counts

About half of the words occur just once. About half of the text consists of the 100 most common words…
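Frequency-of-frequency counts like the table above are easy to reproduce; a minimal sketch with Python's `collections.Counter` (the toy text is invented, not the corpus behind the slide's numbers):

```python
from collections import Counter

def freq_of_freqs(text):
    """Map f -> number of distinct words that occur exactly f times."""
    word_freq = Counter(text.lower().split())   # word -> frequency f
    return Counter(word_freq.values())          # f -> n_f

toy = "the cat sat on the mat and the cat slept"
nf = freq_of_freqs(toy)
print(nf[1], nf[2], nf[3])  # 5 1 1
```

Run on a real corpus, the same two lines produce exactly the f / n_f table shown above.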

                      Word Counts (Brown corpus)

word   f    r    f·r     word        f   r     f·r
the   3332   1   3332    turned     51   200  10200
and   2972   2   5944    you'll     30   300   9000
a     1775   3   5325    name       21   400   8400
he     877  10   8770    comes      16   500   8000
but    410  20   8200    group      13   600   7800
be     294  30   8820    lead       11   700   7700
there  222  40   8880    friends    10   800   8000
one    172  50   8600    begin       9   900   8100
about  158  60   9480    family      8  1000   8000
more   138  70   9660    brushed     4  2000   8000
never  124  80   9920    sins        2  3000   6000
Oh     116  90  10440    Could       2  4000   8000
two    104 100  10400    Applausive  1  8000   8000

Zipf's Law: f ~ 1/r  (f·r = const)

Zipf's Law

                      Minimize effort
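The claim f·r ≈ const can be checked mechanically against the table; a small sketch using a subset of the rows, with the products recomputed from f and r:

```python
# Check Zipf's law (f * r roughly constant) on the slide's corpus counts.
data = [  # (word, frequency f, rank r)
    ("the", 3332, 1), ("and", 2972, 2), ("a", 1775, 3),
    ("he", 877, 10), ("but", 410, 20), ("there", 222, 40),
    ("about", 158, 60), ("never", 124, 80), ("two", 104, 100),
    ("turned", 51, 200), ("name", 21, 400), ("group", 13, 600),
    ("friends", 10, 800), ("family", 8, 1000), ("sins", 2, 3000),
]
products = [f * r for _, f, r in data]
print(min(products), max(products))  # 3332 10400
```

Across ranks spanning three orders of magnitude, f·r stays within a factor of about three: a rough constant, as Zipf's law predicts.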

                      Language and sequences

Natural language processing
• is concerned with the analysis of sequences of words / sentences
• construction of language models

Two types of models
• Non-probabilistic
• Probabilistic

                      Human Language is highly ambiguous at all levels

• acoustic level: recognize speech vs. wreck a nice beach
• morphological level: saw: to see (past), saw (noun), to saw (present, inf)
• syntactic level: I saw the man on the hill with a telescope
• semantic level: One book has to be read by every student

                      Key NLP Problem Ambiguity

                      Language Model

A formal model of language. Two types:

• Non-probabilistic: allows one to compute whether a certain sequence (sentence or part thereof) is possible. Often grammar based.

• Probabilistic: allows one to compute the probability of a certain sequence. Often extends grammars with probabilities.

                      Example of bad language model

                      A bad language model

                      A bad language model

                      A good language model

Non-probabilistic
• "I swear to tell the truth" is possible
• "I swerve to smell de soup" is impossible

Probabilistic
• P(I swear to tell the truth) ≈ 0.0001
• P(I swerve to smell de soup) ≈ 0

                      Why language models

Consider a Shannon Game
• Predicting the next word in a sequence:
  Statistical natural language …
  The cat is thrown out of the …
  The large green …
  Sue swallowed the large green …
  …

                      Model at the sentence level
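A crude way to play the Shannon game is a bigram model that always guesses the most frequent successor seen in training; a sketch over an invented three-sentence corpus:

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count bigram successors: word -> Counter of following words."""
    succ = defaultdict(Counter)
    for sent in corpus:
        words = sent.lower().split()
        for w1, w2 in zip(words, words[1:]):
            succ[w1][w2] += 1
    return succ

def predict(succ, word):
    """Shannon game move: the most frequent successor of `word`."""
    return succ[word].most_common(1)[0][0]

corpus = [
    "the cat is thrown out of the house",
    "the cat is sleeping",
    "sue swallowed the large green frog",
]
succ = train_bigrams(corpus)
print(predict(succ, "cat"))   # is
print(predict(succ, "the"))   # cat
```

Real language models extend this idea to longer histories (n-grams) and to probabilities rather than raw counts.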

                      Applications

• Spelling correction
• Mobile phone texting
• Speech recognition
• Handwriting recognition
• Disabled users
• …

                      Spelling errors

They are leaving in about fifteen minuets to go to her house.

The study was conducted mainly be John Black. Hopefully all with continue smoothly in my absence. Can they lave him my messages? I need to notified the bank of… He is trying to fine out.

(The misspellings are deliberate: each error is itself a valid word.)

                      Handwriting recognition

Assume a note is given to a bank teller, which the teller reads as "I have a gub." (cf. Woody Allen)

NLP to the rescue …
• gub is not a word
• gun, gum, Gus, and gull are words, but gun has a higher probability in the context of a bank

                      For Spell Checkers

Collect a list of commonly substituted words
• piece/peace, whether/weather, their/there, …

Example:
"On Tuesday, the whether …"
"On Tuesday, the weather …"

                      Another dimension in language models

Do we mainly want to infer (probabilities of) legal sentences / sequences?
• So far …

Or do we want to infer properties of these sentences?
• E.g. parse tree, part-of-speech tagging
• Needed for understanding NL

Let's look at some tasks.

                      Sequence Tagging

Part-of-speech tagging
• He drives with his bike
• N  V  PR  PN  N  (noun, verb, preposition, pronoun, noun)

Text extraction
• The job is that of a programmer
• X   X  X  X  X  X  JobType
• The seminar is taking place from 15:00 to 16:00
• X   X   X   X   X   X   Start   End

                      Sequence Tagging

Predicting the secondary structure of proteins, mRNA, …
• X = AFARLMMA…
• Y = he he st st st he st he …

                      Parsing

Given a sentence, find its parse tree. An important step in understanding NL.

                      Parsing

In bioinformatics, parsing allows one to predict (elements of) structure from sequence.

                      Language models based on Grammars

Grammar Types
• Regular grammars and Finite State Automata
• Context-Free Grammars
• Definite Clause Grammars: a particular type of Unification Based Grammar (Prolog)

Distinguish lexicon from grammar
• Lexicon (dictionary): contains information about words, e.g. word - possible tags (and possibly additional information): flies - V(erb), N(oun)
• Grammar: encodes the rules

Grammars and parsing
• Syntactic level best understood and formalized
• Derivation of grammatical structure: parsing (more than just recognition)
• Result of parsing: mostly a parse tree, showing the constituents of a sentence, e.g. verb or noun phrases
• Syntax usually specified in terms of a grammar consisting of grammar rules

                      Regular Grammars and Finite State Automata

Lexical information - which words are
• Det(erminer)
• N(oun)
• Pn (pronoun)
• Adj (adjective)
• Vi (intransitive verb) - takes no argument
• Vt (transitive verb) - takes an argument

Now accept
• The cat slept
• Det N Vi

As regular grammar
• S -> [Det], S1.    ([ ] marks a terminal)
• S1 -> [N], S2.
• S2 -> [Vi].

Lexicon
• The - Det
• Cat - N
• Slept - Vi
• …

                      Finite State Automaton

Sentences

• John smiles - Pn Vi
• The cat disappeared - Det N Vi
• These new shoes hurt - Det Adj N Vi
• John liked the old cat - Pn Vt Det Adj N
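Tag sequences like these can be accepted by a small deterministic finite state automaton; a sketch in Python (the state names q0–q5 are our own, the slides do not name them):

```python
# Deterministic FSA over POS tags accepting the sentence patterns above.
TRANS = {
    ("q0", "Det"): "q1", ("q0", "Pn"): "q2",
    ("q1", "Adj"): "q1", ("q1", "N"): "q2",   # Adj loop allows Det Adj* N
    ("q2", "Vi"): "q3",  ("q2", "Vt"): "q4",
    ("q4", "Det"): "q5", ("q4", "Pn"): "q3",  # object NP after a Vt
    ("q5", "Adj"): "q5", ("q5", "N"): "q3",
}

def accepts(tags, start="q0", final=("q3",)):
    state = start
    for tag in tags:
        state = TRANS.get((state, tag))
        if state is None:                     # no transition: reject
            return False
    return state in final

print(accepts(["Det", "N", "Vi"]))               # True:  "The cat slept"
print(accepts(["Pn", "Vt", "Det", "Adj", "N"]))  # True:  "John liked the old cat"
print(accepts(["Vi", "Det"]))                    # False
```

This is exactly the regular-grammar view above: each grammar nonterminal (S, S1, S2, …) becomes an automaton state.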

Phrase structure

Parse tree for "the dog chased a cat into the garden":

[S [NP [D the] [N dog]]
   [VP [V chased]
       [NP [D a] [N cat]]
       [PP [P into] [NP [D the] [N garden]]]]]

                      Notation

S: sentence; D or Det: determiner (e.g. articles); N: noun; V: verb; P: preposition; NP: noun phrase; VP: verb phrase; PP: prepositional phrase

Context Free Grammar
S -> NP VP
NP -> D N
VP -> V NP
VP -> V NP PP
PP -> P NP
D -> [the]
D -> [a]
N -> [dog]
N -> [cat]
N -> [garden]
V -> [chased]
V -> [saw]
P -> [into]

                      Terminals ~ Lexicon

                      Phrase structure

Formalism of context-free grammars
• Nonterminal symbols: S, NP, VP, …
• Terminal symbols: dog, cat, saw, the, …

Recursion
• "The girl thought the dog chased the cat"
VP -> V S
N -> [girl]
V -> [thought]

                      Top-down parsing

S -> NP VP
  -> Det N VP
  -> The N VP
  -> The dog VP
  -> The dog V NP
  -> The dog chased NP
  -> The dog chased Det N
  -> The dog chased the N
  -> The dog chased the cat
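The derivation above can be automated as a recursive-descent recognizer for the toy CFG (a simplification: it returns only accept/reject, not the tree, and assumes the grammar has no left recursion):

```python
# Top-down (recursive-descent) recognizer for the toy CFG above.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP"]],
    "D":  [["the"], ["a"]],
    "N":  [["dog"], ["cat"], ["garden"]],
    "V":  [["chased"], ["saw"]],
}

def parse(symbol, words, i):
    """Try to expand `symbol` at position i; yield possible end positions."""
    if symbol not in GRAMMAR:                 # terminal: must match the word
        if i < len(words) and words[i] == symbol:
            yield i + 1
        return
    for rhs in GRAMMAR[symbol]:               # try each production in turn
        positions = [i]
        for sym in rhs:                       # thread positions through the RHS
            positions = [j2 for j in positions for j2 in parse(sym, words, j)]
        yield from positions

def recognize(sentence):
    words = sentence.split()
    return len(words) in parse("S", words, 0)

print(recognize("the dog chased the cat"))   # True
print(recognize("the dog chased"))           # False (VP needs an object NP)
```

Each expansion of a production mirrors one line of the derivation on the slide.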

Context-free grammar
S   --> NP VP
NP  --> PN            (proper noun)
NP  --> Art Adj N
NP  --> Art N
VP  --> VI            (intransitive verb)
VP  --> VT NP         (transitive verb)
Art --> [the]
Adj --> [lazy]
Adj --> [rapid]
PN  --> [achilles]
N   --> [turtle]
VI  --> [sleeps]
VT  --> [beats]

                      Parse tree

[S [NP [Art the] [Adj rapid] [N turtle]]
   [VP [Vt beats] [NP [PN achilles]]]]

"the rapid turtle beats achilles"

Definite Clause Grammars: non-terminals may have arguments

S     --> NP(N) VP(N)
NP(N) --> Art(N) N(N)
VP(N) --> VI(N)
Art(singular) --> [a]
Art(singular) --> [the]
Art(plural)   --> [the]
N(singular)   --> [turtle]
N(plural)     --> [turtles]
VI(singular)  --> [sleeps]
VI(plural)    --> [sleep]

Number agreement

                      DCGs

Non-terminals may have arguments
• Variables (start with a capital), e.g. Number, Any
• Constants (start with lower case), e.g. singular, plural
• Structured terms (start with lower case and take arguments themselves), e.g. vp(V,NP)

Parsing needs to be adapted
• using unification

                      Unification in a nutshell (cf AI course)

Substitutions
E.g. {Num / singular}, {T / vp(V,NP)}

Applying a substitution
• simultaneously replace variables by the corresponding terms
• S(Num) {Num / singular} = S(singular)

                      Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.
• Art(singular) and Art(Num): gives {Num / singular}
• Art(singular) and Art(plural): fails
• Art(Num1) and Art(Num2): {Num1 / Num2}
• PN(Num, accusative) and PN(singular, Case): {Num / singular, Case / accusative}

                      Parsing with DCGs

                      Now require successful unification at each step

S -> NP(N) VP(N)
  -> Art(N) N(N) VP(N)            {N / singular}
  -> a N(singular) VP(singular)
  -> a turtle VP(singular)
  -> a turtle sleeps

"a turtle sleep" fails
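Number agreement in the turtle grammar can be emulated by trying each binding of the shared feature N, which is what unification finds for us in a real DCG parser; a sketch over the lexicon above:

```python
# DCG-style recognition with number agreement for Art N VI sentences.
LEXICON = {
    ("Art", "singular"): {"a", "the"},
    ("Art", "plural"):   {"the"},
    ("N",   "singular"): {"turtle"},
    ("N",   "plural"):   {"turtles"},
    ("VI",  "singular"): {"sleeps"},
    ("VI",  "plural"):   {"sleep"},
}

def recognize(sentence):
    """Accept Art N VI sentences whose number features all agree."""
    words = sentence.split()
    if len(words) != 3:
        return False
    art, noun, verb = words
    for num in ("singular", "plural"):   # try each binding of the feature N
        if (art  in LEXICON[("Art", num)] and
            noun in LEXICON[("N",   num)] and
            verb in LEXICON[("VI",  num)]):
            return True
    return False

print(recognize("a turtle sleeps"))    # True
print(recognize("a turtle sleep"))     # False: number clash
print(recognize("the turtles sleep"))  # True
```

"a turtle sleep" fails exactly as in the derivation: no single value of the feature satisfies all three lexicon lookups.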

                      Case Marking

PN(singular, nominative) --> [he]; [she]
PN(singular, accusative) --> [him]; [her]
PN(plural, nominative)   --> [they]
PN(plural, accusative)   --> [them]

S --> NP(Number, nominative) VP(Number)
VP(Number) --> V(Number) NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any)  --> Det N(Number)

He sees her. She sees him. They see her.
But not: Them see he.

                      DCGs

Are strictly more expressive than CFGs. They can represent, for instance, the language aⁿbⁿcⁿ:
• S(N) -> A(N) B(N) C(N)
• A(0) -> []
• B(0) -> []
• C(0) -> []
• A(s(N)) -> A(N) [A]
• B(s(N)) -> B(N) [B]
• C(s(N)) -> C(N) [C]

                      Probabilistic Models

Traditional grammar models are very rigid
• essentially a yes/no decision

Probabilistic grammars
• define a probability model for the data
• compute the probability of each alternative
• choose the most likely alternative

Illustrated on
• Shannon Game
• Spelling correction
• Parsing

                      Illustration

Wall Street Journal Corpus: 3,000,000 words; the correct parse tree for each sentence is known
• constructed by hand
• can be used to derive stochastic context-free grammars
• SCFGs assign a probability to each parse tree

Compute the most probable parse tree
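An SCFG scores a parse tree by multiplying the probabilities of the rules it uses; a sketch with made-up rule probabilities (lexical probabilities omitted for brevity):

```python
# Probability of a parse tree under an SCFG = product of its rule probabilities.
RULE_PROB = {  # invented numbers, for illustration only
    ("S",  ("NP", "VP")): 1.0,
    ("NP", ("PN",)):      0.4,
    ("NP", ("D", "N")):   0.6,
    ("VP", ("V", "NP")):  1.0,
}

def tree_prob(tree):
    """tree = (label, children); children is a word (leaf) or list of subtrees."""
    label, children = tree
    if isinstance(children, str):        # preterminal covering one word
        return 1.0                       # lexical rule probs folded out here
    rhs = tuple(child[0] for child in children)
    p = RULE_PROB[(label, rhs)]          # probability of the rule used here
    for child in children:
        p *= tree_prob(child)            # times the subtrees' probabilities
    return p

t = ("S", [("NP", [("PN", "sue")]),
           ("VP", [("V", "saw"), ("NP", [("D", "the"), ("N", "dog")])])])
print(round(tree_prob(t), 6))  # 1.0 * 0.4 * 1.0 * 0.6 = 0.24
```

Choosing among ambiguous parses then amounts to computing this score for each candidate tree and keeping the maximum.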

                      Sequences are omni-present

Therefore the techniques we will see also apply to
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

                      Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions
• Regular grammars and Finite State Automata:
  Markov Models using n-grams
  (Hidden) Markov Models
  Conditional Random Fields (as an example of using undirected graphical models)
• Probabilistic Context Free Grammars
• Probabilistic Definite Clause Grammars

All use principles of Part I on Graphical Models

                      • Advanced Artificial Intelligence
                      • Topic
                      • Contents
                      • Rationalism versus Empiricism
                      • Slide 5
                      • This course
                      • Ambiguity
                      • NLP and Statistics
                      • Slide 9
                      • Corpora
                      • Word Counts
                      • Slide 12
                      • Word Counts (Brown corpus)
                      • Slide 14
                      • Zipflsquos Law
                      • Language and sequences
                      • Key NLP Problem Ambiguity
                      • Language Model
                      • Example of bad language model
                      • A bad language model
                      • Slide 22
                      • A good language model
                      • Why language models
                      • Applications
                      • Spelling errors
                      • Handwriting recognition
                      • For Spell Checkers
                      • Another dimension in language models
                      • Sequence Tagging
                      • Slide 31
                      • Parsing
                      • Slide 33
                      • Language models based on Grammars
                      • Grammars and parsing
                      • Regular Grammars and Finite State Automata
                      • Finite State Automaton
                      • Phrase structure
                      • Notation
                      • Context Free Grammar
                      • Slide 41
                      • Top-down parsing
                      • Context-free grammar
                      • Parse tree
                      • Definite Clause Grammars Non-terminals may have arguments
                      • DCGs
                      • Unification in a nutshell (cf AI course)
                      • Unification
                      • Parsing with DCGs
                      • Case Marking
                      • Slide 51
                      • Probabilistic Models
                      • Illustration
                      • PowerPoint Presentation
                      • Sequences are omni-present
                      • Rest of the Course

                        f nf

                        1 39932 12923 6644 4105 2436 1997 1728 1319 8210 9111-50 54051-100 99gt 100 102

                        How many words appear f times

                        Word Counts

                        About half of the words occurs just onceAbout half of the text consists of the

                        100 most common wordshellip

                        Word Counts (Brown corpus)

                        Word Counts (Brown corpus)

                        word f r fr word f r frthe 3332 1 3332 turned 51 200 10200and 2972 2 5944 youlsquoll 30 300 9000a 1775 3 5235 name 21 400 8400he 877 10 8770 comes 16 500 8000but 410 20 8400 group 13 600 7800be 294 30 8820 lead 11 700 7700there 222 40 8880 friends 10 800 8000one 172 50 8600 begin 9 900 8100about 158 60 9480 family 8 1000 8000more 138 70 9660 brushed 4 2000 8000never 124 80 9920 sins 2 3000 6000Oh 116 90 10440 Could 2 4000 8000two 104 100 10400 Applausive 1 8000 8000

                        Zipflsquos Law f~1r (fr = const)

                        Zipflsquos Law

                        Minimize effort

                        Language and sequences

                        Natural language processingbull Is concerned with the analysis of

                        sequences of words sentencesbullConstruction of language models

                        Two types of modelsbullNon-probabilisticbull Probabilistic

                        Human Language is highly ambiguous at all levels

                        bull acoustic levelrecognize speech vs wreck a nice beach

                        bull morphological levelsaw to see (past) saw (noun) to saw (present inf)

                        bull syntactic levelI saw the man on the hill with a telescope

                        bull semantic levelOne book has to be read by every student

                        Key NLP Problem Ambiguity

                        Language Model

                        A formal model about language Two types

                        bull Non-probabilistic Allows one to compute whether a certain sequence

                        (sentence or part thereof) is possible Often grammar based

                        bull Probabilistic Allows one to compute the probability of a certain

                        sequence Often extends grammars with probabilities

                        Example of bad language model

                        A bad language model

                        A bad language model

                        A good language model

                        Non-Probabilisticbull ldquoI swear to tell the truthrdquo is possiblebull ldquoI swerve to smell de souprdquo is impossible

                        Probabilisticbull P(I swear to tell the truth) ~ 0001bull P(I swerve to smell de soup) ~ 0

                        Why language models

                        Consider a Shannon Gamebull Predicting the next word in the sequence

                        Statistical natural language hellip The cat is thrown out of the hellip The large green hellip Sue swallowed the large green hellip hellip

                        Model at the sentence level

                        Applications

                        Spelling correction Mobile phone texting Speech recognition Handwriting recognition Disabled users hellip

                        Spelling errors

                        They are leaving in about fifteen minuets to go to her house

                        The study was conducted mainly be John Black Hopefully all with continue smoothly in my absence Can they lave him my messages I need to notified the bank ofhellip He is trying to fine out

                        Handwriting recognition

                        Assume a note is given to a bank teller which the teller reads as I have a gub (cf Woody Allen)

                        NLP to the rescue hellipbull gub is not a wordbull gun gum Gus and gull are words but gun has a

                        higher probability in the context of a bank

                        For Spell Checkers

                        Collect list of commonly substituted wordsbull piecepeace whetherweather theirthere

                        ExampleldquoOn Tuesday the whether helliprsquorsquoldquoOn Tuesday the weather helliprdquo

                        Another dimension in language models

                        Do we mainly want to infer (probabilities) of legal sentences sequences bull So far

                        Or do we want to infer properties of these sentences bull Eg parse tree part-of-speech-taggingbullNeeded for understanding NL

                        Letrsquos look at some tasks

                        Sequence Tagging

                        Part-of-speech taggingbull He drives with his bikebull N V PR PN N noun verb preposition pronoun noun

                        Text extractionbull The job is that of a programmerbull X X X X X X JobTypebull The seminar is taking place from 1500 to 1600bull X X X X X X Start End

                        Sequence Tagging

                        Predicting the secondary structure of proteins mRNA hellipbull X = AFARLMMAhellipbull Y = hehestststhesthe hellip

                        Parsing

                        Given a sentence find its parse tree Important step in understanding NL

                        Parsing

                        In bioinformatics allows to predict (elements of) structure from sequence

                        Language models based on Grammars

                        Grammar Typesbull Regular grammars and Finite State Automatabull Context-Free Grammarsbull Definite Clause Grammars

                        A particular type of Unification Based Grammar (Prolog)

                        Distinguish lexicon from grammarbull Lexicon (dictionary) contains information about

                        words eg word - possible tags (and possibly additional information) flies - V(erb) - N(oun)

                        bull Grammar encode rules

                        Grammars and parsing Syntactic level best understood and formalized Derivation of grammatical structure parsing

                        (more than just recognition) Result of parsing mostly parse tree

                        showing the constituents of a sentence eg verb or noun phrases

                        Syntax usually specified in terms of a grammar consisting of grammar rules

                        Regular Grammars and Finite State Automata

                        Lexical information - which words are bull Det(erminer)bull N(oun)bull Vi (intransitive verb) - no

                        argumentbull Pn (pronoun) bull Vt (transitive verb) - takes an

                        argumentbull Adj (adjective)

                        Now acceptbull The cat sleptbull Det N Vi

                        As regular grammarbull S -gt [Det] S1 [ ] terminal bull S1 -gt [N] S2bull S2 -gt [Vi]

                        Lexicon bull The - Detbull Cat - Nbull Slept - Vi

                        bull hellip

                        Finite State Automaton

                        Sentences

                        bull John smiles - Pn Vibull The cat disappeared - Det N Vibull These new shoes hurt - Det Adj N Vibull John liked the old cat PN Vt Det Adj N

                        Phrase structure

                        S

                        NP

                        D N

                        VP

                        NPV

                        D N

                        PP

                        P NP

                        D N

                        the dog chased a cat into the garden

                        Notation

                        S sentence D or Det Determiner (eg articles) N noun V verb P preposition NP noun phrase VP verb phrase PP prepositional phrase

                        Context Free GrammarS -gt NP VPNP -gt D NVP -gt V NPVP -gt V NP PPPP -gt P NPD -gt [the]D -gt [a]N -gt [dog]N -gt [cat]N -gt [garden]V -gt [chased]V -gt [saw]P -gt [into]

                        Terminals ~ Lexicon

                        Phrase structure

                        Formalism of context-free grammarsbull Nonterminal symbols S NP VP bull Terminal symbols dog cat saw the

                        Recursionbull bdquoThe girl thought the dog chased the catldquo

                        VP -gt V SN -gt [girl]V -gt [thought]

                        Top-down parsing

                        S -gt NP VP S -gt Det N VP S -gt The N VP S -gt The dog VP S -gt The dog V NP S -gt The dog chased NP S -gt The dog chased Det N S-gt The dog chased the N S-gt The dog chased the cat

                        Context-free grammarSS --gt --gt NPNPVPVP

                        NPNP --gt PN --gt PN Proper nounProper noun

                        NPNP --gt Art Adj N--gt Art Adj N

                        NPNP --gt ArtN--gt ArtN

                        VPVP --gt VI --gt VI intransitive verbintransitive verb

                        VPVP --gt VT --gt VT NPNP transitive verbtransitive verb

                        ArtArt --gt [the]--gt [the]

                        AdjAdj --gt [lazy]--gt [lazy]

                        AdjAdj --gt [rapid]--gt [rapid]

                        PNPN --gt [achilles]--gt [achilles]

                        NN --gt [turtle]--gt [turtle]

                        VIVI --gt [sleeps]--gt [sleeps]

                        VTVT --gt [beats]--gt [beats]

                        Parse tree

                        SS

                        NPNP VPVP

                        ArtArt AdjAdj NN VtVt NPNP

                        PNPN

                        achillesachillesbeatsbeatsturtleturtlerapidrapidthethe

                        Definite Clause GrammarsNon-terminals may have arguments

                        SS --gt --gt NPNP((NN))VPVP((NN))

                        NP(NP(NN)) --gt Art(--gt Art(NN)N()N(NN))

                        VP(VP(NN)) --gt VI(--gt VI(NN))

                        Art(Art(singularsingular)) --gt [a]--gt [a]

                        Art(Art(singularsingular)) --gt [the]--gt [the]

                        Art(Art(pluralplural)) --gt [the]--gt [the]

                        N(N(singularsingular)) --gt [turtle]--gt [turtle]

                        N(N(pluralplural)) --gt [turtles]--gt [turtles]

                        VI(VI(singularsingular)) --gt [sleeps]--gt [sleeps]

                        VI(VI(pluralplural)) --gt [sleep]--gt [sleep]

                        Number Agreement

                        DCGs

                        Non-terminals may have argumentsbull Variables (start with capital)

                        Eg Number Any

                        bull Constants (start with lower case) Eg singular plural

                        bull Structured terms (start with lower case and take arguments themselves) Eg vp(VNP)

                        Parsing needs to be adapted bull Using unification

                        Unification in a nutshell (cf AI course)

                        Substitutions

                        Eg Num singular T vp(VNP)

                        Applying substitution bull Simultaneously replace variables by

                        corresponding termsbull S(Num) Num singular = S(singular)

                        Unification

                        Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

                        Gives Num singular

                        bull Art(singular) and Art(plural) Fails

                        bull Art(Num1) and Art(Num2) Num1 Num2

                        bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

                        Parsing with DCGs

                        Now require successful unification at each step

                        S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

                        S-gt a turtle sleep fails

                        Case Marking

                        PN(singularnominative)PN(singularnominative) --gt --gt [he][she][he][she]

                        PN(singularaccusative)PN(singularaccusative) --gt --gt [him][her][him][her]

                        PN(pluralnominative)PN(pluralnominative) --gt --gt [they][they]

                        PN(pluralaccusative)PN(pluralaccusative) --gt --gt [them][them]

                        S --gt NP(Numbernominative) NP(Number)S --gt NP(Numbernominative) NP(Number)

                        VP(Number) --gt V(Number) VP(Anyaccusative)VP(Number) --gt V(Number) VP(Anyaccusative)

                        VP(NumberCase) --gt PN(NumberCase)VP(NumberCase) --gt PN(NumberCase)

                        VP(NumberAny) --gt Det N(Number)VP(NumberAny) --gt Det N(Number)

                        He sees her She sees him They see her

                        But not Them see he

                        DCGs

                        Are strictly more expressive than CFGs Can represent for instance

                        bull S(N) -gt A(N) B(N) C(N)bull A(0) -gt [] bull B(0) -gt []bull C(0) -gt []bull A(s(N)) -gt A(N) [A]bull B(s(N)) -gt B(N) [B]bull C(s(N)) -gt C(N) [C]

                        Probabilistic Models

                        Traditional grammar models are very rigid bull essentially a yes no decision

                        Probabilistic grammarsbull Define a probability models for the databull Compute the probability of each alternativebull Choose the most likely alternative

                        Ilustrate on bull Shannon Gamebull Spelling correctionbull Parsing

                        Illustration

                        Wall Street Journal Corpus 3 000 000 words Correct parse tree for sentences known

                        bull Constructed by handbull Can be used to derive stochastic context free

                        grammarsbull SCFG assign probability to parse trees

                        Compute the most probable parse tree

Sequences are omni-present

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
• Regular grammars and Finite State Automata: Markov Models using n-grams, (Hidden) Markov Models, Conditional Random Fields (as an example of using undirected graphical models)
• Probabilistic Context-Free Grammars
• Probabilistic Definite Clause Grammars

All use principles of Part I on Graphical Models.

                        • Advanced Artificial Intelligence
                        • Topic
                        • Contents
                        • Rationalism versus Empiricism
                        • Slide 5
                        • This course
                        • Ambiguity
                        • NLP and Statistics
                        • Slide 9
                        • Corpora
                        • Word Counts
                        • Slide 12
                        • Word Counts (Brown corpus)
                        • Slide 14
                        • Zipf's Law
                        • Language and sequences
                        • Key NLP Problem Ambiguity
                        • Language Model
                        • Example of bad language model
                        • A bad language model
                        • Slide 22
                        • A good language model
                        • Why language models
                        • Applications
                        • Spelling errors
                        • Handwriting recognition
                        • For Spell Checkers
                        • Another dimension in language models
                        • Sequence Tagging
                        • Slide 31
                        • Parsing
                        • Slide 33
                        • Language models based on Grammars
                        • Grammars and parsing
                        • Regular Grammars and Finite State Automata
                        • Finite State Automaton
                        • Phrase structure
                        • Notation
                        • Context Free Grammar
                        • Slide 41
                        • Top-down parsing
                        • Context-free grammar
                        • Parse tree
                        • Definite Clause Grammars Non-terminals may have arguments
                        • DCGs
                        • Unification in a nutshell (cf AI course)
                        • Unification
                        • Parsing with DCGs
                        • Case Marking
                        • Slide 51
                        • Probabilistic Models
                        • Illustration
                        • PowerPoint Presentation
                        • Sequences are omni-present
                        • Rest of the Course

Word Counts (Brown corpus)

word    f     r    f·r       word        f    r     f·r
the     3332  1    3332      turned      51   200   10200
and     2972  2    5944      you'll      30   300   9000
a       1775  3    5325      name        21   400   8400
he      877   10   8770      comes       16   500   8000
but     410   20   8200      group       13   600   7800
be      294   30   8820      lead        11   700   7700
there   222   40   8880      friends     10   800   8000
one     172   50   8600      begin       9    900   8100
about   158   60   9480      family      8    1000  8000
more    138   70   9660      brushed     4    2000  8000
never   124   80   9920      sins        2    3000  6000
Oh      116   90   10440     Could       2    4000  8000
two     104   100  10400     Applausive  1    8000  8000

Zipf's Law: f ~ 1/r (i.e. f·r ≈ const)

Zipf's Law

Minimize effort
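A quick numeric check of the f·r ≈ const pattern, on a hand-picked subset of (frequency, rank) pairs in the spirit of the table above:

```python
# (frequency, rank) pairs, a hand-picked subset in the spirit of the
# table above (rank-1 "the" is omitted; the very top ranks deviate most).
pairs = [(2972, 2), (1775, 3), (877, 10), (294, 30),
         (104, 100), (51, 200), (8, 1000), (1, 8000)]

# Zipf's law predicts f * r ~ const: here the products stay within a
# factor of two even though f and r each span three orders of magnitude.
products = [f * r for f, r in pairs]
print(min(products), max(products))
```

The spread of the products is tiny compared to the spread of the frequencies themselves, which is exactly the law's claim.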

Language and sequences

Natural language processing:
• is concerned with the analysis of sequences of words / sentences
• construction of language models

Two types of models:
• Non-probabilistic
• Probabilistic

Key NLP Problem: Ambiguity

Human language is highly ambiguous at all levels:

• acoustic level: recognize speech vs. wreck a nice beach

• morphological level: saw: to see (past), saw (noun), to saw (present, inf)

• syntactic level: I saw the man on the hill with a telescope

• semantic level: One book has to be read by every student

Language Model

A formal model about language. Two types:

• Non-probabilistic: allows one to compute whether a certain sequence (sentence or part thereof) is possible; often grammar-based

• Probabilistic: allows one to compute the probability of a certain sequence; often extends grammars with probabilities

Example of a bad language model

A bad language model

A good language model

Non-probabilistic:
• "I swear to tell the truth" is possible
• "I swerve to smell de soup" is impossible

Probabilistic:
• P(I swear to tell the truth) ≈ 0.0001
• P(I swerve to smell de soup) ≈ 0

Why language models?

Consider a Shannon Game: predicting the next word in a sequence:
• Statistical natural language …
• The cat is thrown out of the …
• The large green …
• Sue swallowed the large green …
• …

Models language at the sentence level.
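The Shannon game can be played mechanically with bigram counts: guess the most frequent successor of the current word. The toy corpus below is our stand-in; a real model is trained on millions of words:

```python
from collections import Counter, defaultdict

# Toy corpus (our stand-in); a real model is trained on a large corpus.
corpus = ("the cat is thrown out of the house "
          "the cat is on the mat the dog is in the house").split()

# Count which word follows which.
follow = defaultdict(Counter)
for w1, w2 in zip(corpus, corpus[1:]):
    follow[w1][w2] += 1

def predict(word):
    """Most frequent successor of `word` -- the Shannon-game guess."""
    return follow[word].most_common(1)[0][0]
```

For example, `predict("cat")` returns "is", since "is" is the only word that ever follows "cat" in this corpus.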

Applications

• Spelling correction
• Mobile phone texting
• Speech recognition
• Handwriting recognition
• Disabled users
• …

Spelling errors

• They are leaving in about fifteen minuets to go to her house.
• The study was conducted mainly be John Black.
• Hopefully, all with continue smoothly in my absence.
• Can they lave him my messages?
• I need to notified the bank of …
• He is trying to fine out.

(The errors are the point: each sentence contains a real word that is wrong in context.)

Handwriting recognition

Assume a note is given to a bank teller, which the teller reads as "I have a gub." (cf. Woody Allen)

NLP to the rescue:
• gub is not a word
• gun, gum, Gus, and gull are words, but gun has a higher probability in the context of a bank

For Spell Checkers

Collect a list of commonly substituted words:
• piece/peace, whether/weather, their/there

Example:
"On Tuesday, the whether …" →
"On Tuesday, the weather …"
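A sketch of the confusion-set idea: choose whichever variant is more frequent after the preceding word. The counts below are invented for illustration; a real checker estimates them from a corpus:

```python
# Invented context counts (illustrative only); a real spell checker
# would estimate P(word | context) from a large corpus.
context_counts = {
    ("the", "whether"): 1, ("the", "weather"): 120,
    ("or", "whether"): 90, ("or", "weather"): 2,
}
CONFUSION = {"whether": "weather", "weather": "whether"}

def correct(prev, word):
    """Pick the member of the confusion set more frequent after `prev`."""
    alt = CONFUSION.get(word, word)
    return max((word, alt), key=lambda w: context_counts.get((prev, w), 0))
```

So `correct("the", "whether")` picks "weather", as in the Tuesday example, while `correct("or", "weather")` picks "whether".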

Another dimension in language models

Do we mainly want to infer (probabilities of) legal sentences / sequences?
• So far, yes.

Or do we want to infer properties of these sentences?
• E.g. parse tree, part-of-speech tagging
• Needed for understanding NL

Let's look at some tasks.

Sequence Tagging

Part-of-speech tagging:
• He drives with his bike
• N V PR PN N (noun, verb, preposition, pronoun, noun)

Text extraction:
• The job is that of a programmer.
• X X X X X X JobType
• The seminar is taking place from 15:00 to 16:00.
• X X X X X X Start End

Sequence Tagging

Predicting the secondary structure of proteins, mRNA, …:
• X = AFARLMMA…
• Y = hehestststhesthe…

Parsing

Given a sentence, find its parse tree. An important step in understanding NL.

Parsing

In bioinformatics, parsing allows one to predict (elements of) structure from sequence.

Language models based on Grammars

Grammar types:
• Regular grammars and Finite State Automata
• Context-Free Grammars
• Definite Clause Grammars: a particular type of Unification-Based Grammar (Prolog)

Distinguish the lexicon from the grammar:
• The lexicon (dictionary) contains information about words, e.g. word → possible tags (and possibly additional information): flies → V(erb), N(oun)
• The grammar encodes rules.

Grammars and parsing

The syntactic level is best understood and formalized. Derivation of grammatical structure: parsing (more than just recognition). The result of parsing is mostly a parse tree, showing the constituents of a sentence, e.g. verb or noun phrases. Syntax is usually specified in terms of a grammar consisting of grammar rules.

Regular Grammars and Finite State Automata

Lexical information — which words are:
• Det(erminer)
• N(oun)
• Pn (pronoun)
• Vi (intransitive verb) — takes no argument
• Vt (transitive verb) — takes an argument
• Adj (adjective)

Now accept:
• The cat slept
• Det N Vi

As regular grammar:
• S  -> [Det] S1    ([ ] marks a terminal)
• S1 -> [N] S2
• S2 -> [Vi]

Lexicon:
• the — Det
• cat — N
• slept — Vi
• …

Finite State Automaton

Sentences:
• John smiles — Pn Vi
• The cat disappeared — Det N Vi
• These new shoes hurt — Det Adj N Vi
• John liked the old cat — Pn Vt Det Adj N
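The regular grammar above, run as a finite state automaton over tag sequences. The extra Pn and Adj transitions (to cover the sentences just listed) are our addition, so this is a sketch rather than the slide's exact automaton:

```python
# Regular grammar S -> [Det] S1, S1 -> [N] S2, S2 -> [Vi] as an FSA over
# tag sequences.  The Pn and Adj transitions are our extension so that
# "Pn Vi" and "Det Adj N Vi" sentences are also accepted.
TRANS = {
    ("S",  "Det"): "S1",
    ("S",  "Pn"):  "S2",
    ("S1", "Adj"): "S1",   # allow any number of adjectives
    ("S1", "N"):   "S2",
    ("S2", "Vi"):  "END",
}

def accepts(tags):
    """Run the automaton; accept iff it ends in the final state."""
    state = "S"
    for t in tags:
        state = TRANS.get((state, t))
        if state is None:
            return False
    return state == "END"
```

For example, `accepts(["Det", "Adj", "N", "Vi"])` is True ("These new shoes hurt"), while `accepts(["Det", "N"])` is False because the run stops before the final state.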

Phrase structure

Parse tree for "the dog chased a cat into the garden":
[S [NP [D the] [N dog]] [VP [V chased] [NP [D a] [N cat]] [PP [P into] [NP [D the] [N garden]]]]]

Notation

S: sentence; D or Det: determiner (e.g. articles); N: noun; V: verb; P: preposition; NP: noun phrase; VP: verb phrase; PP: prepositional phrase

Context Free Grammar

S  -> NP VP
NP -> D N
VP -> V NP
VP -> V NP PP
PP -> P NP
D  -> [the]
D  -> [a]
N  -> [dog]
N  -> [cat]
N  -> [garden]
V  -> [chased]
V  -> [saw]
P  -> [into]

Terminals ~ Lexicon

Phrase structure

Formalism of context-free grammars:
• Nonterminal symbols: S, NP, VP, …
• Terminal symbols: dog, cat, saw, the, …

Recursion:
• "The girl thought the dog chased the cat"
• VP -> V S
• N -> [girl]
• V -> [thought]

Top-down parsing

S -> NP VP
  -> Det N VP
  -> The N VP
  -> The dog VP
  -> The dog V NP
  -> The dog chased NP
  -> The dog chased Det N
  -> The dog chased the N
  -> The dog chased the cat

Context-free grammar

S   --> NP VP
NP  --> PN          (proper noun)
NP  --> Art Adj N
NP  --> Art N
VP  --> VI          (intransitive verb)
VP  --> VT NP       (transitive verb)
Art --> [the]
Adj --> [lazy]
Adj --> [rapid]
PN  --> [achilles]
N   --> [turtle]
VI  --> [sleeps]
VT  --> [beats]
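The top-down derivation above can be mechanised as a naive recursive-descent recogniser for this turtle grammar (a sketch; real parsers add memoisation and build trees):

```python
# The turtle grammar above, as data.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["PN"], ["Art", "Adj", "N"], ["Art", "N"]],
    "VP": [["VI"], ["VT", "NP"]],
}
LEXICON = {"Art": {"the"}, "Adj": {"lazy", "rapid"},
           "PN": {"achilles"}, "N": {"turtle"},
           "VI": {"sleeps"}, "VT": {"beats"}}

def parse(symbol, words):
    """Yield every suffix of `words` left over after deriving `symbol`,
    trying productions top-down, left to right."""
    if symbol in LEXICON:
        if words and words[0] in LEXICON[symbol]:
            yield words[1:]
        return
    for production in GRAMMAR[symbol]:
        suffixes = [words]
        for sym in production:
            suffixes = [rest for w in suffixes for rest in parse(sym, w)]
        yield from suffixes

def accepts(sentence):
    """A sentence is grammatical iff some derivation of S consumes it."""
    return any(rest == [] for rest in parse("S", sentence.split()))
```

So `accepts("the rapid turtle beats achilles")` succeeds via NP -> Art Adj N and VP -> VT NP, while `accepts("the turtle sleep")` fails because only [sleeps] is in the VI lexicon.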

Parse tree

Parse tree for "the rapid turtle beats achilles":
[S [NP [Art the] [Adj rapid] [N turtle]] [VP [VT beats] [NP [PN achilles]]]]

Definite Clause Grammars: non-terminals may have arguments

S     --> NP(N), VP(N)
NP(N) --> Art(N), N(N)
VP(N) --> VI(N)
Art(singular) --> [a]
Art(singular) --> [the]
Art(plural)   --> [the]
N(singular)   --> [turtle]
N(plural)     --> [turtles]
VI(singular)  --> [sleeps]
VI(plural)    --> [sleep]

Number agreement

DCGs

Non-terminals may have arguments:
• Variables (start with a capital), e.g. Number, Any
• Constants (start with lower case), e.g. singular, plural
• Structured terms (start with lower case and take arguments themselves), e.g. vp(V, NP)

Parsing needs to be adapted:
• using unification

Unification in a nutshell (cf. AI course)

Substitutions, e.g. {Num / singular}, {T / vp(V, NP)}

Applying a substitution:
• simultaneously replace variables by the corresponding terms
• S(Num) {Num / singular} = S(singular)

Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.:
• Art(singular) and Art(Num): gives {Num / singular}
• Art(singular) and Art(plural): fails
• Art(Num1) and Art(Num2): {Num1 / Num2}
• PN(Num, accusative) and PN(singular, Case): {Num / singular, Case / accusative}
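The examples above in executable form: a minimal unifier in which capitalised strings are variables, lower-case strings are constants, and tuples are compound terms. This is a sketch (no occurs check, single-step dereferencing), not a full Prolog unifier:

```python
def unify(t1, t2, subst=None):
    """Most general unifier for simple terms: variables are capitalised
    strings, constants lower-case strings, compound terms tuples.
    Returns a substitution dict, or None on failure.
    Sketch only: no occurs check, single-step dereferencing."""
    if subst is None:
        subst = {}
    t1, t2 = subst.get(t1, t1), subst.get(t2, t2)   # dereference once
    if t1 == t2:
        return subst
    for a, b in ((t1, t2), (t2, t1)):
        if isinstance(a, str) and a[:1].isupper():   # a is a variable
            return {**subst, a: b}
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for x, y in zip(t1, t2):                     # unify argument-wise
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None
```

It reproduces the slide's cases: `unify(("pn", "Num", "accusative"), ("pn", "singular", "Case"))` gives {Num / singular, Case / accusative}, and `unify("singular", "plural")` fails.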

Parsing with DCGs

Now require successful unification at each step:

S -> NP(N) VP(N)
  -> Art(N) N(N) VP(N)    {N / singular}
  -> a N(singular) VP(singular)
  -> a turtle VP(singular)
  -> a turtle sleeps

S -> a turtle sleep: fails
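The same derivation, sketched as a recogniser that threads the shared number argument N through the rules S -> NP(N) VP(N), NP(N) -> Art(N), N(N), VP(N) -> VI(N). The lexicon is from the slides; the set intersection stands in for unification of the shared variable:

```python
# Lexicon from the slides: each word maps to its admissible numbers.
LEX = {
    ("art", "a"): {"singular"}, ("art", "the"): {"singular", "plural"},
    ("n", "turtle"): {"singular"}, ("n", "turtles"): {"plural"},
    ("vi", "sleeps"): {"singular"}, ("vi", "sleep"): {"plural"},
}

def accepts(sentence):
    """Recognise Art N VI sentences with number agreement: the shared
    variable N must unify, i.e. the admissible number sets of all
    three words must have a non-empty intersection."""
    words = sentence.split()
    if len(words) != 3:
        return False
    art, noun, verb = words
    numbers = (LEX.get(("art", art), set())
               & LEX.get(("n", noun), set())
               & LEX.get(("vi", verb), set()))
    return bool(numbers)
```

"a turtle sleeps" succeeds with N = singular; "a turtle sleep" fails exactly as in the derivation, because {singular} and {plural} do not intersect.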

Case Marking

PN(singular, nominative) --> [he]
PN(singular, nominative) --> [she]

PN(singular, accusative) --> [him]
PN(singular, accusative) --> [her]

                          PN(pluralnominative)PN(pluralnominative) --gt --gt [they][they]

                          PN(pluralaccusative)PN(pluralaccusative) --gt --gt [them][them]

                          S --gt NP(Numbernominative) NP(Number)S --gt NP(Numbernominative) NP(Number)

                          VP(Number) --gt V(Number) VP(Anyaccusative)VP(Number) --gt V(Number) VP(Anyaccusative)

                          VP(NumberCase) --gt PN(NumberCase)VP(NumberCase) --gt PN(NumberCase)

                          VP(NumberAny) --gt Det N(Number)VP(NumberAny) --gt Det N(Number)

                          He sees her She sees him They see her

                          But not Them see he

                          DCGs

                          Are strictly more expressive than CFGs Can represent for instance

                          bull S(N) -gt A(N) B(N) C(N)bull A(0) -gt [] bull B(0) -gt []bull C(0) -gt []bull A(s(N)) -gt A(N) [A]bull B(s(N)) -gt B(N) [B]bull C(s(N)) -gt C(N) [C]

                          Probabilistic Models

                          Traditional grammar models are very rigid bull essentially a yes no decision

                          Probabilistic grammarsbull Define a probability models for the databull Compute the probability of each alternativebull Choose the most likely alternative

                          Ilustrate on bull Shannon Gamebull Spelling correctionbull Parsing

                          Illustration

                          Wall Street Journal Corpus 3 000 000 words Correct parse tree for sentences known

                          bull Constructed by handbull Can be used to derive stochastic context free

                          grammarsbull SCFG assign probability to parse trees

                          Compute the most probable parse tree

                          Sequences are omni-present

                          Therefore the techniques we will see also apply tobull Bioinformatics

                          DNA proteins mRNA hellip can all be represented as strings

                          bullRobotics Sequences of actions states hellip

                          bullhellip

                          Rest of the Course

                          Limitations traditional grammar models motivate probabilistic extensions bull Regular grammars and Finite State Automata

                          All use principles of Part I on Graphical Models Markov Models using n-gramms (Hidden) Markov Models Conditional Random Fields

                          bull As an example of using undirected graphical models

                          bull Probabilistic Context Free Grammarsbull Probabilistic Definite Clause Grammars

                          • Advanced Artificial Intelligence
                          • Topic
                          • Contents
                          • Rationalism versus Empiricism
                          • Slide 5
                          • This course
                          • Ambiguity
                          • NLP and Statistics
                          • Slide 9
                          • Corpora
                          • Word Counts
                          • Slide 12
                          • Word Counts (Brown corpus)
                          • Slide 14
                          • Zipflsquos Law
                          • Language and sequences
                          • Key NLP Problem Ambiguity
                          • Language Model
                          • Example of bad language model
                          • A bad language model
                          • Slide 22
                          • A good language model
                          • Why language models
                          • Applications
                          • Spelling errors
                          • Handwriting recognition
                          • For Spell Checkers
                          • Another dimension in language models
                          • Sequence Tagging
                          • Slide 31
                          • Parsing
                          • Slide 33
                          • Language models based on Grammars
                          • Grammars and parsing
                          • Regular Grammars and Finite State Automata
                          • Finite State Automaton
                          • Phrase structure
                          • Notation
                          • Context Free Grammar
                          • Slide 41
                          • Top-down parsing
                          • Context-free grammar
                          • Parse tree
                          • Definite Clause Grammars Non-terminals may have arguments
                          • DCGs
                          • Unification in a nutshell (cf AI course)
                          • Unification
                          • Parsing with DCGs
                          • Case Marking
                          • Slide 51
                          • Probabilistic Models
                          • Illustration
                          • PowerPoint Presentation
                          • Sequences are omni-present
                          • Rest of the Course

                            Word Counts (Brown corpus)

                            word f r fr word f r frthe 3332 1 3332 turned 51 200 10200and 2972 2 5944 youlsquoll 30 300 9000a 1775 3 5235 name 21 400 8400he 877 10 8770 comes 16 500 8000but 410 20 8400 group 13 600 7800be 294 30 8820 lead 11 700 7700there 222 40 8880 friends 10 800 8000one 172 50 8600 begin 9 900 8100about 158 60 9480 family 8 1000 8000more 138 70 9660 brushed 4 2000 8000never 124 80 9920 sins 2 3000 6000Oh 116 90 10440 Could 2 4000 8000two 104 100 10400 Applausive 1 8000 8000

                            Zipflsquos Law f~1r (fr = const)

                            Zipflsquos Law

                            Minimize effort

                            Language and sequences

                            Natural language processingbull Is concerned with the analysis of

                            sequences of words sentencesbullConstruction of language models

                            Two types of modelsbullNon-probabilisticbull Probabilistic

                            Human Language is highly ambiguous at all levels

                            bull acoustic levelrecognize speech vs wreck a nice beach

                            bull morphological levelsaw to see (past) saw (noun) to saw (present inf)

                            bull syntactic levelI saw the man on the hill with a telescope

                            bull semantic levelOne book has to be read by every student

                            Key NLP Problem Ambiguity

                            Language Model

                            A formal model about language Two types

                            bull Non-probabilistic Allows one to compute whether a certain sequence

                            (sentence or part thereof) is possible Often grammar based

                            bull Probabilistic Allows one to compute the probability of a certain

                            sequence Often extends grammars with probabilities

                            Example of bad language model

                            A bad language model

                            A bad language model

                            A good language model

                            Non-Probabilisticbull ldquoI swear to tell the truthrdquo is possiblebull ldquoI swerve to smell de souprdquo is impossible

                            Probabilisticbull P(I swear to tell the truth) ~ 0001bull P(I swerve to smell de soup) ~ 0

                            Why language models

                            Consider a Shannon Gamebull Predicting the next word in the sequence

                            Statistical natural language hellip The cat is thrown out of the hellip The large green hellip Sue swallowed the large green hellip hellip

                            Model at the sentence level

                            Applications

                            Spelling correction Mobile phone texting Speech recognition Handwriting recognition Disabled users hellip

                            Spelling errors

                            They are leaving in about fifteen minuets to go to her house

                            The study was conducted mainly be John Black Hopefully all with continue smoothly in my absence Can they lave him my messages I need to notified the bank ofhellip He is trying to fine out

                            Handwriting recognition

                            Assume a note is given to a bank teller which the teller reads as I have a gub (cf Woody Allen)

                            NLP to the rescue hellipbull gub is not a wordbull gun gum Gus and gull are words but gun has a

                            higher probability in the context of a bank

                            For Spell Checkers

                            Collect list of commonly substituted wordsbull piecepeace whetherweather theirthere

                            ExampleldquoOn Tuesday the whether helliprsquorsquoldquoOn Tuesday the weather helliprdquo

                            Another dimension in language models

                            Do we mainly want to infer (probabilities) of legal sentences sequences bull So far

                            Or do we want to infer properties of these sentences bull Eg parse tree part-of-speech-taggingbullNeeded for understanding NL

                            Letrsquos look at some tasks

                            Sequence Tagging

                            Part-of-speech taggingbull He drives with his bikebull N V PR PN N noun verb preposition pronoun noun

                            Text extractionbull The job is that of a programmerbull X X X X X X JobTypebull The seminar is taking place from 1500 to 1600bull X X X X X X Start End

                            Sequence Tagging

                            Predicting the secondary structure of proteins mRNA hellipbull X = AFARLMMAhellipbull Y = hehestststhesthe hellip

                            Parsing

                            Given a sentence find its parse tree Important step in understanding NL

                            Parsing

                            In bioinformatics allows to predict (elements of) structure from sequence

                            Language models based on Grammars

                            Grammar Typesbull Regular grammars and Finite State Automatabull Context-Free Grammarsbull Definite Clause Grammars

                            A particular type of Unification Based Grammar (Prolog)

                            Distinguish lexicon from grammarbull Lexicon (dictionary) contains information about

                            words eg word - possible tags (and possibly additional information) flies - V(erb) - N(oun)

                            bull Grammar encode rules

                            Grammars and parsing Syntactic level best understood and formalized Derivation of grammatical structure parsing

                            (more than just recognition) Result of parsing mostly parse tree

                            showing the constituents of a sentence eg verb or noun phrases

                            Syntax usually specified in terms of a grammar consisting of grammar rules

                            Regular Grammars and Finite State Automata

                            Lexical information - which words are bull Det(erminer)bull N(oun)bull Vi (intransitive verb) - no

                            argumentbull Pn (pronoun) bull Vt (transitive verb) - takes an

                            argumentbull Adj (adjective)

                            Now acceptbull The cat sleptbull Det N Vi

                            As regular grammarbull S -gt [Det] S1 [ ] terminal bull S1 -gt [N] S2bull S2 -gt [Vi]

                            Lexicon bull The - Detbull Cat - Nbull Slept - Vi

                            bull hellip

                            Finite State Automaton

                            Sentences

                            bull John smiles - Pn Vibull The cat disappeared - Det N Vibull These new shoes hurt - Det Adj N Vibull John liked the old cat PN Vt Det Adj N

                            Phrase structure

                            S

                            NP

                            D N

                            VP

                            NPV

                            D N

                            PP

                            P NP

                            D N

                            the dog chased a cat into the garden

                            Notation

                            S sentence D or Det Determiner (eg articles) N noun V verb P preposition NP noun phrase VP verb phrase PP prepositional phrase

                            Context Free GrammarS -gt NP VPNP -gt D NVP -gt V NPVP -gt V NP PPPP -gt P NPD -gt [the]D -gt [a]N -gt [dog]N -gt [cat]N -gt [garden]V -gt [chased]V -gt [saw]P -gt [into]

                            Terminals ~ Lexicon

                            Phrase structure

                            Formalism of context-free grammarsbull Nonterminal symbols S NP VP bull Terminal symbols dog cat saw the

                            Recursionbull bdquoThe girl thought the dog chased the catldquo

                            VP -gt V SN -gt [girl]V -gt [thought]

                            Top-down parsing

                            S -gt NP VP S -gt Det N VP S -gt The N VP S -gt The dog VP S -gt The dog V NP S -gt The dog chased NP S -gt The dog chased Det N S-gt The dog chased the N S-gt The dog chased the cat

                            Context-free grammarSS --gt --gt NPNPVPVP

                            NPNP --gt PN --gt PN Proper nounProper noun

                            NPNP --gt Art Adj N--gt Art Adj N

                            NPNP --gt ArtN--gt ArtN

                            VPVP --gt VI --gt VI intransitive verbintransitive verb

                            VPVP --gt VT --gt VT NPNP transitive verbtransitive verb

                            ArtArt --gt [the]--gt [the]

                            AdjAdj --gt [lazy]--gt [lazy]

                            AdjAdj --gt [rapid]--gt [rapid]

                            PNPN --gt [achilles]--gt [achilles]

                            NN --gt [turtle]--gt [turtle]

                            VIVI --gt [sleeps]--gt [sleeps]

                            VTVT --gt [beats]--gt [beats]

                            Parse tree

                            SS

                            NPNP VPVP

                            ArtArt AdjAdj NN VtVt NPNP

                            PNPN

                            achillesachillesbeatsbeatsturtleturtlerapidrapidthethe

                            Definite Clause GrammarsNon-terminals may have arguments

                            SS --gt --gt NPNP((NN))VPVP((NN))

                            NP(NP(NN)) --gt Art(--gt Art(NN)N()N(NN))

                            VP(VP(NN)) --gt VI(--gt VI(NN))

                            Art(Art(singularsingular)) --gt [a]--gt [a]

                            Art(Art(singularsingular)) --gt [the]--gt [the]

                            Art(Art(pluralplural)) --gt [the]--gt [the]

                            N(N(singularsingular)) --gt [turtle]--gt [turtle]

                            N(N(pluralplural)) --gt [turtles]--gt [turtles]

                            VI(VI(singularsingular)) --gt [sleeps]--gt [sleeps]

                            VI(VI(pluralplural)) --gt [sleep]--gt [sleep]

                            Number Agreement

                            DCGs

                            Non-terminals may have argumentsbull Variables (start with capital)

                            Eg Number Any

                            bull Constants (start with lower case) Eg singular plural

                            bull Structured terms (start with lower case and take arguments themselves) Eg vp(VNP)

                            Parsing needs to be adapted bull Using unification

                            Unification in a nutshell (cf AI course)

Substitutions
E.g. {Num/singular}, {T/vp(V, NP)}

Applying a substitution:
• Simultaneously replace variables by the corresponding terms
• S(Num) {Num/singular} = S(singular)

                            Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.:
• Art(singular) and Art(Num): gives {Num/singular}
• Art(singular) and Art(plural): fails
• Art(Num1) and Art(Num2): {Num1/Num2}
• PN(Num, accusative) and PN(singular, Case): {Num/singular, Case/accusative}
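The unification step above can be sketched in a few lines of Python. This is a toy implementation of my own (representation and names are assumptions, and the occurs check is omitted): non-terminals are tuples such as `('art', 'Num')`, and variables are capitalized strings.

```python
def is_var(t):
    """Variables are capitalized strings, e.g. 'Num', 'Case'."""
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings already recorded in the substitution."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(t1, t2, subst=None):
    """Compute a most general substitution making t1 and t2 identical, or None."""
    if subst is None:
        subst = {}
    t1, t2 = walk(t1, subst), walk(t2, subst)
    if t1 == t2:
        return subst
    if is_var(t1):
        return {**subst, t1: t2}
    if is_var(t2):
        return {**subst, t2: t1}
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for a, b in zip(t1, t2):          # unify argument by argument
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None                           # constant clash: fail
```

For example, `unify(('pn', 'Num', 'accusative'), ('pn', 'singular', 'Case'))` yields `{'Num': 'singular', 'Case': 'accusative'}`, matching the last bullet above.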

                            Parsing with DCGs

                            Now require successful unification at each step

S --> NP(N) VP(N)
  --> Art(N) N(N) VP(N)          with {N/singular}
  --> a N(singular) VP(singular)
  --> a turtle VP(singular)
  --> a turtle sleeps

"a turtle sleep" fails

                            Case Marking

PN(singular, nominative) --> [he] ; [she]
PN(singular, accusative) --> [him] ; [her]
PN(plural, nominative) --> [they]
PN(plural, accusative) --> [them]

S --> NP(Number, nominative), VP(Number)
VP(Number) --> V(Number), NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any) --> Det, N(Number)

Accepted: "He sees her", "She sees him", "They see her"

But not: "Them see he"
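A minimal recognizer for the case-marking fragment above can be hard-coded; this is a sketch under my own assumptions (the lexicon dictionary and function names are invented, and NPs are restricted to pronouns, so it only covers three-word PN-V-PN sentences):

```python
# Lexical entries: pronouns carry (category, number, case); verbs (category, number).
LEX = {
    'he':   ('pn', 'singular', 'nominative'),
    'she':  ('pn', 'singular', 'nominative'),
    'him':  ('pn', 'singular', 'accusative'),
    'her':  ('pn', 'singular', 'accusative'),
    'they': ('pn', 'plural',   'nominative'),
    'them': ('pn', 'plural',   'accusative'),
    'sees': ('v',  'singular'),
    'see':  ('v',  'plural'),
}

def grammatical(sentence):
    """S --> NP(Number, nominative), VP(Number);
    VP(Number) --> V(Number), NP(Any, accusative)."""
    words = sentence.lower().split()
    if len(words) != 3 or any(w not in LEX for w in words):
        return False
    subj, verb, obj = (LEX[w] for w in words)
    return (subj[0] == 'pn' and subj[2] == 'nominative' and   # nominative subject
            verb[0] == 'v' and verb[1] == subj[1] and         # number agreement
            obj[0] == 'pn' and obj[2] == 'accusative')        # accusative object
```

So `grammatical("He sees her")` holds, while `grammatical("Them see he")` fails on the case constraints, as in the slide.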

                            DCGs

Are strictly more expressive than CFGs. Can represent, for instance, the language AⁿBⁿCⁿ (which is not context-free):
• S(N) --> A(N), B(N), C(N)
• A(0) --> []
• B(0) --> []
• C(0) --> []
• A(s(N)) --> A(N), [A]
• B(s(N)) --> B(N), [B]
• C(s(N)) --> C(N), [C]
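The DCG above generates exactly the strings AⁿBⁿCⁿ; a direct recognizer for that language is a short sketch (function name is my own):

```python
def accepts_anbncn(s):
    """True iff s is 'A'*n + 'B'*n + 'C'*n for some n >= 0 (the DCG's language)."""
    n = len(s) // 3
    return s == 'A' * n + 'B' * n + 'C' * n
```

No CFG can do this: context-free grammars can match two counts (as in AⁿBⁿ) but not three, which is why the DCG's shared argument N adds real expressive power.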

                            Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• Define a probability model for the data
• Compute the probability of each alternative
• Choose the most likely alternative

Illustrated on:
• Shannon game
• Spelling correction
• Parsing

                            Illustration

Wall Street Journal corpus: 3,000,000 words. The correct parse tree for each sentence is known:
• Constructed by hand
• Can be used to derive stochastic context-free grammars
• SCFGs assign a probability to each parse tree

Compute the most probable parse tree.
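To make "SCFGs assign a probability to each parse tree" concrete, here is a toy sketch; the grammar and the rule probabilities are invented for illustration (not derived from the WSJ treebank). The probability of a tree is the product of the probabilities of the rules used in it:

```python
# Made-up SCFG: each (lhs, rhs) rule has a probability; probabilities of the
# rules for the same lhs sum to 1.
RULE_P = {
    ('S', ('NP', 'VP')): 1.0,
    ('NP', ('D', 'N')): 0.6,
    ('NP', ('PN',)): 0.4,
    ('VP', ('V', 'NP')): 1.0,
    ('D', ('the',)): 1.0,
    ('N', ('dog',)): 0.5,
    ('N', ('cat',)): 0.5,
    ('PN', ('achilles',)): 1.0,
    ('V', ('chased',)): 1.0,
}

def tree_prob(tree):
    """tree = (label, child1, ..., childN); leaves are plain strings."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULE_P[(label, rhs)]
    for c in children:
        if not isinstance(c, str):
            p *= tree_prob(c)          # multiply in the subtrees' probabilities
    return p

t = ('S', ('NP', ('D', 'the'), ('N', 'dog')),
          ('VP', ('V', 'chased'), ('NP', ('PN', 'achilles'))))
```

Here `tree_prob(t)` multiplies 1.0 · (0.6 · 1.0 · 0.5) · (1.0 · 1.0 · 0.4 · 1.0) = 0.12; choosing between competing parses of one sentence means comparing such products.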

                            Sequences are omni-present

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

                            Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
• Regular grammars and finite state automata: Markov models using n-grams, (hidden) Markov models
• Conditional random fields, as an example of using undirected graphical models
• Probabilistic context-free grammars
• Probabilistic definite clause grammars

All use principles of Part I on graphical models.

                            • Advanced Artificial Intelligence
                            • Topic
                            • Contents
                            • Rationalism versus Empiricism
                            • Slide 5
                            • This course
                            • Ambiguity
                            • NLP and Statistics
                            • Slide 9
                            • Corpora
                            • Word Counts
                            • Slide 12
                            • Word Counts (Brown corpus)
                            • Slide 14
• Zipf's Law
                            • Language and sequences
                            • Key NLP Problem Ambiguity
                            • Language Model
                            • Example of bad language model
                            • A bad language model
                            • Slide 22
                            • A good language model
                            • Why language models
                            • Applications
                            • Spelling errors
                            • Handwriting recognition
                            • For Spell Checkers
                            • Another dimension in language models
                            • Sequence Tagging
                            • Slide 31
                            • Parsing
                            • Slide 33
                            • Language models based on Grammars
                            • Grammars and parsing
                            • Regular Grammars and Finite State Automata
                            • Finite State Automaton
                            • Phrase structure
                            • Notation
                            • Context Free Grammar
                            • Slide 41
                            • Top-down parsing
                            • Context-free grammar
                            • Parse tree
                            • Definite Clause Grammars Non-terminals may have arguments
                            • DCGs
                            • Unification in a nutshell (cf AI course)
                            • Unification
                            • Parsing with DCGs
                            • Case Marking
                            • Slide 51
                            • Probabilistic Models
                            • Illustration
                            • PowerPoint Presentation
                            • Sequences are omni-present
                            • Rest of the Course

word    f     r    f·r      word        f    r     f·r
the     3332  1    3332     turned      51   200   10200
and     2972  2    5944     you'll      30   300   9000
a       1775  3    5325     name        21   400   8400
he      877   10   8770     comes       16   500   8000
but     410   20   8200     group       13   600   7800
be      294   30   8820     lead        11   700   7700
there   222   40   8880     friends     10   800   8000
one     172   50   8600     begin       9    900   8100
about   158   60   9480     family      8    1000  8000
more    138   70   9660     brushed     4    2000  8000
never   124   80   9920     sins        2    3000  6000
Oh      116   90   10440    Could       2    4000  8000
two     104   100  10400    Applausive  1    8000  8000

Zipf's law: f ~ 1/r  (f · r ≈ const)

Zipf's Law

                              Minimize effort
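The table above can be checked mechanically: f · r stays within a narrow band even though the rank r spans nearly four orders of magnitude. A quick sketch, using values copied from the table:

```python
# (word, frequency f, rank r) rows sampled from the table above.
rows = [('the', 3332, 1), ('a', 1775, 3), ('he', 877, 10),
        ('one', 172, 50), ('turned', 51, 200), ('comes', 16, 500),
        ('family', 8, 1000), ('Applausive', 1, 8000)]

products = [f * r for _, f, r in rows]
spread = max(products) / min(products)   # how far from "constant" f*r really is
```

While r varies by a factor of 8000, the products only range from 3332 to 10200 (a factor of about 3), which is the empirical content of f ~ 1/r.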

                              Language and sequences

Natural language processing:
• Is concerned with the analysis of sequences of words / sentences
• Construction of language models

Two types of models:
• Non-probabilistic
• Probabilistic

Human language is highly ambiguous, at all levels:

• acoustic level: "recognize speech" vs. "wreck a nice beach"
• morphological level: saw: to see (past), saw (noun), to saw (present, inf.)
• syntactic level: I saw the man on the hill with a telescope
• semantic level: One book has to be read by every student

                              Key NLP Problem Ambiguity

                              Language Model

A formal model of language. Two types:

• Non-probabilistic: allows one to compute whether a certain sequence (sentence or part thereof) is possible; often grammar based
• Probabilistic: allows one to compute the probability of a certain sequence; often extends grammars with probabilities

                              Example of bad language model

                              A bad language model

                              A bad language model

                              A good language model

Non-probabilistic:
• "I swear to tell the truth" is possible
• "I swerve to smell de soup" is impossible

Probabilistic:
• P(I swear to tell the truth) ≈ 0.001
• P(I swerve to smell de soup) ≈ 0

                              Why language models

Consider a Shannon game:
• Predicting the next word in a sequence:
  Statistical natural language …
  The cat is thrown out of the …
  The large green …
  Sue swallowed the large green …
  …

                              Model at the sentence level
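The Shannon game can be played with a simple bigram model: count which word follows which, then guess the most frequent continuation. A sketch with an invented toy corpus (the corpus and function names are my own):

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count, for each word, how often each other word follows it."""
    nxt = defaultdict(Counter)
    for sent in corpus:
        for w1, w2 in zip(sent, sent[1:]):
            nxt[w1][w2] += 1
    return nxt

def predict(nxt, word):
    """Shannon game: guess the most likely next word, or None if unseen."""
    return nxt[word].most_common(1)[0][0] if nxt[word] else None

corpus = [['the', 'cat', 'is', 'thrown', 'out', 'of', 'the', 'house'],
          ['the', 'cat', 'sleeps'],
          ['the', 'dog', 'chased', 'the', 'cat']]
model = train_bigrams(corpus)
```

On this corpus, `predict(model, 'the')` returns `'cat'` (3 of the 5 occurrences of "the" are followed by "cat"). Real systems use the same idea with n-grams over millions of words plus smoothing for unseen events.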

                              Applications

• Spelling correction
• Mobile phone texting
• Speech recognition
• Handwriting recognition
• Disabled users
• …

                              Spelling errors

                              They are leaving in about fifteen minuets to go to her house

The study was conducted mainly be John Black.
Hopefully all with continue smoothly in my absence.
Can they lave him my messages?
I need to notified the bank of…
He is trying to fine out.

                              Handwriting recognition

Assume a note is given to a bank teller, which the teller reads as "I have a gub" (cf. Woody Allen).

NLP to the rescue …
• "gub" is not a word
• gun, gum, Gus, and gull are words, but gun has a higher probability in the context of a bank

                              For Spell Checkers

Collect a list of commonly substituted words:
• piece/peace, whether/weather, their/there, …

Example:
"On Tuesday, the whether …"
"On Tuesday, the weather …"
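One way to operationalise such confusion sets is to compare corpus counts in context: keep the written word only if it is at least as likely as its confusable twin after the preceding word. This is a sketch; the confusion table, counts, and names are invented:

```python
from collections import Counter

# Pairs of commonly substituted words (only one pair here for illustration).
CONFUSION = {'whether': 'weather', 'weather': 'whether'}

def choose(prev_word, word, bigram_counts):
    """Return word or its confusable alternative, whichever is more
    frequent after prev_word in the (hypothetical) corpus counts."""
    alt = CONFUSION.get(word)
    if alt is None:
        return word                     # not in any confusion set
    if bigram_counts[(prev_word, word)] >= bigram_counts[(prev_word, alt)]:
        return word
    return alt

# Invented counts standing in for a real corpus.
counts = Counter({('the', 'weather'): 120, ('the', 'whether'): 1})
```

So `choose('the', 'whether', counts)` corrects to `'weather'`, which is exactly the "On Tuesday, the whether …" example above.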

                              Another dimension in language models

Do we mainly want to infer (probabilities of) legal sentences / sequences?
• So far

Or do we want to infer properties of these sentences?
• E.g. parse tree, part-of-speech tagging
• Needed for understanding NL

Let's look at some tasks.

                              Sequence Tagging

Part-of-speech tagging:
• He drives with his bike
• N V PR PN N (noun, verb, preposition, pronoun, noun)

Text extraction:
• The job is that of a programmer
• X X X X X X JobType
• The seminar is taking place from 15:00 to 16:00
• X X X X X X Start End

                              Sequence Tagging

Predicting the secondary structure of proteins, mRNA, …
• X = AFARLMMA…
• Y = he,he,st,st,st,he,st,he,…

                              Parsing

Given a sentence, find its parse tree. An important step in understanding NL.

                              Parsing

In bioinformatics, parsing allows one to predict (elements of) structure from sequence.

                              Language models based on Grammars

Grammar types:
• Regular grammars and finite state automata
• Context-free grammars
• Definite clause grammars: a particular type of unification based grammar (Prolog)

Distinguish lexicon from grammar:
• Lexicon (dictionary): contains information about words, e.g. word - possible tags (and possibly additional information): flies - V(erb) - N(oun)
• Grammar: encodes the rules

Grammars and parsing:
• Syntactic level best understood and formalized
• Derivation of grammatical structure: parsing (more than just recognition)
• Result of parsing: mostly a parse tree, showing the constituents of a sentence, e.g. verb or noun phrases
• Syntax usually specified in terms of a grammar consisting of grammar rules

                              Regular Grammars and Finite State Automata

Lexical information - which words are:
• Det(erminer)
• N(oun)
• Vi (intransitive verb) - no argument
• Pn (pronoun)
• Vt (transitive verb) - takes an argument
• Adj (adjective)

Now accept:
• The cat slept
• Det N Vi

As regular grammar:
• S --> [Det], S1    ([ ]: terminal)
• S1 --> [N], S2
• S2 --> [Vi]

Lexicon:
• the - Det
• cat - N
• slept - Vi
• …

                              Finite State Automaton

                              Sentences

• John smiles - Pn Vi
• The cat disappeared - Det N Vi
• These new shoes hurt - Det Adj N Vi
• John liked the old cat - Pn Vt Det Adj N
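The tag sequences above can be checked with a small deterministic finite state automaton. This is a sketch of my own reading of the examples (state names and the exact arc set are assumptions): an NP is either a pronoun or a determiner, optional adjectives, and a noun; it is followed by an intransitive verb, or by a transitive verb and a second NP.

```python
# Transition function: (state, tag) -> next state; missing entries reject.
DELTA = {
    ('q0', 'Det'): 'q1', ('q0', 'Pn'): 'q2',   # subject NP
    ('q1', 'Adj'): 'q1', ('q1', 'N'):  'q2',   # any number of adjectives
    ('q2', 'Vi'):  'acc',                      # intransitive verb: done
    ('q2', 'Vt'):  'q4',                       # transitive verb: expect object NP
    ('q4', 'Det'): 'q5', ('q4', 'Pn'): 'acc',  # object NP
    ('q5', 'Adj'): 'q5', ('q5', 'N'):  'acc',
}

def fsa_accepts(tags):
    """Run the automaton; accept iff we end in the accepting state 'acc'."""
    state = 'q0'
    for t in tags:
        state = DELTA.get((state, t))
        if state is None:
            return False
    return state == 'acc'
```

All four example sequences are accepted, while an incomplete sentence such as Det N ("the cat") is rejected because the run ends before the accepting state.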

                              Phrase structure

S
  NP
    D: the
    N: dog
  VP
    V: chased
    NP
      D: a
      N: cat
    PP
      P: into
      NP
        D: the
        N: garden

("the dog chased a cat into the garden")

                              Notation

• S: sentence
• D or Det: determiner (e.g. articles)
• N: noun
• V: verb
• P: preposition
• NP: noun phrase
• VP: verb phrase
• PP: prepositional phrase

Context Free Grammar

S --> NP, VP
NP --> D, N
VP --> V, NP
VP --> V, NP, PP
PP --> P, NP
D --> [the]
D --> [a]
N --> [dog]
N --> [cat]
N --> [garden]
V --> [chased]
V --> [saw]
P --> [into]

                              Terminals ~ Lexicon

                              Phrase structure

Formalism of context-free grammars:
• Nonterminal symbols: S, NP, VP, …
• Terminal symbols: dog, cat, saw, the, …

Recursion:
• "The girl thought the dog chased the cat"

VP --> V, S
N --> [girl]
V --> [thought]

                              Top-down parsing

S --> NP VP
  --> Det N VP
  --> The N VP
  --> The dog VP
  --> The dog V NP
  --> The dog chased NP
  --> The dog chased Det N
  --> The dog chased the N
  --> The dog chased the cat
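The trace above is exactly what a naive top-down (recursive descent) recognizer does: expand the leftmost nonterminal, try each rule, and backtrack on failure. A sketch for the toy CFG (data layout and function names are my own; tracking sets of end positions handles the two VP rules):

```python
# The toy CFG and lexicon from the slides above.
GRAMMAR = {
    'S':  [['NP', 'VP']],
    'NP': [['D', 'N']],
    'VP': [['V', 'NP', 'PP'], ['V', 'NP']],   # both VP rules are tried
    'PP': [['P', 'NP']],
}
LEXICON = {'D': {'the', 'a'}, 'N': {'dog', 'cat', 'garden'},
           'V': {'chased', 'saw'}, 'P': {'into'}}

def complete(symbol, words, i):
    """Return every index j such that words[i:j] can be derived from symbol."""
    if symbol in LEXICON:                     # pre-terminal: consume one word
        ok = i < len(words) and words[i] in LEXICON[symbol]
        return [i + 1] if ok else []
    ends = []
    for rhs in GRAMMAR[symbol]:               # try each rule, left to right
        positions = [i]
        for sym in rhs:
            positions = [k for j in positions for k in complete(sym, words, j)]
        ends.extend(positions)
    return ends

def recognize(sentence):
    words = sentence.split()
    return len(words) in complete('S', words, 0)
```

`recognize("the dog chased the cat")` reproduces the derivation above; note this naive strategy would loop forever on left-recursive rules (NP --> NP PP), one classical motivation for chart parsers.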

Context-free grammar

S --> NP, VP
NP --> PN              (proper noun)
NP --> Art, Adj, N
NP --> Art, N
VP --> VI              (intransitive verb)
VP --> VT, NP          (transitive verb)
Art --> [the]
Adj --> [lazy]
Adj --> [rapid]
PN --> [achilles]
N --> [turtle]
VI --> [sleeps]
VT --> [beats]



                                Language and sequences

                                Natural language processingbull Is concerned with the analysis of

                                sequences of words sentencesbullConstruction of language models

                                Two types of modelsbullNon-probabilisticbull Probabilistic

                                Human Language is highly ambiguous at all levels

                                bull acoustic levelrecognize speech vs wreck a nice beach

                                bull morphological levelsaw to see (past) saw (noun) to saw (present inf)

                                bull syntactic levelI saw the man on the hill with a telescope

                                bull semantic levelOne book has to be read by every student

                                Key NLP Problem Ambiguity

                                Language Model

                                A formal model about language Two types

                                bull Non-probabilistic Allows one to compute whether a certain sequence

                                (sentence or part thereof) is possible Often grammar based

                                bull Probabilistic Allows one to compute the probability of a certain

                                sequence Often extends grammars with probabilities

                                Example of bad language model

                                A bad language model

                                A bad language model

                                A good language model

                                Non-Probabilisticbull ldquoI swear to tell the truthrdquo is possiblebull ldquoI swerve to smell de souprdquo is impossible

                                Probabilisticbull P(I swear to tell the truth) ~ 0001bull P(I swerve to smell de soup) ~ 0

                                Why language models

                                Consider a Shannon Gamebull Predicting the next word in the sequence

                                Statistical natural language hellip The cat is thrown out of the hellip The large green hellip Sue swallowed the large green hellip hellip

                                Model at the sentence level

                                Applications

                                Spelling correction Mobile phone texting Speech recognition Handwriting recognition Disabled users hellip

                                Spelling errors

                                They are leaving in about fifteen minuets to go to her house

                                The study was conducted mainly be John Black Hopefully all with continue smoothly in my absence Can they lave him my messages I need to notified the bank ofhellip He is trying to fine out

                                Handwriting recognition

                                Assume a note is given to a bank teller which the teller reads as I have a gub (cf Woody Allen)

                                NLP to the rescue hellipbull gub is not a wordbull gun gum Gus and gull are words but gun has a

                                higher probability in the context of a bank

                                For Spell Checkers

                                Collect list of commonly substituted wordsbull piecepeace whetherweather theirthere

                                ExampleldquoOn Tuesday the whether helliprsquorsquoldquoOn Tuesday the weather helliprdquo

                                Another dimension in language models

                                Do we mainly want to infer (probabilities) of legal sentences sequences bull So far

                                Or do we want to infer properties of these sentences bull Eg parse tree part-of-speech-taggingbullNeeded for understanding NL

                                Letrsquos look at some tasks

                                Sequence Tagging

                                Part-of-speech taggingbull He drives with his bikebull N V PR PN N noun verb preposition pronoun noun

                                Text extractionbull The job is that of a programmerbull X X X X X X JobTypebull The seminar is taking place from 1500 to 1600bull X X X X X X Start End

                                Sequence Tagging

                                Predicting the secondary structure of proteins mRNA hellipbull X = AFARLMMAhellipbull Y = hehestststhesthe hellip

                                Parsing

                                Given a sentence find its parse tree Important step in understanding NL

                                Parsing

                                In bioinformatics allows to predict (elements of) structure from sequence

                                Language models based on Grammars

                                Grammar Typesbull Regular grammars and Finite State Automatabull Context-Free Grammarsbull Definite Clause Grammars

                                A particular type of Unification Based Grammar (Prolog)

                                Distinguish lexicon from grammarbull Lexicon (dictionary) contains information about

                                words eg word - possible tags (and possibly additional information) flies - V(erb) - N(oun)

                                bull Grammar encode rules

                                Grammars and parsing Syntactic level best understood and formalized Derivation of grammatical structure parsing

                                (more than just recognition) Result of parsing mostly parse tree

                                showing the constituents of a sentence eg verb or noun phrases

                                Syntax usually specified in terms of a grammar consisting of grammar rules

                                Regular Grammars and Finite State Automata

                                Lexical information - which words are bull Det(erminer)bull N(oun)bull Vi (intransitive verb) - no

                                argumentbull Pn (pronoun) bull Vt (transitive verb) - takes an

                                argumentbull Adj (adjective)

                                Now acceptbull The cat sleptbull Det N Vi

                                As regular grammarbull S -gt [Det] S1 [ ] terminal bull S1 -gt [N] S2bull S2 -gt [Vi]

                                Lexicon bull The - Detbull Cat - Nbull Slept - Vi

                                bull hellip

                                Finite State Automaton

                                Sentences

                                bull John smiles - Pn Vibull The cat disappeared - Det N Vibull These new shoes hurt - Det Adj N Vibull John liked the old cat PN Vt Det Adj N

Phrase structure

[Parse tree: [S [NP [D the] [N dog]] [VP [V chased] [NP [D a] [N cat]] [PP [P into] [NP [D the] [N garden]]]]]]

the dog chased a cat into the garden

Notation

S: sentence; D or Det: determiner (e.g. articles); N: noun; V: verb; P: preposition; NP: noun phrase; VP: verb phrase; PP: prepositional phrase

Context Free Grammar

S  -> NP VP
NP -> D N
VP -> V NP
VP -> V NP PP
PP -> P NP
D  -> [the]
D  -> [a]
N  -> [dog]
N  -> [cat]
N  -> [garden]
V  -> [chased]
V  -> [saw]
P  -> [into]

                                Terminals ~ Lexicon

                                Phrase structure

Formalism of context-free grammars:
• Nonterminal symbols: S, NP, VP, …
• Terminal symbols: dog, cat, saw, the, …

Recursion:
• "The girl thought the dog chased the cat"

VP -> V S
N  -> [girl]
V  -> [thought]

                                Top-down parsing

S -> NP VP
S -> Det N VP
S -> The N VP
S -> The dog VP
S -> The dog V NP
S -> The dog chased NP
S -> The dog chased Det N
S -> The dog chased the N
S -> The dog chased the cat
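The derivation above can be mechanized with a recursive-descent (top-down) parser. A sketch in Python over the toy CFG from these slides; the function names are illustrative, and this naive version would loop forever on left-recursive rules:

```python
# Top-down parsing of the slide's CFG: expand each rule body left to right,
# backtracking over alternatives via generators.
RULES = {
    "S":  [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP", "PP"], ["V", "NP"]],
    "PP": [["P", "NP"]],
}
LEXICON = {
    "D": {"the", "a"}, "N": {"dog", "cat", "garden"},
    "V": {"chased", "saw"}, "P": {"into"},
}

def parse(symbol, words, i):
    """Yield (tree, next_position) for every way to derive `symbol` from words[i:]."""
    if symbol in LEXICON:                       # preterminal: match one word
        if i < len(words) and words[i] in LEXICON[symbol]:
            yield (symbol, words[i]), i + 1
        return
    for body in RULES.get(symbol, []):
        def expand(syms, j):                    # match body symbols in sequence
            if not syms:
                yield [], j
                return
            for child, k in parse(syms[0], words, j):
                for rest, end in expand(syms[1:], k):
                    yield [child] + rest, end
        for children, end in expand(body, i):
            yield (symbol, children), end

sentence = "the dog chased the cat".split()
trees = [t for t, end in parse("S", sentence, 0) if end == len(sentence)]
print(len(trees))   # one complete parse
```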

Context-free grammar

S   --> NP, VP
NP  --> PN            (proper noun)
NP  --> Art, Adj, N
NP  --> Art, N
VP  --> VI            (intransitive verb)
VP  --> VT, NP        (transitive verb)
Art --> [the]
Adj --> [lazy]
Adj --> [rapid]
PN  --> [achilles]
N   --> [turtle]
VI  --> [sleeps]
VT  --> [beats]

                                Parse tree

[Parse tree: [S [NP [Art the] [Adj rapid] [N turtle]] [VP [Vt beats] [NP [PN achilles]]]] for "the rapid turtle beats achilles"]

Definite Clause Grammars
Non-terminals may have arguments:

S             --> NP(N), VP(N)
NP(N)         --> Art(N), N(N)
VP(N)         --> VI(N)
Art(singular) --> [a]
Art(singular) --> [the]
Art(plural)   --> [the]
N(singular)   --> [turtle]
N(plural)     --> [turtles]
VI(singular)  --> [sleeps]
VI(plural)    --> [sleep]

                                Number Agreement

                                DCGs

Non-terminals may have arguments:
• Variables (start with a capital), e.g. Number, Any
• Constants (start with lower case), e.g. singular, plural
• Structured terms (start with lower case and take arguments themselves), e.g. vp(V, NP)

Parsing needs to be adapted:
• Using unification

                                Unification in a nutshell (cf AI course)

Substitutions, e.g. {Num / singular}, {T / vp(V, NP)}

Applying a substitution:
• Simultaneously replace variables by the corresponding terms
• S(Num) {Num / singular} = S(singular)

                                Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.:
• Art(singular) and Art(Num): gives {Num / singular}
• Art(singular) and Art(plural): fails
• Art(Num1) and Art(Num2): {Num1 / Num2}
• PN(Num, accusative) and PN(singular, Case): {Num / singular, Case / accusative}
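The unification cases above can be sketched in a few lines of Python. This is a simplified version handling only variables and constants (no structured terms, no occurs check); the convention that variables start with an upper-case letter follows the slides:

```python
def is_var(t):
    """Prolog-style convention: variables start with an upper-case letter."""
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings until a constant or unbound variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(a, b, subst):
    """Extend subst so that a and b become equal, or return None on failure."""
    subst = dict(subst)
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        subst[a] = b
        return subst
    if is_var(b):
        subst[b] = a
        return subst
    return None  # two distinct constants: failure

def unify_args(xs, ys):
    """Unify two argument lists pairwise, threading the substitution."""
    if len(xs) != len(ys):
        return None
    subst = {}
    for a, b in zip(xs, ys):
        subst = unify(a, b, subst)
        if subst is None:
            return None
    return subst

print(unify_args(["singular"], ["Num"]))    # {'Num': 'singular'}
print(unify_args(["singular"], ["plural"])) # None
print(unify_args(["Num", "accusative"], ["singular", "Case"]))
```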

                                Parsing with DCGs

                                Now require successful unification at each step

S -> NP(N), VP(N)
S -> Art(N), N(N), VP(N)        {N / singular}
S -> a, N(singular), VP(singular)
S -> a turtle VP(singular)
S -> a turtle sleeps

S -> a turtle sleep: fails
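The agreement check above can be sketched by threading the shared number argument through the rules. In this simplified version the binding for N is found by just trying each possible value, a stand-in for full unification; the function names are my own:

```python
# Number-agreement grammar as data: each lexical entry is keyed by
# (category, number), mirroring Art(singular) --> [a], etc.
LEX = {
    ("Art", "singular"): {"a", "the"},
    ("Art", "plural"):   {"the"},
    ("N",   "singular"): {"turtle"},
    ("N",   "plural"):   {"turtles"},
    ("VI",  "singular"): {"sleeps"},
    ("VI",  "plural"):   {"sleep"},
}

def parse_np(words, num):
    # NP(N) --> Art(N), N(N): both children must carry the same number
    return (len(words) == 2
            and words[0] in LEX[("Art", num)]
            and words[1] in LEX[("N", num)])

def parse_s(words):
    # S --> NP(N), VP(N): try each binding of the shared variable N
    for num in ("singular", "plural"):
        if (len(words) == 3 and parse_np(words[:2], num)
                and words[2] in LEX[("VI", num)]):
            return True
    return False

print(parse_s("a turtle sleeps".split()))   # True
print(parse_s("a turtle sleep".split()))    # False: agreement fails
print(parse_s("the turtles sleep".split())) # True
```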

                                Case Marking

PN(singular, nominative) --> [he]; [she]
PN(singular, accusative) --> [him]; [her]
PN(plural, nominative)   --> [they]
PN(plural, accusative)   --> [them]

S                --> NP(Number, nominative), VP(Number)
VP(Number)       --> V(Number), NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any)  --> Det, N(Number)

He sees her. She sees him. They see her.

But not: Them see he.

                                DCGs

Are strictly more expressive than CFGs. Can represent, for instance, the language AⁿBⁿCⁿ:
• S(N)    -> A(N), B(N), C(N)
• A(0)    -> []
• B(0)    -> []
• C(0)    -> []
• A(s(N)) -> A(N), [A]
• B(s(N)) -> B(N), [B]
• C(s(N)) -> C(N), [C]
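The DCG above generates equal numbers of A's, B's, and C's, a language no context-free grammar can capture. A direct membership check for comparison (the helper name is my own):

```python
def in_anbncn(s):
    """True iff s consists of n A's, then n B's, then n C's for some n >= 0."""
    n, rest = divmod(len(s), 3)
    return rest == 0 and s == "A" * n + "B" * n + "C" * n

print(in_anbncn("AABBCC"))  # True
print(in_anbncn("AABBC"))   # False: counts differ
```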

                                Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• Define a probability model for the data
• Compute the probability of each alternative
• Choose the most likely alternative

Illustrated on:
• Shannon Game
• Spelling correction
• Parsing

                                Illustration

Wall Street Journal Corpus: 3,000,000 words; the correct parse tree for each sentence is known:
• Constructed by hand
• Can be used to derive stochastic context-free grammars
• SCFGs assign a probability to each parse tree

Compute the most probable parse tree.
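An SCFG scores a parse tree as the product of the probabilities of the rules used in it. A sketch with made-up rule probabilities (the WSJ-derived estimates are not given in the slides), using the toy CFG from earlier:

```python
# Each entry maps (lhs, rhs) to an assumed rule probability; in an SCFG the
# probabilities of all rules sharing a left-hand side sum to 1 (omitted rules
# here would carry the remaining mass).
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("D", "N")): 0.6,
    ("VP", ("V", "NP")): 0.5,
    ("D", ("the",)): 0.8,
    ("N", ("dog",)): 0.3,
    ("N", ("cat",)): 0.2,
    ("V", ("chased",)): 0.4,
}

def tree_prob(tree):
    """P(tree) = product of RULE_PROB over all rules used in the tree."""
    label, children = tree
    if isinstance(children, str):               # preterminal -> word
        return RULE_PROB[(label, (children,))]
    p = RULE_PROB[(label, tuple(c[0] for c in children))]
    for c in children:
        p *= tree_prob(c)
    return p

t = ("S", [("NP", [("D", "the"), ("N", "dog")]),
           ("VP", [("V", "chased"),
                   ("NP", [("D", "the"), ("N", "cat")])])])
print(tree_prob(t))   # product of the nine rule probabilities
```

Picking the most probable parse then means computing this score for every candidate tree and taking the argmax (done efficiently with dynamic programming, e.g. a probabilistic CKY).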

                                Sequences are omni-present

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

                                Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
• Regular grammars and Finite State Automata
  - Markov Models using n-grams
  - (Hidden) Markov Models
  - Conditional Random Fields (as an example of using undirected graphical models)
• Probabilistic Context-Free Grammars
• Probabilistic Definite Clause Grammars

All use principles of Part I on Graphical Models.

                                • Advanced Artificial Intelligence
                                • Topic
                                • Contents
                                • Rationalism versus Empiricism
                                • Slide 5
                                • This course
                                • Ambiguity
                                • NLP and Statistics
                                • Slide 9
                                • Corpora
                                • Word Counts
                                • Slide 12
                                • Word Counts (Brown corpus)
                                • Slide 14
• Zipf's Law
                                • Language and sequences
                                • Key NLP Problem Ambiguity
                                • Language Model
                                • Example of bad language model
                                • A bad language model
                                • Slide 22
                                • A good language model
                                • Why language models
                                • Applications
                                • Spelling errors
                                • Handwriting recognition
                                • For Spell Checkers
                                • Another dimension in language models
                                • Sequence Tagging
                                • Slide 31
                                • Parsing
                                • Slide 33
                                • Language models based on Grammars
                                • Grammars and parsing
                                • Regular Grammars and Finite State Automata
                                • Finite State Automaton
                                • Phrase structure
                                • Notation
                                • Context Free Grammar
                                • Slide 41
                                • Top-down parsing
                                • Context-free grammar
                                • Parse tree
                                • Definite Clause Grammars Non-terminals may have arguments
                                • DCGs
                                • Unification in a nutshell (cf AI course)
                                • Unification
                                • Parsing with DCGs
                                • Case Marking
                                • Slide 51
                                • Probabilistic Models
                                • Illustration
                                • PowerPoint Presentation
                                • Sequences are omni-present
                                • Rest of the Course

Human language is highly ambiguous at all levels:

• acoustic level: "recognize speech" vs. "wreck a nice beach"

• morphological level: saw → to see (past), saw (noun), to saw (present, inf.)

• syntactic level: I saw the man on the hill with a telescope

• semantic level: One book has to be read by every student

Key NLP Problem: Ambiguity

                                  Language Model

A formal model about language. Two types:

• Non-probabilistic: allows one to compute whether a certain sequence (sentence or part thereof) is possible; often grammar based

• Probabilistic: allows one to compute the probability of a certain sequence; often extends grammars with probabilities

Example of a bad language model


                                  A good language model

Non-probabilistic:
• "I swear to tell the truth" is possible
• "I swerve to smell de soup" is impossible

Probabilistic:
• P(I swear to tell the truth) ≈ 0.0001
• P(I swerve to smell de soup) ≈ 0

                                  Why language models

Consider a Shannon Game:
• Predicting the next word in a sequence:
  - Statistical natural language …
  - The cat is thrown out of the …
  - The large green …
  - Sue swallowed the large green …
  - …

Model at the sentence level.
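The Shannon game can be played with a simple bigram model: predict the next word by the relative frequency of what followed it in a corpus. A sketch on a tiny illustrative corpus (real models are trained on millions of words; the corpus text here is invented for the example):

```python
from collections import Counter, defaultdict

# Tiny toy corpus echoing the slide's example contexts.
corpus = ("the cat is thrown out of the house . "
          "sue swallowed the large green pill . "
          "the large green tree shades the house .").split()

# Count bigrams: for each word, how often each successor follows it.
bigrams = defaultdict(Counter)
for w1, w2 in zip(corpus, corpus[1:]):
    bigrams[w1][w2] += 1

def predict(word):
    """Most likely next word after `word`, by relative frequency."""
    following = bigrams[word]
    return following.most_common(1)[0][0] if following else None

print(predict("large"))   # 'green' in this corpus
print(predict("never"))   # None: unseen word
```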

                                  Applications

• Spelling correction
• Mobile phone texting
• Speech recognition
• Handwriting recognition
• Disabled users
• …

                                  Spelling errors

They are leaving in about fifteen minuets to go to her house.
The study was conducted mainly be John Black.
Hopefully, all with continue smoothly in my absence.
Can they lave him my messages?
I need to notified the bank of…
He is trying to fine out.

                                  Handwriting recognition

Assume a note is given to a bank teller, which the teller reads as "I have a gub." (cf. Woody Allen)

NLP to the rescue:
• "gub" is not a word
• gun, gum, Gus, and gull are words, but gun has a higher probability in the context of a bank

                                  For Spell Checkers

Collect a list of commonly substituted words:
• piece/peace, whether/weather, their/there, …

Example:
• "On Tuesday, the whether …"
• "On Tuesday, the weather …"
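The confusion-set idea can be sketched as follows: for each commonly substituted pair, pick the variant with the higher bigram probability in context. The counts below are invented placeholders standing in for corpus statistics:

```python
from collections import Counter

# Commonly substituted pairs (each entry maps a word to its rival spelling).
CONFUSION = {"whether": "weather", "weather": "whether"}

# Assumed bigram counts; a real spell checker would estimate these from a corpus.
BIGRAM_COUNTS = Counter({("the", "weather"): 120, ("the", "whether"): 1,
                         ("know", "whether"): 80, ("know", "weather"): 2})

def choose(prev_word, word):
    """Keep `word`, or swap in its confusion-set rival if that fits the context better."""
    alt = CONFUSION.get(word)
    if alt is None:
        return word
    return max(word, alt, key=lambda w: BIGRAM_COUNTS[(prev_word, w)])

print(choose("the", "whether"))   # 'weather': "On Tuesday, the weather ..."
print(choose("know", "weather"))  # 'whether'
```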

                                  Another dimension in language models

Do we mainly want to infer (probabilities of) legal sentences/sequences?
• So far

Or do we want to infer properties of these sentences?
• E.g. parse tree, part-of-speech tagging
• Needed for understanding NL

Let's look at some tasks.

                                  Sequence Tagging

Part-of-speech tagging:
• He drives with his bike
• N V PR PN N (noun, verb, preposition, pronoun, noun)

Text extraction:
• The job is that of a programmer.
• X X X X X X JobType
• The seminar is taking place from 15:00 to 16:00.
• X X X X X X Start End
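A minimal baseline for part-of-speech tagging labels each word with its most frequent tag from a lexicon; the (Hidden) Markov Models and CRFs covered later improve on this by using context. The tiny lexicon here mirrors the slide's example (tag names as in the slide; the default tag is an assumption):

```python
# Most-frequent-tag baseline: a plain per-word lookup, no context.
MOST_FREQUENT_TAG = {"he": "N", "drives": "V", "with": "PR",
                     "his": "PN", "bike": "N"}

def tag(words):
    """Tag each word independently; unknown words default to N(oun)."""
    return [MOST_FREQUENT_TAG.get(w.lower(), "N") for w in words]

print(tag("He drives with his bike".split()))  # matches the slide's tag row
```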

                                  Sequence Tagging

Predicting the secondary structure of proteins, mRNA, …
• X = AFARLMMA…
• Y = hehestststhesthe…

                                  Parsing

Given a sentence, find its parse tree. An important step in understanding NL.

                                  Parsing

In bioinformatics, parsing allows one to predict (elements of) structure from sequence.

                                  Language models based on Grammars



                                    Language Model

                                    A formal model about language Two types

                                    bull Non-probabilistic Allows one to compute whether a certain sequence

                                    (sentence or part thereof) is possible Often grammar based

                                    bull Probabilistic Allows one to compute the probability of a certain

                                    sequence Often extends grammars with probabilities

                                    Example of bad language model

                                    A bad language model

                                    A bad language model

                                    A good language model

                                    Non-Probabilisticbull ldquoI swear to tell the truthrdquo is possiblebull ldquoI swerve to smell de souprdquo is impossible

                                    Probabilisticbull P(I swear to tell the truth) ~ 0001bull P(I swerve to smell de soup) ~ 0

                                    Why language models

                                    Consider a Shannon Gamebull Predicting the next word in the sequence

                                    Statistical natural language hellip The cat is thrown out of the hellip The large green hellip Sue swallowed the large green hellip hellip

                                    Model at the sentence level

                                    Applications

                                    Spelling correction Mobile phone texting Speech recognition Handwriting recognition Disabled users hellip

                                    Spelling errors

                                    They are leaving in about fifteen minuets to go to her house

                                    The study was conducted mainly be John Black Hopefully all with continue smoothly in my absence Can they lave him my messages I need to notified the bank ofhellip He is trying to fine out

                                    Handwriting recognition

                                    Assume a note is given to a bank teller which the teller reads as I have a gub (cf Woody Allen)

                                    NLP to the rescue hellipbull gub is not a wordbull gun gum Gus and gull are words but gun has a

                                    higher probability in the context of a bank

                                    For Spell Checkers

Collect a list of commonly substituted words:
• piece/peace, whether/weather, their/there

Example:
"On Tuesday, the whether …"
"On Tuesday, the weather …"
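Given such confusion pairs, the checker can pick whichever member is more probable in the local context; a hypothetical sketch (the context counts are invented for illustration):

```python
# Confusion pairs: each word maps to its commonly substituted alternative.
pairs = {"whether": "weather", "weather": "whether"}

# Invented counts of (previous word, candidate) pairs seen in a corpus.
context_counts = {
    ("the", "weather"): 120,
    ("the", "whether"): 1,
}

def correct(prev_word, word):
    """Pick the confusion-pair member more likely after prev_word."""
    alt = pairs.get(word)
    if alt is None:
        return word  # not a known confusion word: leave it alone
    if context_counts.get((prev_word, alt), 0) > context_counts.get((prev_word, word), 0):
        return alt
    return word

print(correct("the", "whether"))  # -> weather
```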

                                    Another dimension in language models

Do we mainly want to infer (probabilities of) legal sentences / sequences?
• So far

Or do we want to infer properties of these sentences?
• E.g. parse tree, part-of-speech tagging
• Needed for understanding NL

Let's look at some tasks.

                                    Sequence Tagging

Part-of-speech tagging:
• He drives with his bike
• N V PR PN N (noun, verb, preposition, pronoun, noun)

Text extraction:
• The job is that of a programmer
  X X X X X X JobType
• The seminar is taking place from 15:00 to 16:00
  X X X X X X Start End

                                    Sequence Tagging

Predicting the secondary structure of proteins, mRNA, …
• X = AFARLMMA…
• Y = hehestststhesthe…

                                    Parsing

Given a sentence, find its parse tree. An important step in understanding NL.

                                    Parsing

In bioinformatics, parsing allows predicting (elements of) structure from the sequence.

                                    Language models based on Grammars

Grammar Types:
• Regular grammars and Finite State Automata
• Context-Free Grammars
• Definite Clause Grammars

                                    A particular type of Unification Based Grammar (Prolog)

Distinguish lexicon from grammar:
• Lexicon (dictionary) contains information about words, e.g. word - possible tags (and possibly additional information): flies - V(erb) or N(oun)
• Grammar encodes rules

Grammars and parsing:
The syntactic level is best understood and formalized.
Derivation of grammatical structure: parsing (more than just recognition).
The result of parsing is mostly a parse tree, showing the constituents of a sentence, e.g. verb or noun phrases.
Syntax is usually specified in terms of a grammar consisting of grammar rules.

                                    Regular Grammars and Finite State Automata

Lexical information - which words are:
• Det(erminer)
• N(oun)
• Pn (pronoun)
• Vi (intransitive verb) - takes no argument
• Vt (transitive verb) - takes an argument
• Adj (adjective)

Now accept:
• The cat slept
• Det N Vi

As regular grammar:
• S -> [Det] S1   ([...]: terminal)
• S1 -> [N] S2
• S2 -> [Vi]

Lexicon:
• the - Det
• cat - N
• slept - Vi
• …
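The regular grammar above is equivalent to a finite state automaton; a minimal recognizer sketch (lexicon and transitions follow the slide, the state names are my own):

```python
# Lexicon: word -> tag, as on the slide.
lexicon = {"the": "Det", "cat": "N", "slept": "Vi"}

# FSA transitions for S -> [Det] S1, S1 -> [N] S2, S2 -> [Vi].
transitions = {("S", "Det"): "S1", ("S1", "N"): "S2", ("S2", "Vi"): "End"}

def accepts(sentence):
    """Return True iff the sentence's tag sequence reaches the final state."""
    state = "S"
    for word in sentence.split():
        tag = lexicon.get(word.lower())       # look up the word's tag
        state = transitions.get((state, tag)) # follow the transition, if any
        if state is None:
            return False
    return state == "End"

print(accepts("The cat slept"))  # True
print(accepts("The slept cat"))  # False
```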

                                    Finite State Automaton

                                    Sentences

• John smiles - Pn Vi
• The cat disappeared - Det N Vi
• These new shoes hurt - Det Adj N Vi
• John liked the old cat - PN Vt Det Adj N

Phrase structure

Parse tree for "the dog chased a cat into the garden":

[S [NP [D the] [N dog]] [VP [V chased] [NP [D a] [N cat]] [PP [P into] [NP [D the] [N garden]]]]]

                                    Notation

S: sentence
D or Det: Determiner (e.g. articles)
N: noun
V: verb
P: preposition
NP: noun phrase
VP: verb phrase
PP: prepositional phrase

Context Free Grammar:
S -> NP VP
NP -> D N
VP -> V NP
VP -> V NP PP
PP -> P NP
D -> [the]
D -> [a]
N -> [dog]
N -> [cat]
N -> [garden]
V -> [chased]
V -> [saw]
P -> [into]

                                    Terminals ~ Lexicon

                                    Phrase structure

Formalism of context-free grammars:
• Nonterminal symbols: S, NP, VP, …
• Terminal symbols: dog, cat, saw, the, …

Recursion:
• "The girl thought the dog chased the cat"

VP -> V S
N -> [girl]
V -> [thought]

                                    Top-down parsing

S -> NP VP
  -> Det N VP
  -> The N VP
  -> The dog VP
  -> The dog V NP
  -> The dog chased NP
  -> The dog chased Det N
  -> The dog chased the N
  -> The dog chased the cat
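This top-down derivation can be automated by a recursive-descent recognizer; a minimal sketch for the context-free grammar above (recognition only, no tree construction):

```python
# Grammar rules: nonterminal -> list of alternative right-hand sides.
grammar = {
    "S":  [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP", "PP"], ["V", "NP"]],
    "PP": [["P", "NP"]],
    "D":  [["the"], ["a"]],
    "N":  [["dog"], ["cat"], ["garden"]],
    "V":  [["chased"], ["saw"]],
    "P":  [["into"]],
}

def parse(symbols, words):
    """Top-down: try to derive the word list from the symbol list."""
    if not symbols:
        return not words                      # success iff all words consumed
    first, rest = symbols[0], symbols[1:]
    if first in grammar:                      # nonterminal: try each rule
        return any(parse(rhs + rest, words) for rhs in grammar[first])
    # terminal: must match the next word
    return bool(words) and words[0] == first and parse(rest, words[1:])

print(parse(["S"], "the dog chased the cat".split()))  # True
```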

Context-free grammar:

S --> NP VP
NP --> PN          (proper noun)
NP --> Art Adj N
NP --> Art N
VP --> VI          (intransitive verb)
VP --> VT NP       (transitive verb)
Art --> [the]
Adj --> [lazy]
Adj --> [rapid]
PN --> [achilles]
N --> [turtle]
VI --> [sleeps]
VT --> [beats]

Parse tree

For "the rapid turtle beats achilles":

[S [NP [Art the] [Adj rapid] [N turtle]] [VP [Vt beats] [NP [PN achilles]]]]

Definite Clause Grammars: non-terminals may have arguments.

S --> NP(N) VP(N)
NP(N) --> Art(N) N(N)
VP(N) --> VI(N)
Art(singular) --> [a]
Art(singular) --> [the]
Art(plural) --> [the]
N(singular) --> [turtle]
N(plural) --> [turtles]
VI(singular) --> [sleeps]
VI(plural) --> [sleep]

Number agreement.

                                    DCGs

Non-terminals may have arguments:
• Variables (start with a capital), e.g. Number, Any
• Constants (start with lower case), e.g. singular, plural
• Structured terms (start with lower case and take arguments themselves), e.g. vp(V, NP)

Parsing needs to be adapted:
• Using unification

                                    Unification in a nutshell (cf AI course)

Substitutions, e.g. {Num/singular}, {T/vp(V, NP)}

Applying a substitution:
• Simultaneously replace variables by the corresponding terms
• S(Num) {Num/singular} = S(singular)

                                    Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.:
• Art(singular) and Art(Num): gives {Num/singular}
• Art(singular) and Art(plural): fails
• Art(Num1) and Art(Num2): {Num1/Num2}
• PN(Num, accusative) and PN(singular, Case): {Num/singular, Case/accusative}
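These unification examples can be reproduced with a simplified unifier for flat argument lists (constants and variables only, no nested terms or occurs check):

```python
def is_var(t):
    """Variables are capitalized strings, as in the slides (Num, Case, ...)."""
    return isinstance(t, str) and t[:1].isupper()

def unify(args1, args2):
    """Unify two argument lists; return a substitution dict, or None on failure."""
    subst = {}
    for a, b in zip(args1, args2):
        a, b = subst.get(a, a), subst.get(b, b)   # apply bindings made so far
        if a == b:
            continue
        if is_var(a):
            subst[a] = b
        elif is_var(b):
            subst[b] = a
        else:
            return None                            # two distinct constants: fail

    return subst

print(unify(["singular"], ["Num"]))                        # {'Num': 'singular'}
print(unify(["singular"], ["plural"]))                     # None
print(unify(["Num", "accusative"], ["singular", "Case"]))  # {'Num': 'singular', 'Case': 'accusative'}
```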

                                    Parsing with DCGs

Now require successful unification at each step:

S -> NP(N) VP(N)
  -> Art(N) N(N) VP(N)        {N/singular}
  -> a N(singular) VP(singular)
  -> a turtle VP(singular)
  -> a turtle sleeps

"a turtle sleep" fails.

                                    Case Marking

PN(singular, nominative) --> [he]; [she]
PN(singular, accusative) --> [him]; [her]
PN(plural, nominative) --> [they]
PN(plural, accusative) --> [them]

S --> NP(Number, nominative) VP(Number)
VP(Number) --> V(Number) NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any) --> Det N(Number)

He sees her. She sees him. They see her.

But not: Them see he.

                                    DCGs

DCGs are strictly more expressive than CFGs. They can represent, for instance, the language a^n b^n c^n:

• S(N) --> A(N) B(N) C(N)
• A(0) --> []
• B(0) --> []
• C(0) --> []
• A(s(N)) --> A(N) [A]
• B(s(N)) --> B(N) [B]
• C(s(N)) --> C(N) [C]
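The counting trick above, with s(N) terms acting as unary numbers, guarantees equal numbers of A's, B's and C's; a minimal recognizer sketch for this language:

```python
def accepts(s):
    """Recognize the language of the DCG above: n A's, then n B's, then n C's."""
    n = len(s) // 3
    # The string must be exactly A^n B^n C^n; no context-free grammar
    # can enforce this three-way agreement.
    return s == "A" * n + "B" * n + "C" * n

print(accepts("AABBCC"))  # True
print(accepts("AABBC"))   # False
print(accepts(""))        # True: A(0), B(0), C(0) all derive []
```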

                                    Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• Define a probability model for the data
• Compute the probability of each alternative
• Choose the most likely alternative

Illustrated on:
• Shannon Game
• Spelling correction
• Parsing

                                    Illustration

Wall Street Journal Corpus: 3,000,000 words, with the correct parse tree for each sentence known:
• Constructed by hand
• Can be used to derive stochastic context-free grammars
• SCFGs assign a probability to parse trees

Compute the most probable parse tree.
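The probability an SCFG assigns to a parse tree is the product of the probabilities of the rules used in it; a toy sketch (the grammar and its rule probabilities are invented for illustration, not estimated from the WSJ corpus):

```python
# Toy SCFG: (lhs, rhs) -> probability. Invented numbers; a real SCFG is
# estimated from a hand-parsed treebank.
rule_prob = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("D", "N")): 0.7,
    ("VP", ("V", "NP")): 0.6,
    ("D", ("the",)): 0.5,
    ("N", ("dog",)): 0.3,
    ("N", ("cat",)): 0.3,
    ("V", ("chased",)): 0.4,
}

def tree_prob(tree):
    """Probability of a parse tree: product of the probabilities of its rules."""
    label, children = tree[0], tree[1:]
    if isinstance(children[0], str):            # preterminal rule: tag -> word
        return rule_prob[(label, tuple(children))]
    rhs = tuple(child[0] for child in children) # rule used at this node
    p = rule_prob[(label, rhs)]
    for child in children:
        p *= tree_prob(child)
    return p

t = ("S", ("NP", ("D", "the"), ("N", "dog")),
          ("VP", ("V", "chased"), ("NP", ("D", "the"), ("N", "cat"))))
print(tree_prob(t))
```

With several candidate trees for an ambiguous sentence, the parser would return the one maximizing this product.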

                                    Sequences are omni-present

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

                                    Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
• Regular grammars and Finite State Automata:
  Markov Models using n-grams, (Hidden) Markov Models, Conditional Random Fields (as an example of using undirected graphical models)
• Probabilistic Context-Free Grammars
• Probabilistic Definite Clause Grammars

All use the principles of Part I on Graphical Models.

                                    • Advanced Artificial Intelligence
                                    • Topic
                                    • Contents
                                    • Rationalism versus Empiricism
                                    • Slide 5
                                    • This course
                                    • Ambiguity
                                    • NLP and Statistics
                                    • Slide 9
                                    • Corpora
                                    • Word Counts
                                    • Slide 12
                                    • Word Counts (Brown corpus)
                                    • Slide 14
• Zipf's Law
                                    • Language and sequences
                                    • Key NLP Problem Ambiguity
                                    • Language Model
                                    • Example of bad language model
                                    • A bad language model
                                    • Slide 22
                                    • A good language model
                                    • Why language models
                                    • Applications
                                    • Spelling errors
                                    • Handwriting recognition
                                    • For Spell Checkers
                                    • Another dimension in language models
                                    • Sequence Tagging
                                    • Slide 31
                                    • Parsing
                                    • Slide 33
                                    • Language models based on Grammars
                                    • Grammars and parsing
                                    • Regular Grammars and Finite State Automata
                                    • Finite State Automaton
                                    • Phrase structure
                                    • Notation
                                    • Context Free Grammar
                                    • Slide 41
                                    • Top-down parsing
                                    • Context-free grammar
                                    • Parse tree
                                    • Definite Clause Grammars Non-terminals may have arguments
                                    • DCGs
                                    • Unification in a nutshell (cf AI course)
                                    • Unification
                                    • Parsing with DCGs
                                    • Case Marking
                                    • Slide 51
                                    • Probabilistic Models
                                    • Illustration
                                    • PowerPoint Presentation
                                    • Sequences are omni-present
                                    • Rest of the Course


                                        A good language model

Non-Probabilistic:
• "I swear to tell the truth" is possible
• "I swerve to smell de soup" is impossible

Probabilistic:
• P(I swear to tell the truth) ~ 0.0001
• P(I swerve to smell de soup) ~ 0

                                        Why language models

Consider a Shannon Game:
• Predicting the next word in the sequence

• Statistical natural language …
• The cat is thrown out of the …
• The large green …
• Sue swallowed the large green …
• …

                                        Model at the sentence level
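The Shannon game can be sketched with a simple bigram model. The toy corpus and the `predict_next` helper below are illustrative assumptions, not part of the course material:

```python
from collections import Counter

# Toy corpus; a real model would be trained on millions of words.
corpus = "the cat is thrown out of the house the cat is small".split()

# Bigram counts give P(next | prev) ~ count(prev, next) / count(prev).
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus[:-1])

def predict_next(prev):
    """Return the most likely next word after `prev` (toy Shannon game)."""
    candidates = {w2: c / unigrams[prev]
                  for (w1, w2), c in bigrams.items() if w1 == prev}
    return max(candidates, key=candidates.get)

print(predict_next("the"))  # 'cat': it follows 'the' twice, 'house' only once
```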

                                        Applications

• Spelling correction
• Mobile phone texting
• Speech recognition
• Handwriting recognition
• Disabled users
• …

                                        Spelling errors

                                        They are leaving in about fifteen minuets to go to her house

• The study was conducted mainly be John Black.
• Hopefully all with continue smoothly in my absence.
• Can they lave him my messages?
• I need to notified the bank of…
• He is trying to fine out.

                                        Handwriting recognition

Assume a note is given to a bank teller, which the teller reads as "I have a gub" (cf. Woody Allen).

NLP to the rescue:
• gub is not a word
• gun, gum, Gus, and gull are words, but gun has a higher probability in the context of a bank

                                        For Spell Checkers

Collect a list of commonly substituted words:
• piece/peace, whether/weather, their/there

Example:
"On Tuesday, the whether …"
"On Tuesday, the weather …"
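A confusion-set spell checker of this kind can be sketched as follows; the bigram probabilities are invented stand-ins for counts from a corpus:

```python
# Confusion sets of commonly substituted words.
confusion_sets = [{"whether", "weather"}, {"piece", "peace"}, {"their", "there"}]

# Assumed toy bigram probabilities P(word | previous word).
bigram_prob = {
    ("the", "weather"): 0.01,
    ("the", "whether"): 0.00001,
}

def correct(prev, word):
    """Pick the most probable member of word's confusion set given the context."""
    for cs in confusion_sets:
        if word in cs:
            return max(cs, key=lambda w: bigram_prob.get((prev, w), 0.0))
    return word  # word not in any confusion set: leave it alone

print(correct("the", "whether"))  # 'weather'
```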

                                        Another dimension in language models

Do we mainly want to infer (probabilities of) legal sentences/sequences?
• So far

Or do we want to infer properties of these sentences?
• E.g. parse tree, part-of-speech tagging
• Needed for understanding NL

Let's look at some tasks.

                                        Sequence Tagging

Part-of-speech tagging:
• He drives with his bike
• N V PR PN N (noun, verb, preposition, pronoun, noun)

Text extraction:
• The job is that of a programmer.
  X X X X X X JobType
• The seminar is taking place from 15:00 to 16:00.
  X X X X X X Start End

                                        Sequence Tagging

Predicting the secondary structure of proteins, mRNA, …
• X = AFARLMMA…
• Y = hehestststhesthe…

                                        Parsing

                                        Given a sentence find its parse tree Important step in understanding NL

                                        Parsing

                                        In bioinformatics allows to predict (elements of) structure from sequence

                                        Language models based on Grammars

Grammar Types:
• Regular grammars and Finite State Automata
• Context-Free Grammars
• Definite Clause Grammars

                                        A particular type of Unification Based Grammar (Prolog)

Distinguish lexicon from grammar:
• Lexicon (dictionary): contains information about words, e.g. word -> possible tags (and possibly additional information): flies -> V(erb) or N(oun)
• Grammar: encodes the rules

Grammars and parsing
• Syntactic level is best understood and formalized
• Derivation of grammatical structure: parsing (more than just recognition)
• Result of parsing: mostly a parse tree, showing the constituents of a sentence, e.g. verb or noun phrases
• Syntax is usually specified in terms of a grammar consisting of grammar rules

                                        Regular Grammars and Finite State Automata

Lexical information - which words are:
• Det(erminer)
• N(oun)
• Vi (intransitive verb): takes no argument
• Pn (pronoun)
• Vt (transitive verb): takes an argument
• Adj (adjective)

Now accept:
• The cat slept
• Det N Vi

As regular grammar:
• S -> [Det] S1     ([…] denotes a terminal)
• S1 -> [N] S2
• S2 -> [Vi]

Lexicon:
• the - Det
• cat - N
• slept - Vi
• …
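The regular grammar above is equivalent to a finite state automaton; a minimal sketch, with the state names and lexicon assumed from the slide:

```python
# Lexicon: word -> tag, as in the slide.
lexicon = {"the": "Det", "cat": "N", "slept": "Vi"}

# Transitions of the FSA for S -> [Det] S1, S1 -> [N] S2, S2 -> [Vi].
transitions = {("S", "Det"): "S1", ("S1", "N"): "S2", ("S2", "Vi"): "End"}

def accepts(sentence):
    """Run the automaton over the tag sequence of the sentence."""
    state = "S"
    for word in sentence.lower().split():
        state = transitions.get((state, lexicon.get(word)))
        if state is None:
            return False
    return state == "End"

print(accepts("The cat slept"))  # True
print(accepts("The slept cat"))  # False
```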

                                        Finite State Automaton

                                        Sentences

• John smiles - Pn Vi
• The cat disappeared - Det N Vi
• These new shoes hurt - Det Adj N Vi
• John liked the old cat - Pn Vt Det Adj N

                                        Phrase structure

Parse tree for "the dog chased a cat into the garden":

[S [NP [D the] [N dog]]
   [VP [V chased]
       [NP [D a] [N cat]]
       [PP [P into] [NP [D the] [N garden]]]]]

                                        Notation

S: sentence, D or Det: determiner (e.g. articles), N: noun, V: verb, P: preposition, NP: noun phrase, VP: verb phrase, PP: prepositional phrase

Context Free Grammar:
S  -> NP VP
NP -> D N
VP -> V NP
VP -> V NP PP
PP -> P NP
D  -> [the]
D  -> [a]
N  -> [dog]
N  -> [cat]
N  -> [garden]
V  -> [chased]
V  -> [saw]
P  -> [into]

                                        Terminals ~ Lexicon

                                        Phrase structure

Formalism of context-free grammars:
• Nonterminal symbols: S, NP, VP, …
• Terminal symbols: dog, cat, saw, the, …

Recursion:
• "The girl thought the dog chased the cat"

VP -> V S
N  -> [girl]
V  -> [thought]

                                        Top-down parsing

S -> NP VP
S -> Det N VP
S -> The N VP
S -> The dog VP
S -> The dog V NP
S -> The dog chased NP
S -> The dog chased Det N
S -> The dog chased the N
S -> The dog chased the cat
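The derivation above can be mirrored by a small top-down recognizer; this is a sketch, with the grammar dictionary hard-coded from the CFG rules of these slides:

```python
# Toy CFG from the slides; a symbol is a non-terminal iff it has an entry here.
grammar = {
    "S":  [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP"]],
    "D":  [["the"], ["a"]],
    "N":  [["dog"], ["cat"]],
    "V":  [["chased"]],
}

def parse(symbol, words, i):
    """Yield every j such that `symbol` derives words[i:j] (top-down search)."""
    if symbol not in grammar:              # terminal: must match one word
        if i < len(words) and words[i] == symbol:
            yield i + 1
        return
    for rhs in grammar[symbol]:
        positions = [i]
        for sym in rhs:                    # expand the RHS left to right
            positions = [k for j in positions for k in parse(sym, words, j)]
        yield from positions

def recognize(sentence):
    words = sentence.lower().split()
    return len(words) in parse("S", words, 0)

print(recognize("the dog chased the cat"))  # True
```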

Context-free grammar:
S   -> NP VP
NP  -> PN            (proper noun)
NP  -> Art Adj N
NP  -> Art N
VP  -> VI            (intransitive verb)
VP  -> VT NP         (transitive verb)
Art -> [the]
Adj -> [lazy]
Adj -> [rapid]
PN  -> [achilles]
N   -> [turtle]
VI  -> [sleeps]
VT  -> [beats]

                                        Parse tree

Parse tree for "the rapid turtle beats achilles":

[S [NP [Art the] [Adj rapid] [N turtle]]
   [VP [Vt beats] [NP [PN achilles]]]]

Definite Clause Grammars: Non-terminals may have arguments

S     -> NP(N) VP(N)
NP(N) -> Art(N) N(N)
VP(N) -> VI(N)

Art(singular) -> [a]
Art(singular) -> [the]
Art(plural)   -> [the]
N(singular)   -> [turtle]
N(plural)     -> [turtles]
VI(singular)  -> [sleeps]
VI(plural)    -> [sleep]

                                        Number Agreement

                                        DCGs

Non-terminals may have arguments:
• Variables (start with a capital), e.g. Number, Any
• Constants (start with lower case), e.g. singular, plural
• Structured terms (start with lower case and take arguments themselves), e.g. vp(V,NP)

Parsing needs to be adapted:
• Using unification

                                        Unification in a nutshell (cf AI course)

Substitutions, e.g. {Num / singular}, {T / vp(V,NP)}

Applying a substitution:
• Simultaneously replace variables by the corresponding terms
• S(Num) {Num / singular} = S(singular)

                                        Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.:
• Art(singular) and Art(Num): gives {Num / singular}
• Art(singular) and Art(plural): fails
• Art(Num1) and Art(Num2): gives {Num1 / Num2}
• PN(Num, accusative) and PN(singular, Case): gives {Num / singular, Case / accusative}
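These examples can be reproduced with a small Python unifier; a sketch that assumes the slides' convention (variables start upper-case, constants lower-case) and handles only flat argument lists, not structured terms:

```python
def is_var(t):
    """Variables start with an upper-case letter (Num, Case, ...)."""
    return isinstance(t, str) and t[:1].isupper()

def unify(args1, args2):
    """Return a substitution unifying two argument tuples, or None on failure."""
    subst = {}
    for a, b in zip(args1, args2):
        a = subst.get(a, a)   # apply bindings found so far
        b = subst.get(b, b)
        if a == b:
            continue
        if is_var(a):
            subst[a] = b
        elif is_var(b):
            subst[b] = a
        else:                 # two different constants: fail
            return None
    return subst

print(unify(("singular",), ("Num",)))                      # {'Num': 'singular'}
print(unify(("singular",), ("plural",)))                   # None
print(unify(("Num", "accusative"), ("singular", "Case")))  # {'Num': 'singular', 'Case': 'accusative'}
```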

                                        Parsing with DCGs

                                        Now require successful unification at each step

S -> NP(N) VP(N)
S -> Art(N) N(N) VP(N)      {N / singular}
S -> a N(singular) VP(singular)
S -> a turtle VP(singular)
S -> a turtle sleeps

S -> a turtle sleep: fails

                                        Case Marking

PN(singular, nominative) -> [he] ; [she]
PN(singular, accusative) -> [him] ; [her]
PN(plural, nominative)   -> [they]
PN(plural, accusative)   -> [them]

S -> NP(Number, nominative) VP(Number)
VP(Number) -> V(Number) NP(Any, accusative)
NP(Number, Case) -> PN(Number, Case)
NP(Number, Any)  -> Det N(Number)

Accepted: "He sees her", "She sees him", "They see her"

But not: "Them see he"

                                        DCGs

Are strictly more expressive than CFGs. Can represent, for instance, the language A^n B^n C^n:

• S(N) -> A(N) B(N) C(N)
• A(0) -> []
• B(0) -> []
• C(0) -> []
• A(s(N)) -> A(N) [A]
• B(s(N)) -> B(N) [B]
• C(s(N)) -> C(N) [C]
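The counting trick of this DCG can be transcribed directly; a sketch in which unary numbers are encoded as nested tuples ("s", ...), an assumed representation:

```python
def block(letter, n, words):
    """X(0) -> []   X(s(N)) -> X(N) [letter]: consume `letter` N times."""
    count = 0
    while isinstance(n, tuple):   # peel one s(...) wrapper per repetition
        n = n[1]
        count += 1
    if words[:count] == [letter] * count:
        return words[count:]      # remaining input
    return None                   # wrong number of letters: fail

def s(n, words):
    """S(N) -> A(N) B(N) C(N): equal numbers of A's, B's and C's."""
    for letter in ("A", "B", "C"):
        words = block(letter, n, words)
        if words is None:
            return None
    return words

two = ("s", ("s", 0))  # the unary number s(s(0)) = 2
print(s(two, list("AABBCC")) == [])  # True: A^2 B^2 C^2 is accepted
print(s(two, list("AABCC")))         # None: rejected
```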

                                        Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• Define a probability model for the data
• Compute the probability of each alternative
• Choose the most likely alternative

Illustrated on:
• Shannon Game
• Spelling correction
• Parsing

                                        Illustration

Wall Street Journal Corpus: 3,000,000 words. The correct parse tree for each sentence is known:
• Constructed by hand
• Can be used to derive stochastic context-free grammars
• SCFGs assign a probability to each parse tree

Compute the most probable parse tree.
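How an SCFG scores a parse tree can be sketched as follows: the probability of a tree is the product of the probabilities of all rules used. The rule probabilities below are invented for illustration, not estimated from the WSJ corpus:

```python
# Assumed rule probabilities; in practice these are estimated from a treebank.
rule_prob = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("Art", "N")): 0.6,
    ("NP", ("PN",)): 0.4,
    ("VP", ("VT", "NP")): 0.5,
}

def tree_prob(tree):
    """tree = (label, [children]); leaves are plain word strings."""
    label, children = tree
    if all(isinstance(c, str) for c in children):
        return 1.0  # pre-terminal -> word; lexical probabilities omitted here
    rhs = tuple(child[0] for child in children)
    p = rule_prob[(label, rhs)]
    for child in children:
        p *= tree_prob(child)
    return p

t = ("S", [("NP", [("Art", ["the"]), ("N", ["turtle"])]),
           ("VP", [("VT", ["beats"]), ("NP", [("PN", ["achilles"])])])])
print(round(tree_prob(t), 3))  # 1.0 * 0.6 * 0.5 * 0.4 = 0.12
```

Choosing the most probable parse then amounts to computing this score for each candidate tree and taking the maximum.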

                                        Sequences are omni-present

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

                                        Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions. All use the principles of Part I on Graphical Models:
• Regular grammars and Finite State Automata: Markov Models using n-grams, (Hidden) Markov Models, Conditional Random Fields (as an example of using undirected graphical models)
• Probabilistic Context-Free Grammars
• Probabilistic Definite Clause Grammars

                                        • Advanced Artificial Intelligence
                                        • Topic
                                        • Contents
                                        • Rationalism versus Empiricism
                                        • Slide 5
                                        • This course
                                        • Ambiguity
                                        • NLP and Statistics
                                        • Slide 9
                                        • Corpora
                                        • Word Counts
                                        • Slide 12
                                        • Word Counts (Brown corpus)
                                        • Slide 14
                                        • Zipflsquos Law
                                        • Language and sequences
                                        • Key NLP Problem Ambiguity
                                        • Language Model
                                        • Example of bad language model
                                        • A bad language model
                                        • Slide 22
                                        • A good language model
                                        • Why language models
                                        • Applications
                                        • Spelling errors
                                        • Handwriting recognition
                                        • For Spell Checkers
                                        • Another dimension in language models
                                        • Sequence Tagging
                                        • Slide 31
                                        • Parsing
                                        • Slide 33
                                        • Language models based on Grammars
                                        • Grammars and parsing
                                        • Regular Grammars and Finite State Automata
                                        • Finite State Automaton
                                        • Phrase structure
                                        • Notation
                                        • Context Free Grammar
                                        • Slide 41
                                        • Top-down parsing
                                        • Context-free grammar
                                        • Parse tree
                                        • Definite Clause Grammars Non-terminals may have arguments
                                        • DCGs
                                        • Unification in a nutshell (cf AI course)
                                        • Unification
                                        • Parsing with DCGs
                                        • Case Marking
                                        • Slide 51
                                        • Probabilistic Models
                                        • Illustration
                                        • PowerPoint Presentation
                                        • Sequences are omni-present
                                        • Rest of the Course

                                          A bad language model

                                          A good language model

                                          Non-Probabilisticbull ldquoI swear to tell the truthrdquo is possiblebull ldquoI swerve to smell de souprdquo is impossible

                                          Probabilisticbull P(I swear to tell the truth) ~ 0001bull P(I swerve to smell de soup) ~ 0

                                          Why language models

                                          Consider a Shannon Gamebull Predicting the next word in the sequence

                                          Statistical natural language hellip The cat is thrown out of the hellip The large green hellip Sue swallowed the large green hellip hellip

                                          Model at the sentence level

                                          Applications

                                          Spelling correction Mobile phone texting Speech recognition Handwriting recognition Disabled users hellip

                                          Spelling errors

                                          They are leaving in about fifteen minuets to go to her house

                                          The study was conducted mainly be John Black Hopefully all with continue smoothly in my absence Can they lave him my messages I need to notified the bank ofhellip He is trying to fine out

                                          Handwriting recognition

                                          Assume a note is given to a bank teller which the teller reads as I have a gub (cf Woody Allen)

                                          NLP to the rescue hellipbull gub is not a wordbull gun gum Gus and gull are words but gun has a

                                          higher probability in the context of a bank

                                          For Spell Checkers

                                          Collect list of commonly substituted wordsbull piecepeace whetherweather theirthere

                                          ExampleldquoOn Tuesday the whether helliprsquorsquoldquoOn Tuesday the weather helliprdquo

                                          Another dimension in language models

                                          Do we mainly want to infer (probabilities) of legal sentences sequences bull So far

                                          Or do we want to infer properties of these sentences bull Eg parse tree part-of-speech-taggingbullNeeded for understanding NL

                                          Letrsquos look at some tasks

                                          Sequence Tagging

                                          Part-of-speech taggingbull He drives with his bikebull N V PR PN N noun verb preposition pronoun noun

                                          Text extractionbull The job is that of a programmerbull X X X X X X JobTypebull The seminar is taking place from 1500 to 1600bull X X X X X X Start End

                                          Sequence Tagging

                                          Predicting the secondary structure of proteins mRNA hellipbull X = AFARLMMAhellipbull Y = hehestststhesthe hellip

                                          Parsing

                                          Given a sentence find its parse tree Important step in understanding NL

                                          Parsing

                                          In bioinformatics allows to predict (elements of) structure from sequence

                                          Language models based on Grammars

                                          Grammar Typesbull Regular grammars and Finite State Automatabull Context-Free Grammarsbull Definite Clause Grammars

                                          A particular type of Unification Based Grammar (Prolog)

                                          Distinguish lexicon from grammarbull Lexicon (dictionary) contains information about

                                          words eg word - possible tags (and possibly additional information) flies - V(erb) - N(oun)

                                          bull Grammar encode rules

                                          Grammars and parsing Syntactic level best understood and formalized Derivation of grammatical structure parsing

                                          (more than just recognition) Result of parsing mostly parse tree

                                          showing the constituents of a sentence eg verb or noun phrases

                                          Syntax usually specified in terms of a grammar consisting of grammar rules

                                          Regular Grammars and Finite State Automata

                                          Lexical information - which words are bull Det(erminer)bull N(oun)bull Vi (intransitive verb) - no

                                          argumentbull Pn (pronoun) bull Vt (transitive verb) - takes an

                                          argumentbull Adj (adjective)

                                          Now acceptbull The cat sleptbull Det N Vi

                                          As regular grammarbull S -gt [Det] S1 [ ] terminal bull S1 -gt [N] S2bull S2 -gt [Vi]

                                          Lexicon bull The - Detbull Cat - Nbull Slept - Vi

                                          bull hellip

                                          Finite State Automaton

                                          Sentences

                                          bull John smiles - Pn Vibull The cat disappeared - Det N Vibull These new shoes hurt - Det Adj N Vibull John liked the old cat PN Vt Det Adj N

                                          Phrase structure

                                          S

                                          NP

                                          D N

                                          VP

                                          NPV

                                          D N

                                          PP

                                          P NP

                                          D N

                                          the dog chased a cat into the garden

                                          Notation

                                          S sentence D or Det Determiner (eg articles) N noun V verb P preposition NP noun phrase VP verb phrase PP prepositional phrase

                                          Context Free GrammarS -gt NP VPNP -gt D NVP -gt V NPVP -gt V NP PPPP -gt P NPD -gt [the]D -gt [a]N -gt [dog]N -gt [cat]N -gt [garden]V -gt [chased]V -gt [saw]P -gt [into]

                                          Terminals ~ Lexicon

                                          Phrase structure

                                          Formalism of context-free grammarsbull Nonterminal symbols S NP VP bull Terminal symbols dog cat saw the

                                          Recursionbull bdquoThe girl thought the dog chased the catldquo

                                          VP -gt V SN -gt [girl]V -gt [thought]

                                          Top-down parsing

                                          S -gt NP VP S -gt Det N VP S -gt The N VP S -gt The dog VP S -gt The dog V NP S -gt The dog chased NP S -gt The dog chased Det N S-gt The dog chased the N S-gt The dog chased the cat

                                          Context-free grammarSS --gt --gt NPNPVPVP

                                          NPNP --gt PN --gt PN Proper nounProper noun

                                          NPNP --gt Art Adj N--gt Art Adj N

                                          NPNP --gt ArtN--gt ArtN

                                          VPVP --gt VI --gt VI intransitive verbintransitive verb

                                          VPVP --gt VT --gt VT NPNP transitive verbtransitive verb

                                          ArtArt --gt [the]--gt [the]

                                          AdjAdj --gt [lazy]--gt [lazy]

                                          AdjAdj --gt [rapid]--gt [rapid]

                                          PNPN --gt [achilles]--gt [achilles]

                                          NN --gt [turtle]--gt [turtle]

                                          VIVI --gt [sleeps]--gt [sleeps]

                                          VTVT --gt [beats]--gt [beats]

                                          Parse tree

                                          SS

                                          NPNP VPVP

                                          ArtArt AdjAdj NN VtVt NPNP

                                          PNPN

                                          achillesachillesbeatsbeatsturtleturtlerapidrapidthethe

                                          Definite Clause GrammarsNon-terminals may have arguments

                                          SS --gt --gt NPNP((NN))VPVP((NN))

                                          NP(NP(NN)) --gt Art(--gt Art(NN)N()N(NN))

                                          VP(VP(NN)) --gt VI(--gt VI(NN))

                                          Art(Art(singularsingular)) --gt [a]--gt [a]

                                          Art(Art(singularsingular)) --gt [the]--gt [the]

                                          Art(Art(pluralplural)) --gt [the]--gt [the]

                                          N(N(singularsingular)) --gt [turtle]--gt [turtle]

                                          N(N(pluralplural)) --gt [turtles]--gt [turtles]

                                          VI(VI(singularsingular)) --gt [sleeps]--gt [sleeps]

                                          VI(VI(pluralplural)) --gt [sleep]--gt [sleep]

                                          Number Agreement

                                          DCGs

                                          Non-terminals may have argumentsbull Variables (start with capital)

                                          Eg Number Any

                                          bull Constants (start with lower case) Eg singular plural

                                          bull Structured terms (start with lower case and take arguments themselves) Eg vp(VNP)

                                          Parsing needs to be adapted bull Using unification

                                          Unification in a nutshell (cf AI course)

                                          Substitutions

                                          Eg Num singular T vp(VNP)

                                          Applying substitution bull Simultaneously replace variables by

                                          corresponding termsbull S(Num) Num singular = S(singular)

                                          Unification

                                          Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

                                          Gives Num singular

                                          bull Art(singular) and Art(plural) Fails

                                          bull Art(Num1) and Art(Num2) Num1 Num2

                                          bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

                                          Parsing with DCGs

Now require successful unification at each step:

S -> NP(N) VP(N)
S -> Art(N) N(N) VP(N)
S -> a N(singular) VP(singular)     (matching [a] binds N = singular)
S -> a turtle VP(singular)
S -> a turtle sleeps

S -> a turtle sleep: fails
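The derivation above can be mimicked with a small hand-written recogniser that threads the Number argument through the rules the way unification would. The lexicon encoding and function names below are illustrative assumptions, not part of the slides.

```python
# Lexicon: word -> (category, number). None means the number is unconstrained,
# so [the] combines with both singular and plural nouns.
LEX = {
    "a": ("Art", "singular"),
    "the": ("Art", None),
    "turtle": ("N", "singular"),
    "turtles": ("N", "plural"),
    "sleeps": ("VI", "singular"),
    "sleep": ("VI", "plural"),
}

def match(words, i, cat, num):
    """Consume words[i] as category cat; return (i+1, refined number) or None."""
    if i >= len(words) or words[i] not in LEX:
        return None
    c, n = LEX[words[i]]
    if c != cat:
        return None
    if num is None or n is None or num == n:  # the Number features unify
        return (i + 1, n if n is not None else num)
    return None                               # agreement clash: unification fails

def sentence(words):
    """S --> NP(N) VP(N), with NP(N) --> Art(N) N(N) and VP(N) --> VI(N)."""
    state = (0, None)
    for cat in ("Art", "N", "VI"):
        state = match(words, state[0], cat, state[1])
        if state is None:
            return False
    return state[0] == len(words)
```

On this sketch, "a turtle sleeps" is accepted, while "a turtle sleep" fails at the verb, exactly where the derivation above fails.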

                                          Case Marking

PN(singular, nominative) --> [he] ; [she]
PN(singular, accusative) --> [him] ; [her]
PN(plural, nominative) --> [they]
PN(plural, accusative) --> [them]

S --> NP(Number, nominative) VP(Number)
VP(Number) --> V(Number) NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any) --> Det N(Number)

Accepts: He sees her. She sees him. They see her.

But not: Them see he.

                                          DCGs

Are strictly more expressive than CFGs. They can represent, for instance, the language a^n b^n c^n (which is not context-free):

• S(N) --> A(N) B(N) C(N)
• A(0) --> []
• B(0) --> []
• C(0) --> []
• A(s(N)) --> A(N) [a]
• B(s(N)) --> B(N) [b]
• C(s(N)) --> C(N) [c]
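As a quick sanity check on the grammar above: the shared argument N forces the three blocks to have equal length, so the generated language is a^n b^n c^n, the textbook example of a non-context-free language. The sketch below just states that language directly rather than simulating the DCG.

```python
def generate(n):
    """The unique string derived from S with n nested s(...) applications."""
    return "a" * n + "b" * n + "c" * n

def generates(s):
    """Recognise the language a^n b^n c^n produced by the DCG above."""
    n = len(s) // 3
    return len(s) == 3 * n and s == generate(n)
```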

                                          Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• Define a probability model for the data
• Compute the probability of each alternative
• Choose the most likely alternative

Illustrated on:
• the Shannon game
• Spelling correction
• Parsing

                                          Illustration

Wall Street Journal Corpus: 3,000,000 words, with the correct parse tree for each sentence known:
• Constructed by hand
• Can be used to derive stochastic context-free grammars
• SCFGs assign a probability to each parse tree

Compute the most probable parse tree
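Under an SCFG, the probability of a parse tree is the product of the probabilities of the rules used in it. The toy grammar and rule probabilities below are invented for illustration; they are not estimated from the WSJ corpus, and lexical rules are simplified to probability 1.

```python
from math import prod

# P(rule | left-hand side); rules with the same LHS sum to 1.
RULE_P = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("Det", "N")): 0.6,
    ("NP", ("PN",)): 0.4,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("V",)): 0.3,
}

def tree_prob(tree):
    """tree = (label, [children]); a preterminal's single child is the word itself."""
    label, children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return 1.0                       # lexical rule, simplified to probability 1
    rhs = tuple(c[0] for c in children)  # labels of the children
    return RULE_P[(label, rhs)] * prod(tree_prob(c) for c in children)
```

For example, the tree for "she sees the turtle" (S -> NP VP, NP -> PN, VP -> V NP, NP -> Det N) gets probability 1.0 x 0.4 x 0.7 x 0.6 = 0.168; picking the most probable parse means maximising this product over all trees for the sentence.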

                                          Sequences are omni-present

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

                                          Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
• Regular grammars and Finite State Automata:
  Markov Models using n-grams
  (Hidden) Markov Models
  Conditional Random Fields (as an example of using undirected graphical models)
• Probabilistic Context-Free Grammars
• Probabilistic Definite Clause Grammars

All use the principles of Part I on Graphical Models.

                                          • Advanced Artificial Intelligence
                                          • Topic
                                          • Contents
                                          • Rationalism versus Empiricism
                                          • Slide 5
                                          • This course
                                          • Ambiguity
                                          • NLP and Statistics
                                          • Slide 9
                                          • Corpora
                                          • Word Counts
                                          • Slide 12
                                          • Word Counts (Brown corpus)
                                          • Slide 14
                                          • Zipf's Law
                                          • Language and sequences
                                          • Key NLP Problem Ambiguity
                                          • Language Model
                                          • Example of bad language model
                                          • A bad language model
                                          • Slide 22
                                          • A good language model
                                          • Why language models
                                          • Applications
                                          • Spelling errors
                                          • Handwriting recognition
                                          • For Spell Checkers
                                          • Another dimension in language models
                                          • Sequence Tagging
                                          • Slide 31
                                          • Parsing
                                          • Slide 33
                                          • Language models based on Grammars
                                          • Grammars and parsing
                                          • Regular Grammars and Finite State Automata
                                          • Finite State Automaton
                                          • Phrase structure
                                          • Notation
                                          • Context Free Grammar
                                          • Slide 41
                                          • Top-down parsing
                                          • Context-free grammar
                                          • Parse tree
                                          • Definite Clause Grammars Non-terminals may have arguments
                                          • DCGs
                                          • Unification in a nutshell (cf AI course)
                                          • Unification
                                          • Parsing with DCGs
                                          • Case Marking
                                          • Slide 51
                                          • Probabilistic Models
                                          • Illustration
                                          • PowerPoint Presentation
                                          • Sequences are omni-present
                                          • Rest of the Course

                                            N(N(singularsingular)) --gt [turtle]--gt [turtle]

                                            N(N(pluralplural)) --gt [turtles]--gt [turtles]

                                            VI(VI(singularsingular)) --gt [sleeps]--gt [sleeps]

                                            VI(VI(pluralplural)) --gt [sleep]--gt [sleep]

                                            Number Agreement

                                            DCGs

                                            Non-terminals may have argumentsbull Variables (start with capital)

                                            Eg Number Any

                                            bull Constants (start with lower case) Eg singular plural

                                            bull Structured terms (start with lower case and take arguments themselves) Eg vp(VNP)

                                            Parsing needs to be adapted bull Using unification

                                            Unification in a nutshell (cf AI course)

                                            Substitutions

                                            Eg Num singular T vp(VNP)

                                            Applying substitution bull Simultaneously replace variables by

                                            corresponding termsbull S(Num) Num singular = S(singular)

                                            Unification

                                            Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

                                            Gives Num singular

                                            bull Art(singular) and Art(plural) Fails

                                            bull Art(Num1) and Art(Num2) Num1 Num2

                                            bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

                                            Parsing with DCGs

                                            Now require successful unification at each step

                                            S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

                                            S-gt a turtle sleep fails

                                            Case Marking

                                            PN(singularnominative)PN(singularnominative) --gt --gt [he][she][he][she]

                                            PN(singularaccusative)PN(singularaccusative) --gt --gt [him][her][him][her]

                                            PN(pluralnominative)PN(pluralnominative) --gt --gt [they][they]

                                            PN(pluralaccusative)PN(pluralaccusative) --gt --gt [them][them]

                                            S --gt NP(Numbernominative) NP(Number)S --gt NP(Numbernominative) NP(Number)

                                            VP(Number) --gt V(Number) VP(Anyaccusative)VP(Number) --gt V(Number) VP(Anyaccusative)

                                            VP(NumberCase) --gt PN(NumberCase)VP(NumberCase) --gt PN(NumberCase)

                                            VP(NumberAny) --gt Det N(Number)VP(NumberAny) --gt Det N(Number)

                                            He sees her She sees him They see her

                                            But not Them see he

                                            DCGs

                                            Are strictly more expressive than CFGs Can represent for instance

                                            bull S(N) -gt A(N) B(N) C(N)bull A(0) -gt [] bull B(0) -gt []bull C(0) -gt []bull A(s(N)) -gt A(N) [A]bull B(s(N)) -gt B(N) [B]bull C(s(N)) -gt C(N) [C]

                                            Probabilistic Models

                                            Traditional grammar models are very rigid bull essentially a yes no decision

                                            Probabilistic grammarsbull Define a probability models for the databull Compute the probability of each alternativebull Choose the most likely alternative

                                            Ilustrate on bull Shannon Gamebull Spelling correctionbull Parsing

                                            Illustration

                                            Wall Street Journal Corpus 3 000 000 words Correct parse tree for sentences known

                                            bull Constructed by handbull Can be used to derive stochastic context free

                                            grammarsbull SCFG assign probability to parse trees

                                            Compute the most probable parse tree

                                            Sequences are omni-present

                                            Therefore the techniques we will see also apply tobull Bioinformatics

                                            DNA proteins mRNA hellip can all be represented as strings

                                            bullRobotics Sequences of actions states hellip

                                            bullhellip

                                            Rest of the Course

                                            Limitations traditional grammar models motivate probabilistic extensions bull Regular grammars and Finite State Automata

                                            All use principles of Part I on Graphical Models Markov Models using n-gramms (Hidden) Markov Models Conditional Random Fields

                                            bull As an example of using undirected graphical models

                                            bull Probabilistic Context Free Grammarsbull Probabilistic Definite Clause Grammars

                                            • Advanced Artificial Intelligence
                                            • Topic
                                            • Contents
                                            • Rationalism versus Empiricism
                                            • Slide 5
                                            • This course
                                            • Ambiguity
                                            • NLP and Statistics
                                            • Slide 9
                                            • Corpora
                                            • Word Counts
                                            • Slide 12
                                            • Word Counts (Brown corpus)
                                            • Slide 14
                                            • Zipflsquos Law
                                            • Language and sequences
                                            • Key NLP Problem Ambiguity
                                            • Language Model
                                            • Example of bad language model
                                            • A bad language model
                                            • Slide 22
                                            • A good language model
                                            • Why language models
                                            • Applications
                                            • Spelling errors
                                            • Handwriting recognition
                                            • For Spell Checkers
                                            • Another dimension in language models
                                            • Sequence Tagging
                                            • Slide 31
                                            • Parsing
                                            • Slide 33
                                            • Language models based on Grammars
                                            • Grammars and parsing
                                            • Regular Grammars and Finite State Automata
                                            • Finite State Automaton
                                            • Phrase structure
                                            • Notation
                                            • Context Free Grammar
                                            • Slide 41
                                            • Top-down parsing
                                            • Context-free grammar
                                            • Parse tree
                                            • Definite Clause Grammars Non-terminals may have arguments
                                            • DCGs
                                            • Unification in a nutshell (cf AI course)
                                            • Unification
                                            • Parsing with DCGs
                                            • Case Marking
                                            • Slide 51
                                            • Probabilistic Models
                                            • Illustration
                                            • PowerPoint Presentation
                                            • Sequences are omni-present
                                            • Rest of the Course

                                              Why language models

                                              Consider a Shannon Gamebull Predicting the next word in the sequence

                                              Statistical natural language hellip The cat is thrown out of the hellip The large green hellip Sue swallowed the large green hellip hellip

                                              Model at the sentence level

                                              Applications

                                              Spelling correction Mobile phone texting Speech recognition Handwriting recognition Disabled users hellip

                                              Spelling errors

                                              They are leaving in about fifteen minuets to go to her house

                                              The study was conducted mainly be John Black Hopefully all with continue smoothly in my absence Can they lave him my messages I need to notified the bank ofhellip He is trying to fine out

                                              Handwriting recognition

                                              Assume a note is given to a bank teller which the teller reads as I have a gub (cf Woody Allen)

                                              NLP to the rescue hellipbull gub is not a wordbull gun gum Gus and gull are words but gun has a

                                              higher probability in the context of a bank

                                              For Spell Checkers

                                              Collect list of commonly substituted wordsbull piecepeace whetherweather theirthere

                                              ExampleldquoOn Tuesday the whether helliprsquorsquoldquoOn Tuesday the weather helliprdquo

                                              Another dimension in language models

                                              Do we mainly want to infer (probabilities) of legal sentences sequences bull So far

                                              Or do we want to infer properties of these sentences bull Eg parse tree part-of-speech-taggingbullNeeded for understanding NL

                                              Letrsquos look at some tasks

                                              Sequence Tagging

                                              Part-of-speech taggingbull He drives with his bikebull N V PR PN N noun verb preposition pronoun noun

                                              Text extractionbull The job is that of a programmerbull X X X X X X JobTypebull The seminar is taking place from 1500 to 1600bull X X X X X X Start End

                                              Sequence Tagging

                                              Predicting the secondary structure of proteins mRNA hellipbull X = AFARLMMAhellipbull Y = hehestststhesthe hellip

                                              Parsing

                                              Given a sentence find its parse tree Important step in understanding NL

                                              Parsing

                                              In bioinformatics allows to predict (elements of) structure from sequence

                                              Language models based on Grammars

                                              Grammar Typesbull Regular grammars and Finite State Automatabull Context-Free Grammarsbull Definite Clause Grammars

                                              A particular type of Unification Based Grammar (Prolog)

                                              Distinguish lexicon from grammarbull Lexicon (dictionary) contains information about

                                              words eg word - possible tags (and possibly additional information) flies - V(erb) - N(oun)

                                              bull Grammar encode rules

                                              Grammars and parsing Syntactic level best understood and formalized Derivation of grammatical structure parsing

                                              (more than just recognition) Result of parsing mostly parse tree

                                              showing the constituents of a sentence eg verb or noun phrases

                                              Syntax usually specified in terms of a grammar consisting of grammar rules

                                              Regular Grammars and Finite State Automata

                                              Lexical information - which words are bull Det(erminer)bull N(oun)bull Vi (intransitive verb) - no

                                              argumentbull Pn (pronoun) bull Vt (transitive verb) - takes an

                                              argumentbull Adj (adjective)

                                              Now acceptbull The cat sleptbull Det N Vi

                                              As regular grammarbull S -gt [Det] S1 [ ] terminal bull S1 -gt [N] S2bull S2 -gt [Vi]

                                              Lexicon bull The - Detbull Cat - Nbull Slept - Vi

                                              bull hellip

                                              Finite State Automaton

                                              Sentences

                                              bull John smiles - Pn Vibull The cat disappeared - Det N Vibull These new shoes hurt - Det Adj N Vibull John liked the old cat PN Vt Det Adj N

Phrase structure

The parse tree for "the dog chased a cat into the garden", shown as a bracketed structure:

[S [NP [D the] [N dog]]
   [VP [V chased]
       [NP [D a] [N cat]]
       [PP [P into] [NP [D the] [N garden]]]]]

Notation

S: sentence; D or Det: determiner (e.g. articles); N: noun; V: verb; P: preposition; NP: noun phrase; VP: verb phrase; PP: prepositional phrase

Context-Free Grammar

S --> NP VP
NP --> D N
VP --> V NP
VP --> V NP PP
PP --> P NP
D --> [the]
D --> [a]
N --> [dog]
N --> [cat]
N --> [garden]
V --> [chased]
V --> [saw]
P --> [into]

Terminals ~ lexicon

Phrase structure

Formalism of context-free grammars:
• Nonterminal symbols: S, NP, VP, …
• Terminal symbols: dog, cat, saw, the, …

Recursion:
• "The girl thought the dog chased the cat"

VP --> V S
N --> [girl]
V --> [thought]

Top-down parsing

S => NP VP
  => Det N VP
  => The N VP
  => The dog VP
  => The dog V NP
  => The dog chased NP
  => The dog chased Det N
  => The dog chased the N
  => The dog chased the cat
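This derivation can be carried out mechanically. Below is a minimal top-down (recursive descent) recognizer in Python for the toy grammar above; the dictionary encoding of the rules is an illustrative assumption, not the course's implementation:

```python
# Top-down recognition with the toy CFG from the slides.
# Nonterminals are dictionary keys; everything else is a terminal word.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP", "PP"], ["V", "NP"]],
    "PP": [["P", "NP"]],
    "D":  [["the"], ["a"]],
    "N":  [["dog"], ["cat"], ["garden"]],
    "V":  [["chased"], ["saw"]],
    "P":  [["into"]],
}

def parse(symbols, words):
    """Try to rewrite `symbols` into exactly `words`, top-down, left to right."""
    if not symbols:
        return not words          # success iff all words are consumed
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:           # nonterminal: try each rule in turn
        return any(parse(body + rest, words) for body in GRAMMAR[head])
    # terminal: must match the next word
    return bool(words) and words[0] == head and parse(rest, words[1:])

print(parse(["S"], "the dog chased the cat".split()))                # True
print(parse(["S"], "the dog chased a cat into the garden".split()))  # True
```

Since the grammar has no left recursion, the search always terminates.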

Context-free grammar

S --> NP VP
NP --> PN          (proper noun)
NP --> Art Adj N
NP --> Art N
VP --> VI          (intransitive verb)
VP --> VT NP       (transitive verb)
Art --> [the]
Adj --> [lazy]
Adj --> [rapid]
PN --> [achilles]
N --> [turtle]
VI --> [sleeps]
VT --> [beats]

Parse tree

The parse tree for "the rapid turtle beats achilles":

[S [NP [Art the] [Adj rapid] [N turtle]]
   [VP [VT beats] [NP [PN achilles]]]]

Definite Clause Grammars: non-terminals may have arguments

S --> NP(N), VP(N)
NP(N) --> Art(N), N(N)
VP(N) --> VI(N)
Art(singular) --> [a]
Art(singular) --> [the]
Art(plural) --> [the]
N(singular) --> [turtle]
N(plural) --> [turtles]
VI(singular) --> [sleeps]
VI(plural) --> [sleep]

This enforces number agreement.

DCGs

Non-terminals may have arguments:
• Variables (start with a capital), e.g. Number, Any
• Constants (start with lower case), e.g. singular, plural
• Structured terms (start with lower case and take arguments themselves), e.g. vp(V, NP)

Parsing needs to be adapted:
• using unification

Unification in a nutshell (cf. AI course)

Substitutions, e.g.:
• {Num/singular}
• {T/vp(V, NP)}

Applying a substitution:
• simultaneously replace variables by the corresponding terms
• S(Num){Num/singular} = S(singular)

Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.:
• Art(singular) and Art(Num): gives {Num/singular}
• Art(singular) and Art(plural): fails
• Art(Num1) and Art(Num2): gives {Num1/Num2}
• PN(Num, accusative) and PN(singular, Case): gives {Num/singular, Case/accusative}
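A minimal sketch of such a unification procedure in Python, covering exactly the flat argument lists used above. Following the slides' convention, names starting with a capital are variables; structured terms and the occurs-check are omitted for brevity:

```python
# Unification of non-terminal argument lists.
# Variables are strings starting with an uppercase letter (e.g. "Num");
# everything else is a constant.
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings until a constant or unbound variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(args1, args2):
    """Return the most general substitution unifying the two argument
    lists, or None if unification fails."""
    if len(args1) != len(args2):
        return None
    subst = {}
    for a, b in zip(args1, args2):
        a, b = walk(a, subst), walk(b, subst)
        if a == b:
            continue
        if is_var(a):
            subst[a] = b
        elif is_var(b):
            subst[b] = a
        else:
            return None        # two different constants: fail
    return subst

print(unify(["singular"], ["Num"]))    # {'Num': 'singular'}
print(unify(["singular"], ["plural"])) # None
print(unify(["Num", "accusative"], ["singular", "Case"]))
# {'Num': 'singular', 'Case': 'accusative'}
```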

Parsing with DCGs

Parsing now requires a successful unification at each step:

S => NP(N) VP(N)
  => Art(N) N(N) VP(N)      {N/singular}
  => a N(singular) VP(singular)
  => a turtle VP(singular)
  => a turtle sleeps

"a turtle sleep" fails.

Case Marking

PN(singular, nominative) --> [he]; [she]
PN(singular, accusative) --> [him]; [her]
PN(plural, nominative) --> [they]
PN(plural, accusative) --> [them]

S --> NP(Number, nominative), VP(Number)
VP(Number) --> V(Number), NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any) --> Det, N(Number)

Accepts "He sees her", "She sees him", "They see her", but not "Them see he".

DCGs

DCGs are strictly more expressive than CFGs. They can represent, for instance, the language a^n b^n c^n, which is not context-free:

S(N) --> A(N), B(N), C(N)
A(0) --> []
B(0) --> []
C(0) --> []
A(s(N)) --> A(N), [a]
B(s(N)) --> B(N), [b]
C(s(N)) --> C(N), [c]
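For comparison, here is a direct Python recognizer for the language a^n b^n c^n that this DCG generates; it is a hypothetical helper mirroring the counting argument, not the DCG machinery itself:

```python
# Recognizer for a^n b^n c^n (n >= 0), the language generated by the
# DCG above. Counting the leading a's plays the role of the argument N.
def count_prefix(s, ch):
    """Length of the leading run of `ch` in s (mirrors A(s(N)) --> A(N), [a])."""
    n = 0
    while n < len(s) and s[n] == ch:
        n += 1
    return n

def accepts_anbncn(s):
    n = count_prefix(s, "a")
    return s == "a" * n + "b" * n + "c" * n

print(accepts_anbncn("aabbcc"))  # True
print(accepts_anbncn("aabbc"))   # False
```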

Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• define a probability model for the data
• compute the probability of each alternative
• choose the most likely alternative

Illustrated on:
• the Shannon game
• spelling correction
• parsing

Illustration

Wall Street Journal corpus: 3,000,000 words, with the correct parse tree for each sentence known:
• constructed by hand
• can be used to derive stochastic context-free grammars
• SCFGs assign a probability to each parse tree

Compute the most probable parse tree.
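As a sketch of how an SCFG scores a tree: the probability of a parse tree is the product of the probabilities of the rules used in it. The rule probabilities below are invented for illustration, not estimated from any corpus:

```python
# Probability of a parse tree under a stochastic CFG: the product of
# the probabilities of all rules used. Probabilities are invented.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("D", "N")): 0.6,
    ("NP", ("PN",)): 0.4,
    ("VP", ("V", "NP")): 0.7,
    ("D", ("the",)): 0.8,
    ("N", ("dog",)): 0.3,
    ("N", ("cat",)): 0.2,
    ("V", ("chased",)): 0.5,
}

def tree_prob(tree):
    """tree = (label, children); a preterminal has its word as children."""
    label, children = tree
    if isinstance(children, str):            # preterminal -> word
        return RULE_PROB[(label, (children,))]
    p = RULE_PROB[(label, tuple(c[0] for c in children))]
    for child in children:
        p *= tree_prob(child)
    return p

t = ("S", [("NP", [("D", "the"), ("N", "dog")]),
           ("VP", [("V", "chased"),
                   ("NP", [("D", "the"), ("N", "cat")])])])
print(tree_prob(t))
```

When a sentence has several parse trees, the most probable parse is simply the tree with the highest such product.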

Sequences are omni-present

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
• Regular grammars and finite state automata: Markov models using n-grams, (hidden) Markov models, conditional random fields (as an example of using undirected graphical models)
• Probabilistic context-free grammars
• Probabilistic definite clause grammars

All use the principles of Part I on graphical models.


Applications

Spelling correction, mobile phone texting, speech recognition, handwriting recognition, disabled users, …

Spelling errors

• They are leaving in about fifteen minuets to go to her house.
• The study was conducted mainly be John Black.
• Hopefully, all with continue smoothly in my absence.
• Can they lave him my messages?
• I need to notified the bank of…
• He is trying to fine out.

Handwriting recognition

Assume a note is given to a bank teller, which the teller reads as "I have a gub" (cf. Woody Allen).

NLP to the rescue:
• "gub" is not a word
• gun, gum, Gus and gull are words, but gun has a higher probability in the context of a bank

For Spell Checkers

Collect a list of commonly substituted words:
• piece/peace, whether/weather, their/there, …

Example:
"On Tuesday, the whether …"
"On Tuesday, the weather …"
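A sketch of the candidate-generation idea in Python: produce all words one edit away from the misspelling and pick the one with the highest corpus count. The lexicon and its counts below are invented for illustration (cf. the "gub" example above):

```python
# Spelling correction sketch: candidates at edit distance 1, ranked by
# (invented, illustrative) corpus frequency.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"
LEXICON = {"gun": 50, "gum": 30, "gull": 10, "gus": 5}  # toy counts

def edits1(word):
    """All strings at edit distance 1: deletions, substitutions, insertions."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {l + r[1:] for l, r in splits if r}
    substitutes = {l + c + r[1:] for l, r in splits if r for c in ALPHABET}
    inserts = {l + c + r for l, r in splits for c in ALPHABET}
    return deletes | substitutes | inserts

def best_correction(word):
    """Return the known word at edit distance 1 with the highest count."""
    candidates = [w for w in edits1(word) if w in LEXICON]
    return max(candidates, key=LEXICON.get) if candidates else None

print(best_correction("gub"))   # gun
```

A real spell checker would condition on context (e.g. "bank") rather than use raw unigram counts; that is exactly where the language models of this course come in.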

Another dimension in language models

Do we mainly want to infer (probabilities of) legal sentences/sequences, as so far? Or do we want to infer properties of these sentences, e.g. the parse tree or part-of-speech tags, as needed for understanding natural language? Let's look at some tasks.

Sequence Tagging

Part-of-speech tagging:
• He drives with his bike
• N V PR PN N (noun, verb, preposition, pronoun, noun)

Text extraction:
• The job is that of a programmer
• X X X X X X JobType
• The seminar is taking place from 15:00 to 16:00
• X X X X X X Start End

Predicting the secondary structure of proteins, mRNA, …:
• X = AFARLMMA…
• Y = he he st st st he st he …

Parsing

Given a sentence, find its parse tree: an important step in understanding natural language.

In bioinformatics, parsing allows one to predict (elements of) structure from sequence.

Language models based on Grammars

Grammar types:
• Regular grammars and finite state automata
• Context-free grammars
• Definite clause grammars

                                                A particular type of Unification Based Grammar (Prolog)

                                                Distinguish lexicon from grammarbull Lexicon (dictionary) contains information about

                                                words eg word - possible tags (and possibly additional information) flies - V(erb) - N(oun)

                                                bull Grammar encode rules

                                                Grammars and parsing Syntactic level best understood and formalized Derivation of grammatical structure parsing

                                                (more than just recognition) Result of parsing mostly parse tree

                                                showing the constituents of a sentence eg verb or noun phrases

                                                Syntax usually specified in terms of a grammar consisting of grammar rules

                                                Regular Grammars and Finite State Automata

                                                Lexical information - which words are bull Det(erminer)bull N(oun)bull Vi (intransitive verb) - no

                                                argumentbull Pn (pronoun) bull Vt (transitive verb) - takes an

                                                argumentbull Adj (adjective)

                                                Now acceptbull The cat sleptbull Det N Vi

                                                As regular grammarbull S -gt [Det] S1 [ ] terminal bull S1 -gt [N] S2bull S2 -gt [Vi]

                                                Lexicon bull The - Detbull Cat - Nbull Slept - Vi

                                                bull hellip

                                                Finite State Automaton

                                                Sentences

                                                bull John smiles - Pn Vibull The cat disappeared - Det N Vibull These new shoes hurt - Det Adj N Vibull John liked the old cat PN Vt Det Adj N

                                                Phrase structure

                                                S

                                                NP

                                                D N

                                                VP

                                                NPV

                                                D N

                                                PP

                                                P NP

                                                D N

                                                the dog chased a cat into the garden

                                                Notation

                                                S sentence D or Det Determiner (eg articles) N noun V verb P preposition NP noun phrase VP verb phrase PP prepositional phrase

                                                Context Free GrammarS -gt NP VPNP -gt D NVP -gt V NPVP -gt V NP PPPP -gt P NPD -gt [the]D -gt [a]N -gt [dog]N -gt [cat]N -gt [garden]V -gt [chased]V -gt [saw]P -gt [into]

                                                Terminals ~ Lexicon

                                                Phrase structure

                                                Formalism of context-free grammarsbull Nonterminal symbols S NP VP bull Terminal symbols dog cat saw the

                                                Recursionbull bdquoThe girl thought the dog chased the catldquo

                                                VP -gt V SN -gt [girl]V -gt [thought]

                                                Top-down parsing

                                                S -gt NP VP S -gt Det N VP S -gt The N VP S -gt The dog VP S -gt The dog V NP S -gt The dog chased NP S -gt The dog chased Det N S-gt The dog chased the N S-gt The dog chased the cat

                                                Context-free grammarSS --gt --gt NPNPVPVP

                                                NPNP --gt PN --gt PN Proper nounProper noun

                                                NPNP --gt Art Adj N--gt Art Adj N

                                                NPNP --gt ArtN--gt ArtN

                                                VPVP --gt VI --gt VI intransitive verbintransitive verb

                                                VPVP --gt VT --gt VT NPNP transitive verbtransitive verb

                                                ArtArt --gt [the]--gt [the]

                                                AdjAdj --gt [lazy]--gt [lazy]

                                                AdjAdj --gt [rapid]--gt [rapid]

                                                PNPN --gt [achilles]--gt [achilles]

                                                NN --gt [turtle]--gt [turtle]

                                                VIVI --gt [sleeps]--gt [sleeps]

                                                VTVT --gt [beats]--gt [beats]

                                                Parse tree

                                                SS

                                                NPNP VPVP

                                                ArtArt AdjAdj NN VtVt NPNP

                                                PNPN

                                                achillesachillesbeatsbeatsturtleturtlerapidrapidthethe

                                                Definite Clause GrammarsNon-terminals may have arguments

                                                SS --gt --gt NPNP((NN))VPVP((NN))

                                                NP(NP(NN)) --gt Art(--gt Art(NN)N()N(NN))

                                                VP(VP(NN)) --gt VI(--gt VI(NN))

                                                Art(Art(singularsingular)) --gt [a]--gt [a]

                                                Art(Art(singularsingular)) --gt [the]--gt [the]

                                                Art(Art(pluralplural)) --gt [the]--gt [the]

                                                N(N(singularsingular)) --gt [turtle]--gt [turtle]

                                                N(N(pluralplural)) --gt [turtles]--gt [turtles]

                                                VI(VI(singularsingular)) --gt [sleeps]--gt [sleeps]

                                                VI(VI(pluralplural)) --gt [sleep]--gt [sleep]

                                                Number Agreement

                                                DCGs

                                                Non-terminals may have argumentsbull Variables (start with capital)

                                                Eg Number Any

                                                bull Constants (start with lower case) Eg singular plural

                                                bull Structured terms (start with lower case and take arguments themselves) Eg vp(VNP)

                                                Parsing needs to be adapted bull Using unification

                                                Unification in a nutshell (cf AI course)

                                                Substitutions

                                                Eg Num singular T vp(VNP)

                                                Applying substitution bull Simultaneously replace variables by

                                                corresponding termsbull S(Num) Num singular = S(singular)

                                                Unification

                                                Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

                                                Gives Num singular

                                                bull Art(singular) and Art(plural) Fails

                                                bull Art(Num1) and Art(Num2) Num1 Num2

                                                bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

                                                Parsing with DCGs

                                                Now require successful unification at each step

                                                S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

                                                S-gt a turtle sleep fails

                                                Case Marking

                                                PN(singularnominative)PN(singularnominative) --gt --gt [he][she][he][she]

                                                PN(singularaccusative)PN(singularaccusative) --gt --gt [him][her][him][her]

                                                PN(pluralnominative)PN(pluralnominative) --gt --gt [they][they]

                                                PN(pluralaccusative)PN(pluralaccusative) --gt --gt [them][them]

                                                S --gt NP(Numbernominative) NP(Number)S --gt NP(Numbernominative) NP(Number)

                                                VP(Number) --gt V(Number) VP(Anyaccusative)VP(Number) --gt V(Number) VP(Anyaccusative)

                                                VP(NumberCase) --gt PN(NumberCase)VP(NumberCase) --gt PN(NumberCase)

                                                VP(NumberAny) --gt Det N(Number)VP(NumberAny) --gt Det N(Number)

                                                He sees her She sees him They see her

                                                But not Them see he

                                                DCGs

                                                Are strictly more expressive than CFGs Can represent for instance

                                                bull S(N) -gt A(N) B(N) C(N)bull A(0) -gt [] bull B(0) -gt []bull C(0) -gt []bull A(s(N)) -gt A(N) [A]bull B(s(N)) -gt B(N) [B]bull C(s(N)) -gt C(N) [C]

                                                Probabilistic Models

                                                Traditional grammar models are very rigid bull essentially a yes no decision

                                                Probabilistic grammarsbull Define a probability models for the databull Compute the probability of each alternativebull Choose the most likely alternative

                                                Ilustrate on bull Shannon Gamebull Spelling correctionbull Parsing

                                                Illustration

                                                Wall Street Journal Corpus 3 000 000 words Correct parse tree for sentences known

                                                bull Constructed by handbull Can be used to derive stochastic context free

                                                grammarsbull SCFG assign probability to parse trees

                                                Compute the most probable parse tree

                                                Sequences are omni-present

                                                Therefore the techniques we will see also apply tobull Bioinformatics

                                                DNA proteins mRNA hellip can all be represented as strings

                                                bullRobotics Sequences of actions states hellip

                                                bullhellip

                                                Rest of the Course

                                                Limitations traditional grammar models motivate probabilistic extensions bull Regular grammars and Finite State Automata

                                                All use principles of Part I on Graphical Models Markov Models using n-gramms (Hidden) Markov Models Conditional Random Fields

                                                bull As an example of using undirected graphical models

                                                bull Probabilistic Context Free Grammarsbull Probabilistic Definite Clause Grammars

                                                • Advanced Artificial Intelligence
                                                • Topic
                                                • Contents
                                                • Rationalism versus Empiricism
                                                • Slide 5
                                                • This course
                                                • Ambiguity
                                                • NLP and Statistics
                                                • Slide 9
                                                • Corpora
                                                • Word Counts
                                                • Slide 12
                                                • Word Counts (Brown corpus)
                                                • Slide 14
                                                • Zipflsquos Law
                                                • Language and sequences
                                                • Key NLP Problem Ambiguity
                                                • Language Model
                                                • Example of bad language model
                                                • A bad language model
                                                • Slide 22
                                                • A good language model
                                                • Why language models
                                                • Applications
                                                • Spelling errors
                                                • Handwriting recognition
                                                • For Spell Checkers
                                                • Another dimension in language models
                                                • Sequence Tagging
                                                • Slide 31
                                                • Parsing
                                                • Slide 33
                                                • Language models based on Grammars
                                                • Grammars and parsing
                                                • Regular Grammars and Finite State Automata
                                                • Finite State Automaton
                                                • Phrase structure
                                                • Notation
                                                • Context Free Grammar
                                                • Slide 41
                                                • Top-down parsing
                                                • Context-free grammar
                                                • Parse tree
                                                • Definite Clause Grammars Non-terminals may have arguments
                                                • DCGs
                                                • Unification in a nutshell (cf AI course)
                                                • Unification
                                                • Parsing with DCGs
                                                • Case Marking
                                                • Slide 51
                                                • Probabilistic Models
                                                • Illustration
                                                • PowerPoint Presentation
                                                • Sequences are omni-present
                                                • Rest of the Course

                                                  Spelling errors

                                                  They are leaving in about fifteen minuets to go to her house

                                                  The study was conducted mainly be John Black. Hopefully, all with continue smoothly in my absence. Can they lave him my messages? I need to notified the bank of… He is trying to fine out…

                                                  Handwriting recognition

                                                  Assume a note is given to a bank teller, which the teller reads as "I have a gub" (cf. Woody Allen).

                                                  NLP to the rescue …
                                                  • "gub" is not a word
                                                  • gun, gum, Gus, and gull are words, but "gun" has a higher probability in the context of a bank

                                                  For Spell Checkers

                                                  Collect a list of commonly substituted words:
                                                  • piece/peace, whether/weather, their/there

                                                  Example:
                                                  "On Tuesday, the whether …" → "On Tuesday, the weather …"
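A confusion-set spell checker of this kind can be sketched in a few lines. This is a toy illustration only: the confusion sets and bigram counts below are invented, where a real checker would estimate them from a large corpus.

```python
# Toy confusion-set spell correction: pick the member of a confusion set
# that is most frequent after the preceding word. All counts are invented.
from collections import Counter

CONFUSION_SETS = [{"whether", "weather"}, {"piece", "peace"}, {"their", "there"}]

# Invented bigram counts standing in for corpus statistics.
BIGRAM_COUNTS = Counter({
    ("the", "weather"): 120,
    ("the", "whether"): 1,
})

def correct(prev_word, word):
    """Replace `word` by the member of its confusion set that occurs
    most often after `prev_word`; leave all other words unchanged."""
    for cset in CONFUSION_SETS:
        if word in cset:
            return max(cset, key=lambda w: BIGRAM_COUNTS[(prev_word, w)])
    return word

print(correct("the", "whether"))  # -> weather
```

The same idea, with n-gram probabilities instead of raw counts, underlies the language models discussed above.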

                                                  Another dimension in language models

                                                  Do we mainly want to infer (probabilities of) legal sentences / sequences?
                                                  • So far, this is what we have done.

                                                  Or do we want to infer properties of these sentences?
                                                  • e.g., parse trees, part-of-speech tagging
                                                  • needed for understanding NL

                                                  Let's look at some tasks.

                                                  Sequence Tagging

                                                  Part-of-speech tagging:
                                                  • He drives with his bike
                                                  • N V PR PN N (noun, verb, preposition, pronoun, noun)

                                                  Text extraction:
                                                  • The job is that of a programmer
                                                  • X X X X X X JobType
                                                  • The seminar is taking place from 15:00 to 16:00
                                                  • X X X X X X Start End

                                                  Sequence Tagging

                                                  Predicting the secondary structure of proteins, mRNA, …
                                                  • X = AFARLMMA…
                                                  • Y = hehestststhesthe…

                                                  Parsing

                                                  Given a sentence, find its parse tree. An important step in understanding NL.

                                                  Parsing

                                                  In bioinformatics, parsing allows one to predict (elements of) structure from a sequence.

                                                  Language models based on Grammars

                                                  Grammar Types:
                                                  • Regular grammars and Finite State Automata
                                                  • Context-Free Grammars
                                                  • Definite Clause Grammars: a particular type of Unification Based Grammar (Prolog)

                                                  Distinguish the lexicon from the grammar:
                                                  • Lexicon (dictionary): contains information about words, e.g., word → possible tags (and possibly additional information); flies → V(erb) or N(oun)
                                                  • Grammar: encodes the rules

                                                  Grammars and parsing

                                                  • The syntactic level is the best understood and formalized.
                                                  • Derivation of grammatical structure: parsing (more than just recognition).
                                                  • The result of parsing is mostly a parse tree, showing the constituents of a sentence, e.g., verb or noun phrases.
                                                  • Syntax is usually specified in terms of a grammar consisting of grammar rules.

                                                  Regular Grammars and Finite State Automata

                                                  Lexical information: which words are …
                                                  • Det(erminer)
                                                  • N(oun)
                                                  • Vi (intransitive verb): takes no argument
                                                  • Pn (pronoun)
                                                  • Vt (transitive verb): takes an argument
                                                  • Adj (adjective)

                                                  Now accept:
                                                  • The cat slept
                                                  • Det N Vi

                                                  As a regular grammar ([ ] marks a terminal):
                                                  • S -> [Det] S1
                                                  • S1 -> [N] S2
                                                  • S2 -> [Vi]

                                                  Lexicon:
                                                  • the → Det
                                                  • cat → N
                                                  • slept → Vi
                                                  • …
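A regular grammar like this can be run directly as a finite state automaton. A minimal sketch (the tiny lexicon and transition table mirror the rules and lexicon above; anything beyond them is our own naming):

```python
# Finite-state recognizer for the regular grammar
# S -> [Det] S1, S1 -> [N] S2, S2 -> [Vi].
LEXICON = {"the": "Det", "cat": "N", "slept": "Vi"}

# state -> {tag: next_state}; reaching ACCEPT after the last word succeeds.
TRANSITIONS = {"S": {"Det": "S1"}, "S1": {"N": "S2"}, "S2": {"Vi": "ACCEPT"}}

def accepts(sentence):
    state = "S"
    for word in sentence.lower().split():
        tag = LEXICON.get(word)                       # unknown word -> None
        state = TRANSITIONS.get(state, {}).get(tag)   # dead transition -> None
        if state is None:
            return False
    return state == "ACCEPT"

print(accepts("The cat slept"))  # -> True
print(accepts("The cat"))        # -> False
```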

                                                  Finite State Automaton

                                                  Sentences

                                                  • John smiles → Pn Vi
                                                  • The cat disappeared → Det N Vi
                                                  • These new shoes hurt → Det Adj N Vi
                                                  • John liked the old cat → PN Vt Det Adj N

                                                  Phrase structure

                                                  Parse tree for "the dog chased a cat into the garden":

                                                  [S [NP [D the] [N dog]]
                                                     [VP [V chased]
                                                         [NP [D a] [N cat]]
                                                         [PP [P into] [NP [D the] [N garden]]]]]

                                                  Notation

                                                  S: sentence
                                                  D or Det: determiner (e.g., articles)
                                                  N: noun
                                                  V: verb
                                                  P: preposition
                                                  NP: noun phrase
                                                  VP: verb phrase
                                                  PP: prepositional phrase

                                                  Context Free Grammar

                                                  S -> NP VP
                                                  NP -> D N
                                                  VP -> V NP
                                                  VP -> V NP PP
                                                  PP -> P NP
                                                  D -> [the]
                                                  D -> [a]
                                                  N -> [dog]
                                                  N -> [cat]
                                                  N -> [garden]
                                                  V -> [chased]
                                                  V -> [saw]
                                                  P -> [into]

                                                  Terminals ~ Lexicon

                                                  Phrase structure

                                                  Formalism of context-free grammars:
                                                  • Nonterminal symbols: S, NP, VP, …
                                                  • Terminal symbols: dog, cat, saw, the, …

                                                  Recursion:
                                                  • "The girl thought the dog chased the cat"

                                                  VP -> V S
                                                  N -> [girl]
                                                  V -> [thought]

                                                  Top-down parsing

                                                  S -> NP VP
                                                    -> Det N VP
                                                    -> The N VP
                                                    -> The dog VP
                                                    -> The dog V NP
                                                    -> The dog chased NP
                                                    -> The dog chased Det N
                                                    -> The dog chased the N
                                                    -> The dog chased the cat
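The top-down expansion above can be automated. Below is a naive recursive-descent recognizer for the toy grammar, written as a sketch of the idea only: real parsers must handle ambiguity and left recursion more carefully, and the function names are our own.

```python
# Naive top-down (recursive-descent) recognizer for the slide's grammar:
# S -> NP VP, NP -> D N, VP -> V NP (PP), PP -> P NP.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP", "PP"], ["V", "NP"]],
    "PP": [["P", "NP"]],
    "D":  [["the"], ["a"]],
    "N":  [["dog"], ["cat"], ["garden"]],
    "V":  [["chased"], ["saw"]],
    "P":  [["into"]],
}

def parse(symbol, words, i):
    """Set of positions reachable after expanding `symbol` from position i."""
    if symbol not in GRAMMAR:          # terminal: must match the next word
        return {i + 1} if i < len(words) and words[i] == symbol else set()
    ends = set()
    for rhs in GRAMMAR[symbol]:        # try every rule for this nonterminal
        positions = {i}
        for sym in rhs:
            positions = {e for p in positions for e in parse(sym, words, p)}
        ends |= positions
    return ends

def recognizes(sentence):
    words = sentence.lower().split()
    return len(words) in parse("S", words, 0)

print(recognizes("the dog chased a cat into the garden"))  # -> True
```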

                                                  Context-free grammar

                                                  S --> NP VP
                                                  NP --> PN          (proper noun)
                                                  NP --> Art Adj N
                                                  NP --> Art N
                                                  VP --> VI          (intransitive verb)
                                                  VP --> VT NP       (transitive verb)
                                                  Art --> [the]
                                                  Adj --> [lazy]
                                                  Adj --> [rapid]
                                                  PN --> [achilles]
                                                  N --> [turtle]
                                                  VI --> [sleeps]
                                                  VT --> [beats]

                                                  Parse tree

                                                  Parse tree for "the rapid turtle beats achilles":

                                                  [S [NP [Art the] [Adj rapid] [N turtle]]
                                                     [VP [Vt beats] [NP [PN achilles]]]]

                                                  Definite Clause Grammars: non-terminals may have arguments

                                                  S --> NP(N) VP(N)
                                                  NP(N) --> Art(N) N(N)
                                                  VP(N) --> VI(N)
                                                  Art(singular) --> [a]
                                                  Art(singular) --> [the]
                                                  Art(plural) --> [the]
                                                  N(singular) --> [turtle]
                                                  N(plural) --> [turtles]
                                                  VI(singular) --> [sleeps]
                                                  VI(plural) --> [sleep]

                                                  Number Agreement

                                                  DCGs

                                                  Non-terminals may have arguments:
                                                  • Variables (start with a capital), e.g., Number, Any
                                                  • Constants (start with lower case), e.g., singular, plural
                                                  • Structured terms (start with lower case and take arguments themselves), e.g., vp(V, NP)

                                                  Parsing needs to be adapted:
                                                  • using unification

                                                  Unification in a nutshell (cf AI course)

                                                  Substitutions, e.g., {Num/singular}, {T/vp(V, NP)}

                                                  Applying a substitution:
                                                  • simultaneously replace variables by the corresponding terms
                                                  • S(Num){Num/singular} = S(singular)

                                                  Unification

                                                  Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.:
                                                  • Art(singular) and Art(Num): gives {Num/singular}
                                                  • Art(singular) and Art(plural): fails
                                                  • Art(Num1) and Art(Num2): {Num1/Num2}
                                                  • PN(Num, accusative) and PN(singular, Case): {Num/singular, Case/accusative}

                                                  Parsing with DCGs

                                                  Now require successful unification at each step

                                                  S -> NP(N) VP(N)
                                                    -> Art(N) N(N) VP(N)        {N/singular}
                                                    -> a N(singular) VP(singular)
                                                    -> a turtle VP(singular)
                                                    -> a turtle sleeps

                                                  S -> a turtle sleep: fails

                                                  Case Marking

                                                  PN(singular, nominative) --> [he] ; [she]
                                                  PN(singular, accusative) --> [him] ; [her]
                                                  PN(plural, nominative) --> [they]
                                                  PN(plural, accusative) --> [them]

                                                  S --> NP(Number, nominative) VP(Number)
                                                  VP(Number) --> V(Number) NP(Any, accusative)
                                                  NP(Number, Case) --> PN(Number, Case)
                                                  NP(Number, Any) --> Det N(Number)

                                                  Accepted: "He sees her", "She sees him", "They see her"

                                                  But not: "Them see he"

                                                  DCGs

                                                  Are strictly more expressive than CFGs. Can represent, for instance:

                                                  • S(N) --> A(N) B(N) C(N)
                                                  • A(0) --> []
                                                  • B(0) --> []
                                                  • C(0) --> []
                                                  • A(s(N)) --> A(N) [A]
                                                  • B(s(N)) --> B(N) [B]
                                                  • C(s(N)) --> C(N) [C]

                                                  This generates the language A^n B^n C^n, which is not context-free.
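The grammar above generates exactly the strings A^n B^n C^n (the argument N counts the repetitions). As a quick sanity check, a procedural recognizer for the same language, written as a sketch:

```python
def is_anbncn(s):
    """True iff s = A^n B^n C^n for some n >= 0 (matching the DCG above)."""
    n = len(s) // 3
    return len(s) == 3 * n and s == "A" * n + "B" * n + "C" * n

print(is_anbncn("AABBCC"))  # -> True
print(is_anbncn("AABBC"))   # -> False
```

No context-free grammar can enforce this three-way count agreement, which is what the DCG's shared argument N buys.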

                                                  Probabilistic Models

                                                  Traditional grammar models are very rigid:
                                                  • essentially a yes/no decision

                                                  Probabilistic grammars:
                                                  • define a probability model for the data
                                                  • compute the probability of each alternative
                                                  • choose the most likely alternative

                                                  Illustrated on:
                                                  • the Shannon Game
                                                  • spelling correction
                                                  • parsing

                                                  Illustration

                                                  Wall Street Journal Corpus: 3,000,000 words, with the correct parse tree for each sentence known:
                                                  • constructed by hand
                                                  • can be used to derive stochastic context-free grammars
                                                  • an SCFG assigns a probability to each parse tree

                                                  Compute the most probable parse tree.
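The role of the SCFG can be illustrated by how a parse tree's probability is computed: it is the product of the probabilities of the rules used in the tree. A minimal sketch with invented rule probabilities (lexical rule probabilities are folded out for brevity; a treebank-derived grammar would estimate all of them from counts):

```python
# Invented rule probabilities; in practice estimated from a treebank.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("D", "N")): 0.7,
    ("NP", ("PN",)): 0.3,
    ("VP", ("V", "NP")): 1.0,
}

def tree_prob(tree):
    """tree = (label, children): children is a list of subtrees,
    or a word string for a preterminal node."""
    label, children = tree
    if isinstance(children, str):      # preterminal -> word: prob omitted
        return 1.0
    rhs = tuple(child[0] for child in children)
    p = RULE_PROB[(label, rhs)]        # probability of the rule used here
    for child in children:
        p *= tree_prob(child)          # times the subtrees' probabilities
    return p

t = ("S", [("NP", [("PN", "john")]),
           ("VP", [("V", "saw"), ("NP", [("D", "the"), ("N", "cat")])])])
print(tree_prob(t))  # ≈ 0.3 * 1.0 * 0.7 = 0.21
```

Choosing the most probable parse then amounts to maximizing this product over all trees the grammar admits for the sentence.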

                                                  Sequences are omni-present

                                                  Therefore, the techniques we will see also apply to:
                                                  • Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
                                                  • Robotics: sequences of actions, states, …
                                                  • …

                                                  Rest of the Course

                                                  Limitations of traditional grammar models motivate probabilistic extensions:
                                                  • Regular grammars and Finite State Automata
                                                    • Markov Models using n-grams
                                                    • (Hidden) Markov Models
                                                    • Conditional Random Fields (as an example of using undirected graphical models)
                                                  • Probabilistic Context-Free Grammars
                                                  • Probabilistic Definite Clause Grammars

                                                  All use principles of Part I on Graphical Models.

                                                  • Advanced Artificial Intelligence
                                                  • Topic
                                                  • Contents
                                                  • Rationalism versus Empiricism
                                                  • Slide 5
                                                  • This course
                                                  • Ambiguity
                                                  • NLP and Statistics
                                                  • Slide 9
                                                  • Corpora
                                                  • Word Counts
                                                  • Slide 12
                                                  • Word Counts (Brown corpus)
                                                  • Slide 14
                                                  • Zipflsquos Law
                                                  • Language and sequences
                                                  • Key NLP Problem Ambiguity
                                                  • Language Model
                                                  • Example of bad language model
                                                  • A bad language model
                                                  • Slide 22
                                                  • A good language model
                                                  • Why language models
                                                  • Applications
                                                  • Spelling errors
                                                  • Handwriting recognition
                                                  • For Spell Checkers
                                                  • Another dimension in language models
                                                  • Sequence Tagging
                                                  • Slide 31
                                                  • Parsing
                                                  • Slide 33
                                                  • Language models based on Grammars
                                                  • Grammars and parsing
                                                  • Regular Grammars and Finite State Automata
                                                  • Finite State Automaton
                                                  • Phrase structure
                                                  • Notation
                                                  • Context Free Grammar
                                                  • Slide 41
                                                  • Top-down parsing
                                                  • Context-free grammar
                                                  • Parse tree
                                                  • Definite Clause Grammars Non-terminals may have arguments
                                                  • DCGs
                                                  • Unification in a nutshell (cf AI course)
                                                  • Unification
                                                  • Parsing with DCGs
                                                  • Case Marking
                                                  • Slide 51
                                                  • Probabilistic Models
                                                  • Illustration
                                                  • PowerPoint Presentation
                                                  • Sequences are omni-present
                                                  • Rest of the Course

                                                    Handwriting recognition

                                                    Assume a note is given to a bank teller which the teller reads as I have a gub (cf Woody Allen)

                                                    NLP to the rescue hellipbull gub is not a wordbull gun gum Gus and gull are words but gun has a

                                                    higher probability in the context of a bank

                                                    For Spell Checkers

                                                    Collect list of commonly substituted wordsbull piecepeace whetherweather theirthere

                                                    ExampleldquoOn Tuesday the whether helliprsquorsquoldquoOn Tuesday the weather helliprdquo

                                                    Another dimension in language models

                                                    Do we mainly want to infer (probabilities) of legal sentences sequences bull So far

                                                    Or do we want to infer properties of these sentences bull Eg parse tree part-of-speech-taggingbullNeeded for understanding NL

                                                    Letrsquos look at some tasks

                                                    Sequence Tagging

                                                    Part-of-speech taggingbull He drives with his bikebull N V PR PN N noun verb preposition pronoun noun

                                                    Text extractionbull The job is that of a programmerbull X X X X X X JobTypebull The seminar is taking place from 1500 to 1600bull X X X X X X Start End

                                                    Sequence Tagging

                                                    Predicting the secondary structure of proteins mRNA hellipbull X = AFARLMMAhellipbull Y = hehestststhesthe hellip

                                                    Parsing

                                                    Given a sentence find its parse tree Important step in understanding NL

                                                    Parsing

                                                    In bioinformatics allows to predict (elements of) structure from sequence

                                                    Language models based on Grammars

                                                    Grammar Typesbull Regular grammars and Finite State Automatabull Context-Free Grammarsbull Definite Clause Grammars

                                                    A particular type of Unification Based Grammar (Prolog)

                                                    Distinguish lexicon from grammarbull Lexicon (dictionary) contains information about

                                                    words eg word - possible tags (and possibly additional information) flies - V(erb) - N(oun)

                                                    bull Grammar encode rules

                                                    Grammars and parsing Syntactic level best understood and formalized Derivation of grammatical structure parsing

                                                    (more than just recognition) Result of parsing mostly parse tree

                                                    showing the constituents of a sentence eg verb or noun phrases

                                                    Syntax usually specified in terms of a grammar consisting of grammar rules

                                                    Regular Grammars and Finite State Automata

                                                    Lexical information - which words are bull Det(erminer)bull N(oun)bull Vi (intransitive verb) - no

                                                    argumentbull Pn (pronoun) bull Vt (transitive verb) - takes an

                                                    argumentbull Adj (adjective)

                                                    Now acceptbull The cat sleptbull Det N Vi

                                                    As regular grammarbull S -gt [Det] S1 [ ] terminal bull S1 -gt [N] S2bull S2 -gt [Vi]

                                                    Lexicon bull The - Detbull Cat - Nbull Slept - Vi

                                                    bull hellip

                                                    Finite State Automaton

                                                    Sentences

                                                    bull John smiles - Pn Vibull The cat disappeared - Det N Vibull These new shoes hurt - Det Adj N Vibull John liked the old cat PN Vt Det Adj N

                                                    Phrase structure

                                                    S

                                                    NP

                                                    D N

                                                    VP

                                                    NPV

                                                    D N

                                                    PP

                                                    P NP

                                                    D N

                                                    the dog chased a cat into the garden

                                                    Notation

                                                    S sentence D or Det Determiner (eg articles) N noun V verb P preposition NP noun phrase VP verb phrase PP prepositional phrase

                                                    Context Free GrammarS -gt NP VPNP -gt D NVP -gt V NPVP -gt V NP PPPP -gt P NPD -gt [the]D -gt [a]N -gt [dog]N -gt [cat]N -gt [garden]V -gt [chased]V -gt [saw]P -gt [into]

                                                    Terminals ~ Lexicon

                                                    Phrase structure

                                                    Formalism of context-free grammarsbull Nonterminal symbols S NP VP bull Terminal symbols dog cat saw the

                                                    Recursionbull bdquoThe girl thought the dog chased the catldquo

                                                    VP -gt V SN -gt [girl]V -gt [thought]

                                                    Top-down parsing

                                                    S -gt NP VP S -gt Det N VP S -gt The N VP S -gt The dog VP S -gt The dog V NP S -gt The dog chased NP S -gt The dog chased Det N S-gt The dog chased the N S-gt The dog chased the cat

                                                    Context-free grammarSS --gt --gt NPNPVPVP

                                                    NPNP --gt PN --gt PN Proper nounProper noun

                                                    NPNP --gt Art Adj N--gt Art Adj N

                                                    NPNP --gt ArtN--gt ArtN

                                                    VPVP --gt VI --gt VI intransitive verbintransitive verb

                                                    VPVP --gt VT --gt VT NPNP transitive verbtransitive verb

                                                    ArtArt --gt [the]--gt [the]

                                                    AdjAdj --gt [lazy]--gt [lazy]

                                                    AdjAdj --gt [rapid]--gt [rapid]

                                                    PNPN --gt [achilles]--gt [achilles]

                                                    NN --gt [turtle]--gt [turtle]

                                                    VIVI --gt [sleeps]--gt [sleeps]

                                                    VTVT --gt [beats]--gt [beats]

                                                    Parse tree

                                                    SS

                                                    NPNP VPVP

                                                    ArtArt AdjAdj NN VtVt NPNP

                                                    PNPN

                                                    achillesachillesbeatsbeatsturtleturtlerapidrapidthethe

                                                    Definite Clause GrammarsNon-terminals may have arguments

                                                    SS --gt --gt NPNP((NN))VPVP((NN))

                                                    NP(NP(NN)) --gt Art(--gt Art(NN)N()N(NN))

                                                    VP(VP(NN)) --gt VI(--gt VI(NN))

                                                    Art(Art(singularsingular)) --gt [a]--gt [a]

                                                    Art(Art(singularsingular)) --gt [the]--gt [the]

                                                    Art(Art(pluralplural)) --gt [the]--gt [the]

                                                    N(N(singularsingular)) --gt [turtle]--gt [turtle]

                                                    N(N(pluralplural)) --gt [turtles]--gt [turtles]

                                                    VI(VI(singularsingular)) --gt [sleeps]--gt [sleeps]

                                                    VI(VI(pluralplural)) --gt [sleep]--gt [sleep]

                                                    Number Agreement

                                                    DCGs

                                                    Non-terminals may have argumentsbull Variables (start with capital)

                                                    Eg Number Any

                                                    bull Constants (start with lower case) Eg singular plural

                                                    bull Structured terms (start with lower case and take arguments themselves) Eg vp(VNP)

                                                    Parsing needs to be adapted bull Using unification

                                                    Unification in a nutshell (cf AI course)

                                                    Substitutions

                                                    Eg Num singular T vp(VNP)

                                                    Applying substitution bull Simultaneously replace variables by

                                                    corresponding termsbull S(Num) Num singular = S(singular)

                                                    Unification

                                                    Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

                                                    Gives Num singular

                                                    bull Art(singular) and Art(plural) Fails

                                                    bull Art(Num1) and Art(Num2) Num1 Num2

                                                    bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

                                                    Parsing with DCGs

                                                    Now require successful unification at each step

                                                    S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

                                                    S-gt a turtle sleep fails

Case Marking

PN(singular, nominative) --> [he]; [she]
PN(singular, accusative) --> [him]; [her]
PN(plural, nominative) --> [they]
PN(plural, accusative) --> [them]

S --> NP(Number, nominative), VP(Number)
VP(Number) --> V(Number), NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any) --> Det, N(Number)

Accepts "He sees her", "She sees him", "They see her"
But not "Them see he"

DCGs

Are strictly more expressive than CFGs. They can represent, for instance, the language a^n b^n c^n, which is not context-free:
• S(N) --> A(N), B(N), C(N)
• A(0) --> []
• B(0) --> []
• C(0) --> []
• A(s(N)) --> A(N), [a]
• B(s(N)) --> B(N), [b]
• C(s(N)) --> C(N), [c]
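This grammar derives exactly the strings a^n b^n c^n. A direct recognizer for that language (counting explicitly rather than building s(N) terms, as a sketch outside the DCG machinery) could look like:

```python
def accepts_anbncn(s):
    # The DCG counts with terms s(s(...(0))); here we count directly:
    # accept exactly k 'a's, then k 'b's, then k 'c's.
    if len(s) % 3 != 0:
        return False
    k = len(s) // 3
    return s == "a" * k + "b" * k + "c" * k

print(accepts_anbncn("aabbcc"))  # True
print(accepts_anbncn("aabbc"))   # False
print(accepts_anbncn(""))        # True (k = 0)
```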

Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• define a probability model for the data
• compute the probability of each alternative
• choose the most likely alternative

Illustrated on:
• the Shannon game
• spelling correction
• parsing

Illustration

Wall Street Journal corpus: 3,000,000 words, with the correct parse tree for each sentence known:
• constructed by hand
• can be used to derive stochastic context-free grammars (SCFGs)
• an SCFG assigns a probability to each parse tree

Compute the most probable parse tree.
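Scoring a parse tree under an SCFG amounts to multiplying the probabilities of the rules it uses. A sketch with invented rule probabilities (real ones would be estimated from the treebank by relative frequency among rules sharing a left-hand side):

```python
# Hypothetical rule probabilities for illustration only.
P = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("D", "N")): 0.6,
    ("NP", ("N",)): 0.4,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("V",)): 0.3,
}

def tree_prob(tree):
    # A tree is (label, [children]); a leaf is a plain word string.
    if isinstance(tree, str):
        return 1.0
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = P.get((label, rhs), 1.0)   # lexical rules ignored in this sketch
    for child in children:
        p *= tree_prob(child)
    return p

# P(tree) = P(S -> NP VP) * P(NP -> D N) * P(VP -> V) = 1.0 * 0.6 * 0.3
tree = ("S", [("NP", [("D", ["the"]), ("N", ["dog"])]),
              ("VP", [("V", ["barks"])])])
print(round(tree_prob(tree), 6))  # 0.18
```

Given several candidate trees for the same sentence, the parser would return the one with the highest such score.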

Sequences are omnipresent

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

                                                    Rest of the Course

                                                    Limitations traditional grammar models motivate probabilistic extensions bull Regular grammars and Finite State Automata

                                                    All use principles of Part I on Graphical Models Markov Models using n-gramms (Hidden) Markov Models Conditional Random Fields

                                                    bull As an example of using undirected graphical models

                                                    bull Probabilistic Context Free Grammarsbull Probabilistic Definite Clause Grammars

                                                    • Advanced Artificial Intelligence
                                                    • Topic
                                                    • Contents
                                                    • Rationalism versus Empiricism
                                                    • Slide 5
                                                    • This course
                                                    • Ambiguity
                                                    • NLP and Statistics
                                                    • Slide 9
                                                    • Corpora
                                                    • Word Counts
                                                    • Slide 12
                                                    • Word Counts (Brown corpus)
                                                    • Slide 14
• Zipf's Law
                                                    • Language and sequences
                                                    • Key NLP Problem Ambiguity
                                                    • Language Model
                                                    • Example of bad language model
                                                    • A bad language model
                                                    • Slide 22
                                                    • A good language model
                                                    • Why language models
                                                    • Applications
                                                    • Spelling errors
                                                    • Handwriting recognition
                                                    • For Spell Checkers
                                                    • Another dimension in language models
                                                    • Sequence Tagging
                                                    • Slide 31
                                                    • Parsing
                                                    • Slide 33
                                                    • Language models based on Grammars
                                                    • Grammars and parsing
                                                    • Regular Grammars and Finite State Automata
                                                    • Finite State Automaton
                                                    • Phrase structure
                                                    • Notation
                                                    • Context Free Grammar
                                                    • Slide 41
                                                    • Top-down parsing
                                                    • Context-free grammar
                                                    • Parse tree
                                                    • Definite Clause Grammars Non-terminals may have arguments
                                                    • DCGs
                                                    • Unification in a nutshell (cf AI course)
                                                    • Unification
                                                    • Parsing with DCGs
                                                    • Case Marking
                                                    • Slide 51
                                                    • Probabilistic Models
                                                    • Illustration
                                                    • PowerPoint Presentation
                                                    • Sequences are omni-present
                                                    • Rest of the Course

For Spell Checkers

Collect a list of commonly substituted words:
• piece/peace, whether/weather, their/there

Example:
"On Tuesday, the whether …"
"On Tuesday, the weather …"
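A language model can decide between members of such a confusion set by comparing how likely each alternative is in context. A toy sketch with invented bigram counts (a real checker would estimate these from a corpus):

```python
# Invented bigram counts for illustration only.
BIGRAM_COUNTS = {
    ("the", "weather"): 120,
    ("the", "whether"): 1,
}
CONFUSION = {"whether": "weather", "weather": "whether"}

def better_choice(prev_word, word):
    # Keep the typed word unless the other member of its confusion
    # pair is a more frequent continuation of the previous word.
    alt = CONFUSION.get(word)
    if alt is None:
        return word
    typed = BIGRAM_COUNTS.get((prev_word, word), 0)
    other = BIGRAM_COUNTS.get((prev_word, alt), 0)
    return alt if other > typed else word

print(better_choice("the", "whether"))  # weather
```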

Another dimension in language models

Do we mainly want to infer (probabilities of) legal sentences/sequences?
• So far: yes.

Or do we want to infer properties of these sentences?
• E.g. parse trees, part-of-speech tagging
• Needed for understanding NL

Let's look at some tasks.

Sequence Tagging

Part-of-speech tagging:
• He drives with his bike
• N  V      PR   PN  N    (noun, verb, preposition, pronoun, noun)

Text extraction:
• The job is that of a programmer
• X   X   X  X    X  X  JobType
• The seminar is taking place from 15:00 to 16:00
• X   X       X  X      X     X    Start    End

Sequence Tagging

Predicting the secondary structure of proteins, mRNA, …
• X = AFARLMMA…
• Y = hehestststhesthe…

Parsing

Given a sentence, find its parse tree. An important step in understanding NL.

Parsing

In bioinformatics, parsing allows one to predict (elements of) structure from sequence.

Language models based on Grammars

Grammar types:
• Regular grammars and finite state automata
• Context-free grammars
• Definite clause grammars: a particular type of unification-based grammar (Prolog)

Distinguish the lexicon from the grammar:
• The lexicon (dictionary) contains information about words, e.g. a word and its possible tags (and possibly additional information): flies -> V(erb) or N(oun)
• The grammar encodes the rules

Grammars and parsing

• The syntactic level is the best understood and formalized.
• Deriving the grammatical structure is called parsing (more than just recognition).
• The result of parsing is mostly a parse tree, showing the constituents of a sentence, e.g. verb or noun phrases.
• Syntax is usually specified in terms of a grammar consisting of grammar rules.

Regular Grammars and Finite State Automata

Lexical information: which words are
• Det(erminer)
• N(oun)
• Pn (pronoun)
• Vi (intransitive verb): takes no argument
• Vt (transitive verb): takes an argument
• Adj (adjective)

Now accept:
• The cat slept
• Det N Vi

As a regular grammar ([...] marks a terminal):
• S -> [Det], S1
• S1 -> [N], S2
• S2 -> [Vi]

Lexicon:
• the -> Det
• cat -> N
• slept -> Vi
• …
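The regular grammar above is equivalent to a finite state automaton. A sketch of the corresponding acceptor (state names follow the grammar; the encoding is an assumption of this illustration):

```python
# FSA for the regular grammar S -> [Det] S1, S1 -> [N] S2, S2 -> [Vi].
TAGS = {"the": "Det", "cat": "N", "slept": "Vi"}

# (state, tag) -> next state; taking Vi from S2 reaches the final state.
TRANSITIONS = {
    ("S", "Det"): "S1",
    ("S1", "N"): "S2",
    ("S2", "Vi"): "accept",
}

def accepts(sentence):
    state = "S"
    for word in sentence.lower().split():
        state = TRANSITIONS.get((state, TAGS.get(word)))
        if state is None:          # no transition: reject
            return False
    return state == "accept"

print(accepts("The cat slept"))  # True
print(accepts("The slept cat"))  # False
```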

Finite State Automaton

Sentences:
• John smiles: Pn Vi
• The cat disappeared: Det N Vi
• These new shoes hurt: Det Adj N Vi
• John liked the old cat: Pn Vt Det Adj N

Phrase structure

Parse tree for "the dog chased a cat into the garden":

[S [NP [D the] [N dog]]
   [VP [V chased]
       [NP [D a] [N cat]]
       [PP [P into] [NP [D the] [N garden]]]]]

Notation

S: sentence; D or Det: determiner (e.g. articles); N: noun; V: verb; P: preposition; NP: noun phrase; VP: verb phrase; PP: prepositional phrase.

Context Free Grammar

S -> NP VP
NP -> D N
VP -> V NP
VP -> V NP PP
PP -> P NP
D -> [the]
D -> [a]
N -> [dog]
N -> [cat]
N -> [garden]
V -> [chased]
V -> [saw]
P -> [into]

Terminals ~ lexicon

Phrase structure

Formalism of context-free grammars:
• Non-terminal symbols: S, NP, VP, …
• Terminal symbols: dog, cat, saw, the, …

Recursion:
• "The girl thought the dog chased the cat"

VP -> V S
N -> [girl]
V -> [thought]

Top-down parsing

S -> NP VP
  -> Det N VP
  -> The N VP
  -> The dog VP
  -> The dog V NP
  -> The dog chased NP
  -> The dog chased Det N
  -> The dog chased the N
  -> The dog chased the cat
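This derivation is what a backtracking top-down (recursive-descent) recognizer does. A sketch for the toy grammar above (the encoding is illustrative, not the course's implementation):

```python
# Backtracking top-down recognizer. A symbol not in GRAMMAR is a word.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP", "PP"], ["V", "NP"]],
    "PP": [["P", "NP"]],
    "D": [["the"], ["a"]],
    "N": [["dog"], ["cat"], ["garden"]],
    "V": [["chased"], ["saw"]],
    "P": [["into"]],
}

def derive(symbol, words, i):
    # Yield every position j such that symbol derives words[i:j].
    if symbol not in GRAMMAR:                 # terminal word
        if i < len(words) and words[i] == symbol:
            yield i + 1
        return
    for rhs in GRAMMAR[symbol]:
        positions = [i]
        for sym in rhs:                       # expand left to right
            positions = [j for p in positions
                           for j in derive(sym, words, p)]
        yield from positions

def recognizes(sentence):
    words = sentence.lower().split()
    return len(words) in derive("S", words, 0)

print(recognizes("the dog chased the cat"))                # True
print(recognizes("the dog chased a cat into the garden"))  # True
print(recognizes("dog the chased cat"))                    # False
```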

Context-free grammar

S --> NP, VP
NP --> PN            (proper noun)
NP --> Art, Adj, N
NP --> Art, N
VP --> VI            (intransitive verb)
VP --> VT, NP        (transitive verb)
Art --> [the]
Adj --> [lazy]
Adj --> [rapid]
PN --> [achilles]
N --> [turtle]
VI --> [sleeps]
VT --> [beats]

Parse tree

Parse tree for "the rapid turtle beats achilles":

[S [NP [Art the] [Adj rapid] [N turtle]]
   [VP [VT beats] [NP [PN achilles]]]]

Definite Clause Grammars

Non-terminals may have arguments:

S --> NP(N), VP(N)
NP(N) --> Art(N), N(N)
VP(N) --> VI(N)
Art(singular) --> [a]
Art(singular) --> [the]
Art(plural) --> [the]
N(singular) --> [turtle]
N(plural) --> [turtles]
VI(singular) --> [sleeps]
VI(plural) --> [sleep]

Number agreement

                                                      DCGs

                                                      Non-terminals may have argumentsbull Variables (start with capital)

                                                      Eg Number Any

                                                      bull Constants (start with lower case) Eg singular plural

                                                      bull Structured terms (start with lower case and take arguments themselves) Eg vp(VNP)

                                                      Parsing needs to be adapted bull Using unification

                                                      Unification in a nutshell (cf AI course)

                                                      Substitutions

                                                      Eg Num singular T vp(VNP)

                                                      Applying substitution bull Simultaneously replace variables by

                                                      corresponding termsbull S(Num) Num singular = S(singular)

                                                      Unification

                                                      Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

                                                      Gives Num singular

                                                      bull Art(singular) and Art(plural) Fails

                                                      bull Art(Num1) and Art(Num2) Num1 Num2

                                                      bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

                                                      Parsing with DCGs

                                                      Now require successful unification at each step

                                                      S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

                                                      S-gt a turtle sleep fails

                                                      Case Marking

                                                      PN(singularnominative)PN(singularnominative) --gt --gt [he][she][he][she]

                                                      PN(singularaccusative)PN(singularaccusative) --gt --gt [him][her][him][her]

                                                      PN(pluralnominative)PN(pluralnominative) --gt --gt [they][they]

                                                      PN(pluralaccusative)PN(pluralaccusative) --gt --gt [them][them]

                                                      S --gt NP(Numbernominative) NP(Number)S --gt NP(Numbernominative) NP(Number)

                                                      VP(Number) --gt V(Number) VP(Anyaccusative)VP(Number) --gt V(Number) VP(Anyaccusative)

                                                      VP(NumberCase) --gt PN(NumberCase)VP(NumberCase) --gt PN(NumberCase)

                                                      VP(NumberAny) --gt Det N(Number)VP(NumberAny) --gt Det N(Number)

                                                      He sees her She sees him They see her

                                                      But not Them see he

                                                      DCGs

                                                      Are strictly more expressive than CFGs Can represent for instance

                                                      bull S(N) -gt A(N) B(N) C(N)bull A(0) -gt [] bull B(0) -gt []bull C(0) -gt []bull A(s(N)) -gt A(N) [A]bull B(s(N)) -gt B(N) [B]bull C(s(N)) -gt C(N) [C]

                                                      Probabilistic Models

                                                      Traditional grammar models are very rigid bull essentially a yes no decision

                                                      Probabilistic grammarsbull Define a probability models for the databull Compute the probability of each alternativebull Choose the most likely alternative

                                                      Ilustrate on bull Shannon Gamebull Spelling correctionbull Parsing

                                                      Illustration

                                                      Wall Street Journal Corpus 3 000 000 words Correct parse tree for sentences known

                                                      bull Constructed by handbull Can be used to derive stochastic context free

                                                      grammarsbull SCFG assign probability to parse trees

                                                      Compute the most probable parse tree

                                                      Sequences are omni-present

                                                      Therefore the techniques we will see also apply tobull Bioinformatics

                                                      DNA proteins mRNA hellip can all be represented as strings

                                                      bullRobotics Sequences of actions states hellip

                                                      bullhellip

                                                      Rest of the Course

                                                      Limitations traditional grammar models motivate probabilistic extensions bull Regular grammars and Finite State Automata

                                                      All use principles of Part I on Graphical Models Markov Models using n-gramms (Hidden) Markov Models Conditional Random Fields

                                                      bull As an example of using undirected graphical models

                                                      bull Probabilistic Context Free Grammarsbull Probabilistic Definite Clause Grammars

                                                      • Advanced Artificial Intelligence
                                                      • Topic
                                                      • Contents
                                                      • Rationalism versus Empiricism
                                                      • Slide 5
                                                      • This course
                                                      • Ambiguity
                                                      • NLP and Statistics
                                                      • Slide 9
                                                      • Corpora
                                                      • Word Counts
                                                      • Slide 12
                                                      • Word Counts (Brown corpus)
                                                      • Slide 14
                                                      • Zipflsquos Law
                                                      • Language and sequences
                                                      • Key NLP Problem Ambiguity
                                                      • Language Model
                                                      • Example of bad language model
                                                      • A bad language model
                                                      • Slide 22
                                                      • A good language model
                                                      • Why language models
                                                      • Applications
                                                      • Spelling errors
                                                      • Handwriting recognition
                                                      • For Spell Checkers
                                                      • Another dimension in language models
                                                      • Sequence Tagging
                                                      • Slide 31
                                                      • Parsing
                                                      • Slide 33
                                                      • Language models based on Grammars
                                                      • Grammars and parsing
                                                      • Regular Grammars and Finite State Automata
                                                      • Finite State Automaton
                                                      • Phrase structure
                                                      • Notation
                                                      • Context Free Grammar
                                                      • Slide 41
                                                      • Top-down parsing
                                                      • Context-free grammar
                                                      • Parse tree
                                                      • Definite Clause Grammars Non-terminals may have arguments
                                                      • DCGs
                                                      • Unification in a nutshell (cf AI course)
                                                      • Unification
                                                      • Parsing with DCGs
                                                      • Case Marking
                                                      • Slide 51
                                                      • Probabilistic Models
                                                      • Illustration
                                                      • PowerPoint Presentation
                                                      • Sequences are omni-present
                                                      • Rest of the Course

                                                        Another dimension in language models

                                                        Do we mainly want to infer (probabilities) of legal sentences sequences bull So far

                                                        Or do we want to infer properties of these sentences bull Eg parse tree part-of-speech-taggingbullNeeded for understanding NL

                                                        Letrsquos look at some tasks

                                                        Sequence Tagging

                                                        Part-of-speech taggingbull He drives with his bikebull N V PR PN N noun verb preposition pronoun noun

                                                        Text extractionbull The job is that of a programmerbull X X X X X X JobTypebull The seminar is taking place from 1500 to 1600bull X X X X X X Start End

                                                        Sequence Tagging

                                                        Predicting the secondary structure of proteins mRNA hellipbull X = AFARLMMAhellipbull Y = hehestststhesthe hellip

                                                        Parsing

                                                        Given a sentence find its parse tree Important step in understanding NL

                                                        Parsing

                                                        In bioinformatics allows to predict (elements of) structure from sequence

                                                        Language models based on Grammars

                                                        Grammar Typesbull Regular grammars and Finite State Automatabull Context-Free Grammarsbull Definite Clause Grammars

                                                        A particular type of Unification Based Grammar (Prolog)

                                                        Distinguish lexicon from grammarbull Lexicon (dictionary) contains information about

                                                        words eg word - possible tags (and possibly additional information) flies - V(erb) - N(oun)

                                                        bull Grammar encode rules

                                                        Grammars and parsing Syntactic level best understood and formalized Derivation of grammatical structure parsing

                                                        (more than just recognition) Result of parsing mostly parse tree

                                                        showing the constituents of a sentence eg verb or noun phrases

                                                        Syntax usually specified in terms of a grammar consisting of grammar rules

                                                        Regular Grammars and Finite State Automata

                                                        Lexical information - which words are bull Det(erminer)bull N(oun)bull Vi (intransitive verb) - no

                                                        argumentbull Pn (pronoun) bull Vt (transitive verb) - takes an

                                                        argumentbull Adj (adjective)

                                                        Now acceptbull The cat sleptbull Det N Vi

                                                        As regular grammarbull S -gt [Det] S1 [ ] terminal bull S1 -gt [N] S2bull S2 -gt [Vi]

                                                        Lexicon bull The - Detbull Cat - Nbull Slept - Vi

                                                        bull hellip

                                                        Finite State Automaton

                                                        Sentences

                                                        bull John smiles - Pn Vibull The cat disappeared - Det N Vibull These new shoes hurt - Det Adj N Vibull John liked the old cat PN Vt Det Adj N

Phrase structure

Example parse of "the dog chased a cat into the garden":

  [S [NP [D the] [N dog]]
     [VP [V chased]
         [NP [D a] [N cat]]
         [PP [P into] [NP [D the] [N garden]]]]]

Notation:
S: sentence, D or Det: determiner (e.g. articles), N: noun, V: verb, P: preposition,
NP: noun phrase, VP: verb phrase, PP: prepositional phrase

Context Free Grammar

S  -> NP VP
NP -> D N
VP -> V NP
VP -> V NP PP
PP -> P NP
D  -> [the]
D  -> [a]
N  -> [dog]
N  -> [cat]
N  -> [garden]
V  -> [chased]
V  -> [saw]
P  -> [into]

Terminals ~ Lexicon

Phrase structure

Formalism of context-free grammars:
• Nonterminal symbols: S, NP, VP, …
• Terminal symbols: dog, cat, saw, the, …

Recursion:
• "The girl thought the dog chased the cat"

VP -> V S
N  -> [girl]
V  -> [thought]

Top-down parsing

S -> NP VP
  -> Det N VP
  -> The N VP
  -> The dog VP
  -> The dog V NP
  -> The dog chased NP
  -> The dog chased Det N
  -> The dog chased the N
  -> The dog chased the cat
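The derivation above can be carried out mechanically. Here is a minimal recursive-descent sketch (my own Python, not from the slides) of a top-down recognizer for the toy CFG; function and variable names are mine.

```python
# Top-down (recursive-descent) recognizer for the toy grammar
# S -> NP VP, NP -> D N, VP -> V NP, plus the lexicon.

GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP"]],
    "D":  [["the"], ["a"]],
    "N":  [["dog"], ["cat"], ["garden"]],
    "V":  [["chased"], ["saw"]],
}

def parse(symbol, words, i):
    """Try to derive a prefix of words[i:] from `symbol`.
    Return the end position on success, None on failure."""
    if symbol not in GRAMMAR:                      # terminal symbol
        return i + 1 if i < len(words) and words[i] == symbol else None
    for rhs in GRAMMAR[symbol]:                    # try each production
        j = i
        for sym in rhs:
            j = parse(sym, words, j)
            if j is None:
                break
        else:
            return j
    return None

def recognize(sentence):
    words = sentence.split()
    return parse("S", words, 0) == len(words)
```

Note this simple version commits to the first alternative that succeeds locally; for this deterministic toy grammar that is enough, a general CFG parser would need backtracking or a chart.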

Context-free grammar

S   --> NP, VP
NP  --> PN           % proper noun
NP  --> Art, Adj, N
NP  --> Art, N
VP  --> VI           % intransitive verb
VP  --> VT, NP       % transitive verb
Art --> [the]
Adj --> [lazy]
Adj --> [rapid]
PN  --> [achilles]
N   --> [turtle]
VI  --> [sleeps]
VT  --> [beats]

Parse tree

For "the rapid turtle beats achilles":

  [S [NP [Art the] [Adj rapid] [N turtle]]
     [VP [VT beats] [NP [PN achilles]]]]

Definite Clause Grammars

Non-terminals may have arguments:

S     --> NP(N), VP(N)
NP(N) --> Art(N), N(N)
VP(N) --> VI(N)
Art(singular) --> [a]
Art(singular) --> [the]
Art(plural)   --> [the]
N(singular)   --> [turtle]
N(plural)     --> [turtles]
VI(singular)  --> [sleeps]
VI(plural)    --> [sleep]

Number Agreement

DCGs

Non-terminals may have arguments:
• Variables (start with a capital), e.g. Number, Any
• Constants (start with lower case), e.g. singular, plural
• Structured terms (start with lower case and take arguments themselves), e.g. vp(V, NP)

Parsing needs to be adapted:
• Using unification

Unification in a nutshell (cf. AI course)

Substitutions, e.g. {Num / singular}, {T / vp(V, NP)}

Applying a substitution:
• Simultaneously replace variables by the corresponding terms
• S(Num) {Num / singular} = S(singular)

Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.:
• Art(singular) and Art(Num): gives {Num / singular}
• Art(singular) and Art(plural): fails
• Art(Num1) and Art(Num2): {Num1 / Num2}
• PN(Num, accusative) and PN(singular, Case): {Num / singular, Case / accusative}
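A small Python sketch (mine, assuming the slide's conventions: capitalized strings are variables, lower-case strings are constants) of unifying two argument lists, covering the four cases listed above. It omits structured terms like vp(V, NP) for brevity.

```python
# Unification of argument lists of two non-terminals.
# Variables: capitalized strings ("Num", "Case"); constants: lower-case.

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings until a constant or unbound variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(args1, args2):
    """Return the most general substitution unifying the two
    argument lists, or None if unification fails."""
    if len(args1) != len(args2):
        return None
    subst = {}
    for a, b in zip(args1, args2):
        a, b = walk(a, subst), walk(b, subst)
        if a == b:
            continue
        if is_var(a):
            subst[a] = b
        elif is_var(b):
            subst[b] = a
        else:                 # two different constants: clash
            return None
    return subst
```

For instance, `unify(["Num", "accusative"], ["singular", "Case"])` yields `{"Num": "singular", "Case": "accusative"}`, matching the last example above.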

Parsing with DCGs

Now require successful unification at each step:

S -> NP(N) VP(N)
  -> Art(N) N(N) VP(N)          {N / singular}
  -> a N(singular) VP(singular)
  -> a turtle VP(singular)
  -> a turtle sleeps

S -> … -> a turtle sleep   fails

Case Marking

PN(singular, nominative) --> [he]; [she]
PN(singular, accusative) --> [him]; [her]
PN(plural, nominative)   --> [they]
PN(plural, accusative)   --> [them]

S --> NP(Number, nominative), VP(Number)
VP(Number) --> V(Number), NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any)  --> Det, N(Number)

Accepts: He sees her. She sees him. They see her.
But not: Them see he.

DCGs

Are strictly more expressive than CFGs. Can represent, for instance, the language a^n b^n c^n, which is not context-free:

• S(N)    -> A(N) B(N) C(N)
• A(0)    -> []
• B(0)    -> []
• C(0)    -> []
• A(s(N)) -> A(N) [a]
• B(s(N)) -> B(N) [b]
• C(s(N)) -> C(N) [c]
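The grammar above threads the same count N (in Peano notation: 0, s(0), s(s(0)), …) through all three blocks. A Python sketch (mine) that mimics this argument passing by counting each run and checking that the counts agree:

```python
# Recognizer for a^n b^n c^n, the language generated by the counting
# DCG S(N) -> A(N) B(N) C(N).  No CFG can recognize this language.

def segment(word, letter, i):
    """Consume a run of `letter` starting at i; return (count, end index)."""
    n = 0
    while i < len(word) and word[i] == letter:
        n, i = n + 1, i + 1
    return n, i

def in_anbncn(word):
    na, i = segment(word, "a", 0)   # A(N) consumed na letters
    nb, i = segment(word, "b", i)   # B(N) must consume the same count
    nc, i = segment(word, "c", i)   # and so must C(N)
    return i == len(word) and na == nb == nc
```

Note that the empty string is in the language (the N = 0 case, where all three rules derive []).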

Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• Define a probability model for the data
• Compute the probability of each alternative
• Choose the most likely alternative

Illustrated on:
• Shannon Game
• Spelling correction
• Parsing

Illustration

Wall Street Journal Corpus: 3,000,000 words, with the correct parse tree for each sentence known:
• Constructed by hand
• Can be used to derive stochastic context-free grammars
• SCFGs assign a probability to each parse tree

Compute the most probable parse tree.
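In an SCFG, the probability of a parse tree is the product of the probabilities of all rules used in its derivation. A toy Python sketch (the rule probabilities are invented for illustration, not derived from the WSJ corpus; lexical rule probabilities are omitted):

```python
# Probability of a parse tree under a toy SCFG: multiply the
# probabilities of all grammar rules used in the derivation.

RULE_PROB = {
    ("S",  ("NP", "VP")): 1.0,
    ("NP", ("D", "N")):   0.7,
    ("NP", ("PN",)):      0.3,
    ("VP", ("V", "NP")):  0.6,
    ("VP", ("V",)):       0.4,
}

def tree_prob(tree):
    """tree = (label, [children]) for a nonterminal, a plain string for a word."""
    if isinstance(tree, str):
        return 1.0                        # words contribute probability 1 here
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULE_PROB.get((label, rhs), 1.0)  # lexical rules omitted for brevity
    for c in children:
        p *= tree_prob(c)
    return p

tree = ("S", [("NP", [("D", ["the"]), ("N", ["dog"])]),
              ("VP", [("V", ["chased"]),
                      ("NP", [("D", ["a"]), ("N", ["cat"])])])])
```

Here `tree_prob(tree)` multiplies 1.0 (S) x 0.7 (NP) x 0.6 (VP) x 0.7 (NP), giving about 0.294; choosing among alternative trees for the same sentence then amounts to picking the one with the highest product.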

Sequences are omni-present

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
• Regular grammars and Finite State Automata
  - Markov Models using n-grams
  - (Hidden) Markov Models
  - Conditional Random Fields, as an example of using undirected graphical models
• Probabilistic Context Free Grammars
• Probabilistic Definite Clause Grammars

All use principles of Part I on Graphical Models.


Sequence Tagging

Part-of-speech tagging:
• He drives with his bike
• N V PR PN N (noun, verb, preposition, pronoun, noun)

Text extraction:
• The job is that of a programmer
• X X X X X X JobType
• The seminar is taking place from 15:00 to 16:00
• X X X X X X Start End

Sequence Tagging

Predicting the secondary structure of proteins, mRNA, …
• X = AFARLMMA…
• Y = hehestststhesthe…

Parsing

Given a sentence, find its parse tree. An important step in understanding NL.

Parsing

In bioinformatics, parsing allows one to predict (elements of) structure from sequence.

Language models based on Grammars

Grammar Types:
• Regular grammars and Finite State Automata
• Context-Free Grammars
• Definite Clause Grammars: a particular type of Unification Based Grammar (Prolog)

Distinguish lexicon from grammar:
• Lexicon (dictionary) contains information about words, e.g. word - possible tags (and possibly additional information): flies - V(erb) - N(oun)
• Grammar encodes the rules

Grammars and parsing

The syntactic level is the best understood and formalized. Derivation of grammatical structure is called parsing (more than just recognition). The result of parsing is mostly a parse tree, showing the constituents of a sentence, e.g. verb or noun phrases. Syntax is usually specified in terms of a grammar consisting of grammar rules.

                                                          Regular Grammars and Finite State Automata

                                                          Lexical information - which words are bull Det(erminer)bull N(oun)bull Vi (intransitive verb) - no

                                                          argumentbull Pn (pronoun) bull Vt (transitive verb) - takes an

                                                          argumentbull Adj (adjective)

                                                          Now acceptbull The cat sleptbull Det N Vi

                                                          As regular grammarbull S -gt [Det] S1 [ ] terminal bull S1 -gt [N] S2bull S2 -gt [Vi]

                                                          Lexicon bull The - Detbull Cat - Nbull Slept - Vi

                                                          bull hellip

                                                          Finite State Automaton

                                                          Sentences

                                                          bull John smiles - Pn Vibull The cat disappeared - Det N Vibull These new shoes hurt - Det Adj N Vibull John liked the old cat PN Vt Det Adj N

                                                          Phrase structure

                                                          S

                                                          NP

                                                          D N

                                                          VP

                                                          NPV

                                                          D N

                                                          PP

                                                          P NP

                                                          D N

                                                          the dog chased a cat into the garden

                                                          Notation

                                                          S sentence D or Det Determiner (eg articles) N noun V verb P preposition NP noun phrase VP verb phrase PP prepositional phrase

                                                          Context Free GrammarS -gt NP VPNP -gt D NVP -gt V NPVP -gt V NP PPPP -gt P NPD -gt [the]D -gt [a]N -gt [dog]N -gt [cat]N -gt [garden]V -gt [chased]V -gt [saw]P -gt [into]

                                                          Terminals ~ Lexicon

                                                          Phrase structure

                                                          Formalism of context-free grammarsbull Nonterminal symbols S NP VP bull Terminal symbols dog cat saw the

                                                          Recursionbull bdquoThe girl thought the dog chased the catldquo

                                                          VP -gt V SN -gt [girl]V -gt [thought]

                                                          Top-down parsing

                                                          S -gt NP VP S -gt Det N VP S -gt The N VP S -gt The dog VP S -gt The dog V NP S -gt The dog chased NP S -gt The dog chased Det N S-gt The dog chased the N S-gt The dog chased the cat

                                                          Context-free grammarSS --gt --gt NPNPVPVP

                                                          NPNP --gt PN --gt PN Proper nounProper noun

                                                          NPNP --gt Art Adj N--gt Art Adj N

                                                          NPNP --gt ArtN--gt ArtN

                                                          VPVP --gt VI --gt VI intransitive verbintransitive verb

                                                          VPVP --gt VT --gt VT NPNP transitive verbtransitive verb

                                                          ArtArt --gt [the]--gt [the]

                                                          AdjAdj --gt [lazy]--gt [lazy]

                                                          AdjAdj --gt [rapid]--gt [rapid]

                                                          PNPN --gt [achilles]--gt [achilles]

                                                          NN --gt [turtle]--gt [turtle]

                                                          VIVI --gt [sleeps]--gt [sleeps]

                                                          VTVT --gt [beats]--gt [beats]

                                                          Parse tree

                                                          SS

                                                          NPNP VPVP

                                                          ArtArt AdjAdj NN VtVt NPNP

                                                          PNPN

                                                          achillesachillesbeatsbeatsturtleturtlerapidrapidthethe

                                                          Definite Clause GrammarsNon-terminals may have arguments

                                                          SS --gt --gt NPNP((NN))VPVP((NN))

                                                          NP(NP(NN)) --gt Art(--gt Art(NN)N()N(NN))

                                                          VP(VP(NN)) --gt VI(--gt VI(NN))

                                                          Art(Art(singularsingular)) --gt [a]--gt [a]

                                                          Art(Art(singularsingular)) --gt [the]--gt [the]

                                                          Art(Art(pluralplural)) --gt [the]--gt [the]

                                                          N(N(singularsingular)) --gt [turtle]--gt [turtle]

                                                          N(N(pluralplural)) --gt [turtles]--gt [turtles]

                                                          VI(VI(singularsingular)) --gt [sleeps]--gt [sleeps]

                                                          VI(VI(pluralplural)) --gt [sleep]--gt [sleep]

                                                          Number Agreement

                                                          DCGs

                                                          Non-terminals may have argumentsbull Variables (start with capital)

                                                          Eg Number Any

                                                          bull Constants (start with lower case) Eg singular plural

                                                          bull Structured terms (start with lower case and take arguments themselves) Eg vp(VNP)

                                                          Parsing needs to be adapted bull Using unification

                                                          Unification in a nutshell (cf AI course)

                                                          Substitutions

                                                          Eg Num singular T vp(VNP)

                                                          Applying substitution bull Simultaneously replace variables by

                                                          corresponding termsbull S(Num) Num singular = S(singular)

                                                          Unification

                                                          Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

                                                          Gives Num singular

                                                          bull Art(singular) and Art(plural) Fails

                                                          bull Art(Num1) and Art(Num2) Num1 Num2

                                                          bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

                                                          Parsing with DCGs

                                                          Now require successful unification at each step

                                                          S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

                                                          S-gt a turtle sleep fails

                                                          Case Marking

                                                          PN(singularnominative)PN(singularnominative) --gt --gt [he][she][he][she]

                                                          PN(singularaccusative)PN(singularaccusative) --gt --gt [him][her][him][her]

                                                          PN(pluralnominative)PN(pluralnominative) --gt --gt [they][they]

                                                          PN(pluralaccusative)PN(pluralaccusative) --gt --gt [them][them]

                                                          S --gt NP(Numbernominative) NP(Number)S --gt NP(Numbernominative) NP(Number)

                                                          VP(Number) --gt V(Number) VP(Anyaccusative)VP(Number) --gt V(Number) VP(Anyaccusative)

                                                          VP(NumberCase) --gt PN(NumberCase)VP(NumberCase) --gt PN(NumberCase)

                                                          VP(NumberAny) --gt Det N(Number)VP(NumberAny) --gt Det N(Number)

                                                          He sees her She sees him They see her

                                                          But not Them see he

                                                          DCGs

                                                          Are strictly more expressive than CFGs Can represent for instance

                                                          bull S(N) -gt A(N) B(N) C(N)bull A(0) -gt [] bull B(0) -gt []bull C(0) -gt []bull A(s(N)) -gt A(N) [A]bull B(s(N)) -gt B(N) [B]bull C(s(N)) -gt C(N) [C]

                                                          Probabilistic Models

                                                          Traditional grammar models are very rigid bull essentially a yes no decision

                                                          Probabilistic grammarsbull Define a probability models for the databull Compute the probability of each alternativebull Choose the most likely alternative

                                                          Ilustrate on bull Shannon Gamebull Spelling correctionbull Parsing

                                                          Illustration

                                                          Wall Street Journal Corpus 3 000 000 words Correct parse tree for sentences known

                                                          bull Constructed by handbull Can be used to derive stochastic context free

                                                          grammarsbull SCFG assign probability to parse trees

                                                          Compute the most probable parse tree

                                                          Sequences are omni-present

                                                          Therefore the techniques we will see also apply tobull Bioinformatics

                                                          DNA proteins mRNA hellip can all be represented as strings

                                                          bullRobotics Sequences of actions states hellip

                                                          bullhellip

                                                          Rest of the Course

                                                          Limitations traditional grammar models motivate probabilistic extensions bull Regular grammars and Finite State Automata

                                                          All use principles of Part I on Graphical Models Markov Models using n-gramms (Hidden) Markov Models Conditional Random Fields

                                                          bull As an example of using undirected graphical models

                                                          bull Probabilistic Context Free Grammarsbull Probabilistic Definite Clause Grammars

                                                          • Advanced Artificial Intelligence
                                                          • Topic
                                                          • Contents
                                                          • Rationalism versus Empiricism
                                                          • Slide 5
                                                          • This course
                                                          • Ambiguity
                                                          • NLP and Statistics
                                                          • Slide 9
                                                          • Corpora
                                                          • Word Counts
                                                          • Slide 12
                                                          • Word Counts (Brown corpus)
                                                          • Slide 14
                                                          • Zipflsquos Law
                                                          • Language and sequences
                                                          • Key NLP Problem Ambiguity
                                                          • Language Model
                                                          • Example of bad language model
                                                          • A bad language model
                                                          • Slide 22
                                                          • A good language model
                                                          • Why language models
                                                          • Applications
                                                          • Spelling errors
                                                          • Handwriting recognition
                                                          • For Spell Checkers
                                                          • Another dimension in language models
                                                          • Sequence Tagging
                                                          • Slide 31
                                                          • Parsing
                                                          • Slide 33
                                                          • Language models based on Grammars
                                                          • Grammars and parsing
                                                          • Regular Grammars and Finite State Automata
                                                          • Finite State Automaton
                                                          • Phrase structure
                                                          • Notation
                                                          • Context Free Grammar
                                                          • Slide 41
                                                          • Top-down parsing
                                                          • Context-free grammar
                                                          • Parse tree
                                                          • Definite Clause Grammars Non-terminals may have arguments
                                                          • DCGs
                                                          • Unification in a nutshell (cf AI course)
                                                          • Unification
                                                          • Parsing with DCGs
                                                          • Case Marking
                                                          • Slide 51
                                                          • Probabilistic Models
                                                          • Illustration
                                                          • PowerPoint Presentation
                                                          • Sequences are omni-present
                                                          • Rest of the Course

                                                            Sequence Tagging

Predicting the secondary structure of proteins, mRNA, …
• X = AFARLMMA…
• Y = hehestststhesthe…

                                                            Parsing

Given a sentence, find its parse tree. An important step in understanding NL.

                                                            Parsing

In bioinformatics, parsing allows one to predict (elements of) structure from sequence.

                                                            Language models based on Grammars

Grammar types:
• Regular grammars and Finite State Automata
• Context-Free Grammars
• Definite Clause Grammars

                                                            A particular type of Unification Based Grammar (Prolog)

Distinguish lexicon from grammar:
• Lexicon (dictionary): contains information about words, e.g. word -> possible tags (and possibly additional information): flies -> V(erb) or N(oun)
• Grammar: encodes the rules

Grammars and parsing:
• The syntactic level is best understood and formalized
• Derivation of grammatical structure: parsing (more than just recognition)
• Result of parsing: mostly a parse tree, showing the constituents of a sentence, e.g. verb or noun phrases
• Syntax is usually specified in terms of a grammar consisting of grammar rules

                                                            Regular Grammars and Finite State Automata

Lexical information - which words are:
• Det(erminer)
• N(oun)
• Vi (intransitive verb) - no argument
• Pn (pronoun)
• Vt (transitive verb) - takes an argument
• Adj (adjective)

Now accept:
• The cat slept
• Det N Vi

As regular grammar:
• S -> [Det] S1     % [...] marks a terminal
• S1 -> [N] S2
• S2 -> [Vi]

Lexicon:
• the - Det
• cat - N
• slept - Vi
• …
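The three-rule regular grammar above can be run directly as a finite state automaton over part-of-speech tags. A minimal Python sketch (the state names and the dictionary encoding are assumptions for illustration, not from the slides):

```python
# The regular grammar S -> [Det] S1, S1 -> [N] S2, S2 -> [Vi]
# as a finite state automaton over part-of-speech tags.
TRANSITIONS = {
    ("S", "Det"): "S1",
    ("S1", "N"): "S2",
    ("S2", "Vi"): "accept",
}

def accepts(tags):
    """Return True iff the tag sequence drives the automaton to 'accept'."""
    state = "S"
    for tag in tags:
        state = TRANSITIONS.get((state, tag))
        if state is None:          # no transition: reject
            return False
    return state == "accept"

print(accepts(["Det", "N", "Vi"]))   # "The cat slept" -> True
print(accepts(["Det", "Vi"]))        # False
```

Each grammar rule becomes one transition; accepting "The cat slept" amounts to mapping the words to tags via the lexicon and running the automaton.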

                                                            Finite State Automaton

                                                            Sentences

• John smiles - Pn Vi
• The cat disappeared - Det N Vi
• These new shoes hurt - Det Adj N Vi
• John liked the old cat - Pn Vt Det Adj N

                                                            Phrase structure

Parse tree for "the dog chased a cat into the garden":

[S [NP [D the] [N dog]]
   [VP [V chased]
       [NP [D a] [N cat]]
       [PP [P into] [NP [D the] [N garden]]]]]

                                                            Notation

S: sentence; D or Det: determiner (e.g. articles); N: noun; V: verb; P: preposition; NP: noun phrase; VP: verb phrase; PP: prepositional phrase

Context Free Grammar

S -> NP VP
NP -> D N
VP -> V NP
VP -> V NP PP
PP -> P NP
D -> [the]
D -> [a]
N -> [dog]
N -> [cat]
N -> [garden]
V -> [chased]
V -> [saw]
P -> [into]

                                                            Terminals ~ Lexicon

                                                            Phrase structure

Formalism of context-free grammars:
• Nonterminal symbols: S, NP, VP, …
• Terminal symbols: dog, cat, saw, the, …

Recursion:
• "The girl thought the dog chased the cat"

VP -> V S
N -> [girl]
V -> [thought]

                                                            Top-down parsing

S -> NP VP
  -> Det N VP
  -> The N VP
  -> The dog VP
  -> The dog V NP
  -> The dog chased NP
  -> The dog chased Det N
  -> The dog chased the N
  -> The dog chased the cat
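The top-down derivation above can be mimicked by a small backtracking recursive-descent recognizer. A sketch in Python (the dictionary encoding of the grammar and lexicon is an assumption for illustration):

```python
# Top-down recognition for the toy CFG: expand the leftmost non-terminal,
# backtracking over alternative rules when an expansion fails.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP"], ["V", "NP", "PP"]],
    "PP": [["P", "NP"]],
}
LEXICON = {
    "D": {"the", "a"}, "N": {"dog", "cat", "garden"},
    "V": {"chased", "saw"}, "P": {"into"},
}

def parse(symbols, words):
    """Can the symbol list derive exactly the word list?"""
    if not symbols:
        return not words                     # success iff all words consumed
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:                     # non-terminal: try each rule
        return any(parse(expansion + rest, words)
                   for expansion in GRAMMAR[first])
    # terminal category: must match the next word via the lexicon
    return bool(words) and words[0] in LEXICON.get(first, set()) \
        and parse(rest, words[1:])

print(parse(["S"], "the dog chased the cat".split()))  # True
```

This is recognition only; a parser would additionally record which rule succeeded at each step to build the parse tree.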

Context-free grammar:

S   --> NP VP
NP  --> PN          % proper noun
NP  --> Art Adj N
NP  --> Art N
VP  --> VI          % intransitive verb
VP  --> VT NP       % transitive verb
Art --> [the]
Adj --> [lazy]
Adj --> [rapid]
PN  --> [achilles]
N   --> [turtle]
VI  --> [sleeps]
VT  --> [beats]

                                                            Parse tree

Parse tree for "the rapid turtle beats achilles":

[S [NP [Art the] [Adj rapid] [N turtle]]
   [VP [Vt beats] [NP [PN achilles]]]]

Definite Clause Grammars: non-terminals may have arguments

S --> NP(N) VP(N)
NP(N) --> Art(N) N(N)
VP(N) --> VI(N)

Art(singular) --> [a]
Art(singular) --> [the]
Art(plural) --> [the]
N(singular) --> [turtle]
N(plural) --> [turtles]
VI(singular) --> [sleeps]
VI(plural) --> [sleep]

                                                            Number Agreement

                                                            DCGs

Non-terminals may have arguments:
• Variables (start with a capital), e.g. Number, Any
• Constants (start with lower case), e.g. singular, plural
• Structured terms (start with lower case and take arguments themselves), e.g. vp(V, NP)

Parsing needs to be adapted, using unification.

                                                            Unification in a nutshell (cf AI course)

Substitutions, e.g. {Num/singular}, {T/vp(V, NP)}

Applying a substitution:
• Simultaneously replace variables by the corresponding terms
• S(Num){Num/singular} = S(singular)

                                                            Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.:
• Art(singular) and Art(Num): gives {Num/singular}
• Art(singular) and Art(plural): fails
• Art(Num1) and Art(Num2): {Num1/Num2}
• PN(Num, accusative) and PN(singular, Case): {Num/singular, Case/accusative}
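For flat terms like these, unification can be sketched in a few lines of Python. Encoding variables as capitalized strings mirrors the Prolog convention and is an assumption of this example:

```python
# Unification of flat argument lists, e.g. Art(Num) vs Art(singular).
def is_var(t):
    """Variables are capitalized strings, Prolog-style (an assumption here)."""
    return isinstance(t, str) and t[:1].isupper()

def unify(args1, args2, subst=None):
    """Return the most general substitution making the two argument lists
    identical, or None if unification fails."""
    subst = dict(subst or {})
    for a, b in zip(args1, args2):
        a, b = subst.get(a, a), subst.get(b, b)   # dereference bound variables
        if a == b:
            continue
        if is_var(a):
            subst[a] = b
        elif is_var(b):
            subst[b] = a
        else:                                     # two distinct constants
            return None
    return subst

print(unify(["singular"], ["Num"]))     # {'Num': 'singular'}
print(unify(["singular"], ["plural"]))  # None (fails)
print(unify(["Num", "accusative"], ["singular", "Case"]))
```

The three calls reproduce the slide's examples; full Prolog unification would also handle nested structured terms and the occurs check.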

                                                            Parsing with DCGs

                                                            Now require successful unification at each step

S -> NP(N) VP(N)
  -> Art(N) N(N) VP(N)      {N/singular}
  -> a N(singular) VP(singular)
  -> a turtle VP(singular)
  -> a turtle sleeps

S -> a turtle sleep: fails
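The agreement check in this derivation can be sketched in Python. The lexicon encoding, and treating "the" as unspecified for number, are assumptions for illustration:

```python
# Number agreement for Art N VI sentences: the parse succeeds only if
# all number features unify to a single value (the shared N in S -> NP(N) VP(N)).
LEXICON = {
    "a": ("Art", "singular"), "the": ("Art", None),   # "the" fits either number
    "turtle": ("N", "singular"), "turtles": ("N", "plural"),
    "sleeps": ("VI", "singular"), "sleep": ("VI", "plural"),
}

def agrees(sentence):
    """Accept Art N VI sentences whose number features are consistent."""
    words = sentence.split()
    if any(w not in LEXICON for w in words):
        return False
    if [LEXICON[w][0] for w in words] != ["Art", "N", "VI"]:
        return False
    number = None                 # unbound variable N
    for w in words:
        feat = LEXICON[w][1]
        if feat is None:
            continue              # underspecified, unifies with anything
        if number is None:
            number = feat         # bind N
        elif number != feat:
            return False          # unification failure
    return True

print(agrees("a turtle sleeps"))  # True
print(agrees("a turtle sleep"))   # False
```

The shared variable `number` plays the role of N: once bound by "a" or "turtle" to singular, the plural verb form "sleep" can no longer unify with it.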

                                                            Case Marking

PN(singular, nominative) --> [he] ; [she]
PN(singular, accusative) --> [him] ; [her]
PN(plural, nominative) --> [they]
PN(plural, accusative) --> [them]

S --> NP(Number, nominative) VP(Number)
VP(Number) --> V(Number) NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any) --> Det N(Number)

Accepts: "He sees her", "She sees him", "They see her"

But not: "Them see he"

                                                            DCGs

Are strictly more expressive than CFGs. Can represent, for instance, the (non-context-free) language aⁿbⁿcⁿ:
• S(N) --> A(N) B(N) C(N)
• A(0) --> []
• B(0) --> []
• C(0) --> []
• A(s(N)) --> A(N) [a]
• B(s(N)) --> B(N) [b]
• C(s(N)) --> C(N) [c]
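The grammar above generates aⁿbⁿcⁿ: the argument N acts as a counter shared by all three blocks, which is exactly what a context-free grammar cannot do. A direct Python recognizer for comparison (a sketch, not derived from the DCG):

```python
# Recognizer for the language a^n b^n c^n (n >= 0), which the DCG above
# generates but no context-free grammar can describe.
def is_anbncn(s):
    n = len(s) // 3
    return len(s) == 3 * n and s == "a" * n + "b" * n + "c" * n

print(is_anbncn("aabbcc"))  # True
print(is_anbncn("aabbc"))   # False
```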

                                                            Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• Define a probability model for the data
• Compute the probability of each alternative
• Choose the most likely alternative

Illustrated on:
• Shannon Game
• Spelling correction
• Parsing

                                                            Illustration

Wall Street Journal Corpus: 3,000,000 words. The correct parse trees for the sentences are known:
• Constructed by hand
• Can be used to derive stochastic context-free grammars
• SCFGs assign a probability to parse trees

Compute the most probable parse tree.
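The probability an SCFG assigns to a parse tree is the product of the probabilities of the rules used in it. A toy sketch in Python, where the rule probabilities are invented for illustration, not estimated from the WSJ corpus:

```python
# Probability of a parse tree under a stochastic CFG: multiply the
# probabilities of all rules occurring in the tree.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("D", "N")): 0.6, ("NP", ("PN",)): 0.4,
    ("VP", ("V", "NP")): 0.7, ("VP", ("V", "NP", "PP")): 0.3,
}

def tree_prob(tree):
    """tree = (label, [children]) for inner nodes, a plain string for leaves."""
    if isinstance(tree, str):
        return 1.0                      # lexical probabilities omitted here
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULE_PROB.get((label, rhs), 1.0)
    for c in children:
        p *= tree_prob(c)
    return p

t = ("S", [("NP", ["D", "N"]), ("VP", ["V", ("NP", ["PN"])])])
print(round(tree_prob(t), 3))  # 0.168 = 1.0 * 0.6 * 0.7 * 0.4
```

In a treebank-derived SCFG, each rule probability would be estimated from rule counts in the hand-constructed parse trees; the most probable tree is then found by maximizing this product over all parses.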

                                                            Sequences are omni-present

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

                                                            Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
• Regular grammars and Finite State Automata: Markov Models using n-grams, (Hidden) Markov Models, Conditional Random Fields (as an example of using undirected graphical models)
• Probabilistic Context-Free Grammars
• Probabilistic Definite Clause Grammars

All use principles of Part I on Graphical Models.

                                                              Parsing

                                                              Given a sentence find its parse tree Important step in understanding NL

                                                              Parsing

                                                              In bioinformatics allows to predict (elements of) structure from sequence

                                                              Language models based on Grammars

                                                              Grammar Typesbull Regular grammars and Finite State Automatabull Context-Free Grammarsbull Definite Clause Grammars

                                                              A particular type of Unification Based Grammar (Prolog)

                                                              Distinguish lexicon from grammarbull Lexicon (dictionary) contains information about

                                                              words eg word - possible tags (and possibly additional information) flies - V(erb) - N(oun)

                                                              bull Grammar encode rules

                                                              Grammars and parsing Syntactic level best understood and formalized Derivation of grammatical structure parsing

                                                              (more than just recognition) Result of parsing mostly parse tree

                                                              showing the constituents of a sentence eg verb or noun phrases

                                                              Syntax usually specified in terms of a grammar consisting of grammar rules

                                                              Regular Grammars and Finite State Automata

                                                              Lexical information - which words are bull Det(erminer)bull N(oun)bull Vi (intransitive verb) - no

                                                              argumentbull Pn (pronoun) bull Vt (transitive verb) - takes an

                                                              argumentbull Adj (adjective)

                                                              Now acceptbull The cat sleptbull Det N Vi

                                                              As regular grammarbull S -gt [Det] S1 [ ] terminal bull S1 -gt [N] S2bull S2 -gt [Vi]

                                                              Lexicon bull The - Detbull Cat - Nbull Slept - Vi

                                                              bull hellip

                                                              Finite State Automaton

                                                              Sentences

                                                              bull John smiles - Pn Vibull The cat disappeared - Det N Vibull These new shoes hurt - Det Adj N Vibull John liked the old cat PN Vt Det Adj N

                                                              Phrase structure

                                                              S

                                                              NP

                                                              D N

                                                              VP

                                                              NPV

                                                              D N

                                                              PP

                                                              P NP

                                                              D N

                                                              the dog chased a cat into the garden

                                                              Notation

                                                              S sentence D or Det Determiner (eg articles) N noun V verb P preposition NP noun phrase VP verb phrase PP prepositional phrase

                                                              Context Free GrammarS -gt NP VPNP -gt D NVP -gt V NPVP -gt V NP PPPP -gt P NPD -gt [the]D -gt [a]N -gt [dog]N -gt [cat]N -gt [garden]V -gt [chased]V -gt [saw]P -gt [into]

                                                              Terminals ~ Lexicon

                                                              Phrase structure

                                                              Formalism of context-free grammarsbull Nonterminal symbols S NP VP bull Terminal symbols dog cat saw the

                                                              Recursionbull bdquoThe girl thought the dog chased the catldquo

                                                              VP -gt V SN -gt [girl]V -gt [thought]

                                                              Top-down parsing

                                                              S -gt NP VP S -gt Det N VP S -gt The N VP S -gt The dog VP S -gt The dog V NP S -gt The dog chased NP S -gt The dog chased Det N S-gt The dog chased the N S-gt The dog chased the cat

                                                              Context-free grammarSS --gt --gt NPNPVPVP

                                                              NPNP --gt PN --gt PN Proper nounProper noun

                                                              NPNP --gt Art Adj N--gt Art Adj N

                                                              NPNP --gt ArtN--gt ArtN

                                                              VPVP --gt VI --gt VI intransitive verbintransitive verb

                                                              VPVP --gt VT --gt VT NPNP transitive verbtransitive verb

                                                              ArtArt --gt [the]--gt [the]

                                                              AdjAdj --gt [lazy]--gt [lazy]

                                                              AdjAdj --gt [rapid]--gt [rapid]

                                                              PNPN --gt [achilles]--gt [achilles]

                                                              NN --gt [turtle]--gt [turtle]

                                                              VIVI --gt [sleeps]--gt [sleeps]

                                                              VTVT --gt [beats]--gt [beats]

                                                              Parse tree

                                                              SS

                                                              NPNP VPVP

                                                              ArtArt AdjAdj NN VtVt NPNP

                                                              PNPN

                                                              achillesachillesbeatsbeatsturtleturtlerapidrapidthethe

                                                              Definite Clause GrammarsNon-terminals may have arguments

                                                              SS --gt --gt NPNP((NN))VPVP((NN))

                                                              NP(NP(NN)) --gt Art(--gt Art(NN)N()N(NN))

                                                              VP(VP(NN)) --gt VI(--gt VI(NN))

                                                              Art(Art(singularsingular)) --gt [a]--gt [a]

                                                              Art(Art(singularsingular)) --gt [the]--gt [the]

                                                              Art(Art(pluralplural)) --gt [the]--gt [the]

                                                              N(N(singularsingular)) --gt [turtle]--gt [turtle]

                                                              N(N(pluralplural)) --gt [turtles]--gt [turtles]

                                                              VI(VI(singularsingular)) --gt [sleeps]--gt [sleeps]

                                                              VI(VI(pluralplural)) --gt [sleep]--gt [sleep]

                                                              Number Agreement

                                                              DCGs

                                                              Non-terminals may have argumentsbull Variables (start with capital)

                                                              Eg Number Any

                                                              bull Constants (start with lower case) Eg singular plural

                                                              bull Structured terms (start with lower case and take arguments themselves) Eg vp(VNP)

                                                              Parsing needs to be adapted bull Using unification

                                                              Unification in a nutshell (cf AI course)

                                                              Substitutions

                                                              Eg Num singular T vp(VNP)

                                                              Applying substitution bull Simultaneously replace variables by

                                                              corresponding termsbull S(Num) Num singular = S(singular)

                                                              Unification

                                                              Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

                                                              Gives Num singular

                                                              bull Art(singular) and Art(plural) Fails

                                                              bull Art(Num1) and Art(Num2) Num1 Num2

                                                              bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

                                                              Parsing with DCGs

                                                              Now require successful unification at each step

                                                              S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

                                                              S-gt a turtle sleep fails

                                                              Case Marking

PN(singular, nominative) --> [he]; [she]
PN(singular, accusative) --> [him]; [her]
PN(plural, nominative) --> [they]
PN(plural, accusative) --> [them]

S --> NP(Number, nominative) VP(Number)
VP(Number) --> V(Number) NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any) --> Det N(Number)

Accepts: He sees her. She sees him. They see her.
But not: Them see he.
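The constraint this case-marking grammar enforces can be sketched directly in Python. This is not a DCG parser; it flattens the feature checks (nominative subject, number agreement with the verb, accusative object) into explicit tests over a small hand-written lexicon:

```python
# Hand-written lexicon: pronoun -> (number, case); verb -> required subject number.
PRONOUNS = {
    "he":   ("singular", "nominative"), "she":  ("singular", "nominative"),
    "him":  ("singular", "accusative"), "her":  ("singular", "accusative"),
    "they": ("plural",   "nominative"), "them": ("plural",   "accusative"),
}
VERBS = {"sees": "singular", "see": "plural"}

def accepts_case(sentence):
    """Recognize 'PN V PN' sentences, enforcing case marking and number agreement."""
    words = sentence.lower().split()
    if len(words) != 3:
        return False
    subj, verb, obj = words
    if subj not in PRONOUNS or verb not in VERBS or obj not in PRONOUNS:
        return False
    subj_num, subj_case = PRONOUNS[subj]
    _, obj_case = PRONOUNS[obj]
    # Subject must be nominative and agree in number with the verb;
    # the object must be accusative (any number).
    return subj_case == "nominative" and subj_num == VERBS[verb] and obj_case == "accusative"
```

So `accepts_case("He sees her")` succeeds while `accepts_case("Them see he")` fails on the subject's case, just as the DCG derivation would fail to unify.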

                                                              DCGs

Are strictly more expressive than CFGs. Can represent, for instance, the language a^n b^n c^n:
• S(N) -> A(N) B(N) C(N)
• A(0) -> []
• B(0) -> []
• C(0) -> []
• A(s(N)) -> A(N) [a]
• B(s(N)) -> B(N) [b]
• C(s(N)) -> C(N) [c]
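A recognizer for the language this DCG generates is easy to write directly. The point is that a^n b^n c^n is the classic example of a language no context-free grammar can describe, while the DCG handles it by threading the counter N through all three blocks:

```python
def accepts_anbncn(s):
    """Recognize a^n b^n c^n (n >= 0): the leading run of a's fixes n,
    and the whole string must then be exactly n a's, n b's, n c's."""
    n = len(s) - len(s.lstrip("a"))  # length of the leading run of a's
    return s == "a" * n + "b" * n + "c" * n

# accepts_anbncn("aabbcc") -> True
# accepts_anbncn("aabbc")  -> False
```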

                                                              Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• Define a probability model for the data
• Compute the probability of each alternative
• Choose the most likely alternative

Illustrate on:
• The Shannon game
• Spelling correction
• Parsing

                                                              Illustration

Wall Street Journal Corpus: 3,000,000 words, with the correct parse tree for each sentence known:
• Constructed by hand
• Can be used to derive stochastic context-free grammars
• SCFGs assign a probability to each parse tree

Compute the most probable parse tree.
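Under an SCFG, the probability of a parse tree is the product of the probabilities of the rules used in it. A minimal sketch, with made-up rule probabilities (illustrative only, not estimated from the WSJ treebank) and lexical rules taken as probability 1:

```python
from math import prod

# Hypothetical rule probabilities for a toy SCFG (illustrative only;
# real probabilities would be estimated from treebank rule counts).
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("Det", "N")): 0.6,
    ("NP", ("PN",)): 0.4,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("V",)): 0.3,
}

def tree_prob(tree):
    """P(tree) = product over internal nodes of P(rule used at that node).
    A tree is a tuple (label, child, ...); a leaf is a plain word string."""
    if isinstance(tree, str):
        return 1.0  # a word: no rule applied
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULE_PROB.get((label, rhs), 1.0)  # lexical rules default to 1 in this sketch
    return p * prod(tree_prob(c) for c in children)

# P( [S [NP [PN john]] [VP [V sleeps]]] ) = 1.0 * 0.4 * 0.3 = 0.12
```

Finding the *most probable* tree among all parses of a sentence is then done efficiently with a probabilistic CKY-style algorithm rather than by enumerating trees.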

                                                              Sequences are omni-present

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

                                                              Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
• Regular grammars and finite state automata
  - Markov models using n-grams
  - (Hidden) Markov models
• Conditional random fields
  - as an example of using undirected graphical models
• Probabilistic context-free grammars
• Probabilistic definite clause grammars

All use the principles of Part I on graphical models.


                                                                Parsing

In bioinformatics, parsing allows one to predict (elements of) structure from sequence.

                                                                Language models based on Grammars

Grammar types:
• Regular grammars and finite state automata
• Context-free grammars
• Definite clause grammars: a particular type of unification-based grammar (Prolog)

Distinguish the lexicon from the grammar:
• Lexicon (dictionary): contains information about words, e.g. word -> possible tags (and possibly additional information); flies -> V(erb), N(oun)
• Grammar: encodes the rules

Grammars and parsing

The syntactic level is the best understood and formalized.
Derivation of grammatical structure: parsing (more than just recognition).
The result of parsing is mostly a parse tree, showing the constituents of a sentence, e.g. verb or noun phrases.
Syntax is usually specified in terms of a grammar consisting of grammar rules.

                                                                Regular Grammars and Finite State Automata

Lexical information: which words are
• Det(erminer)
• N(oun)
• Pn (pronoun)
• Adj (adjective)
• Vi (intransitive verb): takes no argument
• Vt (transitive verb): takes an argument

Now accept:
• The cat slept
• Det N Vi

As a regular grammar:
• S -> [Det] S1      ([ ]: terminal)
• S1 -> [N] S2
• S2 -> [Vi]

Lexicon:
• the: Det
• cat: N
• slept: Vi
• …

                                                                Finite State Automaton

Sentences:
• John smiles: Pn Vi
• The cat disappeared: Det N Vi
• These new shoes hurt: Det Adj N Vi
• John liked the old cat: Pn Vt Det Adj N
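A regular grammar of this kind corresponds to a finite state automaton over part-of-speech tags. A minimal sketch covering the four sentences above (the state names and the small lexicon are made up for illustration):

```python
# Toy lexicon mapping words to tags (hypothetical entries for the example sentences).
LEXICON = {
    "the": "Det", "these": "Det",
    "cat": "N", "shoes": "N",
    "john": "Pn",
    "new": "Adj", "old": "Adj",
    "smiles": "Vi", "slept": "Vi", "disappeared": "Vi", "hurt": "Vi",
    "liked": "Vt",
}

# Deterministic transitions over tags; "ACC" is the single accepting state.
TRANSITIONS = {
    ("q0", "Det"): "q1", ("q0", "Pn"): "q2",   # subject NP
    ("q1", "Adj"): "q1", ("q1", "N"): "q2",
    ("q2", "Vi"): "ACC", ("q2", "Vt"): "q3",   # verb; Vt expects an object NP
    ("q3", "Det"): "q4", ("q3", "Pn"): "ACC",
    ("q4", "Adj"): "q4", ("q4", "N"): "ACC",
}

def fsa_accepts(sentence):
    state = "q0"
    for word in sentence.lower().split():
        state = TRANSITIONS.get((state, LEXICON.get(word)))
        if state is None:
            return False  # no transition for this (state, tag) pair: reject
    return state == "ACC"
```

Note that the self-loops on Adj give the automaton the same unbounded adjective stacking a regular grammar allows ("the old old old cat ...").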

                                                                Phrase structure

Phrase-structure tree for "the dog chased a cat into the garden":

[S [NP [D the] [N dog]]
   [VP [V chased]
       [NP [D a] [N cat]]
       [PP [P into] [NP [D the] [N garden]]]]]

                                                                Notation

S: sentence; D or Det: determiner (e.g. articles); N: noun; V: verb; P: preposition; NP: noun phrase; VP: verb phrase; PP: prepositional phrase.

Context Free Grammar

S -> NP VP
NP -> D N
VP -> V NP
VP -> V NP PP
PP -> P NP
D -> [the]
D -> [a]
N -> [dog]
N -> [cat]
N -> [garden]
V -> [chased]
V -> [saw]
P -> [into]

                                                                Terminals ~ Lexicon

                                                                Phrase structure

Formalism of context-free grammars:
• Nonterminal symbols: S, NP, VP, …
• Terminal symbols: dog, cat, saw, the, …

Recursion:
• "The girl thought the dog chased the cat"

VP -> V S
N -> [girl]
V -> [thought]

                                                                Top-down parsing

S -> NP VP
  -> Det N VP
  -> The N VP
  -> The dog VP
  -> The dog V NP
  -> The dog chased NP
  -> The dog chased Det N
  -> The dog chased the N
  -> The dog chased the cat
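Top-down parsing as in this derivation can be sketched as a depth-first recursive-descent recognizer over the CFG from the previous slide (safe here because the grammar has no left recursion):

```python
# The CFG from the slides; each non-terminal maps to its alternative right-hand sides.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP"], ["V", "NP", "PP"]],
    "PP": [["P", "NP"]],
    "D":  [["the"], ["a"]],
    "N":  [["dog"], ["cat"], ["garden"]],
    "V":  [["chased"], ["saw"]],
    "P":  [["into"]],
}

def derives(symbols, words):
    """Try to rewrite the symbol list into exactly the word list, top-down."""
    if not symbols:
        return not words          # success only if all input is consumed
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:          # non-terminal: try each alternative expansion
        return any(derives(rhs + rest, words) for rhs in GRAMMAR[first])
    # terminal: must match the next input word
    return bool(words) and words[0] == first and derives(rest, words[1:])

def accepts(sentence):
    return derives(["S"], sentence.lower().split())
```

Backtracking over the alternatives in `any(...)` is what makes this top-down search rather than a single deterministic derivation; a full parser would additionally return the tree, not just True/False.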

Context-free grammar

S -> NP VP
NP -> PN              (proper noun)
NP -> Art Adj N
NP -> Art N
VP -> Vi              (intransitive verb)
VP -> Vt NP           (transitive verb)
Art -> [the]
Adj -> [lazy]
Adj -> [rapid]
PN -> [achilles]
N -> [turtle]
Vi -> [sleeps]
Vt -> [beats]

                                                                Parse tree

Parse tree for "the rapid turtle beats achilles":

[S [NP [Art the] [Adj rapid] [N turtle]]
   [VP [Vt beats] [NP [PN achilles]]]]

Definite Clause Grammars: non-terminals may have arguments

S -> NP(N) VP(N)
NP(N) -> Art(N) N(N)
VP(N) -> Vi(N)
Art(singular) -> [a]
Art(singular) -> [the]
Art(plural) -> [the]
N(singular) -> [turtle]
N(plural) -> [turtles]
Vi(singular) -> [sleeps]
Vi(plural) -> [sleep]

Number agreement

                                                                DCGs

Non-terminals may have arguments:
• Variables (start with a capital), e.g. Number, Any
• Constants (start with lower case), e.g. singular, plural
• Structured terms (start with lower case and take arguments themselves), e.g. vp(V, NP)

Parsing needs to be adapted:
• using unification

Unification in a nutshell (cf. AI course)

                                                                Substitutions

                                                                Eg Num singular T vp(VNP)

                                                                Applying substitution bull Simultaneously replace variables by

                                                                corresponding termsbull S(Num) Num singular = S(singular)

                                                                Unification

                                                                Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

                                                                Gives Num singular

                                                                bull Art(singular) and Art(plural) Fails

                                                                bull Art(Num1) and Art(Num2) Num1 Num2

                                                                bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

                                                                Parsing with DCGs

                                                                Now require successful unification at each step

                                                                S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

                                                                S-gt a turtle sleep fails

                                                                Case Marking

                                                                PN(singularnominative)PN(singularnominative) --gt --gt [he][she][he][she]

                                                                PN(singularaccusative)PN(singularaccusative) --gt --gt [him][her][him][her]

                                                                PN(pluralnominative)PN(pluralnominative) --gt --gt [they][they]

                                                                PN(pluralaccusative)PN(pluralaccusative) --gt --gt [them][them]

                                                                S --gt NP(Numbernominative) NP(Number)S --gt NP(Numbernominative) NP(Number)

                                                                VP(Number) --gt V(Number) VP(Anyaccusative)VP(Number) --gt V(Number) VP(Anyaccusative)

                                                                VP(NumberCase) --gt PN(NumberCase)VP(NumberCase) --gt PN(NumberCase)

                                                                VP(NumberAny) --gt Det N(Number)VP(NumberAny) --gt Det N(Number)

                                                                He sees her She sees him They see her

                                                                But not Them see he

                                                                DCGs

                                                                Are strictly more expressive than CFGs Can represent for instance

                                                                bull S(N) -gt A(N) B(N) C(N)bull A(0) -gt [] bull B(0) -gt []bull C(0) -gt []bull A(s(N)) -gt A(N) [A]bull B(s(N)) -gt B(N) [B]bull C(s(N)) -gt C(N) [C]

                                                                Probabilistic Models

                                                                Traditional grammar models are very rigid bull essentially a yes no decision

                                                                Probabilistic grammarsbull Define a probability models for the databull Compute the probability of each alternativebull Choose the most likely alternative

                                                                Ilustrate on bull Shannon Gamebull Spelling correctionbull Parsing

                                                                Illustration

                                                                Wall Street Journal Corpus 3 000 000 words Correct parse tree for sentences known

                                                                bull Constructed by handbull Can be used to derive stochastic context free

                                                                grammarsbull SCFG assign probability to parse trees

                                                                Compute the most probable parse tree

                                                                Sequences are omni-present

                                                                Therefore the techniques we will see also apply tobull Bioinformatics

                                                                DNA proteins mRNA hellip can all be represented as strings

                                                                bullRobotics Sequences of actions states hellip

                                                                bullhellip

                                                                Rest of the Course

                                                                Limitations traditional grammar models motivate probabilistic extensions bull Regular grammars and Finite State Automata

                                                                All use principles of Part I on Graphical Models Markov Models using n-gramms (Hidden) Markov Models Conditional Random Fields

                                                                bull As an example of using undirected graphical models

                                                                bull Probabilistic Context Free Grammarsbull Probabilistic Definite Clause Grammars

                                                                • Advanced Artificial Intelligence
                                                                • Topic
                                                                • Contents
                                                                • Rationalism versus Empiricism
                                                                • Slide 5
                                                                • This course
                                                                • Ambiguity
                                                                • NLP and Statistics
                                                                • Slide 9
                                                                • Corpora
                                                                • Word Counts
                                                                • Slide 12
                                                                • Word Counts (Brown corpus)
                                                                • Slide 14
                                                                • Zipflsquos Law
                                                                • Language and sequences
                                                                • Key NLP Problem Ambiguity
                                                                • Language Model
                                                                • Example of bad language model
                                                                • A bad language model
                                                                • Slide 22
                                                                • A good language model
                                                                • Why language models
                                                                • Applications
                                                                • Spelling errors
                                                                • Handwriting recognition
                                                                • For Spell Checkers
                                                                • Another dimension in language models
                                                                • Sequence Tagging
                                                                • Slide 31
                                                                • Parsing
                                                                • Slide 33
                                                                • Language models based on Grammars
                                                                • Grammars and parsing
                                                                • Regular Grammars and Finite State Automata
                                                                • Finite State Automaton
                                                                • Phrase structure
                                                                • Notation
                                                                • Context Free Grammar
                                                                • Slide 41
                                                                • Top-down parsing
                                                                • Context-free grammar
                                                                • Parse tree
                                                                • Definite Clause Grammars Non-terminals may have arguments
                                                                • DCGs
                                                                • Unification in a nutshell (cf AI course)
                                                                • Unification
                                                                • Parsing with DCGs
                                                                • Case Marking
                                                                • Slide 51
                                                                • Probabilistic Models
                                                                • Illustration
                                                                • PowerPoint Presentation
                                                                • Sequences are omni-present
                                                                • Rest of the Course

                                                                  Language models based on Grammars

                                                                  Grammar Typesbull Regular grammars and Finite State Automatabull Context-Free Grammarsbull Definite Clause Grammars

                                                                  A particular type of Unification Based Grammar (Prolog)

                                                                  Distinguish lexicon from grammarbull Lexicon (dictionary) contains information about

                                                                  words eg word - possible tags (and possibly additional information) flies - V(erb) - N(oun)

                                                                  bull Grammar encode rules

                                                                  Grammars and parsing Syntactic level best understood and formalized Derivation of grammatical structure parsing

                                                                  (more than just recognition) Result of parsing mostly parse tree

                                                                  showing the constituents of a sentence eg verb or noun phrases

                                                                  Syntax usually specified in terms of a grammar consisting of grammar rules

                                                                  Regular Grammars and Finite State Automata

                                                                  Lexical information - which words are bull Det(erminer)bull N(oun)bull Vi (intransitive verb) - no

                                                                  argumentbull Pn (pronoun) bull Vt (transitive verb) - takes an

                                                                  argumentbull Adj (adjective)

                                                                  Now acceptbull The cat sleptbull Det N Vi

                                                                  As regular grammarbull S -gt [Det] S1 [ ] terminal bull S1 -gt [N] S2bull S2 -gt [Vi]

                                                                  Lexicon bull The - Detbull Cat - Nbull Slept - Vi

                                                                  bull hellip

                                                                  Finite State Automaton

                                                                  Sentences

                                                                  bull John smiles - Pn Vibull The cat disappeared - Det N Vibull These new shoes hurt - Det Adj N Vibull John liked the old cat PN Vt Det Adj N

                                                                  Phrase structure

                                                                  S

                                                                  NP

                                                                  D N

                                                                  VP

                                                                  NPV

                                                                  D N

                                                                  PP

                                                                  P NP

                                                                  D N

                                                                  the dog chased a cat into the garden

                                                                  Notation

                                                                  S sentence D or Det Determiner (eg articles) N noun V verb P preposition NP noun phrase VP verb phrase PP prepositional phrase

                                                                  Context Free GrammarS -gt NP VPNP -gt D NVP -gt V NPVP -gt V NP PPPP -gt P NPD -gt [the]D -gt [a]N -gt [dog]N -gt [cat]N -gt [garden]V -gt [chased]V -gt [saw]P -gt [into]

                                                                  Terminals ~ Lexicon

                                                                  Phrase structure

Formalism of context-free grammars:
• Nonterminal symbols: S, NP, VP, …
• Terminal symbols: dog, cat, saw, the, …

Recursion:
• "The girl thought the dog chased the cat"

VP -> V S
N -> [girl]
V -> [thought]

                                                                  Top-down parsing

S -> NP VP
S -> Det N VP
S -> The N VP
S -> The dog VP
S -> The dog V NP
S -> The dog chased NP
S -> The dog chased Det N
S -> The dog chased the N
S -> The dog chased the cat
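The top-down derivation above can be sketched as a small backtracking recursive-descent parser; this is an illustrative sketch for the toy grammar, not code from the slides:

```python
# A minimal backtracking top-down parser for the toy CFG above.

GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP", "PP"], ["V", "NP"]],  # try the longer rule first
    "PP": [["P", "NP"]],
    "D":  [["the"], ["a"]],
    "N":  [["dog"], ["cat"], ["garden"]],
    "V":  [["chased"], ["saw"]],
    "P":  [["into"]],
}

def parse(symbols, words):
    """Return True if `words` can be derived from the symbol list `symbols`."""
    if not symbols:
        return not words          # success only if all input is consumed
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:           # nonterminal: try each expansion
        return any(parse(body + rest, words) for body in GRAMMAR[head])
    # terminal: must match the next input word
    return bool(words) and words[0] == head and parse(rest, words[1:])

print(parse(["S"], "the dog chased a cat into the garden".split()))  # True
print(parse(["S"], "the dog chased".split()))                        # False
```

Backtracking (`any` over all expansions) is what lets the parser recover when the VP -> V NP PP rule fails and retry with VP -> V NP.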

Context-free grammar

S -> NP VP
NP -> PN (proper noun)
NP -> Art Adj N
NP -> Art N
VP -> VI (intransitive verb)
VP -> VT NP (transitive verb)
Art -> [the]
Adj -> [lazy]
Adj -> [rapid]
PN -> [achilles]
N -> [turtle]
VI -> [sleeps]
VT -> [beats]

                                                                  Parse tree

[S [NP [Art the] [Adj rapid] [N turtle]] [VP [Vt beats] [NP [PN achilles]]]]

Definite Clause Grammars: non-terminals may have arguments

S -> NP(N) VP(N)
NP(N) -> Art(N) N(N)
VP(N) -> VI(N)
Art(singular) -> [a]
Art(singular) -> [the]
Art(plural) -> [the]
N(singular) -> [turtle]
N(plural) -> [turtles]
VI(singular) -> [sleeps]
VI(plural) -> [sleep]

                                                                  Number Agreement

                                                                  DCGs

Non-terminals may have arguments:
• Variables (start with a capital), e.g. Number, Any
• Constants (start with lower case), e.g. singular, plural
• Structured terms (start with lower case and take arguments themselves), e.g. vp(V,NP)

Parsing needs to be adapted:
• Using unification

Unification in a nutshell (cf. AI course)

Substitutions

E.g. {Num/singular}, {T/vp(V,NP)}

Applying a substitution:
• Simultaneously replace variables by the corresponding terms
• S(Num) {Num/singular} = S(singular)

                                                                  Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.:
• Art(singular) and Art(Num): gives {Num/singular}
• Art(singular) and Art(plural): fails
• Art(Num1) and Art(Num2): {Num1/Num2}
• PN(Num, accusative) and PN(singular, Case): {Num/singular, Case/accusative}
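The unification behaviour in these examples can be sketched in a few lines. The term representation is an assumption made for this sketch: capitalized strings are variables, lower-case strings are constants, and `(functor, args)` tuples are structured terms:

```python
# Sketch of first-order unification for DCG-style arguments.

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings until a non-bound term is reached."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(a, b, subst=None):
    """Return the most general substitution unifying a and b, or None."""
    subst = dict(subst or {})
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        subst[a] = b
        return subst
    if is_var(b):
        subst[b] = a
        return subst
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a[1]) == len(b[1])):
        for x, y in zip(a[1], b[1]):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None  # clash of constants or functors

print(unify(("art", ["singular"]), ("art", ["Num"])))    # {'Num': 'singular'}
print(unify(("art", ["singular"]), ("art", ["plural"]))) # None
```

(The sketch omits the occurs check, which full unification needs.)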

                                                                  Parsing with DCGs

                                                                  Now require successful unification at each step

S -> NP(N) VP(N)
S -> Art(N) N(N) VP(N)    {N/singular}
S -> a N(singular) VP(singular)
S -> a turtle VP(singular)
S -> a turtle sleeps

S -> a turtle sleep: fails

                                                                  Case Marking

PN(singular, nominative) -> [he], [she]
PN(singular, accusative) -> [him], [her]
PN(plural, nominative) -> [they]
PN(plural, accusative) -> [them]

S -> NP(Number, nominative) VP(Number)
VP(Number) -> V(Number) NP(Any, accusative)
NP(Number, Case) -> PN(Number, Case)
NP(Number, Any) -> Det N(Number)

Accepts: "He sees her", "She sees him", "They see her"

But not: "Them see he"

                                                                  DCGs

Are strictly more expressive than CFGs. Can represent, for instance, the language aⁿbⁿcⁿ:

• S(N) -> A(N) B(N) C(N)
• A(0) -> []
• B(0) -> []
• C(0) -> []
• A(s(N)) -> A(N) [a]
• B(s(N)) -> B(N) [b]
• C(s(N)) -> C(N) [c]

                                                                  Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• Define a probability model for the data
• Compute the probability of each alternative
• Choose the most likely alternative

Illustrated on:
• Shannon Game
• Spelling correction
• Parsing

                                                                  Illustration

Wall Street Journal Corpus: 3,000,000 words. Correct parse trees for the sentences are known:
• Constructed by hand
• Can be used to derive stochastic context-free grammars
• SCFGs assign a probability to parse trees

Compute the most probable parse tree.
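Under an SCFG, a parse tree's probability is the product of the probabilities of the rules used to build it. A minimal sketch; the rule probabilities below are made up for illustration, not estimated from any treebank:

```python
# Made-up SCFG rule probabilities: (lhs, rhs) -> probability.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("D", "N")): 0.7,
    ("NP", ("N",)): 0.3,
    ("VP", ("V", "NP")): 1.0,
    ("D", ("the",)): 1.0,
    ("N", ("dog",)): 0.5,
    ("N", ("cat",)): 0.5,
    ("V", ("chased",)): 1.0,
}

def tree_prob(tree):
    """tree = (label, [children]); leaf children are plain strings."""
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULE_PROB[(label, rhs)]          # probability of the rule used here
    for c in children:
        if not isinstance(c, str):
            p *= tree_prob(c)            # times the subtrees' probabilities
    return p

t = ("S", [("NP", [("D", ["the"]), ("N", ["dog"])]),
           ("VP", [("V", ["chased"]),
                   ("NP", [("D", ["the"]), ("N", ["cat"])])])])
print(tree_prob(t))  # 0.7 * 0.5 * 0.7 * 0.5 = 0.1225
```

Choosing the most probable parse then means computing this score for each candidate tree and taking the argmax.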

Sequences are omnipresent

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

                                                                  Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
• Regular grammars and Finite State Automata: Markov Models using n-grams, (Hidden) Markov Models, Conditional Random Fields (as an example of using undirected graphical models)
• Probabilistic Context-Free Grammars
• Probabilistic Definite Clause Grammars

All use principles of Part I on Graphical Models.

                                                                  • Advanced Artificial Intelligence
                                                                  • Topic
                                                                  • Contents
                                                                  • Rationalism versus Empiricism
                                                                  • Slide 5
                                                                  • This course
                                                                  • Ambiguity
                                                                  • NLP and Statistics
                                                                  • Slide 9
                                                                  • Corpora
                                                                  • Word Counts
                                                                  • Slide 12
                                                                  • Word Counts (Brown corpus)
                                                                  • Slide 14
• Zipf's Law
                                                                  • Language and sequences
                                                                  • Key NLP Problem Ambiguity
                                                                  • Language Model
                                                                  • Example of bad language model
                                                                  • A bad language model
                                                                  • Slide 22
                                                                  • A good language model
                                                                  • Why language models
                                                                  • Applications
                                                                  • Spelling errors
                                                                  • Handwriting recognition
                                                                  • For Spell Checkers
                                                                  • Another dimension in language models
                                                                  • Sequence Tagging
                                                                  • Slide 31
                                                                  • Parsing
                                                                  • Slide 33
                                                                  • Language models based on Grammars
                                                                  • Grammars and parsing
                                                                  • Regular Grammars and Finite State Automata
                                                                  • Finite State Automaton
                                                                  • Phrase structure
                                                                  • Notation
                                                                  • Context Free Grammar
                                                                  • Slide 41
                                                                  • Top-down parsing
                                                                  • Context-free grammar
                                                                  • Parse tree
                                                                  • Definite Clause Grammars Non-terminals may have arguments
                                                                  • DCGs
                                                                  • Unification in a nutshell (cf AI course)
                                                                  • Unification
                                                                  • Parsing with DCGs
                                                                  • Case Marking
                                                                  • Slide 51
                                                                  • Probabilistic Models
                                                                  • Illustration
                                                                  • PowerPoint Presentation
                                                                  • Sequences are omni-present
                                                                  • Rest of the Course

Grammars and parsing

The syntactic level is the best understood and formalized. Derivation of grammatical structure: parsing (more than just recognition). The result of parsing is mostly a parse tree, showing the constituents of a sentence, e.g. verb or noun phrases. Syntax is usually specified in terms of a grammar consisting of grammar rules.

                                                                    Regular Grammars and Finite State Automata

Lexical information - which words are:
• Det(erminer)
• N(oun)
• Vi (intransitive verb - no argument)
• Pn (pronoun)
• Vt (transitive verb - takes an argument)
• Adj (adjective)

Now accept:
• The cat slept
• Det N Vi

As regular grammar:
• S -> [Det] S1    ([ ] = terminal)
• S1 -> [N] S2
• S2 -> [Vi]

Lexicon:
• the - Det
• cat - N
• slept - Vi
• …
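The regular grammar above corresponds directly to a finite state automaton: states S, S1, S2 and transitions labelled by word categories. A minimal sketch with the three-word lexicon from the slide:

```python
# Toy lexicon: word -> category.
LEXICON = {"the": "Det", "cat": "N", "slept": "Vi"}

# Transition table: (state, category) -> next state.
# "END" marks the accepting state reached after Vi.
TRANSITIONS = {
    ("S", "Det"): "S1",
    ("S1", "N"): "S2",
    ("S2", "Vi"): "END",
}

def accepts(sentence):
    """Run the automaton over the sentence; accept iff it ends in END."""
    state = "S"
    for word in sentence.lower().split():
        cat = LEXICON.get(word)
        state = TRANSITIONS.get((state, cat))
        if state is None:        # no transition: reject
            return False
    return state == "END"

print(accepts("The cat slept"))  # True
print(accepts("The cat"))        # False
```

Extending the lexicon and transition table (Pn, Vt, Adj, …) yields the automaton for the full sentence set above.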

                                                                    • PowerPoint Presentation
                                                                    • Sequences are omni-present
                                                                    • Rest of the Course

                                                                      Regular Grammars and Finite State Automata

Lexical information - which words there are:
• Det(erminer)
• N(oun)
• Pn (pronoun)
• Adj (adjective)
• Vi (intransitive verb) - takes no argument
• Vt (transitive verb) - takes an argument

Now accept:
• The cat slept
• Det N Vi

As regular grammar:
• S -> [Det] S1      ([ ] marks a terminal)
• S1 -> [N] S2
• S2 -> [Vi]

Lexicon:
• The - Det
• Cat - N
• Slept - Vi
• ...

                                                                      Finite State Automaton

Sentences:
• John smiles - Pn Vi
• The cat disappeared - Det N Vi
• These new shoes hurt - Det Adj N Vi
• John liked the old cat - Pn Vt Det Adj N
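A minimal Python sketch (illustrative, not from the slides) of running the regular grammar above as a finite state automaton over lexical tags; the Pn transition and the toy lexicon entries are assumptions added to cover the "John smiles" example:

```python
# Toy lexicon mapping words to tags; entries are illustrative assumptions.
LEXICON = {"the": "Det", "cat": "N", "slept": "Vi",
           "john": "Pn", "smiles": "Vi"}

# Transitions of the automaton: state -> {tag: next state}.
# "ACCEPT" is reached after a Vi, mirroring S -> [Det] S1, S1 -> [N] S2, S2 -> [Vi];
# the Pn edge (S -> [Pn] S2) is an added assumption.
TRANSITIONS = {
    "S":  {"Det": "S1", "Pn": "S2"},
    "S1": {"N": "S2"},
    "S2": {"Vi": "ACCEPT"},
}

def accepts(sentence):
    """Return True iff the sentence's tag sequence drives the FSA to ACCEPT."""
    state = "S"
    for word in sentence.lower().split():
        tag = LEXICON.get(word)
        state = TRANSITIONS.get(state, {}).get(tag)
        if state is None:          # no transition: reject immediately
            return False
    return state == "ACCEPT"

print(accepts("The cat slept"))   # True: Det N Vi
print(accepts("John smiles"))     # True: Pn Vi
print(accepts("The slept"))       # False: no N after Det
```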

                                                                      Phrase structure

[Parse-tree figure: [S [NP [D the] [N dog]] [VP [V chased] [NP [D a] [N cat]] [PP [P into] [NP [D the] [N garden]]]] for "the dog chased a cat into the garden"]

                                                                      Notation

• S: sentence
• D or Det: determiner (e.g. articles)
• N: noun
• V: verb
• P: preposition
• NP: noun phrase
• VP: verb phrase
• PP: prepositional phrase

Context Free Grammar

S -> NP VP
NP -> D N
VP -> V NP
VP -> V NP PP
PP -> P NP
D -> [the]
D -> [a]
N -> [dog]
N -> [cat]
N -> [garden]
V -> [chased]
V -> [saw]
P -> [into]

                                                                      Terminals ~ Lexicon

                                                                      Phrase structure

Formalism of context-free grammars:
• Nonterminal symbols: S, NP, VP, ...
• Terminal symbols: dog, cat, saw, the, ...

Recursion:
• "The girl thought the dog chased the cat"

VP -> V S
N -> [girl]
V -> [thought]

                                                                      Top-down parsing

S -> NP VP
  -> Det N VP
  -> The N VP
  -> The dog VP
  -> The dog V NP
  -> The dog chased NP
  -> The dog chased Det N
  -> The dog chased the N
  -> The dog chased the cat
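The derivation above can be run mechanically. A minimal recursive-descent sketch in Python (the grammar and lexicon follow the slides; the parser itself and its helper names are illustrative assumptions):

```python
# Grammar rules from the slides; longer VP expansion tried first so that
# "chased a cat into the garden" attaches the PP.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP", "PP"], ["V", "NP"]],
    "PP": [["P", "NP"]],
}
LEXICON = {
    "D": {"the", "a"}, "N": {"dog", "cat", "garden"},
    "V": {"chased", "saw"}, "P": {"into"},
}

def parse(symbol, words, i):
    """Yield every position reachable after deriving `symbol` from words[i:]."""
    if symbol in LEXICON:                       # preterminal category
        if i < len(words) and words[i] in LEXICON[symbol]:
            yield i + 1
        return
    for expansion in GRAMMAR[symbol]:           # nonterminal: try each rule
        positions = [i]
        for part in expansion:                  # thread positions left to right
            positions = [k for j in positions for k in parse(part, words, j)]
        yield from positions

def accepts(sentence):
    words = sentence.lower().split()
    return len(words) in parse("S", words, 0)   # some parse consumes all words

print(accepts("the dog chased a cat into the garden"))  # True
print(accepts("the dog chased the cat"))                # True
print(accepts("dog the chased"))                        # False
```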

Context-free grammar

S --> NP VP
NP --> PN          (proper noun)
NP --> Art Adj N
NP --> Art N
VP --> VI          (intransitive verb)
VP --> VT NP       (transitive verb)
Art --> [the]
Adj --> [lazy]
Adj --> [rapid]
PN --> [achilles]
N --> [turtle]
VI --> [sleeps]
VT --> [beats]

                                                                      Parse tree

[S [NP [Art the] [Adj rapid] [N turtle]] [VP [Vt beats] [NP [PN achilles]]]]

"the rapid turtle beats achilles"

Definite Clause Grammars: Non-terminals may have arguments

S --> NP(N) VP(N)
NP(N) --> Art(N) N(N)
VP(N) --> VI(N)
Art(singular) --> [a]
Art(singular) --> [the]
Art(plural) --> [the]
N(singular) --> [turtle]
N(plural) --> [turtles]
VI(singular) --> [sleeps]
VI(plural) --> [sleep]

Number Agreement

                                                                      DCGs

Non-terminals may have arguments:
• Variables (start with a capital), e.g. Number, Any
• Constants (start with lower case), e.g. singular, plural
• Structured terms (start with lower case and take arguments themselves), e.g. vp(V, NP)

Parsing needs to be adapted:
• Using unification

                                                                      Unification in a nutshell (cf AI course)

Substitutions

E.g. {Num / singular}, {T / vp(V, NP)}

Applying a substitution:
• Simultaneously replace variables by the corresponding terms
• S(Num) {Num / singular} = S(singular)

                                                                      Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.:
• Art(singular) and Art(Num): gives {Num / singular}
• Art(singular) and Art(plural): fails
• Art(Num1) and Art(Num2): {Num1 / Num2}
• PN(Num, accusative) and PN(singular, Case): {Num / singular, Case / accusative}
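These unification cases can be sketched in Python. This is an illustrative toy unifier (no occurs check); encoding terms as tuples and variables as capitalized strings is an assumption, not slide material:

```python
def is_var(t):
    """Variables are strings starting with an uppercase letter, as in DCGs."""
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings until a non-variable or an unbound variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(a, b, subst=None):
    """Return the most general substitution making a and b equal, or None."""
    subst = dict(subst or {})
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        subst[a] = b
        return subst
    if is_var(b):
        subst[b] = a
        return subst
    # Structured terms are tuples, e.g. ("art", "singular") for Art(singular).
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None  # clash, e.g. singular vs plural

print(unify(("art", "singular"), ("art", "Num")))    # {'Num': 'singular'}
print(unify(("art", "singular"), ("art", "plural"))) # None
print(unify(("pn", "Num", "accusative"), ("pn", "singular", "Case")))
# {'Num': 'singular', 'Case': 'accusative'}
```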

                                                                      Parsing with DCGs

Now require successful unification at each step:

S -> NP(N) VP(N)
  -> Art(N) N(N) VP(N)          with {N / singular}
  -> a N(singular) VP(singular)
  -> a turtle VP(singular)
  -> a turtle sleeps

S -> a turtle sleep: fails

                                                                      Case Marking

PN(singular, nominative) --> [he] | [she]
PN(singular, accusative) --> [him] | [her]
PN(plural, nominative) --> [they]
PN(plural, accusative) --> [them]

S --> NP(Number, nominative) VP(Number)
VP(Number) --> V(Number) NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any) --> Det N(Number)

He sees her. She sees him. They see her.
But not: Them see he.

                                                                      DCGs

Are strictly more expressive than CFGs. Can represent, for instance, the language a^n b^n c^n:

• S(N) -> A(N) B(N) C(N)
• A(0) -> []
• B(0) -> []
• C(0) -> []
• A(s(N)) -> A(N) [a]
• B(s(N)) -> B(N) [b]
• C(s(N)) -> C(N) [c]
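Instantiating the count argument N as 0, s(0), s(s(0)), ... makes the rules above derive exactly the strings a^n b^n c^n, a classic non-context-free language. A tiny Python sketch (illustrative, not from the slides) of the generated language:

```python
def generate(n):
    """String derived by S(N) with N = s^n(0): n a's, then n b's, then n c's."""
    return "a" * n + "b" * n + "c" * n

def in_language(s):
    """Check membership in {a^n b^n c^n : n >= 0}."""
    n = len(s) // 3
    return s == generate(n)

print(generate(3))             # 'aaabbbccc'
print(in_language("aabbcc"))   # True
print(in_language("aabbc"))    # False
```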

                                                                      Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• Define a probability model for the data
• Compute the probability of each alternative
• Choose the most likely alternative

Illustrated on:
• Shannon Game
• Spelling correction
• Parsing

                                                                      Illustration

Wall Street Journal Corpus: 3,000,000 words. The correct parse tree for each sentence is known:
• Constructed by hand
• Can be used to derive stochastic context free grammars
• SCFGs assign a probability to each parse tree

Compute the most probable parse tree.
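The idea of scoring a parse can be sketched as follows. The rule probabilities below are invented for illustration, not estimated from the WSJ treebank, and the tree encoding is an assumption:

```python
from math import prod

# P(rule | left-hand side); probabilities per LHS sum to 1 (toy values).
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("D", "N")): 1.0,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("V", "NP", "PP")): 0.3,
}

def tree_prob(tree):
    """P(tree) = product of the probabilities of the rules used in it.
    A tree is (label, child_trees...); leaves are plain strings."""
    if isinstance(tree, str):       # a terminal word
        return 1.0                  # lexical probabilities omitted here
    lhs, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    # Preterminal-over-word rules, e.g. ("D", "the"), default to 1.0.
    p = RULE_PROB.get((lhs, rhs), 1.0)
    return p * prod(tree_prob(c) for c in children)

t = ("S", ("NP", ("D", "the"), ("N", "dog")),
          ("VP", ("V", "saw"), ("NP", ("D", "a"), ("N", "cat"))))
print(tree_prob(t))   # 1.0 * 1.0 * 0.7 * 1.0 = 0.7
```

Picking the most probable parse then amounts to computing this score for every candidate tree and taking the argmax (done efficiently with dynamic programming in practice).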

                                                                      Sequences are omni-present

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, ... can all be represented as strings
• Robotics: sequences of actions, states, ...
• ...

                                                                      Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
• Regular grammars and Finite State Automata
  - Markov Models using n-grams
  - (Hidden) Markov Models
  - Conditional Random Fields (as an example of using undirected graphical models)
• Probabilistic Context Free Grammars
• Probabilistic Definite Clause Grammars

All use principles of Part I on Graphical Models.

                                                                      • Advanced Artificial Intelligence
                                                                      • Topic
                                                                      • Contents
                                                                      • Rationalism versus Empiricism
                                                                      • Slide 5
                                                                      • This course
                                                                      • Ambiguity
                                                                      • NLP and Statistics
                                                                      • Slide 9
                                                                      • Corpora
                                                                      • Word Counts
                                                                      • Slide 12
                                                                      • Word Counts (Brown corpus)
                                                                      • Slide 14
                                                                      • Zipflsquos Law
                                                                      • Language and sequences
                                                                      • Key NLP Problem Ambiguity
                                                                      • Language Model
                                                                      • Example of bad language model
                                                                      • A bad language model
                                                                      • Slide 22
                                                                      • A good language model
                                                                      • Why language models
                                                                      • Applications
                                                                      • Spelling errors
                                                                      • Handwriting recognition
                                                                      • For Spell Checkers
                                                                      • Another dimension in language models
                                                                      • Sequence Tagging
                                                                      • Slide 31
                                                                      • Parsing
                                                                      • Slide 33
                                                                      • Language models based on Grammars
                                                                      • Grammars and parsing
                                                                      • Regular Grammars and Finite State Automata
                                                                      • Finite State Automaton
                                                                      • Phrase structure
                                                                      • Notation
                                                                      • Context Free Grammar
                                                                      • Slide 41
                                                                      • Top-down parsing
                                                                      • Context-free grammar
                                                                      • Parse tree
                                                                      • Definite Clause Grammars Non-terminals may have arguments
                                                                      • DCGs
                                                                      • Unification in a nutshell (cf AI course)
                                                                      • Unification
                                                                      • Parsing with DCGs
                                                                      • Case Marking
                                                                      • Slide 51
                                                                      • Probabilistic Models
                                                                      • Illustration
                                                                      • PowerPoint Presentation
                                                                      • Sequences are omni-present
                                                                      • Rest of the Course

                                                                        Finite State Automaton

                                                                        Sentences

                                                                        bull John smiles - Pn Vibull The cat disappeared - Det N Vibull These new shoes hurt - Det Adj N Vibull John liked the old cat PN Vt Det Adj N

                                                                        Phrase structure

                                                                        S

                                                                        NP

                                                                        D N

                                                                        VP

                                                                        NPV

                                                                        D N

                                                                        PP

                                                                        P NP

                                                                        D N

                                                                        the dog chased a cat into the garden

                                                                        Notation

                                                                        S sentence D or Det Determiner (eg articles) N noun V verb P preposition NP noun phrase VP verb phrase PP prepositional phrase

                                                                        Context Free GrammarS -gt NP VPNP -gt D NVP -gt V NPVP -gt V NP PPPP -gt P NPD -gt [the]D -gt [a]N -gt [dog]N -gt [cat]N -gt [garden]V -gt [chased]V -gt [saw]P -gt [into]

                                                                        Terminals ~ Lexicon

                                                                        Phrase structure

                                                                        Formalism of context-free grammarsbull Nonterminal symbols S NP VP bull Terminal symbols dog cat saw the

                                                                        Recursionbull bdquoThe girl thought the dog chased the catldquo

                                                                        VP -gt V SN -gt [girl]V -gt [thought]

                                                                        Top-down parsing

                                                                        S -gt NP VP S -gt Det N VP S -gt The N VP S -gt The dog VP S -gt The dog V NP S -gt The dog chased NP S -gt The dog chased Det N S-gt The dog chased the N S-gt The dog chased the cat

                                                                        Context-free grammarSS --gt --gt NPNPVPVP

                                                                        NPNP --gt PN --gt PN Proper nounProper noun

                                                                        NPNP --gt Art Adj N--gt Art Adj N

                                                                        NPNP --gt ArtN--gt ArtN

                                                                        VPVP --gt VI --gt VI intransitive verbintransitive verb

                                                                        VPVP --gt VT --gt VT NPNP transitive verbtransitive verb

                                                                        ArtArt --gt [the]--gt [the]

                                                                        AdjAdj --gt [lazy]--gt [lazy]

                                                                        AdjAdj --gt [rapid]--gt [rapid]

                                                                        PNPN --gt [achilles]--gt [achilles]

                                                                        NN --gt [turtle]--gt [turtle]

                                                                        VIVI --gt [sleeps]--gt [sleeps]

                                                                        VTVT --gt [beats]--gt [beats]

                                                                        Parse tree

                                                                        SS

                                                                        NPNP VPVP

                                                                        ArtArt AdjAdj NN VtVt NPNP

                                                                        PNPN

                                                                        achillesachillesbeatsbeatsturtleturtlerapidrapidthethe

                                                                        Definite Clause GrammarsNon-terminals may have arguments

                                                                        SS --gt --gt NPNP((NN))VPVP((NN))

                                                                        NP(NP(NN)) --gt Art(--gt Art(NN)N()N(NN))

                                                                        VP(VP(NN)) --gt VI(--gt VI(NN))

                                                                        Art(Art(singularsingular)) --gt [a]--gt [a]

                                                                        Art(Art(singularsingular)) --gt [the]--gt [the]

                                                                        Art(Art(pluralplural)) --gt [the]--gt [the]

                                                                        N(N(singularsingular)) --gt [turtle]--gt [turtle]

                                                                        N(N(pluralplural)) --gt [turtles]--gt [turtles]

                                                                        VI(VI(singularsingular)) --gt [sleeps]--gt [sleeps]

                                                                        VI(VI(pluralplural)) --gt [sleep]--gt [sleep]

                                                                        Number Agreement

                                                                        DCGs

                                                                        Non-terminals may have argumentsbull Variables (start with capital)

                                                                        Eg Number Any

                                                                        bull Constants (start with lower case) Eg singular plural

                                                                        bull Structured terms (start with lower case and take arguments themselves) Eg vp(VNP)

                                                                        Parsing needs to be adapted bull Using unification

                                                                        Unification in a nutshell (cf AI course)

                                                                        Substitutions

                                                                        Eg Num singular T vp(VNP)

                                                                        Applying substitution bull Simultaneously replace variables by

                                                                        corresponding termsbull S(Num) Num singular = S(singular)

                                                                        Unification

                                                                        Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

                                                                        Gives Num singular

                                                                        bull Art(singular) and Art(plural) Fails

                                                                        bull Art(Num1) and Art(Num2) Num1 Num2

                                                                        bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

                                                                        Parsing with DCGs

                                                                        Now require successful unification at each step

                                                                        S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

                                                                        S-gt a turtle sleep fails

                                                                        Case Marking

                                                                        PN(singularnominative)PN(singularnominative) --gt --gt [he][she][he][she]

                                                                        PN(singularaccusative)PN(singularaccusative) --gt --gt [him][her][him][her]

                                                                        PN(pluralnominative)PN(pluralnominative) --gt --gt [they][they]

                                                                        PN(pluralaccusative)PN(pluralaccusative) --gt --gt [them][them]

                                                                        S --gt NP(Numbernominative) NP(Number)S --gt NP(Numbernominative) NP(Number)

                                                                        VP(Number) --gt V(Number) VP(Anyaccusative)VP(Number) --gt V(Number) VP(Anyaccusative)

                                                                        VP(NumberCase) --gt PN(NumberCase)VP(NumberCase) --gt PN(NumberCase)

                                                                        VP(NumberAny) --gt Det N(Number)VP(NumberAny) --gt Det N(Number)

                                                                        He sees her She sees him They see her

                                                                        But not Them see he

                                                                        DCGs

                                                                        Are strictly more expressive than CFGs Can represent for instance

                                                                        bull S(N) -gt A(N) B(N) C(N)bull A(0) -gt [] bull B(0) -gt []bull C(0) -gt []bull A(s(N)) -gt A(N) [A]bull B(s(N)) -gt B(N) [B]bull C(s(N)) -gt C(N) [C]

                                                                        Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• Define a probability model for the data
• Compute the probability of each alternative
• Choose the most likely alternative

Illustrated on:
• Shannon game
• Spelling correction
• Parsing

                                                                        Illustration

Wall Street Journal Corpus: 3,000,000 words. The correct parse tree for each sentence is known:

• Constructed by hand
• Can be used to derive stochastic context-free grammars (SCFGs)
• SCFGs assign a probability to each parse tree

Compute the most probable parse tree.
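A minimal sketch of how an SCFG scores a parse tree as the product of the probabilities of the rules used in it. The rule probabilities below are invented for illustration; a real SCFG estimates them from a treebank:

```python
# Hypothetical rule probabilities for a toy SCFG (numbers are made up).
rule_prob = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("D", "N")): 0.6,
    ("NP", ("PN",)): 0.4,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("V", "NP", "PP")): 0.3,
}

def tree_prob(tree):
    """P(tree) = product of the probabilities of the rules used in it.
    A tree is (label, [children]); leaves are bare word strings.
    Lexical rules such as N -> [dog] get probability 1 here for brevity."""
    if isinstance(tree, str):          # terminal word: contributes no rule
        return 1.0
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = rule_prob.get((label, rhs), 1.0)
    for child in children:
        p *= tree_prob(child)
    return p

t = ("S", [("NP", [("D", ["the"]), ("N", ["dog"])]),
           ("VP", [("V", ["chased"]), ("NP", [("D", ["a"]), ("N", ["cat"])])])])
tree_prob(t)  # 1.0 * 0.6 * 0.7 * 0.6 = 0.252
```

The most probable parse of an ambiguous sentence is then simply the tree maximizing this product.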

                                                                        Sequences are omni-present

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

                                                                        Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
• Regular grammars and Finite State Automata: Markov Models using n-grams, (Hidden) Markov Models, Conditional Random Fields (as an example of using undirected graphical models)
• Probabilistic Context-Free Grammars
• Probabilistic Definite Clause Grammars

All use the principles of Part I on Graphical Models.

                                                                        • Advanced Artificial Intelligence
                                                                        • Topic
                                                                        • Contents
                                                                        • Rationalism versus Empiricism
                                                                        • Slide 5
                                                                        • This course
                                                                        • Ambiguity
                                                                        • NLP and Statistics
                                                                        • Slide 9
                                                                        • Corpora
                                                                        • Word Counts
                                                                        • Slide 12
                                                                        • Word Counts (Brown corpus)
                                                                        • Slide 14
• Zipf's Law
                                                                        • Language and sequences
                                                                        • Key NLP Problem Ambiguity
                                                                        • Language Model
                                                                        • Example of bad language model
                                                                        • A bad language model
                                                                        • Slide 22
                                                                        • A good language model
                                                                        • Why language models
                                                                        • Applications
                                                                        • Spelling errors
                                                                        • Handwriting recognition
                                                                        • For Spell Checkers
                                                                        • Another dimension in language models
                                                                        • Sequence Tagging
                                                                        • Slide 31
                                                                        • Parsing
                                                                        • Slide 33
                                                                        • Language models based on Grammars
                                                                        • Grammars and parsing
                                                                        • Regular Grammars and Finite State Automata
                                                                        • Finite State Automaton
                                                                        • Phrase structure
                                                                        • Notation
                                                                        • Context Free Grammar
                                                                        • Slide 41
                                                                        • Top-down parsing
                                                                        • Context-free grammar
                                                                        • Parse tree
                                                                        • Definite Clause Grammars Non-terminals may have arguments
                                                                        • DCGs
                                                                        • Unification in a nutshell (cf AI course)
                                                                        • Unification
                                                                        • Parsing with DCGs
                                                                        • Case Marking
                                                                        • Slide 51
                                                                        • Probabilistic Models
                                                                        • Illustration
                                                                        • PowerPoint Presentation
                                                                        • Sequences are omni-present
                                                                        • Rest of the Course

Phrase structure

[S [NP [D the] [N dog]] [VP [V chased] [NP [D a] [N cat]] [PP [P into] [NP [D the] [N garden]]]]]

(Parse tree for "the dog chased a cat into the garden".)

                                                                          Notation

S: sentence; D or Det: determiner (e.g. articles); N: noun; V: verb; P: preposition; NP: noun phrase; VP: verb phrase; PP: prepositional phrase

Context-Free Grammar

S --> NP, VP
NP --> D, N
VP --> V, NP
VP --> V, NP, PP
PP --> P, NP
D --> [the]
D --> [a]
N --> [dog]
N --> [cat]
N --> [garden]
V --> [chased]
V --> [saw]
P --> [into]

                                                                          Terminals ~ Lexicon

                                                                          Phrase structure

Formalism of context-free grammars:
• Nonterminal symbols: S, NP, VP, …
• Terminal symbols: dog, cat, saw, the, …

Recursion:
• "The girl thought the dog chased the cat"

VP --> V, S
N --> [girl]
V --> [thought]

                                                                          Top-down parsing

S -> NP VP
  -> Det N VP
  -> The N VP
  -> The dog VP
  -> The dog V NP
  -> The dog chased NP
  -> The dog chased Det N
  -> The dog chased the N
  -> The dog chased the cat
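The derivation above can be sketched as a small top-down (recursive-descent) recognizer with backtracking over alternatives. This is illustrative Python, not part of the slides; the grammar dictionary mirrors the toy CFG:

```python
# Top-down recognizer for the toy CFG; rules are tried in order,
# backtracking over alternatives via the any(...) call.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP", "PP"], ["V", "NP"]],
    "PP": [["P", "NP"]],
    "D":  [["the"], ["a"]],
    "N":  [["dog"], ["cat"], ["garden"]],
    "V":  [["chased"], ["saw"]],
    "P":  [["into"]],
}

def parse(symbols, words):
    """Return True if the symbol list can derive exactly the word list."""
    if not symbols:
        return not words
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:                       # nonterminal: expand it
        return any(parse(expansion + rest, words)
                   for expansion in GRAMMAR[first])
    # terminal: must match the next input word
    return bool(words) and words[0] == first and parse(rest, words[1:])

parse(["S"], "the dog chased a cat into the garden".split())  # True
```

Note that this simple strategy loops forever on left-recursive rules; the toy grammar has none.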

Context-free grammar

S --> NP, VP
NP --> PN              % proper noun
NP --> Art, Adj, N
NP --> Art, N
VP --> VI              % intransitive verb
VP --> VT, NP          % transitive verb
Art --> [the]
Adj --> [lazy]
Adj --> [rapid]
PN --> [achilles]
N --> [turtle]
VI --> [sleeps]
VT --> [beats]

Parse tree

[S [NP [Art the] [Adj rapid] [N turtle]] [VP [VT beats] [NP [PN achilles]]]]

(Parse tree for "the rapid turtle beats achilles".)

Definite Clause Grammars
Non-terminals may have arguments:

S --> NP(N), VP(N)
NP(N) --> Art(N), N(N)
VP(N) --> VI(N)
Art(singular) --> [a]
Art(singular) --> [the]
Art(plural) --> [the]
N(singular) --> [turtle]
N(plural) --> [turtles]
VI(singular) --> [sleeps]
VI(plural) --> [sleep]

Number agreement
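The number-agreement grammar above can be sketched as a feature-passing recognizer: the shared Number argument of S --> NP(N), VP(N) becomes a value that every category in the sentence must agree on. The lexicon entries are the slide's; the Python itself is an illustration, not the DCG mechanism:

```python
# Lexicon indexed by (category, number-feature), taken from the DCG above.
LEX = {
    ("Art", "singular"): {"a", "the"},
    ("Art", "plural"):   {"the"},
    ("N", "singular"):   {"turtle"},
    ("N", "plural"):     {"turtles"},
    ("VI", "singular"):  {"sleeps"},
    ("VI", "plural"):    {"sleep"},
}

def sentence(words):
    """S --> NP(N), VP(N): try both values of the shared Number feature;
    Art, N and VI must all carry the same one."""
    return any(
        len(words) == 3
        and words[0] in LEX[("Art", num)]
        and words[1] in LEX[("N", num)]
        and words[2] in LEX[("VI", num)]
        for num in ("singular", "plural")
    )

sentence("a turtle sleeps".split())  # True
sentence("a turtle sleep".split())   # False
```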

                                                                          DCGs

Non-terminals may have arguments:
• Variables (start with a capital), e.g. Number, Any
• Constants (start with lower case), e.g. singular, plural
• Structured terms (start with lower case and take arguments themselves), e.g. vp(V, NP)

Parsing needs to be adapted:
• using unification

                                                                          Unification in a nutshell (cf AI course)

Substitutions, e.g. {Num / singular}, {T / vp(V, NP)}

Applying a substitution:
• simultaneously replace variables by the corresponding terms
• S(Num) {Num / singular} = S(singular)
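Applying a substitution can be sketched in a few lines of Python. The term encoding is an assumption made for illustration: a term is either a string (a variable if it starts with an upper-case letter, otherwise a constant) or a tuple (functor, arg1, …):

```python
def apply_subst(term, subst):
    """Simultaneously replace each variable in `term` by its binding in
    `subst` (a dict like {"Num": "singular"}); constants pass through."""
    if isinstance(term, str):
        return subst.get(term, term) if term[:1].isupper() else term
    functor, *args = term
    return (functor, *[apply_subst(a, subst) for a in args])

apply_subst(("s", "Num"), {"Num": "singular"})  # -> ("s", "singular")
```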

                                                                          Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.:
• Art(singular) and Art(Num): gives {Num / singular}
• Art(singular) and Art(plural): fails
• Art(Num1) and Art(Num2): {Num1 / Num2}
• PN(Num, accusative) and PN(singular, Case): {Num / singular, Case / accusative}
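The four examples above can be reproduced with a small most-general-unifier sketch (same term encoding as before: capitalized strings are variables, tuples are structured terms; no occurs-check, for brevity):

```python
def unify(t1, t2):
    """Return the most general substitution making t1 and t2 identical,
    or None if they do not unify. A sketch, without an occurs-check."""
    subst = {}

    def walk(t):  # follow variable bindings already in subst
        while isinstance(t, str) and t[:1].isupper() and t in subst:
            t = subst[t]
        return t

    def un(a, b):
        a, b = walk(a), walk(b)
        if a == b:
            return True
        if isinstance(a, str) and a[:1].isupper():
            subst[a] = b
            return True
        if isinstance(b, str) and b[:1].isupper():
            subst[b] = a
            return True
        if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
            return all(un(x, y) for x, y in zip(a, b))
        return False  # two distinct constants, or arity mismatch

    return subst if un(t1, t2) else None

unify(("pn", "Num", "accusative"), ("pn", "singular", "Case"))
# -> {"Num": "singular", "Case": "accusative"}
unify(("art", "singular"), ("art", "plural"))  # -> None
```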

                                                                          Parsing with DCGs

                                                                          Now require successful unification at each step

S -> NP(N) VP(N)
  -> Art(N) N(N) VP(N)            {N / singular}
  -> a N(singular) VP(singular)
  -> a turtle VP(singular)
  -> a turtle sleeps

S -> … a turtle sleep: fails

                                                                          Case Marking

                                                                          PN(singularnominative)PN(singularnominative) --gt --gt [he][she][he][she]

                                                                          PN(singularaccusative)PN(singularaccusative) --gt --gt [him][her][him][her]

                                                                          PN(pluralnominative)PN(pluralnominative) --gt --gt [they][they]

                                                                          PN(pluralaccusative)PN(pluralaccusative) --gt --gt [them][them]

                                                                          S --gt NP(Numbernominative) NP(Number)S --gt NP(Numbernominative) NP(Number)

                                                                          VP(Number) --gt V(Number) VP(Anyaccusative)VP(Number) --gt V(Number) VP(Anyaccusative)

                                                                          VP(NumberCase) --gt PN(NumberCase)VP(NumberCase) --gt PN(NumberCase)

                                                                          VP(NumberAny) --gt Det N(Number)VP(NumberAny) --gt Det N(Number)

                                                                          He sees her She sees him They see her

                                                                          But not Them see he

                                                                          DCGs

                                                                          Are strictly more expressive than CFGs Can represent for instance

                                                                          bull S(N) -gt A(N) B(N) C(N)bull A(0) -gt [] bull B(0) -gt []bull C(0) -gt []bull A(s(N)) -gt A(N) [A]bull B(s(N)) -gt B(N) [B]bull C(s(N)) -gt C(N) [C]

                                                                          Probabilistic Models

                                                                          Traditional grammar models are very rigid bull essentially a yes no decision

                                                                          Probabilistic grammarsbull Define a probability models for the databull Compute the probability of each alternativebull Choose the most likely alternative

                                                                          Ilustrate on bull Shannon Gamebull Spelling correctionbull Parsing

                                                                          Illustration

                                                                          Wall Street Journal Corpus 3 000 000 words Correct parse tree for sentences known

                                                                          bull Constructed by handbull Can be used to derive stochastic context free

                                                                          grammarsbull SCFG assign probability to parse trees

                                                                          Compute the most probable parse tree

                                                                          Sequences are omni-present

                                                                          Therefore the techniques we will see also apply tobull Bioinformatics

                                                                          DNA proteins mRNA hellip can all be represented as strings

                                                                          bullRobotics Sequences of actions states hellip

                                                                          bullhellip

                                                                          Rest of the Course

                                                                          Limitations traditional grammar models motivate probabilistic extensions bull Regular grammars and Finite State Automata

                                                                          All use principles of Part I on Graphical Models Markov Models using n-gramms (Hidden) Markov Models Conditional Random Fields

                                                                          bull As an example of using undirected graphical models

                                                                          bull Probabilistic Context Free Grammarsbull Probabilistic Definite Clause Grammars

                                                                          • Advanced Artificial Intelligence
                                                                          • Topic
                                                                          • Contents
                                                                          • Rationalism versus Empiricism
                                                                          • Slide 5
                                                                          • This course
                                                                          • Ambiguity
                                                                          • NLP and Statistics
                                                                          • Slide 9
                                                                          • Corpora
                                                                          • Word Counts
                                                                          • Slide 12
                                                                          • Word Counts (Brown corpus)
                                                                          • Slide 14
                                                                          • Zipflsquos Law
                                                                          • Language and sequences
                                                                          • Key NLP Problem Ambiguity
                                                                          • Language Model
                                                                          • Example of bad language model
                                                                          • A bad language model
                                                                          • Slide 22
                                                                          • A good language model
                                                                          • Why language models
                                                                          • Applications
                                                                          • Spelling errors
                                                                          • Handwriting recognition
                                                                          • For Spell Checkers
                                                                          • Another dimension in language models
                                                                          • Sequence Tagging
                                                                          • Slide 31
                                                                          • Parsing
                                                                          • Slide 33
                                                                          • Language models based on Grammars
                                                                          • Grammars and parsing
                                                                          • Regular Grammars and Finite State Automata
                                                                          • Finite State Automaton
                                                                          • Phrase structure
                                                                          • Notation
                                                                          • Context Free Grammar
                                                                          • Slide 41
                                                                          • Top-down parsing
                                                                          • Context-free grammar
                                                                          • Parse tree
                                                                          • Definite Clause Grammars Non-terminals may have arguments
                                                                          • DCGs
                                                                          • Unification in a nutshell (cf AI course)
                                                                          • Unification
                                                                          • Parsing with DCGs
                                                                          • Case Marking
                                                                          • Slide 51
                                                                          • Probabilistic Models
                                                                          • Illustration
                                                                          • PowerPoint Presentation
                                                                          • Sequences are omni-present
                                                                          • Rest of the Course

                                                                            Notation

                                                                            S sentence D or Det Determiner (eg articles) N noun V verb P preposition NP noun phrase VP verb phrase PP prepositional phrase

                                                                            Context Free GrammarS -gt NP VPNP -gt D NVP -gt V NPVP -gt V NP PPPP -gt P NPD -gt [the]D -gt [a]N -gt [dog]N -gt [cat]N -gt [garden]V -gt [chased]V -gt [saw]P -gt [into]

                                                                            Terminals ~ Lexicon

                                                                            Phrase structure

                                                                            Formalism of context-free grammarsbull Nonterminal symbols S NP VP bull Terminal symbols dog cat saw the

                                                                            Recursionbull bdquoThe girl thought the dog chased the catldquo

                                                                            VP -gt V SN -gt [girl]V -gt [thought]

                                                                            Top-down parsing

                                                                            S -gt NP VP S -gt Det N VP S -gt The N VP S -gt The dog VP S -gt The dog V NP S -gt The dog chased NP S -gt The dog chased Det N S-gt The dog chased the N S-gt The dog chased the cat

                                                                            Context-free grammarSS --gt --gt NPNPVPVP

                                                                            NPNP --gt PN --gt PN Proper nounProper noun

                                                                            NPNP --gt Art Adj N--gt Art Adj N

                                                                            NPNP --gt ArtN--gt ArtN

                                                                            VPVP --gt VI --gt VI intransitive verbintransitive verb

                                                                            VPVP --gt VT --gt VT NPNP transitive verbtransitive verb

                                                                            ArtArt --gt [the]--gt [the]

                                                                            AdjAdj --gt [lazy]--gt [lazy]

                                                                            AdjAdj --gt [rapid]--gt [rapid]

                                                                            PNPN --gt [achilles]--gt [achilles]

                                                                            NN --gt [turtle]--gt [turtle]

                                                                            VIVI --gt [sleeps]--gt [sleeps]

                                                                            VTVT --gt [beats]--gt [beats]

                                                                            Parse tree

                                                                            SS

                                                                            NPNP VPVP

                                                                            ArtArt AdjAdj NN VtVt NPNP

                                                                            PNPN

                                                                            achillesachillesbeatsbeatsturtleturtlerapidrapidthethe

                                                                            Definite Clause GrammarsNon-terminals may have arguments

                                                                            SS --gt --gt NPNP((NN))VPVP((NN))

                                                                            NP(NP(NN)) --gt Art(--gt Art(NN)N()N(NN))

                                                                            VP(VP(NN)) --gt VI(--gt VI(NN))

                                                                            Art(Art(singularsingular)) --gt [a]--gt [a]

                                                                            Art(Art(singularsingular)) --gt [the]--gt [the]

                                                                            Art(Art(pluralplural)) --gt [the]--gt [the]

                                                                            N(N(singularsingular)) --gt [turtle]--gt [turtle]

                                                                            N(N(pluralplural)) --gt [turtles]--gt [turtles]

                                                                            VI(VI(singularsingular)) --gt [sleeps]--gt [sleeps]

                                                                            VI(VI(pluralplural)) --gt [sleep]--gt [sleep]

                                                                            Number Agreement

                                                                            DCGs

                                                                            Non-terminals may have argumentsbull Variables (start with capital)

                                                                            Eg Number Any

                                                                            bull Constants (start with lower case) Eg singular plural

                                                                            bull Structured terms (start with lower case and take arguments themselves) Eg vp(VNP)

                                                                            Parsing needs to be adapted bull Using unification

                                                                            Unification in a nutshell (cf AI course)

                                                                            Substitutions

                                                                            Eg Num singular T vp(VNP)

                                                                            Applying substitution bull Simultaneously replace variables by

                                                                            corresponding termsbull S(Num) Num singular = S(singular)

                                                                            Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.:
• Art(singular) and Art(Num): gives {Num / singular}
• Art(singular) and Art(plural): fails
• Art(Num1) and Art(Num2): {Num1 / Num2}
• PN(Num, accusative) and PN(singular, Case): {Num / singular, Case / accusative}
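The unification examples above can be reproduced with a minimal unifier. This is a hedged sketch (no occurs check, illustrative term encoding): terms are tuples like ("art", ("singular",)), and capitalized strings are variables.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    # Follow variable bindings until a non-variable or unbound variable.
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(t1, t2, s=None):
    """Return the most general substitution making t1 and t2 identical, or None."""
    s = dict(s or {})
    t1, t2 = walk(t1, s), walk(t2, s)
    if t1 == t2:
        return s
    if is_var(t1):
        s[t1] = t2
        return s
    if is_var(t2):
        s[t2] = t1
        return s
    if isinstance(t1, tuple) and isinstance(t2, tuple):
        (f1, a1), (f2, a2) = t1, t2
        if f1 != f2 or len(a1) != len(a2):
            return None
        for x, y in zip(a1, a2):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

# Art(singular) and Art(Num): gives {Num / singular}
print(unify(("art", ("singular",)), ("art", ("Num",))))
# Art(singular) and Art(plural): fails
print(unify(("art", ("singular",)), ("art", ("plural",))))
```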

                                                                            Parsing with DCGs

                                                                            Now require successful unification at each step

S -> NP(N) VP(N)
S -> Art(N) N(N) VP(N)    {N / singular}
S -> a N(singular) VP(singular)
S -> a turtle VP(singular)
S -> a turtle sleeps

"S -> a turtle sleep" fails.
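The derivation above can be mimicked without a full unifier by enumerating the possible bindings for the agreement variable N. A minimal sketch (the lexicon and function names are illustrative, not from the slides):

```python
# Lexicon for the slide grammar S(N) --> Art(N) N(N) VI(N),
# keyed by (category, number).
LEX = {
    ("art", "singular"): {"a", "the"}, ("art", "plural"): {"the"},
    ("n", "singular"): {"turtle"},     ("n", "plural"): {"turtles"},
    ("vi", "singular"): {"sleeps"},    ("vi", "plural"): {"sleep"},
}

def parse_s(words):
    """Try both bindings for N; return the one that makes all categories agree."""
    for num in ("singular", "plural"):
        if (len(words) == 3
                and words[0] in LEX[("art", num)]
                and words[1] in LEX[("n", num)]
                and words[2] in LEX[("vi", num)]):
            return num
    return None

print(parse_s("a turtle sleeps".split()))  # 'singular'
print(parse_s("a turtle sleep".split()))   # None (agreement fails)
```

Real DCG parsing threads substitutions through unification instead of enumerating values, which also handles variables left unbound.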

                                                                            Case Marking

PN(singular, nominative) --> [he] ; [she]
PN(singular, accusative) --> [him] ; [her]
PN(plural, nominative) --> [they]
PN(plural, accusative) --> [them]

S --> NP(Number, nominative) VP(Number)
VP(Number) --> V(Number) NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any) --> Det N(Number)

Accepts "He sees her", "She sees him", "They see her".

But not "Them see he".

                                                                            DCGs

DCGs are strictly more expressive than CFGs. They can represent, for instance, the non-context-free language a^n b^n c^n:

• S(N) --> A(N) B(N) C(N)
• A(0) --> []
• B(0) --> []
• C(0) --> []
• A(s(N)) --> A(N) [a]
• B(s(N)) --> B(N) [b]
• C(s(N)) --> C(N) [c]
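The DCG above counts the three blocks with s(...)-terms, so all three must have the same length. A direct Python membership check for the same language a^n b^n c^n (illustrative, not from the slides) makes the counting argument concrete:

```python
def anbncn(s):
    """True iff s is a^n b^n c^n for some n >= 0."""
    n = len(s) // 3
    # The length must split evenly, and each third must be a uniform block.
    return len(s) == 3 * n and s == "a" * n + "b" * n + "c" * n

print(anbncn("aabbcc"))  # True
print(anbncn("aabbc"))   # False
```

No CFG can enforce this triple length match, which is exactly the expressiveness gap the slide points at.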

                                                                            Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• Define a probability model for the data
• Compute the probability of each alternative
• Choose the most likely alternative

Illustrated on:
• the Shannon game
• Spelling correction
• Parsing

                                                                            Illustration

Wall Street Journal Corpus: 3,000,000 words, with the correct parse tree for each sentence known
• Constructed by hand
• Can be used to derive stochastic context-free grammars
• SCFGs assign a probability to each parse tree

Compute the most probable parse tree.
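In an SCFG, a parse tree's probability is the product of the probabilities of the rules used to build it; picking the most probable tree means maximizing this product. A hedged sketch (the rule probabilities below are made up for illustration; in practice they are estimated from treebank counts):

```python
from math import prod

# Hypothetical rule probabilities; rules with the same left-hand side
# would sum to 1 in a proper SCFG.
RULE_P = {
    "S -> NP VP": 1.0,
    "NP -> Det N": 0.6,
    "NP -> PN": 0.4,
    "VP -> V NP": 0.7,
}

def parse_prob(rules_used):
    """P(tree) = product of the probabilities of the rules in the tree."""
    return prod(RULE_P[r] for r in rules_used)

p = parse_prob(["S -> NP VP", "NP -> Det N", "VP -> V NP", "NP -> PN"])
print(round(p, 3))  # 0.168
```

Comparing this product across the alternative trees for an ambiguous sentence is what "compute the most probable parse tree" amounts to (efficient algorithms such as probabilistic CKY avoid enumerating all trees).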

                                                                            Sequences are omni-present

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

                                                                            Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
• Regular grammars and finite state automata: Markov models using n-grams, (hidden) Markov models, and conditional random fields (as an example of using undirected graphical models)
• Probabilistic context-free grammars
• Probabilistic definite clause grammars

All use principles of Part I on graphical models.

                                                                            • Advanced Artificial Intelligence
                                                                            • Topic
                                                                            • Contents
                                                                            • Rationalism versus Empiricism
                                                                            • Slide 5
                                                                            • This course
                                                                            • Ambiguity
                                                                            • NLP and Statistics
                                                                            • Slide 9
                                                                            • Corpora
                                                                            • Word Counts
                                                                            • Slide 12
                                                                            • Word Counts (Brown corpus)
                                                                            • Slide 14
• Zipf's Law
                                                                            • Language and sequences
                                                                            • Key NLP Problem Ambiguity
                                                                            • Language Model
                                                                            • Example of bad language model
                                                                            • A bad language model
                                                                            • Slide 22
                                                                            • A good language model
                                                                            • Why language models
                                                                            • Applications
                                                                            • Spelling errors
                                                                            • Handwriting recognition
                                                                            • For Spell Checkers
                                                                            • Another dimension in language models
                                                                            • Sequence Tagging
                                                                            • Slide 31
                                                                            • Parsing
                                                                            • Slide 33
                                                                            • Language models based on Grammars
                                                                            • Grammars and parsing
                                                                            • Regular Grammars and Finite State Automata
                                                                            • Finite State Automaton
                                                                            • Phrase structure
                                                                            • Notation
                                                                            • Context Free Grammar
                                                                            • Slide 41
                                                                            • Top-down parsing
                                                                            • Context-free grammar
                                                                            • Parse tree
                                                                            • Definite Clause Grammars Non-terminals may have arguments
                                                                            • DCGs
                                                                            • Unification in a nutshell (cf AI course)
                                                                            • Unification
                                                                            • Parsing with DCGs
                                                                            • Case Marking
                                                                            • Slide 51
                                                                            • Probabilistic Models
                                                                            • Illustration
                                                                            • PowerPoint Presentation
                                                                            • Sequences are omni-present
                                                                            • Rest of the Course


                                                                              N(N(singularsingular)) --gt [turtle]--gt [turtle]

                                                                              N(N(pluralplural)) --gt [turtles]--gt [turtles]

                                                                              VI(VI(singularsingular)) --gt [sleeps]--gt [sleeps]

                                                                              VI(VI(pluralplural)) --gt [sleep]--gt [sleep]

                                                                              Number Agreement

                                                                              DCGs

                                                                              Non-terminals may have argumentsbull Variables (start with capital)

                                                                              Eg Number Any

                                                                              bull Constants (start with lower case) Eg singular plural

                                                                              bull Structured terms (start with lower case and take arguments themselves) Eg vp(VNP)

                                                                              Parsing needs to be adapted bull Using unification

                                                                              Unification in a nutshell (cf AI course)

                                                                              Substitutions

                                                                              Eg Num singular T vp(VNP)

                                                                              Applying substitution bull Simultaneously replace variables by

                                                                              corresponding termsbull S(Num) Num singular = S(singular)

                                                                              Unification

                                                                              Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

                                                                              Gives Num singular

                                                                              bull Art(singular) and Art(plural) Fails

                                                                              bull Art(Num1) and Art(Num2) Num1 Num2

                                                                              bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

                                                                              Parsing with DCGs

                                                                              Now require successful unification at each step

                                                                              S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

                                                                              S-gt a turtle sleep fails

                                                                              Case Marking

                                                                              PN(singularnominative)PN(singularnominative) --gt --gt [he][she][he][she]

                                                                              PN(singularaccusative)PN(singularaccusative) --gt --gt [him][her][him][her]

                                                                              PN(pluralnominative)PN(pluralnominative) --gt --gt [they][they]

                                                                              PN(pluralaccusative)PN(pluralaccusative) --gt --gt [them][them]

                                                                              S --gt NP(Numbernominative) NP(Number)S --gt NP(Numbernominative) NP(Number)

                                                                              VP(Number) --gt V(Number) VP(Anyaccusative)VP(Number) --gt V(Number) VP(Anyaccusative)

                                                                              VP(NumberCase) --gt PN(NumberCase)VP(NumberCase) --gt PN(NumberCase)

                                                                              VP(NumberAny) --gt Det N(Number)VP(NumberAny) --gt Det N(Number)

                                                                              He sees her She sees him They see her

                                                                              But not Them see he

                                                                              DCGs

                                                                              Are strictly more expressive than CFGs Can represent for instance

                                                                              bull S(N) -gt A(N) B(N) C(N)bull A(0) -gt [] bull B(0) -gt []bull C(0) -gt []bull A(s(N)) -gt A(N) [A]bull B(s(N)) -gt B(N) [B]bull C(s(N)) -gt C(N) [C]

                                                                              Probabilistic Models

                                                                              Traditional grammar models are very rigid bull essentially a yes no decision

                                                                              Probabilistic grammarsbull Define a probability models for the databull Compute the probability of each alternativebull Choose the most likely alternative

                                                                              Ilustrate on bull Shannon Gamebull Spelling correctionbull Parsing

                                                                              Illustration

                                                                              Wall Street Journal Corpus 3 000 000 words Correct parse tree for sentences known

                                                                              bull Constructed by handbull Can be used to derive stochastic context free

                                                                              grammarsbull SCFG assign probability to parse trees

                                                                              Compute the most probable parse tree

                                                                              Sequences are omni-present

                                                                              Therefore the techniques we will see also apply tobull Bioinformatics

                                                                              DNA proteins mRNA hellip can all be represented as strings

                                                                              bullRobotics Sequences of actions states hellip

                                                                              bullhellip

                                                                              Rest of the Course

                                                                              Limitations traditional grammar models motivate probabilistic extensions bull Regular grammars and Finite State Automata

                                                                              All use principles of Part I on Graphical Models Markov Models using n-gramms (Hidden) Markov Models Conditional Random Fields

                                                                              bull As an example of using undirected graphical models

                                                                              bull Probabilistic Context Free Grammarsbull Probabilistic Definite Clause Grammars

                                                                              • Advanced Artificial Intelligence
                                                                              • Topic
                                                                              • Contents
                                                                              • Rationalism versus Empiricism
                                                                              • Slide 5
                                                                              • This course
                                                                              • Ambiguity
                                                                              • NLP and Statistics
                                                                              • Slide 9
                                                                              • Corpora
                                                                              • Word Counts
                                                                              • Slide 12
                                                                              • Word Counts (Brown corpus)
                                                                              • Slide 14
                                                                              • Zipflsquos Law
                                                                              • Language and sequences
                                                                              • Key NLP Problem Ambiguity
                                                                              • Language Model
                                                                              • Example of bad language model
                                                                              • A bad language model
                                                                              • Slide 22
                                                                              • A good language model
                                                                              • Why language models
                                                                              • Applications
                                                                              • Spelling errors
                                                                              • Handwriting recognition
                                                                              • For Spell Checkers
                                                                              • Another dimension in language models
                                                                              • Sequence Tagging
                                                                              • Slide 31
                                                                              • Parsing
                                                                              • Slide 33
                                                                              • Language models based on Grammars
                                                                              • Grammars and parsing
                                                                              • Regular Grammars and Finite State Automata
                                                                              • Finite State Automaton
                                                                              • Phrase structure
                                                                              • Notation
                                                                              • Context Free Grammar
                                                                              • Slide 41
                                                                              • Top-down parsing
                                                                              • Context-free grammar
                                                                              • Parse tree
                                                                              • Definite Clause Grammars Non-terminals may have arguments
                                                                              • DCGs
                                                                              • Unification in a nutshell (cf AI course)
                                                                              • Unification
                                                                              • Parsing with DCGs
                                                                              • Case Marking
                                                                              • Slide 51
                                                                              • Probabilistic Models
                                                                              • Illustration
                                                                              • PowerPoint Presentation
                                                                              • Sequences are omni-present
                                                                              • Rest of the Course

                                                                                Phrase structure

                                                                                Formalism of context-free grammarsbull Nonterminal symbols S NP VP bull Terminal symbols dog cat saw the

                                                                                Recursionbull bdquoThe girl thought the dog chased the catldquo

                                                                                VP -gt V SN -gt [girl]V -gt [thought]

                                                                                Top-down parsing

                                                                                S -gt NP VP S -gt Det N VP S -gt The N VP S -gt The dog VP S -gt The dog V NP S -gt The dog chased NP S -gt The dog chased Det N S-gt The dog chased the N S-gt The dog chased the cat

                                                                                Context-free grammarSS --gt --gt NPNPVPVP

                                                                                NPNP --gt PN --gt PN Proper nounProper noun

                                                                                NPNP --gt Art Adj N--gt Art Adj N

                                                                                NPNP --gt ArtN--gt ArtN

                                                                                VPVP --gt VI --gt VI intransitive verbintransitive verb

                                                                                VPVP --gt VT --gt VT NPNP transitive verbtransitive verb

                                                                                ArtArt --gt [the]--gt [the]

                                                                                AdjAdj --gt [lazy]--gt [lazy]

                                                                                AdjAdj --gt [rapid]--gt [rapid]

                                                                                PNPN --gt [achilles]--gt [achilles]

                                                                                NN --gt [turtle]--gt [turtle]

                                                                                VIVI --gt [sleeps]--gt [sleeps]

                                                                                VTVT --gt [beats]--gt [beats]

                                                                                Parse tree

                                                                                SS

                                                                                NPNP VPVP

                                                                                ArtArt AdjAdj NN VtVt NPNP

                                                                                PNPN

                                                                                achillesachillesbeatsbeatsturtleturtlerapidrapidthethe

Definite Clause Grammars
Non-terminals may have arguments

S     --> NP(N) VP(N)
NP(N) --> Art(N) N(N)
VP(N) --> VI(N)

Art(singular) --> [a]
Art(singular) --> [the]
Art(plural)   --> [the]
N(singular)   --> [turtle]
N(plural)     --> [turtles]
VI(singular)  --> [sleeps]
VI(plural)    --> [sleep]

Number agreement

DCGs

Non-terminals may have arguments:
• Variables (start with a capital), e.g. Number, Any
• Constants (start with lower case), e.g. singular, plural
• Structured terms (start with lower case and take arguments themselves), e.g. vp(V, NP)

Parsing needs to be adapted:
• using unification

Unification in a nutshell (cf. AI course)

Substitutions, e.g. {Num / singular}, {T / vp(V, NP)}

Applying a substitution:
• simultaneously replace the variables by the corresponding terms
• S(Num) {Num / singular} = S(singular)

Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.
• Art(singular) and Art(Num): gives {Num / singular}
• Art(singular) and Art(plural): fails
• Art(Num1) and Art(Num2): {Num1 / Num2}
• PN(Num, accusative) and PN(singular, Case): {Num / singular, Case / accusative}
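The cases above can be checked with a few lines of Python. This is a minimal sketch under assumed conventions, not the course's implementation: a variable is a capitalised string, a constant is a lower-case string, and a compound term is a (functor, argument-list) pair.

```python
def is_var(t):
    # convention assumed here: capitalised strings are variables
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    # follow variable bindings until a non-variable or an unbound variable
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(a, b, subst=None):
    """Return the most general substitution making a and b equal, or None.

    Sketch only: no occurs check, as is common in Prolog implementations.
    """
    subst = dict(subst or {})
    stack = [(a, b)]
    while stack:
        x, y = (walk(t, subst) for t in stack.pop())
        if x == y:
            continue
        elif is_var(x):
            subst[x] = y
        elif is_var(y):
            subst[y] = x
        elif isinstance(x, tuple) and isinstance(y, tuple) \
                and x[0] == y[0] and len(x[1]) == len(y[1]):
            stack.extend(zip(x[1], y[1]))
        else:
            return None
    return subst

print(unify(("art", ["singular"]), ("art", ["Num"])))    # {'Num': 'singular'}
print(unify(("art", ["singular"]), ("art", ["plural"]))) # None
print(unify(("pn", ["Num", "accusative"]),
            ("pn", ["singular", "Case"])))               # {'Case': 'accusative', 'Num': 'singular'}
```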

Parsing with DCGs

Now require a successful unification at each step:

S -> NP(N) VP(N)
  -> Art(N) N(N) VP(N)          {N / singular}
  -> a N(singular) VP(singular)
  -> a turtle VP(singular)
  -> a turtle sleeps

S -> a turtle sleep: fails
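The agreement behaviour of this derivation can be emulated directly: thread one Number feature through the parse and require every lexical lookup to agree with it. The Python sketch below hard-codes the rule S --> NP(N) VP(N) with NP(N) --> Art(N) N(N) and VP(N) --> VI(N); the table and the function name `parses` are assumptions for illustration.

```python
# Lexicon of the number-agreement DCG, indexed by (category, number).
LEX = {
    ("Art", "singular"): {"a", "the"}, ("Art", "plural"): {"the"},
    ("N", "singular"): {"turtle"},     ("N", "plural"): {"turtles"},
    ("VI", "singular"): {"sleeps"},    ("VI", "plural"): {"sleep"},
}

def parses(words):
    """Accept Art N VI sentences in which all three words share one Number."""
    if len(words) != 3:
        return False
    art, noun, verb = words
    return any(art in LEX[("Art", num)]
               and noun in LEX[("N", num)]
               and verb in LEX[("VI", num)]
               for num in ("singular", "plural"))

print(parses("a turtle sleeps".split()))  # True
print(parses("a turtle sleep".split()))   # False: number clash
```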

Case Marking

PN(singular, nominative) --> [he] ; [she]
PN(singular, accusative) --> [him] ; [her]
PN(plural, nominative)   --> [they]
PN(plural, accusative)   --> [them]

S                --> NP(Number, nominative) VP(Number)
VP(Number)       --> V(Number) NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any)  --> Det N(Number)

He sees her. She sees him. They see her.
But not: Them see he.

DCGs

are strictly more expressive than CFGs. They can represent, for instance, the non-context-free language aⁿbⁿcⁿ:

• S(N)    --> A(N) B(N) C(N)
• A(0)    --> []
• B(0)    --> []
• C(0)    --> []
• A(s(N)) --> A(N) [a]
• B(s(N)) --> B(N) [b]
• C(s(N)) --> C(N) [c]
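Here N counts the repetitions in successor notation (0, s(0), s(s(0)), …), so all three blocks are forced to the same length. A quick Python membership check for the same language, as a simple stand-in for running the DCG:

```python
def is_anbncn(s):
    """True iff s is a^n b^n c^n for some n >= 0 (same count of each letter)."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n

print(is_anbncn("aabbcc"))  # True  (n = 2)
print(is_anbncn("aabbc"))   # False (counts must match)
```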

Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• define a probability model for the data
• compute the probability of each alternative
• choose the most likely alternative

Illustrated on:
• the Shannon game
• spelling correction
• parsing

Illustration

Wall Street Journal Corpus: 3,000,000 words, with the correct parse tree for each sentence known
• constructed by hand
• can be used to derive stochastic context-free grammars
• an SCFG assigns a probability to each parse tree

Compute the most probable parse tree.
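The SCFG idea can be made concrete: every rule carries a probability, and a parse tree's probability is the product of the probabilities of the rules it uses. The rule probabilities and the encoding below are invented for illustration, not estimated from the WSJ corpus.

```python
import math

# Assumed toy rule probabilities; probabilities of rules with the same
# left-hand side sum to 1, as an SCFG requires.
RULE_PROB = {
    ("S",  ("NP", "VP")): 1.0,
    ("NP", ("Art", "N")): 0.6,
    ("NP", ("PN",)):      0.4,
    ("VP", ("VT", "NP")): 0.5,
    ("VP", ("VI",)):      0.5,
}

def tree_prob(tree):
    """tree = (label, [subtrees]) at rule nodes, (label, word) at the leaves."""
    label, children = tree
    if isinstance(children, str):
        return 1.0  # lexical leaf; a full SCFG would also weight lexical rules
    rhs = tuple(child[0] for child in children)
    return RULE_PROB[(label, rhs)] * math.prod(tree_prob(c) for c in children)

t = ("S", [("NP", [("PN", "achilles")]),
           ("VP", [("VT", "beats"), ("NP", [("Art", "the"), ("N", "turtle")])])])
print(tree_prob(t))  # 0.12 (= 1.0 * 0.4 * 0.5 * 0.6)
```

Choosing the most probable parse tree for a sentence then means computing this score for every candidate tree (or, efficiently, with dynamic programming) and taking the maximum.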

Sequences are omni-present

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
• Regular grammars and Finite State Automata:
  Markov Models using n-grams, (Hidden) Markov Models, and Conditional Random Fields
  (the latter as an example of using undirected graphical models)
• Probabilistic Context-Free Grammars
• Probabilistic Definite Clause Grammars

All use principles of Part I on Graphical Models.

                                                                                • Advanced Artificial Intelligence
                                                                                • Topic
                                                                                • Contents
                                                                                • Rationalism versus Empiricism
                                                                                • Slide 5
                                                                                • This course
                                                                                • Ambiguity
                                                                                • NLP and Statistics
                                                                                • Slide 9
                                                                                • Corpora
                                                                                • Word Counts
                                                                                • Slide 12
                                                                                • Word Counts (Brown corpus)
                                                                                • Slide 14
                                                                                • Zipflsquos Law
                                                                                • Language and sequences
                                                                                • Key NLP Problem Ambiguity
                                                                                • Language Model
                                                                                • Example of bad language model
                                                                                • A bad language model
                                                                                • Slide 22
                                                                                • A good language model
                                                                                • Why language models
                                                                                • Applications
                                                                                • Spelling errors
                                                                                • Handwriting recognition
                                                                                • For Spell Checkers
                                                                                • Another dimension in language models
                                                                                • Sequence Tagging
                                                                                • Slide 31
                                                                                • Parsing
                                                                                • Slide 33
                                                                                • Language models based on Grammars
                                                                                • Grammars and parsing
                                                                                • Regular Grammars and Finite State Automata
                                                                                • Finite State Automaton
                                                                                • Phrase structure
                                                                                • Notation
                                                                                • Context Free Grammar
                                                                                • Slide 41
                                                                                • Top-down parsing
                                                                                • Context-free grammar
                                                                                • Parse tree
                                                                                • Definite Clause Grammars Non-terminals may have arguments
                                                                                • DCGs
                                                                                • Unification in a nutshell (cf AI course)
                                                                                • Unification
                                                                                • Parsing with DCGs
                                                                                • Case Marking
                                                                                • Slide 51
                                                                                • Probabilistic Models
                                                                                • Illustration
                                                                                • PowerPoint Presentation
                                                                                • Sequences are omni-present
                                                                                • Rest of the Course

                                                                                  Top-down parsing

                                                                                  S -gt NP VP S -gt Det N VP S -gt The N VP S -gt The dog VP S -gt The dog V NP S -gt The dog chased NP S -gt The dog chased Det N S-gt The dog chased the N S-gt The dog chased the cat

                                                                                  Context-free grammarSS --gt --gt NPNPVPVP

                                                                                  NPNP --gt PN --gt PN Proper nounProper noun

                                                                                  NPNP --gt Art Adj N--gt Art Adj N

                                                                                  NPNP --gt ArtN--gt ArtN

                                                                                  VPVP --gt VI --gt VI intransitive verbintransitive verb

                                                                                  VPVP --gt VT --gt VT NPNP transitive verbtransitive verb

                                                                                  ArtArt --gt [the]--gt [the]

                                                                                  AdjAdj --gt [lazy]--gt [lazy]

                                                                                  AdjAdj --gt [rapid]--gt [rapid]

                                                                                  PNPN --gt [achilles]--gt [achilles]

                                                                                  NN --gt [turtle]--gt [turtle]

                                                                                  VIVI --gt [sleeps]--gt [sleeps]

                                                                                  VTVT --gt [beats]--gt [beats]

                                                                                  Parse tree

                                                                                  SS

                                                                                  NPNP VPVP

                                                                                  ArtArt AdjAdj NN VtVt NPNP

                                                                                  PNPN

                                                                                  achillesachillesbeatsbeatsturtleturtlerapidrapidthethe

                                                                                  Definite Clause GrammarsNon-terminals may have arguments

                                                                                  SS --gt --gt NPNP((NN))VPVP((NN))

                                                                                  NP(NP(NN)) --gt Art(--gt Art(NN)N()N(NN))

                                                                                  VP(VP(NN)) --gt VI(--gt VI(NN))

                                                                                  Art(Art(singularsingular)) --gt [a]--gt [a]

                                                                                  Art(Art(singularsingular)) --gt [the]--gt [the]

                                                                                  Art(Art(pluralplural)) --gt [the]--gt [the]

                                                                                  N(N(singularsingular)) --gt [turtle]--gt [turtle]

                                                                                  N(N(pluralplural)) --gt [turtles]--gt [turtles]

                                                                                  VI(VI(singularsingular)) --gt [sleeps]--gt [sleeps]

                                                                                  VI(VI(pluralplural)) --gt [sleep]--gt [sleep]

                                                                                  Number Agreement

                                                                                  DCGs

                                                                                  Non-terminals may have argumentsbull Variables (start with capital)

                                                                                  Eg Number Any

                                                                                  bull Constants (start with lower case) Eg singular plural

                                                                                  bull Structured terms (start with lower case and take arguments themselves) Eg vp(VNP)

                                                                                  Parsing needs to be adapted bull Using unification

                                                                                  Unification in a nutshell (cf AI course)

                                                                                  Substitutions

                                                                                  Eg Num singular T vp(VNP)

                                                                                  Applying substitution bull Simultaneously replace variables by

                                                                                  corresponding termsbull S(Num) Num singular = S(singular)

                                                                                  Unification

                                                                                  Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

                                                                                  Gives Num singular

                                                                                  bull Art(singular) and Art(plural) Fails

                                                                                  bull Art(Num1) and Art(Num2) Num1 Num2

                                                                                  bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

                                                                                  Parsing with DCGs

                                                                                  Now require successful unification at each step

                                                                                  S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

                                                                                  S-gt a turtle sleep fails

                                                                                  Case Marking

                                                                                  PN(singularnominative)PN(singularnominative) --gt --gt [he][she][he][she]

                                                                                  PN(singularaccusative)PN(singularaccusative) --gt --gt [him][her][him][her]

                                                                                  PN(pluralnominative)PN(pluralnominative) --gt --gt [they][they]

                                                                                  PN(pluralaccusative)PN(pluralaccusative) --gt --gt [them][them]

                                                                                  S --gt NP(Numbernominative) NP(Number)S --gt NP(Numbernominative) NP(Number)

                                                                                  VP(Number) --gt V(Number) VP(Anyaccusative)VP(Number) --gt V(Number) VP(Anyaccusative)

                                                                                  VP(NumberCase) --gt PN(NumberCase)VP(NumberCase) --gt PN(NumberCase)

                                                                                  VP(NumberAny) --gt Det N(Number)VP(NumberAny) --gt Det N(Number)

                                                                                  He sees her She sees him They see her

                                                                                  But not Them see he

                                                                                  DCGs

                                                                                  Are strictly more expressive than CFGs Can represent for instance

                                                                                  bull S(N) -gt A(N) B(N) C(N)bull A(0) -gt [] bull B(0) -gt []bull C(0) -gt []bull A(s(N)) -gt A(N) [A]bull B(s(N)) -gt B(N) [B]bull C(s(N)) -gt C(N) [C]

                                                                                  Probabilistic Models

                                                                                  Traditional grammar models are very rigid bull essentially a yes no decision

                                                                                  Probabilistic grammarsbull Define a probability models for the databull Compute the probability of each alternativebull Choose the most likely alternative

                                                                                  Ilustrate on bull Shannon Gamebull Spelling correctionbull Parsing

                                                                                  Illustration

                                                                                  Wall Street Journal Corpus 3 000 000 words Correct parse tree for sentences known

                                                                                  bull Constructed by handbull Can be used to derive stochastic context free

                                                                                  grammarsbull SCFG assign probability to parse trees

                                                                                  Compute the most probable parse tree

                                                                                  Sequences are omni-present

                                                                                  Therefore the techniques we will see also apply tobull Bioinformatics

                                                                                  DNA proteins mRNA hellip can all be represented as strings

                                                                                  bullRobotics Sequences of actions states hellip

                                                                                  bullhellip

                                                                                  Rest of the Course

                                                                                  Limitations traditional grammar models motivate probabilistic extensions bull Regular grammars and Finite State Automata

                                                                                  All use principles of Part I on Graphical Models Markov Models using n-gramms (Hidden) Markov Models Conditional Random Fields

                                                                                  bull As an example of using undirected graphical models

                                                                                  bull Probabilistic Context Free Grammarsbull Probabilistic Definite Clause Grammars

                                                                                  • Advanced Artificial Intelligence
                                                                                  • Topic
                                                                                  • Contents
                                                                                  • Rationalism versus Empiricism
                                                                                  • Slide 5
                                                                                  • This course
                                                                                  • Ambiguity
                                                                                  • NLP and Statistics
                                                                                  • Slide 9
                                                                                  • Corpora
                                                                                  • Word Counts
                                                                                  • Slide 12
                                                                                  • Word Counts (Brown corpus)
                                                                                  • Slide 14
                                                                                  • Zipflsquos Law
                                                                                  • Language and sequences
                                                                                  • Key NLP Problem Ambiguity
                                                                                  • Language Model
                                                                                  • Example of bad language model
                                                                                  • A bad language model
                                                                                  • Slide 22
                                                                                  • A good language model
                                                                                  • Why language models
                                                                                  • Applications
                                                                                  • Spelling errors
                                                                                  • Handwriting recognition
                                                                                  • For Spell Checkers
                                                                                  • Another dimension in language models
                                                                                  • Sequence Tagging
                                                                                  • Slide 31
                                                                                  • Parsing
                                                                                  • Slide 33
                                                                                  • Language models based on Grammars
                                                                                  • Grammars and parsing
                                                                                  • Regular Grammars and Finite State Automata
                                                                                  • Finite State Automaton
                                                                                  • Phrase structure
                                                                                  • Notation
                                                                                  • Context Free Grammar
                                                                                  • Slide 41
                                                                                  • Top-down parsing
                                                                                  • Context-free grammar
                                                                                  • Parse tree
                                                                                  • Definite Clause Grammars Non-terminals may have arguments
                                                                                  • DCGs
                                                                                  • Unification in a nutshell (cf AI course)
                                                                                  • Unification
                                                                                  • Parsing with DCGs
                                                                                  • Case Marking
                                                                                  • Slide 51
                                                                                  • Probabilistic Models
                                                                                  • Illustration
                                                                                  • PowerPoint Presentation
                                                                                  • Sequences are omni-present
                                                                                  • Rest of the Course

Context-free grammar

S   --> NP VP
NP  --> PN              (proper noun)
NP  --> Art Adj N
NP  --> Art N
VP  --> VI              (intransitive verb)
VP  --> VT NP           (transitive verb)

Art --> [the]
Adj --> [lazy]
Adj --> [rapid]
PN  --> [achilles]
N   --> [turtle]
VI  --> [sleeps]
VT  --> [beats]
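The toy grammar above can be tried mechanically. Below is a minimal sketch of a top-down (recursive-descent) recognizer for exactly these rules; the dictionary encoding and function names are my own, not from the slides:

```python
# Recursive-descent recognizer for the toy grammar (illustrative sketch).
# Non-terminals are dictionary keys; anything else is a terminal word.
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["PN"], ["Art", "Adj", "N"], ["Art", "N"]],
    "VP":  [["VI"], ["VT", "NP"]],
    "Art": [["the"]],
    "Adj": [["lazy"], ["rapid"]],
    "PN":  [["achilles"]],
    "N":   [["turtle"]],
    "VI":  [["sleeps"]],
    "VT":  [["beats"]],
}

def parse(symbol, words, pos):
    """Yield every position reachable after deriving `symbol` from words[pos:]."""
    if symbol not in GRAMMAR:                   # terminal: must match the next word
        if pos < len(words) and words[pos] == symbol:
            yield pos + 1
        return
    for rhs in GRAMMAR[symbol]:                 # try each production in turn
        positions = [pos]
        for sym in rhs:
            positions = [q for p in positions for q in parse(sym, words, p)]
        yield from positions

def accepts(sentence):
    words = sentence.split()
    return any(p == len(words) for p in parse("S", words, 0))
```

With this sketch, `accepts("the rapid turtle beats achilles")` succeeds, while `accepts("the turtle sleep")` fails, matching the yes/no behaviour of the grammar.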

Parse tree

Parse tree for "the rapid turtle beats achilles":

S
├─ NP
│  ├─ Art: the
│  ├─ Adj: rapid
│  └─ N:   turtle
└─ VP
   ├─ VT: beats
   └─ NP
      └─ PN: achilles

Definite Clause Grammars: non-terminals may have arguments

S     --> NP(N) VP(N)
NP(N) --> Art(N) N(N)
VP(N) --> VI(N)

Art(singular) --> [a]
Art(singular) --> [the]
Art(plural)   --> [the]
N(singular)   --> [turtle]
N(plural)     --> [turtles]
VI(singular)  --> [sleeps]
VI(plural)    --> [sleep]

Number agreement

DCGs

Non-terminals may have arguments:
bull Variables (start with a capital), e.g. Number, Any
bull Constants (start with lower case), e.g. singular, plural
bull Structured terms (start with lower case and take arguments themselves), e.g. vp(V, NP)

Parsing needs to be adapted:
bull Using unification

Unification in a nutshell (cf. AI course)

Substitutions, e.g. {Num / singular}, {T / vp(V, NP)}

Applying a substitution:
bull Simultaneously replace variables by the corresponding terms
bull S(Num) {Num / singular} = S(singular)

Unification

Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.:
bull Art(singular) and Art(Num): gives {Num / singular}
bull Art(singular) and Art(plural): fails
bull Art(Num1) and Art(Num2): {Num1 / Num2}
bull PN(Num, accusative) and PN(singular, Case): {Num / singular, Case / accusative}
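The examples above can be reproduced with a small unification routine. The following sketch uses my own conventions (not from the slides): a variable is a string starting with an upper-case letter, and a compound term is a tuple whose first element is the functor:

```python
# Unification sketch: variables are capitalised strings,
# compound terms are ("functor", arg1, arg2, ...) tuples.
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings already in the substitution."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(a, b, subst=None):
    """Return the most general unifier of a and b, or None on failure."""
    subst = dict(subst or {})
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):                       # bind the variable
        subst[a] = b
        return subst
    if is_var(b):
        subst[b] = a
        return subst
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):          # unify argument by argument
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None                         # constant clash, e.g. singular vs plural
```

For instance, `unify(("art", "singular"), ("art", "Num"))` yields `{"Num": "singular"}`, while `unify(("art", "singular"), ("art", "plural"))` fails with `None`, mirroring the slide's examples.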

Parsing with DCGs

Now require successful unification at each step:

S --> NP(N) VP(N)
  --> Art(N) N(N) VP(N)          {N / singular}
  --> a N(singular) VP(singular)
  --> a turtle VP(singular)
  --> a turtle sleeps

S --> a turtle sleep             fails

Case Marking

PN(singular, nominative) --> [he] ; [she]
PN(singular, accusative) --> [him] ; [her]
PN(plural, nominative)   --> [they]
PN(plural, accusative)   --> [them]

S                --> NP(Number, nominative) VP(Number)
VP(Number)       --> V(Number) NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any)  --> Det N(Number)

Accepts "He sees her", "She sees him", "They see her",
but not "Them see he".

DCGs

Are strictly more expressive than CFGs. They can represent, for instance, the non-context-free language a^n b^n c^n:

bull S(N)    --> A(N) B(N) C(N)
bull A(0)    --> []
bull B(0)    --> []
bull C(0)    --> []
bull A(s(N)) --> A(N) [a]
bull B(s(N)) --> B(N) [b]
bull C(s(N)) --> C(N) [c]
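The DCG counts in unary with the term s(s(...s(0)...)): each application of A(s(N)) consumes one "a", and the shared argument N forces all three counts to agree. A direct check of the resulting language, as an illustrative sketch:

```python
# The language generated by the counting DCG above is
# { a^n b^n c^n : n >= 0 }, which no CFG can generate.
def in_anbncn(s):
    """Check membership in a^n b^n c^n directly."""
    if len(s) % 3 != 0:
        return False
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n
```

So `in_anbncn("aabbcc")` holds, while strings with mismatched counts such as `"aabbc"` are rejected, just as the DCG's unification of the shared count would fail.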

Probabilistic Models

Traditional grammar models are very rigid:
bull essentially a yes/no decision

Probabilistic grammars:
bull define a probability model for the data
bull compute the probability of each alternative
bull choose the most likely alternative

Illustrated on:
bull the Shannon game
bull spelling correction
bull parsing

Illustration

Wall Street Journal Corpus: 3,000,000 words; the correct parse tree for each sentence is known:
bull constructed by hand
bull can be used to derive stochastic context-free grammars (SCFGs)
bull an SCFG assigns a probability to each parse tree

Compute the most probable parse tree.
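Under an SCFG, each rule carries a probability, and the probability of a parse tree is the product of the probabilities of the rules used in it. A minimal sketch with hypothetical rule probabilities (not estimated from the WSJ corpus; tree encoding and names are my own):

```python
from math import prod

# Hypothetical SCFG rule probabilities, keyed by (lhs, rhs):
RULE_PROB = {
    ("S",  ("NP", "VP")): 1.0,
    ("NP", ("Art", "N")): 0.6,
    ("NP", ("PN",)):      0.4,
    ("VP", ("VI",)):      1.0,
}

def tree_prob(tree):
    """P(tree) = product of the probabilities of all rules used in it.
    A tree is (label, child, ...) with plain-string leaves as words."""
    label, *children = tree
    if all(isinstance(c, str) for c in children):   # preterminal -> word
        return 1.0                                  # lexical probabilities omitted
    rhs = tuple(c[0] for c in children)
    return RULE_PROB[(label, rhs)] * prod(tree_prob(c) for c in children)

t = ("S", ("NP", ("Art", "the"), ("N", "turtle")), ("VP", ("VI", "sleeps")))
```

Here `tree_prob(t)` multiplies 1.0 (S --> NP VP), 0.6 (NP --> Art N) and 1.0 (VP --> VI), giving 0.6; finding the most probable parse means maximising this product over all trees that yield the sentence.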

Sequences are omni-present

Therefore the techniques we will see also apply to:
bull Bioinformatics: DNA, proteins, mRNA, ... can all be represented as strings
bull Robotics: sequences of actions, states, ...
bull ...

Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
bull Regular grammars and finite state automata: Markov models using n-grams, (hidden) Markov models, and conditional random fields (as an example of using undirected graphical models)
bull Probabilistic context-free grammars
bull Probabilistic definite clause grammars

All use the principles of Part I on graphical models.


                                                                                      Parse tree

                                                                                      SS

                                                                                      NPNP VPVP

                                                                                      ArtArt AdjAdj NN VtVt NPNP

                                                                                      PNPN

                                                                                      achillesachillesbeatsbeatsturtleturtlerapidrapidthethe

                                                                                      Definite Clause GrammarsNon-terminals may have arguments

                                                                                      SS --gt --gt NPNP((NN))VPVP((NN))

                                                                                      NP(NP(NN)) --gt Art(--gt Art(NN)N()N(NN))

                                                                                      VP(VP(NN)) --gt VI(--gt VI(NN))

                                                                                      Art(Art(singularsingular)) --gt [a]--gt [a]

                                                                                      Art(Art(singularsingular)) --gt [the]--gt [the]

                                                                                      Art(Art(pluralplural)) --gt [the]--gt [the]

                                                                                      N(N(singularsingular)) --gt [turtle]--gt [turtle]

                                                                                      N(N(pluralplural)) --gt [turtles]--gt [turtles]

                                                                                      VI(VI(singularsingular)) --gt [sleeps]--gt [sleeps]

                                                                                      VI(VI(pluralplural)) --gt [sleep]--gt [sleep]

                                                                                      Number Agreement

                                                                                      DCGs

                                                                                      Non-terminals may have argumentsbull Variables (start with capital)

                                                                                      Eg Number Any

                                                                                      bull Constants (start with lower case) Eg singular plural

                                                                                      bull Structured terms (start with lower case and take arguments themselves) Eg vp(VNP)

                                                                                      Parsing needs to be adapted bull Using unification

                                                                                      Unification in a nutshell (cf AI course)

                                                                                      Substitutions

                                                                                      Eg Num singular T vp(VNP)

                                                                                      Applying substitution bull Simultaneously replace variables by

                                                                                      corresponding termsbull S(Num) Num singular = S(singular)

                                                                                      Unification

                                                                                      Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

                                                                                      Gives Num singular

                                                                                      bull Art(singular) and Art(plural) Fails

                                                                                      bull Art(Num1) and Art(Num2) Num1 Num2

                                                                                      bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

                                                                                      Parsing with DCGs

                                                                                      Now require successful unification at each step

                                                                                      S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

                                                                                      S-gt a turtle sleep fails

                                                                                      Case Marking

                                                                                      PN(singularnominative)PN(singularnominative) --gt --gt [he][she][he][she]

                                                                                      PN(singularaccusative)PN(singularaccusative) --gt --gt [him][her][him][her]

                                                                                      PN(pluralnominative)PN(pluralnominative) --gt --gt [they][they]

                                                                                      PN(pluralaccusative)PN(pluralaccusative) --gt --gt [them][them]

                                                                                      S --gt NP(Numbernominative) NP(Number)S --gt NP(Numbernominative) NP(Number)

                                                                                      VP(Number) --gt V(Number) VP(Anyaccusative)VP(Number) --gt V(Number) VP(Anyaccusative)

                                                                                      VP(NumberCase) --gt PN(NumberCase)VP(NumberCase) --gt PN(NumberCase)

                                                                                      VP(NumberAny) --gt Det N(Number)VP(NumberAny) --gt Det N(Number)

                                                                                      He sees her She sees him They see her

                                                                                      But not Them see he

                                                                                      DCGs

                                                                                      Are strictly more expressive than CFGs Can represent for instance

                                                                                      bull S(N) -gt A(N) B(N) C(N)bull A(0) -gt [] bull B(0) -gt []bull C(0) -gt []bull A(s(N)) -gt A(N) [A]bull B(s(N)) -gt B(N) [B]bull C(s(N)) -gt C(N) [C]

                                                                                      Probabilistic Models

                                                                                      Traditional grammar models are very rigid bull essentially a yes no decision

                                                                                      Probabilistic grammarsbull Define a probability models for the databull Compute the probability of each alternativebull Choose the most likely alternative

                                                                                      Ilustrate on bull Shannon Gamebull Spelling correctionbull Parsing

                                                                                      Illustration

                                                                                      Wall Street Journal Corpus 3 000 000 words Correct parse tree for sentences known

                                                                                      bull Constructed by handbull Can be used to derive stochastic context free

                                                                                      grammarsbull SCFG assign probability to parse trees

                                                                                      Compute the most probable parse tree

                                                                                      Sequences are omni-present

                                                                                      Therefore the techniques we will see also apply tobull Bioinformatics

                                                                                      DNA proteins mRNA hellip can all be represented as strings

                                                                                      bullRobotics Sequences of actions states hellip

                                                                                      bullhellip

                                                                                      Rest of the Course

                                                                                      Limitations traditional grammar models motivate probabilistic extensions bull Regular grammars and Finite State Automata

                                                                                      All use principles of Part I on Graphical Models Markov Models using n-gramms (Hidden) Markov Models Conditional Random Fields

                                                                                      bull As an example of using undirected graphical models

                                                                                      bull Probabilistic Context Free Grammarsbull Probabilistic Definite Clause Grammars

                                                                                      • Advanced Artificial Intelligence
                                                                                      • Topic
                                                                                      • Contents
                                                                                      • Rationalism versus Empiricism
                                                                                      • Slide 5
                                                                                      • This course
                                                                                      • Ambiguity
                                                                                      • NLP and Statistics
                                                                                      • Slide 9
                                                                                      • Corpora
                                                                                      • Word Counts
                                                                                      • Slide 12
                                                                                      • Word Counts (Brown corpus)
                                                                                      • Slide 14
                                                                                      • Zipflsquos Law
                                                                                      • Language and sequences
                                                                                      • Key NLP Problem Ambiguity
                                                                                      • Language Model
                                                                                      • Example of bad language model
                                                                                      • A bad language model
                                                                                      • Slide 22
                                                                                      • A good language model
                                                                                      • Why language models
                                                                                      • Applications
                                                                                      • Spelling errors
                                                                                      • Handwriting recognition
                                                                                      • For Spell Checkers
                                                                                      • Another dimension in language models
                                                                                      • Sequence Tagging
                                                                                      • Slide 31
                                                                                      • Parsing
                                                                                      • Slide 33
                                                                                      • Language models based on Grammars
                                                                                      • Grammars and parsing
                                                                                      • Regular Grammars and Finite State Automata
                                                                                      • Finite State Automaton
                                                                                      • Phrase structure
                                                                                      • Notation
                                                                                      • Context Free Grammar
                                                                                      • Slide 41
                                                                                      • Top-down parsing
                                                                                      • Context-free grammar
                                                                                      • Parse tree
                                                                                      • Definite Clause Grammars Non-terminals may have arguments
                                                                                      • DCGs
                                                                                      • Unification in a nutshell (cf AI course)
                                                                                      • Unification
                                                                                      • Parsing with DCGs
                                                                                      • Case Marking
                                                                                      • Slide 51
                                                                                      • Probabilistic Models
                                                                                      • Illustration
                                                                                      • PowerPoint Presentation
                                                                                      • Sequences are omni-present
                                                                                      • Rest of the Course

                                                                                        Definite Clause Grammars: Non-terminals may have arguments

                                                                                        S --> NP(N) VP(N)
                                                                                        NP(N) --> Art(N) N(N)
                                                                                        VP(N) --> VI(N)
                                                                                        Art(singular) --> [a]
                                                                                        Art(singular) --> [the]
                                                                                        Art(plural) --> [the]
                                                                                        N(singular) --> [turtle]
                                                                                        N(plural) --> [turtles]
                                                                                        VI(singular) --> [sleeps]
                                                                                        VI(plural) --> [sleep]

                                                                                        Number Agreement

                                                                                        DCGs

                                                                                        Non-terminals may have arguments:

                                                                                        • Variables (start with a capital), e.g. Number, Any

                                                                                        • Constants (start with a lower-case letter), e.g. singular, plural

                                                                                        • Structured terms (start with a lower-case letter and take arguments themselves), e.g. vp(V, NP)

                                                                                        Parsing needs to be adapted:
                                                                                        • Using unification

                                                                                        Unification in a nutshell (cf. AI course)

                                                                                        Substitutions, e.g. {Num / singular}, {T / vp(V, NP)}

                                                                                        Applying a substitution:
                                                                                        • Simultaneously replace variables by the corresponding terms
                                                                                        • S(Num) {Num / singular} = S(singular)

                                                                                        Unification

                                                                                        Take two non-terminals with arguments and compute the (most general) substitution that makes them identical, e.g.:

                                                                                        • Art(singular) and Art(Num): gives {Num / singular}

                                                                                        • Art(singular) and Art(plural): fails

                                                                                        • Art(Num1) and Art(Num2): {Num1 / Num2}

                                                                                        • PN(Num, accusative) and PN(singular, Case): {Num / singular, Case / accusative}
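The unification step the slides describe can be sketched in Python. This is a minimal illustration, not the course's implementation: the term representation (capitalised strings as variables, lower-case strings as constants, `(functor, args)` tuples as compound non-terminals) is invented here, and the occurs check is omitted for brevity.

```python
def is_var(t):
    # Convention assumed here: variables are capitalised strings.
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    # Follow variable bindings until a non-variable or an unbound variable.
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(t1, t2, subst=None):
    """Return the most general unifier of t1 and t2, or None on failure."""
    if subst is None:
        subst = {}
    t1, t2 = walk(t1, subst), walk(t2, subst)
    if t1 == t2:
        return subst
    if is_var(t1):
        return {**subst, t1: t2}
    if is_var(t2):
        return {**subst, t2: t1}
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and t1[0] == t2[0] and len(t1[1]) == len(t2[1])):
        for a, b in zip(t1[1], t2[1]):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None  # constant clash, e.g. singular vs plural

# The four examples from the slide:
print(unify(("Art", ["singular"]), ("Art", ["Num"])))      # {'Num': 'singular'}
print(unify(("Art", ["singular"]), ("Art", ["plural"])))   # None (fails)
print(unify(("Art", ["Num1"]), ("Art", ["Num2"])))         # {'Num1': 'Num2'}
print(unify(("PN", ["Num", "accusative"]),
            ("PN", ["singular", "Case"])))                 # {'Num': 'singular', 'Case': 'accusative'}
```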

                                                                                        Parsing with DCGs

                                                                                        Now require successful unification at each step:

                                                                                        S -> NP(N) VP(N)
                                                                                          -> Art(N) N(N) VP(N)        {N / singular}
                                                                                          -> a N(singular) VP(singular)
                                                                                          -> a turtle VP(singular)
                                                                                          -> a turtle sleeps

                                                                                        "S -> a turtle sleep" fails.
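For illustration, the turtle grammar with number agreement can be hand-coded as a tiny Python recogniser. This is a sketch, not a general DCG parser: where the DCG threads the Number argument by unification, the code below simply lets the noun fix the number and then checks the article and verb against it.

```python
# Lexicon for the slide's grammar; "the" is ambiguous between numbers.
ART = {"a": {"singular"}, "the": {"singular", "plural"}}
NOUN = {"turtle": "singular", "turtles": "plural"}
VERB = {"sleeps": "singular", "sleep": "plural"}

def parse_s(words):
    """Recognise  S --> NP(N) VP(N)  for a three-word Art-N-VI sentence."""
    if len(words) != 3:
        return False
    art, noun, verb = words
    if art not in ART or noun not in NOUN or verb not in VERB:
        return False
    n = NOUN[noun]                             # the noun fixes N
    return n in ART[art] and VERB[verb] == n   # agreement: Art(N), VI(N)

print(parse_s("a turtle sleeps".split()))    # True
print(parse_s("a turtle sleep".split()))     # False (number clash)
print(parse_s("the turtles sleep".split()))  # True
print(parse_s("a turtles sleep".split()))    # False ("a" is singular only)
```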

                                                                                        Case Marking

                                                                                        PN(singular, nominative) --> [he]; [she]

                                                                                        PN(singular, accusative) --> [him]; [her]

                                                                                        PN(plural, nominative) --> [they]

                                                                                        PN(plural, accusative) --> [them]

                                                                                        S --> NP(Number, nominative) VP(Number)

                                                                                        VP(Number) --> V(Number) NP(Any, accusative)

                                                                                        NP(Number, Case) --> PN(Number, Case)

                                                                                        NP(Number, Any) --> Det N(Number)

                                                                                        Accepts: He sees her. She sees him. They see her.

                                                                                        But not: Them see he.
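The case-marking grammar can be sketched the same way. Again this is an illustrative Python rendering, not the course's code: pronouns carry a (number, case) pair, the subject NP must be nominative and agree in number with the verb, and the object NP must be accusative.

```python
# Pronoun lexicon: word -> (number, case), as in the slide's PN rules.
PN = {"he": ("singular", "nominative"), "she": ("singular", "nominative"),
      "him": ("singular", "accusative"), "her": ("singular", "accusative"),
      "they": ("plural", "nominative"), "them": ("plural", "accusative")}
V = {"sees": "singular", "see": "plural"}

def parse_s(words):
    """S --> NP(Number, nominative) VP(Number); VP --> V NP(_, accusative)."""
    if (len(words) != 3 or words[1] not in V
            or any(w not in PN for w in (words[0], words[2]))):
        return False
    subj_num, subj_case = PN[words[0]]
    _, obj_case = PN[words[2]]
    return (subj_case == "nominative" and obj_case == "accusative"
            and V[words[1]] == subj_num)

print(parse_s("he sees her".split()))   # True
print(parse_s("they see her".split()))  # True
print(parse_s("them see he".split()))   # False (case violations)
```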

                                                                                        DCGs

                                                                                        Are strictly more expressive than CFGs. Can represent, for instance, the language a^n b^n c^n:

                                                                                        • S(N) --> A(N) B(N) C(N)
                                                                                        • A(0) --> []
                                                                                        • B(0) --> []
                                                                                        • C(0) --> []
                                                                                        • A(s(N)) --> A(N) [a]
                                                                                        • B(s(N)) --> B(N) [b]
                                                                                        • C(s(N)) --> C(N) [c]
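The DCG above counts with the term s(s(...s(0))); the language a^n b^n c^n it generates is a standard example of a language no context-free grammar can produce. A direct Python recogniser (with an ordinary integer in place of the counter term) makes the membership test concrete:

```python
def recognise(s):
    """True iff s is a^n b^n c^n for some n >= 0."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n

print(recognise(""))        # True  (n = 0)
print(recognise("aabbcc"))  # True  (n = 2)
print(recognise("aabbc"))   # False (counts differ)
print(recognise("abcabc"))  # False (wrong order)
```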

                                                                                        Probabilistic Models

                                                                                        Traditional grammar models are very rigid:
                                                                                        • essentially a yes/no decision

                                                                                        Probabilistic grammars:
                                                                                        • Define a probability model for the data
                                                                                        • Compute the probability of each alternative
                                                                                        • Choose the most likely alternative

                                                                                        Illustrated on:
                                                                                        • Shannon Game
                                                                                        • Spelling correction
                                                                                        • Parsing

                                                                                        Illustration

                                                                                        Wall Street Journal Corpus: 3,000,000 words. The correct parse tree for each sentence is known:

                                                                                        • Constructed by hand
                                                                                        • Can be used to derive stochastic context-free grammars (SCFGs)
                                                                                        • SCFGs assign a probability to each parse tree

                                                                                        Compute the most probable parse tree.
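To make the idea concrete: in an SCFG, a parse tree's probability is the product of the probabilities of the rules used to build it. The sketch below uses an invented toy grammar and invented rule probabilities purely for illustration (lexical rules are given probability 1 for brevity).

```python
# Hypothetical rule probabilities: (lhs, rhs-labels) -> probability.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("Art", "N")): 0.6,
    ("NP", ("PN",)): 0.4,
    ("VP", ("VI",)): 1.0,
}

def tree_prob(tree):
    """tree = (label, [children]) at internal nodes, a word (str) at leaves."""
    if isinstance(tree, str):      # terminal leaf: contributes factor 1
        return 1.0
    label, children = tree
    key = (label, tuple(c if isinstance(c, str) else c[0] for c in children))
    p = RULE_PROB.get(key, 1.0)    # unlisted (lexical) rules count as 1
    for c in children:
        p *= tree_prob(c)
    return p

t = ("S", [("NP", [("Art", ["a"]), ("N", ["turtle"])]),
           ("VP", [("VI", ["sleeps"])])])
print(tree_prob(t))   # 1.0 * 0.6 * 1.0 = 0.6
```

Choosing the most probable parse then just means computing this product for each candidate tree and taking the argmax.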

                                                                                        Sequences are omni-present

                                                                                        Therefore the techniques we will see also apply to:

                                                                                        • Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings

                                                                                        • Robotics: sequences of actions, states, …

                                                                                        • …

                                                                                        Rest of the Course

                                                                                        Limitations of traditional grammar models motivate probabilistic extensions:

                                                                                        • Regular grammars and Finite State Automata: Markov Models using n-grams, (Hidden) Markov Models, Conditional Random Fields (as an example of using undirected graphical models)

                                                                                        • Probabilistic Context-Free Grammars

                                                                                        • Probabilistic Definite Clause Grammars

                                                                                        All use principles of Part I on Graphical Models.


                                                                                          DCGs

                                                                                          Non-terminals may have argumentsbull Variables (start with capital)

                                                                                          Eg Number Any

                                                                                          bull Constants (start with lower case) Eg singular plural

                                                                                          bull Structured terms (start with lower case and take arguments themselves) Eg vp(VNP)

                                                                                          Parsing needs to be adapted bull Using unification

                                                                                          Unification in a nutshell (cf AI course)

                                                                                          Substitutions

                                                                                          Eg Num singular T vp(VNP)

                                                                                          Applying substitution bull Simultaneously replace variables by

                                                                                          corresponding termsbull S(Num) Num singular = S(singular)

                                                                                          Unification

                                                                                          Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

                                                                                          Gives Num singular

                                                                                          bull Art(singular) and Art(plural) Fails

                                                                                          bull Art(Num1) and Art(Num2) Num1 Num2

                                                                                          bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

                                                                                          Parsing with DCGs

                                                                                          Now require successful unification at each step

                                                                                          S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

                                                                                          S-gt a turtle sleep fails

                                                                                          Case Marking

                                                                                          PN(singularnominative)PN(singularnominative) --gt --gt [he][she][he][she]

                                                                                          PN(singularaccusative)PN(singularaccusative) --gt --gt [him][her][him][her]

                                                                                          PN(pluralnominative)PN(pluralnominative) --gt --gt [they][they]

                                                                                          PN(pluralaccusative)PN(pluralaccusative) --gt --gt [them][them]

                                                                                          S --gt NP(Numbernominative) NP(Number)S --gt NP(Numbernominative) NP(Number)

                                                                                          VP(Number) --gt V(Number) VP(Anyaccusative)VP(Number) --gt V(Number) VP(Anyaccusative)

                                                                                          VP(NumberCase) --gt PN(NumberCase)VP(NumberCase) --gt PN(NumberCase)

                                                                                          VP(NumberAny) --gt Det N(Number)VP(NumberAny) --gt Det N(Number)

                                                                                          He sees her She sees him They see her

                                                                                          But not Them see he

                                                                                          DCGs

                                                                                          Are strictly more expressive than CFGs Can represent for instance

                                                                                          bull S(N) -gt A(N) B(N) C(N)bull A(0) -gt [] bull B(0) -gt []bull C(0) -gt []bull A(s(N)) -gt A(N) [A]bull B(s(N)) -gt B(N) [B]bull C(s(N)) -gt C(N) [C]

                                                                                          Probabilistic Models

                                                                                          Traditional grammar models are very rigid bull essentially a yes no decision

                                                                                          Probabilistic grammarsbull Define a probability models for the databull Compute the probability of each alternativebull Choose the most likely alternative

                                                                                          Ilustrate on bull Shannon Gamebull Spelling correctionbull Parsing

                                                                                          Illustration

                                                                                          Wall Street Journal Corpus 3 000 000 words Correct parse tree for sentences known

                                                                                          bull Constructed by handbull Can be used to derive stochastic context free

                                                                                          grammarsbull SCFG assign probability to parse trees

                                                                                          Compute the most probable parse tree

                                                                                          Sequences are omni-present

                                                                                          Therefore the techniques we will see also apply tobull Bioinformatics

                                                                                          DNA proteins mRNA hellip can all be represented as strings

                                                                                          bullRobotics Sequences of actions states hellip

                                                                                          bullhellip

                                                                                          Rest of the Course

                                                                                          Limitations traditional grammar models motivate probabilistic extensions bull Regular grammars and Finite State Automata

                                                                                          All use principles of Part I on Graphical Models Markov Models using n-gramms (Hidden) Markov Models Conditional Random Fields

                                                                                          bull As an example of using undirected graphical models

                                                                                          bull Probabilistic Context Free Grammarsbull Probabilistic Definite Clause Grammars

                                                                                          • Advanced Artificial Intelligence
                                                                                          • Topic
                                                                                          • Contents
                                                                                          • Rationalism versus Empiricism
                                                                                          • Slide 5
                                                                                          • This course
                                                                                          • Ambiguity
                                                                                          • NLP and Statistics
                                                                                          • Slide 9
                                                                                          • Corpora
                                                                                          • Word Counts
                                                                                          • Slide 12
                                                                                          • Word Counts (Brown corpus)
                                                                                          • Slide 14
                                                                                          • Zipflsquos Law
                                                                                          • Language and sequences
                                                                                          • Key NLP Problem Ambiguity
                                                                                          • Language Model
                                                                                          • Example of bad language model
                                                                                          • A bad language model
                                                                                          • Slide 22
                                                                                          • A good language model
                                                                                          • Why language models
                                                                                          • Applications
                                                                                          • Spelling errors
                                                                                          • Handwriting recognition
                                                                                          • For Spell Checkers
                                                                                          • Another dimension in language models
                                                                                          • Sequence Tagging
                                                                                          • Slide 31
                                                                                          • Parsing
                                                                                          • Slide 33
                                                                                          • Language models based on Grammars
                                                                                          • Grammars and parsing
                                                                                          • Regular Grammars and Finite State Automata
                                                                                          • Finite State Automaton
                                                                                          • Phrase structure
                                                                                          • Notation
                                                                                          • Context Free Grammar
                                                                                          • Slide 41
                                                                                          • Top-down parsing
                                                                                          • Context-free grammar
                                                                                          • Parse tree
                                                                                          • Definite Clause Grammars Non-terminals may have arguments
                                                                                          • DCGs
                                                                                          • Unification in a nutshell (cf AI course)
                                                                                          • Unification
                                                                                          • Parsing with DCGs
                                                                                          • Case Marking
                                                                                          • Slide 51
                                                                                          • Probabilistic Models
                                                                                          • Illustration
                                                                                          • PowerPoint Presentation
                                                                                          • Sequences are omni-present
                                                                                          • Rest of the Course

                                                                                            Unification in a nutshell (cf AI course)

                                                                                            Substitutions

                                                                                            Eg Num singular T vp(VNP)

                                                                                            Applying substitution bull Simultaneously replace variables by

                                                                                            corresponding termsbull S(Num) Num singular = S(singular)

                                                                                            Unification

                                                                                            Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

                                                                                            Gives Num singular

                                                                                            bull Art(singular) and Art(plural) Fails

                                                                                            bull Art(Num1) and Art(Num2) Num1 Num2

                                                                                            bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

                                                                                            Parsing with DCGs

                                                                                            Now require successful unification at each step

                                                                                            S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

                                                                                            S-gt a turtle sleep fails

Case Marking

PN(singular, nominative) --> [he] | [she]
PN(singular, accusative) --> [him] | [her]
PN(plural, nominative) --> [they]
PN(plural, accusative) --> [them]

S --> NP(Number, nominative) VP(Number)
VP(Number) --> V(Number) NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any) --> Det N(Number)

Accepts: He sees her. She sees him. They see her.

But not: *Them see he.
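The behaviour of the case-marking grammar can be sketched with a brute-force check in Python (the lexicon tables and the helper name `accepts` are invented for illustration; a real DCG would be run in Prolog):

```python
# Lexicon: pronouns carry number and case, verbs carry number.
PN = {("he", "singular", "nominative"), ("she", "singular", "nominative"),
      ("him", "singular", "accusative"), ("her", "singular", "accusative"),
      ("they", "plural", "nominative"), ("them", "plural", "accusative")}
V = {("sees", "singular"), ("see", "plural")}

def accepts(words):
    """True iff words = PN V PN with subject-verb agreement and correct case."""
    if len(words) != 3:
        return False
    subj, verb, obj = words
    obj_ok = any((obj, n, "accusative") in PN for n in ("singular", "plural"))
    subj_ok = any((subj, n, "nominative") in PN and (verb, n) in V
                  for n in ("singular", "plural"))
    return obj_ok and subj_ok

accepts("he sees her".split())   # True
accepts("them see he".split())   # False: wrong cases
```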

DCGs

Are strictly more expressive than CFGs. They can represent, for instance, the language a^n b^n c^n, which is not context-free:

• S(N) --> A(N) B(N) C(N)
• A(0) --> []
• B(0) --> []
• C(0) --> []
• A(s(N)) --> A(N) [a]
• B(s(N)) --> B(N) [b]
• C(s(N)) --> C(N) [c]
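The DCG counts with the s(N) argument so that all three blocks have the same length. A recognizer mirroring that count, as a minimal sketch:

```python
def anbncn(s):
    """Recognize the language a^n b^n c^n (n >= 0), which no CFG can generate."""
    n = len(s) // 3
    return len(s) == 3 * n and s == "a" * n + "b" * n + "c" * n

anbncn("aabbcc")  # True
anbncn("aabbc")   # False
```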

Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• Define a probability model for the data
• Compute the probability of each alternative
• Choose the most likely alternative

Illustrated on:
• Shannon game
• Spelling correction
• Parsing
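The three-step recipe can be sketched in a few lines (the candidate sentences and their probabilities are invented for illustration):

```python
# Score each alternative under the model, then pick the most likely one.
alternatives = {"a turtle sleeps": 0.8, "a turtle sleep": 0.001}
best = max(alternatives, key=alternatives.get)
best  # 'a turtle sleeps'
```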

Illustration

Wall Street Journal corpus: 3,000,000 words. The correct parse tree for each sentence is known:
• Constructed by hand
• Can be used to derive stochastic context-free grammars
• SCFGs assign a probability to each parse tree

Compute the most probable parse tree.
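How an SCFG scores a tree can be sketched as follows: the probability of a parse tree is the product of the probabilities of the rules used in it. The rule table below is invented for illustration, not derived from the WSJ corpus.

```python
# Hypothetical rule probabilities, keyed by (left-hand side, right-hand side).
rule_prob = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("a", "turtle")): 0.3,
    ("VP", ("sleeps",)): 0.2,
}

def tree_prob(tree):
    """tree = (label, children); children are subtrees or terminal strings."""
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = rule_prob[(label, rhs)]          # probability of the rule at this node
    for c in children:
        if not isinstance(c, str):
            p *= tree_prob(c)            # times the probabilities below it
    return p

t = ("S", (("NP", ("a", "turtle")), ("VP", ("sleeps",))))
tree_prob(t)  # 1.0 * 0.3 * 0.2 ≈ 0.06
```

Choosing the most probable parse tree then amounts to computing this score for every candidate tree and taking the maximum.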

Sequences are omnipresent

Therefore the techniques we will see also apply to:
• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

Rest of the Course

Limitations of traditional grammar models motivate probabilistic extensions:
• Regular grammars and finite state automata
• Markov models using n-grams
• (Hidden) Markov models
• Conditional random fields (as an example of using undirected graphical models)
• Probabilistic context-free grammars
• Probabilistic definite clause grammars

All use the principles of Part I on graphical models.

                                                                                            • Advanced Artificial Intelligence
                                                                                            • Topic
                                                                                            • Contents
                                                                                            • Rationalism versus Empiricism
                                                                                            • Slide 5
                                                                                            • This course
                                                                                            • Ambiguity
                                                                                            • NLP and Statistics
                                                                                            • Slide 9
                                                                                            • Corpora
                                                                                            • Word Counts
                                                                                            • Slide 12
                                                                                            • Word Counts (Brown corpus)
                                                                                            • Slide 14
                                                                                            • Zipf's Law
                                                                                            • Language and sequences
                                                                                            • Key NLP Problem Ambiguity
                                                                                            • Language Model
                                                                                            • Example of bad language model
                                                                                            • A bad language model
                                                                                            • Slide 22
                                                                                            • A good language model
                                                                                            • Why language models
                                                                                            • Applications
                                                                                            • Spelling errors
                                                                                            • Handwriting recognition
                                                                                            • For Spell Checkers
                                                                                            • Another dimension in language models
                                                                                            • Sequence Tagging
                                                                                            • Slide 31
                                                                                            • Parsing
                                                                                            • Slide 33
                                                                                            • Language models based on Grammars
                                                                                            • Grammars and parsing
                                                                                            • Regular Grammars and Finite State Automata
                                                                                            • Finite State Automaton
                                                                                            • Phrase structure
                                                                                            • Notation
                                                                                            • Context Free Grammar
                                                                                            • Slide 41
                                                                                            • Top-down parsing
                                                                                            • Context-free grammar
                                                                                            • Parse tree
                                                                                            • Definite Clause Grammars Non-terminals may have arguments
                                                                                            • DCGs
                                                                                            • Unification in a nutshell (cf AI course)
                                                                                            • Unification
                                                                                            • Parsing with DCGs
                                                                                            • Case Marking
                                                                                            • Slide 51
                                                                                            • Probabilistic Models
                                                                                            • Illustration
                                                                                            • PowerPoint Presentation
                                                                                            • Sequences are omni-present
                                                                                            • Rest of the Course

                                                                                              Unification

                                                                                              Take two non-terminals with arguments and compute (most general) substitution that makes them identical egbull Art(singular) and Art(Num)

                                                                                              Gives Num singular

                                                                                              bull Art(singular) and Art(plural) Fails

                                                                                              bull Art(Num1) and Art(Num2) Num1 Num2

                                                                                              bull PN(Num accusative) and PN(singular Case) Numsingular Caseaccusative

                                                                                              Parsing with DCGs

                                                                                              Now require successful unification at each step

                                                                                              S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

                                                                                              S-gt a turtle sleep fails

                                                                                              Case Marking

                                                                                              PN(singularnominative)PN(singularnominative) --gt --gt [he][she][he][she]

                                                                                              PN(singularaccusative)PN(singularaccusative) --gt --gt [him][her][him][her]

                                                                                              PN(pluralnominative)PN(pluralnominative) --gt --gt [they][they]

                                                                                              PN(pluralaccusative)PN(pluralaccusative) --gt --gt [them][them]

                                                                                              S --gt NP(Numbernominative) NP(Number)S --gt NP(Numbernominative) NP(Number)

                                                                                              VP(Number) --gt V(Number) VP(Anyaccusative)VP(Number) --gt V(Number) VP(Anyaccusative)

                                                                                              VP(NumberCase) --gt PN(NumberCase)VP(NumberCase) --gt PN(NumberCase)

                                                                                              VP(NumberAny) --gt Det N(Number)VP(NumberAny) --gt Det N(Number)

                                                                                              He sees her She sees him They see her

                                                                                              But not Them see he

                                                                                              DCGs

                                                                                              Are strictly more expressive than CFGs Can represent for instance

                                                                                              bull S(N) -gt A(N) B(N) C(N)bull A(0) -gt [] bull B(0) -gt []bull C(0) -gt []bull A(s(N)) -gt A(N) [A]bull B(s(N)) -gt B(N) [B]bull C(s(N)) -gt C(N) [C]

                                                                                              Probabilistic Models

                                                                                              Traditional grammar models are very rigid bull essentially a yes no decision

                                                                                              Probabilistic grammarsbull Define a probability models for the databull Compute the probability of each alternativebull Choose the most likely alternative

                                                                                              Ilustrate on bull Shannon Gamebull Spelling correctionbull Parsing

                                                                                              Illustration

                                                                                              Wall Street Journal Corpus 3 000 000 words Correct parse tree for sentences known

                                                                                              bull Constructed by handbull Can be used to derive stochastic context free

                                                                                              grammarsbull SCFG assign probability to parse trees

                                                                                              Compute the most probable parse tree

                                                                                              Sequences are omni-present

                                                                                              Therefore the techniques we will see also apply tobull Bioinformatics

                                                                                              DNA proteins mRNA hellip can all be represented as strings

                                                                                              bullRobotics Sequences of actions states hellip

                                                                                              bullhellip

                                                                                              Rest of the Course

                                                                                              Limitations traditional grammar models motivate probabilistic extensions bull Regular grammars and Finite State Automata

                                                                                              All use principles of Part I on Graphical Models Markov Models using n-gramms (Hidden) Markov Models Conditional Random Fields

                                                                                              bull As an example of using undirected graphical models

                                                                                              bull Probabilistic Context Free Grammarsbull Probabilistic Definite Clause Grammars

                                                                                              • Advanced Artificial Intelligence
                                                                                              • Topic
                                                                                              • Contents
                                                                                              • Rationalism versus Empiricism
                                                                                              • Slide 5
                                                                                              • This course
                                                                                              • Ambiguity
                                                                                              • NLP and Statistics
                                                                                              • Slide 9
                                                                                              • Corpora
                                                                                              • Word Counts
                                                                                              • Slide 12
                                                                                              • Word Counts (Brown corpus)
                                                                                              • Slide 14
                                                                                              • Zipflsquos Law
                                                                                              • Language and sequences
                                                                                              • Key NLP Problem Ambiguity
                                                                                              • Language Model
                                                                                              • Example of bad language model
                                                                                              • A bad language model
                                                                                              • Slide 22
                                                                                              • A good language model
                                                                                              • Why language models
                                                                                              • Applications
                                                                                              • Spelling errors
                                                                                              • Handwriting recognition
                                                                                              • For Spell Checkers
                                                                                              • Another dimension in language models
                                                                                              • Sequence Tagging
                                                                                              • Slide 31
                                                                                              • Parsing
                                                                                              • Slide 33
                                                                                              • Language models based on Grammars
                                                                                              • Grammars and parsing
                                                                                              • Regular Grammars and Finite State Automata
                                                                                              • Finite State Automaton
                                                                                              • Phrase structure
                                                                                              • Notation
                                                                                              • Context Free Grammar
                                                                                              • Slide 41
                                                                                              • Top-down parsing
                                                                                              • Context-free grammar
                                                                                              • Parse tree
                                                                                              • Definite Clause Grammars Non-terminals may have arguments
                                                                                              • DCGs
                                                                                              • Unification in a nutshell (cf AI course)
                                                                                              • Unification
                                                                                              • Parsing with DCGs
                                                                                              • Case Marking
                                                                                              • Slide 51
                                                                                              • Probabilistic Models
                                                                                              • Illustration
                                                                                              • PowerPoint Presentation
                                                                                              • Sequences are omni-present
                                                                                              • Rest of the Course

                                                                                                Parsing with DCGs

                                                                                                Now require successful unification at each step

                                                                                                S -gt NP(N) VP(N) S -gt Art(N) N(N) VP(N) Nsingular S -gt a N(singular) VP(singular) S -gt a turtle VP(singular) S -gt a turtle sleeps

                                                                                                S-gt a turtle sleep fails

                                                                                                Case Marking

                                                                                                PN(singularnominative)PN(singularnominative) --gt --gt [he][she][he][she]

                                                                                                PN(singularaccusative)PN(singularaccusative) --gt --gt [him][her][him][her]

                                                                                                PN(pluralnominative)PN(pluralnominative) --gt --gt [they][they]

                                                                                                PN(pluralaccusative)PN(pluralaccusative) --gt --gt [them][them]

                                                                                                S --gt NP(Numbernominative) NP(Number)S --gt NP(Numbernominative) NP(Number)

                                                                                                VP(Number) --gt V(Number) VP(Anyaccusative)VP(Number) --gt V(Number) VP(Anyaccusative)

                                                                                                VP(NumberCase) --gt PN(NumberCase)VP(NumberCase) --gt PN(NumberCase)

                                                                                                VP(NumberAny) --gt Det N(Number)VP(NumberAny) --gt Det N(Number)

                                                                                                He sees her She sees him They see her

                                                                                                But not Them see he

                                                                                                DCGs

                                                                                                Are strictly more expressive than CFGs Can represent for instance

                                                                                                bull S(N) -gt A(N) B(N) C(N)bull A(0) -gt [] bull B(0) -gt []bull C(0) -gt []bull A(s(N)) -gt A(N) [A]bull B(s(N)) -gt B(N) [B]bull C(s(N)) -gt C(N) [C]

                                                                                                Probabilistic Models

                                                                                                Traditional grammar models are very rigid bull essentially a yes no decision

                                                                                                Probabilistic grammarsbull Define a probability models for the databull Compute the probability of each alternativebull Choose the most likely alternative

                                                                                                Ilustrate on bull Shannon Gamebull Spelling correctionbull Parsing

                                                                                                Illustration

                                                                                                Wall Street Journal Corpus 3 000 000 words Correct parse tree for sentences known

                                                                                                bull Constructed by handbull Can be used to derive stochastic context free

                                                                                                grammarsbull SCFG assign probability to parse trees

                                                                                                Compute the most probable parse tree

                                                                                                Sequences are omni-present

                                                                                                Therefore the techniques we will see also apply tobull Bioinformatics

                                                                                                DNA proteins mRNA hellip can all be represented as strings

                                                                                                bullRobotics Sequences of actions states hellip

                                                                                                bullhellip

                                                                                                Rest of the Course

                                                                                                Limitations traditional grammar models motivate probabilistic extensions bull Regular grammars and Finite State Automata

                                                                                                All use principles of Part I on Graphical Models Markov Models using n-gramms (Hidden) Markov Models Conditional Random Fields

                                                                                                bull As an example of using undirected graphical models

                                                                                                bull Probabilistic Context Free Grammarsbull Probabilistic Definite Clause Grammars

                                                                                                • Advanced Artificial Intelligence
                                                                                                • Topic
                                                                                                • Contents
                                                                                                • Rationalism versus Empiricism
                                                                                                • Slide 5
                                                                                                • This course
                                                                                                • Ambiguity
                                                                                                • NLP and Statistics
                                                                                                • Slide 9
                                                                                                • Corpora
                                                                                                • Word Counts
                                                                                                • Slide 12
                                                                                                • Word Counts (Brown corpus)
                                                                                                • Slide 14
                                                                                                • Zipflsquos Law
                                                                                                • Language and sequences
                                                                                                • Key NLP Problem: Ambiguity
                                                                                                • Language Model
                                                                                                • Example of bad language model
                                                                                                • A bad language model
                                                                                                • Slide 22
                                                                                                • A good language model
                                                                                                • Why language models
                                                                                                • Applications
                                                                                                • Spelling errors
                                                                                                • Handwriting recognition
                                                                                                • For Spell Checkers
                                                                                                • Another dimension in language models
                                                                                                • Sequence Tagging
                                                                                                • Slide 31
                                                                                                • Parsing
                                                                                                • Slide 33
                                                                                                • Language models based on Grammars
                                                                                                • Grammars and parsing
                                                                                                • Regular Grammars and Finite State Automata
                                                                                                • Finite State Automaton
                                                                                                • Phrase structure
                                                                                                • Notation
                                                                                                • Context Free Grammar
                                                                                                • Slide 41
                                                                                                • Top-down parsing
                                                                                                • Context-free grammar
                                                                                                • Parse tree
                                                                                                • Definite Clause Grammars Non-terminals may have arguments
                                                                                                • DCGs
                                                                                                • Unification in a nutshell (cf AI course)
                                                                                                • Unification
                                                                                                • Parsing with DCGs
                                                                                                • Case Marking
                                                                                                • Slide 51
                                                                                                • Probabilistic Models
                                                                                                • Illustration
                                                                                                • PowerPoint Presentation
                                                                                                • Sequences are omni-present
                                                                                                • Rest of the Course

                                                                                                  Case Marking

PN(singular, nominative) --> [he] | [she]
PN(singular, accusative) --> [him] | [her]
PN(plural, nominative)   --> [they]
PN(plural, accusative)   --> [them]

S --> NP(Number, nominative), VP(Number)
VP(Number) --> V(Number), NP(Any, accusative)
NP(Number, Case) --> PN(Number, Case)
NP(Number, Any) --> Det, N(Number)

This accepts: He sees her. She sees him. They see her.

But not: *Them see he.
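The effect of the number and case features in these DCG rules can be sketched in ordinary code. The following is a minimal, hypothetical Python illustration (not part of the slides): a pronoun lexicon annotated with number and case, and an acceptor for simple subject-verb-object sentences that enforces the same constraints the grammar does.

```python
# Hypothetical sketch of the case-marking DCG: pronouns carry
# (number, case) features, verbs carry a number feature.
LEXICON = {
    "he":   ("singular", "nominative"), "she":  ("singular", "nominative"),
    "him":  ("singular", "accusative"), "her":  ("singular", "accusative"),
    "they": ("plural",   "nominative"), "them": ("plural",   "accusative"),
}
VERBS = {"sees": "singular", "see": "plural"}

def accepts(sentence: str) -> bool:
    """Mirrors S --> NP(Number, nominative), VP(Number) with
    VP(Number) --> V(Number), NP(Any, accusative)."""
    words = sentence.lower().split()
    if len(words) != 3:
        return False
    subj, verb, obj = words
    if subj not in LEXICON or verb not in VERBS or obj not in LEXICON:
        return False
    subj_num, subj_case = LEXICON[subj]
    _, obj_case = LEXICON[obj]
    # Subject must be nominative and agree in number with the verb;
    # the object must be accusative (its number is unconstrained: "Any").
    return (subj_case == "nominative"
            and subj_num == VERBS[verb]
            and obj_case == "accusative")
```

For example, `accepts("He sees her")` and `accepts("They see her")` hold, while `accepts("Them see he")` fails on the subject's case, just as the grammar rejects *Them see he.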

                                                                                                  DCGs

DCGs are strictly more expressive than CFGs. They can represent, for instance:

• S(N) --> A(N), B(N), C(N)
• A(0) --> []
• B(0) --> []
• C(0) --> []
• A(s(N)) --> A(N), [A]
• B(s(N)) --> B(N), [B]
• C(s(N)) --> C(N), [C]

                                                                                                  Probabilistic Models

Traditional grammar models are very rigid:
• essentially a yes/no decision

Probabilistic grammars:
• Define a probability model for the data
• Compute the probability of each alternative
• Choose the most likely alternative

Illustrated on:
• the Shannon game
• Spelling correction
• Parsing
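The three steps above amount to an argmax over candidate analyses. As a toy illustration (hypothetical probabilities, not from the slides), here is a unigram model used to pick the most likely of several alternative word sequences, e.g. for spelling correction:

```python
import math

# Toy unigram probability model; the numbers are made up for illustration.
P = {"the": 0.05, "dog": 0.01, "barks": 0.003, "barcs": 1e-7}

def log_prob(sentence: list[str]) -> float:
    # Unigram model: P(w1..wn) is the product of P(wi);
    # unseen words get a small floor probability.
    return sum(math.log(P.get(w, 1e-9)) for w in sentence)

def most_likely(alternatives: list[list[str]]) -> list[str]:
    # Steps 2 and 3: score every alternative, return the most probable one.
    return max(alternatives, key=log_prob)
```

Given the alternatives ["the", "dog", "barcs"] and ["the", "dog", "barks"], the model prefers the second, since "barks" is far more probable than the misspelling "barcs".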

                                                                                                  Illustration

Wall Street Journal Corpus: 3,000,000 words; the correct parse tree for each sentence is known

• Constructed by hand
• Can be used to derive stochastic context-free grammars (SCFGs)
• SCFGs assign a probability to each parse tree

Compute the most probable parse tree
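In an SCFG, the probability of a parse tree is the product of the probabilities of the rules used in its derivation, and the preferred analysis is the most probable tree. A minimal sketch (hypothetical grammar and probabilities, not derived from the WSJ corpus):

```python
# Hypothetical SCFG: each (LHS, RHS) rule has a probability, and the
# probabilities of rules with the same LHS sum to 1.
RULE_PROB = {
    ("S",  ("NP", "VP")): 1.0,
    ("VP", ("V", "NP")):  0.7,
    ("VP", ("V",)):       0.3,
    ("NP", ("she",)):     0.4,
    ("NP", ("her",)):     0.6,
    ("V",  ("sees",)):    1.0,
}

def tree_prob(tree) -> float:
    """Tree = (label, children); leaves are plain strings.
    P(tree) = product of the probabilities of the rules it uses."""
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULE_PROB[(label, rhs)]
    for c in children:
        if not isinstance(c, str):
            p *= tree_prob(c)
    return p
```

For the parse of "she sees her", the rules S -> NP VP, NP -> she, VP -> V NP, V -> sees, NP -> her yield 1.0 x 0.4 x 0.7 x 1.0 x 0.6 = 0.168.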

                                                                                                  Sequences are omni-present

Therefore the techniques we will see also apply to:

• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

                                                                                                  Rest of the Course

The limitations of traditional grammar models motivate probabilistic extensions:

• Regular grammars and finite state automata: Markov models using n-grams, (hidden) Markov models, and conditional random fields (the latter as an example of using undirected graphical models)
• Probabilistic context-free grammars
• Probabilistic definite clause grammars

All of these use the principles of Part I on graphical models.
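As a preview of the n-gram Markov models listed above, here is a minimal bigram (first-order Markov) model estimated by maximum likelihood from a toy corpus; the corpus and probabilities are purely illustrative, not from the course material:

```python
from collections import Counter

# Toy corpus (hypothetical). A bigram model conditions each word
# on its immediate predecessor: P(w_i | w_{i-1}).
corpus = "the dog barks the dog sleeps the cat sleeps".split()
bigrams = Counter(zip(corpus, corpus[1:]))
contexts = Counter(corpus[:-1])

def p_bigram(w_prev: str, w: str) -> float:
    # Maximum-likelihood estimate: count(w_prev, w) / count(w_prev).
    return bigrams[(w_prev, w)] / contexts[w_prev]
```

For instance, "the" is followed by "dog" in two of its three occurrences as a context, so p_bigram("the", "dog") is 2/3. Real models add smoothing for unseen bigrams, which is where the graphical-model machinery of Part I comes in.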

                                                                                                      • Finite State Automaton
                                                                                                      • Phrase structure
                                                                                                      • Notation
                                                                                                      • Context Free Grammar
                                                                                                      • Slide 41
                                                                                                      • Top-down parsing
                                                                                                      • Context-free grammar
                                                                                                      • Parse tree
                                                                                                      • Definite Clause Grammars Non-terminals may have arguments
                                                                                                      • DCGs
                                                                                                      • Unification in a nutshell (cf AI course)
                                                                                                      • Unification
                                                                                                      • Parsing with DCGs
                                                                                                      • Case Marking
                                                                                                      • Slide 51
                                                                                                      • Probabilistic Models
                                                                                                      • Illustration
                                                                                                      • PowerPoint Presentation
                                                                                                      • Sequences are omni-present
                                                                                                      • Rest of the Course

Illustration

Wall Street Journal Corpus: 3,000,000 words; the correct parse tree for each sentence is known.

• Constructed by hand
• Can be used to derive stochastic context-free grammars
• SCFGs assign a probability to each parse tree

Compute the most probable parse tree.
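The idea on this slide can be sketched in a few lines: once rule probabilities have been estimated from a hand-parsed treebank, the probability of a parse tree is simply the product of the probabilities of the rules it uses. A minimal sketch (the toy grammar, tree encoding, and function names are invented for illustration, not taken from the slides):

```python
# Toy SCFG: (lhs, rhs) -> probability. In practice these would be
# maximum-likelihood estimates from rule counts in a hand-built
# treebank such as the WSJ corpus.
scfg = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("John",)): 0.5,
    ("NP", ("Mary",)): 0.5,
    ("VP", ("V", "NP")): 1.0,
    ("V", ("sees",)): 1.0,
}

def tree_prob(tree):
    """Probability of a parse tree = product of its rule probabilities.
    A tree is (label, child, child, ...); leaves are plain strings."""
    if isinstance(tree, str):          # terminal symbol, contributes nothing
        return 1.0
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = scfg[(label, rhs)]
    for c in children:
        p *= tree_prob(c)
    return p

t = ("S", ("NP", "John"), ("VP", ("V", "sees"), ("NP", "Mary")))
print(tree_prob(t))   # 1.0 * 0.5 * 1.0 * 1.0 * 0.5 = 0.25
```

Finding the *most probable* tree among all parses of a sentence is then a dynamic-programming problem (Viterbi/CKY over the SCFG), treated later in the course.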

Sequences are omni-present

Therefore the techniques we will see also apply to:

• Bioinformatics: DNA, proteins, mRNA, … can all be represented as strings
• Robotics: sequences of actions, states, …
• …

Rest of the Course

The limitations of traditional grammar models motivate probabilistic extensions:

• Regular grammars and Finite State Automata
  – Markov Models using n-grams
  – (Hidden) Markov Models
  – Conditional Random Fields (as an example of using undirected graphical models)
• Probabilistic Context-Free Grammars
• Probabilistic Definite Clause Grammars

All use principles of Part I on Graphical Models.
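As a small taste of the n-gram models in this list, here is a bigram Markov model: transition probabilities are estimated by counting word pairs in a corpus, and a sentence is scored as the product of its bigram probabilities. The corpus and sentence below are invented toy examples:

```python
from collections import Counter

# Toy corpus; <s> and </s> mark sentence boundaries.
corpus = [
    "<s> the dog barks </s>",
    "<s> the cat meows </s>",
    "<s> the dog sleeps </s>",
]

bigrams = Counter()
unigrams = Counter()
for sent in corpus:
    words = sent.split()
    unigrams.update(words[:-1])            # count left contexts
    bigrams.update(zip(words, words[1:]))  # count adjacent word pairs

def p(w, prev):
    """Maximum-likelihood estimate P(w | prev) = count(prev, w) / count(prev)."""
    return bigrams[(prev, w)] / unigrams[prev]

def sentence_prob(sentence):
    """P(sentence) as the product of its bigram probabilities."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, w in zip(words, words[1:]):
        prob *= p(w, prev)
    return prob

print(sentence_prob("the dog barks"))  # 1.0 * (2/3) * (1/2) * 1.0 = 1/3
```

Unseen bigrams get probability zero under this maximum-likelihood estimate, which is why smoothing becomes essential on real corpora; that issue is taken up when n-gram models are treated in detail.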
