Formal Languages applied to Linguistics (2014-09-22)

Formal Languages · Formal Grammars · Regular Languages · Formal complexity of Natural Languages · References

Formal Languages applied to Linguistics

Pascal Amsili

Laboratoire de Linguistique Formelle, Université Paris Diderot

U. São Carlos, September 2014

Overview

1 Formal Languages
   Base notions
   Definition
   Problem
2 Formal Grammars
3 Regular Languages
4 Formal complexity of Natural Languages

Alphabet, word

Def. 1 (Alphabet)
An alphabet $\Sigma$ is a finite set of symbols (letters). The size of the alphabet is the cardinality of the set.

Def. 2 (Word)
A word on the alphabet $\Sigma$ is a finite sequence of letters from $\Sigma$. Formally, let $[p] = (1, 2, 3, 4, \dots, p)$ (ordered integer sequence). Then a word is a mapping

$$u : [p] \longrightarrow \Sigma$$

$p$, the length of $u$, is written $|u|$.
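As a minimal sketch of Def. 2, a word can be modelled in Python as a dictionary from the positions $1, \dots, p$ to letters. The names `SIGMA`, `make_word` and `length` are illustrative assumptions, not part of the slides.

```python
# Sketch of Def. 2: a word as a mapping u : [p] -> Sigma.
# All names here are hypothetical helpers chosen for illustration.

SIGMA = {"a", "b", "c"}  # an alphabet: a finite set of symbols


def make_word(letters, sigma=SIGMA):
    """Build the mapping from positions 1..p to letters of sigma."""
    for x in letters:
        if x not in sigma:
            raise ValueError(f"{x!r} is not a letter of the alphabet")
    # Positions are numbered from 1, as in the definition.
    return {i: x for i, x in enumerate(letters, start=1)}


def length(u):
    """|u| is the number of positions in the domain of u."""
    return len(u)


u = make_word("bacba")
print(u[1], u[5], length(u))
```

The empty sequence gives the empty mapping, i.e. a word of length 0.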

Examples I

Alphabet: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ·}
Words: 235·29
       007·12
       ·1·1·00··
       3·1415926... ($\pi$)
       ...

Alphabet: { , }
Words: ...

Examples II

Alphabet: { , , , , , ... }
Words: ...

Alphabet: {a, man, loves, woman}
Words: a
       a man loves a woman
       man man a loves woman loves a
       ...

Monoid

Def. 3 ($\Sigma^*$)
Let $\Sigma$ be an alphabet. The set of all the words that can be formed with any number of letters from $\Sigma$ is written $\Sigma^*$.
It includes a word with no letters, written $\varepsilon$.

Example: $\Sigma = \{a, b, c\}$
$\Sigma^* = \{\varepsilon, a, b, c, aa, ab, ac, ba, \dots, bbb, \dots\}$

N.B.: $\Sigma^*$ is always infinite, except if $\Sigma = \emptyset$. Then $\Sigma^* = \{\varepsilon\}$.

Structure of $\Sigma^*$

Let $k$ be the size of the alphabet: $k = |\Sigma|$.

Then $\Sigma^*$ contains:
– $k^0 = 1$ word of 0 letters ($\varepsilon$)
– $k^1 = k$ words of 1 letter
– $k^2$ words of 2 letters
– ...
– $k^n$ words of $n$ letters, $\forall n \geq 0$
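The counts above can be checked on a small alphabet. The following is a sketch only: words are represented as Python strings over single-character letters, and `words_of_length` is a hypothetical helper name.

```python
from itertools import product


def words_of_length(sigma, n):
    """All words of exactly n letters over sigma, as strings."""
    return ["".join(t) for t in product(sorted(sigma), repeat=n)]


sigma = {"a", "b", "c"}
k = len(sigma)
# Number of words of each length n = 0, 1, 2, 3.
counts = [len(words_of_length(sigma, n)) for n in range(4)]
print(counts)
```

For $n = 0$ the only word produced is the empty string, which plays the role of $\varepsilon$.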

Representation of $\Sigma^*$

$\Sigma = \{a, b, c\}$

[Tree diagram: the root $\varepsilon$ has children a, b, c; a has children aa, ab, ac; aa has children aaa, aab, aac, ...; b has children ba, bb, bc; c has children ca, cb, cc.]

Words can be enumerated according to different orders.
$\Sigma^*$ is a countable set.
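One possible enumeration order (shortest words first, then alphabetical, which corresponds to reading the tree level by level) can be sketched as a generator. The function name `enumerate_words` is an assumption of this sketch, and the alphabet is assumed to be non-empty.

```python
from itertools import count, islice, product


def enumerate_words(sigma):
    """Yield every word over a non-empty sigma: by length, then alphabetically."""
    letters = sorted(sigma)
    for n in count(0):  # n = 0, 1, 2, ... (never stops)
        for t in product(letters, repeat=n):
            yield "".join(t)


# Each word appears at a finite rank, which is what countability means here.
first = list(islice(enumerate_words({"a", "b", "c"}), 6))
print(first)
```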

Concatenation

$\Sigma^*$ can be equipped with a binary operation: concatenation.

Def. 4 (Concatenation)
Let $u : [p] \longrightarrow \Sigma$ and $w : [q] \longrightarrow \Sigma$. The concatenation of $u$ and $w$, written $uw$ (or $u.w$), is defined as follows:

$$uw : [p + q] \longrightarrow \Sigma$$

$$(uw)_i = \begin{cases} u_i & \text{for } i \in [1, p] \\ w_{i-p} & \text{for } i \in [p + 1, p + q] \end{cases}$$

Example:
u = bacba
v = cca
uv = bacbacca
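The case definition can be transcribed almost literally. In this sketch words are tuples of letters, and `concat` is a hypothetical name; Python tuples are 0-based, hence the index shifts.

```python
def concat(u, w):
    """Concatenation per Def. 4; u and w are tuples of letters."""
    p, q = len(u), len(w)
    # Position i (1-based) of uw is u_i for i <= p and w_{i-p} otherwise.
    return tuple(
        u[i - 1] if i <= p else w[i - p - 1]
        for i in range(1, p + q + 1)
    )


u = tuple("bacba")
v = tuple("cca")
uv = concat(u, v)
print("".join(uv))
```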

Factor

Def. 5 (Factor)
A factor $w$ of $u$ is a sequence of adjacent letters in $u$.
– $w$ is a factor of $u$ $\iff \exists u_1, u_2$ s.t. $u = u_1 w u_2$
– $w$ is a left factor (prefix) of $u$ $\iff \exists u_2$ s.t. $u = w u_2$
– $w$ is a right factor (suffix) of $u$ $\iff \exists u_1$ s.t. $u = u_1 w$

Def. 6 (Factorization)
We call factorization the decomposition of a word into factors.
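The three conditions of Def. 5 can be tested directly on strings. This is a sketch with hypothetical helper names (`is_factor`, `is_prefix`, `is_suffix`); words are modelled as Python strings.

```python
def is_factor(w, u):
    """True iff u = u1 + w + u2 for some words u1, u2."""
    return any(u[i:i + len(w)] == w for i in range(len(u) - len(w) + 1))


def is_prefix(w, u):
    """True iff u = w + u2 for some word u2 (left factor)."""
    return u[:len(w)] == w


def is_suffix(w, u):
    """True iff u = u1 + w for some word u1 (right factor)."""
    return len(w) <= len(u) and u[len(u) - len(w):] == w


u = "bacba"
print(is_factor("cb", u), is_prefix("ba", u), is_suffix("ba", u))
```

With these definitions the empty word is a factor, a prefix and a suffix of every word.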

Role of concatenation

1 Words have been defined on $\Sigma$. Given two such words, it is always possible to form a new word by concatenating them.
2 Any word can be factorised in many different ways:
   a b a c c a b
   (a b a)(c c a b)
   (a b)(a c c)(a b)
   (a b a c c)(a b)
   (a)(b)(a)(c)(c)(a)(b)
3 Since every letter of $\Sigma$ forms a word of length 1 (this set of words is called the base),
4 any word of $\Sigma^*$ can be seen as a (unique) sequence of concatenations of length-1 words:
   a b a c c a b
   ((((((ab)a)c)c)a)b)
   ((((((a.b).a).c).c).a).b)

Properties of concatenation

1 Concatenation is not commutative: $uv.w \neq w.uv$ (in general)
2 Concatenation is associative: $(u.v).w = u.(v.w)$
3 Concatenation has an identity (neutral) element, $\varepsilon$: $u.\varepsilon = \varepsilon.u = u$

Notation: $a.a.a = a^3$
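These properties can be spot-checked on concrete words. In this sketch words are Python strings, concatenation is string `+`, and the empty string stands for $\varepsilon$; these modelling choices and the names below are assumptions. A few checks illustrate the properties but do not prove them.

```python
EPSILON = ""  # the empty word


def concat(u, w):
    """Concatenation of two words modelled as strings."""
    return u + w


u, v, w = "ab", "c", "ba"
not_commutative = concat(u, w) != concat(w, u)  # "abba" vs "baab"
associative = concat(concat(u, v), w) == concat(u, concat(v, w))
identity = concat(u, EPSILON) == concat(EPSILON, u) == u
cube = "a" * 3  # the power notation a.a.a = a^3
print(not_commutative, associative, identity, cube)
```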


Language

Def. 7 ((Formal) Language)
Let $\Sigma$ be an alphabet. A language on $\Sigma$ is a set of words on $\Sigma$.
Or, equivalently, a language on $\Sigma$ is a subset of $\Sigma^*$.

Examples I

Let $\Sigma = \{a, b, c\}$.

$L_1 = \{aa, ab, bac\}$: finite language
$L_2 = \{a, aa, aaa, aaaa, \dots\}$, or $L_2 = \{a^i \mid i \geq 1\}$: infinite language
$L_3 = \{\varepsilon\}$: finite language, reduced to a singleton
$L_4 = \emptyset$: “empty” language (note that $L_3 \neq L_4$)
$L_5 = \Sigma^*$

Examples II

Let $\Sigma$ = {a, man, loves, woman}.

$L$ = { a man loves a woman, a woman loves a man }

Let $\Sigma'$ = {a, man, who, saw, fell}.

$L'$ = { a man fell,
         a man who saw a man fell,
         a man who saw a man who saw a man fell,
         ... }
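The pattern behind the second language can be sketched as a generator that inserts $n$ copies of "who saw a man" for $n = 0, 1, 2, \dots$. The function name is hypothetical, and the assumption is that the "..." on the slide continues this pattern. Every sentence produced is finite, yet there is no last one.

```python
from itertools import islice


def relative_clause_sentences():
    """Yield 'a man (who saw a man)^n fell' for n = 0, 1, 2, ..."""
    n = 0
    while True:
        words = ["a", "man"] + ["who", "saw", "a", "man"] * n + ["fell"]
        yield " ".join(words)
        n += 1


first_three = list(islice(relative_clause_sentences(), 3))
for sentence in first_three:
    print(sentence)
```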

Set operations

Since a language is a set, the usual set operations can be defined:
– union
– intersection
– set difference

$\Rightarrow$ One may describe a (complex) language as the result of set operations on (simpler) languages:
$$\{a^{2k} \mid k \geq 1\} = \{a, aa, aaa, aaaa, \dots\} \cap \{ww \mid w \in \Sigma^*\}$$
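Since these languages are infinite, the identity can only be checked by machine up to a length bound. The sketch below does that for words of at most `N` letters over an assumed two-letter alphabet; `words_up_to` and the other names are illustrative.

```python
from itertools import product


def words_up_to(sigma, max_len):
    """All words over sigma with at most max_len letters."""
    return {
        "".join(t)
        for n in range(max_len + 1)
        for t in product(sorted(sigma), repeat=n)
    }


sigma = {"a", "b"}
N = 6  # length bound for the check
a_powers = {"a" * i for i in range(1, N + 1)}            # {a, aa, aaa, ...}
squares = {w + w for w in words_up_to(sigma, N // 2)}    # {ww | w in Sigma*}
even_a = a_powers & squares                              # set intersection
print(sorted(even_a, key=len))
```

Note that $\varepsilon$ belongs to the set of squares (take $w = \varepsilon$) but not to the first language, so it is excluded from the intersection.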

Additional operations

Def. 8 (Product operation on languages)
One can define the language product and its closure, the Kleene star operation.

The product of languages is defined as follows:
$$L_1.L_2 = \{uv \mid u \in L_1 \text{ and } v \in L_2\}$$

Notation: $\underbrace{L.L.L \dots L}_{k \text{ times}} = L^k$; $L^0 = \{\varepsilon\}$

The Kleene star of a language is defined as follows:
$$L^* = \bigcup_{n \geq 0} L^n$$
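For finite languages, the product and the power are directly computable; the Kleene star is usually infinite, so the sketch below only computes its words up to a length bound. All function names are assumptions of this sketch, and languages are modelled as Python sets of strings.

```python
def product_lang(l1, l2):
    """L1.L2 = {uv | u in L1 and v in L2}."""
    return {u + v for u in l1 for v in l2}


def power(lang, k):
    """L^k, with L^0 = {epsilon}."""
    result = {""}
    for _ in range(k):
        result = product_lang(result, lang)
    return result


def star_up_to(lang, max_len):
    """Words of L* with at most max_len letters (L* is usually infinite)."""
    result = {""}
    frontier = {""}
    while frontier:
        # Extend the newest words by one more factor taken from lang.
        extended = product_lang(frontier, lang)
        frontier = {w for w in extended if len(w) <= max_len} - result
        result |= frontier
    return result


print(sorted(power({"a", "b"}, 2)))
print(sorted(star_up_to({"ab"}, 4), key=len))
```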

Regular expressions

It is common to use the 3 rational operations:
– union
– product
– Kleene star
to characterize certain languages...

$$(\{a\} \cup \{b\})^*.\{c\} = \{c, ac, abc, bc, \dots, baabaac, \dots\}$$
(simplified notation: $(a|b)^*c$, a regular expression)

... but not all languages can be characterized in this way.
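The simplified notation carries over almost unchanged to the regular expression syntax of Python's standard `re` module. This is a sketch: `in_language` is a hypothetical helper, and `fullmatch` is used so that the whole word, not just a part of it, must match.

```python
import re

# (a|b)*c : any number of a's and b's, followed by exactly one c.
PATTERN = re.compile(r"(a|b)*c")


def in_language(word):
    """Decide whether word belongs to ({a} U {b})*.{c}."""
    return PATTERN.fullmatch(word) is not None


for word in ["c", "ac", "abc", "bc", "baabaac", "", "ca", "abca"]:
    print(repr(word), in_language(word))
```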


Back to “Natural” Languages

English as a formal language:

alphabet: morphemes (often simplified to words, depending on your view of inflectional morphology)
   $\Rightarrow$ finite at a time $t$ by hypothesis
words: well-formed English sentences
   $\Rightarrow$ English sentences are all finite by hypothesis
language: English, as a set of an infinite number of well-formed combinations of “letters” from the alphabet

Discussion I

1 Is the alphabet finite?
   closed-class morphemes: obviously
   open-class morphemes: what about “new words”?
      morphological derivations: can be seen as produced from an unchanged inventory (1)
      other words: loan words (rare), lexical inventions (rare), change of category (2) (bounded)
      $\Rightarrow$ negligible

(1) motherese = mother + ese
(2) $\text{american}_A \rightarrow \text{american}_N$

Discussion II

2 Is English infinite?
   It is assumed that you can always utter a longer sentence than the previous one by adding linguistic material that preserves well-formedness.
   Compatible with the working memory limit
   (Langendoen & Postal, 1984)

3 Is language discrete?
   Well, that’s another story.

About infinity

Linguists sometimes have trouble with infinity:

   “In order for there to be an infinite number of sentences in a language there must either be an infinite number of words in the language (clearly not true) or there must be the possibility of infinite length sentences. The product of two finite numbers is always a finite number.” (Mannell, 1999)

and many others

!! WRONG !!

The whole point of formal languages is that they are infinite sets of finite words on a finite alphabet.

von Humboldt: language is an infinite use of finite means (quoted by Chomsky)

Good questions

Why would one consider natural language as a formal language?

– it allows one to describe the language in a formal/compact/elegant way
– it allows one to compare various languages (via classes of languages established by mathematicians)
– it gives algorithmic tools to recognize and to analyse words of a language:
   recognize $u$: decide whether $u \in L$
   analyse $u$: show the internal structure of $u$

Overview

1 Formal Languages
2 Formal Grammars
   Definition
   Language classes
3 Regular Languages
4 Formal complexity of Natural Languages

Introduction

Formal grammars were proposed by Chomsky as one of the available means to characterize a formal language.
Other means include:
– Turing machines (automata)
– $\lambda$-terms
– ...

Formal grammar

Def. 9 ((Formal) Grammar)
A formal grammar is defined by $\langle \Sigma, N, S, P \rangle$ where
– $\Sigma$ is an alphabet
– $N$ is a disjoint alphabet (the non-terminal vocabulary)
– $S \in N$ is a distinguished element of $N$, called the axiom
– $P$ is a set of « production rules », namely a subset of the cartesian product $(\Sigma \cup N)^* N (\Sigma \cup N)^* \times (\Sigma \cup N)^*$.

Examples

$\langle \Sigma, N, S, P \rangle$

$$G_0 = \left\langle \{\text{joe}, \text{sam}, \text{sleeps}\}, \{N, V, S\}, S, \left\{ \begin{array}{l} N \rightarrow \text{joe} \\ N \rightarrow \text{sam} \\ V \rightarrow \text{sleeps} \\ S \rightarrow N\ V \end{array} \right\} \right\rangle$$

Each rule is a pair: the arrow notation above stands for $(N, \text{joe})$, $(N, \text{sam})$, $(V, \text{sleeps})$, $(S, N\ V)$.
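A grammar in the sense of Def. 9 is plain data, so $G_0$ can be written down directly. The sketch below uses a named tuple and a well-formedness check following the conditions of the definition; the class name `Grammar`, its field names and `is_well_formed` are assumptions made for illustration. Each side of a rule is a tuple of symbols.

```python
from typing import NamedTuple


class Grammar(NamedTuple):
    sigma: frozenset         # terminal alphabet
    nonterminals: frozenset  # non-terminal vocabulary N
    axiom: str               # distinguished element S of N
    productions: tuple       # pairs (lhs, rhs), each a tuple of symbols


def is_well_formed(g):
    """Check the conditions of Def. 9 on the four components."""
    symbols = g.sigma | g.nonterminals
    return (
        g.sigma.isdisjoint(g.nonterminals)
        and g.axiom in g.nonterminals
        and all(
            # The left-hand side must contain at least one non-terminal.
            any(s in g.nonterminals for s in lhs)
            and all(s in symbols for s in lhs + rhs)
            for lhs, rhs in g.productions
        )
    )


G0 = Grammar(
    sigma=frozenset({"joe", "sam", "sleeps"}),
    nonterminals=frozenset({"N", "V", "S"}),
    axiom="S",
    productions=(
        (("N",), ("joe",)),
        (("N",), ("sam",)),
        (("V",), ("sleeps",)),
        (("S",), ("N", "V")),
    ),
)
print(is_well_formed(G0))
```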

Examples (cont’d)

$$G_1 = \left\langle \{\text{jean}, \text{dort}\}, \{Np, SN, SV, V, S\}, S, \left\{ \begin{array}{l} S \rightarrow SN\ SV \\ SN \rightarrow Np \\ SV \rightarrow V \\ Np \rightarrow \text{jean} \\ V \rightarrow \text{dort} \end{array} \right\} \right\rangle$$

$$G_2 = \langle \{(, )\}, \{S\}, S, \{S \longrightarrow \varepsilon \mid (S)S\} \rangle$$

Notation

$$G_3 : \begin{array}{rcl} E & \longrightarrow & E + E \\ & \mid & E \times E \\ & \mid & ( E ) \\ & \mid & F \\ F & \longrightarrow & 0 \mid 1 \mid 2 \mid 3 \mid 4 \mid 5 \mid 6 \mid 7 \mid 8 \mid 9 \end{array}$$

$$G_3 = \langle \{+, \times, (, ), 0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}, \{E, F\}, E, \{\dots\} \rangle$$

$$G_4 = E \rightarrow E + T \mid T, \quad T \rightarrow T \times F \mid F, \quad F \rightarrow (E) \mid a$$

Immediate Derivation

Def. 10 (Immediate derivation)
Let $G = \langle \Sigma, N, S, P \rangle$ be a grammar, $f, g \in (\Sigma \cup N)^*$ two “words”, and $r \in P$ a production rule such that $r : A \longrightarrow u$ ($u \in (\Sigma \cup N)^*$).

• $f$ derives into $g$ (immediate derivation) with the rule $r$ (written $f \xrightarrow{r} g$) iff $\exists v, w$ s.t. $f = vAw$ and $g = vuw$
• $f$ derives into $g$ (immediate derivation) in the grammar $G$ (written $f \xrightarrow{G} g$) iff $\exists r \in P$ s.t. $f \xrightarrow{r} g$.
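Def. 10 can be read as a procedure: find an occurrence of $A$ in $f$ and replace it by $u$. The sketch below computes all immediate derivations of a form, using the rules of $G_0$. It assumes rules with a single non-terminal on the left, as in the definition; `immediate_derivations` and `P0` are hypothetical names, and forms are tuples of symbols.

```python
def immediate_derivations(f, productions):
    """All g such that f derives into g in one step.

    f is a tuple of symbols; productions is a list of pairs (A, u)
    with A a single non-terminal and u a tuple of symbols.
    """
    results = set()
    for lhs, rhs in productions:
        for i, symbol in enumerate(f):
            if symbol == lhs:
                # f = v A w  and  g = v u w, with v = f[:i], w = f[i+1:]
                results.add(f[:i] + rhs + f[i + 1:])
    return results


# Production rules of G0.
P0 = [
    ("N", ("joe",)),
    ("N", ("sam",)),
    ("V", ("sleeps",)),
    ("S", ("N", "V")),
]

f = ("N", "V", "joe", "N")
for g in sorted(immediate_derivations(f, P0)):
    print(" ".join(g))
```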

Derivation

Def. 11 (Derivation)
$f \xrightarrow{G\,*} g$ if $f = g$ or $\exists f_0, f_1, f_2, \dots, f_n$ s.t.
   $f_0 = f$
   $f_n = g$
   $\forall i \in [1, n] : f_{i-1} \xrightarrow{G} f_i$

An example with $G_0$:
N V joe N $\longrightarrow$ sam V joe N $\longrightarrow$ sam V joe joe
                                              or sam V joe sam
                                              or sam sleeps joe N
                                              or ...
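Derivation is the reflexive and transitive closure of immediate derivation, so it can be explored by a breadth-first search over forms. The sketch below bounds the number of steps with `max_steps`, an assumption added to guarantee termination; a result of `False` therefore only means "not derivable within that many steps". The names `step`, `derives` and `P0` are illustrative.

```python
def step(f, productions):
    """All forms reachable from f by one immediate derivation."""
    return {
        f[:i] + rhs + f[i + 1:]
        for lhs, rhs in productions
        for i, symbol in enumerate(f)
        if symbol == lhs
    }


def derives(f, g, productions, max_steps=10):
    """True if g is reachable from f in at most max_steps steps (f = g counts)."""
    seen = {f}
    frontier = {f}
    for _ in range(max_steps):
        if g in seen:
            return True
        frontier = {h for x in frontier for h in step(x, productions)} - seen
        if not frontier:
            break
        seen |= frontier
    return g in seen


# Production rules of G0.
P0 = [
    ("N", ("joe",)),
    ("N", ("sam",)),
    ("V", ("sleeps",)),
    ("S", ("N", "V")),
]

start = ("N", "V", "joe", "N")
print(derives(start, ("sam", "V", "joe", "joe"), P0))
```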

Endpoint of a derivation

$$G_3 : \begin{array}{rcl} E & \longrightarrow & E + E \\ & \mid & E \times E \\ & \mid & ( E ) \\ & \mid & F \\ F & \longrightarrow & 0 \mid 1 \mid 2 \mid 3 \mid 4 \mid 5 \mid 6 \mid 7 \mid 8 \mid 9 \end{array}$$

An example with $G_3$:

$E \times E \longrightarrow F \times E \longrightarrow 3 \times E \longrightarrow 3 \times (E) \longrightarrow 3 \times (E + E) \longrightarrow 3 \times (E + F) \longrightarrow 3 \times (E + 4) \longrightarrow 3 \times (F + 4) \longrightarrow 3 \times (5 + 4) \longrightarrow |$

The derivation stops at $3 \times (5 + 4)$: this form contains no non-terminal, so no rule applies any more.

Formal LanguagesFormal Grammars

Regular LanguagesFormal complexity of Natural Languages

References

DefinitionLanguage classes

Engendered language

Def. 12 (Language engendered by a word)

Let f ∈ (Σ ∪ N)*. L_G(f) = { g ∈ Σ* | f →* g in G }

Def. 13 (Language engendered by a grammar)

The language engendered by a grammar G is the set of words of Σ* derived from the axiom: L_G = L_G(S)

For instance, () ∈ L_G2: S → (S)S → ()S → ()
as well as ((())), ()()(), ((()()())), ...
but )()( ∉ L_G2, even though the following is a licit derivation:
)S( → )(S)S( → )()S( → )()(
for there is no way to arrive at )S( starting with S.
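The two definitions can be explored with a bounded search; this is an illustrative sketch rather than anything from the slides. `language_up_to` is a hypothetical helper that enumerates the terminal words of length at most `max_len` derivable from a starting form. It assumes a context-free grammar given as a dictionary, and the search stays finite for G2 because each S still present was introduced together with two parentheses.

```python
from collections import deque

# G2 from the slides: S -> epsilon | (S)S ; the only non-terminal is S.
G2 = {"S": ["", "(S)S"]}

def language_up_to(start, rules, max_len):
    """Terminal words of length <= max_len derivable from `start` (breadth-first)."""
    seen, words = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if not any(sym in rules for sym in form):
            words.add(form)            # no non-terminal left: a word of the language
            continue
        for i, sym in enumerate(form):
            for rhs in rules.get(sym, []):
                new = form[:i] + rhs + form[i + 1:]
                terminals = sum(1 for s in new if s not in rules)
                # Terminals never disappear, so forms with too many are useless.
                if terminals <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
    return words

words = language_up_to("S", G2, 6)
print("()" in words, ")()(" in words)
print(")()(" in language_up_to(")S(", G2, 4))
```

The last line starts from the form )S( instead of the axiom: the word )()( belongs to L_G(f) for f = )S(, but not to L_G = L_G(S).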


Example

G4:
E → E + T | T
T → T × F | F
F → (E) | a

a + a, a + (a × a), ...


Proto-word

Def. 14 (Proto-word)

A proto-word (or proto-sentence) is a word on (Σ ∪ N)*N(Σ ∪ N)* (that is, a word containing at least one letter of N) produced by a derivation from the axiom.

E → E + T → E + T × F → T + T × F → T + F × F → T + a × F → F + a × F → a + a × F
(the final step, → a + a × a, is crossed out on the slide: a + a × a contains no letter of N, so it is not a proto-word)
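As a small illustration (a sketch, not part of the slides), the definition can be checked mechanically on the derivation above. The dictionary `G4` and the function `is_proto_word` are names chosen for this example, `*` stands for ×, and the function only tests for the presence of a non-terminal: it assumes the form really was produced by a derivation from the axiom.

```python
# G4 with '*' standing for the multiplication sign (convention of this sketch).
G4 = {"E": ["E+T", "T"], "T": ["T*F", "F"], "F": ["(E)", "a"]}
NONTERMINALS = set(G4)

def is_proto_word(form):
    """True if the form still contains at least one non-terminal.
    Assumes the form was obtained by a derivation from the axiom E."""
    return any(symbol in NONTERMINALS for symbol in form)

derivation = ["E", "E+T", "E+T*F", "T+T*F", "T+F*F",
              "T+a*F", "F+a*F", "a+a*F", "a+a*a"]
print([is_proto_word(form) for form in derivation])
```

Every form except the last one is a proto-word; the last one is a word of the language.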


Multiple derivations

A given word may have several derivations:
E → E + E → F + E → F + F → 3 + F → 3 + 4
E → E + E → E + F → E + 4 → F + 4 → 3 + 4

... but if the grammar is not ambiguous, there is only one left derivation:
E → E + E → F + E → 3 + E → 3 + F → 3 + 4

parsing: trying to find the/a left derivation (resp. right)
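Parsing as the search for a left derivation can be sketched as follows. This is a naive illustrative search, not an algorithm given in the slides: `leftmost_derivations` is a hypothetical helper that always rewrites the leftmost non-terminal and prunes forms that are too long or whose terminal prefix does not match the target. It relies on the assumption that no rule shortens a form, which holds for G3 (written here with `*` for ×).

```python
G3 = {"E": ["E+E", "E*E", "(E)", "F"], "F": list("0123456789")}

def leftmost_derivations(form, target, rules):
    """Yield every left derivation of `target` from `form`, as a list of forms.
    Terminates because no rule of G3 shortens a form (no epsilon rules)."""
    if len(form) > len(target):
        return
    idx = next((i for i, s in enumerate(form) if s in rules), None)
    if idx is None:                       # only terminals left
        if form == target:
            yield [form]
        return
    if form[:idx] != target[:idx]:        # the terminal prefix must already match
        return
    for rhs in rules[form[idx]]:
        new = form[:idx] + rhs + form[idx + 1:]
        for rest in leftmost_derivations(new, target, rules):
            yield [form] + rest

derivs = list(leftmost_derivations("E", "3+4", G3))
print(len(derivs))
print(" -> ".join(derivs[0]))
```

For 3 + 4 the search finds a single left derivation, the one shown above. Practical parsers avoid this exhaustive search, but the object they look for is the same.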


Derivation tree

For context-free languages, there is a way to represent the set of equivalent derivations, via a derivation tree which shows all the derivation steps independently of their order.

Grammar G2: S → ε | (S)S

S
├── (
├── S
│   ├── (
│   ├── S
│   │   └── ε
│   ├── )
│   └── S
│       └── ε
├── )
└── S
    └── ε

S → (S)S → ((S)S)S → ((S)S) → ((S)) → (())
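A derivation tree can be stored as a nested structure, and the derived word read off its leaves. The sketch below is an illustration under its own conventions (nested tuples, helper names `leaf`, `S`, `tree_yield`, `respects_g2`), none of which come from the slides.

```python
# A derivation tree as nested tuples: (label, [children]); leaves have no children.
EPS = "ε"   # marks the empty word

def leaf(label):
    return (label, [])

def S(*children):
    return ("S", list(children))

# The tree for (()) under G2: S -> ε | (S)S
tree = S(leaf("("),
         S(leaf("("), S(leaf(EPS)), leaf(")"), S(leaf(EPS))),
         leaf(")"),
         S(leaf(EPS)))

def tree_yield(node):
    """Concatenate the leaves from left to right, skipping ε."""
    label, children = node
    if not children:
        return "" if label == EPS else label
    return "".join(tree_yield(child) for child in children)

def respects_g2(node):
    """Check that every internal node uses S -> ε or S -> (S)S."""
    label, children = node
    if not children:
        return True
    shape = [child[0] for child in children]
    return (label == "S" and shape in ([EPS], ["(", "S", ")", "S"])
            and all(respects_g2(child) for child in children))

print(tree_yield(tree), respects_g2(tree))
```

The tree does not record the order in which the rules were applied, so every derivation of (()) that uses the same rule instances maps to this one structure.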


Structural analysis

Syntactic trees are valuable because they give access to the semantics.

E
├── E
│   └── T
│       └── F
│           └── a
├── +
└── T
    ├── T
    │   └── F
    │       └── a
    ├── ×
    └── F
        └── a


Ambiguity

When a grammar can assign more than one derivation tree to a word w ∈ L(G) (or more than one left derivation), the grammar is ambiguous.
For instance, G3 is ambiguous, since it can assign the two following trees to 1 + 2 × 3:

First tree, read as 1 + (2 × 3):

E
├── E
│   └── F
│       └── 1
├── +
└── E
    ├── E
    │   └── F
    │       └── 2
    ├── ×
    └── E
        └── F
            └── 3

Second tree, read as (1 + 2) × 3:

E
├── E
│   ├── E
│   │   └── F
│   │       └── 1
│   ├── +
│   └── E
│       └── F
│           └── 2
├── ×
└── E
    └── F
        └── 3
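Ambiguity can be detected on a given word by counting its derivation trees. The following sketch is an illustration rather than the author's method: `count_trees` is a hypothetical function that counts the trees G3 assigns to a word by trying every operator position as the top-level split, with `*` written for ×.

```python
from functools import lru_cache

def count_trees(word):
    """Number of derivation trees G3 assigns to `word`
    (digits, '+', '*' and parentheses; '*' stands for the multiplication sign)."""
    @lru_cache(maxsize=None)
    def count(i, j):                      # trees deriving word[i:j] from E
        n = 0
        if j - i == 1 and word[i] in "0123456789":
            n += 1                        # E -> F -> digit
        if j - i >= 3 and word[i] == "(" and word[j - 1] == ")":
            n += count(i + 1, j - 1)      # E -> (E)
        for k in range(i + 1, j - 1):     # E -> E+E or E -> E*E, operator at k
            if word[k] in "+*":
                n += count(i, k) * count(k + 1, j)
        return n
    return count(0, len(word))

print(count_trees("1+2*3"), count_trees("3*(5+4)"))
```

The two trees for 1 + 2 × 3 correspond to two different computations, 1 + (2 × 3) = 7 and (1 + 2) × 3 = 9, which is why ambiguity matters for the semantics.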


About ambiguity

Ambiguity is not desirable for the semantics
Useful artificial languages are rarely ambiguous
There are context-free languages that are intrinsically ambiguous (3)
Natural languages are notoriously ambiguous...

(3) { a^n b a^m b a^p b a^q | (n ≥ q ∧ m ≥ p) ∨ (n ≥ m ∧ p ≥ q) }


Comparison of grammars

different languages generated ⇒ different grammars
same language generated by G and G′ ⇒ same weak generative power
same language generated by G and G′, and same structural decomposition ⇒ same strong generative power


Overview

1 Formal Languages

2 Formal Grammars
    Definition
    Language classes

3 Regular Languages

4 Formal complexity of Natural Languages


Principle

Define language families on the basis of properties of the grammars that generate them:

1 Four classes are defined; they are nested one inside another.
2 A language is of type k if it can be generated by a type k grammar (and thus, by definition, by a type k − 1 grammar), and cannot be generated by a grammar of type k + 1.


Chomsky's hierarchy

(Here X is the terminal alphabet, written Σ elsewhere, and V is the set of non-terminals, written N.)

type 0 No restriction on P ⊂ (X ∪ V)* V (X ∪ V)* × (X ∪ V)*.

type 1 (context-sensitive grammars) All rules of P are of the shape (u1 S u2, u1 m u2), where u1 and u2 ∈ (X ∪ V)*, S ∈ V and m ∈ (X ∪ V)+.

type 2 (context-free grammars) All rules of P are of the shape (S, m), where S ∈ V and m ∈ (X ∪ V)*.

type 3 (regular grammars) All rules of P are of the shape (S, m), where S ∈ V and m ∈ X.V ∪ X ∪ {ε}.
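The four rule shapes can be turned into a small classifier. The sketch below reads the definitions above literally and is only an illustration: rules are pairs of strings, uppercase letters are taken to be non-terminals and every other character a terminal (a convention of this example), and `rule_type` and `grammar_type` are hypothetical names.

```python
def is_nonterminal(symbol):
    return symbol.isupper()               # convention of this sketch only

def rule_type(lhs, rhs):
    """Most restrictive type (3, 2, 1 or 0) whose rule shape (lhs, rhs) satisfies."""
    if len(lhs) == 1 and is_nonterminal(lhs):
        # Type 3: rhs is a terminal followed by a non-terminal, a terminal, or empty.
        if (rhs == ""
                or (len(rhs) == 1 and not is_nonterminal(rhs))
                or (len(rhs) == 2 and not is_nonterminal(rhs[0])
                    and is_nonterminal(rhs[1]))):
            return 3
        return 2                          # type 2: (S, m) with any m
    # Type 1: lhs = u1 S u2 and rhs = u1 m u2 with m non-empty.
    for i, symbol in enumerate(lhs):
        if is_nonterminal(symbol):
            u1, u2 = lhs[:i], lhs[i + 1:]
            if (rhs.startswith(u1) and rhs.endswith(u2)
                    and len(rhs) > len(u1) + len(u2)):
                return 1
    return 0

def grammar_type(rules):
    """A grammar is as restricted as its least restricted rule."""
    return min(rule_type(lhs, rhs) for lhs, rhs in rules)

g4 = [("E", "E+T"), ("E", "T"), ("T", "T*F"), ("T", "F"), ("F", "(E)"), ("F", "a")]
print(grammar_type(g4), rule_type("AB", "BA"), rule_type("aSb", "aabb"))
```

The classifier looks only at the shape of the rules. It says nothing about the type of the generated language, which may also be generated by a more restricted grammar.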


Examples

type 3:
S → aS | aB | bB | cA
B → bB | b
A → cS | bB

type 2:
E → E + T | T
T → T × F | F
F → (E) | a


Example 1: type 0

Type 0:
S → SABC    AC → CA    A → a
S → ε       CA → AC    B → b
AB → BA     BC → CB    C → c
BA → AB     CB → BC

generated language: words with an equal number of a, b, and c.
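The claim about the generated language can be checked on short words with a bounded search. This is an illustrative sketch: `rewrite_once` and `words_up_to` are hypothetical helpers, and the pruning bound relies on the observation that S is the only symbol this grammar can erase.

```python
from collections import deque
from itertools import permutations

RULES = [("S", "SABC"), ("S", ""),
         ("AB", "BA"), ("BA", "AB"), ("AC", "CA"), ("CA", "AC"),
         ("BC", "CB"), ("CB", "BC"),
         ("A", "a"), ("B", "b"), ("C", "c")]

def rewrite_once(form, rules):
    """All forms obtained by replacing one occurrence of a left-hand side."""
    out = set()
    for lhs, rhs in rules:
        start = form.find(lhs)
        while start != -1:
            out.add(form[:start] + rhs + form[start + len(lhs):])
            start = form.find(lhs, start + 1)
    return out

def words_up_to(rules, max_len, axiom="S"):
    """Terminal words of length <= max_len, found by breadth-first search.
    Forms longer than max_len + 1 are pruned: only the single S can be erased."""
    seen, words, queue = {axiom}, set(), deque([axiom])
    while queue:
        form = queue.popleft()
        if form == "" or form.islower():
            words.add(form)               # only terminals left
        for new in rewrite_once(form, rules):
            if len(new) <= max_len + 1 and new not in seen:
                seen.add(new)
                queue.append(new)
    return words

expected = {""} | {"".join(p) for p in permutations("abc")}
print(words_up_to(RULES, 3) == expected)
```

Up to length 3, the grammar yields the empty word and the six orderings of a, b and c, in line with the description above.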


Example 2: type 0

Type 0:
S → $S′$     Aa → aA    $a → a$
S′ → aAS′    Ab → bA    $b → b$
S′ → bBS′    Ba → aB    A$ → $a
S′ → ε       Bb → bB    B$ → $b
$$ → #


Example 2: type 0 (cont'd)

S
├── $
├── S′
│   ├── a
│   ├── A
│   └── S′
│       ├── b
│       ├── B
│       └── S′
│           └── ε
└── $

$ a A b B $
a $ A b B $
a $ A b $ b
a $ b A $ b
a b $ A $ b
a b $ $ a b
a b # a b
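The sequence of forms above can be replayed mechanically. The sketch below is an illustration under its own conventions: the letter `T` stands for S′ so that every symbol is a single character, and `rewrites` is a hypothetical helper returning all one-step rewritings of a form.

```python
# Rules of Example 2; 'T' stands for S' (a renaming made for this sketch).
RULES = [("S", "$T$"), ("T", "aAT"), ("T", "bBT"), ("T", ""),
         ("Aa", "aA"), ("Ab", "bA"), ("Ba", "aB"), ("Bb", "bB"),
         ("$a", "a$"), ("$b", "b$"), ("A$", "$a"), ("B$", "$b"),
         ("$$", "#")]

def rewrites(form, rules):
    """All forms obtained by replacing one occurrence of a left-hand side."""
    out = set()
    for lhs, rhs in rules:
        start = form.find(lhs)
        while start != -1:
            out.add(form[:start] + rhs + form[start + len(lhs):])
            start = form.find(lhs, start + 1)
    return out

# The context-free steps of the tree, then the rewriting steps listed above.
derivation = ["S", "$T$", "$aAT$", "$aAbBT$", "$aAbB$",
              "a$AbB$", "a$Ab$b", "a$bA$b", "ab$A$b", "ab$$ab", "ab#ab"]
ok = all(b in rewrites(a, RULES) for a, b in zip(derivation, derivation[1:]))
print(ok)
```

In this derivation the non-terminals A and B travel to the right-hand $ and are rewritten there as a and b, so the final word has the shape w#w with w = ab.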


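Language families

[Figure: nested language families, from innermost to outermost: finite; regular (type 3); context-free (type 2); context-sensitive (type 1); recursive (Turing-decidable); recursively enumerable (Turing-recognizable; type 0, no constraint); formal languages.]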


Remarks

There are other ways to classify languages,
    either on other properties of the grammars;
    or on other properties of the languages.

Nested structures are preferred, but nesting is not necessary.

When classes are nested, complexity/expressive power is expected to grow from one class to the next.


