
Formal Languages
Formal Grammars
Regular Languages
Formal complexity of Natural Languages
References

Formal Languages applied to Linguistics

Pascal Amsili

Laboratoire de Linguistique Formelle, Université Paris Diderot

U. São Carlos, september 2014

Overview

1 Formal Languages
    Base notions
    Definition
    Problem

2 Formal Grammars

3 Regular Languages

4 Formal complexity of Natural Languages

Alphabet, word

Def. 1 (Alphabet)

An alphabet $\Sigma$ is a finite set of symbols (letters). The size of the alphabet is the cardinality of the set.

Def. 2 (Word)

A word on the alphabet $\Sigma$ is a finite sequence of letters from $\Sigma$. Formally, let $[p] = (1, 2, 3, 4, \ldots, p)$ (ordered integer sequence). Then a word is a mapping

$$u : [p] \longrightarrow \Sigma$$

$p$, the length of $u$, is denoted $|u|$.
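As an illustration of Def. 2, here is a minimal Python sketch that models a word literally as a mapping from $[p]$ to $\Sigma$. The alphabet `SIGMA` and the helper names `make_word` and `length` are assumptions made for this example, not part of the original slides.

```python
# A word of length p modelled as a mapping u : [p] -> Sigma, with [p] = (1, ..., p).
SIGMA = {"a", "b", "c"}  # hypothetical alphabet chosen for the example


def make_word(letters, alphabet):
    """Return the mapping {1: first letter, ..., p: last letter}, checking the alphabet."""
    for letter in letters:
        if letter not in alphabet:
            raise ValueError(f"{letter!r} is not in the alphabet")
    return {i: letter for i, letter in enumerate(letters, start=1)}


def length(u):
    """|u| is the size of the domain [p] of the mapping."""
    return len(u)


u = make_word("bacba", SIGMA)
empty = make_word("", SIGMA)  # the word with no letters has domain [0] = ()
```

Positions start at 1, as in the definition, so `u[1]` is the first letter of the word.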

Examples I

Alphabet: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ·}
Words: 235·29
       007·12
       ·1·1·00··
       3·1415926… (π)
       …

Alphabet: { , } [symbols not recoverable]
Words: …

Examples II

Alphabet: { , , , , , … } [symbols not recoverable]
Words: …

Alphabet: {a, man, loves, woman}
Words: a
       a man loves a woman
       man man a loves woman loves a
       …

Monoid

Def. 3 ($\Sigma^*$)

Let $\Sigma$ be an alphabet. The set of all the words that can be formed with any number of letters from $\Sigma$ is denoted $\Sigma^*$.

It includes the word with no letters (the empty word), denoted $\varepsilon$.

Example: $\Sigma = \{a, b, c\}$, $\Sigma^* = \{\varepsilon, a, b, c, aa, ab, ac, ba, \ldots, bbb, \ldots\}$

N.B.: $\Sigma^*$ is always infinite, except if $\Sigma = \emptyset$. Then $\Sigma^* = \{\varepsilon\}$.

Structure of $\Sigma^*$

Let $k$ be the size of the alphabet: $k = |\Sigma|$.

Then $\Sigma^*$ contains:
$k^0 = 1$ word of 0 letters ($\varepsilon$)
$k^1 = k$ words of 1 letter
$k^2$ words of 2 letters
…
$k^n$ words of $n$ letters, $\forall n \geq 0$
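The counts above can be checked mechanically. The sketch below, which assumes words are represented as Python strings and uses a hypothetical helper `words_of_length`, lists all words of a given length with `itertools.product`.

```python
from itertools import product


def words_of_length(alphabet, n):
    """Return every word of exactly n letters over the alphabet, as strings."""
    return ["".join(letters) for letters in product(sorted(alphabet), repeat=n)]


# With k = 3 letters, there should be 3**n words of length n.
counts = [len(words_of_length("abc", n)) for n in range(5)]
```

For `n = 0` the result is a list containing only the empty string, which plays the role of $\varepsilon$.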

Representation of $\Sigma^*$

$\Sigma = \{a, b, c\}$

ε
    a
        aa
            aaa  aab  aac  ...
        ab
        ac
    b
        ba  bb  bc
    c
        ca  cb  cc

Words can be enumerated according to different orders.
$\Sigma^*$ is a countable set.
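One possible enumeration order, shown here only as a sketch, lists words by increasing length and alphabetically within each length. The generator name `enumerate_sigma_star` is an assumption for this example, and the alphabet is assumed to be non-empty (with an empty alphabet the generator yields the empty word and then loops without producing anything else).

```python
from itertools import count, islice, product


def enumerate_sigma_star(alphabet):
    """Yield every word of Sigma*, shortest first, alphabetically within a length."""
    letters = sorted(alphabet)
    for n in count(0):
        for combination in product(letters, repeat=n):
            yield "".join(combination)


# The generator is infinite, so only a finite prefix is ever materialised.
first_words = list(islice(enumerate_sigma_star({"a", "b", "c"}), 8))
```

Because every word appears at some finite position in this listing, the enumeration is one way to see that $\Sigma^*$ is countable.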

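Concatenation

$\Sigma^*$ can be equipped with a binary operation: concatenation.

Def. 4 (Concatenation)

Let $u : [p] \longrightarrow \Sigma$ and $w : [q] \longrightarrow \Sigma$ be two words. The concatenation of $u$ and $w$, denoted $uw$ (or $u.w$), is defined as follows:

$$uw : [p + q] \longrightarrow \Sigma$$

$$(uw)_i = \begin{cases} u_i & \text{for } i \in [1, p] \\ w_{i-p} & \text{for } i \in [p + 1, p + q] \end{cases}$$

Example: $u = bacba$, $v = cca$, $uv = bacbacca$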
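The case definition of concatenation translates almost directly into code. The following sketch assumes words are stored as dictionaries indexed from 1 (as in the formal definition of a word); `as_mapping`, `as_string` and `concat` are hypothetical helper names.

```python
def as_mapping(text):
    """Turn a string into a word u : [p] -> Sigma, indexed from 1."""
    return {i: letter for i, letter in enumerate(text, start=1)}


def as_string(u):
    """Read a word back as a string, in index order."""
    return "".join(u[i] for i in range(1, len(u) + 1))


def concat(u, w):
    """Concatenation: position i takes u_i for i <= p, and w_(i-p) otherwise."""
    p, q = len(u), len(w)
    return {i: (u[i] if i <= p else w[i - p]) for i in range(1, p + q + 1)}


uv = concat(as_mapping("bacba"), as_mapping("cca"))
```

Here `as_string(uv)` reproduces the example word built from `bacba` and `cca`.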

Factor

Def. 5 (Factor)

A factor $w$ of $u$ is a sequence of adjacent letters in $u$.
– $w$ is a factor of $u$ $\Leftrightarrow$ $\exists u_1, u_2$ s.t. $u = u_1 w u_2$
– $w$ is a left factor (prefix) of $u$ $\Leftrightarrow$ $\exists u_2$ s.t. $u = w u_2$
– $w$ is a right factor (suffix) of $u$ $\Leftrightarrow$ $\exists u_1$ s.t. $u = u_1 w$

Def. 6 (Factorization)

A factorization is a decomposition of a word into factors.
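The three factor relations can be tested by searching for the witnesses $u_1$ and $u_2$. This is a minimal sketch on Python strings; the function names are assumptions chosen for the example (Python's built-in `in`, `str.startswith` and `str.endswith` would give the same answers).

```python
def is_prefix(w, u):
    """w is a left factor of u: u = w u2 for some u2."""
    return u[:len(w)] == w


def is_suffix(w, u):
    """w is a right factor of u: u = u1 w for some u1."""
    return len(w) <= len(u) and u[len(u) - len(w):] == w


def is_factor(w, u):
    """w is a factor of u: u = u1 w u2, where u1 = u[:i] for some cut position i."""
    return any(u[i:i + len(w)] == w for i in range(len(u) - len(w) + 1))
```

Note that the empty word is a prefix, a suffix and a factor of every word, since $u_1$ and $u_2$ are allowed to be empty.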

Role of concatenation

1 Words have been defined on $\Sigma$. Given two such words, it is always possible to form a new word by concatenating them.

2 Any word can be factorised in many different ways:
a b a c c a b
(a b a)(c c a b)
(a b)(a c c)(a b)
(a b a c c)(a b)
(a)(b)(a)(c)(c)(a)(b)

3 Since each letter of $\Sigma$ forms a word of length 1 (this set of words is called the base),

4 any word of $\Sigma^*$ can be seen as a (unique) sequence of concatenations of length 1 words:
a b a c c a b
((((((ab)a)c)c)a)b)
((((((a.b).a).c).c).a).b)
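To see how many factorizations a word has, one can enumerate them. The sketch below assumes that factors must be non-empty (otherwise empty factors could be inserted without limit); the generator name `factorizations` is hypothetical. A word of $n \geq 1$ letters has $n - 1$ possible cut points, hence $2^{n-1}$ such factorizations.

```python
def factorizations(u):
    """Yield every way of cutting the word u into non-empty factors."""
    if not u:
        yield []
        return
    for cut in range(1, len(u) + 1):
        head = u[:cut]
        for rest in factorizations(u[cut:]):
            yield [head] + rest


all_cuts = list(factorizations("abaccab"))
```

The last factorization of point 2, into length 1 words, is the only one that uses the base exclusively.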

Properties of concatenation

1 Concatenation is not commutative: in general, $uv.w \neq w.uv$

2 Concatenation is associative: $(u.v).w = u.(v.w)$

3 Concatenation has an identity (neutral) element, $\varepsilon$: $u.\varepsilon = \varepsilon.u = u$

Notation: $a.a.a = a^3$
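These properties can be observed on Python strings, used here as a stand-in model for words, with `+` as concatenation and `""` as $\varepsilon$. The sample words are arbitrary choices for the example, and a few checks illustrate the properties without proving them.

```python
EPSILON = ""                 # the empty word
u, v, w = "ba", "c", "ab"    # arbitrary sample words

# 1. Not commutative: swapping the operands generally gives a different word.
commutes = (u + w) == (w + u)

# 2. Associative: the bracketing does not matter.
associative = ((u + v) + w) == (u + (v + w))

# 3. Identity element: concatenating with the empty word changes nothing.
has_identity = (u + EPSILON) == u and (EPSILON + u) == u

# Power notation: a.a.a = a^3
a_cubed = "a" * 3
```

An associative operation with an identity element is exactly what makes $\Sigma^*$ a monoid.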

Language

Def. 7 ((Formal) Language)

Let $\Sigma$ be an alphabet. A language on $\Sigma$ is a set of words on $\Sigma$.

or, equivalently,

A language on $\Sigma$ is a subset of $\Sigma^*$.

Examples I

Let $\Sigma = \{a, b, c\}$.

$L_1 = \{aa, ab, bac\}$ finite language
$L_2 = \{a, aa, aaa, aaaa, \ldots\}$ or $L_2 = \{a^i \mid i \geq 1\}$ infinite language
$L_3 = \{\varepsilon\}$ finite language, reduced to a singleton
$L_3 \neq L_4$
$L_4 = \emptyset$ "empty" language
$L_5 = \Sigma^*$
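A rough way to make these five languages concrete: finite languages can be stored as Python sets, while infinite ones can only be represented by a membership test. The names below (`in_L2`, `in_L5`, and so on) are assumptions for this sketch.

```python
SIGMA = frozenset("abc")

L1 = {"aa", "ab", "bac"}          # finite language: an explicit set


def in_L2(word):
    """Membership test for L2 = {a^i | i >= 1} (infinite, so no explicit set)."""
    return len(word) >= 1 and set(word) == {"a"}


L3 = {""}                         # contains exactly one word, the empty word
L4 = set()                        # contains no word at all


def in_L5(word):
    """Membership test for L5 = Sigma*: any finite string over the alphabet."""
    return all(letter in SIGMA for letter in word)
```

The difference between `L3` and `L4` is visible in their sizes: `len(L3)` is 1, whereas `len(L4)` is 0.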

Examples II

Let $\Sigma$ = {a, man, loves, woman}.

$L$ = { a man loves a woman, a woman loves a man }

Let $\Sigma'$ = {a, man, who, saw, fell}.

$L'$ = {
    a man fell,
    a man who saw a man fell,
    a man who saw a man who saw a man fell,
    …
}
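The language $L'$ of Examples II is infinite, but each of its words is finite and can be produced on demand. In this sketch a word is a tuple of letters from $\Sigma'$ (here, English word forms); the generator name `relative_clause_sentences` is an assumption made for the example.

```python
from itertools import islice


def relative_clause_sentences():
    """Yield the words of L' in order of length: 'a man (who saw a man)^n fell'."""
    n = 0
    while True:
        yield ("a", "man") + ("who", "saw", "a", "man") * n + ("fell",)
        n += 1


first_three = [" ".join(word) for word in islice(relative_clause_sentences(), 3)]
lengths = [len(word) for word in islice(relative_clause_sentences(), 4)]
```

The $n$-th word has $3 + 4n$ letters, so the lengths grow without bound while each individual word stays finite.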

Set operations

Since a language is a set, the usual set operations can be defined:
– union
– intersection
– set difference

$\Rightarrow$ One may describe a (complex) language as the result of set operations on (simpler) languages:
$$\{a^{2k} \mid k \geq 1\} = \{a, aa, aaa, aaaa, \ldots\} \cap \{ww \mid w \in \Sigma^*\}$$
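The identity above can be checked on a finite fragment of the languages involved. The sketch below restricts every language to words of at most `MAX_LEN` letters over a hypothetical two-letter alphabet; this bound and the variable names are assumptions, and a bounded check illustrates the identity without proving it.

```python
from itertools import product

MAX_LEN = 8
SIGMA = "ab"  # hypothetical alphabet; any alphabet containing "a" would do

# All words of Sigma* up to the length bound.
all_words = {
    "".join(letters)
    for n in range(MAX_LEN + 1)
    for letters in product(SIGMA, repeat=n)
}

# {a, aa, aaa, ...}: non-empty words made only of the letter a.
powers_of_a = {w for w in all_words if w and set(w) == {"a"}}

# {ww | w in Sigma*}: words whose two halves are identical.
squares = {
    w for w in all_words
    if len(w) % 2 == 0 and w[: len(w) // 2] == w[len(w) // 2:]
}

# {a^(2k) | k >= 1}, restricted to the same length bound.
even_powers = {"a" * (2 * k) for k in range(1, MAX_LEN // 2 + 1)}

intersection = powers_of_a & squares
```

Python's set operators `|`, `&` and `-` correspond to the union, intersection and set difference listed above.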

Additional operations

Def. 8 (product operation on languages)

One can define the language product and its closure, the Kleene star operation:

The product of languages is defined as follows:
$$L_1.L_2 = \{uv \mid u \in L_1 \text{ and } v \in L_2\}$$

Notation: $\underbrace{L.L.L \ldots L}_{k \text{ times}} = L^k$; $L^0 = \{\varepsilon\}$

The Kleene star of a language is defined as follows:
$$L^* = \bigcup_{n \geq 0} L^n$$
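For finite languages, the product and the powers $L^k$ can be computed directly; the Kleene star is infinite in general, so the sketch below only computes the part of $L^*$ up to a length bound. The function names and the bound are assumptions for this example, and the input languages are assumed to be finite sets of strings.

```python
def product_lang(l1, l2):
    """L1.L2: every concatenation of a word of L1 with a word of L2."""
    return {u + v for u in l1 for v in l2}


def power(lang, k):
    """L^k, starting from L^0 = {epsilon}."""
    result = {""}
    for _ in range(k):
        result = product_lang(result, lang)
    return result


def kleene_star_upto(lang, max_len):
    """The words of L* that have at most max_len letters."""
    result = {""}
    frontier = {""}
    while frontier:
        longer = product_lang(frontier, lang)
        frontier = {w for w in longer if len(w) <= max_len} - result
        result |= frontier
    return result
```

Discarding words longer than the bound loses nothing below the bound, because concatenation never makes a word shorter.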

Regular expressions

It is common to use the 3 rational operations:
– union
– product
– Kleene star

to characterize certain languages...

$$(\{a\} \cup \{b\})^*.\{c\} = \{c, ac, abc, bc, \ldots, baabaac, \ldots\}$$
(simplified notation: $(a|b)^*c$, a regular expression)

... but not all languages can be characterized in this way.
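The simplified notation $(a|b)^*c$ is close to what regular expression libraries accept. As a sketch, Python's standard `re` module can test membership in this language; `fullmatch` is used so that the whole word must match, and the helper name `in_language` is an assumption.

```python
import re

# (a|b)*c : any sequence of a's and b's, followed by a single c.
PATTERN = re.compile(r"(a|b)*c")


def in_language(word):
    """Decide whether the whole word belongs to ({a} U {b})*.{c}."""
    return PATTERN.fullmatch(word) is not None


members = [w for w in ["c", "ac", "abc", "bc", "baabaac"] if in_language(w)]
non_members = [w for w in ["", "ca", "abcc", "abd"] if not in_language(w)]
```

This pattern uses only union (`|`), product (juxtaposition) and Kleene star (`*`), that is, the three rational operations named above.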

Back to “Natural” Languages

English as a formal language:

alphabet: morphemes (often simplified to words, depending on your view of inflectional morphology)
$\Rightarrow$ finite at a time $t$, by hypothesis

words: well-formed English sentences
$\Rightarrow$ English sentences are all finite, by hypothesis

language: English, as an infinite set of well-formed combinations of "letters" from the alphabet

Discussion I

1 Is the alphabet finite?
closed-class morphemes: obviously
open-class morphemes: what about "new words"?
    morphological derivations: can be seen as produced from an unchanged inventory (1)
    other words: loan words (rare), lexical inventions (rare), change of category (2) (bounded)
    $\Rightarrow$ negligible

(1) motherese = mother + ese

(2) american$_A$ $\rightarrow$ american$_N$

Discussion II

2 Is English infinite?

It is assumed that one can always utter a sentence longer than the previous one by adding linguistic material while preserving well-formedness.
Compatible with the working memory limit (Langendoen & Postal, 1984)

3 Is language discrete?
Well, that's another story.

About infinity

Linguists sometimes have trouble with infinity:

"In order for there to be an infinite number of sentences in a language there must either be an infinite number of words in the language (clearly not true) or there must be the possibility of infinite length sentences. The product of two finite numbers is always a finite number." (Mannell, 1999)

and many others

!! WRONG !!

The whole point of formal languages is that they can be infinite sets of finite words on a finite alphabet: every word has a finite length, but there is no upper bound on that length.

von Humboldt: language is an infinite use of finite means (quoted by Chomsky)

Good questions

Why would one consider natural language as a formal language?

• it allows one to describe the language in a formal/compact/elegant way
• it allows one to compare various languages (via classes of languages established by mathematicians)
• it gives algorithmic tools to recognize and to analyse words of a language:
  recognize $u$: decide whether $u \in L$
  analyse $u$: show the internal structure of $u$


Overview

1 Formal Languages

2 Formal Grammars: Definition, Language classes

3 Regular Languages

4 Formal complexity of Natural Languages


Introduction

Formal grammars were proposed by Chomsky as one of the available means to characterize a formal language.

Other means include:
• Turing machines (automata)
• λ-terms
• ...


Formal grammar

Def. 9 ((Formal) Grammar)

A formal grammar is defined by $\langle \Sigma, N, S, P \rangle$ where
• $\Sigma$ is an alphabet
• $N$ is a disjoint alphabet (non-terminal vocabulary)
• $S \in N$ is a distinguished element of $N$, called the axiom
• $P$ is a set of « production rules », namely a subset of the cartesian product $(\Sigma \cup N)^* N (\Sigma \cup N)^* \times (\Sigma \cup N)^*$.
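As an illustration, Def. 9 can be transcribed almost literally into a data structure. This is only a sketch: the class name `Grammar`, its field names and the `is_well_formed` check are assumptions made here, and the instance is the small grammar $G_0$ used in the examples of this course.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Grammar:
    terminals: frozenset      # Σ
    nonterminals: frozenset   # N
    axiom: str                # S
    rules: frozenset          # P: pairs (lhs, rhs), each side a tuple of symbols

    def is_well_formed(self):
        """Check the side conditions of Def. 9."""
        symbols = self.terminals | self.nonterminals
        return (
            self.terminals.isdisjoint(self.nonterminals)   # Σ and N are disjoint
            and self.axiom in self.nonterminals            # S ∈ N
            and all(
                any(s in self.nonterminals for s in lhs)   # lhs contains a letter of N
                and all(s in symbols for s in lhs + rhs)   # both sides are over Σ ∪ N
                for lhs, rhs in self.rules
            )
        )

G0 = Grammar(
    terminals=frozenset({"joe", "sam", "sleeps"}),
    nonterminals=frozenset({"N", "V", "S"}),
    axiom="S",
    rules=frozenset({
        (("N",), ("joe",)),
        (("N",), ("sam",)),
        (("V",), ("sleeps",)),
        (("S",), ("N", "V")),
    }),
)
print(G0.is_well_formed())  # True
```

Storing each rule as a pair mirrors the definition of $P$ as a subset of a cartesian product.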


Examples

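$\langle \Sigma, N, S, P \rangle$

$$G_0 = \left\langle \{\text{joe}, \text{sam}, \text{sleeps}\},\ \{N, V, S\},\ S,\ \left\{ (N, \text{joe}),\ (N, \text{sam}),\ (V, \text{sleeps}),\ (S, N\,V) \right\} \right\rangle$$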


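The same grammar, with each rule $(A, u)$ written $A \to u$:

$$G_0 = \left\langle \{\text{joe}, \text{sam}, \text{sleeps}\},\ \{N, V, S\},\ S,\ \left\{ \begin{array}{l} N \to \text{joe} \\ N \to \text{sam} \\ V \to \text{sleeps} \\ S \to N\,V \end{array} \right\} \right\rangle$$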


Examples (cont’d)

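$$G_1 = \left\langle \{\text{jean}, \text{dort}\},\ \{\mathit{Np}, \mathit{SN}, \mathit{SV}, V, S\},\ S,\ \left\{ \begin{array}{l} S \to \mathit{SN}\ \mathit{SV} \\ \mathit{SN} \to \mathit{Np} \\ \mathit{SV} \to V \\ \mathit{Np} \to \text{jean} \\ V \to \text{dort} \end{array} \right\} \right\rangle$$

$$G_2 = \langle \{(, )\},\ \{S\},\ S,\ \{S \to \varepsilon \mid (S)S\} \rangle$$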


Notation

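$G_3$:
$E \to E + E \mid E \times E \mid (E) \mid F$
$F \to 0 \mid 1 \mid 2 \mid 3 \mid 4 \mid 5 \mid 6 \mid 7 \mid 8 \mid 9$

$G_3 = \langle \{+, \times, (, ), 0, 1, 2, 3, 4, 5, 6, 7, 8, 9\},\ \{E, F\},\ E,\ \{\dots\} \rangle$

$G_4$: $E \to E + T \mid T$, $T \to T \times F \mid F$, $F \to (E) \mid a$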


Immediate Derivation

Def. 10 (Immediate derivation)

Let $G = \langle \Sigma, N, S, P \rangle$ be a grammar, $f, g \in (\Sigma \cup N)^*$ two “words”, and $r \in P$ a production rule such that $r : A \to u$ (with $u \in (\Sigma \cup N)^*$).

• $f$ derives into $g$ (immediate derivation) with the rule $r$ (noted $f \xrightarrow{r} g$) iff $\exists v, w$ s.t. $f = vAw$ and $g = vuw$

• $f$ derives into $g$ (immediate derivation) in the grammar $G$ (noted $f \xrightarrow{G} g$) iff $\exists r \in P$ s.t. $f \xrightarrow{r} g$.
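Def. 10 can be read operationally: find an occurrence of $A$ in $f$, split $f$ as $vAw$, and glue $u$ in its place. The sketch below does exactly that. It assumes words are tuples of symbols and rules are pairs `(A, u)` with a single non-terminal on the left, as in the definition; the function name `immediate_derivations` is hypothetical.

```python
def immediate_derivations(f, rules):
    """Return every g such that f derives into g in one step.

    f is a tuple of symbols; rules is a list of (A, u) pairs where A is a
    single non-terminal and u is a tuple of symbols.
    """
    results = set()
    for A, u in rules:
        for i, symbol in enumerate(f):
            if symbol == A:
                v, w = f[:i], f[i + 1:]   # f = v A w
                results.add(v + u + w)    # g = v u w
    return results

# Rules of the example grammar G0.
G0_RULES = [("N", ("joe",)), ("N", ("sam",)), ("V", ("sleeps",)), ("S", ("N", "V"))]

steps = immediate_derivations(("N", "V"), G0_RULES)
print(sorted(steps))
```

From `N V` three words are reachable in one step: `joe V`, `sam V` and `N sleeps`.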


Derivation

Def. 11 (Derivation)

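$f \xrightarrow{G*} g$ if $f = g$ or $\exists f_0, f_1, f_2, \dots, f_n$ s.t.
• $f_0 = f$
• $f_n = g$
• $\forall i \in [1, n] : f_{i-1} \xrightarrow{G} f_i$

An example with $G_0$:

$N\ V\ \text{joe}\ N \to \text{sam}\ V\ \text{joe}\ N \to \text{sam}\ V\ \text{joe}\ \text{joe}$
or $\text{sam}\ V\ \text{joe}\ \text{sam}$
or $\text{sam}\ \text{sleeps}\ \text{joe}\ N$
or ...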
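Def. 11 is a reachability statement, so it can be checked by a graph search over words. The following is a minimal sketch under one stated assumption: no rule has an empty right-hand side (true of $G_0$), so words never shrink and anything longer than the target can be discarded. The names `step` and `derives` are illustrative.

```python
from collections import deque

def step(f, rules):
    """All words reachable from f by one immediate derivation."""
    return {f[:i] + u + f[i + 1:] for A, u in rules for i, s in enumerate(f) if s == A}

def derives(f, g, rules):
    """Decide whether f derives into g in zero or more steps (breadth-first search).

    Assumes no rule has an empty right-hand side, so words longer than g
    can never lead back to g and are pruned.
    """
    seen, queue = {f}, deque([f])
    while queue:
        current = queue.popleft()
        if current == g:
            return True
        for nxt in step(current, rules):
            if len(nxt) <= len(g) and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

G0_RULES = [("N", ("joe",)), ("N", ("sam",)), ("V", ("sleeps",)), ("S", ("N", "V"))]
start = ("N", "V", "joe", "N")

print(derives(start, ("sam", "V", "joe", "joe"), G0_RULES))   # True
print(derives(start, start, G0_RULES))                        # True (the case f = g)
print(derives(("joe",), ("sam",), G0_RULES))                  # False
```

For grammars with empty right-hand sides the length-based pruning is no longer valid and a different bound would be needed.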


Endpoint of a derivation

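$G_3$:
$E \to E + E \mid E \times E \mid (E) \mid F$
$F \to 0 \mid 1 \mid 2 \mid 3 \mid 4 \mid 5 \mid 6 \mid 7 \mid 8 \mid 9$

An example with $G_3$:

$E \times E \to F \times E \to 3 \times E \to 3 \times (E) \to 3 \times (E + E) \to 3 \times (E + F) \to 3 \times (E + 4) \to 3 \times (F + 4) \to 3 \times (5 + 4) \not\to$

The last word contains only letters of $\Sigma$: no rule applies any more, and the derivation has reached its endpoint.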


Engendered language

Def. 12 (Language engendered by a word)

Let $f \in (\Sigma \cup N)^*$.
$L_G(f) = \{ g \in \Sigma^* \mid f \xrightarrow{G*} g \}$

Def. 13 (Language engendered by a grammar)

The language engendered by a grammar $G$ is the set of words of $\Sigma^*$ derived from the axiom.
$L_G = L_G(S)$

For instance $() \in L_{G_2}$: $S \to (S)S \to ()S \to ()$
as well as $((()))$, $()()()$, $((()()()))$ ...
but $)()( \notin L_{G_2}$, even though the following is a licit derivation:
$)S( \to )(S)S( \to )()S( \to )()($
for there is no way to arrive at $)S($ starting with $S$.
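The membership claims above can be checked by enumerating $L_{G_2}$ up to a length bound. The sketch below makes two choices that are not in the slides: words are plain strings in which `S` is the only non-terminal, and only the leftmost `S` is rewritten at each step (enough to reach every word of a context-free language). The function name `language_up_to` is hypothetical.

```python
def language_up_to(max_len):
    """Words of L(G2) of length <= max_len, for G2: S -> ε | (S)S."""
    rules = ["", "(S)S"]
    seen, stack, words = {"S"}, ["S"], set()
    while stack:
        form = stack.pop()
        i = form.find("S")
        if i == -1:               # no non-terminal left: a word of the language
            words.add(form)
            continue
        for rhs in rules:         # rewrite the leftmost S
            nxt = form[:i] + rhs + form[i + 1:]
            # Prune on the number of terminals, which never decreases.
            if len(nxt.replace("S", "")) <= max_len and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return words

words = language_up_to(6)
print("()" in words, "()()()" in words, "((()))" in words)  # True True True
print(")()(" in words)                                      # False
```

The search starts from the axiom `S`, which is why `)()(` is never produced: the word `)S(` is simply not reachable.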


Example

$G_4$: $E \to E + T \mid T$, $T \to T \times F \mid F$, $F \to (E) \mid a$

$a + a$, $a + (a \times a)$, ...


Proto-word

Def. 14 (Proto-word)

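A proto-word (or proto-sentence) is a word on $(\Sigma \cup N)^* N (\Sigma \cup N)^*$ (that is, a word containing at least one letter of $N$) produced by a derivation from the axiom.

$E \to E + T \to E + T \times F \to T + T \times F \to T + F \times F \to T + a \times F \to F + a \times F \to a + a \times F \to a + a \times a$

Every word of this derivation in $G_4$ is a proto-word except the last one: $a + a \times a$ contains no letter of $N$.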


Multiple derivations

A given word may have several derivations:
$E \to E + E \to F + E \to F + F \to 3 + F \to 3 + 4$
$E \to E + E \to E + F \to E + 4 \to F + 4 \to 3 + 4$

... but if the grammar is not ambiguous, there is only one left derivation:
$E \to E + E \to F + E \to 3 + E \to 3 + F \to 3 + 4$

parsing: trying to find the/a left derivation (resp. right)
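To see how many derivations the word $3 + 4$ actually has in $G_3$, one can count the paths from $E$ to the word in the derivation graph. The sketch below is an illustration under stated assumptions: symbols are single characters, and since no rule of $G_3$ shrinks a word, anything longer than the target is pruned. The name `count_derivations` and the flag `leftmost_only` are hypothetical.

```python
from functools import lru_cache

# Rules of G3, one character per symbol.
RULES = {"E": ["E+E", "E×E", "(E)", "F"], "F": list("0123456789")}

def count_derivations(start, target, leftmost_only):
    """Count the derivations of `target` from `start` in G3."""
    @lru_cache(maxsize=None)
    def count(form):
        if form == target:
            return 1
        positions = [i for i, s in enumerate(form) if s in RULES]
        if leftmost_only:
            positions = positions[:1]          # rewrite only the leftmost non-terminal
        total = 0
        for i in positions:
            for rhs in RULES[form[i]]:
                nxt = form[:i] + rhs + form[i + 1:]
                if len(nxt) <= len(target):    # no rule of G3 shrinks a word
                    total += count(nxt)
        return total
    return count(start)

print(count_derivations("E", "3+4", leftmost_only=False))  # 6
print(count_derivations("E", "3+4", leftmost_only=True))   # 1
```

The six derivations differ only in the order in which the two operands are rewritten. Fixing the order (always the leftmost non-terminal) leaves a single one.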


Derivation tree

For context-free languages, there is a way to represent the set of equivalent derivations, via a derivation tree which shows all the derivation steps independently of their order.

Grammar $G_2$: $S \to \varepsilon \mid (S)S$

Derivation tree of $(())$, written as a labelled bracketing:

[S ( [S ( [S ε] ) [S ε]] ) [S ε]]

$S \to (S)S \to ((S)S)S \to ((S)S) \to ((S)) \to (())$


Structural analysis

Syntactic trees are precious to give access to the semantics.

Tree assigned by $G_4$ to $a + a \times a$, written as a labelled bracketing:

[E [E [T [F a]]] + [T [T [F a]] × [F a]]]


Ambiguity

When a grammar can assign more than one derivation tree to a word $w \in L(G)$ (or more than one left derivation), the grammar is ambiguous.

For instance, $G_3$ is ambiguous, since it can assign the two following trees to $1 + 2 \times 3$ (written as labelled bracketings):

[E [E [F 1]] + [E [E [F 2]] × [E [F 3]]]]

[E [E [E [F 1]] + [E [F 2]]] × [E [F 3]]]
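The two trees can also be found mechanically. The sketch below enumerates all derivation trees that $G_3$ assigns to a word by trying every way of splitting it. It is written for this particular grammar only, and the function name `trees` and the bracketed output format are choices made here for illustration.

```python
from functools import lru_cache

def trees(word):
    """All derivation trees of `word` from E in G3, as labelled bracketings."""
    @lru_cache(maxsize=None)
    def parse_E(i, j):                        # trees deriving word[i:j] from E
        found = []
        if j - i == 1 and word[i] in "0123456789":                 # E -> F -> digit
            found.append(f"[E [F {word[i]}]]")
        if j - i >= 3 and word[i] == "(" and word[j - 1] == ")":   # E -> ( E )
            found += [f"[E ( {t} )]" for t in parse_E(i + 1, j - 1)]
        for k in range(i + 1, j - 1):                              # E -> E + E | E × E
            if word[k] in "+×":
                found += [f"[E {left} {word[k]} {right}]"
                          for left in parse_E(i, k)
                          for right in parse_E(k + 1, j)]
        return tuple(found)
    return list(parse_E(0, len(word)))

for t in trees("1+2×3"):
    print(t)
print(len(trees("1+2×3")), len(trees("(1+2)×3")))  # 2 1
```

A word with more than one tree witnesses the ambiguity of the grammar. Adding parentheses to the word, as in `(1+2)×3`, leaves a single tree.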


About ambiguity

Ambiguity is not desirable for the semantics.

Useful artificial languages are rarely ambiguous.

There are context-free languages that are inherently ambiguous, such as (3).

Natural languages are notoriously ambiguous...

(3) $\{a^n b a^m b a^p b a^q \mid (n \geq q \wedge m \geq p) \vee (n \geq m \wedge p \geq q)\}$


Comparison of grammars

different languages generated $\Rightarrow$ different grammars

same language generated by $G$ and $G'$ $\Rightarrow$ same weak generative power

same language generated by $G$ and $G'$, and same structural decomposition $\Rightarrow$ same strong generative power


Overview

1 Formal Languages

2 Formal Grammars: Definition, Language classes

3 Regular Languages

4 Formal complexity of Natural Languages


Principle

Define language families on the basis of properties of the grammars that generate them:

1 Four classes are defined; they are nested within one another.

2 A language is of type $k$ if it can be generated by a type $k$ grammar (and thus, by definition, by a type $k - 1$ grammar) and cannot be generated by a grammar of type $k + 1$.


Chomsky’s hierarchy

type 0 No restriction on $P \subset (X \cup V)^* V (X \cup V)^* \times (X \cup V)^*$.

type 1 (context-sensitive grammars) All rules of $P$ are of the shape $(u_1 S u_2, u_1 m u_2)$, where $u_1, u_2 \in (X \cup V)^*$, $S \in V$ and $m \in (X \cup V)^+$.

type 2 (context-free grammars) All rules of $P$ are of the shape $(S, m)$, where $S \in V$ and $m \in (X \cup V)^*$.

type 3 (regular grammars) All rules of $P$ are of the shape $(S, m)$, where $S \in V$ and $m \in X.V \cup X \cup \{\varepsilon\}$.
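The four rule shapes can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: each symbol is a single character, upper-case letters are the variables ($V$), every other character is a terminal ($X$), and a rule is a pair of strings `(lhs, rhs)` with `""` for $\varepsilon$. The function names are hypothetical. `chomsky_type` returns the largest $k$ whose shape is satisfied by every rule of the grammar; it classifies a grammar, not the language it generates.

```python
def is_variable(symbol):
    """Assumed convention: upper-case letters are variables."""
    return symbol.isupper()


def fits_type3(lhs, rhs):
    """(S, m) with S in V and m in X.V ∪ X ∪ {ε}."""
    if not (len(lhs) == 1 and is_variable(lhs)):
        return False
    if rhs == "":
        return True
    if len(rhs) == 1:
        return not is_variable(rhs)
    return len(rhs) == 2 and not is_variable(rhs[0]) and is_variable(rhs[1])


def fits_type2(lhs, rhs):
    """(S, m) with S in V and m any string of symbols."""
    return len(lhs) == 1 and is_variable(lhs)


def fits_type1(lhs, rhs):
    """(u1 S u2, u1 m u2) with S in V and m non-empty."""
    for i, symbol in enumerate(lhs):
        if is_variable(symbol):
            u1, u2 = lhs[:i], lhs[i + 1:]
            if (len(rhs) > len(u1) + len(u2)
                    and rhs.startswith(u1) and rhs.endswith(u2)):
                return True
    return False


def fits_type0(lhs, rhs):
    """The left-hand side contains at least one variable."""
    return any(is_variable(s) for s in lhs)


def chomsky_type(rules):
    """Largest k such that every rule has the shape required for type k."""
    checks = ((3, fits_type3), (2, fits_type2), (1, fits_type1), (0, fits_type0))
    for k, fits in checks:
        if all(fits(lhs, rhs) for lhs, rhs in rules):
            return k
    raise ValueError("a left-hand side contains no variable")


regular = [("S", "aS"), ("S", "aB"), ("S", "bB"), ("S", "cA"),
           ("B", "bB"), ("B", "b"), ("A", "cS"), ("A", "bB")]
context_free = [("E", "E+T"), ("E", "T"), ("T", "T*F"), ("T", "F"),
                ("F", "(E)"), ("F", "a")]
print(chomsky_type(regular), chomsky_type(context_free))
```

Note that a context-free grammar with a rule $S \to \varepsilon$ fits the type 2 shape but not, literally, the type 1 shape given above (which requires $m$ to be non-empty); the sketch reports type 2 in that case because it tests the most restrictive shape first.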


Examples

type 3: $S \to aS \mid aB \mid bB \mid cA$, $B \to bB \mid b$, $A \to cS \mid bB$

type 2: $E \to E + T \mid T$, $T \to T \times F \mid F$, $F \to (E) \mid a$


Example 1: type 0

Type 0:

$S \to SABC$, $S \to \varepsilon$

$AB \to BA$, $BA \to AB$, $AC \to CA$, $CA \to AC$, $BC \to CB$, $CB \to BC$

$A \to a$, $B \to b$, $C \to c$

Generated language: words with an equal number of $a$, $b$, and $c$.
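The claim about the generated language can be explored on short words with a bounded search. The Python sketch below is an illustration, not a proof: it rewrites sentential forms breadth-first and discards any form longer than `max_len`, a hypothetical bound introduced here to keep the search finite. Upper-case letters are the variables, and `""` stands for $\varepsilon$.

```python
from collections import deque

RULES = [
    ("S", "SABC"), ("S", ""),
    ("AB", "BA"), ("BA", "AB"),
    ("AC", "CA"), ("CA", "AC"),
    ("BC", "CB"), ("CB", "BC"),
    ("A", "a"), ("B", "b"), ("C", "c"),
]


def generate(rules, start="S", max_len=4):
    """Terminal words reachable from `start` through forms of length <= max_len."""
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        if not any(ch.isupper() for ch in form):
            words.add(form)                    # no variable left: a word
            continue
        for lhs, rhs in rules:
            pos = form.find(lhs)
            while pos != -1:                   # try every occurrence of lhs
                new = form[:pos] + rhs + form[pos + len(lhs):]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
                pos = form.find(lhs, pos + 1)
    return words


words = generate(RULES)
print(sorted(words))
```

With forms limited to four symbols, the search finds the empty word and the six orderings of `abc`, each with as many `a` as `b` and `c`. Raising `max_len` to 7 also covers the words of length 6, at the cost of a larger search space.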


Example 2: type 0

Type 0:

$S \to \$S'\$$, $S' \to aAS' \mid bBS' \mid \varepsilon$

$Aa \to aA$, $Ab \to bA$, $Ba \to aB$, $Bb \to bB$

$\$a \to a\$$, $\$b \to b\$$, $A\$ \to \$a$, $B\$ \to \$b$

$\$\$ \to \#$


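Example 2: type 0 (cont'd)

Tree for the steps that rewrite $S$ and $S'$, in labelled bracketing:

[S $ [S' a A [S' b B [S' ε]]] $]

The resulting form is then rewritten with the remaining rules:

$\$aAbB\$ \to a\$AbB\$ \to a\$Ab\$b \to a\$bA\$b \to ab\$A\$b \to ab\$\$ab \to ab\#ab$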
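The rewriting sequence above can be replayed step by step. The following Python sketch is a hedged illustration: the encoding of forms as tuples of symbols (needed because `S'` is a two-character symbol) and the names `successors` and `is_derivation` are assumptions made here. It checks that each form in the sequence is obtained from the previous one by a single rule application.

```python
RULES = [
    (("S",), ("$", "S'", "$")),
    (("S'",), ("a", "A", "S'")),
    (("S'",), ("b", "B", "S'")),
    (("S'",), ()),                             # S' -> ε
    (("A", "a"), ("a", "A")), (("A", "b"), ("b", "A")),
    (("B", "a"), ("a", "B")), (("B", "b"), ("b", "B")),
    (("$", "a"), ("a", "$")), (("$", "b"), ("b", "$")),
    (("A", "$"), ("$", "a")), (("B", "$"), ("$", "b")),
    (("$", "$"), ("#",)),
]


def successors(form):
    """All forms reachable from `form` by applying one rule at one position."""
    out = set()
    for lhs, rhs in RULES:
        n = len(lhs)
        for i in range(len(form) - n + 1):
            if form[i:i + n] == lhs:
                out.add(form[:i] + rhs + form[i + n:])
    return out


def is_derivation(forms):
    """True if every form follows from the previous one in a single step."""
    return all(b in successors(a) for a, b in zip(forms, forms[1:]))


# The sequence from the slide, one symbol per whitespace-separated token.
SLIDE_STEPS = [tuple(s.split()) for s in [
    "S", "$ S' $", "$ a A S' $", "$ a A b B S' $", "$ a A b B $",
    "a $ A b B $", "a $ A b $ b", "a $ b A $ b", "a b $ A $ b",
    "a b $ $ a b", "a b # a b",
]]

print(is_derivation(SLIDE_STEPS), "".join(SLIDE_STEPS[-1]))
```

Several rules use a left-hand side of two symbols, which is why these steps cannot be drawn as branches of a tree and are listed as successive forms instead.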


Language families

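[Figure: nested language families, from innermost to outermost: finite; regular (type 3); context-free (type 2); context-sensitive (type 1); recursive (Turing-decidable); recursively enumerable (type 0, no constraint, Turing-recognizable); formal languages]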


Remarks

There are other ways to classify languages: either on other properties of the grammars, or on other properties of the languages.

Nested structures are preferred, but they are not necessary.

When classes are nested, complexity and expressive power are expected to grow from one class to the next.

