Formal Languages / Formal Grammars / Regular Languages / Formal complexity of Natural Languages / References
Formal Languages applied to Linguistics
Pascal Amsili
Laboratoire de Linguistique Formelle, Université Paris Diderot
U. São Carlos, September 2014
Overview
1 Formal Languages: Base notions, Definition, Problem
2 Formal Grammars
3 Regular Languages
4 Formal complexity of Natural Languages
Alphabet, word
Def. 1 (Alphabet)
An alphabet Σ is a finite set of symbols (letters). The size of the alphabet is the cardinality of the set.
Def. 2 (Word)
A word on the alphabet Σ is a finite sequence of letters from Σ. Formally, let [p] = (1, 2, 3, 4, ..., p) (ordered integer sequence). Then a word is a mapping
u : [p] → Σ
p, the length of u, is noted |u|.
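As a rough illustration of Def. 2, a word can be modelled as a mapping from positions 1..p to letters. The helper names `make_word` and `length` below are illustrative assumptions, not part of the course material.

```python
def make_word(letters, alphabet):
    """Build a word as a mapping u : [p] -> alphabet (positions start at 1)."""
    for letter in letters:
        if letter not in alphabet:
            raise ValueError(f"{letter!r} is not in the alphabet")
    return {i: letter for i, letter in enumerate(letters, start=1)}


def length(u):
    """|u|: the number of positions of the word."""
    return len(u)


# Letters may be characters...
u = make_word("bacba", {"a", "b", "c"})
# ...or larger units, as in the {a, man, loves, woman} alphabet.
s = make_word(["a", "man", "loves", "a", "woman"], {"a", "man", "loves", "woman"})
```

A plain Python string or tuple would do the same job more compactly; the dictionary is used here only because it mirrors the definition literally.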
Examples I
Alphabet: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ·}
Words: 235·29, 007·12, ·1·1·00··, 3·1415962... (π), ...
Alphabet: { , } [two graphical symbols, lost in extraction]
Words: ...
Examples II
Alphabet: { , , , , , ... } [graphical symbols, lost in extraction]
Words: ...
Alphabet: {a, man, loves, woman}
Words: a
a man loves a woman
man man a loves woman loves a
...
Monoid
Def. 3 (Σ*)
Let Σ be an alphabet. The set of all the words that can be formed with any number of letters from Σ is noted Σ*.
It comprises a word with no letter, noted ε.
Example: Σ = {a, b, c}, Σ* = {ε, a, b, c, aa, ab, ac, ba, ..., bbb, ...}
N.B.: Σ* is always infinite, except if Σ = ∅. Then Σ* = {ε}.
Structure of Σ*
Let k be the size of the alphabet: k = |Σ|.
Then Σ* contains:
k^0 = 1 word of 0 letters (ε)
k^1 = k words of 1 letter
k^2 words of 2 letters
...
k^n words of n letters, ∀n ≥ 0
Representation of Σ*
Σ = {a, b, c}
[Figure: tree rooted at ε. Its children are a, b, c; the children of a are aa, ab, ac; those of b are ba, bb, bc; those of c are ca, cb, cc; the children of aa are aaa, aab, aac; and so on.]
Words can be enumerated according to different orders.
Σ* is a countable set.
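One possible enumeration order is "shortest words first", which walks the tree level by level. The sketch below assumes single-character letters; `sigma_star` is an illustrative name. It also checks the k^n count of words of length n for k = 3.

```python
from itertools import count, islice, product


def sigma_star(alphabet):
    """Yield every word over `alphabet`, shortest first (level by level in the tree)."""
    letters = sorted(alphabet)
    if not letters:
        yield ""  # the empty alphabet gives only the empty word
        return
    for n in count(0):
        for combo in product(letters, repeat=n):
            yield "".join(combo)


# The generator is infinite, so only a finite prefix is materialised.
first_words = list(islice(sigma_star({"a", "b", "c"}), 13))

# Number of words of each length n = 0..3 over a 3-letter alphabet.
counts = [sum(1 for _ in product("abc", repeat=n)) for n in range(4)]
```

Because every word appears at some finite position of this enumeration, the generator is a concrete witness that Σ* is countable.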
Concatenation
Σ* can be equipped with a binary operation: the concatenation
Def. 4 (Concatenation)
Let u : [p] → X and w : [q] → X. The concatenation of u and w, noted uw (or u.w), is defined as follows:
uw : [p + q] → X
$$(uw)_i = \begin{cases} u_i & \text{for } i \in [1, p] \\ w_{i-p} & \text{for } i \in [p+1, p+q] \end{cases}$$
Example: u = bacba, v = cca, uv = bacbacca
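The case distinction in Def. 4 can be transcribed almost literally. This is a minimal sketch in which words are dictionaries from positions to letters; `concat`, `as_word` and `as_string` are hypothetical helper names.

```python
def as_word(s):
    """Turn a string into a mapping {1..p} -> letters."""
    return {i: ch for i, ch in enumerate(s, start=1)}


def as_string(u):
    """Read a mapping back as a string, in position order."""
    return "".join(u[i] for i in range(1, len(u) + 1))


def concat(u, w):
    """Concatenation: (uw)_i = u_i for i <= p, and w_(i-p) for p < i <= p + q."""
    p, q = len(u), len(w)
    uw = {}
    for i in range(1, p + q + 1):
        uw[i] = u[i] if i <= p else w[i - p]
    return uw


uv = concat(as_word("bacba"), as_word("cca"))
```

In everyday Python the same operation is simply `"bacba" + "cca"`; the explicit index arithmetic is shown only to match the definition.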
Factor
Def. 5 (Factor)
A factor w of u is a sequence of adjacent letters in u.
– w is a factor of u ⇔ ∃u₁, u₂ s.t. u = u₁ w u₂
– w is a left factor (prefix) of u ⇔ ∃u₂ s.t. u = w u₂
– w is a right factor (suffix) of u ⇔ ∃u₁ s.t. u = u₁ w
Def. 6 (Factorization)
We call factorization the decomposition of a word into factors.
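The three notions of factor, and the set of all factorizations of a word, can be sketched as follows. Words are plain strings here, only non-empty factors are used in factorizations, and all function names are illustrative assumptions.

```python
def is_factor(w, u):
    """True iff u = u1 + w + u2 for some (possibly empty) u1, u2."""
    return any(u[i:i + len(w)] == w for i in range(len(u) - len(w) + 1))


def is_prefix(w, u):
    """True iff u = w + u2 for some u2 (left factor)."""
    return u[:len(w)] == w


def is_suffix(w, u):
    """True iff u = u1 + w for some u1 (right factor)."""
    return len(w) <= len(u) and u[len(u) - len(w):] == w


def factorizations(u):
    """Yield every decomposition of u into non-empty factors."""
    if not u:
        yield []
        return
    for i in range(1, len(u) + 1):
        for rest in factorizations(u[i:]):
            yield [u[:i]] + rest


all_splits = list(factorizations("abaccab"))
```

A word of length n has 2^(n-1) such factorizations, since each of the n-1 gaps between letters is either a cut or not.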
Role of concatenation
1 Words have been defined on Σ. If one takes two such words, it’s always possible to form a new word by concatenating them.
2 Any word can be factorised in many different ways:
a b a c c a b
(a b a)(c c a b)
(a b)(a c c)(a b)
(a b a c c)(a b)
(a)(b)(a)(c)(c)(a)(b)
3 Since all letters of Σ form a word of length 1 (this set of words is called the base),
4 any word of Σ* can be seen as a (unique) sequence of concatenations of length 1 words:
a b a c c a b
((((((ab)a)c)c)a)b)
((((((a.b).a).c).c).a).b)
Properties of concatenation
1 Concatenation is not commutative: uv.w ≠ w.uv (in general)
2 Concatenation is associative: (u.v).w = u.(v.w)
3 Concatenation has an identity (neutral) element, ε: u.ε = ε.u = u
Notation: a.a.a = a^3
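These three properties can be spot-checked by brute force on all words up to a small length. This is a sanity check on a finite sample, not a proof; the alphabet {a, b} and the bound 3 are arbitrary choices made for the sketch.

```python
from itertools import product


def words_up_to(alphabet, max_len):
    """All words over `alphabet` of length <= max_len, shortest first."""
    return ["".join(t) for n in range(max_len + 1)
            for t in product(sorted(alphabet), repeat=n)]


WORDS = words_up_to({"a", "b"}, 3)
EPSILON = ""

# Associativity: (u.v).w == u.(v.w) for every triple in the sample.
associative = all((u + v) + w == u + (v + w)
                  for u in WORDS for v in WORDS for w in WORDS)

# Identity element: u.ε == ε.u == u.
has_identity = all(u + EPSILON == u == EPSILON + u for u in WORDS)

# Non-commutativity: one pair with u.v != v.u is enough.
counterexample = next((u, v) for u in WORDS for v in WORDS if u + v != v + u)
```

Associativity plus an identity element is exactly what makes (Σ*, .) a monoid.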
Language
Def. 7 ((Formal) Language)
Let Σ be an alphabet. A language on Σ is a set of words on Σ.
Or, equivalently, a language on Σ is a subset of Σ*.
Examples I
Let Σ = {a, b, c}.
L₁ = {aa, ab, bac}: finite language
L₂ = {a, aa, aaa, aaaa, ...}, or L₂ = {a^i / i ≥ 1}: infinite language
L₃ = {ε}: finite language, reduced to a singleton
L₄ = ∅: “empty” language (L₃ ≠ L₄)
L₅ = Σ*
Examples II
Let Σ = {a, man, loves, woman}.
L = { a man loves a woman, a woman loves a man }
Let Σ′ = {a, man, who, saw, fell}.
L′ = { a man fell, a man who saw a man fell, a man who saw a man who saw a man fell, ... }
Set operations
Since a language is a set, usual set operations can be defined:
union
intersection
set difference
⇒ One may describe a (complex) language as the result of set operations on (simpler) languages:
{a^(2k) / k ≥ 1} = {a, aa, aaa, aaaa, ...} ∩ {ww / w ∈ Σ*}
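Infinite languages cannot be stored as Python sets, but the identity above can be checked on finite approximations that keep only words up to a fixed length. The bound `MAX_LEN = 6` and the alphabet {a, b} are assumptions made for this sketch.

```python
from itertools import product


def words_up_to(alphabet, max_len):
    """All words over `alphabet` of length <= max_len, as a set."""
    return {"".join(t) for n in range(max_len + 1)
            for t in product(sorted(alphabet), repeat=n)}


MAX_LEN = 6
SIGMA = {"a", "b"}
universe = words_up_to(SIGMA, MAX_LEN)

# {a, aa, aaa, ...} truncated to length MAX_LEN
only_a = {"a" * i for i in range(1, MAX_LEN + 1)}
# {ww / w in Σ*} truncated to length MAX_LEN
squares = {w + w for w in words_up_to(SIGMA, MAX_LEN // 2)}
# {a^(2k) / k >= 1} truncated to length MAX_LEN
even_a = {"a" * (2 * k) for k in range(1, MAX_LEN // 2 + 1)}

intersection = only_a & squares
difference = only_a - squares
```

Python's `&`, `|` and `-` on sets correspond directly to intersection, union and set difference on these truncated languages.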
Additional operations
Def. 8 (product operation on languages)
One can define the language product and its closure, the Kleene star operation:
The product of languages is defined as follows:
L₁.L₂ = {uv / u ∈ L₁ & v ∈ L₂}
Notation: L.L.L ... L (k times) = L^k ; L^0 = {ε}
The Kleene star of a language is defined as follows:
$$L^* = \bigcup_{n \geq 0} L^n$$
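For finite languages the product and the powers can be computed directly. The Kleene star is infinite as soon as L contains a non-empty word, so the sketch below only computes the union up to a chosen power; `max_power` is an assumption of the sketch, not part of the definition.

```python
def product_lang(l1, l2):
    """L1.L2 = {uv / u in L1 and v in L2}."""
    return {u + v for u in l1 for v in l2}


def power(lang, k):
    """L^k, with L^0 = {ε} (ε is the empty string)."""
    result = {""}
    for _ in range(k):
        result = product_lang(result, lang)
    return result


def kleene_star(lang, max_power):
    """Finite approximation of L*: the union of L^n for n = 0..max_power."""
    result = set()
    for n in range(max_power + 1):
        result |= power(lang, n)
    return result


example_product = product_lang({"a", "ab"}, {"b", ""})
example_square = power({"a", "b"}, 2)
example_star = kleene_star({"a"}, 3)
```

Note that the product of two languages can have fewer than |L₁| × |L₂| words, since different pairs may concatenate to the same word (here a.b and ab.ε both give ab).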
Regular expressions
It is common to use the 3 rational operations:
union
product
Kleene star
to characterize certain languages...
({a} ∪ {b})*.{c} = {c, ac, abc, bc, ..., baabaac, ...}
(simplified notation: (a|b)*c, a regular expression)
... but not all languages can be thus characterized.
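The simplified notation maps directly onto Python's `re` module. A minimal sketch follows; `in_language` is an illustrative name, and `fullmatch` is used so that the whole word, not just a part of it, must match.

```python
import re

# (a|b)*c : any number of a's and b's, followed by exactly one c
PATTERN = re.compile(r"(a|b)*c")


def in_language(word):
    """Decide membership of `word` in the language ({a} ∪ {b})*.{c}."""
    return PATTERN.fullmatch(word) is not None


members = ["c", "ac", "abc", "bc", "baabaac"]
non_members = ["", "ca", "abcc", "abd"]
```

Python's `re` syntax goes beyond the three rational operations (backreferences, lookahead), so only the union, product and star fragment corresponds to regular expressions in the sense of the slide.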
Back to “Natural” Languages
English as a formal language:
alphabet: morphemes (often simplified to words, depending on your view on inflectional morphology)
⇒ Finite at a time t by hypothesis
words: well-formed English sentences
⇒ English sentences are all finite by hypothesis
language: English, as a set of an infinite number of well-formed combinations of “letters” from the alphabet
Discussion I
1 Is the alphabet finite?
closed class morphemes: obviously
open class morphemes: what about “new words”?
morphological derivations: can be seen as produced from an unchanged inventory (1)
other words: loan words (rare), lexical inventions (rare), change of category (2) (bounded)
⇒ negligible
(1) motherese = mother+ese
(2) american_A → american_N
Discussion II
2 Is English infinite?
It is supposed that you can always utter a longer sentence than the previous one by adding linguistic material that preserves well-formedness.
Compatible with the working memory limit
(Langendoen & Postal, 1984)
3 Is language discrete?
Well, that’s another story
About infinity
Linguists sometimes have trouble with infinity:
“In order for there to be an infinite number of sentences in a language there must either be an infinite number of words in the language (clearly not true) or there must be the possibility of infinite length sentences. The product of two finite numbers is always a finite number.” (Mannell, 1999)
and many others
!! WRONG !!
The whole point of formal languages is that they are infinite sets of finite words on a finite alphabet.
von Humboldt: language is an infinite use of finite means (quoted by Chomsky)
Good questions
Why would one consider natural language as a formal language?
it allows one to describe the language in a formal/compact/elegant way
it allows one to compare various languages (via classes of languages established by mathematicians)
it gives algorithmic tools to recognize and to analyse words of a language.
recognize u: decide whether u ∈ L
analyse u: show the internal structure of u
Overview
1 Formal Languages
2 Formal Grammars: Definition, Language classes
3 Regular Languages
4 Formal complexity of Natural Languages
Introduction
Formal grammars have been proposed by Chomsky as one of the available means to characterize a formal language.
Other means include:
Turing machines (automata)
λ-terms
...
Formal grammar
Def. 9 ((Formal) Grammar)
A formal grammar is defined by ⟨Σ, N, S, P⟩ where
Σ is an alphabet
N is a disjoint alphabet (non-terminal vocabulary)
S ∈ N is a distinguished element of N, called the axiom
P is a set of « production rules », namely a subset of the cartesian product (Σ ∪ N)* N (Σ ∪ N)* × (Σ ∪ N)*.
Examples
⟨Σ, N, S, P⟩
G₀ = ⟨ {joe, sam, sleeps}, {N, V, S}, S, { (N, joe), (N, sam), (V, sleeps), (S, N V) } ⟩
The rules are usually written with arrows:
N → joe
N → sam
V → sleeps
S → N V
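A grammar in the sense of Def. 9 is just a 4-tuple, which can be written down as data. The `Grammar` class and its `is_well_formed` check below are assumptions made for this sketch: the check tests that Σ and N are disjoint, that the axiom is a non-terminal, and that every left-hand side contains at least one non-terminal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    terminals: frozenset      # the alphabet Σ
    nonterminals: frozenset   # the non-terminal vocabulary N
    axiom: str                # the axiom S
    rules: tuple              # P, as (left, right) pairs of symbol tuples

    def is_well_formed(self):
        vocab = self.terminals | self.nonterminals
        return (
            not (self.terminals & self.nonterminals)
            and self.axiom in self.nonterminals
            and all(
                any(s in self.nonterminals for s in left)
                and all(s in vocab for s in left + right)
                for left, right in self.rules
            )
        )


G0 = Grammar(
    terminals=frozenset({"joe", "sam", "sleeps"}),
    nonterminals=frozenset({"N", "V", "S"}),
    axiom="S",
    rules=(
        (("N",), ("joe",)),
        (("N",), ("sam",)),
        (("V",), ("sleeps",)),
        (("S",), ("N", "V")),
    ),
)
```

Left-hand sides are tuples rather than single symbols so that the same structure can hold rules of the general form allowed by Def. 9, not only context-free ones.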
Examples (cont’d)
G₁ = ⟨ {jean, dort}, {Np, SN, SV, V, S}, S, { S → SN SV, SN → Np, SV → V, Np → jean, V → dort } ⟩
G₂ = ⟨ {(, )}, {S}, S, { S → ε | (S)S } ⟩
Notation
G₃: E → E + E
      | E × E
      | ( E )
      | F
    F → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
G₃ = ⟨ {+, ×, (, ), 0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, {E, F}, E, {...} ⟩
G₄ = E → E + T | T, T → T × F | F, F → (E) | a
Immediate Derivation
Def. 10 (Immediate derivation)
Let G = ⟨X, V, S, P⟩ be a grammar (X being the terminal alphabet and V the non-terminal one), f, g ∈ (X ∪ V)* two “words”, r ∈ P a production rule, such that r : A → u (u ∈ (X ∪ V)*).
• f derives into g (immediate derivation) with the rule r (noted $f \xrightarrow{r} g$) iff ∃v, w s.t. f = vAw and g = vuw
• f derives into g (immediate derivation) in the grammar G (noted $f \xrightarrow{G} g$) iff ∃r ∈ P s.t. $f \xrightarrow{r} g$.
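Immediate derivation amounts to replacing one occurrence of A by u. The sketch below is restricted to rules whose left-hand side is a single non-terminal, as in Def. 10; words are tuples of symbols, and `immediate_derivations` and `G0_RULES` are illustrative names.

```python
def immediate_derivations(f, rules):
    """Return every g such that f derives into g in exactly one step.

    f is a tuple of symbols; rules is a list of (A, u) pairs where A is a
    single non-terminal and u is a tuple of symbols.
    """
    results = set()
    for i, symbol in enumerate(f):
        for left, right in rules:
            if symbol == left:
                # f = v A w  and  g = v u w
                results.add(f[:i] + tuple(right) + f[i + 1:])
    return results


G0_RULES = [
    ("N", ("joe",)),
    ("N", ("sam",)),
    ("V", ("sleeps",)),
    ("S", ("N", "V")),
]

one_step = immediate_derivations(("N", "V", "joe", "N"), G0_RULES)
```

The result is a set because several (position, rule) pairs may apply to the same word, and each one yields a different immediate derivation.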
Derivation
Def. 11 (Derivation)
$f \xrightarrow{G*} g$ if f = g or ∃f₀, f₁, f₂, ..., fₙ s.t.
f₀ = f
fₙ = g
∀i ∈ [1, n] : $f_{i-1} \xrightarrow{G} f_i$
An example with G₀:
N V joe N → sam V joe N → sam V joe joe
or sam V joe sam
or sam sleeps joe N
or ...
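Derivation is the reflexive and transitive closure of immediate derivation, so it can be tested by a breadth-first search over rewriting steps. The sketch below assumes single-non-terminal left-hand sides and adds a `max_steps` bound (an assumption of the sketch) so that the search always stops, even for grammars with infinitely many derivable words.

```python
from collections import deque

G0_RULES = [
    ("N", ("joe",)),
    ("N", ("sam",)),
    ("V", ("sleeps",)),
    ("S", ("N", "V")),
]


def derives(f, g, rules, max_steps=10):
    """True if g is reachable from f in at most max_steps rewriting steps."""
    seen = {f}
    frontier = deque([(f, 0)])
    while frontier:
        current, depth = frontier.popleft()
        if current == g:          # covers the reflexive case f = g
            return True
        if depth == max_steps:
            continue
        for i, symbol in enumerate(current):
            for left, right in rules:
                if symbol == left:
                    nxt = current[:i] + tuple(right) + current[i + 1:]
                    if nxt not in seen:
                        seen.add(nxt)
                        frontier.append((nxt, depth + 1))
    return False


start = ("N", "V", "joe", "N")
```

With a bound, a `False` answer only means "not derivable within `max_steps` steps"; for G₀ the set of derivable words is finite, so the default bound is enough to be exact.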
Endpoint of a derivation
G₃: E → E + E
      | E × E
      | ( E )
      | F
    F → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
An example with G₃:
E × E → F × E → 3 × E → 3 × (E) → 3 × (E + E) → 3 × (E + F) → 3 × (E + 4) → 3 × (F + 4) → 3 × (5 + 4) ↛
(no rule applies to the last word, which contains only terminals: the derivation has reached its endpoint)
Generated language
Def. 12 (Language generated by a word)
Let f ∈ (Σ ∪ N)*. L_G(f) = { g ∈ Σ* | f →*_G g }, where →*_G stands for a derivation in zero or more steps using the rules of G.
Def. 13 (Language generated by a grammar)
The language generated by a grammar G is the set of words of Σ* derived from the axiom: L_G = L_G(S).
For instance, () ∈ L_G2: S → (S)S → ()S → ()
as well as ((())), ()()(), ((()()())), ...
but )()( ∉ L_G2, even though the following is a valid derivation: )S( → )(S)S( → )()S( → )()(
for there is no way to arrive at )S( starting from S.
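The two definitions can be explored mechanically on small cases. The sketch below is only an illustration and not part of the original slides: it assumes that G2 is the grammar S → ε | (S)S, writes every symbol as one character, and uses a hypothetical helper `generated_words` that enumerates, by breadth-first search over leftmost rewritings, the terminal words of bounded length derivable from a given starting form.

```python
from collections import deque

# Grammar G2 (assumed): S -> ε | (S)S. Keys of RULES are the non-terminals.
RULES = {"S": ["", "(S)S"]}


def generated_words(start, max_len):
    """Return the terminal words of length <= max_len derivable from `start`."""
    words, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        # Position of the leftmost non-terminal, if there is one.
        pos = next((i for i, c in enumerate(form) if c in RULES), None)
        if pos is None:
            words.add(form)  # only terminals are left: the derivation ends
            continue
        for rhs in RULES[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            # Terminals never disappear, so forms with too many are pruned.
            terminals = sum(c not in RULES for c in new)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return words


if __name__ == "__main__":
    print(sorted(generated_words("S", 4), key=lambda w: (len(w), w)))
    print(")()(" in generated_words("S", 6))     # derivable from the axiom?
    print(")()(" in generated_words(")S(", 4))   # derivable from )S( ?
```

Under these assumptions, the search started at the axiom S only returns balanced strings, while the search started at )S( does return )()(: the word is derivable from )S( but not from S, which is the distinction the example draws.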
Example
G4: E → E + T | T, T → T × F | F, F → (E) | a
Words of L_G4: a + a, a + (a × a), ...
Proto-word
Def. 14 (Proto-word)
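A proto-word (or proto-sentence) is a word of (Σ ∪ N)* N (Σ ∪ N)* (that is, a word containing at least one letter of N) produced by a derivation from the axiom.
E → E + T → E + T × F → T + T × F → T + F × F → T + a × F → F + a × F → a + a × F → a + a × a
In this derivation, every word before the last one contains a letter of N and is a proto-word; the final word a + a × a contains none, so it is not a proto-word.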
Multiple derivations
A given word may have several derivations:
E → E + E → F + E → F + F → 3 + F → 3 + 4
E → E + E → E + F → E + 4 → F + 4 → 3 + 4
... but if the grammar is not ambiguous, there is only one left derivation (a derivation that always rewrites the leftmost non-terminal):
E → E + E → F + E → 3 + E → 3 + F → 3 + 4
Parsing: trying to find the/a left (resp. right) derivation.
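The difference between arbitrary derivations and left derivations can be checked by brute force on the word 3 + 4. The following sketch is an illustration under stated assumptions, not the author's method: it takes G3 to be E → E + E | E × E | (E) | F with F → 0 | ... | 9, writes × as the character `x`, drops the spaces, and uses a hypothetical generator `derivations`. The pruning is valid only because no rule of G3 shortens a form.

```python
# Grammar G3 (assumed); "x" stands for the multiplication sign.
RULES = {
    "E": ["E+E", "ExE", "(E)", "F"],
    "F": list("0123456789"),
}


def derivations(form, target, leftmost_only):
    """Yield every derivation (list of successive forms) from `form` to `target`."""
    if form == target:
        yield [form]
        return
    # No rule of G3 shortens a form, so a longer form cannot reach the target.
    if len(form) > len(target):
        return
    positions = [i for i, c in enumerate(form) if c in RULES]
    if leftmost_only:
        positions = positions[:1]  # rewrite only the leftmost non-terminal
    for i in positions:
        for rhs in RULES[form[i]]:
            new = form[:i] + rhs + form[i + 1:]
            for rest in derivations(new, target, leftmost_only):
                yield [form] + rest


if __name__ == "__main__":
    every = list(derivations("E", "3+4", leftmost_only=False))
    left = list(derivations("E", "3+4", leftmost_only=True))
    print(len(every), "derivations in total")
    for d in left:
        print(" -> ".join(d))
```

For this particular word the search finds several derivations but a single left derivation, the one shown above. G3 is ambiguous for other words, so the uniqueness observed here is a property of 3 + 4, not of G3 as a whole.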
Derivation tree
For context-free languages, there is a way to represent the set of equivalent derivations: a derivation tree, which shows all the derivation steps independently of their order.
Grammar G2: S → ε | (S)S
S
├── (
├── S
│   ├── (
│   ├── S
│   │   └── ε
│   ├── )
│   └── S
│       └── ε
├── )
└── S
    └── ε
S → (S)S → ((S)S)S → ((S)S) → ((S)) → (())
Structural analysis
Syntactic trees are valuable because they give access to the semantics.
E
├── E
│   └── T
│       └── F
│           └── a
├── +
└── T
    ├── T
    │   └── F
    │       └── a
    ├── ×
    └── F
        └── a
Ambiguity
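When a grammar can assign more than one derivation tree to a word w ∈ L(G) (or more than one left derivation), the grammar is ambiguous.
For instance, G3 is ambiguous, since it can assign the two following trees to 1 + 2 × 3.
Tree 1, which groups the word as 1 + (2 × 3):
E
├── E
│   └── F
│       └── 1
├── +
└── E
    ├── E
    │   └── F
    │       └── 2
    ├── ×
    └── E
        └── F
            └── 3
Tree 2, which groups the word as (1 + 2) × 3:
E
├── E
│   ├── E
│   │   └── F
│   │       └── 1
│   ├── +
│   └── E
│       └── F
│           └── 2
├── ×
└── E
    └── F
        └── 3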
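The two trees matter because they lead to different meanings. The sketch below is a hedged illustration, not taken from the slides: it enumerates the derivation trees that G3 (assumed to be E → E + E | E × E | (E) | F, F → digit) assigns to a string and evaluates each of them arithmetically. The functions `trees` and `value` are hypothetical names, the enumeration is hard-wired for G3, and × is written `x`.

```python
def trees(tokens):
    """Yield every G3 derivation tree for the string `tokens`, as nested tuples."""
    if len(tokens) == 1 and tokens[0].isdigit():
        yield ("E", ("F", tokens[0]))              # E -> F -> digit
    if len(tokens) >= 3 and tokens[0] == "(" and tokens[-1] == ")":
        for sub in trees(tokens[1:-1]):
            yield ("E", "(", sub, ")")             # E -> ( E )
    for i, tok in enumerate(tokens):
        if tok in "+x":                            # E -> E + E  or  E -> E x E
            for left in trees(tokens[:i]):
                for right in trees(tokens[i + 1:]):
                    yield ("E", left, tok, right)


def value(tree):
    """Evaluate a derivation tree bottom-up (its arithmetic 'semantics')."""
    if tree[0] == "F":
        return int(tree[1])
    if len(tree) == 2:                             # E -> F
        return value(tree[1])
    if tree[1] == "(":                             # E -> ( E )
        return value(tree[2])
    left, op, right = tree[1:]
    if op == "+":
        return value(left) + value(right)
    return value(left) * value(right)


if __name__ == "__main__":
    for t in trees("1+2x3"):
        print(value(t), t)
```

With these definitions the string 1+2x3 receives two trees whose values differ (7 for the first grouping, 9 for the second), which is one concrete way to see why ambiguity is a problem for semantics.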
About ambiguity
Ambiguity is not desirable for the semantics.
Useful artificial languages are rarely ambiguous.
There are context-free languages that are inherently ambiguous (3).
Natural languages are notoriously ambiguous...
(3) { a^n b a^m b a^p b a^q | (n ≥ q ∧ m ≥ p) ∨ (n ≥ m ∧ p ≥ q) }
Comparison of grammars
Different languages generated ⇒ different grammars.
Same language generated by G and G′ ⇒ same weak generative power.
Same language generated by G and G′, and same structural decomposition ⇒ same strong generative power.
Overview
1 Formal Languages
2 Formal Grammars
   Definition
   Language classes
3 Regular Languages
4 Formal complexity of Natural Languages
Principle
Define language families on the basis of properties of the grammars that generate them:
1 Four classes are defined, nested one inside another.
2 A language is of type k if it can be generated by a type k grammar (and thus, by definition, by a type k − 1 grammar) and cannot be generated by a grammar of type k + 1.
Chomsky’s hierarchy
Here X is the terminal alphabet (written Σ elsewhere in these slides), V the set of non-terminals (written N elsewhere), and P the set of rules.
type 0: No restriction on P ⊂ (X ∪ V)* V (X ∪ V)* × (X ∪ V)*.
type 1 (context-sensitive grammars): All rules of P are of the shape (u_1 S u_2, u_1 m u_2), where u_1, u_2 ∈ (X ∪ V)*, S ∈ V and m ∈ (X ∪ V)^+.
type 2 (context-free grammars): All rules of P are of the shape (S, m), where S ∈ V and m ∈ (X ∪ V)*.
type 3 (regular grammars): All rules of P are of the shape (S, m), where S ∈ V and m ∈ X.V ∪ X ∪ {ε}.
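The four rule shapes above can be turned into a small classifier. This is a minimal sketch under assumptions of my own: symbols are single characters, a rule is a pair of strings (left-hand side, right-hand side), ε is the empty string, and `rule_type` and `grammar_type` are hypothetical names. It reports, for each rule, the highest type whose shape the rule matches, and takes the minimum over all rules.

```python
def rule_type(lhs, rhs, nonterminals):
    """Highest type (3, 2, 1 or 0) whose rule shape the pair (lhs, rhs) matches."""
    if len(lhs) == 1 and lhs in nonterminals:
        # Type 3 shape: rhs is ε, one terminal, or a terminal then a non-terminal.
        regular = (
            rhs == ""
            or (len(rhs) == 1 and rhs not in nonterminals)
            or (len(rhs) == 2 and rhs[0] not in nonterminals
                and rhs[1] in nonterminals)
        )
        return 3 if regular else 2
    # Type 1 shape: lhs = u1 S u2 and rhs = u1 m u2 with m non-empty.
    for i, sym in enumerate(lhs):
        if (sym in nonterminals and len(rhs) >= len(lhs)
                and rhs.startswith(lhs[:i]) and rhs.endswith(lhs[i + 1:])):
            return 1
    if any(sym in nonterminals for sym in lhs):
        return 0
    raise ValueError(f"left-hand side {lhs!r} contains no non-terminal")


def grammar_type(rules, nonterminals):
    """Type of a grammar: the lowest type found among its rules."""
    return min(rule_type(lhs, rhs, nonterminals) for lhs, rhs in rules)


if __name__ == "__main__":
    g4 = [("E", "E+T"), ("E", "T"), ("T", "TxF"), ("T", "F"),
          ("F", "(E)"), ("F", "a")]
    print(grammar_type(g4, {"E", "T", "F"}))
    print(rule_type("AB", "BA", {"S", "A", "B", "C"}))
```

The check is purely syntactic: it classifies a grammar, not the language it generates, and a language generated by a type 0 grammar may well have a grammar of a higher type.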
Examples
type 3:
S → aS | aB | bB | cA
B → bB | b
A → cS | bB
type 2:
E → E + T | T
T → T × F | F
F → (E) | a
Example 1: type 0
Type 0:
S → SABC    AC → CA    A → a
S → ε       CA → AC    B → b
AB → BA     BC → CB    C → c
BA → AB     CB → BC
Generated language: words with an equal number of a, b, and c.
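The claim about the generated language can be tested on short words by exhaustive search. The sketch below is an illustration only: it encodes the rules of Example 1 as string pairs, uses a hypothetical function `language`, and explores every sentential form breadth-first. The length bound relies on the observation that S → ε is the only rule that shortens a form, and only by one symbol.

```python
from collections import deque

# Rules of Example 1; upper-case letters are non-terminals, "" is ε.
RULES = [
    ("S", "SABC"), ("S", ""),
    ("AB", "BA"), ("BA", "AB"), ("AC", "CA"), ("CA", "AC"),
    ("BC", "CB"), ("CB", "BC"),
    ("A", "a"), ("B", "b"), ("C", "c"),
]


def language(max_len):
    """Return the terminal words of length <= max_len derivable from S."""
    seen, queue, words = {"S"}, deque(["S"]), set()
    while queue:
        form = queue.popleft()
        if form == "" or form.islower():
            words.add(form)  # no non-terminal left
            continue
        for lhs, rhs in RULES:
            start = form.find(lhs)
            while start != -1:  # rewrite each occurrence of lhs in turn
                new = form[:start] + rhs + form[start + len(lhs):]
                # Forms may exceed max_len by one symbol (the S to be erased).
                if len(new) <= max_len + 1 and new not in seen:
                    seen.add(new)
                    queue.append(new)
                start = form.find(lhs, start + 1)
    return words


if __name__ == "__main__":
    words = language(3)
    print(sorted(words, key=lambda w: (len(w), w)))
    print(all(w.count("a") == w.count("b") == w.count("c") for w in words))
```

A bounded search of this kind supports the statement for short words; it is not a proof for words of arbitrary length.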
Example 2: type 0
Type 0:
S → $S′$      Aa → aA    $a → a$
S′ → aAS′     Ab → bA    $b → b$
S′ → bBS′     Ba → aB    A$ → $a
S′ → ε        Bb → bB    B$ → $b
                         $$ → #
Example 2: type 0 (cont’d)
S
├── $
├── S′
│   ├── a
│   ├── A
│   └── S′
│       ├── b
│       ├── B
│       └── S′
│           └── ε
└── $
$ a A b B $
a $ A b B $
a $ A b $ b
a $ b A $ b
a b $ A $ b
a b $ $ a b
a b # a b
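The derivation above can be replayed step by step to check that each rewriting uses a rule of the grammar. This is a sketch under assumptions of mine: S′ is written `T` so that every symbol is one character, each step rewrites the leftmost occurrence of the chosen left-hand side, and `rewrite`, `replay` and `STEPS` are hypothetical names.

```python
# Rules of Example 2; S' is written T (an assumption of this sketch).
RULES = {
    ("S", "$T$"), ("T", "aAT"), ("T", "bBT"), ("T", ""),
    ("Aa", "aA"), ("Ab", "bA"), ("Ba", "aB"), ("Bb", "bB"),
    ("$a", "a$"), ("$b", "b$"), ("A$", "$a"), ("B$", "$b"),
    ("$$", "#"),
}

# The rule applied at each step of the derivation shown on the slide.
STEPS = [
    ("S", "$T$"), ("T", "aAT"), ("T", "bBT"), ("T", ""),
    ("$a", "a$"), ("B$", "$b"), ("Ab", "bA"),
    ("$b", "b$"), ("A$", "$a"), ("$$", "#"),
]


def rewrite(form, lhs, rhs):
    """Apply the rule lhs -> rhs to the leftmost occurrence of lhs in `form`."""
    if (lhs, rhs) not in RULES:
        raise ValueError(f"{lhs} -> {rhs} is not a rule of the grammar")
    if lhs not in form:
        raise ValueError(f"{lhs} does not occur in {form}")
    return form.replace(lhs, rhs, 1)


def replay(steps, form="S"):
    """Return the list of forms obtained by applying `steps` in order."""
    forms = [form]
    for lhs, rhs in steps:
        form = rewrite(form, lhs, rhs)
        forms.append(form)
    return forms


if __name__ == "__main__":
    for form in replay(STEPS):
        print(form)
```

The first four steps correspond to the tree and the remaining six to the listed forms; the replay ends on ab#ab, with the same word ab on each side of #.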
Language families
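[Figure: nested language families. From the innermost to the outermost class: finite, regular (type 3), context-free (type 2), context-sensitive (type 1), recursive (Turing-decidable), recursively enumerable (Turing-recognizable; type 0, no constraint), formal languages.]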
Remarks
There are other ways to classify languages,
   either on other properties of the grammars;
   or on other properties of the languages.
Nested structures are preferred, but nesting is not necessary.
When classes are nested, complexity/expressive power is expected to grow from one class to the next.