1ICML 2006, Grammatical Inference1
Colin de la Higuera, Université de Saint-EtienneTim Oates, University of Maryland
Grammar Induction: Techniques and Theory
2ICML 2006, Grammatical Inference2
Acknowledgements• Laurent Miclet, Tim Oates, Jose Oncina, Rafael Carrasco, PacoCasacuberta, Pedro Cruz, RémiEyraud, Philippe Ezequel, Henning Fernau, Jean-Christophe Janodet, Thierry Murgue, Frédéric Tantini, Franck Thollard, Enrique Vidal,...
• … and a lot of other people to whom we are grateful
3ICML 2006, Grammatical Inference3
Outline
1 An introductory example
2 About grammatical inference
3 Some specificities of the task
4 Some techniques and algorithms
5 Open issues and questions
4ICML 2006, Grammatical Inference4
1 How do we learn languages?
A very simple example
Carmel and Markovitch 98 & 99http://www.cs.technion.ac.il/~carmel/papers.html
5ICML 2006, Grammatical Inference5
The problem:
• An agent must take cooperative decisions in a multi-agent world.
• His decisions will depend:
– on the actions of other agents;
– on what he hopes to win or lose.
6ICML 2006, Grammatical Inference6
Hypothesis: the opponent follows a rational strategy (given by a DFA/Moore machine):
(figure: a four-state Moore machine giving the opponent's strategy; my moves are equations (e) or pictures (p), your moves are listen (l) or doze (d))
7ICML 2006, Grammatical Inference7
Example: (the prisoner’s dilemma)
• Each prisoner can admit (a) or stay silent (s)
• If both admit: 3 years each;
• If A admits but not B, A=0 years, B=5 years;
• If B admits but not A, B=0 years, A=5 years;
• If neither admits: 1 year each.
8ICML 2006, Grammatical Inference8
Payoff matrix (years for A, years for B), written as negative gains:

           B: a        B: s
A: a    (-3, -3)    (0, -5)
A: s    (-5, 0)     (-1, -1)
9ICML 2006, Grammatical Inference9
• Here we consider an iterated version, played against an opponent that follows a rational strategy.
• Gain Function: limit of means.
• A game is a word in (His_moves × My_moves)*!
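This encoding can be made concrete. A minimal sketch, assuming the pair encoding above and approximating the "limit of means" gain of a finite game by its average payoff per round:

```python
# My years in prison, negated so that larger is better:
# both admit -> -3, I admit alone -> 0, he admits alone -> -5, both silent -> -1
PAYOFF = {('a', 'a'): -3, ('s', 'a'): 0, ('a', 's'): -5, ('s', 's'): -1}

def mean_gain(game):
    """Average payoff of a finite game, a list of (his_move, my_move) pairs."""
    return sum(PAYOFF[(his, mine)] for his, mine in game) / len(game)

game = [('a', 'a'), ('a', 's'), ('s', 'a'), ('s', 's')]
print(mean_gain(game))  # (-3 + -5 + 0 + -1) / 4 = -2.25
```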
10ICML 2006, Grammatical Inference10
The general problem
• We suppose that the strategy of the opponent is given by a deterministic finite automaton.
• Can we imagine an optimal strategy?
11ICML 2006, Grammatical Inference11
Suppose we know the opponent’s strategy:
• Then (game theory):
• Consider the opponent’s graph in which we value the edges by our own gain.
12ICML 2006, Grammatical Inference12
(figure: the opponent's automaton with its edges valued by our own gains: -3 for (a,a), 0 for (s,a), -5 for (a,s), -1 for (s,s))
13ICML 2006, Grammatical Inference13
Then:
1 Find the cycle of maximum mean weight.
2 Find the best path leading to this cycle of maximum mean weight.
3 Follow the path and stay in the cycle.
All that is needed is to find the opponent's automaton!
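Step 1 above can be done in polynomial time, for example with Karp's algorithm. A sketch on a small weighted digraph; the graph and weights here are illustrative, not the tutorial's exact automaton:

```python
from math import inf

def max_mean_cycle(n, edges):
    """Karp-style computation of the maximum mean weight over all cycles.
    n: number of vertices (0..n-1); edges: list of (u, v, weight)."""
    # d[k][v] = maximum weight of any k-edge path ending at v
    d = [[-inf] * n for _ in range(n + 1)]
    for v in range(n):
        d[0][v] = 0.0
    for k in range(1, n + 1):
        for u, v, w in edges:
            if d[k - 1][u] > -inf:
                d[k][v] = max(d[k][v], d[k - 1][u] + w)
    best = -inf
    for v in range(n):
        if d[n][v] == -inf:
            continue
        # Karp's theorem: the mean of the best cycle through v
        best = max(best, min((d[n][v] - d[k][v]) / (n - k) for k in range(n)))
    return best

# Two cycles: 0->1->0 with mean (2 + 0)/2 = 1, and the self-loop 2->2 with mean 3
edges = [(0, 1, 2), (1, 0, 0), (1, 2, 1), (2, 2, 3)]
print(max_mean_cycle(3, edges))  # 3.0
```

Step 2 (the best path leading into that cycle) can then be found with an ordinary longest-path dynamic program, since the graph restricted to non-cycle vertices is small.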
14ICML 2006, Grammatical Inference14
(figure: the same valued automaton with the best path highlighted, leading to a cycle of mean weight -0.5)
15ICML 2006, Grammatical Inference15
Question
• Having seen a game of this opponent…
• Can we reconstruct his strategy?
16ICML 2006, Grammatical Inference16
Data (him, me): {aa as sa aa as ss ss ss sa}
E.g. when I play asa, his next move is a.
17ICML 2006, Grammatical Inference17
λ→ a
a→a
as → s
asa → a
asaa → a
asaas → s
asaass → s
asaasss → s
asaasssa → s
(slides 18-31 animate the construction: the prefix/response table above is replayed while a DFA is built from a single initial state, adding states and transitions until every observed response is explained)
32ICML 2006, Grammatical Inference32
How do we get hold of the learning data?
a) through observation
b) through exploration
33ICML 2006, Grammatical Inference33
An open problem: the strategy is probabilistic.
(figure: a three-state automaton whose states output a with probability 70%, 50% and 20% respectively, and s with the complementary probability)
34ICML 2006, Grammatical Inference34
Tit for Tat
(figure: the two-state Tit for Tat automaton, which repeats the opponent's previous move)
35ICML 2006, Grammatical Inference35
2 Specificities of grammatical inference
Grammatical inference consists (roughly) of finding the (or a) grammar or automaton that has produced a given set of strings (sequences, trees, terms, graphs).
36ICML 2006, Grammatical Inference36
The goal/idea
• The ancient Greeks:
A whole is more than the sum of its parts.
• Gestalt theory:
A whole is different from the sum of its parts.
37ICML 2006, Grammatical Inference37
Better said
• There are cases where the data cannot be analyzed by considering it in bits
• There are cases where intelligibility of the pattern is important
38ICML 2006, Grammatical Inference38
What do people know about formal language theory? (anywhere from nothing to lots)
39ICML 2006, Grammatical Inference39
A small reminder on formal language theory
• Chomsky hierarchy
• + and – of grammars
40ICML 2006, Grammatical Inference40
A crash course in Formal language theory
• Symbols
• Strings
• Languages
• Chomsky hierarchy
• Stochastic languages
41ICML 2006, Grammatical Inference41
Symbols are taken from some alphabet Σ.
Strings are sequences of symbols from Σ.
42ICML 2006, Grammatical Inference42
Languages are sets of strings over Σ.
Languages are subsets of Σ*.
43ICML 2006, Grammatical Inference43
Special languages
• Are recognised by finite state automata
• Are generated by grammars
44ICML 2006, Grammatical Inference44
(figure: a three-state DFA over {a, b})
DFA: Deterministic Finite State Automaton
(the same figure, showing abab ∈ L)
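The acceptance test abab ∈ L can be run mechanically. A minimal sketch; the concrete states and transitions below (a DFA for (ab)*) are assumptions for illustration, since the slide's drawing is not reproduced here:

```python
# A DFA for (ab)*: state 0 is initial and accepting, 2 is a sink
dfa = {
    'start': 0,
    'accept': {0},
    'delta': {(0, 'a'): 1, (0, 'b'): 2, (1, 'a'): 2, (1, 'b'): 0,
              (2, 'a'): 2, (2, 'b'): 2},
}

def accepts(dfa, w):
    """Run the DFA on string w and report membership."""
    q = dfa['start']
    for c in w:
        q = dfa['delta'][(q, c)]
    return q in dfa['accept']

print(accepts(dfa, 'abab'))  # True
print(accepts(dfa, 'aa'))    # False
```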
What is a context-free grammar? A 4-tuple (Σ, S, V, P) such that:
– Σ is the alphabet;
– V is a finite set of non-terminals;
– S ∈ V is the start symbol;
– P ⊆ V × (V∪Σ)* is a finite set of rules.
47ICML 2006, Grammatical Inference47
Example of a grammar: the Dyck1 grammar
– (Σ, S, V, P)
– Σ = {a, b}
– V = {S}
– P = {S → aSbS, S → λ }
48ICML 2006, Grammatical Inference48
Derivations and derivation trees
S → aSbS
  → aaSbSbS
  → aabSbS
  → aabbS
  → aabb
(figure: the corresponding derivation tree for aabb)
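The derivation above can also be checked mechanically: the Dyck1 grammar is LL(1), so a recursive-descent parser can follow the two rules S → aSbS and S → λ directly. A small sketch:

```python
def parse_S(w, i=0):
    """Try to derive w[i:] from S; return the position after the match, or None."""
    if i < len(w) and w[i] == 'a':        # rule S -> aSbS
        j = parse_S(w, i + 1)             # inner S
        if j is None or j >= len(w) or w[j] != 'b':
            return None
        return parse_S(w, j + 1)          # trailing S
    return i                              # rule S -> lambda

def in_dyck1(w):
    return parse_S(w) == len(w)

print(in_dyck1('aabb'))   # True (the derivation shown above)
print(in_dyck1('abba'))   # False
```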
49ICML 2006, Grammatical Inference49
Chomsky Hierarchy
• Level 0: no restriction
• Level 1: context-sensitive
• Level 2: context-free
• Level 3: regular
50ICML 2006, Grammatical Inference50
Chomsky Hierarchy
• Level 0: whatever Turing machines can do

(example slide content: an XML document, truncated on the slide)

<?xml version="1.0"?>
<?xml-stylesheet href="carmen.xsl" type="text/xsl"?>
<?cocoon-process type="xslt"?>
<!DOCTYPE pagina [
<!ELEMENT pagina (titulus?, poema)>
<!ELEMENT titulus (#PCDATA)>
<!ELEMENT auctor (praenomen, cognomen, nomen)>
<!ELEMENT praenomen (#PCDATA)>
<!ELEMENT nomen (#PCDATA)>
<!ELEMENT cognomen (#PCDATA)>
<!ELEMENT poema (versus+)>
<!ELEMENT versus (#PCDATA)>
]>
<pagina>
<titulus>Catullus II</titulus>
<auctor>
<praenomen>Gaius</praenomen>
<nomen>Valerius</nomen>
<cognomen>Catullus</cognomen>
</auctor>
3 Hardness of the task
– Building algorithms is one thing; being able to state that they work is another.
– Some questions:
– Does this algorithm work?
– Do I have enough learning data?
– Do I need some extra bias?
– Is this algorithm better than the other?
– Is this problem easier than the other?
72ICML 2006, Grammatical Inference72
Alternatives to answer these questions:
– Use benchmarks
– Solve a real problem
– Prove things
73ICML 2006, Grammatical Inference73
Theory
• Because you may want to be able to say something more than "seems to work in practice".
74ICML 2006, Grammatical Inference74
Convergence
• Does my algorithm converge in some sense to a best solution?
• To be able to answer, we have to admit the existence of a best solution.
75ICML 2006, Grammatical Inference75
Issues
• Get close to the best?
– Metrics
– Distributions over strings
• PAC-related models and similar: very negative results
76ICML 2006, Grammatical Inference76
Identification in the limit
• L: a class of languages
• G: a class of grammars
• Pres ⊆ ℕ→X: the presentations
• yields: the naming function, with f(ℕ) = g(ℕ) ⇒ yields(f) = yields(g)
• ϕ: a learner; identification requires L(ϕ(f)) = yields(f)
77ICML 2006, Grammatical Inference77
L is identifiable in the limit in terms of G from Pres iff
∀L∈L, ∀f∈Pres(L): given the successive values f1, f2, …, fn, … the learner outputs hypotheses h1, h2, …, hn, … such that from some n on, hi ≡ hn and L(hn) = L.
78ICML 2006, Grammatical Inference78
He did not want to compose another Quixote (which would be easy) but the Quixote. Needless to say, he never contemplated a mechanical transcription of the original; he did not propose to copy it. His admirable ambition was to produce a few pages which would coincide, word for word and line for line, with those of Miguel de Cervantes.
[…]
"My undertaking is not essentially difficult," I read in another part of the letter. "I should only have to be immortal to carry it out."
Jorge Luis Borges (1899–1986), Pierre Menard, autor del Quijote (El jardín de senderos que se bifurcan), Ficciones
79ICML 2006, Grammatical Inference79
4 Algorithmic ideas
80ICML 2006, Grammatical Inference80
The space of GI problems
• Type of input (strings)
• Presentation of input (batch)
• Hypothesis space (subset of the regular grammars)
• Success criteria (identification in the limit)
81ICML 2006, Grammatical Inference81
Types of input
• Strings: the cat hates the dog
• Structural examples (figure: a parse tree over "the cat hates the dog")
• Graphs (figure: a positive (+) and a negative (-) example graph)
82ICML 2006, Grammatical Inference82
Types of input - oracles
• Membership queries: is string s in the target language?
• Equivalence queries: is my hypothesis correct? If not, provide a counter-example.
• Subset queries: is the language of my hypothesis a subset of the target language?
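These query types can be phrased as a small oracle interface. A sketch, where the target language is a known predicate the learner is not shown, and the equivalence query is approximated over a finite test set (an assumption; a true equivalence oracle checks all strings):

```python
class Oracle:
    def __init__(self, target):           # target: str -> bool
        self.target = target

    def member(self, s):
        """Membership query: is s in the target language?"""
        return self.target(s)

    def equivalent(self, hypothesis, test_set):
        """Equivalence query, approximated on a finite test set;
        returns a counter-example, or None if none is found."""
        for s in test_set:
            if hypothesis(s) != self.target(s):
                return s
        return None

oracle = Oracle(lambda s: s.count('a') % 2 == 0)     # target: even number of a's
print(oracle.member('abab'))                         # True
print(oracle.equivalent(lambda s: True, ['', 'a']))  # 'a' is a counter-example
```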
83ICML 2006, Grammatical Inference83
Presentation of input
• Arbitrary order
• Shortest to longest
• All positive and negative examples up to some length
• Sampled according to some probability distribution
84ICML 2006, Grammatical Inference84
Presentation of input
• Text presentation
– A presentation of all strings in the target language
• Complete presentation (informant)
– A presentation of all strings over the alphabet of the target language labeled as + or -
85ICML 2006, Grammatical Inference85
Hypothesis space
• Regular grammars
– A welter of subclasses
• Context free grammars
– Fewer subclasses
• Hyper-edge replacement graph grammars
86ICML 2006, Grammatical Inference86
Success criteria
• Identification in the limit
– Text or informant presentation
– After each example, learner guesses language
– At some point, guess is correct and never changes
• PAC learning
87ICML 2006, Grammatical Inference87
Theorems due to Gold
• The good news:
– Any recursively enumerable class of languages can be learned in the limit from an informant (Gold, 1967).
• The bad news:
– A language class is superfinite if it includes all finite languages and at least one infinite language.
– No superfinite class of languages can be learned in the limit from text (Gold, 1967).
– That includes the regular and the context-free languages.
88ICML 2006, Grammatical Inference88
A picture
(figure: a chart placing learning settings between "little information / poor languages" and "a lot of information / rich languages": sub-classes of regular languages from positive data; DFA from positive and negative data; DFA from queries; context-free from positive data; mildly context-sensitive from queries)
89ICML 2006, Grammatical Inference89
Algorithms
• RPNI
• K-Reversible
• GRIDS
• SEQUITUR
• L*
90ICML 2006, Grammatical Inference90
4.1 RPNI
• Regular Positive and Negative Grammatical Inference
Identifying regular languages in polynomial time
Jose Oncina & Pedro García 1992
91ICML 2006, Grammatical Inference91
• It is a state-merging algorithm;
• It identifies any regular language in the limit;
• It works in polynomial time;
• It admits polynomial characteristic sets.
92ICML 2006, Grammatical Inference92
The algorithm

function rmerge(A, p, q)
    A = merge(A, p, q)
    while ∃a∈Σ, ∃r: p', q' ∈ δA(r, a) with p' ≠ q' do
        A = rmerge(A, p', q')
    return A
93ICML 2006, Grammatical Inference93
A = PTA(X+); Fr = {δ(q0, a): a∈Σ}
K = {q0}
while Fr ≠ ∅ do
    choose q from Fr
    if ∃p∈K: L(rmerge(A, p, q)) ∩ X- = ∅ then A = rmerge(A, p, q)
    else K = K ∪ {q}
    Fr = {δ(q, a): q∈K, a∈Σ} \ K
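The loop can be made runnable. A Python sketch of RPNI in the red/blue style; the state numbering and merge order are simplifications, so this illustrates the idea rather than reproducing Oncina & García's exact algorithm:

```python
from itertools import count

def build_pta(positives):
    """Prefix tree acceptor: transitions as {state: {symbol: state}}."""
    trans, accept, fresh = {0: {}}, set(), count(1)
    for w in sorted(positives, key=lambda w: (len(w), w)):
        q = 0
        for c in w:
            if c not in trans[q]:
                trans[q][c] = next(fresh)
                trans[trans[q][c]] = {}
            q = trans[q][c]
        accept.add(q)
    return trans, accept

def accepts(trans, accept, w):
    q = 0
    for c in w:
        if c not in trans.get(q, {}):
            return False
        q = trans[q][c]
    return q in accept

def do_merge(trans, accept, p, q):
    """Merge q into p, folding recursively to restore determinism (the
    rmerge above). Works on copies, so a failed merge is simply discarded."""
    trans = {s: dict(row) for s, row in trans.items()}
    accept = set(accept)
    def fold(p, q):
        if q in accept:
            accept.discard(q)
            accept.add(p)
        for s in trans:                       # redirect edges into q
            for c, t in trans[s].items():
                if t == q:
                    trans[s][c] = p
        for c, t in trans.pop(q).items():     # push q's outgoing edges to p
            if c in trans[p]:
                if trans[p][c] != t:
                    fold(trans[p][c], t)
            else:
                trans[p][c] = t
    fold(p, q)
    return trans, accept

def rpni(positives, negatives):
    trans, accept = build_pta(positives)
    red = {0}
    blue = sorted({t for r in red for t in trans[r].values()} - red)
    while blue:
        q = blue[0]
        for p in sorted(red):
            t2, a2 = do_merge(trans, accept, p, q)
            if not any(accepts(t2, a2, w) for w in negatives):
                trans, accept = t2, a2        # merge consistent with X-
                break
        else:
            red.add(q)                        # q must stay: promote to red
        blue = sorted({t for r in red for t in trans[r].values()} - red)
    return trans, accept

trans, accept = rpni(['', 'a', 'aa', 'aaa'], ['b', 'ab'])
print(accepts(trans, accept, 'aaaaa'))  # True: a* was learned
print(accepts(trans, accept, 'ab'))     # False
```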
94ICML 2006, Grammatical Inference94
X+ = {λ, aaa, aaba, ababa, bb, bbaaa}
X- = {aa, ab, aaaa, ba}
(figure: the prefix tree acceptor PTA(X+), with states numbered 1-15)
95ICML 2006, Grammatical Inference95
The run of RPNI on this sample (X- = {aa, ab, aaaa, ba}):
• Try to merge 2 and 1. This needs more merging for determinization, but then string aaaa is accepted, so the merge must be rejected.
• Try to merge 3 and 1. This requires merging 6 with {1,3}, then 2 with 10, then 4 with 13, and finally 7 with 15. No counter-example is accepted, so the merges are kept.
• Next possible merge to be checked is {4,13} with {1,3,6}. More merging for determinization is needed, but now aa is accepted: rejected.
• So we try {4,13} with {2,10}. After determinizing, the negative string aa is again accepted: rejected.
• So we try 5 with {1,3,6}. But again ab is accepted: rejected.
• So we try 5 with {2,10}, which is OK.
• Next possible merge is {7,15} with {1,3,6}, which is OK.
• Now try to merge {8,12} with {1,3,6,7,15}. But ab is accepted: rejected.
• Now try to merge {8,12} with {4,9,13}. This is OK and no more merge is possible, so the algorithm halts with states {1,3,6,7,11,14,15}, {2,5,10} and {4,8,9,12,13}.
11ICML 2006, Grammatical Inference117
Definitions
• Let ≤ be the length-lex ordering over Σ*.
• Let Pref(L) be the set of all prefixes of strings in some language L.
11ICML 2006, Grammatical Inference118
Short prefixes
Sp(L) = {u∈Pref(L): δ(q0,u)=δ(q0,v) ⇒ u≤v}
• There is one short prefix per useful state.
(figure: a three-state DFA with Sp(L) = {λ, a})
11ICML 2006, Grammatical Inference119
Kernel-sets
• N(L) = {ua∈Pref(L): u∈Sp(L)} ∪ {λ}
• There is an element in the kernel-set for each useful transition.
(figure: the same DFA with N(L) = {λ, a, b, ab})
12ICML 2006, Grammatical Inference120
A characteristic sample
• A sample is characteristic (for RPNI) if
– ∀x∈Sp(L), ∃u: xu∈X+
– ∀x∈Sp(L), ∀y∈N(L): δ(q0,x)≠δ(q0,y) ⇒ ∃z∈Σ*: (xz∈X+ ∧ yz∈X-) ∨ (xz∈X- ∧ yz∈X+)
12ICML 2006, Grammatical Inference121
About characteristic samples
• If you add more strings to a characteristic sample, it is still characteristic;
• There can be many different characteristic samples;
• Change the ordering (or the exploring function in RPNI) and the characteristic sample will change.
12ICML 2006, Grammatical Inference122
Conclusion
• RPNI identifies any regular language in the limit;
• RPNI works in polynomial time: complexity is in O(║X+║³·║X-║);
• There are many significant variants of RPNI;
• RPNI can be extended to other classes of grammars.
12ICML 2006, Grammatical Inference123
Open problems
• RPNI’s complexity is not a tight upper bound. Find the correct complexity.
• The definition of the characteristic set is not tight either. Find a better definition.
12ICML 2006, Grammatical Inference124
Algorithms
• RPNI
• K-Reversible
• GRIDS
• SEQUITUR
• L*
12ICML 2006, Grammatical Inference125
4.2 The k-reversible languages
• The class was proposed by Angluin (1982).
• The class is identifiable in the limit from text.
• The class consists of the regular languages that can be accepted by a DFA whose reversal is deterministic with a look-ahead of k.
12ICML 2006, Grammatical Inference126
Let A = (Σ, Q, δ, I, F) be an NFA. We denote by AT = (Σ, Q, δT, F, I) the reversal automaton, with:
δT(q,a) = {q'∈Q: q∈δ(q',a)}
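The reversal automaton AT can be computed directly from this definition. A sketch where A is given as (delta, initials, finals), with delta mapping (state, symbol) to a set of states (the example automaton below is an assumption for illustration):

```python
def reverse_nfa(delta, initials, finals):
    """delta: {(q, a): set_of_states}. Returns (delta_T, I_T, F_T)."""
    delta_t = {}
    for (q, a), targets in delta.items():
        for q2 in targets:
            # q2 in delta(q, a)  <=>  q in delta_T(q2, a)
            delta_t.setdefault((q2, a), set()).add(q)
    return delta_t, set(finals), set(initials)

# A small NFA: 0 -a-> 1, 1 -b-> 0, initial {0}, final {1}
delta = {(0, 'a'): {1}, (1, 'b'): {0}}
dt, i_t, f_t = reverse_nfa(delta, {0}, {1})
print(dt)  # {(1, 'a'): {0}, (0, 'b'): {1}}
```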
12ICML 2006, Grammatical Inference127
(figure: a five-state NFA A over {a, b} and its reversal AT)
12ICML 2006, Grammatical Inference128
Some definitions
• u is a k-successor of q if |u| = k and δ(q,u) ≠ ∅.
• u is a k-predecessor of q if |u| = k and δT(q,uT) ≠ ∅.
• λ is a 0-successor and a 0-predecessor of any state.
12ICML 2006, Grammatical Inference129
(figure: the five-state automaton A from above)
• aa is a 2-successor of 0 and 1 but not of 3.
• a is a 1-successor of 3.
• aa is a 2-predecessor of 3 but not of 1.
13ICML 2006, Grammatical Inference130
An NFA is deterministic with look-ahead k iff ∀q,q'∈Q with q ≠ q':
(q,q'∈I) ∨ (∃q'',a: q,q'∈δ(q'',a))
⇒
q and q' share no k-successor: (u is a k-successor of q) ∧ (v is a k-successor of q') ⇒ u ≠ v.
13ICML 2006, Grammatical Inference131
Prohibited:
(figure: two distinct states 1 and 2, both reached from the same state by the same symbol a and both with the same k-successor u, |u| = k)
13ICML 2006, Grammatical Inference132
Example
This automaton is not deterministic with look-ahead 1, but is deterministic with look-ahead 2.
(figure: the five-state NFA from above)
13ICML 2006, Grammatical Inference133
K-reversible automata
• A is k-reversible if A is deterministic and AT is deterministic with look-ahead k.
• Example
(figure: a three-state automaton that is deterministic, and whose reversal is deterministic with look-ahead 1)
13ICML 2006, Grammatical Inference134
Violation of k-reversibility
• Two states q, q' violate the k-reversibility condition iff
– they violate the determinism condition: q,q'∈δ(q'',a);
or
– they violate the look-ahead condition:
• q,q'∈F and ∃u∈Σk: u is a k-predecessor of both;
• ∃u∈Σk, ∃a: δ(q,a) = δ(q',a) and u is a k-predecessor of both q and q'.
13ICML 2006, Grammatical Inference135
Learning k-reversible automata
• Key idea: the order in which the merges are performed does not matter!
• Just merge states that do not comply with the conditions for k-reversibility.
13ICML 2006, Grammatical Inference136
K-RL Algorithm (φk-RL)
Data: k∈ℕ, X a sample of a k-RL L
A = PTA(X)
while ∃q,q' k-reversibility violators do
    A = merge(A, q, q')
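The loop can be made runnable for the simplest case k = 0 (Angluin's zero-reversible learner): build the PTA, then merge violators (more than one final state, two a-successors of one state, or two a-predecessors of one state) until none remain. A sketch using union-find rather than explicit merges:

```python
def zr_learn(positives):
    """Zero-reversible (k = 0) learner: PTA plus violator merging."""
    # --- prefix tree acceptor ---
    trans, finals, fresh = {0: {}}, set(), 1
    for w in positives:
        q = 0
        for c in w:
            if c not in trans[q]:
                trans[q][c] = fresh
                trans[fresh] = {}
                fresh += 1
            q = trans[q][c]
        finals.add(q)
    edges = [(s, c, t) for s in trans for c, t in trans[s].items()]

    parent = list(range(fresh))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    def union(a, b):
        a, b = find(a), find(b)
        if a != b:
            parent[b] = a
            return True
        return False

    changed = True
    while changed:                                # repeat until fixpoint
        changed = False
        reps = sorted({find(x) for x in finals})  # >1 final state: merge them
        for r in reps[1:]:
            changed |= union(reps[0], r)
        succ, pred = {}, {}
        for s, c, t in edges:
            k = (find(s), c)                      # two c-successors of one state
            if k in succ and find(succ[k]) != find(t):
                union(succ[k], t); changed = True
            succ[k] = find(t)
            k = (find(t), c)                      # two c-predecessors of one state
            if k in pred and find(pred[k]) != find(s):
                union(pred[k], s); changed = True
            pred[k] = find(s)
    merged = {}
    for s, c, t in edges:
        merged.setdefault(find(s), {})[c] = find(t)
    return merged, find(0), {find(x) for x in finals}

def accepts0(aut, w):
    trans, q, finals = aut
    for c in w:
        if c not in trans.get(q, {}):
            return False
        q = trans[q][c]
    return q in finals

aut = zr_learn(['', 'ab', 'abab'])
print(accepts0(aut, 'ababab'))  # True: (ab)* is the smallest
print(accepts0(aut, 'ba'))      # False   0-reversible language learned
```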
13ICML 2006, Grammatical Inference137
Let X = {a, aa, abba, abbbba}, k = 2
(figure: the prefix tree acceptor of X; two states are violators, with common 2-predecessor u = ba)
13ICML 2006, Grammatical Inference138
(figure: after the merge, two further states violate 2-reversibility, with common 2-predecessor u = bb)
13ICML 2006, Grammatical Inference139
(figure: the automaton after both merges; no violators remain)
14ICML 2006, Grammatical Inference140
Properties (1)
• ∀k≥0, ∀X, φk-RL(X) is a k-reversible automaton.
• L(φk-RL(X)) is the smallest k-reversible language that contains X.
• The class Lk-RL is identifiable in the limit from text.
14ICML 2006, Grammatical Inference141
Properties (2)
• Any regular language L is k-reversible iff
(u1v)⁻¹L ∩ (u2v)⁻¹L ≠ ∅ and |v| = k ⇒ (u1v)⁻¹L = (u2v)⁻¹L
(if two strings are prefixes of a string of length at least k, then the strings are