
Computational Models - Lecture 3

Handout Mode

Roded Sharan.

Tel Aviv University.

March, 2019

Roded Sharan (TAU) Computational Models, Lecture 3 March, 2019 1 / 35

Computational Models - Lecture 3

- Non-regular languages: two approaches

  1. Pumping Lemma
  2. Myhill-Nerode Theorem (not in Sipser’s book)

- Closure properties

- Algorithmic questions for NFAs

- Sipser, 1.4, 2.1, 2.2

- Hopcroft and Ullman, 3.4


What DFAs cannot do

Is there a DFA that accepts each of the following languages (over {0,1})?

- B = {0^n 1^n : n ≥ 0}
- C = {w : #1(w) = #0(w)}
- D = {w : #01(w) = #10(w)}

#s(w) – the number of times s appears in w.

Consider B:

- A DFA must “remember” how many 0’s it has seen.
- This is impossible with finitely many states.

The other languages seem to be exactly the same...

Question: Is this a proof?

Answer: No, D is regular.
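One way to see why D is regular (an observation that is not on the slide): #01(w) = #10(w) holds exactly when w is empty or its first and last symbols agree, and a DFA can check that condition. The sketch below tests this claim by brute force on all short strings; the function names are illustrative.

```python
from itertools import product

def count_occurrences(w: str, s: str) -> int:
    """Number of (possibly overlapping) occurrences of s in w."""
    return sum(1 for i in range(len(w) - len(s) + 1) if w.startswith(s, i))

def in_D(w: str) -> bool:
    """Membership in D = {w : #01(w) = #10(w)}."""
    return count_occurrences(w, "01") == count_occurrences(w, "10")

def first_equals_last(w: str) -> bool:
    """The finite-state condition: w is empty or starts and ends with the same symbol."""
    return w == "" or w[0] == w[-1]

def agree_up_to(max_len: int) -> bool:
    """Compare both predicates on every binary string of length <= max_len."""
    return all(in_D("".join(t)) == first_equals_last("".join(t))
               for n in range(max_len + 1)
               for t in product("01", repeat=n))
```

A brute-force check is evidence rather than a proof. The proof is that the 01 and 10 blocks alternate along w, so their counts differ by at most one, and are equal exactly when w ends on the symbol it started with.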


Part I

Pumping Lemma


Regular languages can be pumped

For every regular language L, there exists ℓ > 0, the pumping length, s.t.:

Every s ∈ L of length at least ℓ can be “pumped” into a longer string in L.

This is a powerful technique for showing that a language is not regular.


The Pumping Lemma

Lemma 1

For every regular language L, there exists ℓ > 0 (the pumping length) s.t. every s ∈ L with |s| ≥ ℓ can be written as s = xyz with

1. x y^i z ∈ L for every i ≥ 0,

2. |y| > 0, and

3. |xy| ≤ ℓ.

Remarks: Without the second condition, the theorem would be trivial.

x and z may be empty.


Proving the Pumping Lemma

Let M = (Q, Σ, δ, q1, F) be a DFA accepting L, and let ℓ = |Q|.

Let s ∈ L with |s| ≥ ℓ, and consider the sequence of states M traverses as it reads s = s1 ... sn:

q1 --s1--> q20 --s2--> q9 --s3--> q17 --s4--> q12 --s5--> q13 --s6--> q9 --> ... --> q2 --sn--> q5 ∈ F

By the pigeonhole principle, at least one of the states in the above sequence repeats (here, q9).

s = xyz, where x = s1 s2 leads from q1 to q9, y = s3 ... s6 leads from q9 back to q9, and z is the rest of s.


Proving the Pumping Lemma, cont.

s = xyz

- By inspection, M accepts x y^k z for every k ≥ 0.
- |y| > 0, because the state q9 is repeated.
- To ensure that |xy| ≤ ℓ, pick the first state repetition, which must occur no later than ℓ + 1 states into the sequence.
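The proof is constructive, and the following minimal sketch mirrors it: run the DFA, stop at the first repeated state, and cut s there. The dictionary encoding of δ and the example DFA (an even number of 1's) are assumptions made for illustration.

```python
def pumping_split(delta, start, s):
    """Split s into (x, y, z) at the first repeated state of the DFA run on s."""
    seen = {start: 0}              # state -> number of symbols read when first visited
    state = start
    for pos, symbol in enumerate(s, start=1):
        state = delta[(state, symbol)]
        if state in seen:          # first repetition: y is the loop between the two visits
            i = seen[state]
            return s[:i], s[i:pos], s[pos:]
        seen[state] = pos
    return None                    # no repetition; only possible if len(s) < number of states

def accepts(delta, start, accepting, s):
    """Run the DFA on s and report whether it ends in an accepting state."""
    state = start
    for symbol in s:
        state = delta[(state, symbol)]
    return state in accepting

# Hypothetical example DFA: binary strings with an even number of 1's.
delta = {("e", "0"): "e", ("e", "1"): "o", ("o", "0"): "o", ("o", "1"): "e"}
x, y, z = pumping_split(delta, "e", "1001")
```

Because the split uses the first repetition, |xy| is at most the number of states, which is the third condition of the lemma.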


Application # 1

Corollary 2

B = {0^n 1^n : n ≥ 0} is not regular.

Proof: By contradiction. Suppose B is regular and let ℓ be its pumping length.

- Consider the string s = 0^ℓ 1^ℓ ∈ B.
- Let x, y, z be (one possible choice of) strings guaranteed by the pumping lemma (i.e., s = xyz):

  1. x y^i z ∈ B for every i ≥ 0,
  2. |y| > 0, and
  3. |xy| ≤ ℓ.

- If y is all 0’s, then x y^2 z has too many 0’s.
- If y is all 1’s, then x y^2 z has too many 1’s.
- If y is mixed, then x y^2 z is not of the right form.

We did not use the third property.

Application # 2

Corollary 3

C = {w : #1(w) = #0(w)} is not regular.

Proof: By contradiction. Suppose C is regular. Let ℓ be its pumping length.

- Consider the string s = 0^ℓ 1^ℓ ∈ C.
- Let x, y, z be a (possible) choice of strings guaranteed by the pumping lemma (i.e., s = xyz):

  1. x y^i z ∈ C for every i ≥ 0,
  2. |y| > 0, and
  3. |xy| ≤ ℓ.

- Since |xy| ≤ ℓ, the string y is all 0’s.
- Thus, x y^2 z ∉ C (more 0’s than 1’s).
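The argument quantifies over every legal split of s, which can be checked mechanically for small ℓ. The sketch below enumerates all s = xyz with |y| > 0 and |xy| ≤ ℓ and tests a few pumped strings; the helper names and the bound max_i are illustrative assumptions.

```python
def in_C(w: str) -> bool:
    """Membership in C = {w : #1(w) = #0(w)}."""
    return w.count("0") == w.count("1")

def some_split_pumps(s: str, ell: int, member, max_i: int = 3) -> bool:
    """True if some split s = xyz with |y| > 0 and |xy| <= ell keeps
    x y^i z in the language for every i in 0..max_i."""
    for end in range(1, min(ell, len(s)) + 1):   # end = |xy|
        for begin in range(end):                 # begin = |x|, so |y| = end - begin > 0
            x, y, z = s[:begin], s[begin:end], s[end:]
            if all(member(x + y * i + z) for i in range(max_i + 1)):
                return True
    return False

# For each small ell, no split of 0^ell 1^ell survives pumping.
no_split_works = all(not some_split_pumps("0" * n + "1" * n, n, in_C)
                     for n in range(1, 7))
```

This only confirms the claim for the values of ℓ tried; the proof above covers every ℓ at once.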


Application # 3

Corollary 4

E = {0^i 1^j : i > j} is not regular.

Proof: By contradiction. Suppose E is regular. Let ℓ be its pumping length.

- Consider the string s = 0^ℓ 1^(ℓ−1) ∈ E.
- By the pumping lemma, s = xyz, where x y^k z ∈ E for every k ≥ 0, |y| > 0 and |xy| ≤ ℓ.
- Since |xy| ≤ ℓ, y consists only of 0’s. But then x y^0 z = xz ∉ E (at least as many 1’s as 0’s).


Application # 4

Corollary 5

The language Primes ⊂ 1∗ – all strings whose length is a prime number – is not regular.

Proof: Suppose Primes is regular, and let ℓ be its pumping length.

- Let s = 1^p ∈ Primes, where p ≥ ℓ is a prime (such a p exists because there are infinitely many primes).
- By the pumping lemma, s = xyz, where x y^k z ∈ Primes for every k ≥ 0 and |y| > 0.
- Let |y| = m. Hence, x y^(p+1) z = 1^(p+mp) ∈ Primes,

but p + mp = p(m + 1) is not a prime, since p ≥ 2 and m + 1 ≥ 2.


Another Example

Consider the language L = {a^i b^n c^n : n ≥ 0, i ≥ 1} ∪ {b^n c^m : n, m ≥ 0}.

Any non-empty s ∈ L can be pumped:

- If s = a^i b^n c^n with i > 0, then set x = ε and y = a.
- If s = b^n c^m with n > 0, then set x = ε and y = b.
- If s = c^m with m > 0, then set x = ε and y = c.

(In all cases z is set to the remaining suffix.)

- Is L regular? No.
- How can we prove it?


Part II

Characterization of Regular Languages


The equivalence relation ∼_L

Definition 6

For L ⊆ Σ∗, define the equivalence relation ∼_L over words in Σ∗ by: x ∼_L y if for every z ∈ Σ∗, it holds that xz ∈ L ⇐⇒ yz ∈ L.

It is easy to see that ∼_L is indeed an equivalence relation (reflexive, symmetric, transitive) on Σ∗.

Hence, ∼_L partitions Σ∗ into equivalence classes.

For x ∈ Σ∗, let [x] ⊆ Σ∗ denote its equivalence class with respect to ∼_L.

How many equivalence classes does ∼_L induce? Finitely or infinitely many?

Could be either (depends on L).


Three examples

- L1 = {w : #1(w) mod 5 = 0}

  ∼_L1 has finitely many equivalence classes.

  The equivalence classes are: [ε], [1], [11], [111], [1111].

  Proof:

  - Classes cover {0,1}∗: for any x ∈ {0,1}∗, x ∼_L1 1^(#1(x) mod 5), since for every z:

    xz ∈ L1 ⇐⇒ #1(xz) mod 5 = 0 ⇐⇒ ((#1(x) mod 5) + #1(z)) mod 5 = 0 ⇐⇒ 1^(#1(x) mod 5) z ∈ L1.

  - Classes are disjoint: 1^i ≁_L1 1^j for i ≠ j ∈ {0,1,2,3,4}.

- L2 = {0^n 1^n : n ∈ N}

  ∼_L2 has infinitely many equivalence classes.

  [0] ≠ [0^2] ≠ [0^3] ≠ ... (pairwise distinct: for i ≠ j, 0^i 1^i ∈ L2 but 0^j 1^i ∉ L2).

- L3 = {a^i b^n c^n : n ≥ 0, i ≥ 1} ∪ {b^n c^m : n, m ≥ 0}

  ∼_L3 has infinitely many equivalence classes.

  [ab] ≠ [ab^2] ≠ [ab^3] ≠ ... (pairwise distinct: for i ≠ j, ab^i c^i ∈ L3 but ab^j c^i ∉ L3).
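Two words lie in different classes exactly when some suffix z separates them. The sketch below searches for a shortest such z by brute force; it assumes a membership predicate for the language and a length bound max_len, so a result of None only means that no short separating suffix exists.

```python
from itertools import product

def in_L2(w: str) -> bool:
    """Membership in L2 = {0^n 1^n : n >= 0}."""
    n = len(w) // 2
    return w == "0" * n + "1" * n

def distinguishing_suffix(x, y, member, alphabet="01", max_len=6):
    """Return a shortest z (of length <= max_len) such that exactly one of
    xz, yz is in the language, or None if no such z is found."""
    for n in range(max_len + 1):
        for t in product(alphabet, repeat=n):
            z = "".join(t)
            if member(x + z) != member(y + z):
                return z
    return None
```

For example, `distinguishing_suffix("0", "00", in_L2)` finds the suffix `"1"`, which witnesses [0] ≠ [0^2].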


Myhill-Nerode Theorem

Theorem 7 (Myhill-Nerode Theorem)

L ⊆ Σ∗ is regular iff ∼_L has finitely many equivalence classes.

Hence

- L1 = {w ∈ {0,1}∗ : #1(w) mod 5 = 0} is regular.
- L2 = {0^n 1^n : n ∈ N} is not regular.
- L3 = {a^i b^n c^n : n ≥ 0, i ≥ 1} ∪ {b^n c^m : n, m ≥ 0} is not regular.


Right invariance

Fact 8 (right invariance)

If x ∼_L y, then xw ∼_L yw for every w ∈ Σ∗.

Proof: For every z ∈ Σ∗: (xw)z ∈ L ⇐⇒ x(wz) ∈ L ⇐⇒ y(wz) ∈ L ⇐⇒ (yw)z ∈ L.


Proving Myhill-Nerode Theorem =⇒

Let L be a regular language and let M = (Q, Σ, δ, q0, F) be a DFA accepting it.

- Define the binary relation ∼_M by x ∼_M y if δ̂(q0, x) = δ̂(q0, y).
- ∼_M is an equivalence relation.
- x ∼_M y =⇒ xz ∼_M yz for every z ∈ Σ∗

  =⇒ xz ∈ L iff yz ∈ L.

- Hence, x ∼_M y =⇒ x ∼_L y.
- Each equivalence class of ∼_L is a union of classes of ∼_M. Namely, ∼_M is a refinement of ∼_L.
- Specifically, the number of equivalence classes of ∼_L is at most the number of equivalence classes of ∼_M.
- ∼_M has finitely many equivalence classes (at most |Q|, one per reachable state).
- Therefore, ∼_L has finitely many equivalence classes. ♠


Proving Myhill-Nerode theorem ⇐=

Assume ∼_L has finitely many equivalence classes and let x1, ..., xn ∈ Σ∗ be their representatives.

We’ll construct a DFA M = (Q, Σ, δ, q0, F) that accepts L.

For y ∈ Σ∗, let C(y) be the index i ∈ {1, ..., n} with y ∈ [xi].

- Q = {1, ..., n}.
- δ(i, σ) = C(xi σ).
- q0 = C(ε).
- F = {i : xi ∈ L}.

Claim. For any y ∈ Σ∗: δ̂(q0, y) = C(y).

Hence, y ∈ L(M) ⇐⇒ δ̂(q0, y) ∈ F ⇐⇒ C(y) ∈ F ⇐⇒ x_C(y) ∈ L ⇐⇒ y ∈ L.

This is the optimal DFA, number-of-states-wise, for L (by the =⇒ direction, any DFA for L has at least n states).

Proof (of claim): By induction on word length. Base case: δ̂(q0, ε) = q0 = C(ε) by definition.

- Let y = wσ ∈ Σ∗, and assume w ∈ [xi] and y ∈ [xj].
- δ(i, σ) := C(xi σ) = (by right invariance) C(wσ) = C(y) = j.

  =⇒ δ̂(q0, wσ) = δ(δ̂(q0, w), σ) = (by i.h.) δ(i, σ) = j.

Example

Construct a DFA for {w ∈ {0,1}∗ : #1(w) ≡ 0 (mod 5)}, via the latter proof method.

- Equivalence class representatives: {ε, 1, 11, 111, 1111}
- Q = {0, 1, 2, 3, 4}
- q0 = 0
- F = {0}
- δ(i, 0) = C(1^i 0) = i and δ(i, 1) = C(1^i 1) = i + 1 (mod 5)
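The construction can be carried out directly in code. This is a minimal sketch for this particular language: the class-index function C is hard-coded as #1(y) mod 5, which is specific to this example, and the dictionary encoding of δ is an assumption.

```python
def in_L1(w: str) -> bool:
    """Membership in {w : #1(w) = 0 (mod 5)}."""
    return w.count("1") % 5 == 0

representatives = ["", "1", "11", "111", "1111"]   # x_0, ..., x_4

def class_index(y: str) -> int:
    """C(y): index of the class containing y (for this language, #1(y) mod 5)."""
    return y.count("1") % 5

# delta(i, sigma) = C(x_i sigma), q0 = C(epsilon), F = {i : x_i in L}
delta = {(i, a): class_index(x + a)
         for i, x in enumerate(representatives) for a in "01"}
q0 = class_index("")
accepting = {i for i, x in enumerate(representatives) if in_L1(x)}

def run(w: str) -> bool:
    """Run the constructed DFA on w."""
    state = q0
    for a in w:
        state = delta[(state, a)]
    return state in accepting
```

The resulting table has δ(i, 0) = i and δ(i, 1) = i + 1 (mod 5), matching the slide.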


Part III

Closure Properties of Regular Languages


Simple closure properties

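Write L^c = Σ∗ \ L for the complement of L.

- Regular languages are closed under complement.

  1. Let M = (Q, Σ, δ, q0, F) be a DFA that accepts L.
  2. Then M′ = (Q, Σ, δ, q0, Q \ F) is a DFA that accepts L^c = Σ∗ \ L.
  3. NFA ?! (Swapping accepting and non-accepting states of an NFA does not, in general, yield the complement.)

- Regular languages are closed under intersection.

  1. By De Morgan, L1 ∩ L2 = (L1^c ∪ L2^c)^c.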
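Both closure arguments translate into short DFA manipulations. In the sketch below a DFA is a tuple (states, alphabet, delta, start, accepting); that encoding, the product construction used for union, and the two example DFAs are assumptions for illustration, not part of the lecture.

```python
from itertools import product as cartesian

def accepts(dfa, w):
    states, alphabet, delta, start, accepting = dfa
    q = start
    for a in w:
        q = delta[(q, a)]
    return q in accepting

def complement(dfa):
    """Same DFA with accepting and non-accepting states swapped."""
    states, alphabet, delta, start, accepting = dfa
    return (states, alphabet, delta, start, states - accepting)

def union(d1, d2):
    """Product DFA for the union; both DFAs must share the alphabet."""
    s1, alphabet, t1, q1, f1 = d1
    s2, _, t2, q2, f2 = d2
    states = set(cartesian(s1, s2))
    delta = {((p, q), a): (t1[(p, a)], t2[(q, a)])
             for (p, q) in states for a in alphabet}
    accepting = {(p, q) for (p, q) in states if p in f1 or q in f2}
    return (states, alphabet, delta, (q1, q2), accepting)

def intersection(d1, d2):
    """De Morgan: complement of (complement of L1, union complement of L2)."""
    return complement(union(complement(d1), complement(d2)))

# Hypothetical examples: even number of 1's, and strings ending in 0.
even_ones = ({"e", "o"}, "01",
             {("e", "0"): "e", ("e", "1"): "o", ("o", "0"): "o", ("o", "1"): "e"},
             "e", {"e"})
ends_zero = ({"n", "z"}, "01",
             {("n", "0"): "z", ("n", "1"): "n", ("z", "0"): "z", ("z", "1"): "n"},
             "n", {"z"})
both = intersection(even_ones, ends_zero)
```

Complementation by swapping accepting states relies on the automaton being deterministic and complete, which is why the sketch works with DFAs only.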


Division

For languages L1,L2 ⊆ Σ∗, define

L1/L2 = {x ∈ Σ∗ : ∃y ∈ L2, xy ∈ L1}

Examples:

- L1 = {abc, dec, gg} and L2 = {c}. Then L1/L2 = {ab, de}.
- L1 = L((01 ∪ 1)∗) and L2 = L(00). Then L1/L2 = ∅.
- L3 = L(a∗b∗c∗) and L4 = L(b). Then L3/L4 = L(a∗b∗).


Closure under division

Recall, L1/L2 = {x : ∃y ∈ L2, xy ∈ L1}

Theorem 9

Regular languages are closed under division with any language: L1 is regular =⇒ L1/L2 is regular.

Proof: Let L1 be a regular language and let L2 be an arbitrary language.

- ∼_L1 is a refinement of ∼_(L1/L2). Proof: Assume x ∼_L1 y. For z ∈ Σ∗:

  xz ∈ L1/L2 ⇐⇒ ∃w ∈ L2: xzw ∈ L1 ⇐⇒ ∃w ∈ L2: yzw ∈ L1 ⇐⇒ yz ∈ L1/L2.

  =⇒ x ∼_(L1/L2) y.

- Hence, ∼_(L1/L2) has a finite number of equivalence classes, and thus L1/L2 is regular.

Another proof:

- Let M = (Q, Σ, δ, q0, F) be a DFA for L1.
- Let F′ = {q ∈ Q : ∃y ∈ L2, δ̂(q, y) ∈ F}.
- The DFA M′ = (Q, Σ, δ, q0, F′) accepts L1/L2.

F′ is well defined, but might be hard to compute – a “non-constructive proof”.
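When L2 happens to be a finite set, F′ can be computed by direct search. The sketch below makes that simplifying assumption (the theorem itself allows arbitrary L2) and uses a hypothetical four-state DFA for L(a∗b∗c∗), with D as the dead state.

```python
def run_from(delta, q, w):
    """Extended transition function: the state reached from q after reading w."""
    for a in w:
        q = delta[(q, a)]
    return q

def quotient_accepting(states, delta, accepting, L2):
    """F' = {q : some y in L2 leads from q into F}; computable here because L2 is finite."""
    return {q for q in states
            if any(run_from(delta, q, y) in accepting for y in L2)}

# Hypothetical DFA for a*b*c*: rows list the targets on a, b, c.
states = {"A", "B", "C", "D"}
delta = {}
for q, row in {"A": "ABC", "B": "DBC", "C": "DDC", "D": "DDD"}.items():
    for a, target in zip("abc", row):
        delta[(q, a)] = target
accepting = {"A", "B", "C"}

new_accepting = quotient_accepting(states, delta, accepting, {"b"})
```

With L2 = {b}, the new accepting set is {A, B}, so the modified DFA accepts L(a∗b∗), in agreement with the division example.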

Homomorphism

Definition 10 (Homomorphism)

A homomorphism from alphabet ∆ to words over alphabet Σ is a function h : ∆ → Σ∗.

For w = w1 ... wn ∈ ∆∗, let h(w) = h(w1) ··· h(wn).

For L ⊆ ∆∗, let h(L) = {h(w) : w ∈ L}.

By definition, h(ε) = ε and h(∅) = ∅.

Examples:

- Let h : {0,1} → {a,b}∗ be defined by h(1) = aba and h(0) = aa.

  h(010) = aa aba aa. For L1 = (01)∗, h(L1) = (aaaba)∗.

- Let h(0) = a, h(1) = a. For L2 = {0^n 1^n : n ≥ 0}, h(L2) = {a^(2n) : n ≥ 0}.
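Applying a homomorphism is letter-by-letter substitution. A minimal sketch, assuming h is given as a dictionary from symbols to strings; the function names are illustrative.

```python
def apply_hom(h: dict, w: str) -> str:
    """h(w) = h(w1) ... h(wn); the empty word maps to the empty word."""
    return "".join(h[c] for c in w)

def image(h: dict, words) -> set:
    """h applied to a finite sample of a language."""
    return {apply_hom(h, w) for w in words}

h1 = {"0": "aa", "1": "aba"}
h2 = {"0": "a", "1": "a"}
sample = image(h2, ["0" * n + "1" * n for n in range(4)])
```

Here `apply_hom(h1, "010")` gives `"aaabaaa"`, and `sample` contains the words a^(2n) for n = 0, 1, 2, 3.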


Closure under homomorphism

Theorem 11

Regular languages are closed under homomorphism.

Proof idea: Using regular expressions.

Let L ⊆ ∆∗ be a regular language and let h : ∆ → Σ∗.

1. h(∅) = ∅, h({ε}) = {ε}, and h({a}) = {h(a)} for any a ∈ ∆.

2. h(L1 ∪ L2) = h(L1) ∪ h(L2), h(L1‖L2) = h(L1)‖h(L2) (concatenation) and h(L∗) = h(L)∗.

Let R be a regular expression with L = L(R). The proof is by induction on |R|.

- |R| = 1. By item (1), h(L) = h(L(R)) is regular.
- |R| > 1. Assume R = (R1 ∪ R2) (other cases are similar).
  - By item (2), h(L) = h(L(R1) ∪ L(R2)) = h(L(R1)) ∪ h(L(R2)).
  - By i.h., h(L(R1)) and h(L(R2)) are regular.

Thus, h(L) is regular.


Closure under homomorphism, proof using Automata

Let M = (Q, ∆, δ, q0, F) be a DFA for L.

Define an NFA N = (Q′, Σ, δ′, q0, F) for h(L) as follows:

- If δ(qi, σ) = qj and h(σ) = w1 ... wt, then d^(k+1)_iσ ∈ δ′(d^k_iσ, wk) for all k ∈ {1, ..., t}, letting d^1_iσ = qi and d^(t+1)_iσ = qj, where d^2_iσ, ..., d^t_iσ are new states. (If h(σ) = ε, i.e., t = 0, add an ε-transition from qi to qj.)
- Q′ includes Q and all new states.

Claim 12

L(N) = h(L).

Proof idea:

h(L) ⊆ L(N): w ∈ L =⇒ ∃r1, ..., r_(|w|+1) s.t. r1 = q0, r_(|w|+1) ∈ F and r_(i+1) = δ(ri, wi). ... =⇒ h(w) ∈ L(N).

L(N) ⊆ h(L): w ∈ L(N) =⇒ ∃r1, ..., r_(|w|+1) s.t. r1 = q0, r_(|w|+1) ∈ F and r_(i+1) ∈ δ′(ri, wi). ... =⇒ ∃w′ ∈ L with h(w′) = w.


Inverse homomorphism

Definition 13 (Inverse homomorphism)

For a homomorphism h : ∆ → Σ∗, define its inverse homomorphism h⁻¹ : Σ∗ → P(∆∗) by h⁻¹(w) = {x ∈ ∆∗ : h(x) = w}.

For L ⊆ Σ∗, let h⁻¹(L) = ⋃_(w∈L) h⁻¹(w) = {x ∈ ∆∗ : h(x) ∈ L}.

- Example: h(0) = a, h(1) = b and h(2) = a. Then h⁻¹({a^n b a^n : n ≥ 0}) = {u 1 v : u, v ∈ {0,2}^n, n ≥ 0}.

For any h : ∆ → Σ∗:

Claim 14

h(h⁻¹(L)) ⊆ L, for any L ⊆ Σ∗.

Proof: Immediate.

Claim 15

L ⊆ h⁻¹(h(L)), for any L ⊆ ∆∗.

Proof: Holds since w ∈ h⁻¹(h(w)) for any w ∈ ∆∗.


Closure under inverse homomorphism

Theorem 16

Regular languages are closed under inverse homomorphism.

Proof idea: Let L be a regular language, let M be a DFA for L and let h : ∆ → Σ∗.

- w ∈ h⁻¹(L) ⇐⇒ h(w) ∈ L(M).
- Hence, to decide whether w ∈ h⁻¹(L), simply emulate M on h(w)...

How do we prepare the input for such emulation?


Closure under inverse homomorphism, DFA definition

Given a DFA M = (Q, Σ, δ, q0, F) and a homomorphism h : ∆ → Σ∗, define the DFA M′ = (Q, ∆, δ′, q0, F) by

δ′(q, a) = δ̂(q, h(a))

It is easy to verify that δ̂′(q, w) = δ̂(q, h(w)).

Hence, w ∈ L(M′) ⇐⇒ h(w) ∈ L(M) ⇐⇒ w ∈ h⁻¹(L).
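The definition of δ′ is a one-line table computation. In this minimal sketch the DFA for L(a∗ba∗) and the dictionary encodings are assumptions for illustration; the homomorphism is the one from the earlier example (h(0) = a, h(1) = b, h(2) = a).

```python
def run_from(delta, q, w):
    """Extended transition function: the state reached from q after reading w."""
    for a in w:
        q = delta[(q, a)]
    return q

def inverse_hom_dfa(delta, states, h):
    """delta'(q, a) = delta_hat(q, h(a)) for every state q and source symbol a."""
    return {(q, a): run_from(delta, q, h[a]) for q in states for a in h}

# Hypothetical DFA for a*ba*: S = no b yet, T = exactly one b (accepting), D = dead.
delta = {("S", "a"): "S", ("S", "b"): "T",
         ("T", "a"): "T", ("T", "b"): "D",
         ("D", "a"): "D", ("D", "b"): "D"}
h = {"0": "a", "1": "b", "2": "a"}
delta_prime = inverse_hom_dfa(delta, {"S", "T", "D"}, h)
```

The new automaton keeps the states, start state and accepting states of M; here it accepts exactly the words over {0,1,2} that contain a single 1.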


Part IV

Algorithmic Questions for NFAs


Algorithmic Questions for NFAs

Q: Given an NFA, N, and a string w , is w ∈ L(N)?

Answer: Construct the DFA equivalent to N and run it on w .

Q: Is L(N) = ∅?

Answer: This is a reachability question in graphs: Is there a path in the state graph of N from the start state to some accepting state?

There are simple, efficient algorithms for this task.
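Both answers can be sketched in a few lines. The membership test below tracks the set of states N could be in, which amounts to building only the part of the equivalent DFA that the run on w visits. The sketch assumes an NFA without ε-transitions, encoded as a dictionary from (state, symbol) to a set of states; the example NFA is hypothetical.

```python
from collections import deque

def nfa_accepts(delta, start, accepting, w):
    """Follow all runs of N on w at once by tracking the set of current states."""
    current = {start}
    for a in w:
        current = {r for q in current for r in delta.get((q, a), set())}
    return bool(current & accepting)

def nfa_is_empty(delta, start, accepting):
    """L(N) is empty iff no accepting state is reachable from the start state (BFS)."""
    seen, queue = {start}, deque([start])
    while queue:
        q = queue.popleft()
        if q in accepting:
            return False
        for (p, _symbol), targets in delta.items():
            if p == q:
                for r in targets:
                    if r not in seen:
                        seen.add(r)
                        queue.append(r)
    return True

# Hypothetical NFA for binary strings ending in 01.
delta = {(0, "0"): {0, 1}, (0, "1"): {0}, (1, "1"): {2}}
```

The emptiness test ignores the input symbols entirely, since only the shape of the state graph matters.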


More Questions

Q.: Is L(N) = Σ∗?

Answer: Check if the complement of L(N), Σ∗ \ L(N), is empty.

Q.: Given N1 and N2, is L(N1) ⊆ L(N2)?

Answer: Check if (Σ∗ \ L(N2)) ∩ L(N1) = ∅.

Q.: Given N1 and N2, is L(N1) = L(N2)?

Answer: Check if L(N1) ⊆ L(N2) and L(N2) ⊆ L(N1).

In the future, we will see that for stronger models of computation, many of these problems cannot be solved by any algorithm.
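The universality question can be sketched by exploring the subset construction and looking for a reachable subset with no accepting state, which is a state accepted by the complement DFA. The encoding (no ε-transitions, dictionary from (state, symbol) to a set of states) and the example NFAs are assumptions for illustration.

```python
from collections import deque

def is_universal(delta, start, accepting, alphabet):
    """L(N) = Sigma* iff every reachable subset (a state of the equivalent DFA)
    contains an accepting state of N."""
    first = frozenset({start})
    seen, queue = {first}, deque([first])
    while queue:
        subset = queue.popleft()
        if not (subset & accepting):
            return False                      # the complement accepts some string
        for a in alphabet:
            nxt = frozenset(r for q in subset for r in delta.get((q, a), set()))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

# Hypothetical NFAs: all binary strings, and binary strings ending in 01.
all_strings = {("p", "0"): {"p"}, ("p", "1"): {"p"}}
ends_01 = {(0, "0"): {0, 1}, (0, "1"): {0}, (1, "1"): {2}}
```

The number of reachable subsets can be exponential in the number of states of N, so this procedure is simple but not efficient in the worst case.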


Summary - Regular Languages

So far we saw

- Finite automata,
- Regular languages,
- Regular expressions,
- Myhill-Nerode theorem and pumping lemma for regular languages.

Next class we introduce stronger machines and languages with more expressive power:

- pushdown automata,
- context-free languages,
- context-free grammars.

