The Pumping Lemma for Regular Languages
Teodor Rus
The University of Iowa, Department of Computer Science
Computation Theory
Nonregular languages
Consider the language $B = \{0^n 1^n \mid n \ge 0\}$.
• If we attempt to find a DFA that recognizes $B$, we discover that such a machine needs to remember how many 0s have been seen so far as it reads the input;
• Because the number of 0s isn't limited, the machine needs to keep track of an unlimited number of possibilities;
• This cannot be done with any finite number of states.
Intuition may fail us
• Just because a language appears to require unbounded memory to be recognized, it doesn't mean that it necessarily does.
• Example:
  • $C = \{w \mid w \text{ has an equal number of 0s and 1s}\}$ is not regular;
  • $D = \{w \mid w \text{ has an equal number of occurrences of 01 and 10 as substrings}\}$ is regular.
Language nonregularity
• The technique for proving nonregularity of some languages is provided by a theorem about regular languages called the pumping lemma;
• The pumping lemma states that all regular languages have a special property;
• If we can show that a language $L$ does not have this property, we are guaranteed that $L$ is not regular.
Observation
The pumping lemma states that all regular languages have a special property.
Note: The pumping lemma does not state that only regular languages have this property. Hence: the property used to prove that a language $L$ is not regular cannot be used to prove that $L$ is regular.
Consequence: A language may be nonregular and still have the pumping property, so the lemma is a tool for proving nonregularity only.
Pumping property
All strings in the language can be "pumped" if they are at least as long as a certain special value, called the pumping length.
Meaning: each such string in the language contains a section that can be repeated any number of times with the resulting string remaining in the language.
Theorem 1.70
Pumping Lemma: If $A$ is a regular language, then there is a number $p$ (the pumping length) such that if $s$ is any string in $A$ of length at least $p$, then $s$ may be divided into three pieces, $s = xyz$, satisfying the following conditions:
1. for each $i \ge 0$, $xy^iz \in A$;
2. $|y| > 0$;
3. $|xy| \le p$.
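Written with explicit quantifiers, the lemma and the contrapositive used in nonregularity proofs read roughly as follows. This is only a restatement; the symbol $\mathbb{N}$ for the natural numbers is an added convention, not notation from the slides.

```latex
% Pumping lemma: a necessary condition for the regularity of A
A \text{ regular} \;\Longrightarrow\;
\exists p \in \mathbb{N}\;\; \forall s \in A,\ |s| \ge p\;\;
\exists x, y, z:\; s = xyz,\ |y| > 0,\ |xy| \le p,\
\forall i \ge 0:\ xy^iz \in A

% Contrapositive: the form used to show that A is not regular
\forall p \in \mathbb{N}\;\; \exists s \in A,\ |s| \ge p\;\;
\forall x, y, z \text{ with } s = xyz,\ |y| > 0,\ |xy| \le p\;\;
\exists i \ge 0:\ xy^iz \notin A
\;\Longrightarrow\; A \text{ not regular}
```

The alternation of quantifiers explains the shape of the proofs: we do not get to pick $p$ or the division $x, y, z$, only the string $s$ and the exponent $i$.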
Interpretation
• Recall that $|s|$ represents the length of string $s$, $y^i$ means $i$ copies of $y$ concatenated together, and $y^0 = \epsilon$;
• Either $x$ or $z$ may be $\epsilon$, but $y \ne \epsilon$;
• Without the condition $y \ne \epsilon$ the theorem would be trivially true ($x\epsilon^i z = xz$ for any $i \ge 0$);
• $|xy| \le p$ is an extra technical condition, useful when proving nonregularity: it guarantees that the pumped piece $y$ lies within the first $p$ symbols of $s$.
Proof idea
Let $M = (Q, \Sigma, \delta, q_1, F)$ be a DFA that recognizes $A$.
• Take the pumping length $p$ to be the number of states of $M$;
• Show that any string $s \in A$ with $|s| \ge p$ may be broken into three pieces $x, y, z$ satisfying the pumping lemma's conditions;
• If there are no strings in $A$ of length at least $p$, the theorem is vacuously true: the three conditions hold for all strings of length at least $p$ because there are no such strings.
More ideas
• If $s \in A$ and $|s| \ge p$, consider the sequence of states that $M$ goes through to accept $s$, for example $q_1, q_3, q_{20}, \ldots, q_{13}$;
• Since $M$ accepts $s$, $q_{13}$ must be an accept state; if $|s| = n$ then the sequence $q_1, q_3, q_{20}, \ldots, q_{13}$ has length $n + 1$;
• Because $|s| = n$ and $|s| \ge p$, it follows that $n + 1 > p$;
• By the pigeonhole principle (if $p$ pigeons are placed into fewer than $p$ holes, some hole must hold more than one pigeon), the sequence $q_1, q_3, q_{20}, \ldots, q_{13}$ must contain a repeated state; see Figure 1.
Recognition sequence
s = s1 s2 s3 s4 s5 s6 ..., with the state of M shown before and after each symbol:
q1 --s1--> q3 --s2--> q20 --s3--> q9 --s4--> q17 --s5--> q9 --s6--> q6 --> ... --> q35 --> q13
Figure 1: State $q_9$ repeats when $M$ reads $s$
Proof sketch
Divide $s$ into three pieces $x$, $y$, and $z$, where:
• Piece $x$ is the part of $s$ read before the first appearance of $q_9$;
• Piece $y$ is the part of $s$ read between the two appearances of $q_9$;
• Piece $z$ is the remaining part of $s$, after the second appearance of $q_9$.
That is:
1. $x$ takes $M$ from $q_1$ to $q_9$,
2. $y$ takes $M$ from $q_9$ to $q_9$, and
3. $z$ takes $M$ from $q_9$ to $q_{13}$.
Fact
The division specified above satisfies the three conditions of the pumping lemma:
1. $xy^iz$ is accepted by $M$ for any $i \ge 0$;
2. $|y| > 0$;
3. $|xy| \le p$.
Observations
Suppose that we run $M$ on $xyyz$:
• Condition 1: $x$ takes $M$ to $q_9$, each copy of $y$ takes $M$ from $q_9$ back to $q_9$, and $z$ takes $M$ from $q_9$ to $q_{13}$. So $M$ accepts $xyz$, $xyyz$, and in general $xy^iz$ for all $i > 0$. For $i = 0$, $xy^iz = xz$, which is also accepted because $z$ takes $M$ from $q_9$ to $q_{13}$, which is an accept state.
• Condition 2: Since $|s| \ge p$, some state (here $q_9$) is repeated. Because $y$ is the part of $s$ read between two distinct occurrences of $q_9$, $y$ consists of at least one symbol. That is, $|y| > 0$.
• Condition 3: Choose $q_9$ so that its second occurrence is the first state repetition in the sequence. If $|xy| > p$, then by the pigeonhole principle the first $p + 1$ states in the sequence, which are all visited before $xy$ has been read completely, would already contain a repetition, so $q_9$ would not be the first repetition. Therefore, $|xy| \le p$.
Pumping lemma’s proof
Let $M = (Q, \Sigma, \delta, q_1, F)$ be a DFA recognizing $A$ and let $p$ be the number of states of $M$. Consider a string $s = s_1 s_2 \ldots s_n \in A$ with $n \ge p$, and let $r_1, r_2, \ldots, r_{n+1}$ be the sequence of states $M$ enters while processing $s$, i.e., $r_{i+1} = \delta(r_i, s_i)$ for $1 \le i \le n$.
• Since $n + 1 \ge p + 1$, the first $p + 1$ elements of $r_1, r_2, \ldots, r_{n+1}$ must contain two states that coincide, say $r_j = r_k$ with $j < k$;
• Because $r_k$ occurs among the first $p + 1$ places in the sequence starting at $r_1$, we have $k \le p + 1$;
• Now let $x = s_1 \ldots s_{j-1}$, $y = s_j \ldots s_{k-1}$, $z = s_k \ldots s_n$.
Facts
• As $x$ takes $M$ from $r_1$ to $r_j$, $y$ takes $M$ from $r_j$ to $r_j$, and $z$ takes $M$ from $r_j$ to $r_{n+1}$, which is an accept state, $M$ must accept $xy^iz$ for $i \ge 0$;
• We know that $j \ne k$, so $|y| > 0$;
• We also know that $k \le p + 1$, so $|xy| = k - 1 \le p$.
Thus, all conditions are satisfied and the lemma is proven.
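The construction in the proof can be sketched in Python. The example DFA (three states counting 1s modulo 3) and the helper names `split_by_first_repeat` and `accepts` are illustrative assumptions, not part of the slides. Indices are 0-based, so position `j` in the code corresponds to $r_{j+1}$ in the proof.

```python
def split_by_first_repeat(delta, start, s):
    """Split s as (x, y, z) at the first repeated state of the run on s."""
    first_seen = {start: 0}  # state -> number of symbols read when first entered
    state = start
    for k, symbol in enumerate(s, start=1):
        state = delta[(state, symbol)]
        if state in first_seen:
            j = first_seen[state]
            # x leads to the repeated state, y loops on it, z is the rest.
            return s[:j], s[j:k], s[k:]
        first_seen[state] = k
    return None  # no repeated state: only possible if len(s) < number of states


def accepts(delta, start, finals, s):
    """Run the DFA on s and report whether it ends in an accept state."""
    state = start
    for symbol in s:
        state = delta[(state, symbol)]
    return state in finals


# Hypothetical DFA: accepts binary strings whose number of 1s is divisible by 3.
states = [0, 1, 2]
delta = {(q, c): (q + 1) % 3 if c == "1" else q for q in states for c in "01"}
start, finals, p = 0, {0}, len(states)

s = "10110"  # accepted by the DFA, and len(s) >= p
x, y, z = split_by_first_repeat(delta, start, s)
pumped_ok = all(accepts(delta, start, finals, x + y * i + z) for i in range(6))
print((x, y, z), len(y) > 0, len(x + y) <= p, pumped_ok)
```

Checking `i` up to 5 is only a spot check; the proof above is what guarantees acceptance for every `i`.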
Before using lemma
Fact: for fixed strings $x$, $y$, $z$, the language $\{xy^iz \mid i \ge 0\}$ is regular.
Proof: by construction. The language is described by the regular expression $xy^*z$, and a GNFA that recognizes it is shown in Figure 2.
[Figure 2, diagram not reproduced: states s, r, a, with a self-loop labeled y on r; edge labels as extracted: x ∪ ε, y, z ∪ ε, xz]
Figure 2: A GNFA recognizing the language $\{xy^iz \mid i \ge 0\}$
Fact
If only some elements of a language L satisfy the
three conditions it does not mean that L is regular.
Using pumping lemma
Proving that a language $A$ is not regular using the pumping lemma:
1. Assume that $A$ is regular in order to obtain a contradiction;
2. The pumping lemma guarantees the existence of a pumping length $p$ such that all strings of length $p$ or greater in $A$ can be pumped;
3. Find $s \in A$, $|s| \ge p$, that cannot be pumped: i.e., demonstrate that $s$ cannot be pumped by considering all ways of dividing $s$ into $x, y, z$ and showing that for each division one of the pumping lemma conditions (1) $xy^iz \in A$ for all $i \ge 0$, (2) $|y| > 0$, (3) $|xy| \le p$ fails.
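Step 3 can be explored by brute force for one concrete value of $p$. The sketch below enumerates every division allowed by conditions 2 and 3 and keeps those that survive pumping for $i = 0, \ldots,$ `max_i`. The function name `pumpable_splits`, the bound `max_i`, and the choice $p = 5$ are assumptions made for illustration. A run like this is an experiment, not a proof, because a proof must work for every $p$.

```python
def in_B(w):
    """Membership test for B = {0^n 1^n | n >= 0}."""
    n = len(w) // 2
    return w == "0" * n + "1" * n


def pumpable_splits(s, p, in_lang, max_i=4):
    """Return all (x, y, z) with s = xyz, |y| > 0, |xy| <= p such that
    x y^i z stays in the language for every i in 0..max_i."""
    good = []
    for end in range(1, min(p, len(s)) + 1):      # end = |xy|
        for begin in range(end):                  # begin = |x|, so |y| > 0
            x, y, z = s[:begin], s[begin:end], s[end:]
            if all(in_lang(x + y * i + z) for i in range(max_i + 1)):
                good.append((x, y, z))
    return good


p = 5
s = "0" * p + "1" * p
print(pumpable_splits(s, p, in_B))  # an empty list means no allowed division pumps
```

An empty result for the chosen $s$ suggests that $s$ is a good candidate for the written proof. A non-empty result means another string should be tried.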
Observations
• The existence of $s$ contradicts the pumping lemma, hence $A$ cannot be regular;
• Finding $s$ sometimes takes a bit of creative thinking. Experimentation is suggested.
Applications
Example 1: Use the pumping lemma to prove that $B = \{0^n 1^n \mid n \ge 0\}$ is not regular.
Assume that $B$ is regular and let $p$ be the pumping length of $B$. Choose $s = 0^p 1^p \in B$; obviously $|0^p 1^p| \ge p$. By the pumping lemma, $s = xyz$ such that for any $i \ge 0$, $xy^iz \in B$.
Example, continuation
Consider the cases:
1. $y$ consists of 0s only. In this case $xyyz$ has more 0s than 1s and so it is not in $B$, violating condition 1;
2. $y$ consists of 1s only. This leads to the same contradiction;
3. $y$ consists of both 0s and 1s. In this case $xyyz$ may have the same number of 0s and 1s, but they are out of order, with some 1s before some 0s; hence it cannot be in $B$ either.
The contradiction is unavoidable if we assume that $B$ is regular, so $B$ is not regular.
Example 2
Prove that $C = \{w \mid w \text{ has an equal number of 0s and 1s}\}$ is not regular.
Proof: assume that $C$ is regular and let $p$ be its pumping length. Let $s = 0^p 1^p$; then $s \in C$. The pumping lemma guarantees the existence of strings $x, y, z$ such that $s = xyz$ and $xy^iz \in C$ for any $i \ge 0$.
Observation
If we take the division $x = z = \epsilon$, $y = 0^p 1^p$, it seems that no contradiction occurs. However:
• Condition 3 states that $|xy| \le p$, and in our case $xy = 0^p 1^p$ and $|xy| = 2p > p$. Hence this division is not allowed.
• If $|xy| \le p$ then $y$ must consist of only 0s, so $xyyz \notin C$ because there are more 0s than 1s. This gives us the desired contradiction.
Other selections
Selecting $s = (01)^p$ leads us to trouble because this string can be pumped by the division $x = \epsilon$, $y = 01$, $z = (01)^{p-1}$. Then $xy^iz \in C$ for any $i \ge 0$.
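The difference between the two selections can be checked on a small instance. The following sketch fixes $p = 4$ (an arbitrary assumption) and tests only finitely many exponents, so it illustrates the argument rather than proving it.

```python
def in_C(w):
    """Membership test for C: equal number of 0s and 1s."""
    return w.count("0") == w.count("1")


p = 4

# Poor choice s = (01)^p: the division x = "", y = "01", z = (01)^(p-1)
# satisfies conditions 2 and 3 and keeps every pumped string in C.
x, y, z = "", "01", "01" * (p - 1)
poor_choice_pumps = (
    len(y) > 0
    and len(x + y) <= p
    and all(in_C(x + y * i + z) for i in range(10))
)

# Good choice s = 0^p 1^p: every allowed division has y inside the leading 0s,
# so pumping up once (i = 2) already breaks the balance.
s = "0" * p + "1" * p
good_choice_pumps = any(
    in_C(s[:a] + s[a:b] * 2 + s[b:])
    for b in range(1, p + 1)   # b = |xy| <= p
    for a in range(b)          # a = |x|, so |y| = b - a > 0
)

print(poor_choice_pumps, good_choice_pumps)
```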
An alternative method
Use the fact that $B$ is nonregular.
• If $C$ were regular then $C \cap 0^*1^*$ would also be regular, because $0^*1^*$ is regular and the intersection of regular languages is a regular language.
• But $C \cap 0^*1^* = \{0^n 1^n \mid n \ge 0\} = B$, which is not regular.
• Hence, $C$ is not regular either.
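The identity $C \cap 0^*1^* = B$ at the heart of this argument can be sanity-checked on all short binary strings. The helper name `intersection_equals_B` and the length bound are assumptions for illustration; the check is exhaustive only up to that bound.

```python
import re
from itertools import product

ZEROS_THEN_ONES = re.compile(r"0*1*")


def in_C(w):
    """Equal number of 0s and 1s."""
    return w.count("0") == w.count("1")


def in_B(w):
    """w has the form 0^n 1^n."""
    n = len(w) // 2
    return w == "0" * n + "1" * n


def intersection_equals_B(max_len):
    """Compare C ∩ 0*1* with B on every binary string of length <= max_len."""
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            w = "".join(chars)
            in_intersection = in_C(w) and ZEROS_THEN_ONES.fullmatch(w) is not None
            if in_intersection != in_B(w):
                return False
    return True


print(intersection_equals_B(10))
```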
Example 3
Show that $F = \{ww \mid w \in \{0,1\}^*\}$ is nonregular using the pumping lemma.
Proof: Assume that $F$ is regular and let $p$ be its pumping length. Consider $s = 0^p 1 0^p 1 \in F$. Since $|s| > p$, the pumping lemma guarantees the existence of strings $x, y, z$ such that $s = xyz$ and the conditions of the pumping lemma are satisfied.
Observations
• Condition 3 is again crucial, because without it we could pump $s$ by letting $x = z = \epsilon$ and $y = s$, so that $xyyz \in F$. But then $|xy| = 2p + 2 > p$.
• The string $s = 0^p 1 0^p 1$ exhibits the essence of the nonregularity of $F$.
• If we chose, say, $0^p 0^p \in F$ we would fail, because this string can be pumped.
Proof
So, let us consider the string $s = 0^p 1 0^p 1$ and show that it cannot be pumped.
• By condition 3, $|xy| \le p$, so $y$ lies within the first block of 0s: $x = 0^k$, $y = 0^m$, $z = 0^{p-k-m} 1 0^p 1$ with $m > 0$ and $k + m \le p$. Pumping gives $xy^iz = 0^{p+(i-1)m} 1 0^p 1$, which for $i = 0$ becomes $0^{p-m} 1 0^p 1 \notin F$ because $p - m < p$.
• If $y$ contained exactly one 1, pumping up to $xyyz$ would give a string with an odd number of 1s, which cannot be in the language; in any case such a $y$ is excluded, because then $|xy| > p$.
• Note: we cannot use the division $x = \epsilon$, $y = 0^p 1 0^p 1$, $z = \epsilon$ because the condition $|xy| \le p$ is violated.
Example 4
Use the pumping lemma to show that $D = \{1^{n^2} \mid n \ge 0\}$ is nonregular.
Proof: by contradiction. Assume that $D$ is regular and let $p$ be its pumping length. Consider $s = 1^{p^2} \in D$, $|s| \ge p$. The pumping lemma guarantees that $s$ can be split, $s = xyz$, where for all $i \ge 0$, $xy^iz \in D$, $|y| > 0$, $|xy| \le p$.
Searching for a contradiction
The elements of $D$ are strings whose lengths are perfect squares. The first perfect squares are: 0, 1, 4, 9, 16, 25, 36, 49, 64, 81, ...
• Note the growing gap between these numbers: large members cannot be near each other;
• Consider two strings $xy^iz$ and $xy^{i+1}z$, which differ from each other by a single repetition of $y$;
• If we choose $i$ very large, the lengths of $xy^iz$ and $xy^{i+1}z$ cannot both be perfect squares because they are too close to each other.
Turning this idea into a proof
Calculate the value of $i$ that gives us the contradiction.
• If $m = n^2$, the gap to the next perfect square is $(n + 1)^2 - n^2 = 2n + 1 = 2\sqrt{m} + 1$.
• By the pumping lemma, $|xy^iz|$ and $|xy^{i+1}z|$ are both perfect squares. But letting $|xy^iz| = m$, we can see that they cannot both be perfect squares if $|y| < 2\sqrt{|xy^iz|} + 1$, because they would be too close together.
Value of i for contradiction
To calculate the value of $i$ that leads to a contradiction we observe that:
• $|y| \le |s| = p^2$;
• Let $i = p^4$. Then
  $|y| \le p^2 = \sqrt{p^4} < 2\sqrt{p^4} + 1 \le 2\sqrt{|xy^iz|} + 1$,
  where the last step uses $|xy^iz| \ge i\,|y| \ge p^4$.
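The arithmetic can be checked numerically for small pumping lengths. The sketch below uses the fact that $|xy^iz| = p^2 + (i - 1)|y|$ when $|xyz| = p^2$, and tests the choice $i = p^4$ for every $|y|$ between 1 and $p^2$. The function names and the range of $p$ are assumptions made for illustration.

```python
from math import isqrt


def is_square(m):
    """True when m is a perfect square."""
    return isqrt(m) ** 2 == m


def pumped_length(p, y_len, i):
    """Length of x y^i z when |xyz| = p^2 and |y| = y_len."""
    return p * p + (i - 1) * y_len


def contradiction_found(p, y_len):
    """With i = p^4, the lengths of x y^i z and x y^(i+1) z
    cannot both be perfect squares."""
    i = p ** 4
    m = pumped_length(p, y_len, i)
    return not (is_square(m) and is_square(m + y_len))


all_ok = all(
    contradiction_found(p, y_len)
    for p in range(1, 8)
    for y_len in range(1, p * p + 1)
)
print(all_ok)
```

If condition 3 is used as well, so that $|y| \le p$, the smaller exponent $i = 2$ already suffices, since $p^2 < p^2 + |y| \le p^2 + p < (p + 1)^2$.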
Example 5
Sometimes "pumping down" is useful when we apply the pumping lemma.
• We illustrate this by using the pumping lemma to prove that $E = \{0^i 1^j \mid i > j\}$ is not regular.
• Proof: by contradiction using the pumping lemma. Assume that $E$ is regular and its pumping length is $p$. By the pumping lemma, any string $s \in E$ with $|s| \ge p$ can be split into three components $s = xyz$ such that:
1. $xy^iz \in E$ for any $i \ge 0$,
2. $|y| > 0$, and
3. $|xy| \le p$.
Searching for a contradiction
• Let $s = 0^{p+1} 1^p$. From the decomposition $s = xyz$ and condition 3, $|xy| \le p$, it follows that $y$ consists only of 0s.
• Let us examine $xyyz$ to see if it is in $E$. Adding an extra copy of $y$ increases the number of 0s. Since $E$ contains all strings in $0^*1^*$ that have more 0s than 1s, $xyyz$ is still in $E$, and no contradiction arises.
Try something else
• Since $xy^iz \in E$ even when $i = 0$, consider $i = 0$ and $xy^0z = xz \in E$.
• This decreases the number of 0s in $s$.
• Since $s = xyz = 0^{p+1} 1^p$ has just one more 0 than 1s and $|y| \ne 0$, $xz$ cannot have more 0s than 1s, so $xz$ cannot be in $E$.
This is the required contradiction.
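The contrast between pumping up and pumping down can be seen on a small instance. The sketch below fixes $p = 4$ (an arbitrary assumption) and, for every division allowed by conditions 2 and 3, records whether $xyyz$ and $xz$ belong to $E$.

```python
import re


def in_E(w):
    """Membership test for E = {0^i 1^j | i > j}."""
    m = re.fullmatch(r"(0*)(1*)", w)
    return m is not None and len(m.group(1)) > len(m.group(2))


p = 4
s = "0" * (p + 1) + "1" * p

results = []
for end in range(1, p + 1):        # end = |xy| <= p
    for begin in range(end):       # begin = |x|, so |y| > 0
        x, y, z = s[:begin], s[begin:end], s[end:]
        pumped_up = in_E(x + y * 2 + z)   # i = 2
        pumped_down = in_E(x + z)         # i = 0
        results.append((pumped_up, pumped_down))

# Every division survives pumping up but fails when pumped down.
print(all(r == (True, False) for r in results))
```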
Minimum pumping length
• The pumping lemma says that every regular language has a pumping length $p$, such that every string in the language of length at least $p$ can be pumped.
• Hence, if $p$ is a pumping length for a regular language $A$, so is any length $p' \ge p$.
The minimum pumping length for $A$ is the smallest $p$ that is a pumping length for $A$.
Example
Consider $A = 01^*$. The minimum pumping length for $A$ is 2.
Reason: the string $s = 0 \in A$ has $|s| = 1$ and cannot be pumped, so 1 is not a pumping length. But any string $s \in A$ with $|s| \ge 2$ can be pumped using $s = xyz$ where $x = 0$, $y = 1$, and $z$ is the rest of $s$, since then $xy^iz \in A$ for all $i \ge 0$. Hence, the minimum pumping length for $A$ is 2.
Problem 1
Find the minimum pumping length for the language $0001^*$.
Solution: The minimum pumping length for $0001^*$ is 4.
Reason: $000 \in 0001^*$ but 000 cannot be pumped. Hence, 3 is not a pumping length for $0001^*$. If $s \in 0001^*$ and $|s| \ge 4$, $s$ can be pumped by the division $s = xyz$, $x = 000$, $y = 1$, $z$ = the rest of $s$.
Problem 2
Find the minimum pumping length for the language $0^*1^*$.
Solution: The minimum pumping length of $0^*1^*$ is 1.
Reason: the minimum pumping length for $0^*1^*$ cannot be 0 because $\epsilon$ is in the language but cannot be pumped. Every nonempty string $s \in 0^*1^*$, $|s| \ge 1$, can be pumped by the division $s = xyz$, where $x = \epsilon$, $y$ is the first character of $s$, and $z$ is the rest of $s$.
Problem 3
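Find the minimum pumping length for the language $0^*1^+0^+1^* \cup 10^*1$.
Solution: The minimum pumping length for $0^*1^+0^+1^* \cup 10^*1$ is 3.
Reason: The pumping length cannot be 2 because the string 11 is in the language and it cannot be pumped. Let $s$ be a string in the language of length at least 3. If $s$ begins with 0 or with 11, it is generated by $0^*1^+0^+1^*$ and we can write it as $s = xyz$, where $x = \epsilon$, $y$ is the first symbol of $s$, and $z$ is the rest of the string. Otherwise $s = 1 0^k 1^m$ with $k \ge 1$ (this covers the strings generated by $10^*1$ as well). If $k \ge 2$ or $m \le 1$, we can write it as $s = xyz$ with $x = 1$, $y = 0$, and $z$ the remainder of $s$; if $k = 1$ and $m \ge 2$, we take $x = 10$, $y = 1$, and $z$ the remainder of $s$. In each case $|xy| \le 3$ and $xy^iz$ stays in the language for all $i \ge 0$.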
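The minimum pumping lengths in the example and the three problems can be cross-checked by a bounded brute-force search. The sketch below is an experiment under stated assumptions, not a proof: it only examines strings up to `max_len` symbols and exponents up to `max_i`, and the function names are made up for illustration. Languages are given as Python `re` patterns.

```python
import re
from itertools import product


def strings_in(rx, alphabet, max_len):
    """Yield every string over alphabet, up to max_len symbols, matched by rx."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if rx.fullmatch(s):
                yield s


def can_be_pumped(s, p, rx, max_i):
    """True if some split s = xyz with |y| > 0 and |xy| <= p keeps
    x y^i z in the language for every i in 0..max_i."""
    for end in range(1, min(p, len(s)) + 1):
        for begin in range(end):
            x, y, z = s[:begin], s[begin:end], s[end:]
            if all(rx.fullmatch(x + y * i + z) for i in range(max_i + 1)):
                return True
    return False


def min_pumping_length(pattern, alphabet="01", max_len=8, max_i=3):
    """Smallest p that passes the bounded check, or None if none does."""
    rx = re.compile(pattern)
    words = list(strings_in(rx, alphabet, max_len))
    for p in range(max_len + 1):
        if all(can_be_pumped(s, p, rx, max_i) for s in words if len(s) >= p):
            return p
    return None


for pattern in ["01*", "0001*", "0*1*", "0*1+0+1*|10*1"]:
    print(pattern, min_pumping_length(pattern))
```

Because the search is bounded, its answer can only be trusted as a lower bound in general; here the unpumpable witnesses (0, 000, the empty string, and 11) are all short, so the bounds above are enough to find them.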