Second Part of Regular Expressions Equivalence with Finite Automata
September 11, 2013
Lemma 1.60
If a language is regular, then it is specified by a regular expression.
Proof idea: For a given regular language A, we construct a regular expression that specifies A.
Procedure
- Because A is regular, there is a DFA D_A that recognizes A
- D_A will be converted into a regular expression R_A that specifies A
Note: This procedure is broken into two parts:
1. Convert the DFA into a generalized nondeterministic finite automaton (GNFA)
2. Convert the GNFA into a regular expression
What is a GNFA?
- A GNFA is an NFA in which the transition arrows may have any regular expressions as labels, instead of only members of the alphabet or ε
- Hence, a GNFA reads strings specified by regular expressions (blocks of symbols) from the input, not necessarily just one symbol at a time
- A GNFA moves along a transition arrow connecting two states by reading a block of input symbols that is specified by the regular expression labeling that arrow (Figure 1)
Example GNFA
Figure 1: A GNFA (diagram not reproduced). States: q_start, q_1, q_2, q_accept. Arrow labels appearing in the diagram: ab, aa, a∗, (aa)∗, ab∗, ∅, ab ∪ ba, b∗, b.
Note
- A GNFA is nondeterministic, so it may have many different ways to process the same input string
- A GNFA accepts its input if some way of processing it leaves the GNFA in an accept state at the end of the input
GNFA of special form
- The start state has transition arrows going to every other state but no arrows coming in from any other state
- There is only one accept state; it has arrows coming in from every other state but no arrows going to any other state. In addition, the accept state is not the same as the start state
- Except for the start and accept states, one arrow goes from every state to every other state and also from each state to itself
Converting DFA to GNFA
A DFA is converted to a GNFA of special form by the following procedure:
1. Add a new start state with an ε arrow to the old start state, and a new accept state with an ε arrow from each old accept state
2. If any arrow has multiple labels, or if there are multiple arrows going between the same two states in the same direction, replace them with a single arrow whose label is the union of the previous labels
3. Add arrows labeled ∅ between states that had no arrows
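The three steps above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name dfa_to_gnfa, the state names "qs" and "qa", and the encoding of labels as plain strings (with "ε" and "∅" as special labels) are all assumptions made for the example.

```python
EPS = "ε"    # label for the empty string
EMPTY = "∅"  # label for the empty language


def dfa_to_gnfa(states, delta, start, accepts):
    """Build a special-form GNFA from a DFA.

    delta maps (state, symbol) -> state.
    Returns (all_states, labels, qs, qa), where labels maps (qi, qj) -> label.
    """
    qs, qa = "qs", "qa"  # hypothetical names, assumed not to clash with states
    labels = {}

    # Step 2: merge parallel arrows into a single arrow labeled by a union.
    for (p, symbol), q in delta.items():
        if (p, q) in labels:
            labels[(p, q)] = f"{labels[(p, q)]} ∪ {symbol}"
        else:
            labels[(p, q)] = symbol

    # Step 1: ε arrows out of the new start state and into the new accept state.
    labels[(qs, start)] = EPS
    for f in accepts:
        labels[(f, qa)] = EPS

    # Step 3: every remaining pair allowed by the special form gets ∅.
    all_states = [qs, *states, qa]
    for p in all_states:
        for q in all_states:
            if p != qa and q != qs:
                labels.setdefault((p, q), EMPTY)

    return all_states, labels, qs, qa
```

For a DFA with two states, the resulting four-state GNFA has 9 labeled arrows: 3 possible sources (every state except qa) times 3 possible targets (every state except qs).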
Note
Adding ∅ transitions does not change the language recognized by the DFA, because a transition labeled ∅ can never be used
Assumption: from now on, we assume that all GNFAs are in the special form just defined.
Converting GNFA → RE
Assume that the GNFA has k states
- Because the start and accept states are different from each other, k ≥ 2
- If k > 2, we construct an equivalent GNFA with k − 1 states. This can be repeated for each new GNFA until we obtain a GNFA with k = 2 states
- If k = 2, the GNFA has a single arrow that goes from the start state to the accept state and is labeled by a regular expression that specifies the language recognized by the original DFA
Example DFA conversion
Assuming that the original DFA has 3 states, the process of its conversion is shown in Figure 2
3-state DFA → 5-state GNFA → 4-state GNFA → 3-state GNFA → 2-state GNFA → Regular expression
Figure 2: Example DFA conversion to regular expression
Note
- The crucial step is constructing an equivalent GNFA with one fewer state when the GNFA has k > 2 states
- This is done by selecting a state, ripping it out of the machine, and repairing the remainder so that the same language is still recognized
- Any state can be selected for ripping, provided that it is not the start or accept state. Such a state exists because k > 2
Repairing after ripping a state
Assume that the state of the GNFA selected for ripping is q_rip
- After removing q_rip, we repair the machine by altering the regular expressions that label each of the remaining transitions
- The new labels compensate for the absence of q_rip by adding back the lost computations
- The new label of the arrow going from state q_i to state q_j is a regular expression that specifies all strings that would take the machine from q_i to q_j either directly or via q_rip
Illustration
We illustrate the approach of ripping and repairing in Figure 3
Before ripping: q_i → q_rip labeled R1, q_rip → q_rip labeled R2, q_rip → q_j labeled R3, q_i → q_j labeled R4
After ripping: q_i → q_j labeled (R1)(R2)∗(R3) ∪ (R4)
Figure 3: Ripping and repairing a GNFA
Note
- New labels are obtained by concatenating the regular expressions of the arrows that go through q_rip and taking the union of the result with the label of the arrow that goes directly from q_i to q_j
- This construction is carried out for each arrow that goes from any state q_i to any state q_j, including the case q_i = q_j
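A single ripping step can be sketched as follows. The encoding is an assumption made for illustration: labels are plain strings stored in a dictionary keyed by pairs (q_i, q_j), the GNFA is in special form (so every needed pair is present), and the function name rip is hypothetical. The sketch applies the relabeling rule literally and does not simplify the resulting expressions.

```python
def rip(states, labels, q_rip):
    """Remove q_rip and relabel every remaining arrow.

    labels maps (qi, qj) -> regular expression string.
    The new label of qi -> qj is (R1)(R2)*(R3) ∪ (R4), where
    R1 = qi -> q_rip, R2 = q_rip -> q_rip, R3 = q_rip -> qj, R4 = qi -> qj.
    """
    rest = [q for q in states if q != q_rip]
    new_labels = {}
    for (qi, qj), r4 in labels.items():
        if q_rip in (qi, qj):
            continue  # arrows touching q_rip disappear with it
        r1 = labels[(qi, q_rip)]
        r2 = labels[(q_rip, q_rip)]
        r3 = labels[(q_rip, qj)]
        new_labels[(qi, qj)] = f"({r1})({r2})*({r3}) ∪ ({r4})"
    return rest, new_labels
```

Because the loop visits every pair (q_i, q_j) not involving q_rip, it also covers the case q_i = q_j, so self-loops are repaired in the same way as other arrows.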
Formal proof
- First we need to define the GNFA formally
- Since the labels are regular expressions, we use the symbol R_Σ to denote the collection of all regular expressions over an alphabet Σ
- To simplify, denote by q_s and q_a the start and accept states of the GNFA
Transition function of a GNFA
- Because an arrow connects every state to every other state, except that no arrows come from q_a or go to q_s, the transition function of a GNFA has the form δ : (Q − {q_a}) × (Q − {q_s}) → R_Σ
- If δ(q_i, q_j) = R, the arrow from q_i to q_j has the label R
Definition 1.64
A generalized nondeterministic finite automaton (GNFA) is a 5-tuple (Q, Σ, δ, q_s, q_a) where:
1. Q is the finite set of states
2. Σ is the input alphabet
3. δ : (Q − {q_a}) × (Q − {q_s}) → R_Σ is the transition function, where R_Σ is the set of regular expressions over Σ
4. q_s is the unique start state
5. q_a is the unique accept state, and q_a ≠ q_s
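A small check can make the domain of δ concrete. The sketch below assumes a GNFA is stored as a collection of states plus a dictionary mapping pairs (q_i, q_j) to labels; the function name is_special_form and this encoding are illustrative assumptions, not part of Definition 1.64.

```python
def is_special_form(states, labels, qs, qa):
    """Check that labels is defined exactly on (Q - {qa}) x (Q - {qs})."""
    if qs == qa or qs not in states or qa not in states:
        return False
    # One arrow from every state except qa to every state except qs.
    expected = {(p, q) for p in states if p != qa for q in states if q != qs}
    return set(labels) == expected
```

With k states the expected domain has (k − 1)² pairs, so a two-state GNFA has exactly one arrow, from the start state to the accept state.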
GNFA computation
A GNFA accepts a string w ∈ Σ∗ if w = w_1 w_2 … w_k, where w_i ∈ Σ∗ for 1 ≤ i ≤ k, and a sequence of states q_0, q_1, …, q_k exists such that:
1. q_0 = q_s is the start state
2. q_k = q_a is the accept state
3. For each i, δ(q_{i−1}, q_i) = R_i and w_i ∈ L(R_i); that is, R_i is the regular expression labeling the arrow from q_{i−1} to q_i, and w_i is an element of the language specified by this expression
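The acceptance condition above can be tested by searching over pairs (state, input position). The sketch below is an illustration under stated assumptions: labels are written in Python's re syntax (so union is | rather than ∪), the empty string "" stands for ε, None stands for ∅, and the function name gnfa_accepts is hypothetical.

```python
import re


def gnfa_accepts(labels, qs, qa, w):
    """Return True if the GNFA accepts w.

    labels maps (qi, qj) -> Python regex string, or None for the label ∅.
    A pair (q, i) in `seen` means: some split of w[:i] into blocks
    leads from qs to q.
    """
    seen = {(qs, 0)}
    frontier = [(qs, 0)]
    while frontier:
        q, i = frontier.pop()
        for (p, r), pattern in labels.items():
            if p != q or pattern is None:
                continue  # wrong source state, or an ∅ arrow that is never usable
            # Try every block w[i:j], including the empty block (j == i).
            for j in range(i, len(w) + 1):
                if (r, j) not in seen and re.fullmatch(pattern, w[i:j]):
                    seen.add((r, j))
                    frontier.append((r, j))
    return (qa, len(w)) in seen
```

The search terminates because there are only finitely many (state, position) pairs and each is explored at most once. Blocks may be empty, which matches the requirement w_i ∈ Σ∗ in the definition.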
More proof ideas
Returning to the proof of Lemma 1.60, we assume that M is a DFA recognizing the language A and proceed as follows:
- Convert M into a GNFA G by adding a new start state, a new accept state, and the additional arrows
- Use the procedure Convert(G), which maps G into a regular expression as explained before, while preserving the language A
Note: Convert() is recursive; however, the case in which the GNFA has only two states is handled without recursion
Convert(G)
1. Let k be the number of states of G, k ≥ 2.
2. If k = 2, then G must consist of a start state, an accept state, and a single arrow connecting them, labeled by a regular expression R. Return R.
3. If k > 2, select any state q_rip ∈ Q different from q_s and q_a, and let G′ be the GNFA (Q′, Σ, δ′, q_s, q_a) where:
   - Q′ = Q − {q_rip}
   - for any q_i ∈ Q′ − {q_a} and any q_j ∈ Q′ − {q_s}, let δ′(q_i, q_j) = (R1)(R2)∗(R3) ∪ (R4), where R1 = δ(q_i, q_rip), R2 = δ(q_rip, q_rip), R3 = δ(q_rip, q_j), R4 = δ(q_i, q_j)
4. Compute Convert(G′) and return its value.
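The recursive procedure can be sketched in Python. The encoding is an assumption made for this example: labels are strings, None stands for ∅, and "" stands for ε. The helpers star, concat, and union are hypothetical and apply only the basic identities (R∅ = ∅, Rε = R, R ∪ ∅ = R, ∅∗ = ε) so that the output stays readable; they are not part of the procedure on the slide.

```python
def top_level_union(r):
    """True if r contains a union operator outside all parentheses."""
    depth = 0
    for ch in r:
        depth += (ch == "(") - (ch == ")")
        if ch == "∪" and depth == 0:
            return True
    return False


def star(r):
    if r is None or r == "":
        return ""  # ∅* = ε* = ε
    return f"{r}*" if len(r) == 1 else f"({r})*"


def concat(*parts):
    if any(p is None for p in parts):
        return None  # concatenating with ∅ gives ∅
    # ε parts ("") contribute nothing; unions need parentheses.
    return "".join(f"({p})" if top_level_union(p) else p for p in parts)


def union(r, s):
    if r is None or r == s:
        return s
    if s is None:
        return r
    return f"{r or 'ε'} ∪ {s or 'ε'}"


def convert(states, labels, qs, qa):
    """Return a regular expression string equivalent to the GNFA."""
    if len(states) == 2:
        return labels[(qs, qa)]  # base case: the single arrow qs -> qa
    q_rip = next(q for q in states if q not in (qs, qa))
    rest = [q for q in states if q != q_rip]
    new_labels = {}
    for qi in rest:
        for qj in rest:
            if qi == qa or qj == qs:
                continue  # no arrows leave qa or enter qs
            via = concat(labels[(qi, q_rip)],
                         star(labels[(q_rip, q_rip)]),
                         labels[(q_rip, qj)])
            new_labels[(qi, qj)] = union(via, labels[(qi, qj)])
    return convert(rest, new_labels, qs, qa)
```

The sketch always rips the first eligible state in the list; any other choice is equally valid and may give a different but equivalent expression.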
Claim 1.65
For any GNFA G, Convert(G) is equivalent to G
Proof: by induction on k, the number of states of G
Induction Basis:
k = 2
- If G has only two states, then by definition it can have only a single arrow, which goes from q_s to q_a
- The regular expression labeling this arrow specifies the language accepted by G
- Since this expression is returned by Convert(G), G and Convert(G) are equivalent
Induction Step
Assume that the claim is true for every GNFA with k − 1 states, and use this assumption to show that the claim is true for a GNFA G with k states
- First we show that G and the GNFA G′ built by Convert(G) recognize the same language
- Suppose G accepts the input w. Then in an accepting branch of the computation, G enters a sequence of states q_s, q_1, q_2, q_3, …, q_a
- We show that G′ has an accepting computation for w, too
Induction step, continuation
1. If none of the states q_s, q_1, q_2, …, q_a is q_rip, then clearly G′ also accepts w, because each of the new regular expressions labeling the arrows of G′ contains the old regular expression as part of a union
2. If q_rip does appear in the computation q_s, q_1, q_2, …, q_a, then removing each run of consecutive q_rip states yields an accepting computation for G′. This is because the states q_i and q_j bracketing a run of consecutive q_rip states have a new regular expression on the arrow between them that specifies all strings taking q_i to q_j via q_rip in G. So G′ accepts w in this case too
Induction step, continuation
For the other direction, suppose that G′ accepts w.
1. Each arrow between any two states q_i and q_j in G′ is labeled by a regular expression that specifies the strings taking q_i to q_j in G, either directly or via q_rip
2. Hence, by the definition of a GNFA, it follows that G must also accept w
That is, G and G′ accept the same language
Conclusion
- The induction hypothesis states that when the algorithm calls itself recursively on input G′, the result is a regular expression that is equivalent to G′, because G′ has k − 1 states
- Hence, that regular expression is also equivalent to G, because G′ is equivalent to G
- Consequently, Convert(G) and G are equivalent
Example 1.35
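Convert the DFA D in Figure 4 into a regular expression that specifies the language accepted by D
Figure 4: DFA D to be converted (diagram not reproduced). D has two states, 1 and 2; each state has a self-loop labeled a, and arrows labeled b go from 1 to 2 and from 2 to 1. State 1 is the start state and state 2 is the accept state.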
GNFA G1 obtained from D
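Figure 5 shows the four-state GNFA G1 obtained from D by adding a new start state q_s and a new accept state q_a; any arrow carrying several labels, such as a, b, would be replaced by a single arrow labeled with their union, a ∪ b
Figure 5: GNFA G1 obtained from D (diagram not reproduced). G1 keeps the arrows of D and adds an ε arrow from q_s to state 1 and an ε arrow from state 2 to q_a; arrows labeled ∅ are not drawn.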
Eliminating nodes
Removing state 1 yields the GNFA G2, and then removing state 2 yields the GNFA G3 shown in Figure 6
Figure 6: GNFA G3 obtained from G2 (diagram not reproduced). G3 has the single arrow q_s → q_a labeled a∗b(a ∪ ba∗b)∗.
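The resulting expression can be sanity-checked by brute force. The sketch below assumes that D has the transitions 1 →a 1, 1 →b 2, 2 →a 2, 2 →b 1 with start state 1 and accept state 2 (this reading of Figure 4 is an assumption), and it writes a∗b(a ∪ ba∗b)∗ in Python's re syntax, where union is |. The names DELTA, dfa_accepts, and agree_up_to are hypothetical.

```python
import re
from itertools import product

# Assumed transition table of the DFA D: (state, symbol) -> state.
DELTA = {(1, "a"): 1, (1, "b"): 2, (2, "a"): 2, (2, "b"): 1}

# The expression a*b(a ∪ ba*b)* in Python regex syntax.
PATTERN = re.compile(r"a*b(a|ba*b)*")


def dfa_accepts(w):
    """Run D on w, starting in state 1 and accepting in state 2."""
    state = 1
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state == 2


def agree_up_to(n):
    """Compare D and the expression on every string over {a, b} of length <= n."""
    for length in range(n + 1):
        for letters in product("ab", repeat=length):
            w = "".join(letters)
            if dfa_accepts(w) != bool(PATTERN.fullmatch(w)):
                return False
    return True
```

Such a check is not a proof of equivalence, since it only covers strings up to a fixed length, but it catches mistakes in the hand computation of the labels.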