CS453 Lecture Regular Expressions and Transition Diagrams 1 Plan for Today and Beginning Next week (Lexical Analysis) Regular Expressions Finite State Machines DFAs: Deterministic Finite Automata Complications NFAs: Non Deterministic Finite State Automata From Regular Expressions to NFAs From NFAs to DFAs

Plan for Today and Beginning Next week (Lexical Analysis)cs453/yr2014/Slides/02... · 2014-01-21 · CS453 Lecture Regular Expressions and Transition Diagrams 1 Plan for Today and

Jun 20, 2020


CS453 Lecture Regular Expressions and Transition Diagrams 1

Plan for Today and Beginning Next week (Lexical Analysis)

  Regular Expressions

  Finite State Machines
  DFAs: Deterministic Finite Automata
  Complications
  NFAs: Non Deterministic Finite State Automata

  From Regular Expressions to NFAs
  From NFAs to DFAs

Structure of a Typical Compiler

[Figure: compiler pipeline.
  Analysis: character stream → lexical analysis → tokens ("words") → syntactic analysis ("sentences") → AST → semantic analysis → annotated AST
  Synthesis: IR code generation → IR → optimization → IR → code generation → target language
  An interpreter also appears in the diagram.]

Tokens for Example MeggyJava program

  import meggy.Meggy;

  class PA3Flower {
      public static void main(String[] whatever){
          {
              // Upper left petal, clockwise
              Meggy.setPixel( (byte)2, (byte)4, Meggy.Color.VIOLET );
              Meggy.setPixel( (byte)2, (byte)1, Meggy.Color.VIOLET);
              …
          }
      }

Tokens:
  Symbol(IMPORT,null), Symbol(MEGGY,null), Symbol(SEMI,null), Symbol(CLASS,null),
  Symbol(ID,"PA3Flower"), Symbol(LBRACE,null), …

About The Slides on Languages and Finite Automata

  Slides originally developed by Prof. Costas Busch (2004)
    – Many thanks to Prof. Busch for developing the original slide set.

  Adapted with permission by Prof. Dan Massey (Spring 2007)
    – Subsequent modifications; many thanks to Prof. Massey for CS 301 slides.

  Adapted with permission by Prof. Michelle Strout (Spring 2011)
    – Adapted for use in CS 453
    – Adapted by Wim Bohm (added regular expr → NFA → DFA, Spr2012)

Languages

  A language is a set of strings (sometimes called sentences)

  String: a finite sequence of letters
    Examples: "cat", "dog", "house", …
    Defined over a fixed alphabet: Σ = {a, b, c, …, z}

Empty String

  A string with no letters: ε (sometimes λ is used)

  Observations:
    |ε| = 0
    εw = wε = w
    εabba = abbaε = abba

Regular Expressions

  Regular expressions describe regular languages
  You have probably seen them in OSs / editors
  Example:
    (a | (b)(c))*
  describes the language
    L((a | (b)(c))*) = {ε, a, bc, aa, abc, bca, ...}
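The membership claims for this language can be checked mechanically. The sketch below writes the example expression in Python's re syntax and uses re.fullmatch so that the whole string must match; the helper name in_language and the list of non-members are illustrative choices, not part of the slides.

```python
import re

# The slide's example expression, written in Python's re syntax.
PATTERN = re.compile(r"(a|(b)(c))*")

def in_language(s: str) -> bool:
    """True when the whole string s belongs to L((a | (b)(c))*)."""
    return PATTERN.fullmatch(s) is not None

members = ["", "a", "bc", "aa", "abc", "bca"]   # "" plays the role of ε
non_members = ["b", "c", "ab", "cb"]            # illustrative counterexamples

print([in_language(s) for s in members])        # every entry is True
print([in_language(s) for s in non_members])    # every entry is False
```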

Recursive Definition for Specifying Regular Expressions

  Primitive regular expressions: ∅, ε, α
    where α ∈ Σ, some alphabet

  Given regular expressions r1 and r2,
    r1 | r2
    r1 r2
    r1 *
    ( r1 )
  are regular expressions

Regular operators

  choice:        A | B   a string from L(A) or from L(B)
  concatenation: A B     a string from L(A) followed by a string from L(B)
  repetition:    A*      0 or more concatenations of strings from L(A)
                 A+      1 or more
  grouping:      ( A )

  Concatenation has precedence over choice: A|B C vs. (A|B)C

  More syntactic sugar, used in scanner generators:
    [abc]     means a or b or c
    [\t\n ]   means tab, newline, or space
    [a-z]     means a, b, c, …, or z
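The precedence rule and the shorthand forms can be tried out directly. This is a small sketch using Python's re module as a stand-in for a scanner generator's regular expression syntax (the two agree on the operators shown here); the helper name matches is made up for the example.

```python
import re

def matches(regex: str, s: str) -> bool:
    """True when regex matches the whole string s."""
    return re.fullmatch(regex, s) is not None

# A|B C groups as A|(B C): concatenation binds tighter than choice.
print(matches(r"a|bc", "a"))      # True
print(matches(r"a|bc", "ac"))     # False
print(matches(r"(a|b)c", "ac"))   # True
print(matches(r"(a|b)c", "a"))    # False

# A+ means one or more, so unlike A* it rejects the empty string.
print(matches(r"a*", ""), matches(r"a+", ""))              # True False

# Character classes.
print(matches(r"[\t\n ]", "\t"), matches(r"[a-z]", "q"))   # True True
```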

Example Regular Expressions and Regular Definitions

  Regular definition:
    name : regular expression
  name can then be used in other regular expressions

  Keywords: "print", "while"
  Operations: "+", "-", "*"
  Identifiers:
    let : [a-zA-Z]    // choose from a to z or A to Z
    dig : [0-9]
    id  : let (let | dig)*
  Numbers: dig+ = dig dig*
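Regular definitions compose by plain textual substitution, which is easy to mimic with string formatting. The sketch below mirrors let, dig, id and the number definition as Python constants; the upper-case names and the helpers is_id and is_number are illustrative, not something the slides define.

```python
import re

# Named definitions; later ones refer to earlier ones by name.
LET = r"[a-zA-Z]"
DIG = r"[0-9]"
ID = rf"{LET}({LET}|{DIG})*"      # id : let (let | dig)*
NUMBER = rf"{DIG}+"               # dig+
NUMBER_LONG = rf"{DIG}{DIG}*"     # dig dig*, the same language as dig+

def is_id(s: str) -> bool:
    return re.fullmatch(ID, s) is not None

def is_number(s: str) -> bool:
    return re.fullmatch(NUMBER, s) is not None

print(is_id("x1"), is_id("1x"))             # True False
print(is_number("1234"), is_number(""))     # True False
```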

Finite Automaton

[Figure: an input string is fed to a finite automaton, which produces an output string.]

Finite Accepter

[Figure: an input string is fed to a finite automaton, whose output is "Accept" or "Reject".]

State Transition Graph

abba-Finite Accepter

[Figure: transition graph with states q0 to q5. q0 is the initial state and q4 is the final ("accept") state. The path q0 -a-> q1 -b-> q2 -b-> q3 -a-> q4 spells abba; every other transition leads to q5, which loops to itself on a,b.]

Initial Configuration

  Input string: a b b a
  Current state: q0

Reading the Input

  Each frame consumes one input letter and follows the matching edge:
    q0 -a-> q1 -b-> q2 -b-> q3 -a-> q4

  Input finished. Output: "accept"

String Rejection

  Input string: a b a

    q0 -a-> q1 -b-> q2 -a-> q5

  Input finished. Output: "reject"

The Empty String

  Input string: ε

  No letters are read, so the automaton stays in q0, which is not a final state.

  Output: "reject"

  Would it be possible to accept the empty string?

Another Example

[Figure: transition graph with states q0, q1, q2 and edge labels a, b, and a,b (twice). Successive frames step a three-letter input string over {a, b} through the automaton.]

  Input finished. Output: "accept"

Rejection

[Figure: the same three-state transition graph. Successive frames step a second three-letter input string through the automaton.]

  Input finished. Output: "reject"

  Which strings are accepted?

Formalities

  Deterministic Finite Automaton (DFA)

    M = (Q, Σ, δ, q0, F)

    Q  : set of states
    Σ  : input alphabet
    δ  : transition function
    q0 : initial state
    F  : set of final (accepting) states

[Figure, repeated on each of the following slides: the abba accepter's transition graph with the named component highlighted.]

Input Alphabet Σ

  Σ = {a, b}

Set of States Q

  Q = {q0, q1, q2, q3, q4, q5}

Initial State q0

Set of Final States F

  F = {q4}

Transition Function δ

  δ : Q × Σ → Q

  δ(q0, a) = q1
  δ(q0, b) = q5
  δ(q2, b) = q3

Transition Function / table δ

  δ    a    b
  q0   q1   q5
  q1   q5   q2
  q2   q5   q3
  q3   q4   q5
  q4   q5   q5
  q5   q5   q5
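A transition table like this one is all a program needs to run the automaton. The following sketch stores δ for the abba accepter as a nested dictionary and walks it one input letter at a time; the names DELTA, START, FINAL and accepts are chosen for the example.

```python
# Transition table of the abba accepter: DELTA[state][letter] -> next state.
DELTA = {
    "q0": {"a": "q1", "b": "q5"},
    "q1": {"a": "q5", "b": "q2"},
    "q2": {"a": "q5", "b": "q3"},
    "q3": {"a": "q4", "b": "q5"},
    "q4": {"a": "q5", "b": "q5"},
    "q5": {"a": "q5", "b": "q5"},
}
START = "q0"
FINAL = {"q4"}

def accepts(s: str) -> bool:
    """Run the DFA on s; raises KeyError for letters outside {a, b}."""
    state = START
    for letter in s:
        state = DELTA[state][letter]
    return state in FINAL

print(accepts("abba"))   # True
print(accepts("aba"))    # False
print(accepts(""))       # False: q0 is not a final state
```

Adding q0 to FINAL would make the automaton accept the empty string as well.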

Complications

  1. "1234" is a NUMBER, but what about the "123" in "1234", or the "23", etc.? Also, the scanner must recognize many tokens, not one, only stopping at end of file.

  2. "if" is a keyword or reserved word IF, but "if" is also defined by the reg. exp. for identifier ID. We want to recognize IF.

  3. We want to discard white space and comments.

  4. "123" is a NUMBER but so is "235" and so is "0", just as "a" is an ID and so is "bcd". We want to recognize a token, but add attributes to it.

Complication 1

  1. "1234" is a NUMBER, but what about the "123" in "1234", or the "23", etc.? Also, the scanner must recognize many tokens, not one, only stopping at end of file. So:

  Recognize the largest string defined by some regular expression; only stop reading more input when no longer match is possible. This introduces the need to reconsider a character, as it is the first character of the next token.

  E.g. fname(a,bcd ); would be scanned as
    ID OPEN ID COMMA ID CLOSE SEMI EOF
  Scanning fname would consume (, which would be put back and then recognized as OPEN.
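The longest-match rule and the "put back" step can be sketched with a hand-written loop. This is only an illustration under simplifying assumptions: the token set is limited to what the fname example needs, identifiers are recognized with str.isalpha and str.isalnum rather than a DFA, and the function name scan is made up.

```python
def scan(text: str) -> list:
    """Return the token names for text, always taking the longest identifier."""
    single = {"(": "OPEN", ")": "CLOSE", ",": "COMMA", ";": "SEMI"}
    tokens = []
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch.isspace():
            pos += 1
        elif ch.isalpha():
            # Keep reading for as long as the identifier can still grow.
            end = pos + 1
            while end < len(text) and text[end].isalnum():
                end += 1
            # text[end], if present, was inspected but is not part of this
            # token: it is "put back" and starts the next token.
            tokens.append("ID")
            pos = end
        elif ch in single:
            tokens.append(single[ch])
            pos += 1
        else:
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    tokens.append("EOF")
    return tokens

print(scan("fname(a,bcd );"))
# ['ID', 'OPEN', 'ID', 'COMMA', 'ID', 'CLOSE', 'SEMI', 'EOF']
```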

Complication 2

  2. "if" is a keyword or reserved word IF, but "if" is also defined by the reg. exp. for identifier ID. We want to recognize IF, so:

  Have some way of determining which token (IF or ID) is recognized.

  This can be done using priority, e.g. in scanner generators an earlier definition has a higher priority than a later one.

  By putting the definition for IF before the definition for ID in the input for the scanner generator, we get the desired result.

  What about the string "ifyouleavemenow"?
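One way to combine priority with the longest-match rule is sketched below: every rule is tried, a strictly longer match wins, and on equal length the earlier rule is kept. The rule list, the lower-case-only identifier pattern and the function name first_token are assumptions made for this example, not the behavior of any particular scanner generator.

```python
import re

# Order matters: earlier definitions win ties, so IF precedes ID.
RULES = [
    ("IF", re.compile(r"if")),
    ("ID", re.compile(r"[a-z][a-z0-9]*")),
]

def first_token(text: str) -> tuple:
    """Return (token name, matched text) for the token at the start of text."""
    best_name, best_len = None, 0
    for name, regex in RULES:
        m = regex.match(text)
        # A strictly longer match wins; on equal length the earlier rule stays.
        if m and len(m.group()) > best_len:
            best_name, best_len = name, len(m.group())
    if best_name is None:
        raise ValueError("no rule matches")
    return best_name, text[:best_len]

print(first_token("if"))                # ('IF', 'if')
print(first_token("in"))                # ('ID', 'in')
print(first_token("ifyouleavemenow"))   # ('ID', 'ifyouleavemenow')
```

Under these two rules together, "ifyouleavemenow" comes out as a single ID, because the ID match is longer than the IF match and priority only breaks ties.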

Complication 3

  3. We want to discard white space and comments and not bother the parser with these. So:

  In scanner generators, we can
    specify white space using a regular expression, e.g. [\t\n ], and return no token, i.e. move to the next
    specify comments using a (NASTY) regular expression and again return no token, move to the next

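Discarding white space and comments amounts to matching them like any other token and then returning nothing. The sketch below assumes C-style /* ... */ comments (the slides do not fix a comment syntax) and uses a well-known regular expression for them, which gives a feel for why the slide calls it nasty. Only NUMBER tokens are recognized, to keep the example short; the function name numbers is made up.

```python
import re

WHITESPACE = re.compile(r"[\t\n ]+")
# Assumed comment syntax: /* ... */ without nesting.
COMMENT = re.compile(r"/\*[^*]*\*+(?:[^/*][^*]*\*+)*/")
NUMBER = re.compile(r"[0-9]+")

def numbers(text: str) -> list:
    """Return the NUMBER lexemes in text, skipping white space and comments."""
    found = []
    pos = 0
    while pos < len(text):
        skipped = WHITESPACE.match(text, pos) or COMMENT.match(text, pos)
        if skipped:
            pos = skipped.end()     # return no token, move to the next
            continue
        m = NUMBER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at position {pos}")
        found.append(m.group())
        pos = m.end()
    return found

print(numbers("12 /* a ** comment */\n\t34"))   # ['12', '34']
```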

Complication 4

  4. "123" is a NUMBER but so is "235" and so is "0", just as "a" is an ID and so is "bcd". We want to recognize a token, but add attributes to it. So:

  Scanners return Symbols, not tokens.
  A Symbol is a (token, tokenValue) pair, e.g. (NUMBER,123) or (ID,"a").

  Often more information is added to a symbol, e.g. line number and position (as we will do in MeggyJava).
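A Symbol can be modelled as a small record. The sketch below is a Python stand-in, not the Symbol class used in MeggyJava: the field names, the 1-based line and pos conventions, and the two-rule scanner are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional, Union

@dataclass(frozen=True)
class Symbol:
    token: str                        # token kind, e.g. "NUMBER" or "ID"
    value: Optional[Union[int, str]]  # attribute: the number or the name
    line: int                         # 1-based line number
    pos: int                          # 1-based column of the first character

RULES = [
    ("NUMBER", re.compile(r"[0-9]+")),
    ("ID", re.compile(r"[a-zA-Z][a-zA-Z0-9]*")),
]

def scan(text: str) -> list:
    symbols = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        col = 0
        while col < len(line):
            if line[col] in " \t":
                col += 1
                continue
            for token, regex in RULES:
                m = regex.match(line, col)
                if m:
                    value = int(m.group()) if token == "NUMBER" else m.group()
                    symbols.append(Symbol(token, value, line_no, col + 1))
                    col = m.end()
                    break
            else:
                raise ValueError(f"unexpected input at line {line_no}, column {col + 1}")
    return symbols

for symbol in scan("123 a\nbcd 0"):
    print(symbol)
```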

(Non) Deterministic Finite State Automata

  A Deterministic Finite State Automaton (DFA) has disjoint character sets on the edges leaving each state, i.e. the choice "which state is next" is deterministic.

  A Non-deterministic Finite State Automaton (NFA) does NOT, i.e. it can have character sets on its edges that overlap (non-empty intersection), and some edges labeled with the empty string ε.

  NFAs are used in the translation from regular expressions to FSAs. E.g. when we combine the reg. exp. for IF with the reg. exp. for ID by just merging the two transition graphs, we would get an NFA.

  NFAs are a first step in creating a DFA for a scanner. The NFA is then transformed into a DFA.

From regular expressions to NFAs

  regexp               NFA
  simple letter "a"    a single edge labeled a
  empty string         a single edge labeled ε
  AB                   concat the NFAs
  A|B                  split, then merge them, using ε edges
  A*                   build a loop with ε edges through the accept state of the NFA for A

[Figure: one small NFA diagram per row; the boxes labeled A and B stand for the NFAs of the subexpressions.]
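These construction rules translate almost line by line into code. The sketch below follows the common Thompson-style variant in which every fragment has one start and one accept state and fragments are glued together with ε edges; the exact shape of the slide's diagrams may differ, and the class and method names are invented for the example. A small accepts method is included only so the result can be tried out.

```python
import itertools

EPS = None  # label used for ε edges

class NFA:
    def __init__(self):
        self.edges = []                  # (source, label, target) triples
        self._ids = itertools.count()

    def _state(self):
        return next(self._ids)

    def letter(self, a):                 # simple letter: one edge labeled a
        s, t = self._state(), self._state()
        self.edges.append((s, a, t))
        return s, t

    def empty(self):                     # empty string: one edge labeled ε
        return self.letter(EPS)

    def concat(self, A, B):              # AB: accept of A continues into B
        self.edges.append((A[1], EPS, B[0]))
        return A[0], B[1]

    def choice(self, A, B):              # A|B: split, then merge
        s, t = self._state(), self._state()
        for start, accept in (A, B):
            self.edges.append((s, EPS, start))
            self.edges.append((accept, EPS, t))
        return s, t

    def star(self, A):                   # A*: loop back, or skip A entirely
        s, t = self._state(), self._state()
        self.edges += [(s, EPS, A[0]), (A[1], EPS, A[0]),
                       (s, EPS, t), (A[1], EPS, t)]
        return s, t

    def accepts(self, fragment, text):
        def closure(states):
            todo, seen = list(states), set(states)
            while todo:
                s = todo.pop()
                for src, label, dst in self.edges:
                    if src == s and label is EPS and dst not in seen:
                        seen.add(dst)
                        todo.append(dst)
            return seen
        current = closure({fragment[0]})
        for ch in text:
            current = closure({dst for src, label, dst in self.edges
                               if src in current and label == ch})
        return fragment[1] in current

nfa = NFA()
# (a | bc)*
frag = nfa.star(nfa.choice(nfa.letter("a"),
                           nfa.concat(nfa.letter("b"), nfa.letter("c"))))
print(nfa.accepts(frag, "abca"))   # True
print(nfa.accepts(frag, "ab"))     # False
```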

The Problem

  DFAs are easy to execute (table-driven interpretation).

  NFAs are easy to build from reg. exps, but hard to execute: we would need some form of guessing, implemented by backtracking.

  To build a DFA from an NFA we avoid the backtracking by taking all choices in the NFA at once: a move with a character or ε gets us to a set of states in the NFA, which will become one state in the DFA.

  We keep doing this until we have exhausted all possibilities.

  This mechanism is called transitive closure. (It ends because there is only a finite set of subsets of NFA states. How many are there?)

Example IF and ID

  let : [a-z]
  dig : [0-9]
  tok : if | id
  if  : "i" "f"
  id  : let (let | dig)*

Example: NFA for IF and ID

[Figure: NFA with states 0 to 8. From the start state 0, ε edges lead to states 1 and 4. The IF branch is 1 -i-> 2 -f-> 3, where 3 accepts IF. The ID branch is 4 -[a-z]-> 5, followed by states 5, 6, 7, 8 connected by ε edges and one edge labeled [a-z 0-9] (the loop for (let | dig)*); 8 accepts ID.]

  IF has priority over ID.

  From 0, with ε we can get to states 1 and 4; this is called an ε-closure.

  We can now simulate the behavior of the NFA and build a table for the DFA, making character moves plus ε-closures.

NFA simulation scanning "in"

[Figure: the NFA for IF and ID, as on the previous slide.]

  DFAstate   NFAstates   Move   Next
  0          0,1,4       i      2,5,6,8
  1          2,5,6,8     n      6,7,8

  Only one of the states in 6,7,8 is an accepting state, an ID accepting state, so "in" is an ID.

NFA simulation scanning "if"

[Figure: the NFA for IF and ID, as before.]

  DFAstate   NFAstates   Move   Next
  0          0,1,4       i      2,5,6,8
  1          2,5,6,8     f      3,6,7,8

  Two of the states in 3,6,7,8 are accepting: an IF accepting state (3) and an ID accepting state (8). IF has priority over ID, so "if" is an IF.

Definitions: edge(s,c) and closure

  edge(s,c): the set of all NFA states reachable from state s by following an edge with character c

  closure(S): the set of all states reachable from S with no chars, i.e. using only ε edges

    closure(S) = T, the smallest set such that T = S ∪ (⋃_{s∈T} edge(s,ε))

  T = S
  repeat
    T' = T
    T = T' ∪ (⋃_{s∈T'} edge(s,ε))
  until T' == T

  This transitive closure algorithm terminates because there is a finite number of states in the NFA.

DFAedge and NFA Simulation

  Suppose we are in DFA state d = {si, sk, sl}.
  By moving with character c from d we reach a set of new NFA states; call this set DFAedge(d,c), a new or already existing DFA state:

    DFAedge(d,c) = closure(⋃_{s∈d} edge(s,c))

  NFA simulation:
    let the input string be c1…ck
    d = closure({s1})      // s1 is the start state of the NFA
    for i from 1 to k
      d = DFAedge(d,ci)
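The definitions of edge, closure and DFAedge can be run on the IF/ID example. In the sketch below the NFA is written out as an edge list. The ε edges among states 5 to 8 and the endpoints of the [a-z 0-9] edge are an assumption, chosen so that the closures agree with the state sets shown in the simulation tables; the original diagram may route them differently. Character classes are represented as strings of allowed characters.

```python
import string

EPS = None
LET, DIG = string.ascii_lowercase, string.digits

# (source, label, target); a label is EPS or a string of allowed characters.
EDGES = [
    (0, EPS, 1), (0, EPS, 4),
    (1, "i", 2), (2, "f", 3),
    (4, LET, 5),
    (5, EPS, 6), (6, EPS, 8),        # assumed ε edges
    (6, LET + DIG, 7), (7, EPS, 6),  # assumed loop for (let | dig)*
]
START = 0
ACCEPT = {3: "IF", 8: "ID"}          # insertion order encodes priority

def edge(s, c):
    """All NFA states reachable from s over one edge labeled c (or ε)."""
    if c is EPS:
        return {dst for src, label, dst in EDGES if src == s and label is EPS}
    return {dst for src, label, dst in EDGES
            if src == s and label is not EPS and c in label}

def closure(S):
    """Grow S along ε edges until nothing new is added."""
    T = set(S)
    while True:
        previous = set(T)
        for s in previous:
            T |= edge(s, EPS)
        if T == previous:
            return T

def dfa_edge(d, c):
    moved = set()
    for s in d:
        moved |= edge(s, c)
    return closure(moved)

def simulate(text):
    """Return the token accepted for the whole of text, or None."""
    d = closure({START})
    for c in text:
        d = dfa_edge(d, c)
    for state, token in ACCEPT.items():
        if state in d:
            return token
    return None

print(sorted(closure({START})))    # [0, 1, 4]
print(simulate("in"))              # ID
print(simulate("if"))              # IF
```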

Constructing a DFA with closure and DFAEdge

  The start state is d1 = closure({s1}), the closure of the start state of the NFA.

  Make new states by moving from existing states with a character c, using DFAEdge(d,c); record these in the transition table.

  Mark accepts in the transition table: a DFA state d is accepting if there is an accepting NFA state in d; decide by priority if d contains more than one accept state.

  Instead of single characters we use non-overlapping (DFA) character classes to keep the table manageable.
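The construction can be sketched as a worklist loop over sets of NFA states. The example below uses the IF/ID NFA with start state 1; as before, the ε edges among states 5 to 8 are an assumed layout consistent with the state sets on the slides. For brevity it records one table entry per character instead of grouping characters into classes, so the table is wider than it needs to be. All names are chosen for the example.

```python
import string

EPS = None
LET, DIG = string.ascii_lowercase, string.digits

# (source, label, target); a label is EPS or a string of allowed characters.
EDGES = [
    (1, EPS, 4), (1, "i", 2), (2, "f", 3),
    (4, LET, 5),
    (5, EPS, 6), (6, EPS, 8),        # assumed ε edges
    (6, LET + DIG, 7), (7, EPS, 6),  # assumed loop for (let | dig)*
]
PRIORITY = [(3, "IF"), (8, "ID")]    # earlier entries win

def closure(S):
    T = set(S)
    changed = True
    while changed:
        new = {dst for src, label, dst in EDGES if src in T and label is EPS}
        changed = not new <= T
        T |= new
    return frozenset(T)

def dfa_edge(d, c):
    return closure({dst for src, label, dst in EDGES
                    if src in d and label is not EPS and c in label})

def build_dfa(start=1):
    d1 = closure({start})
    table, accept, todo = {}, {}, [d1]
    while todo:
        d = todo.pop()
        if d in table:
            continue                 # already an existing DFA state
        table[d] = {}
        for nfa_state, token in PRIORITY:
            if nfa_state in d:
                accept[d] = token
                break
        for c in LET + DIG:
            target = dfa_edge(d, c)
            if target:               # the empty set means "no move"
                table[d][c] = target
                todo.append(target)
    return d1, table, accept

d1, table, accept = build_dfa()
for d in sorted(table, key=sorted):
    print(sorted(d), accept.get(d, "-"))
```

Under these assumptions the loop discovers five DFA states, the same five sets of NFA states that appear in the deck's DFA for IF and ID.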

NFA to DFA (let's build it)

[Figure: the NFA for IF and ID, here with start state 1: 1 -i-> 2 -f-> 3 (accepts IF), and 1 -ε-> 4 -[a-z]-> 5, followed by states 5, 6, 7, 8 with ε edges and an edge labeled [a-z 0-9]; 8 accepts ID.]

NFA to DFA

[Figure: the same NFA, together with the resulting DFA.
  DFA states (as sets of NFA states): 1: {1,4}   2: {2,5,6,8}   3: {3,6,7,8}   4: {6,7,8}   5: {5,6,8}
  DFA edges: 1 -i-> 2;  1 -[a-h j-z]-> 5;  2 -f-> 3;  2 -[a-e g-z 0-9]-> 4;  3, 4 and 5 -[a-z 0-9]-> 4
  Accepting: 2 (ID), 3 (IF), 4 (ID), 5 (ID)]

The transition table for IF ID

  p   NFAstates(p)   i           f           a-h,j-z   a-e,g-z,0-9   a-z,0-9   ACPT
  1   {1,4}          {2,5,6,8}               {5,6,8}
  2   {2,5,6,8}                  {3,6,7,8}             {6,7,8}                 ID
  3   {3,6,7,8}                                                      {6,7,8}   IF
  4   {6,7,8}                                                        {6,7,8}   ID
  5   {5,6,8}                                                        {6,7,8}   ID

Suggested Exercise

  Build an NFA and a DFA for integer and float literals

  dot: "."
  dig: [0-9]
  int-lit: dig+
  float-lit: dig* dot dig+