MTE.1
CSE4100
Midterm Exam Advice and Hints
Prof. Steven A. Demurjian, Sr., Computer Science & Engineering Department, The University of Connecticut, 191 Auditorium Road, Box U-155, Storrs, CT 06269-3155, http://www.engr.uconn.edu/~steve, (860) 486-4818
Dr. Robert LaBarre, United Technologies Research Center, 411 Silver Lane, E. Hartford, CT 06018
Chapter 1: Introduction to Compilers
  Basic Compiler Ideas and Concepts
  The “Big Picture”
Chapter 2: A Simple One-Pass Compiler
  A Look at All Phases of Compilation Process
  From Lexical Analysis Thru Code Generation
FOCUS: Chapter 3: Lexical Analysis
  Specifying/Recognizing Tokens
  Patterns (Regular Expressions) and Lexemes
  Regular Expressions and DFA/NFA
  Algorithms for
  Derivations, Specification, Languages
  Writing Grammars
  Ambiguity, Left Recursion, Left Factoring, Removing epsilon Moves
  Algorithm for Left Recursion Removal
  Top-Down Parsing
  Recursive Descent and Predictive Parsing
  First and Follow Calculation
  Constructing LL(1) Parsing Table
  Ambiguity and Error Handling
Lex and Yacc will not be Tested!
MTE.4
CSE4100
Hints for Taking Exam
Read the Questions Carefully!
Ask Questions if you are Confused!
Answer Questions in Any Order
  Organized to fit on minimum number of pages
  Answer “Easiest” questions for you!
Assess Points per Time Unit
  75 minutes = 75 points
  30 minutes = 30 points; 20 minutes = 20 points
Don't Be Afraid to Not Answer a Question
  60% Correct for 100 Points = 60 Points
  90% Correct for 80 Points = 72 Points
Partial Credit is the Norm
MTE.5
CSE4100
Possible Questions
Open Notes and Open Book
5 to 6 Total Multi-Part Questions
Possibilities…
  Constructive and Algorithm Questions
  Writing and Using Grammar
  Understanding Significance and Relevance of Concepts
Know your Algorithms and Constructs (Regular Expressions, NFA, DFA, CFG)
Show All Work to Receive Partial (Any) Credit
Do Not Jump to Final Answer
Avoid Run-on Explanations
MTE.6
CSE4100
Chapter 3 Excerpted Material
Introducing Basic Terminology

Token    | Sample Lexemes       | Informal Description of Pattern
const    | const                | const
if       | if                   | if
relation | <, <=, =, <>, >, >=  | < or <= or = or <> or >= or >
id       | pi, count, D2        | letter followed by letters and digits
num      | 3.1416, 0, 6.02E23   | any numeric constant
literal  | “core dumped”        | any characters between " and " except "

The token classifies the pattern.
Actual values are critical. Info is:
1. Stored in symbol table
2. Returned to parser
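The token / lexeme / pattern table can be sketched as a small scanner. This is only an illustration, assuming Python's `re` module and regular expressions chosen here to approximate the informal patterns; the names `TOKEN_SPEC`, `KEYWORDS`, and `tokenize` are hypothetical and not from the course material.

```python
import re

# One regular expression per token class; these approximate the informal
# patterns in the table and are assumptions, not the course's definitions.
TOKEN_SPEC = [
    ("num",      r"\d+(?:\.\d+)?(?:E[+-]?\d+)?"),  # 3.1416, 0, 6.02E23
    ("literal",  r'"[^"]*"'),                      # any characters between quotes
    ("relation", r"<=|<>|>=|<|=|>"),               # longer operators listed first
    ("id",       r"[A-Za-z][A-Za-z0-9]*"),         # letter, then letters and digits
    ("skip",     r"\s+"),                          # whitespace is discarded
]
KEYWORDS = {"const", "if"}  # keywords look like ids, so they are checked after matching
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token, lexeme) pairs for the input text."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "skip":
            continue
        if kind == "id" and lexeme in KEYWORDS:
            kind = lexeme  # the token for a keyword is the keyword itself
        tokens.append((kind, lexeme))
    return tokens


print(tokenize('if count <= 6.02E23'))
```

The pair returned for each lexeme mirrors the two uses listed above: the lexeme is what would be stored in the symbol table, and the token is what would be returned to the parser.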
MTE.7
CSE4100
Language Concepts
A language, L, is simply any set of strings over a fixed alphabet.
Given the regular expression: (a (b*c)) | (a (b | c+)?)
Find a transition diagram NFA that recognizes
Construction Algorithm: R.E. → NFA
Construction Process:
1st: Identify subexpressions of the regular expression
  ε
  alphabet symbols
  r | s
  rs
  r*
2nd: Characterize “pieces” of NFA for each subexpression
MTE.25
CSE4100
Piecing Together NFAs
1. For ε in the regular expression, construct NFA
   [Diagram: start → i, one ε-move from i to f; accepts L(ε)]
2. For a in the regular expression, construct NFA
   [Diagram: start → i, one edge labeled a from i to f; accepts L(a)]
MTE.26
CSE4100
Piecing Together NFAs – continued (1)
3.(a) If s, t are regular expressions and N(s), N(t) their NFAs, then s|t has NFA:
[Diagram: start → i; N(s) and N(t) in parallel between i and f; accepts L(s) ∪ L(t)]
where i and f are new start / final states, and ε-moves are introduced from i to the old start states of N(s) and N(t) as well as from all of their final states to f.
MTE.27
CSE4100
Piecing Together NFAs – continued (2)
3.(b) If s, t are regular expressions and N(s), N(t) their NFAs, then st (concatenation) has NFA:
[Diagram: start → i, N(s) followed by N(t), ending in f; accepts L(s) L(t)]
Alternative:
[Diagram: start → i, N(s) and N(t) with overlap, ending in f]
where i is the start state of N(s) (or new under the alternative) and f is the final state of N(t) (or new). Overlap maps final states of N(s) to start state of N(t).
MTE.28
CSE4100
Piecing Together NFAs – continued (3)
3.(c) If s is a regular expression and N(s) its NFA, then s* (Kleene star) has NFA:
[Diagram: start → i, N(s), f, connected by the ε-moves listed below]
where: i is new start state and f is new final state
  ε-move i to f (to accept null string)
  ε-moves i to old start, old final(s) to f
  ε-move old final to old start (WHY?)
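The construction rules 1 through 3.(c) can be sketched in code. This is a minimal illustration, not the course's implementation: the class `NFA` and the helpers `symbol`, `union`, `concat`, `star`, and `accepts` are hypothetical names, ε is represented by the label `None`, and concatenation is done with an extra ε-move from the final state of N(s) to the start state of N(t), which is an assumption standing in for the overlap variant.

```python
from itertools import count

_ids = count()  # every call hands out a fresh number, so state names stay unique


class NFA:
    def __init__(self, start, final, edges):
        self.start, self.final = start, final
        self.edges = edges  # list of (source, label, target); label None means ε


def symbol(a=None):
    """Rules 1 and 2: a two-state NFA for ε (a is None) or for a single symbol a."""
    i, f = next(_ids), next(_ids)
    return NFA(i, f, [(i, a, f)])


def union(s, t):
    """Rule 3.(a): new i and f, ε-moves into both machines and out of both."""
    i, f = next(_ids), next(_ids)
    extra = [(i, None, s.start), (i, None, t.start), (s.final, None, f), (t.final, None, f)]
    return NFA(i, f, s.edges + t.edges + extra)


def concat(s, t):
    """Rule 3.(b), using an ε-move in place of overlapping the two states."""
    return NFA(s.start, t.final, s.edges + t.edges + [(s.final, None, t.start)])


def star(s):
    """Rule 3.(c): skip edge i to f, entry and exit edges, and the loop-back edge."""
    i, f = next(_ids), next(_ids)
    extra = [(i, None, f), (i, None, s.start), (s.final, None, f), (s.final, None, s.start)]
    return NFA(i, f, s.edges + extra)


def eps_closure(nfa, states):
    """All states reachable from the given states using ε-moves only."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for src, label, dst in nfa.edges:
            if src == q and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(nfa, text):
    """Simulate the NFA on text by tracking the set of currently active states."""
    current = eps_closure(nfa, {nfa.start})
    for ch in text:
        moved = {dst for src, label, dst in nfa.edges if src in current and label == ch}
        current = eps_closure(nfa, moved)
    return nfa.final in current


# (ab*c) | (a(b|c*)), built bottom-up from its subexpressions
left = concat(symbol("a"), concat(star(symbol("b")), symbol("c")))
right = concat(symbol("a"), union(symbol("b"), star(symbol("c"))))
example = union(left, right)
print(accepts(example, "abbc"), accepts(example, "abcc"))
```

Removing the loop-back edge in `star` and re-running the simulation on a string such as "abbc" is one way to explore the WHY? question above.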
MTE.29
CSE4100
Properties of Construction
Let r be a regular expression, with NFA N(r), then
1. N(r) has at most 2*(#symbols + #operators of r) states
2. N(r) has exactly one start and one accepting state
3. Each state of N(r) has at most one outgoing edge on a symbol a and at most two outgoing ε-moves
4. BE CAREFUL to assign unique names to all states!
MTE.30
CSE4100
Detailed Example
See example 3.16 in textbook for (a | b)*abb
2nd Example - (ab*c) | (a(b|c*))
Parse Tree for this regular expression:
[Parse tree figure: root r13 = r5 | r12; r5 = r3 r4; r4 = r1 r2; r1 = r0*; r12 = r11 r10; r10 = ( r9 ); r9 = r7 | r8; r8 = r6*; leaves r3 = a, r0 = b, r2 = c, r11 = a, r7 = b, r6 = c]
What is the NFA? Let’s construct it!
MTE.31
CSE4100
Detailed Example – Construction (1)
r0: b
r1: r0* (b*)
r2: c
r3: a
r4: r1 r2 (b*c)
r5: r3 r4 (ab*c)
[NFA diagram for each subexpression]
MTE.32
CSE4100
Detailed Example – Construction (2)
r6: c
r7: b
r8: r6* (c*)
r9: r7 | r8 (b | c*)
r10: ( r9 )
r11: a
r12: r11 r10 (a(b | c*))
[NFA diagram for each subexpression]
MTE.33
CSE4100
Detailed Example – Final Step
Data structures
  A stack: holds the symbols to treat
  A lookahead window to choose a prediction
Algorithm
  Startup
    Initialize stack to start symbol (a non-terminal)
    Initialize lookahead window at start of token stream
  Recursive Process
    Find out if front of window and top of stack match
    If match
      – Consume the symbols
    No match
      – Pop / Select / Push
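The stack-and-lookahead algorithm can be sketched as a table-driven loop. This is an illustration under stated assumptions: the parsing table is hard-coded for the expression grammar used later in these slides (E → TE’, E’ → +TE’ | ε, T → FT’, T’ → *FT’ | ε, F → (E) | id), and the names `TABLE`, `NONTERMINALS`, and `parse` are hypothetical.

```python
# Parsing table M[A, a] for the expression grammar; [] stands for an ε right-hand side.
TABLE = {
    ("E", "id"): ["T", "E'"],      ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"],      ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"], ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"],           ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def parse(tokens, start="E"):
    """Return the productions applied, in order (a leftmost derivation)."""
    stream = list(tokens) + ["$"]   # $ marks end of input
    stack = ["$", start]            # startup: stack holds the start symbol
    pos = 0                         # lookahead window of one token
    used = []
    while stack:
        top = stack.pop()
        look = stream[pos]
        if top in NONTERMINALS:
            # No match possible with a non-terminal: pop / select / push
            rhs = TABLE.get((top, look))
            if rhs is None:
                raise SyntaxError(f"no table entry for M[{top}, {look}]")
            used.append((top, tuple(rhs)))
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        elif top == look:
            pos += 1                     # match: consume the symbols
        else:
            raise SyntaxError(f"expected {top!r} but found {look!r}")
    return used


for head, body in parse(["id", "+", "id", "*", "id"]):
    print(head, "->", " ".join(body) or "ε")
```

The printed productions are exactly the steps of a leftmost derivation of id + id * id, because the non-terminal on top of the stack is always the leftmost one still to be expanded.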
MTE.61
CSE4100
Lookahead Size and Languages
With k tokens of lookahead
  Some languages can be parsed (this way)
  Some languages cannot be parsed
What about k+1 tokens?
  More languages can be parsed....
  So the set of languages recognized with k is a subset of the set of languages recognized with k+1!
  We have a hierarchy!
Still...
  In practice LL(1) should be enough.
MTE.62
CSE4100
Top-Down Parsing
Identify a leftmost derivation for an input string
Why?
By always replacing the leftmost non-terminal symbol via a production rule, we are guaranteed of developing a parse tree in a left-to-right fashion consistent with scanning the input.
A ⇒ aBc ⇒ adDc ⇒ adec (scan a, scan d, scan e, scan c - accept!)
Leftmost Derivation for the Example
The leftmost derivation for the example is as follows:
E ⇒ TE’ ⇒ FT’E’ ⇒ id T’E’ ⇒ id E’ ⇒ id + TE’ ⇒ id + FT’E’
  ⇒ id + id T’E’ ⇒ id + id * FT’E’ ⇒ id + id * id T’E’
  ⇒ id + id * id E’ ⇒ id + id * id
MTE.68
CSE4100
What’s the Missing Puzzle Piece?
Constructing the Parsing Table M!
1st: Calculate First & Follow for Grammar
2nd: Apply Construction Algorithm for Parsing Table
Conceptual Perspective:
First: Let α be a string of grammar symbols. First(α) are the first terminals that can appear in α in any possible derivation. NOTE: If α ⇒* ε, then ε is in First(α).
Follow: Let A be a non-terminal. Follow(A) is the set of terminals a that can appear directly to the right of A in some sentential form (S ⇒* αAaβ, for some α and β). NOTE: If S ⇒* αA, then $ is in Follow(A).
3. If X is a non-terminal, and X → Y1Y2…Yk is a production rule
  Place First(Y1) in First(X)
  if Y1 ⇒* ε, Place First(Y2) in First(X)
  if Y2 ⇒* ε, Place First(Y3) in First(X)
  …
  if Yk-1 ⇒* ε, Place First(Yk) in First(X)
NOTE: As soon as Yi does not derive ε, Stop.
May repeat 1, 2, and 3, above for each Yj
MTE.70
CSE4100
Computing First(X): All Grammar Symbols - continued
Informally, suppose we want to compute
First(X1 X2 … Xn) = First(X1) “+”
  First(X2) if ε is in First(X1) “+”
  First(X3) if ε is in First(X2) “+”
  …
  First(Xn) if ε is in First(Xn-1)
Note 1: Only add ε to First(X1 X2 … Xn) if ε is in First(Xi) for all i
Note 2: For First(X1), if X1 → Z1 Z2 … Zm, then we need to compute First(Z1 Z2 … Zm)!
MTE.71
CSE4100
Example
Computing First for:
  E → TE’
  E’ → + TE’ | ε
  T → FT’
  T’ → * FT’ | ε
  F → ( E ) | id
First(E) = First(TE’)
First(TE’) = First(T) “+” First(E’): Not First(E’) since T does not derive ε
First(T) = First(F) “+” First(T’): Not First(T’) since F does not derive ε
First(F) = First((E)) “+” First(id) = “(“ and “id”
Overall:
  First(E) = { ( , id } = First(F)
  First(E’) = { + , ε }
  First(T’) = { * , ε }
  First(T) = First(F) = { ( , id }
MTE.72
CSE4100
Example 2
Given the production rules:
  S → i E t S S’ | a
  S’ → e S | ε
  E → b
Verify that
  First(S) = { i, a }
  First(S’) = { e, ε }
  First(E) = { b }
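Both First examples can be checked mechanically. The sketch below is one way to code the rules, assuming a grammar is stored as a dictionary from each non-terminal to its list of right-hand sides, with `[]` for an ε-production; the names `first_of_string`, `first_sets`, `EXPR`, and `IFELSE` are hypothetical. It repeats the rules until no First set grows, which is a standard fixed-point formulation rather than anything stated on the slides.

```python
EPS = "ε"


def first_of_string(symbols, first):
    """First(X1 X2 … Xn): keep adding First(Xi) while every earlier Xi can derive ε."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}  # terminal a: First(a) = {a}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result            # as soon as Xi does not derive ε, stop
    result.add(EPS)                  # every Xi derives ε (or the string is empty)
    return result


def first_sets(grammar):
    """Apply the rule for X → Y1 Y2 … Yk to every production until nothing changes."""
    first = {nonterminal: set() for nonterminal in grammar}
    changed = True
    while changed:
        changed = False
        for nonterminal, alternatives in grammar.items():
            for rhs in alternatives:
                before = len(first[nonterminal])
                first[nonterminal] |= first_of_string(rhs, first)
                if len(first[nonterminal]) != before:
                    changed = True
    return first


EXPR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
IFELSE = {
    "S": [["i", "E", "t", "S", "S'"], ["a"]],
    "S'": [["e", "S"], []],
    "E": [["b"]],
}

for name, sets in first_sets(EXPR).items():
    print(f"First({name}) =", sorted(sets))
for name, sets in first_sets(IFELSE).items():
    print(f"First({name}) =", sorted(sets))
```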
MTE.73
CSE4100
Computing Follow(A): All Non-Terminals
1. Place $ in Follow(S), where S is the start symbol and $ signals end of input
2. If there is a production A → αBβ, then everything in First(β) is in Follow(B) except for ε.
3. If A → αB is a production, or A → αBβ and β ⇒* ε (First(β) contains ε), then everything in Follow(A) is in Follow(B)
(Whatever followed A must follow B, since nothing follows B from the production rule)
We’ll calculate Follow for two grammars.
MTE.74
CSE4100
Example
Compute Follow for:
  E → TE’
  E’ → + TE’ | ε
  T → FT’
  T’ → * FT’ | ε
  F → ( E ) | id
• Follow(E) - contains $ since E is the start symbol. Also, since F → (E) then First(“)”) is in Follow(E). Thus Follow(E) = { ) , $ }
• Follow(E’) : E → TE’ implies Follow(E) is in Follow(E’), and Follow(E’) = { ) , $ }
• Follow(T) : E → TE’ implies put in First(E’). Since E’ ⇒* ε, put in Follow(E). Since E’ → +TE’, put in First(E’), and since E’ ⇒* ε, put in Follow(E’). Thus Follow(T) = { +, ), $ }.
• Follow(T’)
• Follow(F)
You do these!
MTE.75
CSE4100
Computing Follow: 2nd Example
  S → i E t S S’ | a
  S’ → e S | ε
  E → b
Recall:
  First(S) = { i, a }
  First(S’) = { e, ε }
  First(E) = { b }
Follow(S) – Contains $, since S is start symbol
  Since S → i E t S S’, put in First(S’) – not ε
  Since S’ ⇒* ε, put in Follow(S)
  Since S’ → eS, put in Follow(S’)
  So…. Follow(S) = { e, $ }
Follow(S’) = Follow(S) HOW?
Follow(E) = { t }
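The three Follow rules can also be run to a fixed point, which gives a way to check both examples, including the two sets left as an exercise. This is a sketch only: the grammar encoding (`[]` for ε) and the names `first_of_string`, `first_sets`, `follow_sets`, `EXPR`, and `IFELSE` are assumptions made for illustration.

```python
EPS, END = "ε", "$"


def first_of_string(symbols, first):
    """First of a string of grammar symbols; terminals are their own First set."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def first_sets(grammar):
    first = {nonterminal: set() for nonterminal in grammar}
    changed = True
    while changed:
        changed = False
        for nonterminal, alternatives in grammar.items():
            for rhs in alternatives:
                before = len(first[nonterminal])
                first[nonterminal] |= first_of_string(rhs, first)
                changed = changed or len(first[nonterminal]) != before
    return first


def follow_sets(grammar, start):
    first = first_sets(grammar)
    follow = {nonterminal: set() for nonterminal in grammar}
    follow[start].add(END)                        # rule 1: $ follows the start symbol
    changed = True
    while changed:
        changed = False
        for a, alternatives in grammar.items():
            for rhs in alternatives:
                for idx, b in enumerate(rhs):
                    if b not in grammar:
                        continue                  # Follow is defined for non-terminals only
                    beta_first = first_of_string(rhs[idx + 1:], first)
                    new = beta_first - {EPS}      # rule 2: First(β) without ε
                    if EPS in beta_first:         # rule 3: β is empty or derives ε
                        new |= follow[a]
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow


EXPR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
IFELSE = {
    "S": [["i", "E", "t", "S", "S'"], ["a"]],
    "S'": [["e", "S"], []],
    "E": [["b"]],
}

for name, sets in follow_sets(EXPR, "E").items():
    print(f"Follow({name}) =", sorted(sets))
for name, sets in follow_sets(IFELSE, "S").items():
    print(f"Follow({name}) =", sorted(sets))
```

Try Follow(T’) and Follow(F) by hand before comparing with the printed sets.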
MTE.76
CSE4100
Motivation Behind First & Follow
First: Is used to indicate the relationship between non-terminals (in the stack) and input symbols (in input stream)
Example: If A → α, and a is in First(α), then when a = input, replace A with α.
(a is one of the first symbols of α, so when A is on the stack and a is input, POP A and PUSH α.)
Follow: Is used when First has a conflict, to resolve choices. When α → ε or α ⇒* ε, then what follows A dictates the next choice to be made.
Example: If A → α, and b is in Follow(A), then when α ⇒* ε and b is an input character, we expand A with α, which will eventually expand to ε, of which b follows!
1 : Use “1” input symbol as lookahead in conjunction with stack to decide on the parsing action
LL(1) grammars have no multiply-defined entries in the parsing table.
Properties of LL(1) grammars:
• Grammar can’t be ambiguous or left recursive
• Grammar is LL(1) when, for A → α | β:
  1. α & β do not derive strings starting with the same terminal a
  2. Either α or β can derive ε, but not both.
Note: It may not be possible for a grammar to be manipulated into an LL(1) grammar
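The "no multiply-defined entries" test can be sketched directly: fill M[A, a] from First and Follow and report any cell that receives more than one production. The sketch assumes the First and Follow sets worked out for the two example grammars on the earlier slides (Follow(T’) and Follow(F) are worked here, since the slides leave them as an exercise); `build_table`, `conflicts`, and the data names are hypothetical.

```python
EPS = "ε"


def first_of_string(symbols, first):
    """First of a right-hand side; a symbol missing from `first` is a terminal."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table(grammar, first, follow):
    """M[A, a] as a dict from (A, a) to the list of right-hand sides placed there."""
    table = {}
    for a, alternatives in grammar.items():
        for rhs in alternatives:
            rhs_first = first_of_string(rhs, first)
            targets = rhs_first - {EPS}       # A → α goes under every a in First(α)
            if EPS in rhs_first:
                targets |= follow[a]          # and under Follow(A) when α can vanish
            for terminal in targets:
                table.setdefault((a, terminal), []).append(rhs)
    return table


def conflicts(table):
    """Cells with more than one production; a non-empty result means not LL(1)."""
    return {cell: rhss for cell, rhss in table.items() if len(rhss) > 1}


EXPR = {"E": [["T", "E'"]], "E'": [["+", "T", "E'"], []], "T": [["F", "T'"]],
        "T'": [["*", "F", "T'"], []], "F": [["(", "E", ")"], ["id"]]}
EXPR_FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
              "T'": {"*", EPS}, "F": {"(", "id"}}
EXPR_FOLLOW = {"E": {")", "$"}, "E'": {")", "$"}, "T": {"+", ")", "$"},
               "T'": {"+", ")", "$"}, "F": {"+", "*", ")", "$"}}

IFELSE = {"S": [["i", "E", "t", "S", "S'"], ["a"]], "S'": [["e", "S"], []], "E": [["b"]]}
IFELSE_FIRST = {"S": {"i", "a"}, "S'": {"e", EPS}, "E": {"b"}}
IFELSE_FOLLOW = {"S": {"e", "$"}, "S'": {"e", "$"}, "E": {"t"}}

expr_table = build_table(EXPR, EXPR_FIRST, EXPR_FOLLOW)
ifelse_table = build_table(IFELSE, IFELSE_FIRST, IFELSE_FOLLOW)
print("expression grammar conflicts:", conflicts(expr_table))
print("if-then-else grammar conflicts:", conflicts(ifelse_table))
```

For the if-then-else grammar the cell M[S’, e] receives both S’ → eS and S’ → ε, because e is in First(eS) and also in Follow(S’). That is property 2 failing in practice, and it is why that grammar is not LL(1) as written.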
MTE.89
CSE4100
Error Recovery
[Figure: model of the predictive parser. Input: a + b $. Stack: X Y Z $. The Predictive Parsing Program reads the input and the stack, consults Parsing Table M[A,a], and produces Output.]
When Do Errors Occur? Recall Predictive Parser Function:
1. If X is a terminal and it doesn’t match input.
2. If M[ X, Input ] is empty – No allowable actions
Consider two recovery techniques:
A. Panic Mode
B. Phase-level Recovery
MTE.90
CSE4100
Panic Mode Recovery
Augment parsing table with action that attempts to realign / synchronize token stream with the expected input.
Suppose : A on top of stack doesn’t mesh with current input symbol
1. Use Follow(A) to remove input tokens – sync (discard)
2. Use First(A) to determine when to restart parsing
3. Incorporate higher level language concepts (begin/end, while, repeat/until) to sync actions, so that we don’t skip tokens unnecessarily.
Other actions:
4. When A → ε, use it to manipulate stack to postpone error detection
5. Use non-matching terminal on stack as token that is inserted into input.
MTE.91
CSE4100
Revised Parsing Table / Example

Non-terminal | INPUT SYMBOL
             | id      | +         | *         | (       | )       | $
E            | E→TE’   |           |           | E→TE’   | synch   | synch
E’           |         | E’→+TE’   |           |         | E’→ε    | E’→ε
T            | T→FT’   | synch     |           | T→FT’   | synch   | synch
T’           |         | T’→ε      | T’→*FT’   |         | T’→ε    | T’→ε
F            | F→id    | synch     | synch     | F→(E)   | synch   | synch

synch entries: From Follow sets. Pop stack entry – T or NT
Blank entries: Skip input symbol
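The revised table can be exercised with a small driver that applies the two recovery actions: skip the input symbol on a blank entry, pop the stack entry on synch. This is a sketch under assumptions, not the course's parser: `PRODUCTIONS`, `FOLLOW`, `SYNCH`, and `parse_with_recovery` are hypothetical names, the message wording is invented, and the handling of leftover input once only $ remains on the stack is a choice made here.

```python
NONTERMINALS = {"E", "E'", "T", "T'", "F"}
PRODUCTIONS = {
    ("E", "id"): ["T", "E'"],      ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"],      ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"], ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"],           ("F", "("): ["(", "E", ")"],
}
# synch goes in every Follow(A) column that has no production.
# E' and T' are omitted: all of their Follow columns already hold ε-productions.
FOLLOW = {"E": {")", "$"}, "T": {"+", ")", "$"}, "F": {"+", "*", ")", "$"}}
SYNCH = {(nt, t) for nt, terms in FOLLOW.items() for t in terms if (nt, t) not in PRODUCTIONS}


def parse_with_recovery(tokens, start="E"):
    """Parse to the end of input, returning a list of error messages (empty if none)."""
    stream = list(tokens) + ["$"]
    stack = ["$", start]
    pos, errors = 0, []
    while stack:
        top, look = stack[-1], stream[pos]
        if top in NONTERMINALS:
            if (top, look) in PRODUCTIONS:
                stack.pop()
                stack.extend(reversed(PRODUCTIONS[(top, look)]))
            elif (top, look) in SYNCH:
                errors.append(f"synch: pop {top} on lookahead {look!r}")
                stack.pop()                     # give up on this non-terminal
            else:
                errors.append(f"skip: discard {look!r} while expecting {top}")
                pos += 1                        # blank entry: skip input symbol
        elif top == look:
            stack.pop()                         # terminal (or $) matches: consume it
            pos += 1
        elif top == "$":
            errors.append(f"skip: discard extra input {look!r}")
            pos += 1
        else:
            errors.append(f"pop unmatched terminal {top!r} on lookahead {look!r}")
            stack.pop()
    return errors


print(parse_with_recovery(["id", "*", "+", "id"]))
```

On id * + id the parser reaches F with lookahead +, finds synch, pops F, and then finishes the rest of the input normally, so one error is reported instead of the parse being abandoned.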
MTE.92
CSE4100
Skip & Synch
Meaning
  Skip: Discard input symbol
  Synch: Pop top of stack
Messages
  Constructed based on lookahead and non-terminal
Example
  NT = F
  Lookahead = +
  Expecting a FACTOR. Got + for a Term. So a