CS400 Compiler Construction
Sheldon X. Liang, Ph.D.
Computer Science at Azusa Pacific University
Azusa Pacific University, Azusa, CA 91702, Tel: (800) 825-5278 Department of Computer Science, http://www.apu.edu/clas/computerscience/
July 3, 2022, Azusa, CA
Keep in mind the following questions
• Token
– Lexical units
– Atomic parse element
– Abstracted in syntax, e.g., Id
• Lexeme
– Specific string making up a token
– Value / attribute related to a token
– Concrete in language, e.g., Amt
• Spec of patterns for tokens
– Alphabet Σ: a finite set
– String s: a finite sequence of symbols from Σ
– Language: a specific set of strings
Tokens, Patterns, and Lexemes
• A token is a classification of lexical units
– For example: id and num
• Lexemes are the specific character strings that make up a token
– For example: abc and 123
• Patterns are rules describing the set of lexemes belonging to a token
– For example: “letter followed by letters and digits” and “non-empty sequence of digits”
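The relationship between tokens, patterns, and lexemes can be sketched in a few lines of Python. This is only an illustration: the table `TOKEN_PATTERNS`, the helper `tokenize`, and the extra `ws` whitespace class are assumptions made for the example, not part of the course material.

```python
import re

# Hypothetical token specification: each token name is paired with its pattern.
TOKEN_PATTERNS = [
    ("id", r"[A-Za-z][A-Za-z0-9]*"),   # letter followed by letters and digits
    ("num", r"[0-9]+"),                # non-empty sequence of digits
    ("ws", r"\s+"),                    # whitespace (assumed here, skipped below)
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_PATTERNS))

def tokenize(text):
    """Return (token, lexeme) pairs; raise ValueError if no pattern matches."""
    pairs = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"no token pattern matches at position {pos}")
        if m.lastgroup != "ws":
            pairs.append((m.lastgroup, m.group()))  # token class, concrete lexeme
        pos = m.end()
    return pairs

print(tokenize("abc 123 Amt"))
# [('id', 'abc'), ('num', '123'), ('id', 'Amt')]
```

Here `id` and `num` are the tokens, the regular expressions are the patterns, and `abc`, `123`, and `Amt` are the lexemes.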
Specification of Patterns for Tokens: Definitions
• An alphabet Σ is a finite set of symbols (characters)
• A string s is a finite sequence of symbols from Σ
– |s| denotes the length of string s
– ε denotes the empty string, thus |ε| = 0
• A language is a specific set of strings over some fixed alphabet Σ
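As a rough illustration of the definitions of alphabet, string, and language, the sketch below models an alphabet as a Python set of characters, a string as a `str`, and a language as a set of strings. The alphabet {a, b} and the names `SIGMA`, `EPSILON`, `LANGUAGE`, and `is_string_over` are hypothetical choices for this example.

```python
# Hypothetical example alphabet {a, b}; all names here are illustrative only.
SIGMA = {"a", "b"}
EPSILON = ""          # the empty string, whose length is 0

def is_string_over(s, alphabet):
    """True if every symbol of s is drawn from the alphabet."""
    return all(ch in alphabet for ch in s)

# A language is a set of strings over the alphabet; this one happens to be finite.
LANGUAGE = {EPSILON, "a", "ab", "abb"}

print(len("abb"), len(EPSILON))                                      # 3 0
print(is_string_over("abba", SIGMA), is_string_over("abc", SIGMA))   # True False
print(all(is_string_over(s, SIGMA) for s in LANGUAGE))               # True
```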
Specification of Patterns for Tokens: String Operations
• The concatenation of two strings x and y is denoted by xy
• The exponentiation of a string s is defined by
– s⁰ = ε
– sⁱ = sⁱ⁻¹s for i > 0
– note that sε = εs = s
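The string operations can be tried out directly, since Python's `+` on `str` is concatenation and `""` plays the role of the empty string. The function name `power` is an assumption of this sketch; it follows the recursive definition rather than using Python's built-in `s * i`.

```python
def power(s, i):
    """s^i by the recursive definition: s^0 is the empty string, s^i = s^(i-1) s."""
    if i < 0:
        raise ValueError("exponent must be non-negative")
    return "" if i == 0 else power(s, i - 1) + s

x, y = "ab", "c"
print(x + y)                    # concatenation xy -> abc
print(power("ab", 3))           # ababab
print("" + x == x + "" == x)    # the empty string is the identity -> True
```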
Specification of Patterns for Tokens: Language Operations
• Union: L ∪ M = {s | s ∈ L or s ∈ M}
• Concatenation: LM = {xy | x ∈ L and y ∈ M}
• Exponentiation: L⁰ = {ε}; Lⁱ = Lⁱ⁻¹L
• Kleene closure: L* = ∪ i=0,…,∞ Lⁱ
• Positive closure: L⁺ = ∪ i=1,…,∞ Lⁱ
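The language operations map naturally onto Python sets of strings. The sketch below is an assumption-laden illustration: the function names are invented for the example, and because the Kleene and positive closures are infinite in general, `closure_up_to` only computes the union of the powers up to a chosen bound n.

```python
def union(L, M):
    return L | M

def concat(L, M):
    return {x + y for x in L for y in M}

def power(L, i):
    """L^0 contains only the empty string; L^i = L^(i-1) L."""
    result = {""}
    for _ in range(i):
        result = concat(result, L)
    return result

def closure_up_to(L, n, positive=False):
    """Finite approximation of the Kleene (or positive) closure: powers up to n."""
    start = 1 if positive else 0
    result = set()
    for i in range(start, n + 1):
        result |= power(L, i)
    return result

L, M = {"a", "b"}, {"c"}
print(sorted(union(L, M)))                     # ['a', 'b', 'c']
print(sorted(concat(L, M)))                    # ['ac', 'bc']
print(sorted(power(L, 2)))                     # ['aa', 'ab', 'ba', 'bb']
print(len(closure_up_to(L, 2)))                # 7 (includes the empty string)
print(len(closure_up_to(L, 2, positive=True))) # 6
```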
Specification of Patterns for Tokens: Regular Expressions
• Basis symbols:
– ε is a regular expression denoting language {ε}
– a ∈ Σ is a regular expression denoting {a}
• If r and s are regular expressions denoting languages L(r) and M(s) respectively, then
– r|s is a regular expression denoting L(r) ∪ M(s)
– rs is a regular expression denoting L(r)M(s)
– r* is a regular expression denoting L(r)*
– (r) is a regular expression denoting L(r)
• A language defined by a regular expression is called a regular set
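The four construction rules have direct counterparts in Python's standard `re` syntax, which makes them easy to experiment with. This is a loose illustration: `re` supports features beyond formal regular expressions, and the helper `in_language` and the sample pattern `(a|b)*abb` are choices made for this sketch.

```python
import re

# One pattern per rule: alternation r|s, concatenation rs,
# Kleene closure r*, and grouping (r) combined with the others.
alternation = re.compile(r"a|b")
concatenation = re.compile(r"ab")
closure = re.compile(r"a*")
combined = re.compile(r"(a|b)*abb")

def in_language(pattern, s):
    """True if the whole string s belongs to the language of the pattern."""
    return pattern.fullmatch(s) is not None

print(in_language(alternation, "a"), in_language(alternation, "ab"))  # True False
print(in_language(concatenation, "ab"))                               # True
print(in_language(closure, ""), in_language(closure, "aaa"))          # True True
print(in_language(combined, "aabb"), in_language(combined, "abba"))   # True False
```

`fullmatch` is used rather than `search` so that membership is tested for the entire string, which is what "s is in L(r)" means.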
Nondeterministic Finite Automata
• An NFA is a 5-tuple (S, Σ, δ, s₀, F) where
– S is a finite set of states
– Σ is a finite set of symbols, the alphabet
– δ is a mapping from S × Σ to a set of states
– s₀ ∈ S is the start state
– F ⊆ S is the set of accepting (or final) states
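The 5-tuple can be written down almost literally as a small Python data structure. The sketch below is a minimal illustration, not a reference implementation: the class and field names are assumptions, the example machine (one that accepts strings over {a, b} ending in abb) is chosen for the demo, and ε-moves are left out to keep the simulation short.

```python
from dataclasses import dataclass

@dataclass
class NFA:
    states: frozenset      # S
    alphabet: frozenset    # the alphabet
    delta: dict            # maps (state, symbol) to a set of next states
    start: int             # start state
    accepting: frozenset   # F

    def accepts(self, text):
        """Track every state the NFA could be in after each input symbol."""
        current = {self.start}
        for symbol in text:
            current = {t for s in current
                       for t in self.delta.get((s, symbol), set())}
        return bool(current & self.accepting)

# Hypothetical example: strings over {a, b} that end in "abb".
ends_in_abb = NFA(
    states=frozenset({0, 1, 2, 3}),
    alphabet=frozenset({"a", "b"}),
    delta={(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}, (2, "b"): {3}},
    start=0,
    accepting=frozenset({3}),
)

print(ends_in_abb.accepts("aabb"))  # True
print(ends_in_abb.accepts("ab"))    # False
```

The nondeterminism shows up in `delta[(0, "a")]`, which maps to two states; the simulation handles it by carrying a set of current states.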
Conversion of an NFA into a DFA
• The subset construction algorithm converts an NFA into a DFA using:
– ε-closure(s) = {s} ∪ {t | s →ε … →ε t}
– ε-closure(T) = ∪ s∈T ε-closure(s)
– move(T, a) = {t | s →a t and s ∈ T}
• The algorithm produces:
– Dstates is the set of states of the new DFA consisting of sets of states of the NFA
– Dtran is the transition table of the new DFA
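A compact sketch of the subset construction, assuming the NFA is stored as a dictionary from (state, symbol) to a set of states and that ε-moves are keyed by `None`. The function names follow the slide's ε-closure, move, Dstates, and Dtran, but the representation and the example NFA (a Thompson-style machine for (a|b)*abb) are assumptions made for this illustration.

```python
EPS = None  # key used for ε-transitions (an assumption of this sketch)

def eps_closure(nfa, T):
    """All states reachable from the set T using only ε-transitions."""
    stack, closure = list(T), set(T)
    while stack:
        s = stack.pop()
        for t in nfa.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def move(nfa, T, a):
    """States reachable from some state in T by one transition on symbol a."""
    return {t for s in T for t in nfa.get((s, a), ())}

def subset_construction(nfa, start, alphabet):
    start_set = eps_closure(nfa, {start})
    dstates, dtran = [start_set], {}
    unmarked = [start_set]
    while unmarked:
        T = unmarked.pop()
        for a in sorted(alphabet):
            U = eps_closure(nfa, move(nfa, T, a))
            if not U:
                continue              # no transition on a from T
            if U not in dstates:
                dstates.append(U)
                unmarked.append(U)
            dtran[(T, a)] = U
    return dstates, dtran

# Example NFA for (a|b)*abb with start state 0 and accepting state 10.
NFA_TRANS = {
    (0, EPS): {1, 7}, (1, EPS): {2, 4}, (2, "a"): {3}, (4, "b"): {5},
    (3, EPS): {6}, (5, EPS): {6}, (6, EPS): {1, 7},
    (7, "a"): {8}, (8, "b"): {9}, (9, "b"): {10},
}

dstates, dtran = subset_construction(NFA_TRANS, 0, {"a", "b"})
print(len(dstates), len(dtran))    # 5 10
print(sorted(dstates[0]))          # [0, 1, 2, 4, 7]
```

A DFA state is accepting when the set of NFA states it stands for contains an accepting NFA state; in this example only one of the five DFA states contains state 10.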
Got it with the following questions
• Tokens
– Lexical units
– Atomic parse element
– Abstracted in syntax, e.g., Id
• Lexeme
– Specific string making up a token
– Value / attribute related to a token
– Concrete in language, e.g., Amt
• Spec of patterns for tokens
– Alphabet Σ: a finite set
– String s: a finite sequence of symbols from Σ
– Language: a specific set of strings
Thank you very much!
Questions?