Institutionen för Datavetenskap
Department of Computer and Information Science
Master's thesis
OMCCp: A MetaModelica Based Parser Generator Applied to Modelica
by
Edgar Alonso Lopez-Rojas
LIU-IDA/LITH-EX-A–11/019–SE
2011-05-31
Linköpings universitet
SE-581 83 Linköping, Sweden
Supervisors: Martin Sjolund and Mohsen Torabzadeh-Tari, Dept. of Computer and Information Science
Examiner: Prof. Peter Fritzson, Dept. of Computer and Information Science
Copyright
The publishers will keep this document online on the Internet - or its possible replacement - for a considerable time from the date of publication barring exceptional circumstances.
The online availability of the document implies a permanent permission for anyone to read, to download, to print out single copies for your own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.
According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.
For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its WWW home page: http://www.ep.liu.se/
Abstract
The OpenModelica Compiler-Compiler parser generator (OMCCp) is an LALR(1) parser generator implemented in the MetaModelica language with parsing tables generated by the tools Flex and GNU Bison. The code generated for the parser is in the MetaModelica 2.0 language, which is the OpenModelica compiler implementation language and is an extension of the Modelica 3.2 language. OMCCp uses as input an LALR(1) grammar that specifies the Modelica language. The generated parser can be used inside the OpenModelica Compiler (OMC) as a replacement for the current parser generated by the tool ANTLR from an LL(k) Modelica grammar. This report explains the design and implementation of this novel Lexer and Parser Generator called OMCCp.
Modelica and its extension MetaModelica are both languages used in the OpenModelica environment. Modelica is an Object-Oriented Equation-Based language for Modeling and Simulation.
Acknowledgements
It is an honor for me to be able to culminate this work with the guidance of remarkable computer scientists. This thesis would not have been possible without the clear vision of my examiner, Professor Peter Fritzson. As the director of the Open Source Modelica Consortium (OSMC) he presented this great opportunity to me. Together with him, I have to thank my supervisors Martin Sjolund and Mohsen Torabzadeh-Tari. Martin has made available his support and guidance in a number of ways that I cannot count, and Mohsen has always been keeping track of my progress and helping me with the difficulties I found. I am pleased to be part of, learn from and contribute to this great open source project called OpenModelica.
I also thank IDA (Department of Computer and Information Science) for offering its premises and resources for my daily work.
I cannot forget to thank my family: my parents Jesus and Soledad, for supporting me since the beginning of this project to become a Master in Computer Science, and my fiancee Helena, who has been encouraging me all the time to give my best in every step of this journey. I am delighted to include my future daughter Isabella here, who has been my biggest motivation to complete this work before the day she steps into this world for the first time.
Last but not least, I thank my financial sponsors from Colombia: Fundacion Colfuturo and EAFIT University. They believed in my talent and provided the financial resources to achieve this goal.
the return action section with the actions that have been specified for each
rule.
The arrays that are present in the lexer are:
yy_ec: matches any UTF-8 code with a start condition.
yy_accept: checks the states against the accept condition.
yy_acclist: once a state is accepted, the action for that state is found here.
yy_meta: control array for the transitions.
yy_base: control array for the transitions.
yy_def: default transition for the states.
yy_nxt: determines the next transition of the states.
yy_chk: control array that verifies errors.
FLEX is designed to handle a large number of rules and tokens. It
reduces the number of rules and tokens that the parser has to handle in the
next phase of the compiler. That is why it is common to find FLEX combined
with a parser generator such as Yet Another Compiler-Compiler (YACC) or
its successor GNU Bison.
For a complete reference of FLEX, the FLEX manual by Paxson [2002]
is a good source of information.
3.3 GNU Bison
GNU Bison is a parser generator that generates an LALR(1) parser from a
context-free grammar. The generated parser can be in one of these three
languages: C, C++ and Java. It is based on the tool called YACC.
GNU Bison receives as input a file with the grammar rules. This
grammar file is specified using BNF. The output of the process is a parser
written in C that communicates with a lexer, commonly written in LEX or
FLEX.
In this section we explain these input and output files in detail and cover
some other details about GNU Bison that will help in understanding the
project implementation presented in the next chapter.
3.3.1 Input file parser.y
There are 4 sections in a grammar file: Prologue, Bison declarations,
Grammar rules and Epilogue, distributed as presented in Listing 3.3.
Listing 3.3: Bison file structure
%{
Prologue
%}
Bison declarations
%%
Grammar rules
result: rule1-components ...
      | rule2-components ...
      ...
      ;
%%
Epilogue
Prologue: macro definitions and declarations of functions and variables
used in the grammar rules. It is copied verbatim to the beginning of the
generated file.
Declarations: define the terminal and nonterminal symbols and specify
precedence.
Grammar rules: rules written in Backus-Naur Form (BNF) notation, each
expressing a result as a composition of other rules and tokens and
defining an action for the rule.
Epilogue: copied verbatim to the end of the generated file, just as the
prologue is copied to the beginning.
The sections are separated by specific tokens: the prologue is enclosed
between ‘%{’ and ‘%}’, and the other sections are separated by ‘%%’.
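To make the four-section layout concrete, the following is a minimal Python sketch (not part of OMCCp or GNU Bison) that splits the text of a grammar file into its sections. The function name split_bison_sections is hypothetical, and the sketch assumes a single prologue block and that the delimiters appear alone on their own lines, which real grammar files do not guarantee.

```python
def split_bison_sections(text):
    """Split a Bison grammar file into prologue, declarations, rules, epilogue.

    Assumption for this sketch: '%{', '%}' and '%%' each appear alone on a line.
    """
    sections = {"prologue": [], "declarations": [], "rules": [], "epilogue": []}
    after_marker = {"declarations": "rules", "rules": "epilogue"}
    current = "declarations"
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "%{" and current == "declarations":
            current = "prologue"            # start of the verbatim prologue
        elif stripped == "%}" and current == "prologue":
            current = "declarations"        # back to the Bison declarations
        elif stripped == "%%" and current in after_marker:
            current = after_marker[current] # move on to the next section
        else:
            sections[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}
```

Applied to a file laid out as in Listing 3.3, the function returns one string per section, which is roughly the first step a tool has to perform before it can treat declarations, rules and verbatim code differently.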
3.3.2 Output file parser.c
GNU Bison generates a file that contains C source code. In the generated
file we identify four main parts: declarations of variables and transition
arrays; the algorithm that runs the PDA; the response actions that build the
AST while doing the reductions; and a section for custom code inserted by
the developer.
Transition arrays
Aho et al. [2006] present the algorithm for LALR(1) based on the creation
of two-dimensional parsing tables called the "ACTION table" and the "GOTO
table", as presented in Section 2.1.4. GNU Bison improves the efficiency of
the storage by converting these two-dimensional tables into arrays using the
method described by Tarjan and Yao [1979].
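The idea of storing a sparse two-dimensional table in one-dimensional arrays can be sketched as follows. This is a simplified first-fit row-displacement packing written for illustration only; the names compress, lookup, base, table and check are assumptions of this sketch and the layout is not the exact one GNU Bison emits.

```python
def compress(rows, default=0):
    """Pack a sparse 2-D table into three 1-D arrays: base, table, check."""
    width = max(len(row) for row in rows)
    base, table, check = [], [], []
    for state, row in enumerate(rows):
        cols = [c for c, v in enumerate(row) if v != default]
        offset = 0
        # First fit: slide the row until none of its entries hits a used cell.
        while any(offset + c < len(table) and check[offset + c] != -1 for c in cols):
            offset += 1
        missing = offset + width - len(table)
        if missing > 0:
            table.extend([default] * missing)
            check.extend([-1] * missing)
        for c in cols:
            table[offset + c] = row[c]
            check[offset + c] = state   # remember which row owns this cell
        base.append(offset)
    return base, table, check


def lookup(base, table, check, state, col, default=0):
    """Read entry (state, col); cells owned by another row yield the default."""
    i = base[state] + col
    if i < len(check) and check[i] == state:
        return table[i]
    return default
```

The check array plays the guarding role: a cell only counts for a state if that state wrote it, otherwise the default applies. The gain comes from rows overlapping in the gaps left by other rows.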
Popuri [2006] presents a detailed analysis of the role of each array in the
generated file. The 15 arrays generated by GNU Bison are:
yytranslate: interface with the lexer; it maps the token numbers of the
lexer to the internal symbol numbers of the parser.
yyrhs: list of the symbol numbers on the right-hand side (RHS) of all
rules; yyrhs[yyprhs[n]] is the first RHS symbol of rule n.
yyprhs: index in yyrhs of the first RHS symbol of each rule.
yyrline: line number in the grammar file where each rule is defined.
yytname: list of the names of the defined symbols.
yytoknum: list of the values of the tokens in the lexer.
yyr1: symbol number of the left-hand side of each rule.
yyr2: number of symbols to be reduced for each rule.
yydefact: default reduction for each state.
yydefgoto: compressed GOTO table; each entry specifies the default state
to transition to for each non-terminal.
yytable: state numbers in a pre-calculated order; it works together with
yycheck, yypgoto and yypact to indicate the next state and the rule to
be used for a reduction.
yypact: indicates what to do next in each state; it is an index into
yytable.
yypgoto: index into yytable of the GOTO entries for each non-terminal;
it is used when the default in yydefgoto does not apply.
yycheck: a control table that validates the entries read from yytable,
which helps to discover anomalies in the parser.
In summary, the array yytranslate is, as its name indicates, a translator
between the lexer and the parser. The arrays yydefact, yydefgoto, yyr1, yyr2,
yytable, yypgoto, yypact and yycheck are used to represent the LALR(1)
parsing tables ACTION and GOTO. The other arrays are helpers for
debugging and printing.
LALR algorithm
GNU Bison uses the function yyparse to start the parsing operation. It
makes use of two stacks: one for the states and one, called the parser stack,
for the reductions and the construction of the AST.
The algorithm starts by reading a token and pushing its value onto the
parser stack; this operation is called SHIFT. When the stack accumulates
enough elements to match a rule, the elements are popped from the parser
stack and converted into a new symbol, which is the result of the action of
the rule; this operation is called REDUCE. The result is pushed back onto
the parser stack. In each step the parser computes the next state based on
the arrays presented before, and pushes or pops the state stack at the same
time as the SHIFT and REDUCE actions are performed on the parser stack.
GNU Bison adds two new symbols to its internal symbol configuration: the
symbols accept and end identify when the parser stops and when it finishes
in an acceptance state.
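The SHIFT and REDUCE steps with two stacks can be illustrated with a small table-driven parser. The sketch below is an assumption-laden toy: the grammar (E -> E + n | n), the hand-built ACTION and GOTO dictionaries and the function parse are invented for this example, and plain dictionaries are used instead of the compressed arrays that GNU Bison generates.

```python
# Hand-built LR tables for the toy grammar:  1: E -> E '+' n    2: E -> n
ACTION = {
    (0, "n"): ("shift", 2),
    (1, "+"): ("shift", 3),
    (1, "$"): ("accept", 0),
    (2, "+"): ("reduce", 2), (2, "$"): ("reduce", 2),
    (3, "n"): ("shift", 4),
    (4, "+"): ("reduce", 1), (4, "$"): ("reduce", 1),
}
GOTO = {(0, "E"): 1}
# rule number -> (left-hand side, length of right-hand side, semantic action)
RULES = {
    1: ("E", 3, lambda v: v[0] + v[2]),
    2: ("E", 1, lambda v: v[0]),
}


def parse(tokens):
    """Run the shift-reduce loop; tokens are (kind, value) pairs."""
    tokens = list(tokens) + [("$", None)]   # '$' marks the end of input
    states, values, pos = [0], [], 0        # state stack and value (parser) stack
    while True:
        kind, value = tokens[pos]
        entry = ACTION.get((states[-1], kind))
        if entry is None:
            raise SyntaxError(f"unexpected {kind!r} at token {pos}")
        op, arg = entry
        if op == "shift":
            states.append(arg)
            values.append(value)
            pos += 1
        elif op == "reduce":
            lhs, length, action = RULES[arg]
            args = values[-length:]
            del values[-length:]
            del states[-length:]
            values.append(action(args))              # result of the rule action
            states.append(GOTO[(states[-1], lhs)])   # GOTO on the new symbol
        else:                                        # accept
            return values[-1]
```

Here the semantic action simply adds numbers; in a compiler the same slot would build an AST node from the popped values.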
Construction of the AST
The AST is constructed by the REDUCE actions. Its construction is
specified in the grammar rules section of the grammar file, in the actions
attached to each rule.
Error handling
GNU Bison uses a single primary recovery technique based on the activation
of an error flag that tells the parser to suppress the syntax error diagnostic
while recovering from an error. A developer can write specific rules for the
special token error that can help the parser to detect easily the presence of
errors. There is a chapter in the reference manual of GNU Bison that can
give a more detailed overview of the error handling.
Other considerations about BISON
The parser generated in C manages memory efficiently to avoid memory
exhaustion, especially in situations when too many tokens have been
shifted without a reduction operation. This is done inside the parser and
usually a developer does not need to configure anything, although the
parameters YYMAXDEPTH and YYINITDEPTH can be modified.
The GNU Bison parser generator supports the generation of code in
three programming languages: Java, C and C++.
A table of symbols is provided in the GNU Bison manual [Donnelly and
Stallman, 2010, Appendix A] that allows a better understanding of the code
inside the generated parsers. In addition, the article written by Popuri
[2006] explains the structures and arrays in detail and comments on specific
parts of the code, which further clarifies the algorithm used by GNU
Bison. Some examples of the interaction of FLEX and GNU Bison are
presented in Aaby [2003].
Chapter 4
Implementation
4.1 Proposed Solution
This chapter presents the results of this project, which are the design and
implementation of a MetaModelica Lexer and Parser Generator written in
the MetaModelica language and named the OpenModelica
Compiler-Compiler parser generator (OMCCp) 1.
We present in this chapter a description of the design and architecture
of the parser generator proposed in this report. We cover at the end the
description of the error handling techniques used in OMCCp and the inte-
gration into the OpenModelica Compiler (OMC).
First we start by outlining the main characteristics that describe this
solution:
• It is a tool written entirely in the MetaModelica language.
• It includes a lexer generator and an LALR(1) parser generator that
generate MetaModelica code.
• The Lexer is based on the existing tool FLEX for the generation of
the transition tables for the DFA.
• The Parser is based on the existing tool BISON for the generation of
the transition tables for the deterministic push-down automaton.
• The error handler is based on a primary recovery technique and the
generation of a candidate set that is properly displayed to the
developer.
1 OMCCp was initially named OMCC. This explains why the design figures and some files in the project use the name OMCC instead of OMCCp. Whenever OMCC appears it may be interpreted as OMCCp.
Section 4.2 presents the design of the solution and Section 4.3 explains how
the generation of files is performed by OMCCp. We continue by explaining the
error handling techniques utilised in OMCCp in Section 4.4, and at the end,
in Section 4.5, we explain the integration with the OMC.
Appendix A covers the basic command line instructions for running
OMCCp to generate the files, and the instructions to run the generated
parser for the grammar specified.
4.2 OMCCp Design
Figure 4.1 presents an overview of OMCCp. The Parser and Lexer Generator
are based on the C files generated by FLEX and BISON from the grammar.
Figure 4.1: OMCCp (OpenModelica Compiler-Compiler) Lexer and Parser Generator
Those files contain generated C code and consist of three main parts that
constitute the generated lexer and parser: a section with
the transition arrays, a section with the machine that runs the algorithm
for the lexer or parser, and finally an action resolution section. The last
section contains the return token action for the lexer and the reduction/AST
construction action for the parser.
Based on the identification of the main parts of each generated file, the
design presented in Figure 4.2 shows the different components of the
solution. We can identify two principal packages called LexerGenerator.mo and
ParserGenerator.mo, which are responsible for the generation of the
complete Compiler-Compiler in MetaModelica code.
4.2.1 Lexical Analyser
The design of the Lexer is presented in Figure 4.3. The Lexer contains 3
main files that were designed with the aim of separating the parsing tables,
the DFA and the action code. The tables for the transitions of the states are
loaded at the start of the DFA. Once the DFA identifies the rule, it passes
control to the code file, which takes the action and decides which token
to return to the DFA. The DFA pushes the returned result onto the list of
tokens.
The Lexical Analyser uses the built-in functions from the System.mo and
Util.mo files that are part of the compiler. A new file, Types.mo, was
developed to support the uniontype Token and the records TOKEN and
INFO that are used by both the lexer and the parser.
Lexer.mo
The file Lexer.mo implements the DFA for the Lexical Analysis as explained
in Section 2.1.2. It is the main file of the Lexer and calls the functions
in the other files that constitute the lexer, LexerCode.mo and LexTable.mo.
Its main function is to load the source code file and recognise all
the tokens described by the grammar. It returns to the parser
either a list of tokens or an error if no tokens were found or some characters
in the source code are not compliant with the 8-bit encoding format called
UTF-8.
Figure 4.2: OMCCp Lexer and Parser Generator Architecture Design (components shown: LexerGenerator.mo, ParserGenerator.mo, Lexer.mo, LexTable.mo, LexerCode.mo, Parser.mo, ParseTable.mo, ParseCode.mo, NewParser.mo, OMCCTypes.mo, Absyn.mo, Error.mo, System.mo and Util.mo)
Figure 4.3: OMC-Lexer design (components shown: Lexer.mo with the functions scan, loadSourceCode, scanString, lex, consume, evalState, findRule and getInfo, together with LexTable.mo, LexerCode.mo, System.mo, Util.mo and the types Token and Program)
In the process of loading the file, it converts the source code file into an
array of integers, each integer representing the UTF-8 code of a character
in the source code, e.g. the character ‘a’ becomes the code 97 (61 in
hexadecimal). This helps the lexer to increase the speed of recognising the
tokens due to the direct mapping between the UTF-8 code and the position
in the transition array used for the lexer.
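The conversion of the source text into integer codes, and the direct indexing it enables, can be sketched in Python as follows. The names to_codes, EC and classes are hypothetical and the three-class table is an assumption of this sketch; the real tables are the ones FLEX generates.

```python
def to_codes(source):
    """Return the UTF-8 byte codes of the source text as a list of integers."""
    return list(source.encode("utf-8"))


# Hypothetical class table indexed directly by byte code:
# 0 = other, 1 = lower-case letter, 2 = digit.
EC = [0] * 256
for code in range(ord("a"), ord("z") + 1):
    EC[code] = 1
for code in range(ord("0"), ord("9") + 1):
    EC[code] = 2


def classes(source):
    """Map every byte of the source to its class with one array access each."""
    return [EC[code] for code in to_codes(source)]
```

Because each code is already an array index, classifying a character costs a single lookup and no comparison of characters or strings is needed.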
To recognise the tokens, Lexer.mo runs a DFA based on the transition
arrays found in LexTable.mo. When it reaches an acceptance state it calls
the function action in the LexerCode.mo file. Finally, it returns a list of
tokens that are the input for the parser.
The entrance function to the Lexer is named scan, and it is defined in
Lexer.mo:
Listing 4.1: Lexer.mo function scan
function scan "Scan starts the lexical analysis, load the tables and consume the program to output the tokens"
  input String fileName "input source code file";
  input Boolean debug "flag to activate the debug mode";
  output list<Types.Token> tokens "return list of tokens";
The complete file Lexer.mo is available in Appendix B.1
LexTable.mo
The LexTable.mo file is the source used by the Lexer.mo file for performing
the transitions to new states and finding the tokens in the input stream. It
contains two variables and 8 arrays that are extracted from the file generated
by FLEX.
The two variables, yy_limit and yy_finish, are used to perform control
instructions in the Lexer.mo file. The variable yy_finish is used
specifically to detect the end of a token.
The 8 arrays that are used to perform the task of the DFA are yy_accept,
yy_ec, yy_meta, yy_base, yy_def, yy_nxt, yy_chk and yy_acclist. All the arrays
are used in the file Lexer.mo, but only the most significant arrays will be
explained in this report. Those are yy_ec, yy_accept and yy_acclist.
The first array used is yy_ec. This array contains all the UTF-8 codes and
the initial state of each character when the DFA consumes the input
stream.
After performing a mathematical checking operation on the current
character, the DFA uses the yy_accept array to find out whether the current
state is a valid acceptance state, and continues until it determines that the
lookahead character belongs to another token. Then it performs a roll-back
operation to the last acceptance state in the stack. Following this, the DFA
uses yy_acclist to determine the return token, which is a call to a function
in the LexerCode.mo file.
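The scan-until-stuck and roll-back behaviour described above can be sketched with a small hand-written DFA. The states, the NEXT and ACCEPT tables and the function tokenize are assumptions made for this illustration; they use plain dictionaries rather than the compressed FLEX arrays.

```python
# Toy DFA: identifiers [a-z]+, integers [0-9]+, reals [0-9]+.[0-9]+ and '.'
NEXT = {
    (0, "letter"): 1, (1, "letter"): 1,
    (0, "digit"): 2, (2, "digit"): 2,
    (2, "dot"): 3,                      # state 3 ("12.") is not accepting
    (3, "digit"): 4, (4, "digit"): 4,
    (0, "dot"): 5,
}
ACCEPT = {1: "ID", 2: "INT", 4: "REAL", 5: "DOT"}


def char_class(ch):
    if "a" <= ch <= "z":
        return "letter"
    if "0" <= ch <= "9":
        return "digit"
    return "dot" if ch == "." else "other"


def tokenize(text):
    """Longest-match scanning with roll-back to the last accepting state."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos] == " ":            # blanks are ignored
            pos += 1
            continue
        state, i, last_accept = 0, pos, None
        while i < len(text):
            nxt = NEXT.get((state, char_class(text[i])))
            if nxt is None:             # lookahead belongs to another token
                break
            state, i = nxt, i + 1
            if state in ACCEPT:
                last_accept = (i, ACCEPT[state])
        if last_accept is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        end, kind = last_accept         # roll back to the last accepting point
        tokens.append((kind, text[pos:end]))
        pos = end
    return tokens
```

For the input 12.x the DFA advances through "12." into a non-accepting state, gets stuck on x, and rolls back to return the integer 12 before scanning the dot and the identifier separately.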
The generation of this file is explained in Section 4.3.1. A complete
sample of the generated file LexTable10.mo is available in Appendix E.4
LexerCode.mo
The file LexerCode.mo contains all the specific actions that the lexer performs
when a token has been recognised. One of three possible actions can be
performed: ignore the token, return a specific token or
change to another DFA.
The first action is to ignore the token. This operation is performed by
the lexer when a space, a line feed or a comment block has been found in
the input stream. Ignoring these tokens simplifies the job of
the parser, because they are not used for any construction in the
grammar and therefore will not be converted into executable code in the
further phases of the compiler.
The second possible action when a token has been recognised
is to return a specific token; the code of the action defines the token
to be returned. Some information is collected together with the token in a
record called TOKEN that allows the parser to identify the line and the
position of the token in the original source code file.
The third and last action is to switch from one
DFA to another one. This operation is performed in certain situations, e.g.
when the DFA finds the start of a comment block ‘/*’ and all the subsequent
tokens are required to be ignored, or when they must be categorised as a
different token, e.g. in the case of recognising strings. For this action a
new starting state is set up in the machine and the following characters run
in a different DFA from the original. After the end token is found (e.g.
‘*/’) the start state returns to the original one.
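Switching between DFAs for comment blocks can be sketched as a scanner with two modes. The mode names INITIAL and COMMENT and the function scan_words are assumptions of this sketch, and splitting the remaining text into whitespace-separated words is a simplification of what a real lexer returns.

```python
def scan_words(text):
    """Return whitespace-separated words, skipping /* ... */ comment blocks."""
    mode = "INITIAL"                 # current start state of the scanner
    tokens, word, i = [], "", 0
    while i < len(text):
        two = text[i:i + 2]
        if mode == "INITIAL":
            if two == "/*":          # switch to the comment DFA
                if word:
                    tokens.append(word)
                    word = ""
                mode = "COMMENT"
                i += 2
            elif text[i].isspace():  # blanks end the current word
                if word:
                    tokens.append(word)
                    word = ""
                i += 1
            else:
                word += text[i]
                i += 1
        else:
            if two == "*/":          # end token found: back to the original DFA
                mode = "INITIAL"
                i += 2
            else:
                i += 1               # everything inside the comment is ignored
    if mode == "COMMENT":
        raise ValueError("unterminated comment block")
    if word:
        tokens.append(word)
    return tokens
```

The only state carried between the two scanners is the mode variable, which corresponds to the start state that is set up and later restored.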
This file is generated from the lexer.c file, which is produced by FLEX.
The generation of this file is explained in Section 4.3.1. A complete sample
of the generated file LexerCode10.mo is available in Appendix E.5.
4.2.2 Syntax Analyser
The Parser design for OMCCp is presented in Figure 4.4. The parser
performs the syntax analysis of the compiler.
5.1.3 Implementation of a subset of Modelica and MetaModelica grammar
A large subset of the Modelica 3.2 and MetaModelica 1.0 grammar was
implemented based on the grammar of the Modelica 3.2 specification
[Modelica-Association, 2010] and the MetaModelica specification [Fritzson and Pop, 2011b].
The subset is large enough to be able to parse all the source code of
OMCCp. It includes all the class types of Modelica and around 90% of the total
Modelica specification 3.2. In addition, the subset includes the extensions
of MetaModelica used for OMCCp.
The lexer includes 100% of the tokens used by Modelica, but without
support for identifiers inside single quotes (’QIdent’) or for characters
other than the common alphabet.
During the implementation of this grammar, some issues were found
and corrected inside the parser. The list below identifies some of the
changes implemented during the construction of the
grammar.
• Recursion was replaced with loops over the input of both the lexer and
the parser because of stack overflow errors.
• The reduction of the MultiTyped stack popped the values in the order of
the variables; it was corrected to pop them in reverse order so that
each variable receives the correct value.
• Custom error messages that can be inserted in the parser were
implemented.
• An additional bug regarding the conversion of large files from
characters into integers was found in the OMC compiler.
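The reverse-order issue with the MultiTyped stack can be shown with a small sketch. The class MultiTypedStack, the type names and the rule range -> INT ':' INT are hypothetical and only illustrate the idea of one stack per semantic type; they are not the OMCCp implementation.

```python
class MultiTypedStack:
    """One stack per semantic value type (a simplification for illustration)."""

    def __init__(self):
        self.stacks = {}

    def push(self, type_name, value):
        self.stacks.setdefault(type_name, []).append(value)

    def pop(self, type_name):
        return self.stacks[type_name].pop()


def reduce_range(stack):
    """Reduce the hypothetical rule  range -> INT ':' INT."""
    # Values were pushed left to right, so they must be popped right to left.
    hi = stack.pop("Integer")    # rightmost INT was pushed last
    lo = stack.pop("Integer")    # leftmost INT was pushed first
    stack.push("Range", (lo, hi))
```

Popping in the order of the variables instead (lo first, then hi) would silently swap the two bounds, which is the kind of error the second item in the list describes.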
The grammar lexerModelica.y and parseModelica.y
The tokens ENDIF, ENDFOR, ENDWHILE, ENDWHEN and ENDCLASS
were added to the lexer grammar to avoid ambiguity in the LALR(1) parser.
Some shift-reduce conflicts were found during the construction of the
grammar. In that case the parser selects a shift over a reduction.
This gives the longest rules priority over the shorter ones.
We completely avoided reduce-reduce conflicts. They are a symptom of
ambiguity in the grammar and should always be avoided. The way to avoid
reduce-reduce conflicts is to avoid different rules where the same
reductions can be performed. This happens often when empty transitions are
implemented.
The grammar files lexerModelica.y and parseModelica.y are available in
Appendices F.1 and F.2.
Testing the Modelica grammar and performance
We ran the parser over the files of the test suite library implemented
for Modelica. 48 out of 571 tests failed, i.e. around 92% of the tests
passed, which suggests that roughly that share of the Modelica grammar was
implemented correctly. The remaining 8% includes parts
of the grammar that were not implemented, such as annotations and some
prefix tokens such as FINAL or REPLACEABLE in front of
certain rules.
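The percentages quoted above follow directly from the counts. The short Python sketch below only redoes that arithmetic; the function name pass_rate is an assumption of this sketch.

```python
def pass_rate(failed, total):
    """Percentage of tests that passed, given the number of failed tests."""
    return 100.0 * (total - failed) / total


omccp_rate = pass_rate(48, 571)    # 523 of 571 tests passed
failure_rate = 100.0 - omccp_rate  # share of failing tests
```

With 48 failures out of 571 tests the pass rate is a little under 92% and the failure rate a little over 8%, which matches the rounded figures in the text.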
The test cases that failed are easily identified in the log file. This file
shows the OMCCp error messages as presented in Section 4.4. From here
we can identify and test individual Modelica programs until we add the
required instructions to parse the file, or correct the construction of the
AST in case the problem is found there.
We discovered that the parser did not scale well for large files, so we
optimised first the lexer and later the parser. The
optimisation consisted of minimising the load of the character list and the
token list used for passing the parameters to the recursive functions inside
the lexer and the parser.
Table 5.2 shows the time results for all the test cases, including the
failed tests, for OMCCp and ANTLR. It is important to point out that when the
parser fails it performs a search over the candidate tokens, which increases
the parsing time.
Another comparison was performed over a single file, as presented in
Table 5.3. This test was performed over the source code of OMCCp. Chart 5.1
shows that the non-optimised OMCCp took 57 seconds to parse
an input of 162,000 characters. After the optimisations we reduced the
total time to 5 seconds. However, we believe that the MultiTyped stack, which
for the Modelica grammar consists of around 70 stacks, is causing overhead
in the parser.
The computer used for the test has the following configuration:
• CPU: Intel Core DUO T9300
Table 5.2: Test Suite - Compiler
Compiler               Time (sec)   Result
ANTLR                  19.367       1 out of 571 tests failed
Initial grammar        48           447 out of 568 tests failed
Intermediate grammar   58           264 out of 568 tests failed
No Optimization        124.492      98 out of 568 tests failed
Lexer Optimized        43.294       98 out of 568 tests failed
Parser Optimized       48.657       48 out of 571 tests failed
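The relative effect of the optimisations can be read off Table 5.2 by dividing the times. The sketch below copies the times from the table; the dictionary TIMES and the function ratio are names introduced only for this illustration.

```python
# Test suite times in seconds, copied from Table 5.2.
TIMES = {
    "ANTLR": 19.367,
    "No Optimization": 124.492,
    "Lexer Optimized": 43.294,
    "Parser Optimized": 48.657,
}


def ratio(slower, faster):
    """How many times longer the 'slower' configuration takes."""
    return TIMES[slower] / TIMES[faster]


gain = ratio("No Optimization", "Parser Optimized")   # effect of optimising
gap = ratio("Parser Optimized", "ANTLR")              # remaining gap to ANTLR
```

On these numbers the optimised OMCCp parser runs the suite roughly 2.6 times faster than the non-optimised one and remains roughly 2.5 times slower than ANTLR. The two OMCCp rows do not cover exactly the same number of tests (568 versus 571), so the ratios are only indicative.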
A.V. Aho, R. Sethi, and J.D. Ullman. Compilers: principles, techniques, and tools. Addison-Wesley, second edition, 2006.
J. Akesson, T Ekman, and G Hedin. Development of a Modelica Compiler Using JastAdd. Electronic Notes in Theoretical Computer Science, 203(2):117–131, April 2008. ISSN 15710661. doi: 10.1016/j.entcs.2008.03.048. URL http://linkinghub.elsevier.com/retrieve/pii/S1571066108001539. [Accessed May 2011].
J. Akesson, Torbjorn Ekman, and Gorel Hedin. Implementation of a Modelica compiler using JastAdd attribute grammars. Science of Computer Programming, 75(1-2):21–38, January 2010. ISSN 01676423. doi: 10.1016/j.scico.2009.07.003. URL http://linkinghub.elsevier.com/retrieve/pii/S0167642309001087. [Accessed May 2011].
D. Blasband. Parsing in a hostile world. Proceedings Eighth Working Conference on Reverse Engineering, pages 291–300, 2001. doi: 10.1109/WCRE.2001.957834. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=957834. [Accessed May 2011].
Michael Burke and G.A. Fisher Jr. A practical method for syntactic error diagnosis and recovery. In Proceedings of the 1982 SIGPLAN symposium on Compiler construction, pages 67–78. ACM, 1982. ISBN 0897910745. URL http://portal.acm.org/citation.cfm?id=800230.806981. [Accessed May 2011].
Michael G. Burke and Gerald A. Fisher. A practical method for LR and LL syntactic error diagnosis and recovery. ACM Transactions on Programming Languages and Systems, 9(2):164–197, March 1987. ISSN 01640925.
Rafael Corchuelo, Jose A. Perez, Antonio Ruiz, and Miguel Toro. Repairing syntax errors in LR parsers. ACM Transactions on Programming Languages and Systems, 24(6):698–710, November 2002. ISSN 01640925. doi: 10.1145/586088.586092. URL http://portal.acm.org/citation.cfm?doid=586088.586092. [Accessed May 2011].
M. de Jonge, E. Nilsson-Nyman, L. Kats, and Eelco Visser. Natural and flexible error recovery for generated parsers. Software Language Engineering, pages 204–223, 2010. URL http://www.springerlink.com/index/b0p750768wum5157.pdf. [Accessed May 2011].
Pierpaolo Degano and Corrado Priami. LR techniques for handling syntax errors. Computer Languages, 24(2):73–98, 1998. ISSN 0096-0551. URL http://linkinghub.elsevier.com/retrieve/pii/S0096055197000167. [Accessed May 2011].
F.L. DeRemer. Practical translators for LR(k) languages. Project MAC, Massachusetts Institute of Technology, 1969. URL http://publications.csail.mit.edu/lcs/pubs/ps/MIT-LCS-TR-65.ps. [Accessed May 2011].
Charles Donnelly and Richard Stallman. GNU Bison Manual version 2.4.3, 2010. URL http://www.gnu.org/software/bison/manual/bison.pdf. [Accessed May 2011].
Hilding Elmqvist. Modelica - A Unified Object-Oriented Language for Physical Systems Modeling. EUROSIM - Simulation News Europe, (20):p32, 1997.
Peter Fritzson. Principles of Object-oriented modeling and simulation with Modelica 2.1. IEEE Press, 2004.
Peter Fritzson and Peter Bunus. Fritzon Modelica A general OO Language for continuous and discrete event system modeling and simulation. In Simulation Symposium, 2002. Proceedings. 35th Annual, pages 365–380, 2002.
Peter Fritzson and Adrian Pop. Meta-programming and language modeling with metamodelica 1.0. Technical Report 9, Linköping University, PELAB - Programming Environment Laboratory, The Institute of Technology, 2011a.
Peter Fritzson and Adrian Pop. Towards Modelica 4 Meta-Programming and Language Modeling with MetaModelica 2.0. Number April. 2011b. Unpublished work.
Peter Fritzson, Adrian Pop, Peter Aronsson, David Akhvlediani, Bernhard Bachmann, Vasile Baluta, Simon Bjorklen, Mikael Blom, Willi Braun, David Broman, Stefan Brus, Francesco Casella, Filippo Donida, Henrik Eriksson, Anders Fernstrom, Pavel Grozman, Daniel Hedberg, Michael Hanke, Alf Isaksson, Daniel Kanth, Tommi Karhela, Joel Klinghed, Juha Kortelainen, Alexey Lebedev, Magnus Leksell, Oliver Lenord, Håkan Lundvall, Eric Meyers, Hannu Niemisto, Kristoffer Norling, Atanas Pavlov, Pavol Privitzer, Per Sahlin, Wladimir Schamai, Gerhard Schmitz, and Klas Sjoholm. OpenModelica System Documentation. Number November. Open Source Modelica Consortium, Linköping, 2009. URL http://www.ida.liu.se/labs/pelab/modelica/OpenModelica/releases/1.6.0/doc/OpenModelicaSystem.pdf. [Accessed May 2011].
Denis Howe. The Free On-line Dictionary of Computing, 2010. URL http://foldoc.org/. [Accessed May 2011].
OG Kakde. Algorithms for compiler design. Charles River Media, Inc., 2002. ISBN 81-7008-100-6.
L.C.L. Kats, M. de Jonge, E. Nilsson-Nyman, and E. Visser. Providing rapid feedback in generated modular language environments: adding error recovery to scannerless generalized-LR parsing. ACM SIGPLAN Notices, 44(10):445–464, 2009. ISSN 0362-1340. URL http://portal.acm.org/citation.cfm?id=1640089.1640122. [Accessed May 2011].
Donald E Knuth. On the Translation of Languages from Left to Right. Information and Control, 8(6):607–639, 1965. ISSN 00199958. doi: 10.1016/S0019-9958(65)90426-2. URL http://linkinghub.elsevier.com/retrieve/pii/S0019995865904262.
Håkan Lundvall, Kristian Stavåker, Peter Fritzson, and Christoph Kessler. Automatic parallelization of simulation code for equation-based models with software pipelining and measurements on three platforms. ACM SIGARCH Computer Architecture News, 36(5):46–55, June 2009. ISSN 0163-5964. doi: 10.1145/1556444.1556451. URL http://portal.acm.
org/citation.cfm?id=1556444.1556451. [Accessed May 2011]. 18
Bruce J. McKenzie, Corey Yeatman, and Lorraine de Vere. Error re-pair in shift-reduce parsers. ACM Transactions on Programming Lan-guages and Systems, 17(4):672–689, July 1995. ISSN 01640925. doi:10.1145/210184.210193. URL http://portal.acm.org/citation.cfm?
Terence Parr. The Definitive ANTLR Reference: Building Domain-SpecificLanguages (Pragmatic Programmers). Pragmatic Bookshelf, 2007. ISBN0978739256. 25
Terence Parr and R W Quong. ANTLR: a predicated-LL(k) parser genera-tor. Software Practice Experience, 25(7):789, 1995. ISSN 00380644. URLhttp://portal.acm.org/citation.cfm?id=213593.213603. [AccessedMay 2011]. 1, 25
Michal Pise. The Fika Parser Generator. 2010 10th IEEE Working Confer-ence on Source Code Analysis and Manipulation, pages 99–100, Septem-ber 2010. doi: 10.1109/SCAM.2010.27. URL http://ieeexplore.ieee.
Adrian Pop and Peter Fritzson. Debugging natural semantics specifications.Proceedings of the Sixth sixth international symposium on Automatedanalysis-driven debugging - AADEBUG’05, pages 77–82, 2005. doi: 10.1145/1085130.1085140. URL http://portal.acm.org/citation.cfm?
doid=1085130.1085140. [Accessed May 2011]. 18
Adrian Pop and Peter Fritzson. MetaModelica: A unified equation-basedsemantical and mathematical modeling language. Modular ProgrammingLanguages, pages 211–229, 2006. URL http://www.springerlink.com/
index/a5112k4m34067180.pdf. [Accessed May 2011]. 18, 66, 210
Satya Kiran Popuri. Understanding C parsers generated by GNU Bison. FreeSoftware Foundation, 2006. URL http://www.cs.uic.edu/~spopuri/
cparser.html. [Accessed May 2011]. 30, 32, 41, 65
P. Sampath, AC Rajeev, KC Shashidhar, and S Ramesh. How to test pro-gram generators? A case study using flex. In Software Engineering andFormal Methods, 2007. SEFM 2007. Fifth IEEE International Conferenceon, pages 80–92. IEEE, 2007. doi: 10.1109/SEFM.2007.11. URL http:
//ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4343926. [Ac-cessed May 2011]. 67
M Sipser. Introduction to the Theory of Computation. Thomson CourseTechnology, second edition, 2005. 8, 10
Martin Sjolund. Bidirectional External Function Interface Between Model-ica/MetaModelica and Java. Master’s thesis, Linkoping University, 2009.18
Martin Sjolund, Peter Fritzson, and Adrian Pop. Bootstrapping a ModelicaCompiler aiming at Modelica 4. In 8th International Modelica Conference(Modelica’2011), Dresden, Germany, 2011. 1, 2, 18, 66, 71
Robert Endre Tarjan and Andrew Chi-Chih Yao. Storing a sparse table.Communications of the ACM, 22(11):606–611, November 1979. ISSN00010782. doi: 10.1145/359168.359175. URL http://portal.acm.org/
citation.cfm?doid=359168.359175. [Accessed May 2011]. 30
P.D. Terry. Compilers and Compiler Generators. Africa, 2000. URL http:
//scifac.ru.ac.za/compilers/. [Accessed May 2011]. 5, 13
Appendix A
OMC Compiler Commands
This appendix contains the source code developed during this project. It includes instructions on how to run the OpenModelica Compiler, together with samples of both the input files for the parser generator and the output files generated from exercise 10 of the MetaModelica guide.
A.1 Parameters - MetaModelica Parser Generator
A.1.1 Generate compilerName
Used as a suffix for the names of the lexer and parser input files.
A.1.2 Run compilerName, fileName
Runs the generated compiler with the given fileName as input.
A.2 OMC Commands
Running the lexer and parser generator for the compiler [compilerName] requires the following files:
1. lexer[compilerName].c, generated by Flex from the grammar lexer[compilerName].l.
2. parser[compilerName].c, generated by Bison from the grammar parser[compilerName].y.
Optionally, you can test your grammar by generating the files with Flex and GNU Bison as described in listing A.1.
Listing A.1: Compile Flex and Bison
$ flex -t -l lexer[compilerName].l > lexer[compilerName].c
$ bison parser[compilerName].y --output=parser[compilerName].c
Modify the file OMCC.mos, shown in listing A.2, to include the grammars of the languages to generate.
Listing A.2: OMCC.mos
getInstallationDirectoryPath();
loadFile("OMCC.mo");
loadFile("../../Compiler/Util/RTOpts.mo");
loadFile("../../Compiler/Util/Util.mo");
loadFile("../../Compiler/Util/System.mo");
loadFile("LexerGenerator.mo");
loadFile("ParserGenerator.mo");
loadFile("../../Compiler/FrontEnd/Absyn.mo");
OMCC.main({"Modelica"});

getErrorString();
To run OMCCp, use the command shown in listing A.3.
Listing A.3: OMCCP Command
$ omc +g=MetaModelica +d=rml OMCC.mos
true
true
true
true
true
true

Generating FLEX grammar file lexer10.c ...
Generating BISON grammar file parser10.c ...
Reading FLEX grammar file lexer10.c ...
Result: Lexer Built
Reading BISON grammar file parser10.c
Result: Parser Built
9 Files Generated for the language grammar:10
OMCC v0.7 (OpenModelica Compiler-Compiler)
Copyright 2011 Open Souce Modelica Consorsium (OSMC)
This command generates the following 9 files:
• lexer[compilerName].c
• parser[compilerName].c
• Lexer[compilerName].mo
• LexTable[compilerName].mo
• LexCode[compilerName].mo
• Parser[compilerName].mo
• ParseTable[compilerName].mo
• ParseCode[compilerName].mo
• Token[compilerName].mo
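The naming pattern of these nine files can be summarised in a few lines. The Python below is only an illustrative sketch and not part of OMCCp; the function name `generated_files` is hypothetical, and the stems are copied from the list above.

```python
def generated_files(compiler_name):
    """Return the nine file names from the list above for one compilerName.

    Hypothetical helper for illustration; OMCCp itself writes these files
    from MetaModelica code.
    """
    # C sources produced by Flex and Bison.
    c_sources = [stem + compiler_name + ".c" for stem in ("lexer", "parser")]
    # MetaModelica packages produced by the generator, in the listed order.
    mo_stems = ("Lexer", "LexTable", "LexCode",
                "Parser", "ParseTable", "ParseCode", "Token")
    mo_packages = [stem + compiler_name + ".mo" for stem in mo_stems]
    return c_sources + mo_packages


if __name__ == "__main__":
    for name in generated_files("10"):
        print(name)
```

With the suffix `10` used in listing A.3, the first two names are `lexer10.c` and `parser10.c`, matching the file names in the generator output.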
Additionally, for debugging purposes, the debug flag can be activated in the script SCRIPT.mos (see appendix G.1) on the line shown in listing A.4.
Listing A.4: SCRIPT.mos debug mode
Main.main({"Modelica", true}); // run grammar 10 with debug on
The output can be redirected to a file as shown in listing A.5. The file result.txt then contains the state stacks used by the lexer and the parser, as well as some transformations and variables that can be traced back to the original source code.
Listing A.5: OMCCP debug mode
omc +g=MetaModelica +d=rml SCRIPT.mos > result.txt
Appendix B
Lexer Generator
B.1 Lexer.mo
Listing B.1: Lexer.mo
package Lexer "Implements the DFA of OMCC"
import Types;
import LexTable;
import LexerCode;

uniontype LexerTable
  record LEXER_TABLE
    array<Integer> accept;
    array<Integer> ec;
    array<Integer> meta;
    array<Integer> base;
    array<Integer> def;
    array<Integer> nxt;
    array<Integer> chk;
    array<Integer> acclist;
  end LEXER_TABLE;
end LexerTable;

uniontype Env
  record ENV
    Integer startSt, currSt;
    Integer pos, sPos, ePos, linenr;
    list<Integer> buff;
    list<Integer> bkBuf;
    list<Integer> stateSk;
    Boolean isDebugging;
    String fileName;
  end ENV;
end Env;

function scan "Scan starts the lexical analysis, load the tables and consume the program to output the tokens"
  input String fileName "input source code file";
  input Boolean debug "flag to activate the debug mode";
  output list<OMCCTypes.Token> tokens "return list of tokens";

algorithm
  // load program
  (tokens) := match (fileName, debug)
    local
      list<OMCCTypes.Token> resTokens;
      list<Integer> streamInteger;
    case (_, _)
      equation
        streamInteger = loadSourceCode(fileName);
        resTokens = lex(fileName, streamInteger, debug);
      then (resTokens);
  end match;
end scan;

function scanString "Scan starts the lexical analysis, load the tables and consume the program to output the tokens"
  input String fileSource "input source code file";
  input Boolean debug "flag to activate the debug mode";
  output list<OMCCTypes.Token> tokens "return list of tokens";

algorithm
  // load program
  (tokens) := match (fileSource, debug)
    local
      list<OMCCTypes.Token> resTokens;
      list<Integer> streamInteger;
      list<String> chars;
    case (_, _)
      equation
        chars = stringListStringChar(fileSource);
        // streamInteger = Util.listMap(chars, stringCharInt);
        streamInteger = list(stringCharInt(c) for c in chars);
        resTokens = lex("<StringSource>", streamInteger, debug);
      then (resTokens);
  end match;
end scanString;

function loadSourceCode
  input String fileName "input source code file";
  output list<Integer> program;
algorithm
  (program) := match (fileName)
    local
      list<Integer> streamInteger;
      list<String> chars;
    case ("")
      equation
        print("Empty FileName");
      then ({});
    case (_)
      equation
        chars = stringListStringChar(System.readFile(fileName));
        // streamInteger = Util.listMap(chars, stringCharInt);
        streamInteger = list(stringCharInt(c) for c in chars);
      then (streamInteger);
  end match;
end loadSourceCode;

function lex "Scan starts the lexical analysis, load the tables and consume the program to output the tokens"
  input String fileName "input source code file";
  input list<Integer> program "source code as a stream of Integers";
  input Boolean debug "flag to activate the debug mode";
  output list<OMCCTypes.Token> tokens "return list of tokens";
  Integer r, cTok;
  list<Integer> cProg;
  list<String> chars;
  array<Integer> mm_accept, mm_ec, mm_meta, mm_base, mm_def, mm_nxt, mm_chk, mm_acclist;
  Env env;
  LexerTable lexTables;
algorithm
  // load arrays

  mm_accept := listArray(LexTable.yy_accept);
  mm_ec := listArray(LexTable.yy_ec);
  mm_meta := listArray(LexTable.yy_meta);
  mm_base := listArray(LexTable.yy_base);
  mm_def := listArray(LexTable.yy_def);
  mm_nxt := listArray(LexTable.yy_nxt);
  mm_chk := listArray(LexTable.yy_chk);
  mm_acclist := listArray(LexTable.yy_acclist);
  lexTables := LEXER_TABLE(mm_accept, mm_ec, mm_meta, mm_base, mm_def, mm_nxt, mm_chk, mm_acclist);

  // Initialize the Env Variables
  env := ENV(1, 1, 1, 0, 1, 1, {}, {}, {1}, debug, fileName);
  if (debug==true) then
    print("\nLexer analyzer LexerCode..." + fileName + "\n");
    // printAny("\nLexer analyzer LexerCode..." + fileName + "\n");
  end if;

  tokens := {};
  if (debug) then
    print("\n TOTAL Chars:");
    print(intString(listLength(program)));
  end if;
  while (Util.isListEmpty(program)==false) loop
    if (debug) then
      print("\nChars remaining:");
      print(intString(listLength(program)));
    end if;
    cTok::program := program;
    cProg := {cTok};
    (tokens, env, cProg) := consume(env, cProg, lexTables, tokens);
    if (Util.isListEmpty(cProg)==false) then
      cTok::cProg := cProg;
      program := cTok::program;
    end if;
  end while;
  tokens := listReverse(tokens);
end lex;

function consume
  input Env env;
  input list<Integer> program;
  input LexerTable lexTables;
  input list<OMCCTypes.Token> tokens;
  output list<OMCCTypes.Token> resToken;
  output Env env2;
  output list<Integer> program2;
  array<Integer> mm_accept, mm_ec, mm_meta, mm_base, mm_def, mm_nxt, mm_chk, mm_acclist;
  Integer mm_startSt, mm_currSt, mm_pos, mm_sPos, mm_ePos, mm_linenr;
  list<Integer> buffer, bkBuffer, states;
  String fileNm;
  Integer c, cp, mm_finish, baseCond;
  Boolean debug;
algorithm
  LEXER_TABLE(accept=mm_accept, ec=mm_ec, meta=mm_meta, base=mm_base,
    def=mm_def, nxt=mm_nxt, chk=mm_chk, acclist=mm_acclist) := lexTables;

  ENV(startSt=mm_startSt, currSt=mm_currSt, pos=mm_pos, sPos=mm_sPos, ePos=mm_ePos,
    linenr=mm_linenr, buff=buffer, bkBuf=bkBuffer, stateSk=states,
    isDebugging=debug, fileName=fileNm) := env;

  mm_finish := LexTable.yy_finish;
  baseCond := mm_base[mm_currSt];
  if (debug==true) then
    print("\nPROGRAM:{" + printBuffer(program, "") + "} ");
    print("\nBUFFER:{" + printBuffer(buffer, "") + "} ");
    print("\nBKBUFFER:{" + printBuffer(bkBuffer, "") + "} ");
    print("\nSTATE STACK:{" + printStack(states, "") + "} ");
    print("base:" + intString(baseCond) + " st:" + intString(mm_currSt) + " ");
  end if;
  (resToken, program2) := match (program, tokens)
    local
      Integer c, d, act, val, c2, curr2, fchar;
      list<Integer> rest;
      list<OMCCTypes.Token> lToken;
      String sToken;
      Boolean emptyToken;
      Option<OMCCTypes.Token> otok;
    case (_, _) // loop tokens
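The arrays stored in LEXER_TABLE (base, def, nxt, chk) are the compressed transition tables that Flex emits. As a rough illustration of how such tables are typically consulted, the following Python sketch builds a tiny hand-made table and scans with it. It is not a translation of consume: the table contents, the three symbol classes, and all function names are assumptions made for this example, and the ec, meta and acclist tables are left out.

```python
# Symbol classes (assumed for this sketch): 0 = letter, 1 = digit, 2 = other.
JAM, START, IDENT, NUMBER = 0, 1, 2, 3

# Comb-compressed transition table. Row JAM is stored in full at offset 0;
# the other rows store only their non-JAM entries and default to JAM.
BASE = [0, 3, 5, 6]                    # offset of each state's row in NXT/CHK
DEF = [JAM, JAM, JAM, JAM]             # fallback state when CHK does not match
NXT = [0, 0, 0, 2, 3, 2, 2, 3, 0]      # target states
CHK = [0, 0, 0, 1, 1, 2, 2, 3, -1]     # owner state of each NXT slot
ACCEPT = {IDENT: "IDENT", NUMBER: "NUMBER"}


def char_class(ch):
    """Map a character to its symbol class."""
    if ch.isalpha():
        return 0
    if ch.isdigit():
        return 1
    return 2


def next_state(state, cls):
    """Follow default links until the slot belongs to the current state."""
    while CHK[BASE[state] + cls] != state:
        state = DEF[state]
    return NXT[BASE[state] + cls]


def tokenize(text):
    """Longest-match scanning; characters that match no rule are skipped."""
    tokens, pos = [], 0
    while pos < len(text):
        state, end, kind, i = START, pos, None, pos
        while i < len(text):
            state = next_state(state, char_class(text[i]))
            if state == JAM:
                break
            i += 1
            if state in ACCEPT:          # remember the last accepting position
                end, kind = i, ACCEPT[state]
        if kind is None:
            pos += 1
        else:
            tokens.append((kind, text[pos:end]))
            pos = end
    return tokens


if __name__ == "__main__":
    print(tokenize("ab12 34"))
```

The point of the layout is that rows can overlap in NXT: a slot is valid for a state only when CHK names that state, so sparse rows share storage and unmatched lookups fall through DEF.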
  output String result;
  String flexCode, re, ar1, rest;
  Boolean resBol;
  list<String> resultRegex, resTable, chars;
algorithm
  // open flex file and validate
  if (debug==true) then
    print("\nGenerating Lexer from " + flexFile);
  end if;
  if (outFileName<>"" and stringLength(outFileName)<15) then
    print("\nReading FLEX grammar file " + flexFile + " ...");
    flexCode := System.readFile(flexFile);
    if (debug==true) then
      print("\nBuild Lex Table...");
    end if;
    resBol := buildLexTable(flexCode, "LexTable" + outFileName);
    if (debug==true) then
      print("\nBuild Lexer...");
    end if;
    resBol := buildLexer(outFileName);
    if (debug==true) then
      print("\nBuild LexerCode...");
    end if;
    resBol := buildLexerCode(flexCode, grammarFile, outFileName);
    result := "Lexer Built";
  else
    result := "Invalid language grammar name";
  end if;
end genLexer;

function readPrologEpilog
  input String lexerCode;
  input String grammarFileName;
  output String lexerCodeIncluded;
  String grammarFile, epilog, prolog, re, ar1, astRootType;
  Integer numMatches, pos1, pos2;
  list<String> resultRegex;
algorithm
  if (debug==true) then
    print("\nRead epilogue and prologue");
  end if;
  grammarFile := System.readFile(grammarFileName);

  // find prologue

  pos1 := System.stringFind(grammarFile, "%{");
  pos2 := System.stringFind(grammarFile, "%}");

  ar1 := System.substring(grammarFile, pos1+3, pos2-1);
  lexerCodeIncluded := System.stringReplace(lexerCode, "%prologue%", ar1);

  //
  /* ar1 := System.stringFindString(grammarFile, "AstTree");
  pos1 := System.stringFind(ar1, "=");
  pos2 := System.stringFind(ar1, ";");
  astRootType := System.substring(ar1, pos1+2, pos2);
  astRootType := System.trim(astRootType, " ");
  parserCodeIncluded := System.stringReplace(parserCodeIncluded, "%astTree%", astRootType); */

  // find epilogue
  re := "%%";
  ar1 := System.stringFindString(grammarFile, re);
  ar1 := System.substring(ar1, 3, stringLength(ar1));
  ar1 := System.stringFindString(ar1, re);
  ar1 := System.substring(ar1, 3, stringLength(ar1));
  lexerCodeIncluded := System.stringReplace(lexerCodeIncluded, "%epilogue%", ar1);

end readPrologEpilog;
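The steps in readPrologEpilog (take the text between `%{` and `%}` as the prologue, take everything after the second `%%` as the epilogue, then substitute both into a template) can be sketched in Python as follows. This is an illustration under assumptions, not the thesis code: the function names are hypothetical, surrounding whitespace is trimmed for convenience, and missing markers yield empty strings.

```python
def split_grammar(text):
    """Return (prologue, epilogue) of a Flex/Bison-style grammar text."""
    # Prologue: the text enclosed by the first "%{" and "%}".
    start = text.find("%{")
    end = text.find("%}")
    prologue = text[start + 2:end].strip() if 0 <= start < end else ""

    # Epilogue: everything after the second "%%" separator.
    first = text.find("%%")
    second = text.find("%%", first + 2) if first >= 0 else -1
    epilogue = text[second + 2:].strip() if second >= 0 else ""
    return prologue, epilogue


def fill_template(template, prologue, epilogue):
    """Substitute the two placeholders used by the lexer code template."""
    return (template.replace("%prologue%", prologue)
                    .replace("%epilogue%", epilogue))


if __name__ == "__main__":
    grammar = ("%{\nimport Absyn;\n%}\n%token ID\n"
               "%%\nprogram: ID;\n%%\n// helper code\n")
    pro, epi = split_grammar(grammar)
    print(fill_template("[%prologue%] [%epilogue%]", pro, epi))
```

Plain string search is enough here because the markers have fixed spellings; a grammar that contains `%%` inside its prologue would need a more careful scan.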
function buildLexerCode
  input String flexCode;
  input String grammarFile;
  input String outFileName;
  output Boolean buildResult;
  list<String> resTable;
  String lexCode, result, rest, stTime, cp, caseAction, re;
  Integer i, numRules, pos, pos2, posBegin, posReturn, posKeepBuffer, posBreak, valBegin;
algorithm
  lexCode := System.readFile("LexerCode.tmo");
  stTime := leyend + getCurrentTimeStr();
  result := System.stringReplace(lexCode, "%LexerCode%", "LexerCode" + outFileName);
  result := System.stringReplace(result, "%time%", stTime);
  result := System.stringReplace(result, "%Token%", "Token" + outFileName);
  result := System.stringReplace(result, "%Lexer%", "Lexer" + outFileName);
  result := System.stringReplace(result, "%ParseTable%", "ParseTable" + outFileName);
  result := System.stringReplace(result, "%nameSpan%", "255");

  result := readPrologEpilog(result, grammarFile);

  caseAction := "";
  resTable := {};
  // print("\nFind value...");
  numRules := findValue(flexCode, "YY_NUM_RULES");
  re := "/* beginning of action switch */";
  rest := System.stringFindString(flexCode, re);
  if (debug==true) then
    print("\nbeginning of action switch");
      rest := System.stringFindString(flexCode, re);
      if (debug==true) then
        print("\n" + re);
      end if;
      re := "#line";
      pos := System.stringFind(rest, re);
      pos2 := System.stringFind(rest, ".l");
      cp := substring2(rest, pos+1, pos2+3);
      resTable := cp::resTable;
      // posReturn, posKeepBuffer, posBreak
      posReturn := System.stringFind(rest, "return");
      posBreak := System.stringFind(rest, "YY_BREAK");
      posBegin := System.stringFind(rest, "BEGIN");
      posKeepBuffer := System.stringFind(rest, "keepBuffer");
      // print("\n pos:" + intString(pos) + ":" + "pos2:" + intString(pos2) + ":" + "posB:" + intString(posBegin));
      if (posBegin < posBreak and posBegin>=0) then // starts BEGIN switch start state
        // find token
        pos := System.stringFind(rest, "(");
        pos2 := System.stringFind(rest, ")");
        cp := substring2(rest, pos+2, pos2);

        valBegin := findValue(flexCode, cp);
        valBegin := 1+2*valBegin;
        if (debug==true) then
          print("\n BEGIN at " + intString(valBegin));
  result := System.stringReplace(result, "package Lexer", cp);
  result := System.stringReplace(result, "end Lexer;", "end Lexer" + outFileName + ";");
  System.writeFile("Lexer" + outFileName + ".mo", result);
  buildResult := true;
end buildLexer;

function buildLexTable
  input String flexCode;
  input String outFileName;
  output Boolean buildResult;
  String cp, re, re1, ar1, rest, result, stTime;
  Integer numMatches, pos1, pos2, len;
  list<String> resultRegex, resTable, chars;

algorithm

  stTime := leyend + getCurrentTimeStr();

  cp := "package " + outFileName + " // " + stTime + " \n\nconstant Integer yy_limit := ";

  resTable := cp::{};

  // Insert yy_limit
  re := "if ( yy_current_state >= ";
  re1 := "if ( yy_current_state >=[^)]*)";
  (numMatches, resultRegex) := System.regex(flexCode, re1, 1, false, false);

  ar1::_ := resultRegex;
  if (debug==true) then
    print("\nFound regex:" + ar1);
  end if;
  numMatches := 0;
  (numMatches, resultRegex) := System.regex("if ( yy_current_state >= 65 )", "[0-9]*", 2, false, false);
  if (debug==true) then
    print("\nNumMatches:" + intString(numMatches));
  end if;
  cp::_ := resultRegex;
  if (debug==true) then
    print("\nFound regex2:" + cp);
  end if;

  rest := System.stringFindString(flexCode, re);

  pos2 := System.stringFind(rest, ")");
  ar1 := substring2(rest, stringLength(re)+1, pos2-1);
  resTable := ar1::resTable;

  cp := ";\n\nconstant Integer yy_finish := ";
  resTable := cp::resTable;
  re := "while ( yy_base[yy_current_state] != ";
  rest := System.stringFindString(flexCode, re);
  pos2 := System.stringFind(rest, ")");
  ar1 := substring2(rest, stringLength(re)+1, pos2-1);
  resTable := ar1::resTable;

  cp := ";\n\nconstant list<Integer> yy_acclist := {";
  resTable := cp::resTable;

  // match acclist
  re := "yy_acclist\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(flexCode, re, 1, false, false);
  ar1::_ := resultRegex;
  if (numMatches > 0) then
    pos1 := System.stringFind(ar1, ",");
    pos2 := System.stringFind(ar1, "}");
    ar1 := substring2(ar1, pos1+2, pos2-1);
  else
    ar1 := "";
  end if;
  resTable := ar1::resTable;

  cp := "};\n\nconstant list<Integer> yy_accept := {";
  resTable := cp::resTable;
  re := "yy_accept\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(flexCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(flexCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := substring2(rest, pos1+2, pos2-1);
    resTable := ar1::resTable;
  end if;
  cp := "};\n\nconstant list<Integer> yy_ec := {";
  resTable := cp::resTable;
  // re := "static yyconst int yy_ec";
  re := "yy_ec\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(flexCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(flexCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := substring2(rest, pos1+2, pos2-1);
    resTable := ar1::resTable;
  end if;
  cp := "};\n\nconstant list<Integer> yy_meta := {";
  resTable := cp::resTable;
  // re := "static yyconst int yy_meta";
  re := "yy_meta\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(flexCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(flexCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := substring2(rest, pos1+2, pos2-1);
    resTable := ar1::resTable;
  end if;

  cp := "};\n\nconstant list<Integer> yy_base := {";
  resTable := cp::resTable;

  // re := "static yyconst short int yy_base";
  re := "yy_base\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(flexCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(flexCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := substring2(rest, pos1+2, pos2-1);
    resTable := ar1::resTable;
  end if;

  cp := "};\n\nconstant list<Integer> yy_def := {";
  resTable := cp::resTable;
  // re := "static yyconst short int yy_def";
  re := "yy_def\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(flexCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(flexCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := substring2(rest, pos1+2, pos2-1);
    resTable := ar1::resTable;
  end if;

  cp := "};\n\nconstant list<Integer> yy_nxt := {";
  resTable := cp::resTable;
  // re := "static yyconst short int yy_nxt";
  re := "yy_nxt\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(flexCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (debug==true) then
    print("\nREST next " + rest);
  end if;
  if (numMatches > 0) then
    // rest := System.stringFindString(flexCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := substring2(rest, pos1+2, pos2-1);
    resTable := ar1::resTable;
  end if;

  cp := "};\n\nconstant list<Integer> yy_chk := {";
  resTable := cp::resTable;
  re := "static yyconst short int yy_chk";
  re := "yy_chk\\[[0-9]*\\] =[^}]*}";
  (numMatches, resultRegex) := System.regex(flexCode, re, 1, false, false);
  rest::_ := resultRegex;
  if (numMatches > 0) then
    // rest := System.stringFindString(flexCode, re);
    pos1 := System.stringFind(rest, ",");
    pos2 := System.stringFind(rest, "}");
    ar1 := substring2(rest, pos1+2, pos2-1);
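The repeated pattern in buildLexTable (match a table such as `yy_accept[N] = { ... }` in the C file produced by Flex, then keep the text after the first comma) can be sketched in Python as follows. This is a hedged illustration and not the thesis code: the function names are hypothetical, the sample C text is invented, and the reading that the first entry is skipped is inferred from the `pos1+2` offset in the listing.

```python
import re


def extract_table(flex_code, name):
    """Return the integer entries of a Flex table, skipping the first one."""
    # Same shape as the regular expression used in the listing.
    pattern = re.escape(name) + r"\[[0-9]*\] =[^}]*}"
    match = re.search(pattern, flex_code)
    if match is None:
        return []
    body = match.group(0)
    inner = body[body.index("{") + 1:body.rindex("}")]
    values = [int(tok) for tok in inner.split(",") if tok.strip()]
    # The listing keeps only the text after the first comma, which
    # appears to drop the leading entry of each table.
    return values[1:]


def to_metamodelica_constant(name, values):
    """Render the entries in the form written into the LexTable package."""
    joined = ", ".join(str(v) for v in values)
    return "constant list<Integer> " + name + " := {" + joined + "};"


if __name__ == "__main__":
    sample = ("static yyconst short int yy_accept[5] =\n"
              "    {   0,\n"
              "        0,    0,    3,    1\n"
              "    } ;\n")
    entries = extract_table(sample, "yy_accept")
    print(to_metamodelica_constant("yy_accept", entries))
```

Parsing the entries into integers, rather than copying the raw substring as the listing does, makes it easier to check the table length, at the cost of losing the original layout.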
added 2005-10-29, changed 2006-02-05
The Info attribute provides location information for elements and classes."

record INFO
  String fileName "fileName where the class is defined in";
  Boolean isReadOnly "isReadOnly : (true|false). Should be true for libraries";
  Integer lineNumberStart "lineNumberStart";
  Integer columnNumberStart "columnNumberStart";
  Integer lineNumberEnd "lineNumberEnd";
  Integer columnNumberEnd "columnNumberEnd";
  Absyn.TimeStamp buildTimes "Build and edit times";
end INFO;

end Info; */

function getTimeStamp
  output Absyn.TimeStamp timeStamp;
algorithm
  timeStamp := Absyn.dummyTimeStamp;
end getTimeStamp;

function printToken
  input Token token;
  output String strTk;
  String tokName;
  Integer idtk, lns, cns, lne, cne;
  list<Integer> val;
  Info info;
algorithm
  TOKEN(name=tokName, id=idtk, value=val, loc=info) := token;
  INFO(lineNumberStart=lns, columnNumberStart=cns, lineNumberEnd=lne, columnNumberEnd=cne) := info;

  strTk := "[TOKEN:" + tokName + " '" + printBuffer(val, "") + "' (" + intString(lns) + ":" + intString(cns) + "-" + intString(lne) + ":" + intString(cne) + ")]";
end printToken;

function getMergeTokenValue
  input Token token1;
  input Token token2;
  output list<Integer> value;
  list<Integer> val1;
  list<Integer> val2;
algorithm
  TOKEN(value=val1) := token1;
  TOKEN(value=val2) := token2;
  value := listAppend(val1, val2);
end getMergeTokenValue;

function printErrorToken
  input Token token;
  output String strTk;
  String tokName, fileNm;
  Integer idtk, lns, cns, lne, cne;
  list<Integer> val;
  Info info;
algorithm
  TOKEN(name=tokName, id=idtk, value=val, loc=info) := token;
  INFO(fileName=fileNm, lineNumberStart=lns, columnNumberStart=cns, lineNumberEnd=lne, columnNumberEnd=cne) := info;

  // strTk := fileNm + ":" + intString(lns) + ":" + intString(cns) + ": Syntax ERROR near token: [" + tokName + " '" + printBuffer(val, "") + "']";
  // strTk := fileNm + ":" + intString(lns) + ":" + intString(cns) + ": Syntax ERROR near '" + printBuffer(val, "") + "'";
  strTk := "'" + printBuffer(val, "") + "'";
end printErrorToken;

function printInfoError
  input Info info;
  output String strTk;
  String tokName, fileNm;
  Integer idtk, lns, cns, lne, cne;
  list<Integer> val;
algorithm
  INFO(fileName=fileNm, lineNumberStart=lns, columnNumberStart=cns, lineNumberEnd=lne, columnNumberEnd=cne) := info;
  // strTk := fileNm + ":" + intString(lns) + ":" + intString(cns) + ": Syntax ERROR near token: [" + tokName + " '" + printBuffer(val, "") + "']";
  strTk := fileNm + ":" + intString(lns) + ":" + intString(cns);
end printInfoError;

function printShortToken
  input Token token;
  output String strTk;
  String tokName;
  Integer idtk, lns, cns, lne, cne;
  list<Integer> val;
  Info info;
algorithm
  TOKEN(name=tokName, id=idtk, value=val, loc=info) := token;
  INFO(lineNumberStart=lns, columnNumberStart=cns, lineNumberEnd=lne, columnNumberEnd=cne) := info;

  // strTk := "[" + tokName + " '" + printBuffer(val, "") + "']";
  strTk := "'" + printBuffer(val, "") + "'";
end printShortToken;

function printShortToken2
  input Token token;
  output String strTk;
  String tokName;
  Integer idtk, lns, cns, lne, cne;
  list<Integer> val;
  Info info;
algorithm
  TOKEN(name=tokName, id=idtk, value=val, loc=info) := token;
  INFO(lineNumberStart=lns, columnNumberStart=cns, lineNumberEnd=lne, columnNumberEnd=cne) := info;

  strTk := "[" + tokName + " '" + printBuffer(val, "") + "']";
  // strTk := "'" + printBuffer(val, "") + "'";
end printShortToken2;

function printTokens
  input list<Token> inList;
  input String cBuff;
  output String outList;
  list<Token> inList2;
algorithm
  (outList) := match (inList, cBuff)
    local
      Token c;
      String new, tout;
      list<Token> rest;
    case ({}, _)
      then (cBuff);
    else
      equation
        c::rest = inList;
        // new = cBuff + printShortToken2(c);
        new = cBuff + printToken(c);
        (tout) = printTokens(rest, new);
      then (tout);
  end match;
end printTokens;

146function countTokens
148 input l i s t<Token> i n L i s t ;input Integer sValue ;
150 output Integer outTotal ;l i s t<Token> i n L i s t 2 ;
152 algorithm// printAny (”\ nhere1 ”) ;
154 ( outTotal ) := match ( inL i s t , sValue )local
156 Token c ;Integer new, tout ;
158 l i s t<Token> r e s t ;case ({} , )
160 then ( sValue+1) ;else
162 equation// printAny (”\ nhere2 ”) ;
164 c : : r e s t = i n L i s t ;// printAny (”\ nhere3 ”) ;
166 new = sValue + 1 ;( tout ) = countTokens ( r e s t ,new) ;
168 then ( tout ) ;end match ;
170 // printAny (”\ nhere4 ”) ;end countTokens ;
172function p r i n t B u f f e r
174 input l i s t<Integer> i n L i s t ;input String cBuf f ;
176 output String outL i s t ;l i s t<Integer> i n L i s t 2 ;
178 algorithm( ou tL i s t ) := match ( inL i s t , cBuf f )
180 localInteger c ;
182 String new, tout ;l i s t<Integer> r e s t ;
184 case ({} , )then ( cBuf f ) ;
186 elseequation
188 c : : r e s t = i n L i s t ;new = cBuff + intStr ingChar ( c ) ;
190 ( tout ) = p r i n t B u f f e r ( r e s t ,new) ;then ( tout ) ;
192 end match ;
106 APPENDIX B. LEXER GENERATOR
end p r i n t B u f f e r ;194
end OMCCTypes ;
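The helper functions in the listing above (printBuffer, printTokens, countTokens) share one pattern: recursion over a list while threading an accumulator. The following is a minimal Python sketch of that pattern, given for illustration only. It is not part of OMCCp; the names `print_buffer` and `count_tokens` are hypothetical, and `chr` stands in for MetaModelica's intStringChar under the assumption that the buffer holds plain character codes.

```python
def print_buffer(codes, acc=""):
    """Append the character for each integer code to acc, left to right."""
    if not codes:                      # corresponds to: case ({}, _) then cBuff
        return acc
    head, rest = codes[0], codes[1:]   # corresponds to: c :: rest = inList
    return print_buffer(rest, acc + chr(head))


def count_tokens(tokens, start):
    """Mirror of countTokens as listed: the empty-list case also adds one."""
    if not tokens:
        return start + 1
    return count_tokens(tokens[1:], start + 1)


if __name__ == "__main__":
    # 109, 111, 100, 101, 108 are the character codes of "model"
    print(print_buffer([109, 111, 100, 101, 108]))
    print(count_tokens(["a", "b", "c"], 0))
```

Note that, as listed, countTokens returns the start value plus the list length plus one, because its base case returns sValue+1 rather than sValue.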
Appendix C
Parser Generator
C.1 Parser.mo
Listing C.1: Parser.mo
package Parser
import Types;
import ParseTable;
import ParseCode;
import Absyn;
import Error;

uniontype Env
  record ENV
    OMCCTypes.Token crTk, lookAhTk;
    list<Integer> state;
    list<String> errMessages;
    Integer errStatus, sState, cState;
    list<OMCCTypes.Token> program, progBk;
    ParseCode.AstStack astStack;
    Boolean isDebugging;
    list<Integer> stateBackup;
    ParseCode.AstStack astStackBackup;
  end ENV;
end Env;

uniontype ParseData
  record PARSE_TABLE
    array<Integer> translate;
    array<Integer> prhs;
    array<Integer> rhs;
    array<Integer> rline;
    array<String> tname;
    array<Integer> toknum;
    array<Integer> r1;
    array<Integer> r2;
    array<Integer> defact;
    array<Integer> defgoto;
    array<Integer> pact;
    array<Integer> pgoto;
    array<Integer> table;
    array<Integer> check;
    array<Integer> stos; // to be replaced
  end PARSE_TABLE;
end ParseData;

/* when the error is positive the parser runs in recovery mode,
   if the error is negative, the parser runs in testing candidate mode
   if the error is zero, then no error is present or has been recovered
   The error value decreases with each shifted token */
  if (debug) then
    print("\n[State:" + intString(cSt) + "]{" + printStack(stateStk, "") + "}\n");
  end if;
  env2 := env;
  // Start the LALR(1) Parsing
  cFinal := ParseTable.YYFINAL;
  cPactNinf := ParseTable.YYPACT_NINF;
  cTableNinf := ParseTable.YYTABLE_NINF;
  prog := tokens;
  // cFinal==cSt is a final state? then ACCEPT
  // mm_pact[cSt]==cPactNinf if this REDUCE or ERROR
  result := true;
  (rTokens, result) := matchcontinue (tokens, env, pt, cFinal==cSt, mm_pact[cSt+1]==cPactNinf)
    local
      list<OMCCTypes.Token> rest;
      list<Integer> vl;
      OMCCTypes.Token c, nt;
      Integer n, len, val, tok, tmTok, chkVal;
      String nm, semVal;
      Absyn.Ident idVal;
    case ({}, _, _, false, false)
      equation
        if (debug) then
          print("\nNow at end of input:\n");
        end if;
        n = mm_pact[cSt+1];
        rest = {};
        if (debug) then
          print("[n:" + intString(n) + "]");
        end if;
        if (n < 0 or ParseTable.YYLAST < n or mm_check[n+1] <> 0) then
          // goto yydefault;
          n = mm_defact[cSt+1];
          if (n==0) then
            // Error Handler
            if (debug) then
              print("\n Syntax Error found yyerrlab5:" + intString(errSt));
              // printAny("\n Syntax Error found yyerrlab5:" + intString(errSt));
            end if;
            if (errSt>=0) then
              (env2, semVal, result) = errorHandler(cTok, env, pt);
              ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, errMessages=errStk, errStatus=errSt, sState=sSt, cState=cSt,
                  program=prog, progBk=prgBk, astStack=astStk, isDebugging=debug, stateBackup=stateSkBk, astStackBackup=astSkBk) = env2;
            else
              result = false;
            end if;
          end if;
          if (debug) then
            print(" REDUCE4");
          end if;
          env2 = reduce(n, env, pt);
          ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, errMessages=errStk, errStatus=errSt, sState=sSt,

          if (n<=0) then
            if (n==0 or n==cTableNinf) then
              // Error Handler
              if (debug) then
                print("\n Syntax Error found yyerrlab4:" + intString(n));
              end if;
              if (errSt>=0) then
                (env2, semVal, result) = errorHandler(cTok, env, pt);
              else
                result = false;
              end if;
              ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, errMessages=errStk, errStatus=errSt, sState=sSt, cState=cSt,
                  program=prog, progBk=prgBk, astStack=astStk, isDebugging=debug, stateBackup=stateSkBk, astStackBackup=astSkBk) = env2;
            end if;
            n = -n;
            if (debug) then
              print(" REDUCE5");
            end if;
            env2 = reduce(n, env, pt);
            ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, errMessages=errStk, errStatus=errSt, sState=sSt, cState=cSt,
                program=prog, progBk=prgBk, astStack=astStk, isDebugging=debug, stateBackup=stateSkBk, astStackBackup=astSkBk) = env2;

          else
            if (debug) then
              print(" SHIFT");
            end if;
            if (errSt<0) then // reduce the shift error lookup
              if (debug) then
                print("\n***-RECOVERY TOKEN INSERTED IS SHIFTED-***");
              end if;
              errSt = maxErrRecShift;
            end if;
            cSt = n;
            stateStk = cSt::stateStk;
            env2 = ENV(c, nt, stateStk, errStk, errSt, sSt, cSt, rest, rest, astStk, debug, stateSkBk, astSkBk);

          end if;
        end if;
        if (result==true and errSt>maxErrRecShift) then // stops when it finds an error
          if (debug) then
            print("\nReprocesing at the END");
          end if;
          (rest, env2, result, ast) = processToken(rest, env2, pt);
        end if;

      then ({}, result);
    case (_, _, _, true, _)

        if (Util.isListEmpty(errStk)==false) then
          printErrorMessages(errStk);
          result = false;
        end if;
        ast = ParseCode.getAST(astStk);
      then ({}, result);
    case (_, _, _, false, true)
      equation
        n = mm_defact[cSt+1];
        if (n == 0) then
          // Error Handler
          if (debug) then
            print("\n Syntax Error found yyerrlab3:" + intString(n));
          end if;
          if (errSt>=0) then
            (env2, semVal, result) = errorHandler(cTok, env, pt);
            ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, errMessages=errStk, errStatus=errSt, sState=sSt, cState=cSt,
                program=prog, progBk=prgBk, astStack=astStk, isDebugging=debug, stateBackup=stateSkBk, astStackBackup=astSkBk) = env2;
          else
            result = false;
          end if;
        end if;
        // reduce;
        if (debug) then
          print("REDUCE3");
        end if;

        env2 = reduce(n, env, pt);

        if (result==true) then // stops when it finds an error
          (rest, env2, result, ast) = processToken(tokens, env2, pt);
        end if;

      then (rest, result);
    case (_, _, _, false, false)
      equation
        /* Do appropriate processing given the current state. Read a
           lookahead token if we need one and don't already have one. */
        c::rest = tokens;
        cTok = c;
        OMCCTypes.TOKEN(id=tmTok, name=nm, value=vl) = c;
        semVal = printBuffer(vl, "");
        if (debug) then

        /* First try to decide what to do without reference to lookahead token. */

        n = mm_pact[cSt+1];
        if (debug) then
          print("[n:" + intString(n) + "-");
        end if;

        n = n + tok;
        if (debug) then
          print("NT:" + intString(n) + "]");
        end if;
        chkVal = n+1;
        if (chkVal<=0) then
          chkVal = 1;
        end if;
        if (n < 0 or ParseTable.YYLAST < n or mm_check[chkVal] <> tok) then
          // goto yydefault;
          n = mm_defact[cSt+1];
          if (n==0) then
            // Error Handler
            if (debug) then
              print("\n Syntax Error found yyerrlab2:" + intString(n));
            end if;
            if (errSt>=0) then
              (env2, semVal, result) = errorHandler(cTok, env, pt);
              ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, errMessages=errStk, errStatus=errSt, sState=sSt, cState=cSt,
                  program=prog, progBk=prgBk, astStack=astStk, isDebugging=debug, stateBackup=stateSkBk, astStackBackup=astSkBk) = env2;
            else
              errSt = maxErrRecShift;
              result = false;
            end if;
          else
            if (debug) then
              print(" REDUCE2");
            end if;
            env2 = reduce(n, env, pt);
            ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, errMessages=errStk, errStatus=errSt, sState=sSt, cState=cSt,
                program=prog, progBk=prgBk, astStack=astStk, isDebugging=debug, stateBackup=stateSkBk, astStackBackup=astSkBk) = env2;
            rest = tokens;
            (rest, env2, result, ast) = processToken(rest, env2, pt);
          end if;
        else
          // try to get the value for the action in the table array
          n = mm_table[n+1];
          if (n<=0) then
            //
            if (n==0 or n==cTableNinf) then
              // Error Handler
              if (debug) then
                print("\n Syntax Error found yyerrlab:" + intString(n));
              end if;
              if (errSt>=0) then
                (env2, semVal, result) = errorHandler(cTok, env, pt);
                ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, errMessages=errStk, errStatus=errSt, sState=sSt, cState=cSt,
                    program=prog, progBk=prgBk, astStack=astStk, isDebugging=debug, stateBackup=stateSkBk, astStackBackup=astSkBk) = env2;
              else
                result = false;
                errSt = maxErrRecShift;
              end if;
            else
              n = -n;
              if (debug) then
                print(" REDUCE");
              end if;
              env2 = reduce(n, env, pt);
              ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, errMessages=errStk, errStatus=errSt, sState=sSt, cState=cSt,
                  program=prog, progBk=prgBk, astStack=astStk, isDebugging=debug, stateBackup=stateSkBk, astStackBackup=astSkBk) = env2;
              rest = tokens;
              (rest, env2, result, ast) = processToken(rest, env2, pt);
            end if;
          else
            if (debug) then
              print(" SHIFT1");
            end if;
            cSt = n;
            stateStk = cSt::stateStk;
  Boolean debug, error;
  list<Integer> stateStk, stateSkBk;
  list<String> errStk, redStk;
  String astTmp, semVal, errMsg;
  Integer errSt, sSt, cSt;
  list<OMCCTypes.Token> prog, prgBk;
  Integer i, len, val, n, nSt, chkVal;
algorithm
  PARSE_TABLE(translate=mm_translate, prhs=mm_prhs, rhs=mm_rhs, rline=mm_rline, tname=mm_tname, toknum=mm_toknum, r1=mm_r1, r2=mm_r2
    , defact=mm_defact, defgoto=mm_defgoto, pact=mm_pact, pgoto=mm_pgoto, table=mm_table, check=mm_check, stos=mm_stos) := pt;

  ENV(crTk=cTok, lookAhTk=nTk, state=stateStk, sState=sSt, errMessages=errStk, errStatus=errSt, cState=cSt, program=prog, progBk=prgBk, astStack=astStk,
  result := System.stringReplace(result, "package Parser", cp);
  result := System.stringReplace(result, "end Parser;", "end Parser" + outFileName + ";");
  System.writeFile("Parser" + outFileName + ".mo", result);
  buildResult := true;
end buildParser;
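The table-driven decision made in the Parser.mo listing above (look up pact for the current state, add the token number, validate the index against check, then read table, and otherwise fall back to defact) follows the standard scheme of Bison's generated tables. The following is a small Python sketch of that single decision step, under stated assumptions: the function name `next_action`, the sentinel values, and the toy tables in the usage note are hypothetical and do not come from any real grammar. Python lists are 0-based, so the `+1` offsets that the MetaModelica code needs for its 1-based arrays do not appear here.

```python
PACT_NINF = -100   # hypothetical stand-in for ParseTable.YYPACT_NINF
TABLE_NINF = -101  # hypothetical stand-in for ParseTable.YYTABLE_NINF


def next_action(state, token, pact, check, table, defact):
    """Return ("shift", s), ("reduce", r) or ("error", None) for one step."""
    last = len(table) - 1                  # plays the role of YYLAST
    n = pact[state]
    if n != PACT_NINF:                     # state has token-specific actions
        n += token
        if 0 <= n <= last and check[n] == token:
            n = table[n]
            if n > 0:
                return ("shift", n)        # positive entry: the new state
            if n == 0 or n == TABLE_NINF:
                return ("error", None)
            return ("reduce", -n)          # negative entry: rule number
    # Default action: reduce by defact[state]; 0 means syntax error.
    rule = defact[state]
    return ("reduce", rule) if rule != 0 else ("error", None)
```

With the toy tables `pact = [0, PACT_NINF]`, `check = [0, 1, 2]`, `table = [3, -2, 0]` and `defact = [0, 5]`, state 0 shifts to state 3 on token 0, reduces by rule 2 on token 1, and reports an error on token 2, while state 1 always takes its default reduction by rule 5.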
  "$accept", "program", "series", "statement", "input_statement",
  "output_statement", "variable_list", "assignment_statement",
  "conditional_statement", "definite_loop", "while_loop", "expression",
  "term", "element", "comparison", "variable", "constant", "relation",
  "weak_operator", "strong_operator" };

constant list<Integer> yytoknum = {
  256, 257, 258, 259, 260, 261, 262, 263, 264,
  ASTSTACK(stackRelOp=skRelOp, stackBinOp=skBinOp, stackExp=skExp, stackIdent=skIdent, stackIdentLst=skIdentLst, stackStmt=skStmt, stackString=skString, stackInteger=skInteger) := astStk;
  () := matchcontinue (act, astStk)
    local
      // local variables
      RelOp vRelOp, v1RelOp, v2RelOp, v3RelOp, v4RelOp, v5RelOp

  stackIdent=skIdent, stackIdentLst=skIdentLst, stackStmt=skStmt, stackString=skString, stackInteger=skInteger) := astStk;
  skString := inVal::skString;
  astStk2 := ASTSTACK(skRelOp, skBinOp, skExp, skIdent, skIdentLst, skStmt, skString, skInteger);
end push;


function printAST "print the AST built by the parsing"
  input AstStack astStk "MultiTypedStack used by the parser";
  output AstTree ast "returns the AST in the final type of the tree";
  list<AbsynPAM.Stmt> retStk;
algorithm
  ASTSTACK(stackStmt=retStk) := astStk;
  printAny(ast);
  ast::_ := retStk;
end printAST;

function getSemValue "retrieves semval from tokens"
  input Integer tokenId;
  output String tokenSemValue "returns semantic value of the token";
  array<String> values;
algorithm
  values := listArray(lstSemValue);
  tokenSemValue := values[tokenId];
end getSemValue;


end ParseCode10;
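The ASTSTACK record in the listing above groups one stack per semantic type, and push returns a new record in which only the string stack has changed. The following Python sketch illustrates that idea under stated assumptions: the class `AstStack`, its three fields, and `push_string` are hypothetical simplifications (the generated record has one field per AST type), and an immutable dataclass stands in for the MetaModelica record.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AstStack:
    """Simplified multi-typed stack: one tuple per semantic value type."""
    stack_string: tuple = ()
    stack_integer: tuple = ()
    stack_stmt: tuple = ()


def push_string(stk, value):
    """Return a new stack record with value pushed on the string stack."""
    # Mirrors: skString := inVal :: skString, then rebuild the record.
    return replace(stk, stack_string=(value,) + stk.stack_string)


if __name__ == "__main__":
    empty = AstStack()
    one = push_string(empty, "x")
    two = push_string(one, "y")
    print(two.stack_string)   # most recently pushed value comes first
    print(empty.stack_string) # the original record is unchanged
```

Keeping a separate stack per type lets each grammar action pop values that already have the type it expects, at the cost of a wider record.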
E.3 Token10.mo
Listing E.3: Token10.mo
package Token10 // generated by OMCC v0.7 generated by OMCC v0.7 Fri Apr 29 17:00:58 2011

constant Integer T_READ = 258;
constant Integer T_WRITE = 259;
constant Integer T_ASSIGN = 260;
constant Integer T_IF = 261;
constant Integer T_THEN = 262;
constant Integer T_ENDIF = 263;
constant Integer T_ELSE = 264;
constant Integer T_TO = 265;
constant Integer T_DO = 266;
constant Integer T_END = 267;
constant Integer T_WHILE = 268;
constant Integer T_LPAREN = 269;
constant Integer T_RPAREN = 270;
constant Integer T_IDENT = 271;
constant Integer T_INTCONST = 272;
constant Integer T_EQ = 273;
constant Integer T_LE = 274;
constant Integer T_LT = 275;
constant Integer T_GT = 276;
constant Integer T_GE = 277;
constant Integer T_NE = 278;
constant Integer T_ADD = 279;
constant Integer T_SUB = 280;
constant Integer T_MUL = 281;
constant Integer T_DIV = 282;
constant Integer T_SEMIC = 283;
end Token10;
E.4 LexTable10.mo
Listing E.4: LexTable10.mo
package LexTable10 // generated by OMCC v0.7 Fri Apr 29 17:00:58 2011
99 ” $accept ” ,”program” , ” with in ” , ” c l a s s e s l i s t ” , ” c l a s s ” , ” c l a s s p r e f i x ” ,
101 ” encapsu lated ” , ” p a r t i a l ” , ” r e s t r i c t i o n ” , ” c l a s s d e f ” ,” c l a s sde f enumera t i on ” , ” c l a s s d e f d e r i v e d ” , ” enumeration ” , ”
enuml i s t ” ,103 ” enuml i t e r a l ” , ” c l a s s p a r t s ” , ” c l a s s p a r t ” , ” r e s t C l a s s ” ,
” a l g o r i t h m s e c t i o n ” , ” a lgor i thmitem ” , ” a lgor i thm ” , ”i f a l g o r i t h m ” ,
105 ” a l g e l s e i f s ” , ” a l g e l s e i f ” , ” when algorithm ” , ” a lge l s ewhens ” ,” a lge l s ewhen ” , ” e q u a t i o n s e c t i o n ” , ” equat ion item ” , ” equat ion ” ,
107 ” when equation ” , ” e l sewhens ” , ” elsewhen ” , ” f o r i t e r a t o r s ” , ”f o r i t e r a t o r ” ,
” i f e q u a t i o n ” , ” e l s e i f s ” , ” e l s e i f ” , ” e lementItems ” , ”elementItem ” ,
”component” , ” mod i f i c a t i on ” , ” redec la rekeywords ” , ” inne rou t e r ”,
111 ” importe lementspec ” , ” c l a s s e l e me nt s pe c ” , ” import ” , ”e lementspec ” ,
” elementAttr ” , ” v a r i a b i l i t y ” , ” d i r e c t i o n ” , ” typespec ” , ”arrayComplex” ,
113 ” typespecs ” , ” a r r aySubsc r i p t s ” , ”arrayDim” , ” f u n c t i o n c a l l ” ,” f u n c t i o n a r g s ” , ”namedargs” , ”namedarg” , ”exp” , ”matchcont” , ”
i f e x p ” ,115 ” e x p e l s e i f s ” , ” e x p e l s e i f ” , ” matchloca l ” , ” ca s e s ” , ” case ” , ”
casearg ” ,” simpleExp” , ” h e a d t a i l ” , ”rangeExp” , ” l o g i c e x p ” , ” l og i c t e rm ” ,
117 ” l o g f a c t o r ” , ” r e l t e rm ” , ”addterm” , ”term” , ” f a c t o r ” , ”expElement” ,
” tup l e ” , ” e x p l i s t ” , ” e x p l i s t 2 ” , ” c r e f ” , ” woperator ” , ”sope ra to r ” ,
119 ”power” , ” r e lOperato r ” , ”path” , ” ident ” , ” s t r i n g ” , ”comment” } ;
%}

%token T_ALGORITHM
%token T_AND
%token T_ANNOTATION
%token BLOCK
%token CLASS
%token CONNECT
%token CONNECTOR
%token CONSTANT
%token DISCRETE
%token DER
%token DEFINEUNIT
%token EACH
%token ELSE
%token ELSEIF
%token ELSEWHEN
%token T_END
%token ENUMERATION
%token EQUATION
%token ENCAPSULATED
%token EXPANDABLE
%token EXTENDS
%token CONSTRAINEDBY
%token EXTERNAL
%token T_FALSE
%token FINAL
%token FLOW
%token FOR
%token FUNCTION
%token IF
%token IMPORT
%token T_IN
%token INITIAL
%token INNER
%token T_INPUT
%token LOOP
%token MODEL
%token T_NOT
%token T_OUTER
%token OPERATOR
%token OVERLOAD
%token T_OR
%token T_OUTPUT
%token T_PACKAGE
%token PARAMETER
%token PARTIAL
%token PROTECTED
%token PUBLIC
%token RECORD
%token REDECLARE
%token REPLACEABLE
%token RESULTS
%token THEN
%token T_TRUE
%token TYPE
%token UNSIGNED_REAL
%token WHEN
%token WHILE
%token WITHIN
%token RETURN
%token BREAK
%token DOT
%token LPAR
%token RPAR
%token LBRACK
%token RBRACK
%token LBRACE
%token RBRACE
%token EQUALS
%token ASSIGN
%token COMMA
%token COLON
%token SEMICOLON
%token CODE
%token CODE_NAME
%token CODE_EXP
%token CODE_VAR
%token PURE
%token IMPURE
%token IDENT
%token DIGIT
%token UNSIGNED_INTEGER

%token STAR
%token MINUS
%token PLUS
%token LESSEQ
%token LESSGT
%token LESS
%token GREATER
%token GREATEREQ
%token EQEQ
%token POWER
%token SLASH

%token STRING

%token PLUS_EW
%token MINUS_EW
%token STAR_EW
%token SLASH_EW
%token POWEREW

%token STREAM

%token AS
%token CASE
%token EQUALITY
%token FAILURE
%token GUARD
%token LOCAL
%token MATCH
%token MATCHCONTINUE
%token UNIONTYPE
%token ALLWILD
%token WILD
%token SUBTYPEOF
%token COLONCOLON
%token MOD
%token ENDIF
%token ENDFOR
%token ENDWHILE
%token ENDWHEN
%token ENDCLASS
%token ENDMATCHCONTINUE
%token ENDMATCH
//%expect 42

%%

/* Yacc BNF grammar of the Modelica+MetaModelica language */
program : c l a s s e s l i s t257 { ( absyntree ) [ Program ] = Absyn .
PROGRAM( $1 [ l s t C l a s s ] , Absyn .TOP( ) , Absyn .TIMESTAMP(System. getCurrentTime ( ) ,System .getCurrentTime ( ) ) ) ; }
| with in c l a s s e s l i s t259 { ( absyntree ) [ Program ] = Absyn .
PROGRAM( $2 [ l s t C l a s s ] , $1 [Within ] , Absyn .TIMESTAMP(System . getCurrentTime ( ) ,System . getCurrentTime ( ) ) ) ; }
261with in : WITHIN path SEMICOLON { $$ [ Within ] =
Absyn .WITHIN( $2 [ Path ] ) ; }263
c l a s s e s l i s t : class SEMICOLON { $$ [ l s t C l a s s ] = $1 [Class ] : : { } ; }
265 | class SEMICOLON c l a s s e s l i s t { $$ [l s t C l a s s ] = $1 [ Class ] : : $2 [ l s t C l a s s ] ;}
/∗ r e s t r i c t i o n IDENT c l a s s d e f T ENDIDENT SEMICOLON
267 { i f ( not s t r ingEqua l ( $2 , $5 ) )then p r i n t ( Types .p r i n t I n f o E r r o r ( i n f o ) + ”Error : The i d e n t i f i e r ats t a r t and end are d i f f e r e n t’” + $2 + ” ’”) ;
t rue = ( $2 == $5 ) ;269 end i f ; $$ [ Class ] = Absyn .
CLASS( $2 , f a l s e , f a l s e , f a l s e, $1 [ R e s t r i c t i o n ] , $3 [ClassDef ] , i n f o ) ; }
∗/271
class : r e s t r i c t i o n IDENT c l a s s d e f273 { $$ [ Class ] = Absyn .CLASS( $2 ,
false , false , false , $1 [R e s t r i c t i o n ] , $3 [ ClassDef ] ,i n f o ) ; }
| c l a s s p r e f i x r e s t r i c t i o n IDENT c l a s s d e f275 { ( v1Boolean , v2Boolean , v3Boolean
| s t r i n g c r e f EQUALS ident LPAR e x p l i s t 2RPAR annotat ion { $$ [ ExternalDec l ] =Absyn .EXTERNALDECL(SOME( $4 [ Ident ] ) ,SOME( $1 ) ,SOME( $2 [ ComponentRef ] ) , $6 [Exps ] ,SOME( $8 [ Annotation ] ) ) ; }
367 | s t r i n g ident LPAR e x p l i s t 2 RPARannotat ion { $$ [ ExternalDec l ] = Absyn.EXTERNALDECL(SOME( $2 [ Ident ] ) ,SOME( $1) ,NONE( ) , $4 [ Exps ] ,SOME( $6 [ Annotation] ) ) ; }
| s t r i n g ident LPAR e x p l i s t 2 RPAR { $$ [ExternalDec l ] = Absyn .EXTERNALDECL(SOME( $2 [ Ident ] ) ,SOME( $1 ) ,NONE( ) , $4 [Exps ] ,NONE( ) ) ; }
369/∗ ALGORITHMS ∗/
371a l g o r i t h m s e c t i o n : a lgor i thmitem SEMICOLON { $$ [
AlgorithmItems ] = $1 [ AlgorithmItem ] : : { } ; }373 | a lgor i thmitem SEMICOLON
a l g o r i t h m s e c t i o n { $$ [ AlgorithmItems] = $1 [ AlgorithmItem ] : : $2 [AlgorithmItems ] ; }
375 a lgor i thmitem : algorithm comment{ $$ [ AlgorithmItem ] = Absyn .
ALGORITHMITEM( $1 [ Algorithm ] ,SOME($2 [ Comment ] ) , i n f o ) ; }
377algorithm : simpleExp ASSIGN exp // TOREV: c r e f
397 i f a l g o r i t h m : IF exp THEN ENDIF { $$ [ Algorithm ] =Absyn . ALG IF( $2 [ Exp ] ,{} ,{} ,{} ) ; } // warning empty i f
| IF exp THEN a l g o r i t h m s e c t i o n ENDIF { $$ [Algorithm ] = Absyn . ALG IF( $2 [ Exp ] , $4 [AlgorithmItems ] ,{} ,{} ) ; }
399 | IF exp THEN a l g o r i t h m s e c t i o n ELSEa l g o r i t h m s e c t i o n ENDIF { $$ [ Algorithm] = Absyn . ALG IF( $2 [ Exp ] , $4 [AlgorithmItems ] ,{} , $6 [ AlgorithmItems] ) ; }
| IF exp THEN a l g o r i t h m s e c t i o n a l g e l s e i f sENDIF { $$ [ Algorithm ] = Absyn . ALG IF
( $2 [ Exp ] , $4 [ AlgorithmItems ] , $5 [A l g E l s e i f s ] , { } ) ; }
401 | IF exp THEN a l g o r i t h m s e c t i o n a l g e l s e i f sELSE a l g o r i t h m s e c t i o n ENDIF { $$ [
Algorithm ] = Absyn . ALG IF( $2 [ Exp ] , $4 [AlgorithmItems ] , $5 [ A l g E l s e i f s ] , $7 [AlgorithmItems ] ) ; }
403 a l g e l s e i f s : a l g e l s e i f { $$ [ A l g E l s e i f s ] = $1 [A l g E l s e i f ] : : { } ; }
| a l g e l s e i f a l g e l s e i f s { $$ [ A l g E l s e i f s ]= $1 [ A l g E l s e i f ] : : $2 [ A l g E l s e i f s ] ; }
405a l g e l s e i f : ELSEIF exp THEN a l g o r i t h m s e c t i o n { $$
[ A l g E l s e i f ] = ( $2 [ Exp ] , $4 [ AlgorithmItems ] ) ; }407
when algorithm : WHEN exp THEN a l g o r i t h m s e c t i o n ENDWHEN409 { $$ [ Algorithm ] = Absyn .ALG WHEN A( $2
[ Exp ] , $4 [ AlgorithmItems ] , { } ) ; }
| WHEN exp THEN a l g o r i t h m s e c t i o na lge l s ewhens ENDWHEN
f o r i t e r a t o r s : f o r i t e r a t o r { $$ [ F o r I t e r a t o r s ] = $1 [F o r I t e r a t o r ] : : { } ; }
449 | f o r i t e r a t o r COMMA f o r i t e r a t o r s { $$ [F o r I t e r a t o r s ] = $1 [ F o r I t e r a t o r ] : : $2 [F o r I t e r a t o r s ] ; }
451 f o r i t e r a t o r : IDENT { $$ [ F o r I t e r a t o r ] = Absyn .ITERATOR( $1 ,NONE( ) ,NONE( ) ) ; }
| IDENT T IN exp { $$ [ F o r I t e r a t o r ] = Absyn.ITERATOR( $1 ,NONE( ) ,SOME( $3 [ Exp ] ) ) ; }
453i f e q u a t i o n : IF exp THEN e q u a t i o n s e c t i o n ENDIF { $$ [
Equation ] = Absyn . EQ IF( $2 [ Exp ] , $4 [ EquationItems ] ,{} ,{} ) ; }455 | IF exp THEN e q u a t i o n s e c t i o n ELSE
e q u a t i o n s e c t i o n ENDIF { $$ [ Equation ] =Absyn . EQ IF( $2 [ Exp ] , $4 [ EquationItems
] ,{} , $6 [ EquationItems ] ) ; }| IF exp THEN e q u a t i o n s e c t i o n e l s e i f s
ENDIF { $$ [ Equation ] = Absyn . EQ IF( $2 [Exp ] , $4 [ EquationItems ] , $5 [ E l s e i f s ] , { } ); }
457 | IF exp THEN e q u a t i o n s e c t i o n e l s e i f s ELSEe q u a t i o n s e c t i o n ENDIF { $$ [ Equation ]
= Absyn . EQ IF( $2 [ Exp ] , $4 [ EquationItems] , $5 [ E l s e i f s ] , $7 [ EquationItems ] ) ; }
459 e l s e i f s : e l s e i f { $$ [ E l s e i f s ] = $1 [ E l s e i f ] : : { } ;}
| e l s e i f e l s e i f s { $$ [ E l s e i f s ] = $1 [E l s e i f ] : : $2 [ E l s e i f s ] ; }
461e l s e i f : ELSEIF exp THEN e q u a t i o n s e c t i o n { $$ [
E l s e i f ] = ( $2 [ Exp ] , $4 [ EquationItems ] ) ; }463
515 componentcondit ion : IF exp { $$ [ ComponentCondition ] = $1 [ Exp ] ;}
517 component : ident a r raySubsc r i p t s mod i f i c a t i on { $$ [Component ] = Absyn .COMPONENT( $1 [ Ident ] , $2 [ ArrayDim ] ,SOME( $3 [Mod i f i ca t i on ] ) ) ; }
| i dent a r raySubsc r i p t s { $$ [ Component ] =Absyn .COMPONENT( $1 [ Ident ] , $2 [ ArrayDim ] ,NONE( ) ) ; }
519mod i f i c a t i on : EQUALS exp { $$ [ Mod i f i ca t i on ] = Absyn .
CLASSMOD({} , Absyn .EQMOD( $2 [ Exp ] , i n f o ) ) ; }521 | ASSIGN exp { $$ [ Mod i f i ca t i on ] = Absyn .
CLASSMOD({} , Absyn .EQMOD( $2 [ Exp ] , i n f o ) ) ;}
| c l a s s m o d i f i c a t i o n { $$ [ Mod i f i ca t i on ] = $1[ Mod i f i ca t i on ] ; }
523c l a s s m o d i f i c a t i o n : e l ementargs
525 { $$ [ Mod i f i ca t i on ] = Absyn .CLASSMOD( $1 [ElementArgs ] , Absyn .NOMOD( ) ) ; }
| e lementargs EQUALS exp527 { $$ [ Mod i f i ca t i on ] = Absyn .CLASSMOD( $1 [
ElementArgs ] , Absyn .EQMOD( $3 [ Exp ] , i n f o )) ; }
529 annotat ion : T ANNOTATION elementargs { $$ [ Annotation ]=Absyn .ANNOTATION( $1 [ ElementArgs ] ) ; }
531 e lementargs : LPAR argument l i s t RPAR { $$ [ ElementArgs ] =$1 [ ElementArgs ] ; }
533 e lementargs2 : LPAR argument l i s t RPAR { $$ [ ElementArgs ]= $1 [ ElementArgs ] ; }
| /∗ empty ∗/ { $$ [ ElementArgs ] = {} ; }535
argument l i s t : e lementarg { $$ [ ElementArgs ] = {$1 [ElementArg ] } ; }
537 | e lementarg COMMA argument l i s t { $$ [ElementArgs ] = $1 [ ElementArg ] : : $2 [ElementArgs ] ; }
539 elementarg : e a c h p r e f i x f ina l c r e f{ $$ [ ElementArg ] = Absyn .MODIFICATION( $2 [
function printContentStack
  input AstStack astStk;
  list<Token> skToken;
  list<Path> skPath;
  list<ClassDef> skClassDef;
  list<Ident> skIdent;
  list<Class> skClass;
  list<Program> skProgram;
  list<lstClass> sklstClass;
  list<String> skString;
  list<Integer> skInteger;
algorithm
  ASTSTACK(stackToken=skToken, stackPath=skPath, stackClassDef=skClassDef, stackIdent=skIdent, stackClass=skClass, stackProgram=skProgram, stacklstClass=sklstClass, stackString=skString, stackInteger=skInteger) := astStk;
protected function readSettings
"function: readSettings
 author: x02lucpo
 Checks if 'settings.mos' exist and uses handleCommand with runScript(...) to execute it.
 Checks if '-s <file>.mos' has been
 returns Interactive.InteractiveSymbolTable which is used in the rest of the loop"
  input list<String> inStringLst;
  output String str;
algorithm
  str :=
  matchcontinue (inStringLst)
    local
      list<String> args;
    case (args)
      equation
        outSymbolTable = Interactive.emptySymboltable;
        "" = Util.flagValue("-s", args);
        // this is out-commented because automatically reading settings.mos
        // can make a system bad
        // outSymbolTable = readSettingsFile("settings.mos", Interactive.emptySymboltable);
      then
        outSymbolTable;
    case (args)
      equation
        str = Util.flagValue("-s", args);
        str = System.trim(str, "\"");
        outSymbolTable = readSettingsFile(str, Interactive.emptySymboltable);
      then
        outSymbolTable;
  end matchcontinue;
end readSettings;


end Main;
Glossary
Abstract Syntax Tree (AST): A data structure representing something which has been parsed, often used as a compiler or interpreter's internal representation of a program while it is being optimised and from which code generation is performed. The range of all possible such structures is described by the abstract syntax [Howe, 2010, http://foldoc.org/abstract+syntax+tree].

Backus-Naur Form: BNF is a formal metasyntax used to express context-free grammars [Howe, 2010, http://foldoc.org/Backus-Naur+Form].

Compiler-Compiler: A utility to generate the source code of a parser, interpreter or compiler from an annotated language description (usually in BNF). Most so-called compiler-compilers are really just parser generators [Howe, 2010, http://foldoc.org/compiler-compiler].

Extended Backus-Naur Form: EBNF is a variation on the basic BNF meta-syntax notation with (some of) the following additional constructs: square brackets surrounding optional items, suffix "*" for Kleene closure (a sequence of zero or more of an item), suffix "+" for one or more of an item, curly brackets enclosing a list of alternatives, and super/subscripts indicating between n and m occurrences [Howe, 2010, http://foldoc.org/Extended+Backus-Naur+Form].

Functional Programming: A program in a functional language consists of a set of (possibly recursive) function definitions and an expression whose value is output as the program's result. Functional languages are one kind of declarative language. They are mostly based on the typed lambda-calculus with constants. There are no side-effects to expression evaluation, so an expression, e.g. a function applied to certain arguments, will always evaluate to the same value (if its evaluation terminates). Furthermore, an expression can always be replaced by its value without changing the overall result (referential transparency) [Howe, 2010, http://foldoc.org/functional+programming].

GNU Bison: Bison is a general-purpose parser generator that converts an annotated context-free grammar into an LALR(1) or GLR parser for that grammar [Donnelly and Stallman, 2010, http://www.gnu.org/software/bison/].

Lexer: A lexer is a program that performs the lexical analysis in a compiler.

MetaModelica: MetaModelica is an extension of the Modelica language created with the purpose of allowing people from the Modelica community to contribute to the development of the OpenModelica compiler (OMC) [Pop and Fritzson, 2006].

Modelica: Modelica is an object-oriented equation-based programming language that allows specification of mathematical models of complex natural or man-made systems [Fritzson, 2004].

Parser: A parser is a program that performs the syntax analysis in a compiler.

UTF-8 (UCS Transformation Format 8): An 8-bit, ASCII-compatible, multi-byte Unicode and UCS encoding [Howe, 2010, http://foldoc.org/utf-8].
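The UTF-8 entry can be made concrete with a short example. The following Python sketch is for illustration only and is not taken from OMCCp; the helper name `utf8_lengths` is hypothetical. It shows that ASCII characters keep their single-byte values, while other characters occupy several 8-bit units, which is why a lexer that reads bytes sees more input positions than there are characters.

```python
def utf8_lengths(text):
    """Return the number of UTF-8 bytes used by each character of text."""
    return [len(ch.encode("utf-8")) for ch in text]


if __name__ == "__main__":
    word = "Linköping"
    data = word.encode("utf-8")
    print(len(word), len(data))   # characters versus bytes
    print(utf8_lengths(word))     # only the non-ASCII letter needs two bytes
    print(list(data[:4]))         # ASCII letters keep their ASCII codes
```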
Title of series, numbering: Linköping Studies in Science and Technology, Thesis No. LIU-IDA/LITH-EX-A–11/019–SE
Title: OMCCp: A MetaModelica Based Parser Generator Applied to Modelica
Author: Edgar Alonso Lopez-Rojas
Abstract
The OpenModelica Compiler-Compiler parser generator (OMCCp) is an LALR(1) parser generator implemented in the MetaModelica language, with parsing tables generated by the tools Flex and GNU Bison. The code generated for the parser is in the MetaModelica 2.0 language, which is the OpenModelica compiler implementation language and an extension of the Modelica 3.2 language. OMCCp takes as input an LALR(1) grammar that specifies the Modelica language. The generated parser can be used inside the OpenModelica Compiler (OMC) as a replacement for the current parser, which is generated by the tool ANTLR from an LL(k) Modelica grammar. This report explains the design and implementation of this novel lexer and parser generator called OMCCp.
Modelica and its extension MetaModelica are both languages used in the OpenModelica environment. Modelica is an object-oriented equation-based language for modeling and simulation.
IDA, Dept. of Computer and Information Science, 581 83 Linköping