Compiler Design Lectures

Apr 04, 2018

  • 7/30/2019 Compiler Design Lectures

    CS 434

    Compilers Design

    Dr. Ayman Hamarsheh

    Lecture 1

    Introduction

    Programs, Interpreters and Translators

    Programming languages are notations for describing computations to people and to machines.

    All the software running on all the computers was written in some programming language.

    Before a program can be run, it first must be translated into a form in which it can be executed by a computer.

    The software systems that do this translation are called compilers.

    A compiler is a program that can read a program in one language (the source language) and translate it into an equivalent program in another language (the target language).

    An important role of the compiler is to report any errors in the source program that it detects during the translation process.

    If the target program is an executable machine-language program, it can then be called by the user to process inputs and produce outputs.

    An interpreter is another common kind of language processor.

    Instead of producing a target program as a translation, an interpreter appears to directly execute the operations specified in the source program on inputs supplied by the user.

    The machine-language target program produced by a compiler is usually much faster than an interpreter at mapping inputs to outputs.

    An interpreter, however, can usually give better error diagnostics than a compiler, because it executes the source program statement by statement.
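
    The difference can be sketched on a toy language. The postfix notation, the single input variable x, and the function names interpret and compile_to_python are assumptions made for this illustration only; the "target language" here is Python text rather than machine code.

```python
# Toy source language (an assumption for this sketch): postfix arithmetic
# over one input variable x, e.g. "x 2 * 3 +" means (x * 2) + 3.

def interpret(source: str, x: float) -> float:
    """Interpreter: executes the source operations directly on the input."""
    stack = []
    for tok in source.split():
        if tok == "x":
            stack.append(x)
        elif tok in ("+", "*"):
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if tok == "+" else a * b)
        else:
            stack.append(float(tok))  # raises ValueError on a bad token
    return stack.pop()


def compile_to_python(source: str) -> str:
    """Compiler: translates the source into an equivalent target program."""
    stack = []
    for tok in source.split():
        if tok in ("+", "*"):
            b, a = stack.pop(), stack.pop()
            stack.append(f"({a} {tok} {b})")
        else:
            if tok != "x":
                float(tok)  # report malformed numbers at translation time
            stack.append(tok)
    return f"lambda x: {stack.pop()}"


source = "x 2 * 3 +"
print(interpret(source, 5))        # runs the source directly
target = compile_to_python(source)
print(target)                      # the translated program
print(eval(target)(5))             # the target program is run separately
```

    The interpreter needs the input x while it walks the source; the compiler produces a target program once, which can then be run on many inputs.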

    The main advantages of compilers

    They produce programs which run quickly.

    They can spot syntax errors while the program is being compiled (i.e. you are informed of any grammatical errors before you try to run the program).

    The main advantages of interpreters

    There is no lengthy "compile time", i.e. you do not have to wait for a program to compile between writing it and running it.

    They tend to be more "portable", which means that they will run on a greater variety of machines.

    In addition to a compiler, several other programs may be required to create an executable target program.

    A source program may be divided into modules stored in separate files.

    The task of collecting the source program is sometimes entrusted to a separate program, called a preprocessor.

    The preprocessor may also expand shorthands, called macros, into source language statements.
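
    A minimal sketch of macro expansion follows. The "#define NAME replacement" directive is borrowed from C as an assumption; the slides do not name a macro syntax, and the function name preprocess is hypothetical. Real preprocessors also handle parameters, nesting and file inclusion, which this sketch ignores.

```python
import re


def preprocess(text: str) -> str:
    """Expand simple object-like macros defined with '#define NAME body'."""
    macros = {}
    output = []
    for line in text.splitlines():
        parts = line.split(maxsplit=2)
        if len(parts) == 3 and parts[0] == "#define":
            macros[parts[1]] = parts[2]   # remember the shorthand
            continue                      # the directive itself is dropped
        for name, body in macros.items():
            # \b restricts the match to whole words, so MAX does not
            # match inside MAXIMUM.
            line = re.sub(rf"\b{re.escape(name)}\b", lambda m: body, line)
        output.append(line)
    return "\n".join(output)


print(preprocess("#define MAX 100\nint a[MAX];\nint MAXIMUM;"))
```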

    The compiler may produce an assembly language program as its output, because assembly language is easier to produce as output and is easier to debug.

    The assembly language is then processed by a program called an assembler that produces relocatable machine code as its output.

    Large programs are often compiled in pieces, so the relocatable machine code may have to be linked together with other relocatable object files and library files into the code that actually runs on the machine.

    The linker resolves external memory addresses, where the code in one file may refer to a location in another file.

    The loader then puts together all of the executable object files into memory for execution.

    [Figure: Source Program -> Translators (Compilers, Interpreters) -> Target Program]

    Lecture 2

    The Structure of a Compiler

    Analysis-Synthesis Model of Translation (Compilation)

    There are two parts of compilation:

    Analysis part

    Synthesis part

    [Figure: Source code -> Front End -> Intermediate Representation -> Back End -> Machine code, with an Errors output]

    The Analysis Part:

    It is often called the front end of the compiler.

    Breaks up the source program into constituent pieces and imposes a grammatical structure on these pieces.

    Creates an intermediate representation of the source program.

    If the analysis part detects that the source program is either syntactically ill formed or semantically unsound, then it must provide informative messages, so the user can take corrective action.

    Collects information about the source program and stores it in a data structure called a symbol table, which is passed along with the intermediate representation to the synthesis part.

    During analysis, the operations implied by the source program are determined and recorded in a hierarchical structure called a tree.
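
    The two data structures named above can be sketched as follows. The class names SymbolTable and Node, the "type" attribute, and the sample statement a = b + c * 2 are all assumptions chosen for illustration; production compilers store far more per symbol (scope, storage location, and so on).

```python
from dataclasses import dataclass, field


@dataclass
class SymbolTable:
    """Maps each name in the source program to what is known about it."""
    entries: dict = field(default_factory=dict)

    def declare(self, name: str, type_: str) -> None:
        if name in self.entries:
            raise KeyError(f"redeclaration of {name}")
        self.entries[name] = {"type": type_}

    def lookup(self, name: str) -> dict:
        return self.entries[name]


@dataclass
class Node:
    """One operation or operand in the tree built during analysis."""
    op: str
    children: list = field(default_factory=list)


# Analysis of the hypothetical statement:  a = b + c * 2
table = SymbolTable()
for name in ("a", "b", "c"):
    table.declare(name, "int")

tree = Node("=", [
    Node("a"),
    Node("+", [Node("b"), Node("*", [Node("c"), Node("2")])]),
])


def show(node: Node, depth: int = 0) -> list:
    """Return the tree as indented lines, one node per line."""
    lines = ["  " * depth + node.op]
    for child in node.children:
        lines.extend(show(child, depth + 1))
    return lines


print("\n".join(show(tree)))
```

    The multiplication sits deeper in the tree than the addition, which records that it is performed first.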

    The Synthesis Part:

    It is often called the back end of the compiler.

    Constructs the desired target program from the intermediate representation and the information in the symbol table.

    Phases of compilation process:

    A compiler operates as a sequence of phases, each of which transforms one representation of the source program to another.

    In practice, several phases may be grouped together, and the intermediate representations between the grouped phases need not be constructed explicitly.

    The symbol table, which stores information about the entire source program, is used by all phases of the compiler.

    Phases of compilation process:

    Lexical Analysis

    Syntax Analysis

    Semantic Analysis

    Intermediate Code Generation

    Machine-Independent Code Optimization

    Code Generation

    Machine-Dependent Code Optimization
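
    To make the idea of chained phases concrete, here is a minimal sketch covering only three of them (lexical analysis, syntax analysis, code generation) for arithmetic expressions. Semantic analysis and both optimization phases are left out. The grammar, the tuple-based tree, and the PUSH/ADD/MUL instructions of a hypothetical stack machine are assumptions for this sketch.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[+*()])")


def lex(source):
    """Lexical analysis: character string -> list of token strings."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        m = TOKEN.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(tokens):
    """Syntax analysis: token list -> tree of nested tuples.

    expr -> term ('+' term)*    term -> factor ('*' factor)*
    factor -> NUMBER | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def factor():
        tok = advance()
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"unexpected token {tok!r}")
        return ("num", int(tok))

    def term():
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


def generate(tree):
    """Code generation: tree -> instructions for a stack machine."""
    if tree[0] == "num":
        return [f"PUSH {tree[1]}"]
    op, left, right = tree
    return generate(left) + generate(right) + ["ADD" if op == "+" else "MUL"]


def compile_expr(source):
    """Each phase consumes the representation produced by the previous one."""
    return generate(parse(lex(source)))


print(compile_expr("1 + 2 * 3"))
```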

    Issues in compiler design

    The compiler deals with many big-picture issues.

    Compiler construction brings together techniques from disparate parts of Computer Science.

    Compilers are engineered objects: software systems built with distinct goals in mind.

    In building a compiler, the compiler writer makes myriad design decisions; each decision has an impact on the resulting compiler.

    ... a well-designed compiler must observe is inviolable.

    Lecture 3

    Programming Language Specifications

    Definition of Syntax

    In computer science, the syntax of a programming language is the set of rules that define the combinations of symbols that are considered to be correctly structured programs in that language.

    The syntax of a language defines its surface form.

    Text-based programming languages are based on sequences of characters.

    Visual programming languages are based on the spatial layout and connections between symbols (which may be textual or graphical).

    Definition of Syntax

    The syntax of a programming language describes the proper form of its programs.

    The syntax of textual programming languages is usually defined using a combination of regular expressions (for lexical structure) and Backus-Naur Form (for grammatical structure) to inductively specify syntactic categories (nonterminals) and terminal symbols.
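
    As a small illustration of this split, the sketch below gives regular expressions for two token classes and a BNF grammar for simple expressions, then separates the grammar's symbols into nonterminals and terminals. The token names, the grammar itself, and the dictionary encoding are assumptions made for this example.

```python
import re

# Lexical structure: one regular expression per token class.
TOKEN_PATTERNS = {
    "NUMBER": r"[0-9]+",
    "IDENT": r"[A-Za-z_][A-Za-z0-9_]*",
}

# Grammatical structure in BNF:
#   <expr> ::= <term> "+" <expr> | <term>
#   <term> ::= NUMBER | IDENT | "(" <expr> ")"
# Encoded as: nonterminal -> list of alternatives, each a list of symbols.
GRAMMAR = {
    "expr": [["term", "+", "expr"], ["term"]],
    "term": [["NUMBER"], ["IDENT"], ["(", "expr", ")"]],
}


def nonterminals(grammar):
    """Syntactic categories: every symbol that has its own production."""
    return set(grammar)


def terminals(grammar):
    """Symbols that appear in right-hand sides but are never defined."""
    symbols = {s for alts in grammar.values() for alt in alts for s in alt}
    return symbols - set(grammar)


print(sorted(nonterminals(GRAMMAR)))
print(sorted(terminals(GRAMMAR)))
print(bool(re.fullmatch(TOKEN_PATTERNS["IDENT"], "rate2")))
```

    The definition of expr refers to expr again, which is the inductive part: longer expressions are built from shorter ones.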

    The syntax of a language describes the form of a valid program, but does not provide any information about the meaning of the program or the results of executing that program.

    The syntax of most programming languages can be specified using a Type-2 grammar, i.e., a context-free grammar.

    Semantics and Pragmatics

    The two stages of analysis, semantics and pragmatics, are concerned with getting at the meaning of a sentence.

    In the first stage (semantics), a partial representation of the meaning is obtained based on the possible syntactic structure(s) of the sentence, and on the meanings of the words in that sentence.

    In the second stage, the meaning is elaborated based on contextual and world knowledge.

    Semantics

    In general, the input to the semantic stage of analysis may be viewed as being a set of possible parses of the sentence, and information about the possible word meanings.

    Lecture 4

    In-depth Study of Syntactic Specifications

    Syntactic

    The syntactic analysis of source code usually entails the transformation of the linear sequence of tokens into a hierarchical syntax tree (abstract syntax trees are one convenient form of syntax tree).
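
    One way to see such a tree is to ask an existing parser for it. The sketch below uses Python's standard ast module on the expression a + b * c; the helper name shape and the choice of Python as the analysed language are assumptions for illustration only.

```python
import ast


def shape(node):
    """Reduce an expression AST to nested tuples: (operator, left, right)."""
    if isinstance(node, ast.BinOp):
        return (type(node.op).__name__, shape(node.left), shape(node.right))
    if isinstance(node, ast.Name):
        return node.id
    raise ValueError(f"unsupported node: {type(node).__name__}")


# The source text is linear, but the resulting tree is hierarchical:
# the multiplication is nested inside the addition.
tree = ast.parse("a + b * c", mode="eval")
print(shape(tree.body))
```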

    Syntax definition

    The syntax of textual programming languages is usually defined using a combination of regular expressions (for lexical structure) and Backus-Naur Form (for grammatical structure) to inductively specify syntactic categories (nonterminals) and terminal symbols.

    Syntax definition

    The syntax of a language describes the form of a valid program, but does not provide any information about the meaning of the program or the results of executing that program.

    The meaning given to a combination of symbols is handled by semantics.

    Not all syntactically correct programs are semantically correct.
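
    The last point can be demonstrated with Python itself as the example language. In this sketch the function name check and the three sample programs are assumptions. Note that Python detects this kind of semantic error only when the statement runs, whereas a compiler for a statically typed language would typically report it during semantic analysis.

```python
import ast


def check(source):
    """Classify a one-line program as a syntax error, a semantic error, or ok."""
    try:
        code = compile(ast.parse(source), "<example>", "exec")
    except SyntaxError:
        return "syntax error"
    try:
        exec(code, {})
    except TypeError:
        return "syntactically correct, semantically incorrect"
    return "ok"


print(check("x = 1 +"))        # ill-formed: the parser rejects it
print(check('x = 1 + "a"'))    # well-formed, but adding int and str has no meaning
print(check("x = 1 + 2"))      # well-formed and meaningful
```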

    Using natural language as an example, it may not be possible to assign a meaning to a grammatically correct sentence, or the sentence may be false:

    "John is a married bachelor." is grammatically well-formed but has no generally accepted meaning.

    No ambiguity is allowed in programming languages, in either form (syntax) or meaning (semantics).

    Distinction between syntax and semantics: many programming languages have features that mean the same (shared semantics) but are expressed differently.

    Identifying which is which helps the learning curve.

    Syntax Specification

    Formalism: set of production rules

    Microsyntax rules: concatenation, alternation (choice among finite alternatives), Kleene closure

    - The set of strings produced by these three rules is a regular set or regular language

    - The rules are specified by regular expressions; they generate the regular language

    - Strings in the regular language are recognized by scanners
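
    The three microsyntax rules and a scanner that recognizes the resulting strings can be sketched as below. The token classes (identifiers and unsigned numbers), the names IDENTIFIER, NUMBER and scan, and the use of Python's re module as the regular-expression engine are assumptions for this example.

```python
import re

# identifier = letter (letter | digit)*
#   concatenation:  a letter followed by the rest
#   alternation:    letter | digit
#   Kleene closure: ( ... )*  meaning zero or more repetitions
IDENTIFIER = re.compile(r"[A-Za-z](?:[A-Za-z]|[0-9])*")

# number = digit digit*
NUMBER = re.compile(r"[0-9][0-9]*")


def scan(text):
    """Scanner: split text into (kind, lexeme) pairs, skipping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        for kind, pattern in (("id", IDENTIFIER), ("num", NUMBER)):
            m = pattern.match(text, pos)
            if m:
                tokens.append((kind, m.group()))
                pos = m.end()
                break
        else:
            # No token class matches: the string is not in the language.
            raise ValueError(f"no token matches at position {pos}")
    return tokens


print(scan("rate 60 x1"))
```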