A teaching compiler overview • X is a programming language • hyper supports edit-compile-go for X • 1986 hyper was in C • used static, extern,longjmp • ran on VAX, alpha, SUN, … • 2003 hyper implemented in 6000 lines of C++ • uses exceptions, pure virtual methods • runs on Intel x86 linux, power pc • Similar to MATLAB JIT


Dec 22, 2015

Page 1: A teaching compiler overview X is a programming language hyper supports edit-compile-go for X 1986 hyper was in C used static, extern,longjmp ran on VAX,

A teaching compiler overview

• X is a programming language

• hyper supports edit-compile-go for X

• 1986 hyper was in C

  • used static, extern, longjmp

  • ran on VAX, alpha, SUN, …

• 2003 hyper implemented in 6000 lines of C++

  • uses exceptions, pure virtual methods

  • runs on Intel x86 linux, power pc

• Similar to MATLAB JIT


My Copyleft

~mckeeman/src/cxx/hyper

COPYRIGHT W. M. McKeeman 1987. You may do anything you like with the file except remove or alter this notice.


I expect the students to learn…

• the mathematics of language description

• scanning, parsing, abstract S/R sequence, semantics

• symbol tables, polish postfix, hardware

• integrated programming environment, editor

• separation of concerns, design, build, test, quality

• dealing with deadlines


I assume you already know…

• Something about compiler implementation

• Grammars

• Scanners

• Recursive Parsers

• Machine Language

• Something about C++


Main Points of Talk

Components of the integrated programming environment are an emacs-style editor, a compiler taking source from a source buffer, quickly compiling it into Intel x86 code and executing it, leaving the results in an output buffer.

C++ class hierarchy is used to represent independent layers of the components and enforce separation of concerns. Pure virtual functions are used to communicate between the layers.

Individual components are in subdirectories together with unit tests. The containing directory contains the single makefile for the project and its components and tests.


A sample X program

` FILE: fact.x
n := inputval+0; res := 1; i := 1;
it
  if i <= n -> res := res*i; i := i + 1
  :: else exit
  fi
ti;
result := res;

inputval only appears on rhs – must be input

result only appears on lhs – must be output
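The inference rule above can be sketched as simple bookkeeping over assignments. This is an illustration only, not hyper's code: the names Assign, Role and classify are hypothetical, and a real compiler would collect the reads and writes while parsing.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of one assignment: the name written and the names read.
struct Assign {
    std::string lhs;
    std::vector<std::string> rhs;
};

enum class Role { Input, Output, Local };

// Classify each variable by where it appears:
// read but never written -> Input, written but never read -> Output.
std::map<std::string, Role> classify(const std::vector<Assign>& program) {
    std::map<std::string, bool> written, read;
    for (const Assign& a : program) {
        assert(!a.lhs.empty());               // every assignment has a target
        written[a.lhs] = true;
        for (const std::string& v : a.rhs) read[v] = true;
    }
    std::map<std::string, Role> roles;
    for (const auto& kv : written)
        roles[kv.first] = read.count(kv.first) ? Role::Local : Role::Output;
    for (const auto& kv : read)
        if (!written.count(kv.first)) roles[kv.first] = Role::Input;
    return roles;
}
```

Fed the assignments of fact.x, this sketch would mark inputval as input and result as output; res and i are both read and written, so they stay local.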


X Summary

Scalar assignments and expressions

Strong type inferred from use (int & logical)

Input/output inferred from use

if-fi and it-ti control flow

be-eb nested scopes

Subroutines defined but not implemented


demo

Fast


Typical Student Projects

• Replace 32 bit int with 32 bit float (too easy)

• Implement 64 bit double

• Implement arrays

• Implement sets (perhaps infinite)

• Implement BIGNUM rationals

• Implement subroutines

• Implement C-style declarations (too dull)

• Implement conventional I/O (retro)


Object Structure

Hyper:Edit Display:Lines Line

Compile:Gen:Lang:Parse

Sym Frame Symbol

Mem

Asm

Scan


Assembler

The Intel x86 assembler implements an open-ended set of methods. Examples of calls to Asm object x86:

x86->addRR(int, int) – add register-register

x86->daddp(void) – double add, pop

x86->dldM(double*) – double load from double memory

x86->dildM(int*) – double load from int memory

x86->callRi(int) – call indirect through register

x86->pushA(void) – push all 8 registers

x86->svcCC(int,int) – supervisor call with constant args


Asm Internals

The bit-layout is done with macros that implement the Intel documentation. Ugly stuff.

The data member code is an allocated array long enough to hold the assembled instructions (x86 hardware format).

The member function go(), called from the hyper environment, jumps to the code as a void subroutine. On some platforms the icache and dcache have to be tweaked.

Branches are relative. Forward fixups are inserted when the destination becomes known. Big/little endian problems are handled internally by Asm.

Destructing an Asm object also frees the code.
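The relative branches and forward fixups described above can be sketched as follows. This is a hypothetical miniature, not hyper's Asm class: CodeBuf, jmpForward and fixup are invented names, a std::vector stands in for the allocated code array, and making the buffer executable is left out. The one hardware fact used is that x86 jmp rel32 is opcode 0xE9 followed by a little-endian offset measured from the end of the instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical miniature of an assembler buffer with forward fixups.
class CodeBuf {
public:
    std::size_t here() const { return code.size(); }
    void emit8(std::uint8_t b) { code.push_back(b); }
    // Store 32 bits low byte first, as x86 expects, whatever the host byte order.
    void emit32(std::int32_t v) {
        std::uint32_t u = static_cast<std::uint32_t>(v);
        for (int i = 0; i < 4; ++i) emit8(static_cast<std::uint8_t>(u >> (8 * i)));
    }
    // Emit "jmp rel32" with a zero placeholder; return where the offset lives.
    std::size_t jmpForward() {
        emit8(0xE9);
        std::size_t hole = here();
        emit32(0);
        return hole;
    }
    // The destination is now known (the current position): patch the placeholder.
    void fixup(std::size_t hole) {
        assert(hole + 4 <= code.size());
        std::uint32_t rel = static_cast<std::uint32_t>(here() - (hole + 4));
        for (int i = 0; i < 4; ++i)
            code[hole + i] = static_cast<std::uint8_t>(rel >> (8 * i));
    }
    const std::vector<std::uint8_t>& bytes() const { return code; }
private:
    std::vector<std::uint8_t> code;   // freed automatically on destruction
};
```

Because the offset is written byte by byte, the same code produces the x86 layout on a big-endian host, which is one way to keep endian problems inside the assembler.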


A Sample Asm Routine

void Asm::addRC(int r, int c) { // r += c
  if (r == EAX) {
    assmop(0x05);
  } else {
    assmop(0x81);
    assm8(MOD_REG(r,0));
  }
  assm(c);
}


Disassembler Internals

dis produces a printout of the assembly code.

It is a 256-way switch, each case of which is potentially another switch. In essence, dis is an Intel x86 interpreter that prints instead of doing.

The mnemonics are more closely related to asm than to Intel. The disassembler interprets only what the assembler makes. Otherwise it dumps the hexadecimal.

dis output is passed back via a callback. Hyper places it in the output buffer.
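A two-opcode sketch of the switch idea, assuming only the two encodings that addRC emits (0x05 imm32, and 0x81 with a register ModRM byte followed by imm32). The function name dis is taken from the slide, but its signature, the printed mnemonic and the string return value are assumptions; the real dis reports through a callback.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Read a little-endian 32-bit immediate starting at p[i].
static std::int32_t imm32(const std::vector<std::uint8_t>& p, std::size_t i) {
    assert(i + 4 <= p.size());
    std::uint32_t u = 0;
    for (int k = 0; k < 4; ++k) u |= static_cast<std::uint32_t>(p[i + k]) << (8 * k);
    return static_cast<std::int32_t>(u);
}

// Hypothetical disassembler: prints what it recognizes, dumps the rest as hex.
std::string dis(const std::vector<std::uint8_t>& p) {
    static const char* reg[8] = {"eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"};
    std::string out;
    char buf[64];
    std::size_t i = 0;
    while (i < p.size()) {
        switch (p[i]) {
        case 0x05:                        // add eax, imm32
            if (i + 5 <= p.size()) {
                std::snprintf(buf, sizeof buf, "addRC eax,%d\n",
                              static_cast<int>(imm32(p, i + 1)));
                out += buf; i += 5;
                continue;                 // next instruction
            }
            break;
        case 0x81:                        // mod=11, reg field 0: add reg, imm32
            if (i + 6 <= p.size() && (p[i + 1] & 0xF8) == 0xC0) {
                std::snprintf(buf, sizeof buf, "addRC %s,%d\n",
                              reg[p[i + 1] & 7], static_cast<int>(imm32(p, i + 2)));
                out += buf; i += 6;
                continue;
            }
            break;
        default:
            break;
        }
        std::snprintf(buf, sizeof buf, "%02x\n", static_cast<unsigned>(p[i]));
        out += buf; ++i;                  // unknown byte: dump the hexadecimal
    }
    return out;
}
```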


Memory Manager

• MATLAB allocates run-frames in allocated blocks of store.

• Hyper does the same (but I might change it).

• The trick is getting constant addresses from an object containing an arbitrary number of memory locations. Solution: expandable array of fixed-sized blocks.
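The expandable array of fixed-sized blocks can be sketched like this. The class below is a guess at the shape, not hyper's Mem: the block size, the alloc and size names, and the use of std::vector are all assumptions. The point it shows is that the outer array may grow and move while the blocks, and therefore the addresses handed out, stay put.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical memory manager: stable addresses from a growing store.
class Mem {
public:
    // Hand out the address of a fresh zeroed int cell; earlier addresses stay valid.
    int* alloc() {
        assert(used <= BLOCK);
        if (used == BLOCK) {                          // current block full, or none yet
            blocks.push_back(std::unique_ptr<int[]>(new int[BLOCK]()));
            used = 0;
        }
        return &blocks.back()[used++];
    }
    std::size_t size() const {
        return blocks.empty() ? 0 : (blocks.size() - 1) * BLOCK + used;
    }
private:
    enum : std::size_t { BLOCK = 4 };                 // tiny, to exercise growth
    std::vector<std::unique_ptr<int[]>> blocks;       // the expandable array
    std::size_t used = BLOCK;                         // forces a block on first alloc
};
```

Generated code can then embed the returned pointers as constants, since nothing ever relocates a cell.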


Symbol Table

The symbol table is a stack of frames, each of which is a stack of symbols. The classes Sym, Frame, Symbol represent the concepts above. There is an expandable array of frames in the symbol table class, and an expandable array of symbols in each frame. As frames are closed (exit scope) they are placed aside for later access via the symbol table dumper. Upon destruction the expandable arrays and the objects in them must be freed.

In X type is inferred from use; there are no declarations.
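A stack of frames, each a stack of symbols, might look like the sketch below. The class names Sym, Frame and Symbol come from the slide; every member shown here (enterScope, add, lookup, closedFrames) is an assumption except exitScope, which appears later in the talk. std::vector stands in for the hand-built expandable arrays, so destruction needs no explicit freeing.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Symbol { std::string name; int type; };
struct Frame  { std::vector<Symbol> symbols; };

class Sym {
public:
    Sym() { enterScope(); }                           // the global frame
    void enterScope() { open.push_back(Frame()); }
    void exitScope() {
        assert(!open.empty());
        closed.push_back(open.back());                // set aside for a later dump
        open.pop_back();
    }
    void add(const std::string& name, int type) {
        assert(!open.empty());
        open.back().symbols.push_back(Symbol{name, type});
    }
    // Search the innermost frame first, then outward to the global frame.
    // The returned pointer is only valid until the table is next modified.
    const Symbol* lookup(const std::string& name) const {
        for (std::size_t f = open.size(); f-- > 0; )
            for (std::size_t s = open[f].symbols.size(); s-- > 0; )
                if (open[f].symbols[s].name == name) return &open[f].symbols[s];
        return nullptr;
    }
    std::size_t closedFrames() const { return closed.size(); }
private:
    std::vector<Frame> open, closed;
};
```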


Symbol Table Objects

[Figure: sym holds a stack of frames (global, level 1, level 2); each frame holds a stack of symbols; the lookup direction is marked.]


Class Symbol

class Symbol {
  Symbol(char*, int);      // ptr to src, len
  ~Symbol(void);           // dtor
  char *getName(void);
  void setType(int);
  int getType(void);
  void setAddress(int *);
  int *getAddress(void);
  …etc.


Class Scan

A scan object accepts a sequence of pointers to text fragments (actually lines) and computes a sequence of tokens, each consisting of

• a token code

• a pointer to the start (in the source)

• a length

All scanner table lookups use a perfect hash.

The scan object provides navigation of the token sequence, making parser lookahead straightforward. The tokens are good only so long as the source text is unchanged and not moved.


A case in scan

case '0': case '1': case '2': case '3': case '4':
case '5': case '6': case '7': case '8': case '9':
  while (++end<lim && isdigit(*end)) ; // D+
  report(numLEX, begin, end);
  break;

Switch on raw character. Manipulate pointers only.
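A self-contained sketch of a scanner that only manipulates pointers and reports (code, start, length) triples. numLEX is the slide's name; idLEX, otherLEX, the Token layout and the scan function are assumptions, and the perfect-hash keyword lookup is omitted.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <vector>

enum Lex { numLEX, idLEX, otherLEX };      // idLEX and otherLEX are hypothetical

// A token never copies text: it points into the source and records a length.
struct Token { Lex code; const char* start; std::size_t len; };

std::vector<Token> scan(const char* begin, const char* lim) {
    assert(begin <= lim);
    std::vector<Token> toks;
    while (begin < lim) {
        unsigned char c = static_cast<unsigned char>(*begin);
        if (std::isspace(c)) { ++begin; continue; }
        const char* end = begin;
        Lex code;
        if (std::isdigit(c)) {             // D+
            while (++end < lim && std::isdigit(static_cast<unsigned char>(*end))) ;
            code = numLEX;
        } else if (std::isalpha(c)) {      // letter (letter | digit)*
            while (++end < lim && std::isalnum(static_cast<unsigned char>(*end))) ;
            code = idLEX;
        } else {                           // any other single character
            ++end;
            code = otherLEX;
        }
        toks.push_back(Token{code, begin, static_cast<std::size_t>(end - begin)});
        begin = end;
    }
    return toks;
}
```

Since tokens hold raw pointers, they are good only while the source text stays unchanged and unmoved, which matches the caveat on the Class Scan slide.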


One Pass Compiling

Parsing is recursive; the basic token handling, shift/reduce output and diagnostic facilities are in Parse. Parse knows nothing of X.

The language specific recursive routines are in Lang:Parse. Here is where shift and reduce are called. Nothing is known of the semantics of X.

The class Gen:Lang:Parse implements shift and reduce, calling in turn a sequence of pure virtual abstract methods like endIffi(). Nothing is known of the target platform.

Finally, Compile:Gen:Lang:Parse, having a symbol table, memory manager and assembler available, implements the concrete semantics of X.
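The four layers can be imitated in miniature. The class names follow the slide, but everything else is a stand-in: the grammar is just sums of numbers, tokens are strings, the rule codes SUM1 and SUM2 and the methods genLoad and genAdd are hypothetical, and Compile evaluates instead of emitting x86 code. What the sketch is meant to show is that each layer talks to the one above it only through pure virtual functions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class Parse {                                   // token handling; knows nothing of the language
public:
    explicit Parse(std::vector<std::string> t) : toks(std::move(t)) {}
    virtual ~Parse() {}
    virtual void parse() = 0;
    virtual void shift(const std::string& tok) = 0;
    virtual void reduce(int rule) = 0;
protected:
    bool more() const { return pos < toks.size(); }
    const std::string& tok() const { assert(more()); return toks[pos]; }
    void step() { ++pos; }
private:
    std::vector<std::string> toks;
    std::size_t pos = 0;
};

enum Rule { SUM1, SUM2 };                       // hypothetical rule codes

class Lang : public Parse {                     // grammar only; no semantics
public:
    explicit Lang(std::vector<std::string> t) : Parse(std::move(t)) {}
    void parse() override {                     // sum: number ('+' number)*
        shift(tok()); step(); reduce(SUM1);
        while (more() && tok() == "+") {
            step();                             // discard '+'
            shift(tok()); step(); reduce(SUM2);
        }
    }
};

class Gen : public Lang {                       // shift/reduce; no target platform
public:
    explicit Gen(std::vector<std::string> t) : Lang(std::move(t)) {}
    void shift(const std::string& t) override { stack.push_back(t); }
    void reduce(int rule) override {
        assert(!stack.empty());
        std::string operand = stack.back();
        stack.pop_back();
        if (rule == SUM1) genLoad(operand); else genAdd(operand);
    }
    virtual void genLoad(const std::string& operand) = 0;
    virtual void genAdd(const std::string& operand) = 0;
private:
    std::vector<std::string> stack;
};

class Compile : public Gen {                    // concrete actions
public:
    explicit Compile(std::vector<std::string> t) : Gen(std::move(t)) {}
    int value = 0;
    void genLoad(const std::string& s) override { value = std::stoi(s); }
    void genAdd(const std::string& s) override { value += std::stoi(s); }
};
```

Swapping the body of Lang::parse for a table-driven loop would leave Gen and Compile untouched, which is the separation the slide claims.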


Parse Class

class Parse {
  Parse(Scan*);
  ~Parse(void);
  virtual void parse(void) = 0;
  virtual void shift(Token*) = 0;
  virtual void reduce(int) = 0;
  virtual void hint(int) = 0;

  void start(void);
  void step(int);
  void accept(int,char*);
  void reject(char*);
  etc.
}


X.cfg

program
  statements
statements
  statement
  statements ; statement
statement
  exit
  block
  selection
  iteration
  assignment
block
  be identifiers . statements eb
etc.


A recursive routine

void Lang::conjunction(void) { // a /\ b
  complement();
  reduce(CONJUNCTION1);
  while (tok == divslashLEX) {
    step(); // discard /\…
    complement();
    reduce(CONJUNCTION2);
  }
}


One could use LALR(1)

The Lang class could be implemented with a LALR(1) machine (YACC-like tables) and the rest of hyper would never know…

void Lang::lalr(void) {
  while (lhs != Program) {
    if (isShift(state,tok)) {
      shift(tok);
      step();
    } else {
      lhs = apply(state, tok);
      reduce(lhs);
    }
  }
}


shift/reduce

void Gen::shift(Token *tok) { // stack token
  tokStack[tokPtr++] = tok;
}

void Gen::reduce(int rule) { // obey rule
  switch (rule) {
  case PROGRAM1:
    getRet(); // return to hyper
    break;
  case STATEMENTS1: // one stmt
    break;
  case STATEMENTS2: // more stmts
    break;
  etc.
  }
}


Compile Class

Implements pure virtuals called from Gen.

Creates a symbol table, assembler and memory manager.

Turns abstract actions into concrete actions.


Layers of the one pass compiler

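[Figure: the layers Compile, Gen, Lang, Parse along an axis labeled abstract / concrete. Pure virtuals shown between the layers: parse(), shift(), reduce(), hint(), beginIffi(), endIffi(), genInfixop(). Other labels: Sym, Mem, Asm, go(); Scan; recursive routines or LALR(1).]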


Using an AST

[Figure: layers Compile, Ast, Lang, Parse. Labels: parse(), shift(), reduce(), hint(); Sym, Mem, Asm, go(); Scan.]


Some Emitters

void Emit::genRet(void) {        // end of program
  a->epilog();                   // x86 return
  s->exitScope();                // close global frame
}

void Emit::genExit(void) {       // loop exit stmt
  if (itPtr > 0) {               // inside loop
    a->movRC(itVar[itPtr-1], 1);
  } else {
    failure("exit outside loop");
  }
}


Edit Class

A Line contains a fragment of text.

Lines contains an array of Line.

A Display is Lines with a visual presentation.

An Edit contains an array of Display (buffers).

Edit has one pure virtual function named callback used to pass uninterpretable keystrokes on to its user.


Hyper Class

Hyper is an Edit with the additional capability of compile and go. The editor knows nothing of this, merely passing the keystrokes ^xe to hyper via the pure virtual callback.

Hyper maintains 3 buffers:

• Source text

• Run shell

• Help

Results, i/o, dumps and diagnostics are placed in the run shell. The user initially sees the help display.
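The relationship between Edit and Hyper can be sketched with one pure virtual. The class names and the callback name come from the slides; the key method, the string-typed keystrokes and the runs counter are assumptions made so the sketch stays small.

```cpp
#include <cassert>
#include <string>

class Edit {
public:
    virtual ~Edit() {}
    // Feed one keystroke, written here as a short string such as "a" or "^xe".
    void key(const std::string& k) {
        assert(!k.empty());
        if (k.size() == 1) text += k;       // ordinary character: insert it
        else callback(k);                   // not understood here: pass it on
    }
    const std::string& buffer() const { return text; }
protected:
    virtual void callback(const std::string& k) = 0;
private:
    std::string text;
};

class Hyper : public Edit {
public:
    int runs = 0;                           // counts compile-and-go requests
protected:
    void callback(const std::string& k) override {
        if (k == "^xe") ++runs;             // stand-in for compile and go
    }
};
```

The editor code never mentions compiling; only the derived class gives ^xe a meaning.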


Directory structure

/hyper
  /1edit
  /2scan
  /3parse
  /4sym
  /5mem
  /6asm
  /7gen


Makefile

CC=g++ -g
MAKE=gmake

EDITDIR = 1edit
SCANDIR = 2scan
PARSEDIR= 3parse
SYMDIR = 4sym
MEMDIR = 5mem
ASMDIR = 6asm
GENDIR = 7gen

include $(EDITDIR)/edit.mk
include $(SCANDIR)/scan.mk
include $(PARSEDIR)/parse.mk
include $(SYMDIR)/sym.mk
include $(MEMDIR)/mem.mk
include $(ASMDIR)/asm.mk
include $(GENDIR)/gen.mk

OBJ=$(EDITOBJ) $(COMPILER)

test: hyper
	hyper smoke.x

unittests:
	$(MAKE) linetest scantest symtest memtest asmtest parsetest gentest

hyper: hyper.h hyper.cxx $(OBJ)
	$(CC) -o hyper hyper.cxx $(OBJ)

clean:
	rm -f *.o bin/*.o *.out *~ */*~ *driver hyper edit


Line Counts

• hyper.cxx 314
• 1edit/ 722
  • edit.cxx 336
  • display.cxx 137
  • lines.cxx 128
  • line.cxx 154
  • terminal.cxx 84
• 2scan/scan.cxx 314
• 3parse/ 454
  • parse.cxx 61
  • x.cxx 393
• 4sym/ 282
  • symbols.cxx 126
  • frame.cxx 60
  • symbol.cxx 96
• 5mem/int32.cxx 47
• 6asm/ 1595
  • x86dis.cxx 887
  • x86asm.cxx 708
• 7gen/ 907
  • gen.cxx 225
  • xcc.cxx 682
• dot-cxx files 4600+
• dot-h files 1300+
• test drivers 1300+