Parrot in Detail

Dan Sugalski
dan@sidhe.org

What is Parrot

• The interpreter for Perl 6
• A multi-language virtual machine
• An April Fools joke gotten out of hand

VMs in a nutshell

• Platform independence
• Impedance matching
• High-level base platform
• Good target for one or more classes of languages

Platform independence

• Allows emulation of missing platform features
• Allows unified view of common but differently implemented features
• Isolation of platform-specific underlying code

Impedance matching

• Can be halfway between the hardware and the software
• Provides another layer of abstraction
• Allows a smoother connection between language and CPU

High-level base platform

• Provide single point of abstraction for many useful things, like:
  • Async I/O
  • Threads
  • Events
  • Objects

Good HLL target

• Provide features that map well to a class of language constructs
• Reduce the “thought” load for compiler writers
• Allow tasks to be better partitioned and placed

Parts and Pieces

The chunks of Parrot

Parts and pieces

• Parser
• Compiler
• Optimizer
• Interpreter

Parser

• Turns source into an AST
• $a = $b + $c becomes

        =
       / \
     $a   +
         / \
       $b   $c
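
To make the shape of the AST for `$a = $b + $c` concrete, here is a minimal Python sketch. The `Node` class and the `to_prefix` helper are hypothetical names invented for this illustration; they are not Parrot's parser API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical AST node: an operator or variable name plus children."""
    value: str
    children: list = field(default_factory=list)


def to_prefix(node):
    """Render the tree in prefix form so its nesting is visible."""
    if not node.children:
        return node.value
    args = " ".join(to_prefix(child) for child in node.children)
    return f"({node.value} {args})"


# $a = $b + $c: assignment at the root, the addition as its right child.
tree = Node("=", [Node("$a"), Node("+", [Node("$b"), Node("$c")])])
print(to_prefix(tree))  # (= $a (+ $b $c))
```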

Overriding the Parser

• New tokens can be added
• Existing tokens can have their meaning changed
• Entire languages can be swapped in or out
• All done with Perl 6 grammars

Grammars

• Perl 6 grammars are immensely powerful
• Combination of Perl 5 regexes, lex, and yacc
• Grammars are object-oriented and overridable at runtime

Useful side-effects

• Modifying existing grammars lexically is straightforward
• If you have a Perl 6 rule set for a language, you’re halfway to getting it up on Parrot
• Conveniently, we have a converter from yacc and BNF grammars to Perl regexes

Compiler

• Turns AST into bytecode
• Like the parser, is overridable
• Essentially a fancy regex engine, with some extras
• No optimizations done here

Mostly standard

• The compiler’s mostly standard
• Compiling to a machine, even a virtual one, is a well-known thing
• There’s not much in the way of surprise here
• Still a fair amount of work

Optimizer

• Takes AST and bytecode, and produces better bytecode
• Optimization levels will depend on how perl’s invoked
• Works best with less uncertainty

Optimizing is difficult

• Lots of uncertainty at compile time
• Active data (tied/overloaded) kills optimization
• Late code loading and creation causes headaches

Optimizing is difficult

When can

    $x = 0;
    foreach (1..10000) {
        $x++;
    }

become

    $x = 10000;
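
One way to see why that rewrite is not always safe: if the variable is "active" (tied or overloaded), each increment may have an observable side effect. The following is a hedged Python analogue, not Perl and not Parrot code; the `Noisy` class is a made-up stand-in for a tied scalar, and `folded` uses `+= 10000` on a zero-valued start as the equivalent of the single assignment.

```python
class Noisy:
    """Stand-in for a tied/overloaded scalar: incrementing has a side effect."""

    def __init__(self):
        self.value = 0
        self.calls = 0

    def __iadd__(self, n):
        self.calls += 1  # observable side effect on every increment
        self.value += n
        return self


def looped(x):
    for _ in range(10000):
        x += 1
    return x


def folded(x):
    x += 10000  # the "optimized" form: one update instead of 10000
    return x


a, b = looped(Noisy()), folded(Noisy())
print(a.value == b.value, a.calls, b.calls)  # True 10000 1
```

With plain integers the two functions are indistinguishable, so the rewrite is only valid when the compiler can prove the variable is not active data.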

Optimizing ironies

• Perl may well end up one of the least-optimized languages running on Parrot
• Perl 6 has some constructs to help with that
• The more you tell the compiler, the faster your code will run

Interpreter

• Bytecode comes in, and something magic happens
• End destination for bytecode
• May not actually execute the bytecode, but generally will

Interpreter

• As final destination, may do other things
  • Save to disk
  • Transform to an alternate form (like C code)
  • JIT
  • Mock in iambic pentameter

Interpreter Design Details

In which we tell you far more than anyone sane wants to know about the insides of our interpreter.
(And this is the short form)

Parrot in buzzwords

• Bytecode driven
• Runtime extensible
• Register based
• Language neutral
• Garbage collected
• Event capable
• Object oriented
• Introspective
• Interpreter
• With continuations

Bytecode engine

• Well, almost. 32 bit words
• Precompiled form of your program
• Generally loaded from disk (though not always)
• Generally needs no transformation to run

Opcode functions

• Opcode function table is lexically scoped
• Functions return the next opcode to run
• Most opcodes can throw an exception
• Opcode libraries can be loaded in on demand
• Most opcodes overridable
• Bytecode loader is overridable
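
The "functions return the next opcode to run" idea can be sketched as a small dispatch loop. This is an illustrative Python toy under stated assumptions: the opcode numbering, the three op functions, and the use of `None` as a stop marker are all invented for this sketch and do not reflect Parrot's real opcode set or bytecode layout.

```python
def op_end(regs, code, pc):
    return None  # no next opcode: the run loop stops


def op_set(regs, code, pc):
    regs[code[pc + 1]] = code[pc + 2]  # set Ix, constant
    return pc + 3


def op_add(regs, code, pc):
    regs[code[pc + 1]] = regs[code[pc + 2]] + regs[code[pc + 3]]
    return pc + 4


OP_TABLE = [op_end, op_set, op_add]  # hypothetical opcode numbering


def run(code):
    regs = [0] * 32  # one set of 32 integer registers
    pc = 0
    while pc is not None:
        # Each op function does its work and returns where to go next.
        pc = OP_TABLE[code[pc]](regs, code, pc)
    return regs


# set I0, 2; set I1, 3; add I2, I0, I1; end
regs = run([1, 0, 2, 1, 1, 3, 2, 2, 0, 1, 0])
print(regs[2])  # 5
```

Because the loop only ever indexes into `OP_TABLE`, swapping or extending that table is all it takes to override or add opcodes in this toy.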

Fun tricks with dynamic opcodes

• Load in rarely needed functions only when we have to
• Allow piecemeal upgrading of a Parrot install
• We can be someone else cheaply
  • JVM
  • .NET
  • Z machine
  • Python
  • Perl 5
  • Ruby

Registers

• 4 sets of 32: Integer, String, Float, PMC
• Fast set of temporary locations
• All opcodes operate on registers

Stacks

• Six stacks
  • One per set of registers
  • One generic stack
  • One call stack
• Stacks are segmented, and have no size limit
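
A segmented stack grows by chaining fixed-size chunks instead of reallocating one large block. The sketch below is a guess at the general technique in Python, with a deliberately tiny chunk size; `SegmentedStack` and its fields are hypothetical and say nothing about how Parrot lays out its stack segments.

```python
class SegmentedStack:
    """Stack built from fixed-size chunks; grows by adding a chunk."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = [[]]

    def push(self, item):
        if len(self.chunks[-1]) == self.chunk_size:
            self.chunks.append([])  # current segment is full: add another
        self.chunks[-1].append(item)

    def pop(self):
        if not self.chunks[-1] and len(self.chunks) > 1:
            self.chunks.pop()  # drop the empty top segment
        return self.chunks[-1].pop()  # IndexError if the stack is empty


stack = SegmentedStack()
for i in range(10):
    stack.push(i)
print(len(stack.chunks), stack.pop())  # 3 9
```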

Strings

• Buffer pointer
• Buffer length
• Flags
• Buffer used
• String start
• String length
• Encoding
• Character set

Strings

• Strings are encoding-neutral
• Strings are character set neutral
• Engine knows how to ask character sets to convert between themselves
• Unicode is our fallback (and sometimes pivot) set

PMCs

• vtable
• Property hash
• flags
• Data pointer
• Cache data
• Synchronization
• GC data

PMCs

• Parrot’s equivalent of Perl 5’s variables
• Tying, overloading, and magic all rolled together
• Arrays, hashes, and scalars all rolled into one

PMCs are more than they seem

• Lots of behaviour’s delegated to PMCs
• PMC structures are generally opaque to the VM
• Lots of the power and modularity of Parrot comes from PMCs
• Engine doesn’t distinguish between scalar, hash, and array variables at this level
• Done with the magic of vtables and multimethod dispatch

Vtables

• Table of pointers to required functions
• Allows each variable to have a custom set of functions to do required things
• Removes a lot of uncertainty from the various functions, which speeds things up
• Allows very customized behaviour
• All load, store, and informational functions here
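
A vtable can be pictured as a per-type table of functions that the engine calls through without looking at the data behind it. The Python sketch below is only an analogy: the `PMC` class, the dict-based tables, and the slot names `get_integer` and `get_string` are chosen for illustration and should not be read as Parrot's actual vtable layout.

```python
class PMC:
    """Hypothetical PMC: a vtable (dict of functions) plus opaque data."""

    def __init__(self, vtable, data):
        self.vtable = vtable
        self.data = data


# Each vtable maps a slot name to the function implementing it for one type.
INT_VTABLE = {
    "get_integer": lambda pmc: pmc.data,
    "get_string": lambda pmc: str(pmc.data),
}
STR_VTABLE = {
    "get_integer": lambda pmc: int(pmc.data),
    "get_string": lambda pmc: pmc.data,
}


def get_integer(pmc):
    # The engine never inspects pmc.data; it just calls through the table.
    return pmc.vtable["get_integer"](pmc)


total = get_integer(PMC(INT_VTABLE, 42)) + get_integer(PMC(STR_VTABLE, "8"))
print(total)  # 50
```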

Vtables bring multimethods

• Core engine has support for dispatch based on argument types
• Necessary for proper dispatch of many vtable functions
• Support extends to language level

Multimethod dispatch core

• All binary PMC operations use MMD
• Relatively recent change
• Makes life simpler and vtables smaller
• Things are faster, too, since everything of interest did MMD anyway
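
Multimethod dispatch picks an implementation from the types of both operands rather than only the left one. A minimal Python sketch of that idea follows; the `MMD_TABLE` dict, `register`, and `dispatch` are hypothetical helpers, and the lookup matches exact types only, with no inheritance-aware distance calculation.

```python
MMD_TABLE = {}


def register(op, left_type, right_type, func):
    MMD_TABLE[(op, left_type, right_type)] = func


def dispatch(op, left, right):
    # Look up the implementation using the types of *both* operands.
    func = MMD_TABLE.get((op, type(left), type(right)))
    if func is None:
        raise TypeError(
            f"no {op} for {type(left).__name__}, {type(right).__name__}"
        )
    return func(left, right)


register("add", int, int, lambda a, b: a + b)
register("add", int, str, lambda a, b: a + int(b))
register("add", str, int, lambda a, b: a + str(b))

print(dispatch("add", 1, "2"), dispatch("add", "1", 2))  # 3 12
```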

Aggregate PMCs

• All PMCs can potentially be treated as aggregates
• All vtable entries have a _keyed variant
• Up to vtable to decide what’s done if an invalid key is passed

Keys for Aggregates

• List of
  • key type
  • key value
  • use key as x
• Also plain one-dimensional integer index
• Keys are inherently multidimensional
• Aggregate PMCs may consume multiple keys
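
To illustrate an aggregate consuming more than one key from a key list, here is a small Python sketch. It simplifies keys to plain integers (dropping the key type and "use key as" fields), and the `Grid` class with its `get_keyed` and `set_keyed` methods is a hypothetical example, not a real Parrot PMC.

```python
class Grid:
    """Hypothetical 2-D aggregate that consumes two integer keys per access."""

    def __init__(self, rows, cols):
        self.cols = cols
        self.cells = [0] * (rows * cols)

    def get_keyed(self, keys):
        row, col = keys[0], keys[1]  # this aggregate eats two keys
        value = self.cells[row * self.cols + col]
        rest = keys[2:]
        # Any keys left over are passed on to the element we found.
        return value.get_keyed(rest) if rest else value

    def set_keyed(self, keys, value):
        row, col = keys
        self.cells[row * self.cols + col] = value


inner = Grid(2, 2)
inner.set_keyed((1, 0), "deep")
outer = Grid(3, 3)
outer.set_keyed((2, 1), inner)
print(outer.get_keyed((2, 1, 1, 0)))  # deep
```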

Advantages of keys

• Multidimensional aggregates
• No-overhead tied hashes and arrays
• Allows potentially interesting tied behavior

PMCs even hide aggregation

• @foo = @bar * @baz

Turns into

    mul foo, bar, baz

Objects

• Parrot has a low-level view of objects
• Things with methods and an array of attributes
• Both of which are ultimately delegated to the object
• Except when we cheat, of course

Objects

• Objects may or may not be accessed by reference
• The provided object type is class-based
• Handles mixed-type inheritance with encapsulation and delegation
• Base system handles mixed-type dispatch properly

Exceptions

• An exception handler may be put in place at any time
• Exception handlers remember their state (they capture a continuation)
• Handlers may decline any exception
• Exceptions propagate outward
• Exception handlers may target specific classes of exceptions
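
The decline-and-propagate behaviour can be approximated in Python as a list of handlers tried from innermost to outermost. This sketch does not model continuation capture, and every name in it (`ParrotError`, `IOFailure`, `run_with_handlers`, the two handlers) is made up for the example.

```python
class ParrotError(Exception):
    """Hypothetical base class for this sketch."""


class IOFailure(ParrotError):
    pass


def run_with_handlers(action, handlers):
    """handlers is ordered innermost first; each returns True if it handled."""
    try:
        return action()
    except ParrotError as exc:
        for handler in handlers:
            if handler(exc):  # a handler may decline by returning False
                return "handled"
        raise  # nobody took it: keep propagating outward


def declines(exc):
    return False  # declines every exception in this sketch


def io_handler(exc):
    return isinstance(exc, IOFailure)  # targets one class of exception


def fail():
    raise IOFailure("disk on fire")


print(run_with_handlers(fail, [declines, io_handler]))  # handled
```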

Exceptions are:

• Typed
  • Information
  • Warning
  • Severe
  • Fatal
  • We’re Doomed
• Classed
  • IO
  • Math
• Languaged
  • Perl
  • Ruby

Throwing an Exception

• Any opfunc that returns 0 triggers an exception
• The throw opcode also throws an exception
• The exception itself is stored in the interpreter
• Exceptions and exception handlers are cheap, but not free

Realities of memory management

• Memory and structure allocation is a huge pain
• Terribly error prone
• We have full knowledge of what’s used if we choose to use it

Arena allocation of core structures

• All PMCs and Strings are allocated from arenas
• Makes allocation faster and more memory efficient
• Allows us to trace all the core structures as we need for GC and DOD

Pool allocation of memory

• All ‘random’ chunks of memory are allocated from memory pools
• Allocation is extremely fast, typically five or six machine instructions
• Free memory is handled by the garbage collector
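
Allocation that costs only a handful of instructions usually means a bump allocator: compare against the end of the pool, advance a pointer, return the old value. The Python sketch below shows that general technique; the `Pool` class is hypothetical, and the assumption that Parrot's pools work exactly this way is mine, not the slides'.

```python
class Pool:
    """Bump allocator: hand out the next free offset and advance a pointer."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.top = 0

    def allocate(self, nbytes):
        if self.top + nbytes > len(self.memory):
            raise MemoryError("pool exhausted; a collector would run here")
        offset = self.top
        self.top += nbytes  # the whole fast path: compare, add, return
        return offset


pool = Pool(64)
a = pool.allocate(16)
b = pool.allocate(8)
print(a, b, pool.top)  # 0 16 24
```

There is no `free` here on purpose: reclaiming space is left to a compacting collector, which matches the last bullet above.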

Garbage Collection

• Parrot has a tracing, compacting garbage collector
• No reference counting
• Live objects are found by tracing the root set

Garbage Collection

• All memory must be pointed to by a Buffer struct (a subset of a String)
• All Buffers must be pointed to by PMCs or string registers
• All PMCs must be pointed to by other PMCs or the root set
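
Finding live objects by tracing from a root set can be sketched as a simple graph walk. This Python example is a toy: the `Obj` class and `find_live` function are hypothetical, and it covers only the marking step, not compaction.

```python
class Obj:
    """Hypothetical PMC-like object that may point at other objects."""

    def __init__(self, name, refs=()):
        self.name = name
        self.refs = list(refs)


def find_live(root_set):
    """Trace from the roots; anything never reached is dead."""
    live, pending = set(), list(root_set)
    while pending:
        obj = pending.pop()
        if id(obj) in live:
            continue  # already visited
        live.add(id(obj))
        pending.extend(obj.refs)
    return live


c = Obj("c")
b = Obj("b", [c])
a = Obj("a", [b])
orphan = Obj("orphan", [a])  # points at live data but is itself unreachable
c.refs.append(a)  # a cycle: harmless for a tracing collector

live = find_live([a])
print(sorted(o.name for o in (a, b, c, orphan) if id(o) in live))
# ['a', 'b', 'c']
```

The cycle between `a` and `c` is the case that reference counting alone would never reclaim, which is one reason to trace instead.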

DOD and GC are separate

• DOD finds dead structures
• GC compacts memory
• Typically chew up more memory than structures.

I/O

• Fully asynchronous I/O system by default
• Synchronous overlays for easier coding
• Perl 5/TCL/SysV style streams
• C’s STDIO is dead

I/O streams

• All streams can potentially be filtered
• No limit to the number of filters on a stream
• Filters may run asynchronously, or in their own threads
• Filters may be sources or sinks as need be

I/O Stream examples

• UTF8->UTF32 conversion
• EBCDIC->ShiftJIS conversion
• Auto-chomping
• Tee-style fanout
• GIF->PNG conversion
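
Stacking filters on a stream can be approximated with Python generators. This is a synchronous sketch only, so it does not capture the asynchronous or threaded filters mentioned earlier; `source`, `chomp`, `upper`, and `layered` are invented names for the illustration.

```python
def source(lines):
    yield from lines


def chomp(stream):
    for line in stream:
        yield line.rstrip("\n")  # auto-chomping filter


def upper(stream):
    for line in stream:
        yield line.upper()


def layered(stream, *filters):
    for f in filters:  # no limit on how many filters are stacked
        stream = f(stream)
    return stream


out = list(layered(source(["one\n", "two\n"]), chomp, upper))
print(out)  # ['ONE', 'TWO']
```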

Unified I/O & Event system

• I/O requests and events are versions of the same thing
• Event generators are just autonomous streams

Subs and sub calling

• Several sub types
  • Regular subs
  • Closures
  • Co-routines
  • Continuations
• All done with CPS, though it can be hidden
• Caller-save scheme for easier tail-calls
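
In continuation-passing style (CPS), a routine never returns to its caller in the usual sense; it hands its result to a continuation it was given. The Python sketch below shows the style in general terms, with made-up function names, and makes no claim about how Parrot encodes continuations internally.

```python
def add_cps(a, b, k):
    return k(a + b)  # "return" by calling the continuation


def square_cps(x, k):
    return k(x * x)


def hypot_squared(a, b, k):
    # a*a + b*b, with every intermediate result passed to a continuation
    return square_cps(a, lambda a2:
           square_cps(b, lambda b2:
           add_cps(a2, b2, k)))


print(hypot_squared(3, 4, lambda result: result))  # 25
```

Every call in `hypot_squared` is in tail position, which is why CPS and cheap tail calls tend to go together.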

Parrot has calling conventions

• One standard set
• All languages that want to interoperate should use them
• Only use them for globally exposed routines
• Terribly boring except when you don’t have them

Common object interface

• The OO part of the calling conventions
• Method calling and attribute storage/retrieval is standardized
• Method dispatch is ultimately delegated
• Attribute storage is also ultimately delegated
• Supports multimethod dispatch, prototype checking, and runtime mutability

Threads

• Three models
  • Shared dependent
  • Shared independent
  • Completely independent
• Built in interpreter thread safety
• Primitives to allow for language-level thread safety and inter-thread communication

Introspection

• All language data structures are accessible from Parrot programs
  • Scratchpads
  • Global variable tables
  • Stacks
• Interpreter level stuff is accessible
  • Statistics
  • Internal pools

Runtime Mutability

• Full runtime mutability is supported
• Parser and compiler are generally available
• Needed for things like Perl’s string eval