Perl 6 Update - PGE and Pugs Dr. Patrick R. Michaud April 26, 2005

Rules and Grammars

Perl 6 completely redesigns the regular expression syntax
Regular expressions are now "rules"
Rules can call/embed other rules
Groups of rules can be combined into Grammars

Current events in Perl 6

Parrot 0.1.2 released
The Perl Foundation receives $25,000 for completion of Parrot milestones
New Parrot pumpking - Chip Salzenberg
New version of Parrot Grammar Engine (PGE / Perl 6 rules) to be released this week
Pugs - Autrijus Tang
  Perl 6 test suite

Pugs

Perl 6 compiler written in Haskell
Started by Autrijus Tang
Compiles directly to Haskell or to Parrot AST
Being used to develop Perl 6 tests and experiment with Perl 6 design
Available at http://pugscode.org
Discussion on [email protected] mailing list

Perl 6 rules / Parrot Grammar Engine

The heart of the Perl 6 compiler is the Perl/Parrot Grammar Engine (PGE)
Implements the Perl 6 rules syntax, compiles to Parrot code
Perl 6 rules compiler currently written in C
Bootstrap to Perl 6

Steps to Perl 6 compiler

Finish PGE bootstrap in C
  Parse p6 "rule" statements and grammars
Use p6 rules to define the Perl 6 grammar
P6 grammar can be used to generate Parrot abstract syntax trees from Perl 6 programs
Compile, (optimize), execute the abstract syntax tree to get working Perl 6 program
Use Perl 6 to rewrite the grammar engine in Perl 6 (faster)
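The grammar-to-syntax-tree-to-execution pipeline in these steps can be sketched in miniature. The following is only an illustrative Python toy, not PGE or Parrot code: the grammar (integers joined by + and -), the tuple-based tree, and all function names are assumptions made for the example.

```python
import re

# Hypothetical toy grammar:  expr := term [ ('+' | '-') term ]*   term := integer
TOKEN = re.compile(r"\s*(\d+|[+-])")

def tokenize(source):
    """Split the source text into integer and operator tokens."""
    source = source.rstrip()
    tokens, pos = [], 0
    while pos < len(source):
        m = TOKEN.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse_term(tokens):
    """Consume one integer token and return a ('num', value) leaf."""
    if not tokens or not tokens[0].isdigit():
        raise SyntaxError("expected a number")
    return ("num", int(tokens.pop(0)))

def parse_expr(tokens):
    """Build a left-nested tree of (op, left, right) tuples."""
    tokens = list(tokens)
    tree = parse_term(tokens)
    while tokens and tokens[0] in ("+", "-"):
        op = tokens.pop(0)
        tree = (op, tree, parse_term(tokens))
    if tokens:
        raise SyntaxError(f"unexpected token {tokens[0]!r}")
    return tree

def evaluate(tree):
    """Walk the tree and compute its value (the 'execute' step)."""
    if tree[0] == "num":
        return tree[1]
    op, left, right = tree
    lhs, rhs = evaluate(left), evaluate(right)
    return lhs + rhs if op == "+" else lhs - rhs

tree = parse_expr(tokenize("1 + 2 - 4"))
result = evaluate(tree)
```

Here the parser and the evaluator are separate passes over an explicit tree, which is the property that lets an optimizer be slotted in between them.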

Current state of PGE

Handles concatenation, alternation, quantifiers, captures*, subpatterns, subrules
  (* Capture semantics redefined in Dec 2004, still not final)
To be added next
  Character classes (note: Unicode)
  Patterns containing scalars, arrays, hashes

P6 rule syntax

Changes from Perl 5
  No more trailing /e, /x, /s options
  [...] denotes non-capturing groups
  ^ and $ are beginning/end of string
  ^^ and $$ are beginning/end of line
  . matches any character, including newline
  \n and \N match newline/non-newline
  # marks a comment (to end of line)
  Quantifiers are *, +, ?, and **{m..n}
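For readers who know conventional regex engines better than the draft rule syntax, the anchor and dot changes can be compared with Python's `re` module. This is a rough analogy only; the sample text and variable names are made up for the example, and Python's flags are not how PGE implements these features.

```python
import re

text = "first line\nsecond line"

# Perl 6 ^ and $ (whole string) behave like Python's \A and \Z.
string_start = re.findall(r"\A\w+", text)

# Perl 6 ^^ and $$ (each line) behave like ^ and $ under re.MULTILINE.
line_starts = re.findall(r"^\w+", text, re.MULTILINE)

# Perl 6 '.' matches newline too, like '.' under re.DOTALL.
everything = re.match(r".*", text, re.DOTALL).group(0)

# Perl 6 \N (non-newline) corresponds to the class [^\n].
first_line = re.match(r"[^\n]*", text).group(0)

# Perl 6 **{1..3} corresponds to the {1,3} range quantifier.
short_number = re.fullmatch(r"\d{1,3}", "255")
```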

Character classes

[aeiou] changed to <[aeiou]>
[^0-9] now <-[0..9]>
Properties defined as <alpha> <digit> <alnum>

Combine classes using +/- syntax: <+<alpha>-[aeiou]>
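As a point of comparison, the same three classes can be written with Python's `re` module. Python has no class-subtraction syntax, so the combined class below is approximated with a negative lookahead; that workaround and the variable names are assumptions of this sketch, and it only excludes lowercase vowels.

```python
import re

vowel = re.compile(r"[aeiou]")        # Perl 6: <[aeiou]>
non_digit = re.compile(r"[^0-9]")     # Perl 6: <-[0..9]>

# Perl 6: <+<alpha>-[aeiou]>  (a letter that is not a vowel).
# [^\W\d_] is "word character minus digits and underscore", i.e. a letter;
# the lookahead (?![aeiou]) then subtracts the vowels.
consonant = re.compile(r"(?![aeiou])[^\W\d_]")

consonants = consonant.findall("parrot 6")
```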

Subrules

Patterns are now called "rules"
Analogous to subroutines and closures
Like {...}, /.../ compiles into a "rule" subroutine
P6 rule statement allows named rules:

rule ident / [<alpha>|_] \w* /;

Named rules can be easily used in other rules:

m / <ident> \:= (.*) /;
rule expr / <term> [ <[+-]> <term> ]* /;
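Conventional regex libraries have no named rules, but the reuse idea can be imitated by composing pattern strings. The sketch below is a Python approximation, not Perl 6: `TERM` is an assumed definition (the slides never define <term>), and whitespace handling is added by hand.

```python
import re

# Plain pattern strings standing in for named rules (hypothetical names).
IDENT = r"[A-Za-z_]\w*"                  # rule ident / [<alpha>|_] \w* /
TERM = r"\d+"                            # assumed: the slides do not define <term>
EXPR = rf"{TERM}(?:\s*[+-]\s*{TERM})*"   # rule expr / <term> [ <[+-]> <term> ]* /

# m / <ident> \:= (.*) /
assignment = re.compile(rf"({IDENT})\s*:=\s*(.*)")

m = assignment.match("total := 1 + 2 - 3")
name, value = m.group(1), m.group(2)
value_is_expr = re.fullmatch(EXPR, value) is not None
```

String composition of this kind cannot express recursion, which is one reason true subrules are more powerful than this imitation.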

Interpolation

Variables no longer interpolate directly, thus

/ $var /

matches the contents of $var literally, even if it contains rule metacharacters (no \Q and \E needed). To treat $var as a rule, use

/ <$var> /

Interpolated arrays match as an alternation:

/ @cmds /

is equivalent to

/ [ @cmds[0] | @cmds[1] | @cmds[2] | ... ] /
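The three interpolation behaviours have rough counterparts in Python's `re` module, where the literal case has to be requested explicitly with `re.escape`. The variable contents below are invented for illustration.

```python
import re

var = "a.b*"

# Like / $var /: the contents are matched literally.
literal = re.compile(re.escape(var))

# Like / <$var> /: the contents are compiled as a pattern.
as_rule = re.compile(var)

# Like / @cmds /: each element becomes one branch of an alternation.
cmds = ["help", "quit", "list"]
alternation = re.compile("|".join(re.escape(c) for c in cmds))
```

Note the inverted default: in Python a string is a pattern unless escaped, while in the Perl 6 design a variable is literal unless wrapped in angle brackets.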

Interpolation, cont'd

Hashes match the keys of the hash, and the value for the matched key is either
  Executed if it is a closure
  Treated as a subrule if it's a string or rule object
  Succeeds if value is 1
  Fails for any other value
Useful for parsed languages

rule expr / <term> [ %infixop <expr> ]? /

< metasyntax >

The < ... > introduce various forms of metasyntax
A leading alphabetic character indicates a subrule or grammatical assertion
  <alpha>
  <expr>
  <before pattern>
  <after pattern>
A leading ! negates the match
  <!before pattern>

< metasyntax >

Leading ' matches a literal string

<'match this exactly (whitespace matters)'>

Leading " matches an interpolated string

<"match $THIS exactly (whitespace matters)">

Leading '+' or '-' are character classes

/<-[a..z]> <-<alpha>>/

< metacharacters >

Leading '(' indicates code assertion

/(\d**{1..3}) <( $1 < 256 )>/
# (fail if $1 is not less than 256)

A $, @, or % indicates a variable subrule, where each value (or key) is a subrule to be matched

<$myrule>
<@cmds>
<%commands>
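Python's `re` module has no code assertions, so the check in the code-assertion example on this slide has to happen outside the pattern. The sketch below emulates it by hand, on the assumption that a failed assertion makes the engine backtrack and retry the quantifier with fewer digits; `match_octet` is a hypothetical helper name.

```python
import re

def match_octet(text):
    """Emulate /(\\d**{1..3}) <( $1 < 256 )>/ at the start of text."""
    m = re.match(r"\d{1,3}", text)
    if m is None:
        return None
    digits = m.group(0)
    # Assumed semantics: when the assertion fails, backtrack into the
    # quantifier and try again with one digit fewer.
    for length in range(len(digits), 0, -1):
        candidate = digits[:length]
        if int(candidate) < 256:
            return candidate
    return None
```

Under that assumption "300" does not fail outright; it matches as "30", which is a useful reminder that a code assertion constrains a match rather than validating the whole input.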

A cool and somewhat scary example

%cmd{'^\d+'} = { say "You entered a number" };
%cmd{'^hello'} = { say "world" };
%cmd{'^print \s (.*)'} = { say $1; };
%cmd{'^exit'} = { exit() };

while =$*IN {
    /<%cmd>/ || say "Unrecognized command";
}
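The same table-driven idea can be approximated in Python with a dictionary that maps patterns to handlers. This is a loose analogue rather than a translation: handlers return strings instead of printing, the exit command is left out, and the patterns are tried in insertion order, which is an assumption of the sketch and not a statement about how <%cmd> chooses a key.

```python
import re

# Pattern -> handler table (hypothetical Python counterpart of %cmd).
COMMANDS = {
    r"^\d+": lambda m: "You entered a number",
    r"^hello": lambda m: "world",
    r"^print\s(.*)": lambda m: m.group(1),
}

def run_command(line):
    """Run the handler of the first pattern that matches the line."""
    for pattern, handler in COMMANDS.items():
        m = re.search(pattern, line)
        if m:
            return handler(m)
    return "Unrecognized command"
```

A read loop would then call `run_command` on each input line and print the result.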

Backtracking control

A single colon prevents backtracking into the previous atom:

m/ \( <expr> [ , <expr> ]* : \) /

(if we don't find closing paren, no point in trying to match fewer <expr>s)

Two colons break an alternation:

m:w/ [ if :: <expr> <block>
     | for :: <list> <block>
     | loop :: <loop_controls>? <block>
     ] /

(once we've found "if", "for", or "loop", no point in trying the other branches of the alternation)
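The effect of the double colon can be imitated in a hand-written parser by committing to a branch once its keyword has been seen. The Python sketch below is only an illustration of that idea: the patterns in `BRANCHES` are hypothetical stand-ins for <expr>, <list> and <block>, and `CommitFailure` is an invented exception name.

```python
import re

class CommitFailure(Exception):
    """Raised when a branch fails after its keyword was recognised."""

# Keyword -> pattern for the rest of the statement (simplified stand-ins).
BRANCHES = {
    "if": re.compile(r"\s+\w+\s*\{\s*\}"),
    "for": re.compile(r"\s+\w+(?:\s*,\s*\w+)*\s*\{\s*\}"),
    "loop": re.compile(r"\s*\{\s*\}"),
}

def parse_statement(text):
    """Return the statement keyword, or None if no branch applies."""
    for keyword, rest in BRANCHES.items():
        if text.startswith(keyword):
            # Past the '::' point: a failure here must not fall through
            # to the remaining branches of the alternation.
            if rest.fullmatch(text, len(keyword)) is None:
                raise CommitFailure(f"malformed '{keyword}' statement")
            return keyword
    return None
```

Besides saving work, committing early tends to give better error messages, because the failure is reported against the branch the input was clearly aiming for.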

Backtracking control

Three colons (:::) fail the current rule when backtracked over
The <commit> assertion fails the entire match when backtracked over (including any rules that called the current rule)
The <cut> assertion matches successfully, removes the matched portion of the string up to the <cut>, and if backtracked over fails the match entirely
  Useful for throwing away successfully processed input when matching from an input stream
  Like, say, when writing a compiler :-)

Backslash

\L, \U, \Q, \E, \A, \z gone from rules
\n and \N match newline/not newline
\s matches any Unicode space
Backreferences are gone, use $1, $2, $3 (non-interpolated)
Perl 6 allows defining custom backslash sequences for use in rules

Closures

Anything in curlies is executed as a Perl 6 closure

/ (\w+) { say "Got $1"; } /

Capture semantics

Captures are different in Perl 6
The result of a match is a "match object"
If a match succeeds, the match object has:
  Boolean value true
  Numeric value 1 (except for global matches)
  String value the matched substring
  Array component is matched subpatterns
  Hash component is matched subrules
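Python's `re.Match` objects offer a loosely comparable set of views on a successful match, which may help place the match-object idea. The pattern and input below are invented, and the correspondence is approximate: Python exposes these views through methods rather than through Boolean, string, array and hash contexts.

```python
import re

m = re.search(r"(?P<ident>[A-Za-z_]\w*)\s*:=\s*(\d+)", "set x1 := 42;")

succeeded = bool(m)          # Boolean value: true on success
matched_text = m.group(0)    # string value: the matched substring
positional = m.groups()      # array-like component: numbered groups
named = m.groupdict()        # hash-like component: named groups

# A failed search returns None, which is false in Boolean context.
failed = re.search(r"\d+", "no digits here")
```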

Subpattern captures

Part of a rule in parentheses is a subpattern
Each subpattern produces its own match object

/Scooby (dooby) (doo)!/
          $1     $2

Quantified subpatterns produce arrays of match objects:

/Scooby (\w+ \s+)* (doo)!/
           $1       $2

$1 is a (possibly empty) array of matches
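This is a real change from traditional engines. In Python's `re`, for instance, a quantified group keeps only its last repetition, so collecting every repetition takes a second pass. The sketch below shows both behaviours; `scooby_captures` is a hypothetical helper and the whitespace in the pattern is written out literally.

```python
import re

text = "Scooby dooby dabba doo!"

# Traditional behaviour: the quantified group remembers only its last match.
last_only = re.match(r"Scooby (\w+\s+)*(doo)!", text).group(1)

def scooby_captures(text):
    """Return (list of repeated words, final word), imitating array captures."""
    m = re.match(r"Scooby ((?:\w+\s+)*)(doo)!", text)
    if m is None:
        return None
    # Capture the whole repeated run, then split it into its repetitions.
    return re.findall(r"\w+\s+", m.group(1)), m.group(2)

all_repeats, final = scooby_captures(text)
```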

Non-capturing groups

Brackets do not capture, thus they don't result in a match object

/Scooby [ (\w+ \s+)* (doo) ]!/
             $1       $2

Quantified brackets replace nested subpatterns with the last component matched:

/Scooby [ (\w+ \s+)* (doo) ]+ !/
             $1       $2

Nested capturing subpatterns

Each capturing subpattern introduces a new lexical scope, with nested captures inside the new match object:

/Scooby ( (\w+ \s+)* (doo) ) !/
           $1[0]     $1[1]
        <------- $1 ------->

Alternations

Alternations introduce a new lexical scope, thus subpatterns restart their numbering for each alternative branch (unlike p5):

            $1      $2
m/ Scooby (dooby)* (doo)!
 | Yabba (dabba)* (doo) /
           $1      $2

This avoids lots of empty subpatterns when an alternation doesn't match.
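The "empty subpatterns" problem is easy to reproduce in an engine that numbers groups across the whole pattern, as Perl 5 does. Python's `re` behaves the same way, so it serves here as a stand-in; the pattern is adapted with literal spaces and the variable names are illustrative.

```python
import re

pattern = re.compile(r"Scooby (dooby )*(doo)!|Yabba (dabba )*(doo)")

m = pattern.match("Yabba dabba doo")

# Groups 1 and 2 belong to the branch that did not match, so they are None;
# the useful captures end up in groups 3 and 4.
groups = m.groups()

# Callers typically have to filter the unused slots out themselves.
useful = [g for g in groups if g is not None]
```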

Subrules

Subrules capture into a hash keyed by the name of the subrule:

rule ident / [<alpha>|_] \w* /;
rule num / \d+ /;

m/ <ident> \:= <num> /;

places match objects into $<ident> and $<num>
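The nearest equivalent in Python's `re` is a named group, whose results are collected in a dictionary keyed by name. The sketch below wraps each rule in a named group automatically; `RULES` and `subrule` are hypothetical helpers, and unlike real subrules each name can be used only once per pattern.

```python
import re

# Hypothetical rule table: name -> pattern text.
RULES = {
    "ident": r"[A-Za-z_]\w*",   # rule ident / [<alpha>|_] \w* /
    "num": r"\d+",              # rule num / \d+ /
}

def subrule(name):
    """Wrap a rule in a named group so its match is stored under its name."""
    return f"(?P<{name}>{RULES[name]})"

# m/ <ident> \:= <num> /
assignment = re.compile(rf"{subrule('ident')}\s*:=\s*{subrule('num')}")

captures = assignment.search("width := 80").groupdict()
```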

Quantified subrules

Like subpatterns, quantified subrules produce arrays of matches

m:w / dir <file>* /

produces matches in $<file>[0], $<file>[1], etc.

Nested parens in a subrule capture to the subrule's match object

Named captures

Portions of a match can be captured directly into a match object without a subrule:

m:w/ $<name> := \w+ , $<val> := \d+ /

captures the first sequence of alphanumerics into $<name>, and digits following the comma into $<val>.

Grammars

Rules can be packaged together into separate name spaces to form Grammars

grammar Perl6 {
    rule ident { ... };
    rule term { ... };
    rule expr { ... };
}

:parsetree

The :parsetree flag to a rule causes the grammar engine to keep all information about a match. Thus, one can do something like

$parse = ($source ~~ Perl6::program);

to get the entire parsetree for a program (including comments)

Questions?