@asgrim Climbing the Abstract Syntax Tree James Titcumb Bulgaria PHP Conference 2016
Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Jan 26, 2017

James Titcumb
Transcript
Page 1: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

@asgrim

Climbing the Abstract Syntax Tree

James Titcumb

Bulgaria PHP Conference 2016

Page 2: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Who is this guy?

James Titcumb

www.jamestitcumb.com

www.roave.com

www.phphants.co.uk

www.phpsouthcoast.co.uk

@asgrim

Page 3: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

How PHP works

PHP code -> Lexer + Parser -> Compiler -> OpCache -> Execute (VM)

Page 4: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

The PHP Lexer

zend_language_scanner.l

Page 5: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

zend_language_scanner.l

<ST_IN_SCRIPTING>"exit" {
    RETURN_TOKEN(T_EXIT);
}

<ST_IN_SCRIPTING>"die" {
    RETURN_TOKEN(T_EXIT);
}

<ST_IN_SCRIPTING>"function" {
    RETURN_TOKEN(T_FUNCTION);
}
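
The effect of these scanner rules can be observed from userland. The following is a minimal sketch, assuming the bundled tokenizer extension is enabled: `token_get_all()` and `token_name()` are PHP built-ins, while the snippet being tokenised and the variable names are only illustrative.

```php
<?php
// Tokenise a tiny script and collect the names of the named tokens.
$tokens = token_get_all('<?php exit; die; function');

$names = [];
foreach ($tokens as $token) {
    // Single characters such as ";" come back as plain strings; skip them
    // and whitespace so that only the open tag and keywords remain.
    if (is_array($token) && $token[0] !== T_WHITESPACE) {
        $names[] = token_name($token[0]);
    }
}

echo implode(' ', $names), "\n";
```

Both `exit` and `die` should be reported under the same token name, mirroring the two rules above that return `T_EXIT`.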

Page 9: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

zend_language_scanner.l

<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>"${" {
    yy_push_state(ST_LOOKING_FOR_VARNAME);
    RETURN_TOKEN(T_DOLLAR_OPEN_CURLY_BRACES);
}

<ST_LOOKING_FOR_VARNAME>{LABEL}[[}] {
    yyless(yyleng - 1);
    zend_copy_value(zendlval, yytext, yyleng);
    yy_pop_state();
    yy_push_state(ST_IN_SCRIPTING);
    RETURN_TOKEN(T_STRING_VARNAME);
}

Page 20: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

The PHP Lexer

zend_language_scanner.l -> re2c -> zend_language_scanner.c

Page 21: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

The PHP Parser

zend_language_parser.y

Page 22: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

zend_language_parser.y

if_stmt:
        if_stmt_without_else %prec T_NOELSE { $$ = $1; }
    |   if_stmt_without_else T_ELSE statement
            { $$ = zend_ast_list_add($1, zend_ast_create(ZEND_AST_IF_ELEM, NULL, $3)); }
;

if_stmt_without_else:
        T_IF '(' expr ')' statement
            { $$ = zend_ast_create_list(1, ZEND_AST_IF,
                  zend_ast_create(ZEND_AST_IF_ELEM, $3, $5)); }
    |   if_stmt_without_else T_ELSEIF '(' expr ')' statement
            { $$ = zend_ast_list_add($1,
                  zend_ast_create(ZEND_AST_IF_ELEM, $4, $6)); }
;
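
To see what these grammar actions build, here is a rough PHP model of the resulting structure. It is only a sketch: `astCreate()`, `astCreateList()` and `astListAdd()` are hypothetical stand-ins for the C functions `zend_ast_create()`, `zend_ast_create_list()` and `zend_ast_list_add()`, plain arrays stand in for AST nodes, and strings stand in for the condition and statement subtrees.

```php
<?php
// Hypothetical array-based stand-ins for the Zend AST helpers.
function astCreate(string $kind, ...$children): array
{
    return ['kind' => $kind, 'children' => $children];
}

function astCreateList(string $kind, array ...$elements): array
{
    return ['kind' => $kind, 'children' => $elements];
}

function astListAdd(array $list, array $element): array
{
    $list['children'][] = $element;
    return $list;
}

// Models: if ($a) { a(); } elseif ($b) { b(); } else { c(); }
// Rule: T_IF '(' expr ')' statement
$if = astCreateList('ZEND_AST_IF', astCreate('ZEND_AST_IF_ELEM', '$a', 'a()'));
// Rule: if_stmt_without_else T_ELSEIF '(' expr ')' statement
$if = astListAdd($if, astCreate('ZEND_AST_IF_ELEM', '$b', 'b()'));
// Rule: if_stmt_without_else T_ELSE statement (the else branch has no condition)
$if = astListAdd($if, astCreate('ZEND_AST_IF_ELEM', null, 'c()'));

foreach ($if['children'] as $element) {
    [$condition, $statement] = $element['children'];
    echo ($condition ?? 'else'), ' => ', $statement, "\n";
}
```

The point of the model is that an if statement becomes one flat list of (condition, statement) pairs, with a missing condition marking the else branch.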

Page 33: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Using the rules to parse

if ($a == 1)
{
    a();
}
else if ($b == 1)
{
    b();
}
else
{
    c();
}

if_stmt_without_else (A)

if_stmt_without_else (B)

if_stmt

Page 34: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

zend_language_parser.y (PHP 7.0.10)

if_stmt:
        if_stmt_without_else %prec T_NOELSE { $$ = $1; }
    |   if_stmt_without_else T_ELSE statement
            { $$ = zend_ast_list_add($1, zend_ast_create(ZEND_AST_IF_ELEM, NULL, $3)); }
;

if_stmt_without_else:
        T_IF '(' expr ')' statement
            { $$ = zend_ast_create_list(1, ZEND_AST_IF,
                  zend_ast_create(ZEND_AST_IF_ELEM, $3, $5)); }
    |   if_stmt_without_else T_ELSEIF '(' expr ')' statement
            { $$ = zend_ast_list_add($1,
                  zend_ast_create(ZEND_AST_IF_ELEM, $4, $6)); }
;

Page 35: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

zend_language_parser.y (PHP 5.6.26)

T_IF parenthesis_expr { zend_do_if_cond(&$2, &$1 TSRMLS_CC); }
    statement { zend_do_if_after_statement(&$1, 1 TSRMLS_CC); }

void zend_do_if_cond(const znode *cond, znode *closing_bracket_token TSRMLS_DC)
{
    int if_cond_op_number = get_next_op_number(CG(active_op_array));
    zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);

    opline->opcode = ZEND_JMPZ;
    SET_NODE(opline->op1, cond);
    closing_bracket_token->u.op.opline_num = if_cond_op_number;
    SET_UNUSED(opline->op2);
    INC_BPC(CG(active_op_array));
}

Page 36: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

AST is new in PHP 7+

Page 37: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

How PHP works

PHP code -> Lexer + Parser -> Compiler -> OpCache -> Execute (VM)

Page 38: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Let’s simplify!

Page 39: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

First… WTF is AST?

Page 40: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

AST is just a data structure

Page 41: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

PHP code

<?php
echo "Hello world";

Page 42: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

An AST representation

Echo statement
`-- String, value "Hello world"

Page 43: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

PHP code

<?php
echo "Hello " . "world";

Page 44: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

An AST representation

Echo statement
`-- Concat
    |-- Left
    |   `-- String, value "Hello "
    `-- Right
        `-- String, value "world"

Page 45: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

PHP code

<?php
$a = 5;
$b = 3;
echo $a + ($b * 2);

Page 46: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

An AST representation

Assign statement
|-- Variable $a
`-- Integer, value 5

Assign statement
|-- Variable $b
`-- Integer, value 3

Echo statement
`-- Add operation
    |-- Left
    |   `-- Variable $a
    `-- Right
        `-- Multiply operation
            |-- Left
            |   `-- Variable $b
            `-- Right
                `-- Integer, value 2

Page 47: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

AST compilation

Statements
|-- Assign
|   |-- Variable, name: $a
|   `-- Scalar, value: (int)5
|-- Assign
|   |-- Variable, name: $b
|   `-- Scalar, value: (int)3
`-- Echo
    `-- Add op
        |-- Left operand
        |   `-- Variable, name: $a
        `-- Right operand
            `-- Multiply op
                |-- Left operand
                |   `-- Variable, name: $b
                `-- Right operand
                    `-- Scalar, value: (int)2

Page 48: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

AST compilation: pre-order traversal

The same tree as on the previous slide, visited in pre-order: each node is visited before its children, and the children are visited from left to right.

Page 49: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Pre-order traversal: Polish notation

Assign(Variable $a, Scalar 5)
Assign(Variable $b, Scalar 3)
Echo(
    Add(
        Variable $a,
        Multiply($b, 2)
    )
)
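
The pre-order walk that produces this notation can be sketched in a few lines of PHP. This is an illustration under stated assumptions, not the engine's code: the nested-array node layout and the function name `toPolish()` are invented for the example.

```php
<?php
// A node is [label, child, child, ...]; a leaf is a plain string.
// This array layout is an assumption made for the sketch.
function toPolish($node): string
{
    if (!is_array($node)) {
        return $node;
    }

    // Pre-order: emit the node itself first, then its children left to right.
    $label = array_shift($node);

    return $label . '(' . implode(', ', array_map('toPolish', $node)) . ')';
}

$ast = ['Echo', ['Add', '$a', ['Multiply', '$b', '2']]];

echo toPolish($ast), "\n";
```

Because the operator is written before its operands, the nesting alone fixes the evaluation order and no precedence rules are needed when reading the result.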

Page 53: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Order of precedence

1 + 2 * 3
= 1 + (2 * 3) = 7?
= (1 + 2) * 3 = 9?

In Polish notation: + 1 * 2 3

Outer operation: + (operator), 1 (left operand), * 2 3 (right operand)

Inner operation: * (operator), 2 (left operand), 3 (right operand)

Page 54: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Reverse Polish Notation

1 2 3 * +

The stack, after reading each token:

Read 1: 1
Read 2: 1 2
Read 3: 1 2 3
Read *: 1 6 (3 and 2 are popped, 2 * 3 = 6 is pushed)
Read +: 7 (6 and 1 are popped, 1 + 6 = 7 is pushed)
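
The stack walk above translates almost directly into code. Here is a minimal sketch, assuming space-separated input and integer operands only; the function name `evaluateRpn()` is made up for this example and division is left out to keep the results integral.

```php
<?php
// Evaluate a space-separated Reverse Polish Notation expression.
function evaluateRpn(string $expression): int
{
    $stack = [];

    foreach (explode(' ', $expression) as $token) {
        if (ctype_digit($token)) {
            $stack[] = (int) $token; // operands are pushed as they arrive
            continue;
        }

        // Operators pop the right operand first, then the left one.
        $right = array_pop($stack);
        $left = array_pop($stack);

        switch ($token) {
            case '+': $stack[] = $left + $right; break;
            case '-': $stack[] = $left - $right; break;
            case '*': $stack[] = $left * $right; break;
            default:
                throw new InvalidArgumentException("Unknown token: $token");
        }
    }

    return array_pop($stack);
}

echo evaluateRpn('1 2 3 * +'), "\n"; // prints 7
```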

Page 64: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Let’s write a compiler (!!!)

In three easy steps…

Page 65: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Warning: do not use in production

Page 66: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

View > Source

https://github.com/asgrim/basic-maths-compiler

Page 67: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Define the language

Tokens

● T_ADD (+)
● T_SUBTRACT (-)
● T_MULTIPLY (*)
● T_DIVIDE (/)
● T_INTEGER (\d+)
● T_WHITESPACE (\s+)

Page 68: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Step 1: Writing a simple lexer

Page 69: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Using regular expressions

private static $matches = [
    '/^(\+)/' => Token::T_ADD,
    '/^(-)/' => Token::T_SUBTRACT,
    '/^(\*)/' => Token::T_MULTIPLY,
    '/^(\/)/' => Token::T_DIVIDE,
    '/^(\d+)/' => Token::T_INTEGER,
    '/^(\s+)/' => Token::T_WHITESPACE,
];

Page 70: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Step through the input string

public function __invoke(string $input) : array
{
    $tokens = [];
    $offset = 0;

    while ($offset < strlen($input)) {
        // Match against whatever is left of the input.
        $focus = substr($input, $offset);
        $result = $this->match($focus);
        $tokens[] = $result;

        // Advance past the lexeme that was just consumed.
        $offset += strlen($result->getLexeme());
    }

    return $tokens;
}

Page 71: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

The matching method

private function match(string $input) : Token
{
    // Try each pattern in turn; the first one that matches wins.
    foreach (self::$matches as $pattern => $token) {
        if (preg_match($pattern, $input, $matches)) {
            return new Token($token, $matches[1]);
        }
    }

    throw new \RuntimeException(sprintf(
        'Unmatched token, next 15 chars were: %s', substr($input, 0, 15)
    ));
}
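
For readers who want to run the lexer without the rest of the project, the pieces above can be condensed into one function. This is a sketch rather than the repository's code: tokens are plain `[type, lexeme]` arrays instead of `Token` objects, the names `TOKEN_PATTERNS` and `lex()` are invented here, and dropping whitespace inside the lexer is a choice of this example.

```php
<?php
// Patterns are tried in order against the start of the remaining input.
const TOKEN_PATTERNS = [
    '/^(\+)/'  => 'T_ADD',
    '/^(-)/'   => 'T_SUBTRACT',
    '/^(\*)/'  => 'T_MULTIPLY',
    '/^(\/)/'  => 'T_DIVIDE',
    '/^(\d+)/' => 'T_INTEGER',
    '/^(\s+)/' => 'T_WHITESPACE',
];

function lex(string $input): array
{
    $tokens = [];
    $offset = 0;

    while ($offset < strlen($input)) {
        $focus = substr($input, $offset);
        $matched = false;

        foreach (TOKEN_PATTERNS as $pattern => $type) {
            if (preg_match($pattern, $focus, $m)) {
                // Whitespace is consumed but not emitted (a choice of this sketch).
                if ($type !== 'T_WHITESPACE') {
                    $tokens[] = [$type, $m[1]];
                }
                $offset += strlen($m[1]);
                $matched = true;
                break;
            }
        }

        if (!$matched) {
            throw new RuntimeException('Unmatched token near: ' . substr($focus, 0, 15));
        }
    }

    return $tokens;
}

print_r(lex('1 + 2 * 3'));
```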

Page 72: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Step 2: Parsing the tokens

Page 73: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Order tokens by operator precedence

/**
 * Higher number is higher precedence.
 * @var int[]
 */
private static $operatorPrecedence = [
    Token::T_SUBTRACT => 0,
    Token::T_ADD => 1,
    Token::T_DIVIDE => 2,
    Token::T_MULTIPLY => 3,
];

Page 74: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Order tokens by operator precedence

/** @var Token[] $stack */
$stack = [];
/** @var Token[] $operators */
$operators = [];

while (false !== ($token = current($tokens))) {
    if ($token->isOperator()) {
        // ...
    }

    $stack[] = $token;
    next($tokens);
}

Page 78: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Order tokens by operator precedence

if ($token->isOperator()) {
    $tokenPrecedence = self::$operatorPrecedence[$token->getToken()];

    // Move operators that bind tighter than the current one to the output first.
    while (
        count($operators)
        && self::$operatorPrecedence[$operators[count($operators) - 1]->getToken()]
            > $tokenPrecedence
    ) {
        $higherOp = array_pop($operators);
        $stack[] = $higherOp;
    }

    $operators[] = $token;
    next($tokens);
    continue;
}

Page 82: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Order tokens by operator precedence

// Clean up by moving any remaining operators onto the token stack
while (count($operators)) {
    $stack[] = array_pop($operators);
}

return $stack;

Page 83: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Order tokens by operator precedence

1 + 2 * 3

Read 1: output stack "1", operator stack empty
Read +: output stack "1", operator stack "+"
Read 2: output stack "1 2", operator stack "+"
Read *: output stack "1 2", operator stack "+ *"
Read 3: output stack "1 2 3", operator stack "+ *"
Clean up: * is popped, output stack "1 2 3 *", operator stack "+"
Clean up: + is popped, output stack "1 2 3 * +", operator stack empty
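
The reordering shown in this walkthrough is essentially the shunting-yard algorithm without parentheses. Below is a self-contained sketch of it, with two deliberate departures from the slide code that should be read as assumptions of this example: `+` and `-` (and `*` and `/`) share a precedence level, and operators of equal precedence are popped too, so that an input such as `8 - 2 + 1` associates to the left. The names `PRECEDENCE` and `toRpn()` are invented here.

```php
<?php
// Higher number binds tighter. Equal levels for + and - (and for * and /)
// are an assumption of this sketch, not the table from the slides.
const PRECEDENCE = ['+' => 1, '-' => 1, '*' => 2, '/' => 2];

// Convert infix tokens (e.g. ['1', '+', '2', '*', '3']) to RPN order.
function toRpn(array $tokens): array
{
    $output = [];
    $operators = [];

    foreach ($tokens as $token) {
        if (!array_key_exists($token, PRECEDENCE)) {
            $output[] = $token; // operands go straight to the output
            continue;
        }

        // Pop operators of higher or equal precedence first, so that
        // equal-precedence operators associate to the left.
        while ($operators && PRECEDENCE[end($operators)] >= PRECEDENCE[$token]) {
            $output[] = array_pop($operators);
        }

        $operators[] = $token;
    }

    // Flush whatever operators remain.
    while ($operators) {
        $output[] = array_pop($operators);
    }

    return $output;
}

echo implode(' ', toRpn(['1', '+', '2', '*', '3'])), "\n"; // prints 1 2 3 * +
```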

Page 91: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Create AST

$astStack = [];
$ip = 0;

while ($ip < count($tokenStack)) {
    $token = $tokenStack[$ip++];

    if ($token->isOperator()) {
        // (figure out $nodeType)
        // The right operand was pushed last, so it is popped first.
        $right = array_pop($astStack);
        $left = array_pop($astStack);
        $astStack[] = new $nodeType($left, $right);
        continue;
    }

    $astStack[] = new Node\Scalar\IntegerValue((int)$token->getLexeme());
}

Page 95: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Create AST

Node\BinaryOp\Add (
    Node\Scalar\IntegerValue(1),
    Node\BinaryOp\Multiply (
        Node\Scalar\IntegerValue(2),
        Node\Scalar\IntegerValue(3)
    )
)
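
The same construction can be tried in isolation with plain arrays. In this sketch the node shapes (`['int', value]` for integers and `[operator, left, right]` for binary operations) and the function name `buildAst()` are assumptions made for illustration; the talk's code uses `Node\...` classes instead.

```php
<?php
// Build a nested-array AST from tokens that are already in RPN order.
function buildAst(array $rpnTokens): array
{
    $astStack = [];

    foreach ($rpnTokens as $token) {
        if (ctype_digit($token)) {
            $astStack[] = ['int', (int) $token];
            continue;
        }

        // The right operand was pushed last, so it is popped first.
        $right = array_pop($astStack);
        $left = array_pop($astStack);
        $astStack[] = [$token, $left, $right];
    }

    // A well-formed expression leaves exactly one node: the root.
    if (count($astStack) !== 1) {
        throw new RuntimeException('Malformed expression');
    }

    return $astStack[0];
}

$ast = buildAst(['1', '2', '3', '*', '+']);

echo json_encode($ast), "\n";
```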

Page 96: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Step 3: Executing the AST

Page 97: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Compile & execute AST

private function compileNode(NodeInterface $node)
{
    // Binary operations recurse into their operands.
    if ($node instanceof Node\BinaryOp\AbstractBinaryOp) {
        return $this->compileBinaryOp($node);
    }

    // Integers are leaves and evaluate to themselves.
    if ($node instanceof Node\Scalar\IntegerValue) {
        return $node->getValue();
    }
}

Page 98: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Compile & execute AST

private function compileBinaryOp(Node\BinaryOp\AbstractBinaryOp $node)
{
    $left = $this->compileNode($node->getLeft());
    $right = $this->compileNode($node->getRight());

    switch (get_class($node)) {
        case Node\BinaryOp\Add::class:
            return $left + $right;
        case Node\BinaryOp\Subtract::class:
            return $left - $right;
        case Node\BinaryOp\Multiply::class:
            return $left * $right;
        case Node\BinaryOp\Divide::class:
            return $left / $right;
    }
}
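
A standalone version of this recursive evaluation is sketched below. The classes `IntegerValue` and `BinaryOp` and the function `evaluate()` are simplified stand-ins invented for the example (a single `BinaryOp` class with an operator string replaces the per-operator node classes), and unknown operators raise an exception here rather than falling through.

```php
<?php
// Minimal stand-ins for the node classes; names and shapes are assumptions.
final class IntegerValue
{
    public $value;

    public function __construct(int $value)
    {
        $this->value = $value;
    }
}

final class BinaryOp
{
    public $operator;
    public $left;
    public $right;

    public function __construct(string $operator, $left, $right)
    {
        $this->operator = $operator;
        $this->left = $left;
        $this->right = $right;
    }
}

// Both operands are evaluated before the operator itself is applied.
function evaluate($node)
{
    if ($node instanceof IntegerValue) {
        return $node->value;
    }

    $left = evaluate($node->left);
    $right = evaluate($node->right);

    switch ($node->operator) {
        case '+': return $left + $right;
        case '-': return $left - $right;
        case '*': return $left * $right;
        case '/': return $left / $right;
    }

    throw new InvalidArgumentException('Unknown operator: ' . $node->operator);
}

$ast = new BinaryOp('+', new IntegerValue(1),
    new BinaryOp('*', new IntegerValue(2), new IntegerValue(3)));

echo evaluate($ast), "\n"; // prints 7
```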

Page 99: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

AST in userland

Page 100: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

php-ast extension

https://github.com/nikic/php-ast

Page 101: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

php-ast example usage

<?php
require 'path/to/util.php';

$code = <<<'EOC'
<?php
$var = 42;
EOC;

echo ast_dump(ast\parse_code($code, $version=35)), "\n";

// Output:
AST_STMT_LIST
    0: AST_ASSIGN
        var: AST_VAR
            name: "var"
        expr: 42

Page 102: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

astkit

https://github.com/sgolemon/astkit

Page 103: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

astkit example usage

$if = AstKit::parseString(<<<EOD
if (true) {
  echo "This is a triumph.\n";
} else {
  echo "The cake is a lie.\n";
}
EOD
);

$if->execute(); // First run, program is as-seen above

$const = $if->getChild(0)->getChild(0);

// Replace the "true" constant in the condition with false
$const->graft(0, false);

// Can also graft other AstKit nodes, instead of constants
$if->execute(); // Second run now takes the else path

Page 104: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

PhpParser

https://github.com/nikic/PHP-Parser

Page 105: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

PHP Parser

<?php
use PhpParser\ParserFactory;

$parser = (new ParserFactory)
    ->create(ParserFactory::PREFER_PHP7);

print_r($parser->parse(
    file_get_contents('ast-demo-src.php')
));

Page 106: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Better Reflection

https://github.com/Roave/BetterReflection

Page 107: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Better Reflection workflow

Reflector

Source Locator

PhpParser

Reflection

Page 108: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

PHP Reflection

$reflection = new ReflectionClass(
    \My\ExampleClass::class
);

$this->assertSame(
    'ExampleClass',
    $reflection->getShortName()
);

Page 109: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Better Reflection

$reflection = ReflectionClass::createFromName(
    \My\ExampleClass::class
);

$this->assertSame(
    'ExampleClass',
    $reflection->getShortName()
);

Page 110: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

ReflectionClass::createFromName()

// In ReflectionClass:
public static function createFromName($className)
{
    return ClassReflector::buildDefaultReflector()->reflect($className);
}

Page 111: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

ClassReflector::buildDefaultReflector()

// In ClassReflector:
public static function buildDefaultReflector()
{
    return new self(new AggregateSourceLocator([
        new PhpInternalSourceLocator(),
        new EvaledCodeSourceLocator(),
        new AutoloadSourceLocator(),
    ]));
}

Page 112: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Given a class structure...

<?php

class Foo
{
    private $bar;

    public function thing()
    {
    }
}

Page 113: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

… we get the AST!

Class, name Foo
|-- Statements
|   |-- Property, name bar
|   |   |-- Type [private]
|   |   `-- Attributes [start line: 7, end line: 9]
|   `-- Method, name thing
|       |-- Type [public]
|       |-- Parameters [...]
|       |-- Statements [...]
|       `-- Attributes [start line: 7, end line: 9]
`-- Attributes [start line: 3, end line: 10]

Page 114: Climbing the Abstract Syntax Tree (Bulgaria PHP 2016)

Any questions?

https://joind.in/talk/513ad

James Titcumb @asgrim