Byterun: A (C)Python interpreter in Python Allison Kaptur github.com/akaptur akaptur.github.io @akaptur

Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Aug 19, 2014

Allison Kaptur speaking at NYC Python in July 2014.
Transcript
Page 1: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Byterun: A (C)Python interpreter in Python

Allison Kaptur

github.com/akaptur akaptur.github.io

@akaptur

Page 2: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Byterun Ned Batchelder

Based on

# pyvm2 by Paul Swartz (z3p) from http://www.twistedmatrix.com/users/z3p/

Page 3: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Why would you do such a thing

>>> if a or b:
...     do_stuff()

Page 4: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Some things we can do

out = ""
for i in range(5):
    out = out + str(i)
print(out)

Page 5: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Some things we can do

def fn(a, b=17, c="Hello", d=[]):
    d.append(99)
    print(a, b, c, d)

fn(1)
fn(2, 3)
fn(3, c="Bye")
fn(4, d=["What?"])
fn(5, "b", "c")

Page 6: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Some things we can do

def verbose(func):
    def _wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return _wrapper

@verbose
def add(x, y):
    return x+y

add(7, 3)

Page 7: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Some things we can do

try:
    raise ValueError("oops")
except ValueError as e:
    print("Caught: %s" % e)
print("All done")

Page 8: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Some things we can do

class NullContext(object):
    def __enter__(self):
        l.append('i')
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        l.append('o')
        return False

l = []
for i in range(3):
    with NullContext():
        l.append('w')
        if i % 2:
            break
        l.append('z')
    l.append('e')

l.append('r')
s = ''.join(l)
print("Look: %r" % s)
assert s == "iwzoeiwor"

Page 9: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Some things we can do

g = (x*x for x in range(3))
print(list(g))

Page 10: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

A problem

g = (x*x for x in range(5))
h = (y+1 for y in g)
print(list(h))

Page 11: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

The Python virtual machine:

A bytecode interpreter

Page 12: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Bytecode: the internal representation of a Python program in the interpreter

Page 13: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Bytecode: it’s bytes!

>>> def mod(a, b):
...     ans = a % b
...     return ans

Page 14: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Bytecode: it’s bytes!

>>> def mod(a, b):
...     ans = a % b
...     return ans
>>> mod.func_code.co_code

mod: Function
func_code: Code object
co_code: Bytecode

Page 15: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Bytecode: it’s bytes!

>>> def mod(a, b):
...     ans = a % b
...     return ans
>>> mod.func_code.co_code
'|\x00\x00|\x01\x00\x16}\x02\x00|\x02\x00S'

Page 16: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Bytecode: it’s bytes!

>>> def mod(a, b):
...     ans = a % b
...     return ans
>>> mod.func_code.co_code
'|\x00\x00|\x01\x00\x16}\x02\x00|\x02\x00S'
>>> [ord(b) for b in mod.func_code.co_code]
[124, 0, 0, 124, 1, 0, 22, 125, 2, 0, 124, 2, 0, 83]
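The attribute names on these slides are Python 2 (`func_code`, and `ord` over a `str`). As a minimal sketch for Python 3, assuming a reasonably recent CPython, the same bytes can be reached through `__code__`. The exact byte values depend on the interpreter version, so none are shown here.

```python
def mod(a, b):
    ans = a % b
    return ans

# In Python 3 the code object lives at __code__ (func_code in Python 2).
code_bytes = mod.__code__.co_code

# co_code is a bytes object, so iterating over it already yields integers;
# no ord() call is needed.
byte_values = list(code_bytes)

print(type(code_bytes).__name__)
print(byte_values)
```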

Page 17: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

dis, a bytecode disassembler

>>> import dis
>>> dis.dis(mod)
  2           0 LOAD_FAST                0 (a)
              3 LOAD_FAST                1 (b)
              6 BINARY_MODULO
              7 STORE_FAST               2 (ans)

  3          10 LOAD_FAST                2 (ans)
             13 RETURN_VALUE

Page 18: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

dis, a bytecode disassembler

>>> import dis
>>> dis.dis(mod)
  2           0 LOAD_FAST                0 (a)
              3 LOAD_FAST                1 (b)
              6 BINARY_MODULO
              7 STORE_FAST               2 (ans)

  3          10 LOAD_FAST                2 (ans)
             13 RETURN_VALUE

Columns, left to right:
Line number
Index in bytecode
Instruction name, for humans
More bytes, the argument to each instruction
Hint about arguments
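The same columns can be read programmatically. This is a small sketch assuming Python 3.4 or later, where `dis.get_instructions` is available; opcode names and offsets vary between CPython versions, so the output will not match the Python 2 listing above exactly.

```python
import dis

def mod(a, b):
    ans = a % b
    return ans

rows = []
for ins in dis.get_instructions(mod):
    # offset:  index in bytecode
    # opname:  instruction name, for humans
    # arg:     numeric argument (None when the instruction takes none)
    # argrepr: hint about the argument, e.g. a variable name
    rows.append((ins.offset, ins.opname, ins.arg, ins.argrepr))

for row in rows:
    print(row)
```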

Page 19: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Data stack on a frame (entries listed in the order they were pushed)

Before: whatever, some other thing, something
After LOAD_FAST: whatever, some other thing, something, a, b
After BINARY_MODULO: whatever, some other thing, something, ans
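The stack picture can be replayed with a plain Python list. This is only an illustrative sketch: `trace_mod` and the placeholder entries are made up here, and the end of the list is taken to be the top of the stack.

```python
def trace_mod(a, b):
    """Return the data stack after each step of `a % b` (hypothetical helper)."""
    stack = ["whatever", "some other thing", "something"]
    snapshots = [("before", list(stack))]

    # LOAD_FAST a, then LOAD_FAST b: each pushes one local onto the stack.
    stack.append(a)
    stack.append(b)
    snapshots.append(("after LOAD_FAST", list(stack)))

    # BINARY_MODULO: pop two operands, push one result.
    right = stack.pop()
    left = stack.pop()
    stack.append(left % right)
    snapshots.append(("after BINARY_MODULO", list(stack)))
    return snapshots

for label, stack in trace_mod(15, 4):
    print(label, stack)
```

Whatever was on the stack before the instructions ran is left untouched underneath the operands.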

Page 20: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

def foo():
    x = 1
    def bar(y):
        z = y + 2  # <--- (3)
        return z
    return bar(x)  # <--- (2)
foo()              # <--- (1)

c   ---------------------
a  | bar Frame            | -> blocks: []
l  |     (newest)         | -> data: [1, 2]
l   ---------------------
   | foo Frame            | -> blocks: []
s  |                      | -> data: [<foo.<lcl>.bar, 1]
t   ---------------------
a  | main (module) Frame  | -> blocks: []
c  |     (oldest)         | -> data: [<foo>]
k   ---------------------

Page 21: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

dis, a bytecode disassembler

>>> import dis
>>> dis.dis(mod)
  2           0 LOAD_FAST                0 (a)
              3 LOAD_FAST                1 (b)
              6 BINARY_MODULO
              7 STORE_FAST               2 (ans)

  3          10 LOAD_FAST                2 (ans)
             13 RETURN_VALUE

Page 22: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython
Page 23: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

/* Main switch on opcode */
READ_TIMESTAMP(inst0);

switch (opcode) {

} /*switch*/

Page 24: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

#ifdef CASE_TOO_BIG
    default: switch (opcode) {
#endif

/* Turn this on if your compiler chokes on the big switch: */
/* #define CASE_TOO_BIG 1 */

Page 25: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Back to that bytecode

>>> dis.dis(mod)
  2           0 LOAD_FAST                0 (a)
              3 LOAD_FAST                1 (b)
              6 BINARY_MODULO
              7 STORE_FAST               2 (ans)

  3          10 LOAD_FAST                2 (ans)
             13 RETURN_VALUE

Page 26: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

case LOAD_FAST:
    x = GETLOCAL(oparg);
    if (x != NULL) {
        Py_INCREF(x);
        PUSH(x);
        goto fast_next_opcode;
    }
    format_exc_check_arg(PyExc_UnboundLocalError,
        UNBOUNDLOCAL_ERROR_MSG,
        PyTuple_GetItem(co->co_varnames, oparg));
    break;

Page 27: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

case BINARY_MODULO:
    w = POP();
    v = TOP();
    if (PyString_CheckExact(v))
        x = PyString_Format(v, w);
    else
        x = PyNumber_Remainder(v, w);
    Py_DECREF(v);
    Py_DECREF(w);
    SET_TOP(x);
    if (x != NULL) continue;
    break;
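To connect the C cases back to Python, here is a toy sketch of the same idea. `TinyVM`, its `byte_*` method names, and the hand-written `MOD_BYTECODE` list are hypothetical and written for this illustration; they are not taken from Byterun's source, and error handling such as the unbound-local check is left out.

```python
class TinyVM:
    """Hypothetical toy interpreter for the four instructions in mod()."""

    def __init__(self, local_values):
        self.locals = list(local_values)  # slots indexed by LOAD_FAST's argument
        self.stack = []                   # the data stack
        self.return_value = None

    def byte_LOAD_FAST(self, arg):
        self.stack.append(self.locals[arg])

    def byte_BINARY_MODULO(self, arg):
        w = self.stack.pop()
        v = self.stack.pop()
        # Python's own % handles both the string and the number branch.
        self.stack.append(v % w)

    def byte_STORE_FAST(self, arg):
        self.locals[arg] = self.stack.pop()

    def byte_RETURN_VALUE(self, arg):
        self.return_value = self.stack.pop()
        return "return"

    def run(self, instructions):
        for name, arg in instructions:
            # Looking the handler up by name stands in for the C switch.
            handler = getattr(self, "byte_" + name)
            if handler(arg) == "return":
                break
        return self.return_value


# Hand-written equivalent of the dis listing for mod().
MOD_BYTECODE = [
    ("LOAD_FAST", 0),
    ("LOAD_FAST", 1),
    ("BINARY_MODULO", None),
    ("STORE_FAST", 2),
    ("LOAD_FAST", 2),
    ("RETURN_VALUE", None),
]

def run_mod(a, b):
    return TinyVM([a, b, None]).run(MOD_BYTECODE)

print(run_mod(15, 4))
```

Because the toy is written in Python, the type check that the C code performs by hand is delegated to the host interpreter's `%` operator.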

Page 28: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

It’s “dynamic”

>>> def mod(a, b):
...     ans = a % b
...     return ans
>>> mod(15, 4)
3

Page 29: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

“Dynamic”

>>> def mod(a, b):
...     ans = a % b
...     return ans
>>> mod(15, 4)
3
>>> mod("%s %s", ("NYC", "Python"))

Page 30: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

“Dynamic”

>>> def mod(a, b):
...     ans = a % b
...     return ans
>>> mod(15, 4)
3
>>> mod("%s %s", ("NYC", "Python"))
NYC Python

Page 31: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

“Dynamic”

>>> def mod(a, b):
...     ans = a % b
...     return ans
>>> mod(15, 4)
3
>>> mod("%s %s", ("NYC", "Python"))
NYC Python
>>> print "%s %s" % ("NYC", "Python")
NYC Python

Page 32: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

case BINARY_MODULO:
    w = POP();
    v = TOP();
    if (PyString_CheckExact(v))
        x = PyString_Format(v, w);
    else
        x = PyNumber_Remainder(v, w);
    Py_DECREF(v);
    Py_DECREF(w);
    SET_TOP(x);
    if (x != NULL) continue;
    break;

Page 33: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

>>> class Surprising(object):
...     def __mod__(self, other):
...         print "Surprise!"

>>> s = Surprising()
>>> t = Surprising()
>>> s % t
Surprise!

Page 34: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

“In the general absence of type information, almost every instruction must be treated as INVOKE_ARBITRARY_METHOD.”

- Russell Power and Alex Rubinsteyn, “How Fast Can We Make Interpreted Python?”
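What "arbitrary method" means for `%` can be sketched in Python 3. The helper `binary_modulo` below is a hypothetical simplification of the operator lookup: it tries `__mod__` on the left operand's type, then `__rmod__` on the right operand's type, and omits finer rules such as subclass priority for reflected methods.

```python
def binary_modulo(v, w):
    """Simplified, hypothetical sketch of how `v % w` is resolved at run time."""
    method = getattr(type(v), "__mod__", None)
    if method is not None:
        result = method(v, w)
        if result is not NotImplemented:
            return result

    # Fall back to the reflected method on the right operand.
    reflected = getattr(type(w), "__rmod__", None)
    if reflected is not None:
        result = reflected(w, v)
        if result is not NotImplemented:
            return result

    raise TypeError("unsupported operand type(s) for %")


class Surprising(object):
    def __mod__(self, other):
        return "Surprise!"


print(binary_modulo(15, 4))
print(binary_modulo("%s %s", ("NYC", "Python")))
print(binary_modulo(Surprising(), Surprising()))
```

The same call site runs integer arithmetic, string formatting, or user code, and which one is only known once the operands exist.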

Page 35: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Back to our problem

g = (x*x for x in range(5))
h = (y+1 for y in g)
print(list(h))

Page 36: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

def foo():
    x = 1
    def bar(y):
        z = y + 2  # <--- (3)
        return z
    return bar(x)  # <--- (2)
foo()              # <--- (1)

c   ---------------------
a  | bar Frame            | -> blocks: []
l  |     (newest)         | -> data: [1, 2]
l   ---------------------
   | foo Frame            | -> blocks: []
s  |                      | -> data: [<foo.<lcl>.bar, 1]
t   ---------------------
a  | main (module) Frame  | -> blocks: []
c  |     (oldest)         | -> data: [<foo>]
k   ---------------------

Page 37: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

def foo():
    x = 1
    def bar(y):
        z = y + 2  # <--- (3)
        return z
    return bar(x)  # <--- (2)
foo()              # <--- (1)

l   ---------------------
   | foo Frame            | -> blocks: []
s  |                      | -> data: [3]
t   ---------------------
a  | main (module) Frame  | -> blocks: []
c  |     (oldest)         | -> data: [<foo>]
k   ---------------------

Page 38: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

def foo():
    x = 1
    def bar(y):
        z = y + 2  # <--- (3)
        return z
    return bar(x)  # <--- (2)
foo()              # <--- (1)

s
t   ---------------------
a  | main (module) Frame  | -> blocks: []
c  |     (oldest)         | -> data: [3]
k   ---------------------
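The three diagrams can be replayed as a list of frames, each with its own data stack. This is a sketch under stated assumptions: `Frame`, `push_frame`, and `do_return` are hypothetical names invented for the illustration, and the strings `"<foo>"` and `"<bar>"` stand in for the function objects shown in the diagrams.

```python
class Frame:
    """Hypothetical minimal frame: a name plus its own data and block stacks."""

    def __init__(self, name):
        self.name = name
        self.data = []
        self.blocks = []


call_stack = []

def push_frame(name):
    frame = Frame(name)
    call_stack.append(frame)
    return frame

def do_return(value, call_items):
    """Pop the newest frame and hand `value` to its caller.

    In the caller, the callable and its arguments (`call_items` entries)
    are replaced by the return value.
    """
    call_stack.pop()
    caller = call_stack[-1]
    del caller.data[len(caller.data) - call_items:]
    caller.data.append(value)


main = push_frame("main (module)")
main.data.append("<foo>")          # (1) the module is calling foo()
foo = push_frame("foo")
foo.data.extend(["<bar>", 1])      # (2) foo is calling bar(x)
bar = push_frame("bar")
bar.data.extend([1, 2])            # (3) bar is about to add y and 2

right = bar.data.pop()
left = bar.data.pop()
do_return(left + right, call_items=2)   # bar returns; foo's data becomes [3]
foo_data_after_bar = list(foo.data)

do_return(foo.data.pop(), call_items=1)  # foo returns; main's data becomes [3]

print([f.name for f in call_stack], main.data)
```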

Page 39: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Back to our problem

g = (x*x for x in range(5))
h = (y+1 for y in g)
print(list(h))
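One thing that sets this example apart from the foo/bar walkthrough can be observed directly. The sketch below assumes CPython 3 and its `gi_frame` attribute: a generator's frame is not discarded when it yields, so here two frames stay suspended at once and are resumed in alternation.

```python
g = (x * x for x in range(5))
h = (y + 1 for y in g)

# Running h up to its first yield also resumes g exactly once.
first = next(h)

# Both generators are now suspended, and each still owns a live frame.
g_frame_alive = g.gi_frame is not None
h_frame_alive = h.gi_frame is not None

# g is paused with its loop variable still set in that frame.
x_in_g = g.gi_frame.f_locals["x"]

# Drain the remaining values.
rest = list(h)

# Once a generator is exhausted, its frame is released.
g_frame_after = g.gi_frame

print(first, rest, x_in_g, g_frame_alive, h_frame_alive, g_frame_after)
```

An interpreter therefore has to keep each generator frame's data stack intact while other frames run.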

Page 40: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

More

Great blogs
http://tech.blog.aknin.name/category/my-projects/pythons-innards/ by @aknin
http://eli.thegreenplace.net/ by Eli Bendersky

Contribute! Find bugs!
https://github.com/nedbat/byterun

Apply to Hacker School!
www.hackerschool.com/apply