May 10, 2015
A better Python for the JVM
Tobias Ivarsson <[email protected]>
Hello, my name is...
• ...Tobias Ivarsson
• Jython Committer / Compiler geek
• Java developer at Neo Technology
• Ask me about our graph database - Neo4j (it works with Python)
• Overview of the “Advanced Compiler” project
• Performance figures
• Python / JVM mismatch
• Getting better
• Summary
Overview of the “Advanced Compiler” project
Project motivation
• The ultimate goal is a faster Jython
• The new compiler is just a component to get there
• Focus is on representation of Python code on the JVM
What does Code Representation include?
• Function/Method/Code object representation
• Call frame representation
• Affects sys._getframe()
• Scopes. How to store locals and globals
• The representation of builtins
• Mapping from Python attributes to the JVM
Compiler tool chain
[Diagram: Source code → Parser → AST → Analyzer → AST + Code Info per scope → Compiler]
The “spine” of the compiler, its main part. It is the same in every Jython compiler, and similar in other systems, CPython in particular.
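The same parser / analyzer / compiler spine can be observed from Python itself. The sketch below is only an analogy using CPython's standard library (ast, symtable, compile), written for Python 3. It is not the Jython implementation, and run_tool_chain and SOURCE are names made up for the illustration.

```python
import ast
import symtable

# Hypothetical sample module used only for this illustration.
SOURCE = "def add(a, b):\n    return a + b\n\nresult = add(2, 3)\n"


def run_tool_chain(source):
    """Walk one module through parse, analyze and compile steps."""
    # Parser: source code -> AST
    tree = ast.parse(source)

    # Analyzer: scope information, one table per scope
    top_scope = symtable.symtable(source, "<example>", "exec")
    function_scope = top_scope.get_children()[0]

    # Compiler: AST -> code object (CPython byte code in this analogy)
    code = compile(tree, "<example>", "exec")

    # Run the compiled module in a fresh namespace
    namespace = {}
    exec(code, namespace)

    return {
        "ast_root": type(tree).__name__,
        "scope_name": function_scope.get_name(),
        "parameters": sorted(function_scope.get_parameters()),
        "result": namespace["result"],
    }
```

Calling run_tool_chain(SOURCE) returns the AST root type, the name and parameters of the one nested scope, and the value the compiled module computed.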
Compiler tool chain
[Diagram: Source code → Parser → AST → Analyzer → AST + Code Info per scope → Compiler → Java byte code, running on the JVM with the Jython runtime system]
This is the structure of the compiler in Jython today.
Compiler tool chain
[Diagram: Source code → Parser → AST → Analyzer → AST + Code Info per scope → Compiler → IR → Transformer → IR → Codegen → Java byte code, running on the JVM with the Jython runtime system]
The advanced compiler adds two more steps to the compilation process. The analyzer and compiler steps also change.
Compiler tool chain
[Diagram: Source code → Parser → AST → Analyzer → AST + Code Info per scope → Compiler → IR → Transformer → IR → Codegen → Python byte code (for the Interpreter) or Java byte code (for the JVM), both with the Jython runtime system]
This flexibility makes it possible to output many different code formats, and even to bundle multiple formats together for one module.
The Intermediate Representation
• “sea of nodes” style SSA
• Control flow and data flow both modeled as edges between nodes
• Simplifies instruction re-ordering
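A toy illustration of the idea, not the actual Jython IR: each node keeps its data inputs and an optional control input as plain edges. The class Node, the op names and the helpers below are all hypothetical. Only the return node is pinned to control flow, so the pure add and mul nodes can be scheduled in any order that respects their data edges.

```python
class Node:
    """One IR node. Both kinds of dependency are edges to other nodes."""

    def __init__(self, op, data_inputs=(), control_input=None, value=None):
        self.op = op
        self.data_inputs = list(data_inputs)   # data flow edges
        self.control_input = control_input     # control flow edge, or None
        self.value = value                     # payload for constants


def evaluate(node):
    """Evaluate a pure data node by following its data edges."""
    if node.op == "const":
        return node.value
    left, right = (evaluate(n) for n in node.data_inputs)
    if node.op == "add":
        return left + right
    if node.op == "mul":
        return left * right
    raise ValueError("unknown op: %s" % node.op)


def build_example():
    """Build the graph for: return (2 + 3) * 4."""
    start = Node("start")
    total = Node("add", [Node("const", value=2), Node("const", value=3)])
    product = Node("mul", [total, Node("const", value=4)])
    # Only the return is tied to control flow; add and mul float freely.
    return Node("return", [product], control_input=start)


def run(return_node):
    """Return the value carried by the data input of a return node."""
    return evaluate(return_node.data_inputs[0])
```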
Performance figures
Parrotbench
• 7 tests, numbered b0-b6
• Test b1 omitted
• Tests infinite recursion and expects recursion limit exception
• Allocates objects while recursing
• Not applicable for Jython
Running parrotbench
• Python 2.6 vs Jython 2.5 (trunk)
• Each test executes 3 times, minimum taken
• Total process execution time, including startup, is also measured
• Jython also tested after JVM JIT warmup
• Warmup takes about 1 hour: 110 iterations of each test
The tests (rough understanding)
• b0 parses Python in Python
• b2 computes pi
• b3 sorts random data
• b4 more parsing of Python in Python
• b5 tests performance of builtins
• b6 creates large simple lists/dicts
Python 2.6
Test                       Time (ms)
b0                         1387
b2                         160
b3                         943
b4                         438
b5                         874
b6                         1079
Total (incl. VM startup)   15085
Jython 2.5 (trunk)
Test                       Time (ms), without JIT warmup   Time (ms), with JIT warmup
b0                         4090                            2099
b2                         202                             107
b3                         3612                            1629
b4                         1095                            630
b5                         3044                            2161
b6                         2755                            2237
Total (incl. VM startup)   51702                           Not applicable
CPython 2.6 vs Jython 2.5
[Bar chart: total runtime and runtime excluding VM startup, Python 2.6 vs Jython 2.5, scale 0 to 60,000]
[Bar chart: tests b0, b2, b3, b4, b5, b6 for Python 2.6, Jython 2.5 and Jython with warmup, scale 0 to 15,000]
[Bar chart: tests b0, b2, b3, b4, b5, b6 for Python 2.6, Jython 2.5 and Jython with warmup, scale 0 to 5,000]
What about the “Advanced Compiler”?
• So far no speedup compared to the “old compiler”
• Slight slowdown due to extra compiler step
• Does provide a platform for adding optimizations
• But none of these are implemented yet...
Python / JVM mismatch
Call frames
• A lot of Python code depends on reflecting call frames
• Every JVM has call frames, but only exposes them to debuggers
• Current Jython is naïve about how frames are propagated
• Simple prototyping hints at up to 2x boost
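As an illustration of the kind of frame reflection Python code relies on, the sketch below (Python 3) reads the caller's name and locals through sys._getframe(). The helper names caller_info and greet are made up for the example.

```python
import sys


def caller_info():
    """Return the calling function's name and a copy of its locals."""
    frame = sys._getframe(1)  # 0 is this frame, 1 is the caller
    return frame.f_code.co_name, dict(frame.f_locals)


def greet(name):
    greeting = "Hello, " + name
    # The callee can see 'name' and 'greeting' without being passed them.
    return caller_info()
```

Supporting this for any callee means a reflectable frame object has to be reachable for every Python call, whether or not anything ever looks at it.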
Extremely late binding
• Every binding can change
• The module scope is volatile
• Even builtins can be overridden
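A small demonstration of how late this binding is, written for Python 3 (where the module is called builtins; in Python 2 it was __builtin__). The function names count and demo are hypothetical. The call to len inside count is looked up every time it runs, so replacing the builtin changes the behaviour of code that was compiled long before.

```python
import builtins


def count(items):
    # 'len' is resolved at call time: module globals first, then builtins.
    return len(items)


def demo():
    """Call count() before, during and after overriding the len builtin."""
    before = count([1, 2, 3])

    original_len = builtins.len
    builtins.len = lambda obj: 42      # override the builtin for everyone
    try:
        during = count([1, 2, 3])
    finally:
        builtins.len = original_len    # always restore the real builtin

    after = count([1, 2, 3])
    return before, during, after
```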
Exception handling
• Exception type matching in Python is a sequential comparison.
• Exception type matching in the JVM is done by the VM against a class fixed at compile time.
• Exception types are specified as arbitrary expressions.
• No way of mapping Python try/except directly to the JVM.
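The sequential, expression-based matching can be seen in a short Python 3 sketch. The helper pick is hypothetical and only records which except expressions get evaluated. Each clause's expression runs at handling time, in order, until one matches.

```python
def handle(log):
    """Raise KeyError and record which except expressions are evaluated."""

    def pick(label, exc_type):
        # An arbitrary expression: evaluated only when its clause is reached.
        log.append(label)
        return exc_type

    try:
        raise KeyError("missing")
    except pick("first", ValueError):
        return "first"
    except pick("second", LookupError):   # KeyError is a LookupError
        return "second"
    except pick("third", KeyError):       # never evaluated
        return "third"
```

With log = [], handle(log) returns "second" and leaves log as ["first", "second"]: the third expression is never evaluated because the second clause already matched.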
Getting better
Call frames
• Analyze code - omit unnecessary frames
• Fall back to Java frames for pdb et al.
• Treat locals, globals, dir, exec, eval as special
• Pass state explicitly - avoid centrally stored state
• sys._getframe() is an implementation detail
Late binding
• Ignore it and provide a fail path
• Inline builtins
• Turn for i in range(...): ... into a Java loop
• Do direct invocations to members of the same module
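The “ignore it and provide a fail path” idea can be sketched in plain Python 3: assume range is still the builtin, check that assumption with a cheap guard, and fall back to the fully dynamic path when it does not hold. This is a conceptual sketch, not what the compiler emits, and lookup, sum_squares and module_globals are hypothetical names.

```python
import builtins

# The binding the optimized path was specialized for.
_ORIGINAL_RANGE = builtins.range


def lookup(name, module_globals):
    """Late-bound lookup: module scope first, then builtins."""
    if name in module_globals:
        return module_globals[name]
    return getattr(builtins, name)


def sum_squares(n, module_globals):
    """Sum i*i for i in range(n), with an optimistic fast path."""
    range_impl = lookup("range", module_globals)

    if range_impl is _ORIGINAL_RANGE:
        # Guard holds: run a plain counting loop, no range object needed.
        total = 0
        i = 0
        while i < n:
            total += i * i
            i += 1
        return total

    # Fail path: someone rebound 'range', so honour the dynamic semantics.
    total = 0
    for i in range_impl(n):
        total += i * i
    return total
```

sum_squares(4, {}) takes the fast path, while passing {"range": some_other_callable} as module_globals triggers the fail path and uses the override.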
Exception handling
• The same late binding optimizations + optimistic exception handler restructuring get us far
Reaping the fruits of the future JVMs
• Invokedynamic can perform most optimistic direct calls and provide the fail path
• Interface injection makes Java objects look like Python objects
• And improves integration between different dynamic languages even more
• The advanced compiler makes a perfect platform for integrating this
Summary
The “Advanced Jython compiler” project
• Not just a compiler - but everything close to the compiler - code representation
• A platform for moving forward
• First and foremost an enabling tool
• Actual improvement happens elsewhere
Performance
• Jython has decent performance
• On some benchmarks Jython is better
• For most “real applications” CPython is better
• Long running applications benefit from the JVM - Jython is for the server side
• We are only getting started...
Python / JVM mismatch - Getting better -
• Most of the problems come from trying to mimic CPython too closely
• Future JVMs are a better match
• Optimistic optimizations are the way to go
Thank you!
Questions?
Tobias Ivarsson