Towards Ruby3x3 Performance Introducing RTL and MJIT Vladimir Makarov Red Hat September 21, 2017 Vladimir Makarov (Red Hat) Towards Ruby3x3 Performance September 21, 2017 1 / 30

Towards Ruby3x3 Performance
Introducing RTL and MJIT

Vladimir Makarov

Red Hat

September 21, 2017


About Myself

Red Hat, Toronto office, Canada

Tools group (GCC, Glibc, LLVM, Rust, Go, OpenMP)
  - part of a bigger platform enablement team (porting Linux kernel to new hardware)

20 years of work on GCC

2 years of work on MRI


Ruby 3 performance goal

Matz set a very ambitious goal: MRI 3 should be 3x faster than MRI 2
  - Koichi Sasada improved MRI performance by about 3x
  - It is symbolic to expect MRI 3 should be 3x faster than MRI 2

Doable for CPU intensive programs

Hardly possible for memory or IO bound programs

I treat Matz’s performance goal as: MRI needs another cardinal performance improvement


RTL insns

IR for Ruby code analysis, optimizations, and JIT
  - Importance of easy data dependence discovery
  - Stack based insns are an inconvenient IR for such goals

Stack insns vs RTL insns for Ruby code a = b + c:

Stack insns:
    getlocal_OP__WC__0 <b index>
    getlocal_OP__WC__0 <c index>
    opt_plus
    setlocal_OP__WC__0 <a index>

RTL insn:
    plus <a index>, <b index>, <c index>
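The difference between the two forms can be illustrated with a pair of toy interpreters. This is only a sketch: the array-based insn encoding, the slot numbering, and the method names run_stack and run_rtl are assumptions made for illustration and do not reflect MRI's internal data structures.

```ruby
# Toy model only: locals live in an array; a, b, c occupy slots 0, 1, 2.
a_idx, b_idx, c_idx = 0, 1, 2

# Stack-style execution: every insn goes through the operand stack.
def run_stack(insns, locals)
  stack = []
  insns.each do |op, arg|
    case op
    when :getlocal then stack.push(locals[arg])
    when :setlocal then locals[arg] = stack.pop
    when :opt_plus
      rhs = stack.pop
      lhs = stack.pop
      stack.push(lhs + rhs)
    end
  end
  locals
end

# RTL-style execution: one insn names its destination and both sources.
def run_rtl(insns, locals)
  insns.each do |op, dst, src1, src2|
    case op
    when :plus then locals[dst] = locals[src1] + locals[src2]
    end
  end
  locals
end

stack_code = [[:getlocal, b_idx], [:getlocal, c_idx], [:opt_plus], [:setlocal, a_idx]]
rtl_code   = [[:plus, a_idx, b_idx, c_idx]]

stack_result = run_stack(stack_code, [nil, 2, 3])
rtl_result   = run_rtl(rtl_code, [nil, 2, 3])
```

Both runs leave the same value in slot a, but the stack version dispatches four insns where the RTL version dispatches one, and the data dependence (a depends on b and c) is visible directly in the RTL operands.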


Using RTL insns for interpretation

RTL for analysis and JIT code generation

RTL or stack insns for interpretation?

Feature              Stack insns   RTL insns
Insn length          shorter       longer
Insn number          more          less
Code length          less          more
Insn decoding        less          more
Code data locality   more          less
Insn dispatching     more          less
Memory traffic       more          less

Table: Instructions: Pros & Cons for interpretation

Decision: Use RTL for the interpreter too
  - Allows sharing code between the interpreter and JIT


How to generate RTL

A simpler way is to generate RTL insns from the stack insns

A faster approach is to generate directly from MRI parse tree nodes

Decision: generate RTL directly from MRI nodes
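Generating RTL straight from tree nodes can be sketched as a recursive walk that returns the operand holding each node's value. The node shapes ([:lvar, name], [:call, op, left, right], [:lasgn, name, value]), the :move insn, and the gen_rtl name are hypothetical and chosen for illustration; they are not MRI's node types or Makarov's generator.

```ruby
# Returns the operand (a local name or a temporary) that holds the node's value.
# When dst is given, the value is computed directly into it.
def gen_rtl(node, state, dst = nil)
  case node[0]
  when :lvar
    src = node[1]
    return src if dst.nil? || dst == src
    state[:insns] << [:move, dst, src]   # hypothetical copy insn
    dst
  when :call
    _, op, left, right = node
    lhs = gen_rtl(left, state)
    rhs = gen_rtl(right, state)
    if dst.nil?
      # No destination requested: allocate a fresh temporary.
      dst = :"t#{state[:temps]}"
      state[:temps] += 1
    end
    state[:insns] << [op, dst, lhs, rhs]
    dst
  when :lasgn
    # Assignment passes the local down as the destination of its value.
    gen_rtl(node[2], state, node[1])
  else
    raise ArgumentError, "unknown node #{node[0].inspect}"
  end
end

# a = b + c * d
tree  = [:lasgn, :a, [:call, :plus, [:lvar, :b], [:call, :mult, [:lvar, :c], [:lvar, :d]]]]
state = { insns: [], temps: 0 }
gen_rtl(tree, state)
rtl = state[:insns]
```

For a = b + c * d this yields two insns, a mult into one temporary followed by a plus directly into a, with no intermediate stack code.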


RTL insn operands

What could be an operand:
  - only temporaries
  - temporaries and locals
  - temporaries and locals even from higher levels (outside Ruby block)
  - the above + instance variables
  - the above + class variables, globals

The decoding overhead of numerous operand types will not be compensated by processing a smaller number of insns

Complicated operands also complicate optimizations and JIT

Currently we use only temporaries and locals. This gives the best performance results according to my experiments


RTL complications

Practically any RTL insn might be an ISEQ call. A call always puts a result on the stack top. We need to move this result to a destination operand:
  - If an RTL insn is actually a call, change the return PC so the next insn executed after the call will be an insn moving the result from the stack top to the insn destination
  - To decrease memory overhead, the move insn is a part of the original insn
  - For example, if the following insn
        plus <move opcode>, <call data>, dst, op1, op2
    is a method call, the next executed insn will be
        <move opcode> <call data>, dst, op1, op2
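The embedded move insn can be modelled with a flat code array in which the word after plus is itself a valid opcode. The sketch below is an assumption-laden toy: the :move_result opcode name, the flat array layout, and the use of public_send as a stand-in for an ISEQ call are all invented for illustration.

```ruby
# Layout of one insn: [:plus, :move_result, calldata, dst, op1, op2]
def run(code, locals)
  stack = []
  pc = 0
  while pc < code.length
    case code[pc]
    when :plus
      _cont, _calldata, dst, op1, op2 = code[pc + 1, 5]
      a = locals[op1]
      b = locals[op2]
      if a.is_a?(Integer) && b.is_a?(Integer)
        # Fast path: no call happens, write the destination and skip the whole insn.
        locals[dst] = a + b
        pc += 6
      else
        # Stand-in for a method call: its result lands on the stack top.
        stack.push(a.public_send(:+, b))
        # The "return PC" points at the move opcode embedded in this insn.
        pc += 1
      end
    when :move_result
      _calldata, dst, _op1, _op2 = code[pc + 1, 4]
      locals[dst] = stack.pop
      pc += 5
    else
      raise ArgumentError, "unknown opcode #{code[pc].inspect}"
    end
  end
  locals
end

code = [:plus, :move_result, nil, 0, 1, 2]
int_result = run(code, [nil, 2, 3])
str_result = run(code, [nil, "ab", "cd"])
```

Because the move opcode shares the operand words of the original insn, no separate move insn has to be emitted after every insn that might turn out to be a call.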


RTL insn combining and specialization

Immediate value specialization
  - e.g. plus -> plusi: addition with immediate fixnum as an operand

Frequent insn sequence combining
  - e.g. eq + bt -> bteq: comparison and branch if the operands are equal
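Both transformations can be pictured as a small peephole pass over an insn list. This is a minimal sketch under assumed encodings: the array shapes [:plus, dst, src, operand], [:eq, dst, a, b], and [:bt, label, cond] are hypothetical, and the pass assumes the temporary produced by eq is used only by the following bt.

```ruby
def combine(insns)
  out = []
  i = 0
  while i < insns.length
    insn = insns[i]
    nxt  = insns[i + 1]
    if insn[0] == :eq && nxt && nxt[0] == :bt && nxt[2] == insn[1]
      # eq + bt on the same temporary becomes a single compare-and-branch.
      out << [:bteq, nxt[1], insn[2], insn[3]]
      i += 2
    elsif insn[0] == :plus && insn[3].is_a?(Integer)
      # The second operand is a literal integer: use the immediate form.
      out << [:plusi, *insn.drop(1)]
      i += 1
    else
      out << insn
      i += 1
    end
  end
  out
end

before = [[:plus, :i, :i, 1], [:eq, :t0, :i, :n], [:bt, :L1, :t0], [:ret, :i]]
after  = combine(before)
```

Here four insns shrink to three, and the addition no longer needs to decode its second operand as a variable.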


Speculative insn generation

Some initially generated insns can be transformed into speculative ones during their execution
  - Speculation is based on operand types (e.g. plus can be transformed into an integer plus) and on the operand values (e.g. no multi-precision integers)

Speculative insns can be transformed into unchanging regular insns if the speculation is wrong
  - Speculation insns include code checking the speculation correctness

[Diagram: transitions between plus, iplus, fplus, and uplus]

Speculation will be more important for JITted code performance
  - It creates a lot of big extended basic blocks which a C compiler optimizes well
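A self-rewriting insn of this kind can be sketched as follows. The transition order (plus to iplus or fplus, then to uplus on a failed guard) is my reading of the slide, with uplus taken to be the unchanging variant; the hash-based insn, the exec_plus and small_int? names, and the 62-bit bound (a 64-bit MRI fixnum range) are assumptions for illustration.

```ruby
# Assumes the fixnum range of a 64-bit MRI build.
def small_int?(v)
  v.is_a?(Integer) && v.bit_length <= 62
end

def exec_plus(insn, a, b)
  case insn[:op]
  when :plus
    # First execution: pick a variant from the observed operand types.
    insn[:op] =
      if small_int?(a) && small_int?(b) then :iplus
      elsif a.is_a?(Float) && b.is_a?(Float) then :fplus
      else :uplus
      end
    a + b
  when :iplus
    # Guard: operands and result must stay small integers.
    if small_int?(a) && small_int?(b)
      sum = a + b
      return sum if small_int?(sum)
    end
    insn[:op] = :uplus   # speculation failed: never speculate again
    a + b
  when :fplus
    return a + b if a.is_a?(Float) && b.is_a?(Float)
    insn[:op] = :uplus
    a + b
  when :uplus
    a + b
  end
end

insn = { op: :plus }
r1 = exec_plus(insn, 1, 2);   op1 = insn[:op]
r2 = exec_plus(insn, 3, 4);   op2 = insn[:op]
r3 = exec_plus(insn, 1.5, 2); op3 = insn[:op]
r4 = exec_plus(insn, 1, 2);   op4 = insn[:op]
```

Once the insn has fallen back to uplus it stays there, which keeps a site with mixed operand types from flipping between variants on every execution.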


RTL insn status and future work

It mostly works (make check reports no regressions)

Slightly better performance than stack based insns
  - 27% GeoMean improvement on 23 small benchmarks (+110% to -7%)
  - Code change (Optcarrot):

        Stack insns -> RTL insns
        Executed insns number    -23%
        Executed insn length     +19%

Still some work to do for RTL improvement:
  - Reducing code size
  - Reducing overhead in operand decoding


Possible JIT approaches

1. Writing own JIT from scratch
  - LuaJIT, JavaScript V8, etc.

2. Using widely used optimizing compilers
  - GCC, LLVM

3. Using existing JITs
  - JVM, OMR, RPython, Graal/Truffle, etc.


Option 1: Writing own JIT from scratch

Full control, small size, fast compilation

Fast compilation is mostly a result of fewer optimizations than in industrial optimizing compilers

Still a huge effort to implement decent optimizations

Ongoing burden in maintenance and porting


Option 2: Using widely used optimizing compilers

Highly optimized code (GCC has > 300 optimization passes), easier implementation and porting, and extremely well maintained (> 2K contributors since GCC 2.95)

Portable (currently supports 49 targets)

Reliable and well tested (> 16K reporters since GCC 2.95)

No new dependencies

But slower compilation
  - Slower mostly because it does much more than a typical JIT
  - Compilation can be made faster by disabling less valuable optimizations


Option 3: Using existing JITs

Duplication: already used for JRuby, Topaz (RPython), Opal (JS), OMR Ruby, Graal/Truffle Ruby

JVM is stable, reliable, optimizing, and ubiquitous

But still worse code performance than GCC/JIT
  - Azul Falcon (LLVM based JIT) up to 8x better performance than JVM C2 (source: http://stuff-gil-says.blogspot.ca/2017)

License issues and patent minefield


Own or existing JITs vs GCC/LLVM based JITs

Webkit moved from LLVM JIT to own JIT (source: https://webkit.org/blog/5852/introducing-the-b3-jit-compiler)
  - Implemented about 20 optimizations
  - 4-5x speedup in compilation time
  - Final results: Jetstream, Kraken, Octane (-9% to +8%)

ISP RAS research: JS V8 ported to LLVM (source: http://llvm.org/devmtg/2016-09/slides/Melnik-LLV8.pdf)
  - GeoMean speedup 8-16% on Sunspider

Resulting situation: is the glass half full or half empty?
  - In my opinion, considering implementation and maintenance efforts, GCC/LLVM JIT is a winner, especially for long running server programs


How to use GCC/LLVM for implementing JITs

Using LibGCCJIT/MCJIT/ORC:
  - New, unstable interfaces
  - A lot of tedious calls to create the environment (see GNU Octave and PyPy port to libgccjit)

Generating C code:
  - No dependency on a particular compiler, easier debugging
  - But some people call it a heavy, “junky” approach
      - Wrong! If we implement it carefully

LibGCCJIT vs GCC data flow (the first two stages are the parts that differ, shown in red on the slide):

GCC:       C header parsing (environment) -> C function parsing -> Optimizations and Generation -> Assembler/LD -> Loading .so file

LibGCCJIT: Environment creation through API calls -> Function creation through API calls -> Optimizations and Generation -> Assembler/LD -> Loading .so file


How to use GCC/LLVM for implementing JITs (cont’d)

Generating C code
  - Environment takes from 21% to 41% of all compilation time
  - Using a precompiled header (PCH) decreases this to less than 3.5%
  - Function parsing takes less than 1%

[Bar chart: GCC -O2 processing a function with 44 RTL insns. Y axis: GCC thousand insns, 0 to 800,000. Bars: Header, Minimized_Header, PCH, Minimized_PCH. Segments: Environment, Function Parsing, Optimizations & Generation]

  - GCC with C executable size: 25.1 MB for cc1 vs. 22.6 MB for libgccjit (only 10% difference)


MJIT

MJIT is MRI JIT

MJIT is Method JIT

MJIT is a JIT based on C code generation and PCH

MJIT can use GCC or LLVM, in the future other C compilers
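The "C code generation plus PCH" idea boils down to two kinds of compiler invocations: one that precompiles the environment header once, and one per method that compiles generated C into a shared object. The sketch below only builds the argument lists and does not run a compiler. The helper names and file names are hypothetical, and the exact command lines MJIT uses are not given in the slides; the flags themselves are standard GCC options.

```ruby
# Precompile the environment header once (GCC writes a .gch file).
def pch_command(cc, header, pch)
  [cc, "-O2", "-fPIC", "-x", "c-header", header, "-o", pch]
end

# Compile one generated C file into a loadable shared object.
# With GCC, "-include header.h" picks up header.h.gch when it sits next to
# the header and was built with compatible options.
def so_command(cc, header, c_file, so_file)
  [cc, "-O2", "-fPIC", "-shared", "-include", header, c_file, "-o", so_file]
end

# Hypothetical file names, used only for illustration.
header = "mjit_env.h"
pch_cmd = pch_command("gcc", header, "#{header}.gch")
so_cmd  = so_command("gcc", header, "method_1.c", "method_1.so")
```

Each list could be handed to Kernel#system as separate arguments, which avoids shell quoting issues. Keeping the optimization and PIC flags identical in both commands matters, because GCC ignores a precompiled header built with incompatible options.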


MJIT architecture

[Diagram, built up in three steps]

1. MRI building phase: a new MJIT environment building step turns the environment header into a minimized header.

2. MRI execution run: MJIT is initialized in parallel with Ruby program execution. A CC thread builds the precompiled header from the minimized header.

3. MRI execution run: MJIT works in parallel with Ruby program execution. CC threads compile generated C code into .so files, which are then loaded into MRI.
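Compiling in parallel with execution can be sketched with a work queue and a background thread: the program keeps running interpreted code and switches to the compiled version once it becomes available. Everything here is a stand-in made up for illustration: the "compiled" entry is just the same lambda, where a real implementation would emit C, run the compiler, and load the resulting .so file.

```ruby
require "thread"

queue    = Queue.new   # methods waiting for compilation
compiled = {}          # method name => callable standing in for loaded native code
lock     = Mutex.new

# Background "CC thread": takes jobs until it receives nil.
worker = Thread.new do
  while (job = queue.pop)
    name, body = job
    native = body      # stand-in for: generate C, run CC, load the .so file
    lock.synchronize { compiled[name] = native }
  end
end

# Call site: use compiled code if it is ready, otherwise interpret.
call_method = lambda do |name, interpreted, arg|
  fast = lock.synchronize { compiled[name] }
  fast ? [:jit, fast.call(arg)] : [:interp, interpreted.call(arg)]
end

square = ->(x) { x * x }
first  = call_method.call(:square, square, 3)  # nothing has been compiled yet
queue << [:square, square]
queue << nil                                   # tell the worker to finish
worker.join
second = call_method.call(:square, square, 4)  # compiled version is now available
```

The first call runs interpreted and the second uses the compiled entry. The interpreter never blocks on the compiler, which is the point of doing the work in a separate thread.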


Example

Ruby code:

    def loop
      i = 0; while i < 100_000; i += 1; end
      i
    end


RTL code right after compilation:

    ...
    0004 val2loc 3, 0
    0007 goto 15
    0009 plusi cont_op2, <calldata...>, 3, 3, 1
    0015 btlti cont_btcmp, 9, <calldata...>, -1, 3, 100000
    0022 loc_ret 3, 16
    ...


Speculative RTL code after some execution:

    ...
    0004 val2loc 3, 0
    0007 goto 15
    0009 iplusi _, _, 3, 3, 1
    0015 ibtlti _, 9, _, -1, 3, 100000
    0022 loc_ret 3, 16
    ...


MJIT generated C code:

    ...
    l4:  cfp->pc = (void *) 0x5576729ccd88; val2loc_f(cfp, &v0, 3, 0x1);
    l7:  cfp->pc = (void *) 0x5576729ccd98; ruby_vm_check_ints(th); goto l15;
    l9:  if (iplusi_f(cfp, &v0, 3, &v0, 3, &new_insn)) {
           vm_change_insn(cfp->iseq, (void *) 0x5576729ccda6, new_insn);
           goto stop_spec;
         }
    l15: flag = ibtlti_f(cfp, &t0, -1, &v0, 200001, &val, &new_insn);
         if (val == RUBY_Qundef) {
           vm_change_insn(cfp->iseq, (void *) 0x5576729ccdd6, new_insn);
           goto stop_spec;
         }
         if (flag) goto l9;
    l22: cfp->pc = (void *) 0x5576729cce26;
         loc_ret_f(th, cfp, &v0, 16, &val);
         return val;
    ...


GCC optimized x86-64 code:

    ...
    movl $200001, %eax
    ...
    ret

There is no loop

JVM cannot do this
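The constant 200001 is not a typo for 100000. MRI represents small integers (fixnums) as tagged machine words, storing n as 2n + 1, so the generated code works on the tagged form directly. A minimal sketch of that encoding (the helper names are made up for illustration):

```ruby
# MRI fixnum tagging: shift left by one and set the low bit.
def fixnum_to_value(n)
  (n << 1) | 1
end

# Inverse: an arithmetic shift right drops the tag bit.
def value_to_fixnum(v)
  v >> 1
end

loop_bound = fixnum_to_value(100_000)
zero       = fixnum_to_value(0)
one        = fixnum_to_value(1)
decoded    = value_to_fixnum(200_001)
```

This accounts for the constants in the generated C code: 0x1 is the tagged 0 stored by val2loc_f, 3 is the tagged immediate 1 passed to iplusi_f, and 200001 is the tagged loop bound, which is also the tagged return value that GCC folds the whole loop into.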


MJIT performance results

Benchmarking MRI v2 (v2), MRI GCC MJIT (MJIT), MRI LLVM MJIT (MJIT-L), OMR Ruby rev. 57163 using JIT (OMR), JRuby9k 9.1.8 (JRuby9k), JRuby9k -Xdynamic (JRuby9k-D), Graal Ruby 0.22 (Graal)

Mainstream CPU (i3-7100) under Fedora 25 with GCC-6.3 and Clang-3.9

Microbenchmarks and small benchmarks (dir MJIT-benchmarks)
  - Each benchmark runs at least 20-30 sec on MRI v2

Optcarrot (https://github.com/mame/optcarrot)
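The microbenchmark results are summarized as geometric means of per-benchmark ratios. For readers unfamiliar with the measure, here is a minimal sketch; the sample ratios are made up for illustration and are not taken from the benchmark runs.

```ruby
# Geometric mean of positive ratios, computed in log space.
def geomean(ratios)
  Math.exp(ratios.sum { |r| Math.log(r) } / ratios.length)
end

# Made-up speedups: one benchmark 4x faster, one unchanged, one 4x slower.
sample     = [4.0, 1.0, 0.25]
geo        = geomean(sample)
arithmetic = sample.sum / sample.length
```

For these made-up ratios the geometric mean is 1.0 while the arithmetic mean is 1.75. A speedup and an equal slowdown cancel out under the geometric mean, which is why it is the usual choice for averaging ratios.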


MJIT performance results

Microbenchmarks: Geomean wall time improvement relative to MRI v2

[Bar chart: Wall time Speedup. Y axis: Speedup (GeoMean), 0 to 7. Series: v2, MJIT, MJIT-L, OMR, JRuby9k, JRuby9k-D, Graal. Bar labels: 1.09, 1.59, 2.48, 1.83, 6.18, 4.02]


MJIT performance results

Microbenchmarks: Geomean CPU time improvement relative to MRI v2

[Bar chart: CPU time Speedup. Y axis: Speedup (GeoMean), 0 to 6. Series: v2, MJIT, MJIT-L, OMR, JRuby9k, JRuby9k-D, Graal. Bar labels: 1.09, 1.33, 1.88, 0.69, 5.55, 3.67]


MJIT performance results

Microbenchmarks: Geomean peak memory overhead relative to MRI v2

[Bar chart: Peak memory overhead. Y axis: Peak memory (GeoMean), log scale from 10^-1 to 10^3. Series: v2, MJIT, MJIT-L, OMR, JRuby9k, JRuby9k-D, Graal. Bar labels: 2.54, 161.76, 198.86, 79.65, 4.15, 6.44]


MJIT performance results

Optcarrot: FPS speedup relative to MRI v2

[Bar chart: FPS improvement. Y axis: Speedup, 0.0 to 3.5. Series: v2, MJIT, MJIT-L, OMR, JRuby9k, JRuby9k-D, Graal. Bar labels: 1.20, 1.14, 2.38, 2.83, 2.94]


MJIT performance results

Optcarrot: CPU time improvement relative to MRI v2

[Bar chart: CPU time Speedup. Y axis: Speedup, 0.0 to 2.0. Series: v2, MJIT, MJIT-L, OMR, JRuby9k, JRuby9k-D, Graal. Bar labels: 1.13, 0.79, 0.76, 1.53, 1.45]


MJIT performance results

Optcarrot: Peak memory overhead relative to MRI v2

[Bar chart: Peak memory overhead. Y axis: Peak memory, log scale from 10^-1 to 10^3. Series: v2, MJIT, MJIT-L, OMR, JRuby9k, JRuby9k-D, Graal. Bar labels: 1.41, 10.67, 17.68, 1.16, 1.16]


Recommendations to use GCC/LLVM for a JIT

My recommendations in order of importance:
  - Don’t use MCJIT, ORC, or LibGCCJIT
  - Use a pre-compiled header (JIT code environment) in a memory FS
  - Compile code in parallel with program interpretation
  - Use a good strategy to choose byte code for JITting
  - Minimize the environment if you don’t use PCH


MJIT status and future directions

The project is at an early development stage:
  - Unstable, passes ‘make test’, cannot pass ‘make check’ yet
  - Doesn’t work on Windows
  - At least one more year to mature

Need more optimizations:
  - No inlining yet. The most important optimization!
  - Different approaches to implement inlining:
      - Node or RTL level
      - Use C inlining (I’ll pursue this one)
      - New GCC/LLVM extension (a new inline attribute) would be useful

Will RTL and MJIT be a part of MRI?
  - It does not depend on me
  - I am going to work in this direction
  - Will be happy if even some project ideas will be used in future MRI
