Fifth International Symposium on Functional and Logic Programming
Compiling Lazy Functional Programs based on the Spineless Tagless G-machine
for the Java Virtual Machine
Kwanghoon Choi (with H. Lim and T. Han)
KAIST
March 8, 2001
1. Introduction
□ Motivation
• Mobile codes
• Interlanguage working between Haskell and Java
□ Goals
• Use the STGM as an execution model
• Provide a complete setting for the compilers based on the STGM
2. Methodology
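□ An Overview
[Figure: compilation pipeline. STG → (STG-to-L-code compiler) → L-code. L-code → (L-machine) → Answer. L-code → (L-code-to-Java compiler) → Java → (JVM) → Answer. The STGM label covers the STG-to-L-code compiler and the L-machine.]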
STGM ≡ STG compiler + L-machine
[Figure: the compilation pipeline with the STGM part highlighted: STG → (STG-to-L-code compiler) → L-code → (L-machine) → Answer.]
□ The STG Language
• Another functional language in a restricted form
• Many implicit lower-level implementation details: updates, taglessness, application, closures, etc.
Haskell:
    let s = λf g x. f x (g x)
    in ...
⇒
STG:
    let s_n = λf g x. let a_u = g x
                      in f x a
    in ...
□ The L-code Language
• Each operation → one instruction
                Stack Closure    Heap Closure    Expression
    build       PUTK             PUTC, PUTP      -
    read        GETK, REFK       GETC            -
    overwrite   -                UPDC            -
    jump        JMPK, STOP       JMPC            CASE
    args        GETNT            GETN            GETA
    stack op    PUTARG, GETARG, ARGCHK, SAVE, DUMP, RESTORE
• Each closure → 〈l, ρ〉→ let l = C in ...
□ The L-machine
• C µ s h ⇒ C′ µ′ s′ h′
□ Our STG-to-L-code Compiler
• A closure conversion
• A hoisting transformation
STGM ≡ STG compiler + L-machine
□ Compiling an STG Program
let l_upd = ... in
let l_s = B[[{}.λf g x. let a_u = g x in f x a]] n l_s in ...
≡
let l_upd = ... in
let l_s = GETN (λz. GETC z (λ〈l_s, 〉. ARGCK 3 (...) (GETARG (λf g x.
            let l_a = GETN (λz. GETC z (λ〈l_a, g x〉. SAVE (λs. PUTK 〈l_upd, z s〉 (DUMP (PUTARG x (JMPC g))))))
            in PUTC a_u = 〈l_a, g x〉 (PUTARG x a (JMPC f))))))
in ...
□ Hoisting Inner Codes
let l_upd = ...
    l_s = GETN (λz. GETC z (λ〈l_s, 〉. ARGCK 3 (...) (GETARG (λf g x.
            PUTC a_u = 〈l_a, g x〉 (PUTARG x a (JMPC f))))))
    l_a = GETN (λz. GETC z (λ〈l_a, g x〉. (SAVE (λs. PUTK 〈l_upd, z s〉 (DUMP (PUTARG x (JMPC g)))))))
in ...
Java Representation
[Figure: the compilation pipeline with the Java part highlighted: L-code → (L-code-to-Java compiler) → Java → (JVM) → Answer.]
□ An Overview of Java Representation
• let l = C in ... → one class
• 〈l, ρ〉 → one object
• runtime system → a few dedicated classes
• C → Java statements
• µ → variable scoping rule
• s → one array
• h → the JVM’s heap
Closure
• Our base class
    public abstract class Clo {
        public Clo ind = this;
        public abstract Clo code();
    }
• One class for each let l = {f1, ... , fn}.C in ...
    public class Cl extends Clo {
        public Object f1, ... , fn;
        public Clo code() { J[[C]] }
    }
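The class-per-binding scheme above can be illustrated with a small runnable sketch. The names `ClosureSketch`, `AddFree`, `alloc`, `run` and the static `result` register are hypothetical and not from the slides, and the body of `code()` stands in for whatever J[[C]] would generate. The point is only that one binding becomes one class, and each 〈l, ρ〉 pair becomes one object whose fields hold the free variables.

```java
// Sketch only: all names except Clo, ind, code, f1, f2 are hypothetical.
final class ClosureSketch {
    // Base class as on the slide.
    static abstract class Clo {
        public Clo ind = this;
        public abstract Clo code();
    }

    // Hypothetical result register, used here only to observe the effect of code().
    static Object result;

    // One class for one binding "let l = {f1, f2}.C in ...".
    // Here C is assumed to add the two free variables.
    static final class AddFree extends Clo {
        public Object f1, f2;
        public Clo code() {
            result = ((Integer) f1) + ((Integer) f2);
            return null; // assumption: no continuation in this tiny demo
        }
    }

    // Each allocation yields one object, i.e. one <l, rho> pair.
    static AddFree alloc(int a, int b) {
        AddFree o = new AddFree();
        o.f1 = a;
        o.f2 = b;
        return o;
    }

    static int run(Clo c) {
        c.code();
        return (Integer) result;
    }

    public static void main(String[] args) {
        System.out.println(run(alloc(1, 2)));
        System.out.println(run(alloc(40, 2)));
    }
}
```

Two objects of the same class share the code but carry different environments, which is the distinction between l and ρ in the closure pair.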
Runtime System
public class G {
    public static Object node;
    public static int tag;
    public static boolean loopflag;
    public static int sp, bp;
    public static Object[] stk;
    ...
}
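A minimal sketch of how the stack registers `sp`, `bp` and `stk` might be used follows. The helper names (`push`, `pop`, `available`, `saveAndDump`, `reset`), the fixed capacity, and the convention that `sp` indexes the top element and starts at -1 are assumptions; only the push sequence `sp++; stk[sp] = o;` and the `bp = sp` step are taken from the compiled code shown later in the slides.

```java
import java.util.Arrays;

// Sketch only: helper names and initial values are assumptions.
final class StackSketch {
    static Object[] stk = new Object[64]; // assumption: fixed capacity, no overflow check
    static int sp = -1;                   // assumption: index of the top element, -1 when empty
    static int bp = -1;                   // assumption: elements above bp are the current arguments

    static void push(Object o) {
        sp++;
        stk[sp] = o;
    }

    static Object pop() {
        Object o = stk[sp];
        stk[sp] = null;
        sp--;
        return o;
    }

    // Number of elements above the base pointer.
    static int available() {
        return sp - bp;
    }

    // Mirrors the pattern "int s = G.bp; push frame; G.bp = G.sp" and returns the old bp.
    static int saveAndDump(Object frame) {
        int s = bp;
        push(frame);
        bp = sp;
        return s;
    }

    static void reset() {
        sp = -1;
        bp = -1;
        Arrays.fill(stk, null);
    }

    public static void main(String[] args) {
        reset();
        push("a");
        push("b");
        System.out.println(available());
        saveAndDump("frame");
        System.out.println(available());
    }
}
```

Under these assumptions, moving `bp` up to `sp` hides everything below the new frame from the code that runs next.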
L-code instruction
• Application (e.g. x0 x1 ... xn) : PUTARG, JMPC
                             Push-Enter    Eval-Apply
    reduction stack          yes           no
    takes multiple args      yes           no
    intermediate closures    no            yes
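The push-enter column can be made concrete with a small sketch: the caller pushes all arguments on a stack and enters the function, which pops as many as it needs; any function it returns is entered with the remaining arguments, so no intermediate closure is built for the over-applied call. Everything here (`PushEnterSketch`, `Fun`, `ADD_THEN_SCALE`, `apply`) is hypothetical, and the under-saturated case, which a real implementation would handle in its argument check, is deliberately not modelled.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch only: a toy push-enter evaluator, not the slides' runtime system.
final class PushEnterSketch {
    static final Deque<Object> stack = new ArrayDeque<>();

    interface Fun {
        int arity();
        Object enter(); // pops its own arguments from the stack
    }

    // Takes two arguments, then returns a function expecting one more.
    static final Fun ADD_THEN_SCALE = new Fun() {
        public int arity() { return 2; }
        public Object enter() {
            int x = (Integer) stack.pop();
            int y = (Integer) stack.pop();
            final int sum = x + y;
            return new Fun() {
                public int arity() { return 1; }
                public Object enter() {
                    int z = (Integer) stack.pop();
                    return sum * z;
                }
            };
        }
    };

    static Object apply(Object f, Object... args) {
        // Push right to left so the first argument ends up on top.
        for (int i = args.length - 1; i >= 0; i--) stack.push(args[i]);
        Object cur = f;
        while (cur instanceof Fun) {
            Fun fn = (Fun) cur;
            if (stack.isEmpty()) return cur; // a function value is the result
            if (stack.size() < fn.arity())
                throw new UnsupportedOperationException("partial application not modelled");
            cur = fn.enter();
        }
        return cur;
    }

    public static void main(String[] args) {
        System.out.println(apply(ADD_THEN_SCALE, 1, 2, 3));
    }
}
```

An eval-apply scheme would instead have the caller inspect the arity and pass exactly that many arguments per call.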
L-code instruction (cont.)
• Tail calls : JMPC, JMPK
public static void loop (Clo c) {
    while (loopflag) c = c.ind.code();
}
[Figure: on the L-machine, control jumps from closure 〈l1, env1〉 to 〈l2, env2〉 and on to 〈ln, envn〉; on the JVM, loop() calls the code of object1, object2, ..., objectn in turn, and each call returns to loop().]
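The JVM does not guarantee tail-call elimination, so a jump is simulated by returning the next closure to a driver loop. The sketch below runs one million such jumps with constant Java stack depth. `TrampolineSketch`, `CountDown`, `acc` and `run` are hypothetical names; `Clo`, `ind`, `code`, `loopflag` and `loop` follow the slides, and clearing `loopflag` to stop is an assumption about how the machine halts.

```java
// Sketch only: a runnable instance of the loop() driver shown above.
final class TrampolineSketch {
    static abstract class Clo {
        public Clo ind = this;
        public abstract Clo code();
    }

    static boolean loopflag = true;
    static long acc = 0; // hypothetical accumulator to observe the computation

    // Driver loop as on the slide: each code() returns the closure to jump to next.
    static void loop(Clo c) {
        while (loopflag) c = c.ind.code();
    }

    // A closure that "tail-calls" itself n times by returning a fresh closure.
    static final class CountDown extends Clo {
        final long n;
        CountDown(long n) { this.n = n; }
        public Clo code() {
            if (n == 0) {
                loopflag = false; // assumption: stopping clears the flag
                return this;
            }
            acc += n;
            return new CountDown(n - 1); // a jump, not a nested Java call
        }
    }

    static long run(long n) {
        loopflag = true;
        acc = 0;
        loop(new CountDown(n));
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(run(1_000_000L));
    }
}
```

Written as direct recursion, the same countdown would risk a StackOverflowError at this depth; with the loop, each `code()` call returns before the next one starts.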
L-code instruction (cont.)
• Updating : UPDC, PUTC
public class Ind extends Clo {
    public Clo code() { return this.ind; }
}
[Figure: Ind objects and the NODE closure, each shown with its ind field.]
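One plausible reading of how the `ind` field supports updating is sketched below: once an updatable closure has been evaluated, its `ind` field is redirected to the value, so later entries through `c.ind.code()` reach the value without recomputation. This is an illustration under assumptions, not the slides' exact mechanism: the names `UpdateSketch`, `IntVal`, `SquareThunk`, `force` and `evaluations` are hypothetical, the thunk performs its own update here rather than an update frame doing it, and the `Ind` class itself is not modelled.

```java
// Sketch only: updating through the ind field, with hypothetical closure classes.
final class UpdateSketch {
    static abstract class Clo {
        public Clo ind = this;
        public abstract Clo code();
    }

    // A value closure: entering it yields itself.
    static final class IntVal extends Clo {
        final int v;
        IntVal(int v) { this.v = v; }
        public Clo code() { return this; }
    }

    static int evaluations = 0; // counts how often the thunk body actually runs

    // An updatable closure computing x * x.
    static final class SquareThunk extends Clo {
        final int x;
        SquareThunk(int x) { this.x = x; }
        public Clo code() {
            evaluations++;
            Clo value = new IntVal(x * x);
            this.ind = value; // update: later entries go through ind to the value
            return value;
        }
    }

    // Enter a closure the same way the driver loop does, via its ind field.
    static int force(Clo c) {
        return ((IntVal) c.ind.code()).v;
    }

    public static void main(String[] args) {
        Clo t = new SquareThunk(7);
        System.out.println(force(t));
        System.out.println(force(t));
        System.out.println(evaluations);
    }
}
```

Because every entry goes through `ind`, the overwrite needs no change to the references other closures already hold to the thunk.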
□ Compiling an L-code binding
public class Cla extends Clo {              // l_a =
    public Object f1, f2;                   // {g, x}.
    public Clo code() {
        Object z = G.node;                  // GETN λz.
        Object g = ((Cla)z).f1;             // GETC λ〈l_a, g x〉.
        Object x = ((Cla)z).f2;
        int s = G.bp;                       // SAVE λs.
        Clupd o = new Clupd();              // PUTK 〈l_upd, z s〉
        o.f1 = z; o.f2 = s;
        G.sp++; G.stk[G.sp] = o;
        G.bp = G.sp;                        // DUMP
        G.sp++; G.stk[G.sp] = x;            // PUTARG x
        return (Clo)g;                      // JMPC g
    }
}
3. Experiment
□ Implementation
• GHC 4.04 as a front end
• STG → L-code → Java compiler
• Sun JIT 1.2.2 as a back end
□ Benchmarking
• Five small Haskell programs [M&J99]
• A Sun UltraSPARC-II workstation with Solaris 2.5.1
• Compared against GHC and Hugs
□ Result
Code Sizes in Bytes
    Pgms       GHC     JIT:unopt             JIT:opt
    fib        268k    25k (37 classes)      19k (34 classes)
    edigits    283k    114k (135 classes)    64k (92 classes)
    prime      280k    74k (96 classes)      50k (70 classes)
    queen      275k    109k (134 classes)    76k (101 classes)
    soda       303k    336k (388 classes)    186k (203 classes)
runtime system : 3k bytes (4 classes)
□ Result (cont.)
Execution Time in Seconds
    Pgms       GHC      Hugs       JIT:unopt    JIT:opt
    fib        0.18s    106.48s    25.70s       5.72s
    edigits    0.16s    3.10s      9.15s        2.42s
    prime      0.14s    3.30s      86.38s       1.97s
    queen      0.07s    5.79s      5.35s        2.29s
    soda       0.03s    0.41s      2.26s        1.59s
□ Discussion
• L-code level optimisation
– some peephole optimisation
– merging classes that have the same number of member variables [Wak99]
→ code sizes : 10,455∼100,495 bytes (12∼26 classes)
→ execution time : 1.2∼2.2 seconds
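The class-merging idea can be sketched as follows: closures with the same number of member variables share one class, and a per-object label selects which original code body to run. This is only one way such a merge could be arranged and is an assumption, not necessarily the scheme of [Wak99] or of the authors' compiler; `MergeSketch`, `Clo2`, `label`, `result` and the two code bodies are hypothetical.

```java
// Sketch only: merging two two-field closure classes into one class with a label.
final class MergeSketch {
    static abstract class Clo {
        public Clo ind = this;
        public abstract Clo code();
    }

    static Object result; // hypothetical result register

    // One merged class for every closure with exactly two member variables.
    static final class Clo2 extends Clo {
        final int label; // hypothetical: identifies the original binding
        public Object f1, f2;

        Clo2(int label, Object f1, Object f2) {
            this.label = label;
            this.f1 = f1;
            this.f2 = f2;
        }

        public Clo code() {
            switch (label) {
                case 0: // body of the first original class
                    result = ((Integer) f1) + ((Integer) f2);
                    return null;
                case 1: // body of the second original class
                    result = ((Integer) f1) * ((Integer) f2);
                    return null;
                default:
                    throw new IllegalStateException("unknown label " + label);
            }
        }
    }

    static int run(Clo c) {
        c.code();
        return (Integer) result;
    }

    public static void main(String[] args) {
        System.out.println(run(new Clo2(0, 3, 4)));
        System.out.println(run(new Clo2(1, 3, 4)));
    }
}
```

The trade-off in such a design is fewer class files, each with its own fixed overhead, against an extra dispatch on every entry.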
4. Conclusion
□ Our contribution
• A systematic Haskell-to-Java compiler using a complete and succinct specification for the STGM
• A performance result combined with STG optimisations