
Java at Speed - QCon New York

Jun 23, 2020


© Copyright Azul Systems 2019

© Copyright Azul Systems 2015

@speakjava

Java at Speed: Building a Better JVM

Simon Ritter, Deputy CTO, Azul Systems


JVM Performance Graph: Ideal


JVM Performance Graph: Reality

[Graph labels: bytecodes interpreted; C1 JIT plus profiling; C2 JIT with deoptimisations; steady optimised state; GC pauses]
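
The warm-up phases labelled on this graph can be observed with a rough sketch like the one below. It is not from the talk: the class name `WarmupDemo`, the `work` method and the batch sizes are made up for illustration, and the timings printed depend entirely on the JVM, its flags and the hardware.

```java
// Hypothetical illustration, not from the slides: time repeated batches of the
// same method to watch the per-batch cost change as the JIT compiles it.
final class WarmupDemo {
    // A small deterministic workload: sum of (i * i) mod 7 for i in [0, n).
    static long work(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * i % 7;
        }
        return sum;
    }

    public static void main(String[] args) {
        long checksum = 0;
        for (int batch = 0; batch < 10; batch++) {
            long start = System.nanoTime();
            for (int call = 0; call < 2_000; call++) {
                checksum += work(1_000);
            }
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println("batch " + batch + ": " + micros + " us");
        }
        // Printing the checksum keeps the loop's result observable.
        System.out.println("checksum " + checksum);
    }
}
```

On a typical JVM the first batches tend to be slower than the later ones, but a single run like this is only an indication, not a benchmark.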


Zing: A Better JVM


Azul Zing JVM
§ Based on OpenJDK source code
§ Passes all Java SE TCK/JCK tests
  – Drop-in replacement for other JVMs
§ HotSpot collectors replaced with C4
§ Works in conjunction with Zing System Tools
  – Only supported on Linux
§ Falcon JIT compiler
  – C2 replacement
§ ReadyNow! warm-up elimination technology


Zing System Tools
§ Enables better memory management for the JVM
§ Memory freed by the JVM is returned to the kernel
§ Allocation of new blocks comes from the kernel
  – ZST knows cache status
  – Newly allocated blocks for TLAB are ‘hot’
  – Not like standard JVM
§ Other clever tricks
  – Contingency memory


Azul Continuous Concurrent Compacting Collector (C4)


C4 Basics
§ Generational (young and old)
  – Uses the same GC collector for both
  – For efficiency rather than pause containment
§ Concurrent, parallel and compacting
§ No STW compacting fallback
§ Algorithm is mark, relocate, remap


Loaded Value Barrier
§ Read barrier
  – Tests all object references as they are loaded
§ Enforces two invariants
  – Reference is marked through
  – Reference points to correct object position
§ Allows for concurrent marking and relocation
§ Minimal performance overhead
  – Test and jump (2 instructions)
  – x86 architecture reduces this to one micro-op
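
The two invariants can be pictured with a toy model in Java. This is a minimal sketch under stated assumptions, not Zing's implementation: `LoadBarrierSketch`, `Ref`, `forwarding` and `slowPathCount` are hypothetical names, and the real barrier is a test and jump on the loaded reference rather than a map lookup.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a load barrier; hypothetical names, not Zing's implementation.
final class LoadBarrierSketch {
    // A "reference": an address plus a flag saying it has been marked through.
    static final class Ref {
        int address;
        boolean markedThrough;

        Ref(int address) {
            this.address = address;
        }
    }

    // Old address -> new address for objects that have been relocated.
    final Map<Integer, Integer> forwarding = new HashMap<>();
    int slowPathCount = 0;

    // Called on every reference load. The common case is a cheap check that
    // both invariants already hold; only failures take the slow path.
    Ref load(Ref ref) {
        if (ref.markedThrough && !forwarding.containsKey(ref.address)) {
            return ref;
        }
        return fix(ref);
    }

    // Slow path: restore both invariants and update the reference in place,
    // so the next load of the same reference passes the check.
    private Ref fix(Ref ref) {
        slowPathCount++;
        Integer moved = forwarding.get(ref.address);
        if (moved != null) {
            ref.address = moved; // point at the object's new position
        }
        ref.markedThrough = true; // stand-in for handing the reference to the marker
        return ref;
    }
}
```

Loading a reference whose target has moved takes the slow path once; later loads of the same reference pass the check, which is how the sketch keeps the barrier cheap in the common case.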


Concurrent Mark Phase

[Diagram labels: Root Set, GC Threads, App Threads]


Relocation Phase


Compaction

A B C D E

A’ B’ C’ D’ E’

A -> A’ B -> B’ C -> C’ D -> D’ E -> E’
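
The A -> A’ mapping can be sketched as a forwarding table built while live objects are copied. The Java below is a hypothetical illustration (the class `CompactionSketch` and its array-based "heap" are assumptions), not how C4 lays out memory.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of compaction: copy live objects to the start of a new
// region and record old index -> new index, like A -> A' on the slide.
final class CompactionSketch {
    // from[i] is an object's name, or null for a dead or free slot.
    // The caller must make `to` large enough to hold every live object.
    static Map<Integer, Integer> compact(String[] from, String[] to) {
        Map<Integer, Integer> forwarding = new LinkedHashMap<>();
        int next = 0;
        for (int i = 0; i < from.length; i++) {
            if (from[i] != null) {
                to[next] = from[i];
                forwarding.put(i, next);
                next++;
            }
        }
        return forwarding;
    }

    public static void main(String[] args) {
        String[] from = {"A", null, "B", null, null, "C", "D", null, "E"};
        String[] to = new String[5];
        Map<Integer, Integer> forwarding = compact(from, to);
        System.out.println(String.join(" ", to)); // prints "A B C D E"
        System.out.println(forwarding);           // prints "{0=0, 2=1, 5=2, 6=3, 8=4}"
    }
}
```

Once the forwarding table exists, the old region's contents are no longer needed to find the objects; only stale references still have to be corrected.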


Quick Release


A -> A’ B -> B’ C -> C’ D -> D’ E -> E’

PHYSICAL

VIRTUAL


Remapping Phase

App Threads

GC Threads

A -> A’ B -> B’ C -> C’ D -> D’ E -> E’


Zing: Big Heaps, No Problem
§ Scales to 8 TB heap
  – No degradation in pause times
§ Use one big heap, rather than many small heaps
  – Fewer JVMs means more efficiency
§ Zing does not require big heaps
  – But works well with them


GC Tuning


Non-Zing GC Tuning Options


GC Tuning Used To Be Hard

java -Xmx12g -XX:MaxPermSize=64M -XX:PermSize=32M -XX:MaxNewSize=2g -XX:NewSize=1g -XX:SurvivorRatio=128 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:MaxTenuringThreshold=0 -XX:CMSInitiatingOccupancyFraction=60 -XX:+CMSParallelRemarkEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:ParallelGCThreads=12 -XX:LargePageSizeInBytes=256m …

java -Xms8g -Xmx8g -Xmn2g -XX:PermSize=64M -XX:MaxPermSize=256M -XX:-OmitStackTraceInFastThrow -XX:SurvivorRatio=2 -XX:-UseAdaptiveSizePolicy -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSParallelRemarkEnabled -XX:+CMSParallelSurvivorRemarkEnabled -XX:CMSMaxAbortablePrecleanTime=10000 -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=63 -XX:+UseParNewGC -Xnoclassgc


GC Tuning With Zing

java -Xmx1g

java -Xmx10g

java -Xmx100g

java -Xmx2t


Measuring Platform Performance
§ jHiccup
§ Spends most of its time asleep
  – Minimal effect on performance
  – Wakes every 1 ms
  – Records the difference between when it expected to wake up and when it actually did
  – Measured effect is what would be experienced by your application
§ Generates histogram log files
  – These can be graphed for easy evaluation
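
The measurement idea can be sketched in a few lines of Java. This is only an illustration of the sleep-and-compare approach described on the slide, not jHiccup's code: `HiccupSketch` and `measure` are hypothetical names, and the sketch keeps raw samples in an array instead of writing histogram log files.

```java
import java.util.concurrent.locks.LockSupport;

// A rough sketch of the measurement idea only; this is not jHiccup's code.
final class HiccupSketch {
    static final long INTERVAL_NANOS = 1_000_000L; // wake every 1 ms, as on the slide

    // Sleeps `samples` times and returns, for each sleep, how late the thread
    // woke up in nanoseconds (0 if it woke on time or early).
    static long[] measure(int samples) {
        long[] lateness = new long[samples];
        for (int i = 0; i < samples; i++) {
            long start = System.nanoTime();
            LockSupport.parkNanos(INTERVAL_NANOS);
            long elapsed = System.nanoTime() - start;
            lateness[i] = Math.max(0L, elapsed - INTERVAL_NANOS);
        }
        return lateness;
    }

    public static void main(String[] args) {
        long[] lateness = measure(1_000);
        long worst = 0;
        for (long value : lateness) {
            worst = Math.max(worst, value);
        }
        System.out.println("worst hiccup: " + worst / 1_000 + " us over "
                + lateness.length + " samples");
    }
}
```

Because the measuring thread does no real work, any lateness it sees comes from the platform (JVM, OS, hardware) rather than from application code.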


Small Heap, Small Latency

Hazelcast 2-node system with 1 GB heap, HotSpot v. Zing


Big Heap, Small Latency

Cassandra with 60 GB heap, HotSpot v. Zing


Azul Falcon JIT Compiler


Advancing Adaptive Compilation
§ Azul Falcon JVM compiler
  – Based on latest compiler research
  – LLVM project
§ Better performance
  – Better intrinsics
  – More inlining
  – Fewer compiler excludes
§ Replacement for C2 compiler


Simple Code Example
§ Simple array summing loop
  – A modern compiler will use vector operations for this
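
The slide's own code is not in this transcript. A loop of the kind it describes might look like the sketch below; `ArraySum` and `sum` are hypothetical names.

```java
// Hypothetical stand-in for the slide's example: a plain array summing loop.
final class ArraySum {
    static int sum(int[] values) {
        int total = 0;
        // No branches in the body and a simple reduction into one variable:
        // the kind of shape a JIT compiler can turn into vector instructions.
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }
}
```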


More Complex Code Example
§ Conditional array cell addition loop
  – Hard for compiler to identify for vector instruction use
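
Again the slide's code is missing from the transcript. A conditional addition loop in the same spirit could be sketched as follows; `ConditionalAdd`, `addIfEven` and the even-value condition are assumptions chosen for illustration.

```java
// Hypothetical stand-in for the slide's example: add one array into another,
// but only for the cells that pass a per-element test.
final class ConditionalAdd {
    // Adds b[i] into a[i] only where b[i] is even.
    // Assumes b is at least as long as a.
    static void addIfEven(int[] a, int[] b) {
        for (int i = 0; i < a.length; i++) {
            // The per-element branch is what makes this harder to vectorise
            // than a plain sum.
            if ((b[i] & 1) == 0) {
                a[i] += b[i];
            }
        }
    }
}
```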


Traditional JVM JIT

Per element jumps
2 elements per iteration


Falcon JIT

Using AVX2 vector instructions
32 elements per iteration

Broadwell E5-2690-v4


ReadyNow!


Traditional JVM


Application Warm-up


ReadyNow! Solution
§ Save JVM JIT profiling information
  – Classes loaded
  – Classes initialised
  – Instruction profiling data
  – Speculative optimisation failure data
§ Data can be gathered over a much longer period
  – JVM/JIT profiles quickly
  – Significant reduction in deoptimisations
§ Able to load, initialise and compile most code before main()


Effect Of ReadyNow!

Customer application


ReadyNow! Startup Time

[Two graphs of performance against time: without ReadyNow! and with ReadyNow!. Annotation: class loading, initialising and compile time]


Falcon Pipeline

[Diagram labels: Zing JVM, bytecode frontend, LLVM IR, LLVM, VM callbacks (queries and responses), machine code, compiled methods]


Deterministic Compiler

Method for compilation

Initial IR (method bytecodes & live profile)

Queries and responses

Produced machine code

Given identical input

Guarantees identical output


Add Compile Stashing

[Diagram labels: Zing JVM, bytecode frontend, LLVM IR, LLVM, VM callbacks (queries and responses), machine code, compiled methods, compile stash]
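
Because a deterministic compiler produces identical output for identical input, compiled code can be cached under a key derived from that input. The Java below is a minimal sketch of that idea only. Everything in it is hypothetical (`CompileStashSketch`, the `Compiler` interface, the string-typed profile and query responses, the SHA-256 key); the talk does not describe how Zing keys or stores its stash.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: cache compiler output under a hash of the compiler input.
final class CompileStashSketch {
    // Stand-in for a deterministic compiler.
    interface Compiler {
        byte[] compile(byte[] bytecode, String profile, String queryResponses);
    }

    private final Map<String, byte[]> stash = new HashMap<>();
    private final Compiler compiler;
    int compilations = 0; // how many times the real compiler was invoked

    CompileStashSketch(Compiler compiler) {
        this.compiler = compiler;
    }

    byte[] compile(byte[] bytecode, String profile, String queryResponses) {
        String key = key(bytecode, profile, queryResponses);
        byte[] cached = stash.get(key);
        if (cached != null) {
            return cached; // identical input seen before: reuse the output
        }
        compilations++;
        byte[] code = compiler.compile(bytecode, profile, queryResponses);
        stash.put(key, code);
        return code;
    }

    // Hashes all three inputs. The zero-byte separators are a simplification;
    // a real key would need an unambiguous encoding of each part.
    static String key(byte[] bytecode, String profile, String queryResponses) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            digest.update(bytecode);
            digest.update((byte) 0);
            digest.update(profile.getBytes(StandardCharsets.UTF_8));
            digest.update((byte) 0);
            digest.update(queryResponses.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest.digest());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }
}
```

The cache is only sound because of the determinism guarantee: if the same key could produce different machine code, a cached result might not match what a fresh compile would have produced.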


Compile Stashing Effect

[Two graphs of performance against time: without compile stashing and with compile stashing]

Up to 80% reduction in compile time and 60% reduction in CPU load


Summary


JVM Performance Graph: Zing

[Graph labels: Falcon; ReadyNow! & Compile Stashing; C4]


The Zing JVM
§ Start fast
§ Go faster
§ Stay fast
§ Simple replacement for other JVMs
  – No recoding necessary

Try Zing free for 30 days: azul.com/zingtrial


Thank you!

Simon Ritter, Deputy CTO, Azul Systems