HIGH PERFORMANCE INSTRUMENTATION Jaroslav Bachorík @yardus, @btraceio Prague, 20-21 October 2016

GeeCon2016- High Performance Instrumentation (handout)

Apr 07, 2017

Page 1: GeeCon2016- High Performance Instrumentation (handout)

HIGH PERFORMANCE INSTRUMENTATION

Jaroslav Bachorík, @yardus, @btraceio

Prague, 20-21 October 2016

Page 2: GeeCon2016- High Performance Instrumentation (handout)

ABOUT ME


Jaroslav Bachorík, [email protected], [email protected], @yardus

Page 3: GeeCon2016- High Performance Instrumentation (handout)

PERFORMANCE


Page 4: GeeCon2016- High Performance Instrumentation (handout)

PERFORMANCE
● Quantifiable
  ○ startup time
  ○ request latency
  ○ CPU usage
  ○ memory usage
● Reproducible
  ○ controlled environment
  ○ consistent results
● Measurable
  ○ strictly defined target goals
● Benchmarking


Page 5: GeeCon2016- High Performance Instrumentation (handout)

INSTRUMENTATION


Page 6: GeeCon2016- High Performance Instrumentation (handout)

INSTRUMENTATION

int method() {
    MyObject o = new MyObject();
    int x = o.getCount();
    // instrumentation: log the instance and the count it returned
    logger.debug("Instance " + o + " has count " + x);
    return x;
}


Page 7: GeeCon2016- High Performance Instrumentation (handout)

INSTRUMENTATION
● APIs and code providing means to monitor and control an application
  ○ loggers
  ○ stat counters
  ○ profilers
● Decoupled from the application
  ○ application works properly without instrumentation
  ○ same instrumentation may work for multiple applications


Page 8: GeeCon2016- High Performance Instrumentation (handout)

SOURCE LEVEL INSTRUMENTATION
● Instrumentation is part of the source base
  ○ OS
    ■ dtrace
    ■ systemtap
  ○ Runtime
    ■ JFR
    ■ jstat counters
  ○ Application
    ■ logging
● Difficult to modify and extend
  ○ requires access to sources
  ○ rebuild & redistribution


Page 9: GeeCon2016- High Performance Instrumentation (handout)

BYTECODE LEVEL INSTRUMENTATION
● No source code modifications
● Modifying bytecode
  ○ result of Java source compilation
  ○ binary executable consumed by JVM
● Bytecode Injection (BCI)
  ○ during compilation
    ■ e.g. Maven AOP plugins
    ■ same drawbacks as static instrumentation
  ○ during class loading
    ■ JVM agent and class transformers


[Diagram: JVM, JVM Agent, Classes, Classloader, and a chain of Transformers]

Page 10: GeeCon2016- High Performance Instrumentation (handout)

CLASS TRANSFORMERS


java.lang.instrument.ClassFileTransformer

byte[] transform(ClassLoader l, String name, Class<?> cls,
                 ProtectionDomain pd, byte[] classfileBuffer)

● Inspect and modify the class data
  ○ complex task
    ■ constant pool
    ■ stack map frames
  ○ better to delegate to specialized tools
    ■ ASM
    ■ ByteBuddy
    ■ CGLIB
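
A minimal sketch of a transformer that only inspects the class data. The class name `VersionInspectingTransformer` and its helper are hypothetical, not taken from the talk. It reads the fixed class-file header (magic number and version) from `classfileBuffer` and returns `null`, which tells the JVM the class is left unchanged. Everything past the header (constant pool, stack map frames) is where a library such as ASM would normally take over.

```java
import java.lang.instrument.ClassFileTransformer;
import java.nio.ByteBuffer;
import java.security.ProtectionDomain;

// Hypothetical example class; not part of any library.
final class VersionInspectingTransformer implements ClassFileTransformer {
    // Major version of the most recently inspected class, or -1 if the header was invalid.
    volatile int lastMajorVersion = -1;

    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain, byte[] classfileBuffer) {
        lastMajorVersion = majorVersion(classfileBuffer);
        return null; // null means "no transformation performed"
    }

    // Class file layout: u4 magic (0xCAFEBABE), u2 minor_version, u2 major_version, ...
    static int majorVersion(byte[] classfileBuffer) {
        if (classfileBuffer == null || classfileBuffer.length < 8) {
            return -1;
        }
        ByteBuffer buf = ByteBuffer.wrap(classfileBuffer); // big-endian by default
        if (buf.getInt() != 0xCAFEBABE) {
            return -1;
        }
        buf.getShort();                 // skip minor_version
        return buf.getShort() & 0xFFFF; // major_version, e.g. 52 for Java 8
    }
}
```

Returning `null` rather than a copy of the input keeps the non-modifying path cheap, since the JVM does not have to re-parse an identical class.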

Page 11: GeeCon2016- High Performance Instrumentation (handout)

DYNAMIC INSTRUMENTATION w/ BCI
● Required steps
  ○ Create and register JVM agent
  ○ Create and register class transformers
  ○ Prepare injected bytecode
    ■ create bytecode
    ■ validate bytecode
  ○ Inject bytecode
    ■ merge class bytecode w/ injected bytecode
    ■ validate merged bytecode
    ■ redefine/retransform class using merged bytecode
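
The registration steps above can be sketched with the standard `java.lang.instrument` API. This is an illustration under assumptions, not the speaker's agent: the class name `SketchAgent`, the `install` helper and the `com/example/app/` prefix are hypothetical, and the bytecode preparation and merging steps are left as a placeholder comment.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.lang.instrument.UnmodifiableClassException;
import java.security.ProtectionDomain;
import java.util.ArrayList;
import java.util.List;

// Hypothetical agent. The agent jar manifest would name this class in
// Premain-Class / Agent-Class and set Can-Retransform-Classes: true.
final class SketchAgent {
    // Hypothetical target: internal (slash-separated) class name prefix.
    static final String TARGET_PREFIX = "com/example/app/";

    // Entry point when the JVM is started with -javaagent:<jar>.
    public static void premain(String agentArgs, Instrumentation inst) { install(inst); }

    // Entry point when the agent is attached to an already running JVM.
    public static void agentmain(String agentArgs, Instrumentation inst) { install(inst); }

    // Registers the transformer and retransforms matching loaded classes.
    // Returns the number of classes submitted for retransformation.
    static int install(Instrumentation inst) {
        ClassFileTransformer transformer = new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className, Class<?> cls,
                                    ProtectionDomain pd, byte[] classfileBuffer) {
                if (className == null || !className.startsWith(TARGET_PREFIX)) {
                    return null; // fast path for classes that do not match
                }
                // Placeholder: create, validate and merge the injected bytecode here
                // and return the merged class bytes.
                return null;
            }
        };
        inst.addTransformer(transformer, true); // true = usable for retransformation

        List<Class<?>> targets = new ArrayList<>();
        for (Class<?> c : inst.getAllLoadedClasses()) {
            if (c.getName().replace('.', '/').startsWith(TARGET_PREFIX)
                    && inst.isModifiableClass(c)) {
                targets.add(c);
            }
        }
        if (!targets.isEmpty()) {
            try {
                inst.retransformClasses(targets.toArray(new Class<?>[0]));
            } catch (UnmodifiableClassException e) {
                throw new IllegalStateException(e);
            }
        }
        return targets.size();
    }
}
```

Classes loaded after `addTransformer` pass through the transformer automatically; the explicit `retransformClasses` call is only needed for classes that were already loaded.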


Page 12: GeeCon2016- High Performance Instrumentation (handout)

BTRACE
● Bytecode level instrumentation simplified
  ○ JVM agent
  ○ Class Transformers
  ○ Optimized bytecode injection
  ○ Safety guarantees
● Injected code as POJO
  ○ annotations specify where injection should go
  ○ code specifies what should be injected
● Started as a research project at Sun JDK Serviceability


Page 13: GeeCon2016- High Performance Instrumentation (handout)

BTRACE SCRIPT
● Easy access to
  ○ class and method name and parameters
  ○ enclosing instance
  ○ return value
  ○ method duration
  ○ fields via reflection
    ■ immutable, guarded access only
● Interfacing via
  ○ stdout
  ○ file
  ○ JMX (MXBean)
  ○ jstat counters


Page 14: GeeCon2016- High Performance Instrumentation (handout)

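BTRACE SCRIPT

// imports assume the BTrace 1.x package layout
import com.sun.btrace.annotations.*;
import static com.sun.btrace.BTraceUtils.*;

@BTrace
public class AllMethods {
    // trace entry into every method of every javax.swing class
    @OnMethod(clazz="/javax\\.swing\\..*/", method="/.*/")
    public static void m(@Self Object o, @ProbeClassName String probeClass,
                         @ProbeMethodName String probeMethod) {
        println("this = " + o);
        print("entered " + probeClass);
        println("." + probeMethod);
    }
}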


> this = DerivedColor(color=192,192,193)

> entered javax.swing.plaf.nimbus.DerivedColor.getRGB

Page 15: GeeCon2016- High Performance Instrumentation (handout)

INSTRUMENTATION PERFORMANCE


Page 16: GeeCon2016- High Performance Instrumentation (handout)

PERFORMANCE IMPACT
● Class (re)transformation
  ○ application startup time
● Injected bytecode instructions
  ○ CPU usage
  ○ JIT optimizer decisions
  ○ heap usage
  ○ GC activity
● Instrumentation framework
  ○ additional drain of resources (CPU, RAM)


Page 17: GeeCon2016- High Performance Instrumentation (handout)

SPARK SPECIFICS
● Distributed environment
● Worker JVMs come and go
  ○ startup time is important
● The inner parts are frequently executed
  ○ RDD (Resilient Distributed Dataset) iterators
  ○ latency/overhead of injected code is important
● Startup time is as important as latency/overhead


Page 18: GeeCon2016- High Performance Instrumentation (handout)

CLASS (RE)TRANSFORMATION
● Affects application startup time
● Major impact on short-lived applications
● Usually a small number of classes will be instrumented
  ○ optimize class filter for non-match
● Minimize overhead of parsing class files
  ○ register as few transformers as possible
  ○ consider smart caching of the transformed class data
● Example: Spark driver
  ○ lifespan easily just a few minutes
  ○ 104+ classes loaded at startup
  ○ optimizing class transformation decreased overall overhead by >1.5%
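
The two optimizations named above (a filter that rejects non-matching classes cheaply, and caching of transformed class data) can be sketched together. The class `FilteringTransformCache` and its methods are hypothetical, and the cache key uses `Arrays.hashCode` only for brevity; a real cache would want a stronger digest of the class bytes.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

// Hypothetical helper; illustrates "filter first, parse later, cache the result".
final class FilteringTransformCache {
    private final String[] prefixes;
    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();
    int transformations; // demo counter only, not thread-safe

    FilteringTransformCache(String... internalNamePrefixes) {
        this.prefixes = internalNamePrefixes.clone();
    }

    // Cheap rejection: plain string prefix checks, no class-file parsing.
    boolean matches(String internalName) {
        if (internalName == null) {
            return false;
        }
        for (String prefix : prefixes) {
            if (internalName.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // Returns null for non-matching classes; otherwise the (possibly cached) rewritten bytes.
    byte[] transform(String internalName, byte[] original, UnaryOperator<byte[]> rewriter) {
        if (!matches(internalName)) {
            return null;
        }
        // Weak key for illustration: class name plus a hash of the original bytes.
        String key = internalName + '#' + Arrays.hashCode(original);
        return cache.computeIfAbsent(key, k -> {
            transformations++;
            return rewriter.apply(original);
        });
    }
}
```

Because most loaded classes do not match, the cost that matters most is the rejecting branch of `matches`, which here never touches the class bytes.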


Page 19: GeeCon2016- High Performance Instrumentation (handout)

INJECTED BYTECODE
● Affects the application runtime performance
● Keep injected code as simple as possible
  ○ no non-deterministic loops
  ○ minimize external method calls
    ■ escape analysis
  ○ prefer working with stack instead of fields
    ■ method arguments
    ■ local variables
● Smart activation of injected code
  ○ sampling
  ○ injection guards


Page 20: GeeCon2016- High Performance Instrumentation (handout)

ESCAPING OBJECTS
● A local instance escapes via injected instrumentation
  ○ affects GC and JIT optimizer decisions


// original method
int method() {
    MyObject o = new MyObject();
    int x = o.getCount();
    return x;
}

// instrumented method
int method() {
    MyObject o = new MyObject();
    int x = o.getCount();
    // inspect the instance providing the count;
    // the local instance 'o' escapes the method scope
    Instrumentation.inspect(o);
    return x;
}
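
One way to keep the instance from escaping, sketched under assumptions: the probe receives only the primitive value that was already computed, so `o` is never handed to instrumentation code. The names `EscapeSketch` and `inspectCount` are hypothetical. Whether the JIT then actually eliminates the allocation depends on the JVM and its inlining decisions.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical illustration; MyObject stands in for the class used on the slide.
final class EscapeSketch {
    static final class MyObject {
        private final int count = 42;
        int getCount() { return count; }
    }

    // Hypothetical probe: takes a primitive and keeps no object references.
    static final LongAdder observedTotal = new LongAdder();

    static void inspectCount(int count) {
        observedTotal.add(count);
    }

    static int method() {
        MyObject o = new MyObject();
        int x = o.getCount();
        // Only the primitive is passed on; 'o' stays local to this method.
        inspectCount(x);
        return x;
    }
}
```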

Page 21: GeeCon2016- High Performance Instrumentation (handout)

GC INTERFERENCE
● Minimize instrumentation interference with GC
  ○ use off-heap data structures where possible
  ○ specialized primitive collections
  ○ specialized queues in runtime (e.g. JCTools)
● Reduce instantiations to minimum
  ○ boxing
  ○ string concatenation
  ○ varargs
● Collect only raw data
  ○ aggregations on different JVM or host
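
A minimal sketch of allocation-free raw data collection, assuming a single writer thread. `RawEventBuffer` is a hypothetical class: it stores (probe id, value) pairs in one preallocated `long[]`, so recording involves no boxing, varargs or string concatenation. It is an on-heap primitive array rather than the off-heap structures or JCTools queues mentioned above, chosen only to keep the example free of dependencies.

```java
// Hypothetical single-writer ring buffer of raw (probeId, value) pairs.
final class RawEventBuffer {
    private final long[] slots; // layout: [probeId0, value0, probeId1, value1, ...]
    private final int mask;
    private long writeIndex;

    RawEventBuffer(int capacityPowerOfTwo) {
        if (capacityPowerOfTwo <= 0 || Integer.bitCount(capacityPowerOfTwo) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        slots = new long[capacityPowerOfTwo * 2];
        mask = capacityPowerOfTwo - 1;
    }

    // Allocation-free write; the oldest entry is overwritten when the buffer is full.
    void record(long probeId, long value) {
        int base = ((int) (writeIndex++ & mask)) * 2;
        slots[base] = probeId;
        slots[base + 1] = value;
    }

    // Total number of record() calls so far.
    long recorded() { return writeIndex; }

    long probeIdAt(int slot) { return slots[slot * 2]; }

    long valueAt(int slot) { return slots[slot * 2 + 1]; }
}
```

Aggregation is deliberately absent: in line with the last bullet, the raw pairs would be shipped elsewhere and summarized on a different JVM or host.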


Page 22: GeeCon2016- High Performance Instrumentation (handout)

STACK UNWINDING
● Reuse the values stored on stack


[Diagram: bytecode listing shown next to the Java stack contents]

Bytecode:
    GETSTATIC TestClass.name : Ljava/lang/String;
    LLOAD 2
    INVOKESPECIAL C.m (Ljava/lang/String;J)J
    DUP_X2
    DUP2_X1
    INVOKESTATIC Probe.p(Ljava/lang/String;J)V

Java Stack (the same values appear twice in the diagram):
    String: "name"
    Long: 2 (H)
    Long: 2 (L)

Page 23: GeeCon2016- High Performance Instrumentation (handout)

TIMESTAMP FOLDING
● Timestamps are expensive
  ○ TSC correlated across cores
  ○ monotonic counter values adjusted for core frequencies
● Minimize number of requested timestamps
  ○ fold in subsequent calls to System.nanoTime()
● BTrace will optimize timestamps for @Duration parameters
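
The folding idea can be illustrated at source level: when one measured step ends exactly where the next begins, a single clock read can serve as both the end and the start timestamp. This is a sketch under assumptions, not BTrace's actual optimization; `TimestampFolding` is a hypothetical class and the clock is injected so the number of reads can be observed.

```java
import java.util.function.LongSupplier;

// Hypothetical illustration of folding adjacent timestamp requests.
final class TimestampFolding {
    private final LongSupplier clock; // e.g. System::nanoTime
    long firstDuration;
    long secondDuration;

    TimestampFolding(LongSupplier clock) {
        this.clock = clock;
    }

    // Naive: each step takes its own start and end timestamp (4 clock reads).
    void timeTwoStepsNaive(Runnable first, Runnable second) {
        long start1 = clock.getAsLong();
        first.run();
        long end1 = clock.getAsLong();
        long start2 = clock.getAsLong();
        second.run();
        long end2 = clock.getAsLong();
        firstDuration = end1 - start1;
        secondDuration = end2 - start2;
    }

    // Folded: the end of the first step doubles as the start of the second (3 clock reads).
    void timeTwoStepsFolded(Runnable first, Runnable second) {
        long t0 = clock.getAsLong();
        first.run();
        long t1 = clock.getAsLong();
        second.run();
        long t2 = clock.getAsLong();
        firstDuration = t1 - t0;
        secondDuration = t2 - t1;
    }
}
```

In real use the clock would be `System::nanoTime`, for example `new TimestampFolding(System::nanoTime)`.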


Page 24: GeeCon2016- High Performance Instrumentation (handout)

INVOCATION SAMPLING
● Instrumented methods are frequently executed
  ○ injected code causing high overhead
● Short methods experience disproportionate overhead
● Rely on statistically relevant sample instead
  ○ execute only on each Nth pass
  ○ adjust N for acceptable overhead and detail
● Use @Sampled annotation in BTrace
  ○ fixed N
  ○ dynamically adjusted N for guaranteed overhead
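
The fixed-N variant can be sketched as a plain counter. `EveryNthSampler` is a hypothetical class and not how BTrace implements `@Sampled`; it only shows the "execute on each Nth pass" idea. The counter is deliberately unsynchronized, on the assumption that an occasional lost update is acceptable for sampling.

```java
// Hypothetical fixed-rate sampler: returns true on every Nth call.
final class EveryNthSampler {
    private final int n;
    private int counter; // unsynchronized on purpose; races only skew the sampling rate

    EveryNthSampler(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        this.n = n;
    }

    boolean shouldSample() {
        if (++counter < n) {
            return false; // cheap path taken on most invocations
        }
        counter = 0;
        return true;
    }
}
```

Injected code would wrap its body in `if (sampler.shouldSample()) { ... }`, so most invocations pay only for an increment and a comparison.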


Page 25: GeeCon2016- High Performance Instrumentation (handout)

SAMPLING IN BTRACE

// imports assume the BTrace 1.x package layout
import com.sun.btrace.annotations.*;
import static com.sun.btrace.BTraceUtils.*;

@BTrace
public class ArgsDurationSampled {
    @OnMethod(clazz="/.*\\.OnMethodTest/", method="args", location=@Location(value=Kind.RETURN))
    @Sampled(kind = Sampled.Sampler.Const, mean = 20)
    public static void args(@Self Object self, @Return long retVal, @Duration long dur) {
        println("args");
    }
}


// Adaptive sampler keeps 'mean' nanoseconds between samples on average
@Sampled(kind = Sampled.Sampler.Adaptive, mean = 300)

Page 26: GeeCon2016- High Performance Instrumentation (handout)

INJECTION GUARDS
● Fastest code is the one never executed
● Think of Logger levels
● Class retransformation is costly
● Introducing injection guards
  ○ injected code executed only when a condition is met
  ○ minimal overhead when not executing injected code
    ■ fast field check
● Use @Level annotation in BTrace


@OnMethod(clazz="org.apache.spark.rdd.RDD",
          method="iterator",
          enableAt=@Level(">=" + SAMPLING_LEVEL),
          location=@Location(Kind.RETURN))
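
Conceptually, an injection guard behaves like a logger level check: the injected body sits behind a comparison against a field that can be changed at runtime without retransforming the class. The sketch below is an assumption about the general shape, not the code BTrace generates; `GuardedProbe`, `level` and `hits` are hypothetical names.

```java
// Hypothetical guard around injected code, in the style of a logger level check.
final class GuardedProbe {
    static final int SAMPLING_LEVEL = 1;

    // Current instrumentation level; changing it requires no class retransformation.
    static volatile int level = 0;

    // Stand-in for whatever the injected code would record.
    static long hits;

    // What an instrumented method return might invoke.
    static void onReturn() {
        if (level >= SAMPLING_LEVEL) { // fast field check; body skipped when disabled
            hits++;
        }
    }
}
```

With `level` below `SAMPLING_LEVEL` the probe costs one field read and one comparison, which is the "minimal overhead when not executing injected code" case.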

Page 27: GeeCon2016- High Performance Instrumentation (handout)

LESSONS LEARNED


Page 28: GeeCon2016- High Performance Instrumentation (handout)

HIGH PERFORMANCE INSTRUMENTATION
● Fast filters for identifying injection points
● Minimal and optimized code for injection
  ○ use timestamps sparingly
  ○ beware of callbacks from injected code
  ○ prefer stack manipulation over field retrievals
● Be gentle to GC
● Use sampling when possible
  ○ getting overhead down
  ○ still obtaining valid insights
● Enable turning off injection when not needed
  ○ class retransformation is slow
  ○ injection guards


Page 29: GeeCon2016- High Performance Instrumentation (handout)

Resources
● BTrace (https://github.com/btraceio/btrace)
  ○ Contributors welcome!
● ASM (http://asm.ow2.org/index.html)
● CGLIB (https://github.com/cglib/cglib)
● ByteBuddy (http://bytebuddy.net/#/)
● JCTools (https://github.com/JCTools/JCTools)


Page 30: GeeCon2016- High Performance Instrumentation (handout)

Q&A


Page 31: GeeCon2016- High Performance Instrumentation (handout)

THANK YOU!


[email protected], @yardus