WebSphere Performance Drivers William R. Sullivan, P.E. CTO WHAM Engineering & Software.

Post on 28-Dec-2015

Memory Performance Drivers

• Memory Concepts

– Address Space Management

– Address Translation

– Locality of Reference

Address Space Management

• Managing the addresses that the program generates to reference variables and data storage locations

• For C/C++, a pair of functions called malloc and free manage the heap

• For Java, there is something called a garbage collector that makes Address Space Management transparent to the programmer

Locality of Reference

• This simply refers to the tendency of the next location fetched from memory to be close to the one fetched before it

• As long as it is in the same page, no new virtual mapping needs to be created

• Programs with poor locality of reference rarely get extra performance with faster CPUs

• Programs with larger resident set size generally have less locality of reference
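The effect of access order on locality can be sketched in Java. This is a minimal illustration, not part of the original measurements: the class `LocalityDemo` and its methods are hypothetical names. Both methods compute the same sum, but the row-major loop walks each row array front to back, while the column-major loop jumps to a different row array on every access.

```java
// Sketch: the same sum computed with good and poor locality of reference.
class LocalityDemo {
    // Builds a rectangular grid with deterministic contents.
    static int[][] makeGrid(int rows, int cols) {
        int[][] grid = new int[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                grid[r][c] = r + c;
            }
        }
        return grid;
    }

    // Visits each row array in order, so consecutive accesses are adjacent.
    static long sumRowMajor(int[][] grid) {
        long sum = 0;
        for (int r = 0; r < grid.length; r++) {
            for (int c = 0; c < grid[r].length; c++) {
                sum += grid[r][c];
            }
        }
        return sum;
    }

    // Switches to a different row array on every access (assumes a rectangular grid).
    static long sumColumnMajor(int[][] grid) {
        long sum = 0;
        int cols = grid.length == 0 ? 0 : grid[0].length;
        for (int c = 0; c < cols; c++) {
            for (int r = 0; r < grid.length; r++) {
                sum += grid[r][c];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[][] grid = makeGrid(2048, 2048);
        long t0 = System.nanoTime();
        long a = sumRowMajor(grid);
        long t1 = System.nanoTime();
        long b = sumColumnMajor(grid);
        long t2 = System.nanoTime();
        System.out.println("row-major:    sum=" + a + " ms=" + (t1 - t0) / 1_000_000);
        System.out.println("column-major: sum=" + b + " ms=" + (t2 - t1) / 1_000_000);
    }
}
```

Timings depend on the machine and JVM, so no figures are claimed here; the point is only that the two loops touch memory in very different orders.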

Address Space Management in Java Applications

• The new operator creates an instance of a class and invokes the class constructor

• No delete operation is needed because of Garbage Collection

• This leads to poorly performing applications where lots of construction occurs
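As a small illustration of construction-heavy code, the sketch below builds the same string two ways. The class `AllocationDemo` and its methods are hypothetical names chosen for this example. The concatenation loop creates a new String on every iteration and leaves the old one as garbage; the builder version reuses one growing buffer.

```java
// Sketch: the same result with many short-lived objects versus one reused buffer.
class AllocationDemo {
    // Each iteration constructs a new String; the previous one becomes garbage.
    static String joinWithConcat(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s = s + i + ",";
        }
        return s;
    }

    // One StringBuilder is constructed and appended to in place.
    static String joinWithBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i).append(',');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinWithConcat(5));
        System.out.println(joinWithBuilder(5));
    }
}
```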

Java Address Space Management

• Two significant impacts on program operation

– Locality of reference is not controllable except by using a very small heap, which is not always practical

– If the program uses many objects, garbage collection can take too long and cause excessive CPU use

– These two are at odds when tuning WebSphere

What Does Garbage Collection Do?

• Collects unused space

• Compacts it by coalescing unused contiguous chunks

• Copies around data that is actively in use

• All request processing is suspended during garbage collection

When Does GC Kick In?

• When either a limit in time or a limit in memory use has been reached

• The main tuning knobs are the start size and final size of the heap

• Asynchronous GC can be disabled but it isn’t advisable

• GC can be invoked by the programmer as well
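A minimal sketch of the heap-size knobs and a programmer-invoked collection follows. It uses the standard `Runtime` methods and `System.gc()`; the class name `HeapInfo` is hypothetical. The start and final heap sizes are normally set on the JVM command line with the `-Xms` and `-Xmx` options.

```java
// Sketch: report heap sizing and request a collection from application code.
class HeapInfo {
    private static final long MB = 1024L * 1024L;

    // Bytes currently occupied in the committed heap.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    static String describe() {
        Runtime rt = Runtime.getRuntime();
        return "committed=" + rt.totalMemory() / MB + "MB"
             + " max=" + rt.maxMemory() / MB + "MB"
             + " used=" + usedBytes() / MB + "MB";
    }

    public static void main(String[] args) {
        System.out.println("before: " + describe());
        System.gc(); // only a request; the JVM may defer or ignore it
        System.out.println("after:  " + describe());
    }
}
```

Running the compiled class with, for example, `java -Xms64m -Xmx128m HeapInfo` would start the heap at 64MB and cap it at 128MB.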

What Adverse Effect Does GC Produce on Application Behavior?

• Excessive CPU consumption; IBM says to expect 5%-20%

• Response time impact for all transactions in progress and received during garbage collection interval

• Can we characterize specifically the impact GCs are having?

Test Application from IBM

• We used the Account Transfer Application that came with the Samples from IBM

• Used a URL based load generator with 10 simultaneous requestors and zero think time

• Used WHAM DRM 3.5 to measure and analyze all the results

GC as a percentage of Total CPU

• IBM claims anywhere from 5% to 20% of the application time is acceptable

• That is way too high for the price you pay for the licensing of WebSphere

How do we characterize GC?

• We would say from our understanding of GC so far that when transaction rate drops to 0 and CPU is 100% of one CPU, the JVM is in the process of collecting garbage

• Is there any way to conclusively demonstrate that?

– WHAM Profiling data on the application

– GC Verbose output correlated to the other data streams but unfortunately it doesn’t come with timestamps (it did in 3.01)
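Where verbose GC output lacks timestamps, one way to get time-correlated GC data is to sample the JVM's own counters. The sketch below assumes a JVM that provides the `java.lang.management` API (Java 5 and later), which is newer than the JVM measured in these slides; the class `GcSnapshot` is a hypothetical name. Each snapshot records a wall-clock timestamp with the cumulative collection count and collection time, so two snapshots give the GC activity in the interval between them.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch: timestamped samples of cumulative GC count and GC time.
class GcSnapshot {
    final long timestampMillis;
    final long collections;
    final long gcMillis;

    GcSnapshot(long timestampMillis, long collections, long gcMillis) {
        this.timestampMillis = timestampMillis;
        this.collections = collections;
        this.gcMillis = gcMillis;
    }

    // Sums the counters of every collector the JVM reports.
    static GcSnapshot take() {
        long count = 0;
        long time = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount(); // -1 when undefined for this collector
            long t = gc.getCollectionTime();  // -1 when undefined for this collector
            if (c > 0) count += c;
            if (t > 0) time += t;
        }
        return new GcSnapshot(System.currentTimeMillis(), count, time);
    }

    public static void main(String[] args) throws InterruptedException {
        GcSnapshot before = take();
        Thread.sleep(1000); // sampling interval; the application load runs meanwhile
        GcSnapshot after = take();
        System.out.println("interval ms: " + (after.timestampMillis - before.timestampMillis));
        System.out.println("collections: " + (after.collections - before.collections));
        System.out.println("GC ms:       " + (after.gcMillis - before.gcMillis));
    }
}
```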

GC at 2% is good, right?

• Percentages can be misleading and that is why it is always necessary to look at both the frequency and time domain

• The next slides show GC for a 128MB heap and a 512MB heap in which GC is 2% of the application CPU during the interval of observation
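To see why a percentage alone can mislead, the sketch below compares two pause profiles that both come to 2% of a 10-minute interval. The numbers are hypothetical and chosen only for the arithmetic; they are not the measured 128MB and 512MB results. The class `GcProfile` is likewise a hypothetical name.

```java
import java.util.Arrays;

// Sketch: identical GC percentage, very different pause frequency and length.
class GcProfile {
    // GC time as a percentage of the observation interval.
    static double gcPercent(double[] pausesSec, double intervalSec) {
        double total = 0;
        for (double p : pausesSec) total += p;
        return 100.0 * total / intervalSec;
    }

    static double longestPause(double[] pausesSec) {
        double max = 0;
        for (double p : pausesSec) max = Math.max(max, p);
        return max;
    }

    static double pausesPerMinute(double[] pausesSec, double intervalSec) {
        return pausesSec.length / (intervalSec / 60.0);
    }

    public static void main(String[] args) {
        double interval = 600.0;            // hypothetical 10-minute observation
        double[] frequent = new double[60]; // hypothetical: 60 pauses of 0.2 s
        Arrays.fill(frequent, 0.2);
        double[] rare = {6.0, 6.0};         // hypothetical: 2 pauses of 6 s

        for (double[] profile : new double[][] {frequent, rare}) {
            System.out.println("GC%=" + gcPercent(profile, interval)
                + " pauses/min=" + pausesPerMinute(profile, interval)
                + " longest pause s=" + longestPause(profile));
        }
    }
}
```

Both profiles spend 12 seconds collecting, but in the second one every transaction in flight during a collection waits several seconds.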

2% GC in a 128MB Heap

2% GC in a 128MB Heap

2% GC in 512MB Heap

2% GC in a 512MB heap

Why 512MB GC is Slow

Lessons Learned

• A large heap may cause page faulting when garbage collection starts, which extends GC time and has an adverse impact on application response time

• A large heap had other negative effects, such as increased memory management overhead due to steals and minor faults from poor locality of reference

What happens to GC over time?

• It should kick in at a regular rate as long as the rate of object creation is constant

• The cost of GC should be proportional to the rate of garbage creation

• Let’s have a look at the Account Transfer Bean using JSPs

• We ran it for 33 minutes with three different heap sizes: 64MB, 128MB and 256MB

Time Per Tier View of Load Effects on a 64MB heap over 30 minutes

Transaction Rate and Service Time in Tier 2

What if We Look at a 10s Sample Rate?

CPU 1s vs 10s sample rate

Slower Sampling Averages Data

• Cannot see the clear pattern of garbage collection

• We can see that response time is rising over time as CPU is dropping with throughput

• We would assume some internal application slowdown may be occurring or that things may be slower on the database but we know the latter isn’t the case

• With faster sampling, it is clear that the slowdown is periodic and is clearly in Tier 2

• Faster sampling is key in isolating and identifying these sorts of anomalies
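The averaging effect can be reproduced with a synthetic series. In the sketch below, all numbers are hypothetical: a transaction rate of 50 per second that drops to zero for 2 seconds out of every 10, standing in for a periodic GC pause. The class `SamplingDemo` and its methods are hypothetical names. At 1-second resolution the stalls are visible as zeros; averaged over 10-second windows, the same data looks like a steady, slightly lower rate.

```java
// Sketch: periodic stalls visible at 1 s sampling disappear in 10 s averages.
class SamplingDemo {
    // One sample per second: 'rate' normally, 0 during the first 'pause' seconds of each period.
    static double[] syntheticRate(int seconds, int period, int pause, double rate) {
        double[] out = new double[seconds];
        for (int t = 0; t < seconds; t++) {
            out[t] = (t % period) < pause ? 0.0 : rate;
        }
        return out;
    }

    // Averages consecutive, non-overlapping windows of 'window' samples.
    static double[] downsample(double[] samples, int window) {
        double[] out = new double[samples.length / window];
        for (int i = 0; i < out.length; i++) {
            double sum = 0;
            for (int j = 0; j < window; j++) {
                sum += samples[i * window + j];
            }
            out[i] = sum / window;
        }
        return out;
    }

    static double min(double[] values) {
        double m = Double.POSITIVE_INFINITY;
        for (double v : values) m = Math.min(m, v);
        return m;
    }

    public static void main(String[] args) {
        double[] oneSecond = syntheticRate(60, 10, 2, 50.0);
        double[] tenSecond = downsample(oneSecond, 10);
        System.out.println("minimum at 1 s sampling:  " + min(oneSecond));
        System.out.println("minimum at 10 s sampling: " + min(tenSecond));
    }
}
```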

Expanded View of Previous Charts

Functions and Transactions from Silhouette

Libs and Service Times

Let's Plot GC CPU vs Time

• We measured the application for 33 minutes under a fixed load and summarized the CPU usage at 3 minute intervals

• We then plotted the total CPU usage in GC per interval for a 64MB heap, a 128MB heap and a 256MB heap

• Notice that the service time effect of GC is about 2s for a 64MB heap from the previous chart

CPU Breakout for 64MB Heap

Chart: CPU Breakdown for 64MB Heap (x-axis: Time in Minutes, 3 to 33 in 3-minute intervals; y-axis: CPU in Seconds, 0 to 300; series: 64MB App CPU, 64MB GC CPU)

CPU Breakout for 128MB Heap

Chart: CPU Breakdown for 128MB Heap (x-axis: Time in Minutes, 3 to 33 in 3-minute intervals; y-axis: CPU in Seconds, 0 to 300; series: 128MB App CPU, 128MB GC CPU)

CPU Breakout for 256MB Heap

Chart: CPU Breakdown for 256MB Heap (x-axis: Time in Minutes, 3 to 33 in 3-minute intervals; y-axis: CPU in Seconds, 0 to 300; series: 256MB App CPU, 256MB GC CPU)

So What is Wrong Here?

• Garbage collection is being invoked more frequently but the rate of transactions is decreasing

• Garbage is collected when we run out of space and so we would have to say that with higher frequency GCs we are running out of space sooner

• The implication is that we have a memory leak or something is holding memory active after requests are completed
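The kind of retention described above can be sketched as follows. This is an illustration only, not the Account Transfer code: the class `LeakDemo`, its request handlers, and the static list are all hypothetical. The leaky handler stores each request's buffer in a long-lived collection, so the buffer stays reachable after the request completes and the collector cannot reclaim it; the clean handler lets it go out of reach.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: memory held active after requests complete, versus memory released.
class LeakDemo {
    // Hypothetical long-lived registry; anything added here stays reachable.
    private static final List<byte[]> RETAINED = new ArrayList<>();

    // Keeps a reference to per-request data after the request is done.
    static int handleRequestLeaky(int payloadBytes) {
        byte[] buffer = new byte[payloadBytes];
        RETAINED.add(buffer);
        return buffer.length;
    }

    // Per-request data becomes unreachable when the method returns.
    static int handleRequestClean(int payloadBytes) {
        byte[] buffer = new byte[payloadBytes];
        return buffer.length;
    }

    // Total bytes the registry is keeping alive.
    static long retainedBytes() {
        long total = 0;
        for (byte[] b : RETAINED) total += b.length;
        return total;
    }

    static void reset() {
        RETAINED.clear();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) handleRequestLeaky(1024);
        System.out.println("retained after leaky requests: " + retainedBytes());
        reset();
        for (int i = 0; i < 1000; i++) handleRequestClean(1024);
        System.out.println("retained after clean requests: " + retainedBytes());
    }
}
```

With the leaky handler, the live set grows with every request, which leaves less free heap between collections and would make collections more frequent under a constant load.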

But I thought GC fixed memory Leaks?

• Not exactly

• Objects have seven states

– Created

– In Use

– Invisible

– Unreachable

– Collected

– Finalized

– Deallocated

Invisible Objects

• These are objects that are apparently out of scope but are still referenced from the frame of code that is running, so the JVM will not eliminate them unless they are explicitly de-referenced

• This behavior can be overridden by explicitly setting the reference to the object to null after it’s used
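A minimal sketch of the explicit de-reference follows. The class `InvisibleDemo` and its method are hypothetical names. The scratch buffer is declared in an inner block, so it is out of scope once the block ends, yet a JVM may keep the reference alive in the method's frame while the long-running work executes. Setting the reference to null before leaving the block removes that doubt. Whether the buffer would otherwise stay alive depends on the JVM and its compiler, so this is a defensive pattern rather than a guaranteed saving.

```java
// Sketch: de-referencing a large temporary before long-running work begins.
class InvisibleDemo {
    static int process(Runnable longRunningWork) {
        int checksum;
        {
            byte[] scratch = new byte[1 << 20]; // large temporary buffer
            scratch[0] = 42;
            checksum = scratch[0];
            scratch = null; // explicit de-reference: the buffer is now unreachable
        }
        // The buffer is out of scope here either way; the null assignment above
        // ensures no stale reference remains in this frame while the work runs.
        longRunningWork.run();
        return checksum;
    }

    public static void main(String[] args) {
        int result = process(() -> System.out.println("working..."));
        System.out.println("checksum=" + result);
    }
}
```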

Using JSPs or Not

• It turns out that we have two ways to do the transfer funds operation

• We have a JSP based bean where the bean populates a JSP object and then creates a session servlet to run the JSP

• The direct implementation creates all of the html output inside the bean. This isn’t good because programmers would be required to develop content in the direct case

• In the JSP case, content programmers don’t need Java, just html, to implement dynamic content because the bean handles all the dynamic content

• The leaky version is the JSP based version

• We ran the Direct Bean implementation and here were the results

No JSP produced good results

So What is Special About JSP?

• We investigated the Bean Code and found that the JSP version of the Bean had to create a session

• Sessions are either persistent or cached

• The default is cached and there is a limit of 1000 and a timeout of 30 minutes

• We decided to adjust session timeout to see if shorter timeouts would help free memory quickly enough to keep the program from starving

• First adjustment was to 5 minutes
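Why the timeout matters can be sketched with a toy session cache. This is not WebSphere's session manager: the class `SessionCache`, the arrival rate, and the assumption that every request creates a new session are all hypothetical. Each session is kept until it has been idle for the timeout, so under a steady load the number of sessions held in memory is roughly the arrival rate times the timeout.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: a cache that holds each session until it has been idle for the timeout.
class SessionCache {
    private final Map<String, Long> lastAccessMillis = new LinkedHashMap<>();
    private final long timeoutMillis;

    SessionCache(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Creates the session or refreshes its last-access time.
    void touch(String sessionId, long nowMillis) {
        lastAccessMillis.put(sessionId, nowMillis);
    }

    // Removes sessions idle for at least the timeout; returns how many were removed.
    int expire(long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastAccessMillis.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().getValue() >= timeoutMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return lastAccessMillis.size();
    }

    // Simulates a steady load in which every request creates a new session.
    static int liveAfter(int seconds, int newSessionsPerSecond, long timeoutMillis) {
        SessionCache cache = new SessionCache(timeoutMillis);
        for (int t = 0; t < seconds; t++) {
            long now = t * 1000L;
            for (int k = 0; k < newSessionsPerSecond; k++) {
                cache.touch("s" + t + "-" + k, now);
            }
            cache.expire(now);
        }
        return cache.size();
    }

    public static void main(String[] args) {
        // Hypothetical load: 2 new sessions per second for 10 minutes.
        System.out.println("5 minute timeout: " + liveAfter(600, 2, 300_000L) + " live sessions");
        System.out.println("2 minute timeout: " + liveAfter(600, 2, 120_000L) + " live sessions");
    }
}
```

A shorter timeout shrinks the set of live sessions the collector must preserve, at the cost of expiring idle users sooner.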

5 Minute Session Timeout

Garbage Collection Cost

• GC becomes consistent but at a fairly high frequency

• Total GC Cost is 11% of the Application

• Not acceptable so we can increase the heap size or decrease the timeout

• Next we decreased the session timeout to 2 minutes which produced 3% GC cost

• Then we decided to use a larger heap and longer timeout to see how that worked

2 Minute Session Timeout

5 Minute Timeout 128MB Heap

Final Choice

• The 2 minute timeout with 64MB heap produced 3% GC cost

• The 5 minute timeout with 128MB heap produced <1% GC cost

• Longer session timeout is better for most applications

Conclusions

• GC can have a significant impact on WebSphere performance

• GC must be characterized in order to ensure that it isn’t negatively affecting the application performance

• Proper tools and the right approach to analyzing GC are imperative to identifying problems and rectifying them

• As far as we could tell, WebSphere doesn’t come with the proper tools for characterizing GC costs
