Speeding Up Your DITA-OT Processing
Aryeh Sanders, Suite Solutions
Transcript
Page 1: Ot performance webinar

Speeding Up Your DITA-OT Processing

Aryeh Sanders, Suite Solutions

Page 2: Ot performance webinar

Who Are We?

Our Mission
• To increase our customers’ profitability by significantly improving the efficiency of their information development and delivery processes.

Qualitative Advantage
• Content Lifecycle Implementation (CLI) is Suite Solutions’ comprehensive approach – from concept to publication – to maximizing the value of your information assets.
• Our professionals are with you at every phase, determining, recommending and implementing the most cost-effective, flexible and long-term solution for your business.

Page 3: Ot performance webinar

Clients and Partners


Page 4: Ot performance webinar

Introduction

Performance in the DITA-OT
• “No Silver Bullet”
• The design of the DITA-OT puts limits on performance without a redesign
  • Some of that redesign is underway
• Performance relative to what?
  • Try to examine your needs to figure out which performance issues should be tackled and which can be ignored
  • No hard and fast rules
• Performance can be assessed only with your data, in your environment
• Measurement

Page 5: Ot performance webinar

Overview

Overview of the webinar
• Performance Pain Points in the DITA-OT
• Hardware and Software Changes for Performance
• Memory Settings for Java
• Stylesheet Performance and Code Changes

Page 6: Ot performance webinar

Performance Issues With the DITA-OT

• The DITA-OT sacrifices speed for simplicity
• It is constructed as a pipeline of transformations, each step of which does one thing
  • Each step must at least reparse the DITA files
  • Each read of a DITA file with a DOCTYPE used to reparse the DTDs
    • Now it doesn’t – Eliot Kimber added a patch to cache the DTDs
    • Best takeaway from this talk: upgrade to a version with this patch – 1.5.1
• XSLT
  • A high-level language, far removed from the practicalities of performance
  • Often, the easiest way to do something in XSLT involves repeated searches through the document (see the sketch below)
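A hedged illustration of that last point (the match pattern and the @refid attribute are hypothetical, not taken from the built-in stylesheets): every time this template fires, the //* step walks the entire document again, even though the document never changes.

<xsl:template match="*[contains(@class, ' topic/xref ')]">
  <!-- //* searches the whole document on every invocation of this template -->
  <xsl:variable name="target" select="//*[@id = current()/@refid]"/>
  <xsl:value-of select="$target/@id"/>
</xsl:template>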

Page 7: Ot performance webinar

Importance of Measurement

A Case Study
• Since the DITA-OT writes many files repeatedly, we have to wait for the hard disk to complete the write, even to temporary files where long-term integrity isn’t that important. This certainly holds up processing, right?
• Test: stop those writes
  • ImBench – ramdisk tool
  • Create a temporary disk in memory and use that as the temp directory (see the example below)
  • Now, no writes have to wait for the disk
• Run the OT 20 times with the same data
  • I used a slightly complicated map (98 pages on output)
  • 41.1 seconds average with disk vs. 39.1 seconds in memory
  • For most people, not worth it; on the other hand, it saves 5% of the time
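If you want to try this kind of measurement yourself, one way (assuming an Ant-based DITA-OT 1.5-era invocation; check the parameter names against your version) is to point the toolkit’s temporary directory at the RAM disk:

  ant -f build.xml -Dargs.input=mymap.ditamap -Dtranstype=xhtml -Ddita.temp.dir=R:\ot-temp

Here R:\ is the drive letter of the in-memory disk; the rest of the command line is whatever you normally use.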

Page 8: Ot performance webinar

Hardware Issues

• Anecdotal: I’ve run the same data and stylesheets on my laptop and on a client’s server
  • 10 minutes on the server vs. 1.5 minutes on the laptop
  • And it’s not a new laptop
• Since the DITA-OT is doing a lot of processing, it’s worth using a machine that’s capable of reasonable performance
  • Measure!
  • But a modern low-end $250 Dell desktop is about as fast as my laptop
  • Don’t throw it on an old computer and then make people wait
• Make sure there’s one core free to run the OT so it doesn’t have to compete with other processes

Page 9: Ot performance webinar

Hardware Issues (2)

• Make sure there’s enough memory
  • Very workload dependent
  • For very large workloads (roughly > 600 pages, or > 1,000 topics), consider a 64-bit machine with a 64-bit JVM
  • Eliot Kimber is working on a patch to pass the right memory parameters to the OT – if this is an issue, check the developer mailing list or contact him
• If there’s not enough physical memory, you can get thrashing
  • JVM memory settings are on the next slide

Page 10: Ot performance webinar

Memory

• Once you have enough memory, extra won’t help
  • Slightly surprising to me, but I tested at least one data set
  • -Xmx tells Java the maximum heap size
  • The reason this is slightly surprising is that before Java gives up, it will try garbage collection
  • Frequent garbage collection can be slow
  • Possibly the OT doesn’t tend to release memory
• When a dataset runs out of memory, the standard advice is to set reloadstylesheets="true"
  • This slows down processing, since stylesheets are re-read
  • Much better to figure out how to give the OT enough memory if possible (see the sketch below)
  • One customer solved their memory issues with JRockit as the JVM
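A minimal sketch of raising the heap, assuming the toolkit is launched through Ant (Ant reads JVM options from the standard ANT_OPTS environment variable; where you set it depends on how your builds are started):

  set ANT_OPTS=-Xmx1024m        (Windows)
  export ANT_OPTS=-Xmx1024m     (Linux/Mac)

The 1024m value is only an example; size it to your workload and to the physical memory actually available, or you trade an out-of-memory error for the thrashing mentioned two slides back.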

Page 11: Ot performance webinar

XSLT Performance

• Stylesheet developers don’t necessarily think about what needs to happen behind the scenes
• Example:
  <xsl:variable name="example" select="//*[@id=$refid]"/>
  This searches the whole document – fine if that’s what you want, but not if you mean:
  <xsl:variable name="example" select="..//*[@id=$refid]"/>
  • In the context of a document where @id is unique, both would behave the same, but one would be slower than the other
  • Except: this could theoretically be optimized if the @id attribute were an ID type, and you had a DTD, and the stylesheet processor had that optimization built in, which leads us back to measurement (see the note below)
• Measurement is also useful for stylesheets
• Saxon comes in a free version and commercial versions
  • Not that expensive, with more optimizations, which might matter for your workload – or might not
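A hedged aside on that ID optimization: XPath has a built-in id() function that relies on the DTD-declared ID type directly, so a processor can answer it from its ID index instead of scanning the tree. Whether the general //*[@id=$refid] form gets the same treatment is processor-dependent, which is exactly why you measure.

  <!-- Works only when @id is declared as type ID in the DTD the processor actually read -->
  <xsl:variable name="example" select="id($refid)"/>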

Page 12: Ot performance webinar

Profiling

• Good idea, many commercial tools
  • Oxygen, Stylus Studio, fancier editions of Visual Studio
• Essentially another example of measurement to find the real pain points
• Not always necessary if the pain points are evident

Page 13: Ot performance webinar

XSLT Performance (2)

• XPath requests tend to be one line, but that one line can hide a lot of computation
• What needs to happen to process this?
  preceding-sibling::*[following-sibling::*[contains(@class, ' topic/ul ')]]
  • preceding-sibling has to check each preceding sibling
  • For each one, following-sibling has to check every following sibling
  • And contains() itself can’t be that efficient, because it needs to hunt within @class for ' topic/ul '
• Some numbers: let’s look at 100 nodes, pretend that there is no topic/ul (so the test never succeeds), and run this test on all 100 nodes in sequence
  • We could do the math, but it’s easier to write a program

Page 14: Ot performance webinar

XSLT Performance Example (Calculated in Perl, sorry)

my $contains = 0;
for my $a (1..100) {              # for each of our 100 nodes
    for my $b (1..$a-1) {         # look at the preceding-siblings
        for my $c ($b+1..100) {   # look at the following-siblings of each of those
            $contains++;          # and call contains()
        }
    }
}
print $contains, "\n";

Running this tells us there are 328350 (!) calls to contains(). Of course, with 10 nodes there are only 285 calls, but the point remains – one line in XSLT might be doing a LOT of computation.

Page 15: Ot performance webinar

Tips From Mike Kay

• Eight tips for how to write efficient XSLT:
  • Avoid repeated use of "//item".
  • Don't evaluate the same node-set more than once; save it in a variable (illustrated in the sketch below).
  • Avoid <xsl:number> if you can. For example, by using position().
  • Use <xsl:key>, for example to solve grouping problems.
  • Avoid complex patterns in template rules. Instead, use <xsl:choose> within the rule.
  • Be careful when using the preceding[-sibling] or following[-sibling] axes. This often indicates an algorithm with n-squared performance.
  • Don't sort the same node-set more than once. If necessary, save it as a result tree fragment and access it using the node-set() extension function.
  • To output the text value of a simple #PCDATA element, use <xsl:value-of> in preference to <xsl:apply-templates>.
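A small illustration of the second tip (the ' topic/fn ' class is just an example): the first version runs the full-document search twice, the second runs it once and reuses the result.

  <!-- Evaluates the same node-set twice -->
  <xsl:if test="//*[contains(@class, ' topic/fn ')]">
    <xsl:apply-templates select="//*[contains(@class, ' topic/fn ')]"/>
  </xsl:if>

  <!-- Evaluates it once and saves it in a variable -->
  <xsl:variable name="footnotes" select="//*[contains(@class, ' topic/fn ')]"/>
  <xsl:if test="$footnotes">
    <xsl:apply-templates select="$footnotes"/>
  </xsl:if>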

Page 16: Ot performance webinar

Commentary On Those Tips

• Use <xsl:number> when appropriate – I’m pretty sure the cases where his comment applies aren’t found that often in the OT
• By all means, use xsl:key!
  • This is probably where to find low-hanging fruit in speeding up the built-in stylesheets
• We can’t realistically avoid complex patterns in template rules, but it’s worth considering why he gave that advice
  • Every <xsl:apply-templates/> runs through each child node
  • For each child node, it has to run the test in the match of every one of the <xsl:template>s
  • Each match test takes some amount of processing, and it runs for every node, so we’d like to minimize that
  • If you can move processing to an xsl:choose or a moded template, then you only need to run those tests on a smaller subset of nodes (see the sketch below)
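A rough sketch of that last point (the @importance test and the "highlight" mode are hypothetical): instead of two template rules whose complex patterns are both evaluated against every node, one rule matches on the cheap class check and an <xsl:choose> inside it decides what to do, so the extra test runs only on the nodes that already matched.

  <xsl:template match="*[contains(@class, ' topic/li ')]">
    <xsl:choose>
      <xsl:when test="@importance = 'high'">
        <!-- special handling for this subset only -->
        <xsl:apply-templates select="." mode="highlight"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>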

Page 17: Ot performance webinar

What is an XSLT Key?

• Somewhere at the top level of the stylesheet, you can declare something like:
  <xsl:key name="mapTopics" match="//opentopic:map//*" use="@id" />
• Then, later in your stylesheets, you can look up items with that key:
  select="key('mapTopics', $id)…"
• This lets you do the search once, instead of searching through the opentopic:map elements many times (a combined sketch follows below).
• Note that this is part of the code that had the 40% speedup in generating the TOC of a large book that I’ll mention on the next slide – despite the fact that <xsl:key name="mapTopics" match="/*/opentopic:map//*" use="@id" /> would have been much more efficient.
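A minimal sketch of how the two pieces fit together (the opentopic namespace URI is the one used by the DITA-OT PDF stylesheets – verify it against your own – and the named template and @navtitle lookup are purely illustrative):

  <xsl:stylesheet version="1.0"
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                  xmlns:opentopic="http://www.idiominc.com/opentopic">

    <!-- Declared once at the top level: builds an index of map elements keyed on @id -->
    <xsl:key name="mapTopics" match="/*/opentopic:map//*" use="@id"/>

    <!-- Used wherever needed: a key lookup instead of another search through the map -->
    <xsl:template name="topicref-navtitle">
      <xsl:param name="id"/>
      <xsl:value-of select="key('mapTopics', $id)/@navtitle"/>
    </xsl:template>

  </xsl:stylesheet>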

Page 18: Ot performance webinar

More On Slow XSLT

• Consider what’s inside a loop
• Example: if you have a template, and the template defines a variable:
  <xsl:variable name="topicrefs" select="//*[contains(@class, ' map/topicref ')]"/>
  (This isn’t a good idea to start with, because of //)
  • This variable will have the same value every time
  • So why not construct it only once?
  • Move it out of the template and make it a global variable (see the sketch below)
  • One customer sped up TOC generation by around 40% on a huge book
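A before/after sketch of that change (the match pattern and mode are hypothetical; the variable is the one from the slide):

  <!-- Before: recomputed every time the template fires -->
  <xsl:template match="*[contains(@class, ' map/map ')]" mode="toc">
    <xsl:variable name="topicrefs" select="//*[contains(@class, ' map/topicref ')]"/>
    <xsl:value-of select="count($topicrefs)"/>
  </xsl:template>

  <!-- After: a global (top-level) variable, computed once per document -->
  <xsl:variable name="topicrefs" select="//*[contains(@class, ' map/topicref ')]"/>

  <xsl:template match="*[contains(@class, ' map/map ')]" mode="toc">
    <xsl:value-of select="count($topicrefs)"/>
  </xsl:template>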

Page 19: Ot performance webinar

PDF Stylesheet Development Tips

• Not a general performance issue, but a timesaver for stylesheet developers
• If, like us, you need to repeatedly tweak a stylesheet and test the tweak, but each test is slow:
  • First, try directly editing the topic.fo file and viewing it before you change the stylesheet, so you won’t have to run the OT at all
  • Second, you can configure the toolkit to have another Ant “target” – simply run your DITA once, and after that, let the toolkit start the PDF stylesheets from the files in the temp directory, skipping the earlier processing
  • Contact us for more information – we don’t have a nicely packaged version of this yet, but we can give you the pieces

Page 20: Ot performance webinar

Questions?

• Any questions?

• Be in touch!
  Aryeh [email protected]