Top 10 Perl Performance Tips

May 10, 2015

Perrin Harkins

This talk was presented at YAPC::NA 2010 and OSCON 2010.
Transcript
Page 1: Top 10 Perl Performance Tips

Top 10 Perl Performance Tips

Perrin Harkins, We Also Walk Dogs

Page 2: Top 10 Perl Performance Tips

Devel::NYTProf

Page 3: Top 10 Perl Performance Tips

Ground Rules

● Make a repeatable test to measure progress with
  ○ Sometimes turns up surprises
● Use a profiler (Devel::NYTProf) to find where the time is going
  ○ Don't flail and waste time optimizing the wrong things!
● Try to weigh the cost of developer time vs buying more hardware
  ○ Optimization is crack for developers, hard to know when to stop
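
A repeatable test can be as small as a script built on the core Benchmark module. The sketch below is only an illustration: the sample data and the two candidate subs (count_with_regex, count_with_index) are hypothetical stand-ins for whatever code is being tuned. It checks that both candidates agree before timing them.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# Fixed input, so every run measures the same work.
my @haystack = map { "line $_ of sample text" } 1 .. 1_000;

# Two hypothetical candidates that must produce the same answer.
sub count_with_regex {
    my $n = 0;
    for my $line (@haystack) { $n++ if $line =~ /9 of/ }
    return $n;
}

sub count_with_index {
    my $n = 0;
    for my $line (@haystack) { $n++ if index($line, '9 of') >= 0 }
    return $n;
}

# Check correctness before measuring speed.
die "candidates disagree\n" unless count_with_regex() == count_with_index();

# A negative count means "run each candidate for at least that many CPU seconds".
my $results = timethese(-1, {
    regex => \&count_with_regex,
    index => \&count_with_index,
});

# Print a comparison table of rates and relative differences.
cmpthese($results);
```

Once the benchmark shows a change is worth pursuing, a profiler run on the real program shows whether that code path matters at all.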

Page 4: Top 10 Perl Performance Tips

1. The Big Picture

● The biggest gains usually come from changing your high-level approach
  ○ Is there a more efficient algorithm?
  ○ Can you restructure to reduce duplicated effort?
● Sometimes you just need to tune your SQL
● A boatload of RAM hides a multitude of sins
● The bottleneck is usually I/O
  ○ Files
  ○ Database
  ○ Network
  ○ Batch I/O often makes a huge difference

Page 5: Top 10 Perl Performance Tips

2. Use DBI Efficiently

● Can make a huge difference in tight loops with many small queries
● connect_cached() avoids connection overhead
  ○ Or use your favorite connection cache, but beware overuse of ping()
● prepare_cached() avoids object creation and server-side prepare overhead
● Use bind parameters to reuse SQL statements instead of creating new ones
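
A minimal sketch of these three points together. It assumes DBD::SQLite is installed so that it can run against an in-memory database. The users table and the get_dbh and name_for helpers are hypothetical names made up for the illustration.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; a real application would use its own DSN.
my $dsn = 'dbi:SQLite:dbname=:memory:';

sub get_dbh {
    # connect_cached() returns the same handle for identical arguments,
    # so repeated calls skip the connection setup.
    return DBI->connect_cached($dsn, '', '',
        { RaiseError => 1, PrintError => 0 });
}

# Hypothetical table and rows, just to have something to query.
my $dbh = get_dbh();
$dbh->do('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$dbh->do('INSERT INTO users (id, name) VALUES (?, ?)', undef, $_, "user$_")
    for 1 .. 3;

sub name_for {
    my ($id) = @_;
    # prepare_cached() hands back the already-prepared statement on later
    # calls; the ? placeholder lets one statement serve every id.
    my $sth = get_dbh()->prepare_cached('SELECT name FROM users WHERE id = ?');
    $sth->execute($id);
    my ($name) = $sth->fetchrow_array;
    $sth->finish;    # mark the cached handle as no longer active
    return $name;
}

print name_for($_), "\n" for 1 .. 3;
```

Calling finish() matters with prepare_cached(): DBI warns if a cached statement handle is requested again while it is still active.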

Page 6: Top 10 Perl Performance Tips

2. Use DBI Efficiently

● Use bind_cols() in a fetch() loop for most efficient retrieval.
  ○ Less copying is faster.
  ○ Alternatively, fetchrow_arrayref()
● prepare() and then many execute() calls is faster than do()
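
The fetch loop described above might look like the following sketch. It again assumes DBD::SQLite and a hypothetical users table. The insert statement is prepared once and executed several times, and the select uses bind_cols() so that fetch() updates two variables in place.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is installed; the schema is made up for the example.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, PrintError => 0 });
$dbh->do('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

# prepare() once, execute() many times, instead of calling do() per row.
my $insert = $dbh->prepare('INSERT INTO users (id, name) VALUES (?, ?)');
$insert->execute($_, "user$_") for 1 .. 3;

my $select = $dbh->prepare('SELECT id, name FROM users ORDER BY id');
$select->execute;

# bind_cols() associates each column with a variable; fetch() then
# refreshes those variables instead of returning a new list per row.
my ($id, $name);
$select->bind_cols(\$id, \$name);

my @seen;
while ($select->fetch) {
    push @seen, "$id=$name";
}
print join(', ', @seen), "\n";
```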

Page 7: Top 10 Perl Performance Tips

2. Use DBI Efficiently

● Turn off AutoCommit for batch changes
  ○ Committing every thousand rows or so saves work for your database
● Use your database's bulk loader when possible
  ○ Writing rows to CSV and using MySQL's LOAD DATA INFILE crushes the fastest DBI code
  ○ 10X speedup is not unusual
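
A sketch of the batched-commit pattern, assuming DBD::SQLite and a hypothetical log table. The batch size of 1,000 follows the rule of thumb on the slide and is worth tuning for a real database.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is installed; the "log" table is hypothetical.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, PrintError => 0, AutoCommit => 0 });
$dbh->do('CREATE TABLE log (id INTEGER PRIMARY KEY, msg TEXT)');
$dbh->commit;

my $sth        = $dbh->prepare('INSERT INTO log (id, msg) VALUES (?, ?)');
my $batch_size = 1_000;
my $commits    = 0;

for my $i (1 .. 2_500) {
    $sth->execute($i, "message $i");

    # Commit once per full batch instead of once per row.
    if ($i % $batch_size == 0) {
        $dbh->commit;
        $commits++;
    }
}

# Flush the final partial batch.
$dbh->commit;
$commits++;

my ($count) = $dbh->selectrow_array('SELECT COUNT(*) FROM log');
print "$count rows in $commits commits\n";

$dbh->commit;
$dbh->disconnect;
```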

Page 8: Top 10 Perl Performance Tips

2. Use DBI Efficiently

● Use ORMs Wisely
  ○ Consider using straight DBI for the most performance-sensitive sections
    ■ Removing a layer means fewer method calls and faster code
  ○ Write report queries by hand if they seem slow
    ■ Optimizer hints and choices about SQL variations are beyond the scope of ORMs but make a huge difference for this kind of query

Page 9: Top 10 Perl Performance Tips

3. Choose the Fastest Hash Storage

● memcached is not the fastest option for a local cache
  ○ BerkeleyDB (not DB_File!) and Cache::FastMmap are about twice as fast
● CHI abstracts the storage layer
  ○ Useful if you think network strategy may change later

Page 10: Top 10 Perl Performance Tips

3. Choose the Fastest Hash Storage

Cache                      Get time   Set time   Run time
CHI::Driver::Memory        0.03ms     0.05ms     0.35s
BerkeleyDB                 0.05ms     0.17ms     0.57s
Cache::FastMmap            0.06ms     0.09ms     0.62s
CHI::Driver::File          0.10ms     0.26ms     1.11s
Cache::Memcached::Fast     0.12ms     0.15ms     1.23s
Memcached::libmemcached    0.14ms     0.16ms     1.40s
CHI::Driver::DBI SQLite    0.11ms     1.94ms     2.05s
Cache::Memcached           0.29ms     0.21ms     2.88s
CHI::Driver::DBI MySQL     0.45ms     0.33ms     4.41s

Page 11: Top 10 Perl Performance Tips

4. Generate Code and Compile to a Subroutine

● This is how most templating tools work.
● Remove the cost of things that won't change for a while
  ○ Skip re-parsing templates
  ○ Skip large groups of conditionals
  ○ Choose architecture-specific code

    my %subs;
    my $thing = 'World';    # interpolated once, when the code is generated
    my $code = qq{print "Hello $thing\n";};
    $subs{'hello'} = eval "sub { $code }";    # compile the generated code once
    $subs{'hello'}->();                       # later calls only run it
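
To make the templating case concrete, here is a small sketch of compiling a template into a subroutine. The [% key %] syntax and the compile_template helper are invented for this example and are not the API of any real templating module. The template is parsed once, and each render only runs the generated sub.

```perl
use strict;
use warnings;

# Hypothetical mini-template syntax: [% key %] is replaced by $vars->{key}.
sub compile_template {
    my ($template) = @_;

    # split with a capture group alternates literal text (even indexes)
    # and key names (odd indexes).
    my @chunks = split /\[%\s*(\w+)\s*%\]/, $template;

    my @parts;
    for my $i (0 .. $#chunks) {
        if ($i % 2) {
            # Key name: generate a hash lookup on the first argument.
            push @parts, "\$_[0]{'$chunks[$i]'}";
        }
        elsif (length $chunks[$i]) {
            # Literal text: escape it for a single-quoted string.
            (my $literal = $chunks[$i]) =~ s/([\\'])/\\$1/g;
            push @parts, "'$literal'";
        }
    }

    # Generate the source once and compile it to a subroutine.
    my $code = 'sub { return join(q{}, ' . join(', ', @parts) . ') }';
    my $sub  = eval $code or die "compile failed: $@";
    return $sub;
}

my $render = compile_template('Hello [% name %], you have [% count %] new messages.');
print $render->({ name => 'Perrin', count => 3 }), "\n";
```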

Page 12: Top 10 Perl Performance Tips

5. Sling Text Efficiently

● Slurp files when possible.

    my $text = do { local $/; <$fh>; };

● Seems obvious, but I still see people doing this:

    my @lines = <$fh>;
    my $text = join('', @lines);

● Consider memory with huge files.

Page 13: Top 10 Perl Performance Tips

5. Sling Text Efficiently

● Use a "sliding window" to search very large files.
  ○ Too big to slurp, but line-by-line is slow.
  ○ Chunks of 8K or 16K are much faster, but require bookkeeping code.
  ○ http://www.perlmonks.org/?node_id=128925
● Use the cheapest string tests you can get away with.
  ○ index() beats a regex when you just want to know if a string contains another string
● Use a fast CSV parser
  ○ Text::CSV_XS is much faster than the regexes you copied from that web page.
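
The bookkeeping for a sliding window mostly comes down to keeping a short tail of each chunk, so that a match straddling a chunk boundary is still found. The sketch below is one possible version and is not the code from the linked PerlMonks node. The find_in_file helper is a hypothetical name, and the demo file is built so that the search string crosses the first 8K boundary.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Returns the byte offset of the first occurrence of $needle in the file,
# or -1 if it is absent. Reads fixed-size chunks and keeps the last
# length($needle) - 1 bytes between reads.
sub find_in_file {
    my ($path, $needle, $chunk_size) = @_;
    $chunk_size ||= 16 * 1024;
    my $keep = length($needle) - 1;

    open my $fh, '<:raw', $path or die "Cannot open $path: $!";

    my $window = '';
    my $base   = 0;    # file offset of the first byte in $window
    while (read($fh, my $chunk, $chunk_size)) {
        $window .= $chunk;

        # index() is the cheap test: no regex needed for a fixed string.
        my $pos = index($window, $needle);
        return $base + $pos if $pos >= 0;

        # Slide: drop everything except the tail that could start a match.
        my $drop = length($window) - $keep;
        if ($drop > 0) {
            substr($window, 0, $drop, '');
            $base += $drop;
        }
    }
    return -1;
}

# Demo file: the needle starts at byte 8190 and crosses the 8192-byte boundary.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} ('x' x 8_190) . 'NEEDLE' . ('y' x 20_000);
close $tmp_fh or die "Cannot close $tmp_path: $!";

my $offset = find_in_file($tmp_path, 'NEEDLE', 8 * 1024);
print "found at byte $offset\n";
```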

Page 14: Top 10 Perl Performance Tips

6. Replace LWP With Something Faster

● LWP is amazing, but modules built on C libraries tend to be faster.
  ○ LWP::Curl
  ○ HTTP::Lite
  ○ Maybe HTTP::Async for parallel

LWP            32.8/s
HTTP::Async    64.5/s
HTTP::Lite     200/s
LWP::Curl      1000/s

Page 15: Top 10 Perl Performance Tips

7. Use a Fast Serializer

● Data::Dumper is great for debugging, but slow for serialization.

● JSON::XS is the new speed king, and is human-readable and cross-language.

● Storable handles more and is second-best in speed.
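
As a small illustration, here is a round trip through Storable, which ships with Perl. The data structure is made up for the example. JSON::XS is a CPAN install and is used the same way through its encode_json and decode_json functions.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# Hypothetical data structure to serialize.
my $data = {
    name   => 'example',
    tags   => [ 'a', 'b' ],
    nested => { count => 3 },
};

# nfreeze() serializes to a string in network byte order, so the result
# can be read back on a machine with a different architecture.
my $frozen = nfreeze($data);

# thaw() rebuilds an independent copy of the structure.
my $copy = thaw($frozen);

print "$copy->{name} $copy->{tags}[1] $copy->{nested}{count}\n";
```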

Page 16: Top 10 Perl Performance Tips

7. Use a Fast Serializer

YAML 84.7/s

XML::Simple 800/s

Data::Dumper 2143/s

FreezeThaw 2635/s

YAML::Syck 4307/s

JSON::Syck 4654/s

Storable 9774/s

JSON::XS 41473/s

Page 17: Top 10 Perl Performance Tips

8. Avoid Startup Costs

● Use a daemon to run code persistently
  ○ Skip the costs of compiling
  ○ Cache data
  ○ Open connections ahead of time
● mod_perl, FastCGI, Plack, etc. for web
● PPerl for command-line
  ○ Or hit your web server with lwp-get

Page 18: Top 10 Perl Performance Tips

9. Sometimes You Have to Get Crazy

● Use the @_ array directly to avoid copying

    sub add_to_sql {
        my $sqlbase = shift;    # hashref
        my ($name, $value) = @_;
        if ($value) {
            push(@{ $sqlbase->{'names'} }, $name);
            push(@{ $sqlbase->{'values'} }, $value);
        }
        return $sqlbase;
    }

Page 19: Top 10 Perl Performance Tips

9. Sometimes You Have to Get Crazy

    sub add_to_sql {
        # takes 3 params: hashref, name, and value
        return if not $_[2];

        push(@{ $_[0]->{'names'} }, $_[1]);
        push(@{ $_[0]->{'values'} }, $_[2]);
    }

● 40% faster than original
● More than 40% harder to read

Page 20: Top 10 Perl Performance Tips

10. Consider Compiling Your Own Perl

● Compiling without threads can be good for a free 15% or so.
● No code changes needed!
● Has maintenance costs.

Page 21: Top 10 Perl Performance Tips

Resources

Tim Bunce's Advanced DBI slides:
http://www.slideshare.net/Tim.Bunce/dbi-advanced-tutorial-2007

Also see Tim's NYTProf slides:
http://www.slideshare.net/Tim.Bunce/develnytprof-v4-at-oscon-201007

man perlperf

Programming Perl appendix on performance

Page 22: Top 10 Perl Performance Tips

Thank you!

Slides will be available on the conference website

Page 23: Top 10 Perl Performance Tips

Avoid tie()

● Slower than method calls!
● PITA to debug too.

Page 24: Top 10 Perl Performance Tips

Use a Fast Sort

● For sorting on derived keys, consider a GRT sort.
  ○ Faster than Schwartzian Transform
  ○ Use Sort::Maker to build it.
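
For readers who have not seen one, a GRT (Guttman-Rosler Transform) sort packs the derived key into a fixed-width string prefix, sorts with Perl's default string comparison, and then strips the prefix. The sketch below is written by hand rather than generated with Sort::Maker. It sorts a made-up word list by length, with ties broken alphabetically.

```perl
use strict;
use warnings;

# Hypothetical input: sort these words by length (the derived key).
my @words = qw(pear fig banana kiwi apple);

my @sorted =
    map  { substr($_, 4) }                 # 3. strip the 4-byte key prefix
    sort                                   # 2. default string sort, no comparison sub
    map  { pack('N', length($_)) . $_ }    # 1. prefix a big-endian 32-bit key
    @words;

print "@sorted\n";    # prints: fig kiwi pear apple banana
```

The speed comes from step 2: with no comparison block, sort never calls back into Perl code for each comparison. The key has to be encoded so that plain string order matches the intended order, which is why the length is packed as a fixed-width big-endian integer.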