Wringing Performance out of Perl

Nov 15, 2014

Leonard Budney

I gave this lightning talk at YAPC 2011. My company uses Perl in a variety of products, some of which have serious performance requirements. Here I give a quick overview of some of the tricks we use to squeeze extra performance out of Perl.
Transcript
Page 1: Wringing Performance out of Perl
Page 2: Wringing Performance out of Perl

Wringing Performance out of Perl

Page 3: Wringing Performance out of Perl

Grant Street Group

• Began as a financial advisor group

Page 4: Wringing Performance out of Perl

Grant Street Group

• Discovered the Internet in 1997

Page 5: Wringing Performance out of Perl

Grant Street Group

• Online Auctions of Property Tax Liens
• Web-Based billing system for tax collectors
• Conversion of legacy tax-collector databases
• Online license / vehicle tag renewals
• Online payment processing
• Auctions of all types of bonds
• And lots, lots more!

Page 6: Wringing Performance out of Perl

Tax Lien Auctions

Page 7: Wringing Performance out of Perl

Tax Lien Auctions

• Absolute feeding frenzy
  – Our bidders threatened to exhaust TIN numbers
  – 20 million bidders in 2011
  – More than 30 billion bids altogether
  – Average was a 500,000-way tie
  – About 2,000 auctions closing simultaneously

Page 8: Wringing Performance out of Perl

Tax Lien Auctions

• How do we award auctions performantly?
  – Random tie-breaking with Crypt::Random
  – Random row-ID plus MySQL = S L O W
  – Turns out we can do it much faster in Perl
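The tie-breaking step can be sketched in a few lines of Perl. This is a minimal illustration, not Grant Street Group's code: the bid layout (`bidder`, `rate`), the rule that the lowest rate wins, and the `pick_winner` helper are all assumptions, and core `rand` stands in for Crypt::Random so the sketch has no non-core dependencies. It makes one pass over the bids and uses reservoir sampling so that every tied bidder is equally likely to win.

```perl
use strict;
use warnings;

# Hypothetical helper: lowest rate wins; ties are broken uniformly at
# random in a single pass (reservoir sampling with a reservoir of one).
# $rand is a code ref returning a float in [0, 1); defaults to core rand,
# which is a stand-in here for a stronger source such as Crypt::Random.
sub pick_winner {
    my ($bids, $rand) = @_;
    $rand ||= sub { rand() };

    my ($winner, $tied) = (undef, 0);
    for my $bid (@$bids) {
        if (!defined $winner || $bid->{rate} < $winner->{rate}) {
            # Strictly better bid: it becomes the sole candidate.
            ($winner, $tied) = ($bid, 1);
        }
        elsif ($bid->{rate} == $winner->{rate}) {
            # The k-th tied bid replaces the candidate with probability 1/k.
            $tied++;
            $winner = $bid if $rand->() * $tied < 1;
        }
    }
    return $winner;
}

# Made-up bids: A, B and D tie at the best rate; C never wins.
my @bids = (
    { bidder => 'A', rate => 0.25 },
    { bidder => 'B', rate => 0.25 },
    { bidder => 'C', rate => 5.00 },
    { bidder => 'D', rate => 0.25 },
);
my $winner = pick_winner(\@bids);
print "Winner: $winner->{bidder}\n";
```

Because the pass keeps only the current candidate and a tie count, memory use stays constant no matter how many bidders are tied, and no sort or random row lookup in the database is needed.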

Page 9: Wringing Performance out of Perl

Tax Lien Auctions

• Net result: auction closing takes 20 seconds
  – Breaking 2,000 ties, each 500,000-way
  – Stress testing indicates we can scale by 4x
  – The IRS definitely cannot scale by 4x

Page 10: Wringing Performance out of Perl

Property Tax Online Payments

Page 11: Wringing Performance out of Perl

Property Tax Online Payments

• Florida residents can pay their property tax online
• Hosted, customized sites per county
• Largest counties have ~1,000,000 parcels
• Users are typical Florida residents

Page 17: Wringing Performance out of Perl

Property Tax Online Payments

• Backend is MySQL and Sphinx
• Lightning-fast searches with Perl
  – Mapping IDs to table, column, PK
  – Parsing SHOW STATUS LIKE 'sphinx%'
• Lots of useful metadata!
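Parsing the status output might look roughly like the sketch below. It assumes the rows have the shape the SphinxSE storage engine documents for its status variables (`sphinx_total`, `sphinx_total_found`, `sphinx_time`, `sphinx_word_count`, and `sphinx_words` as space-separated `word:docs:hits` entries). The `parse_sphinx_status` helper is hypothetical and the sample values are made up; the talk does not show the real code.

```perl
use strict;
use warnings;

# Hypothetical helper: turn [Variable_name, Value] rows into a hash,
# expanding sphinx_words ("word:docs:hits ...") into per-word stats.
sub parse_sphinx_status {
    my ($rows) = @_;
    my %meta;
    for my $row (@$rows) {
        my ($name, $value) = @$row;
        next unless $name =~ /^sphinx_(\w+)$/i;
        my $key = lc $1;
        if ($key eq 'words') {
            for my $entry (split ' ', $value) {
                my ($word, $docs, $hits) = split /:/, $entry;
                $meta{words}{$word} = { docs => $docs, hits => $hits };
            }
        }
        else {
            $meta{$key} = $value;
        }
    }
    return \%meta;
}

# Illustrative rows, shaped like what DBI's
#   $dbh->selectall_arrayref(q{SHOW STATUS LIKE 'sphinx%'})
# would return; the values are invented for this sketch.
my $rows = [
    [ sphinx_total       => 25 ],
    [ sphinx_total_found => 25 ],
    [ sphinx_time        => 126 ],
    [ sphinx_word_count  => 2 ],
    [ sphinx_words       => 'sphinx:591:1256 soft:11076:15945' ],
];

my $meta = parse_sphinx_status($rows);
print "Returned $meta->{total} of $meta->{total_found} matches\n";
for my $word (sort keys %{ $meta->{words} }) {
    print "$word: $meta->{words}{$word}{docs} docs, $meta->{words}{$word}{hits} hits\n";
}
```

The per-word document and hit counts are the kind of metadata a search page can show without running a second query.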

Page 18: Wringing Performance out of Perl

Property Tax Online Payments

• Net results:
  – Sub-second turnaround times
  – 9 minute average time on site by payers
  – 4 minute average time on site overall

Page 19: Wringing Performance out of Perl

Customer Data Conversion

Page 20: Wringing Performance out of Perl

Customer Data Conversion

• Largest county in FL is a customer
  – Population ~2.4M people
  – Tax roll of ~900K parcels
  – History of ~5.6M bills across 6 years

• Full database is large (by our standards)
  – Data files are ~30-50GB
  – Full conversion is ~160 hours, using Perl
  – Might be ~8 hours using pure SQL

Page 21: Wringing Performance out of Perl

Customer Data Conversion

• Problem is we can’t use pure SQL
  – Ridiculous amounts of business logic
  – Utterly different data models

• We’re a Perl shop; Perl is our hammer

Page 22: Wringing Performance out of Perl

Customer Data Conversion

• Hugely parallel data conversion
  – Subdivide conversion into smaller steps
  – Build hash of dependencies between steps
  – Construct DAG of work units in MongoDB

• Distribute the actual work
  – Run lots of Perl worker processes
  – Workers grab ready work units
  – Perform the work unit sequentially
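The dependency hash and the "grab ready work units" rule can be sketched in plain Perl. This is an in-memory illustration under stated assumptions: the step names and the `ready_units` helper are hypothetical, a single loop stands in for the pool of worker processes, and ordinary hashes stand in for the MongoDB collection that holds the DAG in the real system.

```perl
use strict;
use warnings;

# Hypothetical conversion steps; each lists the steps it depends on.
my %depends_on = (
    load_owners   => [],
    load_parcels  => [],
    load_bills    => ['load_parcels'],
    load_payments => [ 'load_bills', 'load_owners' ],
);

# A unit is ready when nobody has claimed or finished it and every
# one of its dependencies is done.
sub ready_units {
    my ($deps, $done, $claimed) = @_;
    my @ready = grep {
        my $unit = $_;
        !$done->{$unit}
            && !$claimed->{$unit}
            && !grep { !$done->{$_} } @{ $deps->{$unit} };
    } keys %$deps;
    return sort @ready;
}

# Simulate the workers: each pass claims every ready unit, "performs"
# it, and marks it done, which unlocks the next wave of units.
my (%done, %claimed, @waves);
while (my @ready = ready_units(\%depends_on, \%done, \%claimed)) {
    push @waves, [@ready];
    $claimed{$_} = 1 for @ready;
    $done{$_}    = 1 for @ready;
}

print join(', ', @$_), "\n" for @waves;
# Prints:
#   load_owners, load_parcels
#   load_bills
#   load_payments
```

With real worker processes, claiming a unit has to be an atomic operation in the shared store so that two workers never pick up the same unit. Keeping the done and claimed state in that store is also what makes it possible to resume an interrupted load.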

Page 23: Wringing Performance out of Perl

Customer Data Conversion

• The end result
  – Total conversion time ~3 hours with 80 workers
  – Nightly reloads now very practical
  – Able to resume incomplete loads
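As a rough check on those figures, the speedup and per-worker efficiency can be worked out directly. This assumes the ~160 hour figure quoted earlier is the single-process baseline for the same conversion, which the slides imply but do not state; both inputs are approximate, so the results are too.

```perl
use strict;
use warnings;

# Figures from the slides; all approximate.
my ($serial_hours, $parallel_hours, $workers) = (160, 3, 80);

my $speedup    = $serial_hours / $parallel_hours;   # how many times faster
my $efficiency = $speedup / $workers;               # fraction of ideal scaling

my $summary = sprintf "speedup ~%.0fx, efficiency ~%.0f%%",
    $speedup, 100 * $efficiency;
print "$summary\n";   # speedup ~53x, efficiency ~67%
```

Under that assumption, about a third of the ideal 80x is lost, which is plausible for work units that must wait on their dependencies in the DAG.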

Page 24: Wringing Performance out of Perl

We’re Hiring Telecommuters