Pg nordic-day-2014-2 tb-enough

Post on 10-May-2015


Category:

Technology


DESCRIPTION

Online classified web site Leboncoin.fr is one of the success stories of the French Web. A third of the total internet population in France uses the site each month. The growth has been spectacular and swift, and was made possible by a robust and performant software platform. At the heart of the platform is a large PostgreSQL infrastructure, part of it running on some of the largest PC-class hardware available. In this presentation, we will show how we have grown our infrastructure. In particular, the amazing vertical scalability of PG will be showcased with hard numbers (IOPS, transactions/second, etc.). We will also cover some of the hard lessons we have learned along the way, including near-disasters. Finally, we will look into how innovative features from the PostgreSQL ecosystem enable new approaches to our scalability challenge.

Transcript

2TB of RAM ought to be enough for anybody

PG Nordic Day 2014

2

The Presenter

Renaud Bruyeron - @brew_your_own

Paris, France

3

The Presenter – short bio

• 1998-1999: W3C @ MIT

• 1999-2013: FullSIX

  • Top 3 Web agency in FR

  • Strong focus on PHP, Java, and Oracle…

• 2013-present: CTO of

Schibsted Classified Media

From 1 to 30+ Countries in 7 years

4

5

6

Templated deployment in 30 countries w/ shared technology

• Technology originally inherited from Blocket.se

• Has since evolved to power all 30 sites

• Focus on performance and ease of local modifications

[Diagram: platform stack made of PostgreSQL, Middleware (data access, business rules), Index, Search, and a Presentation layer (APIs, Web). Implementation languages labelled on the diagram: C, PHP, C/PHP, PL/SQL]

7

PostgreSQL in the SCM Platform

100+ servers running PostgreSQL

8TB of data

50+ million classified ads

8

Schibsted Classified Media & PostgreSQL

married…and in love

#1 Classified Web site in France

9

10

History

Project initiated in 2006

Site launch: early 2007

Based on technology

From 1 to 230 people, from challenger to #1 in 7 years

11

12

13

14

Explosive growth…with a few bumps along the way!

15

Big Audience

250M page views / day

5M unique visitors / day

18M UV / month

That’s 1/3 of French internet population…

600,000+ new ads / day

25M live ads

#7 most visited Website in France
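To put the daily figures on this slide on the same scale as the per-second numbers quoted later in the talk, here is a small back-of-the-envelope sketch. It assumes traffic is spread evenly over 24 hours, which real traffic is not, so peak rates will be higher than these averages. The dictionary keys and the `per_second` helper are names made up for this illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

# Daily figures as quoted on the slide.
daily_figures = {
    "page views": 250_000_000,
    "unique visitors": 5_000_000,
    "new ads": 600_000,
}


def per_second(daily_count: int) -> float:
    """Average rate per second, assuming an even spread over the day."""
    return daily_count / SECONDS_PER_DAY


for name, count in daily_figures.items():
    print(f"{name}: ~{per_second(count):,.1f} per second on average")
```

On these assumptions the site averages close to 2,900 page views and about 7 new ads every second.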

16

Big Ops

300+ servers in 2 DCs

20 servers hosting PG databases (in production)

17

Built on SCM Technology

[Diagram: PostgreSQL, Middleware (data access, business rules), Index, Search, Presentation layer (APIs, Web)]

18

Built on SCM Technology

[Diagram: Website Data on a Master, a Slave (hot standby), a Slave (short queries) and a Slave (long queries), with Middleware (data access, business rules), Index, Search and Presentation layer (APIs, Web)]

19

Built on SCM Technology

[Diagram: Website Data on a Master, a Slave (hot standby), a Slave (short queries) and a Slave (long queries), plus +15 support DBs, with Middleware (data access, business rules), Index, Search and Presentation layer (APIs, Web)]

20

We use PostgreSQL everywhere!

[Diagram: Website Data (Master, Slave (hot standby), Slave (short queries), Slave (long queries)), +15 support DBs, and Backoffice & Analytics (ODS, OLAP, BI & CRM), with Middleware (data access, business rules), Index, Search and Presentation layer (APIs, Web)]

21

We try to limit writes!

• Index/Search acting as a structured cache

• Master DB workload = 70% writes

• Slaves used to offload read queries

• Main database = 6TB on disk…

• +4TB archived away…

• 20K LOCs of PL/SQL

[Diagram: Website Data on a Master, a Slave (hot standby), a Slave (short queries) and a Slave (long queries), plus +15 support DBs, Index and Search]
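The slide describes slaves that offload read queries from a write-heavy master, with separate slaves for short and long queries. A minimal sketch of that kind of routing follows. It is an illustration only, not the platform's actual middleware (which the talk does not show): the `QueryRouter` class, its parameters and the host names are all hypothetical, and the hot standby is assumed to be reserved for failover and therefore left out of the read pool.

```python
import itertools


class QueryRouter:
    """Send writes to the master and reads to a slave chosen by query class."""

    def __init__(self, master, short_query_slaves, long_query_slaves):
        self.master = master
        # Round-robin over each pool so several slaves can share the load.
        self._short = itertools.cycle(short_query_slaves)
        self._long = itertools.cycle(long_query_slaves)

    def route(self, is_write: bool, long_running: bool = False) -> str:
        """Return the (hypothetical) host name that should run the query."""
        if is_write:
            return self.master
        return next(self._long if long_running else self._short)


# Hypothetical host names, one per role shown on the slide.
router = QueryRouter(
    master="master",
    short_query_slaves=["slave-short"],
    long_query_slaves=["slave-long"],
)

print(router.route(is_write=True))
print(router.route(is_write=False))
print(router.route(is_write=False, long_running=True))
```

Keeping long-running queries on their own slave means a slow report cannot delay the short queries that serve the site.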

22

PostgreSQL works beautifully as a DW!

[Diagram: Website Data (Master, Slave (hot standby), Slave (short queries), Slave (long queries)) alongside Backoffice & Analytics (ODS, OLAP)]

ETL moving 500GB / day into the DW

2.2TB

Adding 6GB/day

23

Big Iron: HP DL980, 2TB of RAM, 64-80 cores

24

HTOP on the Master

25

Physical storage for the main PostgreSQL instances

[Diagram: two data centres (DC 1 and DC 2), each with two DL980 servers, Fusion-IO storage, and a 3Par V800 (SAN) attached over Fibre Channel. The four servers are one Master and three Slaves.]

26

Big Iron: 3Par V800 SAN

27

Big Iron: thin provisioning with mix of SSD and FC disks

28

Big Iron: high performance…

29

Bragging about it ;)

30

Despite the caches and the search engine, we get impressive workloads on the master DB

600 tx/sec

31

How did we get there?

2x DL980 w/ Fusion-IO

2x 3Par V800

2x DL980 w/ FC to the V800s

32

We did go through growing pains and near disasters

The « Big One »

33

Our own « worst day of our lives »: March 1st 2013 (1/2)

Master DB is slowing down dramatically

We find that Slony replication is the culprit

We don’t know what to do…

Until we find a solution on the net that involves cleaning up Slony metadata…

…(you know where this is going)…

We fumble. We notice. The slave is borked.

Rebuilding the Slave with Slony brings the Master down. Oh. God…

We take the Master off the stack, and start rebuilding the slave w/ Slony

…5 days later, we are done (!)…

34

Our own « worst day of our lives »: March 1st 2013 (2/2)

…but we are not out of the woods yet!...

Pent up demand is bringing the site down!

We decide to switch to native replication!

…but the network cards are maxed out by the replication data…

…triggering a kernel bug…

…(Murphy, could you please step out of the room?)…

We implement network card bonding, and start moving support tables off the main instance

…and we are done!

What’s Next?

35

36

Vertical Scalability has limits!

We are already running on the biggest HW money can buy

Past certain volume levels, execution plans can change radically

Huge instances are difficult to maintain & back up safely

Rebuilding the slave in March 2013 took a full 5 days…

Although we are maxing out the HW available to us, especially on writes/s, we are committed to PostgreSQL at the heart of our platform

37

Key ideas to take our platform to the next level

Sharding: Spread the reads & writes horizontally

Unbundling of the schema: Move parts of the schema (that can be decoupled) to other instances. Spread the workload

PGQ: Reduce application-level transaction interleaving by moving parts of the transactions to asynchronous workers
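Two of these ideas can be sketched in a few lines. The example below is an assumption-laden illustration, not the design presented in the talk: `shard_for` shows one common way to spread rows horizontally (a stable hash of a key, here a hypothetical ad id, modulo a hypothetical shard count of 4), and an in-memory `queue.Queue` with a worker thread stands in for PGQ, whose real API is not shown here. The event fields and the `update_counters` action are invented for the example.

```python
import queue
import threading
import zlib


def shard_for(ad_id: int, shard_count: int) -> int:
    """Map an ad id to a shard using a stable hash (CRC32 of its decimal form)."""
    return zlib.crc32(str(ad_id).encode("ascii")) % shard_count


# Deferred work: the request path only enqueues an event and returns;
# a separate worker applies it later, outside the user-facing transaction.
events = queue.Queue()
processed = []


def worker() -> None:
    while True:
        event = events.get()
        if event is None:  # sentinel value: no more work
            break
        shard = shard_for(event["ad_id"], shard_count=4)
        processed.append((shard, event["action"]))


thread = threading.Thread(target=worker)
thread.start()
for ad_id in (101, 102, 103):
    events.put({"ad_id": ad_id, "action": "update_counters"})
events.put(None)
thread.join()

print(processed)
```

A modulo-based scheme like this is simple but forces data to move when the shard count changes, and a real queue such as PGQ keeps events durable inside the database, which an in-memory queue does not.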

38
