Porcupine: A Highly Available Cluster-based Mail Service. Yasushi Saito, Brian Bershad, Hank Levy. University of Washington, Department of Computer Science and Engineering, Seattle, WA. http://porcupine.cs.washington.edu/

Dec 21, 2015


Porcupine: A Highly Available Cluster-based Mail Service

Yasushi Saito, Brian Bershad, Hank Levy

University of Washington Department of Computer Science and Engineering,

Seattle, WA

http://porcupine.cs.washington.edu/


Why Email?

• Mail is important
  • Real demand
• Mail is hard
  • Write intensive
  • Low locality
• Mail is easy
  • Well-defined API
  • Large parallelism
  • Weak consistency


Goals

Use commodity hardware to build a large, scalable mail service

Three facets of scalability:
• Performance: Linear increase with cluster size
• Manageability: React to changes automatically
• Availability: Survive failures gracefully


Conventional Mail Solution

Static partitioning
• Performance problems: No dynamic load balancing
• Manageability problems: Manual data partition decision
• Availability problems: Limited fault tolerance

[Diagram: SMTP/IMAP/POP servers in front of NFS servers holding Bob’s, Ann’s, Joe’s, and Suzy’s mboxes]


Presentation Outline

• Overview
• Porcupine Architecture
  • Key concepts and techniques
  • Basic operations and data structures
  • Advantages
• Challenges and solutions
• Conclusion


Key Techniques and Relationships

[Diagram: three layers and their relationships]
• Framework: Functional Homogeneity (“any node can perform any task”)
• Techniques: Automatic Reconfiguration, Load Balancing, Replication
• Goals: Manageability, Performance, Availability


Porcupine Architecture

[Diagram: Node A, Node B, ..., Node Z, with the following components shown per node]
• SMTP server, POP server, IMAP server
• Load Balancer
• Membership Manager
• Replication Manager
• User map, Mail map, User profile, Mailbox storage
• RPC


Porcupine Operations

[Diagram: mail arrives from the Internet via DNS-RR selection; nodes A, B, C, ... then exchange the messages below]

1. “send mail to bob”
2. Who manages bob? A
3. “Verify bob”
4. “OK, bob has msgs on C and D”
5. Pick the best nodes to store new msg: C
6. “Store msg”

Stages shown: Protocol handling, User lookup, Load Balancing, Message store
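
The six steps above can be sketched as a toy in-memory simulation. This is only an illustration under stated assumptions, not Porcupine's code: the `Cluster` class, its method names, the use of `zlib.crc32` as the hash, and the "prefer nodes that already hold mail, then lowest load" policy for step 5 are all hypothetical.

```python
import zlib


class Cluster:
    """Toy in-memory model of the delivery steps (not Porcupine's real code)."""

    def __init__(self, nodes, user_map):
        self.nodes = list(nodes)
        self.user_map = list(user_map)            # hash bucket -> managing node
        self.mail_map = {n: {} for n in nodes}    # manager -> {user: nodes holding mail}
        self.mailboxes = {n: {} for n in nodes}   # node -> {user: [messages]}
        self.pending = {n: 0 for n in nodes}      # assumed per-node load figure

    def manager_of(self, user):
        # Step 2: "Who manages bob?" (crc32 is an assumed hash function)
        bucket = zlib.crc32(user.encode()) % len(self.user_map)
        return self.user_map[bucket]

    def deliver(self, user, msg):
        manager = self.manager_of(user)
        # Steps 3-4: the manager verifies the user and reports where mail already lives
        holders = self.mail_map[manager].setdefault(user, set())
        # Step 5 (assumed policy): prefer nodes already holding mail, then lowest load
        candidates = holders or set(self.nodes)
        target = min(sorted(candidates), key=lambda n: self.pending[n])
        # Step 6: "Store msg" on the chosen node and record it in the mail map
        self.mailboxes[target].setdefault(user, []).append(msg)
        holders.add(target)
        return manager, target


cluster = Cluster("ABCD", "BCACABAC")
print(cluster.deliver("bob", "hello"))
```

Keeping the lookup (steps 2 to 4) separate from the placement decision (step 5) mirrors the stages named on the slide: user lookup, load balancing, and message store.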


Basic Data Structures

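[Diagram: “bob” -> Apply hash function -> User map -> Mail map/user info -> Mailbox storage, across nodes A, B, C]

User map (one copy on each of A, B, C): B C A C A B A C

Mail map/user info entries:
• bob: {A,C}
• ann: {B}
• suzy: {A,C}
• joe: {B}

Mailbox storage (grouped by node according to the mail map entries):
• A: Bob’s MSGs, Suzy’s MSGs
• B: Joe’s MSGs, Ann’s MSGs
• C: Bob’s MSGs, Suzy’s MSGs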
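
The relationship between the three structures can be sketched with the values from this slide. The user map and the mail map entries are taken from the slide; the hash function (`zlib.crc32`) and the helper names `manager_of` and `locate` are assumptions made for illustration.

```python
import zlib

USER_MAP = list("BCACABAC")   # user map from the slide, one entry per hash bucket

# Which nodes hold each user's messages, as listed on the slide.
HOLDERS = {"bob": {"A", "C"}, "ann": {"B"}, "suzy": {"A", "C"}, "joe": {"B"}}


def manager_of(user):
    """Hash the user name into a bucket; the bucket's entry names the managing node."""
    return USER_MAP[zlib.crc32(user.encode()) % len(USER_MAP)]


# Each manager keeps mail map entries only for the users it manages.
mail_map = {node: {} for node in "ABC"}
for user, nodes in HOLDERS.items():
    mail_map[manager_of(user)][user] = nodes


def locate(user):
    """Return (manager, nodes to contact for this user's messages)."""
    manager = manager_of(user)
    return manager, mail_map[manager][user]


print(locate("bob"))
```

Because the user map is small and identical on every node, any node can answer "who manages bob?" locally; only the mail map lookup needs a call to the manager.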


Porcupine Advantages

Advantages:
• Optimal resource utilization
• Automatic reconfiguration and task re-distribution upon node failure/recovery
• Fine-grain load balancing

Results:
• Better Availability
• Better Manageability
• Better Performance


Presentation Outline

• Overview
• Porcupine Architecture
• Challenges and solutions
  • Scaling performance
  • Handling failures and recoveries:
    • Automatic soft-state reconstruction
    • Hard-state replication
  • Load balancing
• Conclusion


Performance

Goals:
• Scale performance linearly with cluster size

Strategy: Avoid creating hot spots
• Partition data uniformly among nodes
• Fine-grain data partition


Measurement Environment

• 30 node cluster of not-quite-all-identical PCs
• 100Mb/s Ethernet + 1Gb/s hubs
• Linux 2.2.7
• 42,000 lines of C++ code
• Synthetic load
• Compare to sendmail+popd


How does Performance Scale?

[Figure: Messages/second (0 to 800) versus cluster size (0 to 30 nodes) for Porcupine and sendmail+popd. Annotations on the plot: 68m/day and 25m/day.]
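
The per-day annotations on this plot can be related to the messages/second axis by a unit conversion. The sketch below assumes that "m" stands for million messages, which the slide does not state explicitly; the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def messages_per_second(millions_per_day):
    """Convert an 'Nm/day' annotation to messages/second, reading 'm' as million."""
    return millions_per_day * 1_000_000 / SECONDS_PER_DAY


for label in (68, 25):
    print(f"{label}m/day is about {messages_per_second(label):.0f} messages/second")
```

Under that reading, 68m/day corresponds to roughly 787 messages/second and 25m/day to roughly 289 messages/second, both within the 0 to 800 range of the vertical axis.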


Availability

Goals:
• Maintain function after failures
• React quickly to changes regardless of cluster size
• Graceful performance degradation / improvement

Strategy: Two complementary mechanisms
• Hard state: email messages, user profile
  • Optimistic fine-grain replication
• Soft state: user map, mail map
  • Reconstruction after membership change


Soft-state Reconstruction

[Diagram: timeline of nodes A, B, and C across a membership change, in two steps]

1. Membership protocol, Usermap recomputation
2. Distributed disk scan

User map snapshots shown: B C A B A B A C; B A A B A B A B; A C A C A C A C
Mail map entries shown: bob: {A,C}; joe: {C}; suzy: {A,B}; ann: {B}
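
The two reconstruction steps can be sketched as follows. This is a minimal illustration, not the actual protocol: the bucket reassignment policy (give each orphaned bucket to the live node with the fewest buckets), the function names, and the `bucket_of` parameter are assumptions.

```python
def recompute_user_map(old_map, live_nodes):
    """Step 1: keep buckets owned by live nodes; reassign buckets of departed nodes."""
    live = sorted(live_nodes)
    counts = {n: 0 for n in live}
    for owner in old_map:
        if owner in counts:
            counts[owner] += 1
    new_map = []
    for owner in old_map:
        if owner not in counts:
            # Assumed policy: hand the orphaned bucket to the least-loaded live node.
            owner = min(live, key=lambda n: counts[n])
            counts[owner] += 1
        new_map.append(owner)
    return new_map


def rebuild_mail_map(user_map, mailboxes, bucket_of):
    """Step 2: each live node scans its disk and reports its users to their managers."""
    mail_map = {}
    for node, stored_users in mailboxes.items():
        for user in stored_users:
            manager = user_map[bucket_of(user)]
            mail_map.setdefault(manager, {}).setdefault(user, set()).add(node)
    return mail_map


new_map = recompute_user_map(list("BCABABAC"), {"A", "B"})
print("".join(new_map))
```

With this assumed policy, removing C from the first user map on the slide yields B A A B A B A B, which happens to match the second snapshot shown; the real protocol may choose assignments differently. Because both structures are soft state, they can be rebuilt entirely from the membership list and the mailboxes on disk.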


How does Porcupine React to Configuration Changes?

[Figure: Messages/second (300 to 700) versus time (0 to 800 seconds) for four runs: no failure, one node failure, three node failures, six node failures. Events marked on the time axis: nodes fail; new membership determined; nodes recover; new membership determined.]


Hard-state Replication

Goals:
• Keep serving hard state after failures
• Handle unusual failure modes

Strategy: Exploit Internet semantics
• Optimistic, eventually consistent replication
• Per-message, per-user-profile replication
• Efficient during normal operation
• Small window of inconsistency
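
One way to picture optimistic, eventually consistent, per-object replication is the sketch below. The slide does not describe the mechanism, so everything here is an assumption for illustration: the `Replica` and `Coordinator` classes, the counter used as a timestamp, and the last-writer-wins rule.

```python
import itertools


class Replica:
    def __init__(self, name):
        self.name = name
        self.store = {}  # object id -> (timestamp, value)

    def apply(self, obj_id, timestamp, value):
        """Apply an update unless a newer one is already stored (last writer wins)."""
        current = self.store.get(obj_id)
        if current is None or timestamp > current[0]:
            self.store[obj_id] = (timestamp, value)
            return True
        return False


class Coordinator:
    def __init__(self, replicas):
        self.replicas = replicas
        self.clock = itertools.count(1)  # stand-in for real timestamps
        self.pending = []                # updates not yet delivered to a replica

    def update(self, obj_id, value, reachable):
        """Push to reachable replicas now; queue the rest instead of blocking."""
        ts = next(self.clock)
        for r in self.replicas:
            if r.name in reachable:
                r.apply(obj_id, ts, value)
            else:
                self.pending.append((r, obj_id, ts, value))
        return ts

    def retry(self):
        """Deliver queued updates; replicas converge once this succeeds."""
        for r, obj_id, ts, value in self.pending:
            r.apply(obj_id, ts, value)
        self.pending = []


a, b = Replica("A"), Replica("B")
coord = Coordinator([a, b])
coord.update("msg1", "v1", reachable={"A", "B"})
coord.update("msg1", "v2", reachable={"A"})   # B is briefly out of date
coord.retry()
print(a.store == b.store)
```

The period between the second `update` and `retry` corresponds to the small window of inconsistency named on the slide; the write path never waits for an unreachable replica, which is what keeps normal operation cheap.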


How Efficient is Replication?

[Figure: Messages/second (0 to 800) versus cluster size (0 to 30 nodes) for Porcupine with no replication and Porcupine with replication=2. Annotations on the plot: 68m/day and 24m/day.]


How Efficient is Replication?

[Figure: Messages/second (0 to 800) versus cluster size (0 to 30 nodes) for Porcupine with no replication, Porcupine with replication=2, and Porcupine with replication=2 plus NVRAM. Annotations on the plot: 68m/day, 24m/day, and 33m/day.]


Load balancing: Deciding where to store messages

Goals:
• Handle skewed workload well
• Support hardware heterogeneity
• No voodoo parameter tuning

Strategy: Spread-based load balancing
• Spread: soft limit on # of nodes per mailbox
  • Large spread: better load balance
  • Small spread: better affinity
• Load balanced within spread
• Use # of pending I/O requests as the load measure
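
The spread idea can be sketched as a small selection function. This is an illustration under assumptions, not Porcupine's algorithm: the function name, its parameters, and the way the candidate set is widened (adding the least-loaded other nodes) are hypothetical; only the spread limit and the pending-I/O load measure come from the slide.

```python
def choose_node(holders, all_nodes, pending_io, spread):
    """Pick a node for a new message of one mailbox.

    holders:    nodes that already store messages for this mailbox
    pending_io: node -> number of pending I/O requests (the load measure)
    spread:     soft limit on the number of nodes per mailbox
    """
    candidates = sorted(holders)
    # Soft limit: existing holders always stay eligible; widen only up to `spread`.
    if len(candidates) < spread:
        others = sorted((n for n in all_nodes if n not in holders),
                        key=lambda n: (pending_io[n], n))
        candidates += others[: spread - len(candidates)]
    # Balance load within the candidate set.
    return min(candidates, key=lambda n: (pending_io[n], n))


load = {"A": 5, "B": 1, "C": 0, "D": 7, "E": 2}
print(choose_node({"A", "D"}, load.keys(), load, spread=2))  # stays with current holders
print(choose_node({"A", "D"}, load.keys(), load, spread=4))  # may pick a lighter node
```

With `spread=2` the message stays on a node that already holds the mailbox (better affinity); with `spread=4` the lightly loaded node C becomes eligible (better load balance), which is the trade-off the slide describes.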


How Well does Porcupine Support Heterogeneous Clusters?

[Figure: Throughput increase (0% to 30%) versus number of fast nodes (0%, 3%, 7%, 10% of total) for Spread=4 and Static. Annotations on the plot: +16.8m/day (+25%) and +0.5m/day (+0.8%).]


Conclusions

Fast, available, and manageable clusters can be built for write-intensive service

Key ideas can be extended beyond mail:
• Functional homogeneity
• Automatic reconfiguration
• Replication
• Load balancing


Ongoing Work

• More efficient membership protocol
• Extending Porcupine beyond mail: Usenet, BBS, Calendar, etc.
• More generic replication mechanism