1
Porcupine: A Highly Available Cluster-based Mail Service
Yasushi Saito, Brian Bershad, Hank Levy
University of Washington Department of Computer Science and Engineering,
Seattle, WA
http://porcupine.cs.washington.edu/
2
Why Email?
• Mail is important: real demand
• Mail is hard: write intensive, low locality
• Mail is easy: well-defined API, large parallelism, weak consistency
3
Goals
Use commodity hardware to build a large, scalable mail service
Three facets of scalability:
• Performance: linear increase with cluster size
• Manageability: react to changes automatically
• Availability: survive failures gracefully
4
Conventional Mail Solution
Static partitioning
• Performance problems: no dynamic load balancing
• Manageability problems: manual data-partitioning decisions
• Availability problems: limited fault tolerance
[Diagram: an SMTP/IMAP/POP front end statically mapped onto NFS servers, each holding a fixed set of mailboxes (Bob's, Ann's, Joe's, Suzy's).]
5
Presentation Outline
• Overview
• Porcupine Architecture: key concepts and techniques; basic operations and data structures; advantages
• Challenges and solutions
• Conclusion
6
Key Techniques and Relationships
Framework: functional homogeneity ("any node can perform any task")
Techniques: automatic reconfiguration, load balancing, replication
Goals: manageability, performance, availability
7
Porcupine Architecture
[Diagram: identical nodes A, B, ..., Z connected by RPC. Every node runs the same components: SMTP, POP, and IMAP servers, a load balancer, a replication manager, and a membership manager, together with the shared data structures: user map, mail map, user profile, and mailbox storage.]
8
Porcupine Operations
[Diagram: delivery flow across nodes A, B, and C; a code sketch follows.]
1. "Send mail to bob" arrives from the Internet; DNS-RR selection hands the connection to node B (protocol handling).
2. B consults its user map: who manages bob? Answer: A.
3. B asks A to "verify bob" (user lookup).
4. A replies: "OK, bob has msgs on C and D."
5. B's load balancer picks the best node to store the new msg: C.
6. B sends "store msg" to C (message store).
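Expressed as code, the delivery path is a short chain of these stages. Below is a minimal C++ sketch; every identifier (DeliverMail, LookupManager, and so on) is a hypothetical stand-in for Porcupine's internal interfaces, with the RPCs stubbed out.

    #include <string>
    #include <vector>

    // Hypothetical stand-ins; each body is a stub for an RPC.
    std::string LookupManager(const std::string& user) {
      return "A";  // step 2: hash(user) indexes the replicated user map
    }
    bool VerifyUser(const std::string& manager, const std::string& user) {
      return true;  // step 3: the managing node checks the user profile
    }
    std::vector<std::string> MailMapOf(const std::string& user) {
      return {"C", "D"};  // step 4: nodes already holding this mailbox
    }
    std::string PickBestNode(const std::vector<std::string>& spread) {
      return spread.front();  // step 5: load balancer picks within the spread
    }
    void StoreMessage(const std::string& node, const std::string& msg) {
      // step 6: write the message on the chosen node's mailbox storage
    }

    // Any node can run this; there is no distinguished front end.
    void DeliverMail(const std::string& user, const std::string& msg) {
      std::string manager = LookupManager(user);         // step 2
      if (!VerifyUser(manager, user)) return;            // step 3: bounce
      StoreMessage(PickBestNode(MailMapOf(user)), msg);  // steps 4-6
    }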
9
Basic Data Structures
[Diagram: the three structures, shown for nodes A, B, and C.]
• User map: apply a hash function to the user name ("bob"); the resulting bucket table (B C A C A B A C), replicated identically on every node, names the managing node.
• Mail map / user info: kept on the managing node, e.g. bob: {A,C}, ann: {B}, suzy: {A,C}, joe: {B}.
• Mailbox storage: each node stores message fragments; here A holds Bob's and Suzy's msgs, B holds Joe's and Ann's msgs, and C holds Bob's and Suzy's msgs.
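For concreteness, here is a small, compilable C++ sketch of the two lookup structures; UserMap and MailMap are illustrative names, not Porcupine's real types.

    #include <functional>
    #include <map>
    #include <set>
    #include <string>
    #include <vector>

    // User map: a small bucket table, identically replicated on every
    // node, mapping hash(user) to the managing node; the row
    // B C A C A B A C above is such a table with eight buckets.
    struct UserMap {
      std::vector<char> buckets;
      char ManagerOf(const std::string& user) const {
        return buckets[std::hash<std::string>{}(user) % buckets.size()];
      }
    };

    // Mail map: kept only on the managing node; lists the nodes that
    // hold fragments of each user's mailbox.
    using MailMap = std::map<std::string, std::set<char>>;

    int main() {
      UserMap um{{'B', 'C', 'A', 'C', 'A', 'B', 'A', 'C'}};
      MailMap mm = {{"bob", {'A', 'C'}}, {"ann", {'B'}}};
      char manager = um.ManagerOf("bob");  // same answer on every node
      (void)manager;
      (void)mm;
    }

Because the user map is tiny and identical everywhere, a membership change only requires recomputing the table, not moving user data.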
10
Porcupine Advantages
Advantages:
• Optimal resource utilization
• Automatic reconfiguration and task redistribution upon node failure/recovery
• Fine-grain load balancing
Results: better availability, better manageability, better performance
11
Presentation Outline
• Overview
• Porcupine Architecture
• Challenges and solutions: scaling performance; handling failures and recoveries (automatic soft-state reconstruction, hard-state replication); load balancing
• Conclusion
12
Performance
Goal: scale performance linearly with cluster size
Strategy: avoid creating hot spots
• Partition data uniformly among nodes
• Use fine-grain data partitioning
13
Measurement Environment
• 30-node cluster of not-quite-identical PCs
• 100 Mb/s Ethernet + 1 Gb/s hubs
• Linux 2.2.7
• 42,000 lines of C++ code
• Synthetic load
• Compared against sendmail+popd
14
How does Performance Scale?
[Graph: messages/second (0 to 800) vs. cluster size (0 to 30 nodes). Porcupine scales roughly linearly, reaching 68 million messages/day at 30 nodes; sendmail+popd reaches only 25 million messages/day.]
15
Availability
Goals:
• Maintain function after failures
• React quickly to changes regardless of cluster size
• Degrade and improve performance gracefully
Strategy: two complementary mechanisms
• Hard state (email messages, user profile): optimistic fine-grain replication
• Soft state (user map, mail map): reconstruction after membership change
16
Soft-state Reconstruction
[Diagram: a timeline on nodes A and B after a membership change; a code sketch follows.]
1. Membership protocol and user map recomputation: every node reruns the same computation, so the user map (e.g. B C A B A B A C) converges to a new assignment that excludes the failed node (e.g. B A A B A B A B).
2. Distributed disk scan: mail map entries that the failed node managed (suzy: {A,B}, ann: {B}) start out empty on their new managers; each node scans its local mailbox storage and reports what it holds, refilling them. Entries unaffected by the change (bob: {A,C}, joe: {C}) survive intact.
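A compressed C++ sketch of the two steps, using assumed helper names; the membership agreement protocol itself is out of scope here, and the round-robin recomputation rule is only an illustration of a deterministic assignment.

    #include <map>
    #include <set>
    #include <string>
    #include <vector>

    using NodeId = char;
    using MailMap = std::map<std::string, std::set<NodeId>>;

    // Step 1: after the membership protocol agrees on the live set,
    // every node reruns the same deterministic rule, so all nodes
    // converge on an identical bucket-to-node assignment.
    std::vector<NodeId> RecomputeUserMap(const std::vector<NodeId>& live,
                                         size_t buckets) {
      std::vector<NodeId> usermap(buckets);
      for (size_t i = 0; i < buckets; ++i)
        usermap[i] = live[i % live.size()];  // illustrative round-robin
      return usermap;
    }

    // Step 2: each node scans its local mailbox storage and reports
    // which users have mail there; the new manager merges the reports
    // to rebuild its mail map entries, e.g. suzy: {A}, then {A,B}.
    void MergeDiskScanReport(NodeId reporter,
                             const std::vector<std::string>& users_on_disk,
                             MailMap& mailmap) {
      for (const std::string& u : users_on_disk)
        mailmap[u].insert(reporter);
    }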
17
How does Porcupine React to Configuration Changes?
[Graph: messages/second (300 to 700) over time (0 to 800 seconds) for no failure, one node failure, three node failures, and six node failures. Annotations mark when nodes fail, when the new membership is determined, when nodes recover, and when membership is determined again; throughput dips at each change and recovers once the new membership is determined.]
18
Hard-state Replication
Goals:
• Keep serving hard state after failures
• Handle unusual failure modes
Strategy: exploit Internet semantics (see the sketch below)
• Optimistic, eventually consistent replication
• Per-message, per-user-profile replication
• Efficient during normal operation
• Small window of inconsistency
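A hedged C++ sketch of the update push, assuming a SendUpdate RPC and a per-message set of unacknowledged replicas; the real protocol carries more machinery than this.

    #include <iterator>
    #include <set>
    #include <string>

    struct ReplicatedMsg {
      std::string id, body;
      std::set<std::string> pending;  // replicas not yet known to hold it
    };

    // Hypothetical RPC; true means the peer applied and acknowledged.
    bool SendUpdate(const std::string& peer, const ReplicatedMsg& m) {
      return true;
    }

    // The initiating node applies the update locally first and answers
    // the client right away, then pushes to each replica until all
    // acknowledge. The gap between the local apply and the last ack is
    // the "small window of inconsistency"; a background retry loop
    // drains pending sets that survive a peer failure, which is what
    // makes the scheme eventually consistent.
    void PushUpdate(ReplicatedMsg& m) {
      for (auto it = m.pending.begin(); it != m.pending.end();)
        it = SendUpdate(*it, m) ? m.pending.erase(it) : std::next(it);
    }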
19
How Efficient is Replication?
[Graph: messages/second (0 to 800) vs. cluster size (0 to 30 nodes) for Porcupine with no replication and with replication=2. At 30 nodes, no replication reaches 68 million messages/day; replication=2 reaches 24 million messages/day.]
20
How Efficient is Replication?
[Graph: same axes as the previous slide, adding a third line for replication=2 with NVRAM. At 30 nodes: no replication, 68m/day; replication=2, 24m/day; replication=2 with NVRAM, 33m/day.]
21
Load balancing: Deciding where to store messages
Goals:
• Handle skewed workloads well
• Support hardware heterogeneity
• No voodoo parameter tuning
Strategy: spread-based load balancing (sketched below)
• Spread: a soft limit on the number of nodes per mailbox
• Large spread: better load balance; small spread: better affinity
• Load is balanced within the spread, using the number of pending I/O requests as the load measure
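The selection rule fits in a few lines. The sketch below is an assumed rendering (PickStorageNode is not Porcupine's identifier) of how the spread bounds the candidate set and pending I/O picks the winner.

    #include <algorithm>
    #include <string>
    #include <vector>

    struct Node {
      std::string name;
      int pending_io;  // the load measure: outstanding disk requests
    };

    // Candidates are nodes already holding the user's mailbox (plus,
    // if fewer than `spread`, fresh nodes may be considered; not
    // shown). Within the spread, the least-loaded node wins.
    const Node* PickStorageNode(const std::vector<const Node*>& candidates,
                                size_t spread) {
      size_t n = std::min(candidates.size(), spread);
      return *std::min_element(candidates.begin(), candidates.begin() + n,
                               [](const Node* a, const Node* b) {
                                 return a->pending_io < b->pending_io;
                               });
    }

Using pending I/O rather than, say, CPU load is what makes the scheme parameter-free across heterogeneous hardware: a faster disk simply drains its queue sooner and attracts more messages.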
22
How Well does Porcupine Support Heterogeneous Clusters?
[Graph: throughput increase (0% to 30%) vs. number of fast nodes (0% to 10% of total). With spread=4, throughput rises by 16.8m/day (+25%); with static partitioning, by only 0.5m/day (+0.8%).]
23
Conclusions
Fast, available, and manageable clusters can be built for write-intensive services.
Key ideas can be extended beyond mail:
• Functional homogeneity
• Automatic reconfiguration
• Replication
• Load balancing