Manageability, Availability and Performance in Porcupine: A
Highly Scalable, Cluster-based Mail Service
Yasushi Saito, Brian N. Bershad and Henry M. Levy
University of Washington
What is Porcupine?
• Highly scalable mail server
• “Cluster-based Internet mail server using SMTP”
• Why do we need another mail service?
• Conventional systems do not exploit the heterogeneity of the nodes.
• Conventional systems are not efficient.
• Conventional systems use legacy software.
Kalyan Boggavarapu Lehigh
University CSE 498
Disadvantages of Conventional Mail Servers
• Manageability:
    • Earlier systems must be configured manually.
    • The system has to be retuned for each node newly added to the distributed file system.
    • So a lot of work is involved whenever a node fails or a new node is added.
• Availability:
    • Availability depends on how well the system tolerates the loss of a node.
    • Conventional systems are less fault tolerant.
    • When a node fails, the users stored on that node temporarily cannot access their mail.
• Performance:
    • Performance does not grow in proportion to the number of nodes in the system.
    • No dynamic load balancing.
Goals
• Manageability
• Availability
• Performance
Billions of messages per day
System Overview
How Porcupine Achieves Its Goals
Key Data Structures
• Mailbox fragment
• Mail map
• User profile database
• User profile soft state (set of users)
• User map
• Cluster membership list
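A minimal sketch of how three of these structures could fit together, assuming the user map is a small table indexed by a hash of the user name and the mail map records which nodes hold a fragment of each mailbox. The node names, bucket count, hash function and helper names are illustrative assumptions, not Porcupine's actual code.

```python
import hashlib

NODES = ["A", "B", "C"]      # hypothetical cluster membership list
USER_MAP_BUCKETS = 8         # assumed table size, chosen only for illustration

# User map: bucket index -> node that manages the users hashing to that bucket.
user_map = [NODES[i % len(NODES)] for i in range(USER_MAP_BUCKETS)]

def bucket_of(user):
    """Hash a user name into a user-map bucket (the hash choice is an assumption)."""
    digest = hashlib.sha256(user.encode("utf-8")).digest()
    return digest[0] % USER_MAP_BUCKETS

def manager_of(user):
    """Node assumed to hold the user's profile soft state and mail map entry."""
    return user_map[bucket_of(user)]

# Mail map: user -> set of nodes holding a fragment of that user's mailbox.
mail_map = {"bob": {"A", "C"}, "suzy": {"A", "B"}}

# Mailbox fragments: (node, user) -> messages stored on that node.
fragments = {("A", "bob"): ["msg1"], ("C", "bob"): ["msg2"]}

def read_mailbox(user):
    """Collect a user's messages from every node listed in the mail map."""
    messages = []
    for node in sorted(mail_map.get(user, set())):
        messages.extend(fragments.get((node, user), []))
    return messages
```

Here `read_mailbox("bob")` gathers the fragments on A and C, which is why a mailbox can stay partly readable when one of its nodes is down.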
Data Structure Managers
A cluster of 2
Receiving a Message
Load Balancing
• Equal distribution of data among the nodes
• Identify the hot spots and divide the load accordingly
• Test bed
    • Systems: 30
    • Ethernet: 100 Mbps
    • OS: Linux 2.2.7
    • Mean message size: 4.7 KB; max 1 MB
    • Number of users: 5M
    • Authentication: no
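The two load-balancing bullets can be illustrated with a small sketch. It assumes each node's load is a single integer (for example, a count of pending operations) and that every new message goes to the least-loaded candidate. The load metric and the function names are assumptions for illustration, not the policy described on the slides.

```python
def pick_node(load, candidates):
    """Return the candidate with the lowest load; ties break by node name."""
    return min(candidates, key=lambda node: (load[node], node))

def place_messages(messages, load, candidates):
    """Assign each message to the currently least-loaded node, updating load as we go."""
    placement = {}
    for msg in messages:
        node = pick_node(load, candidates)
        placement[msg] = node
        load[node] += 1   # the chosen node now has one more pending message
    return placement

# Node A is a hot spot, so new messages are steered to B and C.
load = {"A": 5, "B": 0, "C": 2}
placement = place_messages(["m1", "m2", "m3", "m4"], load, ["A", "B", "C"])
```

In this run the hot spot A receives no new messages, while B and C absorb the load until they are level with each other.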
Manageability
Porcupine reconfigures automatically
• Without: drop in number of messages = 100 (approx.)
• With: drop in number of messages = 50 (approx.)
Availability
Mail map consistency
• C fails before the update
    • No problem: the message is replicated.
• C deletes all of Bob's messages (Bob is managed by A), but the update fails
    • No problem: A will delete the dangling pointers.
• A fails before the update
    • A new manager will apply the update later.
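The second case above (dangling pointers) can be sketched as follows. The sketch assumes the manager discovers a stale mail map entry when it next tries to read the user's mailbox and finds no fragment on the listed node. The timing of the cleanup and the function name are assumptions for illustration.

```python
def read_and_repair(user, mail_map, fragments):
    """Read every fragment listed for the user and drop mail map entries
    that point at a node holding no fragment (dangling pointers)."""
    messages = []
    # sorted() copies the set, so discarding entries inside the loop is safe.
    for node in sorted(mail_map.get(user, set())):
        fragment = fragments.get((node, user))
        if fragment:
            messages.extend(fragment)
        else:
            mail_map[user].discard(node)   # stale entry: node has no mail for user
    return messages

# C deleted Bob's messages, but A's mail map still lists C.
mail_map = {"bob": {"A", "C"}}
fragments = {("A", "bob"): ["m1"]}
messages = read_and_repair("bob", mail_map, fragments)
```

After the call, Bob's mail map entry lists only A, so the lost update has been repaired without any extra coordination.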
States of Replication
• Hard state
    • Passwords and user logins are written to permanent storage.
    • Data that must not be lost.
• Soft state
    • User-to-node mappings.
    • Can be reconstructed after a loss.
Hard State Replication
• Aim: consistency
• Type: per-message, per-user
• Effect: efficient during normal operation
[Figure: messages/second (0 to 800) versus cluster size (0 to 30) for Porcupine with no replication, with replication=2, and with replication=2 plus NVRAM]
Effect of Replication
Soft-state Reconstruction
1. Membership protocol and user map recomputation
2. Distributed disk scan
[Figure: timeline for nodes A, B and C showing each node's user map and mail map entries (bob: {A,C}, suzy: {A,B}, joe: {C}, ann: {B}); the user map is recomputed in step 1, and the empty entries for suzy and ann are refilled by the disk scan in step 2]
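The two steps above can be sketched as a toy, single-process simulation: once the user map has been recomputed, each node scans its own disk and reports every user it stores mail for to that user's manager, which rebuilds its mail map from the reports. The hash function, the data layout and the function names are assumptions for illustration.

```python
def bucket_of(user, num_buckets):
    """Toy deterministic hash of a user name (an assumption, not Porcupine's hash)."""
    return sum(user.encode("utf-8")) % num_buckets

def rebuild_mail_maps(fragments_on_disk, user_map):
    """Rebuild mail maps from disk contents.

    fragments_on_disk: node -> set of users with a mailbox fragment on that node.
    user_map: list mapping bucket index -> manager node (already recomputed).
    Returns: manager node -> {user -> set of nodes holding that user's mail}.
    """
    mail_maps = {}
    for node, users in fragments_on_disk.items():
        for user in users:
            manager = user_map[bucket_of(user, len(user_map))]
            mail_maps.setdefault(manager, {}).setdefault(user, set()).add(node)
    return mail_maps

# Disk contents chosen to match the mail map entries shown in the timeline.
fragments_on_disk = {
    "A": {"bob", "suzy"},
    "B": {"suzy", "ann"},
    "C": {"bob", "joe"},
}
user_map = ["B", "C", "A", "B", "A", "B", "A", "C"]
mail_maps = rebuild_mail_maps(fragments_on_disk, user_map)

# Merge the per-manager maps to inspect the overall result.
merged = {user: nodes for per_manager in mail_maps.values()
          for user, nodes in per_manager.items()}
```

Because the mail map is derived entirely from what is on disk, it never needs to be replicated or logged; losing it only costs the time of a rescan.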
Advantages of Porcupine
• Best use of resources
• Self-configuration
• Dynamic load balancing
• Result:
    • Geographically distributed cluster servers
    • Highly scalable
    • Fault tolerant
• Future work
    • Better membership protocol
    • Applying Porcupine to other applications, such as Usenet
Sources
• The porcupine figure in all slides is from http://www.bluebison.net/yosemite/porcupine.htm
• The diagrams in slides 17 and 19 are from the slides at http://www.hpl.hp.com/personal/Yasushi_Saito/pubs.html#publications