Building a Scalable Architecture

Intelligent People. Uncommon Ideas.
Building a Scalable Architecture for Web Apps Part I
(Lessons Learned @ Directi)
By Bhavin Turakhia
CEO, Directi
(http://www.directi.com | http://wiki.directi.com | http://careers.directi.com)
Licensed under Creative Commons Attribution Sharealike Noncommercial
Agenda

• Why is Scalability important
• Introduction to the Variables and Factors
• Building our own Scalable Architecture (in incremental steps)
   Vertical Scaling
   Vertical Partitioning
   Horizontal Scaling
   Horizontal Partitioning
   … etc
• Platform Selection Considerations
• Tips
Why is Scalability Important in a Web 2.0 World

• Viral marketing can result in instant successes
• RSS / Ajax / SOA
   Pull-based / polling model
   XML protocols – metadata often outweighs the actual data
   Number of requests grows exponentially with the user base
• RoR / Grails – the dynamic-language landscape is gaining popularity
• In the end you want to build a Web 2.0 app that can serve millions of users with ZERO downtime
The Variables

• Scalability – the number of users / sessions / transactions / operations the entire system can handle
• Performance – optimal utilization of resources
• Responsiveness – time taken per operation
• Availability – the probability of the application, or a portion of it, being available at any given point in time
• Downtime Impact – the impact of downtime of a server/service/resource: number of users affected, type of impact, etc.
• Cost
• Maintenance Effort

• The goal: keep scalability, availability, performance & responsiveness high; keep downtime impact, cost & maintenance effort low
The Factors
• Platform selection
• Hardware
• Application Design
• Database/Datastore Structure and Architecture
• Deployment Architecture
• Storage Architecture
• Abuse prevention
• Monitoring mechanisms
• … and more
Let's Start …

• We will now build an example architecture for an example app using the following iterative, incremental steps –
   Inspect the current architecture
   Identify scalability bottlenecks
   Identify SPOFs and availability issues
   Identify downtime impact risk zones
   Apply one of –
      • Vertical Scaling
      • Vertical Partitioning
      • Horizontal Scaling
      • Horizontal Partitioning
   Repeat the process
Step 1 – Let's Start …

[Diagram: a single node running both the App Server and the DB Server]
Step 2 – Vertical Scaling

[Diagram: the single App/DB node upgraded with additional CPUs and RAM]
Step 2 – Vertical Scaling

[Diagram: the single App/DB node scaled up with additional CPUs and RAM]

• Introduction
   Increasing the hardware resources without changing the number of nodes
   Referred to as “Scaling Up” the server
• Advantages
   Simple to implement
• Disadvantages
   Finite limit
   Hardware does not scale linearly (diminishing returns for each incremental unit)
   Requires downtime
   Increases downtime impact
   Incremental costs increase exponentially
Step 3 – Vertical Partitioning (Services)

[Diagram: App Server and DB Server deployed on separate nodes]

• Introduction
   Deploying each service on a separate node
• Positives
   Increases per-application availability
   Task-based specialization, optimization and tuning become possible
   Reduces context switching
   Simple to implement for out-of-band processes
   No changes to the App required
   Increases flexibility
• Negatives
   Sub-optimal resource utilization
   May not increase overall availability
   Finite scalability
Understanding Vertical Partitioning

• The term Vertical Partitioning denotes –
   An increase in the number of nodes by distributing the tasks/functions
   Each node (or cluster) performs a separate task
   Each node (or cluster) is different from the others
• Vertical Partitioning can be performed at various layers (App / Server / Data / Hardware etc.)
Step 4 – Horizontal Scaling (App Server)

[Diagram: a Load Balancer distributing requests across multiple App Servers, all backed by a single DB Server]

• Introduction
   Increasing the number of App Server nodes through load balancing
   Referred to as “Scaling Out” the App Server
Understanding Horizontal Scaling

• The term Horizontal Scaling denotes –
   An increase in the number of nodes by replicating the nodes
   Each node performs the same tasks
   Each node is identical
   Typically the collection of nodes may be known as a cluster (though the term cluster is often misused)
   Also referred to as “Scaling Out”
• Horizontal Scaling can be performed for any particular type of node (AppServer / DBServer etc.)
Load Balancer – Hardware vs Software

• Hardware load balancers are faster
• Software load balancers are more customizable
• With HTTP servers, load balancing is typically combined with HTTP accelerators
Load Balancer – Session Management

[Diagram: sticky sessions – User 1 and User 2 each pinned to a fixed App Server behind the Load Balancer]

• Sticky Sessions
   Requests for a given user are sent to a fixed App Server
   Observations
      • Asymmetrical load distribution (especially during downtimes)
      • Downtime impact – loss of session data
Load Balancer – Session Management

[Diagram: App Servers behind the Load Balancer all reading from and writing to a central Session Store]

• Central Session Store
   Introduces an SPOF
   An additional variable to manage
   Session reads and writes generate disk + network I/O
   Also known as a Shared Session Store Cluster
Load Balancer – Session Management

[Diagram: clustered session management – App Servers behind the Load Balancer replicating session state to one another]

• Clustered Session Management
   Easier to set up
   No SPOF
   Session reads are instantaneous
   Session writes generate network I/O
   Network I/O increases exponentially with the number of nodes
   In very rare circumstances a request may get stale session data –
      • The user's request reaches the subsequent node faster than the intra-node message
      • Intra-node communication fails
   AKA a Shared-nothing Cluster
Load Balancer – Session Management

[Diagram: sticky sessions in front of App Servers backed by a central Session Store]

• Sticky Sessions with Central Session Store
   Downtime does not cause loss of session data
   Session reads need not generate network I/O
• Sticky Sessions with Clustered Session Management
   No specific advantages
Load Balancer – Session Management

• Recommendation
   Use Clustered Session Management if you have –
      • A smaller number of App Servers
      • Fewer session writes
   Use a Central Session Store elsewhere (a minimal sketch of one follows)
   Use sticky sessions only if you have to
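To make the Central Session Store option concrete, here is a minimal sketch in Java backed by memcached via the spymemcached client. The class name, key scheme and 30-minute TTL are illustrative assumptions, not part of any specific framework.

```java
import java.io.Serializable;
import java.net.InetSocketAddress;
import net.spy.memcached.MemcachedClient;

// A minimal sketch of a central session store, shared by every
// App Server node behind the load balancer (assumed names).
public class CentralSessionStore {
    private static final int TTL_SECONDS = 30 * 60; // 30-minute session expiry
    private final MemcachedClient client;

    public CentralSessionStore(String host, int port) throws Exception {
        // One shared store: any App Server can serve any user's next request.
        this.client = new MemcachedClient(new InetSocketAddress(host, port));
    }

    // Session writes cross the network to the shared store.
    public void save(String sessionId, Serializable sessionData) {
        client.set("session:" + sessionId, TTL_SECONDS, sessionData);
    }

    // Session reads also generate network I/O -- the trade-off noted
    // above versus clustered (in-memory replicated) sessions.
    public Object load(String sessionId) {
        return client.get("session:" + sessionId);
    }
}
```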
Load Balancer – Removing SPOF

[Diagram: two setups – an Active-Passive LB pair and an Active-Active LB pair, each in front of the App Server cluster]

• In a load-balanced App Server cluster, the LB itself is an SPOF
• Set up the LB in Active-Active or Active-Passive mode
   Note: Active-Active nevertheless assumes that each LB is independently able to take up the load of the other
   If you want ZERO downtime, Active-Active becomes truly cost-beneficial only when multiple LBs (more than 3 to 4) are daisy-chained as Active-Active, forming an LB Cluster
Step 4 – Horizontal Scaling (App Server)

[Diagram: our deployment at the end of Step 4 – load-balanced App Servers in front of a single DB Server]

• Positives
   Increases availability and scalability
   No changes to the App required
   Easy setup
• Negatives
   Finite scalability
Step 5 – Vertical Partitioning (Hardware)

[Diagram: load-balanced App Servers, a DB Server, and its storage partitioned out onto a SAN]

• Introduction
   Partitioning out the storage function using a SAN
• SAN config options
   Refer to “Demystifying Storage” at http://wiki.directi.com -> Dev University -> Presentations
• Positives
   Allows “Scaling Up” the DB Server
   Boosts performance of the DB Server
• Negatives
   Increases cost
Step 6 – Horizontal Scaling (DB)

[Diagram: load-balanced App Servers in front of multiple DB Servers sharing a SAN]

• Introduction
   Increasing the number of DB nodes
   Referred to as “Scaling Out” the DB Server
• Options
   Shared-nothing Cluster
   Real Application Cluster (or Shared-storage Cluster)
Shared-Nothing Cluster

[Diagram: three DB Servers, each with its own complete copy of the database]

• Each DB Server node has its own complete copy of the database
• Nothing is shared between the DB Server nodes
• This is achieved through DB replication at the DB / Driver / App level, or through a proxy
• Supported by most RDBMSs natively or through 3rd-party software
• Note: the actual DB files may be stored on a central SAN
Replication Considerations

• Master-Slave
   Writes are sent to a single master, which replicates the data to multiple slave nodes
   Replication may be cascaded
   Simple setup
   No conflict management required
• Multi-Master
   Writes can be sent to any of the multiple masters, which replicate them to the other masters and slaves
   Conflict management required
   Deadlocks possible if the same data is simultaneously modified in multiple places
Replication Considerations

• Asynchronous
   Guaranteed, but out-of-band, replication from Master to Slave
   The Master updates its own DB and returns a response to the client
   Replication from Master to Slave takes place asynchronously
   Faster response to the client
   Slave data is marginally behind the Master
   Requires modifying the App to send critical reads and writes to the Master, and to load-balance all other reads
• Synchronous
   Guaranteed, in-band replication from Master to Slave
   The Master updates its own DB and confirms that all slaves have updated theirs before returning a response to the client
   Slower response to the client
   Slaves have the same data as the Master at all times
   Requires modifying the App to send writes to the Master and load-balance all reads
Replication Considerations

• Replication at the RDBMS level
   Support may exist in the RDBMS or through a 3rd-party tool
   Faster and more reliable
   The App must send writes to the Master, reads to any DB, and critical reads to the Master
• Replication at the Driver / DAO level
   The Driver / DAO layer ensures that –
      • Writes are performed on all connected DBs
      • Reads are load-balanced
      • Critical reads are sent to a Master
   In most cases RDBMS-agnostic
   Slower and in some cases less reliable
Real Application Cluster

[Diagram: multiple DB Servers all mounting a single shared database on a SAN]

• All DB Servers in the cluster share a common storage area on a SAN
• All DB Servers mount the same block device
• The filesystem must be a clustered file system (e.g. GFS / OCFS)
• Currently only supported by Oracle Real Application Clusters
• Can be very expensive (licensing fees)
Recommendation

[Diagram: App Servers sending writes & critical reads to a single master DB, with all other reads load-balanced across the remaining DBs]

• Try to choose a DB which natively supports Master-Slave replication
• Use Master-Slave async replication
• Write your DAO layer to ensure that – (a minimal sketch follows)
   Writes are sent to a single DB
   Reads are load-balanced
   Critical reads are sent to a master
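A minimal sketch of such a DAO-level router over plain JDBC, assuming one master URL and a list of slave URLs; all names are illustrative, and connection pooling, failover and retries are deliberately left out.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of DAO-level read/write splitting (assumed names).
// Writes and critical reads go to the master; all other reads are
// round-robined across the slaves, matching the recommendation above.
public class ReplicatedConnectionRouter {
    private final String masterUrl;
    private final List<String> slaveUrls;
    private final AtomicInteger next = new AtomicInteger();

    public ReplicatedConnectionRouter(String masterUrl, List<String> slaveUrls) {
        this.masterUrl = masterUrl;
        this.slaveUrls = slaveUrls;
    }

    // Writes always hit the master.
    public Connection forWrite() throws SQLException {
        return DriverManager.getConnection(masterUrl);
    }

    // Critical reads (read-your-own-writes) also hit the master.
    public Connection forCriticalRead() throws SQLException {
        return forWrite();
    }

    // All other reads are load-balanced round-robin across the slaves,
    // accepting that slave data may be marginally behind the master.
    public Connection forRead() throws SQLException {
        int i = Math.floorMod(next.getAndIncrement(), slaveUrls.size());
        return DriverManager.getConnection(slaveUrls.get(i));
    }
}
```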
Step 6 – Horizontal Scaling (DB)

[Diagram: our architecture now – load-balanced App Servers in front of a DB cluster backed by a SAN]

• Positives
   As the web tier grows, Database nodes can be added
   The DB Server is no longer an SPOF
• Negatives
   Finite limit
Step 6 – Horizontal Scaling (DB)

[Diagram: reads split between DB1 and DB2, while every write goes to both]

• Shared-nothing clusters have a finite scaling limit, because every write must be applied on every node while reads are split among them
   Assume reads to writes of 2:1 – so 8 reads => 4 writes
   2 DBs: per DB – 4 reads and 4 writes
   4 DBs: per DB – 2 reads and 4 writes
   8 DBs: per DB – 1 read and 4 writes
• At some point adding another node brings negligible incremental benefit (generalized below)
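Stated generally, and purely as a restatement of the arithmetic above: with $R$ reads, $W$ writes and $N$ shared-nothing nodes, each node carries

$$\text{load per node} \;=\; \frac{R}{N} + W \;\xrightarrow{\;N \to \infty\;}\; W$$

so the replicated write volume sets a floor that adding nodes cannot lower.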
Step 7 – Vertical / Horizontal Partitioning (DB)

[Diagram: load-balanced App Servers in front of a DB cluster backed by a SAN]

• Introduction
   Increasing the number of DB clusters by dividing the data
• Options
   Vertical Partitioning – dividing tables / columns
   Horizontal Partitioning – dividing by rows (value)
Vertical Partitioning (DB)

[Diagram: App Cluster in front of DB Cluster 1 (Tables 1 & 2) and DB Cluster 2 (Tables 3 & 4)]

• Take a set of tables and move them onto another DB
   E.g. in a social network, the users table and the friends table can sit on separate DB clusters
• Each DB cluster has different tables
• Application code, DAO / Driver code, or a proxy knows where a given table lives and directs queries to the appropriate DB (a routing sketch appears after the next slide)
• Can also be done at the column level, by moving a set of columns into a separate table
Vertical Partitioning (DB)

• Negatives
   One cannot perform SQL joins or maintain referential integrity across clusters (referential integrity is, as such, overrated)
   Finite limit
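A minimal sketch of table-level routing for vertical partitioning; the table-to-cluster map and class names are assumptions for illustration.

```java
import java.util.Map;
import javax.sql.DataSource;

// Minimal sketch of vertical-partition routing (assumed names):
// the DAO layer keeps a static map of table -> DB cluster and
// hands out the right DataSource for each query.
public class VerticalPartitionRouter {
    private final Map<String, DataSource> tableToCluster;

    public VerticalPartitionRouter(Map<String, DataSource> tableToCluster) {
        this.tableToCluster = tableToCluster;
    }

    public DataSource forTable(String table) {
        DataSource ds = tableToCluster.get(table);
        if (ds == null) {
            throw new IllegalArgumentException("Unmapped table: " + table);
        }
        return ds;
    }
}

// Usage sketch: "users" and "friends" map to cluster 1, "photos" to
// cluster 2. Cross-cluster joins are no longer possible in SQL --
// they must be assembled in application code.
```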
Horizontal Partitioning (DB)

[Diagram: App Cluster in front of DB Cluster 1 and DB Cluster 2, each holding identical Tables 1–4, each cluster serving 1 million users]

• Take a set of rows and move them onto another DB
   E.g. in a social network, each DB cluster can contain all the data for 1 million users
• Each DB cluster has identical tables
• Application code, DAO / Driver code, or a proxy knows where a given row lives and directs queries to the appropriate DB
• Negatives
   SQL unions for search-type queries must be performed within code
Horizontal Partitioning (DB)

• Techniques (the hash- and value-based ones are sketched below)
   FCFS
      • The 1st million users are stored on cluster 1 and the next million on cluster 2
   Round Robin
   Least Used (Balanced)
      • Each time a new user is added, the DB cluster with the fewest users is chosen
   Hash-based
      • A hashing function determines the DB cluster in which the user's data should be inserted
   Value-based
      • User ids 1 to 1 million are stored on cluster 1, OR
      • All users with names starting A–M are stored on cluster 1
   Except for the hash- and value-based techniques, all the others also require an independent lookup map, mapping each user to a Database Cluster
   This map is itself stored on a separate DB (which may further need to be replicated)
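A minimal sketch of the hash-based and value-based techniques, plus the lookup map that the other techniques require; the names and the 1-million-users-per-cluster constant are illustrative. In production the map would live in its own (replicated) DB rather than in memory.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of shard selection for horizontal partitioning
// (assumed names, in-memory map stands in for the lookup-map DB).
public class ShardSelector {
    private final int clusterCount;
    // Explicit lookup map, required by FCFS / Round Robin / Least Used.
    private final Map<Long, Integer> userToCluster = new ConcurrentHashMap<>();

    public ShardSelector(int clusterCount) {
        this.clusterCount = clusterCount;
    }

    // Hash-based: no lookup map needed, but adding a cluster changes
    // the modulus and relocates existing users.
    public int hashBased(long userId) {
        return Math.floorMod(Long.hashCode(userId), clusterCount);
    }

    // Value-based: user ids 1..1M on cluster 0, the next million on
    // cluster 1, and so on; new clusters are added as ids grow.
    public int valueBased(long userId) {
        return (int) ((userId - 1) / 1_000_000);
    }

    // Least Used (or FCFS / Round Robin): placement is decided once,
    // at signup, and recorded in the lookup map.
    public void assign(long userId, int chosenCluster) {
        userToCluster.put(userId, chosenCluster);
    }

    public Integer lookup(long userId) {
        return userToCluster.get(userId);
    }
}
```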
Step 7 – Vertical / Horizontal Partitioning (DB)

[Diagram: our architecture now – load-balanced App Servers plus a Lookup Map, in front of two DB Clusters sharing a SAN]

• Positives
   As App Servers grow, Database Clusters can be added
• Note: this is not the same as the table partitioning provided by the DB (e.g. MSSQL)
• We may actually want to further segregate these into Sets, each serving a collection of users (refer to the next slide)
Step 8 – Separating Sets

• Now we consider each deployment as a single Set serving a collection of users

[Diagram: a Global Redirector with a Global Lookup Map in front of two identical Sets – each Set has load-balanced App Servers, its own Lookup Map, DB Clusters and a SAN; SET 1 serves 10 million users, SET 2 the next 10 million]
Creating Sets

• The goal behind creating Sets is easier manageability
• Each Set is independent and handles transactions for a distinct set of users
• Each Set is architecturally identical to the others
• Each Set contains the entire application with all its data structures
• Sets can even be deployed in separate datacenters
• Users may even be added to the Set that is closest to them in terms of network latency (a minimal redirector sketch follows)
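A minimal sketch of a Global Redirector, assuming the Global Lookup Map has been loaded into memory; the hostnames, class and method names are hypothetical.

```java
import java.util.Map;

// Minimal sketch of a Global Redirector (assumed names): a thin front
// layer that looks up which Set owns a user and redirects them there.
public class GlobalRedirector {
    // Stands in for the replicated Global Lookup Map DB.
    private final Map<Long, String> userToSetBaseUrl;

    public GlobalRedirector(Map<Long, String> userToSetBaseUrl) {
        this.userToSetBaseUrl = userToSetBaseUrl;
    }

    // e.g. userId 42 -> "https://set1.example.com" (hypothetical hosts)
    public String redirectUrlFor(long userId, String path) {
        String base = userToSetBaseUrl.get(userId);
        if (base == null) {
            base = "https://signup.example.com"; // user not yet assigned to a Set
        }
        return base + path;
    }
}
```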
Step 8 – Horizontal Partitioning (Sets)

[Diagram: our architecture now – a Global Redirector in front of SET 1 and SET 2, each containing an App Server cluster, DB Clusters and a SAN]

• Positives
   Effectively unlimited scalability – new Sets can be added as the user base grows
• Negatives
   Aggregation of data across Sets is complex
   Users may need to be moved across Sets if sizing is improper
   Global App settings and preferences need to be replicated across Sets
Step 9 – Caching

• Add caches within the App Server (a read-through sketch follows)
   Object cache
   Session cache (especially if you are using a Central Session Store)
   API cache
   Page cache
• Software
   Memcached
   Terracotta (Java only)
   Coherence (a commercial, expensive data grid by Oracle)
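A minimal read-through object-cache sketch over memcached (spymemcached client); the class name and TTL handling are illustrative.

```java
import java.net.InetSocketAddress;
import java.util.concurrent.Callable;
import net.spy.memcached.MemcachedClient;

// Minimal read-through object cache over memcached (assumed names).
public class ObjectCache {
    private final MemcachedClient client;

    public ObjectCache(String host, int port) throws Exception {
        this.client = new MemcachedClient(new InetSocketAddress(host, port));
    }

    // Classic read-through: serve from cache on a hit; on a miss,
    // load from the DB (the loader) and populate the cache.
    public Object getOrLoad(String key, int ttlSeconds, Callable<Object> loader)
            throws Exception {
        Object cached = client.get(key);
        if (cached != null) {
            return cached;
        }
        Object fresh = loader.call();
        client.set(key, ttlSeconds, fresh);
        return fresh;
    }
}

// Usage sketch (hypothetical DAO):
//   Object profile = cache.getOrLoad("user:42:profile", 600,
//       () -> userDao.loadProfile(42L));
```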
Step 10 – HTTP Accelerator

• If your app is a web app you should add an HTTP Accelerator or a Reverse Proxy
• A good HTTP Accelerator / Reverse Proxy performs the following –
   Redirects static-content requests to a lighter HTTP server (e.g. lighttpd)
   Caches content based on rules (with granular invalidation support)
   Uses async NIO on the user side
   Maintains a limited pool of keep-alive connections to the App Server
   Intelligent load balancing
• Solutions
   Nginx (HTTP / IMAP)
   Perlbal
   Hardware accelerators plus Load Balancers
Step 11 – Other Cool Stuff

• CDNs
• IP Anycasting
• Async nonblocking IO (for all network servers)
• If possible – async nonblocking IO for disk
• Incorporate a multi-layer caching strategy where required (see the sketch below)
   L1 cache – in-process with the App Server
   L2 cache – across a network boundary
   L3 cache – on disk
• Grid computing
   Java – GridGain
   Erlang – natively built in
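A minimal sketch of the L1/L2 layering, with a tiny in-process LRU in front of an assumed L2 interface (which could be backed by memcached); all names are illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch of an L1 (in-process) cache in front of an L2
// (across-the-network) cache. The L2 interface is an assumption.
public class LayeredCache {
    public interface L2Cache {
        Object get(String key);
        void set(String key, Object value);
    }

    private final Map<String, Object> l1; // small, in-process, no network hop
    private final L2Cache l2;             // shared across App Servers

    public LayeredCache(final int l1Capacity, L2Cache l2) {
        // Access-ordered LinkedHashMap as a simple LRU for L1.
        this.l1 = new LinkedHashMap<String, Object>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Object> eldest) {
                return size() > l1Capacity; // evict least recently used
            }
        };
        this.l2 = l2;
    }

    public synchronized Object get(String key) {
        Object v = l1.get(key);
        if (v == null) {
            v = l2.get(key);            // cross the network only on an L1 miss
            if (v != null) {
                l1.put(key, v);         // promote into L1 for next time
            }
        }
        return v;
    }
}
```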
Platform Selection Considerations

• Programming Languages and Frameworks
   Dynamic languages are slower than static languages
   Compiled code runs faster than interpreted code -> use accelerators or pre-compilers
   Frameworks that provide Dependency Injection, Reflection and Annotations have a marginal performance impact
   ORMs hide DB querying, which can in some cases result in poor query performance due to non-optimized queries
• RDBMS
   MySQL, MSSQL and Oracle support native replication
   Postgres supports replication through 3rd-party software (Slony)
   Oracle supports Real Application Clustering
   MySQL uses locking and arbitration, while Postgres/Oracle use MVCC (MSSQL only recently introduced MVCC)
• Cache
   Terracotta vs memcached vs Coherence
Tips

• All the techniques we learnt today can be applied in any order
• Try to incorporate horizontal DB partitioning by value into your design from the beginning
• Loosely couple all modules
• Implement a RESTful framework for easier caching
• Perform application sizing continuously to ensure optimal utilization of hardware
Intelligent People. Uncommon Ideas.
Questions??
bhavin.t@directi.com
http://directi.com
http://careers.directi.com
Download slides: http://wiki.directi.com