Mark Simms (@mabsimms)
Principal Program Manager
Windows Azure Customer Advisory Team
Resilient design and architecture
[Diagram: Load Balancer, Web Servers, App Servers, Database]
[Diagram, shown twice: Load Balancer, Web Servers, App Servers, Distributed Cache, Doc Store, Database, External Services (SendGrid, Twitter, Facebook, etc.)]
external
any workloads
external service
What are the “9”s?

Availability %            Downtime per year   Downtime per month*   Downtime per week
90% ("one nine")          36.5 days           72 hours              16.8 hours
99% ("two nines")         3.65 days           7.20 hours            1.68 hours
99.9% ("three nines")     8.76 hours          43.2 minutes          10.1 minutes
99.99% ("four nines")     52.56 minutes       4.32 minutes          1.01 minutes
99.999% ("five nines")    5.26 minutes        25.9 seconds          6.05 seconds
99.9999% ("six nines")    31.5 seconds        2.59 seconds          0.605 seconds
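
The downtime figures in the table follow from simple arithmetic: the unavailable fraction multiplied by the length of the period. The sketch below illustrates that calculation. The class and method names are hypothetical, and the 30-day month is an assumption chosen because it reproduces the monthly column.

```csharp
using System;

// Hypothetical helper, not part of any Azure SDK.
public static class Availability
{
    // Returns the downtime allowed within 'period' at the given availability.
    public static TimeSpan AllowedDowntime(double availabilityPercent, TimeSpan period)
    {
        if (availabilityPercent < 0 || availabilityPercent > 100)
            throw new ArgumentOutOfRangeException(nameof(availabilityPercent));

        // Fraction of the period during which the service may be down.
        double unavailableFraction = 1.0 - availabilityPercent / 100.0;
        return TimeSpan.FromSeconds(period.TotalSeconds * unavailableFraction);
    }
}
```

For example, `Availability.AllowedDowntime(99.9, TimeSpan.FromDays(365)).TotalHours` comes to about 8.76 hours, and `Availability.AllowedDowntime(99.99, TimeSpan.FromDays(30)).TotalMinutes` to about 4.32 minutes (assuming a 30-day month).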
• Study Windows Azure Platform SLAs:
  • Compute External Connectivity: 99.95% (2 or more instances)
  • Compute Instance Availability: 99.9% (2 or more instances)
  • Storage Availability: 99.9%
  • SQL Azure Availability: 99.9%
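
One reason to study the individual SLAs is that an application depending on several services cannot be more available than the combination of them. The sketch below assumes that a request needs every listed service (a serial dependency) and that their failures are independent. Neither assumption comes from the slides, and the class name is hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical helper illustrating composite availability.
public static class CompositeSla
{
    // Multiplies the availability fractions of services that must all be up.
    // Assumes serial dependencies with independent failures.
    public static double Composite(params double[] availabilityPercents)
    {
        double fraction = availabilityPercents
            .Aggregate(1.0, (acc, pct) => acc * (pct / 100.0));
        return fraction * 100.0;
    }
}
```

Under those assumptions, `CompositeSla.Composite(99.95, 99.9, 99.9)` (compute connectivity, storage, and SQL Azure) comes to roughly 99.75%, which is lower than any single SLA in the list.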


[Chart: Web Request Response Latency. Vertical axis: seconds, 0 to 450. Horizontal axis: 1 to 29. Series: Avg Latency. Label: Response latency]
Platform       Context                                  Sample target e2e latency max   “Fast First”   Retry count   Delay    Backoff
SQL Database   Synchronous (e.g. render web page)       200 ms                          Yes            3             50 ms    Linear
SQL Database   Asynchronous (e.g. process queue item)   60 seconds                      No             4             5 s      Exponential
Azure Cache    Synchronous (e.g. render web page)       100 ms                          Yes            3             10 ms    Linear
Azure Cache    Asynchronous (e.g. process queue item)   500 ms                          Yes            3             100 ms   Exponential
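
A retry policy with the parameters in the table above could be sketched roughly as follows. This is a hypothetical illustration, not the API of any Azure library. It assumes that "Fast First" means the first retry happens immediately, that linear backoff multiplies the delay by the step number, and that exponential backoff doubles the delay at each step.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public enum BackoffStrategy { Linear, Exponential }

// Hypothetical retry policy built from the table's columns.
public sealed class RetryPolicy
{
    private readonly bool _fastFirst;
    private readonly int _retryCount;
    private readonly TimeSpan _delay;
    private readonly BackoffStrategy _backoff;

    public RetryPolicy(bool fastFirst, int retryCount, TimeSpan delay, BackoffStrategy backoff)
    {
        _fastFirst = fastFirst;
        _retryCount = retryCount;
        _delay = delay;
        _backoff = backoff;
    }

    // Wait time before each retry; index 0 is the wait before the first retry.
    public IReadOnlyList<TimeSpan> Delays()
    {
        var delays = new List<TimeSpan>();
        for (int retry = 1; retry <= _retryCount; retry++)
        {
            if (_fastFirst && retry == 1)
            {
                delays.Add(TimeSpan.Zero); // assumption: fast-first retries at once
                continue;
            }
            // Assumption: with fast-first, backoff steps start at the second retry.
            int step = _fastFirst ? retry - 1 : retry;
            double factor = _backoff == BackoffStrategy.Linear ? step : Math.Pow(2, step - 1);
            delays.Add(TimeSpan.FromMilliseconds(_delay.TotalMilliseconds * factor));
        }
        return delays;
    }

    // Runs the operation, sleeping between attempts; rethrows after the last retry.
    public T Execute<T>(Func<T> operation)
    {
        var delays = Delays();
        for (int attempt = 0; ; attempt++)
        {
            try { return operation(); }
            catch (Exception) when (attempt < delays.Count)
            {
                Thread.Sleep(delays[attempt]);
            }
        }
    }
}
```

With these assumptions, the synchronous SQL Database row (`new RetryPolicy(true, 3, TimeSpan.FromMilliseconds(50), BackoffStrategy.Linear)`) waits 0 ms, 50 ms, and 100 ms, which keeps the total wait inside the 200 ms target. A real policy would usually retry only on transient errors rather than on every exception, and the total wait should be checked against the target latency for each row.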
Failure Point
Definition: a design element that can cause an outage.
Focus on identifying design elements that are subject to external
change. For example:
Categories of common Failure Points:
Failure Mode
Definition: a predictable root cause of an outage that occurs at
a Failure Point.
Examples of failure modes:
The following would not be considered a failure mode:
Failure Mode Example
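public int GetBusinessData(string[] parameters)
{
    try
    {
        // Failure points: the configuration file, then the database server, database and table.
        var config = Config.Open(_configPath);
        var conn = ConnectToDB(config.ConnectString);
        var data = conn.GetData(_sproc, parameters);
        return data;
    }
    catch (Exception e)
    {
        // Every failure mode is logged as the same event (100) before being rethrown.
        WriteEventLogEvent(100, E_ExceptionInDal);
        throw;
    }
}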




Potential Failure Points:
• Database Server
• Database
• Table
• Configuration File

Potential Failure Modes:
• DB Server not responding
• DB offline
• DB access denied
• Sproc execute denied
• DB doesn’t exist
• DB timeout on connect
• Index corrupt
• Database corrupt
• Table doesn’t exist
• Table corrupt
• Config file missing or invalid
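
One way to make the failure modes listed above visible in code is to classify the caught exception before logging it, so that each mode gets its own event. The sketch below is only an illustration. The enum, the class, and the mapping from standard .NET exception types to failure modes are assumptions; a real data-access layer would need provider-specific checks.

```csharp
using System;
using System.IO;

// Hypothetical subset of the failure modes listed above.
public enum FailureMode
{
    ConfigMissingOrInvalid,
    DbTimeoutOnConnect,
    DbAccessDenied,
    Unknown
}

public static class FailureClassifier
{
    // Maps an exception to a failure mode. The mapping is an assumption
    // made for illustration, not a rule from the slides.
    public static FailureMode Classify(Exception e)
    {
        switch (e)
        {
            case FileNotFoundException _:
            case FormatException _:
                return FailureMode.ConfigMissingOrInvalid;
            case TimeoutException _:
                return FailureMode.DbTimeoutOnConnect;
            case UnauthorizedAccessException _:
                return FailureMode.DbAccessDenied;
            default:
                return FailureMode.Unknown;
        }
    }
}
```

A catch block could then log `FailureClassifier.Classify(e)` alongside the event ID, which makes it possible to monitor and alert on each failure mode separately.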
[Diagram: development and release lifecycle. Environments: Dev Fabric, Dev on Azure, QA/Pre-release on Azure, Production Release on Azure. Step labels: Plan, Design, Code, Build, Unit Test, Check In, CI, Automated Test, Run, Stage, Deploy, Test, Monitor, Log, Defect, Triage, Scope, Plan Fixes, Updates, Feature]
http://msdn.microsoft.com/en-us/library/jj853352.aspx
http://msdn.microsoft.com/en-us/library/windowsazure/jj717232.aspx
https://www.usenix.org/events/lisa07/tech/full_papers/hamilton/hamilton.pdf
Microsoft Confidential
Push vs. Pull
Load Balanced Push
• Sync and good for sequential processing
• Dependent on downstream services
• Throttling vs. Performance
Managed Pull/Throughput
• Asynchronous and event driven processing
• Easy Parallelisation and Pipelining
• Extending logic is easy
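
The managed pull model can be sketched with an in-process queue and a fixed number of workers that pull items at their own pace. This is a minimal illustration, assuming an in-memory `BlockingCollection` as a stand-in for a durable queue. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical in-process stand-in for queue-based pull processing.
public static class ManagedPull
{
    public static List<TResult> Process<TItem, TResult>(
        IEnumerable<TItem> items, Func<TItem, TResult> handler, int workerCount)
    {
        if (workerCount < 1) throw new ArgumentOutOfRangeException(nameof(workerCount));

        // The bounded queue throttles the producer when workers fall behind.
        var queue = new BlockingCollection<TItem>(boundedCapacity: 100);
        var results = new ConcurrentBag<TResult>();

        // Each worker pulls the next item whenever it is free.
        var workers = new Task[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            workers[i] = Task.Run(() =>
            {
                foreach (var item in queue.GetConsumingEnumerable())
                    results.Add(handler(item));
            });
        }

        foreach (var item in items) queue.Add(item);
        queue.CompleteAdding(); // lets the workers finish once the queue drains
        Task.WaitAll(workers);
        return new List<TResult>(results);
    }
}
```

Throughput is controlled by `workerCount` rather than by the rate at which a caller pushes requests, and adding a processing stage only requires another queue and handler. Results come back in no particular order.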
Logic based
• Priority
• Date
• Amount
• Etc.
Time based
• ASAP
• Gradually
• Periodically
• On-Demand
Volume based
• Single
• In Batches
Data on the inside – Data on the outside
http://msdn.microsoft.com/en-us/library/ms954587.aspx
Reference Data
• Immutable (versions)
• Requires open schema for interop
Activity Data
• Low concurrency updates (e.g. shopping basket)
Resource (shared) Data
• Highly concurrent update (e.g. inventory)
• Should live in worker role
“Query Ready” Cache
Query patterns
Push the data close to where it is queried
– Example: Bing Maps
Process, structure, produce, format etc. data and cache “query ready” data
Light/cheap data production is OK
Pure and Idempotent operations are usually good candidates
Duplication is OK
Same data in a different format
Same data in multiple places
This requires processing data before it is queried, not at query time
All data can be cached
Some data can be cached:
Frequently used
Process Heavy, Expensive data
Build as you Go
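
The "query ready" idea can be sketched as a cache that is filled when the source data changes, so that a query is only a lookup. The example below is a hypothetical illustration: the class name, the "top three products per category" query, and the in-memory dictionary are all assumptions standing in for a real distributed cache.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

// Hypothetical cache holding already-formatted query results.
public sealed class QueryReadyCache
{
    private readonly ConcurrentDictionary<string, string> _ready =
        new ConcurrentDictionary<string, string>();

    // Called when the source data changes, not at query time.
    // Idempotent: publishing the same data again yields the same entry.
    public void Publish(string category, IDictionary<string, int> salesByProduct)
    {
        var top = salesByProduct
            .OrderByDescending(p => p.Value)
            .ThenBy(p => p.Key)
            .Take(3)
            .Select(p => p.Key);
        _ready[category] = string.Join(", ", top);
    }

    // Query time: a plain lookup with no processing.
    public bool TryGet(string category, out string formatted)
    {
        return _ready.TryGetValue(category, out formatted);
    }
}
```

Because `Publish` is pure and idempotent, it can be rerun safely after a failure, and the same source data can be published in several formats or to several locations.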
Distributed Caching
Simple to administer
No need to manage and host a distributed cache yourself.
Integrates easily into existing applications
ASP.NET session state and output cache providers enable no-code integration.
Same managed interfaces as Windows Server AppFabric Cache
[Diagram: an On-Premises App uses the AppFabric Cache APIs with Windows Server AppFabric Cache; a Windows Azure App uses the same AppFabric Cache APIs with Windows Azure AppFabric Caching]
[Diagram: data downloaded from an Edge Location]