EMTM 600
Software Development
Spring 2011
Lecture Notes 5
Assignments for next time
• Read “J2EE vs. .NET – An Executive Look” (hardcopy distributed in class). Email me TWO QUESTIONS. Each student.
• Read Jeff Dean’s “Experiences with MapReduce, an Abstraction for Large-Scale Computing” (on the website) and email me TWO QUESTIONS. Each student.
EMTM 600 Val Tannen
Careful with the “definitional hype”!
Oracle's CEO Larry Ellison:
"The interesting thing about cloud computing is that we've redefined
cloud computing to include everything that we already do....
I don't understand what we would do differently in the light of cloud
computing other than change the wording of some of our ads.”
What is Cloud Computing?
Two meanings for cloud computing:
• Applications delivered as services over the Internet/intranets
  SaaS: Software-as-a-Service; analogy: restaurant!
• The hardware and systems software in the data centers that provide those services
  IaaS: Infrastructure-as-a-Service; analogy: take-out food!
  PaaS: Platform-as-a-Service; analogy: grocery store!
Cloud: data center hardware and systems software
  public cloud: available to the general public with a payment model (utility computing)
  private cloud: internal data centers for large enterprises, as long as benefits similar to those of utility computing hold
  also, community clouds and hybrid clouds.
Benefits of Cloud Computing
1. Computing resources available
• on demand
• quickly (allocation can track load increase)
• “unlimited” amounts (data center >>> user)
• eliminates need for long-range provisioning
2. Resources easily increased/decreased
• “elasticity” in software development
• growing users invest as needed
• pay-per-use/pay-as-you-go (distributed hours, incentive to economize)
3. Utility economic models
• economy of scale
• networked availability
Cloud Computing Definition in Our Textbook
(Linthicum)
1. Pay-per-use …
2. … ubiquitous network access that is
• on demand
• available
• convenient …
3. … to a shared pool of resources that are
• configurable
• location-independent
• elastic, that is, they can be
• rapidly provisioned
• rapidly released
• by users
• with minimal effort by providers
Brave New World
Many opportunities follow from these benefits. Three examples:
1. Ideas for new Internet-based services can be tried with short time-to-market and without
• overprovisioning/underutilization (when it fizzles)
• underprovisioning/saturation (when it takes off)
2. Enterprise batch-oriented tasks can take full advantage of their potential for parallelization/distribution/scalability
• on a public cloud, cost is per (time unit x no. of machines)
3. Enterprises with extra capacity can automate the process of renting it out
This “elasticity of resources” is a completely new phenomenon in IT!
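The over/underprovisioning trade-off in the first example can be illustrated with a small sketch. The hourly demand figures and the two fixed capacities are made-up assumptions, and "ideal elasticity" (allocation tracking demand exactly) is a simplification.

```python
def fixed_capacity_outcome(demand, capacity):
    """Return (idle machine-hours, unserved machine-hours) for a fixed fleet."""
    idle = sum(max(capacity - d, 0) for d in demand)      # overprovisioning
    unserved = sum(max(d - capacity, 0) for d in demand)  # underprovisioning
    return idle, unserved


def elastic_machine_hours(demand):
    """With ideal elasticity, the machine-hours paid for equal the demand."""
    return sum(demand)


# Hypothetical number of machines needed in each of six hours.
demand = [10, 10, 80, 100, 20, 10]

print(fixed_capacity_outcome(demand, capacity=100))  # sized for the peak
print(fixed_capacity_outcome(demand, capacity=40))   # sized below the peak
print(elastic_machine_hours(demand))
```

Sizing for the peak wastes idle machine-hours, sizing below the peak leaves demand unserved, and the elastic case pays only for what is used.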
What Enabled Cloud Computing?
This is not the first time the IT industry has looked to outsource resources: timesharing on mainframes in the late 70’s and early 80’s.
Killed by Moore’s law --- cheap powerful workstations --- eventually personal computing --- eventually client-server architectures.
What happened now? The construction of
• extremely large-scale data centers
• with many (many!) networked commodity computers
• at low-cost locations
• statistical multiplexing techniques
This uncovered factors of 5 to 10 decrease/increase in
• cost of electricity
• network bandwidth
• software, and hardware utilization
• at very large economies of scale
Enabling the offer of services below the costs of a medium-sized data center while still making a good profit.
Software as a Service (SaaS)
[Figure: layered stack (User, Application, Middleware, Hardware), marking the layers run by the cloud provider]
• Cloud provides an entire application
  Word processor, spreadsheet, CRM software, calendar, database...
Customer pays cloud provider
Example: Google Apps, Salesforce.com, Amazon SimpleDB, Google BigTable, Yahoo PNUTS
(courtesy of Ives/Haeberlen)
Platform as a Service (PaaS)
[Figure: layered stack (User, Application, Middleware, Hardware), marking the layers run by the SaaS provider and by the cloud provider]
• Cloud provides middleware/infrastructure
  For example, Microsoft Common Language Runtime (CLR)
Customer pays SaaS provider for the service; SaaS provider pays the cloud for the infrastructure
Example: MS Azure, Google AppEngine, Amazon S3 (Simple Storage Service, part of AWS)
(courtesy of Ives/Haeberlen)
Infrastructure as a Service (IaaS)
[Figure: layered stack (User, Application, Middleware, Hardware), marking the layers run by the SaaS provider and by the cloud provider]
• Cloud provides raw computing resources
  Virtual machine, blade server, hard disk, ...
Customer pays SaaS provider for the service; SaaS provider pays the cloud for the resources
Examples: Amazon EC2 (Elastic Compute Cloud) and EBS (Elastic Block Store), both part of AWS; Rackspace Cloud; GoGrid
(courtesy of Ives/Haeberlen)
Other characteristics
IaaS | PaaS | SaaS
Users control most of the software.
User platform is .NET and CLR or Java EE, e.g.
Web-app specific, stateless computation layer, stateful storage layer.
Library-based scalability and failover.
Excellent automatic scalability and failover.
Automatic scalability and failover are application-dependent.
Confusion, Skepticism, Frustration
(Larry Ellison)
Much of it has to do with private clouds. What is and isn’t cloud
computing?
No: A public service hosted on an ISP that can allocate more machines given four hours' notice. But load on the Internet can surge much faster than that.
Yes: An enterprise data center that runs applications that change with significant advance notice so allocation can track expected load increase.
Not all benefits may be realized in private cloud examples like the one above. In particular, not the same economies of scale.
Confusion, skepticism and frustration may arise when some advantages of public clouds are also claimed for medium-size data centers.
(“A view of cloud computing”, Armbrust et al., CACM 53(4), 2010)
Public Cloud vs. Conventional Data Center
(“A view of cloud computing”, Armbrust et al., CACM 53(4), 2010)
Examples of When to Definitely Use
Cloud Computing
1. When demand varies in time
• Peak-valley tasks (payroll, taxes)
• Startup incubators
• Research facilities
2. When demand is unknown
• Web startup
3. Batch analytics that can scale
• Cost associativity: cost(100 hrs x 1 machine) = cost(1 hr x 100 machines)
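Cost associativity can be checked with a few lines of arithmetic. This is a minimal sketch that assumes a flat price per machine-hour and a perfectly parallel job; the price of 10 cents is a hypothetical figure, not a quote from any provider.

```python
PRICE_CENTS_PER_MACHINE_HOUR = 10  # hypothetical flat rate


def cloud_cost_cents(hours, machines):
    """Pay-per-use cost: every machine-hour is billed at the same rate."""
    return hours * machines * PRICE_CENTS_PER_MACHINE_HOUR


serial = cloud_cost_cents(hours=100, machines=1)    # one machine, 100 hours
parallel = cloud_cost_cents(hours=1, machines=100)  # 100 machines, one hour

print(serial, parallel, serial == parallel)
```

Under these assumptions the bill is identical, but the parallel run finishes in one hour instead of one hundred.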
Bringing Cloud Computing into
Enterprise Application Integration
Cloud computing is SOA-compatible, both philosophically and
technologically.
Thus, the best strategy is to (re?)organize enterprise IT according to SOA.
• Reduce tight coupling to a minimum (as allowed by performance constraints)
• Define application interfaces precisely
• Identify security and privacy problems
Cloud-based Services for the Enterprise (1)
• Storage-as-a-Service (IaaS + PaaS)
Great temporary solution till you buy your own disk capacity.
Excellent for sharing, moving. Even I am using Dropbox.
Issues: cost, availability, performance, security level, privacy.
• Database-as-a-Service (IaaS + SaaS)
Avoid huge licensing costs. They back up and restore. Some even do full DBA for you!
Or, pay-as-you-go licenses are available from MS and IBM on EC2.
Issues: availability, performance, security level, privacy, legal issues.
Cloud-based Services for the Enterprise (2)
• Information-as-a-Service (SaaS)
Many existing services (DJ, etc.).
Several cost models: free(!), one-time + ongoing, pay-per-use, etc.
Issues: cost, availability, performance, security level, privacy.
• Process-as-a-Service (SaaS)
BPEL in the cloud. A solution for integrating applications as services.
Integration across enterprises (B2B, proxy vending).
Temporary solutions for IT merging?
Issues: availability, security.
Cloud-based Services for the Enterprise (3)
• Application-as-a-Service (SaaS)
Long available in scientific grid computing.
Also, Google Docs, Gmail, Google Calendar.
SFA, office automation, HR, logistics, planning services, etc.
Issues: cost, availability, performance, security level, privacy.
• Platform-as-a-Service (PaaS)
Google AppEngine, MS Azure. IBM WebSphere on EC2.
A complete enterprise application development environment!
Issues: lock-in, security.
Cloud-based Services for the Enterprise (4)
Also, particular cases of others we saw:
• Integration-as-a-Service
• Security-as-a-Service (issue: security of security!)
• Management/Governance-as-a-Service
• Testing-as-a-Service
Issues in the Cloud (1)
• Availability
The service could be down. Or the network in-between could be down. Or they might lose your data. Irretrievably.
Solution: “no single point of failure”. This means using several cloud providers. This is not easy because solutions are often proprietary.
Opportunity: offer high-availability services using multiple existing ones.
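A rough way to see why several providers help: if providers fail independently (a strong assumption, since shared networks or shared software can correlate failures), the service is down only when all of them are down at once. The sketch below uses hypothetical availability figures.

```python
from math import prod


def combined_availability(availabilities):
    """Availability of a service that is up whenever at least one provider is up.

    Assumes provider failures are statistically independent.
    """
    return 1 - prod(1 - a for a in availabilities)


# Hypothetical figures: each provider is up 99% of the time.
one_provider = combined_availability([0.99])
two_providers = combined_availability([0.99, 0.99])

print(f"{one_provider:.4f} {two_providers:.4f}")
```

With these assumed numbers, a second independent provider cuts expected downtime from 1% to 0.01%.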
Issues in the Cloud (2)
• Security, privacy, audit (possibly legal) requirements
Vulnerability to malicious attacks or even to inadvertent disclosure.
Who is responsible for what aspect of security? Provider for some, user for others, others are shared.
User security responsibilities: EC2 > Azure > AppEngine
Users need to be protected from each other (theft, denial-of-service). Problem exacerbated by possible competitors sharing public cloud.
Most protection is achieved by virtualization. But the mechanism is complicated and by virtue of that, vulnerable.
Success story: HIPAA-compliant application of TC3 Health Inc. was moved to Amazon WS.
Cynic’s perspective: absence of security breach knowledge does not guarantee security…
Issues in the Cloud (2)
Security/privacy issues to discuss with cloud vendor in advance:
1. Who are the people who will manage my data?
2. Will you share the need for regulatory compliance (HIPAA, Sarbanes-Oxley)?
3. Where will my data be stored? Are those servers subject to local jurisdictions that should worry me?
4. How do you segregate my data from others’?
5. What are the data loss risks?
6. What is your disaster recovery strategy?
7. Can you support forensic investigations?
8. What happens to my data when you get bought or go bankrupt?
Some slides by Andreas Haeberlen (Penn)
[Figure: Alice's customers, Alice, Bob]
The cloud enables Alice to:
• obtain resources on demand
• pay only for what she actually uses
• benefit from economies of scale
But...
Problem: Split administrative domain
[Figure: Alice, Bob, Alice's customers, with question marks over the cloud machines]
Control and information about Alice's service are now split between Alice and Bob
Alice cannot control cloud machines or observe their status
→ Alice must have a lot of trust in Bob
Bob does not understand the details of Alice's software
→ Difficult to perform many administrative tasks
Problem: Split administrative domain
[Figure: Alice, Bob, Alice's customers]
What if the cloud does not deliver?
• Insufficient allocation of resources
• Hardware malfunction, data loss/unavailability
• Data leaks to a third party (HIPAA!)
• Misconfiguration
• Hacker attack
• ...
Handling problems: Alice's perspective
[Figure: Alice, Bob, Alice's customers]
Alice asks:
• If something is wrong, how will I know?
• How can I tell if it's my software or the cloud?
• If it's the cloud, how can I convince Bob?
Handling problems: Bob's perspective
[Figure: Alice, Bob, Alice's customers]
Alice asks:
• If something is wrong, how will I know?
• How can I tell if it's my software or the cloud?
• If it's the cloud, how can I convince Bob?
Bob asks:
• If something is wrong, how will I know?
• How can I tell if it's the cloud or Alice's software?
• If it's Alice's software, how can I convince Alice?
Outline
• An example problem
• A potential solution
• Tomorrow's cloud
• Ongoing work; Call for action
Is the cloud delivering what I paid for?
Learning from the 'offline' world
[Figure: Customer, The cloud, Contractor]
• In the 'offline' world, we face similar problems!
• Is our contractor delivering the work as promised?
• If a dispute arises, how to decide who is at fault?
• How do we solve these problems today?
Learning from the 'offline' world
• In the 'offline' world, we rely on accountability
  [Figure labels: tamper-evident record, expected behavior, auditors, incriminating evidence]
• Accountability can:
  Detect faults
  Identify the faulty person/entity
  Convince others of the fault (by producing evidence)
• Goal: Apply this approach to the cloud
Is accountability enough?
• Not if faults can have serious and irrecoverable consequences → need fault tolerance
  Example: 911 dispatch, controlling a nuclear power plant
  Use multiple clouds?
• BUT: In practice, many apps are not of this type
  Example: Outsourcing a large compute job, handling flash crowds in web services, data mining...
• What can we do if a fault does appear?
  Demand outage credit
  Sue the cloud provider
  Buy insurance up front
  ...
Benefits of accountability
• Accountability enables timely detection and recovery from faults
  Can fix the problem before the customer complains
• Accountability provides an incentive to avoid faults
  Reliability of the cloud becomes measurable!
• Accountability builds trust between cloud customers and cloud providers
  "If anything were to go wrong, we would be able to tell."
An idealized solution
[Figure: Alice, Bob, Alice's customers, and an Oracle]
What should an accountable cloud look like?
Imagine an oracle that ensures the following:
• Completeness: If the cloud is faulty, the oracle will say so
• Accuracy: If the cloud is not faulty, the oracle will say so
• Verifiability: The oracle produces evidence that would convince a disinterested third party
The accountable cloud
[Figure: Alice, Bob, Alice's customers, and a tamper-evident log]
How can we implement such an oracle?
Cloud records its actions in a tamper-evident log
Challenge: If cloud is compromised, it can lie, or log
things incorrectly!
Alice and Bob can audit the log and check for faults
Use log to construct evidence that a fault does (not) exist
Provides completeness, accuracy, verifiability
Provable guarantees even if Alice and/or Bob are malicious!
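One common building block for a tamper-evident log is a hash chain, sketched below. This is a generic illustration, not the design used in the accountable-cloud prototype: the function names are hypothetical, and it assumes Alice has separately kept the latest chain head (in practice it would be signed by Bob) so that a rewritten log cannot pass the audit.

```python
import hashlib

GENESIS = "0" * 64  # fixed starting value of the chain


def entry_hash(prev_hash, message):
    """Hash of an entry, bound to the hash of the entry before it."""
    return hashlib.sha256((prev_hash + message).encode("utf-8")).hexdigest()


def append(log, message):
    """Append (message, hash) to the log, chaining to the previous entry."""
    prev = log[-1][1] if log else GENESIS
    log.append((message, entry_hash(prev, message)))


def verify(log, committed_head):
    """True if every link is intact and the chain ends at the committed head."""
    prev = GENESIS
    for message, h in log:
        if entry_hash(prev, message) != h:
            return False  # an entry was altered, inserted, or removed
        prev = h
    return prev == committed_head


log = []
for action in ["start VM 1", "write block 7", "stop VM 1"]:
    append(log, action)
head = log[-1][1]  # Alice keeps this value outside the cloud

print(verify(log, head))  # intact log passes the audit

tampered = list(log)
tampered[1] = ("write block 9", tampered[1][1])  # edit one recorded action
print(verify(tampered, head))  # audit detects the change
```

Because each hash covers the previous one, changing any past entry breaks either a link or the final head that the auditor holds.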
Discussion
Isn't this too pessimistic? Bob isn't malicious!
Hacker attacks, software bugs, disgruntled employees,
operator error, ..., can have the same effect
Difficult to come up with a more restrictive fault model
Alice (or some other customer) could be malicious
Shouldn't Bob use fault tolerance instead?
Bob certainly should mask faults whenever possible
But: Masking is never perfect; Alice still needs to check
Why would a provider want to deploy this?
Attractive to prospective customers
Helps with handling angry support calls
Recap
Problem: Current cloud designs carry risks for both customers and providers
  Customer loses control over his computation and data
  Split administration → Difficult to detect+resolve problems
Proposed solution: The accountable cloud
  Can verify correct operation, produce evidence
  Provable guarantees → solid foundation for both sides
Discussion: Guarantees, fault model, incentives, ...
Currently building a prototype
  www.cis.upenn.edu/~ahae
Issues in the Cloud (3)
• Performance
  • Transfer of very large amounts of data. Best solution: FedEx it!
  • Unpredictability. Multiple virtual machines can share CPU and memory nicely but sharing disk and network is problematic.
    Opportunity: do what IBM mainframes did!!
    Opportunity: flash memory!
  • High-performance batch processing could make valuable use of clouds, provided threads are actually running simultaneously. Need “gang scheduling” in cloud computing.
  • Scaling speed. More like Google’s AppEngine (charge by cycle, automatic scaling), less like Amazon’s EC2 (charge by the hour, user-demanded scaling with minutes of delay).
    Opportunity: keep stats and use machine learning techniques
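The "FedEx it" advice on data transfer follows from simple arithmetic. The sketch below assumes a hypothetical 10 TB data set and a sustained 20 Mbit/s wide-area link; both figures are illustrative, and real throughput varies.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_bytes, bandwidth_bits_per_sec):
    """Days needed to push data_bytes over a link at a sustained bit rate."""
    seconds = data_bytes * 8 / bandwidth_bits_per_sec
    return seconds / SECONDS_PER_DAY


TEN_TB = 10 * 10**12         # hypothetical data set size, in bytes
LINK_20_MBIT = 20 * 10**6    # hypothetical sustained bandwidth, in bits/s

days = transfer_days(TEN_TB, LINK_20_MBIT)
print(f"{days:.1f} days over the network")
```

Under these assumptions the network transfer takes over six weeks, which is why shipping disks overnight can win.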
Issues in the Cloud (4)
• Reputation fate sharing
One bad apple stinks the whole cloud! (E.g., spamming)
A terrorist in the cloud: everybody gets checked out, downtime, etc.
• Software licensing
The usual model doesn’t allow you to install in a cloud.
Big vendors partner with cloud providers to offer pay-as-you-go licensing: Microsoft Windows Server, SQL Server, IBM DB2 and WebSphere can be used on EC2.
Conclusions
• Cloud computing is here to stay: too many advantages
• Technologies still young, in development
• Basic security is not so different from that of existing data centers, but there may be lots of new opportunities for finding vulnerabilities.
• Do not use clouds when:
  • Applications are tightly coupled, or require legacy resources
  • High performance is critical
  • High level of security is critical
  • Control or high availability is critical (risk analysis)