Scale to the Sky: Adding Cloud Processing to
Autodesk® Add-Ins
Mike King – Odeh Engineers, Inc.
CP5132
If you are building Autodesk add-ins and running up against performance problems, is it time to leave the
local machine behind? What if you could, with a quick change, be running on a system with 68GB of
RAM? How about an 8-core system with 4.25 GHz each? How about several of these systems
connected by high-speed networks? Or maybe you don’t need a lot of power, you just don’t want to tie
up the local system so you access a little machine that is available on demand to offload your long
running operations. The flexibility of cloud computing allows you to pay for only what you need when you
need it and you will learn how to leverage it with your Autodesk extensions. We will be using Amazon
EC2 in this class, but the principles are applicable to other frameworks. Autodesk Revit® will be used in
this class, but any .NET-based add-in, including for AutoCAD®, works the same way.
Learning Objectives
At the end of this class, you will be able to:
 Dramatically improve performance of computationally or memory-bound operations
 Recognize good candidate applications for cloud computing
 Extract and serialize data for use by your cloud-based processing
 Set up cloud computing resources and integrate their creation and management into your applications so they are available on demand
About the Speaker
Mike King is a software engineer proficient in Microsoft® .NET technology. He has a
background in civil/structural engineering and in-depth experience in .NET development of all
types. He has worked to customize AutoCAD® from Visual LISP® and VBA to ObjectARX®
and Microsoft .NET, and has significant experience with the Revit® Structure API. At Odeh
Engineers, Inc., he has developed both line-of-business applications, such as a corporate
intranet that incorporates CMS, CRM, and PM functionality, and specific-needs applications
such as custom tablet PC tools for documenting field conditions.
mike.king@odehengineers.com
* Updated versions will be posted leading up to AU 2011
Introduction
Cloud computing is a buzzword right now. Unfortunately, that means it's tossed around a lot
and its meaning is diluted and stretched beyond recognition. When we talk about cloud
computing in this class, we mean computation as a service. We're talking about virtual
computers complete with operating systems, remote access, installed software, and virtualized
hardware. Hardware in this environment is abstracted: specs are provided, but not
manufacturers. The idea is that the service provider can deliver your specs in any manner they
deem cost efficient, as long as your virtual machine performs as promised.
Cloud computing providers run vast data centers and purchase commodity hardware at scales
very few companies could dream of. The economies of scale involved often allow them to
deliver computing resources at remarkably low cost, with flexibility undreamed of with physical
hardware allocations and virtually no maintenance. Taking advantage of this type of computing
in your software applications requires a mind shift, careful planning, evaluation, and
measurement, but the rewards can be remarkable.
While cloud computing is often associated with startups because of the flexibility and low
up-front cost it allows, many large and established companies take advantage of it as well.
Even Autodesk itself has embraced cloud computing.
Autodesk Cloud Computing Tools
 123D Catch Beta (formerly Project Photofly): digital photos to 3D models
 Project Neon: cloud-based rendering
 Autodesk Cloud: software delivered from the cloud on a subscription model
 Project Twitch: streamed software delivery
 AutoCAD WS: browser-based AutoCAD
 Project Storm: cloud-based structural analysis
Benefits
 Do things you simply couldn’t before
 Dramatically improve performance for certain types of applications
 Process lots of data without tying up local resources
 Reduce hardware acquisition and maintenance costs
 Share computing resources easily across locations
 Improve flexibility of server resources
 Alleviate IT burden of planning
Major Providers
 Amazon Elastic Compute Cloud (EC2) - this class
 Microsoft Azure
 Google App Engine (app based, not virtual machines)
Amazon Elastic Compute Cloud (EC2)
EC2 is a web service, or a set of web services, provided by Amazon. More than that, it’s a part
of a suite of tools called the Amazon Web Services (AWS) tools that are intended to make
cloud computing accessible to all developers. Behind the scenes, EC2 is a vast infrastructure
of data centers, hardware, personnel, expertise and software applications designed to keep
your computing and data resources safe and reliable. For a few dollars, you can achieve a
level of dependability and performance that would otherwise be impractical for all but the
largest companies.
Signing Up
You need an Amazon account, and then you need to activate AWS for your account. Once you
do that, you’ll receive an Access Key and a Secret Key (more on that later). You can sign up
here: http://aws.amazon.com/
AWS is fee-based, but you don't pay for anything you don't use, and you won't be on the hook for
more than a couple of dollars a month for development and light use. There is a free usage tier,
but it does not allow Windows instances (only Linux/UNIX) and has other restrictions.
Complete and up-to-date information is available on their website.
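The Secret Key is never sent with a request; instead it is used to compute an HMAC signature over the request, which Amazon verifies on its end. Below is a minimal Python sketch of the Signature Version 2 idea used by the EC2 Query API: the HTTP verb, host, path, and sorted, URL-encoded parameters are joined into a canonical string and signed with HMAC-SHA256. The real API has a few additional rules, and the key and parameter values here are placeholders, not working credentials.

```python
import base64
import hashlib
import hmac
import urllib.parse

def sign_request(params, secret_key, host="ec2.amazonaws.com", path="/"):
    """Sign an EC2 Query API request (Signature Version 2 style).

    The string to sign is the HTTP verb, host, path, and the sorted,
    URL-encoded query parameters, each piece on its own line.
    """
    query = "&".join(
        f"{urllib.parse.quote(k, safe='')}={urllib.parse.quote(str(v), safe='')}"
        for k, v in sorted(params.items())
    )
    to_sign = f"GET\n{host}\n{path}\n{query}"
    digest = hmac.new(secret_key.encode(), to_sign.encode(), hashlib.sha256).digest()
    return base64.b64encode(digest).decode()

# Placeholder credentials for illustration only.
signature = sign_request(
    {"Action": "DescribeInstances", "AWSAccessKeyId": "AKIDEXAMPLE",
     "SignatureMethod": "HmacSHA256", "SignatureVersion": "2"},
    secret_key="secret-key-example")
```

The AWS SDKs do this signing for you; the sketch is only to show why the Secret Key must stay secret.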
Understanding Amazon Machine Images (AMI)
An AMI is essentially a disk image. For those familiar with VMware, it is similar to a VMware
virtual disk. In fact, you can import your VMware virtual disks to create Amazon Machine
Images. An AMI also includes some configuration information and, optionally, tags you can use to
track your resources. Once created, an AMI can be instantiated any number of times on any
type of compatible hardware. (Naturally, you can't run a 64-bit OS in a 32-bit instance.)
Available “base” AMIs from Amazon:
 Basic 32/64 bit “Amazon Linux” (opt. Cluster Instance)
 SUSE Linux Enterprise Server 11 32/64 bit (opt. Cluster Instance)
 Red Hat Enterprise Linux 6.1 32/64 bit
 Microsoft Windows Server 2008 32/64 bit (opt. Cluster Instance)
 SQL Server 2008 Express & IIS
 SQL Server 2008 R2 Standard
 Many more from the community
To create your own AMI:
 Start with existing AMI
 Launch in an instance
 Connect (SSH or Remote Desktop)
 Install software, configure, customize
 Create a new AMI with current state
 Repeat!
Keep in mind when building your AMIs that they will be far more scalable if they don't maintain
local state. Rather than saving things to the local machine, store them in a database or put the
results in a central location where they can be picked up by the client. This allows you to
start up more instances as needed to handle load, or to terminate unneeded instances at any
time, without losing data, to reduce cost.
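The stateless pattern above amounts to interchangeable workers pulling from a shared queue and writing results to central storage. In this Python sketch, the queue and result dictionary are in-process stand-ins for off-instance services such as SQS and S3, and the computation is a placeholder; the point is that no worker keeps anything locally, so workers can be added or terminated freely.

```python
import queue
import threading

# Stand-ins for off-instance services: in production these would be
# something like Amazon SQS (tasks) and SimpleDB or S3 (results).
tasks = queue.Queue()
results = {}
results_lock = threading.Lock()

def worker():
    """A stateless worker: pull a task, compute, push the result to
    central storage. Nothing is kept on the local machine, so any
    number of these can run on any number of instances."""
    while True:
        task_id, payload = tasks.get()
        if task_id is None:  # shutdown signal
            break
        outcome = sum(payload)  # placeholder for the real computation
        with results_lock:
            results[task_id] = outcome

for i in range(5):
    tasks.put((i, list(range(i + 1))))
threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for _ in threads:
    tasks.put((None, None))  # one shutdown signal per worker
for t in threads:
    t.join()
```

Because the workers are identical and stateless, scaling up is just starting more of them; scaling down is just letting them drain and terminate.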
Instance Types
Again, make sure to check http://aws.amazon.com/ec2/#instance for up-to-date information on
available instance types and their hourly costs. Detailed information current as of 11/14/2011 is
available in the accompanying PowerPoint presentation; I'm not going to duplicate it here
because I want you to go to the Amazon website and get accurate information!
Keep in mind that the hardware you are running on is abstracted away in nearly all cases. This
can make the performance specs in instance descriptions look a bit unfamiliar. Memory is in
GB, which isn't very unusual. Processing power, however, is measured in Compute Units:
1 Compute Unit ≈ 1.0-1.2 GHz 2007 Opteron / 2007 Xeon ≈ early-2006 1.7 GHz Xeon
IO performance is differentiated, but not specifically measured (Low, Moderate, High, Very
High).
Instance Categories
 Standard
o 1.7 – 15 GB RAM
o 1-8 Compute Units
o 160-1,690 GB Local storage
o Moderate – High IO Performance
 High Memory
o 17.1 – 68.4 GB RAM
o 6.5 – 26 Compute Units
o 420 – 1,690 GB local storage
o Moderate – High IO Performance
 High CPU
o 1.7 – 7 GB RAM
o 5-20 Compute Units
o 350 – 1,690 GB Local storage
o Moderate – High IO Performance
 Other
o Micro: tiny instance designed for infrequent bursts of CPU
o Cluster (compute/GPU): 22+ GB RAM, 33.5 Compute Units, 1,690 GB local storage, Very High IO performance
Miscellaneous Costs
 Data transfer
 Static IP
 Extra storage
 Monitoring
 Load balancing
 Scaling
Availability Zones / Regions
Regions are geographical areas with different pricing schemes. Choosing a region near your
clients helps reduce latency and improve responsiveness. Each geographic region has multiple
isolated availability zones to improve reliability. Certain resources need to be located in the
same availability zone to interconnect. Availability zones within each region are designated by
letters (e.g. A, B, C) and are account specific but not consistent between accounts and don’t
represent any specific building.
Security Groups
Each running instance is protected by a security group. This group is assigned at instance
launch but can be edited anytime. It is essentially a white-list filter of source IP addresses and
port ranges allowed for inbound connections.
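Conceptually, evaluating a security group against an inbound connection is just a white-list lookup: if no rule matches the source address and destination port, the connection is dropped. A small Python sketch of that idea follows; the rules shown are hypothetical, and real security groups also distinguish protocols and can reference other groups as sources.

```python
import ipaddress

# A security group as a white-list of (CIDR range, low port, high port)
# rules; anything not matched by some rule is rejected. Hypothetical rules:
RULES = [
    ("203.0.113.0/24", 3389, 3389),   # Remote Desktop from the office only
    ("0.0.0.0/0", 80, 80),            # HTTP from anywhere
]

def is_allowed(source_ip, port):
    """Return True if any rule admits this inbound connection."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        addr in ipaddress.ip_network(cidr) and low <= port <= high
        for cidr, low, high in RULES
    )
```

With rules like these, your web port is open to the world while remote administration is reachable only from addresses you trust.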
Elastic IPs
Fixed IP addresses are available at no cost, as long as they are in use. They can be swapped
between running instances at any time, which leads to some interesting reliability and scaling
options. It’s also useful if your client application needs to directly locate or connect to a cloud
instance.
Load Balancing
Optional Auto Scaling and CloudWatch products from Amazon make load balancing and
instance scaling for well-architected applications effortless. Granular performance monitoring
can help detect bottlenecks and support intelligent capacity planning.
Elastic Block Storage (EBS) Volumes
EC2 instance storage is ephemeral: all data is lost when the instance terminates (unless you create
a new AMI from it). For this reason, any persistent data needs to be stored off-instance. There
are many options for this, but one appealing one is an EBS volume. These are persistent
storage volumes allocated within an availability zone that can be attached to zero or one running
instance in that same availability zone. They provide approximately local-storage speed, and snapshots
can easily be used to track history and provide backups. Attaching an EBS volume at startup is
an easy way to allow local persistent storage for your instances without keeping them running
all the time.
Other Related Tools
 Amazon Simple Storage Service (S3): Highly reliable secure storage with URL accessible
objects
 Amazon Virtual Private Cloud (VPC): Provides LAN-like connectivity between multiple
running instances
 Amazon Simple Notification Service (SNS): Notification service supporting a variety of
protocols (email, HTTP, SMS, etc.)
 AWS Direct Connect: Dedicated network connection between your office and your AWS resources
 Amazon SimpleDB: Non-relational database in the cloud
 Relational Database Service (RDS): Relational database in the cloud
 Many more!
Choosing Candidate Applications
Criteria
 Memory or computationally bound operations (performance gain)
 Long running operations (free local resources)
 Serializable data (must be able to transfer data from local machine to cloud and back)
 Not time sensitive (not required, but can allow cheap batch processing)
 Strong separation of concerns (software architecture, UI, logic and data separation)
 Input / output distributed (started on one machine and monitored from others)
 Durability requirements (absolutely positively must complete successfully!)
If you're looking to improve performance, you need to figure out where your existing
bottlenecks are. Is your application CPU bound, memory bound, or is it just slow because of network
latency or lots of disk operations? You need to understand what is happening in your own
application before you can make an informed decision about integrating cloud computing to
boost performance. There is definitely an architecture and complexity cost to adding this type of
processing; be sure that it's justified. Don't underestimate the power of Windows Task
Manager, but you should also try professional profiling tools like those provided by Red Gate or
EQATEC. Try Windows Performance Monitor (PerfMon), or just use Stopwatch
objects and logging in your .NET applications to zoom in on problem areas.
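A lightweight stopwatch-plus-logging wrapper is often all you need to zoom in on a hot spot. The class's examples are .NET, where Stopwatch plays this role; the sketch below is a Python stand-in using time.perf_counter, with an invented label and a dummy workload.

```python
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("perf")

@contextmanager
def stopwatch(label):
    """Log how long the wrapped block takes, to help find hot spots."""
    start = time.perf_counter()
    try:
        yield
    finally:
        log.info("%s took %.1f ms", label, (time.perf_counter() - start) * 1000)

# Wrap any suspect operation; the elapsed time lands in your log.
with stopwatch("build model"):
    total = sum(i * i for i in range(100_000))
```

Sprinkling a few of these around suspect operations quickly tells you whether you are CPU bound, IO bound, or waiting on the network, before you commit to a cloud architecture.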
In order to take advantage of any kind of distributed architecture, you need to have serializable
data. That means you need to be able to take an in-memory .NET object and reduce it to a series
of bytes that can be sent between processes, or over a network connection to another process
on another machine in another building in another part of the country (or world). That object
needs to be reconstructed by the program on the other machine and processed by code there;
then the result needs to be similarly deconstructed, transmitted, and reconstructed back on the
client. For cloud computing, your data transfer is going to happen at WAN speeds, usually
significantly lower than LAN speeds, so make sure your data transfer time isn't going to kill any
performance gains you get from the split.
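The round trip just described can be sketched in a few lines. This Python/JSON version is only an illustration; the field names are invented, and in .NET you would reach for a serializer such as DataContractSerializer, XmlSerializer, or a binary formatter instead.

```python
import json

# A hypothetical job object extracted from the model. Serializing it to
# bytes lets it cross process and network boundaries; deserializing on
# the far side reconstructs an equivalent object for processing.
job = {
    "element_ids": [101, 102, 103],
    "load_case": "D + 0.75L",
    "tolerance_mm": 0.5,
}

payload = json.dumps(job).encode("utf-8")       # serialize: object -> bytes
restored = json.loads(payload.decode("utf-8"))  # deserialize on the far side
assert restored == job  # the round trip must be lossless
```

The size of that payload, multiplied by your WAN bandwidth, is the transfer cost you must weigh against the processing time saved in the cloud.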