The subtle art of e-triage
Professor Ken Birman
Dept. of Computer Science
Cornell University

Last Time
  We learned that complex e-commerce systems are at e-risk
  We saw that e-risks take many forms:
    System complexity
    Failure to plan for failures
    Poor project management
  Can we do better?

E-triage
  In a hospital emergency room, a nurse sees a bunch of sick people
  She can’t deal with everything at once, so she decides which problems are most urgent
  We can use the same idea to “triage” risks to an e-commerce project

Example
  Your e-company will host the ultimate web site for women’s cosmetic products
    Scan in pictures of your clients
    Software on the web site makes cosmetic recommendations
    Depicts results right on the image
  Plan is to be the world’s largest direct sales channel for cosmetics

Business Roles?
  You write the checks, but also set the tone
  You need to learn enough about the choices to guide your technology people to make the right ones
  Your role is “triage”, but of a managerial type

Reduce to a Technical Problem
  Option A: Hire a bunch of hackers
    Explain the vision, trust them
    They start to mumble about Oracle, Sybase, Informix, VRML, HTTPS, XML, IE 5.0 versus Netscape 3.2, scalable cluster servers…
    You zone out
  Problem: your fate in their hands

Reduce to a Technical Problem
  Option B: You bring in the experts
    IBM shows up Monday, HP on Tuesday…
    Their marketing guys want money
    Their technical guys are incomprehensible
    You zone out
  Problem: your money in their hands. Didn’t IBM blow that air-traffic project?

Reduce to a Technical Problem
  Option C: You decide to manage the project in an effective way
    Before bringing in technical people, can you sort out the big issues?
  Problem: you aren’t a technical person.
  (But they know that, and yet they look to you for guidance because you “run the show”)

Basic Maxims
  We can’t do everything
  We need to focus on what really matters
  Each technical feature adds complexity
  Can we achieve a spare, elegant solution that has precisely the mechanisms we need to excel?
  On each choice, ask “Why does this matter?” and “What will be the downstream costs?”

Pieces of the system
  The Internet: customers use it to talk to your company
  Your web server: hosts the “content”
  A database of customer information
    Her face, her age, past purchases, etc.
  A database of cosmetic products
  A database of advertising materials
  Back-office system “clears” transactions

Terminology
  A database is a big collection of information organized for easy access
  Internet technologies provide a way for client systems (browsers) to issue some form of database request to your system
  Usually the request occurs by fetching a “computed” web page
  Internally, your company may have information spread over multiple sites and hidden behind a firewall at each location

Complexity!
  All of this takes an initially simple idea and makes it complex
  The data may not be in one place
  Perhaps many systems will need to cooperate to satisfy your customers
  With lots of customers we need to worry about how the model “scales”

The System
  [Diagram labels: Internet, NewMe.com]

Choices
  How to build the web site itself
  We’ll focus on BEA Weblogic
    This technology is a hot product that many feel “does it all”
    There are lots of other similar products, but Weblogic is typical and the most successful
  But picking the technology doesn’t really solve the business problem…

Roll Your Own?
  With Weblogic you
    Use their “tools”
    But your own people build the actual web site
  Their tools provide:
    Basic functionality for web sites on which people can do shopping-cart style purchases
    Database to track inventory and customer histories
    They even handle credit card transactions

Why is Weblogic a Success?
  Very easy to get slick web sites off the ground
  Built-in solutions for most of the things you might need to do
  And BEA has related products to link the web site you build to your other “systems” within your company
  All of this makes for a good story: “the whole story” for building web sites

All About Weblogic
  The basic architecture:
    On the client side, a “cookie” tracks the contents of the shopping cart
    Requests are received by the Weblogic load-balancing proxy
    Behind it, a cluster of servers runs the web pages
  [Diagram labels: proxy, server]

Weblogic Limits
  For extremely popular web sites, the approach Weblogic takes won’t scale
    Remote users will get slow response
    Hard to customize content so that each user sees a “private” set of web pages
    And the database itself is built by other companies, forcing you to make a choice
    High availability is a big issue at this level
  Yet their product literature claims that Weblogic scales extremely well.

Business Tradeoffs
  These limitations are serious worries
  Near term, they won’t impact you
    Building a great site will come down to hiring great graphics people
  But long term, they point to problems scaling your business model
    You might not easily have discovered the problem
  In effect, the technology choice may dictate the business model and price point… at a stage when you didn’t even know you were buying into them!

Project Creep
  A common phenomenon
    You make sensible initial decisions
    And the technology seems to work
    Your company gets off the ground
  But then you hit limits
    These force you to hack solutions
    Your business suffers downtime… unreliability
  Ultimately, your people do more and more hacking and may have to rebuild from scratch… or worse!
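
The architecture listed under “All About Weblogic” (cart contents in a client-side cookie, a load-balancing proxy, a cluster of servers behind it) can be illustrated with a toy sketch. This is not Weblogic's API or its actual balancing policy: the class name `RoundRobinProxy`, the function `add_to_cart`, the server names, and the round-robin rule are all assumptions made for illustration.

```python
from itertools import cycle


class RoundRobinProxy:
    """Hypothetical proxy that hands each request to the next server in turn."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._rotation = cycle(self.servers)

    def pick_server(self):
        # Round-robin is an assumed policy, chosen only because it is simple.
        return next(self._rotation)


def add_to_cart(cookie, item):
    """Return a new cookie with `item` added; the cart lives on the client."""
    cart = list(cookie.get("cart", []))
    cart.append(item)
    return {**cookie, "cart": cart}


if __name__ == "__main__":
    proxy = RoundRobinProxy(["server-1", "server-2", "server-3"])
    cookie = {}
    for item in ["lipstick", "mascara"]:
        server = proxy.pick_server()
        # Any server can process the request, because the cart travels
        # with the request inside the cookie rather than living on one server.
        cookie = add_to_cart(cookie, item)
        print(server, "handled request; cart is now", cookie["cart"])
```

Because the cart state rides along in the cookie, the proxy is free to send consecutive requests from one customer to different servers.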
Avoiding Project Creep
  The key to sensible business decisions in technology settings is scalability
  To be a success, your company needs to see revenue grow without requiring a commensurate increase in skilled labor
  Focus your business decision making on scalability questions about the technology as your company will be using it
  The mantra: “scalable from start to finish”

Scalability
  We use this term to talk about
    Ability of a system to continue providing high quality services to its users
    Even as the number of users rises, and/or
    Size of the network rises, and/or
    Load on the system increases, and/or
    Amounts of data it manages increase

Scalability is Hard!
  In small scale settings
    Conditions are easily controlled
    We don’t tend to see failures and recoveries
      Things that can fail include computers and the software on them, network links, routers…
    We are not likely to come under attack

Fundamental Issues of Scale
  Suppose a machine can do x business transactions per second
  If I double the load to 2x, how big and fast a machine should I buy?
  With computers, the answer isn’t obvious!
    If the answer is “twice as big” we say the problem scales “linearly”
    Often the answer is “4 times as big” or worse!
    Such problems scale poorly, perhaps even exponentially!
  Basic insight: “bigger” is often much harder!

Does the Internet “Scale”?
  It works pretty well, most of the time
  But if you look closely, it has outages very frequently
  Butler Lampson, Turing Award winner (to paraphrase):
    Computer scientists didn’t invent the World Wide Web because they are only interested in building things that work really well. The Web, of course, is notoriously unreliable. But the insight we, as computer scientists, often miss is that the Web doesn’t need to work well!

A “reliable web”: an example of an oxymoron?
  Internet scales but has low reliability

How do technologies scale?
  One of the most critical issues we face!
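
A small worked example of the “twice as big” versus “4 times as big” distinction from the Fundamental Issues of Scale slide. The function name `machine_size_needed` and the power-law cost model are assumptions for illustration; real systems rarely follow such a clean curve.

```python
def machine_size_needed(load_multiple, exponent):
    """Relative machine size for a given multiple of the base load x.

    Assumes (hypothetically) that required capacity grows as
    load_multiple ** exponent: exponent 1 is linear, 2 is quadratic.
    """
    return load_multiple ** exponent


if __name__ == "__main__":
    for label, exponent in [("linear", 1), ("quadratic", 2)]:
        sizes = [machine_size_needed(m, exponent) for m in (1, 2, 4, 8)]
        print(f"{label:>9}: load x1, x2, x4, x8 -> size {sizes}")
    # linear:    [1, 2, 4, 8]
    # quadratic: [1, 4, 16, 64]
```

Under the quadratic assumption, doubling the load needs a machine four times as big, and an eightfold load needs one 64 times as big, which is why poorly scaling designs become unaffordable quickly.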
  The bottom line is that, on the whole
    Very few technologies scale well
    The ones that do tend to have poor reliability and security properties
    Scale introduces major forms of complexity
    And large systems tend to be unstable, hard to administer, and “fragile” under stress

Web scaling issues
  A very serious problem for popular sites
  Most solutions work like this:
    Your site “offloads” data to a web hosting company
    Examples are Akamai and Exodus
    They replicate your pages at many sites worldwide
    Ideally, your customers see better performance
  Second approach: Digital Island
    They focus on giving better connections to the Internet backbone, avoiding ISP congestion…

Akamai Approach
  They cut deals with lots of ISPs
    Give us room in your machine room
    In fact, you should pay us for this!
    We’ll put our server there
    And it will handle so much web traffic…
    That your lines will be less loaded, since nothing will need to go out to the backbone
    And this will save you big bucks!

A Good Idea?
  Akamai approach focuses on “rarely changing” data
    Example: pictures used on your web pages
    Non-example: the pages themselves, which are often constructed in a customized way
  Pre-Akamai:
    Your web site handles all the traffic for constructing pages and also handing out the pictures and static stuff
  Post-Akamai:
    You hand out the main pages but the URLs for pictures point to Akamai web servers

Pre- and Post-Akamai
  [Slide background: source of a sample web page]
    <html> <head> <META name=VI60_defaultClientScript content=JavaScript><title>PokéMansion - The Big House of Pokemon Websites.</title>
    <script>function load(){changeEmail();}function thisPage(){page.style.display="block";site.style.display="none";}function thisSite(){site.style.display="block";page.style.display="none";}function changeEmail(){emailind.innerHTML=' '+formail.recipient.value;}</script>
    <script language="JavaScript" src="textsearch.js"></script>
    <STYLE>INPUT {background-color:black; color:orange; font: 12 'Arial'}TEXTAREA {background-color:black; color:orange; font: 12 'Arial'}SELECT {background-color:black; color:orange; font: 12 'Arial'}</STYLE></head>
    <BODY bgcolor="#000000" text="#cc9900" onload=load() link=gold alink=gold vlink=gold topMargin=0> NuMe.com <!-- ads --> <center>
  Pre-Akamai, the pages fetched by the browser are a mass of URLs
  And these point to things like pictures and ads stored at NuMe.com
  So to open a page, the user
    Sends a request
    Fetches an index page
    Then one by one, fetches each item on the page
  The NuMe.com server handles all of these requests

Pre- and Post-Akamai
  Post-Akamai, these URLs point to an Akamai server “near” the customer
  They have many servers worldwide and replicate your data onto them
  NuMe.com sees less load
  [Diagram labels: Akamai, Akamai, Akamai]

A good idea?
  Akamai approach has many limits
    They need time to copy data from your site to their sites
    So if you change something, customers won’t see the changes for a while
    Could be an issue if you need “up to the moment” content or customized web pages
  We call this a “multicast” problem

More Akamai Issues
  Akamai also has competitors
    Exodus, Digital Island
    All do variations on the same theme
  Akamai has special handling of advertising
    They get a cut of your advertising revenue
    You might come to regret this “tax”
  Akamai itself is having scalability problems
    Some say that their technology is “killing the Internet”!

Alternatives?
  You could try to build your own large-scale web farms
    But you’ll run into big costs to administer them
    Akamai’s problems are ultimately technical; you’ll hit them too
  Bottom line is that the web may be a great idea, but the scalability of many aspects leaves much to be desired

Other kinds of worries
  As your site gets big it may become hard for users to find things
    Search engines really don’t work very well
    And “natural language” processing remains a distant dream after decades of research
  So users may have an increasingly frustrating experience as your site grows
  This says nothing about the challenges of shipping all that product, or dealing with returns and customer inquiries

Notice how Technical Issues Blur…
  We started with a technical worry
    “Gee, this approach won’t scale”
  And found ourselves facing a broader question of a business nature
  Fundamentally, healthy companies live in a sweet spot
    They can make good money and see steady revenue growth
    And the technology matches the style of use

Stepping Back
  For today, this is enough technology
  Management roles so far?
    A type of semi-informed inquiry
    Push technical people to explain the need for things
    Keep asking: have people been doing this for long?
    The hot new thing: often high risk.
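
The staleness limit noted for the Akamai approach (“if you change something, customers won’t see the changes for a while”) can be seen in a minimal sketch of a replica that refreshes its copy only after a time-to-live expires. This is not Akamai's mechanism; the class `EdgeCache`, the pull-on-expiry policy, the 60-second lifetime, and the `/logo.gif` path are assumptions for illustration.

```python
class EdgeCache:
    """Hypothetical replica that serves a stored copy until it is ttl seconds old."""

    def __init__(self, origin, ttl_seconds):
        self.origin = origin      # dict of url -> content, standing in for your site
        self.ttl = ttl_seconds
        self._copies = {}         # url -> (content, time the copy was fetched)

    def get(self, url, now):
        cached = self._copies.get(url)
        if cached is not None and now - cached[1] < self.ttl:
            return cached[0]      # copy is still "fresh": origin is not consulted
        content = self.origin[url]
        self._copies[url] = (content, now)
        return content


if __name__ == "__main__":
    origin = {"/logo.gif": "old logo"}
    edge = EdgeCache(origin, ttl_seconds=60)

    print(edge.get("/logo.gif", now=0))    # old logo (first fetch from origin)
    origin["/logo.gif"] = "new logo"       # you update your site
    print(edge.get("/logo.gif", now=30))   # old logo (replica is stale)
    print(edge.get("/logo.gif", now=60))   # new logo (copy expired, refetched)
```

A shorter lifetime narrows the stale window but sends more traffic back to your own site, which is the load the replica was meant to remove.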
Management Dos and Don’ts
  Don’t be seduced deep into the technology choices
  Instead focus on simple, broad issues:
    How important is this early in the project?
    Is there an incremental path that lets us get something online sooner?
    How much complexity does this introduce?

Toughest Issues: Scale and Security
  Most things that work scale poorly today
    This is a feature, and a problem, with modern web and database technology
  But recognize: your company can’t solve the world’s technology problems
    Trick is to be solidly established with what works today, positioned to upgrade later
    Intelligent management of risk, rather than trying to eliminate all risks completely

Too Many Objectives
  [Diagram label: NewMe.com]

Business Methodology
  Sort out the big problems from the small ones
  Keep your “eyes on the ball”
    Scalable revenue goals
    Running in the “technical sweet spot”
  Where you perceive risk, reduce it with extensive testing

Styles of eBusiness
  Useful to distinguish categories of eBusiness problems and issues
  Technology companies: questions revolve around
    How cool is this technology? Is it the whole solution?
    Who will buy it now? Later?
    How does this scale?
  Requires very sophisticated business decision making and is an unsafe world for the non-technical business person!

Styles of eBusiness
  Technology companies
  Web sites that sell products
    Here the issue of scalability revolves around customer experience
    Does your technology scale?
    Will your customer base scale?
    Early evidence: customer experience is so-so on the web; some successes but many failures
    Success quickly breeds imitation

Styles of eBusiness
  Technology companies
  Web sites that sell products
  Selling the bricks and mortar
    Here scalability revolves around the size of the market, the need for your product, and your ability to deliver it in volume
    You succeed if your customers manage to sell more networked solutions

Styles of eBusiness
  Trick is always to understand
    Growth of revenue, costs of providing service
  But technology scalability introduces limits that often prevent or inhibit growth
    These emerge as business risks as an area matures
    Only really clever players succeed when the risks become substantial

Enough for now
  Basic insights?
    Scalability and security are the biggest threats
    Thinking about scale offers an angle to help you sort out priorities and to understand how realistic your business model will be
  We’ll stop now for a break
  After the break, shift emphasis to understand the existing technical options a bit better
  But we won’t go overboard, so don’t panic!