Introduction To Managing Infrastructure Paul Strong Distinguished Engineer, eBay Research Labs Acting Chair, Open Grid Forum (OGF) ® OGF22, 26th February 2008 Copyright Notice © 2007 eBay Inc. All rights reserved. • No part of these materials may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording, or otherwise) without the prior permission of eBay Inc. • eBay and the eBay logo are registered trademarks of eBay Inc. • PayPal and the PayPal logo are registered trademarks of PayPal, Inc. • Other trademarks and brands are the property of their respective owners. • Please do not take our picture or record the class/session without asking permission. ® ©2008, eBay Inc. What Is Infrastructure? More Than A Collection Of Technologies • Runs Business Processes driven by SLAs • Is a Value Center, rather than a Cost Center A new dialog between IT and the Business • Enables Internal utilities for core business functions/processes External utilities for business process elements Businesses based on business process mash-ups Opportunities for new platforms ® ©2008, eBay Inc. Managing Infrastructure Is not about managing servers, operating systems, disks etc. ® ©2008, eBay Inc. What Is Modern Infrastructure? • • • • • • • Is it the platform for SOA? Is it a Grid? Does it leverage Virtualization technologies? Is it full of blades? Is it Greener? Is it more efficient? Is it more automated? ® ©2008, eBay Inc. What Is Modern Infrastructure? • • • • • • • Is it the platform for SOA? Is it a Grid? Does it leverage Virtualization technologies? Is it full of blades? Is it Greener? Is it more efficient? Is it more automated? ® ©2008, eBay Inc. NGDC, Friends & Relations U ® ©2008, eBay Inc. ® ©2008, eBay Inc. Why eBay Is A Useful Example New Challenges Extreme Engineering The Bleeding Edge Everyday use Technology trickle down/transfer ® ©2008, eBay Inc. eBay – The 30 Second Introduction! eBay users trade about $1,840 worth of goods on the site every second On an average day on eBay… A vehicle sells every minute A motors part or accessory sells every second Diamond jewelry sells every 2 minutes 1.3m people make all or part of their living selling on ©2008, eBay Inc. *ACNielsen International Research, June 2006 * ® eBay’s Drivers • Extreme Scale 241m Registers Users, 100m+ Items, 6m+ New Items Per Day • Extreme Growth Near exponential growth in listings for most of history – 12 years • Extreme Agility Roll code to the site every 2 weeks • Constant, predictable presence Must be 24x7x365 • Efficiency Failure To Keep Up Is Not An Option! ® ©2008, eBay Inc. eBay Example #1 Making The Database Scale • Second Database for failover • CGI pools, Listings, Pages, and Search continued to scale horizontally However … By November 1999, the database servers approached their limits of physical growth. S/W Load Balancer Web Server 1999 S/W Load Balancer Web Server S/W Load Balancer Web Server S/W Load Balancer Apache C++ OS OS OS UNIX “CGIn” “Listings” “Pages” “Search” COTS Search RDBMS UNIX RDBMS UNIX bull.ebay.com bear.ebay.com ® ©2008, eBay Inc. eBay Example #1 Making The Database Scale • Database "split" technology. • Logically partition database into separate instances. • Horizontal scalability through 2000, but not beyond. S/W Load Balancer S/W Load Balancer Web Server Web Server S/W Load Balancer Web Server S/W Load Balancer Apache C++ OS OS OS UNIX “CGIn” “Listings” “Pages” “Search” 2000 COTS Search RDBMS RDBMS RDBMS RDBMS UNIX UNIX UNIX UNIX chard.ebay.com cab/bongo.ebay.com bull.ebay.com bear.ebay.com ® ©2008, eBay Inc. eBay Example #1 Virtualizing the Database Application Servers Attributes Catalogs Rules DB 1 CATY 1…N User Account Feedback DB 2 Misc API Scratch DB 3 • Separate Application notion of a database from physical implementation • Databases may be combined and separated with no code changes • Reduce cost of creating multiple environments (Dev, QA, …) • Application can continue to function without non-critical data (markdown) ©2008, eBay Inc. ® eBay Example #1 Virtualizing & Scaling the Database November,1999 1999 November, ® ©2008, eBay Inc. eBay Example #1 Virtualizing & Scaling the Database December, 2002 SAN ® ©2008, eBay Inc. eBay Example #1 Virtualizing & Scaling the Database • Scales Out 241 million registered users 103 million Items 6 million new items per day 34 billion SQL transactions per day 600+ production database instances (inc replicas) 100+ clusters • Cheaper Smaller, potentially commodity, servers • Highly Resilient 2-4 copies of everything Minimized impact of outage to [relatively] small sub-set of data • Flexible/Agile Easy to change – database, schemas, partitioning etc. Minimal impact on architecture or code ® ©2008, eBay Inc. eBay Example #2 Scaling The Application ® ©2008, eBay Inc. eBay Example #2 Scaling The Application • Partition code into functional areas – – Application is specific to a single area (Buying, Selling etc.) Domain contains common business logic across applications • Restrict inter-dependencies – – Applications depend on Domains, not on other applications No dependencies among shared domains User Application Selling Application Buying Application Billing Application Search Application Applications User Domain Selling Domain Buying Domain Billing Domain Search Domain Personalization Domain User Validation Domain Shared Billing Domain Shared Buying Domain myEBay Domain Shared Search Domain Core Domain API Domain Lookup Domain ©2008, eBay Inc. Shared Domains ® eBay Example #2 Scaling The Application • Segment functions into separate application pools – Minimizes/isolates DB dependencies – Allows for parallel development, deployment and monitoring ViewItem Pool SYI Pool http://cgiX.ebay.com... http://cgiY.ebay.com... Load Balancer Web Load Balancer Web Web Load Balancer AS AS Web Servers Web Load Balancer AS AS AS AS App Servers Load Balancer User ©2008, eBay Inc. Acct Caty1 Caty20+ ® eBay Example #2 Scaling The Application • Everything behaves as loosely coupled services • Minimize inter-dependencies • Infrastructure is like a giant FPGA – Potential to re-program by re-routing traffic • Scales – Scale out means scaled throughput and resilience – 16000+ concurrent instances – 8000+ servers (mainly blades) • Efficiency – Run traffic from different time zones on the same server but different instances ® ©2008, eBay Inc. Consequences • Scale Out – Pro – Scale, throughput, resilience, use commodity products – Con – More to manage – complexity, relationships • Virtualization – Pro – Flexibility – Con – more relationships to manage • Commodity – Interchangeable, choice, no lock in, lower unit cost ® ©2008, eBay Inc. The Big Problem Management complexity scales with this # Relationships # Components ©2008, eBay Inc. ® Understanding Relationships Service A is composed of Persistence Sub-Service B Business Logic Sub-Service C A Presentation Sub-Service D B C D ® ©2008, eBay Inc. Understanding Relationships Business Logic Sub-Service C is composed of A Load Balancing Service Several Application Instances A C B App D App LBS ® ©2008, eBay Inc. Understanding Relationships The Application Instances are hosted on Operating System Instances The Load Balancing Service is hosted on A A Load Balancer Operating System B C D App App LBS OS OS LB ® ©2008, eBay Inc. Understanding Relationships The Operating System Instances are hosted on Servers or Virtual Servers, which are in turn hosted on servers The Load Balancer OS is hosted on A A Physical Load Balancer B C D App App LBS OS OS LB VS Svr Svr LB ® ©2008, eBay Inc. Categorizing The Components Biz Process/Service Virtualized Platform A B Platform Instance C D App App LBS OS OS LB Virtualized OE OE Virtualized Physical VS Physical Svr Data ©2008, eBay Inc. Svr LB Business Logic Presentation ® Categorizing The Components N.B. Diagram shamelessly borrowed from the Open Grid Forum Reference Model (formerly EGA Reference Model) eBay Auction Biz Process/Service Virtualized Platform Platform Instance eBay Buy Item eBay Search Web Server Farm Aggregations eBay Sell Item Clusters Federation Web Server Database LDAP PayPal Load Balanced Farms Application Server Virtualized OE Network File systems NFS, CIFS Virtualized OS eg Solaris Containers, BSD Jails etc. Load Balancers, Global IP in clusters OE File Systems OS - eg AIX, HP/UX, Linux, Solaris, Windows etc. IP, TCP, UDP etc Virtualized Physical Physical LUNs, Volumes Disks, Array Controllers, SAN Switches Storage ©2008, eBay Inc. VMMs & Hypervisors VLANs Hardware Partitions Servers, Blades etc. Switches, Routers etc.. Compute Network ® Interaction/Traffic Relationships Biz Process/Service Virtualized Platform Platform Instance Data Business Logic Presentation ® ©2008, eBay Inc. Interaction/Traffic Relationships Biz Process/Service Virtualized Platform Platform Instance Data Business Logic Presentation ® ©2008, eBay Inc. Interaction/Traffic Relationships Biz Process/Service Virtualized Platform Platform Instance Data Business Logic Presentation ® ©2008, eBay Inc. Relationships Are Everything! • Everything is interconnected • Changing one thing causes ripples • How you connect things together determines business functionality and business value • Agility is the ability to change these relationships dynamically (easier with loosely coupled services) • Virtualization is about standardizing a relationships and interposing/isolating one end from the other • Understanding these relationships allows you to Tie business processes to the infrastructure they run on Map value to cost Understand and manage traffic flow Understand and manage provisioning etc. • It’s all about managing relationships, not things! ©2008, eBay Inc. ® Conclusions • NGDC is not just about technology that enables greater scaling, flexibility, resilience etc. • NGDC has to be about changing the nature of the data center and its relationship to the business • The challenge is how to understand and manage relationships, not just things! ® ©2008, eBay Inc. Thank You Paul Strong Paul.Strong@eBay.com Distinguished Engineer eBay Research Labs, eBay Inc. ® Understanding Relationships A B C App Data ©2008, eBay Inc. D App LBS Business Logic Presentation ® Why eBay Is A Useful Example • Driven by business need • No one else delivering transactions on this scale to this many users • Extreme today but where we go today many others will surely follow • Just as Formula 1 Grand Prix helps drive automotive technology that ends up in all cars • Graphic F1 car to Ford Focus • eBay+Google+Amazon+Yahoo+Banks -> Everyone Else ® ©2008, eBay Inc. What Is A Next Generation Data Center? • Is it an incremental step or a quantum leap forward? • Is it the platform for SOA? • Is it a Grid? • Does it leverage Virtualization technologies? • Is it full of blades? • Is it Greener? • Is it more efficient? • Is it more automated? ® ©2008, eBay Inc. Getting To The Vision Step 1 – Describe The Platform (NGDC) Biz Process/Service Virtualized Platform Platform Instance Virtualized OE OE Virtualized Physical Physical ® ©2008, eBay Inc. Getting To The Vision Step 1 – Describe The Platform (NGDC) ® ©2008, eBay Inc. Getting To The Vision Step 1 – Describe The Platform (NGDC) ® ©2008, eBay Inc. Getting To The Vision Step 1 – Describe The Platform (NGDC) ® ©2008, eBay Inc. Getting To The Vision Step 1 – Describe The Platform (NGDC) ® ©2008, eBay Inc. Getting To The Vision Step 1 – Describe The Platform (NGDC) ® ©2008, eBay Inc. Getting To The Vision Step 1 – Describe The Platform (NGDC) ® ©2008, eBay Inc. Getting To The Vision Step 1 – Describe The Platform (NGDC) ® ©2008, eBay Inc. Getting To The Vision Step 1 – Describe The Platform (NGDC) ® ©2008, eBay Inc. Getting To The Vision Step 1 – Describe The Platform (NGDC) Biz Process/Service Virtualized Platform Platform Instance Data Business Logic Presentation ® ©2008, eBay Inc. Getting To The Vision Step 1 – Describe The Platform (NGDC) ® ©2008, eBay Inc. Getting To The Vision Step 1 – Describe The Platform (NGDC) ® ©2008, eBay Inc. Getting To The Vision Step 1 – Describe The Platform (NGDC) ® ©2008, eBay Inc. Getting To The Vision Step 1 – Describe The Platform (NGDC) ® ©2008, eBay Inc. eBay’s growth has been amazing 588 Million Listings How do you keep up with this? 233 Million Users Q1Q2Q3Q4Q1Q2Q3Q4Q1Q2Q3Q4Q1Q2Q3Q4Q1Q2Q3Q4Q1Q2Q3Q4Q1Q2Q3Q4Q1Q2Q3Q4Q1Q2Q3Q4Q1Q2 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 ® ©2008, eBay Inc. The Challenges • Massive scale and throughput 233 million registered users 100+ million items available at any given time >1 billion page views a day >6 million new items a day • Security • Exponential Growth Always architecting for 10X growth • Agility 300+ new features per quarter Bi-weekly production code refresh • Availability 24x7x365 ® ©2008, eBay Inc. Ongoing Platform Evolution… Registered Users 222M V1 1998 1999 V2.0 2000V2.3 2001 2002 V2.4 2003 V2.52004 2005 V3 2006 V4 eBay architecture versions ® ©2008, eBay Inc. V1.0 1995-September 1997 • • • • • Built over a weekend in Pierre Omidyar’s living room in 1995 System hardware was made up of parts that could be bought at Fry's Every item was a separate file, generated by a Perl script No search functionality, only category browsing Direct attach, internal storage calculus.ebay.com trig.ebay.com thompson.ebay.com thomson.ebay.com Apache Apache Apache Apache Perl Perl Perl Perl GDBM GDBM GDBM GDBM FreeBSD FreeBSD FreeBSD FreeBSD “cgi” “listings” “pages” “mail” This system maxed out at 50,000 active items ® ©2008, eBay Inc. V2.0 September 1997- February 1999 • 3-tiered conceptual architecture (separation of bus/pres and db access tiers) • 2-tiered physical implementation (no application server) • C++ Library (eBayISAPI.dll) • Commercial index server used for search • Items migrated from GDBM to a commercial database on UNIX • Locally attached storage for the un-clustered database iguana.ebay.com komodo.ebay.com cayman.ebay.com gecko.ebay.com Web Server Web Server Web Server Web Server C++ OS C++ OS OS OS ViewItem Everything but ViewItem Pages Listings Search Index Server python.ebay.com RDBMS UNIX ® ©2008, eBay Inc. V2.1 February 1999-November 1999 • Servers grouped into pools • S/W load balancer used for front end load balancing and failover • Commercial search platform used • Search functionality moved to the commercial indexing system • Back-end database server scaled vertically to a larger machine S/W Load Balancer Web Server S/W Load Balancer S/W Load Balancer Web Server Web Server S/W Load Balancer Apache C++ OS OS OS UNIX “CGIn” “Listings” “Pages” “Search” COTS Search RDBMS UNIX python.ebay.com ® ©2008, eBay Inc. V2.5 April 2001 • True Horizontal Scalability • Items split by category • SPOF elimination S/W Load Balancer S/W Load Balancer Web Server Web Server S/W Load Balancer Web Server S/W Load Balancer Apache C++ OS OS OS UNIX “CGIn” “Listings” “Pages” “Search” COTS Search RDBMS UNIX RDBMS … UNIX Category Splits RDBMS UNIX RDBMS UNIX chard.ebay.com cab/bongo.ebay.com ® ©2008, eBay Inc. Scaling The Data Tier 1 • Partition By function (user, item, account etc.) – looser coupling and better scaling • Keep business logic out of the database tier No stored procedures Few triggers Referential Integrity • Minimize database resource and CPU usage Extensive use of prepared statements and bind variables • Move traditional database functions to the application tier Joins Sorting ® ©2008, eBay Inc. Scaling The Data Tier 2 • Auto-commit for vast majority of DB writes • Absolutely no client side transactions Single database transactions managed through anonymous PL/SQL blocks No distributed transactions • How do we pull it off Careful ordering of DB operations Recovery through Asynchronous recovery events Reconciliation batch Failover to async flow • Partition Again! By access path etc. Multiple patterns – write master/ read slave(s), split by modulus of key etc. • Extensive Caching ® ©2008, eBay Inc. Scaling The Data Tier – Summary • Spread the load Segmentation by function Horizontal splits within functions • Minimize the work Limit in-database work • The tricks to scaling How to survive without transactions Creating alternate database structures • Results Scales to >2PB data today 600 database instances (>100 clusters) in production (inc copies) 26 Billion SQL transactions a day Seamless redeployment of databases – ~6 times a week Extreme resilience ©2008, eBay Inc. ® Now that we have the Database taken care of…. • Application Server Monolithic 2-tier architecture 3.3 million lines of C++ (150MB binary) Hundreds of developers, all working on the same code Hitting compiler limits on number of methods per class (!!) • Search COTS software had reached it’s limits 9 hours to update the index Running on largest SMP money could buy – and still not keeping up Cost of scaling had become prohibitive ® ©2008, eBay Inc. Scaling The Application Tier – V3 – Replacing ISAPI with Java 2002-present • Re-wrote the entire application in J2EE application server framework Gave us a chance to architect the code for reuse and separation of duties • Leveraged the MSXML framework for the presentation layer Minimizing the development cost for migration • Implemented a development kernel as a foundation for programmers Allowed for rapid training and deployment of new engineers ® ©2008, eBay Inc. Scaling The Application Tier – Tiered Application Model • Strictly partition application into tiers Presentation Business Integration XSL Presentation Tier Business Tier Integration Tier Command (View) AO/AOF (View) XML Model Building Logic BO/BOF Business Logic DO/DAO Data Access Layer ® ©2008, eBay Inc. Scaling The Application Tier – Data Access Layer (DAL) • What is the DAL? eBay’s internally developed pure Java OR mapping solution All CRUD (Create, Read, Update, Delete) operations are performed through the DAL’s abstraction of data Enables horizontal scaling of the data tier without application code changes • Dynamic data routing abstracts application developers from Database splits Logical/physical hosts Markdown Graceful degradation • Extensive JDBC prepared statements cached by DataSources ® ©2008, eBay Inc. Scaling The Application Tier – Platform Decoupling • Domain partitioning for deployment Decouple non-transactional domains from transactional flows Search and billing domains are not required in transaction processing Fraud domain is required but easier to manage as separate deployment Integrate with a combination of asynchronous EDA and synchronous SOA patterns Transaction Platform EDA EDA SOA Billing Search Fraud ® ©2008, eBay Inc. Scaling The Application Tier – Summary • Spread the load Segmentation by function Horizontal load balancing within functions • Minimize dependencies Between applications Between functional areas From applications to data tier resources • Virtualize data access • Results 15000 instances of the V3 stack >1 billion page views a day ® ©2008, eBay Inc. Scaling Search – Overview • In 2002, eBay search had reached its limits Cost of scaling third party search engine had become prohibitive 9 hours to update the index Running on the largest systems vendor sold – and still not keeping up • eBay has unique search requirements Real-time updates Update item on any change (list, bid, sale, etc.) Users expect changes to be visible immediately Exhaustive recall Sellers notice if search results miss any item Search results require data (“histograms”) from every matching item Flexible data storage Keywords Structured categories and attributes No off-the-shelf product met these needs ® ©2008, eBay Inc. Scaling Search – Voyager • Real-time feeder infrastructure Reliable multi-cast from primary database to search nodes • Real-time indexing Search nodes update index in real time from messages • In memory search index • Horizontal segmentation (scatter, gather) Search index divided into N slices (“columns” ) Each slice replicated to M instances (“rows”) Aggregator parallelizes query over all N slices, load balances over M instances • Caching Cache results for highly expensive and frequently used queries ® ©2008, eBay Inc. Architectural Lessons Learnt • Scale Out, Not Up Horizontal scaling at every tier Functional decomposition • Prefer Asynchronous Integration Minimize availability coupling Improve scaling options • Virtualize Components Reduce physical dependencies Improve deployment flexibility • Design For Failure Automated failure detection and notification “Limp mode” operation of business features ® ©2008, eBay Inc. So What’s The Downside? • Too many things to manage! ~12000 servers, 15000 app server instances Linear increase in number of things to manage… …leads to a potentially exponential rise in complexity ® ©2008, eBay Inc. So What’s The Downside? • Too many things to manage! ~12000 servers, 15000 app server instances Linear increase in number of things to manage… …leads to a potentially exponential rise in complexity • Too many relationships to manage! Mapping cost to value becomes a challenge Diagnosis becomes a major challenge Becoming almost chaotic as we move to SOA and more personalization ® ©2008, eBay Inc. So What’s The Downside? • Too many things to manage! ~12000 servers, 15000 app server instances Linear increase in number of things to manage… …leads to a potentially exponential rise in complexity • Too many relationships to manage! Mapping cost to value becomes a challenge Diagnosis becomes a major challenge Becoming almost chaotic as we move to SOA and more personalization • Patterns help Minimizing variety of things to manage, reduces cost Fewer: vendors, products, versions, patterns… ® ©2008, eBay Inc. So What’s The Downside? • Too many things to manage! ~12000 servers, 15000 app server instances Linear increase in number of things to manage… …leads to a potentially exponential rise in complexity • Too many relationships to manage! Mapping cost to value becomes a challenge Diagnosis becomes a major challenge Becoming almost chaotic as we move to SOA and more personalization • Patterns help Minimizing variety of things to manage, reduces cost Fewer: vendors, products, versions, patterns… • Patterns also constrain Processes and tools optimize for small set of patterns Introducing more patterns becomes complex Temptation to reuse patterns inappropriately ©2008, eBay Inc. ® Scaling Operations – Code Deployment • Demanding Requirements Entire site rolled every 2 weeks All deployments require staged rollout with immediate rollback if necessary. More than 100 WAR configurations. Dependencies exist between pools during some deployment operations. More than 15,000 instances across eight physical data centers. • Rollout Plan Custom application that works from dependencies provided by projects. Creates transitive closure of dependencies. Generates rollout plan for Turbo Roller. • Automated Rollout Tool (“Turbo Roller”) Manages full deployment cycle onto all application servers. Executes rollout plan. Built in checkpoints during rollout, including approvals. Optimized rollback, including full rollback of dependent pools. ® ©2008, eBay Inc. Scaling Operations – Monitoring • Centralized Activity Logging (CAL) Transaction oriented logging per application server Transaction boundary starts at request. Nested transactions supported. Detailed logging of all application activity, especially database and other external resources. Application generated information and exceptions can be reported. Logging streams gathered and broadcast on a message bus. Subscriber to log to files (1.5TB/day) Subscriber to capture exceptions and generate operational alerts. Subscriber for real time application state monitoring. Extensive Reporting Reports on transactions (page and database) per pool. Relationships between URL’s and external resources. Inverted relationships between databases and pools/URL’s. Data cube reporting on several key metrics available in near real time. ©2008, eBay Inc. ® What We Want • Manage and monitor business process, not servers • Measure value, rather than just cost • Map value to cost • Simplify management in spite of increased complexity under the covers • Allow high school graduates to manage the site ® ©2008, eBay Inc. Nonetheless We Need A New Context • Recognition that the applications have changed Massively distributed Becoming finer grained • So has the platform… Sets of discrete resources are now fabrics • Need a new context for integrating the disparate tool set • The data center network is a System… a Grid Grids are the integrated, network distributed platforms for distributed applications All classes of apps – transactional and computational • What’s different Integration – Everything up to and including database, application and web servers, load balancers etc. is the platform Sharing – This platform has to be shared to deliver economies of scale • Q - What is the (meta-)operating system for this platform? • A – What we think of as systems management tools today BUT most do not recognize this change ® ©2008, eBay Inc. Nonetheless We Need A New Context • Recognition that the applications have changed Massively distributed Becoming finer grained • So has the platform… Sets of discrete resources are now fabrics • Need a new context for integrating the disparate tool set • The data center network is a System… a Grid Grids are the integrated, network distributed platforms for distributed applications All classes of apps – transactional and computational • What’s different Integration – Everything up to and including database, application and web servers, load balancers etc. is the platform Sharing – This platform has to be shared to deliver economies of scale • Q - What is the (meta-)operating system for this platform? • A – What we think of as systems management tools today BUT most do not recognize this change ® ©2008, eBay Inc. Nonetheless We Need A New Context • Recognition that the applications have changed Massively distributed Becoming finer grained • So has the platform… Sets of discrete resources are now fabrics • Need a new context for integrating the disparate tool set • The data center network is a System… a Grid Grids are the integrated, network distributed platforms for distributed applications All classes of apps – transactional and computational • What’s different Integration – Everything up to and including database, application and web servers, load balancers etc. is the platform Sharing – This platform has to be shared to deliver economies of scale • Q - What is the (meta-)operating system for this platform? • A – What we think of as systems management tools today BUT most do not recognize this change ® ©2008, eBay Inc. Nonetheless We Need A New Context • Recognition that the applications have changed Massively distributed Becoming finer grained • So has the platform… Sets of discrete resources are now fabrics • Need a new context for integrating the disparate tool set • The data center network is a System… a Grid Grids are the integrated, network distributed platforms for distributed applications All classes of apps – transactional and computational • What’s different Integration – Everything up to and including database, application and web servers, load balancers etc. is the platform Sharing – This platform has to be shared to deliver economies of scale • Q - What is the (meta-)operating system for this platform? • A – What we think of as systems management tools today BUT most do not recognize this change ® ©2008, eBay Inc. Tools Strategy UI UI Unified UI UI UI Client UI Services UI Abstraction – Standard Interface Tool Tool Tool Services Tool Tier Tool Tool Abstraction/Federation – Standard Interface DataSource Config DataSource Config DataSource Config DataSource Config DataSource Config DataSource Config Managed Object ® ©2008, eBay Inc. Tools Strategy • Unified Operational Model – Minimize skill sets Reduce risk and cost Easier to automate • Task Automation Bottom up • Single User Interface/Experience • Integration Framework Leverage Common Elements • Disaggregate Existing Tools Integrate wherever possible • Wrapping Of Legacy Tools When disaggregation is not possible • Common Data Model - EGA/OGF Reference Model based • Standard Interfaces ® ©2008, eBay Inc. What We Have Today • Distributed Management Framework • Core based on EGA/OGF Reference Model – Feeding back what we learnt into OGF Ref Model vNext • Definition of nouns, verbs and relationships • UML for the nouns – • • • • • Feeding back into OGF Ref Model vNext Java classes that realize the nouns, verbs and relationships Unified Framework – XUNI (eXtensible UNified Interface) Plugins to drive existing tools to execute verbs as required Federated sources of config information & run-time state Currently supported nodes – – V3 Nodes And Pools (inc load balancers) – 8000+ nodes Search Infrastructure – 2000+ nodes ® ©2008, eBay Inc. eXtensible UNified Interface (XUNI) ® ©2008, eBay Inc. What Next? Disaggregate management framework services Roll in functionality from legacy tools one use case at a time Refine UML Refine Java Roll existing tools into XUNI XML Schema for our managed objects RDF (or perhaps SML) for relationships Already have a Jena based demonstrator in labs but no refined RDFS yet Query-able to derive Traffic flows Binding DAGs from discovered information Topologies at various levels of abstraction Investigate use of inference engine More nodes and verbs More complex workflows including many heterogeneous elements ® ©2008, eBay Inc. Why Engage With SDOs? - 1 • • • eBay is not in the business of building management frameworks We want to replace our own components with COTS when possible, including management software Requires the same things we are building now Shared model, nouns, verbs, relationships Common standards for representing these • One of eBay’s Values reads – “We believe everyone has something to contribute” We are where many others will be tomorrow in terms of too many things to manage and inherently network centric services – we can help make it better for others ® ©2008, eBay Inc. Why Engage With SDOs? - 2 • Interoperability Leads to vendor choice and thus lower cost (commoditization) Prevents lock-in • Interoperability requires standards SDO driven De jour, through dominant implementation Proprietary Open Source • Standards will only be both timely and relevant with end user engagement ® ©2008, eBay Inc. Summary/Conclusions • Extreme Scale Demands – Scale up, not up Careful decomposition of workload Scaling data tier – the hardest part! • Continuous Availability & Business Agility Demand Resilience – natural attribute of scale out (Grid) Automation • Downside Management is very hard Patterns partially solve but introduce new constraints BYO Management Frameworks/Tools • Future Interoperability – demands standards Management is the big deal Need COTS tools ©2008, eBay Inc. ®