Aber Whitcomb – Chief Technology Officer
Jim Benedetto – Vice President of Technology
Allen Hurff – Vice President of Engineering

First Megasite
- 64+ MM registered users
- 38 MM unique users
- 260,000 new registered users per day
- 23 billion page views/month
- 50.2% female / 49.8% male
- Primary age demo: 14-34

[Chart: registered-user growth – 100K, 1M, 6M, 70M, 185M – as of April 2007]

- 185+ MM registered users
- 90 MM unique users

Demographics
- 50.2% female / 49.8% male
- Primary age demo: 14-34

Page views, March 2007 (source: comScore Media Metrix):

  Internet rank   Site       Page views (MM)
  #1              MySpace    43,723
  #2              Yahoo      35,576
  #3              MSN        13,672
  #4              Google     12,476
  #5              Facebook   12,179
  #6              AOL        10,609

[Chart: monthly page views (MM), Nov 2006 – Mar 2007, for MySpace, Yahoo, MSN, Google, eBay, and Facebook; source: comScore Media Metrix, April 2007]

Current scale
- 350,000 new user registrations/day
- 1 billion+ total images
- Millions of new images/day
- Millions of songs streamed/day
- 4.5 million concurrent users
- Localized and launched in 14 countries
- Launched in China and Latin America last week

Infrastructure
- 7 datacenters
- 6,000 web servers
- 250 cache servers (16 GB RAM each)
- 650 ad servers
- 250 database servers
- 400 media processing servers
- 7,000 disks in a SAN architecture
- 70,000 Mbit/s of bandwidth, 35,000 Mbit/s of it on CDN

Caching
- Typically used for caching MySpace user data: online status, hit counters, profiles, mail.
- Provides a transparent client API for caching C# objects (a sketch appears after the Relay Client / Relay Service section below).
- Clustering: servers are divided into "groups" of one or more "clusters"; clusters keep themselves up to date.
- Multiple load-balancing schemes based on expected load.
- Heavy write environment: must scale past 20k redundant writes per second on a 15-server redundant cluster.

Relay Client / Relay Service
- Platform for middle-tier messaging.
- Up to 100k request messages per second per server in production.
- Purely asynchronous (no thread blocking), built on the Concurrency and Coordination Runtime (CCR); bulk message processing.
- Custom unidirectional connection pooling, custom wire format, gzip compression for larger messages.
- Data-center aware.
- Configurable components (IRelayComponents, sketched below): message orchestration, Berkeley DB storage, non-locking memory buckets, fixed alloc, shared interlocked int storage for hit counters, message forwarding.

[Diagram: Relay Client → Socket Server → CCR message orchestration → IRelayComponents]
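The deck describes the cache tier's "transparent client API for caching C# objects" but does not show it. The sketch below is only an illustration of what such an API could look like; the CacheClient type, method names, and the use of an in-process dictionary as a stand-in for the remote cluster are all assumptions, not MySpace's actual implementation.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical client-side facade: callers cache plain C# objects by key and
// never see serialization, server selection, or the wire protocol.
public class CacheClient
{
    // Stand-in for the remote cache cluster; a real client would pick a server
    // from its "group"/"cluster" configuration and talk to it over sockets.
    private readonly Dictionary<string, byte[]> store = new Dictionary<string, byte[]>();

    public void Put<T>(string key, T value)
    {
        BinaryFormatter formatter = new BinaryFormatter();
        using (MemoryStream ms = new MemoryStream())
        {
            formatter.Serialize(ms, value);   // object -> bytes, hidden from the caller
            store[key] = ms.ToArray();
        }
    }

    public T Get<T>(string key)
    {
        byte[] data;
        if (!store.TryGetValue(key, out data))
            return default(T);                // cache miss

        BinaryFormatter formatter = new BinaryFormatter();
        using (MemoryStream ms = new MemoryStream(data))
        {
            return (T)formatter.Deserialize(ms);
        }
    }
}
```

A caller would then write something like cache.Put("user:123:status", status) and cache.Get<OnlineStatus>("user:123:status"), with the cached type marked [Serializable]; routing, replication, and redundant writes stay inside the client library.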
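The deck lists IRelayComponents (message forwarding, Berkeley DB storage, "shared interlocked int" storage for hit counters) without showing the interface. The following is a speculative sketch of how a pluggable component with a lock-free hit counter might look; the interface shape, RelayMessage type, and all names are assumptions.

```csharp
using System.Threading;

// Assumed shape of a pluggable relay component; the real IRelayComponent
// interface is not shown in the deck.
public interface IRelayComponent
{
    string Name { get; }
    void HandleMessage(RelayMessage message);
}

public class RelayMessage
{
    public string Key;
    public byte[] Payload;
}

// Hit-counter storage backed by lock-free Interlocked increments, in the
// spirit of the "shared interlocked int" component on the slide.
public class HitCounterComponent : IRelayComponent
{
    private long hits;

    public string Name { get { return "HitCounter"; } }

    public void HandleMessage(RelayMessage message)
    {
        // Each relay message for this component bumps the counter without
        // taking a lock, so many CCR worker threads can post concurrently.
        Interlocked.Increment(ref hits);
    }

    public long CurrentValue
    {
        get { return Interlocked.Read(ref hits); }
    }
}
```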
Team Foundation Server and Team System
- MySpace embraced Team Foundation Server and Team System during Beta 3, and was one of the early beta testers of devBiz's Team Plain (now owned by Microsoft).
- Team Foundation initially supported 32 MySpace developers, now supports 110, and is on its way to over 230.
- MySpace is able to branch and shelve more effectively with TFS and Team System.
- MySpace uses Team Foundation Server as the source repository for its .NET, C++, Flash, and ColdFusion codebases.
- MySpace uses Team Plain for product managers and other non-development roles.
- MySpace is a member of the Strategic Design Review committee for the Team System suite.
- MySpace chose Team Test Edition, which reduced cost and kept its Quality Assurance staff on the same suite as the development teams.
- Using MSSCCI providers and customization of Team Foundation Server (including the upcoming K2 blackpearl), MySpace was able to extend TFS with better workflow and defect tracking based on its specific needs.

Deployment challenges
- Maintaining a consistent, constantly changing code base and configs across thousands of servers proved very difficult.
- Code rolls began to take a very long time.

CodeSpew – code deployment and maintenance utility
- Two-tier application: a central management server (C#) and a light agent on every production server (C#).
- Tightly integrated with Windows PowerShell.
- UDP out, TCP/IP in.
- Massively parallel: able to update hundreds of servers at a time.
- File modifications are determined per server based on CRCs (see the CRC sketch at the end of these notes).
- Security model for code deployment authorization.
- Able to execute remote PowerShell scripts across the server farm.

Media scale
- Images: 1 billion+ images, 80 TB of space, 150,000 req/s, 8 Gbit/s.
- Music: 25 million songs, 142 TB of space, 250,000 concurrent streams.
- Videos: 60 TB of storage, 15,000 concurrent streams, 60,000 new videos/day.

Media processing
- Millions of MP3, video, and image uploads every day.
- Ability to design custom encoding profiles (bitrate, width, height, letterbox, etc.) for a variety of deployment scenarios (see the profile sketch at the end of these notes).
- Job broker engine to maximize encoding resources and provide a level of QoS.
- Abandonment of database connectivity in favor of a web service layer.
- XML-based workflow definition to provide extensibility to the encoding engine.
- Coded entirely in C#.
- Filmstrip for image review; thumbnails for categorization.

[Diagram: media pipeline components – User Content, Upload Web Service, FTP Server (any application), Communication Layer, Job Broker, MediaProcessor, DFS 2.0, CDN]

DFS 2.0
- Provides an object-oriented file store.
- Scales linearly to near-infinite capacity on commodity hardware.
- High-throughput distribution architecture.
- Simple cross-platform storage API.
- Designed exclusively for long-tail content.

[Chart: long-tail content demand curve (accesses vs. demand)]

Sledgehammer cache engine
- Custom high-performance event-driven web server core.
- Written in C++ as a shared library.
- Integrated content cache engine.
- Integrates with the storage layer over HTTP.
- Capable of more than 1 Gbit/s throughput on a dual-processor host.
- Capable of tens of thousands of concurrent streams.

DFS 2.0 (continued)
- DFS uses a generic "file pointer" data type for identifying files, allowing URL formats and distribution mechanisms to change without altering data (see the file-pointer sketch at the end of these notes).
- Compatible with traditional CDNs like Akamai.
- Can be scaled at any granularity, from single nodes to complete clusters.
- Provides a uniform method for developers to access any media content on MySpace.

[Chart: pages/sec served by 2005, 2006, and 2007 server generations]

Geographic distribution
- Distribute MySpace servers over 3 geographically dispersed co-location sites.
- Maintain presence in Los Angeles.
- Add a Phoenix site for an active/active configuration.
- Add a Seattle site for active/active/active with site failover capability.

[Diagram: Sledgehammer cache engine – Users, Business Logic Server, Accelerator Engine, Cache Daemon, Storage Cluster, DFS]
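CodeSpew itself is internal and not shown in the deck; the sketch below only illustrates the per-server CRC comparison idea it describes: the agent reports a CRC per file, and the management server pushes only the files whose CRCs differ. The class, method names, and the bytewise CRC-32 routine are assumptions used for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Illustration of CRC-based change detection: compare the CRC of each local
// build file against the CRC reported by a production server's agent, and
// collect the files that need to be pushed to that server.
public static class DeploymentDiff
{
    public static List<string> FilesToPush(
        IDictionary<string, uint> localCrcs,      // path -> CRC of the build output
        IDictionary<string, uint> remoteCrcs)     // path -> CRC reported by the agent
    {
        List<string> changed = new List<string>();
        foreach (KeyValuePair<string, uint> entry in localCrcs)
        {
            uint remote;
            if (!remoteCrcs.TryGetValue(entry.Key, out remote) || remote != entry.Value)
                changed.Add(entry.Key);           // missing or stale on the server
        }
        return changed;
    }

    // Standard reflected CRC-32 (polynomial 0xEDB88320), computed bytewise.
    public static uint Crc32(byte[] data)
    {
        uint crc = 0xFFFFFFFF;
        foreach (byte b in data)
        {
            crc ^= b;
            for (int i = 0; i < 8; i++)
                crc = (crc & 1) != 0 ? (crc >> 1) ^ 0xEDB88320 : crc >> 1;
        }
        return ~crc;
    }

    public static uint Crc32OfFile(string path)
    {
        return Crc32(File.ReadAllBytes(path));
    }
}
```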
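The media processor's custom encoding profiles (bitrate, width, height, letterbox) are defined in XML, but the deck does not show the schema. This is a minimal sketch, assuming a hypothetical element layout, of how such a profile might be loaded with XmlSerializer; the class and element names are not MySpace's actual schema.

```csharp
using System.IO;
using System.Xml.Serialization;

// Assumed shape of a custom encoding profile; the real XML workflow schema
// used by the encoding engine is not shown in the deck.
[XmlRoot("EncodingProfile")]
public class EncodingProfile
{
    [XmlElement("Bitrate")]   public int BitrateKbps;
    [XmlElement("Width")]     public int Width;
    [XmlElement("Height")]    public int Height;
    [XmlElement("Letterbox")] public bool Letterbox;

    public static EncodingProfile Load(string path)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(EncodingProfile));
        using (FileStream stream = File.OpenRead(path))
        {
            return (EncodingProfile)serializer.Deserialize(stream);
        }
    }
}

// Example profile document this class would read:
//
//   <EncodingProfile>
//     <Bitrate>350</Bitrate>
//     <Width>480</Width>
//     <Height>360</Height>
//     <Letterbox>true</Letterbox>
//   </EncodingProfile>
```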
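DFS identifies files with a generic "file pointer" so URL formats and distribution mechanisms can change without touching stored data. The sketch below shows one way such an indirection could look; the FilePointer struct, the resolver interface, and the CDN example are assumptions, not the actual DFS API.

```csharp
using System;

// A hypothetical "file pointer": the only thing applications and databases
// store. It names the object, not its current URL or distribution mechanism.
public struct FilePointer
{
    public readonly string StoreName;  // which DFS store/cluster owns the file
    public readonly Guid ObjectId;     // stable identity of the stored file

    public FilePointer(string storeName, Guid objectId)
    {
        StoreName = storeName;
        ObjectId = objectId;
    }
}

// URL construction is deferred to request time, so switching between an
// in-house distribution tier and a traditional CDN such as Akamai only
// means swapping the resolver, not rewriting stored pointers.
public interface IFilePointerResolver
{
    Uri Resolve(FilePointer pointer);
}

public class CdnResolver : IFilePointerResolver
{
    private readonly string host;

    public CdnResolver(string host) { this.host = host; }

    public Uri Resolve(FilePointer pointer)
    {
        // e.g. http://cdn.example.com/images/3f2504e0-4f89-41d3-9a0c-0305e82c3301
        return new Uri("http://" + host + "/" + pointer.StoreName + "/" + pointer.ObjectId);
    }
}
```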