The Microsoft Perspective On Where High Performance Computing Is Heading
Kyril Faenov, Director of HPC, Windows Server Division, Microsoft Corporation

Talk Outline
- Market/technology trends
- Personal supercomputing
- Grid computing
- Leveraging IT industry investments
- Decoupling domain science from computer science

Top 500 Supercomputer Trends
- Industry usage rising
- Clusters over 50%
- GigE is gaining
- x86 is winning

HPC Market Trends
Report of the High-End Computing Revitalization Task Force, 2004 (Office of Science and Technology Policy, Executive Office of the President):
"Make high-end computing easier and more productive to use. Emphasis should be placed on time to solution, the major metric of value to high-end computing users… A common software environment for scientific computation encompassing desktop to high-end systems will enhance productivity gains by promoting ease of use and manageability of systems."

Segment                           2004 systems    2004-09 CAGR
Capability/Enterprise ($1M+)      1,167           4.2%
Divisional ($250K-$1M)            3,915           5.7%
Departmental ($50K-$250K)         22,712          7.7%
Workgroup (<$50K)                 127,802         13.4%

- Systems under $250K: 97% of systems, 52% of revenue
- In 2004, clusters grew 96%, reaching 37% of revenue
- Average cluster size: 10-16 nodes
Source: IDC, 2005

Top Challenges to Implementing Clusters (IDC 2004, N=229)
- System management capability: 18%
- Apps availability: 17%
- Parallel algorithm complexity: 14%
- Space, power, cooling: 11%
- Interconnect BW/latency: 10%
- I/O performance: 9%
- Interconnect complexity: 9%
- Other: 12%

Major Implications
- Market pressures demand an accelerated innovation cycle, overall cost reduction, and thorough outcome modeling
- Leverage volume markets of industry-standard hardware and software
- Rapid procurement, installation, and integration of systems
- Workstation-cluster integrated applications are accelerating market growth: Engineering, Bioinformatics, Oil and Gas, Finance, Entertainment, Government/Research
- The convergence of affordable high-performance hardware and commercial apps is making supercomputing personal

Supercomputing Goes Personal
- 1991: Cray Y-MP C916 (16 x vector, 4 GB, bus), UNICOS, ~10 GFlops, Top500 #1, $40,000,000; customers: government labs; applications: classified, climate, physics research
- 1998: Sun HPC10000 (24 x 333 MHz UltraSPARC II, 24 GB, SBus), Solaris 2.5.1, ~10 GFlops, Top500 #500, $1,000,000 (a 40x drop); customers: large enterprises; applications: manufacturing, energy, finance, telecom
- 2005: Shuttle @ NewEgg.com (4 x 2.2 GHz x64, 4 GB, GigE), Windows Server 2003 SP1, ~10 GFlops, not on the Top500, under $4,000 (a 250x drop); customers: every engineer and scientist; applications: bioinformatics, materials sciences, digital media

The Future: Supercomputing on a Chip
- IBM Cell processor: 256 GFlops today; a 4-node personal cluster gives ~1 TFlops; a 32-node personal cluster reaches the Top100
- Microsoft Xbox: 3 custom PowerPC cores + ATI graphics processor, 1 TFlops today for $300; an 8-node personal cluster gives "Top100" performance for $2,500 (ignoring all that you don't get for $300)
- Intel many-core chips: "100's of cores on a chip in 2015" (Justin Rattner, Intel); at "4 cores" per TFlop, that is 25 TFlops per chip

Key To Evolution: Tackling System Complexity

Scenario Focus

Departmental cluster (conventional scenario): scheduling multiple users' applications onto scarce compute cycles
- IT owns large clusters, due to cost and complexity, and allocates resources on a per-job basis
- Users submit batch jobs via scripts
- In-house and ISV apps, many based on MPI
- Managed by the IT manager; manual, batch execution

Personal/workgroup cluster (emerging scenario): interactive applications
- Clusters are pre-packaged OEM appliances, purchased and managed by end users
- Desktop HPC applications transparently and interactively make use of cluster resources
- Desktop development tools integration
- Interactive computation and visualization
- Workstation clusters, accelerator appliances
- Distributed, policy-based management and security
- Data-centric, "whole-system" workflows
- Rapid prototyping of HPC applications

HPC application integration (future scenario)
- Multiple simulations and data sources integrated into a seamless application workflow
- Network topology and latency awareness for optimal distribution of computation
- Structured data storage with rich metadata
- Applications and data potentially span organizational boundaries
- Cluster systems administration, SQL
- Grids: distributed application, systems, and data management
- Interoperability

"Grid Computing": A Catch-All Marketing Term
"Grid" computing means many different things to many different people and companies:
- Desktop cycle-stealing
- Managed HPC clusters
- Internet access to giant, distributed repositories
- Virtualization of data center IT resources
- Outsourcing to "utility data centers"
- ...
Originally this was all called "distributed systems".

HPC Grids And Web Services
HPC grid ~ compute grid + data grid
- Compute grid: a forest of clusters and workstations within an organization; coordinated scheduling of resources
- Data grid: distributed storage facilities within an organization; coordinated management of data
- Web services: the means to achieve interoperable Internet-scale computing, including federation of organizations; a loosely coupled, service-oriented architecture

Computational Grid Economics*
What $1 will buy you (roughly):
- Computers cost $1,000 (roughly)
- 1 CPU-day (~10 tera-ops) == $1 (roughly, assuming a 3-year use cycle)
- 10 TB of network transfer == $1 (roughly, assuming a 1 Gbps interconnect)
- Internet bandwidth costs roughly $100/Mbps/month (not including routers and management), so 1 GB of network transfer == $1 (roughly)
Some observations (a back-of-envelope check in code appears below):
- HPC cluster communication is 10,000x cheaper than WAN communication
- Break-even point for instructions computed per byte transferred:
  - Cluster: O(1) instructions/byte, so many parallel applications are economical to run on a cluster or across a GigE LAN
  - WAN: O(10,000) instructions/byte, so few parallel applications are economical to run across the Internet
*Computational grid economics material courtesy of Jim Gray

Exploding Data Sizes
- Experimental data: TBs → PBs
- Modeling data, today: 10's to 100's of GB per simulation is the common case; applications mostly run in isolation
- Modeling data, tomorrow: 10's to 100's of TBs, all of it to be archived; whole-system modeling and multi-application workflows

How Do You Move A Terabyte?*
Context         Speed (Mbps)   Rent ($/month)   $/Mbps   $/TB sent   Time/TB
Home phone      0.04           40               1,000    3,086       6 years
Home DSL        0.6            70               117      360         5 months
T1              1.5            1,200            800      2,469       2 months
T3              43             28,000           651      2,010       2 days
OC3             155            49,000           316      976         14 hours
OC192           9,600          1,920,000        200      617         14 minutes
FedEx           100            -                -        50          24 hours
LAN, 100 Mbps   100            -                -        -           1 day
LAN, 1 Gbps     1,000          -                -        -           2.2 hours
LAN, 10 Gbps    10,000         -                -        -           13 minutes

Anticipated HPC Grid Topology
- Islands of high connectivity
- Simulations done on personal and workgroup clusters
- Data stored in data warehouses
- Data analysis best done inside the data warehouse
- Wide-area data sharing/replication via FedEx?
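As a back-of-envelope check of the grid-economics and terabyte-transfer figures above, here is a small Python sketch. Python and the helper names are used purely for illustration and are not part of the original material; the constants are the rough per-dollar numbers quoted on the slides (courtesy of Jim Gray), so treat the output as order-of-magnitude estimates.

```python
# Reproduces the slide's break-even points and "move a terabyte" times from the
# rough per-dollar figures quoted above. All numbers are order-of-magnitude only.

TERA = 1e12

# What $1 buys, roughly (per the slide)
OPS_PER_DOLLAR_CPU   = 10 * TERA   # ~1 CPU-day of compute
LAN_BYTES_PER_DOLLAR = 10 * TERA   # ~10 TB moved across a GigE LAN
WAN_BYTES_PER_DOLLAR = 1e9         # ~1 GB moved across the Internet

def breakeven_instructions_per_byte(bytes_per_dollar):
    """Instructions a task must execute per byte moved before the compute cost
    matches the communication cost (the slide's break-even point)."""
    return OPS_PER_DOLLAR_CPU / bytes_per_dollar

def hours_to_move_terabyte(link_mbps):
    """Hours needed to push 1 TB through a link of the given speed."""
    seconds = (8 * TERA) / (link_mbps * 1e6)
    return seconds / 3600.0

if __name__ == "__main__":
    print("LAN break-even: ~%d instr/byte" % breakeven_instructions_per_byte(LAN_BYTES_PER_DOLLAR))
    print("WAN break-even: ~%d instr/byte" % breakeven_instructions_per_byte(WAN_BYTES_PER_DOLLAR))
    for name, mbps in [("T1", 1.5), ("T3", 43), ("OC3", 155), ("GigE LAN", 1000)]:
        print("1 TB over %-8s ~%.1f hours" % (name, hours_to_move_terabyte(mbps)))
```

Running it gives roughly 1 instruction/byte for the LAN and 10,000 instructions/byte for the WAN, and transfer times that match the table (about 2 months for T1, 2 days for T3, 14 hours for OC3, 2.2 hours for GigE).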
Data Analysis And Mining
Traditional approach:
- Keep data in flat files
- Write C or Perl programs to compute specific analysis queries
Problems with this approach:
- It imposes significant development times
- Scientists must reinvent DB indexing and query technologies
- The data has to be copied from the file system to the compute cluster for every query
Results from the astronomy community (a small sketch of this approach appears at the end of this section):
- Relational databases can yield speed-ups of one to two orders of magnitude
- SQL plus application/domain-specific stored procedures greatly simplifies the creation of analysis queries

Is That The End Of The Story?
[Diagram: relational data warehouse, workgroup cluster, personal cluster]

Too Much Complexity
2004 NAS supercomputing report: O(35) new computational scientists graduated per year, while the skills required include:
- Parallel application development (a minimal MPI example appears at the end of this section): chip-level, node-level, cluster-level, LAN grid-level, and WAN grid-level parallelism; OpenMP, MPI, HPF, Global Arrays, ...
- Component architectures
- Performance configuration and tuning
- Domain science
- Debugging/profiling/tracing/analysis
- Distributed systems issues: security, system management, directory services, storage management
- Digital experimentation: experiment management, provenance (data and workflows), version management (data and workflows)

(Partial) Solution: Leverage The IT Industry's Existing R&D
- Parallel application development: high-productivity IDEs; integrated debugging/profiling/tracing/analysis; code designer wizards; concurrent programming frameworks; platform optimizations; dynamic, profile-guided optimization; new programming abstractions
- Distributed systems issues: web services and HPC grids; security; interoperability; scalability; dynamic systems management; self-(re)configuration and tuning; reliability and availability
- RDBMS + data mining: ease of use; advanced indexing and query processing; advanced data mining algorithms
- Digital experimentation: collaboration-enhanced Office productivity tools; structuring experiment data and derived results in a manner appropriate for human reading and reasoning (as opposed to optimizing for query processing and/or storage efficiency); enabling collaboration among colleagues; (scientific) workflow environments with automated orchestration, visual scripting, and provenance

Separating The Domain Scientist From The Computer Scientist
- Computer scientist: concrete concurrency and concrete workflow; parallel/distributed file systems, relational data warehouses, dynamic systems management, web services and HPC grids
- Computational scientist: abstract concurrency; parallel domain application development; abstract workflow
- Domain scientist: (interactive) scientific workflow, integrated with collaboration-enhanced office automation tools

Example workflow for the domain scientist:
- Write the scientific paper (Word)
- Collaborate with co-authors (NetMeeting)
- Record experiment data (Excel)
- Run individual experiments (workflow orchestrator)
- Share the paper with co-authors (SharePoint)
- Analyze data (SQL Server)

Scientific Information Worker: Past And Future
Past (metaphor: physical experimentation; "do it yourself"; lots of disparate systems and pieces):
- Buy lab equipment
- Keep a lab notebook
- Run experiments by hand
- Assemble and analyze data (using a statistics package)
- Collaborate by phone/email; write up results with LaTeX
Future (metaphor: digital experimentation; a turn-key desktop supercomputer; a single integrated system):
- Buy hardware and software
- Automatic provenance
- Workflow with 3rd-party domain packages
- Excel and Access/SQL Server
- Office tool suite with collaboration support
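To make the "Data Analysis and Mining" point above concrete, here is a minimal, hypothetical sketch of moving a flat-file analysis into a relational engine. Python's built-in sqlite3 module stands in for a full data warehouse such as SQL Server; the observations table, its columns, and the query are invented for illustration rather than taken from the astronomy work cited above.

```python
# Instead of scanning flat files with a custom C/Perl program, load the records
# once into a relational engine and express each analysis as a declarative,
# indexed query. sqlite3 is only a stand-in for a real data warehouse.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE observations (
                   object_id INTEGER,
                   ra REAL, dec REAL,      -- sky position
                   magnitude REAL)""")
con.execute("CREATE INDEX idx_mag ON observations (magnitude)")

# In practice the data would be bulk-loaded from the instrument pipeline.
sample = [(1, 10.2, -5.1, 14.3), (2, 11.7, -4.8, 19.9), (3, 12.0, -5.0, 21.5)]
con.executemany("INSERT INTO observations VALUES (?, ?, ?, ?)", sample)

# One line of SQL replaces a purpose-built file-scanning program, and the index
# lets the engine skip most of the data rather than reading every record.
bright = con.execute(
    "SELECT object_id, magnitude FROM observations "
    "WHERE magnitude < 20 ORDER BY magnitude").fetchall()
print(bright)
```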
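The "Too Much Complexity" slide above lists MPI among the programming models a computational scientist is expected to master. The sketch below gives a minimal flavor of that explicit, node-level parallelism: each rank integrates part of an interval and the partial results are combined with a reduction. It uses the third-party mpi4py binding only so that it matches the other Python examples in this document; production codes of this era were typically C or Fortran with MPI/OpenMP, and the script name in the run command is hypothetical.

```python
# Minimal message-passing sketch: each MPI rank integrates part of [0, 1] with
# the trapezoid rule and rank 0 collects the sum.
# Run with, for example:  mpiexec -n 4 python trapezoid_mpi.py
from mpi4py import MPI

def f(x):
    return x * x  # the integrand; a domain-specific kernel would go here

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Split [0, 1] into one sub-interval per rank and integrate locally.
n_local = 100_000
a = rank / size
b = (rank + 1) / size
h = (b - a) / n_local
local = 0.5 * (f(a) + f(b)) * h + sum(f(a + i * h) for i in range(1, n_local)) * h

# Combine the partial integrals on rank 0; this is the "concrete concurrency"
# the domain scientist should ideally never have to see.
total = comm.reduce(local, op=MPI.SUM, root=0)
if rank == 0:
    print("integral of x^2 on [0,1] ~= %.6f (exact 1/3)" % total)
```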
Microsoft Strategy: Reducing Barriers To Adoption For HPC Clusters

Easy to develop
- Familiar Windows development environment plus key HPC extensions (MPI, OpenMP, parallel debugger)
- Best-of-breed Fortran, numerical libraries, and performance analysis tools through partners
- Long-term, strategic investments in developer productivity

Easy to use
- Familiarity/intuitiveness of Windows
- Cluster computing integrated into workstation applications and the user's workflow

Easy to manage and own
- Integration with Active Directory and the rest of the IT infrastructure
- Lower TCO through integrated turnkey clusters
- Price/performance advantage of industry-standard hardware components

Application support in three key HPC verticals
- Engagement with the top HPC ISVs
- Enabling open-source applications via university relationships

Leveraging a breadth of standard knowledge-management tools
- Web services, SQL, SharePoint, InfoPath, Excel

Focused Approach To Market
Enabling broad HPC adoption and making HPC into a high-volume market.

© 2005 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.