8 PS3s: build a supercomputer

References:
http://blogs.zdnet.com/storage/?p=184
http://cag.csail.mit.edu/ps3/lectures.shtml (MIT lecture notes)

You will find many more references by following the links in this article, which was taken from the internet.

Less than a 10th the cost per GFlop of the $2500 supercomputer

Take 8 PS3 consoles, Yellow Dog Linux, a Gigabit Ethernet switch and your favorite protein folding or gravitational wave modeling codes and you’re doing real science. On a PlayStation! Try playing Ratchet & Clank on a Cray.

Most scientific computing is done on cluster computers. Blue Gene/L, the world’s fastest supercomputer, uses 130,000 processors. Plus a lot of money, power and cooling. At about $4 per billion floating point operations per second (GFlops), the PS3 is the cheapest supercomputer building block available today.

Look under the hood

The PS3’s Cell Broadband Engine processor, or Cell, is a heterogeneous multiprocessor. Instead of identical cores, like the Intel and AMD multi-core processors, the Cell consists of a 64-bit PowerPC core and 8 “Synergistic Processing Elements” (SPEs). Each SPE has a 256 KB local store, a memory flow controller and a “synergistic processing unit” (SPU) with a Single Instruction, Multiple Data processing unit and 128 registers of 128 bits each. They’re connected by a bus with an internal bandwidth of more than 300 GB/s that transfers data between the SPEs.

The bottom line: you can go to Toys-R-Us and toss 200 GFlops into your shopping cart.

Sony, your friendly supercomputer vendor

Sony generously donated 8 PS3 consoles to Professor Gaurav Khanna of the University of Massachusetts for his research on black holes and quantum cosmology.

[graphic: one black hole spiraling into another, representative of the problems Prof. Khanna is analyzing]

Doing a run on a conventional supercomputer cost him about $5,000 in grant money. For less than that he could have built the PS3 cluster and run anything he wanted.
But Sony saved him even that trouble by donating the equipment.

This is serious stuff, right? So it has to be rack mounted. But the PS3 is so tiny:

[photo courtesy of Prof. Khanna]

Do real work on a PlayStation cluster

Go to Terrasoft to get PowerPC Linux that runs on the PS3. Go to IBM for version 3.0 of the developer kit. Pick up SCOP3: A Rough Guide to Scientific Computing On the PlayStation 3, by a team from the University of Tennessee that includes Jack Dongarra, longtime publisher of the Top 500 supercomputer list. Get the MIT lecture notes from the Cell programming course.

Interested in ray tracing? Check out Ray Tracing on the Cell Processor (pdf) by Carsten Benthin, Ingo Wald, Michael Scherbaum and Heiko Friedrich. Note: if you don’t already understand the math behind ray tracing you’ll be lost in this highly technical paper.

Protein folding

Your standalone PS3 can be part of a supercomputer project even if you don’t build a cluster yourself. Stanford’s Folding@home protein-folding research can use your PS3’s cycles to help understand the causes of Alzheimer’s and many other diseases. Help save the *real* world.

The Storage Bits take

A single Cell processor is roughly equivalent to 25 nodes on Blue Gene/L. While a number of architectural limitations in the Cell and the PS3 restrict its general applicability, it lets researchers apply an incredible number of cycles to certain classes of problem. And Sony, IBM and Toshiba are hard at work on the next generation of the Cell.

On StorageMojo I’ve often addressed the consumerization of IT. The PS3 represents the consumerization of supercomputing. That will benefit us all.

Supercomputing Costco-style

In 1997, IBM’s Deep Blue supercomputer beat world chess champion Garry Kasparov. Today you can build a more powerful machine for less than $2,500 in an 11″ x 12″ x 17″ box.
That works out to less than $100 per gigaflop as of January 2007. More good news: pricing out the components today, the machine would cost only $1,300!

The recipe

Professor Joel Adams and undergraduate Tim Brom built the machine at Calvin College in Grand Rapids, MI. Using the Beowulf cluster model, the Microwulf design includes:

4 microATX motherboards with dual-core AMD Athlon 64 X2 3800+ AM2 processors
8 GigE ports: 1 built-in port on each motherboard, plus 1 added GigE PCI-Express NIC per motherboard
8 GB RAM (half of what a balanced system should have, but 16 GB would have busted their budget)
4 microATX power supplies
1 8-port GigE switch
250 GB hard drive and a CD/DVD drive
3 polycarbonate plastic shelves to mount the kit on, plus 5 threaded rods to support the shelves

Here’s a schematic diagram:

[schematic diagram]

The architecture

Beowulf clusters are based on a message-passing (MPI) infrastructure that uses a network to interconnect the nodes. Some Beowulf clusters have hundreds of nodes and scale nicely with the right workloads. Microwulf has an economical version of the same architecture, built on Ubuntu Linux and MPI libraries.

The result

Performance is a many-splendored thing. In the world of supercomputing the standard benchmark is Linpack, which solves a dense system of linear equations in 64-bit double precision arithmetic. Learn more about Linpack, HPL and their parameters here.

It is worth noting that with a 250 GB SATA drive, HPL doesn’t do much I/O. The benchmark is testing floating-point performance on an in-memory problem. At problem sizes above 30,000 the machine ran out of memory.

Here are Microwulf’s stats:

[stats table]

While unexceptional today, this performance would have made Microwulf the world’s 6th fastest supercomputer in 1993. At less than $100 per gigaflop. Update: at today’s prices about $50 per GFlop.

The Storage Bits take

Humans aren’t very good at forecasting exponential functions like Moore’s Law.
Microwulf is a good excuse to take stock of just how much computing has advanced in the last 15 years. Millicomputing is the name of a related initiative to build powerful clusters out of very power-efficient processors and low-cost components. In another 10 years you’ll be able to have the equivalent of a 5,000 node Google cluster in your den. Cluster-based virtual reality, anyone?

Update: Lots of great comments from some very experienced people. Thanks! A couple of folks pointed to a detailed tutorial written by Professor Adams, who graciously permitted me to use his copyrighted diagram, that I’d linked to but without flagging its importance. Let me rectify that oversight. If you want to get into the details of the hardware and software, this article on the Microwulf architecture and construction should suffice.
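
The cost and memory figures quoted in these two articles can be cross-checked with a little arithmetic. The sketch below is only a sanity check, not anything from Professor Adams or the original posts. It uses the articles' round numbers (less than $2,500, less than $100 per GFlop, $1,300, about $4 per GFlop for the PS3) as if they were exact. It assumes that the "30,000" in the Linpack discussion is the HPL problem size N, that is, the order of the dense matrix. The function and constant names are made up for illustration.

```python
"""Sanity checks on figures quoted in the articles.

All inputs are the articles' round numbers, treated as exact for
illustration; they are not measured values.
"""


def cost_per_gflop(cost_usd: float, gflops: float) -> float:
    """Return dollars per GFlop for a machine of the given cost and throughput."""
    if gflops <= 0:
        raise ValueError("gflops must be positive")
    return cost_usd / gflops


def hpl_matrix_bytes(n: int, bytes_per_element: int = 8) -> int:
    """Return the memory needed for a dense n x n matrix of 64-bit doubles."""
    return n * n * bytes_per_element


# Round figures quoted in the text (upper bounds in the original).
MICROWULF_COST_2007_USD = 2500.0
MICROWULF_USD_PER_GFLOP_2007 = 100.0
MICROWULF_COST_UPDATED_USD = 1300.0
PS3_USD_PER_GFLOP = 4.0

# Assumption: the throughput implied by the two January 2007 figures.
implied_gflops = MICROWULF_COST_2007_USD / MICROWULF_USD_PER_GFLOP_2007

# Same throughput at the updated component price.
updated_usd_per_gflop = cost_per_gflop(MICROWULF_COST_UPDATED_USD, implied_gflops)

# PS3 cost per GFlop relative to Microwulf at the updated price.
ps3_vs_microwulf = PS3_USD_PER_GFLOP / updated_usd_per_gflop

# Assumption: "30,000" is the HPL problem size N (matrix order).
matrix_gb = hpl_matrix_bytes(30_000) / 1e9

if __name__ == "__main__":
    print(f"implied Microwulf throughput: {implied_gflops:.0f} GFlops")
    print(f"updated cost: ${updated_usd_per_gflop:.0f} per GFlop")
    print(f"PS3 / Microwulf cost ratio: {ps3_vs_microwulf:.3f}")
    print(f"HPL matrix at N = 30,000: {matrix_gb:.1f} GB")
```

Under these assumptions the numbers hang together. The updated price works out to about $50 per GFlop, and the PS3 figure comes to less than a tenth of that. A 30,000 x 30,000 double-precision matrix takes about 7.2 GB, which nearly fills Microwulf's 8 GB of RAM.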