The DAS-3 Project
Henri Bal
Vrije Universiteit Amsterdam
Faculty of Sciences

Distributed ASCI Supercomputer
• Joint infrastructure of ASCI research school
• Clusters integrated in a single distributed testbed
• Long history and continuity
  - DAS-1 (1997)
  - DAS-2 (2002)
  - DAS-3 (Oct 2006)

DAS is a Computer Science grid
• Motivation: CS needs its own infrastructure for
  - Systems research and experimentation
  - Distributed experiments
  - Doing many small, interactive experiments
• DAS is simpler and more homogeneous than production grids
  - Single operating system
  - "A simple grid that works"

Usage of DAS
• ~200 users, 32 Ph.D. theses
• Clear shift of interest:
  Cluster computing → Distributed computing → Grids and P2P → Virtual laboratories

Impact of DAS
• Major incentive for VL-e (20 M€ BSIK funding)
  - Virtual Laboratory for e-Science
• Collaboration with French Grid'5000
  - Towards a European scale CS grid?
• Collaboration with SURFnet on DAS-3
  - SURFnet provides multiple 10 Gb/s light paths

DAS-3
• Clusters connected through SURFnet6 by 10 Gb/s lambdas:
  - VU (85 nodes)
  - UvA/VL-e (40)
  - UvA/MultimediaN (46)
  - TU Delft (68)
  - Leiden (32)
• 272 AMD Opteron nodes
• 792 cores, 1 TB memory
• More heterogeneous:
  - 2.2-2.6 GHz
  - Single/dual core nodes
• Myrinet-10G (exc. Delft)
• Gigabit Ethernet

Status
• Timeline
  - Sep. 04: Proposal
  - Apr. 05: NWO/NCF funding
  - Dec. 05: European tender (with TUD/GIS, Stratix)
  - Apr. 06: Selected ClusterVision
  - Oct. 06: Operational
• SURFnet6 connection shortly
  - Multiple 10 Gb/s dedicated lambdas
• First local Myrinet measurements
  - 2.6 μsec 1-way null-latency
  - 950 MB/sec throughput

Projects using DAS-3
• VL-e
  - Grid computing, scheduling, workflow, PSE, visualization
• MultimediaN
  - Searching, classifying multimedia data
• NWO i-Science (GLANCE, VIEW, STARE)
  - StarPlane, JADE-MM, GUARD-G, VEARD, GRAPE Grid, SCARIe, AstroStream
• NWO Computational Life Sciences:
  - 3D-RegNet, CellMath, MesoScale
• Open competition (many)
• NCF projects (off-peak hours)

StarPlane
• Key idea:
  - Applications can dynamically allocate light paths
  - Applications can change the topology of the wide-area network, possibly even at sub-second timescale
• VU (Bal, Bos, Maassen)
• UvA (de Laat, Grosso, Xu, Velders)
[Diagram label: NOC]

StarPlane
• Challenge: how to integrate such a network infrastructure with (e-Science) applications?
  - Distributed supercomputing
  - Remote data access
  - Visualization
[Diagram labels: CPU, Data, Network]

Jade-MM
• Large-scale multimedia content analysis on grids
• Problem: >30 CPU hours per hour of video
  - Beeld&Geluid: 20,000 hours of TV broadcasts per year
  - London Underground: >120,000 years of processing for >>10,000 CCTV cameras
• Data-dependencies at all levels of granularity
• UvA (Smeulders, Seinstra) + VU (Bal, Kielmann, Koole, van der Mei)

GUARD-G
• How to turn grids into a predictable utility for computing (much like the telephone system)
• Problems:
  - Predictability of workloads
  - Predictability of system availability (grids are faulty!)
• Allocation of light paths very useful here
• TU Delft (Epema) + Leiden (Wolters)

Summary
• DAS has a major impact on experimental Computer Science research
• It has attracted a large user base
• DAS-3 provides
  - State-of-the-art CPUs: 64-bit (dual-core)
  - High-speed local interconnect (Myrinet-10G)
  - A flexible optical wide-area network
More info: http://www.cs.vu.nl/das3/

Configuration

                LU            TUD           UvA-VLe       UvA-MN        VU            TOTALS
Head
* storage       10TB          5TB           2TB           2TB           10TB          29TB
* CPU           2x2.4GHz DC   2x2.4GHz DC   2x2.2GHz DC   2x2.2GHz DC   2x2.4GHz DC   -
* memory        16GB          16GB          8GB           16GB          8GB           64GB
* Myri 10G      1             -             1             1             1             -
* 10GE          1             -             1             1             1             -
Compute         32            68            40 (1)        46            85            271
* storage       400GB         250GB         250GB         2x250GB       250GB         84 TB
* CPU           2x2.6GHz      2x2.4GHz      2x2.2GHz DC   2x2.4GHz      2x2.4GHz DC   1.9 THz
* memory        4GB           4GB           4GB           4GB           4GB           1048 GB
* Myri 10G      1             1             1             1             1             -
Myrinet
* 10G ports     33 (7)        -             41            47            86 (2)        -
* 10GE ports    8             -             8             8             8             320 Gb/s
Nortel
* 1GE ports     32 (16)       136 (8)       40 (8)        46 (2)        85 (11)       339 Gb/s
* 10GE ports    1 (1)         9 (3)         2             2             1 (1)         -

DAS-3 networks
[Diagram of the cluster network at the Vrije Universiteit; labels:]
• Compute nodes (85)
• Headnode (10 TB mass storage)
• Ethernet switch (Nortel 5530 + 3 * 5510)
• Myri-10G switch
• 10 Gb/s Ethernet blade
• Nortel OME 6500 with DWDM blade
• Links: 1 Gb/s Ethernet (85x), 10 Gb/s Myrinet (85x), 10 Gb/s Myrinet, 10 Gb/s Ethernet, 10 Gb/s Ethernet fiber (8x), 1 or 10 Gb/s campus uplink, 80 Gb/s DWDM to SURFnet6
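
The TOTALS column of the Configuration table can be cross-checked with a little arithmetic. The sketch below is illustrative only and not part of the original slides: the `Site` record and the `totals` helper are hypothetical names. It assumes that "2x" means two CPU sockets per node and that "DC" means dual core (single core otherwise). It counts compute nodes only, ignoring head nodes and the parenthesized counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Compute-node configuration of one DAS-3 cluster (values from the table)."""
    name: str
    nodes: int             # number of compute nodes, head node excluded
    sockets: int           # CPU sockets per node ("2x" in the table, assumed)
    cores_per_socket: int  # 2 where the table says "DC" (assumed dual core), else 1
    ghz: float             # clock rate per core
    disk_gb: int           # local storage per compute node


SITES = [
    Site("LU", 32, 2, 1, 2.6, 400),
    Site("TUD", 68, 2, 1, 2.4, 250),
    Site("UvA-VLe", 40, 2, 2, 2.2, 250),
    Site("UvA-MN", 46, 2, 1, 2.4, 2 * 250),
    Site("VU", 85, 2, 2, 2.4, 250),
]


def totals(sites):
    """Sum nodes, cores, aggregate clock rate (THz) and local disk (TB)."""
    nodes = sum(s.nodes for s in sites)
    cores = sum(s.nodes * s.sockets * s.cores_per_socket for s in sites)
    ghz = sum(s.nodes * s.sockets * s.cores_per_socket * s.ghz for s in sites)
    disk_gb = sum(s.nodes * s.disk_gb for s in sites)
    return {
        "nodes": nodes,
        "cores": cores,
        "thz": ghz / 1000,          # 1 THz = 1000 GHz
        "disk_tb": disk_gb / 1000,  # decimal terabytes
    }


if __name__ == "__main__":
    for key, value in totals(SITES).items():
        print(f"{key}: {value}")
```

Under these assumptions the sums come to 271 compute nodes, 792 cores, about 1.88 THz of aggregate clock rate and about 84 TB of local disk, in line with the 792 cores quoted on the DAS-3 slide and the 1.9 THz and 84 TB entries in the TOTALS column.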