e-VLBI Development Program at MIT Haystack Observatory
Alan R. Whitney
Chester A. Ruszczyk
MIT Haystack Observatory
13 July 2005
e-VLBI Workshop
Australia

Current Projects at Haystack Observatory
• Standardization: VSI-E
  – Draft VSI-E standard distributed in January 2004
  – Reference implementation released in October 2004
• Network interfacing equipment for e-VLBI
  – Mark 5 VLBI data system
• Network Monitoring
  – Evaluation, development and deployment of monitoring systems
• Intelligent Applications
  – Automation of e-VLBI transfers an ongoing process
  – Development of optimization-based algorithms for intelligent applications ongoing (EGAE)
  – Intelligent optically-switched networks (DRAGON)
• e-VLBI Experiments
  – Goal to put e-VLBI into routine use

VSI-E Architecture

VSI-E
• Purpose:
  – To specify standardized e-VLBI data formats and transmission protocols that allow data exchange between heterogeneous VLBI data systems
• Characteristics:
  – Based on standard RTP/RTCP high-level protocols
  – Allows choice of IP transport protocols (TCP/IP, UDP, FAST, etc.)
  – Scalable implementation; supports up to 100 Gbps
  – Ability to transport individual data-channel streams as individual packet streams; potentially useful for distributed correlators
  – Ability to make use of multicasting to transport data and/or control information in an efficient manner
• Status
  – Draft VSI-E specification completed January 2004
  – Prototype VSI-E implementation Nov 2004
  – Practical implementation for K5 and Mark 5 now in progress
  – Plan to use VSI-E in real-time demo at SC05, Nov 05

Reaching 1024 Mbps with Mark 5
• Achieving 1024 Mbps with Mark 5 is challenging
• Can move ~1.2 Gbps between StreamStor card and memory via PCI bus, but
  – If GigE NIC is on same PCI bus, bus contention slows aggregate transfers to ~400-550 Mbps, depending on motherboard
  – A single GigE connection tops out at ~980 Mbps (theoretically and experimentally)
  – Typical GigE drivers require interrupt service on every Ethernet frame; can generate up to ~100,000 interrupts/sec
• Elements of Solution
  – Capable motherboard with multiple independent PCI buses
  – Dual ‘channel-bonded’ GigE links
  – Driver or hardware interrupt mitigation; use of ‘jumbo frames’
  – Careful software structure

Mark 5 e-VLBI Connectivity
• Mark 5 supports a triangle of connectivity for e-VLBI requirements
[Diagram: triangle connecting Disc Array, Data Port/FPDP, and PCI bus/Network (64-bit/66 MHz)]
Mark 5 can support several possible e-VLBI modes:
• e-VLBI data buffer (first to Disc Array, then to Network); vice versa
• Direct e-VLBI (Data Port directly to Network); vice versa
• Data Port simultaneously to Disc Array and Network at ~800 Mbps

Anatomy of a (fairly) modern motherboard (Tyan Thunder i7501 Pro)
[Block diagram: two 2.8 GHz Xeon processors on a 64-bit 533 MHz FSB (4.2 GB/sec); Memory Control Hub with two 1 GB memory banks (64-bit 133 MHz, dual-edge); two Intel P64H2 PCI bridges on HubLink 2.0 (1.6 GB/s) feeding 64-bit 133 MHz PCI-X buses; StreamStor on a 64-bit 66 MHz bus with FPDP bus; dual GigE Intel 82546EB with jumbo frames to 16 kB, interrupt mitigation and channel-bonding]
[Block diagram, continued: HubLink 1.0 to the I/O controller; 32-bit 33 MHz PCI bus carrying the Mark 5A I/O Board; PC2100 266 MHz memory]

Best transfer rates to date
• Memory-to-memory transfers between Tyan motherboards
  – ~1900 Mbps
    • Uses dual channel-bonded GigE connection
• Mark 5A-to-memory transfer
  – ~1200 Mbps
    • Required major re-working of Mark 5A software to improve efficiency of data transfer to/from NIC, minimize number of internal buffer-to-buffer transfers, and support multiple threads
• More work still to be done to achieve routine 1024 Mbps Mark 5-to-Mark 5 transfers
• We plan to concentrate our efforts on implementing and optimizing with VSI-E to achieve 1024 Mbps
• There should be no performance difference between Mark 5A and Mark 5B

e-VLBI Network Monitoring
• Use of centralized/integrated network monitoring helped to enable identification of bottleneck (hardware fault)
• Automated monitoring allows view of network throughput variation over time
  – Highlights route changes, network outages
• Automated monitoring also helps to highlight any throughput issues at end points:
  – E.g. Network Interface Card failures, untuned TCP stacks
• Integrated monitoring provides overall view of network behavior at a glance
• Also examining performance-monitoring packages such as MonALISA, which would provide better standardization

Network State DataBase (NSDB)
• Tool to keep track of e-VLBI state:
  – Network performance
  – Configuration of end systems
  – State of end systems
• Integrates and builds on standard monitoring tools to provide a single, coherent view of e-VLBI network state:
  – Maintain continuous state monitoring of entire e-VLBI system
  – Essential for being able to identify issues with network/end system configuration
  – Diagnose at a glance (cf.
current practice)

NSDB Architecture

e-VLBI Weather Map Web Page (Haystack to Kashima)
http://web.haystack.mit.edu/e-vlbi/evlbi.html

Network Layer Statistics

New Application-Layer Protocols for e-VLBI
• Based on observed usage statistics of networks such as Abilene, it is clear there is much unused capacity
• New protocols are being developed which are tailored to e-VLBI characteristics; for example:
  – Can tolerate some loss of data (perhaps 1% or so) in many cases
  – Can tolerate delay in transmission of data in many cases
• ‘Experiment-Guided Adaptive Endpoint’ (EGAE) strategy being developed at Haystack Observatory under 3-year NSF grant:
  – Will ‘scavenge’ and use ‘secondary’ bandwidth
  – ‘Less than best effort’ service will not interfere with high-priority users
  – Translates science-user criteria into network constraints

Automation of e-VLBI transfers
• Based on EGAE, major effort is now underway to fully automate routine e-VLBI file transfers
• Algorithms are being built around use of standardized e-VLBI file-naming conventions (as agreed by Himwich, Koyama, Reynolds, Whitney, Nov 2004); see memo #49 at ftp://web.haystack.edu/pub/evlbi/memoindex.html
  – We urge universal adoption of standardized e-VLBI file naming for ease of data interchange

Experimental and Production e-VLBI
• August 2004:
  – Haystack link upgraded to 2.5 Gbps
  – Real-time fringes at 128 Mbps, Westford and GGAO antennas, Haystack Correlator
• September 2004:
  – Real-time fringes at 512 Mbps, Westford and GGAO antennas, Haystack Correlator
• November 2004
  – Real-time e-VLBI demonstration at SC2004 at 512 Mbps
  – Used DRAGON optically-switched light paths
• October 2004 – present
  – Regular transfers from Kashima (~300 GB per experiment; ~200 Mbps)
• February 2005
  – Real-time fringes Westford-Onsala at 256 Mbps
  – Used optically-switched light paths over part of route
• Starting April 2005
  – Routine weekly transfers from Tsukuba (~1.2 TB/transfer)
  – Preparing for CONT05 (15 days continuously; ~1 TB/day)
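
The data volumes and rates quoted for the Kashima transfers and for CONT05 can be related with simple arithmetic. The following is a minimal sketch, assuming decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bit/s) and ignoring protocol overhead. The function names are illustrative only and are not part of any Haystack software.

```python
SECONDS_PER_DAY = 86_400


def transfer_hours(volume_bytes: float, rate_mbps: float) -> float:
    """Hours needed to move volume_bytes at a sustained rate of rate_mbps."""
    seconds = volume_bytes * 8 / (rate_mbps * 1e6)
    return seconds / 3600


def sustained_mbps(volume_bytes_per_day: float) -> float:
    """Average rate in Mbps needed to keep up with a daily data volume."""
    return volume_bytes_per_day * 8 / SECONDS_PER_DAY / 1e6


if __name__ == "__main__":
    # ~300 GB per Kashima experiment at ~200 Mbps (figures from the slide).
    print(f"Kashima experiment: {transfer_hours(300e9, 200):.1f} h")
    # ~1 TB/day for CONT05 (figure from the slide).
    print(f"CONT05 keep-up rate: {sustained_mbps(1e12):.1f} Mbps")
```

Under these assumptions a ~300 GB experiment takes a little over three hours at ~200 Mbps, and keeping pace with ~1 TB/day needs an average of roughly 93 Mbps.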
Real-time e-VLBI SC2004 Demo
[Diagram: network path linking Haystack, Westford, Bossnet, DRAGON, Goddard GGAO and the Pittsburgh Convention Center]

DRAGON Project (Dynamic Resource Allocation via GMPLS Optical Networks)
• Dynamically-provisioned optically-switched network research project
  – U. of Maryland, ISI – PI’s
• 10 Gbps DRAGON network is being installed around Washington, D.C. area, with connections to Abilene, HOPI and NLR
• e-VLBI is primary demonstration application, using 2.4 Gbps dedicated connection to Haystack
  – Programmatic interfaces to EGAE are under development
  – Hope to upgrade Haystack connection to 10 Gbps in near future
• DRAGON will play a prominent role in e-VLBI demos scheduled for iGRID (Sep 05) and SC05 (Nov 05)

DRAGON Network
[Diagram: DRAGON network topology with nodes HAYS, ISIE, ARLG, MCLN, CLPK, UMCP, GSFC and NCSA; connections to Abilene, HOPI and ATDnet/Bossnet; OSPF control plane adjacencies marked]

Movaz Networks iWSS Optical Switch
• MEMS-based switching fabric
• 400 x 400 wavelength switching, scalable to 1000s x 1000s
• 9.23"x7.47"x3.28" in size
• Integrated multiplexing and demultiplexing, eliminating the cost and challenge of complex fiber management
• Dynamic power equalization (<1 dB uniformity), eliminating the need for expensive external equalizers
• Ingress and egress fiber channel monitoring outputs to provide sub-microsecond monitoring of channel performance using the OPM
• Switch times < 5 ms

In summary - Some lessons learned
• High-performance e-VLBI is still hard to do
  – Cannot count on consistent performance
  – Varying traffic loads
  – Network configuration changes
  – Equipment failures
  – Continuous network monitoring is critical to success of on-demand real-time e-VLBI
• Jumbo-frame support is important at rates >~256 Mbps on GigE
  – Jumbo-frame support is spotty, but improving

Some Challenges
• Network bottlenecks well below advertised rates
• Performance of transport protocols
  – Untuned TCP stacks, fundamental limits of regular TCP
• Throughput limitations of COTS hardware
  – Disk-I/O - Network
• Complexity of e-VLBI experiments
  – e-VLBI experiments currently require significant network expertise to conduct
• Time-varying nature of network
• Define standard formats for transfer of data and control information between different VLBI systems
• ‘Last-mile’ connectivity to telescopes
  – Most telescopes are deliberately placed in remote areas
  – Extensive initiatives in Europe, Japan and Australia to connect; U.S. is lagging

Some Frustrations
• Telescope connectivity, particularly in U.S., remains a significant challenge
  – Westford – 1 Gbps
  – GGAO – 1 Gbps
  – Arecibo – 155 Mbps
  – VLBA – not connected
  – GBT – not connected
  – CARMA – not connected
  – JCMT – not connected
  – SMA – not connected
• Much difficulty in securing funding support from NSF Astronomy for e-VLBI
  – Need to develop convincing science case

Future Directions
• Further EGAE and VSI-E development and deployment
• Improved IP protocols for e-VLBI
• Optically-switched networks for highly provisioned high-data-rate pipes
• Solving ‘last mile’ problem to U.S. telescopes
• Distributed correlation using clusters and/or highly distributed PCs
• Extending to higher bandwidths
  – Haystack has Astronomy NSF grant to push for 4 Gbps/station
  – Preparing NSF proposal to extend to 16 Gbps/station using new digital-filter and recording technology
• Continuing to move e-VLBI into routine practice on a global basis

e-VLBI Technical Working Group
• Established at this e-VLBI workshop as group of technical experts, David Lapsley chair
• On hold until David Lapsley replacement is on-board
• Hope to re-invigorate at July e-VLBI workshop in Sydney
• Objectives
  – Evaluate e-VLBI/VSI-E hardware/software/procedures
  – Implement standardized global e-VLBI network performance/monitoring tools
  – Provide expert assistance to e-VLBI users
• ~2 members from each major e-VLBI geographical area

Thank you - THE END
Questions?

Antenna/Correlator Connectivity
• JIVE Correlator (6 x 1 Gbps)
• Haystack (2.5 Gbps)
• Kashima, Japan (1 Gbps)
• Tsukuba, Japan (1 Gbps)
• GGAO, MD (10 Gbps)
• Onsala, Sweden (1 Gbps)
• Torun, Poland (1 Gbps)
• Westerbork, The Netherlands (1 Gbps)
• Westford, MA (2 Gbps)
• Jodrell Bank (1 Gbps?)
• Arecibo, PR (155 Mbps)
• Wettzell, Germany (~30 Mbps)
• Kokee Park, HA (nominally ~30 Mbps, but problems)
• TIGO (~2 Mbps)
In progress:
• Australia – plan to connect all major antennas at 10 Gbps!
• Hobart – agreement reached to install high-speed fiber
• Ny-Ålesund – work in progress to provide ~200 Mbps link to NASA/GSFC
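
The access rates listed above give only an upper bound on what a station can deliver to a correlator. The sketch below shows one way to screen stations against a target real-time rate. It assumes the ceiling is the slower of the two access links, which ignores every hop in between and any sharing of the correlator's link. The table holds a subset of the listed rates (entries marked uncertain on the slide are left out), and the function names are hypothetical.

```python
# Access-link rates in Mbps, copied from the connectivity list.
# Values given as "~" on the slide are approximate.
ACCESS_MBPS = {
    "Haystack": 2500,
    "Kashima": 1000,
    "Tsukuba": 1000,
    "GGAO": 10000,
    "Onsala": 1000,
    "Westford": 2000,
    "Arecibo": 155,
    "Wettzell": 30,
    "TIGO": 2,
}


def ceiling_mbps(station: str, correlator: str = "Haystack") -> int:
    """Upper bound on station-to-correlator rate: the slower access link."""
    return min(ACCESS_MBPS[station], ACCESS_MBPS[correlator])


def real_time_capable(target_mbps: float, correlator: str = "Haystack") -> list[str]:
    """Stations whose access-link ceiling meets target_mbps, sorted by name."""
    return sorted(
        station
        for station in ACCESS_MBPS
        if station != correlator and ceiling_mbps(station, correlator) >= target_mbps
    )


if __name__ == "__main__":
    for target in (128, 512, 1024):
        print(target, real_time_capable(target))
```

With these numbers a nominal 1 Gbps link falls just short of a 1024 Mbps target, so only the stations with faster access links pass that screen. Measured end-to-end throughput, not the advertised access rate, remains the deciding figure.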