FPGA Based Processor for Hubble Space Telescope Autonomous Docking – A Case Study

Jonathan F. Feifarek, jonathan.feifarek@lmco.com
Timothy C. Gallagher, timothy.c.gallagher@lmco.com
Lockheed Martin Space Systems Co.

Image courtesy NASA GSFC

Feifarek, MAPLD 2005/A220

Background: Need for Hubble Repair

● 4/1990: Hubble Space Telescope (HST) launch
● 12/1993: SM* 1 – Corrective COSTAR, WFP Camera 2
● 2/1997: SM 2 – Add NICMOS, STIS, Thermal Blankets
● 10/1997: Hubble Operations Extended from 2005 to 2010
● 12/1997: SM 3A – Replace 6 Gyros, 3 Fine Guidance Sensors
● 3/2002: SM 3B – Replace Solar Panels, NICMOS Coolant
● 3/2003: SM 4 Cancelled Following Columbia Disaster
● 6/2004: Hubble HRV Request For Proposal Issued
● 8/2004: Lockheed Martin awarded HST Robotic Vehicle (HRV)
● 12/2007: Target HRV Launch Date

* SM = Service Mission

HRV Mission: Autonomous Docking System

Approach Requirements by Mission Phase

Pursuit
• Orbit phasing with HST
• HRV checkout
• Range from HST for initial sensor acquisition

Proximity Ops
• HST approach with safe-hold points
• Acquire sensor data on HST orientation and rotation rate

Approach
• Rate matching with HST
• Maneuver to HST capture point

Capture/Berth
• Capture HST:
  – Robotic Arm Captures HST Grapple Fixture
  – Berth to HST aft interface

Vision Processing Algorithm Selection Criteria

● Implementation Concerns: Computationally Intensive
  – Field Programmable Gate Array (FPGA)
  – Flight Computer, DSP Processor
  – Combination
● Implementation Approach
  – All: Conventional Programming Languages
  – FPGA High-Order Languages (HOLs)
  – FPGA Register Transfer Logic (RTL) in VHDL or Verilog
    ● Error-prone
    ● Time consuming (calendar time plus engineering cost)
    ● Difficult to achieve bit-accurate and cycle-accurate operations using hand-coded conversions

Vision Processing Algorithm Selection Results

● FPGA Reconfigurable Architecture Chosen
  – Searched Internet and Conference Proceedings for comparisons between Processors and FPGA Reconfigurable Computers (RCC)
    ● Space Based RCC technology leaders such as Los Alamos National Labs (1) and NASA (2) noted FPGA systems performed between 10x and 1000x faster than processors
    ● Many other references on FPGA based accelerated image processing from University studies (3, 4)
● Microprocessor Embedded in FPGA
  – Allows rapid evaluation of architecture performance
  – Can host large amounts of existing code such as decision logic and complex sequential math
  – For certain algorithms Floating Point is more efficiently implemented in processor code than in gates

Vision Processing Algorithm Selection Results

● FPGA Implementation: Combination of HOL, RTL
  – HOL (Celoxica Handel-C) for fast and efficient implementation
  – Provided fast development cycle needed for mission
    ● Quickly ported math libraries & existing C++ code
    ● Performance matched RTL speed and area; slower than hand-optimized code
    ● Highest speed increase from hand floorplanning
  – RTL for IO Wrapper, IO reuse, and custom-optimized code
  – Combined the benefits of all worlds
  – Microprocessor Implementation
    ● Incorporated Xilinx MicroBlaze™ Core in FPGA
    ● Xilinx tools: Platform Studio SDK / EDK suite
    ● Used GNU C compiler / "gdb" debugger

Vision Processing FPGA Development Flow

[Figure: Celoxica C-based design flow from C Algorithm through Implementation to FPGA. Used with permission of Celoxica, Inc.]

● Acceleration: Provide rapid iteration of partitioning decisions throughout flow
● C to RTL: Generate human-readable VHDL and Verilog for 3rd party synthesis
● C to FPGA: Direct implementation to device optimized programmable logic
● Verification: Drive continuous system verification from concept to hardware
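
The selection criteria flag bit-accurate operation as hard to achieve with hand-coded conversions, and the development flow stresses continuous verification. One common way to check this is to compare hardware output against an integer-only "golden" model. The sketch below is illustrative only: the Q8.8 fixed-point format, the 3-tap smoothing kernel, and all function names are assumptions, not details of the HRV design.

```python
# Illustrative only: the Q8.8 format and 3-tap kernel are assumptions,
# not details of the HRV vision processing design.
FRAC_BITS = 8  # assumed number of fractional bits (Q8.8)


def to_fixed(x):
    """Quantize a real coefficient to Q8.8, rounding to nearest."""
    return int(round(x * (1 << FRAC_BITS)))


def smooth_float(pixels, weights):
    """Floating-point reference: 3-tap weighted sum over a sliding window."""
    return [sum(w * p for w, p in zip(weights, pixels[i:i + 3]))
            for i in range(len(pixels) - 2)]


def smooth_fixed(pixels, weights_q):
    """Integer-only golden model: multiply, accumulate, then shift (truncate)."""
    return [sum(w * p for w, p in zip(weights_q, pixels[i:i + 3])) >> FRAC_BITS
            for i in range(len(pixels) - 2)]


def mismatches(golden, observed):
    """Indices where the observed output differs from the golden model."""
    return [i for i, (g, o) in enumerate(zip(golden, observed)) if g != o]


weights = [0.25, 0.5, 0.25]
weights_q = [to_fixed(w) for w in weights]   # [64, 128, 64]
pixels = [1, 2, 4, 8, 16]

reference = smooth_float(pixels, weights)    # [2.25, 4.5, 9.0]
golden = smooth_fixed(pixels, weights_q)     # [2, 4, 9]

observed = list(golden)
observed[1] ^= 1                             # simulate a single-bit error
print(mismatches(golden, observed))          # [1]
```

The floating-point reference and the truncating integer model disagree in the fractional part, which is why a bit-accurate check is made against the integer model and not against the floating-point code.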
Vision Processing Card (VPC) Block Diagram

[Figure: VPC block diagram. Recoverable labels are listed below.]

● Serial Camera: Pixels
● RAM: Raw Images and Edge Enhanced Images (frames t and t+1, Loading); Program, Data Memory
● FPGA: Memory Manager; Front End Image Processor (Pyramidal Downsampling / Edge Enhancement, Enhanced Pixels); Lucas-Kanade Trackers; Edge Finder; Control
● Xilinx MicroBlaze™ Microprocessor Core: Image Patches; Project Model Points (3D to 2D Images); Image Points; Project Edges; Best Fit Edge; Compute New Pose (Iterative); Output Pose
● Custom Floating Point Unit (FPU): Single FPU Instance With Multiple Software Invocations
  – MicroBlaze In/Out: Operation Request, Operator, Operands In, Results Out
  – Floating Point Unit Pipeline: Scalar, Matrix */Convert
  – MicroBlaze Software Libs, MicroBlaze Hardware FPU

Vision Processor Card Architecture

[Figure: VPC architecture. Recoverable labels are listed below.]

● Four coprocessor FPGAs: COP A, COP B, COP C, COP D (Xilinx V2)
● Each COP with SerDes, SDRAM, and SRAM; Ports A, B, C, D
● Internal PCI Common Interconnect Bus
● Flash; PCI-PCI Bridge / Config; Power Switch
● J8 PCI Connectors

VPC Engineering Development Board

[Photo: VPC engineering development board. Used with permission of SEAKR Engineering, Inc.]
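
The "Project Model Points (3D to 2D Images)" step in the block diagram can be illustrated with a pinhole-camera projection. This is a minimal sketch, not the flight code: the function name, focal length, and principal point are hypothetical, and the points are assumed to be already expressed in the camera frame using the current pose estimate.

```python
def project_points(points_cam, focal_px, cx, cy):
    """Project camera-frame 3D points (x, y, z) to pixel coordinates (u, v).

    Assumes an ideal pinhole camera with focal length focal_px (in pixels)
    and principal point (cx, cy); lens distortion is ignored.
    """
    image_points = []
    for x, y, z in points_cam:
        if z <= 0:
            raise ValueError("point must be in front of the camera (z > 0)")
        u = focal_px * x / z + cx
        v = focal_px * y / z + cy
        image_points.append((u, v))
    return image_points


# Hypothetical model points and camera parameters, for illustration only.
model_points = [(0.0, 0.0, 10.0), (1.0, 0.0, 10.0), (0.0, -2.0, 4.0)]
image_points = project_points(model_points, focal_px=500.0, cx=320.0, cy=240.0)
print(image_points)  # [(320.0, 240.0), (370.0, 240.0), (320.0, -10.0)]
```

The last point lands at a negative row coordinate, so a real pipeline would also need to discard or clip points that project outside the image.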
VPC SEU Approach

● Main SEU Mitigation: Dual Voting at FPGA output
  – Detects SEEs but cannot correct for them
  – Tight power restrictions (thermal reasons) restrict triple voting
  – Vision Processing Algorithm tolerant of drop-outs
    ● Multiple camera views / algorithms into Kalman filter
    ● HRV mission uses very low rate docking (1 inch / sec)
● SEU Correction at FPGA-to-Memory Interfaces
● Microprocessor returned to Reset State after each image
● Algorithm memory only 1 image deep; flushes SEU effects
● Voting, Configuration Scrubbing Performed in Rad Hard Part
● Analysis Shows Low SEE Rate (1 effective upset / 10 hours)

VPC Sizing Results for NFIR Algorithm

Single LK Tracker

Resource      MicroBlaze  Front End  LK Tracker   Total  Available  % Utilized
LUTs                2700       8000        7000   17700      67584         26%
Flip Flops          2000       4000        2700    8700      67584         13%
Multipliers            4          0          19      23        144         16%
BlockRAMs             33         42          24      99        144         69%

Quad LK Trackers

Resource      MicroBlaze  Front End  LK Trackers (4)   Total  Available  % Utilized
LUTs                2700       8000            13580   24280      67584         36%
Flip Flops          2000       4000            10580   16580      67584         25%
Multipliers            4          0               70      74        144         51%
BlockRAMs             33         42               68     143        144         99%

VPC Performance Results for NFIR Algorithm

Function Timed          Cycles/Loop   Loops   Total Cycles
Project Model Points          26000       1          26000
Lktracker (hardware)        2000000       1        2000000
FindExtrinsic               3078000       1        3078000
Project edges                120000       3         360000
FindEdges (hardware)         400000       1         400000
Project ellipses              80000       3         240000
computeAllFis                180000       2         360000
computeVsumCsum              280000       2         560000
computeAlpha                 230000       2         460000
UpdatePose                     6000       3          18000
getAllErrors                 240000       3         720000
Total                       6640000                8222000

VPC Performance Results for NFIR Algorithm (cont.)
FindExtrinsic timing    Cycles/Loop   Loops   Total Cycles
Normalize                     73000       1          73000
SVD6x6                       170000       3         510000
SVD3x3                        35000       1          35000
FindHomography              1100000       1        1100000
ProjectPoints                190000       4         760000
Rest                         600000       1         600000
Total                       2168000                3078000

Summary: Lessons Learned

● Using OpenGL algorithm for development hampered design
● Parallel PC board and FPGA designs helped meet schedule
● Using FPGAs was key to meeting speed requirements
● Use of microprocessor core reduced development time
● Early allocation of algorithm to hardware/software paid off
● Use of HOLs made implementing complex tasks possible
● Engage expert tool user on team (MicroBlaze, Handel-C)
● Having reference software / test data eased verification
● Benefited from small, enthusiastic, tight-knit team
● Worked around MicroBlaze library bugs with custom logic

References

● (1) "A Space Based Reconfigurable Radio", Michael Caffrey, Los Alamos National Laboratory, MAPLD, September 2002
● (2) "Developing Reconfigurable Computing Systems for Space Flight Applications", Thomas P. Flatley, NASA Goddard Space Flight Center, Greenbelt, Maryland 20771
● (3) "Implementing Image Applications on FPGAs", B. Draper, R. Beveridge, W. Böhm, C. Ross and M. Chawathe, International Conference on Pattern Recognition, Quebec City, Aug. 11-15, 2002
● (4) "Performance of Reconfigurable Architectures for Image-Processing Applications", Domingo Benitez, University of Las Palmas G.C., Journal of Systems Architecture: the EUROMICRO Journal, September 2003
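
The cycle counts on the two VPC Performance Results slides can be cross-checked with a few lines of arithmetic: each row's total is cycles per loop times loops, and the FindExtrinsic breakdown should sum to the FindExtrinsic row of the top-level table. The sketch below uses the slide values directly. The clock frequency is a hypothetical parameter, since the slides do not state one.

```python
# (function, cycles per loop, loops), copied from the performance slides.
TOP_LEVEL = [
    ("Project Model Points", 26_000, 1),
    ("Lktracker (hardware)", 2_000_000, 1),
    ("FindExtrinsic", 3_078_000, 1),
    ("Project edges", 120_000, 3),
    ("FindEdges (hardware)", 400_000, 1),
    ("Project ellipses", 80_000, 3),
    ("computeAllFis", 180_000, 2),
    ("computeVsumCsum", 280_000, 2),
    ("computeAlpha", 230_000, 2),
    ("UpdatePose", 6_000, 3),
    ("getAllErrors", 240_000, 3),
]

FIND_EXTRINSIC = [
    ("Normalize", 73_000, 1),
    ("SVD6x6", 170_000, 3),
    ("SVD3x3", 35_000, 1),
    ("FindHomography", 1_100_000, 1),
    ("ProjectPoints", 190_000, 4),
    ("Rest", 600_000, 1),
]


def total_cycles(rows):
    """Sum of cycles-per-loop times loop count over all rows."""
    return sum(cycles * loops for _, cycles, loops in rows)


def largest_contributor(rows):
    """Name of the row with the highest total cycle count."""
    return max(rows, key=lambda row: row[1] * row[2])[0]


def frame_time_s(cycles, clock_hz):
    """Processing time for one pass at an assumed clock frequency."""
    return cycles / clock_hz


print(total_cycles(TOP_LEVEL))        # 8222000
print(total_cycles(FIND_EXTRINSIC))   # 3078000
print(largest_contributor(TOP_LEVEL)) # FindExtrinsic

# 100 MHz is a hypothetical clock rate, not a figure from the slides.
print(frame_time_s(total_cycles(TOP_LEVEL), clock_hz=100e6))
```

Both totals match the slides, and FindExtrinsic accounts for roughly 37% of the cycles per pass, with FindHomography the largest item inside it.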