NASA - Hubble Space Craft Remote Diagnostics
A Case Study of the Hubble Space Telescope Data Warehouse

By Ralph Reitan
Computer Sciences Corporation
Lanham-Seabrook GT4
7700 Hubble Drive
Lanham/Seabrook, MD 20706
301-794-2386
December 7, 1999

LONG SUMMARY

Please describe your application and the information technology used in conjunction with it. Please keep your language simple and your explanations non-technical.

The Hubble Space Telescope Control Center System Data Warehouse Team has revolutionized the means by which scientists and engineers can analyze engineering data associated with a billion-dollar satellite system. They are making huge volumes of engineering data from approximately 7000 onboard sensors, collected for the entire life of the Hubble Telescope, available in real time, and they are providing for the first time ever a means to pose highly complex queries against this data from any authorized location worldwide. They have changed the nature of these analytical activities from an old, time-consuming offline process that could take months to a new, on-line system that allows these analyses to be performed in seconds to minutes. When spacecraft operating costs can run into several thousand dollars a minute, the added value of such rapid analysis of engineering problems is enormous.

The team creatively solved the problem of how to ingest approximately 2 terabytes/year of incoming real-time data by developing data compression and data cleansing techniques. These techniques allow this massive data volume to be compressed by a ratio of 11:1 and, through the use of commercial data warehousing technology, make it possible to load each day's engineering telemetry data (~5 gigabytes) in 10 minutes!
In addition, the team has engineered this system to enable scientists and engineers to ask complex questions that could never be posed before about the "health" of the satellite carrying the Space Telescope as well as the scientific instruments onboard. Their concept is portable to other satellite ground systems that store telemetry data.

During the closing session of the International Space Operations Symposium '98, held in Tokyo, Japan, the session moderator for the conference (Verner Franck) stated that "the idea of compressing the telemetry data and using a data warehouse to store and make it available in this fashion is one of the most exciting ideas to come out of this symposium. Everyone must look at how this approach should be impacting their ground system development activities."

The idea of developing a data warehouse to house real-time engineering data is unique. This particular warehouse was the first of its kind in the country. Based on this implementation, a major U.S. computer chip manufacturer (Micron Technologies) has already successfully implemented an engineering warehouse. Based on a presentation given in October 1998 at the Red Brick Builder's Forum in Tokyo, Japan, several Japanese manufacturing concerns (NEC & others) have expressed interest in the development of engineering warehouses. The presidents of Red Brick United States (Chris Erickson) and Red Brick Japan (Fumitaka Tezuka) both stated, "The concept of real-time engineering data warehousing represents a vast untapped market for which tremendous interest now exists."

BENEFITS

Has your project helped those it was designed to help? In your opinion, how has it affected them? What new advantage or opportunity does your project provide to people? Has your project fundamentally changed how tasks are performed? In your opinion, have you developed a technology that may lead to new ways of communicating and processing information? What change might unfold?
This team innovated the way engineering data is compressed and stored by spacecraft ground control systems. In the process, they created a unique system that provides scientists and spacecraft operators with a completely new paradigm for analyzing engineering data to support the operations of the Hubble Space Telescope and its onboard instruments. The team took advantage of commercial data warehouse technology and used a Red Brick Data Warehouse to store the data. Using a commercial data warehouse to track engineering data, specifically spacecraft telemetry, in this application was a first. John Gainsborough, Space Telescope Vision 2000 Operations and Ground Systems Manager, has stated, "Other Space Centers have been fighting the problem of storing and analyzing massive amounts of engineering telemetry data for years. We appear to have solved it and they are expressing tremendous interest." This system concept is portable to other ground systems and to other systems that store engineering data.

This new system provides rapid-response, 24-hour access to the engineering data for the lifetime of this mission, which is now in its ninth year, and allows worldwide access to authenticated users. It means that, when a problem occurs, NASA's worldwide partners can now analyze engineering problems at their desks in their home facilities rather than boarding a plane to the Goddard Space Flight Center in Maryland! It means that the data is no longer kept in flat files to be retrieved, uncompressed, and searched in a time-consuming (days to weeks) offline process! It means that queries are no longer one-dimensional and slow. Complex trending studies that once could take as long as a year can now be answered within seconds, and even the more complex queries may take only minutes.

Scientists and engineers can, for the first time, rapidly access the entire history of instrument and spacecraft performance, with queries that span multiple sensor measurements and multiple time ranges.
For example, at day/night transition, they can examine vibration caused by thermal shock; they can find the temperature of the aft shroud during orbital day.

The creative means by which data is compressed allows data for the entire mission lifetime to be centrally stored. This solves what has been a huge problem for ground systems: how to store enormous volumes (5 gigabytes/day) of incoming data in near real time and make it accessible almost immediately. Managing the health and safety of the Hubble Telescope involves viewing and analyzing 7000 separate measurements of onboard sensor data. Measurements are sent to the ground in an engineering telemetry stream at a nominal rate of 32 kilobits per second, or 4 to 5 gigabytes per day of raw data to be continuously received, analyzed, and archived for further analysis. Data volumes equivalent to 2 billion rows in the data warehouse tables per year are being collected.

The team's innovative data compression and storage scheme solved the data volume problem by storing change-only data points for 91% of the data and averaging the rest over varying intervals based on the sensor sampling rate. This unique solution reduced the flat data volume by approximately 85%. Even considering the amount of space needed to index the data for data warehouse querying, the amount of direct-access storage needed for the lifetime of the mission is estimated to be only 1.5 terabytes. This makes it possible to keep all data on-line, to make it accessible almost immediately, and to use this data as a pointer to all-points data for detailed analyses.

The compression techniques used for this system and the system architecture are portable to other ground system programs. The Tracking and Data Relay Satellite System program office has sent representatives to explore in detail how this can be implemented within their systems.
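As an illustration of the two compression ideas described above (change-only storage for most measurements, interval averaging for the rest), a minimal Python sketch might look like the following. The function names, the (timestamp, value) sample format, and the fixed averaging interval are assumptions made for this example; the source does not describe the team's actual implementation, and samples are assumed to arrive in time order.

```python
from typing import Iterable, List, Tuple

# Hypothetical sample format: (timestamp in seconds, measured value).
Sample = Tuple[float, float]


def change_only(samples: Iterable[Sample]) -> List[Sample]:
    """Keep a sample only when its value differs from the last kept value."""
    kept: List[Sample] = []
    for t, v in samples:
        if not kept or kept[-1][1] != v:
            kept.append((t, v))
    return kept


def interval_average(samples: Iterable[Sample], interval: float) -> List[Sample]:
    """Replace samples with one (bucket_start, mean) pair per time bucket."""
    out: List[Sample] = []
    bucket_start = None
    total = 0.0
    count = 0
    for t, v in samples:
        start = (t // interval) * interval  # start time of this sample's bucket
        if bucket_start is None:
            bucket_start = start
        if start != bucket_start:
            # The bucket changed: emit the finished bucket's mean and reset.
            out.append((bucket_start, total / count))
            bucket_start, total, count = start, 0.0, 0
        total += v
        count += 1
    if count:
        out.append((bucket_start, total / count))
    return out
```

In this sketch, a slowly changing measurement such as a switch state would go through `change_only`, while a noisy analog measurement would go through `interval_average` with an interval chosen from its sampling rate. Either way, the stored rows can serve as a pointer back to the full all-points data for the time range of interest.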
The Marshall and Goddard Space Flight Centers, Jet Propulsion Lab, and Johnson Space Center have committees studying this type of solution for new ground system developments. In addition, the team has been contacted by representatives of the data management branches of the European Space Agency (ESA) in Darmstadt, Germany, and the French Space Agency (CNES) in Paris. The U.S. Air Force has also expressed interest in this concept to support their analysis of satellite engineering telemetry on several (proprietary) reconnaissance projects.

IMPORTANCE

How did information technology contribute to this project? Describe any new technologies used and/or cite innovative uses of existing technology. For example, did you find new ways to use existing technology to create new benefits for society? Or, did you define a problem and develop new technology to solve it? How quickly has your targeted audience of users embraced your innovation? Or, how rapidly do you predict they will? Does your work define new challenges for society? If so, please describe what you believe they may be.

The availability of a high-performance data warehouse product was essential for successful implementation. The team used a product developed to store and rapidly analyze business data and adapted it to spacecraft data. The data warehouse paradigm, a new technology, was used as a design model.

ORIGINALITY

What are the exceptional aspects of your project? Is it original? How? Is it the first, the only, the best or the most effective application of its kind? How did the project evolve? What is its background?

One of NASA's early goals for the HST Vision 2000 Project was to have "All data on line and immediately available for operational use," a goal that was simply unattainable with standard technology.
To overcome technical obstacles and meet the goal, the team applied new technology in their unique approach to solving the problem of rapid loading and querying of huge amounts of data. That technology is data warehousing, a specialized analytical database system designed to manage large volumes of data and provide rapid access to information. There was, and still is, very little industry experience using a commercial off-the-shelf data warehouse system for such a large amount of engineering data. In the past, the primary users of these data warehouse products were commercial companies tracking and querying huge volumes of sales data, or medical industry personnel tracking claims and analyzing them for potential fraud on the part of providers. Using a commercial data warehouse to track engineering data, specifically spacecraft telemetry, in this application was a first. It clearly represents a totally new frontier for satellite ground control systems.

The team evaluated available data warehouse software and selected Red Brick's Data Warehouse product, mostly because of its indexing scheme, called the star index. A query interface to the data warehouse is provided via the Vision 2000 CCS Graphical User Interface (GUI), a state-of-the-art Java user interface providing users with access to all the command, control, and monitoring capabilities of the Command and Control System using Internet/Intranet technologies. Queries entered by the user from the GUI are sent to C++ programs written with Rogue Wave tools. This product allows the software to fully use object orientation and information encapsulation. This, in turn, allows the application to take advantage of enhancements to the interface protocol in a seamless fashion, thus increasing responsiveness to future software enhancements and decreasing maintenance load.

SUCCESS

Has your project achieved or exceeded its goals? Is it fully operational? How many people benefit from it?
If possible, include an example of how the project has benefited a specific individual, enterprise or organization. Please include personal quotes from individuals who have directly benefited from your work. Describe future plans for the project.

This contribution has been in use for 18 months and is considered a fully integrated component of the Hubble Space Telescope ground control system. It is a key component of the overall archiving and cost-reduction strategy for the Hubble Space Telescope Ground System. From the standpoint of day-to-day ground system operations, it is assumed that the data warehouse will serve as the main data source for approximately 80% of the queries by spacecraft engineers and telescope instrument engineers. It will provide these operators and engineers with displays that allow them easy access to the telemetry and events data 24 hours a day in near real time.

DIFFICULTY

What were the most important obstacles that had to be overcome in order for your work to be successful? Technical problems? Resources? Expertise? Organizational problems? Often the most innovative projects encounter the greatest resistance when they are originally proposed. If you had to fight for funding, it would be useful to include a summary of the objections you faced and how you overcame them.

The use of a commercial database to store spacecraft telemetry was unheard of at the beginning of this project in 1995. Other ground system development teams had tried using commercial DBMSs and were disappointed by their lack of performance and ease of use. These barriers and a lack of experience with this technology led to many doubts about its viability. Benchmarks were performed and a pilot study was undertaken to build a working prototype. This study demonstrated that a commercial off-the-shelf (COTS) data warehouse could store and manage vast amounts of data.
Only after this exhaustive study did the client approve the technology.