Dynamic Data Displays: Experiences from USDA’s Economic Research Service
Vince Breneman
GIS Coordinator
USDA-Economic Research Service
March 5, 2007

Economic Research Service Overview
- 400 employees (300 social scientists)
- ERS conducts a research program to inform public and private decision-making on economic and policy issues involving:
  - agriculture
  - trade
  - food markets
  - food safety, nutrition, and assistance
  - natural resources
  - rural America

Data visualization objectives
- Clients
  - Policy/legislative gatekeepers
  - Press/media
  - Commodity industry/agribusiness
  - Research community
  - Public
- Goal: provide our clients with robust data products that let them find, explore, and extract data and build custom presentation products with some level of consistency

Challenges
- Lots of products, often in the wrong format for creating dynamic displays
  - 100 data products that include thousands of datasets
  - Most products are file based, with metadata in the file
- A variety of users wanting different things from the data. Some want:
  - a number
  - a relationship
  - a table or database
  - graphics
- Finding data
  - How do we expose data and information contained in our automated data delivery systems?
  - How do users find cross tabulations, geographic subsets, etc.?
Challenges
- Departmental guidelines
  - Limited screen real estate: 767-pixel width, 140-pixel header, ¼ left nav box, ¼ right nav box
  - 508 compliance: must provide reasonable access to the visually impaired
  - Pop-ups not advised because of pop-up blockers
- Departmental guidelines have changed and will change again, so building applications with a level of flexibility becomes part of our requirements

Commercial packages we use for developing Web data visualization applications
- Corda’s PopChart: graphing tool
  - Generates Flash files as well as JPG, EPS, PDF …
  - Provides some standard graph types, but we have done a bit of customization
  - To reduce development time and provide some consistency, we have been building graphing web services
  - Provides a d-link for the visually impaired
- ESRI’s ArcIMS 9.1: mapping of detailed geographies, generally county level or lower
  - Utilizes our SQL Server and spatial data repository
  - Able to incorporate a host of spatial reference layers

ERS data sets page
- Lots of products
- Departmental guidelines
- Finding data

ERS food consumption application
- Provides food availability estimates of food supplies moving from production through marketing channels for domestic consumption
- U.S. national-level annual time series, 1909-2005
- Includes hundreds of commodities, from beef and pork to coffee and tea to potato chips
- Total domestic population

Food Consumption data selection page
- Application instructions
- User-defined graphable subset
- Output options

- Challenge: Wanted to show over 100 variables, creating a LIL (Long Intimidating List)
  Solution: Break the list into categories
- Challenge: The application description and data query component take up lots of screen real estate
  Solution: Jump the return page to the data selection and results
- Challenge: Want to illustrate the group total and commodity shares, which can create scaling issues
  Solution: Allow the user to add or delete items from the selected group

PopChart features
- Graph dynamically scales the Y-axis for a given set of data
- PopChart generates Flash files with rollovers to get actual values
- PopChart provides a d-link for the visually impaired

- Challenge: Wanted the graph to stand alone and be clear about what it represents
  Solution: Include extensive footnotes
- Challenge: How do we handle too many variables?
  Solution: Panel plots?
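The scaling issue noted above (a group total plotted alongside small commodity shares) can be illustrated with a minimal sketch. This is not PopChart's algorithm, which the slides do not describe; the function names, the 1-2-5 rounding rule, and all values below are hypothetical and chosen only for illustration.

```python
import math


def nice_axis_max(values):
    """Return a rounded Y-axis upper bound (1, 2, 5, or 10 times a power of ten)."""
    peak = max(values)
    if peak <= 0:
        return 1.0
    magnitude = 10 ** math.floor(math.log10(peak))
    for step in (1, 2, 5, 10):
        if peak <= step * magnitude:
            return float(step * magnitude)
    return float(10 * magnitude)  # fallback for floating-point edge cases


def axis_max_for_selection(series, selected):
    """Compute the axis bound using only the series the user has selected."""
    values = [v for name in selected for v in series[name]]
    return nice_axis_max(values)


# Hypothetical values for illustration only (not ERS data).
series = {
    "Group total": [180.0, 195.0],
    "Commodity A": [60.0, 64.0],
    "Commodity B": [9.0, 12.0],
}

# With the group total selected, the axis is sized for the total and
# the small commodity is squeezed against the bottom of the chart.
with_total = axis_max_for_selection(
    series, ["Group total", "Commodity A", "Commodity B"]
)

# After the user removes the larger items, the axis rescales to the
# remaining series.
without_total = axis_max_for_selection(series, ["Commodity B"])

print(with_total, without_total)
```

Letting the user add or delete items changes which values feed the axis calculation, so a small series becomes readable once the larger ones are dropped.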
Profiles of America mapping application spatial data
- Purpose:
  - Help users visualize over 100 important U.S. county-level demographic and socio-economic indicators and a number of ERS rural indicators/geographies
  - Summarize demographic data by rural geographies and by State
  - Allow the application to be extended with new indicators or regional geographies
- Outputs include maps, graphs, and tables
- This is one of our first highly interactive data visualization products

Regional geographies/typologies
- Rural-Urban Continuum Code, 2003
- County Typology Codes, 2004
- Metro-Nonmetro status, 2003
- Metro-Micro-Noncore status, 2003
- Urban Influence Code, 2003

Example demographic and socio-economic data
- Unemployment rate, 2003
- Population and Migration
- Age and Sex
- Race and Ethnicity
- Educational Attainment
- Households and Families
- Journey to Work
- Employment and Unemployment
- Income and Poverty

How do unemployment rates compare between metro and nonmetro regions?
- State selection (U.S. default)
- Topic groupings
- Rural indicators
- Challenge: How to map 3000+ counties quickly, in under 2 seconds?
  Solution: Currently we use ESRI ArcIMS as a thin client

Regions view
- Standard map navigation
- Issue: Do you show just the state, or the surrounding states as well?
- Challenge: Where to display identify tool results?
  Solution: We first used pop-ups; we now post table results below the map
- Challenge: How to link map regions and chart regions?
  Solution: Used color linkages

Same population change variable aggregated to a different geography

How to aggregate data?
- Make sure all variables use the same county base geography
- Develop a data structure to handle regional aggregations
- Identify each variable as a total, a ratio, or categorical
  - Ratio data (rates, percentages): need to know the numerator and the denominator by region
  - Totals: sum the variable by region

Agricultural Resource Management Survey (ARMS) application
- ARMS is USDA's primary source of information on the financial condition, production practices, resource use, and economic well-being of America's farm households
- It’s a large and complicated dataset, with over 650 variables before you start summarizing the data with various categorical variables
- All these variables are embedded in an application, and it’s quite difficult to wrap a search engine around it
- Challenge: How do you search this variable list and provide useful results to the user?
  Solution: ???

Farm typology
- Commercial farms: very large family farms, nonfamily farms
- Intermediate farms: farming occupation/high sales farms, large family farms
- Rural residence farms: limited-resource, retirement, residential/lifestyle, farming occupation/low sales

2005 view of farm assets per farm

2005 view of farm assets (percent)

- Challenge: How do we show farm assets by farm typology over time? How do we convey the statistical significance of the estimates?
  Solution: Panel plots; provide confidence bounds
- Issue: We had to calculate confidence bounds in the application. We would prefer the graphing software to handle it.
- Challenge: How do we show farm assets by region?
  Solution: Charts on a map

Farm assets over time by region, with confidence bounds

- Challenge: There are several ways to show the same data, each with different users in mind
  Solution: Display data in multiple formats
- Challenge: How to provide an easy way to compare multiple graphs?
  Solution: Here we grabbed JPGs and lined them up in PowerPoint

Farm household income from farming and from off-farm sources

Useful approaches for developing Web data visualization applications
- Identifying a few graphing products on which to concentrate development: Corda’s PopChart and ESRI ArcIMS
- Developing data exploration apps for in-house use: provides utility to research staff and a test bed for external applications
- In-house usability testing: really helps you identify the common information people are looking for from a dataset/application

Concluding thoughts
- Building dynamic data displays helps ERS and its clients in a number of ways:
  - Makes data products more accessible and usable
  - Provides ways of turning data into information
  - Provides better ways to find information
  - Provides incentives to organize and manage our data in better ways, supporting both our research and data dissemination missions
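
Returning to the aggregation rules from the Profiles of America discussion (totals are summed by region; ratios are rebuilt from their numerator and denominator by region), the following is a minimal sketch of that idea. The field names, county codes, and values are hypothetical and are not taken from the ERS application; categorical variables are not handled here.

```python
from collections import defaultdict


def aggregate_by_region(counties, region_key):
    """Sum totals by region and rebuild the ratio from the summed parts."""
    sums = defaultdict(
        lambda: {"population": 0, "unemployed": 0, "labor_force": 0}
    )
    for county in counties:
        region = sums[county[region_key]]
        # Totals, and the numerator and denominator of the ratio, are summed.
        region["population"] += county["population"]
        region["unemployed"] += county["unemployed"]
        region["labor_force"] += county["labor_force"]

    result = {}
    for name, s in sums.items():
        # The regional rate is numerator / denominator, not a mean of county rates.
        if s["labor_force"]:
            rate = 100.0 * s["unemployed"] / s["labor_force"]
        else:
            rate = None
        result[name] = {
            "population": s["population"],
            "unemployment_rate": rate,
        }
    return result


# Hypothetical county records for illustration only (not ERS data).
counties = [
    {"fips": "00001", "metro_status": "metro", "population": 100000,
     "unemployed": 2000, "labor_force": 50000},
    {"fips": "00003", "metro_status": "metro", "population": 20000,
     "unemployed": 1000, "labor_force": 10000},
    {"fips": "00005", "metro_status": "nonmetro", "population": 5000,
     "unemployed": 300, "labor_force": 2500},
]

regions = aggregate_by_region(counties, "metro_status")
print(regions["metro"])
print(regions["nonmetro"])
```

In this made-up data the two metro counties have rates of 4 and 10 percent. Averaging those rates would give 7 percent, while rebuilding the rate from the summed numerator and denominator gives 5 percent, which is why the numerator and denominator need to be carried by region.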