Seminar on Emerging Trends in Data Communication and Dissemination
Statistical Data as a Service and Internet Mashups
by Zoltan Nagy
19 February 2010

What do we have today?
Statistical organizations have become competent and capable content providers and users of the Internet. Most National Statistical Offices and International Agencies have websites with static or dynamic content and with interactive databases or downloadable datasets.
With the current approach we distribute statistical data as goods, but are there other ways to provide access to statistical data?

Providing Data as a Service

Data as Goods (bottled water)              Data as Service (plumbing)
-----------------------------              --------------------------
Bulk one-time download                     Dynamic access
Dated with the time of download            Always latest update
Need for storage                           Storage is provided
Analysts, researchers, data enthusiasts    Dynamic content providers, mashup creators

What is a Mashup?
• A web application that combines data from more than one source into a single integrated tool.
• An example is the use of data from Google Maps to add location information to statistical data, thereby creating a new and distinct web service that was not originally provided by either source.

What are the benefits of mashups?
- Creation of new dynamic user experience
- Gain valuable insights through information remix
- Further promotion of our services and data
- Minimized application data management
- Reduced development effort
- Get results faster by accessing information in place
- Ability to quickly assemble applications for new situations

How does a mashup work?
[Diagram: the user sends a request to the mashup website. The mashup website makes API calls to Website 1 and Website 2, each of which returns data. The mashup website manipulates the data and presents it to the user.]

APIs and web services
- API is an abbreviation for Application Programming Interface, a set of routines, protocols, and tools for building software applications. A good API makes it easier to develop a program by providing all the building blocks. A programmer then puts the blocks together.
- Web services today are frequently just Application Programming Interfaces (APIs), or web APIs, that can be accessed over a network, such as the Internet, and executed on a remote system hosting the requested services.

How to plan for a mashup?
• Pick a subject
  - A mashup of what?
  - Map + statistical data?
  - Google Fusion Tables for transformation + statistical data?
  - More sources of data: more complicated
• Decide your data sources
  - Who is your data provider?
  - Google Maps, Bing Maps, etc.
  - Online data: UNdata, Comtrade, Dallas, etc.
  - Usually language agnostic
  - Varying complexity
• Other concerns
  - How much time do you have?
  - Do you have a server to run it on?
  - Which programming language?

Data dissemination – UNdata
UNdata is a unique initiative of the Statistics Division to bring statistical information together from various international and national sources and present it in an easily understandable and accessible format.
• Make UN databases freely available
• Organize international databases to allow searchability and open access
• Promote national data dissemination
• Build a global data dissemination infrastructure

UNdata API
The UNdata API project is an API version of the data made available by the United Nations on the UNdata site. The aim is to make this data accessible and reusable in a variety of ways so it can be easily mashed up and recombined into new applications or analyses.
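
The mashup flow described earlier (user request, API calls to two sources, data manipulation, presentation) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `fetch_statistics` and `fetch_locations` are hypothetical stand-ins for real API calls, they return made-up placeholder values, and the codes "AAA", "BBB" and "CCC" are fictional. Nothing here reflects the actual UNdata API or any map provider's interface.

```python
from typing import Dict, List, Tuple


def fetch_statistics() -> Dict[str, float]:
    """Hypothetical stand-in for an API call to a statistical data service.

    Returns made-up indicator values keyed by a fictional area code.
    """
    return {"AAA": 12.5, "BBB": 7.0, "CCC": 3.2}


def fetch_locations() -> Dict[str, Tuple[float, float]]:
    """Hypothetical stand-in for an API call to a mapping service.

    Returns made-up (latitude, longitude) pairs keyed by the same codes.
    """
    return {"AAA": (10.0, 20.0), "BBB": (-5.0, 40.0)}


def build_mashup(
    stats: Dict[str, float],
    locations: Dict[str, Tuple[float, float]],
) -> List[dict]:
    """Data manipulation step: join the two sources on the shared code.

    Codes missing from either source are skipped.
    """
    merged = []
    for code in sorted(stats):
        if code in locations:
            lat, lon = locations[code]
            merged.append(
                {"code": code, "value": stats[code], "lat": lat, "lon": lon}
            )
    return merged


def present(rows: List[dict]) -> str:
    """Presentation step: render the combined rows as plain text."""
    return "\n".join(
        f"{r['code']}: value={r['value']} at ({r['lat']}, {r['lon']})"
        for r in rows
    )


if __name__ == "__main__":
    print(present(build_mashup(fetch_statistics(), fetch_locations())))
```

In a real mashup the two fetch functions would issue HTTP requests to the chosen providers, and the join key would have to be a code both sources share, which is often the hardest part of combining sources.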

Mashups with UNdata

Data dissemination – Comtrade
• Data from over 150 countries processed into a standard format
• Data by partner country/commodity from 1962, covering about 90% of world trade
• 1.5 billion statistical records, 0.5 terabyte of data
• Free Web access to any record and paid subscription for use of download services
• 6 billion records downloaded since June 2003

Data dissemination – Comtrade API
[Diagram: Trade Data Transfer Architecture. United Nations Comtrade (tariff line and total trade data, web server and database server) exposes Comtrade Web Services, which deliver SDMX-ML and element-based XML over the HTTP protocol via the Internet. Organizations #1, #2 and #3 receive XML files or text files on a PC or database server and process them with Comtrade Tools or other tools.]
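
On the receiving side of this architecture, an organization gets XML files over HTTP and loads them with its own tools. A minimal sketch of that client-side step is given below, assuming an invented element-based XML layout. The element names (`records`, `record`, `reporter`, `partner`, `commodity`, `year`, `value`) and all values are hypothetical and are not the Comtrade schema or SDMX-ML.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Made-up sample payload in a hypothetical element-based layout.
SAMPLE_XML = """<records>
  <record>
    <reporter>AAA</reporter>
    <partner>BBB</partner>
    <commodity>0001</commodity>
    <year>2008</year>
    <value>1500</value>
  </record>
  <record>
    <reporter>AAA</reporter>
    <partner>CCC</partner>
    <commodity>0002</commodity>
    <year>2008</year>
    <value>500</value>
  </record>
</records>"""


def parse_records(xml_text: str) -> List[Dict[str, str]]:
    """Turn each <record> element into a dict of child tag -> text."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "").strip() for child in record}
        for record in root.findall("record")
    ]


def total_by_reporter(records: List[Dict[str, str]]) -> Dict[str, int]:
    """Sum the hypothetical 'value' field per reporter code."""
    totals: Dict[str, int] = {}
    for rec in records:
        code = rec["reporter"]
        totals[code] = totals.get(code, 0) + int(rec["value"])
    return totals


if __name__ == "__main__":
    print(total_by_reporter(parse_records(SAMPLE_XML)))
```

An element-based layout like this one is simple to read with a generic XML parser. A real client would follow the schema published by the service rather than the names assumed here.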