GEMEDA
Grid-Enabled Microeconometric Data Analysis
• The Substantive Issue: Investigation of the
Welfare of Ethnic Minorities in the UK
– wrapped up as Grid Enabled Micro-econometric
Data Analysis (GEMEDA)
– an e-Social Science pilot demonstrator project
• Disciplines:
– economics: main & co-app
– quantitative social science
– research computing: co-app, grid engineer, software developer
• A local Research Methods Programme workshop seminar on a project (SAMD), for which a colleague was the problem owner, suggested that a grid solution might be useful
• Awareness in the discipline area ranges from minimal (some interest in HPC/HTC in econometrics) to non-existent (despite the use of agent-based models and experimentation in economics)
• The method belongs to the class of statistical data fusion methods for data set linkage.
– Single quantitative data sets are messy to deal with.
– This gets worse with two or more, e.g. surveys plus census
– Grid-enabling the data promised improved workflow.
– using combined data may violate underlying statistical assumptions
– simulation techniques are therefore preferable
– these are embarrassingly parallel and well suited to implementation on HPC resources.
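As an illustration of why simulation techniques parallelise so easily, here is a minimal Python sketch of a bootstrap in which every replicate depends only on its own seed. It is not the GEMEDA analysis code: the toy data in SAMPLE, the function names and the choice of a bootstrap are all assumptions made for the example.

```python
import random
import statistics
from concurrent.futures import ProcessPoolExecutor

# Hypothetical toy data standing in for one variable of a combined data set.
SAMPLE = [12.0, 15.5, 9.8, 20.1, 14.3, 11.7, 18.9, 13.2]


def one_replicate(seed):
    """Draw one bootstrap resample of SAMPLE and return its mean.

    The replicate uses only its own seed, so replicates share no state
    and can run in any order, on any worker.
    """
    rng = random.Random(seed)
    resample = [rng.choice(SAMPLE) for _ in SAMPLE]
    return statistics.fmean(resample)


def run_serial(seeds):
    """Reference version: one replicate after another."""
    return [one_replicate(s) for s in seeds]


def run_parallel(seeds, workers=4):
    """Same replicates, distributed over separate processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(one_replicate, seeds))


def bootstrap_se(means):
    """Bootstrap standard error: standard deviation of the replicate means."""
    return statistics.stdev(means)
```

Because the replicates are independent, run_parallel(seeds) should return the same list as run_serial(seeds), only sooner when there are many replicates. On platforms that start worker processes by spawning, the call to run_parallel needs to sit under an `if __name__ == "__main__":` guard. On a batch system the same idea applies with one job per block of seeds.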
• GEMEDA needed:
– Computational resources to implement a parallel version of the analysis code: NGS compute node
– Hosting of multiple quantitative data sets: NGS data node
– Workflow: common and coherent donor and recipient data sets, communication between data and compute nodes, and results presentation
– Visualisation: GIS style linked plot display of results
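The donor and recipient terminology comes from statistical matching: a donor data set holds a variable that the recipient data set lacks, and the two are linked through variables they have in common. The slides do not say which matching method GEMEDA used, so the sketch below shows only one simple possibility, nearest-neighbour matching on unscaled common variables. All variable names and values are hypothetical.

```python
def nearest_donor(recipient, donors, common_vars):
    """Return the donor record closest to the recipient.

    Distance is squared Euclidean distance over the common variables.
    No scaling is applied here; real variables on different scales
    would normally be standardised first.
    """
    def distance(donor):
        return sum((recipient[v] - donor[v]) ** 2 for v in common_vars)

    return min(donors, key=distance)


def fuse(recipients, donors, common_vars, target_var):
    """Copy target_var from the nearest donor onto each recipient record."""
    fused = []
    for rec in recipients:
        donor = nearest_donor(rec, donors, common_vars)
        fused.append({**rec, target_var: donor[target_var]})
    return fused


# Hypothetical donor records (survey-like: income is observed).
donors = [
    {"age": 25, "hh_size": 1, "income": 18000},
    {"age": 40, "hh_size": 4, "income": 32000},
    {"age": 63, "hh_size": 2, "income": 21000},
]

# Hypothetical recipient records (census-like: no income, but an area code).
recipients = [
    {"area": "A", "age": 27, "hh_size": 1},
    {"area": "B", "age": 45, "hh_size": 3},
    {"area": "C", "age": 60, "hh_size": 2},
]

fused = fuse(recipients, donors, ["age", "hh_size"], "income")
```

The fused records carry an income value that was never observed for them, which is one way combined data can violate the assumptions of standard estimators. This is also why the two data sets need to be "common and coherent": the common variables must be defined and coded identically in both.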
• Implemented:
– ease of use
– known complex tasks easy to specify
– interactive capability for results display
• Not implemented:
– ability to specify alternative analyses
– ability to integrate new data sets
– operability outside of the NGS
– computationally intensive interactive capability
• From a user's perspective, the present service is:
– very easy to use (+)
– limited (-)
– static (-)
• Barriers to adoption in my discipline area:
– compute resource brokerage (NGS is batch)
– an honest definition of grid-enablement of data
– improvement of workflow for research, not just a completed problem
– resources: staff, etc.
• The project has now concluded
• The analyses and investigation are ongoing, but
– data disclosure controls and confidentiality restrictions presently preclude extending the present service.
– extensions to elements of the service are restricted by the expertise of the researcher