The User Perspective: GEMEDA (Grid-Enabled Microeconometric Data Analysis)

The Research Challenge

• The Substantive Issue: Investigation of the Welfare of Ethnic Minorities in the UK

– wrapped up as Grid-Enabled Microeconometric Data Analysis (GEMEDA)

– an e-Social Science pilot demonstrator project

• Disciplines:

– economics: main applicant and co-applicant

– quantitative social science

– research computing: co-applicant, grid engineer, software developer

Expectations of e-Research

• A local Research Methods Programme workshop seminar on a project (SAMD), for which a colleague was the problem owner, suggested that a grid solution might be useful

• Awareness in the discipline area ranges from minimal (some interest in HPC/HTC in econometrics) to non-existent (despite the use of agent-based models and experimentation in economics)

Expectations of e-Research

• The approach belongs to the class of statistical data fusion methods for data set linkage.

– Single quantitative data sets are messy to deal with.

– This gets worse with two or more: surveys+census

– Grid-enabling the data promised improved workflow.

– Using combined data may violate underlying statistical assumptions.

– Simulation techniques are therefore preferable.

– These are embarrassingly parallel and well suited to implementation on an HPC system.
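
To illustrate why simulation work of this kind parallelises so easily, here is a minimal Python sketch. It is not the GEMEDA analysis code: the function names, the toy statistic (the mean of a synthetic normal sample) and the seeding scheme are all assumptions made for illustration. The point it shows is that each replicate is self-contained, so replicates can be run one after another or spread across processes without changing the results.

```python
import random
import statistics
from concurrent.futures import ProcessPoolExecutor


def one_replication(seed):
    """Run a single, self-contained simulation replicate.

    Hypothetical stand-in for the real analysis: draw a synthetic sample
    and return its mean. Each replicate owns its seeded generator, so it
    shares no state with any other replicate.
    """
    rng = random.Random(seed)
    sample = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    return statistics.fmean(sample)


def run_replications(n_reps, base_seed=0, workers=1):
    """Run n_reps independent replicates, serially or across processes."""
    seeds = [base_seed + i for i in range(n_reps)]
    if workers == 1:
        # Serial path: same seeds, same results, just slower.
        return [one_replication(s) for s in seeds]
    # Replicates never communicate, so they can simply be farmed out.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(one_replication, seeds))
```

A call such as `run_replications(200, base_seed=42, workers=4)` would distribute the 200 replicates over four worker processes. On platforms that start workers by spawning a fresh interpreter, that call should sit under an `if __name__ == "__main__":` guard. On a batch system the same seed list could instead be split across separate jobs.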

Functional & Non-Functional Requirements

• GEMEDA needed:

– Computational resources to implement a parallel version of the analysis code: NGS compute node

– Hosting of multiple quantitative data sets: NGS data node

– Workflow: common and coherent donor and recipient data sets, communication between data and compute nodes, and results presentation

– Visualisation: GIS style linked plot display of results
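
The donor and recipient terminology above comes from statistical matching. As a rough sketch of what "common and coherent" data sets make possible, the Python below imputes a variable from donor records onto recipient records by nearest-neighbour matching on the variables both share. This is one generic data fusion technique chosen for illustration, not necessarily the method GEMEDA used, and the function name, variable names and records are hypothetical.

```python
def fuse(recipients, donors, common_vars, target_var):
    """Nearest-neighbour statistical matching (illustrative only).

    For each recipient record, find the donor record closest on the
    shared variables and copy across the donor's value of target_var.
    """
    fused = []
    for rec in recipients:
        # Squared Euclidean distance over the variables both sets share.
        nearest = min(
            donors,
            key=lambda d: sum((rec[v] - d[v]) ** 2 for v in common_vars),
        )
        row = dict(rec)  # copy so the input records stay untouched
        row[target_var] = nearest[target_var]
        fused.append(row)
    return fused


# Hypothetical survey (donor) and census (recipient) records.
survey = [
    {"age": 30, "hh_size": 2, "income": 25000},
    {"age": 60, "hh_size": 1, "income": 18000},
]
census = [
    {"age": 33, "hh_size": 3},
    {"age": 58, "hh_size": 1},
]

linked = fuse(census, survey, common_vars=["age", "hh_size"], target_var="income")
```

The match is only meaningful if the shared variables are coded identically in both sources, which is why coherent donor and recipient data sets appear as a workflow requirement. In practice the shared variables would also need scaling so that no single one dominates the distance.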

Usability Requirements

• Implemented:

– ease of use

– known complex tasks easy to specify

– interactive capability for results display

• Not implemented:

– ability to specify alternative analyses

– ability to integrate new data sets

– operability outside of the NGS

– computationally intensive interactive capability

Lessons Learnt

• From a user's perspective, the present service is:

– very easy to use (+)

– limited (-)

– static (-)

• Barriers to adoption in my discipline area:

– compute resource brokerage (NGS is batch)

– an honest definition of grid-enablement of data

– improvement of workflow for research, not just a completed problem

– resources: staff, etc.

Future Plans

• The project has now concluded

• The analyses and investigation are ongoing, but

– data disclosure controls and confidentiality restrictions presently preclude extending the present service.

– extensions to elements of the service are restricted by the expertise of the researcher
