MOSES: Modelling and Simulation for e-Social Science

Mark Birkin, Haibo Chen, Martin Clarke, Pete Dew, Justin Keen, Phil Rees, Jie Xu
University of Leeds

Abstract

This poster describes an e-Social Science research programme at the University of Leeds with a specific focus on Modelling and Simulation (MOSES – Modelling and Simulation for e-Social Science). The cornerstone of the programme is the creation of a dynamic simulation model of the UK population, represented as a series of richly disaggregated individuals and households. We aim to use the power of e-Science Technologies to deliver a complete representation of the population which draws on attributes from a diverse portfolio of databases. The simulation model will be applied to address research questions in three social science domains, relating to healthcare policy and practice, transport & environmental sustainability, and the business impacts of socio-demographic change.

1. Background

Social modelling and simulation has obvious appeal to game players of all ages: examples like ‘The Sims’™ as well as ‘SimCity’™ spring immediately to mind. On the other hand, simulation modelling has had a rather more chequered history as both an academic and applied method within the social sciences, ever since Lee’s (1973) damning early critique. And yet the value of scenario-based approaches to policy development, evaluation, planning and research remains widely recognised (e.g. Masser et al, 1992). In this research, we argue that the renaissance in social simulation, and specifically in urban and regional modelling is long overdue.

The idea of developing real urban simulations which are calibrated using widespread social and behavioural data has appeal from a number of perspectives. In the first place, the exercise is academically and intellectually challenging.
There are many social scientists who would deny vehemently that it is possible to represent cities in such an analytical fashion and to derive meaningful outputs from this process. To demonstrate a capability to reproduce social behaviours and patterns within cities is therefore an objective in its own right. Secondly, the analogy between social simulations and a wind tunnel has value. We can legitimately ask whether such models might not be used in real planning environments, to test and predict the outcomes of different policy interventions. Thirdly, there is a different kind of analogy to flight simulation, in which pilots are trained to fly aircraft within exceptionally realistic VR environments which can provide learning opportunities with no risk to expensive equipment or human life. Could the same opportunities be used in the urban planning context, to learn about the impact of alternative strategies without the need to experience negative outcomes from poor decisions?

2. Aims and Objectives

Within this context, the MOSES project has a number of aims. The overarching goal is to create an activity in which the capabilities of Grid Computing are mobilised to develop tools for social modelling and simulation whose power and flexibility surpass those of existing and previous research outputs. Furthermore, we seek to demonstrate the applicability of grid-enabled modelling and simulation tools within a variety of substantive research and policy environments; to provide a generic framework through which grid-enabled modelling and simulation might be exploited within any problem domain; and to encourage the creation of a community of social scientists and policy users with a shared interest in modelling and simulation for e-social science problems.

In order to promote these aims, our main building block will be a richly disaggregated synthetic model of the UK population.
It is our objective to develop this model with both a baseline and a short-to-medium term forecasting component. The model will be deployed in selected application domains, comprising health, business and transport, to demonstrate policy impacts and the value added through simulation. From these examples, we hope to generalise the application of these techniques to more varied domains.

3. Relevance of e-Science

3.1 Rationale

The concept of e-Science and Grid Computing is crucially important to the reinvigoration of urban simulation for a number of reasons. Firstly, and most obviously, the programme demands sharing of data, for example between the core model and the various application domains. Furthermore, this sharing of data may only be possible with strict attention to problems of security and confidentiality: for example, if patient records are being accessed for the purposes of service delivery planning. Many of the simulations may be complex and computationally onerous, especially if forecasts are to be derived as some kind of best guess or ‘average’ from a large universe of possibilities. The academic possibilities will only be realised fully through the pursuit of diverse multi-disciplinary collaborations, while under the proposed model planning departments themselves might increasingly adopt the characteristics of virtual organisations. Finally, it is evident that part of the appeal of simulation games lies in their excellent interfaces and visual representation of outcomes. It is possible that academic implementations of e-social science might equally well benefit from the application of the latest visualisation technologies.

3.2 Proof of Concept

The application of Grid technology to spatial decision support systems has been demonstrated within the context of a healthcare planning scenario through the Hydra project (Birkin et al, 2005).
Hydra assumes a scenario in which health care services targeted at a particular demographic group are made available through a dispersed network of providers. The technology is designed to support a wide range of contemporary problems such as growth of care in the community services for the elderly, and increased local provision of services like cancer screening. The Hydra demonstrator incorporates a service-based grid architecture which provides secure access to a variety of capabilities, including a (virtual) data service, modelling and optimisation, mapping and collaborative services, delivered through an easy-to-use portal. The Hydra portal is illustrated in Figure 1.

Figure 1. The Hydra Portal

4. Methodology

The demographic simulation model will be developed in a four-stage process: representation, behavioural modelling, forecasting and application testing.

4.1 Stage 1. Representation.

The objective is to generate a complete synthetic representation of people and households in the UK. The building blocks within this process will be the Sample of Anonymised Records (SARs) from the 2001 Census of Population and Households. Repeated sampling from the SARs will be used as a means to recreate small area populations in accordance with known census distributions (compare Williamson et al, 1998). As their description implies, the SARs are fully anonymised and there is no possibility that real individuals or households may be identified through this process. Thus the baseline population for the model will be a synthetic but completely realistic representation. This recreation process is commonly referred to as ‘microsimulation’.

4.2 Stage 2. Behavioural modelling.

The second stage of the project will be concerned with the addition of likely activity patterns for our synthetic population. This will include travel to work patterns, migration, retailing, leisure and education.
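
As a rough illustration of these first two stages, the sketch below draws records with replacement from a small pool of anonymised sample records until one published small-area distribution is matched, and then attaches a travel-to-work mode to each synthetic person. It is a minimal sketch rather than the project's method: it constrains on a single table, whereas a full reconstruction would need to match several census tables at once. All records, counts, mode shares and function names (`synthesise_area`, `assign_travel_mode`) are hypothetical.

```python
import random
from collections import Counter

# Hypothetical anonymised sample records (stand-ins for SAR microdata).
SAMPLE_RECORDS = [
    {"age_band": "16-29", "tenure": "rent"},
    {"age_band": "16-29", "tenure": "own"},
    {"age_band": "30-59", "tenure": "own"},
    {"age_band": "30-59", "tenure": "rent"},
    {"age_band": "60+", "tenure": "own"},
]

# Hypothetical published counts by age band for one small area.
AREA_AGE_COUNTS = {"16-29": 30, "30-59": 50, "60+": 20}

# Hypothetical travel-to-work mode shares by age band (a Stage 2 input).
MODE_SHARES = {
    "16-29": {"bus": 0.5, "car": 0.4, "walk": 0.1},
    "30-59": {"bus": 0.2, "car": 0.7, "walk": 0.1},
    "60+": {"bus": 0.4, "car": 0.5, "walk": 0.1},
}


def synthesise_area(records, age_counts, rng):
    """Sample records with replacement so each age band matches its count."""
    population = []
    for band, count in age_counts.items():
        pool = [r for r in records if r["age_band"] == band]
        for _ in range(count):
            # Copy the record so each synthetic person can be edited separately.
            population.append(dict(rng.choice(pool)))
    return population


def assign_travel_mode(population, mode_shares, rng):
    """Attach a travel mode drawn from the shares for the person's age band."""
    for person in population:
        shares = mode_shares[person["age_band"]]
        modes = list(shares)
        weights = [shares[m] for m in modes]
        person["travel_mode"] = rng.choices(modes, weights=weights)[0]
    return population


rng = random.Random(42)  # fixed seed so that a run is repeatable
area = synthesise_area(SAMPLE_RECORDS, AREA_AGE_COUNTS, rng)
area = assign_travel_mode(area, MODE_SHARES, rng)
age_totals = Counter(p["age_band"] for p in area)
print(dict(age_totals))
```

The age-band totals match the target counts by construction, while the tenure and travel-mode mixes vary from draw to draw.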
A variety of secondary databases from government and commercial providers will be used to inform this process.

4.3 Stage 3. Forecasting.

The population will be projected forward to the year 2031 using a combination of static and dynamic ageing. ‘Static ageing’ is a process in which the core database is resampled in order to match a change in the underlying population distribution. For example, suppose that government projections show an expected growth in the young and affluent communities of metropolitan Leeds. The population would be resampled with increased selection probabilities for the young and affluent target group. ‘Dynamic ageing’ is a method in which individual processes of ageing, household formation, labour market migration, and so forth, are modelled explicitly for individual members of the population. Thus an individual aged 25 in 2001 will be aged 35 in 2011, to use a straightforward example. The dynamic ageing method is more resource intensive than static ageing, but potentially more effective.

4.4 Stage 4. Application testing.

At this stage we would look to model processes relating to the specific application domains. This will require data relating to both infrastructure and behaviour: for example, the provision of hospital beds and patient referral patterns in the healthcare sector; or road networks and traffic counts for transport analysis. The extent to which applications can be generalised across domains remains an open question for further research and investigation within the project.

4.5 Process summary

The components of the simulation model are summarised in diagrammatic form in Figure 2. Next, we consider some of the issues which might be addressed within our chosen application domains.

Figure 2. Four stage modelling process
[Diagram not reproduced. Labelled input data: 2001 Census SAR Microdata; 2001 Census Area Statistics Tables; 2001 Census Commuting Data; 2001 Census Workplace Data; Retail & Other Activity Data; Domain behaviour data; Domain infrastructure data. Demographic components: Births, Deaths, Immigration, Emigration. Models: Representation Model, Behavioural Model, Forecasting Model, Application Model. Microdata sets: 2001 Census UK Microdata (1), 2001 Census UK Microdata (2), UK Microdata 2006 (3), UK Microdata (4), carrying Residential, Activity and Domain Attributes for households (H) and persons (P).]

5. Domain Applications

5.1 Business

In this application area we propose building a model that sits on top of the individual and household microsimulation model and simulates the effects of a number of critical personal financial service events and scenarios, to examine their potential impact at both a national and, importantly, a local level. The events we propose exploring relate to the increased level of personal indebtedness in the UK. Latest Bank of England estimates suggest that personal indebtedness has reached £1 trillion, equivalent to annual GDP. Several key factors come into play looking towards the future:

• The pensions timebomb: as individuals recognise that their pension is unlikely to fund their current lifestyle, they will look to liquidate assets (mainly property) to top up the shortfall in pension payments
• Related to this, the increased use of Equity Release Products to generate annuity incomes
• The reduction of inter-generational transmission of wealth
• Potential deflation in house prices
• Potential rise in interest rates
• Increase in household formation (more smaller households)

We propose building a simulation model that would explore the interdependencies of these potential events over the next decade. We believe that a fall in house prices or an increase in interest rates will have substantively different impacts in different regions and localities of the UK, which the simulation model should be able to detect and predict.

5.2 Transport

Many regional development agencies have ambitious plans for expansion within their local economies. For example, the recently published business plan for the Northern Way anticipates substantial increases in the throughput of business and leisure trips through northern airports, together with a growth in the region’s share of ship arrivals and container freight. At the same time, the plan aspires to reduce congestion in the interurban strategic road network to below the national average by 2010. This challenging objective will only be achievable through some combination of a reduction in intra-regional business and leisure trips (for example, in relation to increased home-working), a redistribution of trips towards uncongested routes, changing modes of transport (for example, increased rail traffic), or investments in the transport infrastructure, e.g. new roads or improved junctions. The articulation of scenarios relating to changing demographics and business activity, together with economic forecasts in line with the ambitions of the Northern Way, and specific ‘what if’ changes to the local infrastructure, would be an example of a suitable challenge for the MOSES simulation technology.

5.3 Health care

One of the fundamental problems in health care modelling and analysis is that services are provided and monitored by organisations which are vertically-oriented, but that use profiles for individuals are not constrained by the same boundaries.
For example, care for the elderly is provided through a rich combination of primary and secondary health care together with social services, local voluntary organisations, and informal support within a family or neighbourhood. However, it is extremely difficult to get a view of service use at the level of individual patients, even across the two major services of health and social care, because of the disassociation of the provider organisations. This is important, not least because service provision and utilisation are inextricably linked: for example, if hospital beds are limited within a particular area, then more intensive use of social services is one likely outcome. Therefore one objective within this part of the project would be to explore the capabilities of Grid technology to integrate data from diverse sources, including health and social care, to provide a balanced picture of service use.

Another interesting question is the way in which ‘social networks’ might support more formal regimes of health and social care. For example, can one demonstrate that communities with strong social networks are persistently less dependent on formal care? The representation of individual members of the population within this project provides an ideal platform for investigations of this type, in which links within a social network could be simulated in a similar way to behaviours or other activity patterns.

6. Conclusion – Towards generic social science applications

We are now in a position to add an applications layer to our description of the modelling process. In order to build problem-focused applications on top of the microsimulation model, two types of inputs are required: data about individual characteristics and behaviours (morbidity, propensity to own a personal pension, preferred mode of transport) and information relating to infrastructure and service provision (hospital treatment rates, house price data, trip cost by mode).
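
One way to picture this applications layer is as a small container that pairs the two input types with a synthetic population. The sketch below is illustrative only and is not drawn from the MOSES design: the class names (`Person`, `DomainApplication`), the method names and every number are assumptions. It uses a health-flavoured example in which an individual propensity is summed by area and compared with a level of service provision.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Person:
    """Hypothetical synthetic individual from the microsimulation model."""
    area: str
    age: int


@dataclass
class DomainApplication:
    """Hypothetical pairing of the two input types for one problem domain."""
    name: str
    # Individual characteristics and behaviours, e.g. propensity to need care.
    propensity: Callable[[Person], float]
    # Infrastructure and service provision by area, e.g. treatment capacity.
    provision: Dict[str, float]

    def expected_demand(self, population: List[Person]) -> Dict[str, float]:
        """Sum individual propensities within each area."""
        demand: Dict[str, float] = {}
        for person in population:
            demand[person.area] = demand.get(person.area, 0.0) + self.propensity(person)
        return demand

    def pressure(self, population: List[Person]) -> Dict[str, float]:
        """Ratio of expected demand to provision for each area with provision."""
        demand = self.expected_demand(population)
        return {
            area: demand.get(area, 0.0) / supply
            for area, supply in self.provision.items()
        }


# Made-up population and inputs, for illustration only.
population = [Person("A", 80), Person("A", 30), Person("B", 70), Person("B", 75)]
health = DomainApplication(
    name="health",
    propensity=lambda p: 0.5 if p.age >= 65 else 0.25,
    provision={"A": 1.5, "B": 4.0},
)
print(health.pressure(population))
```

Under this arrangement, moving to a different problem domain would mean supplying a different propensity function and provision table, while the population and the surrounding code stay the same.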
There seems no reason why a general model of this type might not be applied to a wide range of problems. For example, a user with an interest in crime patterns might access data on propensities to commit crimes (or the likelihood of falling victim to crime) together with intelligence relating to the crimes reported to various local police forces. This could lead to a model which allows the effectiveness of crime prevention to be benchmarked.

References

Birkin, M., Dew, P., McFarland, O., Hodrien, J. (2005) Hydra: A prototype grid-enabled decision-support system, Proceedings of the First International Conference on e-Social Science, National Centre for e-Social Science, Manchester.

Lee, D.B. (1973) Requiem for Large Scale Urban Modelling, Journal of the American Institute of Planners, 39, 163-178.

Masser, I., Suiden, O., Wegener, M. (1992) The geography of Europe’s futures, Belhaven, London.

Williamson, P., Birkin, M., Rees, P. (1998) The estimation of population microdata by using data from small area statistics and samples of anonymised records, Environment and Planning A, 30, 785-816.