Web Science PhD call 2014

Scientific Web understanding and measurement

To date, the MOD has funded many PhDs across a number of themes. The focus of the current call for PhD level research, commencing in January 2015, is Scientific Web understanding and measurement. Our intention is to fund three interacting PhDs.

We encourage applications from UK institutions with leading academics, research groups or research centres in the subject areas identified below. The programme seeks to fully fund PhD projects (a maximum of two per organisation) where applicants can demonstrate how they are leading the thinking in the areas of interest. Evidence of the international standing of the research of the academic groups identified in the proposal should be provided, including evidence of significant research income and of their contribution to the UK and international research landscape. The benefit that MOD would obtain through funding research at the applicant's institution should also be described.

We also intend to build upon the investment in world class expertise by the research councils within the specified areas; significant alignment with, or contribution from, other sources of studentship support from the host university (e.g. from EPSRC, institutional funds, other funding sources) is highly desirable.

Applicants need to refer to the assessment criteria (defined later in this document) so that they fully understand how a proposal’s quality, relevance and value to MOD will be judged.

Key Dates:
- Closing date for applications: 24 October 2014;
- Funding decision by 14 November 2014;
- The projects will need to commence by the end of January 2015.

Scientific Web understanding and measurement

The Web has become a pervasive utility in modern life in which emergent technology and social interactions have shaped its use and value. The Web is an area in which new technology has outpaced scientific models of understanding and appropriate principles of measurement.
There is a need to strengthen our mathematical and statistical understanding of the Web to ensure that our derived estimates are precise and have reduced bias. Dstl proposes to fund PhDs in Web Science to help build the necessary skills and knowledge bases to realise this vision. We are, therefore, requesting proposals for doctoral level research programmes focusing on the following areas:

1. A ‘geology map’ of the Web

An assumption underlying the selection of appropriate scientific method is that the researcher understands the distribution of their data. In statistics, this motivates an ‘exploratory data analysis’ (EDA) as a first step in modelling and forecasting, by understanding the spread (moments) of the data and their internal associations. In considering the Web, this assumption is not satisfied. What is needed is a complete (spanning) and quantitative estimate of the ‘information yield potential’ from compiled categories of Web content.

DSTL/PUB84073

The Web may be considered to be an immense (yet finite) universe of information in many forms. Within this complexity, it is expected that an appropriately devised (and evidenced) conceptual categorisation of Web content will identify sub-populations of information content that offer homogeneous properties. Such a categorisation has been previously developed in government statistical offices to structure diverse economic activity for estimation purposes (such as producer price index (PPI) and gross domestic product (GDP) estimation). A standard taxonomy (which spans all economic activity) is the North American Industry Classification System (NAICS), recognising that economic activity is only one component of Defence interest. Other taxonomies (e.g. politics, security, societal, etc.) need to be developed to the same depth and completeness.

What does this offer as Defence benefit? An analogy may be drawn from mineral exploration.
If we are interested in mining for diamonds, we look at geological surveys (of surface and sub-surface rock types) to find the rock types that have a propensity to be associated with the discovery of diamonds, and focus our drilling activities there. By mapping out Web taxonomy categories (rock types) and quantifying their information yield potential (strength of association of minerals with bearing rock types), we provide the necessary understanding of which sub-populations of Web content jointly offer the richest yield potential for a given question. Different questions will map onto different Web sub-populations, and the dynamic nature of Web content will require a cost (and processing) efficient method of maintaining those ‘yield’ estimates as the Web evolves. The output of this research provides the necessary and sufficient foundation for the next PhD studentship: a sampling theory for the Web.

Subjects of interest include:
- Novel mapping of the Web landscape and its conceptual populations of content;
- Methods and measures to estimate Web information content homogeneity;
- Methods and measures to estimate ‘information yield potential’ against questions;
- Evidenced methods to efficiently evolve the mapping of the Web landscape model.

2. A sampling theory for the Web

Statistical sampling methodology has developed to allow the properties of large populations to be estimated (with minimised error and bias) using a smaller, representative sample drawn from that population. Reasons for using a small subset may include collection time, costs or processing constraints, and sampling is particularly useful where the population is known to be dynamic, heterogeneous and incompletely known. Sampling methodology has been advanced particularly in government statistics related to the estimation of economic or social activity to inform the shaping of policy in these areas.
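As an illustration of the kind of estimate a sampling theory would support, the following is a minimal sketch of classical stratified sampling with Neyman allocation, a standard survey technique rather than a method proposed in this call. The three content categories, their sizes and their relevance rates are hypothetical values invented for the example, and each Web page is reduced to a 0/1 indicator of relevance to a question.

```python
import math
import random


def neyman_allocation(sizes, stdevs, total_sample):
    """Split total_sample across strata in proportion to N_h * S_h.

    Assumes at least one stratum has a positive standard deviation.
    """
    weights = [n * s for n, s in zip(sizes, stdevs)]
    total = sum(weights)
    # At least one unit per stratum, never more than the stratum holds.
    return [min(n, max(1, round(total_sample * w / total)))
            for n, w in zip(sizes, weights)]


def stratified_mean(strata, allocation, rng):
    """Estimate the population mean and its standard error."""
    population = sum(len(units) for units in strata)
    estimate, variance = 0.0, 0.0
    for units, n_h in zip(strata, allocation):
        sample = rng.sample(units, n_h)
        weight = len(units) / population
        mean_h = sum(sample) / n_h
        estimate += weight * mean_h
        if n_h > 1:
            var_h = sum((x - mean_h) ** 2 for x in sample) / (n_h - 1)
            fpc = 1 - n_h / len(units)  # finite population correction
            variance += weight ** 2 * fpc * var_h / n_h
    return estimate, math.sqrt(variance)


# Hypothetical categories: (number of pages, share relevant to the question).
categories = {"news": (2000, 0.30), "forums": (5000, 0.10), "catalogues": (3000, 0.01)}

rng = random.Random(0)
# Synthetic population: 1 marks a relevant page, 0 an irrelevant one.
strata = [[1 if rng.random() < rate else 0 for _ in range(size)]
          for size, rate in categories.values()]
sizes = [len(units) for units in strata]
# Prior guess of within-stratum spread for a 0/1 variable: sqrt(p * (1 - p)).
stdevs = [math.sqrt(rate * (1 - rate)) for _, rate in categories.values()]

allocation = neyman_allocation(sizes, stdevs, total_sample=500)
estimate, std_error = stratified_mean(strata, allocation, rng)
true_mean = sum(sum(units) for units in strata) / sum(sizes)

print("allocation per category:", dict(zip(categories, allocation)))
print(f"estimate {estimate:.4f} (s.e. {std_error:.4f}), true value {true_mean:.4f}")
```

The allocation concentrates the sample in categories that are both large and variable, which is one way a ‘yield’ estimate per category could steer collection effort. A Web-scale theory would also have to handle strata whose sizes and contents are unknown and changing, which this sketch does not attempt.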
The aim of this research activity is to innovate an evidenced sampling methodology for the detection, estimation, modelling and prediction of world events drawn from Web content. This work will yield a scientific model of how to configure the collection and automatic processing of diverse Web information to answer posed questions (such as the opinion of a nation’s population).

The continued rise and centrality of the Web offers many opportunities to measure the status and dynamics of societies and events across the world. We may consider the information pages on the Web to be a very large population (comprising many dynamic, heterogeneous and incompletely mapped sub-populations) from which estimates may be made. However, the volume, diversity and contradictory nature of information offers new theoretical challenges. Novel statistical methodology is required to intelligently sample the Web for information, to ensure that measurements and derived statistics are founded on science rather than selection bias and intuition.

This research activity is potentially disruptive in seeking to design new statistical methodology that scientifically handles the multimedia sampling of information from the Web, through understanding the propensity for the information sub-populations on the Web to yield information relevant to the questions posed.

Subjects of interest include:
- Novel statistical sampling theory based on evidence of Web content understanding;
- Methods of sampling that handle numerical and categorical information;
- Efficient and appropriate sampling of very large and dynamic graphs.

3. Tracking in information space

Tracking in physical space is a mature and numerical research domain. Residual research in the tracking domain is often directed to tracking targets through extreme manoeuvres and novel configurations of radar sensors (e.g. bi-static).
By carrying the tracking model into the Web, we may ask how we might conceptually track the evolution of national activity (such as economic sector industries) through the landscape of the Web and its associated databases.

In the Web landscape we have (perhaps) the usual geospatial and temporal parameters associated with Web content. However, time parameters can now have many flavours (e.g. the time of the actual event, the time of its reporting, the time of its publication, the time of retrieval, etc.). Additionally, geospatial information may have many flavours of precision (coordinates, place names, regions, nations) and carry ambiguity in spelling and language. In addition to the traditional space and time, we now have the informational dimensions to process. These numeric and categorical dimensions may include topic, semantics and quality, and may require a hybrid arithmetic to operate upon these mixed variable states.

This research is seeking to formulate the mixed numerical-categorical states associated with sector industries, the graph associations between states, and their sequential association (tracking) through time, drawing on the landscape of Web content. While some tracking concepts may be drawn from existing physical tracking techniques (e.g. time-latent multi-hypothesis tracking), this call seeks new directions of research that will assist in understanding the dynamics of industrial sector capability and strategy across both individual and multiple nations.

Subjects of interest include:
- Novel factor formulation of mixed-variable states to represent capability and strategy;
- Methods of evolving those states in time, drawing from Web content;
- Methods and measures to create a graph of states and estimate its dynamics;
- Visualisation of the graph evolution and anomalies therein.
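To make the idea of a ‘hybrid arithmetic’ over mixed variable states concrete, the following is a minimal sketch using Gower's dissimilarity, an established measure for mixed numerical and categorical data. It is offered only as one possible starting point, not as the approach this call requires. The state fields (`output_index`, `topic`, `region`), their values and the track names are hypothetical, and the association step is a simple nearest-neighbour rule rather than a full tracking filter.

```python
def gower_distance(a, b, numeric_ranges):
    """Mean per-feature dissimilarity between two mixed-type states.

    Numeric features contribute |a - b| / range; categorical features
    contribute 0 when equal and 1 otherwise. Both states are assumed to
    share the same keys.
    """
    total = 0.0
    for key in a:
        if key in numeric_ranges:
            span = numeric_ranges[key]
            total += abs(a[key] - b[key]) / span if span else 0.0
        else:
            total += 0.0 if a[key] == b[key] else 1.0
    return total / len(a)


def nearest_track(observation, tracks, numeric_ranges):
    """Return the name of the existing track closest to a new observation."""
    return min(tracks,
               key=lambda name: gower_distance(tracks[name], observation, numeric_ranges))


# Hypothetical last-known states of two sector tracks.
tracks = {
    "sector_a": {"output_index": 100.0, "topic": "shipbuilding", "region": "north"},
    "sector_b": {"output_index": 60.0, "topic": "electronics", "region": "south"},
}
# Assumed observed range of each numeric feature, used for scaling.
numeric_ranges = {"output_index": 50.0}

# A new observation derived (hypothetically) from later Web content.
observation = {"output_index": 110.0, "topic": "shipbuilding", "region": "south"}

distance_a = gower_distance(tracks["sector_a"], observation, numeric_ranges)
distance_b = gower_distance(tracks["sector_b"], observation, numeric_ranges)
assigned = nearest_track(observation, tracks, numeric_ranges)

print(f"distance to sector_a: {distance_a:.3f}")
print(f"distance to sector_b: {distance_b:.3f}")
print("observation associated with:", assigned)
```

Every feature is weighted equally here; in practice the weights, the treatment of the several ‘flavours’ of time and place, and the handling of missing fields would all be research questions in their own right.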
Assessment criteria

PhD proposals will be reviewed under the following assessment criteria, and all applications must provide the necessary information requested in the application form.

Assessment criteria used to judge the proposal

All applications will be judged for technical relevance and quality (under the criteria listed below) prior to being considered further according to the academic/research groups or research centre and linkages criteria.

Assessment Area: Scientific Quality and Innovation
The proposal will be judged on the following:
- The novelty of the proposed work in relation to the context, and the timeliness.
- Whether the proposed work is ambitious, adventurous, and transformative.
- The pathway to impact for the proposed research.
- How complete and realistic the proposed approach is.

Assessment Area: Academic Staff, Resources and Management
The proposal will be judged on the following:
- The CV(s).
- Whether the team’s expertise aligns with the topic of the call.
- The balance of skills of the team.
- The time and commitment proposed.
- Whether requirements for government furnished equipment or information (GFE, GFI) are realistic, and whether any work involving human participation is being reasonably proposed.

Assessment criteria used to judge the academic/research groups or research centre and the value to Dstl

Only technically strong proposals will be considered for funding. The academic/research groups or research centre and linkages criteria will be used to further assess the quality of the application(s). The benefit of funding multiple proposals at a research group/centre and the contributions offered outside the Dstl funding will be judged for single and multiple applications from each group/centre.
Assessment Area: Academic/Research Groups or Research Centre
The proposal will be judged on the following:
- The evidence provided of the international standing of the research of the group or centre, including evidence of significant research income and their contribution to the UK and international research landscape.
- The benefit MOD would obtain through funding research at the particular institution.
- The relevance of the broader research in the centre to MOD.

Assessment Area: Linkages
The proposal will be judged on the following:
- The benefits of funding multiple projects (a maximum of 2 per organisation) at the particular group or centre.
- The benefits associated with any wider linkages.
- The value of linkages to Dstl.

Applicants are encouraged to provide options that include significant alignment/contribution of other studentship support from the host university (e.g. from EPSRC, institutional funds, other funding sources).

Further Information and the process

In addition to the PhD proposal(s) submitted by the research group/centre, the applicant must provide details of how the group/centre can contribute to leading the thinking on the specific theme(s) proposed and how further engagement can be fostered between the research group/centre and the MOD.

The intention is to fully fund three PhDs in 2014. The deadline for applications following the conference is the 24th October 2014. Successful applicants will be informed by the 14th November 2014. The projects will need to commence by the end of January 2015.

Further terms and conditions will be made available on request.