Mind The Gap: eResearch in Australia

Paul Davis, Gabrielle Bright
Victorian eResearch Strategic Initiative, Australia

Abstract

VeRSI - the Victorian eResearch Strategic Initiative - was funded in 2006 by the Victorian State Government. This paper will describe how the VeRSI program is providing a cohesive and coordinated approach to accelerating the uptake of eResearch by Victorian researchers. This is the first initiative of its kind in Australia. VeRSI will deliver research leadership by harnessing enabling technology in a way that fosters a productive, collaborative research environment, and by developing key examples of how eResearch can enhance research outcomes. This paper will describe the five complementary projects that will establish essential infrastructure, develop resources for collaborative working and support exemplar use-cases and the development of applications in the Life Sciences. VeRSI, like similar eResearch activities elsewhere, is acutely aware of the IT skills shortage developing in Australia. This shortage heightens the need to develop skills and expertise in eResearch. This paper will highlight the need to bridge the skills and expertise 'gap' between research disciplines and technology implementations.

1. Challenges to research in the past

In the nineties, research communities in science and engineering began to tackle the so-called “Grand Challenges”: fundamental problems with broad application. The Grand Challenges were so large and complex that they could only be addressed by large teams using supercomputers and would take decades to solve. Studies in computational fluid dynamics would enable the optimal design of vehicles, provide more accurate weather forecasts and improve efficiency in the oil industry. A computational approach to modelling the electronic structure of matter would deliver new chemical catalysts, drugs to boost the immune system and novel superconductors.
The physics underpinning fusion technology would lead to the energy sources of the future, and bioinformatics and computational chemistry would provide the fundamental understanding of molecular systems necessary to cure cancer, diabetes, cardiovascular ailments and more.

A watershed in computer science provided the algorithmic tools needed for such endeavours, and technology delivered phenomenal improvements in the power and speed of high performance computer systems. One by one the Grand Challenges were reduced to large but routine computations and even more exigent challenges took their place; but the nature of the problems and, indeed, of their solutions had changed dramatically.

Today’s Grand Challenges are global and involve health, climate change and global warming, the environment, predicting and living with extreme seismic activities and tsunamis, disaster reduction and security, and the spread and containment of infectious disease. These are fundamental issues that affect life on the planet, not just the profitability of the manufacturing industry. They demand profound understanding of exceedingly complex systems and they simply cannot be solved by teams constrained to a department, institution or organisation.

Figure 1: The change in science focus with technology.

Figure 1 illustrates the continuum of change in research with the influence of ICT and the transition to meet new challenges over time.

2. System Science

Science has become a team sport and we have entered the era of “System Science” - the integration of diverse sources of knowledge about the constituent parts of a complex system with the goal of obtaining an understanding of the system's properties as a whole. System Science is characterised by global collaboration, shared resources, common goals and multidisciplinary problem solving.
Ian Foster, one of the recognised “fathers of the grid”, provides the following commentary:

“System-level science integrates not only different disciplines but also, typically, software systems, data, computing resources, and people. System-level science is usually a team pursuit. Data comes from different sources, different groups develop component models, team members provide specialised expertise, and the often substantial computing and data resources required for success are themselves diverse and distributed. Thus, system-level science itself requires the creation of yet another sort of system that may combine large numbers of both physical and human components.”[i]

An example of System Science is a virtual organisation (VO) for the study of ecoinformatics: an online, collaborative and shared resource for managing ecosystem and environmental data and information products. The ecoinformatics VO becomes the focal point for research collaborators and policy participants: a common, virtual platform that provides a consolidated data and knowledge infrastructure to support research design and delivery, and greatly improved collaboration across research teams in a shared environment. It brings together, as a base set, data and methods from ecology, environmental science, land care, water management, climate studies, agriculture, animal husbandry, mathematics and computer science and advanced ICT. Treating ecoinformatics as a System Science will lead to innovative methods for the integration, curation and management, mining, interrogation, visualisation and modelling of multi-scaled (local, regional, national and global) data within the shared platform, thus delivering the best available and most cost-effective research outputs to the natural resource and primary industry sectors.
The growing awareness that researchers must collaborate, that resources have to be shared, that information and communication technology has to be better integrated into the research environment and that these characteristics apply to all research activities, not just science, has resulted in the genesis of eResearch.

3 eResearch in Australia

Dr Mike Sargent, Chairman of the Australian eResearch Coordinating Committee (eRCC), Australia’s equivalent of Sir John Taylor, wrote in his report to the Government:

“On a global scale, the level of investment in ICT enablement of research in the United Kingdom, United States of America, Europe and Canada is beginning to pay dividends. In the United Kingdom, £250M has been invested in the e-Science programme over the past 5 years and the programme has drawn further significant investment from corporations such as IBM, Microsoft and Oracle. In the Asia Pacific region, the governments of Japan, Korea, Taiwan and China are strongly committed to finance their e-Research frameworks. Countries investing in e-Research capabilities are increasingly participating in collaborative research into the grand challenges.”[ii]

Australia has been slow to adopt eResearch; however, this tardiness has been an advantage in a country with merely twenty-one million people and fewer than eighty thousand researchers[iii]. Australia has been able to watch and learn from the development of eResearch activities in the US, Europe, UK, Japan, Korea, China, Singapore and Latin America and develop a philosophy of “adapt – adopt – and contribute” based on successes elsewhere.

In particular, Australia has modelled its eResearch development on that of the UK. A large number of delegations have explored every aspect of UK eScience over the last five years and several key emissaries, for instance Sir John Taylor, Dr A. Hey, and Professors A. Trefethen, C. Goble and D. De Roure, have visited Australia to advise on eResearch policy and procedures.
VeRSI, the Victorian eResearch Strategic Initiative, exists on a much smaller scale than the UK-wide eScience Program. Nevertheless, the principles are the same: awareness as the first step; working with the research community to establish trust; developing solutions for selected research groups that facilitate collaboration and provide tangible examples of the benefits of collaborative research and shared resources; and providing key enabling infrastructure.

Figure 2: Science value chain

The Australian Department of Education, Science and Training (DEST)[iv] proclaims:

“The research sector worldwide is experiencing enormous change driven by advances in information and communications technology (ICT). Research is increasingly characterised by national and international multi-disciplinary collaboration and most OECD countries and APEC members are investing heavily in those capabilities and the associated coordinating mechanisms. The term ‘e-Research’ encapsulates research activities that use a spectrum of advanced ICT capabilities and embraces new research methodologies emerging from increasing access to:
• Broadband communications networks, research instruments and facilities, sensor networks and data repositories;
• Software and infrastructure services that enable secure connectivity and interoperability;
• Application tools that encompass discipline-specific tools and interaction tools.”

Apart from the Australian Government’s insistence, there are several compelling reasons why eResearch is becoming the new paradigm. Robert Kelley (Carnegie Mellon University) claims that, in 1986, 75% of the knowledge one needed to do one’s job was stored in the mind. In 1997 this had fallen to 15% to 20%[v]; knowledge has become an on-line commodity. Howard Gardner (Harvard University) amplified this message when he stated, “Knowledge does not stop at my skin: it includes my computer and its databases and my network of associates”[vi].
eResearch provides the infrastructure and methodologies for accessing and reusing on-line knowledge. Figure 2, above, is a schematic of the science value chain illustrating that eResearch, by virtue of its ethos of collaboration and sharing, provides a seamless communications path for ideas, data and knowledge from the fundamental research carried out (mostly) in universities through to industry.

Two years ago DEST commissioned the Australian eResearch Coordinating Committee to undertake a comprehensive review of eResearch and recommend how Australia, cognisant of experiences elsewhere, could coordinate a National eResearch initiative. One of the recommendations of the Committee was the establishment of an eResearch Centre consisting of a coordinating body and six state-based nodes to facilitate the transfer of eResearch methodologies to the research community.[vii]

4 Challenges to eResearch in Australia

eResearch initiatives, such as VeRSI, are not without their challenges; several key issues in the adoption of eResearch have yet to be addressed. These include the reward system for academics, which is not geared to adequately measure outputs other than papers. Encouragement of open-source software, online publication and sharing of experimental data and derived results all test the ability of the research sector to measure real research outputs and reward researchers accordingly. This is particularly important as the research community is encouraged to move from competitive to collaborative endeavours.

Before researchers will adopt new methods there has to be a clear, unambiguous and demonstrable benefit. In chemistry, the reaction coordinate is an abstract one-dimensional coordinate system that represents progress along a reaction pathway (Figure 3). Reactants are mixed, energy is added (the activation energy) and the reaction proceeds to the formation of the wanted products. If insufficient energy (less than the activation energy) is added the reaction cannot proceed.
By analogy, for a researcher to move to a new paradigm the “product” environment must be better than the starting point AND the “activation energy” (learning, re-training, disruption to research activity, costs of change etc.) has to be very small. VeRSI’s role is to describe and demonstrate ways to adopt eResearch that lead to better research environments and require little or no energy input from the researchers. The hidden catch is in the subjective term “better”; in this context “better” is determined by the individual researcher.

Figure 3: Chemical reaction pathway – an analogy for change

The most significant challenge, however, is the paucity of skilled ICT personnel, particularly those with multidisciplinary expertise, and the difficulties of recruiting staff who can bridge the void between research disciplines and technology developers. This was clearly identified by the eRCC, which recommended the establishment of an extensive training program involving a five-year programme of one-year eResearch Honours Scholarships and a five-year programme of three-year eResearch Postgraduate Scholarships to support students undertaking research training in eResearch capabilities at the PhD level. It was proposed that this be supplemented with the introduction, by institutions, of formal incentives, recognition and reward mechanisms for the skilled eResearch professionals who support and provide expertise to researchers, and to encourage academics to invest time and expertise to develop cross-disciplinary projects and courses to train a new generation of eResearchers.

The press (iTWire) reports that: “Salaries in the Information, Communication and Technology sector were up by 12% across the board for the six months to December 2006…” and, “…demand for application developers, data management professionals, and business analysts was continuing unabated with this trend expected to go on strongly in 2007. “The market has been, and remains, hot with demand outstripping supply,” Andy Cross, Ambition technology managing director said. “Salaries have jumped 12% and the technology recruitment market can’t bridge the demand gap. That means the war for talent will continue in 2007. This will be the key challenge in our sector for next year.””[viii]

With this level of competition, and the uncompetitive salaries and conditions offered by the university sector, the recruitment challenge will continue to be the major gap between the ideals of eResearch and its realisation.

5 eResearch in Victoria

The Victorian State Government, through Multimedia Victoria, decided to meet these challenges and in 2006 announced funding for the Victorian eResearch Strategic Initiative (VeRSI) as part of its Healthy Futures: Life Science Statement[ix]. The government funding will be supplemented by equal amounts of in-kind support from the partners. The clear intention is that VeRSI become the Victorian node of the national eResearch Centre.

The VeRSI Program is an unincorporated joint venture. The initial members of the Consortium are Melbourne University and Monash University along with the Victorian Government. VeRSI is based on a collaborative and inclusive model and will openly involve research groups, skills and services from other universities, research organisations, government departments and service providers such as the Australian Synchrotron, Victorian Partnership for Advanced Computing (VPAC) and Victorian Education and Research Network (VERNet).

The funded activity will proceed from October 2006 to September 2010. During this period the VeRSI Program will undertake five complementary projects to provide support and services to researchers, undertake an extensive outreach and awareness raising exercise, deliver a coordinated program of skills development and assist in the coordination of research development and deployment.
The five projects are broken into three categories: enabling projects, a demonstration project and capability projects.

Figure 4: Relationships between the VeRSI projects

5.1 Enabling Projects

The enabling projects will create awareness and establish essential infrastructure such as an access and security framework and distributed, federated storage facilities. There are three enabling projects:
• Communications & Support: Communications are a critical component in the development of an eResearch framework. Informing researchers, explaining the benefits of eResearch methods, assisting them to assess their requirements and general engagement are all key, on-going activities that form the Awareness, Outreach, Communications and Support project.
• Security & Access: Security and access includes the means to identify and authenticate users, to provide them with access to shared resources and to provide assurances that information stored in shared repositories is safe and secure. The challenge is to balance the security and access needs of the community of interest with those of the host institution(s). VeRSI is developing a standards-based security and access policy that is compatible with existing systems and implementing the supporting ICT platform so that authenticated access is possible from anywhere.
• Storage Systems: Data storage is a high demand commodity and most university systems are overtaxed and designed specifically to meet the immediate needs of the campus. This leaves little scope for experimentation with storage standards and protocols. The UK eScience program successfully addressed this issue by providing limited, production quality resources, particularly data storage and management, as a foundation on which to build trust and deliver services with minimal impact on University security systems and firewalls. VeRSI will follow this example.

5.2 Demonstration Project
The demonstration project is the design and construction of a prototype Virtual Beam Line (VBL) to demonstrate remote working with the High-throughput Protein Crystallography beamline at the Australian Synchrotron. The VBL will allow for collaborative working as part of grid-enabling the Australian Synchrotron and synchrotron user community. The VBL will provide a multimedia nexus through which users can collaborate using voice, video and shared applications; undertake occupational health and safety, radiation and beamline training; use a variety of tools to transfer data to external storage and computing resources; and remotely monitor and mentor their experiments. The VBL design will be modular, easily replicated and meet the needs of the synchrotron community. The design will be promoted and freely available to interested parties at other beamlines.

5.3 Capability Project

The capability project is a series of exemplar use-cases and applications for researchers in the Life Sciences, capitalising on Australia’s existing research strengths. These projects will deliver production quality tools to the specific research groups and serve as tangible examples to the research community of how advanced ICT that is responsive to researchers’ needs can enhance the research environment. There are eight use-cases:
• Genomic dataset mining: The function of genes is dependent on their structure but, due to the size of genomic data sets, mining of the gene structure is an arduous task. Further insights are developing from the merging of publicly available data with proprietary information. Innovative methods for collection, curation, visualisation, integration and mining of these data will deliver a major advantage to the Victorian biotechnology and biomedical research sectors.
VeRSI is supporting this research by designing the general architecture of software modules for data mining, assisting the implementation of novel data mining techniques and building a user interface to the data sets.
• Neurosciences & biomedical imaging: The Neuroimaging Laboratory at the Howard Florey Institute is using human and animal brain imaging techniques to investigate clinical and basic neuroscience research questions. These include how various functions change with variations in hormone levels, and whether neurobiological changes can be observed prior to the onset of physical symptoms in sufferers of diseases such as Huntington's or Alzheimer's disease. VeRSI is supplying crucial integration and networking capabilities, in addition to building a metadata explorer model, a metadata editor for neuroimaging data sets, a user interface to the data and a data authentication model.
• Australian Mouse Brain Map: The Mouse Brain Map Consortium is building a national atlas of brain data, from MRI, histology and immuno-histochemistry, for comparison between diseased mouse brains and control brains, studying changes in brain anatomy and function. VeRSI is building a repository of mouse brain map images and information from institutes within the consortium, designing the general mouse brain map database architecture along with dataflow, specimen tracking and security models and building an interface to the repository for upload, management, searching and download of data.
• Uro-oncology Informatics Grid: Around fifty percent of Australian men experience some type of prostate problem during their lifetime. Prostate cancer is one of the most common forms of cancer in men. The Centre for Urological Research aims to provide better diagnosis and treatments for prostate cancer and benign prostate disease. A biorepository of prostate, kidney and bladder specimens will allow researchers to share resources with existing national and international repositories.
VeRSI is designing and developing an informatics grid, ensuring the informatics grid is interoperable between research centres and national and international tissue banks, and organising lab data, urology data, and pathology data.
• Ambulatory motion studies: New technology allows the easy collection and examination of numerous gait parameters, over many trials, for large groups of people within a laboratory or field setting. The data provides clinically valuable information about when gait changes occur, leading to an improved understanding of falls and serious injury in older adults. VeRSI is improving the management, distribution and access of gait motion study data by designing and building the data architecture for storage of gait study data and building a security model for access to data sets.
• Workflows and laboratory automation for metabolomics: The fundamental challenge of the post-genomic era is to utilise the information generated by genome sequencing projects in conjunction with high-throughput profiling technologies to understand cellular functions on the molecular level in a comprehensive and integrated manner. Metabolomics is an emerging tool increasingly used to identify new protein functions and to model the whole cell metabolism. VeRSI is developing workflows and data management models to automate metabolomics research at the Bio21 Molecular Science and Biotechnology Institute, designing a data storage model for metabolomics data to capture the entire data flow and designing and implementing a web based user interface to enable data management and provide access to shared data.
• Distributed radiotherapy system: Computation for radiotherapy patient treatment planning requires high performance computing (HPC) resources, but it is preferable for patients to be treated as close to home as possible. Distributed treatment planning allows for planning to occur at HPC resources in metropolitan areas after which it can be distributed to regional clinics.
This use-case will provide a connection between Peter MacCallum Cancer Centre and satellite centres, via VERNet (Victorian Education and Research Network), and develop an authenticated portal interface to treatment planning, imaging and clinical datasets.
• Laboratory data management: Australian scientists need to build a critical mass and develop workflow technologies and practices that link geographically dispersed groups. This will enhance collaboration in areas of niche strength such as agrifoods biotechnology, stem cell research, synchrotron-based research and advanced clinical trials. It can be achieved by combining internet-based collaborative environments and grid computing, collectively known as eScience, which in turn will underpin the enhanced data handling and mining methods. This use-case will develop workflow technologies and practices that link geographically dispersed groups and enhance collaboration in areas of niche strength such as agrifoods biotechnology, stem cell research, synchrotron-based research and advanced clinical trials.

5.4 VeRSI Outcomes

VeRSI activities will provide support and services to researchers, undertake an extensive outreach and awareness raising exercise, deliver a coordinated program of skills development and assist in the coordination of research development and deployment. The outputs from the project groups will be knowledge and expertise, designs and technology solutions and advanced open source software. All of these commodities have value that will enhance the quality of research, bring timelier research outcomes, catalyse international collaborations, and provide opportunities to Victorian industry. VeRSI will deliver research leadership by harnessing enabling technology in a way that delivers a productive, collaborative research environment and by developing key examples of how eResearch can enhance research outcomes.
This approach will catalyse the widespread adoption and uptake of eResearch and deliver the promise of faster and more exhaustive research activities leading to improved commercialisation opportunities and an elevation in the status of the State’s academic and research institutions. It will add value to the Australian Synchrotron, improve collaboration in the Life Sciences and reinforce Victoria’s position as a knowledge-based economy.

6 Conclusion

The adoption of eResearch is a cultural change process which, paradoxically, will have succeeded when the “e” is no longer necessary and the “research method” encompasses all the tenets of shared, collaborative working enabled by advanced information and communication technologies. But this will take time. We can learn the sociological, physiological and technological lessons of the UK, Europe and the US, all of whom lead Australia in the adoption of eResearch, but this will have little effect on the speed with which the culture of research changes. The “e” will be part of the promise for a time to come.

[i] I. Foster, C. Kesselman, “Scaling System-Level Science: Scientific Exploration and IT Implications,” Computer Magazine, vol. 39, no. 11, 2006, pp. 31-39.
[ii] eResearch Coordinating Committee, Final Report of the e-Research Coordinating Committee, DEST, http://www.dest.gov.au/sectors/research_sector/publications_resources/profiles/e_research_strat_imp_framework.htm, 2004.
[iii] Research and Experimental Development, All Sector Summary, 8112.0, Australian Bureau of Statistics.
[iv] http://www.dest.gov.au/sectors/research_sector/policies_issues_reviews/key_issues/e_research_consult/default.htm
[v] R.E. Kelley, How to Be a Star at Work: 9 Breakthrough Strategies You Need to Succeed, Three Rivers Press, 1998, p. 77.
[vi] H. Gardner, Intelligence: Multiple Perspectives, Wadsworth Publishing, 1995.
[vii] http://www.dest.gov.au/sectors/research_sector/publications_resources/profiles/e_research_strat_imp_framework.htm
[viii] http://www.itwire.com.au/content/view/7503/50/
[ix] http://www.business.vic.gov.au/BUSVIC/STANDARD//pc=PC_61353.html