"'()*+!,& '$-$.()-")#(/" !"#"$!%& Data-Intensive Research theme welcome Malcolm Atkinson mpa@nesc.ac.uk 22 November 2010 Data-Intensive Research second workshop Monday, 22 November 2010 1 Welcome to the e-Science Institute Monday, 22 November 2010 2 DIR Theme Goals • Improve understanding of data-intensive research data/computational challenges • Initiate computing science research to address key challenges drawing on database knowledge and experience Monday, 22 November 2010 3 Definitions Monday, 22 November 2010 4 WHAT IS DATA? • • • • • • • • • • collections of data from instruments, observatories, surveys and simulations; results from previous research and earlier surveys; data from engineering and built-environment design, planning and production processes; data from diagnostic, laboratory, personal and mobile devices; streams of data from sensors in the built and natural environment, data from monitoring digital communications; data transferred during the transactions for business, administration, healthcare and government; digital material produced by news feeds, publishing, broadcasting and entertainment; documents in collections and held privately; the texts and multi-media ‘images’ in web pages, wikis, blogs, emails and tweets; and digitised representations of diverse collections of objects, e.g. of museums’ curated objects and books in literary collections. Monday, 22 November 2010 5 WHAT IS DATA-INTENSIVE? A problem is data intensive when considerable care is needed over the use and handling of data in order to solve it Monday, 22 November 2010 6 We have a data bonanza We need a method bonanza Monday, 22 November 2010 7 QUESTION 1 How can we enable researchers who understand their field, the data and the methods to specialise, tune and control their datascope? Monday, 22 November 2010 8 QUESTION 2 How can we enable researchers who understand their field or an analytic technique to capture that as an algorithm just once? 

QUESTION 3
How can we optimise a datascope taking account of the data, the computational environment and the user-defined algorithms?

QUESTION 4
How can we construct easy to use datascopes economically, quickly and reliably?

Our Question

How can we help?

Today’s Programme
10:30  Welcome                Malcolm Atkinson
10:40  Opening talk           Alex Szalay
11:40  Short break
11:50  Environmental data     Jeremy Cohen
12:15  Discussion
12:30  Lunch                  Chapterhouse
13:45  Breakout groups        Briefing in Cramond
14:00  Breakout groups        Cramond & Swanston
15:30  Coffee & refreshments  Chapterhouse
16:15  Reporting back         Cramond
17:15  DIR reception          Chapterhouse

Previous work
• Data-Intensive workshop at eSI
  • Report draft bit.ly/cfMRn3
  • http://wikis.nesc.ac.uk/escienvoy/DataIntensive_Research:_how_should_we_improve_our_ability_to_use_data
  • http://wiki.esi.ac.uk/Data-Intensive_Research
  • Twitter hash tag - #datares
• USA data-use report (Atkinson & De Roure)
  • Draft bit.ly/c0G2rn

This DIR theme
• Twitter hash tag - #datares
• http://www.esi.ac.uk/research-themes/15
• http://wiki.esi.ac.uk/Data-IntensiveResearch_Theme
• http://wiki.esi.ac.uk/Meet1Summay