e-Science Data Information and Knowledge Transformation

Thoughts on Education and Training for E-Science
Based on edikt project experience
Dr. Denise Ecklund
Technical Architect

Agenda
Background – the edikt project
What’s an “e-science research project”?
Identify e-science engineering activities
Derive educational requirements
– Hypothesise what is covered elsewhere
– Identify what is missing
My wish-list for ‘gap fillers’

www.edikt.org

The EDIKT Project
E-Science Data, Information and Knowledge Transformation

Edikt and its mission
Research Interests
RAE
Study new CS theories in data management
Match?
Study data mgmt problems in astronomy, physics, biology and geosciences

A Model for an e-Science Research Project

E-Science project staffing
Principal Investigator
Computational Scientist
Computing Engineer
Theoretical Scientist
The team
The goal is to do new application science.

Complementary knowledge
Application science domain
Conversational and shared understanding
Computer SW engineering domain

Tasks of a Computing Engineer in an e-science research project
working with people
working with machines

System development process
Gather requirements
Investigate solutions
Integrate components
Deploy & maintain
Four phase process

Working with people
Phase 1 activities
– Gather requirements
Understand ‘enough’ of the science application
– Answer questions and exchange information
Intelligibly interact with team members having varying levels of computer expertise

Working with machines
Phase 2 activities
Gather requirements → Investigate solutions
– Survey, identify and evaluate existing applicable software technologies
Understand the requirements
Effective search (where and how)
Install and evaluate potential solutions

Working with machines
Phase 3 activities
Gather requirements → Investigate solutions → Integrate components
– Provide an integrated software solution
Identify technology gaps
Develop reliable software to fill the gaps
Test and deploy full system

Working with machines and people
Phase 4 activities
Gather requirements → Investigate solutions → Integrate components → Deploy & maintain
– Maintain and evolve the integrated solution
Basic systems support activities
– Instruct team members on software usage

Job requirements for an e-science computing engineer
Technical capabilities
– Can work as the sole computer expert on the team
– Understand ‘enough’ vocabulary, concepts, data use, structure of algorithms, and domain-specific standards
– Capable of distributed systems analysis and design
– Knowledge and practice of rigorous software design and development practices
Then … the sociological factors
– Willingness to re-use existing technology
– Willingness to maintain and
evolve the system

Structure of the CS curriculum
Basic Principles for everyone
Computational logic
Programming skills
SW engineering methods
Database programming
Scripting languages
Data structures
Algorithms
Specialisation
Systems work: operating systems, networking, compilers, embedded systems
Graphics
HL systems work: middleware, service-oriented architectures, distributed & parallel systems, database systems
Algorithmic complexity
Applicable skills for e-science?
Artificial Intelligence work: language processing, vision systems, human-computer interaction, robotics, knowledge systems

Assume a two part approach
Leverage good Computer Science curriculum
– Identify applicable areas of study
– Advise students to study across multiple specialty areas
Fill the gaps with e-science training

Filling the Gaps
My wish-list

Technical knowledge gap
Requirement
– The project’s primary expert on most computer-related topics
CS Programme
– Teaches “independent work” and “intra-project teamwork”
Teach ‘inter-project teamwork’

Inter-project teamwork
Bio-research Centre
Project A
Project B
Share knowledge of similar tools and practices
– E.g., using same software development tools (CVS, JBuilder, Ant)
– A local person to talk with for a range of problems

Application domain gap
Requirement
– Understands ‘enough’ vocabulary, concepts, data use, structure of algorithms and domain-specific standards
CS Programme
– Lab exercises are small examples in easy-to-grasp domains
E-Science applications are not “easy-to-grasp”
– Advise students to study introductory domain-specific courses
– Develop lab exercises in e-science application domains
– Recommend a single domain focus

Domain focus
Can’t wear all these hats and do it well – focus!

Search, identify and use gap
Requirement
– Identify and re-use existing third party technologies
CS Programme
– Traditional emphasis on building all new software
– Recently added exercises in “design re-use”
Add lab exercises covering
– Re-use of pre-identified tools and components
– Find and select candidate technology for re-use

Software methodology gap
Requirement
– Rigorous full life cycle software design and development practices
CS Programme
– Typically offer one course on software development methodologies
– Students not required to ‘practice this’ after software methodologies course
Student interns in an active e-science project
– Mandatory use of software development methodologies

Methods and proven tools
Spiral approach
Extreme programming
Development Tools

The making of an e-Science engineer
CS
+ Broad fundamentals
+ Rigorous SW development methods
+ Domain knowledge (just one)
+ Internship in a real e-science project
= E-Science Computing Engineer

Thank you
Thoughts and questions?