Workshop Goals
Richard P. Mount
May 24, 2004
DOE Office of Science Data Management Workshop

Workshop Progress (1)
• SLAC DM Workshop, March 2004
  – Presentations of application-science needs, worries and frustrations
  – Presentations of relevant computer science and technologies
  – Gap between many application sciences and CS:
    • “We need a holistic approach to our scientific workflow problems and these CS guys are talking about ontologies!”
  – Gap between what CS is funded/rewarded for producing and what the application sciences need:
    • Proposals for long-term user support and product hardening do not get funded, and the work wouldn’t get you tenure.
  – Much vigorous and productive discussion

Richard P Mount Data Management Workshop Goals 2

Workshop Progress (2)
• Extended Organizing Committee, April 2004
  – Application scientists and computer scientists searching for, and finding, mutually understandable ways to organize the issues
  – Workflow diagrams proposed as a good basis for comparing and contrasting application-science needs (in many cases)
  – Agreement on a straw-man report structure to be presented to this workshop

Chicago Workshop Goals
• Discuss and improve the proposed report structure
• Continue the move from a “laundry list” of needs and topics to well-organized programs of work reflecting application-science priorities
  – Workflow diagrams may help in this organization of ideas
• Aim to make a case at the “Program of Funding” level; the WBS level can be used for illustration
  – In other words, avoid ‘commercials’ for particular CS approaches to solving a problem
• Application-science priority means “we would be prepared to contribute to this work with our BES/BER/FES/HEP/NP $$$”:
  – Propose approaches to funding and program management that exploit this mechanism (e.g. SciDAC?)
  – Priorities will change; the approach should still work
• Address the gap between academic CS and the need for robust software with 20-year support:
  – Funding and program management mechanisms that support what is needed

Workshop Structure
Monday May 24
09:15 am  Straw-man report
09:45 am  View from simulation-driven applications
10:45 am  View from experiment/observation-driven applications
11:15 am  View from information-intensive applications
11:45 am  DOD data-management requirements
1:30 pm   Technology Working Groups
          Groups to address: a) additions, corrections and refinements to the report, and b) gaps, cost, priorities and classification into development/hardening/deployment categories.
          Groups:
          1. Workflow, dataflow, data transformation
          2. Storage, data movement, grid, networks
          3. Metadata management and cataloging
          4. Efficient access and query, data integration
          5. Integrated data analysis, visualization
4:30 pm   Panel Discussion on Workflows
          Make progress towards a common approach to describing workflows such that commonalities and true differences are clear.

Workshop Structure
Tuesday May 25
08:30 am  Group leads will report back to the general session, summarizing their group discussion. This will be followed by open discussion, which should lead to a cross-group comparison of cost and priorities.
11:00 am  Panel on a “management plan”: an effective process to direct funding to benefit the various application domains. This is the plan that should get DOE application offices (other than OSCAR) interested and involved.
1:15 pm   Parallel sessions of application-led groups to prioritize technologies in their domains.
          Groups:
          1. Simulation-driven applications
          2. Experiment/observation-driven applications
          3. Information-intensive applications
3:45 pm   Brief summary reports from application-led groups.
          Followed by a panel session to come up with a prioritization and cost for the development/hardening/deployment of SDM technologies.
7:30 pm   Group leaders and other volunteers will develop a joint plan on how to normalize the cost and priorities.

Workshop Structure
Wednesday May 26
08:30 am  Present the cross-cutting matrix and priorities/cost to the general session
9:00 am   Open discussion led by panelists
11:00 am  Assignment of coordinators for final writing.
12:00 pm  Adjourn