eScience workshop • December 2008

Rosalind Reid
executive director, Harvard Initiative in Innovative Computing

can the next generation of scientists become “computational thinkers”?

Fact 1: Computation will be at the core of all science within the next decade.
Fact 2: Today’s undergraduates are tomorrow’s research scientists.
Fact 3: Computational thinking generally is not integrated into undergraduate science curricula at Harvard.

Is this a problem? We asked the faculty. (At Harvard, always ask the faculty.)

narrative responses to an informal online survey* of Harvard science faculty on “computational thinking”
* conducted June 2008

modest hypothesis: computational thinking (as defined by Jeannette M. Wing*) can be a unifying theme for catalyzing curriculum innovation to improve the preparation of tomorrow’s scientists.
* of Carnegie Mellon University, now in charge of Computer and Information Science and Engineering (CISE) at NSF

Wing’s examples of computational thinking in science*
“machine learning has revolutionized statistics”
“algorithms and data structures, computational abstractions and methods will inform biology”
* Microsoft Faculty Summit, Hangzhou, China, October 31, 2005

notes on these data
quantitative data were collected, but the survey was informal and unscientific
faculty from several departments took the time to offer thoughtful comments
respondents self-labeled their fields of research

special thanks to Lynn Stein for her help, to Microsoft Research for funding, and to Rob Lue for his continuing work to organize conversation among science faculty on these questions
and thanks to the EECS faculty for an open and supportive discussion of co-teaching possibilities

q1: We are interested in how computation has or has not transformed research in your field.
theoretical mechanics/earth science: “People with [deep, fundamental understanding of ...
science and math] are able to do marvelous things with modern computation.”
climate: “Computation is a matter of necessity... as real-scale experiments are not possible.”
earth science: “Numerical solution of large-scale problems... crystal structures, high-pressure phases.”
language/cognition: “More precise and rigorous formulation and testing of theories... large-scale databases can be analyzed for patterns of human behavior.”
cosmology (n=2): [Computation is] “the engine of progress in our field.” “High-end computation has become both necessary and critical for data analysis”
geophysics: “My group.... is doing science that other groups can’t because we have embraced a computational approach... computational geometry and [GUIs] enable us to build and run more realistic models.”
materials/surfaces: “Computation makes it possible to ‘see’ molecular detail.”
astrophysics: “Totally transformed.”
paleobiology: “Forward modeling, simulation of complex systems that cannot be addressed analytically, solutions to NP-complete problems such as DNA sequence alignment....”
nanotechnology: “Computerized data acquisition; data analysis; graphic presentation... of data; simulations of experiments; fundamental understanding of electrons inside small structures....”
evolutionary biology: “analyses of large data sets.”
evolutionary developmental biology: “As more and more genomic sequence data becomes available, computational methods are necessary to deal with the data.”
exoplanets: “The discovery of exoplanets was enabled by sensitive optical detectors and by the ability to undertake massive modeling efforts... and identify best models over a large parameter space.”

q2: What types of computational thinking do you expect to become important to scientific investigation in the coming decade?
theoretical mechanics/earth science: “...intelligent algorithms and a deep understanding of aspects of the physics that cannot be represented accurately due to limitations on computer resources.”
climate: “...how to reduce and abstract a real-world problem into a computationally solvable problem... and how to map the results back to the real-world problem.”
earth science: “...numerical solution of problems, use of tools such as MATLAB and Mathematica...”
language/cognition: “Intelligent searching and parsing of language databases.”
cosmology (n=2): “... ability to exploit large databases... write, debug and run programs. Proficiency with a scripting language.” “The ability to conceptualize (and visualize) large data sets.”
geophysics: “General procedural programming... models and data visualization...”
materials/surfaces: “simulations of complex systems... solving mathematically intractable problems.... manipulating datasets... capturing time-dependent phenomena.”
evo-devo biology: “the ability to compare many genomes at once”
exoplanets: “novel analyses... of temporal variability surveys and [categorization of] the variability in these large data sets.”
paleobiology: “...a wealth of more complex, easy-to-use packages that can handle Bayesian analyses...”
nanotechnology: “...techniques... to handle many [processors] at once.”
evo-devo biology: “...analyses of large data sets...
smart systems that bring together relevant data from disparate sources.”

q3: What computational skills and abilities would allow today’s undergraduates to tackle tough problems in your field 10 or 20 years from now?
geophysics: “...general programming skills are the key that allow tomorrow’s researchers to create their own tools... and think differently.”
materials/surfaces: “...both applied math and skill in numerical simulations and manipulations...”
astrophysics: “ability to use whatever programs are standard [and] be able to modify them.”
cosmology: “...what is becoming harder and harder is to get students to understand the very basics of how astronomical data is collected.”
evo-devo biology: “statistical analysis, programming, large-dataset management.”
paleobiology: “the big problems, the importance of first principles”
nanostructures: “pattern recognition in the most general sense”
evolutionary biology: “... familiarity with... ‘informatics’ approaches”

miscellaneous comments
“Programming seems here to stay.” geophysics
“The larger problem is eliminating innumeracy among Harvard undergrads.
I routinely have students in my core class that are marginally able, or unable, to deal with quantitative material.” earth science
“The major problem with this [“computational thinking”] approach is that it is concerned with teaching skills, rather than building a CV for medical/professional school, and is thus a slightly unusual vector for our undergraduates.” materials/surfaces
“You know, ironically, students are beginning to lose track of the fundamentals that underlie the computational tools they are using.” paleobiology
“These subjects [applied math and numerical simulation] are difficult, and Harvard undergrads are not terrifically fond of difficult subjects.” materials/surfaces
“Harvard... is the perfect place to pursue this type of education.” geophysics

some conclusions
Computing challenges in the sciences will focus on large data sets, but not just on large data sets. The ability to bring data together from disparate sources will be increasingly critical.
Some faculty fear that students using sophisticated tools will lose touch with first principles or the understanding of nature that comes from direct observation and experimentation.
There is concern about levels of quantitative skill among science students and cynicism about motivation.
Scale is seen as a growing challenge across the sciences, and computational skill as necessary for meeting that challenge.

experimentation at Harvard
research experiences provided by IIC (18 internships, 4 REUs in first 3 years)
physical sciences and life sciences now have integrated first-year courses
first winter session: January 2010
new undergraduate laboratories will combine wet labs with computer labs
IIC Director Efthimios Kaxiras convening interdisciplinary faculty committee to launch co-teaching workshops
planned addition of science projects to CS 50/51; new numerical methods courses in School of Engineering and Applied Sciences (lack of departmental boundaries helps!)

what can “computational thinking” not do for science?
replace observation; scientists must first “take their dictation from Nature”
provide young scientists knowledge of science’s laws, principles and method

what can “computational thinking” do for science?
help conceptualize, manipulate and analyze novel and large databases
lead to different formulations of theory
cleave observations/data/interactions/natural systems into computable pieces; abstract them; represent and model them; map results back to the real world
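
The last item (cleave into computable pieces, abstract, model, map back) can be made concrete with a minimal Python sketch. Everything in it is an assumption added for illustration and is not drawn from the survey: the temperature readings, the 20 °C room temperature, the choice of Newton's law of cooling as the abstraction, and all function names are hypothetical.

```python
import math

# Hypothetical observations (illustrative only, not survey data):
# minutes elapsed and temperature in deg C of a cooling liquid.
OBSERVATIONS = [(0.0, 90.0), (5.0, 70.0), (10.0, 56.0)]
ROOM_TEMP_C = 20.0  # assumed ambient temperature


def estimate_cooling_rate(observations, room_temp):
    """Abstract: reduce the readings to one parameter k in
    dT/dt = -k * (T - room_temp), using the first and last readings."""
    (t0, temp0), (t1, temp1) = observations[0], observations[-1]
    return math.log((temp0 - room_temp) / (temp1 - room_temp)) / (t1 - t0)


def simulate(temp_start, room_temp, k, minutes, dt=0.01):
    """Model: integrate the cooling law with forward Euler steps."""
    temp = temp_start
    for _ in range(round(minutes / dt)):
        temp += -k * (temp - room_temp) * dt
    return temp


def minutes_until(target, temp_start, room_temp, k):
    """Map back: answer a real-world question (when does the liquid
    reach the target temperature?) from the fitted model."""
    return math.log((temp_start - room_temp) / (target - room_temp)) / k


if __name__ == "__main__":
    k = estimate_cooling_rate(OBSERVATIONS, ROOM_TEMP_C)
    print("fitted k per minute:", k)
    print("simulated temperature at 5 min:", simulate(90.0, ROOM_TEMP_C, k, 5.0))
    print("minutes until 60 deg C:", minutes_until(60.0, 90.0, ROOM_TEMP_C, k))
```

Comparing the simulated 5-minute temperature against the middle reading is the step that keeps observation in charge: if the model disagrees with the data, the abstraction is what needs revisiting, not the data.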