No Discipline is an Island
Where Computing and Other Disciplines Meet
Lillian (Boots) Cassel
Villanova University

Introduction
• It is the journey, not the destination, that makes life interesting
• Professor of Computing Sciences
• Past chair, Computing Accreditation Commission
• Past chair, ACM SIGCSE
• Past Program Officer (Rotator) NSF DUE
• Member ACM Education Board
• Visiting Scholar, DILL program, Parma Italy
• Odds and ends of other interesting things

Plan for this talk
• Computing Disciplines: The Identity question
  – What constitutes the computing disciplines
  – How do we see ourselves and how do others see us?
• Interdisciplinarity
  – The growing role of computing in all disciplines
  – A two-way relationship
  – The challenges
• Computing – the discipline that lets you be in whatever field appeals to you now

Computing Disciplines: The Identity question
• Identity questions – who are we? what do we contribute?
• Internal divisions
  – Computer Science, Computer Engineering, Software Engineering, Information Systems, Information Technology, Information Science …
• External views
  – Source of tools
  – “Computer expert”
  – Way of approaching problems?
    • Computational (or Algorithmic) Thinking

Relationship to other disciplines
• Every field depends on computers
  – Not much disagreement with that
• Every field depends on computing
  – Not so clear. What is the difference?
• Computing also depends on other fields
  – We receive as well as give
  – Mathematics, of course
  – Also, psychology, linguistics, sociology, communication, …

Just what is the computing discipline?
• Motivation for a project
  – Specialization of the computing fields
  – Fractured voice?
  – Confusion about what computing is?
• A bit of history of the Computing Ontology project
  – Naïve beginning
  – Lively discussion, consensus
  – Reconsideration
  – An experiment, and a surprising result

Seven major areas
• Theory
• Information and Recollection
• Organizational Context
• Social Context
• Computing Infrastructure
• Interaction
• Software Design and Development
See www.distributedexpertise.org/computingontology/

[Figure: relationships among the areas. Nodes: Organizational Context; Computing Infrastructure; SW Design + Implementation; Theory; Information/Recollection; Interaction; Policies/Practices/Societal Issues. Edge labels: Informs, Designs, Directs, Applies?, Executes, Bounds, Enables, Constrains, Informs/Constrains.]
Still a work in progress. Comments most welcome.

Theory
• Every true discipline has a theoretical base
• For computing, this includes
  – Algorithms, design strategy and complexity analysis
    • How do we approach solving a class of problems
    • How practical are the resulting solutions
  – Automata and formal language theory
    • What types of problems can we express
    • How do we distinguish problems that cannot be solved explicitly?
    • How do we decide on appropriate approximations when complete solutions are not possible?

Information and Recollection
• Databases
• Unstructured data
• Understanding data, making it informative
• Addressing a specific information need
• Preservation of materials as technology changes
• Capturing, Organizing, Summarizing, Analyzing, Visualizing*
* Jim Gray summary

Social Context
• Privacy, security, integrity of information
• Ethics
• Intellectual property
• Legal frameworks

Computing Infrastructure
• Digital Systems
• Machine Organization
• Multiprocessing, parallel systems, cloud computing
• Encoding, representation
• Networks and communication
• Systems Security, authentication, protection

Interaction
• Communicating a need to a computing system
• Receiving what is needed from a computing system
• Graphics, visualization, multimedia, virtual reality, vision, robotics …
Software design and development
• Software engineering
• Knowledge representation, reasoning
• Programming languages and paradigms
• Modeling
• Systems development and life cycle
• Verification and validation

Organizational Context
• Policies and Planning
• Forensics
• Requirements analysis and specification
• Systems and project management
• Structure and management of IS functions
• Quality of Service

A dissenting opinion

How these relate to others
• Let’s look at a few examples of the ways in which the computing disciplines interrelate with other disciplines
• Let’s start with a very visible impact – information
  – Information overload, information avalanche
  – The terms vary but the message always suggests a need for something beyond historical methods of dealing with data and information

Four Science Paradigms
• Thousand years ago: science was empirical, describing natural phenomena
• Last few hundred years: theoretical branch, using models, generalizations
• Last few decades: a computational branch, simulating complex phenomena
• Today: data exploration (eScience), unifying theory, experiment, and simulation
  – Data captured by instruments or generated by simulator
  – Processed by software
  – Information/Knowledge stored in computer
  – Scientist analyzes database / files using data management and statistics
[Slide graphic: the equation $\left(\frac{\dot{a}}{a}\right)^2 = \frac{4\pi G\rho}{3} - K\frac{c^2}{a^2}$]
http://research.microsoft.com/en-us/um/people/gray/talks/NRC-CSTB_eScience.ppt

Data-intensive science
• First there was observation and experimentation
• Then observations led to theories
• Then, in the earliest days of computing, large scale simulation
• Now – the Fourth Paradigm, articulated by Microsoft Research’s Jim Gray:
The speed at which any given scientific discipline advances will depend on how well its researchers collaborate with one another, and with technologists, in areas of eScience such as databases, workflow management, visualization, and cloud computing technologies.
http://research.microsoft.com/en-us/collaboration/fourthparadigm/

How much information is there?
• Soon most everything will be recorded and indexed
• Most bytes will never be seen by humans
• Data summarization, trend detection, and anomaly detection are key technologies
• These require algorithms, data and knowledge representation, and knowledge of the domain
[Figure: a scale of byte prefixes, from largest to smallest: Yotta, Zetta, Exa, Peta, Tera, Giga, Mega, Kilo. Items marked along the scale: Everything Recorded!, All Books MultiMedia, All books (words), A movie, A Photo, A Book.]
Small-scale prefixes (negative powers of ten): 24 yocto, 21 zepto, 18 atto, 15 femto, 12 pico, 9 nano, 6 micro, 3 milli
See also Mike Lesk: How much information is there: http://www.lesk.com/mlesk/ksg97/ksg.html
See Lyman & Varian: How much information: http://www.sims.berkeley.edu/research/projects/how-much-info/
Slide source: Jim Gray – Microsoft Research (modified)

X-Info
• The evolution of X-Info and Comp-X for each discipline X
• How to codify and represent our knowledge
[Figure: facts flow in from Experiments & Instruments, Other Archives, Literature, and Simulations; questions go in and answers come out.]

The Generic Problems
• Data ingest
• Managing a petabyte
• Common schema
• How to organize it
• How to reorganize it
• How to share with others
• Query and Visualization tools
• Building and executing models
• Integrating data and Literature
• Documenting experiments
• Curation and long-term preservation
http://research.microsoft.com/en-us/collaboration/fourthparadigm/

Some specific examples
• These are just examples – something to illustrate the pervasive interconnectedness of computing and other disciplines.
• Each of us can probably provide other examples.

Astronomy and Computing
• The Large Synoptic Survey Telescope (LSST)
Over 30 thousand gigabytes (30TB) of images will be generated every night during the decade-long LSST sky survey.
People now do not actually look through telescopes.
Instead, they are “looking” through large-scale, complex instruments which relay data to datacenters, and only then do they look at the information on their computers -- Jim Gray
LSST and Google share many of the same goals: organizing massive quantities of data and making it useful.
Computing and Astronomy are inseparable
http://lsst.org/lsst/google

New Science
This data-driven modeling and discovery linkage has entered a new paradigm. The acquisition of scientific data in all disciplines is now accelerating and causing a nearly insurmountable data avalanche. It is no longer possible for humans to look at any representative fraction of the data. Instead, we may be looking over the shoulders of assisted learning machines at innovative visualizations of metadata. Discoveries will be made via searches for correlations. The role of the experimental scientist increasingly is as inventor of ambitious new searches and new algorithms. Novel theories of nature are tested through searching for the predicted statistical relationships across big data bases. With this accelerated advance in data generation capability, we will require novel, increasingly automated, and increasingly more effective scientific knowledge discovery systems.
http://www.lsst.org/lsst/science/technology

Biology
From Nature, 14 Nov 2002, Focus on computational biology:
Biology is overwhelmed with data. The various genome projects are generating, with increasing ease, vast gigabases of DNA sequence, which have stimulated the development of high-throughput assays to provide comprehensive post-genomic analysis.

Biology’s Brave new world
Shape of things to come: large data sets arising from genome projects demand new skills of biologists.
Many research leaders predict that the potential to integrate different levels of genomic data (such as raw sequence from the human genome and those of model organisms, data on genetic variability between individuals and on gene expression in different tissues) will radically change biological research. They argue that small experiments driven by individual investigators will give way to a world in which multidisciplinary teams, sharing huge online data sets, emerge as the key players. Some foresee an era of 'systems biology', in which the ability to create mathematical models describing the function of networks of genes and proteins is just as important as traditional lab skills.
http://www.nature.com/nature/journal/v409/n6822/full/409758a0.html

The biology lab?
• Shape of things to come: large data sets arising from genome projects demand new skills of biologists.
http://www.nature.com/nature/journal/v409/n6822/full/409758a0.html

Biology

The Organization Context
• “Last year, 161 exabytes of digital information were created and copied, according to research firm IDC (International Data Corporation – idc.com).”
• “While nearly 70% of what IDC is calling the digital universe will be generated by individuals over the next three years, most of this content will be touched by a business or government agency network along the way -- it will be held in a data center or at a hosting site, it will travel over a telephone wire or Internet switch, or it will be stored in a backup system.
• Those organizations, IDC said, will be responsible for the security, privacy, reliability, and compliance of at least 85% of the information.”
http://www.informationweek.com/news/197800880 -- Information Week - March 7, 2007

Not just science and organizations
• Computational Journalism
• Social Networks
• Theatre
• Music
Communications of the ACM, Vol. 54 No. 10, Pages 66-71, DOI 10.1145/2001269.2001288

Music?
http://www.cs.cmu.edu/~music/

Music and Computer Science
http://teaching.cs.uml.edu/~heines/TUES/ProjectHome.jsp

A two-way interaction
• It is easy to focus on what computing gives to other disciplines
• The other side is just as important, but not so well recognized
  – How can we develop social networks without sociology?
  – How can we develop good interfaces without psychology?
  – etc.

[Figure: diagram of the computing areas and their relationships, repeated. Nodes: Organizational Context; Computing Infrastructure; SW Design + Implementation; Theory; Information/Recollection; Interaction; Policies/Practices/Societal Issues.]
A good understanding of who and what we are helps us understand where we best fit with other disciplines

Two examples
• Joint appointment
  – Computer Science and Linguistics
  – Natural language processing specialist
• Two brothers
  – Computer science major, English major

Challenges
• Lots of talk about interdisciplinarity
• Motivation
  – Faculty
  – Institution
• Some issues
  – Workload
  – Tenure
  – Organizational Culture
Current NSF project to explore the issues of motivation and the challenges, and what can be done about them.

Connected to everything
• Bottom line
  – The need for computing in nearly everything makes computing a great choice of specialization if you are not sure what interests you most, or what will interest you in a few years
  – Study computing … do anything you like
Maybe the tag line for computing education week?

Interested
• Please go to computingportal.org
• Look for the community: Interdisciplinary Computing
  – Join the group
  – Read the materials
  – Comment, participate in discussions
• Also, for computing ontology:
  – See www.distributedexpertise.org/computingontology/