
No Discipline is an Island
Where Computing and Other Disciplines
Meet
Lillian (Boots) Cassel
Villanova University
Introduction
• It is the journey, not the destination, that makes
life interesting
• Professor of Computing Sciences
• Past chair, Computing Accreditation Commission
• Past chair, ACM SIGCSE
• Past Program Officer (Rotator) NSF DUE
• Member ACM Education Board
• Visiting Scholar, DILL program, Parma Italy
• odds and ends of other interesting things
Plan for this talk
• Computing Disciplines: The Identity question
– What constitutes the computing disciplines
– How do we see ourselves and how do others see us?
• Interdisciplinarity
– The growing role of computing in all disciplines
– A two-way relationship
– The challenges
• Computing – the discipline that lets you be in
whatever field appeals to you now
Computing Disciplines:
The Identity question
• Identity questions
– who are we? what do we contribute?
• Internal divisions
– Computer Science, Computer Engineering, Software
Engineering, Information Systems, Information
Technology, Information Science …
• External views
– Source of tools.
– “Computer expert”
– Way of approaching problems?
• Computational (or Algorithmic) Thinking
Relationship to other disciplines
• Every field depends on computers
– Not much disagreement with that
• Every field depends on computing
– Not so clear. What is the difference?
• Computing also depends on other fields
– We receive as well as give
– Mathematics, of course
– Also, psychology, linguistics, sociology,
communication, ….
Just what is the computing
discipline?
• Motivation for a project
– Specialization of the computing fields
– Fractured voice?
– Confusion about what computing is?
• A bit of history of the Computing Ontology
project
– Naïve beginning
– Lively discussion, consensus
– Reconsideration
– An experiment, and a surprising result
Seven major areas
• Theory
• Information and Recollection
• Organizational Context
• Social Context
• Computing Infrastructure
• Interaction
• Software Design and Development
See www.distributedexpertise.org/computingontology/
[Figure: relationships among the areas of the Computing Ontology. Nodes: Theory, Information/Recollection, Interaction, Computing Infrastructure, SW Design + Implementation, Organizational Context, and Policies/Practices/Societal Issues. Arrow labels: Informs, Designs, Directs, Applies?, Executes, Bounds, Enables, Constrains, Informs/Constrains.]
Still a work in progress
Comments most welcome
Theory
• Every true discipline has a theoretical base
• For computing, this includes
– Algorithms, design strategy and complexity analysis
• How do we approach solving a class of problems
• How practical are the resulting solutions
– Automata and formal language theory
• What types of problems can we express
• How do we distinguish problems that cannot be solved
explicitly?
• How do we decide on appropriate approximations when
complete solutions are not possible?
Information and Recollection
• Databases
• Unstructured data
• Understanding data, making it informative
• Addressing a specific information need
• Preservation of materials as technology
changes
• Capturing, Organizing, Summarizing,
Analyzing, Visualizing*
* Jim Gray summary
Social Context
• Privacy, security, integrity of information
• Ethics
• Intellectual property
• Legal frameworks
Computing Infrastructure
• Digital Systems
• Machine Organization
• Multiprocessing, parallel systems, cloud
computing
• Encoding, representation
• Networks and communication
• Systems Security, authentication, protection
Interaction
• Communicating a need to a computing system
• Receiving what is needed from a computing
system
• Graphics, visualization, multimedia, virtual
reality, vision, robotics ….
Software design and development
• Software engineering
• Knowledge representation, reasoning
• Programming languages and paradigms
• Modeling
• Systems development and life cycle
• Verification and validation
Organizational Context
• Policies and Planning
• Forensics
• Requirements analysis and specification
• Systems and project management
• Structure and management of IS functions
• Quality of Service
A dissenting opinion
How these relate to others
• Let’s look at a few examples of the ways in
which the computing disciplines interrelate
with other disciplines
• Let’s start with a very visible impact –
information
– Information overload, information avalanche
– The terms vary but the message always suggests a
need for something beyond historical methods of
dealing with data and information
Four Science Paradigms
• Thousand years ago:
science was empirical
describing natural phenomena
• Last few hundred years:
theoretical branch
using models, generalizations
$\left(\frac{\dot{a}}{a}\right)^2 = \frac{4\pi G\rho}{3} - K\frac{c^2}{a^2}$
• Last few decades:
a computational branch
simulating complex phenomena
• Today:
data exploration (eScience)
unify theory, experiment, and simulation
– Data captured by instruments
Or generated by simulator
– Processed by software
– Information/Knowledge stored in computer
– Scientist analyzes database / files
using data management and statistics
http://research.microsoft.com/en-us/um/people/gray/talks/NRC-CSTB_eScience.ppt
Data-intensive science
• First there was observation and experimentation
• Then observations led to theories
• Then, in the earliest days of computing, large
scale simulation
• Now – the Fourth Paradigm, articulated by
Microsoft Research’s Jim Gray:
The speed at which any given scientific discipline advances
will depend on how well its researchers collaborate with one
another, and with technologists, in areas of eScience such as
databases, workflow management, visualization, and cloud
computing technologies.
http://research.microsoft.com/en-us/collaboration/fourthparadigm/
How much information is there?
• Soon most everything will be recorded and indexed
• Most bytes will never be seen by humans
• Data summarization, trend detection, and anomaly detection are key technologies
• These require algorithms, data and knowledge representation, and knowledge of the domain
[Figure: a scale of storage sizes. Prefixes, smallest to largest: Kilo, Mega, Giga, Tera, Peta, Exa, Zetta, Yotta. Markers along the scale, smallest to largest: A Book, A Photo, A movie, All books (words), All Books MultiMedia, Everything Recorded!]
Small prefixes (negative powers of ten): 24 yocto, 21 zepto, 18 atto, 15 femto, 12 pico, 9 nano, 6 micro, 3 milli
See also Mike Lesk:
How much information is there: http://www.lesk.com/mlesk/ksg97/ksg.html
See Lyman & Varian:
How much information: http://www.sims.berkeley.edu/research/projects/how-much-info/
Slide source Jim Gray – Microsoft Research (modified)
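As a small illustration of the "anomaly detection" technology named on this slide, here is a minimal sketch, not a method taken from the talk. It flags values that sit far from the mean in units of standard deviation (a z-score test). The function name `find_anomalies`, the threshold of 2.0, and the sample readings are all hypothetical choices made for the example.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=2.0):
    """Return (index, value) pairs whose z-score magnitude exceeds threshold.

    Hypothetical helper for illustration only. Needs at least two values,
    because the sample standard deviation is undefined for fewer.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [(i, v) for i, v in enumerate(values)
            if abs(v - mu) / sigma > threshold]


# Made-up instrument readings with one obvious outlier.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 42.0, 10.2, 10.1]
anomalies = find_anomalies(readings)
print(anomalies)
```

A simple threshold like this only works when someone who knows the domain can say what "far from normal" means, which is the slide's point about needing domain knowledge alongside algorithms.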
X-Info
• The evolution of X-Info and Comp-X
for each discipline X
• How to codify and represent our knowledge
[Figure: Experiments & Instruments, Other Archives, Literature, and Simulations supply facts; questions go in and answers come out.]
The Generic Problems
• Data ingest
• Managing a petabyte
• Common schema
• How to organize it
• How to reorganize it
• How to share with others
• Query and Visualization tools
• Building and executing models
• Integrating data and Literature
• Documenting experiments
• Curation and long-term preservation
http://research.microsoft.com/en-us/collaboration/fourthparadigm/
Some specific examples
• These are just examples – something to
illustrate the pervasive interconnectedness of
computing and other disciplines.
• Each of us can probably provide other
examples.
Astronomy and Computing
• The Large Synoptic Survey Telescope (LSST)
Over 30 thousand gigabytes (30TB) of images will be generated
every night during the decade-long LSST sky survey.
People now do not actually look through telescopes. Instead, they are “looking”
through large-scale, complex instruments which relay data to datacenters, and only
then do they look at the information on their computers -- Jim Gray
LSST and Google share many of the same goals: organizing
massive quantities of data and making it useful.
Computing and Astronomy are inseparable
http://lsst.org/lsst/google
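To get a feel for the scale behind the 30 TB per night figure above, the following back-of-the-envelope sketch multiplies it out over a ten-year survey. It assumes observing on every night of the year, which overstates the real total, and it uses decimal units (1 PB = 1000 TB); both are assumptions made for this example, not figures from the LSST project.

```python
TB_PER_NIGHT = 30        # from the slide: over 30 TB of images per night
NIGHTS_PER_YEAR = 365    # assumption: every night is an observing night
SURVEY_YEARS = 10        # from the slide: decade-long survey

total_tb = TB_PER_NIGHT * NIGHTS_PER_YEAR * SURVEY_YEARS
total_pb = total_tb / 1000  # decimal petabytes

print(f"Upper-bound estimate: {total_tb} TB, about {total_pb} PB")
```

Under these assumptions the survey lands on the order of a hundred petabytes of images, well past the "Managing a petabyte" problem listed among the generic problems earlier.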
New Science
This data-driven modeling and discovery linkage has entered a new
paradigm. The acquisition of scientific data in all disciplines is now
accelerating and causing a nearly insurmountable data avalanche. It is
no longer possible for humans to look at any representative fraction of
the data. Instead, we may be looking over the shoulders of assisted
learning machines at innovative visualizations of metadata. Discoveries
will be made via searches for correlations. The role of the experimental
scientist increasingly is as inventor of ambitious new searches and new
algorithms. Novel theories of nature are tested through searching for
the predicted statistical relationships across big data bases. With this
accelerated advance in data generation capability, we will require novel,
increasingly automated, and increasingly more effective scientific
knowledge discovery systems.
http://www.lsst.org/lsst/science/technology
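The passage above says discoveries will be made via searches for correlations. A minimal sketch of what such a search could look like follows; it is an illustration only, not the LSST approach. It computes the Pearson correlation for every pair of columns in a small table and reports the pairs above a cutoff. The column names, the data, the function names, and the 0.9 cutoff are all invented for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Assumes neither sequence is constant (a constant column would
    make the denominator zero).
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def strongest_correlations(columns, min_abs_r=0.9):
    """Return (name_a, name_b, r) for column pairs with |r| >= min_abs_r,
    strongest first."""
    hits = []
    for (a, xs), (b, ys) in combinations(columns.items(), 2):
        r = pearson(xs, ys)
        if abs(r) >= min_abs_r:
            hits.append((a, b, r))
    return sorted(hits, key=lambda h: -abs(h[2]))


# Invented toy catalog: three measurements for five objects.
catalog = {
    "brightness": [1.0, 2.0, 3.0, 4.0, 5.0],
    "distance": [10.0, 8.0, 6.0, 4.0, 2.0],
    "noise": [3.0, 1.0, 4.0, 1.0, 5.0],
}
hits = strongest_correlations(catalog)
for a, b, r in hits:
    print(f"{a} vs {b}: r = {r:.2f}")
```

Checking every pair grows quadratically with the number of columns, which is one reason the passage calls for new, increasingly automated algorithms rather than exhaustive search.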
Biology
From Nature 14 Nov 2002
Focus on computational biology
Biology is overwhelmed with data. The various genome
projects are generating, with increasing ease, vast
gigabases of DNA sequence, which have stimulated the
development of high-throughput assays to provide
comprehensive post-genomic analysis.
Biology’s Brave new world
Shape of things to come: large data sets arising from genome
projects demand new skills of biologists.
Many research leaders predict that the potential to integrate
different levels of genomic data — such as raw sequence from
the human genome and those of model organisms, data on
genetic variability between individuals and on gene expression in
different tissues — will radically change biological research. They
argue that small experiments driven by individual investigators
will give way to a world in which multidisciplinary teams, sharing
huge online data sets, emerge as the key players. Some foresee
an era of 'systems biology', in which the ability to create
mathematical models describing the function of networks of
genes and proteins is just as important as traditional lab skills.
http://www.nature.com/nature/journal/v409/n6822/full/409758a0.html
The biology lab?
• Shape of things to come: large data sets arising from genome
projects demand new skills of biologists.
http://www.nature.com/nature/journal/v409/n6822/full/409758a0.html
Biology
The Organization Context
• “Last year, 161 exabytes of digital information were created
and copied, according to research firm IDC (International Data
Corporation – idc.com).”
• “While nearly 70% of what IDC is calling the digital universe
will be generated by individuals over the next three years,
most of this content will be touched by a business or
government agency network along the way -- it will be held in
a data center or at a hosting site, it will travel over a telephone
wire or Internet switch, or it will be stored in a backup system.
• Those organizations, IDC said, will be responsible for the
security, privacy, reliability, and compliance of at least 85% of
the information.”
http://www.informationweek.com/news/197800880 -- Information Week - March 7, 2007
Not just science and organizations
• Computational Journalism
• Social Networks
• Theatre
• Music
Communications of the ACM
Vol. 54 No. 10, Pages 66-71
10.1145/2001269.2001288
Music?
http://www.cs.cmu.edu/~music/
Music and Computer Science
http://teaching.cs.uml.edu/~heines/TUES/ProjectHome.jsp
A two-way interaction
• It is easy to focus on what computing gives to
other disciplines
• The other side is just as important, but not as well
recognized
– How can we develop social networks, without
sociology?
– How can we develop good interfaces, without
psychology?
– etc.
[Figure: relationships among the areas of the Computing Ontology. Nodes: Theory, Information/Recollection, Interaction, Computing Infrastructure, SW Design + Implementation, Organizational Context, and Policies/Practices/Societal Issues. Arrow labels: Informs, Designs, Directs, Applies?, Executes, Bounds, Enables, Constrains, Informs/Constrains.]
A good understanding of who
and what we are helps us
understand where we best fit
with other disciplines
Two examples
• Joint appointment
– Computer Science and Linguistics
– Natural language processing specialist
• Two brothers
– Computer science major, English major
Challenges
• Lots of talk about interdisciplinarity
• Motivation
– Faculty
– Institution
• Some issues
– Workload
– Tenure
– Organizational Culture
Current NSF project to explore the issues
of motivation and the challenges, and
what can be done about them.
Connected to everything
• Bottom line –
• The need for computing in nearly everything
makes computing a great choice of
specialization if you are not sure what
interests you most – or what will interest you
in a few years
– Study computing …. do anything you like
Maybe the tag line for computing education week?
Interested
• Please go to computingportal.org
• Look for the community: Interdisciplinary
Computing
– Join the group
– Read the materials
– Comment, participate in discussions
• Also, for computing ontology:
– See www.distributedexpertise.org/computingontology/