Curing Cancer with Your Cell Phone: all

advertisement
Curing Cancer
with Your Cell
Phone:
Why all Sciences are
Becoming
Computing Sciences
David Evans
http://www.cs.virginia.edu/evans
Computer Science =
Doing Cool Stuff with Computers?
College Science Scholars
2
Toaster Science =
Doing Cool Stuff with Toasters?
College Science Scholars
3
Computer Science
• Mathematics is about declarative
(“what is”) knowledge; Computer
Science is about imperative (“how
to”) knowledge
• The Study of Information Processes
Language
– How to describe them
– How to predict their properties
Logic
– How to implement them
Engineering
quickly, cheaply, and reliably
College Science Scholars
4
Most Science is About
Information Processes
Which came first, the
chicken or the egg?
How can a (relatively) simple,
single cell turn into a chicken?
College Science Scholars
5
Agenda
• Three Big Ideas:
– All Computers are Equally Powerful
– Programs are Data, Data are Programs
– Many Surprisingly Different Problems
are Equally Difficult
• One Open Question
– Is a machine that can always guess
correctly able to solve problems a
normal machine can’t?
College Science Scholars
6
“Computers” before WWII
College Science Scholars
7
Mechanical Computing
College Science Scholars
8
Modeling Pencil and Paper
...
#
C
S
S
A
7
2
3
...
How long should the tape be?
“Computing is normally done by writing certain
symbols on paper. We may suppose this paper is
divided into squares like a child’s arithmetic book.”
Alan Turing, On computable numbers, with an
application to the Entscheidungsproblem, 1936
College Science Scholars
9
Modeling Brains
•Rules for steps
•Remember a
little
“For the present I
shall only say that
the justification lies
in the fact that the
human memory is
necessarily limited.”
Alan Turing
College Science Scholars
10
Turing’s Model
...
#
1
0
1
1
0
1
1
1
Input: #
Write: #
Move: 
Start
1
Input: 1
Write: 1
Move: 
College Science Scholars
1
1
0
1
1
1
#
...
Input: 1
Write: 0
Move: 
2
Input: 0
Write: 0
Move: 
0
Input: 0
Write: #
Move: 
3
11
Universal Machine
Input
Description
Result tape
of a of
Turing
running
Machine
M on M
Input
Universal
Machine
A Universal Turing Machine can simulate
any Turing Machine running on any Input!
College Science Scholars
12
Church-Turing Thesis
• All mechanical computers are equally
powerful*
*Except for practical limits like memory size, time,
energy, etc.
• There exists a Turing machine that can
simulate any mechanical computer
• Any computer that is powerful enough
to simulate a Turing machine, can
simulate any mechanical computer
College Science Scholars
13
What This Means
• Your cell phone, watch, iPod, etc. has
a processor powerful enough to
simulate a Turing machine
• A Turing machine can simulate the
world’s most powerful supercomputer
• Thus, your cell phone can simulate
the world’s most powerful
supercomputer (it’ll just take a lot
longer and will run out of memory)
College Science Scholars
14
Recap
• All Computers are Equally Powerful
• Programs are Data, Data are
Programs
• Many Problems are Equally Difficult
– But no one knows how difficult!
College Science Scholars
15
A “Hard” Problem?
College Science Scholars
16
Generalized Pegboard Puzzle
• Input: a configuration of n pegs on a
cracker barrel style pegboard (of any
size)
• Output: if there is a sequence of
jumps that leaves a single peg,
output that sequence of jumps.
Otherwise, output false.
Is this a “hard” problem?
College Science Scholars
17
Solving Problems
• A solution to a problem instance:
given a pegboard configuration,
here’s the sequence of jumps
• A solution to a problem: a procedure
that (1) always finds the correct
answer, and (2) always finishes.
College Science Scholars
18
“Brute Force” Solvers
• Enumerate all possible answers
– Every possible sequence of jumps
• Try them all until you find one that
works
– Simulate the jumps
• This works for almost all problems!
• Problem: how long does it take?
College Science Scholars
19
Problem Solving Time
1200
1000
Time
800
~ 2n
600
~ n3
400
200
~
n
0
1
2
College Science Scholars
3
4
5
6
7
8
Problem Input Size
9
10
20
Increasing Problem Size
70000
~ 2n
60000
50000
40000
30000
20000
10000
~ n3
0
1
2
3
College Science Scholars
4
5
6
7
8
9
10 11 12 13 14 15 16
21
Tractable and Intractable Problems
1200000
1000000
800000
“intractable”
600000
400000
“tractable”
200000
0
1
2
3
4
5
6
7
8
9 10 11 12 13 14 15 16 17 18 19 20
I do nothing that a man of unlimited funds, superb physical endurance,
and maximum scientific knowledge could not do.
– Batman (may be able to solve intractable problems, but computer
scientists can only solve tractable ones for large n)
College Science Scholars
22
This makes a huge difference!
time
since
“Big
Bang”
~ 2n
1E+30
1E+28
1E+26
1E+24
1E+22
1E+20
1E+18
1E+16
1E+14
1E+12
2032
today
1E+10
1E+08
100000
0
10000
~ n3
100
1
2
4
8
16
32
64
128
log-log scale
College Science Scholars
23
Back to the Pegboard...
• A brute force
solution is
easy...but on the
pink line
• Is there a tractable
solution?
1E+30
1E+28
1E+26
1E+24
1E+22
1E+20
1E+18
1E+16
1E+14
1E+12
1E+10
1E+08
100000
0
10000
100
1
2
College Science Scholars
4
8
16
32
64
128
24
Deciding a Problem Is Hard
• “I tried really hard and still couldn’t
solve it.”
– Maybe the speaker isn’t smart enough
– Maybe a few days more effort will find it
• “Lots of really smart people tried
really hard and no one could solve it.”
• “It seems sort of like this other
problem that we think is hard...”
College Science Scholars
25
Output
Input
Reduction
Input
Transformer
Pegboard
Solver
Output
Transformer
jumps
Hard Problem Solver
College Science Scholars
26
Reading the Genome
Whitehead Institute, MIT
College Science Scholars
27
Gene Reading Machines
• One read: about 700 base pairs
• But…don’t know where they are on
the chromosome
Read 3
TACCCGTGATCCA
Read 2
Read 1
Actual
Genome
College Science Scholars
TCCAGAATAA
ACCAGAATACC
ACCAGAATACCCGTGATCCAGAATAA
28
Genome Assembly
Read 1
ACCAGAATACC
Read 2
TCCAGAATAA
Read 3
TACCCGTGATCCA
Input: Genome fragments (but
without knowing where they are
from)
Ouput: The full genome
College Science Scholars
29
Genome Assembly
Read 1
ACCAGAATACC
Read 2
TCCAGAATAA
Read 3
TACCCGTGATCCA
Input: Genome fragments (but
without knowing where they are from)
Ouput: The smallest genome sequence
such that all the fragments are
substrings.
College Science Scholars
30
Genome Assembly Solver
ACCAGAATACC
TCCAGAATAA
TACCCGTGATCCA
ACCAGAATACCCGTGATCCAGAATAA
(~30M reads,
~900 bp)
College Science Scholars
Input
Transformer
Pegboard
Solver
Output
Transformer
jumps
Genome Assembly Solver
31
What This Means
• We already know the shortest
common superstring (genome
assembly) problem is “hard”
• The pegboard problem must also be
hard, since we could use a solver for it
to solve the genome assembly
problem
– Requires: we can build fast transformers
that don’t increase the problem size
exponentially
College Science Scholars
32
Non-Deterministic Machines
...
#
1
0
1
1
0
1
1
1
0
Input: #
Write: #
Move: 
Start
1
Input: 1 Input: 0
Write: 1 Write: 0
Move:  Move: 
College Science Scholars
1
1
0
1
1
1
#
...
Input: 1
Write: 0
Move: 
Input: #
Write: 0
Move: 
4
2
3
33
Non-Deterministic Machine
• Everytime there is a choice, it can
guess the correct choice without
looking ahead
• If we had such a machine, solving
Pegboard (or Genome Assembly,
etc.) problem would be easy:
– It can guess the solution one step
(alignment) at a time
College Science Scholars
34
Big Open Question
Is a nondeterministic
machine able to
solve problems that
are intractable on a
deterministic
machine?
1E+30
1E+28
1E+26
1E+24
1E+22
1E+20
1E+18
1E+16
1E+14
1E+12
1E+10
1E+08
100000
0
10000
100
1
2
4
8
16
32
64
128
Seems obvious that the magic guess
correctly ability should be useful...but
no one knows for sure!
College Science Scholars
35
Recap
• “P vs NP” problem (one of the millennium
prize problems)
• Solving the pegboard puzzle is equivalent
to solving genome assembly
• With a non-deterministic machine, we could
solve both
• With a mechanical computer, we don’t
know if a tractable solution exists (but can’t
prove it doesn’t): We don’t know if
checking a solution is really easier than
finding it
College Science Scholars
36
Summary
• Computer Science is the study of
information processes: all about
problem solving
• Many seemingly paradoxical results:
– All Computers are Equally Powerful!
– Many Surprisingly Different Problems are
Equivalent!
• And seemingly obvious open problems:
– Is checking a solution is really easier than
finding it?
College Science Scholars
37
Computer Science at UVa
• New Interdisciplinary Major in
Computer Science for A&S students
(approved last year)
• Take CS150 this Spring
– Every scientist needs to understand
computing, not just as a tool but as a
way of thinking
• Lots of opportunities to get involved
in research groups
College Science Scholars
38
My Research Group
• Computer Security: computing in the
presence of adversaries
• Recent student projects:
– Proof that the Pegboard puzzle is hard
(Mike Peck and Chris Frost)
– Disk-level virus detection (Adrienne Felt)
– Web Application Security (Sam Guarnieri)
– N-Variant Systems: run variants of a
program simultaneously (Sean Talts)
College Science Scholars
39
Questions
http://www.cs.virginia.edu/evans
evans@cs.virginia.edu
College Science Scholars
40
Download