On the limitations of efficient computation

Oded Goldreich

Weizmann Institute of Science

Theory of Computation: a “meta-discipline”

Like Mathematics, typically, TOC does not engage in solving specific problems but rather studies the solvability of classes of problems.

Specifically, it studies the efficient solvability of computational problems.

Each individual problem consists of infinitely many instances, and the problem specifies a set of valid solutions for each instance.

Categories

Classes of problems

Problems = set of instance-solution pairs

Instances = finite object

(represented by a finite sequence of bits)

Problems (i.e., computational problems)

Problems of full information.

So what is the problem here?

The problem is to transform one representation of the information to another (more explicit representation).

answer implicit in (input) data → transformation (= data manipulation) → explicit answer in output

Type of questions: The P-vs-NP Question

Search problems: binary relations specifying valid instance-solution pairs.

• Finding solutions: given an instance, find a valid solution.

• Checking validity of solutions: given an instance-solution pair, determine whether the solution is valid.

• “P” = class of search problems for which solutions can be found efficiently.

• “NP” = class of search problems for which the validity of solutions can be checked efficiently.

The P-vs-NP Question: Is solving harder than checking?

Solving seems harder than checking

Example 1: Sudoku

Solving seems harder than checking, cont.

Example 2: Factoring integers

61512881 = 6917 × 8893
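The gap between checking and finding in this example can be sketched in a few lines of Python. The function names are illustrative, and trial division is used only as the simplest possible finder, not as a serious factoring method.

```python
def is_valid_factorization(n, p, q):
    """Checking: one multiplication and two range tests."""
    return 1 < p < n and 1 < q < n and p * q == n


def find_factor(n):
    """Finding: trial division, trying up to about sqrt(n) candidates."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    return None  # no nontrivial factor exists


print(is_valid_factorization(61512881, 6917, 8893))
print(find_factor(61512881))
```

Checking costs a single multiplication, while the finder above loops through thousands of candidate divisors for this 8-digit number, and the loop grows exponentially in the number of digits.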

Example 3: Solving a system of linear (or quadratic) equations

Linear system:

2x + 3y - z = 9
x - y + 3z = -4
3x + 2y + z = 6

Solution: x = 1, y = 2, z = -1

Quadratic system:

2x^2 + 3xy - z = 0
x - y^2 + 3yz = -5
3x + 2xy + 2xz = -3

Solution: x = -1, y = 1, z = -1
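Checking a proposed solution to such a system only requires substituting the values and comparing both sides. The sketch below does this for the two systems of Example 3; the coefficients, signs, and right-hand sides in the code are one reading of those systems and should be treated as an assumption, and the names `LINEAR`, `QUADRATIC`, and `is_solution` are illustrative.

```python
# Each equation is a pair: a function computing the left-hand side,
# and the right-hand side it must equal.
LINEAR = [
    (lambda x, y, z: 2 * x + 3 * y - z, 9),
    (lambda x, y, z: x - y + 3 * z, -4),
    (lambda x, y, z: 3 * x + 2 * y + z, 6),
]

QUADRATIC = [
    (lambda x, y, z: 2 * x**2 + 3 * x * y - z, 0),
    (lambda x, y, z: x - y**2 + 3 * y * z, -5),
    (lambda x, y, z: 3 * x + 2 * x * y + 2 * x * z, -3),
]


def is_solution(system, candidate):
    """Substitute the candidate (x, y, z) into every equation."""
    return all(lhs(*candidate) == rhs for lhs, rhs in system)


print(is_solution(LINEAR, (1, 2, -1)))
print(is_solution(QUADRATIC, (-1, 1, -1)))
```

The checker performs a fixed number of arithmetic operations per equation, regardless of how hard the system was to solve.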

Solving seems harder than checking, cont.

Example 4: Coloring a Map (with 3 colors)

The P-vs-NP Conjecture

Not all problems in NP are in P:

There exist search problems for which solutions can be efficiently checked (for correctness), and yet cannot be found efficiently (when given an instance).

For these problems, solving is harder than checking (that is, significantly harder).

NP-Completeness (universal problems in NP)

Assuming that not all problems in NP are in P, can we pinpoint such a problem (i.e., a problem for which finding solutions is harder than verifying their correctness)?

Yes we can; furthermore, a host of natural problems are such!

E.g., solving systems of quadratic equations (SQE).

If you can efficiently solve (i.e., find solutions to) any such (NP-complete) problem, then you can efficiently solve all problems in NP.

How is this done?

By encoding instances of any problem in NP (e.g., factoring, 3-coloring) as instances of the universal problem (e.g., SQE).

Indeed amazing: we can factor a composite number (or 3-color a geometric map) by solving a “related” system of quadratic equations, where the system can be efficiently obtained from the number (or map).
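To make the encoding idea concrete, here is a minimal sketch of one textbook-style way to turn a 3-coloring instance into a system of quadratic equations. It is not necessarily the encoding the talk has in mind. Each variable x[v, c] says whether region v receives color c; all function names are illustrative, and the brute-force solver is there only to confirm the encoding on tiny maps.

```python
from itertools import product

COLORS = range(3)


def coloring_to_equations(vertices, edges):
    """Return functions f(x); the system asks that every f(x) equals 0."""
    eqs = []
    for v in vertices:
        for c in COLORS:
            # x^2 - x = 0 forces x[v, c] to be 0 or 1
            eqs.append(lambda x, v=v, c=c: x[v, c] ** 2 - x[v, c])
        # the three 0/1 variables of v sum to 1: exactly one color
        eqs.append(lambda x, v=v: sum(x[v, c] for c in COLORS) - 1)
    for u, w in edges:
        for c in COLORS:
            # adjacent regions never share color c
            eqs.append(lambda x, u=u, w=w, c=c: x[u, c] * x[w, c])
    return eqs


def solve_by_search(vertices, eqs):
    """Brute force over 0/1 values (enough, since x^2 - x = 0 is required)."""
    keys = [(v, c) for v in vertices for c in COLORS]
    for bits in product((0, 1), repeat=len(keys)):
        x = dict(zip(keys, bits))
        if all(eq(x) == 0 for eq in eqs):
            return x
    return None


def decode(vertices, x):
    """Read a coloring back from a solution of the system."""
    return {v: next(c for c in COLORS if x[v, c] == 1) for v in vertices}


triangle = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
solution = solve_by_search(triangle[0], coloring_to_equations(*triangle))
print(decode(triangle[0], solution))
```

The system has a solution exactly when the map is 3-colorable, and building it takes time proportional to the size of the map; only solving it is (presumably) hard.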

The Bright Side of Hardness

If NP is not contained in P in a strong sense (i.e., some problems are hard to solve on typical cases and not only on worst cases), then we get cryptography and pseudorandomness.

The hardness conjecture we need (one-way functions):

There are efficient processes that cannot be efficiently reversed/inverted (not even on typical cases). The inverting task is always in NP.

Cryptography: systems that are easy to use but hard to abuse.

Pseudorandomness: objects that look random although they are not truly random.

Pseudorandomness (exists if OWFs exist)

A (huge) world determined using an unlimited amount of randomness

We cannot distinguish these two worlds

(i.e., a truly random world vs. a pseudorandom one)

An alternative world determined using a very small amount of randomness

Pseudorandomness (exists if OWFs exist), revised

A truly random function from n-digit numbers to n-digit numbers, where (say) n=100.

We cannot distinguish these two cases

(i.e., a truly random function vs. a pseudorandom one)

A function from n-digit numbers to n-digit numbers, determined by a random n-digit number.
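The shape of such a function can be sketched as follows. This is only a heuristic stand-in: it uses HMAC-SHA512 from the Python standard library, which is conjectured (not proven) to behave pseudorandomly, and it is not the construction from one-way functions that the claim above refers to. The name `keyed_function` and the reduction modulo 10^n (which introduces a negligible bias) are assumptions of this sketch.

```python
import hashlib
import hmac
import secrets

N_DIGITS = 100


def keyed_function(key: int, x: int) -> int:
    """Map a number x to a number below 10**N_DIGITS, determined by the key."""
    digest = hmac.new(str(key).encode(), str(x).encode(), hashlib.sha512).digest()
    return int.from_bytes(digest, "big") % 10**N_DIGITS


# The only randomness used: one random number of at most N_DIGITS digits.
key = secrets.randbelow(10**N_DIGITS)
print(keyed_function(key, 12345))
print(keyed_function(key, 12346))
```

A truly random function on 100-digit numbers would need an astronomically large table to describe, whereas the function above is fully described by the single key.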

The End

The slides of this talk are available at http://www.wisdom.weizmann.ac.il/~oded/T/eff-comp.ppt

A related textbook is available at http://www.wisdom.weizmann.ac.il/~oded/bc-book.html

Auxiliary slides follow, taken from related talks.

A common feature: Easy to check correctness

A search problem consists of an infinite (or huge) set of instances and a concise/simple specification of valid solutions.

“NP” = All search problems for which valid solutions can be efficiently recognized; that is, there exists an efficient procedure that, when given an instance-solution pair, determines whether or not the solution is valid.

The fact that it is easy to check correctness of candidate solutions yields an obvious (but slow) way of finding solutions to all problems, called exhaustive search:

Just try all potential solutions and check the correctness of each of them, halting with a correct solution once it is found.

Exhaustive search: an obvious (but slow) way of finding solutions to all NP problems

Trying all potential solutions, we find a correct one once it is tried.

Example 1: Sudoku (with 50/81 entries missing).

9^50 possibilities.

Example 2: Factoring 8-digit integers.

10000 possibilities.

(When factoring 200-digit integers, there are 10^100 possibilities!)

Example 3: 3-coloring a map (with 20 countries).

3^20 possibilities.
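The following sketch puts these counts side by side and converts them into running times. The checking speed of one billion candidates per second is an assumption chosen for illustration, and the factoring counts take the number of trial divisors to be about the square root of the number being factored.

```python
import math

CHECKS_PER_SECOND = 10**9          # assumed speed of the checker
SECONDS_PER_YEAR = 365 * 24 * 3600

search_spaces = {
    "Sudoku, 50 empty cells": 9**50,
    "factoring an 8-digit integer": math.isqrt(10**8),
    "factoring a 200-digit integer": math.isqrt(10**200),
    "3-coloring 20 countries": 3**20,
}


def years_to_try_all(count):
    """Time to check every candidate at the assumed speed."""
    return count / (CHECKS_PER_SECOND * SECONDS_PER_YEAR)


for name, count in search_spaces.items():
    print(f"{name}: {count:.3e} candidates, {years_to_try_all(count):.3e} years")
```

Under this assumption the 8-digit factoring and 20-country coloring searches finish within seconds, while the Sudoku and 200-digit factoring searches would take vastly longer than any realistic time budget.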

Are there better ways to find solutions?

Example 1a: Solving systems of linear equations. YES!

Example 1b: Solving systems of quadratic equations. (prob.) NO!

Example 2a: 2-coloring a (2-colorable) map. YES!

Example 2b: 3-coloring a (3-colorable) map. (prob.) NO!
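For the 2-coloring case, a better way than exhaustive search is easy to sketch: color one region, then propagate the forced opposite color to its neighbors. The code below is a minimal breadth-first version; the input format (a dict from each region to its adjacent regions, with every region present as a key) is an assumption of this sketch.

```python
from collections import deque


def two_color(neighbors):
    """Return a dict region -> 0/1, or None if no 2-coloring exists."""
    color = {}
    for start in neighbors:
        if start in color:
            continue
        color[start] = 0               # free choice for each new component
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if v not in color:
                    color[v] = 1 - color[u]   # forced: must differ from u
                    queue.append(v)
                elif color[v] == color[u]:
                    return None               # two adjacent regions clash
    return color


square = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
print(two_color(square))
```

Each region and each border is examined a constant number of times, so the running time is linear in the size of the map. No comparably simple propagation works for three colors, because a neighbor's color is no longer forced.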

The P-vs-NP Question (version 1: search problems)

“P” = All search problems that can be solved efficiently; that is, there exists an efficient procedure that, when given an instance of the problem, finds a valid solution to that instance (or indicates that none exists).

“NP” = All search problems for which valid solutions can be efficiently recognized; that is, there exists an efficient procedure that, when given an instance-solution pair, determines whether or not the solution is valid.

It is widely believed that there are problems in NP \ P.

For example: 3-Coloring, TSP, Knapsack, Solve-Quad-Equations, and also Factoring.

Decision problems (a formulation)

Again, a problem is not a specific instance (e.g., a specific Sudoku puzzle), but rather the general form/class/type (e.g., Sudoku).

A decision problem consists of an infinite (or huge) set of instances and a concise/simple specification of YES-instances (a set of instances having a “desired” property).

A generic example (having solutions w.r.t. a search problem):

An instance is an instance of a search problem, and the question is whether this instance has a solution.

Example 1 (Sudoku): An instance is any 9-by-9 rectangle partially filled with single digits, and the question is whether it can be augmented such that …

Example 2 (coloring maps): An instance is a map, and the question is whether it is 3-colorable (i.e., whether there exists a 3-coloring of the areas such that no two adjacent areas are assigned the same color).

Focus on the problem of 3-colorability

Coloring a Map with 3 colors such that no two adjacent areas are assigned the same color.

THM: Every map can be colored with four colors.

Some maps can be colored with three colors; this one cannot (e.g., Belarus is surrounded by five neighbors).

The decision problem:

Given a map, determine whether it is 3-colorable.
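A minimal sketch of this decision problem, answered by exhaustive search over all colorings, is given below. The map fragment is an assumption made for illustration: Belarus together with its five neighbors, with the neighbors listed in the order in which they border one another around it.

```python
from itertools import product


def is_proper(coloring, borders):
    """Check a candidate: no two bordering regions share a color."""
    return all(coloring[a] != coloring[b] for a, b in borders)


def find_3_coloring(regions, borders):
    """Search problem: try all 3^n colorings, return the first proper one."""
    for colors in product(range(3), repeat=len(regions)):
        coloring = dict(zip(regions, colors))
        if is_proper(coloring, borders):
            return coloring
    return None


def is_3_colorable(regions, borders):
    """Decision problem: does the instance have any solution at all?"""
    return find_3_coloring(regions, borders) is not None


regions = ["Belarus", "Poland", "Lithuania", "Latvia", "Russia", "Ukraine"]
ring = regions[1:]
borders = [("Belarus", r) for r in ring]
borders += [(ring[i], ring[(i + 1) % 5]) for i in range(5)]
print(is_3_colorable(regions, borders))
```

The ring of five neighbors is an odd cycle, which already needs three colors, so no color is left for Belarus. The search confirms this by failing on all 3^6 candidate colorings.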

The P-vs-NP Question (version 2: decision problems)

P = All decision problems that can be solved efficiently; that is, there exists an efficient procedure that, when given an instance of the problem, determines whether it is a YES-instance.

NP = All decision problems for which each YES-instance has an efficiently verifiable certificate; that is, there exists an efficient procedure that, when given an instance-certificate pair, determines whether or not the certificate is valid.

It is widely believed that there are problems in NP \ P.

For example: 3-Coloring, TSP, Knapsack, Solve-Quad-Equations.

Universal problems (NP-completeness)

A problem in NP is called NP-complete if the ability to efficiently solve it implies the ability to efficiently solve any problem in NP.

How can such problems possibly exist?

(Let alone how can we prove that they exist?)

If P ≠ NP, then NP-complete problems are hard to solve.

For example: 3-Coloring, TSP, Knapsack, Solve-Quad-Equations.

Universal problems (NP-completeness) [continued]

A (decision) problem C in NP is called NP-complete if the ability to efficiently solve it implies the ability to efficiently solve any (decision) problem D in NP.

How can NP-complete problems possibly exist?

(Let alone how can we prove that they exist?)

Idea: An efficient transformation of instances of (any NP) problem D into instances of problem C such that YES-instances are mapped to YES-instances (and NO-instances are mapped to NO-instances).

For example: 3-Coloring, TSP, Knapsack, Solve-Quad-Equations are all NP-complete. Instances, say, of FACTORING (in NP) can be mapped to any of them.

Universal problems (NP-completeness) [2nd cont.]

A (decision) problem C in NP is called NP-complete if the ability to efficiently solve it implies the ability to efficiently solve any (decision) problem D in NP.

Shown via an efficient transformation of instances of (any NP) problem D into instances of problem C such that YES-instances are mapped to YES-instances (and NO-instances are mapped to NO-instances).

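[Diagram: YES-instances of D (e.g., Factor) are mapped to YES-instances of C (e.g., 3Color), and NO-instances of D are mapped to NO-instances of C.]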

The End

The slides of this talk are available at http://www.wisdom.weizmann.ac.il/~oded/T/p-vs-np.ppt
