Fundamentals of Informatics Lecture 14 Intractability and NP-completeness Bas Luttik.

Fundamentals of Informatics

Lecture 14

Intractability and NP-completeness

Bas Luttik

Algorithms

A complete description of an algorithm consists of three parts:

1. the algorithm
2. a proof of the algorithm’s correctness
3. a derivation of the algorithm’s running time

Should we really care about the running time of an algorithm?

Couldn’t we just improve our hardware, to compensate for the inefficiency of an algorithm?

Towers of Hanoi

Rules of the game:

1. Rings can only be moved one at a time.

2. Rings may not be placed on top of smaller rings.

Towers of Hanoi (running time analysis)

Hanoi(n,x,y,z) // move n rings from peg x to peg y using peg z
  if n = 1                        // O(1)
  then move ring from x to y      // O(1)
  else
    Hanoi(n-1,x,z,y)              // T(n-1)
    move ring from x to y         // O(1)
    Hanoi(n-1,z,y,x)              // T(n-1)

The running time of Hanoi(n,x,y,z) is denoted by T(n).

Solving the recurrence

T(n) = 2T(n-1)+O(1)

yields T(n)=O(2^n)!

A more efficient solution does not exist!
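The exponential growth can be observed directly. The following is a minimal Python sketch of the recursive strategy, not part of the lecture material; the function names `hanoi` and `count_moves`, the peg labels, and the list used to record moves are illustrative assumptions. The number of recorded moves equals 2^n - 1, in line with T(n) = O(2^n).

```python
def hanoi(n, x, y, z, moves):
    """Move n rings from peg x to peg y using peg z; record each move."""
    if n == 1:
        moves.append((x, y))
    else:
        hanoi(n - 1, x, z, y, moves)   # park the n-1 smaller rings on z
        moves.append((x, y))           # move the largest ring to y
        hanoi(n - 1, z, y, x, moves)   # move the smaller rings from z onto y


def count_moves(n):
    """Number of single-ring moves made for n rings (hypothetical helper)."""
    moves = []
    hanoi(n, "A", "B", "C", moves)
    return len(moves)


for n in range(1, 6):
    print(n, count_moves(n))
```

Each extra ring roughly doubles the number of moves, which is the behaviour the recurrence predicts.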

Running times

Input length   10           20               50             100             200

n^2            1/10000 s.   1/2500 s.        1/400 s.       1/100 s.        1/25 s.

n^5            1/10 s.      3.2 s.           5.2 m.         2.8 h.          3.7 d.

2^n            1/1000 s.    1 s.             35.7 y.        4 x 10^14 c.    1 x 10^45 c.

n^n            2.8 h.       3.3 x 10^12 y.   1 x 10^70 c.   1 x 10^185 c.   1 x 10^445 c.

How long does it take to run an algorithm on a computer capable of a million instructions per second?
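The orders of magnitude can be recomputed with exact integer arithmetic. The sketch below is an illustration under stated assumptions: one instruction per step, a 365-day year, and hypothetical names (`SPEED`, `GROWTH`, `whole_seconds`, `whole_years`) that do not come from the lecture.

```python
SPEED = 10**6                       # instructions per second, as in the question above
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: a 365-day year

# Hypothetical table of growth functions, keyed by a readable label.
GROWTH = {
    "n^2": lambda n: n**2,
    "n^5": lambda n: n**5,
    "2^n": lambda n: 2**n,
    "n^n": lambda n: n**n,
}


def whole_seconds(steps):
    """Whole seconds needed for `steps` instructions (integer arithmetic)."""
    return steps // SPEED


def whole_years(steps):
    """Whole years needed for `steps` instructions (integer arithmetic)."""
    return steps // (SPEED * SECONDS_PER_YEAR)


for name, f in GROWTH.items():
    for n in (10, 20, 50, 100, 200):
        years = whole_years(f(n))
        if years == 0:
            print(f"{name}, n={n}: {whole_seconds(f(n))} whole seconds")
        else:
            print(f"{name}, n={n}: a {len(str(years))}-digit number of years")
```

Integers are used throughout because the largest values (such as 200^200) do not fit in a floating-point number.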

polynomial-time algorithm: an algorithm with a running time of O(n^c) for some constant c.

polynomial running times: n^2, n^5

exponential running times: 2^n, n^n

P: the class of all decision problems for which there exists a polynomial-time algorithm that solves them.

Tractable versus intractable problems

[Figure: problems classified as tractable, intractable, or unsolvable]

A problem is tractable if there exists a polynomial-time algorithm for it (i.e., if it is in the class P)

A problem is intractable if there exists an algorithm for it, but no polynomial-time algorithm.

A problem is unsolvable if there does not exist an algorithm for it (not even an inefficient one).

Two decision problems about graphs

Problem 1:

Input: a road map of cities, with distances attached to road segments, two designated cities A and B and an integer k.

Output: ‘Yes’ if it is possible to take a trip from A to B of length ≤ k, and ‘No’ if such a trip is impossible.

Problem 2:

Input: a road map of cities, with distances attached to road segments, two designated cities A and B and an integer k.

Output: ‘Yes’ if it is possible to take a trip from A to B of length ≤ k which passes through all the cities, and ‘No’ if such a trip is impossible.

Problem 1 is the shortest path problem (known to be in P).

Problem 2 is the travelling salesman problem (unknown whether it is in P).

Travelling Salesman decision problem

travelling salesman decision problem

Input: A road map with n locations (one of the locations is the depot) connected by road segments, with distances attached to the road segments, and an integer k.

Output: Yes if there exists a route of distance less than or equal to k that starts and ends at the depot and visits all locations on the map exactly once. No otherwise.

A solution to the travelling salesman problem is a list l_0,…,l_n of n+1 locations such that

1. the depot is both the first (l_0) and the last (l_n) location in the list,

2. all the other locations occur exactly once in the list, and

3. the sum of the distances between successive locations in the list is less than or equal to k.

Note that it can be verified with a polynomial-time algorithm whether some candidate list l_0,…,l_n is a solution.

Travelling Salesman decision problem


A solution to the travelling salesman problem is a candidate list l_0,…,l_n of n+1 locations satisfying the three conditions on the previous slide, and the conditions can be verified with a polynomial-time algorithm.

A naïve algorithm for solving the travelling salesman decision problem searches exhaustively through all candidate solutions.

The difficulty is: there are n! = n·(n-1)·(n-2)·…·3·2·1 such candidate lists.

So: searching for a solution is hard, but verifying a solution is easy.
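The contrast between verifying and searching can be sketched in Python. This is an illustration under assumptions that are not in the lecture: locations are numbered 0 to n-1, the road map is given as a complete distance matrix `dist`, and the names `verify_tsp` and `tsp_exhaustive` are hypothetical.

```python
from itertools import permutations


def route_length(dist, route):
    """Sum of the distances between successive locations in the route."""
    return sum(dist[a][b] for a, b in zip(route, route[1:]))


def verify_tsp(dist, depot, k, route):
    """Polynomial-time check of the three conditions on a candidate list."""
    n = len(dist)
    return (
        len(route) == n + 1
        and route[0] == depot and route[-1] == depot   # condition 1
        and sorted(route[:-1]) == list(range(n))       # condition 2
        and route_length(dist, route) <= k             # condition 3
    )


def tsp_exhaustive(dist, depot, k):
    """Naive search: the number of candidates grows factorially with n."""
    others = [v for v in range(len(dist)) if v != depot]
    for perm in permutations(others):
        route = [depot, *perm, depot]
        if verify_tsp(dist, depot, k, route):
            return route
    return None


# Small made-up map with four locations; location 0 is the depot.
dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
print(tsp_exhaustive(dist, 0, 18))
print(tsp_exhaustive(dist, 0, 17))
```

The verifier does a fixed amount of work per list entry, while the search may call it once for every ordering of the locations.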

Problems in NP

The travelling salesman problem has the following characteristics:

1. Given some candidate solution, it can be verified in polynomial time whether it is a correct solution.

2. There are too many candidate solutions to allow an efficient algorithm that exhaustively searches for a correct solution among the candidate solutions.

A candidate solution to a problem will be called a certificate for the problem.

NP: the class of all decision problems for which there exists a suitable notion of certificate and a polynomial-time algorithm to verify whether a certificate is an actual solution to the problem. (NP stands for Non-deterministic Polynomial time.)

Boolean formula satisfiability (SAT)

boolean satisfiability problem
Input: A boolean formula.
Output: Yes if the formula is satisfiable. No otherwise.

A boolean formula consists of 0/1-valued variables and operators AND, OR, NOT with the following interpretation:

x   y   x AND y   x OR y
0   0      0         0
0   1      0         1
1   0      0         1
1   1      1         1

x   NOT x
0     1
1     0

Examples:

1. x AND (NOT x)

2. (((NOT x) OR y) AND (x AND (NOT z)))

A boolean formula is satisfiable if there exists an assignment of 0/1-values to its variables such that the formula evaluates to 1.

A solution to the boolean satisfiability problem is a satisfying assignment; for a formula with n variables there are 2^n candidate assignments.

A certificate for the boolean satisfiability problem is an assignment; verification boils down to checking whether the assignment satisfies the formula.

So: the boolean satisfiability problem is in NP.
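Verification and exhaustive search for SAT can be sketched as follows. The representation of formulas as nested tuples and the names `evaluate`, `variables`, and `brute_force_sat` are assumptions made for this illustration; the two formulas are the examples given above.

```python
from itertools import product


def evaluate(formula, assignment):
    """Evaluate a nested-tuple formula under a 0/1 assignment (the verifier)."""
    op = formula[0]
    if op == "var":
        return assignment[formula[1]]
    if op == "not":
        return 1 - evaluate(formula[1], assignment)
    left = evaluate(formula[1], assignment)
    right = evaluate(formula[2], assignment)
    return left & right if op == "and" else left | right


def variables(formula):
    """Set of variable names occurring in the formula."""
    if formula[0] == "var":
        return {formula[1]}
    return set().union(*(variables(sub) for sub in formula[1:]))


def brute_force_sat(formula):
    """Try all 2^n assignments; return a satisfying one, or None."""
    names = sorted(variables(formula))
    for values in product((0, 1), repeat=len(names)):
        assignment = dict(zip(names, values))
        if evaluate(formula, assignment) == 1:
            return assignment
    return None


x, y, z = ("var", "x"), ("var", "y"), ("var", "z")
example1 = ("and", x, ("not", x))
example2 = ("and", ("or", ("not", x), y), ("and", x, ("not", z)))
print(brute_force_sat(example1))
print(brute_force_sat(example2))
```

Checking one assignment takes time proportional to the size of the formula; the search loop is what makes the naive algorithm exponential.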

Subset-sum

subset-sum problem
Input: A finite set S of positive integers and a target number t.
Output: Yes if S has a subset whose elements sum exactly to t. No otherwise.

Example:

If S is the set
{1, 2, 7, 14, 49, 98, 343, 686, 2409, 2793, 16808, 17206, 117705, 117993}

and t = 138457, then the subset
{1, 2, 7, 98, 343, 686, 2409, 17206, 117705}

is a solution.

A candidate solution is a subset of S. There are 2^|S| candidate solutions (where |S| is the number of elements of S).

A certificate for the subset-sum problem is a subset of S; verification boils down to checking that the numbers in the subset add up to t.

So: the subset-sum problem is in NP.
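The certificate check for the example above can be written in a few lines of Python. The function names `verify_subset_sum` and `brute_force_subset_sum` are hypothetical and only serve this sketch.

```python
from itertools import combinations


def verify_subset_sum(S, t, certificate):
    """Check that the certificate is a subset of S whose elements sum to t."""
    return certificate <= S and sum(certificate) == t


def brute_force_subset_sum(S, t):
    """Try all 2^|S| subsets, smallest first; return one with sum t, or None."""
    elements = sorted(S)
    for size in range(len(elements) + 1):
        for candidate in combinations(elements, size):
            if sum(candidate) == t:
                return set(candidate)
    return None


S = {1, 2, 7, 14, 49, 98, 343, 686, 2409, 2793, 16808, 17206, 117705, 117993}
t = 138457
certificate = {1, 2, 7, 98, 343, 686, 2409, 17206, 117705}
print(verify_subset_sum(S, t, certificate))
```

With 14 elements the exhaustive search is still feasible (2^14 subsets), but every additional element doubles the work.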

Hamiltonian-cycle decision problem

A hamiltonian cycle in an undirected graph is a path in the graph that starts and ends with the same vertex and visits each vertex exactly once.

hamiltonian-cycle problem
Input: An undirected graph.
Output: Yes if the graph has a hamiltonian cycle. No otherwise.

If the input graph has n vertices, then a candidate solution is any permutation of these n vertices with the first vertex in the permutation added at the end. There are n! candidate solutions.

A certificate for the hamiltonian-cycle problem is a permutation of the vertices of the graph with the first vertex added at the end. Verification boils down to checking whether the resulting list of vertices corresponds to a path in the graph.

So: the hamiltonian-cycle problem is in NP.
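A possible verifier for such a certificate is sketched below. The graph representation (a list of vertices plus a list of undirected edges) and the name `verify_hamiltonian_cycle` are assumptions for this illustration.

```python
def verify_hamiltonian_cycle(vertices, edges, certificate):
    """Check that the certificate is a permutation of the vertices, closed by
    repeating its first vertex, in which successive vertices are adjacent."""
    edge_set = {frozenset(e) for e in edges}   # undirected: ignore orientation
    return (
        len(certificate) == len(vertices) + 1
        and certificate[0] == certificate[-1]
        and sorted(certificate[:-1]) == sorted(vertices)
        and all(frozenset((a, b)) in edge_set
                for a, b in zip(certificate, certificate[1:]))
    )


# A square: 0 - 1 - 2 - 3 - 0.
square_vertices = [0, 1, 2, 3]
square_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(verify_hamiltonian_cycle(square_vertices, square_edges, [0, 1, 2, 3, 0]))
print(verify_hamiltonian_cycle(square_vertices, square_edges, [0, 2, 1, 3, 0]))
```

The check runs in time polynomial in the size of the graph, even though there are n! permutations to choose a certificate from.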

NP-complete problems

A problem is in the class NP if there exists a polynomial-time algorithm to verify whether a certificate is an actual solution to the problem.

A problem is NP-hard if the existence of a polynomial-time algorithm to solve the problem implies the existence of polynomial-time algorithms for all the other problems in NP.

A problem is NP-complete if it is NP-hard and in NP.

Theorem: The boolean satisfiability problem is NP-complete.

Stephen Cook Leonid Levin

Reduction revisited

A reduction from decision Problem A to decision Problem B consists of two parts:

1. a general method (algorithm) for transforming every question of problem A into a question of problem B

2. an argument that the B-answer to every transformed A-question can be interpreted as a (correct) answer to the original A-question. (Sometimes B-answers need to be negated.)

Note that if we have a solution for decision problem B and a reduction from A to B, then we can effectively use them to solve problem A.

Roughly, if there exists an efficient reduction from problem A to problem B, then problem A cannot be fundamentally harder than problem B.

[Figure: Problem A reduced efficiently to Problem B]

Reduction revisited

A polynomial-time reduction from decision Problem A to decision Problem B consists of two parts:

1. a polynomial-time algorithm for transforming every question of problem A into a question of problem B

2. an argument that the B-answer to every transformed A-question can be interpreted as a (correct) answer to the original A-question. (Sometimes B-answers need to be negated.)

Note that if we have a polynomial-time algorithm for decision problem B and a polynomial-time reduction from A to B, then we can use it to construct a polynomial-time algorithm for decision problem A (see next slide).

Polynomial-time reduction

Consider a polynomial-time reduction algorithm from A to B that converts every input x for A to an input y for B in such a way that:

1. If the answer in B for y is yes, then the answer in A for x is yes.

2. If the answer in B for y is no, then the answer in A for x is no.

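If the reduction algorithm runs in O(n^c) on inputs of length n and the algorithm for B runs in O(m^d) on inputs of length m, then the resulting algorithm for A runs in O(n^c + n^(cd)): the input y produced by the reduction has length at most O(n^c), so running the algorithm for B on y takes O((n^c)^d) = O(n^(cd)).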

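[Figure: the polynomial-time algorithm for A. Input x to A passes through the polynomial-time reduction algorithm from A to B, yielding input y to B; the polynomial-time algorithm for B then answers yes or no.]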

Composing polynomial-time reductions

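[Figure: input z to C passes through the polynomial-time reduction algorithm from C to A, yielding input x to A, and then through the polynomial-time reduction algorithm from A to B, yielding input y to B. The composition is a polynomial-time reduction algorithm from C to B; followed by the polynomial-time algorithm for B, which answers yes or no, it forms a polynomial-time algorithm for C.]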

If A is NP-hard, and there is a polynomial-time reduction algorithm from A to B, then from every problem C in NP there exists a polynomial-time reduction algorithm to B.

Therefore:

If A is NP-hard, and there exists a polynomial-time reduction algorithm from A to B, then B is NP-hard too!

So: to show that B is NP-hard, it suffices to reduce just one NP-hard problem to B (instead of all NP problems)!

The mother problem

[Figure: the NP-complete problems so far: boolean formula satisfiability]

Theorem: The boolean satisfiability problem is NP-complete.

Stephen Cook Leonid Levin

Subset-sum

subset-sum problem
Input: A finite set S of positive integers and a target number t.
Output: Yes if S has a subset whose elements sum exactly to t. No otherwise.

A certificate for the subset-sum problem is a subset of S; verification boils down to checking that the numbers in the subset add up to t.

So: the subset-sum problem is in NP.

The book explains the details of an intricate polynomial-time reduction from (a variant of) boolean satisfiability to the subset-sum problem.

So: the subset-sum problem is NP-hard.

Conclusion: the subset-sum problem is NP-complete.

Hamiltonian-cycle

hamiltonian-cycle problem
Input: An undirected graph.
Output: Yes if the graph has a hamiltonian cycle. No otherwise.

A certificate for the hamiltonian-cycle problem is some permutation of the list of vertices of the graph, with the first vertex of the list repeated at the end. Verification amounts to checking that the list is indeed a path in the graph.

So: the hamiltonian-cycle problem is in NP.

The book presents a polynomial-time reduction (in several steps) from (a variant of) boolean satisfiability to the hamiltonian-cycle problem.

So: the hamiltonian-cycle problem is NP-hard.

Conclusion: the hamiltonian-cycle problem is NP-complete.

A family tree of reductions

[Figure: family tree of NP-complete problems, now including hamiltonian-cycle and subset-sum]


Travelling Salesman (TSP)

travelling salesman decision problem



A certificate for TSP is some permutation of the list of n+1 locations starting and ending with the depot. Verification amounts to checking that every location is in the list and the sum of the distances is less than or equal to k.

So: TSP is in NP.

Reducing hamiltonian-cycle to TSP

A hamiltonian cycle in an undirected graph is a path in the graph that starts and ends with the same vertex and visits each vertex exactly once.

hamiltonian-cycle problem
Input: An undirected graph.
Output: Yes if the graph has a hamiltonian cycle. No otherwise.

To reduce the hamiltonian-cycle problem to the travelling salesman problem (TSP), assign a weight of 0 to all edges of the graph, and then add all missing edges (grey edges in the above graph) with a weight of 1. Any vertex can serve as the depot, and the bound is k = 0.

Then the original graph has a hamiltonian cycle if, and only if, the resulting graph has a TSP route of distance 0.
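The transformation can be sketched in Python as follows. Assumptions for this illustration: vertices are numbered 0 to n-1, vertex 0 is chosen as the depot, the TSP instance is a distance matrix, and the names `hamiltonian_cycle_to_tsp` and `tsp_decide` are hypothetical. The exhaustive `tsp_decide` is included only to try the reduction on small graphs; it is not a polynomial-time algorithm.

```python
from itertools import permutations


def hamiltonian_cycle_to_tsp(n, edges):
    """Map a graph on vertices 0..n-1 to a TSP instance (dist, depot, k).

    Existing edges get weight 0, missing edges get weight 1. The diagonal
    entries are never used by a route that visits every location once.
    """
    edge_set = {frozenset(e) for e in edges}
    dist = [[0 if frozenset((a, b)) in edge_set else 1 for b in range(n)]
            for a in range(n)]
    return dist, 0, 0   # depot 0 (any vertex would do), bound k = 0


def tsp_decide(dist, depot, k):
    """Exhaustive TSP decision procedure (exponential; illustration only)."""
    others = [v for v in range(len(dist)) if v != depot]
    for perm in permutations(others):
        route = [depot, *perm, depot]
        if sum(dist[a][b] for a, b in zip(route, route[1:])) <= k:
            return True
    return False


square = [(0, 1), (1, 2), (2, 3), (3, 0)]   # has a hamiltonian cycle
path = [(0, 1), (1, 2), (2, 3)]             # has none
print(tsp_decide(*hamiltonian_cycle_to_tsp(4, square)))
print(tsp_decide(*hamiltonian_cycle_to_tsp(4, path)))
```

Building the distance matrix takes O(n^2) steps, so the transformation itself is polynomial.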


We have sketched a polynomial-time reduction from the hamiltonian-cycle problem to TSP (see also the book).

So: TSP is NP-hard.

Conclusion: TSP is NP-complete.



[Figure: family tree of reductions, extended with travelling salesman]


Partition

partition problem
Input: A finite set S of positive integers.
Output: Yes if S can be partitioned into S1 and S2 such that the sum of the elements in S1 is equal to the sum of the elements in S2. No otherwise.

A certificate for the partition problem consists of two disjoint subsets S1 and S2 of S such that the union of S1 and S2 is S.

Verifying whether a certificate is a solution amounts to computing the sums of the elements of S1 and S2 and checking whether the sums are equal.

So: the partition problem is in NP.

There is a polynomial-time reduction from subset-sum to partition (presented on the next slide).

So: the partition problem is NP-hard.

Conclusion: the partition problem is NP-complete.

Reducing subset-sum to partition

subset-sum problem
Input: A finite set S of positive integers and a target number t.
Output: Yes if S has a subset whose elements sum exactly to t. No otherwise.

Let S, t be some input to the subset-sum problem.

Let z be the sum of the elements of S.

Let y be a number greater than both t+z and 2z; add y-t and y-z+t (both > z!) to S to obtain S’.

The sum of all elements of S’ is (y-z+t)+z+(y-t) = 2y, so in any partition of S’ both parts have sum y.

On the one hand, if S has a subset with sum t, then S’ can be partitioned: that subset together with y-t has sum y, and so do the remaining elements.

On the other hand, if S’ can be partitioned, then one of the parts contains y-t and the other contains y-z+t (together they sum to 2y-z > y, so they cannot be in the same part). The part containing y-t therefore also contains a subset of S with sum y-(y-t) = t.

So S has a subset with sum t if, and only if, S’ can be partitioned.

S’ can be obtained from S in polynomial time.
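The construction of S’ can be sketched in Python. Assumptions for this illustration: S’ is returned as a list rather than a set (so that the two added numbers stay distinct entries even when they happen to be equal), y is chosen as one more than the larger of t+z and 2z, and the names `subset_sum_to_partition` and `can_partition` are hypothetical. The helper `can_partition` decides partition by collecting all reachable subset sums; it is only meant for small inputs.

```python
def subset_sum_to_partition(S, t):
    """Return S' = S plus the two extra numbers y - t and y - z + t."""
    z = sum(S)
    y = max(t + z, 2 * z) + 1          # any y above both t + z and 2z works
    return list(S) + [y - t, y - z + t]


def can_partition(numbers):
    """Decide partition by collecting reachable subset sums (small inputs only)."""
    total = sum(numbers)
    if total % 2:
        return False
    reachable = {0}
    for x in numbers:
        reachable |= {r + x for r in reachable}
    return total // 2 in reachable


print(subset_sum_to_partition([3, 5, 8], 11))
print(can_partition(subset_sum_to_partition([3, 5, 8], 11)))   # 3 + 8 = 11
print(can_partition(subset_sum_to_partition([3, 5, 8], 4)))    # no subset sums to 4
```

The transformation only needs one pass over S to compute z, followed by a constant number of additions.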


We have presented a polynomial-time reduction from subset-sum to partition.

So: the partition problem is NP-hard.

Conclusion: the partition problem is NP-complete.



[Figure: family tree of reductions, extended with travelling salesman and partition]

Limits of computation

[Figure: problems classified as tractable, intractable, or unsolvable]

A decision problem is

tractable if there exists a polynomial-time algorithm that solves it

intractable if there exists an algorithm that solves it, but not a polynomial-time algorithm

unsolvable if there does not exist an algorithm that solves it

P=NP?

For a class of important decision problems (the NP-complete problems), it is unknown whether they are tractable or intractable.

Note: they are all tractable or all intractable!

The holy grail

Multiplication

33478071698956898786044169848212690817704794983713768568912431388982883793878002287614711652531743087737814467999489

x

36746043666799590428244633799627952632279158164343087642676032283815739666511279233373417143396810270092798736308917

=

1230186684530117755130494958384962720772853569595334792197322452151726400507263657518745202199786469389956474942774063845925192557326303453731548268507917026122142913461670429214311602221240479274737794080665351419597459856902143413

Factorization

1230186684530117755130494958384962720772853569595334792197322452151726400507263657518745202199786469389956474942774063845925192557326303453731548268507917026122142913461670429214311602221240479274737794080665351419597459856902143413

=

33478071698956898786044169848212690817704794983713768568912431388982883793878002287614711652531743087737814467999489

x

36746043666799590428244633799627952632279158164343087642676032283815739666511279233373417143396810270092798736308917

Factorization

That factorization is hard is actually a good thing: it is at the heart of public-key cryptography!

Factorization is actually in NP. (What’s the certificate?) So, public-key cryptography is in trouble in the (unlikely) case that P=NP.

It is unknown whether factorization is NP-hard.

Factorization currently essentially involves exhaustively searching the enormous search space of candidate divisors. But it may happen that an alternative to searching is found.

For the related problem of primality testing (which, at first sight, also involves searching), an alternative to searching has been found, and it has led to a polynomial-time algorithm!

Material

Chapter 10 discusses many NP-complete problems, and presents reductions between them. (Refer to chapter 9 again for an explanation of public-key cryptography and RSA.)

Deadline:

January 15, 2016
