Complexity of Decision Problems

Here are some examples of decision problems:

1.   Given a finite graph, does it have a Hamiltonian cycle?

2.   Given a finite graph, does it have an Euler cycle?

3.   Given a finite graph, can it be colored with 2 colors?   That is, is it bipartite?

4.   Given a finite graph, can it be colored with 4 colors?

5.   Given two finite graphs, are they isomorphic?

6.   Given a formula of propositional logic (which we don't study in this course), is it satisfiable?   That is, does some line in its truth table have the value T?

7.   Given a positive integer, is it a composite?   That is, is it a product of two smaller positive integers?

8.   Given a polynomial equation in one unknown x, and with integer coefficients, does it have a solution in the integers?

9.   Given a polynomial equation in several unknowns x_1, ..., x_n, and with integer coefficients, does it have a solution in the integers?  

10.   Given a program (and its data), will the program ever stop?   (This is called the halting problem in the section Analysis of algorithms.)

Each of these ten problems has the following features: First, there is some allowed finite input, such as a (finite) graph or a polynomial or a program.   There are infinitely many graphs (or polynomials or programs), but any one is a finite object, so it makes sense to give it to someone (or to a machine).   Secondly, there is some yes-or-no question that is to be answered.

Each of the examples listed above is an interesting decision problem, and it would be well worthwhile examining possible algorithms for solving that problem.   But that is not our goal here.   Instead, we want to discuss ways to measure the complexity of such an algorithm, as the inputs become larger and larger.

In each case, we can talk of the size of the input.   Roughly speaking, the size should measure how much space is required in order to send the input to someone.   For example, in deciding whether a number m is composite, we can use the number of digits in m as a measurement of its size; it takes twice as much space to send a 100-digit number as a 50-digit number.   A graph with 15 vertices can be described by a 15 x 15 symmetric matrix, its adjacency matrix; see the section on Representation of graphs.   We could reasonably use 15^2=225 as the size of this graph.

The two concepts I want to describe go by the names P and NP.

A decision problem is said to be polynomial-time solvable (or "in P," for short), if there is an algorithm and a polynomial q such that given any input of size n, the algorithm answers the question (correctly!) in no more than q(n) steps.   In other words, the worst-case time for the algorithm to solve the problem for an input of size n must be O(n^k) for some k.

For example, the Eulerian graph problem is in P.   The exhaustive search algorithm takes too long for the number of steps to be bounded by a polynomial.   But there is another way.   Let's look just at the connected case.   Euler's theorem tells us that a connected graph has an Euler cycle if and only if every vertex has even degree.   So if a connected graph with n vertices is given to us in the form of an n x n adjacency matrix, in a number of steps that is O(n^2) we could decide whether the graph is Eulerian or not.   And the function q(n^2)=n^2 is a polynomial.

For another example, the 2-colorability problem is in P.

The P concept is important in theoretical computer science, because polynomials grow at a modest rate, in contrast to such functions as f(n)=n! or g(n)=2^n.   (See Table 4.3.1 in Analysis of algorithms.)   A polynomial-time algorithm is feasible to execute on a computer.   An exponential-time algorithm is not (except for trivially small inputs).   Problems not in P are said to be intractable.

The NP concept, "non-deterministic polynomial-time solvable," is harder to explain.   A problem is said to be in NP if there is an algorithm with the following properties: First, if the answer is "yes" then there must exist certifying evidence (of polynomially bounded size) such that the algorithm, given both the input and the evidence, can verify the affirmative answer in a polynomially bounded number of steps.   Secondly, if the answer is "no," then of course there must exist no certifying evidence that the algorithm would accept.   (The book mentions NP only in passing.)

For example, the Hamiltonian graph problem is in NP (the certifying evidence is the Hamiltonian cycle); the 4-colorability problem is in NP (the certifying evidence is the coloring); the satisfiability problem in logic is in NP.   And in all of these cases, the problem is not in P as far as we know.

It is widely believed that P does not equal NP, and in particular that the three problems mentioned in the preceding paragraph are not in P.   But efforts to prove that P differs from NP have so far been unsuccessful.   The P vs. NP question remains the most important open problem in theoretical computer science.   (See the comment regarding "instant fame" in the section Hamiltonian cycles.   There is a million-dollar prize for solving it.)

The interest in the question is not merely idle curiosity.   There are cryptographic schemes that rely on the belief that problems like these cannot be answered in a feasible amount of time.   If, contrary to expectations, P and NP are the same, then these cryptographic schemes are not secure.   (If you find a way to solve all NP problems in polynomial time, the NSA would like to have a chat with you.)

Some decision problems are neither in P nor in NP.   Examples #9 and #10 have the feature that there is no algorithm that always gives the correct answer, even in the absence of time constraints.   (Such problems are said to be unsolvable.)

H. B. Enderton