Algorithms versus Problems. It is much easier to talk
about an algorithm than about a problem. Suppose we can prove
that an algorithm is O(n2) or even
Θ(n2). We then say the problem
this algorithm solves can be solved in time O(n2).
At least this gives an upper bound. We know the problem is
no harder than O(n2). But there might be another
algorithm that solves the problem more efficiently.
It is hard to show that a problem cannot be solved more efficiently
than by a given algorithm.
Polynomial Time Algorithms. Assume that the input to
a problem is of size given by a variable n. A
Polynomial Time Algorithm has worst-case time performance of
O(nk) for a fixed constant k. See
hierarchy for
a list of problems and algorithms, going from the easiest, through the
polynomial-time algorithms, past the NP-complete ones,
past the provably exponential-time ones, and finally to
the undecidable algorithms, which cannot be solved at all.
In saying that a problem is of size n, one assumes a succinct
representation of the problem, meaning that the size of one representation is
a polynomial time function of the size of any other.
A succinct representation must not use
an exponential amount of space compared with another representation.
For example, a very sparse graph might have a representation of
size O(n), whereas the incidence matrix representation would
be of size O(n2), with most entries 0.
However, this makes no difference, because the two representations
are within a polynomial multiple of one another.
On the other hand, if we take a fixed integer i in binary,
its representation will be of size n = O(log(i)),
since i is represented with log(i) bits.
Here i = 2n. On the
other hand, a representation using i "tick marks" will have
size O(i).
The ratio of sizes i / log(i) is the same as the
ratio 2n / n.
Thus the "tick marks" representation is exponentially bigger
that the normal one. For large enough n, 2n
will be larger than nk for any fixed k.
Thus "tick marks" does not qualify as a succinct representation.
Consider an algorithm that prints all 2n
integers starting with 1, and going up to 2n.
Printing 2n items will always take time
O(2n).
The class P: Polynomial Time Problems.
The class P consists of all problems that can be solved
in polynomial time (by some polynomial time algorithm).
Most of the problems we have looked are in
P. Consider however, the Subset Sum problem. For an input
list of n numbers, each (say) at most n bits long, the
dynamic programming algorithm will require 2n
amount of space, and so of course O(2n) time.
This is an exponential algorithm. (Obviously so because even the
tables used by the algorithm take up an exponential amount of space.)
There is no known polynomial-time
algorithm for Subset Sum, and it is not known if Subset Sum is
a polynomial time problem.
Decision Problems. In the theory of NP-completeness,
we will be dealing with decision problems,
meaning problems with a Yes or No answer.
In practice we are interested in optimization problems, where we
want a specific optimal solution to the problem.
For example, even with Subset Sum we might ask if a subset of
the numbers will add up to a given specific sum, but we would be
interested in the specific numbers that make up the sum and not
just whether or not the numbers exist.
As another example, the page
Zero-Knowledge Proofs looked at the Hamiltonian Circuit problem.
There we saw that if you had an efficient (= polynomial time)
algorithm to always answer
the question "Is there a Hamiltonian Circuit" with a "Yes" or "No",
then you could get an efficient algorithm to construct the actual
circuit. We can often covert an algorithm to solve a decision
problem into an algorithm to solve the corresponding
optimization problem.
The class NP: Nondeterministic
Polynomial Time Problems. This is a difficult class of problems
to understand. NP is the class of problems solvable on a
nondeterministic computer. From the beginning you should realize
that no such actual hardware computer can exist in the real world.
A nondeterministic computer can carry out a nondeterministic algorithm.
Such an algorithm proceeds in two stages:
Guessing stage: the algorithm "guesses" a structure S,
which either
is the answer to the problem or will help compute the answer.
Checking stage: the computer proceeds in a normal deterministic
fashion to use S in arriving at a Yes-No answer to the problem.
The checking stage must be done in polynomial time.
In effect S is the answer and the checking stage verifies
in polynomial time that S is the answer.
Summary of the difference between P and NP:
An algorithm in Pproduces the answer to a problem in
polynomial time.
A nondeterministic algorithm in NPverifies
in polynomial time that a guess is the answer to a problem.
The above shows that P is a subset of NP:
every problem in P is also in NP.
Experts conjecture that P is not equal to NP, but
no one has been able to prove this (the proof is worth a million
dollars right now, plus fame).
For example, if you are given a Hamiltonian circuit as the result of
a guess, it is a trivial polynomial time computation to check that
this is actually such a circuit.
Arbitrary parallelism: another view of NP.
Another model of a nondeterministic computer is one which at
any stage of computation allows an arbitrary degree of parallelism.
This means that the computation can fork off any number of
separate parallel computations. If any one of these separate
computations completes successfully, the whole computation completes.
This model is similar to a hardware model with separate cores and
threads, but here there is no overhead at all to the separate
computations, and no limit to the number of them. For example,
in a recursive backtracking search (as with depth-first search,
for example), each time a choice appears, the computation takes
both paths "at once". Using this model, we could imagine
starting a computation in parallel with each possible choice in
the previous model. Thus these two models are equivalent.
It should be emphasized that neither of these models can be
realized (implemented) in the real world.
Summary. If you can verify in polynomial time that
a given solution to a problem is correct, then the problem is
in NP. In most cases this verification is easy.
Any problem in P is also in NP.
Examples. Revision date:2012-04-15.
(Please use ISO 8601,
the International Standard Date and Time Notation.)