Binary encoding represents a number $n$ in $\lfloor \log_2 n \rfloor + 1$ bits. In unary, the same number needs $n$ symbols. The gap between the two is exponential.
This compactness is usually treated as a harmless implementation detail, but it is not. It reshapes the boundary between tractable and intractable.
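A two-line check of the gap (the value is arbitrary):

```python
# Arbitrary value for illustration: one million takes 20 binary
# symbols but a million unary symbols.
n = 10 ** 6
print(n.bit_length())  # 20
print(n)               # 1000000
```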
The Hidden Exponential
A dynamic programming algorithm for the subset sum problem runs in $O(nT)$ time, where $n$ is the number of items and $T$ is the target value. Under unary encoding, $T$ is proportional to the input length, so the algorithm is ordinary polynomial time. Under binary encoding, $T$ can be exponentially larger than the input length, so the same algorithm is exponential. The problem is weakly NP-hard precisely because its hardness evaporates when numbers are small, which is exactly what unary encoding enforces.
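As a minimal sketch of this dynamic programme (the function name and interface are illustrative, not taken from a specific source):

```python
def subset_sum(items: list[int], target: int) -> bool:
    """Decide whether some subset of `items` sums to `target`.

    Builds the full (n+1) x (T+1) table: reach[i][s] is True iff
    some subset of the first i items sums to s. Time and space are
    O(n * T): polynomial in the value T, but exponential in the
    O(log T) bits that encode it.
    """
    n = len(items)
    reach = [[False] * (target + 1) for _ in range(n + 1)]
    reach[0][0] = True  # the empty subset sums to 0
    for i, a in enumerate(items, start=1):
        for s in range(target + 1):
            reach[i][s] = reach[i - 1][s] or (s >= a and reach[i - 1][s - a])
    return reach[n][target]
```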
A dynamic programming table with dimensions $(n+1) \times (T+1)$ contains exactly $(n+1)(T+1)$ cells. For an instance $x$, let
$$b_n = \lfloor \log_2 n \rfloor + 1, \qquad b_T = \lfloor \log_2 T \rfloor + 1$$
be the bit-lengths of the two numeric fields. Then
$$n \ge 2^{b_n - 1}, \qquad T \ge 2^{b_T - 1},$$
so the total number of cells is
$$(n+1)(T+1) > 2^{b_n - 1} \cdot 2^{b_T - 1} = 2^{b_n + b_T - 2}.$$
The algorithm is polynomial in the numeric values $n$ and $T$, yet exponential in the number of bits that encode them. The loops hide an exponential blow-up inside what looks like a simple double iteration.
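A back-of-the-envelope computation makes the blow-up concrete; the values are chosen arbitrarily:

```python
# Arbitrary illustrative values: the two numeric fields occupy only
# ~48 bits, yet the table needs over 10^14 cells.
n, T = 100, 2**40
b_n = n.bit_length()        # 7 bits to write n in binary
b_T = T.bit_length()        # 41 bits to write T in binary
cells = (n + 1) * (T + 1)   # dynamic programming table size
print(b_n + b_T)            # 48
print(cells)                # 111050674405477 -> ~1.1 * 10^14
```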
Bi Knapsack is a concrete instance of this pattern. Its dynamic programme uses a table $D[i][c_1][c_2]$ indexed by the item count and two capacities $C_1$ and $C_2$, so its running time is
$$O(n \cdot C_1 \cdot C_2).$$
This is polynomial in the numeric capacities, but not necessarily polynomial in the encoded input. For instances with $C_1 = C_2 = 2^k$, each capacity needs only $k+1$ bits, while the table has
$$(C_1 + 1)(C_2 + 1) > 2^{2k}$$
capacity states. This is the exact sense in which the dynamic programme is pseudo-polynomial.
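A sketch of such a two-capacity dynamic programme, assuming items are given as (value, weight1, weight2) triples; the data layout and names are assumptions for illustration:

```python
def bi_knapsack(items: list[tuple[int, int, int]], c1: int, c2: int) -> int:
    """Maximise total value subject to two capacity constraints.

    Time O(n * C1 * C2): polynomial in the capacities' values,
    exponential in their bit-lengths.
    """
    # best[a][b] = best value achievable with capacities a and b
    best = [[0] * (c2 + 1) for _ in range(c1 + 1)]
    for value, w1, w2 in items:
        # Iterate downwards so each item is used at most once.
        for a in range(c1, w1 - 1, -1):
            for b in range(c2, w2 - 1, -1):
                cand = best[a - w1][b - w2] + value
                if cand > best[a][b]:
                    best[a][b] = cand
    return best[c1][c2]
```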
When Encoding Does Not Matter
Not every problem is rescued by unary encoding. A strongly NP-hard problem remains intractable even when every numeric parameter is bounded by a polynomial in the input length. The hardness comes from combinatorial structure, not from numeric magnitude.
Consider the travelling salesman problem. Its input is a graph with $n$ vertices and edge weights. Even if all weights are restricted to $\{1, 2\}$ and written in unary, the problem is still NP-hard. The exponential search space of $n!$ permutations does not shrink when the numbers are small. No pseudo-polynomial algorithm is known, and none exists unless $P = NP$.
The weights are tiny, but the number of tours grows factorially. Unary encoding makes the weight labels longer, yet the permutation space is unchanged. The algorithm must still enumerate an exponential number of structures.
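A brute-force sketch makes this visible: the weights appear only inside the cost sum, while the loop count is fixed by the vertex count. Representing the graph as a full weight matrix is an assumption for illustration:

```python
from itertools import permutations

def tsp_brute_force(weights: list[list[int]]) -> int:
    """Return the cost of the cheapest tour over all permutations.

    Fixing vertex 0 as the start still leaves (n-1)! tours; the
    magnitude of the weights never enters the loop count.
    """
    n = len(weights)
    best = None
    for perm in permutations(range(1, n)):
        tour = (0, *perm, 0)
        cost = sum(weights[a][b] for a, b in zip(tour, tour[1:]))
        if best is None or cost < best:
            best = cost
    return best
```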
The Quantity Space
The clean dichotomy has a deeper cause. The question is whether the problem’s difficulty comes from iterating over a space whose size is a direct function of the numeric values themselves.
For the subset sum problem, the dynamic programming table has dimensions $(n+1) \times (T+1)$. The state space size is proportional to $T$ itself:
$$|S| = (n+1)(T+1) = \Theta(nT).$$
The function $T \mapsto |S|$ is essentially the identity on the numeric value. Because binary encoding compresses $T$ into $b_T = \Theta(\log T)$ bits, iterating over all values up to $T$ implicitly iterates over $2^{\Theta(b_T)}$ states. The exponential blow-up is hidden inside the loops.
For the travelling salesman problem, the search space is the set of all permutations of $n$ vertices:
$$|S| = n!$$
The edge weights are merely coefficients in the objective function. Restricting weights to $\{1, 2\}$ does not reduce $|S|$. The combinatorial explosion comes from the factorial of the vertex count, not from any numeric magnitude.
Even if we treat $n$ itself as a binary-encoded number with bit-length $b = \Theta(\log n)$, so that $n = 2^{\Theta(b)}$, the search space $n!$ is already super-polynomial in $n$. Under unary encoding the input length becomes $\Theta(n)$, yet $n!$ is still not polynomial in $n$. The encoding of the vertex count does not matter because the combinatorial explosion outruns the compression.
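One can verify numerically that the factorial outruns any fixed polynomial; the exponent 10 below is an arbitrary stand-in for "any fixed $k$":

```python
# For any fixed exponent k, n! eventually exceeds n**k, so even a
# unary input length of Theta(n) cannot make n! polynomial.
import math

k = 10  # arbitrary fixed exponent for illustration
n = 2
while math.factorial(n) <= n ** k:
    n += 1
print(n)  # 15: from here on, n! dominates n**10
```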
Uniform Polynomial Bounds
Fix an encoding of problem instances, and let $|x|$ be the length of instance $x$ under that encoding. An algorithm $A$ is polynomial-time if there exist fixed constants $c > 0$ and $k$ such that
$$\mathrm{time}_A(x) \le c \cdot |x|^k \quad \text{for all instances } x.$$
The encoding is fixed for the problem. It is not chosen separately for each instance. Polynomial-time solvability is stable under standard encodings whose lengths are polynomially related. Binary and unary encodings of numeric values are not polynomially equivalent, since a value $v$ has binary length $\Theta(\log v)$ but unary length $v$.
The exponent $k$ is chosen once for the whole algorithm. It cannot change with the instance. In particular, one cannot prove polynomial time by saying that, for each instance $x$, there exists some exponent $k_x$ with
$$\mathrm{time}_A(x) \le c \cdot |x|^{k_x}.$$
That would be meaningless: almost any finite runtime can be bounded this way by choosing a large enough instance-dependent exponent.
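A tiny computation shows why. For any runtime $t \ge 1$ and any input length $|x| \ge 2$, an instance-dependent exponent can be solved for exactly (the values below are hypothetical):

```python
# Instance-dependent exponents are vacuous: k_x = log(t) / log(|x|)
# makes t == |x| ** k_x hold exactly, whatever t is.
import math

def instance_exponent(runtime: float, input_len: int) -> float:
    return math.log(runtime) / math.log(input_len)

k_x = instance_exponent(2.0 ** 100, 100)
print(k_x)  # ~15.05: even 2^100 steps "fit" under |x| = 100
```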
Now let $T$ be a numeric quantity used by the algorithm, and define its binary length by
$$b_T = \lfloor \log_2 T \rfloor + 1.$$
Every positive integer satisfies $2^{b_T - 1} \le T < 2^{b_T}$. This alone says nothing about polynomial time. What matters is whether there is a fixed exponent $k$ such that
$$T \le c \cdot |x|^k \quad \text{for all instances } x.$$
If such a $k$ exists, then iterating over $T$ values can still be polynomial in the input length. If no such fixed $k$ exists, then iterating over $T$ values may be exponential in the input length.
Minimum Search
For minimum search, the input is an explicit array
$$x = \langle a_1, a_2, \ldots, a_n \rangle.$$
Here $T = n$, the number of entries scanned. Since the instance explicitly contains $n$ entries,
$$|x| \ge n.$$
So we can choose the fixed exponent $k = 1$:
$$T = n \le |x|^1.$$
A linear scan therefore satisfies
$$\mathrm{time}(x) = O(n) = O(|x|).$$
This is polynomial-time with the fixed exponent $k = 1$.
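For completeness, the scan itself, as a minimal sketch:

```python
def minimum(a: list[int]) -> int:
    """Minimum search: the iterated quantity T equals n = len(a),
    and n is at most the input length, so k = 1 works everywhere."""
    best = a[0]
    for v in a[1:]:     # exactly n - 1 comparisons
        if v < best:
            best = v
    return best
```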
Subset Sum
For subset sum, the input contains a target value $T$ as one binary field:
$$x = \langle a_1, a_2, \ldots, a_n;\; T \rangle.$$
Here $T$ is the quantity the algorithm iterates over. The field has length
$$b_T = \lfloor \log_2 T \rfloor + 1.$$
There is no fixed exponent $k$ such that
$$T \le c \cdot |x|^k \quad \text{for all instances } x.$$
For every fixed $k$, one can choose an instance with $T$ so large that
$$T > c \cdot |x|^k.$$
Equivalently, $b_T$ may be linear in $|x|$, so
$$T \ge 2^{b_T - 1} = 2^{\Omega(|x|)}$$
may be exponential in the input length. The dynamic programming algorithm creates $T + 1$ columns, so
$$\mathrm{time}(x) = \Omega(T),$$
which is not bounded by $c \cdot |x|^k$ for any fixed $k$.
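A concrete pathological instance, with arbitrarily chosen numbers, shows how short the input can be while the column count explodes:

```python
# Arbitrary illustrative instance: a handful of items plus a 65-bit
# target. The written input is under 100 bits, yet the DP table
# would need ~1.8 * 10^19 columns.
items = [3, 7, 11]
target = 2 ** 64
input_bits = sum(v.bit_length() for v in items) + target.bit_length()
columns = target + 1
print(input_bits)  # 74
print(columns)     # 18446744073709551617
```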
In sum: for minimum search, $T = n \le |x|$ with the fixed exponent $k = 1$; for subset sum, $T$ can be $2^{\Omega(|x|)}$, and no fixed exponent works. So the point is not that $b_T$ exists. It always exists. The point is whether the exponent bounding $T$ in terms of $|x|$ is a fixed constant independent of the instance.
Beyond NP-Hardness
This gives the clean dichotomy. For weakly NP-hard problems, the hardness lives in the encoding gap because the state space grows with the numeric value. For strongly NP-hard problems, the hardness lives in the combinatorial structure, and no amount of decompression can remove it.
This means the complexity class of a weakly NP-hard problem can depend on how its input is written. The hardness can come from numeric parameters being compressed efficiently. Remove the compression, and that source of hardness disappears.
Therefore, binary encoding is not just compact—it is the source of the exponential gap that makes weakly NP-hard problems intractable.