Exercise 1
Instruction
Recall the reduction from 3-SAT to Subset Sum presented in the second lecture. Assuming that the Exponential Time Hypothesis holds, what does this reduction imply for the possible running times of algorithms solving Subset Sum? In particular, could Subset Sum then admit an algorithm running in time:
Let the 3-SAT formula have variables and clauses. We want to reduce it to subset sum by creating a set and a target sum such that
To solve this, we attach a quantity to each clause , where each clause should have a quantity of if is satisfiable.
We reduce to subset sum by constructing a set that contains
- two numbers for each variable , i.e. and
- two fill-up numbers for each clause
Fill-up numbers
For each clause , let count how many selected assignment numbers satisfy .
If is satisfied, then
If is not satisfied, then
The target asks for every clause digit to equal . Therefore, for each clause, we add two fill-up numbers with value in that clause digit and everywhere else.
These can “repair” the satisfied cases:
But they cannot repair an unsatisfied clause:
Thus the fill-up numbers make “at least one true literal” equivalent to “the clause digit can be filled to ”.
Hence, the constructed set has cardinality
since we create two numbers and for each variable, and two fill-up numbers for each clause. Thus is linearly bounded in the size of the original 3-SAT formula.
The target sum is built digit by digit. There is one digit for each variable and one digit for each clause. In every variable digit, has value , forcing the subset to choose exactly one of and . In every clause digit, has value , forcing the selected assignment numbers together with the fill-up numbers to fill each satisfied clause exactly to . Thus, if there are variables and clauses, the target has the form
Shape of one element
Suppose there are variables and clauses. Then every number in has digits.
If setting to true satisfies only clause , then the corresponding assignment number is
The variable part says that this number belongs to . The clause part says that this truth choice contributes only to .
A fill-up number for would look like
since it contributes only to the first clause digit and does not choose any variable value.
If subset sum had an algorithm with running time , then we could first reduce a 3-SAT formula to a subset-sum instance with and then run . This would give a 3-SAT algorithm with running time , up to polynomial factors from the reduction.
If the exponential-time hypothesis is true, then 3-SAT cannot be solved in time
where is the small-o notation, i.e. strict asymptotic upper bound, and is the big-o-star notation, i.e. the asymptotic upper bound suppressing polynomial factors. Therefore, the reduction above rules out a subexponential algorithm w.r.t. under the assumption of ETH for subset sum.
Solution
Assuming ETH being true, then:
- is allowed , since
which is still single-exponential in , not subexponential.
- is also allowed, given that the exponent
is still linear in . ETH does not forbid improving the constant in a linear exponent. That would be the role of SETH.
- is not allowed, as its exponent is sublinear:
as
given that . Since , this would give a 3-SAT algorithm running in
contradicting ETH.
Exercise 2
Instruction
Consider the following variant of the classical graph 4-colouring problem. In Equitable 4-Colouring, we are given a graph and are asked for a proper 4-colouring (i.e., one where neighbours do not receive the same colour) but where additionally each colour is used essentially the same number of times (). To avoid any confusion, a formal definition of the problem is provided below.
Equitable 4-Colouring Input: A connected -vertex graph Question: Does there exist a partition of the vertices of into four sets such that
- for each , each is an independent set, and
- for each and , ?
EQUITABLE 4-COLOURING admits a simple brute-force exponential algorithm that runs in time and simply exhaustively enumerates all possible assignments of colours to the vertices (whereas for each assignment, one can check in polynomial time whether conditions (1) and (2) are satisfied).
The task is to design a single-exponential algorithm for EQUITABLE 4-COLOURING with a running time that has a better base of exponent than , i.e., one with running time for some .
A 4-colouring is a function . If we ignore all graph structure, then every vertex has four possible colours. Hence the naive search space is
So the base comes from the fact that we treat every vertex as if all four colours were always possible. To improve the base, we need to force the situation that, for most vertices, at least one colour is already unavailable, so that the search has running time for some .
The proper-colouring condition says that neighbours must not have the same colour assignment, i.e.
So if a vertex already has an already-coloured neighbour , then is forbidden from using colour . Since there are at most four colours, this leaves at most possible colours for .
Given that is connected, it has a DFS tree. Pick an arbitrary root vertex . Then order the remaining vertices according to DFS discovery order
where the DFS run takes polynomial time.
For every , the vertex has a parent in the DFS tree. That parent appears earlier in the order. Hence
So when colouring , it has at least one already-coloured neighbour.
Further, we can fix the root node’s colour to an arbitrary colour, e.g.
given that the names of the colours don’t matter, only the constraints they impose on the graph. Hence, we only have to search through 3 colours for each of the remaining vertices. Therefore, with a fixed root, we have a runtime complexity of
which is single-exponential given that the exponent is linearly bounded.
For every completed colouring, we then check in polynomial time whether it is actually a valid solution. First, we check the proper-colouring condition, i.e. whether adjacent vertices receive different colours. Then we form the four colour classes and check the equitable size condition
These checks only add a polynomial factor, so they are hidden by the notation.
Correctness
If the algorithm finds a solution, the produced partition
satisfies that every is an independent set, because adjacent vertices have different colours.
It also satisfies the equitable condition, because the algorithm only accepts a colouring after checking that the four colour classes differ in size by at most one. In other words, for all , it verifies that
Hence the accepted partition is not only a proper 4-colouring, but an equitable 4-colouring.
Conversely, suppose that has an equitable 4-colouring. Since the names of the four colours are arbitrary, we may rename the colours so that the root receives colour . The algorithm fixes exactly this root colour and then enumerates all valid colour choices for the remaining vertices. Therefore, it eventually considers this equitable colouring. Since it is proper and its colour classes satisfy the size condition, the algorithm accepts it.
Exercise 3
Instruction
Design a subexponential algorithm for SAT when the input is restricted to instances such that , where is the number of clauses and the number of variables.
Given a SAT formula over variables and clauses of form
with the promise that we want to find a subexponential algorithm, i.e. an algorithm that has a runtime of .
A naive SAT algorithm enumerates all assignments
i.e. has a naive runtime of . However, we’re given a upper boundary for the number of clauses, which means that it makes more sense to branch over clauses rather than variables.
CNF formula is satisfiable iff every clause has at least one true literal. Further, each clause in has at most distinct literals, given that each variable has only two possible literals:
So the number of possible choices is at most
Given , this is at most
For every tuple do the following:
- Remove duplicate literals, if any.
- Check whether the chosen literals are consistent, i.e. there must not be a variable such that and are in the chosen literals.
- If the chosen literals are consistent, assign all of them to true.
Example
If the chosen literals are , then set
- Assign all remaining variables arbitrarily.
- Accept.
If no tuple of chosen literals is consistent, reject.
Boolean SubexpSAT(Formula F)
let F = C_1 ∧ C_2 ∧ ... ∧ C_m;
let α := empty partial assignment;
for each (ℓ_1, ℓ_2, ..., ℓ_m) ∈ (C_1 × C_2 × ... × C_m) {
α := empty partial assignment;
consistent := true;
for i := 1 to m {
if (ℓ_i = x) {
required := true;
}
if (ℓ_i = ¬x) {
required := false;
}
if (x ∈ dom(α)) {
if (α(x) ≠ required) {
consistent := false;
break;
}
} else {
α(x) := required;
}
}
if (consistent = true) {
return true;
}
}
return false;
Correctness
Suppose the algorithm accepts, then it found one chosen literal
for every clause , and all chosen literals are mutually consistent. So we can assign all chosen literals to true. Therefore every clause contains at least one true literal, namely , i.e. the whole formula is satisfied.
Conversely, suppose is satisfiable. Let be a satisfying assignment. Since satisfies every clause, every clause contains at least one literal that is true under . Choose one such literal from every clause. The tuple
is consistent, because all chosen literals are true under the same assignment . The algorithm enumerates this tuple and accepts.
Therefore, the algorithm is correct.
Running Time
For each clause, there are at most possible literals. Hence, the number of tuples is at most . Using the promise , we get
Given that , we have
The consistency check for each tuple is polynomial time, so it is hidden inside . Thus the running time is
Therefore, SAT with can be solved in , i.e. subexponential time.
Exercise 4
Instruction
Consider an NP-complete graph problem . Assume admits a polynomial reduction from 3-SAT which transforms each -variable 3-SAT formula into an instance of with at most vertices.
Now, say we have an algorithm that solves any -vertex instance of . Under the Exponential Time Hypothesis, which of the following running times can we exclude for ?
We’re given an NP-complete graph problem that with
- admits a polynomial-time reduction 3-SAT
- solves in time for vertices.
We know that after the reduction, the number of vertices is constrained by
If , then 3-SAT can be solved in subexponential time, contradicting ETH.
Solution
- : Given that
we know that
Given that the exponent is worse than linear, i.e. is not subexponential, (1) is not excluded under the ETH.
- : Given that
it follows that
Again, the running time is not subexponential, meaning that (2) is not excluded under the ETH.
- : From
it follows that
Again, (3) is not subexponential, i.e., it is not excluded under ETH.
- : From
it follows that
This result would conclude that we have found a polynomial-time algorithm (polynomial reduction + polynomial algorithm) that can solve 3-SAT. This contradicts the ETH, i.e., it is excluded.
Exercise 5
Instruction
For each of the following basic scheduling algorithms, construct an example instance of where the algorithm fails.
- Earliest Due Date (EDD) First
- Earliest Release Date First
- Shortest Processing Time First
The notation
means that
- we have one machine,
- job can’t be processed before (its release time), and
- we have no objective function.
Earliest Due Date First
The earliest due date first algorithm selection policy chooses, at a time , the available job with the earliest due date .
Take two jobs with:
- :
- :
At , only can be chosen, because has not yet been released. Hence EDD chooses .
At , only remains. But cannot meet its deadline anymore, since it would complete at time :
This is not because the instance itself is infeasible. A feasible schedule is obtained by idling until and then processing
Earliest Release Date First
The earliest release date first algorithm selection policy chooses, at a time , the available job with the earliest release date .
Take three jobs with:
- :
- :
- :
At , are available with equal release times. The algorithm chooses ; this is the best tie-breaking choice, since it meets the tight deadline
At , the algorithm can choose between and . It chooses , because
At , only remains. It would complete too late, since
This is not because the instance itself is infeasible. A feasible schedule is obtained by processing
Shortest Processing Time First
The shortest processing time first algorithm selection policy chooses, at a time , the available job with the shortest processing time .
Take two jobs with:
- :
- :
At , both jobs are available. The algorithm chooses , since
At , only remains. It would complete too late, since
This is not because the instance itself is infeasible. A feasible schedule is obtained by processing
Exercise 6
Instruction
Consider the following three schedules for (i.e., the scheduling problem with one machine where the aim is to minimize the sum of completion times) for an instance with precisely 4 jobs :
Assume that the job durations satisfy . Under this assumption, answer the following two questions:
- Do and always have the same sum of completion times?
- Do and always have the same sum of completion times?
In each case, if the answer is yes then provide a proof sketch, and otherwise provide a counterexample.
The notation
means that there’s one machine, no further constraints, and the objective function is the total completion time .
Let’s take first. The completion times are
- ,
- ,
- , and
- .
So the total completion time for is
For , the completion times are
- ,
- ,
- , and
- .
Thus, the total completion time for is
For , the completion times are
- ,
- ,
- , and
- .
Thus, the total completion time for is
(1) Do and always have the same sum of completion times?
If and have the same sum of completion times, then the difference of their sums should be .
Given the assumption
we get
Therefore, the completion times of and always have the same sum of completion times under the assumption..
(2) Do and always have the same sum of completion times?
If and have the same sum of completion times, then the difference of their sums should be .
Under the assumption above, it doesn’t follow that the difference of total completion times is . Therefore, and do not always have the same sum of completion times.
Counterexample
A counterexample is
Plugging those processing times into the equation gives