Lukas' Notes

Exercise 1

Instruction

Consider the following scheduling problem: There is one machine, which is unavailable for a given period (i.e., no job can be scheduled during this time). The jobs are released at time and cannot be preempted. We want to compute a schedule that minimises lateness.

Is the problem above NP-hard or polynomial time solvable? In case of the former, provide a reduction. In case of the latter, provide an algorithm with a correctness argument and runtime analysis.

Minimising lateness means minimising the metric

where is the completion time and is the deadline of job . So the schedule should minimise the maximum lateness of the jobs.

We reduce from subset sum by forcing an unavailable time interval . Given a subset sum instance

Let

Further, assume that is an upper bound of , i.e., . Otherwise, if , then the instance is immediately a no-instance.

Next, create one scheduling job for each number . Set the processing time to , the deadline to , and the release time to , i.e.,

We make the one machine we have unavailable during

and no job can be preempted. We want to decide whether

i.e., no job is late and thus must finish no later than the common deadline .

This completes the reduction: the subset sum instance is mapped to the scheduling instance with jobs , common deadline , and one unavailable interval .

The mapping is polynomial: it creates one job per number and computes only , , and the unavailable interval . Since the correctness proof shows equivalence with subset sum, deciding whether the constructed instance has a schedule with is NP-hard. Therefore, computing a schedule that minimises maximum lateness for one machine with one unavailable interval is also NP-hard.

Exercise 2

Instruction

Consider the problem

where is a function (that may be different for each job ). For example, if

then the problem is equivalent to

(i.e., minimising the sum of completion times with preemption constraint). For another example, if

then the problem is then equivalent to

(i.e., maximising throughput with preemption constraint).

First, show that if the functions are nondecreasing (i.e., for , ), there exists an optimal schedule that is nonpreemptive.

Second, answer the following question: “Does the result hold for arbitrary functions ?“. Briefly justify your answer.

Let be a non-decreasing function for a job , i.e.,

In other words, if the completion time is before completion time , then preserves that before-after relation.

An optimal schedule of a scheduling problem is a solution where is minimal. In the problem above, it means that

where is the smallest (best) feasible objective.

(1)

We construct a new schedule that processes jobs non-preemptively in the order

Therefore, the completion time of job in is the sum of preceding (reflexive) processing times:

Let be an optimal preemptive schedule with , and let the jobs be ordered by their completion times in :

In , by time , all jobs have completed. Therefore, must have processed at least

units of work by time . Given there’s only one machine, no parallelism is allowed, i.e.:

From above, we know that

In other words, every job completes no later in than in . Further, we know that is non-decreasing, thus:

Hence:

By definition is already optimal, i.e., the smallest feasible value. We conclude that

which means that both and are optimal. Given that is non-preemptive, there exists an optimal non-preemptive schedule .

(2)

No. The following two-job instance is a counterexample.

Consider two jobs with

In any non-preemptive schedule, the completion times are either

With preemption, however, we can schedule

which yields

Now define arbitrary objective functions that reward exactly these preemptive completion times:

The preemptive schedule has objective value , while every non-preemptive schedule has objective value at least . Thus, for arbitrary functions , there need not exist an optimal non-preemptive schedule.

Exercise 3

Instruction

Let

Assume

Compute the prefix function, the suffix tree, and the suffix array of .

(Note: Some of the concepts in this exercise will be discussed in the lecture on 2 June.)

border
a
ab
abc
abcd
abcdaa
abcdabab
abcdabcabc

with a sorting order

$ < a < b < c < d,

i.e., in lexicographic order, $ is the first suffix, a the second, etc.

This makes $ the first suffix in lexicographic order. It also separates suffixes cleanly: for example, abc$ is not hidden inside abcdabc$.

First list all suffixes of :

startsuffixshared prefix
abcdabc$: abc
bcdabc$: bc
cdabc$: c
dabc$none
abc$: abc
bc$: bc
c$: c
$none

Only have no shared prefixes, thus, they become direct leafs from the root. The suffix tree is the compressed trie of these suffixes

Each orange leaf stores the starting position of its suffix in $S$$.

Sort the same suffixes lexicographically and assign a rank to each:

suffixrank
$
abc$
abcdabc$
bc$
bcdabc$
c$
cdabc$
dabc$

Thus, my suffix array is:

Exercise 4

Instruction

In the lecture, the runtime analysis of the Knuth-Morris-Pratt Algorithm is an example of the sliding window technique. This technique can be applied in many problems with “linear” structure, as in the following example.

Suppose we are given as input a number and an array sorted in increasing order. The task is to find , such that .

The algorithm based on the sliding window technique for this problem is as follows. We start with and and repeatedly perform the following procedure:

  • If , then we increase by .
  • If , then we decrease by .

We stop when , when , or when . In the latter case, we conclude that no such and satisfy the requirement.

Your tasks for this exercise are:

  • Determine the running time of the algorithm above (in notation, but with providing a bound which is as tight as possible).
  • Argue that the algorithm correctly solves the problem.
(int, int) Search(Array a, int k)

assert a is sorted in increasing order

int n = |a|;

// 1-based indexing
int i = 1;
int j = n;

while i <= n and j >= 1 {
	int s = a[i] + a[j];
	if s == k {
		return (i, j);
	} else if s < k {
		// left-side is not big enough
		i += 1;
	} else {
		// right-side is too big
		assert s > k;
		j -= 1;
	}
}

return null;

Runtime Analysis

In every loop iteration, exactly one of these happen (unless the algorithm immediately returns):

  1. , or
  2. .

Each iteration performs only constant-time work, i.e., , and we loop at most iterations given that

Further, we only search over , so the running time is . Hence

Correctness

Given an instance , where is sorted in increasing order.

If the algorithm returns a pair , then it has checked

Therefore, the returned pair is a correct solution.

Now suppose the algorithm returns . We show that every pointer move only discards candidates that cannot be part of a solution.

Sum too small

Assume that

Since is sorted increasingly, every still available right index satisfies

Hence

So is too small even with the largest still available right component, which means that no pair using can be a solution anymore and that it is safe to increase .

Sum too large

Assume that

Since is sorted increasingly, every still available left index satisfies

Hence

So is too large even with the smallest still available left component, which means that no pair using can be a solution anymore and that it is safe to decrease .

Thus, every step either returns a correct pair or removes only impossible candidates, so if the algorithm terminates with because or , no candidate pair remains although only impossible candidates were removed. Hence, no valid pair exists.

Therefore, the algorithm correctly solves the problem.

Exercise 5

Instruction

Given a string and a natural number , describe an -time algorithm to find the maximum border of , such that is a multiple of . Briefly justify the correctness and running time bound of your algorithm.

Given a string and a natural number , let . To do that, we can use the prefix function, as

So if we take , we have the longest border of

Let . If is divisible by , we can return . Otherwise, we have to find a smaller border that’s divisible by .

Since is the current border, every smaller border of must also be border of . Thus, we can ask for the longest border of :

Pseudocode

String MaximumBorder(String S, int k)

assert k >= 2

int n = |S|;
compute prefix function π for S

int ℓ = π(n-1);
while ℓ > 0 and ℓ mod k != 0 {
	ℓ = π(ℓ-1);
}

return S[0 .. ℓ-1];

Correctness

The algorithm starts with , which is the length of the longest border of .

Whenever the current is not divisible by , the border cannot be the answer, and

moves to the next smaller border of , because every smaller border of is also a border of the current border .

Thus, the loop visits the border lengths of from largest to smallest and stops at the first one divisible by , which is therefore maximal. If the loop reaches , no positive valid border exists, so returning is correct.

Runtime Analysis

Computing the prefix function takes time.

During the loop, strictly decreases because . Thus, the loop performs at most iterations, and each iteration does constant work.

Hence, the total running time is .

Exercise 6

Instruction

For a string , we want to find the length of the longest subsequence that has at least two distinct occurrences in . Here, two occurrences of a subsequence are distinct, if they differ in at least one index. More formally, we want to find the length of a longest string , such that for some

and

  • for all (i.e., these indices encode the same subsequence); and
  • there exists such that (i.e., the two occurrences are distinct).

Describe an -time algorithm to solve this problem. Briefly justify the correctness and running time bound of your algorithm.