Skip to content# The Algorithm Design Manual - Notes

## Chapte 1 — Introduction to Algorithm Design

### Robot Tour Optimization

#### Nearest-neighbor heuristic

#### Closest-pair heuristic

### Reasoning about Correctness

#### Problems and Properties

#### Expressing Algorithms

#### Demonstrating Incorrectness

### Induction and Recursion

#### Inductive proof for Insertion sort

### Modeling the Problem

#### Combinatorial Objects

#### Recursive Objects

### Proof by Contradiction

### Estimation

### Exercises

#### Finding counter examples

#### Induction

— notes — 8 min read

Pet Peeve: The author sometimes skips steps without an explaination (like the integer truncation in "stop and think: incremental correctness"). Some examples are hard to follow for an international student (like understanding the lottery system in "war story: pschic modeling").

An

`algorithmic problem`

is specified by describing the complete set of**instances**it must work on and of its output after running on one of these instances.An

`algorithm`

is a procedure that takes any of the possible input instances and transforms it to the desired output.There is a distinction between a general problem and an instance of a problem. E.g:

**Problem**: Sorting

**Input**: A sequence of`n`

keys a_{1},...,a_{n}.

**Output**: The permutation (reordering) of the input sequence such that: a′_{1}≤ a′_{2}≤ ··· ≤ a′_{n−1}≤ a′_{n}.

**Instance of sorting problem**: { Mike, Bob, Sally}; { 154, 245 }Insertion sort is an algorithm to the sorting problem:

**English description**:Start with a single element (thus forming a trivially sorted list) and then incrementally inserts the remaining elements so that the list stays sorted.

**Pseudocode**:1array = input sequence2n = array size3i = 045while i < n6 j = i7 while j > 0 AND array[j] < array[j - 1]8 swap element at `j` with element at `j - 1` in array9 decrement j by 110 increment i by 1**Code**:1insertion_sort(item s[], int n)2 {3 int i,j; /* counters */4 for (i=0; i<n; i++) {5 j=i;67 while ((j>0) && (s[j] < s[j-1])) {8 swap(&s[j],&s[j-1]);9 j = j-1;10 }11 }12 }An animation of the logical flow of this algorithm on a particular instance (the letters in the word

`“INSERTIONSORT”`

)There is a fundamental difference between algorithms, which always produce a correct result, and heuristics, which may usually do a good job but without providing any guarantee.

Three desirable properties for a good algorithm:

- correct
- efficient
- easy to implement

Correct algorithms usually come with a proof of correctness, which is an explanation of why we know that the algorithm must take every instance of the problem to the desired result.

Problem: Robot Tour Optimization (aka: Traveling Salesman Problem [TSP]).

Input: A set`S`

of`n`

points in the plane.

Output: What is the shortest cycle tour that visits each point in the set`S`

?

- English Description:
Starting from some point

`p0`

, we walk first to its nearest neighbor`p1`

From`p1`

, we walk to its nearest unvisited neighbor. Repeat this process until we run out of unvisited points After which we return to`p0`

to close off the tour. - Pseudocode:1NearestNeighbor(P)2 Pick and visit an initial point p₀ from P3 p = p₀4 i = 056 While there are still unvisited points7 i = i + 18 Select pᵢ to be the closest unvisited point to pᵢ₋₁9 Visit pᵢ10 Return to p₀ from pₙ₋₁
- Pros:
- Easy to understand & implement
- Reasonably efficient

- Cons: It's wrong — It always finds a tour, but it doesn’t necessarily find the shortest possible tour. E.g: A bad instance for the nearest-neighbor heuristic (top) & the optimal solution (bottom):

- English description:
Repeatedly connect the closest pair of endpoints whose connection will not create a problem, such as premature termination of the cycle. Each vertex begins as its own single vertex chain. After merging everything together, we will end up with a single chain containing all the points in it. Connecting the final two endpoints gives us a cycle. At any step during the execution of this closest-pair heuristic, we will have a set of single vertices and vertex-disjoint chains available to merge.

- Pseudocode:

- We need tools to distinguish correct algorithms from incorrect ones, the primary one of which is called a
`proof`

. - A proper mathematical proof consists of several parts:
- A clear, precise statement of what you are trying to prove.
- A set of assumptions of things that are taken to be true, and hence can be used as part of the proof.
- A chain of reasoning that takes you from these assumptions to the statement you are trying to prove.
- A little square (■) or
`QED`

at the bottom to denote that you have finished, representing the Latin phrase for "thus it is demonstrated."

- A proof is indeed a demonstration. Proofs are useful only when they are honest, crisp arguments that explain why an algorithm satisfies a non-trivial correctness property.

- It is impossible to prove the correctness of an algorithm for a fuzzily-stated problem.
- Problem specifications have two parts:output.1* The set of allowed input instances.2* The required properties of the algorithm's
- An important technique in algorithm design is to narrow the set of allowable instances until there is a correct and efficient algorithm. For example, we can restrict a graph problem from general graphs down to trees, or a geometric problem from two dimensions down to one.
- There are two common traps when specifying the output requirements of a
problem:1* Asking an ill-defined question. E.g: Asking for the best route between two places on a map is a silly question, unless you define what best means.2* Creating compound goals. E.g: A goal like find the shortest route from **a** to **b** that doesn't use more than twice as many turns as necessary is perfectly well defined, but complicated to reason about and solve.

- The heart of any algorithm is an idea. If your idea is not clearly revealed when you express an algorithm, then you are using too low-level a notation to describe it.
- The three most common forms of algorithmic notation are
- English
- Pseudocode (a programming language that never complains about syntax errors)
- A real programming language.

- All three methods are useful because there is a natural tradeoff between greater ease of expression and precision.

- The best way to prove that an algorithm is incorrect is to produce a counter example, i.e: an instance on which it yields an incorrect answer.
- Good counterexamples have two important properties:
**Verifiability**: To demonstrate that a particular instance is a counterexample to a particular algorithm, you must be able to:- calculate what answer your algorithm will give in this instance, and
- display a better answer so as to prove that the algorithm didn't find it.

**Simplicity**- Good counter-examples have all unnecessary details stripped away. They make clear exactly why the proposed algorithm fails. Simplicity is important because you must be able to hold the given instance in your head in order to reason about it.

- Recursion is mathematical induction in action. In both, we have general and boundary conditions, with the general condition breaking the problem into smaller and smaller pieces. The initial or boundary condition terminates the recursion.
- The simplest and most common form of mathematical induction infers that a statement involving a natural number
`n`

(that is, an integer`n ≥ 0`

) holds for all values of`n`

. The proof consists of two steps:- The
**initial**or**base case**: prove that the statement holds for a fixed natural number (usually 0 or 1). - The
**induction/inductive step**: assume that the statement holds for some arbitrary natural number`n = k`

, and prove that the statement holds for`n = k + 1`

.

- The
- The hypothesis in the inductive step, that the statement holds for
`n = k`

, is called the**induction/inductive hypothesis**. You're doing a thought experiment of what would happen if it was true for`n = k`

. It might be clearer to use the phrase "suppose true when`n = k`

" rather than "assume true when`n = k`

" to emphasise this. - To prove the inductive step, one assumes the induction hypothesis for
`n = k`

and then uses this assumption to prove that the statement holds for`n = k + 1`

. We try to manipulate the statement for`n = k + 1`

so that it involves the case for`n = k`

(which we assumed to be true).

- The basis case consists of a single element, and by definition a one-element array is completely sorted.
- We assume that the first
`n - 1`

elements of array`A`

are completely sorted after`n - 1`

iterations of insertion sort. - To insert one last element
`x`

to`A`

, we find where it goes, namely the unique spot between the biggest element less than or equal to`x`

and the smallest element greater than`x`

. This is done by moving all the greater elements back by one position, creating room for`x`

in the desired location. ■

- Modeling is the art of formulating your application in terms of precisely described, well-understood problems.
- Proper modeling is the key to applying algorithmic design techniques to real-world problems — it can eliminate the need to design an algorithm, by relating your application to what has been done before.
- Real-world applications involve real-world objects — like a system to route traffic in a network or find the best way to schedule classrooms in a university.
- Most algorithms, however, are designed to work on rigorously defined abstract structures such as permutations, graphs, and sets.
- To exploit the algorithms literature, you must learn to describe your problem abstractly, in terms of procedures on such fundamental structures.
- The act of modeling reduces your application to one of a small number of existing problems and structures. Such a process is inherently constraining, and certain details might not fit easily into the given target problem.
- Certain problems can be modeled in several different ways, some much better than others.
- Modeling is only the first step in designing an algorithm for a problem. Be alert for how the details of your applications differ from a candidate model, but don't be too quick to say that your problem is unique and special.

**Permutations**are arrangements, or orderings, of items. E.g:`{1,4,3,2}`

and`{4,3,2,1}`

are two distinct permutations of the same set of four integers.**Subsets**represent selections from a set of items. E.g:`{1,3,4}`

and`{2}`

are two distinct subsets of the first four integers. Order does not matter in subsets the way it does with permutations.**Trees**represent hierarchical relationships between items.**Graphs**represent relationships between arbitrary pairs of objects.**Points**define locations in some geometric space.**Polygons**define regions in some geometric spaces.**Strings**represent sequences of characters, or patterns.

Learning to think recursively is learning to look for big things that are made from smaller things of exactly the same type as the big thing.

If you think of houses as sets of rooms, then adding or deleting a room still leaves a house behind.

**Permutations**: Delete the first element of a permutation of`n`

things`{1, ..., n}`

and you get a permutation of the remaining`n-1`

things. Basis case: {}**Subsets**: Every subset of`{1, ..., n}`

contains a subset of`(1, ..., n - 1)`

obtained by deleting element`n`

. Basis case: {}**Trees**: Delete the root of a tree and you get a collection of smaller trees. Delete any leaf of a tree and you get a slightly smaller tree. Basis case: 1 vertex.**Graphs**: Delete any vertex from a graph, and you get a smaller graph. Now divide the vertices of a graph into two groups, left and right. Cut through all edges that span from left to right, and you get two smaller graphs, and a bunch of broken edges. Basis case: 1 vertex.**Point sets**: Take a cloud of points, and separate them into two groups by drawing a line. Now you have two smaller clouds of points. Basis case: 1 point.**Polygons**: Inserting any internal chord between two non-adjacent vertices of a simple polygon cuts it into two smaller polygons. Basis case: triangle.**Strings**: Delete the first character from a string, and you get a shorter string. Basis case: empty string.

Recursive decompositions of combinatorial objects. (left column) Permutations, subsets, trees, and graphs. (right column) Point sets, polygons, and stringsRecursive descriptions of objects require both decomposition rules and basis cases, namely the specification of the smallest and simplest objects where the decomposition stops.

- The basic scheme of a contradiction argument is as follows:
- Assume that the hypothesis (the statement you want to prove) is false.
- Develop some logical consequences of this assumption.
- Show that one consequence is demonstrably false, thereby showing that the assumption is incorrect and the hypothesis is true.

- The classic contradiction argument is Euclid's proof that there are an infinite number of prime numbers:
- The negation of the claim would be that there are only a finite number of primes, say
`m`

, which can be listed as`p₁,..., pₘ`

. So let's assume this is the case and work with it. - Prime numbers have particular properties with respect to division. Suppose we construct the integer formed as the product of "all" of the listed primes:
[
N = \prod
*{i = 1}^{m} p*{i} ] - This integer
`N`

has the property that it is divisible by each and every one of the known primes, because of how it was built. - But consider the integer
`N + 1`

. It can't be divisible by`p₁ = 2`

, because`N`

is. - The same is true for
`p₂ = 3`

and every other listed prime. Since a +1 can’t be evenly split by any of the prime numbers because they are bigger. - Since
`N + 1`

doesn't have any non-trivial factor, this means it must be prime. - But you asserted that there are exactly
`m`

prime numbers, none of which are`N + 1`

, because`N + 1 > m`

. - This assertion is false, so there cannot be a bounded number of primes.

- The negation of the claim would be that there are only a finite number of primes, say
- For a contradiction argument to be convincing, the final consequence must be clearly, ridiculously false.

- Principled guessing is called estimation.
- Estimation problems are best solved through some kind of logical reasoning process, typically a mix of principled calculations and analogies.
- Principled calculations give the answer as a function of quantities that either you already know, can look up on Google, or feel confident enough to guess.
- Analogies reference your past experiences, recalling those that seem similar to some aspect of the problem at hand.

Show that

`a + b`

can be less than`min(a, b)`

.When:

`a and b < 0`

(i.e: negative)

`a <= b`

Then:

`min(a, b) = a`

`a + b < a`

Example:

`min(-6, -5) = -6`

`-6 + -5 = -6 -5 = -11`

`-11 < -6`

Show that

`a × b`

can be less than`min(a, b)`

.When:

`a < 0`

(i.e: negative)

`b > 0`

(i.e: positive)

Then:

`min(a, b) = a`

`a * b < a`

Example:

`min(-3, 4) = -3`

`-3 * 4 = -12`

`-12 < -3`

Design/draw a road network with two points a and b such that the fastest route between a and b is not the shortest route.

1a──────c──────b2│ │3│ │4└────d────────┘56Route `acb` have a toll gate, is in development and have a speed limit.7Route `adb` is longer, but has none of these impediment.89Route `adb` will be faster than the shorter `acb` route.Design/draw a road network with two points a and b such that the shortest route between a and b is not the route with the fewest turns.

1a────┐ ┌────b2│ └─c─┘ │3│ │4└──────d──────┘56Route `acb` is the shortest but has 4 turns.7Route `adb` is the longest and has only 2 turns.

- Prove that $\sum_{i = 1}^{n} i = \frac{n(n + 1)}{2}$ for n ≥ 0, by induction.