Greedy Algorithms

Making Change
Suppose you would like to make change using the minimum number of coins. How could you do this? For example, 59 cents of change could be made from five dimes, one nickel and four pennies: ten coins. It could also be made from one quarter, three dimes and four pennies: eight coins. Even better, it could be made from two quarters, one nickel and four pennies: seven coins.

Could we do better? It turns out that the answer is no, because the way to make change using the least number of coins is to use as many quarters as possible, then as many dimes as possible, then as many nickels as possible, and finally make up the rest with pennies. In the example above, we make 59 cents of change using two quarters, the maximum possible. The remaining 9 cents is made using no dimes and one nickel, again the maximum possible. Finally, we make up the remaining 4 cents with pennies.
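
The procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function name make_change and its denominations parameter are assumptions introduced here, not part of the original notes.

```python
def make_change(cents, denominations=(25, 10, 5, 1)):
    """Greedy change-making: use as many of each coin as possible, largest first."""
    counts = {}
    remaining = cents
    for coin in sorted(denominations, reverse=True):
        # divmod gives how many of this coin fit and what is left over.
        counts[coin], remaining = divmod(remaining, coin)
    return counts


change = make_change(59)
print(change)                # {25: 2, 10: 0, 5: 1, 1: 4}
print(sum(change.values()))  # 7
```

The result for 59 cents matches the example above: two quarters, no dimes, one nickel and four pennies.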

We prove that this algorithm works as follows. Suppose that we made c cents of change using q quarters, d dimes, n nickels and p pennies. Then, we used q + d + n + p coins to make c = 25q + 10d + 5n + p cents of change. If q + d + n + p is the minimum number of coins then we can prove that:

  1. p < 5 because we can replace five pennies with one nickel.
  2. 5n + p < 10 because we can replace two nickels with one dime. If 5n + p >= 10 then since p < 5 there must be at least two nickels.
  3. 10d + 5n + p < 25 because we can replace two dimes and a nickel with one quarter, or three dimes with a quarter and a nickel. If 10d + 5n + p >= 25 then, since 5n + p < 10, we must have at least two dimes. If there is at least one nickel then we can replace the nickel and two dimes with a quarter. Otherwise there are no nickels, so 5n + p = p < 5 and 10d > 20, which means we must have at least three dimes. We can replace these three dimes with a quarter and a nickel.
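
Together, these conditions say that in a minimum solution the dimes, nickels and pennies add up to less than 25 cents, the nickels and pennies add up to less than 10 cents, and the pennies add up to less than 5 cents. So a minimum solution must use as many quarters as possible, then as many dimes as possible, then as many nickels as possible, which is exactly what the algorithm does.

The algorithm we just described is called a greedy algorithm because at each step it makes a greedy choice: always use as many coins of the largest denomination as possible. We were able to prove that this algorithm is correct because the problem of making change from quarters, dimes, nickels and pennies possesses the greedy choice property: the greedy choice is always the best choice.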
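
As a sanity check on the three conditions in the proof, the sketch below exhaustively searches every way of making each amount from 0 to 99 cents and confirms that every minimum-coin solution satisfies them. The helper names minimal_solutions and satisfies_conditions are hypothetical, and the check is an illustration, not a substitute for the proof.

```python
from itertools import product


def minimal_solutions(cents):
    """Return every (q, d, n, p) that makes `cents` with the fewest coins."""
    best, solutions = None, []
    for q, d, n in product(range(cents // 25 + 1),
                           range(cents // 10 + 1),
                           range(cents // 5 + 1)):
        p = cents - 25 * q - 10 * d - 5 * n   # pennies make up the rest
        if p < 0:
            continue
        coins = q + d + n + p
        if best is None or coins < best:
            best, solutions = coins, [(q, d, n, p)]
        elif coins == best:
            solutions.append((q, d, n, p))
    return solutions


def satisfies_conditions(q, d, n, p):
    """The three conditions from the proof (q is unused, kept for symmetry)."""
    return p < 5 and 5 * n + p < 10 and 10 * d + 5 * n + p < 25


ok = all(satisfies_conditions(*solution)
         for cents in range(100)
         for solution in minimal_solutions(cents))
print(ok)
```
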
Shortest Paths
Earlier we saw that BFS can be used to find the shortest path between any pair of vertices in a graph. In real-life graph modelling problems, though, using one edge might be more expensive than using another. For example, if we use a graph to model a system of trails (with trail intersections as vertices) then the shortest path between two points is not necessarily the one using the least number of edges but the one covering the least distance. We can model this type of problem with a weighted graph, a graph whose edges are each associated with some weight or cost. The length of a path in such a graph is the sum of the weights of the edges on the path. The shortest path between two vertices is the path of minimum length between them. The following is an example of a weighted graph.
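
Independently of the figure, here is one way a weighted graph and the length of a path might be represented in Python. The vertex names and weights are invented for illustration and are not taken from the figure. The dictionary-of-dictionaries layout and the function path_length are assumptions of this sketch.

```python
# Hypothetical trail map: graph[u][z] is the weight of the edge between u and z.
# The weights are made up for this sketch.
graph = {
    "A": {"B": 5, "C": 2},
    "B": {"A": 5, "C": 1},
    "C": {"A": 2, "B": 1, "D": 3},
    "D": {"C": 3},
}


def path_length(graph, path):
    """Sum the weights of the edges along a path given as a list of vertices."""
    return sum(graph[u][z] for u, z in zip(path, path[1:]))


print(path_length(graph, ["A", "B"]))       # 5: one edge
print(path_length(graph, ["A", "C", "B"]))  # 3: two edges, but shorter
```

In this made-up graph, the path with more edges is the shorter one, which is why counting edges as BFS does is no longer enough.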

It turns out that the problem of finding a shortest path in a weighted graph with non-negative edge weights possesses the greedy choice property. The associated algorithm, known as Dijkstra's algorithm, is very similar to BFS. In the algorithm, we progressively construct a spanning tree of the graph such that, if v is the vertex the algorithm starts with, then the spanning tree is simply the union of shortest paths from v to all other vertices in the graph. In fact, if the edge weights are the same for all edges then the algorithm reduces to BFS.

Constructing the tree is based on the idea of adding a vertex to a graph. Suppose that we know the shortest path from v to every other vertex in a graph. If we add a new vertex, say z, to the graph by adding edges from z to vertices x1, x2, ..., xd then we know that the shortest path from v to z passes through x1, x2, ..., or xd, so the length of that path is equal to

min{d(v, xi) + w(xi, z) | 1 <= i <= d}
where d(v, xi) is the length of the shortest path from v to xi and w(xi, z) is the weight on the edge connecting xi and z.
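
A small numeric instance of this formula, written as a Python sketch. The function name distance_to_new_vertex and the sample distances and weights are assumptions made up for illustration.

```python
import math


def distance_to_new_vertex(dist, new_edges):
    """dist[x] is d(v, x) for existing vertices; new_edges[x] is w(x, z).

    Returns min over the new edges of d(v, x) + w(x, z), or infinity
    if z has no edges at all.
    """
    return min((dist[x] + w for x, w in new_edges.items()), default=math.inf)


dist = {"A": 0, "B": 3, "C": 2}   # assumed shortest distances from v = A
new_edges = {"B": 4, "C": 3}      # z is joined to B (weight 4) and C (weight 3)
print(distance_to_new_vertex(dist, new_edges))  # 5, via C: 2 + 3
```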

As we construct the spanning tree, we keep track of two things for each vertex: the length of the shortest path discovered so far from v to the vertex, and the edge incident to the vertex on that shortest path. Initially, we know that the length of the shortest path from v to itself is 0 but, for all other vertices, the path length is infinite since we don't yet know that a path even exists. Then, we add v to the initially empty spanning tree and update the distance of each of v's neighbors to be the weight of the edge connecting v and the neighbor. After this, we repeatedly find the vertex w not in the tree with the smallest known distance from v. We add w to the tree and then update the distances of each of its neighbors. Before the update, their distances from v did not include the possibility that the shortest path from v might pass through w. We take that possibility into account for a neighbor x of w by making its updated distance the minimum of its original distance and its distance through w, d(v,w) + w(w,x).

The greedy choice in Dijkstra's algorithm is then to choose the vertex not in the tree with the smallest known distance from v.

Consider applying the algorithm to the graph above starting at vertex A.

  1. We begin by recording A's distance from itself as 0 and updating the recorded distances of each of A's neighbors. At this point, the tree consists of just A.
  2. Of all the vertices not in the tree, C is the closest to A so we add it and update the recorded distances of all of its neighbors from A. Notice that the distance of B is adjusted because w(A,C) + w(C,B) < w(A,B).
  3. B is now the closest to A so we add it to the tree. We connect it to C because the shortest path from A to B is through C.
  4. D is now the closest to A so we add it to the tree. We connect it through C.

Dijkstra's algorithm is given below. We use d(u) to denote the known distance of u from v as the algorithm runs. When the algorithm finishes, we want d(u) to be equal to d(v,u), the length of the actual shortest path from v to u.
shortest-path(v)
  for each vertex u in the graph
    d(u) = infinity
  d(v) = 0
  q = empty priority queue
  for each vertex u in the graph
    q.insert(d(u), u)
  while q is not empty
    u = q.deleteMin()
    for each vertex z adjacent to u that is in q
      if d(u) + w(u,z) < d(z) 
        d(z) = d(u) + w(u,z)
        change z's position in q to correspond to d(z)

This algorithm only records the lengths of the shortest paths, not the actual paths themselves. How could we modify this algorithm to actually return the shortest path between two particular vertices? Hint: it would involve adding a couple of statements inside the if statement. Also, removing vertex u from the queue (i.e. "u = q.deleteMin()") corresponds to adding the vertex to the growing spanning tree we mentioned earlier.
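
The pseudocode can be turned into runnable Python along the following lines. This is a sketch under stated assumptions, not a transcription. Python's heapq module has no operation for changing an entry's position, so the sketch pushes a new entry whenever d(z) improves and skips stale entries when they are popped. It also starts with only v in the queue rather than every vertex, so unreachable vertices simply keep distance infinity. The function name, the graph layout and the edge weights are hypothetical. The weights were chosen only so that the order matches the walk-through above and are not taken from the figure. Like the pseudocode, it records lengths only.

```python
import heapq
import math


def shortest_path_lengths(graph, v):
    """Return (d, order): d[u] is the distance from v to u, and order lists
    the vertices in the order they leave the queue, i.e. join the tree."""
    d = {u: math.inf for u in graph}
    d[v] = 0
    queue = [(0, v)]          # entries are (known distance, vertex)
    done = set()              # vertices already removed from the queue
    order = []
    while queue:
        dist_u, u = heapq.heappop(queue)   # plays the role of q.deleteMin()
        if u in done:
            continue                       # stale entry left by an earlier update
        done.add(u)
        order.append(u)
        for z, weight in graph[u].items():
            if z not in done and dist_u + weight < d[z]:
                d[z] = dist_u + weight
                # Stands in for "change z's position in q".
                heapq.heappush(queue, (d[z], z))
    return d, order


# Hypothetical weights (assumption): graph[u][z] is the weight of edge (u, z).
graph = {
    "A": {"B": 5, "C": 2},
    "B": {"A": 5, "C": 1},
    "C": {"A": 2, "B": 1, "D": 3},
    "D": {"C": 3},
}
d, order = shortest_path_lengths(graph, "A")
print(order)  # ['A', 'C', 'B', 'D']
print(d)      # {'A': 0, 'B': 3, 'C': 2, 'D': 5}
```

Ties in the queue are broken by comparing vertex names, so this sketch assumes the vertices are mutually comparable (strings here).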
Dijkstra's Algorithm Finds the Shortest Path
Now we actually prove that Dijkstra's algorithm works. In other words, we prove that d(u) = d(v,u) when u is removed from the priority queue.

We will use a technique called proof by contradiction. We begin by assuming that Dijkstra's algorithm does not work and then show that this assumption leads to a contradiction.

Proof by contradiction might sound complicated but it is actually quite common. For example, suppose that your dog was bitten by a snake. If two days later your dog was still healthy you could reason that the snake was not poisonous because you know that if a snake is poisonous and it bites a dog then the dog will die. However, your dog is not dead, therefore the snake was not poisonous.

Your proof began by assuming that the snake was poisonous. From that assumption you reasoned that your dog should be dead. Thus you have a contradiction, your dog is alive but should be dead. You reason then that your assumption must have been wrong, i.e. the snake was not poisonous.

Suppose that u is the first vertex removed from the priority queue for which d(u) > d(v,u). Let P = v...xy...u be the shortest path from v to u, where y is the first vertex on the path that is still in the priority queue just before u is removed and x is the vertex before y on P. Notice that the part of P from v to x is the shortest path from v to x, and that d(x) = d(v,x) because x was removed from the priority queue before u. There are two possibilities:
  1. If u = y then
    d(u) <= d(x) + w(x,u) because x is adjacent to y = u and x was removed from the queue before u, so the algorithm made sure each of x's neighbors z had d(z) <= d(x) + w(x,z).
    = d(v,x) + w(x,u) because x was removed before u so d(x) = d(v,x). Remember that u is the first vertex for which the algorithm did not work.
    = length of P because P = v...xy...u = v...xu
    = d(v,u) because P is the shortest path from v to u.
  2. If u != y then
    d(u) <= d(y) because u is removed from the priority queue before y.
    <= d(x) + w(x,y) (similar to above)
    = d(v,x) + w(x,y) (similar to above)
    <= d(v,u) because P contains the shortest path from v to x and the edge (x,y), and the remaining edges of P have non-negative weights.
Both possibilities lead to the conclusion that d(u) <= d(v,u). However, we began by assuming that d(u) > d(v,u) so we have a contradiction. Therefore, the assumption must be wrong, i.e. d(u) <= d(v,u).

Of course, we can't have d(u) < d(v,u) because d(u) is the length of a path from v to u. Therefore, d(u) = d(v,u).

Greed Doesn't Always Work
Not all problems have the greedy choice property. For example, suppose we wanted to make change from quarters, dimes, nickels and pennies but this time dimes were worth 11 cents. Using the greedy algorithm above to make 33 cents of change we would use one quarter, one nickel and three pennies--five coins. It would have been better, however, to have just used three dimes.
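
The failure can be reproduced with a short Python sketch that compares the greedy coin count against the true minimum. The true minimum is found here by filling in a small table of the best answer for every smaller amount. That technique is not part of these notes and is used only as a checking device. Both function names are hypothetical.

```python
def greedy_coin_count(cents, denominations):
    """Number of coins the greedy rule uses (largest denomination first)."""
    count = 0
    for coin in sorted(denominations, reverse=True):
        used, cents = divmod(cents, coin)
        count += used
    return count


def fewest_coins(cents, denominations):
    """True minimum number of coins, or None if the amount cannot be made."""
    best = [0] + [None] * cents          # best[a] = fewest coins making a cents
    for amount in range(1, cents + 1):
        # Try every possible last coin and keep the best remainder.
        options = [best[amount - coin] for coin in denominations
                   if coin <= amount and best[amount - coin] is not None]
        best[amount] = min(options) + 1 if options else None
    return best[cents]


coins = (25, 11, 5, 1)                   # dimes are worth 11 cents here
print(greedy_coin_count(33, coins))      # 5
print(fewest_coins(33, coins))           # 3
```

With the usual denominations (25, 10, 5, 1) the two functions agree, which is what the proof in the Making Change section guarantees.
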
Modified on Tue Jan 8 17:13:01 EST 2002 by Matthew Suderman.