School of Computer Science

Winter 1997 Class Notes for 308-251B

DATA STRUCTURES AND ALGORITHMS

Undirected
Graphs
The DFS Algorithm Complexity Searching for Cycles Directed Graphs Properties of DFS Algorithm and Tree |
Euler
Tours
Java Applet Links References Credits |

Phase |
Condition of Node |
Colour |

1 | undiscovered | white |

2 | discovered and being processed | grey |

3 | finished | black |

Colouring the nodes acts as a memory device - it is like dropping bread crumbs in a maze in order to know where they've already been.

[Back to Top]

for all nodes u do { colour[u]=white; p[u]=NULL; } time = 0; for all nodes u do if colour[u] == white then DFS(u);This is the algorithm itself, which performs the depth-first search:

DFS(u): visit(u); time = time + 1; d[u] = time; colour[u] = grey; for all nodes v adjacent to u do { if colour[v] == white then { p[v] = u; DFS(v); } } time = time + 1; f[u] = time; colour[u] = black;

- I turns grey, d[I] = 1
- E turns grey, d[E] = 2
- E has two neighbours, B and F. By the convention B is visited first: B turns grey, d[B] = 3
- Similarly, A, C, & D are visited
- D has 4 neighbours: A, C, G, H. A and C are grey so G is visited first, then F.
- All of F's neighbours are grey so F is finished: f[F] = 9, F turns black.
- Similarly G is finished
- H is visited because we have not finished visiting all of D's neighbours
- The remaining nodes are finished in the reverse order, backtracking along the original path, and ending with I.

The maximum time index in the example is 18, equal to 2 times the number
of nodes. This is because time is incremented exactly when a discovery
time or a finish time is assigned, and each node is assigned one of each
during its lifetime. The time should not be confused with the complexity
of the algorithm, which we will discuss soon.

1) If v is discovered while u is being processed (is grey), then v will be finished before u. This is called the Nested Interval Property, and is a consequence of the recursive nature of the algorithm: when v is discovered it is "pushed" on to the recursion stack, and u being processed at that time means that u is lower down in the stack. Therefore v will be "popped" before u, i.e. finished before u.

2) The other possibility is that u is finished before v is discovered. In the sample graph above, F and H are an example of this: their start and finish times are 8, 9 and 11, 12 respectively.

Note that the situation depicted here is impossible, by the argument in (1) above:

DFS trees are usually elongated, but are useful because they display the entire recursion path. As well, to determine if any node is a descendant of any other node, one need only compare their times d[u] and f[u]. This is much faster and easier than performing the same task for ordinary binary trees without additional information.

Here is a snapshot of a DFS tree in mid-traversal:

The grey nodes in the tree always form a chain. This represents the current chain of recursion. Note that the black hanging pieces cannot be connected to other parts of the grey chain, since the black section is finished - anything connected to it which is not above it in the tree (below it in the recursion stack) must also be black.

An important property is that:
*The white path property states that v is a descendant of
u if and only if, at the time of discovery of u, there is a white path
from u to v.*

If v is a descendant of u, then at time d[u], there must be at least one white path from u to v for the DFS algorithm to follow. Conversely, if there is a white path from u to v at time d[u], then the last node on that path is a neighbour of v which has not yet been processed.

One useful aspect of the DFS algorithm is that it traverses connected
components one at a time, and thus it can be used to identify the connected
components in a given graph. As an example, if a handwriting sample is
scanned into a pixelized image and DFS is performed on that image, all
connected components below a certain threshold size (which represent unwanted
dirt in the sample) can be removed.

[Back to Top]

Therefore the total complexity is O(|V| + |E|), which is optimal, as
mentioned in the introduction.

We start by looking at a cycle in an undirected graph.

Assume that d[u] < d[v]. The straight edge between u and v is looked at twice during DFS, once from u and once from v (while looping through neighbours). At time d[u], v is white and at time d[v], u is grey (keep in mind that there is no way to ever look at a black node - this follows from the comment about the snapshot of a DFS tree).

More generally, we have the following theorem:

An undirected graph has a cycle if and only if at some point during the traversal, when u is grey, one of the neighbours v of u considered in the inner loop of the DFS is coloured grey and is not p[u].

Proof: If we arrive at v and find that u is grey (and u is not p[v]) then u must be an ancestor of v, and since we travelled from u to v via a different route, there is a cycle in the graph. Conversely, suppose u and v are included in a cycle:

Since d[u] < d[v], when v is reached and becomes grey, u is already grey so v has a grey neighbour. The result can be summarized as:

If we apply this theorem to the problem posed
above, the search for a cycle in an undirected graph becomes the search
for grey neighbours. If there is no cycle then the maximum number of edges
is |V| - 1, and if there is a cycle, it will be found before the |V|th
edge is looked at, so in both cases the DFS algorithm will terminate its
search for a cycle in O(|V|) time.

[Back to Top]

We now define 4 types of DFS tree edges found in the DFS tree of a directed graph. Each edge is classified at the time it is first looked at, and from the point of view of the node occupied by DFS at that point.

- Tree edge - an edge to a white neighbour.
- Back edge - an edge to a grey neighbour (an ancestor).
- Forward edge - an edge to a black descendant (impossible in an undirected graph). The edge from A to C is an example.
- Cross edge - an edge to a black neighbour which is not a descendant, e.g. the edge from D to C.

[Back to Top]

The problem is to traverse the graph while visiting each edge exactly once, and to finish at the starting point (this is known as the Euler tour). An Euler graph is a graph which contains an Euler tour; the figure on top is an Euler graph, but the figure below it is not. The reason is the difference in degree sequences. The degree of a node in an undirected graph is the number of edges emanating from it. If all the degrees are even then the graph can be traversed in the manner described. This is because each node must be arrived at and departed from the same number of times, and each time using a different edge. The converse is also true: all connected undirected graphs which are Euler graphs have even degrees at all nodes.

In a directed graph, define an "in" degree to be the number of incoming edges, and an "out" degree to be the number of outgoing edges.

A connected directed graph is an Euler graph if and only if every node's
"in" degree equals its "out" degree. It is relatively easy to check if
a graph is an Euler graph, but the more difficult task is to actually find
the Euler Tour. This requires a modified version of the DFS (see Cormen,
Leiserson and Rivest on page 485, Exercise 23.3-9), and can be done in
O(|V| + |E|) time.

[Back to Top]

R. Decker. *Data Structures.* Prentice Hall, 1989. pp. 125-6, 263-5

E. Horowitz, S. Sahni. *Fundamentals of Data Structures.* Computer
Science Press, Inc., 1976. pp. 292-3

Depth First Search | Mani Chandy's notes on Depth First Search for undirected graphs from Caltech, with postscript slides. |

Depth First Search in Directed Graphs | Mani Chandy's notes on Depth First Search for directed graphs from Caltech, with postscript slides. |

BFS and DFS | Lecture notes by David Eppstein from UC Irvine. |

Depth-First Search | Two implementations of DFS by Thomas A. Anastasio at UMBC: one using a tree and the other using a stack. |

Lecture: Depth-First Search | Simple lecture notes by Oklahoma State's Nick Street illustrating modified DFS algorithm for solving Euler Graph problem, with sample code. |

Graphs: Search | DFS lecture notes from Tom Leblanc at the University of Rochester with sample code and applictations. |

Advanced Data Structures and Algorithms | Course outline from Dina Kravets at NJIT including a postscript file on DFS and BFS. |

Simple Search Techniques | DFS and BFS as applied to the traveling-salesperson problem, by Nikos Drakos at Leeds. |

Euler, Leonhard (1707-1753) | Biographical information on the famous 18th-century mathemetician, Leonhard Euler, by Paul Golba. |

*Copyright © 1997 by the authors. All rights reserved.
Reproduction of all or part of this work is permitted for educational or
research use provided that this copyright notice is included in any copy.
Disclaimer: this collection of notes is experimental, and does not serve
as-is as a substitute for attendance in the actual class.*

[Back to Top]