Expected Performance

Expected Performance

As I mentioned earlier, the correctness of the algorithm follows directly from the loop invariant, so the only thing we what needs to be do is the performance analysis.

For some insertion orders, the resulting search structure may have quadratic size and linear search time. For other permutations it is much better. We took a random insertion sequence, so we are interested in the expected performance. The expected time is the average of the running time, taken over all n! permutations

Theorem.

- This algorithm computes T and D from a set S n line segments in general position in O(n log n) expected time
- The expected size of D is O(n)
- For any query point q the expected query time is O(log n)

Proof: Expected Query Time is O(log n)

Fix a query point q, and consider the path in D traversed by the query.

Define

S_i = {s₁, s₂, ..., s_i}
X_i = number of nodes added to the search path for q during iteration i
P_i = probability that some node on the search path of q is created in iteration i
D_q(S_i) = trapezoid containing q in T(S_i)

From our construction, X_i Ł 3; thus E[X_i] Ł 3P_i

P_i = Pr[D_q(S_i) ą D_q (S_i-₁)]

Backwards analysis:

How many segments in S_i affect D_q(S_i) when they are removed? At most 4. Since they have been chosen in random order, each one has probability1/i of being s_i. So, P_i Ł 4/i.

Let H_n = 1 + 1/2 + 1/3 + ... + 1/n, and recall that H_n = Θ(log n), in particular for n > 1 we have log_en < H_n < 1 + log_e n.

Proof: Expected Storage Size is O(n)

The total number of nodes is bounded by

where k_i = number of new trapezoids created in iteration i.

Define d(D,s) =	{	1 if D disappears from T(S_i) when s is removed from S_i
		0 otherwise

So, the expected amount of storage is O(n).

Proof: Expected Construction Time is O(n log n)

The time to insert segment s_i is O(k_i) plus the time needed to locate the left endpoint of s_i in T(S_i_-1).
So, the expected running time of the algorithm is

Using tail estimate one can prove that let S be set non-crossing line segments and let l be a parameter with l > 0. Then the probability that the maximum length of a search path in the structure for S computed by algorithm TrapezoidalMap is more than 3lln(n+1) nodes is at most 2/(n+1)^l^ln1.25-3.[6]

Surprise surprise this is not end of the story, Raimund Seidel [8] has showed that if S is given as a simple polygonal chain the expected running time of his algorithm is O(n log^* n) and more generally if S is presented as a plane graph with k connected components, then the expected running time of his algorithm is O(n log^* n + k log n). His algorithm also takes O(n) space for the search structure and query can be done also in O(log n) time.

Removing Our Assumptions

We can get rid of the assumption that no two distinct endpoints have the same x-coordinate.
Observation: The vertical direction that we chose was immaterial.

1st Idea: Rotate the axis system slightly

This can lead to numerical difficulties, and requires an extra pass through the data.

Improved Idea: Apply affine mapping (shear transformation)

F(x, y)=(x + ey, y) for small enough e.

This mapping preserves straight lines, so any "insidedness" holds just as true as it did before we applied the shear.

Symbolic Transformation

In practice, we don’t have to apply any transformation. We get the result using lexicographic order (compare first by x-coordinate, then by y-coordinate):

The only two operations used in the algorithm were:

Decide if q lies to the left of p, to the right, or on the vertical line through p.
Decide whether q lies above, below, or on the segment with endpoints p₁ and p₂.

Both can be done without actually computing F. Moreover, it is equivalent to the limit of these shears as e approaches zero (from the positive side).

This modification also makes the other initial assumption unnecessary: A query point never lies on the vertical line of a point node, or on the segment of a segment node.

Conclusion: Our algorithm TrapezoidalMap computes the trapezoidal map T(S) of a set S of n non-crossing line segments and a search structure D for T(S) in O(n log n) expected time, it takes O(n) space to build the search structure and for any query point q the expected query time is O(log n).