The general selection algorithm with a O(n(logn)^{2})-time complexityIntroduction
We
write pi(a) for the permutation that orders the intercepts
of l_{1},…, l_{n}, in descending order at x=a. So
y_{pi1}(a)>…> ypi_{n}(a). We renumber l_{1},…, l_{n} so that for a< t_{1}, (a) is the identity. The permutation
pi(t_{1}) has one inversion (exactly one
paire of lines crossed at t_{1}), pi (t_{2}) has two, etc… In
fact the function I( (x)) =
number of inversions in pi(x), is
a monotone step function in x with unit jumps at the t_{i}'s. I(pi (x)) = j if and only if t_{j} = max(t_{i}:t_{i}<=x). For a given k, the problem of finding
t_{k }may be viewed as an usual sorting
problem to which we apply Megiddo's technique of building algorithms from
parallel ones. An implicit binary search over the t_{i}'s is performed,
each step taking O(nlogn) time. This will an O(n(logn)^{2})-time
algorithm. Presentation
of the algorithm
In
seeking t_{k }, we will attempt to sort y_{1}(a^{*}),…,
y_{n}(a^{*}) at a^{*} = t_{k} +
epsilon. We know that this sort may be achieved in O(nlogn) comparisons
(cf Binary Search Tree), each answering a question Q_{ij} of the form "y_{i}(a^{*})<=y_{j}(a^{*})?". The O(nlogn) answers yield
the permutation pi^{* }that
sorts these intercepts : ypi^{*}_{1}(a^{*})>…> ypi^{*}_{n}(a^{*}).
Once pi^{*}
has beed found, t_{k }= max[upi_{i}^{*}pi_{i+1}^{*}:pi_{i}^{*}>pi_{i+1}^{*}]; the kth inversion must have just
reversed a pair of adjacent intercepts in the permutation pi(t_{k-1}). The
control structure for the sort will come from the O(logn)-depth sorting
network of Ajtai, kmlos and Szemeredi (AKS-Network). At each level, n/2
questions are answered. The network is just a guide for blocking these
comparisions into groups of size n/2. The sort is complete once the O(nlogn)
answers are obtained and we have determined pi^{*}. Even
though we do not know a^{*}, we can answer
the question Q_{ij}
in time O(nlogn) as follows. We find u_{ij
}the x-coordinate of , l_{i} inter l_{j }, i<j,
in constant time and then obtain its rank among the t_{i}'s.To find its rank, we sort the n intercepts
at u_{ij} in decreasing
order to get pi(u_{ij}). The rank u_{ij} is the number of inversions in pi(u_{ij}), I(pi
(u_{ij})). If
I(pi (u_{ij}))>k, we know that u_{ij}> t_{k} so the answer is "no"; lines i and
j have not yet crossed at t_{k}. If
I(pi(u_{ij}))<k, we know that u_{ij}< t_{k} so the answer is "yes". If
I(pi (u_{ij}))=k, u_{ij}=t_{k}. I may be computed in time O(nlogn) (merge sort of Knuth) : if pi(u_{ij}) = (r_{1},…,r_{n}), we use merge-sort to sort the slopes mr_{1},…,mr_{n} and count the number of inversions that were performed. If
we actually answered all n/2 questions Q_{i1j1},…, Q_{injn} on a level of the network by counting
inversions, the complexity would be O(n^{2}logn) for that level
(n/2 questions*2nlogn for the sort of the n intercepts and the merge sort).
And as there are O(logn) levels, we have a complexity of O((nlogn)^{2})
overall. Improvements
The trick is to resolve the n/2 questions on a level by actually counting inversions only O(log) times. As mentionned, each questions determines an intersection of a distinct pair of lines, and the answer is obtained by comparing the rank of that intersection point with k. On
a given level of the sequentialzed version of the sorting network, denote
the x-coordinates of these points by z_{i1},…,z_{in/2}.
We can compute the median of these intersections points, z_{med},
in time O(n). Its cost time O(nlogn) to rank it, and answer half the questions.
For example, if z_{med }< t_{k}, then z_{ik }<
t_{k} for all the z_{ik }≤ z_{med}. Continuing with the n/4
unresolved questions on this level, we again find the median z and rank
it in time O(nlogn), etc…After O(logn) inversions counts, all n/2 questions
on this level are resolved. Since each inversion count takes O(nlogn)
steps and there are O(logn) levels, the algorithm has time complexity
O(n(logn)^{3}). But,
Cole shows how the result may be improved by a factor of logn, by considering
the fact that in the network each question has two inputs. (see R.COLE,
Slowing down sorting networks to obtain faster sorting algorithms,
Journal of the.ACM, No34 (1987), pp200-208) So we obtain an algorithm of O(n(logn)^{2})-time complexity |