Line Fitting between Data Ranges

Sung Soo Kang

Contents

1. Introduction

2. Polygon Properties

3. Algorithm

4. Correctness

5. Applet

6. References

1. Introduction

In this project an online O(n) algorithm proposed by Joseph O'Rourke [1] for fitting straight lines to incremental data ranges is examined.

Basic Idea

We consider data ranges [a_k, w_k], each received sequentially at time t_k. The set of all straight lines u = mt + b that fit the data (t_k, [a_k, w_k]), k = 1,2,...,n, can then be represented as the set of all (m, b) pairs that satisfy the inequalities

a_k < mt_k + b < w_k for k = 1,...,n    (1)

Figure 1. Example of t-u data space

By changing our point of view from the t-u space, where the data live, to the m-b parameter space, we can treat the inequalities (1) as the constraints of a linear programming problem in the two variables m and b:

b > (-t_k)m + a_k
b < (-t_k)m + w_k    (2)

Thus, the original problem of fitting straight lines to n data ranges reduces to finding the polygon P_n, the intersection of the 2n half-planes given by (2) for k = 1,...,n. Each data range contributes a pair of half-planes bounded by two parallel lines of slope -t_k.

Figure 2. Example of m-b parameter space (representation of the first five data ranges of Figure 1.)
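
As an illustration of this change of viewpoint, the following minimal Python sketch checks a candidate line against the data ranges in both forms, (1) and (2). The function names and the sample ranges are hypothetical and chosen only for this example.

```python
from typing import Iterable, Tuple

Range = Tuple[float, float, float]  # (t_k, a_k, w_k); layout assumed for this sketch


def fits_data_space(m: float, b: float, ranges: Iterable[Range]) -> bool:
    """Inequality (1): the line u = m*t + b passes strictly inside every range."""
    return all(a < m * t + b < w for t, a, w in ranges)


def fits_parameter_space(m: float, b: float, ranges: Iterable[Range]) -> bool:
    """Inequalities (2): the point (m, b) lies strictly inside every strip."""
    return all(-t * m + a < b < -t * m + w for t, a, w in ranges)


# Hypothetical sample data: three ranges of width 2 centred on u = t.
sample = [(1.0, 0.0, 2.0), (2.0, 1.0, 3.0), (3.0, 2.0, 4.0)]
print(fits_data_space(1.0, 0.0, sample), fits_parameter_space(1.0, 0.0, sample))
print(fits_data_space(0.0, 0.0, sample), fits_parameter_space(0.0, 0.0, sample))
```

The two functions agree on every input, since (2) is only a rearrangement of (1).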

Goal

Construct a convex polygon P_k from P_k-1 on-line in the m-b parameter space using the set of sequential inputs (t_k, [a_k,w_k]).

2. Polygon Properties

Before presenting the algorithm, we review the basic properties of the polygons P_k on which it relies. It is assumed that t_k > 0 and t_k < t_k+1 for all k.

Lemma 1. Each edge of each polygon P_k in m-b parameter space has a strictly negative slope.

Proof. Polygon P_k is the intersection of 2k half-planes, so each edge of P_k lies along one of the 2k half-plane edges. Each half-plane has one of the forms displayed in (2), so each half-plane edge has slope -t_i for some 1 ≤ i ≤ k. Since 0 < t_i ≤ t_k for 1 ≤ i ≤ k by assumption, each slope lies within [-t_k, 0) and is therefore strictly negative.

Definition 1. Define mmax_k, bmax_k, mmin_k and bmin_k as follows.

mmax_k=max{m:(m,b) is a vertex of P_k}
bmax_k=max{b:(m,b) is a vertex of P_k}

mmin_k=min{m:(m,b) is a vertex of P_k}
bmin_k=min{b:(m,b) is a vertex of P_k}

Lemma 2. For each polygon P_k in m-b parameter space, the rightmost point of the polygon is also the lowest point, and the leftmost point is also the highest point.

Proof. Suppose that the rightmost and the lowest points are distinct. Then there exist m' and b' such that m' < mmax_k and b' > bmin_k and (mmax_k, b') and (m', bmin_k) are both vertices of P_k. It is not possible to close off P_k by connecting the rightmost and lowest vertices without adding at least one edge of positive slope. This contradicts Lemma 1. The assumption of the leftmost and highest points being distinct gives rise to the same contradiction.

Definition 2. The leftmost point L_k is defined by (mmin_k, bmax_k) and the rightmost point R_k is defined by (mmax_k, bmin_k).

Definition 3. The upper half-polygonal chain C_u is defined by {v: v is a vertex of P_k which lies between R_k and L_k in counter-clockwise order}. The lower half-polygonal chain C_l is defined by {v: v is a vertex of P_k which lies between R_k and L_k in clockwise order}.
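
To make Lemma 2 and Definitions 1 to 3 concrete, here is a small Python sketch, under the assumption that a polygon is stored as a list of (m, b) vertices. It builds the quadrilateral for two data ranges by intersecting the four boundary lines of (2) and reads off L and R. The helper names `quadrilateral` and `extremes` and the sample ranges are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]          # (m, b)
Range = Tuple[float, float, float]   # (t, a, w), with 0 < t1 < t2 assumed


def quadrilateral(r1: Range, r2: Range) -> List[Point]:
    """Vertices for two ranges, counter-clockwise, starting at the rightmost point R."""
    (t1, a1, w1), (t2, a2, w2) = r1, r2

    def vertex(c1: float, c2: float) -> Point:
        # Intersection of b = -t1*m + c1 and b = -t2*m + c2, i.e. the line
        # through (t1, c1) and (t2, c2) in the data space.
        m = (c2 - c1) / (t2 - t1)
        return (m, c1 - t1 * m)

    return [vertex(a1, w2), vertex(w1, w2), vertex(w1, a2), vertex(a1, a2)]


def extremes(poly: List[Point]) -> Tuple[Point, Point]:
    """Return (L, R): the leftmost and rightmost vertices of the polygon."""
    return min(poly, key=lambda p: p[0]), max(poly, key=lambda p: p[0])


poly = quadrilateral((1.0, 0.0, 2.0), (2.0, 1.0, 3.0))
L, R = extremes(poly)
# Lemma 2: the leftmost vertex is also the highest, the rightmost the lowest.
assert L[1] == max(b for _, b in poly)
assert R[1] == min(b for _, b in poly)
print("L =", L, "R =", R)
```

In this ordering, the vertices from R to L (inclusive) form the upper chain C_u, and the one remaining vertex lies on the lower chain C_l.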

3. Algorithm

Input: (t_k, a_k, w_k), k=1,...,n
Computed: Polygons P₁, P₂,...,P_n
Step 1. Compute P₂ and initialize L₂ and R₂
Step 2. Repeat from k=3 to n

Step 2-1. If R_k-1 lies to the left of the half-plane b > (-t_k)m + a_k or L_k-1 lies to the right of the half-plane b < (-t_k)m + w_k, then no straight line fits all of the data ranges up to t_k. In that case, reinitialize the process by setting P_k to the quadrilateral formed from the data ranges at t_k-1 and t_k. Initialize L_k and R_k.

Step 2-2. Else

Step 2-2-1. Search for the two intersection points of the edge of the half-plane b < (-t_k)m + w_k with P_k-1

(a) If R_k-1 is inside the half-plane, go to Step 2-2-2.
(b) Starting from R_k-1, search the edges of C_u counter-clockwise for an intersection point.
(c) Starting from R_k-1, search the edges of C_l clockwise for an intersection point.
(d) Set the lowest intersection point as R_k.

Step 2-2-2. Search for the two intersection points of the edge of the half-plane b > (-t_k)m + a_k with P_k-1

(a) If L_k-1 is inside the half-plane, go to Step 2-2-3.
(b) Starting from L_k-1, search the edges of C_u clockwise for an intersection point.
(c) Starting from L_k-1, search the edges of C_l counter-clockwise for an intersection point.
(d) Set the highest intersection point as L_k.

Step 2-2-3. Remove from P_k-1 all vertices that lie outside either of the two half-planes, and add the new intersection points as vertices. The result is P_k.
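
The update from P_k-1 to P_k can be sketched in Python as follows. This is not the chain search of Steps 2-2-1 and 2-2-2: for brevity it swaps in generic convex polygon clipping (in the style of Sutherland-Hodgman) against the two half-planes of (2), which visits every vertex at each step and therefore does not attain the O(n) total bound. Polygons are assumed to be lists of (m, b) vertices in counter-clockwise order, vertices lying exactly on a boundary line are kept, and all function names and the sample data are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]          # (m, b)
Range = Tuple[float, float, float]   # (t, a, w), with 0 < t_1 < t_2 < ... assumed


def quadrilateral(r1: Range, r2: Range) -> List[Point]:
    """Polygon for two consecutive ranges, counter-clockwise from R."""
    (t1, a1, w1), (t2, a2, w2) = r1, r2

    def vertex(c1: float, c2: float) -> Point:
        m = (c2 - c1) / (t2 - t1)    # line through (t1, c1) and (t2, c2)
        return (m, c1 - t1 * m)

    return [vertex(a1, w2), vertex(w1, w2), vertex(w1, a2), vertex(a1, a2)]


def clip(poly: List[Point], t: float, c: float, keep_above: bool) -> List[Point]:
    """Clip a convex polygon against the line b = -t*m + c."""
    def signed(p: Point) -> float:
        s = p[1] + t * p[0] - c      # > 0 above the line, < 0 below it
        return s if keep_above else -s

    out: List[Point] = []
    for i, p in enumerate(poly):
        q = poly[(i + 1) % len(poly)]
        sp, sq = signed(p), signed(q)
        if sp >= 0:                  # p is inside or on the boundary: keep it
            out.append(p)
        if sp * sq < 0:              # edge pq properly crosses the line
            u = sp / (sp - sq)
            out.append((p[0] + u * (q[0] - p[0]), p[1] + u * (q[1] - p[1])))
    return out


def update(poly: List[Point], prev: Range, new: Range) -> Tuple[List[Point], bool]:
    """Return (P_k, restarted) given P_{k-1} and the ranges at t_{k-1}, t_k."""
    t, a, w = new
    clipped = clip(clip(poly, t, a, keep_above=True), t, w, keep_above=False)
    if len(clipped) < 3:             # no common line: restart as in Step 2-1
        return quadrilateral(prev, new), True
    return clipped, False


def polygons(ranges: List[Range]) -> List[List[Point]]:
    """Polygons P_2, ..., P_n for a list of at least two ranges."""
    poly = quadrilateral(ranges[0], ranges[1])
    result = [poly]
    for prev, new in zip(ranges[1:], ranges[2:]):
        poly, _ = update(poly, prev, new)
        result.append(poly)
    return result


# Hypothetical sample data: three ranges of width 2 centred on u = t.
sample = [(1.0, 0.0, 2.0), (2.0, 1.0, 3.0), (3.0, 2.0, 4.0)]
final = polygons(sample)[-1]
print(final)
```

Any point strictly inside the final polygon, such as the average of its vertices, gives a line u = mt + b that passes through all of the sample ranges.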

Figure 3. Example of constructing P_k from P_k-1 on-line

4. Correctness of Algorithm

To show that the algorithm computes each polygon P_k correctly, we need to show the following.

(1) Each half-plane edge either does not intersect the polygon at all or intersects C_u once and C_l once.
(2) The algorithm correctly computes the intersection points.

Given (1), claim (2) follows from the design of the algorithm.

Proof of (1). From Lemma 1, all the edges of P_k-1 have slopes in the range [-t_k-1, 0). Since t_k > t_k-1, the half-plane edge, which has slope -t_k, is steeper than every polygon edge. This implies that a half-plane edge that meets the polygon intersects C_u exactly once and C_l exactly once, and that it passes through the interior of the polygon unless the intersection point is L_k-1 or R_k-1. Figure 4 illustrates two impossible cases.

Figure 4. Impossible cases of intersecting the half-plane and P_k-1
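
The step from "steeper than every polygon edge" to "each chain is crossed at most once" can be spelled out as follows. The function f_k, the constant c and the parameter s are notation introduced here for illustration only and do not come from [1].

```latex
\[
  f_k(m,b) = b + t_k m, \qquad
  \text{half-plane edges: } f_k(m,b) = a_k \ \text{ and } \ f_k(m,b) = w_k .
\]
Along an edge of $P_{k-1}$ with slope $-t_i$, $i \le k-1$, write
$(m,b) = (m_0 + s,\; b_0 - t_i s)$. Then
\[
  f_k(m_0 + s,\; b_0 - t_i s) = f_k(m_0, b_0) + (t_k - t_i)\, s,
  \qquad t_k - t_i > 0 .
\]
```

Hence f_k increases strictly with m along every edge, and so along each of C_u and C_l it increases strictly from L_k-1 to R_k-1. A line f_k = c therefore meets each chain at most once. The same monotonicity shows that f_k attains its minimum over P_k-1 at L_k-1 and its maximum at R_k-1, which is why testing only these two points suffices in Step 2-1.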

5. Java Applet

6. References

[1] O'Rourke, J., An On-Line Algorithm for Fitting Straight Lines Between Data Ranges, Comm. ACM, 24, 9, 574-578, 1981.

[2] Freedman, A.M., Buneman, O.P., Peckham, G., and Trattner, A., Automatic Recognition of Significant Events in the Vital Signs of Neonatal Infants. Comput. Biomed. Res. 12, 2, 141-148, 1979.

[3] Lee, D.T. and Preparata, F.P., An Optimal Algorithm for Finding the Kernel of a Polygon. J. ACM 26, 3, 415-421, 1979.

[4] O'Rourke, J., and Badler, N., Model-based Image Analysis of Human Motion using Constraint Propagation, IEEE Trans. on Pattern Analysis and Machine Intelligence, PAMI-2, 6, 522-536, 1980.

[5] Pavlidis, T., Structural Pattern Recognition. Springer-Verlag, Berlin, 168-184, 1977.

[6] Preparata, F.P., An Optimal Real-time Algorithm for Planar Convex Hulls. Comm. ACM, 22, 7, 402-405, 1979.

[7] Shamos, M.I., and Hoey, D., Geometric Intersection Problems, 17th Ann. IEEE Symp. Foundations of Computer Science, 208-215, 1976.

[8] Shamos, M.I., Computational Geometry, PhD Dissertation, Yale University, 1978.