Planar Shape Recognition By Shape Morphing

Planar Shape Recognition By Shape Morphing
-Rahul Singh, and Nikolaos P. Papanikolopoulos [1]

Presented by Andrew Royes (aroyes@cs.mcgill.ca) and Danielle MacNevin (pmacne@cs.mcgill.ca)

The General Idea...

Just a Little Background...

Representing the Objects..

Shape Morphing and Recognition...

Check out our Java Applet..

Some of Their Results...

The General Idea...

For humans, it is relatively easy to distinguish and classify an object, according to the visual information we obtain from looking at the object. For example most people would be able to distinguish an apple from an orange easily, or the sun from the moon. However, for a computer this is a very complex task. The idea here is to use shape morphing to gather the necessary information that will allow a computer to classify a shape according to the shapes that that the computer already has in its database.

For example, if we have the following three shapes in the database:

Images in an image database

and a user drew the following shape,

User drawn image

then we would want our program to match it to the purple arrow, and not any of the other shapes in our database, even though it is rotated 90 degrees and a different color.

Shapes are represented by their contours. Using the contour information a morph is performed between the user drawn shape and the template objects (target shapes) stored in the database. The cost of a morph is measured with a physics-based method, and the morph that produces the smallest cost in the deformation of the original object to an already existing object in the database, is chosen as the closest match. The user drawn object is then classified as this template.

If this seems like an innovative method of pattern recognition to you, then read on!

Back to Index

Just a Little Background...

For a pattern recognition algorithm to be successful it should follow the following properties, which were presented by Arkin et al. (as taken from [1]):

The measure should be metric
It should be invariant under translation, rotation and scale change
It should be easy to compute
It should match intuitive notions of shape resemblance

This method manages to encompass all of these properties as well as:

An algorithm should be able to handle convex and non-convex objects and should also function in the presence of open as well as closed objects.
The algorithm should be able to recognize shapes with deformations, that is, if there is noise in the image, the algorithm should be able to exclude it and still come up with the right answer.

Back to Index

How to Represent the Objects Efficiently...

The authors chose to represented shapes by their contours. The contour of a shape is taken to be outer hull of the object.

Image Contour

The contour is chosen for a number of reasons:

It is easily attainable.
A variety of objects are already described by their contours.
It can be thought of as a higher order curve, therefore algorithms that use it to describe objects are extendible to recognize a variety of other classes of shapes such as hand drawn images and writing.

However, it is not the entire contour that we are really interested in since there can be a huge number of points on the contour of an object, and finding a correspondence between them could require a large amount of computation (I will talk more about what a correspondence is soon). What we really want, are the points where the object has large minimums and maximums, or the areas where the object bends and folds (in essence we want to find the corners of the objects, or the points which make the object unique). We will then use this approximation of the shape as our object contour.

For example, the interesting points of the above contour are illustrated by the small square in the image below:

Segmentation points on an image contour

Our polygon contours are represented as a sequence of points and lines in our java applet. (See java applet for more information). A shape is then segmented using a polygon segmenting algorithm to find the areas of interest and to reduce the number of points that will be involved in the morphing process. Once the polygon has been segmented, we are left with a new polygon approximation of the original polygon.

Back to Index

And Then There was Shape Morphing and Recognition...

The word morph is short for metamorphoses, which means to change from one state to another. We are using a morphing process specifically to gather information that will allow our program to recognize an object and to classify it according to existing objects already contained in a database of shapes.

The morph proceeds by deforming (stretching and bending) the source polygon (inputted by the user) until it matches the target polygon (contained in the database of shapes). A morph is created between the user drawn shape and each template object in the database. In order to create a morph between the source and target contours, the algorithm first needs to specify a correct correspondence between the key points obtained in segmentation of the source and target contours. This is done by mapping, the key points on the contour of the source object with those on the target polygon.

In the example below, we have a point correspondence specified by the color of the points on the corners of the triangles. The point correspondence is formulated in such a way that the morph that will ensue between the the input and target shape will have the smallest cost.

Point Correspondence between 2 triangles. The color correspondence shows which points correspond to each other.

However, it needs to be stated that two shapes, the input shape and the target shape, may not have the same number of segmentation points. If this is the case then two points (or more) in the source contour can map to the same point in the target contour. Once a correspondence is created, then a cross-dissolve plus a warp is used to formalize the morph.

Below is an example of one of the morphs that the authors generated using this algorithm.

Morph between an input image and a template image, that is the same as the input image but rotated by 45 degrees.

To obtain the best possible point correspondence possible between the source and target polygon they used a physics based approach which measures the amount of stretching and bending required to morph the source contour into the target contour. For each pair of points (or triplets of points) in the correspondence they use the following formulas to compute the amount of stretching and bending energy. To compute the amount of stretching energy required to morph the source into the target the following is used:

where:
K(s) is the stretching stiffness parameter
L(O) is the length of the segment in the user drawn polygon
L(T) length of the segment in the target polygon
L(I) is the length of the segment at the intermediate point between the input polygon and the target polygon.
c(s) is the penalty that is incurred for line segments collapsing to points.

In the above image, we have A as the initial input polygon, with segment L(o), the target polygon B with segment L(T), and C is the intermediate polygon with line segment L(I) (which is the interpolation of the line segments L(o) and L(T)). The red and blue points specify the point correspondences between the three polygons.

The above formula measures the amount of stretching energy between each pair of line segments specified by the point correspondence. The total amount of stretching energy needed to create the morph is the sum of all the stretching energies summed over all line line segments.

To calculate the amount of bending energy, the following formula is used:

where:
K(b) is the bending stiffness parameter
Phi(O) is the original angle between the point triplets in the user drawn image
Phi(T) is the angle of the point triplets in the target polygon
Phi(I) is the angle created by the point triplets in the intermediate polygon. (The polygon formed by the interpolation of the input polygon and the target polygon)

In the above image, we have A as the initial input polygon, with angle Phi(o), the target polygon B with angle Phi(T), and C is the intermediate polygon with angle Phi(I) (which is the interpolation of the line segments L(o) and L(T)). The red, blue, and purple points specify the point correspondences between the three polygons.

The above formula measures the amount of bending energy required between each triplet of points in the point correspondence and computes the cost of the angular deformation of the contour. Similar to the way that the the total amount of stretching energy was computed, we compute the total amount of bending energy required by taking the sum over all point correspondences.

To test all the different possible point correspondences, an exhaustive search method is employed, and then the best result form all these is used.

The point correspondence that yields the smallest values in both E(s) and E(b) is chosen as the optimal one, since it is the morph between the two polygons that requires the least amount of stretching and bending to create the morph.

Once we have the best point correspondence between the user drawn object and each template in the database, the template that would create the smallest morphing cost (and hence has the smallest E(s) and E(b) over all objects in the database) is chosen as the object which the user drawn template will be classified as. Since this is the object that requires the least amount of deformation between it and the user drawn image, it is therefore (by this measurement), the most similar to the user drawn image.

To summarize the overall recognition process, it is done as follows:

First: For each template in the database, we find the best point correspondence between it and the user drawn image out of all possible point correspondences.
Second: Once we have completed 1, we can then choose the template object that causes the smallest morphing cost between it and the input object based on the angular and stretching cost of the morph.
The information obtained in the first two steps are then used to transform the input object into the target object. (Giving us the morph of the input and target polygons).

Back to Index

After all that Reading, How About Checking out our Java Applet for a More Intuitive Feel for the Algorithm...

Please follow this link to see the applet and to read about the details of our implementation.

Back to Index

Or Maybe You Just Want to Check out Some of Their Results...

In this paper it is interesting to note that the results of the algorithm will only be as good as the segmentation algorithm used to select the key points from the original contour. In there experiments, misclassification was usually caused by improper selection of the feature points of the contour. It also must be said that because objects and sketches are morphed using only the contours of the object, if the object is shaded in any way, or if there is poor illumination on the object then the algorithm will break down since an improper contour of the object would be obtained for the shape.

They tested the algorithm in three different cases:

Case 1: Recognition of Rigid Objects

In this experiment, they used 16 template objects which were stored in a database. They used 80 test objects which were images taken of the 16 template objects but at various rotations and translation with added noise. The table below summarizes the results they obtained for this experiment.

Case 2: Recognition of Deformable Shapes

Deformable shapes were classified as hand-drawn outlines of objects, sketches of natural images, and contours obtained from real-world objects. 15 shapes were used in total, and 4 sketches of each shape were stored in the database as templates for a total of 60 objects to choose from. To conduct this experiment, people were shown what sketches were in the database briefly, but then had to sketch their own representation of the object without the templates in front of them. In this way, the results in the table below were obtained.

Case 3: Recognition of Unconstrained Cursive Handwriting

For this experiment they used 10 randomly chosen words and each user provide one template sample to store in the database for a total of 40 words per user in the database. They also asked each user to provide 4 test samples of the each of the 10 words to test the recognition algorithm with. The table below shows the results of these experiments.

Back to Index

Bibliography
[1] Rahul Singh, Nikolaos Papanikolopoulos: Planar shape recognition by shape morphing. Pattern Recognition 33(10): 1683-1699 (2000).