Wednesday, October 1, 2008

"A Domain-Independent System for Sketch Recognition"

by Bo Yu, Shijie Cai

Summary

This paper utilizes low-level geometric features to handle smooth curves and hybrid shapes as well as poloylines. Previous sketch recognition systems mainly have these deficiencies: (1) strict restrictions on the drawing manner, (2) lack ability to analyze hybrid or smooth curves and decompose them into primitive shapes.

Their system is diveded into two stages: imprecise stroke approximation and post-process. Approximation classifies the strokes into lines, arcs, circles, ellipses and helixes and a combination of these primitive shapes. While post-process is applied after the user finishes drawing.

Imprecise stroke approximation phase uses a top-down approach. A stroke is segmented at the highest curvature point into 2 parts, then do feature-area verification with respect to primitive shapes on each segment to decide if it can be classified as one of those shapes, if approximation fails, the segment is recursively decomposes at the point with the next highest curvature. Curvature is defined as the sum of change in direction from Pn-k to Pn+k divided by path distance between Pn-k and Pn+k, where k is set to 2.

If the direction graph fits a horizontal line and number of deviation point to the least-square best fit line of the direction graph or the actual coordinate of the stroke falls below a loose threshold, then it could be a line. If the feature-area / candidate line's standard area is less than a strict threshold 1, then it's immediately approximated to be a line segment, otherwise, the decision is delayed till after the arc recognition. Circles are arcs with direction change of 2Pi. If the direction change of the stroke is greater than 2Pi, it's segmented into parts with direction change of 2Pi, and decide whether it's overtraced circle or helix (descending or ascending radii). To deal with self-intersection strokes, two copies of stroke is made, one segmented using the original procedure (at high curvature points), the other segmented at self-intersection points to see which gives better result. ((1) simpler is better: fewer parts of primitive shapes, (2) specifier is better: circles, ellipses, helixes are better than lines and arcs)

Post-process does simple relation retrieval (parallelism, perpendicularity, etc.), cleanup (tails removal, merge primitive elements), and basic object recognition (boundaries, rectangles, triangles, arrows, dash lines, etc.)

The overall accuracy is 98%, while accuray for arcs alone is 94%. Hybrid shape of arcs connected to lines gives only 70% accuracy because of low curvature at smooth connecting points.


Discussion

The top-down approach tackles the problem of noises in curvature and direction graph. And the rules used in cleanup phase is quite useful, 60%-80% meaningless elements are removed while meaningful parts are preserved. For the self-intersection part, would a spiral be interpreted into circles attached to arcs and lins? I think this may change the topological relation and the meaning of the sketch.

No comments: