Sunday, September 21, 2008

"Sketch Based Interfaces: Early Processing for Sketch Understanding"

by T. M. Sezgin, T. Stahovich, R. Davis

Summary

The authors is trying to provide a system where users can sketch naturally and have the their sketches understood, with no constraints on how things are drawn.

Early processing in the sytem consists of three phases: approximation, beautification, and basic recognition. Approximation finds the representative points of strokes in the sketch. Beautification modifies the points found in approximation phase to make the output nicer and more meaningful (show parallelism, perpendicularity, etc.) Basic recognition tries to interprete strokes into objects (rectangles, circles, etc.) And this paper focuses mainly on the approximation approaches. Stroke approximation is divided into vertex detection and curves handling.

The authors do vertex detection by combining the information in curvature and speed, since vertexes usually have high curvature and low drawing speed. Fd is the set of all points with local maximum curvatures above average curvature, while Fs is the set of all points with local minimum speed below 90% average speed. Using speed data alone misses short lines that are not drawn fast enough, and using curvature data alone produces many false positive point due to hand dynamics during freehand sketching. So the idea is to combine them. Three stages in generating hybrid fits:

1) Computing vertex certainties: (a) certainty for curvature canditate Vi is |Di-k - Di+k| / L, where Di is curvature at Vi, k is neighborhood size (not specified), L is path distance of substroke from Vi-k to Vi+k. (b) certainty for speed candidate Vi is 1 - vi/vmax.

2) Generating a set of hybrid fits: H0 is the intersection of Fd and Fs. In succession iterations, Hi' = Hi + Vs (Vs is the best remaining speed fit candidate), Hi" = Hi + Vd (Vd is the best remaining curvature fit candidate.) Goodness metric: least squares error (average of the sum of the squares of the distances to the fit from each point in the stroke S.) Ei = 1/|S| sigma ODSQ(s, Hi). Here |S| is the stroke length, and ODSQ stands for orthogonal distance squared, i.e. the square of the distance from the stroke point to the relevant line segment of the polyline difined by Hi. Compare the errors for Hi' and Hi", the one with smaller error become Hi+1. Repeat until all points in Fd and Fs have been used.

3) Selecting the best fit: set an error upper bound (not specified) and designate as final fit Hf, the Hi with the fewest vertices that also has an error below the threshold.

The second phase in approximation is curves handling. Define L1 as the euclidean distance between a pair of consecutive vertices found in vertex detection phase, and L2 be the path length of the sketch between these two points. The ratio of L2/L1 is significantly higher than 1 in curved regions of the sketch. Approximate the curved regions with Bezier curves: end points are consecutive vertices u = Si, v = Sj, and control points are c1 = KT1 + v, c2 = KT2 + u, where K = 1/3*sigma|Sk - Sk+1| where k from i to j (K = 1/3 path length of curve segment), and T1, T2 are unit length tangent vectors pointing inwards at the curve segment. The curve is subdivided in the middle if error of Bezier approximation is higher than a threshold (not specified), until the desired precision is achieved.


Discussion

The idea of using speed to find candidate verticies had already been stated in Herot's paper over 2 decades before this paper was written, and on this point, the recognition may rely partly on how the users draw, for example, some may draw very carefully with even speed. The method of choosing verticies by their certainty scores and least squares errors is the great part of this paper, though many threshold values are left unspecified (I guess some of these thresholds should vary according to the sketch it is dealing with.)

The accuracy of object recognition is as high as 96%, but this doesn't mean the approximation has equally high accuracy, for 96% might as well be attributed to the effect of beautification and the primitives in the system, for example, even the sketch is not properly approximated, the object recognizer may still correctly classify the sketch as rectangle, circle or spring, since it's easy to tell these objects apart.

No comments: