SKETCH RECOGNITION 2008: "LADDER, a sketching language for user interface developers"

by Tracy Hammond, Randall Davis

Summary

Sketch recognition systems can be quite time consuming to build if they are to handle the intricacies of each domain. With LADDER, a developer need only write a domain description which describes what the domain shapes looklik e, and how they should be displayed and edited after they are recognized. The language consists of predefined shapes, constraints, editing behaviors, and display methods, as well as a syntax for specifying a domain description. LADDER allows the developer to specify both hard and soft constraints. LADDER is the first language that not only can define how shapes are to be recognized, but also can define how shapes are displayed and edited. Trigger and action allows user defined editing.

(drawn usingDescription limitations: (1) LADDER can only describe shapes with a fixed graphical grammar (drawn with the same graphical components each time). (2) The shapes must be composed solely of the primitive constraints contained in LADDER and using only the constraints available in LADDER. (3) LADDER can only describe domains that have few curves or where the curve details are not important for distinguishing between different shapes. (4) LADDER can only describe shapes that have a lot of regularity and not too much detail.

Shapes are defined by a list of components, their aliases, constraints that define relationship of the components, editing behaviors (triggers and actions), and display methods (original, clean-up, ideal shape, custom shape). Shapes can be defined hierarchically. Also can extend abstract shape with is-a section (like extends in java). Shape groups (abstract shape groups) can be used by the recognition system to provide top-down recognition, and ‘‘chain reaction’’ editing behaviors.

The language includes the primitive shapes SHAPE, POINT, PATH, LINE, BEZIERCURVE, CURVE, ARC, ELLIPSE, and SPIRAL, also includes predefined shapes built from these primitives including RECTANGLE, DIAMOND, etc. SHAPE (properties: boundingbox, centerpoint, width, and height) is an abstract shape which all other shapes extend.

A number of predefined constraints are included in the language, including perpendicular, parallel, collinear, sameSide, oppositeSide, coincident, connected, meet, intersect, tangent, contains, concentric, larger, near, drawOrder, equalLength, equal, lessThan, lessThanEqual, angle, angleDir, acute, obtuse, acuteMeet, and obtuseMeet. constraints that are valid only in a particular orientation, including horizontal, vertical, posSlope, negSlope, leftOf, rightOf, above, below, sameXPos, sameYPos, aboveLeft, aboveRight, belowLeft, belowRight, centeredBelow, centeredAbove, centeredLeft, centeredRight, and angleL. isRotatable implies the shape can be found in any orientation.

There are predefined editing behaviors like dragInside. The possible editing actions include wait, select, deselect, color, delete, translate, rotate, scale, resize, rubberBand, showHandle, and setCursor. The possible triggers include click, doubleClick, hold, holdDrag, draw, drawOver, scribbleOver, and encircle.

Key word vector can be used to specify variable number of components, and must specify the minimum and maximum number of components.

The domain independent modules determine if the stroke can be classified as an ELLIPSE, LINE, CURVE, ARC, POINT, POLYLINE or some combination. Interpretations from low-level recognizers are produced and sent off to the higher level recognizer. In order to ensure that only one interpretation is chosen, each shape has an ID, and each shape keeps a list of its subshapes, including its strokes. At any particular time, each subshape is allowed to belong to only one final recognized domain shape. But if the primitive shape recognizer does not provide the correct interpretation of a stroke, the domain shape recognizer will never be able to correctly recognize a higher level shape using this stroke.

Domain shape recognition is performed by the Jess rule-based system. To improve efficiency, a greedy algorithm is used that removes subshapes from the rule-based system once a final higher level shape is chosen.

If an editing gesture such as click-and-hold or double-click occurs, the system checks to see (1) if an editing gesture for that trigger is defined for any shape, and (2) if the mouse is over the shape the gesture is defined for. If so then the drawing gesture is short-circuited and the editing gesture takes over.

To display a shape’s ideal strokes, the system uses a shape constraint solver (using optimization functions from Mathematica) which takes in a shape description and initial locations for all of the subshapes and outputs the shape with all of the constraints satisfied while moving the points as little as possible.

Domain shape recognizers, exhibitors, and editors are automatically generated during the translation process.

Discussion

The system is so powerful. The shapes are defined hierarchically, which is just like OO languages. User defined editing behaviors and display modes are also great.

1 comment:

Nabeel said...: Yes the object oriented interface of the language is what makes it more powerful and easy to use.

I also think that because of OO nature its possible to write a LADDER domain description generator, which could generate some basic descriptions to give the developer a head start in shape definition.; October 7, 2008 at 1:42 PM

SKETCH RECOGNITION 2008

Tuesday, October 7, 2008

"LADDER, a sketching language for user interface developers"

1 comment:

About Me

Blog Archive