Functionality: Behaviors

Whereas many document systems support some form of extensibility, the Multivalent system pushes this idea to the extreme. Almost everything that is not a tree node is an extension called a behavior. Behaviors are Java classes that participate in the communication protocols detailed in subsequent sections. Programmatically, this means that behaviors subclass the class Behavior and override methods corresponding to those protocols. Some behaviors happen to be packaged with the basic system, but they have no privileges over third-party behavior extensions. All user-level functionality is implemented by behaviors. The extension language is the implementation language.
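
As a rough illustration of what "subclass Behavior and override protocol methods" means, the sketch below uses a hypothetical stand-in for the Behavior base class with two made-up protocol methods, `build` and `paint`. The real class and its method signatures are not shown in this section, so every name here is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the system's Behavior base class. Each protocol
// method defaults to a no-op, so a subclass overrides only what it needs.
abstract class Behavior {
    void build(List<String> log) {}
    void paint(List<String> log) {}
}

// A behavior that takes part in both protocols.
class OutlineBehavior extends Behavior {
    @Override void build(List<String> log) { log.add("Outline.build"); }
    @Override void paint(List<String> log) { log.add("Outline.paint"); }
}

// A behavior that only cares about painting.
class HighlightBehavior extends Behavior {
    @Override void paint(List<String> log) { log.add("Highlight.paint"); }
}

class ProtocolDemo {
    // Runs each protocol over every behavior in turn. Bundled and third-party
    // behaviors would go through exactly the same calls.
    static List<String> run(List<Behavior> behaviors) {
        List<String> log = new ArrayList<>();
        for (Behavior b : behaviors) b.build(log);
        for (Behavior b : behaviors) b.paint(log);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of(new OutlineBehavior(), new HighlightBehavior())));
    }
}
```

Running `ProtocolDemo` prints `[Outline.build, Outline.paint, Highlight.paint]`: the highlight behavior is silent during the build protocol simply because it did not override that method.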

Types

Behaviors can be categorized according to primary function, although a single behavior may participate in several.

Media adaptors
Behaviors that primarily parse some concrete document format and build a runtime document tree are known as media adaptors. For example, the UNIX manual page media adaptor reads roff source, the HTML media adaptor goes to great pains to coerce the files of random bytes found on the Web into a structurally reasonable tree, and the scanned paper adaptor builds a document tree hierarchy of region, paragraph, line, word. In addition to viewing documents, media adaptors can be used for general-purpose access; for example, a full-text indexer could use media adaptors to decode PDF and DVI as uniformly and as easily as ASCII and HTML. Once bridged into the tree, a document of whatever source format enjoys the full array of existing functionality. This contrasts with single-format viewers, such as IDVI for TeX DVI, Ghostscript for PostScript, and xpdf for PDF, which, however excellent they are for their single format, must each recapitulate a large and growing amount of standard functionality.

To preserve the readability of digital documents in the future, Raymond Lorie of IBM Almaden proposes, according to "A Universal Tool to Rescue Old Files From Obsolescence", a virtual machine that everyone accepts and that definers of document formats support by writing software in the VM language that reads and displays their documents. How about Java and Multivalent media adaptors?
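
The media adaptor idea described above, including the full-text indexer use case, can be sketched roughly as follows. The `MediaAdaptor` interface, the `DocNode` tree, and the plain-text adaptor are all hypothetical simplifications, not the system's actual classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical document tree node: either a structural node or a word leaf.
class DocNode {
    final String label;
    final boolean isWord;
    final List<DocNode> children = new ArrayList<>();

    DocNode(String label, boolean isWord) {
        this.label = label;
        this.isWord = isWord;
    }

    DocNode add(DocNode child) { children.add(child); return child; }

    // Gathers the text of every word leaf in document order.
    void collectWords(List<String> out) {
        if (isWord) out.add(label);
        for (DocNode c : children) c.collectWords(out);
    }
}

// Hypothetical adaptor contract: concrete format in, document tree out.
interface MediaAdaptor {
    DocNode parse(String source);
}

// Builds document -> paragraph -> word from blank-line-separated text.
class PlainTextAdaptor implements MediaAdaptor {
    @Override public DocNode parse(String source) {
        DocNode root = new DocNode("document", false);
        for (String para : source.split("\\n\\s*\\n")) {
            if (para.trim().isEmpty()) continue;
            DocNode p = root.add(new DocNode("paragraph", false));
            for (String word : para.trim().split("\\s+")) p.add(new DocNode(word, true));
        }
        return root;
    }
}

class Indexer {
    // Works on the tree only, so any adaptor can feed it.
    static List<String> words(MediaAdaptor adaptor, String source) {
        List<String> out = new ArrayList<>();
        adaptor.parse(source).collectWords(out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words(new PlainTextAdaptor(), "one two\n\nthree"));
    }
}
```

Because `Indexer` sees only the tree, supporting another format in this sketch would mean writing another `MediaAdaptor`, with no change to the indexer.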

Structural
Structural behaviors modify protocols over a document subtree. Such behaviors "register interest" in a particular subtree, and subsequently each protocol invokes the behavior's corresponding methods before and after passing through the subtree rooted at that node. For example, table sorting rearranges the children of the given parent to achieve sorted order, clipboard markup generates a representation of the selected text with markup tags, and one type of search visualization hooks onto the scrollbar to paint its results on top of the scrollbar every time it is painted.
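
A minimal sketch of the before/after hooks, assuming a single made-up "format" protocol and hypothetical `TreeNode` and `StructuralBehavior` classes. The table-sorting behavior rearranges children in its "before" hook, so the protocol then descends through them in sorted order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical base class: hooks called around one protocol (here "format").
abstract class StructuralBehavior {
    void formatBefore(TreeNode subtreeRoot) {}
    void formatAfter(TreeNode subtreeRoot) {}
}

class TreeNode {
    final String name;
    final List<TreeNode> children = new ArrayList<>();
    private final List<StructuralBehavior> observers = new ArrayList<>();

    TreeNode(String name, TreeNode... kids) {
        this.name = name;
        children.addAll(List.of(kids));
    }

    // A behavior "registers interest" in the subtree rooted at this node.
    void addObserver(StructuralBehavior b) { observers.add(b); }

    // The protocol: call "before" hooks, descend, then call "after" hooks.
    void format(List<String> visited) {
        for (StructuralBehavior b : observers) b.formatBefore(this);
        visited.add(name);
        for (TreeNode child : children) child.format(visited);
        for (StructuralBehavior b : observers) b.formatAfter(this);
    }
}

// Sorts the children of the observed node before the protocol descends.
class SortChildren extends StructuralBehavior {
    @Override void formatBefore(TreeNode subtreeRoot) {
        subtreeRoot.children.sort(Comparator.comparing((TreeNode n) -> n.name));
    }
}

class StructuralDemo {
    public static void main(String[] args) {
        TreeNode table = new TreeNode("table",
                new TreeNode("pear"), new TreeNode("apple"), new TreeNode("fig"));
        table.addObserver(new SortChildren());
        List<String> visited = new ArrayList<>();
        table.format(visited);
        System.out.println(visited);
    }
}
```

Running `StructuralDemo` prints `[table, apple, fig, pear]`.
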
Span
This very common type of behavior extends from some offset within a start leaf linearly through leaf nodes to an offset within an end leaf. Examples of span type behaviors include font change, highlight, hyperlink, and copy editor markup.
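
The extent of a span can be pictured with the following sketch, which assumes the leaves are held in a flat list of strings and that the end offset is exclusive. The `Span` class and both assumptions are hypothetical, chosen only to show the start-leaf/offset to end-leaf/offset idea.

```java
import java.util.List;

// Hypothetical span: runs from an offset inside a start leaf, through every
// leaf in between, to an offset inside an end leaf (end offset exclusive).
final class Span {
    final int startLeaf, startOffset, endLeaf, endOffset;

    Span(int startLeaf, int startOffset, int endLeaf, int endOffset) {
        this.startLeaf = startLeaf;
        this.startOffset = startOffset;
        this.endLeaf = endLeaf;
        this.endOffset = endOffset;
    }

    // Returns the text the span covers, with leaves joined by single spaces.
    String coveredText(List<String> leaves) {
        StringBuilder sb = new StringBuilder();
        for (int i = startLeaf; i <= endLeaf; i++) {
            String leaf = leaves.get(i);
            int from = (i == startLeaf) ? startOffset : 0;
            int to = (i == endLeaf) ? endOffset : leaf.length();
            if (i > startLeaf) sb.append(' ');
            sb.append(leaf, from, to);
        }
        return sb.toString();
    }
}

class SpanDemo {
    public static void main(String[] args) {
        List<String> leaves = List.of("The", "quick", "brown", "fox");
        // From offset 2 of "quick" to offset 1 of "fox".
        System.out.println(new Span(1, 2, 3, 1).coveredText(leaves));
    }
}
```

Running `SpanDemo` prints `ick brown f`.
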
Lenses
Lenses, such as Magnify and Decypher, control a geometric portion of the document (described with a movable, resizable window). Lenses compose effects where they overlap, so that Magnify plus Show OCR yields magnified OCR, and Show OCR plus Decypher yields decyphered OCR.
Managers
Managers provide specialized coordination among behaviors beyond that provided by the usual means of communication. For example, composing effects where lenses overlap requires very specialized coordination, and yet the manager behavior that provides it has no special privileges in the system. When a lens is created, it queries the browser-level attributes for the lens manager, creating one on the spot if the lens is the first, and registers itself with it. During document painting, the lens manager computes intersections and invokes the individual lenses.
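
The create-on-first-use registration described here might look like the sketch below. It assumes the browser-level attributes behave like a simple map and that lenses are axis-aligned rectangles; the class names, the attribute key, and the pairwise overlap report are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LensManager {
    static final String KEY = "lensManager"; // hypothetical attribute name
    private final List<Lens> lenses = new ArrayList<>();

    void register(Lens lens) { lenses.add(lens); }

    int count() { return lenses.size(); }

    // Names each pair of lenses whose windows intersect, in registration order.
    List<String> composed() {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < lenses.size(); i++)
            for (int j = i + 1; j < lenses.size(); j++)
                if (lenses.get(i).overlaps(lenses.get(j)))
                    out.add(lenses.get(i).name + "+" + lenses.get(j).name);
        return out;
    }
}

class Lens {
    final String name;
    final int x, y, w, h;

    private Lens(String name, int x, int y, int w, int h) {
        this.name = name; this.x = x; this.y = y; this.w = w; this.h = h;
    }

    // Looks up the manager in the shared attributes, creating it if this is
    // the first lens, then registers the new lens with it.
    static Lens create(String name, int x, int y, int w, int h, Map<String, Object> browserAttrs) {
        Lens lens = new Lens(name, x, y, w, h);
        LensManager mgr = (LensManager) browserAttrs.computeIfAbsent(KEY_OF_MANAGER, k -> new LensManager());
        mgr.register(lens);
        return lens;
    }

    private static final String KEY_OF_MANAGER = LensManager.KEY;

    boolean overlaps(Lens o) {
        return x < o.x + o.w && o.x < x + w && y < o.y + o.h && o.y < y + h;
    }
}

class LensDemo {
    public static void main(String[] args) {
        Map<String, Object> attrs = new HashMap<>();
        Lens.create("Magnify", 0, 0, 100, 100, attrs);
        Lens.create("Show OCR", 50, 50, 100, 100, attrs);
        Lens.create("Decypher", 300, 300, 10, 10, attrs);
        LensManager mgr = (LensManager) attrs.get(LensManager.KEY);
        System.out.println(mgr.count() + " lenses, overlapping: " + mgr.composed());
    }
}
```

Running `LensDemo` prints `3 lenses, overlapping: [Magnify+Show OCR]`. The manager in this sketch is an ordinary object stored in the shared attributes, which mirrors the point that it needs no special privileges.
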
Filters
A number of behaviors customize the document. Annotations add user-authored information. Manual pages are transformed into outlines in order to present an overview of and easy access to the information contained. The autosearch behavior highlights keywords that might otherwise be overlooked in long documents. Behaviors could be written to strip ads from HTML or highlight relevant words in pages returned by a search engine.

Complexity

Behaviors vary in how flexible they are and in how difficult they are to write. An individual behavior may be customizable with attributes in its hub. For example, the behavior that appears in the popup menu on a word and sends that word to a dictionary or language translation service takes as attributes the title to show in the menu and the URL of the service. Most span types rely on a SpanUI behavior to put them in a menu (this separates functionality from user interface), and other spans could be added and their organization in the user interface rearranged. The SemanticUI behavior sends an arbitrary semantic event in response to invoking a menu item or button on the toolbar. A number of other behaviors are probably simple variations on existing behaviors. A demonstration "FBI Redaction" behavior, which blacks out spans of text and associates a reason code and comment, was written in two hours by starting with the hyperlink annotation behavior, changing the blue underline to black foreground and background, and changing the dialog box to ask for a comment rather than a URL.
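
Attribute-driven customization of the word-lookup behavior could be sketched as below. The attribute names `title` and `url`, the `WordLookup` class, and the example service address are assumptions for illustration; the hub's actual syntax is not shown here and is not modeled.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical word-lookup behavior configured entirely by its attributes.
class WordLookup {
    private final String title;
    private final String serviceUrl;

    WordLookup(Map<String, String> attrs) {
        this.title = attrs.getOrDefault("title", "Look up");
        this.serviceUrl = attrs.get("url");
    }

    // Text shown in the popup menu for the word under the cursor.
    String menuLabel(String word) {
        return title + " \"" + word + "\"";
    }

    // Address to request when the menu item is chosen.
    String requestUrl(String word) {
        return serviceUrl + URLEncoder.encode(word, StandardCharsets.UTF_8);
    }
}

class WordLookupDemo {
    public static void main(String[] args) {
        WordLookup dict = new WordLookup(Map.of(
                "title", "Define",
                "url", "https://dictionary.example/define?q="));
        System.out.println(dict.menuLabel("lens"));
        System.out.println(dict.requestUrl("hello world"));
    }
}
```

In this sketch, pointing the same class at a translation service would require only a different pair of attribute values, with no new code.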

Of course, wholly original ideas can be realized only by writing a new behavior from scratch, but it seems possible to accomplish interesting things in just a few hundred lines of code. Media adaptors can reuse node types already created for flowed and fixed document types. Current media adaptors range from 167 lines for ASCII, 180 for Zip, 238 for directory listings, and 260 for Perl's POD among the simple formats, to about 1000 for UNIX manual pages in the mid-range. At 4000 lines, HTML is the largest media adaptor, yet this is only 5% as large(!) as the rough equivalent in Mozilla. Spans are simpler, with most under 100 lines and the most complex (hyperlink) at 250. Lenses range from 50 to 100 lines. Other behaviors range from 100 to 400 lines.