The two main classes of nodes are leaf nodes and internal nodes, which have children. These two classes define common operations for tree navigation and manipulation for behaviors to exploit, in the manner of HTML's Document Object Model (DOM).
All nodes have a bounding box (bbox) and a list of observers. Common operations include: chaining up to the root of the current document, to the absolute root of the entire tree, or to the containing browser; moving to the next or previous node or leaf; marking the valid bit dirty up the tree; finding a node that matches a name, attribute names, or attribute values; converting relative coordinates to coordinates relative to another node, or to the root, in which case they are absolute; finding the lowest common ancestor of two nodes; comparing whether one node comes before another in a left-to-right tree traversal; and computing the active behaviors at a point in the tree.
Different concrete document types sometimes define one or more new leaf types. For example, while HTML and manual pages do not need much beyond the standard set of text and image types, scanned paper images define a hybrid image-OCR text type for words that functions as text, but that paints itself with that portion of the scanned image. This hybrid image-OCR type enjoys no special accommodation in the system and could have been implemented as a third-party extension, but because it follows the handful of display properties mentioned above, it can be annotated and otherwise manipulated by the same behaviors that operate on other concrete document types such as HTML.
The ability to define new node types (leaf as well as internal nodes) is an uncommon advantage. Other systems, such as HTML browsers (that support the DOM) and Microsoft Word, have a fixed set of node-like object types, which can be manipulated with JavaScript or Visual Basic or another language, but the set itself cannot be extended.
Media types with internal structure should be represented as structural subtrees rather than leaf nodes. For example, SVG graphics are described by a structural tree, one that could be manipulated to interesting effect. Even media types that are typically treated as opaque leaves in browsers, such as videos, have a set of frames, and perhaps some analysis has clustered frames into scenes and discovered objects within and across frames. Some browsers and editors can embed another application's window and GUI, as for instance a word processor embedding a live spreadsheet, but this visual placement leaves the manipulation of the embedded document at the same level of functionality as before. In contrast, a native structural tree, while it requires one-time additional work to parse and display, opens the document format to deep manipulation.
Functionally, internal nodes most often pass on flow of control to their children. Nodes have access to all protocols (see below), such as painting and mouse and keyboard events, and specialized internal node types take advantage of the low-level access to implement a wide variety of function:
Internal nodes have the usual collection of child management functions: get first/last child, get number of children (size), get child at index (childAt), get index of child (childNum), add child, insert child between two others, remove child, remove child at index, remove all children, replace child at index with another node. Also, get first/last leaf that involves descending the tree, report whether a passed node is contained in the subtree rooted at that node, and so on.
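The child management functions listed above can be sketched in Java as follows. This is only an illustration under stated assumptions, not the system's actual classes: size, childAt, and childNum are the names given in the text, while every other identifier (Node, INode, appendChild, insertChildAt, setChildAt, contains, and so on) is a hypothetical stand-in.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: size, childAt and childNum are named in the text;
// every other identifier here is a hypothetical stand-in.
class ChildManagementSketch {

    /** Any tree node; a plain Node acts as a leaf. */
    static class Node {
        final String name;
        INode parent;

        Node(String name) { this.name = name; }

        Node getFirstLeaf() { return this; }
        Node getLastLeaf() { return this; }

        /** True if other lies in the subtree rooted at this node. */
        boolean contains(Node other) {
            for (Node n = other; n != null; n = n.parent) {
                if (n == this) return true;
            }
            return false;
        }
    }

    /** Internal node: owns an ordered list of children. */
    static class INode extends Node {
        private final List<Node> children = new ArrayList<>();

        INode(String name) { super(name); }

        int size() { return children.size(); }
        Node childAt(int index) { return children.get(index); }
        int childNum(Node child) { return children.indexOf(child); }

        void appendChild(Node child) { insertChildAt(child, children.size()); }

        void insertChildAt(Node child, int index) {
            children.add(index, child);
            child.parent = this;
        }

        Node removeChildAt(int index) {
            Node removed = children.remove(index);
            removed.parent = null;
            return removed;
        }

        boolean removeChild(Node child) {
            int index = childNum(child);
            if (index < 0) return false;
            removeChildAt(index);
            return true;
        }

        void removeAllChildren() {
            while (!children.isEmpty()) removeChildAt(children.size() - 1);
        }

        /** Replaces the child at index and returns the old child. */
        Node setChildAt(Node child, int index) {
            Node old = children.set(index, child);
            old.parent = null;
            child.parent = this;
            return old;
        }

        // Descend through first (or last) children until a leaf is reached.
        @Override Node getFirstLeaf() {
            return children.isEmpty() ? null : children.get(0).getFirstLeaf();
        }
        @Override Node getLastLeaf() {
            return children.isEmpty() ? null : children.get(children.size() - 1).getLastLeaf();
        }
    }

    public static void main(String[] args) {
        INode para = new INode("p");
        para.appendChild(new Node("Hello"));
        para.appendChild(new Node("world"));
        para.insertChildAt(new Node("big"), 1);
        System.out.println(para.size() + " children, first leaf: " + para.getFirstLeaf().name);
    }
}
```

In this sketch an empty internal node has no first or last leaf and returns null; a real implementation might instead skip over empty subtrees.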
Occasionally, implementation convenience or performance requires a non-structural internal node. These nodes have a null name.
In addition to structural relationships, all nodes capture physical layout in the form of bounding boxes. A bounding box is a rectangle that describes the dimensions (width and height) and position (x and y location) of a node's content (which for internal nodes is generally the union of the bounding boxes of its children). Node positions are given relative to the parent node to enable incremental manipulation such as reformatting. (System functions translate relative coordinates into absolute, or parent-relative to arbitrary ancestor-relative.)
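A minimal sketch of the coordinate translation just described, assuming each node stores its bounding box in parent-relative coordinates. The class and method names (Node, getLocationRelativeTo, getAbsoluteLocation) are hypothetical; only java.awt.Rectangle and java.awt.Point are real library classes.

```java
import java.awt.Point;
import java.awt.Rectangle;

// Illustrative sketch only; all class and method names are hypothetical.
class RelativeCoordinatesSketch {

    static class Node {
        final Rectangle bbox;   // x and y are relative to the parent node
        final Node parent;      // null for the root

        Node(Node parent, int x, int y, int width, int height) {
            this.parent = parent;
            this.bbox = new Rectangle(x, y, width, height);
        }

        /** Position relative to the root, which is the absolute position. */
        Point getAbsoluteLocation() { return getLocationRelativeTo(null); }

        /** Sums parent-relative offsets up to, but not including, the ancestor. */
        Point getLocationRelativeTo(Node ancestor) {
            int x = 0, y = 0;
            for (Node n = this; n != ancestor; n = n.parent) {
                if (n == null) throw new IllegalArgumentException("not an ancestor");
                x += n.bbox.x;
                y += n.bbox.y;
            }
            return new Point(x, y);
        }
    }

    public static void main(String[] args) {
        Node root = new Node(null, 10, 20, 600, 800);
        Node section = new Node(root, 5, 30, 590, 200);
        Node word = new Node(section, 7, 2, 40, 12);
        System.out.println(word.getAbsoluteLocation());

        // Pushing the section down moves every descendant; no child bbox is touched.
        section.bbox.y = 50;
        System.out.println(word.getAbsoluteLocation());
    }
}
```

The second half of main shows why parent-relative positions suit incremental reformatting: repositioning one subtree is a single assignment, whatever the size of the subtree.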
All nodes have a formatting valid bit. When a node is edited, the window is resized, or more of an incrementally loaded document arrives, the affected nodes are marked dirty (valid equals false). The affected nodes in turn mark dirty their parents and their parents' parents and on up to the containing document, since if any part of a document (or subtree) is dirty, the entire document (or subtree) is dirty. When next the document is to be painted on the screen, which may have been triggered by the same editing operation, the document is formatted on demand. Thus, valid bits enable batch updating of potentially many operations that invalidated tree layout.
Moreover, rather than blindly formatting the entire document, only those parts marked dirty are formatted, and typically a subtree will be able to simply reposition most of its otherwise valid children in its relative coordinate space, thus saving much computation and achieving the speed necessary for interactive editing and annotation. In most cases, reformatting can stop earlier if a reformatted node maintains the same shape as before. For example, text pasted into a paragraph would require reformatting the paragraph, which may or may not require a new line break, and if so may or may not spill to subsequent lines, and if so may or may not add a new line to the end of the paragraph, which if so would require pushing down subsequent paragraphs in the section, which would require pushing down subsequent sections, and so on. At each step, reformatting stops as soon as a node keeps its previous shape.
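The valid-bit mechanism can be sketched as follows. This is a simplified illustration with hypothetical names (Node, markDirty, formatOnDemand, formatCount); it shows dirty marking up the tree and the skipping of valid subtrees, but it does not model the early stop when a reformatted node keeps its shape.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; all names are hypothetical.
class ValidBitSketch {

    static class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node parent;
        boolean valid = false;   // new nodes start out unformatted
        int formatCount = 0;     // counts layout passes, for demonstration only

        Node(String name) { this.name = name; }

        Node add(Node child) {
            child.parent = this;
            children.add(child);
            markDirty();         // a structural change invalidates this node and its ancestors
            return child;
        }

        /** Marks this node and every ancestor dirty. */
        void markDirty() {
            for (Node n = this; n != null; n = n.parent) n.valid = false;
        }

        /** Formats only dirty nodes; valid subtrees are skipped entirely. */
        void formatOnDemand() {
            if (valid) return;
            for (Node child : children) child.formatOnDemand();
            formatCount++;       // stand-in for real layout work
            valid = true;
        }
    }

    public static void main(String[] args) {
        Node doc = new Node("doc");
        Node p1 = doc.add(new Node("p1"));
        Node p2 = doc.add(new Node("p2"));
        doc.formatOnDemand();    // first paint formats everything
        p2.markDirty();          // several edits before the next paint ...
        p2.markDirty();          // ... cost only one reformat
        doc.formatOnDemand();
        System.out.println("p1 formatted " + p1.formatCount + "x, p2 formatted " + p2.formatCount + "x");
    }
}
```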
All graphical user interface components or widgets (including buttons, checkboxes, menus, menubars, popup menus, scrollbars, type-in fields, and so on) are implemented as specializations of tree nodes. The system doesn't use Java GUI widgets, neither AWT nor Swing. Widgets appear in the document tree just like other document content such as text and images. This uniformity and integration in the document model makes widgets as accessible and manipulable as other content, as by style sheets; allows easy embedding of widgets within "ordinary" document content or complex document content within widgets; and leverages advanced functionality developed primarily for one class of nodes, say HTML tables, for others, as for GUI layout. Developers can easily introduce new widgets, taking advantage of the power of the Java2D API, which supports roughly the same graphical operations as PostScript.
With most other systems, the set of widgets, while often quite extensive, does not support complex document content well (multiple fonts, embedded images, and sophisticated layout), or if it does, it is in a "leaf widget" that does not mix well with other widgets. For example, as HTML has become the common cross-platform rich text format, widget sets such as Tk and Java Swing have supplied HTML parser-renderers. However, these widgets document with foreign objects Multivalent document model views everything as a complex document, with some parts
Widgets compose with one another as well: a button, checkbox, or radiobox can simply be embedded in a menu, and a menubutton embedded in a menu amounts to a cascade. Traditional GUI widget sets are built from the point of view of providing containers (windows, panes) and some standard controls (buttons, scales), but the main editing area is a leaf whose content is opaque to the rest of the GUI. If we're lucky, as in Tk, the widget set provides generally useful big drawing areas with sophisticated drawing and/or rich text display. In contrast, the Multivalent perspective sees everything as a document with any part capable of display and events; some specializations of document tree nodes such as hyperlinks and annotations happen to look more like document category activities, other tree nodes such as menus and scrollbars happen to look more like GUI widgets.
Thus, GUI widgets perforce compose well with the rest of the document, including event propagation, relative coordinates, lenses, and so on. Widgets are extremely cheap, costing about the same as a word. Their formatting can be controlled with style sheets and spans. It is easy to add new GUI widgets, just as media adaptors often add new leaf types. Drawing can take advantage of Java2D, which essentially gives PostScript-level graphical operators. GUI layout can be done with the same layout nodes as used in document layout, such as with HTML tables. Nesting document components within widgets is easy: When HTML introduced a button type that could contain arbitrary HTML, this was a trivial tweak to implement on top of the existing button type with a simple text label. Likewise, an editable note is simply a nested document, so it picks up for free as-needed scrollbars and itself can be annotated.
On the other hand, the set of from-scratch native Multivalent widget types is presently not as complete as modern GUI toolkits. The implementation does include enough for HTML and basic GUI—menu, button, checkbutton, radiobutton, scrollbar, subwindow—but not scale, cascade menu, combo box, and others. And, of course, the whole system could be seen as an ultra sophisticated text widget.
Whereas many document systems support some form of extensibility, the Multivalent system pushes this idea to the extreme. Almost everything that is not a node is a subclass of a Java class with a well-defined interface, called a behavior. Some behaviors happen to be packaged with the basic system, but they have no privileges over third-party behavior extensions. All user-level functionality is implemented by behaviors: support for concrete document types (HTML, scanned paper, UNIX manual pages, Zip file display and file extraction, Perl POD format), cookies, additional HTTP protocols ("about", "generate"), lenses (magnify, Show as OCR), spans (hyperlinks, executable copy editor markup), annotation saving, searching, table sorting, scanning old document formats for likely hyperlinks, key bindings for Emacs and Microsoft Windows, condensed "executive summary" display, even the cursor and selection span, among others.
The fixed core is extremely small, serving for the most part as a bootstrap loader for behaviors. The set of behaviors to load is given by one or more hub documents, which are simply lists, with implicit priority ordering, in XML format, of behaviors and attributes for them. Hubs can be compared to style sheets in that, given a document with some structure, style sheets describe how to display the document, while hubs describe how one may interact with it. However, whereas style sheets are content with specifying fonts and margins, hubs control the construction of the entire application.
Sometimes behaviors create other behaviors outside of the listing in a hub and so bypass the preferred behaviors; for this case, all behavior creation is processed through a behavior mapping kept in the startup file. So if a user were to replace the current search behavior, which searches for literal strings only, with one that understands regular expressions, a one-line change (and no recompilation) updates all behaviors that rely on the search function to handle regular expressions, thus enlivening even legacy Multivalent documents.
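The effect of such a mapping can be sketched as below, under the assumption that behaviors are instantiated by reflection from a logical-name-to-class-name table. The text mentions a getBehaviorInstance hook and a startup file; the factory class, the map method, and the two search classes are hypothetical stand-ins, and a real mapping would be read from the startup file rather than set in code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative sketch only; all class names are hypothetical.
class BehaviorMappingSketch {

    interface Behavior {
        boolean matches(String text, String query);
    }

    static class LiteralSearch implements Behavior {
        public boolean matches(String text, String query) { return text.contains(query); }
    }

    static class RegexSearch implements Behavior {
        public boolean matches(String text, String query) {
            return Pattern.compile(query).matcher(text).find();
        }
    }

    /** All behavior creation goes through this single point. */
    static class BehaviorFactory {
        private final Map<String, String> mapping = new HashMap<>();

        /** Stand-in for one line of the startup file: logical name -> class name. */
        void map(String logicalName, String className) { mapping.put(logicalName, className); }

        Behavior getBehaviorInstance(String name) {
            // Unmapped names are treated as fully qualified class names.
            String className = mapping.getOrDefault(name, name);
            try {
                return (Behavior) Class.forName(className).getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("cannot create behavior " + name, e);
            }
        }
    }

    public static void main(String[] args) {
        BehaviorFactory factory = new BehaviorFactory();
        factory.map("Search", LiteralSearch.class.getName());
        System.out.println(factory.getBehaviorInstance("Search").matches("robust hyperlinks", "hyper.*s"));

        // The "one-line change": every client asking for "Search" now gets regex support.
        factory.map("Search", RegexSearch.class.getName());
        System.out.println(factory.getBehaviorInstance("Search").matches("robust hyperlinks", "hyper.*s"));
    }
}
```

Clients never name a concrete class, so swapping the implementation requires no change to, or recompilation of, the behaviors that request it.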
The separation of document function in hubs and document data in existing concrete formats gives a number of advantages. Simple (sometimes legacy) document formats can be enlivened with modern ideas (hyperlinks for ASCII), and more modern formats can be enlivened with ideas that are too new to be included, too specialized or esoteric for a general format, or simply too much to be included in a specification meant to be implemented by a number of parties on a variety of devices. Moreover, since hubs can be stored separately from content, they can be applied to servers and media that are not cooperative. For example, one can annotate a scanned page image on a web site, which itself supports just simple image display.
Whereas the document tree contains the data and has limited function in painting, layout and event propagation, behaviors supply most of the function in the system. Because concrete documents are represented in abstract trees during runtime, whose basic properties are guided by a set of Java superclasses and whose structure retains as much semantic information as possible, behaviors implementing general functionality can operate on all concrete document types. For instance, because the search behavior relies on a well-defined access method for text, it can search HTML, scanned paper, and any other format. Likewise, because leaf nodes support a common set of display properties, such as background color, underline, and mouse events, all document formats support selection of content and the addition of hyperlinks, even scanned paper, DVI, and ASCII.
Behaviors can invoke one another. Both directory listing and Zip file display behaviors rely upon the table sorting behavior to sort their columns of various types: name strings, size integers, and last-modified dates. Behaviors do not invoke others directly, however, but rather the message (a semantic event, described below in "Protocols") is passed among any other behavior that may be interested, often all the behaviors active in the document, so that other behaviors have a chance to massage, cancel, or even execute the message themselves (one should trust a behavior before loading it!). Furthermore, behaviors should send out messages announcing interesting things they are doing and have done, such as "openDocument" and "openedDocument". The search behavior is continuously active, searching each new document as it is opened, by listening for "openedDocument" events; then search announces "searchHits", which the search visualization behavior greets as the time to compute a new display.
Behaviors can be categorized according to primary function, although a single behavior may participate in several.
Customizing behaviors ranges in flexibility and difficulty. As mentioned above, which behaviors are loaded is described by hubs. An individual behavior may be customizable with attributes. The behavior that appears in the popup menu on a word and sends that word to a dictionary or word translation service takes as attributes the title to show in the menu and the URL of the service. Most span types rely on a "SpanUI" behavior to put them in a menu, and other spans could be added and their organization in the user interface rearranged. The "SemanticUI" behavior sends an arbitrary semantic event in response to invoking a menu item or button on the toolbar. A number of other behaviors are probably simple variations on existing behaviors. Of course, wholly original ideas can be satisfied only by writing a new behavior from scratch, but it seems possible to accomplish interesting things in just a few hundred lines of code. Media adaptors can reuse node types already created for flowed and fixed document types, and current media adaptors range from 167 lines for ASCII, 180 for Zip, 238 for directory listing and 260 for Perl's POD among the simple formats, to about 1000 for UNIX manual pages in the mid-range, and over 4000 for HTML. Spans are simpler, with most under 100 lines and the most complex (hyperlink) at 250. Lenses range from 50 to 100 lines. Other behaviors range from 100 to 400 lines.
The system framework manages overall flow of control, flowing through nodes and invoking behaviors during fundamental document protocols:
Behaviors rarely invoke one another directly. Instead, they send requests through protocols, so that all interested behaviors have a chance to modify, augment, short-circuit, or prevent them; in this way protocols enable behaviors to filter other behaviors.
There are presently over 150 behaviors, and during runtime several hundred can be active on a given document; protocols are the main coordination mechanism, invoking behaviors to do the right thing at the right time, and, by spreading behaviors over protocol and document fragment, preventing them from colliding. Experience shows that only rarely are multiple behaviors interested in the same portion of the document in the same protocol; most of these occasions call for summing the effects (for example, bold text plus hyperlink yields blue underlined bold text), and the handful of remaining conflicts are resolved by the priority order implicit in the hub listing (see above) or on other occasions a behavior's self-proclaimed priority.
Most protocols have before and after phases; during the before phase some behaviors can create or take actions that can be built upon or reversed during the after phase. For example, during build before, one behavior can load in the main body of a document, so that it will be available for annotations to hook into during build after. Behaviors can short-circuit a protocol from before to after phases to prevent lower priority behaviors from taking their effect.
As appropriate to their intrinsic nature, protocols are either round robin or tree based. Round robin protocols flow through the before phase of all active behaviors from highest priority to lowest, then the after phase in reverse order. Thus the highest priority behavior is called first and last, getting the first word and the last say, as it were. The round robin protocols are build (at the start of which the tree does not exist), semantic events (such as "openDocument" and "newBrowserInstance"), and save.
Tree-based protocols proceed through a depth-first tree walk, during which tree nodes can also affect control flow. Behaviors interested in a structural portion of the tree register interest (programmatically add themselves as observers) to the node at the head of the subtree, and during the tree walk the before methods of the observers are called before the node and its children are traversed, and the after methods of the observers are called after the node is done; behaviors can short-circuit from before to after, and after can short-circuit to cancel the remainder of the tree walk. The tree-based protocols are format, paint, and low-level events (which are primarily keyboard and mouse events).
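The tree walk described above might look roughly like this. It is a sketch under assumptions, not the system's code: the names (Node, Observer, walk, Tracer) are hypothetical, a before method returning true is taken to skip the node and its children, after methods are assumed to run in reverse order over the observers whose before methods ran, and a cancelled walk is assumed to return immediately without running the after methods of enclosing nodes.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; all names and the exact short-circuit rules are assumptions.
class TreeProtocolSketch {

    interface Observer {
        boolean before(Node node, List<String> log);  // true = skip node and children
        boolean after(Node node, List<String> log);   // true = cancel rest of the walk
    }

    static class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        final List<Observer> observers = new ArrayList<>();

        Node(String name) { this.name = name; }

        /** Depth-first walk; returns true if the remainder of the walk is cancelled. */
        boolean walk(List<String> log) {
            int ran = 0;                       // how many observers had before() called
            boolean shortCircuit = false;
            while (ran < observers.size() && !shortCircuit) {
                shortCircuit = observers.get(ran).before(this, log);
                ran++;
            }
            if (!shortCircuit) {
                log.add("visit " + name);      // stand-in for format, paint, or event work
                for (Node child : children) {
                    if (child.walk(log)) return true;
                }
            }
            for (int i = ran - 1; i >= 0; i--) {
                if (observers.get(i).after(this, log)) return true;
            }
            return false;
        }
    }

    /** Observer that records calls and optionally short-circuits in before(). */
    static class Tracer implements Observer {
        final boolean skipSubtree;
        Tracer(boolean skipSubtree) { this.skipSubtree = skipSubtree; }
        public boolean before(Node node, List<String> log) {
            log.add("before " + node.name);
            return skipSubtree;
        }
        public boolean after(Node node, List<String> log) {
            log.add("after " + node.name);
            return false;
        }
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        root.children.add(a);
        root.children.add(b);
        a.observers.add(new Tracer(true));    // hides subtree a from the protocol
        b.observers.add(new Tracer(false));
        List<String> log = new ArrayList<>();
        root.walk(log);
        System.out.println(log);
    }
}
```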
| Protocol | Round-robin or Tree | Before and After Phases? | Example |
|---|---|---|---|
| Restore | all behaviors | | |
| Build | Round-robin | | media adaptor builds document tree |
| Format | Tree | X | |
| Paint | Tree | X | |
| Low-level Events | Tree | X | Emacs key bindings |
| High-level Semantic Events | Round-robin | X | |
| Save | Round-robin | | PersonalAnnos saves annotations |
In comparison with low-level Java AWT events,
semantic events are used for higher-level actions, such as a
message to open a new page and most other commands found in menus and
the toolbar. Semantic events allow arbitrary events by name,
e.g., new SemanticEvent(this, "openDocument", ...).
Semantic events are routed through all behaviors in order from
highest to lowest priority and back from lowest to highest.
During high-to-low, behaviors can short-circuit to the corresponding point in low-to-high.
During low-to-high, behaviors can short-circuit to termination.
Behaviors not recognizing a particular semantic event must benignly ignore it.
This message passing scheme should not lead to performance issues as semantic events are at a high
level and therefore unlikely to be fired in rapid succession; for
instance, whereas the low-level mouse movement needs to be efficient,
any overhead in dispatching a semantic event to, say, open a new
browser instance, is swamped by the carrying out of that event.
For example, the directory listing behavior sends out "tableSort descending",
which is caught by the TableSort behavior.
During eventBefore, the behaviors that take primary action to that event should pass it through,
giving other behaviors the opportunity to filter it.
High-level SemanticEvents, such as openBrowser, are sent via the
round-robin chain of all behaviors. The event is
passed through eventBefore methods of behaviors from highest
to lowest priority, then eventAfter in lowest to highest
priority. Behaviors can short-circuit from high-to-low to low-to-high
passes by returning true in eventBefore, and
short-circuit the chain entirely by returning true in
eventAfter.
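The routing rules above can be sketched as a small dispatcher. The names eventBefore, eventAfter, and SemanticEvent come from the text; the SemanticEvent class shown here is a simplified stand-in whose three-argument constructor is an assumption, and dispatch and Tracer are hypothetical. The sketch also assumes that a behavior that short-circuits in eventBefore still receives its own eventAfter call.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; SemanticEvent here is a simplified stand-in.
class SemanticEventSketch {

    static class SemanticEvent {
        final Object source;
        final String message;
        final Object arg;
        SemanticEvent(Object source, String message, Object arg) {
            this.source = source;
            this.message = message;
            this.arg = arg;
        }
    }

    interface Behavior {
        boolean eventBefore(SemanticEvent e);  // true = turn around here
        boolean eventAfter(SemanticEvent e);   // true = stop the chain entirely
    }

    /** Routes an event through behaviors listed from highest to lowest priority. */
    static void dispatch(List<Behavior> byPriority, SemanticEvent e) {
        int last = byPriority.size() - 1;
        for (int i = 0; i < byPriority.size(); i++) {
            if (byPriority.get(i).eventBefore(e)) {
                last = i;                      // lower-priority behaviors never see the event
                break;
            }
        }
        for (int i = last; i >= 0; i--) {
            if (byPriority.get(i).eventAfter(e)) return;
        }
    }

    /** Behavior that records calls and short-circuits eventBefore for one message. */
    static class Tracer implements Behavior {
        final String name;
        final String vetoedMessage;            // may be null
        final List<String> log;
        Tracer(String name, String vetoedMessage, List<String> log) {
            this.name = name;
            this.vetoedMessage = vetoedMessage;
            this.log = log;
        }
        public boolean eventBefore(SemanticEvent e) {
            log.add(name + ".before");
            return e.message.equals(vetoedMessage);
        }
        public boolean eventAfter(SemanticEvent e) {
            log.add(name + ".after");
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<Behavior> chain = new ArrayList<>();
        chain.add(new Tracer("A", null, log));
        chain.add(new Tracer("B", "openDocument", log));
        chain.add(new Tracer("C", null, log));
        dispatch(chain, new SemanticEvent(chain, "openDocument", null));
        System.out.println(log);
    }
}
```

Behaviors that do not recognize a message simply return false from both methods, which is how they "benignly ignore" it.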
There are several ways events are sent to behaviors:
User interface controls (widgets) that invoke a high level action,
such as creating a new lens, should not directly do so. Rather, they
send a semantic event announcing this fact, giving other behaviors a
chance to filter it. The UI widget should catch the event in its
eventAfter and take action then. Since the event potentially
passes through all behaviors, of which there may be many, only high
level events should be passed this way, not ones requiring high
performance attention such as mouse movements. Any behavior can send
an event, by creating one and launching it via Browser.eventq(Event),
which automatically executes the proper routing mechanism (tree or
round-robin) for the event type.
retains parse so 80% of CSS easy
Some types of Layers:
Hub documents list the behaviors and layers that comprise a
document. Hub documents are written hierarchically, in XML,
and given an .mvd suffix. Each
entry corresponding to a behavior has a Behavior attribute
that gives the name of the behavior, and any number of other
behavior-specific attributes, which are passed onto the instantiation
of the behavior in the document. Behavior names can be the full
package and class name, such as
multivalent.std.lens.SignalLens, or a shorter, more generic
name, such as OCRLens, that is resolved into the full class
name using a mapping in the user's pref.txt startup file; the
shorter name allows the user to use a different, in some way
preferable implementation of a behavior and have it apply immediately
to pre-existing documents.
In instantiating a document, several hub documents are cascaded, or
combined, by the system. The pan-document, pan-genre
System.mvd is always loaded. A pan-document, genre-specific
hub, such as HTML.mvd and Xdoc.mvd, is loaded next,
if one exists. Next, the document-specific hub is loaded, if one exists.
The system automatically creates a user-specific, document-specific
hub to hold private annotations, if one does not already exist.
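The cascade order can be sketched as follows, with hubs modeled as an in-memory map from hub name to an ordered list of behavior names. System.mvd and HTML.mvd are names from the text; the naming of the document-specific and user-specific hubs, the behavior names, and the cascade method itself are hypothetical, and a real implementation would parse XML hub files rather than consult a map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; hub naming beyond System.mvd and HTML.mvd is hypothetical.
class HubCascadeSketch {

    /** Returns the behaviors to instantiate, in cascade order. */
    static List<String> cascade(Map<String, List<String>> hubs,
                                String genre, String document, String user) {
        List<String> loaded = new ArrayList<>();

        // 1. The pan-document, pan-genre hub is always loaded.
        List<String> system = hubs.get("System.mvd");
        if (system == null) throw new IllegalStateException("System.mvd is required");
        loaded.addAll(system);

        // 2. The genre-specific hub, if one exists.
        loaded.addAll(hubs.getOrDefault(genre + ".mvd", Collections.emptyList()));

        // 3. The document-specific hub, if one exists.
        loaded.addAll(hubs.getOrDefault(document + ".mvd", Collections.emptyList()));

        // 4. The user-specific, document-specific hub, created empty if missing.
        loaded.addAll(hubs.computeIfAbsent(user + "-" + document + ".mvd",
                                           key -> new ArrayList<>()));
        return loaded;
    }

    public static void main(String[] args) {
        Map<String, List<String>> hubs = new HashMap<>();
        hubs.put("System.mvd", Arrays.asList("Cursor", "Selection"));
        hubs.put("HTML.mvd", Arrays.asList("HTML"));
        System.out.println(cascade(hubs, "HTML", "paper", "alice"));
        System.out.println(hubs.keySet().contains("alice-paper.mvd"));
    }
}
```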
Thus behaviors enjoy
simultaneous access to both structural and physical data, and this
access enables additional features.
doc tree structural but with bboxes. doc conversions easy: 2N vs N^2. struct and graphical manipulations: ASCII for full-text search, PDF, WML for wireless
simple tree walk on HTML to generate ASCII or correct(ed), pretty printed
HTML. little more work and get XHTML converter.
so, when constructing tree, add the comments from the source document
and other information that will not necessarily appear in the final, so
that translations (or round-trip to original concrete format, with edits)
can be as complete as possible.
layout: parabox and HTML table
document tree data, little function
but layout and painting, which could be done with behaviors but
since most internal nodes do layout and most leaf nodes paint,
in essence fuse behavior and node
style sheet, span
skip over valid, as indicated by valid bit
start formatting from any point by collecting actives
A well-defined document tree enables a number of useful capabilities
across concrete document types. A simple tree walk of scanned
both structural and physical. e.g., document translations: HTML to PDF
uniform document model - words, gui, visual layers, scroll regions -- all nodes (paint + events). style sheet control over doc and UI. button can have table and image.
initially browsers used tags to signal font changes and line breaks, internally stored as lines or a mess!
nowadays, with TABLE relying on structure and DOM and CSS describing actions structurally, I would imagine that more of structure retained. but lotsa special cases
Multivalent: eschew special cases
as temporal walkthrough of document lifecycle with examples
powerful
access to fundamental operations vs scripting language on fixed functionality
integrated system vs applet/plug-in
cross platform (Java) vs x86/Windows-specific MSIE
media adaptors vs helper apps. helper apps all different worlds. maybe bought Acrobat and so can anno PDF, but no help on HTML. Yet have to implement in Java, which can be a lot of work. but once done, integrated. full-text search on everything: HTML, OCR, man pages, .... conversions: sometimes can't (OCR+image), always lose info, have to make and maintain conversions vs just works.
of course, media adaptors can do a cheap translation to HTML. even here better: always work from authoritative source so if translator improved everyone immediately benefits without need to batch retranslate or whatever, immediately available within the workflow and whatever platform and location, use other behaviors such as anno on top
cite that great DVI viewer -- all that effort, but now what if want that feature on some other document type? likewise, full-text search on XXX, but not HTML. balkanized
client vs server. more function on client. some servers more function than others, and by putting smarts in client, benefit wherever you go.
distribution of components vs the open source Mozilla/Amaya. distribution a big problem. people won't want different probably conflicting patches for each site; getting hack accepted back into main source tree problematic -- political, bloat
even if you recompile the Linux kernel, do you want to recompile browser for every other web site?
composition of extensions vs no composition
media adaptors that support several document types vs HTML app+mail app+news app. extensions, such as annotations, are written against document tree and so work on all document types. e.g., replace with on HTML works on scanned works on any new. often easy to convert to HTML, which is better than other converters because done when needed so always there when you need it and maintenance
hard to demonstrate any new ideas accommodated, but in implementation everything's an extension to small core: parser/renderers for HTML scanned manual ..., annotations, cookies, ...; e.g., hooked in robust hyperlinks [ref DDEP00 paper] without modifying core; e.g., lenses
Clipboard building
4. Layers
System, Document, Personal, Scratch
Visual layers
5. Supporting Services
Browser
behavior instance creation
grab
various hooks: selection, cursor, root including UI,
root of document content
takes events from Java and passes to tree
getBehaviorInstance -- maps, calls restore
Development Support
Live Document Tree Data Display
Here there be monsters.
- bounds lens,
replace current page with view of its doc tree data structure. color coded: red for errors, orange for warnings, green for unusual (+ scroll to point in long doc) [screen dump?]. why is something not showing up? can see if parsing error, layout error (bounding boxes colored), behaviors intercepting
validate()
dump()
System layer doesn't save. For example, SCRATCH and SHARED layers.
Architectural Overview
to show how developers: [add examples]
simple to understand, powerful, extensible. not biased but surprisingly powerful
built-in formatting:
good because already there, but doesn't support everything is an extension claim (though could define own nodes)
floats (left, right)
alignment (left, center, right)
margin, border, padding
behaviors
apply to all doc types - so change keybindings from Emacs to Macintosh. other editors have mappings, but never exactly the same, have a little programming.
levels of XXX: brand new behavior (takes most technical sophistication, but of course most flexible; most behaviors pretty small, 200-300 lines), subclass (sometimes just need general programming skills, not necc Java depending, and can follow pattern), or attributes (e.g., to add new info resource in doc popup)
protocols -- integrate with above
Paint
separate out methods:
X intersects(Point), intersects(Rectangle) => paint relevant? format (valid bit universal?)? event?
coord transform enter (exit=-enter, always) (used by event, paint, repaint, getRelScreen)
formatBeforeAfter
intercept on document root to do constraint-based layout
paintCheckTransform() ("paint"?)
O applicable? node itself (not parent) decides based on coordinates. if not, return => override bbox/w==h==0 check by IRootScreen... and any others?
O INode queries style sheet for active ContextListeners
O paintBeforeAfter (no super because may have different check)
when skip subtree, can update actives by looking at summaries!!!!!!! which tells of all spans that start or stop (but not both) within subtree
paintBeforeAfter - same coordinate space and graphics context as content => override by INode to add/remove style sheet settings + applicable=>wash out, Leaf for applicable=>wash out, Document to replace style sheet itself, several do nothing=>check/ok, IRootScreen undoes scrollbars=>transform
content space vs wrap space ... or don't mix: always homogeneous
paint, event, getLoc
maybe heterogeneous ones report content dx,dy and just undo for special parts?
problem cases:
IScrollPane - scrolledpane + vsb + hsb. dx/dy not special
- scrolledpane. dx/dy set by/taken from scrollbars
transform coordinate space - origin and clip by bbox, border, margin
|observers before, stop on shortcircuit
| if no shortcircuit, paintNode
| else cx.valid=false
|observers after, starting at shortcircuit, stop on new shortcircuit
transform revert
paintNode
INode - background, recurse over children, draw border => override to decorate after super.paintNode(). can override to start/stop showing children when y