Multivalent Tools

Full-text searching

Construct a full-text index of all supported document types, including PDF, HTML, and UNIX manual pages, and execute searches on it. These index and search tools connect Multivalent's support for many document formats with Doug Cutting's excellent Lucene search engine. Install Lucene's JAR, version 1.3 or later (named something like lucene-1.4-final.jar), and either add its complete path, including the JAR file name, to your CLASSPATH environment variable or specify it with the -classpath command-line option when invoking a tool.
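As a minimal sketch of the CLASSPATH setup described above, for a Bourne-style shell: the install location under $HOME/lib is a hypothetical path chosen for illustration, and the JAR file name should be replaced with whatever your Lucene download is actually called.

```shell
# Hypothetical install location; substitute the real path to your Lucene JAR.
LUCENE_JAR="$HOME/lib/lucene-1.4-final.jar"

# Append the JAR to CLASSPATH, keeping any entries that are already set.
# (":" is the path separator on UNIX; Windows uses ";".)
CLASSPATH="${CLASSPATH:+$CLASSPATH:}$LUCENE_JAR"
export CLASSPATH

echo "CLASSPATH=$CLASSPATH"
```

If you pass -classpath to java instead, keep in mind that it replaces the CLASSPATH environment variable for that invocation, so the value you give should list everything the tool needs, not just the Lucene JAR.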

Index Options

java tool.lucene.Index [options] files-and-directories-to-index

-new -- create a new index. Otherwise the index is incrementally updated.
-refresh -- for every document in the index, check whether it has been modified or deleted, and update the index accordingly. No new files are added (any files-and-directories-to-index are ignored).
-index directory -- directory in which to place the index. Default: directory named .lucene in the user's home directory.
-exclude regular-expression -- do not index any file whose path matches regular-expression
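The options above might be combined as in the following sketch. The directory names are hypothetical, and the run helper is an illustrative dry-run wrapper that only prints each command; it is not part of Multivalent. Whether -exclude must match the whole path or only part of it is not stated above, so the pattern is written to behave the same way under either reading.

```shell
# Dry-run helper (illustrative only): print the command instead of executing it.
# To run the commands for real, change the body to: "$@"
run() { printf '%s\n' "$*"; }

# Hypothetical locations, chosen only for illustration.
INDEX_DIR="$HOME/.lucene"   # same as the documented default
DOCS_DIR="$HOME/docs"

# Build a fresh index of everything under DOCS_DIR.
run java tool.lucene.Index -new -index "$INDEX_DIR" "$DOCS_DIR"

# Later, update the index incrementally, skipping paths that end in .bak.
# '.*\.bak$' selects the same paths whether the tool matches the whole
# path or searches within it (an assumption either way).
run java tool.lucene.Index -index "$INDEX_DIR" -exclude '.*\.bak$' "$DOCS_DIR/manuals"

# Re-check documents already in the index; no new files are added.
run java tool.lucene.Index -refresh -index "$INDEX_DIR"
```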

Search Options

java tool.lucene.Search [options] expression-to-search-for

-index directory -- directory in which to find the index created by tool.lucene.Index. Default: directory named .lucene in the user's home directory.
-maxhits number -- display at most number hits. The special case -maxhits all displays all hits. Default: 10.
expression-to-search-for -- Single words/terms are given as is, phrases are surrounded by double quotes, and boolean expressions are composed with AND and OR. See the Lucene query syntax for more sophisticated searches. Fielded searches can refer to the fields uri to limit searches based on filename, body for body text, and mod to limit searches based on file last modification date.
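A few search invocations, sketched under some assumptions: the search terms are made up, the run helper is an illustrative dry-run wrapper that only prints each command, and each expression is passed as a single shell-quoted argument so that double quotes and spaces reach the tool intact. The field:term form is standard Lucene query syntax.

```shell
# Dry-run helper (illustrative only): print the command instead of executing it.
# To run the commands for real, change the body to: "$@"
run() { printf '%s\n' "$*"; }

# Single term.
run java tool.lucene.Search annotation

# Phrase: the outer single quotes keep the shell from stripping the
# double quotes that mark the phrase.
run java tool.lucene.Search '"digital library"'

# Boolean expression, showing up to 25 hits instead of the default 10.
run java tool.lucene.Search -maxhits 25 'pdf AND annotation'

# Fielded search restricted to body text.
run java tool.lucene.Search 'body:annotation'
```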

At this time full-text searching is not available graphically through the browser.