Full-text searching
Construct a full-text index of all supported document types, including
PDF, HTML and UNIX manual pages,
and execute searches on it.
These search and index tools interface Multivalent's support for many document formats with
Doug Cutting's excellent
Lucene
search engine.
Install Lucene's JAR, version 1.3 or later (named something like lucene-1.4-final.jar),
and either add the complete path, including the Lucene JAR name, to your CLASSPATH environment variable
or specify it with the -classpath
command-line option when invoking a tool.
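As a minimal sketch of the CLASSPATH setup for a Bourne-style shell: the install location below is hypothetical, and the JAR name is only the example given above, so substitute the real path on your system.

```shell
# Hypothetical install location; substitute the real path to your Lucene JAR.
LUCENE_JAR="/usr/local/lib/lucene-1.4-final.jar"

# Append the JAR to any existing CLASSPATH (entries are colon-separated on UNIX).
CLASSPATH="${CLASSPATH:+$CLASSPATH:}$LUCENE_JAR"
export CLASSPATH

echo "$CLASSPATH"
```

Note that java's -classpath option replaces the CLASSPATH environment variable rather than adding to it, so a value passed that way would also need to list the Multivalent classes.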
Index Options
java tool.lucene.Index [options] files-and-directories-to-index
- -new --
create a new index.
Otherwise the index is incrementally updated.
- -refresh --
for every document in the index,
see if it has been modified or deleted,
and update the index accordingly.
No new files are added
(any files-and-directories-to-index are ignored).
- -index directory --
directory in which to place the index.
Default: directory named
.lucene
in the user's home directory.
- -exclude regular-expression --
do not index any file whose path matches regular-expression.
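The options above might be combined as in the following sketch. The directory names are hypothetical, the regular expression is assumed to use a common syntax in which `.*\.bak$` matches any path ending in .bak, and combining options in this order is an assumption rather than documented behavior. The commands are printed by a dry-run helper rather than executed.

```shell
# Dry-run helper: prints each command instead of executing it.
# Replace the body with "$@" to run the commands for real.
run() { echo "+ $*"; }

# Hypothetical locations, for illustration only.
DOCS="/home/me/docs"
INDEX_DIR="/tmp/docs-index"

# Build a fresh index of everything under DOCS.
run java tool.lucene.Index -new -index "$INDEX_DIR" "$DOCS"

# Incrementally add documents later, skipping backup files.
run java tool.lucene.Index -index "$INDEX_DIR" -exclude '.*\.bak$' "$DOCS"

# Re-check documents already in the index for changes or deletions.
run java tool.lucene.Index -refresh -index "$INDEX_DIR"
```

Omitting -index in each command would use the default .lucene directory in the user's home directory instead.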
Search Options
java tool.lucene.Search [options] expression-to-search-for
- -index directory --
directory in which to find the index created by tool.lucene.Index.
Default: directory named
.lucene
in the user's home directory.
- -maxhits number --
limit the number of hits displayed to number.
The special value all
displays all hits.
Default: 10.
- expression-to-search-for --
Single words/terms are given as is, phrases are surrounded by double quotes,
and boolean expressions are composed with AND and OR.
See Lucene query syntax
for more sophisticated searches.
Fielded searches can refer to the fields
uri
to limit searches based on filename,
body
for body text,
and mod
to limit searches based on file last modification date.
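A few searches might look like the sketch below. The search terms are invented for illustration, the field:term form is the standard Lucene query syntax, and passing the whole expression as one single-quoted argument is an assumption made here to keep the shell from stripping the double quotes around phrases. The commands are printed by a dry-run helper rather than executed.

```shell
# Dry-run helper: prints each command instead of executing it.
# Replace the body with "$@" to run the commands for real.
run() { echo "+ $*"; }

# Single term, default index location and default of 10 hits.
run java tool.lucene.Search lucene

# Phrase combined with a term, showing up to 25 hits.
run java tool.lucene.Search -maxhits 25 '"full text" AND index'

# Fielded search: filename containing "manual", body text containing "lucene".
run java tool.lucene.Search 'uri:manual AND body:lucene'
```

The mod field is left out of the sketch because the date format it expects is not stated here.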
At this time full-text searching is not available graphically through
the browser.