Read, display, modify, and write Adobe
Portable Document Format (PDF) files
Demos
Screen dump
Sample documents:
digital document lectures (converted from PowerPoint),
an early paper scanned in
and converted with Adobe Capture (try the Lens / Show OCR),
JavaOne slides,
pdfTeX manual
(generated by
pdfTeX,
without Adobe tools),
Aida
(generated by
txt2pdf,
without Adobe tools),
link to named destination (similar to an HTML anchor).
Description
Adobe's PDF has emerged as an extremely popular document format,
a de facto standard alongside HTML.
The Multivalent Browser can display PDFs like Acrobat and xpdf,
with a couple of limitations
and some interesting differences.
PDFs can be annotated without Acrobat.
These annotations are the same open-ended set of
Multivalent annotations
that are available on all document types
(and thus they aren't written into the PDF).
Some Multivalent annotations incrementally reformat the PDF to
open up whitespace in the middle of text.
PDFs that were generated from scanned paper and have had OCR run on them,
such as the early paper mentioned above,
can be annotated and have their text copied out,
as in Acrobat.
You can also use the Show OCR lens (under the Lens menu) to see
the OCR translation in situ.
PDF named destinations, which are like HTML anchors, are supported.
Furthermore, each page has an implicit named destination
called page=n.
For example, click to directly view page 5.
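As a rough illustration of how an implicit page=n destination could be told apart from an ordinary named destination, here is a minimal sketch. It is not the Multivalent implementation: the function name `implicit_page_index`, the `#page=5` URL fragment form, and the assumption that n counts pages starting from 1 are all hypothetical choices made for the example.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Assumed form of the implicit destination: the whole fragment is "page=<digits>".
_PAGE_DEST = re.compile(r"page=(\d+)")


def implicit_page_index(url: str, page_count: int) -> Optional[int]:
    """Return a zero-based page index for a page=n fragment, else None.

    None means the fragment is either an ordinary named destination
    (to be looked up in the document itself) or an out-of-range page.
    """
    fragment = urlsplit(url).fragment
    match = _PAGE_DEST.fullmatch(fragment)
    if match is None:
        return None
    page = int(match.group(1))  # assumed to be 1-based
    if not 1 <= page <= page_count:
        return None
    return page - 1
```

A caller would try this first and fall back to a lookup of the fragment among the document's explicit named destinations when it returns None.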
A full-screen slide show is available (under the Go menu, when viewing a PDF).
(This slide show runs on all paginated document types, including TeX
DVI and XDOC.)
Status
Most features described in the 1021-page PDF Reference 1.5 manual
(which corresponds to Acrobat 6)
are supported,
except for a few high-end, usually printing industry-oriented features.
In general, a PDF document picked at random should look great.
- Parsing:
overall document structure (header, xref / xref stream, trailer,
error correction for corrupted xref)
and objects (primitive types, indirect, array, dictionary,
stream with multiple substreams and cascaded filters,
object streams).
Read I/O is fast thanks to a buffered
RandomAccessFile.
- Filters: ASCIIHexDecode, ASCII85Decode, LZWDecode, FlateDecode,
RunLengthDecode, and predictor decode for LZW and Flate
- Graphics: graphics state, splines, clipping, text,
inline images, XObjects (Image and Form), marked content,
linear shading gradient with exponential function
(not supported:
horizontal text expansion -- Tz operator,
other shading gradients)
- Color spaces: CalRGB, CalGray, L*a*b*, ICCBased (including data profile),
DeviceRGB, DeviceCMYK, DeviceGray,
Separation, DeviceN, Indexed
(not supported: Pattern)
- Fonts: TrueType, OpenType, Type 1, CFF, Type 3, Type 0 / CID.
Fonts may be embedded or taken from the operating system.
Embedded font metrics for core 14 fonts.
Fonts not available in the local system and not embedded are
substituted according to embedded font flags.
- Encodings: standard, Macintosh, Macintosh Expert, Windows, CID (CMaps, CIDToGID, and ToUnicode).
For Type 1 fonts, internal ToUnicode maps are automatically generated if needed.
- Images: CCITT Fax (both Group 3 and Group 4), DCT (aka JPEG; in color spaces YCbCr, RGB, grayscale, CMYK, YCCK, L*a*b*), JPEG2000,
raw samples (all bit depths and color spaces).
(not supported: JBIG2, image decode arrays)
- Transfer functions:
exponential, sampled, stitching, PostScript calculator
- Encryption: standard algorithm (RC4; at both 40- and 128-bit strengths), PDF 1.5 crypt filters.
Adherence to document permissions (cut and paste allowed? printing allowed?), as required by Adobe.
(still needed: UI for typing in a password!)
- Annotations: hyperlink
(not supported: other annotation types)
- Actions: GoTo, GoToR, URI, Named
(not supported: other action types)
- (not supported: interactive fill-in forms, transparency, optional content groups aka layers)
- Additions: special support for OCRed scanned page images,
such as Show OCR lens
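To make the "cascaded filters" item in the list above concrete, the following sketch decodes a stream by applying each filter named in its filter list, in order. It is an illustration written against the filter definitions in the PDF Reference, not the Multivalent code: the names `decode_stream` and `DECODERS` are hypothetical, and LZWDecode and the predictor step are left out to keep it short.

```python
import base64
import zlib


def ascii_hex_decode(data: bytes) -> bytes:
    body = data.split(b">", 1)[0]        # '>' marks end of data
    digits = b"".join(body.split())      # whitespace between digits is ignored
    if len(digits) % 2:
        digits += b"0"                   # a lone final digit is padded with 0
    return bytes.fromhex(digits.decode("ascii"))


def ascii85_decode(data: bytes) -> bytes:
    body = data.split(b"~>", 1)[0]       # '~>' marks end of data
    return base64.a85decode(body)


def run_length_decode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        length = data[i]
        i += 1
        if length == 128:                # end-of-data marker
            break
        if length < 128:                 # copy the next length+1 bytes literally
            out += data[i:i + length + 1]
            i += length + 1
        else:                            # repeat the next byte 257-length times
            out += data[i:i + 1] * (257 - length)
            i += 1
    return bytes(out)


# Hypothetical lookup table from PDF filter names to decoder functions.
DECODERS = {
    "ASCIIHexDecode": ascii_hex_decode,
    "ASCII85Decode": ascii85_decode,
    "RunLengthDecode": run_length_decode,
    "FlateDecode": zlib.decompress,
}


def decode_stream(data: bytes, filters: list) -> bytes:
    """Apply the filters in the order they are listed for the stream."""
    for name in filters:
        data = DECODERS[name](data)
    return data
```

For example, a stream whose filter list is ASCIIHexDecode followed by FlateDecode is first turned from hex text back into bytes and then inflated; an unknown filter name raises a KeyError in this sketch, where a real reader would report an unsupported filter.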
Note that no use is made of the following features:
linearization, thumbnails, article threads, slideshow settings,
outline, structure / tagging.
See Also
Last update: $Date: 2003/12/18 11:20:11 $