Validate
Examines PDFs at a selectable level of detail and reports errors.
Use it to quickly find invalid or damaged PDFs in an archive,
or to validate a batch of freshly downloaded PDFs.
Options
java tool.pdf.Validate [options] PDF-file(s)
- validation level (choose one)
- -fast --
Validates a PDF by checking only whether its structure is valid.
The content itself, such as images and content streams, is not read.
Since structure and content are intermixed in the file, this is usually sufficient
to confirm a successful network transmission.
This level is useful for quickly checking thousands of PDFs.
- -full --
Reads the contents of every object in a PDF.
(Default.)
- -obj --
Tests semantic integrity of objects.
- Links (annotations and actions) -
Checks that link destinations (both internal and external) exist,
and that the source boxes of links do not overlap one another.
- actions: GoTo, GoToR, Launch, URI
- -verbose --
Report names of valid files as well as invalid ones.
- -password password --
Password to use if the PDF is encrypted.
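The overlap test that -obj applies to link source boxes can be illustrated with a small sketch. This is not the tool's own code: the class LinkOverlap and its methods are hypothetical names, and each box is assumed to be a normalized rectangle given as lower-left x, lower-left y, upper-right x, upper-right y. Boxes that merely share an edge are not counted as overlapping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LinkOverlap {
    /** True if two boxes {llx, lly, urx, ury} share interior area. */
    static boolean overlaps(double[] a, double[] b) {
        return a[0] < b[2] && b[0] < a[2]   // x ranges intersect
            && a[1] < b[3] && b[1] < a[3];  // y ranges intersect
    }

    /** Returns index pairs {i, j}, i < j, of boxes that overlap. */
    static List<int[]> overlappingPairs(List<double[]> rects) {
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < rects.size(); i++) {
            for (int j = i + 1; j < rects.size(); j++) {
                if (overlaps(rects.get(i), rects.get(j))) {
                    pairs.add(new int[] {i, j});
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Made-up source boxes for three links on one page.
        List<double[]> rects = Arrays.asList(
            new double[] {72, 700, 200, 720},
            new double[] {150, 710, 300, 730},  // overlaps the first box
            new double[] {72, 600, 200, 620});
        for (int[] p : overlappingPairs(rects)) {
            System.out.println("links " + p[0] + " and " + p[1] + " overlap");
        }
        // prints: links 0 and 1 overlap
    }
}
```

The pairwise comparison is quadratic in the number of links on a page, which is usually acceptable because pages rarely carry more than a few dozen links.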
Examples
1. fast check finds files mislabeled with a .pdf suffix
java tool.pdf.Validate -fast .
produces
/Users/phelps/data/pdfdb/000137.pdf: java.io.IOException: No document catalog.
invalid password /Users/phelps/data/pdflr/secure2.pdf
File: /Users/phelps/data/pdfdb/000055.pdf
ERROR: invalid but repairable (with tool.pdf.Repair)
File: /Users/phelps/data/pdfdb/000109.pdf
ERROR: can't find '%%EOF' @ byte 3187
File: /Users/phelps/data/pdfdb/000217.pdf
ERROR: can't find '%%EOF' @ byte 20705
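As a rough illustration of why a structure-only check is cheap, the following sketch reads just the start and end of a file and looks for the '%PDF-' header and the '%%EOF' marker. It is a simplification written for this document, not the logic of tool.pdf.Validate: the class QuickPdfCheck, the 1024-byte window, and the messages it returns are all assumptions, and a real structural check would also follow the cross-reference table.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

public class QuickPdfCheck {
    // Assumption: look for the markers only in the first/last 1024 bytes.
    private static final int WINDOW = 1024;

    /** Returns null if both markers are present, else a short description. */
    static String quickCheck(String path) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(path, "r")) {
            long len = f.length();
            int n = (int) Math.min(len, WINDOW);
            byte[] buf = new byte[n];

            // The header is expected at the very start of the file.
            f.readFully(buf);
            String head = new String(buf, StandardCharsets.ISO_8859_1);
            if (!head.startsWith("%PDF-")) {
                return "missing '%PDF-' header";
            }

            // The end-of-file marker is expected near the end of the file.
            f.seek(len - n);
            f.readFully(buf);
            String tail = new String(buf, StandardCharsets.ISO_8859_1);
            if (!tail.contains("%%EOF")) {
                return "missing '%%EOF' marker";
            }
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        for (String path : args) {
            String problem = quickCheck(path);
            System.out.println(path + ": " + (problem == null ? "markers present" : problem));
        }
    }
}
```

Because only two small reads are made per file, the cost is nearly independent of file size, which is what makes this kind of check practical for large archives.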
2. full read of objects, which is the default level of validation
java tool.pdf.Validate jdj
produces
File: jdj/4-06.pdf
ERROR. object #154: java.io.IOException: incorrect data check @ 10730
#154: {Length=4114, Filter=FlateDecode, DATA=573053}
File: jdj/5-06.pdf
ERROR. object #25: java.io.IOException: invalid bit length repeat @ 0
#25: {Length=13735, Filter=FlateDecode, DATA=1346607}
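Both errors above concern FlateDecode streams, which use zlib compression. The sketch below shows how damage in such a stream can surface when the data is actually decompressed, which is the kind of problem only a full read can find. It uses the standard java.util.zip classes; the class FlateCheck and its helper methods are hypothetical and are not part of tool.pdf.Validate.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class FlateCheck {
    /** Returns null if the zlib data inflates cleanly, else an error message. */
    static String checkFlate(byte[] data) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // No progress and nothing left to read: the stream was cut short.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    return "truncated or incomplete stream";
                }
            }
            return null;
        } catch (DataFormatException e) {
            return e.getMessage();
        } finally {
            inflater.end();
        }
    }

    /** Compresses a short byte array so the example has data to test. */
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        byte[] buf = new byte[raw.length + 64]; // ample for short inputs
        int n = deflater.deflate(buf);
        deflater.end();
        return Arrays.copyOf(buf, n);
    }

    public static void main(String[] args) {
        byte[] good = deflate("BT /F1 12 Tf (Hello) Tj ET".getBytes(StandardCharsets.US_ASCII));
        byte[] bad = good.clone();
        bad[bad.length - 1] ^= 0x5A; // damage the trailing checksum
        System.out.println("good: " + checkFlate(good));
        System.out.println("bad:  " + checkFlate(bad));
    }
}
```

A stream can have a correct Length entry and sit inside a structurally valid file yet still fail here, which is why the default level reads the contents of every object.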