Site Search Tools

Searching PDF Files


About PDF and the Web

PDF is the Portable Document Format, created by Adobe and produced and read with its Acrobat software. It is designed for brochures, magazines, forms, reports and other materials with complex visual designs that will be printed on PostScript (tm) printers. The format was created to make documents independent of machine and platform, and its goals include design fidelity and typographic control. It was never designed for interactive online reading. However, many word processors, page layout and other programs can create PDF files easily, so some documents not available in HTML can be served in PDF.

Adobe has a PDF Plug-In for browsers and some development tools that allow servers to send PDF in chunks ("byte-range serving"), so the browser does not have to download the entire file before showing the first page. This improves the experience of receiving PDF files, but they still lack the speed, simplicity and user control of HTML. PDF files have a specified page size, for example, and do not reflow in smaller windows, so people with small screens spend a lot of time scrolling around the window. In addition, copying text from a PDF file is very difficult, because sidebar text is included in the selection and selections cannot cross page breaks.
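To make byte-range serving more concrete, here is a minimal Python sketch of what a server might do with an HTTP Range header. The function names parse_byte_range and serve_range are hypothetical, the sketch handles only a single range, and it answers malformed and unsatisfiable ranges the same way, which a real server would not necessarily do. It is an illustration of the idea, not how Adobe's tools are implemented.

```python
from typing import Dict, Optional, Tuple


def parse_byte_range(header: str, file_size: int) -> Optional[Tuple[int, int]]:
    """Return (start, end), inclusive, for one 'bytes=...' range, or None."""
    unit, _, spec = header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        return None  # this sketch supports a single byte range only
    first, sep, last = spec.strip().partition("-")
    if not sep:
        return None
    try:
        if first == "":
            # Suffix form "bytes=-N": the last N bytes of the file.
            length = int(last)
            if length <= 0:
                return None
            start = max(file_size - length, 0)
            end = file_size - 1
        else:
            # "bytes=A-B", or "bytes=A-" meaning from A to the end.
            start = int(first)
            end = int(last) if last else file_size - 1
    except ValueError:
        return None
    end = min(end, file_size - 1)  # clamp to the actual file length
    if start < 0 or start > end:
        return None
    return start, end


def serve_range(data: bytes, range_header: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a whole-file or partial response."""
    total = len(data)
    if range_header is None:
        return 200, {"Content-Length": str(total)}, data
    byte_range = parse_byte_range(range_header, total)
    if byte_range is None:
        # Simplification: bad and unsatisfiable ranges are both rejected.
        return 416, {"Content-Range": f"bytes */{total}"}, b""
    start, end = byte_range
    body = data[start:end + 1]
    headers = {
        "Content-Range": f"bytes {start}-{end}/{total}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body
```

A viewer that wants only part of a large file would send something like "Range: bytes=0-499" and get back status 206 with just those bytes, then ask for further ranges as the reader moves through the document.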

Adobe used to have a web-based program to convert PDF to HTML, but the page was down the last time I tried it. It was formerly at: http://www.adobe.com/prodindex/acrobat/advform.html.

If at all possible, you should serve both HTML and PDF versions of files, designing the HTML for onscreen use and PDF for printing only. That provides your users with the best format for their task, rather than making too many compromises on one side or the other.

PDF Search Issues

Some search tools can read and index PDF files. They show the file titles and any descriptions in the results list, just like HTML. However, each PDF file is a single entity, often very large, so when the user clicks on a link, they suddenly discover that they are downloading a file and may be asked to install a browser plug-in.

If you must index PDF files, there are several ways to improve the user experience. Try to break the PDF files into single-subject files, such as book sections, chapters or even chapter sections. That way, no one will accidentally download an entire manual just because a word has matched. Also, be sure to display the file size and the fact that it's PDF in the search results, so users are not surprised by what happens when they click on the link.
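As a rough illustration of that last suggestion, the Python sketch below formats one line of a results list with a format label and a readable file size. The helper names human_size and format_result, the sample title and URL, and the layout of the line are all assumptions made for this example. A real search tool would use its own templates and would probably detect the format from the server's content type rather than the file extension.

```python
from urllib.parse import urlsplit


def human_size(num_bytes: int) -> str:
    """Turn a byte count into a short label such as '1.5 MB'."""
    if num_bytes < 1024:
        return f"{num_bytes} bytes"
    size = num_bytes / 1024
    for unit in ("KB", "MB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} GB"


def format_result(title: str, url: str, num_bytes: int) -> str:
    """Build one results-list line that shows the format and the size."""
    # Assumption: the extension in the URL path is a good enough hint.
    path = urlsplit(url).path.lower()
    label = "PDF" if path.endswith(".pdf") else "HTML"
    return f"{title} [{label}, {human_size(num_bytes)}] {url}"


print(format_result("User Manual, Chapter 3", "/docs/manual-ch3.pdf", 1536000))
```

With the format and size next to the title, a user can tell before clicking that the link leads to a 1.5 MB PDF download rather than an ordinary web page.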

Compatible Site Search Tools

As of December 1998, the following site search tools can index and search PDF files:

Adobe's Acrobat Search Listings


Site Search Tools
Copyright © 1998-1999
iFetchIT.com