Victorian Electronic Records Strategy - Forever Digital logo
 



7.2 PDF (Portable Document Format)

The Portable Document Format (PDF) is a proprietary format defined by Adobe Systems Incorporated. The specification for PDF has been formally published as a book [PDF], which makes it suitable as a long-term preservation format.

The PDF format is designed to describe the pages of a document precisely; one purpose of PDF is to ensure that documents are printed identically irrespective of the printer used. This is a key benefit of using PDF to preserve record content: in PDF each page is rendered exactly as the creator intended. Unlike other representations (e.g. Word or XML), a new version of the PDF reader will not result in text or other objects moving around a page or between pages, or changing font size or colour.

PDF-A is an international standard that prohibits the use of certain features of PDF. The prohibited features are those which may make it difficult to render (display) PDF files in the future. Such features include:

  • Not including the font definitions in the file. All PDF-A files must include the font definitions of all characters included in the file. Omitting the font definitions makes the PDF file smaller, but means that the file cannot be accurately displayed on a system that does not have those fonts.
  • Externally referenced content. All of the content of a PDF-A file must be included in the file. Content that exists outside a PDF file is most likely to be lost.
  • Undefined formats. PDF files may include arbitrary JavaScript programs, sound files, or video files. The standard leaves these formats undefined.

PDF is very widely used as a pre-press tool (i.e. preparing publications for printing) and in making available electronic copies of printed publications. There are at least two free viewers: Adobe Reader (available from Adobe Systems at http://www.adobe.com/products/acrobat/readstep2.html visited 10 May 2006), and Ghostscript (available from http://www.cs.wisc.edu/~ghost/index.html visited 15 June 2003).

VERS specifies PDF 1.4. This is generated by Acrobat 5.x. (Acrobat 4.x generated PDF 1.3, and Acrobat 3.x generated PDF 1.2).
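
As an illustration only, the version a PDF file declares can be read from its header line, which has the form `%PDF-1.4`. The sketch below is not part of VERS; the function name `pdf_header_version` is hypothetical, and searching only the first 1024 bytes of the file is an assumption about how tolerant a checker should be. A file may also declare a later version elsewhere in the document, which this sketch does not examine.

```python
import re

# Header pattern: "%PDF-" followed by major.minor, e.g. "%PDF-1.4".
_HEADER = re.compile(rb"%PDF-(\d)\.(\d)")


def pdf_header_version(data: bytes):
    """Return (major, minor) from the PDF header, or None if no header is found.

    Only the first 1024 bytes are searched (an assumption of this sketch).
    """
    match = _HEADER.search(data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    sample = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
    print(pdf_header_version(sample) == (1, 4))
```

A caller would read the file in binary mode and compare the result against `(1, 4)`.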

There are a number of minor features of PDF that may make it difficult to preserve the ability to render portions of a PDF document. These have been prohibited in PDF-A and in VERS. It should be noted that most of these problematic features are very obscure and would be unlikely to appear in documents produced by Adobe Distiller.

These include the following:

  • PDF files must be self-contained. The PDF specification has a number of options that allow components of a PDF document to be external to the PDF file. The following rules apply to these options:

    • All external files must be embedded in the final PDF file. A PDF document is composed of multiple object streams, and these object streams may be contained in ‘external files’. However, the specification allows these external files to be embedded in the final PDF file (this is the default behaviour when generating PDF with Adobe Distiller).
    • All fonts must be embedded within the PDF file. This is controlled by an option in Adobe Distiller.

      Note that even after embedding the fonts within a PDF file there are some issues with TrueType fonts (see section 5.5.5, p.333, [PDF]). We assume that the standard PDF generation tools (such as Adobe Distiller) implement the detailed recommendations in this section. Note that these issues apply to any archival format that uses TrueType fonts.

    • Reference XObjects may be used, but the external file must be embedded in the containing document.
    • The Open Prepress Interface (OPI) must not be used. OPI is a mechanism where large high-resolution images produced during the pre-press production process are stored separately from the PDF file.
  • The PDF encryption option is prohibited for the reasons given in section 5.
  • Rendering options dependent on specific output devices are prohibited. These include:
    • CIE-based colour to device colour.
    • Conversions amongst device colour spaces.
    • Transfer functions.
    • Halftones.
    • Scan conversion.
    • Overprinting.

    These options are designed to allow the fine control of printing on specific printers (or classes of printers).

  • JavaScript actions associated with the document are prohibited.
  • The following types of annotations are prohibited:
    • Plugin extensions
    • File Attachment annotations, as it may not be possible to identify the application necessary to render the attachment.
    • Sound annotations, as it may not be possible to identify the application necessary to play the sound.
    • Movie annotations, as it may not be possible to identify the application necessary to play the movie.
    • Any JavaScript actions, and the Launch, SubmitForm, or ImportData actions associated with Form annotations. Apart from requiring the inclusion of a JavaScript interpreter (and hence specification), the result of executing JavaScript can be almost anything (including references to external objects). Hence, it is extremely difficult to preserve the ability to evaluate JavaScript over the long term.
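
Several of the prohibited features listed above are marked in a PDF file by well-known name tokens such as `/Encrypt`, `/JavaScript`, or `/Launch`. The following is a minimal sketch, not a VERS or PDF-A conformance check, of how a file might be screened for those tokens. The table `SUSPECT_NAMES` and the function `find_suspect_features` are hypothetical names, and the choice of tokens is an assumption of this sketch. A raw byte scan is only a heuristic: it cannot see names that are written with escape sequences or that sit inside compressed data, and it does not check font embedding or the device-dependent rendering options.

```python
import re

# Hypothetical screening table: PDF name token -> feature it usually signals.
SUSPECT_NAMES = {
    b"/Encrypt": "encryption",
    b"/JavaScript": "JavaScript action",
    b"/JS": "JavaScript action",
    b"/Launch": "Launch action",
    b"/SubmitForm": "SubmitForm action",
    b"/ImportData": "ImportData action",
    b"/FileAttachment": "File Attachment annotation",
    b"/Sound": "Sound annotation",
    b"/Movie": "Movie annotation",
    b"/OPI": "Open Prepress Interface",
}


def find_suspect_features(data: bytes) -> list:
    """Return labels of suspect features whose name tokens occur in the raw bytes."""
    found = []
    for name, label in SUSPECT_NAMES.items():
        # The token must not be followed by a letter or digit, so that
        # "/JS" does not match a longer name such as "/JSX".
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        if re.search(pattern, data) and label not in found:
            found.append(label)
    return found


if __name__ == "__main__":
    sample = b"<< /Type /Action /S /JavaScript /JS (app.alert(1)) >>"
    print(find_suspect_features(sample))
```

An empty result does not show that a file is acceptable; a non-empty result only identifies a file that deserves closer inspection with a proper PDF parser.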
