6.2 Re-implementing rendering software

There are often several possible file formats that will preserve the desired characteristics. To select among these alternative formats, PROV prefers the format that is easiest to re-implement. The worst-case preservation scenario is one where it becomes necessary to re-implement, from scratch, the software that renders a record. In order to re-implement rendering software, the future developer will require:
6.2.1 Simple formats

The ideal format is one that is simple enough to be described in a short piece of text that can be included in the M128 File Encoding element within the VEO. Very few formats are this simple.

6.2.2 Published formats

For records that are too complex to describe in a short piece of text, the preferred format is one that has been formally specified and published. The VEO must include a reference to the published specification in either the M128 File Encoding or M131 Rendering Text elements. An archive can build up a library of the specifications that it uses, or can rely on accessing the specifications through legal deposit libraries.

Almost all records are sufficiently complex to require an external published format. Consider a record that contains just text in several languages. In order to render the text it is necessary to convert the bit stream in the file into a sequence of character numbers. It is then necessary to map each character number into a glyph (the character image displayed on the screen). The Unicode standard [Unicode] describes thousands of character glyphs and has several mechanisms for converting character numbers into bit streams. This complex specification clearly could not be summarised in the M128 File Encoding element. Instead, this element will contain a reference to the Unicode standard.

Most electronic records are a great deal more complex than a simple text file. For example, consider a document such as this Advice. Although most of the content is text, the characters have formatting applied (e.g. colour, font, and weight). The characters are combined to form higher-level formatting units (e.g. paragraphs), which have their own formatting applied. Some paragraphs have particular characteristics (e.g. they are numbered, form indexes, or are headers). Finally, the document contains objects that are not textual, such as images. These images may have their own specifications (e.g. JPEG, TIFF, GIF).
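
The two steps needed to render plain text, described above, can be illustrated with a minimal Python sketch. It is only an illustration and not part of the VERS standard: the function names `decode_to_code_points` and `describe_characters` are hypothetical, the sample string is invented, and the second step substitutes a lookup of the Unicode character name for real glyph selection, which would require a font.

```python
import unicodedata


def decode_to_code_points(data, encoding="utf-8"):
    """Step 1: convert the stored bit stream into a sequence of character numbers.

    The encoding must be known in advance; this is the kind of information
    a reference to the Unicode standard supplies to a future developer.
    """
    return [ord(ch) for ch in data.decode(encoding)]


def describe_characters(code_points):
    """Step 2 (stand-in): map each character number to its Unicode name.

    A real renderer would select a glyph from a font at this point;
    the name lookup is used here only to keep the sketch self-contained.
    """
    return [unicodedata.name(chr(cp), "<unnamed>") for cp in code_points]


# Invented sample text in several scripts: Latin "A", accented "e", and a CJK ideograph.
text = "A\u00e9\u65e5"

# The same characters stored using two different Unicode encoding mechanisms.
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-be")

code_points = decode_to_code_points(utf8_bytes, "utf-8")
names = describe_characters(code_points)

for cp, name in zip(code_points, names):
    print(f"U+{cp:04X} {name}")
```

The two byte sequences differ, yet both decode to the same character numbers when the correct encoding is supplied. Even this simplest case therefore depends on an external published specification.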
The result is that any specification rich enough to cater for all the features of current electronic documents is very complex, and will often contain references to other specifications.

Where there is a choice of several suitable formats, the following criteria are used to select between them: