About websites

A website is...

“A collection of electronic resources: 

  • that is made available in a particular domain of the internet, for the communication of information and/or the conduct of business transactions; and
  • that share a common domain name, normally belonging to a single or defined group of organisations and having as their virtual location (or Uniform Resource Identifier) a hierarchical (or other) relationship with the main domain content (often referred to as the ‘home page’); and 
  • providing a body of interlinked information resources that is navigable using browser technology.” (National Archives UK)

Victorian Government agencies almost universally have a web presence, and the rate of delivery of government services online is growing every year.

For Victorian Government agencies, websites provide a platform to: 

  • publish information about public administration bodies or services 
  • gather information that public bodies then act on to provide services 
  • carry out transactions 
  • offer dynamic and, in some cases, personalised service delivery to clients.

Managing websites as records 

Where government websites perform functions, provide services, or contain information that is not replicated offline, they are the primary conduits of business activity.

Agencies are required to create and appropriately maintain records of website activity in accordance with PROV Standards and all other relevant legislative and policy requirements and must:

  • Ensure that all staff and management who have a role in the creation, maintenance and functioning of agency websites understand that websites create public records and are therefore bound by a range of legal and recordkeeping requirements.
  • Determine what content from websites should be and will be captured as records (incorporating functional assessment and risk analysis) and how often this will occur.
  • Ensure appropriate retention of records.
  • Ensure that the design and operation of their websites support recordkeeping.
  • Select an appropriate technical approach to capturing the web records. 

How to manage website records

To determine which web records need to be captured and how long they should be retained, agencies need to appraise them according to: 

  • the function(s) and activities carried out or documented by the website records
  • risk management
  • web content format and context.

Functions and activities
Agencies need to be aware of what business activity their website (or each of its parts) is performing or recording, so that they can determine which records should be captured and how long those records should be retained.

This can be a challenging process, as websites usually span more than one functional area in an organisation, sometimes including functions with quite different levels of importance. 

Agencies should refer to PROV’s Retention and Disposal Authorities (RDAs). These Standards identify the functions and activities performed by agencies and assign the periods for which records must be kept.

“Content from an agency’s website or intranet should be sentenced according to the function and activity that the content documents.” (Class 15.2.0, PROS 07/01 Common Administrative Functions Retention and Disposal Authority)

See Retention and Disposal Authorities (RDAs) for further information about RDAs.
See our Document Library for a full list of current RDAs.


Risk management
Agencies should conduct a risk assessment of their website records involving the:

  • business owner/content creator, who can provide information about the content type and value of the web records
  • website manager, who can provide information about the site infrastructure, back-end databases, audit logs and publishing process
  • records manager, who understands the business and legal requirements for records and provides an organisation-wide perspective on existing recordkeeping systems and processes.

The risk assessment should consider:

  • litigation and legal disputes (and the need to prove what information was published at a certain time)
  • political consequences (the “Herald-Sun test” – how will a breach with negative consequences read on the front page of the daily newspapers?) 
  • business discontinuity or increased costs if important website information is inaccessible 

The following questions could also be explored when conducting the risk assessment: 

  • What are the functional areas of the content?
  • What part of the business of the organisation does the content document or support? 
  • What risks are associated with the information? (e.g. a section showing the office's opening hours would be lower risk than one providing information for tenderers)
  • What is the likelihood these risks will occur? 
  • What would be the consequences if they did? 
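
The likelihood and consequence questions above can be combined into a simple score for each section of a site. The following is a minimal sketch only: the 1 to 5 scales, the thresholds and the capture frequencies are hypothetical and are not prescribed by PROV. An agency would substitute its own risk framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SectionRisk:
    """Risk rating for one section of a website (hypothetical 1-5 scales)."""
    section: str
    likelihood: int   # 1 (rare) to 5 (almost certain)
    consequence: int  # 1 (negligible) to 5 (severe)

    def score(self) -> int:
        # Classic likelihood x consequence rating
        return self.likelihood * self.consequence


def suggested_capture_frequency(risk: SectionRisk) -> str:
    """Map a risk score to an illustrative capture frequency.

    The thresholds are assumptions chosen for this example only.
    """
    s = risk.score()
    if s >= 15:
        return "on every change"
    if s >= 6:
        return "weekly"
    return "monthly"


sections = [
    SectionRisk("Opening hours", likelihood=2, consequence=1),
    SectionRisk("Information for tenderers", likelihood=3, consequence=5),
]

# List the highest-risk sections first
for r in sorted(sections, key=SectionRisk.score, reverse=True):
    print(f"{r.section}: score {r.score()}, capture {suggested_capture_frequency(r)}")
```

Ranking sections this way makes it easier to justify why different parts of the same site are captured at different intervals.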

See PROS 10/10 G6 Records and Risk Management Guideline for further information.


Web content format and context
It is also important to consider whether the website is the only, the primary or a secondary means of delivering the information or providing the function or activity. If content on the website mirrors offline content, the website itself may be, in recordkeeping terms, a copy that does not require lengthy retention, even if the function to which it relates is a long-term one.

PROV advises that agencies adopt the approach that will deliver the most complete record and best protect against risk.

Not all content, pages or transactions on websites will generate records that require capturing and retention. 

For practical reasons, however, some agencies may find that archiving entire sites is easier than separating ephemeral content from essential records. Capturing entire sites may, in turn, be difficult for agencies that use multiple systems to deliver web content and dynamic activity.

Agencies that decide to capture only part of their website(s), following a functional analysis of their web resources or a technical analysis of their web activity, will need to decide what content the record is composed of and where it can be found. To be complete, website records must have:

  • record content 
  • metadata (contextual information) 
  • management and policy information that gives the framework within which the website was created and published.

All three of these elements are essential to provide a full and accurate (and useful) record of web activity. 
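
As an illustration of keeping the three elements together, the sketch below models a captured web record as a single object and refuses to treat it as complete if any element is missing. The class name, field names and example values are hypothetical and do not reflect a PROV schema.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class WebRecord:
    content: bytes   # the captured page or object itself
    metadata: dict   # contextual information (URL, capture time, ...)
    policy: dict     # management and policy framework (hypothetical fields)

    def is_complete(self) -> bool:
        """Treat a record as complete only if all three elements are present."""
        return bool(self.content) and bool(self.metadata) and bool(self.policy)

    def manifest(self) -> str:
        """Serialise the context, plus a checksum of the content, as JSON."""
        return json.dumps(
            {
                "sha256": hashlib.sha256(self.content).hexdigest(),
                "metadata": self.metadata,
                "policy": self.policy,
            },
            indent=2,
            sort_keys=True,
        )


record = WebRecord(
    content=b"<html><body>Tender information</body></html>",
    metadata={
        "url": "https://agency.example/tenders",  # placeholder URL
        "captured_at": datetime.now(timezone.utc).isoformat(),
    },
    policy={"publishing_procedure": "Web publishing procedure (hypothetical)"},
)

print(record.is_complete())
print(record.manifest())
```

Storing a checksum alongside the context is one way to later show that the captured content has not changed since capture.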

“Records are created and captured as part of or as soon as practical after the action, decision or incident that they document.” (Requirement 7, PROS 11/07 S3 Capture Specification).

Capture may occur live as content is created, or at the end of each week or month, as deemed appropriate by the agency. Your risk assessment will help you to determine how frequently content should be captured.

Agencies should also have appropriate procedures and processes in place to facilitate the capture of website records.

The outcomes of your risk assessments and appraisal activities will help you to determine which approach(es) to use to capture your website records. The following table outlines some available technical approaches to capturing web records.

| Option | Suitable for |
| --- | --- |
| Retain in web content management system (CMS) | Short term retention; a CMS that has basic recordkeeping functionality (e.g. metadata capture) so that both content and context are captured |
| Automated capture into a system such as an Electronic Document and Records Management System (EDRMS) | Simple web content that does not need all links preserved; web content that requires long term retention |
| Manual capture into a system such as an EDRMS or a digital/physical file | Small volumes of web content, as it is more labour intensive (requires manual screen capturing, saving and metadata entry); simple web content that does not need links preserved; web content that requires long term retention |
| Web harvesting (use a product to capture content at the web browser and save it locally) | Web content that is linked together and has a low number of web objects; websites that do not change or update frequently and have a low rate of web transactions |
| Capturing transactions (use a product to capture HTTP requests made to the web server and the consequent responses) | Capturing the interactions (what was received from and sent to a user), not the whole website |
| Capturing from back-end (the system captures the data that provides the web content) | Capturing the content for the website rather than the website itself, such as a CMS or business application with a website front end (the system should also have recordkeeping functionality so that both content and context are captured) |
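
To make the "capturing transactions" option more concrete, the sketch below wraps a Python WSGI web application in a small piece of middleware that records each request and the response sent back. It is an illustration under stated assumptions, not a recommended product: the class name `TransactionCapture` and the in-memory list used as a store are hypothetical, the request body is not captured, and a real deployment would write to a recordkeeping system instead.

```python
from datetime import datetime, timezone
from wsgiref.util import setup_testing_defaults


class TransactionCapture:
    """WSGI middleware that records each request and the response returned."""

    def __init__(self, app, store):
        self.app = app
        self.store = store  # hypothetical sink; here simply a Python list

    def __call__(self, environ, start_response):
        captured = {}

        def recording_start_response(status, headers, exc_info=None):
            # Note the status and headers before passing them on unchanged
            captured["status"] = status
            captured["headers"] = list(headers)
            return start_response(status, headers, exc_info)

        response = self.app(environ, recording_start_response)
        try:
            # Buffers the whole response in memory; acceptable for a sketch
            body = b"".join(response)
        finally:
            if hasattr(response, "close"):
                response.close()

        self.store.append({
            "captured_at": datetime.now(timezone.utc).isoformat(),
            "method": environ["REQUEST_METHOD"],
            "path": environ.get("PATH_INFO", ""),
            "query": environ.get("QUERY_STRING", ""),
            "status": captured.get("status"),
            "response_headers": captured.get("headers"),
            "response_body": body,
        })
        return [body]


def demo_app(environ, start_response):
    """Stand-in for an agency web application."""
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"Application received"]


log = []
wrapped = TransactionCapture(demo_app, log)

# Build a minimal fake request without starting a server
environ = {}
setup_testing_defaults(environ)
environ["PATH_INFO"] = "/apply"

result = wrapped(environ, lambda status, headers, exc_info=None: None)
print(log[0]["method"], log[0]["path"], log[0]["status"])
```

Capturing at this layer records what a user was actually sent, which is the evidence needed when an agency must later show what was published or transacted at a particular time.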

Determining what (and how) to archive from websites that are no longer operational poses both logistical and practical challenges for good recordkeeping. 

Agencies should consider the following recordkeeping questions when dealing with decommissioned sites: 

  • How much of the material on the decommissioned site is being carried forward intact to a new site or sites? 
  • To what extent is the website content covered by an RDA? 
  • Does the site make any logical sense (and does it retain sufficient context to be meaningful) in an offline environment? 

Answering these questions will help an agency be clear about the minimum record that needs to be captured before the decommissioned site loses accessibility.

Agencies are strongly encouraged to use long term preservation formats for information that must be retained for more than seven years, even if the information does not have to be transferred to PROV.

See PROS 15/03 S3 Long term preservation formats specification for further information.