Jeremias Märki
Private Website

Paper on Long-Term Archival Package Formats


Together with the Swiss Federal Institute of Intellectual Property, we've been looking into ways of packaging patent documents for long-term archival. The goal is to package PDF files (following the PDF/A-1 standard) together with the original data that was used to create the PDF. The original data normally consists of an XML file and possibly multiple TIFF images.

We've documented our findings in a short paper, which you can download as a PDF file:

Download (published: 2006-02-21)

Additional Notes

2008-01-15: Since the publication of the above paper, a number of proposals and specifications have shown up on the net. A basic ZIP-based container that could be used instead of the above is the "Universal Container Format" (UCF) used by Adobe's Mars specification. UCF is based on the packaging principles of OCF, the OEBPS Container Format, created by the International Digital Publishing Forum.
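To illustrate what such a ZIP-based container could look like, here is a minimal Python sketch. It follows the OCF convention of writing an uncompressed "mimetype" entry first, then adds the PDF/A file and its source data. The media type string, the entry names, and the function build_package are assumptions made for this example; they come from neither the paper nor the UCF/OCF specifications, and the sketch leaves out the metadata files under META-INF that OCF defines.

```python
import io
import zipfile

# Hypothetical media type for the package; not defined by any specification.
PACKAGE_MIMETYPE = "application/x-patent-archive+zip"


def build_package(pdf: bytes, xml: bytes, images: dict[str, bytes]) -> bytes:
    """Return a ZIP container holding the PDF/A file and its source data."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as package:
        # OCF-style convention: "mimetype" is the first entry and is stored
        # without compression so that tools can identify the package type.
        package.writestr("mimetype", PACKAGE_MIMETYPE,
                         compress_type=zipfile.ZIP_STORED)
        # Entry names below are illustrative assumptions.
        package.writestr("document.pdf", pdf)
        package.writestr("source/document.xml", xml)
        for name, data in sorted(images.items()):
            package.writestr(f"source/{name}", data)
    return buffer.getvalue()
```

Calling build_package(pdf_bytes, xml_bytes, {"fig1.tif": tiff_bytes}) returns the package as bytes, which can then be written to disk or handed to an archive system.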

2012-07-02: PDF/A-2 finally allows attachments, which makes a container superfluous.