Eva Müller, Uwe Klosa, Peter Hansson, Stefan Andersson, Erik Siira: Using XML for Long-term Preservation
Uppsala University Library, Electronic Publishing Centre
Keywords:
long-term preservation, XML, XML Schema, DiVA, DiVA Document Format, DiVA Archive, URN, URN:NBN
One of the objectives of the DiVA project is to explore the possibility of using XML as a format for long-term preservation. For this reason, the practical use of XML in different parts of the system was evaluated before deciding on the design.
The DiVA Document Format - defined by an XML schema - has been developed to describe the inter-relationships amongst the various data elements and processes, and to support long-term preservation of the actual documents.
XML Schema provides a means for defining the structure, content and semantics of XML documents. It is an XML-based alternative to the XML Document Type Definition (DTD). Because one of the primary reasons for using XML was to support long-term preservation, the most popular document DTDs, DocBook and TEI, were evaluated. Both were found to have limitations regarding metadata description, so a new structure for DiVA was developed using XML Schema. This schema combines the DocBook schema (derived from the DocBook DTD) for the textual parts of the document with the internal schema for all metadata (bibliographic and administrative data).
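To make the combination of a metadata part and a DocBook-style text part more concrete, the following minimal Python sketch builds a toy record with the standard library. The element names `document`, `metadata`, `title`, `author` and `fulltext` are hypothetical placeholders chosen for illustration; they are not taken from the actual DiVA Document Format schema. Only `article` and `para` follow DocBook naming.

```python
import xml.etree.ElementTree as ET


def build_document(title, authors, paragraphs):
    """Build a toy record: a metadata block plus a DocBook-style text block.

    All element names except `article` and `para` are illustrative
    assumptions, not the real DiVA Document Format vocabulary.
    """
    root = ET.Element("document")               # hypothetical root element
    meta = ET.SubElement(root, "metadata")      # hypothetical metadata container
    ET.SubElement(meta, "title").text = title
    for name in authors:
        ET.SubElement(meta, "author").text = name

    text = ET.SubElement(root, "fulltext")      # hypothetical wrapper for text parts
    article = ET.SubElement(text, "article")    # DocBook-style element
    for paragraph in paragraphs:
        ET.SubElement(article, "para").text = paragraph
    return root


doc = build_document(
    "Using XML for Long-term Preservation",
    ["Eva Müller", "Uwe Klosa"],
    ["One of the objectives of the DiVA project is to explore XML."],
)
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

Keeping the metadata and the text in separate subtrees mirrors the idea that each part is governed by its own schema, while a single file still carries both.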
Using the DiVA Document Format for content management and inter-process communication, several applications were developed. Some of their purposes are essential for long-term preservation:
Currently, the file archives for long-term preservation contain the original full-text file in various formats and the DiVA Document Format file, which contains all the metadata about the document. Furthermore, the DiVA Document Format file contains all parts of the full-text file that can be converted into XML. In the future it might be possible to convert the whole full text into XML, in which case the file archives would contain only DiVA Document Format files.
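The difference between the current and the possible future archive contents can be sketched with a small data model. This is only an illustration under stated assumptions: the class `ArchiveEntry`, its field names and the file names are hypothetical and do not describe the actual layout of the DiVA Archive.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveEntry:
    """One archived publication (hypothetical model, not the real archive layout)."""

    document_format_file: str                            # DiVA Document Format file (XML)
    fulltext_files: list = field(default_factory=list)   # original files, any format

    def is_xml_only(self):
        """Return True when no separate original full-text file is stored."""
        return not self.fulltext_files


# Current situation: the XML file plus the original full-text file.
current = ArchiveEntry("thesis-diva.xml", ["thesis.pdf"])   # hypothetical file names
# Possible future situation: the whole full text lives in the XML file.
future = ArchiveEntry("thesis-diva.xml")

print(current.is_xml_only(), future.is_xml_only())
```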
Table of Contents
| Front page | Using XML for Long-term Preservation |
| Preface | Preface |
| 1 | XML as Long-term Preservation Format |
| 1.1 | XML Schema |
| 1.2 | Comparison of DocBook and TEI |
| 1.3 | DiVA Document Format |
| 2 | Long-term Preservation in the DiVA Project |
| 2.1 | Uniform Resource Name (URN) and National Bibliographic Number (NBN) |
| 2.2 | The DiVA Archive |
| 3 | Conclusions |
| Appendix A | Appendix |
Table of Figures
| Figure 1: | Structure of the DiVA Archive |
| Figure 2: | Graphical representation of the complex type personType |
| Figure 3: | Graphical representation of the complex type organisationType |
© This publication and its compilation in form and content are protected by copyright. Any use not explicitly permitted by copyright law requires a written agreement. This applies in particular to reprography and to processing or storage in electronic systems.
ETD Proceeding DTD
HTML - Version create: Tue May 20 15:50:59 2003