Hans Liegmann: Long-term preservation of Electronic Theses and Dissertations
Over the past few years, a considerable share of our capacity has gone into implementing procedures for transferring digital resources into our electronic library stacks.
Our experience with ETDs has shown that defining a submission information package is, in principle, not too complicated for this type of material. Most of our dissertations and theses consist of a single file (90 % are in PDF format) and, contrary to some early predictions, they still do not come with extensive additional material such as rotating molecule models, multimedia additions, executable programmes or data sets. Nevertheless, we wanted to handle the few and rather simple examples of multifile objects in a standardized way in order to prepare ourselves for future complexity. If the consistency and completeness of a multifile object (e. g. a set of HTML files) have to be guaranteed, no one can do this better than the author or primary publisher. Following this rule, we pragmatically defined a container format for multifile ETDs in 1999. It uses one of a choice of archive formats (ZIP, TAR) to keep the dependent parts of the document together. In addition, we introduced a simple table of contents file that standardizes the root element for future migration activities, even for single files, and for user access and navigation.
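The container idea can be illustrated with a minimal sketch. It assumes a ZIP archive and a plain-text table of contents; the file name `toc.txt`, its `root:` / `file:` line layout and the helper names `build_container` and `read_root` are hypothetical and are not taken from the 1999 format definition.

```python
import io
import zipfile

# Hypothetical name and layout; the actual table of contents format is not given here.
TOC_NAME = "toc.txt"


def build_container(files: dict[str, bytes], root: str) -> bytes:
    """Pack the dependent files of one ETD into a ZIP and add a table of contents."""
    if root not in files:
        raise ValueError(f"root document {root!r} is not among the files")
    if TOC_NAME in files:
        raise ValueError(f"{TOC_NAME!r} is reserved for the table of contents")
    names = sorted(files)
    # The first line names the root element; the following lines list every member.
    toc = "\n".join([f"root: {root}"] + [f"file: {name}" for name in names]) + "\n"
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(TOC_NAME, toc)
        for name in names:
            archive.writestr(name, files[name])
    return buffer.getvalue()


def read_root(container: bytes) -> str:
    """Return the root document recorded in the table of contents."""
    with zipfile.ZipFile(io.BytesIO(container)) as archive:
        first_line = archive.read(TOC_NAME).decode("utf-8").splitlines()[0]
    return first_line.split(": ", 1)[1]
```

Because the root element is always recorded in the same place, a later migration or access tool can call `read_root` without knowing anything else about the document, which is also why the table of contents is useful for single-file ETDs.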
Since 1999, several projects and initiatives, including our own, have worked on proposals for submission information packaging. We are looking at the standardized container format for eBooks<4> and at the Harvard<5> project results on eJournal article transfer, and we are observing with great interest the deployment of the Metadata Encoding and Transmission Standard (METS)<6>. A METS document consists of five components: descriptive metadata, administrative metadata, file groups, a structural map and, optionally, behaviour. The ability of METS to combine metadata, content and structural information in one entity makes it very attractive for digital object transfer. METS has its roots in the Digital Library Federation, which ensures openness and allows further input from the library and archive communities. We envisage METS as a possible successor to our pragmatic "home-grown" container format for ETDs and other digital objects.
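To make the component list concrete, the following sketch builds a skeletal METS document with Python's standard XML library. It is an illustration only: the element and attribute names follow the published METS schema as far as we are aware, the metadata sections are left as empty placeholders, the optional behaviour section is omitted, and the output is not validated against the schema. The helper names `m` and `build_mets` are hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def m(tag: str) -> str:
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets(title: str, file_names: list[str]) -> ET.Element:
    """Build a skeletal METS tree for one object (illustrative, not validated)."""
    root = ET.Element(m("mets"))
    # Descriptive metadata: empty placeholder; a real package would embed a record.
    ET.SubElement(root, m("dmdSec"), {"ID": "DMD1"})
    # Administrative metadata: empty placeholder as well.
    ET.SubElement(root, m("amdSec"), {"ID": "AMD1"})
    # File groups: one group that lists every content file.
    file_sec = ET.SubElement(root, m("fileSec"))
    group = ET.SubElement(file_sec, m("fileGrp"))
    # Structural map: a single division pointing at the files in order.
    struct_map = ET.SubElement(root, m("structMap"))
    div = ET.SubElement(struct_map, m("div"), {"LABEL": title, "DMDID": "DMD1"})
    for index, name in enumerate(file_names, start=1):
        file_id = f"FILE{index}"
        file_el = ET.SubElement(group, m("file"), {"ID": file_id})
        ET.SubElement(
            file_el, m("FLocat"), {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": name}
        )
        ET.SubElement(div, m("fptr"), {"FILEID": file_id})
    return root
```

Calling `ET.tostring(build_mets("Sample thesis", ["index.html", "ch1.html"]), encoding="unicode")` serializes the tree; the structural map then plays the role that the table of contents file plays in the home-grown container format.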
Until now, librarians of Die Deutsche Bibliothek have pulled archival copies of ETDs into the archive individually and by hand. Because of the growing number of ETDs and the sometimes problematic response times of the original servers, we are interested in automating this procedure as far as possible. We will soon start an experiment in which we harvest ETD servers using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)<7>, extract the (preferably persistent) document identifier from the metadata set and finally have a robot pull the ETD automatically.
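The planned harvest, extract and pull sequence might look roughly like the sketch below. It assumes that the ETD servers expose unqualified Dublin Core (`oai_dc`) records and that a persistent identifier can be recognised by a `urn:` prefix; both are assumptions made for illustration, and all function names are hypothetical. Resumption tokens, error handling and the resolution of a URN to a retrievable address are left out.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET
from typing import Optional

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url: str) -> str:
    """Build an OAI-PMH ListRecords request for Dublin Core metadata."""
    query = urllib.parse.urlencode({"verb": "ListRecords", "metadataPrefix": "oai_dc"})
    return f"{base_url}?{query}"


def pick_identifier(identifiers: list[str]) -> Optional[str]:
    """Prefer a persistent identifier (assumed here: a URN) over a plain URL."""
    for value in identifiers:
        if value.lower().startswith("urn:"):
            return value
    for value in identifiers:
        if value.lower().startswith(("http://", "https://")):
            return value
    return None


def extract_identifiers(response_xml: str) -> list[str]:
    """Return one preferred document identifier per harvested record."""
    root = ET.fromstring(response_xml)
    chosen = []
    for record in root.iterfind(".//oai:record", NS):
        values = [(e.text or "").strip() for e in record.iterfind(".//dc:identifier", NS)]
        identifier = pick_identifier(values)
        if identifier is not None:
            chosen.append(identifier)
    return chosen


def fetch(url: str) -> bytes:
    """Pull a document (or a ListRecords response) from the original server."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()
```

A robot would call `fetch(list_records_url(base_url))`, pass the decoded response to `extract_identifiers`, and then retrieve each document. Preferring the persistent identifier keeps the archive independent of later changes to the original server's addresses.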
| Footnotes: | |
|---|---|
| <4> | http://www.openebook.org/ |
| <5> | http://www.diglib.org/preserve/harvardsip10.pdf |
| <6> | http://www.loc.gov/standards/mets/ |
| <7> | http://www.oclc.org/research/pmwg/background.shtm |
© This publication and its compilation in form and content is copyrighted. Every realization which is not explicitly allowed by copyright law requires a written agreement. Especially, this holds for reprography and processing / storing by electronic systems.
HTML version created: Tue May 20 15:11:32 2003