Eva Müller, Stefan Andersson, Uwe Klosa, Peter Hansson: Metadata Workflow Based on Reuse of Original Data

2. The Cataloguing Process at Swedish Research Libraries

2.1 LIBRIS - the Union Catalogue of Swedish Libraries

Research libraries in Sweden primarily catalogue their resources in LIBRIS - the union catalogue of Swedish libraries - but use local library management systems for circulation and patron information. This means that the bibliographical records for books, periodicals, electronic documents, and other publications are registered only once in the union catalogue. The participating libraries then add local information on their specific holdings, locations and subject headings. Afterwards the records are exported to the local library system where item-specific data, e.g. a barcode, is appended. At the moment some 200 university and special libraries are using LIBRIS for cataloguing and more than 1,400 libraries are using it for interlibrary lending. The union catalogue, comprising 4.5 million titles, is also publicly available on the web. The LIBRIS department at the Royal Library in Stockholm is responsible for the administration and development of the system.<5>

2.2 MARC 21

Since January 2002 the Voyager software from Endeavor Information Systems has been used as the library management system for LIBRIS, replacing an older system. In connection with this change of systems it was also decided to replace the local LIBRISMARC/LIBRIS III format with MARC 21 (the harmonised USMARC and CAN/MARC formats published in a single edition by the Library of Congress and the National Library of Canada in 1999).<6> Facilitating the exchange of bibliographical information on an international level was one of the reasons behind this decision.<7>

2.3 Cataloguing Swedish Theses

Normally the theses published at Swedish universities are initially catalogued in LIBRIS by the local university library. The university libraries generally use the minimal level (code 7 in MARC Leader) for the bibliographic information. In KRS<8>, the Swedish cataloguing rules, which are a translation and revision of the Anglo-American cataloguing rules, second edition, this level is described in §1.0D2.

Traditionally theses (and other university publications) are catalogued in the same way as other items acquired by the university libraries: A cataloguer will receive the book, search for it in the LIBRIS database, and, if it is not already registered, type in the full bibliographical record including all ISBD(G)-punctuation (General International Standard Bibliographic Description).<9>

Subsequently the theses will also be included in Svensk bokförteckning<10>, the part of the Swedish national bibliography of literature which covers monographs issued in Sweden. The bibliography is based on copies supplied to the Royal Library by publishers and other organisations and on legal deposit copies. It is produced from the LIBRIS database. In connection with this, the bibliographic information will be refined by the section of National Bibliography Monographs at the Royal Library to meet the full level as described in §1.0D3 of KRS. For example, 2,247 Swedish theses issued in a single year (2002) were registered in the Swedish national bibliography.

2.4 New DiVA Workflow

The main intention of the new cataloguing workflow implemented through DiVA is naturally to reuse the original data which was created by the author (as described above, 1.1-1.2) as the basis for the bibliographical record instead of typing the same information all over again.

Additionally, the bibliographic record will be available in the national union catalogue at more or less the same time as the thesis is made publicly available. By contrast, a small sample of Uppsala theses showed that, when cataloguing is done manually, it typically takes at least a week before a thesis turns up in LIBRIS.

In the complete DiVA workflow described above, where the metadata entered by the author actually forms the title page and edition notice of the publication (in print or on the Internet), one can also be sure that the bibliographic information is the correct one - at least from a cataloguing point of view! - since it is reproduced directly from the source document.

2.5 Creation of MARC 21 Records in DiVA

The records created from the information submitted by authors, and stored as files in the DiVA Document Format as described above, constitute the starting point for the creation of the MARC 21 records in the DiVA Publishing System. Transferring the records directly from DiVA to LIBRIS means that cataloguers do not have to import them from another source or use any other software than the Voyager cataloguing client to find them.

First, an XSLT stylesheet transforms the record in the DiVA format into another XML file in the MARC XML format, the new XML exchange format for MARC as published by the Library of Congress.<11> The XML format is used because the OAI protocol for metadata harvesting (OAI-PMH)<12> is used to collect and deliver the records which are to be sent to LIBRIS. Besides, no further scripting, apart from the XSLT transformation, is needed to create a full MARC record. A MARC ISO-2709 record can then be created without data loss from the MARC XML record. Leader data positions not needed in the XML environment are retained as placeholders containing the value 0. The MARC data fields are created by matching DiVA elements to MARC elements. Standard phrases are included in the XSL templates where appropriate. Information that appears more than once in the MARC record can be created from one source (e.g. both data fields 100a and 245c can be created from a single DiVA creator element).
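The mapping step can be illustrated with a minimal sketch. It is written in Python rather than XSLT purely for illustration, and the input element names (record, creator, title) are hypothetical stand-ins, not the actual DiVA Document Format. It shows one creator element feeding both 100a and 245c, and a Leader whose length and base address positions hold zero placeholders. ISBD punctuation is deliberately left out here.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"  # namespace of the MARC XML schema

# Hypothetical, simplified input; the real DiVA format uses its own element names.
DIVA_RECORD = """
<record>
  <creator>Andersson, Anna</creator>
  <title>Metadata reuse in practice</title>
</record>
"""

# Leader with zero placeholders in positions 00-04 and 12-16;
# position 17 carries encoding level 7 (minimal level).
PLACEHOLDER_LEADER = "00000" + "nam  22" + "00000" + "7a 4500"


def display_name(inverted):
    """Turn 'Family, Given' into 'Given Family' for the statement of responsibility."""
    family, _, given = inverted.partition(",")
    return f"{given.strip()} {family.strip()}".strip()


def add_datafield(record, tag, ind1, ind2, subfields):
    """Append a MARC XML datafield with the given (code, value) subfields."""
    field = ET.SubElement(record, f"{{{MARC_NS}}}datafield",
                          {"tag": tag, "ind1": ind1, "ind2": ind2})
    for code, value in subfields:
        sub = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
        sub.text = value
    return field


def diva_to_marcxml(diva_xml):
    """Map the hypothetical source record to a MARC XML record element."""
    source = ET.fromstring(diva_xml.strip())
    creator = source.findtext("creator")
    title = source.findtext("title")

    record = ET.Element(f"{{{MARC_NS}}}record")
    leader = ET.SubElement(record, f"{{{MARC_NS}}}leader")
    leader.text = PLACEHOLDER_LEADER
    # The single creator element is used twice: main entry and 245c.
    add_datafield(record, "100", "1", " ", [("a", creator)])
    add_datafield(record, "245", "1", "0",
                  [("a", title), ("c", display_name(creator))])
    return record


if __name__ == "__main__":
    print(ET.tostring(diva_to_marcxml(DIVA_RECORD), encoding="unicode"))
```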

Since it has been decided to store all the ISBD(G)-punctuation (even the punctuation between fields) in the actual fields of the LIBRIS records, rather than letting the user interface display them, the punctuation is created in the MARC XML record through a series of conditional tests in the XSLT stylesheet. Otherwise it would have to be added manually by the cataloguer. Quite often a number of conditions must be tested.
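The kind of conditional testing involved can be sketched for the title statement (field 245). The following Python fragment is an illustration only, not the DiVA stylesheet: it follows the common MARC practice of ending a subfield with the punctuation that introduces the next element (" :" before other title information, " /" before the statement of responsibility). Whether this matches the LIBRIS rules in every detail is an assumption.

```python
def close_field(text):
    """Add a final full stop unless the text already ends with closing punctuation."""
    return text if text.endswith((".", "?", "!")) else text + "."


def title_statement(title, subtitle=None, responsibility=None):
    """Return 245 subfields as (code, value) pairs with punctuation stored in the data."""
    subfields = []

    # What follows the title proper decides how subfield a ends.
    if subtitle:
        subfields.append(("a", title + " :"))
    elif responsibility:
        subfields.append(("a", title + " /"))
    else:
        subfields.append(("a", close_field(title)))

    if subtitle:
        if responsibility:
            subfields.append(("b", subtitle + " /"))
        else:
            subfields.append(("b", close_field(subtitle)))

    if responsibility:
        subfields.append(("c", close_field(responsibility)))

    return subfields


if __name__ == "__main__":
    for code, value in title_statement("Metadata workflow",
                                       "reuse of original data",
                                       "Eva Müller"):
        print(f"${code} {value}")
```

Even this small case needs three tests on which elements are present, which is why the stylesheet often has to check several conditions for a single field.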

The date of birth is required for Swedish citizens in main and added entry fields (100/700) for personal names in LIBRIS. As the date is submitted by the authors themselves the cataloguer will not have to look it up in the university directory.

The current version (1.0) of the DiVA Document Format supports all corresponding MARC 21 fields and indicators for the bibliographic description except the number of non-filing characters<13>, which occurs as the second indicator in the title statements, e.g. fields 245 and 440. Because there is no way of describing alternative filings of titles in the DiVA system at present, the second indicator will always initially be set to 0 and may have to be changed manually later on.
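To make the indicator concrete, the sketch below shows how a non-filing character count could be derived from a leading article. This is not something DiVA does (as stated above, it always sets the value to 0); the function name and the short article lists are illustrative assumptions.

```python
# Illustrative article lists only; real lists are longer and language-specific.
ARTICLES = {
    "eng": ("the", "an", "a"),
    "swe": ("en", "ett", "den", "det", "de"),
    "fre": ("le", "la", "les"),
}


def nonfiling_characters(title, language):
    """Count the characters (article plus following space) to skip when filing."""
    lowered = title.lower()
    for article in ARTICLES.get(language, ()):
        if lowered.startswith(article + " "):
            return len(article) + 1
    return 0


if __name__ == "__main__":
    for title, lang in [("The cataloguing process", "eng"),
                        ("En studie av metadata", "swe"),
                        ("Analysis of workflows", "eng")]:
        print(nonfiling_characters(title, lang), title)
```

A purely mechanical test like this cannot tell an article from an identical word with another meaning (Swedish "en" can also be the numeral "one"), which is one reason a manual check by the cataloguer remains necessary.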

2.6 Transferring DiVA Records to LIBRIS

The OAI Harvester application from OCLC<14> contacts DiVA daily to ask for newly published theses. As a response to the request the DiVA system delivers metadata records conforming to the MARC XML format described above. The records are stored in an XML file which is delivered to LIBRIS by a file transfer protocol over the Internet.
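A daily harvesting request of this kind is an ordinary OAI-PMH ListRecords call with a from date. The sketch below only builds the request URL; the base URL and the metadataPrefix value "marcxml" are assumptions, since metadata prefixes are defined by each repository and the actual DiVA endpoint is not given here.

```python
from datetime import date
from urllib.parse import urlencode


def list_records_url(base_url, since, metadata_prefix="marcxml"):
    """Build an OAI-PMH ListRecords request for records changed since a given day."""
    query = urlencode({
        "verb": "ListRecords",              # OAI-PMH verb for harvesting full records
        "metadataPrefix": metadata_prefix,  # assumed prefix; defined by the repository
        "from": since.isoformat(),          # day granularity, YYYY-MM-DD
    })
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # Hypothetical base URL, used for illustration only.
    print(list_records_url("https://diva.example.org/oai", date(2003, 5, 15)))
```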

The MARC XML records will eventually be converted to MARC ISO-2709 ("tape format") and imported into the LIBRIS database. The conversion will calculate the numeric strings for the record length (Leader positions 00-04) and the base address of data (Leader positions 12-16) and replace the 0-values of the MARC XML record. It also establishes the directory of the ISO 2709 record. Since the Voyager library management system does not support Unicode, the character encoding must also be converted to ANSEL. These conversions can easily be accomplished with the MARC4J API<15>.
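The arithmetic behind the Leader values and the directory can be shown in a few lines. The sketch below is in Python and is not the MARC4J API; it only illustrates how the record length, the base address of data and the directory entries follow from the field data. It handles ASCII data only, so the conversion to ANSEL is not covered, and the function names are illustrative.

```python
FIELD_TERMINATOR = b"\x1e"
RECORD_TERMINATOR = b"\x1d"
SUBFIELD_DELIMITER = b"\x1f"

# Leader as delivered in MARC XML, with zero placeholders in 00-04 and 12-16.
PLACEHOLDER_LEADER = "00000" + "nam  22" + "00000" + "7a 4500"


def datafield(ind1, ind2, subfields):
    """Serialise indicators and (code, value) subfields; ASCII only in this sketch."""
    data = (ind1 + ind2).encode("ascii")
    for code, value in subfields:
        data += SUBFIELD_DELIMITER + code.encode("ascii") + value.encode("ascii")
    return data


def to_iso2709(leader, fields):
    """Build an ISO 2709 record from a 24-character leader and (tag, data) pairs."""
    directory = b""
    body = b""
    for tag, data in fields:
        data += FIELD_TERMINATOR
        # Directory entry: tag (3), field length (4), start position (5).
        directory += f"{tag}{len(data):04d}{len(body):05d}".encode("ascii")
        body += data

    # Leader (24) + directory + the directory's field terminator.
    base_address = 24 + len(directory) + 1
    record_length = base_address + len(body) + 1  # + record terminator

    # Replace the zero placeholders with the calculated values.
    new_leader = f"{record_length:05d}{leader[5:12]}{base_address:05d}{leader[17:]}"
    return (new_leader.encode("ascii") + directory + FIELD_TERMINATOR
            + body + RECORD_TERMINATOR)


if __name__ == "__main__":
    record = to_iso2709(PLACEHOLDER_LEADER,
                        [("245", datafield("1", "0", [("a", "Title.")]))])
    print(record)
```

For this one-field record the directory is 12 bytes, so the base address is 24 + 12 + 1 = 37, and the field data of 11 bytes (including its terminator) plus the record terminator gives a record length of 49.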


Footnotes:

<5> See: http://info.libris.kb.se/infosvensk/allmaninfo.htm

<6> See: http://lcweb.loc.gov/marc/

<7> See: http://info.libris.kb.se/infosvensk/Nyheter_Fakta/librisnytt/librisnytt34.htm#rubrik1

<8> Katalogiseringsregler för svenska bibliotek. Lund 1990

<9> See: http://www.ifla.org/VII/s13/pubs/isbdg.htm

<10> See: http://www.kb.se/nbm/svb.htm

<11> See: http://www.loc.gov/standards/marcxml

<12> See: http://www.openarchives.org/OAI/openarchivesprotocol.html

<13> A value that specifies the number of character positions associated with a definite or indefinite article (e.g., Le, An) at the beginning of a title that are disregarded in sorting and filing processes.

<14> See: http://www.oclc.org/research/software/oai/harvester.shtm

<15> MARC4J is an easy to use Application Programming Interface (API) for working with MARC records in Java. The API consists of an event-based MARC parser, an object model for in-memory editing of MARC record objects and SAX2 based producers and consumers for conversions between MARC and MARC XML. Available from http://marc4j.tigris.org/



© This publication and its compilation in form and content is copyrighted. Every realization which is not explicitly allowed by copyright law requires a written agreement. Especially, this holds for reprography and processing / storing by electronic systems.
