2023-06-30 Master's thesis DOI: 10.18452/26854
Modeling institutional research data repositories using the DCAT3 Data Catalog Vocabulary
A case study on TUdatalib
Semantic Web and Linked Data technologies might solve issues that arise when research data are published by independent providers. To obtain the maximum benefit from these technologies, metadata should be provided in as standardized a form as possible. The Data Catalog Vocabulary (DCAT) is a W3C recommendation of potential value for exposing research data metadata as Linked Data. The suitability of DCAT for institutional research data repositories was investigated using the TUdatalib repository as a case study. A model for TUdatalib metadata was developed based on the analysis of selected resources and guided by a draft of DCAT 3. The model made it possible to provide the essential information about the repository structure and contents, which indicates that the vocabulary is suitable, and it should, conceptually, permit automated data conversion from the repository system to DCAT 3. Some expressiveness is lost because dataset series were omitted. Conformance with DCAT 3 class definitions led to a highly complex model, which creates challenges for actual technical realizations. A comparative study revealed that simpler models are used at two other repositories, but implementing the TUdatalib model or a similar one would have the potential to improve alignment with the DCAT specifications. DCAT 3 was observed to be a promising option for Linked Data exposure of institutional research data repository metadata, and the TUdatalib model might serve as a basis for developing a general DCAT 3 application profile for institutional and other research data repositories.
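
To illustrate what an automated conversion from a repository record to DCAT might look like, the following minimal sketch maps one record to a JSON-LD document with a catalog, a dataset and its distributions. It is not the TUdatalib model developed in the thesis: the record layout, the function name `record_to_dcat` and all `example.org` IRIs are hypothetical, and only the widely used DCAT and Dublin Core terms `dcat:Catalog`, `dcat:Dataset`, `dcat:Distribution`, `dcat:dataset`, `dcat:distribution`, `dcat:downloadURL`, `dct:title` and `dct:identifier` are assumed.

```python
import json

# Namespace IRIs of the DCAT and Dublin Core Terms vocabularies.
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"


def record_to_dcat(record, catalog_iri):
    """Map one repository record (hypothetical layout) to a JSON-LD dict."""
    distributions = [
        {
            "@id": f["iri"],
            "@type": "dcat:Distribution",
            "dct:title": f["name"],
            "dcat:downloadURL": {"@id": f["url"]},
        }
        for f in record["files"]
    ]
    dataset = {
        "@id": record["iri"],
        "@type": "dcat:Dataset",
        "dct:title": record["title"],
        "dct:identifier": record["identifier"],
        "dcat:distribution": distributions,
    }
    # The catalog stands in for the repository and lists the dataset.
    return {
        "@context": {"dcat": DCAT, "dct": DCT},
        "@id": catalog_iri,
        "@type": "dcat:Catalog",
        "dcat:dataset": [dataset],
    }


# Hypothetical record; none of these values come from TUdatalib.
example_record = {
    "iri": "https://example.org/items/1",
    "title": "Example measurement data",
    "identifier": "example-0001",
    "files": [
        {
            "iri": "https://example.org/items/1/files/data.csv",
            "name": "data.csv",
            "url": "https://example.org/files/1/data.csv",
        }
    ],
}

doc = record_to_dcat(example_record, "https://example.org/catalog")
print(json.dumps(doc, indent=2))
```

A real mapping would have to cover considerably more of the repository structure than this flat catalog-dataset-distribution chain, which is where the model complexity mentioned above would arise.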