2002-11-01 · Conference publication · DOI: 10.18452/9206
Declarative Data Merging with Conflict Resolution
Mathematisch-Naturwissenschaftliche Fakultät II
Database integration is a growing and increasingly important field in both research and industry. Integration requires many steps, from initial schema integration and schema mapping, to data scrubbing and cleansing, and finally to data merging. While much research has concentrated on the first steps, which are performed at the schema level, there are only a few publications about the actual, practical merging of data in an integrated database or in a query against multiple databases. When merging data, especially data from autonomous sources, there is a considerable risk of reducing the quality of the merged data, even below the level of the original sources. The main reasons for decreased quality are data conflicts among the sources. To address this problem, we define resolution functions that merge conflicting data. We present several alternatives for merging relational data sources with common queries: through grouping & aggregating and through partitioning & joining. The resulting queries use resolution functions and can be used to migrate data from multiple sources to a target database, or to define an integrating view on multiple sources. We describe and analyze the advantages of the different approaches, and describe our practical solution in the framework of a schema mapping and data transformation tool.
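To make the grouping & aggregating idea concrete, the following is a minimal sketch in Python rather than in the query language used in the paper. It assumes that tuples from all sources share a key attribute, groups them by that key, and applies one resolution function per attribute to the conflicting values. The function names (`resolve_first_non_null`, `resolve_max`, `merge_by_grouping`) and the sample data are hypothetical and chosen for illustration only; they are not the resolution functions defined in the paper.

```python
from collections import defaultdict


def resolve_first_non_null(values):
    """Hypothetical resolution function: take the first value that is not None."""
    for value in values:
        if value is not None:
            return value
    return None


def resolve_max(values):
    """Hypothetical resolution function: take the largest non-None value."""
    present = [value for value in values if value is not None]
    return max(present) if present else None


def merge_by_grouping(sources, key, resolvers):
    """Union all source rows, group them by `key`, and resolve each attribute.

    `sources` is a list of row lists (each row is a dict), and `resolvers`
    maps an attribute name to the resolution function used for it.
    """
    groups = defaultdict(list)
    for rows in sources:
        for row in rows:
            groups[row[key]].append(row)

    merged = []
    for key_value in sorted(groups):
        rows = groups[key_value]
        result = {key: key_value}
        for attribute, resolve in resolvers.items():
            # Collect the (possibly conflicting) values of this attribute
            # across all rows that describe the same real-world object.
            result[attribute] = resolve([row.get(attribute) for row in rows])
        merged.append(result)
    return merged


# Illustrative data: the two sources disagree on the salary for id 2,
# and source_a is missing the name for id 2.
source_a = [
    {"id": 1, "name": "Ann", "salary": 50000},
    {"id": 2, "name": None, "salary": 42000},
]
source_b = [
    {"id": 2, "name": "Bob", "salary": 45000},
    {"id": 3, "name": "Cy", "salary": None},
]

merged = merge_by_grouping(
    [source_a, source_b],
    key="id",
    resolvers={"name": resolve_first_non_null, "salary": resolve_max},
)

for row in merged:
    print(row)
```

In a relational setting, the same pattern corresponds roughly to a union of the sources followed by a GROUP BY on the key, with the resolution functions taking the place of aggregate functions. The choice of resolution function per attribute is a modeling decision; taking the maximum salary here is only one possible policy.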