edoc-Server
Open-access publication server of Humboldt-Universität
  • Series and edited volumes
  • Faculties and institutes of HU
  • Institut für Informatik
  • Informatik-Berichte
2006-04-12 Book DOI: 10.18452/2461
On the Distance of Databases
Müller, Heiko
Freytag, Johann-Christoph
Leser, Ulf
We study the novel problem of efficiently computing the update distance for a pair of relational databases. In analogy to the edit distance of strings, we define the update distance of two databases as the minimal number of set-oriented insert, delete and modification operations necessary to transform one database into the other. We show how this distance can be computed by traversing a search space of database instances connected by update operations. This insight leads to a family of algorithms that compute the update distance or approximations of it. In our experiments we observed that a simple heuristic performs surprisingly well in most considered cases. Our motivation for studying distance measures for databases stems from the field of scientific databases. There, replicas of a single database are often maintained at different sites, which typically leads to (accidental or planned) divergence of their content. To re-create a consistent view, these differences must be resolved. Such an effort requires an understanding of the process that produced them. We found that minimal update sequences are a proper representation of systematic errors, thus giving valuable clues to domain experts responsible for conflict resolution.
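The search-space idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a single relation stored as a set of tuples, and the three operation types below (insert one tuple, delete all tuples matching `attribute = value`, set one attribute for all tuples matching `attribute = value`) are hypothetical stand-ins for the report's set-oriented operations. The names `neighbors` and `update_distance` and the `max_depth` cutoff are likewise illustrative. A breadth-first traversal over database instances returns the length of a shortest operation sequence under these assumptions.

```python
from collections import deque
from itertools import product


def neighbors(db, target, n_attrs):
    """Yield every instance reachable from db by one (assumed) update operation."""
    # Candidate constants: every value that occurs in either database.
    values = {v for row in db | target for v in row}

    # Insert: add a single tuple that the target contains and db lacks.
    for row in target - db:
        yield db | {row}

    # Delete: remove all tuples whose attribute i equals v.
    for i, v in product(range(n_attrs), values):
        hit = {r for r in db if r[i] == v}
        if hit:
            yield db - hit

    # Modify: set attribute j to w in all tuples whose attribute i equals v.
    for i, v, j, w in product(range(n_attrs), values, range(n_attrs), values):
        hit = {r for r in db if r[i] == v}
        if hit:
            changed = {r[:j] + (w,) + r[j + 1:] for r in hit}
            yield (db - hit) | changed


def update_distance(source, target, n_attrs, max_depth=6):
    """Breadth-first search; returns the minimal number of operations, or None."""
    source, target = frozenset(source), frozenset(target)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        db, depth = queue.popleft()
        if db == target:
            return depth
        if depth == max_depth:
            continue  # do not expand beyond the cutoff
        for nxt in neighbors(db, target, n_attrs):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return None


# Two replicas that differ by one systematic change: every "x" became "z".
replica_a = {("a1", "x"), ("a2", "x"), ("a3", "y")}
replica_b = {("a1", "z"), ("a2", "z"), ("a3", "y")}
print(update_distance(replica_a, replica_b, n_attrs=2))
```

For the two replicas above the sketch reports a distance of 1, because a single set-oriented modification explains both changed tuples. Exhaustive search of this kind grows quickly with the number of values and attributes, which is consistent with the abstract's interest in approximations and heuristics.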
Files in this item
199.pdf — Adobe PDF — 515.1 Kb
MD5: 5e0d81d16e33ce79a820db2c81d0472c
In Copyright
© Humboldt-Universität zu Berlin
 
DOI
10.18452/2461
Permanent URL
https://doi.org/10.18452/2461