| edoc Server of Humboldt-Universität zu Berlin |
| Publication type: | Workshop or conference contribution |
| Author(s): | Ulf Leser; Felix Naumann |
| Title: | (Almost) Hands-Off Information Integration for the Life Sciences |
| Published in: |
CIDR 2005, Second Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, January 4-7, 2005, Online Proceedings 2005, pp. 131-143 |
| Event: |
2nd CIDR 2005, Asilomar, CA, USA, January 4-7, 2005 |
| Publisher: |
CIDR http://www-db.cs.wisc.edu/cidr/cidr2005/index.html |
| Place of publication: | Asilomar, CA, USA |
| First published: | 2005 |
| Published on edoc: | 29 June 2006 |
| Status: |
published peer_reviewed |
| Full text: | pdf (urn:nbn:de:kobv:11-10065418) |
| URL of first publication: | http://www-db.cs.wisc.edu/cidr/cidr2005/cidr05cd-rom.zip |
| Subject area(s): | Computer Science |
| Keywords (eng): | Data Integration, Schema Matching, Duplicate Detection, Schema Management |
| Institution: | Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät II |
| Abstract (eng): |
| Data integration in complex domains, such as the life sciences, involves either manual data curation, offering highest information quality at highest price, or follows a schema integration and mapping approach, leading to moderate information quality at a moderate price. We suggest a radically different integration approach, called ALADIN, for the life sciences application domain. The predominant feature of the ALADIN system is an architecture that allows almost automatic integration of new data sources into the system, i.e., it offers data integration at almost no cost. We suggest a novel combination of data and text mining, schema matching, and duplicate detection to combat the reduction in information quality that seems inevitable when demanding a high degree of automatism. These heuristics can also lead to the detection of previously unknown or unseen relationships between objects, thus directly supporting the discovery-based work of life science researchers. We argue that such a system is a valuable contribution in two areas. First, it offers challenging and new problems for database research. Second, the ALADIN system would be a valuable knowledge resource for life science research. |