2004-02-05 Journal article DOI: 10.18452/9188
Completeness of Integrated Information Sources
Mathematisch-Naturwissenschaftliche Fakultät II
For many information domains there are numerous World Wide Web data sources. The sources vary both in their extension and their intension: they represent different real-world entities, possibly overlapping, and provide different attributes of these entities. Mediator-based information systems allow integrated access to such sources by providing a common schema against which the user can pose queries. Given a query, the mediator must determine which participating sources to access and how to integrate the incoming results. This article describes how to support mediators in their source selection and query planning process. We propose three new merge operators, which formalize the integration of multiple source responses. A completeness model describes how useful a source is for answering a query. The completeness measure incorporates both the extensional value (called coverage) and the intensional value (called density) of a source. We show how to determine the completeness of single sources and of combinations of sources under the new merge operators. Finally, we show how to use the measure for source selection and query planning.
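To make the two components of the measure concrete, the following is a minimal sketch for a single source. It is not taken from the article. It assumes that coverage is the share of real-world entities the source represents (relative to a hypothetical known `world_size`), that density is the share of non-null values over the attributes of the common schema, and that the two are combined by multiplication. The abstract does not state the combining formula, so the product is an assumption, as are all function and variable names.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]  # one entity; None marks a missing attribute value


def coverage(records: List[Record], world_size: int) -> float:
    """Extensional value: share of real-world entities the source represents.

    world_size is a hypothetical, externally known count of entities in the domain.
    """
    if world_size <= 0:
        raise ValueError("world_size must be positive")
    return len(records) / world_size


def density(records: List[Record], attributes: List[str]) -> float:
    """Intensional value: share of non-null cells over the schema attributes."""
    if not records or not attributes:
        return 0.0
    filled = sum(1 for r in records for a in attributes if r.get(a) is not None)
    return filled / (len(records) * len(attributes))


def completeness(records: List[Record], attributes: List[str], world_size: int) -> float:
    # Assumption: combine coverage and density as a product.
    return coverage(records, world_size) * density(records, attributes)


# Illustrative data only: a source holding 2 of 4 entities, with one missing value.
schema = ["title", "year"]
source = [
    {"title": "A", "year": "1999"},
    {"title": "B", "year": None},
]
print(coverage(source, 4), density(source, schema), completeness(source, schema, 4))
```

Under these assumptions a mediator could rank candidate sources by the resulting score. Scoring combinations of sources additionally requires knowing how much the sources overlap, which this sketch does not model.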