
2006-04-01, Conference publication, DOI: 10.18452/9214
XStruct
dc.contributor.author: Hegewald, Jan
dc.contributor.author: Naumann, Felix
dc.contributor.author: Weis, Melanie
dc.date.accessioned: 2017-06-17T00:23:16Z
dc.date.available: 2017-06-17T00:23:16Z
dc.date.created: 2006-07-05
dc.date.issued: 2006-04-01
dc.identifier.uri: http://edoc.hu-berlin.de/18452/9866
dc.description.abstract: XML is the de facto standard format for data exchange on the Web. While it is fairly simple to generate XML data, it is a complex task to design a schema and then guarantee that the generated data is valid according to that schema. As a consequence, much XML data does not have a schema or is not accompanied by its schema. To gain the benefits of having a schema (efficient querying and storage of XML data, semantic verification, data integration, etc.), this schema must be extracted. In this paper we present an automatic technique, XStruct, for XML Schema extraction. Based on ideas of [5], XStruct extracts a schema for XML data by applying several heuristics to deduce regular expressions that are 1-unambiguous and describe each element's contents correctly but generalized to a reasonable degree. Our approach has several advantages over known techniques: XStruct scales to very large documents (beyond 1 GB) in both time and memory consumption; it is able to extract a general, complete, correct, minimal, and understandable schema for multiple documents; and it detects datatypes and attributes. Experiments confirm these features and properties. (eng)
dc.language.iso: eng
dc.publisher: Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät II
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Metadata (eng)
dc.subject: XML Schema (eng)
dc.subject.ddc: 004 Computer science
dc.title: XStruct
dc.type: conferenceObject
dc.identifier.urn: urn:nbn:de:kobv:11-10065894
dc.identifier.doi: http://dx.doi.org/10.18452/9214
local.edoc.type-name: Conference publication
local.edoc.container-type: conference
local.edoc.container-type-name: Conference
local.edoc.container-year: 2006
dc.description.version: Peer Reviewed
dc.description.event: Proceedings of the 22nd International Conference on Data Engineering Workshops, ICDE 2006, 3-7 April 2006, 2006, pp 81-81, 22nd International Conference on Data Engineering Workshops (ICDEW 06), Atlanta, Georgia, USA, 03.04.2006 - 07.04.2006
dcterms.bibliographicCitation.doi: 10.1109/ICDEW.2006.166
dcterms.bibliographicCitation.booktitle: 22nd International Conference on Data Engineering Workshops (ICDEW'06)
dcterms.bibliographicCitation.booktitle: Proceedings of the 22nd International Conference on Data Engineering Workshops, ICDE 2006, 3-7 April 2006
dcterms.bibliographicCitation.originalpublishername: IEEE Computer Society
dcterms.bibliographicCitation.originalpublisherplace: Atlanta, Georgia, USA
dcterms.bibliographicCitation.pagestart: 81
dcterms.bibliographicCitation.pageend: 81
bua.department: Mathematisch-Naturwissenschaftliche Fakultät II
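
The abstract in this record describes schema extraction as deducing, for each element, a generalized regular expression over its child elements and detecting datatypes. The sketch below illustrates that general idea only. It is not the XStruct algorithm, whose heuristics are not given in this record. The function names `infer_content_models` and `guess_datatype`, the quantifier rule, the order-merging rule, and the mapping to `xs:integer`, `xs:double`, and `xs:string` are all assumptions made for illustration. Attributes, mixed content, and conflicting child orders are ignored. Because each child name appears at most once per expression, the resulting expressions are trivially 1-unambiguous.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def guess_datatype(values):
    """Hypothetical helper: pick the narrowest type that parses every value."""
    def all_parse(cast):
        try:
            for value in values:
                cast(value)
            return True
        except ValueError:
            return False

    if values and all_parse(int):
        return "xs:integer"
    if values and all_parse(float):
        return "xs:double"
    return "xs:string"


def infer_content_models(xml_documents):
    """Naive per-element content-model inference (NOT the XStruct heuristics)."""
    child_sequences = defaultdict(list)  # tag -> one child-tag list per occurrence
    leaf_texts = defaultdict(list)       # tag -> text content per occurrence

    for text in xml_documents:
        root = ET.fromstring(text)
        for element in root.iter():
            child_sequences[element.tag].append([child.tag for child in element])
            leaf_texts[element.tag].append((element.text or "").strip())

    models = {}
    for tag, sequences in child_sequences.items():
        # Merge the observed child orders: a new child is placed right after
        # the most recently matched child of the current occurrence.
        order = []
        for sequence in sequences:
            position = 0
            for child in sequence:
                if child in order:
                    position = order.index(child) + 1
                else:
                    order.insert(position, child)
                    position += 1

        if not order:
            # No occurrence had child elements: treat as a typed leaf.
            models[tag] = guess_datatype(leaf_texts[tag])
            continue

        parts = []
        for child in order:
            counts = [sequence.count(child) for sequence in sequences]
            optional = min(counts) == 0
            repeated = max(counts) > 1
            if optional and repeated:
                suffix = "*"
            elif repeated:
                suffix = "+"
            elif optional:
                suffix = "?"
            else:
                suffix = ""
            parts.append(child + suffix)
        models[tag] = " ".join(parts)
    return models
```

Calling `infer_content_models` with a list of XML strings returns a dictionary from element name to either a space-separated child sequence with `?`, `*`, or `+` quantifiers, or a guessed datatype for leaf elements. The whole input is parsed into memory here, so this sketch does not reflect the scalability to very large documents that the abstract claims for XStruct.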
