Thesis (Selection of subject) (version: 368)
Thesis details
Schematron Schema Inference
Thesis title in Czech: Schematron Schema Inference
Thesis title in English: Schematron Schema Inference
Key words: XML, XML schéma, Odvozování XML, Schematron
English key words: XML, XML schema, XML inferring, Schematron
Academic year of topic announcement: 2010/2011
Thesis type: diploma thesis
Thesis language: English
Department: Department of Software Engineering (32-KSI)
Supervisor: doc. RNDr. Irena Holubová, Ph.D.
Author: hidden - assigned and confirmed by the Study Dept.
Date of registration: 08.10.2009
Date of assignment: 08.10.2010
Confirmed by Study dept. on: 01.07.2011
Date and time of defence: 30.01.2012 09:30
Date of electronic submission: 05.12.2011
Date of submission of printed version: 08.12.2011
Date of completed defence: 30.01.2012
Opponents: RNDr. Martin Svoboda, Ph.D.
 
 
 
Guidelines
Many papers currently address the inference of XML schemas for XML documents, with the result expressed in either XML Schema or DTD. However, other languages make it possible to specify the structure of XML data in quite a different way. One such language is Schematron, an ISO standard based on specifying conditions that the XML data should satisfy rather than on a grammar. So far, there seems to be no approach to the inference of Schematron schemas.
The aim of this work is to research various aspects of the problem of automatically inferring an XML schema for a given set of XML documents. First, it is necessary to analyze existing solutions and discuss their advantages and disadvantages. The core of the work is the proposal and implementation of the author's own method of automatic schema inference that deals with Schematron constructs. The work will include suitable experimental results.
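To make the contrast between a grammar and a set of conditions concrete, the following is a minimal sketch of what inferring Schematron assertions from sample documents could look like. It is not the method the thesis is expected to propose. It rests on a deliberately naive assumption: a child element that appears in every occurrence of a parent element is treated as required. The function names infer_required_children and to_schematron are hypothetical, and the sketch assumes documents without XML namespaces.

```python
import xml.etree.ElementTree as ET

# Namespace of ISO Schematron schemas.
SCH_NS = "http://purl.oclc.org/dsdl/schematron"


def infer_required_children(documents):
    """Map each element name to the child names present in every occurrence.

    Hypothetical helper: `documents` is an iterable of XML strings
    (assumed to use no namespaces).
    """
    required = {}
    for text in documents:
        root = ET.fromstring(text)
        for element in root.iter():
            children = {child.tag for child in element}
            if element.tag in required:
                # Keep only the children seen in all occurrences so far.
                required[element.tag] &= children
            else:
                required[element.tag] = children
    return required


def to_schematron(required):
    """Render the inferred conditions as a Schematron schema string."""
    lines = [f'<schema xmlns="{SCH_NS}">', "  <pattern>"]
    for parent in sorted(required):
        if not required[parent]:
            continue  # nothing to assert for this element
        lines.append(f'    <rule context="{parent}">')
        for child in sorted(required[parent]):
            lines.append(
                f'      <assert test="{child}">'
                f"{parent} must contain {child}</assert>"
            )
        lines.append("    </rule>")
    lines += ["  </pattern>", "</schema>"]
    return "\n".join(lines)
```

Given two sample documents in which every book has a title but only one has an author, the sketch would emit a single rule with context "book" asserting the presence of "title". Each assertion is a condition checked in a context, not a production of a grammar, which is the difference from DTD or XML Schema that the guidelines point to.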
References
Mlynkova, I. - Necasky, M. - Pokorny, J. - Richta, K. - Toman, K. - Toman, V.: Technologie XML - Principy a aplikace v praxi. Grada Publishing, Prague, Czech Republic, September 2008. ISBN 978-80-247-2725-7.

W3C. W3C Technical Reports and Publications. http://www.w3.org/TR/

Schematron: http://www.schematron.com/

Mlynkova, I.: An Analysis of Approaches to XML Schema Inference. SITIS '08, Bali, Indonesia, November/December 2008. IEEE Computer Society Press, 2008.

Vošta, O.: Automatická konstrukce schématu pro množinu XML dokumentů. Master's thesis, MFF UK, 2005. http://kocour.ms.mff.cuni.cz/~mlynkova/dp/Vosta.pdf

Moh, C.-H. - Lim, E.-P. - Ng, W.-K.: Re-engineering Structures from Web Documents. In DL '00, pages 67-76, New York, NY, USA, 2000. ACM Press.

Garofalakis, M. - Gionis, A. - Rastogi, R. - Seshadri, S. - Shim, K.: XTRACT: a System for Extracting Document Type Descriptors from XML Documents. In SIGMOD '00, pages 165-176, New York, NY, USA, 2000. ACM Press.

Ahonen, H.: Generating Grammars for Structured Documents Using Grammatical Inference Methods. Report A-1996-4, Department of Computer Science, University of Helsinki, 1996.
 