Course, academic year 2018/2019
Web Semantization - NSWI108
Title in Czech: Sémantizace webu
Guaranteed by: Department of Software Engineering (32-KSI)
Faculty: Faculty of Mathematics and Physics
Valid: from 2016 to 2018
Semester: winter
E-Credits: 5
Hours per week, examination: winter s.:2/2 C+Ex [hours/week]
Capacity: unlimited
Min. number of students: unlimited
State of the course: taught
Language: Czech, English
Teaching methods: full-time
Additional information:
Note: enabled for web enrollment
Guarantor: prof. RNDr. Peter Vojtáš, DrSc.
Class: Informatika Mgr. - Softwarové systémy
Classification: Informatics > Informatics, Software Applications, Computer Graphics and Geometry, Database Systems, Didactics of Informatics, Discrete Mathematics, External Subjects, General Subjects, Computer and Formal Linguistics, Optimization, Programming, Software Engineering, Theoretical Computer Science
Annotation -
Last update: RNDr. Michal Kopecký, Ph.D. (10.05.2017)
Making full use of web content requires automated processing of that content. Semantization is an appropriate dynamic enrichment of the content. We treat the problem from a software engineering perspective: models, methodology and the process of web semantization. We cover the basic formal knowledge needed for orientation in the field and train some practical skills. Labs consist of reports on current achievements, learning rules for semantization, a virtual Lean Startup project, and customer imitation via a social network.
Course completion requirements -
Last update: prof. RNDr. Peter Vojtáš, DrSc. (12.10.2017)

Passing the course requires reporting on current achievements, induction on semantized data, a virtual Lean Startup project, and customer imitation via a social network. These are the conditions for obtaining the credit only. The exam is oral and requires a basic understanding of the whole material.

As soon as the terminology has been introduced, detailed milestones (including the form of deliverables) and preferred deadlines will be announced (with repeated attempts possible). Attendance is not recorded. Nevertheless, no additional explanation of the tasks will be given beyond the respective lab and the brief description on the course web page. The final deadline is the end of the semester (without repeated attempts).

Literature -
Last update: RNDr. Michal Kopecký, Ph.D. (10.05.2017)
  • P. Hitzler, M. Krötzsch, S. Rudolph. Foundations of Semantic Web Technologies. Chapman & Hall/CRC 2010
  • E. Ries. The Lean Startup. Crown Business 2011
  • D. Harel, D. Kozen, J. Tiuryn. Dynamic Logic. The MIT Press 2000
  • G. James, D. Witten, T. Hastie, R. Tibshirani. An Introduction to Statistical Learning with Applications in R. Springer 2013
  • C. D. Manning, P. Raghavan, H. Schütze. An Introduction to Information Retrieval. Cambridge University Press 2009

Syllabus -
Last update: RNDr. Michal Kopecký, Ph.D. (10.05.2017)
Web semantization
Basic problems and vision of automation of web content processing, extraction, annotation

Lean start-up methodology and semantization

RDF-framework, description logic, OWL
Data model RDF and RDFS as a model of metadata, formal semantics, satisfiability

Basics of description logic (DeL), knowledge and ontology representation

Web querying languages
Language SPARQL, SPARQL algebra
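
As a rough illustration of the RDF data model and of the basic graph pattern matching that underlies SPARQL, here is a minimal sketch in plain Python. The `ex:` prefix, the property names, the course code `ex:NSWI999` and the helper functions are hypothetical and serve only this example; a real system would use an RDF store and a SPARQL engine.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# A tiny RDF-like graph as (subject, predicate, object) triples.
# All identifiers below are made up for illustration.
GRAPH: List[Triple] = [
    ("ex:NSWI108", "rdf:type", "ex:Course"),
    ("ex:NSWI108", "ex:taughtIn", "ex:WinterSemester"),
    ("ex:NSWI108", "ex:credits", "5"),
    ("ex:NSWI999", "rdf:type", "ex:Course"),
    ("ex:NSWI999", "ex:taughtIn", "ex:SummerSemester"),
]


def is_var(term: str) -> bool:
    """Variables are written SPARQL-style with a leading '?'."""
    return term.startswith("?")


def match_triple(pattern: Triple, triple: Triple,
                 binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if extended.get(p, t) != t:
                return None  # variable already bound to a different term
            extended[p] = t
        elif p != t:
            return None  # constant term does not match
    return extended


def match_bgp(patterns: List[Triple], graph: List[Triple]) -> List[Binding]:
    """Evaluate a basic graph pattern as a join of its triple patterns."""
    solutions: List[Binding] = [{}]
    for pattern in patterns:
        next_solutions: List[Binding] = []
        for binding in solutions:
            for triple in graph:
                extended = match_triple(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Roughly: SELECT ?c WHERE { ?c rdf:type ex:Course .
#                            ?c ex:taughtIn ex:WinterSemester }
QUERY: List[Triple] = [
    ("?c", "rdf:type", "ex:Course"),
    ("?c", "ex:taughtIn", "ex:WinterSemester"),
]

if __name__ == "__main__":
    print(match_bgp(QUERY, GRAPH))
```

The join over triple patterns corresponds to the Join operator of the SPARQL algebra; other operators such as OPTIONAL, UNION and FILTER are not covered by this sketch.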

Dynamic logic
Propositional dynamic logic (DyL)

Decidability of DyL
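
For orientation, the standard Kripke semantics of propositional dynamic logic can be summarized as follows. This is the textbook formulation, not necessarily the notation used in the lectures; the symbols $\mathcal{M}$, $R_\alpha$, $s$, $t$, $u$ are chosen here for illustration.

```latex
% A model \mathcal{M} assigns to every program \alpha a binary relation R_\alpha on states.
\begin{align*}
\mathcal{M}, s \models [\alpha]\varphi
  &\iff \text{for all } t:\ (s,t) \in R_\alpha \text{ implies } \mathcal{M}, t \models \varphi \\
\mathcal{M}, s \models \langle\alpha\rangle\varphi
  &\iff \text{there is } t:\ (s,t) \in R_\alpha \text{ and } \mathcal{M}, t \models \varphi \\
R_{\alpha;\beta}
  &= \{(s,u) : \exists t.\ (s,t) \in R_\alpha \text{ and } (t,u) \in R_\beta\} \\
R_{\alpha\cup\beta} &= R_\alpha \cup R_\beta \\
R_{\alpha^{*}} &= \text{reflexive transitive closure of } R_\alpha \\
R_{\varphi?} &= \{(s,s) : \mathcal{M}, s \models \varphi\}
\end{align*}
```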

A dynamic model of web semantization
Integration of W3C models and Dynamic logic

Reliability of automated web information extraction and annotation

A Kripke-style model: states are described in query-based predicate logic, programs (extractors) remain propositional, plus information on the training of extractors (metrics, data)

Hypothesis: extraction success is similar on similar resources (e.g. those created from the same templates)
