Web Semantization - NSWI108

Title:	Sémantizace webu
Guaranteed by:	Department of Software Engineering (32-KSI)
Faculty:	Faculty of Mathematics and Physics
Actual:	from 2020
Semester:	winter
E-Credits:	5
Hours per week, examination:	winter s.:2/2, C+Ex [HT]
Capacity:	unlimited
Min. number of students:	unlimited
4EU+:	no
Virtual mobility / capacity:	no
State of the course:	cancelled
Language:	Czech, English
Teaching methods:	full-time
Additional information:	http://www.ksi.mff.cuni.cz/~vojtas/vyuka/vyuka.html
Note:	enabled for web enrollment

Guarantor:	prof. RNDr. Peter Vojtáš, DrSc.
Class:	Informatika Mgr. - Softwarové systémy
Classification:	Informatics > Informatics, Software Applications, Computer Graphics and Geometry, Database Systems, Didactics of Informatics, Discrete Mathematics, External Subjects, General Subjects, Computer and Formal Linguistics, Optimalization, Programming, Software Engineering, Theoretical Computer Science

Annotation -

Last update: RNDr. Michal Kopecký, Ph.D. (09.05.2019)

Basic semantic web models are covered in NSWI145; dynamic aspects of web data extraction (semantization) are covered in NDBI021 and NSWI167 (see also NSWI144 and NSWI142). Semantization is appropriate dynamic enrichment of web content, needed for automated processing of that content. We treat the problem from a software engineering perspective: models, methodology, and the process of web semantization. The course covers the basic formal knowledge needed for orientation in the field and teaches some practical skills. Labs consist of reporting on current achievements and learning rules for semantization.

Course completion requirements -

Last update: prof. RNDr. Peter Vojtáš, DrSc. (12.10.2017)

The requirements for passing the course are reporting on current achievements, induction on semantized data, and a project of a virtual Lean Startup with customer imitation via a social network. These are requirements for obtaining the credit only. The exam is oral and requires a basic understanding of all the material.

Once the terminology has been introduced, detailed milestones (including the form of deliverables) and preferred deadlines will be announced (with possible repeated attempts). Attendance is not recorded. However, no explanation of the tasks will be given beyond the respective lab session and a brief description on the course website. The final deadline is the end of the semester (without repeated attempts).

Literature -

Last update: RNDr. Michal Kopecký, Ph.D. (10.05.2017)

P. Hitzler, M. Krötzsch, S. Rudolph. Foundations of Semantic Web Technologies. Chapman & Hall/CRC 2010, http://www.semantic-web-book.org/page/Slides

E. Ries. The Lean Startup. Crown Business 2011

D. Harel, D. Kozen, J. Tiuryn. Dynamic Logic. The MIT Press 2000

G. James, D. Witten, T. Hastie, R. Tibshirani. An Introduction to Statistical Learning with Applications in R. Springer 2013

C. D. Manning, P. Raghavan, H. Schütze. An Introduction to Information Retrieval. Cambridge University Press 2009

Syllabus -

Last update: RNDr. Michal Kopecký, Ph.D. (10.05.2017)

Web semantization
Basic problems and vision of automation of web content processing, extraction, annotation

Lean start-up methodology and semantization

RDF-framework, description logic, OWL
Data model RDF and RDFS as a model of metadata, formal semantics, satisfiability

Basics of description logic (DeL), knowledge and ontology representation

Web querying languages
Language SPARQL, SPARQL algebra

Dynamic logic
Propositional dynamic logic (DyL)

Decidability of DyL

A dynamic model of web semantization
Integration of W3C models and Dynamic logic

Reliability of automated web information extraction and annotation

A Kripke-style model: states are described by query-based predicate logic; programs (extractors) remain propositional and are supplemented with information on how the extractors were trained (metrics, data)

A hypothesis: extraction success is similar on similar resources (e.g. those created from the same templates)
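
The Kripke-style model in the last syllabus topic can be illustrated with a minimal sketch. It is not taken from the course materials: it assumes that a state is a set of RDF-like triples, that an extractor is an atomic program given as a transition relation between states, and that a state property is a query-like predicate over the triples. The names `price_extractor` and `has_price` and the sample data are hypothetical.

```python
from typing import Callable, FrozenSet, Set, Tuple

Triple = Tuple[str, str, str]            # (subject, predicate, object)
State = FrozenSet[Triple]                # a state = the triples known so far
Program = Set[Tuple[State, State]]       # an atomic program = transition relation
Prop = Callable[[State], bool]           # a query-like predicate on a state


def diamond(program: Program, prop: Prop, state: State) -> bool:
    """<a>phi: at least one a-successor of `state` satisfies phi."""
    return any(prop(t) for (s, t) in program if s == state)


def box(program: Program, prop: Prop, state: State) -> bool:
    """[a]phi: every a-successor of `state` satisfies phi (vacuously true if none)."""
    return all(prop(t) for (s, t) in program if s == state)


def compose(first: Program, second: Program) -> Program:
    """Sequential composition a;b of two programs."""
    return {(s, u) for (s, t) in first for (t2, u) in second if t == t2}


# Hypothetical example: one page before and after a price extractor has run.
raw: State = frozenset({("page1", "hasText", "Price: 10 EUR")})
annotated: State = raw | {("page1", "hasPrice", "10")}
price_extractor: Program = {(raw, annotated)}


def has_price(state: State) -> bool:
    """Query-like check: does any triple use the predicate hasPrice?"""
    return any(p == "hasPrice" for (_, p, _) in state)


if __name__ == "__main__":
    print(has_price(raw))                            # property before extraction
    print(diamond(price_extractor, has_price, raw))  # <extract> hasPrice at raw
```

In this sketch the extractor itself stays propositional (it is only a named relation between states), while the properties checked in a state are queries over its triples, which is one possible reading of the model outlined above.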