Thesis (Selection of subject) (version: 368)
Thesis details
CSV file validator according to the CSV on the Web W3C recommendations
Thesis title in Czech: Validátor CSV souborů dle W3C doporučení CSV on the Web
Thesis title in English: CSV file validator according to the CSV on the Web W3C recommendations
Key words: CSV|JSON-LD|W3C|web|validátor|OTAVA
English key words: CSV|JSON-LD|W3C|web|validator|OTAVA
Academic year of topic announcement: 2021/2022
Thesis type: diploma thesis
Thesis language: English
Department: Department of Software Engineering (32-KSI)
Supervisor: RNDr. Jakub Klímek, Ph.D.
Author: hidden - assigned and confirmed by the Study Dept.
Date of registration: 13.01.2022
Date of assignment: 13.01.2022
Confirmed by Study dept. on: 29.03.2022
Date and time of defence: 14.02.2024 09:00
Date of electronic submission: 10.01.2024
Date of submission of printed version: 10.01.2024
Date of proceeded defence: 14.02.2024
Opponents: RNDr. Martin Svoboda, Ph.D.
 
 
 
Guidelines
The CSV on the Web W3C recommendations specify how to describe CSV files published on the Web using JSON-LD descriptors that contain important metadata, such as column names and data types.
The goal of this thesis is to implement a validator of CSV files based on the CSV on the Web W3C recommendations [1][2].
Although some implementations already exist [3][6], they are insufficiently maintained and hard to use.
The student will:
- Get familiar with CSV on the Web [1][2][3] and JSON-LD [5]
- Study current implementations [3][4][6]
- Design the validator architecture so that it is easily extensible with additional validation rules
- Implement the validator as a Java library, including a reasonable subset of validation rules, ideally the complete set according to the CSVW validation report
- Implement command-line and web service runners
- Represent the validation results in RDF and CSV
- Evaluate the implemented validator on a given set of tests [4] against other implementations
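
The extensibility and reporting goals listed above can be illustrated with a minimal Java sketch. It is not the thesis implementation: every name in it (Finding, Table, ValidationRule, RequiredColumnsRule, IntegerDatatypeRule, Validator, ValidatorDemo) is hypothetical, the JSON-LD descriptor is assumed to be already reduced to a plain map from column name to datatype, and only the CSV form of the results is shown (the RDF form is omitted). Records require Java 16 or newer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// One finding produced by a rule; row is 1-based over data rows, 0 means "whole table".
record Finding(String rule, int row, String column, String message) {}

// A parsed table: header names plus data rows (hypothetical minimal model).
record Table(List<String> header, List<List<String>> rows) {}

// Extension point: each validation rule is one implementation of this interface.
interface ValidationRule {
    List<Finding> check(Table table, Map<String, String> datatypes);
}

// Example rule: every column named in the descriptor must appear in the header.
final class RequiredColumnsRule implements ValidationRule {
    public List<Finding> check(Table table, Map<String, String> datatypes) {
        List<Finding> out = new ArrayList<>();
        for (String name : datatypes.keySet()) {
            if (!table.header().contains(name)) {
                out.add(new Finding("required-columns", 0, name, "column missing from header"));
            }
        }
        return out;
    }
}

// Example rule: cells in columns declared as "integer" must look like integers.
// Simplification of this sketch: empty cells are skipped rather than reported.
final class IntegerDatatypeRule implements ValidationRule {
    public List<Finding> check(Table table, Map<String, String> datatypes) {
        List<Finding> out = new ArrayList<>();
        for (int c = 0; c < table.header().size(); c++) {
            String name = table.header().get(c);
            if (!"integer".equals(datatypes.get(name))) continue;
            for (int r = 0; r < table.rows().size(); r++) {
                List<String> row = table.rows().get(r);
                String cell = c < row.size() ? row.get(c) : "";
                if (cell.isEmpty()) continue;
                if (!cell.matches("[+-]?\\d+")) {
                    out.add(new Finding("integer-datatype", r + 1, name, "not an integer: " + cell));
                }
            }
        }
        return out;
    }
}

// Runs all registered rules; new rules are added without changing this class.
final class Validator {
    private final List<ValidationRule> rules = new ArrayList<>();

    public Validator register(ValidationRule rule) {
        rules.add(rule);
        return this;
    }

    public List<Finding> validate(Table table, Map<String, String> datatypes) {
        List<Finding> all = new ArrayList<>();
        for (ValidationRule rule : rules) all.addAll(rule.check(table, datatypes));
        return all;
    }

    // Serialises findings as CSV, quoting text fields and doubling embedded quotes.
    public static String toCsv(List<Finding> findings) {
        StringBuilder sb = new StringBuilder("rule,row,column,message\n");
        for (Finding f : findings) {
            sb.append(quote(f.rule())).append(',').append(f.row()).append(',')
              .append(quote(f.column())).append(',').append(quote(f.message())).append('\n');
        }
        return sb.toString();
    }

    private static String quote(String s) {
        return "\"" + s.replace("\"", "\"\"") + "\"";
    }
}

// Small command-line style demonstration of the sketch.
final class ValidatorDemo {
    public static void main(String[] args) {
        Table table = new Table(List.of("id", "age"),
                List.of(List.of("1", "34"), List.of("2", "n/a")));
        Map<String, String> datatypes = Map.of("id", "integer", "age", "integer");
        Validator validator = new Validator()
                .register(new RequiredColumnsRule())
                .register(new IntegerDatatypeRule());
        System.out.print(Validator.toCsv(validator.validate(table, datatypes)));
    }
}
```

In this sketch, adding a validation rule means writing one more ValidationRule implementation and registering it, and the same list of findings could feed both a command-line runner and a web service runner. A real validator would read the column definitions from the JSON-LD descriptor and follow the parsing and datatype rules in [1][2] instead of the simplified checks used here.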
References
[1] Model for Tabular Data and Metadata on the Web, W3C, https://www.w3.org/TR/2015/REC-tabular-data-model-20151217/
[2] Metadata Vocabulary for Tabular Data, W3C, https://www.w3.org/TR/2015/REC-tabular-metadata-20151217/
[3] csvlint.io, Open Data Institute, https://github.com/theodi/csvlint.rb
[4] CSVW Implementation Report, W3C, http://w3c.github.io/csvw/publishing-snapshots/PR-earl/earl.html
[5] JSON for Linking Data, https://json-ld.org/
[6] Validátor CSV souboru dle W3C doporučení CSV on the Web, Vojtěch Malý, FIT ČVUT, https://dspace.cvut.cz/bitstream/handle/10467/82671/F8-DP-2019-Maly-Vojtech-thesis.pdf
 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html