CSV file validator according to the CSV on the Web W3C recommendations
Thesis title in Czech: | Validátor CSV souborů dle W3C doporučení CSV on the Web |
---|---|
Thesis title in English: | CSV file validator according to the CSV on the Web W3C recommendations |
Key words: | CSV|JSON-LD|W3C|web|validátor|OTAVA |
English key words: | CSV|JSON-LD|W3C|web|validator|OTAVA |
Academic year of topic announcement: | 2021/2022 |
Thesis type: | diploma thesis |
Thesis language: | English |
Department: | Department of Software Engineering (32-KSI) |
Supervisor: | RNDr. Jakub Klímek, Ph.D. |
Author: | hidden - assigned and confirmed by the Study Dept. |
Date of registration: | 13.01.2022 |
Date of assignment: | 13.01.2022 |
Confirmed by Study dept. on: | 29.03.2022 |
Date and time of defence: | 14.02.2024 09:00 |
Date of electronic submission: | 10.01.2024 |
Date of submission of printed version: | 10.01.2024 |
Date of proceeded defence: | 14.02.2024 |
Opponents: | RNDr. Martin Svoboda, Ph.D. |
Guidelines |
The CSV on the Web W3C recommendations specify how to describe CSV files published on the Web using JSON-LD descriptors containing important metadata, such as column names, data types, and more.
The goal of this thesis is to implement a validator of CSV files based on the CSV on the Web W3C recommendations [1][2]. Although some implementations already exist [3][6], they are insufficiently maintained and hard to use. The student will:
- Get familiar with CSV on the Web [1][2][3] and JSON-LD [5]
- Study current implementations [3][4][6]
- Design the validator architecture so that it is easily extensible with additional validation rules
- Implement the validator as a Java library, including a reasonable subset of validation rules, ideally a complete set according to the CSVW validation report
- Implement a command-line runner and a web service runner
- Represent the validation results in RDF and CSV
- Evaluate the implemented validator on a given set of tests [4] against other implementations |
References |
[1] Model for Tabular Data and Metadata on the Web, W3C, https://www.w3.org/TR/2015/REC-tabular-data-model-20151217/
[2] Metadata Vocabulary for Tabular Data, W3C, https://www.w3.org/TR/2015/REC-tabular-metadata-20151217/
[3] csvlint.io, Open Data Institute, https://github.com/theodi/csvlint.rb
[4] CSVW Implementation Report, W3C, http://w3c.github.io/csvw/publishing-snapshots/PR-earl/earl.html
[5] JSON for Linking Data, https://json-ld.org/
[6] Validátor CSV souboru dle W3C doporučení CSV on the Web, Vojtěch Malý, FIT ČVUT, https://dspace.cvut.cz/bitstream/handle/10467/82671/F8-DP-2019-Maly-Vojtech-thesis.pdf |