Thesis (Selection of subject) (version: 368)
Thesis details
Tokenization-aware Diff and Patch
Thesis title in Czech: Podpora tokenizace pro Diff a Patch
Thesis title in English: Tokenization-aware Diff and Patch
Key words: editační vzdálenost|slučování patchů|textové algoritmy|kontrola verzí
English key words: editing distance|three-way merge|text algorithms|version control
Academic year of topic announcement: 2018/2019
Thesis type: Bachelor's thesis
Thesis language: Czech
Department: Department of Software Engineering (32-KSI)
Supervisor: RNDr. Miroslav Kratochvíl, Ph.D.
Author: hidden - assigned and confirmed by the Study Dept.
Date of registration: 08.04.2019
Date of assignment: 12.04.2019
Confirmed by Study dept. on: 25.09.2020
Date and time of defence: 11.02.2021 09:00
Date of electronic submission: 06.01.2021
Date of submission of printed version: 06.01.2021
Date of proceeded defence: 11.02.2021
Opponents: Mgr. Vojtěch Horký, Ph.D.
 
 
 
Guidelines
Diff is a utility that produces a precise description of line-wise differences between text files; it is widely used in software development, especially in version control systems. While currently available diff tools work well enough on line-oriented source code, applying them to files formatted according to other conventions is problematic. One possible improvement is word-diff, which tokenizes the input into words to produce finer-grained differences that are better suited to processing, for example, markup-formatted text documents and tables. This thesis aims to generalize this improvement to any user-specifiable tokenization of the input, to produce the currently missing tools that apply token-diff files as patches, and to enable three-way merging of token-diffs. The results will be demonstrated by improving the merging capabilities of the Git version control system.
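To make the idea of a user-specifiable tokenization concrete, the following is a minimal Python sketch, not the implementation produced by the thesis. It assumes that a tokenization is given as a regular expression whose matches cover the whole input, and it uses the standard library's difflib.SequenceMatcher as a stand-in for a dedicated diff algorithm. The function names (tokenize, token_diff, apply_token_patch) and the edit format (start index, removed tokens, added tokens) are hypothetical and chosen only for illustration; three-way merging is not covered.

```python
import re
from difflib import SequenceMatcher

# Assumed default tokenization: runs of word characters, runs of whitespace,
# and single punctuation characters. Every character falls into exactly one
# of these classes, so joining the tokens reproduces the input.
DEFAULT_PATTERN = r"\w+|\s+|[^\w\s]"


def tokenize(text, pattern=DEFAULT_PATTERN):
    """Split text into tokens using a user-supplied regular expression."""
    return re.findall(pattern, text)


def token_diff(old, new, pattern=DEFAULT_PATTERN):
    """Return a list of (start, removed_tokens, added_tokens) edits.

    The start index refers to a position in the token list of `old`.
    """
    a, b = tokenize(old, pattern), tokenize(new, pattern)
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            edits.append((i1, a[i1:i2], b[j1:j2]))
    return edits


def apply_token_patch(old, edits, pattern=DEFAULT_PATTERN):
    """Apply edits produced by token_diff to `old` and return the new text."""
    tokens = tokenize(old, pattern)
    # Apply from the last edit to the first so earlier indices stay valid.
    for start, removed, added in reversed(edits):
        if tokens[start:start + len(removed)] != removed:
            raise ValueError("patch does not apply at token %d" % start)
        tokens[start:start + len(removed)] = added
    return "".join(tokens)


if __name__ == "__main__":
    old_row = "| apple | 3 | red |"
    new_row = "| apple | 4 | red |"
    patch = token_diff(old_row, new_row)
    print(patch)  # only the changed cell value appears in the patch
    print(apply_token_patch(old_row, patch))
```

A line-wise diff would report the whole table row as changed, whereas the token-level patch here contains only the replaced cell value. Passing a different pattern changes the granularity; for example, r"[^\n]*\n|[^\n]+" treats each line as one token and so approximates the classic line-oriented behaviour.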
References
Hunt, J. W., & McIlroy, M. D. (1976). An algorithm for differential file comparison (p. 9). Murray Hill: Bell Laboratories.

Wagner, R. A., & Fischer, M. J. (1974). The string-to-string correction problem. Journal of the ACM (JACM), 21(1), 168-173.

Bednárek, D., Brabec, M., & Kruliš, M. (2017). Improving matrix-based dynamic programming on massively parallel accelerators. Information Systems, 64, 175-193.

Loeliger, J., & McCullough, M. (2012). Version Control with Git: Powerful tools and techniques for collaborative software development. O'Reilly Media, Inc.

Khanna, S., Kunal, K., & Pierce, B. C. (2007, December). A formal investigation of diff3. In International Conference on Foundations of Software Technology and Theoretical Computer Science (pp. 485-496). Springer, Berlin, Heidelberg.
 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html