Thesis topics (Thesis selection) (version: 393)
Thesis detail
   
Tackling Hallucinations in Chart Summarization
Thesis title in Czech: Odstraňování halucinací při sumarizaci grafů
Thesis title in English: Tackling Hallucinations in Chart Summarization
Keywords (Czech): generování popisu grafu|generování přirozeného jazyka|generování textu z dat|neuronové generativní modely|zpracování přirozeného jazyka|hluboké učení
Keywords (English): chart-to-text generation|natural language generation|data-to-text generation|neural generative models|natural language processing|deep learning
Academic year of topic announcement: 2021/2022
Thesis type: Master's thesis
Thesis language: English
Department: Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor: Mgr. et Mgr. Ondřej Dušek, Ph.D.
Author: hidden - assigned and confirmed by the Study Department
Date of registration: 16.05.2022
Date of assignment: 17.05.2022
Confirmed by Study Department on: 20.05.2022
Date and time of defence: 31.01.2023 09:00
Date of electronic submission: 25.12.2022
Date of submission of printed version: 09.01.2023
Date of defence: 31.01.2023
Opponents: Mgr. Rudolf Rosa, Ph.D.
 
 
 
Guidelines
Information visualizations such as bar charts, line charts, and pie charts are a popular way of communicating quantitative data: they help readers gain important insights and make well-informed decisions. However, understanding these charts can be challenging for some people, especially those who are visually impaired. Automatic chart summarization is the task of explaining and summarizing the key takeaways of a chart. It falls under the umbrella of data-to-text natural language generation: the underlying data is extracted from the chart in the form of a table, and that table is converted into a textual summary. While current neural generative models are able to produce useful chart summaries, they still suffer from a number of problems. This thesis will use existing chart-summarization datasets and aims to tackle a selected problem (or problems) of current neural chart-to-text approaches, such as the alignment between data-summary pairs in existing datasets, hallucinations in the generated summaries, choosing the best way to represent the input to the model, or making the models generalize better across datasets.
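The pipeline sketched above — extracting the chart's underlying table and converting it to text — typically requires linearizing the table into a flat token sequence before it can be fed to a sequence-to-sequence model. The following Python snippet is a minimal sketch of one such linearization scheme; the function name, separator tokens, and toy data are illustrative assumptions, not part of the assignment or any cited system:

```python
# Minimal sketch of table linearization for chart-to-text generation.
# Assumes the chart data has already been extracted as (label, value)
# rows; the output string would be fed to a seq2seq model (e.g. a
# pretrained text-to-text transformer). Separators are illustrative.

def linearize_chart_table(title, x_axis, y_axis, rows):
    """Flatten chart metadata and data rows into one input string."""
    parts = [f"title: {title}", f"x: {x_axis}", f"y: {y_axis}"]
    for label, value in rows:
        # Each data point keeps its axis names so the model can
        # ground generated numbers in the input (one way to reduce
        # hallucinations).
        parts.append(f"{x_axis}: {label} | {y_axis}: {value}")
    return " ; ".join(parts)

example = linearize_chart_table(
    title="Widget sales by year",      # toy data for illustration
    x_axis="year",
    y_axis="units sold",
    rows=[("2020", 120), ("2021", 150)],
)
print(example)
```

The choice of input representation (which separators to use, whether to repeat axis names per cell, whether to include chart type) is exactly one of the open questions the thesis may address.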
References
J. Obeid and E. Hoque, “Chart-to-Text: Generating Natural Language Descriptions for Charts by Adapting the Transformer Model,” in Proceedings of the 13th International Conference on Natural Language Generation, Dublin, Ireland, Dec. 2020, pp. 138–147. [Online]. Available: https://www.aclweb.org/anthology/2020.inlg-1.20
J. Zhu, J. Ran, R. K.-W. Lee, Z. Li, and K. Choo, “AutoChart: A Dataset for Chart-to-Text Generation Task,” in Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), Held Online, Sep. 2021, pp. 1636–1644. [Online]. Available: https://aclanthology.org/2021.ranlp-1.183
L. Li, C. Ma, Y. Yue, and D. Hu, “Improving Encoder by Auxiliary Supervision Tasks for Table-to-Text Generation,” in Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, Aug. 2021, pp. 5979–5989. doi: 10.18653/v1/2021.acl-long.466.
W. Chen, Y. Su, X. Yan, and W. Y. Wang, “KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation,” in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, Nov. 2020, pp. 8635–8648. doi: 10.18653/v1/2020.emnlp-main.697.
R. Puduppully, L. Dong, and M. Lapata, “Data-to-Text Generation with Content Selection and Planning,” in Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, Jan. 2019. [Online]. Available: http://arxiv.org/abs/1809.00582
A. P. Parikh et al., “ToTTo: A Controlled Table-To-Text Generation Dataset,” in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, Nov. 2020. [Online]. Available: http://arxiv.org/abs/2004.14373
M. Kale and A. Rastogi, “Text-to-Text Pre-Training for Data-to-Text Tasks,” in Proceedings of the 13th International Conference on Natural Language Generation, Dublin, Ireland, Dec. 2020, pp. 97–102. [Online]. Available: https://www.aclweb.org/anthology/2020.inlg-1.14
X. Li, S. Stevens-Guille, A. Maskharashvili, and M. White, “Self-Training for Compositional Neural NLG in Task-Oriented Dialogue,” in Proceedings of the 14th International Conference on Natural Language Generation, Aberdeen, Scotland, UK, Aug. 2021, pp. 87–102. [Online]. Available: https://aclanthology.org/2021.inlg-1.10

E. Erdem et al., “Neural Natural Language Generation: A Survey on Multilinguality, Multimodality, Controllability and Learning,” Journal of Artificial Intelligence Research (JAIR), vol. 73, pp. 1131–1207, Apr. 2022, doi: 10.1613/jair.1.12918.
A. Gatt and E. Krahmer, “Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation,” Journal of Artificial Intelligence Research (JAIR), vol. 61, pp. 65–170, Jan. 2018, doi: 10.1613/jair.5477.
 
Univerzita Karlova | Informační systém UK