Tackling Hallucinations in Chart Summarization
| Thesis title in Czech: | Odstraňování halucinací při sumarizaci grafů |
|---|---|
| Title in English: | Tackling Hallucinations in Chart Summarization |
| Keywords (Czech): | generování popisu grafu|generování přirozeného jazyka|generování textu z dat|neuronové generativní modely|zpracování přirozeného jazyka|hluboké učení |
| Keywords (English): | chart-to-text generation|natural language generation|data-to-text generation|neural generative models|natural language processing|deep learning |
| Academic year of announcement: | 2021/2022 |
| Thesis type: | master's thesis |
| Thesis language: | English |
| Department: | Institute of Formal and Applied Linguistics (32-UFAL) |
| Supervisor: | Mgr. et Mgr. Ondřej Dušek, Ph.D. |
| Author: | hidden - assigned and confirmed by the Student Office |
| Date of registration: | 16.05.2022 |
| Date of assignment: | 17.05.2022 |
| Confirmed by Student Office: | 20.05.2022 |
| Date and time of defence: | 31.01.2023 09:00 |
| Date of electronic submission: | 25.12.2022 |
| Date of print submission: | 09.01.2023 |
| Date of defence: | 31.01.2023 |
| Opponent: | Mgr. Rudolf Rosa, Ph.D. |
| Guidelines |
| Information visualizations such as bar charts, line charts, and pie charts are a popular way of communicating quantitative data. They are used to gain important insights and make well-informed decisions. However, understanding these charts can be challenging for some people, especially those who are visually impaired. Automatic chart summarization is the task of explaining and summarizing the key takeaways from a chart. This task falls under the umbrella of data-to-text natural language generation: data is extracted from the chart in the form of a table, and that table is converted into a textual summary. While current neural generative models are able to produce useful chart summaries, they still suffer from a number of problems. This thesis will use existing chart-summarization datasets and aims to tackle a selected problem (or problems) of current neural chart-to-text approaches, such as alignment between data-summary pairs in existing datasets, hallucinations in the generated summaries, deciding on the best way to represent the input to the model, or making the models generalize better across datasets. |
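The table-to-summary pipeline described above typically starts by linearizing the chart's underlying data table into a flat text sequence that a neural generator can consume. The sketch below illustrates one such linearization; the separator token, field names, and example data are illustrative assumptions, not part of this assignment.

```python
# Minimal sketch: serialize a chart's data table into one input string
# for a sequence-to-sequence model. The "<sep>"-delimited key: value
# format is an assumed convention for illustration only.

def linearize_chart(title, x_label, y_label, rows):
    """Turn a chart's metadata and (x, y) records into a single string."""
    parts = [f"title: {title}", f"x: {x_label}", f"y: {y_label}"]
    for x, y in rows:
        parts.append(f"record: {x} | {y}")
    return " <sep> ".join(parts)

example = linearize_chart(
    "Smartphone sales 2020",
    "Quarter",
    "Units sold (millions)",
    [("Q1", 300), ("Q2", 280), ("Q3", 350), ("Q4", 410)],
)
print(example)
```

The choice of serialization matters in practice: how titles, axis labels, and records are ordered and delimited is exactly the "input representation" question the assignment names as a possible thesis focus.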
| Recommended literature |
J. Obeid and E. Hoque, "Chart-to-Text: Generating Natural Language Descriptions for Charts by Adapting the Transformer Model," in Proceedings of the 13th International Conference on Natural Language Generation, Dublin, Ireland, Dec. 2020, pp. 138–147. Available: https://www.aclweb.org/anthology/2020.inlg-1.20
J. Zhu, J. Ran, R. K.-W. Lee, Z. Li, and K. Choo, "AutoChart: A Dataset for Chart-to-Text Generation Task," in Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), Held Online, Sep. 2021, pp. 1636–1644. Available: https://aclanthology.org/2021.ranlp-1.183
L. Li, C. Ma, Y. Yue, and D. Hu, "Improving Encoder by Auxiliary Supervision Tasks for Table-to-Text Generation," in Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, Aug. 2021, pp. 5979–5989. doi: 10.18653/v1/2021.acl-long.466
W. Chen, Y. Su, X. Yan, and W. Y. Wang, "KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation," in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, Nov. 2020, pp. 8635–8648. doi: 10.18653/v1/2020.emnlp-main.697
R. Puduppully, L. Dong, and M. Lapata, "Data-to-Text Generation with Content Selection and Planning," Honolulu, HI, USA, Jan. 2019. Available: http://arxiv.org/abs/1809.00582
A. P. Parikh et al., "ToTTo: A Controlled Table-To-Text Generation Dataset," Online, Nov. 2020. Available: http://arxiv.org/abs/2004.14373
M. Kale and A. Rastogi, "Text-to-Text Pre-Training for Data-to-Text Tasks," in Proceedings of the 13th International Conference on Natural Language Generation, Dublin, Ireland, Dec. 2020, pp. 97–102. Available: https://www.aclweb.org/anthology/2020.inlg-1.14
X. Li, S. Stevens-Guille, A. Maskharashvili, and M. White, "Self-Training for Compositional Neural NLG in Task-Oriented Dialogue," in Proceedings of the 14th International Conference on Natural Language Generation, Aberdeen, Scotland, UK, Aug. 2021, pp. 87–102. Available: https://aclanthology.org/2021.inlg-1.10
E. Erdem et al., "Neural Natural Language Generation: A Survey on Multilinguality, Multimodality, Controllability and Learning," Journal of Artificial Intelligence Research (JAIR), vol. 73, pp. 1131–1207, Apr. 2022. doi: 10.1613/jair.1.12918
A. Gatt and E. Krahmer, "Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation," Journal of Artificial Intelligence Research (JAIR), vol. 61, pp. 65–170, Jan. 2018. doi: 10.1613/jair.5477