Information extraction is the task of automatically extracting structured information from unstructured data, usually text documents. Its basic sub-tasks (e.g. named entity recognition, extraction of relations between entities, and linking entities to an ontology) are mostly solved at the sentence level. More complex information is extracted at the document level; one example is template filling, which attempts to fill a fixed set of fields from an entire document. The thesis will explore document-level information extraction using deep-learning-based models in multilingual and domain-specific settings.
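To make the notion of a template concrete, the following minimal Python sketch shows one possible representation of a fixed set of fields filled from a whole document. The slot names (perpetrator, target, location), the EventTemplate class, and the fill_template helper are hypothetical and chosen only for illustration; the (slot, mention) pairs stand in for the output of an extraction model, which is not part of the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class EventTemplate:
    """A fixed set of fields (slots) to be filled from one whole document."""
    # Hypothetical slot names, chosen for illustration only.
    perpetrator: List[str] = field(default_factory=list)
    target: List[str] = field(default_factory=list)
    location: List[str] = field(default_factory=list)


def fill_template(slot_mentions: List[Tuple[str, str]]) -> EventTemplate:
    """Aggregate (slot, mention) pairs from any sentence into one template.

    The pairs are assumed to come from an upstream extraction model;
    no model is run here.
    """
    template = EventTemplate()
    for slot, mention in slot_mentions:
        values = getattr(template, slot)
        if mention not in values:  # keep each filler once per document
            values.append(mention)
    return template


# Mentions collected from different sentences of the same document.
mentions = [
    ("perpetrator", "armed group"),
    ("location", "city centre"),
    ("perpetrator", "armed group"),
    ("target", "power station"),
]
template = fill_template(mentions)
```

The point of the sketch is that a single template collects fillers from the entire document, whereas the sentence-level sub-tasks produce separate outputs for each sentence.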
References
Goodfellow, I., Y. Bengio, and A. Courville. 2016. Deep Learning. Cambridge, MA, USA: MIT Press.
Du, X., A. M. Rush, and C. Cardie. 2021. Template filling with generative transformers. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.