Thesis topics (Thesis selection)

Universal Morphological Analysis using Reinforcement Learning

Thesis title in Czech:	Univerzální morfologická analýza s využitím reinforcement learning
Thesis title in English:	Universal Morphological Analysis using Reinforcement Learning
Keywords (Czech):	morfologická analýza, reinforcement learning
Keywords (English):	morphological analysis, reinforcement learning
Academic year of topic announcement:	2018/2019
Thesis type:	diploma thesis
Thesis language:	English
Department:	Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor:	RNDr. Daniel Zeman, Ph.D.
Author:	Mgr. Ronald Ahmed Cardenas Acosta - assigned and confirmed by the Study Department
Date of registration:	25.02.2019
Date of assignment:	07.03.2019
Date of confirmation by the Study Department:	25.04.2019
Date and time of defence:	04.02.2020 09:00
Date of electronic submission:	04.01.2020
Date of printed submission:	06.01.2020
Date of completed defence:	04.02.2020
Opponents:	RNDr. David Mareček, Ph.D.

Guidelines

In this thesis we take a universal approach to morphological analysis in context. The approach jointly simulates word formation steps and morphological label assignment, one step at a time. This mechanism is modeled as a neural weighted finite-state automaton (WFSA; Schwartz et al., 2018), in an effort to add interpretability to an otherwise black-box architecture. The problem is then formulated as a multi-armed bandit problem in which each arm captures a specific kind of word formation process. Each arm can then learn how word formation processes are carried out in different languages. Moreover, the model has the potential to learn how to combine processes from different arms, i.e. to model how a language can combine different kinds of processes in the same derivation (e.g. German exhibits circumfixation, affixation, and compounding).
Our model leverages paradigm annotations and morphologically labeled sentences in a varied sample of high-resource languages made available by the CoNLL-SIGMORPHON shared tasks. We evaluate the effectiveness of our approach in high- and low-resource scenarios against strong neural baselines for English, Spanish, German, Czech, Turkish, and Shipibo-Konibo.
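
To make the multi-armed bandit formulation concrete, the following is a minimal sketch of a generic epsilon-greedy bandit in Python. It is not the model proposed in the thesis: the epsilon-greedy strategy, the class name `EpsilonGreedyBandit`, the `TOY_REWARD` table and its values are all assumptions chosen for illustration, with arm names borrowed from the word formation processes mentioned above. In the thesis setting the reward would presumably come from the quality of the morphological analysis rather than from a fixed table.

```python
import random


class EpsilonGreedyBandit:
    """Generic epsilon-greedy bandit (illustrative; not the thesis model)."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {arm: 0 for arm in self.arms}    # pulls per arm
        self.values = {arm: 0.0 for arm in self.arms}  # mean reward per arm

    def select(self):
        # Pull every arm once before trusting the estimates.
        for arm in self.arms:
            if self.counts[arm] == 0:
                return arm
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.values[arm])

    def update(self, arm, reward):
        # Incremental update of the running mean reward for this arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


# Hypothetical fixed rewards, one per word formation process (assumption).
TOY_REWARD = {"affixation": 0.2, "compounding": 0.5, "circumfixation": 0.8}


def run(steps=200):
    bandit = EpsilonGreedyBandit(TOY_REWARD)
    for _ in range(steps):
        arm = bandit.select()
        bandit.update(arm, TOY_REWARD[arm])
    return bandit


if __name__ == "__main__":
    bandit = run()
    print(bandit.counts)
    print(bandit.values)
```

In this toy setup the bandit gradually concentrates its pulls on the arm with the highest reward while still occasionally exploring the others, which is the basic exploration and exploitation trade-off that a bandit formulation relies on.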

References

Ramy Eskander, Owen Rambow, and Smaranda Muresan. 2018. Automatically tailoring
unsupervised morphological segmentation to the language. In Proceedings of the Fifteenth
Workshop on Computational Research in Phonetics, Phonology, and Morphology, pages
78-83.

Ramy Eskander, Owen Rambow, and Tianchun Yang. 2016. Extending the Use of Adaptor
Grammars for Unsupervised Morphological Segmentation of Unseen Languages. Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 900-910.

Mark Johnson. 2008. Unsupervised word segmentation for Sesotho using adaptor grammars.
In Proceedings of the Tenth Meeting of ACL Special Interest Group on Computational
Morphology and Phonology, pages 20-27. Association for Computational Linguistics.

Hao Peng, Roy Schwartz, Sam Thomson, and Noah A. Smith. 2018. Rational recurrences. In
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing,
pages 1203-1214.

Roy Schwartz, Sam Thomson, and Noah A. Smith. 2018. SoPa: Bridging CNNs, RNNs, and
Weighted Finite-State Machines.

Kairit Sirts and Sharon Goldwater. 2013. Minimally-supervised morphological segmentation
using adaptor grammars. Transactions of the Association for Computational Linguistics,
1:255-266.