Thesis details
Language games
Thesis title in Czech: Jazykové hry
Thesis title in English: Language games
Keywords (Czech): počítačová lingvistika, anotace dat, koreference, rozpoznání rodného jazyka autora textu
Keywords (English): computational linguistics, data annotation, coreference, native language identification
Academic year of topic announcement: 2016/2017
Thesis type: bachelor's thesis
Thesis language: English
Department: Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor: doc. Mgr. Barbora Vidová Hladká, Ph.D.
Author: hidden - assigned and confirmed by the Study Department
Date of registration: 17.02.2017
Date of assignment: 18.02.2017
Date of confirmation by the Study Department: 24.02.2017
Guidelines
Three games with textual data have been published at http://lgame.ms.mff.cuni.cz; more details are available in (Hladká, Mírovský, Kohout, 2011):

-- Shannon Game is a game for one or two players in which some words of a sentence are hidden. The players guess the hidden words with the help of the visible words in the sentence.

-- Place the Space (PtS) is a single-player word segmentation game. The player is presented with a sentence displayed without spaces between words. The task is to restore the spaces within a time limit set according to the length of the sentence.

-- PlayCoref is a game with text for one or two players. During a 5-minute session, the players read a short text and connect words that co-refer. Their task is to connect all co-referring words in as many sentences as possible.
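
As an illustration of the Place the Space mechanics described above, the following minimal Python sketch removes the spaces from a sentence, derives a time limit from its length, and scores a player's answer. The function names, the rate of half a second per character, the five-second minimum, and the F1-based scoring are all assumptions made for this example; they are not taken from the LGame implementation.

```python
def make_puzzle(sentence):
    """Return the sentence with all whitespace removed."""
    return "".join(sentence.split())


def time_limit(sentence, seconds_per_char=0.5, minimum=5.0):
    """Time limit in seconds, growing with sentence length (assumed rule)."""
    return max(minimum, seconds_per_char * len(make_puzzle(sentence)))


def boundaries(sentence):
    """Offsets in the space-free string after which a space occurs."""
    positions = set()
    offset = 0
    for word in sentence.split()[:-1]:
        offset += len(word)
        positions.add(offset)
    return positions


def score(original, answer):
    """F1 of the restored word boundaries against the original ones."""
    if make_puzzle(original) != make_puzzle(answer):
        raise ValueError("answer must contain the same characters as the puzzle")
    gold = boundaries(original)
    guess = boundaries(answer)
    if not gold and not guess:
        return 1.0  # single-word sentence, nothing to restore
    correct = len(gold & guess)
    if correct == 0:
        return 0.0
    precision = correct / len(guess)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)
```

For example, make_puzzle("the cat sat") yields "thecatsat", and the answer "the catsat" restores one of the two boundaries, so it scores lower than a fully correct answer. Scoring boundaries rather than whole words gives partial credit for partially segmented sentences; a real game might prefer a different rule.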

The goal of the bachelor’s thesis is to

(i) upgrade the LGame platform,

(ii) develop its admin environment, and

(iii) develop a game for the Native Language Identification task.

The target languages for Shannon Game, PtS, and PlayCoref are Czech and English; for the NLI game, there are 11 languages represented in the TOEFL11 corpus (Hladka, Holub, Kriz, 2013).

The subgoals are

(1) Thoroughly test the latest versions of the games available at lgame.ms.mff.cuni.cz.
(2) Document the actions that do not work properly.
(3) Document the actions that should be improved or added.
(4) Migrate the PHP code to the newest PHP version.
(5) Migrate the LGame platform with respect to the operating system (consult UFAL's IT administrator).
(6) Fix the bugs documented in (2).
(7) Implement the actions from (3).
(8) Unify game design across the games.
(9) Develop an admin environment that supports user-friendly
-- addition of new languages
-- uploading of new texts to play with
-- editing of the game configuration
-- collection of annotated data
(10) Develop a game for the NLI task with the TOEFL11 corpus.
References
Hladka, B., Holub, M., Kriz, V. Feature Engineering in the NLI Shared Task 2013: Charles University Submission Report. In: Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications, ACL, pp. 232-241, Atlanta, Georgia, USA. 2013.

Hladka, B., Mirovsky, J., Kohout, J. An attractive game with the document: (im)possible? In: The Prague Bulletin of Mathematical Linguistics, No. 96, pp. 5-26. 2011.

Hladka, B., Mirovsky, J., Schlesinger, P. Designing a Language Game for Collecting Coreference Annotation. In: Proceedings of the Third Linguistic Annotation Workshop, ACL-IJCNLP 2009, pp. 52-55, Suntec, Singapore. 2009.

Hladka, B., Mirovsky, J., Schlesinger, P. Play the Language: Play Coreference. In: Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, pp. 209-212, Suntec, Singapore. 2009.

Poesio, M. DALI: Disagreements in Language Interpretation. 2016.

Poesio, M., Chamberlain, J., Kruschwitz, U. Phrase Detectives. In: ACM Transactions on Interactive Intelligent Systems (TiiS). 2013.
 