Introduction to Working with Language Corpora - AAA131025
English title: Using Language Corpora: An Introduction
Annotation - English
The seminar focuses mainly on corpus linguistics as a method. Its aim is to introduce students to the use (and, in part, the building) of electronic language corpora and corpus tools. Students will work with the British National Corpus and the parallel translation corpus InterCorp, as well as with specialized corpora of academic English and web corpora.
Course completion requirements - English

Students must attend the seminar regularly (a student must not miss more than three seminars). At the end of the semester, each student will present the results of their own research to their colleagues (a ppt presentation and/or a handout will be uploaded to Moodle).

Suggested topics of presentations: a comparison of a dictionary entry and corpus findings (collocations, colligations, semantic preference of a selected lexical unit), a corpus-based comparison of synonyms, lexical and grammatical differences among varieties of English, frequency and keyword analysis of a selected text.

All requirements for the course credit must be fulfilled by the end of the examination period of the academic year in which the student enrolled in the course.

Literature - English

Baker, P. (2006) Using Corpora in Discourse Analysis. London / New York: Continuum.

Biber, D. et al. (1999) Longman Grammar of Spoken and Written English. New York: Longman.

Hunston, S. (2002) Corpora in Applied Linguistics. Cambridge: Cambridge University Press.

McEnery, T., R. Xiao and Y. Tono (2006) Corpus-Based Language Studies. London / New York: Routledge.

McEnery, T. and A. Hardie (2011) Corpus Linguistics: Method, Theory and Practice. Cambridge: Cambridge University Press.

Teubert, W. and A. Čermáková (2007) Corpus Linguistics. A Short Introduction. London / New York: Continuum.

Further reading will be suggested during the seminar.

Syllabus - English

List of topics

1. Introduction: Historical outline, corpus-based and corpus-driven approaches, types of corpora

2. Collocation: First queries with BNC: running the query, KWIC and Sentence view, ordering the results, viewing a larger context and bibliographical information, restricting the query

3. The simple query: Simple query syntax, words and phrases, variation in phrases, using wildcards, proximity queries

4. Distribution and sorting: Comparing results, normalized frequencies, statistical significance, dispersion and file-frequency extremes

5. Collocations: Making statistical claims, association measures

6. Colligation, pattern grammar: Queries based on part-of-speech and headword/lemma, tagging and parsing

7. Keywords and frequency lists: Text-type and word lists, using keywords in stylistic analysis

8. Corpora of spoken language: Problems of transcription, metadata, speakers’ characteristics

9. Corpora of academic spoken English: Representativeness; units of meaning in spoken corpora, lexical bundles, n-grams

10. Issues in corpus design: Purpose, size and representativeness, criteria of text selection, sampling, balance, homogeneity; working with self-designed corpora, AntConc, tagging

11. Corpora in contrastive research: Varieties of English, parallel and comparable corpora

12. Leaving the corpus: Extracting query results to an external database, presenting the results


PLUS: Three lectures by invited corpus linguists

