Unsupervised Machine Learning in NLP - NPFL097

Title (Czech):	Neřízené strojové učení v NLP
Guaranteed by:	Institute of Formal and Applied Linguistics (32-UFAL)
Faculty:	Faculty of Mathematics and Physics
Valid:	from 2020
Semester:	winter
E-Credits:	3
Hours per week, examination:	winter s.:1/1, C [HT]
Capacity:	unlimited
Min. number of students:	unlimited
4EU+:	no
Virtual mobility / capacity:	no
State of the course:	taught
Language:	Czech, English
Teaching methods:	full-time
Additional information:	http://ufal.mff.cuni.cz/courses/npfl097

Guarantor:	RNDr. David Mareček, Ph.D.
Teacher(s):	RNDr. David Mareček, Ph.D.
Class:	Computer Science, Master's - elective
Classification:	Informatics > Computer and Formal Linguistics

Annotation

The goal of the course is to introduce basic methods of unsupervised machine learning and their applications in natural language processing. We will discuss Bayesian inference, Expectation-Maximization, cluster analysis, neural-network-based methods, and other methods in current use. Selected applications will be discussed in detail and implemented in the lab sessions.

Last update: Vidová Hladká Barbora, doc. Mgr., Ph.D. (25.04.2019)

Course completion requirements

To obtain the credit, students must implement and submit on time a set of programming assignments (usually three). Missing points can be made up in the final test.

Last update: Mareček David, RNDr., Ph.D. (05.05.2022)

Literature

Christopher Bishop: Pattern Recognition and Machine Learning, Springer-Verlag New York, 2006

Kevin P. Murphy: Machine Learning: A Probabilistic Perspective, The MIT Press, Cambridge, Massachusetts, 2012

Kar Wai Lim, Wray Buntine, Changyou Chen, Lan Du: Nonparametric Bayesian topic modelling with the hierarchical Pitman-Yor processes, International Journal of Approximate Reasoning 78, Elsevier, 2016

Kevin Knight: Bayesian Inference with Tears, 2009, http://www.isi.edu/natural-language/people/bayes-with-tears.pdf

Last update: Mareček David, RNDr., Ph.D. (24.04.2019)

Syllabus

1. Introduction

2. Beta-Bernoulli and Dirichlet-Categorical models

3. Modeling document collections, Categorical Mixture models, Expectation-Maximization

4. Gibbs Sampling, Latent Dirichlet allocation

5. Unsupervised Text Segmentation

6. Unsupervised tagging, Word alignment, Unsupervised parsing

7. K-means, Mixture of Gaussians, Hierarchical clustering, evaluation

8. t-SNE, Principal Component Analysis, Independent Component Analysis

9. Linguistic Interpretation of Neural Networks

Last update: Mareček David, RNDr., Ph.D. (05.05.2022)