Thesis (Selection of subject)
Thesis details
Techniques Applicable to the Analysis of Educational Data
Thesis title in Czech: Techniques Applicable to the Analysis of Educational Data
Thesis title in English: Techniques Applicable to the Analysis of Educational Data
Key words: dobývání znalostí|klasifikace|vizualizace|sociální sítě|rozhodovací stromy|klastrování|míry centrality|detekce komunit
English key words: data mining|classification|visualization|social networks|decision trees|clustering|centrality measures|community detection
Academic year of topic announcement: 2022/2023
Thesis type: diploma thesis
Thesis language: English
Department: Department of Theoretical Computer Science and Mathematical Logic (32-KTIML)
Supervisor: doc. RNDr. Iveta Mrázová, CSc.
Author: hidden - assigned and confirmed by the Study Dept.
Date of registration: 02.04.2023
Date of assignment: 04.04.2023
Confirmed by Study dept. on: 14.04.2023
Opponents: RNDr. Jan Hric
 
 
 
Guidelines
The student shall review the following data mining topics in his diploma thesis:

- overview of the paradigms relevant to decision trees (e.g., ID3 and its variants, CART, CHAID, bagging, random forests, and boosting),

- recapitulation and mutual comparison of various paradigms applicable to pre-processing, visualization, and clustering (feature selection, k-means, LVQ, and k-medoid methods, as well as their scalable versions),

- detection of significant data characteristics through approaches from social network analysis, such as centrality measures (betweenness, closeness, PageRank, HITS, etc.), community detection methods (Girvan-Newman, Kernighan-Lin, and Louvain algorithms, among others), or sentiment analysis.

The student will focus on some of these topics in more detail. Further, he will propose a suitable strategy for analyzing real-world educational data and shall implement the models. Evaluating the obtained results and gained experience shall form an important part of the thesis.
References
1. Some of the textbooks available for the chosen area of research, e.g.:
- B. Liu: Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Springer, (2007).
- Ch. C. Aggarwal: Data Mining: The Textbook, Springer, (2015).
- A.-L. Barabási: Network Science, Cambridge University Press, (2016). http://networksciencebook.com/

2. Journal papers and other publications:
- S. Parthasarathy, Y. Ruan, and V. Satuluri: Community Discovery in Social Networks: Applications, Methods, and Emerging Trends (Chapter 4), in: Ch. C. Aggarwal (Ed.): Social Network Data Analytics, Springer, (2011), https://link.springer.com/content/pdf/10.1007%2F978-1-4419-8462-3_4.pdf
- V. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre: Fast unfolding of communities in large networks, in: Journal of Statistical Mechanics: Theory and Experiment, Vol. 10, (2008), 12 p.: doi:10.1088/1742-5468/2008/10/P10008.
- A. Voros, Z. Boda, T. Elmer, M. Hoffman, K. Mepham, I. J. Raabe, and Ch. Stadtfeld: Reprint of: The Swiss Studentlife Study: Investigating the emergence of an undergraduate community through dynamic, multidimensional social network data, in: Social Networks, Vol. 69, (2022), pp. 180-193.
- Ch. Stadtfeld, A. Voros, T. Elmer, Z. Boda, and I. J. Raabe: Integration in emerging social networks explains academic failure and success, in: PNAS, Vol. 116, No 3, (2019), pp. 792-797.
- M. N. Giannakos, I. O. Pappas, L. Jaccheri, and D. G. Sampson: Understanding student retention in computer science education: The role of environment, gains, barriers and usefulness, in: Education and Information Technologies, (2017), 18 p.
- T. Shaik, X. Tao, Ch. Dann, H. Xie, Y. Li, and L. Galligan: Sentiment analysis and opinion mining on educational data: A survey, in: Natural Language Processing Journal, Vol. 2, (2023), 11 p.
- Retention in Computer Science Undergraduate Programs in the U.S.: Data Challenges and Promising Interventions, ACM, New York, U.S., https://www.acm.org/binaries/content/assets/education/retention-in-cs-undergrad-programs-in-the-us.pdf
- S. Zweben and B. Bizot: 2021 Taulbee Survey CS Enrollment Grows at All Degree Levels, With Increased Gender Diversity, CRA, U.S., May 2022, https://cra.org/wp-content/uploads/2022/05/2021-Taulbee-Survey.pdf and https://cra.org/data/
- ACM-NDC Study 2020-2021, https://www.acm.org/binaries/content/assets/education/acm_ndc_2020-2021.pdf
- National Center for Education Statistics (NCES), Integrated Postsecondary Education Data System: https://nces.ed.gov/ipeds/use-the-data

3. Relevant articles from leading academic journals, e.g.:
Data Mining and Knowledge Discovery, IEEE Transactions on Knowledge and Data Engineering, Machine Learning, etc.
 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html