Thesis topics (Thesis selection)
Thesis details
Automated methods of textual content analysis and description of text structures
Thesis title in Czech: Automatizované metody popisu struktury odborného textu a vztah některých prvků ke kvalitě textu
Thesis title in English: Automated methods of textual content analysis and description of text structures
Keywords (Czech): obsahová analýza, SEMAN, sémantický analyzátor, sémantická pole, sémy
Keywords (English): content analysis, SEMAN, semantic analyzer, semantic fields, semes
Academic year of topic announcement: 2005/2006
Thesis type: dissertation
Thesis language: English
Department: Institute of Information Studies and Librarianship (21-UISK)
Supervisor: doc. PhDr. Vladimír Smetáček, CSc.
Author: hidden - assigned and confirmed by the Study Department
Date of registration: 07.01.2011
Date of assignment: 07.01.2011
Administrator approval: not yet approved
Date and time of defence: 27.01.2012 00:00
Date of electronic submission: 12.08.2011
Date of completed defence: 27.01.2012
Submitted/finalized: submitted by the student and finalized
Opponents: prof. RNDr. Jan Rauch, CSc.
  prof. PhDr. Oldřich Uličný, DrSc.
 
 
Guidelines

The aim of this dissertation is to evaluate the usability of a semi-formalized notation for recording knowledge called the Universal Semantic Language (Univerzální Sémantický Jazyk, USJ). We have built a new application that can work with knowledge expressed in USJ, and we focus on the methodological requirements for editing and maintaining the semantic network and for defining meanings for the purposes of content analysis. The evaluation of the system includes a comparison of mechanisms for retrieving patterns from text, and an assessment of translation inaccuracies and their impact on the system's ability to find statistically significant pairs of semantic annotations. This work revives the idea of the Universal Semantic Analyzer (SEMAN).
List of references

“Assessment and Development of New Methods for the Analysis of Media Content.” http://www.restore.ac.uk/lboro/index.php (Accessed November 21, 2010).
Azar, Edward E. 1980. “The Conflict and Peace Data Bank (COPDAB) Project.” Journal of Conflict Resolution 24(1): 143-152.
Bartsch, Sabine. 2004. Structural and Functional Properties of Collocations in English. Tübingen.
Berelson, Bernard. 1972. Content analysis in communication research. New York: Hafner.
Berelson, Bernard, and Paul F. Lazarsfeld. 1948. The Analysis of communication content. Chicago: University of Chicago Press.
Bickle, John, Peter Mandik, and Anthony Landreth. “The Philosophy of Neuroscience.” http://plato.stanford.edu/entries/neuroscience/ (Accessed July 23, 2011).
Cer, Daniel, Marie-Catherine de Marneffe, Daniel Jurafsky, and Christopher D. Manning. 2010. “Parsing to Stanford Dependencies: Trade-offs between speed and accuracy.” In 7th International Conference on Language Resources and Evaluation (LREC 2010), http://nlp.stanford.edu/pubs/lrecstanforddeps_final_final.pdf.
Cuilenburg, Jan J. van, Jan Kleinnijenhuis, and Jan A. de Ridder. 1988. “Artificial Intelligence and Content Analysis: Problems of and Strategies for Computer Text Analysis.” Quality and Quantity 22: 65-97.
Cunningham, H. 2005. “Information Extraction, Automatic.” Encyclopedia of Language and Linguistics, 2nd Edition.
Cunningham, H., D. Maynard, K. Bontcheva, and V. Tablan. 2002. “GATE: A framework and graphical development environment for robust NLP tools and applications.” In Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics.
Dale, Robert, Hermann Moisl, and Harold Somers. 2000. Handbook of Natural Language Processing. 1st ed. CRC Press.
Evert, Stefan. 2005. “The statistics of word cooccurrences: word pairs and collocations (Ph.D. thesis).” http://elib.uni-stuttgart.de/opus/volltexte/2005/2371/ (Accessed March 22, 2011).
Fan, R. E., K. W. Chang, C. J. Hsieh, X. R. Wang, et al. 2008. “LIBLINEAR: A library for large linear classification.” Journal of Machine Learning Research 9: 1871-1874.
Firth, J.R. 1957. Papers in Linguistics 1934-1951. London: Oxford University Press.
Forman, George. A Pitfall and Solution in Multi-Class Feature Selection for Text Classification.
———. 2003. “An extensive empirical study of feature selection metrics for text classification.” J. Mach. Learn. Res. 3: 1289-1305.
George, Alexander L. 1959. Propaganda Analysis: a Study of Inferences Made from Nazi Propaganda in World War II. Row, Peterson & Co.
Gliozzo, Alfio, and Carlo Strapparava. 2009. Semantic Domains in Computational Linguistics. Springer.
Goddard, Cliff. 1994. Semantic and lexical universals: theory and empirical findings. Amsterdam; Philadelphia: J. Benjamins.
Gottschalk, Louis A. 1997. “The unobtrusive Measurement of Psychological States and Traits.” In Text analysis for the social sciences: methods for drawing statistical inferences from texts and transcripts, Routledge, p. 117-147.
Gottschalk, Louis A., and R. Bechtel. 1995. “Computerized measurement of the content analysis of natural language for use in biomedical research.” Computer Methods and Programs in Biomedicine 47: 123-130.
Harden, Theo. 1983. An analysis of the semantic field of the German particles “überhaupt” and “eigentlich.” Gunter Narr Verlag.
Hopkins, Daniel, and Gary King. 2010. “A Method of Automated Nonparametric Content Analysis for Social Science.” American Journal of Political Science 54(1): 229-247.
Hsu, C.-W., C.-C. Chang, and C.-J. Lin. 2003. A practical guide to support vector classification. Technical report, Department of Computer Science, National Taiwan University.
Huberman, B. A., D. M. Romero, and F. Wu. 2009. “Crowdsourcing, attention and productivity.” Journal of Information Science 35(6): 758-765.
Joachims, Thorsten. 1998. “Text Categorization with Support Vector Machines: Learning with Many Relevant Features.” In Proceedings of the European Conference on Machine Learning, Springer.
Justeson, John S., and Slava M. Katz. 1995. “Technical terminology: some linguistic properties and an algorithm for identification in text.” Natural Language Engineering 1(01). http://www.journals.cambridge.org/abstract_S1351324900000048 (Accessed July 24, 2011).
Kant, Immanuel. 1992. Lectures on logic. Translated and edited by J. Michael Young. CUP.
Kecskés, István. 2003. Situation-bound utterances in L1 and L2. Walter de Gruyter.
King, Gary. 2003. “10 Million International Dyadic Events.” http://hdl.handle.net/1902.1/FYXLAWZRIA.
King, Gary, M. Knowles, and S. Melendez. 2010. “ReadMe: Software for Automated Content Analysis.” http://gking.harvard.edu/readme.
King, Gary, and Will Lowe. 2003. “An Automated Information Extraction Tool For International Conflict Data with Performance as Good as Human Coders: A Rare Events Evaluation Design.” International Organization 57(3): 617-642.
Koenig, Thomas. “CAQDAS Comparison.” http://www.restore.ac.uk/lboro/research/software/caqdas_comparison.php (Accessed November 21, 2010).
Kolhatkar, Varada. 2009. “An Extended Analysis of a Method of All Words Sense Disambiguation.” University of Minnesota. http://www.d.umn.edu/~tpederse/Pubs/varada-thesis.pdf.
Krippendorff, Klaus. 2004. Content analysis: An introduction to its methodology. 2nd ed. Thousand Oaks: Sage.
Laurance, Edward J. 1990. “Events data and policy analysis:” Policy Sciences 23(2): 111-132.
Leininger, Kurt. 2000. “Interindexer consistency in PsycINFO.” Journal of Librarianship and Information Science 32(1): 4-8.
Lejeune, Christophe. 2008. “Au fil de l’interprétation. L’apport des registres aux logiciels d’analyse qualitative.” Revue Suisse de Sociologie 34(3): 593-603.
———. 2009. “Méthodes qualitatives informatisées.” http://analyses.ishs.ulg.ac.be/logiciels/index.html (Accessed January 24, 2010).
Lowe, Will. 2003. Content Analysis Software: A Review. Identity Project, Weatherhead Center for International Affairs, Harvard University. http://www.wcfia.harvard.edu/misc/initiative/identity.
———. 2008. “Understanding Wordscores.” Political Analysis 16(4): 356-371.
———. “Yoshikoder: An Open Source Multilingual Content Analysis Tool for Social Scientists.” http://www.yoshikoder.org/courses/apsa2006/apsa-yk.pdf.
MacMillan, Katie. 2005. “More Than Just Coding? Evaluating CAQDAS in a Discourse Analysis of News Texts.” Forum: Qualitative social research 6(3). http://nbn-resolving.de/urn:nbn:de:0114-fqs0503257 (Accessed November 21, 2010).
Manning, Christopher D., and Hinrich Schuetze. 1999. Foundations of Statistical Natural Language Processing. MIT.
McClelland, Charles. 1999. World Event/Interaction Survey (WEIS) Project, 1966-1978 [Computer file]. Ann Arbor, MI.
Medelyan, Olena. 2009. “Human-competitive automatic topic indexing.” University of Waikato. http://hdl.handle.net/10289/3513.
Medelyan, Olena, and I. H. Witten. “Measuring inter-indexer consistency using a thesaurus.” In 6th ACM/IEEE-CS Joint Conf. on Digital Libraries, Chapel Hill, NC, USA: ACM Press, p. 274-275.
Navigli, Roberto. 2009. “Word sense disambiguation: A survey.” ACM Comput. Surv. 41(2): 1-69.
Neuendorf, Kimberly A. 2001. The Content Analysis Guidebook. 1st ed. Sage Publications, Inc.
“PCAD 2000.” http://www.gb-software.com/pcad2000.htm (Accessed November 21, 2010).
Pedersen, Ted, and Varada Kolhatkar. 2009. “WordNet::SenseRelate::AllWords - A Broad Coverage Word Sense Tagger that Maximizes Semantic Relatedness.” In Proceedings of the North American Chapter of the Association for Computational Linguistics, Boulder, CO, p. 17-20.
Phillips, David P. 1979. “Suicide, Motor Vehicle Fatalities, and the Mass Media: Evidence Toward a Theory of Suggestion.” American Journal of Sociology 84(5): 1150-1174.
———. 1983. “The Impact of Mass Media Violence on U.S. Homicides.” American Sociological Review 48(4): 560-568.
Rajman, Martin. 2007. Speech and language engineering. 1st ed. Lausanne; Boca Raton: EPFL Press; distributed by CRC Press.
“SQLAlchemy - The Database Toolkit for Python.” http://www.sqlalchemy.org/ (Accessed November 17, 2010).
“SVM-perf: Support Vector Machine for Multivariate Performance Measures.” http://www.cs.cornell.edu/People/tj/svm_light/svm_perf.html (Accessed April 25, 2010).
Saeed, John. 2009. Semantics. 3rd ed. Malden Mass.: Wiley-Blackwell.
Sampson, Geoffrey. 2003. “The Oxford Handbook of Computational Linguistics.” In , p. 333-336. http://llc.oxfordjournals.org.
Brachman, Ronald J., and James G. Schmolze. 1985. “An overview of the KL-ONE knowledge representation system.” Cognitive Science 9: 171-216.
Schrodt, Philip A. 2009. “TABARI. Textual Analysis by Augmented Replacement Instructions Version 0.7.” 9024. http://web.ku.edu/keds/tabari.dir/tabari.manual.0.7.3b3.pdf (Accessed July 23, 2011).
Schrodt, Philip A. 2006. “Twenty Years of the Kansas Event Data System Project.” The Political Methodist 14(1): 2-8.
Schrodt, Philip A., and Deborah J. Gerner. Analyzing International Event Data: A Handbook of Computer-Based Techniques. http://eventdata.psu.edu/papers.dir/AIED.Preface.pdf.
Sebastiani, Fabrizio. 2002. “Machine Learning in Automated Text Categorization.” ACM Computing Surveys 34: 1-47.
Shapiro, Gilbert, Timothy Tackett, Philip Dawson, and John Markoff. 1998. Revolutionary demands: a content analysis of the Cahiers de doléances of 1789. Stanford University Press.
Smadja, Frank. 1993. “Retrieving collocations from text: Xtract.” Computational linguistics 19: 143-177.
Sowa, John. 2000. Knowledge representation: logical, philosophical, and computational foundations. Pacific Grove: Brooks/Cole.
Stone, Philip. 2001. “Note Introducing Server Version of General Inquirer -- from inquirer blog.” http://www.wjh.harvard.edu/~inquirer/server_blognote.html (Accessed November 21, 2010).
Szabo, Gabor, and Bernardo A. Huberman. 2010. “Predicting the popularity of online content.” Communications of the ACM 53(8): 80.
Titscher, Stefan, Bryan Jenner, and Michael Meyer. Methods of text and discourse analysis.
Turmo, Jordi, Alicia Ageno, and Neus Català. 2006. “Adaptive information extraction.” ACM Comput. Surv. 38(2): 4.
Violi, Patrizia. 2001. Meaning and experience. Indiana University Press.
Weida, Robert. 1991. Knowledge Representation and Reasoning with Definitional Taxonomies. Department of Computer Science Columbia University. Technical report. http://www.cs.columbia.edu/~library/TR-repository/reports/reports-1991/cucs-047-91.ps.gz (Accessed November 11, 2010).
West, Mark D. 2001. Theory, method, and practice in computer content analysis. Greenwood Publishing Group.
De Wever, B., T. Schellens, M. Valcke, and H. Van Keer. 2006. “Content analysis schemes to analyze transcripts of online asynchronous discussion groups: a review.” Comput. Educ. 46(1): 6-28.
White, Marilyn Domas, and Emily E. Marsh. 2006. “Content Analysis: A Flexible Methodology.” Library Trends 55(1): 22-45.
Wierzbicka, Anna. 1996. Semantics: primes and universals. Oxford [England]; New York: Oxford University Press.
Williams, Geoffrey. 2003. “Les collocations et l’école contextualiste britannique.” In Les Collocations: analyse et traitement, eds. F. Grossmann and A. Tutin. Amsterdam: De Werelt, p. 33-44.
Wittgenstein, Ludwig. 1998. Filosofická zkoumání. 2nd ed. Praha: Filosofia.
Yarowsky, David. 1992. “Word-sense disambiguation using statistical models of Roget’s categories trained on large corpora.” COLING 14: 454-460.
 
Charles University | Charles University Information System