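Title: An Approach to Learning an English Grammar-Based Ontology
Author: 阮家慶
Author (romanized): Jia-Cing Ruan
Institution: National Chung Cheng University
Department: Graduate Institute of Linguistics
Advisors: 貝若爾, 麥傑
Degree: Doctoral
Publication date: 2015
Keywords: grammar-based ontology, ontology, lexicon, construction, grammatical knowledge, Corpus Pattern Analysis, Construction Grammar, patterns, lexical-based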
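This exploratory study introduces a grammar-based ontology and builds one in practice from combined syntactic and semantic structure data. The basic background is the idea of Hanks (2004; 2013) that no word meaning, especially that of a polysemous word, can be separated from the context in which it occurs. The context of a phrase or a sentence therefore helps to distinguish the senses of a single word. Because WordNet was criticized by Ide and Wilks (2006) for offering no way to discriminate among the multiple senses of a word, Hanks uses patterns to achieve this purpose.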
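Hanks is not alone in holding that patterns themselves carry meaning: Construction Grammar (Goldberg, 1995), Construction Morphology (Booij, 2010), and Distributed Morphology (Halle and Marantz, 1993; Harley and Noyer, 1999; Embick and Noyer, 2007) share this view. Present-day ontologies are all lexicon-based and describe information from the world or from a particular domain. Language, however, is not merely vocabulary; it also includes the meanings expressed by grammar, structure, and the like. As the interface between language and information from the world or a particular domain, an ontology should therefore be extended into a grammar-based ontology.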
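The importance of a grammar-based ontology is threefold. First, words, phrases, and sentences all have form-meaning pairings, yet the meanings of phrases and sentences have long been ignored in ontology construction. Second, the form of a word is constrained by syntactic structure from the outset, which means that the formal structure of words itself influences human knowledge. Third, a lack of linguistic knowledge makes it impossible to use a language properly. Traditional lexicon-based ontologies cannot reflect these kinds of knowledge and therefore cannot fully account for the knowledge of the world or of a particular domain and its classification.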
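This dissertation illustrates the shortcomings of traditional lexicon-based ontologies, and thereby makes its contribution, through four concrete cases: implementing a grammar-based ontology in code, proposing the associated structural queries, automating Corpus Pattern Analysis with the grammar-based ontology, and using the grammar-based ontology to offer an alternative way of building a Word sketch system.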
This preliminary study introduces the concept of a grammar-based ontology by building an ontology from patterns that combine syntactic and semantic structures. The background is Hanks’ (2004, 2013) view that a word, especially a polysemous word, has no clear meaning unless it occurs in a context. Consequently, phraseological patterns and collocations make it possible to disambiguate word meaning. Since WordNet (Fellbaum, 1998) has been criticized as unreliable for not providing contrastive analyses of word senses (Ide and Wilks, 2006), using patterns instead of words to achieve the same goal seems more reasonable.
Using Corpus Pattern Analysis (CPA) (Hanks, 2004), Hanks has built a pattern dictionary of English verbs. The dictionary records pattern information derived from a corpus and is complementary to Construction Grammar (Goldberg, 1995) and FrameNet (Baker, Fillmore and Lowe, 1998). Hanks points out that Construction Grammar studies need corpus data as evidence rather than invented examples. FrameNet provides numerous frame structures, while Corpus Pattern Analysis focuses on the systematic analysis of the relations between patterns and verbs.
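The idea that a pattern, rather than a bare word, selects a sense can be illustrated with a minimal sketch. The semantic-type lexicon, the two patterns for the verb "fire", and the function name `match_sense` are all hypothetical and invented for illustration; they are not taken from Hanks' pattern dictionary or from the dissertation's implementation.

```python
# Hypothetical semantic-type lexicon (illustrative only, not CPA data).
SEMANTIC_TYPES = {
    "manager": "Human",
    "soldier": "Human",
    "employee": "Human",
    "rifle": "Firearm",
    "gun": "Firearm",
}

# Hypothetical patterns for one verb:
# (subject type, object type) paired with the sense that the pattern selects.
PATTERNS = {
    "fire": [
        (("Human", "Firearm"), "discharge a weapon"),
        (("Human", "Human"), "dismiss from employment"),
    ],
}


def match_sense(subject, verb, obj):
    """Return the sense whose pattern matches the argument types, or None."""
    subj_type = SEMANTIC_TYPES.get(subject)
    obj_type = SEMANTIC_TYPES.get(obj)
    for (want_subj, want_obj), sense in PATTERNS.get(verb, []):
        if (subj_type, obj_type) == (want_subj, want_obj):
            return sense
    return None


if __name__ == "__main__":
    print(match_sense("soldier", "fire", "rifle"))
    print(match_sense("manager", "fire", "employee"))
```

In this toy setting the verb alone is ambiguous, and only the semantic types of its arguments decide between the two senses.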
Not only Corpus Pattern Analysis and one of its foundations, the Theory of Norms and Exploitations (TNE) (Hanks, 2013), but also Construction Grammar, Construction Morphology (Booij, 2010), and Distributed Morphology (Halle and Marantz, 1993; Harley and Noyer, 1999; Embick and Noyer, 2007) recognize that patterns or constructions carry meanings of their own. Thus, an ontology ought to be not only lexicon-based but also grammar-based or pattern-based.
The need for a grammar-based ontology has three dimensions. First, according to Construction Grammar, words, phrases, and sentences are all form-meaning pairs, which shows that every grammatical (syntactic or semantic) element has a meaning corresponding to its form. These grammatical meanings, which reflect implicit abstract knowledge of the world, have long been ignored in ontology building. Second, according to Construction Morphology and Distributed Morphology, the meaning of a word formation (especially a compound or complex word) is predetermined by its syntactic pattern or construction. This indicates that ‘patterns’ contribute to human knowledge. Third, without grammatical knowledge (whether conscious or not), a language cannot be used properly. Traditional lexicon-based ontologies therefore fail to reflect the knowledge involved in real language use, and so fail to reflect the meanings through which language use interfaces with the objects of the world.
The dissertation implements queries over a grammar-based ontology and word-sketch-like systems built from one. Neither the systems nor the query results can be reproduced with traditional ontologies, which indicates the need for a grammar-based ontology. The grammar-based ontology has also been applied to machine sentence generation, which traditional ontologies cannot handle either. In addition, a grammar-based ontology can be applied to typological questions in linguistics and to the analysis of language-learning reference books for educational purposes.
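As a rough illustration of what a word-sketch-like query involves, the sketch below groups the dependents of a head word by grammatical relation and counts them. The triples, the relation labels, and the function name `word_sketch` are assumptions made up for this example; the dissertation's actual data structures and query interface are not described in this abstract.

```python
from collections import Counter, defaultdict

# Hypothetical (relation, head, dependent) triples, of the kind a
# dependency parser might emit. Invented for illustration.
TRIPLES = [
    ("nsubj", "drink", "child"),
    ("dobj", "drink", "milk"),
    ("dobj", "drink", "water"),
    ("dobj", "drink", "milk"),
    ("nsubj", "drink", "cat"),
    ("dobj", "read", "book"),
]


def word_sketch(head, triples):
    """Group the dependents of `head` by relation, most frequent first."""
    sketch = defaultdict(Counter)
    for relation, triple_head, dependent in triples:
        if triple_head == head:
            sketch[relation][dependent] += 1
    return {rel: counts.most_common() for rel, counts in sketch.items()}


if __name__ == "__main__":
    for relation, collocates in word_sketch("drink", TRIPLES).items():
        print(relation, collocates)
```

A purely lexical resource has no slot for the relation label, which is the piece of grammatical information this kind of query depends on.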
Compared with traditional ontologies, a grammar-based ontology is much more straightforward to evaluate, because it combines several systems whose precision is known. However, its precision or quality changes if the tags in the constituency parse, dependency parse, or semantic tagging results change. Applying different tagsets results in different performance.
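The dependence on the tagset can be seen in a small sketch: the same two tagged sentences yield two distinct patterns under fine-grained Penn Treebank tags but collapse into one pattern under coarse tags. The example sentences and the helper `pattern_key` are hypothetical; the mapping is a small excerpt in the spirit of the universal part-of-speech tagset (Petrov, Das, and McDonald, 2011), not the dissertation's own configuration.

```python
# Small excerpt of a Penn Treebank to coarse-tag mapping (illustrative subset).
PENN_TO_UNIVERSAL = {
    "DT": "DET",
    "NN": "NOUN",
    "NNS": "NOUN",
    "VBZ": "VERB",
    "VBD": "VERB",
}


def pattern_key(tagged_sentence, mapping=None):
    """Build a pattern string from the tags, optionally remapping them."""
    tags = [tag for _word, tag in tagged_sentence]
    if mapping is not None:
        tags = [mapping.get(tag, tag) for tag in tags]
    return " ".join(tags)


# Hypothetical tagged sentences.
SENTENCES = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("dogs", "NNS"), ("barked", "VBD")],
]

fine_patterns = {pattern_key(s) for s in SENTENCES}
coarse_patterns = {pattern_key(s, PENN_TO_UNIVERSAL) for s in SENTENCES}

if __name__ == "__main__":
    print(len(fine_patterns), len(coarse_patterns))
```

Any count or precision figure computed over patterns therefore has to be reported together with the tagset that produced them.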
The grammar-based ontology has two limitations. First, the design proposed in the present dissertation cannot handle pragmatic information because it lacks non-taxonomic relations. In English, for example, coordinating and subordinating conjunctions play an important role as non-taxonomic relations between two concepts. Second, the grammar-based ontology does not provide much information for evaluating CPA.
Agirre, Eneko, Olatz Ansa, Eduard Hovy, and David Martinez. 2000. Enriching very large ontologies using www. In ECAI’2000 Workshop on Ontology Learning, Proceedings of the First Workshop on Ontology Learning OL’2000. Retrieved from: http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-31/EAgirre_14.pdf, Oct. 10th, 2014.
Alfonseca, Enrique and Suresh Manandhar. 2002. Extending a lexical ontology by a combination of distributional semantics signatures. In Proceedings of the 13th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2002), p. 1-7.
Angele, Jürgen, Michael Kifer, and Georg Lausen. 2009. Ontologies in F-Logic. In Steffen Staab and Rudi Studer. (Eds.), Handbook on Ontologies. (2nd ed.), p.45-70. Berlin: Springer.
Antoniou, Grigoris, and Frank van Harmelen. 2009. Web ontology language: OWL. In Steffen Staab and Rudi Studer. (Eds.), Handbook on Ontologies. (2nd ed.), p.91-110. Berlin: Springer.
Atkins, B. T. Sue. 1993. Tools for computer-aided corpus lexicography. The Hector project. Acta Linguistica Hungarica, 41.
Aussenac-Gilles, Nathalie. 2005. Supervised text analyses for ontology and terminology engineering. In Proceedings of the Dagstuhl Seminar on Machine Learning for the Semantic Web. Retrieved from: http://kushmerick.org/nick/research/Dagstuhl-MLSW/proceedings/aussenac-gilles.pdf, Oct. 10th, 2014.
Baader, Franz, Ian Horrocks, and Ulrike Sattler. 2009. Description logics. In Steffen Staab and Rudi Studer. (Eds.), Handbook on Ontologies. (2nd ed.), p.21-44. Berlin: Springer.
Baker, Collin, Charles J. Fillmore and John B. Lowe. 1998. The Berkeley FrameNet project. In COLING-ACL ’98: Proceedings of the Conference, p. 86-90.
Bies, Ann, Mark Ferguson, Karen Katz and Robert MacIntyre. 1995. Bracketing Guidelines for Treebank II Style Penn Treebank Project. Retrieved from: ftp://ftp.cis.upenn.edu/pub/treebank/doc/manual/root.ps.gz, Dec 12th, 2014.
Bird, Steven. 2005. NLTK-Lite: Efficient scripting for natural language processing. In 4th International Conference on Natural Language Processing, p. 1-8.
Bird, Steven, Ewan Klein, and Edward Loper. 2009. Natural Language Processing with Python. CA: O’Reilly.
Borges Neto, José. 2007. Ontology, Language, and Linguistic Theory. Presented in Language and Ontology, IEL, UNICAMP.
Borst, Willem Nico. 1997. Construction of Engineering Ontologies. Institute for Telematica and Information Technology, University of Twente, Enschede, The Netherlands.
Brown, John Seely and Richard R. Burton. 1975. Multiple representations of knowledge for tutorial reasoning. In Bobrow, Daniel. G. and Collins, Allan (Eds.), Representation and Understanding, p. 311-350. Academic Press.
Brugman, Claudia M. 1988. The Syntax and Semantics of 'have' and Its Complements. Ph.D. Dissertation. University of California, Berkeley.
Buehrer, J. Daniel. 2009. Class algebra. Presented in International Conference on Advances in Computer Science and Engineering, Thailand.
Buehrer, J. Daniel and Chun-Yao Wang. 2012. CA-ABAC: Class Algebra attribute-based access control. Presented at The 2nd International Symposium on Web Intelligent Systems & Services (WISS), Macau.
Cantor, Georg. 1932. Gesammelte Abhandlungen mathematischen und philosophischen Inhalts, edited and with annotations by Ernst Zermelo. Berlin: Springer.
Fellbaum, Christiane. 1998. WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Cimiano, Philipp and Steffen Staab. 2005. Learning concept hierarchies from text with a guided hierarchical clustering algorithm. In Proceedings of the ICML 2005 Workshop on Learning and Extending Lexical Ontologies with Machine Learning Methods. Retrieved from: http://citeseer.ist.psu.edu/viewdoc/download;jsessionid=85677534572CC150E5AECFD4197044D9?doi=10.1.1.140.7472&rep=rep1&type=pdf, Oct. 10th, 2014.
Cinková, Silvie, Martin Holub, Pavel Rychlý, Lenka Smejkalová, and Jana Šindlerová. 2010. Can corpus pattern analysis be used in NLP? In Petr Sojka, Aleš Horák, Ivan Kopeček and Karel Pala. (Eds.), Proceedings of 13th International Conference on Text, Speech and Dialogue, p. 67-74.
d’Aquin, Mathieu and Natalya Noy. 2012. Review: Where to publish and find ontologies? A survey of ontology libraries. Web Semantics: Science, Services and Agents on the World Wide Web, 11, p. 96-111.
de Marneffe, Marie-Catherine, Timothy Dozat, Natalia Silveira, Katri Haverinen, Filip Ginter, Joakim Nivre, and Christopher D. Manning. 2014. Universal Stanford dependencies: A cross-linguistic typology. In Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC), p.4585-4592.
de Marneffe, Marie-Catherine, Bill MacCartney and Christopher D. Manning. 2006. Generating typed dependency parses from phrase structure parses. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC-2006). Retrieved from: http://www.lrec-conf.org/proceedings/lrec2006/pdf/440_pdf.pdf, Dec 12, 2014.
Embick, David and Rolf Noyer. 2007. Chapter 9: Distributed Morphology and the syntax/morphology interface. In Gillian Ramchand and Charles Reiss. (Eds.), Oxford Handbook of Linguistic Interfaces, p.289-323.
Faatz, Andreas and Ralf Steinmetz. 2002. Ontology enrichment with texts from the www. In Semantic Web Mining 2nd Workshop at ECML/PKDD-2002, Helsinki, Finland. Retrieved from: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.12.901&rep=rep1&type=pdf, Oct. 10th, 2014.
Farrar, Scott and D. Terence Langendoen. 2003. A linguistic ontology for the Semantic Web. GLOT International, 7(3), p.97-100.
Fernández, Mariano, Asunción Gómez-Pérez, and Natalia Juristo. 1997. METHONTOLOGY: From ontological art towards ontological engineering. Spring Symposium on Ontological Engineering of AAAI. California: Stanford University, p. 33-40.
Fillmore, Charles J. and Paul Kay. 1993. Construction Grammar. Unpublished manuscript. University of California, Berkeley.
Fillmore, Charles J., Paul Kay and Catherine O'Connor. 1988. Regularity and idiomaticity in grammatical constructions: The case of Let Alone. Language 64, p. 501-538.
Freeman, Reva. 2010. Python as a vehicle for teaching natural language processing. In Proceedings of the Twenty-Third International Florida Artificial Intelligence Research Society Conference, p. 300-304.
Garrette, Dan. 2009. An extensible toolkit for computational semantics. In Proceedings of the 8th International Conference on Computational Semantics, p. 116-127.
Genesereth, Michael R., and Nils J. Nilsson. 1987. Logical Foundations of Artificial Intelligence. Morgan Kaufmann.
Goldberg, Adele. E. 1995. Constructions: A construction grammar approach to argument structure. Chicago: University of Chicago Press.
Gómez-Pérez, Asunción, Mariano Fernández-López and Oscar Corcho. 2004. Ontological Engineering: with Examples From the Areas of Knowledge Management, e-Commerce and the Semantic Web. Springer.
Grossmann, Reinhardt. 1992. The Existence of the World: An Introduction to Ontology. Routledge: London and New York.
Gruber, Thomas R. 1993. A translation approach to portable ontologies. Knowledge Acquisition, 5(2), p.199-220.
Guarino, Nicola, Daniel Oberle, and Steffen Staab. 2009. What is an ontology? In Steffen Staab and Rudi Studer. (Eds.), Handbook on Ontologies. (2nd ed.), p.1-20. Berlin: Springer.
Hahn, Udo and Kornél G. Markó. 2001. Joint knowledge capture for grammars and ontologies. In Proceedings of the First International Conference on Knowledge Capture (K-CAP 2001), p. 68-75, ACM Press.
Halle, Morris and Alec Marantz. 1993. Chapter 3: Distributed Morphology and the pieces of Inflection. In Ken Hale and Samuel Jay Keyser (Eds.), The View from Building 20, p. 111-176, MIT Press.
Halliday, Michael. 1966. Lexis as a linguistic level. In Charles Ernest Bazell and John Rupert Firth (Eds.), In memory of J. R. Firth. Longman.
Hanks, Patrick. 2000. Do word meanings exist? Computers and the Humanities 34, p. 205-215.
Hanks, Patrick. 2004. Corpus Pattern Analysis. In Geoffrey Williams and Sandra Vessier (Eds.), Proceedings of the Eleventh EURALEX International Congress, EURALEX 2004, p.87-97. Lorient: France.
Hanks, Patrick. 2009. The linguistic double helix: norms and exploitations. In Dana Hlaváčková, Aleš Horák, Klára Osolsobě, and Pavel Rychlý. (Eds.), After Half a Century of Slavonic Natural Language Processing, p. 63-80. Brno, Czech Republic: Masaryk University.
Hanks, Patrick. 2013. Lexical Analysis: Norms and Exploitations. MIT Press.
Harley, Heidi and Rolf Noyer. 1999. Distributed Morphology. Glot International 4, 3-9.
Howe, Douglas Henry. 1996. American Start With English (Book 1 to Book 6). Oxford Press.
Ide, Nancy and Yorick Wilks. 2006. Making sense about sense. In Eneko Agirre and Philip Glenny Edmonds, (Eds.), Word Sense Disambiguation: Algorithms and Applications, p.47-73. New York: Springer.
Jurafsky, Daniel, and James H. Martin. 2009. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. (2nd ed.). New Jersey: Pearson Education.
Kilgarriff, Adam, Pavel Rychlý, Pavel Smrž and David Tugwell. 2004. The sketch engine. In Proceedings of the 11th EURALEX International Congress. France, p. 105-116.
Klein, Dan and Christopher D. Manning. 2003a. Fast exact inference with a factored model for natural language parsing. In Suzanna Becker, Sebastian Thrun, and Klaus Obermayer (Eds.), Advances in Neural Information Processing Systems 15 (NIPS 2002). Cambridge, Mass.: MIT Press.
Klein, Dan and Christopher D. Manning. 2003b. Accurate Unlexicalized Parsing. In Proceedings of the 41st Meeting of the Association for Computational Linguistics, pp. 423-430.
Klein, Ewan. 2006. Computational semantics in the Natural Language Toolkit. In Proceedings of the 2006 Australasian Language Technology Workshop, p. 26-33.
Lakoff, George. 1987. Women, Fire, and Dangerous Things: What Categories Reveal about the Mind. Chicago: University of Chicago Press.
Lambrecht, Knud. 1994. Information Structure and Sentence Form: A Theory of Topic, Focus, and the Mental Representation of Discourse Referents. Cambridge Studies in Linguistics. Cambridge: Cambridge University Press.
Levenshtein, Vladimir I. 1966. Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady, 10(8), p.707-710.
Li, Man, Xiao-yong Du and Shan Wang. 2005. Learning ontology from relational database. In Proceeding of the Fourth International Conference on Machine Learning and Cybernetics, p. 3410-3415.
Loper, Edward. 2004. NLTK: Building a pedagogical toolkit in Python. In PyCon DC 2004. Python Software Foundation. http://www.python.org/pycon/dc2004/papers/.
Loper, Edward, and Steven Bird. 2004. NLTK: The natural language toolkit. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, p. 63-70.
Mavrogiorgos, Marios. 2010. Clitics in Greek: A Minimalist Account of Proclisis and Enclisis. Amsterdam: John Benjamins Publishing.
McCarthy, Diana and Adam Kilgarriff. 2015. Semantic word sketches. Presented in The Eighth International Corpus Linguistics Conference, United Kingdom: Lancaster University.
Miller, Scott, Robert Bobrow, Robert Ingria, Richard Schwartz. 1994. Hidden understanding models of natural language. In ACL-94, Las Cruces, NM, p.25-32.
Ministry of Education. 2008. Appendix 4: Vocabulary list. Grade 1-9 Curriculum Guidelines, p. 33-39.
Minsky, Marvin. 1975. A framework for representing knowledge. In Patrick Henry Winston (Ed.), The Psychology of Computer Vision, p.211-277. New York: McGraw-Hill.
Negnevitsky, Michael. 2005. Artificial Intelligence: A Guide to Intelligent Systems. (2nd ed.), England: Pearson Education.
Osborne, Timothy, Michael Putnam and Thomas M. Gross. 2011. Bare phrase structure, label-less trees, and specifier-less syntax. Is Minimalism becoming a dependency grammar? The Linguistic Review 28, p. 315-364.
Pan, Jeff Z. 2009. Resource description framework. In Steffen Staab and Rudi Studer. (Eds.), Handbook on Ontologies. (2nd ed.), p.71-90. Berlin: Springer.
Pease, Adam. 2001. Evaluation of intelligent systems: The high performance knowledge bases and IEEE standard upper ontology projects (invited position paper). In Proceedings of the 2001 Workshop on Measuring Intelligence and Performance of Intelligent Systems (PERMIS 2001).
Pease, Adam, Ian Niles, and John Li. 2002. The Suggested Upper Merged Ontology: A large ontology for the Semantic Web and its applications. In Working Notes of the AAAI-2002 Workshop on Ontologies and the Semantic Web, Edmonton, Canada.
Presutti, Valentina, Francesco Draicchio, and Aldo Gangemi. 2012. Knowledge extraction based on discourse representation theory and linguistic frames. In Knowledge Engineering and Knowledge Management, p.114-129. Springer: Berlin Heidelberg.
Petrov, Slav, Dipanjan Das, and Ryan McDonald. 2011. A universal part-of-speech tagset. ArXiv:1104.2086.
Ra, Minyoung, Donghee Yoo, Sungchun No, Jinhee Shin, and Changhee Han. 2012. The mixed ontology building methodology using database information. In Proceedings of the International MultiConference of Engineers and Computer Scientists Vol I. Retrieved from: http://www.iaeng.org/publication/IMECS2012/IMECS2012_pp68-73.pdf, Dec. 12th, 2014.
Rais, Mohammed, Abdelmonaime Lachkar, Abdelhamid Lachkar, and Said El Alaoui Ouatik. 2014. A comparative study of biomedical named entity recognition methods based machine learning approach. In Proceedings of 2014 Third IEEE International Colloquium in Information Science and Technology, p.329-334.
Rayson, Paul, Dawn Archer, Scott Piao and Tony McEnery. 2004. The UCREL semantic analysis system. In Proceedings of the workshop on Beyond Named Entity Recognition Semantic labelling for NLP tasks in association with 4th International Conference on Language Resources and Evaluation (LREC 2004), Lisbon, Portugal, p. 7-12.
Rosen, Kenneth H. 2007. Discrete Mathematics and Its Applications. (6th ed.). New York: McGraw-Hill.
Ruiz-Casado, Maria, Enrique Alfonseca, and Pablo Castells. 2007. Automating the learning of lexical patterns: An application to the enrichment of Wordnet by extracting semantic relationships from Wikipedia. Data and Knowledge Engineering, 61:484-499.
Sinclair, John. 1966. Beginning the study of lexis. In Charles Ernest Bazell and John Rupert Firth (Eds.), In memory of J. R. Firth. Longman.
Sinclair, John. 1987. Looking Up: an Account of the Cobuild Project in Lexical Computing. HarperCollins.
Sinclair, John. 1991. Corpus, Concordance, Collocation. Oxford University Press.
Sinclair, John and Patrick Hanks. 1987. The Collins Cobuild English Language Dictionary. HarperCollins.
Studer, Rudi, V. Richard Benjamins, and Dieter Fensel. 1998. Knowledge engineering: Principles and methods. Data & Knowledge Engineering, 25(1-2), p.161-198.
Varzi, Achille C. 2007. From Language to Ontology: Beware of the Traps. In Michel Aurnague, Maya Hickmann, Laure Vieu (Eds.), The Categorization of Spatial Entities in Language and Cognition, p. 269-284. Amsterdam: John Benjamins.

 
 
 
 