Title: 中文全文文件群集索引理論研究與實證 (Cluster Indexing for Chinese Full-Text Documents: Theory and Empirical Study)
Journal: 圖書與資訊學刊
Author: 黃雲龍 (Huang, Yun-long)
Publication date: 1998
Volume/issue: 24
Pages: 44-68
Subject keywords: Automatic indexing; Cluster indexing; Information retrieval; Vector space model (VSM); Cluster index model (CIM); Singular value decomposition (SVD)
     Most commercial full-text retrieval systems still rely on string-matching full-text scanning combined with a Boolean query interface. Such systems oversimplify the relationship between the form of index terms and the content of documents in an electronic document retrieval environment. Building on the Vector Space Model (VSM), this study examines that relationship and applies Singular Value Decomposition (SVD) to construct a Cluster Indexing Model (CIM) for Chinese full-text documents. The test corpus consists of 502 medical news documents (MED) selected from the Children's Daily News full-text corpus. A series of experiments leads to the following preliminary conclusions: indexing with CIM performs better than the traditional VSM, improves on its effectiveness, and achieves results nearly equivalent to indexing under authority control of index terms.
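The abstract does not spell out how CIM is built, so the following is only a minimal sketch of the general idea it names: taking a VSM term-document matrix and using a truncated SVD to compare queries and documents in a reduced space, in the style of the latent semantic indexing work by Deerwester et al. (1990) listed in the references. The toy matrix, the term list, the choice of `k=2`, and the helper names `cosine` and `latent_space` are all assumptions made for illustration; none of them come from the paper.

```python
import numpy as np

# Hypothetical toy term-document matrix: rows are terms, columns are documents.
# The terms and counts are invented for illustration, not taken from the study.
terms = ["fever", "flu", "vaccine", "tooth", "dentist"]
A = np.array([
    [1, 1, 0, 0],  # fever
    [1, 0, 1, 0],  # flu
    [0, 1, 1, 0],  # vaccine
    [0, 0, 0, 1],  # tooth
    [0, 0, 0, 1],  # dentist
], dtype=float)


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


def latent_space(A, k):
    """Return (U_k, docs_k): the top-k left singular vectors and the
    documents projected onto them (one row per document)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]
    docs_k = (U_k.T @ A).T  # same as (diag(s[:k]) @ Vt[:k]).T
    return U_k, docs_k


# Query containing only the term "fever".
query = np.array([1, 0, 0, 0, 0], dtype=float)

# Document 2 mentions "flu" and "vaccine" but never "fever".
raw_sim = cosine(query, A[:, 2])

U_k, docs_k = latent_space(A, k=2)
query_k = U_k.T @ query  # fold the query into the same k-dimensional space
latent_sim = cosine(query_k, docs_k[2])
unrelated_sim = cosine(query_k, docs_k[3])

print(f"raw VSM cosine:     {raw_sim:.3f}")
print(f"rank-2 SVD cosine:  {latent_sim:.3f}")
print(f"unrelated document: {unrelated_sim:.3f}")
```

In this toy matrix the raw VSM cosine between the query and document 2 is zero because they share no term. In the rank-2 space the two end up pointing in the same direction, since the three medical terms co-occur across documents, while the dental document stays orthogonal to the query. Whether the paper's CIM uses this exact projection is not stated in the abstract.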
Journal articles
1. Deerwester, Scott; Dumais, Susan T.; Furnas, George W.; Landauer, Thomas K.; Harshman, Richard (1990). Indexing by Latent Semantic Analysis. Journal of the American Society for Information Science, 41(6), 391-407.
2. Iivonen, M. (1995). Consistency in the Selection of Search Concepts and Search Terms. Information Processing & Management, 31(2), 173-190.
3. Shannon, C. E. (1948). A Mathematical Theory of Communication. Bell System Technical Journal, 27, 379-423, 623-656.
4. Wong, S. K. M.; Yao, Y. Y. (1992). An Information-Theoretic Measure of Term Specificity. Journal of the American Society for Information Science, 43(1), 54-61.
5. Everett, D. M.; Cater, S. C. (1992). Topology of Document Retrieval Systems. Journal of the American Society for Information Science, 43(10), 659.
6. Can, F.; Ozkarahan, E. A. (1987). Computation of Term/Document Discrimination Values by Use of the Cover Coefficient Concept. Journal of the American Society for Information Science, 38(3), 171.
7. Can, F. (1994). On the Efficiency of Best-Match Cluster Searches. Information Processing & Management, 30(3), 343-361.
8. Lu, X. (1990). Document Retrieval: A Structural Approach. Information Processing & Management, 26(2), 209-218.
9. Kristensen, J. (1993). Expanding End-User's Query Statements for Free Text Searching with a Search-Aid Thesaurus. Information Processing & Management, 29(6), 733-744.
10. Yang, Y.; Wilbur, J. (1996). Using Corpus Statistics to Remove Redundant Words in Text Categorization. Journal of the American Society for Information Science, 47(5), 357-369.
11. Crouch, C. J. (1990). An Approach to the Automatic Construction of Global Thesauri. Information Processing & Management, 26(5), 632.
12. Yang, Y.; Chute, C. G. (1994). An Example-Based Mapping Method for Text Categorization and Retrieval. ACM Transactions on Information Systems, 12(3), 252-277.
13. Borko, H.; Bernick, M. (1963). Automatic Document Classification. Journal of the Association for Computing Machinery, 11, 151-162.
14. Kurfeerst, M.; Asher, J. W. (1968). A Factor Analysis of the Education Laws of Pennsylvania. Information Storage & Retrieval, 4, 257-270.
15. Burgin, R. (1995). The Retrieval Effectiveness of Five Clustering Algorithms as a Function of Indexing Exhaustivity. Journal of the American Society for Information Science, 46(8), 562-572.
16. Fox, E. A.; Koll, M. B. (1988). Practical Enhanced Boolean Retrieval: Experiences with the SMART and SIRE Systems. Information Processing & Management, 24(3), 257-267.
Conference papers
1. 簡立峰 (1996). 尋易系統(Csmart)與中文智慧型資訊檢索 [The Csmart System and Chinese Intelligent Information Retrieval]. In International Conference on Prospects of Information Science and Technology in the 21st Century, Department of Library and Information Science, World College of Journalism and Communications, and National Central Library (conference dates: 1996/11/07-09).
2. 謝清俊 (1992). 從二十五史全文資料庫的經驗談中文文件檢索系統設計的考量 [Design Considerations for Chinese Document Retrieval Systems: Lessons from the Twenty-Five Histories Full-Text Database]. 3rd International Conference on Chinese Information Processing (conference dates: 1992/10/16-28). Beijing.
3. 謝清俊 (1994). 語文工作與資訊發展--從電子文件的發展談對語文研究的期盼 [Language Work and Information Development: Expectations for Language Research in Light of the Development of Electronic Documents]. In Symposium on Current Language Issues, National Science Council, Executive Yuan, and Department of Chinese Literature, National Taiwan University (conference date: 1994/06/26).
4. 黃蕙株 (1994). 索引典的基礎理論 [Basic Theory of Thesauri]. Symposium on Thesaurus Theory and Practice. Taipei: Library Association of China. 20-34.
5. Salton, G. (1975). A Theory of Indexing. Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics. 55.
6. Salton, G. (1991). The Smart Document Retrieval Project. The 14th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 357-358.
7. Lewis, D. D. (1992). An Evaluation of Phrasal and Clustering Representations on a Text Categorization Task. The 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 37-50.
8. Wilkinson, R.; Hingston, P. (1991). Using the Cosine Measure in a Neural Network for Document Retrieval. The 14th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 202-210.
9. Syu, I.; Lang, S. D.; Deo, N. (1996). Incorporating Latent Semantic Indexing into a Neural Network Model for Information Retrieval. The 5th International Conference on Information and Knowledge Management.
10. Yang, Y.; Chute, C. G. (1993). An Application of Least Squares Fit Mapping to Text Information Retrieval. The 16th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 281-290.
11. Wong, S. K. M.; Ziarko, W.; Wong, P. C. N. (1985). Generalized Vector Space Model in Information Retrieval. The 8th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 18-25.
12. Yang, Y. (1995). Noise Reduction in a Statistical Approach to Text Categorization. The 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 256-263.
13. Lang, Sheau-Dong (1996). Tutorial on Text Retrieval Techniques and Their WWW Applications. Workshop on Information Retrieval Techniques and Their WWW Applications, National Tsing Hua University (conference date: 1996/08/13).
14. Nie, J. Y.; Brisebois, M.; Ren, X. (1996). On Chinese Text Retrieval. The 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 225-233.
15. 謝清俊 (1996). 電子古籍中的缺字問題 [The Missing-Character Problem in Electronic Ancient Texts]. 1st Symposium of the Chinese Character Studies Society (conference dates: 1996/08/25-30). Tianjin.
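Theses
1. 楊允言 (1993). 文件自動分類及其相似性排序 [Automatic Document Classification and Similarity Ranking] (Master's thesis). National Tsing Hua University.
2. 陳淑美 (1992). 財經新聞自動分類之研究 [A Study of Automatic Classification of Financial News] (Master's thesis). National Taiwan University.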
Books
1. Salton, G. (1989). Automatic Text Processing. Addison-Wesley Publishing Company.
2. 鐘聖校 (1993). 認知心理學 [Cognitive Psychology]. Taipei: 心理出版社.
3. 方師鐸 (1970). 國語詞彙學構詞篇 [Mandarin Lexicology: Word Formation]. 益智書局.
4. Salton, G. (1971). The SMART Retrieval System--Experiments in Automatic Document Processing. Englewood Cliffs, N.J.: Prentice-Hall, Inc.
5. Press, William H.; Teukolsky, Saul A.; Vetterling, William T.; Flannery, Brian P. (1992). Numerical Recipes in C. Cambridge University Press.
6. Salton, Gerald; McGill, Michael J. (1983). Introduction to Modern Information Retrieval. McGraw-Hill.
Other
1. 朱邦復 (1993). 概念網路 [Concept Network].
2. Fox, E. A. (1996-11-26). Technical Report 83-560, http://cs-tr.cs.cornell.edu/.
3. Buckley, C. (1996). Technical Report 85-686, http://cs-tr.cs.cornell.edu/.
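Book chapters
1. 謝清俊; 林晰 (1997). 中央研究院古籍全文資料庫的發展概要 [An Overview of the Development of the Academia Sinica Full-Text Database of Ancient Texts]. Technical report, Document Processing Laboratory, Institute of Information Science, Academia Sinica.
2. 趙元任 (1992). 語言成分裡意義有關的程度問題 [The Question of Degrees of Meaningfulness in Linguistic Elements]. In 中國現代語文學的開拓與發展:趙元任語言學論文集 [The Pioneering and Development of Modern Chinese Linguistics: Collected Linguistic Papers of 趙元任]. Beijing: Tsinghua University Press.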