:::

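Title: An Efficient Incremental Data Mining Algorithm: ICI (高效率之遞增式資料探勘演算法--ICI)
Journal: 電子商務學報 (Journal of E-Business)
Authors: Huang, Jen-peng (黃仁鵬); Chien, I-pei (錢依佩); Kuo, Huang-cheng (郭煌政)
Publication date: 2006
Volume/issue: 8:3
Pages: 393-413
Keywords: Data mining; Association rule; Apriori algorithm; Frequent itemsets; Incremental mining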
Advances in information technology and the spread of computers have made collecting data easier, faster, and more convenient. Over time, databases accumulate large volumes of data that contain hidden knowledge, so extracting that knowledge correctly and efficiently has become an important problem, and data mining techniques emerged to address it. Among these techniques, association rule mining is one of the most widely used; it studies how to find the frequent itemsets in a large database and derive useful knowledge from them. The Apriori algorithm is the most commonly used method for mining association rules. Although Apriori does find the rules, it has two major drawbacks: first, it generates a large number of candidate itemsets while searching for frequent itemsets; second, it must scan the entire database many times, which makes it inefficient. Many later studies tried to address these drawbacks, but none departed from the overall Apriori framework, so the performance gains were small. This paper proposes ICI (Incremental Combination Itemsets), an algorithm that departs from the Apriori framework and needs to scan the database only once to extract the frequent itemsets. ICI therefore reduces I/O time and finds frequent itemsets and association rules quickly. In addition, ICI does not need to rescan the database or rebuild its data structure when the database is updated or the minimum support changes, so it can be applied to on-line incremental data mining without any modification.
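The abstract does not describe ICI's internal data structure, so the following Python sketch is not the authors' algorithm. It only illustrates the general idea the abstract names: read each transaction once, count the item combinations it contains, and answer frequent-itemset queries from those counts without rescanning. The class `CombinationCounter`, its `max_len` cap, and the sample transactions are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


class CombinationCounter:
    """Hypothetical one-pass itemset counter (not the paper's ICI structure)."""

    def __init__(self, max_len=3):
        self.max_len = max_len      # assumed cap on itemset size, to bound the cost
        self.counts = Counter()     # sorted item tuple -> number of supporting transactions
        self.n_transactions = 0

    def add(self, transaction):
        """Fold one transaction into the counts; earlier data is never re-read."""
        items = sorted(set(transaction))
        self.n_transactions += 1
        for k in range(1, min(self.max_len, len(items)) + 1):
            for itemset in combinations(items, k):
                self.counts[itemset] += 1

    def frequent(self, min_support):
        """Return itemsets whose relative support is at least min_support."""
        threshold = min_support * self.n_transactions
        return {s: c for s, c in self.counts.items() if c >= threshold}


counter = CombinationCounter(max_len=2)
for t in [["bread", "milk"], ["bread", "butter"], ["bread", "milk", "butter"]]:
    counter.add(t)

frequent_before = counter.frequent(min_support=0.6)

# Incremental update: one new transaction, then a query at a different threshold.
counter.add(["milk", "butter"])
frequent_after = counter.frequent(min_support=0.5)
```

Because the counts are kept for every combination up to `max_len`, changing the minimum support or appending transactions only requires another call to `frequent` or `add`. The trade-off is memory: the number of combinations grows quickly with transaction length, which is why this sketch caps the itemset size.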