Title: An Improved Heuristic Method for Constructing Binary Decision Trees on Nominal Attributes
Journal: 資訊管理學報 (Journal of Information Management)
Authors: 葉榮懋、施武榮、徐芳玲
Authors (English): Yeh, Jong-mau; Shih, Wurong; Shyu, Fang Ling
Publication year: 2010
Volume/Issue: 17(1)
Pages: 157-176
Keywords: 決策樹 (decision tree); 資料探勘 (data mining); 分類 (classification); 啟發式方法 (heuristic method); 主成分分析 (principal component analysis)
With the rapid advance of information technology, data are now stored and processed at a scale far beyond that of the past. How to extract useful information from massive data sets for decision makers has long been a central concern of data mining. Because decision trees are computationally simple and yield clear rules, they have become one of the most widely used classification techniques in data mining. However, when the data volume is large and a nominal attribute has many distinct values, creating one branch per value produces so many branches that the extracted rules become overly complex and hard to interpret, and processing efficiency suffers greatly. This paper develops a method for simplifying decision trees that partitions the nominal attributes in a database into two groups, splitting the data into two branches and eliminating excessive, unnecessary branches. The method adopts the first principal component from principal component analysis, which captures most of the variance, and uses the mean of that component's standardized scores as the threshold for the binary partition of attribute values. This removes excessive value branches and makes the explicit knowledge in the tree easy to interpret. Finally, four data sets from the UCI repository are used as test samples; the results show that the proposed method performs well in both tree simplification and classification accuracy.
The ability to extract useful information from a large-scale database to aid decision-making is critical in data mining. Classification is an important problem in data mining and has been studied extensively as an approach to knowledge acquisition. The decision tree has become one of the most commonly used techniques for classifying data because the algorithm for generating a decision tree can be easily implemented. However, when there are too many distinct values of the nominal attributes in each node of a tree, the branches of the tree become enormous and complicated. As a result, the effectiveness of data processing on a large data set may be compromised. This paper proposes a heuristic method that simplifies the decision tree by splitting the nominal attributes into two branches. We adopt principal component analysis to devise an algorithm for finding a good partition strategy that reduces unnecessary branches of a decision tree. Since the first principal component represents most of the variance, its component scores for each attribute are used as the thresholds for splitting examples. The decision tree is thereby simplified to a binary tree, so that the explicit knowledge in the tree can be easily extracted. We also compare against other heuristic methods and analyze experimental results on four UCI data sets.
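The PCA-based partitioning step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-value encoding `value_features` (here, hypothetical class-frequency vectors for each attribute value) and the function name are assumptions, and the paper's exact encoding may differ.

```python
import numpy as np

def pca_binary_split(value_features):
    """Split the values of one nominal attribute into two branches.

    value_features: (n_values, n_features) array describing each
    attribute value (e.g. class-frequency vectors -- a hypothetical
    encoding, not necessarily the paper's).
    Returns a boolean mask over the attribute values:
    True -> left branch, False -> right branch.
    """
    X = np.asarray(value_features, dtype=float)
    X = X - X.mean(axis=0)                 # center the data for PCA
    # The first principal axis is the first right-singular vector.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    scores = X @ vt[0]                     # first-component scores
    z = (scores - scores.mean()) / scores.std()  # standardize scores
    # Threshold at the mean of the standardized scores (which is 0
    # by construction), as the abstract describes.
    return z >= z.mean()
```

For instance, four attribute values whose feature vectors form two clusters end up in two groups along the first principal axis, collapsing what would have been a four-way split into a binary one.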
Journal Articles
1. Lim, T.-S., Loh, W.-Y., & Shih, Y.-S. (2000). A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms. Machine Learning, 40(3), 203-228.
2. Quinlan, J. R. (1996). Improved use of continuous attributes in C4.5. Journal of Artificial Intelligence Research, 4, 77-90.
3. Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1), 81-106.
4. Aha, D. W., & Breslow, L. A. (1998). Comparing simplification procedures for decision trees. Artificial Intelligence and Statistics, 5, 199-206.
5. Bohanec, M., & Bratko, I. (1994). Trading accuracy for simplicity in decision trees. Machine Learning, 15, 223-250.
6. Brodley, C. E., & Utgoff, P. E. (1995). Multivariate decision trees. Machine Learning, 19, 45-77.
7. Carvalho, D. R., & Freitas, A. A. (2004). A hybrid decision tree/genetic algorithm method for data mining. Information Sciences, 163(1-3), 13-35.
8. Coppersmith, D., Hong, S. J., & Hosking, J. R. M. (1999). Partitioning nominal attributes in decision trees. Data Mining and Knowledge Discovery, 3, 197-217.
9. Cox, L. A., Qiu, Y., & Kuehner, W. (1989). Heuristic least-cost computation of discrete classification functions with uncertain argument values. Annals of Operations Research, 21(1), 1-30.
10. Esposito, F., Malerba, D., & Semeraro, G. (1997). A comparative analysis of methods for pruning decision trees. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 476-491.
11. Hoffgen, K. U., Simon, H. U., & Horn, K. S. (1995). Robust trainability of single neurons. Journal of Computer and System Sciences, 50(1), 114-125.
12. Hyafil, L., & Rivest, R. L. (1976). Constructing optimal binary decision trees is NP-complete. Information Processing Letters, 5(1), 15-17.
13. Laber, E. S., & Nogueira, L. T. (2004). On the hardness of the minimum height decision tree problem. Discrete Applied Mathematics, 144(1-2), 209-212.
14. Murphy, O. J., & McCraw, R. L. (1991). Designing storage efficient decision trees. IEEE Transactions on Computers, 40(3), 315-319.
15. Murthy, S. K. (1998). Automatic construction of decision trees from data: A multi-disciplinary survey. Data Mining and Knowledge Discovery, 2(4), 345-389.
16. Naumov, G. E. (1991). NP-completeness of problems of construction of optimal decision trees. Soviet Physics Doklady, 36(4), 270-271.
17. Osei-Bryson, K. (2004). Evaluation of decision trees: A multi-criteria approach. Computers and Operations Research, 31(11), 1933-1945.
18. Pagallo, G., & Haussler, D. (1990). Boolean feature discovery in empirical learning. Machine Learning, 5, 71-100.
19. Ruggieri, S. (2002). Efficient C4.5. IEEE Transactions on Knowledge and Data Engineering, 14(2), 438-444.
20. Sorensen, K., & Janssens, G. K. (2003). Data mining with genetic algorithms on binary trees. European Journal of Operational Research, 151(2), 253-264.
21. Takimoto, E., & Maruoka, A. (2003). Top-down decision tree learning as information based boosting. Theoretical Computer Science, 292(2), 447-464.
22. Terano, T., & Ishino, Y. (1996). Knowledge acquisition from questionnaire data using simulated breeding and inductive learning methods. Expert Systems with Applications, 11(4), 507-518.
Conference Papers
1. Auer, P., Holte, R. C., & Maass, W. (1995). Theory and applications of agnostic PAC-learning with small decision trees. pp. 21-29.
2. Cherkauer, K. J., & Shavlik, J. W. (1996). Growing simpler decision trees to facilitate knowledge discovery. pp. 315-318.
3. John, G. H. (1995). Robust decision trees: Removing outliers from databases. pp. 174-179.
4. Mehta, M., Agrawal, R., & Rissanen, J. (1996). SLIQ: A fast scalable classifier for data mining. Avignon, France.
5. Ragavan, H., & Rendell, L. A. (1993). Lookahead feature construction for learning hard concepts. pp. 252-259.
6. Zheng, Z. (1995). Constructing nominal X-of-N attributes. 2, pp. 1064-1070.
Theses
1. Heath, D. G. (1993). A geometric framework for machine learning.
Books
1. Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and Regression Trees. Chapman & Hall/CRC.
2. Han, Jiawei, & Kamber, Micheline (2000). Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers.
3. Quinlan, J. Ross (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers.