Title: An Improved Heuristic Method for Constructing Binary Decision Trees on Nominal Attributes
Journal: 資訊管理學報 (Journal of Information Management)
Authors: 葉榮懋、施武榮、徐芳玲
Authors (English): Yeh, Jong-mau; Shih, Wurong; Shyu, Fang Ling
Publication year: 2010
Volume/Issue: 17(1)
Pages: 157-176
Keywords: 決策樹 (decision tree); 資料探勘 (data mining); 分類 (classification); 啟發式方法 (heuristic method); 主成分分析 (principal component analysis)
With the rapid advance of information technology, data are now stored and processed at a scale far beyond that of the past. How to extract useful information from massive data sets for decision makers has long been a central concern of data mining. Because decision trees are computationally simple and yield clear rules, they have become one of the most widely used classification techniques in data mining. However, when the data volume is large and a nominal attribute has many distinct values, creating one branch per value produces so many branches that the extracted rules become overly complex and hard to interpret, and processing efficiency suffers greatly. This paper develops a method for simplifying decision trees that partitions the nominal attributes in a database into two groups, splitting the data into two branches and eliminating excessive, unnecessary branches. The method adopts the first principal component from principal component analysis, which captures most of the variance, and uses the mean of that component's standardized scores as the threshold for the binary partition of attribute values. This removes excessive value branches and makes the explicit knowledge in the tree easy to interpret. Finally, four data sets from the UCI repository are used as test samples; the results show that the proposed method performs well in both tree simplification and classification accuracy.
The ability to extract useful information from a large-scale database to aid decision-making is critical in data mining. Classification is an important problem in data mining and has been studied extensively as an approach to knowledge acquisition. The decision tree has become one of the most commonly used techniques for classifying data because the algorithm for generating a decision tree can be easily implemented. However, when there are too many distinct values of the nominal attributes in each node of a tree, the branches of the tree become enormous and complicated. As a result, the effectiveness of data processing on a large data set may be compromised. This paper proposes a heuristic method that simplifies the decision tree by splitting the nominal attributes into two branches. We adopt principal component analysis to devise an algorithm for finding a good partition strategy that reduces unnecessary branches of a decision tree. Since the first principal component represents most of the variance, its component scores for each attribute are used as the thresholds for splitting examples. The decision tree is thereby simplified to a binary tree, so that the explicit knowledge in the tree can be easily extracted. We also compare against other heuristic methods and analyze experimental results on four UCI data sets.
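The PCA-based partitioning step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-value encoding `value_features` (here, hypothetical class-frequency vectors for each attribute value) and the function name are assumptions, and the paper's exact encoding may differ.

```python
import numpy as np

def pca_binary_split(value_features):
    """Split the values of one nominal attribute into two branches.

    value_features: (n_values, n_features) array describing each
    attribute value (e.g. class-frequency vectors -- a hypothetical
    encoding, not necessarily the paper's).
    Returns a boolean mask over the attribute values:
    True -> left branch, False -> right branch.
    """
    X = np.asarray(value_features, dtype=float)
    X = X - X.mean(axis=0)                 # center the data for PCA
    # The first principal axis is the first right-singular vector.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    scores = X @ vt[0]                     # first-component scores
    z = (scores - scores.mean()) / scores.std()  # standardize scores
    # Threshold at the mean of the standardized scores (which is 0
    # by construction), as the abstract describes.
    return z >= z.mean()
```

For instance, four attribute values whose feature vectors form two clusters end up in two groups along the first principal axis, collapsing what would have been a four-way split into a binary one.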
Journal Articles
1. Lim, T.-S., Loh, W.-Y., & Shih, Y.-S. (2000). A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms. Machine Learning, 40(3), 203-228.
2. Quinlan, J. R. (1996). Improved use of continuous attributes in C4.5. Journal of Artificial Intelligence Research, 4, 77-90.
3. Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1), 81-106.
4. Aha, D. W., & Breslow, L. A. (1998). Comparing simplification procedures for decision trees. Artificial Intelligence and Statistics, 5, 199-206.
5. Bohanec, M., & Bratko, I. (1994). Trading accuracy for simplicity in decision trees. Machine Learning, 15, 223-250.
6. Brodley, C. E., & Utgoff, P. E. (1995). Multivariate decision trees. Machine Learning, 19, 45-77.
7. Carvalho, D. R., & Freitas, A. A. (2004). A hybrid decision tree/genetic algorithm method for data mining. Information Sciences, 163(1-3), 13-35.
8. Coppersmith, D., Hong, S. J., & Hosking, J. R. M. (1999). Partitioning nominal attributes in decision trees. Data Mining and Knowledge Discovery, 3, 197-217.
9. Cox, L. A., Qiu, Y., & Kuehner, W. (1989). Heuristic least-cost computation of discrete classification functions with uncertain argument values. Annals of Operations Research, 21(1), 1-30.
10. Esposito, F., Malerba, D., & Semeraro, G. (1997). A comparative analysis of methods for pruning decision trees. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 476-491.
11. Hoffgen, K. U., Simon, H. U., & Horn, K. S. (1995). Robust trainability of single neurons. Journal of Computer and System Sciences, 50(1), 114-125.
12. Hyafil, L., & Rivest, R. L. (1976). Constructing optimal binary decision trees is NP-complete. Information Processing Letters, 5(1), 15-17.
13. Laber, E. S., & Nogueira, L. T. (2004). On the hardness of the minimum height decision tree problem. Discrete Applied Mathematics, 144(1-2), 209-212.
14. Murphy, O. J., & McCraw, R. L. (1991). Designing storage efficient decision trees. IEEE Transactions on Computers, 40(3), 315-319.
15. Murthy, S. K. (1998). Automatic construction of decision trees from data: A multi-disciplinary survey. Data Mining and Knowledge Discovery, 2(4), 345-389.
16. Naumov, G. E. (1991). NP-completeness of problems of construction of optimal decision trees. Soviet Physics Doklady, 36(4), 270-271.
17. Osei-Bryson, K. (2004). Evaluation of decision trees: A multi-criteria approach. Computers and Operations Research, 31(11), 1933-1945.
18. Pagallo, G., & Haussler, D. (1990). Boolean feature discovery in empirical learning. Machine Learning, 5, 71-100.
19. Ruggieri, S. (2002). Efficient C4.5. IEEE Transactions on Knowledge and Data Engineering, 14(2), 438-444.
20. Sorensen, K., & Janssens, G. K. (2003). Data mining with genetic algorithms on binary trees. European Journal of Operational Research, 151(2), 253-264.
21. Takimoto, E., & Maruoka, A. (2003). Top-down decision tree learning as information based boosting. Theoretical Computer Science, 292(2), 447-464.
22. Terano, T., & Ishino, Y. (1996). Knowledge acquisition from questionnaire data using simulated breeding and inductive learning methods. Expert Systems with Applications, 11(4), 507-518.
Conference Papers
1. Auer, P., Holte, R. C., & Maass, W. (1995). Theory and applications of agnostic PAC-learning with small decision trees. pp. 21-29.
2. Cherkauer, K. J., & Shavlik, J. W. (1996). Growing simpler decision trees to facilitate knowledge discovery. pp. 315-318.
3. John, G. H. (1995). Robust decision trees: Removing outliers from databases. pp. 174-179.
4. Mehta, M., Agrawal, R., & Rissanen, J. (1996). SLIQ: A fast scalable classifier for data mining. Avignon, France.
5. Ragavan, H., & Rendell, L. A. (1993). Lookahead feature construction for learning hard concepts. pp. 252-259.
6. Zheng, Z. (1995). Constructing nominal X-of-N attributes. 2, pp. 1064-1070.
Theses
1. Heath, D. G. (1993). A geometric framework for machine learning.
Books
1. Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and Regression Trees. Chapman & Hall/CRC.
2. Han, Jiawei, & Kamber, Micheline (2000). Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers.
3. Quinlan, J. Ross (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers.