:::

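Title: A Study on Predicting Web Page Update Rules with Data Mining (以資料挖礦法則預測網頁更新規則之研究)
Journal: 電子商務學報
Authors: Hsu, Ping-yu (許秉瑜); Chang, Wei-chieh (張維捷)
Publication date: 2003
Volume/issue: 5:2
Pages: 11-36
Keywords: Web page update; Data mining; Pattern; Association rules; Web mining; WWW; Pattern discovery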
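In the e-commerce era, many kinds of agent software search the network for information in order to build all sorts of websites. Because the volume of data is usually very large, deciding when such software should refresh the information it has collected becomes an important decision for a system administrator. The usual practice today is fixed-interval updating, in which the interval between updates is a fixed time chosen by the user. If the interval is poorly chosen, however, the fetched page content may be identical to the previous copy (interval too short), or the page may already have been updated several times (interval too long), which wastes network resources or leaves the data stale. This paper therefore uses the data mining method for generating sequential association rules to find the update pattern of each web page, and then fetches pages according to that pattern as a validation. Because a page's update pattern may change over time, a static prediction pattern gradually loses accuracy. This study therefore also proposes an incremental method for updating the prediction rules, so that the rules reflect current conditions in a timely way without consuming excessive computing resources.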
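The abstract does not spell out the incremental method, so the following is only a minimal sketch of how rule statistics could be maintained one observation at a time instead of being recomputed from the full history. It assumes a fixed precondition length (`window`) and a sliding `horizon` of recent observations; the class name `IncrementalUpdateModel` and all parameters are hypothetical and are not taken from the paper.

```python
from collections import Counter, deque


class IncrementalUpdateModel:
    """Hypothetical sketch: rule counts kept current with O(1) work per fetch slot."""

    def __init__(self, window=3, horizon=50):
        self.window = window                 # assumed precondition length, in slots
        self.horizon = horizon               # assumed number of recent events kept
        self.recent = deque(maxlen=window)   # the latest `window` observations
        self.events = deque()                # (precondition, updated) pairs still counted
        self.seen = Counter()                # occurrences of each precondition
        self.followed = Counter()            # occurrences followed by an update

    def observe(self, updated):
        """Record whether the page had changed in the slot that just ended."""
        if len(self.recent) == self.window:
            pre = "".join(self.recent)
            self.events.append((pre, updated))
            self.seen[pre] += 1
            self.followed[pre] += int(updated)
            # Forget the oldest event so that stale behaviour stops counting.
            if len(self.events) > self.horizon:
                old_pre, old_updated = self.events.popleft()
                self.seen[old_pre] -= 1
                self.followed[old_pre] -= int(old_updated)
        self.recent.append("1" if updated else "0")

    def confidence(self):
        """Estimated chance of a change in the next slot, or None if unknown."""
        if len(self.recent) < self.window:
            return None
        pre = "".join(self.recent)
        n = self.seen[pre]
        return self.followed[pre] / n if n else None
```

Each call to `observe` touches only one precondition and, at most, one expired event, which is one way to keep the rules current without rescanning the whole fetch history.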
In the e-commerce era, many agents roam the Internet to find the best prices, cluster related product information, and so on. Agents have to visit targeted web pages periodically to update their information. If agents visit pages too frequently, they end up reloading information they already have. If they visit too infrequently, the collected data may be out of date. To minimize out-of-date errors, agents tend to visit a site as soon as possible. To minimize network traffic and database update cost, however, system administrators tend to reduce visits as much as possible. To the best of our knowledge, no research has been directed at finding a scientific approach to this dilemma. In this paper, we propose visiting web pages according to their past update patterns: a page should be visited as soon as it is expected to have changed, and not at any other time. To discover the update patterns, we propose using the sequential association rules of data mining methodology. Association rules can find patterns that are implicit in the temporal sequence of updates. Each web page is associated with a sequence of binary digits, each denoting whether the page was found updated in an agent fetching slot. We designed an algorithm to mine patterns from this sequence of binary digits. The patterns are composed of large item sequences and the related association rules. A rule states that, under certain preconditions, the web page will change in the next time slot. If a precondition matches the current situation, an agent is sent to fetch the page. Besides computing patterns for existing pages, the system also updates its database dynamically to account for newly inserted and deleted pages.
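The idea of matching a precondition against the binary update history can be sketched as follows. This is a simplified illustration and not the authors' algorithm: it assumes fixed-length preconditions (`window`), an absolute support count, and hypothetical function names `mine_update_rules` and `should_fetch`.

```python
from collections import Counter


def mine_update_rules(history, window=3, min_support=2, min_confidence=0.7):
    """Return {precondition: confidence} for rules 'precondition -> page changes next slot'.

    `history` is a string of '0'/'1' characters, one per fetch slot, where '1'
    means the page had changed when the agent fetched it.
    """
    seen = Counter()      # how often each precondition occurred
    followed = Counter()  # how often it was followed by an update
    for i in range(len(history) - window):
        pre = history[i:i + window]
        seen[pre] += 1
        if history[i + window] == "1":
            followed[pre] += 1

    rules = {}
    for pre, n in seen.items():
        confidence = followed[pre] / n
        if n >= min_support and confidence >= min_confidence:
            rules[pre] = confidence
    return rules


def should_fetch(history, rules, window=3):
    """Send the agent only when the latest slots match a rule's precondition."""
    return history[-window:] in rules


# A page that changes every third slot (illustrative data, not from the paper).
rules = mine_update_rules("001001001001")
print(rules)
print(should_fetch("00100", rules))
```

With a page that changes on a regular cycle, the only precondition that survives the thresholds is the one that always precedes a change, so the agent would skip the slots in which no change is expected.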
:::