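A Study of Automatic Date Identification in Chinese Web Pages in Taiwan | Taiwan Citation Index - Humanities and Social Sciences (臺灣人文及社會科學引文索引資料庫)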

:::


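Title:	A Study of Automatic Date Identification in Chinese Web Pages in Taiwan (臺灣地區中文網頁自動辨別日期之研究)
Journal:	大學圖書館
Authors:	邰文暉／吳政叡
Authors (romanized):	Tai, Wen-hui／Wu, Cheng-juei
Publication date:	2011
Volume/Issue:	15:1
Pages:	pp. 132-143
Keywords:	Date format; Webpage date; Auto date extraction; Metadata (日期格式；網頁日期；自動日期辨別；元資料；詮釋資料；後設資料)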
Related counts:	Times cited: journals (0), doctoral dissertations (0), books (0), book chapters (0); excluding self-citations: 0; co-citations: 0; views: 27

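As the Internet becomes ever more widespread, online resources are growing richer. To find useful information for readers precisely, web page content must first be analyzed accurately. Date is an important field in web page metadata. Because of Taiwan's conventions for writing dates, the date formats found in Chinese web pages are relatively complex, which makes it harder to record a page's creation (or modification) date automatically. The main purpose of this study is to analyze web page dates in depth so that the date field of Chinese web pages can be used more precisely in retrieval. The study collects Traditional Chinese web pages by random sampling, analyzes and tallies the date formats that appear in the sample pages, uses regular expressions to extract the correct page date automatically, and finally computes the accuracy. The study sheds light on the difficulties likely to arise when identifying the date field of Chinese web pages automatically, and it assesses the feasibility of extracting that field from Traditional Chinese web pages. The experimental results show an accuracy of about 61% for pages that contain date information and about 62% for pages that do not. For pages with date information, the average year error is about 0.62 years, and the year is predicted exactly (a year error of 0) for 83.4% of pages. Although the results cannot yet fully replace manual work, they can still improve the efficiency of web page retrieval if applied appropriately.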
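The two year-level figures reported in the abstract (an average year error and the share of pages whose year is predicted exactly) can be illustrated with a short sketch. The abstract does not define the average year error precisely, so the sketch assumes it is the mean absolute difference between the predicted and the true year. The function name `year_error_stats` and the sample data are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def year_error_stats(pairs: List[Tuple[int, int]]) -> Tuple[float, float]:
    """Return (mean absolute year error, fraction of exact-year predictions).

    Each pair is (predicted_year, true_year). Assumption: "average year
    error" means the mean of |predicted - true| over all pages.
    """
    if not pairs:
        raise ValueError("at least one (predicted, true) pair is required")
    errors = [abs(predicted - true) for predicted, true in pairs]
    mean_error = sum(errors) / len(errors)
    exact_fraction = sum(1 for e in errors if e == 0) / len(errors)
    return mean_error, exact_fraction


# Illustrative data only, not results from the study.
sample = [(2011, 2011), (2010, 2011), (2008, 2011), (2011, 2011)]
print(year_error_stats(sample))  # (1.0, 0.5)
```

Under this assumed definition, a few pages with large year errors can raise the average noticeably even when most pages are predicted exactly, which would be consistent with an average of 0.62 years alongside an exact-year share above 80%.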


Online resources have become more plentiful thanks to the popularization of Internet services. To give users accurate search results, web pages must be analyzed precisely. 'Date' is one of the most important metadata fields in web pages. The date formats customarily used in Taiwan make automatic cataloging of web page dates more difficult. The main purpose of this research is to analyze thoroughly the different date formats used in Chinese web pages, and the findings are intended to increase the precision of automatic date extraction from web pages. The experiment proceeded as follows. First, sample pages were randomly selected from the Internet. Second, the date formats appearing in each page were analyzed statistically. Last, regular expressions were used to extract the date of each web page, and the accuracy was calculated. The difficulties and feasibility of automatic date extraction are discussed at the end of the work. The experimental results suggest an accuracy of 61% for web pages with date information and 62% for web pages without date information. The average error for web pages with date information is 0.62 years. These results suggest that automatic date extraction can improve the efficiency of web page information retrieval.
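The regular-expression step described in the abstract can be sketched as follows. The paper's actual expressions are not given in this record, so the two patterns below are assumptions chosen for illustration: one for Gregorian dates and one for dates written in the Republic of China (Minguo) calendar, where year 1 corresponds to 1912, so the Gregorian year is the Minguo year plus 1911. The names `GREGORIAN`, `MINGUO`, and `extract_dates` are hypothetical.

```python
import re
from datetime import date
from typing import List, Optional, Tuple

# Assumed pattern for Gregorian dates such as 2011-03-15, 2011/3/15,
# 2011.03.15 or 2011年3月15日.
GREGORIAN = re.compile(
    r"(?<!\d)((?:19|20)\d{2})\s*[-/.年]\s*(\d{1,2})\s*[-/.月]\s*(\d{1,2})(?!\d)"
)
# Assumed pattern for Minguo dates such as 民國100年3月15日.
MINGUO = re.compile(
    r"(?:中華)?民國\s*(\d{2,3})\s*年\s*(\d{1,2})\s*月\s*(\d{1,2})\s*日"
)


def _safe_date(year: int, month: int, day: int) -> Optional[date]:
    """Build a date, returning None for impossible values such as month 13."""
    try:
        return date(year, month, day)
    except ValueError:
        return None


def extract_dates(text: str) -> List[date]:
    """Return every date matched in the text, in order of appearance."""
    found: List[Tuple[int, date]] = []
    for m in GREGORIAN.finditer(text):
        d = _safe_date(int(m.group(1)), int(m.group(2)), int(m.group(3)))
        if d is not None:
            found.append((m.start(), d))
    for m in MINGUO.finditer(text):
        # Convert the Minguo year to a Gregorian year.
        d = _safe_date(int(m.group(1)) + 1911, int(m.group(2)), int(m.group(3)))
        if d is not None:
            found.append((m.start(), d))
    found.sort(key=lambda item: item[0])
    return [d for _, d in found]


print(extract_dates("公告日期：民國100年3月15日，最後更新 2011/04/01"))
```

This sketch deliberately ignores ambiguous forms such as `100/03/15`, which could be a Minguo date or something else entirely. Deciding which of several matched dates is the page's creation or modification date is a separate problem that the sketch does not attempt.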



:::

