:::

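Title: Reliability and Validity Analysis and Vertical Equating of the Chinese Reading Test (華語文閱讀測驗信度效度分析與垂直等化研究)
Journal: 華語文教學研究 (Journal of Chinese Language Teaching)
Authors: 藍珮君; 陳柏熹
Authors (romanized): Lan, Pei-jiun; Chen, Po-hsi
Publication date: 2014
Volume/issue: 11:1
Pages: 99-125
Keywords: 華語文能力測驗; 信度; 效度; 試題反應理論; 垂直等化; Mandarin test; Reliability; Validity; Item response theory; Vertical equating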
Related counts:
  • Times cited: journal articles (1), doctoral dissertations (0), books (0), book chapters (0)
  • Times cited excluding self-citations: 1
  • Co-citations: 0
  • Views: 102
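This study examines the reliability and validity of the four levels of the Chinese reading test, namely Basic (基礎級), Intermediate (進階級), Advanced (高階級), and Fluent (流利級), and links the item difficulties of the four levels onto a common scale. The sample consists of examinee response data from the operational administrations of May and November 2011 and from the 2012 pretest, analyzed with classical test theory and item response theory. The results show: 1. The reading test has good reliability: the KR-20 reliability coefficient of each level approaches or exceeds 0.90, the reliability values converted from the IRT standard errors of estimation all reach 0.90 or higher, and at each level the ability value at the passing cutoff has relatively high test information and a relatively low standard error of estimation. 2. The reading test has construct validity: factor analysis at each level extracts a single reading comprehension factor that explains at least 66.91% of the variance, and at each level at least 87.5% of the items fit the model. 3. The item difficulties of the four levels are well distributed. 4. When the Intermediate and Advanced tests are each split in half and combined into a single test, the test information and standard error of estimation at the passing cutoffs are comparable to those of the original Intermediate test and slightly worse than those of the original Advanced test. Merging these two levels into one test should therefore be feasible in practice, although the proportion of item difficulties will need further adjustment when forms are assembled.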
The purpose of this study is to investigate the reliability, validity, and vertical equating of the Reading subtest of the Test of Chinese as a Foreign Language. The reading section comprises four levels: Levels 2, 3, 4, and 5. The data were sampled from the operational test administered in 2011 and the pretest administered in 2012. The results showed, first, that the Kuder-Richardson 20 coefficients were close to or higher than .90. Moreover, test information was high at the cutoff value that determines whether an examinee passes or fails; in other words, examinees near the cutoff were measured with a low standard error of estimation. Second, factor analysis extracted a single factor, which accounted for more than 66% of the variance. In addition, Rasch analysis revealed that more than 87.5% of the items fit the model well. Third, each level of the test covers a suitable range of item difficulties. Finally, when the items in Levels 3 and 4 were split to assemble two tests, measurement precision at the cutoff values (test information and standard error of estimation) was similar to that of Level 3 but slightly lower than that of Level 4. The comparison covered test information at the cutoff values for the even-numbered items of Levels 3 and 4, the odd-numbered items of Levels 3 and 4, and all items of Levels 3 and 4. That is, these two adjacent levels could be combined into a composite level in the future to reduce the burden on examinees and test developers. However, the item difficulty distribution of the composite test should be adjusted.
:::