:::

Title: Can We Rely Too Much on Testlets? The Influence of the Number of Testlet Items on Parameter Estimation
Journal: 測驗學刊 (Psychological Testing)
Authors: 林奕宏 (Lin, Yi-hung); 施慶麟 (Shih, Ching-lin)
Publication date: 2013
Volume/issue: 60:4
Pages: 649-680
Keywords: Parameter estimation; Rasch testlet model; Test design; Testlet effect; Testlet
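Abstract (translated from the Chinese): Testlet items are widely used across testing contexts, yet research has shown that testlet effects influence test results to some degree. This study examines how the number of testlet items relates to overall test results, focusing on changes in parameter estimates and test reliability. It comprises an empirical analysis and a simulation study. The empirical analysis, using the English test of Taiwan's 2007 college entrance examination, found significant testlet effects in the data. The parameter values obtained from that analysis then served as the basis for the simulation study. The simulations show that the number of testlet items in a test significantly affects examinees' ability estimates: as the number of testlet items decreases, the bias, standard error, mean square error, and mean absolute error of the ability estimates also decrease, while test reliability increases. Item difficulty estimates are affected comparatively little. In other words, when the purpose of a test is to obtain precise ability estimates, as in an entrance examination, the use of testlet items, and especially their number, must be carefully controlled.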
Abstract (English): Testlet items are commonly used in testing situations. However, studies have found that testlet effects have some impact on test results. The purpose of this study is to investigate further the influence of the number of testlet items on the entire test and to observe changes in the parameter estimates as well as in test reliability. The study consists of an empirical analysis and two simulation studies. The English test in Taiwan's 2007 College Entrance Examination was analyzed as an example. The empirical analysis demonstrates non-ignorable testlet effects in the dataset. The parameter estimates obtained from the empirical analysis are then used in the simulation studies. The simulation studies reveal that the total number of testlet items has a significant impact on the person ability estimate: bias, standard error, mean square error, and mean absolute error drop, and the EAP test reliability rises, when fewer testlet items are included in the test. For the item difficulty estimate, the impact is relatively small: only the standard error shows a consistent increase as the number of testlet items increases, whereas the effect is not consistent for bias, mean square error, and mean absolute error. In sum, it can be concluded that testlet effects are not beneficial to ability estimation, and that this influence undermines test fairness. Other suggestions for test design are provided in the conclusion.
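The abstract names bias, mean square error, and mean absolute error as recovery criteria but does not give their formulas. The sketch below assumes the usual definitions (mean signed error, mean squared error, mean absolute error between estimates and the generating values). The function name `recovery_stats` and the sample numbers are hypothetical, and the standard error is left out because this record does not describe how it was computed across replications.

```python
from statistics import mean


def recovery_stats(true_values, estimates):
    """Return bias, MSE and MAE of estimates against known generating values.

    Assumes the conventional definitions; the study itself may differ in detail.
    """
    if len(true_values) != len(estimates) or not true_values:
        raise ValueError("need two non-empty sequences of equal length")
    # Signed estimation error for each person (or item)
    errors = [est - true for true, est in zip(true_values, estimates)]
    return {
        "bias": mean(errors),                   # average signed error
        "mse": mean([e * e for e in errors]),   # average squared error
        "mae": mean([abs(e) for e in errors]),  # average absolute error
    }


# Hypothetical generating abilities and their estimates, for illustration only
true_theta = [-1.0, 0.0, 1.0, 2.0]
est_theta = [-0.5, 0.0, 0.5, 2.0]
stats = recovery_stats(true_theta, est_theta)
print(stats)
```

In a simulation of the kind the abstract describes, these quantities would typically be computed per replication and then averaged, so that conditions with different numbers of testlet items can be compared on the same scale.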
 
 
 
 