:::

Detailed record

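Title: An Investigation of the Population Invariance Assumption in IRT True Score Equating: The TASA 2007 English Testlet-Based Test as an Example
Journal: 教育與心理研究 (Journal of Education & Psychology)
Authors: 王暄博; 郭伯臣; 呂玉如
Authors (romanized): Wang, Hsuan-po; Kuo, Bor-chen; Lu, Yu-ru
Publication date: 2013
Volume/issue: 36:1
Pages: 117-146
Keywords: Item response theory true score equating; Population invariance; Taiwan Assessment of Student Achievement; Testlet response theory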
Population invariance means that when the test scores of different subgroups undergo the same equating procedure, the resulting scale scores should be the same. Thus, to satisfy the principle of test fairness, equating must meet the requirement of population invariance (Kolen & Brennan, 2004). In recent years, many studies have examined whether large-scale tests still satisfy population invariance after scaling. For example, Yang and Gao (2008) used the item response theory (IRT) true score equating method to examine whether the scale scores of CLEP testlet-based examinations conform to population invariance. However, Yang and Gao estimated the testlet-based tests with the Rasch measurement model rather than a testlet response theory model. Therefore, this study uses data from the 2007 Taiwan Assessment of Student Achievement (TASA) English testlet-based test as an example to examine whether the test data satisfy population invariance of equating, and to explore the equating results obtained with different measurement models.
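The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a plain Rasch model with item difficulties already placed on a common scale, and every difficulty value, group weight, and function name below is hypothetical. IRT true score equating maps a true score on form X to the ability that produces it, then evaluates the form Y true score at that ability. A simple invariance check then compares the equating functions from subgroup-specific estimates against the total-group function with a weighted root mean square difference, in the spirit of Dorans and Holland (2000).

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def true_score(theta, difficulties):
    """Expected number-correct score (test characteristic curve) at ability theta."""
    return sum(rasch_prob(theta, b) for b in difficulties)

def theta_for_true_score(tau, difficulties, lo=-10.0, hi=10.0, tol=1e-10):
    """Invert the test characteristic curve by bisection (it is increasing in theta).

    tau must lie strictly between 0 and the number of items.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if true_score(mid, difficulties) < tau:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def equate_true_score(tau_x, diffs_x, diffs_y):
    """Form Y true score equivalent of a form X true score."""
    theta = theta_for_true_score(tau_x, diffs_x)
    return true_score(theta, diffs_y)

def rmsd_at_score(tau_x, total, groups, weights, sd_y):
    """Weighted root mean square difference between subgroup and total-group
    equating results at one score point, in units of sd_y.

    total and each element of groups are (diffs_x, diffs_y) pairs.
    """
    e_total = equate_true_score(tau_x, *total)
    sq = sum(w * (equate_true_score(tau_x, *g) - e_total) ** 2
             for g, w in zip(groups, weights))
    return math.sqrt(sq) / sd_y

# Hypothetical difficulties: (form X, form Y) for the total group and two subgroups.
total = ([-1.0, -0.5, 0.0, 0.5, 1.0], [-1.2, -0.6, 0.1, 0.4, 0.9])
group_a = ([-1.1, -0.5, 0.1, 0.5, 1.0], [-1.2, -0.7, 0.1, 0.5, 0.9])
group_b = ([-0.9, -0.5, -0.1, 0.5, 1.0], [-1.2, -0.5, 0.1, 0.3, 0.9])

for score in range(1, 5):
    equated = equate_true_score(score, *total)
    rmsd = rmsd_at_score(score, total, [group_a, group_b], [0.5, 0.5], sd_y=1.0)
    print(score, round(equated, 3), round(rmsd, 4))
```

Values of the index near zero at every score point would suggest that the equating function depends little on the subgroup used to estimate it. A testlet response theory model would replace `rasch_prob` with a response function that includes a testlet effect, which this sketch does not attempt.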
Journal articles
1. Haladyna, T. M. (1992). Context-dependent item sets. Educational Measurement: Issues and Practice, 11(4), 21-25.
2. Lee, G. (2000). A comparison of methods of estimating conditional standard errors of measurement for testlet-based test scores using simulation techniques. Journal of Educational Measurement, 37(2), 91-112.
3. Wainer, H. (1995). Precision and differential item functioning on a testlet-based test: The 1991 Law School Admissions Test as an example. Applied Measurement in Education, 8(2), 157-186.
4. Angoff, W. H., & Cowell, W. R. (1986). An examination of the assumption that the equating of parallel forms is population-independent. Journal of Educational Measurement, 23, 327-345.
5. Bradlow, E. T., Wainer, H., & Wang, X. (1999). A Bayesian random effects model for testlets. Psychometrika, 64(2), 153-168.
6. Dorans, N. J., Liu, J., & Hammond, S. (2008). Anchor test type and population invariance: An exploration across subpopulations and test administrations. Applied Psychological Measurement, 32(1), 81-97.
7. Harris, D. J., & Kolen, M. J. (1986). Effect of examinee group on equating relationships. Applied Psychological Measurement, 10, 35-43.
8. Kolen, M. J. (1981). Comparison of traditional and item response theory methods for equating tests. Journal of Educational Measurement, 18, 1-11.
9. Liu, M., & Holland, P. W. (2008). Exploring population sensitivity of linking functions across three law school admission test administrations. Applied Psychological Measurement, 32, 27-44.
10. Rosenbaum, P. R. (1988). Item bundles. Psychometrika, 53(3), 349-359.
11. Von Davier, A. A., & Wilson, C. (2008). Investigating the population sensitivity assumption of item response theory true-score equating across two subgroups of examinees and two test formats. Applied Psychological Measurement, 32(1), 11-26.
12. Wainer, H., & Lukhele, R. (1997). How reliable are TOEFL scores? Educational and Psychological Measurement, 57, 749-766.
13. Wainer, H., Sireci, S. G., & Thissen, D. (1991). Differential testlet functioning: Definitions and detection. Journal of Educational Measurement, 28, 197-219.
14. Yang, W. L. (2004). Sensitivity of linkings between AP multiple-choice scores and composite scores to geographical region: An illustration of checking for population invariance. Journal of Educational Measurement, 41, 33-41.
15. Yang, W.-L., & Gao, R. (2008). Invariance of score linkings across gender groups for forms of a testlet-based College-Level Examination Program examination. Applied Psychological Measurement, 32, 45-61.
16. Yi, Q., Harris, D. J., & Gao, X. (2008). Invariance of equating functions across different subgroups of examinees taking a science achievement test. Applied Psychological Measurement, 32(1), 62-80.
17. Wainer, H., & Thissen, D. (1996). How is reliability related to the quality of test scores? What is the effect of local dependence on reliability? Educational Measurement: Issues and Practice, 15(1), 22-29.
18. Wainer, H., & Kiely, G. L. (1987). Item clusters and computerized adaptive testing: A case for testlets. Journal of Educational Measurement, 24(3), 185-201.
19. Haebara, T. (1980). Equating logistic ability scales by a weighted least squares method. Japanese Psychological Research, 22(3), 144-149.
20. Yen, W. M. (1993). Scaling performance assessments: Strategies for managing local item dependence. Journal of Educational Measurement, 30(3), 187-213.
21. Wainer, H., & Wang, X. H. (2000). Using a new statistical model for testlets to score TOEFL. Journal of Educational Measurement, 37(3), 203-220.
22. Hanson, B. A., & Béguin, A. A. (2002). Obtaining a common scale for item response theory item parameters using separate versus concurrent estimation in the common-item equating design. Applied Psychological Measurement, 26(1), 3-24.
23. Stocking, M. L., & Lord, F. M. (1983). Developing a common metric in item response theory. Applied Psychological Measurement, 7(2), 201-210.
24. Wang, W.-C., & Wilson, M. (2005). The Rasch testlet model. Applied Psychological Measurement, 29(2), 126-149.
25. Dorans, N. J., & Holland, P. W. (2000). Population invariance and the equatability of tests: Basic theory and the linear case. Journal of Educational Measurement, 37(4), 281-306.
26. Lord, F. M., & Wingersky, M. S. (1984). Comparing IRT true-score and equipercentile observed-score "equatings". Applied Psychological Measurement, 8, 452-461.
Conference papers
1. Allen, S., & Sudweeks, R. R. (2001, April). Identifying and managing local item dependence in context-dependent item sets. The Annual Meeting of the American Educational Research Association. Seattle, WA.
2. Wang, H. P., Lu, Y. J., Kuo, B. C., & Cheng, C. M. (2011, July). Invariance of equating functions across gender groups of the Taiwan Assessment of Student Achievement. The 17th International Meeting of the Psychometric Society. Hong Kong: Hong Kong Institute of Education. 18-22.
Research reports
1. Dorans, N. J. (2003). Population invariance of score linking: Theory and applications to Advanced Placement Program examinations. Princeton, NJ: Educational Testing Service.
2. Wang, X., Bradlow, E. T., & Wainer, H. (2004). A user's guide for SCORIGHT (version 3.0): A computer program for scoring tests built of testlets including a module for covariate analysis. Princeton, NJ: Educational Testing Service.
Books
1. Braun, H. I. (1982). Observed score test equating: A mathematical analysis of some ETS equating procedures. In Test equating. New York, NY: Academic Press.
2. Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee's ability. In Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
3. Dorans, N. J., Holland, P. W., Thayer, D. T., & Tateneni, K. (2003). Invariance of score linking across gender groups for three Advanced Placement Program examinations. In Population invariance of score linking: Theory and applications to Advanced Placement Program examinations. Princeton, NJ: Educational Testing Service.
4. Wainer, H., Bradlow, E. T., & Du, Z. (2000). Testlet response theory: An analog for the 3PL model useful in testlet-based adaptive testing. In Computerized adaptive testing: Theory and practice. Dordrecht, Netherlands: Kluwer.
5. Yang, W. L., Dorans, N. J., & Tateneni, K. (2003). Sample selection effects on AP multiple-choice score to composite score scaling. In Population invariance of score linking: Theory and applications to Advanced Placement Program examinations. Princeton, NJ: Educational Testing Service.
6. Gulliksen, H. (1950). Theory of mental tests. New York: John Wiley & Sons.
7. Zimowski, M. F., Muraki, E., Mislevy, R. J., & Bock, R. D. (2003). BILOG-MG: Multiple-group IRT analysis and test maintenance for binary items. Mooresville, IN: Scientific Software.
8. Dorans, N. J., & Feigenbaum, M. D. (1994). Equating issues engendered by changes to the SAT and PSAT/NMSQT. In Technical issues related to the introduction of the new SAT and PSAT/NMSQT (ETS RM-94-10). Princeton, NJ: Educational Testing Service.
9. Wainer, H., Bradlow, E. T., & Wang, X. (2007). Testlet response theory and its applications. Cambridge University Press.
10. Kolen, M. J., & Brennan, R. L. (2004). Test equating, scaling, and linking: Methods and practices. New York, NY: Springer-Verlag.
11. Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Kluwer-Nijhoff Publisher.
12. Lord, F. M. (1980). Applications of item response theory to practical testing problems. Lawrence Erlbaum Associates.
13. Kolen, M. J., & Brennan, R. L. (1995). Test equating: Methods and practices. New York: Springer-Verlag.
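Others
1. Department of Statistics, Ministry of Education (2011). Gender statistics section. http://www.edu.tw/statistics/
2. Taiwan Assessment of Student Achievement (2012). Taiwan Assessment of Student Achievement database. http://tasa.naer.edu.tw/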
 
 
 
 
:::