:::

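Title: 大型測驗等化群體不變性之探究:以2007年臺灣學生學習成就評量資料庫國中二年級數學科為例 (An Investigation of Population Invariance of Equating in Large-Scale Assessment: The Case of Grade 8 Mathematics in the 2007 Taiwan Assessment of Student Achievement)
Journal: 測驗學刊 (Psychological Testing)
Authors: 王暄博 (Wang, Hsuan-po); 郭伯臣 (Kuo, Bor-chen); 呂玉如 (Lu, Yu-ju)
Publication date: 2013
Volume and issue: 60(3)
Pages: 489-518
Keywords: IRT true score equating; IRT observed score equating; population invariance; scale transformation method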
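This study uses the Grade 8 mathematics test data from the 2007 Taiwan Assessment of Student Achievement (TASA) database to examine whether TASA test scores, after the scaling procedure, satisfy the population invariance property of equating. Examinees were grouped by gender to investigate whether different equating methods preserve population invariance across gender subgroups. Four scale transformation methods (mean/sigma, mean/mean, item characteristic curve, and test characteristic curve) were each paired with item response theory (IRT) true score equating and IRT observed score equating, giving eight equating methods in total. Three indices were used to evaluate population invariance after subgroup equating: the root mean square difference (RMSD) and the root expected mean square difference (REMSD) proposed by Dorans and Holland (2000), and the root expected square difference (RESD) proposed by Yang (2004), with SDTM as the evaluation criterion. The results show that, apart from Booklet 7, in which some score points exceeded the SDTM criterion, all booklets of the 2007 TASA mathematics data satisfied population invariance of equating.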
This study uses test data from the Taiwan Assessment of Student Achievement (TASA) database to examine whether TASA test scores satisfy population invariance. Using the 2007 TASA eighth-grade mathematics data, the researchers evaluated eight equating methods to assess whether invariance was retained across examinee gender groups. The eight methods combine item response theory (IRT) true score and IRT observed score equating with four scale transformation methods: the mean/mean, mean/sigma, Haebara, and Stocking-Lord procedures. Dorans and Holland's (2000) RMSD and REMSD measures, as well as Yang's (2004) RESD measure, were used to evaluate population invariance after subpopulation equating, with SDTM as the evaluation criterion. The results showed that the TASA mathematics data satisfied population invariance, except for the seventh booklet, where a few score points exceeded the SDTM criterion.
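As a rough illustration of the invariance indices named in the abstract, the sketch below computes RMSD at each score point and the summary REMSD following the general form given by Dorans and Holland (2000): subgroup equating functions are compared with the total-group function, weighted by subgroup proportions, and standardized by the standard deviation of the target-form scores. The function names, argument layout, and toy numbers are assumptions made for this example, not the authors' code or the TASA data, and the SDTM threshold is deliberately left out because the record does not state its value.

```python
import numpy as np


def rmsd(sub_equated, total_equated, weights, sd_y):
    """RMSD at each score point x.

    sub_equated   : array (groups, score points), equated scores per subgroup
    total_equated : array (score points,), equated scores for the total group
    weights       : array (groups,), subgroup proportions summing to 1
    sd_y          : standard deviation of target-form scores in the total group
    """
    sub = np.asarray(sub_equated, dtype=float)
    total = np.asarray(total_equated, dtype=float)
    w = np.asarray(weights, dtype=float)[:, None]
    # Weighted squared difference between subgroup and total-group functions
    weighted_sq = np.sum(w * (sub - total) ** 2, axis=0)
    return np.sqrt(weighted_sq) / sd_y


def remsd(sub_equated, total_equated, weights, score_probs, sd_y):
    """REMSD: the squared differences averaged over the score distribution.

    score_probs : array (score points,), relative frequency of each score x
                  in the total group (assumed to sum to 1)
    """
    sub = np.asarray(sub_equated, dtype=float)
    total = np.asarray(total_equated, dtype=float)
    w = np.asarray(weights, dtype=float)[:, None]
    p = np.asarray(score_probs, dtype=float)
    weighted_sq = np.sum(w * (sub - total) ** 2, axis=0)
    # Expectation over x, then square root, then standardization
    return np.sqrt(np.sum(p * weighted_sq)) / sd_y


# Toy example: two subgroups of equal size, two score points (made-up values)
sub = [[1.0, 2.0], [3.0, 2.0]]
total = [2.0, 2.0]
print(rmsd(sub, total, [0.5, 0.5], sd_y=2.0))
print(remsd(sub, total, [0.5, 0.5], [0.5, 0.5], sd_y=2.0))
```

In a study of this kind, each RMSD value and the REMSD would then be compared against the chosen criterion (here SDTM) to flag score points or booklets where subgroup equating departs from the total-group result.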
Journal articles
1.Dorans, N. J.、Liu, J.、Hammond, S.(2008)。Anchor test type and population invariance: An exploration across subpopulations and test administrations。Applied Psychological Measurement,32(1),81-97。
2.Liu, M.、Holland, P. W.(2008)。Exploring population sensitivity of linking functions across three law school admission test administrations。Applied Psychological Measurement,32,27-44。
3.Yang, W.-L.(2004)。Sensitivity of linkings between AP multiple-choice scores and composite scores to geographical region: An illustration of checking for population invariance。Journal of Educational Measurement,41,33-41。
4.Yang, W.-L.、Gao, R.(2008)。Invariance of score linkings across gender groups for forms of a testlet-based college-level examination program examination。Applied Psychological Measurement,32,45-61。
5.Brennan, R. L.、Kolen, M. J.(1987)。Some practical issues in equating。Applied Psychological Measurement,11,279-290。
6.Cook, L. L.、Petersen, N. S.(1987)。Problems related to the use of conventional and item response theory equating methods in less than optimal circumstances。Applied Psychological Measurement: Issues and Practice,10,37-45。
7.Dorans, N. J.、Holland, P. W.(2000)。Population invariance and equatability of tests: Basic theory and the linear case。Journal of Educational Measurement,37,281-306。
8.Harris, D. J.、Crouse, J. D.(1993)。A study of criteria used in equating。Applied Measurement in Education,6,195-240。
9.Lord, F. M.、Wingersky, M. S.(1984)。Comparing IRT true-score and equipercentile observed score "equatings"。Applied Psychological Measurement,8,452-461。
10.Loyd, B. H.、Hoover, H. D.(1980)。Vertical equating using the Rasch model。Journal of Educational Measurement,4,11-22。
11.Petersen, N. S.、Cook, L. L.、Stocking, M. L.(1983)。IRT versus conventional equating methods: A comparative study of scale stability。Journal of Educational Statistics,8(2),135-156。
12.von Davier, A. A.、Wilson, C.(2008)。Investigating the population sensitivity assumption of item response theory true-score equating across two subgroups of examinees and two test formats。Applied Psychological Measurement,32,11-26。
13.Yi, Q.、Harris, D. J.、Gao, X.(2008)。Invariance of equating functions across different subgroups of examinees taking a science achievement test。Applied Psychological Measurement,32,62-80。
14.Skaggs, G.、Lissitz, R. W.(1986)。IRT test equating: Relevant issues and a review of recent research。Review of Educational Research,56(4),495-529。
15.Haebara, T.(1980)。Equating logistic ability scales by a weighted least squares method。Japanese Psychological Research,22(3),144-149。
16.Hanson, B. A.、Béguin, A. A.(2002)。Obtaining a common scale for item response theory item parameters using separate versus concurrent estimation in the common-item equating design。Applied Psychological Measurement,26(1),3-24。
17.Marco, G. L.(1977)。Item characteristic curve solutions to three intractable testing problems。Journal of Educational Measurement,14,139-160。
18.Stocking, M. L.、Lord, F. M.(1983)。Developing a common metric in item response theory。Applied Psychological Measurement,7(2),201-210。
19.Kuo, Bor-chen、Wang, Hsuan-po(2008)。大型測驗中同時進行垂直與水平等化效果之探討 [An investigation of the effects of conducting vertical and horizontal equating simultaneously in large-scale assessments]。教育研究與發展期刊 [Journal of Educational Research and Development],4(4),87-119。
Conference papers
1.Dorans, N. J.、Holland, P. W.、Thayer, D. T.、Tateneni, K.(2002)。Invariance of score linking across gender groups for three Advanced Placement Program exams。The annual meeting of the National Council on Measurement in Education。New Orleans, LA。
2.Harris, D. J.(1993)。Practical issues in equating。The annual meeting of the American Educational Research Association。Atlanta, GA。
3.Marco, G.、Petersen, N.、Stewart, E.(1979)。A test of the adequacy of curvilinear score equating models。The Computerized Adaptive Testing Conference。Minneapolis, MN。
4.Skaggs, G.(1990)。Assessing the utility of item response theory models for test equating。The annual meeting of the National Council on Measurement in Education。Boston, MA。
5.Yang, W.-L.(2002)。Sample selection effect on AP multiple-choice score to composite score scaling。The annual meeting of the National Council on Measurement in Education。New Orleans, LA。
Research reports
1.(2010)。99學年度國中學生、教職員統計 [Statistics on junior high school students, faculty, and staff for academic year 99]。
Books
1.Braun, H. I.(1982)。Observed score test equating: A mathematical analysis of some ETS equating procedures。Test equating。New York, NY:Academic Press。
2.Gulliksen, H.(1950)。Theory of mental tests。New York:John Wiley & Sons。
3.Zimowski, M. F.、Muraki, E.、Mislevy, R. J.、Bock, R. D.(2003)。BILOG-MG: Multiple-group IRT analysis and test maintenance for binary items。Mooresville, IN:Scientific Software。
4.Lord, F. M.(1980)。Applications of item response theory to practical testing problems。Hillsdale, NJ:Lawrence Erlbaum Associates。
5.Petersen, N. S.、Marco, G. L.、Stewart, E. B.(1982)。A test of the adequacy of linear score equating models。Test equating。New York, NY:Academic Press。
6.Crocker, L.、Algina, J.(1986)。Introduction to classical and modern test theory。Holt, Rinehart & Winston。
7.Kolen, M. J.、Brennan, R. L.(2004)。Test equating, scaling, and linking: Methods and practices。New York, NY:Springer-Verlag。
8.Hambleton, R. K.、Swaminathan, H.(1985)。Item response theory: Principles and applications。Boston, MA:Kluwer-Nijhoff。
Other
1.Taiwan Assessment of Student Achievement(2011)。臺灣學生學習成就評量資料庫 [Taiwan Assessment of Student Achievement database],http://tasa.naer.edu.tw/brief.htm, 20110420。
2.Hanson, B. A.、Zeng, L.、Chien, Y.(2004)。ST: A computer program for IRT scale transformation [Computer software],http://www.education.uiowa.edu/casma, 20110310。
3.Hanson, B. A.、Zeng, L.、Chien, Y.(2004)。PIE: IRT true and observed scoring equating for dichotomously scored tests [Computer software],http://www.education.uiowa.edu/casma, 20110310。
:::