:::

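Title: 以可能值方法為基礎之多向度能力值垂直等化探究 [A study of vertical equating of multidimensional ability estimates based on the plausible value method]
Journal: 測驗學刊 (Psychological Testing)
Authors: 吳慧珉; 郭伯臣; 許天維; 陳婉寧
Authors (romanized): Wu, Huey-min; Kuo, Bor-chen; Sheu, Tian-wei; Chen, Wan-ning
Publication date: 2015
Volume/issue: 62:2
Pages: 95-126
Subject keywords: 大型測驗; 可能值方法; 多向度試題反應理論; 垂直等化; 能力估計; Large-scale assessments; MIRT; Plausible value method; Trait estimation; Vertical equating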
Related counts:
  • Times cited: journal articles (2), doctoral dissertations (1), books (0), book chapters (0)
  • Excluding self-citations: 2
  • Co-citations: 13
  • Views: 217
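Several well-known international large-scale assessments now report population parameters with the plausible value method, because the method recovers population parameters very well and population parameters are exactly what large-scale assessments focus on. Large-scale assessments are usually built to evaluate educational outcomes over the long term, so how to examine whether students' abilities differ across grades becomes an issue worth attention. Vertical equating lets examinees in different grades take items suited to their own ability range and then places the results on a common scale, so that ability levels can be compared. Based on multidimensional item response theory and a vertical equating design, this study examines how the number of items and the number of dimensions affect the estimation of ability parameters, and compares the plausible value method with other estimation methods. The results show that the plausible value method estimates the population standard deviation with very high precision, while its estimates of the population mean ability are about as good as those of the other methods. Under the multidimensional vertical equating design, estimation improves when each dimension has more items.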
The purpose of large-scale assessment is to monitor group progress, so group statistics are what large-scale assessments focus on. The plausible value method is regarded as an accurate way to measure population statistics, and several major large-scale assessment programs therefore use it to report student achievement. Vertical equating is the approach test publishers use to evaluate achievement longitudinally across grade levels. This research analyzes whether (1) the method used to estimate parameters and (2) the number of items for each dimension affect the recovery of group-level ability parameters, based on multidimensional item response theory (MIRT) with a vertical equating design. The results indicate that the plausible value method recovers the population standard deviation very well but is not outstanding in recovering the population means. Under the MIRT vertical equating design, parameters are estimated better when each dimension has more items.
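The abstract's claim about recovering the population standard deviation can be illustrated with a small simulation. The sketch below is not the study's MIRT vertical equating design. It assumes a much simpler one-dimensional normal-normal model, and the function name `simulate` and the parameters `n_examinees`, `error_sd`, and `seed` are hypothetical choices made only for illustration. It compares the spread of posterior-mean (EAP) point estimates with the spread of one plausible value drawn per examinee.

```python
import random
import statistics


def simulate(n_examinees=20000, error_sd=0.8, seed=0):
    """Compare the SD of EAP point estimates with the SD of plausible values.

    Toy model (an assumption, not the paper's design):
    theta ~ N(0, 1), observed score x = theta + e, e ~ N(0, error_sd**2).
    """
    rng = random.Random(seed)
    shrink = 1.0 / (1.0 + error_sd ** 2)        # posterior weight given to x
    post_sd = (shrink * error_sd ** 2) ** 0.5   # posterior SD of theta given x

    eap, pv = [], []
    for _ in range(n_examinees):
        theta = rng.gauss(0.0, 1.0)             # true ability
        x = theta + rng.gauss(0.0, error_sd)    # noisy observed score
        post_mean = shrink * x
        eap.append(post_mean)                   # point estimate (posterior mean)
        pv.append(rng.gauss(post_mean, post_sd))  # one draw from the posterior
    return statistics.pstdev(eap), statistics.pstdev(pv)


sd_eap, sd_pv = simulate()
print(f"SD of EAP estimates:    {sd_eap:.3f}")
print(f"SD of plausible values: {sd_pv:.3f}")
```

In this toy model the EAP estimates have a theoretical standard deviation of sqrt(1 / (1 + 0.8^2)), roughly 0.78, so they understate the true population value of 1, while the plausible values have a theoretical standard deviation of exactly 1. Operational programs draw several plausible values per examinee from a posterior that also conditions on background variables, which this sketch leaves out.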
Journal articles
1.Mislevy, R. J.(1991)。Randomization-based inference about latent variables from complex samples。Psychometrika,56(2),177-196。
2.Mislevy, R. J.(1984)。Estimating Latent Distributions。Psychometrika,49,359-381。
3.Van Der Linden, W. J.、Veldkamp, B. P.、Carlson, J. E.(2004)。Optimizing Balanced Incomplete Block Designs for Educational Assessments。Applied Psychological Measurement,28,317-331。
4.許天維、郭伯臣、吳慧珉、葉昶成(2013)。單向度試題反應理論之可能值方法於等化設計下之模擬實驗探究。測驗統計年刊,21(下),1-24。
5.Ito, K.、Sykes, R. C.、Yao, L.(2008)。Concurrent and separate grade-groups linking procedures for vertical scaling。Applied Measurement in Education,21,187-206。
6.Reckase, M. D.、Mckinley, R. L.(1991)。The Discriminating Power of Items That Measure More Than One Dimension。Applied Psychological Measurement,15(4),361-373。
7.陳柏熹(2006)。能力估計方法對多向度電腦化適性測驗測量精準度的影響。教育心理學報,38(2),195-211。
8.Adams, R. J.、Wilson, M. R.、Wu, M. L.(1997)。Multilevel item response models: An approach to errors in variables regression。Journal of Educational and Behavioral Statistics,22(1),47-76。
9.de la Torre, J.、Song, H.(2009)。Improving the quality of ability estimates through multidimensional scoring and incorporation of ancillary variables。Applied Psychological Measurement,33,465-485。
10.Kim, S. H.、Cohen, A. S.(1998)。A Comparison of Linking and Concurrent Calibration Under Item Response Theory。Applied Psychological Measurement,22,131-143。
11.Lord, F. M.(1983)。Unbiased estimators of ability parameters, of their variance, and of their parallel-forms reliability。Psychometrika,48,233-245。
12.Mislevy, R. J.、Johnson, E. G.、Muraki, E.(1992)。Scaling procedures in NAEP。Journal of Educational Statistics,17(2),131-154。
13.Mislevy, R. J.、Beaton, A. E.、Kaplan, B.、Sheehan, K. M.(1992)。Estimating population characteristics from sparse matrix samples of item responses。Journal of Educational Measurement,29(2),133-161。
14.Wu, Margaret(2005)。The role of plausible values in large-scale surveys。Studies in Educational Evaluation,31(2/3),114-128。
15.Von Davier, M.、Gonzalez, E.、Mislevy, R. J.(2009)。What are plausible values and why are they useful?。IERI Monograph Series: Issues and Methodologies in Large-Scale Assessments,2(1),9-36。
16.Warm, T. A.(1989)。Weighted likelihood estimation of ability in item response theory。Psychometrika,54(3),427-450。
17.Mckinley, R. L.、Reckase, M. D.(1983)。MAXLOG: A computer program for the estimation of the parameters of a multidimensional logistic model。Behavior Research Methods & Instrumentation,15,389-390。
18.郭伯臣、王暄博(2008)。大型測驗中同時進行垂直與水平等化效果之探討。教育研究與發展期刊,4(4),87-119。
19.Adams, Raymond J.、Wilson, Mark R.、Wang, Wen-chung(1997)。The multidimensional random coefficients multinomial logit model。Applied Psychological Measurement,21(1),1-23。
20.Mislevy, R. J.、Sheehan, K. M.(1989)。Information matrices in latent-variable models。Journal of Educational Statistics,14(4),335-350。
Conference papers
1.Sympson, J. B.(1978)。A model for testing with the multidimensional items。1977 Computerized Adaptive Testing Conference。Minneapolis, MN:University of Minnesota, Department of Psychology, Psychometric Methods Program。82-98。
Theses and dissertations
1.黃珮璇(2007)。BIB、PBIB與NEAT設計於多元計分測驗之連結效果比較(碩士論文)。國立臺中教育大學,臺中市。
Books
1.Glas, C. A. W.、Geerlings, H.(2009)。A study of structural modeling using plausible value imputation。Law School Admission Council。
2.Hattie, J.(1981)。Decision criteria for determining unidimensional and multidimensional normal ogive models of latent trait theory。Armidale, Australia:The University of New England:Center for Behavioral Studies。
3.Nemhauser, George L.、Wolsey, Laurence A.(1999)。Integer and combinatorial optimization。New York, NY:John Wiley & Sons。
4.郭伯臣、曾建銘、吳慧珉(2012)。大型標準化測驗建置流程應用於TASA之研究。新北市:國家教育研究院。
5.Organisation for Economic Co-operation and Development(2009)。PISA 2006 technical report。OECD。
6.Reckase, M. D.(2009)。Multidimensional item response theory。New York, NY:Springer。
7.Allen, N. L.、Donoghue, J. R.、Schoeps, T. L.、National Center for Education Statistics(2001)。The NAEP 1998 technical report。Washington, DC:National Assessment Governing Board, U.S. Department of Education。
8.余民寧(2009)。試題反應理論(IRT)及其應用。心理出版社。
9.Kolen, M. J.、Brennan, R. L.(1995)。Test Equating: Methods and Practices。New York:Springer-Verlag。
Other
1.Wu, M. L.、Adams, R. J.、Wilson, M. R.、Haldane, A. H.(2007)。ACER ConQuest 2.0。Hawthorn:ACER。
Book chapters
1.Weiss, A. R.、Schoeps, T. L.(2001)。Assessment frameworks and instruments for the 1998 civics Assessment。The NAEP 1998 technical report。Washington, DC:National Center for Education Statistics。
2.Foy, P.、Galia, J.、Li, L.(2008)。Scaling the data from the TIMSS 2007 Mathematics and Science assessments。TIMSS 2007 Technical Report。TIMSS & PIRLS International Study Center:Lynch School of Education:Boston College。
:::