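Vertical Equating of Multidimensional Ability Estimates Based on the Plausible Value Method__Taiwan Citation Index - Humanities and Social Sciences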

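Title:	Vertical Equating of Multidimensional Ability Estimates Based on the Plausible Value Method (以可能值方法為基礎之多向度能力值垂直等化探究)
Journal:	測驗學刊 (Psychological Testing)
Authors:	吳慧珉／郭伯臣／許天維／陳婉寧
Authors (romanized):	Wu, Huey-min／Kuo, Bor-chen／Sheu, Tian-wei／Chen, Wan-ning
Publication date:	2015
Volume/issue:	62:2
Pages:	95-126
Keywords:	Large-scale assessments；MIRT (multidimensional item response theory)；Plausible value method；Trait estimation；Vertical equating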
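Citation counts:	Cited by: journal articles (2), doctoral dissertations (1), books (0), book chapters (0); excluding self-citations: 2; co-citations: 13; views: 217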

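Several well-known international large-scale assessments now report population parameters using the plausible value method, because the method recovers population parameters very well and population parameters are precisely what large-scale assessments focus on. Large-scale assessments are usually built for the long-term evaluation of educational outcomes, so examining whether students differ in certain abilities across grade levels is an issue that deserves attention. With vertical equating, examinees in different grades each take items suited to their ability range, and the results are then placed on a common scale so that ability levels can be compared. Based on multidimensional item response theory, this study uses a vertical equating design to examine how the number of items and the number of dimensions affect the estimation of ability parameters, and compares the plausible value method with other estimation methods. The results show that the plausible value method estimates the population standard deviation with very high precision, whereas its estimates of the population mean ability are about the same as those of the other methods. Under the multidimensional vertical equating design, estimation is better when each dimension has more items.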

The purpose of large-scale assessment is to monitor group progress, so group statistics are what large-scale assessments focus on. The plausible value method has been proposed as a method that measures population statistics accurately, and some major large-scale assessment programs use it to report student achievement data. Vertical equating is the approach test publishers use to evaluate achievement longitudinally across grade levels. This research analyzes whether (1) the method used to estimate parameters and (2) the number of items for each dimension affect the recovery of group-level ability parameters, based on multidimensional item response theory (MIRT) with a vertical equating design. The results indicate that the plausible value method recovers the population standard deviation very well but is not outstanding in recovering the population means. Under the MIRT vertical equating design, parameters are estimated better when the number of items per dimension is larger.
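
The contrast the abstract draws between point estimates and plausible values can be illustrated with a toy simulation. The sketch below is not the authors' simulation design: it assumes a unidimensional normal-normal model instead of MIRT, treats the population mean and standard deviation as known, and uses hypothetical names (`simulate`, `error_sd`) and parameter values chosen only for illustration. It compares posterior-mean (EAP) estimates with single posterior draws (plausible values).

```python
import random
import statistics


def simulate(n_persons=20000, mu=0.5, sigma=1.0, error_sd=0.8, seed=1):
    """Compare EAP estimates with plausible values in a normal-normal toy model.

    All parameter values are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    # With a normal prior and normal measurement error, the posterior of
    # theta given an observed score x is normal with this variance.
    post_var = 1.0 / (1.0 / sigma**2 + 1.0 / error_sd**2)
    post_sd = post_var ** 0.5

    eap, pv = [], []
    for _ in range(n_persons):
        theta = rng.gauss(mu, sigma)            # true ability
        x = theta + rng.gauss(0.0, error_sd)    # noisy observed score
        post_mean = post_var * (mu / sigma**2 + x / error_sd**2)
        eap.append(post_mean)                     # point estimate (posterior mean)
        pv.append(rng.gauss(post_mean, post_sd))  # one plausible value (posterior draw)

    return {
        "eap_mean": statistics.fmean(eap),
        "eap_sd": statistics.pstdev(eap),
        "pv_mean": statistics.fmean(pv),
        "pv_sd": statistics.pstdev(pv),
    }


result = simulate()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the EAP estimates are shrunk toward the population mean, so their standard deviation falls below the true value, while the plausible values keep roughly the full population spread; both recover the mean about equally well. A larger `error_sd` stands in only loosely for having fewer items per dimension.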

Journal articles
1.	Mislevy, R. J.(1991)。Randomization-based inference about latent variables from complex samples。Psychometrika，56(2)，177-196。
2.	Mislevy, R. J.(1984)。Estimating Latent Distributions。Psychometrika，49，359-381。
3.	Van Der Linden, W. J.、Veldkamp, B. P.、Carlson, J. E.(2004)。Optimizing Balanced Incomplete Block Designs for Educational Assessments。Applied Psychological Measurement，28，317-331。
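4.	許天維、郭伯臣、吳慧珉、葉昶成(2013)。A simulation study of the plausible value method for unidimensional item response theory under equating designs [單向度試題反應理論之可能值方法於等化設計下之模擬實驗探究]。測驗統計年刊，21(下)，1-24。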
5.	Ito, K.、Sykes, R. C.、Yao, L.(2008)。Concurrent and separate grade-groups linking procedures for vertical scaling。Applied Measurement in Education，21，187-206。
6.	Reckase, M. D.、Mckinley, R. L.(1991)。The Discriminating Power of Items That Measure More Than One Dimension。Applied Psychological Measurement，15(4)，361-373。
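7.	陳柏熹(2006)。The effect of ability estimation methods on the measurement precision of multidimensional computerized adaptive testing [能力估計方法對多向度電腦化適性測驗測量精準度的影響]。教育心理學報，38(2)，195-211。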
8.	Adams, R. J.、Wilson, M. R.、Wu, M. L.(1997)。Multilevel item response models: An approach to errors in variables regression。Journal of Educational and Behavioral Statistics，22(1)，47-76。
9.	de la Torre, J.、Song, H.(2009)。Improving the quality of ability estimates through multidimensional scoring and incorporation of ancillary variables。Applied Psychological Measurement，33，465-485。
10.	Kim, S. H.、Cohen, A. S.(1998)。A Comparison of Linking and Concurrent Calibration Under Item Response Theory。Applied Psychological Measurement，22，131-143。
11.	Lord, F. M.(1983)。Unbiased estimators of ability parameters, of their variance, and of their parallel-forms reliability。Psychometrika，48，233-245。
12.	Mislevy, R. J.、Johnson, E. G.、Muraki, E.(1992)。Scaling procedures in NAEP。Journal of Educational Statistics，17(2)，131-154。
13.	Mislevy, R. J.、Beaton, A. E.、Kaplan, B.、Sheehan, K. M.(1992)。Estimating population characteristics from sparse matrix samples of item responses。Journal of Educational Measurement，29(2)，133-161。
14.	Wu, Margaret(2005)。The role of plausible values in large-scale surveys。Studies in Educational Evaluation，31(2/3)，114-128。
15.	Von Davier, M.、Gonzalez, E.、Mislevy, R. J.(2009)。What are plausible values and why are they useful?。IERI Monograph Series: Issues and Methodologies in Large-Scale Assessments，2(1)，9-36。
16.	Warm, T. A.(1989)。Weighted likelihood estimation of ability in item response theory。Psychometrika，54(3)，427-450。
17.	Mckinley, R. L.、Reckase, M. D.(1983)。MAXLOG: A computer program for the estimation of the parameters of a multidimensional logistic model。Behavior Research Methods & Instrumentation，15，389-390。
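18.	郭伯臣、王暄博(2008)。An investigation of the effects of conducting vertical and horizontal equating simultaneously in large-scale assessments [大型測驗中同時進行垂直與水平等化效果之探討]。教育研究與發展期刊，4(4)，87-119。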
19.	Adams, Raymond J.、Wilson, Mark R.、Wang, Wen-chung(1997)。The multidimensional random coefficients multinomial logit model。Applied Psychological Measurement，21(1)，1-23。
20.	Mislevy, R. J.、Sheehan, K. M.(1989)。Information matrices in latent-variable models。Journal of Educational Statistics，14(4)，335-350。

Conference papers
1.	Sympson, J. B.(1978)。A model for testing with multidimensional items。1977 Computerized Adaptive Testing Conference。Minneapolis, MN：University of Minnesota, Department of Psychology, Psychometric Methods Program。82-98。

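Theses and dissertations
1.	黃珮璇(2007)。A comparison of the linking effects of BIB, PBIB, and NEAT designs in polytomously scored tests [BIB、PBIB與NEAT設計於多元計分測驗之連結效果比較] (master's thesis)。National Taichung University of Education，Taichung。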

Books
1.	Glas, C. A. W.、Geerlings, H.(2009)。A study of structural modeling using plausible value imputation。Law School Admission Council。
2.	Hattie, J.(1981)。Decision criteria for determining unidimensional and multidimensional normal ogive models of latent trait theory。Armidale, Australia：The University of New England：Center for Behavioral Studies。
3.	Nemhauser, George L.、Wolsey, Laurence A.(1999)。Integer and combinatorial optimization。New York, NY：John Wiley & Sons。
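4.	郭伯臣、曾建銘、吳慧珉(2012)。A study on applying the development process of large-scale standardized tests to TASA [大型標準化測驗建置流程應用於TASA之研究]。New Taipei City：National Academy for Educational Research。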
5.	Organisation for Economic Co-operation and Development(2009)。PISA 2006 technical report。OECD。
6.	Reckase, M. D.(2009)。Multidimensional item response theory。New York, NY：Springer。
7.	Allen, N. L.、Donoghue, J. R.、Schoeps, T. L.、National Center for Education Statistics(2001)。The NAEP 1998 technical report。Washington, DC：National Assessment Governing Board, U.S. Department of Education。
8.	余民寧(2009)。Item response theory (IRT) and its applications [試題反應理論（IRT）及其應用]。Psychological Publishing (心理出版社)。
9.	Kolen, M. J.、Brennan, R. L.(1995)。Test Equating: Methods and Practices。New York：Springer-Verlag。

Other
1.	Wu, M. L.，Adams, R. J.，Wilson, M. R.，Haldane, S. A.(2007)。ACER ConQuest 2.0，Hawthorn：ACER。

Book chapters
1.	Weiss, A. R.、Schoeps, T. L.(2001)。Assessment frameworks and instruments for the 1998 civics Assessment。The NAEP 1998 technical report。Washington, DC：National Center for Education Statistics。
2.	Foy, P.、Galia, J.、Li, L.(2008)。Scaling the data from the TIMSS 2007 Mathematics and Science assessments。TIMSS 2007 Technical Report。TIMSS & PIRLS International Study Center：Lynch School of Education：Boston College。
