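Title: 試題呈現與回饋模式對Angoff標準設定結果一致性提升效益之比較研究 (A comparative study of how item presentation and feedback modes improve the consistency of Angoff standard-setting results)
Journal: 教育研究與發展期刊 (Journal of Educational Research and Development)
Authors: 吳宜芳、鄒慧英
Authors (romanized): Wu, Yi-fang、Tzou, Hueying
Publication date: 2010
Volume/issue: 6(4)
Pages: 47-80
Keywords: Angoff法、Reckase表、試題預先分類、標準設定; Angoff method, Item-grouping, Reckase charts, Standard setting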
Related counts:
  • Times cited: journal articles (2), doctoral dissertations (0), books (1), book chapters (0)
  • Excluding self-citations: 2
  • Co-citations: 14
  • Views: 49
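Among the many standard-setting methods, the Angoff method, together with its variations, extensions, and modified procedures, is a widely used standard-setting process in educational practice. However, panelists applying the Angoff method face a considerable cognitive challenge when they conceptualize minimally competent examinees and estimate the probability that such examinees answer each item correctly. The influence of item characteristics (e.g., item difficulty) on inter-panelist or intra-panelist consistency may affect the validity of the resulting standard. Accordingly, this study incorporated three practices into the modified Angoff standard-setting procedure: feedback based on items ordered by empirical p-values, feedback based on Reckase charts, and the presentation of items with or without prior grouping. The aim was to improve the consistency of the results and to compare the relative merits of these practices.
The study was conducted after test administration and is therefore of the post hoc decision type. It examines how different feedback modes and grouped versus ungrouped item presentation affect standard-setting results, so that the two practices can be compared; this is what distinguishes the study. In addition, the two modifications are expected to make panelists more aware of item difficulty, thereby improving inter-panelist and intra-panelist consistency, raising the consistency of the results, and supporting the validity of the standard; this is the study's practical contribution.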
Numerous standard-setting methods have been developed to help panels estimate the performance of borderline examinees. Among them, the Angoff method is one of the most popular judgmental procedures, and its extensions, modifications, and variations are often applied in practice. Panelists play an important role in standard setting, especially in judgmental methods such as the Angoff method and its variations. Their ability to estimate borderline examinees' performance accurately depends to some extent on item difficulty, and once that accuracy is questioned, the validity of the performance standard is undermined. A variety of procedures and several types of feedback have therefore been developed to reduce inconsistency among panelists or within a single panelist. To compare different procedures embedded in the modified Angoff method for establishing cutoff scores on a large-scale achievement assessment, we designed two standard-setting activities that integrated different procedures to help panelists make more accurate estimates. Two data sets from a national mathematics achievement assessment in Taiwan were used. Each set contained 104 operational multiple-choice items measuring students' grade-level math ability. Twelve panelists took part in the 4th-grade activity and 14 in the 6th-grade activity. All were math educators, and some had prior experience with modified Angoff procedures. The design crossed two factors with two conditions each: whether test items were grouped in advance, and the type of feedback, either empirical p-values or IRT calibration with Reckase charts (Reckase, 1998, 2001). We used a generalizability analysis design to examine how much each of these procedures improved consistency.
The effects of interest were the item effect, the item difficulty effect (both within and between difficulty levels), and the panelist effect. First, the percentage of variance attributable to the item effect increased consistently from Round 1 to Round 3, while the percentage attributable to the panelist effect decreased as the rounds progressed. Panelists thus became more consistent, and relatively more panelist variability was eliminated under feedback with Reckase charts. Second, with or without item-grouping, panelists came to make similar performance estimates for items of similar difficulty as the rounds progressed. Finally, combining item-grouping with Reckase-chart feedback produced the greatest improvement in intra-judge consistency: under this condition the root mean square error estimates were the smallest, and the generalizability coefficients and intraclass correlation coefficients (ICCs) were the highest. Panelists are capable of distinguishing hard items from easy ones; however, grouping items by difficulty and providing Reckase-chart feedback reduced as far as possible the variability induced by item difficulty, which affects panelists' consistency. This finding helps in defending the validity of the standard.
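As a rough illustration of the quantities the abstract refers to, the sketch below computes an Angoff cut score and ANOVA-based variance components for a fully crossed panelists × items table with one rating per cell. The ratings, the function names, and the choice to treat items as the object of measurement in the generalizability coefficient are assumptions made for illustration only; the study's actual design (which also separates within-level and between-level item difficulty effects) and its data are not reproduced here.

```python
from statistics import mean

# Hypothetical ratings: rows are panelists, columns are items. Each value is a
# panelist's estimate of the probability that a borderline examinee answers
# the item correctly. These numbers are invented for illustration.
RATINGS = [
    [0.6, 0.4, 0.8, 0.5],
    [0.7, 0.5, 0.9, 0.5],
    [0.5, 0.3, 0.7, 0.5],
]


def angoff_cut_score(ratings):
    """Average, over panelists, of each panelist's summed item ratings."""
    return mean(sum(row) for row in ratings)


def variance_components(ratings):
    """Estimate variance components for a crossed panelist x item design.

    Uses the usual expected-mean-square equations for a two-way random
    effects ANOVA with one observation per cell. Negative estimates are
    truncated at zero, which is a common convention.
    """
    n_p, n_i = len(ratings), len(ratings[0])
    grand = mean(x for row in ratings for x in row)
    p_means = [mean(row) for row in ratings]
    i_means = [mean(row[j] for row in ratings) for j in range(n_i)]

    ms_p = n_i * sum((m - grand) ** 2 for m in p_means) / (n_p - 1)
    ms_i = n_p * sum((m - grand) ** 2 for m in i_means) / (n_i - 1)
    ss_res = sum(
        (ratings[p][j] - p_means[p] - i_means[j] + grand) ** 2
        for p in range(n_p)
        for j in range(n_i)
    )
    ms_res = ss_res / ((n_p - 1) * (n_i - 1))

    return {
        "panelist": max((ms_p - ms_res) / n_i, 0.0),
        "item": max((ms_i - ms_res) / n_p, 0.0),
        "residual": ms_res,  # interaction confounded with error
    }


def g_coefficient(components, n_panelists):
    """Relative G coefficient with items as the object of measurement."""
    error = components["residual"] / n_panelists
    return components["item"] / (components["item"] + error)


if __name__ == "__main__":
    vc = variance_components(RATINGS)
    total = sum(vc.values())
    print("cut score:", angoff_cut_score(RATINGS))
    for name, value in vc.items():
        print(f"{name}: {value:.5f} ({100 * value / total:.1f}% of total)")
    print("G coefficient:", g_coefficient(vc, len(RATINGS)))
```

Under this framing, a larger share of variance for the item component and a smaller share for the panelist component across rounds would correspond to the pattern of improved consistency that the abstract reports.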
Journal articles
1.鄭明長、余民寧(1994)。各種通過分數設定方法之比較 [A comparison of methods for setting passing scores]。測驗年刊,41,19-40。
2.吳裕益(1988)。標準參照測驗通過分數設定方法之研究 [A study of methods for setting passing scores on criterion-referenced tests]。測驗年刊,35,159-165。
3.Ferdous, A. A.、Plake, B. S.(2005)。Understanding the factors that influence decisions of panelists in a standard-setting study。Applied Measurement in Education,18(3),257-267。
4.Impara, J. C.、Plake, B. S.(1997)。Standard setting: An alternative approach。Journal of Educational Measurement,34(4),353-366。
5.Brandon, P. R.(2004)。Conclusions about frequently studied modified Angoff standard-setting topics。Applied Measurement in Education,17(1),59-88。
6.Buckendahl, C. W.、Smith, R. W.、Impara, J. C.、Plake, B. S.(2002)。A comparison of Angoff and Bookmark standard setting methods。Journal of Educational Measurement,39(3),253-263。
7.Kane, M. T.(1994)。Validating the performance standards associated with passing scores。Review of Educational Research,64(3),425-461。
8.Maurer, T. J.、Alexander, R. A.、Callahan, C. M.、Bailey, J. J.、Dambrot, F. H.(1991)。Methodological and psychometric issues in setting cutoff scores using the Angoff method。Personnel Psychology,44,235-262。
9.Plake, B. S.、Melican, G. J.、Mills, C. N.(1991)。Factors influencing intrajudge consistency during standard-setting。Educational Measurement: Issues and Practice,10(2),15-25。
10.Clauser, B. E.、Swanson, D. B.、Harik, P.(2002)。Multivariate generalizability analysis of the impact of training and examinee performance information on judgments made in an Angoff-style standard-setting procedure。Journal of Educational Measurement,39(4),269-290。
11.van der Linden, W. J.(1982)。A latent trait method for determining intrajudge inconsistency in the Angoff and Nedelsky techniques of standard setting。Journal of Educational Measurement,19(4),295-308。
12.van der Linden, W. J.(1986)。A latent trait method for determining intrajudge inconsistency in the Angoff and Nedelsky techniques of standard setting (Addendum)。Journal of Educational Measurement,23(3),265-266。
13.Cizek, G. J.、Bunch, M. B.、Koons, H.(2004)。Setting performance standards: Contemporary methods。Educational Measurement: Issues and Practice,23(4),31-50。
14.Berk, R. A.(1986)。A consumer's guide to setting performance standards on criterion-referenced tests。Review of Educational Research,56(1),137-172。
15.Taube, K. T.(1997)。The incorporation of empirical item difficulty data into the Angoff standard setting procedure。Evaluation & the Health Professions,20(4),479-498。
16.Plake, B. S.、Impara, J. C.(2001)。Ability of panelists to estimate item performance for a target group of candidates: An issue in judgmental standard setting。Educational Assessment,7(2),87-97。
17.Goodwin, L. D.(1999)。Relations between observed item difficulty levels and Angoff minimum passing levels for a group of borderline examinees。Applied Measurement in Education,12,13-28。
18.Jaeger, R. M.(1995)。Setting performance standards through two-stage judgmental policy capturing。Applied Measurement in Education,8(1),15-40。
19.Kane, M.(1987)。On the use of IRT models with judgmental standard setting procedures。Journal of Educational Measurement,24(4),333-345。
20.Lorge, I.、Kruglov, L. K.(1953)。The improvement of the estimates of test difficulty。Educational and Psychological Measurement,13,34-46。
21.MacCann, R. G.、Stanley, G.(2006)。The use of Rasch modeling to improve standard setting。Practical Assessment, Research & Evaluation,11(2)。
22.Plake, B. S.、Melican, G. J.(1989)。Effects of item context on intrajudge consistency of expert judgments via the Nedelsky standard setting method。Educational and Psychological Measurement,49(1),45-51。
23.Reckase, M. D.(2006)。Some criteria for evaluating the functioning of standard-setting methods with application to bookmark and modified Angoff methods。Educational Measurement: Issues and Practice,25(2),4-18。
24.Schraw, G.、Roedel, T. D.(1994)。Test difficulty and judgment bias。Memory and Cognition,22(1),63-69。
25.Sireci, S. G.、Biskin, B. H.(1992)。A survey of national professional licensure examination programs。CLEAR Exam Review,3,21-25。
26.Smith, R. L.、Smith, J. K.(1988)。Differential use of item information by judges using Angoff and Nedelsky procedures。Journal of Educational Measurement,25(4),259-274。
27.Verhoeven, B. H.、van der Steeg, A. F. W.、Scherpbier, A. F. F. A.、Muijtjens, A. M. M.、Verwijnen、van der Vleuten, C. P. M.(1999)。Reliability and credibility of an Angoff standard setting procedure in progress testing using recent graduates as judges。Medical Education,33,832-837。
Conference papers
1.Reckase, M. D.(2000)。The ACT/NAGB standard setting process: How "modified" does it have to be before it is no longer a modified-Angoff process?。New Orleans, LA。
2.Shepard, L. A.(1995)。Implications for standard setting of the National Academy of Education evaluation of National Assessment of Educational Progress achievement levels。Washington, DC。
Theses and dissertations
1.吳裕益(1986)。標準參照測驗通過分數設定方法之研究 [A study of methods for setting passing scores on criterion-referenced tests],Taipei。
2.Pitoniak, M. J.(2003)。Standard setting methods for complex licensure examinations(Doctoral dissertation)。University of Massachusetts,Amherst, MA。
3.Matter, J. D.(2000)。Investigation of the validity of the Angoff standard setting procedure for multiple-choice items,Amherst, MA。
Books
1.Hambleton, R. K.、Pitoniak, M. J.(2006)。Setting performance standards。Educational measurement。Washington, DC:American Council on Education。
2.Hambleton, R. K.(2001)。Setting performance standards on educational assessments and criteria for evaluating the process。Setting performance standards: Concepts, methods, and perspectives。Mahwah, NJ:Lawrence Erlbaum Associates。
3.Raymond, M. R.、Reid, J. B.(2001)。Who made thee a judge? Selecting and training participants for standard setting。Setting performance standards: Concepts, methods, and perspectives。Mahwah, NJ:Lawrence Erlbaum Associates。
4.Shepard, L.、Glaser, R.、Linn, R.、Bohrnstedt, G.(1993)。Setting performance standards for student achievement tests。Stanford, CA:National Academy of Education。
5.Reckase, M. D.(1998)。Setting standards to be consistent with an IRT item calibration。Iowa City, IA:ACT。
6.Allen, N. L.、Jenkins, F.、Kulick, E.、Zelenak, C. A.(1997)。Technical report of the NAEP 1996 state assessment program in mathematics。Washington, DC:National Center for Education Statistics。
7.American Psychological Association、American Educational Research Association、National Council on Measurement in Education(1999)。Standards for educational and psychological testing。Washington, DC:American Psychological Association。
8.Cizek, G. J.、Bunch, M. B.(2007)。Standard setting: A guide to establishing and evaluating performance standards on tests。Sage。
9.Cizek, G. J.(2001)。Conjectures on the rise and call of standard setting: An introduction to context and practice。Setting performance standards: Concepts, methods, and perspectives。Mahwah, NJ。
10.Cizek, G. J.(2006)。Standard setting。Handbook of test development。Mahwah, NJ。
11.McLaughlin, D. H.(1993)。Validity of the 1992 NAEP achievement-level setting process。Setting performance standards for student achievement tests: Background studies。Stanford, CA。
12.National Assessment Governing Board(2006)。Writing framework and specifications for the 2007 National Assessment of Educational Progress。Washington, DC。
13.Reckase, M. D.(2001)。Innovative methods for helping standard-setting participants to perform their task: The role of feedback regarding consistency, accuracy and impact。Setting performance standards: Concepts, methods, and perspectives。Mahwah, NJ。
Other
1.National Assessment Governing Board(2006)。Writing framework and specifications for the 2007 National Assessment of Educational Progress,Washington, DC。
2.Wuensch, K. L.(2003)。Inter-rater agreement,http://core.ecu.edu/psyc/wuenschk/docs30/InterRater.doc。
Book chapters
1.Angoff, W. H.(1971)。Scales, norms, and equivalent scores。Educational measurement。Washington, DC:American Council on Education。