|
Ackerman, T. A. (1989). Unidimensional IRT calibration of compensatory and noncompensatory multidimensional items. Applied Psychological Measurement, 13(2), 101–125. Baker, F. B. (1992). Item response theory: Parameter estimation techniques. New York: Marcel Dekker. Baker, F. B., & Kim, S. H. (2004). Item response theory: Parameter estimation techniques. New York: Marcel Dekker. Bar-Hillel, M., Budescu, D., & Attali, Y. (2005). Scoring and keying multiple choice tests: A case study in irrationality. Mind & Society, 4(1), 3–12. Barton, M. A., & Lord, F. M. (1981). An upper asymptote for the three-parameter logistic item-response model. Princeton, NJ: Educational Testing Services. Bimbaum, A. (1968). Some latent trait models and their use in inferring an examinee's ability. In F. M. Lord & M. R. Novick (Eds.), Statistical Theories of Mental Test Scores (pp. 397–479). London: Addison Wesley. Bowles, R., & Pommerich, M. (2001, April). An examination of item review on a CAT using the specific information item selection algorithm. Paper presented at the the annual meeting of the National Council on Measurement in Education, Seattle WA. Brown, J. D. (1997). Computers in language testing: Present research and some future directions. Language Learning & Technology, 1(1), 44–59. Burton, R. F. (2002). Misinformation, partial knowledge and guessing in true/false tests. Medical Education, 36(9), 805–811. Carr, M. J., & Pauwels, A. (2006). Boys and foreign language learning : Real boys don't do languages. New York: Palgrave Macmillan. Chang, H. H. (2004). Understanding computerized adaptive testing: From Robbins-Munro to Lord and beyond. In D. Kaplan (Ed.), The Sage handbook of quantitative methodology for the social sciences. (pp. 117–133). New York: Sage. Chen, L. J. (2009). Effects of block-review and rearrangement computerized adaptive test on ability estimation and test anxiety. Unpublished doctoral dissertation, National Taiwan Normal University, Taiwan. Coombs, C. H., & Womer, F. B. (1956). The assessment of partial knowledge. Educational and Psychological Measurement, 16(1), 13–37. Dunkel, P. A. (1997). Computer-adaptive testing of listening comprehension: A blueprint for CAT development. The Language Teacher, 21, 7–14. Educational Testing Service. (2000). The computer-based TOEFL score user guide. Princeton, NJ: Author. Ehrman, M., & Oxford, R. (1988). Effects of sex differences, career choice, and psychological type on adult language learning strategies. Modern Language Journal, 73(1), 3–13. Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum Associates. Gardner-Medwin, A. R., & Gahan, M. (2003). Formative and summative confidence-based assessment. Paper presented at the Proceedings of the 7th International Computer-Aided Assessment Conference, Loughborough, UK. Gershon, R. C., & Bergstrom, B. (1995, April). Does cheating on CAT pay: NOT! Paper presented at the the annual meeting of the American Educational Research Association, San Francisco. Hambleton, R. K., Rogers, H. J., & Swaminathan, H. (1995). Fundamentals of item response theory. Newbury Park: Sage. Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Boston: Kluwer Nijhoff. Harvey, R. J., & Hammer, A. L. (1999). Item response theory. The Counseling Psychologist, 27(3), 353–384. Harvil, L. M., & Davis III, G. (1997). Medical students' reasons for changing answers on multiple-choice tests. Academic Medicine, 72(10 Suppl 1), S97–S99. Harwell, M., Stone, C. A., Hsu, T. C., & Kirisci, L. (1996). Monte Carlo studies in item response theory. Applied Psychological Measurement, 20(2), 101–125. Heidenberg, A. J., & Layne, B. H. (2000). Answer changing: A conditional argument. College Student Journal, 34(3), 440–450. Ho, R.-G. (1989). Computerized adaptive testing. Psychological Testing, 36, 117–130. Ho, R.-G., & Yen, Y.-C. (2005). Design and evaluation of an XML-based platform-independent computerized adaptive testing system. IEEE Transactions on Education, 48(2), 230–237. Hsu, T. C., & Sadock, S. F. (1985). Computer-assisted test construction: A state of art. TME report 88, Princeton, New Jersey, Eric on Test. Measurement, and Evaluation, Educational Testing Service. Huff, K. L., & Sireci, S. G. (2001). Validity issues in computer-based testing. Educational Measurement: Issues and Practice, 20(3), 16–25. Hyde, J. S., & Linn, M. C. (1988). Gender differences in verbal ability: A meta-analysis. Psychological Bulletin, 104(1), 53–69. Jackson, R. A. (1955). Guessing and test performance. Educational and Psychological Measurement, 15(1), 74–79. Kingsbury, G. G. (1996). Item review and adaptive testing. Paper presented at the the annual meeting of the National Council on Measurement in Education, New York, NY. Kissau, S. (2006a). Gender differences in motivation to learn French. Canadian Modern Language Review, 62(3), 401–422. Kissau, S. (2006b). Gender differences in second language motivation: An investigation of micro- and macro-level influences. Canadian Journal of Applied Linguistics, 9(1), 73–96. Kissau, S., & Turnbull, M. (2008). Boys and French as a second language: A research agenda for greater understanding. Canadian Journal of Applied Linguistics, 11(3), 151–170. Kreitzberg, C., & Jones, D. (1980). An empirical study of the broad range tailored test of verbal ability. research report. RR-80-5. Princeton, NJ: Educational Testing Service. Larson, J. W., & Madsen, H. S. (1985). Computerized adaptive language testing: moving beyond computer-assisted testing. CALICO Journal, 2(3), 32–36. Lord, F. M. (1970). Some test theory for tailored testing. In W. H. Holzman (Ed.), Computer assisted instruction, testing, and guidance. New York: Harper and Row. Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum Associates. Lord, F. M. (1983). Unbiased estimators of ability parameters, of their variance, and of their parallel-forms reliability. Psychometrika, 48(2), 233–245. Lord, F. M., & Novick, M. R. (1968). Theory of mental test scores. Reading, MA: Addison-Wesley. Luecht, R. M., & Hirsch, T. M. (1992). Item selection using an average growth approximation of target information functions. Applied Psychological Measurement, 16(1), 41–51. Lunz, M., Bergstrom, B., & Wright, B. (1992). The effect of review on student ability and test efficiency for computerized adaptive tests. Applied Psychological Measurement, 16(1), 33. McBride, J. R., Wetzel, C. D., & Hetter, R. D. (1997). Preliminary psychometric research for CAT-ASVAB: Selecting an adaptive testing strategy. In W. A. Sands, B. K. Waters & J. R. McBride (Eds.), Computerized adaptive testing: From inquiry to operation (pp. 83–95). Washington, DC: American Psychological Association. McMorris, R. F., DeMers, L. P., & Schwarz, S. P. (1987). Attitudes, behaviors, and reasons for changing responses following answer-changing instruction. Journal of Educational Measurement, 131–143. McMorris, R. F., & Weideman, A. H. (1986). Answer changing after instruction on answer changing. Measurement and Evaluation in Counseling and Development, 19(2), 93–101. Mills, C. N., & Stocking, M. L. (1995). Practical issues in large-scale high-stakes computerized adaptive testing. Princeton, NJ: Educational Testing Service. Morisset, C. E., Barnard, K. E., & Booth, C. L. (1995). Toddlers' language development: Sex differences within social risk. Developmental psychology, 31(5), 851–865. Nyikos, M. (1990). Sex-related differences in adult language learning: Socialization and memory factors. Modern Language Journal, 74(3), 273–287. Olea, J., Revuelta, J., Ximenez, M., & Abad, F. (2000). Psychometric and psychological effects of review on computerized fixed and adaptive tests. Psicologica, 21, 157–173. Owen, R. (1975). A Bayesian sequential procedure for quantal response in the context of adaptive mental testing. Journal of the American Statistical Association, 351–356. Oxford, R., Nyikos, M., & Ehrman, M. (1988). Vive la différence? Reflections on sex differences in use of language learning strategies. Foreign Language Annals, 21(4), 321–329. Oxford, R., Park-Oh, Y., It, S., & Sumrall, M. (1993). Japanese by satellite: Effects of motivation, language learning styles and strategies, gender, course Level, and previous language learning experience on Japanese language achievement. Foreign Language Annals, 26(3), 359–371. Papanastasiou, E. C. (2002, April). A ‘rearrangement procedure’ for scoring adaptive tests with review options. Paper presented at the the National Council of Measurement in Education, New Orleans, LA. Papanastasiou, E. C. (2005). Item review and the rearrangement procedure: Its process and its results. Educational Research and Evaluation, 11(4), 303–321. Papanastasiou, E. C., & Reckase, M. (2007). A "rearrangement procedure" for scoring adaptive tests with review options. International Journal of Testing, 7(4), 387–407. Parchev, I. (2004). A visual guide to item response theory. Retrieved November 9, 2009, from http://www2.uni-jena.de/svw/metheval/irt/VisualIRT.pdf. Parshall, C. G., Kalhn, J. C., & Davey, T. (2002). Practical considerations in computer based testing. New York: Springer-Verlag. Reeve, B. B., & Fayers, P. (2005). Applying item response theory modelling for evaluating questionnaire item and scale properties. In Q. Fayers & R. D. Hays (Eds.), Assessing quality of life in clinical trial: Methods and practice (pp. 55–74). Oxford: Oxford University Press. Rulison, K., & Loken, E. (2009). I've fallen and I can't get up: can high-ability students recover from early mistakes in CAT? Applied Psychological Measurement, 33(2), 83. Schwarz, S. P., McMorris, R. F., & DeMers, L. P. (1991). Reasons for changing answers: An evaluation using personal interviews. Journal of Educational Measurement, 28(2), 163–171. Shatz, M. A., & Best, J. B. (1987). Students' reasons for changing answers on objective tests. Teaching of Psychology, 14(4), 241–242. Stocking, M. L. (1997). Revising item responses in computerized adaptive tests: A comparison of three models. Applied Psychological Measurement, 21(2), 129. Stone, G., & Lunz, M. (1994). The effect of review on the psychometric characteristics of computerized adaptive tests. Applied Measurement in Education, 7(3), 211–222. Tao, Y.-H., Wu, Y.-L., & Chang, H.-Y. (2008). A practical computer adaptive testing model for small-scale scenarios. Educational Technology & Society, 11(3), 259–274. Van Der Linden, W., & Glas, C. (2000). Computerized adaptive testing: Theory and practice. Boston, MA: Kluwer Academic Publishers. Vicino, F., & Moreno, K. (1997). Human factors in the CAT system: A pilot study. In W. A. Sands, B. K. Waters & J. R. McBride (Eds.), Computerized adaptive testing: From inquiry to operation (pp. 157–160). Washington, DC: APA. Vispoel, W. P. (1998). Reviewing and changing answers on computer-adaptive and self-adaptive vocabulary tests. Journal of Educational Measurement, 35(4), 328–345. Vispoel, W. P., Clough, S. J., Bleiler, T., Hendrickson, A. B., & Ihrig, D. (2002). Can examinees use judgments of item difficulty to improve proficiency estimates on computerized adaptive vocabulary tests? Journal of Educational Measurement, 39(4), 311–330. Vispoel, W. P., Hendrickson, A., & Bleiler, T. (2000). Limiting answer review and change on computerized adaptive vocabulary tests: Psychometric and attitudinal results. Journal of Educational Measurement, 37(1), 21–38. Vispoel, W. P., Rocklin, T., & Wang, T. (1994). Individual differences and test administration procedures: a comparison of fixed-item, computerized-adaptive, and self-adapted testing. Applied Measurement in Education, 7(1), 53–79. Vispoel, W. P., Rocklin, T. R., Wang, T., & Bleiler, T. (1999). Can examinees use a review option to obtain positively biased ability estimates on a computerized adaptive test? Journal of Educational Measurement, 36(2), 141–157. Waddell, D. L., & Blankenship, J. C. (1994). Answer changing: A meta-analysis of the prevalence and patterns. Journal of Continuing Education in Nursing, 25(4), 155–158. Wagner, D., Cook, G., & Friedman, S. (1998). Staying with their first impulse? The relationship between impulsivity/reflectivity, field dependence/field independence and answer changes on a multiple-choice exam in a fifth-grade sample. Journal of Research and Development in Education, 31(3), 166–175. Wainer, H. (1993). Some practical considerations when converting a linearly administered test to an adaptive format. Educational Measurement: Issues and Practice, 12(1), 15–20. Wainer, H., Dorans, N. J., Eignor, D., Flaugher, R., Green, B. F., Mislevy, R. J., et al. (2000). Computerized adaptive testing: A primer (2nd ed.). Hillsdale, NJ: Erlbaum. Wallentin, M. (2009). Putative sex differences in verbal abilities and language cortex: A critical review. Brain and Language, 108(3), 175–183. Wang, M., & Wingersky, M. (1992). Incorporating post-administration item response revision into a CAT. Paper presented at the the annual meeting of the American Educational Research Association, San Francisco, CA. Wise, S. L. (1996, April). A critical analysis of the arguments for and against item review in computerized adaptive testing. Paper presented at the the annual meeting of the National Conference on Measurement in Education, New York, NY. Wise, S. L., & Kingsbury, G. (2000). Practical issues in developing and maintaining a computerized adaptive testing program. Psicologica, 21(1–2), 135–155. Wright, B. D. (1997). Fundamental measurement for psychology. In S. Embretson & S. Hershberger (Eds.), The new roles of measurement: What every psychologist and educator know. Hillsdale, NJ: Lawrence Erlbaum Associates. Yen, Y. C., Ho, R. G., Chen, L. J., Chou, K. Y., & Chen, Y. L. (in press). Development and evaluation of a confidence-weighting computerized adaptive testing. Educational Technology & Society.
|