Computer-based and paper-based testing: Does the test administration mode influence the reliability and validity of achievement tests?
Abstract
This article reports the findings of a study investigating whether test-delivery mode (computer-based vs. paper-based) affects the reliability and validity of an achievement test for a pedagogical content knowledge course in an English teacher education program. A total of 97 university students enrolled in an English as a foreign language (EFL) teacher education program were randomly assigned to either an experimental group, which took the computer-based achievement test online, or a control group, which took the same test in paper-and-pencil format. Results of Spearman rank-order correlation and Mann-Whitney U tests indicated that test-delivery mode had no impact on the reliability or validity of the test in either format. The findings also showed no significant difference in test scores between participants who took the computer-based test and those who took the paper-based test. The findings are discussed in terms of the idea that computer technology can be integrated into the curriculum not only for instructional practices but also for assessment purposes.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
ISSN 1305-578X (Online)
Copyright © 2005-2022 by Journal of Language and Linguistic Studies