A blending of computer-based assessment and performance-based assessment: Multimedia-Based Performance Assessment (MBPA). The introduction of a new method of assessment in Dutch Vocational Education and Training (VET)

Journal title: CADMO
Authors: Sebastiaan de Klerk, Theo J.H.M. Eggen, Bernard P. Veldkamp
Publishing year: 2014
Issue: 2014/1
Language: English
Pages: 18 (pp. 39-56)
File size: 1085 KB
DOI: 10.3280/CAD2014-001006

Innovation in technology drives innovation in assessment. Since the introduction of computer-based assessment (CBA) a few decades ago, many formerly paper-and-pencil tests have been transformed into computer-based equivalents. CBAs are becoming more complex, incorporating multimedia and simulative elements and even immersive virtual environments. In Vocational Education and Training (VET), test developers may seize the opportunity provided by technology to create a multimedia-based equivalent of performance-based assessment (PBA), from here on called multimedia-based performance assessment (MBPA). MBPA in vocational education is an assessment method that incorporates multimedia (e.g. video, illustrations, graphs, virtual reality) to simulate the work environment of the student and to create tasks and assignments in the assessment. Furthermore, MBPA is characterized by a higher degree of interactivity between the student and the assessment than traditional computer-based tests offer. The focal constructs measured by MBPA are the same as those currently assessed by performance-based assessments. Compared to the automated delivery of item-based tests, MBPA realizes the full power of ICT. In the present article we therefore discuss the current status of MBPA, including examples of our own research on MBPA, and we provide an argument for the use of MBPA in vocational education.

Keywords: Assessment in vocational education and training, performance-based assessment, computer-based assessment, multimedia-based performance assessment.
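
The abstract characterizes MBPA structurally: multimedia stimuli that simulate the student's work environment, tasks and assignments built on those stimuli, and evidence drawn from the student's interactions rather than from selected responses. Purely as an illustration of that description, the Python sketch below models one such task; every name (MbpaTask, MediaAsset, expected_actions) and the simple proportion-correct scoring rule are our own assumptions, not the authors' design.

# Illustrative sketch only (assumed names, not the authors' implementation):
# a minimal data model for an MBPA task with multimedia stimuli and
# interaction-based scoring.
from dataclasses import dataclass
from enum import Enum


class MediaType(Enum):
    VIDEO = "video"
    ILLUSTRATION = "illustration"
    GRAPH = "graph"
    VIRTUAL_REALITY = "virtual_reality"


@dataclass
class MediaAsset:
    media_type: MediaType
    uri: str  # location of the clip, image, or VR scene (hypothetical path)


@dataclass
class MbpaTask:
    """One assignment that simulates a slice of the student's work environment."""
    construct: str               # focal construct, the same one the PBA measures
    stimuli: list[MediaAsset]    # the multimedia that frames the task
    expected_actions: list[str]  # interactions that would evidence mastery

    def score(self, observed_actions: list[str]) -> float:
        """Proportion of expected interactive actions the student performed."""
        if not self.expected_actions:
            return 0.0
        hits = sum(1 for a in self.expected_actions if a in observed_actions)
        return hits / len(self.expected_actions)


# Usage with invented content (the confined-space theme echoes the related
# credentialing study listed below):
task = MbpaTask(
    construct="confined space entry procedure",
    stimuli=[MediaAsset(MediaType.VIDEO, "clips/site_briefing.mp4")],
    expected_actions=["check_permit", "test_atmosphere", "post_attendant"],
)
print(task.score(["check_permit", "test_atmosphere"]))  # -> 0.666...

Scoring from a log of observed actions, rather than from answer keys, is one way to capture the higher degree of interactivity the abstract attributes to MBPA.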

  • Computational Psychometrics: New Methodologies for a New Generation of Digital Learning and Assessment, Jessica Andrews-Todd, Robert J. Mislevy, Michelle LaMar, Sebastiaan de Klerk, p. 45 (ISBN: 978-3-030-74393-2)
  • Psychometric analysis of the performance data of simulation-based assessment: A systematic review and a Bayesian network example, Sebastiaan de Klerk, Bernard P. Veldkamp, Theo J.H.M. Eggen, in Computers & Education, 2015, p. 23.
    DOI: 10.1016/j.compedu.2014.12.020
  • Plagiarism detection in students' programming assignments based on semantics: multimedia e-learning based smart assessment methodology, Farhan Ullah, Junfeng Wang, Muhammad Farhan, Sohail Jabbar, Zhiming Wu, Shehzad Khalid, in Multimedia Tools and Applications, 2020, p. 8581.
    DOI: 10.1007/s11042-018-5827-6
  • The design, development, and validation of a multimedia-based performance assessment for credentialing confined space guards, Sebastiaan de Klerk, Bernard P. Veldkamp, Theo J. H. M. Eggen, in Behaviormetrika, 2018, p. 565.
    DOI: 10.1007/s41237-018-0064-x

Sebastiaan de Klerk, Theo J.H.M. Eggen, Bernard P. Veldkamp, "A blending of computer-based assessment and performance-based assessment: Multimedia-Based Performance Assessment (MBPA). The introduction of a new method of assessment in Dutch Vocational Education and Training (VET)", in CADMO, 1/2014, pp. 39-56, DOI: 10.3280/CAD2014-001006