Pucel, D.J., & Anderson, L.D. (June, 2003). Developing computer simulation performance tests: Challenges and criteria. Proceedings of the IASTED International Conference on Computers and Advanced Technology in Education 2003 (CATE), ISBN: 0-88986-361-X: Rhodes, Greece pp. 170-174.

DEVELOPING COMPUTER SIMULATION PERFORMANCE TESTS: CHALLENGES AND CRITERIA

 

David J. Pucel, Ph.D.
Professor
Human Resources Development and Business and Industry Education
University of Minnesota
USA

Lynn D. Anderson, Ph.D.
Executive Director
Joint Commission on Allied Health Personnel in Ophthalmology (JCAHPO)
USA


Abstract

     The use of computer simulations to test a person’s technical psychomotor skills in health and other technician roles is in its infancy. This paper presents insights into how computer simulations can be developed to test a person’s ability to perform psychomotor tasks. These insights are the result of a two-year project that produced a set of international certification tests for ophthalmic technicians within the United States and Canada. A proven rationale and methods for meeting some of the major challenges in using computer simulations for skill certification testing are presented.

 

Key Words

 

Performance, Testing, Simulations, Certification, Ophthalmic

 

1. Introduction

     Testing to verify a person’s ability to perform a skill is fundamental to many occupations. It provides assurance that a person is capable of performing on the job. Such testing is often very expensive and time-consuming. It often requires people to travel to testing sites far from their homes, and it requires examiners who are qualified in the skills to be tested. The authors faced the challenge of making the process of such testing more efficient, valid and reliable for those wishing to be certified as ophthalmic technicians. The solution they arrived at was to develop computer simulations to test a person’s skills.

 

     The authors quickly found that the use of computer simulations to test a person’s technical psychomotor skills in health and other technician roles is in its infancy. Although an extensive literature search was undertaken, no literature was found that focused directly on how to develop such simulations. The question became, “How can a computer simulation be developed so it can be used as the platform to test a person’s ability to perform a psychomotor task, such as determining a patient’s eye correction using a retinoscope?”

 

     This paper presents major considerations in developing computer simulations for certifying the ability to perform psychomotor skills, based on a two-year developmental project that has yielded tests that will be in place throughout the United States and Canada during 2003. It presents the rationale for doing so, the issues that were and must be addressed, and the developmental process used to produce a fully functional set of computer simulation certification tests that meet professional standards.

 

2. Models Underlying the Developmental Effort

 

     Two basic models were adopted as frameworks to begin the design and development of the computer simulation tests. They were implemented within the context of the Standards for Educational and Psychological Testing [1].  The first was the performance testing model presented by Pucel [2]. That model outlines the procedures for the development of tests to be used to assess skill mastery. That model calls for:

·         clearly defining the skill to be evaluated.

·         outlining the process of performing the skill.

·         defining criteria for judging each step in the process.

·         developing criteria for judging the final outcome of performing the skill (e.g., product, accuracy of decisions).

·         establishing scoring procedures that reflect the importance of each step and an acceptable outcome of the performance.

 

     The second model was that presented by Alessi and Trollip for the development of instructional simulations [3]. Their model suggests a set of standards for the development of multi-media materials aimed at providing instruction. Those standards are presented in three categories: planning, design, and development.

 

3. Challenges

 

    At first, one might think the development of computer simulation tests is only a matter of generalizing typical instructional simulation development principles. However, after spending two years developing a series of simulation performance tests in ophthalmic technology, it is apparent that the challenges are much more complex. Instruction is aimed at teaching a person knowledge or how to do something. In teaching a performance skill, the focus is on how to do it correctly. Testing is aimed at determining if a person has mastered the content.

 

     In the context of performance testing, one challenge is to present the skill through accurate simulation. An additional challenge is to present alternatives to the correct procedure so that a person can demonstrate that they do not know how to perform the skill correctly. These alternatives can be created in a variety of ways. For example, one way of doing something incorrectly is to perform a particular step incorrectly. Another is to perform the steps out of order. Another is to arrive at the wrong answer even if the correct process is used. The process is further complicated by the fact that when people enter computer simulations they tend to want to experiment with the simulation to see what it does. In doing so they may or may not be intending to demonstrate their skill. This raises two questions: when does scoring begin, and how does one determine whether a person is intentionally trying to perform the skill or merely exploring the workings of the simulation?
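     The three kinds of incorrect performance named above can be told apart from a record of what a candidate did. The following is a minimal sketch under stated assumptions, not the logic used in the project: the function `classify_errors`, the step names, the numeric answer, and the tolerance are all hypothetical, and each step is assumed to appear at most once in the record.

```python
def classify_errors(expected_steps, performed, answer, correct_answer, tolerance):
    """Return the kinds of error found in one attempt (hypothetical helper).

    expected_steps: step names in the correct order.
    performed: list of (step_name, done_correctly) in the order performed.
    """
    errors = set()

    # 1. A step was performed incorrectly or left out.
    done = {name: ok for name, ok in performed}
    if any(not done.get(name, False) for name in expected_steps):
        errors.add("step_incorrect")

    # 2. The steps were performed out of order.
    order = [name for name, _ in performed if name in expected_steps]
    if order != [name for name in expected_steps if name in order]:
        errors.add("out_of_order")

    # 3. The final answer is wrong, whatever process was used.
    if abs(answer - correct_answer) > tolerance:
        errors.add("wrong_outcome")

    return errors


# Hypothetical attempt: correct actions, wrong order, answer outside tolerance.
expected = ["focus", "position", "measure"]
attempt = [("position", True), ("focus", True), ("measure", True)]
result = classify_errors(expected, attempt, answer=43.5,
                         correct_answer=43.0, tolerance=0.25)
```

     Here `result` contains "out_of_order" and "wrong_outcome" but not "step_incorrect", which shows that the three dimensions can vary independently.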

 

4. Rationale for Testing Performance Skills with Computer Simulations

 

     At first it would appear that it would not be rational to try to assess psychomotor performance skills using computer simulations. Psychomotor skills require the ability to actually manipulate real devices, which in turn requires tactile skills that can only be learned from working with and handling the real devices. However, adequate performance of psychomotor skills requires not only the ability to manipulate actual devices, but also cognitive decision-making regarding the process of manipulating the devices and the ability to arrive at the desired outcome.

 

     Therefore, if a person has had a significant amount of experience working with the actual devices, certification testing can be based on whether the person can manipulate the devices correctly. In addition, if the desired outcome is a decision or a result that can be recorded, as opposed to a physical product such as a welded pipe or a built wall, computer simulations can allow for judging the adequacy of the outcome of the performance. Therefore, the authors determined that psychomotor testing with computer simulations is reasonable if the computer simulation is designed to allow a person to demonstrate the ability to manipulate devices and to produce the desired outcome in a recordable fashion. The authors also suggest that it would not be appropriate to use computer simulations if the goal of testing is to assess a person’s ability to physically manipulate the real devices to build their psychomotor skills, or to produce physical products.

 

     In the case of this project to develop the ophthalmic skills tests, the ability to manipulate the actual devices was verified by requiring candidates either to have successfully completed an accredited training program that included the skills, or to have work experience with the devices verified by a supervising ophthalmologist.

 

5. Design Issues to be Addressed

     Following the performance testing model, major design issues to be addressed during the development of the computer simulation tests were:

1.       The need to realistically present each skill. What people saw on the screen needed to be an accurate representation of what they would see in real life.

2.       Navigation through the simulations needed to be simple enough so testing was not seriously affected by a person’s ability to operate the computer.

3.       Besides allowing a person to complete a skill correctly, alternative ways of completing the skill incorrectly needed to be built into the simulations.

4.       The simulations required built-in scoring algorithms that reflected the ways in which people’s performance would be judged on the job.

5.       Adjustments were needed to accommodate differences between the way people approach computer simulations and real-life performance tests.

6.       All portions of the simulations needed to be validated as truly and accurately representing each skill and allowing candidates to demonstrate their true ability to perform the skill.

7.       A tutorial was required that allowed people to be trained in how to actually use the computer during the simulations to move objects, to provide directions, and to record responses.

 

5.1 The simulations needed to realistically present each skill. What people saw on the screen needed to be an accurate representation of what they would see in real life.

 

     In order to ensure the simulations were realistic presentations of the skills, actual movies and pictures were taken of the skills. They were then incorporated into a computer simulation using FLASH. The simulations were first developed to show the correct method of performing the skills. They were later modified to allow candidates to demonstrate alternative incorrect as well as correct processes. Figure 1 presents a screen capture of a simulation of the ophthalmic skill “refinement”.

Figure 1

 

     Validation of the realism of the simulations was first accomplished by having subject matter experts, who were ophthalmologists and technicians, repeatedly review and suggest modifications to ensure what was presented on the screen represented the real world. Realism was further validated through a series of pilot tests that will be described later.

 

5.2 Navigation through the simulations needed to be simple enough so testing was not seriously affected by a person’s ability to operate the computer.

 

     Developing navigation through the simulations became a major issue. Navigation needed not only to accommodate moving through the simulation correctly, but to do so in a manner similar to the way the skill would be performed on the job. Given that these were performance tests, the navigation system also needed to allow people to make mistakes and navigate incorrectly.

 

     The navigation system was developed by first storyboarding each skill into logical segments, based on changes in what a person would need to see and attend to while performing each major part of the skill. The segments needed to be discrete so it would be possible to proceed through them in the correct as well as the incorrect order. In other words, if a skill required a person to focus an eyepiece before moving on to positioning a device, the eyepiece would be seen in one segment and positioning the device in another. This allowed candidates to select either focusing the eyepiece or positioning the device as a major portion of the skill. Once the segments were identified, an introductory menu divided into the logical segments of the skill was developed.
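     The idea of a menu of discrete segments that can be entered in any order can be sketched as follows. This is an illustrative sketch only; the class `SkillMenu`, its methods, and the segment names are hypothetical and are not taken from the project’s software.

```python
class SkillMenu:
    """Introductory menu whose entries are the discrete segments of a skill."""

    def __init__(self, segments):
        self.segments = list(segments)  # listed in the correct order
        self.visited = []               # order the candidate actually chose

    def select(self, segment):
        if segment not in self.segments:
            raise ValueError(f"unknown segment: {segment}")
        # Any segment may be chosen at any time, so mistakes remain possible.
        self.visited.append(segment)

    def first_visits(self):
        """Segments in the order in which they were first entered."""
        seen = []
        for segment in self.visited:
            if segment not in seen:
                seen.append(segment)
        return seen

    def entered_in_correct_order(self):
        first = self.first_visits()
        return first == self.segments[:len(first)]


# Hypothetical candidate who positions the device before focusing the eyepiece.
menu = SkillMenu(["focus eyepiece", "position device", "take reading"])
menu.select("position device")
menu.select("focus eyepiece")
in_order = menu.entered_in_correct_order()  # False
```

     Because the menu itself never blocks a selection, the order of entry becomes evidence that a scoring routine can use.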

 

5.3 Besides allowing a person to complete a skill correctly, alternative ways of completing the skill incorrectly needed to be built into the simulations.

 

     After getting into a segment, the person needed to be able to perform the processes associated with that segment correctly or incorrectly. This required allowing people to activate correct and incorrect controls on the devices. This was eventually accomplished by placing arrows on each device control that would allow a person to move a control in different directions. In Figure 1 above, these arrows are presented on the face of the controls. The actual device does not have arrows on it. A candidate was instructed to place the cursor on the appropriate arrow and to activate the left-hand mouse button to move the device. Although this process currently seems obvious, it took a number of pilot studies to perfect.
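     One way to picture the arrow mechanism is as a control whose value changes by a fixed amount per click and whose clicks are logged for later scoring. The sketch below assumes this simple model; the class `ArrowControl`, the arrow names, the step size, and the limits are hypothetical and do not describe the actual devices or the project’s code.

```python
from dataclasses import dataclass, field


@dataclass
class ArrowControl:
    """A device control moved by clicking on-screen arrows (hypothetical model)."""
    name: str
    value: float
    step: float      # change per click, assumed constant
    minimum: float
    maximum: float
    clicks: list = field(default_factory=list)  # log kept for scoring

    def click(self, arrow):
        if arrow not in ("increase", "decrease"):
            raise ValueError(f"unknown arrow: {arrow}")
        delta = self.step if arrow == "increase" else -self.step
        # Clamp so the control stops at the ends of its travel.
        self.value = min(self.maximum, max(self.minimum, self.value + delta))
        self.clicks.append(arrow)
        return self.value


# Hypothetical focusing knob: two clicks one way, one click back.
knob = ArrowControl("focus knob", value=0.0, step=0.25, minimum=-1.0, maximum=1.0)
knob.click("increase")
knob.click("increase")
knob.click("decrease")
```

     After these three clicks `knob.value` is 0.25, and `knob.clicks` records every movement, including movements in the wrong direction.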

 

5.4 The simulations required built-in scoring algorithms that reflected the ways in which people’s performance would be judged on the job.

 

     Since the simulations were going to be used for certification testing, scoring algorithms were needed that would allow a person to be judged on the extent to which they could perform the skill to on-the-job standards. This first required the development of scoring rubrics in the form of checklists. The checklists needed to clearly indicate the correct procedure for completing each skill, the criteria for judging each portion of the procedure, and point systems that would reflect the relative importance of completing each portion of the procedure correctly. Table 1 presents a portion of a sample checklist for the ophthalmic skill “keratometry”.

 

 


Table 1

Scoring Rubric Checklist for Keratometry

 

Givens: Keratometer, new patient with astigmatic error.

Required Performance: Measure the corneal curvature and record results.

Standard:  80 points on process and within tolerance range. 

 

Process Steps               | Criteria                                                                        | Score
Focus the eyepiece.         | Reticule clear                                                                  | 3
Instruct patient            | Patient instructed to keep forehead and chin in position                        | 3
Position the patient, Etc.  | Patient’s forehead and chin in position (Computer version: automatically done) | 0

 


     The development of these scoring rubrics was a long and difficult process requiring many iterations. Although each ophthalmic expert assigned to work on the project was able to observe people and determine if they were competent, they had different ways and words for expressing what they would observe and how they would judge competence. The additional complicating factor was that the criteria had to be assessable through the computer simulation. Therefore, it was necessary for them to repeatedly assemble and arrive at mutually agreed upon criteria to judge skill mastery.

 

     In some cases if a step was relatively automatic given a set of instructions, the fact that a person selected the correct instruction was automatically assumed to lead to the correct action in the computer. In cases such as the “position the patient” example above, this simplified the programming of the simulation.
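     A checklist of this kind lends itself to a simple scoring routine: sum the points for the steps whose criteria were met, then apply the standard. The sketch below uses only the three rows shown in Table 1 and the stated standard of 80 points on process; the function names, the example reading, and the tolerance value are hypothetical, and the real checklist contains more rows than are listed here.

```python
# Rows from the portion of the keratometry checklist shown in Table 1.
CHECKLIST = [
    ("Focus the eyepiece", 3),
    ("Instruct patient", 3),
    ("Position the patient", 0),  # done automatically in the computer version
]
PASSING_PROCESS_POINTS = 80       # "80 points on process" from Table 1


def process_score(checklist, steps_met):
    """Sum the points of every checklist step whose criterion was met."""
    return sum(points for step, points in checklist if step in steps_met)


def passes(process_points, reading, true_value, tolerance,
           passing=PASSING_PROCESS_POINTS):
    """Apply both parts of the standard: process points and outcome tolerance."""
    within_tolerance = abs(reading - true_value) <= tolerance
    return process_points >= passing and within_tolerance


# Partial score over the three rows shown (hypothetical candidate).
partial = process_score(CHECKLIST, {"Focus the eyepiece", "Position the patient"})

# Hypothetical totals and readings; the tolerance of 0.5 is an assumed value.
ok = passes(85, reading=43.25, true_value=43.0, tolerance=0.5)
bad_outcome = passes(85, reading=44.0, true_value=43.0, tolerance=0.5)
bad_process = passes(75, reading=43.0, true_value=43.0, tolerance=0.5)
```

     Keeping the process score and the outcome check separate mirrors the standard in Table 1, where a candidate must satisfy both conditions to pass.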

 

5.5 Adjustments were needed to accommodate differences between the way people approach computer simulations and real-life performance tests.

 

     At first it was assumed that when people entered the simulation tests they would proceed directly to complete the tests in the linear order specified in the validated scoring checklists. Pilot testing soon indicated this was not so. Even though an extensive tutorial in how to mechanically operate the simulations was provided to candidates before testing, most candidates wanted to try things out to familiarize themselves with how to manipulate the devices with the computer as they went through the tests. (A discussion of the tutorial is presented later.) This meant that they would touch dials to try things out without intending to actually complete the test. However, the scoring algorithms were developed in such a way that if they touched the dials out of order they received score deductions. Also, candidates at times wanted to go back and check on earlier results during the simulations. Again the original assumption was that once they completed a step they would move on without returning. When they did return, they were being scored as doing things out of order.

 

     These unanticipated candidate behaviors required a further review of the scoring procedures. When a candidate entered a specific portion of a simulation and a new picture segment, they were allowed to touch things as long as they did not proceed along a logical sequence that indicated that they were intending to actually complete the test. They were scored after they completed a systematic portion of what was being tested in that portion of the simulation. Also, if a person went back to an earlier step to check on a previous reading, it was determined that this was reasonable in many situations during the real-life performance of the skill. Therefore, scoring was adjusted to ensure that as a person proceeded to the next step they did all of the previously required steps, and if they did go back it did not invalidate the procedure. However, if they moved ahead to steps without completing the necessary prerequisite steps they did receive score deductions. Again, in retrospect this seems obvious. However, it took a number of pilot tests and revisions to arrive at these adjustments to typical performance test scoring rubrics.
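     The adjusted rules can be summarized as: ignore exploratory touches, allow returns to steps already completed, and deduct only when a step is completed before its prerequisites. The sketch below illustrates these rules under simplifying assumptions; the function `sequence_deductions`, the event format, the step names, and the one-point penalty are hypothetical, and how the project actually recognized a “systematic portion” of a skill is not modeled.

```python
def sequence_deductions(step_order, events, penalty=1):
    """Total deduction for one attempt under the adjusted sequencing rules.

    step_order: step names in the correct order.
    events: list of (step_name, completed) as the candidate worked; completed
            is False when a control was only touched while exploring.
    """
    completed = set()
    deductions = 0
    for step, did_complete in events:
        if not did_complete:
            continue                  # exploratory touch: not scored
        if step in completed:
            continue                  # returning to an earlier step: allowed
        prerequisites = step_order[:step_order.index(step)]
        if any(p not in completed for p in prerequisites):
            deductions += penalty     # moved ahead without the prerequisites
        completed.add(step)
    return deductions


order = ["focus", "position", "measure"]

# Explores a dial first, later goes back to re-check the focus: no deduction.
careful = sequence_deductions(order, [
    ("measure", False), ("focus", True), ("position", True),
    ("focus", True), ("measure", True),
])

# Takes the measurement before positioning the device: one deduction.
skipped = sequence_deductions(order, [
    ("focus", True), ("measure", True), ("position", True),
])
```

     In this sketch `careful` is 0 and `skipped` is 1, which matches the intent that exploring and re-checking are tolerated while skipping ahead is penalized.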

 

5.6 All portions of the simulations needed to be validated as truly and accurately representing each skill and allowing candidates to demonstrate their true ability to perform the skill.

 

     Throughout the development of the simulation tests the process was under continual review by a simulation development committee composed of ophthalmologists and incumbent technicians who were already certified to perform the skills to be tested. All aspects of the tests were reviewed and approved by the committee as the developmental process was underway. This included: a) the extent to which the simulations were accurate representations of the skills as they look and are performed in real life, b) the ease with which a person can complete the skills without the artificial nature of computerization compromising their ability to demonstrate their true skill, and c) the validity of the scoring algorithms in judging the things that are important, in their correct relative importance.

     In addition, three informal pilot tests were conducted with actual job incumbents as the project continued. Candidates gave feedback on the extent to which they felt they could demonstrate their true skill, the realism of the simulations, and their ability to operate the simulations. A formal pilot test was also conducted through which formal survey data were gathered and analyzed. The results showed that the simulations did allow people to demonstrate their true skill, that they were easily operated, and that candidates felt the tests were equally valid or more valid than the real-life tests used in the past.

 

5.7 A tutorial was required that allowed people to be trained in how to actually use the computer during the simulations to move objects, to provide directions, and to record responses.

 

     It quickly became apparent that many of the people who would be tested had relatively few computer skills. How to move things and record responses was not intuitive to them. Therefore, an extensive tutorial was developed and assembled on a CD. It provided candidates with an orientation to the purposes and format of the overall computerized simulation evaluation of the seven ophthalmic skills. It also provided detailed examples and opportunities for candidates to make sample menu selections, move ophthalmic objects with the computer, and to record responses. The tutorial was pilot tested along with all other aspects of the simulations and data indicated that the final version was easy to use and provided the needed information to effectively use the simulations. 

 

6. Summary

     This developmental project revealed that developing computer simulation psychomotor performance tests involves many considerations beyond those typically faced when developing instructional simulations or real-life performance tests. However, it has also shown that the development of such tests is feasible and that such tests are capable of validly testing such skills.

 

7. Recommendations

     Experience with the development of these computerized simulation tests has provided insights that might be useful to others.

1.                   During the design and development of the simulations there was constant tension between how detailed the simulations needed to be as evaluation tools and the fidelity of portraying all of the nuances of the skill. This had many implications for the cost of the project as well as the effectiveness of the simulations as evaluation tools. The more detail that was included, the higher the cost. Also, at times detail beyond that needed to evaluate a person’s skill actually obscured what was being assessed during a particular segment of a simulation. The implication for designers and developers is to address explicitly, and early, the amount of fidelity that is required to evaluate the skill being addressed. Otherwise, there may be many costly revisions that could have been avoided.

2.                   Because of the diversity of the way experts convert their professional criteria for judging adequate skill performance into words, it is important to obtain agreement regarding how the process of completing a skill and the criteria for judging it will be stated. In real life, people may be looking for the same things and be able to come to the same judgment about a person’s competence. But when you ask them to write down what they look for and how they judge if it is done correctly, they tend to express things in different words. A project can face many false starts if these issues are not resolved before beginning. If the differences are not addressed early, they will surface as the project progresses.

3.                   The use of computer simulations to replace live performance tests is not meaningful in all situations. It is important to make sure doing so makes sense in terms of the particular situation. A rationale presenting the logic for using them, similar to the one presented earlier, should be developed.

 

8. References

 

[1] American Educational Research Association, American Psychological Association & National Council on Measurement in Education. Standards for Educational and Psychological Testing. Washington D.C.: American Educational Research Association, 1999. ISBN 0-935302-25-5

 

[2] Pucel, D.J. Developing and Evaluating Performance-Based Instruction (second edition). New Brighton, MN: Performance Training Systems, Inc., 2001. ISBN 0-943919-02-9

 

[3] Alessi, S.M. & Trollip, S.R. Multimedia for Learning: Methods and Development (third edition). Needham Heights, Massachusetts: Allyn & Bacon, 2001. ISBN 0-205-27691-1