What is Question and Test Versioning and Why Should You Care?
by Steven B. Just

Recently we were asked to perform an item and test validation for a major corporation. This company uses an online testing system, but not ours. We do not need a client to use our testing system in order to perform a validation (though it helps), so we began the process.

At the risk of oversimplifying, validation has two parts: results-independent analysis (the front end) and results-dependent analysis (the back end).

The results-independent analysis comprises all those parts of the validation process that do not require prior test results and that should, in fact, be performed before the questions and tests are ever used. These include, but are not limited to, reviewing questions to ensure that they are properly written, reviewing questions to ensure that they properly cover the content domain, and setting passing scores via a validated process.

The results-dependent analysis comprises what is typically referred to as item analysis: question difficulty levels, choice distributions, point-biserial correlations, and test reliability.
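To make these four statistics concrete, here is a minimal sketch in Python that uses only the standard library. The response data, the answer key, and every function name are hypothetical, invented for illustration. The sketch also assumes dichotomous (right/wrong) scoring and uses KR-20 as the reliability estimate, which is one common choice among several and not necessarily the one used in any particular validation.

```python
from collections import Counter
from statistics import mean, pstdev, pvariance

# Hypothetical data: one row per student, one chosen option per question.
responses = [
    ["A", "C", "B", "D"],
    ["A", "C", "D", "D"],
    ["B", "C", "B", "A"],
    ["A", "B", "B", "D"],
    ["C", "A", "D", "B"],
]
answer_key = ["A", "C", "B", "D"]  # hypothetical key

# Score each response as 1 (correct) or 0 (incorrect).
scored = [[int(choice == key) for choice, key in zip(row, answer_key)]
          for row in responses]
totals = [sum(row) for row in scored]  # each student's total score


def difficulty(item):
    """Proportion of students who answered the item correctly."""
    return mean(row[item] for row in scored)


def choice_distribution(item):
    """How many students picked each option for the item."""
    return Counter(row[item] for row in responses)


def point_biserial(item):
    """Pearson correlation between the 0/1 item score and the total score.

    Returns None when either variable has no variance.
    """
    x = [row[item] for row in scored]
    sx, sy = pstdev(x), pstdev(totals)
    if sx == 0 or sy == 0:
        return None
    mx, my = mean(x), mean(totals)
    covariance = mean((a - mx) * (b - my) for a, b in zip(x, totals))
    return covariance / (sx * sy)


def kr20():
    """Kuder-Richardson 20 reliability for dichotomously scored items."""
    k = len(answer_key)
    pq = sum(difficulty(i) * (1 - difficulty(i)) for i in range(k))
    return (k / (k - 1)) * (1 - pq / pvariance(totals))


for i in range(len(answer_key)):
    print(i, difficulty(i), dict(choice_distribution(i)), point_biserial(i))
print("KR-20:", kr20())
```

Every one of these figures is computed from a pool of results, so each is only meaningful if all the pooled responses were given to the same questions on the same test.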

We completed the front-end analysis and began work on the item analysis. At that point we ran into a host of problems:

  • Tests change over time, but the testing system had no internal mechanism for tracking this. Was the test taken by Student B in September the same test as the one taken by Student A in June? Was it a totally different test or just a different version of the same test? If it was a different version, what had changed?
  • Similarly, questions change over time. Did two students take the exact same test with the same questions, or did they see questions that seemed the same but were actually different versions of very similar questions?
  • Can question results be tracked across multiple tests? While it is always necessary to have item results within a test, it is often important to be able to track a question's history over multiple uses, independent of which tests the question has actually been used in.
  • It was possible to go from test to question (that is, we could look at a test and see which questions were in it), but we couldn’t do the inverse: given a question, we couldn’t see which tests it had been used in.
  • There was no good internal auditing system. If we looked at a test, we couldn’t tell much beyond that particular administration of the test. When was the test first created? How often had it been used, and when? How many students took it at each administration? Who within the training department created the test? Who modified it (if it was modified), and why?
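The questions in this list suggest the kind of records a testing system would need to keep. What follows is a minimal sketch in Python of one possible data model, not a description of any actual product. All class names, field names, identifiers, authors, and dates are hypothetical. It assumes that tests and questions are each identified by an ID plus a version number and that every administration records exactly which test version was taken.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class VersionedTest:
    test_id: str
    version: int
    questions: frozenset   # {(question_id, question_version), ...}
    author: str            # who created or modified this version
    created_on: date
    change_note: str = ""  # why it was modified, if it was


@dataclass(frozen=True)
class Administration:
    test_id: str
    test_version: int
    student: str
    taken_on: date
    correct: frozenset     # question keys the student answered correctly


class Registry:
    def __init__(self):
        self.tests = {}                  # (test_id, version) -> VersionedTest
        self.administrations = []        # audit trail of every sitting
        self.used_in = defaultdict(set)  # question key -> test keys (inverse index)

    def add_test(self, test):
        key = (test.test_id, test.version)
        self.tests[key] = test
        for question in test.questions:
            self.used_in[question].add(key)

    def record(self, administration):
        self.administrations.append(administration)

    def changes(self, test_id, old, new):
        """What changed between two versions of the same test?"""
        before = self.tests[(test_id, old)].questions
        after = self.tests[(test_id, new)].questions
        return {"removed": before - after, "added": after - before}

    def tests_using(self, question):
        """Given a question version, which test versions has it been used in?"""
        return sorted(self.used_in.get(question, ()))

    def question_history(self, question):
        """(times correct, times presented) across every test that used it."""
        correct = presented = 0
        for a in self.administrations:
            if question in self.tests[(a.test_id, a.test_version)].questions:
                presented += 1
                if question in a.correct:
                    correct += 1
        return correct, presented


# Hypothetical example: question Q2 is reworded between June and September.
registry = Registry()
q1, q2_v1, q2_v2 = ("Q1", 1), ("Q2", 1), ("Q2", 2)
registry.add_test(VersionedTest("SAFETY", 1, frozenset({q1, q2_v1}),
                                "author_a", date(2020, 5, 1)))
registry.add_test(VersionedTest("SAFETY", 2, frozenset({q1, q2_v2}),
                                "author_b", date(2020, 8, 15), "reworded Q2"))
registry.record(Administration("SAFETY", 1, "Student A", date(2020, 6, 10),
                               frozenset({q1})))
registry.record(Administration("SAFETY", 2, "Student B", date(2020, 9, 12),
                               frozenset({q1, q2_v2})))

print(registry.changes("SAFETY", 1, 2))
print(registry.tests_using(q1))
print(registry.question_history(q1))
```

With records like these, each question in the list becomes a lookup: `changes` shows what differs between the June and September versions, `tests_using` is the inverse question-to-test mapping, `question_history` pools a question's results across tests, and the `author`, `created_on`, and `change_note` fields supply the basic audit trail.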

Versioning and auditing are rather dull procedural back ends to a testing system, but they are critically important to a complete test validation process.
