| What Is Question and Test Versioning and Why Should You Care?
by Steven B. Just
Recently we were asked to perform an item and
test validation for a major corporation. This company uses
an on-line testing system, but not ours. Using our testing
system is not a requirement for us to perform a validation
(though it helps), so we began the process.
At the risk of oversimplifying, there are two
parts to validation: results-independent analysis (the front-end
part) and results-dependent analysis (the back-end part).
The results-independent analysis comprises all
those parts of the validation process that do not require
prior test results, and, indeed, should be performed before
the questions and tests are ever actually used. These include,
but are not limited to: review of questions to ensure that
they are properly written, review of questions to ensure that
they properly cover the domain of content, and setting of
passing scores via a validated process.
The results-dependent analysis comprises what
is typically referred to as item analysis: question difficulty
levels, choice distributions, point-biserial correlations
and test reliability.
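To make these item statistics concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the procedure used in the validation described here: items are assumed to be scored 0 or 1, the point-biserial is computed against the uncorrected total score (the item itself is included in the total), and reliability is estimated with KR-20, which is only one of several possible reliability coefficients. All function names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev, pvariance


def item_difficulty(item_scores):
    """Proportion of examinees who answered the item correctly."""
    return mean(item_scores)


def choice_distribution(chosen_options):
    """How many examinees picked each answer choice."""
    return dict(Counter(chosen_options))


def point_biserial(item_scores, total_scores):
    """Correlation between a 0/1 item score and the total test score.

    Assumption: the total score includes the item itself (uncorrected).
    """
    p = mean(item_scores)
    sd = pstdev(total_scores)
    if p in (0, 1) or sd == 0:
        return None  # undefined when there is no variation
    pairs = list(zip(item_scores, total_scores))
    mean_correct = mean(t for i, t in pairs if i == 1)
    mean_incorrect = mean(t for i, t in pairs if i == 0)
    return (mean_correct - mean_incorrect) / sd * (p * (1 - p)) ** 0.5


def kr20(responses):
    """KR-20 reliability; responses is one row of 0/1 scores per examinee."""
    k = len(responses[0])
    var_total = pvariance([sum(row) for row in responses])
    if k < 2 or var_total == 0:
        return None  # undefined for a single item or identical totals
    sum_pq = 0.0
    for j in range(k):
        p = mean(row[j] for row in responses)
        sum_pq += p * (1 - p)
    return k / (k - 1) * (1 - sum_pq / var_total)
```

Every one of these functions needs to know exactly which responses belong to which question on which test, which is why the problems described next matter.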
We completed the front-end analysis and began
work on the item analysis. At that point we ran into a host
of problems:
- Tests change over time,
but the testing system had no internal methodology for tracking
this. Was the test taken by Student B in September the same
test as that taken by Student A in June? Was it a totally
different test or just a different version of the same test?
If it was a different version, what had changed?
- Similarly, questions
change over time. Did two students take the exact same test
with the same questions or did they have questions that
seemed the same but were actually different versions of
very similar questions?
- Can question results
be tracked across multiple tests? While it is always necessary
to have item results information within a test, it is often
important to be able to track question history over multiple
uses of the question, independent of which tests the question
has actually been used in.
- It was possible to go
from test to question – that is, we could look at a
test and see which questions were in it – but we couldn’t
do the inverse process: given a question, we couldn’t
see which tests it had been used in.
- There was a lack of
a good internal auditing system. If we looked at a test,
we couldn’t tell much beyond that particular administration
of the test. When was the test first created? How often
had it been used? When? How many students used it at each
administration? Who within the training department created
the test? Who modified it (if it was modified) and why?
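The gaps listed above can be illustrated with a small data model. The following is a minimal sketch, assuming an in-memory store and hypothetical class and method names (`VersionedQuestion`, `VersionedTest`, `AuditEntry`, `Repository`); it is not the design of any particular testing system. Each saved change creates a new numbered version, each test version pins the exact question versions it contains, and every change is written to an append-only audit log.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class VersionedQuestion:
    question_id: str
    version: int
    text: str


@dataclass(frozen=True)
class VersionedTest:
    test_id: str
    version: int
    questions: tuple  # (question_id, version) pairs, pinned to exact versions


@dataclass(frozen=True)
class AuditEntry:
    when: datetime
    who: str
    action: str  # "created" or "modified"
    kind: str    # "question" or "test"
    item_id: str
    version: int
    note: str


class Repository:
    """Hypothetical in-memory store; a real system would use a database."""

    def __init__(self):
        self.questions = {}  # (question_id, version) -> VersionedQuestion
        self.tests = {}      # (test_id, version) -> VersionedTest
        self.audit_log = []  # append-only list of AuditEntry

    def _next_version(self, store, item_id):
        return 1 + max((v for (i, v) in store if i == item_id), default=0)

    def _log(self, who, kind, item_id, version, note):
        action = "created" if version == 1 else "modified"
        now = datetime.now(timezone.utc)
        self.audit_log.append(
            AuditEntry(now, who, action, kind, item_id, version, note))

    def save_question(self, question_id, text, who, note=""):
        version = self._next_version(self.questions, question_id)
        question = VersionedQuestion(question_id, version, text)
        self.questions[(question_id, version)] = question
        self._log(who, "question", question_id, version, note)
        return question

    def save_test(self, test_id, question_keys, who, note=""):
        keys = tuple(tuple(k) for k in question_keys)
        missing = [k for k in keys if k not in self.questions]
        if missing:
            raise KeyError(f"unknown question versions: {missing}")
        version = self._next_version(self.tests, test_id)
        test = VersionedTest(test_id, version, keys)
        self.tests[(test_id, version)] = test
        self._log(who, "test", test_id, version, note)
        return test

    def tests_using(self, question_id, version=None):
        """Inverse lookup: which test versions contain this question?"""
        return [key for key, test in self.tests.items()
                if any(qid == question_id and (version is None or v == version)
                       for qid, v in test.questions)]

    def history(self, kind, item_id):
        """Who created or modified an item, when, and why."""
        return [e for e in self.audit_log
                if e.kind == kind and e.item_id == item_id]
```

With a structure like this, `tests_using("Q1")` answers the question-to-test lookup, and `history("test", "T1")` shows who changed a test and why. Recording each administration (date, test version, number of students) would be a natural extension that this sketch leaves out.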
Versioning and auditing are rather dull procedural
back-ends to a testing system, but they are critically important
for a complete test validation process.
|