answered question, howard wainer |
No edit summary |
In order to implement an adaptive test, you must first have scaled the item pool using item response theory (IRT). The sample size typically required for IRT scaling is at least 200, and preferably several hundred or several thousand people. As a consequence, CAT is uncommon and generally impractical in academic settings unless there is a dedicated staff that produces one or a few exams each year and can manage the CAT. The way this usually evolves is to first administer the exam on computer for a period of time while the database of questions and answers accumulates. Then the scaling is performed and an adaptive version of the exam is introduced.
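To illustrate why the pool has to be scaled first, here is a minimal sketch of adaptive item selection. It assumes a two-parameter logistic (2PL) model and maximum-information selection, which is one common approach and not necessarily what any particular CAT uses. The pool values, the function names, and the fixed-step ability update are all hypothetical and chosen only for illustration; a real CAT would estimate ability by maximum likelihood or a Bayesian method.

```python
import math

def prob_correct(theta, a, b):
    """2PL probability of a correct response for ability theta,
    discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = prob_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def next_item(theta, pool, administered):
    """Return the index of the unused item with the most information at theta."""
    candidates = [i for i in range(len(pool)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *pool[i]))

def update_theta(theta, correct, step=0.5):
    """Crude fixed-step update (an assumption made for brevity)."""
    return theta + step if correct else theta - step

# Hypothetical calibrated pool: (discrimination a, difficulty b) per item.
# These parameters are exactly what the IRT scaling step has to supply.
pool = [(1.0, -2.0), (1.0, -1.0), (1.0, 0.0), (1.0, 1.0), (1.0, 2.0)]

theta = 0.0
first = next_item(theta, pool, set())             # item whose difficulty is nearest theta
theta_after_miss = update_theta(theta, correct=False)
second = next_item(theta_after_miss, pool, {first})  # an easier item follows a miss
```

Without the calibrated `(a, b)` values, `next_item` has nothing to rank items by, which is why the scaling has to come before the adaptive version of the exam.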
I haven't seen any studies that support the notion that CATs punish the most intelligent students, although CAT does introduce a new administration method that many students dislike. For example, item review and modification are generally disallowed in CAT because if you get an easier item, then you know that you got the previous item wrong. If you can back up and correct that answer, then your score will be biased. In fact,
Why would you want to introduce CAT exams in an academic setting? CAT is really only helpful if you need extremely reliable tests over a broad range
Amead