Date of Degree
PhD (Doctor of Philosophy)
Psychological and Quantitative Foundations
A simulation study was carried out to assess the effects of using different testing frameworks and different statistical estimators in constructing a vertical scale. The adaptive multistage testing (MST) framework comprised five test forms administered across three testing occasions. The single form testing (SFT) framework comprised one form at each of the three testing occasions. Maximum likelihood estimation (MLE) and Bayesian expected a posteriori (EAP) estimators were used to estimate each simulee's ability at the three testing occasions. Item response theory (IRT) true scores, or domain scores, were used as the score scale to facilitate the use of growth scores between testing occasions. It was hypothesized that the testing framework and the estimation procedure would influence the recovery of the known domain score for each simulee across the three testing occasions, as well as the recovery of growth values between testing occasions.
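The quantities named above (domain score, MLE, EAP) can be illustrated with a minimal sketch. It assumes a two-parameter logistic (2PL) model, a standard normal prior for EAP, and a simple grid search; the abstract does not state which IRT model, prior, or estimation routine the study used, and the item parameters in `ITEMS` are hypothetical values chosen only for illustration.

```python
import math

# Hypothetical 2PL item parameters (discrimination a, difficulty b).
# These values are illustrative assumptions, not taken from the thesis.
ITEMS = [(1.0, -1.0), (1.2, -0.5), (0.8, 0.0), (1.5, 0.5), (1.0, 1.0)]

# Ability grid from -4 to 4 in steps of 0.01 (an assumed search range).
GRID = [-4.0 + 0.01 * k for k in range(801)]


def p_correct(theta, a, b):
    """2PL probability of a correct response at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def domain_score(theta, items=ITEMS):
    """IRT true score as the expected proportion correct over the item domain."""
    return sum(p_correct(theta, a, b) for a, b in items) / len(items)


def log_likelihood(theta, responses, items=ITEMS):
    """Log-likelihood of a 0/1 response pattern at ability theta."""
    ll = 0.0
    for u, (a, b) in zip(responses, items):
        p = p_correct(theta, a, b)
        ll += math.log(p) if u == 1 else math.log(1.0 - p)
    return ll


def mle(responses):
    """Grid-search maximum likelihood estimate of ability."""
    return max(GRID, key=lambda t: log_likelihood(t, responses))


def eap(responses):
    """Posterior mean of ability under an assumed N(0, 1) prior."""
    weights = [math.exp(log_likelihood(t, responses) - 0.5 * t * t) for t in GRID]
    total = sum(weights)
    return sum(t * w for t, w in zip(GRID, weights)) / total


responses = [1, 1, 1, 1, 0]
theta_mle = mle(responses)
theta_eap = eap(responses)
# Either ability estimate can be mapped onto the domain-score scale.
score_mle = domain_score(theta_mle)
score_eap = domain_score(theta_eap)
```

In this sketch the EAP estimate is pulled toward the prior mean of zero relative to the MLE, which is the general shrinkage behavior of Bayesian ability estimators and not a result reported by the study.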
Average absolute deviation (AAD) values indicated that the MST framework offered a slight reduction in error compared with the SFT framework in estimating IRT domain scores. The pattern of estimation errors indicated that the MST framework provided more accurate estimates across the range of ability. The MST framework also offered a slight reduction in error when estimating IRT growth scores. Horizontal distances between test administrations indicated that EAP estimation produced uneven departures from the known horizontal distances, whereas MLE did not. This was true for both the SFT and MST frameworks. In addition, when the distributions of IRT domain scores were considered, the distribution of MLE estimates was more consistent with the distribution of known domain scores. Overall, the MST framework performed better than the SFT framework with respect to reduced estimation error and approximation of the known IRT domain scores.
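The error summary used above can be sketched as follows, assuming AAD is the mean of the absolute differences between estimated and known values and that a growth score is the difference between a simulee's domain scores at two occasions. The abstract does not give the exact formulas, and the numbers below are made-up illustrative values, not results from the study.

```python
def average_absolute_deviation(estimates, truths):
    """Mean absolute difference between estimated and known values."""
    if len(estimates) != len(truths) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - t) for e, t in zip(estimates, truths)) / len(estimates)


def growth_scores(earlier, later):
    """Per-simulee growth: later domain score minus earlier domain score."""
    if len(earlier) != len(later):
        raise ValueError("inputs must be of equal length")
    return [l - e for e, l in zip(earlier, later)]


# Made-up domain scores for three simulees at two occasions (illustration only).
true_t1, true_t2 = [0.40, 0.55, 0.70], [0.50, 0.65, 0.85]
est_t1, est_t2 = [0.45, 0.50, 0.70], [0.50, 0.70, 0.80]

aad_t1 = average_absolute_deviation(est_t1, true_t1)
aad_growth = average_absolute_deviation(
    growth_scores(est_t1, est_t2), growth_scores(true_t1, true_t2)
)
```

Computing the same summary separately for each framework and estimator would allow the kind of comparison described above.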
IRT, vertical, multistage, adaptive, NELS
xii, 180 pages
Copyright 2008 Jonathan James Beard
Beard, Jonathan James. "An Investigation of vertical scaling with item response theory using a multistage testing framework." PhD (Doctor of Philosophy) thesis, University of Iowa, 2008.