The following is an excerpt from a letter by Rosemary K. Jones,
former AICP Director of Administration, 10/7/97.
"We have a continuous program for revising the exam and
writing new questions. We . . . obtain comprehensive data on
key professional tasks or functions performed by planners by conducting a job
analysis survey every five or so years. We use the results
to link the knowledge of planners to the dimensions of their
jobs, and use that information for the revision and further
development of our test specifications.
"The exam is graded in two stages. We use a systematic
approach to identifying and removing flawed questions from
the exam.
Those examinees who get high scores in the exam usually disagree
when it comes to picking the answer to a flawed question. They
will divide between two or even three of the possible answers
as being correct. Our statistics show which questions have
received such dispersed answers. If our review traces the problem
to the question, we grade all answers to that one question
as correct. This is done prior to scoring [the individual exams].
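
As an illustration only, the review described above can be sketched in Python. The letter does not give AICP's actual statistics or thresholds, so the function names, the 60 percent agreement threshold, and the data layout below are all assumptions.

```python
from collections import Counter

def is_dispersed(top_scorer_answers, max_share=0.6):
    """Flag a question when high scorers do not agree on one answer.

    top_scorer_answers: list of options chosen by high-scoring examinees.
    max_share: hypothetical threshold; below it the question goes to review.
    """
    counts = Counter(top_scorer_answers)
    top_share = counts.most_common(1)[0][1] / len(top_scorer_answers)
    return top_share < max_share

def raw_score(answers, key, waived):
    """Count correct answers, crediting every answer on waived questions.

    waived: set of question ids that review judged to be flawed.
    """
    return sum(1 for q, a in answers.items() if q in waived or a == key[q])

# High scorers split across A, B and C, so the question is flagged for review.
print(is_dispersed(list("AABBC")))   # True
# Four of five high scorers agree, so the question is not flagged.
print(is_dispersed(list("AAAAB")))   # False
# q1 was waived after review, so any answer to it counts as correct.
print(raw_score({"q1": "A", "q2": "C"}, {"q1": "B", "q2": "C"}, {"q1"}))  # 2
```

Note that a flagged question is only a candidate: as the letter says, credit is given to all answers only if review traces the problem to the question itself.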
Your reported score is not the number
of questions or the percent of questions you answered correctly.
The number of questions answered correctly is converted to
a scale ranging from 25 to 75. [The spread of test scores nationwide
for the 1997 exam was from a low of 39 to a high of 72.] The
number of correct answers necessary to achieve a scaled score
of 55 represents the cut score established by AICP as the minimum
acceptable competency level. It is also important to note that
scaled scores are not percentage
scores. Scaled scores allow us to report different raw scores
that represent the same kind of knowledge, skills, and abilities.
These raw scores are converted, and all of the raw scores are
mapped to the same or 'common' scale to produce the 'scaled
scores.'
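
A minimal sketch of how a raw score might be placed on the 25 to 75 scale follows. The letter does not state the conversion formula, so the linear form, the points_per_item slope, and the raw cut scores used here are assumptions chosen only to show that different raw scores can map to the same scaled score of 55.

```python
def to_scaled(raw, raw_cut, points_per_item=0.4, lo=25, hi=75, cut=55):
    """Map a raw score to the reporting scale (hypothetical linear conversion).

    raw_cut: number of correct answers that equals the cut score on this form.
    The result is pinned so that raw_cut always maps to the scaled cut of 55,
    then clipped to the reported range of 25 to 75.
    """
    scaled = cut + points_per_item * (raw - raw_cut)
    return max(lo, min(hi, round(scaled)))

# Two forms with different (made-up) raw cut scores share one scaled cut.
print(to_scaled(100, raw_cut=100))  # 55
print(to_scaled(96, raw_cut=96))    # 55
# Ten questions above the cut on the first form.
print(to_scaled(110, raw_cut=100))  # 59
# Extreme raw scores are clipped to the ends of the scale.
print(to_scaled(0, raw_cut=100))    # 25
```

The point of the sketch is that a scaled score of 55 is tied to the cut score on each form, not to a fixed number or percentage of correct answers.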
"Furthermore, the content area scores are not used
to determine whether you passed or failed the test. They are
included in the score report to indicate your strengths and
weaknesses within major content areas and are provided for
self-evaluation only. The total score
is used to determine a passing grade, not content area scores.
"The Standard Error of Measurement (SEM) is one means
of assessing the precision of a test score, which is never
perfect. If you
were to take many different forms or versions of the test on
different occasions, your score would be different on each
form but would cluster around a hypothetical 'true score.'
This 'true score' cannot be established absolutely because
any test score can be influenced by a variety of factors: the
condition of the candidate, the kind of test given, and other
factors that are unrelated to the test itself. A candidate
might perform differently on one occasion than on another.
You might try harder, be more tired or anxious, have greater
familiarity with the content of the questions on one test form
than on another, or you might simply guess correctly more often
on one test than another. All of these factors are taken into
consideration when the SEM is calculated for a specific form
of a test.
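
For readers who want a concrete picture, classical test theory commonly estimates the SEM as the score standard deviation times the square root of one minus the test's reliability. The letter does not say which method AICP uses, so the sketch below is an assumption, and the standard deviation and reliability values are invented for illustration.

```python
import math

def sem(sd, reliability):
    """Classical test theory estimate: SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)

def score_band(observed, sd, reliability, z=1.0):
    """Range of plus or minus z SEMs around an observed score."""
    margin = z * sem(sd, reliability)
    return observed - margin, observed + margin

# Invented figures: scaled-score SD of 10 and reliability of 0.91.
print(sem(10, 0.91))             # approximately 3
print(score_band(55, 10, 0.91))  # roughly 52 to 58
```

Under these made-up figures, an observed score of 55 would be read as a 'true score' most likely lying within about three points either way.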
"The AICP test is 'equated' from one year to the next.
Through the process of equating, we can make mathematical adjustments
to scores so that one test form is comparable to another form
of the test. . . . [For example,] through equating, we can
determine that a score of 35 on one test represents the same
level of math knowledge, skills, and abilities as a score of
40 on [another] test."
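
One simple and widely taught adjustment of this kind is linear equating, which matches the means and standard deviations of two forms. The letter does not name AICP's equating method, so the sketch below is an assumption, and the form statistics are invented so that the result reproduces the 35 and 40 in the example above.

```python
def linear_equate(x, mean_x, sd_x, mean_y, sd_y):
    """Express score x from form X on the scale of form Y.

    A score is mapped to the form Y score that sits the same number of
    standard deviations from the form Y mean.
    """
    return mean_y + sd_y * (x - mean_x) / sd_x

# Invented statistics: form X is harder (mean 30) than form Y (mean 35).
print(linear_equate(35, mean_x=30, sd_x=5, mean_y=35, sd_y=5))  # 40.0
```

Here a 35 on the harder form and a 40 on the easier form are both one standard deviation above their form's mean, so they are treated as equivalent.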