Interpreting Standardized Achievement Test Scores
Standardized
achievement tests have been widely used in the schools as a means of
determining how well schools are doing. These tests have been primarily
norm-referenced tests that compared local student performance to the
performance of a representative sample of students in a norm group (e.g., a
group of students at the national, regional, or state level). In the past, the
test items were selection-type items, primarily multiple choice. In recent
years, the tests have been modified to provide for criterion-referenced
interpretations as well (e.g., by including more items per task, using
open-ended tasks, and providing for interpretation by clusters of tasks). Both
types of interpretation will be discussed in turn.
Being able to interpret the various
types of norm-referenced test scores and understand how criterion-referenced
interpretations are used in standardized tests is important if the tests are
to play a useful role in the school program. It is important, of course, to
understand them well enough to be able to explain them to students and parents.
To start with, it is important to keep in mind that norm-referenced
interpretation indicates a student's relative level of performance in comparison
to others and criterion-referenced interpretation describes the task a student
can perform.
Features of Standardized
Achievement Tests
Standardized
achievement tests are designed to determine how well students are achieving a
common set of broadly based goals. Well-constructed standardized achievement
tests typically have the following features.
1. The content of the test is based on widely used
textbooks and curriculum guides.
2. The test items are written by test experts in
consultation with subject-matter experts and are based on a clear set of
specifications.
3. The test items are tried out, reviewed, analyzed for
difficulty and discriminating power, and either revised or eliminated.
4. The final set of items is selected on the basis of
the test specifications.
5. Directions for administering and scoring the test
are rigidly prescribed.
6. The test is administered to selected groups of
students to establish national, regional, or statewide norms for interpretation
of the test scores.
7. The final version of the test is published along
with a test manual that describes the test’s technical qualities and the
procedures for administering, scoring, interpreting, and using the results.
Thus, a standardized test measures a
standard set of broadly based educational outcomes, uses standard directions
and standard scoring procedures, and provides for a comparison of a student’s
score to that of similar students who have taken the same test under the same
conditions. If a battery of tests is used, and all tests have been standardized
on the same norm group, a student’s performance on the different tests can also
be compared. On a basic skill battery, for example, we can determine a
student’s relative level of performance in reading, language, and mathematics.
With comparable forms of the test, we can also examine learning progress over a
series of grade levels. All of these norm-referenced interpretations and
comparisons of standardized test scores require an understanding of the various
types of test scores that are used in describing students' test performance.
Interpreting Norm-Referenced Scores
The
score a student receives when a test has been scored according to the
directions is called the raw score. On a classroom test, this is typically the
number of items a student answers correctly. Although raw scores are used in
classroom testing, the interpretations and comparisons made with standardized
tests require that the raw scores be converted to some type of derived score.
Comparison of performance on two different tests (e.g., reading and math), for
example, requires that both tests be on the same scale. Raw scores won't work
because the two tests may differ in the number of items they contain. By
converting the raw scores on both tests to the same derived score scale, we
provide a common basis for comparing relative performance. Although there are
many different types of derived scores, the most common types used in school
achievement testing are
1. Percentile ranks.
2. Grade equivalent scores.
3. Standard scores.
The raw scores on a standardized
test are converted to derived scores during the norming of the test. Attempts
are made to obtain norm groups that contain a sample of students like those for
whom the test is intended. National norms, for example, typically include
students from the various geographic regions of the United States, urban and
rural schools, and schools of different sizes. A balance of boys and girls,
socioeconomic levels, and ethnic groups is also sought. Thus, national norms
should approximate as closely as possible the student population throughout
the United States. The same care is typically also followed in obtaining
regional, state, and special group norms (e.g., private schools). Despite the
care taken in obtaining norm groups, however, the obtained sample of students only
approximates the ideal sample, due to such constraints as the needed
cooperation of selected schools to administer the tests and the time limits for
obtaining the norm sample.
After the norm groups have been
selected and the tests administered and scored, the raw scores are converted to
derived scores and presented in the test manual in tables of norms. These
tables present the raw scores and derived scores in parallel columns so that a raw
score can be converted into a derived score by going across from the raw score
to the derived score. Of course, the printout from
machine scoring will give both the raw score and the derived score.
Before using the derived scores from
a standardized test, it is wise to consider the nature of the norm group. Does the norm group provide a
relevant basis for interpreting student performance? How was the norm group
obtained? When were the norms obtained? We can obtain the most meaningful
norm-referenced interpretation of test scores when the norms are relevant,
representative, and up to date.
A final caution. The scores in the
norm group should not be viewed as goals or standards. They are simply the
scores that a representative group of students have earned on the test. They
aid in interpreting and comparing test performance, but they do not represent
levels of performance to strive for. They are average or typical scores
obtained in average or typical schools.
Percentile
Ranks
The
percentile rank is one of the easiest scores to understand and to interpret to
parents. A percentile rank indicates a student's relative position in a group
in terms of the percentage of group members scoring at or below the student's
raw score. For example, if a raw score of 33 equals a percentile rank of 80, it
means 80 percent of the group members had raw scores equal to or lower than 33.
By converting raw scores to percentile ranks, we give the scores a common
meaning with different size groups and for different length tests.
To further clarify the meaning of
percentile ranks, Table 13.1 illustrates how raw scores are converted to
percentile ranks. The following steps illustrate the procedure.
1. The raw scores are ranked from high to low (column
1).
2. The number of students obtaining each score is listed
in the frequency column (column 2).
3. The score frequencies are added from the bottom up
(i.e., adding each score frequency to the total frequency of all lower scores)
to obtain the cumulative frequency (column 3).
4. Apply the following formula at each score level to
get the percentile rank for that raw score (column 4).

PR = (CF / N) × 100

where
PR = percentile rank
CF = cumulative frequency
N = number of students in the group
Table 13.1 Frequency Distribution and
Percentile Ranks for an Objective Test of 40 Items
To illustrate the
computation, let’s compute the percentile ranks for two scores.
Score 33 
Score 30 
Percentile ranks are rounded to the
nearest whole number, so the percentile rank for the raw score of 30 is listed
in Table 13.1 as 58. To be sure you understand this procedure, you can compute
the percentile ranks of other raw scores in the table and check your answers.
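The four-step procedure above can be sketched in a few lines of code. This is a minimal illustration only; the frequency distribution below is hypothetical, since the actual Table 13.1 data are not reproduced here.

```python
# A sketch of the four-step percentile-rank procedure described above.
# The frequency distribution is hypothetical (not the actual Table 13.1 data).

def percentile_ranks(freq):
    """freq maps raw score -> number of students earning that score.
    Returns raw score -> percentile rank, PR = (CF / N) * 100, rounded."""
    n = sum(freq.values())
    ranks = {}
    cumulative = 0
    for score in sorted(freq):     # work from the lowest score up
        cumulative += freq[score]  # cumulative frequency (CF)
        ranks[score] = round(cumulative / n * 100)
    return ranks

freq = {25: 2, 28: 3, 30: 5, 33: 6, 36: 3, 38: 1}   # hypothetical data
print(percentile_ranks(freq))
```

With these made-up frequencies, a raw score of 33 happens to receive a percentile rank of 80: 16 of the 20 students scored at or below 33.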
When interpreting percentile ranks,
there are a number of cautions to be kept in mind. (1) Percentile ranks
describe test performance in terms of the percentage
of persons earning a lower score and not
the percentage of items answered correctly. The percentage correct score is a
criterion-referenced interpretation; percentile rank indicates relative
standing and, therefore, is a norm-referenced score. (2) Percentile ranks are
always specific to a particular group. For example, a percentile rank of 90 in
a gifted group represents higher test performance than a percentile rank of 90
in an average group. Thus, whenever we are describing a student’s relative
performance, knowing the nature of the group is just as important as knowing
the student’s relative standing. (3) Percentile ranks are not equally spaced on
the scale. A difference of 5 percentile ranks near the middle of the
distribution of scores represents a smaller difference in test performance than
a 5-percentile-rank difference at the ends of the distribution. This is because
percentile ranks are based on the percentage of persons being surpassed and
there is a larger percentage of persons in the middle of a score distribution
to surpass than at the ends of the distribution. For example, at the high end
of the distribution, a raw score difference of several points will make little
difference in percentile rank because there are so few high scores. Although
this limits some uses of percentile ranks (e.g., they can’t be directly
averaged), they remain one of the most useful and easiest to interpret types of
derived scores.
Percentile
Bands.
Some
test manuals use percentile bands in presenting test norms. Instead of a
specific percentile rank for each raw score, a range of percentile ranks is
presented. For example, a table of norms may show that a raw score of 52 has a
percentile band of 60-64. This allows for the possible error in the test score.
The band tells us that we can be fairly certain that a student who earns a raw
score of 52 on the test has a relative standing that falls somewhere between the
60th and 64th percentile ranks. We cannot be more precise than this because our
estimates of test performance (i.e., raw scores) always contain some error, due
to such factors as fluctuations in attention, memory, effort, and luck in
guessing during testing.
The width of the percentile band is
determined by the reliability of the test. With a highly reliable test, the
band is narrow. With a test of low reliability, the band is wide. The width of
the band is computed by using the standard error of measurement. This is a
statistic computed from the reliability coefficient and used to estimate the
amount of error in an individual test score (Chapter 4). These error bands can,
of course, be computed for raw scores or for any type of derived score. They are
sometimes called confidence bands because they indicate how much confidence we
can have in the score representing a person's test performance. With a narrow
band, we are more confident that the score represents the person's "true" or
"real" level of achievement.
In addition to using percentile
bands to interpret an individual's test performance, percentile bands can also
be used to interpret differences in test performance on a battery of tests. In
comparing percentile bands for three different tests, we can conclude that where
the bands do not overlap there is probably a "real" difference in test
performance and where they do overlap, the differences are likely to be due to
error. For example, the following percentile bands from a test battery for Maria
indicate that there is no "real" difference in her performance in reading and
language, but she is lower in math.

Maria's percentile bands
Reading: 70-75
Language: 74-79
Math: 63-68
The use of percentile bands prevents
us from overinterpreting small differences in test scores. Test publishers that
use percentile bands typically plot them as bars on students' test profiles,
making it easy to determine when the ends of the bands overlap and when they
don't.
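The overlap rule can be sketched in code, using Maria's percentile bands from the text as input. The `bands_overlap` helper is a hypothetical illustration of the comparison, not a publisher's procedure.

```python
# A sketch of the overlap rule: non-overlapping bands suggest a "real"
# difference; overlapping bands suggest the difference may be due to error.
# Maria's percentile bands from the text are used as input.

def bands_overlap(band_a, band_b):
    """Each band is a (low, high) pair of percentile ranks."""
    return band_a[0] <= band_b[1] and band_b[0] <= band_a[1]

reading = (70, 75)
language = (74, 79)
math_band = (63, 68)

print(bands_overlap(reading, language))   # True: no "real" difference
print(bands_overlap(reading, math_band))  # False: probably a "real" difference
```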
Grade
Equivalent Scores
Grade
equivalent scores provide another widely used method of describing test
performance. They are used primarily at the elementary school level. With these
scores, a student's raw score on the test is converted to the grade level at
which the score matches the average raw score of students in the norm group. As
with other derived scores, tables in the test manual present parallel columns of
raw scores and grade equivalents. Thus, all we need to do is consult the table
and obtain the grade equivalent for any given raw score. Although it is easy to
obtain and is apparently easy to interpret, it is probably one of the most
misinterpreted types of score. Let's take a look at what the grade equivalent
score means and what it doesn't mean.
To clarify the meaning of grade
equivalents, let's assume that we obtained the following grade equivalent scores
from a test battery for Dave.
Reading 4.5
Language 6.5
Math 7.8
First note that the grade
equivalent score is expressed in terms of the grade level and the month in that
school year. Thus, Dave's score in reading is equal to the average score earned
by students (in the norm group) in the middle of the fourth grade. Because Dave
is in the middle of the fourth grade, we interpret his performance in reading
as average. In language, Dave is two years advanced, and in math, he is more
than three years advanced. Does that mean that Dave can do the work at these
levels? No, it most likely means that he does fourth-grade work in these areas
faster and more efficiently than other fourth-graders. The test probably did
not include sixth- and seventh-grade material. The same misinterpretations can
occur with low grade equivalents. If Dave had a math score of 2.0, for example,
it wouldn't mean he could only do second-grade math problems. It would more
likely mean that he did fourth-grade problems slower and with more errors than
other fourth-graders. High and low grade equivalent scores are typically
obtained by extrapolation and do not represent average scores earned by those
groups. This is often necessary because students at lower grade levels may not
have the knowledge and skills needed to take the test, and students at higher
grade levels may have moved beyond the types of skills measured by the test.
Grade equivalent scores provide a
simple method of interpreting test performance, but when using them and
interpreting them to parents, the following common misinterpretations should be
avoided.
1. They are not standards to be achieved but simply the average
scores of students in the norm group.
2. They do not indicate the grade level at which a student can
do the work.
3. Extremely high and low grade equivalent scores are not as
dependable indicators of test performance as those near the student’s grade
level.
In addition to these cautions to be
observed when interpreting an individual’s grade equivalent scores, a
comparison of scores on tests in a test battery requires an additional caution.
Growth in basic skills, for example, is uneven. In reading, growth is more rapid
than in math, which depends more directly on the skills taught in school. Thus,
a difference of a year in grade equivalent scores represents a larger
difference in achievement on a reading test than on a math test. In addition,
growth in achievement tends to slow down at different times for different
skills, and when growth slows down, the differences in achievement between grade
equivalent scores become smaller. Both the variations in growth of skill from
one area to another and the variations in patterns of growth over time
contribute to the unevenness of the units on our grade equivalent score scale.
In comparing a student's grade equivalent scores on different tests from a test
battery, it may be wise to look at the norm table to see how the raw scores
spread out on each test. A high or low grade equivalent score may be based on
relatively few raw score points.
Standard
Scores
A
standard score describes test performance in terms of how far a raw score is
above or below average. It is expressed in units that are computed from the
mean and the standard deviation of a set of scores. We are all familiar with
the use of the mean as an average. It is obtained by summing the test scores
and dividing by the number of scores. The computation for obtaining the
standard deviation is shown in Box 13.1, but that does not help us understand
its meaning or its use in interpreting standard scores. This can best be done
by describing its
properties and showing how it is used as the basic unit for the various types
of standard scores.
The standard deviation (SD or s) is an
important and widely applicable statistic in testing. In addition to its use as
a basic unit in standard scores, it also serves as a basis for computing
reliability coefficients and the standard error of measurement, as we saw in
Chapter 4.
The Mean, Standard Deviation, and Normal Curve. The
mean and the standard deviation
can probably be best understood in terms of the normal curve, although a normal
distribution is not required for computing them. The normal curve is a
symmetrical bell-shaped curve based on a precise mathematical equation. Scores
distributed according to the normal curve are concentrated near the mean. A
sample normal curve is presented in Figure 13.1.
It will be noted in Figure 13.1 that
the mean falls at the exact center of a normal distribution. Note also that when
the normal curve is divided into standard deviation (SD) units, which are equal
distances along the baseline of the curve, each portion under the curve contains
a fixed percentage of cases. Thus, 34 percent of the cases fall between the mean
and +1 SD, 14 percent between +1 SD and +2 SD, and 2 percent between +2 SD and
+3 SD. Since the curve is symmetrical, the same percentages, of course, apply to
the intervals below the mean. These percentages have been rounded to the nearest
whole number, but only a small fraction of a percent (0.13 percent) of the cases
fall above and below three standard deviations from the mean. Thus, from a
practical standpoint, a normal distribution of scores falls between -3 and +3
standard deviations from the mean.
To aid in understanding the meaning
of the standard deviation, a set of raw scores with a mean of 40 and a standard
deviation of 5 has been placed below the baseline of the curve in Figure 13.1.
Note that the mean raw score of 40 has been placed at the zero point and that
the distance of one standard deviation is 5 raw score points everywhere along
the baseline of the curve. Thus, the point one standard deviation
above the mean equals 45 (40 + 5). In
this particular set of scores, then, approximately 68 percent of the scores
(about two thirds) fall between 35 and 45, approximately 96 percent fall between
30 and 50, and approximately 99.7 percent fall between 25 and 55 (the figure
shows 100 percent because the numbers are rounded).
When the standard deviation is
being computed for a set of normally distributed scores, we are essentially
determining how far we need to go above (or below) the mean in raw score
points to include 34 percent of the cases. The scores obtained with
standardized tests typically approximate a normal distribution or are
normalized by statistical means and thus permit the types of interpretations we
are making here.
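The mean and standard deviation computation referenced in Box 13.1 (not reproduced here) can be sketched as follows. The set of raw scores is hypothetical.

```python
# A sketch of the mean and (population) standard deviation computation
# referenced in Box 13.1. The raw scores below are hypothetical.
import math

def mean_and_sd(scores):
    m = sum(scores) / len(scores)   # the mean: sum divided by the count
    # variance: the average squared deviation from the mean
    variance = sum((x - m) ** 2 for x in scores) / len(scores)
    return m, math.sqrt(variance)   # SD is the square root of the variance

scores = [30, 35, 40, 40, 45, 50]
m, sd = mean_and_sd(scores)
print(m, sd)   # mean is 40.0
```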
z-Scores. A number of standard scores are based on the
standard deviation unit. The simplest of these, and the one that is basic to the
others, is the z-score. This score indicates how many standard deviations a raw
score is above or below the mean. The raw score of 45 in Figure 13.1, for
example, would be assigned a z-score of 1.0 because it is one standard deviation
above the mean. The raw score of 30 in Figure 13.1 would be given a z-score of
-2.0 because it is two standard deviations below the mean. The formula for
z-scores is:

z-score = (raw score - mean) / standard deviation

For example, the z-scores for raw scores of 47 and 36 in Figure 13.1 would be
computed as follows:

z = (47 - 40) / 5 = 1.4
z = (36 - 40) / 5 = -0.8

Thus, a raw score of 47 is 1.4 standard deviations above the mean and a raw
score of 36 is 0.8 standard deviations below the mean.
While used in research, z-scores are
seldom used directly in test interpretation because of their use of decimal
points and minus signs. Instead, z-scores are converted to other types of
standard scores that use only whole numbers and positive values. Such scores are
more convenient to use and avoid the possibility of misinterpretation due to a
forgotten minus sign.
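The z-score formula and the two worked examples can be checked with a few lines of code (the mean of 40 and SD of 5 are the Figure 13.1 values).

```python
# The z-score formula from the text, applied to the worked examples
# (mean 40 and SD 5 come from the Figure 13.1 score distribution).

def z_score(raw, mean, sd):
    return (raw - mean) / sd

print(z_score(47, 40, 5))   # 1.4 (1.4 SDs above the mean)
print(z_score(36, 40, 5))   # -0.8 (0.8 SDs below the mean)
```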
There are a number of different types
of standard scores used in test interpretation. All one needs to do is select an
arbitrary mean and standard deviation and convert the z-scores into the new
standard score units. To illustrate, we will describe the procedure for some
common standard scores.
T-Scores. T-scores have a mean of 50 and a standard deviation of
10. They are obtained from z-scores by multiplying the z-score by 10 and adding
the result to 50, as shown in the following formula.

T-score = 50 + 10(z-score)

Applying the formula to the various z-scores discussed earlier (1.0, -2.0, 1.4,
-0.8), we would obtain T-scores as follows:

T = 50 + 10(1.0) = 60
T = 50 + 10(-2.0) = 30
T = 50 + 10(1.4) = 64
T = 50 + 10(-0.8) = 42
T-scores can be easily interpreted because
they always have the same mean and standard deviation. A T-score of 60 always
means one standard deviation above the mean and a T-score of 30 always means
two standard deviations below the mean. Thus, with the use of T-scores, an
individual's performance on different tests can be directly compared, and the
scores can be combined or averaged without the distortion of different size
standard deviations, which occurs with raw scores.
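The T-score conversion is a one-line formula; a sketch applying it to the z-scores discussed earlier follows. Rounding to a whole number is an assumption here, since T-scores are conventionally reported without decimals.

```python
# The T-score conversion from the text: T = 50 + 10 * z, applied to the
# z-scores discussed earlier. Rounding to whole numbers is assumed.

def t_score(z):
    return round(50 + 10 * z)

for z in (1.0, -2.0, 1.4, -0.8):
    print(t_score(z))   # 60, 30, 64, 42
```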
Where a normal distribution can be
assumed, T-scores can also be interpreted in terms of percentile ranks
because, in this case, there is a direct relationship between the two, as shown
in Figure 13.2. Note that a T-score of 30 is equivalent to a percentile rank of
2, a T-score of 40 is equivalent to a percentile rank of 16, and so on. This
relationship makes it possible to use standard scores for those purposes where
equal units are needed and to use percentile ranks when interpreting test
performance to students and parents.
Because T-scores and percentile ranks
both have a mean of 50 and use similar two-digit numbers, the two types of
scores are often confused by those inexperienced in test interpretation. Thus,
it is important to keep in mind that a percentile rank indicates the percentage
of individuals who fall at or below a given score, while a T-score indicates how
many standard deviation units a given score falls above or below the mean. Note
in Figure 13.2 that although percentile ranks and T-scores have the same mean
value of 50, below the mean percentile ranks are smaller than T-scores and above
the mean they are larger than T-scores. This is accounted for, of course, by the
fact that percentile ranks are crowded together in the center of the
distribution and spread out at the ends, while T-scores provide equal units
throughout the distribution of scores.
Normal Curve Equivalent Scores (NCE). Another
standard score that might be confused with both T-scores and percentile ranks is
the normal curve equivalent score (NCE). This set of scores also has a mean of
50, but a standard deviation of 21.06. This provides a set of scores with equal
units, like the T-score, but the scores range from 1 to 99. Percentile ranks
also range from 1 to 99, but they do not provide equal units. As can be seen in
Figure 13.2, a one-SD interval covers more percentile ranks at the middle of the
distribution (e.g., 34 percent) than at the ends (e.g., 2 percent). Thus, when
interpreting NCE scores, don't confuse them with T-scores, which have a more
restricted range of scores (typically 20 to 80), or with percentile ranks, which
have the same range of scores (1 to 99) but are based on unequal units.
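Where a normal distribution holds, the standard-score-to-percentile-rank relationship shown in Figure 13.2 can be approximated with the normal cumulative distribution function. This is a sketch only, not a publisher's norm table; real tests use empirically derived norms.

```python
# A sketch of the Figure 13.2 relationship: in a normal distribution, any
# standard score can be converted to a percentile rank through the normal
# cumulative distribution function (math.erf avoids external libraries).
import math

def percentile_rank_from_z(z):
    """Percentage of cases falling at or below z, rounded."""
    return round(100 * 0.5 * (1 + math.erf(z / math.sqrt(2))))

print(percentile_rank_from_z((30 - 50) / 10))    # T-score 30  -> PR 2
print(percentile_rank_from_z((40 - 50) / 10))    # T-score 40  -> PR 16
print(percentile_rank_from_z((116 - 100) / 16))  # ability 116 -> PR 84
```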
Ability Scores. Publishers of achievement test
batteries typically administer a test of learning ability (also called
cognitive ability, school ability, or intelligence) to the same norm groups as
the achievement battery to make comparisons of learning ability and
achievement possible. The scores on these tests are now reported as standard
scores with a mean of 100 and a standard deviation of 16 (15 on some tests).
These scores are to be interpreted like any other standard score (see Figure
13.2). A score of 116 means one standard deviation above the mean (percentile
rank = 84). These scores were originally called deviation IQs because they
replaced the old ratio IQ (i.e., MA/CA × 100). More recently, however, to avoid
the confusion surrounding IQ scores, they have been given more appropriate
names, such as school ability scores and standard age scores.
Stanine Scores. Test scores can also be expressed
by single-digit standard scores called stanines (pronounced stay-nines). The
stanine scale divides the distribution of raw scores into nine parts (the
term stanine was derived from "standard nine"). The highest stanine score is
9, the lowest is 1, and stanine 5 is located in the center of the
distribution. Each stanine, except 9 and 1, includes a band of raw scores one
half of a standard deviation wide. Thus, stanines are standard scores with a
mean of 5 and a standard deviation of 2. The distribution of stanines and the
percentage of cases in each stanine are shown in Figure 13.2.
The
nine-point scale is simple to interpret to students and parents because test
performance is described on a nine-point scale, where 5 is average. Because each
stanine includes a band of raw scores, there is also less chance that test
performance will be overinterpreted. When comparing scores on two different
tests in a test battery, a difference of two stanines is typically significant.
Thus, in interpreting the following scores for a student, we would conclude that
the student is higher in math but that there is no difference between reading
and language.

Reading stanine = 5
Language stanine = 4
Math stanine = 7
In
addition to the ease of interpretation, stanines provide a simple method for
combining and averaging test scores. The conversion of raw scores to stanines
puts the scores from different tests on the same standard score scale, with
equal units. Thus, they have uniform meaning from one part of the scale to
another: a difference between stanine 5 and stanine 7 is the same as a
difference between a stanine of 4 and a stanine of 6. A difference of two
stanines is also the same whether we are referring to a reading test, a
language test, a math test, or any other test. Like other standard scores,
stanines provide equal units that can be readily combined. Unlike other
standard scores, they are easy to interpret and to explain to others. See Box
13.2 for suggestions on how to interpret test results.
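The stanine scale described above can be sketched in code. The half-SD band placement follows the description in the text; the `real_difference` helper is a hypothetical name for the two-stanine rule of thumb, applied to the reading/language/math example.

```python
# A sketch of the stanine scale: stanine 5 is centered on the mean and each
# stanine except 1 and 9 spans half a standard deviation. The two-stanine
# rule of thumb is applied to the example scores from the text.
import math

def stanine_from_z(z):
    """Convert a z-score to a stanine (1-9)."""
    return max(1, min(9, int(math.floor(2 * z + 5.5))))

def real_difference(stanine_a, stanine_b):
    """Rule of thumb from the text: a two-stanine difference is significant."""
    return abs(stanine_a - stanine_b) >= 2

reading, language, math_score = 5, 4, 7
print(real_difference(reading, math_score))  # True: higher in math
print(real_difference(reading, language))    # False: no "real" difference
```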
Criterion-Referenced Interpretation
To make standardized achievement test batteries more
useful for instructional purposes, some test publishers have modified the
multiple-choice items to include more real-life situations, added open-ended
performance tasks, and made provisions for criterion-referenced interpretation
of test performance.
One of the
best known methods of reporting criterion-referenced test results is the
percentage-correct score. This simply reports the percentage of test items in a
test, or a subset of items, answered correctly. The reports can be for
individual students, the class, the school, or the entire school district.
Percentage-correct scores can be compared to those earned by students in the
national norm sample as one basis for evaluating the schools.
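The percentage-correct score is simple arithmetic: the percentage of items in a test (or item cluster) answered correctly. A minimal sketch with hypothetical item counts:

```python
# The percentage-correct score: correct items divided by total items,
# times 100. The cluster sizes and scores below are hypothetical.

def percentage_correct(num_correct, num_items):
    return round(100 * num_correct / num_items)

print(percentage_correct(6, 8))    # 75: 6 of 8 cluster items correct
print(percentage_correct(34, 40))  # 85: 34 of 40 test items correct
```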
Scores for
individuals can also be presented by clusters of items representing a content
area, skill, or objective, with an indication of the level of performance
(e.g., above average, below average). Care must be taken in interpreting these
results where a small number of items is included in the item clusters.
Some test
batteries provide reports that include standards of performance. The standards
are typically set by panels of educators, and the report indicates a student's
level of performance by categories ranging from lack of mastery to superior
performance.
When making
criterion-referenced interpretations of test performance, there are a number of
questions to keep in mind.
1. Do the objectives the tests were designed to measure match the school's
objectives for these subjects (e.g., routine skills versus reasoning)?
2. Are the skills and content measured by the tests appropriate for the grade
levels tested?
3. Did elimination of the easy items from the test, to obtain greater
discrimination among students for norm-referenced interpretation, result in an
inadequate description of what low-achieving students can do?
4. Was there a sufficient number of test items in each item cluster to permit
criterion-referenced interpretation?
5. Was the procedure for setting performance standards adequate for this type
of interpretation?
Criterion-referenced interpretations of standardized tests can be useful
in classroom instruction, but they must be made cautiously.
Summary of Points
1. Standardized achievement tests have been widely used in the schools to
determine how student performance compares to that of a sample of students
(i.e., a norm group) at the national, regional, or state level.
2. Standardized tests are carefully constructed to fit a set of test
specifications, tried out, improved, and administered to a norm group for
norm-referenced test interpretation.
3. The most common types of norm-referenced scores used with standardized tests
are percentile ranks, grade equivalent scores, and standard scores.
4. A percentile rank indicates relative position in a group in terms of the
percentage of group members scoring at or below a given score. It should not be
confused with the percentage of items answered correctly (a criterion-referenced
interpretation).
5. Percentile bands are used in reporting test performance to allow for
possible error in test scores. The width of the band indicates the amount of
error to allow for during interpretation, and it prevents the overinterpretation
of small differences in test scores.
6. A grade equivalent score indicates relative test performance in terms of the
grade level at which the student's raw score matches the average score earned by
the norm group. Thus, a grade equivalent score of 4.5 indicates performance
equal to that of the average student in the middle of the fourth grade. Grade
equivalent scores are easy to interpret, but they are subject to numerous
misinterpretations.
7. Standard scores are based on the mean (M) and standard deviation (SD) of a
set of scores. To fully understand them it is necessary to understand the
meaning of these statistics.
8. Standard scores indicate the number of standard deviations a raw score falls
above or below the mean. They are more difficult to understand and interpret to
others, but they have the advantage of providing equal units.
9. The standard scores discussed in this chapter have the following means (M)
and standard deviations (SD). The third column shows the score for one standard
deviation above the mean.

Score type      M      SD      +1 SD
z-scores        0      1       1
T-scores        50     10      60
NCE scores      50     21.06   71
Ability scores  100    16      116
Stanines        5      2       7
10. In a normal distribution, any standard score can be converted to a
percentile rank for easy interpretation. For example, one standard deviation
above the mean (see column 3) has a percentile rank of 84, no matter what type
of standard score is used to express test performance.
11. Because T-scores, NCE scores, and percentile ranks all have a mean of 50
and use similar two-digit numbers, care must be taken not to confuse them when
interpreting test performance.
12. Stanines are single-digit scores that range from 1 to 9. They are easily
explained to students, and a difference of two stanines typically indicates a
significant (i.e., "real") difference in test performance.
13. Stanines and percentile ranks (using percentile bands) are the two types of
scores that are favored when interpreting test results to others and judging
differences between test scores.
14. Criterion-referenced interpretations of test performance have been added to
many standardized tests. These include percentage-correct scores, the use of
performance standards, and interpretation by item clusters representing a
content area, skill, or objective.
15. Criterion-referenced interpretations of standardized tests require a check
on how well the objectives, content, and skills of the test match the local
instructional program, whether the construction of the test favors
criterion-referenced interpretation, whether there is a sufficient number of
test items for each type of interpretation, and how the performance standards
are determined.