NAEP Technical Documentation

Range of response codes, percentage exact agreement, and Cohen's Kappa or intraclass correlation for the constructed-response items used in scaling, by block and item, grade 4 reading assessment: 2000

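| Block | Item | Range of response codes | Sample size | Percentage exact agreement | Cohen's Kappa | Intraclass correlation |
|---|---|---|---|---|---|---|
| R3 | R017001 | 1–2 | 500 | 86 | 0.68 | † |
| | R017003 | 1–3 | 500 | 81 | † | 0.87 |
| | R017004 | 1–2 | 500 | 90 | 0.80 | † |
| | R017006 | 1–2 | 500 | 91 | 0.83 | † |
| | R017007 | 1–4 | 500 | 78 | † | 0.91 |
| | R017009 | 1–3 | 500 | 87 | † | 0.94 |
| R4 | R012102 | 1–2 | 500 | 95 | 0.91 | † |
| | R012104 | 1–2 | 500 | 93 | 0.86 | † |
| | R012106 | 1–2 | 500 | 92 | 0.87 | † |
| | R012108 | 1–2 | 500 | 96 | 0.93 | † |
| | R012109 | 1–2 | 500 | 96 | 0.92 | † |
| | R012111 | 1–4 | 500 | 91 | † | 0.96 |
| | R012112 | 1–2 | 500 | 92 | 0.88 | † |
| R5 | R012601 | 1–2 | 500 | 93 | 0.83 | † |
| | R012604 | 1–2 | 500 | 93 | 0.85 | † |
| | R012607 | 1–4 | 500 | 83 | † | 0.88 |
| | R012611 | 1–2 | 500 | 92 | 0.87 | † |
| R6 | R017301 | 1–2 | 500 | 94 | 0.86 | † |
| | R017303 | 1–3 | 500 | 88 | † | 0.93 |
| | R017305 | 1–2 | 500 | 95 | 0.92 | † |
| | R017307 | 1–4 | 500 | 85 | † | 0.92 |
| | R017309 | 1–3 | 500 | 88 | † | 0.95 |
| R7 | R012702 | 1–2 | 500 | 91 | 0.78 | † |
| | R012703 | 1–2 | 500 | 88 | 0.78 | † |
| | R012705 | 1–2 | 500 | 93 | 0.86 | † |
| | R012706 | 1–2 | 500 | 83 | 0.70 | † |
| | R012708 | 1–4 | 500 | 83 | † | 0.90 |
| | R012710 | 1–2 | 500 | 94 | 0.91 | † |
| R8 | R015702 | 1–3 | 500 | 81 | † | 0.79 |
| | R015703 | 1–3 | 500 | 90 | † | 0.90 |
| | R015704 | 1–3 | 500 | 83 | † | 0.89 |
| | R015705 | 1–3 | 500 | 93 | † | 0.96 |
| | R015707 | 1–4 | 500 | 85 | † | 0.91 |
| | R015709 | 1–3 | 500 | 95 | † | 0.98 |
| R9 | R015802 | 1–2 | 500 | 96 | 0.90 | † |
| | R015803 | 1–3 | 500 | 88 | † | 0.88 |
| | R015804 | 1–4 | 500 | 77 | † | 0.85 |
| | R015806 | 1–3 | 500 | 86 | † | 0.92 |
| | R015807 | 1–3 | 500 | 89 | † | 0.94 |
| | R015809 | 1–3 | 500 | 93 | † | 0.96 |
| R10 | R012503 | 1–2 | 500 | 90 | 0.82 | † |
| | R012504 | 1–2 | 500 | 98 | 0.97 | † |
| | R012506 | 1–2 | 500 | 94 | 0.90 | † |
| | R012508 | 1–2 | 500 | 97 | 0.95 | † |
| | R012511 | 1–2 | 500 | 98 | 0.96 | † |
| | R012512 | 1–4 | 500 | 84 | † | 0.95 |
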
† The intraclass correlation is not reported for dichotomously scored items; Cohen's Kappa is not reported for polytomously scored items.
NOTE: Cohen's Kappa is a measure of reliability that is appropriate for items that are dichotomously scored. The intraclass correlation coefficient is most appropriate for items with more than two categories.
SOURCE: U.S. Department of Education, Institute of Education Sciences, National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 2000 Reading Assessment.
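
For readers who want to see how two of the reported statistics relate to each other, the following is a minimal sketch of percentage exact agreement and Cohen's Kappa for a pair of raters. It uses the standard textbook definition of Kappa (observed agreement corrected for the agreement expected by chance from each rater's marginal code frequencies). The function names and the ten example scores are hypothetical, invented for illustration; they are not NAEP data and this is not NAEP's scoring software. The intraclass correlation is left out because the table does not state which form of the coefficient was used.

```python
from collections import Counter


def exact_agreement_pct(first, second):
    """Percentage of responses given the same code by both raters."""
    if len(first) != len(second) or not first:
        raise ValueError("rater score lists must be non-empty and equal in length")
    matches = sum(a == b for a, b in zip(first, second))
    return 100.0 * matches / len(first)


def cohens_kappa(first, second):
    """Cohen's Kappa: (p_o - p_e) / (1 - p_e) for two raters."""
    n = len(first)
    p_o = exact_agreement_pct(first, second) / 100.0  # observed agreement
    counts_1 = Counter(first)
    counts_2 = Counter(second)
    # Chance agreement: product of the two raters' marginal proportions,
    # summed over every code either rater used.
    p_e = sum(counts_1[c] * counts_2[c] for c in counts_1.keys() | counts_2.keys()) / (n * n)
    if p_e == 1.0:
        # Both raters used one and the same code throughout; Kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical scores for a dichotomous item (codes 1-2), NOT NAEP data.
first_rater = [1, 1, 2, 2, 1, 2, 1, 1, 2, 2]
second_rater = [1, 1, 2, 1, 1, 2, 1, 2, 2, 2]

print(exact_agreement_pct(first_rater, second_rater))
print(round(cohens_kappa(first_rater, second_rater), 2))
```

In this made-up example the raters agree on 8 of 10 responses, and chance agreement is 0.5 because each rater assigns each code half the time, so Kappa comes out near 0.6. This is why an item in the table can show high percentage exact agreement alongside a noticeably lower Kappa: Kappa discounts the agreement that would occur by chance.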

Last updated 15 July 2008 (KL)