Digest of Education Statistics: 2011

NCES 2012-001
May 2012

Appendix A.2. National Assessment of Educational Progress

The National Assessment of Educational Progress (NAEP) is a series of cross-sectional studies initially implemented in 1969 to gather information about selected levels of educational achievement across the country. At the national level, NAEP is divided into two assessments: long-term trend NAEP and main NAEP. NAEP has surveyed students at specific ages (9, 13, and 17) for the long-term trend NAEP and at grades 4, 8, and 11 or 12 for main NAEP, state NAEP, and long-term writing NAEP. NAEP in the early years also surveyed young adults (ages 25 to 35). The assessment data presented in this publication were derived from tests designed and conducted by the Education Commission of the States (from 1969 to 1983) and by the Educational Testing Service (ETS) (from 1983 to the present).

Long-term trend

NAEP long-term trend assessments are designed to inform the nation of changes in the basic achievement of America's youth. Nationally representative samples of students have been assessed in science, mathematics, and reading at ages 9, 13, and 17 since the early 1970s. Students were assessed in writing at grades 4, 8, and 11 between 1984 and 1996. To measure trends accurately, assessment items (mostly multiple choice) and procedures have remained unchanged since the first assessment in each subject. Recent trend assessments were conducted in 1994, 1996, 1999, 2004, and 2008. Approximately 26,600 students took part in the 2008 reading assessment and 26,700 took part in the 2008 mathematics assessment. Results are reported as average scores for the nation, for regions, and for various subgroups of the population, such as racial and ethnic groups. Data from the trend assessments are available in the most recent report, NAEP 2008 Trends in Academic Progress (NCES 2009-479). The next long-term trend assessment (of reading and mathematics) is scheduled for 2012.

The 2004 NAEP long-term trend assessments marked the end of the tests designed and administered since the inception of the assessments in 1971 and the beginning of a modified design that provides greater accommodations for students with disabilities and English language learners. The modified design also limited the assessments to reading and mathematics. Science and writing are now assessed only in main NAEP.

To ensure that the assessment results can be reported on the same trend line, a "bridge" assessment was administered in addition to the modified assessment in 2004. Students were randomly assigned to take either the bridge assessment or the modified assessment. The bridge assessment replicated the instrument given in 1999 and used the same administrative techniques. The 2004 modified assessment provides the basis of comparison for all future assessments, and the bridge assessment links its results to the results from the past 30 years.
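The random assignment described above can be illustrated with a minimal sketch. It assumes a simple even split with a fixed seed; the function name `assign_forms` and the student identifiers are hypothetical, and the actual NAEP assignment procedure and group proportions are not specified here.

```python
import random

def assign_forms(student_ids, seed=0):
    """Randomly split students into 'bridge' and 'modified' groups.

    Assumption: an even split; the real 2004 group sizes were not equal.
    """
    rng = random.Random(seed)      # seeded so the split is reproducible
    shuffled = list(student_ids)
    rng.shuffle(shuffled)          # random order removes any selection pattern
    half = len(shuffled) // 2
    return {"bridge": shuffled[:half], "modified": shuffled[half:]}

# Hypothetical example with ten placeholder student IDs.
groups = assign_forms(range(10))
```

Because each student lands in exactly one group by chance alone, differences between the two groups' results can be attributed to the assessment design rather than to who took which form.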

In the 2008 long-term trend reading assessment, sample sizes and overall weighted participation rates were 8,600 for 9-year-olds (94.9 percent), 8,400 for 13-year-olds (93.8 percent), and 9,600 for 17-year-olds (87.7 percent). Sample sizes and participation rates for the math assessment were 8,600 for 9-year-olds (94.6 percent), 8,500 for 13-year-olds (93.6 percent), and 9,600 for 17-year-olds (88.0 percent).
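As a quick consistency check, the age-level sample sizes above can be summed and compared with the approximate totals of 26,600 (reading) and 26,700 (mathematics) given earlier for the 2008 assessment. The variable names below are illustrative only.

```python
# Rounded 2008 long-term trend sample sizes by age, as reported above.
reading_2008 = {9: 8600, 13: 8400, 17: 9600}
math_2008 = {9: 8600, 13: 8500, 17: 9600}

# Sum across the three age groups for each subject.
reading_total = sum(reading_2008.values())
math_total = sum(math_2008.values())

print(reading_total)  # 26600
print(math_total)     # 26700
```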

In the 2004 long-term trend reading assessment bridge group, the number of participants and overall weighted participation rates were 5,200 for 9-year-olds (94.5 percent), 5,700 for 13-year-olds (92.4 percent), and 3,800 for 17-year-olds (75.5 percent). For those taking the modified assessment, the sample sizes and unweighted participation rates were 7,300 for 9-year-olds (80 percent), 7,500 for 13-year-olds (76 percent), and 7,600 for 17-year-olds (56 percent). Sample sizes and overall participation rates for the mathematics assessment bridge group were 5,200 for 9-year-olds (80 percent), 5,700 for 13-year-olds (76 percent), and 3,800 for 17-year-olds (57 percent). For those taking the modified mathematics assessment, the sample sizes and participation rates were 7,300 for 9-year-olds (80 percent), 7,500 for 13-year-olds (76 percent), and 7,600 for 17-year-olds (56 percent).

The 1999 NAEP long-term trend study sample sizes for the reading/writing proficiency portion were 5,790 for 9-year-olds, 5,930 for 13-year-olds, and 5,290 for 17-year-olds. Overall participation rates were 80 percent, 74 percent, and 59 percent, respectively. Sample sizes for the mathematics/science portion of the 1999 long-term trend study were 6,030 for 9-year-olds, 5,940 for 13-year-olds, and 3,800 for 17-year-olds. Overall participation rates were 78 percent, 73 percent, and 59 percent for 9-year-olds, 13-year-olds, and 17-year-olds, respectively.

Main

In the main national NAEP, a nationally representative sample of students is assessed at grades 4, 8, and 12 in various academic subjects. The assessments change periodically and are based on frameworks developed by the National Assessment Governing Board (NAGB). Items include both multiple-choice and constructed-response (requiring written answers) items. Results are reported in two ways. Average scores are reported for the nation, for participating states and jurisdictions, and for subgroups of the population. In addition, the percentage of students at or above Basic, Proficient, and Advanced achievement levels is reported for these same groups. The achievement levels are developed by NAGB.

From 1990 until 2001, main NAEP was conducted for states and other jurisdictions that chose to participate (e.g., 45 participated in 2000). Prior to 1990, the national NAEP samples were not designed to support the reporting of accurate and representative state-level results. Separate representative samples of students were selected for each participating jurisdiction. State data are usually available at grade 4, grade 8, or both grades, and may not include all subjects assessed in the national-level assessment. In 1994, for example, NAEP assessed reading, geography, and history at the national level at grades 4, 8, and 12; however, only reading at grade 4 was assessed at the state level. In 1996, mathematics and science were assessed nationally at grades 4, 8, and 12; at the state level, mathematics was assessed at grades 4 and 8, and science was assessed at grade 8 only. In 1997, the arts were assessed only at the national level at grade 8. Reading and writing were assessed in 1998 at the national level for grades 4, 8, and 12 and at the state level for grades 4 and 8; civics was also assessed in 1998 at the national level for grades 4, 8, and 12. These assessments generally involved about 130,000 students at the national and state levels.

In 2002, under the provisions of the No Child Left Behind Act of 2001, all states began to participate in main NAEP and an aggregate of all state samples replaced the separate national sample. In 2002, students were assessed in reading and writing at grades 4, 8, and 12 for the national assessment and at grades 4 and 8 for the state assessment. In 2003, reading and mathematics were assessed at grades 4 and 8 for both national and state assessments.

The NAEP national samples in 2003 and 2005 were obtained by aggregating the samples from each state, rather than by obtaining an independently selected national sample. As a consequence, the size of the national sample increased, and smaller differences between scores across years or types of students were found to be statistically significant than would have been detected in previous assessments. The most recent NAEP assessment, in reading, was administered in 2011. The next assessment is planned for 2013.
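The link between sample size and detectable differences can be sketched with a simplified calculation. It assumes simple random sampling and equal group sizes, which NAEP's complex sample design does not follow, and the standard deviation of 35 and the sample sizes used are hypothetical placeholders rather than NAEP figures. The function name `min_detectable_difference` is illustrative.

```python
import math

def min_detectable_difference(sd, n_per_group, z=1.96):
    """Smallest difference between two group means that reaches
    significance at roughly the 5 percent level.

    Assumption: simple random samples of equal size and equal spread.
    """
    se_diff = sd * math.sqrt(2.0 / n_per_group)  # standard error of the difference
    return z * se_diff

# Hypothetical inputs: the same score spread, two different sample sizes.
small_sample = min_detectable_difference(sd=35, n_per_group=10_000)
large_sample = min_detectable_difference(sd=35, n_per_group=160_000)

# A sample 16 times larger detects differences one quarter the size.
ratio = small_sample / large_sample
```

Under these assumptions the threshold shrinks with the square root of the sample size, which is why a larger aggregated sample flags smaller score differences as statistically significant.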

The main NAEP assessments are conducted separately from the long-term assessments. Mathematics assessments were administered in 2000, 2003, 2005, 2007, and 2009. About 172,000 4th-graders (93.9 percent, weighted), 162,000 8th-graders (90.9 percent, weighted), and over 21,000 12th-graders (67.9 percent, weighted) participated in the 2005 assessment. The 2007 math assessment was administered to approximately 197,700 4th-graders and 153,000 8th-graders. The weighted response rates were 94.8 percent and 92.2 percent, respectively. The 2009 math assessment was administered to approximately 168,800 4th-graders and 161,700 8th-graders. The weighted response rates were 95 percent and 93 percent, respectively.

Reading assessments were administered in 2000, 2002, 2003, 2005, 2007, and 2009. Over 165,000 4th-graders (93.8 percent, weighted), 159,000 8th-graders (91.0 percent, weighted), and 21,000 12th-graders (67.5 percent, weighted) participated in the assessment in 2005. The 2007 reading assessment was administered to approximately 191,000 4th-graders and 160,700 8th-graders. The weighted response rates were 94.7 percent and 92.2 percent, respectively. The 2009 reading assessment was administered to approximately 178,800 4th-graders and 160,900 8th-graders. The weighted response rates were 95 percent and 93 percent, respectively.

Science assessments were administered in 1995–96, 2000, and 2005. More than 300,000 students in grades 4, 8, and 12 participated in the 2005 science assessment, with weighted response rates of 93.5 percent, 90.6 percent, and 68.1 percent, respectively.

Geography assessments were administered in 1993–94 and 2000–01. About 5,510 4th-graders, 6,880 8th-graders, and 6,230 12th-graders participated in the 1993–94 assessment. The unweighted response rates were 93 percent for the 4th-graders, 93 percent for the 8th-graders, and 90 percent for the 12th-graders. The 2000–01 geography assessment was administered to 7,780 4th-graders, 10,040 8th-graders, and 9,660 12th-graders. The unweighted response rates were 95 percent for the 4th-graders, 93 percent for the 8th-graders, and 78 percent for the 12th-graders. The next geography assessment is scheduled for 2009–10.

Writing assessments were administered in 1997–98, 2002, and 2007. The 2007 writing assessment was administered to 139,900 8th-graders and 27,900 12th-graders, with weighted response rates of 92.3 percent and 79.6 percent, respectively.

The 2006 U.S. history assessment, the first since 2001, was administered to over 29,000 students in grades 4, 8, and 12 nationwide. The weighted response rates were 95.3 percent, 92.4 percent, and 73.3 percent, respectively. Students in public, private, Department of Defense, and Bureau of Indian Affairs schools were assessed.

The 2006 civics assessment was administered to approximately 25,000 students in grades 4, 8, and 12 nationwide. The weighted response rates for the respective grades were 94.6 percent, 91.7 percent, and 73.4 percent. The previous civics assessment was in 1998.

The first economics assessment was administered in 2006 at grade 12. Results are based on a nationally representative sample of 11,500 12th-graders from 590 public and private schools. The student participation rate was 71.9 percent for public school students and 87.0 percent for private school students.

Trial Urban District Assessments

The Trial Urban District Assessment (TUDA) is designed to explore the feasibility of using NAEP to report on the performance of public school students at the district level. NAEP has administered the mathematics, reading, science, and writing assessments to samples of students in selected urban district public schools since 2002. The purpose of the TUDA is to allow reporting of NAEP results for large urban school districts and to allow the NAEP program to evaluate the usefulness of NAEP data to cities of different sizes and demographic compositions. The number of urban school districts participating has grown from 6 in 2002 to 18 in 2009. School districts vary in terms of whether the charter schools within their boundaries are independent of the districts. In 2007, charter schools were included in the TUDA district results if they were listed as part of the district's Local Education Agency in the NCES Common Core of Data. In 2009, charter schools were included in TUDA district results if they contributed to the district's Adequate Yearly Progress (AYP) results as part of the Elementary and Secondary Education Act. This change had little or no impact on the 2007–09 average score differences of the TUDA districts. Most TUDA districts have higher combined percentages of Black and Hispanic students as well as higher percentages of low-income students than the nation as a whole.

Charter schools were included in the 2007 assessment if they were listed in the districts' Common Core of Data; however, in 2009 only those charter schools whose results were included in the Adequate Yearly Progress report were included in the TUDA results. This change had little or no impact on the 2007–09 average score differences, except for the District of Columbia at grade 8 for mathematics. The District of Columbia's 2007 grade 8 sample included 20 charter schools. All charter schools in the District of Columbia are independent of the school district, and none were included in its TUDA sample in 2009. Had comparable sample frames been used, i.e., had charter schools been excluded from the NAEP sample in both years, the District of Columbia Public Schools score would have shown a statistically significant increase from 244 in 2007 to 251 in 2009, rather than the nonsignificant change from 248 to 251.
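The difference between a significant and a nonsignificant change can be illustrated with a simple two-sample z statistic applied to the District of Columbia scores above. The standard error of 1.5 points is a hypothetical placeholder, since the standard errors are not given here, and the function name `z_for_change` is illustrative. NAEP's own significance tests rely on its published standard errors and procedures, so this is only a sketch of the general idea.

```python
import math

def z_for_change(score_before, score_after, se_before, se_after):
    """z statistic for the change between two independent average scores."""
    se_diff = math.sqrt(se_before ** 2 + se_after ** 2)  # SE of the difference
    return (score_after - score_before) / se_diff

HYPOTHETICAL_SE = 1.5  # placeholder only; not a reported NAEP value

# Comparable frames (charter schools excluded in both years): 244 to 251.
z_comparable = z_for_change(244, 251, HYPOTHETICAL_SE, HYPOTHETICAL_SE)

# As reported (charter schools in the 2007 sample only): 248 to 251.
z_reported = z_for_change(248, 251, HYPOTHETICAL_SE, HYPOTHETICAL_SE)

CRITICAL_Z = 1.96  # two-sided test at roughly the 5 percent level
comparable_is_significant = abs(z_comparable) > CRITICAL_Z
reported_is_significant = abs(z_reported) > CRITICAL_Z
```

With these placeholder standard errors, the 7-point change exceeds the critical value while the 3-point change does not. The actual outcome depends on the real standard errors for each year.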

Information from NAEP is subject to both nonsampling and sampling errors. Two possible sources of nonsampling error are nonparticipation and instrumentation. Certain populations have been oversampled to ensure samples of sufficient size for analysis. Instrumentation nonsampling error could result from failure of the test instruments to measure what is being taught and, in turn, what the students are learning.
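When some populations are oversampled, estimates are typically weighted so that each group counts in proportion to its share of the population rather than its share of the sample. The following is a minimal sketch of a weighted average; the scores and weights are hypothetical, the function name `weighted_mean` is illustrative, and NAEP's actual weighting procedures are considerably more involved.

```python
def weighted_mean(scores, weights):
    """Average of scores in which each score counts in proportion to its weight."""
    total_weight = sum(weights)
    return sum(s * w for s, w in zip(scores, weights)) / total_weight

# Hypothetical sample: the first two students come from an oversampled
# group, so each carries half the weight of the third student.
scores = [200, 200, 300]
weights = [1, 1, 2]

unweighted = sum(scores) / len(scores)     # overstates the oversampled group
weighted = weighted_mean(scores, weights)  # restores population proportions
print(weighted)  # 250.0
```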

Further information on NAEP may be obtained from

Arnold Goldstein
Assessment Division
State Support and Constituency Outreach
National Center for Education Statistics
1990 K Street NW
Washington, DC 20006
/nationsreportcard

