
Interpreting NAEP Arts Results

Overview of the Assessment
Reporting the Assessment
Results Are Estimates
NAEP Reporting Groups
Statistical Significance
Exclusion Rates and Assessment Results
Cautions in Interpretations

Overview of the Assessment

Nationally representative samples of schools and students participated in the 2016 NAEP arts assessment. The results of the arts assessment are based on eighth-grade students from about 260 public and private schools across the nation. Approximately 4,300 eighth-graders were assessed in music, and another 4,400 were assessed in visual arts.

The NAEP arts framework serves as the blueprint for the assessment, describing the specific knowledge and skills that should be assessed in the arts disciplines. Developed under the guidance of the National Assessment Governing Board, the framework incorporates standards and benchmarks taken from the National Standards for Arts Education and reflects the input of arts educators, artists, assessment specialists, policymakers, representatives from the business community, and members of the public.

The framework specifies that students' arts knowledge and skills be measured in four arts disciplines: dance, music, theater, and visual arts. In 2016, NAEP assessed students in music and visual arts based on a nationally representative sample of eighth-grade students. Due to budget constraints and the small percentage of schools with dance and theater programs, these two arts disciplines were not assessed in 2016. Additionally, three arts processes—creating, performing, and responding—are central to students' experiences in each of the disciplines. Again, because of budget constraints, only the responding process in music and both the responding and creating processes in visual arts were assessed in 2016.

Read more about what the arts assessment measures, how the arts assessment was developed, who took the assessment, and how the assessment was administered.

Reporting the Assessment

Because music and visual arts are two distinct disciplines, results are reported on two separate NAEP scales, each ranging from 0 to 300, and are not combined into a single arts score. Within the visual arts discipline, the results for responding and creating are also reported separately since the two processes may not draw upon the wide range of arts knowledge and skills in ways similar enough to be combined into a single score.

Because the scales for the two disciplines were developed independently, the responding results for music and visual arts cannot be compared. In addition to average responding scores, five selected percentiles show score results for students performing at lower (10th and 25th percentiles), middle (50th percentile), and higher (75th and 90th percentiles) levels on the responding scale.

Questions that required students to work with various media to create original works of visual art were used to assess the creating process. Because of the small number of these questions in the assessment, it was not possible to summarize the results using a standard NAEP scale. Instead, creating results in visual arts are presented as the average creating task score, which is expressed as the average percentage of the maximum possible score ranging from 0 to 100. The creating task score for each creating question (task) is the sum of the percentage of students receiving full credit and a fraction of the percentage of students receiving partial credit. The individual scores are then averaged together to report an average creating task score for the entire set of the visual arts creating questions.
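As a rough illustration of the arithmetic described above, the following Python sketch computes a creating task score for each task and then averages the task scores. The function names, the partial-credit weights, and all percentages are hypothetical; the text does not state which fraction is applied to partial credit, so the weights here are assumptions.

```python
from typing import Sequence


def creating_task_score(
    pct_full: float,
    pct_partial: Sequence[float],
    partial_weights: Sequence[float],
) -> float:
    """Score for one task: percent with full credit plus a weighted
    share of the percent at each partial-credit level (0 to 100 scale)."""
    if len(pct_partial) != len(partial_weights):
        raise ValueError("one weight is needed per partial-credit level")
    return pct_full + sum(w * p for w, p in zip(partial_weights, pct_partial))


def average_creating_task_score(task_scores: Sequence[float]) -> float:
    """Average of the individual task scores across all creating tasks."""
    return sum(task_scores) / len(task_scores)


# Hypothetical task A: 20% full credit, 40% partial credit weighted at 0.5.
task_a = creating_task_score(20.0, [40.0], [0.5])

# Hypothetical task B: 10% full credit, two partial levels (assumed weights).
task_b = creating_task_score(10.0, [30.0, 20.0], [0.5, 0.25])

average = average_creating_task_score([task_a, task_b])
print(task_a, task_b, average)
```

With these made-up inputs the two task scores are 40 and 30, so the average creating task score is 35.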

Although the questions in the 2016 arts assessment were taken from those administered in the previous arts assessments in 1997 and 2008, not all of the results can be compared between the three years. Because of the length of time between each of the assessments, some materials that students used to create artworks (for example, papers for collages or markers of a certain color) could no longer be obtained. Additionally, some scoring procedures for constructed-response questions could not be replicated in 2016. For these reasons, comparisons between 1997, 2008, and 2016 cannot be made for the average responding scores in music and visual arts or the average creating task scores in visual arts. However, since the scoring method was the same in 1997, 2008, and 2016 for multiple-choice questions, comparisons of the percentages of correct responses for these questions are provided for music and visual arts. It is important to note, though, that because multiple-choice questions made up only a portion of the arts assessments in all three years, it would be inappropriate to make inferences about changes in students' overall performance on the entire 2016 assessment based on these results.

Item maps provide another way to interpret the responding scale scores for each of the disciplines. The item maps show student performance on NAEP music and visual arts questions at different points on their respective scales.

Arts achievement levels were developed as part of the arts framework. However, the arts assessment results were not reported in terms of the NAEP arts achievement levels. To set achievement levels for the results of any given NAEP assessment, the results of the whole assessment must be summarized together. The complex, diverse nature of the assessment tasks for the arts necessitated that different scales be used for different kinds of tasks: that is, students' written responses and responses to multiple-choice questions could not be summarized together with their responses to complex tasks where they created or performed works of art.

To view the achievement levels developed for the arts assessment, see the NAEP Arts Education Assessment Framework.


Results Are Estimates

The average scores and percentages presented on this website are estimates because they are based on representative samples of students rather than on the entire population of students. Moreover, the collection of subject-area questions used at each grade level is but a sample of the many questions that could have been asked. As such, NAEP results are subject to a measure of uncertainty, reflected in the standard error of the estimates. The standard errors for the estimated scale scores and percentages in the figures and tables presented on this website are available in the NAEP Data Explorer.


NAEP Reporting Groups

Results are provided for groups of students defined by shared characteristics—gender, race or ethnicity, eligibility for free/reduced-price school lunch, students with disabilities, and students identified as English learners (EL). Based on participation rate criteria, results are reported for subpopulations only when sufficient numbers of students and adequate school representation are present. The minimum requirement is at least 62 students in a particular group from at least five primary sampling units (PSUs). However, the data for all students, regardless of whether their group was reported separately, were included in computing overall results. Explanations of the reporting groups are presented below.
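The minimum reporting requirement above can be expressed as a simple check. This is only a sketch of the stated rule (at least 62 students from at least five PSUs); the function name and the example counts are hypothetical, and the actual NAEP reporting procedures may apply further criteria.

```python
MIN_STUDENTS = 62  # minimum number of students in the group
MIN_PSUS = 5       # minimum number of primary sampling units represented


def meets_reporting_minimum(n_students: int, n_psus: int) -> bool:
    """Return True when a group satisfies both minimums quoted in the text."""
    return n_students >= MIN_STUDENTS and n_psus >= MIN_PSUS


# Hypothetical groups: (students assessed, PSUs represented)
print(meets_reporting_minimum(62, 5))   # both minimums met
print(meets_reporting_minimum(120, 4))  # enough students, too few PSUs
print(meets_reporting_minimum(61, 9))   # enough PSUs, too few students
```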


Gender

Results are reported separately for males and females. Gender was reported by the school.


Race/Ethnicity

Prior to 2011, student race/ethnicity was obtained from school records and reported for the six mutually exclusive categories shown below:

  • White
  • Black
  • Hispanic
  • Asian/Pacific Islander
  • American Indian/Alaska Native
  • Other or unclassified

Students who identified with more than one of the other five categories were classified as "other" and were included as part of the "unclassified" category along with students who had a background other than the ones listed or whose race/ethnicity could not be determined.

In compliance with new standards from the U.S. Office of Management and Budget for collecting and reporting data on race/ethnicity, additional information was collected in 2011 so that results could be reported separately for Asian students, Native Hawaiian/Other Pacific Islander students, and students identifying with two or more races. Beginning in 2011, all of the students participating in NAEP were identified by school reports as one of the seven racial/ethnic categories listed below:

  • White
  • Black or African American
  • Hispanic
  • Asian
  • Native Hawaiian or Other Pacific Islander
  • American Indian or Alaska Native
  • Two or More Races

Students identified as Hispanic were classified as Hispanic in 2011 even if they were also identified with another racial/ethnic group. Students who identified with two or more of the other racial/ethnic groups (e.g., White and Black) would have been classified as "other" and reported as part of the "unclassified" category prior to 2011; from 2011 on, they were classified as "Two or More Races." When comparing the results for racial/ethnic groups from 2011 and 2015 to earlier assessment years, the 2011 data for Asian and Native Hawaiian/Other Pacific Islander students were combined into a single Asian/Pacific Islander category. Information based on student self-reported race/ethnicity will continue to be reported in the NAEP Data Explorer.


Eligibility for Free/Reduced-Price School Lunch

As part of the Department of Agriculture's National School Lunch Program (NSLP), schools can receive cash subsidies and donated commodities in return for offering free or reduced-price lunches to eligible children. Based on available school records, students were classified as either currently eligible for free/reduced-price school lunch or not eligible. Eligibility for free and reduced-price lunches is determined by students' family income in relation to the federally established poverty level. Students whose family income is at or below 130 percent of the poverty level qualify to receive free lunch, and students whose family income is between 130 percent and 185 percent of the poverty level qualify to receive reduced-price lunch. For the period July 1, 2015 through June 30, 2016, for a family of four, 130 percent of the poverty level was $31,525 and 185 percent was $44,863 in most states. The classification applies only to the school year when the assessment was administered (i.e., the 2015–2016 school year) and is not based on eligibility in previous years. If school records were not available, the student was classified as "Information not available." If the school did not participate in the program, all students in that school were classified as "Information not available." Because of the improved quality of the data on students' eligibility for NSLP, the percentage of students for whom information was not available has decreased compared to the percentages reported prior to the 2003 assessment. See the proportion of students in each category at grade 8 for music and visual arts in the NAEP Data Explorer.
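To make the income thresholds concrete, here is a minimal Python sketch that applies the family-of-four figures quoted above (31,525 and 44,863, assumed here to be annual income in dollars) to a hypothetical household. The function name and example incomes are illustrative only, and treating an income exactly at the 185 percent figure as reduced-price is an assumption, since the text says only "between."

```python
# Thresholds for a family of four, July 1, 2015 through June 30, 2016,
# as quoted in the text (most states).
FREE_LIMIT = 31_525      # 130 percent of the poverty level
REDUCED_LIMIT = 44_863   # 185 percent of the poverty level


def lunch_category(annual_income: float) -> str:
    """Classify a family-of-four income into a lunch eligibility category."""
    if annual_income <= FREE_LIMIT:
        return "free"
    if annual_income <= REDUCED_LIMIT:  # upper bound inclusive: an assumption
        return "reduced-price"
    return "not eligible"


for income in (28_000, 40_000, 52_000):  # hypothetical incomes
    print(income, lunch_category(income))
```

NAEP reporting then groups the "free" and "reduced-price" outcomes together as eligible, with a separate "Information not available" category when school records are missing.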


Students with Disabilities (SD)
Results are reported for students who were identified by school records as having a disability. A student with a disability may need specially designed instruction to meet his or her learning goals. A student with a disability will usually have an Individualized Education Program (IEP), which guides his or her special education instruction. Students with disabilities are often referred to as special education students and may be classified by their school as learning disabled (LD) or emotionally disturbed (ED). The goal of NAEP is that students who are capable of participating meaningfully in the assessment are assessed, but some students with disabilities selected by NAEP may not be able to participate, even with the accommodations provided. Beginning in 2009, NAEP disaggregated students with disabilities from students who were identified under section 504 of the Rehabilitation Act of 1973. The results for SD are based on students who were assessed and cannot be generalized to the total population of such students.

English Learners (EL)

Results are reported for students who were identified by school records as being English learners. (Note that English learners were previously referred to as limited English proficient (LEP).)

Type of School
The national results are based on a representative sample of students in both public schools and nonpublic schools. Nonpublic schools include private schools, Bureau of Indian Affairs schools, and Department of Defense schools. Private schools include Catholic, Conservative Christian, Lutheran, and other private schools. Results are reported for private schools overall, as well as disaggregated by Catholic and other private schools. The school participation rates for private schools overall in 2016 met the 70 percent criterion for reporting at grade 8. The results for Catholic schools also met the criterion and are presented in the report.

Type of Location
NAEP results are reported for four mutually exclusive categories of school location: city, suburb, town, and rural. The categories are based on standard definitions established by the Federal Office of Management and Budget using population and geographic information from the U.S. Census Bureau. Schools are assigned to these categories in the NCES Common Core of Data based on their physical address.

The classification system was revised for 2007 and 2009. The new locale codes are based on an address's proximity to an urbanized area (a densely settled core with densely settled surrounding areas). This is a change from the original system based on metropolitan statistical areas. To distinguish the two systems, the new system is referred to as "urban-centric locale codes." The urban-centric locale code system classifies territory into four major types: city, suburb, town, and rural. Each type has three subcategories. For city and suburb, these are gradations of size: large, midsize, and small. Towns and rural areas are further distinguished by their distance from an urbanized area. They can be characterized as fringe, distant, or remote.

Parental Education

Parents' highest level of education is defined by the highest level reported by eighth-graders and twelfth-graders for either parent. Fourth-graders' replies to this question were not reported because their responses in previous studies were highly variable, and a large percentage of them chose the "I don't know" option.


Region of the Country

Prior to 2003, NAEP results were reported for four NAEP-defined regions of the nation: Northeast, Southeast, Central, and West. As of 2003, to align NAEP with other federal data collections, NAEP analysis and reports have used the U.S. Census Bureau's definition of "region." The four regions defined by the U.S. Census Bureau are Northeast, South, Midwest, and West. The Central region used by NAEP before 2003 contained the same states as the Midwest region defined by the U.S. Census. The former Southeast region consisted of the states in the Census-defined South minus Delaware, the District of Columbia, Maryland, Oklahoma, Texas, and the section of Virginia in the District of Columbia metropolitan area. The former West region consisted of Oklahoma, Texas, and the states in the Census-defined West. The former Northeast region consisted of the states in the Census-defined Northeast plus Delaware, the District of Columbia, Maryland, and the section of Virginia in the District of Columbia metropolitan area. The table below shows how states are subdivided into these Census regions. All 50 states and the District of Columbia are listed. Other jurisdictions, including the Department of Defense Education Activity schools, are not assigned to any region.

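States within regions of the country defined by the U.S. Census Bureau

Northeast: Connecticut, Maine, Massachusetts, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island, Vermont

South: Alabama, Arkansas, Delaware, District of Columbia, Florida, Georgia, Kentucky, Louisiana, Maryland, Mississippi, North Carolina, Oklahoma, South Carolina, Tennessee, Texas, Virginia, West Virginia

Midwest: Illinois, Indiana, Iowa, Kansas, Michigan, Minnesota, Missouri, Nebraska, North Dakota, Ohio, South Dakota, Wisconsin

West: Alaska, Arizona, California, Colorado, Hawaii, Idaho, Montana, Nevada, New Mexico, Oregon, Utah, Washington, Wyoming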


Exclusion Rates and Assessment Results

Some students selected for participation in the NAEP arts assessment were identified as English learners (EL) or students with disabilities (SD). See tables that summarize the percentage of students identified, excluded, and assessed in arts.

For each student selected to participate in NAEP who was identified as either SD or EL, a member of the school staff most knowledgeable about the student completed an SD/EL questionnaire. Students with disabilities were excluded from the assessment if an IEP (individualized education program) team or equivalent group determined that the student could not participate in assessments such as NAEP; if the student's cognitive functioning was so severely impaired that he or she could not participate; or if the student's IEP required that the student be tested with an accommodation or adaptation not permitted or available in NAEP, and the student could not demonstrate his or her knowledge of the assessment subject area without that accommodation or adaptation.

A student who was identified as EL and who was a native speaker of a language other than English was excluded if the student received instruction in the assessment subject area (e.g., reading or mathematics) primarily in English for less than three school years, including the current year, or if the student could not demonstrate his or her knowledge of reading or mathematics in English without an accommodation or adaptation.


Statistical Significance

The differences between scale scores and between percentages discussed in the results on this website take into account the standard errors associated with the estimates. Comparisons are based on statistical tests that consider both the magnitude of the difference between the group average scores or percentages and the standard errors of those statistics. Throughout the results, differences between scores or between percentages are discussed only when they are significant from a statistical perspective.

All differences reported are significant at the 0.05 level with appropriate adjustments for multiple comparisons. The term "significant" is not intended to imply a judgment about the absolute magnitude or the educational relevance of the differences. It is intended to identify statistically dependable population differences to help inform dialogue among policymakers, educators, and the public.
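The general idea of weighing a difference against its standard error can be sketched as follows. This is a simplified illustration, not NAEP's actual procedure: it assumes the two estimates come from independent groups, uses a normal approximation with a critical value of 1.96 for the 0.05 level, and omits the adjustment for multiple comparisons mentioned above. The function name and all numbers are hypothetical.

```python
import math
from typing import Tuple


def difference_test(
    est_a: float,
    se_a: float,
    est_b: float,
    se_b: float,
    critical_value: float = 1.96,  # assumed two-sided 0.05 level, normal approx.
) -> Tuple[float, float, bool]:
    """Return the difference, its standard error, and whether it is significant.

    Assumes the two estimates are independent, so their variances add.
    """
    diff = est_a - est_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    significant = abs(diff / se_diff) > critical_value
    return diff, se_diff, significant


# Same 8-point difference, but different precision (hypothetical values).
print(difference_test(150.0, 1.5, 142.0, 2.0))  # small standard errors
print(difference_test(150.0, 3.0, 142.0, 4.0))  # larger standard errors
```

In the first call the standard error of the difference is 2.5 and the 8-point gap is flagged as significant; in the second it is 5.0 and the same gap is not, which is why the size of a difference alone does not determine whether it is discussed.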

Cautions in Interpretations

Users of this website are cautioned against interpreting NAEP results as implying causal relations. Inferences related to student group performance or to the effectiveness of public and nonpublic schools, for example, should take into consideration the many socioeconomic and educational factors that may also have an impact on performance.

The NAEP music and visual arts scales make it possible to examine relationships between students' performance and various background factors measured by NAEP. However, a relationship that exists between achievement and another variable does not reveal its underlying cause, which may be influenced by a number of other variables. Similarly, the assessments do not reflect the influence of unmeasured variables. The results are most useful when they are considered in combination with other knowledge about the student population and the educational system, such as trends in instruction, changes in the school-age population, and societal demands and expectations.

A caution is also warranted for some small population group estimates. At times in the results pages, smaller population groups show very large increases or decreases across years in average scores. However, such score changes often need to be interpreted with extreme caution. For one thing, the effects of exclusion-rate changes may be more marked for small groups than for the whole population. Also, the standard errors around the score estimates for small groups are often quite large, which in turn means the standard error around the change is also large.


Last updated 07 July 2023 (AA)