Results from the 2002 National Assessment of Educational Progress (NAEP) Reading and Writing Trial Urban District Assessment

Dr. Peggy G. Carr: Hello, and welcome to today's StatChat on the results for the NAEP 2002 Trial Urban District Assessment in reading and writing. I hope you've had time to look at the results on the website. I'm sure that you have many questions regarding today's release, so let's get right to them...

Katherine Shek from Washington DC asked:
What is the significance of including the scores of students with disabilities? Is this the first time NAEP allowed all students with disabilities to use accommodations? What's the significance of this compared with the past two years, when two samples, one with accommodations and one without, were used? Why aren't students with disabilities counted as a subgroup? Is there a plan to do that? Overall, what message is NAEP sending by allowing accommodations? Thank you.
Dr. Peggy G. Carr: It is NAEP's intent to select all students from the target population. Therefore, every effort is made to ensure that all selected students who are capable of participating in the assessment are assessed. Before 1996 NAEP did not allow any testing under nonstandard conditions (i.e., accommodations were not permitted). From 1996 through 2000, NAEP administered its assessments in two ways: one with accommodations permitted and the other without accommodations permitted. This allowed the program to explore the effects of permitting accommodations on overall results, while continuing to measure trends. Beginning in 2002, NAEP uses only the more inclusive samples, in which assessment accommodations are permitted. Consequently, the data reported here represents samples in which accommodations were permitted. Where sample sizes permit, performance data is available on the NAEP website for students with disabilities.

Martin Weiss from Brooklyn, New York, NY asked:
How is it that proficient readers are a minority in every grade/age tested by NAEP, yet the scores of US grade schoolers on international tests are on a par with those of their French and German peers? If NAEP and international tests are fundamentally different, what is the critical difference?
Dr. Peggy G. Carr: NAEP is designed to assess what students know and can do over challenging subject material. Achievement levels for NAEP are set through the National Assessment Governing Board and reflect broad input from stakeholders. This process is independent for each assessment. The NAEP and international assessment designs vary in subject areas assessed, time of year administered, proportion of multiple-choice and constructed-response questions, frameworks, length of assessment, and age/grade of students, and these differences may contribute to perceived differences in performance.

Pat from Columbia, MD asked:
How do the demographics for urban districts compare to those of states and jurisdictions? Do overall assessment scores reflect those differences?
Dr. Peggy G. Carr: The urban districts that participated in the Trial Urban District Assessment were predominantly non-White. In addition, approximately seventy percent or more of the students in these school districts were eligible for free or reduced-price school lunch. The performance of these districts on the NAEP assessment reflects the particular challenge faced by schools in high-poverty areas. In interpreting the results it is important to bear in mind that the estimated performance of a particular group does not include the whole range of performance within that group. Differences in subgroup performance cannot be ascribed solely to students' membership in an identified subgroup. Average student performance is affected by the interaction of a complex set of factors not addressed by NAEP assessments.

Catherine from Westfield, MA asked:
On what basis were the urban districts chosen? Why isn't Boston included?
Dr. Peggy G. Carr: Representatives of the Council of the Great City Schools worked with the National Assessment Governing Board (which sets the policy for NAEP) to identify districts for the trial assessment. Districts were selected that permitted testing of the feasibility of conducting NAEP over a range of characteristics, such as district size, minority concentrations, federal program participation, and percentages of students with disabilities and limited-English-proficient students. Boston did participate in the 2003 Trial Urban District Assessment, the results of which will be released this fall.

Richard Turnock from Portland Oregon asked:
To avoid federal penalties and to continue to get federal funds, states are lowering their standards to the national average of 30 percent proficiency in reading and writing so that schools are not singled out as non-performing. Why is the national average so low? Because the large urban school districts drag it down. Why are the scores for the large urban school districts so low? Not because of ethnic, income or language disadvantages, but because the urban schools are not organized to support continuous improvement. They do not have leaders focused on quality. Have urban school districts improved since the beginning of the reporting?
Dr. Peggy G. Carr: This assessment of urban districts was a new, trial activity for NAEP in 2002. The six districts in the 2002 trial urban NAEP assessment participated in the 2003 assessment, with four new districts. Since 2002 and 2003 NAEP both include reading, information on changes in performance over that period will be available this fall. However, NAEP results are based on a sample of students and cannot support causal inference. State standards for achievement are set by the states and are not directly related to NAEP achievement levels.

Brea from Tallahassee, FL asked:
The top 3 most populous states (Texas, California and New York) are represented in the 5 districts chosen to participate in the 2002 TUDA. Why was a district (such as Miami-Dade) representing Florida not chosen as well?
Dr. Peggy G. Carr: Limiting the assessment to five districts was quite restrictive. Many districts deserved to be included. I described our selection process in my answer to Catherine of Westfield, MA. If Congress mandates funding to conduct assessments in more districts, districts such as Miami-Dade would be good candidates (assuming that they are willing to participate).

David from Boyds, MD asked:
Were you able to separate out the urban district results from the state's overall score so that you could see the state's performance excluding the district? If so, what were the disaggregated state scores?
Dr. Peggy G. Carr: We did not separate the urban districts' results from the rest of the states' results because the urban district is an integral part of the state sample. The goal of the Trial Urban District Assessment is to estimate performance of students within the five urban districts. The goal of the state assessment is to estimate the performance of fourth- and eighth-graders within the states. The NAEP Data Tool provides disaggregated data for the state by urban, rural, and suburban school locations, which will shed additional light on the results.

Jim from Columbia, Missouri asked:
I have a question regarding the interpretation of the test scores for urban districts. Given that NAEP has no academic repercussions for the individual student test-taker, why would we assume that students provided maximal effort when answering the questions? It could certainly be argued that the scores are as much a measure of "effort" as they are "proficiency."
Dr. Peggy G. Carr: In various assessment years, NAEP has in fact collected information from students about their motivation on the assessment. Data has consistently indicated that students at all grades try as hard, or harder, on the assessment as they do on schoolwork.

Tammy from Philadelphia, PA asked:
How is it decided which cities participate?
Dr. Peggy G. Carr: See my response to Catherine from Westfield, MA, where I described the selection process.

Jill from Lawrenceville, NJ asked:
How were the districts chosen? Will other districts be added in future assessments?
Dr. Peggy G. Carr: I described the process for choosing the five districts in my answer to Catherine of Westfield, MA, above. In 2003, the same five districts will have NAEP data to report, as well as four more: Boston, MA; Charlotte-Mecklenburg, NC; Cleveland, OH; and San Diego, CA. The future of the Trial Urban District Assessment is under review.

Katherine Shek from Washington DC asked:
A follow-up question, you said when sample sizes permit, data for students with disabilities will be available. Does that mean there will be a students with disabilities subgroup and what sample sizes are we talking about? thank you
Dr. Peggy G. Carr: For the NAEP program, reliable results for a subgroup can be produced when there are at least 62 students in the subgroup. NAEP results for students with disabilities should be interpreted with caution. These data are not representative of all students with disabilities. For example, NAEP is not conducted in ungraded special schools, so those students are not part of the assessment. Also, students in regular schools who have severe disabilities are often excluded from NAEP by school staff. There is variation across states, and even schools, in the way students with disabilities are defined, and in how the decision is made to include students in NAEP.

Leonie from New York, NY asked:
In many of your statistical summaries, the footnote to the NYC results says, "Although deemed sufficient for reporting, the target response rate specified in the NAEP guidelines was not met." What does this mean? And why was the number of schools insufficient in NYC to give results for 8th grade averages? Was this an expected or unexpected occurrence?
Dr. Peggy G. Carr: NCES has standards regarding required levels of participation in order for results to be reported. New York City did not meet the initial public-school participation rate of 70 percent at eighth grade, so results for that grade were not reported. For grade 4, the weighted participation rate for the initial sample of schools was below 85 percent, and the weighted school participation rate after substitution was below 90 percent. As a result, the grade 4 NYC results are shown with the notation you saw, indicating possible bias related to non-response. Analysis of the grade 4 NYC data has not shown evidence of bias, so the results were reported even though school participation was low. The assessment was conducted in late winter/early spring of 2002. Conditions in NYC due to September 11 made administering NAEP challenging, and many NYC schools were operating under difficult conditions during the NAEP assessment window.
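The participation thresholds quoted in the answer above can be read as a simple decision rule. The following is only an illustrative sketch: the function name `reporting_status` is hypothetical, the way the 85 and 90 percent conditions are combined is an assumption drawn from the wording of the answer, and the rates in the usage lines are made-up inputs, not actual NAEP figures. It is not NCES's actual procedure.

```python
# Thresholds (in percent) taken from the answer above.
MIN_INITIAL_RATE = 70.0     # below this initial rate, results are not reported
TARGET_INITIAL_RATE = 85.0  # target for the initial sample of schools
TARGET_FINAL_RATE = 90.0    # target after substitute schools are added


def reporting_status(initial_rate: float, rate_after_substitution: float) -> str:
    """Classify results by weighted school participation (hypothetical helper)."""
    if initial_rate < MIN_INITIAL_RATE:
        return "not reported"
    # Assumption: the footnote applies when BOTH targets are missed,
    # as in the grade 4 case described in the answer.
    if (initial_rate < TARGET_INITIAL_RATE
            and rate_after_substitution < TARGET_FINAL_RATE):
        return "reported with footnote"
    return "reported"


# Made-up rates, used only to exercise each branch.
print(reporting_status(65.0, 80.0))  # not reported
print(reporting_status(75.0, 85.0))  # reported with footnote
print(reporting_status(90.0, 95.0))  # reported
```

Under this reading, a jurisdiction in the middle band is still reported, but with a note that non-response bias is possible.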

Jessica from Reading, PA asked:
Can the results from a district (e.g. Atlanta) be compared to the results from that state (e.g. Georgia)?
Dr. Peggy G. Carr: The results from a district and from a state can be compared, because students from both Atlanta and Georgia took the same reading and writing tests with the same test questions, so the scales are comparable. The sampling and administrative procedures were the same in both jurisdictions, so the procedures are comparable. The Atlanta results are part of the Georgia results: the children in Atlanta are members of both jurisdictions. This dependence makes statistical tests more complicated, but the comparisons are still valid.

Susan Lopez from Seattle, WA asked:
How was the sampling done for the urban districts? Were these students also included in any way in the results reported within the past month or so for Reading and Writing at the national and state levels?
Dr. Peggy G. Carr: The samples of students in the urban school districts are augmentations of the samples of students who were selected as part of the state samples reported last month. Students in the urban samples are also included as part of the sample for the state in which they are located. The data for these students were then weighted to be representative for the urban district and for the participating state.

Lisa from Hawaii asked:
Participation rates in NY were much lower than in the other districts in both grades. Is there a reason behind the low participation? Do you expect that NYC will have data reported for both grades in the 2003 assessments?
Dr. Peggy G. Carr: Please take a look at my response to Leonie, who asked some similar questions. The events of September 11 obviously had an impact on the schools and students in NYC during the NAEP assessment window, late January to early March. Conditions have stabilized in NYC in the intervening year, and we expect to be able to report 2003 results for both grades.

Pat from Columbia, MD asked:
Are higher percentages of LEP and disabled students found clustered in urban school districts than spread across states?
Dr. Peggy G. Carr: About 21 percent of students in the US were identified by their schools as students with disabilities (SD) and/or limited-English-proficient students (LEP). The percentages in the states ranged from seven percent to 37 percent at grade 4. In the urban districts, the range was from eight to 43 percent. Most of the variation among both the states and the urban districts was in the LEP component.

Mike from Westchester, NY asked:
How are the "Central City Schools," to which the districts are compared, defined?
Dr. Peggy G. Carr: The U.S. Census Bureau defines a "central city" as a city of 50,000 people or more that is the largest in its metropolitan area or can otherwise be regarded as "central," taking into account such factors as commuting patterns. So this includes large cities such as Boston, MA as well as Lawton, OK and Parkersburg, WV. "Central city" is not synonymous with "inner city."

Tim from Chicago, IL asked:
Are there plans to assess subjects other than Reading and Mathematics at the urban-district level? I would think that Writing would be an important subject to test as well, since that skill is an integral part of most people's everyday lives.
Dr. Peggy G. Carr: You have a good point, because writing is an important skill for adults. The 2002 assessments were conducted in reading and writing, and in 2003 reading and mathematics were assessed. Our current planning has placed a higher priority on assessing reading and mathematics. The next state writing assessment is scheduled for 2007. Whether urban districts will be part of that writing assessment has not yet been decided.

Andre from New Amsterdam, IN asked:
Following up on Katherine Shek's question: is the exclusion rate in the Urban Districts comparable with those from the States they are located in?
Dr. Peggy G. Carr: They are, by and large, comparable. For 2002 reading, for example: Atlanta 2%, Georgia 4%; Chicago 9%, Illinois 7%; Houston 17%, Texas 11%; Los Angeles 8%, California 5%; and New York City 8%, New York 8%. Keep in mind that the exclusion rates of the states represent students who were excluded in the entire state, which includes the participating urban district students as well as those from other areas in the state.
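To make the comparison in the answer above easier to scan, the quoted 2002 reading exclusion rates can be lined up as district-minus-state gaps. This is a minimal sketch: the variable and function names are illustrative, the only data are the percentages quoted in the answer, and the arithmetic says nothing about whether any gap is statistically significant.

```python
# Exclusion rates (percent) for 2002 reading, as quoted in the answer above:
# district -> (district rate, state name, state rate)
exclusion_rates = {
    "Atlanta": (2, "Georgia", 4),
    "Chicago": (9, "Illinois", 7),
    "Houston": (17, "Texas", 11),
    "Los Angeles": (8, "California", 5),
    "New York City": (8, "New York", 8),
}


def district_minus_state(rates):
    """Return each district's exclusion rate minus its state's rate."""
    return {
        district: district_rate - state_rate
        for district, (district_rate, _state, state_rate) in rates.items()
    }


gaps = district_minus_state(exclusion_rates)
for district, gap in gaps.items():
    state = exclusion_rates[district][1]
    print(f"{district} vs. {state}: {gap:+d} percentage points")
```

Because each state rate already includes the district's own students, as the answer notes, these gaps understate the difference between a district and the rest of its state.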

Hollie from DC asked:
Although overall scores for the urban districts are lower, some of the racial subgroups appear to be performing near the national average. That sounds like good news. What do you think?
Dr. Peggy G. Carr: It is good news. Some racial subgroups are performing near and even above the national average (New York, Atlanta, and D.C. Whites; Los Angeles and New York Asian/Pacific Islander). In addition, Houston and New York Blacks performed better than the national average for Black students.

Fred Regan from Newton, MA asked:
Why is it important to break results out by race/ethnicity in NAEP assessments? Does NAGB/NCES see race/ethnicity as a predictor of achievement? (I would hope not). Might not the breaking out of results in such a manner reinforce certain stereotypes? Thank you.
Dr. Peggy G. Carr: Race/ethnicity is not used in any way as a predictor of achievement. We are required by federal law to report NAEP results for race and ethnic groups. For example, the No Child Left Behind Act calls for closing educational gaps between Whites and minority students. NAEP provides data for measuring progress in closing these gaps. You make a good point, though. That's why we also report on students' eligibility for the free/reduced-price lunch program, a proxy for low-income status. Low social/economic status is often confounded with race.

Deric from Los Angeles, CA asked:
Since non-white students tend to score lower than whites, is it possible that the test questions are biased against minority groups?
Dr. Peggy G. Carr: NAEP assessment questions are carefully examined, statistically and qualitatively, for any possible bias. Items are reviewed by advisory groups; most items undergo more than one hundred such reviews before they are used in the assessment. All NAEP items are pre-tested before being used in a "live" assessment. All items are also examined statistically for evidence of different functioning between groups of students, including comparisons among different racial/ethnic groups. Items that show evidence of bias are eliminated from the assessment.

Bruce Mitchell from Boston, MA asked:
Your website includes the statements "Because individual states have assessments based on a variety of scores, scales, and test designs, districts have not been able to validly compare themselves to a district in another state. For the first time, the TUDA makes such comparisons possible." (a) To what extent might the district results be affected by curriculum differences? (b) If the districts are following state level curriculum documents, how fair is it to make comparisons across states/districts? Is there any statistical adjustment for "opportunity to learn"? (c) Has NCES, NAGB or some other body mapped the NAEP assessment frameworks onto the various state (or district) curriculum documents? If so, is there a relationship between the degree of congruence between the assessment frameworks and curriculum documents and state (or district) achievement? (d) At the moment NAEP is considered a "low-stakes" test. Might this change through NCLB? Given that assessments can be used to drive educational change, are we moving towards a national curriculum? Thank you.
Dr. Peggy G. Carr: NAEP assessments are based on frameworks generated with broad input from stakeholders and overseen by NAGB. These frameworks are not intended to reflect individual states' or districts' curricula but instead represent prevailing expert views about appropriate assessment content. NAEP assessments, therefore, give general pictures of student knowledge and skills that may overlap with, but are not specifically designed to capture, the knowledge and skills described in state and local curricula. What is relevant here is the opportunity TUDA offers to compare results across districts, based on the SAME assessments used at the national and state levels. It is as fair to compare student subgroups at the national level as it is to compare results across states and districts. NAEP does not adjust for opportunity to learn; assessments are designed to measure what is deemed valid for subjects and grade levels, and to be as accessible to a range of students as possible. Background questionnaires answered by students and teachers do supply some information about opportunity to learn. Some states have explored the relationship between their curricula and assessment frameworks and NAEP frameworks. Again, while there is likely overlap between NAEP frameworks and state and district curricula, frameworks are not designed to reflect local curricula. Opportunity-to-learn information revealed by responses to background questionnaires does indicate some positive relationships between student performance and subject exposure, but this is not necessarily related to specific curricula. There are no plans to make NAEP a high-stakes assessment. To do so would require a major redesign of NAEP's test forms and sampling procedures. The NCLB Act prohibits the federal government from influencing state curricula in the direction of a national curriculum. If states wish to move their curricula closer to NAEP, they are free to do so, and a few states have done this, but this is their prerogative.

Charles from Arlington, VA asked:
If NAEP results cannot be used to determine causation, why do we collect data such as race, gender, free/reduced priced lunch, etc? Doesn't providing results in terms of these variables lead people to make such assumptions and comparisons?
Dr. Peggy G. Carr: Please see my earlier response to Fred Regan on this topic.

Katherine Shek from Washington DC asked:
2002 was the first year NAEP allowed students to use accommodations to take the test, right? What is the significance of this? Why is it important for students with disabilities to participate in NAEP? And is there a plan to make students with disabilities a subgroup in NAEP data? Did more students participate in NAEP in 2002 because accommodations were allowed? Thanks.
Dr. Peggy G. Carr: Students have been allowed to use accommodations on NAEP since 1996. Please see my response to your earlier questions.

Thanks for all the excellent questions. Unfortunately, I could not get to all of them, but please feel free to contact me or members of the NAEP staff if you need further assistance. I hope that you found this session to be helpful and the reports to be interesting. This fall, we will release results from the NAEP 2003 reading and mathematics assessments.

National Center for Education Statistics -
U.S. Department of Education