Managing an Identity Crisis: Forum Guide to Implementing New Federal Race and Ethnicity Categories
NFES 2008-802
October 2008

Chapter 5. Getting it Out: Coding, Reporting, Storage, and Bridging

5.2 Data Coding

  • The Final Guidance does not dictate any coding schemes. States are allowed to design their own coding structure, as long as they are able to report the racial and ethnic data using the seven aggregate categories.
  • The five race categories, with respondents allowed to choose multiple races, yield 62 possible combination codes. (If a race category is broken out in more detail, for example into specific Asian subgroups, the number of codes could increase exponentially.) Two more codes may be assigned for respondents who selected Hispanic or non-Hispanic without any race selected or assigned (note that this is an instance of missing data rather than a valid category). A full list of these 62 codes can be found in NCES's Statistical Standards and is also included in exhibit 5.1 of this guide. Note that the NCES statistical standard codes contain two codes for “no race specified or refused” that are for postsecondary institutions and cannot be used for K–12 reporting to ED.
  • Besides coding each race and ethnicity as single items, there are other approaches to coding. For example, each race and ethnicity category can be assigned as a “Y/N” or “1/0” in the system, such as:
Hispanic/Latino Y/N 1/0
American Indian/Alaska Native Y/N 1/0
Asian Y/N 1/0
Black or African American Y/N 1/0
Hawaiian Native/Other Pacific Islander Y/N 1/0
White Y/N 1/0

Another format for this coding scheme stores a 1 or 0 in a separate column for each race and ethnicity category. This format is suggested for storage rather than for data entry or recording.

       | American Indian/Alaskan Native | Asian | Black or African American | Hawaiian Native/Other Pacific Islander | White | Hispanic/Latino
Name 1 | 1 | 1 | 0 | 0 | 0 | 0
Name 2 | 0 | 0 | 1 | 0 | 0 | 1

100000 American Indian or Alaska Native
010000 Asian
001000 Black
000100 Native Hawaiian or Other Pacific Islander
000010 White
000001 Hispanic
110000 American Indian or Alaska Native and Asian
101000 American Indian or Alaska Native and Black
111111 All five races and Hispanic
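
The six-digit codes listed above can be produced mechanically from the 1/0 flags. The following is a minimal sketch, not a prescribed implementation: the function names (`encode`, `decode`, `aggregate_category`) are hypothetical, and the roll-up rule (Hispanic of any race first, then a single race or "Two or more races") is an assumption about the seven aggregate categories, which this section does not enumerate. The sketch also shows one way to arrive at the figure of 62 codes, assuming it counts each nonempty race combination once with and once without Hispanic ethnicity.

```python
from itertools import product

# Position order follows the six-digit code list above.
FIELDS = (
    "American Indian or Alaska Native",
    "Asian",
    "Black",
    "Native Hawaiian or Other Pacific Islander",
    "White",
    "Hispanic",
)
RACES = FIELDS[:5]


def encode(selected):
    """Return the six-character 1/0 string for the selected categories."""
    unknown = set(selected) - set(FIELDS)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return "".join("1" if name in selected else "0" for name in FIELDS)


def decode(code):
    """Return the list of category names flagged with 1 in the code."""
    if len(code) != len(FIELDS) or set(code) - {"0", "1"}:
        raise ValueError(f"not a six-digit 1/0 code: {code!r}")
    return [name for name, bit in zip(FIELDS, code) if bit == "1"]


def aggregate_category(code):
    """Roll a code up to one aggregate reporting category.

    Assumed rule (not stated in this section): Hispanic of any race
    takes precedence; otherwise a single race, or "Two or more races".
    Returns None when no race is selected (missing data).
    """
    selected = decode(code)
    if "Hispanic" in selected:
        return "Hispanic/Latino of any race"
    races = [name for name in selected if name in RACES]
    if not races:
        return None
    if len(races) > 1:
        return "Two or more races"
    return races[0]


# Enumerate every possible six-digit code.
all_codes = ["".join(bits) for bits in product("01", repeat=len(FIELDS))]
# Codes with at least one race selected: 31 race combinations x 2 ethnicities.
with_race = [c for c in all_codes if "1" in c[:5]]
# Codes with no race selected: Hispanic only, or nothing at all.
without_race = [c for c in all_codes if "1" not in c[:5]]
```

Under these assumptions, `with_race` holds 62 codes and `without_race` holds the 2 additional no-race codes. The second example row in the table above (Black and Hispanic/Latino) encodes to `001001`, which `aggregate_category` rolls up to the Hispanic category.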
  • For accuracy and data quality reasons, do not recycle old codes. The Massachusetts Department of Elementary and Secondary Education ran into some code-related data quality issues when it used some of the same codes in the new scheme that it had used in the past. Since Black, for example, was “03” under the old system, but was “02” in the new system, with “03” assigned to Asian, some confusion and coding errors occurred. As a result, the state implemented additional data quality reviews to ensure accuracy and has resolved such issues. (See Massachusetts State Department of Education case study in Chapter 2.)
  • State data systems vary in design. States should consider the best options for their systems based on an assessment of such factors as the cost of converting the systems, feasibility, the quality of the data yielded, and whether the system allows only alpha/numeric codes. Some states may prefer a two-digit code system (for major categories) or a four-digit code system (for more specific information such as ancestry or tribal information). Some states may choose to use codes that match those used in the previous year, with any necessary modifications to accommodate the new categories. After such consideration, standards should be developed for school districts to change their systems. Some states, such as Vermont and North Dakota, are already working toward a system using the new race and ethnicity codes. Their systems, developed prior to the release of the Final Guidance, are documented in case studies included later in this chapter.
  • It is recommended that school districts use an easy coding system for data entry (such as a yes/no or 1/0 for each of the five races). To minimize data entry errors, design the data entry screen to look like the data collection form.
  • It is important to ensure the accuracy of data received from schools. Technology can improve data quality by automating edit checks. Data entry staff, administrators, and technology personnel can work together to produce and implement these edit checks. For example, staff should re-check the information if the existing data in a record differ from the new data and the change is:
    • Not one of the “split out” categories, such as a change from “Asian or Other Pacific Islander” to “Asian” or “Native Hawaiian or Other Pacific Islander”;
    • A single-race selection but with a different category.

    Staff should also re-check the record if “Hispanic” has been entered without a race.
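
The edit checks described above could be automated along the following lines. This is only a sketch under one reading of the rules: the function name `needs_recheck`, the `EXPECTED_SPLITS` table, and the record layout (one old single-race category, a set of new race selections, and a Hispanic flag) are all hypothetical, and a state would substitute its own category names and rules.

```python
# Old category -> new categories it may legitimately be split into.
EXPECTED_SPLITS = {
    "Asian or Other Pacific Islander": {
        "Asian",
        "Native Hawaiian or Other Pacific Islander",
    },
}


def needs_recheck(old_race, new_races, hispanic):
    """Return the reasons a record should be re-checked (empty list if none).

    old_race  -- single category stored under the old scheme, or None
    new_races -- race categories selected under the new scheme
    hispanic  -- True if Hispanic/Latino was selected
    """
    reasons = []
    new_races = set(new_races)

    # Hispanic with no race is treated as missing race data.
    if hispanic and not new_races:
        reasons.append("Hispanic entered without a race")

    # A single-race selection that differs from the existing record,
    # unless it is an expected split of an old combined category.
    if old_race is not None and len(new_races) == 1:
        (new_race,) = new_races
        is_split = new_race in EXPECTED_SPLITS.get(old_race, set())
        if new_race != old_race and not is_split:
            reasons.append("single-race selection differs from existing record")

    return reasons
```

In this sketch a move from “Asian or Other Pacific Islander” to “Asian” passes without a flag, while a change from one single race to a different single race is returned for review. Multiple-race selections are not flagged here, since the text above does not list them as a trigger.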