Frequently Asked Questions

Data File Information

Who's included in the child, teacher, and school catalogs? Why are there teachers and schools in the child catalog, but there are no teachers and students in the school catalog? Why are there schools in the teacher catalog that are not in the school catalog?

In the base year, the ECLS-K was representative at three levels—kindergartners (i.e., the child level), kindergarten teachers, and schools educating kindergartners. The longitudinal kindergarten through eighth grade electronic codebook (ECB) contains a catalog pertaining to each of these levels. The child catalog is longitudinal and contains data for all children who participated in the kindergarten year, as well as data for all subsequent rounds of data collection. The teacher catalog is cross-sectional and contains information for the representative sample of kindergarten teachers only for the kindergarten round of data collection. The school catalog also is cross-sectional and contains information for the representative sample of schools educating kindergartners, again only for the kindergarten round of data collection.
Within the representative sample of schools educating kindergartners, kindergarten teachers were selected for the teacher sample, regardless of whether any children were sampled from their classrooms for the child sample. Thus, there are teachers in the teacher catalog who are not represented in the child catalog. Conversely, there are teachers in the child catalog who are not in the teacher catalog, because some children changed teachers during the kindergarten year and their new teachers were not part of the representative sample of teachers. Similarly, the school catalog contains only those schools that were sampled as part of the representative sample of schools educating kindergartners and that had a completed school administrator questionnaire. The school-level file does not contain those schools that children moved into but were not part of the initial representative sample of schools educating kindergartners. Data collected from children’s new schools were, however, included in the child-level file.
A user might see school IDs on the teacher file that are not on the school file. This is because, while these schools were part of the representative sample of schools with kindergartens, they did not have a completed school administrator questionnaire.
Thus, users interested in creating teacher/classroom-level files or school-level files based on presence of child data, regardless of whether they were part of the representative teacher or school sample, should use the child-level file.
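The overlap between catalogs described above can be checked with a simple set comparison. The following is a minimal Python sketch, not part of the ECLS-K documentation. The function name and the toy ID values are hypothetical, and it assumes the school IDs have already been read from each file into lists.

```python
def ids_missing_from(reference_ids, candidate_ids):
    """Return the IDs in candidate_ids that never appear in reference_ids, sorted."""
    return sorted(set(candidate_ids) - set(reference_ids))


# Toy values standing in for school IDs read from the school and teacher files.
school_file_ids = ["0001", "0002", "0003"]
teacher_file_ids = ["0001", "0002", "0002", "0004"]

# Schools on the teacher file with no record on the school file
# (for example, schools without a completed school administrator questionnaire).
teacher_only_schools = ids_missing_from(school_file_ids, teacher_file_ids)
print(teacher_only_schools)
```

The same comparison can be run in the other direction, or on teacher IDs from the child and teacher files, by swapping the two arguments.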

How are the data collected in the fall and spring kindergarten teacher Part B questionnaires presented in the child and teacher catalogs?

In the fall of kindergarten (round 1), teachers were asked about their characteristics and the characteristics of their classroom in Teacher Questionnaire, Part B (TQB). Teachers who were added to the study in the spring of kindergarten (either in a school that joined the study in round 2 or for a child who had a new teacher in round 2) were administered a similar version of the TQB in the spring (round 2). Teachers who answered TQB in the fall did not complete the spring version, so for any one teacher there is only one set of TQB data.
In the base year teacher file all of the TQB items have the B1 prefix, regardless of the round in which the data were collected. Two flags indicate the round in which the data were collected. If the flag B1TQUEX equals 1, the data were collected in the fall; if B2TQUEX equals 1, the data were collected in the spring.
In the longitudinal child file there are two sets of TQB data, one set beginning with the B1 prefix and one set beginning with the B2 prefix. The B1 items pertain to the teacher linked to the child in the fall [with variable T1_ID] and the B2 items pertain to the teacher linked to the child in the spring [with variable T2_ID]. The majority of the children have the same teacher in the fall and spring, so for these children the case-level information for the B1 TQB items is identical to the case-level information for the B2 TQB items. For children who changed teachers during the year, these two sets of data are different because they come from different teachers [you can determine whether children changed teachers by comparing T1_ID with T2_ID or by looking at variable FKCHGTCH]. For cross-sectional analyses, analysts should use the TQB items from the time period for which they are doing analyses (i.e., B1 items for fall kindergarten; B2 items for spring kindergarten). For information that reflects the kindergarten experience as a whole, analysts might choose to limit their analysis to children whose teacher remained constant across the year.
  • The Spring Kindergarten TQB contains a subset of items asked in the Fall Kindergarten TQB. However, the variable structures for the B1 and B2 variables on the child-level data file are parallel; that is, all Fall Kindergarten TQB variables (B1 variables) have a corresponding Spring Kindergarten TQB variable (B2 variables) in the Electronic Codebook (ECB) and in the resulting data file.
  • Some questions asked of teachers in the Fall Kindergarten TQB were not asked of teachers new to the study in the Spring Kindergarten TQB. For children who had the same teacher in fall and spring, the information from the B1 variable pertaining to such a question is carried forward to the B2 variable. Children who have teachers who are new to the study in the spring (i.e., they are in schools that joined the study in the spring or they changed teachers between fall and spring) do not have information pertaining to these questions in the spring. Data for these measures for these children are coded either as system missing or not ascertained, depending on whether the child's teacher responded to the survey.
  • On the Fall Kindergarten TQB, question 1 asks teachers about time spent in whole class activities, small group activities, individual activities, and child selected activities. This question was not repeated on the Spring Kindergarten TQB, but was asked on the Spring Kindergarten TQA (question 8). Therefore, there are three sources of information from kindergarten on time spent in whole class activities, small group activities, individual activities, and child selected activities [Fall Kindergarten TQB: B1WHLCLS; B1SMLGRP; B1INDVDL; B1CHCLDS] [Spring Kindergarten TQA: A2WHLCLS; A2SMLGRP; A2INDVDL; A2CHCLDS] and [Spring Kindergarten TQB: B2WHLCLS; B2SMLGRP; B2INDVDL; B2CHCLDS]. Since this question did not appear in the Spring Kindergarten TQB, the B2 variables are simply the responses that teachers had provided on the fall TQB. The A2 variables present the information that best reflects the spring kindergarten time period.
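The comparison of T1_ID with T2_ID mentioned above can be sketched in Python as follows. This is an illustration under stated assumptions, not an official procedure: the records are toy dictionaries, CHILDID is used here only as an illustrative case identifier, and a missing teacher ID is assumed to be represented by an empty string or None.

```python
def changed_teacher(child):
    """Return True/False when both teacher IDs are present, None when either is missing."""
    t1, t2 = child.get("T1_ID"), child.get("T2_ID")
    if not t1 or not t2:
        return None  # cannot tell without both links (assumed missing-value convention)
    return t1 != t2


# Toy records; real values would come from the child-level file.
children = [
    {"CHILDID": "A", "T1_ID": "T10", "T2_ID": "T10"},
    {"CHILDID": "B", "T1_ID": "T10", "T2_ID": "T22"},
    {"CHILDID": "C", "T1_ID": "", "T2_ID": "T30"},
]

# Children whose teacher stayed the same all year, e.g. for whole-year analyses.
same_teacher_all_year = [c["CHILDID"] for c in children if changed_teacher(c) is False]
print(same_teacher_all_year)
```

Returning None for incomplete links keeps those children out of both the "changed" and "did not change" groups until the analyst decides how to treat them.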

Is it possible to compute the elapsed time period between two assessments?

It is possible to calculate the elapsed time between two direct assessments for a child using variables in the Electronic Codebook (ECB). For each direct assessment, there are corresponding variables that indicate the month, day, and year in which the direct assessment was administered. For instance, in round 1 the assessment date variables are: C1ASMTMM (C1 Assessment month), C1ASMTDD (C1 Assessment day), and C1ASMTYY (C1 Assessment year-4 digits). To calculate the elapsed time between two assessments for a child, one can use the assessment date variables from the two rounds of interest to determine the number of days between the two direct assessments.
The ECB also includes composite variables for children's age at assessment at each direct assessment time point (e.g., R1_KAGE for Round 1 Composite child assessment age, in months). These variables are based on the children's date of birth and the date on which they were assessed. In some cases, there are discrepancies in the age at assessment variables due to masking of variables for the public-use file or improvements in the date of birth variables collected in earlier rounds of data collection. Since this is the case, we recommend that users calculate the elapsed time between assessments using the method described above, rather than use the composite assessment age variables on the public-use data file.
Below are examples of SPSS and SAS code that can be used to calculate elapsed time between direct assessments:
SPSS Code:
COMMENT Calculate elapsed time between R1 and R2
COMMENT convert 3 variable assessment date into a one variable assessment date.
COMPUTE date1=DATE.DMY(c1asmtdd, c1asmtmm, c1asmtyy).
FORMATS date1(DATE11).

COMPUTE date2=DATE.DMY(c2asmtdd, c2asmtmm, c2asmtyy).
FORMATS date2(DATE11).

COMMENT calculate elapsed time between R1 and R2 assessments, in days.
COMPUTE elapse = (date2 - date1)/86400.

SAS Code:

/* EXAMPLE - Calculate elapsed time between R1 and R2 */
data new_file;
set original_file;
/* The date parts are numeric, so the MDY function alone converts them to SAS date values (day counts), which can then be subtracted. */
date1 = mdy(c1asmtmm, c1asmtdd, c1asmtyy);
date2 = mdy(c2asmtmm, c2asmtdd, c2asmtyy);
elapse = date2 - date1;
run;
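For users working outside SPSS and SAS, the same calculation can be sketched in Python with the standard library. This is a minimal illustration rather than official ECLS-K code: the function names and the example dates are invented, and it assumes that unusable date parts (such as missing-value codes) should yield no result instead of an error.

```python
from datetime import date


def assessment_date(month, day, year):
    """Build a date from the three numeric parts; return None if any part is unusable."""
    try:
        return date(int(year), int(month), int(day))
    except (TypeError, ValueError):
        return None


def elapsed_days(record):
    """Days between the round 1 and round 2 direct assessments, or None if undefined."""
    d1 = assessment_date(record["C1ASMTMM"], record["C1ASMTDD"], record["C1ASMTYY"])
    d2 = assessment_date(record["C2ASMTMM"], record["C2ASMTDD"], record["C2ASMTYY"])
    if d1 is None or d2 is None:
        return None
    return (d2 - d1).days


# Toy record with invented dates: 15 October 1998 and 20 April 1999.
example = {"C1ASMTMM": 10, "C1ASMTDD": 15, "C1ASMTYY": 1998,
           "C2ASMTMM": 4, "C2ASMTDD": 20, "C2ASMTYY": 1999}
print(elapsed_days(example))  # 187
```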

I noticed that in the data file not all cases have data for the round 4 school administrator questionnaire (SAQ) variables. What do I do?

Schools that had already completed the school administrator questionnaire (SAQ) in round 2 were given a modified repeat school SAQ in round 4, whereas schools that were new to the study in round 4 were asked to complete an SAQ for new schools that collected more information than the modified SAQ (and was very similar to the SAQ used in round 2). The questions that were not in the repeat school SAQs (e.g., the grade levels included in the school, how many students the school site is designed to accommodate, what grades are tested with standardized tests) had already been asked in round 2, and variables associated with these questions are on the data file as S2 variables.
For those SAQ questions that were asked at round 2 but not at round 4 for repeat schools, a user can pull forward the data collected from round 2 to have a complete set of round 4 variables. For children who did not change schools between rounds 2 and 4, their round 2 child-level S2 variables can be used in analyses at round 4. Care should be taken to not replace updated information collected at round 4 with round 2 data (i.e., this procedure should only be used for data that were not collected at round 4). However, for children who changed schools between rounds 2 and 4 and moved into a repeat school, their own S2 data cannot be used at round 4. For these children, an analyst can use the child's round 4 school ID (S4_ID) to find another child who attended that school (i.e., had the same school ID) in round 2 and then pull that other child’s round 2 school data forward.
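The pull-forward logic described above can be sketched in Python as follows. This is one possible reading of the procedure, not an official implementation. The item names S2ITEM and S4ITEM are hypothetical placeholders, S2_ID is assumed to be the round 2 counterpart of S4_ID, CHILDID is used only as an illustrative case identifier, and missing values are assumed to be represented as None.

```python
def pull_forward(children, item):
    """Return a round 4 value per child for `item`, filling gaps from round 2 school data."""
    # Index round 2 values by round 2 school ID (first non-missing value per school).
    r2_by_school = {}
    for child in children:
        value = child.get("S2" + item)
        if child.get("S2_ID") and value is not None:
            r2_by_school.setdefault(child["S2_ID"], value)

    filled = {}
    for child in children:
        r4_value = child.get("S4" + item)
        if r4_value is not None:
            # Never replace information actually collected at round 4.
            filled[child["CHILDID"]] = r4_value
        else:
            # Look up the round 4 school among round 2 schools; this covers both
            # children who stayed and children who moved into a repeat school.
            filled[child["CHILDID"]] = r2_by_school.get(child.get("S4_ID"))
    return filled


# Toy records: B moved into school 100 after round 2; C has its own round 4 data.
children = [
    {"CHILDID": "A", "S2_ID": "100", "S4_ID": "100", "S2ITEM": 5, "S4ITEM": None},
    {"CHILDID": "B", "S2_ID": "200", "S4_ID": "100", "S2ITEM": 9, "S4ITEM": None},
    {"CHILDID": "C", "S2_ID": "200", "S4_ID": "300", "S2ITEM": 9, "S4ITEM": 7},
]
round4_values = pull_forward(children, "ITEM")
print(round4_values)
```

Keying the lookup on the round 4 school ID means a child who changed schools receives the new school's round 2 data rather than data from the school the child left.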

When I try to download the online kindergarten through eighth grade data from the NCES website, I receive a message saying that the file is corrupted. What troubleshooting steps do you recommend?

Follow these steps to create an ASCII file that contains all the data from kindergarten through eighth grade. (The procedures work the same for the smaller files with the base year school and teacher data.)
  1. Click on Childk8p.z01 from NCES's website. A prompt will appear that asks whether you want to "find," "save," or "cancel." Select "save" and save the file to the desired location on your computer. It may be helpful to create a specific folder in which to save all the files you will need to download.
  2. Repeat step 1 for Childk8p.z02, Childk8p.z03, Childk8p.z04, Childk8p.z05, and finally, the file, making sure to place the files in the same folder as that containing Childk8p.z01.
  3. Go to the location where you saved all your files and double-click the file.
  4. The WinZip dialogue box should open. Select the file and then select extract.
  5. A dialogue box may open that asks where you want WinZip to extract to. By default, it should be the folder in which you saved the files. If it is not, change the directory to the location where you would like the files to be located and select extract.
  6. The extraction process should begin, and after it is complete, you should see the childkp.dat file. This is the ASCII file with all the data in it.
The above-listed procedures for opening files do not seem to work with Microsoft Windows' default Extraction Wizard or with some free unzipping programs (other than WinZip). Many data users find that WinZip is needed to open the online data properly, although Extraction Wizard may be able to open the smaller zipped folders (e.g., the school and teacher data).
On computers with slower connection speeds, large files such as these may become corrupted during the lengthy time it takes to download them. If your computer has a slower connection speed, try the above procedures using a computer with a faster connection.
Please keep in mind that the public-use data DVD with the electronic codebook (ECB) is also available free of charge through EdPubs, which distributes NCES products. Public-use data are also available through an on-line version of the ECB called the EDAT system. To access the EDAT system, please visit

I saved my taglist in the ECB and used the “Extract” function to save a file with the variables I tagged, but when I try to open the file in SPSS/SAS/Stata, I don’t see any data. What’s wrong?

The ECB does not create a data file. Rather, the ECB creates syntax code that must be run in a statistical software package to generate a data file. The syntax file reads in raw data from the ASCII data file (the file with a .dat extension). In the ECB, there are two “save” steps in the “Extract” procedure. The first step saves the syntax file. In the second step, there is no file that is actually saved. Instead, this step writes a line of code in the syntax file indicating what to name the data file once the syntax file is run and a data file is generated.
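To illustrate what "reading in raw data" from an ASCII file involves, the sketch below parses fixed-width text lines using a column layout. It is a conceptual Python illustration only and does not reproduce the syntax that the ECB generates. It assumes, for illustration, that the data are stored in fixed-width columns; the column positions and the sample lines are invented.

```python
def read_fixed_width(lines, layout):
    """Slice each text line into named fields using 1-based inclusive column positions."""
    records = []
    for line in lines:
        record = {name: line[start - 1:end].strip()
                  for name, (start, end) in layout.items()}
        records.append(record)
    return records


# Hypothetical layout: variable name -> (first column, last column).
layout = {"CHILDID": (1, 8), "C1ASMTMM": (9, 10)}

# Invented lines standing in for rows of the .dat file.
raw_lines = ["0001001C10", "0002001C11"]

records = read_fixed_width(raw_lines, layout)
print(records[0])
```

Until a step like this is run (in practice, by running the ECB-generated syntax in SPSS, SAS, or Stata), only the instructions for building the data file exist, which is why the extracted file appears to contain no data.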

What geocode data are available?

Zip code and geocode data are available only to analysts with a restricted-use license. Information about applying for a restricted-use data license can be found at
Census tract and zip code tabulation area (ZCTA) codes for ECLS-K children's homes and schools are available for each round of the ECLS-K up to third grade. These data are available on the CD “Census Data and Geocoded Location for the Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 (ECLS-K)” supplemental restricted-use file, available to restricted-use license holders upon request to the IES Data Security Office. This file also has about 600 Census variables (or Census derived variables) for each Census tract and ZCTA including income, race/ethnicity, and many other sociodemographic characteristics of the people living in the tract or ZCTA. Supporting documentation is included on the CD and consists of a user’s manual, data file record layouts describing the variables on each of the ASCII data files, and code for converting the data files.
Home zip codes, school zip codes, and school FIPS codes (both state and county) are also available for first grade, third grade, fifth grade, and eighth grade on the cross-sectional restricted-use files for these grades. These data are identified by the variables RxHOMZIP, RxSCHZIP, RxFIPSST (state), and RxFIPSCT (county), respectively, where “x” refers to the data collection round number. Additional geographic information, such as home FIPS codes (P7FIPSST, P7FIPSCT) and tract and block data, is available at eighth grade. To match FIPS codes to the states' names, please visit
Please note that while the restricted data do identify children’s state of residence, the ECLS-K sample was not designed to support state-level (or city- or county-level) estimates, as the sample is not necessarily representative of children in particular states (or cities or counties).