Frequently Asked Questions

ECLS-B

Were children with disabilities sampled in ECLS-B?

Children with disabilities and special health needs were included in the ECLS-B sample. NCES and the Office of Special Education Programs (OSEP) conducted an evaluation of the feasibility of oversampling young children with disabilities and special health needs and determined that such an oversampling could not be pursued. However, the ECLS-B oversampled twins and infants born with low and very low birth weight. Low birth weight and short gestation (common among twins) are strongly associated with early developmental difficulties and special health needs. Additionally, information is gathered on early health and developmental disabilities and on the services children receive. The ECLS-B aimed to be as inclusive as possible, and assessments were designed to include children of all levels of ability and skill. Exclusion from the assessments was considered on a case-by-case basis; in most cases where an exclusion occurred, a child was excluded only from certain components of the assessment rather than from the entire assessment. For the 9-month and 2-year data collections, all children were included in the untimed one-on-one assessments. Accommodations were made (and documented) when necessary. For the preschool and kindergarten data collections, most children were included in the untimed one-on-one assessments. However, children who required Braille or sign language were not administered the cognitive assessments, though they did participate in the motor and physical assessments with accommodations. Children who were in a wheelchair did not participate in the gross motor assessments and accommodations were made in order to obtain their physical measurements. The special needs of all other children who had them were accommodated (e.g., children who normally used some type of assistive device were allowed to use the device during the assessment).

What instruments were used to assess children's cognitive development?

Information on children's development was captured directly from the children themselves and indirectly through a parent/primary caregiver interview. Both the direct and indirect child assessments include measures developed specifically for the ECLS-B and measures taken from other well-established and/or standardized assessments.

9-month and 2-year collections
The assessment of cognitive development used at 9 months and 2 years was the Bayley Short Form–Research Edition (BSF-R), an adaptation of the Bayley Scales of Infant Development–II (BSID-II). The BSF-R includes a subset of BSID-II items that can be used to approximate children's performance on the full BSID-II. This assessment captures children’s babbling, vocabulary, active exploration, understanding of repetitive actions, and problem-solving skills.

Preschool and kindergarten collections
The early reading and mathematics direct cognitive assessments used in the preschool and kindergarten collections were similar to the assessments used in the Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 (ECLS-K). They incorporated items developed for the ECLS-K, as well as items from the following copyrighted assessments:

Peabody Individual Achievement Test-Revised (PIAT-R)
Peabody Picture Vocabulary Test Third Edition (PPVT-III)
PreLAS 2000
Preschool Comprehensive Test of Phonological and Print Processing (Pre-CTOPPP)
Comprehensive Test of Phonological Processing (CTOPP)
Test de Vocabulario en Imagenes Peabody (TVIP)
Test of Early Mathematics Ability - Third Edition (TEMA-3)
Test of Early Reading Ability - Third Edition (TERA-3)
Test of Preschool Early Literacy (TOPEL)
Woodcock-Johnson III and Woodcock-Johnson III-Revised Tests of Achievement

At preschool only, children were asked about their color knowledge using a task developed by the Head Start Impact Study (Color Bears).

In the preschool and kindergarten collections, parents were asked about their children’s skills and knowledge of things like colors, letters, and numbers using items from the National Household Education Surveys Program (NHES) questionnaires.

Parents also reported on children’s vocabulary using a subset of items taken from the MacArthur Communicative Development Inventory (M-CDI) in the 2-year and preschool collections. Additionally, in the preschool collection, parents were asked about children’s conversational language (Leventhal 1999).

Did the ECLS-B measure outcomes other than cognitive development?

Yes, the ECLS-B measured children's socioemotional, physical, and psychomotor development.

At 9 months, children's socioemotional development (e.g., social skills, emotion regulation) was measured directly using the Nursing Child Assessment Teaching Scale (NCATS) and indirectly through parental reports. At 2 years, children's socioemotional development was measured directly using a semi-structured play activity (the Two Bags Task) and indirectly through parent and provider reports. Additionally, at 2 years attachment quality was assessed using a computerized panel sort task, the Toddler Attachment Q-Sort (TAS-45), which was completed by the interviewer after the home visit. At preschool, as at 2 years, children's socioemotional development was measured directly using the Two Bags Task and indirectly through parent and provider reports. During the kindergarten waves, children’s socioemotional development was assessed indirectly through parent and teacher/provider reports.

Children's length/height, weight, and middle upper arm circumference were measured directly during the home visit at each wave. Head circumference was measured for children who were born with very low birth weight.

Children's fine and gross motor skills were assessed using items from the Bayley Short Form–Research Edition at 9 months and 2 years. At preschool and kindergarten, fine motor skills were assessed by asking children to copy a series of forms/shapes drawn by assessors and to build structures using blocks. In these rounds of data collection, gross motor skills were assessed by asking children to jump, balance on one foot, hop on one foot, skip, walk backward along a line, and catch a bean bag.

How do I choose the most appropriate score for analysis of my research question?

There are many direct cognitive, socioemotional, and physical measure scores available on the ECLS-B data files. For guidance in selecting scores, please see Choosing Scores (472 KB).

What populations of children can be studied with the ECLS-B data?

The ECLS-B describes children born in the United States in 2001, with the exception of children born to mothers younger than age 15 and children who died or were adopted prior to the 9-month data collection. Additionally, the ECLS-B sample is large enough to support the analysis of many different subgroups of children. Because the ECLS-B oversampled certain groups of children that are relatively rare in the general population (twins; Chinese, Other Asian and Pacific Islander, and American Indian/Alaska Native children; and children born with moderately low or very low birth weight), reliable estimates can be produced for these groups of children.

Do I need to use weights for my analyses? What if I am not interested in making statements about the population?

Weights are used to adjust for disproportionate sampling, survey nonresponse, and undercoverage of the target population when analyzing complex survey data. They also are necessary to produce national-level estimates of the children born in 2001.

Oversampling results in certain groups of children being represented as a larger proportion of the sample than their representation in the general population. Estimates produced in analyses that do not adjust for this oversampling may be biased if the characteristics of the oversampled groups are related to the outcomes being studied. For example, the average birth weight estimate will be lower in the ECLS-B sample than in the population in general due to the oversample of twins and children born with low and very low birth weight in the sample. Using the weights will correct for the over-representation of these groups and produce a more accurate estimate of birth weight among babies born in the United States in 2001.
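
The effect of weighting on the birth weight example can be illustrated with a minimal sketch. The data and weights below are invented for illustration and are not ECLS-B values; the function name is also hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * x) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Invented toy data (NOT ECLS-B values): birth weights in grams.
birth_weight_g = [1400, 2200, 3300, 3500]
# Invented weights: oversampled low-birth-weight cases each stand for
# fewer children in the population, so they carry smaller weights.
sample_weight = [50, 100, 900, 950]

unweighted = sum(birth_weight_g) / len(birth_weight_g)
weighted = weighted_mean(birth_weight_g, sample_weight)
print(f"unweighted mean: {unweighted:.1f} g")
print(f"weighted mean:   {weighted:.1f} g")
```

In this toy sample the unweighted mean is pulled down by the oversampled low-birth-weight cases, and the weighted mean is higher.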

Do I need to use the same weight for all analyses?

Researchers are encouraged to use the same weight throughout all analyses in a publication or paper, even when there is a different ideal weight for each analysis. Weights are assigned to cases with valid data for the component(s) contributing to the weight. Selecting different weights within the same publication or paper results in each analysis being run with a different analytic sample (i.e., the exact cases contributing to the analyses).

Can researchers produce state-level estimates?

No, the ECLS-B sample is designed to support national and regional estimates. It is not designed to estimate characteristics of children, families, and schools at or below the state level.

What statements can be made about participation in early child care and education programs, and its relation to cognitive, physical, and socioemotional development?

The ECLS-B data on child care and early education can be used to examine nonparental care and education experiences of the children born in 2001, both at a particular point in time (e.g., when they are 9 months old) and longitudinally. For example, it is possible to compare children’s nonparental care and early childhood education experiences before kindergarten to their experiences in before- and/or after-school programs and activities once they enter kindergarten.

All statements about nonparental child care and early education should be made in relation to the experiences of children born in the United States in 2001. The ECLS-B data cannot be used to make statements about nonparental child care and early education in the United States generally or about the population of providers. For example, the data cannot be used to generate estimates of the number of providers in the United States who provide different types of care and early education (i.e., center-based, relative care, or nonrelative care). Because early care and education providers were identified through their link to the ECLS-B sample of children born in 2001, as opposed to being identified through a random sample of providers from a universal list of providers, the sample of providers is not nationally representative.

To study the relationship between early care and education and children's development, it is important to have data both before children receive care and education from persons other than parents and after they begin receiving nonparental care and education. The ECLS-B captures information on children's participation longitudinally, making it possible to compare differences in key child development outcomes before and after experiencing nonparental care and education.

Which data file should I use if I want to analyze data from just one round of collection?

The longitudinal 9-month–kindergarten 2007 restricted-use data file can be used for any analysis, including analyses of a single round of collection. This longitudinal file contains data for all cases that ever participated in the study, including those that became nonrespondents at some point after the 9-month collection. (NOTE: The data file does not contain information about the originally sampled cases that never participated in the study.) This final longitudinal data file includes some important updates and corrections to errors discovered in previously released data; for this reason, researchers with previously released data files (e.g., the 9-month–2-year restricted-use file) are strongly encouraged to obtain the most recent release of the data.

What is the Bayley Short Form–Research Edition (BSF-R)? How does the BSF-R compare to the BSID-II? How does it compare to the BSID-III?

The Bayley Scales of Infant Development-Second Edition (BSID-II), which assesses young children’s cognitive and motor development, was too long and complex for administration by non-clinicians during the ECLS-B home visit. Consequently, with the publisher’s permission, the Bayley Short Form–Research Edition (BSF-R) was developed. The BSF-R comprises a subset of items from the BSID-II that can be used to estimate performance on the full BSID-II yet is feasible for non-clinicians to administer in the home. The subset of items selected to approximate children's performance on the full BSID-II was chosen using Item Response Theory (IRT) modeling. Children’s estimated BSID-II scores, derived from their performance on the BSF-R, are on the ECLS-B data file. (The item-level BSF-R data used to estimate the BSID-II are not on the file.)

Creating the BSF-R.

The items administered in the BSF-R were selected based on ease of administration and analysis of the item properties using IRT modeling. A two-parameter IRT model was used (item discrimination and item difficulty). BSID-II publisher data for the administration of BSID-II items in a standardization sample were obtained and all the items were scaled on one metric. Then items were chosen based on difficulty level and discrimination power. First, the item pool was reduced to those items that represented the constructs appropriate for the targeted age range at assessment, were at even intervals for difficulty level, and had a discrimination power of approximately 1. Then, within the reduced pool, items that were simple to administer and straightforward in scoring were chosen. Whenever possible, "twofers" were chosen: these are sets of items that can be scored from one administration (e.g., a child is given a cup and 5 blocks; items "puts one block in cup," "puts 3 blocks in cup," and "puts 5 blocks in cup" can all be scored). Also, items requiring the least amount of materials for administration were preferred.
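
For readers unfamiliar with two-parameter IRT models, the following is a minimal sketch of the standard two-parameter logistic form. It is an illustration only: the exact parameterization used for the BSF-R (for example, any scaling constant) is not stated here, and the function names and item parameters are hypothetical.

```python
import math


def prob_correct_2pl(theta, discrimination, difficulty):
    """Standard 2PL item response function.

    theta          -- the child's latent ability
    discrimination -- how sharply the item separates abilities (a)
    difficulty     -- ability at which P(correct) is 0.5 (b)
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


def expected_number_correct(theta, items):
    """Sum of item probabilities: the expected count of items passed."""
    return sum(prob_correct_2pl(theta, a, b) for a, b in items)


# Hypothetical items as (discrimination, difficulty) pairs, chosen to
# mimic "discrimination of approximately 1" and evenly spaced difficulty.
items = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]

for theta in (-1.0, 0.0, 1.0):
    print(theta, round(expected_number_correct(theta, items), 3))
```

Under this form, an item is passed with probability 0.5 when the child's ability equals the item's difficulty, and evenly spaced difficulties spread measurement precision across the ability range.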

Once the final set of items was determined, they were organized to approximate the BSID-II age sets. That is, the BSID-II groups items into age sets such that no one child received all the items. A child would begin with the items in his/her age set (e.g., a 9-month-old would begin with the 9-month age set, which has items appropriate for ages 8-11 months). If these items were too difficult, the assessor then administered the age set for younger children (e.g., the 8-month-old age set, or even the 7-month-old age set). Conversely, if the items were too easy, the child would be administered an age set for older children (e.g., the 10-month-old age set, or even the 11-month-old age set). To approximate the BSID-II age sets, the items chosen for the BSF-R were organized into a Core set (i.e., administered to all), a Basal set (i.e., administered to those who performed poorly on the Core set), and a Ceiling set (i.e., administered to those who performed perfectly or nearly perfectly on the Core set). In this way, determining when to administer the Basal or Ceiling set and which set to administer was straightforward and all children were appropriately challenged and assessed. The BSF-R diverges from the BSID-II primarily in its use of shortened Core, Basal, and Ceiling item sets.
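
The Core/Basal/Ceiling routing can be sketched as a simple decision rule. The actual cutoffs used in the ECLS-B are not given here, so the cutoff arguments and the function name below are hypothetical.

```python
def route_after_core(core_correct, core_total, low_cutoff, high_cutoff):
    """Decide which supplementary item set, if any, follows the Core set.

    low_cutoff and high_cutoff are HYPOTHETICAL thresholds on the number
    of Core items passed; the real ECLS-B rules are not reproduced here.
    """
    if not 0 <= core_correct <= core_total:
        raise ValueError("core_correct must be between 0 and core_total")
    if low_cutoff >= high_cutoff:
        raise ValueError("low_cutoff must be below high_cutoff")
    if core_correct <= low_cutoff:
        return "basal"      # performed poorly on the Core set
    if core_correct >= high_cutoff:
        return "ceiling"    # perfect or nearly perfect on the Core set
    return "core only"      # Core set alone was an adequate challenge


# Example with invented numbers: 10 Core items, cutoffs at 2 and 9.
for score in (1, 5, 10):
    print(score, route_after_core(score, 10, low_cutoff=2, high_cutoff=9))
```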

Lastly, the BSID-II uses a 30-item Behavior Rating Scale (BRS) to help interpret children's performance. Nine items were chosen from the BRS and included in the ECLS-B for this purpose. These items do not, however, approximate the full BRS.

Scores.

Children's performance on the BSF-R was used to estimate their performance on the BSID-II through the use of IRT modeling. The standard errors of these estimates can be found on the data file (e.g., X1MTLSE, X1MTRSE, X2MTLSE, X2MTRSE). Separate scores were produced for the mental and the motor scales. Three types of scores were generated and can be found on the ECLS-B data file. The scale score (i.e., X1RMTLS, X1RMTRS, X2MTLSCL, X2MTRSCL) represents the number of items a child would have gotten correct on the full BSID-II. It is a straight score and does not take into account prematurity.

Also on the file is the child's ranking relative to other children his/her age in the ECLS-B sample, correcting for prematurity. This ranking is similar to the Developmental Index scores on the BSID-II; it is a standardized score that can be used to compare groups of children. T-scores were used to standardize ECLS-B children's scale scores (i.e., X1MTLT, X1RMTR1, X2MTLTSC, X2MTRTSC). The T-scores have a mean of 50 and a standard deviation of 10. As mentioned above, these scores take into account premature birth. The child's chronological age at assessment was obtained by subtracting the child's birth date from the date of the assessment. In the case of children who were born at least 21 days (i.e., 3 weeks) early, the amount of prematurity (e.g., 4 weeks) was then subtracted from the child's age at assessment. In this way, children born prematurely were ranked relative to other children at the same developmental age (as opposed to chronological age).
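
The age adjustment and the T-score metric described above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, age is counted in days, and the reference mean and standard deviation passed to the T-score function are placeholders, since the way the ECLS-B formed its age-based comparison groups is not detailed here.

```python
from datetime import date


def adjusted_age_days(birth_date, assessment_date, days_premature):
    """Age at assessment in days, corrected for prematurity.

    The correction applies only to children born at least 21 days early.
    """
    chronological = (assessment_date - birth_date).days
    if days_premature >= 21:
        return chronological - days_premature
    return chronological


def t_score(value, reference_mean, reference_sd):
    """Rescale a score to a metric with mean 50 and SD 10."""
    if reference_sd <= 0:
        raise ValueError("reference_sd must be positive")
    return 50.0 + 10.0 * (value - reference_mean) / reference_sd


# Invented example: a child born 4 weeks (28 days) early.
born = date(2001, 1, 1)
assessed = date(2001, 10, 1)
print(adjusted_age_days(born, assessed, days_premature=28))
print(t_score(12.0, reference_mean=10.0, reference_sd=4.0))
```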

Lastly, the ECLS-B data file includes 20 proficiency probability scores: 10 from the Mental Scale and 10 from the Motor Scale. The individual proficiencies are described in the User's Manual. Each proficiency probability was generated from the child's estimated performance on 4 to 6 BSID-II items and represents the probability that the child has mastered the skill represented by that proficiency. Thus, the proficiency probability scores range from 0 to 1. Scores on a particular proficiency can be averaged across children to produce estimates of mastery rates within population subgroups.
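
Averaging a proficiency probability within subgroups might look like the sketch below. The records, group labels, and weights are invented, and the function name is hypothetical; applying a sampling weight to each child is an assumption consistent with the guidance on weights elsewhere in this FAQ.

```python
from collections import defaultdict


def mastery_rate_by_group(records):
    """Weighted mean proficiency probability per subgroup.

    records -- iterable of (group, probability, weight) tuples
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for group, probability, weight in records:
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
        weighted_sum[group] += probability * weight
        weight_total[group] += weight
    return {g: weighted_sum[g] / weight_total[g] for g in weighted_sum}


# Invented toy records: (subgroup, proficiency probability, weight).
toy = [("A", 0.9, 1.0), ("A", 0.5, 3.0), ("B", 0.2, 2.0)]
rates = mastery_rate_by_group(toy)
print(rates)
```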

BSF-R v. BSID-II.

The BSF-R is a subset of the BSID-II and can be equated with the BSID-II using IRT modeling.

  • Like the BSID-II, the BSF-R has a Mental Scale and a Motor Scale. While the BSID-II has 178 Mental items and 111 Motor items, the BSF-R has 29 Mental items at 9 months and 33 Mental items at 2 years, and 35 Motor items at 9 months and 32 Motor items at 2 years.
  • The BSID-II groups items in age sets; the BSF-R has a Core set of items that are administered to all children and supplementary Basal and Ceiling item sets that are administered if needed.
  • The BSID-II generates a raw score or "true" score that is then converted to a standardized score known as the Mental Developmental Index or the Motor Developmental Index (MDI). IRT modeling was used to estimate the BSID-II raw or true scores from the BSF-R. These estimated scale scores and their associated standard errors are on the ECLS-B data file. Additionally, T-scores are used to standardize the raw scores relative to the ECLS-B sample, taking into account prematurity.
  • The ECLS-B also provides proficiency probabilities based on the BSID-II raw scores.

Only BSID-II scores are on the ECLS-B data file; the item-level BSF-R scores used to estimate these BSID-II scores are not available on the file.

BSID-II v. BSID-III.

The BSID-III, published in October of 2005, differs from the BSID-II in that it assesses development in more than just the cognitive and motor domains. It also examines development in areas such as language and adaptive behavior and in the socioemotional domain. Additionally, the BSID-III is normed on a more recent population than the BSID-II; the BSID-III uses the 2000 census in stratifying children by age. More information about the BSID-III can be found at www.PsychCorp.com.

How do I access ECLS-B data?

Due to NCES's confidentiality legislation, ECLS-B case-level data are available only to qualified researchers who are granted a restricted-use data license. Information about applying for or amending a restricted-use data license can be found at http://nces.ed.gov/pubsearch/licenses.asp. When presenting analyses, preparing manuscripts, publishing ECLS-B results, or corresponding through email (including with NCES staff), analysts must comply with ECLS-B rounding rules. Specifically, unweighted sample sizes must be rounded to the nearest 50. For example, a cell size of 25 to 74 is rounded to 50, a cell size of 75 to 124 is rounded to 100, and a cell size of less than 25 is denoted by a symbol indicating "rounds to zero." Further, all presentations and manuscripts prepared using ECLS-B restricted-use data must be sent to the NCES Data Security Office (IESData.Security@ed.gov) for disclosure review prior to publication or presentation, as is required by the terms of the NCES restricted-use data license.
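
The rounding rule for unweighted sample sizes can be expressed as a short helper. This is a sketch only: the function name is hypothetical, and the "#" placeholder for "rounds to zero" is an assumption, since the specific symbol is not named above.

```python
ROUNDS_TO_ZERO = "#"  # ASSUMED placeholder symbol for "rounds to zero"


def round_sample_size(n):
    """Round an unweighted sample size to the nearest 50.

    25-74 -> 50, 75-124 -> 100, and so on; sizes below 25 return the
    "rounds to zero" placeholder instead of a number.
    """
    if n < 0:
        raise ValueError("sample size cannot be negative")
    if n < 25:
        return ROUNDS_TO_ZERO
    # Integer arithmetic sends the boundary values 25, 75, 125, ...
    # upward, matching the ranges in the rule.
    return ((n + 25) // 50) * 50


for n in (10, 25, 74, 75, 124, 1234):
    print(n, round_sample_size(n))
```

Integer arithmetic is used instead of Python's built-in round(), which rounds exact halves to the nearest even value and could therefore send a boundary value such as 25 the wrong way.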

A subset of variables from the ECLS-B 9-month data collection is available to the general public in the Data Analysis System (DAS). The DAS allows users to develop tables with weighted estimates from the ECLS-B but does not provide users with access to case-level data. For more information about the DAS, please click on Data Information.

I saved my taglist in the ECB and used the “Extract” function to save a file with the variables I tagged, but when I try to open the file in SPSS/SAS/Stata, I don’t see any data. What’s wrong?

The ECB does not create a data file. Rather, the ECB creates syntax code that must be run in a statistical software package to generate a data file. The syntax file reads in raw data from the ASCII data file (the file with a .dat extension). In the ECB, there are two “save” steps in the “Extract” procedure. The first step saves the syntax file. In the second step, there is no file that is actually saved. Instead, this step writes a line of code in the syntax file indicating what to name the data file once the syntax file is run and a data file is generated.

What is the difference between resident and nonresident fathers?

The resident father was identified during the parent interview as the person who resided in the household who was either the child’s biological/adoptive/step-/foster father or the person identified as the partner or spouse of the parent interview respondent (i.e., was the father figure or played an important role in the child’s life). As the partner or spouse of the parent respondent, the person completing the resident father questionnaire could be the child’s grandfather if the child’s grandmother was the parent respondent, or the male partner of the child’s mother.

For the ECLS-B, fathers identified as eligible for the nonresident questionnaire had to be the child’s biological father, could not reside in the household with the child, and had to meet one of the following criteria: (1) the father must have seen the child at least once in the last month; (2) the father must have seen the child at least 7 days in the last 3 months; or (3) the father must have been in touch with the child’s birth mother at least once a month in the 3 months preceding the parent interview. Contact was defined as a telephone call or an in-person visit. Additionally, the biological mother had to be the respondent for the parent interview and she had to give permission for the nonresident biological father to be contacted.
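
The eligibility conditions for the nonresident father questionnaire combine several required conditions with an "at least one of" contact test. A minimal sketch of that logic follows; the class, field, and function names are hypothetical and do not correspond to ECLS-B variable names.

```python
from dataclasses import dataclass


@dataclass
class FatherInfo:
    # All field names are HYPOTHETICAL, chosen for readability.
    is_biological_father: bool
    lives_with_child: bool
    saw_child_in_last_month: bool
    days_saw_child_last_3_months: int
    monthly_contact_with_birth_mother: bool  # 3 months before interview
    mother_is_parent_respondent: bool
    mother_gave_permission: bool


def eligible_for_nonresident_questionnaire(f):
    """Apply the required conditions, then the 'one of three' contact test."""
    if not f.is_biological_father or f.lives_with_child:
        return False
    if not (f.mother_is_parent_respondent and f.mother_gave_permission):
        return False
    return (
        f.saw_child_in_last_month
        or f.days_saw_child_last_3_months >= 7
        or f.monthly_contact_with_birth_mother
    )


example = FatherInfo(True, False, False, 8, False, True, True)
print(eligible_for_nonresident_questionnaire(example))
```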

Are all fathers surveyed in all data waves?

Both resident fathers and nonresident biological fathers were surveyed in the first two waves of data collection (at 9 months and 2 years). Only resident fathers were surveyed during the preschool data collection. No fathers were surveyed during the kindergarten collections unless they were the primary caregiver and responded to the parent interview during the home visit.

What is the difference between teachers, Early Care and Education Providers (ECEPs), and Wrap-Around Early Care and Education Providers (WECEPs)?

In the ECLS-B, ‘teachers’ refers to the educators who taught the ECLS-B children in kindergarten or higher. Early Care and Education Providers, referred to as ECEPs, provided child care and early education below the kindergarten level. They may have been teachers in a preschool, babysitters, family daycare providers, nannies, or relatives; in short, anyone who was not the child’s parent or guardian and regularly provided care and/or education prior to kindergarten for the study child. Wrap-Around Early Care and Education Providers, or WECEPs, were people who provided care and/or education for the ECLS-B children enrolled in kindergarten during the hours before and after school. Like ECEPs, they were a varied group and provided care and/or education in a variety of settings.

When were teachers and early care and education providers (ECEPs/WECEPs) surveyed?

Early care and education providers (ECEPs) were interviewed by telephone during the 2-year, preschool, and kindergarten 2006 data collections. In the 2-year collection, the interview was referred to as the Child Care Provider (CCP) interview. The name was changed to the Early Care and Education Provider (ECEP) interview at preschool to include the more educational settings children tended to be in as they got older (e.g., preschool, nursery school, public pre-kindergarten programs, etc.). Teachers and wrap-around early care and education providers (WECEPs) were surveyed in the two kindergarten data waves. In the first kindergarten collection in 2006, ECEP interviews were conducted by telephone for children who were not yet enrolled in kindergarten or higher and participated in regularly scheduled nonparental care and/or education (e.g., preschool, child care, etc.). Teachers of children who were enrolled in kindergarten or higher were mailed self-administered questionnaires to fill out and return. WECEP interviews were conducted by telephone for children enrolled in kindergarten who received regularly scheduled before- and/or after-school care. The WECEP phone interview was a modification of the ECEP phone interview, tailored to wrap-around settings and programs. While the teacher and WECEP components were included in the second kindergarten collection in 2007, the ECEP interview was not included because almost all of the ECLS-B children were in kindergarten or higher.

What supplemental data are available?

The ECLS-B currently has two supplemental datasets associated with it: the Twin Triad dataset and the Reading Aloud Profile – Together (RAPT) dataset.

The Twin Triad file consists of data from the 9-month collection for more than 50 twin pairs. The supplemental Twin Triad study is a modification of the 9-month teaching activity (The Nursing Child Assessment Teaching Scale, or NCATS). In the regular NCATS activity protocol, the parent was asked to interact with his/her child one-on-one and to teach the child one of a selected set of tasks that was slightly beyond the child’s current abilities (e.g., how to stack a set of blocks). All the twins in the sample completed the NCATS task individually with their caregiver (usually the mother). However, a subset of twin pairs and their mothers agreed to do the NCATS task a third time, in a triadic interaction such that the mother simultaneously attempted to teach both twins a new NCATS task (one not yet attempted). The Twin Triad file contains information on how the threesome interacted during the teaching task and thus sheds light on how the triad might interact naturally in their day-to-day lives (as triadic interactions may be more common for twins than the one-on-one interactions observed in the dyadic NCATS interaction). The Twin Triad dataset is a supplemental restricted-use dataset and is available to restricted-use license holders upon request to the IES Data Security Office (IESData.Security@ed.gov).

The Reading Aloud Profile – Together (RAPT) data provide detailed information about parents’ and children’s behaviors while engaged in the joint book reading activity of the preschool Two Bags Task. In the preschool Two Bags Task the parent and child were given 10 minutes to play with the contents of two bags. The first bag contained a book to read and the second bag contained toys with which to play. The RAPT data can be used to examine whether joint book reading behaviors of parents and children vary by family and child characteristics and whether joint book reading behaviors relate to children’s early reading competency at preschool and upon entry to kindergarten. The RAPT sample of approximately 800 cases is a random sample drawn from the larger ECLS-B sample, so all the oversamples are represented, though it may not be possible to study each oversample group individually using the RAPT data due to relatively small subgroup sample sizes. Although the RAPT data were initially released as a separate dataset, they are now included in the longitudinal 9-month–kindergarten 2007 restricted-use data file. For more information on the RAPT, see the Early Childhood Longitudinal Study, Birth Cohort (ECLS-B), Preschool–Kindergarten 2007 Psychometric Report (NCES 2010-009) (Najarian et al. 2010).

How do I determine if a case is in one of the supplemental samples?

In order to determine which twins are included in the Twin Triad dataset, one must request the Twin Triad dataset and merge it to the main data file using child ID. In order to determine if a case was included in the RAPT study, one may use any one of the Z3 variables. If a case has valid data for a Z3 variable, then it was included in the RAPT sample.
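
Both checks can be sketched as follows. Several things here are assumptions rather than documented facts: the column names ("child_id", "Z3_EXAMPLE") are hypothetical, and missing data are assumed to be stored as None or as negative reserve codes. Consult the user's manual for the actual variable names and missing-data codes.

```python
def has_valid_data(value):
    """ASSUMPTION: None or a negative reserve code means 'not valid'."""
    return value is not None and value >= 0


def flag_supplemental_samples(main_rows, twin_triad_ids, z3_name):
    """Return copies of the main-file rows with two membership flags.

    main_rows      -- list of dicts, one per child, from the main file
    twin_triad_ids -- set of child IDs found in the Twin Triad dataset
    z3_name        -- name of any one Z3 variable on the main file
    """
    flagged = []
    for row in main_rows:
        out = dict(row)
        out["in_twin_triad"] = row["child_id"] in twin_triad_ids
        out["in_rapt"] = has_valid_data(row.get(z3_name))
        flagged.append(out)
    return flagged


# Invented rows; "Z3_EXAMPLE" is a HYPOTHETICAL variable name.
main = [
    {"child_id": "001", "Z3_EXAMPLE": 3},
    {"child_id": "002", "Z3_EXAMPLE": -9},
    {"child_id": "003", "Z3_EXAMPLE": None},
]
result = flag_supplemental_samples(main, {"002"}, "Z3_EXAMPLE")
for row in result:
    print(row["child_id"], row["in_twin_triad"], row["in_rapt"])
```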

Can you apply the weights to the supplemental samples to get national estimates?

You cannot apply weights to the Twin Triad sample to obtain national estimates because the sample was not a random sample. Consequently, the findings can only be generalized to twins with caution. The RAPT sample, however, is a random sample of study children with Two Bags Task data. RAPT data can be weighted using a main sample weight adjusted by multiplying by the inverse of the probability of selection (for example, W3R0 * [8,900/800]).
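
The RAPT weight adjustment in the example above can be written out as a small calculation. The counts 8,900 and 800 are the figures from that example; the function and constant names are hypothetical.

```python
TWO_BAGS_CASES = 8900  # cases in the example's numerator
RAPT_CASES = 800       # approximate RAPT sample size from the example


def rapt_adjusted_weight(main_weight,
                         eligible_cases=TWO_BAGS_CASES,
                         sampled_cases=RAPT_CASES):
    """Multiply a main sample weight by the inverse selection probability."""
    if sampled_cases <= 0 or eligible_cases < sampled_cases:
        raise ValueError("need 0 < sampled_cases <= eligible_cases")
    selection_probability = sampled_cases / eligible_cases
    return main_weight / selection_probability


# Invented main-sample weight of 100 for one RAPT case.
print(rapt_adjusted_weight(100.0))
```

Each RAPT case therefore carries its main sample weight inflated by a factor of 8,900/800, so that the RAPT subsample stands in for the full set of cases from which it was drawn.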

What are the proficiency probabilities on the data file?

The ECLS-B data file includes 20 proficiency probability scores from the 9-month and 2-year direct child assessments: 10 from the Mental Scale and 10 from the Motor Scale. The individual proficiencies are described in the User's Manual. Each proficiency probability was generated from the child's estimated performance on 4 to 6 assessment items and represents the probability that the child has mastered the skill represented by that proficiency. Thus, the proficiency probability scores range from 0 to 1. The proficiency probabilities are informative in that they indicate where growth has occurred (i.e., in what skills), whereas gains in scale scores only indicate that growth has occurred. That is, two children may have both gained the same number of points on the BSF-R mental scale, for example, but in different places. The proficiency probabilities could show that the first child has mastered jabbering expressively, but the second child had already mastered that skill and is working on mastery of expressive vocabulary. Currently, the ECLS-B provides proficiency probabilities for the 9-month and 2-year BSF-R mental and motor assessments on the longitudinal 9-month–kindergarten 2007 restricted-use data file. Proficiency probabilities for the preschool and kindergarten cognitive assessments are not currently available.

Are children with disabilities sampled in ECLS-B?

Children with disabilities and special health needs are included in the ECLS-B sample. NCES and the Office of Special Education Programs (OSEP) conducted an evaluation of the feasibility of oversampling young children with disabilities and special health needs and determined that this could not be pursued. However, the ECLS-B oversampled twins and infants of low and very low birth weight. Low birth weight and short gestation (common among twins) are strongly associated with early developmental difficulties and special health needs. The ECLS-B aims to be as inclusive as possible and assessments are designed to include children of all levels of ability and skill. Additionally, information is gathered on early health and developmental disabilities and on the services children receive.

ECLS-K

What is the difference between restricted-use data and public-use data files?

Several modifications are made to the data on the public-use files in order to reduce the likelihood that any respondent could be identified in the data.
  • Outlier data (i.e., unusual or rare responses) are top- or bottom-coded on the public-use files. For example, the number of kindergarten teachers who did not have at least a bachelor’s degree was so small that such teachers are grouped in the same category as teachers who have a bachelor’s degree. Bottom and top coding prevents identification of schools, teachers, parents, or children who have unique characteristics without affecting overall data quality. Outlier data appear in their original form on the restricted files.
  • Certain variables with too few cases having valid data or a sparse distribution are suppressed in the public-use files (i.e., no data are reported for those variables) but are available in the restricted-use files.
  • Certain continuous variables are transformed into categorical variables, and certain categorical variables have their categories collapsed in the public-use file. This categorization and collapsing reduce disclosure risk, while still providing data with adequate variability that can be used in many different kinds of analyses, such as regression analysis. Data that are modified in this way on the public files appear in their original form on the restricted files.

Additionally, ECLS-K restricted-use files are cross-sectional; most ECLS-K public-use files are longitudinal.
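
As a rough illustration of the top-coding and category collapsing described above, the following Python sketch applies both to a short list of values. The cap of 30, the category boundaries, the labels, and the function names are all hypothetical; they are not the values or procedures NCES used.

```python
def top_code(value, cap):
    """Replace any value above `cap` with `cap` (top-coding)."""
    return min(value, cap)


def collapse(value, boundaries, labels):
    """Return the label of the first boundary that `value` does not exceed.

    `labels` has one more entry than `boundaries`; the last label covers
    every value above the highest boundary.
    """
    for upper, label in zip(boundaries, labels):
        if value <= upper:
            return label
    return labels[-1]


# Hypothetical values; the cap and boundaries are assumptions for illustration.
class_sizes = [12, 18, 25, 41]
top_coded = [top_code(size, 30) for size in class_sizes]
categories = [
    collapse(size, [15, 25], ["small", "medium", "large"])
    for size in class_sizes
]
print(top_coded)
print(categories)
```

In both cases the unusual value (41) can no longer be distinguished from other large values, which is the point of the modification.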

How will the difference in public-use files and restricted-use files impact analysts?

For most users, the public-use files provide all the data they will need for most analyses. Both the public- and restricted-use files provide data at the individual child level; for the kindergarten round, data are also provided at the teacher and school levels. Overall, few variables have been suppressed on the public files. (Information about which variables have been suppressed can be found in the data file user’s manuals. Additionally, all data for suppressed variables have a value of -2 in the data file.)
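
Because suppressed variables carry the value -2, that code should be set to missing before any statistics are computed. The Python sketch below shows one way to do so; the function names and example values are hypothetical, and other reserved codes documented in the user’s manuals may need the same treatment.

```python
SUPPRESSED = -2  # value the FAQ gives for suppressed variables


def to_missing(value):
    """Return None for the suppressed code so it is never used as a real value."""
    return None if value == SUPPRESSED else value


def mean_of_valid(values):
    """Average the values that remain once suppressed codes are set to missing."""
    valid = [v for v in map(to_missing, values) if v is not None]
    return sum(valid) / len(valid) if valid else None


# Hypothetical values for one variable; -2 marks suppressed data.
example = [3, -2, 5, 4, -2]
print(mean_of_valid(example))
```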

Some users may find that only the restricted files have the specific data they need. For example, those researchers examining certain groups of children whose representation in the population is relatively small, such as children in special education or children who speak a specific non-English language at home, or researchers interested in examining the types of kindergarten programs offered in schools, will find that the restricted files have more variables related to their topics of interest than the public files do. In many cases, however, even though the detailed information on the restricted-use files may be of interest, the sample sizes are too small for detailed analyses. NCES recommends that researchers examine the public-use files before requesting restricted-use data to verify whether their needs can be met using those data files.

The modifications used to reduce the likelihood that any respondent could be identified in the data do not affect the overall data quality.

Will you be collecting and releasing any more data?

At this time, NCES does not have plans to collect any more data from the students in the ECLS-K cohort or their families. The last round of data was released on the longitudinal kindergarten through eighth grade data file.

NCES is continuing its program of longitudinal studies of young children with the Early Childhood Longitudinal Study, Kindergarten Class of 2010-2011 (ECLS-K:2011). For more information on the ECLS-K:2011, please visit http://nces.ed.gov/ecls/myeclsk2011/index.asp.

How many children per school did ECLS-K sample?

On average, 23 kindergartners were sampled from each ECLS-K school. In some small schools and early childhood programs offering kindergarten programs, the number sampled was smaller. In some of these smaller schools, the entire population of kindergarten children was selected to participate in the ECLS-K. The average number of children per school decreases with each round of data collection as many children changed schools. For example, approximately one-quarter of children changed schools between kindergarten and first grade, and half of the children had changed schools at least once between kindergarten and third grade.

Did you sample whole classrooms?

No, children in each ECLS-K school were randomly sampled from a list of all kindergartners attending that school. During the design phase of the study, a number of different sample designs were considered and evaluated. The option of sampling entire classrooms was given strong consideration. In the end, this option was not adopted primarily because of the burden such a design would place on the teachers participating in the study and the loss of efficiency associated with an additional level of clustering.

Did the ECLS-K sample include children who were retained in kindergarten? Did it follow children if they were retained in later grades?

Yes, the ECLS-K sample included children who were repeating kindergarten. The base-year sample was composed of children who were kindergartners in the fall of 1998. Approximately 5 percent of these children were in their second year of kindergarten at that time. In addition, about 5 percent of the children who were first-time kindergartners in the fall of 1998 repeated kindergarten during the second year of the study (school year 1999-2000), when the majority of the sample was in first grade.

The ECLS-K did follow children who were retained in later grades. For instance, in the spring of 2002, when most children (89 percent) were in third grade, about 10 percent were in second grade and around 1 percent were in other grades (e.g., first or fourth grade). In the eighth-grade round of data collection (spring 2007), about 87 percent of the ECLS-K cohort was in eighth grade and about 13 percent was in a lower grade. Less than half a percent of the cohort was in a grade higher than eighth.

Were children with disabilities sampled for the ECLS-K?

Yes, children with disabilities were included in the sample for the ECLS-K, though disability status was not used as a sampling characteristic at the time of sampling (i.e., children with disabilities were not sampled at different rates than were children without disabilities). Many of the children in the sample were identified as needing and began receiving special education services over the life of the study. Thus, the sample of children receiving special education services increased in size between kindergarten and eighth grade.

How did the ECLS-K identify children with disabilities?

All children with disabilities who meet federal eligibility requirements are expected to participate in special education programs or receive special education services through the school. During data collection, ECLS-K project staff asked schools whether the sampled children had an Individualized Education Plan (IEP), an Individualized Family Service Plan (IFSP), or a 504 plan on file with the school district. Once children were identified as receiving special educational assistance due to a disability, field supervisors identified what accommodations, if any, needed to be made in order to administer the direct child assessment battery to them appropriately. The special education teachers of children with an IEP, IFSP, or 504 plan were asked to complete questionnaires about their background and the services provided to the children and their families. Additionally, parents were asked a series of questions about children’s health and disabilities in the parent interview.

What information was collected from the teachers and parents of disabled children?

Parents and teachers of children with disabilities were asked the same questions that parents and teachers of children without disabilities were asked. The parent and teacher instruments did contain additional items that asked about the services children with disabilities received. Also, a supplemental questionnaire was administered to the special education teacher of children who had an Individualized Education Plan (IEP). Copies of the parent and teacher instruments and the special education teacher questionnaires can be downloaded from the Instruments and Assessments page of the ECLS-K website.

Were children with limited English skills excluded from participating in the direct child assessment?

The ECLS-K took special steps to include as many children with limited English skills as possible in the direct child assessments while not assessing them unfairly. In kindergarten and first grade, children from homes where English was not the primary language were first administered the Oral Language Development Scale (OLDS), a subset of tests from the preLAS 2000 (an assessment of oral language proficiency in young children), to determine if they had sufficient English skills to meaningfully take part in the ECLS-K direct child assessment. In the fall of kindergarten, children whose performance on the OLDS indicated that they could not meaningfully participate in the main ECLS-K cognitive assessment battery (which was administered in English) and whose home language was Spanish were then administered the Spanish preLAS 2000, a translated version of the ECLS-K mathematics assessment, and a Spanish version of the psychomotor assessment. They also had their height and weight measured. Children whose performance on the OLDS indicated that they had sufficient English language skills to participate in the main ECLS-K battery were administered all of the ECLS-K assessments in English. Children who did not achieve a sufficient score on the OLDS and who spoke a language other than English or Spanish were not administered any cognitive assessments, but they did have their height and weight measured. These same general procedures were used in each round of data collection in kindergarten and first grade.

In the spring of kindergarten, fall of first grade, and spring of first grade, the English language proficiency of children who were not administered the English version of the ECLS-K assessment battery in the prior round was re-evaluated using the OLDS. Once a child passed the OLDS, he or she was administered the assessments in English for all subsequent rounds of data collection; that child’s English language proficiency was not reassessed with the OLDS. The OLDS was not administered in third, fifth, or eighth grade because most of the children in the sample by the spring of the first grade had demonstrated sufficient English proficiency to complete the main ECLS-K cognitive assessments in English.
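
The fall kindergarten routing described above can be summarized as a small decision function. The Python sketch below is only a restatement of that description; the function name, the argument names, and the component labels are hypothetical, and the actual OLDS cut score is not represented (the result is passed in as a simple true/false value).

```python
def route_child(home_language, passed_olds):
    """Return the components a child would receive in fall kindergarten.

    home_language: "English", "Spanish", or any other language name.
    passed_olds: True if the OLDS indicated sufficient English skills;
        ignored for children from English-speaking homes, who were not screened.
    """
    if home_language == "English" or passed_olds:
        return ["all ECLS-K assessments in English"]
    if home_language == "Spanish":
        return [
            "Spanish preLAS 2000",
            "translated mathematics assessment",
            "Spanish psychomotor assessment",
            "height and weight",
        ]
    # Other home languages without a sufficient OLDS score:
    # no cognitive assessments, physical measurements only.
    return ["height and weight"]


print(route_child("Spanish", False))
```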

Did the ECLS-K measure outcomes other than academic achievement?

Yes, the ECLS-K included measures of children's social skills, approaches to learning, and physical well-being. During the fall of kindergarten a psychomotor assessment was administered to gauge children's fine and gross motor skills. The main instrument for measuring children's social development was an adaptation of Gresham and Elliott's Social Skills Rating System (SSRS; its adaptation in the ECLS-K is the Social Rating Scale (SRS)), which was completed by teachers (in kindergarten, first, third, and fifth grades) and parents (in kindergarten and first grade only). The SRS item-level data and questionnaire items are available as restricted-use files (http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2010071).

In the third- and fifth-grade years, children provided information about themselves by completing a short self description questionnaire (SDQ). On the SDQ, children rated their perceptions of their competence and their interest in reading, mathematics, and school in general. They also rated their popularity with peers and competence in peer relationships and reported on any problem behaviors that they might exhibit. The third- and fifth-grade SDQ item-level data and questionnaire items are available publicly at http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2010070.

In eighth grade, a new version of the SDQ was developed using items from a published instrument designed to be used with adolescents (Self Description Questionnaire II; Marsh 1992). In addition, two scales from the eighth-grade student questionnaire, which were adapted from the National Education Longitudinal Study (NELS:88), tapped students’ self-concept and their perceptions of how much control they had over their own lives. Students completed self-administered paper and pencil questionnaires about their school experiences, their activities, their perceptions of themselves, and their weight, diet, and level of exercise. The students’ self-reported data from the eighth-grade data collection are available on the Kindergarten Through Eighth Grade Full Sample Public-Use Data File.

Does the ECLS-K provide information on children's participation in child care and early education programs?

Yes, the ECLS-K collected data on children's child care and early education program participation prior to kindergarten. It also collected data about children's participation in before- and after-school care and education during kindergarten, first, third, and fifth grades.

Can researchers produce state-level estimates with the ECLS-K data?

No, the ECLS-K sample was designed to support national and regional estimates. It was not designed to estimate characteristics of children, teachers, or schools at the state, county, or city level.

Can you use the ECLS-K data to produce estimates that are nationally representative of school and teacher characteristics?

Yes, but only the ECLS-K kindergarten data will support such estimates. The base-year (i.e., kindergarten) school sample is nationally representative of schools that educate kindergartners. A separate school-level data file with school weights is included on the longitudinal kindergarten through eighth grade data file. During the kindergarten year, the ECLS-K sampled all kindergarten teachers in each of the ECLS-K schools. Data from this nationally representative sample of kindergarten teachers also are available as a separate file with teacher weights on the longitudinal kindergarten through eighth grade file. The ECLS-K data do not, however, support teacher-level or school-level estimates in first, third, fifth, or eighth grades. After kindergarten, teachers and schools were only included in the study if they educated one or more ECLS-K children. Therefore, no teacher-level or school-level weights are provided after the base year.

Do I need to use weights for my analyses?

Weights are used to adjust for disproportionate sampling, survey nonresponse, and undercoverage of the target population when analyzing complex survey data. They also are necessary to produce national-level estimates of the ECLS-K cohort, of kindergarten teachers in 1998-99, and of schools educating kindergartners in 1998-99.

Do I need to use the same weight for all analyses?

Researchers are encouraged to use the same weight throughout all analyses in a publication or paper, even when there is a different ideal weight for each analysis. Weights are assigned to cases with valid data associated with the component(s) contributing to the weight. Selecting different weights within the same publication or paper results in each analysis being run with a different analytic sample (i.e., the exact cases contributing to the analyses).
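
To illustrate why the choice of weight defines the analytic sample, the Python sketch below computes a weighted mean using only the cases that have a positive value on the chosen weight. The field names "score" and "weight" are hypothetical, and the sketch assumes that cases without valid data for a component have a weight of zero or a missing weight. It produces a point estimate only and does not address standard errors for complex samples.

```python
def weighted_mean(records, value_key, weight_key):
    """Weighted mean over records with a positive weight and a non-missing value."""
    usable = [
        r for r in records
        if (r.get(weight_key) or 0) > 0 and r.get(value_key) is not None
    ]
    total_weight = sum(r[weight_key] for r in usable)
    if total_weight == 0:
        return None
    return sum(r[value_key] * r[weight_key] for r in usable) / total_weight


# Hypothetical cases; the second case has a zero weight and drops out.
records = [
    {"score": 10, "weight": 2.0},
    {"score": 20, "weight": 0.0},
    {"score": 30, "weight": 6.0},
]
print(weighted_mean(records, "score", "weight"))
```

Switching to a different weight would change which cases are usable, which is why mixing weights within one paper yields analyses based on different sets of cases.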

Who's included in the child, teacher, and school catalogs? Why are there teachers and schools in the child catalog, but there are no teachers and students in the school catalog? Why are there schools in the teacher catalog that are not in the school catalog?

In the base year, the ECLS-K was representative at three levels—kindergartners (i.e., the child level), kindergarten teachers, and schools educating kindergartners. The longitudinal kindergarten through eighth grade electronic codebook (ECB) contains a catalog pertaining to each of these levels. The child catalog is longitudinal and contains data for all children who participated in the kindergarten year, as well as data for all subsequent rounds of data collection. The teacher catalog is cross-sectional and contains information for the representative sample of kindergarten teachers only for the kindergarten round of data collection. The school catalog also is cross-sectional and contains information for the representative sample of schools educating kindergartners, again only for the kindergarten round of data collection.

Within the representative sample of schools educating kindergartners, kindergarten teachers were selected for the teacher sample, regardless of whether any children were sampled from their classrooms for the child sample. Thus, there are teachers in the teacher catalog who are not represented in the child catalog. Conversely, there are teachers in the child catalog who are not in the teacher catalog, because some children changed teachers during the kindergarten year and their new teachers were not part of the representative sample of teachers. Similarly, the school catalog contains only those schools that were sampled as part of the representative sample of schools educating kindergartners and that had a completed school administrator questionnaire. The school-level file does not contain those schools that children moved into but were not part of the initial representative sample of schools educating kindergartners. Data collected from children’s new schools were, however, included in the child-level file.

A user might see school IDs on the teacher file that are not on the school file. This is because, while these schools were part of the representative sample of schools with kindergartens, they did not have a completed school administrator questionnaire.

Thus, users interested in creating teacher/classroom-level files or school-level files based on presence of child data, regardless of whether they were part of the representative teacher or school sample, should use the child-level file.
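
One possible way to build a teacher-level list from the child-level file is to collect the distinct teacher IDs found on child records. The Python sketch below assumes each child record is a dictionary and uses the spring teacher ID variable T2_ID mentioned elsewhere in this FAQ; the function name and the example IDs are hypothetical.

```python
from collections import Counter


def children_per_teacher(child_records, teacher_key="T2_ID"):
    """Count sampled children linked to each teacher ID, skipping blank IDs."""
    ids = (record.get(teacher_key) for record in child_records)
    return Counter(teacher_id for teacher_id in ids if teacher_id)


# Hypothetical child records; the last child has no linked teacher.
children = [
    {"T2_ID": "A"},
    {"T2_ID": "A"},
    {"T2_ID": "B"},
    {"T2_ID": ""},
]
print(children_per_teacher(children))
```

A file built this way reflects only teachers linked to sampled children, so it is not the representative teacher sample and the teacher weights do not apply to it.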

How are the data collected in the fall and spring kindergarten teacher Part B questionnaires presented in the child and teacher catalogs?

In the fall of kindergarten (round 1), teachers were asked about their characteristics and the characteristics of their classroom in Teacher Questionnaire, Part B (TQB). Teachers who were added to the study in the spring of kindergarten (either in a school that joined the study in round 2 or for a child who had a new teacher in round 2) were administered a similar version of the TQB in the spring (round 2). Teachers who answered TQB in the fall did not complete the spring version, so for any one teacher there is only one set of TQB data.

In the base year teacher file all of the TQB items have the B1 prefix, regardless of the round in which the data were collected. Two flags indicate the round in which the data were collected. If the flag B1TQUEX equals 1, the data were collected in the fall; if B2TQUEX equals 1, the data were collected in the spring.

In the longitudinal child file there are two sets of TQB data, one set beginning with the B1 prefix and one set beginning with the B2 prefix. The B1 items pertain to the teacher linked to the child in the fall [with variable T1_ID] and the B2 items pertain to the teacher linked to the child in the spring [with variable T2_ID]. The majority of the children have the same teacher in the fall and spring, so for these children their case-level information for the B1 TQB items is identical to their case-level information for the B2 TQB items. For children who changed teachers during the year, these two sets of data are different because they come from different teachers [you can determine whether children changed teachers by comparing T1_ID with T2_ID or by looking at variable FKCHGTCH]. For cross-sectional analyses, analysts should use the TQB items from the time period for which they are doing analyses (i.e., B1 items for fall kindergarten; B2 items for spring kindergarten). For information that reflects the kindergarten experience as a whole, analysts might choose to limit their analysis to children whose teacher remained constant across the year.
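
The comparison of T1_ID with T2_ID can be sketched in Python as follows. The sketch assumes each child record is a dictionary keyed by variable name and treats a blank ID as missing; the function name and example IDs are hypothetical.

```python
def changed_teacher(child):
    """Compare fall and spring teacher IDs for one child.

    Returns True if the IDs differ, False if they match, and None if
    either ID is missing so the case can be reviewed separately.
    """
    fall, spring = child.get("T1_ID"), child.get("T2_ID")
    if not fall or not spring:
        return None
    return fall != spring


# Hypothetical records: keep only children whose teacher stayed the same.
children = [
    {"T1_ID": "0001T01", "T2_ID": "0001T01"},
    {"T1_ID": "0001T01", "T2_ID": "0001T02"},
    {"T1_ID": "", "T2_ID": "0001T02"},
]
same_teacher = [c for c in children if changed_teacher(c) is False]
print(len(same_teacher))
```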

Note:

  • The Spring Kindergarten TQB contains a subset of items asked in the Fall Kindergarten TQB. However, the variable structures for the B1 and B2 variables on the child-level data file are parallel; that is, all Fall Kindergarten TQB variables (B1 variables) have a corresponding Spring Kindergarten TQB variable (B2 variables) in the Electronic Codebook (ECB) and in the resulting data file.
  • Some questions asked of teachers in the Fall Kindergarten TQB were not asked of teachers new to the study in the Spring Kindergarten TQB. For children who had the same teacher in fall and spring, the information from the B1 variable pertaining to such a question is carried forward to the B2 variable. Children who have teachers who are new to the study in the spring (i.e., they are in schools that joined the study in the spring or they changed teachers between fall and spring) do not have information pertaining to these questions in the spring. Data for these measures for these children are coded either as system missing or not ascertained, depending on whether or not their teacher responded to the survey.
  • On the Fall Kindergarten TQB, question 1 asks teachers about time spent in whole class activities, small group activities, individual activities, and child selected activities. This question was not repeated on the Spring Kindergarten TQB, but was asked on the Spring Kindergarten TQA (question 8). Therefore, there are three sources of information from kindergarten on time spent in whole class activities, small group activities, individual activities, and child selected activities [Fall Kindergarten TQB: B1WHLCLS; B1SMLGRP; B1INDVDL; B1CHCLDS] [Spring Kindergarten TQA: A2WHLCLS; A2SMLGRP; A2INDVDL; A2CHCLDS] and [Spring Kindergarten TQB: B2WHLCLS; B2SMLGRP; B2INDVDL; B2CHCLDS]. Since this question did not appear in the Spring Kindergarten TQB, the B2 variables are simply the responses that teachers had provided on the fall TQB. The A2 variables present the information that best reflects the spring kindergarten time period.

Is it possible to compute the elapsed time period between two assessments?

It is possible to calculate the elapsed time between two direct assessments for a child using variables in the Electronic Codebook (ECB). For each direct assessment, there are corresponding variables that indicate the month, day, and year in which the direct assessment was administered. For instance, in round 1 the assessment date variables are: C1ASMTMM (C1 Assessment month), C1ASMTDD (C1 Assessment day), and C1ASMTYY (C1 Assessment year-4 digits). To calculate the elapsed time between two assessments for a child, one can use the assessment date variables from the two rounds of interest to determine the number of days between the two direct assessments.

The ECB also includes composite variables for children's age at assessment at each direct assessment time point (e.g., R1_KAGE for Round 1 Composite child assessment age, in months). These variables are based on the children's date of birth and the date on which they were assessed. In some cases, there are discrepancies in the age at assessment variables due to masking of variables for the public-use file or improvements in the date of birth variables collected in earlier rounds of data collection. Since this is the case, we recommend that users calculate the elapsed time between assessments using the method described above, rather than use the composite assessment age variables on the public-use data file.

Below are examples of SPSS and SAS code that can be used to calculate elapsed time between direct assessments:

SPSS Code:
COMMENT Calculate elapsed time between R1 and R2.
COMMENT Convert the three assessment date variables into a single date variable.
COMPUTE date1=DATE.DMY(c1asmtdd, c1asmtmm, c1asmtyy).
FORMATS date1(DATE11).
VARIABLE WIDTH date1(11).
EXECUTE.

COMPUTE date2=DATE.DMY(c2asmtdd, c2asmtmm, c2asmtyy).
FORMATS date2(DATE11).
VARIABLE WIDTH date2(11).
EXECUTE.

COMMENT calculate elapsed time between R1 and R2 assessments, in days.
COMPUTE elapse = (date2 - date1)/86400.
EXECUTE.

SAS Code:

/* EXAMPLE - Calculate elapsed time between R1 and R2*/
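data new_file;
set original_file;
/*
new_file and original_file are placeholder data set names; replace them with your own.
Because C1ASMTMM, C1ASMTDD, and C1ASMTYY are numeric, the MDY function alone converts them to a SAS date value. Subtracting one SAS date value from another gives the difference in days.
*/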
date1=mdy(C1ASMTMM , C1ASMTDD, C1ASMTYY);
date2=mdy(C2ASMTMM , C2ASMTDD, C2ASMTYY);
diff=abs(date2-date1);
run;
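
Python Code (illustrative sketch):

For analysts working outside SPSS and SAS, the same calculation can be sketched in Python using only the standard library. The sketch assumes each child record is a dictionary keyed by the assessment date variable names; the function names and the example dates are hypothetical. Unlike the SAS example, it returns a signed difference, and it returns no value when any date part is missing or invalid.

```python
from datetime import date


def assessment_date(month, day, year):
    """Build a date; return None when any part is missing or not a valid date."""
    try:
        return date(int(year), int(month), int(day))
    except (TypeError, ValueError):
        return None


def elapsed_days(child, first="C1", second="C2"):
    """Days from the `first` round assessment to the `second` round assessment."""
    d1 = assessment_date(
        child.get(first + "ASMTMM"), child.get(first + "ASMTDD"), child.get(first + "ASMTYY")
    )
    d2 = assessment_date(
        child.get(second + "ASMTMM"), child.get(second + "ASMTDD"), child.get(second + "ASMTYY")
    )
    if d1 is None or d2 is None:
        return None
    return (d2 - d1).days


# Hypothetical record with made-up assessment dates.
child = {
    "C1ASMTMM": 10, "C1ASMTDD": 15, "C1ASMTYY": 1998,
    "C2ASMTMM": 5, "C2ASMTDD": 3, "C2ASMTYY": 1999,
}
print(elapsed_days(child))
```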

I noticed that in the data file not all cases have data for the round 4 school administrator questionnaire (SAQ) variables. What do I do?

Schools that had already completed the school administrator questionnaire (SAQ) in round 2 were given a modified repeat school SAQ in round 4, whereas schools that were new to the study in round 4 were asked to complete an SAQ for new schools that collected more information than the modified SAQ (and was very similar to the SAQ used in round 2). The questions that were not in the repeat school SAQs (e.g., the grade levels included in the school, how many students the school site is designed to accommodate, what grades are tested with standardized tests) had already been asked in round 2, and variables associated with these questions are on the data file as S2 variables.

For those SAQ questions that were asked at round 2 but not at round 4 for repeat schools, a user can pull forward the data collected from round 2 to have a complete set of round 4 variables. For children who did not change schools between rounds 2 and 4, their round 2 child-level S2 variables can be used in analyses at round 4. Care should be taken to not replace updated information collected at round 4 with round 2 data (i.e., this procedure should only be used for data that were not collected at round 4). However, for children who changed schools between rounds 2 and 4 and moved into a repeat school, their own S2 data cannot be used at round 4. For these children, an analyst can use the child's round 4 school ID (S4_ID) to find another child who attended that school (i.e., had the same school ID) in round 2 and then pull that other child’s round 2 school data forward.
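
The pull-forward procedure can be sketched in Python as follows. The sketch assumes each child record is a dictionary, that a round 2 school ID variable named S2_ID exists alongside S4_ID, and that S2_EXAMPLE stands in for a real round 2 school variable; these names, the "PF_" prefix for the new variables, and the function name are hypothetical. The pulled-forward values are written to new variables so that nothing collected in round 4 is overwritten.

```python
def pull_forward(children, s2_vars):
    """Attach round 2 school data to each child based on the round 4 school ID."""
    # Index round 2 school data by the school each child attended in round 2.
    by_school = {}
    for child in children:
        school_id = child.get("S2_ID")
        if school_id and school_id not in by_school:
            by_school[school_id] = {var: child.get(var) for var in s2_vars}

    filled = []
    for child in children:
        # Find round 2 data for the school the child attends in round 4.
        donor = by_school.get(child.get("S4_ID"), {})
        record = dict(child)
        for var in s2_vars:
            record["PF_" + var] = donor.get(var)
        filled.append(record)
    return filled


# Hypothetical children: the second moved into school A, the third into a new school C.
children = [
    {"CHILDID": "1", "S2_ID": "A", "S4_ID": "A", "S2_EXAMPLE": 5},
    {"CHILDID": "2", "S2_ID": "B", "S4_ID": "A", "S2_EXAMPLE": 9},
    {"CHILDID": "3", "S2_ID": "B", "S4_ID": "C", "S2_EXAMPLE": 9},
]
result = pull_forward(children, ["S2_EXAMPLE"])
print([r["PF_S2_EXAMPLE"] for r in result])
```

A child whose round 4 school has no round 2 data from any sampled child is left without a pulled-forward value.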

When I try to download the online kindergarten through eighth grade data from http://nces.ed.gov/ecls/dataproducts.asp, I receive a message saying that the file is corrupted. What troubleshooting steps do you recommend?

The following are the steps that should be taken in order for an ASCII file to be created that contains all the data from kindergarten through eighth grade. (The procedures work the same for the smaller files with the base year school and teacher data.)
  1. Click on Childk8p.z01 from NCES's website. A prompt will appear that asks whether you want to "find," "save," or "cancel." Select "save" and save the file to the desired location on your computer. It may be helpful to create a specific folder in which to save all the files you will need to download.
  2. Repeat step 1 for Childk8p.z02, Childk8p.z03, Childk8p.z04, Childk8p.z05, and finally, the Childk8p.zip file, making sure to place the files in the same folder as that containing Childk8p.z01.
  3. Go to the location where you saved all your files and double-click the Childk8p.zip file.
  4. The WinZip dialogue box should open. Select the Childk8p.zip file and then select extract.
  5. A dialogue box may open that asks where you want WinZip to extract to. By default, it should be the folder in which you saved the files. If it is not, change the directory to the location where you would like the files to be located and select extract.
  6. The extraction process should begin, and after it is complete, you should see the childkp.dat file. This is the ASCII file with all the data in it.

The above-listed procedures for opening files do not seem to work with Microsoft Windows' default Extraction Wizard or with some free unzipping programs (other than WinZip). Many data users find that WinZip is needed to open the online data properly, although Extraction Wizard may be able to open the smaller zipped folders (e.g., the school and teacher data).

For computers with slower connection speeds, it is possible that large files such as these may become corrupted during the lengthy time it takes to download them. If your computer has a slower connection speed, try the above procedures using a computer with a faster connection.
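
Before extracting, it may help to confirm that every part of the split archive was saved to the same folder and is not empty. The Python sketch below checks for the file names listed in the steps above; the function name and the example folder path are hypothetical, and the check cannot detect a file that is present but damaged.

```python
from pathlib import Path

# File names taken from the download steps above.
EXPECTED_PARTS = [
    "Childk8p.z01",
    "Childk8p.z02",
    "Childk8p.z03",
    "Childk8p.z04",
    "Childk8p.z05",
    "Childk8p.zip",
]


def missing_parts(folder):
    """Return the expected file names that are absent or empty in `folder`."""
    folder = Path(folder)
    missing = []
    for name in EXPECTED_PARTS:
        part = folder / name
        if not part.is_file() or part.stat().st_size == 0:
            missing.append(name)
    return missing


# Hypothetical download folder; replace with the folder you saved the files to.
print(missing_parts("ecls_k_download"))
```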

Please keep in mind that the public-use data DVD with the electronic code book (ECB) is also available free of charge through EdPubs (http://www.edpubs.gov/), which distributes NCES products. Public-use data are also available through an on-line version of the ECB called the EDAT system. To access the EDAT system, please visit http://nces.ed.gov/edat/.

I saved my taglist in the ECB and used the “Extract” function to save a file with the variables I tagged, but when I try to open the file in SPSS/SAS/Stata, I don’t see any data. What’s wrong?

The ECB does not create a data file. Rather, the ECB creates syntax code that must be run in a statistical software package to generate a data file. The syntax file reads in raw data from the ASCII data file (the file with a .dat extension). In the ECB, there are two “save” steps in the “Extract” procedure. The first step saves the syntax file. In the second step, there is no file that is actually saved. Instead, this step writes a line of code in the syntax file indicating what to name the data file once the syntax file is run and a data file is generated.
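
To make the role of the syntax file more concrete, the Python sketch below shows the kind of work such syntax does: it slices each line of a fixed-width ASCII file into named variables according to column positions. The layout, the variable name EXAMPLE, and the sample line are hypothetical; the real column positions come from the syntax file that the ECB generates, which this sketch does not replace.

```python
def parse_fixed_width(line, layout):
    """Slice one fixed-width record into named fields.

    `layout` maps a variable name to (start, end) column positions,
    1-based and inclusive.
    """
    return {
        name: line[start - 1:end].strip()
        for name, (start, end) in layout.items()
    }


# Hypothetical layout and record, for illustration only.
layout = {"CHILDID": (1, 8), "EXAMPLE": (9, 10)}
record = parse_fixed_width("0001001C 5", layout)
print(record)
```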

What geocode data are available?

Zip code and geocode data are available only to analysts with a restricted-use license. Information about applying for a restricted-use data license can be found at http://nces.ed.gov/pubsearch/licenses.asp.

Census tract and zip code tabulation area (ZCTA) codes for ECLS-K children's homes and schools are available for each round of the ECLS-K up to third grade. These data are available on the CD “Census Data and Geocoded Location for the Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 (ECLS-K)” supplemental restricted-use file, available to restricted-use license holders upon request to the IES Data Security Office (IESData.Security@ed.gov). This file also has about 600 Census variables (or Census derived variables) for each Census tract and ZCTA including income, race/ethnicity, and many other sociodemographic characteristics of the people living in the tract or ZCTA. Supporting documentation is included on the CD and consists of a user’s manual, data file record layouts describing the variables on each of the ASCII data files, and code for converting the data files.

Home zip codes, school zip codes, and school FIPS codes (both state and county) are also available for first grade, third grade, fifth grade, and eighth grade on the cross-sectional restricted-use files for these grades. These data are identified by the variables RxHOMZIP, RxSCHZIP, RxFIPSST (state), and RxFIPSCT (county), respectively, where “x” refers to the data collection round number. Additional geographic information, such as home FIPS codes (P7FIPSST, P7FIPSCT) and tract and block data, is available at eighth grade. To match FIPS codes to the states' names, please visit http://www.census.gov/geo/www/ansi/statetables.html.

Please note that while the restricted data do identify children’s state of residence, the ECLS-K sample was not designed to support state-level (or city- or county-level) estimates, as the sample is not necessarily representative of children in particular states (or cities or counties).

What is the difference between restricted-use data and public-use data files?

Several types of modifications cause the public-use files to differ from the restricted-use files:
  • Outliers are top- or bottom-coded. This prevents identification of unique schools without affecting overall data quality.

  • Certain schools identified as at risk for disclosure have a small percentage of noise introduced into the variables that pose a disclosure risk. Again, this does not affect overall data quality.

  • Certain variables with too few cases and a sparse distribution are suppressed in the public-use files but are available in the restricted-use files.

  • Certain continuous variables are converted into categorical variables, and certain categorical variables have their categories collapsed in the public-use files. While this protects against disclosure risk, these variables can still be used in many kinds of analysis, such as regression analysis.
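To make two of the modifications listed above concrete, the minimal Python sketch below shows what top- and bottom-coding and the conversion of a continuous variable into categories look like in general. The function names, thresholds, cut points, and labels are all hypothetical; they illustrate the general techniques and are not the procedures or values NCES actually applied to the ECLS-K files.

```python
import bisect


def top_bottom_code(values, lower, upper):
    """Replace values below `lower` with `lower` and above `upper` with `upper`."""
    return [min(max(value, lower), upper) for value in values]


def collapse_to_category(value, cut_points, labels):
    """Assign a continuous value to a category.

    `cut_points` must be sorted ascending, and `labels` must have one more
    entry than `cut_points`. A value equal to a cut point falls in the
    higher category.
    """
    return labels[bisect.bisect_right(cut_points, value)]


if __name__ == "__main__":
    # Hypothetical household incomes and thresholds, for illustration only.
    incomes = [5_000, 42_000, 1_250_000]
    coded = top_bottom_code(incomes, lower=10_000, upper=200_000)
    print(coded)
    print([collapse_to_category(v, [25_000, 75_000], ["low", "middle", "high"])
           for v in coded])
```

In both cases the extreme or exact value is no longer recoverable from the public-use file, while the variable remains usable as a predictor or outcome.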

How will the differences between public-use files and restricted-use files affect analysts?

For most users, the public-use files provide all the data and variables their analyses require. Both the public- and restricted-use files provide data at the individual child, teacher, and school levels. However, some users may require the restricted-use files. For example, researchers examining certain rare subpopulations, such as children with disabilities or children with specific non-English home languages, and those interested in the type and number of hours of kindergarten programs offered in schools will find that the restricted-use files have a few more variables. Even where the detailed information on the restricted-use files is of interest, however, the sample sizes are often too small to support these analyses. The modifications used to avoid the identification of schools, teachers, and children do not affect overall data quality, and most researchers should be able to find all that they need in the public-use files. Overall, few variables have been suppressed. NCES recommends that users who are uncertain of their needs first examine the public-use files to check whether those files meet them.