
Education Longitudinal Study of 2002 (ELS:2002)



The ELS:2002 base year comprises two primary target populations—schools with 10th grades and 10th-grade students—in the spring term of the 2001–02 school year. There are two slightly different target populations for the first follow-up. One population consists of those students who were enrolled in the 10th grade in 2002. The other population consists of those students who were enrolled in the 12th grade in 2004. The former population includes students who dropped out of school between 10th and 12th grades, and such students are a major analytical subgroup. The target populations of the ELS:2002 second follow-up (2006) were the 2002 sophomore cohort and the 2004 senior cohort. The sophomore cohort consists of those students who were enrolled in the 10th grade in the spring of 2002, and the senior cohort comprises those students who were enrolled in the 12th grade in the spring of 2004. The sophomore cohort includes students who were in the 10th grade in 2002 but not in the 12th grade in 2004 (i.e., sophomore cohort members but not senior cohort members). The senior cohort includes students who were 12th-graders in 2004 but were not in the 10th grade in U.S. schools in 2002; they were included through a sample freshening process as part of the first follow-up activities. No additional sampling was performed for the third follow-up. The target populations for the third follow-up are the same as those in the first and second follow-ups; namely, those students who were enrolled in the 10th grade in 2002 and those students who were enrolled in the 12th grade in 2004.


The sample design for ELS:2002 is similar in many respects to the designs used in the three prior studies of the National Center for Education Statistics (NCES) Longitudinal Studies Program: the National Longitudinal Study of the High School Class of 1972 (NLS:72), the High School and Beyond (HS&B) longitudinal study, and the National Education Longitudinal Study of 1988 (NELS:88). ELS:2002 differs from NELS:88 in that the ELS:2002 base-year sample students are 10th-graders rather than 8th-graders. As in NELS:88, Hispanics and Asians were oversampled in ELS:2002. However, for ELS:2002, counts of Hispanics and Asians were obtained from the Common Core of Data (CCD) and the Private School Universe Survey (PSS) to set the initial oversampling rates.

ELS:2002 used a two-stage sample selection process. First, schools were selected with probability proportional to size, and school contacting resulted in 1,220 eligible public, Catholic, and other private schools from a population of approximately 27,000 schools containing 10th-grade students. Of the eligible schools, 752 participated in the study. These schools were then asked to provide 10th-grade enrollment lists. In the second stage of sample selection, approximately 26 students per school were selected from these lists.
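The first-stage selection described above (schools drawn with probability proportional to size, then a fixed take of roughly 26 students per school) can be sketched as follows. This is an illustrative simplification in Python, not the actual ELS:2002 sampling code; the function names, school sizes, and take sizes are assumptions.

```python
import itertools
import random

def pps_systematic(units, sizes, n, rng=random):
    """Select n units with probability proportional to size (PPS)
    via systematic sampling over the cumulative measure of size.
    A unit large enough to span several selection points can be
    drawn more than once (a certainty selection)."""
    total = sum(sizes)
    step = total / n                      # sampling interval
    start = rng.uniform(0, step)          # random start within first interval
    cum = list(itertools.accumulate(sizes))
    picks, j = [], 0
    for k in range(n):
        target = start + k * step
        while cum[j] <= target:           # advance to the unit covering target
            j += 1
        picks.append(units[j])
    return picks

def sample_students(roster, n=26, rng=random):
    """Second stage: an equal-take simple random sample of students
    from the school's enrollment list."""
    return rng.sample(roster, min(n, len(roster)))
```

Systematic PPS gives larger schools a proportionally higher chance of selection; the within-school student sampling rates (described under "Estimation Methods" below) are set inversely to that probability, so overall student selection probabilities stay roughly equal.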

Base-year survey. The ELS:2002 base-year sample design comprises two primary target populations—schools with 10th grades and sophomores in these schools—in the spring term of the 2001–02 school year. The base-year survey used a two-stage sample selection process. First, schools were selected. These schools were then asked to provide sophomore enrollment lists. 

The target population of schools for the ELS:2002 base year consisted of regular public schools, including state Department of Education schools and charter schools, and Catholic and other private schools that contained 10th grades and were in the United States (the 50 states and the District of Columbia). The sampling frame of schools was constructed with the intent to match the target population. However, selected schools were determined to be ineligible if they did not meet the definition of the target population. Responding schools were those schools that had a survey day (i.e., a day when data collection occurred for students in the school). Of the 1,270 sampled schools, there were 1,220 eligible schools and 752 responding schools (67.8 percent weighted response rate). School-level data reflect a school administrator questionnaire, a library media center questionnaire, a facilities checklist, and the aggregation of student data to the school level. School-level data, however, can also be reported at the student level and serve as contextual data for students.

The target population of students for the full-scale ELS:2002 consisted of spring-term sophomores in 2002 (excluding foreign exchange students) enrolled in schools in the school target population. The sampling frames of students within schools were constructed with the intent to match the target population. However, selected students were determined to be ineligible if they did not meet the definition of the target population. Of the 19,220 sampled students, there were 17,590 eligible students and 15,360 participants (87.3 percent weighted response rate). Student-level data consist of student questionnaire and assessment data and reports from students’ teachers and parents.


First follow-up survey. The basis for the sampling frame for the first follow-up was the sample of schools and students used in the ELS:2002 base-year sample. There are two slightly different target populations for the follow-up. One population consists of those students who were enrolled in the 10th grade in 2002. The other population consists of those students who were enrolled in the 12th grade in 2004. The former population includes students who dropped out of school between 10th and 12th grades, and such students are a major analytical subgroup. Note that in the first follow-up, a student who is defined as a member of the student sample is either an ELS:2002 spring 2002 sophomore or a freshened first follow-up spring 2004 12th-grader.

If a base-year school split into two or more schools, many of the ELS base-year sample members moved en masse to a new school, and they were followed to the destination school. These schools can be thought of as additional base-year schools in a new form. Specifically, a necessary condition of adding a new school in the first follow-up was that it arose from a situation such as the splitting of an original base-year school, thus resulting in a large transfer of base-year sample members (usually to one school, but potentially to more). Four base-year schools split, and five new schools were spawned from these four schools. At these new schools, as well as at the original base-year schools, students were tested and interviewed. Additionally, student freshening was done, and the administrator questionnaire was administered. 

Second follow-up survey. The target populations of the ELS:2002 second follow-up (2006) were the 2002 sophomore cohort and the 2004 senior cohort. The 2002 sophomore cohort consists of those students who were enrolled in the 10th grade in the spring of 2002, and the 2004 senior cohort comprises those students who were enrolled in the 12th grade in the spring of 2004. The sophomore cohort includes students enrolled in the 10th grade in 2002, but not in the 12th grade in 2004 (i.e., sophomore cohort members, but not senior cohort members). The senior cohort includes students enrolled in the 12th grade in 2004, but not in the 10th grade in 2002; they were included through a sample freshening process as part of the first follow-up activities.

The second follow-up fielded sample consisted of 16,430 sample members: 14,100 respondents for both the base year and the first follow-up; 1,200 first follow-up nonrespondents who were base-year respondents; 650 base-year nonrespondents who were subsampled in the first follow-up and responded in the first follow-up; 210 base-year or first follow-up questionnaire-incapable members; 170 freshened respondents in the first follow-up; and 100 base-year respondents who were determined to be out of scope in the first follow-up. Once fielded, some members of the sample of 16,430 were determined to be out of scope. There were 460 out-of-scope second follow-up sample members who fell into five basic groups: deceased, out of country, institutionalized/incarcerated, questionnaire incapable/incapacitated, or unavailable for the duration of the 2006 data collection. 

High school transcript study. Transcripts were collected for all sample members who participated in at least one of the first two student interviews: the base-year interview or the first follow-up interview. These sample members include base-year respondents who were first follow-up nonrespondents and base-year nonrespondents who were first follow-up respondents. Thus, sample members who were dropouts, freshened sample members, transfer students, homeschooled students, and early graduates are included if they were respondents in either of the first two student interviews. Transcripts were also requested for students who could not participate in either of the interviews because of a physical disability, a mental disability, or a language barrier.

Unlike previous NCES transcript studies, which collected transcripts only from the last school attended by sample members, the ELS:2002 transcript study collected transcripts from all base-year schools and the last school attended by sample members who transferred out of their base-year school. Incomplete records were obtained for sample members who had dropped out of school, had fallen behind the modal progression sequence, or were enrolled in a special education program requiring or allowing more than 12 years of schooling. Eighty-six percent of transcript respondents have 4 complete years of high school transcript information.  

Third follow-up survey. No additional sampling was performed for the third follow-up. The target populations for the third follow-up are the same as those in the first and second follow-ups; namely, those students who were enrolled in the 10th grade in 2002 and those students who were enrolled in the 12th grade in 2004. Eligible sample members who had responded in neither the first follow-up nor the second follow-up were not fielded for the third follow-up. A total of 16,176 sample members were fielded for the third follow-up.

Postsecondary transcript study. The basis for the ELS:2002 postsecondary transcript data collection is the same as the basis for the third follow-up, although the target population for the postsecondary transcript data collection corresponds to a specific subpopulation of the two overarching ELS student populations; namely, the students in the 10th grade in 2002 or the 12th grade in 2004 who had attended one or more postsecondary institutions since 2002 and who were alive as of the third follow-up. During the second and third follow-up surveys, respondents were asked to provide the name and location of each postsecondary institution that they had attended. Those institutions were subsequently contacted in 2013–14, and postsecondary transcripts were requested for each ELS:2002 sample member who reported attendance. The ELS:2002 postsecondary transcript data include 11,522 members of the sophomore cohort for whom at least one postsecondary transcript was collected. Overall, transcripts were obtained for 11,623 of 12,549 eligible sample members for a weighted response rate of 77 percent.


Data Collection and Processing

The base-year survey collected data from students, parents, teachers, librarians, and school administrators. Self-administered questionnaires and cognitive tests were the principal modes of data collection. Data collection took place primarily during in-school survey sessions conducted by a Research Triangle Institute (RTI) field interviewer or team. Base-year data were collected in the spring term of the 2001–02 school year. A total of 752 high schools participated, resulting in a weighted school response rate of 67.8 percent. A total of 15,360 students participated, primarily in in-school sessions, for an 87.3 percent weighted response rate. Each sampled student’s mathematics teacher and English teacher were given a questionnaire to complete. The weighted student-level coverage rate for teacher data was 91.6 percent (indicating receipt of a report from the math teacher, the English teacher, or both). School administrators and library media coordinators also completed a questionnaire (the weighted response rates were 98.5 percent and 95.9 percent, respectively). Questionnaires were mailed to parents, with telephone follow-up for nonrespondents.

Student coverage for parent questionnaires was 87.5 percent (weighted). Survey administrators (SAs) completed a facilities checklist at each school. For the first follow-up, overall, about 89 percent (weighted) of the total ELS:2002 sample (comprising both 2002 sophomores, surveyed 2 years later, and freshened 2004 seniors) was successfully surveyed, whether through completion of a student, transfer student, dropout, homeschool, or early graduate questionnaire. For the second follow-up, the sample represents a subset of the combined population of 10th-graders in the spring term of 2002 and 12th-graders in the spring term of 2004. Of the total sample, approximately 15,900 were considered eligible for the 2006 data collection, of whom 14,200 participated, resulting in an 88.4 percent weighted response rate. For the third follow-up, a weighted student response rate of 78 percent was achieved. For the postsecondary transcript study, a weighted response rate of 77 percent was achieved.

Reference dates. In the base-year survey, most questions referred to the students’ experience up to the time of the survey’s administration in spring 2002. In the follow-ups, most questions referred to experiences that occurred between the previous survey and the current survey. For example, the first follow-up largely covered the period between 2002 (when the base-year survey was conducted) and 2004 (when the first follow-up was conducted). 

Data collection. The base-year student data collection began in schools on January 21, 2002, and ended in schools in June 2002; telephone interviews with nonresponding students ended on August 4, 2002. Data collection from school administrators, library media center coordinators, and teachers ended in September 2002. The parent data collection ended on October 17, 2002. The first follow-up in-school data collection occurred between January and June 2004; out-of-school data collection took place between February and August 2004 and included telephone and in-person interviews. The second follow-up data collection was conducted from January to September 2006. To notify sample members about the start of data collection, all sample members and their parent(s) were sent a packet that included instructions for the web-based survey. The third follow-up data collection began in July 2012 and continued through February 2013. The postsecondary transcript data collection began in March 2013 and continued through early April 2014.

During the field test of the base-year study, endorsements were secured from organizations felt to be influential in the eyes of the various entities being asked to participate (school administrators, librarians, teachers, students, and parents). Before school recruitment could begin, it was necessary to obtain permission to contact the schools. The Chief State School Officers (CSSOs) of each state (as well as the District of Columbia) were contacted to approve the study for the state. Permission to proceed to the district level was obtained in all 50 states as well as the District of Columbia. Once state approval was obtained, an information package was sent to the District Superintendent of each district/diocese that had sampled schools in the state. Permission to proceed to the school level was received from 693 of the 829 districts/dioceses having eligible sampled schools (83.6 percent). This represented a total of 891 eligible schools with district/diocese permission to be contacted among 1,060 eligible schools affiliated with districts/dioceses (84.1 percent). For public and Catholic schools, school-level contact was begun as soon as district/diocese approval was obtained. For private non-Catholic schools, it was not necessary to wait for higher approval, though endorsements from various private school organizations were sought. The principal of each cooperating school designated a school coordinator to serve as a point of contact at the school and to be responsible for handling the logistical arrangements. The coordinator was asked to provide an enrollment list of 10th-grade students. For each student, the coordinator was asked to give information about sex, race, and ethnicity, and whether the student had an Individualized Education Program (IEP). Dates for a survey day and two make-up days were scheduled. At the same time, staff members were designated to receive the school administrator and library media center questionnaires. 
Parental consents were obtained. On the survey day at each school, the survey administrator (SA) checked in with the school coordinator and collected any parental permission forms that had come in.

For the base-year and first follow-up surveys, the SA and a survey administrator assistant (SAA) administered the student questionnaire and tests in a group administration. The SA and SAA graded the routing tests (see the “Cognitive test data” section) and edited the student questionnaires for completeness. Makeup sessions were scheduled for students who were unable to attend the first session. Computer-assisted telephone interviews (CATI) were conducted with students who were unable to participate in the group-administered sessions. The school administrator, teacher, library media center, and parent questionnaires were self-administered; individuals who did not return their questionnaires by mail within a reasonable amount of time were followed up by telephone. The facilities checklist was completed by the SA based on his/her observations in the building on the school’s survey day.

The first follow-up data collection required intensive tracing efforts to locate base-year sample members who, by 2004, were no longer in their 10th-grade schools, but had dispersed to many high schools. In the spring and again in the autumn of 2003, each base-year school was provided a list of ELS:2002 base-year sample members from their school. The school was asked to indicate whether each sample member was still enrolled at the school. For any sample member who was no longer enrolled, the school was asked to indicate the reason and date the student left. If the student had transferred to another school, the base-year school was asked to indicate the name and location of the transfer school. In the fall of 2003, each base-year school was also asked to provide a list of the 12th-graders enrolled at that school, so this information could be used in the freshening process. For students who had left their base-year school, the school was asked to provide contact information to allow for out-of-school data collection during the first follow-up survey period. Telephone data collection began in February 2004. Sample members identified for initial contact by the telephone unit included those no longer enrolled at the base-year school and those who attended base-year schools that did not grant permission to conduct an in-school survey session. Other cases were identified for telephone follow-up after the survey day and all makeup days had taken place at the school that the sample members attended. Some nonresponding sample members were assigned to SAs for field follow-up. A total of 797 sample members were interviewed in the field. An additional 80 field cases were completed either by mailed questionnaire or telephone interview and were withdrawn from the field assignment.

Data collection for the second follow-up was significantly redesigned to include survey modes and procedures that were completely independent of the in-school orientation of the first follow-up survey. An important aspect of the second follow-up data collection was that high schools were no longer involved in providing assistance with locating sample members. Tracing and sampling maintenance techniques included the following: batch tracing services for updated address information and telephone numbers; updated locating information obtained from student federal financial aid applications; direct contact with sample members and their parents via mail, telephone, or the Internet; intensive tracing efforts by centralized tracing specialists; intensive tracing efforts by field locating specialists in local areas; and tracing students through postsecondary schools applied to or attended, as specified in the 2004 interview. Also, incentive payments were offered to respondents to maximize their participation.

There were three survey modes in the second follow-up: a web-enabled self-administered questionnaire, CATI, and computer-assisted personal interviewing (CAPI). Data collection for the second follow-up began on January 25, 2006. For the first 4 weeks, only web and call-in data collection was made available to sample members. After the initial 4 weeks, outbound CATI data collection efforts were undertaken. The primary purpose of the CATI data collection was to complete telephone interviews with sample members when contacted or to set up an appointment to complete the interview. The CATI instrument was virtually identical to the web self-interview. (The only difference was that the CATI version provided an interviewer instruction on each screen to facilitate administration of each item.) CATI interviewers adhered to standardized interviewing techniques and other best practices in administering the interview. To reach sample members who had not yet participated by web or CATI, CAPI data collection commenced on April 17 (8 weeks after the start of outbound CATI calling). The approach for CAPI data collection followed the strategy used successfully in B&B:93/2003 and other recent NCES studies. This approach first identified geographic clusters according to the last known zip codes of sample members who could potentially be assigned to CAPI interviewing. Then, based on the distribution of cases by cluster, the clusters with the highest concentration of cases were staffed with one or more field interviewers. CAPI interviews were conducted on laptop computers via a web-based interface that used personal web server software. To maintain consistency across interviewing modes, the CAPI interview was identical to the CATI interview. CAPI interviewers were also allowed to administer the interview over the telephone, which produced conditions even more similar to CATI interviewing.

Several locating methods were used to find and collect up-to-date contact information for the ELS:2002 third follow-up sample. Batch searches of national databases and address update mailings to sample members and a parent were conducted prior to the start of data collection. Follow-up locating methods were employed for those sample members not found after the start of data collection, including computer-assisted telephone interview (CATI) locating, computer-assisted personal interview (CAPI) field tracing, and intensive tracing. Initial mailings began on July 3, 2012, with CATI production beginning on August 5, 2012, and abbreviated interviews for nonrespondents offered beginning on January 7, 2013.

Sample members were provided with a link to the ELS:2002 third follow-up website prior to the start of data collection. The website provided general information about the study, including the study sponsor and contractor, how the data are used, answers to frequently asked questions (FAQs), confidentiality assurances, and selected findings from earlier rounds of ELS:2002. The website also provided contact information for the study help desk and project staff at RTI, as well as a link to the NCES website. Sample members were able to log in to the secure website to provide updated contact information and complete the sample member interview once it became available. Designed according to NCES web policies, the study website used a three-tier security approach to protect all data collected. The first tier of security included secure logins, with a unique study ID and strong password provided to sample members. The second tier protected any data entered on the website with Secure Sockets Layer (SSL) technology, allowing only encrypted data to be transmitted over the Internet. The third tier stored any collected data in a secured SQL Server database located on a server machine that was physically separate from the web server. Sample members were also provided with a toll-free telephone number, which was answered by help desk agents. Help desk staff were available to sample members who had questions or technical issues related to completion of the web interview.


Data processing. Data processing activities were quite similar for the base-year survey and the first follow-up. An initial check of student documents for missing data was performed on-site by the SA and SAA staff so that data could be retrieved from the students before they left the classroom. If a student neglected to answer a questionnaire item deemed critical, the SA/SAA asked the student to complete it after the end of the second-stage test (see the “Cognitive test data” section).

All TELEform questionnaire scans were stored in a Structured Query Language (SQL) Server database. CATI data were exported nightly to ASCII files. Cleaning programs were designed to concatenate the CATI and TELEform SQL Server data into SAS datasets, adjusting and cleaning variables when formats were not consistent. Special attention was paid to this concatenation to verify that results stayed consistent and to rule out possible format problems. Once questionnaire data were concatenated and cleaned across modes and versions, the following cleaning and editing steps were implemented:

  • anomalous data cleaning based on a review of the data with the original questionnaire image;
  • rule-based cleaning (changes that were made based on patterns in the data rather than on a review of the images);
  • hard-coded edits based on changes recommended by a reviewer, if a respondent misunderstood the questionnaire (e.g., respondent was instructed to enter a percentage, but there was strong evidence that the respondent entered a count rather than a percentage); and
  • edits based on logical patterns in the questionnaire (e.g., skip pattern relationships between gate and dependent questions).

All respondent records in the final dataset were verified with the Survey Control System (SCS) to spot inconsistencies. Furthermore, the data files served as a check against the SCS to ensure that all respondent information was included in production reports.

Data processing activities for the second follow-up differed from those in the base-year survey and the first follow-up, because respondents could complete a self-administered web questionnaire as an alternative to the survey modes used in previous years. A database was developed in which case/item-specific issues were reviewed and new values were recorded for subsequent data cleaning and editing.

Many of the systems and processes used in the ELS:2002 third follow-up were designed during the first follow-up field test, with improvements implemented for the main study and later for the second follow-up. The following systems were developed for the first follow-up and employed and improved thereafter:

  • Integrated Management System (IMS)—a comprehensive tool used to exchange files between RTI and NCES, post daily production reports, and provide access to a centralized repository of project data and documents;
  • Survey Control System (SCS)—the central repository of the status of each activity for each case in the study;
  • Hatteras Survey Engine and Survey Editor—a web-based application used to develop and administer the ELS:2002 instrument;
  • CATI Case Management System (CMS)—a call scheduler and case delivery tracking system for telephone interviews;
  • Integrated Field Management System (IFMS)—a field reporting system to help field supervisors track the status of in-school data collection and field interviewing;
  • ELS:2002 survey website—a public website hosted at NCES and used to disseminate information, collect sample data, and administer the survey; and
  • data-cleaning programs—SAS programs developed to apply reserve code values where data are missing, clean up inconsistencies (caused by respondents backing up), and fill data where answers are known from previously answered items.

Editing. An application was developed in which case/item-specific issues were reviewed and new values were recorded for subsequent data cleaning and editing. Records were selected for review based on one of the following criteria: random selection, suspicious values found during frequency reviews, values out of expected ranges, interviewer remarks, and values not adhering to a particular skip pattern. The review application provided the case/item-level information, the reason for the review, and a link to the scanned image of the questionnaire. Reviewers determined scanning corrections, recommended changes (if respondents had misinterpreted the question), and reviewed items randomly to spot potential problems that would require more widespread review.

The application was built on a SQL Server database that contained all records for review and stored the recommended data changes. Editing programs built in SAS read the SQL Server database to obtain the edits and applied them to the questionnaire data. Questionnaire data were stored at multiple stages across the cleaning and editing programs, so the results at each stage could easily be checked against the recommended edits. Raw data were never directly updated; instead, changes were stored cumulatively and applied each time a cleaned dataset was produced. This process provided the ability to document all changes and to easily fix errors or reverse decisions upon further review.
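The never-update-raw-data pattern described here can be sketched in a few lines. This is a hypothetical illustration in Python (the actual programs were SAS, and the field names below are invented):

```python
import copy

def apply_edits(raw_records, edit_log):
    """Replay a cumulative edit log against untouched raw data to
    produce the cleaned dataset. Reversing a decision means dropping
    or superseding its log entry and replaying, so every change stays
    documented and the raw data are never modified in place."""
    cleaned = copy.deepcopy(raw_records)
    for case_id, item, new_value in edit_log:
        # Later log entries supersede earlier ones for the same item.
        cleaned[case_id][item] = new_value
    return cleaned
```

Because the cleaned file is always re-derived from the raw file plus the log, comparing any two stages of cleaning reduces to comparing the log entries applied between them.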

Editing programs also contained procedures that output inconsistent items across logical patterns within the questionnaire. For example, instructions to skip items could be based on previously answered questions; however, the respondent may not have followed the proper pattern based on the previous answers. These items were reviewed, and rules were written either to correct previously answered (or unanswered) questions to match the dependent items or blank out subsequent items to stay consistent with previously answered items.     
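A minimal sketch of one such gate/dependent rule follows. The item names and the reserve code are hypothetical; the actual ELS:2002 rules were written item by item after review:

```python
LEGIT_SKIP = -9  # hypothetical reserve code meaning "legitimately skipped"

def edit_skip_pattern(record, gate, dependents):
    """If the gate item says 'no' (0) but a dependent item holds
    substantive data, correct the gate to match the dependents;
    if the gate says 'no' and the dependents are empty, blank the
    dependents with the legitimate-skip reserve code."""
    if record[gate] == 0:
        if any(record[d] not in (None, LEGIT_SKIP) for d in dependents):
            record[gate] = 1              # dependents carry data: fix the gate
        else:
            for d in dependents:
                record[d] = LEGIT_SKIP    # consistent skip: mark dependents
    return record
```

The direction of the correction (trusting the dependents over the gate, or vice versa) is a per-item judgment call in practice; this sketch shows only one possible rule.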

Variables drawn directly from third follow-up questionnaire items were edited in three ways: (1) they were edited via the application of reserve codes; (2) they were edited by carrying forward known information from previously administered items/variables to downstream items/variables which were legitimately skipped during survey administration; and (3) they were edited to address inconsistent responses.


Estimation Methods

The general purpose of the weighting scheme was to compensate for unequal probabilities of selection of students into the base-year sample and freshened students into the first follow-up sample and to adjust for the fact that not all students selected into the sample actually participated.

Student level. Two sets of student weights were computed. One set of weights is for student questionnaire completion; this is the sole student weight that appears in the public-use file, and it generalizes to the population of spring 2002 sophomores who were capable of completing an ELS:2002 student questionnaire. A second set of weights, for the expanded sample of questionnaire-eligible and questionnaire-ineligible students, appears only in the restricted-use file. This weight sums to the total number of 10th-grade students.

First, the student-level design weight was calculated. The sample students were systematically selected from the enrollment lists at school-specific rates that were inversely proportional to the school’s probability of selection. Specifically, the sampling rate for a student stratum within a school was calculated as the overall sampling rate divided by the school’s probability of selection. To maintain control of the sample size and to accommodate in-school data collection, the sampling rates were adjusted, when necessary, so that no more than 35 students were selected per school. A minimum sample size of 10 students was also imposed if a school had more than 10 students in the 10th grade. Adjustments to the sampling rates were also made, as sampling progressed, to increase the sample size in student strata that were falling short of the sample size targets. The student sampling weight was then calculated as the reciprocal of the school-specific student sampling rate. The student nonresponse adjustment was performed using Generalized Exponential Models (GEMs) to compute the two student nonresponse adjustment factors. For data known for most, but not all, students, the data collected from responding students and weighted hot-deck imputation were used so that there would be data for all eligible sample students. An additional eight student weights were computed for the ELS:2002 Postsecondary Education Transcript Study (PETS) to incorporate data from both the 10th- and 12th-grade student populations, as the PETS sample contains students from both cohorts.
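The rate and design-weight arithmetic described above can be sketched as follows. This is an illustrative simplification: the function name is invented, and the handling of the 35-student cap and 10-student floor is an assumption about how the adjustments interact:

```python
def student_design_weight(overall_rate, school_prob, n_tenth_graders,
                          max_take=35, min_take=10):
    """Within-school sampling rate = overall rate / school selection
    probability; the expected take is then capped at 35 and floored at
    10 (where the school has that many 10th-graders), and the design
    weight is the reciprocal of the final rate."""
    rate = overall_rate / school_prob
    expected_take = rate * n_tenth_graders
    floor = min(min_take, n_tenth_graders)   # cannot take more than exist
    take = min(max(expected_take, floor), max_take)
    adjusted_rate = take / n_tenth_graders
    return 1.0 / adjusted_rate               # design weight
```

Because the school was itself selected with probability proportional to size, dividing by `school_prob` keeps overall student selection probabilities roughly equal before the cap and floor intervene.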

School level. School weights were computed in several steps. First, a school-level design weight equal to the reciprocal of the school's probability of selection was calculated; second, the design weight was adjusted to account for field-test sampling; third, the weight was adjusted to account for the probability of the school being released. Next, GEMs, which provide a unified approach to nonresponse adjustment, poststratification, and extreme weight reduction, were applied. For variables known for most, but not all, schools that would be useful in the nonresponse adjustment, weighted hot-deck imputation was used so that data would be available for all eligible sample schools.
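The first three steps can be sketched as a chain of adjustments applied to the design weight. Treating the field-test adjustment as a multiplicative factor and the release adjustment as the reciprocal of a release probability is an assumption made for illustration; the actual adjustment factors are computed in the weighting process.

```python
def school_weight(selection_prob, field_test_factor, release_prob):
    """School weight built in three steps (illustrative sketch):
    design weight, field-test sampling adjustment, release adjustment."""
    w = 1.0 / selection_prob       # step 1: reciprocal of selection probability
    w *= field_test_factor         # step 2: assumed multiplicative field-test factor
    w *= 1.0 / release_prob        # step 3: assumed reciprocal of release probability
    return w
```

A school selected with certainty that was not subsampled for the field test and was always released would simply carry a weight of 1 under this sketch.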

Six sets of weights, all at the student level, were computed for the third follow-up; no third follow-up school weights were created.

Scaling. Item Response Theory (IRT) was used to calibrate item parameters for all cognitive items administered to all students. This makes it possible to obtain scores on the same scale for students who took harder or easier forms of the test. IRT also permits vertical scaling of the two grade levels (10th grade in 2002 and 12th grade in 2004). A scale score estimating achievement level was assigned based on the pattern of right, wrong, and omitted responses on all items administered to an individual student. IRT postulates that the probability of correct responses to a set of test questions is a function of true proficiency and of one or more parameters specific to each test question. Rather than merely counting right and wrong responses, the IRT procedure also considers characteristics of each of the test items, such as their difficulty and the likelihood that they could be guessed correctly by low-ability individuals. IRT scores are less likely than simple number-right or formula scores to be distorted by correct guesses on difficult items if a student’s response vector also contains incorrect answers to easier questions.
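The item response function that IRT postulates can be illustrated with the standard three-parameter logistic (3PL) model, which includes exactly the item characteristics mentioned above: difficulty, discrimination, and the likelihood of a correct guess by low-ability examinees. The model form is standard; the parameter values used below are made up for illustration.

```python
import math

def p_correct_3pl(theta, a, b, c):
    """3PL IRT item response function: probability that an examinee with
    proficiency theta answers correctly, given item discrimination a,
    difficulty b, and lower asymptote (guessing parameter) c."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
```

Under this model, a low-proficiency examinee's chance on a hard item approaches the guessing floor c, while an examinee whose proficiency equals the item difficulty answers correctly with probability midway between c and 1. This is why a correct guess on a difficult item carries little weight when the same response vector contains wrong answers to easier items.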

Imputation. In the base-year study, after the editing process (which included logical imputations), the remaining missing values for 14 analysis variables and two ability estimates (reading and mathematics) were statistically imputed. In the first follow-up study, two new variables were selected for imputation: the spring 2004 student ability estimate for mathematics and the spring 2004 student enrollment status. These variables were chosen because they are key variables used in standard reporting and cross-sectional estimation. Most of the variables were imputed using a weighted hot-deck procedure. Additionally, multiple imputation was used for a few variables, including test scores. A set of 14 key analytic variables was identified for item imputation on data obtained from the ELS:2002 third follow-up member interview. These 14 variables include indicators of whether the respondent ever applied to or attended a postsecondary institution, highest level of education attained, and various employment indicators, such as whether the respondent has held a job for pay since high school and total job earnings. A weighted sequential hot-deck (WSHD) imputation procedure was used to impute the missing values for the ELS:2002 third follow-up data.
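The weighted hot-deck idea can be sketched as follows: within an imputation class, each missing value is filled with a value copied from an observed case (a donor), with donors drawn with probability proportional to their sampling weights. This is a simplified random-draw sketch, not the sequential WSHD algorithm used in the study.

```python
import random

def weighted_hot_deck(values, weights, seed=0):
    """Impute missing entries (None) by drawing a donor value from the
    observed cases, with probability proportional to the donor's weight
    (simplified sketch of weighted hot-deck imputation)."""
    rng = random.Random(seed)
    donors = [(v, w) for v, w in zip(values, weights) if v is not None]
    donor_vals = [v for v, _ in donors]
    donor_wts = [w for _, w in donors]
    return [v if v is not None
            else rng.choices(donor_vals, weights=donor_wts, k=1)[0]
            for v in values]
```

Observed values pass through unchanged; only the missing entries receive donor values, so the imputed distribution mirrors the weighted distribution of respondents in the same class.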