Listed below is a collection of papers and presentations (2011 to present) that summarize methodological research using NHES data.
Sampling and weighting
Declining survey response rates imply an increased need for post-survey adjustments to mitigate nonresponse bias. In this context, a key area of research is whether sources of auxiliary data not traditionally used for probability surveys can improve the effectiveness of weighting adjustments for nonresponse. The Pew Research Center (2018) found that, in the context of an online opt-in panel, weighting on indicators of political engagement and attitudes led to significant reductions in bias, relative to weighting only on traditional demographics. This paper applies similar auxiliary data to a nationally representative household survey, to examine whether weighting on political variables available from voter file vendors is similarly effective for a probability survey on non-political topics. This paper uses data from the National Household Education Survey (NHES), a large-scale (n ≈ 200,000) nationally representative study by the National Center for Education Statistics. The NHES uses an address-based sample (ABS), and NHES weighting models have historically incorporated demographic variables obtained directly from the frame vendor. A recent NHES sample was matched to a voter and commercial database maintained by a separate vendor, resulting in over 200 new auxiliary variables. Estimates obtained by weighting on both the new variables and the traditional demographics were compared to those obtained by weighting on demographics only, to assess the marginal bias reduction attributable to the new variables. Measures of variance inflation were also compared. Preliminary findings show that the additional auxiliary variables contribute to bias reduction without leading to substantial variance inflation, suggesting that political variables have value for weighting even non-political estimates from surveys that use traditional probability-based methods.
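One common form of weighting adjustment that can incorporate many auxiliary margins is raking (iterative proportional fitting). The routine below is a minimal, generic sketch of that technique, not the actual NHES weighting methodology; the sex/region factors and control totals in the example are invented for illustration.

```python
import numpy as np

def rake(base_weights, factors, targets, max_iter=100, tol=1e-8):
    """Iteratively adjust weights so that weighted category totals for each
    factor match the supplied control totals (raking / iterative
    proportional fitting)."""
    w = np.asarray(base_weights, dtype=float).copy()
    for _ in range(max_iter):
        max_dev = 0.0
        for factor, control in zip(factors, targets):
            for category, total in control.items():
                mask = factor == category
                current = w[mask].sum()
                if current > 0:
                    ratio = total / current
                    w[mask] *= ratio
                    max_dev = max(max_dev, abs(ratio - 1.0))
        if max_dev < tol:  # all margins matched to within tolerance
            break
    return w

# Toy example: rake four equal base weights to invented sex and region margins.
sex = np.array([0, 0, 1, 1])
region = np.array([0, 1, 0, 1])
adjusted = rake(np.ones(4), [sex, region], [{0: 6, 1: 4}, {0: 5, 1: 5}])
```

In practice, comparing estimates weighted on demographics only versus demographics plus the new auxiliary variables amounts to running an adjustment like this with two different sets of `factors` and comparing the resulting weighted estimates and weight variability.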
As response rates decline, it becomes challenging for survey researchers to maintain statistical precision for rare populations. For this reason, auxiliary data linkable to sampling frames are often used to oversample cases likely to be in the target population, to increase efficiency and precision. However, these “stratification” variables are often appended from commercial data sources that may not be accurate (Harter et al., 2016). Consequently, the sampling efficiency obtained from the use of these variables varies by survey and by variable (Roth et al., 2018). However, Dutwin (2018) demonstrated that machine learning techniques, such as random forests, could predict rare population membership better than single sampling frame indicators of group membership. This research will extend this work using the National Household Education Survey to determine whether predictive models developed using machine learning techniques can be used to efficiently oversample households with children (which comprise approximately 35% of U.S. households). Previously, we found that the vendor-appended indicator of the presence of children at an address shows moderate sensitivity but low specificity. Consequently, if households are oversampled based on this variable alone, the gain in the yield of households with children does not offset the increased design effect. We will apply random forests to a large set of address-level auxiliary data—including commercial, geographic, demographic, and voting-related data along with the indicator of the presence of children—to create a model-based flag for households likely to have children. We then simulate the effect of oversampling based on this flag on the effective sample size of households with children, adjusting for the design effect from unequal weighting.
From this, we evaluate whether machine learning techniques, applied to a broad range of auxiliary data, are useful to develop an indicator of a relatively rare population with sufficient sensitivity and specificity to permit efficient oversampling.
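The efficiency trade-off at issue here can be quantified with Kish's unequal-weighting design effect. The simulation below is a hypothetical sketch of that general approach, assuming a simple with-replacement draw; it is not the NHES simulation itself, and the flag/child arrays in the usage example are placeholders.

```python
import numpy as np

def kish_deff(weights):
    """Kish's design effect from unequal weighting: n * sum(w^2) / (sum w)^2."""
    w = np.asarray(weights, dtype=float)
    return len(w) * (w ** 2).sum() / w.sum() ** 2

def effective_child_sample(flag, has_child, oversample_rate, n_sample, seed=0):
    """Draw a sample that selects flagged addresses at `oversample_rate`
    times the base rate, then return the effective sample size of
    households with children after adjusting for the design effect due
    to unequal base weights."""
    rng = np.random.default_rng(seed)
    p = np.where(flag, oversample_rate, 1.0)
    p = p / p.sum()                       # selection probabilities
    idx = rng.choice(len(flag), size=n_sample, replace=True, p=p)
    w = 1.0 / p[idx]                      # base weight = inverse selection prob.
    child_w = w[has_child[idx]]           # weights for sampled child households
    return len(child_w) / kish_deff(child_w)

# Placeholder frame: a model-based flag that imperfectly marks child households.
flag = np.array([True, False, True, False] * 50)
has_child = np.array([True, False, False, False] * 50)
ess = effective_child_sample(flag, has_child, oversample_rate=2.0, n_sample=1000)
```

Comparing `ess` across candidate flags (vendor indicator vs. random-forest flag) is one way to judge whether a flag's sensitivity and specificity are sufficient for efficient oversampling.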
Given declining survey response rates, there has been increasing interest in identifying ways to make data collections more efficient, such as using targeted or responsive designs. Such designs rely on the availability of high-quality data that is predictive of outcomes of interest, such as eligibility or response (Buskirk, Malarek, & Bareham 2014; West et al. 2015). When conducting surveys that use address-based sampling (ABS) frames, such as the National Household Education Surveys (NHES), it is possible to append auxiliary data to the frame that can then be used to facilitate targeted or responsive designs. This is especially critical for mailed surveys such as the NHES that tend to lack rich paradata. However, studies evaluating the quality of auxiliary data find that it can suffer from high rates of missingness and that the available information may be of varied quality (Disogra, Dennis, & Fahmi 2010; Pasek et al. 2014).
Building on the efforts of Buskirk et al. (2014), West et al. (2015), and others, this presentation will report on the quality and utility of a newly acquired commercial data source that was appended to the NHES’s ABS frame. Prior frames only included about 15 address characteristic variables provided by the frame vendor, and those variables appear to be of limited utility for predicting NHES survey outcomes (Jackson, Steinley, & McPhee 2017). The new commercial data source is promising because it includes about 300 additional variables on topics ranging from voting history to commercial behavior. However, its potential use involves some challenges; for example, while the NHES is a survey of addresses, the commercial data consists of person-level records that the vendor matches to addresses, and which researchers then must translate to address-level characteristics prior to analysis.
This presentation will report on the steps taken to prepare this commercial data for use in developing a responsive design for NHES:2019, as well as assessing the quality of the data itself. First, we will report on the rate at which the commercial data vendor was able to match records to the NHES:2016 and NHES:2017 frames. We also will assess the quality of the person-level matches and discuss the procedures taken to generate address-level characteristics from person-level records. Next, we will report on the extent of missing data among matched cases, both at the address-level (what percent of variables are missing for this address?) and variable-level (what percent of addresses are missing data on this variable?). We also will report on the agreement rate between the commercial data and NHES survey responses (for example, when the commercial data suggests there is a child present, did respondent households also report having at least one child on the survey?). Finally, we will touch on whether incorporating this data increases the predictive power of models predicting key survey outcomes, such as response or eligibility.
The results of these analyses will help researchers gain insight into the feasibility of appending this type of commercial data to ABS survey data.
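The missingness and agreement metrics described above are straightforward to compute once the matched file is assembled. The sketch below assumes a pandas-style matched file; the column names and toy values are invented, not actual NHES or vendor variables.

```python
import pandas as pd

# Toy matched file: one row per sampled address; commercial variables are
# NaN where the vendor could not supply a value. Column names are
# hypothetical stand-ins.
df = pd.DataFrame({
    "has_child_commercial": [1.0, 0.0, None, 1.0],
    "home_owner":           [1.0, None, None, 0.0],
    "has_child_survey":     [1.0, 0.0, 1.0, 0.0],
})
commercial_vars = ["has_child_commercial", "home_owner"]

# Variable-level missingness: share of addresses missing each variable.
var_missing = df[commercial_vars].isna().mean()

# Address-level missingness: share of variables missing for each address.
addr_missing = df[commercial_vars].isna().mean(axis=1)

# Agreement rate, computed only where the commercial value is present.
present = df.dropna(subset=["has_child_commercial"])
agreement = (present["has_child_commercial"] == present["has_child_survey"]).mean()
```

Restricting the agreement calculation to non-missing commercial values, as above, keeps the missingness and accuracy questions distinct, which matters when deciding whether a variable is usable for responsive-design targeting.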
The National Adult Education and Training Survey (NATES) was a 2013 pilot study sponsored by the National Center for Education Statistics to evaluate whether high-quality data on adult education, training, and credentials for work could be collected using a mailed household survey. A unique aspect of the NATES pilot was its nonresponse follow-up study, in which a random subsample of nonrespondents to the mailed survey was selected for in-person interviews using a shortened questionnaire. This paper uses data from the nonresponse follow-up interviews to estimate the extent of unit nonresponse bias in key NATES estimates. Base-weighted estimates are compared to nonresponse-adjusted estimates in order to assess the efficacy of standard weighting class adjustments at mitigating the observed bias, using a set of commercially purchased household-level auxiliary variables appended to the NATES sampling frame. Associations between the auxiliary data and key NATES outcomes are observed in order to further assess the potential utility of commercial auxiliary data at correcting for nonresponse in future administrations.
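A standard weighting-class adjustment of the kind evaluated in this paper can be sketched as follows. The cell variable and data are illustrative only, not the actual NATES adjustment cells.

```python
import numpy as np
import pandas as pd

def weighting_class_adjust(df, weight_col, resp_col, class_cols):
    """Within each cell defined by `class_cols`, inflate respondents' base
    weights by (sum of all base weights) / (sum of respondent base weights),
    so respondents represent the nonrespondents in their cell."""
    out = df.copy()
    cell_total = out.groupby(class_cols)[weight_col].transform("sum")
    out["_rw"] = out[weight_col] * out[resp_col]
    resp_total = out.groupby(class_cols)["_rw"].transform("sum")
    out["adj_weight"] = np.where(
        out[resp_col] == 1, out[weight_col] * cell_total / resp_total, 0.0
    )
    return out.drop(columns="_rw")

# Toy example: one adjustment cell with two respondents and two nonrespondents.
frame = pd.DataFrame({
    "base_w": [1.0, 1.0, 1.0, 1.0],
    "resp":   [1, 1, 0, 0],
    "cell":   ["A", "A", "A", "A"],
})
adjusted = weighting_class_adjust(frame, "base_w", "resp", ["cell"])
```

The nonresponse bias analysis then compares base-weighted respondent estimates against estimates using `adj_weight`, with the follow-up interviews providing the benchmark.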
Address-based mail surveys have become increasingly popular in recent years. A challenge in conducting such surveys in the U.S. is determining which households are eligible for the survey, and then ensuring that the appropriate household member(s) complete it – all without the aid of the interviewer who would be present in other modes to help facilitate this process. A popular option for addressing this challenge is to first send a screener questionnaire to identify eligible households and the eligible individuals living in them; eligible households are later sent the full questionnaire and told which household member(s) should complete it. Drawbacks to such two-stage designs include the time and cost associated with sending the two stages and the potential for lower response rates due to challenges in getting households to respond to both stages. We report on efforts to combine the household and individual-level eligibility determination and topical survey data collection stages into a single stage. As part of a national pilot study of adults conducted in 2013, sampled households were mailed questionnaires that included a “screener” page in which households were asked to indicate the number of eligible individuals in the household. Instructions were included requesting that each of these eligible individuals complete one of the three survey questionnaires that had been mailed to the household. Households were randomly assigned to receive the questionnaires as one composite booklet or three individual booklets. This presentation will compare results from this single-stage design overall to the two-stage design typically employed by this survey on outcomes such as the response rate, screener responses, respondent demographics as compared to frame data and the Current Population Survey, responses to key questions, and response quality.
We will also report on differences between the composite- and individual-booklet conditions in terms of these same outcomes.
The National Household Education Survey (NHES), sponsored by the National Center for Education Statistics, recently experimented with methods for reaching and measuring rare populations in a household mail survey. This paper will describe these experiments and report the results. The rare populations of interest in the NHES are households with preschool and school-age children, Spanish-speaking households, and households with non-traditional family structures, e.g., grandparent guardians and same-sex parents. Households with children constitute 35% of households nationally. NHES used a two-stage mail survey to first screen households for children and then follow up with a longer topical survey for eligible households. NHES tested three screener types in a pilot survey and found a longer screener yielded a higher percentage of households with children but lower overall response than a 1-page screener. We are replicating this experiment in a large field test with an added condition measuring whether response differs when the child’s name is or is not requested. The pilot study also showed lower response using a bilingual form, but the sample size was small. Another field test experiment compares a bilingual form to a mailing containing both a Spanish and an English questionnaire for a random national sample and for a large, linguistically isolated sample (n=18,000). We also employed telephone screener nonresponse follow-up to attempt to capture linguistically isolated households that may struggle with the mail survey. For non-traditional family structures, we tested two versions of a series of questions about parent characteristics; one version is more traditional in its use of wording such as “mother” and “father”; the alternate version targets non-traditional parent structures using the gender- and relationship-neutral wording “parent 1” and “parent 2” instead of “mother” and “father.” Results from the field test will be available this summer and included in the paper.
Response quality and questionnaire design
Comparison to external data
Commercial databases offer a wide range of auxiliary variables that can be appended to survey samples and used in weighting and response propensity modeling. However, many of these databases are provided at the person level, which necessitates aggregating the data to the address level for application to household-level surveys. The utility of the resulting auxiliary variables will depend on the extent to which the raw data accurately capture the persons residing at the address. This study evaluates the agreement between person-level commercial data and survey reports for adults in an address, and the implications this might have for household-level analyses.
This study uses data from the 2016 and 2017 National Household Education Survey (NHES), an address-based sample of U.S. households. In the screening phase of the NHES, sampled addresses were asked to provide the name, gender, and age of all persons residing at the address. These addresses were matched to a commercial database containing person-level information such as voting and purchase behaviors. The nested structure—associating people with each address—facilitates the comparisons to the NHES screener data. We match the people on the commercial data with the people reported on the screener at the same address, by first name, date of birth, and gender. Each person listed on the commercial database and the NHES screener will be assigned a matching status indicating whether the person can be identified on both sources and the confidence in the match: exact match, fuzzy match, or no match. We explore the match rate overall and by person and household characteristics (e.g., person’s age, whether the household has children). As person-level records are aggregated for household-level analysis, we also will examine whether the removal of fuzzy matches or non-matches from the aggregation process makes a difference in the distribution of characteristics at the household level.
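A three-way match-status assignment of this kind can be sketched with a simple string-similarity rule. The fields, threshold, and matching logic below are illustrative assumptions, not the actual NHES matching specification.

```python
from difflib import SequenceMatcher

def match_status(commercial, screener, fuzzy_threshold=0.85):
    """Classify a commercial-data person against a screener-reported person
    at the same address. Each person is a (first_name, birth_year, gender)
    tuple; returns 'exact', 'fuzzy', or 'no'."""
    name_a = commercial[0].strip().lower()
    name_b = screener[0].strip().lower()
    # Require birth year and gender to agree before comparing names.
    if commercial[1] != screener[1] or commercial[2] != screener[2]:
        return "no"
    if name_a == name_b:
        return "exact"
    # Near-identical names (e.g., a nickname or misspelling) count as fuzzy.
    if SequenceMatcher(None, name_a, name_b).ratio() >= fuzzy_threshold:
        return "fuzzy"
    return "no"
```

Sensitivity analyses can then aggregate to the household level with and without the fuzzy-match cases to see whether household-level distributions shift.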
The percentage of public school students receiving English Language Learner (ELL) services increased over the past decade (Aud et al., 2013). As the U.S. population becomes more diverse, collecting accurate data on language use and participation in language programs is increasingly important. However, survey mode may influence whether accurate data on these topics are collected. A non-native English speaker may find it more difficult to complete a mail survey than a telephone interview because there is no person or computer-assisted program available to adjust the survey to a language the respondent speaks (Zukerberg and Han, 2010). This paper will examine the influence of survey mode on estimates of ELL students using data from the National Household Education Survey (NHES). These estimates will be compared to administrative data from the U.S. Department of Education’s Common Core of Data (CCD). The CCD identifies ELL students in public elementary and secondary schools.
The NHES Parent and Family Involvement in Education Survey (PFI) collects data on education programs and participation, including questions related to language spoken at home by both the child and the parent(s). In addition, parents are asked about their child’s participation in English as a Second Language (ESL) programs. The survey was fielded in 2003 and 2007 as a telephone survey, and 2012 as a mail survey. In 2003 and 2007, bilingual interviewers were available to conduct the interviews in English or Spanish. In 2012, English and Spanish screener packages were mailed in targeted areas.
For this analysis, we will compare changes in NHES estimates of the number of school-aged children enrolled in public school and participating in language programs to changes in CCD counts of ELL students between 2003, 2007, and 2012. We will also examine trends in child language use at home and parent-child concordance on language use.
A key component of successful survey research is data accuracy, which can be assessed by comparing self-reported data to administrative data. Inaccurate self-reported data may be a result of a lack of information on the part of the respondent. In many cases it is hard to determine the validity and reliability of self-reported data, because verifiable information is unavailable. With the growing debate over using school choice as an alternative to traditional schooling in the United States, there is a need to investigate to what extent parents understand the types of schools that their children attend, and the accuracy of self-reported data about school choice. This is particularly pertinent to education surveys that ask about school choice, as it is an important component of social and educational research.
Using the Parent and Family Involvement Survey (PFI) from the 2012 National Household Education Surveys Program (NHES:2012), we investigate agreement between parent-reported data about children’s school type (i.e., public, assigned; public, chosen; private, religious; and private, non-religious) and administrative data about school type from the Common Core of Data (CCD) and the Private School Universe Survey (PSS). NHES:2012 is a nationally representative survey of school-age children in the United States that asks questions related to school choice and parent involvement in education and links CCD and PSS data to the focal child.
The research questions we intend to explore are: 1) What percentage of parent-reported data differ from school-reported data from the CCD/PSS? 2) Do these differences vary by school, and student and family characteristics? Initial findings indicate that in approximately 5 percent of cases parent-reported data differ from CCD and PSS data across different school types. The implications regarding school choice and the accuracy of self-reported data compared to administrative data in education surveys will be addressed in this discussion.
Administrative records are not only useful for the data that they provide, they are also useful tools for developing and evaluating the measurement properties of new survey constructs and items. In 2010, a federal Interagency Working Group designed a pilot study evaluating new survey measures that could be used to estimate educational credentials not adequately or consistently measured in the past, despite their potential added value to education and labor market research.
The goal of the pilot was to develop and evaluate a parsimonious set of survey items that could be included in multiple federal surveys, such as the National Household Education Survey and the Survey of Income and Program Participation. Therefore, it was imperative that the study design allow for a careful evaluation of the measurement properties associated with these new measures of non-degree credentials (e.g., industry-recognized certifications and state and local government-issued licenses). The development effort culminated in the Adult Training and Education Survey Pilot Study—a household survey of non-institutionalized adults conducted by mail and telephone, using an address-based, nationally representative sample and a convenience, or “seeded,” sample of certification or license holders who were identified using administrative records, in order to develop and quantitatively evaluate new survey measures.
This paper describes the specific methods used, details how the methods were operationalized, presents the results, and describes how the results were used to refine the survey items and questionnaires. The study specifically examines the use of administrative records to (1) detect systematic underreporting by respondent and credential characteristics, and (2) identify credential commonalities—or groups of survey items that could be used in combination to identify instances of overreporting—that can be used in surveys where administrative records are not readily available.
Household structure, rostering, and enumeration
The household rostering stage of a survey administration determines participants’ eligibility and survey weighting; however, there is no agreement on the optimal number of people a survey should ask about. Battle et al. (2014) showed that in a mailed survey, asking for information about up to 10 household members during the rostering stage made little difference in unit response, compared to asking for information about up to 5 children only. This study evaluates the topic in a web survey, examining the extent to which the number of people asked about at the screener stage, and whether the screener asks about children only versus both children and adults, make a difference in the sample composition of children.
This study uses data from the 2016 and 2019 National Household Education Survey (NHES), an address-based sample of U.S. households. At the screening phase, sampled addresses were asked to provide the name, sex, and age of persons residing at the address. In 2016, the NHES screener asked for information about up to 10 household members at the address, including both children and adults. In 2019, the screening stage asked for information about only up to 5 children at the address. We will examine the number of children reported between the two years, and whether children of a specific age range were more likely to be reported in one year than the other, in addition to the completeness of the screener items. All the results will be compared between web and paper mode. Furthermore, there was a series of household member items at the topical phase of the NHES about the number and relations of persons living in the household. These reports at the topical stage will be compared to the number and type of persons reported at the screening stage. The results of the study will inform the screener design of household surveys targeted at children.
Screener listing, a way to obtain a list of household residents, is used for respondent sampling and eligibility determination. This presentation will compare two listing methods in the online administration of the 2017 National Household Education Surveys (NHES). The first method, person-by-person listing, asked all of the screener questions about a single household member before switching to the next household member. The second asked questions in characteristic-by-characteristic format, in which a single characteristic was noted for all household members before proceeding to the next characteristic. A randomized procedure was used to assign the listing method, resulting in half of the sample receiving each method. The presentation will address completion rates, response time, item missingness, household member characteristics, and possible reporting error, by screener listing method.
Preliminary results suggest the completion rates of the two screener designs are comparable, around 38% each. The second screener design, however, showed a minor but significant increase in screener response time and item missingness. Examination of household member characteristics did not show notable differences in reports of the number of household members, number of adults and children, enrollment type, and grade level, despite a slightly larger number of people aged 35 and older reported in the second screener design. Further investigation will compare response time by number of people in a household and by household composition to evaluate any time difference in recalling the household members. In the second screener design, a small number of respondents (approx. 30) were discovered who chose to back up and delete certain household members from the roster, though the characteristics of those members were preserved. These cases will be further examined to understand which members were deleted, and whether the design produced any errors.
Specification error, a type of non-sampling error, occurs when the concept intended to be measured is different from what is actually measured. For example, a screening question asking “how many adults live in this household” could have different interpretations. Respondents may count students away from home as living in the household when they should not count them. Unless the respondent can clearly interpret screening questions, the responses may bring in unintended participants for a follow-up survey. In the 2016 National Household Education Surveys (NHES:2016), a two-phase self-administered mail and web-based survey, the screener respondent enumerated the members of the household and indicated the school enrollment status of each member. Based on the response to the school enrollment question, households with school-aged children were sampled for one of two topical surveys: one for homeschooled students or one for students enrolled in public or private schools. In NHES:2016, due to concerns that prior administrations may have missed part of the homeschooled population, the order of the response options for the screening item was changed so that the “homeschool” option was presented before the “public or private school” option, possibly making it less likely that respondents would overlook it. However, preliminary findings from the NHES:2016 administration indicate that, unlike in prior administrations, the response rate was lower and breakoff rates were higher for the homeschool survey than for the enrolled-students survey. This suggests that respondents may have misunderstood the homeschool response option when it was placed before the “public or private school” option; thus, paradoxically, NHES:2016 may have sampled more households that do not have homeschooled children.
By reviewing responses to topical survey items related to the definition of a homeschool child, this analysis will examine whether specification error in the screening question resulted in households being incorrectly sampled for the homeschool survey.
Household enumeration and within household screening are important for identifying the appropriate individual to sample in household surveys, but sometimes there is reporting error around who lives in the household that can lead to coverage error. This paper will present results from a study evaluating inconsistent reporting in the household enumeration questions used in the National Household Education Survey (NHES). NHES enumerates households twice: 1) in the screener survey and 2) in the topical survey completed after screening. The same question is used in both surveys, yet in prior rounds of mail-based data collections about a quarter of households report a different total number of household members in the screener than in the topical survey. Usually, fewer household members are reported in the screener, which is problematic for within household screening. However, with up to a two-month lag between the screener and the topical surveys, real changes in the household composition could have occurred during that lag. In 2016, a web mode which allowed the screener and topical surveys to be completed in the same sitting was piloted, reducing the likelihood that discrepancies could be caused by true differences in household composition. This paper will compare the percentage of households where there was a discrepancy between the total household size reported in the screener and the topical survey between the 2016 web and mail modes. We will use a logistic regression to analyze predictors of inconsistent reporting such as household size, survey mode, time lag between screener and topical surveys, respondent consistency from screener to topical, whether the inconsistent reporting was for an adult or child in the household, and demographics of the respondent. The presentation will conclude with a set of recommendations that can be applied to NHES as well as other household enumerations.
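A logistic-regression analysis of discrepancy predictors can be sketched with a minimal Newton-Raphson fit. The predictor and data below are deterministic, invented stand-ins for the NHES variables (a web-mode indicator predicting a screener/topical household-size discrepancy), not actual results.

```python
import numpy as np

def logit_fit(X, y, n_iter=25):
    """Fit logistic regression by Newton-Raphson; returns coefficients
    [intercept, b1, ..., bk] for predicting P(y = 1)."""
    X = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                        # score vector
        hess = (X * (p * (1 - p))[:, None]).T @ X   # information matrix
        beta += np.linalg.solve(hess, grad)
    return beta

# Invented toy data: 100 mail (mode=0) households with 30 discrepancies,
# 100 web (mode=1) households with 10 discrepancies.
mode = np.repeat([0.0, 1.0], 100)
discrepancy = np.concatenate(
    [np.ones(30), np.zeros(70), np.ones(10), np.zeros(90)]
)
coefs = logit_fit(mode.reshape(-1, 1), discrepancy)
```

With a single binary predictor, the fitted slope equals the sample log odds ratio, so the code can be checked against a hand calculation before extending the design matrix to household size, time lag, and respondent demographics.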
Evaluating strategies for within household screening in household mail surveys is increasingly important as address-based mail surveys are used to replace RDD household telephone surveys. This study evaluates whether it is possible to effectively enumerate the entire household versus only target populations without affecting response rates and coverage of a particular sub-group. In household studies that seek to sample household members with certain characteristics, a specific household member must be identified and it is essential to enumerate all eligible individuals before selecting the sampled individual. However, researchers must balance the scope of the enumeration (coverage) against enumeration burden, which might affect response rates. Another potential issue is increased likelihood of measurement error in a more complex household roster. The National Center for Education Statistics (NCES) conducted a household enumeration experiment in 2013 to examine these potential issues.
In the experiment, a random sample of 500 addresses received a screener questionnaire that enumerated the entire household; another random 500 addresses received a screener that enumerated only children and youth ages 3 to 20. This child-only screener had previously been used by NCES in a 2012 survey. Expanding it to include all household members would allow NCES to use the sampled household for surveys other than child-focused surveys. We will compare the response rates achieved in the two experimental groups to each other and to the response rate achieved in the 2012 survey to determine whether there is any negative effect on overall response of requesting information about all persons in the household. We will also look at coverage rates of children between the two groups and the 2012 survey by comparing the number, age, and grade level of the children reported in each screener, and with population estimates.
One major challenge of questionnaire design is collecting information about familial relationships in self-administered questionnaires. Whereas data collections with interviewers allow for a dialogue between interviewer and respondent which can help clarify familial relationships, self-administered questionnaires rely on sound item development to capture the nuances of ever-diversifying family structures. The National Household Education Surveys Program (NHES), undergoing a redesign from RDD telephone survey to ABS mail survey, conducted methodological experiments including split panels of important questionnaire items in its 2011 Field Test. The survey collected information about the education, care, and household characteristics of a sampled child. Sections of the questionnaire that were designed to collect information about children’s parents (called “Mother/Female Guardian” and “Father/Male Guardian” sections) were adapted to “Parent 1” and “Parent 2” sections in order to accommodate a diversity of family structures. The marital status item was of particular importance because of the attempt to rewrite it such that same-sex parent households could report familial relationships more easily. This paper will compare the quantitative data on these two sections to assess which household characteristics section was more effective. Specifically, the paper will address: 1. How do the two sections compare in terms of descriptive statistics? Where are the differences in the data collected? 2. Comparing data to the ACS, which section gathered more reliable information about households? 3. Which marital status questionnaire item performed better? Which item identified more same-sex parent households? Which item identified more nontraditional household structures? 4. Did the change in questionnaire items introduce measurement error in the form of item nonresponse?
This poster will use data from 58 cognitive interviews to identify issues related to collecting information in a household survey about parents of a sampled child, with special attention to same-sex parents, other nontraditional household structures, and Spanish-speakers. Most research conducted on household enumeration has used roster data; this analysis is unique because it deals with capturing household adults’ relationships to the sampled child.
The National Household Education Survey (NHES), undergoing a redesign from an RDD telephone survey to an ABS mail survey, has conducted methodological experiments, including questionnaire redesign. Sections of the questionnaire that were designed to collect information about children’s parents (called “Mother/Female Guardian” and “Father/Male Guardian” sections) were adapted to “Parent 1” and “Parent 2” sections in order to accommodate a diversity of family structures. As part of the upcoming 2010 NHES Field Test (n = 60,000), a random sample of respondents will receive the revised questionnaires with the “Parent 1”/“Parent 2” sections. In the questionnaire redesign process, cognitive interviews were conducted with 58 parents, including 6 same-sex parents, 11 parents in other nontraditional households, and 9 Spanish-speaking parents. Lessons were learned about measuring familial relationships as they relate to a sampled child.
Descriptive statistics about nontraditional households, same-sex couple households, and Spanish-speakers from the RDD 2007 NHES will help illustrate dimensions of the challenges in enumerating parents. Recommendations for collecting information on parent relationships for a diversity of household structures will be presented.
Reporting of sensitive information has long been a concern of survey researchers. Much of the literature on sensitive questions compares data collection modes, finding that self-administered modes yield higher levels of reporting of sensitive information than interviewer-administered modes. This research shows that within self-administered surveys, reporting can vary widely depending upon the context of the survey.
The redesigned National Household Education Survey is a mail survey with two-stage sampling. The 2009 pilot study (n = 10,000) tested different versions of the first stage of the survey (the screener) by sending some respondents a short, stripped-down survey that asked only the bare essentials, while other respondents were sent a longer, more comprehensive survey designed to engage them in the purpose of the survey. A third group of respondents received a screener survey that was a compromise between the stripped-down approach and the engaging approach: a survey with the bare essentials plus some additional demographic questions. All versions of the screener asked for the names of all children in the household. However, to accommodate sensitivity concerns raised by cognitive interview respondents, the surveys also gave the option of providing children’s initials or nicknames.
Results indicate that 21.1 percent of respondents to the shortest version of the 2009 NHES pilot study screener used children’s initials or skipped children’s names entirely. However, respondents to the longest, engaging version of the screener used children’s initials or skipped children’s names entirely 38.3 percent of the time. Among the group that received the medium-length screener survey, 26.7 percent used children’s initials or did not provide children’s names.
This paper’s goal is to explore the differences in the response patterns for children’s names. We will compare the demographics of respondents across these patterns. The characteristics explored will include respondent age, education, income, and locale, as well as the child’s age and gender.
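The comparison of name-reporting behavior across the three screener versions described above lends itself to a standard chi-square test of independence. The sketch below uses hypothetical counts chosen only to roughly match the reported percentages; they are not actual NHES pilot data.

```python
# Sketch: chi-square test of independence for name-reporting behavior across
# the three screener versions. Counts are hypothetical placeholders, not NHES data.

def chi_square_stat(table):
    """Pearson chi-square statistic for a two-way contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    return stat

# Rows: screener versions (short, medium, long);
# columns: [full name provided, initials used or name skipped]
table = [
    [789, 211],  # short version, ~21.1% initials/skipped
    [733, 267],  # medium version, ~26.7%
    [617, 383],  # long version, ~38.3%
]
stat = chi_square_stat(table)
# The statistic is compared against a chi-square critical value with
# (3 - 1) * (2 - 1) = 2 degrees of freedom.
```

With real data, the row counts would come from the pilot study's three experimental groups, and a significant result would motivate the demographic comparisons described above.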
Language and translation
Multilingual and multicultural research is becoming increasingly relevant for survey researchers in the U.S. As the Hispanic population in the U.S. continues to grow both in size and as a percentage of the total population, the stakes of properly reaching and representing this population rise. Hispanic Americans are also increasingly diverse in country of origin, place of residence, and educational experience. The languages they speak and their experiences with U.S. institutions and cultural concepts may vary by numerous factors (e.g., generational status, length of time since immigration, and country of origin). This diversity presents new challenges in questionnaire design and translation.
The National Household Education Survey (NHES), used to produce national estimates on educational experiences, offers an example of these design and translation challenges. Using NHES data, this presentation aims to answer two main questions. First: Are the patterns observed in responses to the Spanish questionnaire also seen among Spanish-speaking respondents who complete the English questionnaire? Second: Among Spanish-speaking respondents, do response patterns vary across other indicators of English proficiency and immigration experience?
This presentation will combine response data from NHES:2019 with cognitive interview findings for NHES:2023 to compare response patterns and indicators of response error by Spanish and English proficiency, immigration status/experience, and questionnaire language. Preliminary findings suggest that translation attempts for some items may be highly prone to measurement error, potentially because certain educational terms and concepts (e.g., homeschooling) are not familiar to many monolingual Spanish respondents. Additional preliminary analysis suggests disproportionately high rates of missing and inconsistent responses across these items among Spanish-speaking respondents.
These findings will illustrate a mixed-method approach to identifying conceptually problematic items for translation and determining some underlying factors associated with Spanish-speaking respondents’ understanding of key concepts. The presentation will attempt to identify areas of improvement for survey design.
Survey methodologists continue to struggle with how to solve problems with skip patterns in mail questionnaires. Past experiments by Dillman et al. (1999) used college students to examine the effects of eight characteristics on errors of omission and errors of commission. Our interest lies in the impact of language and literacy on correctly following skip instructions. This paper investigates the role of respondent choice and possible knowledge of what is termed the “form culture” on errors of commission, using cognitive testing for the National Household Education Surveys (NHES): 2015. “Form culture” refers to the notion that individuals, starting from a young age, are required to fill out extensive amounts of paperwork (including testing, taxes, certifications, and medical forms) and are expected to complete this paperwork without assistance. We contend that individuals most unfamiliar with the “form culture” will struggle the most with skip patterns and written questionnaires in general. In this case, those most unfamiliar with the “form culture” are those in the process of acculturation and adaptation to the U.S. model of forms and information. During the NHES: 2015 cognitive testing, we interviewed ninety-seven Spanish-dominant speakers and found that forty-five of the respondents missed at least one skip instruction in the questionnaire. During probing we explored the reasons why respondents missed the skips. Some respondents reported noticing the skip instructions but choosing to ignore them. Upon further probing we found that respondents chose to continue answering questions because they feared that they would otherwise not provide all the necessary information. Results of this research provide a greater understanding of these patterns of respondent behavior, improving our ability to design questionnaires that meet the needs of these and other respondents.
Obtaining accurate data from a paper questionnaire depends heavily on respondents following skip instructions correctly. Wang and Sha (2010) found that there were significantly more skip errors for survey forms completed in Spanish than for forms completed in English, and that the increase in the error rate for Spanish-language forms was higher among respondents who had not completed high school. Nonetheless, researchers have found mixed results when examining the role of respondent education and its impact on rates of missing data (McBride and Cantor 2010). Thus, it is important to assess what other factors might exacerbate or mitigate the impact of survey language on skip error rates. In the present analysis, we investigate the role of socioeconomic status (SES) on skip error rates using data from the 2012 National Household Education Surveys (NHES: 2012). The NHES: 2012 was a primarily paper-based two-phase survey. During the first phase, a nationally representative sample of households was selected to fill out a screener listing all members under the age of 20. The second phase entailed administering one of two surveys based on the age of the selected child. Households in the Hispanic sampling stratum (constructed from Census tracts with more than 40 percent Hispanic persons) received both English and Spanish screeners. Households that returned a Spanish screener received a Spanish topical questionnaire in the second phase. Limiting the analysis to households that returned both the screener and topical questionnaires, a composite measure of SES will be constructed from indicators such as household income and education level. Skip error rates will be analyzed separately by questionnaire language and SES to assess whether respondent SES influences the magnitude of any language effect. Findings will help improve questionnaire design that responds to the needs of various target populations.
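The planned SES analysis can be sketched in a few lines. The composite score, the 0.5 cutoff, and all field names and records below are hypothetical illustrations, not the NHES: 2012 variables or the final index definition.

```python
# Sketch: composite SES score plus skip error rates tabulated by questionnaire
# language and SES group. All records and scales here are hypothetical.
from statistics import mean

def ses_score(income_level, education_level, max_income=5, max_edu=4):
    """Composite SES: average of income and education, each rescaled to 0-1."""
    return mean([income_level / max_income, education_level / max_edu])

def skip_error_rate(cases):
    """Share of cases with at least one skip error."""
    return sum(1 for c in cases if c["skip_errors"] > 0) / len(cases)

# Hypothetical returned-questionnaire records (not NHES data)
cases = [
    {"language": "es", "income_level": 1, "education_level": 1, "skip_errors": 2},
    {"language": "es", "income_level": 4, "education_level": 3, "skip_errors": 0},
    {"language": "en", "income_level": 2, "education_level": 2, "skip_errors": 0},
    {"language": "en", "income_level": 5, "education_level": 4, "skip_errors": 1},
]

# Skip error rate by language and SES group (the 0.5 cutoff is illustrative)
rates = {}
for lang in ("es", "en"):
    for label, keep in (("low", lambda s: s < 0.5), ("high", lambda s: s >= 0.5)):
        group = [c for c in cases
                 if c["language"] == lang
                 and keep(ses_score(c["income_level"], c["education_level"]))]
        rates[(lang, label)] = skip_error_rate(group) if group else None
```

In practice the composite would likely be built with more indicators and an empirically chosen grouping (e.g., terciles), and the language-by-SES comparison would be tested formally rather than eyeballed.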
Previous work on the translation of educational attainment has shown the challenges of translating concepts related to country-specific programs (Fernández, Goerman, & Quiroz 2012; Schoua-Glusberg et al. 2008). In this paper I present the results of two iterative rounds of cognitive testing of a series of National Household Education Surveys (NHES) educational questions with monolingual Spanish-speaking respondents from different countries living in the United States. The focus of the cognitive testing was on the validity of terminology such as charter schools, homeschooling, and after-school programs, with a goal of improving concept validity in Spanish. In the first round of testing, the study team tested the questions using a translation that was as close as possible to the English original; in other words, the translation was a literal translation of the original English. In the second round of testing, the translations were revised for comprehension and re-tested. Some of these revisions replaced direct translations with renderings equivalent to the meanings of the English concepts; for example, “after-school programs” was initially translated as “programas extracurriculares” and was re-translated in the second round as “actividades después de la escuela” after respondents proved unfamiliar with, and confused by, the term “extracurriculares.” This paper focuses on the results from the two rounds of cognitive testing, in which we found that direct translations are less successful at getting reliable results and that terminology often requires an explanation in order for terms to make sense in the surveys. As an increasing number of surveys seek to collect data in several languages, one of the lessons learned is the importance of translations that collect reliable and valid data.
The National Household Education Survey (NHES) collects data on the educational experiences of children in the United States. In the 2019 administration of the NHES, a split panel experiment was conducted on the order of the response options for three multiple-select questions. These questions ask about the reasons for choosing virtual education and homeschooling among K-12 students who are homeschooled or enrolled in school. The goal of the experiment was to determine the impact of the order of the response options on the percent of respondents who chose each answer. There was one control group and six treatment groups. Each group had a different order for the response options on the reasons for choosing the school type. Initial analysis conducted by the Census Bureau showed that for all three questions, the order of the response options did not have a statistically significant impact on the response distribution.
This presentation examines the order of the response options and looks deeper into the characteristics of those who received the experiment and how they responded. One of the NHES questions analyzed in the experiment asks about the reasons that the child, who is not homeschooled, is enrolled in online, virtual, or cyber courses. Rather than analyzing the data as a whole, this presentation further analyzes the response data across various demographics, such as child’s race/ethnicity, locale, parent’s employment, and family income. The objective is to see how changing the order of the response options impacts the response distribution for each demographic and whether the responses chosen were measurably different across respondents of different demographic groups. The same treatment groups will be used here when assessing the order of the responses. The results of this study will allow survey methodologists to further understand the future of virtual learning and ways to reduce measurement error.
Many studies have evaluated the utility of commercial data for weighting and response propensity modeling, but little research has examined imputation of person- or household-level characteristics. This research explores the feasibility of imputing person- and household-level characteristics by using person-level commercial data. Two research questions are asked in this study: (1) if the same person can be found in both data sources, is it possible to substitute the missing responses directly with the information available in the commercial data; and (2) if we cannot find the same person across the two sources, can we supplement the imputation process with commercial data for the variables imputed through population characteristics?
This study will use data from the 2016 and 2017 Adult Training and Education Survey (ATES), an address-based sample of U.S. adults. In the screening phase of the survey administration, sampled households are asked to provide information about all persons residing at the address. These addresses were matched to a commercial database containing individual information including voting and purchasing behaviors. To address the first research question, variables common to ATES and the commercial data, such as veteran and marital status, will be selected. We will compare the commercial values of these variables to the self-reported values to determine whether direct imputation from the commercial source is feasible for persons identified to be the same across the data sources. In cases where no one was found to be the same across the two sources, demographic variables will be imputed through a weighted random procedure using the population characteristics of the commercial data. We will use simulation to examine the imputed values by randomly selecting 10% of the respondent cases and comparing them against the self-reported values for these variables. This will evaluate the usability and accuracy of the imputation procedure using commercial data.
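The weighted random procedure for unmatched cases can be sketched as a draw from the commercial database's marginal distribution for the variable being imputed. The category counts, variable, and seed below are invented for illustration; they are not ATES or vendor figures.

```python
# Sketch: weighted random imputation from a (hypothetical) commercial-data
# distribution, for cases with no person-level match.
import random

def weighted_random_impute(distribution, rng):
    """Draw one category with probability proportional to its population count."""
    categories = list(distribution)
    weights = [distribution[c] for c in categories]
    return rng.choices(categories, weights=weights, k=1)[0]

# Hypothetical commercial-data counts for marital status (not real figures)
marital_counts = {"married": 5200, "never_married": 3100,
                  "divorced": 1100, "widowed": 600}

rng = random.Random(42)  # fixed seed so the simulation is reproducible
imputed = [weighted_random_impute(marital_counts, rng) for _ in range(1000)]
```

The simulation step described in the abstract would then hold out a random 10% of respondents, impute their values this way, and compare the imputed distribution against their self-reports.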
In self-administered surveys, data quality is an important concern when the household respondent may or may not be the “best” or most knowledgeable person for the questions being asked, such as questions about the children in a household. The National Center for Education Statistics conducted two web-based data collections. In 2016, respondents were asked whether they were the person most knowledgeable about a sampled child. If they reported being knowledgeable, they received the questions about the child; if not, the survey ended and a letter was sent to the household requesting that the most knowledgeable person complete the survey. Few cases in 2016 reported not being knowledgeable, and of those cases even fewer failed to respond to the child survey. In 2017, the respondent proceeded into the child-specific questions without being asked whether they were knowledgeable. The 2017 method is operationally easier and could increase response rates, but it raises a concern about potential data quality issues when the most knowledgeable person is not the one who responds. The goal of this analysis is to determine differences in data quality based on the relationship type that respondents have with the child. First, we will compare the response rates of parent versus non-parent respondents between administrations. Then we will look at key features of data quality for parents and non-parents in 2016 and 2017 – item non-response rates, evidence of straightlining, breakoff rates, and number of overall data edits. Preliminary results indicate that more non-parents completed the survey in 2017 than in 2016. There are indications that data quality declined only slightly in 2017. This suggests that it might not be necessary to find the “best” respondent as long as response rates increase and data quality does not decline.
As response rates decline, conducting surveys has become increasingly resource-intensive. In this presentation, we will discuss the results of an experiment that attempted to improve efficiency in a repeated cross-sectional survey, the National Household Education Surveys (NHES) Program. As part of this experiment, a random subset of households that responded during the screener phase of the NHES:2017 web test was asked to take on more response burden in the topical questionnaire phase of the survey than normally would have been imposed: completing two of the topical questionnaires that make up the NHES program (“dual-topical condition”) instead of just one (“single-topical condition”). Ideally, the dual-topical approach would result in a more efficient data collection; however, there also was a risk that this more burdensome request could backfire in terms of responsiveness or response quality. In particular, this experiment aimed to determine if the promising results of similar experiments conducted in previous, paper-only NHES administrations would be replicated when the survey was conducted online.
The preliminary results of this experiment are encouraging. Though the dual-topical condition resulted in a lower response rate than the single-topical condition, there were enough households willing to complete two topical questionnaires that the topical yield (number of questionnaires completed per sampled household) was increased – and the incentive cost per completed questionnaire was decreased. Since the instrument was programmed to skip any overlapping questions from the questionnaire that was presented second, we also find encouraging results in terms of reducing average respondent burden per completed questionnaire (as measured by completion time). Finally, there was no evidence of a negative effect on breakoff rates, item missing rates, or the representativeness of respondent households. These results suggest it is possible to ask households to complete more than one questionnaire in web surveys without a negative effect on survey quality.
Reducing the burden placed on respondents is necessary when asking them to recall events. Item design, especially in self-administered questionnaires, can reduce burden by providing cues that aid in recall. Two different approaches have been used to collect this type of data. One is a calendar format, in which respondents are asked to mark the days of the week on which an event occurs. One challenge of this method is possibly higher levels of straight-lining due to the complexity of the response process. The second approach is a traditional frequency grid, in which the response options are numbers of days per week (e.g., once a week, 2–3 days per week). However, this method might lead to overreporting due to respondents increasing their reported frequency to allow for forgotten events. Overall, some research has suggested that calendar formats may enhance recall and improve data quality, but methodological research comparing the two approaches has been minimal. As part of the 2014 National Household Education Surveys Feasibility Study, a self-administered mail-based survey, a split panel experiment was conducted to determine whether a weekly calendar format or an event frequency format would be preferable for future administrations. Half of the sample received a form with the items in a calendar format, and the other half received a form with the same questions in a frequency format. The proposed analysis of the calendar format (n=2,850) versus the frequency format (n=2,870) will examine potential recall issues such as over- or underreporting by comparing response distributions. Initial findings indicate a statistically significant association between the item format and the distribution of responses. The analysis will also examine data quality by comparing unit response rates, item non-response rates, and straight-lining patterns to determine which format is best suited for a self-administered questionnaire.
Survey researchers are often faced with striking a balance between asking all questions to which they would like a response and creating an easier-to-complete, concise instrument. Existing research suggests that sample members are less likely to respond to longer questionnaires and that responses to items presented later in questionnaires may be of lower quality (e.g., Burchell & Marsh 1992; Galesic & Bosnjak 2009). However, it is difficult to know the ‘optimal’ length to maximize the response rate and response quality in any particular survey. In this presentation, we will report on the results of a questionnaire-length experiment embedded in the adult training and education component of the mail-based 2014 National Household Education Survey Feasibility Study. This randomized experiment was conducted to determine the optimal questionnaire length for future administrations. Sample members were randomly assigned to one of two questionnaire length conditions. In the first, sample members received a longer questionnaire that included all topics of interest (28 pages, 80 numbered items). In the second, they instead received a shorter questionnaire focusing on a critical subset of these topics (20 pages, 66 numbered items). Preliminary results show that, as expected, the response rate was significantly higher when the shorter booklet was used. This presentation will report on (1) the final response rate in the two conditions; (2) other potential effects of questionnaire length on key survey outcomes, such as sample representativeness, key estimates, and response quality indicators (e.g., item nonresponse, skip errors); (3) whether the use of a prepaid incentive mitigated the impact of questionnaire length; and (4) the implications of these results for future administrations of this and other mail surveys.
This presentation will review the results of an experiment embedded in a recent national ABS field test survey that varied the response burden placed on sampled households by varying the number of questionnaires that sampled households were asked to complete. Once a screener phase established the presence of eligible individuals within the household, some of the eligible households were randomly assigned to receive a single topical questionnaire, while others also received a second topical questionnaire on a different, but related, topic as part of the survey mailing. Sending a second questionnaire to the household did not have a negative impact on the response rate, making it an attractive option to consider for gaining efficiency in future administrations by reducing the number of households that need to be sampled in the topical questionnaire phase. Before implementing this strategy in a full-scale administration, it is important to consider whether sending a second topical questionnaire had an impact on who responded or on the quality of their responses. This presentation will explore whether the experiment had an impact on nonresponse bias by comparing key estimates and demographic characteristics reported by respondents from single- and dual-questionnaire households, as well as comparing the distributions of frame variables among single-questionnaire respondents and dual-questionnaire respondents as compared to the eligible sample. It will also explore whether the amount of burden placed on the household had an impact on the extent of measurement error in the provided responses by comparing the prevalence of response quality indicators, such as item nonresponse, straightlining, and skip errors, in the single- and dual-questionnaire households.
Self-administered questionnaires rely on a carefully planned design in order to attain high response rates and increased data quality. During the design process, survey designers must create a clear organizational structure that allows the respondent to complete the survey as intended (Fanning, 2005). However, survey designers are balancing two important factors, first is the need for parsimony by reducing space to save on cost, but they also need to create an organizational structure that is intuitive to respondents and will yield high quality data. Specifically, grid organizational systems are one type of system that allow for differing questions with the same set of criteria to be answered in a more condensed space than a question by question format. Because grids can vary in complexity, there are a wide variety of challenges associated with their use. For example, a grid with a simple rating system can be prone to higher levels of straight-lining, whereas a complex grid might have higher item non-response rates. As part of the 2014 National Household Education Surveys Feasibility Study (NHES-FS), a self-administered mail-based survey, a split panel experiment on a set of four questions was conducted to determine whether a grid system would or would not be feasible for future administrations. Half of the sample received a form with the grid rating system, and the other half received a form with similar questions in a check all that apply and rank order format. The assessment of common problems related to grids versus a non-grid mark all that apply format will allow for better utilization of organizational structures in future surveys by knowing which format provides the highest quality data. By comparing unit response rates, distribution across items, item non-response rates, and straight-lining patterns, analysis will show which organizational structure is better suited for self-administered mail-based surveys.
Authorizing a proxy respondent to report on behalf of the target sample member is an attractive option for increasing response rates when survey practitioners are unable to make contact with the intended respondent. However, this response rate increase may come at the cost of greater measurement error if proxy respondents are not sufficiently knowledgeable of the sample members about whom they are reporting.
We will report on the use of proxy respondents in a recent telephone survey aimed at determining the feasibility of collecting national estimates of the prevalence of educational and work-related credentials among the U.S. adult population. Proxy responses were obtained in two ways. First, if a target sample member was not reached after a certain number of call attempts, another household member was asked to provide a proxy response for the target sample member. Second, the survey included an experiment in which a portion of respondents were randomly selected to receive a request to provide a proxy response for a second household member right after they had reported about themselves.
We will discuss the effect that utilizing proxy respondents had on the response rate. We also will review the characteristics of (1) the individuals who required a proxy respondent and (2) those who agreed to provide a proxy response – as well as whether or not permitting proxy responses improved the representativeness of the final sample. Finally, we will compare the quality of the proxy reports to that of the self-reports by comparing the extent of “don’t know” responses and the relative accuracy of key survey responses as compared to record data.
The literature on political polling and other opinion surveys includes a large body of knowledge on open- vs. closed-ended questions. These surveys are often telephone RDD or in-person interviews or, more recently, web-based surveys. High item nonresponse has been one of the major reasons for surveys to avoid open-ended questions whenever possible. Even open-ended questions that are limited to filling out a text field have been known to have high nonresponse. In 2011, the National Household Education Survey (NHES) administered a mail survey field test with embedded experiments on open- vs. closed-ended questions. Respondents to the survey were first recruited by filling out a screener questionnaire. After an eligible child was selected from the information in the screener, a more extensive topical questionnaire was sent. The follow-up survey asked parents about their child’s education and the parental care and family involvement in the child’s development. Embedded in the design were questions that were asked in one form as an open-ended question and in another form as a closed-ended question. The two forms were tested experimentally. One such question asked how many times a child was read to in the past week: one set of parents received an answer option in a write-in form and another set received answer options in the form of categories. In this paper, we will explore the response rates for these open-ended vs. closed-ended items. Our hypothesis is that the open-ended items are skipped more often than the closed-ended items. We will use logistic regression to estimate the likelihood of response for one type of question vs. the other, controlling for other factors that may affect response. This study will build on the literature on open- vs. closed-ended questions as it relates to mail household surveys.
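Because the experimental factor is a single binary indicator (open- vs. closed-ended form), the unadjusted version of such a logistic regression has a closed-form fit: the slope equals the log odds ratio of item nonresponse between the two forms. The counts below are hypothetical placeholders, not NHES results, and the full model would add the control covariates mentioned above.

```python
# Sketch: logistic regression of item nonresponse on question form, computed
# directly as a log odds ratio. Counts are hypothetical, not NHES data.
import math

def log_odds(skipped, answered):
    """Log odds of skipping the item."""
    return math.log(skipped / answered)

# Hypothetical counts: (skipped item, answered item)
open_ended = (180, 820)
closed_ended = (90, 910)

intercept = log_odds(*closed_ended)        # baseline: closed-ended form
slope = log_odds(*open_ended) - intercept  # effect of the open-ended form
odds_ratio = math.exp(slope)               # > 1 means open-ended skipped more
```

With covariates added, the same slope would instead be estimated by maximum likelihood, but this unadjusted odds ratio is the quantity the hypothesis concerns.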
Recruitment and nonresponse
Offering both web and paper modes of response is an attractive – and increasingly employed – design feature for maximizing response rates, minimizing bias, and controlling costs. However, there are challenges to employing such designs. While researchers often prefer web response, sample members may prefer paper (Shih & Fan 2008). Offering both modes concurrently may reduce the response rate (Medway & Fulton 2012) but offering them sequentially may prematurely dissuade sample members from responding who would have done so if they knew the second mode would be offered. Biemer et al. (2018) reported promising results of a “choice plus” protocol that addresses some of these challenges; it offers both response modes concurrently but incentivizes web response by offering a larger contingent incentive for web response than for paper response.
This presentation reports the results of an experiment that builds on these findings. Households sampled for the 2019 National Household Education Survey (NHES:2019) were randomly assigned to: (1) choice plus, (2) web-push (sequential web-then-paper), or (3) paper-only. Within the choice plus condition, households were randomly assigned to receive either a $10 or $20 contingent incentive for web response. Compared to web-push and paper-only, choice plus resulted in a higher response rate, both overall and among subgroups that typically have lower-than-average NHES response rates. Compared to web-push, choice plus also increased the response rate to early mailings but did not increase the percentage of responses by web – the increase in response was almost entirely by paper. Among those in the choice plus condition, $20 led to more response to early mailings and more response by web than did $10. However, choice plus also was more expensive than the other conditions, particularly when the $20 incentive was used. This presentation will be of interest to practitioners interested in maximizing response rates in mixed-mode surveys.
Adaptive survey designs often aim to mitigate nonresponse bias by targeting interventions (such as incentives) to “high priority” subsets of the sample. A challenge in developing adaptive designs is identifying effective interventions for low-response-propensity sample members—who, by definition, are the most difficult to persuade to respond, but whose responses are most needed to reduce nonresponse bias. This challenge is particularly salient in self-administered surveys, in which effective adaptive design interventions for interviewer-administered surveys—such as case prioritization and interviewer bonuses—are not available. Using an experiment incorporated into the 2019 National Household Education Survey (NHES:2019), this paper will evaluate whether a “choice plus” design (Biemer et al. 2018)—in which respondents are offered a choice of modes along with a promised incentive contingent on response by web—holds promise as an adaptive design intervention in address-based, self-administered studies. Preliminary NHES:2019 evaluations found that the choice plus design achieved a higher overall response rate than a sequential mixed-mode control; and, furthermore, that low-response-propensity subgroups showed higher-than-average increases in response rates, relative to higher-response-propensity subgroups. This suggests that using choice plus for only lower-propensity cases could improve sample representativeness to a similar degree as using choice plus for all cases. Building upon these findings, this paper will use data from the NHES:2019 experiment to project the impact of an adaptive design using choice plus as a targeted intervention for low-propensity cases. A Monte Carlo simulation will be used to evaluate the potential for reduced nonresponse bias, reduced weighting variance, and/or a shortened field period, compared to non-adaptive designs that use the sequential mixed-mode control or the choice plus treatment for all cases. 
By evaluating the utility of the choice plus protocol as an adaptive design intervention, this paper will contribute to ongoing efforts to incorporate adaptive design principles into address-based studies.
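The projection strategy described above can be illustrated with a small Monte Carlo sketch. Everything in this example is invented for illustration: the baseline propensities, the assumed "choice plus" lift, the covariate `x`, and the low-propensity cutoff are not taken from NHES data or the actual simulation design.

```python
import random
import statistics

random.seed(7)

# Hypothetical population: each case has a covariate x and a baseline
# response propensity that declines with x, so high-x cases are
# underrepresented among respondents under the control protocol.
N = 5_000
cases = [{"x": random.random()} for _ in range(N)]
for c in cases:
    c["p_base"] = 0.6 - 0.4 * c["x"]           # control-protocol propensity (assumed)
    c["p_plus"] = min(1.0, c["p_base"] + 0.1)  # assumed "choice plus" lift

def simulate(assign_plus):
    """One replicate: response rate and respondent mean of x under a rule
    that decides which cases get the choice plus treatment."""
    resp = []
    for c in cases:
        p = c["p_plus"] if assign_plus(c) else c["p_base"]
        if random.random() < p:
            resp.append(c["x"])
    return len(resp) / N, statistics.mean(resp)

def replicate(assign_plus, reps=20):
    rates, means = zip(*(simulate(assign_plus) for _ in range(reps)))
    return statistics.mean(rates), statistics.mean(means)

pop_mean = statistics.mean(c["x"] for c in cases)
rate_ctrl, mean_ctrl = replicate(lambda c: False)              # control for all
rate_all,  mean_all  = replicate(lambda c: True)               # choice plus for all
rate_tgt,  mean_tgt  = replicate(lambda c: c["p_base"] < 0.4)  # targeted: low-RP only

# Compare nonresponse bias (respondent mean vs. population mean) across designs
for label, m in [("control", mean_ctrl), ("all", mean_all), ("targeted", mean_tgt)]:
    print(label, round(m - pop_mean, 3))
```

In this toy setup, targeting the treatment at low-propensity cases recovers most of the bias reduction of treating everyone while treating far fewer cases, which is the kind of trade-off the simulation described above is designed to quantify.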
Monetary incentives are frequently used to improve survey response rates. While it is common to use a single incentive amount for an entire sample, allowing the incentive to vary inversely with the expected probability of response may help to mitigate nonresponse and/or nonresponse bias. Using data from the 2016 National Household Education Survey (NHES:2016), an address-based sample (ABS) of US households, this article evaluates an experiment in which the noncontingent incentive amount was determined by a household’s predicted response propensity (RP). Households with the lowest RP received $10, those with the highest received $2 or $0, and those in between received the standard NHES incentive of $5. Relative to a uniform $5 protocol, this “tailored” incentive protocol slightly reduced the response rate and had no impact on observable nonresponse bias. These results serve as an important caution to researchers considering the targeting of incentives or other interventions based on predicted RP. While preferable in theory to “one-size-fits-all” approaches, such differential designs may not improve recruitment outcomes without a dramatic increase in the resources devoted to low RP cases. If budget and/or ethical concerns limit the resources that can be devoted to such cases, RP-based targeting could have little practical benefit.
The National Household Education Survey (NHES) is administered periodically by the National Center for Education Statistics using an address-based sample (ABS) and a mailed, self-administered questionnaire. The 2016 administration will incorporate an experimental test of a tiered incentive structure, in which households receive a prepaid cash incentive in an amount determined, prior to data collection, by a predicted response propensity (RP) score. In preparation for this experiment, pre-collection research used the results of a 2014 NHES pilot study to develop, validate, and estimate a logistic regression model, which was then used to assign predicted RP scores to the 2016 sample. The 2014 data also were used to define the criteria by which a household’s incentive amount was assigned in 2016. This paper reports the results of this pre-collection research and model building stage. It describes the auxiliary variables available for use in modeling, the analyses conducted to select a subset of these variables for inclusion in the final model, and the use of a cross-validation procedure to evaluate the final model’s out-of-sample predictive power. It then discusses the use of this final model to identify an optimal incentive structure for the NHES 2016 experiment—namely, the RP score “cutoff points” that were used to determine a particular household’s incentive amount. Finally, it compares the distribution of predicted RP scores in the 2014 sample to those in the 2016 sample to assess the stability of the model’s predictions across administrations. The results provide evidence that a fairly parsimonious logistic regression model, estimated using auxiliary data appended to the ABS frame, generates in-sample and out-of-sample predictions that are sufficiently accurate and stable for use in assigning prepaid incentive amounts prior to data collection.
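The cutoff-point mechanism can be sketched as follows. The logistic coefficients, frame variables, and quartile-based cutoffs here are all hypothetical stand-ins; the actual NHES model and cutoff points were derived from the 2014 pilot data.

```python
import math
import random

random.seed(3)

def predicted_rp(x1, x2):
    """Hypothetical fitted logistic model score from two frame variables.
    The coefficients are illustrative, not the NHES model's."""
    logit = -0.5 + 1.2 * x1 - 0.8 * x2
    return 1 / (1 + math.exp(-logit))

# Score a hypothetical sample of 1,000 addresses
sample = [(random.random(), random.random()) for _ in range(1000)]
scores = sorted(predicted_rp(x1, x2) for x1, x2 in sample)

# Cutoff points chosen here as score quartiles purely for illustration
q1 = scores[len(scores) // 4]
q3 = scores[3 * len(scores) // 4]

def incentive(rp, high_tier=0):
    """Lowest-RP cases get $10, highest get $0 (or $2), the middle gets $5."""
    if rp < q1:
        return 10
    if rp > q3:
        return high_tier
    return 5

amounts = [incentive(predicted_rp(x1, x2)) for x1, x2 in sample]
```

Because the scores and cutoffs are computed before data collection, each household's incentive amount can be printed on its first mailing, which is what makes this a pre-collection (rather than responsive) design.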
Incentives have long been used to improve response rates in surveys, and research suggests that in some circumstances incentives can reduce nonresponse bias by eliciting cooperation from reluctant or hard-to-reach individuals who are “different” on what is being measured from those easy-to-reach individuals. Incentives have also been shown to reduce total survey costs by reducing the level of effort required to gain cooperation. The 2016 administration of the National Household Education Survey, an address-based mail survey of 206,000 U.S. households, will include an experiment designed to test a tiered (differential) incentive structure, in which households will receive a prepaid cash incentive in an amount determined by a predicted response propensity (RP) score assigned prior to data collection. The tiered incentives are being assigned such that cases with the highest predicted propensity to respond would receive no ($0) noncontingent incentive, with the next most likely group receiving $2, followed by a group receiving $5, and finally the cases predicted to be least likely to cooperate receiving $10. Two randomly assigned control groups are included in which all cases, regardless of their RP, receive either $2 or $5. This analysis will examine the impact of the incentive experiment on response rates and level of effort. These dependent variables for the modeled treatment groups will be compared to the control groups. Additionally, response rates within each RP group will be compared between those receiving a targeted incentive and those receiving a uniform incentive, in order to determine how the tailored incentive impacts response across different response propensity levels.
Finally, appended auxiliary data will be used to examine the demographic distribution of respondents from both the modeled and control treatments, to determine whether the modeled incentive protocol leads to a more representative respondent sample compared to a uniform incentive protocol.
Extensive literature documents prepaid incentives’ ability to increase survey response rates. However, there is less evidence available as to their effect on nonresponse error and measurement error. Do incentives reduce nonresponse error by leading the types of people who tend to be underrepresented in surveys to participate at a higher rate, or do they increase nonresponse error by simply leading more of the same types of people who already tend to participate to take part? Do incentives increase measurement error by leading less interested or motivated individuals to take part, who ultimately provide lower quality data – or do they have little impact on respondents’ actions beyond the point of agreeing to participate? Does the answer to these questions differ depending on what kind of incentive is provided? This presentation will utilize the results of a recent incentive experiment that was part of a nationwide ABS field test survey to explore the answers to these questions. In this experiment, sampled households were randomly assigned to one of the following conditions: $5 prepaid + magnet, $5 prepaid only, magnet only, or no incentive. The cash incentive significantly increased the response rate, while the magnet incentive did not. We will explore the effect of the incentives on nonresponse error by comparing the key survey responses and demographic characteristics reported by survey respondents in each condition, as well as comparing the distributions of frame variables for respondents in each condition to those of the eligible sample. We will also explore the effect of the incentives on measurement error by comparing the quality of the responses received from the respondents in each group by examining the prevalence of indicators such as item nonresponse, straightlining, and skip errors.
This presentation will help researchers considering the use of incentives to be aware of the broader impact that they may have on survey error beyond their effect on the response rate.
Since 1991, the National Center for Education Statistics (NCES) has used the National Household Education Surveys Program (NHES) to collect education-related data from households on topics that are difficult to study through institution-based frames. From 1991 through 2007, the NHES used a list-assisted RDD CATI survey. However, like most RDD surveys, NHES response rates have been declining over time and the increase in households converting from landlines to cell phone-only service has raised concerns about population coverage. These issues prompted NCES to redesign the NHES program, shifting to a two-stage address-based mail survey.
In July 2011 NCES completed a field test of the redesigned mail survey on a nationally representative sample of approximately 41,000 addresses in the United States. Included in the field test were three incentive experiments. At the screener phase, prepaid incentives of $2 and $5 were offered. At the second phase, eligible screener respondents were either offered no incentive, a prepaid incentive of $5, $10, $15, or $20 at the first mailing, or a prepaid incentive of $5 or $15 only at the second nonresponse follow-up (third mailing).
Our analyses examine the effectiveness of differential incentive levels in NCES’s new two-stage design. Logistic regression analyses indicate that topical incentives are generally more effective if provided at the initial request rather than at the follow-up phase, and that it is unnecessary to offer larger second phase incentives to households that responded quickly to the initial screener, but necessary and effective for households that required several follow-ups before returning the initial screener.
We found that the patterns observed nationally are not homogeneous across all demographic groups. The patterns diverged in linguistically-isolated areas, Hispanic households, African-American households, and lower income households, further demonstrating the complexity of targeting incentives in order to maximize response and representativeness while containing costs.
A growing literature has consistently demonstrated the effectiveness of prepaid cash incentives in boosting survey response rates across a variety of modes. As a result, survey researchers increasingly rely on incentives to improve response rates in surveys. Understanding how incentives can be used to achieve desired response rates under cost constraints is a critical challenge for survey researchers. In this paper we look at the impact of incentives on response rates in the National Household Education Survey (NHES) 2011 Field Test. The NHES utilizes a two-phase data collection approach in which households are first screened via the mail to determine if there is an eligible child in the household. If the household contains an eligible child, an in-depth topical survey is then sent to the household. To avoid possible biases associated with survey nonresponse and to protect the power of the sample, it is critical to achieve a high response to both surveys. Experiments using different incentive levels were included at both the first phase, referred to as the screener phase, and the second phase, referred to as the topical phase. At the screener level, the effectiveness of $2 and $5 prepaid cash incentives was tested. At the topical level, the effect of including no incentive or a $5, $10, $15, or $20 prepaid cash incentive with the initial topical mailing was tested. An additional experiment at the topical level tested the effectiveness of including a $5 or $15 cash incentive with the final nonresponse mailing.
An effective tailored survey data collection protocol can increase response rate and efficiency (Dillman, 2014; Stern, 2014). In this vein, the National Household Education Surveys (NHES) program has been experimenting with model-driven responsive designs that target subgroups with varied data collection protocols based on what is predicted to be the most effective approach for those groups. Recently, researchers have begun to assess the utility of data mining techniques for predicting survey outcomes (Buskirk, 2015; McCarthy, 2009; Phipps, 2012). These methods are of particular interest for the NHES given the recent addition of nearly 300 commercial data variables to the NHES address-based sampling frame. In this presentation, we will compare traditional modeling approaches with ensemble methods to assess whether nonparametric methods offer an improvement in predicting response mode preference for the NHES.
The data come from the 2016 NHES data collection. While the majority of NHES:2016 cases were assigned to a paper-only protocol, a random subset was assigned to an experimental web-push condition. Offering the web option was successful, as it increased the overall response rate for this two-phase survey, and improved data collection and data processing efficiency. However, it had the negative effect of decreasing the screener response rate, especially among particular subgroups – suggesting that some sample members still prefer the paper-only protocol. Hence, the next NHES administration will include an experiment where those cases that are predicted to prefer paper response (based on response patterns in NHES:2016) will be assigned to a paper-only protocol.
We will compare three approaches for developing a model that identifies those cases that are likely to prefer to respond by paper. These approaches will vary in terms of whether they use a traditional or a nonparametric approach for selecting predictor variables and the actual modeling of response outcome. For the first approach, we will use stepwise selection to identify the variables that should be included in the model, and a binary logistic regression model with survey response status as the outcome variable. The second approach will also use a binary logistic regression model; however, we will use a conditional inference tree to select predictor variables. The logistic regression models will include a mode condition indicator (i.e., paper vs web) for each case. After these models are developed and actual response propensities obtained, we will also calculate counterfactual response propensities by assigning each case the opposite mode condition indicator value. The difference between the actual and counterfactual response propensities will allow us to identify mode preference for each case. Finally, for the third approach, conditional forests will be used for both variable selection and modeling response outcome. Here, we will grow two conditional forests—one for the paper cases and another for the web cases. After growing the forests based on each case’s assigned mode, we will use the other tree to obtain the counterfactual response propensity.
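The counterfactual-flip step in the second approach can be sketched as follows. The coefficients below (an intercept, one frame variable, a mode-condition indicator, and their interaction) are hypothetical placeholders for a fitted logistic regression; the actual NHES models would be estimated from the experimental data.

```python
import math

# Illustrative coefficients standing in for a fitted response model; in
# practice these would come from a logistic regression of response status
# on frame variables plus a mode-condition indicator (1 = web, 0 = paper).
INTERCEPT = -0.2
B_AGE35PLUS = 0.6   # hypothetical frame variable coefficient
B_WEB_MODE = -0.4   # main effect of assignment to the web condition (assumed)
B_WEB_X_AGE = 0.9   # hypothetical interaction of mode with the frame variable

def propensity(age35plus, web_mode):
    """Predicted response propensity under a given mode assignment."""
    logit = (INTERCEPT + B_AGE35PLUS * age35plus
             + B_WEB_MODE * web_mode + B_WEB_X_AGE * web_mode * age35plus)
    return 1 / (1 + math.exp(-logit))

def mode_preference(age35plus, assigned_web):
    """Compare the actual propensity with the counterfactual obtained by
    flipping the mode-condition indicator; the larger one indicates the
    case's predicted mode preference."""
    actual = propensity(age35plus, 1 if assigned_web else 0)
    counterfactual = propensity(age35plus, 0 if assigned_web else 1)
    web_p = actual if assigned_web else counterfactual
    paper_p = counterfactual if assigned_web else actual
    return "web" if web_p > paper_p else "paper"

print(mode_preference(age35plus=1, assigned_web=True))   # prints "web"
print(mode_preference(age35plus=0, assigned_web=True))   # prints "paper"
```

The third approach replaces the single logistic model with two conditional forests, but the logic is the same: score each case under both mode assignments and take the difference as the measure of mode preference.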
As mixed mode survey designs become increasingly important, researchers have begun experimenting with strategies to determine which mix of modes is most efficient for gaining response. However, limited research has been done to validate methods of predicting, in advance of the survey, the mode by which individuals are most likely to respond. Understanding individuals’ propensity to respond by a particular mode could reduce survey administration costs by allowing the mode of response to be tailored to individuals as early in the administration as possible.
Previous research using data from the National Household Education Survey (NHES) found that address-level auxiliary data were moderately effective at modeling response mode preference, but that there was too much error in the model predictions (resulting from only moderate correlation to response) to make them practical to use in future administrations (McPhee, 2017). This paper builds on that research by incorporating address-level record data on voting history (shown to be predictive of survey participation; Tourangeau et al., 2010) and other topics (such as mail order shopping behavior) into these models to increase their predictive power.
The 2016 administration of the NHES included a mixed-mode experiment in which a randomly assigned set of 35,000 cases were asked first to complete the survey by Web (and later by paper). Using voting history data aggregated to the household level as well as address-level demographic data (e.g., characteristics of the housing unit, area demographics, internet penetration) a multinomial logistic regression model is used to predict response mode preference. This model is cross-validated to evaluate the model’s robustness when used to predict the mode of response in out-of-sample data. Finally, the paper-only control sample (n=136,000) from the NHES:2016 is used to test whether response rate improvements could be garnered if households unlikely to respond by web are sent paper questionnaires exclusively.
As mixed mode survey designs become increasingly important for survey research, due to their potential to reach a diverse respondent pool while controlling cost, researchers have begun experimenting with a variety of strategies to determine which mix of modes is the most efficient method to gain response. Experimentation with adaptive and responsive designs has begun to explore how different modes can be leveraged for different respondents to improve response rates and representativeness. However, limited research has been done thus far to validate methods of predicting, in advance of the survey administration, the mode by which individual cases are most likely to respond. It is known, for example, that some respondents are more likely or able than others to respond to a web survey, while others would be unlikely to respond to a web survey but more likely to respond if sent a paper survey, while others are unlikely to respond regardless of the mode. Understanding individuals’ propensity to respond by a particular survey mode in advance could potentially reduce survey administration costs by allowing the mode of response to be tailored to the individual respondent as early in the administration period as possible, freeing up resources to leverage other modes or incentives on the more difficult-to-reach sample cases.
This paper examines the potential to predict mode-specific response propensity for the National Household Education Survey (NHES). The 2016 administration of the NHES included a mixed-mode experiment in which a randomly assigned set of cases (n=35,000) were asked first to complete the survey by Web. Cases that did not respond by web after several contact attempts were sent a paper questionnaire. This research aims to determine whether household-level data available on the address-based sampling frame, and/or geographic data available from the Census Bureau, can be used to accurately predict the mode by which individual cases are most likely to respond. The ability to predict the mode of response could allow the efficiency of future administrations to be improved by sending some cases a paper questionnaire with the first mailing, rather than waiting until the third mailing to offer the paper option.
Parametric models (e.g. multinomial logistic regression) and non-parametric algorithms (e.g. classification and regression trees) will be compared with respect to their ability to predict response status and the mode of response using the available auxiliary data. Cross-validation procedures will be used to evaluate each method’s robustness when used to predict the mode of response in out-of-sample data. The paper will describe the auxiliary variables available for use in modeling, the variable selection procedures used to determine the optimal specification for the multinomial logistic regression model, and each method’s predictive accuracy when applied to the NHES dataset. The results will provide initial insight into whether it is possible to improve the efficiency of sequential mixed-mode designs by tailoring the mode of initial contact based on information known about sampled cases prior to data collection.
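The cross-validation harness used to compare predictive methods can be sketched generically. The data, the two toy "predictors" (a majority-class rule and a single-threshold rule standing in for the fitted parametric and tree-based models), and the feature are all invented for illustration.

```python
import random

random.seed(11)

# Toy data: (feature, mode), where the feature is a stand-in for the
# frame and Census auxiliary variables and mode is the observed response mode.
data = []
for _ in range(500):
    x = random.random()
    data.append((x, "web" if random.random() < x else "paper"))

def kfold(data, k=5):
    """Yield (train, test) splits for k-fold cross-validation."""
    shuffled = data[:]
    random.shuffle(shuffled)
    size = len(shuffled) // k
    for i in range(k):
        test = shuffled[i * size:(i + 1) * size]
        train = shuffled[:i * size] + shuffled[(i + 1) * size:]
        yield train, test

def majority_rule(train):
    """Baseline: always predict the most common mode in the training folds."""
    modes = [m for _, m in train]
    top = max(set(modes), key=modes.count)
    return lambda x: top

def threshold_rule(train):
    """Predict "web" when the feature exceeds the training mean (a crude
    stand-in for a fitted logistic regression or classification tree)."""
    mean = sum(x for x, _ in train) / len(train)
    return lambda x: "web" if x > mean else "paper"

def cv_accuracy(fit):
    """Average out-of-sample accuracy across the k folds."""
    accs = []
    for train, test in kfold(data):
        predict = fit(train)
        accs.append(sum(predict(x) == m for x, m in test) / len(test))
    return sum(accs) / len(accs)

acc_major = cv_accuracy(majority_rule)
acc_thresh = cv_accuracy(threshold_rule)
print(round(acc_major, 3), round(acc_thresh, 3))
```

Swapping in a real multinomial logistic regression or a CART-style tree for `threshold_rule` leaves the harness unchanged, which is what makes held-out accuracy a fair basis for comparing the parametric and nonparametric approaches.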
Mixed-mode data collection is an attractive option due to its potential ability to increase response rates, reduce nonresponse error, and reduce survey costs as compared to single-mode surveys; however, data collected in diverse modes also tend to have different measurement error properties. In particular, we may observe mode-driven “measurement effects” – that is, a difference in measurement error across modes caused by respondents answering the same question differently depending on the mode in which they respond. These measurement effects are potentially problematic because they suggest that respondents in each mode may be interpreting the survey questions differently, challenging the ability to confidently combine responses from these modes into a single data set. However, measurement effects can be difficult to estimate since they tend to be confounded with selection effects – differences between the characteristics of the respondents to the two modes. In this presentation, we will utilize two analytic approaches ((1) mixed-mode calibration and (2) extended mixed-mode comparison) to disentangle measurement effects from selection effects in the 2016 National Household Education Survey (NHES) data. This administration included an experiment in which a random subset of sampled households was assigned to a sequential mixed-mode condition that gave them the opportunity to complete the survey either on the web or, later, on paper (35,000 households). The remaining cases were assigned to a single-mode, mail-only condition, in which they received the standard NHES mail survey protocol (171,000 households). Disentangling selection and measurement effects allows us to assess both the extent to which the mixed-mode condition brought in respondents who differed on key survey topics from those who responded in a mail-only administration (selection effects), and the extent to which the measurement of these topics varied by mode of response (measurement effects).
Recent trends show that survey respondents are increasingly difficult and expensive to reach. Methodology research consistently demonstrates that tailored and adaptive designs may offer the best solution for collecting high-quality data. One strategy that can increase coverage and representativeness— and potentially reduce cost — uses sequential mixed-mode designs that include a web-based response component. In January 2016, the National Center for Education Statistics will test a sequential mixed-mode, web-push design for the 2016 administration of the National Household Education Survey (NHES). For several cycles, the NHES has used an address-based sample to administer a two-phase, self-administered mailed questionnaire in which sampled households are rostered using a phase-1 screener and then a single individual is sampled from responding households to complete a longer phase-2 “topical” survey. This presentation will describe the process of adapting the two-phase paper design to incorporate a variable-phase web survey, and some of the key challenges faced while transitioning from a well-tested paper-only to a mixed-mode administration. Authors will describe the tradeoffs between maintaining consistency with the paper instrument and optimizing the web survey; the complexity of building a web instrument that in some situations (e.g., single-adult households) must be a single-phase survey with both phases completed by one individual, while other situations require a different respondent to complete each phase; and the intricacies of using phase-1 screener data to customize wording in both English and Spanish using known information about the respondent. In addition to discussing the above challenges and proposed solutions, the paper will present selected results of usability testing and the resulting design changes to the web instrument. 
This study contributes to the growing body of research examining the most effective ways to use mixed-mode designs to increase survey response and representativeness while minimizing cost and mode effects in a national household survey.
Decreasing telephone response rates have caused survey researchers to explore alternate modes of data collection. The increasing accuracy and availability of address sampling frames have led researchers in the public and private sector to reconsider mail self-administered surveys as a viable alternative to telephone RDD studies. Mode effects in surveys have been well documented. One aspect of the effect that is worthy of exploration in the transition of a survey from telephone to mail is whether or not the change in mode draws in a different respondent from the household. The National Household Education Survey (NHES) has been conducted by telephone approximately every two years since 1991. As a result of falling response rates and concerns about coverage in the list-assisted RDD sample, the survey began a redesign in 2009. The first step in this redesign was a feasibility test of a two-phase mail survey. The NHES consists of a series of rotating modules, most of which are focused on a reference child. Past research has shown that the household informant plays a role in the quality of data collected. As a result, telephone interviewers were instructed to ask that the adult in the household who is most knowledgeable about the selected child serve as the respondent. The mail self-administered questionnaire requested that someone knowledgeable about the child respond; however, this request is likely to be less salient to the respondent in a self-administered survey compared to an interviewer-administered survey. This paper will examine the characteristics of reporters in the phone and mail surveys. Changes in the household reporter could have profound effects on measurement. Understanding differences in who is likely to respond by phone compared to mail can also lead to the development of better contact approaches for mail surveys.
Targeting recruitment and nonresponse intervention
The results of the 2017 National Household Education Survey web test (NHES:2017) suggested a pressure-sealed envelope could be a promising option as a survey reminder (Medway et al. 2018). However, the NHES:2017 administration was web-only and thus not entirely comparable to a mixed-mode web-push design. In addition, it was not a randomized experiment, in which a reminder postcard and pressure-sealed envelope are tested simultaneously within the same data collection. The NHES:2019 administration predominantly used a web-push design and experimented with the use of a pressure-sealed envelope as a survey reminder. For this experiment, sample members in the baseline group received a regular reminder postcard, while the sample members in the treatment group received a pressure-sealed envelope. Both types of survey reminders were sent a week after the initial screener package. The NHES:2019 pressure-sealed envelope included the web survey URL and the household’s unique web login credentials, but the regular reminder postcard did not (because the postcard format did not allow for sufficient protection of this information). It was hypothesized that the pressure-sealed envelope’s ability to include the survey URL and the household’s web login credentials would increase internet response (Census, 2018).
We will present, within the context of a web-push design, whether a pressure-sealed envelope reminder motivated survey participation. The results show that the pressure-sealed envelope was more effective, increasing the response rate by 4 percentage points relative to the postcard. We also evaluated the effect on response rates by selected household characteristics. There were a few subgroups for which sending a pressure-sealed envelope led to a somewhat larger-than-average increase in response relative to the reminder postcard. Our presentation will provide insights into the effectiveness of pressure-sealed mailing materials in a web-push design.
Recent research by the U.S. Census Bureau (Kulzick et al. 2019) clustered every census tract into one of eight “audience segments”, defined as “groups of census tracts with similar predicted self-response behavior...selected for their distinctive patterns of media consumption and distribution of mindsets.” Files identifying each tract’s audience segment are publicly available through the Bureau’s Response Outreach Area Mapper (ROAM) application. Although primarily designed to assist in tailoring public outreach efforts for the 2020 decennial census, these audience segments may also predict response behavior to other studies of households, particularly those that (like the census) initially request self-response by web or mail. If they do, they may be a useful resource to survey managers, insofar as they offer preexisting cohorts between which data collection strategies could be varied to achieve a more balanced distribution of responses. Using data from several cycles of the National Household Education Survey (NHES), a large-scale recurring cross-sectional study, this presentation will examine the utility of the Census audience segments for tailoring data collection strategies in mixed-mode, address-based household surveys. The discussion will focus on two research questions. First, do response rates, eligibility rates, and the distribution of responses by mode vary meaningfully between the audience segments? Second, based on randomized experiments incorporated into recent NHES cycles, is there evidence that the effectiveness of common data collection interventions—including cash incentives, advance mailings, and the offered response mode—varies between the audience segments? Time permitting, variation in substantive survey estimates will also be discussed. Results will provide insight into whether the Census Bureau’s tract-level audience segmentation research can be usefully applied in data collection contexts other than the decennial census.
The National Household Education Surveys program (NHES), a national cross-sectional survey, has used mailed invitations to contact households since 2012. Nonrespondents to the initial invitation receive up to three reminder mailings. Typically, one of the reminders is sent using FedEx courier service. In the 2019 administration, an experiment was included that varied when this FedEx mailing was sent to assess the ideal timing for sending this costly reminder. As part of this experiment, 70,000 sampled cases were randomly assigned to one of three experimental conditions: FedEx second, FedEx fourth, or modeled FedEx. Cases assigned to the FedEx second or FedEx fourth conditions received FedEx as the second or fourth mailing, respectively. In the modeled FedEx condition, a cost-weighted response propensity model based on available frame data was used to identify cases that were most likely to respond to the screener and for which the FedEx mailing would be costliest—these cases received FedEx as the fourth mailing, while the remaining sample received it as the second mailing.
In the current study, the effectiveness of the model used to assign cases to receive FedEx as the second or fourth mailing will be evaluated by comparing response rates, respondent representativeness, and cost indicators among the three experimental conditions. Because the FedEx mailing timing is based on a predefined model, results from this study will provide insight into whether FedEx mail timing is an effective adaptive design intervention in address-based sample surveys. Findings from the study will also guide future NHES designs and inform other researchers seeking to reserve more expensive reminder methods for targeted sample members.
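A cost-weighted assignment rule of the kind described above might be sketched as follows; the propensity coefficients, frame variables, per-piece FedEx cost, and share of cases deferred are all illustrative assumptions, not the actual NHES model:

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for a response propensity model fitted on a prior administration:
# a fixed logistic coefficient vector over two hypothetical frame variables.
beta = np.array([1.2, -0.4])

X = rng.normal(size=(1000, 2))               # current sample's frame data
propensity = 1 / (1 + np.exp(-(X @ beta)))   # P(screener response)

fedex_cost = np.full(1000, 9.50)             # assumed cost per FedEx piece
expected_waste = propensity * fedex_cost     # likely responders "waste" FedEx

# Defer the FedEx mailing (send it fourth) for the cases where an early,
# expensive mailing is least needed; the 20% share deferred is an assumption.
cutoff = np.quantile(expected_waste, 0.80)
assignment = np.where(expected_waste >= cutoff, "FedEx fourth", "FedEx second")
```

Under this sketch, the highest-propensity (and therefore costliest-to-rush) fifth of cases is deferred to a fourth-position FedEx mailing and everyone else receives FedEx second.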
As the Spanish-speaking population continues to grow in the United States, so too does the need for accurate estimates of this group’s attitudes and experiences in national surveys. Historically, Spanish-speaking households have had lower-than-average response rates on the National Household Education Survey (NHES), despite the use of bilingual mailings for households that are flagged as being likely-Spanish-speaking. Previous research suggests that tailored invitation letters can increase response in low-response propensity groups (Lynn, 2016).
This presentation reports the results of an experiment included in NHES:2019 that used targeted materials to increase response among likely-Spanish-speaking households. In this experiment, likely-Spanish-speaking households were identified based on auxiliary data available on the NHES sampling frame and appended from publicly available sources. This group was then sent targeted mailing materials, developed through a series of focus groups with Spanish speakers, designed to engage or appeal to this group. For example, the letters emphasized key themes that emerged in the focus groups, the paper questionnaire included images of Hispanic families, and a Spanish-first presentation was used for the bilingual materials. However, our findings suggest that, among likely-Spanish-speaking households, the targeted mailings decreased the response rate as compared to the standard NHES materials. We also find that the flag used to identify likely-Spanish-speaking households was possibly imprecise. In this presentation, we summarize results from this experiment and offer lessons learned for future targeted designs for surveying Spanish-speaking populations.
Being offered a survey via one’s preferred mode has been found to boost participation. Moreover, paper questionnaires tend to yield higher response rates than web surveys, but they also tend to be more expensive to administer. These competing forces could be balanced by using paper only for the cases where it is expected to have the strongest positive impact on response. However, mode preference is not typically available on sampling frames. The 2019 administration of the National Household Education Survey (NHES) included an experiment in which a modeling approach was used to predict mode preference (the “modeled paper-only condition”). Using data from prior NHES administrations, each household was assigned a paper-sensitivity score indicating how likely it was to respond to a paper questionnaire relative to a web survey. Within the modeled paper-only condition, the top 15 percent of households (the “paper-sensitive group”) were assigned to receive paper questionnaires, and the rest were assigned to receive a web-push protocol. Randomly assigned web-push and paper-only groups served as comparisons; in these conditions, all cases received the same mode protocol regardless of their paper-sensitivity score. The effects were evaluated at the screening phase of the NHES.
The results showed a much higher screener response rate for paper-only cases than for web-push cases. However, while the predictive model successfully differentiated paper-sensitive from non-paper-sensitive households, the benefit of sending paper to the paper-sensitive group was seen primarily in early-stage screener response rates; it dwindled once the paper-sensitive web-push cases had also been sent paper. The modeled condition increased early-stage participation but had only a marginal impact on overall participation relative to a sequential mixed-mode protocol.
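The paper-sensitivity scoring described above might be sketched as follows; the coefficient vectors standing in for fitted propensity models, and the frame variables, are illustrative assumptions rather than the actual NHES specification:

```python
import numpy as np

rng = np.random.default_rng(7)

# Stand-ins for two propensity models fitted on prior NHES data: one for
# responding by paper and one for responding by web. The coefficient
# vectors and frame variables are illustrative assumptions.
beta_paper = np.array([0.9, 0.3])
beta_web = np.array([0.2, 0.8])

X = rng.normal(size=(2000, 2))                   # household frame data
p_paper = 1 / (1 + np.exp(-(X @ beta_paper)))
p_web = 1 / (1 + np.exp(-(X @ beta_web)))

# Paper-sensitivity score: how much likelier a household is to respond to
# a paper questionnaire than to a web survey.
sensitivity = p_paper - p_web

# The top 15 percent form the paper-sensitive group; the rest get web-push.
cutoff = np.quantile(sensitivity, 0.85)
protocol = np.where(sensitivity >= cutoff, "paper-only", "web-push")
```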
Mail surveys using address-based sampling (ABS) methods require precise tracking of housing unit eligibility. The occupancy status of a housing unit is a key factor in determining unit eligibility, and it affects response rates and nonresponse adjustments. Additionally, these studies typically use multiple contacts, sometimes layering increasingly expensive contact strategies. However, additional contacts are unnecessary if an address is confirmed to be unoccupied, nonresidential, or nonexistent. Organizations often use two sources to identify potentially ineligible units: vendor-provided auxiliary data (often based on USPS vacancy status) prior to the start of data collection, and postmaster return status during data collection.
Experience has demonstrated that neither source is completely reliable. Analyses of the National Household Education Survey (NHES) 2014 feasibility study found that excluding from subsequent mailings those addresses initially identified as undeliverable would lead to a measurable loss of completed screening questionnaires. While certain vendor-provided address-level auxiliary data can differentiate “true” ineligibles from potentially valid addresses among those initially returned as nondeliverable, no single piece of information identifies ineligible addresses with enough certainty to offset the potential loss in precision and representativeness that results if these addresses are excluded from future contact.
This paper expands on this research by developing a propensity model predicting address ineligibility using data from the NHES:2016, which included 171,000 nationally representative addresses. Using address-level demographic data (e.g., characteristics of the housing unit, area demographics, census response likelihood), logistic regression models are used to predict final address-level ineligibility. These models are then cross-validated to evaluate their robustness when used to predict ineligibility in out-of-sample data. We then examine whether these predictive models can be coupled with initial occupancy status information to weed out “true” ineligible addresses from future contacts without introducing bias into the final responding sample.
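A minimal sketch of that modeling-and-cross-validation workflow follows, using simulated data in place of the NHES:2016 frame variables; the data-generating process and the simple model specification are assumptions for illustration only:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Simulated address-level data standing in for the frame variables the paper
# names (housing-unit characteristics, area demographics, census response
# likelihood); the data-generating process below is an assumption.
n = 4000
X = rng.normal(size=(n, 3))
logit = -2.0 + 1.1 * X[:, 0] - 0.6 * X[:, 1]
ineligible = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Fit the ineligibility model and cross-validate it to gauge how well it
# predicts ineligibility in out-of-sample data.
model = LogisticRegression()
auc = cross_val_score(model, X, ineligible, cv=5, scoring="roc_auc")
print(f"5-fold mean AUC: {auc.mean():.2f}")
```

An out-of-sample discrimination measure such as cross-validated AUC is one way to judge whether predicted ineligibility is reliable enough to justify dropping addresses from future contacts.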
The 2016 National Household Education Survey was a two-stage self-administered survey. While most mailings were sent using the United States Postal Service (USPS), the second screener-stage reminder was mailed using FedEx. Preliminary analysis showed that the FedEx mailing led to the highest increase in the screener response rate out of the three screener reminders that were sent. Additionally, bivariate analysis revealed that the respondents to the FedEx reminder were more likely to come from underrepresented groups (such as non-White, less educated, low-income, and rural households), potentially decreasing nonresponse bias. However, although these results suggest that the FedEx mailing is a worthwhile nonresponse follow-up option, using this service can be very expensive. Instead of sending all nonresponding households the second reminder via FedEx, subgroups of households could be targeted with FedEx mailings as part of a tailored design (with the others receiving a cheaper USPS reminder mailing).
The current study will expand the preliminary analyses by developing a model to determine which household characteristics available on the frame are most predictive of responding to the FedEx mailing. The results will then be used to identify the households that should continue to receive a FedEx mailing in future administrations and those that could receive a less costly mailing. Identification of these households will take into account (1) whether the case improves the representativeness of the respondent pool (reducing bias) and (2) the relative cost of FedEx versus USPS mailing for that case. An analysis will assess the effect of this targeted approach on the expected response rate and cost per complete for future administrations. The results will be of interest to researchers considering the cost-benefit tradeoffs of more expensive mailing strategies, particularly those conducting surveys that include FedEx mailings.
Leverage salience theory implies that the effectiveness of an intervention aimed at increasing survey response rates is unlikely to be uniform across a sample. Rather, sampled cases are likely to vary in their “sensitivity” to a potential intervention. In principle, an adaptive design that assigns an intervention only to cases predicted to be sensitive to that intervention would allow for a more efficient resource allocation than assigning the intervention to all sampled cases. However, while statistical methods for predicting response propensity are well established, the problem of accurately predicting (in advance of data collection) sensitivity to potential interventions has received limited attention in the survey research literature. This paper will investigate statistical approaches to identifying subgroups that are likely to be particularly sensitive to response rate interventions, using auxiliary data available prior to data collection. Drawing in part on recent advances in predictive subgroup identification from the biostatistics and machine learning literatures, several methods will be compared, including regression-based and recursive partitioning approaches. Data from a randomized incentive experiment incorporated into the 2016 National Household Education Survey—a large-scale (n = 206,000) study of U.S. households using an address-based sample and self-administered mailed questionnaires—will be used to demonstrate and evaluate each method. The primary research question is whether any of these methods is able to isolate cohorts between which there is substantial variation in the response rate increase attributable to a $5 prepaid incentive; and, crucially, whether this information can then be used to accurately predict incentive sensitivity in out-of-sample data. Taking advantage of the large sample size, cross-validation techniques will be used to assess each method’s out-of-sample accuracy. 
The results will provide insight into whether any of these statistical approaches to predicting incentive sensitivity show promise for use in adaptive survey designs.
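One regression-based approach of the kind the paper compares could look roughly like the following sketch on simulated data: fit a separate response model per experimental arm, score each held-out case's predicted incentive sensitivity, and check the prediction out of sample. The covariates, effect sizes, and simple logistic specification are all assumptions for illustration, not the study's actual methods:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

# Simulated experiment standing in for the $5 prepaid-incentive
# randomization; covariates, effect sizes, and sample size are assumptions.
n = 20000
X = rng.normal(size=(n, 2))
incentive = rng.binomial(1, 0.5, size=n)
uplift = 0.08 * (X[:, 0] > 0)              # only one subgroup is sensitive
p = 0.25 + (0.05 + uplift) * incentive
respond = rng.binomial(1, p)

X_tr, X_te, t_tr, t_te, y_tr, y_te = train_test_split(
    X, incentive, respond, test_size=0.5, random_state=0)

# Fit one response model per experimental arm on the training half, then
# score each held-out case's predicted incentive sensitivity as the
# difference in predicted response probabilities.
m1 = LogisticRegression().fit(X_tr[t_tr == 1], y_tr[t_tr == 1])
m0 = LogisticRegression().fit(X_tr[t_tr == 0], y_tr[t_tr == 0])
sensitivity = m1.predict_proba(X_te)[:, 1] - m0.predict_proba(X_te)[:, 1]

# Out-of-sample check: the observed incentive effect should be larger in
# the predicted-sensitive half than in the rest.
hi = sensitivity >= np.median(sensitivity)
effect_hi = y_te[hi & (t_te == 1)].mean() - y_te[hi & (t_te == 0)].mean()
effect_lo = y_te[~hi & (t_te == 1)].mean() - y_te[~hi & (t_te == 0)].mean()
```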
As summarized by Groves (2006), there are characteristics of surveys (e.g., sponsorship, burden) and of respondents (e.g., gender, urbanicity) that predict response propensity. Information about response propensity from one administration of a survey can be used to develop targeted contact procedures that increase response, reduce bias, and defray costs for subsequent administrations. It is less clear, however, whether information from one cross-sectional household study can be used to predict response to a different survey. The authors evaluate whether using response propensity from the decennial census can increase the efficiency of the design of the National Household Education Surveys (NHES), sponsored by the National Center for Education Statistics. Because the NHES uses address-based sampling, it has been limited to address-level information, provided commercially by a sample frame vendor, for designing efficient contact strategies. In this paper, the authors evaluate the use of the Census Bureau’s Planning Database (PDB), which provides block-group-level response rates to the 2010 decennial census, to enhance sampling frame data and differentially target respondents. The authors conduct two analyses to evaluate the PDB for use in an NHES targeted-design strategy. First, they compare response to the NHES:2014 Feasibility Study with PDB response propensity scores by demographic characteristics, such as percent minority and percent in poverty, to assess whether the PDB would have accurately predicted response to the NHES:2014. Second, the authors match sampled households from the NHES:2014 study to PDB data and analyze the results of a random-assignment incentive experiment conducted in the NHES:2014 in relation to decennial census response propensity data. The analysis allows for a post hoc simulation of response rate outcomes had the PDB been used in 2014 to target likely responders with a lower incentive than the incentive given to likely nonresponders.
The authors evaluate the results of the simulation on both response rates and bias in key estimates.
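The shape of such a post hoc simulation might be sketched as follows; the randomly generated block-group response rates stand in for PDB data, and the incentive amounts and assumed incentive effects are illustrative, not the study's actual figures:

```python
import numpy as np

rng = np.random.default_rng(3)

# Randomly generated block-group census response rates stand in for the PDB
# data; incentive amounts and effects below are illustrative assumptions.
n = 10000
pdb_rate = rng.uniform(0.4, 0.9, size=n)
likely = pdb_rate >= np.median(pdb_rate)     # "likely responders"

def expected_response(incentive):
    # Assumed model: baseline response tracks the PDB rate, and a $5
    # incentive adds more than a $2 incentive (purely an assumption).
    return 0.15 + 0.30 * pdb_rate + np.where(incentive == 5, 0.06, 0.02)

uniform = np.full(n, 5)                      # everyone gets $5
targeted = np.where(likely, 2, 5)            # $2 for likely responders

for name, inc in [("uniform $5", uniform), ("targeted", targeted)]:
    rr = expected_response(inc).mean()
    print(f"{name}: expected response rate {rr:.3f}, "
          f"mean incentive ${inc.mean():.2f}/case")
```

The comparison of interest is whether the targeted design's incentive savings outweigh whatever response rate (and bias) it gives up relative to the uniform design.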
Surveys using address-based sampling (ABS) methods require precise tracking of housing unit eligibility. The occupancy status of the housing unit is a key aspect when determining unit eligibility, response rates, and nonresponse adjustments. Additionally, mail-based studies typically use multiple contacts, sometimes using increasingly expensive contact strategies (e.g., special delivery mail). These additional contacts are unnecessary if an address is confirmed to be unoccupied, nonresidential, or nonexistent. However, because it is often difficult to confirm the occupancy status of sampled addresses, organizations often retain all housing units in subsequent mailings even if early attempts indicate the unit is unoccupied. The U.S. Postal Service and special delivery services (e.g., FedEx) may provide different information about a unit’s occupancy status. Further, it remains unknown to what extent occupancy information collected by USPS is comparable to information collected by special delivery services, and whether it is beneficial to use special delivery mailings in follow-up contact attempts if USPS has indicated the unit is unoccupied. This analysis addresses these questions by analyzing data from the 2014 National Household Education Surveys Feasibility Study (NHES-FS). The NHES-FS is an ABS mail survey sent to an initial sample of 60,000 addresses across the U.S. The first two mailings were sent using USPS First Class mail and the third contact was sent via FedEx. Occupancy information will be compared across delivery services, as well as to occupancy status indicated on the sampling frame. These analyses will be conducted by region and urbanicity. Response status will be analyzed to determine the impact on response rates of re-mailing to addresses identified as unoccupied.
This analysis will evaluate the effectiveness of this strategy and determine whether a refined methodology that accounts for frame information, location, and postal returns could be used to more accurately identify unoccupied housing units and reduce expenditures.
Reaching non-English-speaking households is a challenge for many surveys, especially those conducted by mail. Unlike telephone surveys, where the interviewer can immediately identify a language problem and route the case to an interviewer who speaks the respondent’s language, a mail survey must identify ways to target the household prior to contact. As part of the transition from a telephone-administered to a mail self-administered design, the National Household Education Survey (NHES) has conducted a number of experiments to look at optimal ways to identify and reach Spanish-speaking households.
The issue of correct identification of the language spoken in the household is especially acute for the NHES, as it is a two-phase study, where sampled households are screened by mail with a simple household roster to determine the presence of eligible children. If eligible children are present, within household sampling is performed to select a reference child. The household is then sent a longer and more complex topical survey by mail. The screener is used to determine the language for the topical survey form.
In 2011, NHES undertook several experiments to explore ways to identify Spanish-speaking respondents through a mail screener questionnaire as part of a larger field test. This paper examines the results of experiments that compared a bilingual form, an English form, separate English and Spanish forms, and sending English and Spanish forms to households that had previously received only an English form. The experiments were done on three independent samples of households: 1) those in census tracts identified as Linguistically Isolated areas, 2) households with Hispanic surnames on the sampling frame, and 3) a nationally representative sample of households. This paper explores the optimal approaches identified in the 2011 NHES field test for reaching Spanish-speaking households by mail.
Given persistent declines in survey response rates in recent decades (Brick and Williams 2013), researchers have devoted considerable attention to understanding the factors that contribute to survey nonresponse. Some researchers have argued that concerns about privacy and confidentiality can be a driver of nonresponse, particularly to government-sponsored surveys or surveys by any entity that is believed to share data or not protect respondents’ privacy (Singer and Presser 2008).
This presentation analyzes privacy attitudes among nonrespondents to the National Household Education Survey (NHES). It draws on findings from an in-depth qualitative study by the National Center for Education Statistics focused on better understanding the drivers of nonresponse to the NHES. The study included over 80 in-person, qualitative interviews with households that did not respond to the first three NHES:2019 survey mailings. These 90-minute interviews took place in sample members’ homes at four locations across the country.
In this presentation we explore nonrespondents’ definitions of privacy, reported privacy protection measures, and degrees of concern about privacy. We also report on sub-group analysis results to help understand variation in privacy attitudes. Our findings suggest that nonrespondents generally defined privacy in two ways: (1) protecting personal information (i.e. confidentiality); and/or (2) maintaining distance or boundaries between themselves and others. Additionally, some respondents believed that there was no such thing as privacy because their information was already freely available to the government or corporations. One in five participants were extremely concerned about privacy and took various measures to protect it, including not using social media or cell phones, not using banks and/or credit cards, and/or burning their mail. We conclude by discussing the implications of these findings for nonresponse. Though the drivers of nonresponse are complex, understanding privacy attitudes could help inform aspects of survey design, such as contact materials and cover letter content.
To better understand characteristics of nonresponding households, 760 address and neighborhood observations were conducted across the United States as part of a nonresponse follow-up study to the 2019 National Household Education Survey (NHES). The NHES is a nationally representative household survey that uses an address-based sampling (ABS) frame to collect information about educational experiences of children in the U.S. In 2019, an in-person follow-up was conducted to understand more about nonresponding addresses. The purpose of the observation component was to determine the characteristics of addresses that are prone to nonresponse or having their NHES mailings be undeliverable as addressed (UAA) and to assess the accuracy of the information available on the NHES ABS frame.
The observation instrument included items designed to measure the accuracy of the frame (e.g., occupancy status, presence of children, and household income) and challenges in receiving mail (e.g., structure type, mail access type). It also included a series of items that captured characteristics of the address that were not available on the frame (e.g., presence of items related to privacy or pride in education). For each item, observers recorded rich contextual information about the house and neighborhood.
This presentation shares the observable characteristics of nonrespondent and UAA addresses. Indicators of privacy concerns were observed for just over a quarter of observed nonrespondent addresses. Addresses with UAA outcomes were “problematic” (e.g., no mailbox in sight) more often than non-UAA addresses. Finally, there was a considerable range in the agreement rates between the observed data and the ABS frame. Drawing on these findings, the presentation will conclude with implications for the contact materials and procedures used in future NHES data collections and other household surveys that use an ABS frame.
Over the past few decades, surveys have faced persistent declines in response rates (Brick and Williams 2013). Given the global nature of this phenomenon, researchers have devoted considerable attention to understanding the factors that contribute to it. However, the drivers of nonresponse are complex and much remains unknown about sample members’ reasons for nonresponse and the best way to counteract those concerns. In response to these trends, the National Center for Education Statistics conducted an in-depth qualitative study focused on better understanding the drivers of nonresponse to the National Household Education Survey (NHES) and other household surveys. The study included over 80 semi-structured interviews with households that did not respond to the first three NHES:2019 survey mailings. These 90-minute interviews covered a variety of topics (e.g., survey experiences, government attitudes, mail sorting behavior). They took place in sample members’ homes at four study sites across the country.
This presentation focuses on three topics. First, we discuss participants’ prior experiences with and general attitudes toward surveys; for example, commonly reported reasons for negative attitudes toward surveys included being too busy, having survey fatigue, and feeling that survey participation doesn’t make a difference. Next, we discuss the survey-specific factors that participants said influence their survey participation decisions, such as the survey topic, offered response mode(s), length, and incentives. Third, we report the extent to which interview participants said they had engaged with the NHES survey mailings; while most remembered receiving at least one mailing, more than half of those who opened at least one mailing said they had rejected the survey request. We conclude by discussing the implications of this qualitative interview study for understanding nonresponse to the NHES and similar household surveys.
The reliance on mail-based data collection using address-based sampling to survey U.S. households has increased in recent years (Harter et al., 2016). Following this trend, the National Household Education Surveys program (NHES) has used mailed invitations to contact households since the 2012 administration. Even before the household member evaluates the mailed survey invitation, gaining response from a household to a mail request requires that a) the survey reaches the household successfully, and b) the household member opens the envelope.
To evaluate the current NHES mailing procedures and to optimize the design of future administrations, a qualitative interview study was conducted with over 80 households that did not respond to the first three NHES:2019 mailings. Interview participants were asked about receiving mail (e.g., how often they check mail, mail delivery challenges they face) and about their general mail-handling process (e.g., when and how they sort mail, how they decide whether or not to open specific pieces of mail). As part of a mock mail sorting activity, interview participants reviewed 12 pieces of example mail, including an NHES:2019 mailing, that varied on several aspects, such as the type of mail (e.g., bill, solicitation), envelope design (e.g., color, size), and postage type. Interviewers observed household member(s) sorting these mail pieces and prompted household members to capture their thoughts about what they were doing and why.
This presentation provides a summary of participants’ mail receiving and processing behaviors and explores themes in their reactions to the mail pieces. For example, participants considered factors such as the degree of personalization of the addressee line, familiarity with/perception of the sender, and the size of the envelope when deciding whether to engage with the mail pieces. Findings from the study will inform mail survey protocols for future NHES administrations and other mailed surveys.
Extensive research has been done in the survey methodology field to try to understand why people have become less likely to respond to surveys. Theories of nonresponse range from individual-level drivers such as busyness and privacy concerns to societal-level factors, such as declining confidence in public institutions and growing concerns about security and identity theft (ASA Task Force on Improving the Climate for Surveys 2017). However, precisely why people decline, and the degree to which these drivers vary across households from diverse sociodemographic backgrounds, remains unclear.
Using data from over 80 unstructured interviews conducted with households that did not respond as of the third mailing to the 2019 National Household Education Survey (NHES:2019), the presentation outlines the combination of factors that influenced interview participants not to respond to the survey request. This study, rooted in the ethnographic tradition and the emic approach, combines 90-minute in-person interviews with household observations to explore the cultural frameworks participants use to interpret their world and guide behavior. Interviews were conducted with households in four geographically diverse locations and in English and Spanish.
The presentation describes seven typologies that offer insight into why these individuals chose not to respond. We examined participants’ behaviors and attitudes across several key factors identified through analyzing interview and observational data. Each participant was placed in a single typology that explained the primary reason for nonresponse. For some typologies, excessive time constraints, having multiple adults in the household, and/or mail delivery challenges stopped participants from responding. For others, their beliefs about the federal government, concerns over data security, and/or assuming the NHES survey request was not relevant to them drove their decision. The findings can shed light on the various drivers that contribute to survey nonresponse to the NHES and to household surveys more generally.
Advance letters have been shown to increase survey response rates by alerting potential respondents to the upcoming survey request (de Leeuw et al., 2007; Spruyt and Droogenbroeck, 2014). In line with this expectation, the National Household Education Survey (NHES), an address-based household study that predominately uses a web-push design, typically sends households an advance letter a week before sending the initial survey request. The 2019 administration included an experiment (1) to assess whether sample member engagement could be increased by using a more intensive advance mailing campaign that builds familiarity with the NHES prior to notifying the household about the survey request; and (2) to confirm whether there is a benefit to sending an advance letter in an NHES web-push design. For this experiment, sample members were randomly assigned to an advance mailing campaign condition, an advance letter-only condition, or a no-advance-mailings condition. Within the advance mailing campaign condition, sample members received two oversized, color postcards prior to the advance letter (for a total of three advance mailings). These postcards presented interesting statistics from past NHES administrations, but they did not mention that the household had been sampled for NHES:2019. It was hypothesized that the advance mailing campaign would increase the response rate to earlier mailings, as compared to both the advance letter-only and no-advance-mailings conditions, by increasing awareness of the NHES (Lavrakas et al., 2004).
We will present, within the context of a web-push design, how the advance letter and mailing campaign motivated survey participation at different stages of administration. We will also assess whether the effect on response rates and response mode varied by selected household characteristics. Our presentation will provide insights on the effectiveness of advance mailings in a web-push design and inform decisions on the ideal number and types of advance mailings.
One cause of nonresponse is a lack of familiarity and engagement between respondents and the survey organization. Maximizing brand/sponsor recognition and topic salience is reasoned to raise survey response rates. One way to do this is to develop engaging mailing materials with consistent branding. The 2019 National Household Education Survey (NHES) is testing new mailing materials that provide information about the NHES and carry consistent branding. The test begins in early January 2019. Most addresses are sent an advance letter, but a randomly assigned experimental group (n ≈ 23,000) will be mailed two oversized, brightly colored postcards one and two weeks before the advance letter. The postcards will not instruct the sampled household to do anything or mention that it has been sampled for the NHES. Instead, their purpose is to build awareness and recognition of the NHES by highlighting interesting findings from recent past NHES administrations. They will also introduce sample members to the survey “branding” by including logos that will appear on subsequent mailings. We hypothesize that the postcards will make the people at the address more familiar with the NHES, so that when they receive the survey invitation, they will be more likely to recognize where it is coming from.
We will explain the process of developing the materials used in the postcard awareness campaign. We will also look at response rates over the first two months of data collection to NHES:2019 – both overall and for subgroups of interest (e.g., households with children). This preliminary analysis will determine whether those who receive the awareness mailing campaign are more likely to respond to the actual survey. Our findings will provide insight into the importance of building awareness and engagement by raising topic salience and using branding before the first survey invitation is sent.
The 2017 National Household Education Survey (NHES) web test was the first time NHES data were collected almost entirely online. The intent of testing this mode was to determine the feasibility of using web as the primary mode in the next full-scale collection in 2019. Households were mailed information about how to access the NHES web instrument; they did not have the option to complete a paper questionnaire. The NHES includes a screener and four topical instruments. This presentation will focus on two operations experiments that were conducted to determine if lower-cost mailing strategies could be implemented without impacting the response rate.
One experiment tested whether response rates would be impacted for sample members who received a smaller, letter-sized envelope for their first two screener mailings instead of the standard full-sized envelope. Preliminary results show that using a smaller envelope did not produce a significantly lower response rate than the full-sized envelope (43% versus 43%, respectively). We will next explore whether the envelope size impacted response rates for specific subgroups of the sample.
Another experiment tested whether using FedEx for the third screener reminder, which is the standard approach, or USPS First Class mail in a cardboard priority mail envelope would influence the response rate. Preliminary results show that response rate was significantly lower for the screener phase of the survey for those sent the priority mail envelope than for those sent the FedEx (42% versus 45%, respectively). However, the priority mail envelope did not appear to have a negative impact on response rates to the topical questionnaires. We will next explore whether the priority mail envelope impacted response rates for specific subgroups of the sample.
Both experiments have the potential to decrease operational costs in future NHES administrations or in other large-scale national mail surveys of individuals or households.
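The response-rate comparisons above (e.g., 42% versus 45% for the Priority Mail versus FedEx treatments) can be checked with a standard two-proportion z-test. The sketch below is illustrative only: the abstracts do not report per-arm sample sizes, so the counts and arm sizes of 5,000 are assumed for the example.

```python
from math import sqrt, erf

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two independent proportions."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)                    # pooled proportion under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal CDF, Phi(x) = 0.5*(1 + erf(x/sqrt(2)))
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical arm sizes (not reported in the abstract): 5,000 addresses per treatment.
# Priority Mail 42% (2,100 responses) vs. FedEx 45% (2,250 responses):
z, p = two_proportion_z_test(2100, 5000, 2250, 5000)
```

With arms of this size, a three-percentage-point gap is highly significant, consistent with the abstract's finding; the envelope experiment's 43%-versus-43% result would, by the same test, yield z = 0 and no evidence of a difference.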
Questionnaire length is often considered a key component of respondent burden and is thought to suppress response rates. But does a shorter questionnaire improve the representativeness of the resulting completed questionnaires enough to offset the reduction in data collected? In this paper, we will look at the results of an experiment that sent nonrespondents one of two questionnaires that differed in length: one the same questionnaire sent in prior mailings, the other a shortened form. We will present the difference in overall response rate between the short and long forms and examine whether the short form affected estimates by bringing in populations that tended to be less likely to respond. Our analysis finds limited changes in response distributions among short-form respondents on certain questionnaire items. This potentially indicates a reduction in bias and has implications for survey designers as they weigh tradeoffs between questionnaire length and response rates.
The potential to increase survey response by sending email reminders to sampled cases is seductive because email messages are inexpensive to send. However, privacy concerns may hinder respondents from providing email addresses within survey instruments or, worse, may scare them away from responding to the survey entirely. This paper presents the results of a 2016 email experiment (n = 35,000) designed to understand the impact of asking for email addresses in a two-stage household survey so that an email reminder strategy could be evaluated for future collections. Half of the respondents to the screener stage of the web survey were asked to provide the email address of the adult in the household who would be asked to complete the second-stage survey. Most of the time, the screener respondent was also the person asked to respond to the second stage of the survey, but in fewer than a third of cases, the screener respondent was asked for the email address of another adult in the household. Though email addresses were not used for nonresponse follow-up, the experiment yielded data about item nonresponse to the screener email question and about screener breakoffs, which lead to unit-level nonresponse for the second stage of the survey. Additionally, respondents who provided email addresses were sent a “thank you” email, thereby providing data about the number of email bounce-backs, an indicator of the quality of the email addresses respondents provided. Results indicate that, for the most part, the risk of unintended consequences of an email strategy is relatively low, with a low breakoff rate and relatively high item response and data quality. However, the authors found that asking for the email addresses of other household adults yielded lower item response to the email question and appeared to encourage proxy responses to the second stage of the survey.
In January 2011, the National Center for Education Statistics (NCES) conducted a large-scale field test of a multi-mode survey about childcare and parent involvement in children’s education with a nationally representative sample of approximately 41,000 addresses in the United States. The primary data collection mode was a two-phase mail survey: a screener questionnaire to determine household eligibility (the presence of children), followed by a topical questionnaire sent only to eligible households. There were two nonresponse mail follow-ups and a telephone follow-up for the screener, which is the focus of this paper.
The paper examines the characteristics of early and late screener responders in the National Household Education Survey (NHES) field test. The preliminary analysis suggests that households that responded to the first mailing of the screener questionnaire differ from those that responded after the second follow-up. In this paper, we will use data from the address vendor frame to describe the characteristics of responders and nonresponders for each of the four waves of screener response. We will explore the changing portrait of respondents over the survey response period. The specific characteristics examined include the gender, age, education level, race/ethnicity, income, and marital status of the head of the household; the number of adults in the household; and the household’s residency type.
In addition, we will look at some of the characteristics of the respondents under two different questionnaire treatments: a 20-question screener versus a 5-question screener. Preliminary analysis suggests that households with children may be more likely to respond to the longer version of the screener questionnaire. We will explore at which mailing phase they tend to respond and how that affects the availability of the sample for the topical stage of the survey.
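Comparing a categorical characteristic (such as the presence of children) across the four screener response waves amounts to a chi-square test of homogeneity. The sketch below is illustrative only: the counts are hypothetical, since the abstract reports no tabulations, and a large statistic relative to the critical value would indicate that the composition of respondents shifts across waves.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a 2-D contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand   # expected count under homogeneity
            stat += (obs - exp) ** 2 / exp
    return stat

# Hypothetical counts of households with/without children by screener waves 1-4.
table = [
    [620, 480, 350, 250],       # households with children
    [1380, 1520, 1650, 1750],   # households without children
]
stat = chi_square_statistic(table)
df = (len(table) - 1) * (len(table[0]) - 1)   # (2-1)*(4-1) = 3
CRITICAL_05_DF3 = 7.815  # chi-square critical value, alpha = 0.05, df = 3
shifts_across_waves = stat > CRITICAL_05_DF3
```

In this made-up example, households with children respond disproportionately in the early waves, so the statistic far exceeds the critical value, illustrating the kind of early-versus-late responder difference the paper investigates.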
Since 1991, the National Center for Education Statistics (NCES) has used the National Household Education Surveys Program (NHES) to collect data on such topics as early childhood care and education, children’s readiness for school, and parent involvement in education. Surveys were conducted approximately every other year from 1991 through 2007, and each of these prior administrations used random digit dial (RDD) sampling and telephone data collection from landline telephones only. Telephone interviews were conducted using computer-assisted telephone interviewing (CATI) to accommodate the survey’s complex skip patterns and automated within-household sampling techniques. However, like most RDD surveys, NHES response rates have been declining over time, and the increase in households converting from landlines to cell phone-only service has raised concerns about population coverage.
In an effort to address these concerns, NCES opted to redesign the NHES data collection methodology. After a smaller-scale feasibility pilot test in 2009, NCES began a large-scale methodological field test of the redesigned multi-mode survey in January 2011, with a national sample of approximately 60,000 addresses in the United States. The primary data collection mode in the redesigned NHES was a two-phase self-administered mail survey. The field test included multiple embedded experiments intended to determine how to maximize response rates and population coverage, including the use of pre-notice letters, differential incentive levels, different versions of the first-stage screener instrument, and different mailing methods.
This paper explores the effects of the 2011 Field Test experiments on response rates. The analyses described below illustrate that sending a pre-notice letter, mailing the second nonresponse follow-up via Federal Express, and increasing monetary incentives each had a statistically significant positive impact on the response rate in the first stage of data collection. Second-stage differences in response rates were related to both first- and second-stage experimental treatments, including incentive levels, magnet receipt, and questionnaire customization. Interaction effects between the first-stage and second-stage incentives were also observed, which may suggest that respondents make the connection between the two stages of the survey rather than viewing them as separate inquiries.