IES Blog

Institute of Education Sciences

From Data Collection to Data Release: What Happens?

In today’s world, much scientific data is collected automatically from sensors and processed by computers in real time to produce instant analytic results. People have grown accustomed to this immediacy and expect to get information quickly.

At the National Center for Education Statistics (NCES), we are frequently asked why, in a world of instant data, it takes so long to produce and publish data from surveys. Although the timeliness of federal data releases has improved, data compiled by automated systems differ fundamentally from data requested from federal survey respondents. Federal statistical surveys are designed to capture policy-related and research data from a range of targeted respondents across the country, who may not always be willing participants.

This blog provides a brief overview of the survey data processing framework; the survey design phase that precedes it is itself a highly complex and technical process. In contrast to a management information system, in which an organization has complete control over data production, federal education surveys are designed to represent the entire country and require coordination with other federal, state, and local agencies. After those coordination activities have concluded and the survey response periods have ended, much work remains before the data can be released.

Survey Response

One of the first sources of potential delay is that some jurisdictions or individuals are unable to complete their surveys on time. Unlike opinion polls and online quizzes, which accept responses from anyone who chooses to participate (convenience samples), NCES surveys use rigorously constructed samples meant to properly represent specific populations, such as states or the nation as a whole. To ensure proper representation within the sample, NCES follows up with nonresponding sampled individuals, education institutions, school districts, and states to secure the maximum possible survey participation. Some large jurisdictions also have extensive survey operations of their own to conclude before they can provide information to NCES: the New York City school district, for example, which is larger than about two-thirds of all state education systems, must first gather information from all of its schools before it can respond to NCES surveys. Receipt of data from New York City and other large districts is essential to compiling nationally representative data.

Editing and Quality Reviews

Waiting for final survey responses does not mean that survey processing comes to a halt. One of the most important roles NCES plays in survey operations is editing and conducting quality reviews of incoming data, which take place on an ongoing basis. These quality reviews use a variety of strategies to make cost-effective and timely edits to the incoming data. For example, in the Integrated Postsecondary Education Data System (IPEDS), individual higher education institutions upload their survey responses and receive real-time feedback on responses that are out of range compared with prior submissions or that do not align with one another logically. All NCES surveys use similar logic checks, in addition to a range of other editing checks appropriate to the specific survey; these checks typically look for responses that are out of range for a certain type of respondent.
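To make these edit checks concrete, here is a minimal sketch, in Python, of the kind of range and logic checks described above. The field names, tolerances, and rules are hypothetical illustrations, not actual IPEDS edit specifications.

```python
# Minimal sketch of automated survey edit checks. The field names,
# thresholds, and rules are hypothetical illustrations, not actual
# IPEDS edit specifications.

def edit_check(response: dict, prior: dict) -> list:
    """Return a list of flags for a single institution's submission."""
    flags = []

    # Range check: value must fall within plausible bounds.
    if not 0 <= response["graduation_rate"] <= 100:
        flags.append("graduation_rate out of range [0, 100]")

    # Year-over-year check: flag large swings vs. the prior submission.
    if prior and prior["enrollment"] > 0:
        change = abs(response["enrollment"] - prior["enrollment"]) / prior["enrollment"]
        if change > 0.25:  # hypothetical 25% tolerance
            flags.append("enrollment changed >25% from prior year")

    # Logic check: component counts must not exceed their total.
    if response["full_time"] + response["part_time"] > response["enrollment"]:
        flags.append("full_time + part_time exceeds total enrollment")

    return flags

# Example: a submission whose parts exceed its reported total.
flags = edit_check(
    {"graduation_rate": 62, "enrollment": 5000, "full_time": 4200, "part_time": 900},
    {"enrollment": 4900},
)
print(flags)  # ['full_time + part_time exceeds total enrollment']
```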

Although most checks are automated, some particularly complicated or large responses may require individual review. For IPEDS, the real-time feedback described above is followed by quality review checks conducted after the full dataset has been collected. This can result in individualized follow-up and review with institutions whose data still raise substantive questions.

Sample Weighting

To lessen the burden on the public and reduce costs, NCES collects data from selected samples of the population rather than taking a full census for every study. In all sample surveys, a range of additional analytic tasks must be completed before data can be released. One of the more complicated tasks is constructing weights, based on the original sample design and the survey responses, so that the collected data properly represent the nation and/or states, depending on the survey. These sample weights are designed so that analyses can be conducted across a range of demographic or geographic characteristics and properly reflect the experiences of individuals with those characteristics in the population.
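As a rough illustration of how such weights work, the toy sketch below gives each sampled unit a base weight equal to the inverse of its selection probability and then redistributes nonrespondents’ weight to respondents within an adjustment class. All numbers are invented, and real NCES weighting procedures involve many more steps.

```python
# Toy illustration of sample weighting: a unit sampled with probability p
# gets a base weight of 1/p (it "stands for" 1/p population units), and
# base weights of nonrespondents are redistributed to respondents within
# an adjustment class. All numbers are made up for illustration.

# (selection probability, responded?) for six sampled schools in one class
sample = [(0.02, True), (0.02, False), (0.05, True),
          (0.05, True), (0.10, False), (0.10, True)]

base_weights = [1 / p for p, _ in sample]

# Nonresponse adjustment: inflate respondent weights so the class still
# represents the full base-weight total.
total = sum(base_weights)
respondent_total = sum(w for w, (_, r) in zip(base_weights, sample) if r)
adjustment = total / respondent_total

final_weights = [w * adjustment for w, (_, r) in zip(base_weights, sample) if r]

print(f"base total: {total:.0f}, respondent total: {respondent_total:.0f}")
print(f"adjustment factor: {adjustment:.3f}")
print([round(w, 1) for w in final_weights])  # [80.0, 32.0, 32.0, 16.0]
```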

If the survey response rate is too low, a nonresponse bias analysis must be completed to ensure that the results will be sufficiently reliable for public use. For longitudinal surveys, such as the Early Childhood Longitudinal Study, multiple sets of weights must be constructed so that researchers using the data can appropriately account for respondents who answered some but not all of the survey waves.
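One common ingredient of such a bias analysis is comparing respondents with the full sample on characteristics known for every sampled unit from the sampling frame. The sketch below, with invented counts and a hypothetical urban/rural flag, computes the relative bias for each group.

```python
# Sketch of one common nonresponse bias check: compare respondents with
# the full sample on a characteristic known for every sampled unit from
# the sampling frame (here, a hypothetical urban/rural flag). Invented data.

full_sample = {"urban": 600, "rural": 400}   # frame counts for all sampled units
respondents = {"urban": 480, "rural": 240}   # counts among respondents only

for group in full_sample:
    frame_share = full_sample[group] / sum(full_sample.values())
    resp_share = respondents[group] / sum(respondents.values())
    rel_bias = (resp_share - frame_share) / frame_share
    print(f"{group}: frame {frame_share:.1%}, respondents {resp_share:.1%}, "
          f"relative bias {rel_bias:+.1%}")

# urban: frame 60.0%, respondents 66.7%, relative bias +11.1%
# rural: frame 40.0%, respondents 33.3%, relative bias -16.7%
# Large relative biases signal that weighting adjustments (or further
# follow-up with underrepresented groups) are needed before release.
```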

NCES surveys also include “constructed variables,” such as socioeconomic status or family type, to facilitate more convenient and systematic use of the survey data. Other types of survey data require special analytic considerations before they can be released. Student assessment data, such as those from the National Assessment of Educational Progress (NAEP), require a number of highly complex processes to ensure proper estimation for the various populations represented in the results. For example, the standardized scoring of multiple-choice and open-ended items alone can take thousands of hours of design and analysis work.
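To make the idea of a constructed variable concrete, here is a toy sketch that builds a simple socioeconomic status composite by standardizing several components and averaging them. The components, values, and method are purely illustrative; the actual composites NCES uses are defined in each survey’s documentation.

```python
# Toy sketch of a constructed variable: a socioeconomic status (SES)
# composite built by z-scoring several components and averaging them.
# The components and method are illustrative, not NCES's actual definition.
from statistics import mean, stdev

records = [
    {"parent_ed_years": 12, "log_income": 10.4, "occ_prestige": 35},
    {"parent_ed_years": 16, "log_income": 11.2, "occ_prestige": 55},
    {"parent_ed_years": 14, "log_income": 10.9, "occ_prestige": 48},
]

components = ["parent_ed_years", "log_income", "occ_prestige"]
stats = {c: (mean(r[c] for r in records), stdev(r[c] for r in records))
         for c in components}

for r in records:
    zs = [(r[c] - stats[c][0]) / stats[c][1] for c in components]
    r["ses"] = mean(zs)  # the constructed variable

print([round(r["ses"], 2) for r in records])
```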

Privacy Protection

Release of data by NCES carries a legal requirement to protect the privacy of our nation’s children. Each NCES public-use dataset undergoes a thorough evaluation to ensure that it cannot be used to identify responses of individuals, whether they are students, parents, teachers, or principals. The datasets must be protected through item suppression, statistical swapping, or other techniques to ensure that multiple datasets cannot be combined in such a way as to identify any individual. This is a time-consuming process, but it is incredibly important to protect the privacy of respondents.
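The sketch below illustrates, in highly simplified form, two of the techniques named above: suppressing small table cells and swapping an attribute between randomly paired records. Actual NCES disclosure-avoidance procedures are far more elaborate, and their parameters are intentionally not published.

```python
# Highly simplified sketch of two disclosure-avoidance techniques:
# suppressing small table cells and swapping attributes between a random
# subset of records. Real NCES procedures are more elaborate, and their
# parameters are intentionally not public.
import random

def suppress_small_cells(table: dict, min_count: int = 3) -> dict:
    """Replace counts below min_count with a suppression marker."""
    return {k: (v if v >= min_count else "‡") for k, v in table.items()}

def swap_records(records: list, key: str, rate: float, seed: int = 7) -> list:
    """Swap the value of `key` between randomly paired records."""
    rng = random.Random(seed)
    out = [dict(r) for r in records]
    n = int(len(out) * rate) // 2 * 2  # even number of records to pair
    idx = rng.sample(range(len(out)), k=n)
    for a, b in zip(idx[::2], idx[1::2]):
        out[a][key], out[b][key] = out[b][key], out[a][key]
    return out

print(suppress_small_cells({"district A": 240, "district B": 2}))
# {'district A': 240, 'district B': '‡'}

students = [{"id": i, "zip": z} for i, z in enumerate(["01002", "01002", "01027", "01035"])]
print(swap_records(students, key="zip", rate=0.5))
```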

Data and Report Release

When the final data have been received and edited, the necessary variables have been constructed, and the privacy protections have been implemented, there is still more that must be done to release the data. The data must be put in appropriate formats with the necessary documentation for data users. NCES reports with basic analyses or tabulations of the data must be prepared. These products are independently reviewed within the NCES Chief Statistician’s office.

Depending on the nature of the report, the Institute of Education Sciences Standards and Review Office may conduct an additional review. After all internal reviews have been conducted, revisions have been made, and the final survey products have been approved, the U.S. Secretary of Education’s office is notified 2 weeks in advance of the pending release. During this notification period, appropriate press release materials and social media announcements are finalized.

Although NCES can expedite some product releases, the work of preparing survey data for release often takes a year or more. NCES strives to balance timeliness against the reliable, high-quality information expected of a federal statistical agency, while also protecting the privacy of our respondents.

By Thomas Snyder

Partnering with Researchers Can Help State Leaders Build the Case for CTE

In Massachusetts, Career/Vocational Technical Education (CVTE) schools are renowned for offering rigorous, high-quality programs of study across a variety of disciplines. While CVTE graduates have always experienced high rates of success academically and in their careers, state leaders in Massachusetts wanted to know whether these outcomes directly result from the CVTE model. In 2017, the Massachusetts Department of Elementary and Secondary Education partnered with Shaun Dougherty (then a researcher at the University of Connecticut) and learned that CVTE students are significantly more likely than similar students who were not admitted to graduate from high school and earn an industry-recognized credential.

Demand for rigorous research on Career Technical Education (CTE) has increased as more policymakers ask questions about its impact on college and career readiness. State CTE Directors may be interested in the same questions as researchers (such as “Does CTE improve educational and career outcomes? Do different programs help different students? What types of programs offer students the highest economic returns?”) but may not think to seek out and collaborate with researchers, or know how to prioritize among the many research requests they receive.

This blog series, a partnership between Advance CTE and the Institute of Education Sciences (IES), seeks to break down the barriers between State CTE Directors and researchers and to encourage partnerships that can benefit both groups.

What Can Research with State Data Tell Us?

Research can be a powerful tool to help State CTE Directors understand what’s working, what isn’t working, and what needs to change. The findings described below provide examples of how strong partnerships between researchers and state policymakers can result in actionable research (click on state name for link to full article).

  • In Arkansas, students with greater exposure to CTE are more likely to graduate from high school, enroll in a two-year college, be employed, and earn higher wages. The study, which was rigorous but not causal, also found that students taking more CTE classes are just as likely to pursue a four-year degree as their peers, and that CTE provides the greatest boost to boys and students from low-income families.
  • Boys who attended CTE high schools in Connecticut experienced higher graduation rates and post-graduation earnings than similar students who did not attend CTE high schools. Further follow-ups using both postsecondary and labor data could provide information about college completion and employment and earnings for different occupational sectors.
  • CTE concentrators in Texas had greater enrollment and persistence in college than their peers. Although rates of CTE concentration decreased, student participation in at least some CTE programming, as well as the number of CTE credits earned, increased between the 2008 and 2014 cohorts. Unsurprisingly, the study also found differences by CTE program of study: Education & Training; Finance; Health Science; and Science, Technology, Engineering & Mathematics (STEM) were most strongly associated with postsecondary enrollment, particularly in baccalaureate programs.

How Can States Use CTE Research to Improve Policy and Practice?

Here are a few things states can do today to start building a CTE research base:

  • Create a codebook of CTE variables in your state’s data system: Include K-12, postsecondary, and labor force variables if you have them. Define the variables clearly – what do they measure, at what level (student, program, district), and for how many years have they been collected? Are the measures comparable across years and across datasets? (A sketch of one such codebook entry appears after this list.)
  • Maximize opportunities to collect longitudinal data: longitudinal databases that span education levels and connect to workforce outcomes permit researchers to conduct rigorous studies on long-term outcomes.
  • Identify universities in your state with strong education, economics, or public policy departments: Make a list of questions that policymakers in your state most want answered, and then approach universities with these proactively. Reach out to the chair(s) of the relevant departments to connect with faculty who may be interested in partnering to answer the questions. Universities can often apply for a research grant that will cover part or all of the funding for state personnel to work on the research project. IES, which provides funding of this nature, opens its next grant competition in summer 2020.
  • Reach out to your Regional Educational Lab (REL) or the REL Career Readiness Research Alliance to inquire about partnering on CTE research: The mission of these IES-funded labs is to provide research and evidence to help educators in the states in their region. For example, REL Central is currently working with four states to replicate the Arkansas study described above (see “Review of Career and Technical Education in Four States”).
  • Stay up to date on the latest research findings in CTE: New research is regularly posted on the CTE Research Network and other websites. This can help you get ideas for what types of research you would like to conduct in your state. Another good source of inspiration is the recommendations of the CTE technical workgroup, which was convened by IES in late 2017 to guide future CTE research directions.
  • Become familiar with how researchers approach CTE research: Learn why it is so challenging to measure CTE’s impact. The CTE Research Network will hold research trainings for different audiences—including state agency staff—beginning in the summer of 2020. Stay tuned!
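As promised above, here is a sketch of a single codebook entry showing the kind of metadata worth recording for each variable. The variable name, values, years, and notes are invented for illustration.

```python
# Hypothetical codebook entry for a single CTE variable, illustrating the
# metadata worth recording. The variable name, values, years, and notes
# are invented, not drawn from any state's actual data system.
cte_concentrator = {
    "name": "cte_concentrator",
    "definition": "Student completed 2+ credits in a single CTE program of study",
    "level": "student",                      # student / program / district
    "source": "K-12 student information system",
    "years_available": [2015, 2016, 2017, 2018, 2019],
    "values": {0: "non-concentrator", 1: "concentrator"},
    "comparability_notes": "Hypothetical example: definition changed in 2018; "
                           "2015-2017 values use a prior credit threshold.",
}
```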

Over the next several months, Advance CTE and IES will publish a series of Q&A blog posts with researchers and state CTE leaders talking about how their partnerships developed and what states can do to advance CTE research.

This blog series was co-authored by Corinne Alfeld at IES (corinne.alfeld@ed.gov) and Austin Estes from Advance CTE (aestes@careertech.org), with thanks to Steve Klein of Education Northwest for editorial suggestions. IES began funding research grants in CTE in 2017 and established a CTE Research Network in 2018. IES hopes to encourage more research on CTE in the coming years in order to increase the evidence base and guide program and policy decisions. At the same time, Advance CTE has been providing resources to help states improve their CTE data quality and use data more effectively to improve CTE program quality and equity.

Updates from the CTE Research Network!

“Does Career and Technical Education (CTE) work?” and “For whom does CTE work and how?” are questions on many policymakers’ and education leaders’ minds and ones that the CTE Research Network aims to answer. The mission of the Network, as described in a previous blog post, is to increase the amount of causal evidence in CTE that can inform practice and policy. The Network’s members, who are researchers funded by IES to examine the impact of CTE, have been busy trying to answer all of these questions.

This blog describes three Network updates:

  • Shaun Dougherty, of Vanderbilt University, and his colleagues at the University of Connecticut have been studying the effects of attending a CTE-focused high school among 60,000 students in Connecticut as part of their Network project. They recently reported that:
    • When compared with males attending traditional high schools, males who attended CTE schools were 10 percentage points more likely to graduate from high school and were earning 31 percent more by age 23. The authors noted that the more CTE courses available at a regular high school, the smaller the difference that attending a CTE high school makes.
    • Analyses of potential mechanisms behind these findings reveal that male students attending a technical high school have higher 9th grade attendance rates and higher 10th grade test scores. However, they are 8 percentage points less likely to attend college (though some evidence indicates that the negative impact on college attendance fades over time).
    • Attending a CTE high school had no measurable impact on female students. Further, the effects did not differ across student attributes such as race and ethnicity, free lunch eligibility, or residence in a poor, central-city school district.

The study results are being disseminated widely in the media, including via the Brookings Brown Center Chalkboard, The Conversation, and the National Bureau of Economic Research.

  • In other news, the CTE Research Network has welcomed a fourth IES-funded project, led by Julie Edmunds. Edmunds’ team is studying dual enrollment pathways in North Carolina, and one of the pathways focuses on CTE.
  • Finally, the two co-PIs for the Network Lead, Kathy Hughes and Shaun Dougherty, recently participated in a Q&A in Techniques magazine about the purpose of the CTE Network, how the Network will help the field of CTE, and how each of their careers has led them to this work.

The Network Lead has launched a new website where you can find information about ongoing work and sign up to receive their newsletter.

This post was written by Corinne Alfeld, the NCER-IES program officer responsible for the CTE research topic and the CTE Research Network. Contact her at Corinne.Alfeld@ed.gov with questions.

New Study on U.S. Eighth-Grade Students’ Computer Literacy

In the 21st-century global economy, computer literacy and skills are an important part of an education that prepares students to compete in the workplace. The results of a recent assessment show us how U.S. students compare to some of their international peers in the areas of computer information literacy and computational thinking.

In 2018, the U.S. participated for the first time in the International Computer and Information Literacy Study (ICILS), along with 13 other education systems around the globe. The ICILS is a computer-based international assessment of eighth-grade students that measures outcomes in two domains: computer and information literacy (CIL)[1] and computational thinking (CT).[2] It compares U.S. students’ skills and experiences using technology to those of students in other education systems and provides information on teachers’ experiences, school resources, and other factors that may influence students’ CIL and CT skills.

ICILS is sponsored by the International Association for the Evaluation of Educational Achievement (IEA) and is conducted in the United States by the National Center for Education Statistics (NCES).

The newly released U.S. Results from the 2018 International Computer and Information Literacy Study (ICILS) web report provides information on how U.S. students performed on the assessment compared with students in other education systems and describes students’ and teachers’ experiences with computers.


U.S. Students’ Performance

In 2018, U.S. eighth-grade students’ average score in CIL was higher than the average of participating education systems[3] (figure 1), while the U.S. average score in CT was not measurably different from the average of participating education systems.

Figure 1. Average computer and information literacy (CIL) scores of eighth-grade students, by education system: 2018

p < .05. Significantly different from the U.S. estimate at the .05 level of statistical significance.

¹ Met guidelines for sample participation rates only after replacement schools were included.

² National Defined Population covers 90 to 95 percent of National Target Population.

³ Did not meet the guidelines for a sample participation rate of 85 percent and is not included in the international average.

⁴ Nearly met guidelines for sample participation rates after replacement schools were included.

⁵ Data collected at the beginning of the school year.

NOTE: The ICILS computer and information literacy (CIL) scale ranges from 100 to 700. The ICILS 2018 average is the average of all participating education systems meeting international technical standards, with each education system weighted equally. Education systems are ordered by their average CIL scores, from largest to smallest. Italics indicate the benchmarking participants.

SOURCE: International Association for the Evaluation of Educational Achievement (IEA), the International Computer and Information Literacy Study (ICILS), 2018.

Given the importance of students’ home environments in developing CIL and CT skills (Fraillon et al. 2019), students were asked how many computers (desktop or laptop) they had at home. In the United States, eighth-grade students with two or more computers at home performed better in both CIL and CT than their U.S. peers with fewer computers (figure 2). This pattern was also observed in all participating countries and education systems.

Figure 2. Average computational thinking (CT) scores of eighth-grade students, by student-reported number of computers at home and education system: 2018

p < .05. Significantly different from the U.S. estimate at the .05 level of statistical significance.

¹ Met guidelines for sample participation rates only after replacement schools were included.

² National Defined Population covers 90 to 95 percent of National Target Population.

³ Did not meet the guidelines for a sample participation rate of 85 percent and is not included in the international average.

⁴ Nearly met guidelines for sample participation rates after replacement schools were included.

NOTE: The ICILS computational thinking (CT) scale ranges from 100 to 700. The number of computers at home includes desktop and laptop computers. Students with fewer than two computers include students reporting having “none” or “one” computer. Students with two or more computers include students reporting having “two” or “three or more” computers. The ICILS 2018 average is the average of all participating education systems meeting international technical standards, with each education system weighted equally. Education systems are ordered by their average scores of students with two or more computers at home, from largest to smallest. Italics indicate the benchmarking participants.

SOURCE: International Association for the Evaluation of Educational Achievement (IEA), the International Computer and Information Literacy Study (ICILS), 2018.

U.S. Students’ Technology Experiences

Among U.S. eighth-grade students, 72 percent reported using the Internet to do research in 2018, and 56 percent reported completing worksheets or exercises using information and communications technology (ICT)[4] every school day or at least once a week. Both of these percentages were higher than the respective ICILS averages (figure 3). The learning activities least frequently reported by U.S. eighth-grade students were using coding software to complete assignments (15 percent) and making video or audio productions (13 percent).

Figure 3. Percentage of eighth-grade students who reported using information and communications technology (ICT) every school day or at least once a week, by activity: 2018

p < .05. Significantly different from the U.S. estimate at the .05 level of statistical significance.

¹ Did not meet the guidelines for a sample participation rate of 85 percent and is not included in the international average.

NOTE: The ICILS 2018 average is the average of all participating education systems meeting international technical standards, with each education system weighted equally. Activities are ordered by the percentages of U.S. students reporting using information and communications technology (ICT) for the activities, from largest to smallest.

SOURCE: International Association for the Evaluation of Educational Achievement (IEA), the International Computer and Information Literacy Study (ICILS), 2018.

Browse the full U.S. Results from the 2018 International Computer and Information Literacy Study (ICILS) web report to learn more about how U.S. students compare with their international peers in their computer literacy skills and experiences.

By Yan Wang, AIR, and Linda Hamilton, NCES

[1] CIL refers to “an individual's ability to use computers to investigate, create, and communicate in order to participate effectively at home, at school, in the workplace, and in society” (Fraillon et al. 2019).

[2] CT refers to “an individual’s ability to recognize aspects of real-world problems which are appropriate for computational formulation and to evaluate and develop algorithmic solutions to those problems so that the solutions could be operationalized with a computer” (Fraillon et al. 2019). CT was an optional component in 2018. Nine out of 14 ICILS countries participated in CT in 2018.

[3] U.S. results are not included in the ICILS international average because the U.S. school level response rate of 77 percent was below the international requirement for a participation rate of 85 percent.

[4] Information and communications technology (ICT) can refer to desktop computers, notebook or laptop computers, netbook computers, tablet devices, or smartphones (except when being used for talking and texting).

Reference

Fraillon, J., Ainley, J., Schulz, W., Duckworth, D., and Friedman, T. (2019). IEA International Computer and Information Literacy Study 2018: Assessment Framework. Cham, Switzerland: Springer. Retrieved October 7, 2019, from https://link.springer.com/book/10.1007%2F978-3-030-19389-8.