The target population consists of all household members age 3 and older in the civilian noninstitutionalized population of the 50 states and the District of Columbia. It excludes military personnel and their families living on post, inmates of institutions, and residents of homes for the aged.
The CPS sample is a multistage stratified sample of approximately 72,000 assigned housing units from 824 sample areas designed to measure the demographic and labor force characteristics of the civilian noninstitutionalized population 15 years of age and older. Published data, however, focus on those ages 16 and over. Currently, the CPS samples housing units from lists of addresses obtained from the 2000 Decennial Census of Population and Housing. The sample is updated continuously for new housing built after the 2000 Census.
To improve the reliability of estimates of month-to-month and year-to-year change, eight panels of housing units are used to rotate the sample each month. A sample unit is interviewed for 4 consecutive months and then, after an 8-month rest period, for the same 4 months a year later. Every month, a new panel of housing units, or one-eighth of the total sample, is introduced. Thus, in a particular month, one panel is being interviewed for the first time, one panel for the second, and so on.
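The 4-8-4 rotation described above can be sketched in Python. This is only an illustration of the schedule as stated in the text; the function names and the use of integer month indexes (0, 1, 2, ...) are assumptions made for the example, not part of CPS processing.

```python
def interview_months(entry_month):
    """Month indexes in which a panel entering at entry_month is interviewed.

    Four consecutive months, an 8-month rest, then the same four
    calendar months one year later.
    """
    first_spell = [entry_month + k for k in range(4)]
    second_spell = [entry_month + 12 + k for k in range(4)]
    return first_spell + second_spell


def month_in_sample(entry_month, current_month):
    """Return the month-in-sample (1 to 8), or None if the panel is
    resting or not in the sample during current_month."""
    elapsed = current_month - entry_month
    if 0 <= elapsed <= 3:
        return elapsed + 1          # months-in-sample 1 through 4
    if 12 <= elapsed <= 15:
        return elapsed - 12 + 5     # months-in-sample 5 through 8
    return None


def panels_in_sample(current_month):
    """Map month-in-sample -> entry month for the panels active in current_month."""
    active = {}
    for entry in range(current_month - 15, current_month + 1):
        mis = month_in_sample(entry, current_month)
        if mis is not None:
            active[mis] = entry
    return active
```

Under this schedule, `panels_in_sample` always returns eight panels, one for each month-in-sample, which matches the statement that one-eighth of the sample is new in any given month.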
The first-stage sample selection is carried out in three major steps: definition of the primary sampling units (PSUs), stratification of the PSUs within each state, and selection of the sample PSUs in each state. There are currently (after the 2000 Decennial Census) 2,025 defined PSUs in the United States from which to draw the CPS sample. The CPS sample design calls for combining PSUs into strata within each state and selecting one PSU from each stratum. The CPS currently uses the Stratification Search Program (SSP), created by the Demographic Statistical Methods Division of the Census Bureau, to perform the PSU stratification. CPS strata in all states except Alaska are formed using the SSP. (A separate program performs the stratification for Alaska.) A total of 824 PSUs are selected for the sample. Using a procedure designed to maximize overlap, one PSU is selected per stratum with probability proportional to its 2000 population. This procedure uses mathematical programming techniques to maximize the probability of selecting PSUs that are already in sample while maintaining the correct overall probabilities of selection.
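As a rough illustration of selecting one PSU per stratum with probability proportional to population, the following Python sketch uses a plain weighted draw. It does not implement the overlap-maximizing mathematical programming procedure or the Stratification Search Program described above; the stratum names, PSU identifiers, and population figures are hypothetical.

```python
import random


def select_psu(stratum, rng):
    """Pick one PSU from a stratum with probability proportional to population.

    stratum: list of (psu_id, population) tuples.
    """
    psu_ids = [psu_id for psu_id, _ in stratum]
    populations = [population for _, population in stratum]
    return rng.choices(psu_ids, weights=populations, k=1)[0]


def select_sample(strata, seed=0):
    """Select one PSU per stratum; strata maps stratum name -> list of PSUs."""
    rng = random.Random(seed)   # seeded so the draw is reproducible
    return {name: select_psu(psus, rng) for name, psus in strata.items()}


# Hypothetical strata for one state (illustrative values only).
example_strata = {
    "stratum_1": [("PSU_A", 120_000)],
    "stratum_2": [("PSU_B", 50_000), ("PSU_C", 150_000)],
}
sample_psus = select_sample(example_strata, seed=1)
```

A stratum containing a single PSU returns that PSU with certainty. In the two-PSU stratum, `PSU_C` would be expected to be chosen about three times as often as `PSU_B` over repeated draws.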
The second stage of the CPS sample design is the selection of sample housing units within PSUs. The ultimate sampling units (USUs) are geographically compact clusters of approximately four addresses, corresponding to four housing units at the time of the census. Each month, about 72,000 housing units are assigned for data collection, of which about 60,000 are occupied and thus eligible for interview. The remainder are units found to be destroyed, vacant, converted to nonresidential use, containing persons whose usual place of residence is elsewhere, or ineligible for other reasons. Of the 60,000 eligible housing units, about 5 percent are not interviewed in a given month due to temporary absence (vacation, etc.), other failures to make contact after repeated attempts, the inability of persons contacted to respond, unavailability for other reasons, and refusals to cooperate (which make up about half of the noninterviews). Information is obtained each month for approximately 110,000 persons 15 years of age or older and for approximately 30,000 persons under the age of 15.
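The approximate monthly counts quoted above imply roughly 57,000 interviewed housing units. A small Python sketch of that arithmetic follows; it uses only the rounded figures from the text, and the function and key names are illustrative.

```python
ASSIGNED_UNITS = 72_000       # housing units assigned each month (approximate)
ELIGIBLE_UNITS = 60_000       # occupied units eligible for interview (approximate)
NONINTERVIEW_RATE = 0.05      # share of eligible units not interviewed (approximate)


def monthly_yield(assigned, eligible, noninterview_rate):
    """Break the assigned sample into ineligible, noninterview, and interviewed units."""
    ineligible = assigned - eligible
    noninterviews = round(eligible * noninterview_rate)
    return {
        "ineligible": ineligible,
        "noninterviews": noninterviews,
        # Refusals are described as about half of the noninterviews.
        "approx_refusals": noninterviews // 2,
        "interviewed": eligible - noninterviews,
    }


counts = monthly_yield(ASSIGNED_UNITS, ELIGIBLE_UNITS, NONINTERVIEW_RATE)
```

With these inputs, about 12,000 assigned units are ineligible, about 3,000 eligible units are noninterviews, and about 57,000 units are interviewed.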
Since 2005, the CPS sample has been selected based on 2000 census information. From 1995 to 2004, the sample was based on 1990 census information; samples prior to 1995 similarly used earlier censuses. The numbers of PSUs, housing units, and persons interviewed also differ in samples prior to 2005. Specifics on each CPS sample can be found in the technical documentation report for that year’s CPS.
The U.S. Bureau of the Census is the collection agent for the CPS and its supplements. Additional details on data collection and processing are provided in The Current Population Survey: Design and Methodology (Technical Paper 66) (U.S. Department of Commerce 2006).
Reference Dates. The reference period for the October Supplement is the current school year, which is assumed to be in progress in the interview month of October. The CPS labor force questions ask about labor market activities for 1 week each month. This week is referred to as the “reference week.” The reference week is defined as the 7-day period, Sunday through Saturday, which includes the 12th of the month.
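The reference-week rule can be expressed as a short date calculation. The Python sketch below simply applies the definition given above (the Sunday-through-Saturday week containing the 12th); the function name is an assumption made for the example.

```python
from datetime import date, timedelta


def reference_week(year, month):
    """Return (sunday, saturday) for the week that contains the 12th of the month."""
    twelfth = date(year, month, 12)
    # date.weekday() numbers Monday as 0 and Sunday as 6, so shift the
    # result to count the days elapsed since the most recent Sunday.
    days_since_sunday = (twelfth.weekday() + 1) % 7
    sunday = twelfth - timedelta(days=days_since_sunday)
    saturday = sunday + timedelta(days=6)
    return sunday, saturday


start, end = reference_week(2005, 10)
```

For October 2005, the 12th fell on a Wednesday, so the reference week under this rule runs from Sunday, October 9, through Saturday, October 15.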
Data Collection. Each month, Bureau of the Census field representatives attempt to collect data from the sample units during the week containing the 19th of the month. For the first month-in-sample interview, the interviewer visits the sample address to determine if the sample unit exists, if it is occupied, and if some responsible adult will provide the necessary information. If someone at the sample unit agrees to the interview, the interviewer uses a laptop computer to administer the interview. In most cases, the interviewer conducts subsequent interviews by telephone (use of telephone interviewing must be approved by the respondent) and does not actually visit the sample unit again until the fifth month-in-sample interview, the first interview after the 8-month resting period. Fifth-month households are more likely than any other household to be a replacement household; that is, a household in which all the previous month’s residents have moved out and been replaced by an entirely different group of residents. However, any person can change his or her household status during the time in sample: a person who leaves the household is deleted from the roster; a person who moves into the household is added to the roster.
Most month-in-sample 2 through 4 and 6 through 8 interviews are conducted by telephone. (For instance, 78.8 percent of the interviews for the October 2004 Supplement were conducted by telephone, which is highly consistent with the usual monthly results for telephone interviews.) Interviewers continue to visit households that lack telephones, have poor English language skills, or decline a telephone interview.
The interview begins with questions about the housing unit and the people who consider this address their usual residence. Basic demographic information is collected for each household member. Labor force information is collected for each civilian 15 years of age or older, although the data for 15-year-olds are not used in official BLS estimates. After the labor force information has been collected for all eligible household members, supplemental questions particular to that month’s interview may be asked of specific family members or the entire household.
Editing. Completed interviews are electronically transmitted to a central processor where the responses are edited for consistency and various codes are added. The edits effectively blank out all entries in inappropriate questions and ensure that all appropriate questions have valid entries.
Weighting is used in the CPS to adjust for sampling and unit nonresponse, and imputation is used to adjust for item nonresponse.
Weighting. For the basic CPS, the estimation procedure involves weighting the data from each sample person by the inverse of the probability of the person’s housing unit being in the sample. With some exceptions, sample persons within the same state have the same probability of selection. The CPS uses raking ratio estimation to derive the weights used to tabulate total U.S. and state estimates. The goal is to control the survey estimates of the population in specific subgroups to match independently obtained estimates of the civilian noninstitutionalized population in the 50 states and the District of Columbia. These population estimates are prepared monthly to agree with the most current set of population estimates that are released as part of the Census Bureau’s population estimates and projections program. In addition, household and family weights provide a basis for household-level estimates and estimates for married couples living in the same household.
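Raking ratio estimation can be illustrated with a minimal Python sketch: weights are scaled in turn so that weighted totals match each set of population controls, and the cycle repeats until the adjustments stop changing. The dimensions, categories, and control totals below are hypothetical, and the actual CPS procedure involves additional steps and control cells not shown here.

```python
def rake(records, controls, max_iter=100, tol=1e-8):
    """Rake starting weights to marginal control totals.

    records:  list of dicts with a 'weight' key and one key per dimension.
    controls: {dimension: {category: control_total}}.
    Returns the list of raked weights, in record order.
    """
    weights = [float(rec["weight"]) for rec in records]
    for _ in range(max_iter):
        largest_adjustment = 0.0
        for dimension, targets in controls.items():
            # Current weighted total in each category of this dimension.
            totals = {}
            for rec, w in zip(records, weights):
                totals[rec[dimension]] = totals.get(rec[dimension], 0.0) + w
            # Scale each weight so the category totals hit their controls.
            for i, rec in enumerate(records):
                factor = targets[rec[dimension]] / totals[rec[dimension]]
                weights[i] *= factor
                largest_adjustment = max(largest_adjustment, abs(factor - 1.0))
        if largest_adjustment < tol:
            break   # all margins already agree with the controls
    return weights


# Hypothetical sample persons with equal starting (basic) weights.
people = [
    {"sex": "M", "age": "16-24", "weight": 100},
    {"sex": "M", "age": "25+", "weight": 100},
    {"sex": "F", "age": "16-24", "weight": 100},
    {"sex": "F", "age": "25+", "weight": 100},
]
# Hypothetical independent population controls (both margins sum to 440).
population_controls = {
    "sex": {"M": 210, "F": 230},
    "age": {"16-24": 200, "25+": 240},
}
raked = rake(people, population_controls)
```

After raking, the weighted counts by sex and by age group in this toy example agree with the control totals, which is the property the survey weights are designed to have with respect to the independent population estimates.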
For all CPS data files, a final weight is prepared and used to compute the monthly labor force status estimates. The final weight, which is the product of several adjustments, including a nonresponse adjustment, is used to produce estimates for the various characteristics covered in the full monthly CPS. This weight is constructed from the basic weight for each person, which is the inverse of the person’s probability of selection for the survey. For supplements, such as the October Supplement, separate data processing is required, not only to edit responses for consistency and impute for missing values, but also to incorporate special weighting procedures to account for the fact that the supplement is targeting a special universe, such as school-age children, in contrast to the working-age labor force emphasis of the basic CPS.
Starting with the data collected in the October 1994 CPS, independent estimates have been based on civilian noninstitutionalized population controls for age, race, and sex established by the decennial census and adjusted to compensate for an undercount. These independent estimates are based on statistics from decennial censuses; statistics on births, deaths, immigration, and emigration; and statistics on the size of the Armed Forces.
Imputation. When a response is not obtained for a particular data item, or an inconsistency in reported items is detected, an imputed response is entered in the field. Before the edits are applied, the daily data files are merged and the combined file is sorted by state and PSU within state. This sort ensures that allocated values are from geographically related records; that is, missing values for records in Maryland will not receive values from records in California. This is an important distinction since many labor force and industry and occupation characteristics are geographically clustered. The edits are run in a deliberate and logical sequence. Demographic variables are edited first because several of these variables are used to allocate missing values in the other modules. The labor force module is edited next, since labor force status and related items are used to impute missing values for industry and occupation codes and so forth.
CPS edits use three imputation methods: relational imputation, longitudinal edits, and hot-deck imputation. Relational imputation infers the missing value from other characteristics in the person’s record or within the household. Longitudinal edits are used primarily in the labor force edits. If a question is blank and the record is in the overlap sample, the edit looks at the previous month’s data to determine whether the person had responded then for that item. If so, the previous month’s entry is assigned; otherwise, the item is assigned a value using the appropriate hot deck. The hot-deck method assigns a value from a record with similar characteristics. Hot decks are always defined by age, race, and sex. Other characteristics used in hot decks vary depending on the nature of the question being referenced. The imputation procedure is performed one item at a time. In a typical month, the imputation rate for demographic items is less than 1 percent. The rates for labor force items are slightly over 1 percent. Over all earnings items, the imputation rate is near 10 percent, with some items having much higher and others much lower nonresponse rates. In October 2005, the imputation rate for the basic school enrollment items ranged from 4 to 7 percent per item.
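The order in which the longitudinal edit and the hot deck are tried can be sketched as follows. This is a simplified Python illustration under stated assumptions: records are already sorted by state and PSU, the hot-deck cells are defined only by age group, race, and sex, and relational imputation and starting (cold-deck) values are omitted. The field names, the `source` flag, and the sample data are hypothetical and do not reflect the actual CPS edit specifications.

```python
def impute_item(records, item, cell_keys, previous_month=None):
    """Fill missing values of one item, trying the longitudinal edit first
    and the hot deck second.

    records:        list of dicts, assumed sorted by state and PSU.
    item:           name of the field to impute (None means missing).
    cell_keys:      fields that define a hot-deck cell.
    previous_month: {person_id: record} for persons in the overlap sample.
    """
    previous_month = previous_month or {}
    hot_deck = {}    # most recent reported value seen in each cell
    result = []
    for rec in records:
        rec = dict(rec)    # copy so the input records are left unchanged
        cell = tuple(rec[key] for key in cell_keys)
        if rec.get(item) is not None:
            hot_deck[cell] = rec[item]    # reported values refresh the deck
            rec["source"] = "reported"
        else:
            prior = previous_month.get(rec["person_id"], {}).get(item)
            if prior is not None:
                rec[item] = prior
                rec["source"] = "longitudinal"
            elif cell in hot_deck:
                rec[item] = hot_deck[cell]
                rec["source"] = "hot_deck"
            else:
                rec["source"] = "unresolved"    # no donor seen yet in this cell
        result.append(rec)
    return result


# Hypothetical records; "lfs" stands for labor force status.
october = [
    {"person_id": 1, "age_group": "16-24", "race": "A", "sex": "F", "lfs": "employed"},
    {"person_id": 2, "age_group": "16-24", "race": "A", "sex": "F", "lfs": None},
    {"person_id": 3, "age_group": "25-34", "race": "B", "sex": "M", "lfs": None},
]
september = {3: {"lfs": "unemployed"}}
imputed = impute_item(october, "lfs", ("age_group", "race", "sex"), september)
```

In this toy example, person 2 receives the value of the nearest preceding donor in the same age, race, and sex cell, while person 3 receives the value reported in the previous month. Because donors are taken from records earlier in the sorted file, sorting by state and PSU keeps donors geographically close to recipients.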
The October Supplement will always include the traditional school enrollment questions; questions on other topics will be added as occasion warrants. For example, over the last several decades NCES has funded additional items on education-related topics such as language proficiency, disabilities, computer use and access, student mobility, and private school tuition. Plans for additional questions in future years have yet to be determined.