NCES provides data files from many of the surveys it conducts. These data files are available for downloading from the NCES website, through the Electronic Catalog.
These pages give you information about finding and using the data files.
Advantages of Using Data Files
Researchers use data files to perform customized data analysis that is not available in the web tools and publications. For example, the publications and web tools may not offer an analysis that uses the particular variables a researcher needs.
The "Find Public Libraries, Branches, Bookmobiles" tool does not allow users to export or download data. The "Compare Libraries" tools (available for both public and academic libraries) do provide export or download capabilities.
In the web tools and publications, ratios (e.g., per capita) are calculated by NCES. These calculations may not match what the researcher requires; for example, the researcher may need a different formula. A researcher who downloads and works with the data files directly controls all calculations and aggregations.
Data files contain some data fields not used in the web tools, and therefore not included in a file downloaded or exported from the web tools. These include:
Disadvantages of Using Data Files
Users must download the entire data file, then sort the records as desired and delete the records they do not want. The web tools, by contrast, allow users to download only the records they are interested in.
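The select-and-sort step described above can be sketched in a few lines of Python. This is only an illustration: it assumes a comma-delimited extract, and the field names (STABR, LIBNAME, POPU_LSA) and sample records are made up for the example. Check the record layout of the actual data file for the real field names and file format.

```python
import csv
import io

# Hypothetical extract: the field names and values are illustrative only,
# not copied from an actual NCES record layout.
SAMPLE = """STABR,LIBNAME,POPU_LSA
OH,Example City Library,52000
TX,Sample County Library,310000
OH,Demo Township Library,8700
"""

def select_records(text, state):
    """Keep the records for one state, sorted by population served (largest first)."""
    rows = csv.DictReader(io.StringIO(text))
    kept = [row for row in rows if row["STABR"] == state]
    return sorted(kept, key=lambda row: int(row["POPU_LSA"]), reverse=True)

ohio = select_records(SAMPLE, "OH")
for row in ohio:
    print(row["LIBNAME"], row["POPU_LSA"])
```

With a real download, the same function could be given the contents of the data file instead of SAMPLE.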
Fields containing calculated values for the web tools are not included in the data files. However, the section "Calculated fields in the web tools" below explains how these fields are calculated, so users can perform the calculations themselves.
What's in them?
The record layouts (usually in one or more Appendices) give you:
Since different applications require files in different formats, data files are often available in several formats.
Files are often available in MS Access, ASCII, or SAS format (not all data files are available in SAS format). A number of older files are available only on diskette or magnetic tape.
The NCES Electronic Catalog data page for each data file will give you specific information on what format(s) the data file is available in, and how to obtain the data file.
Many data files are compressed (zipped) with WinZip for faster downloads.
Most documentation files are in Adobe PDF format.
Using MS Access format files
Data files in MS Access format (with the .MDB extension) can be opened directly in Microsoft Access and in any application that can import or read MS Access database files.
Using ASCII format files
ASCII-format files (with the .TXT extension) can be viewed and edited using any text editor (MS WordPad, TextPad, etc.), and imported into many software applications:
Some tips for using ASCII-format files:
Two types of data files are available: restricted-use data files and public-use data files.
Restricted-use data files contain all data as they were collected, edited, corrected, and imputed for non-response.
Restricted-use data files contain individually identifiable information, which is confidential and protected by law. The terms "restricted-use data" and "subject data" are synonymous.
The Education Sciences Reform Act of 2002 requires NCES to follow special procedures to protect the privacy of individual respondents.
From the "Restricted-Use Data Procedures Manual":
The goal is to maximize the use of statistical information, while protecting individually identifiable information from disclosure. The Restricted-Use Data Procedures Manual was created to provide a guide to the restricted-use data application process, as well as to explain the laws and regulations governing these data.
Researchers requiring access to the restricted-use data must obtain a license from NCES to use the data on loan. To obtain a license, the following information is necessary:
Click here for more information on how to obtain a Restricted-use license.
Public-use data files are the same as restricted-use data files, but they have had some data removed to protect the confidentiality of individually identifiable survey respondents.
Public-use data files are publicly available without restriction, and do not require a license. Survey data are coded or aggregated without individually identifiable information. Data that could be directly identified with one individual (salaries and wages for librarians for a library with one librarian, for example) are removed.
The library web tools use the public-use data files; that is, some of the data used by the tools have been removed as described above.
From the PLS documentation titled "Data File, Public-Use: Public Libraries Survey: Fiscal Year 2001" (1,036 KB):
Public-use data. On the public-use Public Library Data File, selected expenditures data (i.e., Salaries, Benefits, Total Staff Expenditures, and Other Operating Expenditures) for public libraries have been removed (i.e., the field is blank) when total full-time equivalent (FTE) staff is less than or equal to 2.00, to protect the confidentiality of respondents. These data may also be suppressed for other libraries to ensure that all states that have suppressed data have a minimum of 3 suppressed records. The library's Total Operating Expenditures are not affected by the suppression of these data. No data are suppressed on the public-use State Summary/State Characteristics Data File or the Public Library Outlet Data File.
Restricted-use data. No data are suppressed on the restricted-use Public Library Data File. The inclusion of all expenditures data irrespective of the number of employees enables the identification of individual salary data at some libraries.
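As a rough illustration of the basic suppression rule quoted above, the sketch below blanks the selected expenditure fields when total FTE staff is 2.00 or less. The field names are assumptions made for the example and should be checked against the record layout. The sketch does not implement the additional suppression that brings each affected state to a minimum of three suppressed records, so it is not a reproduction of the NCES procedure.

```python
# Field names are assumed for illustration; confirm them in the record layout.
SUPPRESSED_FIELDS = ("SALARIES", "BENEFIT", "STAFFEXP", "OTHOPEXP")
FTE_THRESHOLD = 2.00

def apply_basic_suppression(record):
    """Return a copy of the record with selected expenditures blanked (None)
    when total FTE staff is at or below the threshold."""
    out = dict(record)
    if out["TOTSTAFF"] <= FTE_THRESHOLD:
        for field in SUPPRESSED_FIELDS:
            out[field] = None
    return out

small = {"TOTSTAFF": 1.5, "SALARIES": 40000, "BENEFIT": 9000,
         "STAFFEXP": 49000, "OTHOPEXP": 12000, "TOTOPEXP": 70000}
large = dict(small, TOTSTAFF=6.0)

public_small = apply_basic_suppression(small)
public_large = apply_basic_suppression(large)
```

Note that Total Operating Expenditures (TOTOPEXP here) is left untouched, as the documentation states.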
Calculated fields in the web tools
Several fields used by the web tools are calculated from other fields in the data files. These calculated fields are not included in the data files downloaded directly from the NCES website. Calculated fields include per-capita and per-1,000-enrolled values, percent-of-total values, etc.
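Researchers who need these values can compute them from the downloaded fields. The following is a minimal sketch of a per-capita value and a percent-of-total value. The function names are made up for the example, and returning None for blank or zero denominators is an assumption about how to treat missing data, not a description of how the web tools behave.

```python
def per_capita(amount, population):
    """Amount divided by population; None when either value is unusable."""
    if amount is None or not population:
        return None
    return amount / population

def percent_of_total(part, total):
    """Part as a percentage of total; None when either value is unusable."""
    if part is None or not total:
        return None
    return 100.0 * part / total

# Example: 150,000 volumes held by a library serving 50,000 people.
volumes_per_capita = per_capita(150000, 50000)

# Example: 25 librarians out of 200 total staff.
librarian_share = percent_of_total(25, 200)
```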
Click below for:
Calculation of Enrollments for Academic Libraries
Enrollment figures are calculated from the Integrated Postsecondary Education Data System (IPEDS) Fall Enrollment Survey data for each postsecondary institution having a record in the Academic Libraries Survey data. Fall enrollment data for 1999-2000 were used with the Fiscal Year 2000 Academic Libraries Survey data. (The 1999-2000 fall enrollment data are not currently available for download; for IPEDS data availability, see Integrated Postsecondary Education Data System (IPEDS), Data Files.)
Total full-time equivalent enrollment is used to calculate several "per-person-enrolled" values, such as Total library expenditures per person enrolled (FTE) and Total library staff per 1,000 enrolled.
"Full-time equivalent" enrollment is calculated as the full-time enrollment plus one-third the part-time enrollment. See Calculated Data Fields for more information.
Enrollment figures are calculated for undergraduate and post-baccalaureate enrollments and the total of the two. For purposes of the "Compare Academic Libraries" tool only, "Post-baccalaureate" means enrollment in any program for which a baccalaureate degree is required for admission, including graduate studies (M.A. and Ph.D. programs), professional school studies (e.g., M.D. or J.D. programs), and post-baccalaureate certificate studies.
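The full-time equivalent formula given above (full-time enrollment plus one-third of part-time enrollment) and a "per 1,000 enrolled" value can be sketched as follows. The function names and sample figures are made up for the example, and any rounding applied by the web tools is not reproduced here.

```python
def fte_enrollment(full_time, part_time):
    """Full-time equivalent enrollment: full-time plus one-third of part-time."""
    return full_time + part_time / 3

def per_1000_enrolled(value, fte):
    """Scale a value (e.g., total library staff) to a rate per 1,000 FTE enrolled."""
    if not fte:
        return None
    return value * 1000 / fte

# Example: 900 full-time and 300 part-time students, 25 library staff.
fte = fte_enrollment(900, 300)
staff_rate = per_1000_enrolled(25, fte)
```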
Population fields - Differences between "Population of Legal Service Area" and "Unduplicated Population" for Public Libraries
From the PLS documentation titled "Data File, Public-Use: Public Libraries Survey: Fiscal Year 2001" (1,036 KB):
Survey Population Items
The PLS has three population items: (1) Population of Legal Service Area (reported for each public library by the state library agency), (2) Total Unduplicated Population of Legal Service Areas (a single figure, reported by the state library agency), and (3) Official State Total Population Estimate (reported by the state library agency). The total Population of Legal Service Area for all public libraries in a state may exceed the state's Total Unduplicated Population of Legal Service Areas or the Official State Total Population Estimate. This occurs when the state has one or more geographically adjacent libraries (for example, a county library and a city library within the county) that serve, and therefore count, the same population. Twenty-six states had such overlapping service areas in FY 2001.
In order to do meaningful analysis using Population of Legal Service Area data (for example, the number of books/serial volumes per capita), the data were adjusted to eliminate duplicative reporting in states with overlapping service areas. The Public Library Data File has a derived unduplicated population of legal service area for each library for this purpose, called POPU_UND. This value was prorated for each library by calculating the ratio of a library's Population of Legal Service Area to the total Population of Legal Service Area for all libraries in the state, and applying the ratio to the state's Total Unduplicated Population of Legal Service Areas. (The latter item is a single, state-reported figure. It is on the State Summary/State Characteristics Data File and is also called POPU_UND.)
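The proration described above can be illustrated with a short sketch. The library names and population figures are invented for the example, and whether and how NCES rounds the result is not stated in the passage, so the values here are left unrounded.

```python
def prorate_unduplicated(lsa_populations, state_unduplicated):
    """Share out the state's unduplicated population across libraries in
    proportion to each library's Population of Legal Service Area."""
    total_lsa = sum(lsa_populations.values())
    if total_lsa == 0:
        return {name: 0.0 for name in lsa_populations}
    return {
        name: pop * state_unduplicated / total_lsa
        for name, pop in lsa_populations.items()
    }

# A county library and a city library within the county count some of the
# same people, so the sum of their service areas (160,000) exceeds the
# state's unduplicated figure (120,000).
lsa = {"County Library": 100000, "City Library": 60000}
popu_und = prorate_unduplicated(lsa, 120000)
```

The prorated values add up to the state's unduplicated total, which is what makes per-capita figures comparable across states with overlapping service areas.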
Imputation
Imputation is a statistical means for providing a valid value for missing data. Note that data files, both public- and restricted-use, have had imputation applied, but the data used by the Library Statistics Program web tools have not.
Imputation in the Public Libraries Survey
From the PLS documentation titled "Data File, Public-Use: Public Libraries Survey: Fiscal Year 2001" (1,036 KB):
All libraries, including nonresponding libraries, were sorted into imputation cells based on the region and size of population served. Item imputation was performed on each record with nonresponse variables. The data are identified as either imputed (estimated) or reported (actual) on the survey data file, through the use of imputation codes.
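The idea of imputation cells and imputation codes can be illustrated with the sketch below. The passage does not say which statistical method is used within a cell, so the cell mean used here is purely a stand-in chosen for the example. The field names and the "reported", "imputed", and "missing" flags are also made up and are not the codes used on the survey data file.

```python
from collections import defaultdict
from statistics import mean

def impute_by_cell(records, field):
    """Fill missing values of `field` with the mean of the reported values in
    the same (region, size class) cell, and flag each record accordingly.
    The cell mean is an assumed stand-in, not the documented NCES method."""
    reported = defaultdict(list)
    for rec in records:
        if rec[field] is not None:
            reported[(rec["region"], rec["size_class"])].append(rec[field])

    result = []
    for rec in records:
        out = dict(rec)
        cell = (rec["region"], rec["size_class"])
        if rec[field] is not None:
            out[field + "_flag"] = "reported"
        elif reported[cell]:
            out[field] = mean(reported[cell])
            out[field + "_flag"] = "imputed"
        else:
            # No reported values in this cell, so nothing to impute from.
            out[field + "_flag"] = "missing"
        result.append(out)
    return result

libraries = [
    {"region": "Midwest", "size_class": "small", "visits": 100},
    {"region": "Midwest", "size_class": "small", "visits": 200},
    {"region": "Midwest", "size_class": "small", "visits": None},
    {"region": "West", "size_class": "large", "visits": None},
]
imputed = impute_by_cell(libraries, "visits")
```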
Imputation in Other Surveys
Other documents that have more information about imputation and how it is applied: