Searcher
Vol. 10 No. 6, June 2002
• FEATURE • 
The U.S. Census Bureau in the 21st Century
by Miriam A. Drake 
Article I, Section 2 of the U.S. Constitution mandates an official count of the population every 10 years. The first Census was completed in 1790. While the official purpose of the Constitution's mandate is to reapportion seats in the House of Representatives, the Census also provides a statistical history of the nation and its people and is an economic asset and tool of inestimable value. The Census shows not only where people live at the time of the count, but also their educational levels, income, and other vital data.

The U.S. Census Bureau has historically led in the use of information technology to collect, process, analyze, manipulate, and publish data. It is a prestigious bellwether of where public sector information is going. It has developed methods to manage large data sets and large data collection projects. Counting the population has been and continues to be a formidable task. The first Census in 1790 took 18 months to complete.[1] The Congress assigned responsibility for taking the Census to U.S. marshals from 1790 to 1840. From 1850 to 1900, the Department of the Interior was responsible for the enumeration.[2] In 1902, the Census Bureau was established as part of the Department of the Interior. In 1903, it was transferred to the Department of Commerce and Labor.

The 1790 Census revealed a population of 3.9 million people living in an area of 891,364 square miles. By 2000 the area of the U.S. had expanded to 3.6 million square miles and the population had grown to 281 million people. The number of questions on the Census has grown over the years to include more and more socio-economic data. The 2000 Census used short forms for most people and longer forms with detailed demographics for a sample of the population. 
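The figures above imply a sharp rise in population density. The following is a minimal Python sketch of that arithmetic. It uses only the rounded numbers quoted in this article, and the function name is illustrative.

```python
def density(population: float, area_sq_miles: float) -> float:
    """Return people per square mile."""
    return population / area_sq_miles


# Rounded figures quoted in the article, not official tabulations.
density_1790 = density(3.9e6, 891_364)
density_2000 = density(281e6, 3.6e6)
growth_factor = 281e6 / 3.9e6

print(f"1790: {density_1790:.1f} people per square mile")
print(f"2000: {density_2000:.1f} people per square mile")
print(f"Population grew roughly {growth_factor:.0f}-fold")
```

With these rounded inputs, density rises from about 4.4 to about 78 people per square mile, and the population grows roughly 72-fold.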

In addition to contributions in the development of information technology, the Census Bureau's statisticians have formulated better population sampling techniques. However, Congress has mandated that the population count for reapportioning the House of Representatives must be done person by person, with no sampling. All other data collection relies on sampling the population. 
 

Asset Value
Collecting, processing, analyzing, and distributing Census data cost millions of dollars. The return on this investment in terms of value to users and the economy is many times the cost. Wide availability of the data permits forecasting of population-related developments with greater accuracy. It supports decision making and planning based on facts instead of guesses and estimates. Industry, business, government, education, and other organizations use the Census of Population for many purposes, ranging from forecasting the needs for classrooms for our schools to locating fast food outlets. Other Census data products include data about foreign trade, manufacturing, transportation, and many other areas. The data products from the Census Bureau are especially valuable in making business decisions and planning industrial and urban development. 

Data collected by the Census Bureau answer thousands of questions and provide the basis for planning growth and development. How many high school teachers will we need in 2010? How many manufacturers operate in Gwinnett County, Georgia, and what products do they make? What is the average number of years of school completed in different geographic areas and among different groups in the population? Where are the highest income communities? Where are the poorest communities?
 

Value-Added Publishers 
Many third-party publishers distribute Census data into the marketplace. Most businesses and individuals need only subsets of any of the Census data compilations. Subsets tailored to the needs of specific groups or businesses are easier to use and satisfy the needs of users more quickly than having users try to find the data themselves. By selecting appropriate subsets and analyzing and mapping data, third-party publishers add value for their customers. Having the right set of data in the right format facilitates decision-making and planning. 

The sale of value-added Census data by third-party publishers represents a public/private partnership that pays off for everyone involved. While taxpayers pay for the Census, taxpayer funds do not have to be used to satisfy special needs. Publishers can serve these special markets and make a profit. 
 

The Role of Librarians 
In order to learn more about access and distribution of the Census, I sent surveys to two librarians associated with the Government Documents Round Table (GODORT) of the American Library Association. While these librarians could not speak for GODORT, they presented some interesting perspectives. 

How should U.S. Census data be distributed? 

Both librarians answered, at least in part, paper. Acid-free paper for preservation makes sense; however, it is not necessary to store hundreds of volumes of numeric data in every depository library. Selected depository libraries or trusted third parties can fulfill the need to protect and preserve the data. Given that more than half the households in the U.S. have access to the Internet, the need for paper seems odd except for preservation. 

The librarians indicated that a paper compilation of data for a particular community would be useful, because people often ask to check a quick fact. Looking up an answer to simple questions may be faster in paper than on the computer. Other desired forms of distribution included the Internet, DVD-ROMs, and CD-ROMs. They suggested that depository libraries have all the raw data on DVDs. One librarian commented that the Census Bureau's own Web site needed sufficient bandwidth and computer power to sustain service at peak times. 

How much do libraries rely on third-party publishers of value-added Census data?

One librarian said, "Not much." The other librarian indicated some reliance on third-party vendors. Both respondents are employed in academic libraries where the faculty and researchers often prefer to do their own data manipulation. Corporate libraries and information centers rely heavily on third-party products because specific data are often needed quickly. These libraries usually do not need the entire compilation of data. Public libraries also may rely on third-party products depending on the composition of their communities and the needs of local businesses for Census data.

How involved are libraries with state data centers? 

Again, there was a difference. One librarian said most depositories are not involved. The other librarian said it varied from state to state and that some university libraries are associated with the data centers. This situation seems odd because the state data centers often can provide useful help to librarians and data users. 

Since Census data are numeric and many librarians are not trained to manipulate numeric data, how much help can librarians offer Census data users? 

My own experience as a data user is mixed and mostly negative. The limitations arise because librarians are not trained to ask the right questions related to numeric data, especially economic and social data. Most librarians have been trained to deal with text rather than statistics. Younger librarians may have had more mathematics and statistics courses and be more attuned to describing and using numeric data. 

One respondent librarian stated, "If there is adequate online documentation, and there often is, librarians do not necessarily need to be trained to manipulate this data, but to know where a researcher can find online assistance." This librarian also pointed out that skill with spreadsheets could satisfy most needs. The other librarian agreed with the idea of spreadsheet skills and added, "The math is basic. If librarians don't know, they should be proactive to obtain the basic skills." 

How should Census data from 1790 forward and into the future be digitized, archived, preserved, and accessed? 

Both librarians pointed out the work of the Inter-University Consortium for Political and Social Research [http://www.icpsr.umich.edu] working with the University of Virginia [http://fisher.lib.virginia.edu/Census] in bringing about access to population Census data from 1790 to 1960. One librarian described the ideal as interactive access to enable building of tables across censuses. They also pointed out that the Census Bureau is converting some files to PDF format. Another ideal was for the Census Bureau to do the work; however, our librarians recognized the difficulty of obtaining funding for such a big job.

Access to all Census data would allow the study of trends and changes in population, business, manufacturing, foreign trade, transportation, and other aspects of society and the economy documented by the data. People studying earlier censuses now need to build their own tables and data sets for study. They often have significant challenges in access and construction of consistent data sets. The changes in geographic boundaries in metropolitan areas and the inevitable errors in earlier Censuses create obstacles and the need to temper results. 

How does GODORT work with the Bureau of the Census? 

"GODORT has proven a good forum for pooling needs and advice from librarians and communicating these with the Census Bureau." GODORT indeed provides a useful forum, satisfying the need and desire to improve Census products, to provide incentives for listening to end-users, and to transmit their difficulties, experiences, and suggestions to the Census Bureau. While the Census Bureau deals directly with different users and user groups, librarians can contribute ideas and suggestions from their user communities. 

What are GODORT's most important issues regarding Census data? 

Our librarian respondents spoke for themselves, not for GODORT. They indicated that the chief issues are permanent access to electronic materials, training, preservation of data, and migration to new formats. 

What role do you see for the depository libraries in maintaining and/or distributing digital federal data?

The librarians had different views. One librarian indicated that depositories should be the permanent repositories for print, DVD-ROM, and CD-ROM formats. As space continues to escalate in value in our cities and suburbs, it is not clear how long depository libraries can justify storage in prime space and maintain and preserve large print collections. As more and more public sector information becomes available on the Internet, the need for all depositories to store everything declines. Selected depositories and trusted third-party sites outside urban areas may be needed in the future. 

Both librarians saw the need for profiles of their local communities in print. One librarian suggested "cooperative projects with local governments to develop historic data sets for their own communities." Using Census data, old and new, provides an opportunity to produce data profiles of value and importance to local communities. 

The need to preserve the data is clear. The question is how to preserve the data and how to make them accessible in usable form in perpetuity. There are no easy or inexpensive answers. While the use of data on paper is limited to the quick lookup, acid-free paper is a reasonable storage medium for the long term. The uncertainty of the world situation clearly calls for preservation in several secure sites and in all formats. The best storage media for the long run are acid-free paper and silver halide microfilm.
 

Census Bureau Survey
In addition to soliciting views from librarians, we sent a survey to the Census Bureau. Several Census staffers collaborated to complete the survey.

Was the distribution of the 2000 Census completely digital? 

The data were distributed primarily through the Internet as well as CD-ROM, DVD-ROM, and paper. Census 2000 maps can be accessed online in PDF format. Maps for the 1990 Census had to be purchased on paper. More information on Census 2000 products is available at [http://www.census.gov/population/www/censusdata/c2kproducts.html]. The Bureau distributed about 50,000 printed pages, down from 450,000 in 1990.

What are the main ways Census data are accessed by users? 

"The Census Web site [http://www.census.gov] receives several million hits per day. Information is posted to the Web site as soon as it becomes available." "Users who need a few numbers for a few geographic areas, a few data tables, or a thematic map can go to the American FactFinder, a data retrieval tool on the (Census) Web site." One librarian indicated that American FactFinder was a "viable product" for retrieving data. Other means of access include FTP, DVD-ROM, and CD-ROM.

How does the Bureau interact with value-added distributors and publishers of Census data?

The Bureau realizes that there are "customers who may need the information in different formats or with additional functionality. Therefore we encourage our dissemination partners and others to tailor information to local needs, combine it with data from other sources, analyze it, or otherwise add value to it."

What role does the Bureau see for GPO and Federal Depository Libraries? 

The Bureau works closely with GPO and the depository libraries. GPO can obtain copies of Census information for distribution to the depository libraries. 

What role does the Bureau see for librarians in aiding Census users? 

"Many librarians are knowledgeable about the census data, access tools, maps, and census terminology and can guide users to the information they need." The Bureau also noted that depository libraries provide data from past censuses through their collections. 

When will the complete file of each Census from 1790 forward be made available online? 

"Once a census is taken, the Census Bureau provides a record of the responses to the National Archives, where they are kept confidential for a period of 72 years. Genealogists have been anxiously awaiting the release of the 1930 Census records, which were recently made available by the Archives. The Archives has not digitized files from previous censuses for online access, although several private organizations are doing so." 
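The 72-year confidentiality period described above can be expressed as simple arithmetic. This is a minimal sketch that works at whole-year granularity, which is an assumption made for illustration. The function names are hypothetical and do not come from the Bureau or the National Archives.

```python
CONFIDENTIALITY_YEARS = 72  # period stated in the Bureau's response


def release_year(census_year: int) -> int:
    """Year in which a census's individual records leave confidentiality."""
    return census_year + CONFIDENTIALITY_YEARS


def is_released(census_year: int, current_year: int) -> bool:
    """True if the records may be open in current_year (year granularity only)."""
    return current_year >= release_year(census_year)


print(release_year(1930))
print(is_released(1940, 2002))
```

Under this simplification, the 1930 records become available in 2002, which matches the release mentioned above, while the 1940 records are still closed in that year.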

What enhancements are being planned for users? 

Enhancements to American FactFinder are planned for implementation over the next several months. These include a new main page to help guide users; addition of FIPS codes with the Geographic Comparison Table for U.S. by state, by county and county by county subdivision by place; and "zoom by latitude and longitude to the Thematic Maps, Reference Maps and the geographic selection by map." In addition, the Bureau has been testing "an Advance Query function that will allow users to develop custom tabulations from the basic records with confidentiality restrictions and safeguards."

What is the Bureau's commitment to maintaining the Census data and making them available in perpetuity in usable form?

"Even though the Bureau does not anticipate removing any of the 1990 Census or 2000 Census data from its site, we are working with the Government Printing Office and the Federal Depository Library Program to provide additional long-term access through a depository library." 
 

Use of Information Technology
From 1790 to 1880, "Census data were tabulated by clerks who made tally marks or added columns of figures with a pen or pencil."[3] As the nation and its population grew, new methods were needed to tabulate and analyze the vast amounts of data collected by Census takers. In 1880, the Census Bureau first used "a tabulating machine, a wooden box in which a roll of paper was threaded past an opening where a clerk marked the tallies in various columns and then added up the marks when the roll was full." This operation made tabulating the data twice as fast.[4]

In 1890, Herman Hollerith assisted the Bureau with punch cards. Data were recorded by punching holes in the cards for the data elements. The cards were run through equipment that counted the holes. The Hollerith cards were developed from cards used by Joseph Marie Jacquard to control pattern weaving on looms.
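The idea of counting holes can be illustrated with a short Python sketch. This is a simplified model and not Hollerith's actual card layout or machinery: each card is represented as the set of positions punched on it, and the position names are hypothetical.

```python
from collections import Counter

# Hypothetical cards: each is the set of hole positions punched for one person.
cards = [
    {"male", "age_20_29", "farmer"},
    {"female", "age_30_39"},
    {"male", "age_30_39", "farmer"},
]


def tabulate(cards):
    """Count how many cards have a hole punched at each position."""
    tally = Counter()
    for card in cards:
        tally.update(card)  # each punched position adds one to its counter
    return tally


totals = tabulate(cards)
print(totals["male"], totals["female"], totals["farmer"])
```

Each pass of a card advances one counter per punched position, which is the same tallying that clerks had previously done by hand.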

The 1950 Census of Population used a Univac computer for tabulation of data. The Univac tabulated 4,000 items per minute. Punch cards were no longer suitable for recording data. For the 1960 Census, the Census Bureau and the National Bureau of Standards developed FOSDIC (film optical sensing device for input to computers). FOSDIC was used until the 2000 Census. Respondents completed the survey by filling in dots opposite the appropriate answers. The completed survey was photographed onto microfilm. FOSDIC read the dots and transferred the data to tape for computer input.[5]

The 1960 Census was the first to use the mail for collection of data. People were asked to complete the survey forms and hold them until the Census taker appeared to review and retrieve the form. Now the Census of Population is completed mostly by mailing forms to households and having them returned via the mail. 

The Census Bureau began making data available to the public early in the 20th century. From the 1920s to the 1950s, data were distributed on punch cards. In the 1960s, the Bureau began distribution on tape. By the 1980s, it became possible to distribute data on diskettes. Later the Bureau switched to CD-ROMs and the Internet. 

The Census Bureau has led in the development and use of technology for the collection and distribution of large data sets. The Bureau also has been a pioneer in the distribution of data on maps to illustrate demographic data for particular geographic areas or specific data items. 
 

Future 
Work on the Census of 2010 is underway. The 2010 Census forms will be mailed to households with addresses that receive mail. The forms will be returned by mail, scanned, and recorded. Households that do not return the forms or that do not receive mail will be visited by a Census taker who will record data about the residents using a hand-held device. This part of the operation will be paperless, more efficient, and perhaps more accurate.[6]

The Bureau has not indicated when households will complete their forms via the Internet. Since people can now file their income tax returns electronically, it seems reasonable to assume that in the near future people will be able to complete their Census forms via the Internet. Security may be a primary obstacle. Secure systems are essential to preserve the privacy and integrity of the Census process. 

Preservation and access in perpetuity are difficult problems. The Census Bureau and librarians recognize the challenges and are committed to finding solutions. The 1990 and 2000 Censuses are in digital form and can be preserved and archived. The cost of converting past censuses to digital formats may be too high for the Bureau, the GPO, or any single agency. Digitization may have to come from the private sector or some cooperative arrangement of government and nongovernment organizations. Preservation of the statistical documentation of our history is essential. The loss of records and archives on September 11th alone illustrated the need for preservation and archiving.

Footnotes

1. Census History and 20th Century Firsts, http://infoplease.com/spot/Census2.html

2. http://fisher.lib.virginia.edu/Census/background.

3. U.S. Census Bureau, Factfinder for the Nation, Washington, DC, May 2000, p. 10.

4. Ibid.

5. Ibid., p. 11 

6. Bob Brewin, "U.S. Census Bureau Plans for the First Paperless Tally in 2010," Computerworld, March 18, 2002, p. 5.


Correspondence with Professor Drake should go to the Library, Georgia Institute of Technology, Atlanta, GA 30332-0900, miriam.drake@library.gatech.edu.
© 2002