Infrastructure for Survey Data Processing in Urban and Planning Studies

Randall J. Olsen (Ohio State University, USA)
DOI: 10.4018/978-1-4666-0074-4.ch002


Applied social science research has increasingly come to rely on surveys to generate the detailed data, especially on firms, persons, and households, needed to study social phenomena. The methods used to collect survey data have changed substantially in the past quarter century and appear to be on the cusp of changing again with the rise of Web-based technologies. These changes can best be implemented by adopting computational methods designed for relational databases. This is true not only for survey data, but also for the administrative data that government agencies collect, store, and use. In this chapter, the author explains how these changes are best accommodated and how new telecommunications technologies, including Voice over Internet and smart phones, fit into this new paradigm. These techniques dominate survey data collection for urban studies and other fields.
Chapter Preview


A quarter century ago, survey research had two key tools: pencil and paper. The situation in the late 1980s was ripe for fundamental changes in how surveys handled the flow of data. There were two general responses to that problem: computerize the process a step at a time, with solutions targeted at particular stages, or target the process as a whole, reconstructing how surveys handle their data. In this chapter, we describe the second approach. In our judgment, survey organizations will arrive at the second approach sooner or later; the only issue is how painful the trip will be.

Today, when interviewers set out to conduct interviews, paper and pencil are largely relegated to sticky notes and jotted reminders. For the serious business of data collection, we discourage interviewers from writing things down, because handwritten notes (a) can constitute a breach of security and (b) make it difficult and costly to store, process and organize all the data a project needs in order to achieve the scientific goals of the study. For surveys, our invitation to the prospective respondent is increasingly, "our keyboard or yours?" That is, aside from relatively rare mail-out interviews that send booklets for the respondent to fill out, we either call the respondent on the telephone, with the interviewer entering the data on a PC at his or her workstation, or the interviewer visits the respondent’s home or place of business, again using a computer, this time a stand-alone laptop, to enter answers and other data secured during the interview. Even when we ask respondents to enter the data themselves, the preferred approach is a Web site rather than a booklet. A booklet must be mailed out, mailed back, received and tracked; its data must then be entered, edited and cleaned to deal with the inevitable respondent or data entry errors that result from failing to read the instructions, skipping items or writing down answers that are blatantly inconsistent with previous responses.

The process of handling what was, often, literally tons of paper was labor intensive, to say the least. Even before the first interviewer did the first case, forms needed to be printed, bound and mailed to interviewers. That was the easy part. For face-to-face interviewing, the completed forms needed to be mailed back to the central office, receipted, tracked against the case load, and packaged together with all the other relevant paper materials, such as any consent forms, receipts for incentive payments, specialized questionnaire modules for self-administered components, and paper forms that may have been filled out as part of the interview process, such as household rosters, calendars that were marked up for event histories, answer forms for tests and other items. These materials needed to be assembled into a folder, identifiers checked and, if necessary, added to the loose forms. Each expected item was systematically recorded as either present or missing, and this information was passed along to the interviewer’s supervisor so she could track the interviewer’s performance and productivity.
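The receipt-control step described above, in which each expected item is recorded as present or missing and the result is passed to the supervisor, can be sketched in a few lines of Python. This is only an illustration, not the system the chapter describes: the item names, the case and interviewer identifiers, and the functions `receipt_status` and `missing_by_interviewer` are all hypothetical.

```python
# Hypothetical list of materials expected in every case folder.
EXPECTED_ITEMS = (
    "questionnaire",
    "consent_form",
    "incentive_receipt",
    "household_roster",
)


def receipt_status(received, expected=EXPECTED_ITEMS):
    """Mark each expected item as 'present' or 'missing' for one case."""
    received_set = set(received)
    return {
        item: ("present" if item in received_set else "missing")
        for item in expected
    }


def missing_by_interviewer(cases, expected=EXPECTED_ITEMS):
    """Group missing items by interviewer, as a supervisor report might."""
    report = {}
    for case in cases:
        status = receipt_status(case["received"], expected)
        for item, state in status.items():
            if state == "missing":
                report.setdefault(case["interviewer"], []).append(
                    (case["case_id"], item)
                )
    return report


# Made-up example data: two returned case folders from one interviewer.
cases = [
    {
        "case_id": "0001",
        "interviewer": "INT-07",
        "received": ["questionnaire", "consent_form",
                     "incentive_receipt", "household_roster"],
    },
    {
        "case_id": "0002",
        "interviewer": "INT-07",
        "received": ["questionnaire", "household_roster"],
    },
]

report = missing_by_interviewer(cases)
```

Here `report` lists, for each interviewer, the case folders that arrived with items missing, which is the information the supervisor would have used to follow up.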

After receipt, the interview booklets passed to an editor who reviewed the form for completeness, unclear markings, missing answers and other irregularities. Depending on the study and the protocol, this initial review might lead to the case being set aside for a call to the interviewer asking her to clarify those items that were garbled, missing or in error, and if the interviewer could not do this and the items in question were crucial, either the interviewer or a member of the central office staff might place a call to the respondent to retrieve the necessary information.

Assuming the editor found the interview to be in order, the next step was data entry, often double entry to catch errors. Then came cleaning, often based upon cleaning specifications that had to be designed, agreed upon, programmed, tested and then run. Usually the process was iterative with multiple passes through the cleaning program until all errors were either resolved or accepted as irreconcilable. Once the data were cleaned, they were assembled into one or more files and frequencies run, usually with a program such as SAS or SPSS. From that point, staff created additional variables and the data were usually distributed on magnetic tape with a printed codebook that ran for hundreds of pages.
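Double entry catches keying errors by having two operators key the same form independently and then comparing the results field by field. The sketch below illustrates that comparison under simple assumptions: each pass is a dictionary keyed by a case identifier, and the function name `compare_double_entry`, the field names and the sample values are hypothetical rather than taken from any particular survey system.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, str]
Mismatch = Tuple[str, str, Optional[str], Optional[str]]


def compare_double_entry(
    first_pass: Dict[str, Record],
    second_pass: Dict[str, Record],
) -> List[Mismatch]:
    """Return (case_id, field, first_value, second_value) for each mismatch.

    A field or case keyed in only one pass is reported with None
    for the pass that lacks it.
    """
    discrepancies: List[Mismatch] = []
    for case_id in sorted(set(first_pass) | set(second_pass)):
        a = first_pass.get(case_id, {})
        b = second_pass.get(case_id, {})
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                discrepancies.append(
                    (case_id, field, a.get(field), b.get(field))
                )
    return discrepancies


# Made-up example: the second operator transposed the digits of one age.
first = {
    "0001": {"age": "34", "hh_size": "3"},
    "0002": {"age": "51", "hh_size": "2"},
}
second = {
    "0001": {"age": "34", "hh_size": "3"},
    "0002": {"age": "15", "hh_size": "2"},
}

mismatches = compare_double_entry(first, second)
```

Each mismatch would then be resolved by looking at the paper form. Agreement between the two passes shows only that the keying was consistent, not that the answer itself is plausible, which is why a separate cleaning step followed.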

One can appreciate that such a complex process used a lot of time and a lot of money. The labor-intensive process started with, and was driven by, a printed questionnaire. Besides making the survey process labor intensive, paper questionnaires made it expensive to add questions, as each additional question not only absorbed interviewer time but also required editing, data entry, cleaning and documentation.
