Talking isn't doing.
It is a kind of good deed to say well;
and yet words are not deeds.
- Shakespeare (King Henry VIII, Act 3, Scene 2)
Recent years have seen an explosion of survey research that purports to provide data about the world. These are studies in which
respondents provide "pencil-and-paper" answers to questions about their beliefs or values (the present),
what they remember (the past), or what they might do in some hypothetical situation (the future). Think of the time, money, and effort that could be saved if learning how drivers avoid collisions (or any other behavior or belief) were simply a matter of asking questions. There would be none of that messy going out into the world to examine what is actually happening, and no setting up of costly and arduous experiments that measure actual behavior.
As in most such cases, the promise of survey research is too good to be true. There are fundamental problems with self-report research.
First, statements about beliefs and future actions typically exhibit the "socially desirable response" (SDR) bias.
Cook & Campbell (1979) found that subjects are biased to say what they believe the researcher expects and to say what reflects
positively on themselves. Of special relevance here, drivers underreport their own aberrant behavior (Lajunen & Summala, 2003).
The question asked can also profoundly influence the data obtained. The difficulties inherent in writing good survey questions that do
not lead the respondents are well known. Even when corrected for social desirability, however, surveys may still perform poorly
(Harrison, 2009).
Second, sampling bias is likely present. Respondents must agree to be in the study, so self-selection is highly likely.
Researchers may attempt to prevent sampling bias by matching variables such as demographics.
When researchers attempt to remove sampling bias this way, however, they are often overly optimistic about their success (Harrison, 2009).
One newer form of sampling bias is, to coin a phrase, "internet bias." Some advocate using the internet for surveys because it is impersonal and should theoretically reduce SDR bias compared to face-to-face or even telephone surveys. However, the flip side of that impersonality is likely an increase in lying, a problem in many surveys. Younger respondents, in particular, like to mess with researchers.
Internet research can also have a large selection bias. At a recent road safety conference, for example, a researcher presented her results on a survey concerning the acceptance of new technology by older drivers. She conducted the survey on the internet! It never seems to have occurred to her that she was biasing her sample not only through self-selection of respondents, which infects all such surveys, but also because the respondents were unlikely to be a random sample of the older population: internet respondents would be a more technology-sympathetic subpopulation. It is hard to take such research seriously, not just because of the biased methodology, but because the research was conducted by someone who could not see such an obvious methodological flaw. This harkens back to the Literary Digest's classic gaffe in predicting that Alf Landon would defeat Franklin Roosevelt in the 1936 presidential election. The magazine drew its sample largely from telephone directories and automobile registrations, failing to notice the large sampling bias this created: during the Depression, many more Republicans than Democrats owned a phone or a car.
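The arithmetic of such selection bias is easy to make concrete. The short simulation below is only a sketch with invented numbers: it assumes a population of older drivers in which 40 percent use the internet and internet users are far more accepting of new vehicle technology, then "surveys" only the internet users.

    import random

    random.seed(1)

    # Hypothetical population of 100,000 older drivers. The 40% internet-use
    # rate and the 70%/30% acceptance rates are assumptions for illustration.
    population = []
    for _ in range(100_000):
        uses_internet = random.random() < 0.40
        p_accept = 0.70 if uses_internet else 0.30
        population.append((uses_internet, random.random() < p_accept))

    true_rate = sum(accepts for _, accepts in population) / len(population)

    # An internet survey reaches only the internet users, so the sample is
    # self-selected from the technology-sympathetic subpopulation.
    online = [accepts for uses, accepts in population if uses]
    online_rate = sum(online) / len(online)

    print(f"True acceptance rate:     {true_rate:.1%}")   # about 46%
    print(f"Internet-survey estimate: {online_rate:.1%}")  # about 70%

The survey estimate lands near 70 percent even though fewer than half of the full population accepts the technology, and recruiting more internet respondents does not help: the error comes from who is reachable, not from sample size.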
Data based on statements about the past introduce an additional set of problems. They rely on memory, which is highly unreliable, as studies have demonstrated repeatedly (e.g., Loftus & Palmer, 1974). Memory is not a record but a reconstruction, created on the fly in response to some current event, such as a question. The reconstruction is created from many potential information sources, including those implied by the question itself. When asked to estimate the frequency of previous events, respondents often err. For example, drivers greatly underestimate the number of near-misses that they have had (Chapman & Underwood, 2000). Other research (af Wahlberg & Dorn, 2015) found that driver self-reports of crashes, mileage, and violations were highly unreliable. Note that some research topics, such as bicycle mishaps, depend almost exclusively on self-report data.
Studies asking about future behavior are also highly suspect. In one case (Wagenaar, 1972), 97 percent of the people interviewed claimed that they would read warnings on dangerous products. In an actual test, only 13-39 percent read the warning labels on a range of real products. Such observations likely reflect demand characteristics that push the outcome in a particular direction. In a question about safety, respondents are likely to provide the socially acceptable answer. For one thing, it costs them nothing to say that they will act in a socially responsible way. For another, questions seldom specify all the relevant factors in the future situation, so people must essentially guess the answer. Lastly, most people are unaware that much of their behavior is automatic and not under conscious control. When asked about future behavior, people can at most answer about their future conscious decisions, not their automatic behavior. Other factors, such as confirmation bias, base rate neglect, and the availability heuristic, are unconscious mental processes that also influence memory and judgment.
Despite these issues, some researchers have published data claiming that their questionnaire has good validity.
This still leaves some problems. No one is likely to publish data saying that his own survey is invalid,
so the published literature likely contains a large sampling bias of its own, toward positive results;
negative results mostly stay in the file drawer. Even if one survey were proved valid, that would say nothing about any other.
Moreover, the vast majority of survey and self-report studies have never been subjected to any validity testing at all, so their validity is simply unknown.
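A back-of-the-envelope simulation makes this file-drawer effect concrete. Every number below is invented purely for illustration: 200 questionnaires with modest true validity are each "validated" in a noisy study, and only results crossing a publishable threshold reach print.

    import random
    import statistics

    random.seed(1)

    # Assumed values, for illustration only: the questionnaires' true
    # validity coefficients average r = 0.15, each validation study measures
    # that value with sampling noise (SD = 0.10), and journals publish only
    # observed validities of r = 0.30 or better.
    TRUE_MEAN_R, NOISE_SD, PUBLISH_THRESHOLD = 0.15, 0.10, 0.30

    true_rs = [random.gauss(TRUE_MEAN_R, 0.05) for _ in range(200)]
    observed_rs = [r + random.gauss(0, NOISE_SD) for r in true_rs]
    published = [r for r in observed_rs if r >= PUBLISH_THRESHOLD]

    print(f"Mean true validity:      r = {statistics.mean(true_rs):.2f}")
    print(f"Mean published validity: r = {statistics.mean(published):.2f}")
    print(f"Published:               {len(published)} of {len(true_rs)} studies")

The handful of studies that clear the threshold average more than double the true validity, so a reader of the published literature sees questionnaires that look far better than they are.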
None of this has prevented survey and self-report studies from filling the journals and libraries of the world. The main reason is simple:
talk is cheap, literally. Conducting such studies requires even less money, equipment, and expertise than simple observational studies,
and little or no training beyond the use of a basic statistical software package.
These factors lend themselves to bulk, low-quality research. In most cases, it would not be wise to draw any firm conclusions,
much less to engineer society, based on this class of research. See af Wahlberg (2017) for a good general discussion of the problems
with pencil-and-paper research. He is only reiterating what Shakespeare noted long ago.