University of Minnesota Driven to Discover
Center for Transportation Studies

Measurement Error, Simpson's Paradox, and the Ecological Fallacy in the 'Speed Versus Safety' Debate

Presentation by Dr. Gary Davis, Dept. of Civil Engineering

October 16, 2001

"Measurement Error, Simpson’s Paradox, and the Ecological Fallacy in the 'Speed versus Safety' Debate," the third Advanced Transportation Technologies seminar of fall semester 2001, featured Civil Engineering professor Gary Davis at the podium. Davis brought the big guns of statistical analysis to bear on a question long debated among transportation researchers and planners.

The debate over speed limits touches many issues in the field of transportation, including fuel efficiency and roadway construction, but the most prominent issue is safety. While it may seem obvious that driving at a lower speed is safer, several researchers have advanced the theory that it is not high speed per se that leads to collisions between vehicles. Instead, they propose that variation in speed among the vehicles on a roadway is more closely associated with collisions.

The "variance kills" argument has three major tenets:

  1. Drivers traveling at speeds lower or higher than what is typical on a road have increased risk of crashing, with the risk being highest for slower drivers;
  2. Increases in the variability of speeds on a roadway cause increases in the accident rate; and
  3. Speed variance is highest where there is a marked discrepancy between the posted speed limit and the design speed.

At the heart of the argument is a compelling graph known as "Solomon’s Curve," frequently cited and reproduced by opponents of lower speed limits. In the mid-1960s, David Solomon analyzed crash report data from the 1950s and developed a graph purporting to show that drivers were more likely to be involved in crashes if they were traveling slower or faster than the average speed; slower drivers, Solomon claimed, were more likely to be involved in vehicle collisions than faster drivers (D. Solomon, "Accidents on Main Rural Highways Related to Speed, Driver and Vehicle," Bureau of Public Roads, July 1964).

Solomon’s Curve, however, becomes less convincing when measurement error is taken into account. Errors in speed measurement, which arise because investigators must estimate the speeds of vehicles by interviewing drivers, measuring skid marks, and so on, can distort statistical results. Later researchers such as White and Nelson (1970) showed how Solomon-type curves can arise entirely as artifacts of measurement error.
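
One way such an artifact can arise is sketched below in Python. This is an illustration under assumed conditions, not a reconstruction of White and Nelson’s analysis: true crash risk is taken to be the same at every speed, traffic speeds are assumed to be measured without error, and the speeds of crash-involved vehicles are assumed to be recorded with added random error. All numbers and names (MEAN_SPEED, TRUE_SD, ERROR_SD, apparent_relative_risk) are hypothetical.

```python
import math

# Hypothetical values chosen only for illustration.
MEAN_SPEED = 60.0  # mean traffic speed (mph)
TRUE_SD = 5.0      # standard deviation of true speeds (mph)
ERROR_SD = 8.0     # standard deviation of the error in estimated crash speeds (mph)


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def apparent_relative_risk(speed):
    """Ratio of the recorded-speed density for crash-involved vehicles
    to the speed density of general traffic.

    Assumption: true risk does not depend on speed, so crash-involved
    vehicles have the same true speed distribution as traffic. Their
    recorded speed is the true speed plus independent normal error,
    which widens the recorded distribution.
    """
    recorded_sd = math.sqrt(TRUE_SD ** 2 + ERROR_SD ** 2)
    crash_density = normal_pdf(speed, MEAN_SPEED, recorded_sd)
    traffic_density = normal_pdf(speed, MEAN_SPEED, TRUE_SD)
    return crash_density / traffic_density


for speed in (45, 50, 55, 60, 65, 70, 75):
    print(f"{speed} mph: apparent relative risk {apparent_relative_risk(speed):.2f}")
```

Under these assumptions the ratio is lowest at the mean speed and rises symmetrically on either side, a U shape produced by measurement error alone. The sketch does not reproduce the asymmetry of Solomon’s published curve, in which slower drivers appear to be at greatest risk.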

Because Solomon aggregated his data from a number of different sources, a surprising possibility also arises: the real relationship between speed and crash risk, if one exists, may be just the opposite of what the graph suggests. This is due to a statistical oddity known as Simpson’s Paradox (first noted not by anyone named Simpson but by one Karl Pearson, in 1899). Briefly, Simpson’s Paradox is the phenomenon in which an apparent causal effect shown in disaggregated data is reversed when the data are aggregated.

Perhaps the most famous real-world example of Simpson’s Paradox was found by analyzing graduate student admissions at the University of California at Berkeley: for the university as a whole, researchers found, women were admitted at a lower rate than men, but in most individual departments the admission rate for women was similar to or higher than that for men.

Despite being confirmed by empirical testing, Simpson’s Paradox is difficult to explain statistically. Some researchers have suggested that the paradox arises when outcomes from different causal mechanisms are aggregated, as in the case of Solomon’s data. Davis offered an example of Simpson’s Paradox affecting hypothetical crash rates, in which data from each of two different road types showed a higher crash risk for speeding vehicles, yet the aggregated data appeared to show that slower vehicles were more at risk.
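
A reversal of this kind can be reproduced with a few invented counts. The Python sketch below is not Davis’s example; the road names, vehicle counts, and crash counts are hypothetical and were chosen only so that faster vehicles have the higher crash rate on each road type while slower vehicles have the higher rate once the two road types are pooled.

```python
# Hypothetical counts, invented for illustration (not figures from the seminar).
# Key: (road type, speed group). Value: (vehicles observed, crashes).
counts = {
    ("road_a", "fast"): (9000, 18),
    ("road_a", "slow"): (1000, 1),
    ("road_b", "fast"): (1000, 30),
    ("road_b", "slow"): (9000, 180),
}


def rates_by_road(data):
    """Crash rate for each (road type, speed group) cell."""
    return {key: crashes / vehicles for key, (vehicles, crashes) in data.items()}


def pooled_rates(data):
    """Crash rate for each speed group after summing over road types."""
    totals = {}
    for (_, speed), (vehicles, crashes) in data.items():
        total_vehicles, total_crashes = totals.get(speed, (0, 0))
        totals[speed] = (total_vehicles + vehicles, total_crashes + crashes)
    return {speed: c / v for speed, (v, c) in totals.items()}


per_road = rates_by_road(counts)
pooled = pooled_rates(counts)

for road in ("road_a", "road_b"):
    fast = per_road[(road, "fast")]
    slow = per_road[(road, "slow")]
    print(f"{road}: fast {fast:.4f}, slow {slow:.4f}")
print(f"pooled: fast {pooled['fast']:.4f}, slow {pooled['slow']:.4f}")
```

The reversal appears because most slow vehicles in this invented data set are on the road type with the higher overall crash rate, so pooling mixes the effect of speed with the effect of road type.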

Following his examination of statistical pitfalls in accident data analysis, Davis went on to propose ways to study this type of data while avoiding statistical ambiguity. Instead of studying broadly aggregated data, Davis proposed focusing on the causal mechanisms that lead to accidents. To do this, researchers should begin by "treating individual accidents as instances of mechanisms, rather than as instantiations of law-like regularities."

By understanding the specific mechanisms that lead to accidents, researchers can eliminate deceptive results that appear as artifacts of statistical analysis procedures. Davis and other University of Minnesota researchers are currently employing mechanism-based analysis in several traffic-related research projects, with funding from sources including the ITS Institute, Minnesota Guidestar, and the Local Road Research Board.