All kinds of phenomena, whether it's home runs, traffic accidents, PhDs granted, or cancer cases, vary in number from year to year (and place to place). That is, the number of home runs hit by the Yankees, the rate of traffic accidents in California, the number of PhDs awarded by Stanford's Sociology Department, and the incidence of kidney cancer are not constant from year to year. This may seem obvious, but it helps explain why the increase in violent crime in San Jose may have more to do with natural variance than with a breakdown in the San Jose Police Department's policing abilities. But I'm getting ahead of myself.
One characteristic of the observed variance of a particular phenomenon is that small numbers vary more, relative to their size, than large numbers do. To illustrate this, imagine a large urn that contains 1,000 marbles, 500 of which are red and 500 of which are white. Now, assume that you repeatedly reach into the urn and take out 4 marbles until the urn is empty. Chances are that each time you reach into the urn, the 4 marbles you take out will be a mix of white and red. And after you've removed about half the marbles, you'll have approximately the same number of white and red marbles. Nevertheless, every so often, you'll pull out 4 marbles that are all of one color. Now consider a related example:
From the same urn, two very patient marble counters take turns. Jack draws 4 marbles on each trial, Jill draws 7. They both record each time they observe a homogeneous sample--all white or all red. If they go on long enough, Jack will observe such extreme outcomes more often than Jill--by a factor of 8 (the expected percentages are 12.5% and 1.56%; Kahneman, "Thinking, Fast and Slow," p. 110). Why? Because you are more likely to experience an extreme event (in this case, drawing marbles that are all the same color) when working with small numbers than with large ones. That is why surveys that interview 1,200 people are generally more accurate than those that interview 600.
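The 12.5% and 1.56% figures can be checked with a few lines of arithmetic. The sketch below is only an illustration, and the function names are mine, not Kahneman's. It treats each marble as a fair coin flip (which reproduces the quoted percentages) and, for comparison, computes the exact chance for a single draw from the full 1,000-marble urn without replacement.

```python
from math import comb


def p_same_color_exact(n_draw, red=500, white=500):
    """Chance that n_draw marbles taken at once from a full urn all share one color."""
    total = red + white
    return (comb(red, n_draw) + comb(white, n_draw)) / comb(total, n_draw)


def p_same_color_coin_flip(n_draw):
    """Approximation: each marble is an independent 50/50 draw, and either color counts."""
    return 2 * 0.5 ** n_draw


for name, n_draw in (("Jack", 4), ("Jill", 7)):
    exact = p_same_color_exact(n_draw)
    approx = p_same_color_coin_flip(n_draw)
    print(f"{name} ({n_draw} marbles): exact {exact:.4f}, coin-flip {approx:.4f}")

ratio = p_same_color_coin_flip(4) / p_same_color_coin_flip(7)
print(f"Jack sees one-color samples {ratio:.0f} times as often as Jill")
```

The exact values come out slightly below the coin-flip ones, because removing a marble of one color makes another marble of that color a little less likely, but the gap between Jack and Jill is essentially the same.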
Now, consider a study of the incidence of kidney cancer in 3,141 US counties, which found that the counties with the lowest rates were mostly rural, sparsely populated, and located in traditionally Republican states. As the statisticians Howard Wainer and Harris Zwerling noted, "it is easy and tempting to infer that their low cancer rates are directly due to the clean living of the rural lifestyle--no air pollution, no water pollution, access to fresh food without additives." Such an inference would be wrong, however, because the counties in which kidney cancer rates are highest are also mostly rural, sparsely populated, and located in traditionally Republican states. As Wainer and Zwerling remarked, focusing on just these results, "it is easy to infer that their high cancer rates might be directly due to the poverty of the rural lifestyle--no access to good medical care, a high-fat diet, and too much alcohol, too much tobacco." Again, however, such an inference would be wrong. So, what's going on? As Daniel Kahneman notes, "Just as in the game of Jack and Jill, extreme outcomes (very high and/or very low cancer rates) are most likely to be found in sparsely populated counties. This is all there is to the story" (Kahneman, p. 111).
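A small simulation makes the county result easier to believe. This is a sketch with made-up inputs, not the actual study data: every county is given the same true rate (`TRUE_RATE`, a hypothetical 1 case per 10,000 people), half the counties have 5,000 residents and half have 500,000, and case counts are assumed to follow a Poisson distribution.

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_counties(populations, rate, rng):
    """Return (population, observed_rate) pairs; every county shares the same true rate."""
    results = []
    for pop in populations:
        cases = poisson_sample(pop * rate, rng)
        results.append((pop, cases / pop))
    return results


rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_RATE = 1e-4         # hypothetical rate, identical everywhere
populations = [5_000] * 200 + [500_000] * 200  # hypothetical small and large counties
counties = simulate_counties(populations, TRUE_RATE, rng)

lowest = min(counties, key=lambda c: c[1])
highest = max(counties, key=lambda c: c[1])
print("lowest observed rate:  population", lowest[0], "rate", lowest[1])
print("highest observed rate: population", highest[0], "rate", highest[1])
```

Even though nothing distinguishes the counties except their size, both the lowest and the highest observed rates land in the small counties. A small county with zero cases looks like the healthiest place in the country, and one with two or three cases looks like the sickest.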
Which brings us back to the spike in San Jose's violence. The average number of homicides in San Jose is relatively small (33), which means that extreme outcomes (i.e., sharp spikes and drops in the murder rate) will be the norm, not the exception (see the chart below). Moreover, after a wide swing in one direction, it is quite common for rates to move back toward the average, a pattern known as regression to the mean. So if the homicide rate declines for the rest of 2012 and into 2013, it will probably have more to do with the natural pattern of things than with the efforts of the San Jose Police Department.
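To get a rough sense of how large these swings can be, here is a back-of-the-envelope sketch. It assumes that homicide counts behave like a Poisson process, which is my simplifying assumption and not something established above. Under that assumption the standard deviation of a count is the square root of its mean. The figure of 3,300 is a hypothetical comparison, not a real city's number.

```python
import math


def poisson_spread(mean_count):
    """Summarize the chance variation of a Poisson count with the given mean."""
    sd = math.sqrt(mean_count)  # for a Poisson count, variance equals the mean
    return {
        "sd": sd,
        "relative_sd": sd / mean_count,
        "typical_low": mean_count - 2 * sd,   # rough two-standard-deviation band
        "typical_high": mean_count + 2 * sd,
    }


for mean_count in (33, 3300):  # 33 is the average cited above; 3300 is hypothetical
    s = poisson_spread(mean_count)
    print(
        f"mean {mean_count}: sd {s['sd']:.1f} ({s['relative_sd']:.1%} of the mean), "
        f"typical range {s['typical_low']:.0f} to {s['typical_high']:.0f}"
    )
```

With a mean of 33, chance alone produces swings of well over 10% in either direction, while a count a hundred times larger would vary by only a tenth as much in relative terms.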