I was finally curious enough to look at the data and run a chi-squared analysis. The idea is to compare the observed number of times each digit appears as the first digit of the precinct-level counts with the number expected under Benford’s Law.
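The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the exact calculation behind the numbers in this post: the function names and the example counts are made up, and it assumes the usual form of Benford’s Law, P(d) = log10(1 + 1/d).

```python
import math
from collections import Counter

def first_digit(n):
    """Return the leading decimal digit of a positive integer."""
    return int(str(n)[0])

def benford_proportions():
    """Benford's Law: P(d) = log10(1 + 1/d) for d = 1..9."""
    return [math.log10(1 + 1 / d) for d in range(1, 10)]

def chi_squared_vs_benford(counts):
    """Chi-squared statistic comparing first digits of `counts` to Benford."""
    # Zero counts have no leading digit in 1..9, so they are skipped here
    # (an assumption of this sketch).
    digits = Counter(first_digit(c) for c in counts if c > 0)
    total = sum(digits.values())
    stat = 0.0
    for d, p in zip(range(1, 10), benford_proportions()):
        observed = digits.get(d, 0)
        expected = p * total
        stat += (observed - expected) ** 2 / expected
    return stat

# Made-up precinct counts, for illustration only (not real election data).
example_counts = [123, 187, 1450, 212, 98, 310, 1021, 176, 455, 139, 268, 77]
print(round(chi_squared_vs_benford(example_counts), 2))
```

With real data, `example_counts` would be replaced by one candidate’s vote total in each precinct of a county.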
The first attempt yielded huge values, corresponding to wildly improbable differences between the observed first digits of the precinct counts and the values expected according to Benford’s Law. The smallest chi-squared statistic, corresponding to the closest fit with Benford’s Law, was still quite improbable, with less than one chance in a billion of arising by chance. The largest ones had a likelihood similar to winning the Powerball grand prize a dozen times in a row.
On further reading, I learned that the chi-squared test is rather sensitive to sample size, even though the sample size does not appear explicitly in the formula. With large samples, even small deviations from the expected proportions produce chi-squared statistics that look highly significant (that is, highly unlikely to be due to chance).
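One way to see where the sample size hides is to rewrite the statistic in terms of proportions. The notation here is introduced only for illustration: N is the total number of precinct counts, O_i = N o_i is the observed count for digit i, and E_i = N p_i is the expected count, with p_i the Benford proportion.

```latex
\chi^2 = \sum_{i=1}^{9} \frac{(O_i - E_i)^2}{E_i}
       = \sum_{i=1}^{9} \frac{(N o_i - N p_i)^2}{N p_i}
       = N \sum_{i=1}^{9} \frac{(o_i - p_i)^2}{p_i}
```

For a fixed gap between the observed and expected proportions, the statistic grows in direct proportion to N.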
So I redid the calculation after dividing all of the values for each category by a number intended to make the smallest value in all the categories equal to five.
The chi-squared test starts to run into trouble when the count in any category drops below five. For a large number of categories, it’s OK if 80% of the counts are five or greater. That would mean I could have set it so the second-smallest value is five. However, setting the lowest value at five gave me reasonable results.
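Here is a rough sketch of that rescaling step, under one reading of it: every first-digit tally is divided by the same factor so that the smallest tally becomes five, and the expected counts are recomputed from the scaled total. The tallies and the helper name `scale_to_minimum` are hypothetical.

```python
import math

# Benford proportions for leading digits 1..9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

def chi_squared(observed):
    """Chi-squared statistic of nine digit tallies against Benford's Law."""
    total = sum(observed)
    return sum((o - p * total) ** 2 / (p * total)
               for o, p in zip(observed, BENFORD))

def scale_to_minimum(observed, target=5.0):
    """Divide all tallies by one factor so the smallest equals `target`.

    Assumes every tally is greater than zero.
    """
    factor = min(observed) / target
    return [o / factor for o in observed], factor

# Hypothetical first-digit tallies for digits 1..9 (not real data).
observed = [400, 230, 160, 120, 100, 90, 80, 70, 50]
scaled, factor = scale_to_minimum(observed)
print(chi_squared(observed), chi_squared(scaled), factor)
```

Because every tally is divided by the same factor, the statistic shrinks by exactly that factor too.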
I looked at the data for Allegheny County, PA and Fulton County, GA.
| | Allegheny County, PA: χ² | Allegheny County, PA: p | Fulton County, GA: χ² | Fulton County, GA: p |
|---|---|---|---|---|
| Trump: total votes | 5.80 | 0.669 | 4.00 | 0.857 |
| Biden: total votes | 190.5 | 5.73e-37 | 15.50 | 0.050 |
In both counties, Trump’s precinct-level vote totals match pretty well with Benford’s Law. In Fulton County, Biden’s vote totals are on the edge of significance.
In Allegheny County, Biden’s vote totals vary from Benford’s Law by an amount well outside the bounds of chance.
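For anyone who wants to check the p-values, here is a small sketch that converts a chi-squared statistic into an upper-tail probability. It assumes 8 degrees of freedom (nine possible first digits minus one), which is consistent with the p-values reported above but is not stated anywhere else in this post. For an even number of degrees of freedom the tail probability has a closed form, so no statistics library is needed. The function name is made up for this example.

```python
import math

def chi2_sf_even_df(stat, df=8):
    """Upper-tail probability of a chi-squared variable with even df.

    Uses the closed form exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2:
        raise ValueError("closed form used here needs a positive even df")
    half = stat / 2.0
    term, total = 1.0, 1.0  # k = 0 term
    for k in range(1, df // 2):
        term *= half / k    # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total

# Statistics as reported in the table (already rounded there).
for stat in (5.80, 4.00, 190.5, 15.50):
    print(stat, chi2_sf_even_df(stat))
```

Since the statistics in the table are rounded, the recomputed p-values may differ slightly from the reported ones in the last digit or so.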
Can we call “shenanigans” here?