Benford’s Law

We’ve talked a lot lately about different types of geometry, planes, numbering systems, and whatnot. I was thinking about our own wonderful base 10 numbering system and the amazing phenomena that occur with base 10 numbers. So in this article, I’d like to elaborate on one of the craziest mathematical laws I have ever had the privilege of learning about, a law that shows up in the first digits of numbers written in base 10. It is known as Benford’s Law.

Okay, so first grab a bunch of numbers from a large set of statistical data, like the death rates of the countries in Europe or the first 1000 powers of 3. If you were to look at the first significant digit of each of those numbers, how often do you think each of the digits 1 through 9 would occur?

It seems reasonable to assume that there would be an even distribution of the digits 1 to 9, right? RIGHT? Each digit should appear roughly 1/9 of the time. Anyway, that’s probably what most of us would assume, but there are many sets of data where that is actually not the case. As it turns out, 1 is by far the most common first digit and 9 is the least common. This bizarre principle is summed up neatly by Benford’s Law.

A visualization of the distribution of first digits found in sufficiently large sets of data. Uploaded by User Gknor for Wikipedia on August 6th, 2008. Public Domain License.

Stated simply, Benford’s Law says that in sets of statistical data whose values are spread over multiple orders of magnitude, the probability that a randomly chosen number has D as its first digit is P(D) = log10(1 + 1/D). Plugging in the values 1 to 9, we can find the probabilities for all nine digits.

P(1) = 30.1%        P(2) = 17.6%        P(3) = 12.5%

P(4) = 9.7%          P(5) = 7.9%          P(6) = 6.7%

P(7) = 5.8%          P(8) = 5.1%          P(9) = 4.6%
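If you’d like to check those percentages yourself, here is a minimal Python sketch that simply plugs the digits 1 to 9 into the P(D) formula. The function name benford_probability is a label chosen for illustration; it does not come from any of the sources.

```python
import math


def benford_probability(d: int) -> float:
    """Probability that the first significant digit is d (1-9) under Benford's Law."""
    return math.log10(1 + 1 / d)


# Print P(D) for every possible first digit, rounded to one decimal place.
for d in range(1, 10):
    print(f"P({d}) = {benford_probability(d):.1%}")
```

A nice sanity check is that the nine probabilities add up to exactly 1, because the sum of the logarithms collapses to log10(10).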

So it would seem that the distribution of first digits is not even after all! What the heck, right? This law says that numbers pulled at random from such data sets should have 1 as the first significant digit about 30% of the time, which is nearly 1/3. The probability that the first significant digit will be D decreases as D gets larger. The graph above provides a visualization of this phenomenon. Benford’s Law may seem really strange at first, but let me back it up with a few examples of some data sets that exhibit this behavior.

I found an interesting application online called the Fibonacci Calculator which can do many neat and interesting things with numbers in the Fibonacci sequence. I had it perform some calculations on the first 10,000 numbers in the Fibonacci sequence and it printed out the following information about the distribution of first digits for those 10,000 numbers.

Digit:           1      2      3      4      5      6      7      8      9
Frequency:    3011   1762   1250    968    792    668    580    513    456
Percent:        30     18     13     10      8      7      6      5      5
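The tally above came from the Fibonacci Calculator, but a similar count can be sketched in a few lines of Python. This is only a sketch under stated assumptions, not a reproduction of the calculator: it assumes the sequence starts 1, 1, 2, 3, ... and the function name fibonacci_first_digits is hypothetical, so the exact counts could differ slightly from the table depending on where the sequence is taken to start.

```python
from collections import Counter


def fibonacci_first_digits(n: int) -> Counter:
    """Tally the first digit of each of the first n Fibonacci numbers.

    Assumption: the sequence starts 1, 1, 2, 3, 5, ...
    """
    counts = Counter()
    a, b = 1, 1
    for _ in range(n):
        counts[int(str(a)[0])] += 1  # first character of the decimal string
        a, b = b, a + b
    return counts


counts = fibonacci_first_digits(10_000)
for d in range(1, 10):
    print(f"{d}: {counts[d]:>5}  ({counts[d] / 10_000:.1%})")
```

Python’s integers have arbitrary precision, so the Fibonacci numbers here (which run to a couple of thousand digits) are handled exactly.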

The numbers are pretty much identical to the percent values generated by the aforementioned P(D) function! If that doesn’t amaze you, consider the following chart. This is the distribution of first digits for a large selection of physical constants in the universe:

This graph shows the distribution of first significant digits for physical constants of the Universe compared to the Benford distribution. Uploaded by User Drnathanfurious for Wikipedia on May 16th, 2007. Public Domain License.

So a large selection of seemingly random constants also follows this pattern! As stated previously, there are many naturally occurring phenomena that follow Benford’s Law. Examples of other sets of data that conform to the Benford distribution are: geographical sizes of countries, sizes of lakes, numbers of the Lucas sequence (among other sequences), powers of 2 (or powers of almost any number, for that matter; powers of 10 are an obvious exception), prices of stocks, physical and mathematical constants, specific heats, atomic weights of elements, electricity bills, street addresses, death rates, etc.

I could list off many more, but I will restrain myself for the sake of pragmatism. This law has some pretty crazy and far-reaching effects, doesn’t it? Now, it goes without saying that the law fits better as the data sets get larger, so you shouldn’t be surprised if you pick out 3 numbers in a set and none of them start with 1. Also, Benford’s Law seems to be much more accurate for certain types of data. This raises the question: when is Benford’s Law applicable?

It seems to be more accurate when the values in a set are distributed logarithmically across multiple orders of magnitude. The more orders of magnitude covered, the more accurate Benford’s Law will be. The accuracy is also affected by the randomness of the phenomenon being recorded. If a phenomenon is due to human influence instead of natural occurrence, the law will be less accurate. Benford’s Law will also be less accurate when the possible range of values is restricted. For instance, the law does not apply to numbers generated uniformly at random between 1 and 99. No, their first digits will probably look a lot more like an even distribution.
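To make the contrast concrete, here is a small Python sketch comparing a restricted range with a set that spans many orders of magnitude. As an assumption for illustration, it tallies every whole number from 1 to 99 (which is what uniformly random picks from that range would approach on average) and sets that against the first 1000 powers of 2. The helper name first_digit is hypothetical.

```python
from collections import Counter


def first_digit(n: int) -> int:
    """Return the first significant digit of a positive integer."""
    return int(str(n)[0])


# Restricted range: every whole number from 1 to 99.
restricted = Counter(first_digit(n) for n in range(1, 100))

# Wide range: the first 1000 powers of 2, spread over hundreds of orders of magnitude.
wide = Counter(first_digit(2 ** k) for k in range(1, 1001))

print("digit  1-to-99  powers of 2")
for d in range(1, 10):
    print(f"{d:>5}  {restricted[d]:>7}  {wide[d]:>11}")
```

Each digit leads exactly 11 of the numbers from 1 to 99 (for the digit 1, those are 1 and 10 through 19), whereas among the powers of 2 the digit 1 leads far more often than the digit 9.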

So all in all, Benford’s Law is very applicable for phenomena that involve exponential growth, though the law in and of itself still seems very weird. If you’re having a hard time convincing yourself that this law is actually valid, keep a tally of the first digits of a sequence of numbers and see for yourself how the tally marks naturally progress as you increment.

Heck, even just counting the natural numbers might make the phenomenon clearer. First we count the numbers 1 through 9 and add one tally mark to each of the nine digits. Then we count the numbers 10-19 and add 10 more tally marks to the digit 1. The numbers 20-99 then add 10 more tally marks to each of the other eight digits. After 99, we count the numbers 100-199, which adds 100 more tally marks to 1. So although the digits will be evenly distributed again once every number up to 999 has been counted, the digit 1 always gets its tally marks before the other eight digits do. The digit 9 will always be the last to be tallied.
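The counting argument can be sketched in Python by stopping the count at a few different points and checking what share of the numbers so far start with 1. The function name leading_one_count and the particular stopping points are choices made for illustration only.

```python
def leading_one_count(limit: int) -> int:
    """Count how many of the numbers 1..limit have 1 as their first digit."""
    return sum(1 for n in range(1, limit + 1) if str(n)[0] == "1")


# Stop counting at the end of each stage described in the tally argument.
for limit in (9, 19, 99, 199, 999):
    ones = leading_one_count(limit)
    print(f"1 to {limit}: {ones} of {limit} start with 1 ({ones / limit:.1%})")
```

The share of leading 1s swings from 1 in 9 when we stop at 9, up to 11 in 19 at 19, back down to 11 in 99 at 99, and up again to 111 in 199 at 199. It never drops below 1/9, which is the sense in which the digit 1 always gets its tally marks first.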

So as numbers grow, they’re likely to have 1 as the first digit before the other 8 digits. This might explain just why it is that random sets of numbers are more likely to start with the number one. Honestly, though, it still weirds me out that so many different types of data all conform to this law. I’m not the only one who finds this law so bizarre, though. Benford’s Law has been fascinating mathematicians for years—from its first discovery in 1881 to the present day.

Sources:

-Jon Walthoe. “Looking out for number one.” (September 1999). Retrieved from: https://plus.maths.org/content/looking-out-number-one

-MathPages. (n.d.). Benford’s Number. Retrieved April 3rd, 2015 from: http://www.mathpages.com/home/kmath302/kmath302.htm

-Benford’s Law. (n.d.). Retrieved April 3rd, 2015 from Wikipedia: http://en.wikipedia.org/wiki/Benford%27s_law

-Fibonacci and Lucas Number Calculator 1.3. Dr. Ron Knott. April 30th, 2014. Retrieved April 3rd, 2015 from: http://www.maths.surrey.ac.uk/hosted-sites/R.Knott/Fibonacci/fibCalcX.html
