Statistical Thinking

Curious examples of statistical thinking…
What is the probability that the sun will rise
tomorrow morning?
If that’s too hard to think about, consider a
slightly bent coin that has been flipped K times
and come up heads every time. What is the
probability that the following flip will also be
heads?
This can also be thought of as a business process that has
been done K times with perfect success.
Laplace asked this question around 1800. Using an argument
that today would be called Bayesian, he got this answer:
$$\frac{K+1}{K+2}$$
If Laplace claimed to have 10,000 years of human
observation as his experience base, the number is
$$\frac{365 \times 10{,}000 + 1}{365 \times 10{,}000 + 2}$$
This rule is called “Laplace’s Law of Succession.”
If your firm has hired a new supplier, and if that supplier
has been on time with her first 15 orders, this rule says that
she’ll be on time with the next order with probability
$$\frac{16}{17}$$
and this is about 94%.
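For concreteness, here is a minimal Python sketch of the rule (the function name is just for illustration):

    def rule_of_succession(k: int) -> float:
        # After k successes in k trials, Laplace's rule gives
        # P[next trial succeeds] = (k + 1) / (k + 2).
        return (k + 1) / (k + 2)

    print(rule_of_succession(15))            # supplier: 16/17, about 0.9412
    print(rule_of_succession(365 * 10_000))  # sunrise: about 0.9999997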
We can get some interesting predictions as well from the
Copernican Principle.
Copernicus argued that there was nothing special about the
location of the earth in the cosmos.
Yes, he got in a lot of trouble for this.
This gets interesting when applied to events in time.
Suppose that something has a beginning time B and an end
time E (unknown). At time X, we observe that this thing
has been going on for duration X – B and we’d like to make
a conjecture about its end time E.
The Copernican principle says that there is nothing special
about time X within the lifetime interval (B, E).
The statistician will respond by giving a 95% confidence
interval. This is an interval which, you think, will enclose
the value of E. You’d bet 95-to-5 that E will be in the
interval.
Since X is equally likely to fall anywhere in (B, E), the
fraction (X − B)/(E − B) lies between 0.025 and 0.975 with
probability 0.95. Solving each bound for E gives the interval
$$B + \frac{1}{0.975}(X - B) \quad\text{to}\quad B + \frac{1}{0.025}(X - B)$$
This works best for cosmological or other large-scale events
for which you have no other information.
Still, you could try this for business events.
Suppose that a firm was founded in 2000. It’s been around
for 12 years. How long do you think it will last?
With B = 2000 and X = 2012, the 95% confidence interval for
its date of demise is
(2012.31, 2480.00).
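Here is a short Python sketch of that computation (the function name, and the split of the 5% into two 2.5% tails, are the only assumptions beyond the argument above):

    def gott_interval(b: float, x: float) -> tuple[float, float]:
        # If X is uniform on (B, E), then (X - B)/(E - B) falls in
        # [0.025, 0.975] with probability 0.95; solve each bound for E.
        elapsed = x - b
        return (b + elapsed / 0.975, b + elapsed / 0.025)

    print(gott_interval(2000, 2012))  # (2012.31, 2480.00)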
Let’s think also about the leading-digit phenomenon.
The leading digit of 246.22 is 2.
The leading digit of 1.46 is 1.
The leading digit of 0.0063 is 6.
0 is never a leading digit.
Only 1, 2, 3, 4, 5, 6, 7, 8, 9 can be leading digits.
What are the probabilities associated with leading digits? Are
they 1/9 each?
The observation which will lead us to the result is this:
The probabilities P[ leading digit = k ]
for k = 1, 2, …, 9 should be the same for all scales
of measurement.
(The “same” here refers to the scales of
measurement, not to k.)
The leading digit probabilities for a set of measurements in
inches should be exactly the same if the measurements are
converted to centimeters.
… or feet or yards or cubits or furlongs or smoots or
anything else…
The Massachusetts Avenue Bridge has marks at 100 smoots, 200 smoots, and so
on. This bridge connects Boston to the M.I.T. campus in Cambridge.
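As a quick empirical check of this scale invariance, here is a sketch in Python; the log-uniform sample is an assumption standing in for real scale-free measurements:

    import math
    import random
    from collections import Counter

    def leading_digit(x: float) -> int:
        # First nonzero digit of x, e.g. 0.0063 -> 6.
        e = math.floor(math.log10(abs(x)))
        return int(abs(x) / 10 ** e)

    random.seed(1)
    inches = [10 ** random.uniform(0, 5) for _ in range(100_000)]
    cm = [2.54 * x for x in inches]  # same data, different scale

    for name, data in (("inches", inches), ("cm", cm)):
        counts = Counter(leading_digit(x) for x in data)
        print(name, [round(counts[d] / len(data), 3) for d in range(1, 10)])

The two printed rows come out essentially identical, just as the invariance argument predicts.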
The consequence of this observation is that the mantissas
(fractional parts) of the base-10 logarithms must be uniformly
distributed on [0, 1). Now restrict consideration just to
numbers between 1 and 10.
$$P[\text{leading digit} = 1] = P[\log_{10} 1 \le \text{mantissa} < \log_{10} 2] = P[0 \le \text{mantissa} < 0.3010] = 0.3010$$
In a similar style,
$$P[\text{leading digit} = 2] = P[\log_{10} 2 \le \text{mantissa} < \log_{10} 3] = P[0.3010 \le \text{mantissa} < 0.4771] = 0.1761$$
So 30% of all leading digits will be “1” and about 18% will
be “2.”
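In general, the same computation gives P[ leading digit = d ] = log10(d + 1) − log10(d) = log10(1 + 1/d). A short sketch tabulating all nine probabilities:

    import math

    for d in range(1, 10):
        # P[leading digit = d] = log10(1 + 1/d)
        print(d, round(math.log10(1 + 1 / d), 4))
    # prints 0.301, 0.1761, 0.1249, ..., 0.0458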
This phenomenon is known as Benford’s law.
It has been known (sort of) since the 1930s. There are
claims that this has been used in auditing, as forgers seem
not to know about Benford’s law.
Side note… Is this the last legitimate use of base-10
logarithms?