When arguing that Nate Silver is not very good at what he does, Nassim Taleb does something he regularly accuses other people of doing: he uses bad statistics.
Truly bad statistics consists of three parts:
- A kernel of truth
- A smoke screen of technical details irrelevant to the case
- A mistake in the details that is hard to spot and requires some background knowledge to understand
So first, here is the kernel of truth:
Derivations of how to make and update election forecasts without BS.
— Nassim Nicholas Taleb (@nntaleb) August 10, 2016
Trying @WolframResearch new release Mathematica pic.twitter.com/VSLtigZqz3
The relevant formula in this derivation, which is indeed correct, is
- f = CDF[NormalDistribution[0, sigma Sqrt[t]], x]
Please note that the sigma in this formula is the same sigma as in the Wiener process with which he starts out.
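To make the formula concrete, here is a minimal Python sketch of the same expression. The function name `estimate` and the parameter names are my own and do not appear in the original derivation; the body is only a direct translation of the CDF expression above, assuming t is the time remaining and is greater than zero.

```python
from math import sqrt
from statistics import NormalDist


def estimate(x: float, t: float, sigma: float) -> float:
    """Evaluate the CDF of Normal(0, sigma * sqrt(t)) at x.

    x     -- current value of the process
    t     -- time remaining (assumed > 0)
    sigma -- sigma of the underlying Wiener process
    """
    return NormalDist(mu=0.0, sigma=sigma * sqrt(t)).cdf(x)


# A process sitting exactly at zero gives an even chance, whatever sigma is.
print(estimate(0.0, 25.0, 1.0))
print(estimate(0.0, 25.0, 14.0))
```

The only free parameter besides the current value and the remaining time is sigma, which is why it matters that it matches the process being observed.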
The derivation is more complex than it needs to be and contains some formulas that are completely irrelevant. This already forms part of the smoke screen, but the true smoke screen is this paper: https://arxiv.org/pdf/1703.06351.pdf. Nothing in this paper is relevant to the argument in any form.
So where is the mistake? You can find it in the code for the chart below:
(cont) a tutorial on standard way to look at elections. Updating proportional to square root of time to elections. pic.twitter.com/gbq1pqtkEJ
— Nassim Nicholas Taleb (@nntaleb) August 8, 2016
This is the Mathematica code:
- r := Random[NormalDistribution[0, 1]]
- ta = Table[r, {100}] // Accumulate
- ta1 = Table[{i, CDF[NormalDistribution[0, Max[.0001, 14 Sqrt[Length[ta] -i]]], ta[[i]]]}, {i, 1, Length[ta]}];
- ta2 = Table[{i, CDF[NormalDistribution[0, Max[.0001, 1 Sqrt[Length[ta] -i]]], ta[[i]]]}, {i, 1, Length[ta]}];
- ListLinePlot[{ta1, ta2}]
He starts out by simulating a Wiener process with a sigma of 1. So, according to his own derivation, he should use the same sigma when calculating the probability estimate. He does use the correct value when calculating ta2, which is the orange line in the chart that he labels “538”, but when calculating ta1 (the blue line) he uses a sigma of 14, which is simply the wrong value.
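The effect of the mismatch can be reproduced with a rough Python port of the Mathematica snippet. This is a sketch, not Taleb's code: the helper names `simulate_walk`, `estimates` and `max_deviation`, the fixed seed, and the choice to compare only the first half of the path are my own assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_walk(n: int, sigma: float, seed: int) -> list[float]:
    """Cumulative sum of n independent Normal(0, sigma) steps."""
    rng = random.Random(seed)
    total, path = 0.0, []
    for _ in range(n):
        total += rng.gauss(0.0, sigma)
        path.append(total)
    return path


def estimates(path: list[float], sigma: float) -> list[float]:
    """CDF of Normal(0, sigma * sqrt(steps remaining)) at each point of the path."""
    n = len(path)
    result = []
    for i, x in enumerate(path, start=1):
        # Same floor as Max[.0001, ...] in the original, to avoid a zero scale.
        scale = max(1e-4, sigma * sqrt(n - i))
        result.append(NormalDist(0.0, scale).cdf(x))
    return result


def max_deviation(series: list[float]) -> float:
    """Largest distance from an even 0.5 probability."""
    return max(abs(p - 0.5) for p in series)


path = simulate_walk(100, sigma=1.0, seed=0)
matched = estimates(path, sigma=1.0)      # sigma agrees with the simulated process
mismatched = estimates(path, sigma=14.0)  # sigma of 14 applied to a sigma-1 process

print(max_deviation(matched[:50]), max_deviation(mismatched[:50]))
```

Because the mismatched estimate divides the same path values by a scale 14 times larger, it can never sit further from 0.5 than the matched one. That is why the blue line looks so calm: it comes from the wrong sigma, not from a better method.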
You could also fix this by increasing the sigma of the underlying Wiener process to 14, but in this case the blue line will start to look exactly like the orange line in the original chart:
If you work through the math, you will realize that sigma cancels out and has no impact on the behavior of the correct estimate. A higher sigma makes the estimation function less sensitive, so it requires more extreme values to produce the same probabilities. But at the same time, the higher sigma in the Wiener process produces more extreme values, exactly compensating for this effect.
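The cancellation can be written out in one line. The notation here is my own assumption rather than something taken from the derivation: W_s is a standard Wiener process, X_s = sigma W_s is the simulated process, T is the total horizon, and T - s is the time remaining (the t in the formula above).

```latex
\[
f_s
= \Phi\!\left(\frac{X_s}{\sigma\sqrt{T-s}}\right)
= \Phi\!\left(\frac{\sigma W_s}{\sigma\sqrt{T-s}}\right)
= \Phi\!\left(\frac{W_s}{\sqrt{T-s}}\right)
\]
```

Here \Phi is the standard normal CDF. The last expression contains no sigma at all, so as long as the estimate uses the same sigma as the process, its behavior is the same for sigma = 1 and sigma = 14.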
Nassim Taleb has become famous for attacking bad statistics. The big issue, however, is that he seems to be either incapable of or unwilling to distinguish good statistics from bad statistics, and he attacks good statistics by employing bad statistics himself. This means Nassim Taleb isn’t part of the solution; he is part of the problem.