Via Midtopia, I came across this essay at the New York Times:
In addition, the National Security Agency’s entire spying program seems to be based on a false assumption: that you
can work out who might be a terrorist based on calling patterns…. [I]t’s bad mathematics, for two reasons. The simplest reason is that we’re all connected… in the sense of the Kevin Bacon game. The sociologist Stanley Milgram made
this clear in the 1960’s when he took pairs of people unknown to each other, separated by a continent, and asked one of the pair to
send a package to the other — but only by passing the package to a person he knew, who could then send the package only to someone
he knew, and so on. On average, it took only six mailings — the famous six degrees of separation — for the package to reach its
intended destination. [...] A second problem with the spy agency’s apparent methodology lies in the way terrorist groups operate and what the scientists call
the “strength of weak ties”…. This is the principle under which sleeper cells operate: there is no communication for years. Thus
for the most dangerous threats, the links between nodes that the agency is looking for simply might not exist.
I’ve mentioned previously that, for privacy reasons, I’m uncomfortable with the NSA database of phone calls as it’s been described
in the media. Despite that, I do take issue with the above argument.
If I were trying to snoop based on this sort of database, I would be looking for anomalous calls. If, for example, I saw a call
made by someone I suspected of being a terrorist to a number associated with a disposable cell phone that never
received any other calls… that would be something worth taking a look at. Or what if my suspected terrorist started making
calls to large dealers of fertilizer (e.g., ammonium nitrate)?
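
A rough sketch of that first check, in Python. Everything here is an assumption made for illustration: the call records are taken to be simple (caller, callee) pairs, the set of suspects is given, and the function name is hypothetical. It only shows the shape of the rule, not how any real system works.

```python
from collections import Counter


def calls_to_single_use_numbers(calls, suspects):
    """Return the (caller, callee) pairs where a suspected caller reached
    a number that received no other calls in the data set."""
    # Count how many calls each number received overall.
    incoming = Counter(callee for _caller, callee in calls)
    return [
        (caller, callee)
        for caller, callee in calls
        if caller in suspects and incoming[callee] == 1
    ]


if __name__ == "__main__":
    # Made-up records: "s1" is the only suspect.
    example_calls = [("s1", "x"), ("s1", "y"), ("n1", "y"), ("n2", "z")]
    print(calls_to_single_use_numbers(example_calls, {"s1"}))
```

Note that the number "z" in the example also received only one call, but it is not flagged because the caller is not a suspect. The rule looks at the combination of the two facts, not at either one alone.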
It’s probably not possible to define or identify in advance every type of phone call that would constitute an anomaly. However, if
you had a large enough collection of phone calls, you could build a model of typical calling behavior and use it to flag the calls that deviate from it.
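
To make "a model" a little more concrete, here is a deliberately simple sketch, again in Python. It assumes the history is a list of (caller, callee) pairs and scores a new call by how surprising it is given that caller's past behavior, using add-one smoothing. The class name and the scoring rule are my own illustrative choices; a real model would presumably use far richer features (time of day, duration, location, and so on).

```python
import math
from collections import Counter


class CallBaseline:
    """Per-caller frequency model built from historical (caller, callee) pairs."""

    def __init__(self, history):
        history = list(history)
        self.pair_counts = Counter(history)
        self.caller_totals = Counter(caller for caller, _callee in history)
        self.n_callees = len({callee for _caller, callee in history})

    def surprise(self, caller, callee):
        """Negative log of the smoothed probability that `caller` calls `callee`.

        Higher values mean the call is less typical for this caller.
        """
        # Add-one smoothing; the extra +1 in the denominator reserves
        # probability mass for numbers never seen in the history.
        numerator = self.pair_counts[(caller, callee)] + 1
        denominator = self.caller_totals[caller] + self.n_callees + 1
        return -math.log(numerator / denominator)


if __name__ == "__main__":
    history = [("a", "b")] * 9 + [("a", "c")]
    model = CallBaseline(history)
    for callee in ("b", "c", "z"):
        print(callee, round(model.surprise("a", callee), 3))
```

The point of the sketch is only that the score is learned from the data rather than written down as a rule: a call to a number the caller contacts every day scores low, and a call to a number the caller has never contacted scores high.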
And, as I’ve written previously, such a database could be constructed so as to preserve privacy (replace actual numbers with
randomized pseudonumbers, etc.) until the appropriate oversight authorities approve the identification of a particular
node in the database. You don’t need to know what the actual phone numbers are, or the names associated with the numbers, to build
the model to identify anomalies.
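
One possible way to do the replacement, sketched in Python. This is an assumption on my part rather than a description of any actual system: instead of a random lookup table, it uses a keyed hash (HMAC-SHA256 from the standard library) so that the same number always maps to the same pseudonumber, and it assumes the secret key is held by the oversight authority rather than by the analysts. The function names are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(number, key):
    """Map a phone number to a stable pseudonumber using a keyed hash.

    `key` must be bytes. Without the key, the mapping cannot be recomputed.
    """
    digest = hmac.new(key, number.encode("utf-8"), hashlib.sha256).hexdigest()
    # Keep 16 hex characters (64 bits) to make the pseudonumber shorter.
    return digest[:16]


def pseudonymize_calls(calls, key):
    """Replace both ends of every (caller, callee) record with pseudonumbers."""
    return [(pseudonymize(a, key), pseudonymize(b, key)) for a, b in calls]


if __name__ == "__main__":
    secret = b"held-by-the-oversight-authority"  # placeholder key for the example
    records = [("555-0100", "555-0199"), ("555-0100", "555-0142")]
    for caller, callee in pseudonymize_calls(records, secret):
        print(caller, callee)
```

Because the mapping is consistent, the structure of who called whom survives, which is all the anomaly model needs. Going from a flagged pseudonumber back to a real number requires the key, so that step can be reserved for whoever is authorized to hold it.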
It’s understandable that someone would apply the “Six Degrees of Separation from Kevin Bacon” game to this issue…but it’s bad
modeling theory to think of it as an appropriate argument.