“Open-source intelligence has always been crucial, but for most of the cold war it was neglected by western intelligence agencies,” says Calder Walton, a research associate at Cambridge University and author of the book Empire of Secrets, to be published in 2013. “That was the archetypal intelligence war: intelligence necessarily involved information that couldn’t be gained from any other source — human agents or telephone tapping.” That doesn’t mean covert intelligence was more effective, though: Daniel Moynihan, a former US senator, compared CIA reports gathered from secret sources with Soviet documents recovered after the fall of the Berlin Wall and found they significantly overestimated Soviet capabilities. But he discovered that western think tanks using publicly available material, such as the RAND Corporation, were much more accurate. US diplomat George Kennan estimated in 1997 that “95 per cent of what we need to know about foreign countries could very well be obtained by the careful and competent study of perfectly legitimate sources of information open and available to us”.
Excerpt from an article in Wired, the tech and futurism magazine, about a Swedish investment firm, Recorded Future, that is taking the use of social networks and other systems to new heights in its attempt to get a jump on the market. In the process, it sheds new light on how the intelligence-gathering process works.
Here’s another couple of paragraphs:
The 20 employees of Recorded Future aren’t foreign-policy experts. They aren’t traders either, but if you’d started using Recorded Future’s predictions to buy US stocks on January 1, 2009, you would have made an annual return of 56.69 per cent. (The S&P 500 had an annualised return of 17.22 per cent over the same period.) Between May 13 and August 5 this year, as markets behaved with vertiginous abandon, their strategy returned 10.4 per cent; in contrast, the S&P 500 lost 9.9 per cent of its value. They’re data experts: computer scientists, statisticians and experts in linguistics. And in the data, they think, lies the future.
All Recorded Future’s predictions, whatever the field, are based on publicly available information — news articles, government sites, financial reports, tweets — fed into the company’s own algorithms. The result, it claims, is a “new tool that allows you to visualise the future” — one that is changing how government intelligence agencies gather information and how giant hedge funds place bets. On its website, Recorded Future states: “We don’t grant interviews and we don’t issue press releases.” But behind closed doors, the company is developing the technology that has been described by one tech blog as an “information weapon”.
The business was founded by a chap called Christopher Ahlberg, a former member of Sweden’s special forces and a serious entrepreneur. In its own way, this article is just another example of how Sweden is not quite the socialist nation that it is sometimes said to be, by either its starry-eyed admirers or its detractors. There is a lot of entrepreneurial zest up there in the frozen north, it seems.
Interestingly, and somewhat apropos to the science fiction thread above, Heinlein wrote a short non-fiction piece describing how he was sitting in Russia with his wife and estimated the population of Moscow based upon various things such as the number of boats using the river. With two slightly different methodologies, they came up with numbers that were significantly less than the official numbers. I forget the details but I think it was very nearly an order of magnitude and was borne out by later information.
So there is a definite contrast between analysis and received information. And perhaps therein lies the link between science fiction and libertarianism.
I suspect that the claim that publicly available information used to be neglected is just plain wrong.
40 years ago I was acquainted with a Special Branch officer who claimed that about half his work time was devoted to simply reading the newspapers: mainstream national titles, local press and the publications of groups like the National Front and the Socialist Workers’ Party.
A US diplomat, not a spy, once told me that her job largely consisted of reading the trade and business press and going to product launches and receptions.
Allow for boasting, distortion and the certain knowledge that neither was telling me any secrets and it still shows that open-source intelligence gathering is certainly not cutting edge stuff.
It’s amazing the portrait you can put together from publicly released information. Talking to people, reading the news, and understanding the past can give valuable insights.
To be frank, I wonder how much SIS/CIA are looking at that sort of stuff and keeping it inside.
I have dabbled in web data-mining as a part of my job, so here are my two cents.
First, you need a crapload of computing power to do it. I am sure Google and the CIA can pull it off, but we could not, to the extent where we thought it would work. You can only run so many machines web-mining Twitter, CNN, Drudge, BBC, WSJ, etc.
Second, it does produce a limited amount of actionable knowledge, but also a crapload of noise. My company mostly invests on a daily basis, guessing whether the market, or some sectors of it, would go up or down the next day. Our model that incorporates web-chatter has done okay – up 24% this year, but this has included swings from down 17% to up 54%. If this were clients’ money, as opposed to house money, the clients probably would have bailed en masse when seeing down 17% on their statements. You cannot run a hedge fund with this kind of volatility.
Third, without a long history (and we do not have it), it’s hard to tease out useful contribution from luck. On a single day (the US rating downgrade) the model made 14% by being beta 2 short; had we been wrong that day, all the gain would have been gone. It’s very hard (nigh impossible) to backtest such models – it’s just how data sources work.
To recap, I definitely would look at web mining as a source, but I would not bet my 401(k) on it. If, however, you are looking for longer term trends, and you are looking to improve your hit rate from 53-47 to 60-40, yep, definitely count them in.
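To put rough numbers on the luck-versus-skill point above, here is a minimal sketch of how many daily up/down calls it might take before a given hit rate stands clear of a coin flip. It is not anything the commenter's firm uses: the function name `days_needed`, the 1.96 threshold (roughly 95% confidence) and the normal approximation are all assumptions made for illustration, and it treats each call as independent and ignores the size of wins and losses.

```python
import math


def days_needed(hit_rate: float, z: float = 1.96) -> int:
    """Approximate number of independent up/down calls needed before a
    hit rate sits z standard errors clear of a coin flip (50%).

    Hypothetical helper for illustration; uses the normal approximation
    to the binomial distribution.
    """
    edge = hit_rate - 0.5
    if edge <= 0:
        raise ValueError("hit_rate must be above 0.5")
    # Under the coin-flip hypothesis, the standard error of the observed
    # hit rate after n calls is 0.5 / sqrt(n).
    # Solve z * 0.5 / sqrt(n) = edge for n and round up.
    return math.ceil((z * 0.5 / edge) ** 2)


for rate in (0.53, 0.60):
    print(f"{rate:.0%} hit rate: about {days_needed(rate)} calls")
```

Under these assumptions a 53-47 record needs on the order of a thousand daily calls, which is several years of trading, while 60-40 needs only about a hundred. That is one way to see why a short track record says so little about a small edge.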
Hedge funds do this with their algos too; it’s roughly why the markets nowadays trade almost entirely on rumours. 50-odd % yield/year probably wasn’t difficult to achieve from the late nineties to 2007; in fact I wish I had been a few years further ahead on starting to trade myself a bit. But if their algo is still yielding 10% then good luck to ’em, they’re doing better than most of the hedgies, who seem to be faltering now that the easy money is no longer there to be made. Of course you could have just been in precious metals since 2007 and made a lot more than 10%, but you know…
I know a lot of the stuff the world’s militaries think is top secret can be found in Jane’s. When I was an army cadet, we used to be given stuff labelled “restricted” along with stern warnings about not giving them to any passing Paddy (they were the enemy in them days), but which we all had anyway in our copies of “In Combat” (hey, I was a teenager).
Jonanthan quotes:
That’s great, exciting, encouraging, stimulating. I want to believe, I truly do.
But I would bear in mind that the quoted returns were undoubtedly provided by the company, and the start and end points were chosen by them, too.
As regards intelligence gathering I am right up there with Hugh Trevor-Roper and Stuart Hampshire on this. Government intelligence agencies (now THERE is a good oxymoron) tend to place importance upon the source of the information, rather than the actual quality of it. Thus, to quote Trevor-Roper, “… they would prefer some nonsense smuggled out of Sofia in the fly-buttons of a vagabond Rumanian pimp to what they could learn from a prudent reading of the foreign press”.
From what one can glean from a thorough reading of the world’s press agencies’ output, which by the way is something I am required by a (corporate) intelligence agency to do on a daily basis, this culture of attaching importance to the source, rather than the actual data, is the problem at the heart of any intelligence gathering operation. When this problem is finally addressed (it won’t be, but we can wistfully dream) then we will actually have genuine intelligence being fed to the right people, who can make a properly informed decision.
Nice informative article by the way, but I regret I do not ‘rate’ Mr Ahlberg too greatly.