With all of the hype concerning the NSA’s slippery-slope reach into our Honey Boo-Boo and Kim Kardashian tweets this week, it’s time to take a break from conjecture, intrigue and speculation, and look at the bright side of semantic analysis, language processing, and so-called ‘distant reading’. I’ll do a two-part entry.
This week: the topic of the day.
MOUNTAINS OF GARBAGE
Let’s indulge for a moment. J. Edgar Hoover, like other dreadful historical figures, was a fanatical collector of information. What set him apart from a regular fact-freak was his power and reach. Through this combination, he was able to destroy or significantly manipulate political and administrative figures. The information he sought was basically the same as what is sought today:
- Where was this person at a given time?
- To whom was he speaking?
- Was anything interesting happening?
Back then, a hired gumshoe had to tediously follow and observe a lead for days, weeks, or even months. Eventually the lead would slip up, and a juicy encounter that fit the objectives of the case would fall out of the tree. But if that lead was too disciplined, and the department managed to scare up the money or favors, there was Plan B: use a disposable third party to sneak in and install a tap. Get the dirt (and for God’s sake, don’t get caught doing it!), and if the hunch is correct, start looking for ways to get a legal tap or other court-admissible evidence.
These days, the surveillance technology highlighted in recent news allows the government to capture the ‘where’ and ‘with whom’ a lot faster than the gumshoe route. And so, let’s not only indulge, let’s take a big leap of faith and assume that the NSA or other authorities actually had access to our content, rather than just the data about our messages.
(By the way, here’s a question: where was the massive outrage when, back in 2006, we learned about Narus’s Semantic Traffic Analyzer, and the possibility that it was already being put to intrusive, rights-violating use?)
Under normal circumstances (and even in those extreme circumstances!), a law-abiding citizen has little cause for concern. In fact, what most pundits and those invested in speculation and intrigue have failed to parse lately is the basic problem that has thwarted insight-seekers and opportunists since the beginning of metadata: the great mass of humans mostly lead normal lives and talk about slightly boring things.
Apparently there’s a lesson here about the difference between what’s important, what gets revealed, and what ultimately goes viral.
But what if I consistently use ‘interesting’ keywords that would implicate me to the thought police as a criminal or enemy of the state? Will my life become a virtual and literal hell of harassment, sullied reputation, and ruined livelihood?
A more valid, thoughtful, and strategic set of concerns: will this capability be abused to smear political and administrative figures? Will a low-level employee use the information for his or her own personal vendettas? Will it become the data-collection component of a future totalitarian regime?
The answer to the first two: it probably already has. And for the third: it’s a risk.
Remember this, though. Throughout history, subversives have tended to innovate, and technology has tended to be the eventual equalizer between freedom fighter and totalitarian. Technology also tends to accelerate, rather than delay, the cycle between regime and revolution.
Next week: The Hills and Valleys of Culture.