A Stroll Down Memory Lane
Well folks, we’ve hit an interesting point in our history: a point where we have to examine whether everything we always thought was okay really is. When Google told the government to go to hell over its request for search records, many people thought, “It’s no big deal. I have nothing to hide.” In fact, many partisans went to great lengths to tar and feather privacy advocates who argued that even the slightest bits of information, when mingled together algorithmically, could churn out a reasonably good picture of a person’s habits, personality, etc. Despite many reassurances from the search providers who did turn over records that there was no personally identifiable information in those search logs, some of us just weren’t buying it.
Some of us even went as far as to request that if the search engines were going to release this supposedly innocuous data to the government, they should release it publicly too, seeing as they believed it could never be traced back to a human source.
Fast forward to last week. AOL made what, even for them, has to be the worst move you can possibly make when you’re in a position of personal trust. Users, trusting that AOL would protect their privacy, used AOL’s Google-powered search engine happily and repeatedly, clicking away for different tidbits on the vast series of tubes we call the internet. In return for the use of their search engine, AOL decided to release hundreds of thousands of search results to the general public in the hopes that people would do cool things for them, analyze their metrics, teach them some new tricks, and do some awesome academic studies on the search habits of AOL’s users.
The plan had a flaw, though: AOL released that data to the general public with no need for credentials, no opt-out for customers, and only minimal protection of the actual information stored therein (no IPs were released, and usernames were replaced with cryptic names like USER927 and 9437777). The tech-hungry public grabbed that data because of the gold mine it represented and started munging through it at an alarming rate, reporting stories back to the public of how one guy searched for pictures of corpses, how to kill his wife, and so on. Another searched for prom dresses and incest pictures. One guy searched for black gay porn and was curious to know if “n*****s have x-ray vision” (just for the record, they don’t).
All harmless and if anything, just a fun look at the f’ed up world we live in, right?
Not exactly.
As many privacy experts have said all along, there’s more than enough information in these logs to find someone’s identity. People have certain habits when they search (googling their own name, for example). When you combine enough queries over a long enough time, guess what? You can easily identify a source. Just ask the New York Times. They did it, and they did it with only a month’s worth of data for .0333% of AOL users. They found the woman because she queried the name of her town, a dog that pissed all over the place, and numbness in her fingers. “But Vinny,” you’ll say, “it’s only one person.”
You’re right. It’s only one person. And that one person was identified by manually digging through hundreds of thousands of results over a period of two days. Scale up the volume exponentially and the equipment parsing it a thousandfold, and suddenly this innocuous data becomes a user profile. Scared yet? Frankly, you should be, because what the government wants is on a far more comprehensive scale.
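To make that concrete, here’s a rough Python sketch of what the machine version of that digging might look like. It is not how the New York Times did it (they did it by hand), and everything in it is made up for illustration: the user IDs, the queries, the clue categories, the keyword lists, and the function names are all assumptions. The idea is simply to group every query under its “anonymous” ID and count how many different kinds of personal clues pile up.

```python
from collections import defaultdict

# Hypothetical log rows: (anonymized_user_id, query). The IDs and queries are
# invented for illustration and are not taken from the AOL release.
SAMPLE_LOG = [
    ("1001", "landscapers in springfield"),
    ("1001", "dog that urinates on everything"),
    ("1001", "numb fingers"),
    ("1001", "jane doe springfield"),
    ("1002", "cheap flights"),
    ("1002", "weather tomorrow"),
]

# Hypothetical clue categories: keywords that hint at where someone lives,
# who they are, or what is going on in their life.
CLUE_KEYWORDS = {
    "location": ["springfield"],
    "name": ["jane doe"],
    "health": ["numb"],
    "household": ["dog"],
}


def build_profiles(log):
    """Group every query under the anonymized ID that issued it."""
    profiles = defaultdict(list)
    for user_id, query in log:
        profiles[user_id].append(query)
    return dict(profiles)


def clues_for(queries):
    """Return the sorted clue categories that appear in a user's queries."""
    found = set()
    for query in queries:
        for category, keywords in CLUE_KEYWORDS.items():
            if any(keyword in query for keyword in keywords):
                found.add(category)
    return sorted(found)


def likely_identifiable(log, min_clues=3):
    """List the IDs whose combined queries hit at least min_clues categories."""
    profiles = build_profiles(log)
    return [uid for uid, queries in profiles.items()
            if len(clues_for(queries)) >= min_clues]
```

No single query in that toy log gives anyone away. It’s the pile of them under one ID that does, and a loop like this doesn’t get tired after two days.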
And it isn’t just search records, either. They want to do the same thing with phone records. Everyone you call, every call you receive. Everything. Oh sure, they tell you that they’re only combing for known terrorists, but how do they know you’re a terrorist until they do a little digging? And how do we know that a list like the AOL mess won’t get out, only instead of showing that you looked up granny tranny porn, it’s gonna show that you once called your ex-girlfriend while you were engaged?
There’s a point to all this, folks.
Privacy advocates are right. You should be outraged that AOL just haphazardly gave your data out to millions of people. They’re going to be licking their wounds about it for a long time, and I’m sure their “AOL Security Edition” is going to cause quite a few chuckles among those in the know from now on. However, you should also be outraged that the government wants more “anonymous non-identifiable information.” Face it, folks: if they couldn’t figure out what they wanted from the data they were requesting, they wouldn’t be requesting it.
Spin it however you want, but the facts are the facts.
August 10th, 2006 at 12:29 pm
The general public needs a better education about internet cookies, too (including how to block and delete them). The vast majority of the time, cookies serve the interests of marketers, not consumers, and violate privacy. And of course marketers don’t want consumers to know that this is the case. Grocery store club cards also create an enormous repository of private information that could easily be shared or leaked. (See http://www.nocards.org.) They really are not OK.
March 23rd, 2007 at 3:19 pm
A Work In Progress
Data Mining, Part II: Identification…
I saw this on the front page of the New York Times this morning, and I was stunned at how quickly anonymous data becomes not so……