"Data, Search Engines: Why the Net Won't Let You Forget"

by Viktor Mayer-Schönberger

"Stacy Snyder wanted to be a teacher. By spring 2006, the 25-year-old single mother of two had completed her coursework at Millersville University in Pennsylvania and was looking forward to her future career. Then her dream died. Summoned by university officials, she was told she would not be a teacher, although she had earned all the credits, passed all the exams and completed her practical training (much of it with honors). She was denied her certificate, she was told, because her behavior was unbecoming of a teacher.

Her behavior? An online photo showed her in costume wearing a pirate's hat and drinking from a plastic cup. She had put this photo on her MySpace web page, and captioned it "drunken pirate", for her friends to chuckle over. The university administration, alerted by an over-zealous teacher at the school where Stacy was interning, argued that the online photo was unprofessional since it might expose pupils to a photograph of a teacher drinking alcohol. Stacy considered taking the photo offline. But the damage was done. Her page had been catalogued by search engines, and her photo archived by web crawlers. The internet remembered what Stacy wanted to have forgotten.

Stacy later unsuccessfully sued her university. She claimed that putting the photo online was not unprofessional behaviour for a budding teacher. After all, the photo did not show the content of the plastic cup and even if it did, Stacy was old enough to drink alcohol at a private party. This case, however, is not about the validity of the university's decision to deny Stacy her certificate. It is about something much more important. It is about the importance of forgetting.

Since the beginning of time, for us humans, forgetting has been the norm and remembering the exception. Today, because of digital technology and global networks, this balance has shifted. Forgetting has become the exception and remembering the default. The potential consequences are enormous.

Stacy Snyder's case is not exceptional. Dozens of cases of profound embarrassment, and even legal action, have occurred since then - from the lawyer who cannot get the internet to forget an article published in a student newspaper more than a decade ago to a young British woman who lost her job because she mentioned on Facebook that it was "boring". Worldwide, perhaps 600 million people have pages on social networking sites. Disclosing one's information - through Facebook or MySpace entries, blogs, photos, networks of "links" or "friends", content preferences, "geo-tagging" or "tweets" - has become deeply embedded in youth culture. As these young people grow older, and more adults adopt similar habits, Stacy Snyder's case will become paradigmatic, not just for a generation, but for society as a whole.

Web 2.0 has fuelled this development, but conventional publishing - paired with the power of the internet - has rendered similar results. Take the case of Andrew Feldmar, a Canadian psychotherapist in his late 60s living in Vancouver. In 2006, on his way to pick up a friend from Seattle-Tacoma International Airport, he tried to cross the US/Canadian border, as he had done over 100 times before. This time, however, a border guard queried an internet search engine for "Feldmar". Out popped an article Feldmar had written for an interdisciplinary journal in 2001, in which he mentioned that he had taken LSD in the 1960s. Feldmar was held for four hours, fingerprinted and, after signing a statement that he had taken drugs almost four decades ago, was barred from further entry into the United States.

An accomplished professional with no criminal record, Feldmar knows he violated the law when he took LSD. But he maintains he has not taken drugs since 1974, more than 30 years before the border guard stopped him. It was a time in his life that was long past, an offence that he thought had long been forgotten by society as irrelevant to the person he had become. But because of digital technology, society's ability to forget has become suspended, replaced by perfect memory. "I should warn people that the electronic footprint you leave on the net will be used against you," Feldmar said. "It cannot be erased."

Snyder and Feldmar had voluntarily disclosed information about themselves. Often, we disclose without knowing. Outside the German city of Eisenach lies MAD, a mega-disco with space for 4,000 guests. When customers enter, they have to show their passport or ID card; particulars are entered into a database, together with a digital mug-shot. Guests are issued a special charge card, which they must use to pay for drinks and food. Every such transaction is added to a guest's permanent digital record. By the end of 2007, MAD's database contained information on more than 13,000 individuals and millions of transactions. Sixty digital video cameras continuously capture every part of the disco and its surroundings; the footage is recorded and stored in over 8,000GB of hard disk space. Real-time information about guests, their transactional behaviour and their consumption preferences is shown on large screens in a special control room. Management boasts how, through the internet, local police have 24/7 online access to customer information stored on MAD's hard disks. Few if any of the disco's guests realise that their every move is being recorded, preserved for years, and made available to third parties - creating a comprehensive information shadow.

For an even more pervasive example, consider internet search engines. Crawling web page by web page, Google, Yahoo!, Bing, Ask.com and a number of others index the web, allowing all of us to access it simply by typing a word or two into a search field. We assume that such search engines "know" a great deal of the information that is available on the global internet. However, they remember much more than what is posted on web pages.

In the spring of 2007, Google conceded that until then it had stored every single search query ever entered by each of its users, and every single search result a user subsequently clicked on. By keeping the massive amount of search terms - about 30 billion search queries reach Google every month - neatly organised, Google is able to link them to demographics. For example, Google can show search query trends, even years later. It can tell us how often "Iraq" was searched for in Indianapolis in the fall of 2006, or which terms the Atlanta middle class sought most in the 2007 Christmas season. More importantly, though, by cleverly combining log-in data, cookies, and IP addresses, Google is able to connect search queries to a particular individual across time - with impressive precision.

The result is striking. Google knows for each one of us what we searched for and when, and which search results we found promising enough to click on. Google knows about the big changes in our lives - that you shopped for a house in 2000 after your wedding, had a health scare in 2003, and a new baby the year after. But Google also knows minute details about us: details we have long forgotten, discarded from our mind as irrelevant, but which nevertheless shed light on our past: perhaps that we once searched for an employment attorney when we considered legal action against a former employer, researched a mental health issue, looked for a steamy novel, or booked ourselves into a secluded motel room to meet a date while still in another relationship. Each of these information bits we have put out of our mind, but chances are Google hasn't. Google knows more about us than we can remember ourselves.

Google has since announced that it will no longer keep individualized records forever, but will anonymize them after a period of nine months, erasing some of its comprehensive memory. But keeping individualized search records for many months still provides Google with a very valuable information trove it can use as it sees fit. And once the end of the retention period has been reached, Google's pledge is only to erase the individual identifier of the search query, not the actual query, nor the contextual information it stores. So while Google will not be able to tell me the terms I searched for and what search results I clicked on five years ago, it may still be able to tell me what a relatively small demographic group - middle-aged men in my income group, owning a house in my neighborhood - searched for on the evening of 10 April five years ago.

Google is not the only search engine that remembers. Yahoo!, with about 10 billion search queries every month, and the second-largest internet search provider in the world, is said to keep similar individual records of search queries, as does Microsoft. But other organisations, too, collect and retain vast amounts of information about us. Large international travel reservation systems are similarly remembering what we have long forgotten. Credit bureaux store extensive information about hundreds of millions of individuals. For example, the largest US provider of marketing information offers up to 1,000 data points for each of the 215 million individuals in its database. In addition, doctors keep medical records and are under economic and regulatory pressure to digitise and commit decades of highly personal information to digital memory. Law enforcement agencies store biometric information about tens of millions of individuals even if they have never been charged with a crime - and most of these sensitive yet searchable records are never deleted. And in the UK alone, 4.2 million video cameras survey public places and record our movements."
