Saturday, December 31, 2011

Ghettoizing the internet

kw: book reviews, nonfiction, technology, internet, privacy, polemics

Try this sometime, on a computer that you don't use for your banking or paying bills online (maybe at the public library...): Delete all cookies, go to a search web site of your choice and make a search. Then open the privacy window that lets you read all your cookies and see how many have appeared, all from a single action. There may be very few, or even none, but the top "cookie monster" that I've heard about so far was looking up "depression" on dictionary.com. The harvest? 233 tracking cookies!

Another interesting experiment: Call a friend on the phone, then both of you do the same simple search on Google or Yahoo! or Bing, and tell each other what the top ten hits are. Also take note of how many hits there are (it will probably be in the millions). Or you can make a game of it. Collaborate with five or ten friends to do several searches, chosen beforehand, and see who can collect the most hits, or the fewest (these friends all have to be honest!). It is important that everyone do this from their own home, because search personalization depends on your location and on whether you are logged in or searching anonymously.
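
For anyone who wants to put a number on how different two people's results are, here is a minimal Python sketch. It assumes each person has copied their top hits into a list by hand; the function name and the sample URLs are made up for illustration, and nothing here talks to a search engine.

```python
def compare_top_hits(mine, theirs):
    """Compare two ranked lists of result URLs (hypothetical helper)."""
    mine_set, theirs_set = set(mine), set(theirs)
    shared = mine_set & theirs_set
    union = mine_set | theirs_set
    return {
        # URLs both people saw, regardless of rank
        "shared": sorted(shared),
        "only_mine": sorted(mine_set - theirs_set),
        "only_theirs": sorted(theirs_set - mine_set),
        # Jaccard overlap: 1.0 means identical sets, 0.0 means nothing in common
        "jaccard": len(shared) / len(union) if union else 1.0,
        # How many ranks show exactly the same URL in both lists
        "same_position": sum(1 for a, b in zip(mine, theirs) if a == b),
    }


# Made-up sample data standing in for two friends' results
my_hits = ["a.example/1", "b.example/2", "c.example/3", "d.example/4"]
friend_hits = ["a.example/1", "c.example/3", "e.example/5", "f.example/6"]

report = compare_top_hits(my_hits, friend_hits)
print(report["shared"], round(report["jaccard"], 2), report["same_position"])
```

A low overlap between two people who typed the same words is a rough sign that something other than the query is shaping the ranking.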

Why should there be such differences? That is the subject of The Filter Bubble: What the Internet is Hiding From You by Eli Pariser. The book is a polemic, in the better sense of the word, exposing and decrying practices that disturb the author, and ought to be of concern to all of us. Yet it is not shrill. The 233 cookies example comes from page 6 of the book.

Just looking at my cookie collection, after a run by AdAware that removed a couple dozen tracking cookies, reveals nearly 260 domains that have cookies stored on my computer. Spot checking shows that my e-mail account server keeps 34 cookies; findagrave.com, where I do genealogical research, has 9; and my online bank keeps 8. Even this blog has one cookie stored there. It is safe to say there are at least a few thousand cookies on my computer, and for most of them I don't know whether I really need them.

Cookies are just the tip of the iceberg. Google stores everything we do through its web site, whether we are logged in or not. Google usually knows who we are even when we are not logged in. But our searching and clicking activities are aggregated into a great many categories that are said to be for purposes of "personalization": the company uses our preferences, as evidenced by what we look for and what we look at, to raise or lower the ranking of search results. The days of pure PageRank are long gone. The data about you that Google or Yahoo! or Bing or Amazon have gathered is stored on their own computers. I've even noticed that, when I finish a blog post and publish it, the next screen usually includes one or two ads that are pertinent to the blog post's subject (in case you didn't know, Blogger is a Google product).

What is the danger of personalization? How harmful can it be? That depends on the personality of each of us. I happen to be rather easy to influence when confronted by an authority figure. This is why I gave up live debating many years ago. It is also why I have learned to hang up quickly (even if it is rude) when a telemarketer or pollster calls. Sometimes I am kind enough to say, "No, sorry" before I hang up; sometimes not. Psst! Did you know that many "polls" over the telephone are actually attempts to influence the way you are likely to vote, or a product you might soon buy? Online marketers of both the commercial and political variety are experts at discerning the levers that influence what you will do next. They vary their approach by time of day, by the mood you seem to be in as seen in how you write, and by the various profiles about you, such as your Facebook account.

Here is a Facebook experiment: Check your News Feed, and sort it by time. Then go to the wall of someone with whom you almost never interact, and find something to "Like", or leave a comment or two. Return to your News Feed and see if that person's updates have magically appeared. Sometimes it takes leaving several tracks to get this to work. P.S. If you have more than 100 Facebook friends, keeping up with all of them is quite arduous. Let's all suggest that Facebook make available a sorted list of interactions so we can figure out a few people to unfriend.

In eight chapters, Pariser makes his case that all this personalization isolates us, using the very technology that was supposed to make us more connected. I have often said that, if two people have all the same opinions, one of them is redundant. We need variety in our lives, but too much personalization removes a great source of variety. If you only see search results that square with your past interests, how will you ever develop new interests? How can we grow? Serendipity, and the matching of diverse topics, drive our creativity. Personalization is a great creativity killer.

The author and others, including the authors of this USA Today article, mention products like Ghostery, AdBlock Plus, TrackerBlock and Do Not Track Plus as means to cut out part of the tracking. Nothing will eliminate it all. Thus the author suggests political and legislative solutions in his last two chapters. I am skeptical of the approach. Like dealing with a home invader at 3:00 AM, sometimes the .38 caliber "solution" is the only effective one; the police can only do something after a crime has been committed. Too late.

I have, instead, a suggestion that is in keeping with the way Google and other search sites already work. I suggest prefixes that influence the filters temporarily. I suppose you know that if you put "define:jejune" in the search box, the first hit will be a definition. Google has other prefix words. Here are some it needs to add:
  • all or every: Results filtered only by the PageRank algorithm that got Google started in the first place.
  • serendipity or ser for short: Results deliberately scattered among topical interests.
  • anti: Results from an interest set that is the opposite of my own. If I'm liberal, a bunch of right-wing rants; if I'm literary, some scientific and engineering stuff; if I like rock music, some old Roy Rogers or folk music or swing; and so forth.
  • statistical or stat: Results based solely on Bayesian statistical ranking, defeating even PageRank.
I am sure some more could be determined. This would shine a light on just what personalized ranking is doing, and we could make better choices as a result.
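
To make the idea concrete, here is a minimal Python sketch of how a search box might split such a prefix off the query. The prefix names are the ones listed above; the function name, the mode labels and the "personalized" default are hypothetical, and none of this reflects how Google actually handles its operators.

```python
# Hypothetical table mapping each proposed prefix to a ranking mode.
PREFIX_MODES = {
    "all": "pagerank_only",
    "every": "pagerank_only",
    "serendipity": "scattered",
    "ser": "scattered",
    "anti": "opposite_interests",
    "statistical": "bayesian_only",
    "stat": "bayesian_only",
}


def parse_query(raw):
    """Return (ranking_mode, search_terms) for a raw search-box string."""
    head, sep, rest = raw.partition(":")
    mode = PREFIX_MODES.get(head.strip().lower())
    # Only treat the text before the colon as a prefix when it is one of
    # the proposed words and some search terms actually follow it.
    if sep and mode and rest.strip():
        return mode, rest.strip()
    # Anything else falls through to the usual personalized ranking.
    return "personalized", raw.strip()


print(parse_query("anti:tax policy"))
print(parse_query("ser: jazz"))
print(parse_query("define:jejune"))
```

The prefix changes the ranking for that one search only, so the default behavior stays as it is and the user opts out one query at a time.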

The other political solution is a series of high-profile lawsuits. To avoid such suits, Google needs to provide a tool to show you what it has on you; Facebook needs a tool to show you whom it is filtering in and filtering out of your News Feed; and the other aggregating companies out there need to make available to all of us the data that exist about us, and allow us to correct mistakes, of which there are billions I am sure. Otherwise, our entire history will dog us forever, not taking proper account of changes in marital status, religious conversions, or the change in lifestyle that would result from a sudden windfall or its converse, loss of occupational income. How ironic would it be if a homeless woman using a library computer were besieged by ads based on her former status as a corporate VP?

The book's website has a "10 Things You Can Do" section that can help us all partially alleviate the effects of the Filter Bubble as it currently exists. On this last day of 2011, I wonder what the coming decade will bring.
