Gmail’s Data Collection
There was a recent, yet brief, uproar over a bit of a surprise from Gmail. Email privacy is always a major concern for any user, so people were a little shocked to learn that Google was using the contents of emails in its ad research.
Google has officially stated that:
Ad targeting in Gmail is fully automated, and no humans read your email in order to target advertisements or related information. This type of automated scanning is how many email services, not just Gmail, provide features like spam filtering and spell checking. Ads are selected for relevance and served by Google computers using the same contextual advertising technology that powers Google's AdSense program. (1)
To be perfectly clear, a program under Google's control is forcing their users' participation in what is effectively market research. This is a fairly deep issue, and one that's worth exploring. Despite some pressure, Google admits to and continues this practice under the guise of improving user experiences.
What They’re Doing
Gmail is effectively using a simple program to read the contents of your email and then run appropriate ads alongside it. For the most part, it will scan the contents and look for keywords. For example, when you're planning a vacation, it is likely that something like "Hawaii Vacation" will show up in one of your emails. You would then start to see ads about hotels in Hawaii, cheap flights, rental cars, etc.
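To make the keyword idea concrete, here is a minimal sketch of contextual ad matching. It is not Google's actual system; the `AD_INVENTORY` table, the `match_ads` function, and the ad titles are all hypothetical names invented for illustration, and a real ad platform would be far more elaborate.

```python
import re

# Hypothetical keyword-to-ad table; every entry is invented for illustration.
AD_INVENTORY = {
    "hawaii": ["Hotels in Hawaii", "Cheap flights"],
    "vacation": ["Rental car deals"],
    "laptop": ["New PCs on sale"],
}

def match_ads(email_text, inventory=AD_INVENTORY):
    """Return ads for each known keyword, in order of first appearance."""
    # Lowercase the text and split it into plain alphabetic words.
    words = re.findall(r"[a-z]+", email_text.lower())
    ads = []
    seen = set()
    for word in words:
        # Only count each keyword once, however often it is repeated.
        if word in inventory and word not in seen:
            seen.add(word)
            ads.extend(inventory[word])
    return ads

print(match_ads("Planning our Hawaii vacation for June!"))
```

Even a toy like this shows why the approach is attractive: no person reads anything, yet a single mention of a destination is enough to change what appears beside the inbox.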
For a real-life example, one reporter for the Daily Mail lost her laptop in a cab. In the few days that it took for her laptop to make its way back home, she naturally sent a few emails through Gmail to friends talking about the situation. When she got her laptop back, she was a bit surprised to keep seeing ads for new PCs alongside her usual emails, noting that Gmail must have picked up on the lost computer and started some poorly timed targeting (2).
Why They Do It
The issue at hand is actually fairly simple from a technical standpoint. Google is an ad provider. While it's mainly known for being a search engine, it makes its money from selling ad space to people who wish to advertise. More importantly, it doesn't sell the Internet equivalent of a billboard. Their advertising model is focused heavily on whether the viewer actually clicks the link or not. A minor boost to clicks can mean millions of dollars, depending on the category and size of the audience (3).
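A rough piece of arithmetic shows how a small change in clicks scales. The figures below (one billion ad impressions, a click rate rising from 1.0% to 1.2%, and fifty cents per click) are assumptions chosen purely for illustration, not numbers from Google, and `click_revenue` is a hypothetical helper.

```python
def click_revenue(impressions, click_through_rate, price_per_click):
    """Revenue under a pay-per-click model: only clicked ads earn money."""
    return impressions * click_through_rate * price_per_click

# All inputs are made-up illustrative values, not real Google figures.
IMPRESSIONS = 1_000_000_000
PRICE_PER_CLICK = 0.50

baseline = click_revenue(IMPRESSIONS, 0.010, PRICE_PER_CLICK)
targeted = click_revenue(IMPRESSIONS, 0.012, PRICE_PER_CLICK)

print(f"Baseline revenue: ${baseline:,.0f}")
print(f"Targeted revenue: ${targeted:,.0f}")
print(f"Difference:       ${targeted - baseline:,.0f}")
```

Under these assumed inputs, a click rate that is only 0.2 percentage points higher is worth about a million dollars, which is the kind of incentive that makes better targeting so tempting.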
The best way that any site can increase these clicks is by making efforts to have relevant advertising that their audience will genuinely want. Now, stop and think about Gmail. Google, via Gmail, has decided on a logical but somewhat disturbing extension of their service. It is now going through your email, though not with human eyes, and configuring the ads that it serves to you with this information in mind.
It's a unique position too. Think of the travel example. If you do any pre-trip discussion, Google can effectively slip in before you make any concrete plans, which means that you're more likely to be drawn toward their ads. With Gmail's email scanning, they effectively put themselves in a prime position to manipulate ad responses.
Why It’s a Problem
There are several issues with Gmail's scanning beyond a simple “invasion of privacy” angle, although that can't be ignored.
Obviously, Google isn't alone in gathering advertising information. Grocery stores have been doing this with rewards cards for a while, and big stores like Amazon watch on-site browsing to better target promotions. There is a clear difference though.
First, in those cases, it's a bit more of an altruistic motive. While neither is doing it out of the goodness of their hearts, there is a logic to it at least. Your neighborhood store is trading discounts for minor demographic information, which will ultimately help producers better understand their audience. While Amazon has used its marketing information to manipulate prices in the past, their suggestion service is basically just a targeted mailer to remind people of products they may want or deals they otherwise wouldn't have noticed. Gmail is just trying to get extra money (unless you genuinely care whether your ads are relevant).
Second, there is a boundary issue. Grocery stores do not have hidden recorders listening to my conversation to know which flavors of Hamburger Helper I like and why, or that I'm waiting for a sale on soda. Amazon is just watching what I've browsed on their site, using basic information that I knowingly give them. Gmail is looking through emails that will often encompass all aspects of one's life. They can get keywords and information on your workplace, personal life, relationships, hobbies, etc.
One particularly striking point is the liberties that Google is taking with non-users. Anyone who sends an email to a Gmail address will have their message read and analyzed. While this information can't be directly tied to an outside user, it is a fairly gray area to take such a drastic action when the sender never formally consents to it.
And I'd be remiss if I didn't acknowledge the current boogeyman, known as hacking. While I'm not personally an alarmist, the fact that Google is storing information about the content of emails and tying it to computers and accounts is a little alarming given their past experiences with hackers. It seems like they're just one unfortunate exploit away from giving up tons of personal information.
Why Their Rationalization Doesn’t Cut It
Gmail offers a few rationalizations for this behavior, which I think just makes the issue more disturbing.
The first is that they are trying to defend the behavior by stating that humans do not read the messages. I find that to be a perfectly fine defense. No person out there randomly knows about my travel plans, tech purchases, work, etc. That's a nice thing. Google is merely allowing a robot to read my private messages to crack into my thoughts and allow Google's marketing algorithms to better manipulate me into clicking ads. And what could be wrong with that?
The other defense is that they don't target any potentially harmful subjects. They state, "Also, we are careful about the types of content we serve ads against. For example, Google may block certain ads from running next to an email about catastrophic news." (1) Back-of-the-envelope work seemed to show that as long as you mentioned death, suicide or various disasters every 167 words, ads would not display. Again, I'm very glad that one person inside Google's engineering or marketing department realized that marketing funeral supplies to grieving families might not be a good idea.
They note that this scanning is similar to spell check or spam filtering. Which is true, technically. But it's almost like they're trying to equate a shameless money grab with a useful service.
Which makes sense, because they also try to argue that the ad preferences program improves the user experience. I would personally wish for fewer ads or better screened ads, but I guess invading my privacy to flood me with slightly relevant ads is a plus…somehow.
Sarcasm aside, this is really frustrating to hear. Google insists on acting like this service is fine, that it's merely the status quo and that they've gone to great lengths to help us. The simple truth is that they've managed to establish a nice beachhead for more serious privacy invasions. They moved the goal posts to a more favorable position. The arguments are “What do you care, it's just a robot, it could have been a real person” or “Hey, we don't use it in blatantly tasteless manners like we could have.” The idea of not doing it in the first place doesn't come up once in their entire document.
Does It Matter?
This is the main question for any breach of privacy like this, and I'd like to temper my growing anger towards the Gmail issue with a simple overview. While everyone should care about the greater issues affecting their privacy online, in practice the more tangible effects are what really matter.
For the vast majority of people, the use of Gmail email scanning will not impact you in any noticeable way. Google will continue to try to blindly find good ads for you based on what you're reading, and now they'll take a crack at your emails too. They'll do this to try to earn a few extra dollars from advertisers, who will happily pay because they hope to increase sales. It's a cycle that funds most of the Internet. They've been doing this with websites for a very long time, so it really isn't too shocking that they finally decided to expand out to Gmail.
If it bothers you, you can always find a decent ad-blocker and just ignore it. Keeping something out of sight does wonders for keeping it out of mind. You can also run something like NoScript, which will drastically reduce how much information they can record about you. If you never allow Google's tracking cookies on websites, your profile will be incomplete and fairly worthless to advertisers. If you're truly angry, there are plenty of fish in the sea. Choose another provider with more privacy-sensitive terms of service. Technology has a nice way of moving forward.
That said, it's a very disappointing incident. While it likely won't hurt you on a personal level, it has taken a nice chip out of the concept of Internet privacy. Emails exist in an odd state, somewhere between normal letters and simple data. Protecting them from prying eyes has been a long battle. Gmail tested the waters with this, and ultimately found that they could get away with openly scanning the emails of their users. Another wall of privacy has given way, and we didn't even get anything in return. That's certainly something that you should care about.
- Images from MorgueFile.com, http://morguefile.com/archive/display/34751, http://morguefile.com/archive/display/685423
- 1. Google’s Help Page
- 2. Daily Mail on Gmail’s Scanning
- 3. Google’s Notes on Clicks Vs. Impressions