Is Google Reading Your Text Messages?

by Nicholas Henkey


Google has been incorporating their products into their marketing ecosystem at a very high rate. What would stop them from gathering marketing intelligence from Android SMS?

I received an unusual article in my news feed on July 20th. Usually when Google sends me an article slightly outside of frequented websites I try to read it, as the practice helps mitigate echo-chamber effects. Importantly, the article contained several keywords in the title that could have been pulled out of my text messages from July 18th. Let’s explore this anecdote:

The Basic Facts

The Infamous Text Message

“[Dogs] are lucky, their job is to be happy… [as noted] in a book by Dale Carnegie, ‘How to Win Friends and Influence People’.”

The Article in Question

National Geographic’s headline – “Why Are Dogs So Friendly? Science Finally Has an Answer”

  • Note: the study’s sample size was super small and they did not reference peer review. It is unclear to me if the behavioral study actually qualifies as real science...

Additional Context

From Google’s perspective, they have been working on “natural language processing” for a while (this means that the computer can understand you). Here is a reference link to “Google Feud”. You can bet that the tech company pays more attention to the answers you get wrong than the answers you get right. This is because variations in human vocabulary can help develop a map of terms most useful to teach computers to read.

An obvious first step in natural language processing is developing software that can identify sentence keywords. When experimenting with this software, the best sources of natural language are e-mails and SMS, since we tend to be more candid with people we are familiar with. SMS text messages are better in this regard than e-mail… because spam. As keyword extraction improves, its results need to be tied to individual people in order for companies to turn a profit.

Psychological profiling is real… and profitable. I am obsessed with articles about genetics and evolutionary theory. Google, Amazon, and Facebook are all aware of this personal detail and will exploit these kinds of details at every opportunity to make a buck.

These details add up to Google creating a profile of me, which just means that they will spam articles about genetics into my news feed. So what is the control variable?

Simply “dogs.”

Dogs are fantastic creatures. They are playful and loving and just all around great… but apartment life can be animal cruelty. I purposely avoid looking at animal shelter websites because it is torture to find the perfect buddy and not be able to rescue him/her. Google should not have “dogs” as one of my frequent search-engine keywords.

The Fine-ish Details

Before I go any further, I must note that this newsfeed incident may be a coincidence. However, “We were doing it anyway!” works well as a PR excuse. As a netizen, it can be both entertaining and critical to keep your ear to the ground for “the next big thing.” With the amount of information on the internet combined with the sophistication of the Four Horsemen, anomalies are rare.

National Geographic Headline

The headline contained a couple of keywords directly from my text messages:

  • “Dogs” was the subject of my text messages
  • “Friend” is the root word of “Friends” in my text and of “Friendly” in the headline

This information is relatively simple to parse from a software perspective: dog, lucky, job, happy, “book title,” win, friend, influence, people. My software programming skills are limited but I could probably write a simple script to run each of those words through Google’s search engine. Any Silicon Valley firm has the expertise to check each term against a 10-point list of my personal interests.
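To make the parsing step concrete, here is a minimal sketch in Python. It is not anything Google is known to run. The stop-word list and the crude suffix-stripping `stem` function are my own assumptions, and the message is a lightly cleaned-up version of the text quoted earlier.

```python
import re

# Hypothetical stop-word list; a real system would use a far larger one.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "be", "by", "has",
    "how", "in", "is", "so", "their", "to", "why",
}

def stem(word):
    """Crude root-word finder: strip a trailing 'ly' or 's' (assumption, not a real stemmer)."""
    for suffix in ("ly", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def keywords(text):
    """Return the set of root words in text, ignoring stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return {stem(w) for w in words if w not in STOP_WORDS}

message = (
    "Dogs are lucky, their job is to be happy, as noted in a book by "
    "Dale Carnegie, How to Win Friends and Influence People."
)
headline = "Why Are Dogs So Friendly? Science Finally Has an Answer"

# Root words that appear in both the text message and the headline.
shared = keywords(message) & keywords(headline)
print(sorted(shared))
```

With these inputs the overlap is exactly the two root words listed above, “dog” and “friend.”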

National Geographic Content

Assuming that my suspicion was correct, I should have been able to find words in the article to reference against my messages. This was important to refine my thought-process on how an advertiser would use SMS to access consumers.

Here is a tally of keywords from the article:

  • Dog - 30
  • Friend - 5
  • People - 2
  • Happy - Does “wagging tail” count?
  • Job - 0, however I expect that many of the words in the article are related in Google’s natural language library. Examples could be “help” and “active”

One more keyword worth noting:

  • Gene - 19 uses of “Gene” or its derivatives

The top two keywords here are “dog” and “gene.”
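A tally like this is easy to automate. The sketch below counts every word that starts with a given root, so “genes” and “genetic” both count toward “gene.” The `sample` text is a made-up stand-in, not the National Geographic article, and prefix matching is an assumption that will over-count words such as “general.”

```python
import re
from collections import Counter

def tally_roots(text, roots):
    """Count the words in text that begin with each root (simple prefix match)."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for word in words:
        for root in roots:
            if word.startswith(root):
                counts[root] += 1
    return counts

# Made-up stand-in text; paste the real article body here to reproduce a tally.
sample = (
    "Dogs carry genes that wolves lack. "
    "The genetic change makes a dog friendly toward people."
)

tally = tally_roots(sample, ["dog", "friend", "people", "happy", "job", "gene"])
for root, count in tally.most_common():
    print(root, count)
```

Roots that never appear simply report a count of zero when looked up, which matches the “Job - 0” entry in the manual tally.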

Google Keyword Search

In order to verify that a simple script could have pulled up similar results to my Google News feed, I searched “gene dog” as of 5:30 PM on July 21st, 2017. The top article in Google News on Chrome desktop was from the Los Angeles Times: “Scientists find key 'friendliness' genes that distinguish dogs from wolves.” This article is very similar to the National Geographic article except that I sometimes read the Los Angeles Times’ content...

The operative words in the paragraph above are “simple script” because of cost. No government agency or company in the world has the processing power or budget to run an exhaustive profile on each of us every day. What would be pretty expensive, yet still have some ROI for an advertising company like Google, is to run a script of 100 to 1,000 keywords or combinations.

  • Note: Distributed processing is a fantastic candidate to keep costs low. The best kind of hardware is hardware that someone else paid for.
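For a sense of scale, here is a rough sketch of how a script might build a capped list of keyword combinations to search. The `build_queries` helper and its `limit` parameter are hypothetical, and “gene” is added as an assumed profile interest based on my reading habits described earlier.

```python
from itertools import combinations

def build_queries(keywords, max_terms=2, limit=1000):
    """Build search queries from single keywords and combinations, capped at limit."""
    queries = []
    for size in range(1, max_terms + 1):
        for combo in combinations(sorted(keywords), size):
            queries.append(" ".join(combo))
            if len(queries) >= limit:
                return queries
    return queries

# Keywords parsed from the text message, plus one assumed profile interest.
message_keywords = ["dog", "lucky", "job", "happy", "win", "friend", "influence", "people"]
profile_keywords = ["gene"]  # hypothetical entry from an advertising profile

queries = build_queries(message_keywords + profile_keywords)
print(len(queries))           # singles plus pairs
print("dog gene" in queries)  # the pairing I searched by hand

# Allowing three-term combinations grows the list quickly, so the cap matters.
capped = build_queries(message_keywords + profile_keywords, max_terms=3, limit=100)
print(len(capped))
```

Nine keywords yield only 45 single and paired queries, which is why a budget of 100 to 1,000 queries per message seems plausible to me.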

Additional Thoughts and Analysis

Nostalgically, I remember getting barraged by ads to purchase Ham radios such as this Kenwood while I was studying for my Technician license. It occurred to me that my YouTube and browsing habits were giving Amazon and Google information that they could use to convert a sale. This life experience taught me to stay aware of how online advertising is constantly improving.

Today, I am very watchful of the intersection of technology and marketing due, in large part, to my education and career path.

With a glancing review, I am unable to verify my suspicion that Google mines Android text messages for keywords. With that said, they have both the ability and the motivation. One of the core principles of IT security is, “where high motivation and ability exist, an action is very probable.” To mitigate blowback, Google is likely to begin the practice and announce that they are doing it later. As noted before, “we were doing it anyway” is a pretty dang good excuse for corporations.

Ultimately, a move to use SMS as a keyword source for advertising would open a whole new window of opportunity for advertising brokers to increase value-add to clients. Now is where I note that Android users are not Google’s clients. Android manufacturers and advertisers are the clients; we are the market to tap into. Conversely, this scenario would open a whole new level of discomfort for a certain subset of Android users...