Short – THEORY on how to tune for RankBrain

THIS IS JUST A HYPOTHESIS!!! NOT PROVEN !!!

Thanks for listening to SEO Fight Club. This isn’t a full episode. This is just a short, but it is a very important and time-sensitive topic, so I wanted to share it right away.

On October 26, 2015, Jack Clark at Bloomberg Business wrote an article announcing RankBrain to the world.

I’ll put the link in the show notes.

http://www.bloomberg.com/news/articles/2015-10-26/google-turning-its-lucrative-web-search-over-to-ai-machines

Since the announcement I’ve heard dozens of SEOs try to explain RankBrain and fail, and none of them were able to tell you how to tune for it. It just so happens that I am a software engineer with some background in AI. In this short I’m going to show you what I believe RankBrain does, and I will tell you how I think you can tune for it so you can start taking advantage.

In the article Google says:

For the past few months, a “very large fraction” of the millions of queries a second that people type into the company’s search engine have been interpreted by an artificial intelligence system, nicknamed RankBrain, said Greg Corrado, a senior research scientist with the company.

RankBrain uses artificial intelligence to embed vast amounts of written language into mathematical entities — called vectors

This is geek speak for Natural Language Processing. This is where software can understand the clauses and parts of speech within a sentence. NLP can identify the subject and predicate in a sentence, which verbs are actions of which nouns, which adjective phrases apply to which noun phrases, and so on. It can even map things across different sentences. And of course it handles synonyms, equivalent phrasings, and word stemming too.

The article then says:

If RankBrain sees a word or phrase it isn’t familiar with, the machine can make a guess as to what words or phrases might have a similar meaning and filter the result accordingly, making it more effective at handling never-before-seen search queries.

Because this is NLP, it is no longer basic search term matching. It is mathematically working out what an answer to a question is, based on language graphs. Because of this it can handle never-before-seen queries.
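To make the "vectors" idea concrete, here is a minimal sketch of how software can guess which known word is closest in meaning to an unfamiliar one. The three-number vectors and the word list are made up by me for illustration. Real systems learn much larger vectors from huge amounts of text, and nothing here is Google's actual method.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
# Real systems learn vectors with hundreds of dimensions from large text corpora.
TOY_VECTORS = {
    "predator": [0.9, 0.8, 0.1],
    "carnivore": [0.8, 0.9, 0.2],
    "herbivore": [0.1, 0.7, 0.9],
    "title": [0.2, 0.1, 0.1],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_known_word(unknown_vector, vectors=TOY_VECTORS):
    """Pick the known word whose vector points in the most similar direction."""
    return max(vectors, key=lambda word: cosine_similarity(unknown_vector, vectors[word]))


# Hypothetical vector for a phrase the system has never seen before.
unfamiliar_phrase_vector = [0.85, 0.75, 0.15]
print(nearest_known_word(unfamiliar_phrase_vector))
```

The point is only that once words are numbers, "similar meaning" becomes a distance calculation, which is why a never-before-seen phrase can still be matched to something familiar.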

The system helps Google deal with the 15 percent of queries a day it gets which its systems have never seen before, he said.

For example, it’s adept at dealing with ambiguous queries, like,
“What’s the title of the consumer at the highest level of a food chain?”

RankBrain is breaking down the clauses in the question to find the criteria for the answer:

What’s the title (that’s the basic query)
of the consumer (criteria on which title)
of a food chain (criteria on which consumer)
at the highest level (criteria on which level of a food chain)

Then RankBrain is looking for sentences with clauses that meet those criteria.

Acceptable Answers to the question are:

tertiary consumer
quaternary consumer
apex predators
predators
carnivores

The most commonly accepted answer appears to be “predators”.

If you look at the results for the example they provide, one of the page-one results is OFF TOPIC. But it demonstrates the value of a high-PR page with an exact match of the question in the main content but no answer.

Above that in the #2 spot is a result with the most common answer.

Google knew to make “predators” bold. They know to treat it like a keyword match.

This is likely what RankBrain does: on a question query, it folds in the likely answer to the question as one of the search term matches, so a page containing the answer counts as a hit.

The pages with the exact match question and no answer ranked lower than the pages with the generally accepted correct answer.
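Here is a rough sketch of that hypothesis in code, just to show the mechanics. The function names, the two sample pages, and the `answer_weight` value are all my own assumptions. This is not how Google scores pages; it only shows how treating the likely answer as a keyword hit could push an answer page above an exact-match page with no answer.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def score_page(page_text, question, likely_answers, answer_weight=2.0):
    """Score a page on question-term overlap plus a bonus for containing a likely answer.

    answer_weight is an invented number; the real weighting is unknown.
    """
    page_terms = tokenize(page_text)
    question_terms = tokenize(question)
    # Ordinary term matching: fraction of the question's words found on the page.
    question_score = len(question_terms & page_terms) / len(question_terms)
    # Hypothesized step: a likely answer on the page counts as a hit, like a keyword.
    answer_hit = any(tokenize(answer) <= page_terms for answer in likely_answers)
    return question_score + (answer_weight if answer_hit else 0.0)


QUESTION = "What's the title of the consumer at the highest level of a food chain?"
LIKELY_ANSWERS = ["predators", "apex predators"]

# Two made-up pages: one repeats the question with no answer, one states the answer.
PAGE_EXACT_MATCH = QUESTION + " Post your guess in the comments."
PAGE_WITH_ANSWER = "At the top of the levels are predators."

print(score_page(PAGE_EXACT_MATCH, QUESTION, LIKELY_ANSWERS))
print(score_page(PAGE_WITH_ANSWER, QUESTION, LIKELY_ANSWERS))
```

With these made-up numbers the page that states the answer outscores the page that only repeats the question, which matches what the search results appear to show.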

The second result clearly states: “At the top of the levels are predators” with predators put in bold. Google knowing to make predators bold IS the evidence that what I am describing is in fact happening in some fashion behind the scenes.

So if you want to rank well for question-and-answer queries, you need both the question and the answer on the page. You probably want to mark them up with schema.org markup.
A full-sentence answer that restates the question might perform better than just a simple one-word answer. For example:

“Apex predators are the consumer at the highest level of a food chain.”

Specificity probably helps your answer score. But I haven’t tested this. I’m just making assumptions about the observations that we now know to be true.
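For the markup suggestion, here is one possible sketch that builds schema.org JSON-LD for a single question and its answer. `Question`, `Answer`, `name`, `text`, and `acceptedAnswer` are real schema.org terms. The helper function name is my own, and whether this markup actually helps with RankBrain is untested, like the rest of this theory.

```python
import json


def build_question_markup(question, answer):
    """Return a schema.org Question with one accepted Answer as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Question",
        "name": question,
        "acceptedAnswer": {
            "@type": "Answer",
            "text": answer,
        },
    }


markup = build_question_markup(
    "What's the title of the consumer at the highest level of a food chain?",
    "Apex predators are the consumer at the highest level of a food chain.",
)

# Emit a script tag that could be pasted into the page's HTML.
print('<script type="application/ld+json">')
print(json.dumps(markup, indent=2))
print("</script>")
```

The visible page text should still contain the same question and answer; the markup only labels what is already there.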

You probably want to make sure that your answer is in agreement with the generally accepted answer.

Google’s AI is probably finding all the variations of the same question and all the variations of the same answer and using the ones with the most agreement as the “correct” answer.
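As a simple illustration of "most agreement wins", here is a sketch that normalizes answer variations and counts them. The list of observed answers and the variation table are invented sample data, and a real system would need far smarter normalization than a lookup table.

```python
from collections import Counter

# Hypothetical table mapping answer variations to one canonical form.
CANONICAL_FORMS = {
    "predator": "predators",
    "predators": "predators",
    "apex predator": "apex predators",
    "apex predators": "apex predators",
}


def normalize(answer):
    """Trim, lowercase, and map known variations to a canonical answer."""
    key = answer.strip().lower()
    return CANONICAL_FORMS.get(key, key)


def consensus_answer(answers):
    """Return the (answer, count) pair that the most sources agree on."""
    counts = Counter(normalize(answer) for answer in answers)
    return counts.most_common(1)[0]


# Made-up answers as they might appear across different pages.
observed_answers = [
    "Predators",
    "predator",
    "apex predators",
    "Predators ",
    "carnivores",
    "tertiary consumer",
]
print(consensus_answer(observed_answers))
```

If your page's answer normalizes to the consensus form, it agrees with the crowd; if it does not, under this theory it would not get credit for the answer hit.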

There is more evidence that what I am describing is happening. Midway down the page Google displays a “People Also Ask” block of similar questions:

Google has to understand what makes questions similar and that takes NLP.

The article goes on to say: “In the few months it has been deployed, RankBrain has become the third-most important signal contributing to the result of a search query.”

“Google search engineers, who spend their days crafting the algorithms that underpin the search software, were asked to eyeball some pages and guess which they thought Google’s search engine technology would rank on top.”

“While the humans guessed correctly 70 percent of the time, RankBrain had an 80 percent success rate.”

“Typical Google users agree. In experiments, the company found that turning off this feature ‘would be as damaging to users as forgetting to serve half the pages on Wikipedia.’”

This makes sense because pages with the correct answer to your question are typically much more useful than pages without the correct answer.