Google’s fight against problematic content has drawn renewed attention to a common question: how does Google know what’s authoritative? The simple answer is that it has no single “authority” metric. Rather, Google looks at a variety of undisclosed metrics which may even vary from query to query.
When Google first began, it did have a single authority figure. That was called PageRank, which was all about looking at links to pages. Google counted how many links a page received to help derive a PageRank score for that page.
Google didn’t just reward pages with a lot of links, however. It also tried to calculate how important those links were. A page with a few links from other “important” pages could gain more authority than a page with many links from relatively unremarkable pages.
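To make that idea concrete, here is a minimal Python sketch of a PageRank-style calculation. It follows the general approach in the published PageRank paper and is not Google's production code. The function name `pagerank`, the damping value of 0.85, the iteration count, and the toy link graph (`portal`, `quality`, `popular` and the filler pages) are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank-style scores by repeated redistribution.

    `links` maps each page to the list of pages it links to.
    The damping value and iteration count are illustrative assumptions.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: "portal" is linked from five pages, which makes it
# important, and it links only to "quality". "popular" receives three links,
# but all of them come from pages that nothing else links to.
example_links = {
    "f1": ["portal"], "f2": ["portal"], "f3": ["portal"],
    "f4": ["portal"], "f5": ["portal"],
    "portal": ["quality"],
    "x1": ["popular"], "x2": ["popular"], "x3": ["popular"],
}

ranks = pagerank(example_links)
for page in ("portal", "quality", "popular"):
    print(page, round(ranks[page], 4))
```

In this toy graph, `quality` has a single inbound link and `popular` has three, yet `quality` ends up with the higher score because its one link comes from a page that is itself heavily linked.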
Even pages with a lot of authority — a lot of PageRank — weren’t guaranteed to rocket to the top of Google’s search results, however. PageRank was only one part of Google’s overall ranking algorithm, the system it uses to list pages in response to particular searches. The actual words within links had a huge impact. The words on the web pages themselves were taken into account. Other factors also played a role.
These days, links and content are still among the most important ranking signals. However, artificial intelligence — Google’s RankBrain system — is another major factor. In addition, Google’s ranking system involves over 200 major signals. Even our Periodic Table of SEO Success Factors that tries to simplify the system involves nearly 40 major areas of consideration.
None of these signals or metrics today involve a single “authority” factor as in the old days of PageRank, Google told Search Engine Land recently.
“We have no one signal that we’ll say, ‘This is authority.’ We have a whole bunch of things that we hope together help increase the amount of authority in our results,” said Paul Haahr, one of Google’s senior engineers who is involved with search quality.
What are those things? Here, Google is quiet and provides no specifics. The most it will say is that it hopes the bucket of factors it uses as a proxy for authority really does correspond to making authoritative content more visible.
As I’ve explained before, the human quality raters who review Google’s search results have no direct impact on particular web pages. It’s more like the raters are diners in a restaurant, asked to review various meals they’ve had. Google takes in those reviews, then decides how to change its overall recipes to improve its food. But in this case, the recipes are Google’s search algorithms, and the food is the search results it dishes up. Google hopes the feedback from raters, along with all its other efforts, provides results that better reward authoritative content.
“Our goal in all of this is that we are increasing the quality of the pages that we show to users. Some of our signals are correlated with these notions of quality,” Haahr said.
While there’s no single authority figure, that bucket of signals effectively works like one. That leads to the next issue. Is this authority something calculated for each page on the web, or can domains have an overall authority that transfers to individual pages?
Google says authority is assessed on a per-page basis. In particular, it avoids the idea of sitewide or domain authority because that can lead to false assumptions about individual pages, especially those on popular sites.
“We wouldn’t want to look at Twitter or YouTube as, ‘How authoritative is this site?’ but how authoritative is the user [i.e., individual user pages] on this site,” Haahr said.
It’s a similar situation with sites like Tumblr, WordPress or Medium. Just because those sites are popular, using that popularity (and any authority assumption) for individual pages within the sites would give those pages a reward they don’t necessarily deserve.
What about third-party tools that try to assess both “page authority” and “domain authority”? Those aren’t Google’s metrics. They are simply guesses by third-party companies about how Google might be scoring things.
That’s not to say that Google doesn’t have sitewide signals that, in turn, can influence individual pages. How fast a site is and whether it has been affected by malware are two things that can have an impact on the pages within it. In the past, Google’s “Penguin Update,” which was aimed at spam, also operated on a sitewide basis. Haahr said that’s no longer the case, a shift made last year when Penguin was baked into Google’s core ranking algorithm.
When all things are equal with two different pages, sitewide signals can help individual pages.
“Consider two articles on the same topic, one on the Wall Street Journal and another on some fly-by-night domain. Given absolutely no other information, given the information we have now, the Wall Street Journal article looks better. That would be us propagating information from the domain to the page level,” Haahr said.
But pages are rarely in “all things equal” situations. Content published to the web quickly acquires its own unique page-specific signals that generally outweigh domain-specific ones. These include the signals in the bucket used to assess page-specific authority. In addition, the exact signals used can vary depending on the query being answered, Google says.