Having doubts about the Searchmetrics’ US Google Ranking Factors study?


OK, so you have doubts about Searchmetrics’ US Google Ranking Factors – Rank Correlation 2013 study

Earlier this week, we announced our US Google Ranking Factors – Rank Correlation 2013 study. This tries to identify the key factors that well-positioned pages have in common and that separate them from lower ranking pages in Google searches. We performed our analysis by looking at those factors that correlate with pages that rank well.

Of course, a lot of very clever SEO experts have been looking at this whole area for a very long time and have strong opinions about which factors are and are not important. And when they looked at our data, many of them asked questions and expressed doubts about what we’re trying to say. So here’s a response to some of the main concerns we’ve seen. Hopefully it’s useful.

How did you collate the data and run the study?

The study analysed Google US search results for 10,000 keywords and 300,000 websites featuring in the top 30 positions, as well as billions of backlinks, Tweets, Google +1s, Pins and Facebook likes, shares and comments.

The data was collected in March 2013 and again in June 2013 to take account of Google’s recent Penguin 2.0 algorithm update.  Altogether we looked at over 70 different factors (although not all were included in the final analysis) and we calculated the correlations between them and the Google search results using Spearman’s rank correlation coefficient.
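As a rough illustration of what Spearman’s rank correlation coefficient measures (this is a sketch, not the code used in the study), the example below ranks two lists and applies the textbook formula rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), which is only valid when there are no tied values. The backlink counts are invented, and negating the positions so that a positive value means "more of the factor at better positions" is an assumption about the sign convention.

```python
def rank(values):
    """Return 1-based ranks (1 = smallest value). Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks


def spearman_no_ties(x, y):
    """Spearman's rho via the sum of squared rank differences (no ties)."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rank(x), rank(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical data: six search positions and an invented backlink count each.
positions = [1, 2, 3, 4, 5, 6]
backlinks = [900, 700, 750, 300, 200, 100]

# Position 1 is the best result, so positions are negated (an assumed
# convention) to make "positive rho" mean "more backlinks at better positions".
rho = spearman_no_ties([-p for p in positions], backlinks)
print(round(rho, 3))  # 0.943
```

Real data contains many ties (for example, several pages with zero likes), in which case the coefficient is usually computed as the Pearson correlation of average ranks rather than with this shortcut formula.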

So those factors that have a high correlation coefficient have the biggest impact on rankings?

No, we can’t say for sure. This is about correlation, not causation. With correlation we can say that more highly positioned pages were more likely to have certain factors (or have more of those factors), but we can’t assume those factors definitively influence or cause high rankings. It’s impossible to be certain, unless you are Google!

But why do factors such as keyword in H1 and title have 0 (or near 0) correlation – are you saying they’re useless?

No, please don’t make hasty, literal interpretations from a quick look at the data. We’re not trying to say that.

With Spearman’s coefficient, a score of +1 implies a perfect positive correlation and a score of -1 implies a perfect negative correlation. A high positive correlation coefficient occurs for a factor if higher ranking pages have that feature (or more of it) while, at the same time, lower ranking pages do not (or have less of it).

But certain factors, such as keyword in H1, tended to have a very low correlation because they are present on nearly all pages that appear in the top 30 search results. In this case there is little difference in the way these factors relate to high ranking pages and low ranking pages. They’re always there, which actually results in a low or zero Spearman correlation coefficient!

It may seem a little absurd and confusing, but this issue of zero or near zero correlation occurs for some very basic on-page factors (such as the existence of H1 headings, a keyword in the meta description and site speed). These factors are almost ever-present and should absolutely not be disregarded by SEO teams. You can find out more about this issue by reading our in-depth report about the 2013 ranking factors.
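To make the zero correlation effect concrete, here is a minimal sketch with invented data (not taken from the study). It compares a factor that is present on 29 of 30 results with a factor that is present only in the top 10, using a tie-aware Spearman coefficient computed as the Pearson correlation of average ranks. The function names and the sign convention (negated positions) are assumptions made for this example.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    return [
        sum(w < v for w in values) + (sum(w == v for w in values) + 1) / 2
        for v in values
    ]


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when one variable never varies
    return cov / (sx * sy)


positions = list(range(1, 31))        # top 30 results for one keyword
goodness = [-p for p in positions]    # higher value = better position

# Hypothetical factor present on every page except the one at position 17.
has_h1_keyword = [0 if p == 17 else 1 for p in positions]
# Hypothetical factor present only on the top 10 results.
only_top_10 = [1 if p <= 10 else 0 for p in positions]
# Hypothetical factor present on every single page.
everywhere = [1] * 30

rho_ubiquitous = spearman(goodness, has_h1_keyword)  # roughly 0.03
rho_selective = spearman(goodness, only_top_10)      # roughly 0.82
rho_everywhere = spearman(goodness, everywhere)      # nan: nothing to compare

print(rho_ubiquitous, rho_selective, rho_everywhere)
```

The near-ubiquitous factor scores close to zero even though every well-ranked page has it, simply because there is almost no variation left for the coefficient to pick up.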

Do social signals have an impact on rankings?

OK, there is a huge debate about this. Many people are convinced Google is not using social data as part of its algorithm. Indeed, many believe it’s impossible because Google can’t even access Facebook and Twitter. But if you do a site query in Google for the number of pages that were indexed, you’ll see that Google has more than 5 billion pages indexed. These are pages that are accessible, and Google absolutely knows what’s going on on these sites.

Please understand that we do appreciate the arguments against social signals being a ranking factor – we’re not denying them. Our data simply shows that social signals do correlate with rankings. And of course we know that, from Google’s perspective, this makes perfect sense, i.e. good content is shared often and Google tries to rank good content. We can’t say from the data in this study whether there’s a causal relationship. So you can interpret it how you choose to.

If you don’t know which links pass value in the search indexes then your conclusions are highly dubious….

Nobody knows which links pass value – except Google, which determines the value of links itself. And as we said: we are not Google! Furthermore, the value of links is influenced by several factors (“value” of the link source, quality of the link etc.), which we have also taken into account. Moreover, it’s more than just link attributes that influence rankings. All we can do is look at the features that well-ranking pages have and interpret the data. And we are not guessing, since our interpretations are based on an extremely large data set.

Your data says exact match domains have decreased in importance, yet many EMDs rank very highly – so why is that?

We did not claim that Google punished all exact match domains. What we discovered was that EMDs seem to have lost their “ranking bonus”. If you’ve read our Ranking Factors 2012 edition, then you know that EMDs had some kind of bonus until last year. That era ended with Google’s Penguin and EMD updates.

Until 2012, there were lots of keyword domains ranking well in the SERPs that did not provide any value for the user except having the keyword in the domain name, together with ads on the page. Most of them were absolutely irrelevant to the user’s query and requirement for information. What Google seems to have done now is devalue these irrelevant domains.

Of course this does not mean that all EMDs are irrelevant now. In fact, there are still EMDs ranking well. We know that. But these are largely the domains that offer some kind of relevance for the user. The irrelevant ones are more or less gone. For the US, our data indicates that there are about 25% fewer EMDs in the top 10 now, in comparison to 2012.

You will always find exceptions. But this is normal, because if keyword domains have great content, why shouldn’t they rank?

But the correlations shown for some factors are very low – under 0.4. Is that a typo?

No, this is not a typo. The absolute value of the correlation coefficient should be interpreted as an indicator of the relative strength of the correlation of the corresponding factor with top 30 rankings, in relation to the other factors and our data set. Since there are no comparable studies, we cannot really say whether a correlation coefficient value of 0.4 is high or low. Given the high variability of the data, our best guess is that a coefficient of 0.4 for a single factor indicates a “good” correlation, while a coefficient of less than 0.1 – 0.2 indicates a “low” correlation. The high variability of the data is also the reason why we did not publish the results of statistical significance tests. In a high variability setting, such tests tend to accept the null hypothesis of “no correlation”, since the presence of considerable variance and heteroscedasticity overestimates standard errors, which in turn increases the chance of type II errors (i.e. accepting that there is no significant correlation, while in fact there is).

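Are factors with low correlations important for us to consider?

Yes, you should take them into consideration, but do not forget the zero correlation issue: a factor that is present on almost every top 30 page can show a near zero correlation and still matter.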

What about the reliability of the published correlation coefficients – aren’t Google’s rankings highly variable? What about factors influencing each other?

Some have argued that our conclusions are flawed because we did not consider multi-collinearity and possible detrimental effects of high variance on correlation values.

To address the first issue: In this study, we analyzed the correlation between rank and an individual factor, i.e. we performed a simple linear regression with a single explanatory variable for each of the factors. Collinearity, in contrast, refers to multiple regression, where a dependent variable is explained in terms of several input variables. In that latter case, and only then, the estimates of the coefficients (i.e. the correlation values) of the input variables are dependent on effects of collinearity. Thus, if we had analyzed, say, the effect of having a keyword in the title AND a high number of Facebook Likes on a good ranking, we would have had to make sure that the correlation values obtained from such a model were valid with respect to any collinearity. The argument is thus simply invalid for the conclusions drawn from this study.

We mitigated the effects of high variance and of heteroscedasticity by filtering the observed values for extreme outliers before computing correlation values. To quote from Wikipedia: “Regression analysis using heteroscedastic data will still provide an unbiased estimate for the relationship between the predictor variable and the outcome, but standard errors and therefore inferences obtained from data analysis are suspect.” Thus, even though the factors we analyzed may exhibit considerable variance, even after filtering, the estimate of the relationship between factor and rank is still correct. Any further conclusions, however, such as a statistical hypothesis test of the significance of the observed estimate, may suffer due to the overestimation of standard errors.
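The post does not say which outlier rule was used, so the following is only a sketch of one common approach: drop observations whose factor value lies more than k times the interquartile range beyond the quartiles before the correlation is computed. The function name, the IQR rule itself, the choice of k = 3 and the sample data are all assumptions made for illustration.

```python
from statistics import quantiles


def drop_extreme_outliers(pairs, k=3.0):
    """Keep (position, value) pairs whose value is within k * IQR of the quartiles.

    The IQR rule and the default k are assumptions for this sketch; they are
    not taken from the study.
    """
    values = [v for _, v in pairs]
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(p, v) for p, v in pairs if low <= v <= high]


# Hypothetical (position, Facebook shares) observations with one extreme value.
observations = [
    (1, 120), (2, 95), (3, 110), (4, 80),
    (5, 50_000), (6, 70), (7, 60), (8, 65),
]

kept = drop_extreme_outliers(observations)
print(kept)  # the (5, 50000) observation is removed, the other seven remain
```

The filtered pairs would then be passed to whatever correlation routine is in use. Because Spearman’s coefficient works on ranks, a single extreme value shifts it far less than it would shift a Pearson correlation on the raw values.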

Social signals’ effect on ranking – a small addendum:

Again, remember that correlation is not causation. For example, a page could gain a large number of social signals simply because it appears prominently in Google’s SERPs, and hence comes to the attention of many people who in turn may share it. In this case, there would be a high correlation between social signals and a good rank, but the large number of social signals would be the effect, not the cause, of Google’s ranking strategies.

Hopefully the above will have clarified our position about the areas that people are unsure of when looking at our data.  We’ve put a lot of time and effort into collating, analyzing and presenting the correlation data.  And we think what we’re presenting is of value.  But SEO experts have different opinions and we’re happy to have a discussion about this.

One thing we would urge people to do, however, is to read the detailed report rather than make assumptions from a quick look at the infographic and charts. A lot of the issues people may be unsure of are explained in the report.

Liv Longley

16 thoughts on “Having doubts about the Searchmetrics’ US Google Ranking Factors study?”

  • Its nice post i loved it thanks for sharing this article.

  • It is smart to share this article because a lot of people are interpreting the data in the wrong way!

  • SEO Company Manchester 2013/06/29 at 9:19 pm

    Its taken me a long while to understand that Google hate us SEO guys
    when they were building there brand up they could not get enough content
    from people like us and now they throw every spanner in the works to make it
    twice as hard….The blackhatters throw crap spam links at Google and rank in days

    Maybe ive got this all wrong ?


  • thanks alot for this detailed explaination, i really appreciate this.
    Thank for the clarification

  • Charlie Southwell (@charliesaidthat) 2013/06/29 at 11:45 pm

    You make an interesting point with the correlation of page titles etc being ubiquitous and therefore making it a 0.

    Thanks for the clarifications, will certainly revisit the stats more closely now.

  • I recently had 2 articles hit 1000+ tweets in the same day. When I checked my Author Stats in GWT, my search impressions had tripled. It seemed clear to me that social media is ranking factor.

  • Good work collecting all that data.

    It would be interesting to compare these results to the websites that rank 100-130 for the same keywords and see how the results differ.
    In reality they could be exactly the same.

    In the content analysis it might be good next time to add a spell and grammar checker. Maybe something to try and assess the level of writing.
    After all, reading level is part of google’s search tools, so they are obviously measuring it themselves.

  • Thats bad news for social media haters like myself.

    However this ranking study shows a 0.3 correlation between plain ol likes and ranking, which obviously cant be helping rankings.
    The confusion still continues, social signals, are they affecting ranking? on big or small consideration..

  • Here’s my 2 cents…
    I’m an artist, not an SEO geek, but back in late December, in an attempt to build “likes” on my Facebook business page, I replace all page-specific FB Like code on my website to like my FB page rather than the individual articles of my website.
    Within 3 weeks my Google ranking, which usually bounced between #3 and #5 on the first page for “sea glass”, has already dropped to the second page.
    Now it could be that other forces are causing such a drop but I think it a coincidence that this would happen.
    I’m holding out a little longer just to see what happens but if my ranking doesn’t improve I’m putting back the “like” code to reflect the individual pages on my website rather than my FB page, and hopefully, re-acquiring my Google status.
    If that happens, and I do get back to better ranking, someone will be hard-pressed to convince me that Google doesn’t look at what websites people are “liking” in their FB newsfeeds.

  • chattel mortgage 2014/04/04 at 6:05 am

    of course like your web site but you need to test the spelling on quite a few of your posts. Many of them are rife with spelling issues and I find it very bothersome to tell the reality then again I’ll definitely come back again.

  • Thanks for sharing your thoughts on free pou ellen degeneres game app.

  • best cuisinart coffee maker consumer reports 2014/09/07 at 3:36 am

    Good post. I learn something totally new and challenging on websites I stumbleupon every
    day. It will always be helpful to read through articles frtom other
    writers and use something from their websites.

  • 2014/09/12 at 8:42 pm

    Admiring the time and effort you put into your website and detailed information you provide.

    It’s good to come across a blog every once in a while that isn’t the
    same out of date rehashed material. Fantastic read!
    I’vebookmarked your site and I’m including your RSS feeds to
    my Google account.

  • This text is invaluable. How can I find out more?

  • Hi J, please have a read of some of our other blogposts. We cover quite a lot of similar topics in a lot of detail, using examples from our software. Let us know what you think.

  • need quality links 2016/01/30 at 9:47 am

    Very quickly this web page will be famous amid all blog users,
    due to it’s good articles
