The biggest unknown in SEO is always the algorithm. Many myths and rumors surround the “what” and “how” of Google results. Although SEO experts won’t be able to know the specific algorithm used, by examining the data you can figure out how best to influence results.
At Searchmetrics, we continually aggregate billions of data points a month and naturally look for the answer to the question: “Which factors are relevant for a good ranking in Google search results?” With this study, based on data from February and March 2012, we get closer to the answer through an analysis of 10,000 selected top keywords, 300,000 websites and millions of links, shares and tweets. The analysis compares potential ranking factors, that is, website characteristics, with the corresponding Google rankings in both the UK and the U.S. by assessing their statistical correlation. For example, if many of the pages in the top positions of the analyzed SERPs have the keyword in their title tags, then we have identified a high correlation.
The data shows that all of the factors that have a positive effect on SEO in the UK also have a positive effect on SEO in the U.S., although to different degrees and perhaps in a different order of importance relative to other variables. Likewise, the factors that have a negative impact in one country also have a negative impact in the other. We will explain the strong shared trends, surprising results and differences between the two regions.
A few highlights from our analysis
- Social media signals show extremely high correlation: social signals from Facebook, Twitter and Google+ are frequently associated with good rankings in Google’s index. This is interesting in particular for the UK, which hasn’t had such a strong correlation with social signals up to this point.
- Too much advertising is detrimental: for the first time we are seeing sites with too many advertisements struggling to rank well. However, the problem correlates only with AdSense ad blocks.
- Backlinks are still important but quantity is not the only important thing: even though the number of backlinks is still the most powerful factor, links with stop words and ‘nofollow’ should also be included in the link-mix.
- Brands override classic SEO signals: apparently pages with strong brands do not need to be as concerned with areas such as title tags, headings etc. According to our figures, this group operates under different rules.
- Keyword domains still frequently attract top results: despite all the rumors to the contrary, keyword domains are still alive and well and are often in the top rankings.
Please see our U.S. and UK Whitepaper for more details.
Factor Overview
The clearest way to present the correlations between different factors and Google search results is Spearman’s rank correlation coefficient.
The larger the bar, the greater the correlation. The correlation coefficient is displayed on the x-axis. Factors with greater values along the x-axis (e.g. Facebook Shares) have a positive correlation with rankings (the more, the better), while factors with lower values (e.g. keyword position in title) have a negative correlation. Therefore, we can say that the largest positive correlation is with Facebook Shares, and the most negative correlations are with “number of words in text” (for the U.S.) and “keyword position in title (words)”.
Between the UK and the U.S. some strong differences appear, specifically in the order of a factor’s importance. For instance, the number of words in the text has a large negative correlation in the U.S. but less than half that correlation in the UK. This may be because highly ranked sites in the U.S. are less text-rich than their counterparts in the UK.
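To make the measure concrete, here is a minimal sketch of how a Spearman rank correlation between one factor and ranking position could be computed. It is not the Searchmetrics pipeline: the function names, the sample numbers and the sign convention (negating the position so that "more of the factor at better positions" comes out positive) are all assumptions made for illustration.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend `end` over the run of values tied with the one at `start`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # indices are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data for a single SERP (invented numbers, not study data).
positions = [1, 2, 3, 4, 5]
facebook_shares = [900, 400, 450, 120, 30]

# Position 1 is the best ranking, so the position is negated (assumed
# convention) to make "more shares at better positions" a positive value.
rho = spearman(facebook_shares, [-p for p in positions])
print(round(rho, 2))  # 0.9
```

In a study of this size the coefficient would be computed across many SERPs and then aggregated; how that aggregation was done is not stated here, so the sketch stops at a single result list.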
Data collection
Data for this study was collected in February and March 2012. In the results, you can see the effects of various Panda Updates, which have significantly changed the look of results since the start of 2011.
One impact over time is that, for the UK, social signals have jumped in importance this year, almost having the same impact on search as they do in the U.S.
Social signals are as strong as ever
Facebook and Twitter signals correlate as follows with higher rankings in the U.S. and the UK:
UK Data:
In the case of social measures, the U.S. and the UK are in lockstep in terms of the importance of the various factors. Facebook ‘Shares’ appear to have the strongest association, followed directly by the number of backlinks in the overall summary. Twitter is far behind these values but is still the 6th strongest metric in our analysis, behind the Facebook metrics and the number of backlinks. A note on Google+: analyzing Google +1s with a Spearman correlation, we found a significant result of 0.41. From this we can assume that the quantity of +1s has the strongest correlation of any of the metrics analyzed in the study.
Advertisements might be an obstacle
Too many and/or excessively clumsy advertisements were presumed to be a factor in the Panda Update and its successors. The data in this study support this assumption as all our analyzed advertisement factors returned a negative correlation. The negative correlation was slightly stronger in the UK (-0.05 vs -0.04 for the U.S.).
General AdLinks (common integrations, e.g. Commission Junction, AdSense and others) are slightly less negative than the use of AdSense alone. However, it is important to note that the correlation value above is for AdLinks across all integrations, including AdSense. If we plot the percentage of pages at each ranking position that use AdSense against the percentage that use any of the other analyzed competitor networks, we arrive at a surprising conclusion:
U.S. Data:
UK Data:
We can clearly see that AdSense advertisements drop sharply among the top rankings. However, all other forms of advertisement that we analyzed have in fact remained consistent. The bottom line is, then, that only AdSense has a negative correlation.
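The percentage trend described above can be sketched as a simple tally per ranking position. The input format (pairs of position and a flag for whether the page carries AdSense) and the sample rows are assumptions for illustration only; they are not taken from the study's data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def adsense_share_by_position(results: Iterable[Tuple[int, bool]]) -> Dict[int, float]:
    """Return the percentage of pages using AdSense at each ranking position.

    `results` is a hypothetical list of (position, uses_adsense) pairs,
    one pair per ranking URL across all analyzed SERPs.
    """
    totals: Dict[int, int] = defaultdict(int)
    with_adsense: Dict[int, int] = defaultdict(int)
    for position, uses_adsense in results:
        totals[position] += 1
        if uses_adsense:
            with_adsense[position] += 1
    return {p: 100.0 * with_adsense[p] / totals[p] for p in sorted(totals)}


# Invented sample: four SERPs, three positions each.
sample = [
    (1, False), (1, False), (1, True), (1, False),
    (2, True), (2, False), (2, True), (2, False),
    (3, True), (3, True), (3, True), (3, False),
]
shares = adsense_share_by_position(sample)
print(shares)  # {1: 25.0, 2: 50.0, 3: 75.0}
```

Running the same tally for each of the other ad networks and comparing the curves is what separates "AdSense drops toward the top" from "advertising in general drops toward the top".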
Backlinks are still SEO gold – but standards are rising
Regardless of the rising power of social media, backlinks remain one of the most critical factors in achieving good rankings.
The correlation data supports this – following Facebook metrics, the number of backlinks is the factor that most strongly correlates with good rankings. Moreover, there still appear to be other quality factors at play when dealing with backlinks:
U.S. Data:
UK Data:
These figures indicate that the proportion of nofollow links correlates more strongly with rankings than the proportion of links containing keywords in the U.S., but there is almost no difference between the two in the UK. Even the proportion of links containing a stop word can have an effect. This strong correlation for factors that seem to suggest a more natural link structure illustrates a trend suspected by many SEOs, that dull, perfectly keyword-optimized links are often no longer effective and that another strategy is necessary.
Brand power endures
For a while now, the rule among SEOs has been that brands enjoy a ranking advantage and that it’s particularly worthwhile if you can establish yourself as this type of brand. However, the ‘brand’ factor is difficult to establish in large-scale data analyses: it is nearly impossible to ascertain the thematic criteria for a brand without access to a search engine’s algorithm. Still, it’s not entirely impossible.
For a few of our analyzed on-page criteria, the effect of brand-power is obvious and even seems to turn the ‘conventional’ SEO logic on its head. This is noticeable with the following elements, all of which (surprisingly) feature a negative correlation:
U.S. Data:
UK Data:
The core message of this graph is that the less often a keyword appears in the headline or title and the fewer the words in the text, the better a page will rank. In addition, text quantity has negative effects in both countries. This is quite surprising at first. However, if you look at the precise trend of these metrics for the top 30 places up until right before the top result, the factors follow this correlation. In first place, the natural niche for brands, the factors do not apply as much.
“For many sites owned by brands, such as sony.com and nike.com, the ‘conventional’ SEO logic is turned on its head,” explains Marcus Tober, CTO and founder, Searchmetrics. “When we looked at sites that are in the top position on page one of Google – the natural position occupied by brands – they tend to have less text on the page, particularly in the U.S., and fewer keywords in headlines and titles.” Once the top sites are netted out, these factors have a fairly neutral or, in the case of quantity of text, only slightly negative effect.
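The idea of netting out the top position can be illustrated by computing the same rank correlation twice, once over all positions and once without position 1. The sketch below uses the shortcut Spearman formula, which is only valid when there are no ties, and a toy word-count series invented so that a single text-light page in first place drives the negative value. None of the numbers come from the study.

```python
from typing import List, Sequence


def ranks_no_ties(values: Sequence[float]) -> List[int]:
    """Rank values from 1 to n, assuming all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def spearman_no_ties(x: Sequence[float], y: Sequence[float]) -> float:
    """Shortcut formula: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), no ties."""
    n = len(x)
    rx, ry = ranks_no_ties(x), ranks_no_ties(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical SERP: a text-light brand page in position 1, and no clear
# relationship between text length and position among the remaining results.
positions = [1, 2, 3, 4, 5, 6]
word_counts = [100, 600, 900, 700, 500, 800]

# Negate positions (assumed convention) so that a positive rho means
# "more words goes with a better ranking".
goodness = [-p for p in positions]

rho_all = spearman_no_ties(word_counts, goodness)
rho_without_top = spearman_no_ties(word_counts[1:], goodness[1:])

print(round(rho_all, 3))          # -0.429
print(round(rho_without_top, 3))  # 0.0
```

In this toy series the whole negative correlation comes from the first position, which mirrors the pattern described in the quote; real data would of course be noisier and would contain ties.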
Keyword domains & URL keywords
The power of keyword domains has been known for years and is still clearly visible in our analysis:
U.S. and UK Markets in Lockstep:
Keyword domains correlate much better with high rankings than results from just any random start page. The correlation with keyword domains is also significantly higher than the correlations for keywords in the rest of the URL title. Although Google has repeatedly emphasized that these sites will slowly weaken in power, this does not yet seem to be the case.
Additional on-page factors
We have tried to examine as many factors as possible for their potential impact, at least as far as is possible with such a large sample base. Besides the factors that have been summarized in the main points above, there are still a number of other identified factors that differ little from the expected results or not at all.
These include the following on-page factors:
U.S. Data:
UK Data:
Although you might think that pages with a lot of multimedia content would tend to rank better (possibly also indirectly through better user signals e.g. social media links), there was only a slight positive correlation for both U.S. and UK markets in our analysis between rankings and sites with more images (where all images except placeholders were counted).
Just as unexpected is the negative correlation between rankings and text length in the U.S. (In our UK analysis, quantity of words has little effect.) Title length and keyword title position (according to character and word) featured slightly negative correlations. This corresponds with the experience that keywords placed earlier tend to be more strongly weighted and individual keywords have less weight in longer titles.
Be careful about drawing conclusions: correlation ≠ causation!
We would like to emphasize that a correlation does not in any way guarantee that the corresponding factors have an effect on rankings or that they are even used by Google as signals. Questions like “does a site receive social signals because it ranks well or does it rank well because it receives social signals?” are absolutely valid and cannot be answered unequivocally with the current data.
Some information regarding our data
For the U.S. dataset we selected an extremely large keyword set of 10,000 search terms from Google.com U.S. However, we did not just include the top 10,000 search terms according to search volume, since they contain a disproportionately high number of brand keywords, which might have distorted the assessment of many other key factors. Instead our reference dataset includes:
- A mix of different keywords with, if not the largest, then at least generally high search volumes,
- Around 1 in 10 are keywords that were identified as navigation-oriented according to our logic,
- The rest are a mix of keywords from a variety of CPC areas to best cover transactional (higher CPC) and information-oriented (lower CPC) searches as well as the hybrids in between.
- The assessment was limited to organic searches – AdWords, Universal Search OneBoxes, 2 to 7 packs, sitelinks, iGoogle integrations etc. were not included.
- The analysis’ 10,000 analyzed keywords lead to 30,000 SERPs with 300,000 titles, descriptions and URLs.
- The ranking sites’ content included 14.68 GB in data, 92,672 AdSense blocks, 338,562,612 Facebook comments, 3.04 billion shares and 8.1 billion likes.
For the UK dataset we likewise selected an extremely large keyword set of 10,000 search terms, this time from Google UK. Again, we did not just include the top 10,000 search terms according to search volume, since they contain a disproportionately high number of brand keywords, which might have distorted the assessment of many other key factors. Instead our reference dataset includes:
- A mix of different keywords with, if not the largest, then at least generally high search volumes,
- Around 1 in 10 are keywords that were identified as navigation-oriented according to our logic,
- The rest are a mix of keywords from a variety of CPC areas to best cover transactional (higher CPC) and information-oriented (lower CPC) searches as well as the hybrids in between.
- The assessment was limited to organic searches – AdWords, Universal Search OneBoxes, 2 to 7 packs, sitelinks, iGoogle integrations etc. were not included.
- The analysis’ 10,000 analyzed keywords lead to 30,000 SERPs with 300,000 titles, descriptions and URLs.
- The ranking sites’ content included 14.68 GB in data, 97,884 AdSense blocks, 248,603,582 Facebook comments, 2.0 billion shares and 7.0 billion likes.