Going a Little Deeper – Analyzing the Ranking Factors for Baidu, the Largest Chinese Search Engine
Many Google ranking factor correlation studies have been published in the past – by us as well as other organizations and individuals – but, until now, no one had undertaken a systematic analysis of organic Baidu ranking factors.
When it comes to Baidu, many SEOs think about SEO in terms of what works in Google.
While this is not altogether a bad or wrong approach, it leaves gaping holes in their methods: in some respects Baidu is nothing like Google at all. That’s the exciting thing about this study, which I’ve put together and which we’re offering as a free download.
In mid-2020, I analyzed the URLs and the indexed snippets of Top 10 ranking pages in Baidu organic search results for around 50,000 Chinese search terms.
The rules for selecting the search terms were as follows:
- 100% Chinese (no numbers, no Latin letters, no Arabic, Japanese, or other languages)
- 100% Simplified Chinese (no Traditional characters, as commonly used in Taiwan and Hong Kong)
- Keywords were between 2 and 8 characters in length.
Who to believe?
Baidu SEO experts profess many conflicting opinions making it hard for SEOs to know who to believe. One example is on the topic of Country Code Top Level Domains (ccTLDs).
For example, Veronique Duong (SEO Expert at Fabernovel and author of Baidu SEO, published by ISTE Ltd.) says that there is an advantage to having a ccTLD when attempting to rank well in Baidu.
Gary Stevens, Front End Developer and author of the SEMRush Baidu SEO Guide, supports that belief: “Get a .cn or don’t bother. Baidu strongly favors the .cn domain suffix (China’s country code) over .com in ranking its search results.”
Dragon Metrics, by contrast, maintain that using a Chinese ccTLD is probably not a ranking factor, and I can validate that they are indeed correct! In 2017 I clearly disproved this myth, and my detailed Baidu ranking analysis this year confirmed it yet again.
Having a .com.cn or .cn domain does not guarantee any ranking advantage.
Chart: TLD distribution in the Baidu Top 10 (excluding Baidu-owned properties)
It’s clear that the generic .com TLD is by far the most dominant domain extension in the Baidu SERPs (not including Baidu-owned properties which make up around 50% of Top 10 Rankings. Download the study for more information about Baidu’s dominance).
Having a ccTLD is not a disadvantage though: 9% of TLDs in the Baidu Top 10 (excluding Baidu-owned websites) are Chinese ccTLDs such as .cn, .com.cn, .org.cn and .net.cn.
I’m not stating that there are clear ranking signals related to TLDs; the data simply doesn’t indicate that.
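For readers who want to run a similar TLD breakdown on their own ranking data, here is a minimal sketch. The function names and the example URLs are hypothetical, the suffix list contains only the Chinese ccTLDs named above, and the sketch does not exclude Baidu-owned hosts the way the study does.

```python
from collections import Counter
from urllib.parse import urlsplit

# Longest suffixes first, so ".com.cn" is matched before the plain ".cn".
CHINESE_CCTLDS = (".com.cn", ".org.cn", ".net.cn", ".cn")


def domain_suffix(url: str) -> str:
    """Return a Chinese ccTLD suffix if present, else the last host label."""
    host = (urlsplit(url).hostname or "").lower()
    for suffix in CHINESE_CCTLDS:
        if host.endswith(suffix):
            return suffix
    return "." + host.rsplit(".", 1)[-1] if "." in host else ""


def suffix_shares(urls):
    """Map each suffix to its share (0..1) of the given URLs."""
    counts = Counter(domain_suffix(u) for u in urls)
    total = sum(counts.values())
    return {suffix: count / total for suffix, count in counts.items()}


ranking_urls = [
    "https://www.example.com/a",
    "http://news.example.com.cn/",
    "https://example.cn",
    "https://www.example.com/b",
]
print(suffix_shares(ranking_urls))
```

A production version would more likely rely on the Public Suffix List than on a hand-written tuple of suffixes.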
Another Myth Busted – HTTP vs HTTPS
Many Baidu SEO bloggers have jumped to the conclusion that having your website set up as https is a ranking factor after Baidu announced they would use this as a signal.
The study found that more than 50% of search results in the Top 10 are https URLs, but there was no clear correlation indicating that https is a ranking factor. If close to half of the ranking URLs don’t use https, then not having switched yet is clearly not a deal breaker for SEOs.
Of course, I would advise any website owner who wants to be successful in China to encrypt their website. Not just because it makes logical sense but because it might well be a ranking factor; time will tell. However, the key thing to note is you should not expect a big ranking advantage from it.
Chart: Percentage of https URLs per Page 1 Ranking Position (1 to 10)
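A per-position breakdown like the one in this chart can be computed from a list of (position, URL) pairs. The sketch below shows one way to do it; the function name and the input shape are assumptions, not the study's actual pipeline.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def https_share_by_position(results):
    """results: iterable of (position, url) pairs, position in 1..10.

    Returns a dict mapping each position to the percentage of https URLs.
    """
    totals = defaultdict(int)
    secure = defaultdict(int)
    for position, url in results:
        totals[position] += 1
        if urlsplit(url).scheme.lower() == "https":
            secure[position] += 1
    return {pos: 100.0 * secure[pos] / totals[pos] for pos in sorted(totals)}


sample = [
    (1, "https://a.example/"),
    (1, "http://b.example/"),
    (2, "https://c.example/"),
]
print(https_share_by_position(sample))
```

If the percentages are roughly flat across positions 1 to 10, that is consistent with the reading that https is not a strong ranking signal.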
Subdomain Usage a Surprise
I continue to be convinced that, for Baidu SEO, it’s advantageous to distribute fundamentally different user intents across different subdomains.
I was surprised at what I found out about subdomain usage in the results.
The big Chinese players are leading the way when it comes to subdomains. Many of the biggest brands in China host their core businesses on the www subdomain, but their different site sections – Customer Support, FAQs, User Forums, Help Forums, Picture Galleries, Video Portals, Wikis etc. – are each hosted on unique, individual subdomains.
But there is a clear indication that Top 10 ranked pages tend to be hosted on the www subdomain.
Even if this correlation could look like a ranking factor, my belief based on the data is that it is only a correlation and not a definitive ranking factor. The fact that many companies publish their core business on the www subdomain leads me to this conclusion.
Chart: Percentage of www. Subdomains per Page 1 Ranking Position (1 to 10)
Because Baidu prefers a clear user focus per website and per subdomain, my advice is this: If an independent domain is chosen for the Chinese market, the core business should be placed on the www subdomain (e.g. www.mychinesedomain.com), while further user intent (blog, forum, Q&A, etc.) should be catered to and hosted on different subdomains.
If, on the other hand, a subdomain strategy for internationalization is already in use, such as cn.mydomain.com, I would architect the website to split user intent by directory, for example cn.mydomain.com/forum/, as this is the next best way to achieve clear structural separation after using separate subdomains.
But this starts to get a little philosophical, and I’m sure other people would see things differently. Let me know what you think in the comments below.
A Few (Unsurprising) Insights into Website Content
It won’t be a surprise to Sinophiles and anyone paying close attention to the digital landscape in China that more than 98% of top-ranking pages work with Simplified Chinese Characters.
Traditional Chinese Characters are primarily used in Hong Kong and Taiwan and, as the data shows, using too many Traditional Chinese Characters probably reduces your chances of success in Baidu search results.
Going a little deeper, page content, in general, is made up of 57% Chinese Characters (the remaining 43% is made up of Latin letters, numbers, punctuation marks and spaces) and ranks for search terms that consist of 100% Chinese characters.
Title Tags and Descriptions
Index Snippets Explained
In this image, Chinese characters are represented by the small squares.
An Invitation to Download the Study, on Us
- 99% of all Top 10 Ranked Pages refer to a Chinese social media channel
- Top 10 Ranked Pages have an average length of 3,194 characters, with the top pages having the most characters of all
- Top 10 Ranked Pages contain on average 28 images (<img>), 60% of which use the alt attribute
- 17.6% of the pages use tables (<table>), on average 8 each
- 88.7% of the pages use unordered lists (<ul>), on average 10 each, while only 7.9% use ordered lists (<ol>), on average 2 each
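Element counts like the ones above can be gathered per page with Python's built-in `html.parser` module. The sketch below is illustrative only: the class and function names are hypothetical, and an image is counted as using the alt attribute whenever the attribute is present, even if it is empty.

```python
from html.parser import HTMLParser


class ElementStats(HTMLParser):
    """Count images, images with an alt attribute, tables, and lists."""

    def __init__(self):
        super().__init__()
        self.counts = {"img": 0, "img_with_alt": 0, "table": 0, "ul": 0, "ol": 0}

    def handle_starttag(self, tag, attrs):
        if tag in ("table", "ul", "ol"):
            self.counts[tag] += 1
        elif tag == "img":
            self.counts["img"] += 1
            # attrs is a list of (name, value) pairs.
            if any(name == "alt" for name, _ in attrs):
                self.counts["img_with_alt"] += 1


def element_stats(html: str) -> dict:
    parser = ElementStats()
    parser.feed(html)
    parser.close()
    return parser.counts


page = (
    '<ul><li><img src="a.png" alt="logo"></li>'
    '<li><img src="b.png"></li></ul><table></table><ol></ol>'
)
print(element_stats(page))
```

Averaging these counts over all Top 10 pages, and over only the pages where a count is non-zero, would give figures comparable in shape to the bullet points above.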