Every page a web browser loads comes with a response code, sent alongside the HTTP headers and not visible on the page itself. Servers use these response codes to communicate a page’s loading status, and one of the best-known codes is the 404 response code.
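A code between 400 and 499 usually denotes that the page didn’t load because of a problem with the request. The 404 response code has a particular meaning: the server could not find the requested page. On its own it does not say whether the page is gone for good; a server that wants to signal permanent removal can send a 410 code instead.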
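To make the code ranges concrete, here is a minimal Python sketch that labels a status code by its family. It uses only the standard library; the function name `describe_status` is made up for this illustration.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return the code, its standard phrase, and its family."""
    status = HTTPStatus(code)  # raises ValueError for codes the library does not know
    if 200 <= code < 300:
        family = "success"
    elif 300 <= code < 400:
        family = "redirection"
    elif 400 <= code < 500:
        family = "client error"
    elif 500 <= code < 600:
        family = "server error"
    else:
        family = "informational"
    return f"{code} {status.phrase} ({family})"


for code in (200, 301, 404, 410):
    print(describe_status(code))
```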
Understanding soft 404 error
Simply put, a soft 404 isn’t a response code that a server sends to web browsers. Instead, it’s a label that Google applies to pages in its index. Some servers are poorly configured and return a 200 code for a missing page that should return a 404. When the HTTP response carries a 200 code even though the page itself says the content was not found, the page can get indexed, which wastes Google’s resources.
To resolve this problem, Google looks at the traits of a typical 404 page and tries to work out whether a page is really a 404. In other words, if it looks, acts, and smells like a 404, chances are it is a 404 page.
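Google does not publish how it makes this call, so the following is only a rough sketch of the idea: a page that answers with a 200 code while its text reads like an error page is a soft 404 candidate. The phrase list and the function name `looks_like_soft_404` are assumptions chosen for illustration.

```python
# Hypothetical phrases that often appear on "not found" pages.
NOT_FOUND_PHRASES = ("page not found", "error 404", "no longer available")


def looks_like_soft_404(status_code: int, body_text: str) -> bool:
    """Flag pages that return 200 but read like an error page."""
    if status_code != 200:
        # A real error code is a hard error, not a soft one.
        return False
    text = body_text.lower()
    return any(phrase in text for phrase in NOT_FOUND_PHRASES)


print(looks_like_soft_404(200, "Sorry, Page Not Found"))
print(looks_like_soft_404(404, "Sorry, Page Not Found"))
```

You could run a check like this over your own crawl results to catch misconfigured pages before Google labels them.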
Wrongly Recognized As Soft 404
Sometimes a page is not missing at all, yet certain traits lead Google to classify it as missing. These traits overlap with factors handled by the Panda algorithm, which treats duplicate and thin content as negative ranking factors: pages with little or no content, and sets of very similar pages. Fixing these problems therefore helps you avoid both Panda issues and soft 404 labels.
404 errors have two principal causes:
- A link error that leads users to a non-existing page.
- A page that existed before and has since disappeared.
The linking mistake
The hard part is locating the broken links on the website. If the root cause of a 404 is a linking mistake, you need to fix the links. That gets increasingly difficult on big, complex sites with millions of pages, and in such cases crawling tools work best. You can use software like Botify, Screaming Frog, Xenu, and DeepCrawl.
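For a sense of what such crawlers do under the hood, here is a minimal sketch that extracts links from page HTML and reports the ones that resolve to a 404. It is not how any of the tools named above work internally. The names `LinkCollector` and `find_broken_links` are made up, and the status lookup is passed in as a function so the example runs without network access; in practice it would issue real HTTP requests.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the <a href="..."> tags on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page URL.
                    self.links.append(urljoin(self.base_url, value))


def find_broken_links(pages, get_status):
    """pages maps page URL to HTML; get_status maps a URL to a status code."""
    broken = []
    for page_url, html in pages.items():
        collector = LinkCollector(page_url)
        collector.feed(html)
        for link in collector.links:
            if get_status(link) == 404:
                broken.append((page_url, link))
    return broken


# Stand-in data for illustration only.
pages = {"https://example.com/": '<a href="/about">About</a> <a href="/old-page">Old</a>'}
statuses = {"https://example.com/about": 200, "https://example.com/old-page": 404}
print(find_broken_links(pages, lambda url: statuses.get(url, 404)))
```

Each result pairs the page that contains the bad link with the link target, which tells you where to make the fix.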
A page that doesn’t exist anymore
If a web page doesn’t exist anymore, there are two options:
- Try to restore it in case it got removed accidentally.
- Do a 301 redirect to the nearest related page in case it got purposely removed.
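As a sketch of the second option, the small WSGI application below sends a 301 redirect for a purposely removed page and a genuine 404 (not a 200) for anything unknown. The paths in `REDIRECTS` and `EXISTING` are hypothetical, and a real site would normally configure this in its web server or CMS instead.

```python
# Hypothetical mapping from removed pages to their nearest related page.
REDIRECTS = {"/old-pricing": "/pricing"}
# Hypothetical set of pages that still exist.
EXISTING = {"/", "/pricing"}


def app(environ, start_response):
    """Minimal WSGI app: 301 for moved pages, 200 for live ones, 404 otherwise."""
    path = environ.get("PATH_INFO", "/")
    if path in REDIRECTS:
        start_response("301 Moved Permanently", [("Location", REDIRECTS[path])])
        return [b""]
    if path in EXISTING:
        start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
        return [b"page content"]
    # Send a real 404 status so the page cannot be mistaken for a live one.
    start_response("404 Not Found", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"Page not found"]


def request(path):
    """Call the app directly and return (status, headers, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app({"PATH_INFO": path}, start_response))
    return captured["status"], captured["headers"], body


print(request("/old-pricing")[:2])
print(request("/missing")[0])
```

The `request` helper only exists to exercise the app without starting a server.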
First, though, you need to find the pages that return the error. As with finding linking errors on a big website, turn to the crawling tools. But these tools might not detect orphaned pages, which aren’t linked from any other page or from the navigation.
Orphaned pages can exist if they were once part of the website. After a redesign, the internal links leading to an old page might disappear, while external links from other websites still point to it. To check whether such pages exist on your site, you can use a mix of tools.
Google Analytics doesn’t provide a missing-page report by default, but you can trace missing pages in several ways. One is to create a custom report and segment pages whose page title mentions “Error 404 – Page Not Found”. Another way to locate orphaned pages in Google Analytics is to create a custom content group and assign all 404 pages to it.
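If you export page data to a CSV file, the same title-based segmentation can be sketched in a few lines of Python. The column names `page_path`, `page_title`, and `pageviews` are assumptions for this example and will differ depending on how your export is set up; the title string must match whatever your own 404 template uses.

```python
import csv
import io

# The title your 404 template sets; adjust to match your site.
ERROR_TITLE = "Error 404 – Page Not Found"


def pages_with_404_title(csv_text):
    """Return (path, pageviews) for rows whose title marks a 404, busiest first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    hits = [
        (row["page_path"], int(row["pageviews"]))
        for row in reader
        if ERROR_TITLE in row["page_title"]
    ]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Made-up sample rows for illustration.
sample = (
    "page_path,page_title,pageviews\n"
    "/home,Home,120\n"
    f"/gone,{ERROR_TITLE},3\n"
    f"/old,{ERROR_TITLE},7\n"
)
print(pages_with_404_title(sample))
```

Sorting by pageviews puts the missing pages that visitors hit most often at the top of the list.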
Google Search Console
Search Console alerts you to 404 pages that Google’s crawler has come across. These can include URLs the crawler reached through links from other sites that point to pages which used to exist on your website.
If you search Google using the site: operator, it lists pages from your site that are already indexed, optionally narrowed by a search term. You can then check each one individually to see whether it loads or shows a 404. To do this at scale, use a digital marketing tool that can run the site: operator in Google and in other search engines like Yahoo, Bing, Seznam, and Baidu.
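Whichever mix of sources you use, the underlying comparison is simple: orphan candidates are URLs known from outside sources (analytics, Search Console, indexed results) that your crawler never reached by following internal links. A minimal sketch, with the function name `find_orphans` and the sample URLs made up for illustration:

```python
def find_orphans(known_urls, crawled_urls):
    """Return URLs seen in outside sources but never reached by the crawl."""
    return sorted(set(known_urls) - set(crawled_urls))


# Stand-in data for illustration only.
known = ["https://example.com/", "https://example.com/2015-sale", "https://example.com/about"]
crawled = ["https://example.com/", "https://example.com/about"]
print(find_orphans(known, crawled))
```

Each candidate still needs a status check, since an orphaned page may load fine or may return a 404.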
How Can You Resolve The Soft 404 Errors?
Crawling tools won’t spot a soft 404 directly, because it isn’t a real 404 error. However, you can use them to detect the page traits that lead to soft 404s. Look for the following:
- Thin content: Some crawling tools don’t just report thin-content pages but also show each page’s word count. You can sort the URLs by the number of words in the content, start with the pages that have the fewest words, and assess whether each one is thin.
- Duplicate content: Some crawling tools are sophisticated enough to work out how much of a page is template content. If the primary content is similar to that of other pages, look at the page and work out why your site has duplicate content.
Besides crawling tools, you can use Google Search Console and look under its crawl errors to spot pages flagged as soft 404 errors. If you crawl the complete site to discover the problems that result in soft 404s, you can find and fix them before Google comes across them. Once you have detected the soft 404 errors, you have to correct them. That can involve simple steps such as expanding pages with thin content or replacing duplicate content with new, unique content.
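The thin-content and duplicate-content checks described above can be approximated with the Python standard library. This is a rough sketch and not how any particular crawling tool works: the function names, the sample pages, and the 0.9 similarity threshold are all assumptions, and the input is assumed to be each page’s main text with the template already stripped out.

```python
from difflib import SequenceMatcher
from itertools import combinations


def word_count(text):
    return len(text.split())


def thinnest_pages(pages, limit=10):
    """Return (url, word count) pairs, fewest words first."""
    counts = ((url, word_count(text)) for url, text in pages.items())
    return sorted(counts, key=lambda item: item[1])[:limit]


def near_duplicates(pages, threshold=0.9):
    """Return pairs of URLs whose word sequences are at least `threshold` similar."""
    pairs = []
    for (url_a, text_a), (url_b, text_b) in combinations(pages.items(), 2):
        ratio = SequenceMatcher(None, text_a.split(), text_b.split()).ratio()
        if ratio >= threshold:
            pairs.append((url_a, url_b, round(ratio, 2)))
    return pairs


# Made-up page texts for illustration.
pages = {
    "/shoes-red": "red shoes for sale in all sizes",
    "/shoes-red-copy": "red shoes for sale in all sizes",
    "/contact": "contact",
}
print(thinnest_pages(pages, limit=1))
print(near_duplicates(pages))
```

Comparing every pair of pages gets slow on large sites, so in practice you would limit the comparison to pages that share a template or URL pattern.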
Does Google Address Hard and Soft 404 Errors Equally?
Soft 404 errors are not actual 404 errors. However, Google treats pages it labels this way much like missing pages and can drop them from its index if the problem isn’t resolved quickly. Hence, it’s always better to crawl the site regularly to check for both soft 404 and 404 errors. Crawling tools should be a principal element of your SEO arsenal.
Lalit Sharma is the founder and CEO of Ranking By SEO, a leading SEO company in India. He has been working in the SEO industry since 2005. He contributes to SEMrush, SocialMediaToday, Entrepreneur, and other publications. Connect with him on Twitter.