How an SEO Solved a Weird Crawled – Currently Not Indexed Problem

A technical SEO published a case study on how he solved a curious Crawled – Currently Not Indexed problem on his site. While the solution he found may not apply to everyone facing this problem, his method of identifying and fixing it is a helpful walkthrough for troubleshooting technical SEO issues.

What happened to the indexing of his site was really weird. But his solution was simple and logical.

I discovered a description of this problem in a tweet from Adam Gent (@Adoubleagent).

Crawled – Currently Not Indexed

There are numerous anecdotal reports of the Crawled – Currently Not Indexed status on Facebook, Twitter, and even in John Mueller’s office-hours hangouts.

During a recent office-hours hangout, someone asked why Google Search Console (GSC) reports pages as Crawled – Currently Not Indexed when, on inspection, they turn out to be indexed. John Mueller replied that it is only a discrepancy between the reports.

And in another office-hours hangout, John Mueller pointed out that it is normal for a site to have many pages that aren’t indexed.

He noted:

“… if you have a smaller site and you find that a significant portion of your pages are not being indexed, I would take a step back and try to reconsider the overall quality of the website and not focus as much on the technical issues for these pages.

The other thing to keep in mind when it comes to indexing is that it’s completely normal that we don’t index everything on the website.

And over time, when you have like 200 pages on your website and we index 180, that percentage goes down a bit.”

While these are both good explanations for why the Crawled – Currently Not Indexed issue happens to some people, neither is the cause Adam Gent found.

Adam Gent discovered an entirely different problem, one that appeared to be an algorithm issue on Google’s side. There was nothing wrong with the site itself; the problem was with Google’s indexing.

Why Crawled – Currently Not Indexed

Adam looked at the GSC Index Coverage report and found that Google was crawling and indexing his site’s feeds as if they were HTML pages.
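If you want to scan a list of URLs exported from the coverage report for feed URLs, a minimal sketch might look like the following. It assumes the default WordPress feed URL patterns (`/feed/`, `/feed/atom/`, `?feed=rss2` and so on); the function name and the example URLs are hypothetical, and this is not the method described in the case study.

```python
from urllib.parse import urlparse, parse_qs

# Feed type suffixes WordPress uses after "/feed/" by default.
FEED_TYPES = {"rss", "rss2", "rdf", "atom"}

def looks_like_wp_feed(url: str) -> bool:
    """Return True if the URL matches a default WordPress feed pattern."""
    parts = urlparse(url)
    # Plain-permalink form, e.g. https://example.com/?feed=rss2
    if "feed" in parse_qs(parts.query):
        return True
    segments = [s for s in parts.path.split("/") if s]
    if not segments:
        return False
    # Pretty-permalink form, e.g. /my-post/feed/
    if segments[-1] == "feed":
        return True
    # Typed feed form, e.g. /feed/atom/
    return (
        len(segments) >= 2
        and segments[-2] == "feed"
        and segments[-1] in FEED_TYPES
    )

# Hypothetical URLs standing in for a coverage report export.
urls = [
    "https://example.com/my-post/",
    "https://example.com/my-post/feed/",
    "https://example.com/?feed=rss2",
]
feed_urls = [u for u in urls if looks_like_wp_feed(u)]
```

Matching on whole path segments rather than on the substring "feed" avoids flagging ordinary pages such as `/feedback/`.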

He took random snippets of text from these pages, ran a site: search for them, and found that the feed page content was indeed indexed.
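For anyone who wants to repeat that kind of check, here is a small sketch that builds a site: search URL for an exact phrase. The domain and phrase are placeholders, and the helper name is my own; paste the resulting URL into a browser and see which URL Google returns for the snippet.

```python
from urllib.parse import quote_plus

def site_search_url(domain: str, snippet: str) -> str:
    """Build a Google search URL limited to one domain and an exact phrase."""
    query = f'site:{domain} "{snippet}"'
    return "https://www.google.com/search?q=" + quote_plus(query)

# Placeholder domain and phrase; use a distinctive sentence from your own page.
url = site_search_url("example.com", "a distinctive phrase from the post")
print(url)
```

If the result that comes back is the feed URL rather than the article URL, that is a hint the feed may have been indexed in place of the page.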

To make matters worse, Google had apparently chosen the RSS feed content as canonical over the actual web pages, which explained why the real web pages were crawled but not indexed.

The RSS feed was generated by WordPress

A strange thing in this case is that when you look at the feed page, it is displayed like a web page and not the way an XML file is usually displayed.

Screenshot of the RSS feed cache

I might be wrong, but it doesn’t look like a normal RSS feed. It looks like an HTML page.

While the underlying code is really XML, that’s not what most feeds normally look like.
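One way to check what a feed really is underneath, regardless of how the browser renders it, is to parse the source. The sketch below reads the root element and looks for an `xml-stylesheet` processing instruction, which is one common reason a feed can render like a styled page. That cause is my assumption; the case study does not confirm it, and the sample feed text and function name are made up for illustration.

```python
from xml.dom import minidom

def describe_feed(xml_text: str) -> tuple[str, bool]:
    """Return (root element name, whether an xml-stylesheet instruction is present)."""
    doc = minidom.parseString(xml_text)
    root = doc.documentElement.tagName
    styled = any(
        node.nodeType == node.PROCESSING_INSTRUCTION_NODE
        and node.target == "xml-stylesheet"
        for node in doc.childNodes
    )
    return root, styled

# Hypothetical feed source: valid RSS, but with a stylesheet attached.
SAMPLE = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="/feed-style.xsl"?>
<rss version="2.0"><channel><title>Example</title></channel></rss>"""

root, styled = describe_feed(SAMPLE)
```

A root of `rss` (or `feed` for Atom) confirms the document is a feed even when it looks like HTML on screen.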

Could this have played a role in Google’s choice to canonicalize the feed?

It is difficult to understand how this could happen because there are so many signals like internal linking that under usual circumstances would cause Google to favor HTML pages as canonical.

How Adam Solved the Problem

After Adam figured out what had happened, he removed those WordPress-generated feed pages, submitted the feed URLs for crawling, and then set the pages to return a 404.

After these pages were removed from the index, he then submitted the correct URLs to Google, and within days the issue was resolved.
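If you try a similar fix, it is worth confirming that the feed URLs really do return a 404 before resubmitting anything. Here is a minimal sketch using only the Python standard library; the function names are hypothetical, and treating 410 as equivalent to 404 is my own addition rather than part of the case study.

```python
import urllib.error
import urllib.request

def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a HEAD request to the URL."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the code is still what we want.
        return err.code

def feeds_still_live(urls, get_status=fetch_status):
    """Return the URLs that do not yet answer with 404 (or 410, an assumption)."""
    return [u for u in urls if get_status(u) not in (404, 410)]

# Example call (placeholder URL; performs a real network request):
# feeds_still_live(["https://example.com/my-post/feed/"])
```

An empty list back means every feed URL you passed in is already gone as far as a crawler is concerned. Some servers handle HEAD requests differently from GET, so spot-check a URL or two in a browser as well.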

What caused the problem?

Adam wrote that the problem appears to be with Google.

I asked around, and someone told me that Google apparently started indexing feeds a few years ago, but they thought that problem had since been fixed.

I’m no XML expert, but it seems unusual for the feed to look like an HTML page instead of the normal XML layout which is displayed without HTML styling.

The feed doesn’t look normal, so it seems possible that this could be an underlying cause.

Either way, if you’re having issues with Crawled – Currently Not Indexed, this is another thing to check in case it is happening to you as well.

Citation

Read the original post that explains how to fix the problem:

A Curious Case of Canonicalization


