Google may be a mystery to many SEOs, but the reality is that they are there to help. Google actually wants your site to do well, because a site that does well is a valuable resource for the person doing the search. In fact, Google gives us a ton of data we can use to make our sites better, which in turn helps the people using search.
If you love digging into data and working out what it all means, then you should be using the Google Index Coverage Report, because it is a wealth of information. Google has given us other tools to peek behind the curtain, but for the sake of this article we will focus on this Search Console report and how it can help you improve your rankings.
What is the Google Index Coverage Report?
Before we get into what the report actually is, we have to understand exactly what Google does.
Basically, Google has spiders, as they're called, that crawl URLs across the internet. When you create a site, you let Google know it's there by submitting a sitemap so it knows where to find your content. Then the spider arrives and tries to figure out what your site is about and which keywords it should rank for.
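As a rough illustration, a minimal sitemap is just an XML file listing the URLs you want Google to know about. The domain and date below are placeholders, but the tags follow the standard sitemap protocol:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/blog/my-post/</loc>
    <lastmod>2021-01-15</lastmod>
  </url>
</urlset>
```

You can submit a file like this directly in Search Console under Sitemaps.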
At this point it will determine which content it should index, meaning store it in Google's index so it is eligible to appear in search results, and then send the data about what it found back to Search Console, where it can be accessed by the site owner.
It's at this point that you'll interact with the report: it shows you what Google thinks is happening on your site, so you can either leave things alone or make changes that improve the site for both Google and your readers.
It provides this feedback by breaking the report into four categories: Valid, Valid with warnings, Excluded, and Error.

Valid URLs

The first category covers URLs that looked good: Google crawled and indexed them without running into any problems.
You'll see one of two statuses: the URL was submitted and indexed, or it was indexed but not submitted in the sitemap. In the first case, no action is required. That doesn't mean the page is going to rank at the top of the results, but Google knows about it and it is eligible to appear in search.
When a URL is indexed but not submitted, Google found it by some other means, such as a link, since it is not in your sitemap. The action required depends on what you want. It could be that you left it out of the sitemap deliberately because you don't want it indexed. If you do want it indexed, add it to the sitemap and request indexing. If you don't, add a noindex directive to the page, or block crawling in robots.txt to save crawl budget, keeping in mind that robots.txt on its own does not remove a URL from the index.
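If you go the robots.txt route, a blocking rule is one line per path. A minimal sketch, with hypothetical example paths:

```
User-agent: *
Disallow: /internal-search/
Disallow: /thank-you/

Sitemap: https://www.example.com/sitemap.xml
```

The `Disallow` lines tell compliant crawlers not to fetch those paths, and the `Sitemap` line points them at the sitemap you do want crawled.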
Valid With Warnings
Valid URLs with warnings are the next grade down. This means there are potential problems that may need to be addressed. A common one is a page that is indexed even though it is blocked in your robots.txt file. Robots.txt controls crawling, not indexing, so if Google finds links to the page elsewhere it may index the URL anyway. If you are happy with the page being indexed, there is not much to do here; if you are not, use a noindex directive instead of the robots.txt block, so Google can crawl the page and see the directive.
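For reference, a noindex directive is a standard one-line tag in the page's head (the page must be crawlable for Google to see it):

```html
<!-- In the page's <head>: tells search engines not to index this page -->
<meta name="robots" content="noindex">
```

The same directive can also be sent as an `X-Robots-Tag: noindex` HTTP response header, which is useful for non-HTML files like PDFs.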
If you get an Indexed Without Content warning, Google indexed the page but could not see any content on it. This can happen when the page is genuinely blank, or when the content is in a format Google can't read or render. Either add the content that should be there, or, if this is a page that isn't supposed to have content, add a noindex directive so it drops out of the index. Don't leave this alone, because Google re-crawling an empty page wastes your crawl budget.
Excluded

Excluded URLs can indicate technical or configuration problems with the URL. This is a great bit of information to have, since you can then fix those problems with precision. It could also be the content on the page that's the problem: if it is thin or duplicate content, Google may simply ignore the page and move on. If you need this page indexed, you'll have to improve the content or the structure.
This gets tricky when duplicate content is causing the problem. You'll need to make sure the version you want to rank carries the canonical tag, so Google knows which URL is the preferred one. A robots.txt block can also land a URL here: while Google sometimes indexes a blocked URL anyway, when it does honour the block the page ends up excluded, and the report is letting you know in case you actually want this page indexed the next time it crawls.
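As a reminder of the syntax, the canonical tag is a single link element in the head of the duplicate page, pointing at the version you want indexed (the URL here is a placeholder):

```html
<!-- On the duplicate page: points Google at the preferred URL -->
<link rel="canonical" href="https://www.example.com/blue-widgets/">
```

Every duplicate or near-duplicate version of the page should carry the same canonical URL, and the canonical page itself should not be blocked from crawling.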
Errors

The last section is the Errors area. These errors can occur for a lot of different reasons, and many times it is a technical mistake on your part. For instance, a redirect chain that is too long will fail: Google only follows a limited number of redirect hops before giving up, and you'll get a redirect error.
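To see why long chains fail, here is a minimal sketch, not Google's actual crawler, of how a crawler might walk a redirect chain and give up after a fixed number of hops. The URLs and the five-hop limit are illustrative assumptions:

```python
# Hypothetical sketch of a crawler following redirects with a hop limit.
# MAX_HOPS and the example chain are made up for illustration; Google's
# real limit is documented separately and may differ.

MAX_HOPS = 5

def follow_redirects(url, redirects, max_hops=MAX_HOPS):
    """Follow url through a redirect map, failing past max_hops."""
    hops = 0
    while url in redirects:
        if hops >= max_hops:
            raise RuntimeError(f"redirect chain too long at {url!r}")
        url = redirects[url]
        hops += 1
    return url, hops

# An illustrative two-hop chain that resolves fine: /a -> /b -> /c
chain = {"/a": "/b", "/b": "/c"}
final_url, hops = follow_redirects("/a", chain)
print(final_url, hops)  # -> /c 2
```

A chain longer than the limit raises an error instead of resolving, which is the moment a real crawler would record a redirect error for the submitted URL.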
Sometimes, Google feels a page is important and wants to index it, but it was blocked by the robots.txt file. Usually this happens when the URL was in the sitemap, so Google assumes you want the page indexed. In this case, it is letting you know in case the block was done by mistake.
When there is no more specific description for the error, it will simply say the submitted URL has a crawl issue. These are tough to figure out, because the report gives you no obvious way to proceed. Running the URL through the URL Inspection tool can give you a clue as to what the problem is.