What is the longest a page can be in the Google Search cache before it is re-crawled?
The more popular a page is, the more often Googlebot crawls it.
For less popular pages, Googlebot can go many months between crawls. If a page is taken down between crawls, it could remain in the cache for a long time.
What is the longest a stale page can be in the Google Search cache before it gets re-crawled or dropped? Does Google publish this figure?
This assumes I don't do anything to speed up the removal, such as using the Outdated Content tool.
It looks like you have answered most of this yourself. It depends on how often the page is crawled, which mostly depends on its popularity. The next biggest factor that can slow crawlers down is page depth (click depth): make sure the pages you want kept fresh can be reached within a few clicks of the home page.
That said, I don't think it will ever be months between crawls, even for pages that are very rarely visited. I checked one of my websites that is about two months old (the domain was fresh, too). It only has four pages indexed in Google, doesn't update regularly, and gets very little traffic, yet Googlebot crawls my submitted sitemap almost every day.
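For reference, a bare-bones sitemap entry looks something like this (example.com and the date are placeholders; the lastmod value is just a hint to crawlers about when the page last changed):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <!-- one <url> entry per page you want crawled -->
      <url>
        <loc>https://example.com/rarely-visited-page</loc>
        <lastmod>2023-06-01</lastmod>
      </url>
    </urlset>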
So if your sitemap is set up properly, I think Google will refresh or drop stale cached copies fairly quickly without you doing anything. Depending on what you're trying to do, you could also use a robots meta tag to ask search engines not to offer cached copies in results (noarchive), or simply use Search Console to request removal and recrawling.
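If you go the meta tag route, it's a single tag in the page's head; something like this, where noarchive is the directive that asks search engines not to show a cached copy of the page:

    <!-- placed in the <head> of the page -->
    <meta name="robots" content="noarchive">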
One last point: the Outdated Content tool is no longer recommended for site owners; it directs you to use "Removals" in Search Console instead, if you have access to the property.