Stopping Google from crawling my static domain

Problem:


I use a cookieless subdomain, static.example.com, to serve all images, JS, and CSS files. This static subdomain has the same root directory as the parent http://example.com, so static.example.com/index.php calls the same file as http://example.com/index.php.



Google has taken up indexing static.example.com and I need to stop this.



I can't modify robots.txt, since it would apply to both domains.
Google Webmaster Tools allows "temporary" removal of a URL, but this needs to be permanent.
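(One common way around the shared-robots.txt constraint, not mentioned in the original question, is to serve a different robots file per host with mod_rewrite. A minimal sketch, assuming Apache with mod_rewrite enabled; `robots-static.txt` is a hypothetical filename:

```apache
# Sketch: when the request arrives on the static host, serve a
# separate robots file instead of the shared /robots.txt.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^static\.example\.com$ [NC]
RewriteRule ^robots\.txt$ /robots-static.txt [L]
```

With this in place, static.example.com/robots.txt can disallow everything while example.com/robots.txt stays untouched.)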



I registered this static domain with Google as a separate site from example.com, but that doesn't seem to buy me anything, since I can't see a way to block crawling without using robots.txt - which would also block the parent domain.



Any other ideas?



UPDATE



Found this conversation on the Google Webmaster Help Forum.



I am looking into whether it's possible to insert a META tag dynamically into the header of every page served on the static domain using PHP.
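(That idea can be sketched in PHP. This assumes every page passes through a shared header include and that the static host is `static.example.com`; it is illustrative, not the asker's actual code:

```php
<?php
// Sketch: emit a noindex signal only when the request arrives on the
// static host, leaving the main domain's pages untouched.
$host = isset($_SERVER['HTTP_HOST']) ? strtolower($_SERVER['HTTP_HOST']) : '';

if ($host === 'static.example.com') {
    // The HTTP header works for any response type, including images:
    header('X-Robots-Tag: noindex, nofollow');
    // For HTML pages, a meta tag in <head> is the equivalent signal:
    echo '<meta name="robots" content="noindex, nofollow">';
}
```

The header variant is the more general of the two, since images and other static assets have no `<head>` to put a meta tag in.)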



CLARIFICATION



I am not putting any content on static.example.com. There are NO webpages on it that are meant to be called - it only serves images. It points at the same root directory as the main site for ease of coding image links (not every image is served from the same image path).



Google got hold of the static web root address - probably because .htaccess redirects to index.php when a file is missing - and then crawled it. Now Google is showing the static versions of the same pages in search results.
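(The fallback-to-index.php behavior described above can be made host-aware so the static subdomain returns 404s instead. A hedged .htaccess sketch, assuming Apache 2.2+ with mod_rewrite; the hostnames match the question's examples:

```apache
# Sketch: on the static host, a missing file should be a 404,
# not a rewrite to index.php.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^static\.example\.com$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* - [R=404,L]

# On the main host, keep the usual front-controller fallback.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
```

This directly addresses the cause identified in the answer below: Googlebot only finds "pages" on the static host because missing files are rewritten to index.php.)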


Solution :

You should not redirect users or Googlebot to your main domain when a static asset is not found; instead, you should return a 404 error. Because you are redirecting them, your subdomain has been indexed by Google.



There are three solutions for you.




  1. Redirect only the root of static.domain.com to domain.com. For example, Facebook redirects its static content CDN domain (fbcdn.net) to facebook.com, so Google follows the redirect and drops the subdomain link from search results.

  2. Use the following robots.txt on the static subdomain (note that paths must start with `/`; Googlebot supports `*` and `$` wildcards):



User-agent: Googlebot
Allow: /*.js$
Allow: /*.css$
Allow: /*.png$
Allow: /*.jpg$
Disallow: /




  3. Use a meta noindex tag on the pages you don't want indexed.



Make sure the noindex tag is placed only on the pages you don't want to appear in search results. I mention this because you may be using some kind of parent-template technique that applies to all child templates.



Also note that a noindex tag does not stop crawling; it only keeps the specific page out of Google's search results. If that page contains links, Google will still follow them and pass their value just like on any normal page.
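For reference, a minimal sketch of such a tag, combined with nofollow to address the link-following caveat above (the exact pages it belongs on depend on the site's templates):

```html
<!-- In the <head> of pages that must stay out of search results.
     "nofollow" additionally tells Google not to follow links on the page. -->
<meta name="robots" content="noindex, nofollow">
```

For images and other non-HTML files, the equivalent signal can be sent as an `X-Robots-Tag: noindex` HTTP response header, since such files have no `<head>` for a meta tag.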


