5 Ways to Stop Your Content Being Found or Indexed By Google

13 February 2013

SEO

Strange as it may sound, there are times when you may want or need to stop some of your web content from being indexed and displayed in the SERPs by Google. For example, you may have content that is duplicated elsewhere, or that is commercially or personally sensitive. If you are a webmaster in this situation, you can use one of the methods below to keep your content out of Google’s grasp.

Method 1 – Don’t put the content on your website

This is the one surefire way to make sure Google never indexes your content: just take it down! Ask whether the information actually needs to be online. Could it be kept as a physical copy, or on a memory stick or hard drive somewhere? For those situations where this is not possible, try one of the following methods.

Method 2 – Use a meta noindex tag in your HTML

This line of code can be used when you don’t want a page to be displayed in the search engine listings. Just put this snippet of HTML in the head of the page you want to block:

<meta name="robots" content="noindex" />

The “robots” name addresses the tag to all search engine spiders, or robots, and the “noindex” value tells them not to index the page. The major search engines respect this directive, but a spider has to be able to crawl the page in order to see the tag.
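Before relying on the tag, it can be worth confirming that it is actually present in the HTML your server sends out. The following is a minimal sketch using only the Python standard library; the names `RobotsMetaParser` and `has_noindex` and the sample page are made up for illustration, and the check only looks at the HTML it is given, not at how any search engine will treat it.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are routed here as well.
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            # The content attribute may hold several comma-separated directives.
            for directive in attr_map.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html_text):
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noindex" in parser.directives


# Hypothetical sample page, for illustration only.
page = '<html><head><meta name="robots" content="noindex" /></head><body></body></html>'
print(has_noindex(page))
```

In practice you would feed `has_noindex` the HTML of each page you intend to block, for example as part of a pre-launch check.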

Method 3 – Use robots.txt

In 1994 the people running the early web crawlers agreed that webmasters needed a way to mark some pages as off limits. Thus robots.txt was born. This plain-text file sits at the root of your site and tells search engine spiders which paths they may crawl and which they may not. Here is how a typical robots.txt file would look:

User-agent: *
Disallow: /images/

The “*” here is a wildcard, meaning that the rules apply to all search engine spiders. In this example, the folder “images” and everything inside it would be out of bounds to the robots.
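If you want to check how a well-behaved spider would read rules like these, Python’s standard library includes a robots.txt parser. The sketch below feeds it the example rules above; the `is_allowed` helper and the example.com URLs are placeholders chosen for illustration, and the result only reflects how this particular parser interprets the file.

```python
from urllib.robotparser import RobotFileParser

# The example rules from above, supplied as text rather than fetched from a site.
rules = """\
User-agent: *
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def is_allowed(url, user_agent="*"):
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


# Placeholder URLs, for illustration only.
print(is_allowed("https://www.example.com/images/logo.png"))  # False
print(is_allowed("https://www.example.com/about.html"))       # True
```

To test a live site instead, the same parser can load the file with `set_url()` followed by `read()`.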

Method 4 – Password protect sensitive content

Sensitive content is often protected by requiring visitors to enter a username and password. Content secured this way won’t be crawled by search engines, as spiders don’t have the login details needed to get past the login prompt. This approach takes more development time than the previous two methods.
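One common way to do this is HTTP Basic authentication, where the browser sends the username and password in an Authorization header. The following is a rough sketch of the server-side check, assuming Basic authentication over HTTPS; the function names and the credentials "editor" and "s3cret" are invented for illustration, and a real site would normally rely on its web server or framework for this rather than hand-rolled code.

```python
import base64
import hmac


def is_authorized(authorization_header, expected_user, expected_password):
    """Return True if an HTTP Basic Authorization header carries the expected credentials."""
    if not authorization_header:
        # Spiders send no credentials, so they always end up here.
        return False
    scheme, _, encoded = authorization_header.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except ValueError:
        # Covers malformed base64 as well as bytes that are not valid UTF-8.
        return False
    user, separator, password = decoded.partition(":")
    if not separator:
        return False
    # compare_digest avoids leaking information through comparison timing.
    user_ok = hmac.compare_digest(user.encode(), expected_user.encode())
    password_ok = hmac.compare_digest(password.encode(), expected_password.encode())
    return user_ok and password_ok


def status_for(authorization_header):
    """Return the HTTP status a protected page would answer with."""
    # Hypothetical credentials, for illustration only.
    return 200 if is_authorized(authorization_header, "editor", "s3cret") else 401
```

A request without a valid header gets a 401 response instead of the page, which is why the content never reaches a search engine’s index.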

Method 5 – Remove the pages in Google Webmaster Tools

Use the “Remove URLs” section of the Google Webmaster Tools interface to ask Google to remove a page from its index. Only site owners and users with full permissions can request removals. Obviously you can only use this function if you have a Google Webmaster Tools (GWT) account for your website, which you should have anyway.

Conclusion

So, I hear you ask, which one should I use if I want to stop Google indexing my content? Well, one thing is certain: the only surefire way to stop Google indexing content you don’t want seen is to take it down. I’ve seen real examples of pages appearing in Google despite robots.txt rules and meta noindex tags, contrary to the wishes of the webmaster. Part of the reason is that robots.txt only asks spiders not to crawl a page, so a blocked URL can still show up in the results if other sites link to it. Basically, when it comes to barring content, Google is a law unto itself, and if it sees fit to override your robots.txt and meta noindex directives, it will. So my recommendation, if you have to have content online that you want to limit access to, is to put it behind a password-protected login.