Webmasters have been using the “no index, no follow” tag for several years as an effective way of preventing a website from being indexed by the search engines. However, Google has never promised that using these tags will keep a website out of its index 100% of the time.
I can recall Google’s Matt Cutts saying similar things with regard to other tags, such as the rel="canonical" tag. These tags should generally work for their designed purposes, but there are no guarantees to webmasters.
Webmasters often add the HTML tag <meta name="robots" content="noindex,follow"> while building a new website or webpage that they do not yet want indexed by the search engines.
Now we have evidence showing that this tag does not work all of the time. The data shown below is from a hobby website named historycontributions.com that I started working on a few months ago. There is nothing too impressive to see there right now, only a few fun facts that you may not have known. I mention it here because of the organic search visits that you can see in the screenshot below.
This website has had “no index, no follow” tags on every page since it was first put up. Yet, it has still received organic incoming search traffic from Google. Although 2 visitors is far from being a number to brag about, the fact that this website has been shown in the search engines is what is significant here.
I have even selected the landing page as a secondary dimension to show that this site received organic search visitors to multiple pages.
I can think of very few scenarios in a webmaster’s world where the functioning of the “no index, no follow” tag is a matter of do or die. Most sensitive information is guarded in completely different ways, so proof that the tag doesn’t always work shouldn’t shake things up too much for webmasters who are reading this. However, in the nearly 10 years that I have been working on SEO projects, this is the first time that I have actually witnessed this kind of meta tag failing to work.
There are a few ways to keep a website out of the search engines. The first is the method that I have used here: adding <meta name="robots" content="noindex,follow"> inside the head element of every page. You will likely need to add this through the part of your content management system that controls the head section, either on a site-wide basis or on a page by page basis if you only want to use it for a specific webpage that is under construction.
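If you want to confirm that the tag actually made it into a page, a small script can read the HTML and report the robots directives it finds. The following is only a minimal sketch using Python's standard library; the `RobotsMetaParser` class, the `robots_directives` helper and the sample page are illustrative names made up for this example, not part of any CMS or SEO tool.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; a value may be None.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page that is still under construction.
SAMPLE_PAGE = """<!DOCTYPE html>
<html>
<head>
  <title>Under construction</title>
  <meta name="robots" content="noindex,follow">
</head>
<body><p>Coming soon.</p></body>
</html>"""

found = robots_directives(SAMPLE_PAGE)
print(sorted(found))       # ['follow', 'noindex']
print("noindex" in found)  # True
```

Running a check like this against the saved source of each template is a quick way to catch pages where the CMS silently dropped the tag.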
Another way to do this is to use the robots.txt file for your website. In this file you can just add the following code, and this will block the majority of robots from crawling your website.
Not all robots will pay attention to these instructions. Spam bots looking for email addresses will usually continue to crawl your website. However, this should usually keep the search engines from crawling your pages. Keep in mind that robots.txt controls crawling rather than indexing, so a blocked URL can still show up in search results if other websites link to it.
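The snippet itself is not reproduced in this copy of the article. The conventional rules for asking all robots to stay away from an entire site are `User-agent: *` followed by `Disallow: /`, and that is what is assumed here. As a rough sketch, Python's standard `urllib.robotparser` module can show how a well-behaved robot would interpret those two lines; the `is_allowed` helper and the example.com URLs are placeholders for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents of robots.txt: ask every robot to stay out of the whole site.
BLOCK_ALL_RULES = """\
User-agent: *
Disallow: /
"""


def is_allowed(rules, user_agent, url):
    """Return True if a robot that obeys the given rules may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


# Both the home page and a deeper page are off limits to compliant robots.
print(is_allowed(BLOCK_ALL_RULES, "Googlebot", "https://www.example.com/"))           # False
print(is_allowed(BLOCK_ALL_RULES, "Googlebot", "https://www.example.com/facts.html")) # False
```

The file only has an effect when it is served from the root of the site, for example at /robots.txt.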
You can also use the robots.txt file to prevent crawling of specific pages on your website. For this, you would use the code below while replacing page xyz with your own page name.
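The page-level snippet is also missing from this copy of the article. A typical rule of this kind names the path of the page after `Disallow:`. The sketch below assumes a hypothetical page at `/page-xyz.html` and uses Python's standard `urllib.robotparser` to confirm that only that page is blocked; `blocked_urls` is a helper invented for this example.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents of robots.txt: block a single page for every robot.
# Replace /page-xyz.html with the path of your own page.
BLOCK_ONE_PAGE_RULES = """\
User-agent: *
Disallow: /page-xyz.html
"""


def blocked_urls(rules, user_agent, urls):
    """Return the URLs that a compliant robot is not allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


urls_to_check = [
    "https://www.example.com/",
    "https://www.example.com/page-xyz.html",
    "https://www.example.com/about.html",
]

# Only the page named in the Disallow line is reported as blocked.
print(blocked_urls(BLOCK_ONE_PAGE_RULES, "Googlebot", urls_to_check))
```

Disallow rules match by path prefix, so a rule for `/page-xyz.html` leaves the rest of the site open to crawling.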