Search engine bots constantly crawl websites in order to add them to the search index. Sometimes, however, developers want to hide their sites from search engine results. In that case, robots.txt can be used to block search engine spiders from accessing a website. In this tutorial, you will learn how to create a robots.txt file and block search engines from accessing your website.
What you’ll need
Before you begin this guide you’ll need the following:
- Access to your hosting account control panel or FTP.
Step 1 — Access the server and create a new file
First of all, you need to create a robots.txt file. You can use an FTP client or File Manager for this. The file should be placed in the same folder as your website files (usually public_html). In this tutorial, we will use File Manager to create and edit the robots.txt file:
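If you prefer working locally, the file can also be prepared on your own machine and then uploaded through FTP or File Manager. A minimal Python sketch (the file name robots.txt is fixed by convention; the upload destination, often public_html, depends on your host):

```python
from pathlib import Path

# Create an empty robots.txt locally; the rules are added in the next step.
# After editing, upload the file to your site's root folder (often public_html).
robots = Path("robots.txt")
robots.write_text("", encoding="utf-8")

print(robots.exists())  # True — the file is ready for editing
```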
Step 2 — Editing robots.txt
Each search engine has its own crawler (user-agent). In robots.txt, you can target a specific crawler with the User-agent directive. There are hundreds of crawlers; the most common include:
- Googlebot (Google)
- Bingbot (Bing)
- Yahoo! Slurp (Yahoo!)
For example, if you would like to block the Bing crawler from accessing your website, you would have to edit robots.txt with the following rules:
User-agent: bingbot
Disallow: /
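You can check the effect of such rules before deploying them with Python's standard urllib.robotparser module (example.com below is just a placeholder domain):

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the robots.txt above
rules = [
    "User-agent: bingbot",
    "Disallow: /",
]

rp = RobotFileParser()
rp.parse(rules)

# bingbot is blocked everywhere; other crawlers are unaffected
print(rp.can_fetch("bingbot", "https://example.com/page.html"))    # False
print(rp.can_fetch("Googlebot", "https://example.com/page.html"))  # True
```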
If you want to block all search engine crawlers, you can use * as a wildcard:
User-agent: *
Disallow: /
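The same urllib.robotparser check confirms that the wildcard rule applies to every crawler, not just a named one (the crawler names and example.com URL are illustrative):

```python
from urllib.robotparser import RobotFileParser

# Wildcard rules block every user-agent
rules = [
    "User-agent: *",
    "Disallow: /",
]

rp = RobotFileParser()
rp.parse(rules)

for agent in ("Googlebot", "bingbot", "Slurp"):
    print(agent, rp.can_fetch(agent, "https://example.com/"))  # all False
```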
If you want to prevent search crawlers from accessing only a particular file or folder, the syntax is similar, but you need to specify the name of the file or folder. Let’s say we want to prevent search engine crawlers from accessing only the articles folder and the private.php file. In that case, the content of the robots.txt file should look like this:
User-agent: *
Disallow: /articles/
Disallow: /private.php
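A quick check with urllib.robotparser shows that only the listed folder and file are blocked, while the rest of the site stays crawlable (paths and domain are placeholders):

```python
from urllib.robotparser import RobotFileParser

rules = [
    "User-agent: *",
    "Disallow: /articles/",
    "Disallow: /private.php",
]

rp = RobotFileParser()
rp.parse(rules)

# The listed folder and file are blocked...
print(rp.can_fetch("Googlebot", "https://example.com/articles/post.html"))  # False
print(rp.can_fetch("Googlebot", "https://example.com/private.php"))         # False
# ...but everything else remains accessible
print(rp.can_fetch("Googlebot", "https://example.com/index.html"))          # True
```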
Once you are done editing the robots.txt file, save the changes.
By finishing this tutorial, you have learned how to block search engine crawlers using the robots.txt file. This is useful if you want to prevent your website from appearing in search results. Keep in mind that robots.txt is a convention: well-behaved crawlers honor it, but it is not an access control mechanism.