Robots.txt Best Practices and Guide for KartRocket Stores
Robots.txt is a text file webmasters create to instruct robots (typically search engine robots) how to crawl and index pages on their website.
To create a robots.txt file, you need access to the root of your domain. If you are unsure how to access the root, contact your web hosting service provider. You can create a new robots.txt file, or edit an existing one, with the robots.txt Tester tool in Google Search Console, which lets you test your changes as you adjust your robots.txt.
Syntax
The simplest robots.txt file uses two keywords, User-agent and Disallow. User-agents are search engine robots (or web crawler software); Disallow is a command that tells the user-agent not to access a particular URL.
To give Google access to a particular URL that sits in a child directory of a disallowed parent directory, you can use a third keyword, Allow.
The syntax for using the keywords is as follows:
User-agent: [the name of the robot the following rule applies to]
Disallow: [the URL path you want to block]
or
Allow: [the URL path of a subdirectory, within a blocked parent directory, that you want to unblock]
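For example, the following entry blocks a crawler from an entire directory while unblocking one of its subdirectories. The directory names here are hypothetical placeholders chosen for illustration, not paths from any real store:
User-agent: Googlebot
Disallow: /photos/
Allow: /photos/public/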
A User-agent line and the Disallow lines that follow it are together considered a single entry in the file. You can include as many entries as you want, and multiple Disallow lines can apply to multiple user-agents, all in one entry. You can make an entry apply to all web crawlers by listing an asterisk (*) as the user-agent, as in the example below:
User-agent: *
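Putting these pieces together, a complete robots.txt file might look like the sketch below. The /checkout/ and /cart/ paths are hypothetical placeholders, not actual KartRocket paths; Googlebot-Image is Google's image crawler. Entries are conventionally separated by a blank line:
User-agent: *
Disallow: /checkout/
Disallow: /cart/

User-agent: Googlebot-Image
Disallow: /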
Save your robots.txt file
- You must save your robots.txt code as a text file
- You must place the file in the root of your domain
- The robots.txt file must be named robots.txt
Note: a robots.txt file saved at the root of example.com, at the URL http://www.example.com/robots.txt, can be discovered by web crawlers, but a robots.txt file at http://www.example.com/not_root/robots.txt cannot be found by any web crawler.
Source: Google