What is a robots.txt File
How to Create a robots.txt File
Robots.txt is a plain text file. It gives search engine crawlers instructions about which parts of a website (a page, a file, a directory, or the whole domain) they may crawl. Using this file you can also block search engines from crawling your site entirely. Use the robots.txt file to restrict search engine crawlers from selected areas of your website.
In simple terms, a robots.txt file lets you allow or disallow search engine crawlers from visiting your website, or parts of it.
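A robots.txt file mainly uses two keywords, User-agent and Disallow. A third keyword, Allow, is also available.
- User-agent names the search engine robot (web crawler software) that the following rules apply to.
- Disallow is a command for the user-agent that tells it not to access a particular URL.
- Allow is a command for the user-agent that tells it that it may access a particular URL, for example one inside a disallowed directory.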
Syntax
User-agent: [the name of the robot the following rule applies to]
Disallow: [the URL path that you want to block]
Allow: [the URL path that you want to unblock]
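To see how a crawler might read these three directives together, here is a small sketch using Python's standard `urllib.robotparser` module. The paths `/private/` and `/private/public_page.html` and the bot name `ExampleBot` are made up for illustration. Real search engines may resolve overlapping Allow and Disallow rules a little differently from this parser, so treat it as a rough check only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block /private/ but unblock one page inside it.
# Allow is listed first because this parser applies the first matching rule.
RULES = """\
User-agent: *
Allow: /private/public_page.html
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_allowed(path, agent="ExampleBot"):
    """Return True if the given agent may fetch the path under RULES."""
    return parser.can_fetch(agent, path)
```

Calling `is_allowed("/private/public_page.html")` should return True, while any other path under `/private/` should return False.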
Disallow all robots
User-agent: *
Disallow: /
The User-agent: * means this section applies to all robots. The Disallow: / tells the robot that it should not visit any pages on the website.
Allow all robots
User-agent: *
Disallow:
The User-agent: * means this section applies to all robots (* represents all). The empty Disallow: blocks nothing, so robots may visit all pages on the website.
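The difference between the "disallow all" and "allow all" files is just one slash, so it is worth checking both. The sketch below feeds each file to Python's standard `urllib.robotparser` and asks whether a page may be fetched. `ExampleBot` and `/any/page.html` are placeholder names, not anything from a real site.

```python
from urllib.robotparser import RobotFileParser


def load_rules(text):
    """Parse robots.txt text and return the parser object."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# "Disallow: /" blocks everything; an empty "Disallow:" blocks nothing.
block_all = load_rules("User-agent: *\nDisallow: /\n")
allow_all = load_rules("User-agent: *\nDisallow:\n")

# ExampleBot and /any/page.html are placeholders for illustration.
blocked = block_all.can_fetch("ExampleBot", "/any/page.html")
allowed = allow_all.can_fetch("ExampleBot", "/any/page.html")
```

Here `blocked` should come out False and `allowed` should come out True.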
Disallow some URLs
User-agent: *
Disallow: /cgi-bin/
Disallow: /private_file.html
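A quick way to confirm which URLs such a file blocks is to test a few paths against it. This sketch assumes the second rule is meant to read `/private_file.html` with no space after the slash, since a stray space would change the path being matched. The test paths and the bot name `ExampleBot` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse([
    "User-agent: *",
    "Disallow: /cgi-bin/",
    "Disallow: /private_file.html",
])

# Hypothetical paths to check against the rules above.
paths = ["/cgi-bin/search.cgi", "/private_file.html", "/index.html"]

# Map each path to True (may be crawled) or False (blocked).
results = {path: parser.can_fetch("ExampleBot", path) for path in paths}
```

Only `/index.html` should map to True; the other two paths fall under the Disallow rules.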
Note: the robots.txt file must be stored in the root directory of your website (usually www or public_html).
The correct location for the robots.txt file on our website is www.tutorial4us.com/robots.txt
You can see our robots.txt file at that address.
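Because the file always sits at the root of the site, its address can be worked out from any page URL. Here is a minimal sketch of that using Python's standard `urllib.parse`. The function name `robots_url` and the sample page path are assumptions made for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt address for the site that hosts page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host; replace path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# The page path below is hypothetical.
location = robots_url("https://www.tutorial4us.com/seo/some-page?ref=home")
```

For the sample URL, `location` should be `https://www.tutorial4us.com/robots.txt`.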
Disallow particular directory
In the syntax below, we disallow all search engines from visiting or crawling the java and seo directories (folders), which means no URL inside these two directories will be crawled.
User-agent: *
Disallow: /java/
Disallow: /seo/
You can easily generate a robots.txt file using online tools like this Free Robots.txt Generator.
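If you would rather not depend on an online tool, a file like this one is simple enough to generate yourself. The helper below is only a sketch, and the name `build_robots_txt` and its parameters are invented for this example. It writes one User-agent line followed by one Disallow line per path.

```python
def build_robots_txt(disallowed_paths, user_agent="*"):
    """Return robots.txt text that blocks the given paths for one agent."""
    lines = [f"User-agent: {user_agent}"]
    if not disallowed_paths:
        # An empty Disallow means nothing is blocked.
        lines.append("Disallow:")
    for path in disallowed_paths:
        # Rule paths must start with a slash.
        if not path.startswith("/"):
            path = "/" + path
        lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"


robots_text = build_robots_txt(["/java/", "/seo/"])
```

You could then save `robots_text` as robots.txt in the root directory of the site.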
What to block | Syntax |
---|---|
Block entire website | User-agent: * Disallow: / |
Block directory and its contents | User-agent: * Disallow: /sample-directory/ |
Block particular URL | User-agent: * Disallow: /private_file.html |
Block specific image | User-agent: * Disallow: /images/cats.jpg |
Block specific image from Google Images | User-agent: Googlebot-Image Disallow: /images/dogs.jpg |
Block all images on your site from Google Images | User-agent: Googlebot-Image Disallow: / |
Block files of a specific file type (.pdf) | User-agent: * Disallow: /*.pdf$ |
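The last row uses two special characters: `*` stands for any run of characters and `$` marks the end of the URL. As a rough illustration of how such a pattern could be matched against a URL path, the sketch below converts it to a regular expression by hand. The function name `rule_matches` and the test paths are assumptions for this example, and an actual crawler's matching may differ in detail.

```python
import re


def rule_matches(pattern, path):
    """Return True if a robots.txt path pattern matches the URL path.

    '*' matches any run of characters and a trailing '$' anchors the end.
    Without '$', the pattern only has to match the start of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece and join the pieces with '.*' for each '*'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    return re.match(regex, path) is not None


# Hypothetical paths checked against the .pdf rule from the table.
pdf_blocked = rule_matches("/*.pdf$", "/docs/report.pdf")
html_blocked = rule_matches("/*.pdf$", "/docs/report.pdf.html")
```

With the `$` anchor in place, `pdf_blocked` should be True and `html_blocked` should be False, because the second path does not end in .pdf.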