How to use Googlebot
Current version: Googlebot 2.1
User-agent string: Googlebot/2.1 (+http://www.googlebot.com/bot.html)
Switching your User-Agent to Googlebot: use the User-Agent Switcher extension for Firefox
IP address range:
- from 126.96.36.199 to 188.8.131.52 (googlebot.com)
(as of May 2008)
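As a quick illustration of the identification details above, here is a minimal sketch (the helper name is my own) that checks whether a request's User-Agent string contains Googlebot's published token. Bear in mind that the header can be spoofed trivially (the Firefox extension above does exactly that), so this is only a first-pass check:

```python
def is_googlebot_ua(user_agent: str) -> bool:
    """First-pass check: does the User-Agent string contain Googlebot's token?

    The header is trivially spoofable, so a more reliable check would also
    verify the requesting IP via reverse DNS against googlebot.com.
    """
    return "Googlebot" in user_agent

# Googlebot 2.1 announces itself with this string:
print(is_googlebot_ua("Googlebot/2.1 (+http://www.googlebot.com/bot.html)"))  # True
print(is_googlebot_ua("Mozilla/5.0 (Windows NT 10.0) Firefox/115.0"))         # False
```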
Tips: For Googlebot to crawl your site fully, give the bots (spiders) all the access they need.
Reminders: Make sure the Prevent Spiders option in your admin session settings is set to false; otherwise the bots will be turned away.
Updates/changes to what Googlebot may crawl: make them in a .txt file (such as "robots.txt"), which the bot checks for content.
How to Allow/Disallow Googlebot (manually):
- User-agent: Googlebot
- Allow: / (or list a directory or page that you want to allow)
- User-agent: Googlebot
- Disallow: / (or list a directory or page that you want to disallow)
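Putting the directives above together, a complete robots.txt entry for Googlebot might look like this (the /private/ directory is a made-up example):

```
# Let Googlebot crawl everything except the /private/ directory
User-agent: Googlebot
Allow: /
Disallow: /private/
```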
How to create a robots.txt file using the Generate robots.txt tool (in 5 steps):
(1) Users must go to the Webmaster Tools Home page and click the site they want.
(2) Under Site configuration, click Crawler access.
(3) Click the Generate robots.txt tab. To allow robot access, select "Allow" in the Action list; to block Googlebot from all files and directories on your site, select "Disallow".
(4) In the Files or directories box, type /. Click Add. This will allow your robots.txt file to be automatically generated.
(5) Save your robots.txt file (Note: It must reside in the root of the domain and must be named "robots.txt".)
To ensure your robots.txt file is working properly, test it! Here's how:
(1) Go to the Webmaster Tools Home page and click the site you want.
(2) Under Site configuration, click Crawler access. If it's not already selected, click the Test robots.txt tab.
(3) Copy the content of your robots.txt file and paste it into the first box. In the URLs box, list the URLs you want to test it against.
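The same allow/disallow check can be run offline with Python's standard urllib.robotparser module; the paths below are invented for illustration:

```python
from urllib.robotparser import RobotFileParser

# A sample robots.txt, fed to the parser line by line (paths are made up).
rules = [
    "User-agent: Googlebot",
    "Disallow: /private/",
]

parser = RobotFileParser()
parser.parse(rules)

# can_fetch(user_agent, url) reports whether that agent may crawl the URL.
print(parser.can_fetch("Googlebot", "http://example.com/index.html"))      # True
print(parser.can_fetch("Googlebot", "http://example.com/private/a.html"))  # False
```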
Tips for creating and saving the robots.txt file: In the Robot list, be sure to click Googlebot, and in the User-agents list, select the user-agents you want. "To save any changes, you'll need to copy the contents and paste them into your robots.txt file."
"Writing a robots.txt file is, as you have seen, a relatively simple matter. However, it is important to bear in mind that it is not a security method. It may stop your specified pages from appearing in search engines, but it will not make them unavailable. There are many hundreds of bots and spiders crawling the Internet now, and while most will respect your robots.txt file, some will not, and there are even some designed specifically to visit the very pages you are specifying as being out of bounds."
Note: By borrowing Googlebot's User-Agent, users can check out their own Web site as seen by Google. See how it works by clicking this link: View a Web Page as 'Googlebot'.
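One way to approximate this yourself is sketched below with Python's standard urllib: build a request that presents Googlebot's User-Agent string, then hand it to urlopen to fetch the page. The function name and the example URL are my own choices:

```python
import urllib.request

GOOGLEBOT_UA = "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"

def googlebot_request(url: str) -> urllib.request.Request:
    """Build a request that presents Googlebot's User-Agent string."""
    return urllib.request.Request(url, headers={"User-Agent": GOOGLEBOT_UA})

req = googlebot_request("http://example.com/")
# Passing `req` to urllib.request.urlopen() would fetch the page as a
# server that only sniffs the User-Agent header would serve it to Googlebot.
print(req.get_header("User-agent"))
```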