Better handling of robots.txt
« on: December 26, 2008, 06:39:30 PM »
It is great that the robots.txt file is used in sitemap generation; however, I would love to see it behave more like Googlebot here. According to Google's robots.txt documentation, Googlebot supports basic pattern matching in robots.txt as well as the "Allow:" directive, which I find very useful. I would love to see both supported in the script, since I use them in my robots.txt file.
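For illustration, a robots.txt using both extensions might look like this (the paths here are hypothetical, just to show the syntax):

```
User-agent: Googlebot
# Block everything under /private/ ...
Disallow: /private/
# ...but re-allow one subdirectory (the non-standard "Allow:" directive)
Allow: /private/public-docs/
# Pattern matching: "*" is a wildcard, "$" anchors to the end of the URL,
# so this blocks any URL ending in .pdf
Disallow: /*.pdf$
```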

I realize that these extensions are not part of the official robots.txt standard, but many people use them anyway.
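For reference, the matching rules behind Googlebot's two extensions ("*" matches any sequence of characters, "$" anchors the pattern to the end of the path; everything else is a plain prefix match) can be sketched in a few lines of Python. This is only an illustration of the rules, not code from the sitemap script:

```python
import re

def google_pattern_matches(pattern, path):
    """Check a URL path against a Google-style robots.txt pattern.

    Supports the two Googlebot extensions:
      *  matches any sequence of characters
      $  anchors the pattern to the end of the path
    Otherwise the pattern is a prefix match, as in standard robots.txt.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape regex metacharacters, then turn each '*' into '.*'
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path, giving prefix semantics
    return re.match(regex, path) is not None

print(google_pattern_matches("/private*/", "/private-files/"))   # True
print(google_pattern_matches("/*.pdf$", "/docs/manual.pdf"))     # True
print(google_pattern_matches("/*.pdf$", "/docs/manual.pdf?x=1")) # False
```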
Re: Better handling of robots.txt
« Reply #1 on: December 27, 2008, 02:51:27 PM »

The sitemap generator supports only standard robots.txt directives. However, you can disable robots.txt processing in the configuration file (the 'xs_robotstxt' setting) and use the "Exclude URLs" option instead.