vBulletin robots.txt
« on: April 22, 2009, 04:37:31 AM »
Google is complaining that my sitemap.xml references URLs that are excluded by my robots.txt. As a result, the Google index is missing a few thousand of my pages.

I constructed the robots.txt to eliminate duplicate content and thought I had configured the sitemap accordingly. Any idea what I need to do to fix this problem, short of removing the sitemap reference in Google Webmaster Tools?

Here is my robots.txt:
User-agent: *

#Crawl-Delay: 10

Disallow: /*&pp=
Disallow: /*&sort=
Disallow: /*.php
Disallow: /*daysprune=
Disallow: /*goto=
Disallow: /*mode=
Disallow: /*postcount=
Disallow: /*showthread.php?p=
Disallow: /PRCheck/
Disallow: /admincp/
Disallow: /archive/
Disallow: /backup/
Disallow: /calendar.php
Disallow: /cgi-bin/
Disallow: /chat/
Disallow: /clientscript/
Disallow: /cpstyles/
Disallow: /customavatars/
Disallow: /customprofilepics
Disallow: /images/
Disallow: /includes/
Disallow: /info
Disallow: /installed/
Disallow: /modcp/
Disallow: /personal/
Disallow: /signaturepics
Disallow: /showpost.php
Disallow: /showthread.php?goto
Disallow: /showthread.php?mode
Disallow: /showthread.php?p
Disallow: /showthread.php?page
Disallow: /showthread.php?post
Disallow: /showthread.php?pp
Disallow: /subscription.php
# entries should begin with "/" (or a wildcard) per the robots.txt spec
Disallow: /*display.php?daysprune
Disallow: /*display.php?do
Disallow: /*display.php?order
Disallow: /*display.php?page
Disallow: /*display.php?pp
Disallow: /*display.php?sort

User-Agent: msnbot
Crawl-Delay: 10

User-Agent: Slurp
Crawl-Delay: 10
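
To see which URLs those rules actually catch, here is a minimal sketch (Python 3; it follows Google's wildcard convention where "*" matches any run of characters, and the sample URLs are hypothetical):

import re

def rule_to_regex(rule):
    # "*" in a rule matches any run of characters; everything else is literal.
    pattern = re.escape(rule).replace(r'\*', '.*')
    return re.compile('^' + pattern)

DISALLOW = ['/*&pp=', '/*.php', '/archive/', '/showthread.php?p']

def is_blocked(path):
    # A URL is blocked if any Disallow rule matches from the start of its path.
    return any(rule_to_regex(r).match(path) for r in DISALLOW)

# Hypothetical sitemap entries:
for path in ['/showthread.php?t=123', '/forum/thread-title/', '/archive/index.html']:
    print(path, '->', 'blocked' if is_blocked(path) else 'allowed')

Running it shows the blanket /*.php rule blocks every .php URL, thread pages included, so if the sitemap lists showthread.php URLs, the two files will conflict exactly the way Google reports.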
Re: vBulletin robots.txt
« Reply #2 on: April 23, 2009, 04:49:57 AM »
Thanks for the pointer, but that is a really old listing for vBulletin. I am running the latest and greatest, which has a lot more content that should be protected. Any comments?
Re: vBulletin robots.txt
« Reply #3 on: April 24, 2009, 12:22:12 AM »
You can add any other entries in the "Exclude URLs" option if you want them excluded.
Re: vBulletin robots.txt
« Reply #4 on: April 24, 2009, 03:14:52 AM »
Thanks for the reply. Actually, I think that may have been the question I should have asked. When using the exclude option, if I want to mimic the robots.txt content, can I use folders in the exclude? When I set up the script, I copied the contents of your suggested list, so now the question is: can I just add folders to exclude the same way I do in the robots.txt file?

TIA for a reply.
Re: vBulletin robots.txt
« Reply #5 on: April 25, 2009, 01:41:54 AM »
Yes, you can add folders too (do not include the leading slash character, though).
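
For example, to mirror the folder rules from the robots.txt above, the exclude entries would look like this (a hypothetical list drawn from that file):

archive/
admincp/
modcp/
clientscript/
images/
backup/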