XML Sitemaps and vBulletin
« on: May 04, 2006, 11:44:41 PM »
Hi,

Sounds like a great piece of software. The only question I have is how well it works with vBulletin and which files I would need to specify to exclude.

Thanks,

Ivan
Re: XML Sitemaps and vBulletin
« Reply #1 on: May 06, 2006, 10:32:23 PM »
Hello Ivan,

it should work well with vBulletin. The files to exclude depend on what you want to keep in the sitemap, but the basic list is:
Quote
attachment.php
calendar.php
cron.php
editpost.php
image.php
member.php
memberlist.php
misc.php
newattachment.php
newreply.php
newthread.php
online.php
private.php
profile.php
register.php
search.php
usercp.php
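
For anyone curious how that kind of exclusion behaves in practice, here is a minimal sketch of a substring filter over crawled URLs. The sample URLs and the matching rule are assumptions for illustration only, not the generator's actual code:
Code:
# Hypothetical illustration of substring-based URL exclusion, in the spirit of
# the "Exclude URLs" option (the generator's real matching rules may differ).
EXCLUDE = [
    "attachment.php", "calendar.php", "cron.php", "editpost.php", "image.php",
    "member.php", "memberlist.php", "misc.php", "newattachment.php",
    "newreply.php", "newthread.php", "online.php", "private.php",
    "profile.php", "register.php", "search.php", "usercp.php",
]

def is_excluded(url):
    # A URL is skipped if it contains any of the excluded script names.
    return any(pattern in url for pattern in EXCLUDE)

for url in [
    "http://www.example.com/forum/showthread.php?t=123",  # kept
    "http://www.example.com/forum/calendar.php?month=5",  # skipped
    "http://www.example.com/forum/memberlist.php",        # skipped
]:
    print(("skip " if is_excluded(url) else "keep ") + url)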
Re: XML Sitemaps and vBulletin
« Reply #2 on: May 09, 2006, 02:04:16 AM »
Thanks,

Just to make sure: I just throw the above into the "Exclude URLs:" box, right? I'm trying to figure out how to limit crawling, as my 500-post vBB has 4,000-plus pages in the queue.

TIA,

Ivan
Re: XML Sitemaps and vBulletin
« Reply #3 on: May 09, 2006, 10:43:21 AM »
Hello Ivan,

you should add this list to both the "Exclude URLs" and "Do not parse" options.
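
Roughly speaking, the two options play different roles during the crawl. Here is a small sketch under assumed semantics (excluded URLs are left out of the sitemap entirely, while "do not parse" URLs are not fetched for further links); the real generator may differ in the details:
Code:
# Rough sketch of a crawl loop honouring both lists (assumed semantics,
# not taken from the generator's source code).
EXCLUDE_URLS = ["calendar.php", "memberlist.php", "search.php"]
DO_NOT_PARSE = ["calendar.php", "memberlist.php", "search.php"]

def matches(url, patterns):
    return any(p in url for p in patterns)

def crawl(start_urls, fetch_links):
    # fetch_links(url) -> list of URLs found on that page (supplied by caller).
    seen, queue, sitemap = set(start_urls), list(start_urls), []
    while queue:
        url = queue.pop(0)
        if not matches(url, EXCLUDE_URLS):
            sitemap.append(url)   # only non-excluded URLs end up in the sitemap
        if matches(url, DO_NOT_PARSE):
            continue              # don't fetch the page, which saves crawl time
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return sitemap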
Re: XML Sitemaps and vBulletin
« Reply #4 on: July 06, 2006, 03:39:10 AM »
I run vBulletin and found that my scan time is greatly reduced by excluding URLs for just the following three or four files:
 - calendar.php
 - memberlist.php
 - search.php
and possibly
 - forumdisplay.php

Calendar.php made the biggest difference, followed by memberlist.php.

- Danny

Re: XML Sitemaps and vBulletin
« Reply #5 on: July 06, 2006, 08:02:57 PM »
Thank you for the information, Danny. I believe calendar.php creates many pages (one for every day/week/month, etc.), so it's indeed good to exclude them.
However, I would not suggest removing forumdisplay.php pages, as they are important and it's better to have them included in the sitemap and indexed by search engines.
Re: XML Sitemaps and vBulletin
« Reply #6 on: December 13, 2006, 02:47:02 PM »
Apart from putting in the contents of my robots.txt file, I added "&" to both the "Do not parse" and "Exclude URLs" lists (see the sketch below). This stripped the result from approximately 5,000 URLs down to about 1,500, which at a guess should cover everything at least once at this point, and is about 50% higher than Google's current count of indexed pages for my site.

I'm counting on two factors in doing this:
 - Google is still going to crawl additional links from the pages found in the sitemap.xml, and
 - the showthread.php?p= URLs are going to cover subsequent pages of multipage threads.
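
As an illustration of what the "&" filter leaves behind (the URLs below are made-up examples of the default vBulletin URL forms; plain substring matching is assumed):
Code:
# Sample vBulletin-style URLs and the effect of excluding anything containing "&".
urls = [
    "showthread.php?t=123",             # kept: first page of a thread
    "showthread.php?t=123&page=2",      # dropped: contains "&"
    "showthread.php?p=4567",            # kept: links straight to a post, so later
                                        # pages of long threads stay reachable
    "forumdisplay.php?f=7&order=desc",  # dropped: contains "&"
]
for url in urls:
    print(("skip " if "&" in url else "keep ") + url)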

I take the view that Google will only index as much as they deem fit - the goal of the sitemap is to make sure they index the right stuff.

Hopefully I'm not too far off the mark.