Crazed Sitemap?
« on: August 02, 2010, 05:44:02 AM »
I have approximately 55 posts on my forum.

So far the sitemap generator has been running for more than 30 minutes.

It has scanned 5,000+ pages...

What is going on with this sitemap?

I would hate to think what would happen if I had 10,000 posts...

Admin,

Can you please add IPB (Invision Power Board) to your list of supported forums?
I do not see IPB listed, and I need a way to exclude a lot of this
junk from being added to the sitemap.
« Last Edit: August 02, 2010, 06:10:57 AM by dscurlock »
Re: Crazed Sitemap?
« Reply #1 on: August 02, 2010, 08:45:53 AM »
Hello,

please try these settings for an IPB-based site:
in "Exclude URLs":
Code: [Select]
&pid=
&view=
&do=reply
&do=new
act=Post
act=Login
act=usercp
req=sendentry
op=reportcomment
show=next
req=syndicate
req=showarchive
in "Do not parse":
Code: [Select]
showuser=
&user=
do=stats
req=printentry
show=comment
showtopic=
showentry=
lofiversion/index.php/t
&img=
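These entries are matched as substrings, so any URL that contains one of them is affected. For example, a few typical IPB-style links that would be caught (these are just illustrative; the exact URLs on your board may differ):
Code: [Select]
index.php?act=Login                          (caught by "act=Login")
index.php?act=Post&do=reply_post&f=2&t=55    (caught by "act=Post" and "&do=reply")
index.php?showtopic=55&view=getnewpost       (caught by "showtopic=" and "&view=")
index.php?showuser=7                         (caught by "showuser=")
lofiversion/index.php/t55.html               (caught by "lofiversion/index.php/t")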
Re: Crazed Sitemap?
« Reply #2 on: August 16, 2010, 06:49:15 AM »

It did not work... in fact, just a day later the sitemap crashed.
It failed to load the sitemap page, and it would no longer
allow me to log in to cPanel either...

I may not even use it for IPB. I am curious: is
vBulletin on your list? I do not need another IPB failure
and to lose the entire sitemap like I did with this one.

Re: Crazed Sitemap?
« Reply #3 on: August 16, 2010, 11:23:48 AM »
Hello,

please send me your generator URL/login in a private message so I can check this.
Re: Crazed Sitemap?
« Reply #4 on: November 02, 2010, 07:19:14 PM »
You need to put the calendar URLs in the exclusions. I guarantee you have multiple pages for every day of the week coming from the calendar rather than just the events themselves. I have found that using robots.txt is a better way of restricting access, because then you also catch the spiders that don't use your sitemap.
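For example, a minimal robots.txt along these lines would keep crawlers out of the calendar (this assumes your calendar links contain act=calendar or app=calendar; check the actual URLs on your board and adjust the patterns first):
Code: [Select]
# Keep crawlers out of the IPB calendar (adjust the patterns to your board's real URLs)
# The * wildcard is understood by the major search engines, though not by every crawler.
User-agent: *
Disallow: /*act=calendar
Disallow: /*app=calendar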