Sitemap size?
« on: January 27, 2012, 09:25:05 PM »
Hope someone can give a little guidance on this.

I'm currently putting a little SEO work into the site, starting with the sitemap and various other bits and pieces. We are reasonably well spidered at the moment, but I'm sure it could be better. Anyhow, the main part of the site is a forum. It's been running a few years and has around 180,000 threads (which I'm sure isn't too large as forums go), along with blog posts, archives of other content, etc.

I've set the standalone sitemap generator running with unlimited depth and it runs out of memory. I've upped the PHP memory limit a few times but it still doesn't make it all the way through. As such I'm not sure how to proceed:
a) Limit the crawl depth so the map completes with the memory available and leave it at that (already done this!)
b) Max out PHP memory so that the map completes at least once. Maybe leave the map with Google for a few weeks so that they index as much as possible cleanly.
c) Ensure there is enough memory available that the map can complete every time. I think I'd need at least 512 MB...

Re: Sitemap size?
« Reply #1 on: January 29, 2012, 08:26:19 AM »
Hello,

What type of forum software are you running (vB/SMF/phpBB etc.)? Forums usually have many "noise content" links, such as reply/quote/send PM/sorting options, that can be excluded in the generator configuration to reduce memory usage.
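
To illustrate the kind of exclusion meant here, below is a minimal Python sketch of a URL filter. It is not the generator's own configuration format; the function name `is_noise` and both rule sets are hypothetical, and the script and parameter names assume a phpBB 3 style URL layout that should be checked against the real board.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical exclusion rules, assuming phpBB 3 style URLs.
# Scripts that never lead to indexable content (reply/quote, user panel, etc.).
EXCLUDED_SCRIPTS = {"posting.php", "ucp.php", "memberlist.php", "search.php"}
# Query parameters that only create duplicates: session id, single-post
# links, sort key/direction/time and print or next/previous views.
EXCLUDED_PARAMS = {"sid", "p", "sk", "sd", "st", "view"}


def is_noise(url: str) -> bool:
    """Return True if the URL looks like a noise link that should be skipped."""
    parts = urlsplit(url)
    script = parts.path.rsplit("/", 1)[-1]
    if script in EXCLUDED_SCRIPTS:
        return True
    # keep_blank_values so that an empty "sid=" still counts as present.
    params = parse_qs(parts.query, keep_blank_values=True)
    return any(name in params for name in EXCLUDED_PARAMS)
```

Every URL rejected this way is one fewer page to fetch and one fewer entry to keep in the crawl queue, which is where the memory saving comes from.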
Re: Sitemap size?
« Reply #2 on: January 29, 2012, 12:16:27 PM »
I'm running phpBB. I've already set up the excludes to filter out the noise, and confirmed it by running a shallower sitemap, which comes out clean (i.e. only threads indexed rather than posts, replies, etc.).

This is more of a general question really, although it would be good if the script could index larger sites without being so memory intensive. I've had to increase PHP memory again to try to sitemap the whole lot at least once...
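
For what it's worth, here is a rough Python sketch of what a less memory-hungry approach could look like. It is not how the standalone generator works. It assumes the thread URLs can be produced one at a time (for example from the forum's topic table) instead of being collected by a crawl, and the name `write_sitemaps` and its parameters are made up for the example. URLs are streamed straight to disk and a new file is started every 50,000 entries, which is the per-file limit in the sitemaps.org protocol.

```python
from itertools import islice
from xml.sax.saxutils import escape

MAX_URLS_PER_FILE = 50_000  # per-file URL limit in the sitemaps.org protocol


def write_sitemaps(urls, prefix="sitemap", limit=MAX_URLS_PER_FILE):
    """Stream URLs from any iterable into numbered sitemap files.

    Only one URL is held in memory at a time. Returns the file names written.
    """
    urls = iter(urls)
    written = []
    while True:
        chunk = islice(urls, limit)
        first = next(chunk, None)
        if first is None:  # nothing left, so do not create an empty file
            break
        name = f"{prefix}-{len(written) + 1}.xml"
        with open(name, "w", encoding="utf-8") as out:
            out.write('<?xml version="1.0" encoding="UTF-8"?>\n')
            out.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
            out.write(f"  <url><loc>{escape(first)}</loc></url>\n")
            for url in chunk:
                # escape() turns "&" in query strings into "&amp;" for valid XML.
                out.write(f"  <url><loc>{escape(url)}</loc></url>\n")
            out.write("</urlset>\n")
        written.append(name)
    return written
```

With around 180,000 threads this would produce four files, which would then need to be listed in a sitemap index file before submitting.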