Crawled more pages than included in sitemap
« on: May 15, 2008, 08:57:23 PM »
I used the Generator to generate the sitemap for a site that has about 25000 links.
Although about 20000 pages are scanned, only 2800 are added to the sitemap. The crawl ends successfully (I get no error messages), but only a few links end up in the sitemap. Is this because of a memory limit problem, or might it be caused by something else?

If the memory limit is the problem, is there a way to generate the sitemap without changing the memory_limit value? (I tried to change it, but with no luck. I think PHP would have to be compiled with the --enable-memory-limit option for that to work.)

Here is a screenshot of the generation process (just before it ends).
[ External links are visible to forum administrators only ]


« Last Edit: May 15, 2008, 11:57:44 PM by admin »
Re: Memory limit problem?
« Reply #1 on: May 15, 2008, 11:41:26 PM »

This is not a memory limit problem, since the sitemap was created without errors.

It looks like most of those pages either redirect to other URLs (which is typical for forums) or have a "robots" meta tag set to "noindex" in their HTML source, so the sitemap generator does not include them in the sitemap.
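
To make the two exclusion rules concrete, here is a rough Python sketch of how a crawler might decide to skip a page. This is not the generator's actual code: the `skip_reason` function, the `RobotsMetaParser` class, and the sample URLs are all made up for illustration, and it assumes a redirect shows up as a 3xx HTTP status.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise them.
        attr = {k.lower(): (v or "") for k, v in attrs}
        if attr.get("name", "").strip().lower() == "robots":
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def skip_reason(status, html):
    """Return why a crawled page would be left out, or None to include it."""
    if 300 <= status < 400:
        return "redirect"
    parser = RobotsMetaParser()
    parser.feed(html)
    if "noindex" in parser.directives:
        return "noindex"
    return None


# Hypothetical crawl results: (URL, HTTP status, HTML body).
pages = [
    ("/index.php", 200, "<html><head><title>Home</title></head></html>"),
    ("/index.php?action=post", 302, ""),
    ("/index.php?action=printpage", 200,
     '<html><head><meta name="robots" content="noindex, follow"></head></html>'),
]

included = [url for url, status, html in pages if skip_reason(status, html) is None]
print(len(pages), "crawled,", len(included), "included")  # 3 crawled, 1 included
```

All three pages count as crawled, but only the first one would be written to the sitemap, which is the same kind of gap as 20000 scanned versus 2800 included.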