• Welcome to Sitemap Generator Forum.
 

problem...

Started by nellsonkimjr1, December 14, 2006, 05:58:13 PM

Previous topic - Next topic

nellsonkimjr1

I have a website with over 500,000 pages. The generator begins to work very slowly after 2,000 pages, and after 3,000 I can't even refresh the browser window.
What is the problem?

XML-Sitemaps Support

Hello,

For larger sites it is suggested to execute the sitemap generator from the command line via SSH for better performance.
You can also use the "Do not parse" / "Exclude URLs" options to skip certain URLs from processing.
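As an illustration, a command-line run over SSH usually looks something like the following. The script name and the exact paths are assumptions here (the install path is taken from the error path mentioned later in this thread); check your own installation directory for the correct crawler script:

```shell
# Log in to the server, change to the generator's directory,
# and start the crawl with the CLI PHP binary.
# The script name "runcrawl.php" is an assumption — adjust to your setup.
ssh user@example.com
cd /home/test/www/sm
php runcrawl.php
```

Running from the CLI avoids browser timeouts and typically has a more generous PHP configuration than the web server.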

nellsonkimjr1

Ok thanks.
By the way, if I need to exclude the following URL types:
[ External links are visible to forum administrators only ]
and
[ External links are visible to forum administrators only ]
where X and Y are random numbers,
what is the correct exclusion pattern to use in "Configuration"?


nellsonkimjr1

Oleg, what does this message mean, and what can I do to avoid it?

Fatal error: Allowed memory size of 67108864 bytes exhausted (tried to allocate 1048576 bytes) in /home/test/www/sm/pages/class.grab.inc.php(2) : eval()'d code on line 286

XML-Sitemaps Support

This message means that your PHP configuration limits the amount of memory available to scripts, and it is not enough to create the full sitemap (when a lot of URLs are found). You should increase the memory_limit setting in your PHP configuration (php.ini) to avoid this.
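For reference, 67108864 bytes is 64 MB. A minimal sketch of raising the limit in php.ini — the value 256M is only an illustration, pick what your server can spare:

```ini
; php.ini — raise the per-script memory ceiling (default here was 64M)
memory_limit = 256M
```

If you cannot edit php.ini, the same setting can sometimes be raised at runtime with `ini_set('memory_limit', '256M')`, provided your host allows it.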

nellsonkimjr1


Hi,
Even after I wrote a rule to exclude URLs, the generator says:
Pages scanned: 12100 (301,823.7 Kb)
Pages left: 88474

But the maximum number of pages for indexing is 15,000.

So the first question is: does the generator follow changes in configuration that were made after crawling was started?
Does the generator look at robots.txt and make any exclusions itself?

XML-Sitemaps Support

Hello,

Quote: So the first question is: does the generator follow changes in configuration that were made after crawling was started?
If you resume generation with changed options, they will be applied accordingly. This doesn't happen on-the-fly, though (while the generator is currently running).
Quote: Does the generator look at robots.txt and make any exclusions itself?
Yes, robots.txt exclusion is applied, and there are options to apply additional exclusions:
"Do not parse extensions"
"Do not parse URLs"
"Exclude from sitemap extensions"
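Purely as an illustration of how these exclusions combine (a sketch, not the generator's actual code): robots.txt rules plus user-defined substring exclusions can be modeled like this in Python, using the standard-library robots.txt parser.

```python
# Sketch of combined URL filtering: robots.txt rules first, then
# user-configured substring exclusions (analogous to the
# "Do not parse URLs" / "Exclude from sitemap" options).
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

# Hypothetical user-configured exclusion substrings.
exclude_substrings = ["logout", "sessionid="]

def should_include(url: str) -> bool:
    if not rp.can_fetch("*", url):
        return False  # blocked by robots.txt
    return not any(s in url for s in exclude_substrings)

print(should_include("http://example.com/page.html"))       # True
print(should_include("http://example.com/private/x"))       # False
print(should_include("http://example.com/index.php?logout=1"))  # False
```

A URL is kept only if it passes both checks; this is why adding exclusions mid-crawl only takes effect once the crawl is resumed with the new configuration.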