Sitemap Gen timing out, memory issue, chmod question
« on: February 12, 2008, 06:12:13 AM »
Hi Admin:

Again, great product, just a minor issue (and a few questions) that I am sure you can help me fix...

One of my sites now has 143,000 URLs, and the sitemap generator takes about 50-60 hours to run.
It seems to time out and stop roughly every 24 hours. I can easily resume it, but I do not want to have to babysit it; I just want it to run without stopping.
Is there something in the php.ini I need to change?

I was forced to set the memory limit to 768M in the php.ini file, as the generator was running out of memory at anything less than 512M.  This seems like a lot of memory to use.  Is this normal?

Is chmod-ing the sitemap.xml (or any other file) to be writable by the public safe?  My network admin thinks not.  Please explain in detail so I can explain this to him.  He wants me to put the sitemap.xml file in a hidden directory and only tell Google where it is.

Your thoughts?

Thanks,
Dan
Re: Sitemap Gen timing out, memory issue, chmod question
« Reply #1 on: February 12, 2008, 10:48:07 PM »
Hello,

1. For large sites, we suggest running the sitemap generator from the command line to avoid timeouts. You can also set up a cron job for the generator; an interrupted run will then be resumed automatically the next time the scheduled task is executed.

2. In most cases it is possible (and recommended) to use the "Do not parse" and "Exclude URLs" options to filter out noise pages and reduce memory usage.

3. sitemap.xml is not an executable file, so it is safe to make it writable. Also, Google expects the sitemap to be in the topmost directory (under the Sitemaps protocol, a sitemap can only list URLs at or below its own location), so you cannot move it into a subfolder. But you can change the filename (to, say, mysitemap432432.xml) if you wish.
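
To make point 3 concrete for your network admin, here is a rough sketch of what the permission bits look like. It uses a temporary file as a stand-in for sitemap.xml, and the modes shown are only examples, not settings the generator itself prescribes. Note that these bits control which local accounts on the server can write to the file; they do not by themselves let web visitors modify it over HTTP.

```shell
#!/bin/sh
# Sketch only: "$f" is a temporary stand-in for sitemap.xml (an assumption for illustration).
f=$(mktemp)

# 666: readable and writable by every local account on the server.
chmod 666 "$f"
ls -l "$f"

# 644: writable by the owner only, readable by everyone else.
chmod 644 "$f"
ls -l "$f"
```

If the PHP process that runs the generator is the owner of the file, 644 may be enough; if the web server runs under a different account, the file probably needs to be owned by that account or left at 666.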