crawl never finishes, sitemap never generated
« on: December 03, 2006, 07:37:48 AM »
On multiple tries, I always get to the exact same place:

Already in progress. Current process state is displayed:
Links depth: 2
Current page: vlog/050122.html
Pages added to sitemap: 107
Pages scanned: 120 (2,183.0 Kb)
Pages left: 602 (+ 348 queued for the next depth level)
Time passed: 1:03
Time left: 5:18
Memory usage: -

Like many who have posted here, I've tried everything I could figure out from this forum, with no useful results. The one thing I haven't tried is modifying PHP permissions, which I don't think I can do with my host.

The last item on every thread of this nature is Oleg's reply "please PM me your URL/password," and then we never hear the end of the story. This forum would be a lot more useful if we could see what was actually done to resolve the problem!
Re: crawl never finishes, sitemap never generated
« Reply #1 on: December 04, 2006, 05:33:40 PM »
Hello,

Please increase the max_execution_time and memory_limit settings in your PHP configuration. If the script *requires* more time to complete the task, the configuration has to allow it. Otherwise, you can use the "Save state" feature and resume generation multiple times until the full sitemap is created.
Re: crawl never finishes, sitemap never generated
« Reply #2 on: December 07, 2006, 03:53:36 PM »
As mentioned earlier, I rather doubt that DreamHost will allow me to change the PHP configuration on this shared server.

I tried saving states, but it would never reload from the previous state - just sat there claiming it was doing so, without ever arriving at a conclusion.
Re: crawl never finishes, sitemap never generated
« Reply #3 on: December 07, 2006, 03:58:42 PM »
I did change the max execution time in the configuration page for Sitemap generator - tried several different settings, including longer pauses between various numbers of pages. This does not seem even to affect how far the script gets.
Re: crawl never finishes, sitemap never generated
« Reply #4 on: December 07, 2006, 05:48:47 PM »
Quote
I did change the max execution time in the configuration page for Sitemap generator - tried several different settings, including longer pauses between various numbers of pages. This does not seem even to affect how far the script gets.
The setting in the sitemap generator configuration doesn't override the one in php.ini, so you cannot raise the limit that way.
You can ask your hosting support whether you are allowed to change it (possibly via a local php.ini file or an .htaccess directive).
Re: crawl never finishes, sitemap never generated
« Reply #5 on: December 09, 2006, 06:02:54 PM »
DreamHost wants me to compile my own PHP in order to have control over the settings. I am not expert enough to do this and don't have time to become such an expert anytime in the next 12 months.

Now what?
Re: crawl never finishes, sitemap never generated
« Reply #6 on: December 09, 2006, 06:26:28 PM »
You can also install the sitemap generator on a different host, crawl your site from there, and then upload the sitemaps manually. Alternatively, you can limit the total number of pages in the sitemap so that you can create it in parts.
Re: crawl never finishes, sitemap never generated
« Reply #7 on: December 15, 2006, 11:51:15 AM »
Quote
You can also install the sitemap generator on a different host, crawl your site from there, and then upload the sitemaps manually. Alternatively, you can limit the total number of pages in the sitemap so that you can create it in parts.

I don't have a different host available.

I am able to create partial sitemaps on a per-directory basis, but how do I combine those into a single sitemap that Google and my site visitors can use?

It would have been helpful if the sales page on your website stated that this kind of PHP access and/or expertise was needed to use the product. The sample version made it look far "too" simple!
Re: crawl never finishes, sitemap never generated
« Reply #8 on: December 15, 2006, 07:39:12 PM »
Hello,

Quote
I am able to create partial sitemaps on a per-directory basis, but how do I combine those into a single sitemap that Google and my site visitors can use?
You can simply submit multiple sitemaps in your Google webmaster account; this is allowed.
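
If you would rather end up with one file, the per-directory sitemaps can also be merged with a small script. What follows is only a minimal sketch in Python, not a feature of the sitemap generator. The function names are made up for illustration, and writing the result with the sitemaps.org 0.9 namespace is an assumption; adjust it if your partial files use a different schema.

```python
import xml.etree.ElementTree as ET

# Assumption: the merged file is written with the sitemaps.org 0.9 namespace.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def local_name(tag):
    """Strip a '{namespace}' prefix from an element tag, if present."""
    return tag.rsplit("}", 1)[-1]


def collect_urls(sitemap_xml):
    """Return the text of every non-empty <loc> element in one sitemap."""
    root = ET.fromstring(sitemap_xml)
    return [
        el.text.strip()
        for el in root.iter()
        if local_name(el.tag) == "loc" and el.text and el.text.strip()
    ]


def merge_sitemaps(sitemap_docs):
    """Merge several sitemap documents (as strings) into one <urlset> string.

    Duplicate URLs are dropped and first-seen order is kept. Only <loc> is
    carried over, so optional fields such as <lastmod> are not preserved.
    """
    seen = set()
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for doc in sitemap_docs:
        for url in collect_urls(doc):
            if url not in seen:
                seen.add(url)
                ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")
```

To use it, read each partial sitemap file into a string, pass the list of strings to merge_sitemaps, and save the returned text as the combined sitemap.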

Quote
It would have been helpful if the sales page on your website stated that this kind of PHP access and/or expertise was needed to use the product.
There is a link to the "System requirements" (and installation instructions) page right below the "Checkout" button.

Thank you for your patience.
Re: crawl never finishes, sitemap never generated
« Reply #9 on: December 16, 2006, 07:46:03 AM »
I did look at the system requirements, which are vague:

"    * The PHP XML generator will work with PHP 4.3.x or higher in default configuration in Apache web-server environment.

I believe that's what I've got, otherwise, presumably, it wouldn't run at all.

    * Sitemap generator connects to your website via http port 80, so your host should allow local network connections for php scripts (this is default configuration)

Check.

    * For file permissions requirements please refer to "Installation" section.

No problem there.

    * The memory size requirements (as well as the time required to complete sitemap generation) depends on the number of pages your website contains. "

I have fewer than 1000 pages in total on my site, but I can't get a crawl even up to 500; it gets stuck here:

Links depth: 2
Current page: vlog/test2.html
Pages added to sitemap: 90
Pages scanned: 100 (2,043.3 Kb)
Pages left: 559 (+ 185 queued for the next depth level)
Time passed: 0:24
Time left: 2:18
Memory usage: -
Re: crawl never finishes, sitemap never generated
« Reply #10 on: December 17, 2006, 01:11:37 AM »
In your case it is most likely related to the max_execution_time setting rather than to memory_limit.
Thank you for your feedback.
Re: crawl never finishes, sitemap never generated
« Reply #11 on: December 27, 2006, 04:15:42 PM »
Is that something I can set outside of the php config file?
Re: crawl never finishes, sitemap never generated
« Reply #12 on: December 27, 2006, 07:31:59 PM »
Some hosts allow you to define these settings in a .htaccess file (per-directory), like this (the values are only examples):
Code: [Select]
php_value max_execution_time 300
php_value memory_limit 128M