Crawl Never Completes ... Seems to Loop
« on: May 13, 2013, 02:34:51 AM »
I just purchased the product and in general I like the concept, but it will not complete a sitemap.  My situation looks similar to what some other people have posted here, but many of those posts just say to PM an administrator, so I am not exactly sure what the deal is in my case.  Here are the details.

I started the script last night and had to manually restart the crawl every couple of minutes.  Then I read about setting the crawl to restart automatically after XX seconds of inactivity.  I set mine to 10 seconds, and that seemed to help for a while.

However, now I seem to be hitting another roadblock.

Every time I get to 18,260 pages scanned, I get the "allowed memory size exhausted" error shown here (see my follow-up question below this message).

Links depth: 3
Current page: ships_store/index.php?p=details&ident=174892&mfc=Harken&sku=418&prod_name=Trigger+Cleat&sectionid=5004
Pages added to sitemap: 18247
Pages scanned: 18260 (684,093.5 KB)
Pages left: 17272 (+ 94682 queued for the next depth level)
Time passed: 0:42:55
Time left: 0:40:36
Memory usage: 115,421.0 Kb
Resuming the last session (last updated: 2013-05-12 19:17:32)
Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 208 bytes) in /home/shop/public_html/generator/pages/class.utils.inc.php on line 102


After 10 seconds the program restarts, resumes with a slightly lower number of scanned pages, and then bails again once it climbs back up to that number.

This leads me to believe that there is a memory limitation on my end.  I know I should probably change some setting ... but which setting, and what should I change it to?  What are the implications of changing it?  Will it negatively impact my overall site performance?

I would love to get some assistance ... even if I have to pay more for it.

Thanks for your help.
Re: Crawl Never Completes ... Seems to Loop
« Reply #1 on: May 13, 2013, 03:08:42 AM »
A bit frustrated ... I get an error message with a line number, but when I go to look at that line, I find the code is compiled/encrypted or some such thing, so I cannot see what is going on.
Re: Crawl Never Completes ... Seems to Loop
« Reply #2 on: May 13, 2013, 03:19:41 AM »
Update ... I just saw a suggestion in the forums to try running the crawl from the command line.  I did, and it still bailed on me.  This is what I got:


18356 | 17176 | 688,034.6 | 0:42:55 | 0:40:10 | 3 | 113,237.3 Kb | 18343 | 95529 | 5
18357 | 17175 | 688,061.7 | 0:42:55 | 0:40:10 | 3 | 113,232.8 Kb | 18344 | 95530 | -5
18358 | 17174 | 688,088.7 | 0:42:55 | 0:40:09 | 3 | 113,232.2 Kb | 18345 | 95530 | 0
PHP Fatal error:  Allowed memory size of 268435456 bytes exhausted (tried to allocate 201 bytes) in /home/shop/public_html/generator/pages/class.utils.inc.php on line 102

Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 201 bytes) in /home/shop/public_html/generator/pages/class.utils.inc.php on line 102

I have also tried upping memory_limit to -1 (unlimited) and max_execution_time from 120 to 36000.
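
For reference, both limits can be overridden for a single command-line run with PHP's -d flags; a minimal sketch only, with runcrawl.php standing in for whatever script name your generator installation actually uses:

# -d overrides php.ini values for this one CLI run only
# (the runcrawl.php script name below is an assumption; adjust the path)
php -d memory_limit=-1 -d max_execution_time=36000 /home/shop/public_html/generator/runcrawl.php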



« Last Edit: May 13, 2013, 03:25:44 AM by christopher2 »
Re: Crawl Never Completes ... Seems to Loop
« Reply #3 on: May 13, 2013, 01:33:57 PM »
Hello,

Looking at the progress report, it seems the total number of pages found by the crawler at that moment is more than 100K. Is that the correct number for your site?
If yes, you need to increase memory_limit (there is also a setting in the generator configuration that limits memory, and you need to increase that too); see the sketch at the end of this reply.
If not, there is possibly an incorrect loop of links, which we might find if you can send me your generator URL/login in a private message.
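
For anyone following along, a minimal sketch of the php.ini side of that change (512M is an illustrative value, not a recommendation; the generator's own memory-limit setting is separate and must be raised in its configuration screen):

; php.ini -- raise the per-process ceiling that produced the fatal error above
memory_limit = 512M

Note that memory_limit is a per-script ceiling, not a reservation: raising it does not slow the site or consume RAM by itself, it only lets a script that genuinely needs more memory keep running.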