maga

Crawl Stops
« on: October 06, 2007, 08:36:03 AM »
The crawl keeps stopping:
a) even when I check the box to continue running in the background;
b) even when I leave the window open and on top the whole time.

Running on a shared server with a big hosting company.
Website built with the Joomla CMS, using PHP 5.2 and MySQL.
The website has about 10,000 pages.
The host allows a maximum of 50,000 MySQL queries per hour. Is that an issue?

The XML-Sitemaps configuration has all limits set to zero = unlimited.
Save is set to 180 seconds.
The delay is set to 1 second after every 10 pages.

Any ideas why the crawl stops?

Thank you

« Last Edit: October 06, 2007, 08:55:22 AM by maga »
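On the 50,000-queries-per-hour question, a rough back-of-the-envelope estimate: if a Joomla page runs somewhere on the order of 20-50 database queries (an assumed figure that varies widely by template and extensions), crawling all 10,000 pages could consume 200,000-500,000 queries in total, so the hourly cap could well throttle or stall the crawl partway through.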

maga

Re: Crawl Stops
« Reply #1 on: October 06, 2007, 10:17:44 AM »
Crawled 1,300 pages in 5 hours. Something is wrong.

These are the crawler settings:
Crawler Limitations, Fine-tune (optional)
Maximum pages: 0 (unlimited)
Maximum depth level: 0 (unlimited)
Maximum execution time, seconds: 0 (unlimited)
Save the script state every X seconds: 180 (lets the crawl resume if it was interrupted; 0 = no saves)
Make a delay between requests, X seconds after each N requests: 1 second after every 10 requests (reduces the load on the web server; 0 = no delay)
« Last Edit: October 06, 2007, 10:41:38 AM by maga »
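For context, the last two options describe a common save-and-resume plus throttling pattern. A minimal PHP sketch of that pattern (the state file name, its structure, and the elided fetch step are assumptions for illustration, not the generator's actual code):

    <?php
    // Sketch of the "save state" and "delay between requests" options.
    $stateFile = 'crawl_state.json';

    // Resume the last session if a saved state exists.
    $state = is_file($stateFile)
        ? json_decode(file_get_contents($stateFile), true)
        : array('queue' => array('/'), 'done' => array());

    $lastSave = time();
    $count    = 0;

    while ($state['queue']) {
        $url = array_shift($state['queue']);
        // ... fetch $url, extract links, append new ones to the queue ...
        $state['done'][] = $url;

        // Delay option: pause 1 second after every 10 requests.
        if (++$count % 10 === 0) {
            sleep(1);
        }

        // Save option: persist progress every 180 seconds so an
        // interrupted run can pick up where it left off.
        if (time() - $lastSave >= 180) {
            file_put_contents($stateFile, json_encode($state));
            $lastSave = time();
        }
    }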
Re: Crawl Stops
« Reply #2 on: October 07, 2007, 11:46:16 AM »
Hello,

it looks like your server configuration doesn't allow the script to run long enough to create a full sitemap. Please try increasing the memory_limit and max_execution_time settings in the PHP configuration at your host (the php.ini file), or contact hosting support about this.
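For reference, the same two limits can often be raised at runtime where the host permits it; a minimal sketch (the values are the kind of numbers suggested later in this thread, not site-specific recommendations):

    <?php
    // Runtime equivalents of the php.ini directives
    //   memory_limit = 128M
    //   max_execution_time = 3000
    // Note: many shared hosts cap or ignore these overrides.
    ini_set('memory_limit', '128M');
    set_time_limit(3000);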

maga

Re: Crawl Stops
« Reply #3 on: October 09, 2007, 06:06:24 PM »
I have increased memory to 64 MB and set max_execution_time to 86,400 seconds (24 hours).
The crawl still only runs about 200 pages before stopping.
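One thing worth verifying on a shared host is whether the raised limits were actually applied. A small check script:

    <?php
    // Print the values PHP is actually running with; on shared hosts a
    // per-directory or per-user configuration can silently override php.ini.
    echo 'memory_limit: ', ini_get('memory_limit'), "\n";
    echo 'max_execution_time: ', ini_get('max_execution_time'), "\n";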

maga

Re: Crawl Stops
« Reply #5 on: October 11, 2007, 09:20:32 AM »
Private message sent with url.

melih

Re: Crawl Stops
« Reply #6 on: November 25, 2007, 03:34:00 AM »
Hi,

Out of curiosity, was there a fix for this problem? We have the very same issue, and we're not convinced it has to do with the php.ini file. The crawl continually stops around the same spot. Here is the crawler's status output:

Links depth: 5
Current page: myarticles/Mark-Mason/621/Computers-and-Technology/10/title/asc/1/1
Pages added to sitemap: 29158
Pages scanned: 29180 (295,798.8 Kb)
Pages left: 9925 (+ 9210 queued for the next depth level)
Time passed: 39:10
Time left: 13:19
Memory usage: 31,444.1 Kb
Resuming the last session (last updated: 2007-11-24 11:31:26)

Any feedback is greatly appreciated.

Melih ("may-lee") Oztalay, CEO
SmartFinds Internet Marketing
Re: Crawl Stops
« Reply #7 on: November 26, 2007, 04:14:46 AM »
Hello,

what are your current max_execution_time and memory_limit settings?

melih

Re: Crawl Stops
« Reply #8 on: November 26, 2007, 04:50:38 AM »
Hi,

Thanks for the response.  Here is what we found on the server:
max_execution_time = 30
memory_limit = 64M

Let me know if you have any suggestions for changes. What concerns me is that, earlier in this thread, the original poster had difficulty even after increasing max_execution_time. It's just not clear what the resolution to that problem was.

Look forward to your response.
Melih ("may-lee") Oztalay, CEO
SmartFinds Internet Marketing
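For rough context, the status output in Reply #6 works out to about 29,180 pages in 39:10 of crawl time, i.e. 29,180 / 2,350 s ≈ 12 pages per second; at that rate a 30-second max_execution_time window covers only a few hundred pages per run, which would fit a crawl that repeatedly dies and falls back on the saved session to resume.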
Re: Crawl Stops
« Reply #9 on: November 27, 2007, 03:12:03 AM »
That depends on the total number of pages on your site; try increasing memory_limit to 128M and max_execution_time to 3000.

melih

Re: Crawl Stops
« Reply #10 on: November 27, 2007, 03:15:28 AM »
Hello,

Our site has over 25,000 pages and grows weekly. We can increase the settings as suggested and see what happens. Thanks.

Melih ("may-lee") Oztalay, CEO
SmartFinds Internet Marketing

maga

Re: Crawl Stops
« Reply #11 on: January 10, 2008, 02:54:18 PM »

I may have found the problem and the solution for my website.
I was using Open-SEF, a Joomla component that translates Joomla URLs into search-engine-friendly URLs.
As I found out later, Open-SEF is notorious for creating multiple URLs for the same page, which slows the entire website down.
I changed the SEF component to sh404SEF to eliminate the duplicate-URL problem.
Everything got faster, including XML-Sitemaps' crawler.
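The duplicate-URL problem described above can also be blunted on the crawler side by normalizing URLs before queueing them, so variants of the same page are fetched only once. A minimal sketch (the normalization rules and the session-parameter names are assumptions for illustration, not Open-SEF or sh404SEF behavior):

    <?php
    // Collapse common URL variants (trailing slashes, host case,
    // session IDs in the query string) onto one canonical form.
    function normalizeUrl($url)
    {
        $parts = parse_url($url);
        $host  = strtolower(isset($parts['host']) ? $parts['host'] : '');
        $path  = isset($parts['path']) ? rtrim($parts['path'], '/') : '';
        if ($path === '') {
            $path = '/';
        }

        // Drop session-style query parameters (assumed names).
        parse_str(isset($parts['query']) ? $parts['query'] : '', $query);
        unset($query['sid'], $query['PHPSESSID']);
        $qs = $query ? '?' . http_build_query($query) : '';

        return 'http://' . $host . $path . $qs;
    }

    $seen = array();
    foreach (array('http://example.com/page/', 'http://EXAMPLE.com/page?sid=abc') as $url) {
        $seen[normalizeUrl($url)] = true;
    }
    echo count($seen), "\n"; // prints 1: both variants collapse to one page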