XML Sitemaps Generator

Topic: Crawler appears all the time with Run in background/Resume last session

paypal92

  • Registered Customer
  • Approved member
  • *
  • Posts: 4
Hi Guys,

I have gone over this about a hundred times, but I can't get it to work.

Every time I click the Crawling tab, it gives me two options: run in the background (restart) or resume the last session.

So it seems the job is not continuing, as the 'resume last session' option always displays the same time and number of records.

I believe this started happening when I set it to save every 60 seconds. All other options are at their defaults, except for the number of lines, which I changed from 1000 to 49000. I also manually created sitemap.xml in the root and, just in case, sitemap1.xml up to sitemap9.xml, and granted them access rights 666.

My site is quite big, about 35000 pages, and completely PHP driven, but very SEO friendly.

Any idea what's causing this?

Thanks,
Henk.

paypal92

Re: Crawler appears all the time with Run in background/Resume last session
« Reply #1 on: September 19, 2006, 10:48:32 PM »
Some additional info:

This time I kept the browser open and kept clicking the Crawling tab. It seems to just stop after a while. This is what's displayed, and it never changes again:

Links depth: 3
Current page: auctiondetails.php?id=113968
Pages added to sitemap: 2339
Pages scanned: 2360 (95,505.9 Kb)
Pages left: 660 (+ 5608 queued for the next depth level)
Time passed: 8:30
Time left: 2:22
Memory usage: 3,668.2 Kb

I checked the data directory and see 2 files:

crawl_state.log (size=167 bytes)
crawl_dump.log (size=1.78Mb)

Again, I created sitemap.xml manually (empty) and granted it 666, and did the same for sitemap1.xml up to sitemap9.xml.

I hope this info helps determine the cause.

Rgds,
Henk.

XML-Sitemaps Support

  • Administrator
  • Hero Member
  • *****
  • Posts: 10622
Re: Crawler appears all the time with Run in background/Resume last session
« Reply #2 on: September 20, 2006, 12:58:36 AM »
Hello Henk,

What are your max_execution_time and memory_limit settings in php.ini on the server? I suggest increasing them.
Oleg Ignatiuk
www.xml-sitemaps.com

For maximum exposure and traffic for your web site check out our additional SEO Services.
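
For anyone unsure where to find these two values: a minimal sketch that prints what PHP is actually running with. The function name current_limits() is made up for illustration and is not part of the sitemap generator; ini_get() and php_ini_loaded_file() are standard PHP functions.

```php
<?php
// Hypothetical helper (not part of the sitemap generator): collects the
// limits PHP is actually running with, which can differ from what you
// expect if a different php.ini is loaded.
function current_limits(): array
{
    return [
        'max_execution_time' => ini_get('max_execution_time'),
        'memory_limit'       => ini_get('memory_limit'),
        'loaded_php_ini'     => php_ini_loaded_file(), // false if none loaded
    ];
}

foreach (current_limits() as $name => $value) {
    echo $name, ' = ', var_export($value, true), PHP_EOL;
}
```

Run it the same way the crawler runs (through the web server or from the command line), because the two can load different php.ini files, and the command-line version of PHP defaults max_execution_time to 0 (no limit).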

paypal92

Re: Crawler appears all the time with Run in background/Resume last session
« Reply #3 on: September 20, 2006, 06:09:56 AM »
Hi Admin,

Meanwhile, I ran the script through SSH. It turns out that it fails due to a memory error. The error is as follows:

Fatal error: Allowed memory size of 8388608 bytes exhausted (tried to allocate 153 bytes) in .../class.grab.inc.php(2)..eval()'d code(1) on line 315

Do I really have to change the allowed memory in php.ini now? I would rather not, as all PHP scripts would then be allowed to allocate more memory, which impacts the resources on my server substantially. I would like to keep performance high.

What should I do next?

btw: max_execution_time is set to 30, but the script runs for 3 minutes. So I really think this is down to the memory limit.
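
On the worry about raising the limit for every script: one possible approach, sketched here under assumptions, is to raise memory_limit at runtime for a single script with ini_set() and leave php.ini alone. Whether the generator offers a place to put such a line is not confirmed in this thread, and 32M is only an illustrative value.

```php
<?php
// Assumption: this runs at the top of the one script that needs more memory
// (or in a small wrapper that then includes it). Other scripts keep the
// server-wide default from php.ini.
$previous = ini_set('memory_limit', '32M'); // old value as a string, or false on failure

if ($previous === false) {
    // Some hosts forbid runtime changes; the limit then stays as it was.
    echo 'Could not change memory_limit; still ', ini_get('memory_limit'), PHP_EOL;
} else {
    echo 'memory_limit changed from ', $previous, ' to ', ini_get('memory_limit'), PHP_EOL;
}
```

When the script is started from the command line, as in the SSH run described here, the same override can be passed for that run only with the -d option of the php binary (php -d memory_limit=32M followed by the script path).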

paypal92

Re: Crawler appears all the time with Run in background/Resume last session
« Reply #4 on: September 20, 2006, 09:12:57 AM »
<UPDATE>

Just a little update for those that run into the same problem.

I modified the php.ini file on the server and set memory_limit to 32M.

This solved the problem; the script is running perfectly now.

Cheers,
Henk.
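
To put the numbers side by side: 32M is PHP shorthand notation for a memory size, and the 8388608 bytes quoted in the fatal error is 8M in the same notation. A small sketch of that arithmetic follows; shorthand_to_bytes() is a hypothetical helper written for this note, not a built-in PHP function.

```php
<?php
// Hypothetical helper: converts PHP's shorthand byte notation (K, M, G)
// into a plain number of bytes.
function shorthand_to_bytes(string $value): int
{
    $value  = trim($value);
    $number = (int) $value;               // leading digits, e.g. 32 from "32M"
    $suffix = strtoupper(substr($value, -1));

    switch ($suffix) {
        case 'G': return $number * 1024 * 1024 * 1024;
        case 'M': return $number * 1024 * 1024;
        case 'K': return $number * 1024;
        default:  return $number;         // no suffix: already bytes
    }
}

$old = 8388608;                    // limit quoted in the fatal error
$new = shorthand_to_bytes('32M');  // value from the update

echo $old / (1024 * 1024), 'M -> ', $new / (1024 * 1024), 'M', PHP_EOL; // prints "8M -> 32M"
```

So the change gave the crawler four times the memory it had when it stopped.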

XML-Sitemaps Support

Re: Crawler appears all the time with Run in background/Resume last session
« Reply #5 on: September 20, 2006, 11:19:16 PM »
Great, thank you for the follow-up!
Oleg Ignatiuk
www.xml-sitemaps.com
