crawl stops
« on: May 16, 2006, 09:06:32 AM »
Hi
I have just installed the script with no problems and manually started the first crawl, but it stopped after 5 minutes. I have set "Save the script state, every X seconds" to 140 and "Make a delay between requests, X seconds after each N requests" to 5 - 20.
The script is running on a Windows server.

   Regards
      Gary
Re: crawl stops
« Reply #1 on: May 16, 2006, 09:58:38 AM »
Hello Gary,

Sometimes the server configuration limits the maximum execution time for scripts and interrupts the script when that limit is exceeded.
You can either:
1. restart script execution multiple times (with the "Resume session" checkbox enabled) until the full sitemap is created
OR
2. modify the max_execution_time setting in php.ini (increase this value). If you are running IIS, you should also increase the "Maximum script execution time" setting in the IIS admin configuration.
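Option 2 only helps if the edit landed in the php.ini file that the web server actually loads, so it can be worth reading the value back after changing it. The following is a minimal sketch, assuming a Unix-like shell is available on the server; the `ini_value` helper and the file path are illustrative assumptions and are not part of the generator. The max_execution_time value is expressed in seconds.

```shell
# ini_value FILE KEY
# Print the value of the last uncommented "KEY = value" line in an ini-style file.
# (Hypothetical helper for illustration; lines starting with ';' are skipped
# because the pattern is anchored at the start of the line.)
ini_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Example usage (the path is a placeholder, adjust it to your installation):
#   ini_value /path/to/php.ini max_execution_time
```

If the printed value is not the one you set, the edit probably went into a different php.ini than the one in use.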
Re: crawl stops
« Reply #2 on: May 16, 2006, 10:12:09 AM »
Hi
Thanks for the quick response. Regarding the "Resume session" checkbox: I can't find it on the config page, only the "Save the script state" option.

    Regards
        Gary
Re: crawl stops
« Reply #3 on: May 16, 2006, 10:02:21 PM »
Hello,

You will see this checkbox on the Crawling page after your initial session has been interrupted by the server (timed out), assuming that you have set the "Save state" option.
Re: crawl stops
« Reply #4 on: May 17, 2006, 07:28:57 AM »
Hi
The server doesn't seem to time out; the crawl just stops. I then need to close the browser, reopen the page, and then continue the interrupted session.

   Gary
Re: crawl stops
« Reply #5 on: May 17, 2006, 04:28:08 PM »
Hello Gary,

When the crawling just stops, it means that the script timed out (the server does not send any special "timeout" message). You should resume the generation after that (or modify the PHP configuration as described above).
Re: crawl stops
« Reply #6 on: June 07, 2006, 05:51:48 PM »
I'm having the same problem. I've set the crawler timeout in the generator config, gone into php.ini and modified max_execution_time, and adjusted the script execution timeout and connection timeout to ridiculously high numbers, and it's still stopping after a couple of minutes. Do you have to do anything for PHP to use the updated settings in php.ini? We've got a site with 75,000 pages and really can't sit around all day restarting the script.
Re: crawl stops
« Reply #7 on: June 07, 2006, 06:35:35 PM »
Hi
As I do not have access to php.ini and my server would not increase or change the timeout, I have set up a scheduled task (the equivalent of cron, as I am on a Windows server) to run periodically, which seems to work great with no great load on the server.

    ATS
Re: crawl stops
« Reply #8 on: June 07, 2006, 07:35:56 PM »
So to get the new php.ini loaded you have to recycle the DefaultAppPool, but even after doing this it still just stops. With a site as large as ours, setting up a cron job to run every 5 minutes isn't going to work for getting the job done anytime soon. Two days turns into four days....
Re: crawl stops
« Reply #9 on: June 08, 2006, 10:09:16 PM »
Hello,

You should not set the cron task to run every 5 minutes! A daily task is more than enough to keep your sitemap fresh (and it's perfectly fine to have it generated weekly for larger sites).
Re: crawl stops
« Reply #10 on: June 08, 2006, 10:11:03 PM »
If it's timing out every couple of minutes, is a cron job running once a day really going to accomplish anything worthwhile?
Re: crawl stops
« Reply #11 on: June 08, 2006, 10:15:57 PM »
When the script is executed by cron, PHP is usually configured with other options that allow it to run longer (i.e., Apache doesn't interrupt the PHP process in this case).
Could you please try to execute the generator from the command line (via SSH)?
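A command-line run over SSH normally ends when the SSH session closes, so for a long crawl it may help to detach the process from the terminal. Below is a minimal sketch of one way to do that, assuming a Unix-like shell with `nohup`; the `run_detached` helper, the log and PID file names, and the script path are hypothetical placeholders, since the generator's actual command-line entry point is not named in this thread. The PHP command-line binary does not apply an execution time limit by default, which is why this route tends to run longer than a browser-started crawl.

```shell
# run_detached LOGFILE PIDFILE COMMAND [ARGS...]
# Start COMMAND in the background, immune to hangups, with all output
# written to LOGFILE and the process ID written to PIDFILE.
# (Hypothetical helper for illustration only.)
run_detached() {
    log_file=$1
    pid_file=$2
    shift 2
    nohup "$@" >"$log_file" 2>&1 &
    echo $! >"$pid_file"
}

# Example usage (placeholder path, replace with the generator's real script):
#   run_detached crawl.log crawl.pid php /path/to/generator/script.php
```

The PID file makes it possible to check on the process later with `kill -0 "$(cat crawl.pid)"`, which succeeds only while the process is still alive.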
Re: crawl stops
« Reply #12 on: June 08, 2006, 10:24:57 PM »
Will try
Re: crawl stops
« Reply #13 on: June 09, 2006, 12:04:41 AM »
It's gotten to 10,000 and it's still going, so I'm hopeful it'll finish.
Re: crawl stops
« Reply #14 on: June 11, 2006, 03:25:37 AM »
Did you get this to work out? I am having the same trouble. The generator kept stopping, and I was just moving from save state to save state. But then the script stopped getting far enough for the save state to be written (I had it set to every 300 seconds, i.e. 5 minutes).

So I tried running the script in the uninterrupted mode, but I don't know if it's still going. The save states aren't continually being made, so how do I know if it's still running?

Thanks.
BV
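For the "is it still running?" question, one rough check from a shell is to see whether the saved state file has been touched recently. This is only a sketch under stated assumptions: it needs shell access and a `find` that supports `-mmin` (GNU and BSD versions do), and both the `recently_modified` helper and the state file path are hypothetical, because the thread does not say where the generator stores its state.

```shell
# recently_modified FILE MINUTES
# Succeed (exit status 0) if FILE exists and was modified less than
# MINUTES minutes ago; fail otherwise.
# (Hypothetical helper for illustration only.)
recently_modified() {
    [ -n "$(find "$1" -mmin "-$2" 2>/dev/null)" ]
}

# Example usage (placeholder path, adjust to where the state is saved):
#   if recently_modified /path/to/generator/state-file 10; then
#       echo "state was saved within the last 10 minutes"
#   else
#       echo "no recent save; the crawl may have stopped"
#   fi
#
# Listing PHP processes is another quick check:
#   ps aux | grep '[p]hp'
```

Choose a window somewhat longer than the configured save interval (300 seconds in the post above), so that a single slow save does not look like a stopped crawl.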