
Cron does NOT start the crawl

Started by funds, January 14, 2008, 03:40:53 PM

funds

I have cron set up on a remote server to call the runcrawl.php script on the site that has the sitemap generator installed.

It works fine if the command is "wget http://www.*******.co.uk/generator/runcrawl.php"
But I don't want to create a new file on the remote server of the progress report, so I changed the command to:
"wget --spider http://www.*******.co.uk/generator/runcrawl.php" so that no file is downloaded, but now the crawl does not start even though wget indicates it got an HTTP status of "200 OK".

Why?????


sifeet

I have the same problem too. I run that command and then numbers just start showing up and counting.

I don't get the result! Is there no setting I can use to, for example, make a new sitemap every day at a given time?
Did I do something wrong somewhere?

Let me know please, thanks.

XML-Sitemaps Support

Hello,

The cron job is configured in your hosting control panel, where you define the command line and any additional parameters for the scheduled time/date.
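
As a rough illustration only, a daily schedule might look something like the sketch below. It assumes a standard five-field crontab; the 03:00 run time is an arbitrary choice, and the PHP binary and script paths are placeholders that will differ per host.

```shell
# Hypothetical schedule: minute hour day-of-month month day-of-week.
# "0 3 * * *" means every day at 03:00 (an arbitrary example time).
CRON_SCHEDULE="0 3 * * *"

# Placeholder command line; substitute the real paths for your host.
CRON_COMMAND="/usr/bin/php /path/to/runcrawl.php"

# The full line as it would appear in a crontab or a control panel field.
CRON_LINE="$CRON_SCHEDULE $CRON_COMMAND"
printf '%s\n' "$CRON_LINE"
```

The printed line is what would be pasted into the control panel's cron field, or added with `crontab -e` on hosts that give shell access.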

funds

I am not running cron on the server that the runcrawl.php script is on. I do not have access to cron on that server, so the "/usr/bin/php /path/to/runcrawl.php" is not applicable to this situation.

The crawl script starts fine when I run wget WITHOUT the "--spider" option, but since I do not want the cron to "wait" for the crawl to finish (on this domain it takes almost an hour) I am using the --spider option which SHOULD trigger the crawl script to start without waiting for a return, but for some reason it is not. I can't really troubleshoot it or look for places where it might be testing for the HTTP_USER_AGENT since the script is encoded.

XML-Sitemaps Support

The script doesn't test the user agent of incoming requests, so there should be no difference here. It is possible, though, that the server where the generator is installed doesn't allow scripts to stay in the background when called through a web request, so the script only runs while the network connection is still open (regardless of whether the script requests to stay in the background or not).
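
One possible workaround, sketched here under assumptions rather than confirmed in this thread: `wget --spider` typically sends a HEAD request instead of a full GET, which some servers may handle differently. A normal request with the output thrown away avoids creating a file, and running wget in the background lets the cron job return right away while the connection stays open for the crawl. The function name `start_crawl` and the example URL are hypothetical.

```shell
#!/bin/sh
# Sketch: trigger the crawl with a normal GET request, save nothing,
# and do not make the calling cron job wait for the crawl to finish.

# start_crawl URL
#   -q            silences wget's own progress output
#   -O /dev/null  discards the progress report instead of saving a file
#   &             backgrounds wget so the caller returns immediately,
#                 while the HTTP connection stays open for the crawl
start_crawl() {
    wget -q -O /dev/null "$1" >/dev/null 2>&1 &
}

# Example call (placeholder URL, replace with the real generator address):
#   start_crawl "http://www.example.co.uk/generator/runcrawl.php"
```

If the remote cron kills background children when the job ends, prefixing the wget call with `nohup` may help; that depends on the cron implementation.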