XML Sitemaps Generator

Topic: Cron does NOT start the crawl

funds

  • Registered Customer
  • Approved member
  • *
  • Posts: 2
Cron does NOT start the crawl
« on: January 14, 2008, 03:40:53 PM »
I have cron set up on a remote server to call the runcrawl.php script on the site where the sitemap generator is installed.

It works fine if the command is "wget [external links are visible to admins only].*******.co.uk/generator/runcrawl.php"
But I don't want wget to save the progress report as a new file on the remote server, so I changed the command to:
"wget --spider [external links are visible to admins only].*******.co.uk/generator/runcrawl.php" so that no file is downloaded, but now the crawl does not start even though wget indicates it got an HTTP status of "200 OK".

Why?????

XML-Sitemaps Support

  • Administrator
  • Hero Member
  • *****
  • Posts: 10624
Re: Cron does NOT start the crawl
« Reply #1 on: January 15, 2008, 12:14:40 AM »
Your command line should be:
/usr/bin/php /path/to/runcrawl.php
Oleg Ignatiuk
www.xml-sitemaps.com

For maximum exposure and traffic for your web site check out our additional SEO Services.

sifeet

  • Registered Customer
  • Jr. Member
  • *
  • Posts: 23
Re: Cron does NOT start the crawl
« Reply #2 on: January 19, 2008, 07:11:34 PM »
I have the same problem. When I run that command, numbers just start showing up and counting.

I don't get a result! Is there no setting I can use to, for example, create a new sitemap every day at a given time?
Did I do something wrong somewhere?

Please let me know. Thanks.

XML-Sitemaps Support

  • Administrator
  • Hero Member
  • *****
  • Posts: 10624
Re: Cron does NOT start the crawl
« Reply #3 on: January 19, 2008, 08:44:00 PM »
Hello,

The cron job is configured in your hosting control panel, where you define the command line and the schedule (time and date) on which it runs.
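
As a rough illustration, a raw crontab entry for this might look like the fragment below (control panels usually ask for the same five time fields plus the command). The daily 03:00 schedule is only an assumed example, and /path/to/runcrawl.php is the placeholder path from Reply #1, to be replaced with the real path on your server.

```
# minute hour day-of-month month day-of-week  command
# Assumed schedule: every day at 03:00 server time.
# Output is discarded so cron does not mail the progress counter.
0 3 * * * /usr/bin/php /path/to/runcrawl.php >/dev/null 2>&1
```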

funds

  • Registered Customer
  • Approved member
  • *
  • Posts: 2
Re: Cron does NOT start the crawl
« Reply #4 on: January 21, 2008, 05:46:37 PM »
I am not running cron on the server that the runcrawl.php script is on. I do not have access to cron on that server, so the "/usr/bin/php /path/to/runcrawl.php" is not applicable to this situation.

The crawl script starts fine when I run wget WITHOUT the "--spider" option, but since I do not want the cron to "wait" for the crawl to finish (on this domain it takes almost an hour) I am using the --spider option which SHOULD trigger the crawl script to start without waiting for a return, but for some reason it is not. I can't really troubleshoot it or look for places where it might be testing for the HTTP_USER_AGENT since the script is encoded.

XML-Sitemaps Support

  • Administrator
  • Hero Member
  • *****
  • Posts: 10624
Re: Cron does NOT start the crawl
« Reply #5 on: January 21, 2008, 10:42:10 PM »
The script doesn't test the user agent of incoming requests, so there should be no difference there. It is possible, though, that the server where the generator is installed doesn't allow scripts to keep running in the background when called via a web request, so the script only runs while the network connection is open (regardless of whether the script asks to stay in the background).
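
If the only goal is to avoid leaving a downloaded file on the remote server, one possible workaround is to keep the normal request and discard the response instead of using --spider (which, in many wget versions, sends a HEAD request rather than a GET, and that may be related to the crawl not starting). The sketch below is only an assumption about what would fit this setup: the function name trigger_crawl and the example.com URL are made up for illustration, and wget will still wait until the crawl finishes.

```shell
#!/bin/sh
# Hypothetical helper: fetch the crawl URL with a normal GET request,
# but throw the response body away instead of saving it to a file.
trigger_crawl() {
    # -q           : no progress output (keeps cron mail quiet)
    # -O /dev/null : write the downloaded page to /dev/null
    # --tries=1    : do not retry if the connection drops, so the
    #                crawl is not started a second time
    wget -q -O /dev/null --tries=1 "$1"
}

# Usage (placeholder URL, replace with the real generator address):
# trigger_crawl "http://www.example.com/generator/runcrawl.php"
```

Because the connection stays open for the whole crawl, this should also work on servers that stop a script as soon as the client disconnects.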

 
