Problem running the script via Crontab

Started by webmaster254, July 03, 2012, 02:52:03 PM


webmaster254

Hello Forum, I would like to discuss a strange situation I am experiencing with the XML-Sitemaps Standalone Generator.

First, the script is installed and fully working.

SCENARIO ONE
With the script installed, I manually run the following command in an SSH session:
/usr/bin/php /var/www/mysite.com/generator/runcrawl.php
The result is 2560 URLs in the sitemap.

SCENARIO TWO
With the same installation, I have set the very same command in crontab (it runs at 3 AM):
/usr/bin/php /var/www/mysite.com/generator/runcrawl.php
The result is 1950 URLs in the sitemap.

THE QUESTION IS: why does running the very same script from crontab yield fewer URLs in the sitemap?

XML-Sitemaps Support

Hello,

Is the result consistent? That is, if you run it in an SSH session again, does it create a 2560-URL sitemap again?

webmaster254

Yes, every time I run it via SSH I get 2560 URLs.

XML-Sitemaps Support

Hello,

Could you please PM me your generator URL and an example URL that is not included in the sitemap, along with how it can be reached starting from the homepage?
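
One way to find such a URL is to keep a copy of the sitemap from an SSH run and a copy from the cron run, then compare the two. The following is only a minimal sketch, assuming both files are standard sitemap.xml documents; the function names and the sample data (example.com URLs) are hypothetical and not part of the generator.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text):
    """Return the set of <loc> values found in a sitemap document."""
    root = ET.fromstring(xml_text.strip())
    return {loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text}


def missing_urls(full_xml, short_xml):
    """Return URLs present in the larger sitemap but absent from the smaller one."""
    return sorted(sitemap_urls(full_xml) - sitemap_urls(short_xml))


# Hypothetical sample data standing in for the SSH-run and cron-run sitemaps.
SSH_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/page/1</loc></url>
  <url><loc>http://www.example.com/page/2</loc></url>
</urlset>
"""

CRON_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/page/1</loc></url>
</urlset>
"""

for url in missing_urls(SSH_SITEMAP, CRON_SITEMAP):
    print(url)
```

With real files, the two strings would be read from the saved sitemap copies instead of being defined inline.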

webmaster254

OKAY, start from the homepage:

1) [ External links are visible to forum administrators only ] then click on GIOCARE MODERNO
2) [ External links are visible to forum administrators only ] then click on Squadre FUORI CATALOGO
3) [ External links are visible to forum administrators only ] then click on SUCCESSIVO (the blue arrow for "next", 63 pages)

[ External links are visible to forum administrators only ]
THIS PAGE IS NOT in the sitemap.

XML-Sitemaps Support

If you check the HTML source of the page, the pagination link looks like:
<a class="pagina_successiva" href="/page/1-20/2-20" title="Successivo">Successivo</a>

That is, the href points to the domain root and is only corrected later with JavaScript, which crawler bots will not see.
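
To illustrate, here is a rough sketch of how a crawler that does not run JavaScript would read that link. It is not the generator's actual code: it only uses Python's standard library to extract the raw href and resolve it against the page address. The page URL below is a hypothetical placeholder, since the real one is hidden in this thread.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects raw href values from <a> tags, with no JavaScript executed."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def resolve_links(page_url, html):
    """Return absolute URLs for every link in the static HTML of a page."""
    collector = LinkCollector()
    collector.feed(html)
    return [urljoin(page_url, href) for href in collector.hrefs]


# Hypothetical page address; the href is the one quoted from the page source.
PAGE_URL = "http://www.example.com/shop/catalog/"
HTML = '<a class="pagina_successiva" href="/page/1-20/2-20" title="Successivo">Successivo</a>'

resolved = resolve_links(PAGE_URL, HTML)
print(resolved)
```

Because the href starts with a slash, it resolves against the domain root rather than the current page path, so the crawler follows that address instead of the one JavaScript would produce in a browser.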