Generator stops after index.php (site's home page)
« on: January 16, 2014, 02:00:03 PM »
Hello, I have been using Generator for years, and just recently, on one of my sites, it stopped indexing beyond the index.php page. That is, the sitemaps (ROR, XML, TXT, etc.) are still generated, but with only 1 page indexed instead of the usual 350-plus, whether run via cron or manually. As far as I can gather, the script does not error out; it simply stops. My Apache host recently upgraded to PHP 5.4.21. There is currently no .htaccess in the root while I try to resolve this issue, and robots.txt is not the problem. Can you advise please? Thanks, Nick.

Will an upgrade to your latest version fix this? I just installed Standalone Sitemap Generator (PHP) v6.1 (2013-11-19), and there is no change.
 
« Last Edit: January 16, 2014, 02:15:18 PM by Nicky- »
Re: Generator stops after index.php (site's home page)
« Reply #1 on: January 16, 2014, 07:50:33 PM »
FOR INFO:
To add: just now, a cron run on another PHP site, built by a different PHP developer but on the same hosting, failed to index some 250+ pages. Only index.php and index.html were counted. The sitemaps were generated, but with only 2 pages indexed.

I really don't know whether the cause is the hosting's PHP configuration or the Generator script. Incidentally, this evening another Generator install worked perfectly on another site (same hosting) that is all HTML (no PHP).

Hope this helps in your diagnosis!

Regards, Nicky.   
« Last Edit: January 16, 2014, 07:54:04 PM by Nicky- »
Re: Generator stops after index.php (site's home page)
« Reply #2 on: January 17, 2014, 07:25:25 AM »
Hello,

Please try our search engine bot simulator tool to check whether the site is "crawlable":
http://www.xml-sitemaps.com/se-bot-simulator.html
Oleg Ignatiuk
www.xml-sitemaps.com

For maximum exposure and traffic for your web site check out our additional SEO Services.
Re: Generator stops after index.php (site's home page)
« Reply #3 on: January 17, 2014, 08:20:41 AM »
SE Bot Simulation Results
http://www.mairiedevillardonnel.net/index.php

Page layout as seen by the robots:
  Proxy access not allowed
Raw HTTP Headers:
Internal Links Found:
  none found
Restricted with robots.txt:
  none found
Restricted with rel="nofollow" attribute:
  none found
External URLs:
  none found

There is no .htaccess in the root?!
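
For anyone wanting to reproduce this kind of result without the online tool, here is a minimal sketch of a check that fetches a page and counts the links a crawler could follow. It is not how the simulator or the Generator works internally. `extractLinks` is a hypothetical helper name, the URL is passed on the command line, and the fetch assumes `allow_url_fopen` is enabled. The code avoids newer syntax so that it should also run on PHP 5.4.

```php
<?php
// Hypothetical helper: list the href values a crawler could follow in an HTML body.
function extractLinks($html)
{
    $links = array();
    if (trim($html) === '') {
        return $links; // nothing to parse
    }
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // suppress warnings about sloppy real-world markup
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = trim($anchor->getAttribute('href'));
        if ($href !== '') {
            $links[] = $href;
        }
    }
    return $links;
}

// Command-line usage: php check.php http://example.com/index.php
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    $body = @file_get_contents($argv[1]); // assumes allow_url_fopen is on
    if ($body === false) {
        echo "Request failed\n";
    } else {
        $links = extractLinks($body);
        echo "Body length: " . strlen($body) . ", links found: " . count($links) . "\n";
    }
}
```

A very short body with zero links, as in the results above, suggests the page is answering the crawler with a message instead of its normal content. A page can respond differently depending on where the request comes from, so it is probably best to run the check from the same host the Generator runs on.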
« Last Edit: January 17, 2014, 08:27:40 AM by Nicky- »
Re: Generator stops after index.php (site's home page)
« Reply #4 on: January 17, 2014, 04:19:41 PM »
ISSUE RESOLVED

I found this in index.php, which prevented crawling:

<?php if(@fsockopen($_SERVER['REMOTE_ADDR'], 80, $errstr, $errno, 1))
die("Proxy access not allowed"); ?>
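
For reference, the snippet above tries to open a TCP connection to port 80 on the visitor's own address and treats success as a sign of a proxy. A crawler that runs on a web server, such as the standalone Generator on the same host or the online simulator, will most likely answer on port 80 and so gets the "Proxy access not allowed" page with no links to follow. If the check has to stay, here is a minimal sketch of one way to exempt the server's own address and known crawler addresses. `shouldSkipProxyCheck` and `$crawlerAllowlist` are hypothetical names, not part of the Generator or of the original site code. The sketch assumes `$_SERVER['SERVER_ADDR']` is populated (it normally is under Apache) and sticks to PHP 5.4 syntax.

```php
<?php
// Hypothetical helper: decide whether the open-port proxy check should be skipped.
function shouldSkipProxyCheck($remoteAddr, $serverAddr, array $allowlist = array())
{
    if ($remoteAddr === '') {
        return true; // no client address, e.g. a command-line run
    }
    if ($remoteAddr === $serverAddr) {
        return true; // request comes from this server (a crawler on the same host)
    }
    if ($remoteAddr === '127.0.0.1' || $remoteAddr === '::1') {
        return true; // loopback
    }
    return in_array($remoteAddr, $allowlist, true);
}

$remoteAddr = isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : '';
$serverAddr = isset($_SERVER['SERVER_ADDR']) ? $_SERVER['SERVER_ADDR'] : '';
$crawlerAllowlist = array(); // assumption: add the addresses your crawler runs from

if (!shouldSkipProxyCheck($remoteAddr, $serverAddr, $crawlerAllowlist)) {
    // fsockopen's third and fourth arguments are the error code and the error message, in that order.
    $socket = @fsockopen($remoteAddr, 80, $errno, $errstr, 1);
    if ($socket !== false) {
        fclose($socket);
        die("Proxy access not allowed");
    }
}
```

Removing the check entirely is the simpler option. The allowlist only makes sense if the proxy test is still wanted for ordinary visitors.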

Thanks,
Nicky
Re: Generator stops after index.php (site's home page)
« Reply #5 on: January 17, 2014, 06:29:10 PM »
I'm glad you got it resolved!
Oleg Ignatiuk