Moved Hosts Now Crawls 1 Page Only
« on: July 26, 2018, 10:09:25 PM »
I have read the various posts pertaining to this problem. I can ping the server from my terminal, and I have set the correct permissions on the folders.

When I choose to crawl the site it comes back with one page only.

Please help.

Thanks,

Randal
Re: Moved Hosts Now Crawls 1 Page Only
« Reply #1 on: July 27, 2018, 05:51:52 AM »
Hello,

Please send me your generator URL and login details in a private message so I can check this.
Re: Moved Hosts Now Crawls 1 Page Only
« Reply #2 on: July 27, 2018, 11:24:38 PM »
I have pm'd you the details.
Re: Moved Hosts Now Crawls 1 Page Only
« Reply #4 on: July 28, 2018, 02:06:14 PM »
I have sent them again.
Re: Moved Hosts Now Crawls 1 Page Only
« Reply #5 on: July 29, 2018, 09:12:01 PM »
Now I am getting this error when I attempt to create a sitemap:

[ External links are visible to forum administrators only ]
Re: Moved Hosts Now Crawls 1 Page Only
« Reply #6 on: July 30, 2018, 05:09:08 AM »
There is probably a configuration problem: it looks like your server doesn't allow local network connections on port 80 (HTTP) or 443 (HTTPS), so the sitemap generator is not able to crawl the site. This is usually caused by a firewall installed on the host. Could you please contact your hosting support about this?
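
If you want to check this yourself before contacting support, something like the following can be run on the same server as the generator. It is only a rough sketch: `can_connect` is a made-up helper name, and the host is whatever you pass on the command line. It only tests whether a plain TCP connection opens (no TLS handshake, no HTTP request), so a success here does not rule out other blocks.

```php
<?php
// Hypothetical helper (not part of the generator): returns true if a plain
// TCP connection to $host:$port can be opened within $timeout seconds.
function can_connect(string $host, int $port, float $timeout = 5.0): bool
{
    $errno = 0;
    $errstr = '';
    // '@' hides the warning fsockopen() emits on failure; only the result matters here.
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket === false) {
        return false;
    }
    fclose($socket);
    return true;
}

// Runs only from the command line with a host argument, e.g.:
//   php portcheck.php www.yourdomain.com
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    foreach ([80, 443] as $port) {
        echo $port, ': ', can_connect($argv[1], $port) ? 'open' : 'blocked or unreachable', "\n";
    }
}
```

If either port shows as blocked when the script is run from the server itself, that would be consistent with the firewall explanation above.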
Re: Moved Hosts Now Crawls 1 Page Only
« Reply #7 on: July 30, 2018, 04:55:33 PM »
You were correct. I have since fixed the problem and verified that I can telnet into my server on both ports.

However, I am still getting this error:  [ External links are visible to forum administrators only ]
Re: Moved Hosts Now Crawls 1 Page Only
« Reply #8 on: July 30, 2018, 06:40:39 PM »
Please try creating a test script to check that the connection is working:

<?php

// Replace with the URL the generator is configured to crawl.
$initurl = 'https://www.yourdomain.com';

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $initurl);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

// Run the request first: the error code and transfer info are only
// populated once curl_exec() has been called.
$fdata = curl_exec($ch);

if ($errno = curl_errno($ch)) {
    $error_message = curl_error($ch);
    echo "cURL error ({$errno}):\n {$error_message}";
}

$info = curl_getinfo($ch);
print_r($info);

print_r($fdata);

curl_close($ch);
Re: Moved Hosts Now Crawls 1 Page Only
« Reply #9 on: July 30, 2018, 07:06:42 PM »
This is the page I get when I run that script: [ External links are visible to forum administrators only ]

I ran netstat and got the output below. As far as I can tell, it shows the ports are open:

[root@ip-172-31-8-214 conf.d]# netstat -a | grep -i LISTEN
tcp        0      0 0.0.0.0:ssh             0.0.0.0:*               LISTEN
tcp        0      0 localhost:smtp          0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:sunrpc          0.0.0.0:*               LISTEN
tcp6       0      0 [::]:ssh                [::]:*                  LISTEN
tcp6       0      0 [::]:https              [::]:*                  LISTEN
tcp6       0      0 [::]:mysql              [::]:*                  LISTEN
tcp6       0      0 [::]:sunrpc             [::]:*                  LISTEN
tcp6       0      0 [::]:http               [::]:*                  LISTEN
Re: Moved Hosts Now Crawls 1 Page Only
« Reply #10 on: July 31, 2018, 04:28:13 AM »
As seen in the screenshot, the test script receives a 403 Forbidden response. This means the port is open, but your website blocks access from our server's IP address.
Re: Moved Hosts Now Crawls 1 Page Only
« Reply #11 on: July 31, 2018, 04:35:23 AM »
It is also possible that your website blocks the generator bot's user-agent. Please try adding this setting to the generator/data/generator.conf file:
Code:
<option name="xs_crawl_ident">Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0</option>
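
To tell an IP-based block apart from a user-agent block, it may help to request the same URL twice with different user-agent strings and compare the status codes. The script below is only a sketch under assumptions: `status_for_agent` and `interpret` are made-up helper names, and `ExampleSitemapBot/1.0` is a placeholder, not the generator's real user-agent, so pass the actual string as the second argument if you know it.

```php
<?php
// Hypothetical helper: fetch $url with the given User-Agent and return the
// HTTP status code (0 if the request failed before a response arrived).
function status_for_agent(string $url, string $userAgent): int
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // do not echo the body
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_exec($ch);
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}

// Hypothetical helper: turn the two status codes into a rough diagnosis.
function interpret(int $botStatus, int $browserStatus): string
{
    if ($botStatus === 0 || $browserStatus === 0) {
        return 'connection failed';
    }
    if ($botStatus === 403 && $browserStatus !== 403) {
        return 'user-agent blocked';
    }
    if ($botStatus === 403 && $browserStatus === 403) {
        return 'blocked for both user-agents';
    }
    return 'bot user-agent not blocked';
}

// Runs only from the command line, e.g.:
//   php agentcheck.php https://www.yourdomain.com "bot user-agent string"
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    $botAgent = $argv[2] ?? 'ExampleSitemapBot/1.0'; // placeholder, an assumption
    $browserAgent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0';
    $bot = status_for_agent($argv[1], $botAgent);
    $browser = status_for_agent($argv[1], $browserAgent);
    echo "bot: {$bot}, browser: {$browser}, result: ", interpret($bot, $browser), "\n";
}
```

A 403 for both strings would point towards an IP-based rule, while a 403 for the bot string only would suggest a bot-blocking plugin or rule keyed on the user-agent.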
Re: Moved Hosts Now Crawls 1 Page Only
« Reply #12 on: August 01, 2018, 12:25:59 PM »
You were correct: I have a plugin to stop bots. I never dreamed your bot's name would be in the bot table, but it was. Once I whitelisted it, everything worked perfectly.

Thanks for sticking with me on this.