Large site only indexing 130 pages out of 1000s
« on: October 07, 2010, 11:52:43 PM »
Hi,

I am having some problems getting my site crawled properly so that the whole thing is indexed.

Are there any settings I should change to get the whole site indexed? The site has about 40,000 pages.

An example of a route to a page that is not being indexed would be:

1. [ External links are visible to forum administrators only ]
2. [ External links are visible to forum administrators only ]
3. [ External links are visible to forum administrators only ]
4. [ External links are visible to forum administrators only ]

Numbers 1 and 2 are indexed OK, but the crawler does not seem to follow the links on 2, so it never reaches 3 and 4.

Do you have any ideas what I could change to make this work?

Andrew
Re: Large site only indexing 130 pages out of 1000s
« Reply #1 on: October 08, 2010, 09:35:09 AM »
Hello,

It looks like links 3 and 4 point to the www domain, while the starting link is on the non-www domain. You should use [ External links are visible to logged in users only ] as the Starting URL to resolve that.
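
To illustrate the mismatch, here is a minimal Python sketch. It assumes the crawler only follows links whose host exactly matches the host of the Starting URL; the example.com URLs and the same_host helper are hypothetical and are not part of the generator script.

```python
from urllib.parse import urlsplit


def same_host(start_url: str, link: str) -> bool:
    """Return True when link is on exactly the same host as start_url."""
    return urlsplit(start_url).hostname == urlsplit(link).hostname


# Hypothetical non-www Starting URL (assumption for illustration).
start = "http://example.com/"

# Hypothetical links found while crawling.
links = [
    "http://example.com/jobs/",          # same host as the Starting URL
    "http://www.example.com/jobs/1234",  # www host, treated as a different site
]

# Links a strict same-host crawler would not follow.
skipped = [link for link in links if not same_host(start, link)]
print(skipped)
```

With a www Starting URL instead, the second link would match and be followed.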
Re: Large site only indexing 130 pages out of 1000s
« Reply #2 on: October 08, 2010, 07:18:44 PM »
Hi,

That's great, it seems to be working better now!

But I am getting the error message:

fatal error: Allowed memory size of 67108864 bytes exhausted (tried to allocate 15466507 bytes) in /home/squ1981/public_html/brightmindsrecruitment.co.uk/generator/pages/class.utils.inc.php(2) : eval()'d code on line 27

Do you have any idea how to deal with this, or how to reduce the impact of the crawl so it can get all the way through?

Re: Large site only indexing 130 pages out of 1000s
« Reply #3 on: October 08, 2010, 09:04:15 PM »
Hello,

It looks like your server configuration doesn't give the script enough memory (the limit shown in the error is 67108864 bytes, i.e. 64 MB) or enough running time to create the full sitemap. Please try increasing the memory_limit and max_execution_time settings in the PHP configuration at your host (php.ini file), or contact hosting support about this.
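
For reference, the two settings would look roughly like this in php.ini. The values are illustrative assumptions, not figures taken from the generator's documentation; the right numbers depend on the size of the site and on what your host allows.

```ini
; php.ini fragment (values are assumed examples, adjust to what your host permits)

; The error above shows a current limit of 67108864 bytes (64M).
memory_limit = 128M

; Maximum time a script may run, in seconds.
max_execution_time = 300
```

On shared hosting these values may be capped by the provider, in which case only hosting support can raise them.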