Unable to crawl site hosted in a subfolder of another
« on: February 12, 2007, 01:17:07 PM »
Hi,

I have two sites [ External links are visible to forum administrators only ] and [ External links are visible to forum administrators only ]

The first indexes fine using the free generator; the second reports an error. The first is a full domain hosted by its registrar; the second is registered with a different registrar and forwarded to a subfolder of the first.

I think this error is probably linked to the cottage site's inability to register in Google. Using their webmaster tools I found that I had a robots.txt in the top-level site restricting access to the cottage one, but even though I've removed that restriction, Google and your tool both seem unable to crawl the cottage site.

Is there some other thing I have to change? Unfortunately your error page just says "an error", not which error. Perhaps that would be a nice enhancement ...?

Best regards

Nick
Re: Unable to crawl site hosted in a subfolder of another
« Reply #1 on: February 13, 2007, 11:23:19 PM »
Hello,

You have the second domain redirected to the first one, but the Google sitemap protocol only allows URLs from the same domain to be included in a sitemap, so technically there are no URLs on the cliftonfarmcottage domain to index.
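
To illustrate the same-domain rule, here is a minimal Python sketch. It is not the generator's actual code: the function name urls_on_sitemap_host and the /cottage/ path in the usage example are assumptions made up for illustration. It keeps only URLs whose protocol and hostname match those of the sitemap itself.

```python
from urllib.parse import urlsplit


def urls_on_sitemap_host(sitemap_url, candidate_urls):
    """Keep only URLs that share the sitemap's scheme and hostname.

    Hypothetical helper for illustration; not part of any real tool.
    """
    site = urlsplit(sitemap_url)
    kept = []
    for url in candidate_urls:
        parts = urlsplit(url)
        # A URL on another host (or protocol) may not be listed in this sitemap.
        if parts.scheme == site.scheme and parts.hostname == site.hostname:
            kept.append(url)
    return kept
```

For example, given a sitemap at http://www.cliftonfarm.co.uk/sitemap.xml, a page such as http://www.cliftonfarm.co.uk/cottage/index.html (assumed path) would be kept, while http://www.cliftonfarmcottage.co.uk/ would be dropped.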
Re: Unable to crawl site hosted in a subfolder of another
« Reply #2 on: February 14, 2007, 09:32:30 AM »
That wasn't my question, sorry.

If I go to the free sitemap generator tool and enter "[ External links are visible to forum administrators only ]" as the URL, the sitemap generator says "an error occurred". I don't want to include the cottage pages under the main site; I want to index the cottage site in its own right. I provided the other information as background in case it helped.

Does that make the question clearer?

Thanks

Nick
Re: Unable to crawl site hosted in a subfolder of another
« Reply #3 on: February 15, 2007, 02:40:21 PM »
The "error occurred" message is displayed because your http://www.cliftonfarmcottage.co.uk/ page is redirected to http://www.cliftonfarm.co.uk. Because the redirect target is an external domain, the script cannot retrieve your homepage.
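
If you want to check this kind of redirect yourself, a rough Python sketch follows. The function names leaves_host and check_redirect are hypothetical, and this is not how the generator script itself works. It relies on urllib following HTTP redirects automatically and then compares the hostname of the final URL with the one requested. Note the assumption that the forwarding is a plain HTTP redirect; frame-based forwarding would not be detected this way.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen


def leaves_host(requested_url, final_url):
    """Return True if the final URL is on a different hostname."""
    return urlsplit(requested_url).hostname != urlsplit(final_url).hostname


def check_redirect(url, timeout=10):
    """Fetch url, following HTTP redirects, and report where it ended up.

    Returns (final_url, redirected_off_host). Needs network access.
    """
    with urlopen(url, timeout=timeout) as response:
        final_url = response.geturl()  # URL after any redirects were followed
    return final_url, leaves_host(url, final_url)
```

Calling check_redirect("http://www.cliftonfarmcottage.co.uk/") would return the final address together with a flag saying whether the request ended up on another host.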