robots.txt error message
« on: October 24, 2007, 01:39:20 PM »
Hi, I am sorry if this sounds silly, but I am fairly new and don't have a lot of experience with my website. Your service looks great and we may want to explore more options once we learn more. Anyway, we downloaded an XML sitemap from your site, then uploaded it to our main public folder (where our index file is). Then we gave Google the URL (what we think is the URL). I mean, the URL doesn't go to any page on our site, but it is the only one I could think of (our site's URL with /sitemap.xml after it). We don't have to make a page on our site that has that URL on it, do we?

Anyway, Google keeps giving us the robots.txt error message. I don't even know where that file is or if we have to do something. Here is the error message: "Network unreachable: robots.txt unreachable
We encountered an error while trying to access your Sitemap. Please ensure your Sitemap follows our guidelines and can be accessed at the location you provided and then resubmit."

Maybe I am not uploading to the right area, but I think I am: the Home Directory (where all my files are, including my index file).

Please help. I will give you my site if that helps. Thanks, and sorry for the newbie question.

Chris
Re: robots.txt error message
« Reply #1 on: October 24, 2007, 01:59:31 PM »
Sorry, I guess I should have given you the URL that was created:
[ External links are visible to forum administrators only ]

Thanks again for your patience

Chris
Re: robots.txt error message
« Reply #2 on: October 24, 2007, 09:46:25 PM »
Hi, well, I am slowly trying to fix my problem. After creating a robots.txt file and validating it with your service, I asked Google to index my site again, and the following error message came up after about 30 minutes:
"URL timeout: robots.txt timeout
We encountered an error while trying to access your Sitemap. Please ensure your Sitemap follows our guidelines and can be accessed at the location you provided and then resubmit."

Not sure what to do, since the sitemap is good and the robots.txt file is good. Any ideas?

Thanks
Re: robots.txt error message
« Reply #3 on: October 24, 2007, 11:34:04 PM »
Hello,

This is perhaps related to your host blocking (or limiting) Googlebot from accessing your pages.
Check this page for more details: http://www.homewithandrew.com/index.php/debugging-the-network-unreachable-robots-txt-unreachable-error/
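
To see from your own side whether the two files answer at all, a rough sketch along these lines might help. It assumes the sitemap sits in the site root as /sitemap.xml (as described in the first post), uses www.example.com as a placeholder for your domain, and the helper names are made up for illustration. It only shows whether the URLs respond from your own machine within a timeout. A host that blocks Googlebot by IP address would not be detected this way.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin


def root_file_urls(site_url):
    """Build the URLs of robots.txt and sitemap.xml in the site root."""
    # A leading slash makes urljoin replace any path in site_url.
    return [urljoin(site_url, "/robots.txt"), urljoin(site_url, "/sitemap.xml")]


def check_url(url, timeout=10):
    """Return a short status string for one URL (hypothetical helper)."""
    # Send a Googlebot-like User-Agent, since some hosts filter on this header.
    request = urllib.request.Request(
        url,
        headers={"User-Agent": "Googlebot/2.1 (+http://www.google.com/bot.html)"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return f"{url}: HTTP {response.status}"
    except urllib.error.HTTPError as err:
        # The server answered, but with an error code such as 403 or 404.
        return f"{url}: HTTP {err.code}"
    except OSError as err:
        # Covers DNS failures, refused connections, and timeouts.
        return f"{url}: unreachable ({err})"
```

Calling check_url on each URL returned by root_file_urls("http://www.example.com") gives one status line per file. An "unreachable" or timeout result here points at the host rather than at the sitemap or robots.txt contents.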
Re: robots.txt error message
« Reply #4 on: October 30, 2007, 06:53:20 AM »
Hi,

One of the most fundamental steps when optimizing a website is writing a robots.txt file. It tells spiders what is useful and public for sharing in the search engine indexes and what is not. Note that not all search spiders will follow the instructions you leave in the robots.txt file. In addition, a poorly done robots.txt file can stop the search spiders from crawling and indexing your website properly. In this article I will show you how to be sure everything will work correctly.
While there are many other SEOs who will tell you that a robots.txt file will not improve your rankings, I would disagree. For the robots to index your site properly, they need instructions on which folders or files not to crawl or index, as well as which ones you want to have indexed.
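
As a small illustration of what such instructions look like and how a well-behaved crawler reads them, here is a sketch using Python's standard urllib.robotparser module. The robots.txt content, the folder names, and the example.com URLs are made up for the example, and the allowed helper is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content; the folder names are placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/

Sitemap: http://www.example.com/sitemap.xml
"""


def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Pages outside the disallowed folders may be crawled.
print(allowed(ROBOTS_TXT, "Googlebot", "http://www.example.com/index.html"))  # True
# Anything under /private/ is off limits to every compliant crawler.
print(allowed(ROBOTS_TXT, "Googlebot", "http://www.example.com/private/notes.html"))  # False
```

Running the same check against your real robots.txt text is a quick way to confirm you are not accidentally blocking pages you want crawled.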

Another good reason to use a robots.txt file is that many of the search engines tell the public to use one on their websites. Below is a quote taken from Google:

Make use of the robots.txt file on your web server. This file tells crawlers which directories can or cannot be crawled. Make sure it's current for your site so that you don't accidentally block the Googlebot crawler.

Even though others feel this is of no use unless you are blocking content, keep this in mind: when a search engine goes out of its way (and this is the tightest-lipped search engine ever) to tell us to use something, it is usually to one's advantage to follow the little clues we are offered.

For example: [ External links are visible to forum administrators only ]

Also, if you read the stats file on your web hosting server, you will usually find the URL of your robots.txt among the requests. If a search bot asks for the robots.txt and does not find it on your server, the spider often just leaves.

I am including a screenshot from my own web hosting stats. As you can see below, the robots.txt file is #14 of the top URLs requested on my site. Keep in mind, no human visitor is looking at that file, yet it ranks better than a lot of the human-visited pages. Now if the bots want that file that much, it is something everyone should be using.
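
If your host gives you raw access logs instead of a stats page, a ranking like this can be approximated with a short script. The sketch below assumes the log is in Common Log Format. The sample lines are invented purely to show the input shape (they are not taken from my stats), and top_paths is a hypothetical helper name.

```python
from collections import Counter

# Invented log lines in Common Log Format, for illustration only.
SAMPLE_LOG = [
    '192.0.2.1 - - [24/Oct/2007:13:39:20 +0000] "GET /robots.txt HTTP/1.1" 200 68',
    '192.0.2.2 - - [24/Oct/2007:13:40:02 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '192.0.2.3 - - [24/Oct/2007:13:41:15 +0000] "GET /robots.txt HTTP/1.1" 200 68',
    '192.0.2.2 - - [24/Oct/2007:13:42:48 +0000] "GET /about.html HTTP/1.1" 200 2048',
    '192.0.2.4 - - [24/Oct/2007:13:43:30 +0000] "GET /robots.txt HTTP/1.1" 200 68',
    '192.0.2.5 - - [24/Oct/2007:13:44:11 +0000] "GET /index.html HTTP/1.1" 200 5120',
]


def top_paths(log_lines, limit=10):
    """Count requested paths and return the most requested ones first."""
    counts = Counter()
    for line in log_lines:
        # The request line is the first double-quoted field: "GET /path HTTP/1.1"
        parts = line.split('"')
        if len(parts) < 3:
            continue  # skip lines that do not look like Common Log Format
        request = parts[1].split()
        if len(request) >= 2:
            counts[request[1]] += 1
    return counts.most_common(limit)


for path, hits in top_paths(SAMPLE_LOG):
    print(hits, path)
```

With a real log, pass the lines of the opened file to top_paths instead of SAMPLE_LOG and see where /robots.txt lands in the list.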

thanks
jay