Broken XML being generated
« on: July 31, 2005, 11:46:38 PM »
This is the message I get when I open sitemap.xml in my root (public_html) folder:

Quote
The XML page cannot be displayed
Cannot view XML input using style sheet. Please correct the error and then click the Refresh button, or try again later.


--------------------------------------------------------------------------------

XML document must have a top level element. Error processing resource '[ External links are visible to forum administrators only ]'.


I do have it set to create sitemap.xml.gz, and that seems to be updated every day by a cron job that I set up. However, sitemap.xml itself doesn't update at all.

I run a large blogging site and am currently under 50,000 pages.

Quote
Request date:
31 July 2005, 09:15
Processing time:
36952.88s
Pages indexed:
47285
Sitemap files:
1
Pages size:
767.28Mb

The main thing is that I can't tell whether this is working correctly, and I hope it is just a settings issue.

How can I be sure that it IS or IS NOT working correctly?

Thanks in advance for your help,
Mark



 
Re: Broken XML being generated
« Reply #1 on: August 01, 2005, 10:48:58 AM »
Hi Mark,

When gzip compression is enabled, the output file is sitemap.xml.gz; that is why sitemap.xml is not updated.
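
If you want to confirm for yourself that the compressed file is fine, here is a rough sketch in Python (not part of the generator). It assumes the file is a single sitemap whose top-level element is `urlset`; the function name `check_sitemap` and the file path are only illustrative. It decompresses the file, parses it, and counts the `<url>` entries. An empty or truncated file, like the stale sitemap.xml that triggers the "must have a top level element" error, makes the parser raise an error instead of returning a count.

```python
import gzip
import xml.etree.ElementTree as ET

# The sitemap protocol allows at most 50,000 URLs in a single sitemap file.
MAX_URLS_PER_SITEMAP = 50_000


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}urlset'."""
    return tag.rsplit("}", 1)[-1]


def check_sitemap(path):
    """Return the number of <url> entries in a plain or gzipped sitemap.

    Raises xml.etree.ElementTree.ParseError if the file is empty or is
    not well-formed XML, and ValueError if the structure looks wrong.
    (Hypothetical helper, written for illustration only.)
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as fh:
        root = ET.parse(fh).getroot()

    if local_name(root.tag) != "urlset":
        raise ValueError("unexpected top-level element: %s" % root.tag)

    count = sum(1 for child in root if local_name(child.tag) == "url")
    if count > MAX_URLS_PER_SITEMAP:
        raise ValueError("%d URLs exceeds the per-file limit" % count)
    return count


# Example call (the path is an assumption, adjust it to your own host):
# print(check_sitemap("public_html/sitemap.xml.gz"))
```

If the number returned is close to the "Pages indexed" figure in the generator's report, the compressed sitemap is most likely being written correctly.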
Re: Broken XML being generated
« Reply #2 on: August 01, 2005, 04:31:31 PM »
Thanks for the quick response.

Will Google recognize that file and scan it? I wouldn't normally be asking so many questions but I can't find the answers in the forum or the current docs.

Re: Broken XML being generated
« Reply #3 on: August 01, 2005, 08:16:32 PM »
Good question,

When I first ran the on-line version, Google picked it up almost immediately and listed everything that was in the sitemap.xml. Since then, I've purchased and run the stand-alone version a number of times. The sitemap file is updating successfully each run but the Google results have not changed since the original.

Not sure why this is...I guess I'm assuming that when the program pings Google, the new xml file is read. Maybe not?

Peter
Re: Broken XML being generated
« Reply #4 on: August 01, 2005, 09:03:13 PM »
Hi,

Yes, Google accepts gzip-compressed sitemaps:
https://www.google.com/webmasters/sitemaps/docs/en/protocol.html

When the generator script sends a ping request to Google, Google reads the sitemap file from your host. It may then take a few hours before the sitemap is parsed. After that, Google crawls your site to index the pages listed in the sitemap, which may take longer (it depends on many things). So the generator makes sure all of your pages are collected and gives Google the full list, which makes your site easier to crawl.
Re: Broken XML being generated
« Reply #5 on: August 01, 2005, 09:21:20 PM »
Now I get it...thanks!

Peter