acnz

Google rejects urllist.txt and sitemap.xml
« on: July 22, 2007, 09:58:52 AM »
Hi there
When I crawl successfully and tell Google to get sitemap.xml it comes back ERRORS. So I tried by copying urllist.txt from sitemap/data/ to the root directory and uploading that - but Google again comes back ERRORS. Any advice?
For example [ External links are visible to forum administrators only ] to view.

Thanks

acnz

Re: Google rejects urllist.txt and sitemap.xml
« Reply #2 on: July 23, 2007, 05:49:50 AM »
Hi there -- um, I said:
When I crawl successfully and tell Google to get sitemap.xml it comes back ERRORS

The Validator also says errors. You said I should just submit it anyway to Google webmaster account, which I do - and Google also says errors.

It looks like Sitemap Generator is having trouble making the sitemap.xml?
Re: Google rejects urllist.txt and sitemap.xml
« Reply #3 on: July 23, 2007, 10:33:33 AM »
um, if you really want help, try telling us what errors you are getting.
Typing the EXACT error message you receive makes it much easier for the XML team to help resolve the problem. They cannot read your mind and they cannot predict some of the incredibly silly things people do when they don't read the instructions.

Most people who claim to have a problem with Google finding sitemap.xml have simply neglected to set the path for sitemap.xml when configuring the standalone generator.  The sitemap file ends up in the data directory and they try to tell Google the file is in the web root directory.  Google even gives you a hint where it expects to find the file.

sitemap.xml should be in the root of your web directory
If you read ALL of the Documentation page and followed the instructions, your sitemap.xml file will be created correctly.
You would then submit [ External links are visible to forum administrators only ] to Google

Pretty darn simple and repeated often in this forum.
Join the xml-sitemaps affiliate program and get paid for sales referrals -

acnz

Re: Google rejects urllist.txt and sitemap.xml
« Reply #4 on: July 23, 2007, 12:23:47 PM »
XML Parsing Error: no element found
Location: [ External links are visible to forum administrators only ]
Line Number 1, Column 1:
Re: Google rejects urllist.txt and sitemap.xml
« Reply #5 on: July 23, 2007, 06:25:25 PM »
Your sitemap.xml is over 5 MB uncompressed and has over 148,000 lines.

Quote
[ External links are visible to forum administrators only ]

You can provide multiple Sitemap files, but each Sitemap file that you provide must have no more than 50,000 URLs and must be no larger than 10MB (10,485,760) when uncompressed. These limits help to ensure that your web server does not get bogged down serving very large files.

If you want to list more than 50,000 URLs, you must create multiple Sitemap files. If you anticipate your Sitemap growing beyond 50,000 URLs or 10MB, you should consider creating multiple Sitemap files. If you do provide multiple Sitemaps, you can list them in a Sitemap index file. Sitemap index files may not list more than 1,000 Sitemaps.
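
A rough way to check a sitemap against the two limits quoted above before submitting it is sketched below in Python. This is only an illustration, not part of the generator or of Google's tools: `check_sitemap` and its constants are hypothetical names, and it matches `<url>` elements by local name so it does not assume a particular schema namespace.

```python
import xml.etree.ElementTree as ET

MAX_URLS = 50_000          # per-file URL limit quoted above
MAX_BYTES = 10_485_760     # 10MB uncompressed, as quoted above


def check_sitemap(xml_bytes):
    """Return a list of problems found; an empty list means within limits."""
    problems = []
    if len(xml_bytes) > MAX_BYTES:
        problems.append(f"file is {len(xml_bytes)} bytes, limit is {MAX_BYTES}")
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        # An empty or truncated file ends up here.
        problems.append(f"not well-formed XML: {exc}")
        return problems
    # Compare only the local tag name, ignoring any "{namespace}" prefix.
    url_count = sum(1 for el in root.iter() if el.tag.rsplit("}", 1)[-1] == "url")
    if url_count > MAX_URLS:
        problems.append(f"{url_count} URLs, limit is {MAX_URLS}")
    return problems
```

You would read the file in binary mode and pass the bytes in, e.g. `check_sitemap(open("sitemap.xml", "rb").read())`.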

Quote
https://www.xml-sitemaps.com/documentation-xml-sitemap-generator.html

14. Google doesn't support sitemap files with more than 50,000 pages. That's why the script supports "Sitemap Index" creation for big sites. So, it will create one sitemap index file and multiple sitemap files with 50 thousand pages each.
For instance, your website has about 140,000 pages. The XML sitemap generator will create these files:

    * "sitemap.xml" - sitemap index file that includes links to other files (filename depends on what you entered in the "Save sitemap to" field)
    * "sitemap1.xml" - sitemap file (URLs from 1 to 50,000)
    * "sitemap2.xml" - sitemap file (URLs from 50,001 to 100,000)
    * "sitemap3.xml" - sitemap file (URLs from 100,001 to 140,000)

Please make sure all of these files are writable if your website is large.
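
To make the file layout above concrete, here is a minimal Python sketch of the same splitting scheme. It is not the generator's own code (that script is PHP and its internals are not shown here); `build_sitemaps` and the helper names are made up for illustration, and the sitemaps.org 0.9 namespace is an assumption.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_FILE = 50_000
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"  # assumed namespace


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def render_urlset(urls):
    """Render one ordinary sitemap file listing the given page URLs."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return (f'<?xml version="1.0" encoding="UTF-8"?>\n'
            f'<urlset xmlns="{NS}">\n{entries}</urlset>\n')


def render_index(sitemap_urls):
    """Render the sitemap index that points at the part files."""
    entries = "".join(f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n"
                      for u in sitemap_urls)
    return (f'<?xml version="1.0" encoding="UTF-8"?>\n'
            f'<sitemapindex xmlns="{NS}">\n{entries}</sitemapindex>\n')


def build_sitemaps(urls, base_url):
    """Return {filename: xml_text}; sitemap.xml is the index file."""
    files = {}
    part_urls = []
    for n, part in enumerate(chunk(urls), start=1):
        name = f"sitemap{n}.xml"
        files[name] = render_urlset(part)
        part_urls.append(base_url.rstrip("/") + "/" + name)
    files["sitemap.xml"] = render_index(part_urls)
    return files
```

With 140,000 URLs this returns four entries: the index `sitemap.xml` plus `sitemap1.xml`, `sitemap2.xml` and `sitemap3.xml`, the last one holding the remaining 40,000 URLs.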

You should upload blank files for sitemap[1-5].xml and sitemap[1-5].html, then change the permissions on those files to match sitemap.xml to prevent any errors or additional problems.

Considering the size of your sitemap files, you should also turn on compression. The xml-sitemaps config option will compress ALL of the output files.  You don't really need ror.xml or the text sitemap anymore so don't select them for creation.  If your site already has a text menu that links all the pages, you can disable the html sitemap creation until you can get this first problem resolved.
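
If you want to see what compression buys you before turning the option on, a quick Python sketch follows. It only illustrates gzip on sitemap text; it says nothing about how the generator's own config option works internally, and `compress_sitemap` is a hypothetical helper name.

```python
import gzip


def compress_sitemap(xml_text):
    """Return the gzip-compressed bytes of a sitemap given as text."""
    return gzip.compress(xml_text.encode("utf-8"))
```

Sitemap XML is very repetitive, so the compressed bytes (typically saved as something like sitemap.xml.gz) are usually a small fraction of the original size.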

If your installation is not creating the "sitemap index" and subsequent 50,000 URL sitemap[1-5]?.xml files  then you should post again to see what you need to change in your configuration.

I am curious as to why you adjusted your configuration so index.php would be listed for each of your subdirectories. You really don't want to do that. It puts emphasis on index.php instead of the directory name. The bots know that index.php is in the directory; you don't need to direct them to it.
Quote
<loc>[ External links are visible to forum administrators only ]</loc>
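
For what it's worth, the cleanup being suggested (list the directory rather than its index.php) can be sketched in Python like this. `strip_index` is a made-up helper for illustration, not a generator setting, and it deliberately leaves URLs with a query string alone since those usually point at distinct pages.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_index(url, index_names=("index.php",)):
    """Rewrite .../dir/index.php to .../dir/ ; leave other URLs untouched."""
    parts = urlsplit(url)
    head, _, last = parts.path.rpartition("/")
    if last in index_names and not parts.query:
        parts = parts._replace(path=head + "/")
    return urlunsplit(parts)
```

So a URL like `http://example.com/forum/index.php` would be listed as `http://example.com/forum/`, while `http://example.com/index.php?id=3` is passed through unchanged.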

What files are in your xml-sitemap/data directory?
Re: Google rejects urllist.txt and sitemap.xml
« Reply #6 on: October 07, 2007, 05:29:11 PM »
Have you considered password protecting your setup?