
Generator gets stuck in a loop

Started by Robert_Readman, May 19, 2009, 10:42:41 AM


Robert_Readman

When a site uses relative URLs, the generator can get stuck in a loop: it ignores the <base href="http://www.domain.tld/"> tag, so it resolves links against the current page and finds pages that don't exist.

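For example:
You are at the URL "http://www.domain.tld/cat/sub/".
The page links to just "info.asp" and "contact.asp", which do not exist in /cat/sub/.

The crawler resolves these as http://www.domain.tld/cat/sub/info.asp and http://www.domain.tld/cat/sub/contact.asp, instead of http://www.domain.tld/info.asp and http://www.domain.tld/contact.asp.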
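
To make the difference concrete, here is a minimal Python sketch of the two resolutions. It is only an illustration using the standard library's urljoin, not the generator's actual code, and the variable names are my own.

```python
from urllib.parse import urljoin

# Values taken from the example above.
page_url = "http://www.domain.tld/cat/sub/"
base_href = "http://www.domain.tld/"
link = "info.asp"

# Ignoring <base href>: the link is resolved against the page's own URL.
without_base = urljoin(page_url, link)

# Honouring <base href>: the link is resolved against the declared base.
with_base = urljoin(base_href, link)

print(without_base)  # http://www.domain.tld/cat/sub/info.asp
print(with_base)     # http://www.domain.tld/info.asp
```

If every page under /cat/sub/ repeats the same relative links, the first form keeps producing new URLs at each level, which would explain the inflated page count.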

I have asked our developers to change "info.asp" to "/info.asp" which is how it should be done for good code, but they argue that <base href="http://www.domain.tld/"> is valid code.

How can I turn on base URL support in XML-Sitemaps?
Because of this issue, the crawler hits a site of 700-odd pages and finds 90,000+ URLs when running 4 levels deep.

Any help, please? Other than firing the developers!?!
Rob

XML-Sitemaps Support

Hello,

Please try installing Sitemap Generator v3.0, which has just been released. It includes a fix for <base href>.

Robert_Readman

Hi, I upgraded to v3, and this issue still occurs.

When the sitemap generator looks for the <base href> tag, does it matter if the tag is formatted with spaces around the equals sign, as below?

<base href = "[ External links are visible to forum administrators only ]">

Or does it have to be <base href="[ External links are visible to forum administrators only ]"> (note: no spaces)?

Thanks,

Rob.
Rob
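
For reference, HTML allows whitespace around the equals sign in an attribute, so both forms are valid markup. Whether the generator's own matching accepts the spaced form is not something this thread confirms. Below is a minimal Python sketch of a whitespace-tolerant lookup using the standard library's HTMLParser; the class and function names (BaseHrefFinder, find_base_href) are hypothetical and not part of XML-Sitemaps.

```python
from html.parser import HTMLParser


class BaseHrefFinder(HTMLParser):
    """Records the href of the first <base> tag seen (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.base_href = None

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; spaces around "=" are
        # handled by the parser itself.
        if tag == "base" and self.base_href is None:
            href = dict(attrs).get("href")
            if href:
                self.base_href = href


def find_base_href(html):
    """Return the first <base href> value in the markup, or None."""
    finder = BaseHrefFinder()
    finder.feed(html)
    return finder.base_href


spaced = '<head><base href = "http://www.domain.tld/"></head>'
compact = '<head><base href="http://www.domain.tld/"></head>'
print(find_base_href(spaced))
print(find_base_href(compact))
```

A crawler that extracts the tag with a strict pattern such as href=" would miss the spaced form, which is one possible reason an upgrade would not change the result.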