Generator gets stuck in a loop
« on: May 19, 2009, 10:42:41 AM »
When a site uses relative URLs, the generator can get stuck in a loop: it ignores the <base href="[ External links are visible to forum administrators only ]"> tag, so it resolves links against the wrong path and finds pages that don't exist.

For example:
You are at the URL "[ External links are visible to forum administrators only ]".
There are links to just "info.asp" and "contact.asp", which do not exist in /cat/sub/.

The crawler detects [ External links are visible to forum administrators only ] and contact.asp as URLs under /cat/sub/.
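To illustrate the difference, here is a minimal Python sketch of how a relative link resolves with and without the base tag. The URLs are hypothetical placeholders (the real ones are hidden above), and `resolve` is just an illustrative helper, not anything from the generator itself.

```python
from urllib.parse import urljoin

def resolve(page_url, href, base_href=None):
    """Resolve a link the way a browser would.

    If the page declares <base href>, relative links resolve against it;
    otherwise they resolve against the URL of the page itself.
    """
    # The base href may itself be relative, so resolve it against the page first.
    base = urljoin(page_url, base_href) if base_href else page_url
    return urljoin(base, href)

# Hypothetical page inside /cat/sub/ (placeholder domain).
page = "http://example.com/cat/sub/page.asp"

# Without base support the link lands in /cat/sub/, where it does not exist.
print(resolve(page, "info.asp"))

# With <base href="http://example.com/"> honoured, it lands at the site root.
print(resolve(page, "info.asp", "http://example.com/"))
```

A crawler that skips the base tag behaves like the first call, which is how a few hundred real pages can turn into tens of thousands of bogus URLs.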

I have asked our developers to change "info.asp" to "/info.asp", which in my view is how it should be done for good code, but they argue that <base href="[ External links are visible to forum administrators only ]"> is valid code.

How can I turn on base url support in XML-Sitemaps?
Because of this issue, the crawler finds 90,000+ URLs on a site of roughly 700 pages when running 4 levels deep.

Any help, please? Other than firing the developers!?!
Rob
Re: Generator gets stuck in a loop
« Reply #1 on: May 19, 2009, 01:00:15 PM »
Hello,

Please try installing Sitemap Generator v3.0, which has just been released; it includes a fix for <base href>.
Re: Generator gets stuck in a loop
« Reply #2 on: May 22, 2009, 10:01:00 AM »
Hi, I upgraded to v3, and this issue still occurs.

When the sitemap generator looks for the <base href>, does it matter if the tag is formatted as below?

<base href = "[ External links are visible to forum administrators only ]">

Or does it have to be <base href="[ External links are visible to forum administrators only ]"> (note: no spaces)?
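For reference, HTML allows whitespace around the equals sign in an attribute, so both forms are valid markup. Below is a rough Python sketch of a base-href lookup that accepts either form. It uses the standard library's `html.parser`; the class and function names and the example URL are my own placeholders, and I don't know how the generator actually does its matching.

```python
from html.parser import HTMLParser

class BaseHrefFinder(HTMLParser):
    """Records the href of the first <base> tag seen in a document."""

    def __init__(self):
        super().__init__()
        self.base_href = None

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; only the first <base> counts.
        if tag == "base" and self.base_href is None:
            href = dict(attrs).get("href")
            if href:
                self.base_href = href

def find_base_href(html):
    """Return the <base href> value from an HTML string, or None if absent."""
    finder = BaseHrefFinder()
    finder.feed(html)
    finder.close()
    return finder.base_href

# Both spellings should yield the same value (placeholder URL).
spaced = '<head><base href = "http://example.com/"></head>'
compact = '<head><base href="http://example.com/"></head>'
print(find_base_href(spaced))
print(find_base_href(compact))
```

If the generator matches the tag with a strict pattern that expects href=" with no spaces, the spaced form would be missed, which would explain the issue persisting after the upgrade.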

Thanks,

Rob.
Rob