Duplicate links
« on: March 01, 2007, 11:19:46 AM »
Hi, I am new to the standalone XML sitemap generator, so sorry if this is a gumby question.

It seems to be basically working OK, but I am having an issue where it lists certain links twice.

For instance, my homepage...

[ External links are visible to forum administrators only ]

If you click the 'sitemap' link at the bottom right of every page you will see there are 2 entries. This seems to happen anywhere I have an index.php. I have these index files inside each top-level folder, and the folders correspond to the main navigation sections: Benefits, Portfolio, Resources, Process, Hints & Tips, Contacts, Links.

What am I doing wrong here - and how can I fix it?

I notice that one link's URL is:
[ External links are visible to forum administrators only ]

and the other is:
[ External links are visible to forum administrators only ]
Re: Duplicate links
« Reply #2 on: March 02, 2007, 08:40:59 AM »

You can add this in the "Exclude URLs" option:

I could, but as most of the site structure relies on links present in each index file, that results in only a few 'sitewide links' appearing in the XML Sitemap.

This is no solution.

I also tried experimenting with listing an actual URL, but it looks like the Exclude URLs option only matches strings, not entire URLs.  In any case, this would have had the same effect.

It looks like the problem is that index.php and the folder URL are equivalent: both point to the same page.

I then thought about the robots.txt file, but as far as I know - it only lets you remove folders or files.

I needed more control, and the only thing I have found that will allow this to work is a command in the HTML head section.

<meta name="robots" content="noindex">

This works, in that it does not create an entry for the index file itself; however, it does still create entries from the links in the file.

If I had used the following meta, not even the links inside the file would have been picked up...

<meta name="robots" content="noindex,nofollow">
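For anyone wondering what the difference between those two tags means to a crawler, here is a rough Python sketch of how the directives are usually read. This is only my understanding of the general robots meta convention, not the sitemap generator's actual code; `RobotsMetaParser` and `robots_policy` are names I made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            # content is a comma-separated list such as "noindex,nofollow"
            for token in attr_map.get("content", "").split(","):
                token = token.strip().lower()
                if token:
                    self.directives.add(token)


def robots_policy(html):
    """Return (list_page, follow_links) for a page's HTML (hypothetical helper)."""
    parser = RobotsMetaParser()
    parser.feed(html)
    list_page = "noindex" not in parser.directives      # should the page itself be listed?
    follow_links = "nofollow" not in parser.directives  # should links on the page be crawled?
    return list_page, follow_links
```

With "noindex" alone the page is skipped but its links are still followed, which matches what I saw; adding "nofollow" would stop the links being followed as well.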

Hope this helps someone else.

Any moderators listening, it would be nice to have this feature inside the sitemap generator, but I have no idea if sitemap standards support it anyway.

Re: Duplicate links
« Reply #3 on: March 03, 2007, 01:41:19 AM »
Drat - I just realised.  This does not really fix the issue satisfactorily either.  Not only does it remove the duplicate, it removes the index.php files as well.  >:(

Having 2 links, 1 for index.php and another for the same location without the filename, looks pretty lame imho. There should be a way to specify that folders don't also get listed.

Please can someone come up with a suggestion, or confirm this as a bug?
Re: Duplicate links
« Reply #4 on: March 03, 2007, 02:30:27 AM »
To be more precise, here is what shows up in the generated HTML sitemap.
(Note that it is the same location / file)

Code:
<tr><td class="lpage"><a href="http://designforge.com.au/" title="Designforge: Homepage">Designforge: Homepage</a></td></tr>
<tr><td class="lpage"><a href="http://designforge.com.au/index.php" title="Designforge: Homepage">Designforge: Homepage</a></td></tr>
Re: Duplicate links
« Reply #5 on: March 03, 2007, 10:37:25 PM »

You can replace all "index.php" links on your site with folder-style links (for example, link to "/" instead of "/index.php").
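To illustrate the idea, here is a minimal Python sketch that rewrites an index.php URL to its folder form and collapses the resulting duplicates. It is not part of the sitemap generator; `to_folder_url`, `dedupe` and `INDEX_NAMES` are hypothetical names, and it assumes index.php is the only directory index file the server uses.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these filenames are served as the directory index on this server.
INDEX_NAMES = ("index.php",)


def to_folder_url(url):
    """Rewrite .../index.php to the equivalent folder URL ending in '/'."""
    parts = urlsplit(url)
    path = parts.path or "/"
    head, _, last = path.rpartition("/")
    if last in INDEX_NAMES:
        path = head + "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def dedupe(urls):
    """Normalize each URL and keep only the first occurrence, preserving order."""
    seen = set()
    result = []
    for url in urls:
        folder_url = to_folder_url(url)
        if folder_url not in seen:
            seen.add(folder_url)
            result.append(folder_url)
    return result


if __name__ == "__main__":
    print(dedupe([
        "http://designforge.com.au/",
        "http://designforge.com.au/index.php",
    ]))
```

If every link on the site uses the folder form, the crawler only ever sees one URL per page, so only one entry is generated.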