The crawler seems to ignore the Exclude URLS field
« on: March 07, 2007, 05:32:34 PM »
I have a hierarchical file list.

Concretely, I have a contacts.php file that lists the users on my website (every item in the list is a link to a profile.php page), and I want the crawler to include all the user profiles in the sitemap.xml.

So, I want a sitemap.xml like this:

contacts.php?page=0
profile.php?id=1
profile.php?id=2
profile.php?id=3
profile.php?id=4
contacts.php?page=1
profile.php?id=5
profile.php?id=6
profile.php?id=7
profile.php?id=8

etc...

OK, so I tell it to start crawling from that contacts.php file and configure it NOT to parse profile.php (that file contains links to other files I don't want to include). But the crawler seems to ignore that setting, because every link it finds inside profile.php still ends up in the sitemap.

Instead, I end up with a sitemap.xml like this:

contacts.php?page=0
profile.php?id=1
messages.php?id=1
photos.php?id=1
profile.php?id=2
messages.php?id=2
photos.php?id=2
profile.php?id=3
messages.php?id=3
photos.php?id=3
profile.php?id=4
messages.php?id=4
photos.php?id=4
contacts.php?page=1
profile.php?id=5

etc...


What am I doing wrong?
« Last Edit: March 07, 2007, 05:38:19 PM by contact8 »
Re: The crawler seems to ignore the Exclude URLS field
« Reply #1 on: March 07, 2007, 08:37:43 PM »
Hello,

Perhaps the links to messages.php?id=X are located not ONLY on profile.php but also on some other pages, and the sitemap generator finds them there.
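
To illustrate the point, here is a minimal sketch of a toy crawler in Python. It is not the sitemap generator's actual code: the `crawl` function, the `no_parse` and `exclude` parameters, and the link graph are all hypothetical, made up only to show how a "do not parse" rule on profile.php can still let messages.php into the sitemap when another page links to it.

```python
from collections import deque


def crawl(start, links, no_parse=(), exclude=()):
    """Breadth-first crawl over an in-memory link graph (toy model).

    links:    dict mapping a URL to the URLs it links to (hypothetical data)
    no_parse: substrings; matching pages are listed, but their links are not followed
    exclude:  substrings; matching pages are neither listed nor followed
    """
    seen = {start}
    queue = deque([start])
    sitemap = []
    while queue:
        url = queue.popleft()
        if any(pattern in url for pattern in exclude):
            continue  # dropped from the sitemap entirely
        sitemap.append(url)
        if any(pattern in url for pattern in no_parse):
            continue  # listed, but its outgoing links are ignored
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return sitemap


# Hypothetical site: messages.php is linked from profile.php AND from contacts.php.
links = {
    "contacts.php?page=0": ["profile.php?id=1", "messages.php?id=1"],
    "profile.php?id=1": ["messages.php?id=1", "photos.php?id=1"],
}

# Only "do not parse" on profile.php: messages.php still gets in via contacts.php.
leaky = crawl("contacts.php?page=0", links, no_parse=["profile.php"])

# Adding explicit exclusions keeps the unwanted pages out wherever they are linked.
clean = crawl(
    "contacts.php?page=0",
    links,
    no_parse=["profile.php"],
    exclude=["messages.php", "photos.php"],
)

print(leaky)
print(clean)
```

In this toy model, photos.php stays out because it is linked only from profile.php, while messages.php appears because contacts.php links to it as well. If your site looks like that, not parsing profile.php is not enough on its own; the unwanted pages would also need to be excluded by name.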