number of urls crawled
« on: December 23, 2009, 08:13:12 AM »
I use the sitemap generator on my site, which is a directory of 'business' listings organized into location categories.

Each business listing in each category gets its own 'Detailed' page, with a different URL.

This creates a potential canonicalization problem: pages with different URLs but the same content.

It affects the Detailed pages for listings that are in more than one category.

Example here for link ID123 (each page123.html has identical content)


A 'fix' was made to the site so that each listing has only one Detailed page URL, no matter which category the listing is in. I checked and could see that, for the example above, every category listing now used the same URL (******.com/Detailed/1st_category/page_123.html).

So I thought that would be the solution.

When I ran a sitemap 'crawl' using "XML-Sitemap Generator", it still picked up all the 'Detailed' pages, with the number of URLs crawled changing only slightly, from 5817 down to 4977.

Does sitemap generator crawl the site using the links on the pages?

If so, why is it picking up the duplicate URLs for the Detailed pages that are not linked to?
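
One way to check a crawl result for these duplicates is to group the crawled URLs by listing ID. The sketch below is only an illustration: it assumes the URLs follow the /Detailed/<category>/page_<id>.html pattern from the example above, and the example.com addresses and function names are made up for the demonstration.

```python
import re
from collections import defaultdict

# Assumed URL pattern, based on the example URL in the post:
#   /Detailed/<category>/page_<id>.html
LISTING_RE = re.compile(r"/Detailed/([^/]+)/page_(\d+)\.html$")

def group_by_listing(urls):
    """Map listing ID -> sorted list of crawled URLs that point at that listing."""
    groups = defaultdict(set)
    for url in urls:
        match = LISTING_RE.search(url)
        if match:
            groups[match.group(2)].add(url)
    return {listing_id: sorted(found) for listing_id, found in groups.items()}

def duplicates(urls):
    """Keep only the listings that were crawled under more than one URL."""
    return {
        listing_id: found
        for listing_id, found in group_by_listing(urls).items()
        if len(found) > 1
    }

# Hypothetical crawl output (example.com is a placeholder domain).
crawled = [
    "http://example.com/Detailed/1st_category/page_123.html",
    "http://example.com/Detailed/2nd_category/page_123.html",
    "http://example.com/Detailed/1st_category/page_456.html",
]
dupes = duplicates(crawled)
for listing_id, found in dupes.items():
    print(listing_id, found)
```

Any listing ID that shows up here is still reachable under more than one category URL, so something on the site must still link to the old form.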

Re: number of urls crawled
« Reply #1 on: December 23, 2009, 04:48:58 PM »

Yes, the sitemap generator "crawls" your site by finding all the links on your pages. Most likely there are still links to the other types of URLs somewhere on your site (otherwise the generator would not be able to find them), so please check that. If you need assistance, please PM me your generator URL/login and an example URL that you believe is not linked anywhere on your site.
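
To illustrate why a crawler can only find URLs that are linked from somewhere, here is a minimal sketch of a link-following crawl over a small in-memory link graph. It is not the generator's actual code; the page paths and the `crawl` function are hypothetical. It records which page first linked to each discovered URL, which is the information needed to track down a stray link.

```python
from collections import deque

def crawl(start, links):
    """Breadth-first crawl over a link graph.

    `links` maps each page URL to the list of URLs it links to.
    Returns {discovered_url: page_that_first_linked_to_it}.
    """
    found_from = {start: None}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in found_from:
                found_from[target] = page
                queue.append(target)
    return found_from

# Hypothetical site: an old index page still links to the duplicate URL.
site = {
    "/": ["/cat1/", "/cat2/"],
    "/cat1/": ["/Detailed/cat1/page_123.html"],
    "/cat2/": ["/Detailed/cat1/page_123.html", "/old_index.html"],
    "/old_index.html": ["/Detailed/cat2/page_123.html"],
    "/orphan.html": [],  # exists on the server, but nothing links to it
}
result = crawl("/", site)
for url, referrer in result.items():
    print(url, "<-", referrer)
```

In this made-up site the duplicate /Detailed/cat2/page_123.html is still discovered because /old_index.html links to it, while /orphan.html never appears because no crawled page links to it.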