Hello, I'm having a problem getting the sitemap generator to crawl all the pages of a website that I am responsible for. The site and section in question is: http://www.canadianblackbook.com/black-book-values

I ran the sitemap generator with the following settings:
Starting URL: http://www.canadianblackbook.com/black-book-values/
Sitemap URL: http://www.canadianblackbook.com/black-book-value.xml
Include Only URLs: black-book-values
Parse Only URLs: black-book-values

When I ran this, it created approximately 15,000 pages. The ultimate goal was to index the final page of the black book values. For example, I want to index the last page for a trade-in value, in which case the crawl path would be:
1.  http://www.canadianblackbook.com/black-book-values/   (you need to use the static links on the bottom)
2.  http://www.canadianblackbook.com/black-book-values/2009_Dodge
3.  http://www.canadianblackbook.com/black-book-values/2009_Dodge_Avenger
4.  http://www.canadianblackbook.com/black-book-values/2009_Dodge_Avenger/Trade-in-Value
5.  http://www.canadianblackbook.com/black-book-values/2009_Dodge_Avenger/Trade-in-Value/R/T_4D%20Sedan_48000 (need to index this page)

However, the problem is that the crawler doesn't get past step 4. I'm not sure whether the crawler dislikes the link "Or get the Trade-in Value of my 2009 Dodge Avenger with default selections" or whether the problem is something else (in theory, this is the link to the very last page, URL #5).

This is my first post, so let me know if you require additional information. Thanks in advance for any assistance.
Re: Unable to crawl all pages of a certain section using sitemap generator
« Reply #1 on: September 27, 2010, 08:51:01 PM »
Hello,

The page linked in point #5 has the following tag in its HTML source:
<meta name="robots" content="noindex, nofollow" >

That tag tells crawlers not to index the page (and not to follow its links), so you should remove it if you want the page included.
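
If you want to check other pages for the same tag before re-running the crawl, something like the following rough Python sketch might help. It uses only the standard library's html.parser; the class name RobotsMetaParser and the function robots_directives are made up for this example and are not part of the sitemap generator. It only looks at the meta tag in the HTML you pass in, not at robots.txt or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        # content is a comma-separated list such as "noindex, nofollow".
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# The tag quoted above, wrapped in a minimal page for illustration.
sample = '<html><head><meta name="robots" content="noindex, nofollow" ></head></html>'
found = robots_directives(sample)
print(sorted(found))  # ['nofollow', 'noindex']
if "noindex" in found:
    print("This page asks crawlers not to index it.")
```

To run it against a live page, you would fetch the HTML yourself (for example with urllib.request) and pass the decoded text to robots_directives.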
Oleg Ignatiuk
www.xml-sitemaps.com