Crawling non-existent pages
« on: December 23, 2012, 08:38:09 PM »
I've been watching the sitemap generator run on my dedicated Linux/Apache server for 3 days now. It seems to be crawling pages that do not exist. I suspect I've either misconfigured the tool or there is a bug. It has crawled 90,000 pages so far, but I know I only have around 45,000 pages. The problem seems to be with my URL structure where page numbers are involved, i.e. "../pageXX.htm"

I can PM you the URL if anyone is reading this. Thanks.

Re: Crawling non-existent pages
« Reply #1 on: December 24, 2012, 07:51:02 AM »
Please let me know some example URLs that get indexed. The best way to resolve this, though, is to make sure that any request for a non-existent page results in a 404 HTTP status code being returned by your server. You will then be able to find where the non-existent link is referenced from (in the broken links list report).
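
If it helps, here is a rough way to check whether the server really answers with 404 for page numbers that should not exist. This is only a sketch in Python using the standard library; the function names and the example.com URL are made up for illustration, and it assumes the server responds to HEAD requests. Note that urllib follows redirects, so a missing page that redirects to a normal page will show up as 200.

```python
import urllib.error
import urllib.request


def status_of(url, timeout=10):
    """Return the HTTP status code the server sends for url."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is what we want.
        return error.code


def not_returning_404(urls, fetch_status=status_of):
    """Return (url, status) pairs for URLs that did not answer with 404."""
    results = []
    for url in urls:
        status = fetch_status(url)
        if status != 404:
            results.append((url, status))
    return results


if __name__ == "__main__":
    # Hypothetical URL: replace with page numbers you know do not exist.
    suspects = ["http://www.example.com/section/page9999.htm"]
    for url, status in not_returning_404(suspects):
        print(status, url)
```

Any URL this prints is a page that should be missing but is not reported as missing, which would explain a crawler following page numbers indefinitely.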