
Generator Removing URLs - Not Crawling New Pages

Started by informer9, May 09, 2007, 12:26:52 PM

informer9

Hi,
I've been using my generator for some time now, and everything was working fine until May.

It has removed over 200 URLs and won't crawl them back...

It looks like this:

Pages scanned: 220 (5,181.4 Kb)
Pages left: 30 (+ 186 queued for the next depth level)

Then it drops all the pages queued for the next depth level and finishes with 246 pages crawled for the sitemap!

It's not crawling new entries in the directory.

It has removed the links to all the entries from the directory.
It is still crawling all the other pages (categories, subcategories, search results).

[ External links are visible to forum administrators only ]

I've got a few other sites on exactly the same script, and the generator is working fine there.
It is crawling all my links fine on [ External links are visible to forum administrators only ]

I've also tried another free generator from [ External links are visible to forum administrators only ]
and it is working fine, crawling all the pages.

I've updated the script to the newest version.
I've tried reinstalling the script.
I've tried using your free generator.

Still no joy.

Please help.

mike

t_a

Thomas A.


t_a

I am using these excludes:
rss.php?c=
submit.php?c=
?s=
authors?Page=
?Page=
?ArticleId=
addfav
addread
print
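
A note on the excludes above: if the generator treats each entry as a plain substring match (an assumption, not something confirmed in this thread), a short entry such as "print" would also drop any article whose URL merely contains that word. The sketch below shows how such a check could be tested offline. The function name and the sample URLs are hypothetical and for illustration only.

```python
# Exclusion entries from the post above, treated as plain substrings (assumption).
EXCLUDES = [
    "rss.php?c=", "submit.php?c=", "?s=", "authors?Page=", "?Page=",
    "?ArticleId=", "addfav", "addread", "print",
]

def matching_excludes(url, excludes=EXCLUDES):
    """Return every exclude entry that occurs anywhere in the URL."""
    return [pattern for pattern in excludes if pattern in url]

# Hypothetical URLs, not taken from the site discussed here.
samples = [
    "http://example.com/articles/how-to-print-labels.html",
    "http://example.com/article.php?ArticleId=42",
    "http://example.com/articles/garden-tips.html",
]
for url in samples:
    hits = matching_excludes(url)
    print(url, "->", hits if hits else "kept")
```

Running a few of the removed article URLs through a check like this would show whether one of the excludes is catching them unintentionally.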

As for depth level, I have everything set to "0" (unlimited) there.

Here is an example of a removed page (thousands of articles have been removed during the last week or so):
[ External links are visible to forum administrators only ]

Any suggestions?
Thomas A.

t_a

Here is a page with some 404s that should not be there: [ External links are visible to forum administrators only ]

The strange thing here is that some articles are added to the sitemap.

Example of added link:
[ External links are visible to forum administrators only ]

Example of link returning 404:
[ External links are visible to forum administrators only ]
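
One way to confirm what the server actually returns for a given link is to request it directly and read the HTTP status code. The following is a minimal sketch using Python's standard library. The function name, the User-Agent string, and the injectable `opener` parameter are assumptions made for illustration, and nothing here reflects how the generator itself fetches pages.

```python
import urllib.error
import urllib.request

def http_status(url, opener=urllib.request.urlopen,
                user_agent="Mozilla/5.0 (status check)"):
    """Return the HTTP status code the server sends for `url`.

    `opener` defaults to urllib's urlopen; it can be swapped for a stub
    so the function can be exercised without network access.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with opener(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as HTTPError; the code is what matters.
        return error.code
```

Calling `http_status(...)` on a link that shows as 404 in the crawl, once from a normal connection and once from the server that runs the generator, would show whether the 404 depends on where the request comes from.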

I just can't seem to figure this out.

I welcome any help I can get here.
Thomas A.

XML-Sitemaps Support

Hello,

Perhaps your site returned an error for some URLs because of the crawling intensity. Try setting a delay between requests in the sitemap generator configuration.
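
To illustrate the idea of pacing requests, here is a rough sketch of a crawl loop that pauses after every few requests. It is not the generator's code: the function name and the parameters `delay_seconds` and `every` are hypothetical, and the fetcher is a stub so the example runs without contacting any site.

```python
import time

def crawl_with_delay(urls, fetch, delay_seconds=1.0, every=1, sleep=time.sleep):
    """Fetch each URL in order, pausing after every `every` requests.

    `fetch` is any callable that takes a URL and returns a result; it is
    injected so the pacing logic can be tried without touching a real site.
    """
    urls = list(urls)
    results = {}
    for count, url in enumerate(urls, start=1):
        results[url] = fetch(url)
        # Pause only between batches, not after the final request.
        if count % every == 0 and count < len(urls):
            sleep(delay_seconds)
    return results

# Dry run with a stub fetcher and a recording sleep (no network, no waiting).
pauses = []
pages = [f"http://example.com/page{i}" for i in range(25)]  # hypothetical URLs
crawl_with_delay(pages, fetch=lambda url: 200, delay_seconds=3,
                 every=10, sleep=pauses.append)
print(pauses)  # [3, 3]
```

With 25 requests and a pause after every 10, the loop waits twice, which spreads the load on the server at the cost of a longer crawl.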

t_a

I tried a 3-second delay after every 10 requests, but that did not help. Could it be something else?

Do you want login information?
Thomas A.


informer9

Hi,
Sorry for the delay.

I don't have any limitations at all. I use 0 so that all web pages are crawled.

I have tried the suggested 1-second pause between every request...

Still no joy! It has even removed my last entry in the directory (I lost 5 URLs).

All the other pages are being crawled fine.

Admin, please help... any other ideas?

regards

mike

XML-Sitemaps Support

Hello,

Please send me a private message with your generator URL and an example URL that is not included in the sitemap.

t_a

I sent you a PM. I am missing approx. 5000 pages in my sitemap.

Thanks.
Thomas A.