Any possibility of making this?
« on: January 21, 2009, 01:28:50 PM »
I have one big question. Since I have several sites with more than a million pages each (up to 3 million), I would like to somehow gather or analyze the site structure before I even start a crawl, so I could exclude things, etc.

Or perhaps an option like this:

 "1 out of every ---", taking an integer from 1 to 100

Example: "10"
So with 1 out of 10, the crawler
scans 1
skips 9

That would cut 250k pages down to 25k pages to scan.

If you have any ideas, I would love to hear them.
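
For illustration, the "1 out of every N" idea could be sketched roughly like this in Python. This is only a sketch of the sampling logic, not a feature of any existing crawler; the function name sample_every_nth and the example URLs are made up for the example.

```python
def sample_every_nth(urls, n):
    """Yield 1 URL out of every n: scan the first of each group, skip the other n - 1."""
    # The 1..100 range mirrors the option proposed in the post.
    if not 1 <= n <= 100:
        raise ValueError("n must be an integer from 1 to 100")
    for index, url in enumerate(urls):
        if index % n == 0:
            yield url


# Hypothetical URL list standing in for the pages a crawler would discover.
urls = [f"http://example.com/page{i}" for i in range(250_000)]
sampled = list(sample_every_nth(urls, 10))
print(len(sampled))  # 25000
```

Note that this only samples a list of URLs that is already known. Inside a real crawl, skipped pages would also not have their links followed, so the reduction could be larger than a plain factor of N.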
Re: Any possibility of making this?
« Reply #1 on: January 22, 2009, 09:59:30 PM »
Hello,

You can set the "Maximum number of URLs" option to run an initial limited scan, and then check the generated sitemaps to see the list of URLs.
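
Once a limited scan has produced a sitemap, one way to get a rough picture of the site structure is to count URLs per top-level directory. The Python sketch below assumes the generated sitemap uses the standard sitemaps.org XML format; the function name count_top_level_sections and the sample data are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org XML sitemaps (assumed here).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def count_top_level_sections(sitemap_xml):
    """Count how many sitemap URLs fall under each top-level directory."""
    root = ET.fromstring(sitemap_xml)
    counts = Counter()
    for loc in root.iter(SITEMAP_NS + "loc"):
        path = urlparse((loc.text or "").strip()).path
        segments = [s for s in path.split("/") if s]
        # Pages directly under the site root are grouped as "/".
        section = "/" + segments[0] + "/" if len(segments) > 1 else "/"
        counts[section] += 1
    return counts


# Made-up sample standing in for a sitemap from a limited scan.
example = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/</loc></url>
  <url><loc>http://example.com/forum/topic1.html</loc></url>
  <url><loc>http://example.com/forum/topic2.html</loc></url>
  <url><loc>http://example.com/shop/item1.html</loc></url>
</urlset>"""

for section, count in count_top_level_sections(example).most_common():
    print(section, count)
```

Directories that dominate the counts are natural candidates for exclusion before the full crawl.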