What is Links Depth?
« on: February 22, 2008, 04:15:23 PM »
What does Links Depth mean and how does it work?
National Outdoors Website
Re: What is Links Depth?
« Reply #1 on: February 23, 2008, 12:31:29 AM »
Hello,

"Links depth" is a number of "clicks" required to reach specific page starting from homepage. It allows to limit sitemap generator crawler to certain level (in many cases most important pages are "closer" to homepage).
Re: What is Links Depth?
« Reply #2 on: November 21, 2008, 07:54:01 PM »
I just bought your product yesterday. It's running on one of my sites now. It's been going for about 9 hours and has added 76,000 pages to the sitemap.

It's at link depth 10 (and depth 11 is queued). This is a little confusing, as I know for a fact there is no page on this site that requires 10 clicks to reach. I doubt any page takes more than 4-5 clicks to get to from the homepage.

Now there is probably some kind of loop going on, and I'm sure you have deduplication code in place to prevent duplicate URLs from being added to the finished sitemap. A couple of questions:

1. If I stop the scan now, I don't think it will generate the sitemap from what it has already scanned. Is there a way to do this, i.e. stop the scan and force it to generate the sitemap now?

2. Even though it's at link depth 10, I'm carefully watching the "Pages added to sitemap" stat and it is still growing. Slowly, but pages are still being added. I'm not quite sure what to make of this when taking link depth into consideration.

Here's the current stats:

Links depth: 10
Pages added to sitemap: 75973
Pages scanned: 108460 (2,935,449.2 KB)
Pages left: 58232 (+ 26711 queued for the next depth level)
Time passed: 540:21
Time left: 290:06
Memory usage: 99,276.4 Kb

Any comments or suggestions?


CT
Re: What is Links Depth?
« Reply #3 on: November 22, 2008, 04:23:43 PM »
Hello,

yes, the sitemap generator checks for duplicate links and doesn't include the same URL twice.
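As a rough illustration (not the generator's actual code), duplicate detection typically means normalizing each URL and keeping a "seen" set, so the same page reached through different links is only added once:

from urllib.parse import urldefrag

def canonicalize(url):
    # Hypothetical normalization: drop the #fragment and any trailing slash
    # so trivially different links map to the same URL.
    url, _fragment = urldefrag(url)
    return url.rstrip('/')

seen = set()

def add_if_new(url, sitemap):
    key = canonicalize(url)
    if key in seen:
        return False          # already in the sitemap, skip it
    seen.add(key)
    sitemap.append(url)
    return True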

1. You can stop the generator, then set the "Maximum URLs" option low enough that it doesn't find any further pages, and resume generation so it finishes with what it has already scanned.

2. I'd recommend setting the "Maximum depth" option to "10" at this point, so that the pages already found are included and further crawling is prevented (a rough sketch of how such limits behave follows below).
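Roughly speaking (illustrative Python only, not the generator's code), depth and URL limits just stop new pages from being queued, while everything already discovered still goes into the sitemap:

from collections import deque

def crawl_with_limits(homepage, get_links, max_depth=10, max_urls=80000):
    depths = {homepage: 0}
    queue = deque([homepage])
    while queue:
        url = queue.popleft()
        if depths[url] >= max_depth:
            continue                  # depth limit: keep the page, but don't follow its links
        for link in get_links(url):
            if len(depths) >= max_urls:
                return depths         # URL limit reached: stop discovering new pages
            if link not in depths:
                depths[link] = depths[url] + 1
                queue.append(link)
    return depths                     # pages found so far are still included in the sitemap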