This is weird!
« on: July 26, 2010, 06:46:45 PM »
OK, I am trying to use the standalone version to create the sitemap. My site is relatively simple: about 100 pages, a root and two directories, "feed" and "nanometrics". Yet this software has been crawling the site for the past 50 minutes, and the script is producing really weird output:
---------
Links depth: 339
Current page: kensington.php/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nan
ometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics
/nanometrics/nanometrics/nanometrics/nanometrics/nanometrics/robot_quote.php?robot=031409-29,_Seal_Furon
Pages added to sitemap: 16413
Pages scanned: 16420 (243,647.0 Kb)
Pages left: 23 (+ 56 queued for the next depth level)
Time passed: 52:34
Time left: 0:04
Memory usage: -
Resuming the last session (last updated: 2010-07-26 10:17:25)
--------
Pages scanned: 16420 ???
Links depth: 339 ???
And what is with "Current page"?

What could I possibly have done wrong this time?

Please help. Thanks.
Re: This is weird!
« Reply #1 on: July 26, 2010, 10:20:08 PM »
Hello,

It looks like an endless loop of links.
You should add this to the "Exclude URLs" setting:
Code:
nanometrics/nanometrics
and set "Maximum depth" to 100.
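To illustrate how such a loop can arise, here is a minimal Python sketch. It assumes (this is not confirmed anywhere in the thread) that the page contains a relative link and is served under a path-info style URL such as kensington.php/nanometrics/...; the host example.com and the link value are hypothetical. Each time the crawler resolves the relative link against the current URL, one more "nanometrics/" segment is added:

```python
from urllib.parse import urljoin

# Hypothetical starting URL and relative link (assumptions for illustration only).
url = "http://example.com/kensington.php/nanometrics/robot_quote.php"
relative_link = "nanometrics/robot_quote.php"

# Resolve the same relative link against each newly discovered URL,
# the way a crawler would when following links page after page.
chain = [url]
for _ in range(3):
    url = urljoin(url, relative_link)
    chain.append(url)

for step in chain:
    print(step)
```

If this is the cause on your site, root-relative links (starting with "/") would also stop the loop, since they do not depend on the current path.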
Re: This is weird!
« Reply #2 on: July 27, 2010, 08:07:00 PM »
But I don't want to exclude my "nanometrics" folder.
Re: This is weird!
« Reply #3 on: July 28, 2010, 01:43:06 PM »
The folder itself will not be excluded; only the looping links that contain the doubled "nanometrics/nanometrics" will be removed.
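A rough Python sketch of that behaviour, assuming the exclusion is a plain substring test on the URL (the generator's real matching code is not shown in this thread, and the function name and sample URLs are made up for illustration):

```python
# Assumed exclusion entry, as suggested in Reply #1.
EXCLUDE_FRAGMENT = "nanometrics/nanometrics"

def is_excluded(url: str, fragment: str = EXCLUDE_FRAGMENT) -> bool:
    """Return True when the URL contains the excluded fragment."""
    return fragment in url

# Hypothetical sample URLs: two normal pages and one looping URL.
urls = [
    "nanometrics/index.php",
    "nanometrics/robot_quote.php?robot=031409-29,_Seal_Furon",
    "kensington.php/nanometrics/nanometrics/robot_quote.php?robot=031409-29,_Seal_Furon",
]

kept = [u for u in urls if not is_excluded(u)]
print(kept)
```

Pages that sit directly in the "nanometrics" folder contain the folder name only once, so they stay in the sitemap.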
Re: This is weird!
« Reply #4 on: August 02, 2010, 06:27:31 PM »
I have done what you suggested, and it looks better now, BUT I find some weird URLs listed, like this one:

[ External links are visible to forum administrators only ]

Also, is it possible that some URLs are dropped because I set Max depth to 100?
« Last Edit: August 02, 2010, 06:36:34 PM by nermi »
Re: This is weird!
« Reply #5 on: August 03, 2010, 08:46:19 PM »
You can exclude those links with:
Code:
\.php/.*\.php
If some pages on your site take a lot of clicks to reach from the homepage (more than 100), then the depth level limit could affect them.
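To see what the \.php/.*\.php pattern catches, here is a small check using Python's re module. This is only a sketch: the generator itself is a PHP script and its exact regex handling is assumed, not confirmed, to behave the same way for this simple pattern, and the sample URLs are hypothetical.

```python
import re

# Pattern from the reply above: a ".php" followed by "/" and, later, another ".php".
LOOP_PATTERN = re.compile(r"\.php/.*\.php")

def looks_like_nested_php(url: str) -> bool:
    """Return True when one .php script appears inside the path of another."""
    return LOOP_PATTERN.search(url) is not None

# Hypothetical sample URLs.
samples = {
    "kensington.php/nanometrics/robot_quote.php?robot=031409-29,_Seal_Furon": None,
    "nanometrics/robot_quote.php?robot=031409-29,_Seal_Furon": None,
    "kensington.php": None,
}
for url in samples:
    samples[url] = looks_like_nested_php(url)
    print(url, "->", "excluded" if samples[url] else "kept")
```

Only URLs where a second .php file follows a .php/ segment match, so ordinary pages such as kensington.php or anything directly under nanometrics/ are left alone.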