command line output
« on: November 27, 2007, 02:32:56 AM »
I notice that the app prints some output when run in command line mode.

Example:
218680 | 64728 | 149,486.4 | 1810:00 | 535:45 | 3 | 304,547.1 Kb | 202939 | 4392 | 13
218700 | 64708 | 149,486.4 | 1810:09 | 535:34 | 3 | 304,559.3 Kb | 202959 | 4392 | 12

Can you let me know what these figures mean?

Our sitemap generation is still running, and I have no idea whether it's close to completing.

Thanks.


Re: command line output
« Reply #1 on: November 27, 2007, 03:14:53 AM »
Hello,

The stats mean the following:
URLs scanned | URLs left (current depth level only) | downloaded bytes | time spent | estimated time left | depth level | memory usage | URLs queued | memory usage change
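
For anyone who wants to log or chart these figures, here is a minimal Python sketch that splits one status line into named fields. The names `CrawlStatus` and `parse_status_line` are made up for illustration and are not part of the generator. The example lines above have ten columns while the list names nine, so only the first seven columns are labeled (their order matches the list); the trailing columns are kept as raw text rather than guessed at. The unit of the downloaded figure is not stated in this thread either, so it is stored as a plain number.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CrawlStatus:
    """One parsed status line (hypothetical helper, not part of the app)."""
    urls_scanned: int
    urls_left_at_depth: int   # counts the current depth level only
    downloaded: float         # unit is not stated in the thread
    time_spent: str           # kept as printed, e.g. "1810:00"
    time_left_estimate: str   # kept as printed, e.g. "535:45"
    depth_level: int
    memory_usage_kb: float
    extra: List[str] = field(default_factory=list)  # unlabeled trailing columns


def _number(text: str) -> float:
    """Turn '304,547.1 Kb' or '149,486.4' into a float."""
    return float(text.replace(",", "").split()[0])


def parse_status_line(line: str) -> CrawlStatus:
    cols = [c.strip() for c in line.split("|")]
    if len(cols) < 7:
        raise ValueError("expected at least 7 columns, got %d" % len(cols))
    return CrawlStatus(
        urls_scanned=int(cols[0]),
        urls_left_at_depth=int(cols[1]),
        downloaded=_number(cols[2]),
        time_spent=cols[3],
        time_left_estimate=cols[4],
        depth_level=int(cols[5]),
        memory_usage_kb=_number(cols[6]),
        extra=cols[7:],
    )


status = parse_status_line(
    "218680 | 64728 | 149,486.4 | 1810:00 | 535:45 | 3 | 304,547.1 Kb | 202939 | 4392 | 13"
)
print(status.urls_scanned, status.urls_left_at_depth, status.depth_level)
```

Parsing two consecutive lines and comparing `urls_scanned` gives a rough crawl rate, but since "URLs left" covers the current depth level only, it cannot be used as a total remaining count.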


Re: command line output
« Reply #2 on: December 07, 2007, 03:00:45 AM »
Hi,

How can I get a sense of progress in terms of how many depths and actual URLs are left?
The count for depth 3 dropped to zero, only for a depth 4 to appear with 86,000 more URLs!

My URLs scanned count is now at 300,000 pages and still climbing.
This also mystifies me, as I know our site has 300k pages or fewer.
Midway through the run I excluded the pages that should not be indexed. Will those pages still be included?

Please help:
1. How can I get a sense of progress in terms of how many depths and actual URLs are left?
2. How can I investigate this, now that the URLs scanned count is well above the number of URLs I know our site has?
3. How can I tell which URL is currently being crawled?
4. What is a depth?

Thanks in advance.



Re: command line output
« Reply #3 on: December 07, 2007, 10:19:20 PM »
Hello,

Unfortunately, there is no way to find out how many pages there are on the site until it has been crawled. When you open the homepage you only see the pages linked from it; crawling those pages reveals level 2, and so on.

You can, however, limit the maximum depth level in the configuration (to 3, say) so that most of your pages get indexed faster.
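
To illustrate what a depth level is and why the total cannot be known in advance, here is a rough Python sketch of a level-by-level (breadth-first) crawl over a small in-memory link map. It is not the generator's actual code: the function name `crawl_by_level`, the `max_depth` parameter, and the sample site are all invented for the example, and counting the homepage as level 1 is an assumption.

```python
from typing import Dict, List, Optional


def crawl_by_level(
    links: Dict[str, List[str]],
    start: str,
    max_depth: Optional[int] = None,
) -> List[List[str]]:
    """Return the pages found at each depth level, starting from `start`.

    `links` maps a page to the pages it links to. A page belongs to the
    first level at which some already-crawled page links to it.
    """
    seen = {start}
    levels: List[List[str]] = []
    current = [start]
    while current:
        levels.append(current)
        # Stop before following links out of the deepest allowed level.
        if max_depth is not None and len(levels) >= max_depth:
            break
        next_level: List[str] = []
        for page in current:
            for target in links.get(page, []):
                if target not in seen:
                    seen.add(target)
                    next_level.append(target)
        # The size of the next level is only known once every page on the
        # current level has been read.
        current = next_level
    return levels


# Hypothetical site structure, for illustration only.
site = {
    "/": ["/a", "/b"],
    "/a": ["/a1", "/b"],
    "/b": ["/b1"],
    "/a1": ["/deep"],
}

for depth, pages in enumerate(crawl_by_level(site, "/"), start=1):
    print(depth, pages)

print(len(crawl_by_level(site, "/", max_depth=3)))
```

In this sketch the pages on one level can run out while the next level is still waiting, which matches the "URLs left" figure reaching zero at depth 3 just before depth 4 appears. Passing `max_depth=3` stops the crawl after the third level, so `/deep` is never visited.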