"Analyze" feature - don't see all crawled pages
« on: May 09, 2008, 08:49:46 PM »
From documentation:

"Analyze" feature allows you to easily investigate the site structure. It represents the tree-like list of directories of your website, indicating the number of pages in every folder. You can expand/collapse the tree parts by clicking the • signs.

    When I try to use this feature, there's an "X" (main folder) and numbers  "- 382 (473)". Does it mean 382 pages out of 473 have been indexed? Hope not.

    When I expand this folder by clicking on the "X", I see only a few pages from the main folder, and I found out that those all end with a trailing slash (/), which made me a little mad at myself. I try to validate my pages, but sometimes I forget to add the trailing slash to the end of my URLs. I know it's not required for validation, but it is usually good practice. The fact is that in my case the "Analyze" feature found only the URLs with a trailing slash. I wonder why?

    Further, it DID find one of the subdirectories that I wanted indexed, and it did put the "X" sign in front of it, but it is showing only 5 out of 11 pages there. Guess what: once again, those are the URLs with a slash at the end.

    Then there is the thing that puzzles me the most. It is showing one subdirectory without the "X", and I cannot expand it to see what is indexed. It shows "-9", which IS how many pages are supposed to be there, but why doesn't it show it as a folder? Looking at the tree structure on the Analyze page, it looks as if this folder is inside another subfolder, and it IS NOT. It is a subfolder of the main folder ([ External links are visible to forum administrators only ]).
    So it is all confusing, and although I can find most of my pages so far in the XML file, I wonder how robots like this actually SEE my site and its structure.

    And one more question: does it make a difference if the starting page is something like:
    [ External links are visible to forum administrators only ] OR
    [ External links are visible to forum administrators only ]
    ?

    Oleg, if you need me to PM you my URL, please let me know. Other than that, this is one wonderful program. Very pleased to have it. Like everything else, it will get even better with time.

    Thanks.

    Nermi 
Re: "Analyze" feature - don't see all crawled pages
« Reply #1 on: May 10, 2008, 07:14:26 PM »
Hello,

The "expand tree" links (•) are only displayed for folders that contain other subfolders inside.

Also, the tree does NOT include regular page URLs, only *folders*, i.e. this is not a sitemap, just a "site structure" to analyze.

The numbers show how many pages are found in that particular folder and the total number of pages in that folder AND all of its subfolders.
I hope that makes sense :)
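
To make the two numbers concrete, here is a minimal Python sketch of that kind of counting. It is only an illustration under assumptions, not the program's actual code: the function name `folder_counts` and the sample paths are made up, and every path is assumed to start with "/".

```python
from collections import Counter

def folder_counts(paths):
    """Return {folder: (own_pages, total_pages)} for a list of URL paths.

    Hypothetical helper for illustration; assumes each path starts with "/".
    """
    own = Counter()
    for path in paths:
        # Everything up to and including the last "/" is treated as the folder.
        folder = path[: path.rfind("/") + 1]
        own[folder] += 1

    total = Counter()
    for folder, n in own.items():
        parts = [p for p in folder.split("/") if p]
        # Credit this folder's pages to the folder itself and to every ancestor.
        for depth in range(len(parts) + 1):
            ancestor = "/" + "".join(p + "/" for p in parts[:depth])
            total[ancestor] += n

    return {folder: (own[folder], total[folder]) for folder in total}


# Made-up sample paths, not taken from any real site.
sample = ["/", "/index.html", "/about", "/blog/",
          "/blog/post1.html", "/blog/2008/may.html"]
for folder, (own_pages, total_pages) in sorted(folder_counts(sample).items()):
    print(f"{folder} - {own_pages} ({total_pages})")
```

With these sample paths the root folder has 3 pages of its own and 6 in total, because the pages under /blog/ and /blog/2008/ are added to its total.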
Re: "Analyze" feature - don't see all crawled pages
« Reply #2 on: May 12, 2008, 01:57:34 PM »
If that is the case, then the program has found "folders" that don't exist. It treats some of my regular URLs as folders: the ones with a trailing slash (/). ???


Nermi
Re: "Analyze" feature - don't see all crawled pages
« Reply #3 on: May 12, 2008, 09:25:32 PM »
Hello,

those entries are not folders on your server, but "folder-like" parts of your URLs.
Re: "Analyze" feature - don't see all crawled pages
« Reply #4 on: May 13, 2008, 12:26:26 AM »
Do you mind explaining a "folder-like" thing a little bit more?
Regular URL that looks like a folder? Why?

Thanks.

Nermi
Re: "Analyze" feature - don't see all crawled pages
« Reply #5 on: May 14, 2008, 12:52:19 AM »
Yes, a URL that looks like a folder.
Again, the Analyze page is meant to show the "structure" of the site. It does NOT reflect real URLs, just the "tree" of subfolders extracted from the URLs.
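
As a rough illustration of what "extracted from URLs" can mean, the Python sketch below takes everything in the URL path up to the last slash as the "folder". This is an assumption about the general approach, not the program's real code; `folder_part` and the example.com URLs are hypothetical.

```python
from urllib.parse import urlsplit

def folder_part(url):
    """Return the "folder-like" part of a URL path (hypothetical helper)."""
    # An empty path (e.g. "http://example.com") is treated as the root "/".
    path = urlsplit(url).path or "/"
    # Keep everything up to and including the last "/".
    return path[: path.rfind("/") + 1]


# Hypothetical URLs for illustration only.
print(folder_part("http://example.com/products/"))           # /products/
print(folder_part("http://example.com/products"))            # /
print(folder_part("http://example.com/products/item.html"))  # /products/
```

Under this rule a URL ending in a slash becomes its own entry in the tree, while the same URL without the slash is counted as a page in the parent folder, whether or not a real directory exists on the server.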
Re: "Analyze" feature - don't see all crawled pages
« Reply #6 on: May 14, 2008, 01:19:46 PM »
10 years in Web programming and I have never heard of regular URLs that "look like a folder".

Nermi
Re: "Analyze" feature - don't see all crawled pages
« Reply #7 on: May 14, 2008, 11:11:11 PM »
I'm not sure I understand what exactly the problem is in this case. Can you describe it? That page is not visible to your website visitors; it is provided to the site administrator as an illustration of the website structure. You should not even need to click on those links.