Number of pages indexed are less than crawled
« on: January 24, 2012, 08:53:43 AM »
Could you please check the situation with my site - [ External links are visible to forum administrators only ]. The number of pages indexed is 60360 but the number crawled is 117897. Why is the difference so big? Previously the difference was smaller, when I had 4 sitemap files (including the sitemap index); now I have only 3 files and the difference has grown. Maybe I made a mistake in the configuration? Thanks in advance!
Re: Number of pages indexed are less than crawled
« Reply #1 on: January 24, 2012, 09:37:26 PM »
Hello,

could you please PM me your generator URL and an example URL that is not included in sitemap and how it can be reached starting from homepage?
Re: Number of pages indexed are less than crawled
« Reply #2 on: January 25, 2012, 10:01:14 AM »
Hello! Here are the details:
1. Generator URL is [ External links are visible to forum administrators only ]
2. Unfortunately I can't find all the URLs that were crawled. Most of them are in the headings of our directory.

But for example, the page of our customer - [ External links are visible to forum administrators only ] is not in [ External links are visible to forum administrators only ] but the page [ External links are visible to forum administrators only ] is in it.

The real difference is bigger, as you can see in the statistics. Is it possible to see the crawled pages?
Maybe the problem is that most of our pages start with index.php? Some of them use index.htm and those are indexed OK?

Kind regards!
Re: Number of pages indexed are less than crawled
« Reply #3 on: January 25, 2012, 07:11:03 PM »
Hello,

I cannot open your site (connection times out), is your site currently offline?
Re: Number of pages indexed are less than crawled
« Reply #4 on: January 26, 2012, 08:39:12 AM »
Sorry for that - it seems that last night (from 20:00 Moscow time) we had a problem with it.
But today I restarted the server and it seems to be working normally. Please try again.
Re: Number of pages indexed are less than crawled
« Reply #5 on: January 26, 2012, 08:38:07 PM »
Could you please let me know how those pages can be reached, starting from homepage, so that I can check it?
/pages/mega-instrument/
/pages/kaskad/
Re: Number of pages indexed are less than crawled
« Reply #6 on: January 27, 2012, 06:26:08 AM »
For "/pages/kaskad/" there is a link from the DB on the site, from [ External links are visible to forum administrators only ] (after the address, an active link in Russian: "shema proezda"). For "/pages/mega-instrument/" the link is from [ External links are visible to forum administrators only ] - a full link with the real path after "Internet address". For us the second link is much more important because customers pay more for it...
Re: Number of pages indexed are less than crawled
« Reply #7 on: January 27, 2012, 06:59:30 PM »
There must be a way to find the page by clicking links, starting from homepage (this is how sitemap generator's crawler works - it cannot read the entries in database).
Re: Number of pages indexed are less than crawled
« Reply #8 on: January 30, 2012, 06:35:55 AM »
Am I right that by "homepage" you mean the main page - [ External links are visible to forum administrators only ]?
The situation is that all links from the first level (from the main page - [ External links are visible to forum administrators only ]) are OK. The problems begin at the 3rd - 4th level. Maybe I made a mistake when I mentioned the DB: the links I wrote about before are really clickable on the site (these pages really exist - they are not generated by a user query).
Re: Number of pages indexed are less than crawled
« Reply #9 on: January 30, 2012, 12:41:17 PM »
The page can be at any "depth level" from homepage, i.e. it might take a number of clicks to get there and generator should include it in sitemap. Can you describe the path how the missing page can be reached, i.e. "open homepage", click "link X", then click "Y" etc?
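To illustrate why the depth level does not matter, here is a minimal Python sketch of link-following discovery. It is not the generator's actual code; the `links` dictionary and every path in it are made up for the example and simply stand in for the links found on each page.

```python
from collections import deque

def reachable_pages(links, start):
    """Return every page reachable from `start` by following links (breadth-first)."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

# Hypothetical site: each key is a page, each value lists the links on that page.
links = {
    "/": ["/rubrics/"],
    "/rubrics/": ["/rubrics/toys/"],
    "/rubrics/toys/": ["/company/1/"],
    "/company/1/": ["/pages/kaskad/"],
    "/pages/orphan/": [],  # the page exists, but no other page links to it
}

found = reachable_pages(links, "/")
```

In this sketch a page four clicks away from the homepage is still found, while a page that nothing links to is never seen, no matter what is stored in the database.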
Re: Number of pages indexed are less than crawled
« Reply #10 on: January 31, 2012, 10:07:54 AM »
Sample that is indexed OK: [ External links are visible to forum administrators only ] -> top menu, second item (Russian text:) "Все рубрики" ("All categories") at the address "[ External links are visible to forum administrators only ]" -> to the heading (Russian text:) "ДЕТСКИЕ ТОВАРЫ: ИГРУШКИ, ИГРЫ" ("Children's goods: toys, games") with the address "[ External links are visible to forum administrators only ]" -> second company in the listing (Russian text:) "КАСКАД ООО" -> click the name to open "[ External links are visible to forum administrators only ]" -> see the address and click (Russian text:) "схема проезда" ("directions map")

Sample 2 - not indexed: [ External links are visible to forum administrators only ] -> top menu, second item (Russian text:) "Все рубрики" ("All categories") at the address "[ External links are visible to forum administrators only ]" -> to the heading (Russian text:) "ИНСТРУМЕНТ" ("Tools") at "[ External links are visible to forum administrators only ]" -> third company (Russian text:) "МЕГАИНСТРУМЕНТ ООО" -> open "[ External links are visible to forum administrators only ]", where the link named (Russian text:) "Адрес в Internet:" ("Internet address:") points directly to the page - [ External links are visible to forum administrators only ]
Re: Number of pages indexed are less than crawled
« Reply #11 on: January 31, 2012, 11:21:53 PM »
I see that the first link is included in the sitemap:

<url>
  <loc>http://***/pages/kaskad/index.htm</loc>
  <lastmod>2011-09-14T15:02:08+00:00</lastmod>
  <changefreq>daily</changefreq>
  <priority>0.4096</priority>
</url>


The second one has "nofollow" attribute in the <a> tag:
<a rel="nofollow" target="_blank" href="http://***/pages/mega-instrument/">
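As a rough illustration of how a crawler that honors "nofollow" ends up skipping such a link, here is a small Python sketch. It is only an assumption about the general technique, not the generator's own code; the `LinkCollector` class and the shortened paths in the sample HTML are hypothetical.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect href values, keeping links marked rel="nofollow" in a separate list."""

    def __init__(self):
        super().__init__()
        self.followed = []  # links the crawler would queue
        self.skipped = []   # links ignored because of rel="nofollow"

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # rel may hold several space-separated tokens, e.g. "nofollow noopener".
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.skipped.append(href)
        else:
            self.followed.append(href)

# Simplified sample markup; the paths are shortened placeholders.
html = (
    '<a href="/pages/kaskad/index.htm">map</a>'
    '<a rel="nofollow" target="_blank" href="/pages/mega-instrument/">site</a>'
)
collector = LinkCollector()
collector.feed(html)
```

After feeding the sample markup, the first link lands in `collector.followed` and the "nofollow" one in `collector.skipped`, so a page that is reachable only through the second link would not get into the sitemap.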
Re: Number of pages indexed are less than crawled
« Reply #12 on: February 01, 2012, 05:45:38 AM »
Thanks for testing! We will try to improve the situation and I will check the numbers again.
Re: Number of pages indexed are less than crawled
« Reply #13 on: March 28, 2012, 12:16:47 PM »
Dear sir! Sorry that the improvement took so long, but it did not depend only on us... We followed the advice you gave - thanks a lot! But the number of pages indexed (now 58367) is still less than the number crawled (now 115554). To remind you, our address is: [ External links are visible to forum administrators only ]
Is it possible to find out which pages are not indexed after crawling? The number is so big that I can't find the problem in order to improve the situation... Please advise.
Re: Number of pages indexed are less than crawled
« Reply #14 on: March 28, 2012, 06:32:25 PM »
Please try re-downloading the generator package (using the same link) - it will show the list of URLs that were crawled but not added to the sitemap.
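If you already have your own list of crawled URLs, the same comparison can be sketched by hand in Python. This is independent of the generator's report and only a minimal example: the function names are hypothetical, the example.com URLs are placeholders, and it assumes a standard sitemap file that uses the sitemaps.org 0.9 namespace.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text):
    """Return the set of <loc> values found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }

def missing_from_sitemap(crawled, xml_text):
    """Return crawled URLs that do not appear in the sitemap, sorted."""
    return sorted(set(crawled) - sitemap_urls(xml_text))

# Placeholder data for illustration only.
sitemap_xml = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/pages/kaskad/index.htm</loc></url>
</urlset>"""

crawled = [
    "http://example.com/pages/kaskad/index.htm",
    "http://example.com/pages/mega-instrument/",
]

missing = missing_from_sitemap(crawled, sitemap_xml)
```

Here `missing` holds only the URL that was crawled but is absent from the sitemap. With several sitemap files, the sets from each file would need to be combined before taking the difference.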