
Number of pages indexed is less than crawled

Started by igors.kuvaga, January 24, 2012, 08:53:43 AM


igors.kuvaga

Could you please check the situation with my site - [ External links are visible to forum administrators only ]? The number of pages indexed is 60360, but the number crawled is 117897. Why is the difference so big? Previously the difference was smaller, when I had 4 sitemap files (including the sitemap index); now the difference is bigger and I have only 3 files. Maybe I made a mistake in the configuration? Thanks in advance!

XML-Sitemaps Support

Hello,

Could you please PM me your generator URL and an example URL that is not included in the sitemap, and explain how it can be reached starting from the homepage?

igors.kuvaga

Hello! They are as follows:
1. Generator URL is [ External links are visible to forum administrators only ]
2. Unfortunately, I can't find all the URLs that were crawled. Most of them are in the headings of our directory.

For example, our customer's page - [ External links are visible to forum administrators only ] - is not in [ External links are visible to forum administrators only ], but the page [ External links are visible to forum administrators only ] is in it.

The real difference is bigger, as you can see in the statistics. Is it possible to see the crawled pages?
Maybe the problem is that most of our pages start with index.php? Some of them use index.htm, and those are indexed OK.

Kind regards!


igors.kuvaga

Sorry about that - it seems we had a problem with it last night (from 20:00 Moscow time).
Today I restarted the server and it seems to be working normally again. Please try again.

XML-Sitemaps Support

Could you please let me know how those pages can be reached, starting from homepage, so that I can check it?
/pages/mega-instrument/
/pages/kaskad/

igors.kuvaga

For "/pages/kaskad/" there is a link from the DB on the site, from [ External links are visible to forum administrators only ] (after the address there is an active link, in Russian: "shema proezda"). For "/pages/mega-instrument/" the link is from [ External links are visible to forum administrators only ] - a full link with the real path after "Internet address". The second link is much more important to us because customers pay more for it...

XML-Sitemaps Support

There must be a way to find the page by clicking links, starting from homepage (this is how sitemap generator's crawler works - it cannot read the entries in database).
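To illustrate the point about clicking links, here is a minimal Python sketch of a link-following crawl. It is not the generator's actual code: the `SITE` dictionary, its paths, and the `crawl` function are hypothetical stand-ins for a real site and a real HTTP fetch. It only shows that a page which exists but is not linked from any reachable page is never discovered.

```python
from collections import deque
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Hypothetical site: path -> HTML body. "/pages/orphan/" exists,
# but no other page links to it.
SITE = {
    "/": '<a href="/rubric.php">All categories</a>',
    "/rubric.php": '<a href="/pages/kaskad/">Kaskad</a>',
    "/pages/kaskad/": "<p>Company page</p>",
    "/pages/orphan/": "<p>Only referenced in the database</p>",
}


def crawl(start="/"):
    """Breadth-first crawl that finds pages only by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        collector = LinkCollector()
        collector.feed(SITE.get(page, ""))
        for link in collector.links:
            if link in SITE and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


found = crawl()
print("reached:", sorted(found))
print("never reached:", sorted(set(SITE) - found))
```

The depth of a page does not matter in this sketch; only the existence of a chain of links from the start page does.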

igors.kuvaga

Am I right that by "homepage" you mean the main page - [ External links are visible to forum administrators only ]?
All links from the first level (from the main page - [ External links are visible to forum administrators only ]) are OK. The problems begin at the 3rd - 4th level. Maybe I made a mistake when I mentioned the DB - the links I wrote about before are really clickable on the site (these pages really exist; they are not generated by a user's query).

XML-Sitemaps Support

The page can be at any "depth level" from homepage, i.e. it might take a number of clicks to get there and generator should include it in sitemap. Can you describe the path how the missing page can be reached, i.e. "open homepage", click "link X", then click "Y" etc?

igors.kuvaga

Sample 1 - indexed OK: [ External links are visible to forum administrators only ] -> top menu, second item (Russian text) "Все рубрики" ("All categories") at the address "www.abspb.ru/rubric.php" -> heading (Russian text) "ДЕТСКИЕ ТОВАРЫ: ИГРУШКИ, ИГРЫ" ("Children's goods: toys, games") at the address "[ External links are visible to forum administrators only ]" -> second company in the listing, "КАСКАД ООО" -> click the name to open "[ External links are visible to forum administrators only ]" -> see the address and click (Russian text) "схема проезда" ("directions map")

Sample 2 - not indexed: [ External links are visible to forum administrators only ] -> top menu, second item (Russian text) "Все рубрики" ("All categories") at the address "www.abspb.ru/rubric.php" -> heading (Russian text) "ИНСТРУМЕНТ" ("Tools") at "[ External links are visible to forum administrators only ]" -> third company, "МЕГАИНСТРУМЕНТ ООО" -> open "[ External links are visible to forum administrators only ]", where the link named (Russian text) "Адрес в Internet:" ("Internet address:") points directly to the page - [ External links are visible to forum administrators only ]

XML-Sitemaps Support

I see that the first link is included in the sitemap:

<url>
  <loc>http://***/pages/kaskad/index.htm</loc>
  <lastmod>2011-09-14T15:02:08+00:00</lastmod>
  <changefreq>daily</changefreq>
  <priority>0.4096</priority>
</url>


The second one has "nofollow" attribute in the <a> tag:
<a rel="nofollow" target="_blank" href="http://***/pages/mega-instrument/">
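For anyone who wants to check their own pages for this, below is a small Python sketch that separates links carrying rel="nofollow" from the rest. It assumes a crawler that skips nofollow links, as appears to be the case here; the `NofollowFinder` class, the `split_links` helper, and the example.com URLs are hypothetical and only for illustration.

```python
from html.parser import HTMLParser


class NofollowFinder(HTMLParser):
    """Sorts <a href> links by whether rel contains the 'nofollow' token."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if not href:
            return
        # rel is a space-separated token list, e.g. "nofollow noopener".
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow.append(href)
        else:
            self.followed.append(href)


def split_links(html):
    """Return (followed_links, nofollow_links) found in an HTML string."""
    finder = NofollowFinder()
    finder.feed(html)
    return finder.followed, finder.nofollow


# Hypothetical markup modelled on the two links discussed in this thread.
SAMPLE = (
    '<a href="http://example.com/pages/kaskad/index.htm">map</a>'
    '<a rel="nofollow" target="_blank" '
    'href="http://example.com/pages/mega-instrument/">site</a>'
)

followed, nofollow = split_links(SAMPLE)
print("followed:", followed)
print("nofollow:", nofollow)
```

If the only link to a page is in the nofollow list, a crawler that honours the attribute has no way to reach that page.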

igors.kuvaga

Thanks for testing! We will try to improve the situation and I will check the numbers again.

igors.kuvaga

Dear Sir! Sorry that the improvement took so long, but it did not depend only on us... We followed the advice you gave - thanks a lot! But the number of pages indexed (now 58367) is still less than the number crawled (now 115554). As a reminder, our address is [ External links are visible to forum administrators only ]
Is it possible to find out which pages are not indexed after crawling? The number is so big that I can't find the problem in order to improve the situation... Please advise.

XML-Sitemaps Support

Please try to re-download the generator package (using the same link) - it will show the list of URLs that were crawled but not added to the sitemap.
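Independently of the generator's own report, the same comparison can be sketched in Python: read the <loc> entries from a sitemap file and subtract them from a list of URLs known to have been crawled. This is only an illustration; the `sitemap_urls` and `missing_from_sitemap` helpers, the sample sitemap, and the example.com URLs are hypothetical, and where the crawled list comes from is left as an assumption.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> values found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def missing_from_sitemap(crawled, xml_text):
    """Return crawled URLs that do not appear in the sitemap, sorted."""
    return sorted(set(crawled) - sitemap_urls(xml_text))


# Hypothetical sitemap with a single entry.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com/pages/kaskad/index.htm</loc>
  </url>
</urlset>"""

# Hypothetical list of crawled URLs.
CRAWLED = [
    "http://example.com/pages/kaskad/index.htm",
    "http://example.com/pages/mega-instrument/",
]

for url in missing_from_sitemap(CRAWLED, SAMPLE_SITEMAP):
    print("crawled but not in sitemap:", url)
```

With several sitemap files, the sets returned by `sitemap_urls` for each file can be combined with a union before subtracting.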