NO Urls in Web Index
« on: August 18, 2011, 07:23:10 PM »
Oleg

I am a complete newbie tasked with generating a sitemap, which is outside of my talents, and I am lost.  I purchased the XML generator in April and have been using it to crawl my site since then, I thought without any errors or difficulties, but it has come to my attention that it isn't working properly and I'm actually suffering rather drastic consequences.

First, I set the configuration mostly to default options but did exclude a few folders from being crawled.  My configuration is set to "calculate the change log" but is not set to "store temporary files."

When I run a crawl, the results indicate 880 pages have been crawled.  However, the change log has never indicated any URLs added or changed, despite the fact they change quite frequently.  Only in the last few crawls have 1 or 2 URLs shown as having changed.  It's almost always "0" and only the number of crawled pages is updated.

What's more concerning, however, is that someone pointed out to me, using Google Analytics, that it indicates I have 880 pages in my sitemap (which is correct) but "NO URLS are in the WEB INDEX."  Our search engine results have plummeted and he thinks this is why.  NO URLS are shown in the sitemap.

When I go to my generator and look at my sitemap URLS (I have one in HTML, one in TXT, one in ROR and one in XML), I can see all of my pages just fine in the HTML and TXT versions, but when I click on the XML version -- the one being pinged to search engines and read by Google (if that's the right term), it's full of parsing errors and HTML code errors. For instance, it says I have various code errors in the content of that page -- which I thought was automatically created when I had the generator installed for me.  I tried going to line/item entries shown in the error messages and deleted the HTML code it didn't like, but each time I do so, clicking on the site map link just generates a new error message with some other code problem.  I returned it to the way it was and just left it alone, but clearly, it's not being read correctly.

So now I have two problems.   The XML site map has parsing errors or code errors and I don't know what they are or how I should fix them, and, Google thinks I have no URLs in my Web Index, and hasn't thought so since April 2011.  Clearly that's long enough for it to have indexed my website, which has existed since 1998?

Please help, and it won't offend me if you talk to me like I know nothing and understand even less.  It's true.


Re: NO Urls in Web Index
« Reply #1 on: August 18, 2011, 08:46:30 PM »
Update:  In reading other forum messages I ended up adding "www" to the starting URL even though that isn't included in our site map, and then the crawl added my 889 pages. However, when I click the link to the xml site-map, it still gives me various errors and I can't look at it.

I don't use style sheets myself, but the error says the XML software's style sheet is off in some way:

http://(mysite).com/sitemap.xml

shows:

Error loading stylesheet: An unknown error has occurred (805303f4)
http://www.(mysite).com/generator/pages/mods/sitemap.xsl
Re: NO Urls in Web Index
« Reply #2 on: August 19, 2011, 11:40:45 AM »
Hello,

If the starting URL contains www in the domain name, make sure that you open the sitemap with www too (i.e. www.domain.com/sitemap.xml instead of domain.com/sitemap.xml).
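
As a rough illustration of why the hostnames should match: browsers typically refuse to load an XSL stylesheet from a different host than the XML page that references it, and www.domain.com counts as a different host from domain.com. The sketch below only compares two URLs; the function name and the example.com addresses are placeholders, not anything taken from the generator.

```python
from urllib.parse import urlsplit


def same_host(sitemap_url: str, stylesheet_url: str) -> bool:
    """Return True when both URLs share scheme, hostname and port."""
    a = urlsplit(sitemap_url)
    b = urlsplit(stylesheet_url)
    return (a.scheme, a.hostname, a.port) == (b.scheme, b.hostname, b.port)


# Placeholder URLs: the sitemap is opened without www,
# while the stylesheet it references lives on the www host.
sitemap = "http://example.com/sitemap.xml"
stylesheet = "http://www.example.com/generator/pages/mods/sitemap.xsl"

print(same_host(sitemap, stylesheet))  # hosts differ
print(same_host("http://www.example.com/sitemap.xml", stylesheet))  # hosts match
```

If the first comparison comes out as a mismatch for your own URLs, opening the sitemap under the same hostname as the stylesheet is the thing to try.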
Re: NO Urls in Web Index
« Reply #3 on: August 19, 2011, 04:40:31 PM »
That has no effect.

First, I'm not sure why the link that the program provides is without the www, but in any case, even if I add the www which is missing from the xml generated site map link (and shouldn't I be using the link the program generates?), the same type of errors occur -- html or stylesheet errors.   Since I don't "format" this page or its code, what are these formatting problems?  I went to the line/item references to try and take out the bad code but it has no effect. It just gives me different errors, line after line after line:

Without the WWW, the link on the configuration page takes me to the sitemap which then records this error:

Error loading stylesheet: An unknown error has occurred (805303f4)http://www.(mysite).com/generator/pages/mods/sitemap.xsl

With the WWW added, the link takes me to the sitemap which then records this error:

XML Parsing Error: mismatched tag. Expected: </html>.
Location: http://www.(mysite).com/generator/pages/mods/sitemap.xsl
Line Number 111, Column 3:
</body></html>
--^
Re: NO Urls in Web Index
« Reply #4 on: August 20, 2011, 07:42:26 AM »
Hello,

It looks like your sitemap.xsl file is corrupted. Please try to re-download the generator package and upload its files to your server, overwriting the existing ones.
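
If you would like to confirm that a file such as sitemap.xsl is well-formed before and after re-uploading it, a minimal sketch using Python's standard XML parser is shown here. The function name and the sample snippets are only illustrative, and this checks XML well-formedness only, not whether the stylesheet does what the generator expects.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def find_xml_error(data: bytes) -> Optional[str]:
    """Return a parser error message, or None if the data is well-formed XML."""
    try:
        ET.fromstring(data)
    except ET.ParseError as err:
        # The message from the parser includes the line and column.
        return str(err)
    return None


# Illustrative snippets: the second one never closes its <p> tag,
# which is the same kind of "mismatched tag" problem the browser reported.
good = b"<html><body><p>ok</p></body></html>"
bad = b"<html><body><p>broken</body></html>"

print(find_xml_error(good))
print(find_xml_error(bad))
```

To check a real file, read it in binary mode first, for example `find_xml_error(open("sitemap.xsl", "rb").read())`, where the path is wherever your copy of the file lives.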