More Pages Scanned than added to SiteMap?
« on: March 19, 2006, 02:24:49 AM »
I'm crawling one of my sites, and there are considerably more pages scanned than added to the sitemap... why?
« Reply #1 on: March 19, 2006, 08:01:18 AM »
Hello,

The number of "scanned" pages includes ALL links found on the pages, but some of them are NOT included in the sitemap, depending on the configuration settings ("excludes pages", "exclude extensions").
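
To illustrate the difference between the two counts, here is a rough Python sketch. It is not the generator's actual code: the setting values, the substring matching rule, and the example.com URLs are all assumptions made up for the example.

```python
from urllib.parse import urlparse

# Hypothetical values for the two exclusion settings mentioned above.
EXCLUDE_EXTENSIONS = {"jpg", "gif", "png", "zip", "pdf"}
EXCLUDE_PAGES = ["print=", "/admin/"]


def is_included(url):
    """Return True if a scanned URL would also be added to the sitemap."""
    path = urlparse(url).path
    # The extension is whatever follows the last dot in the final path segment.
    last_segment = path.rsplit("/", 1)[-1]
    extension = last_segment.rsplit(".", 1)[-1].lower() if "." in last_segment else ""
    if extension in EXCLUDE_EXTENSIONS:
        return False
    # Assumed rule: a page is excluded if any fragment appears anywhere in the URL.
    return not any(fragment in url for fragment in EXCLUDE_PAGES)


scanned = [
    "http://example.com/",
    "http://example.com/lyrics/song.html",
    "http://example.com/images/cover.jpg",
    "http://example.com/lyrics/song.html?print=1",
    "http://example.com/admin/login.php",
]
included = [url for url in scanned if is_included(url)]
print(f"scanned: {len(scanned)}, added to sitemap: {len(included)}")
```

Every link counts as scanned, but only the ones that pass both filters end up in the sitemap, so the first number can be much larger than the second.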
« Reply #2 on: March 19, 2006, 08:40:56 AM »
Once the crawling is done, is there a way to add certain missed pages to the sitemap afterwards w/o recrawling?
« Reply #3 on: March 19, 2006, 07:23:10 PM »
Hello,

It can be done only by manually editing the generated XML file (in any text editor).
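
If hand-editing gets tedious, the same change can be scripted. The following is only a sketch using Python's standard library, not a feature of the generator. The function name `add_url`, the sample file, and the example.com URLs are made up for illustration. It adds a bare `<loc>` entry, so optional fields such as last-modified dates would still have to be added by hand.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def add_url(sitemap_path, page_url):
    """Append a <url><loc>...</loc></url> entry to an existing sitemap file."""
    tree = ET.parse(sitemap_path)
    root = tree.getroot()
    # Reuse whatever namespace the generated file already declares.
    namespace = root.tag[1:].split("}")[0] if root.tag.startswith("{") else ""
    prefix = f"{{{namespace}}}" if namespace else ""
    if namespace:
        # Keep it as the default (unprefixed) namespace when writing back.
        ET.register_namespace("", namespace)

    existing = {(loc.text or "").strip() for loc in root.iter(f"{prefix}loc")}
    if page_url in existing:
        return False  # already listed, nothing to do

    url_element = ET.SubElement(root, f"{prefix}url")
    ET.SubElement(url_element, f"{prefix}loc").text = page_url
    tree.write(sitemap_path, encoding="UTF-8", xml_declaration=True)
    return True


# Demo on a throwaway sample file, not on a real sitemap.
sample = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://example.com/</loc></url>"
    "</urlset>"
)
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "sitemap.xml")
with open(demo_path, "w", encoding="utf-8") as handle:
    handle.write(sample)

print(add_url(demo_path, "http://example.com/missed-page.html"))
```

Work on a backup copy first, since the script overwrites the file in place.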
« Reply #4 on: March 19, 2006, 07:24:49 PM »
Oh... (yuck). ;)

Still waiting for LyricsLane to finish crawling...


Has anyone else noticed that it stops every once in a while?  I leave it running, and occasionally, I have to go back in and start it up again...?
« Reply #5 on: March 19, 2006, 07:26:26 PM »
Yes, it can stop depending on your PHP configuration (memory_limit / max_execution_time settings).
« Reply #6 on: March 19, 2006, 08:12:01 PM »
Is there a way to fix that?
« Reply #8 on: March 20, 2006, 07:13:01 PM »
ok, that's something my host would have to do, I guess.

Now, when I go in to continue with the crawling, it won't even start... it goes to "page cannot be displayed".  ???
« Reply #10 on: March 22, 2006, 07:00:34 PM »
Ok, it has finally finished crawling the site.  But now it gives me the following error:

Quote
Sitemap file is not writable: /usr/home/public_html/sitemap1.xml
Sitemap file is not writable: /usr/home/public_html/sitemap2.xml
Sitemap file is not writable: /usr/home/public_html/sitemap3.xml
Sitemap file is not writable: /usr/home/public_html/sitemap4.xml
Sitemap file is not writable: /usr/home/public_html/sitemap5.xml

I have these files created and they are writable.
« Reply #11 on: March 22, 2006, 07:52:16 PM »
Great! :)

You should either make the /usr/home/public_html/ folder writable with 0766 permissions (which is not recommended) or CREATE empty files sitemap1.xml...sitemap5.xml and set their permissions to 0666.

Now you can simply copy these files from the data/ folder to /usr/home/public_html/ and set their permissions.
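
For anyone who prefers to script the "create empty files and set 0666" step, here is a minimal Python sketch. The helper name `prepare_sitemap_files` is made up, and the demo runs in a temporary folder rather than in public_html. Note that the generator runs as the web server's user, which may differ from the account you run this under, so the reported mode is what matters rather than whether your own account can write to the file.

```python
import os
import stat
import tempfile


def prepare_sitemap_files(folder, count, mode=0o666):
    """Create empty sitemapN.xml files if missing and set their permissions."""
    results = {}
    for index in range(1, count + 1):
        path = os.path.join(folder, f"sitemap{index}.xml")
        if not os.path.exists(path):
            open(path, "a").close()  # create an empty file
        os.chmod(path, mode)
        # Read the mode back so the result can be checked.
        results[path] = stat.S_IMODE(os.stat(path).st_mode)
    return results


# Demo in a throwaway folder; on the server, pass the real web root instead.
demo_folder = tempfile.mkdtemp()
modes = prepare_sitemap_files(demo_folder, 5)
for path, file_mode in sorted(modes.items()):
    print(os.path.basename(path), oct(file_mode))
```

Existing files are left as they are apart from the permission change, so running it again before a later sitemap generation should be harmless.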
« Reply #12 on: March 22, 2006, 07:53:36 PM »
I have already created those files, and set the permission @ 666.

So I copy the files from the data folder to public_html?
« Reply #13 on: March 24, 2006, 10:46:06 PM »
Yes. And set 0666 permissions for them in public_html so that future sitemap generations can write to them.