a few questions

Started by gjf03c, March 14, 2008, 08:40:44 AM


gjf03c

It looks like I have everything working correctly.  I have a few questions though.

1.  My site is very large: there are probably a few million pages. I only have 1024 MB of total space, and the first crawl took up nearly 450 MB while crawling approximately 2% of my pages. The obvious solution is to get a dedicated server with more space, but do you have any suggestions until I go forward with the new server?

2.  How do I know when Google, Ask, Yahoo, etc. are pinged? Does it use the ROR, RSS, XML, and TXT files when it pings these search engines?

Thanks!

XML-Sitemaps Support

Hello,

1. Do you mean the metric displayed by the sitemap generator while crawling? That is not the amount of space used by the generator script, but the total size of the pages crawled. You don't need that much disk space to store the sitemap itself.

2. The script pings the search engines automatically when the sitemap is created. You can also submit the XML sitemap manually through your Google Webmaster Tools account.
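For reference, a "ping" of this kind is just an HTTP GET request with the sitemap's URL percent-encoded into a query string. A minimal sketch of how a generator script might build such a request (the endpoint shown is Google's historical ping URL; the function name is made up for illustration, and other engines' endpoints would be added the same way):

```python
from urllib.parse import quote

# Google's historical sitemap ping endpoint (treat as an assumption;
# other search engines used similar URLs at the time).
GOOGLE_PING = "http://www.google.com/ping?sitemap="

def build_ping_url(endpoint, sitemap_url):
    """Return the full ping URL with the sitemap URL percent-encoded.

    safe="" forces ':' and '/' in the sitemap URL to be encoded too,
    so the whole URL fits safely into a single query parameter.
    """
    return endpoint + quote(sitemap_url, safe="")

url = build_ping_url(GOOGLE_PING, "http://example.com/sitemap.xml")
# Actually sending the request would be e.g.:
#   import urllib.request
#   urllib.request.urlopen(url)
```

The generator fires one such request per search engine right after the sitemap files are written, which is why there is no separate "submit" step to perform.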

gjf03c

Well, the crawl_dump.log file is nearly 215 MB, and that doesn't include the size of the sitemaps themselves. I used the gzip compression and that does help, but I can't figure out why crawl_dump.log is storing so much data.

Thanks,
Greg
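As an aside, the gzip option mentioned above amounts to the standard stream compression, which works very well on repetitive XML. A small sketch of compressing a sitemap file this way (the filenames and helper name are illustrative, not the generator's actual internals):

```python
import gzip
import shutil

def gzip_file(src, dst):
    """Write a gzip-compressed copy of src to dst.

    copyfileobj streams the data in chunks, so even a large
    sitemap file does not need to fit in memory at once.
    """
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)

# Typical use: produce sitemap.xml.gz alongside sitemap.xml
# gzip_file("sitemap.xml", "sitemap.xml.gz")
```

Because sitemap XML repeats the same tags for every URL, the compressed file is usually a small fraction of the original size, which is why the option noticeably reduces disk usage.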

XML-Sitemaps Support

You can disable the HTML/ROR sitemaps to decrease the dump size, since those formats require storing page titles and descriptions for every crawled page.