Cron Job is replacing old sitemaps, PLEASE HELP!
« on: September 24, 2012, 05:34:37 PM »
OK, my problem is simple. I have a site with a lot of pages (1 million+), so I went ahead and created sitemaps 1, 2, 3, 4, 5, and so on, about 20 sitemaps in total. When the cron job runs (which it does just fine, by the way), it crawls about 66,000 pages and distributes them correctly across the first 3 sitemaps, taking into account that every file is limited to 40,000 pages or 9 MB. My problem is that when the cron job runs the next day, it replaces those first three files instead of resuming or adding new pages to the other sitemaps I created, such as sitemap3.xml, sitemap4.xml and so on. When I check Google, it shows that I only have 66,000 pages, and this prevents my other pages from being indexed. So my question is: which option do I use to resume crawling at the last used file and stop the existing sitemaps from being replaced? Thank you.
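For reference, the per-file limits described above (40,000 pages or 9 MB, whichever is hit first) can be sketched roughly as follows. This is only an illustration of the splitting rule, not the sitemap generator's actual code; the function name `split_into_sitemaps` and the minimal `<url><loc>` entry format are assumptions made for the example.

```python
from xml.sax.saxutils import escape

MAX_URLS = 40000               # per-file page limit quoted in the post
MAX_BYTES = 9 * 1024 * 1024    # per-file size limit quoted in the post (9 MB)

HEADER = ('<?xml version="1.0" encoding="UTF-8"?>\n'
          '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
FOOTER = '</urlset>\n'


def split_into_sitemaps(urls, max_urls=MAX_URLS, max_bytes=MAX_BYTES):
    """Return a list of sitemap XML strings, each within both limits.

    Hypothetical helper: a new file is started as soon as adding the next
    URL would exceed either the URL count or the byte size.
    """
    base_size = len((HEADER + FOOTER).encode("utf-8"))
    chunks, current, size = [], [], base_size
    for url in urls:
        entry = "  <url><loc>%s</loc></url>\n" % escape(url)
        entry_size = len(entry.encode("utf-8"))
        # Close the current file if this entry would break a limit.
        if current and (len(current) >= max_urls
                        or size + entry_size > max_bytes):
            chunks.append(current)
            current, size = [], base_size
        current.append(entry)
        size += entry_size
    if current:
        chunks.append(current)
    return [HEADER + "".join(chunk) + FOOTER for chunk in chunks]
```

Each returned string would then be written to its own file. Note that the number of files follows from how many pages were crawled in the session, so creating extra empty sitemap files in advance does not change the result.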
Re: Cron Job is replacing old sitemaps, PLEASE HELP!
« Reply #1 on: September 25, 2012, 01:16:03 PM »
Hello,

The sitemap generator starts crawling from scratch every time a new session is started.
Re: Cron Job is replacing old sitemaps, PLEASE HELP!
« Reply #2 on: September 25, 2012, 04:40:17 PM »
OK, is there any option to resume the session where it left off? If not, how can I set up the generator to crawl all of my 1 million+ pages in one go, or am I just creating multiple sitemaps for no reason?
« Last Edit: September 25, 2012, 04:42:26 PM by jaydenverse »
Re: Cron Job is replacing old sitemaps, PLEASE HELP!
« Reply #3 on: September 26, 2012, 09:02:39 AM »
Hello,

It resumes the process if the previous session was not completed. Otherwise it starts from scratch (the generator needs to keep previously crawled pages in memory to avoid crawling the same pages again).
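To illustrate the behaviour described above, here is a minimal sketch of a crawl loop that resumes an unfinished session and otherwise starts from scratch. It is not the generator's actual code: the state file, the `get_links` callback and the `stop_after` parameter (a stand-in for an interruption such as a timeout) are all hypothetical.

```python
import json
import os
from collections import deque


def crawl(start_url, get_links, state_path, stop_after=None):
    """Breadth-first crawl; returns (crawled_pages, finished).

    If state_path exists, the previous session was interrupted and is
    resumed. A completed session deletes its state, so the next run
    starts from scratch.
    """
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)
        seen = set(state["seen"])
        queue = deque(state["queue"])
        done = state["done"]
    else:
        seen, queue, done = {start_url}, deque([start_url]), []

    budget = stop_after
    while queue:
        if budget is not None and budget == 0:
            # Simulated interruption: save progress for the next run.
            with open(state_path, "w") as f:
                json.dump({"seen": sorted(seen), "queue": list(queue),
                           "done": done}, f)
            return done, False
        url = queue.popleft()
        done.append(url)
        for link in get_links(url):
            # The "seen" set is what prevents crawling a page twice.
            if link not in seen:
                seen.add(link)
                queue.append(link)
        if budget is not None:
            budget -= 1

    # Session completed: discard the saved state.
    if os.path.exists(state_path):
        os.remove(state_path)
    return done, True
```

In this sketch a completed crawl always produces the full page list in one session, which is why a later session rebuilds the sitemap files rather than appending to them.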
Re: Cron Job is replacing old sitemaps, PLEASE HELP!
« Reply #4 on: September 27, 2012, 02:36:32 AM »
Hello, what you said is exactly my problem. Instead of keeping previously crawled pages in memory, it replaces them. I don't know why it does that, and that is why I created new sitemaps like sitemap3.xml, sitemap4.xml and so on, but it just replaces the first 2 sitemaps. How do I keep the previously crawled pages in memory without them being replaced by new pages on a new cron job or even a manual crawl?
Re: Cron Job is replacing old sitemaps, PLEASE HELP!
« Reply #5 on: September 27, 2012, 10:08:26 AM »
> Instead of keeping previously crawled pages in memory, it replaces them. I don't know why it does that, and that is why I created new sitemaps like sitemap3.xml, sitemap4.xml and so on, but it just replaces the first 2 sitemaps.

In that case, you should not limit the maximum number of pages the sitemap generator is allowed to crawl.