How to generate sitemap from an interrupted session?
« on: February 07, 2011, 06:39:25 AM »
I have a large site:

The interrupted session says:
URLs added: 243644,
estimated URLs left in a queue: 226115
This process ran for 16 hours.

I want to build a sitemap from this interrupted session. How can I do that?
« Last Edit: February 07, 2011, 06:49:21 AM by kodokmarton »
Re: How to generate sitemap from an interrupted session?
« Reply #2 on: February 07, 2011, 05:25:37 PM »
That would stop the process from continuing. I need the process to keep going.

I would like a new feature added to the resume screen on the Crawling tab.
It should build the sitemap from the current status.

I also need another new feature: regexp support in the "do not parse" section. I have some wildcard patterns I could use there, and I can't express them with substring matching.
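
To illustrate the difference being asked for, here is a minimal Python sketch comparing substring matching with regexp matching on URLs. It is not the generator's own code; the function names, the example URLs, and the pattern are all assumptions made up for the illustration.

```python
import re

def excluded_by_substring(url, fragments):
    """True if any fragment occurs anywhere in the URL (plain substring match)."""
    return any(fragment in url for fragment in fragments)

def excluded_by_regex(url, patterns):
    """True if any regular expression matches somewhere in the URL."""
    return any(re.search(pattern, url) for pattern in patterns)

# Hypothetical URLs, for illustration only.
urls = [
    "http://example.com/products/123/print",
    "http://example.com/products/123",
    "http://example.com/print-shop",
]

# A substring rule cannot tell the print view apart from the print shop page.
substring_hits = [u for u in urls if excluded_by_substring(u, ["/print"])]

# A regexp rule can require a numeric product id followed by "/print" at the end.
regex_hits = [u for u in urls if excluded_by_regex(u, [r"/products/\d+/print$"])]
```

With these inputs the substring rule also catches the `/print-shop` page, while the regexp rule catches only the print view of a product.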

When could you implement that?
Re: How to generate sitemap from an interrupted session?
« Reply #3 on: February 07, 2011, 05:26:46 PM »
Also, `Click here to interrupt it.` doesn't seem to do the job.

The process stops, but after a few moments it starts again. I started the process from the browser, not from a cron job.
Re: How to generate sitemap from an interrupted session?
« Reply #4 on: February 08, 2011, 02:41:10 PM »
Hello,

perhaps you still have it open in the browser, so it auto-resumes the process. You can rename the generator folder (to, say, "generato2") to avoid that.
Re: How to generate sitemap from an interrupted session?
« Reply #5 on: February 08, 2011, 08:53:29 PM »
On the latest version this seems to be a bug, as I can barely stop it. Please check this; it might happen when memory usage is very high, around 1000 megabytes.

When can you implement a sitemap from an interrupted session?
Re: How to generate sitemap from an interrupted session?
« Reply #6 on: February 09, 2011, 01:26:19 PM »
We will consider this feature request for one of the future versions of sitemap generator (we have no release date estimates).
Re: How to generate sitemap from an interrupted session?
« Reply #7 on: February 09, 2011, 02:42:30 PM »
Could you at least tell me which file to modify, and I will try to modify it myself.

As I see it, only the following change is needed after build sitemap is called:
- do not delete/clear the process state, keep it as it is, to be able to resume it
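
As a rough sketch of the idea, the Python below builds sitemap files from a snapshot of the URLs found so far and leaves the crawl state alone. It is not the generator's code: the function name is hypothetical, and how the URL list would be read out of the generator's saved session is not known here, so the function simply takes the URLs as input. The 50,000 URLs per file limit comes from the sitemaps.org protocol.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_FILE = 50000  # per-file limit from the sitemaps.org protocol

def build_sitemap_chunks(urls, max_urls=MAX_URLS_PER_FILE):
    """Return a list of sitemap XML strings, one per chunk of URLs.

    Works on a copy of the URL list only; nothing here reads, clears,
    or modifies any crawler state, so a crawl could still be resumed.
    """
    urls = list(urls)
    documents = []
    for start in range(0, len(urls), max_urls):
        lines = [
            '<?xml version="1.0" encoding="UTF-8"?>',
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
        ]
        for url in urls[start:start + max_urls]:
            # escape() turns &, < and > into XML entities
            lines.append("  <url><loc>%s</loc></url>" % escape(url))
        lines.append("</urlset>")
        documents.append("\n".join(lines))
    return documents
```

At 50,000 URLs per file, the 243,644 URLs added so far would be split across five sitemap files.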
Re: How to generate sitemap from an interrupted session?
« Reply #8 on: February 09, 2011, 08:32:02 PM »
The code will need to be modified for that, since such a feature is not available in the existing version; it's not a configuration change.
Re: How to generate sitemap from an interrupted session?
« Reply #9 on: February 09, 2011, 09:51:44 PM »
Could you at least tell me which file to modify, and I will try to modify it myself.
Re: How to generate sitemap from an interrupted session?
« Reply #10 on: February 10, 2011, 10:04:00 PM »
I don't have details on what to modify until (if) we get it included in the sitemap generator code in a future version, sorry.